Site-aware Failover Clusters in Windows Server 2016

This post has been republished via RSS; it originally appeared at: Failover Clustering articles.

Windows Server 2016 introduces site-aware clusters. Nodes in stretched clusters can now be grouped based on their physical location (site). Cluster site-awareness enhances key operations during the cluster lifecycle, such as failover behavior, placement policies, heartbeating between nodes, and quorum behavior. In the remainder of this post I will explain how to configure sites for your cluster, introduce the notion of a "preferred site", and show how site awareness manifests itself in your cluster operations.

Configuring Sites

A node's site membership is configured by creating a fault domain of type Site and then setting that site as the parent of the node.

For example, in a four-node cluster with nodes Node1, Node2, Node3 and Node4, to assign the nodes to two sites named Seattle and Denver, do the following:

    • Launch PowerShell as an Administrator and type:
# Create site fault domains
New-ClusterFaultDomain -Name Seattle -Type Site -Description "Primary" -Location "Seattle DC"
New-ClusterFaultDomain -Name Denver -Type Site -Description "Secondary" -Location "Denver DC"

# Set fault domain membership
Set-ClusterFaultDomain -Name Node1 -Parent Seattle
Set-ClusterFaultDomain -Name Node2 -Parent Seattle

Set-ClusterFaultDomain -Name Node3 -Parent Denver
Set-ClusterFaultDomain -Name Node4 -Parent Denver
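
After assigning the nodes, it can help to confirm that the hierarchy looks the way you intended. The following is a minimal sketch that assumes the Seattle and Denver sites from the example above already exist; the exact columns shown depend on your build.

```powershell
# List every fault domain (nodes and sites) known to the cluster
Get-ClusterFaultDomain | Format-Table -AutoSize

# Show only the site-level fault domains (Seattle and Denver in this example)
Get-ClusterFaultDomain -Type Site | Format-Table -AutoSize
```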


Configuring sites enhances the operation of your cluster in the following ways:

 

Failover Affinity

    • Groups fail over to a node within the same site before failing over to a node in a different site
    • During Node Drain, VMs are moved first to a node within the same site before being moved across sites
    • The CSV load balancer distributes within the same site
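
One way to observe the drain behavior is to compare group ownership before and after draining a node. This is a sketch only; it assumes the node name Node1 from the earlier example and that the cluster has enough capacity in the same site to absorb the drained roles.

```powershell
# Record where each group is running before the drain
Get-ClusterGroup | Format-Table -Property Name, OwnerNode, State -AutoSize

# Pause Node1 and drain its roles; -Wait blocks until the drain has finished
Suspend-ClusterNode -Name "Node1" -Drain -Wait

# Check the new owners; with sites configured, they should be same-site nodes where possible
Get-ClusterGroup | Format-Table -Property Name, OwnerNode, State -AutoSize

# Bring Node1 back into service and fail its roles back immediately
Resume-ClusterNode -Name "Node1" -Failback Immediate
```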



Storage Affinity

Virtual machines (VMs) follow storage and are placed in the same site where their associated storage resides. VMs begin live migrating to the same site as their associated CSV one minute after the storage is moved.

Cross-Site Heartbeating

You can now configure the thresholds for heartbeating between sites. These thresholds are controlled by the following new cluster properties:


Property            | Default Value | Description
CrossSiteDelay      | 1000          | Time, in milliseconds, between heartbeats sent to nodes in a different site
CrossSiteThreshold  | 20            | Number of missed heartbeats before the interface to nodes in a different site is considered down


To configure the above properties, launch PowerShell as an Administrator and type:

(Get-Cluster).CrossSiteDelay = <value> 
(Get-Cluster).CrossSiteThreshold = <value>
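
As an illustration, the sketch below first displays the current heartbeat settings and then changes the cross-site values. The numbers used here are illustrative assumptions, not recommendations. With a delay of 2000 ms and a threshold of 20, heartbeats to a node in another site could be missed for roughly 2000 ms × 20 = 40 seconds before the interface is considered down, compared with 1000 ms × 20 = 20 seconds at the defaults.

```powershell
# Show the current same-subnet, cross-subnet and cross-site heartbeat settings
Get-Cluster | Format-List -Property *Subnet*, CrossSite*

# Illustrative values only: send a cross-site heartbeat every 2 seconds
(Get-Cluster).CrossSiteDelay = 2000

# Illustrative value only: tolerate 20 missed cross-site heartbeats
(Get-Cluster).CrossSiteThreshold = 20
```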

You can find more information on the other properties that control failover clustering heartbeating here.

The following rules define the applicability of the thresholds controlling heartbeating between two cluster nodes:

    • If the two cluster nodes are in different sites and different subnets, the Cross-Site thresholds override the Cross-Subnet thresholds.
    • If the two cluster nodes are in different sites and the same subnet, the Cross-Site thresholds override the Same-Subnet thresholds.
    • If the two cluster nodes are in the same site and different subnets, the Cross-Subnet thresholds are effective.
    • If the two cluster nodes are in the same site and the same subnet, the Same-Subnet thresholds are effective.
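
The four rules above reduce to a simple precedence: site membership is checked first, then subnet. The helper below is a hypothetical illustration of that precedence, not a cluster cmdlet; the function name and its parameters are assumptions made for this sketch, and it does not query a real cluster.

```powershell
# Hypothetical helper: returns which family of heartbeat thresholds applies to a pair of nodes
function Get-EffectiveHeartbeatSetting {
    param(
        [Parameter(Mandatory)] [string] $SiteA,
        [Parameter(Mandatory)] [string] $SiteB,
        [Parameter(Mandatory)] [string] $SubnetA,
        [Parameter(Mandatory)] [string] $SubnetB
    )

    # Different sites: Cross-Site thresholds win regardless of subnet
    if ($SiteA -ne $SiteB) { return 'CrossSite' }

    # Same site, different subnets: Cross-Subnet thresholds apply
    if ($SubnetA -ne $SubnetB) { return 'CrossSubnet' }

    # Same site, same subnet: Same-Subnet thresholds apply
    return 'SameSubnet'
}

# Two nodes in different sites that share a stretched subnet
Get-EffectiveHeartbeatSetting -SiteA 'Seattle' -SiteB 'Denver' -SubnetA '10.0.1.0/24' -SubnetB '10.0.1.0/24'
```

In this example call the sites differ, so the function returns CrossSite even though both nodes are on the same subnet.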

 

Configuring Preferred Site


In addition to configuring the site a cluster node belongs to, a "Preferred Site" can be configured for the cluster. The Preferred Site is a preference for placement, and it will typically be your primary datacenter site.

Before the Preferred Site can be configured, the site being chosen as the preferred site needs to be assigned to a set of cluster nodes. To configure the Preferred Site for a cluster, launch PowerShell as an Administrator and type:

(Get-Cluster).PreferredSite = <Site assigned to a set of cluster nodes> 
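
For instance, continuing the earlier example, the sketch below makes Seattle the preferred site. It assumes that a site fault domain named Seattle already exists and has nodes assigned to it.

```powershell
# Assumes the "Seattle" site fault domain exists and contains cluster nodes
(Get-Cluster).PreferredSite = "Seattle"

# Read the property back to confirm the setting
(Get-Cluster).PreferredSite
```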

Configuring a Preferred Site for your cluster enhances operation in the following ways:

 

Cold Start
During a cold start, VMs are placed in the preferred site.

 

Quorum 

    • Dynamic Quorum drops weights from the Disaster Recovery site (the DR site, i.e. the site that is not designated as the Preferred Site) first, to ensure that the Preferred Site survives if all other things are equal. In addition, nodes are pruned from the DR site first during regroup after events such as asymmetric network connectivity failures.
    • During a Quorum Split, i.e. an even split between two datacenters with no witness, the Preferred Site is automatically elected to win
        • The nodes in the DR site drop out of cluster membership

        • This allows the cluster to survive a simultaneous 50% loss of votes

        • Note that the LowerQuorumPriorityNodeID property previously controlling this behavior is deprecated in Windows Server 2016
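
To see how Dynamic Quorum is currently assigning votes, you can inspect each node's configured and dynamic weight. This is a read-only sketch; it only displays state and does not change any quorum setting.

```powershell
# NodeWeight is the configured vote; DynamicWeight is the vote currently in effect
Get-ClusterNode | Format-Table -Property Name, State, NodeWeight, DynamicWeight -AutoSize
```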


Preferred Site and Multi-master Datacenters

The Preferred Site can also be configured at the granularity of a cluster group, i.e. a different preferred site can be configured for each group. This enables a datacenter to be active and preferred for specific groups/VMs.

To configure the Preferred Site for a cluster group, launch PowerShell as an Administrator and type:

(Get-ClusterGroup -Name <GroupName>).PreferredSite = <Site assigned to a set of cluster nodes>
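
As a sketch of a multi-master setup, the example below prefers Denver for one group while the cluster-wide preference can remain elsewhere. The group name VM-Denver-01 is hypothetical, and the Denver site is assumed to exist with nodes assigned to it.

```powershell
# "VM-Denver-01" is a hypothetical group name; substitute one of your own groups
(Get-ClusterGroup -Name "VM-Denver-01").PreferredSite = "Denver"

# Review the current owner and preferred site of every group
Get-ClusterGroup | Format-Table -Property Name, OwnerNode, PreferredSite -AutoSize
```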


Placement Priority
Groups in a cluster are placed based on the following site priority: 

  1. Storage affinity site
  2. Group preferred site
  3. Cluster preferred site
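
The priority order above can be read as "use the first site in the list that is actually set". The function below is a hypothetical illustration of that reading, not a cluster cmdlet; its name and parameters are assumptions made for this sketch, and the real placement logic inside the cluster service may take other factors into account.

```powershell
# Hypothetical helper: returns the first site that is set, in priority order
function Resolve-PlacementSite {
    param(
        [string] $StorageAffinitySite,
        [string] $GroupPreferredSite,
        [string] $ClusterPreferredSite
    )

    # Walk the candidates from highest to lowest priority
    foreach ($candidate in @($StorageAffinitySite, $GroupPreferredSite, $ClusterPreferredSite)) {
        if (-not [string]::IsNullOrEmpty($candidate)) { return $candidate }
    }

    # No site preference applies
    return $null
}

# No storage affinity, so the group preference outranks the cluster preference
Resolve-PlacementSite -GroupPreferredSite 'Denver' -ClusterPreferredSite 'Seattle'
```

In this example call the function returns Denver, because the group-level preference ranks above the cluster-level one.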

 

Additional Information:


Fault Domains are being introduced for clustering in Windows Server 2016, and they provide Node, Chassis, Rack, and Site awareness. See this blog as well as the videos below to learn more about this new feature: https://technet.microsoft.com/en-us/windows-server-docs/storage/storage-spaces/fault-domains-windows-server-2016

    • Fault Domain Awareness in WS2016 - Part 1: Overview
    • Fault Domain Awareness in WS2016 - Part 2: Using PowerShell
    • Fault Domain Awareness in WS2016 - Part 3: Using XML
    • Fault Domain Awareness in WS2016 - Part 4: Location, Description
