You can create a failover cluster using Windows Server on Google Cloud Platform (GCP). A group of servers works together to provide higher availability (HA) for your Windows applications. If one cluster node fails, another node can take over running the software. You can configure the failover to happen automatically, which is the usual configuration, or you can manually trigger a failover.
This tutorial assumes you are familiar with failover clustering, Active Directory (AD), and administration of Windows Server. For a brief overview of networking in GCP, see GCP for Data Center Pros: Networking.

Architecture
This tutorial walks you through how to create an example failover cluster on Compute Engine. The example system contains the following two servers:
Additionally, you deploy an AD domain controller, which, for this tutorial, serves the following purposes:
You can create the domain controller in any zone. This tutorial uses a third zone. In a production system, you can host the file share witness elsewhere, and you don't need a separate AD system only to support your failover cluster. See What's next for links to articles about using AD on GCP.

The two servers that you will use to deploy the failover cluster are located in different zones to ensure that each server is on a different physical machine and to protect against the unlikely possibility of a zonal failure. The following diagram describes the architecture you deploy by following this tutorial.

Shared storage options
This tutorial does not cover setting up a file server for high-availability shared storage. Google Cloud supports multiple shared storage solutions that you can use with Windows Server Failover Clustering, including:
For information about other possible shared storage solutions, see:
Understanding the network routing
When the cluster fails over, requests must go to the newly active node. The clustering technology normally handles routing by using Address Resolution Protocol (ARP), which associates IP addresses with MAC addresses. In GCP, the Virtual Private Cloud (VPC) system uses software-defined networking, which doesn't use MAC addresses. This means the changes broadcast by ARP don't affect routing at all. To make routing work, the cluster requires some software-level help from the internal load balancer. Usually, internal load balancing distributes incoming network traffic among multiple backend instances that are internal to your VPC, to share the load. For failover clustering, you instead use internal load balancing to route all traffic to just one instance: the currently active cluster node. Here's how internal load balancing detects the correct node:
What happens during a failover
When a failover happens in the cluster, the following changes take place:
Putting it together
Now that you've reviewed some of the concepts, here are some details to notice about the architecture diagram:
Advice for following this tutorial
This tutorial has a lot of steps. Sometimes, you are asked to follow steps in external documents, such as Microsoft documentation. Don't miss the notes in this document that provide specifics for following the external steps.

This tutorial mainly uses Cloud Shell in the Google Cloud console. Though it's possible to set up failover clustering through the Google Cloud console user interface or the gcloud CLI, using Cloud Shell helps you complete the tutorial faster. When more appropriate, some steps use the Google Cloud console instead.

It's a good idea to take snapshots of your Compute Engine persistent disks along the way. If something goes wrong, you can use a snapshot to avoid starting over from the beginning. This tutorial suggests good times to take the snapshots.

If you find that things aren't working as you expect, there might be instructions in the section you're reading. Otherwise, refer to the Troubleshooting section.

Objectives
Costs
This tutorial uses Compute Engine images that include Windows Server licenses. This means the cost to run this tutorial can be significant if you leave VMs running. It's a good idea to stop the VMs when you're not using them. See the Pricing Calculator for an estimate of the costs to complete this tutorial.

Before you begin
Your cluster requires a custom network. Use VPC to create a custom network and one subnetwork by running commands in Cloud Shell.
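As an illustrative sketch only (the subnet name, region, and CIDR range below are assumed example values, not prescribed by this tutorial; the network name wsfcnet matches the firewall rules used elsewhere in the tutorial), the commands might look like the following:

```shell
# Create a custom-mode VPC network for the cluster.
gcloud compute networks create wsfcnet --subnet-mode custom

# Create one subnetwork in your chosen region.
# The subnet name, region, and CIDR range are example values.
gcloud compute networks subnets create wsfcnetsub1 \
    --network wsfcnet \
    --region us-central1 \
    --range 10.0.0.0/16
```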
Notice that the CIDR range for IP addresses in this subnetwork is an example range used for this tutorial. In production systems, work with your network administrators to allocate appropriate IP address ranges for your systems.

Create firewall rules
By default, your network is closed to external traffic. You must open ports in the firewall to enable remote connections to the servers. Use gcloud commands in Cloud Shell to create the rules. After you create the rules, you can list them to verify; the output looks similar to the following:

NAME              NETWORK  DIRECTION  PRIORITY  ALLOW     DENY  DISABLED
allow-all-subnet  wsfcnet  INGRESS    1000      all             False
allow-rdp         wsfcnet  INGRESS    1000      tcp:3389        False
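For example, the allow-all-subnet and allow-rdp rules might be created with commands like the following sketch (the source ranges are assumptions; adjust them to your subnet and your administrative network):

```shell
# Allow all traffic between hosts on the cluster subnetwork.
gcloud compute firewall-rules create allow-all-subnet \
    --network wsfcnet \
    --allow all \
    --source-ranges 10.0.0.0/16

# Allow RDP so that you can administer the servers remotely.
gcloud compute firewall-rules create allow-rdp \
    --network wsfcnet \
    --allow tcp:3389 \
    --source-ranges 0.0.0.0/0
```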
Enabling failover clustering in Compute Engine
To enable failover clustering in the Compute Engine agent, you need to add the enable-wsfc=true flag to your VM definitions, either by specifying it as custom metadata for the VM or by creating a configuration file on each VM, as described in the Compute Engine documentation. This tutorial defines the flag as custom metadata when the VMs are created, as described in the next section. The tutorial also relies on the default behavior for the agent's other settings, such as the health-check port, so you don't need to set these values.

Creating the servers
Next, create the three servers. Use the gcloud compute instances create command in Cloud Shell.

Create the first cluster-node server
Create a new Compute Engine instance. Configure the instance as follows:
Run the following command, replacing the zone placeholder with the name of your first zone:
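As a sketch only (the instance name, zone, machine type, and internal IP address are assumed example values; enable-wsfc=true is the metadata flag that enables the failover clustering agent):

```shell
# Example values only; substitute your own names, zone, and IP address.
gcloud compute instances create wsfc-node-1 \
    --zone us-central1-a \
    --machine-type n1-standard-2 \
    --image-family windows-2016 \
    --image-project windows-cloud \
    --network wsfcnet \
    --subnet wsfcnetsub1 \
    --private-network-ip 10.0.0.4 \
    --metadata enable-wsfc=true
```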
Create the second cluster-node server
For the second server, follow the same steps, except:
Replace the zone placeholder with the name of your second zone:
Create the third server for Active Directory
For the domain controller, follow the same steps, except:
Replace the zone placeholder with the name of your zone:
View your instances
You can see the details about the instances you created.
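For example, in Cloud Shell:

```shell
# List all instances in the project, with their zones, IPs, and status.
gcloud compute instances list
```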
You will see output listing the details of the instances you created.

Connecting to your VMs
To connect to a Windows-based VM, you must first generate a password for the VM. You can then connect to the VM using RDP.

Generating passwords
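A sketch of generating a password from Cloud Shell (the instance name, zone, and user name are assumed example values):

```shell
# Generate a fresh Windows password for the given account on the VM.
# The instance name, zone, and user name are example values.
gcloud compute reset-windows-password wsfc-node-1 \
    --zone us-central1-a \
    --user cluster_admin
```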
Connecting through RDP
The Compute Engine documentation provides details about how to connect to your Windows VM instances by using RDP. You can either:
Whenever this tutorial tells you to connect to a Windows instance, use your preferred RDP connection.

Configuring Windows networking
The internal IP addresses that you assigned when you created the VMs are static. To ensure that Windows treats the IP addresses as static, you need to add them, along with the IP addresses of the default gateway and the DNS server, to the Windows Server networking configuration. Use RDP to connect to the two cluster-node VMs and the domain controller VM, and repeat the following steps for each instance:
This is a good time to take snapshots of the two cluster-node VMs.

Setting up Active Directory
Now, set up the domain controller.
This is a good time to take a snapshot of the domain controller VM.

Create the domain user account
It can take some time for the domain controller to restart. Before joining servers to the domain, use RDP to sign in to the domain controller to validate that it is running. You need a domain user that has administrator privileges for the cluster servers. Follow these steps:
This tutorial uses the domain administrator account wherever such an account is required. In a production system, follow your usual security practices for allocating accounts and permissions. For more information, see Overview of Active Directory accounts needed by a failover cluster.

Join the servers to the domain
Add the two cluster-node servers to the domain. Perform the following steps on each cluster-node server:
This is a good point to take snapshots of all three VMs.

Setting up failover clustering

Reserve an IP address for the cluster in Compute Engine
When you create the failover cluster, you assign an IP address to create an administrative access point. In a production environment, you might use an IP address from a separate subnet. However, in this tutorial you reserve an IP address from the subnet you already created. Reserving the IP address prevents conflicts with other IP assignments.
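A sketch of the reservation (the address name, region, subnet, and IP value are assumed example values):

```shell
# Reserve a static internal IP address for the cluster access point.
# The name, region, subnet, and address are example values.
gcloud compute addresses create cluster-access-point \
    --region us-central1 \
    --subnet wsfcnetsub1 \
    --addresses 10.0.0.8
```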
Create the cluster
To create and configure the failover cluster:
Add the cluster administrator
Adding a domain account as an administrator for the cluster enables you to perform actions on the cluster from tools such as Windows PowerShell. Add the domain user account you created earlier as a cluster administrator.
This is a good point to take snapshots.

Creating the file share witness
You have a two-node failover cluster, but the cluster uses a voting mechanism to decide which node should be active. To achieve a quorum, you can add a file share witness. This tutorial simply adds a shared folder to the domain controller server. If this server were to go offline at the same time one of the cluster nodes is restarting, the entire cluster could stop working, because the remaining server can't achieve a quorum by itself. For this tutorial, the assumption is that GCP infrastructure features, such as live migration and automatic restart, provide enough reliability to keep the shared folder available. If you want to create a more highly available file share witness, you have these options:
Follow these steps to create the file share for the witness:
Configure sharing for the file share witness
You must set permissions on the file share witness folder to enable the cluster to use it.
Add the file share witness to the failover cluster
Now, configure the failover cluster to use the file share witness as a quorum vote.
Testing the failover cluster
Your Windows Server failover cluster should now be working. You can test manually moving cluster resources between your instances. You're not done yet, but this is a good checkpoint to validate that everything you've done so far is working.
You should see the name of the current host server change to the other VM. If this didn't work, review the previous steps to see if you missed anything. The most common issue is a missing firewall rule that blocks access on the network. Refer to the Troubleshooting section for more issues to check. Otherwise, you can now move on to setting up the internal load balancer, which is required in order to route network traffic to the current host server in the cluster. This is a good time to take snapshots.

Adding a role to the failover cluster
In Windows failover clustering, roles host clustered workloads. You can use a role to specify in the cluster the IP address that your application uses. For this tutorial, you add a role for the test workload, which is the Internet Information Services (IIS) web server, and assign an IP address to the role.

Reserve an IP address for the role in Compute Engine
To prevent IP addressing conflicts within your subnet in Compute Engine, reserve the IP address for the role.
Add the role
Follow these steps:
Creating the internal load balancer
Create and configure the internal load balancer, which is required in order to route network traffic to the active cluster host node. You will use the Google Cloud console, because the user interface gives you a good view into how internal load balancing is organized. You will also create a Compute Engine instance group for each zone in the cluster, which the load balancer uses to manage the cluster nodes.

Create the instance groups
Create an instance group in each zone that contains a cluster node, and then add each node to the instance group in its zone. Don't add the domain controller to an instance group.
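If you prefer Cloud Shell for this step, an equivalent sketch looks like the following (group, zone, and instance names are assumed example values):

```shell
# Create an unmanaged instance group in each zone,
# then add that zone's cluster node to it.
gcloud compute instance-groups unmanaged create wsfc-group-1 \
    --zone us-central1-a
gcloud compute instance-groups unmanaged add-instances wsfc-group-1 \
    --zone us-central1-a \
    --instances wsfc-node-1

gcloud compute instance-groups unmanaged create wsfc-group-2 \
    --zone us-central1-b
gcloud compute instance-groups unmanaged add-instances wsfc-group-2 \
    --zone us-central1-b \
    --instances wsfc-node-2
```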
Create the load balancer
Don't click Create yet.

Configure the backend
Recall that the GCP internal load balancer uses a periodic health check to determine the active node. The health check pings the Compute Engine cluster host agent that is running on the active cluster node. The health check payload is the IP address of the application, which is represented by the clustered role. The agent responds with a value of 1 if the node is active or 0 if it is not.
Configure the frontend
The frontend configuration creates a forwarding rule that defines how the load balancer handles incoming requests. For this tutorial, to keep things simple, you test the system by making requests between the VMs in the subnetwork. In a production system, you probably want to open the system up to external traffic, such as internet traffic. To do this, you can create a bastion host that accepts external traffic and forwards it to your internal network. Using a bastion host is not covered in this tutorial.
Review and finalize
Create firewall rules for the health check
You might have noticed that the Google Cloud console notified you that the health-check system requires a firewall rule to enable the health checks to reach their targets. In this section, you set up that firewall rule.

Note: The documentation about setting up an internal load balancer instructs you to create a firewall rule for the load balancer itself. For this tutorial, that rule is not needed, because you already created a rule that allows all traffic between nodes on the subnet, and the load balancer operates within this CIDR range and uses only allowed ports.
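A sketch of the rule (130.211.0.0/22 and 35.191.0.0/16 are the source ranges that Google Cloud health-check probes use; 59998 is the default health-check port for the cluster host agent; the rule name is an example value):

```shell
# Allow Google Cloud health-check probes to reach the cluster host agent.
gcloud compute firewall-rules create allow-health-check \
    --network wsfcnet \
    --allow tcp:59998 \
    --source-ranges 130.211.0.0/22,35.191.0.0/16
```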
Open the Windows Firewall
On each of the two cluster nodes, create a firewall rule in the Windows firewall to allow the load balancer to access each Windows system.
Validating the load balancer
After your internal load balancer is running, you can inspect its status to validate that it can find a healthy instance, and then test failover again.
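One way to inspect the status is to query the backend service's health from Cloud Shell (the backend service name and region are assumed example values):

```shell
# Show which backend instance the load balancer currently sees as healthy.
gcloud compute backend-services get-health wsfc-backend \
    --region us-central1
```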
Check the health of the load balancer's backends for your region. The output shows the health state of each backend instance.

Installing your application
Now that you have a cluster, you can set up your application on each node and configure it for running in a clustered environment. For this tutorial, you need to set up something that can demonstrate that the cluster is really working with the internal load balancer. Set up IIS on each VM to serve a simple web page. You're not setting up IIS for HA in the cluster. You are creating separate IIS instances that each serve a different web page. After a failover, the web server serves its own content, not shared content. Setting up your application or IIS for HA is beyond the scope of this tutorial.

Set up IIS
In each case, you see the Welcome page, which is the default IIS web page.

Edit the default web pages
Change each default web page so you can easily see which server is currently serving the page.
Now, when you view a web page served from one of these servers, the name of the server appears as the title in the Internet Explorer tab.

Test the failover
Congratulations! You now have a working Windows Server 2016 failover cluster running on GCP.

Troubleshooting
Here are some common issues to check if things aren't working.

GCP firewall rules block the health check
If the health check isn't working, double-check that you have a firewall rule that allows incoming traffic from the IP ranges that the health-check system uses: 130.211.0.0/22 and 35.191.0.0/16.

Windows Firewall blocks the health check
Make sure port 59998 is open in Windows Firewall on each cluster node. See Open the Windows Firewall.

Cluster nodes using DHCP
It's important that each VM in the cluster has a static IP address. If a VM is configured to use DHCP in Windows, change the networking settings in Windows to make the IPv4 address match the IP address of the VM as shown in the Google Cloud console. Also set the gateway IP address to match the address of the subnetwork gateway in the GCP VPC.

GCP network tags in firewall rules
If you use network tags in your firewall rules, be sure the correct tags are set on every VM instance. This tutorial doesn't use tags, but if you've set them for some other reason, they must be used consistently.

Clean up
To avoid incurring charges to your Google Cloud account for the resources used in this tutorial, either delete the project that contains the resources, or keep the project and delete the individual resources so that they stop using quota and incurring charges. The following sections describe how to delete or turn off these resources.

Deleting the project
The easiest way to eliminate billing is to delete the project that you created for the tutorial. To delete the project:
If you plan to explore multiple tutorials and quickstarts, reusing projects can help you avoid exceeding project quota limits.

Cleaning up resources without deleting the project
If you need to keep your project, you can clean up the tutorial resources by deleting them individually.