
Source: winadmins.wordpress.com (http://winadmins.wordpress.com/tag/clustering/)

Failover Clustering in Windows Server 2008 R2 Part 1

Introduction

A failover cluster is a group of independent computers that work together to increase the availability of applications and services. The clustered servers (called nodes) are connected by physical cables and by software. If one of the cluster nodes fails, another node begins to provide service (a process known as failover). Users experience a minimum of disruptions in service.

Windows Server Failover Clustering (WSFC) is a feature that can help ensure that an organization's critical applications and services, such as e-mail, databases, or line-of-business applications, are available whenever they are needed. Clustering can help build redundancy into an infrastructure and eliminate single points of failure. This, in turn, helps reduce downtime, guards against data loss, and increases the return on investment.

Failover clusters provide support for mission-critical applications that require high availability, scalability, and reliability, such as databases, messaging systems, file and print services, and virtualized workloads.

What is a Cluster?

A cluster is a group of machines acting as a single entity to provide resources and services to the network. In the event of a failure, a failover occurs to another system in that group, which maintains the availability of those resources to the network.

How Failover Clusters Work

A failover cluster is a group of independent computers, or nodes, that are physically connected by a local-area network (LAN) or a wide-area network (WAN) and that are programmatically connected by cluster software. The group of nodes is managed as a single system and shares a common namespace. The group usually includes multiple network connections and data storage connected to the nodes via storage area networks (SANs). The failover cluster operates by moving resources between nodes to provide service if system components fail.

Normally, if a server that is running a particular application crashes, the application will be unavailable until the server is fixed. Failover clustering addresses this situation by detecting hardware or software faults and immediately restarting the application on another node without requiring administrative intervention, a process known as failover. Users can continue to access the service and may be completely unaware that it is now being provided from a different server.

Figure: Failover clustering
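For a quick, hands-on view of this behavior, the sketch below (not part of the original walkthrough) lists the nodes and clustered groups and then performs a planned failover by moving a group to another node. It assumes the FailoverClusters PowerShell module is installed; MYCLUSTER matches the cluster built later in this article, while NODE2 and "File Server A" are placeholder names.

Import-Module FailoverClusters

# Nodes in the cluster and their state.
Get-ClusterNode -Cluster MYCLUSTER

# Clustered services/applications and the node that currently owns each one.
Get-ClusterGroup -Cluster MYCLUSTER | Format-Table Name, OwnerNode, State -AutoSize

# A planned failover: move one group to another node.
Move-ClusterGroup -Cluster MYCLUSTER -Name "File Server A" -Node NODE2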

Failover Clustering Terminology

1. Failover and Failback Clustering. Failover is the act of another server in the cluster group taking over where the failed server left off. An example of a failover system can be seen in the figure below. If you have a two-node cluster for file access and one node fails, the service will fail over to another server in the cluster. Failback is the capability of the failed server to come back online and take the load back from the node the original server failed over to.

2. Active/Passive cluster model:

Active/Passive is defined as a cluster group where one server is handling the entire load and, in case of failure or disaster, a passive node is standing by waiting for failover.

One node in the failover cluster typically sits idle until a failover occurs. After a failover, this passive node becomes active and provides services to clients. Because it was passive, it presumably has enough capacity to serve the failed-over application without performance degradation.

3. Active/Active failover cluster model

All nodes in the failover cluster are functioning and serving clients. If a node fails, its resources will move to another node and continue to function normally, assuming that the new server has enough capacity to handle the additional workload.

4. Resource. A hardware or software component in a failover cluster (such as a disk, an IP address, or a network name).

5. Resource group.

A combination of resources that are managed as a unit of failover. Resource groups are logical collections of cluster resources. Typically a resource group is made up of logically related resources such as applications and their associated peripherals and data. However, resource groups can contain cluster entities that are related only by administrative needs, such as an administrative collection of virtual server names and IP addresses. A resource group can be owned by only one node at a time, and individual resources within a group must exist on the node that currently owns the group. At any given instant, different servers in the cluster cannot own different resources in the same resource group.
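As a rough illustration (not from the original article), the following PowerShell sketch lists every resource together with the group it belongs to, which makes it easy to confirm that all resources in one group are owned by the same node. It assumes the FailoverClusters module is available on a cluster node.

Import-Module FailoverClusters

# Resources listed per Services and Applications group; each group fails over as a unit.
Get-ClusterGroup | Get-ClusterResource | Format-Table OwnerGroup, Name, ResourceType, State -AutoSize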

6. Dependency. An alliance between two or more resources in the cluster architecture.

7. Heartbeat.

The cluster's health-monitoring mechanism between cluster nodes. This health checking allows nodes to detect failures of other servers in the failover cluster by sending packets to each other's network interfaces. The heartbeat exchange enables each node to check the availability of other nodes and their applications. If a server fails to respond to a heartbeat exchange, the surviving servers initiate failover processes, including ownership arbitration for resources and applications owned by the failed server.


The heartbeat is simply packets exchanged between the passive and active nodes. When the passive node no longer sees the active node, it comes online and takes over the clustered resources.
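For reference, the heartbeat behavior can be inspected (and, if needed, tuned) through cluster common properties. This is a sketch rather than a required step, and it assumes the FailoverClusters module is available; the delay properties are in milliseconds and the threshold properties count missed heartbeats.

Import-Module FailoverClusters

# Heartbeat interval and tolerance for nodes on the same subnet and across subnets.
Get-Cluster | Format-List SameSubnetDelay, SameSubnetThreshold, CrossSubnetDelay, CrossSubnetThreshold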

8. Membership. The orderly addition and removal of nodes to and from the cluster.

9. Global update. The propagation of cluster configuration changes to all cluster members.

10. Cluster registry. The cluster database, stored on each node and on the quorum resource, maintains configuration information (including resources and parameters) for each member of the cluster.

11. Virtual server.

A combination of configuration information and cluster resources, such as an IP address, a network name, and application resources.

Applications and services running on a server cluster can be exposed to users and workstations as virtual servers. To users and clients, connecting to an application or service running as a clustered virtual server appears to be the same process as connecting to a single, physical server. In fact, the connection to a virtual server can be hosted by any node in the cluster. The user or client application will not know which node is actually hosting the virtual server.
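The building blocks of a virtual server show up in the cluster as ordinary resources, so a quick way to see them (a hedged sketch, not an original step) is to list all resources and look for the IP Address and Network Name entries in each group.

Import-Module FailoverClusters

# The IP Address and Network Name resources in a group together form the virtual server that clients connect to.
Get-ClusterResource | Format-Table Name, ResourceType, OwnerGroup, State -AutoSize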

12. Shared storage.

All nodes in the failover cluster must be able to access data on shared storage. The highly available workloads write their data to this shared storage. Therefore, if a node fails, when the resource is restarted on another node, the new node can read the same data from the shared storage that the previous node was accessing. Shared storage can be created with iSCSI, Serial Attached SCSI, or Fibre Channel, provided that it supports persistent reservations.

13. LUN

LUN stands for Logical Unit Number. A LUN is used to identify a disk or a disk volume that is presented to a host server or multiple hosts by a shared storage array or a SAN. LUNs provided by shared storage arrays and SANs must meet many requirements before they can be used with failover clusters, but when they do, the cluster nodes must have exclusive access to these LUNs.

Storage volumes or logical unit numbers (LUNs) exposed to the nodes in a cluster must not be exposed to other servers, including servers in another cluster. The following diagram illustrates this.
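Once LUNs have been presented to every node (and to no other servers), they can be discovered and brought into the cluster. The following is a hedged sketch, assuming the FailoverClusters module is available and the disks have already been initialized and formatted on one node.

Import-Module FailoverClusters

# Disks that are visible to all nodes but not yet clustered.
Get-ClusterAvailableDisk

# Add them to the cluster's Available Storage.
Get-ClusterAvailableDisk | Add-ClusterDisk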

14. Services and Applications group

Cluster resources are contained within a cluster in a logical set called a Services and Applications group, historically referred to as a cluster group. Services and Applications groups are the units of failover within the cluster. When a cluster resource fails and cannot be restarted automatically, the Services and Applications group this resource is a part of will be taken offline, moved to another node in the cluster, and then brought back online.

15. Quorum

The cluster quorum maintains the definitive cluster configuration data and the current state of each node, each Services and Applications group, and each resource and network in the cluster. Furthermore, when each node reads the quorum data, depending on the information retrieved, the node determines whether it should remain available, shut down the cluster, or activate any particular Services and Applications groups on the local node. To extend this even further, failover clusters can be configured to use one of four different cluster quorum models, and essentially the quorum type chosen for a cluster defines the cluster. For example, a cluster that utilizes the Node and Disk Majority quorum can be called a Node and Disk Majority cluster.

A quorum is simply a configuration database for the Microsoft Cluster service, and it is stored in the quorum log file. A standard quorum uses a quorum log file that is located on a disk hosted on a shared storage interconnect that is accessible by all members of the cluster.

Why quorum is necessary

When network problems occur, they can interfere with communication between cluster nodes. A small set of nodes might be able to communicate together across a functioning part of a network but might not be able to communicate with a different set of nodes in another part of the network. This can cause serious issues. In this "split" situation, at least one of the sets of nodes must stop running as a cluster.

To prevent the issues that are caused by a split in the cluster, the cluster software requires that any set of nodes running as a cluster use a voting algorithm to determine whether, at a given time, that set has quorum. Because a given cluster has a specific set of nodes and a specific quorum configuration, the cluster will know how many "votes" constitute a majority (that is, a quorum). If the number drops below the majority, the cluster stops running. Nodes will still listen for the presence of other nodes, in case another node appears again on the network, but the nodes will not begin to function as a cluster until quorum exists again.

For example, in a five-node cluster that is using Node Majority, consider what happens if nodes 1, 2, and 3 can communicate with each other but not with nodes 4 and 5. Nodes 1, 2, and 3 constitute a majority, and they continue running as a cluster. Nodes 4 and 5 are a minority and stop running as a cluster, which prevents the problems of a "split" situation. If node 3 loses communication with the other nodes, all nodes stop running as a cluster. However, all functioning nodes will continue to listen for communication, so that when the network begins working again, the cluster can form and begin to run.

There are four quorum modes:

Node Majority: Each node that is available and in communication can vote. The cluster functions only with a majority of the votes, that is, more than half.

Node and Disk Majority: Each node plus a designated disk in the cluster storage (the "disk witness") can vote, whenever they are available and in communication. The cluster functions only with a majority of the votes, that is, more than half.

Node and File Share Majority: Each node plus a designated file share created by the administrator (the "file share witness") can vote, whenever they are available and in communication. The cluster functions only with a majority of the votes, that is, more than half.


No Majority: Disk Only. The cluster has quorum if one node is available and in communication with a specific disk in the cluster storage. Only the nodes that are also in communication with that disk can join the cluster. This is equivalent to the quorum disk in Windows Server 2003. The disk is a single point of failure, so only select scenarios should implement this quorum mode.
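Each of these modes maps directly to a switch on the Set-ClusterQuorum cmdlet. The lines below are a hedged sketch of the options rather than steps to run now; you would run only one of them, and "Cluster Disk 1" is a placeholder disk name (\\NYDC01\MYCLUSTER is the witness share used later in this article).

Import-Module FailoverClusters

Set-ClusterQuorum -NodeMajority                                    # Node Majority
Set-ClusterQuorum -NodeAndDiskMajority "Cluster Disk 1"            # Node and Disk Majority
Set-ClusterQuorum -NodeAndFileShareMajority \\NYDC01\MYCLUSTER     # Node and File Share Majority
Set-ClusterQuorum -DiskOnly "Cluster Disk 1"                       # No Majority: Disk Only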

16. Witness disk. The witness disk is a disk in the cluster storage that is designated to hold a copy of the cluster configuration database. (A witness disk is part of some, but not all, quorum configurations.)

Configuration of a two-node Failover Cluster and Quorum Configuration:

A multi-site cluster is a disaster recovery solution and a high availability solution all rolled into one. A multi-site cluster gives you the best recovery point objective (RPO) and recovery time objective (RTO) available for your critical applications. With the introduction of failover clustering in Windows Server 2008, a multi-site cluster has become much more feasible thanks to cross-subnet failover and support for high-latency network communications.

Which editions include failover clustering?

The failover clustering feature is available in Windows Server 2008 R2 Enterprise and Windows Server 2008 R2 Datacenter. The feature is not available in Windows Web Server 2008 R2 or Windows Server 2008 R2 Standard.

Network Considerations

All Microsoft failover clusters must have redundant network communication paths. This ensures that a failure of any one communication path will not result in a false failover and ensures that your cluster remains highly available. A multi-site cluster has this requirement as well, so you will want to plan your network with that in mind. There are generally two things that will have to travel between nodes: replication traffic and cluster heartbeats. In addition, you will also need to consider client connectivity and cluster management activity.
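Once the cluster has been built (later in this walkthrough), you can verify that it actually sees more than one network and how each one is used. This is a hedged sketch, assuming the FailoverClusters module; Role 1 means cluster communication only, Role 3 means cluster and client traffic, and Role 0 means the network is not used by the cluster.

Import-Module FailoverClusters

# One entry per cluster network, with its role and subnet.
Get-ClusterNetwork | Format-Table Name, Role, Address, State -AutoSize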

Quorum model:

For a two-node multi-site cluster configuration, the Microsoft recommended configuration is a Node and File Share Majority quorum.

Step 1 – Configure the Cluster

Add the Failover Clustering feature to both nodes of your cluster by following the steps below (a PowerShell alternative is sketched after this list):

1. Click Start, click Administrative Tools, and then click Server Manager. (If the User Account Control dialog box appears, confirm that the action it displays is what you want, and then click Continue.)

2. In Server Manager, under Features Summary, click Add Features. Select Failover Clustering, and then click Install.

3. Follow the instructions in the wizard to complete the installation of the feature. When the wizard finishes, close it.

4. Repeat the process for each server that you want to include in the cluster.


5. Next you will want to have a look at your network connections. It is best if you rename the connections on each of your servers to reflect the network that they represent. This will make things easier to remember later.

Go to the properties of the Cluster (or private) network and clear the Register this connection's addresses in DNS check box.

6. Next, go to Advanced Settings of your Network Connections (press Alt to see the Advanced Settings menu) on each server and make sure the Public network (LAN) is first in the list:

7. Your private network should only contain an IP address and subnet mask. No default gateway or DNS servers should be defined. Your nodes need to be able to communicate across this network, so make sure the servers can communicate across it; add static routes if necessary.
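As an alternative to steps 1 through 4 above, the feature can be installed from an elevated PowerShell prompt on each node. This is a sketch of the equivalent commands; on Windows Server 2008 R2 the Add-WindowsFeature cmdlet lives in the ServerManager module.

Import-Module ServerManager

# Installs the Failover Clustering feature (run on every node that will join the cluster).
Add-WindowsFeature Failover-Clustering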

Step 2 – Validate the Cluster Configuration:

1. Open up the Failover Cluster Manager and click on Validate a Configuration.

2. The Validation Wizard launches and presents you with the first screen as shown below. Add the two servers in your cluster and click Next to continue.

3. We need this cluster to be supported, so we must run all the needed tests.

4. Select Run all tests.

5. Click Next until it produces a report like the one below.

When you click on View Report, it will display a report similar to the one below:
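Validation can also be kicked off from PowerShell, which is handy for re-running it later. A minimal sketch, assuming the FailoverClusters module is installed and using NODE1 and NODE2 as placeholder server names; the cmdlet writes a validation report file you can open afterwards.

Import-Module FailoverClusters

Test-Cluster -Node NODE1, NODE2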

Step 3 – Create a Cluster:

In the Failover Cluster Manager, click on Create a Cluster.

The next step is that you must create a name for this cluster and an IP address for administering it. This will be the name that you will use to administer the cluster, not the name of the SQL cluster resource, which you will create later. Enter a unique name and IP address and click Next.

Note: This is also the computer name that will need permission to the File Share Witness, as described later in this document.


Confirm your choices and click Next.

Click Next until Finish; this will create the cluster with the name MYCLUSTER.
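The same result can be achieved from PowerShell. This is a hedged sketch: NODE1 and NODE2 are placeholder node names and 10.0.1.50 is a placeholder for whatever static administrative IP address you chose.

Import-Module FailoverClusters

# Creates the cluster together with its administrative name and IP address.
New-Cluster -Name MYCLUSTER -Node NODE1, NODE2 -StaticAddress 10.0.1.50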

Step 4 – Implementing a Node and File Share Majority quorum

First, we need to identify the server that will hold our File Share witness. This File Share witness should be located in a third location, accessible by both nodes of the cluster. Once you have identified the server, share a folder as you normally would. In my case, I create a share called MYCLUSTER on a server named NYDC01.

The key thing to remember about this share is that you must give the cluster computer account read/write permissions to the MYCLUSTER share at both the share level and the NTFS level.
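If you prefer the command line, the share and its permissions can be created on NYDC01 along the following lines. This is a sketch under stated assumptions: D:\MYCLUSTER is a placeholder folder path, CONTOSO is a placeholder domain, and MYCLUSTER$ is the cluster computer account created in the previous step.

# Run on NYDC01.
New-Item -Path D:\MYCLUSTER -ItemType Directory

# Share-level permission for the cluster computer account.
net share 'MYCLUSTER=D:\MYCLUSTER' '/GRANT:CONTOSO\MYCLUSTER$,FULL'

# NTFS-level permission (read/write) on the folder.
icacls D:\MYCLUSTER /grant 'CONTOSO\MYCLUSTER$:(OI)(CI)M'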

Now with the shared folder in place and the appropriate permissions assigned, you are ready to change your quorum type. From Failover Cluster Manager, right-click on your cluster, choose More Actions, and then Configure Cluster Quorum Settings.

On the next screen choose Node and File Share Majority and click Next.

In this screen, enter the path to the file share you previously created and click Next.

Confirm that the information is correct, click Next through to the summary page, and click Finish.

Now when you view your cluster, the Quorum Configuration should say "Node and File Share Majority" as shown below.
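The equivalent quorum change, and a quick check of the result, can be done from PowerShell. A minimal sketch, assuming the FailoverClusters module and the \\NYDC01\MYCLUSTER witness share created above:

Import-Module FailoverClusters

# Switch the cluster to Node and File Share Majority using the witness share.
Set-ClusterQuorum -Cluster MYCLUSTER -NodeAndFileShareMajority \\NYDC01\MYCLUSTER

# Confirm the quorum type and witness resource now in use.
Get-ClusterQuorum -Cluster MYCLUSTER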

The steps I have outlined up to this point apply to any multi-site cluster, whether it is a SQL, Exchange, file server, or other type of failover cluster. The next step in creating a multi-site cluster involves integrating your storage and replication solution into the failover cluster.
