load balancer

Cisco Systems, Inc. All contents are Copyright © 1992–2004 Cisco Systems, Inc. All rights reserved. Important Notices and Privacy Statement.

White Paper

The Global Server Load Balancing Primer

Foreword

Business continuance and disaster recovery planning were once considered low business priorities. Recent world events have heightened the IT professional’s focus on deploying business continuance and disaster-recovery architectures.

As mission-critical applications have become Web-enabled, IT professionals must understand how these applications will withstand an array of disruptions ranging from catastrophic natural disasters, to acts of terrorism, to technical failures. To effectively react to a business-continuance situation, all business organizations must have a comprehensive disaster-recovery plan that involves several elements, including:

• Compliance with national government regulations

• Human health and safety

• Reoccupation of an affected site

• Recovery of vital records

• Recovery of information systems (including LAN and WAN recovery), electronics, and telecommunications

This primer focuses on the recovery of information systems, and on applications that are Web-based or use Domain Name System (DNS) infrastructure. It also describes the products involved in a disaster-recovery plan, deployment scenarios, and functional interaction between products, both internal and external to the data center.

Overview

Reasons for Global Server Load Balancing

Disaster Recovery and Business Continuance

In today’s electronic economy, any application downtime quickly threatens a business’s livelihood. Enterprises lose thousands of dollars in productivity and revenue for every minute of IT downtime. A recent Forrester Research survey of 250 Fortune 1000 companies revealed that these businesses lose a staggering US$13,000 for each minute that an enterprise resource planning (ERP) application is inaccessible. The cost of supply-chain management application downtime runs a close second at $11,000 per minute, followed by e-commerce ($10,000).

To avoid costly disruptions, enterprises are turning to intelligent networking capabilities to distribute and load balance their corporate data centers, where many of their core business applications reside. The intelligence now available in IP networking devices can inspect many variables about the content of an IP packet. Based on this information, the network can direct traffic to the best-available, least-loaded sites and servers that will provide the fastest—and best—response.

Improve Customer Service

A network using global server load balancing (GSLB) directs users to the most appropriate data centers for their

requests, improving the end-user experience. For example, a software manufacturer offers its product via download

from its Internet site but encounters customer satisfaction issues when download time is too long. An intelligent

GSLB solution will determine which of the manufacturer’s multiple data centers, located in disparate locations with

mirrored content, is closest to the requesting client. A client in Asia will receive content from a data center in Asia;

a client in Europe will receive content from Europe; and a client in North America will receive content from North

America. The result is drastically reduced wait times for software downloads, and increased customer satisfaction.

Save Wide-Area Bandwidth

As companies extend applications throughout global or dispersed organizations, they can be hindered by limited

WAN bandwidth. For example, an international bank has 500 remote offices worldwide that are supported by six

distributed data centers. This bank wants to deploy sophisticated, content-rich applications to all of its offices

without upgrading the entire WAN infrastructure. An intelligent GSLB solution that can point the client to a local

data center for content requests, instead of one located remotely, will save costly bandwidth and upgrade expenses.

Types of GSLB Deployments

Active-Standby

The traditional disaster-recovery deployment uses two data centers—one that is active and a second one that is

dormant, operating in standby mode. In this scenario, the activation of the second data center requires physically

moving business-critical information to the dormant site, reloading this information, and reactivating all IT systems

with the backup information. This approach was used because server technology was not capable of supporting a

single application spanning multiple distributed data centers.

From a business perspective, the second dormant data center is a cost center with value realized only when a

catastrophic event occurs. Also, the readiness of the standby data center is simulated but never fully tested until an

actual disaster event occurs.

Active-Active [1]

There is a more cost-effective and resilient deployment alternative, where both data centers are active. With the advancement of Web clustering technologies and storage area networks (SANs), an application can be mirrored in multiple locations to improve performance and availability. Instead of increasing the number of networking devices within a single data center, new networking devices are deployed in a secondary, fully operational data center, ensuring that all mission-critical data is properly shared across each distributed data center. This architecture is inherently more resilient to interruptions.

[1] The deployment requirements of simultaneously operating data centers depend on business objectives and on the capabilities of the technology used. Following are examples of the common business needs for an active-active deployment:

– Users accessing the data center from the Internet or intranet must be routed across multiple data centers based on congestion levels within the data center

– In case of a failure, users accessing the data center must be efficiently rerouted to an available data center

– Users must be automatically routed to the closest geographic data center to improve Internet download time

– A globally deployed internal application requires that users be routed to the closest data center to save costly international WAN bandwidth

Web-clustering technologies and SANs address the back-end requirements of deploying distributed data centers.

However, IT professionals also need a front-end system that can properly load-balance incoming user traffic and

respond to outages within each distributed data center.

An active-active GSLB deployment focuses on intelligently, globally load-balancing client traffic across the two active

data centers. This type of deployment eliminates the uncertainty associated with reactivating a standby data center.

Proximity

A proximity deployment is a variation of the active-active GSLB deployment, designed to route the client to the

“closest” data center to achieve better customer Web experience and save expensive wide-area bandwidth. This

deployment is critical for applications that involve the transfer of large volumes of data, require a global presence,

or rely on real-time interaction with remote users.
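
As a rough illustration of the proximity idea, the sketch below picks the data center with the lowest measured round-trip time toward a client's name server. The data center names and the RTT figures are hypothetical, and this primer does not specify how a real GSLB device measures proximity, so treat the selection rule as an assumption.

```python
from typing import Dict, Optional


def closest_data_center(rtt_ms: Dict[str, Optional[float]]) -> Optional[str]:
    """Return the data center with the lowest round-trip time.

    rtt_ms maps a data center name to its measured RTT in milliseconds,
    or None when no measurement is available (treated as unreachable).
    """
    reachable = {name: rtt for name, rtt in rtt_ms.items() if rtt is not None}
    if not reachable:
        return None
    return min(reachable, key=reachable.get)


# Hypothetical measurements taken toward one client name server (D-proxy).
measurements = {"asia": 35.0, "europe": 180.0, "north-america": 240.0}
print(closest_data_center(measurements))
```

With these assumed figures, a client whose name server is nearest the Asian site would be sent to that site.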

Figure 1 DNS-Based GSLB Processes

GSLB Processes

A complete understanding of GSLB deployments requires familiarity with the GSLB processes involved (Figure 1). A

DNS-based approach to GSLB consists of five processes:

• DNS—This front-end process focuses on the interaction with the DNS servers.

• Keepalives—This back-end process gathers state and load information from devices within the data center such

as local server load balancers and origin servers.

• Rules and associations—This process establishes a relationship between the front-end and back-end process.

• Global load-balancing management policy and algorithms—These processes send client traffic to the distributed

data centers based on a defined policy.

• Exception handling—This process focuses on which conditions constitute a failure event and what actions the

GSLB device should perform.

Note: Most of this paper focuses on DNS-based GSLB. There are other approaches such as route health injection

(covered in this document) and HTTP-based redirection (not covered in this document).

DNS—The Front-End Process to a GSLB Solution

The front-end process of GSLB interacts with the DNS infrastructure, which was developed because humans can remember names better than numbers. Without the DNS infrastructure, people would have to remember the IP address of every Website they wanted to visit. This would require a directory or a phone book listing all Websites and their IP addresses. To connect to the Cisco Systems® Website, a user would have to look up its IP address (198.133.219.25) and enter it into the browser. The DNS architects created the concept of a host name. When a user types www.cisco.com, the DNS infrastructure returns the IP address of the Cisco® Website to the browser.

The DNS structure is based on a hierarchical tree similar to common file systems. The primary components in this

infrastructure are:

• DNS resolvers—Clients that access client name servers.

• Client name server—A server running DNS software that locates the requested Website. It is sometimes called

client DNS proxy (D-proxy).

• Root name server—Resides at the top of the DNS hierarchy, and knows how to reach the name servers for every top-level domain (the extension after the last “.” in the host name). There are many top-level domains; the most common are com, org, edu, net, gov, mil, and arpa. There are 13 named root servers (A through M) handling requests for the entire Internet.

• Intermediate name server—Used for scaling purposes. When the root name server does not have the IP address

of the authoritative name server, the root name server will send the requesting client name server to an

intermediate name server. The intermediate name server will send the client name server to the authoritative

name server.

• Authoritative name server—Provides IP addresses for requested domains. When the client name server asks the

question, “What is the IP address for Cisco.com?” this name server will respond directly to the client name server

(not the client) with the IP address for Cisco.com. The authoritative name server is run by the enterprise or can

be outsourced to a service provider.

DNS Resolution Process Without GSLBs

The following six steps (Figure 2) are performed by the DNS infrastructure to return an IP address to a client trying

to access www.cisco.com.

Figure 2 DNS Resolution Process Without GSLB

1. The resolver (client) sends a query for www.cisco.com to the local client name server (D-proxy). In this case, the

Cisco Network Registrar is acting as the D-proxy.

2. The local D-proxy does not have an IP address for www.cisco.com, so it sends a query to a root name server. The root name server can respond to the request in two different ways; the most common way is to send the D-proxy directly to the authoritative name server for www.cisco.com. Another method, called “iterated query,” is when the root name server sends the D-proxy to an intermediate name server that knows the address of the authoritative name server for www.cisco.com.

3. The local D-proxy sends a query to the intermediate name server, which responds, referring the D-proxy to the

authoritative name server for Cisco.com.

4. The local D-proxy sends a query to the authoritative name server. This name server is authoritative for Cisco.com, which is the parent domain, and is therefore also authoritative for the subdomain www.cisco.com. The authoritative name server sends the IP address to the D-proxy.

5. In the next-to-last step of the DNS process, the D-proxy sends the IP address (198.133.219.25) to the client

browser.

6. The browser uses this IP address and initiates a connection to the www.cisco.com Website (red arrow).

To improve performance and scalability, the D-proxy caches this IP address for the period specified by the record's time-to-live (TTL) value. If the resolver (client) asks the D-proxy for the IP address for www.cisco.com again within that period, the D-proxy sends the IP address 198.133.219.25 directly to the client browser instead of repeating this process.

This entire six-step process is completely transparent to the user.
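
The caching behavior of the D-proxy can be sketched as a small time-to-live cache. This is a toy model rather than how any particular name server is implemented: the class name `DProxyCache`, the injected `resolve` callback standing in for steps 2 through 4, and the 300-second TTL are all assumptions made for illustration.

```python
import time
from typing import Callable, Dict, List, Tuple


class DProxyCache:
    """Toy model of a client name server (D-proxy) answer cache."""

    def __init__(self, resolve: Callable[[str], Tuple[str, int]],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._resolve = resolve   # full lookup, returns (ip, ttl_seconds)
        self._clock = clock       # injectable so the example can fake time
        self._cache: Dict[str, Tuple[str, float]] = {}  # name -> (ip, expiry)

    def lookup(self, name: str) -> str:
        now = self._clock()
        cached = self._cache.get(name)
        if cached is not None and cached[1] > now:
            return cached[0]      # entry still fresh: answer from cache
        ip, ttl = self._resolve(name)
        self._cache[name] = (ip, now + ttl)
        return ip


calls: List[str] = []


def fake_upstream(name: str) -> Tuple[str, int]:
    """Stand-in for the root/intermediate/authoritative exchange."""
    calls.append(name)
    return "198.133.219.25", 300  # address from the text; TTL is assumed


fake_now = [0.0]
cache = DProxyCache(fake_upstream, clock=lambda: fake_now[0])
cache.lookup("www.cisco.com")     # cache miss: full resolution
cache.lookup("www.cisco.com")     # answered from cache
fake_now[0] = 301.0               # advance past the TTL
cache.lookup("www.cisco.com")     # entry expired: resolved again
print(len(calls))
```

Only the first and third lookups reach the upstream resolver; the second is served from the cache because the entry has not yet expired.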

DNS Resolution Process with GSLB

In the previous example, the enterprise was supporting only a single data center. Motivated by the need for a resilient

architecture, this same enterprise has deployed a secondary active data center.

With the deployment of the secondary data center, the enterprise needs a solution that can support two active data centers. The typical DNS name server can partially support this new architecture, but lacks several critical capabilities.

The typical DNS name server cannot:

• Determine if the devices within the data center are available or unavailable

• Determine if the server within the data center is overloaded

• Determine which server load balancer is the best performing

• Determine which data center is closer to the client that is requesting content

• Intelligently manage the client traffic flow to each data center

• React quickly to changes in availability or load on the devices within the data center

• Provide data center persistence

• Give conditional responses, such as “data center one is unavailable and data center two is overloaded, so send all traffic to a third data center”

The Cisco GSLB solution, discussed later in this document, delivers all the critical capabilities needed to support

multiple distributed data centers that cannot be supported by the typical DNS name server. In addition, the Cisco GSLB

solution can be used in either active-active or active-standby deployments. This solution is also fully compatible with

the existing DNS process that has been previously described—meaning the Cisco GSLB solution can be deployed with

a minor change to the DNS process.

The DNS administrator makes the Cisco GSLB products authoritative for the subdomain (called “delegation”). The

DNS administrator adds two “name server” records to the authoritative name server that was previously responsible

for issuing the IP address for www.cisco.com.

Figure 3 DNS Resolution Process With GSLB

This new DNS configuration adds a step to the DNS resolution process. Figure 3 shows the new DNS resolution

process:

1. No change.

2. No change.

3. No change.

4. Authoritative name server tells the D-proxy to ask the Cisco GSLB product for the IP address for www.cisco.com.

5. The Cisco GSLB product is authoritative for the www.cisco.com subdomain, so it sends the IP address to the

D-proxy. The Cisco GSLB product applies intelligence to its response. Following are examples of this intelligence,

which is not supported by a generic DNS name server. Cisco GSLB products:

– Will not send an IP address to the D-proxy if the device is unavailable.

– Will not send an IP address to the D-proxy if the device is overloaded.

– Can send the users to the most proximate data center.

– Can intelligently manage client traffic flow to each data center.

– Will issue a virtual IP address, not a real IP address of a back-end server.

– Can automatically reroute users to an alternative data center if the primary data center becomes unavailable.

– Can mix and match the response to changing conditions. For example, send all users to data center 1 until it

reaches 70 percent capacity, and then send all new users to data center 2. If data center 2 becomes overloaded

or unavailable, send all new users to a third data center that is hosted by a service provider.

These intelligent GSLB capabilities are achieved through the information-gathering processes and algorithms

supported in Cisco GSLB products. A full description of these processes and algorithms is included later in this

document.
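
The conditional behavior in the last example above (use data center 1 up to 70 percent capacity, then data center 2, then a hosted third site) can be sketched as an ordered fallback list. This is a minimal sketch under stated assumptions, not the algorithm used by any Cisco product: the `DataCenter` fields, the load figures, and the documentation-range IP addresses are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataCenter:
    name: str
    vip: str          # virtual IP address returned in the DNS answer
    available: bool   # result of the latest keepalive
    load: float       # fraction of capacity in use, 0.0 to 1.0
    max_load: float   # threshold at which the site counts as overloaded


def select_vip(ordered_sites: List[DataCenter]) -> Optional[str]:
    """Return the VIP of the first site that is up and below its threshold."""
    for site in ordered_sites:
        if site.available and site.load < site.max_load:
            return site.vip
    return None  # no site can take new users; the caller decides what to do


# Hypothetical state: dc1 is over 70 percent, dc2 is down, dc3 is healthy.
sites = [
    DataCenter("dc1", "192.0.2.10", available=True, load=0.82, max_load=0.70),
    DataCenter("dc2", "198.51.100.10", available=False, load=0.10, max_load=0.70),
    DataCenter("dc3-hosted", "203.0.113.10", available=True, load=0.30, max_load=0.70),
]
print(select_vip(sites))
```

Because the answer is always a virtual IP address, the back-end servers behind each site stay hidden from the D-proxy, as the list above describes.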

Keepalives—The Back-End Process to a GSLB Solution

The keepalive process is the information-collection mechanism of an advanced GSLB solution. A keepalive is a

specific interaction, or “handshake,” between two devices using a commonly supported protocol. A keepalive tests

if a specific protocol stack on a networking device, such as a server, router, or switch, is functioning properly. The

logic of a keepalive is: A successful handshake means the target device is available, active, and able to receive traffic.

A failed handshake means the target device is unavailable and inactive. This technique allows the device issuing a

keepalive to understand the operational status of the device responding to the keepalive.

The most common keepalive is an Internet Control Message Protocol (ICMP) ping, documented in RFC 792. With

a simple Layer 3 ping, if the device issuing the ping receives a proper response it knows two things: The Layer 3

network between the two devices is working properly, and the Layer 3 protocol stack on the device receiving the ping

is active. An intelligent GSLB device will have a full suite of keepalives capable of supporting a wide range of

protocols.
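
The keepalive logic described above (a successful handshake means available, a failed one means unavailable) can be sketched with a TCP connection attempt. Note the substitution: an ICMP ping needs raw sockets and elevated privileges on most systems, so this sketch tests the TCP stack of the target instead. The function name and the default timeout are assumptions.

```python
import socket


def tcp_keepalive(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP handshake with host:port completes in time.

    A completed handshake shows that the network path and the TCP stack
    on the target are working; it says nothing about application health.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

A GSLB device would run a check like this on a schedule for every monitored device and feed the results to its load-balancing policy; for example, `tcp_keepalive("192.0.2.10", 80)` would probe a hypothetical Web server.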

Rules and Associations—Determining Interrelationships

When determining interrelationships in a GSLB solution, the user decides what interrelationship should exist

between the DNS subdomain being advertised to the public and the devices within the data center supporting this

domain. These rules and associations can be generic or complex.

An example of a generic association is that the DNS administrator is not concerned about which server load balancer

is selected by the global load balancer. This same logic could apply to physically distributed servers; the DNS

administrator only cares that the devices supporting this distinct domain are available and not overloaded.

Alternatively, an association could be very complex. For example, the DNS administrator wants all the users from a

specific ISP requesting access to a distinct domain name to be sent to a special set of servers only. If the special servers

become overloaded or unavailable, the users should be automatically sent to a different set of servers at a different

location.

This makes it possible to design a solution that allows the mixing and matching of different subdomains with

different equipment deployed throughout the enterprise. An example would involve advertising three subdomains on

the Internet:

1. Subdomain 1 has fully mirrored content at two data centers, and needs to be dynamically load-balanced across

two sites. Distributed load balancers at each data center support the domain and are used to load-balance user

requests across several servers in each data center.

2. Subdomain 2 is a popular application that is hard-coded to a specific superserver that should be rerouted only

when a major network interruption occurs.

3. Subdomain 3 is supported by several standalone servers distributed worldwide, and the selection is conditional.

The user should be sent to the remote server; if the remote server is unavailable or overloaded, the user should

be sent to the main data center.

The administrator can set up any number of subdomain-to-device associations. Cisco GSLB products support these

rules and associations capabilities.
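
One way to picture these subdomain-to-device associations is as a table that maps each subdomain to an ordered list of answer groups. The sketch below is an illustration only: the subdomain names, the addresses, and the rule that a later group is used only when an earlier group has no healthy member are assumptions, not a description of a Cisco configuration.

```python
from typing import Dict, List, Set

# Hypothetical associations modeled on the three subdomains above.
ASSOCIATIONS: Dict[str, List[List[str]]] = {
    # Subdomain 1: mirrored content, balanced across two sites.
    "www.example.com": [["192.0.2.10", "198.51.100.10"]],
    # Subdomain 2: one superserver, rerouted only on a major interruption.
    "app.example.com": [["192.0.2.20"], ["198.51.100.20"]],
    # Subdomain 3: remote server first, main data center as fallback.
    "files.example.com": [["203.0.113.30"], ["192.0.2.30"]],
}


def candidates(subdomain: str, healthy: Set[str]) -> List[str]:
    """Return the healthy addresses of the first group that has any."""
    for group in ASSOCIATIONS.get(subdomain, []):
        usable = [ip for ip in group if ip in healthy]
        if usable:
            return usable
    return []


# The remote server for subdomain 3 is down, so the main site is used.
print(candidates("files.example.com", {"192.0.2.30", "192.0.2.10"}))
```

The set of healthy addresses would come from the keepalive process, and the returned candidates would then be handed to the global load-balancing algorithm.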

Global Load Balancing Management Policy and Algorithms

The DNS administrator creates a traffic management policy and algorithms to control how incoming traffic is

load-balanced across distributed data centers. A GSLB policy consists of three elements:

1. A global algorithm used, for example, to equally distribute all inbound traffic across two data centers. The most

common global algorithms are round robin and weighted round robin.

2. Detection of changes of state, for example, when keepalive information indicates that all devices within the

primary data center are unavailable or overloaded.

3. A dynamic or manual enforcement of this procedure based on information detected by the keepalive mechanism.

For example, send all traffic to the secondary data center by issuing the new IP address of the secondary data

center.
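
The three policy elements can be combined in a short sketch: a weighted round-robin rotation (the global algorithm), a set of healthy addresses supplied by keepalives (detection of state changes), and skipping of unhealthy entries (enforcement). The class name, weights, and addresses are hypothetical, and the expanded-list schedule is just one simple way to realize weighted round robin.

```python
from typing import Dict, List, Optional, Set


class GslbPolicy:
    """Weighted round robin that skips addresses marked unhealthy."""

    def __init__(self, weights: Dict[str, int]) -> None:
        # Expand weights into a repeating schedule, e.g. {A: 2, B: 1} -> A A B.
        self._schedule: List[str] = [
            vip for vip, weight in weights.items() for _ in range(weight)
        ]
        self._position = 0

    def answer(self, healthy: Set[str]) -> Optional[str]:
        """Return the next healthy VIP in the rotation, or None if none is."""
        for _ in range(len(self._schedule)):
            vip = self._schedule[self._position]
            self._position = (self._position + 1) % len(self._schedule)
            if vip in healthy:
                return vip
        return None


PRIMARY, SECONDARY = "192.0.2.10", "198.51.100.10"
policy = GslbPolicy({PRIMARY: 2, SECONDARY: 1})

# Both sites healthy: two answers for the primary, then one for the secondary.
print([policy.answer({PRIMARY, SECONDARY}) for _ in range(3)])

# Keepalives report the primary as down: every answer is the secondary.
print([policy.answer({SECONDARY}) for _ in range(3)])
```

Setting equal weights gives plain round robin. A production scheduler would usually interleave entries rather than emit them in runs, but the proportions are the same.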

Exception Handling

Exception handling refers to how a GSLB deployment recovers from unexpected events.

DNS administrators use this process to establish recovery scenarios. They need to define what constitutes a failure,

what action to take, where to route the new and existing traffic, and what to do if the primary backup plan fails.
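
As an example of defining "what constitutes a failure", the sketch below declares a device failed only after a number of consecutive missed keepalives, so that a single lost probe does not trigger rerouting. The threshold of three and the class name are assumptions; the primer does not prescribe a specific rule.

```python
class FailureDetector:
    """Declare failure after `threshold` consecutive missed keepalives."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        self._misses = 0

    def record(self, keepalive_ok: bool) -> bool:
        """Record one keepalive result; return True while the device is failed."""
        self._misses = 0 if keepalive_ok else self._misses + 1
        return self._misses >= self.threshold


detector = FailureDetector(threshold=3)
results = [False, False, True, False, False, False]
print([detector.record(ok) for ok in results])
```

In this run the device is reported as failed only on the last result, because the successful keepalive in the middle resets the count. The actions to take on failure, and the fallback if the backup plan also fails, would be layered on top of a detector like this.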

Cisco GSLB Solution

Cisco offers a complete GSLB product line that meets the requirements of any enterprise or service provider. The Cisco GSLB solution includes products integrated within server load balancers, products integrated within routers, and a dedicated appliance. Each product offers unique strengths and capabilities. The Cisco GSLB products do not completely replace the authoritative DNS name server within the data center; rather, the GSLB product becomes authoritative for one or more applications (that is, for their subdomains).

Integrated Within Server Load Balancers

Cisco CSS 11500 Series content services switches (Figure 4) and the Cisco Content Switching Module (CSM) for

Cisco Catalyst® 6500 Series switches (Figure 5) offer an integrated GSLB solution. These server load balancers

contain DNS processing capabilities, allowing a network designer to deploy a GSLB solution without the addition

of any external devices to process DNS requests. These Cisco products are targeted at disaster recovery and global

load balancing of a small number of data centers.

In switch-based GSLB deployments, the Cisco CSM and the Cisco CSS 11500 Series are simultaneously performing

two major processes. The first process is interacting with the DNS infrastructure, as described previously, and the

second process is load-balancing client traffic across multiple servers. These products are unique in that they combine

local server load balancing with global server load balancing.

Figure 4 Cisco CSS 11503 Content Services Switch

Specifically, the Cisco CSS 11500 Series employs the Content and Application Peering Protocol (CAPP) to construct

a content mesh, which allows the switches to exchange content and GSLB information. This intelligent mesh enables

a powerful automatic global load-balancing solution. CAPP helps to ensure that the least-loaded site or server

responds to Website requests.

The Cisco CSM also supports GSLB deployments. It can be configured to be authoritative for the domains that will

be globally load balanced.

The Cisco CSM supports numerous global load-balancing algorithms that can be tuned to any customer

requirements.

Figure 5 Cisco Content Switching Module for the Cisco Catalyst 6500 Series Switches

Integrated Within Routers

The Cisco DistributedDirector software is integrated into Cisco IOS® Software, available with Cisco routers. Capable of making global decisions based on router metrics and real-time information, this product is also targeted at disaster recovery and global load balancing of distributed data centers.

A description of the GSLB processes performed by Cisco DistributedDirector is available at:

http://www.cisco.com/en/US/products/hw/contnetw/ps813/products_white_paper09186a0080091e1c.shtml.

Dedicated Appliance

The Cisco GSS 4490 Global Site Selector (GSS; Figure 6) was introduced to expand the GSLB services provided by

integrated Cisco solutions. The Cisco GSS 4490 is ideal for enterprises and service providers that need more network

design flexibility and extra scalability to provide disaster recovery and global load balancing of hundreds or

thousands of domains. The Cisco GSS 4490 is tightly integrated with market-leading Cisco content switches.

Customers using Cisco content services switches such as the Cisco CSS 11500 Series and Cisco CSM, or those who

are using traditional switches such as the Cisco CSS 11000 Series and Cisco LocalDirector 400 Series, can benefit

from the new levels of traffic management and centralized command and control provided by the Cisco GSS 4490.

Figure 6 Cisco GSS 4490 Global Site Selector

Cisco GSLB Solution Deployment Scenarios

In the Cisco GSLB solution, the global load-balancing and redirection functions are handled by the DNS software

within the Cisco CSS 11500 Series, the Cisco CSM, and the Cisco GSS 4490 dedicated appliance.

All of these products can be integrated into an enterprise or service provider’s DNS infrastructure by being made

authoritative for the domains that will be globally load balanced. This means the DNS administrator needs to add

name-server records to the authoritative name server within the data center. In the previous example, this was the

Cisco.com name server.

The number of name-server records required in a subdomain corresponds to the number of GSLB devices deployed.

When a network designer deploys two GSLB devices, then two name-server records need to be configured; if three

GSLB devices are required, then three name-server records are required, and so on. These records contain the

subdomain name and the IP address of the GSLB device deployed.
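
The one-record-per-device rule can be illustrated by generating the delegation entries from a list of GSLB devices. In standard zone-file syntax an NS record names a host, and a separate A record supplies that host's IP address, so the sketch emits both. The subdomain, host names, addresses, and TTL are hypothetical.

```python
from typing import Dict, List


def delegation_records(subdomain: str, gslb_devices: Dict[str, str],
                       ttl: int = 3600) -> List[str]:
    """Build one NS record per GSLB device, plus an A record for each device."""
    ns_lines = [f"{subdomain}. {ttl} IN NS {host}." for host in gslb_devices]
    a_lines = [f"{host}. {ttl} IN A {ip}" for host, ip in gslb_devices.items()]
    return ns_lines + a_lines


# Two GSLB devices deployed, so two name-server records are produced.
devices = {
    "gslb1.example.com": "192.0.2.53",
    "gslb2.example.com": "198.51.100.53",
}
for line in delegation_records("www.example.com", devices):
    print(line)
```

Adding a third device to the dictionary would add a third NS record, matching the rule described above.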

GSLB Using the Cisco Content Services Switch

An enhanced version of Cisco WebNS Software enables the Cisco content services switch (CSS) to act as a DNS

authoritative name server. In a typical deployment the network designer places the Cisco CSS, performing the GSLB

function, in the active and standby data centers (Figure 7).

Figure 7 GSLB Using the Cisco Content Services Switch

The network designer delegates the subdomain that will be globally load-balanced to the Cisco CSSs. This is

accomplished by configuring two name-server records in the local DNS name server. This minor modification to the

local DNS name server makes the Cisco CSSs authoritative for the GSLB subdomain. Now, whenever this subdomain

is requested, the Cisco CSSs will respond with the proper A-record.

The next step is to establish data center-to-data center communication. The Cisco CSS uses CAPP to construct a

peering communication mesh, which allows the Cisco CSSs to exchange content and GSLB information.

The Cisco CSS administrator assigns a zone to each Cisco CSS. The Cisco CSS uses this zone assignment to

differentiate GSLB information. In the example shown in Figure 7, Cisco CSS 1 knows that any information from

Zone 0 is local information, and any information from Zone 1 is remote or learned information from the remote

Cisco CSS.

The GSLB peering relationship is at the domain level, and the linkage is the domain name that is being globally

load-balanced. After a peering relationship has been established, the Cisco CSSs automatically share the following

zone-based information—status of the server load, availability, virtual IP address, and domain names.

Based on this frequent exchange of information, the Cisco CSS can determine whether the local virtual IP (VIP)

address within Zone 0 or the learned VIP address for Zone 1 should be used. The conditions that can affect this

decision-making process are:

• Global load-balancing policies (round robin, weighted round robin)

• Availability of resources

• Preference of an individual site

The Cisco CSSs running this GSLB-enabled software can simultaneously perform server load balancing within the

data center and can globally load-balance across multiple data centers. The main advantage of this type of

deployment is that the network designer can add GSLB capability without adding new networking devices.

GSLB Using the Cisco CSM

GSLB capabilities for the Cisco CSM were introduced in the 3.1 software release. With this new capability, the Cisco

CSM uniquely combines GSLB with route health injection.

Figure 8 GSLB Using the Cisco CSM

GSLB for the Cisco CSM (Figure 8) is functionally the same as for the Cisco CSS, and is deployed in the same manner.

The Cisco CSM becomes the DNS authoritative name server for the GSLB subdomain. The DNS configuration is the

same as described previously for the Cisco CSS. The only exception is that the name-server records will point to the

Cisco CSMs located at each data center.

Although the Cisco CSM does not use CAPP, it keeps track of local and remote devices by using health-checking

probes. These probes are sent to the local real servers within the data center and to the remote virtual server on the

remote Cisco CSMs. Figure 8 shows that Cisco CSM 2 will be configured with the same probes (not shown) to

monitor the availability of the local server and the remote virtual server at the active data centers.

When DNS requests are sent to the Cisco CSM (1 or 2), the Cisco CSM will assess the load and availability of the

local virtual servers and the remote virtual servers supporting the requested domain. The Cisco CSM will respond

with the IP address of the local or remote virtual servers, depending on availability and load. If the information

collected by the health probes confirms that the local virtual server is unavailable or overloaded, the Cisco CSM will

respond with the IP address of the remote virtual server.
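
The decision described above can be sketched in a few lines. This is an illustration under stated assumptions, not the Cisco CSM implementation: the `VirtualServer` type, the 0.0 to 1.0 load scale, the `overload_threshold` value, and the fallback to an overloaded local server when the remote site is down are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class VirtualServer:
    vip: str          # virtual IP address returned in the A-record
    available: bool   # result of the most recent health probe
    load: float       # 0.0 (idle) to 1.0 (saturated); this scale is an assumption


def choose_answer(local, remote, overload_threshold=0.9):
    """Return the VIP to place in the DNS response, or None if no site can serve."""
    if local.available and local.load < overload_threshold:
        return local.vip
    if remote.available:
        # Local virtual server is unavailable or overloaded: answer with the remote one.
        return remote.vip
    if local.available:
        # Assumption: an overloaded local server is better than no answer at all.
        return local.vip
    return None


local = VirtualServer("192.0.2.10", available=True, load=0.95)
remote = VirtualServer("198.51.100.10", available=True, load=0.30)
answer = choose_answer(local, remote)
```

Here `answer` is the remote VIP, because the local load exceeds the assumed threshold while the remote virtual server is healthy.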

Route Health Injection—Using Routing Metrics for GSLB

The Cisco CSM uses a unique feature called route health injection (RHI). Using RHI, the Cisco CSM sends messages

to the Cisco Catalyst 6500 Series switch containing the VIP addresses of the load-balanced servers that are available.

A VIP address is available when one or more servers at the location are operational and successfully responding to

health checks. After the VIP address is active, the Cisco Catalyst 6500 Series switch adds an entry in its routing table

for each VIP address it receives in an update message from the Cisco CSM. The switch calculates the VLAN ID and

enters the route into its routing table. The routing protocol running on the Cisco Catalyst 6500 Series sends the

routing table updates upstream to client-side routers in the intranet.
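
The availability rule for RHI (a VIP address is advertised while at least one server behind it passes its health checks) can be sketched as follows. The function names and the shape of the health table are assumptions made for illustration; the addresses come from a documentation range.

```python
def vips_to_advertise(server_health):
    """Return the set of VIP addresses that should currently be injected.

    `server_health` maps each VIP to a list of booleans, one per real server,
    where True means the server passed its latest health check.
    """
    return {vip for vip, checks in server_health.items() if any(checks)}


def routing_changes(previous, current):
    """Compare two advertised sets and return (newly_advertised, no_longer_advertised)."""
    return current - previous, previous - current


health = {
    "192.0.2.10": [True, False],   # one of two servers is healthy: VIP is available
    "192.0.2.20": [False, False],  # no healthy server: VIP is not advertised
}
advertised = vips_to_advertise(health)
added, dropped = routing_changes({"192.0.2.20"}, advertised)
```

In this run `192.0.2.10` becomes advertised and `192.0.2.20` stops being advertised, which is the signal that upstream routers act on.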

If there are two Cisco CSMs running in a failover configuration at the site, the Cisco CSM will inject the virtual

interface address (the alias address) that both Cisco CSMs share for rapid failover between a local switch or Cisco

CSM without having to wait for a routing protocol convergence. The Cisco CSM can inject VIP addresses for

networks that are local, as well as for subnets that are not locally configured on the Cisco Catalyst 6500 Series.

Cisco CSM server load balancers can provide failover protection for applications that use IP addresses instead of

DNS names, as well as protection against DNS servers that cache DNS responses.

When a virtual server becomes unavailable, its route is no longer advertised, and the client-side router will remove

the entry for that virtual server from its routing table when the entry times out. The Cisco Catalyst 6500 Series

Multilayer Switch Feature Card will then propagate the route to the Internet or aggregate routers through a Border

Gateway Protocol (BGP) connection so the route can be securely propagated through the firewalls. The Internet

routers or aggregate routers will then summarize the route so that a host route is not injected to the Internet.

As stated earlier, the main advantage of this type of deployment is that the network designer can add GSLB capability

without adding new networking devices.

GSLB Using the Cisco GSS

When the Cisco GSS 4490 is responsible for GSLB services, the DNS process migrates to the Cisco GSS 4490. The

DNS configuration is the same as described previously for the Cisco CSS and CSM. The only exception is that the

name-server records will point to the Cisco GSS 4490 devices located at each data center. Ultimately, the Cisco GSS

4490 determines which data center site should receive client traffic.

The Cisco GSS 4490 is a networking product that addresses critical disaster-recovery needs by globally

load-balancing distributed data centers. The Cisco GSS 4490 is the cornerstone of any disaster recovery plan and

should be considered when deploying market-leading Cisco server load balancers, such as the Cisco CSS 11500

Series, the Cisco CSM, Cisco IOS SLBs, and Cisco LocalDirectors.

The Cisco GSS 4490 Series complements server load balancers by delivering these six important capabilities:

• Helps to ensure that Web-based applications are always available, by detecting site outages and rerouting content

requests

• Improves global data-center selection process, by offering state-of-the-art global load-balancing algorithms

• Offloads DNS servers by taking over the domain resolution process and responding to thousands of requests per

second

• Scales to support hundreds of data centers or server load balancers

• Complements the existing DNS infrastructure by providing centralized domain management

• Tightly integrates with Cisco server load balancers without sacrificing the ability to work with third-party server

load balancers

In the example shown in Figure 9, the Cisco GSS 4490 offloads the Website selection process from the DNS

infrastructure. The Cisco GSS 4490 continuously monitors the load and health of up to 128 server load balancers or

4000 VIP addresses. These server load balancers can be colocated, or located at remote and disparate data centers.

The Cisco GSS 4490 interacts with the client in the Website selection process summarized in the following six steps:

1. A client wants to access an application at foo.com. The client types www.foo.com into the browser. This

application is supported at three different data centers.

2. The request is processed by the DNS global control plane infrastructure and arrives at the Cisco GSS 4490.

3. The Cisco GSS 4490 offloads the site-selection process from the DNS global control plane. The request and site

selection are based on the load and health information in conjunction with customer-controlled load-balancing

algorithms. The Cisco GSS 4490, in real time, selects a data center that is available and not overloaded.

4. The Cisco GSS 4490 sends the IP address of the “best” server load balancer at a specific data center (in this case

the one at Data Center 2).

5. The browser processes this IP address.

6. When the transfer for the DNS control plane is complete, the client is directed to the server load balancer at Data

Center 2 by the IP control and forwarding plane.

Figure 9 GSLB Using the Cisco Global Site Selector

The benefit of this deployment is a dedicated solution that is deterministic and highly scalable, and that delivers rapid,

secure disaster recovery. The remainder of this document explains the critical processes and mechanisms executed

by the GSS 4490.

Global Monitoring of Cisco Server Load Balancers

The mechanism that is used by the Cisco GSS 4490 to extract this load and health information is a specially designed

keepalive, based on a proprietary protocol that runs over User Datagram Protocol (UDP). However, this protocol has been enhanced to allow

for the tracking of packets transmitted, resulting in a lightweight and reliable keepalive designed exclusively for Cisco

CSSs and CSMs. Understanding that the Cisco GSS 4490 needed to support other Cisco server load balancers like

Cisco LocalDirector and Cisco IOS SLB, as well as third-party server load balancers, Cisco added four more

keepalives. Following are all the keepalives supported in Cisco GSS 4490 Software Version 1.1:

• KAL-AP—Extracts both load and availability from the Cisco CSS and the Cisco CSM. When this detailed query

is sent to the Cisco CSS or CSM, these server load balancers will respond with information about a hosted domain

name, hosted VIP address, or a configured tag on a content rule.

• HTTP—An HTTP “head request” method to a given origin server. The keepalive sends a request to an origin

server and checks for a “200 OK.” If the Cisco GSS 4490 receives a “200 OK,” then it will direct traffic to the

VIP address supporting that server. The user can configure the host tag, the remote host IP, TCP port, and the

URL (including a path). [RFC 2616]

• ICMP—A simple Layer 3 ping that shows the status of a given device based on connectivity to the network. This

can be used with any device that can respond to a ping request. If there is no response, the ping will be sent once

every five seconds up to three times. If there is still no response, the device (VIP address or real server) is

considered offline.

• Name server query—A simple DNS request is sent to a host (name server, mail server, or other) to receive a

resolved domain name to prove the “aliveness” of the system. In this case, the Cisco GSS 4490 sends a generic

domain name, probing for a failure response, which proves that the DNS server is “alive.” This is used in

conjunction with the name-server forwarding feature (described later).

• Content Routing Agent (CRA)—This UDP-based keepalive was built for the Cisco GSS 4490 and sends keepalive

requests to port 1304 to retrieve round-trip times between the Cisco GSS 4490 and agent (Cisco CSS or content

engine). This keepalive is used with the DNS race feature (described later).

• TCP—The GSS now supports the TCP connect keepalive type. The TCP keepalive is used when the GSS answer

that you are testing is transmitted to GSLB devices other than a Cisco Content Services Switch (CSS) or Content

Switching Module (CSM). GSLB remote devices may include webservers, LocalDirectors, WAP gateways, and

other devices that can be checked using a TCP keepalive. The TCP keepalive initiates a TCP connection to the

remote device by performing the three-way handshake sequence. The TCP termination connection method can

be Graceful (FIN) or Reset (RST). The choice of connection termination method has also been added for

HTTP-HEAD keepalives.
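
A TCP connect keepalive of the kind described in the last bullet can be approximated with the Python standard library. This is a minimal sketch, not the GSS implementation: the function name, the default timeout, and the use of `SO_LINGER` with a zero timeout to force a Reset are assumptions, and the `struct.pack("ii", ...)` layout assumes a POSIX-style `struct linger`.

```python
import socket
import struct


def tcp_keepalive(host, port, timeout=2.0, graceful=True):
    """Return True if a TCP three-way handshake with (host, port) completes.

    graceful=True closes the connection normally (FIN).
    graceful=False asks the operating system to abort the connection (RST).
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Refused, unreachable, or timed out: the device fails the keepalive.
        return False
    try:
        if not graceful:
            # SO_LINGER enabled with a zero timeout makes close() send RST
            # instead of FIN on POSIX systems.
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                            struct.pack("ii", 1, 0))
    finally:
        sock.close()
    return True
```

A Reset frees connection state on both ends immediately, while a graceful close follows the normal FIN exchange; which one suits a given device is a deployment decision.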

Each type of GSS keepalive can support a Fast or Standard keepalive rate. The Fast keepalive rate can be as fast as

four seconds, while the Standard keepalive rate is 40 to 255 seconds. For the Fast keepalive rate, you can adjust the

number of retries for the ICMP, TCP, HTTP HEAD, and KAL-AP keepalive types, which adjusts the detection time

determined by the GSS.
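
The effect of retries on detection can be sketched with a small helper. The defaults follow the ICMP keepalive described earlier (a probe every five seconds, up to three times), but the helper itself is hypothetical, and counting every probe sent as one attempt is only one possible reading of that description.

```python
import time


def is_online(probe, attempts=3, interval=5.0, sleep=time.sleep):
    """Run `probe` up to `attempts` times, `interval` seconds apart.

    `probe` is any zero-argument callable that returns True on a response.
    The device is declared offline only if every attempt fails.
    `sleep` is injectable so the logic can be exercised without waiting.
    """
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(interval)
    return False
```

Fewer attempts or a shorter interval shorten the time needed to declare a device offline, at the cost of reacting to a single lost packet.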

Global Load-Balancing Algorithms

The Cisco GSS 4490 supports the following eight global load-balancing methods:

• Ordered List—Uses the next VIP address when all previous VIP addresses are overloaded or down.

• Static Based on Client’s DNS Address—Maps the IP address of the client’s DNS to available VIP addresses.

• Round Robin—Cycles through available VIP addresses in order.

• Weighted Round Robin—Weighting causes repeat hits (up to 10) to a VIP address.

• Least-Loaded—Uses load reported through CAPP UDP: least connections on the Cisco CSM and least loaded on the

Cisco CSS. (The load is calculated based on how fast a server responds to a TCP connection request.)

• Source Address and Domain Hash—A hash of the IP address of the client’s DNS proxy and the requested domain always

matches the same client to the same VIP address.

• DNS Race—Initiates race of A-record responses to the client’s name server. Can achieve proximity without

probing.

• Drop—Silently discards request.
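
Four of the methods listed above are simple enough to sketch directly. The code below is an illustration under stated assumptions rather than the GSS algorithms themselves: the function names are hypothetical, SHA-256 stands in for whatever hash the product uses, and the weighted variant simply repeats each VIP address `weight` times in a row.

```python
import hashlib
from itertools import cycle


def ordered_list(vips, is_up):
    """Return the first VIP in priority order that is up, or None."""
    return next((vip for vip in vips if is_up(vip)), None)


def round_robin(vips):
    """Return an iterator that cycles through the VIPs in order."""
    return cycle(vips)


def weighted_round_robin(weights):
    """Return an iterator that repeats each VIP `weight` times per cycle.

    `weights` maps a VIP to an integer weight (the text allows up to 10).
    """
    expanded = [vip for vip, weight in weights.items() for _ in range(weight)]
    return cycle(expanded)


def source_and_domain_hash(d_proxy_ip, domain, vips):
    """Map the same (D-proxy address, domain) pair to the same VIP every time."""
    digest = hashlib.sha256(f"{d_proxy_ip}|{domain}".encode()).digest()
    return vips[int.from_bytes(digest[:8], "big") % len(vips)]


vips = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]
rr = round_robin(vips)
first_four = [next(rr) for _ in range(4)]
```

The hash method gives a stable answer only while the list of VIP addresses is unchanged; adding or removing a site remaps some clients.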

Breakthrough Concept—DNS Rules

At the core of the Cisco GSS 4490 is the DNS rule that gives the user the centralized command and control of how

the Cisco GSS 4490 will globally load-balance a given hosted domain, what IP address is sent to the client’s name

server (D-proxy), and what recovery method should be used if the first one fails.

Using the Cisco GSS 4490 GUI, the user defines what domain will be globally load-balanced. Users have the option

to include the IP addresses of specific client name servers (D-proxies), which allows them to hard-wire a single D-proxy

or a group of D-proxies to a particular site. This option is useful for a customer who wants to specify a certain set

of server load balancers to support all the traffic from an individual client (such as AOL, Genuity, and AT&T

Broadband). Next, on the same GUI screen, the user can define the server load balancers that should support this

domain (and the server load balancers’ VIP addresses). The user has the option to return multiple A-records (up to

eight) and a time-to-live value that will be used by the client D-proxy.

The final configuration option is the recovery method that will be used if the first method fails (in the GUI, these are

called “load-balance clauses”). Each DNS rule can support up to three load-balance clauses. These clauses are used

to establish the primary load-balancing method, then the secondary load-balancing method, and finally the third

method—which could involve hard wiring a global “sorry server.”

Using a clause allows the network engineering staff to create a solution that always directs clients to the appropriate

data center.
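
The fallback behavior of load-balance clauses can be sketched as an ordered list of methods that are tried in turn. This is a simplified model, not the GSS rule engine: `resolve`, `first_available`, `sorry_server`, and the example addresses are hypothetical names and values chosen for illustration.

```python
def resolve(clauses, is_up):
    """Try each load-balance clause in order and return the first usable VIP.

    `clauses` is a list of up to three callables. Each takes `is_up` and
    returns a VIP address, or None when its method cannot produce an answer.
    """
    if len(clauses) > 3:
        raise ValueError("a DNS rule supports at most three load-balance clauses")
    for clause in clauses:
        vip = clause(is_up)
        if vip is not None:
            return vip
    return None


def first_available(vips):
    """Build a clause that returns the first VIP in `vips` that is up."""
    return lambda is_up: next((v for v in vips if is_up(v)), None)


def sorry_server(vip):
    """Build a clause that always answers with a fixed 'sorry server' VIP."""
    return lambda is_up: vip


rule = [
    first_available(["192.0.2.1"]),      # primary method
    first_available(["198.51.100.1"]),   # secondary method
    sorry_server("203.0.113.99"),        # last resort
]
```

With every data center down, `resolve(rule, lambda vip: False)` still returns the sorry-server address, so the client always receives an answer.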

Name-Server Forwarding

Although not considered a global load-balancing method, Name-Server Forwarding plays a vital role in providing

increased flexibility for the Cisco GSS 4490. Used in instances where the Cisco GSS 4490 cannot handle requests for

domains, the Name-Server Forwarding feature passes on requests it cannot answer to a configured name server that

can fulfill the request. That name server’s response is passed back through the Cisco GSS 4490 such that it appears

to have come from the Cisco GSS, as expected by the D-proxy.

As the Cisco GSS 4490 processes these DNS requests, it can use one of the load-balancing methods to spread the

requests out. The global load-balancing methods that can be used with the Name-Server Forwarding feature are Round

Robin, Weighted Round Robin, Ordered List, and Hashing. If the Cisco GSS receives a request for a mail

exchange (MX) record, although the Cisco GSS cannot respond directly, it can forward the request to one of many

DNS name servers deployed in the enterprise or service-provider network.
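
The forwarding decision can be sketched as follows. This models only the choice between answering and forwarding, with Round Robin across the configured name servers; the class name, the rule that only A-record queries for hosted domains are answered locally, and the addresses are assumptions made for illustration.

```python
from itertools import cycle


class Forwarder:
    """Decide whether a query is answered locally or forwarded."""

    def __init__(self, hosted_domains, name_servers):
        self.hosted_domains = set(hosted_domains)
        # Round Robin over the configured forwarding name servers.
        self._next_server = cycle(name_servers)

    def route(self, domain, record_type):
        """Return ("answer", None) or ("forward", name_server_ip)."""
        if record_type == "A" and domain in self.hosted_domains:
            return ("answer", None)
        # Anything else, such as an MX query, goes to a configured name server.
        return ("forward", next(self._next_server))


forwarder = Forwarder(["www.example.com"], ["192.0.2.53", "198.51.100.53"])
decision = forwarder.route("example.com", "MX")
```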

DNS Race—Proximity Without Probing

The Cisco GSS 4490 supports the DNS Race method of proximity. This feature allows proximity to be achieved

without probing the client D-proxy. It is based on a simple concept—that instantaneous proximity can be found if a

device within each data center sends an A-record (IP address), at the same time, to the client’s name server. Whichever

A-record is received first is the most proximate.

The DNS Race method of DNS resolution is initiated by the Cisco GSS 4490 and is designed to load-balance between

2 and 20 sites. DNS Race gives all possible CRAs, Cisco content engines, or Cisco CSSs a fair chance at resolving a

DNS request. For the Cisco GSS 4490 to initiate a race, it needs to establish the following information per CRA:

• The delay between the Cisco GSS and each CRA in each data center—With this data, the Cisco GSS computes

the time needed to delay the race from each data center so that each CRA will start the race simultaneously.

• The “aliveness” of the CRAs—With this data, the Cisco GSS knows not to forward requests to any CRAs that

are not responding.

The Cisco GSS 4490 gathers this information by sending keepalive messages at predetermined intervals. This data,

along with the IP addresses of the CRA, is used to request the start of the race. When the Cisco GSS 4490 receives a

DNS request, a race request is sent to each CRA with appropriate delay, and the race is initiated from each data

center. The first A-record received by the client’s D-proxy is the winner and is the most proximate.
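
The delay calculation can be illustrated with a short sketch. It assumes that the delay from the Cisco GSS to each CRA is already known from keepalives and that the farthest CRA should start immediately while nearer ones wait out the difference; the function name, the data-center labels, and the millisecond values are hypothetical.

```python
def race_start_delays(delay_to_cra_ms):
    """Return how long each CRA should wait so that all races start together.

    `delay_to_cra_ms` maps a CRA name to the measured delay from the GSS to
    that CRA, in milliseconds. CRAs that are not alive are simply left out
    of the map and therefore not raced.
    """
    if not delay_to_cra_ms:
        return {}
    slowest = max(delay_to_cra_ms.values())
    # Delay to the CRA plus its wait time is the same for every CRA.
    return {cra: slowest - delay for cra, delay in delay_to_cra_ms.items()}


delays = race_start_delays({"dc1": 10, "dc2": 35, "dc3": 20})
print(delays)  # {'dc1': 25, 'dc2': 0, 'dc3': 15}
```

Each CRA then begins its response 35 ms after the Cisco GSS sends the race requests, so the first A-record to reach the D-proxy reflects the path between that data center and the client's name server rather than the distance between the Cisco GSS and the data center.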

Summary

Distributed data centers that incorporate global load balancing on the front end with data and server synchronization

on the back end help to ensure rapid recovery from disruptions in any single site.

As core business-critical applications are Web-enabled, Cisco GSS, CSS, and CSM site- and load-balancing solutions

help to ensure continuous access to these applications. Intelligent networks that balance resources and user traffic

within and between data centers compensate for any component downtime or degradation through fast failover and

redirection of server requests. With disaster recovery concerns at an all-time high, adding this extra measure of

reinforcement to the data center front end is critical for organizations that need business to proceed as usual.

(0304R) ETMG203144—WH 01/04

Cisco offers a complete GSLB product line that meets the

requirements of any enterprise or service provider. Cisco offers

GSLB products that are integrated within server load balancers,

integrated within routers, or are dedicated appliances. These

products complement Cisco high-speed routing, storage

networking, and optical technologies that facilitate synchronization

of data and clustering of servers, allowing enterprises to build

distributed data centers that function logically as a single “virtual”

data center. IT professionals, DNS administrators and network

architects know that only Cisco can deliver a complete end-to-end

business continuance and disaster recovery solution that

Web-enabled, mission-critical applications can rely on.

Author: Lance McCallum, [email protected] Co-author: Jay

Cedrone and Andrew Thurber

For more information, visit:

http://www.cisco.com/go/gss

http://www.cisco.com/warp/public/784/packet/apr03/pdfs/ent_optimize.pdf

http://www.cisco.com/go/datacenter

http://www.cisco.com/go/contentswitch