Scalable Web Server Clustering Technologies

J. Wei

Background

Growth of the Internet, dynamic content, and an increasing number of users force us to find faster web servers.

In the past, we replaced the web server with a faster machine (processor). Drawbacks:
• Short-term: under Moore's Law the number of transistors per integrated circuit doubles every 18 months, so a faster machine is soon outpaced.
• Expensive: we need to replace almost the whole machine.

Solution: Add more processors or machines to the web server. (These are commodity hardware and software, so we can keep the past investment.)

Requirement

No application state is kept on the servers, because application requests may need to be transferred from one server to another. (The exception is some protocol-specific services, such as Secure Sockets Layer.)

Transactions must be relatively short and arrive at high frequency. (Short, because we do not use special hardware or software to process the requests. High frequency, because a large sample space lets us employ stochastic methods to distribute the requests. Requests are stochastically distributed: they come from anywhere at any time.)

OSI vs. TCP/IP

Layer 4 Switch

Special Technologies:
• Single IP. Network Address Translation (NAT) makes the cluster servers appear to be a single server with one IP address.
• Higher-layer address screening. The switch can make forwarding decisions based on the content of the request at layer 4.

Terminology

L4/2: Layer 4 Switching with Layer 2 Packet Forwarding. The servers in the cluster share an identical layer 3 (network) address, and each has a unique MAC address.

L4/3: Layer 4 Switching with Layer 3 Packet Forwarding. The servers in the cluster are identical at layer 4 (transport, same services), and each has a unique network address.

Layer 7 Switch: Makes forwarding decisions based on the content of client requests. It can employ L4/2 or L4/3.

Terminology (cont.)

Client-side Transparency: Because of the dispatcher, the whole cluster of servers appears to clients as a single host.

Server-side Transparency: Each cluster server runs standard web server software designed for a standalone server. It serves requests forwarded by the dispatcher exactly as if they came directly from the clients.

Performance Index: Connections per second or bits per second. (Cluster Maximum Utilization)

L4/2 Clustering

The cluster’s IP address (A) is shared by the dispatcher and the servers through the use of primary and secondary IP addresses. (Background: each host can have several IP addresses.)

The dispatcher’s primary IP address is A.

The servers use A as a secondary address.

All packets destined for A are forwarded to the dispatcher, through the use of the Address Resolution Protocol (ARP) at the nearest gateway/router.

Technology Specification

Load-Sharing Algorithm: Round-robin or other policies.

Session Map: If an incoming packet belongs to an established connection in the map, forward it to the previously selected server. Otherwise, if it is a connection initiation (it contains a SYN), select a server and save the connection in the map. A packet that matches no connection and does not contain a SYN may or may not be discarded.

Backup method: Guards against failure of the dispatcher and the servers.
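
The round-robin policy and the session map described above can be illustrated with a minimal Python sketch. The class name, the (client IP, client port) key, and the choice to discard unknown non-SYN packets are assumptions made for illustration; the slides do not fix these details.

```python
from itertools import cycle


class Dispatcher:
    """Toy dispatcher: round-robin selection plus a session map (illustrative only)."""

    def __init__(self, servers):
        self._next_server = cycle(servers)  # round-robin iterator over the servers
        self.session_map = {}               # (client_ip, client_port) -> server

    def dispatch(self, client_ip, client_port, syn):
        key = (client_ip, client_port)
        if key in self.session_map:
            # Packet of an established connection: reuse the earlier choice.
            return self.session_map[key]
        if syn:
            # Connection initiation: pick a server and remember it.
            server = next(self._next_server)
            self.session_map[key] = server
            return server
        # Assumption: unknown connection without SYN is discarded.
        return None


d = Dispatcher(["server1", "server2"])
first = d.dispatch("10.0.0.7", 40000, syn=True)
again = d.dispatch("10.0.0.7", 40000, syn=False)
other = d.dispatch("10.0.0.8", 40001, syn=True)
stray = d.dispatch("10.0.0.9", 40002, syn=False)
```

Here `first` and `again` name the same server because the second packet is found in the session map, while `stray` is dropped.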

L4/2 Traffic Flow

L4/2 Traffic Flow (cont.)

1. A client sends a request to A.
2. The router sends the request to the dispatcher.
3. Based on the load-sharing algorithm, the dispatcher selects an actual server (here, server 2) to serve the client.
4. Server 2 replies to the client directly.

Advantage vs. Disadvantage

Advantage:
• Servers reply to clients directly, which keeps the dispatcher from becoming a bottleneck.
• No need to recalculate checksums, because forwarding operates at layer 2.

Disadvantage:
• There must be a direct physical connection between the dispatcher and all servers.

ONE-IP (Bell Labs, 1996)

Load-Sharing Algorithm

Routing-based Dispatching: Hash the incoming client’s address to get a number that indicates which server will service the request.
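
A minimal sketch of routing-based dispatching follows. The slides do not say which hash function ONE-IP uses, so the modulo over the integer value of the address is an assumption, as are the function and server names.

```python
import ipaddress


def pick_server(client_ip, servers):
    """Map a client address to a server by hashing the address.

    Assumption: the "hash" is the integer address modulo the server count.
    """
    addr = int(ipaddress.ip_address(client_ip))
    return servers[addr % len(servers)]


servers = ["server0", "server1", "server2"]
choice = pick_server("192.0.2.10", servers)
repeat = pick_server("192.0.2.10", servers)
```

Because the mapping depends only on the client address, the same client always reaches the same server as long as the server list does not change.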

ONE-IP (cont.)

Broadcast-based Dispatching: Each server has a fixed and disjoint portion of the address space.

ONE-IP (cont.)

Drawback: Cannot adapt when client requests are disproportionately distributed.

Backup: Watchdog daemon (watchd)
• Dispatcher failure: The backup dispatcher notices the missing heartbeat of the primary dispatcher and takes over.
• Server failure: Reconfigure the hash table or the address filters on the other servers.
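
The heartbeat-based takeover described above might look like the following sketch. It is not the watchd implementation; the class, the timeout parameter, and the explicit timestamps are hypothetical and only illustrate the idea of a backup watching for a missing heartbeat.

```python
class BackupDispatcher:
    """Backup that takes over when the primary's heartbeat goes missing (illustrative)."""

    def __init__(self, timeout):
        self.timeout = timeout        # seconds of silence tolerated (assumed parameter)
        self.last_heartbeat = None    # time of the most recent heartbeat seen
        self.active = False           # True once the backup has taken over

    def on_heartbeat(self, now):
        self.last_heartbeat = now

    def check(self, now):
        # Take over if a heartbeat was seen before and has been silent for too long.
        if self.last_heartbeat is not None and now - self.last_heartbeat > self.timeout:
            self.active = True
        return self.active


backup = BackupDispatcher(timeout=3.0)
backup.on_heartbeat(0.0)
early = backup.check(2.0)   # heartbeat still recent
late = backup.check(5.0)    # heartbeat missing for longer than the timeout
```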

Network Dispatcher (IBM, 1996)

It powered the 1998 Olympic Games website, handling up to 2000 requests/s. Experimental results reach 2200 requests/s.

Network Dispatcher (cont.)

Load-Sharing Algorithm: Weighted round-robin algorithm.

Connection Map: Discard a packet that matches no connection in the map and does not contain a SYN, or for which no server with a non-zero allocation weight is available.

Backup:
• Dispatcher: Secondary dispatcher. In fact, the system contains some extra dispatchers. The secondary dispatcher takes over the IP address of the failed dispatcher.
• Server: High Availability Cluster Multi-Processing for AIX (HACMP) on the IBM SP-2. The dispatchers are reconfigured to exclude the failed node; the failed server reboots automatically; the dispatchers are then reconfigured to include the node again.
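
As a rough illustration of weighted round robin, the sketch below hands out servers in proportion to their weights and never selects a server whose weight is zero. It is a simple textbook variant, not necessarily the algorithm Network Dispatcher implements; the function name and the weights are hypothetical.

```python
from itertools import islice


def weighted_round_robin(weights):
    """Yield server names forever, each in proportion to its weight.

    Servers with weight 0 receive no new connections.
    """
    active = {name: w for name, w in weights.items() if w > 0}
    if not active:
        raise ValueError("no server with a non-zero weight is available")
    while True:
        for name, weight in active.items():
            for _ in range(weight):
                yield name


picks = list(islice(weighted_round_robin({"s1": 2, "s2": 1, "s3": 0}), 6))
```

With these weights, `s1` receives twice as many connections as `s2`, and `s3` receives none.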

Network Dispatcher (cont.)

Client affinity:
• Background: For services such as FTP and SSL, two connections from the same client must be assigned to the same server.
• Connection requests from the same client that arrive before a given affinity life span expires are sent to the same server.
• “The quality of the load sharing may suffer slightly, but the overall performance of the system improves.”
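
Client affinity with a life span can be sketched as below. The class and parameter names are hypothetical, the fallback policy is plain round robin, and the sketch assumes the life span starts when a server is first assigned and is not refreshed by later connections; the slides do not specify that detail.

```python
import time
from itertools import cycle


class AffinityDispatcher:
    """Round-robin dispatcher with per-client affinity (illustrative only)."""

    def __init__(self, servers, life_span, clock=time.monotonic):
        self._rr = cycle(servers)
        self.life_span = life_span   # seconds an affinity entry stays valid
        self._clock = clock          # injectable clock, handy for testing
        self._affinity = {}          # client_ip -> (server, expiry time)

    def select(self, client_ip):
        now = self._clock()
        entry = self._affinity.get(client_ip)
        if entry and now < entry[1]:
            return entry[0]          # affinity still valid: same server
        server = next(self._rr)      # expired or unknown client: new choice
        self._affinity[client_ip] = (server, now + self.life_span)
        return server


now = [0.0]
d = AffinityDispatcher(["s1", "s2"], life_span=60, clock=lambda: now[0])
a = d.select("10.0.0.7")
now[0] = 30.0
b = d.select("10.0.0.7")   # within the life span
now[0] = 61.0
c = d.select("10.0.0.7")   # life span expired
```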

Others

LSMAC (University of Nebraska-Lincoln)
• “Implement L4/2 clustering as a portable user-space application running on commodity systems”

Alteon ACEdirector (hardware implementation)
• ACEdirector 2’s primary focus is on load balancing Internet services such as HTTP and FTP.
• Load-Sharing Algorithm: Round-robin and least-connections load-sharing policies.
• Supports SSL service.

L4/3 Clustering

• The dispatcher appears as a single host to clients and as a gateway to the servers (IP address = A).
• Each server has its own IP address, which can be globally unique or locally unique (IP addresses = B1, B2, … , Bn).
• Load-sharing algorithm: Round-robin or other algorithms.
• The dispatcher keeps a session map table.

L4/3 Clustering (cont.)

L4/3 Clustering (cont.)

1. A client sends a request with A as the destination.
2. The packet arrives at the dispatcher.
3. Based on the load-sharing algorithm and the session table, the dispatcher selects a server, rewrites the destination IP address, recalculates the checksums, and forwards the packet to the server.
4. The server sends its reply to the client through the dispatcher, which acts as its gateway.
5. The dispatcher rewrites the source IP address of the reply to A, recalculates the checksums, and forwards it to the client.

Disadvantage:
1. The checksums (IP and TCP) are recalculated twice, once in each direction.
2. All traffic flows through the dispatcher. (Bottleneck)
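
To make the cost of address rewriting concrete, here is a minimal sketch of step 3 for the IPv4 header only: the destination address is replaced and the header checksum is recomputed with the standard Internet checksum (RFC 1071). The helper names and the example addresses are hypothetical, and the sketch assumes a 20-byte header without options. A real dispatcher must also update the TCP checksum, because the TCP pseudo-header covers the IP addresses.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum used by IPv4 (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                       # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_header(src: str, dst: str) -> bytes:
    """Build a minimal 20-byte IPv4 header with a valid checksum."""
    header = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 40, 0, 0, 64, socket.IPPROTO_TCP, 0,
        socket.inet_aton(src), socket.inet_aton(dst),
    )
    checksum = internet_checksum(header)
    return header[:10] + struct.pack("!H", checksum) + header[12:]


def rewrite_destination(header: bytes, new_dst: str) -> bytes:
    """Replace the destination address and recompute the header checksum."""
    zeroed = header[:10] + b"\x00\x00" + header[12:16] + socket.inet_aton(new_dst)
    checksum = internet_checksum(zeroed)
    return zeroed[:10] + struct.pack("!H", checksum) + zeroed[12:]


incoming = build_header("198.51.100.7", "203.0.113.1")   # client -> cluster address A
forwarded = rewrite_destination(incoming, "10.0.0.2")    # client -> chosen server B1
```

Running the checksum over a header that already contains a valid checksum yields 0, which is how a receiver verifies it.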

Magicrouter (University of California at Berkeley, 1996)

Fast Packet Interposing and kernel modifications

Load-sharing algorithms:
• Round robin
• Random
• Incremental Load

Backup:
• Dispatcher: primary + backup model.
• Server: Use ARP to map server IP addresses to MAC addresses in order to detect server failures.

LocalDirector (Cisco, 1996)

Load-sharing algorithms:
• Least connections: choose the server with the fewest connections.
• Fastest response: choose the server that responds to the request first.
• Round-robin: strict round-robin policy.

Backup:
• Dispatcher: an extra LocalDirector unit linked to the primary one with a special failover cable.
• Server: Contact the servers periodically. When a server fails, remove it from the pool but continue to contact it; when it comes back up, add it to the server pool again.

Sticky flag: similar to IBM’s client affinity.
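
The least-connections policy listed above can be sketched in a few lines. The connection counts and server names are made up for illustration, and ties are broken by dictionary order, which is an assumption rather than LocalDirector's documented behaviour.

```python
def least_connections(active):
    """Return the server with the fewest active connections."""
    return min(active, key=active.get)


active = {"s1": 12, "s2": 4, "s3": 9}   # hypothetical connection counts
chosen = least_connections(active)
active[chosen] += 1                     # the new connection now counts against it
```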

LSNAT (University of Nebraska-Lincoln)

User-space implementation

RFC 2391: Load Sharing using IP Network Address Translation (LSNAT)

Backup:
• Dispatcher: Select one server as the new dispatcher. A Distributed State Reconstruction Mechanism rebuilds the map of existing connections.
• Server: Exclude the failed server from the pool of active servers. When it comes back up, include it again.

L7 Clustering

Makes dispatch decisions based on the content of the request (application layer): content-based dispatching.

LARD (Locality-Aware Request Distribution, Rice University)
• Uses a TCP handoff protocol with a modified kernel.
• Different servers process different kinds of requests, which makes it possible to use specialized servers.
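
Content-based dispatching of the kind described above can be sketched as a lookup on the requested path. This is not LARD's algorithm and it does not model TCP handoff; the routing table, the server names, and the prefix-matching rule are all hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical mapping from URL path prefixes to specialized servers.
ROUTES = [
    ("/images/", "image-server"),
    ("/cgi-bin/", "dynamic-server"),
]
DEFAULT_SERVER = "static-server"


def route(request_line: str) -> str:
    """Pick a server from the path in an HTTP request line."""
    _method, target, _version = request_line.split()
    path = urlsplit(target).path          # drop any query string
    for prefix, server in ROUTES:
        if path.startswith(prefix):
            return server
    return DEFAULT_SERVER


image = route("GET /images/logo.gif HTTP/1.0")
dynamic = route("GET /cgi-bin/search?q=x HTTP/1.0")
static = route("GET /index.html HTTP/1.0")
```

The dispatcher has to read the request before it can choose a server, which is why layer 7 switching needs the connection to be established first.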

Web Accelerator (IBM)

“The accelerator can now perform content-based routing in which it makes intelligent decisions about where to route requests based on the URL.”

• L7 based on L4/2.
• Web page caching.
• The dispatcher serves as a gateway/router. All traffic flows through the dispatcher.

ArrowPoint

• Content-based dispatching policy.
• Caching mechanism similar to Web Accelerator’s.
• Sticky connections.
• Hot standby for the dispatcher, and a mechanism for detecting server node failures.

Conclusion

L4/2 Clustering
• Bottleneck: the dispatcher’s capacity to process incoming requests.
• Advantage: sustainable request rate.

L4/3 Clustering
• Bottleneck: recalculation of checksums.

L7 Clustering
• Bottleneck: complexity of the content-based dispatching algorithm.
• Advantage: localizing the request space and caching request results.

Qualitative comparison

Client-based approach:
• Advantage: Reduces the load on the web server by implementing the routing service on the client side.
• Disadvantage: It is not generally applicable, and it needs server-side cooperation.

Dispatcher-based approach:
• Advantage: Full control of client requests, which gives good load balancing. Easy to implement.
• Disadvantage: Risk of a dispatcher bottleneck.

Qualitative comparison (cont.)

DNS-based approach:
• Advantage: High scalability. No risk of bottleneck.
• Disadvantage: Because of address caching mechanisms, sophisticated algorithms are needed to achieve load balancing. Fewer than 32 web servers can be used for each public URL because of the UDP packet size limit.

Server-based approach:
• Advantage: No risk of a single point of failure or of a bottleneck.
• Disadvantage: Redirection increases latency for clients.

Qualitative comparison (cont.)

Quantitative comparison

Cluster Maximum Utilization: At a given instant, the highest utilization among all servers in the cluster.

Cumulative Frequency:

Exponential Distribution:

Heavy-tailed Distribution:
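
The cluster maximum utilization can be computed directly from its definition, as in the sketch below. The slides leave the definition of cumulative frequency blank, so the sketch assumes it means the fraction of sampled instants at which the cluster maximum utilization stays below a given value. The sample numbers are invented purely to exercise the code and are not measurements.

```python
def cluster_max_utilization(utilizations):
    """Highest utilization among all servers at one instant."""
    return max(utilizations)


def cumulative_frequency(samples, threshold):
    """Fraction of instants whose cluster maximum utilization is below threshold.

    Assumed interpretation; the slides do not define this term.
    """
    maxima = [cluster_max_utilization(s) for s in samples]
    return sum(m < threshold for m in maxima) / len(maxima)


# Hypothetical per-server utilizations at four instants (three servers each).
samples = [
    [0.40, 0.55, 0.30],
    [0.70, 0.95, 0.60],
    [0.50, 0.45, 0.65],
    [0.85, 0.60, 0.75],
]
below_08 = cumulative_frequency(samples, 0.8)
below_09 = cumulative_frequency(samples, 0.9)
```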

Quantitative comparison (cont.)

Exponential Distribution Model

Quantitative comparison (cont.)

Dispatcher-based: utilization is below 0.8 almost all the time

DNS-based (adaptive TTL): utilization below 0.9

DNS-based (constant TTL): overloaded 20% of the time

Server-based: utilization below 0.9

DNS-RR: overloaded more than 70% of the time

Quantitative comparison (cont.)

Heavy-tailed Distribution Model

Quantitative comparison (cont.)

Dispatcher-based: works fine

DNS-based (adaptive TTL): works fine, without risk of bottleneck

Server-based: works fine until the load exceeds 0.9, then performs poorly under high load

Conclusion

The bottleneck will be the network throughput. By making use of wide area network bandwidth, the system can achieve much better performance.
