![Page 1: Scalable Web Server Clustering Technologies J. Wei](https://reader035.vdocuments.site/reader035/viewer/2022062314/56649c905503460f94949b67/html5/thumbnails/1.jpg)
Scalable Web Server Clustering Technologies
J. Wei
![Page 2: Scalable Web Server Clustering Technologies J. Wei](https://reader035.vdocuments.site/reader035/viewer/2022062314/56649c905503460f94949b67/html5/thumbnails/2.jpg)
Background
• Growth of the Internet, dynamic content, and an increasing number of users force us to find faster web servers.
• In the past, we replaced the web server with a faster machine (processor).
• Drawbacks:
  • Short-term: by Moore’s Law, the number of transistors per integrated circuit doubles every 18 months.
  • Expensive: we need to replace almost the whole machine.
• Solution: Add more processors or machines to the web server. (These are commodity hardware and software, so the past investment is preserved.)
![Page 3: Scalable Web Server Clustering Technologies J. Wei](https://reader035.vdocuments.site/reader035/viewer/2022062314/56649c905503460f94949b67/html5/thumbnails/3.jpg)
Requirement
• No application state is kept on the servers, because application requests may need to be transferred from one server to another. (The exception is some protocol-specific services, such as Secure Sockets Layer.)
• Transactions must be relatively short and arrive at high frequency. (Short, because no special hardware or software is used to process a request. High frequency, because the sample space is then large enough that a stochastic method can distribute the requests: requests are stochastically distributed, arriving from anywhere at any time.)
![Page 4: Scalable Web Server Clustering Technologies J. Wei](https://reader035.vdocuments.site/reader035/viewer/2022062314/56649c905503460f94949b67/html5/thumbnails/4.jpg)
OSI vs. TCP/IP
![Page 5: Scalable Web Server Clustering Technologies J. Wei](https://reader035.vdocuments.site/reader035/viewer/2022062314/56649c905503460f94949b67/html5/thumbnails/5.jpg)
Layer 4 Switch
Special technologies:
• Single IP. Through Network Address Translation (NAT), the cluster servers appear to be a single server with one IP address.
• Higher-layer address screening. The switch can make forwarding decisions based on the layer 4 content of the request.
![Page 6: Scalable Web Server Clustering Technologies J. Wei](https://reader035.vdocuments.site/reader035/viewer/2022062314/56649c905503460f94949b67/html5/thumbnails/6.jpg)
Terminology
• L4/2: Layer 4 Switching with Layer 2 Packet Forwarding. The servers share an identical layer 3 (Network) address, and each has a unique MAC address.
• L4/3: Layer 4 Switching with Layer 3 Packet Forwarding. The servers offer identical layer 4 (Transport) services, and each has a unique network address.
• Layer 7 Switch: Makes forwarding decisions based on the content of client requests. It can employ L4/2 or L4/3.
![Page 7: Scalable Web Server Clustering Technologies J. Wei](https://reader035.vdocuments.site/reader035/viewer/2022062314/56649c905503460f94949b67/html5/thumbnails/7.jpg)
Terminology (cont.)
• Client-side Transparency: Because of the dispatcher, the whole cluster of servers appears to clients as a single host.
• Server-side Transparency: Each cluster server runs a standard web server designed for a standalone machine. It serves requests forwarded from the dispatcher exactly as if they came directly from the clients.
• Performance Index: Connections per second or bits per second. (Cluster Maximum Utilization)
![Page 8: Scalable Web Server Clustering Technologies J. Wei](https://reader035.vdocuments.site/reader035/viewer/2022062314/56649c905503460f94949b67/html5/thumbnails/8.jpg)
L4/2 Clustering
• The cluster’s IP address (A) is shared by the dispatcher and the servers through the use of primary and secondary IP addresses. (Background: Each host can have several IP addresses.)
• The dispatcher’s primary IP address is A.
• The servers use A as a secondary address.
• All packets destined for A are forwarded to the dispatcher, because the nearest gateway/router resolves A to the dispatcher through the Address Resolution Protocol (ARP).
![Page 9: Scalable Web Server Clustering Technologies J. Wei](https://reader035.vdocuments.site/reader035/viewer/2022062314/56649c905503460f94949b67/html5/thumbnails/9.jpg)
Technology Specification
• Load-Sharing Algorithm: Round-robin or other policies.
• Session Map: If an incoming packet belongs to an established connection in the map, forward it to the previously selected server. Otherwise, if it is a connection initiation (it contains a SYN), select a server and save the connection in the map. A packet that matches no connection and contains no SYN may or may not be discarded.
• Backup method: Guards against failure of the dispatcher and the servers.
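The session map logic on this slide can be sketched in a few lines of Python. This is a minimal illustration, not any product's implementation: the `Dispatcher` class, the `(client_ip, client_port)` key, and the choice to always discard unmatched non-SYN packets are assumptions made for the example.

```python
from itertools import cycle


class Dispatcher:
    """Toy layer 4 dispatcher: round-robin selection plus a session map."""

    def __init__(self, servers):
        self._next_server = cycle(servers)  # round-robin load-sharing policy
        self.session_map = {}               # (client_ip, client_port) -> server

    def dispatch(self, client_ip, client_port, syn):
        key = (client_ip, client_port)
        if key in self.session_map:
            # Packet of an established connection: reuse the earlier choice.
            return self.session_map[key]
        if syn:
            # Connection initiation: select a server and remember it.
            server = next(self._next_server)
            self.session_map[key] = server
            return server
        # No map entry and no SYN: this sketch chooses to discard (assumption).
        return None
```

A real dispatcher would also remove entries when a connection closes or times out, which the sketch leaves out.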
![Page 10: Scalable Web Server Clustering Technologies J. Wei](https://reader035.vdocuments.site/reader035/viewer/2022062314/56649c905503460f94949b67/html5/thumbnails/10.jpg)
L4/2 Traffic Flow
![Page 11: Scalable Web Server Clustering Technologies J. Wei](https://reader035.vdocuments.site/reader035/viewer/2022062314/56649c905503460f94949b67/html5/thumbnails/11.jpg)
L4/2 Traffic Flow (cont.)
1. A client sends a request to A.
2. The router sends the request to the dispatcher.
3. Based on the load-sharing algorithm, the dispatcher selects an actual server (here, server 2) to serve the client.
4. Server 2 replies to the client directly.
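Step 3 of this flow forwards at layer 2: only the destination MAC address changes, and the IP packet still carries A as its destination. The sketch below illustrates that idea under simplifying assumptions; the `Frame` type and `l42_forward` function are hypothetical names, and a real frame has many more fields.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Frame:
    dst_mac: str  # layer 2 destination
    src_ip: str   # layer 3 source (the client)
    dst_ip: str   # layer 3 destination (the cluster address A)


def l42_forward(frame, server_mac):
    """Re-address the frame to the chosen server; the IP header is untouched."""
    return replace(frame, dst_mac=server_mac)


incoming = Frame(dst_mac="dispatcher-mac", src_ip="198.51.100.7", dst_ip="A")
forwarded = l42_forward(incoming, "server2-mac")
```

Because the server also owns A as a secondary address, it accepts the packet and can answer the client with A as the source.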
![Page 12: Scalable Web Server Clustering Technologies J. Wei](https://reader035.vdocuments.site/reader035/viewer/2022062314/56649c905503460f94949b67/html5/thumbnails/12.jpg)
Advantage vs. Disadvantage
Advantage:
• Servers reply to clients directly, which keeps the dispatcher from becoming a bottleneck.
• Checksums do not need to be recalculated, because the dispatcher operates on layer 2.
Disadvantage:
• The dispatcher must have a direct physical connection to every server.
![Page 13: Scalable Web Server Clustering Technologies J. Wei](https://reader035.vdocuments.site/reader035/viewer/2022062314/56649c905503460f94949b67/html5/thumbnails/13.jpg)
ONE-IP (Bell Labs, 1996)
Load-Sharing Algorithm
• Routing-based dispatching: Hash the incoming client’s address to get a number that indicates which server will service the request.
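Hashing the client address to a server index might look like the sketch below. The slide does not give the actual hash function, so the modulo over the integer form of the address is an assumption chosen for simplicity, and `pick_server` is a hypothetical name.

```python
import ipaddress


def pick_server(client_ip, servers):
    """Map a client address to one server, always the same one for that client."""
    as_number = int(ipaddress.ip_address(client_ip))
    # Assumed hash: address modulo the number of servers.
    return servers[as_number % len(servers)]


servers = ["s0", "s1", "s2"]
chosen = pick_server("10.0.0.7", servers)
```

The mapping needs no per-connection state, which is also why it cannot react when a few clients generate most of the load.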
![Page 14: Scalable Web Server Clustering Technologies J. Wei](https://reader035.vdocuments.site/reader035/viewer/2022062314/56649c905503460f94949b67/html5/thumbnails/14.jpg)
ONE-IP (cont.)
• Broadcast-based dispatching: Every packet is broadcast to all servers, and each server accepts only the clients in its own fixed and disjoint portion of the address space.
![Page 15: Scalable Web Server Clustering Technologies J. Wei](https://reader035.vdocuments.site/reader035/viewer/2022062314/56649c905503460f94949b67/html5/thumbnails/15.jpg)
ONE-IP (cont.)
• Drawback: Cannot adapt when client requests are disproportionately distributed.
• Backup: Watchdog daemon (watchd)
  • Dispatcher failure: The backup dispatcher notices the missing heartbeat of the primary dispatcher and takes over.
  • Server failure: Reconfigure the hash table or the address filters on the other servers.
![Page 16: Scalable Web Server Clustering Technologies J. Wei](https://reader035.vdocuments.site/reader035/viewer/2022062314/56649c905503460f94949b67/html5/thumbnails/16.jpg)
Network Dispatcher (IBM, 1996)
• It powered the 1998 Olympic Games website with up to 2000 requests/s.
• Experimental results are 2200 requests/s.
![Page 17: Scalable Web Server Clustering Technologies J. Wei](https://reader035.vdocuments.site/reader035/viewer/2022062314/56649c905503460f94949b67/html5/thumbnails/17.jpg)
Network Dispatcher (cont.)
• Load-Sharing Algorithm: Weighted round-robin algorithm.
• Connection Map: A packet that matches no existing connection is discarded if it does not contain a SYN, or if no server with a non-zero allocation weight is available.
• Backup:
  • Dispatcher: Secondary dispatcher. In fact, the system contains several extra dispatchers. The secondary dispatcher takes over the IP address of the failed dispatcher.
  • Server: High Availability Cluster Multi-Processing for AIX (HACMP) on the IBM SP-2. Reconfigure the dispatchers to exclude the node; the failed server reboots automatically; reconfigure the dispatchers to include the node again.
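A weighted round-robin policy can be sketched as a generator that visits each server in proportion to an integer weight and never selects a server whose weight is zero. This is one simple interleaving, not IBM's algorithm; the function name and the example weights are assumptions.

```python
from itertools import islice


def weighted_round_robin(weights):
    """Yield server names forever, each in proportion to its integer weight."""
    top = max(weights.values(), default=0)
    if top <= 0:
        return  # no server has a non-zero weight: nothing can be selected
    while True:
        for round_no in range(top):
            for server, weight in weights.items():
                if weight > round_no:
                    yield server


# "down" has weight 0, so it never receives a connection.
first_four = list(islice(weighted_round_robin({"a": 3, "b": 1, "down": 0}), 4))
```

Setting a failed node's weight to zero is one way to model the "exclude the node" step; restoring the weight models including it again.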
![Page 18: Scalable Web Server Clustering Technologies J. Wei](https://reader035.vdocuments.site/reader035/viewer/2022062314/56649c905503460f94949b67/html5/thumbnails/18.jpg)
Network Dispatcher (cont.)
Client affinity:
• Background: For services such as FTP and SSL, two connections from the same client must be assigned to the same server.
• Connection requests from the same client that arrive before a given affinity life span expires are sent to the same server.
• “The quality of the load sharing may suffer slightly, but the overall performance of the system improves.”
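Client affinity with a life span can be sketched as a table keyed by client address. The names below are hypothetical, and two details are assumptions the slide does not settle: time is passed in explicitly as `now`, and each new connection restarts the life span.

```python
class AffinityTable:
    """Remember which server a client used until the affinity life span expires."""

    def __init__(self, life_span):
        self.life_span = life_span
        self._entries = {}  # client_ip -> (server, expiry_time)

    def server_for(self, client_ip, now, choose_server):
        entry = self._entries.get(client_ip)
        if entry is not None and now < entry[1]:
            server = entry[0]          # affinity still valid: same server
        else:
            server = choose_server()   # expired or unknown: normal load sharing
        # Assumption: every connection restarts the life span.
        self._entries[client_ip] = (server, now + self.life_span)
        return server
```

Here `choose_server` stands for whatever load-sharing policy is in use, for example weighted round robin.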
![Page 19: Scalable Web Server Clustering Technologies J. Wei](https://reader035.vdocuments.site/reader035/viewer/2022062314/56649c905503460f94949b67/html5/thumbnails/19.jpg)
Others
• LSMAC (University of Nebraska-Lincoln)
  • “Implement L4/2 clustering as a portable user-space application running on commodity systems”
• Alteon ACEdirector (hardware implementation)
  • ACEdirector 2’s primary focus is on load balancing Internet services such as HTTP and FTP.
  • Load-Sharing Algorithm: Round-robin and least-connections load sharing policies.
  • Supports SSL service.
![Page 20: Scalable Web Server Clustering Technologies J. Wei](https://reader035.vdocuments.site/reader035/viewer/2022062314/56649c905503460f94949b67/html5/thumbnails/20.jpg)
L4/3 Clustering
• The dispatcher appears as a single host to clients and as a gateway to the servers (IP address = A).
• Each server has its own IP address, which can be globally unique or locally unique (IP addresses = B1, B2, … , Bn).
• Load sharing algorithm: Round robin or other algorithms.
• The dispatcher keeps a session map table.
![Page 21: Scalable Web Server Clustering Technologies J. Wei](https://reader035.vdocuments.site/reader035/viewer/2022062314/56649c905503460f94949b67/html5/thumbnails/21.jpg)
L4/3 Clustering (cont.)
![Page 22: Scalable Web Server Clustering Technologies J. Wei](https://reader035.vdocuments.site/reader035/viewer/2022062314/56649c905503460f94949b67/html5/thumbnails/22.jpg)
L4/3 Clustering (cont.)
1. A client sends a request with A as the destination.
2. The packet comes to the dispatcher.
3. Based on the load sharing algorithm and the session table, the dispatcher selects a server, rewrites the destination IP address, recalculates the checksums, and forwards the packet to the server.
4. The server sends its reply by way of the dispatcher, which acts as its gateway.
5. The dispatcher rewrites the source IP address of the reply as A, recalculates the checksums, and forwards it to the client.
• Disadvantage:
1. The checksums (IP and TCP) are recalculated twice.
2. All traffic flows through the dispatcher. (Bottleneck)
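The checksum cost in this scheme comes from the address rewrite itself. The sketch below rewrites the destination of a 20-byte IPv4 header (no options, an assumption) and recomputes the header checksum using the standard ones' complement Internet checksum. The function names are hypothetical. The TCP checksum, which also covers the IP addresses through its pseudo-header, would need the same treatment and is omitted here.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """Ones' complement of the ones' complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def rewrite_destination(header: bytes, new_dst: bytes) -> bytes:
    """Return the header with a new destination address and a fresh checksum."""
    # Bytes 10-11 hold the checksum, bytes 16-19 the destination address.
    zeroed = header[:10] + b"\x00\x00" + header[12:16] + new_dst
    checksum = internet_checksum(zeroed)
    return zeroed[:10] + struct.pack("!H", checksum) + zeroed[12:]


# Example header with the checksum field zeroed (destination 192.168.0.199).
header = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
to_server = rewrite_destination(header, bytes([10, 0, 0, 2]))
```

A header carrying a correct checksum sums to zero under `internet_checksum`, which gives a quick way to check the rewrite.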
![Page 23: Scalable Web Server Clustering Technologies J. Wei](https://reader035.vdocuments.site/reader035/viewer/2022062314/56649c905503460f94949b67/html5/thumbnails/23.jpg)
Magicrouter
• University of California at Berkeley, 1996
• Fast Packet Interposing and kernel modifications
• Load sharing algorithms:
  • Round robin
  • Random
  • Incremental load
• Backup:
  • Dispatcher: primary + backup model.
  • Server: Use ARP to map server IP addresses to MAC addresses in order to detect server failures.
![Page 24: Scalable Web Server Clustering Technologies J. Wei](https://reader035.vdocuments.site/reader035/viewer/2022062314/56649c905503460f94949b67/html5/thumbnails/24.jpg)
LocalDirector (Cisco, 1996)
Load sharing algorithm:
• Least connections: choose the server with the fewest connections.
• Fastest response: choose the server that responds to the request first.
• Round-robin: strict round-robin policy.
Backup:
• Dispatcher: an extra LocalDirector unit linked to the primary one with a special failover cable.
• Server: Contact the servers periodically. When a server fails, remove it from the pool and keep contacting it; when it is up again, add it back to the server pool.
Sticky flag: similar to IBM’s client affinity.
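The least-connections policy listed on this slide can be sketched with a counter per server. The class below is a hypothetical illustration, not Cisco's implementation; ties go to the server listed first, which is an assumption.

```python
class LeastConnections:
    """Pick the server that currently has the fewest open connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def open(self):
        # min() returns the first server among those tied for the lowest count.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def close(self, server):
        self.active[server] -= 1
```

Unlike round robin, this policy needs the dispatcher to see connection teardown, so that the counters stay accurate.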
![Page 25: Scalable Web Server Clustering Technologies J. Wei](https://reader035.vdocuments.site/reader035/viewer/2022062314/56649c905503460f94949b67/html5/thumbnails/25.jpg)
LSNAT
• University of Nebraska-Lincoln
• User-space implementation
• RFC 2391: Load Sharing using IP Network Address Translation (LSNAT)
• Backup:
  • Dispatcher: Select one server as the new dispatcher. A Distributed State Reconstruction Mechanism rebuilds the map of existing connections.
  • Server: Exclude the failed server from the pool of active servers. When it is up, include it again.
![Page 26: Scalable Web Server Clustering Technologies J. Wei](https://reader035.vdocuments.site/reader035/viewer/2022062314/56649c905503460f94949b67/html5/thumbnails/26.jpg)
L7 Clustering
• Makes dispatch decisions based on the content. (Application Layer)
• Content-based dispatching
![Page 27: Scalable Web Server Clustering Technologies J. Wei](https://reader035.vdocuments.site/reader035/viewer/2022062314/56649c905503460f94949b67/html5/thumbnails/27.jpg)
LARD
• Locality-Aware Request Distribution, Rice University
• It uses a TCP handoff protocol with a modified kernel.
• Different servers process different kinds of requests, which makes it possible to use specialized servers.
![Page 28: Scalable Web Server Clustering Technologies J. Wei](https://reader035.vdocuments.site/reader035/viewer/2022062314/56649c905503460f94949b67/html5/thumbnails/28.jpg)
Web Accelerator (IBM)
• “The accelerator can now perform content-based routing in which it makes intelligent decisions about where to route requests based on the URL.”
• L7 based on L4/2
• Web page caching
• The dispatcher serves as a gateway/router.
• All traffic flows through the dispatcher.
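Routing on the URL, as the quotation describes, can be sketched as a prefix table. The table contents, the server names, and the `route` function are invented for illustration; the slide does not say how the Web Accelerator expresses its rules.

```python
from urllib.parse import urlsplit

# Hypothetical routing table: URL path prefix -> server pool.
ROUTES = [
    ("/images/", "image-server"),
    ("/cgi-bin/", "app-server"),
]
DEFAULT_SERVER = "static-server"


def route(url):
    """Choose a server from the path of the requested URL."""
    path = urlsplit(url).path
    for prefix, server in ROUTES:
        if path.startswith(prefix):
            return server
    return DEFAULT_SERVER
```

Because the URL is only visible after the TCP connection is established, a layer 7 dispatcher has to accept the connection itself before it can choose a server.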
![Page 29: Scalable Web Server Clustering Technologies J. Wei](https://reader035.vdocuments.site/reader035/viewer/2022062314/56649c905503460f94949b67/html5/thumbnails/29.jpg)
ArrowPoint
• Content-based dispatching policy
• Caching mechanism similar to that of Web Accelerator
• Sticky connections
• Hot standby of the dispatcher, and a mechanism to detect server node failure
![Page 30: Scalable Web Server Clustering Technologies J. Wei](https://reader035.vdocuments.site/reader035/viewer/2022062314/56649c905503460f94949b67/html5/thumbnails/30.jpg)
Conclusion
• L4/2 Clustering
  • Bottleneck: power of the dispatcher to process incoming requests.
  • Advantage: sustainable request rate.
• L4/3 Clustering
  • Bottleneck: recalculation of checksums.
• L7 Clustering
  • Bottleneck: complexity of the content-based dispatching algorithm.
  • Advantage: localizing the request space and caching request results.
![Page 31: Scalable Web Server Clustering Technologies J. Wei](https://reader035.vdocuments.site/reader035/viewer/2022062314/56649c905503460f94949b67/html5/thumbnails/31.jpg)
Qualitative comparison
Client-based approach:
• Advantage: Reduces the load on the web server by implementing the routing service on the client side.
• Disadvantage: It is not generally applicable, and it needs server-side cooperation.
Dispatcher-based approach:
• Advantage: Full control of client requests, which gives good load balancing. Easy to implement.
• Disadvantage: Risk of a dispatcher bottleneck.
![Page 32: Scalable Web Server Clustering Technologies J. Wei](https://reader035.vdocuments.site/reader035/viewer/2022062314/56649c905503460f94949b67/html5/thumbnails/32.jpg)
Qualitative comparison (cont.)
DNS-based approach:
• Advantage: High scalability. No risk of bottleneck.
• Disadvantage:
  • Because of address caching mechanisms, sophisticated algorithms are needed to achieve load balancing.
  • Fewer than 32 web servers for each public URL, because of the limit on UDP packet size.
Server-based approach:
• Advantage: No risk of single-point failure or bottleneck.
• Disadvantage: Redirection increases latency for clients.
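The limit of fewer than 32 servers in the DNS-based approach can be checked with a short calculation. The sketch assumes the classic 512-byte limit for a DNS message over UDP, a response containing only the question and A records, and name compression so that each record repeats the host name as a 2-byte pointer. None of these details are stated on the slide.

```python
UDP_DNS_LIMIT = 512  # bytes, classic DNS-over-UDP message limit (assumed)
HEADER = 12          # fixed DNS header
# Compressed name pointer, type, class, TTL, data length, IPv4 address.
A_RECORD = 2 + 2 + 2 + 4 + 2 + 4


def max_a_records(hostname):
    """How many A records fit in one response for this host name."""
    # Encoded name is len(hostname) + 2 bytes, plus query type and class.
    question = len(hostname) + 2 + 4
    return (UDP_DNS_LIMIT - HEADER - question) // A_RECORD


limit = max_a_records("www.example.com")
```

With 16 bytes per record, even an empty message could hold at most 512 / 16 = 32 addresses, so the header and question always push the usable number below 32.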
![Page 33: Scalable Web Server Clustering Technologies J. Wei](https://reader035.vdocuments.site/reader035/viewer/2022062314/56649c905503460f94949b67/html5/thumbnails/33.jpg)
Qualitative comparison (cont.)
![Page 34: Scalable Web Server Clustering Technologies J. Wei](https://reader035.vdocuments.site/reader035/viewer/2022062314/56649c905503460f94949b67/html5/thumbnails/34.jpg)
Quantitative comparison
• Cluster Maximum Utilization: At a given instant, the highest utilization among all servers in the cluster.
• Cumulative Frequency
• Exponential Distribution
• Heavy-tailed Distribution
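The cluster maximum utilization metric is straightforward to compute from per-server samples. The sketch below also shows one reading of cumulative frequency, namely the fraction of sampled instants at which the cluster maximum utilization stays at or below a level `x`. That reading, the function names, and the sample data are assumptions.

```python
def cluster_max_utilization(utilizations):
    """Highest utilization among all servers at one instant."""
    return max(utilizations)


def cumulative_frequency(samples, x):
    """Fraction of instants whose cluster maximum utilization is at most x."""
    maxima = [cluster_max_utilization(instant) for instant in samples]
    return sum(1 for value in maxima if value <= x) / len(maxima)


# Four instants, two servers each (made-up numbers).
samples = [[0.5, 0.7], [0.9, 0.4], [0.6, 0.95], [0.3, 0.2]]
below_08 = cumulative_frequency(samples, 0.8)
```

A policy that balances well keeps this fraction close to 1 for values of `x` below full utilization.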
![Page 35: Scalable Web Server Clustering Technologies J. Wei](https://reader035.vdocuments.site/reader035/viewer/2022062314/56649c905503460f94949b67/html5/thumbnails/35.jpg)
Quantitative comparison (cont.)
Exponential Distribution Model
![Page 36: Scalable Web Server Clustering Technologies J. Wei](https://reader035.vdocuments.site/reader035/viewer/2022062314/56649c905503460f94949b67/html5/thumbnails/36.jpg)
Quantitative comparison (cont.)
Dispatcher-based: utilization is below 0.8 almost all of the time
DNS-based (adaptive TTL): utilization below 0.9
DNS-based (constant TTL): overloaded 20% of the time
Server-based: utilization below 0.9
DNS-RR: overloaded more than 70% of the time
![Page 37: Scalable Web Server Clustering Technologies J. Wei](https://reader035.vdocuments.site/reader035/viewer/2022062314/56649c905503460f94949b67/html5/thumbnails/37.jpg)
Quantitative comparison (cont.)
Heavy-tailed Distribution Model
![Page 38: Scalable Web Server Clustering Technologies J. Wei](https://reader035.vdocuments.site/reader035/viewer/2022062314/56649c905503460f94949b67/html5/thumbnails/38.jpg)
Quantitative comparison (cont.)
Dispatcher-based: works fine
DNS-based (adaptive TTL): works fine, without risk of bottleneck
Server-based: works fine until the load exceeds 0.9, but performs poorly when the load is high
![Page 39: Scalable Web Server Clustering Technologies J. Wei](https://reader035.vdocuments.site/reader035/viewer/2022062314/56649c905503460f94949b67/html5/thumbnails/39.jpg)
Conclusion
• The bottleneck will be the network throughput.
• By making use of wide area network bandwidth, a cluster can get much better performance.