Handling High Traffic Websites
A Seminar On Handling High Traffic Websites
By Ashish Kumar aka Ashfame
CP06023 / 06EJECS022
Processing Load
User opens a page in the browser
Server processes the request
Server generates the page
User starts receiving the page
Page is rendered in the browser
What slows down?
HTTP requests
DNS Lookups
No cache expiry date
Multiple CSS and JS file includes
Page Size
Database queries
Areas to improve
Server Side: Adopt methodologies that make the server(s) work faster
Client Side: Reduce resource requirements to a minimum
On-Site Improvements
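[Diagram: a page loading ashfame.com/code.js, blog.ashfame.com/style.css and bbninja.com/guide.pdf triggers three DNS lookups (DNS #1, DNS #2, DNS #3) before the files are fetched from their servers]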
DNS Lookups
• DNS lookups can account for the largest part of the delay
• Each distinct domain and sub-domain is looked up to determine where its files reside
• More DNS lookups mean more delay
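As a rough way to see how many lookups a page needs, the sketch below counts the distinct hostnames in a list of resource URLs. It is only an illustration: the function name `distinct_hosts` is hypothetical, the `http://` scheme is an assumption (the slide gives bare hosts), and real browsers also cache DNS results, which this does not model.

```python
from urllib.parse import urlsplit

def distinct_hosts(resource_urls):
    """Return the set of hostnames a browser would have to resolve."""
    hosts = set()
    for url in resource_urls:
        # urlsplit only recognises the host when a scheme or '//' is present
        if "//" not in url:
            url = "//" + url
        host = urlsplit(url).hostname
        if host:
            hosts.add(host)
    return hosts

# The three resources from the diagram; the scheme is assumed.
resources = [
    "http://ashfame.com/code.js",
    "http://blog.ashfame.com/style.css",
    "http://bbninja.com/guide.pdf",
]
print(len(distinct_hosts(resources)), "DNS lookups needed")
```

Serving all three files from one host would bring the count down to a single lookup.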
[Diagram: the page at blog.ashfame.com fetches images, JS and CSS from the server as separate HTTP requests]
HTTP Requests
Reduce requests to a minimum
• Make fewer requests for images, stylesheets and JavaScript files
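To find out how many extra requests a page triggers, one option is to count the external images, scripts and stylesheets in its HTML. The sketch below uses Python's standard `html.parser`; the class and function names and the sample markup are made up for illustration, and it ignores resources referenced from inside CSS or loaded by scripts.

```python
from html.parser import HTMLParser

class ResourceCounter(HTMLParser):
    """Counts tags that cause an additional HTTP request."""

    def __init__(self):
        super().__init__()
        self.counts = {"images": 0, "scripts": 0, "stylesheets": 0}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and attrs.get("src"):
            self.counts["images"] += 1
        elif tag == "script" and attrs.get("src"):
            # inline scripts have no src and cost no extra request
            self.counts["scripts"] += 1
        elif (tag == "link" and attrs.get("href")
              and (attrs.get("rel") or "").lower() == "stylesheet"):
            self.counts["stylesheets"] += 1

def count_requests(html):
    parser = ResourceCounter()
    parser.feed(html)
    return parser.counts

# Hypothetical page used only for this example.
sample_page = """
<html><head>
  <link rel="stylesheet" href="style.css">
  <link rel="stylesheet" href="print.css">
  <script src="code.js"></script>
</head><body>
  <img src="logo.png"><img src="banner.png"><img src="icon.png">
  <script>var inline = true;</script>
</body></html>
"""
print(count_requests(sample_page))
```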
CSS Sprites
3 separate images mean 3 image requests
Combine them into a single sprite image and display each part using CSS
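The CSS side of a sprite can be sketched as follows: every icon uses the same image and only the background offset changes. This assumes the icons are the same size and laid out in one horizontal row; the function name, class names and file name are hypothetical.

```python
def sprite_css(sprite_url, icon_names, icon_width, icon_height):
    """Build one CSS rule per icon, all pointing at the same sprite image."""
    rules = []
    for index, name in enumerate(icon_names):
        offset = index * icon_width
        # shift the sprite left so the wanted icon shows through the box
        x_position = f"-{offset}px" if offset else "0"
        rules.append(
            f".icon-{name} {{ background: url('{sprite_url}') {x_position} 0 no-repeat; "
            f"width: {icon_width}px; height: {icon_height}px; }}"
        )
    return "\n".join(rules)

# Three 32x32 icons packed side by side into one (hypothetical) sprite.png
css = sprite_css("sprite.png", ["home", "search", "mail"], 32, 32)
print(css)
```

The browser downloads `sprite.png` once and reuses it for all three rules, so three image requests become one.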
Minify CSS
• Removes comments
• Removes white space
• Uses shorthand notations
• Reduces file size and saves bandwidth, which means faster page load
CSS Compressor (csscompressor.com)
Combine multiple files
• All style rules are then fetched from a single file, which reduces HTTP requests
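A minimal sketch of what a CSS minifier does, assuming simple stylesheets: it strips comments and white space and joins several sheets into one. It is deliberately naive (it does not convert to shorthand notation and would mangle white space inside quoted strings), so it is no replacement for a real tool; the function names are hypothetical.

```python
import re

def minify_css(css):
    """Naive minifier: drop comments and unnecessary white space."""
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.DOTALL)  # remove comments
    css = re.sub(r"\s+", " ", css)                        # collapse white space
    css = re.sub(r"\s*([{};:,])\s*", r"\1", css)          # trim around punctuation
    css = css.replace(";}", "}")                          # last semicolon is optional
    return css.strip()

def combine_css(sheets):
    """Minify each sheet and join them, so one request fetches all rules."""
    return "".join(minify_css(sheet) for sheet in sheets)

layout = "/* layout */\nbody {\n  margin: 0;\n  color: red;\n}\n"
theme = "h1, h2 {\n  font-weight: bold;\n}\n"
print(combine_css([layout, theme]))
```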
Minify JS
• Removes comments
• Removes white space
• Reduces file size and saves bandwidth, which means faster page load
JS Compressor (javascriptcompressor.com)
Combine multiple files
• All JavaScript code is then fetched from a single file, which reduces HTTP requests
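Combining scripts is mostly concatenation, with one catch: if a file does not end in a semicolon and the next one starts with a parenthesis, the two statements can merge. The sketch below, with a hypothetical `bundle_js` helper, joins sources with an explicit semicolon to guard against that. It does not minify anything; that part is best left to a dedicated compressor.

```python
def bundle_js(sources):
    """Join JavaScript sources into one file, separated by ';'.

    The separator keeps 'var a = 1' followed by '(function(){})()'
    from being parsed as a single call expression.
    """
    return "\n;\n".join(code.strip() for code in sources) + "\n"

# Two made-up script bodies standing in for separate .js files
first = "var a = 1"
second = "(function () { console.log(a); })()"
print(bundle_js([first, second]))
```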
Browser Cache
[Diagram: requests for blog.ashfame.com pass through the browser cache before reaching the server]
Set a cache expiry date
• An object with no expiry date may be treated as uncacheable
• Cached objects can be read from the cache until they expire
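An expiry date is set through HTTP response headers. The sketch below builds the two common ones, `Cache-Control: max-age` and `Expires`, for a given lifetime. The helper name `cache_headers` and the one-day lifetime are assumptions for illustration; how the headers are attached depends on the web server in use.

```python
import time
from email.utils import formatdate

def cache_headers(max_age_seconds, now=None):
    """Return headers telling the browser how long it may reuse an object."""
    now = time.time() if now is None else now
    return {
        # relative lifetime, understood by HTTP/1.1 clients
        "Cache-Control": f"public, max-age={max_age_seconds}",
        # absolute expiry date in HTTP date format, for older clients
        "Expires": formatdate(now + max_age_seconds, usegmt=True),
    }

# Let static files be reused for one day (an arbitrary example value)
for name, value in cache_headers(24 * 60 * 60).items():
    print(f"{name}: {value}")
```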
On-Site Improvements
Use Content Distribution Network (CDN)
Gzip Compression
Avoid CSS expressions
JIT Approach – CSS at top | JS at bottom
Reduce Image size (Image format)
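The effect of Gzip compression from the list above is easy to check with Python's standard `gzip` module. The markup below is made up and highly repetitive, so the saving is larger than a typical page would see; treat it as a demonstration of the idea, not a benchmark.

```python
import gzip

# Repetitive, made-up markup; real pages compress less dramatically.
html = ("<li class='item'>High traffic websites</li>\n" * 500).encode("utf-8")
compressed = gzip.compress(html)

print(len(html), "bytes before,", len(compressed), "bytes after gzip")

# The browser decompresses the response and gets the identical page back
assert gzip.decompress(compressed) == html
```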
Server Side Improvements
Employ Caching
File based caching
• Database queries are cached
• Saved as a file on disk
Memory based caching
• Cached queries are saved in memory
• This even saves the disk I/O delay
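A memory based query cache can be sketched as a dictionary keyed by the query, with an expiry time per entry. The example below uses an in-memory SQLite database purely so it runs on its own; the `QueryCache` class, the `posts` table and the 60 second lifetime are all hypothetical. A file based variant would write the rows to disk instead of keeping them in the dictionary.

```python
import sqlite3
import time

class QueryCache:
    """Keeps query results in memory for ttl_seconds."""

    def __init__(self, connection, ttl_seconds=60):
        self.connection = connection
        self.ttl = ttl_seconds
        self.store = {}   # (sql, params) -> (expiry time, rows)
        self.hits = 0
        self.misses = 0

    def query(self, sql, params=()):
        key = (sql, tuple(params))
        entry = self.store.get(key)
        if entry and entry[0] > time.monotonic():
            self.hits += 1            # served from memory, database untouched
            return entry[1]
        self.misses += 1              # not cached or expired: ask the database
        rows = self.connection.execute(sql, params).fetchall()
        self.store[key] = (time.monotonic() + self.ttl, rows)
        return rows

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
db.execute("INSERT INTO posts (title) VALUES ('Handling high traffic')")

cache = QueryCache(db)
first = cache.query("SELECT title FROM posts WHERE id = ?", (1,))
second = cache.query("SELECT title FROM posts WHERE id = ?", (1,))
print("hits:", cache.hits, "misses:", cache.misses)
```

The second call never reaches the database, which is the saving the slide refers to.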
Set Up a Proxy Server
[Diagram: HTTP Client → Front End Server (Nginx) → Back End Server (Apache | IIS) → Database Server]
Nginx proxying the requests
• Requests for static files are served by Nginx, as it has a smaller memory footprint
• Requests that need dynamic processing are proxied to the existing, bulkier web server
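The routing decision the front end makes can be sketched in a few lines: look at the file extension and either serve the file directly or pass the request on. This is an illustration of the logic only, not Nginx configuration; the extension list and the function name `choose_handler` are assumptions.

```python
import posixpath

# Extensions treated as static files (an assumed, incomplete list)
STATIC_EXTENSIONS = {".css", ".js", ".png", ".jpg", ".gif", ".ico", ".pdf"}

def choose_handler(request_path):
    """Decide which server should answer a request."""
    path = request_path.split("?", 1)[0]               # ignore the query string
    extension = posixpath.splitext(path)[1].lower()
    if extension in STATIC_EXTENSIONS:
        return "front end"   # lightweight server reads the file from disk
    return "back end"        # proxied to the heavier application server

for path in ["/style.css?v=2", "/images/logo.PNG", "/index.php", "/"]:
    print(path, "->", choose_handler(path))
```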
Master Slave DB Setup
A master slave setup keeps serving under heavy load
• It reduces latency
• More requests can be handled
[Diagram: Slave I, Slave II and Slave III, with arrows labelled Usual Load and Heavy Load]
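One way to picture the setup is a small router that decides which database server gets each query. The sketch below rests on assumptions not spelled out on the slide: writes always go to the master, and reads are spread over the slaves in turn only while the site is under heavy load. The class name, the server names and the `heavy_load` flag are hypothetical.

```python
import itertools

class ReplicatedRouter:
    """Picks a database server for each query."""

    def __init__(self, master, slaves):
        self.master = master
        self._slaves = itertools.cycle(slaves) if slaves else None

    def server_for(self, sql, heavy_load):
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and heavy_load and self._slaves is not None:
            return next(self._slaves)   # take turns across the slaves
        return self.master              # writes, and reads under usual load

router = ReplicatedRouter("master", ["slave-1", "slave-2", "slave-3"])
print(router.server_for("SELECT * FROM posts", heavy_load=False))
print(router.server_for("SELECT * FROM posts", heavy_load=True))
print(router.server_for("INSERT INTO posts (title) VALUES ('x')", heavy_load=True))
```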
Load Balancer
[Diagram: requests under heavy load pass through the load balancer, which spreads them across Server I to Server VI]
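How a load balancer picks a server depends on its strategy. The sketch below shows one common choice, least connections, where each new request goes to the server currently handling the fewest. The slide does not say which strategy is meant, so this is an assumed example, and the class and server names are hypothetical.

```python
class LeastConnectionsBalancer:
    """Sends each request to the server with the fewest active connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def acquire(self):
        # ties are broken by the order the servers were listed in
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # call when the request has been answered
        self.active[server] -= 1

balancer = LeastConnectionsBalancer(["server-1", "server-2", "server-3"])
assigned = [balancer.acquire() for _ in range(3)]   # one request each
balancer.release("server-2")                        # server-2 finishes first
next_server = balancer.acquire()                    # so it gets the next request
print(assigned, next_server)
```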
Server Side Improvements
Threaded Model over Processes Model
MySQL (Database Server) Tweaking Settings
MySQL Storage Engine (InnoDB | MyISAM)
RAID Configuration
Avoid hitting swap (disk I/O) at all costs
Your Approach
Determine your needs
• Type of website you have
• Content you have on your pages
• Optimize according to the objects on your pages
• No single solution works well in every case
Act on what will actually help
• Caching content makes no sense on video hosting sites
• For images, latency matters
• A CDN is a good option to consider
• CPU-intensive processing can’t be improved without adding processing power, even if other resources are idle
Case study - Google
Google File System
• Custom file system
MapReduce
• Programming model for processing large data sets
• Works on TBs of data across several machines
BigTable
• Used to store structured data
• Can handle millions of reads/writes per second
Inline CSS & JS
• No links to include external files
• Reduces HTTP requests
Single logo image
• Less bandwidth
• Faster page load
No extensive styling
• Browser default styles are used
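The MapReduce idea mentioned in this case study can be shown at toy scale with a word count: a map step emits (word, 1) pairs, the pairs are grouped by word, and a reduce step sums each group. This single-process sketch only illustrates the programming model; it says nothing about how Google's system distributes the work, and all function names are hypothetical.

```python
from collections import defaultdict

def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    """Group the emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Combine each group into a single result per key."""
    return {key: sum(values) for key, values in grouped.items()}

def word_count(documents):
    # In a real cluster the map calls would run on different machines.
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return reduce_phase(shuffle(pairs))

print(word_count(["fast page load", "Fast server"]))
```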
Case study - Facebook
Scaling PHP
• PHP is a scripting language
• Complex parts are coded in C++ as PHP extensions
HipHop
• PHP is dynamic and weakly typed
• C++ is a compiled language with static typing
• HipHop translates PHP source code into C++, which is then compiled
• It lets PHP code run at about 2X speed
• Facebook developers released it to the public
Using C++
• Complex modules are coded in C++
• C++ works closer to the processor
• PHP passes them data; they manipulate it and return it to PHP
Case study - YouTube
Uses Flash
• Smaller file size
• Saves bandwidth
• Faster streaming
Popular content
• It is moved to a CDN
Regular content
• Served by mini-clusters
Efficient Resource Utilization
• Most resources go towards the primary objective of streaming videos
• Social networking features such as commenting can be given lower priority
Queries?