Twitter - Architecture and Scalability Lessons

Post on 04-Dec-2014

Category:

Technology

DESCRIPTION

One of the seminars I gave at college a couple of years back!

TRANSCRIPT

Twitter! Architecture and Scalability

Aditya B05IT04

WHAT IS TWITTER ?

It's addictive

• micro-blogging platform

• text-based posts

• 140 characters in length

• followers receive updates

Who uses twitter

Web Traffic

Twitter’s web-based traffic

• Plus Twitter's API traffic, which is 10x the site's

As it often happens..

Downtimes!!

2008

So why the problem ?

• Over 350,000 users. The actual numbers are, as always, top secret.

• 600 requests per second.

• Average 200-300 connections per second, spiking to 800 connections per second.

• MySQL handled 2,400 requests per second.

What? Why so??

• When a user Abhinav writes a simple “I’m hanging out with…” message, Twitter has two choices –

1. PUSH the message to the queues of each of his 6,864 followers, or

2. Wait for the 6,864 followers to log in, then PULL the message.

• It is not as easy as it looks.

A 6000x multiplication factor

• Do you see a scaling problem with this scenario?

Scoble writes something: boom, 6,800 writes are kicked off, one for each follower.

Michael Arrington replies: boom, another 6,600 writes.

Jason Calacanis jumps in: boom, another 6,500 writes.
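The push and pull choices above can be compared with a toy in-memory model. This is only a sketch: the `Timelines` class and its method names are hypothetical and are not Twitter's code. It simply counts storage writes for one post under each strategy.

```python
from collections import defaultdict


class Timelines:
    """Toy in-memory model of the two delivery strategies (hypothetical names)."""

    def __init__(self):
        self.followers = defaultdict(set)  # author -> followers
        self.inboxes = defaultdict(list)   # follower -> messages pushed to them
        self.outboxes = defaultdict(list)  # author -> messages they wrote
        self.writes = 0                    # storage writes performed so far

    def follow(self, follower, author):
        self.followers[author].add(follower)

    def post_push(self, author, text):
        # PUSH: copy the message into every follower's queue at post time.
        for follower in self.followers[author]:
            self.inboxes[follower].append((author, text))
            self.writes += 1

    def post_pull(self, author, text):
        # PULL: store the message once; readers assemble timelines later.
        self.outboxes[author].append((author, text))
        self.writes += 1

    def read_pull(self, reader):
        # Collect messages from every author the reader follows.
        return [
            message
            for author, fans in self.followers.items()
            if reader in fans
            for message in self.outboxes[author]
        ]


push = Timelines()
pull = Timelines()
for i in range(6864):
    push.follow(f"follower{i}", "abhinav")
    pull.follow(f"follower{i}", "abhinav")

push.post_push("abhinav", "I'm hanging out with...")
pull.post_pull("abhinav", "I'm hanging out with...")
print(push.writes, pull.writes)
```

Push pays one write per follower when the message is posted; pull pays one write up front but moves the work to every read.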

• How many men were there? ~350,000, boss.

• And you? 1 database, boss.

• That is very unfair!

Bottlenecks

• Single MySQL database

• no monitoring, no graphs, no statistics

• Abuses

• Plan to partition in the future

SOLUTION ?

Caching

• Getting your friends' statuses is complicated. There are security and other issues.

• So rather than doing a query, a friend's status is updated in cache instead.

• It never touches the database. This gives a predictable response time (upper bound 20 ms).
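A minimal sketch of that idea, assuming a plain dictionary as a stand-in for a real cache such as memcached. The `StatusCache` class and its method names are hypothetical. The point is that the cache is written when a status changes, so reading friends' statuses needs no database query.

```python
class StatusCache:
    """Keeps each user's latest status in memory (stand-in for a real cache)."""

    def __init__(self):
        self._latest = {}  # user -> latest status text

    def on_status_update(self, user, text):
        # Called on every new post: refresh the cached entry at write time.
        self._latest[user] = text

    def friends_statuses(self, friends):
        # Served entirely from the cache; users who never posted are skipped.
        return {f: self._latest[f] for f in friends if f in self._latest}


cache = StatusCache()
cache.on_status_update("scoble", "At the conference")
cache.on_status_update("arrington", "Writing a post")
cache.on_status_update("scoble", "Back home")
print(cache.friends_statuses(["scoble", "arrington", "calacanis"]))
```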

Partitioning

• Plan to partition in the future. Currently they don't.

• The partition scheme will be based on time, not users, because most requests are very temporally local.
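One way a time-based scheme could look, as a sketch only. The monthly granularity, the `statuses_YYYY_MM` naming, and the `partition_for` helper are assumptions for illustration; the slides do not say how the partitions would be laid out.

```python
from datetime import datetime, timezone


def partition_for(created_at: datetime) -> str:
    # Route a status to a partition named after the month it was created in.
    return f"statuses_{created_at:%Y_%m}"


print(partition_for(datetime(2008, 5, 17, tzinfo=timezone.utc)))
print(partition_for(datetime(2008, 6, 1, tzinfo=timezone.utc)))
```

Because most requests are for recent updates, they would land on the newest partition or two, while older partitions stay mostly idle.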

Abuse Prevention

• Bots crawl the site and add everyone as friends.

• 9000 friends in 24 hours. It would take down the site.

Saraha

• Be ruthless. Delete them as users.

Following: 9000 | Followers: 14 | Updates: 2
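A built-in user limit is one way to stop this kind of bot before it does damage. The sketch below is a sliding-window limiter; the `FollowLimiter` class, its parameters, and the thresholds in the example are hypothetical and not taken from Twitter.

```python
from collections import defaultdict, deque


class FollowLimiter:
    """Allows at most max_follows follow actions per user per time window."""

    def __init__(self, max_follows, window_seconds):
        self.max_follows = max_follows
        self.window = window_seconds
        self._events = defaultdict(deque)  # user -> timestamps of recent follows

    def allow(self, user, now):
        recent = self._events[user]
        # Forget follow actions that have fallen out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.max_follows:
            return False  # over the limit: reject (or flag the account)
        recent.append(now)
        return True


limiter = FollowLimiter(max_follows=3, window_seconds=60)
results = [limiter.allow("bot", t) for t in (0, 1, 2, 3, 61)]
print(results)
```

Timestamps are passed in explicitly so the behaviour is easy to test; a real service would likely use the current time.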

Scalability -- Doing It Right

• Asynchronous event-driven design

• Partitioning/Shards

• Parallel execution

• Replication (read-mostly)
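The asynchronous, event-driven point can be sketched with a job queue and a background worker: the request handler only enqueues the delivery work and returns, and the worker does the per-follower writes later. Everything here (`start_worker`, the job tuple layout, the in-memory `delivered` list) is a hypothetical illustration using Python's standard library, not Twitter's design.

```python
import queue
import threading


def start_worker(jobs, delivered):
    """Start a background thread that drains delivery jobs from the queue."""

    def run():
        while True:
            job = jobs.get()
            if job is None:  # sentinel: shut the worker down
                jobs.task_done()
                break
            author, text, followers = job
            for follower in followers:
                delivered.append((follower, author, text))
            jobs.task_done()

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return thread


jobs = queue.Queue()
delivered = []
worker = start_worker(jobs, delivered)

# The "request handler" only enqueues and returns immediately.
jobs.put(("scoble", "hello", ["a", "b", "c"]))

jobs.join()      # wait here only so the example can inspect the result
jobs.put(None)   # stop the worker
worker.join()
print(len(delivered))
```

More workers, or workers on other machines, could drain the same kind of queue, which is where partitioning and parallel execution come in.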

Are we all doomed to go through this painful process when we are successful?

• Time-to-market vs. architecture

Good, Fast, Cheap - pick two :P

LESSONS LEARNED

1. Talk to the community.
2. Treat your scaling plan like a business plan.
3. Build it yourself.
4. Build in user limits.
5. Don't make the database the central bottleneck of doom.
6. Make your application easily partitionable from the start.
7. Optimize the database.
8. Cache the hell out of everything.
9. Most performance comes not from the language, but from application design.
10. Turn your website into an open service by creating an API. Their API is the single most powerful reason for Twitter's success.

References

• http://twitter.com/

• http://highscalability.com/scaling-twitter-making-twitter-10000-percent-faster

• http://www.slideshare.net/Blaine/scaling-twitter

• http://dev.twitter.com/2008/05/twittering-about-architecture.html

• http://www.danga.com/memcached/

• http://geekandpoke.com/

QUESTIONS

ADITYA

http://twitter.com/arbitya

adityabheemarao@gmail.com
