Ankara Cloud Meetup, 6th Event: Scaling Real-Time Messaging on Cloud (Presentation)
TRANSCRIPT
Scaling Real-Time Messaging on Cloud
Ozan Yerli
What is Connected2.me?
10M+ Registered Users
300K+ New Users Every Month
4M+ Monthly Active Users
Use Standards
XMPP (eXtensible Messaging and Presence Protocol)
• Tried our own implementation of chat
• XMPP is an industry standard
• It is extensible
• Stable open-source implementations: ejabberd
KISS (Keep It Simple, Stupid)
DNS Load-Balancing
[Diagram: multiple clients, DNS, and XMPP servers connected by clustering]
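The slide does not describe the client logic, so the following is only a minimal sketch of how a client might use DNS load-balancing: resolve the hostname, then try the returned addresses in random order so that connections spread across servers. The function names are hypothetical, and port 5222 is the standard XMPP client port rather than something stated in the talk.

```python
import random
import socket


def resolve_servers(hostname, port=5222):
    """Return every distinct address DNS currently advertises for hostname."""
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def connection_order(addresses, rng=random):
    """Shuffle a copy of the addresses: the first entry is the server to try,
    the remaining entries are fallbacks if that server is unreachable."""
    order = list(addresses)
    rng.shuffle(order)
    return order
```

A client would call `connection_order(resolve_servers(host))` and connect to the first address that accepts the connection.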
De-centralize
• Eliminates single points of failure
• Infinite scalability
• Reduces system management tasks
How to de-centralize authentication?
De-centralize
Client-side Anonymous Nick Generation
• Choose a long random password
• Calculate the hash of the password
• Anonymous nick will be “anon-hash”
• Server can verify nick & password without a centralized database of nicks and passwords
• No need to “register” nicks to a database
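The steps above can be sketched as follows. This is an illustration, not the actual Connected2.me implementation: the talk does not name the hash function, so SHA-256 is an assumption, and the function names are hypothetical.

```python
import hashlib
import hmac
import secrets

NICK_PREFIX = "anon-"


def _nick_for(password):
    """Derive the anonymous nick from a password (SHA-256 is an assumption)."""
    digest = hashlib.sha256(password.encode("utf-8")).hexdigest()
    return NICK_PREFIX + digest


def generate_credentials():
    """Client side: pick a long random password and derive the nick from it."""
    password = secrets.token_urlsafe(32)  # 32 random bytes, URL-safe text
    return _nick_for(password), password


def verify(nick, password):
    """Server side: recompute the nick from the password and compare.
    No lookup in a shared database of nicks and passwords is needed."""
    expected = _nick_for(password)
    return hmac.compare_digest(nick.encode("utf-8"), expected.encode("utf-8"))
```

Because the nick is a function of the password alone, any server in the cluster can run `verify` independently, which is what removes the central authentication database.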
DRW (Don’t Reinvent the Wheel)
Sending media
• Users upload directly to Amazon S3 with a long random filename
• The user then sends the URL of that file via XMPP
• No need to maintain servers for static files
• No need to authorize users, since the long random filename is practically unguessable
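A minimal sketch of the filename side of this scheme, assuming a hypothetical bucket URL and hypothetical helper names; the upload itself (for example through an S3 SDK) is left out because the talk does not describe it.

```python
import secrets

# Hypothetical bucket URL, used only for illustration.
BUCKET_URL = "https://example-media-bucket.s3.amazonaws.com"


def new_media_key(extension):
    """Return a long random object key; 32 random bytes make it unguessable."""
    return f"{secrets.token_urlsafe(32)}.{extension.lstrip('.')}"


def media_url(key):
    """Return the URL the sender would share with the recipient over XMPP."""
    return f"{BUCKET_URL}/{key}"
```

The random key acts as the access token: anyone who receives the URL can fetch the file, and nobody else can realistically guess it.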
DRW (Don’t Reinvent the Wheel)
Searching users
• Collect bio change events in Redis
• Push those changes to CloudSearch periodically
• Clients directly query CloudSearch
• Automatic scaling
• Extensible with multiple field types and computed fields
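The collect-then-push pattern above might look like the sketch below. An in-memory dictionary stands in for Redis so the example is self-contained, the `bio` field name and the class name are hypothetical, and the output follows the CloudSearch JSON document batch shape (`type`, `id`, `fields`). Uploading the batch to a search domain is not shown.

```python
import json


class BioChangeBuffer:
    """In-memory stand-in for the Redis buffer of bio change events."""

    def __init__(self):
        self._pending = {}

    def record_change(self, user_id, bio):
        # A later change overwrites an earlier one: only the latest bio
        # needs to reach the search index.
        self._pending[user_id] = bio

    def drain_as_batch(self):
        """Return the pending changes as a JSON document batch and clear them."""
        batch = [
            {"type": "add", "id": user_id, "fields": {"bio": bio}}
            for user_id, bio in self._pending.items()
        ]
        self._pending = {}
        return json.dumps(batch)
```

A periodic job would call `drain_as_batch()` and upload the result, so many small bio edits collapse into one request per interval.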
DRW (Don’t Reinvent the Wheel)
Analyzing client connection errors
• When there’s a connection error, most of the time the client cannot reach our servers
• We push data about such events through another channel (Firehose)
• We can find rare client bugs / connectivity issues
Let It Crash
Supervisor
• Crashes are inevitable; even Erlang nodes can crash under high load due to rare bugs
• Design the system to recover from crashes quickly and reliably
• Configure Supervisor to maintain process health and recover from crashes
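The talk does not show the Supervisor configuration, so the sketch below is not that tool's API. It only illustrates the restart policy behind "let it crash": run a task, and if it crashes, restart it a bounded number of times. The function and parameter names are hypothetical.

```python
import time


def supervise(task, max_restarts=5, base_delay=0.0):
    """Run task(); if it raises, restart it, at most max_restarts times.

    Returns (result, restarts). The last exception is re-raised once the
    restart budget is exhausted, so a persistent failure stays visible.
    """
    restarts = 0
    while True:
        try:
            return task(), restarts
        except Exception:
            if restarts >= max_restarts:
                raise
            restarts += 1
            # Wait a little longer after each crash before restarting.
            time.sleep(base_delay * restarts)
```

The task itself contains no recovery code; it simply fails, and the supervising loop is responsible for bringing it back.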
Thank You
[email protected]