Customer Sharing: 17 Media - Scale to 12,000,000 Users with AWS

© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. 5/20/2016 Scale to 12,000,000 users with AWS Kevin Li, Product Lead, 17 Media



17 – Your Life’s Moments (你的生活點滴)

25 million streams watched per month

6 million monthly active users (MAU)

12 million downloads

Principle of Architecture Evolution

Figure out the business need of the current stage

Balance the quality and time to market

Optimize the bottleneck first

What does the 17 architecture need?

Scalable: grow with the users

Available: always there for the users

Personalized: understand the users

The journey of our architecture

User 100 – Launch ASAP

First 100 users

Don’t even think about scalability

Launch and verify the idea ASAP

[Diagram: Request, Amazon Route 53, Amazon EC2, MongoDB]

User 100,000 – CDN and Cache

User 100,000

Cache the database

Use CDN to deliver the live streaming content

[Diagram: Request, CDN, Amazon Route 53, Amazon EC2, Amazon ElastiCache, MongoDB]
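The "cache the database" step can be sketched with the cache-aside pattern. This is a minimal illustration, not the code 17 Media used: the class name `CacheAside`, the `load_from_db` callback, and the TTL value are all hypothetical, and a plain in-process dict stands in for the cache that Amazon ElastiCache would provide.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class CacheAside:
    """Cache-aside helper: check the cache first, fall back to the database on a miss."""

    def __init__(self, load_from_db: Callable[[str], dict], ttl_seconds: float = 60.0):
        self._load_from_db = load_from_db  # hypothetical database accessor
        self._ttl = ttl_seconds
        # In-process dict standing in for a real cache node (assumption for the sketch).
        self._store: Dict[str, Tuple[float, dict]] = {}

    def get(self, key: str, now: Optional[float] = None) -> dict:
        now = time.monotonic() if now is None else now
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]  # fresh entry: the database is not touched
        value = self._load_from_db(key)  # miss or expired: read from the database
        self._store[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop an entry after the underlying record is written."""
        self._store.pop(key, None)
```

Repeated reads of the same key within the TTL hit the cache, so the database only sees the first read and reads after expiry or invalidation.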

User 1 million – Design for failure

Design for failure

Failures are the norm, not exceptions

Suppose one machine fails once every 10 years (120 months)

With 120 servers, the mean time to failure (MTTF) of the fleet is about 1 month
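The arithmetic behind this slide can be written out as a short sketch. It assumes servers fail independently at a constant rate, so the failure rates simply add up; the function name is hypothetical.

```python
def fleet_mean_time_to_failure(per_server_mttf_months: float, servers: int) -> float:
    """Expected time until the next failure somewhere in the fleet.

    Assumes independent servers with a constant failure rate, so the
    fleet-wide rate is servers / per_server_mttf_months.
    """
    if servers <= 0:
        raise ValueError("servers must be positive")
    return per_server_mttf_months / servers


print(fleet_mean_time_to_failure(120, 120))  # 1.0
```

Under the same assumption, doubling the fleet to 240 servers halves the figure to roughly two weeks between failures.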

Always assume that things will go WRONG, and design for it

Design for failure

[Diagram: Amazon Route 53, Elastic Load Balancing, three Amazon EC2 instances, three MongoDB nodes, and Amazon ElastiCache; two tiers are labeled Multi-AZ]

Mix spot and on-demand instances to save cost

TIP: Use C3 instances for spot

Are your servers PET or CATTLE?

Pet servers

Unique, lovingly hand raised servers

When they get ill, someone has to fix them at 4 a.m.

Usually database servers, such as MySQL, MongoDB, …

Cattle servers

They are almost identical

If they get ill, replace them with another one

Usually API servers, workers

Let AWS raise the pets,

and we raise the cattle

User 5 million – Build loosely coupled systems

Our system was a monolith: each application server ran the API server, the streaming server, and the worker together

[Diagram: Application Server containing API Server, Streaming Server, and Worker]

We discovered a bug: the API servers were not sending requests to the workers

After the fix, the overloaded worker crashed the whole server

Split the services so that each one is easier to scale and can fail independently

Build loosely coupled systems

[Diagram: API Cluster (three API servers), Amazon SQS, Worker Cluster (three workers), and Streaming Cluster (two streaming servers)]
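The decoupling in this diagram can be sketched as a producer and a consumer that only share a queue. This is an illustration under assumptions, not the production code: the standard-library `queue.Queue` stands in for Amazon SQS, it assumes the API cluster hands work to the worker cluster through the queue (as the diagram suggests), and the function names and the `send_notification` task are hypothetical.

```python
import queue
import threading
from typing import List


def api_handle_request(jobs: queue.Queue, user_id: str) -> dict:
    """API side: enqueue the slow work and answer immediately."""
    jobs.put({"user_id": user_id, "task": "send_notification"})
    return {"status": "accepted"}


def worker_loop(jobs: queue.Queue, done: List[str]) -> None:
    """Worker side: drain the queue at its own pace until a stop signal arrives."""
    while True:
        job = jobs.get()
        try:
            if job is None:  # sentinel used by this sketch to stop the worker
                return
            done.append(f"{job['task']}:{job['user_id']}")
        finally:
            jobs.task_done()


def run_demo() -> List[str]:
    jobs: queue.Queue = queue.Queue(maxsize=100)  # stand-in for the SQS queue
    done: List[str] = []
    worker = threading.Thread(target=worker_loop, args=(jobs, done))
    worker.start()
    for uid in ("u1", "u2", "u3"):
        api_handle_request(jobs, uid)
    jobs.put(None)
    worker.join()
    return done
```

Because the API side never calls the worker directly, a slow or crashed worker only lets the queue grow; the API keeps answering, and more workers can be added without touching the API code.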

Monitor each service, and design for failure

User 10,000,000 – Data

Business Intelligence

Who are our best streamers?

How does retention change across versions?

Which event is the most effective?

We need a real-time data pipeline and a self-service tool for the business team

User 10,000,000 - Data and Personalization

[Diagram: Amazon EC2, event data, Amazon Kinesis, AWS Lambda, Amazon S3 bucket]

A real-time, self-service dashboard for the management and marketing teams
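The Lambda step of such a pipeline might look like the sketch below. The slides do not show the function itself, so this is only an assumption about its shape: it relies on the documented layout of a Kinesis-triggered Lambda event (base64-encoded payloads under `Records[].kinesis.data`), while the JSON payload, its `type` field, and the helper `make_record` are hypothetical. Writing the result to Amazon S3 is left out.

```python
import base64
import json
from collections import Counter
from typing import Any, Dict


def handler(event: Dict[str, Any], context: Any = None) -> Dict[str, int]:
    """Count incoming events by type for one batch of Kinesis records."""
    counts: Counter = Counter()
    for record in event.get("Records", []):
        # Kinesis delivers each payload base64-encoded.
        payload = base64.b64decode(record["kinesis"]["data"])
        data = json.loads(payload)
        counts[data.get("type", "unknown")] += 1  # "type" is a hypothetical field
    return dict(counts)


def make_record(data: dict) -> dict:
    """Build a minimal Kinesis-style record for local testing (hypothetical helper)."""
    encoded = base64.b64encode(json.dumps(data).encode("utf-8")).decode("ascii")
    return {"kinesis": {"data": encoded}}
```

Calling `handler` with a hand-built event, as `make_record` allows, is a cheap way to check the aggregation logic before wiring it to a real stream.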

Fraud Detection and Security Monitoring

Hackers are always trying to steal valuable things from your service, such as virtual goods, data, …

Many spammers post profanity or fraudulent information

You need enough data to detect fraud and prevent it

“50% of reddit’s development time focused on stopping spam and vote cheating” - Jeremy Edberg, Chief Architect of Reddit

Log Search

Our customer service team receives many questions, such as:

“I bought 1,000 points, but didn’t receive them”

“My stream isn’t smooth enough, there is a bug!”

To answer those questions, we need a log search system

User 10,000,000 – Log Search

[Diagram: Amazon EC2, event data and HTTP requests, Amazon Kinesis, AWS Lambda, Amazon Elasticsearch Service, Amazon S3 bucket]
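A customer-service lookup against such a log index could be expressed as an Elasticsearch query body like the one sketched here. The `bool`, `filter`, `term`, and `range` clauses are standard Query DSL, but the field names `user_id` and `timestamp`, and the function itself, are assumptions; the slides do not describe the index schema.

```python
from typing import Any, Dict


def build_log_query(user_id: str, start_iso: str, end_iso: str, size: int = 50) -> Dict[str, Any]:
    """Build a search body for one user's log entries in a time window."""
    return {
        "size": size,
        "sort": [{"timestamp": {"order": "asc"}}],  # oldest first, to replay the session
        "query": {
            "bool": {
                "filter": [
                    {"term": {"user_id": user_id}},  # hypothetical field name
                    {"range": {"timestamp": {"gte": start_iso, "lte": end_iso}}},
                ]
            }
        },
    }
```

For a complaint such as the missing points purchase, an agent would pass the user's ID and the time window around the purchase, then read the matching entries in order.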

User 10,000,000 – Data Architecture

[Diagram: Amazon EC2, event data and HTTP requests, Amazon Kinesis, two AWS Lambda functions, Amazon S3 bucket, Amazon Elasticsearch Service]

If you are interested in building scalable distributed systems…

We are hiring!

[email protected]

Thank you