Donatas Mažionis, Building low latency web APIs

Upload: tanya-denisyuk

Post on 26-Jun-2015

Page 1: Donatas Mažionis, Building low latency web APIs
Page 2: Donatas Mažionis, Building low latency web APIs

This talk is not a hardcore latency talk

I will not talk about:

• CPU caches

• System.nanoTime

• lockless concurrent queues

• magic low latency framework

Page 3: Donatas Mažionis, Building low latency web APIs

This talk is not a hardcore latency talk

Scaling from 500 to 150K QPS, the hard way

Page 4: Donatas Mažionis, Building low latency web APIs

Latency

A measure of how long something took

Page 7: Donatas Mažionis, Building low latency web APIs

Why is the average a common metric?

• Everyone understands it

• It’s easy to calculate

• It can also hide important unwanted behaviour of the system!

Page 8: Donatas Mažionis, Building low latency web APIs

Imagine we have a service

with the following response

latencies

Page 9: Donatas Mažionis, Building low latency web APIs

Calculating latency average

Page 10: Donatas Mažionis, Building low latency web APIs

Calculating latency average

20% of the requests had a latency twice as high, above 10 ms

Page 11: Donatas Mažionis, Building low latency web APIs

Percentiles

The value below which a given percentage of the observations in a group of observations fall.

For example, p50 is the largest value among the lowest 50% of the observations.
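
To make the difference between the average and the percentiles concrete, here is a minimal Python sketch. The latency values are hypothetical (they are not the numbers from the slides), and `percentile` is an illustrative helper that uses the simple nearest-rank definition; real monitoring libraries may interpolate differently.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile: the smallest observation such that
    at least p percent of all observations are at or below it."""
    if not values:
        raise ValueError("no observations")
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds: 80% fast requests, 20% slow ones.
latencies_ms = [5] * 80 + [30] * 20

mean_ms = sum(latencies_ms) / len(latencies_ms)
print("mean:", mean_ms)
for p in (50, 80, 90, 99):
    print(f"p{p}:", percentile(latencies_ms, p))
```

With this data the mean looks harmless, while the higher percentiles expose the slow fifth of the requests.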

Page 12: Donatas Mažionis, Building low latency web APIs

Percentiles

Page 13: Donatas Mažionis, Building low latency web APIs

Percentiles in real life

Page 14: Donatas Mažionis, Building low latency web APIs

Libraries for tracking latencies

HdrHistogram: http://hdrhistogram.github.io/HdrHistogram/

Uses fixed memory and constant CPU for recording (C, Java, C# work in progress).

Finagle: https://twitter.github.io/finagle/

A Scala and Java RPC framework by Twitter with built-in stats and latency tracking.
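
The idea of "fixed memory and constant CPU for recording" can be illustrated with a small bucketed histogram. This is only a sketch of the general principle: `BucketedHistogram` is a hypothetical class, it is not HdrHistogram's API, and its power-of-two buckets are far coarser than what HdrHistogram provides.

```python
import math


class BucketedHistogram:
    """Fixed-size latency histogram with power-of-two buckets (illustration only)."""

    def __init__(self, max_value_us=1 << 30):
        # Bucket i counts values whose bit length is i: 2**(i-1) <= v < 2**i.
        # The array size is fixed up front and never grows.
        self.counts = [0] * (max_value_us.bit_length() + 1)
        self.total = 0

    def record(self, value_us):
        # Constant time: clamp, compute the bucket index, bump one counter.
        index = min(max(int(value_us), 0).bit_length(), len(self.counts) - 1)
        self.counts[index] += 1
        self.total += 1

    def percentile_upper_bound(self, p):
        """Largest value that maps to the bucket holding the p-th percentile.
        Values clamped into the last bucket may exceed this bound."""
        if self.total == 0:
            raise ValueError("no observations")
        target = max(1, math.ceil(p * self.total / 100))
        seen = 0
        for index, count in enumerate(self.counts):
            seen += count
            if seen >= target:
                return (1 << index) - 1
        raise AssertionError("unreachable: counts always sum to total")


# Hypothetical data in microseconds: 80 fast requests, 20 slow ones.
histogram = BucketedHistogram()
for _ in range(80):
    histogram.record(5_000)
for _ in range(20):
    histogram.record(30_000)

print("p50 <=", histogram.percentile_upper_bound(50), "us")
print("p99 <=", histogram.percentile_upper_bound(99), "us")
```

The trade-off is precision: the answer is only a bucket boundary, which is why a real library uses much finer buckets within a configurable range.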

Page 15: Donatas Mažionis, Building low latency web APIs

APIs in online advertising

Page 19: Donatas Mažionis, Building low latency web APIs

APIs in online advertising

98% of requests under 100 ms

HTTP, JSON, Protocol Buffers

Page 20: Donatas Mažionis, Building low latency web APIs

Real-time bidding API

How much would you pay to show your 200x120 ad on youtube.com to a user from Belgium who is interested in Sports and Culture?

Page 21: Donatas Mažionis, Building low latency web APIs

Real-time bidding API

Page 22: Donatas Mažionis, Building low latency web APIs
Page 23: Donatas Mažionis, Building low latency web APIs

Real-time bidding request processing

1. Deserialize request

2. Process some rules

3. Get pre-calculated bid price from storage

4. Calculate some more

5. Serialize response

The remaining 40 ms is left for network latency

Timeline labels: 40 ms, 60 ms

Page 24: Donatas Mažionis, Building low latency web APIs

Architecture diagram: LVS + keepalived, Profiler API, User profiles, Bid price calculators, Bidder API, Ad serving

Page 25: Donatas Mažionis, Building low latency web APIs

Redis in 50 words or less

Redis is an open source, BSD licensed, advanced key-value cache and store.

Page 26: Donatas Mažionis, Building low latency web APIs

Redis as key-value store

• Append-only writes, flushed every second

• Operations on multiple keys

• Works great, but watch out when writing and reading on the same node simultaneously

Page 27: Donatas Mažionis, Building low latency web APIs

Redis latencies

Simultaneous writes and reads on the same node

Page 28: Donatas Mažionis, Building low latency web APIs

Cassandra in 50 words or less

Apache Cassandra is an open source, distributed, decentralized, elastically scalable, highly available, fault-tolerant, tunably consistent, column-oriented database.

Page 29: Donatas Mažionis, Building low latency web APIs

Why Cassandra is good

• Fast writes

• User profile is a natural key-value model

• Easy to scale (especially with virtual nodes)

• Seemed the most mature at that time (we started using it from v0.7)

• Runs on spare legacy hardware

• Runs on Windows :)

Page 30: Donatas Mažionis, Building low latency web APIs

Why Cassandra is good

• Fast writes

• User profile is a natural key-value model

• All nice features mentioned before

• Seemed the most mature at that time (we started using it from v0.7)

• Runs on spare legacy hardware

• Runs on Windows :)

Page 32: Donatas Mažionis, Building low latency web APIs

Why Cassandra is not so good

GC pauses

Page 33: Donatas Mažionis, Building low latency web APIs

Cassandra tuning tricks that worked

• LeveledCompactionStrategy

• Changing Java heap size (8 GB)

• Clients read directly from the node that owns the data (token-aware strategy)
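
The token-aware idea can be sketched in a few lines of Python: the client hashes the key itself, looks up which nodes own that token, and sends the read straight there instead of going through an arbitrary coordinator. Everything here is a simplified assumption for illustration: `TokenRing`, the node names, the MD5-based token function and the clockwise replica walk are hypothetical and do not reproduce the talk's client or any particular Cassandra driver.

```python
import bisect
import hashlib


class TokenRing:
    """Simplified token ring with virtual nodes (illustration only)."""

    def __init__(self, nodes, vnodes_per_node=8):
        # Each node owns several tokens (virtual nodes) spread around the ring.
        entries = sorted(
            (self._token(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes_per_node)
        )
        self._tokens = [token for token, _ in entries]
        self._owners = [node for _, node in entries]

    @staticmethod
    def _token(text):
        # Assumption: an MD5-based token; real clusters may use another partitioner.
        return int.from_bytes(hashlib.md5(text.encode("utf-8")).digest(), "big")

    def replicas(self, key, replication_factor=3):
        """Distinct nodes met when walking clockwise from the key's token."""
        start = bisect.bisect_left(self._tokens, self._token(key))
        found = []
        for step in range(len(self._tokens)):
            node = self._owners[(start + step) % len(self._tokens)]
            if node not in found:
                found.append(node)
                if len(found) == replication_factor:
                    break
        return found


ring = TokenRing(["node-a", "node-b", "node-c", "node-d"])
print(ring.replicas("user:42"))  # nodes the client would contact directly
```

Skipping the extra hop through a coordinator node is what saves latency on each read.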

Page 34: Donatas Mažionis, Building low latency web APIs
Page 36: Donatas Mažionis, Building low latency web APIs

Cassandra tuning tricks that did not work

GC tuning

20% of requests exceeding 40 ms

Page 37: Donatas Mažionis, Building low latency web APIs

Connecting to Cassandra

Thrift version

Page 38: Donatas Mažionis, Building low latency web APIs

Fail fast plan

1. Set a TSocket timeout to 10 ms

2. If a node does not answer within 10 ms, try another one from the same range

3. Repeat this 3 times
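
The plan above can be sketched as a small retry loop. This is a Python illustration under stated assumptions, not the talk's .NET code: `read_fail_fast`, the `read_from` callback and the node names are hypothetical, and the callback is expected to raise `TimeoutError` (or another `OSError`) when a replica does not answer in time.

```python
def read_fail_fast(replicas, key, read_from, timeout_s=0.010, max_attempts=3):
    """Try replicas in order, giving each at most timeout_s seconds,
    and give up after max_attempts replicas."""
    last_error = None
    for node in replicas[:max_attempts]:
        try:
            return read_from(node, key, timeout_s)
        except (TimeoutError, OSError) as error:
            # Slow or unreachable replica: move on to the next one.
            last_error = error
    raise TimeoutError(f"no replica answered for {key!r}") from last_error


def fake_read(node, key, timeout_s):
    # Stand-in for a real network read: pretend node-a is too slow.
    if node == "node-a":
        raise TimeoutError("node-a did not answer in time")
    return f"{key}@{node}"


result = read_fail_fast(["node-a", "node-b", "node-c"], "user:42", fake_read)
print(result)
```

With a 10 ms timeout and three attempts, the worst case spends about 30 ms waiting on storage before the request is abandoned.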

Page 39: Donatas Mažionis, Building low latency web APIs

Timeouts in .NET are broken

• .NET Socket.SendTimeout and Socket.ReceiveTimeout do not work for values less than 500 ms

• The same applies to SocketAsyncEventArgs

• The async version is even worse (timer queues, etc.)

Page 40: Donatas Mažionis, Building low latency web APIs

The thing that worked

Socket.Poll(int microseconds, SelectMode mode) blocks until data is available or the timeout occurs
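
For readers outside .NET, the same pattern (block on the socket until it is readable or a short deadline passes) can be sketched in Python with `select.select`. This is an analogue of the approach, not the talk's code; `recv_with_deadline` is a hypothetical helper, and the local socket pair only stands in for a real connection to a storage node.

```python
import select
import socket


def recv_with_deadline(sock, max_bytes, timeout_s):
    """Block until sock is readable or timeout_s elapses.
    Returns the received bytes, or None if the deadline passed first."""
    readable, _, _ = select.select([sock], [], [], timeout_s)
    if not readable:
        return None
    return sock.recv(max_bytes)


# A connected local socket pair stands in for the client/server connection.
left, right = socket.socketpair()
try:
    first = recv_with_deadline(left, 1024, 0.010)   # nothing sent yet
    right.sendall(b"bid")
    second = recv_with_deadline(left, 1024, 0.010)  # data is already waiting
finally:
    left.close()
    right.close()

print(first, second)
```

The calling thread is blocked for at most the timeout, which keeps the code simple and the worst-case wait predictable.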

Page 41: Donatas Mažionis, Building low latency web APIs

Blocking is not always bad

• Timeout rate stays between 0 and 2%

• Scale by adding new servers

Page 42: Donatas Mažionis, Building low latency web APIs

Or scale by adding fewer servers

• Cassandra is not very good at deterministic low latencies

• We switched to Aerospike: same QPS, half as many servers, p99 for reads <= 10 ms

• The whole story is here: “Married to Cassandra”, http://vimeo.com/101290545

Page 43: Donatas Mažionis, Building low latency web APIs

Takeaways

• Don’t measure latency averages

• It’s expensive to scale in .NET:

  • No decent Cassandra library, so you have to roll your own (while Java devs are having fun with Astyanax, the DataStax driver, etc.)

  • Even though we rewrote our WCF-based bidder on top of HttpListener (saved 10% CPU), netty throughput is 15% better

• Finagle is a great framework

Page 44: Donatas Mažionis, Building low latency web APIs

Takeaways

• Blocking is not always bad, measure

• Choose the right NoSQL(s) for the job

Page 45: Donatas Mažionis, Building low latency web APIs

Thank you!