Donatas Mažionis, Building Low Latency Web APIs

This talk is not a hardcore latency talk

I will not talk about:

• CPU caches

• System.nanoTime

• lockless concurrent queues

• magic low latency framework

Scaling from 500 to 150K QPS, the hard way

Latency

A measure of how long something took

http://www.techempower.com/benchmarks/#section=data-r9&hw=peak&test=json

Typical latency benchmark on the internet





Why is average a common metric?

• Everyone understands it

• It’s easy to calculate

• It can also hide important unwanted behaviour of the system!

Imagine we have a service with the following response latencies

Calculating latency average

20% of the requests had latency twice as high, above 10 ms
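To make the point concrete, here is a minimal sketch with made-up numbers (the sample below is hypothetical, not the data from the slide). It computes the average and the share of requests above 10 ms for a service where most requests are fast and a fifth are slow.

```python
from statistics import mean

# Hypothetical latencies in milliseconds (not the data from the talk):
# 80 fast requests and 20 slow ones.
latencies_ms = [2.0] * 80 + [15.0] * 20

# The average looks healthy...
average = mean(latencies_ms)

# ...while one request in five is slower than 10 ms.
slow_share = sum(1 for value in latencies_ms if value > 10.0) / len(latencies_ms)

print(f"average: {average:.1f} ms")
print(f"share above 10 ms: {slow_share:.0%}")
```

The average alone gives no hint that a fixed fraction of users sees a much slower service.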

Percentiles

The value below which a given percentage of observations in a group of observations falls

For example, p50 is the largest value among the lowest 50% of the values
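The definition above can be sketched with the nearest-rank method, one of several common ways to compute a percentile. The function name `percentile` and the sample data are illustrative assumptions, not something taken from the talk.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not values:
        raise ValueError("no samples")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds: 80 fast requests, 20 slow ones.
latencies_ms = [2.0] * 80 + [15.0] * 20

p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
print(f"p50 = {p50} ms, p99 = {p99} ms")
```

Reporting p50 next to p99 shows both the typical request and the slow tail, which a single average cannot do.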

Percentiles in real life

Libraries for tracking latencies

HdrHistogram: http://hdrhistogram.github.io/HdrHistogram/

Uses fixed memory and constant CPU for recording (C, Java, C# work in progress).

Finagle: https://twitter.github.io/finagle/

Scala, Java RPC framework by Twitter, has built-in stats and latency tracking.
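To illustrate what "fixed memory and constant CPU for recording" means, here is a toy histogram with one counter per whole millisecond. It is a simplified stand-in written for this example only: HdrHistogram's real bucket layout is different and more precise, and the class name `FixedBucketHistogram` is hypothetical.

```python
import math


class FixedBucketHistogram:
    """Toy latency histogram: one counter per whole millisecond."""

    def __init__(self, max_ms):
        # Memory is allocated once; the last bucket also collects overflow.
        self.counts = [0] * (max_ms + 1)
        self.total = 0

    def record(self, latency_ms):
        # Constant time: compute a bucket index and increment a counter.
        if latency_ms < 0:
            raise ValueError("latency must be non-negative")
        index = min(int(latency_ms), len(self.counts) - 1)
        self.counts[index] += 1
        self.total += 1

    def percentile(self, p):
        """Lower bound (in ms) of the bucket that holds the p-th percentile."""
        if self.total == 0:
            raise ValueError("no samples")
        target = math.ceil(p * self.total / 100)
        seen = 0
        for index, count in enumerate(self.counts):
            seen += count
            if seen >= target:
                return index
        return len(self.counts) - 1


histogram = FixedBucketHistogram(max_ms=100)
for _ in range(80):
    histogram.record(2.5)
for _ in range(20):
    histogram.record(15.2)
print(histogram.percentile(50), histogram.percentile(99))
```

The trade-off is resolution: every sample is rounded down to its bucket, so the answer is only as precise as the bucket width.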

APIs in online advertising


98% of requests under 100 ms



HTTP, JSON, Protocol Buffers

Real-time bidding API

How much would you pay to show an ad of size 200x120 on youtube.com to a user from Belgium who is interested in Sports and Culture?

Real-time bidding API

1. Deserialize request

2. Process some rules

3. Get pre-calculated bid price from storage

4. Calculate some more

5. Serialize response
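The five steps above can be sketched as a single handler. Everything concrete here is an assumption made for illustration: the JSON field names, the `BID_PRICES` table standing in for external storage, the country rule, and the 10% markup.

```python
import json

# Hypothetical pre-calculated bid prices in micro-units, keyed by (site, interest).
# In the talk this data comes from external storage.
BID_PRICES = {("youtube.com", "Sports"): 400, ("youtube.com", "Culture"): 250}
ALLOWED_COUNTRIES = {"BE"}  # made-up rule for illustration


def handle_bid_request(raw_request: bytes) -> bytes:
    request = json.loads(raw_request)                      # 1. deserialize request
    if request["country"] not in ALLOWED_COUNTRIES:        # 2. process some rules
        return json.dumps({"bid": None}).encode("utf-8")
    prices = [BID_PRICES.get((request["site"], interest), 0)
              for interest in request["interests"]]        # 3. look up stored prices
    bid = max(prices, default=0) * 110 // 100              # 4. calculate (made-up 10% markup)
    return json.dumps({"bid": bid}).encode("utf-8")        # 5. serialize response


raw = json.dumps({"site": "youtube.com", "country": "BE",
                  "interests": ["Sports", "Culture"], "size": "200x120"}).encode("utf-8")
response = json.loads(handle_bid_request(raw))
print(response)
```

In a real bidder step 3 is the network call to storage, which is why storage latency dominates the rest of the talk.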

Real-time bidding request processing

60 ms for processing; the remaining 40 ms is left for network latency

Architecture diagram: LVS + keepalived, Profiler API, User profiles, Bid price calculators, Bidder API, Ad serving

Redis in 50 words or less

Redis is an open source, BSD licensed, advanced key-value cache and store.

Redis as key-value store

• Append-only writes, flushed to disk every second

• Operations on multiple keys

• Works great, but watch out when writing and reading on the same node simultaneously

Redis latencies

Simultaneous writes and reads on the same node

Cassandra in 50 words or less

Apache Cassandra is an open source, distributed, decentralized, elastically scalable, highly available, fault-tolerant, tuneably consistent, column-oriented database.

Why Cassandra is good

• Fast writes

• User profile is a natural key-value model

• All the nice features mentioned before

• Easy to scale (especially with virtual nodes)

• Seemed the most mature at that time (we started using it from v0.7)

• Runs on legacy spare hardware

• Runs on Windows :)

Why Cassandra is not so good

GC pauses

Cassandra tuning tricks that worked

• LeveledCompactionStrategy

• Changing Java heap size (8 GB)

• Clients read data directly from the node that owns it (token-aware strategy)
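A token-aware client hashes the key itself and sends the read straight to the node that owns the resulting token, which saves a hop through a coordinator node. Below is a minimal sketch of the lookup under simplifying assumptions: one token per node, no replication, and MD5 as a stand-in hash. The class `TokenRing` and the node names are hypothetical.

```python
import bisect
import hashlib


class TokenRing:
    def __init__(self, node_tokens):
        # node_tokens: {node_name: token}. Each node owns the range that
        # ends at its token (previous token exclusive, own token inclusive).
        self._ring = sorted((token, node) for node, token in node_tokens.items())
        self._tokens = [token for token, _ in self._ring]

    @staticmethod
    def token_for(key: str) -> int:
        # Stand-in hash; a real client must use the cluster's partitioner.
        digest = hashlib.md5(key.encode("utf-8")).digest()
        return int.from_bytes(digest, "big")

    def owner_of_token(self, token: int) -> str:
        # First node whose token is >= the key's token, wrapping around.
        index = bisect.bisect_left(self._tokens, token)
        return self._ring[index % len(self._ring)][1]

    def owner(self, key: str) -> str:
        return self.owner_of_token(self.token_for(key))


ring = TokenRing({"node-a": 100, "node-b": 200, "node-c": 300})
print(ring.owner_of_token(150))  # falls in node-b's range
print(ring.owner_of_token(301))  # past the last token, wraps to node-a
```

The lookup is a binary search over a small sorted list, so it adds almost nothing to the request path.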

Cassandra tuning tricks that did not work

GC tuning

20% of requests exceeding 40 ms

Connecting to Cassandra

Thrift version

Fail fast plan

1. Set a TSocket timeout to 10 ms

2. If the node does not answer within 10 ms, try another node from the same range

3. Repeat up to 3 times
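The plan above can be sketched as a small retry loop. This is an illustration only, not the Thrift client from the talk: replicas are modelled as plain callables taking a key and a timeout, and the name `read_fail_fast` is hypothetical.

```python
def read_fail_fast(replicas, key, timeout_s=0.010, max_attempts=3):
    """Try up to max_attempts replicas of the key's range, giving each
    one a short timeout, and return the first answer."""
    last_error = None
    for replica in replicas[:max_attempts]:
        try:
            # Each replica is a callable (key, timeout_s) -> value that
            # raises TimeoutError when it does not answer in time.
            return replica(key, timeout_s)
        except TimeoutError as error:
            last_error = error
    raise TimeoutError(
        f"no replica answered within {timeout_s * 1000:.0f} ms"
    ) from last_error


def slow_replica(key, timeout_s):
    raise TimeoutError("simulated slow node")


def fast_replica(key, timeout_s):
    return f"profile:{key}"


print(read_fail_fast([slow_replica, fast_replica], "user-42"))
```

With a 10 ms timeout and three attempts the worst case is bounded at roughly 30 ms, which is what makes the plan fit inside a fixed latency budget.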

Timeouts in .NET are broken

• .NET Socket SendTimeout/ReceiveTimeout does not work for values below 500 ms

• The same applies to SocketAsyncEventArgs

• The async version is even worse (timer queues, etc.)

The thing that worked

Socket.Poll(int microseconds, SelectMode mode) blocks until data is available or the timeout occurs
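The same idea, blocking on a socket with a short timeout, can be shown with Python's `select.select`. This is an analogous technique, not the .NET API; the helper name `poll_readable` and the local socket pair are assumptions made for this example.

```python
import select
import socket


def poll_readable(sock: socket.socket, timeout_us: int) -> bool:
    """Block until sock has data to read or timeout_us microseconds pass."""
    readable, _, _ = select.select([sock], [], [], timeout_us / 1_000_000)
    return bool(readable)


# A connected local socket pair stands in for the client/server connection.
left, right = socket.socketpair()

nothing_yet = poll_readable(left, 10_000)    # nothing sent: gives up after about 10 ms
right.sendall(b"pong")
has_data = poll_readable(left, 1_000_000)    # data is waiting: returns without delay

left.close()
right.close()
print(nothing_yet, has_data)
```

The caller decides what a timeout means; here it would trigger the next attempt of the fail fast plan.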

Blocking is not always bad

• Timeouts between 0 and 2%

• Scale by adding new servers

Or scale by adding fewer servers

• Cassandra is not very good at deterministic low latencies

• We switched to Aerospike: same QPS, half as many servers, p99 for reads <= 10 ms

• The whole story is here: “Married to Cassandra”, http://vimeo.com/101290545

Takeaways

• Don’t measure latency averages

• It’s expensive to scale in .NET:

• No decent Cassandra library, so you have to roll your own (while Java devs are having fun with Astyanax, the DataStax driver, etc.)

• Even though we rewrote our WCF-based bidder on top of HttpListener (saved 10% CPU), Netty throughput is 15% better

• Finagle is a great framework

• Blocking is not always bad; measure it

• Choose the right NoSQL(s) for the job

Thank you!
