REALTIME
AT
STREAMROOT
What does Streamroot do?
o Provide drop-in/transparent P2P functionality for large video broadcasters (VOD, live)
o Allow broadcasters to save up to 70% bandwidth and handle huge ramp-ups/spikes for live events
Ingredients used to build the platform
o Love & compassion
o Test Driven Development
o Pragmatism (YAGNI!!!)
o Ship when you are green (CD, CR)
o Beck's Simple Design rules:
  1. Passes the tests
  2. Reveals intention
  3. No duplication
  4. Fewest elements
  - Kent Beck, 1990
REALTIME
CONCURRENT
LOW LATENCY
RESPONSIVE
SCALABILITY != PERFORMANCE
SCALABILITY ~ CONCURRENCY
(Thinking more in process distribution rather than CPU perf)
… so Go was a good candidate from the start
[Diagram: INPUT → SYSTEM → BUSINESS & OPERATIONAL OUTPUT]
Processing time ~ real-life events
Realtime components at Streamroot
o Tracker: matching viewers for P2P
o Signaling server: initial exchange of clients' metadata before direct communication
o Autoscaler & traffic reporter
o Data pipeline for realtime display
Go signaling server
o Relayer: very easy logic
o Persistent connections (WebSockets)
o Locking + map registry
o One unit holds 150k connections with no load balancer (HAProxy), 35,000 msg/second
o Spike in Elixir/Erlang: 200k
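The locking + map registry mentioned above can be sketched roughly like this; the `Registry` type, its method names, and the channel-based outbox are illustrative assumptions, not Streamroot's actual code:

```go
package main

import (
	"fmt"
	"sync"
)

// Registry holds active peer connections behind a single RWMutex.
// Lookups take the shared (read) lock; add/remove take the exclusive one.
type Registry struct {
	mu    sync.RWMutex
	peers map[string]chan []byte // peer ID -> outbound message queue
}

func NewRegistry() *Registry {
	return &Registry{peers: make(map[string]chan []byte)}
}

// Add registers a peer and returns its inbox channel.
func (r *Registry) Add(id string) chan []byte {
	r.mu.Lock()
	defer r.mu.Unlock()
	ch := make(chan []byte, 16)
	r.peers[id] = ch
	return ch
}

func (r *Registry) Remove(id string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.peers, id)
}

// Relay forwards a signaling payload to the target peer, if connected.
func (r *Registry) Relay(to string, msg []byte) bool {
	r.mu.RLock()
	defer r.mu.RUnlock()
	ch, ok := r.peers[to]
	if !ok {
		return false
	}
	ch <- msg
	return true
}

func main() {
	reg := NewRegistry()
	inbox := reg.Add("peer-1")
	reg.Relay("peer-1", []byte("offer"))
	fmt.Println(string(<-inbox)) // prints "offer"
}
```

In a real relayer each WebSocket connection would drain its own channel in a dedicated goroutine, so a slow peer never blocks the map.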
Go Tracker
o A tracker remembers and matches viewers amongst themselves
o Responsiveness: in memory vs. interprocess
o Go has few data structures (usually composition of the map primitive)
o Tracking and handling concurrent access
o Locking (W, RW), snapshot isolation
o Optimistic control, channels
Our Autoscaler (and realtime reporter)
o One Go process is very good at doing different things simultaneously (a cron that exposes JSON on HTTP)
o Autoscaler: watch overall load; control cloud instances; report/store historically on action/data
o What I call the Octopus pattern (built-in concurrency, solid timers/tickers, HTTP cancellation)
o Go is terse! The module has 619 lines of code (including both Azure and AWS controls)
o Stable: has been running for months without any quirks
o (Demo image)
REALTIME DATA
PIPELINE
Non-functional requirements
DISPLAY REALTIME MULTIPLE GRAPHS & AGGREGATES
…ALL UNDER A SECOND
[Architecture: COLLECTOR × 3 → KAFKA CLUSTER → CONSUMER 1, CONSUMER 2 → TIME SERIES STORAGE]
Low latency
o The realtime pipeline's components all have high write throughput (fast writes)
o Kafka: contiguous writes, retention, …, no ack if needed
o Time series DB (InfluxDB, written in Go): line protocol, batches, write-ahead log, UDP client if really needed, ...
o … Collectors (HTTP endpoints)
Collectors
o Simple Go HTTP endpoints that receive, validate and push JSON text to the pipeline
o An out-of-the-box Go stdlib endpoint holds 10,000 to 20,000 requests per second (C10k problem solved ;) )
Collectors - Fail fast
o Bail out as soon as possible, so you're back on your feet for others
o Go is a systems language: great io and http packages
o io.LimitedReader, http.MaxBytesReader
Collectors – Payload size
o For uncompressed JSON, payload size matters
o Go un/marshaling is based on reflection. Larger payloads suffer
o github.com/pquerna/ffjson on 1.5MB payloads did not help
o Avoid switching on payload type. Use a URL path per type when needed
o Do not nest JSON until necessary. Friendlier down the pipeline
Collectors – Reuse resources
o Great Go package: sync
o sync.Pool (reminiscent of the Flyweight pattern)
o (example)
o … have not tried/measured it yet. YAGNI ;)
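Since the slide leaves `sync.Pool` deliberately untried, here is only a generic sketch of how buffer reuse with it typically looks, not Streamroot's code:

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool recycles buffers between requests (the Flyweight idea):
// instead of allocating a fresh buffer per payload, Get one,
// reset it, and Put it back when done.
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// process copies a payload through a pooled buffer.
func process(payload []byte) string {
	buf := bufPool.Get().(*bytes.Buffer)
	defer bufPool.Put(buf)
	buf.Reset() // a recycled buffer may hold stale data
	buf.Write(payload)
	return buf.String()
}

func main() {
	fmt.Println(process([]byte(`{"event":"play"}`)))
}
```

Whether this beats plain allocation depends on payload size and GC pressure, which is exactly why the slide says to measure first.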
Collectors - Benching
o https://github.com/tsenart/vegeta
o Great first results even from your local computer
o Easily push your benching command-line binary to any cloud machine with more cores
Permanent data storage: InfluxDB
o Flexible and powerful time series DB … approaching version 1.0 ;)
o Allows fast queries with useful functions (max, percentile, median, derivative, …)
o … lower point density leads to faster queries overall
Go consumers
o https://github.com/Shopify/sarama
o Lots of redundancy in incoming data (JSON payloads) that can be reduced per broadcaster, per content, etc.
o Consumers:
  a. pull JSON payloads from Kafka
  b. apply logic (reduce, discard)
  c. push to backend storage or anything (live geo map, etc.)
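Step (b), the reduce, is pure and easy to sketch; the `Event` shape and the key format are assumptions (with sarama, the JSON payloads would arrive on the partition consumer's `Messages()` channel and be decoded into such events):

```go
package main

import "fmt"

// Event is one decoded JSON payload (illustrative shape,
// not Streamroot's actual schema).
type Event struct {
	Broadcaster string
	Content     string
	Bytes       int64
}

// reduce collapses redundant events into per-broadcaster/content
// totals before they are pushed to backend storage.
func reduce(events []Event) map[string]int64 {
	totals := make(map[string]int64)
	for _, e := range events {
		totals[e.Broadcaster+"/"+e.Content] += e.Bytes
	}
	return totals
}

func main() {
	batch := []Event{
		{"acme", "live-1", 100},
		{"acme", "live-1", 150},
		{"beta", "vod-9", 50},
	}
	fmt.Println(reduce(batch)["acme/live-1"]) // 250
}
```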
Go consumers (2)
o The consumer's main logic pattern is our replay aggregator
o Aggregator:
  a. consume live payloads
  b. wait for a configured time
  c. flush
  d. repeat
o The design allows restarting at any time, tolerates failure, and lets us stop the service for long periods if needed
DASHBOARD DEMO SCREENS
MISCELLANEOUS
YAGNI deployment
o Disclaimer: I am not a sysadmin, although I enjoy it
o Any cloud → Ubuntu → systemd
o Use conventions for your binary deployment
o Capture your conventions with Ansible
o Easy rollback / respawn
YAGNI deployment (2)
ARTIFACT_NAME=$PROJECT_NAME-`git rev-parse --short HEAD`-`date +%Y-%m-%d`
# example: collector-073b570-2016-05-27
GOARCH=amd64 GOOS=linux go build -o pkg/$ARTIFACT_NAME
scp pkg/$ARTIFACT_NAME $HOST_ALIAS:~/goapps/$PROJECT_NAME/
ln -fs $ARTIFACT_NAME $PROJECT_NAME
sudo systemctl restart $PROJECT_NAME
YAGNI deployment (3)
[Unit]
Description=Payload collector
After=network.target

[Service]
LimitNOFILE=1000000
Environment=KAFKA_BROKERS=x.x.x.x:9092,x.x.x.x:9092,x.x.x.x:9092
SyslogIdentifier=streamroot-traff-collector
ExecStart=/home/streamroot/goapps/collector/collector
Restart=always

[Install]
WantedBy=multi-user.target
Neutral binary. Injection from environment
Dependencies surface
$ REPO=github.com/streamroot
$ for PROJECT in `ls $GOPATH/src/$REPO`; do
    go list -f '{{ join .Deps "\n" }}' $REPO/$PROJECT | grep -v $REPO | grep '\.' | grep -v 'internal' | sort | uniq
  done

github.com/Azure/azure-sdk-for-go
github.com/influxdata/influxdb
github.com/dgrijalva/jwt-go
github.com/gorilla/context
github.com/gorilla/websocket
golang.org/x/time/rate
gopkg.in/mgo.v2
github.com/Shopify/sarama
github.com/rcrowley/go-metrics
github.com/jeromer/syslogparser
Great place to work. You are responsible for your shit!
http://www.streamroot.io/
[email protected] (with Subject: Golang meetup)
Core JS developer
Backend Scalability Engineer
Our developer's blog
https://indevwith.streamroot.io/
Thank you!
Any questions?
(I am available for new projects)