Intro StreamProcessing.be Meetup #1
StreamProcessing.be
Brussels, May 27, 2015
Theme: hosted solutions for Stream Processing and ML
#StreamProcessingBe
Agenda
15’ Intro (Peter)
35’ Azure Stream Analytics and ML (Jan)
5’ short break
35’ Google Cloud DataFlow (Alex)
35’ Amazon AWS ML (Nils)
Many thanks to
Microsoft Belux
Jan, Alex, Nils
@maasg, @svendfx
BigData.be, DataScience.be, AWS Belgium
you!
Next StreamProcessing.be Meetup
Thu, June 25, 2015, near Mechelen station
(looking for a location, +/- 50 ppl)
● Introduction to Apache Kafka (Svend)
● Akka Streams and Kinesis (Peter)
● Understanding Spark Streaming (Gerard)
whoami : Peter Vandenabeele @peter_v
All Things Data (my consultancy)
current clients:
Real Impact Analytics
Telecom Analytics (emerging markets)
“Green” start-up (stealth mode)
IoT project (see next Meetup)
Why ?
(before anything else)
Why Stream Processing ?
(a personal view)
E.g. collaborative research (2013)
UniProt (180 GB), monthly update
consumer update cost ≅ freq (1/month) * size (180 GB) * # consumers (5)
fetch + load + index FULL data set
solution: Stream of updates (CDC)
Users table, continuous updates, stream
consumer update cost ≅ Rate of Change (10% / month) * size * # consumers
fetch + load ONLY updates
3M entries, 300k updates/month (independent of consumer update frequency)
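The two cost formulas on these slides can be compared with a few lines of arithmetic. The following is a minimal sketch using only the figures from the slides (180 GB, 5 consumers, 1 full update per month, 10% rate of change). The function and constant names are illustrative, and it assumes that the volume of changed data scales with the fraction of entries changed.

```python
# Figures taken from the two slides; the names below are illustrative only.
DATASET_GB = 180            # size of the UniProt data set
CONSUMERS = 5               # number of consumers
FULL_UPDATES_PER_MONTH = 1  # frequency of the full monthly update
RATE_OF_CHANGE = 0.10       # fraction of the data set changing per month


def full_reload_cost_gb(freq_per_month, size_gb, consumers):
    """Monthly volume when every consumer re-fetches the FULL data set."""
    return freq_per_month * size_gb * consumers


def cdc_stream_cost_gb(rate_of_change, size_gb, consumers):
    """Monthly volume when every consumer fetches ONLY the updates.

    Assumption: changed volume is proportional to the fraction of
    entries that changed (300k of 3M entries = 10%).
    """
    return rate_of_change * size_gb * consumers


full = full_reload_cost_gb(FULL_UPDATES_PER_MONTH, DATASET_GB, CONSUMERS)
cdc = cdc_stream_cost_gb(RATE_OF_CHANGE, DATASET_GB, CONSUMERS)
print(f"full reload: {full:.0f} GB/month")
print(f"CDC stream:  {cdc:.0f} GB/month")
```

With the slide figures this comes to about 900 GB/month for full reloads versus about 90 GB/month for the stream of updates. The second number does not depend on how often a consumer chooses to catch up.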
Why Stream Processing ?
Real-time *
Big Data *
Distributed processing (“many collaborators”)
Stream becomes the “master data”
● see stream as the master data (not the DB)
● allows real-time, distributed processing
● allows unification between:
  ○ operational teams
  ○ analytics teams
  ○ security, ...
● e.g. Kafka at LinkedIn (Kappa architecture)
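The idea of the stream as master data can be sketched with a toy append-only log that several teams read independently. This is a hypothetical in-memory illustration, not the Kafka API: the `Log` and `Consumer` classes, the record fields and the two view functions are all made up for the example.

```python
class Log:
    """Append-only, in-memory log standing in for the stream (hypothetical)."""

    def __init__(self):
        self._records = []

    def append(self, record):
        self._records.append(record)
        return len(self._records) - 1  # offset of the appended record

    def read_from(self, offset):
        return self._records[offset:]


class Consumer:
    """Derives its own view by replaying the log from its own offset."""

    def __init__(self, log, apply):
        self.log = log
        self.apply = apply  # function(view, record) that updates the view
        self.offset = 0
        self.view = {}

    def poll(self):
        # Process every record appended since the last poll.
        for record in self.log.read_from(self.offset):
            self.apply(self.view, record)
            self.offset += 1


def latest_email(view, record):
    # Operational view: current email address per user.
    view[record["user"]] = record["email"]


def change_count(view, record):
    # Analytics view: number of changes seen per user.
    view[record["user"]] = view.get(record["user"], 0) + 1


log = Log()
operational = Consumer(log, latest_email)
analytics = Consumer(log, change_count)

log.append({"user": "u1", "email": "a@example.com"})
log.append({"user": "u1", "email": "b@example.com"})
operational.poll()  # operational team catches up early
log.append({"user": "u2", "email": "c@example.com"})
analytics.poll()    # analytics team catches up later, at its own pace
operational.poll()

print(operational.view)
print(analytics.view)
```

Both views are derived from the same log, and each consumer keeps its own offset, so the teams do not need to coordinate their read schedules.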
Kafka (LinkedIn) : Martin Kleppmann
source: Martin Kleppmann at Strata Hadoop London
Kafka (LinkedIn) : Jay Kreps
source: Jay Kreps on SlideShare
“I ♥ Log”
Real-time Data and Apache Kafka
Why Stream Processing ?
Peter : real-time * (big data * distributed proc.)
Nathan Marz : recovery from human error + ...
Jay Kreps : organizational scalability + ...
Martin Kleppmann : data agility + ...
YOU : ??? let’s discuss over a beer ...
Speakers for today
● Jan Tielens (Microsoft) @jantielens
● Alex Van Boxel (Vente-Exclusive.com) @alexvb
● Nils De Moor (Woorank) @ndemoor