![Page 1: Perfect Norikra 2nd Season](https://reader031.vdocuments.site/reader031/viewer/2022031518/5a6734527f8b9a0c518b498d/html5/thumbnails/1.jpg)
Perfect Norikra 2nd Season
Stream Processing Casual Talks #2, 2017/07/27
Satoshi Tagomori (@tagomoris)
![Page 2: Perfect Norikra 2nd Season](https://reader031.vdocuments.site/reader031/viewer/2022031518/5a6734527f8b9a0c518b498d/html5/thumbnails/2.jpg)
Satoshi "Moris" Tagomori (@tagomoris)
Fluentd, MessagePack-Ruby, Norikra, ...
Treasure Data, Inc.
![Page 3: Perfect Norikra 2nd Season](https://reader031.vdocuments.site/reader031/viewer/2022031518/5a6734527f8b9a0c518b498d/html5/thumbnails/3.jpg)
![Page 4: Perfect Norikra 2nd Season](https://reader031.vdocuments.site/reader031/viewer/2022031518/5a6734527f8b9a0c518b498d/html5/thumbnails/4.jpg)
http://norikra.github.io/
![Page 5: Perfect Norikra 2nd Season](https://reader031.vdocuments.site/reader031/viewer/2022031518/5a6734527f8b9a0c518b498d/html5/thumbnails/5.jpg)
Streaming +
SQL
![Page 6: Perfect Norikra 2nd Season](https://reader031.vdocuments.site/reader031/viewer/2022031518/5a6734527f8b9a0c518b498d/html5/thumbnails/6.jpg)
Norikra: Schema-less Stream Processing using SQL
• Server software, written in JRuby, runs on JVM
• Open source software (GPLv2)
• http://norikra.github.io/
• https://github.com/norikra/norikra
![Page 7: Perfect Norikra 2nd Season](https://reader031.vdocuments.site/reader031/viewer/2022031518/5a6734527f8b9a0c518b498d/html5/thumbnails/7.jpg)
SELECT user.age, COUNT(*) AS cnt
FROM events.win:time_batch(5 mins)
WHERE current="San Diego" AND attend.$0 AND attend.$1
GROUP BY user.age
Input event:
{"name":"tagomoris", "user":{"age":35, "corp":"LINE", "address":"Tokyo"}, "current":"San Diego", "speaker":true, "attend":[true,true,false, ...]}
Output events:
{"user.age":35,"cnt":5}, {"user.age":36,"cnt":8}, ...
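To make the semantics concrete, here is a minimal Python sketch of what the query above computes over a single 5-minute batch of events. It is not Norikra or Esper code. The helpers `get_path` and `run_batch` are hypothetical names, and the sketch assumes that events missing a required field are simply skipped.

```python
from collections import Counter

def get_path(event, path):
    """Resolve a dotted path such as 'user.age' or 'attend.$0' in a nested event.

    A '$N' segment indexes into a list, mirroring the array access in the query.
    Returns None when any segment is missing.
    """
    current = event
    for segment in path.split("."):
        if segment.startswith("$"):
            index = int(segment[1:])
            if not isinstance(current, list) or index >= len(current):
                return None
            current = current[index]
        else:
            if not isinstance(current, dict) or segment not in current:
                return None
            current = current[segment]
    return current

def run_batch(events):
    """Count matching events per user.age for one batch window."""
    counts = Counter()
    for event in events:
        age = get_path(event, "user.age")
        if age is None:
            continue  # assumption: events without user.age do not match
        if (get_path(event, "current") == "San Diego"
                and get_path(event, "attend.$0") is True
                and get_path(event, "attend.$1") is True):
            counts[age] += 1
    return [{"user.age": age, "cnt": cnt} for age, cnt in sorted(counts.items())]

# Illustrative events; only the first one comes from the slide.
events = [
    {"name": "tagomoris", "user": {"age": 35, "corp": "LINE", "address": "Tokyo"},
     "current": "San Diego", "speaker": True, "attend": [True, True, False]},
    {"name": "someone", "user": {"age": 36}, "current": "San Diego",
     "attend": [True, True]},
    {"name": "elsewhere", "user": {"age": 35}, "current": "Tokyo",
     "attend": [True, True]},
]
print(run_batch(events))
```

The third event is dropped by the `current` filter, so each remaining age group gets a count of one.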
![Page 8: Perfect Norikra 2nd Season](https://reader031.vdocuments.site/reader031/viewer/2022031518/5a6734527f8b9a0c518b498d/html5/thumbnails/8.jpg)
How Norikra is Perfect
• Ultra fast bootstrap
• Schema on read
• Handling complex (nested) events
• Dynamic query registration/unregistration
• Simple Web UI
• Data connector: Fluentd
• Extensible: UDF/Listener plugins
• Performance: good enough for small to mid-sized sites
![Page 9: Perfect Norikra 2nd Season](https://reader031.vdocuments.site/reader031/viewer/2022031518/5a6734527f8b9a0c518b498d/html5/thumbnails/9.jpg)
Schema on Read
• Query first, Data next
• Query must know what it requires
  • field names, types of fields, ...
• Platform can ingest any data into the processor. A query can fetch the events that match its required schema.
[Diagram: events from the billing service and events from the API endpoint flow into one schema-less (mixed) data stream; query A and query B each read their own subset of fields.]
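A rough Python sketch of the schema-on-read idea described above: each query declares the fields and types it needs, and any event that has them is delivered to that query, whatever else the event contains. The field names and the `matches`/`dispatch` helpers are assumptions for illustration, not Norikra internals.

```python
def matches(event, required):
    """True when the event has every required field with the required type."""
    return all(
        name in event and isinstance(event[name], expected)
        for name, expected in required.items()
    )

def dispatch(event, queries):
    """Return, per matching query, the subset of fields that query asked for."""
    delivered = {}
    for query_name, required in queries.items():
        if matches(event, required):
            delivered[query_name] = {name: event[name] for name in required}
    return delivered

# Each query states its own required schema (hypothetical fields).
queries = {
    "query_a": {"user_id": int, "amount": int},
    "query_b": {"user_id": int, "path": str},
}

billing_event = {"user_id": 1, "amount": 500, "currency": "JPY"}
api_event = {"user_id": 1, "path": "/v1/items", "status": 200}

print(dispatch(billing_event, queries))
print(dispatch(api_event, queries))
```

Both events travel through the same mixed stream; the extra fields (`currency`, `status`) are ignored because no query asked for them.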
![Page 10: Perfect Norikra 2nd Season](https://reader031.vdocuments.site/reader031/viewer/2022031518/5a6734527f8b9a0c518b498d/html5/thumbnails/10.jpg)
Architecture
Norikra Server (on JVM)
Norikra Engine
Esper Instance (Query Engine)
Type Definition Manager
Output Event Pool
RPC Server
mizuno (Jetty + Rack)
Rack RPC Handler
Norikra Client
msgpack-rpc-over-http
![Page 11: Perfect Norikra 2nd Season](https://reader031.vdocuments.site/reader031/viewer/2022031518/5a6734527f8b9a0c518b498d/html5/thumbnails/11.jpg)
For details :)
• Norikra: Stream Processing with SQL http://www.slideshare.net/tagomoris/norikra-stream-processing-with-sql
• Norikra: SQL Stream Processing in Ruby http://www.slideshare.net/tagomoris/norikra-sql-stream-processing-in-ruby
• Norikra in Action http://www.slideshare.net/tagomoris/norikra-in-action-ver-2014-spring
• Landscape of Norikra Features http://www.slideshare.net/tagomoris/norikra-meetup-features
• Norikra Recent Updates http://www.slideshare.net/tagomoris/norikra-recent-updates
![Page 12: Perfect Norikra 2nd Season](https://reader031.vdocuments.site/reader031/viewer/2022031518/5a6734527f8b9a0c518b498d/html5/thumbnails/12.jpg)
Recent Updates
• v1.4.0: Jul 19, 2016
  • Add support for "-D" and "-agentlib" of JVM
  • Update msgpack version
• Previous release v1.3.1: May 7, 2015
  • Explained in "Norikra Recent Updates" slide
![Page 13: Perfect Norikra 2nd Season](https://reader031.vdocuments.site/reader031/viewer/2022031518/5a6734527f8b9a0c518b498d/html5/thumbnails/13.jpg)
User Companies
• LINE Corporation
• Kayac Inc.
• Mercari, Inc.
• (and some/many others)
![Page 14: Perfect Norikra 2nd Season](https://reader031.vdocuments.site/reader031/viewer/2022031518/5a6734527f8b9a0c518b498d/html5/thumbnails/14.jpg)
https://www.slideshare.net/tagomoris/how-to-make-norikra-perfect
![Page 15: Perfect Norikra 2nd Season](https://reader031.vdocuments.site/reader031/viewer/2022031518/5a6734527f8b9a0c518b498d/html5/thumbnails/15.jpg)
Perfect Norikra
• All features of Norikra
  • Including "Ultra fast bootstrap"
• Compatible RPC API w/ original Norikra
• Distributed execution on any scheduler
  • YARN? Mesos? or ...?
• Automatic failover & retry for failures (HA)
• Automated optimization for load balancing
• Dynamic scaling out from 1 to 100 nodes - without any restarts/retries
![Page 16: Perfect Norikra 2nd Season](https://reader031.vdocuments.site/reader031/viewer/2022031518/5a6734527f8b9a0c518b498d/html5/thumbnails/16.jpg)
MAKE Norikra
PERFECT AGAIN
![Page 17: Perfect Norikra 2nd Season](https://reader031.vdocuments.site/reader031/viewer/2022031518/5a6734527f8b9a0c518b498d/html5/thumbnails/17.jpg)
Features for More Perfection
• Loading operator internal states from Batch query engines
• Sharing operator internal states between queries
![Page 18: Perfect Norikra 2nd Season](https://reader031.vdocuments.site/reader031/viewer/2022031518/5a6734527f8b9a0c518b498d/html5/thumbnails/18.jpg)
Stream Processing
• Monitoring, Reporting, Alerting
• Fast recommendation
• Matching behaviors
• and ...
![Page 19: Perfect Norikra 2nd Season](https://reader031.vdocuments.site/reader031/viewer/2022031518/5a6734527f8b9a0c518b498d/html5/thumbnails/19.jpg)
Handling Long Term Data/History
timeline
Website audience data
Jul 24, 2014 Purchase a car
Jul 28, 2017 ....?
Start a batch query to read 3-4 years of history
Offer a nice bonus to possible customer!
Browser session already expired......
![Page 20: Perfect Norikra 2nd Season](https://reader031.vdocuments.site/reader031/viewer/2022031518/5a6734527f8b9a0c518b498d/html5/thumbnails/20.jpg)
Stream Processing on Long Term Data
timeline
Website audience data: processed continuously
Jul 24, 2014 Purchase a car
Jul 28, 2017 Got a nice bonus offer!
Jul 28, 2017 Got a wrong offer...
Rewrite the query & start it without past data... 3 more years required to test it?
![Page 21: Perfect Norikra 2nd Season](https://reader031.vdocuments.site/reader031/viewer/2022031518/5a6734527f8b9a0c518b498d/html5/thumbnails/21.jpg)
Resume/Restart of Queries
• Queries may be stopped/killed for many reasons
  • cluster version upgrade / migration
  • troubles
• Queries should be modifiable at any time
  • wrong logic
  • data schema upgrade
  • new business requirements
![Page 22: Perfect Norikra 2nd Season](https://reader031.vdocuments.site/reader031/viewer/2022031518/5a6734527f8b9a0c518b498d/html5/thumbnails/22.jpg)
What we want:
timeline
Website audience data: processed continuously
Jul 24, 2014 Purchase a car
Jul 28, 2017 Got a nice bonus offer!
Jul 28, 2017 Got a wrong offer...
Rewrite & start the query with past long history
![Page 23: Perfect Norikra 2nd Season](https://reader031.vdocuments.site/reader031/viewer/2022031518/5a6734527f8b9a0c518b498d/html5/thumbnails/23.jpg)
Load "Running" Queries
Load "running" stream query from batch engines!
Submit a stream query
Query the history on batch engines & load the result as intermediate state of stream query
Start to process realtime data
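The three steps above can be sketched in Python as follows. This is only an illustration under assumptions: `CountByKey` is a hypothetical operator, and the batch result is hard-coded where a real system would run the equivalent query on a batch engine.

```python
from collections import Counter

class CountByKey:
    """A stream operator whose internal state is a per-user count."""

    def __init__(self):
        self.state = Counter()

    def load_state(self, batch_rows):
        """Seed the state from (key, count) rows produced by a batch query."""
        for key, count in batch_rows:
            self.state[key] += count

    def process(self, event):
        """Update the state with one realtime event and return the new count."""
        self.state[event["user"]] += 1
        return self.state[event["user"]]

# Stand-in for the result of a batch query over the stored history,
# e.g. SELECT user, COUNT(*) FROM history GROUP BY user (hypothetical).
batch_rows = [("alice", 120), ("bob", 3)]

op = CountByKey()                      # submit a stream query
op.load_state(batch_rows)              # load the batch result as intermediate state
print(op.process({"user": "alice"}))   # realtime data continues from the loaded count
```

Because the operator starts from the loaded counts, the first realtime event for `alice` yields 121 rather than 1.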
![Page 25: Perfect Norikra 2nd Season](https://reader031.vdocuments.site/reader031/viewer/2022031518/5a6734527f8b9a0c518b498d/html5/thumbnails/25.jpg)
JOINs with Past Data
Submit a stream query w/ JOIN past data
Submit a query
Query past data from batch & load it
Start to process realtime data w/ JOIN
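A minimal Python sketch of the JOIN flow above, assuming the past data has already been fetched from a batch engine as a list of rows. The row fields and the helpers `load_past_data` and `join_stream` are hypothetical.

```python
def load_past_data(batch_rows):
    """Index batch query rows by user id so the stream side can look them up."""
    return {row["user_id"]: row for row in batch_rows}

def join_stream(events, past_by_user):
    """Inner-join each realtime event with the preloaded past row, if any."""
    for event in events:
        past = past_by_user.get(event["user_id"])
        if past is not None:
            yield {**past, **event}

# Past data, as it might come back from a batch query over 3-4 years of history.
past = load_past_data([
    {"user_id": 1, "purchased": "car", "purchased_on": "2014-07-24"},
])

# Realtime events arriving after the past data has been loaded.
realtime = [
    {"user_id": 1, "page": "/top", "ts": "2017-07-28"},
    {"user_id": 2, "page": "/top", "ts": "2017-07-28"},
]

for joined in join_stream(realtime, past):
    print(joined)
```

Only the event for user 1 is emitted, enriched with the purchase from 2014; user 2 has no past row and is dropped by the inner join.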
![Page 27: Perfect Norikra 2nd Season](https://reader031.vdocuments.site/reader031/viewer/2022031518/5a6734527f8b9a0c518b498d/html5/thumbnails/27.jpg)
True Lambda Architecture
• Use just one DSL for both Stream & Batch
  • SQL!
• Ingest the data stream into both Stream & Storage
• Handle time window intelligently
  • Specify time window out of DSL
  • Write once on batch, Run anywhere :D
![Page 28: Perfect Norikra 2nd Season](https://reader031.vdocuments.site/reader031/viewer/2022031518/5a6734527f8b9a0c518b498d/html5/thumbnails/28.jpg)
Idempotent Operator State
• As a stream operator with realtime data
• As a loaded stream operator with past data
• Serializable operator internal states
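One way to read the three points above: an operator should reach the same internal state whether it consumed events one by one or was loaded from a batch-computed result, and that state should survive serialization. The Python sketch below illustrates this with a hypothetical `RunningAverage` operator and JSON as an assumed serialization format.

```python
import json

class RunningAverage:
    """An operator whose whole internal state is (total, count)."""

    def __init__(self, total=0, count=0):
        self.total = total
        self.count = count

    def process(self, value):
        """Consume one realtime value."""
        self.total += value
        self.count += 1

    def dump(self):
        """Serialize the internal state."""
        return json.dumps({"total": self.total, "count": self.count}, sort_keys=True)

    @classmethod
    def load(cls, serialized):
        """Rebuild an operator from serialized state."""
        state = json.loads(serialized)
        return cls(total=state["total"], count=state["count"])

history = [3, 5, 7]

# As a stream operator with realtime data.
streamed = RunningAverage()
for value in history:
    streamed.process(value)

# As a loaded operator: the same state computed in one shot, batch style.
batch_result = json.dumps({"total": sum(history), "count": len(history)})
loaded = RunningAverage.load(batch_result)

print(streamed.dump() == loaded.dump())
```

If the two dumps are equal, either operator can take over processing from the other.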
![Page 29: Perfect Norikra 2nd Season](https://reader031.vdocuments.site/reader031/viewer/2022031518/5a6734527f8b9a0c518b498d/html5/thumbnails/29.jpg)
Sharing Operators between Queries
Query A
Query B
![Page 30: Perfect Norikra 2nd Season](https://reader031.vdocuments.site/reader031/viewer/2022031518/5a6734527f8b9a0c518b498d/html5/thumbnails/30.jpg)
SHARED Operators
Sharing Operators between Queries
history(stream)
history(batch: 3 - 4 years ago)
JOIN
Query A: filter + projection
Query B: filter + projection
![Page 31: Perfect Norikra 2nd Season](https://reader031.vdocuments.site/reader031/viewer/2022031518/5a6734527f8b9a0c518b498d/html5/thumbnails/31.jpg)
Sharing Operators during Updating Query
history(stream)
history(batch: 3 - 4 years ago)
JOIN
Query A: filter + projection
Oops, I found a mistake in Query A!
![Page 32: Perfect Norikra 2nd Season](https://reader031.vdocuments.site/reader031/viewer/2022031518/5a6734527f8b9a0c518b498d/html5/thumbnails/32.jpg)
SHARED Operators
Sharing Operators during Updating Query
history(stream)
history(batch: 3 - 4 years ago)
JOIN
Query A: filter + projection
Query A': filter + projection
I've just added the updated query...
![Page 33: Perfect Norikra 2nd Season](https://reader031.vdocuments.site/reader031/viewer/2022031518/5a6734527f8b9a0c518b498d/html5/thumbnails/33.jpg)
Sharing Operators during Updating Query
history(stream)
history(batch: 3 - 4 years ago)
JOIN
Query A': filter + projection
It works! I can remove the older one.
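The sharing and update sequence above can be sketched in Python like this. `SharedJoin` is a hypothetical class: it holds the expensive JOIN state once and fans its output out to any number of lightweight filter + projection queries, so a corrected query can be added next to the old one before the old one is removed.

```python
class SharedJoin:
    """One JOIN operator whose output is shared by several queries."""

    def __init__(self, past_by_user):
        self.past_by_user = past_by_user   # shared state, loaded once
        self.queries = {}                  # name -> (predicate, projected fields)

    def add_query(self, name, predicate, projection):
        self.queries[name] = (predicate, projection)

    def remove_query(self, name):
        del self.queries[name]

    def process(self, event):
        """Join one event, then apply each query's filter and projection."""
        past = self.past_by_user.get(event["user_id"])
        if past is None:
            return {}
        joined = {**past, **event}
        return {
            name: {field: joined[field] for field in projection}
            for name, (predicate, projection) in self.queries.items()
            if predicate(joined)
        }

join = SharedJoin({1: {"user_id": 1, "purchased": "car", "years_since_purchase": 3}})

# Query A has a mistake in its filter (hypothetical threshold).
join.add_query("query_a", lambda row: row["years_since_purchase"] >= 5, ["user_id"])
# Query A' is added alongside it and reuses the same JOIN state.
join.add_query("query_a_prime", lambda row: row["years_since_purchase"] >= 3,
               ["user_id", "purchased"])

print(join.process({"user_id": 1, "page": "/top"}))

# Once A' is confirmed to work, the older query is removed.
join.remove_query("query_a")
print(join.process({"user_id": 1, "page": "/top"}))
```

The JOIN state is never rebuilt during the swap, which is what makes restarting or updating a query cheap in this design.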
![Page 34: Perfect Norikra 2nd Season](https://reader031.vdocuments.site/reader031/viewer/2022031518/5a6734527f8b9a0c518b498d/html5/thumbnails/34.jpg)
Perfect Stream Processing Engine
• Just the same SQL on both Batch and Stream
• Stream processor which can resume queries using batch query engine results
  • reduces memory usage of JOINs
  • reduces memory usage for historical data
• Stream processor which can share operators between queries
  • reduces total amount of memory usage
  • makes it possible to restart/update queries anytime, casually
![Page 35: Perfect Norikra 2nd Season](https://reader031.vdocuments.site/reader031/viewer/2022031518/5a6734527f8b9a0c518b498d/html5/thumbnails/35.jpg)
Perfect Norikra
![Page 36: Perfect Norikra 2nd Season](https://reader031.vdocuments.site/reader031/viewer/2022031518/5a6734527f8b9a0c518b498d/html5/thumbnails/36.jpg)
Named
![Page 37: Perfect Norikra 2nd Season](https://reader031.vdocuments.site/reader031/viewer/2022031518/5a6734527f8b9a0c518b498d/html5/thumbnails/37.jpg)
It still has 0 bytes. Stay tuned!
We are hiring! - Treasure Data