Storm: distributed and fault-tolerant realtime computation
![Page 1: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/1.jpg)
Nathan Marz, Twitter
Distributed and fault-tolerant realtime computation
Storm
![Page 2: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/2.jpg)
Storm at Twitter
Twitter Web Analytics
![Page 3: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/3.jpg)
Before Storm
Queues Workers
![Page 4: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/4.jpg)
Example
(simplified)
![Page 5: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/5.jpg)
Example
Workers schemify tweets and append to Hadoop
![Page 6: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/6.jpg)
Example
Workers update statistics on URLs by incrementing counters in Cassandra
![Page 7: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/7.jpg)
Example
Distribute tweets randomly on multiple queues
![Page 8: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/8.jpg)
Example
Workers share the load of schemifying tweets
![Page 9: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/9.jpg)
Example
Want all updates for the same URL to go to the same worker
![Page 10: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/10.jpg)
Message locality
• Because:
• No transactions in Cassandra (and no atomic increments at the time)
• More effective batching of updates
![Page 11: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/11.jpg)
Implementing message locality
• Have a queue for each consuming worker
• Choose queue for a URL using consistent hashing
![Page 12: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/12.jpg)
Example
Workers choose queue to enqueue to using hash/mod of URL
![Page 13: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/13.jpg)
Example
All updates for same URL guaranteed to go to same worker
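A minimal sketch of the hash/mod queue choice described above. The class and method names are hypothetical and the queue count of 4 is arbitrary; the point is only that the same URL always yields the same queue index.

```java
// Hypothetical helper, not part of any library: picks a queue index for a URL.
final class QueueChooser {
    private final int numQueues;

    QueueChooser(int numQueues) {
        if (numQueues <= 0) {
            throw new IllegalArgumentException("numQueues must be positive");
        }
        this.numQueues = numQueues;
    }

    // floorMod keeps the result in [0, numQueues) even when hashCode() is negative.
    int queueFor(String url) {
        return Math.floorMod(url.hashCode(), numQueues);
    }

    public static void main(String[] args) {
        QueueChooser chooser = new QueueChooser(4); // 4 queues is an arbitrary example value
        String[] urls = {"http://a.example/x", "http://b.example/y", "http://a.example/x"};
        for (String url : urls) {
            System.out.println(url + " -> queue " + chooser.queueFor(url));
        }
    }
}
```

With plain hash/mod, changing the number of queues changes the index computed for most URLs, so every producer has to switch to the new queue count together.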
![Page 14: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/14.jpg)
Adding a worker
![Page 15: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/15.jpg)
Adding a worker
Deploy
Reconfigure/redeploy
![Page 16: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/16.jpg)
Problems
• Scaling is painful
• Poor fault-tolerance
• Coding is tedious
![Page 17: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/17.jpg)
What we want
• Guaranteed data processing
• Horizontal scalability
• Fault-tolerance
• No intermediate message brokers!
• Higher level abstraction than message passing
• “Just works”
![Page 18: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/18.jpg)
Storm
• Guaranteed data processing
• Horizontal scalability
• Fault-tolerance
• No intermediate message brokers!
• Higher level abstraction than message passing
• “Just works”
![Page 19: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/19.jpg)
Use cases
• Stream processing
• Continuous computation
• Distributed RPC
![Page 20: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/20.jpg)
Storm Cluster
![Page 21: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/21.jpg)
Storm Cluster
Master node (similar to Hadoop JobTracker)
![Page 22: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/22.jpg)
Storm Cluster
Used for cluster coordination
![Page 23: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/23.jpg)
Storm Cluster
Run worker processes
![Page 24: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/24.jpg)
Starting a topology
![Page 25: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/25.jpg)
Killing a topology
![Page 26: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/26.jpg)
Concepts
• Streams
• Spouts
• Bolts
• Topologies
![Page 27: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/27.jpg)
Streams
Unbounded sequence of tuples
Tuple Tuple Tuple Tuple Tuple Tuple Tuple
![Page 28: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/28.jpg)
Spouts
Source of streams
![Page 29: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/29.jpg)
Spout examples
• Read from Kestrel queue
• Read from Twitter streaming API
![Page 30: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/30.jpg)
Bolts
Processes input streams and produces new streams
![Page 31: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/31.jpg)
Bolts
• Functions
• Filters
• Aggregation
• Joins
• Talk to databases
![Page 32: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/32.jpg)
Topology
Network of spouts and bolts
![Page 33: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/33.jpg)
Tasks
Spouts and bolts execute as many tasks across the cluster
![Page 34: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/34.jpg)
Stream grouping
When a tuple is emitted, which task does it go to?
![Page 35: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/35.jpg)
Stream grouping
• Shuffle grouping: pick a random task
• Fields grouping: consistent hashing on a subset of tuple fields
• All grouping: send to all tasks
• Global grouping: pick task with lowest id
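The four groupings can be modeled as functions from a list of task ids to the tasks that receive a tuple. This is a simplified sketch, not Storm's API: the class and method names are hypothetical, and plain hash/mod stands in for the hashing used by fields grouping.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Simplified model of the four groupings; names are hypothetical, not Storm's API.
final class Groupings {
    // Shuffle grouping: pick a random task.
    static List<Integer> shuffle(List<Integer> taskIds, Random rng) {
        return List.of(taskIds.get(rng.nextInt(taskIds.size())));
    }

    // Fields grouping: equal values of the grouping fields always reach the same task.
    // Assumption: hash/mod is used here as a stand-in for the real hashing scheme.
    static List<Integer> fields(List<Integer> taskIds, List<Object> groupingValues) {
        int index = Math.floorMod(groupingValues.hashCode(), taskIds.size());
        return List.of(taskIds.get(index));
    }

    // All grouping: send to all tasks.
    static List<Integer> all(List<Integer> taskIds) {
        return new ArrayList<>(taskIds);
    }

    // Global grouping: pick the task with the lowest id.
    static List<Integer> global(List<Integer> taskIds) {
        return List.of(Collections.min(taskIds));
    }
}
```

Only shuffle is nondeterministic. Fields grouping depends solely on the chosen field values, which is what gives the message locality discussed earlier.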
![Page 36: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/36.jpg)
Topology
(Diagram: edges between components are labeled with their stream groupings: shuffle, [“url”], shuffle, shuffle, [“id1”, “id2”], all)
![Page 37: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/37.jpg)
Streaming word count
TopologyBuilder is used to construct topologies in Java
![Page 38: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/38.jpg)
Streaming word count
Define a spout in the topology with parallelism of 5 tasks
![Page 39: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/39.jpg)
Streaming word count
Split sentences into words with parallelism of 8 tasks
![Page 40: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/40.jpg)
Streaming word count
Split sentences into words with parallelism of 8 tasks
Consumer decides what data it receives and how it gets grouped
![Page 41: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/41.jpg)
Streaming word count
Create a word count stream
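The topology code shown on these slides is not part of the transcript. As a rough illustration of the data flow only, here is a plain-Java simulation that does not use Storm: sentences are split into words, and each word is routed to one of several counter tasks by hashing the word, which mimics a fields grouping on the word. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java simulation of the word count data flow; it does not use Storm.
final class WordCountSketch {
    // Stand-in for the split step: one sentence in, many words out.
    static List<String> split(String sentence) {
        List<String> words = new ArrayList<>();
        for (String w : sentence.trim().split("\\s+")) {
            if (!w.isEmpty()) {
                words.add(w);
            }
        }
        return words;
    }

    // Stand-in for grouping on the word: the same word always reaches the same counter task.
    static int taskFor(String word, int numTasks) {
        return Math.floorMod(word.hashCode(), numTasks);
    }

    // Runs sentences through split, then routes each word to one of numTasks counters.
    static List<Map<String, Integer>> run(List<String> sentences, int numTasks) {
        List<Map<String, Integer>> counters = new ArrayList<>();
        for (int i = 0; i < numTasks; i++) {
            counters.add(new HashMap<>());
        }
        for (String sentence : sentences) {
            for (String word : split(sentence)) {
                counters.get(taskFor(word, numTasks)).merge(word, 1, Integer::sum);
            }
        }
        return counters;
    }

    // Each word lives on exactly one task, so its total is read from that task alone.
    static int count(List<Map<String, Integer>> counters, String word) {
        return counters.get(taskFor(word, counters.size())).getOrDefault(word, 0);
    }
}
```

Because every occurrence of a word lands on the same counter, no two tasks ever hold partial counts for the same word and no merge step is needed.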
![Page 42: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/42.jpg)
Streaming word count
splitsentence.py
![Page 43: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/43.jpg)
Streaming word count
![Page 44: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/44.jpg)
Streaming word count
Submitting topology to a cluster
![Page 45: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/45.jpg)
Streaming word count
Running topology in local mode
![Page 46: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/46.jpg)
Demo
![Page 47: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/47.jpg)
Traditional data processing
![Page 48: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/48.jpg)
Traditional data processing
Intense processing (Hadoop, databases, etc.)
![Page 49: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/49.jpg)
Traditional data processing
Light processing on a single machine to resolve queries
![Page 50: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/50.jpg)
Distributed RPC
Distributed RPC lets you do intense processing at query-time
![Page 51: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/51.jpg)
Game changer
![Page 52: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/52.jpg)
Distributed RPC
Data flow for Distributed RPC
![Page 53: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/53.jpg)
DRPC Example
Computing “reach” of a URL on the fly
![Page 54: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/54.jpg)
Reach
Reach is the number of unique people exposed to a URL on Twitter
![Page 55: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/55.jpg)
Computing reach
(Diagram: URL → tweeters → followers → distinct followers → count → reach)
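The steps in this computation can be sketched in a single process. This is an illustrative model, not the distributed topology: the two maps are hypothetical stand-ins for the lookups of who tweeted a URL and who follows each tweeter.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Single-process sketch of the reach computation; the maps stand in for database lookups.
final class ReachSketch {
    static int reach(String url,
                     Map<String, List<String>> tweetersByUrl,
                     Map<String, List<String>> followersByTweeter) {
        // A set removes followers who were exposed through more than one tweeter.
        Set<String> distinctFollowers = new HashSet<>();
        for (String tweeter : tweetersByUrl.getOrDefault(url, List.of())) {
            distinctFollowers.addAll(followersByTweeter.getOrDefault(tweeter, List.of()));
        }
        return distinctFollowers.size();
    }
}
```

In the distributed version each stage runs in parallel across many tasks. The deduplication step is where grouping matters, since all occurrences of one follower have to meet in the same place to be counted once.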
![Page 56: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/56.jpg)
Reach topology
![Page 57: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/57.jpg)
Guaranteeing message processing
“Tuple tree”
![Page 58: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/58.jpg)
Guaranteeing message processing
• A spout tuple is not fully processed until all tuples in the tree have been completed
![Page 59: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/59.jpg)
Guaranteeing message processing
• If the tuple tree is not completed within a specified timeout, the spout tuple is replayed
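One way to picture the timeout rule is a table of pending spout tuples with their emit times. The sketch below is hypothetical bookkeeping, not Storm's implementation; the class name, method names, and the use of string message ids are all assumptions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical bookkeeping for spout tuples awaiting completion; not Storm's implementation.
final class PendingTuples {
    private final long timeoutMillis;
    private final Map<String, Long> emittedAt = new HashMap<>();

    PendingTuples(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Record that a spout tuple was emitted at the given time.
    void emitted(String messageId, long nowMillis) {
        emittedAt.put(messageId, nowMillis);
    }

    // The tuple tree completed, so the message no longer needs tracking.
    void completed(String messageId) {
        emittedAt.remove(messageId);
    }

    // Returns the ids whose trees did not complete in time,
    // and re-stamps them as emitted again at nowMillis.
    List<String> replayExpired(long nowMillis) {
        List<String> toReplay = new ArrayList<>();
        for (Map.Entry<String, Long> entry : emittedAt.entrySet()) {
            if (nowMillis - entry.getValue() >= timeoutMillis) {
                toReplay.add(entry.getKey());
                entry.setValue(nowMillis);
            }
        }
        return toReplay;
    }
}
```

A replayed tuple may already have been partly processed, so downstream work has to tolerate seeing the same message more than once.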
![Page 60: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/60.jpg)
Guaranteeing message processing
Reliability API
![Page 61: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/61.jpg)
Guaranteeing message processing
“Anchoring” creates a new edge in the tuple tree
![Page 62: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/62.jpg)
Guaranteeing message processing
Marks a single node in the tree as complete
![Page 63: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/63.jpg)
Guaranteeing message processing
• Storm tracks tuple trees for you in an extremely efficient way
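The slides do not say how the tracking works. Storm's documentation describes an XOR-based scheme: one 64-bit value is kept per spout tuple, each tuple id is XORed in once when the tuple is created and once when it is acked, and the value returns to zero when the tree is complete. The following is a simplified model of that idea with hypothetical class and method names, and it assumes a tuple's children are anchored before the tuple itself is acked.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of XOR-based tuple tree tracking; names are hypothetical.
final class XorTracker {
    // One 64-bit value per spout tuple, regardless of how large its tree grows.
    private final Map<Long, Long> ackVals = new HashMap<>();

    private void xor(long spoutTupleId, long tupleId) {
        ackVals.put(spoutTupleId, ackVals.getOrDefault(spoutTupleId, 0L) ^ tupleId);
    }

    // Start tracking a spout tuple whose root tuple has the given id.
    void spoutEmitted(long spoutTupleId, long rootTupleId) {
        ackVals.put(spoutTupleId, rootTupleId);
    }

    // Anchoring: a new tuple joins the tree of spoutTupleId.
    void anchored(long spoutTupleId, long newTupleId) {
        xor(spoutTupleId, newTupleId);
    }

    // Acking: XOR the same id in again, which cancels its earlier contribution.
    // Returns true when the value reaches zero, meaning every created tuple was acked.
    boolean acked(long spoutTupleId, long tupleId) {
        xor(spoutTupleId, tupleId);
        if (ackVals.get(spoutTupleId) == 0L) {
            ackVals.remove(spoutTupleId);
            return true;
        }
        return false;
    }
}
```

The memory cost per spout tuple is constant no matter how many tuples the tree contains. With randomly chosen 64-bit ids, the value reaching zero before the tree is actually complete is very unlikely.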
![Page 64: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/64.jpg)
Storm UI
![Page 65: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/65.jpg)
Storm UI
![Page 66: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/66.jpg)
Storm UI
![Page 67: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/67.jpg)
Storm on EC2
https://github.com/nathanmarz/storm-deploy
One-click deploy tool
![Page 68: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/68.jpg)
Documentation
![Page 69: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/69.jpg)
State spout (almost done)
Synchronize a large amount of frequently changing state into a topology
![Page 70: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/70.jpg)
State spout (almost done)
Optimizing reach topology by eliminating the database calls
![Page 71: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/71.jpg)
State spout (almost done)
Each GetFollowers task keeps a synchronous cache of a subset of the social graph
![Page 72: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/72.jpg)
State spout (almost done)
This works because GetFollowers repartitions the social graph the same way it partitions GetTweeter’s stream
![Page 73: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/73.jpg)
Future work
• Storm on Mesos
• “Swapping”
• Auto-scaling
• Higher level abstractions
![Page 74: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/74.jpg)
Questions?
http://github.com/nathanmarz/storm
![Page 75: Storm: distributed and fault-tolerant realtime computation](https://reader034.vdocuments.site/reader034/viewer/2022052409/540e04a98d7f72747e8b4c3f/html5/thumbnails/75.jpg)
What Storm does
• Distributes code and configurations
• Robust process management
• Monitors topologies and reassigns failed tasks
• Provides reliability by tracking tuple trees
• Routing and partitioning of streams
• Serialization
• Fine-grained performance stats of topologies