TRANSCRIPT
The Stratosphere Platform for Big Data Analytics
Hongyao Ma & Franco Solleza
Big Data Analytics, Harvard
April 20, 2015
Stratosphere
Big Data Analytics
● “BIG Data”
● Heterogeneous datasets: structured / unstructured / semi-structured
● Users have different needs for declarativity and expressivity
What we have covered so far
● Polybase
● Shark
● MLBase
● SharedDB
● BlinkDB
The Promises
● Declarative, high-level language
● “In situ” data analysis
● Richer set of primitives than MapReduce
● Treats UDFs as first-class citizens
● Automated parallelization and optimization
● Support for iterative programs
● Includes external memory query processing algorithms to support arbitrarily long programs
Outline
● Meteor & Sopremo
● PACT
● Nephele
● Experiment Results
● Future work & Discussions
Sopremo
Meteor Script
● Declarative interface
● High-level script
Meteor Translates to Sopremo
[Sopremo plan diagram: Lineitem and Supplier inputs; Filter, ComputeRevenue, Join, Group operators; Output]
Sopremo
● Modular and extensible
● Composable
Sopremo Compiled to PACT
[PACT plan diagram: Lineitem and Supplier inputs; Filter, ComputeRevenue, Join, Group operators; Output]
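To make the plan on the slide concrete, here is a plain-Python sketch of what the Filter → ComputeRevenue → Join → Group pipeline computes over the Lineitem and Supplier inputs. The field names and the filter predicate are hypothetical, chosen only for illustration:

```python
# Hypothetical single-machine sketch of the plan shown on the slide:
# Filter -> ComputeRevenue -> Join with Supplier -> Group -> Output.
from collections import defaultdict

def run_plan(lineitems, suppliers, year):
    # Filter: keep line items shipped in the given year (assumed predicate)
    filtered = [li for li in lineitems if li["ship_year"] == year]
    # ComputeRevenue: discounted extended price per line item
    for li in filtered:
        li["revenue"] = li["price"] * (1 - li["discount"])
    # Join: match each line item with its supplier on the supplier key
    by_key = {s["suppkey"]: s["name"] for s in suppliers}
    joined = [(by_key[li["suppkey"]], li["revenue"]) for li in filtered]
    # Group: sum revenue per supplier name
    grouped = defaultdict(float)
    for name, rev in joined:
        grouped[name] += rev
    return dict(grouped)
```

Each step corresponds to one operator box in the plan; in Stratosphere the same steps would run partitioned across the cluster.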
PACT
PACT
● Programmer makes a “pact” with the system
● Uses one of 5 second-order functions:
Map, Reduce, Cross, Match, CoGroup
What’s a PACT?
● Data and a function
● Specifies how data are partitioned across the system
● An atomic(?) operation on all specified data
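As a rough illustration of the five contracts, here is a single-machine sketch of their call patterns. This only mimics how each contract groups the input before handing it to the user-defined function (UDF); the real system uses the contract to decide how to partition data across the cluster:

```python
# Single-machine sketch of PACT's five second-order functions.
# Each takes a UDF plus (key, value) data; parallelization is omitted.
from collections import defaultdict
from itertools import product

def pact_map(udf, data):
    # Map: UDF sees each record independently
    return [udf(rec) for rec in data]

def pact_reduce(udf, data):
    # Reduce: UDF sees all values sharing one key
    groups = defaultdict(list)
    for k, v in data:
        groups[k].append(v)
    return [udf(k, vs) for k, vs in groups.items()]

def pact_cross(udf, left, right):
    # Cross: UDF sees every pair from the two inputs
    return [udf(l, r) for l, r in product(left, right)]

def pact_match(udf, left, right):
    # Match: UDF sees pairs with equal keys (equi-join)
    right_by_key = defaultdict(list)
    for k, v in right:
        right_by_key[k].append(v)
    return [udf(k, lv, rv) for k, lv in left for rv in right_by_key[k]]

def pact_cogroup(udf, left, right):
    # CoGroup: UDF sees both inputs' groups for one key at once
    groups = defaultdict(lambda: ([], []))
    for k, v in left:
        groups[k][0].append(v)
    for k, v in right:
        groups[k][1].append(v)
    return [udf(k, ls, rs) for k, (ls, rs) in groups.items()]
```

The contract, not the UDF, determines which records may be processed together, which is what lets the optimizer reorder and parallelize UDF-heavy programs.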
Iterative PACT Programs
● Implicitly, iteration mutates state
● How to do iteration without explicit mutation of state?
Iterative PACT Programs
● Bulk iteration (connected components example):
○ Starts with a solution set
○ Sends each vertex’s label to its neighbors
○ Finds the minimum among those neighbors
○ Outputs an incremental solution set
○ The incremental solution set becomes the input to the next iteration
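The bulk-iteration steps above can be sketched as a small connected-components routine. This is an assumed example (vertices start labeled with their own IDs); each iteration recomputes the whole solution set, which is the defining property of bulk iteration:

```python
# Sketch of connected components as a bulk iteration: every round
# rebuilds the full label set until a fixpoint is reached.
def bulk_connected_components(vertices, edges, max_iters=100):
    labels = {v: v for v in vertices}          # initial solution set
    neighbors = {v: set() for v in vertices}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    for _ in range(max_iters):
        # each vertex receives its neighbors' labels and keeps the minimum
        new_labels = {
            v: min([labels[v]] + [labels[n] for n in neighbors[v]])
            for v in vertices
        }
        if new_labels == labels:               # fixpoint: stop iterating
            break
        labels = new_labels                    # input to the next iteration
    return labels
```

Note that late in the computation most labels no longer change, yet bulk iteration still recomputes all of them, which motivates the incremental variant on the next slides.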
Iterative PACT Programs
● Incremental iteration:
○ Starts with a work set and a solution set
○ Calculates the minimum label for each group
○ Merges the work set with the solution set and checks whether the label changed
○ If a label is new, it becomes part of the delta set, which is fed back to the next iteration
○ Changed labels are also matched against the neighbors, and those matches become the new work set
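The same connected-components computation can be sketched as an incremental (delta) iteration. The point of the sketch, under the same assumed setup as before, is that each round only touches vertices whose labels can still change:

```python
# Sketch of connected components as an incremental iteration: only
# changed labels (the delta set) drive the next round of work.
def incremental_connected_components(vertices, edges):
    neighbors = {v: set() for v in vertices}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    solution = {v: v for v in vertices}        # solution set
    workset = dict(solution)                   # initial work set: all labels
    while workset:
        # propagate work-set labels to neighbors, keeping the minimum
        candidates = {}
        for v, label in workset.items():
            for n in neighbors[v]:
                candidates[n] = min(candidates.get(n, label), label)
        # merge with the solution set; changed labels form the delta set
        delta = {}
        for v, label in candidates.items():
            if label < solution[v]:
                solution[v] = label
                delta[v] = label
        workset = delta                        # delta set becomes next work set
    return solution
```

When the delta set is empty the iteration terminates, so converged parts of the graph stop costing any work, unlike in the bulk version.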
PACT Optimization
Nephele
Nephele Execution
● Tasks, channels, scheduling
○ Tasks, together with all local pipelines associated with them, are pushed to the slaves
○ Tasks can request to send data over the network (only when necessary or ready)
Nephele Execution
● Fault tolerance
○ Conceptually, follows the same idea as lineage (RDDs), but...
○ [Diagrams: intermediate results under blocking vs. non-blocking operator models]
Nephele Execution
● Runtime operators
Does it deliver?
● Maybe: what do the experiments say?
● What’s old?
○ A lot of things
● What’s new?
○ Second-order functions that abstract parallelization
○ Optimization in a UDF-heavy environment
○ Integrated iterative processing
○ An extensible query language and underlying operator model
Experimental Evaluation
Experimental Setup
● 1 master + 25 slave machines
● 16 cores @ 2.0 GHz with 32 GB of RAM (29 GB of operating memory)
● 80 TB HDFS in plain ASCII; 4 SATA drives at 500 MB/s read/write per node
● 8 parallel tasks per slave; total DOP 40-200
Comparison with Hadoop
● Vanilla MapReduce engine
● Apache Hive
● Apache Giraph
Summary of Results
● Stratosphere achieves linear speedup and performance similar to Hadoop on simple tasks (TeraSort, Word Count)
● Stratosphere beats Hive and Hadoop by 5x on complicated tasks like the TPC-H query and triangle enumeration, though with no gain from increasing DOP
● Stratosphere performs worse on Connected Components than Giraph, due to the latter’s better-tuned implementation
● Checkpointing adds little overhead and saves substantial time when failures occur
TeraSort: Stratosphere vs. Hadoop
Stratosphere achieves performance similar to Hadoop, with linear speedup
Word Count: Stratosphere vs. Hadoop
Stratosphere is 20% faster than Hadoop and achieves linear speedup
Triangle Enumeration: Reducer 1
Triangle Enumeration: Reducer 2
Triangle Enumeration: PACT
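The two-phase program on these slides can be approximated on a single machine as follows. This is a sketch of the standard build-triads-then-close pattern, not the exact code from the slides; it assumes undirected edges stored as vertex pairs:

```python
# Single-machine sketch of two-phase triangle enumeration.
from collections import defaultdict
from itertools import combinations

def enumerate_triangles(edges):
    # normalize edges to (lower, higher) vertex pairs
    edges = {tuple(sorted(e)) for e in edges}
    # Phase 1 ("Reducer 1"): group edges by their lower vertex and emit
    # a candidate triad (v, a, b) for every pair of neighbors a < b
    by_vertex = defaultdict(list)
    for a, b in edges:
        by_vertex[a].append(b)
    triads = [
        (v, a, b)
        for v, nbrs in by_vertex.items()
        for a, b in combinations(sorted(nbrs), 2)
    ]
    # Phase 2 ("Reducer 2" / Match): a triad is a triangle iff the
    # closing edge (a, b) exists in the edge set
    return [t for t in triads if (t[1], t[2]) in edges]
```

In the PACT version, phase 2 maps naturally onto a Match contract between the triad set and the edge set, which is part of why the richer operator set pays off here.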
Triangle Enumeration
Stratosphere is 5x faster than Hadoop, though parallelism does not help
TPC-H Query
TPC-H: Stratosphere vs. Hive
Parallelism does not seem to help; however, Stratosphere is 5x faster
Connected Components
Giraph is faster, due to its better-tuned implementation
Connected Components: Execution Time per Superstep
Fault Tolerance
Checkpointing adds little overhead and saves substantial time when failures occur
What Else Do We Want to See?
For the presented experiments:
● Breakdown of execution time to distinguish bottlenecks
● What happens with an even smaller DOP?
● What happens with more/fewer tasks on each core?
Further:
● What happens with even larger data? The current size does fit into RAM
● Comparison with MPP systems, or with split query processing systems like Polybase or Shark, given the size of the tested data
The Promises?
● Declarative, high-level language
● “In situ” data analysis
● Richer set of primitives than MapReduce
● Treats UDFs as first-class citizens
● Automated parallelization and optimization
● Support for iterative programs
● Includes external memory query processing algorithms to support arbitrarily long programs
Ongoing and Future Work
● One-pass optimizer unifying the PACT and Sopremo layers
● Strengthening fault-tolerant capabilities
● Improving scalability and efficiency of Nephele
● Design, compilation and optimization of higher-level languages
● Scalable, efficient, and adaptive algorithms and architecture
● “Stateful” systems for fast ingestion and low-latency data analysis
Discussions and Questions
● Declarativity - expressiveness tradeoff
○ More declarative -> less expressive, but easier to optimize
● Is run-time optimization the way to go?
○ Skewed data distribution may become a bottleneck for such systems
○ Detecting performance bottlenecks on the fly
QED
THANKS!