Hadoop & Spark Performance Tuning Using Dr. Elephant
![Page 1: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/1.jpg)
Dr. Elephant
github.com/linkedin/dr-elephant
Akshay Rai
Hadoop Dev Team
![Page 2: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/2.jpg)
Introduction
![Page 3: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/3.jpg)
Scaling Hadoop Infrastructure
![Page 4: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/4.jpg)
Scale and Optimize Hardware
● More users, more jobs, more resources
● Large investment in hardware
● Can’t keep upgrading and adding machines to solve the problem forever
● Some tuning is needed to get things running
![Page 5: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/5.jpg)
Users are more valuable than machines
What do we do?
![Page 6: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/6.jpg)
Improve User Productivity
![Page 7: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/7.jpg)
User Productivity
● Freedom to experiment and run jobs on the cluster
● Build tools to help developers (Hadoop DSL, resolvers for Pig/Hive)
○ Improve developer lifecycle
○ Also reduce unnecessary resource wastage
![Page 8: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/8.jpg)
The Tuning Problem
![Page 9: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/9.jpg)
How easy is it to tune a job?
● Problems are not obvious
● Critical information is scattered
● Inter-related settings
● Large parameter space
![Page 10: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/10.jpg)
Here’s what we learned!
![Page 11: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/11.jpg)
Expert Intervention
● Not enough support resources available
● Poor coverage
● Difficult to prioritize efforts
● Delays user development
Random Suggestions
![Page 12: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/12.jpg)
Training is not at all easy
● Too many users
● Diverse backgrounds
● Scope is large and evolving
● Other responsibilities are more important
![Page 13: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/13.jpg)
Scaling Productivity is Hard!
![Page 14: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/14.jpg)
Dr. Elephant to the Rescue
![Page 15: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/15.jpg)
What does Dr. Elephant do?
● Automated performance monitoring and tuning tool
● Helps every user get the best performance from their jobs
● Highlights common mistakes
● Indicates best practices and tuning tips
● Provides a platform for other performance related tools
● Analyzes a hundred thousand jobs every day
![Page 16: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/16.jpg)
Architecture
![Page 17: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/17.jpg)
Dashboard
![Page 18: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/18.jpg)
Search
![Page 19: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/19.jpg)
Job Page
![Page 20: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/20.jpg)
MapReduce Report
![Page 21: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/21.jpg)
Failed Job
![Page 22: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/22.jpg)
Help Page
![Page 23: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/23.jpg)
Tuning Tips
![Page 24: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/24.jpg)
Awesome Features
![Page 25: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/25.jpg)
Simplified analysis of a flow’s historical executions
● Monitoring performance, resource usage, and other metrics
● Comparing flows against previous executions
● Measuring the impact of tuning a specific parameter or changing a line of code
![Page 26: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/26.jpg)
Flow History
![Page 27: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/27.jpg)
Job History
![Page 28: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/28.jpg)
Heuristics
![Page 29: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/29.jpg)
How does a Heuristic work?
● Fetch Counters and Task Data
● Some logic to compute a value
● Compare value against threshold levels
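The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the counter names (`spilled_records`, `map_output_records`), and the threshold values are hypothetical and are not taken from Dr. Elephant's code. The severity names come from the Heuristic Severity slide.

```python
from bisect import bisect_right

# Severity levels in ascending order (names from the Heuristic Severity slide).
SEVERITIES = ["NONE", "LOW", "MODERATE", "SEVERE", "CRITICAL"]


def severity_for(value, thresholds):
    """Map a computed value to a severity using four ascending thresholds.

    A value below thresholds[0] is NONE; a value at or above thresholds[3]
    is CRITICAL.
    """
    if len(thresholds) != 4 or list(thresholds) != sorted(thresholds):
        raise ValueError("expected four ascending thresholds")
    return SEVERITIES[bisect_right(thresholds, value)]


def spill_heuristic(counters, thresholds=(2.0, 2.5, 3.0, 4.0)):
    """Hypothetical heuristic: ratio of spilled records to map output records."""
    # Step 1: fetch the counters (here, read from a plain dict).
    spilled = counters["spilled_records"]
    output = counters["map_output_records"]
    # Step 2: apply some logic to compute a value.
    ratio = spilled / output if output else 0.0
    # Step 3: compare the value against the threshold levels.
    return ratio, severity_for(ratio, thresholds)
```

Keeping the threshold comparison in its own function means every heuristic can share it and only the "compute a value" step differs.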
![Page 30: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/30.jpg)
Heuristic Severity
| Severity | Description |
|----------|-------------|
| CRITICAL | The job is in critical state and must be tuned |
| SEVERE | There is scope for improvement |
| MODERATE | There is scope for further improvement |
| LOW | There is scope for few minor improvements |
| NONE | The job is safe. No tuning necessary |
![Page 31: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/31.jpg)
Example | Mapper Data Skew
![Page 32: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/32.jpg)
Mapper Skew Problem
● Number of Mappers depends on the number of splits
● Varying size of splits can cause skewness in the Mapper Input
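One simple way to quantify this kind of skew is to compare the average input of the larger half of the mappers with that of the smaller half. The sketch below is an assumption made for illustration, not the calculation Dr. Elephant performs; the function name and the choice of splitting at the midpoint are hypothetical.

```python
from statistics import mean


def mapper_input_skew(input_bytes):
    """Return the ratio between the average input size of the larger half
    of the mappers and the average input size of the smaller half.

    A ratio close to 1.0 means the mappers receive similar amounts of data.
    """
    if len(input_bytes) < 2:
        return 1.0
    ordered = sorted(input_bytes)
    mid = len(ordered) // 2
    small, large = ordered[:mid], ordered[mid:]
    small_avg = mean(small)
    if small_avg == 0:
        # The smaller half received no data at all.
        return float("inf")
    return mean(large) / small_avg
```

For example, four mappers reading 10, 10, 80 and 80 MB give a ratio of 8, while four mappers reading 100 MB each give a ratio of 1.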
![Page 33: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/33.jpg)
Solution to Mapper Skewness
● Each Mapper should process the same amount of data
● Combine the small chunks and feed them to a single Mapper
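The idea of combining small chunks can be illustrated with a greedy grouping of split sizes. This is a minimal sketch, assuming a hypothetical `target_size` per Mapper; in Hadoop itself this kind of grouping is handled by input formats such as `CombineFileInputFormat`, whose actual algorithm is not reproduced here.

```python
def combine_splits(split_sizes, target_size):
    """Greedily group splits so that each group totals at most target_size.

    A single split that is already larger than target_size gets a group of
    its own. Each returned group stands for the input of one Mapper.
    """
    if target_size <= 0:
        raise ValueError("target_size must be positive")
    groups = []
    current, current_total = [], 0
    # Visit the largest splits first so big splits are not stuck behind small ones.
    for size in sorted(split_sizes, reverse=True):
        if current and current_total + size > target_size:
            groups.append(current)
            current, current_total = [], 0
        current.append(size)
        current_total += size
    if current:
        groups.append(current)
    return groups
```

With splits of 128, 16, 16, 32 and 64 MB and a 128 MB target, five uneven Mappers become two Mappers that each read 128 MB.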
![Page 34: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/34.jpg)
Example | Spark Executor Load Balance
![Page 35: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/35.jpg)
[Diagram: Spark Driver, Executors 1 to 3, and an RDD with Partitions 1 to 3]
![Page 36: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/36.jpg)
Custom Heuristics
![Page 37: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/37.jpg)
Adding a New Heuristic
1. Create a new heuristic and test it.
2. Create a new view for the heuristic, for example helpMapperSpill.scala.html.
3. Add the details of the heuristic to the HeuristicConf.xml file.
    <heuristic>
      <applicationtype>mapreduce</applicationtype>
      <heuristicname>Mapper GC</heuristicname>
      <classname>com.linkedin.dre.mapreduce.heuristics.MapperGC</classname>
      <viewname>views.html.help.mapreduce.helpGC</viewname>
    </heuristic>
4. Run Dr. Elephant. It should now include the new heuristic.
![Page 38: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/38.jpg)
Configuring Heuristics/Threshold levels
    <heuristics>
      <heuristic>
        <applicationtype>mapreduce</applicationtype>
        <heuristicname>Mapper Data Skew</heuristicname>
        <classname>com.linkedin.dre.mapreduce.heuristics.MapperDataSkew</classname>
        <viewname>views.html.help.mapreduce.helpMapperDataSkew</viewname>
        <params>
          <num_tasks_severity>10, 50, 100, 200</num_tasks_severity>
          <deviation_severity>2, 4, 8, 16</deviation_severity>
          <files_severity>1/8, 1/4, 1/2, 1</files_severity>
        </params>
      </heuristic>
    </heuristics>
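To make the shape of the threshold parameters concrete, the sketch below reads a trimmed copy of the configuration above and turns each parameter into a list of numbers. It assumes that every parameter holds four comma-separated levels and that values such as `1/8` are plain fractions; how Dr. Elephant itself parses these values is not shown in the slides, and the function names here are hypothetical.

```python
import xml.etree.ElementTree as ET
from fractions import Fraction

# Trimmed copy of the configuration shown on the slide.
CONF = """
<heuristics>
  <heuristic>
    <applicationtype>mapreduce</applicationtype>
    <heuristicname>Mapper Data Skew</heuristicname>
    <params>
      <num_tasks_severity>10, 50, 100, 200</num_tasks_severity>
      <deviation_severity>2, 4, 8, 16</deviation_severity>
      <files_severity>1/8, 1/4, 1/2, 1</files_severity>
    </params>
  </heuristic>
</heuristics>
"""


def parse_thresholds(text):
    """Turn '1/8, 1/4, 1/2, 1' into [0.125, 0.25, 0.5, 1.0]."""
    return [float(Fraction(part.strip())) for part in text.split(",")]


def load_params(xml_text):
    """Return {heuristic name: {parameter name: list of threshold levels}}."""
    root = ET.fromstring(xml_text.strip())
    result = {}
    for heuristic in root.findall("heuristic"):
        name = heuristic.findtext("heuristicname")
        params = heuristic.find("params")
        children = list(params) if params is not None else []
        result[name] = {child.tag: parse_thresholds(child.text) for child in children}
    return result
```

Calling `load_params(CONF)["Mapper Data Skew"]["files_severity"]` yields the four levels as floats, which can then be compared against a computed value.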
![Page 39: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/39.jpg)
Elephagent
![Page 40: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/40.jpg)
Workflow monitoring and reports
● Performance characteristics change
○ Data Growth
○ Data distribution change
○ Hardware change
○ Incremental software change
● Monitor performance on each execution
● Compare behaviour across revisions
● Cost to Serve analysis
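Monitoring each execution against earlier ones can be as simple as comparing the latest run with a baseline built from the history. The following is a minimal sketch under assumptions: the function name, the use of the median as the baseline, and the `tolerance` factor are all hypothetical and do not describe how Elephagent works.

```python
from statistics import median


def flag_regression(history, latest, tolerance=1.5):
    """Return True if the latest measurement exceeds the median of the
    previous executions by more than the given tolerance factor.

    history and latest can be any comparable metric, such as runtime in
    seconds or resources used.
    """
    if not history:
        # Nothing to compare against on the first execution.
        return False
    baseline = median(history)
    return latest > baseline * tolerance
```

A median baseline is used here so that a single unusually slow run in the past does not hide a later regression.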
![Page 41: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/41.jpg)
Production Reviews | JIRA Bot
● Separate cluster for critical workloads
● Audit before deployment
● Improved accuracy
● Faster turnaround
● Higher throughput
![Page 42: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/42.jpg)
Future Plans
![Page 43: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/43.jpg)
Upcoming
● Job Resource Usage and Wastage
● Job Wait time
● Real time analysis of a job
● Workflow DAG visualization
● Improved Spark heuristics
![Page 44: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/44.jpg)
References
Engineering Blog: engineering.linkedin.com/blog/2016/04/dr-elephant-open-source-self-serve-performance-tuning-hadoop-spark
Open Source GitHub Link: github.com/linkedin/dr-elephant
Mailing List: Dr-elephant-users
Hadoop Summit 2015: https://www.youtube.com/watch?v=aL3OJ4YoxPA
![Page 45: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/45.jpg)
Thank You
![Page 46: Hadoop & Spark Performance tuning using Dr. Elephant](https://reader034.vdocuments.site/reader034/viewer/2022042513/586f77e41a28ab10258b6969/html5/thumbnails/46.jpg)
©2014 LinkedIn Corporation. All Rights Reserved.
© 2016