optimizing hive queries
DESCRIPTION
Owen O'Malley gave a talk at Hadoop Summit EU 2013 about optimizing Hive queries.TRANSCRIPT
![Page 1: Optimizing Hive Queries](https://reader034.vdocuments.site/reader034/viewer/2022051323/54b6cff74a7959d84d8b4605/html5/thumbnails/1.jpg)
© Hortonworks Inc. 2013:
Optimizing Hive Queries
Page 1
Owen O’Malley Founder and Architect [email protected] @owen_omalley
![Page 2: Optimizing Hive Queries](https://reader034.vdocuments.site/reader034/viewer/2022051323/54b6cff74a7959d84d8b4605/html5/thumbnails/2.jpg)
© Hortonworks Inc. 2013
Who Am I?
• Founder and Architect at Hortonworks – Working on Hive, working with customer – Formerly Hadoop MapReduce & Security – Been working on Hadoop since beginning
• Apache Hadoop, ASF – Hadoop PMC (Original VP) – Tez, Ambari, Giraph PMC – Mentor for: Accumulo, Kafka, Knox – Apache Member
Page 2
![Page 3: Optimizing Hive Queries](https://reader034.vdocuments.site/reader034/viewer/2022051323/54b6cff74a7959d84d8b4605/html5/thumbnails/3.jpg)
© Hortonworks Inc. 2013
Outline
• Data Layout • Data Format • Joins • Debugging
Page 3
![Page 4: Optimizing Hive Queries](https://reader034.vdocuments.site/reader034/viewer/2022051323/54b6cff74a7959d84d8b4605/html5/thumbnails/4.jpg)
© Hortonworks Inc. 2013
Data Layout Location, Location, Location
Page 4
![Page 5: Optimizing Hive Queries](https://reader034.vdocuments.site/reader034/viewer/2022051323/54b6cff74a7959d84d8b4605/html5/thumbnails/5.jpg)
© Hortonworks Inc. 2013
Fundamental Questions
• What is your primary use case? – What kind of queries and filters?
• How do you need to access the data? – What information do you need together?
• How much data do you have? – What is your year to year growth?
• How do you get the data?
Page 5
![Page 6: Optimizing Hive Queries](https://reader034.vdocuments.site/reader034/viewer/2022051323/54b6cff74a7959d84d8b4605/html5/thumbnails/6.jpg)
© Hortonworks Inc. 2013
HDFS Characteristics
• Provides Distributed File System – Very high aggregate bandwidth – Extreme scalability (up to 100 PB) – Self-healing storage – Relatively simple to administer
• Limitations – Can’t modify existing files – Single writer for each file – Heavy bias for large files ( > 100 MB)
Page 6
![Page 7: Optimizing Hive Queries](https://reader034.vdocuments.site/reader034/viewer/2022051323/54b6cff74a7959d84d8b4605/html5/thumbnails/7.jpg)
© Hortonworks Inc. 2013
Choices for Layout
• Partitions – Top level mechanism for pruning – Primary unit for updating tables (& schema) – Directory per value of specified column
• Bucketing – Hashed into a file, good for sampling – Controls write parallelism
• Sort order – The order the data is written within file
Page 7
![Page 8: Optimizing Hive Queries](https://reader034.vdocuments.site/reader034/viewer/2022051323/54b6cff74a7959d84d8b4605/html5/thumbnails/8.jpg)
© Hortonworks Inc. 2013
Example Hive Layout
• Directory Structure warehouse/$database/$table
• Partitioning /part1=$partValue/part2=$partValue
• Bucketing /$bucket_$attempt (eg. 000000_0)
• Sort – Each file is sorted within the file
Page 8
![Page 9: Optimizing Hive Queries](https://reader034.vdocuments.site/reader034/viewer/2022051323/54b6cff74a7959d84d8b4605/html5/thumbnails/9.jpg)
© Hortonworks Inc. 2013
Layout Guidelines
• Limit the number of partitions – 1,000 partitions is much faster than 10,000 – Nested partitions are almost always wrong
• Gauge the number of buckets – Calculate file size and keep big (200-500MB) – Don’t forget number of files (Buckets * Parts)
• Layout related tables the same way – Partition – Bucket and sort order
Page 9
![Page 10: Optimizing Hive Queries](https://reader034.vdocuments.site/reader034/viewer/2022051323/54b6cff74a7959d84d8b4605/html5/thumbnails/10.jpg)
© Hortonworks Inc. 2013
Normalization
• Most databases suggest normalization – Keep information about each thing together – Customer, Sales, Returns, Inventory tables
• Has lots of good properties, but… – Is typically slow to query
• Often best to denormalize during load – Write once, read many times – Additionally provides snapshots in time.
Page 10
![Page 11: Optimizing Hive Queries](https://reader034.vdocuments.site/reader034/viewer/2022051323/54b6cff74a7959d84d8b4605/html5/thumbnails/11.jpg)
© Hortonworks Inc. 2013
Data Format How is your data stored?
Page 11
![Page 12: Optimizing Hive Queries](https://reader034.vdocuments.site/reader034/viewer/2022051323/54b6cff74a7959d84d8b4605/html5/thumbnails/12.jpg)
© Hortonworks Inc. 2013
Choice of Format
• Serde – How each record is encoded?
• Input/Output (aka File) Format – How are the files stored?
• Primary Choices – Text – Sequence File – RCFile – ORC (Coming Soon!)
Page 12
![Page 13: Optimizing Hive Queries](https://reader034.vdocuments.site/reader034/viewer/2022051323/54b6cff74a7959d84d8b4605/html5/thumbnails/13.jpg)
© Hortonworks Inc. 2013
Text Format
• Critical to pick a Serde – Default - ^A’s between fields – JSON – top level JSON record – CSV – commas between fields (on github)
• Slow to read and write • Can’t split compressed files – Leads to huge maps
• Need to read/decompress all fields
Page 13
![Page 14: Optimizing Hive Queries](https://reader034.vdocuments.site/reader034/viewer/2022051323/54b6cff74a7959d84d8b4605/html5/thumbnails/14.jpg)
© Hortonworks Inc. 2013
Sequence File
• Traditional MapReduce binary file format – Stores keys and values as classes – Not a good fit for Hive, which has SQL types – Hive always stores entire row as value
• Splittable but only by searching file – Default block size is 1 MB
• Need to read and decompress all fields
Page 14
![Page 15: Optimizing Hive Queries](https://reader034.vdocuments.site/reader034/viewer/2022051323/54b6cff74a7959d84d8b4605/html5/thumbnails/15.jpg)
© Hortonworks Inc. 2013
RC (Row Columnar) File
• Columns stored separately – Read and decompress only needed ones – Better compression
• Columns stored as binary blobs – Depends on metastore to supply types
• Larger blocks – 4 MB by default – Still search file for split boundary
Page 15
![Page 16: Optimizing Hive Queries](https://reader034.vdocuments.site/reader034/viewer/2022051323/54b6cff74a7959d84d8b4605/html5/thumbnails/16.jpg)
© Hortonworks Inc. 2013
ORC (Optimized Row Columnar)
• Columns stored separately • Knows types – Uses type-specific encoders – Stores statistics (min, max, sum, count)
• Has light-weight index – Skip over blocks of rows that don’t matter
• Larger blocks – 256 MB by default – Has an index for block boundaries
Page 16
![Page 17: Optimizing Hive Queries](https://reader034.vdocuments.site/reader034/viewer/2022051323/54b6cff74a7959d84d8b4605/html5/thumbnails/17.jpg)
© Hortonworks Inc. 2013
ORC - File Layout
Page 17
![Page 18: Optimizing Hive Queries](https://reader034.vdocuments.site/reader034/viewer/2022051323/54b6cff74a7959d84d8b4605/html5/thumbnails/18.jpg)
© Hortonworks Inc. 2013
Example File Sizes from TPC-DS
Page 18
![Page 19: Optimizing Hive Queries](https://reader034.vdocuments.site/reader034/viewer/2022051323/54b6cff74a7959d84d8b4605/html5/thumbnails/19.jpg)
© Hortonworks Inc. 2013
Compression
• Need to pick level of compression – None – LZO or Snappy – fast but sloppy – Best for temporary tables
– ZLIB – slow and complete – Best for long term storage
Page 19
![Page 20: Optimizing Hive Queries](https://reader034.vdocuments.site/reader034/viewer/2022051323/54b6cff74a7959d84d8b4605/html5/thumbnails/20.jpg)
© Hortonworks Inc. 2013
Joins Putting the pieces together
Page 20
![Page 21: Optimizing Hive Queries](https://reader034.vdocuments.site/reader034/viewer/2022051323/54b6cff74a7959d84d8b4605/html5/thumbnails/21.jpg)
© Hortonworks Inc. 2013
Default Assumption
• Hive assumes users are either: – Noobies – Hive developers
• Default behavior is always finish – Little Engine that Could!
• Experts could override default behaviors – Get better performance, but riskier
• We’re working on improving heuristics
Page 21
![Page 22: Optimizing Hive Queries](https://reader034.vdocuments.site/reader034/viewer/2022051323/54b6cff74a7959d84d8b4605/html5/thumbnails/22.jpg)
© Hortonworks Inc. 2013
Shuffle Join
• Default choice – Always works (I’ve sorted a petabyte!) – Worst case scenario
• Each process – Reads from part of one of the tables – Buckets and sorts on join key – Sends one bucket to each reduce
• Works everytime!
Page 22
![Page 23: Optimizing Hive Queries](https://reader034.vdocuments.site/reader034/viewer/2022051323/54b6cff74a7959d84d8b4605/html5/thumbnails/23.jpg)
© Hortonworks Inc. 2013
Map Join
• One table is small (eg. dimension table) – Fits in memory
• Each process – Reads small table into memory hash table – Streams through part of the big file – Joining each record from hash table
• Very fast, but limited
Page 23
![Page 24: Optimizing Hive Queries](https://reader034.vdocuments.site/reader034/viewer/2022051323/54b6cff74a7959d84d8b4605/html5/thumbnails/24.jpg)
© Hortonworks Inc. 2013
Sort Merge Bucket (SMB) Join
• If both tables are: – Sorted the same – Bucketed the same – And joining on the sort/bucket column
• Each process: – Reads a bucket from each table – Process the row with the lowest value
• Very efficient if applicable
Page 24
![Page 25: Optimizing Hive Queries](https://reader034.vdocuments.site/reader034/viewer/2022051323/54b6cff74a7959d84d8b4605/html5/thumbnails/25.jpg)
© Hortonworks Inc. 2013
Debugging What could possibly go wrong?
Page 25
![Page 26: Optimizing Hive Queries](https://reader034.vdocuments.site/reader034/viewer/2022051323/54b6cff74a7959d84d8b4605/html5/thumbnails/26.jpg)
© Hortonworks Inc. 2013
Performance Question
• Which of the following is faster? – select count(distinct(Col)) from Tbl – select count(*) from (select distict(Col) from Tbl)
Page 26
![Page 27: Optimizing Hive Queries](https://reader034.vdocuments.site/reader034/viewer/2022051323/54b6cff74a7959d84d8b4605/html5/thumbnails/27.jpg)
© Hortonworks Inc. 2013
Count Distinct
Page 27
![Page 28: Optimizing Hive Queries](https://reader034.vdocuments.site/reader034/viewer/2022051323/54b6cff74a7959d84d8b4605/html5/thumbnails/28.jpg)
© Hortonworks Inc. 2013
Answer
• Surprisingly the second is usually faster – In the first case: – Maps send each value to the reduce – Single reduce counts them all
– In the second case: – Maps split up the values to many reduces – Each reduce generates its list – Final job counts the size of each list
– Singleton reduces are almost always BAD
Page 28
![Page 29: Optimizing Hive Queries](https://reader034.vdocuments.site/reader034/viewer/2022051323/54b6cff74a7959d84d8b4605/html5/thumbnails/29.jpg)
© Hortonworks Inc. 2013
Communication is Good!
• Hive doesn’t tell you what is wrong. – Expects you to know! – “Lucy, you have some ‘splaining to do!”
• Explain tool provides query plan – Filters on input – Numbers of jobs – Numbers of maps and reduces – What the jobs are sorting by – What directories are they reading or writing
Page 29
![Page 30: Optimizing Hive Queries](https://reader034.vdocuments.site/reader034/viewer/2022051323/54b6cff74a7959d84d8b4605/html5/thumbnails/30.jpg)
© Hortonworks Inc. 2013
Blinded by Science
• The explanation tool is confusing. – It takes practice to understand. – It doesn’t include some critical details like partition pruning.
• Running the query makes things clearer! – Pay attention to the details – Look at JobConf and job history files
Page 30
![Page 31: Optimizing Hive Queries](https://reader034.vdocuments.site/reader034/viewer/2022051323/54b6cff74a7959d84d8b4605/html5/thumbnails/31.jpg)
© Hortonworks Inc. 2013
Skew
• Skew is typical in real datasets. • A user complained that his job was slow – He had 100 reduces – 98 of them finished fast – 2 ran really slow
• The key was a boolean…
Page 31
![Page 32: Optimizing Hive Queries](https://reader034.vdocuments.site/reader034/viewer/2022051323/54b6cff74a7959d84d8b4605/html5/thumbnails/32.jpg)
© Hortonworks Inc. 2013
Root Cause Analysis
• Ambari – Apache project building Hadoop installation and management tool – Provides metrics (Ganglia & Nagios) – Root Cause Analysis – Processes MapReduce job logs – Displays timing of each part of query plan
Page 32
![Page 33: Optimizing Hive Queries](https://reader034.vdocuments.site/reader034/viewer/2022051323/54b6cff74a7959d84d8b4605/html5/thumbnails/33.jpg)
© Hortonworks Inc. 2013
Root Cause Analysis Screenshots
Page 33
![Page 34: Optimizing Hive Queries](https://reader034.vdocuments.site/reader034/viewer/2022051323/54b6cff74a7959d84d8b4605/html5/thumbnails/34.jpg)
© Hortonworks Inc. 2013
Root Cause Analysis Screenshots
Page 34
![Page 35: Optimizing Hive Queries](https://reader034.vdocuments.site/reader034/viewer/2022051323/54b6cff74a7959d84d8b4605/html5/thumbnails/35.jpg)
© Hortonworks Inc. 2012: DO NOT SHARE. CONTAINS HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATION
Thank You! Questions & Answers
Page 35
@owen_omalley
![Page 36: Optimizing Hive Queries](https://reader034.vdocuments.site/reader034/viewer/2022051323/54b6cff74a7959d84d8b4605/html5/thumbnails/36.jpg)
© Hortonworks Inc. 2013
ORCFile - Comparison
Page 36
RC File Trevni ORC File Hive Type Model N N Y Separate complex columns N Y Y Splits found quickly N Y Y Default column group size 4MB 64MB* 250MB Files per a bucket 1 > 1 1 Store min, max, sum, count N N Y Versioned metadata N Y Y Run length data encoding N N Y Store strings in dictionary N N Y Store row count N Y Y Skip compressed blocks N N Y Store internal indexes N N Y