Putting Apache Drill into Production

Neeraja Rentachintala, Sr. Director, Product Management
Aman Sinha, Lead Software Engineer, Apache Drill & Calcite PMC

Upload: mapr-technologies

Posted on 16-Apr-2017


TRANSCRIPT

Page 1: Putting Apache Drill into Production

© 2016 MapR Technologies

Putting Apache Drill into Production
Neeraja Rentachintala, Sr. Director, Product Management

Aman Sinha, Lead Software Engineer, Apache Drill & Calcite PMC

Page 2: Putting Apache Drill into Production


Topics
• Apache Drill – What & Why
  – Use Cases
  – Customer Examples
• Considerations & Best Practices for Production Deployments
  – Deployment Architecture
  – Storage Format Selection
  – Query Performance
  – Security
• Product Roadmap
• Q&A

Page 3: Putting Apache Drill into Production


Apache Drill – What & Why

Page 4: Putting Apache Drill into Production


Schema-Free SQL Engine for Flexibility & Performance

Rapid time to insights
• Query data in-situ
• No schemas required
• Easy to get started

Access to any data type, any data source
• Relational
• Nested data
• Schema-less

Integration with existing tools
• ANSI SQL
• BI tool integration
• User Defined Functions

Scale in all dimensions
• TB-PB of scale
• 1000s of users
• 1000s of nodes

Granular security
• Authentication
• Row/column level controls
• De-centralized

Page 5: Putting Apache Drill into Production


Unified SQL Layer for the MapR Converged Data Platform

[Diagram: Drill as a unified SQL layer over global sources on the MapR platform: MapR-FS (web-scale storage), MapR-DB (database), and MapR Streams (event streaming). It serves real-time dashboards, BI/ad-hoc queries, and data exploration, alongside batch processing (MapReduce, Spark, Pig) and stream processing (Spark Streaming, Storm).]

Page 6: Putting Apache Drill into Production


Use Cases for Drill

Data Exploration
• Primary purpose: Data discovery & model development
• Usage: Internal
• Typical users: Data scientists, technical analysts, general SQL users
• Tools involved: Command line, SQL/BI tools, R, Python, Spark..
• Critical requirement: Flexibility (file format variety, nested data, UDFs..)
• Type of datasets: Raw datasets
• Query patterns: Unknown models & unknown query patterns

Ad-hoc Queries
• Primary purpose: Investigative analytics
• Usage: Internal
• Typical users: Business analysts, general SQL users
• Tools involved: Command line, SQL/BI tools
• Critical requirement: Flexibility (file format variety); interactive performance, acceptable up to 10s of seconds
• Type of datasets: Raw datasets; processed datasets (via Hive and Spark)
• Query patterns: Known models, unknown query patterns

Dashboards/BI Reporting
• Primary purpose: Operational reporting
• Usage: Internal and external-facing apps
• Typical users: Business analysts, end users
• Tools involved: BI tools, custom apps
• Critical requirement: Performance
• Type of datasets: Processed datasets; OK to structure data layout for optimized performance
• Query patterns: Known models, known query patterns

ETL
• Primary purpose: Data prep for downstream needs
• Usage: Internal
• Typical users: ETL/DWH developers
• Tools involved: ETL/DI tools, scripts
• Critical requirement: Fault tolerance
• Type of datasets: Raw datasets
• Query patterns: Predefined queries

Traditional and New Types of BI on Hadoop: more raw data + more real time + more agility & self-service + more users, more cost-effectively

Page 7: Putting Apache Drill into Production


Customer examples

https://www.mapr.com/blog/happy-anniversary-apache-drill-what-difference-year-makes

Page 8: Putting Apache Drill into Production


Agile and Iterative Releases

Drill 1.0 (May ’15), 1.1 (Jul ’15), 1.2 (Oct ’15), 1.3 (Nov ’15), 1.4 (Jan ’16), 1.5 (Feb ’16), 1.6 (Apr ’16), 1.7 (Jul ’16), 1.8 (just released)

• 14 releases since Beta in Sep ’14
• 50+ contributors (MapR, Dremio, Intuit, Microsoft, Hortonworks...)
• 1000s of sandbox downloads since GA
• 6,000+ analyst and developer certifications through MapR ODT
• 14,000+ email threads on the Drill dev and user forums
• Lots of new contributions: JDBC/MongoDB/Kudu storage plugins, geospatial functions..

Page 9: Putting Apache Drill into Production


Drill Product Evolution

Drill 1.0 GA
• Drill GA

Drill 1.1
• Automatic partitioning for Parquet files
• Window functions support
  – Aggregate functions: AVG, COUNT, MAX, MIN, SUM
  – Ranking functions: CUME_DIST, DENSE_RANK, PERCENT_RANK, RANK, ROW_NUMBER
• Hive impersonation
• SQL UNION support
• Complex data enhancements, and more

Drill 1.2
• Native Parquet reader for Hive tables
• Hive partition pruning
• Multiple Hive versions support
• Hive 1.2.1 version support
• New analytical functions (LEAD, LAG, NTILE, etc.)
• Multiple window PARTITION BY clauses support
• DROP TABLE syntax
• Metadata caching
• Security support for the web UI
• INT96 data type support
• UNION DISTINCT support

Drill 1.3/1.4
• Improved Tableau experience with faster LIMIT 0 queries
• Metadata (INFORMATION_SCHEMA) query speedups on Hive schemas/tables
• Robust partition pruning (more data types, large # of partitions)
• Optimized metadata cache
• Improved window function resource usage and performance
• New & improved JDBC driver

Drill 1.5/1.6
• Enhanced stability & scale
• New memory allocator
• Improved uniform query load distribution via connection pooling
• Enhanced query performance
• Early application of partition pruning in query planning
• Hive table query planning improvements
• Row-count-based pruning for LIMIT N queries
• Lazy reading of the Parquet metadata cache
• LIMIT 0 performance
• Enhanced SQL window function frame syntax
• Client impersonation
• JDK 1.8 support

Drill 1.7
• Enhanced MaxDir/MinDir functions
• Access to Drill logs in the web UI
• Addition of the JDBC/ODBC client IP in Drill audit logs
• Monitoring via JMX
• Hive CHAR data type support
• Partition pruning enhancements
• Ability to return file names as part of queries

Themes: ANSI SQL window functions, enhanced Hive compatibility, query performance & scale, Drill on MapR-DB JSON tables, easy monitoring & security

Page 10: Putting Apache Drill into Production


Considerations & Best Practices for Production Deployments

Page 11: Putting Apache Drill into Production


Deployment

Page 12: Putting Apache Drill into Production


Drill is a scale-out MPP query engine

[Diagram: client apps connect through a ZooKeeper quorum to Drillbits co-located with DFS/HBase/Hive on each data node.]

• Install Drill on all the data nodes in the cluster
  – Improves performance through data locality
• Client tools should communicate with Drill via the ZooKeeper quorum
  – Direct connections to a Drillbit are not recommended for production deployments
• When installing Drill on a client/edge node, make sure the node has network connectivity to ZooKeeper and all Drillbit nodes

Page 13: Putting Apache Drill into Production


Appropriate Memory Allocation is Key
• Drill is an in-memory query engine with an optimistic/pipelined execution model
  – The performance and concurrency Drill offers are a factor of the resources available to it
• It is possible to restrict the resources Drill uses on a cluster
  – Direct and heap memory allocation must be set for all Drillbits in the cluster
  – Recommend at least 32 cores & 32-48 GB memory per node
• Memory controls are also available for more granular operations
  – Query planning
  – Sort operations
• Drill supports spilling to disk for sort-based operations
  – Recommend creating spill directories on local volumes (enable local reads & writes)
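As a sketch of the granular memory controls mentioned above (the option name is a real Drill setting; the 2 GB value is only an example, not a sizing recommendation), the memory available to a single query's buffered operators can be capped per node:

```sql
-- Cap the memory one query may use per node for sorts and other
-- buffered operations (example value: 2 GB = 2147483648 bytes).
ALTER SYSTEM SET `planner.memory.max_query_memory_per_node` = 2147483648;
```

Setting this at the SYSTEM level applies to all sessions; use ALTER SESSION to scope it to one connection.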

Page 14: Putting Apache Drill into Production


Storage Format Selection

Page 15: Putting Apache Drill into Production


Choosing the Right Storage Format is Vital
• Format selection
  – Data exploration/ad-hoc queries: any file format (Text, JSON, Parquet, Avro, ..)
  – SLA-critical BI & analytics workloads: Parquet
  – BI/ad-hoc queries on changing data: MapR-DB/HBase
• Regarding Parquet
  – Drill can generate Parquet data using CTAS syntax, or read data generated by other tools such as Hive/Spark
  – Types of Parquet compression: Snappy (default), Gzip
  – Parquet block size considerations
    • For MapR, recommend setting the Parquet block size to match the MFS chunk size
    • When generating data through Drill CTAS, use the parameter:
      ALTER <SYSTEM or SESSION> SET `store.parquet.block-size` = 268435456;
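Putting the two recommendations together, a minimal sketch (the table name and source path are hypothetical): set the block size for the session, then materialize Parquet with CTAS.

```sql
-- 268435456 bytes = 256 MB; chosen here to match the MFS chunk size.
ALTER SESSION SET `store.parquet.block-size` = 268435456;

-- Hypothetical table and source path, for illustration only.
CREATE TABLE dfs.tmp.sales_parquet AS
SELECT * FROM dfs.`/raw/sales.json`;
```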

Page 16: Putting Apache Drill into Production


Query Performance

Page 17: Putting Apache Drill into Production


How Drill Achieves Performance

Execution in Drill
➢ Scale-out MPP
➢ Hierarchical "JSON-like" data model
➢ Columnar processing
➢ Optimistic & pipelined execution
➢ Runtime code generation
➢ Late binding
➢ Extensible

Optimization in Drill
➢ Apache Calcite + parallel optimizations
➢ Data locality awareness
➢ Projection pruning
➢ Filter pushdown
➢ Partition pruning
➢ CBO & pluggable optimization rules
➢ Metadata caching

Page 18: Putting Apache Drill into Production


Partition Your Data Layout to Reduce I/O

[Diagram: a Sales directory tree partitioned by region (US, Europe), year (2014, 2015, 2016), month (Jan, Feb, ..), and day (1, 2, 3, 4, ..).]

• Partition pruning allows a query engine to determine and retrieve the smallest dataset needed to answer a given query
• Data can be partitioned
  – At the time of ingestion into the cluster
  – As part of ETL via Hive, Spark, or other batch processing tools
  – Drill supports CTAS with a PARTITION BY clause
• Drill performs partition pruning for queries on partitioned Hive tables as well as for file system queries, e.g.:

  SELECT * FROM Sales WHERE dir0 = 'US' AND dir1 = '2015';

Page 19: Putting Apache Drill into Production


Partitioning Examples

Create a partitioned table:

  CREATE TABLE dfs.tmp.businessparquet PARTITION BY (state, city, stars) AS
  SELECT state, city, stars, business_id, full_address, hours, name, review_count
  FROM `business.json`;

Queries on partition keys:

  SELECT name, city, stars FROM dfs.tmp.businessparquet
  WHERE state = 'AZ' AND city = 'Fountain Hills' LIMIT 5;

  SELECT name, city, stars FROM dfs.tmp.businessparquet
  WHERE state = 'AZ' AND city = 'Fountain Hills' AND stars = '3.5' LIMIT 5;

How to determine the right partitions?
• Determine the common access patterns from SQL queries
• Columns frequently used in the WHERE clause are good candidates for partition keys
• Balance the total # of partitions against optimal query planning performance

Page 20: Putting Apache Drill into Production


Run EXPLAIN PLAN to check if Partition Pruning is Applied

00-00 Screen : rowType = RecordType(ANY name, ANY city, ANY stars): rowcount = 5.0, cumulative cost = {40.5 rows, 145.5 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 1005
00-01 Project(name=[$0], city=[$1], stars=[$2]) : rowType = RecordType(ANY name, ANY city, ANY stars): rowcount = 5.0, cumulative cost = {40.0 rows, 145.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 1004
00-02 SelectionVectorRemover : rowType = RecordType(ANY name, ANY city, ANY stars): rowcount = 5.0, cumulative cost = {40.0 rows, 145.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 1003
00-03 Limit(fetch=[5]) : rowType = RecordType(ANY name, ANY city, ANY stars): rowcount = 5.0, cumulative cost = {35.0 rows, 140.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 1002
00-04 Project(name=[$3], city=[$1], stars=[$2]) : rowType = RecordType(ANY name, ANY city, ANY stars): rowcount = 30.0, cumulative cost = {30.0 rows, 120.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 1001
00-05 Project(state=[$1], city=[$2], stars=[$3], name=[$0]) : rowType = RecordType(ANY state, ANY city, ANY stars, ANY name): rowcount = 30.0, cumulative cost = {30.0 rows, 120.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 1000
00-06 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tmp/businessparquet/0_0_114.parquet]], selectionRoot=file:/tmp/businessparquet, numFiles=1, usedMetadataFile=false, columns=[`state`, `city`, `stars`, `name`]]]) : rowType = RecordType(ANY name, ANY state, ANY city, ANY stars): rowcount = 30.0, cumulative cost = {30.0 rows, 120.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 999

Note numFiles=1 in the Scan: partition pruning selected a single file instead of scanning the whole directory.

Page 21: Putting Apache Drill into Production


Create a Parquet Metadata Cache to Speed up Query Planning

• Helps reduce query planning time significantly when working with a large # of Parquet files (thousands to millions)
• Highly optimized cache with the key metadata from Parquet files
  – Column names, data types, nullability, row group size…
• Recursive cache creation at the root level, or selectively for specific directories or files
  – Ex: REFRESH TABLE METADATA dfs.tmp.BusinessParquet;
• Metadata caching is better suited for large amounts of data with a moderate rate of change
• Applicable only to direct queries on Parquet data in the file system
  – For queries via Hive tables, enable metastore caching instead in the storage plugin config:
    • "hive.metastore.cache-ttl-seconds": "<value>",
    • "hive.metastore.cache-expire-after": "<value>"

Page 22: Putting Apache Drill into Production


Run EXPLAIN PLAN to Check Whether the Metadata Cache is Used

00-00 Screen : rowType = RecordType(ANY name, ANY city, ANY stars): rowcount = 5.0, cumulative cost = {40.5 rows, 145.5 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 1279
00-01 Project(name=[$0], city=[$1], stars=[$2]) : rowType = RecordType(ANY name, ANY city, ANY stars): rowcount = 5.0, cumulative cost = {40.0 rows, 145.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 1278
00-02 SelectionVectorRemover : rowType = RecordType(ANY name, ANY city, ANY stars): rowcount = 5.0, cumulative cost = {40.0 rows, 145.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 1277
00-03 Limit(fetch=[5]) : rowType = RecordType(ANY name, ANY city, ANY stars): rowcount = 5.0, cumulative cost = {35.0 rows, 140.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 1276
00-04 Project(name=[$3], city=[$1], stars=[$2]) : rowType = RecordType(ANY name, ANY city, ANY stars): rowcount = 30.0, cumulative cost = {30.0 rows, 120.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 1275
00-05 Project(state=[$1], city=[$2], stars=[$3], name=[$0]) : rowType = RecordType(ANY state, ANY city, ANY stars, ANY name): rowcount = 30.0, cumulative cost = {30.0 rows, 120.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 1274
00-06 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tmp/BusinessParquet/0_0_114.parquet]], selectionRoot=/tmp/BusinessParquet, numFiles=1, usedMetadataFile=true, columns=[`state`, `city`, `stars`, `name`]]]) : rowType = RecordType(ANY name, ANY state, ANY city, ANY stars): rowcount = 30.0, cumulative cost = {30.0 rows, 120.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 1273

Note usedMetadataFile=true in the Scan: the Parquet metadata cache was used during planning.

Page 23: Putting Apache Drill into Production


Create Data Sources & Schemas for Fast Metadata Queries by BI Tools

• Metadata queries are very commonly used by BI/visualization tools
  – INFORMATION_SCHEMA (SHOW SCHEMAS, SHOW TABLES, ..)
  – LIMIT 0/1 queries
• Drill is a schema-less system, so metadata queries at scale might need careful consideration
• Drill provides optimized query paths to return schemas quickly wherever possible
• User-level guidelines
  – Disable unused Drill storage plugins
  – Restrict schemas via the IncludeSchemas & ExcludeSchemas flags on ODBC/JDBC connections
  – Give Drill explicit schema information via views
  – Enable metadata caching

Sample view definition with schemas:

  CREATE OR REPLACE VIEW dfs.views.stock_quotes AS
  SELECT CAST(columns[0] AS VARCHAR(6)) AS symbol,
         CAST(columns[1] AS VARCHAR(20)) AS `name`,
         CAST((TO_DATE(columns[2], 'MM/dd/yyyy')) AS DATE) AS `date`,
         CAST(columns[3] AS FLOAT) AS trade_price,
         CAST(columns[4] AS INT) AS trade_volume
  FROM dfs.csv.`/stock_quotes`;

Page 24: Putting Apache Drill into Production


Tune by Understanding Query Plans and Execution Profiles

[Visual query plan: an operator tree of Screen, Project, StreamAgg, Sort, MergeJoin, and SelectionVectorRemover operators connected by exchanges (SingleMergeExchange, HashToMergeExchange, HashToRandomExchange, UnorderedMuxExchange), each labeled with a fragment-operator id such as 00-02, 01-01, 02-05.]

Visual query plan: Drill web UI at http://localhost:8047

Page 25: Putting Apache Drill into Production


Tune by Understanding Query Plans and Execution Profiles

Visual query plan: Drill web UI at http://localhost:8047

Page 26: Putting Apache Drill into Production


Visual Query Fragment Profiles

Page 27: Putting Apache Drill into Production


Analyze detailed fragment profiles

Page 28: Putting Apache Drill into Production


Analyze detailed operator level profiles

Page 29: Putting Apache Drill into Production


Example: Handling Data Skew

Discover skew in datasets from query profiles. An example query to discover skew in a dataset:

  SELECT a1, COUNT(*) AS cnt FROM T1 GROUP BY a1 ORDER BY cnt DESC LIMIT 10;

Page 30: Putting Apache Drill into Production


Use Drill Parallelization Controls to Balance Single Query Performance with Concurrent Usage

Key setting to look for: planner.width.max_per_node

• The maximum degree of distribution of a query across cores and cluster nodes

Interpreting parallelization from query profiles
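A minimal sketch of adjusting this control (the option name is a real Drill setting; the value 8 is an arbitrary example to be tuned per workload): lowering it trades single-query speed for headroom under concurrency.

```sql
-- Limit each query to at most 8 minor fragments per node for this session,
-- leaving cores free for concurrent queries.
ALTER SESSION SET `planner.width.max_per_node` = 8;
```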

Page 31: Putting Apache Drill into Production


Use Monitoring as a First Step for Drill Cluster Management

• New JMX-based metrics, available via the Drill Web Console, Spyglass (beta), or a remote JMX monitoring tool such as JConsole
• Various system and query metrics
  – drill.queries.running
  – drill.queries.completed
  – heap.used
  – direct.used
  – waiting.count
  – …

Page 32: Putting Apache Drill into Production


Security

Page 33: Putting Apache Drill into Production


Use Drill Security Controls to Provide Granular Access
➢ End-to-end security from BI tools to Hadoop
➢ Standards-based PAM authentication
➢ Two-level user impersonation
➢ Drill respects storage-level security permissions
  – Ex: Hive authorization (SQL- and storage-based), file system permissions, MapR-DB table ACEs
➢ More fine-grained row- and column-level access control with Drill views; no centralized security repository required

Page 34: Putting Apache Drill into Production


Granular Security Permissions through Drill Views

Raw file (/raw/cards.csv) | owner: Admins | permission: Admins
  Name | City     | State | Credit Card #
  Dave | San Jose | CA    | 1374-7914-3865-4817
  John | Boulder  | CO    | 1374-9735-1794-9711

Data Scientist view (/views/maskedcards.view.drill) | owner: Admins | permission: Data Scientists
Not a physical data copy
  Name | City     | State | Credit Card #
  Dave | San Jose | CA    | 1374-1111-1111-1111
  John | Boulder  | CO    | 1374-1111-1111-1111

Business Analyst view | owner: Admins | permission: Business Analysts
  Name | City     | State
  Dave | San Jose | CA
  John | Boulder  | CO
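A minimal sketch of how such a masked view might be defined (the column positions and masking scheme are assumptions for illustration; Drill exposes CSV fields positionally through the columns array):

```sql
-- Hypothetical masked view over the raw CSV: data scientists see the
-- card's issuer prefix but not the full number. No data is copied.
CREATE OR REPLACE VIEW dfs.views.maskedcards AS
SELECT columns[0] AS `name`,
       columns[1] AS city,
       columns[2] AS state,
       -- Keep the first 5 characters, replace the rest with a fixed mask.
       CONCAT(SUBSTR(columns[3], 1, 5), '1111-1111-1111') AS credit_card
FROM dfs.`/raw/cards.csv`;
```

Granting Data Scientists read permission on the view file, while keeping /raw/cards.csv readable only by Admins, yields the row/column-level control described above without a centralized security repository.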

Page 35: Putting Apache Drill into Production


Drill Best Practices on the MapR Converge Community

https://community.mapr.com/docs/DOC-1497

Page 36: Putting Apache Drill into Production


Roadmap

Page 37: Putting Apache Drill into Production


Roadmap for 2016
• YARN integration
• Kerberos/SASL support
• Parquet reader improvements
• Improved statistics
• Query performance improvements
• Enhanced concurrency & resource management
• Deeper integrations with MapR-DB & MapR Streams
• A variety of SQL & usability features

Page 38: Putting Apache Drill into Production


Get Started with Drill Today
• Learn:
  – http://drill.apache.org
  – https://www.mapr.com/products/apache-drill
• Download the MapR Sandbox:
  – https://www.mapr.com/products/mapr-sandbox-hadoop/download-sandbox-drill
• Ask questions:
  – Ask Us Anything about Drill in the MapR Community from Wed-Fri
  – https://community.mapr.com/
  – [email protected]
• Contact us:
  – [email protected]
  – [email protected]