SQL for Everything at CWT2014

Masahiro Nakagawa Nov 6, 2014 Cloudera World Tokyo SQL for Everything Presto: Distributed SQL Query Engine

Upload: n-masahiro

Post on 03-Jul-2015

DESCRIPTION

Presto presentation at Cloudera World Tokyo 2014

TRANSCRIPT

Page 1: SQL for Everything at CWT2014

Masahiro Nakagawa
Nov 6, 2014

Cloudera World Tokyo

SQL for Everything
Presto: Distributed SQL Query Engine

Page 2: SQL for Everything at CWT2014

Who are you?

> Masahiro Nakagawa
  > github/twitter: @repeatedly
  > Ingress: Blue

> Treasure Data, Inc.
  > Senior Software Engineer
  > Fluentd / td-agent developer

> I love OSS :)
  > D language - Phobos committer
  > Fluentd - Main maintainer
  > MessagePack / RPC - D and Python (only RPC)
  > The organizer of Presto Source Code Reading
  > etc…

Page 3: SQL for Everything at CWT2014

SQL on Hadoop?

Page 4: SQL for Everything at CWT2014

SQL Players on Hadoop

Batch (latency: minutes - hours)
> Hive
> Spark SQL

Short Batch / Low latency (latency: seconds - minutes)
> Presto
> Impala
> Drill
> HAWQ
> Actian
> etc…

Stream (latency: immediate)
> Norikra
> StreamSQL

(On the original slide, a color marks the commercial products.)

Page 5: SQL for Everything at CWT2014

SQL Players on Hadoop

Batch
> Hive
> Spark SQL

Short Batch / Low latency
> Presto
> Impala
> Drill
> HAWQ
> Actian
> etc…

Red Ocean

Stream: Blue Ocean?
> Norikra
> StreamSQL

(On the original slide, a color marks the commercial products.)

Page 6: SQL for Everything at CWT2014

Presto
http://prestodb.io/

Page 7: SQL for Everything at CWT2014

Presto overview

> Open sourced by Facebook
  > https://github.com/facebook/presto
    • github is a primary
  > written in Java
  > latest version is 0.81

> Built-in useful features
  > Connectors
  > Machine Learning
  > Window function
  > Approximate query
  > etc…

Page 8: SQL for Everything at CWT2014

What’s Presto?

A distributed SQL query engine for interactive data analysis against GBs to PBs of data.

Page 9: SQL for Everything at CWT2014

What problems does it solve?

> We couldn’t visualize data in HDFS directly using dashboards or BI tools
  > because Hive is too slow (not interactive)
  > or ODBC connectivity is unavailable/unstable

> We needed to store daily-batch results in an interactive DB for quick response (PostgreSQL, Redshift, etc.)
  > Interactive DBs cost more and are less scalable

> Some data is not stored in HDFS
  > We need to copy the data into HDFS to analyze it

Page 13: SQL for Everything at CWT2014

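[Diagram: two data flows compared. In one, HDFS feeds Hive (daily/hourly batch), the results are stored in PostgreSQL, etc., and the dashboard runs interactive queries against that DB. In the other, HDFS feeds Hive (daily/hourly batch) and Presto, and the dashboard runs interactive queries through Presto.]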

Page 14: SQL for Everything at CWT2014

[Diagram: Presto as a data analysis platform. Hive runs daily/hourly batch on HDFS; Presto serves interactive queries over HDFS, Cassandra, MySQL, and commercial DBs ("SQL on any data sets") to dashboards and commercial BI tools (✓ IBM Cognos ✓ Tableau ✓ ...).]

Page 15: SQL for Everything at CWT2014

Presto’s deployment

> Facebook
  > Multiple geographical regions
  > scaled to 1,000 nodes
  > actively used by 1,000+ employees
  > processing 1PB/day

> Netflix, Dropbox, Treasure Data, Airbnb, Qubole, LINE, GREE, Scaleout, etc.

> Presto as a Service
  > Treasure Data, Qubole

Page 16: SQL for Everything at CWT2014

PostgreSQL gateway for Presto

> A PostgreSQL protocol gateway based on PostgreSQL’s stable ODBC / JDBC drivers
> Developed by Sadayuki Furuhashi

https://github.com/treasure-data/prestogres

Page 17: SQL for Everything at CWT2014

Distributed architecture

Page 18: SQL for Everything at CWT2014

[Diagram: components of a Presto cluster: Client, Coordinator, Connector Plugin, three Workers, Storage / Metadata, and Discovery Service.]

Page 19: SQL for Everything at CWT2014

What are Connectors?

> Access to storage and metadata
  > provide table schema to coordinators
  > provide table rows to workers

> Connectors are pluggable to Presto
  > written in Java

> Implementations:
  > Hive (CDH, HDP, Community), Cassandra, MySQL, JDBC, Kafka, etc…
  > Or your own connector
    • Treasure Data has its own connector
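
The two roles of a connector (table schema for the coordinator, table rows for the workers) can be pictured with a small sketch. This is not Presto's connector SPI, which is written in Java; the `Connector` protocol, the `InMemoryConnector` class, and their method names are hypothetical and exist only to illustrate the split.

```python
from typing import Dict, Iterator, List, Protocol, Tuple


class Connector(Protocol):
    """Hypothetical connector interface (NOT Presto's real Java SPI)."""

    def table_schema(self, table: str) -> Dict[str, str]:
        """Metadata side: column name -> type, used by the coordinator to plan."""
        ...

    def scan(self, table: str, columns: List[str]) -> Iterator[Tuple]:
        """Data side: rows for the requested columns, read by workers."""
        ...


class InMemoryConnector:
    """Toy connector backed by Python lists, standing in for a real data store."""

    def __init__(self, tables):
        # tables: {table name: (schema dict, list of row dicts)}
        self._tables = tables

    def table_schema(self, table):
        return dict(self._tables[table][0])

    def scan(self, table, columns):
        for row in self._tables[table][1]:
            yield tuple(row[c] for c in columns)


connector = InMemoryConnector({
    "impressions": (
        {"name": "varchar", "time": "bigint"},
        [{"name": "a", "time": 1}, {"name": "b", "time": 2}],
    )
})

print(connector.table_schema("impressions"))
print(list(connector.scan("impressions", ["name"])))
```

Any data store that can answer these two questions could, in principle, sit behind the same SQL layer, which is the idea behind "or your own connector".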

Page 20: SQL for Everything at CWT2014

[Diagram: Multiple connectors in a query. Client, Coordinator, three Workers, and the Discovery Service (find servers in a cluster). The Hive Connector reads HDFS / Metastore, the Cassandra Connector reads Cassandra, and other connectors read other data sources...]

Page 21: SQL for Everything at CWT2014

Distributed architecture

> 3 types of servers:
  > Coordinator, worker, discovery service

> Get data/metadata through connector plugins
  > Presto is NOT a database
  > Presto provides SQL to existing data stores

> Client protocol is HTTP + JSON
  > Language bindings: Ruby, Python, PHP, Java (JDBC), R, Node.js...
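
As a rough sketch of what "HTTP + JSON" means for a client: the statement is submitted with a POST, and the client then follows a link in each JSON response until no further link is returned, collecting rows along the way. The `/v1/statement` path, the `X-Presto-User` header, and the `nextUri` / `data` / `error` fields reflect my understanding of Presto's REST protocol and should be checked against the version in use. The `transport` callable and the fake coordinator are hypothetical, included only so the sketch runs without a cluster.

```python
import json
from typing import Callable, Dict, List, Optional

# transport(method, url, headers, body) -> response body as a JSON string.
# A real client would implement this with urllib or another HTTP library.
Transport = Callable[[str, str, Dict[str, str], Optional[str]], str]


def run_query(transport: Transport, server: str, sql: str, user: str = "demo") -> List[list]:
    """Submit a statement, then follow nextUri links until the result is complete."""
    headers = {"X-Presto-User": user}  # assumed header name
    response = json.loads(transport("POST", server + "/v1/statement", headers, sql))
    rows: List[list] = []
    while True:
        if "error" in response:
            raise RuntimeError(response["error"].get("message", "query failed"))
        rows.extend(response.get("data", []))  # a page may carry no rows yet
        next_uri = response.get("nextUri")
        if next_uri is None:
            return rows
        response = json.loads(transport("GET", next_uri, headers, None))


def fake_transport(method, url, headers, body):
    """Hypothetical stand-in for a coordinator: answers with two result pages."""
    if method == "POST":
        return json.dumps({"id": "q1", "nextUri": "http://fake/v1/statement/q1/1"})
    if url.endswith("/1"):
        return json.dumps({"id": "q1", "data": [["a", 2]],
                           "nextUri": "http://fake/v1/statement/q1/2"})
    return json.dumps({"id": "q1", "data": [["b", 1]]})


result = run_query(fake_transport, "http://fake",
                   "SELECT name, count(*) AS c FROM impressions GROUP BY name")
print(result)
```

Because the protocol is plain HTTP and JSON, a binding in any of the listed languages mostly amounts to a loop like this one.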

Page 22: SQL for Everything at CWT2014

Presto’s execution model

> Presto is NOT MapReduce
  > Uses its own execution engine

> Presto’s query plan is based on DAG
  > more like Apache Tez / Spark or traditional MPP databases
  > Impala and Drill use a similar model

Page 23: SQL for Everything at CWT2014

Query Planner

SQL:
SELECT name, count(*) AS c FROM impressions GROUP BY name

Table schema:
impressions (name varchar, time bigint)

Logical query plan (SQL + table schema):
Table scan (name:varchar) → GROUP BY (name, count(*)) → Output (name, c)

Distributed query plan:
Table scan → Partial aggr → Sink → Exchange → Final aggr → Sink → Exchange → Output
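
The partial / final aggregation split in the distributed plan can be sketched in a few lines. This is an illustration of the general technique for the `GROUP BY name` query on this slide, not Presto's implementation; the function names, the sample rows, and the hash-based routing in `exchange` are assumptions.

```python
from collections import Counter, defaultdict


def partial_aggregate(rows):
    """Runs next to the table scan: count names within one worker's share of rows."""
    return Counter(name for name, _time in rows)


def exchange(partials, num_workers):
    """Route each (name, count) pair to one worker, chosen by hashing the name,
    so that all partial counts for a given name meet on the same worker."""
    buckets = defaultdict(list)
    for partial in partials:
        for name, count in partial.items():
            buckets[hash(name) % num_workers].append((name, count))
    return buckets


def final_aggregate(pairs):
    """Sum the partial counts received for each name."""
    total = Counter()
    for name, count in pairs:
        total[name] += count
    return total


# Two hypothetical splits of the impressions table: (name, time) rows.
splits = [
    [("a", 1), ("b", 2), ("a", 3)],
    [("a", 4), ("c", 5)],
]

partials = [partial_aggregate(split) for split in splits]
buckets = exchange(partials, num_workers=2)

output = {}
for pairs in buckets.values():
    output.update(final_aggregate(pairs))

print(output)
```

Counting locally first means only one `(name, count)` pair per name and per split crosses the network, rather than every row.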

Page 24: SQL for Everything at CWT2014

Query Planner - Stages

The distributed query plan is split into three stages (Stage-0, Stage-1, Stage-2):

> Table scan → Partial aggr → Sink
> Exchange → Final aggr → Sink
> Exchange → Output

Labels on the slide: inter-worker data transfer (twice), pipelined aggregation.

Page 25: SQL for Everything at CWT2014

Execution Planner

[Diagram: the distributed query plan plus the node list (✓ 2 workers) gives the execution plan. Worker 1 and Worker 2 each run Table scan → Partial aggr → Sink and Exchange → Final aggr → Sink; Exchange → Output appears once.]

Page 26: SQL for Everything at CWT2014

MapReduce vs. Presto

MapReduce
> map and reduce tasks with disk between them
> Write data to disk
> Wait between stages

Presto
> All stages are pipelined
  ✓ No wait time
  ✓ No fault-tolerance
> Memory-to-memory data transfer
  ✓ No disk IO
  ✓ Data chunk must fit in memory
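
The difference between waiting between stages and pipelining can be shown with a small single-process analogy. Python generators stand in for tasks here, and a list stands in for the intermediate data written to disk; the stage functions `scan` and `double` are hypothetical, and nothing below reflects how either system is actually implemented.

```python
def scan(rows, log):
    """First stage: emit rows one at a time, recording when each is produced."""
    for row in rows:
        log.append(("scan", row))
        yield row


def double(rows, log):
    """Second stage: transform rows, recording when each is consumed."""
    for row in rows:
        log.append(("double", row))
        yield row * 2


def run_staged(rows):
    """MapReduce-like: a stage finishes and materializes its output before the next starts."""
    log = []
    stage1 = list(scan(rows, log))       # stands in for writing to disk
    stage2 = list(double(stage1, log))
    return stage2, log


def run_pipelined(rows):
    """Presto-like: the downstream stage consumes each row as soon as it is produced."""
    log = []
    result = list(double(scan(rows, log), log))
    return result, log


staged_result, staged_log = run_staged([1, 2])
piped_result, piped_log = run_pipelined([1, 2])

print(staged_log)
print(piped_log)
```

Both runs return the same rows, but the logs differ: the staged run finishes every `scan` before the first `double`, while the pipelined run interleaves them. The pipelined version keeps no intermediate copy, which mirrors the slide's trade-off: no wait time and no disk IO, but also nothing to restart from if a task fails.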

Page 27: SQL for Everything at CWT2014

Demo

Page 28: SQL for Everything at CWT2014

Presto Meetup

The first half of 2015

Page 29: SQL for Everything at CWT2014

Check: treasuredata.com

Cloud service for the entire data pipeline, including Presto