Impala Overview, Dogenzaka LT Matsuri (道玄坂LT祭り), 2015-03-12 #dogenzakalt
TRANSCRIPT
-
1 Cloudera, Inc. All rights reserved.
Impala - Hadoop , Cloudera
-
Joined Cloudera in April 2011
email: [email protected] twitter: @shiumachi
-
Hadoop
BISQL
Hadoop
Hadoop
-
BI /
Sqoop, Flume
MapReduce, Hive, Pig, Spark
SAS, R, Spark, Mahout
NoSQL HBase
Spark Streaming
Impala
Solr
HDFS, HBase
YARN, Cloudera Manager,Cloudera Navigator
-
Cloudera Impala
MPP SQL engine for Hadoop: http://impala.io/
Cloudera / MapR / Amazon / Oracle
HDFS, HBase, Hive
ODBC / JDBC
Kerberos / LDAP
-
Impala architecture (diagram)
SQL App, connecting over ODBC / JDBC
Three worker nodes, each running: Query Planner, Query Coordinator, Query Exec Engine, alongside HDFS DN and HBase
Shared services: Hive Metastore, HDFS NN, State Store, Catalogd
-
Impala 1.x
Impala 1.0 (2013/04): SQL-92; Hadoop; Parquet, Avro, SequenceFile; Kerberos; ODBC / JDBC
Impala 1.1: Apache Sentry, RBAC
Impala 1.2: UDF / UDAF, JOIN
Impala 1.3 / CDH 5.0
Impala 1.4 / CDH 5.1 (2014/07): SQL (DECIMAL, ORDER BY without LIMIT, etc.), HDFS
-
Impala 2.0 (2014/10)
SQL: SQL:2003; subqueries (WHERE, EXISTS, IN); CHAR / VARCHAR; GRANT / REVOKE (Sentry)
Hash tables spill to disk: join and aggregate tables
-
SQL-on-Hadoop (2014/09)
Impala 1.4.0, Presto 0.74, Stinger phase 3 (Hive 0.13.0), Spark SQL 1.1
TPC-DS; Impala TPC-DS kit: https://github.com/cloudera/impala-tpcds-kit
SQL-92 JOIN Presto JVM
http://blog.cloudera.com/blog/2014/09/new-benchmarks-for-sql-on-hadoop-impala-1-4-widens-the-performance-gap/
-
Impala :
-
Impala :
-
Analytic / window functions
Added in 2.0
RANK() / DENSE_RANK()
FIRST_VALUE() / LAST_VALUE()
LAG() / LEAD()
ROW_NUMBER()
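The difference between ROW_NUMBER(), RANK() and DENSE_RANK() is easiest to see with ties. The following is a minimal plain-Python sketch of their semantics over a single partition ordered descending. It is not Impala code; the function name `ranking` and the sample scores are made up for illustration.

```python
def ranking(values):
    """Return (value, row_number, rank, dense_rank) for values ordered descending.

    Illustrative only: mimics the SQL semantics for one partition.
    """
    ordered = sorted(values, reverse=True)
    out = []
    rank = 0          # RANK(): ties share a rank, gaps follow the ties
    dense_rank = 0    # DENSE_RANK(): ties share a rank, no gaps
    previous = object()  # sentinel that never equals a real value
    for row_number, value in enumerate(ordered, start=1):
        if value != previous:
            rank = row_number
            dense_rank += 1
            previous = value
        out.append((value, row_number, rank, dense_rank))
    return out


if __name__ == "__main__":
    # Hypothetical sample data with one tie.
    for row in ranking([90, 85, 85, 70]):
        print(row)
```

With the tie on 85, ROW_NUMBER() keeps counting (2, 3), RANK() repeats 2 and then skips to 4, and DENSE_RANK() repeats 2 and continues with 3.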
-
select stock_symbol, closing_date, closing_price,
  lag(closing_price,1) over (partition by stock_symbol order by closing_date) as "yesterday closing"
  from stock_ticker
  order by closing_date;
+--------------+---------------------+---------------+-------------------+
| stock_symbol | closing_date        | closing_price | yesterday closing |
+--------------+---------------------+---------------+-------------------+
| JDR          | 2014-09-13 00:00:00 | 12.86         | NULL              |
| JDR          | 2014-09-14 00:00:00 | 12.89         | 12.86             |
| JDR          | 2014-09-15 00:00:00 | 12.94         | 12.89             |
| JDR          | 2014-09-16 00:00:00 | 12.55         | 12.94             |
| JDR          | 2014-09-17 00:00:00 | 14.03         | 12.55             |
| JDR          | 2014-09-18 00:00:00 | 14.75         | 14.03             |
| JDR          | 2014-09-19 00:00:00 | 13.98         | 14.75             |
+--------------+---------------------+---------------+-------------------+
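To make the LAG() result above concrete, here is a small plain-Python sketch of what `lag(closing_price, 1) over (partition by stock_symbol order by closing_date)` computes. It is not how Impala executes the query; the helper name `lag` and the `lagged` output key are assumptions chosen for this example, and the rows are the ones shown on the slide.

```python
from collections import defaultdict


def lag(rows, partition_key, order_key, value_key, offset=1):
    """Attach the value from `offset` rows earlier within each partition.

    Rows with no earlier row get None, matching SQL's NULL.
    """
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[partition_key]].append(row)

    result = []
    for part in partitions.values():
        part.sort(key=lambda r: r[order_key])  # ORDER BY inside the window
        for i, row in enumerate(part):
            previous = part[i - offset][value_key] if i >= offset else None
            result.append({**row, "lagged": previous})
    return result


# Rows copied from the slide's result table (dates kept as ISO strings).
stock_ticker = [
    {"stock_symbol": "JDR", "closing_date": "2014-09-13", "closing_price": 12.86},
    {"stock_symbol": "JDR", "closing_date": "2014-09-14", "closing_price": 12.89},
    {"stock_symbol": "JDR", "closing_date": "2014-09-15", "closing_price": 12.94},
    {"stock_symbol": "JDR", "closing_date": "2014-09-16", "closing_price": 12.55},
    {"stock_symbol": "JDR", "closing_date": "2014-09-17", "closing_price": 14.03},
    {"stock_symbol": "JDR", "closing_date": "2014-09-18", "closing_price": 14.75},
    {"stock_symbol": "JDR", "closing_date": "2014-09-19", "closing_price": 13.98},
]

if __name__ == "__main__":
    for r in lag(stock_ticker, "stock_symbol", "closing_date", "closing_price"):
        print(r["stock_symbol"], r["closing_date"], r["closing_price"], r["lagged"])
```

The first row of each partition has no predecessor, which is why the slide shows NULL for 2014-09-13.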
-
HBase integration: Impala supports SELECT and INSERT on HBase tables
ImpalaHBase
HBase : WebPVSNS
()HBase :
1 INSERT VALUES
Impala HBase external systems
put SELECT * FROM hbase_tbl
INSERT / INSERT VALUES get, scan
-
impalad
SPOF
-
2 ways to configure: Cloudera Manager, or fair-scheduler.xml and llama-site.xml
-
100 10
10 1
1000 GB
100 GB
Group A
Group B
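The per-group limits on this slide correspond to the kind of resource pools that fair-scheduler.xml and llama-site.xml configure for Impala admission control. As a rough sketch of the idea only: a query is admitted while the pool is under its concurrency limit, queued while the queue has room, and rejected otherwise. The `Pool` class, its field names, and the limits used below are illustrative assumptions, not Impala's implementation and not a reading of the slide's figures; memory limits and queue timeouts are left out.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Pool:
    """Toy model of one admission-control pool (hypothetical, for illustration)."""
    name: str
    max_running: int   # assumed per-pool concurrency limit
    max_queued: int    # assumed per-pool queue length limit
    running: int = 0
    queue: deque = field(default_factory=deque)

    def submit(self, query_id):
        """Return 'admitted', 'queued' or 'rejected' for a new query."""
        if self.running < self.max_running:
            self.running += 1
            return "admitted"
        if len(self.queue) < self.max_queued:
            self.queue.append(query_id)
            return "queued"
        return "rejected"

    def finish(self):
        """Mark one running query as done; admit and return the next queued id, if any."""
        if self.running == 0:
            raise ValueError("no running query to finish")
        self.running -= 1
        if self.queue:
            self.running += 1
            return self.queue.popleft()
        return None


if __name__ == "__main__":
    # Pool name taken from the slide; the limits are made-up example values.
    group_b = Pool("Group B", max_running=2, max_queued=1)
    for query_id in range(4):
        print(query_id, group_b.submit(query_id))
    print("next admitted:", group_b.finish())
```

Separate pools with different limits let a heavy group and a light group share one cluster without one starving the other.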
-
Hue Web UI (CDH)
-
BI tools over JDBC / ODBC
MicroStrategy, QlikView, SAS, Tableau
: https://zoomdata.zendesk.com/hc/en-us/articles/203813488-Date-and-Time-Formats-Supported-By-Zoomdata
-
Impala ()
http://demo.gethue.com/
Quick Start VM (VM): http://www.cloudera.com/content/cloudera/en/downloads/quickstart_vms/cdh-5-3-x.html
Cloudera Live ((14) 4; Tableau, ZoomData): http://www.cloudera.com/content/cloudera/en/products-and-services/cloudera-live.html
Cloudera Director (AWS): http://www.cloudera.com/content/cloudera/en/downloads/cloudera-director/1-1-0.html
Amazon EMR: http://docs.aws.amazon.com/ja_jp/ElasticMapReduce/latest/DeveloperGuide/emr-impala.html
-
Thank you
-
Impala
http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/impala_cluster_sizing.html
: CPU1264GB2TB HDD x 121015TB2020
-
Impala
:
10: http://www.slideshare.net/cloudera/the-impala-cookbook-42530186
Parquet read-once SequenceFile + Snappy
-
http://www.cidrdb.org/cidr2015/Papers/CIDR15_Paper28.pdf
-
http://www.vldb.org/pvldb/vol7/p1295-floratou.pdf