Big Data OLAP Benchmark (December 2018)
Apache Kylin vs Vertica and PostgreSQL
New Big Data OLAP technologies for BI have appeared, among them Apache Kylin, Vertica, Druid, Google BigQuery and Amazon Redshift, all of which are queried with SQL.

This benchmark evaluates two of these Big Data OLAP technologies, Apache Kylin and Vertica, and compares them with PostgreSQL.
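OLAP technologies

The benchmark covers two Big Data OLAP technologies, Apache Kylin and Vertica. PostgreSQL is included as a reference point for OLAP on a conventional relational database, against which Kylin and Vertica are compared.

APACHE KYLIN

Apache Kylin is an open-source distributed analytics engine that provides a SQL interface and multidimensional (OLAP) analysis on top of Hadoop/Spark. It was originally developed at eBay and was released in 2014 as an open-source Apache project hosted on GitHub.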
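
Kylin's main features:
• Low query latency on very large tables (100 [...] rows)
• SQL query interface
• J/ODBC connectors for BI tools such as Power BI, Tableau, Pentaho, Zeppelin, Superset and Microstrategy
• Storage on Hadoop (HDFS) or Amazon S3
• Web interface and RESTful API, with Cube definitions exchanged as JSON
• Security through LDAP and Apache Ranger, at Cube and Schema level

A commercial distribution of Kylin is available as Kyligence Enterprise.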
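
The RESTful API mentioned in the feature list can be illustrated with a minimal sketch that builds a SQL query request. The `/kylin/api/query` path and the JSON fields follow Kylin's documented query endpoint as far as we know it. The host name, project name and credentials are hypothetical placeholders, and the request is only built, not sent, so the sketch runs without a Kylin server.

```python
import base64
import json
import urllib.request


def build_kylin_query_request(base_url, project, sql, user, password, limit=100):
    """Build (but do not send) a POST request for Kylin's SQL query endpoint."""
    payload = {
        "sql": sql,
        "project": project,
        "offset": 0,
        "limit": limit,
        "acceptPartial": False,
    }
    # Kylin's REST API uses HTTP Basic authentication.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/kylin/api/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


# Hypothetical host, project and credentials, for illustration only.
request = build_kylin_query_request(
    "http://kylin.example.com:7070",
    "ssb_project",
    "select count(*) from SSB.p_lineorder",
    "analyst",
    "secret",
)
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
```

A BI tool would normally go through the J/ODBC connectors instead. The REST route is mostly useful for scripts and quick checks.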
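
Figure 1 shows the Apache Kylin architecture. Kylin reads its source data from a data warehouse (DW) in Apache Hive or from relational databases such as SQL Server, MySQL or Vertica. Kylin can also ingest streaming data from Apache Kafka.

The OLAP Cube is designed through the Apache Kylin Web interface. Because the Cube is built in advance and stores precomputed aggregates, Kylin is a multidimensional OLAP (M-OLAP) engine.

Figure 1. Apache Kylin architecture

Once the Cube has been built, SQL queries are answered from the precomputed OLAP Cube instead of the source tables, with response times that are typically under 1 second.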
� 200� 尽 � Kylin 务 � Kylin� 尽 务 �StrateBI �
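
The M-OLAP idea described above, answering a query from aggregates computed ahead of time, can be sketched in a few lines. This is a toy illustration under our own assumptions, not Kylin's implementation: the function names and the sample rows are hypothetical, and a real Cube is stored and pruned very differently.

```python
from collections import defaultdict
from itertools import combinations


def build_cube(rows, dimensions, measure):
    """Precompute SUM(measure) for every subset of the dimensions (every cuboid)."""
    cube = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            totals = defaultdict(int)
            for row in rows:
                key = tuple(row[d] for d in dims)
                totals[key] += row[measure]
            cube[dims] = dict(totals)
    return cube


def query_cube(cube, dimensions, group_by):
    """Answer a GROUP BY query with a lookup instead of a scan of the rows."""
    # Cuboid keys follow the order of `dimensions`, so normalise the request.
    dims = tuple(d for d in dimensions if d in group_by)
    return cube[dims]


# Hypothetical sample rows using SSB-style column names.
rows = [
    {"d_year": 1997, "s_region": "ASIA", "lo_revenue": 10},
    {"d_year": 1997, "s_region": "EUROPE", "lo_revenue": 20},
    {"d_year": 1998, "s_region": "ASIA", "lo_revenue": 5},
]
dimensions = ["d_year", "s_region"]
cube = build_cube(rows, dimensions, "lo_revenue")

by_year = query_cube(cube, dimensions, ["d_year"])
grand_total = query_cube(cube, dimensions, [])
```

The trade-off is visible even at this size: n dimensions produce 2^n cuboids, so queries become lookups while the build step and the storage grow with the number of dimensions.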
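VERTICA

Vertica is a columnar analytic database with a massively parallel processing (MPP) architecture. Like Kylin, Vertica is designed for OLAP over large data volumes.

It was founded in 2005 by Michael Stonebraker as the commercial version of the C-Store DBMS research project. Vertica was acquired by HP in 2011 and became part of Micro Focus in 2017. Vertica is now a Micro Focus product.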

Unlike Kylin, Vertica does not require Hadoop. Vertica's main features:
• SQL query interface
• J/ODBC connectors for BI tools such as Power BI, Tableau, Pentaho, Zeppelin, Superset and Microstrategy
• Web interface
• Security defined at Schema level, with LDAP integration
• User-defined functions (UDF) written in SQL, Python, C++, Java or R
• Free Community Edition limited to 3 nodes and 1 TB of data

Figure 2 shows the Vertica architecture. Vertica does not build OLAP Cubes. Data is loaded into a Schema from CSV files, through ETL tools such as Pentaho or Talend, or from Kafka.

Once the Schema is loaded, it can be queried with SQL from BI tools such as Tableau, Power BI or Pentaho.

Where Kylin relies on the precomputed Cube, Vertica relies on Projections defined over the Schema, which makes it a hybrid OLAP (H-OLAP) engine.

Figure 2. Vertica architecture

Vertica can also work with Hadoop data in Hive or HDFS and can be used from Spark.

StrateBI is a Vertica partner.
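
Benchmark methodology

TPC-H is a decision-support benchmark used by vendors such as Oracle and Microsoft. This study uses the Star Schema Benchmark (SSB), which is derived from TPC-H.

Figure 3 shows the SSB data model. SSB reorganizes the TPC-H tables into a star schema and defines 13 SQL queries that combine group by, where and join clauses. The queries are listed in Annex 1.

The SSB implementation used here is the one Kyligence provides for Kylin on Hive. It was adapted to run on Vertica and PostgreSQL.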
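Figure 3. SSB data model

The SSB model consists of the fact table LINEORDER and the dimension tables DATE, CUSTOMER, SUPPLIER and PART. The OLAP model built for the benchmark uses all five: LINEORDER, CUSTOMER, PART, SUPPLIER and DATE.

SSB data can be generated at different scales, which changes the volume of the fact table and the cardinality of the dimensions.

PostgreSQL is included to represent OLAP on a traditional relational database.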
Scale     LINEORDER        CUSTOMER   PART     SUPPLIER   DATE
          (facts, KPI)
100M      100,000,000      40,000     32,000   20,000     2,556
500M      500,000,000      200,000    48,000   100,000    2,556
1,000M    1,000,000,000    400,000    56,000   200,000    2,556

Table 1. Number of rows per table at the 3 data scales
The environment used for each of the 3 OLAP systems is listed in Table 2.

System           Nodes   CPU                                      Cores   RAM
Kylin 2.4        3       Intel(R) Atom(TM) CPU C2750 @ 2.40GHz    8       32 GB
Vertica 9.1      3       Intel(R) Atom(TM) CPU C2750 @ 2.40GHz    8       32 GB
PostgreSQL 9.6   1       Intel(R) Atom(TM) CPU C2750 @ 2.40GHz    8       32 GB

Table 2. Software versions and hardware used in the benchmark
Kylin and Vertica each run on a cluster of 3 nodes, while PostgreSQL runs on a single node.

Vertica is installed directly on its nodes. Kylin runs on a Hortonworks Hadoop cluster.

The SSB model was implemented in each system as follows:
• Apache Kylin
  o An OLAP Cube (M-OLAP) defined with Normal dimensions.
• Vertica
  o A Vertica Schema with Projections designed for the OLAP queries.
• PostgreSQL
  o A relational schema queried directly (R-OLAP).
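
The SSB queries were run on Kylin, Vertica and PostgreSQL at each of the three data scales.

The response times in Table 3 are rated against four thresholds, listed from best to worst for interactive OLAP analysis:
• < 5 seconds
• < 10 seconds
• < 20 seconds
• >= 20 seconds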
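
As a minimal sketch, the four thresholds above can be turned into a small helper that assigns a measured time to its band. The function name and the band labels are our own, and the unit is assumed to be seconds. The sample values are taken from the results table.

```python
def response_band(seconds):
    """Return the threshold band that a response time (in seconds) falls into."""
    if seconds < 5:
        return "< 5 s"
    if seconds < 10:
        return "< 10 s"
    if seconds < 20:
        return "< 20 s"
    return ">= 20 s"


# Sample values copied from the results table.
samples = {
    "Kylin Q4.3 at 100M": 2.5,
    "Vertica Q2.1 at 1,000M": 9.1,
    "Vertica Q3.1 at 1,000M": 15.1,
    "PostgreSQL Q1.1 at 100M": 22.4,
}
bands = {name: response_band(value) for name, value in samples.items()}
for name, band in bands.items():
    print(f"{name}: {band}")
```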
Response times in seconds. +280 marks a query that ran for more than 280 seconds. - marks a query with no recorded result.

         100M rows                      500M rows                      1,000M rows
Query    Kylin  Vertica  PostgreSQL     Kylin  Vertica  PostgreSQL     Kylin  Vertica  PostgreSQL
Q1.1     0.2    0.2      22.4           0.3    0.3      +280           0.6    0.6      -
Q1.2     0.2    0.4      18.7           0.3    0.2      +280           0.5    0.3      -
Q1.3     0.2    0.4      18.5           0.3    0.3      +280           0.6    0.2      -
Q2.1     0.3    1.1      18.1           0.4    2.7      +280           0.6    9.1      -
Q2.2     0.3    0.8      16.3           0.4    2.7      +280           0.7    8.2      -
Q2.3     0.3    0.8      15.2           0.4    2.2      +280           0.6    7.4      -
Q3.1     0.3    1.4      23.9           0.4    3.7      +280           0.8    15.1     -
Q3.2     0.6    0.7      18.5           0.8    0.7      +280           0.9    9.8      -
Q3.3     0.3    0.9      15.8           0.3    0.6      +280           0.7    3.7      -
Q3.4     0.2    0.6      15.9           0.2    0.2      +280           0.2    1.0      -
Q4.1     0.3    1.4      23.7           0.4    7.3      +280           0.7    14.7     -
Q4.2     0.3    1.0      23.3           0.4    2.0      +280           0.7    3.8      -
Q4.3     2.5    0.8      17.1           2.4    1.3      +280           2.9    2.0      -

Table 3. Query response times at the 3 data scales
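
To make the gap between the two Big Data engines concrete, the sketch below computes the Vertica-to-Kylin response time ratio per query at the largest scale. The figures are copied from the 1,000M columns of Table 3. The variable and function names are our own, and since the published times are rounded to one decimal, the ratios are approximate.

```python
# Response times at 1,000M rows, copied from Table 3 as (Kylin, Vertica).
TIMES_1000M = {
    "Q1.1": (0.6, 0.6), "Q1.2": (0.5, 0.3), "Q1.3": (0.6, 0.2),
    "Q2.1": (0.6, 9.1), "Q2.2": (0.7, 8.2), "Q2.3": (0.6, 7.4),
    "Q3.1": (0.8, 15.1), "Q3.2": (0.9, 9.8), "Q3.3": (0.7, 3.7),
    "Q3.4": (0.2, 1.0), "Q4.1": (0.7, 14.7), "Q4.2": (0.7, 3.8),
    "Q4.3": (2.9, 2.0),
}


def vertica_to_kylin_ratio(times):
    """Ratio above 1 means Kylin was faster; below 1 means Vertica was faster."""
    return {query: vertica / kylin for query, (kylin, vertica) in times.items()}


ratios = vertica_to_kylin_ratio(TIMES_1000M)
over_10x = sorted(q for q, r in ratios.items() if r > 10)
vertica_faster = sorted(q for q, r in ratios.items() if r < 1)

for query, ratio in ratios.items():
    print(f"{query}: {ratio:.1f}x")
```

With these figures, six queries come out more than 10 times faster on Kylin (Q2.1, Q2.2, Q2.3, Q3.1, Q3.2 and Q4.1), while Vertica is the faster engine on Q1.2, Q1.3 and Q4.3.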
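Kylin gives the best overall results. Every query finishes in under 5 seconds at all three scales, and every query except Q4.3 finishes in under 1 second.

Vertica keeps every query under 5 seconds at 100M rows. At 500M rows only Q4.1 takes more than 5 seconds. At 1,000M rows six queries take more than 5 seconds, and two of them (Q3.1 and Q4.1) take more than 10 seconds.

PostgreSQL needs between 15 and 25 seconds per query at 100M rows. At 500M rows every query ran for more than 280 seconds, and no PostgreSQL times are reported at 1,000M rows. The rest of the comparison therefore concentrates on Vertica and Kylin.

Figure 4. Query response times at the 3 data scales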
Figure 5 compares Kylin and Vertica.

Figure 5. Kylin vs Vertica

Kylin [...] I/O [...] Vertica [...] 10 [...].

The two systems also differ in how the data is prepared: Vertica loads a Schema, whereas Kylin builds a Cube. The Kylin Cube is built with Hadoop MapReduce jobs on the Hadoop cluster.

At the largest scale the figures reported are 16 [...] for Kylin against 2 [...] for Vertica. Kylin [...] 80% [...] Hadoop [...] 40% [...] Vertica.

Kylin can also build Cubes with Spark instead of MapReduce.
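
Both Big Data OLAP technologies in this benchmark, Kylin and Vertica, clearly outperform PostgreSQL.

Kylin and Vertica are both suitable for interactive OLAP analysis.

With 1,000M rows in LINEORDER, Kylin answers several queries more than 10 times faster than Vertica.

Vertica's free Community Edition is limited to 3 nodes and 1 TB of data.

Kylin requires a Hadoop cluster and fits well with other Hadoop ecosystem tools such as Spark and Kafka. Where no Hadoop cluster is available, Vertica is the simpler choice because it does not depend on Hadoop.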
Annex 1: SSB queries

Q1.1
select sum(v_revenue) as revenue
from SSB.p_lineorder
left join SSB.dates on lo_orderdate = d_datekey
where d_year = 1993
and lo_discount between 1 and 3
and lo_quantity < 25;

Q1.2
select sum(v_revenue) as revenue
from SSB.p_lineorder
left join SSB.dates on lo_orderdate = d_datekey
where d_yearmonthnum = 199401
and lo_discount between 4 and 6
and lo_quantity between 26 and 35;

Q1.3
select sum(v_revenue) as revenue
from SSB.p_lineorder
left join SSB.dates on lo_orderdate = d_datekey
where d_weeknuminyear = 6 and d_year = 1994
and lo_discount between 5 and 7
and lo_quantity between 26 and 35;
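
The first query flight can be tried on a tiny data set with Python's built-in sqlite3 module. This is only a sketch to show what Q1.1 selects, not the benchmark setup. The `SSB.` schema prefix is dropped, the four sample rows are invented, and the definition of `v_revenue` as `lo_extendedprice * lo_discount` is an assumption about how the `p_lineorder` view is built.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table dates (d_datekey integer primary key, d_year integer);
create table lineorder (lo_orderdate integer, lo_discount integer,
                        lo_quantity integer, lo_extendedprice integer);
-- Assumed definition of the p_lineorder view and its v_revenue column.
create view p_lineorder as
  select *, lo_extendedprice * lo_discount as v_revenue from lineorder;
insert into dates values (19930101, 1993), (19940101, 1994);
insert into lineorder values
  (19930101, 2, 10, 100),   -- selected: 100 * 2 = 200
  (19930101, 5, 10, 100),   -- discount outside 1..3
  (19930101, 3, 30, 100),   -- quantity not below 25
  (19940101, 2, 10, 100);   -- wrong year
""")

q1_1 = """
select sum(v_revenue) as revenue
from p_lineorder
left join dates on lo_orderdate = d_datekey
where d_year = 1993
  and lo_discount between 1 and 3
  and lo_quantity < 25
"""
revenue = conn.execute(q1_1).fetchone()[0]
print(revenue)  # 200
```

Only the first sample row passes all three filters, so the sum is that row's 100 * 2.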

Q2.1
select sum(lo_revenue) as lo_revenue, d_year, p_brand
from SSB.p_lineorder
left join SSB.dates on lo_orderdate = d_datekey
left join SSB.part on lo_partkey = p_partkey
left join SSB.supplier on lo_suppkey = s_suppkey
where p_category = 'MFGR#12' and s_region = 'AMERICA'
group by d_year, p_brand
order by d_year, p_brand;

Q2.2
select sum(lo_revenue) as lo_revenue, d_year, p_brand
from SSB.p_lineorder
left join SSB.dates on lo_orderdate = d_datekey
left join SSB.part on lo_partkey = p_partkey
left join SSB.supplier on lo_suppkey = s_suppkey
where p_brand between 'MFGR#2221' and 'MFGR#2228' and s_region = 'ASIA'
group by d_year, p_brand
order by d_year, p_brand;

Q2.3
select sum(lo_revenue) as lo_revenue, d_year, p_brand
from SSB.p_lineorder
left join SSB.dates on lo_orderdate = d_datekey
left join SSB.part on lo_partkey = p_partkey
left join SSB.supplier on lo_suppkey = s_suppkey
where p_brand = 'MFGR#2239' and s_region = 'EUROPE'
group by d_year, p_brand
order by d_year, p_brand;

Q3.1
select c_nation, s_nation, d_year, sum(lo_revenue) as lo_revenue
from SSB.p_lineorder
left join SSB.dates on lo_orderdate = d_datekey
left join SSB.customer on lo_custkey = c_custkey
left join SSB.supplier on lo_suppkey = s_suppkey
where c_region = 'ASIA' and s_region = 'ASIA' and d_year >= 1992 and d_year <= 1997
group by c_nation, s_nation, d_year
order by d_year asc, lo_revenue desc;

Q3.2
select c_city, s_city, d_year, sum(lo_revenue) as lo_revenue
from SSB.p_lineorder
left join SSB.dates on lo_orderdate = d_datekey
left join SSB.customer on lo_custkey = c_custkey
left join SSB.supplier on lo_suppkey = s_suppkey
where c_nation = 'UNITED STATES' and s_nation = 'UNITED STATES'
and d_year >= 1992 and d_year <= 1997
group by c_city, s_city, d_year
order by d_year asc, lo_revenue desc;

Q3.3
select c_city, s_city, d_year, sum(lo_revenue) as lo_revenue
from SSB.p_lineorder
left join SSB.dates on lo_orderdate = d_datekey
left join SSB.customer on lo_custkey = c_custkey
left join SSB.supplier on lo_suppkey = s_suppkey
where (c_city='UNITED KI1' or c_city='UNITED KI5')
and (s_city='UNITED KI1' or s_city='UNITED KI5')
and d_year >= 1992 and d_year <= 1997
group by c_city, s_city, d_year
order by d_year asc, lo_revenue desc;

Q3.4
select c_city, s_city, d_year, sum(lo_revenue) as lo_revenue
from SSB.p_lineorder
left join SSB.dates on lo_orderdate = d_datekey
left join SSB.customer on lo_custkey = c_custkey
left join SSB.supplier on lo_suppkey = s_suppkey
where (c_city='UNITED KI1' or c_city='UNITED KI5')
and (s_city='UNITED KI1' or s_city='UNITED KI5')
and d_yearmonth = 'Dec1997'
group by c_city, s_city, d_year
order by d_year asc, lo_revenue desc;

Q4.1
select d_year, c_nation, sum(lo_revenue) - sum(lo_supplycost) as profit
from SSB.p_lineorder
left join SSB.dates on lo_orderdate = d_datekey
left join SSB.customer on lo_custkey = c_custkey
left join SSB.supplier on lo_suppkey = s_suppkey
left join SSB.part on lo_partkey = p_partkey
where c_region = 'AMERICA' and s_region = 'AMERICA'
and (p_mfgr = 'MFGR#1' or p_mfgr = 'MFGR#2')
group by d_year, c_nation
order by d_year, c_nation;

Q4.2
select d_year, s_nation, p_category, sum(lo_revenue) - sum(lo_supplycost) as profit
from SSB.p_lineorder
left join SSB.dates on lo_orderdate = d_datekey
left join SSB.customer on lo_custkey = c_custkey
left join SSB.supplier on lo_suppkey = s_suppkey
left join SSB.part on lo_partkey = p_partkey
where c_region = 'AMERICA' and s_region = 'AMERICA'
and (d_year = 1997 or d_year = 1998)
and (p_mfgr = 'MFGR#1' or p_mfgr = 'MFGR#2')
group by d_year, s_nation, p_category
order by d_year, s_nation, p_category;

Q4.3
select d_year, s_city, p_brand, sum(lo_revenue) - sum(lo_supplycost) as profit
from SSB.p_lineorder
left join SSB.dates on lo_orderdate = d_datekey
left join SSB.customer on lo_custkey = c_custkey
left join SSB.supplier on lo_suppkey = s_suppkey
left join SSB.part on lo_partkey = p_partkey
where c_region = 'AMERICA' and s_nation = 'UNITED STATES'
and (d_year = 1997 or d_year = 1998)
and p_category = 'MFGR#14'
group by d_year, s_city, p_brand
order by d_year, s_city, p_brand;
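
STRATEBI

www.stratebi.com

StrateBI works with Big Data technologies from the Hadoop ecosystem such as Spark, Hive and Kafka, with OLAP engines such as Kylin, Vertica and Amazon Redshift, and with NoSQL databases such as Neo4J.

It also works with ETL tools (Pentaho, Talend) and BI tools (Pentaho, Power BI, Tableau).

Its technology partners include Kyligence, Hortonworks, Vertica and Talend.

Stratebi is also the company behind LinceBI (LinceBI.com).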
Recently, Stratebi became a certified partner of Kyligence, Vertica, Talend and Microsoft.
�StrateBI� ⼜ ⼜ 研 �BI �Stratebi� 副 ⼜ :�