calcite meetup-2016-04-20
TRANSCRIPT
Introduction to Apache Calcite
Josh Elser
MTS
2016-04-20
2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
About me
Apache Calcite is a project at the Apache Software Foundation. This name is a trademark of the Foundation.
Apache Calcite Committer and PMC
(Slowly) Re-learning SQL
Distributed systems nerd
Users
Apache Kylin
Apache Samza
Quark
SQL-Gremlin/Apache TinkerPop
See the respective project pages at the ASF
Brief History
Originally known as “Optiq” (https://github.com/julianhyde/optiq): 2012-05-07
Entered the Apache Software Foundation’s Incubator: 2014-05-25
Renamed to Apache Calcite (incubating): 2014-09-30
Graduated to top-level project (TLP): 2015-10-21
2 major releases since graduation: 2016-03-XX
Currently comprises 16 committers and 14 PMC members
“The foundation for your next high-performance database.”
Agenda
SQL Parser
SQL Parser
SELECT d.name, COUNT(*) AS c
FROM Emps AS e
  JOIN Depts AS d ON e.deptno = d.deptno
WHERE e.age < 30
GROUP BY d.deptno
HAVING COUNT(*) > 5
ORDER BY c DESC
Scan
Join
Filter
Aggregate
Filter
Project
Sort
Scan
https://calcite.apache.org/docs/reference.html
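The operator list above can be read as a tree: scans feed the join, the join feeds the WHERE filter, and so on up to the sort. The following is a minimal sketch of that idea only. PlanNode and ParsedQueryDemo are hypothetical names invented for illustration and are not Calcite classes; the nesting order is an assumption based on the usual evaluation order of SQL clauses.

```java
import java.util.Arrays;
import java.util.List;

class ParsedQueryDemo {
    // Build the tree bottom-up, following the order in which the SQL clauses apply.
    static PlanNode plan() {
        PlanNode emps = new PlanNode("Scan", "Emps");
        PlanNode depts = new PlanNode("Scan", "Depts");
        PlanNode join = new PlanNode("Join", "e.deptno = d.deptno", emps, depts);
        PlanNode where = new PlanNode("Filter", "e.age < 30", join);
        PlanNode agg = new PlanNode("Aggregate", "group by d.deptno; COUNT(*)", where);
        PlanNode having = new PlanNode("Filter", "COUNT(*) > 5", agg);
        PlanNode project = new PlanNode("Project", "d.name, c", having);
        return new PlanNode("Sort", "c DESC", project);
    }

    public static void main(String[] args) {
        System.out.print(plan().explain());
    }
}

// Toy operator node (hypothetical; not Calcite's RelNode).
class PlanNode {
    final String op;             // operator name, e.g. "Join"
    final String detail;         // human-readable argument, e.g. the predicate
    final List<PlanNode> inputs; // child operators that feed this one

    PlanNode(String op, String detail, PlanNode... inputs) {
        this.op = op;
        this.detail = detail;
        this.inputs = Arrays.asList(inputs);
    }

    // Render the tree top-down, indenting each level by two spaces.
    String explain() {
        StringBuilder sb = new StringBuilder();
        explain(sb, 0);
        return sb.toString();
    }

    private void explain(StringBuilder sb, int depth) {
        for (int i = 0; i < depth; i++) {
            sb.append("  ");
        }
        sb.append(op).append('(').append(detail).append(")\n");
        for (PlanNode input : inputs) {
            input.explain(sb, depth + 1);
        }
    }
}
```

Running main prints one operator per line, with the Sort at the top and the two scans at the deepest level.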
Agenda
SQL Parser
Cost-Based Optimizer
Cost-Based Optimizer
Extensible
Java API
– Parser output
– Inline Java code
AKA Relational Algebra
RelBuilder builder = RelBuilder.create(config);
RelNode node = builder
    .scan("EMP")
    .project(builder.field("DEPTNO"), builder.field("ENAME"))
    .build();
SELECT deptno, ename FROM emp;
LogicalProject(DEPTNO, ENAME)
  LogicalTableScan(EMP)
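To show why a chained builder produces a nested plan, here is a minimal, dependency-free sketch of a stack-based builder. ToyBuilder is a hypothetical stand-in written for this illustration, not Calcite's RelBuilder; it only mimics the scan/project/build call shape from the slide and renders the result as text.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ToyBuilderDemo {
    public static void main(String[] args) {
        String plan = new ToyBuilder()
            .scan("EMP")
            .project("DEPTNO", "ENAME")
            .build();
        System.out.print(plan);
    }
}

// Hypothetical stand-in for a relational-algebra builder (not Calcite's RelBuilder).
class ToyBuilder {
    // Each stack entry is an already-rendered subtree; new operators wrap the top entry.
    private final Deque<String> stack = new ArrayDeque<>();

    // Push a leaf operator that reads a table.
    ToyBuilder scan(String table) {
        stack.push("LogicalTableScan(" + table + ")\n");
        return this;
    }

    // Pop the current input, then push a projection that sits on top of it.
    ToyBuilder project(String... fields) {
        String input = stack.pop();
        stack.push("LogicalProject(" + String.join(", ", fields) + ")\n" + indent(input));
        return this;
    }

    // Pop and return the finished tree.
    String build() {
        return stack.pop();
    }

    // Indent every line of a subtree by two spaces.
    private static String indent(String subtree) {
        StringBuilder sb = new StringBuilder();
        for (String line : subtree.split("\n")) {
            sb.append("  ").append(line).append('\n');
        }
        return sb.toString();
    }
}
```

Each call consumes what is on top of the stack and pushes a larger subtree, which is why the last operator called ends up as the root of the printed plan.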
Cost-Based Optimizer
SELECT d.name, COUNT(*) AS c
FROM depts AS d
  JOIN emp AS e ON d.deptno = e.deptno
GROUP BY d.name;
Aggregate
  Join
    Scan Emp[deptno]
    Scan Depts[deptno, name]

Project[name, c]
  Aggregate
    Join
      Scan Emp[*]
      Scan Depts[*]
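The two plans on this slide differ in how many columns each scan reads. A cost-based choice between them can be sketched as follows. Everything here is an assumption made for illustration: the class names are hypothetical, the row and column counts are made up, and the cost formula (rows times columns read) is a deliberately crude stand-in for a real cost model.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class CostDemo {
    public static void main(String[] args) {
        Candidate best = cheapest(candidates());
        System.out.println(best.name + " cost=" + best.cost());
    }

    // Two equivalent plans for the query; table sizes are made-up statistics.
    static List<Candidate> candidates() {
        ScanCost empAll = new ScanCost("Emp", 10_000, 8);     // Scan Emp[*]
        ScanCost deptsAll = new ScanCost("Depts", 100, 5);    // Scan Depts[*]
        ScanCost empNarrow = new ScanCost("Emp", 10_000, 1);  // Scan Emp[deptno]
        ScanCost deptsNarrow = new ScanCost("Depts", 100, 2); // Scan Depts[deptno, name]
        return Arrays.asList(
            new Candidate("full scans", empAll, deptsAll),
            new Candidate("projected scans", empNarrow, deptsNarrow));
    }

    // Pick the plan with the lowest estimated cost.
    static Candidate cheapest(List<Candidate> plans) {
        return plans.stream().min(Comparator.comparingLong(Candidate::cost)).get();
    }
}

// Estimated cost of one table scan: rows read times columns read (toy model).
class ScanCost {
    final String table;
    final long rows;
    final int columns;

    ScanCost(String table, long rows, int columns) {
        this.table = table;
        this.rows = rows;
        this.columns = columns;
    }

    long cost() {
        return rows * columns;
    }
}

// A named plan whose cost is the sum of its scan costs.
class Candidate {
    final String name;
    final List<ScanCost> scans;

    Candidate(String name, ScanCost... scans) {
        this.name = name;
        this.scans = Arrays.asList(scans);
    }

    long cost() {
        long total = 0;
        for (ScanCost scan : scans) {
            total += scan.cost();
        }
        return total;
    }
}
```

With these made-up numbers the plan that reads only the needed columns wins; a real optimizer compares many more alternatives and uses richer statistics.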
Agenda
SQL Parser
Cost-Based Optimizer
Pluggable Data Sources
Pluggable Data Sources
User-implemented
– Yes, you.
Custom optimizations
– Predicate pushdown
– Projections
Sources of Sources
– Federation
Everything but the data
Join
Aggregate
Project
Scan Emp[*] Scan Depts[*]
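Predicate pushdown and projection pushdown mean the data source receives the filter and the column list, rather than returning every row and column for the engine to trim. The sketch below illustrates that contract with an in-memory source. ToySource and InMemorySource are hypothetical names invented for this example, and the sample rows are made up; this is not the interface a Calcite adapter implements.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

class SourceDemo {
    public static void main(String[] args) {
        ToySource emps = new InMemorySource(sampleRows());
        // The engine hands the filter and the column list to the source
        // instead of fetching everything and trimming afterwards.
        List<Map<String, Object>> rows =
            emps.scan(r -> ((Integer) r.get("age")) < 30, Arrays.asList("name", "deptno"));
        System.out.println(rows);
    }

    // Made-up sample data.
    static List<Map<String, Object>> sampleRows() {
        List<Map<String, Object>> rows = new ArrayList<>();
        rows.add(row("Alice", 25, 10));
        rows.add(row("Bob", 41, 20));
        rows.add(row("Carol", 29, 20));
        return rows;
    }

    static Map<String, Object> row(String name, int age, int deptno) {
        Map<String, Object> r = new LinkedHashMap<>();
        r.put("name", name);
        r.put("age", age);
        r.put("deptno", deptno);
        return r;
    }
}

// Hypothetical data-source contract: the source applies the filter and projection itself.
interface ToySource {
    List<Map<String, Object>> scan(Predicate<Map<String, Object>> filter, List<String> columns);
}

class InMemorySource implements ToySource {
    private final List<Map<String, Object>> data;

    InMemorySource(List<Map<String, Object>> data) {
        this.data = data;
    }

    @Override
    public List<Map<String, Object>> scan(Predicate<Map<String, Object>> filter,
                                          List<String> columns) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> r : data) {
            if (!filter.test(r)) {
                continue; // pushed-down predicate: drop the row at the source
            }
            Map<String, Object> projected = new LinkedHashMap<>();
            for (String c : columns) {
                projected.put(c, r.get(c)); // pushed-down projection: keep requested columns only
            }
            out.add(projected);
        }
        return out;
    }
}
```

A source backed by a remote store would translate the filter and column list into its own query language instead of looping in memory; the contract toward the engine stays the same.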
Agenda
SQL Parser
Cost-Based Optimizer
Pluggable Data Sources
Avatica
Avatica
Calcite sub-project
Wire protocol
– Protocol Buffers
– JSON
Metrics
Authentication
Clients
– JDBC client
– Python and Go (in-progress)
JDBC over HTTP – SQL for Everyone
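From the client side, "JDBC over HTTP" looks like ordinary JDBC with a URL that points at an HTTP endpoint. The sketch below uses only the standard java.sql API. The host, port, table, and column names are placeholders, and the URL layout (jdbc:avatica:remote:url=...;serialization=...) is given as an assumption to check against the Avatica client documentation. It also assumes the Avatica client jar is on the classpath and a server is listening; without them the program prints the connection error and exits.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

class AvaticaClientDemo {
    // Build a remote JDBC URL for an HTTP endpoint (layout is an assumption, see lead-in).
    static String url(String host, int port, String serialization) {
        return "jdbc:avatica:remote:url=http://" + host + ":" + port
            + ";serialization=" + serialization;
    }

    public static void main(String[] args) {
        // Placeholder endpoint; "protobuf" selects Protocol Buffers rather than JSON.
        String url = url("localhost", 8765, "protobuf");
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT ename, deptno FROM emp")) {
            while (rs.next()) {
                System.out.println(rs.getString(1) + " " + rs.getInt(2));
            }
        } catch (SQLException e) {
            // Reached when no suitable driver is on the classpath or no server is listening.
            System.out.println("Could not query " + url + ": " + e.getMessage());
        }
    }
}
```

Because the client only speaks JDBC and HTTP, the same pattern applies whatever engine sits behind the server.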
Thank You!
Email: [email protected]
Twitter: @josh_elser
Mailing lists: [email protected]
Info: https://calcite.apache.org/