calcite meetup-2016-04-20

Introduction to Apache Calcite Josh Elser MTS 2016-04-20

Upload: josh-elser

Post on 15-Apr-2017


TRANSCRIPT

Page 1: Calcite meetup-2016-04-20

Introduction to Apache Calcite
Josh Elser
MTS
2016-04-20

Page 2: Calcite meetup-2016-04-20

2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

About me

Apache Calcite is a project at the Apache Software Foundation. This name is a trademark of the Foundation.

Apache Calcite Committer and PMC

(Slowly) Re-learning SQL

Distributed systems nerd

Page 3: Calcite meetup-2016-04-20


Users

Apache Kylin

Apache Samza

Quark

SQL-Gremlin/Apache TinkerPop

See the respective project pages at the ASF

Page 4: Calcite meetup-2016-04-20


Brief History

Originally known as “Optiq” (https://github.com/julianhyde/optiq): 2012-05-07
Entered the Apache Software Foundation’s Incubator: 2014-05-25
Renamed to Apache Calcite (incubating): 2014-09-30
Graduated to top-level project (TLP): 2015-10-21
2 major releases since graduation: 2016-03-XX
Currently composed of 16 committers and 14 PMC members

“The foundation for your next high-performance database.”

Page 5: Calcite meetup-2016-04-20


Agenda

SQL Parser

Page 6: Calcite meetup-2016-04-20


SQL Parser

SELECT d.name, COUNT(*) as c
FROM Emps as e
JOIN Depts as d ON e.deptno = d.deptno
WHERE e.age < 30
GROUP BY d.deptno
HAVING COUNT(*) > 5
ORDER BY c DESC

[Diagram: operator tree for the query, leaves to root: Scan, Scan → Join → Filter → Aggregate → Filter → Project → Sort]

https://calcite.apache.org/docs/reference.html
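The operator tree on this slide can be sketched in plain Java to make the shape concrete. This is only an illustration, not Calcite's parser or its RelNode classes: the Op record, the PlanSketch class, and the bracketed detail strings (which clause of the query maps to which operator) are all assumptions made for this example.

```java
import java.util.List;

public class PlanSketch {
    /** A toy relational operator: a name, a short detail string, and its inputs. */
    record Op(String name, String detail, List<Op> inputs) {
        static Op of(String name, String detail, Op... inputs) {
            return new Op(name, detail, List.of(inputs));
        }

        /** Renders the tree root-first, indenting each level by two spaces. */
        String explain() {
            StringBuilder sb = new StringBuilder();
            explain(sb, 0);
            return sb.toString();
        }

        private void explain(StringBuilder sb, int depth) {
            sb.append("  ".repeat(depth))
              .append(name).append('[').append(detail).append("]\n");
            for (Op input : inputs) {
                input.explain(sb, depth + 1);
            }
        }
    }

    /** Hand-built tree for the slide's query; the clause mapping is an assumption. */
    static Op queryPlan() {
        Op join = Op.of("Join", "e.deptno = d.deptno",
            Op.of("Scan", "Emps"), Op.of("Scan", "Depts"));
        Op where = Op.of("Filter", "e.age < 30", join);
        Op aggregate = Op.of("Aggregate", "GROUP BY d.deptno, COUNT(*) AS c", where);
        Op having = Op.of("Filter", "c > 5", aggregate);
        Op project = Op.of("Project", "d.name, c", having);
        return Op.of("Sort", "c DESC", project);
    }

    public static void main(String[] args) {
        System.out.print(queryPlan().explain());
    }
}
```

Running main prints the tree with Sort at the top and the two scans at the deepest level, which is the diagram read from root to leaves.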

Page 7: Calcite meetup-2016-04-20


Agenda

SQL Parser

Cost-Based Optimizer

Page 8: Calcite meetup-2016-04-20


Cost-Based Optimizer

Extensible

Java API
– Parser output
– Inline Java code

AKA Relational Algebra

RelBuilder builder = RelBuilder.create(config);
RelNode node = builder
    .scan("EMP")
    .project(builder.field("DEPTNO"), builder.field("ENAME"))
    .build();

SELECT ename, deptno FROM emp;

LogicalProject(DEPTNO, ENAME)
  LogicalTableScan(EMP)

Page 9: Calcite meetup-2016-04-20


Cost-Based Optimizer

SELECT d.name, COUNT(*) as c
FROM depts AS d
JOIN emp AS e ON d.deptno = e.deptno
GROUP BY d.name;

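[Diagram: two equivalent plans for the query, leaves to root]

Unoptimized plan: Scan Emp[*], Scan Depts[*] → Join → Aggregate → Project[name, c]

Optimized plan: Scan Emp[deptno], Scan Depts[deptno, name] → Join → Aggregate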
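The difference between scanning every column and scanning only the columns the query needs can be illustrated with a toy cost comparison. This is a rough sketch, not Calcite's planner or cost model: the Scan record, the "cells read" cost formula, the extra column names, and the row counts are all hypothetical values chosen for the example.

```java
import java.util.List;

public class PushdownSketch {
    /** A table scan reading the listed columns from a table with rowCount rows. */
    record Scan(String table, List<String> columns, long rowCount) {
        /** Toy cost: number of cells read (rows times columns). Not Calcite's model. */
        long cost() {
            return rowCount * columns.size();
        }

        /** Returns a scan narrowed to the columns the rest of the plan uses. */
        Scan pruneTo(List<String> needed) {
            List<String> kept = columns.stream().filter(needed::contains).toList();
            return new Scan(table, kept, rowCount);
        }
    }

    static long totalCost(List<Scan> scans) {
        return scans.stream().mapToLong(Scan::cost).sum();
    }

    /** Scan Emp[*] and Scan Depts[*]; columns and row counts are made up. */
    static List<Scan> fullScans() {
        return List.of(
            new Scan("Emp", List.of("empno", "ename", "deptno", "age"), 1000),
            new Scan("Depts", List.of("deptno", "name", "location"), 10));
    }

    /** The same scans restricted to the columns the query references. */
    static List<Scan> prunedScans() {
        List<String> needed = List.of("deptno", "name");
        return fullScans().stream().map(s -> s.pruneTo(needed)).toList();
    }

    /** Picks whichever candidate has the lower estimated cost. */
    static List<Scan> cheaper(List<Scan> a, List<Scan> b) {
        return totalCost(a) <= totalCost(b) ? a : b;
    }

    public static void main(String[] args) {
        System.out.println("full scans:   " + totalCost(fullScans()));   // 4030
        System.out.println("pruned scans: " + totalCost(prunedScans())); // 1020
        System.out.println("chosen:       " + cheaper(fullScans(), prunedScans()));
    }
}
```

With these made-up numbers the pruned scans read 1020 cells against 4030 for the full scans, so the comparison picks Emp[deptno] and Depts[deptno, name], the same shape as the cheaper plan in the diagram.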

Page 10: Calcite meetup-2016-04-20


Agenda

SQL Parser

Cost-Based Optimizer

Pluggable Data Sources

Page 11: Calcite meetup-2016-04-20


Pluggable Data Sources

User-implemented
– Yes, you.

Custom optimizations
– Predicate pushdown
– Projections

Sources of Sources
– Federation

Everything but the data

[Diagram: plan over Scan Emp[*] and Scan Depts[*] with Join, Aggregate, and Project operators]
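A user-implemented source with predicate and projection pushdown might look roughly like the sketch below. It is written against plain Java only: the Source interface, the InMemorySource class, and the sample rows are hypothetical and are not Calcite's adapter API (Calcite has its own table interfaces for this, such as ScannableTable and FilterableTable).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class AdapterSketch {
    /** Hypothetical contract: the source applies the filter and projection itself. */
    interface Source {
        List<Object[]> scan(Predicate<Object[]> filter, int[] projection);
    }

    /** A trivial in-memory source standing in for a real storage system. */
    static final class InMemorySource implements Source {
        private final List<Object[]> rows;

        InMemorySource(List<Object[]> rows) {
            this.rows = rows;
        }

        @Override
        public List<Object[]> scan(Predicate<Object[]> filter, int[] projection) {
            List<Object[]> out = new ArrayList<>();
            for (Object[] row : rows) {
                if (!filter.test(row)) {
                    continue; // predicate pushdown: non-matching rows never leave the source
                }
                Object[] projected = new Object[projection.length];
                for (int i = 0; i < projection.length; i++) {
                    projected[i] = row[projection[i]]; // projection: copy requested columns only
                }
                out.add(projected);
            }
            return out;
        }
    }

    /** Sample Emps rows as (name, deptno, age); the values are made up. */
    static Source emps() {
        return new InMemorySource(List.of(
            new Object[] {"Alice", 10, 25},
            new Object[] {"Bob", 20, 41},
            new Object[] {"Carol", 10, 29}));
    }

    /** Returns deptno for employees younger than 30, evaluated inside the source. */
    static List<Object[]> youngEmployeeDepts() {
        return emps().scan(row -> (Integer) row[2] < 30, new int[] {1});
    }

    public static void main(String[] args) {
        for (Object[] row : youngEmployeeDepts()) {
            System.out.println(row[0]);
        }
    }
}
```

The design point is that the engine hands the filter and the column list to the source, so only the needed rows and columns cross the boundary between the two.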

Page 12: Calcite meetup-2016-04-20


Agenda

SQL Parser

Cost-Based Optimizer

Pluggable Data Sources

Avatica

Page 13: Calcite meetup-2016-04-20


Avatica

Calcite sub-project

Wire protocol
– Protocol Buffers
– JSON

Metrics

Authentication

Clients
– JDBC client
– Python and Go (in-progress)

JDBC over HTTP – SQL for Everyone
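To give a feel for "JDBC over HTTP" with the JSON wire format, the sketch below builds a request body and an HTTP POST using only the JDK. Treat it as an assumption-laden illustration: the field names follow my reading of the Avatica JSON reference for a prepareAndExecute request and should be checked against the version in use, the server URL, connection id, and statement id are hypothetical, and the code assumes a connection and statement were already opened by earlier requests. Nothing is actually sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class AvaticaRequestSketch {
    /** Minimal JSON string quoting: escapes backslashes and double quotes only. */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Builds a prepareAndExecute body; field names assumed from the Avatica JSON docs. */
    static String prepareAndExecute(String connectionId, int statementId, String sql) {
        return "{"
            + "\"request\":\"prepareAndExecute\","
            + "\"connectionId\":" + quote(connectionId) + ","
            + "\"statementId\":" + statementId + ","
            + "\"sql\":" + quote(sql) + ","
            + "\"maxRowsTotal\":100"
            + "}";
    }

    /** Wraps the JSON body in an HTTP POST; the request is built but not sent. */
    static HttpRequest post(URI server, String json) {
        return HttpRequest.newBuilder(server)
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(json))
            .build();
    }

    public static void main(String[] args) {
        // Hypothetical server address and ids, for illustration only.
        String json = prepareAndExecute("conn-1", 1, "SELECT ename, deptno FROM emp");
        HttpRequest request = post(URI.create("http://localhost:8765/"), json);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(json);
    }
}
```

Because the payload is ordinary JSON over HTTP, a client in any language with an HTTP library could produce the same request, which is the point of the "SQL for Everyone" tagline.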

Page 14: Calcite meetup-2016-04-20


Thank You!

Email: [email protected]
Twitter: @josh_elser
Mailing lists: [email protected]
Info: https://calcite.apache.org/