cockroachdb: from oltp to htap · 2017-10-27 · keeping an eye on htap •the distant future (the...
TRANSCRIPT
A Benchmarking Story
CockroachDB: From OLTP to HTAP
presented by Arjun Narayan
@cockroachdb
WARNING!
• Opinionated speculation ahead
• Here’s what I want CockroachDB to look like
• Not necessarily what will happen
HTAP: What is it?
Here’s To A Perspective
Databases today (simplified)
OLTP Database
OLAP Warehouse
ETL
TPC-C
TPC-H
Databases tomorrow (simplified)
HTAP Database
TPC-C
TPC-H
Data architectures today (reality)
OLTP
OLTP
OLTP
OLTP
KAFKA
Hadoop
Vertica
Storm
Samza
Custom denormalization microservices
Ad-hoc Data Science
Reporting
Realtime feeds
Dashboards
Memcache
A vision for data architectures tomorrow
Cockroach Zone: OLTP
Cockroach Zone: Incrementally Updated Materialized Views
Ad-hoc Data Science
Reporting
Realtime feeds
Dashboards
Cache
HTAP: How do we benchmark it?
You are what you measure
CockroachDB Architecture in 60 seconds
Scale out SQL
Architecture Overview
Raft
Fully featured SQL API
Distributed Query Execution
Node 1
Node 2
Node 3
Replicas
The CockroachDB KV Layer
Using Clocks for Fast Reads
Node 1
Lease
Node 2
Lease
Node 3
Lease
Distributed Query Execution in CockroachDB
Building CockroachDB
Building CockroachDB: An Order of Priorities
Correctness
• Jepsen testing
Stability
• Chaos testing
Performance
• Benchmarking
Building CockroachDB: An Order of Priorities
Scale-out OLTP
• Where we are today
Acceptable OLAP
• What we are keeping an eye on
HTAP
• The distant future (the year 2000)
CockroachDB: An HTAP vision
HTAP: A strawman
OLTP Database
OLAP Warehouse
ETL
TPC-C
TPC-H
HTAP Database
TPC-C
TPC-H
Data architectures today
OLTP
OLTP
OLTP
OLTP
KAFKA
Hadoop
Vertica
Storm
Samza
Custom denormalization microservices
Ad-hoc Data Science
Reporting
Realtime feeds
Dashboards
Memcache
HTAP: Desiderata
• Compute analytics queries and keep results up-to-date
• Resource isolation for OLTP work
Data architectures tomorrow
KAFKA
Cockroach Zone: OLTP
Hadoop
Vertica
Storm
Samza
Ad-hoc Data Science
Reporting
Realtime feeds
Dashboards
Memcache
Incrementally Updated Materialized Views
• Real-time Changefeeds from OLTP replicas
• Differential Dataflow for incremental computation of all SQL queries
• Built right into the RDBMS, no separate ETL step
• Cockroach timestamps in dataflow give serializable transactions that span OLTP + OLAP
• “Read your OLTP writes in OLAP reads” transactional guarantees
• See Naiad (SOSP 2013), Differential Dataflow (CIDR 2013), Frank McSherry’s Blog (GitHub)
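To make the idea on this slide concrete, here is a minimal sketch of an incrementally updated view driven by a stream of changes. It is a plain delta-based aggregate, far simpler than Differential Dataflow, and it is not CockroachDB code. The class name, the `orders`-style schema, the `(ts, customer, amount, diff)` change format, and the assumption that changes arrive in timestamp order are all hypothetical choices made for illustration.

```python
from collections import defaultdict


class IncrementalSumView:
    """Incrementally maintained equivalent of (hypothetical schema):
    SELECT customer, SUM(amount) FROM orders GROUP BY customer
    """

    def __init__(self):
        self.sums = defaultdict(int)
        self.counts = defaultdict(int)  # rows per group, to know when a group empties
        self.applied_ts = 0  # highest change timestamp applied so far

    def apply(self, ts, customer, amount, diff):
        """Apply one change. diff is +1 for an inserted row and -1 for a
        deleted row; an update arrives as a delete plus an insert."""
        self.sums[customer] += diff * amount
        self.counts[customer] += diff
        if self.counts[customer] == 0:
            # No rows left in this group: drop it from the view.
            del self.sums[customer]
            del self.counts[customer]
        self.applied_ts = max(self.applied_ts, ts)

    def read(self, min_ts):
        """Return the view only if it reflects every change up to min_ts.
        This mimics "read your OLTP writes in OLAP reads", assuming
        changes are delivered in timestamp order."""
        if self.applied_ts < min_ts:
            raise RuntimeError("view has not caught up to timestamp %d" % min_ts)
        return dict(self.sums)


view = IncrementalSumView()
changes = [
    (1, "alice", 10, +1),
    (2, "bob", 5, +1),
    (3, "alice", 10, -1),  # update: retract the old row ...
    (3, "alice", 7, +1),   # ... and insert the new one
]
for change in changes:
    view.apply(*change)
print(view.read(min_ts=3))
```

Each change costs work proportional to the change itself, not to the size of the table, which is the property that lets the view stay up to date without a separate ETL step.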
Data architectures the day after
Cockroach Zone: OLTP
Cockroach Zone: Incrementally Updated Materialized Views
Ad-hoc Data Science
Reporting
Realtime feeds
Dashboards
Memcache
Benchmarking HTAP: What not to do
• Don’t just run TPC-C + TPC-H on the same system
• Why? Materialized views break TPC-H
• It’s not just about running ad-hoc TPC-H queries
• HTAP doesn’t matter if you can’t guarantee predictable TPC-C performance
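One way to put a number on "predictable TPC-C performance" is to compare tail latency of the OLTP workload alone against the same workload with analytics running beside it. The sketch below is only an illustration: the function names, the choice of the 99th percentile, and the sample latencies are assumptions, not part of TPC-C or of any CockroachDB tooling.

```python
def percentile(samples, p):
    """Nearest-rank percentile for an integer p in 1..100."""
    ordered = sorted(samples)
    # Integer ceiling of p * n / 100, so no floating-point rounding is involved.
    rank = max(1, (p * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def tail_latency_ratio(oltp_only_ms, oltp_with_analytics_ms, p=99):
    """How much worse the p-th percentile latency gets once analytics
    work shares the system. 1.0 means no measurable interference."""
    baseline = percentile(oltp_only_ms, p)
    mixed = percentile(oltp_with_analytics_ms, p)
    return mixed / baseline


# Hypothetical transaction latencies in milliseconds.
oltp_only = list(range(1, 101))
oltp_with_analytics = [2 * ms for ms in oltp_only]
print(tail_latency_ratio(oltp_only, oltp_with_analytics))
```

A ratio close to 1.0 under analytics load is the kind of evidence that resource isolation is working; averages alone would hide the slow transactions that matter here.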
Benchmarking materialized views
• How long does it take to compute incremental reports?
• How long does it take to compute ad-hoc queries?
• How long does it take to build a materialized view from scratch?
• How long does it take to propagate OLTP changes?
• How much overhead do we impose on the OLTP system?
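The propagation question above can be sketched as a simple measurement: record when each OLTP write commits and when it first becomes visible in the materialized view, then look at the difference. The function name, the change identifiers, and the timestamps below are hypothetical; a real harness would collect them from the database under test.

```python
def propagation_lags(commit_ts, visible_ts):
    """Return (lags, missing): per-change lag between OLTP commit and
    visibility in the view, plus the changes that never became visible."""
    lags = {}
    missing = []
    for change_id, committed in commit_ts.items():
        if change_id in visible_ts:
            lags[change_id] = visible_ts[change_id] - committed
        else:
            missing.append(change_id)
    return lags, missing


# Hypothetical timestamps in milliseconds.
commits = {"w1": 100, "w2": 105, "w3": 110}
visible = {"w1": 103, "w2": 112}
lags, missing = propagation_lags(commits, visible)
print(max(lags.values()), missing)
```

Reporting the maximum (or a high percentile) of the lags, and treating never-propagated changes as failures rather than dropping them, keeps the benchmark from flattering a view that is fast on average but occasionally stale.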
Questions?