Self-Driving Database Management Systems (cidrdb.org/cidr2017/slides/p42-pavlo-cidr17-slides.pdf)
Self-Driving Database Management Systems
CIDR 2017 @andy_pavlo
1920s: Cornelius Von Pavlo
1950s: Joseph Pavlo
1980s: Timothy Pavlo
2015 Median DBA Salary
$81,710 [Source]
Possible
» Physical Database Design
» Resource Allocation
» Query Optimization & Tuning
» Knob Configuration
What’s Different?
» Previous tools only addressed problems that had already occurred.
» Humans still make final decisions.
» Hardware & algorithm advancements.
Architecture (diagram)
1 Clustering: Workload Monitor, Clusters
2 Forecasting: Historical Workload, Predicted Workload
3 Planning: Search Tree, Action Sequence
#1 – Clustering
» Group similar queries together to improve the forecasting models.
» Logical vs. Physical Features
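One way to picture the grouping step is a greedy pass that assigns each query to the first cluster whose representative is similar enough. This is only a sketch: the slides do not say which clustering algorithm or similarity measure is used, so the Jaccard measure, the `threshold` value, and the example feature sets below are all assumptions made for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_queries(feature_sets, threshold=0.5):
    """Greedy clustering: each query joins the first cluster whose
    representative (its first member) is similar enough, else starts a new one."""
    clusters = []  # each cluster is a list of indexes into feature_sets
    for i, feats in enumerate(feature_sets):
        for members in clusters:
            if jaccard(feats, feature_sets[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


# Hypothetical feature sets: the table and attributes each query references.
queries = [
    {"CUSTOMER", "C_ID", "C_W_ID", "C_D_ID", "C_LAST"},
    {"CUSTOMER", "C_ID", "C_W_ID", "C_D_ID"},
    {"STOCK", "S_I_ID", "S_W_ID", "S_QUANTITY"},
]
clusters = cluster_queries(queries)  # the two CUSTOMER queries share a cluster
```

Fewer, larger clusters mean fewer forecasting models to train, which is the motivation the slide gives for grouping similar queries.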
SELECT C_ID FROM CUSTOMER WHERE C_W_ID = ? AND C_D_ID = ? AND C_LAST = ? ORDER BY C_FIRST
Logical Features
table={CUSTOMER} attributes={C_ID,C_W_ID,C_D_ID,C_LAST} orderby={C_FIRST} aggregate={Ø}
» Fixed/Immutable (+)
» Cheap to Compute (+)
» Lacks Execution Info (–)
Physical Features
tuplesRead={##} tuplesWritten={##} cpu={##} memory={##} lockWait={##} indexPages={##} networkRead={##} networkWritten={##}
» Descriptive (+)
» Identifies Problems (+)
» Unstable/Changes (–)
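The logical features listed for the example query can be derived from the SQL text alone, which is why they are cheap to compute. The sketch below is a rough illustration, not the system's actual extractor: it uses a regular expression that only handles simple single-table SELECT statements, and the function and constant names are made up for this example.

```python
import re

KEYWORDS = {"AND", "OR", "NOT", "IN", "LIKE", "IS", "NULL", "BETWEEN"}
AGGREGATES = {"COUNT", "SUM", "AVG", "MIN", "MAX"}

SELECT_RE = re.compile(
    r"SELECT\s+(?P<select>.+?)\s+FROM\s+(?P<table>\w+)"
    r"(?:\s+WHERE\s+(?P<where>.+?))?"
    r"(?:\s+ORDER\s+BY\s+(?P<orderby>.+?))?\s*$",
    re.IGNORECASE,
)


def identifiers(text):
    """Return identifier-like tokens, skipping common predicate keywords."""
    tokens = re.findall(r"[A-Za-z_]\w*", text)
    return [t for t in tokens if t.upper() not in KEYWORDS]


def logical_features(sql):
    """Summarize a simple single-table SELECT by the names it references."""
    m = SELECT_RE.match(sql.strip())
    if m is None:
        raise ValueError("unsupported query shape")
    select_ids = identifiers(m.group("select"))
    where_ids = identifiers(m.group("where") or "")
    order_ids = identifiers(m.group("orderby") or "")
    columns = [t for t in select_ids + where_ids if t.upper() not in AGGREGATES]
    return {
        "table": [m.group("table")],
        "attributes": list(dict.fromkeys(columns)),  # de-duplicate, keep order
        "orderby": [t for t in order_ids if t.upper() not in {"ASC", "DESC"}],
        "aggregate": [t.upper() for t in select_ids if t.upper() in AGGREGATES],
    }


QUERY = (
    "SELECT C_ID FROM CUSTOMER WHERE C_W_ID = ? AND C_D_ID = ? "
    "AND C_LAST = ? ORDER BY C_FIRST"
)
features = logical_features(QUERY)
```

Physical features such as tuplesRead or lockWait cannot be obtained this way; they have to be measured while the query runs, which is why they change with the database state.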
#2 – Forecasting
» Generate forecasting models for each cluster to predict future arrival rate.
» Multiple horizons & intervals.
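As a minimal sketch of per-cluster forecasting, the code below fits a straight-line trend to one cluster's historical arrival counts and extrapolates it to several horizons. The talk compares linear regression with an LSTM RNN; this shows only a deliberately simple linear fit, and the sample history and horizon values are invented for illustration.

```python
def fit_line(ys):
    """Ordinary least squares fit of y = intercept + slope * t for t = 0..n-1."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two observations")
    t_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(ys))
    slope = sxy / sxx
    return y_mean - slope * t_mean, slope


def forecast(ys, horizons):
    """Predict the arrival rate h intervals past the last observation."""
    intercept, slope = fit_line(ys)
    last_t = len(ys) - 1
    return {h: intercept + slope * (last_t + h) for h in horizons}


# Hypothetical history: queries per hour for one cluster.
history = [100, 110, 120, 130, 140]
predictions = forecast(history, horizons=[1, 24])
```

A real deployment would fit one model per cluster and per interval size, since a model tuned for hourly intervals is unlikely to capture weekly or seasonal patterns.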
[Figure: real vs. predicted workload for three applications (Gaming Stats, Bus Tracking, Admissions), each shown over 24-hour and 7-day horizons plus a longer horizon (labels: 120 Days, 30 Days, 120 Days); models: LSTM RNN and Linear Regression (panel labels: LR, LSTM, LSTM).]
#3 – Planning
» Generate optimization actions for the DBMS based on the workload forecasts.
» Select a sequence of actions that optimizes the target metric.
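Selecting an action sequence can be pictured as a search over a tree whose edges are actions and whose nodes are resulting database states. The sketch below is a plain depth-limited, depth-first search; the slides note that there is no universal planning algorithm yet, so this is an illustration rather than the system's planner. The index names and the `NET_GAIN` values are hypothetical.

```python
def plan(actions, apply_action, score, state, depth):
    """Depth-first search over action sequences of length <= depth.

    Returns (best_score, best_sequence) reachable from `state`.
    """
    best = (score(state), [])
    if depth == 0:
        return best
    for action in actions:
        child = apply_action(state, action)
        if child is None:  # action not applicable in this state
            continue
        sub_score, sub_seq = plan(actions, apply_action, score, child, depth - 1)
        if sub_score > best[0]:
            best = (sub_score, [action] + sub_seq)
    return best


# Hypothetical net gain (benefit minus cost) of building each index.
NET_GAIN = {"idx_a": 3, "idx_b": -1, "idx_c": 1}


def apply_action(state, index):
    """Each action stands for AddIndex(index); adding an index twice is invalid."""
    if index in state:
        return None
    return state | {index}


def score(state):
    """Target metric to maximize: total net gain of the indexes built so far."""
    return sum(NET_GAIN[i] for i in state)


best_score, best_seq = plan(list(NET_GAIN), apply_action, score, frozenset(), depth=2)
```

Exhaustive search grows exponentially with depth and catalog size, so a practical planner would need pruning or a bounded look-ahead.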
Planning (diagram)
» Action Catalog: AddIndex(i)
» Search Tree
» Action Sequence
» Cost (–) / Benefit (+)
» Forecast Models, Affected Clusters
» Optimizer, Expected Resource Usage
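The cost/benefit labels in the diagram suggest weighing an action such as AddIndex(i) by how much forecasted work it affects. The sketch below shows one way this could be combined, under stated assumptions: the per-query savings stand in for what an optimizer's cost model might estimate, and every name and number here is hypothetical rather than taken from the talk.

```python
def net_benefit(action, forecasts, savings_per_query, build_cost):
    """Forecast-weighted benefit of an action minus its one-time cost.

    forecasts: cluster -> predicted number of queries over the planning horizon
    savings_per_query: action -> {affected cluster -> time saved per query}
    build_cost: action -> one-time cost of applying the action
    """
    benefit = sum(
        forecasts.get(cluster, 0) * saved
        for cluster, saved in savings_per_query[action].items()
    )
    return benefit - build_cost[action]


# Hypothetical inputs, all in the same time unit (e.g., milliseconds).
forecasts = {"customer_by_last": 1000, "stock_level": 200}
savings_per_query = {"AddIndex(customer_last)": {"customer_by_last": 3}}
build_cost = {"AddIndex(customer_last)": 500}

gain = net_benefit("AddIndex(customer_last)", forecasts, savings_per_query, build_cost)
```

Clusters that an action does not affect contribute nothing, so an index on a rarely queried attribute can have a negative net benefit even if it speeds up individual queries.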
Demo
» Peloton (v2017-01)
» TPC-C with 100 warehouses
» Database loaded without indexes
Current Status
» Clusters/forecasts computed off-line.
» No universal planning algorithm.
» We lost our catalog, planner, and optimizer in the “purge”.
2016
» In-Memory / NVM Storage
» Open Bw-Tree
» WAL (SSD) / WBL (NVM)
» Index / Layout Tuning
» Apache v2.0 License
2017
» More Self-Driving
» TensorFlow Integration
» LLVM Execution Engine
» Cascades Optimizer
» Intra-Query Parallelism
Unsolved Problems
» Cluster Prioritization (OLTP vs. OLAP)
» Self-Driving Components Interference
» Human Interactions
» “Traditional” ML Problems