NON-VOLATILE MEMORY (NVM) · jarulraj/talks/2017.nvm.sigmod.pdf
NON-VOLATILE MEMORY (NVM)
[Figure labels: NVM, DRAM, SSD]
Like DRAM, low latency loads and stores
Like SSD, persistent writes and high density
TUTORIAL OVERVIEW
• Blueprint of an NVM DBMS
– Overview of major design decisions impacted by NVM
[Figure labels: DBMS, DRAM, NVM]
• Target audience
– Developers, researchers, and practitioners
• Assume knowledge of DBMS internals
– No need for any in-depth experience with NVM
• Highlight recent research findings
– Identify a set of open problems
OUTLINE
• Introduction
– Recent Developments
– NVM Overview
– Motivation
• Blueprint of an NVM DBMS
– Access Interfaces
– Storage Manager
– Execution Engine
• Conclusion
– Outlook
RECENT DEVELOPMENTS
#1: INDUSTRY STANDARDS (JUNE 2016)
• Form factors (e.g., JEDEC classification)
– NVDIMM-F: Flash only. Has to be paired with a DRAM DIMM.
– NVDIMM-N: Flash and DRAM together on the same DIMM.
– NVDIMM-P: True persistent memory. No DRAM or flash.
• Interface specifications (e.g., NVM Express over Fabrics)
#2: OPERATING-SYSTEM SUPPORT (OCTOBER 2016)
• Growing OS support for NVM
– Linux 4.8 (e.g., NVM Express over Fabrics library)
– Windows 10 (e.g., direct access to files on NVM)
#3: PROCESSOR SUPPORT (MARCH 2017)
• ISA updates in Kaby Lake processors for NVM management
– Instructions for flushing cache lines to NVM
– Removed PCOMMIT instruction
NVM OVERVIEW
NVM PROPERTIES
• Byte addressable
– Loads and stores, unlike SSD/HDD
• High random write throughput
– Orders of magnitude higher than SSD/HDD
– Smaller gap between sequential and random write throughput
• Read-write asymmetry & wear-leveling
– Writes might take longer to complete than reads
– Excessive writes to a single NVM cell can destroy it
EVALUATION SETUP
• Benchmark storage devices on NVM emulator
– Write throughput of a single thread with fio
– Synchronous writes to a large file
• Devices
– Hard-disk drive (HDD) [Seagate Barracuda]
– Solid-state disk (SSD) [Intel DC S3700]
– Emulated NVM
PERFORMANCE
[Figure: write IOPS (log scale, 1 to 1,000,000) of HDD, SSD, and NVM for sequential writes and random writes; annotated with 100x and 500x]
MOTIVATION
EXISTING DBMSs ON NVM
• How do existing systems perform on NVM?
– Treat NVM like a faster SSD
• Evaluate two types of database systems
– Disk-oriented DBMS
– Memory-oriented DBMS
• TPC-C benchmark
– 1/8th of database fits in DRAM
– Rest on NVM
A PROLEGOMENON ON OLTP DATABASE SYSTEMS FOR NON-VOLATILE MEMORY (ADMS 2014)
EXISTING DBMSs
• Compare representative DBMSs of each architecture
[Figure labels: DISK-ORIENTED DBMS, MEMORY-ORIENTED DBMS]
NVM HARDWARE EMULATOR
• Special CPU microcode to add stalls on cache misses
– Tune DRAM latency to emulate NVM
• New instructions for managing NVM
– Cache-line write-back (CLWB) instruction
[Figure: STORE writes into the CPU caches (L1, L2); CLWB writes the cache line back to NVM]
PERFORMANCE
[Figure: throughput (txn/sec, 0 to 40,000) of the disk-oriented DBMS and the in-memory DBMS; labels: 8x DRAM Latency, 12x]
PERFORMANCE
[Figure: throughput (txn/sec, 0 to 40,000) of the disk-oriented DBMS and the in-memory DBMS; labels: 2x DRAM Latency, 4x, 1x, 1x]
Legacy database systems are not prepared for NVM
#1: DISK-ORIENTED DBMSs
[Figure: buffer pool in DRAM; table heap, log, and checkpoints on NVM]
Designed to minimize random writes to NVM
But NVM supports fast random writes
#2: MEMORY-ORIENTED DBMSs
[Figure: table heap in DRAM; log and checkpoints on NVM]
Designed to overcome the volatility of memory
But writes to NVM are persistent
BLUEPRINT OF AN NVM DBMS
1. ACCESS INTERFACES
– ALLOCATOR INTERFACE
– FILESYSTEM INTERFACE
2. STORAGE MANAGER
– LOGGING & RECOVERY
– DATA PLACEMENT
– ACCESS METHODS
3. EXECUTION ENGINE
– PLAN EXECUTOR
– QUERY OPTIMIZER
– SQL EXTENSIONS
HOW TO BUILD A NON-VOLATILE MEMORY DBMS, SIGMOD 2017 (TUTORIAL)
ACCESS INTERFACES
• Two interfaces used by the DBMS to access NVM
– Allocator interface (byte-level memory allocation)
– Filesystem interface (POSIX-compliant filesystem API)
• Support in latest versions of major operating systems
– Windows Server 2016
– Linux 4.7
#1: ALLOCATOR INTERFACE
• Similar to regular DRAM allocator
– Dynamic memory allocation
– Meta-data management
• Additional features with respect to DRAM allocator
– Durability mechanism
– Naming mechanism
– Recovery mechanism
NVM_MALLOC: MEMORY ALLOCATION FOR NVRAM (ADMS 2015)
DURABILITY MECHANISM
• Ensure that data modifications are persisted
– Necessary because they may reside in volatile processor caches
– Lost if a power failure happens before changes reach NVM
• Two-step implementation
– Allocator first writes out dirty cache lines (CLWB)
– Then issues a memory fence (SFENCE) to ensure the changes are visible
Persist(Address, Length)
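The two steps behind Persist(Address, Length) can be sketched as a toy simulation in Python. Nothing here touches real hardware: the class SimulatedNVM, the 64-byte line size, and the methods store, clwb, sfence, and crash are hypothetical names chosen for illustration, and the model assumes write-backs land exactly at the fence, which is a simplification.

```python
CACHE_LINE = 64  # bytes; assumed cache-line size


class SimulatedNVM:
    """Toy model: stores stay in a volatile cache until they are written back."""

    def __init__(self, size):
        self.nvm = bytearray(size)   # contents that survive a crash
        self.cache = {}              # line number -> cached bytes; lost on a crash
        self.pending = []            # write-backs issued but not yet fenced

    def store(self, address, data):
        """A regular store: only the cached copy of each line changes."""
        for i, byte in enumerate(data):
            line, offset = divmod(address + i, CACHE_LINE)
            if line not in self.cache:
                start = line * CACHE_LINE
                self.cache[line] = bytearray(self.nvm[start:start + CACHE_LINE])
            self.cache[line][offset] = byte

    def clwb(self, line):
        """Queue a write-back of one cache line (stand-in for CLWB)."""
        if line in self.cache:
            self.pending.append((line, bytes(self.cache[line])))

    def sfence(self):
        """Complete every queued write-back (stand-in for SFENCE)."""
        for line, data in self.pending:
            start = line * CACHE_LINE
            self.nvm[start:start + CACHE_LINE] = data
        self.pending.clear()

    def persist(self, address, length):
        """Persist(Address, Length): write back each covered line, then fence."""
        first = address // CACHE_LINE
        last = (address + length - 1) // CACHE_LINE
        for line in range(first, last + 1):
            self.clwb(line)
        self.sfence()

    def crash(self):
        """Power failure: cached and un-fenced data are gone."""
        self.cache.clear()
        self.pending.clear()
```

A store followed directly by crash() leaves the NVM contents unchanged, while a store followed by persist() survives the crash. Note that a range crossing a line boundary needs one write-back per line, which is why persist loops from the first to the last covered line.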
NAMING MECHANISM
• Pointers should be valid even after the system restarts
– NVM region might be mapped to a different base address
• Allocator maps NVM to a well-defined base address
– Pointers, therefore, remain valid even after system restart
– Foundation for building crash-consistent data structures
Absolute pointer = Base address + Relative pointer
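The pointer formula can be illustrated with a small Python sketch that stores a linked list inside one byte region using relative pointers only. The node layout (two 64-bit integers), the NULL marker, and all function names are assumptions made for this example, and a bytearray stands in for the mapped NVM region.

```python
import struct

NODE = struct.Struct("<qq")   # (value, relative pointer to next node); assumed layout
NULL = -1                     # relative pointer meaning "no next node"


def write_node(region, offset, value, next_offset):
    """Store a list node at a relative offset inside the region."""
    NODE.pack_into(region, offset, value, next_offset)


def to_absolute(base_address, relative_pointer):
    """Absolute pointer = Base address + Relative pointer."""
    return base_address + relative_pointer


def walk(region, head_offset):
    """Follow relative pointers; works wherever the region happens to be mapped."""
    values, offset = [], head_offset
    while offset != NULL:
        value, offset = NODE.unpack_from(region, offset)
        values.append(value)
    return values


# Build a three-node list inside one region.
region = bytearray(64)
write_node(region, 0, 10, 32)
write_node(region, 32, 20, 16)
write_node(region, 16, 30, NULL)

# "Restart": the same bytes come back in a different buffer, i.e. a new base address.
remapped = bytearray(region)
```

Because the nodes never store absolute addresses, walk(remapped, 0) still returns the values in list order after the region has moved.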
RECOVERY MECHANISM
• Unlike with DRAM, memory leaks on NVM are persistent
– Let’s say an application allocates a memory chunk
– But crashes before linking the chunk to its data structure
– Neither allocator nor application can reclaim the space
• Recovery ensures all chunks are either allocated or free
– Interesting problem, will be covered in next tutorial
Data Structures Engineering For NVM (Ismail Oukid and Wolfgang Lehner, TU Dresden)
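One possible way to guarantee that every chunk ends up either allocated or free is to split allocation into two steps and reclaim half-finished allocations on restart. The Python sketch below illustrates that idea only; the reserve/activate split, the State names, and the ChunkAllocator class are assumptions for this example, not the mechanism presented in the tutorial.

```python
from enum import Enum


class State(Enum):
    FREE = 0
    RESERVED = 1    # handed out, but not yet linked into a data structure
    ACTIVE = 2      # linked; the application can reach it after a restart


class ChunkAllocator:
    """Toy allocator whose per-chunk state is assumed to live on NVM."""

    def __init__(self, num_chunks):
        self.state = [State.FREE] * num_chunks

    def reserve(self):
        """Step 1: pick a free chunk and mark it RESERVED."""
        chunk = self.state.index(State.FREE)   # raises ValueError when full
        self.state[chunk] = State.RESERVED
        return chunk

    def activate(self, chunk):
        """Step 2: called once the application has linked the chunk."""
        self.state[chunk] = State.ACTIVE

    def recover(self):
        """After a restart, reclaim chunks that were reserved but never linked."""
        reclaimed = [i for i, s in enumerate(self.state) if s is State.RESERVED]
        for chunk in reclaimed:
            self.state[chunk] = State.FREE
        return reclaimed
```

A crash between reserve() and activate() leaves the chunk in the RESERVED state, so recover() can return it to the free list instead of leaking it.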
#2: FILESYSTEM INTERFACE
• Traditional block-based filesystem like EXT4
– File I/O: 2 copies (Device ⟶ Page Cache ⟶ App Buffer)
– Efficiency of I/O stack not critical when hidden by disk latency
– However, NVM is byte-addressable and supports very fast I/O
[Figure: copy 1 from the device (NVM) to the page cache (DRAM), copy 2 from the page cache to the app buffer]
NON-VOLATILE MEMORY FILESYSTEM
• Direct access storage (DAX) to avoid data duplication
– DBMS can directly manage database by skipping page cache
– File I/O: 1 copy (Device ⟶ App Buffer)
[Figure: a single copy from the device (NVM) to the app buffer, bypassing the page cache (DRAM)]
SYSTEM SOFTWARE FOR PERSISTENT MEMORY (EUROSYS 2014)
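The difference between the two access paths can be sketched with Python's standard mmap module. This is an analogy under stated assumptions: the scratch file stands in for a file on a DAX-mounted filesystem, the helper names are hypothetical, and on an ordinary filesystem the mapping still goes through the page cache. Only on a DAX mount would the same calls reach NVM directly.

```python
import mmap
import os
import tempfile


def read_with_copies(path):
    """Conventional file I/O: the kernel fills its cache, then copies into our buffer."""
    with open(path, "rb") as f:
        return f.read()


def update_in_place(path, offset, data):
    """Memory-mapped access: loads and stores go to the mapped pages directly."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as mapped:
            mapped[offset:offset + len(data)] = data
            mapped.flush()   # ask the OS to make the change durable


def make_demo_file(contents):
    """Create a small scratch file to stand in for a file on a DAX mount."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(contents)
    return path
```

With a mapping, the application updates bytes in place instead of reading a block into a private buffer, changing it, and writing it back.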
NON-VOLATILE MEMORY FILESYSTEM
• To ensure durability, uses a hybrid recovery protocol
– NVM only supports 64-byte (cache-line) atomic updates
– DATA CHANGES: Copy-on-write mechanism at page granularity
– METADATA CHANGES: In-place updates & write-ahead logging
• NVM filesystem
– Reduces data duplication
– Uses lightweight recovery protocol
– 10x more IOPS compared to EXT4
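The data half of the hybrid protocol, copy-on-write at page granularity, can be sketched in Python as follows. The CowFile class, its block table, and the crash_before_switch flag are hypothetical names for illustration, and the metadata side (in-place updates with write-ahead logging) is deliberately left out.

```python
PAGE_SIZE = 4096


class CowFile:
    """Toy file: a block table (metadata) pointing at immutable data pages."""

    def __init__(self, num_pages):
        self.pages = {i: bytes(PAGE_SIZE) for i in range(num_pages)}  # page id -> contents
        self.block_table = list(range(num_pages))                     # file block -> page id
        self.next_page_id = num_pages

    def write(self, block, offset, data, crash_before_switch=False):
        """Copy-on-write: build the new page on the side, then switch one pointer."""
        old = self.pages[self.block_table[block]]
        new = old[:offset] + data + old[offset + len(data):]
        new_id = self.next_page_id
        self.next_page_id += 1
        self.pages[new_id] = new
        if crash_before_switch:
            return                          # the file still points at the old page
        self.block_table[block] = new_id    # the single small update that publishes the change

    def read(self, block):
        """Return the page the block table currently points at."""
        return self.pages[self.block_table[block]]
```

The page itself is far larger than what the hardware can update atomically, so the new contents are written elsewhere and only the small pointer switch has to be atomic. A crash before the switch leaves the old page intact.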
RECAP: ACCESS INTERFACES
• Allocator interface
– Non-volatile data structures
– Table heap, Indexes
• Filesystem interface
– Log files, Checkpoints
STORAGE MANAGER
MULTI-VERSIONED DBMS
BEGIN TIMESTAMP   END TIMESTAMP   PREVIOUS VERSION   TUPLE ID   TUPLE DATA
10                ∞               -                  1          X
10                20              -                  2          Y
20                ∞               2                  3          Y’
(The end timestamp of tuple 2 changes from ∞ to 20 when version 3 is created.)
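The begin and end timestamps drive tuple visibility. The Python sketch below uses the three versions from the slide; the Version dataclass, the function names, and the half-open interval [begin, end) are assumptions made for this example.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Version:
    tuple_id: int
    begin_ts: float
    end_ts: float
    previous: Optional[int]   # tuple id of the older version, if any
    data: str


# The three versions from the slide, after tuple 2 was updated at timestamp 20.
TABLE = [
    Version(1, 10, math.inf, None, "X"),
    Version(2, 10, 20, None, "Y"),
    Version(3, 20, math.inf, 2, "Y'"),
]


def visible(version, read_ts):
    """A version is visible to a reader whose timestamp falls in [begin, end)."""
    return version.begin_ts <= read_ts < version.end_ts


def snapshot(table, read_ts):
    """Data of every version visible at read_ts."""
    return [v.data for v in table if visible(v, read_ts)]
```

A reader at timestamp 15 sees X and Y, while a reader at timestamp 25 sees X and the newer version Y'.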
THOUGHT EXPERIMENT
• To keep things simple, NVM-only storage hierarchy
– No volatile DRAM
[Figure labels: DBMS, NVM]
LOGGING AND RECOVERY
• Traditional write-ahead logging in off-the-shelf DBMS
[Figure: table heap, log, and checkpoints on NVM; numbered steps 1, 2, 3; data is copied into the log and the checkpoints]
Can we avoid duplicating data in the log as well as the checkpoints?
NON-VOLATILE POINTER
                       POINTER   DATA   RESTART
VOLATILE POINTER       DRAM      DRAM   ✗ DISAPPEARS
NON-VOLATILE POINTER   NVM       NVM    ✓ VALID
AVOIDING DATA DUPLICATION
• Only store non-volatile tuple pointers in log records
Table Heap
TUPLE ID   TUPLE DATA
100        XYZ
101        X’Y’Z’
Write-Ahead Log
TRADITIONAL MANAGER          NVM-AWARE MANAGER
INSERT TUPLE XYZ             INSERT TUPLE 100
UPDATE TUPLE XYZ → X’Y’Z’    UPDATE TUPLE 100 → 101
LET’S TALK ABOUT STORAGE AND RECOVERY METHODS FOR NON-VOLATILE MEMORY DATABASE SYSTEMS (SIGMOD 2015)
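The contrast between the two log formats can be sketched in Python. The record layout, the 8-byte pointer size, and the function names are assumptions for illustration; tuple ids 100 and 101 follow the slide, and a dict stands in for the table heap on NVM.

```python
POINTER_BYTES = 8   # assumed size of a non-volatile tuple pointer


def traditional_records(old, new):
    """Classic write-ahead log: every record carries full tuple images."""
    return [
        ("INSERT", [old]),
        ("UPDATE", [old, new]),          # before image -> after image
    ]


def nvm_aware_records(table_heap, old, new):
    """Tuples go to the table heap once; log records carry only tuple ids."""
    table_heap[100] = old
    table_heap[101] = new
    return [
        ("INSERT", [100]),
        ("UPDATE", [100, 101]),          # old version -> new version
    ]


def payload_bytes(records):
    """Payload size: tuple contents for images, POINTER_BYTES per pointer."""
    total = 0
    for _operation, payload in records:
        for item in payload:
            total += len(item) if isinstance(item, str) else POINTER_BYTES
    return total


def resolve(record, table_heap):
    """Follow the pointers in an NVM-aware record back to the tuple data."""
    _operation, payload = record
    return [table_heap[tuple_id] for tuple_id in payload]
```

The saving depends on tuple width: pointer-only records pay off once a tuple is larger than a pointer, which is why the usage below would typically be tried with tuples of a few hundred bytes rather than the three-character values on the slide.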
NVM-AWARE STORAGE MANAGER
• Write-ahead meta-data logging
[Figure: table heap and log on NVM; numbered steps 1 and 2; only meta-data goes to the log; checkpoints are crossed out (✗)]
EVALUATION
• Compare storage managers on NVM emulator
– Traditional storage manager: write-ahead logging + filesystem interface
– NVM-aware storage manager: write-ahead meta-data logging + allocator interface
• Yahoo! Cloud Serving Benchmark
– Database fits on NVM
RUNTIME PERFORMANCE
[Figure: throughput (txn/sec, 0 to 1,500,000) of the traditional manager and the NVM-aware manager; labels: 8x DRAM Latency, 3x]
RUNTIME PERFORMANCE
[Figure: throughput (txn/sec, 0 to 1,500,000) of the traditional manager and the NVM-aware manager; labels: 2x DRAM Latency, 4x, 6x, 1.5x]
NVM latency has a significant impact on the performance of the NVM-aware storage manager
DEVICE LIFETIME
[Figure: NVM stores (millions, 0 to 100; lower is better) issued by the traditional manager and the NVM-aware manager; annotated with 2x]
Redesigning the storage manager for NVM not only improves runtime performance but also extends device lifetime
RECAP: WRITE-AHEAD METADATA LOGGING
• Targets an NVM-only storage hierarchy
– Leverages the durability of memory
– Skips duplicating data in the log and checkpoints
– Improves runtime performance
– Extends lifetime of the device
WRITE-BEHIND LOGGING
TWO-TIER STORAGE HIERARCHY
• Generalize the logging and recovery algorithms
[Figure labels: DBMS, DRAM, NVM]
WRITE-BEHIND LOGGING (VLDB 2016)
WRITE-AHEAD LOGGING
• Write-ahead log serves two purposes
– Transform random database writes into sequential log writes
– Support transaction rollback
– Design makes sense for disks with slow random writes
• But NVM supports fast random writes
– Directly write data to the multi-versioned database
– LATER, only record meta-data about committed txns in log
– Core idea behind write-behind logging
WRITE-BEHIND LOGGING
[Figure: table heap in DRAM, table heap and log on NVM; numbered steps 1 to 3; data goes to the table heap on NVM and only meta-data goes to the log]
WRITE-BEHIND LOGGING
• Recovery algorithm is simple
– No need to REDO the log, unlike write-ahead logging
– Since all changes are already persisted in database at commit
– Can recover the database almost instantaneously from failure
• Supporting transaction rollback
– Need to record meta-data about in-flight transactions
– In case of failure, ignore their effects
WRITE-BEHIND LOGGING
• DBMS assigns timestamps to transactions
– All transactions in a particular group commit get timestamps within the same group commit timestamp range
– This makes it possible to ignore the effects of all in-flight transactions
• Idea: Use failed group commit timestamp range
– DBMS uses this timestamp range during tuple visibility checks
– Ignores tuples created or updated within this timestamp range
– UNDO is, therefore, implicitly done via visibility checks
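A visibility check that performs this implicit UNDO might look like the Python sketch below. The function names, the half-open ranges, and the choice to treat an end timestamp from a failed range as "still live" are assumptions for illustration, not the exact rules of the write-behind logging protocol.

```python
import math


def in_failed_range(ts, failed_ranges):
    """True if ts lies in a group-commit range that never finished committing."""
    return any(low <= ts < high for low, high in failed_ranges)


def is_visible(begin_ts, end_ts, read_ts, failed_ranges):
    """Tuple visibility check extended with failed group-commit ranges."""
    if in_failed_range(begin_ts, failed_ranges):
        return False                       # created by an in-flight transaction
    if in_failed_range(end_ts, failed_ranges):
        end_ts = math.inf                  # the update or delete never committed
    return begin_ts <= read_ts < end_ts
```

With failed_ranges set to [(100, 200)], a version created at timestamp 150 is skipped, and a version whose end timestamp was set to 150 is treated as if it had never been replaced.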
WRITE-BEHIND LOGGING
• Recovery consists of only the analysis phase
– Can immediately start processing transactions after restart
• Garbage collection eventually kicks in
– Undoes effects of all uncommitted transactions
– Using timestamp range information in write-behind log
– After this finishes, no need to do extra visibility checks
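The garbage collection pass can be sketched as a function that physically removes the effects of failed transactions and then retires their timestamp ranges. The version layout (dicts with "begin" and "end" keys) and the function name are assumptions for this Python example.

```python
import math


def garbage_collect(versions, failed_ranges):
    """Physically undo failed transactions, then retire their timestamp ranges.

    versions: list of dicts with "begin" and "end" timestamps (assumed layout).
    Returns the surviving versions and the (now empty) list of failed ranges.
    """
    def failed(ts):
        return any(low <= ts < high for low, high in failed_ranges)

    survivors = []
    for version in versions:
        if failed(version["begin"]):
            continue                                   # drop versions created by failed transactions
        if failed(version["end"]):
            version = {**version, "end": math.inf}     # restore versions they tried to replace
        survivors.append(version)
    return survivors, []                               # no failed ranges left to check
```

Once the returned list of failed ranges is empty, later visibility checks no longer need the extra range test.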
METADATA FOR INSTANT RECOVERY
• Group commit timestamp range
– Use it to ignore effects of transactions in failed group commit
– Maintain list of failed timestamp ranges
[Figure: timeline T1 to T4 with group commits over the current ranges (T1, T2), (T2, T3), (T3, T4), (T4, T5); (T1, T2) stays in the list of failed ranges until garbage collection runs]
Write-behind logging not only avoids data duplication but also enables instant recovery
EVALUATION SETUP
• Compare logging protocols in Peloton DBMS
– Write-Ahead logging
– Write-Behind logging
• TPC-C benchmark
• Storage devices
– Solid-state drive
– Non-volatile memory
RECOVERY TIME
[Figure: recovery time (sec, log scale, 1 to 1,000; lower is better) of write-ahead logging and write-behind logging on solid-state drive and non-volatile memory; annotated with 30x and 250x]
THROUGHPUT
[Figure: throughput (txn/sec, 0 to 30,000) of write-ahead logging and write-behind logging on solid-state drive and non-volatile memory; annotated with 8x and 1.3x]
RECAP: WRITE-BEHIND LOGGING
• Rethinking key algorithms
– Write-behind logging enables instant recovery
– Improves device utilization by reducing data duplication
– Extends the device lifetime
DATA PLACEMENT
THREE-TIER STORAGE HIERARCHY
• Cost of first-generation NVM devices
– SSD is still going to be in the picture
• Data placement
– Three-tier DRAM + NVM + SSD hierarchy
THREE-TIER STORAGE HIERARCHY
[Figure: database copies in DRAM, on NVM, and on SSD, with the log on NVM; numbered steps 1 to 4; data moves between the database copies and only meta-data goes to the log]
DATA PLACEMENT
• Unlike SSD, can directly read data from NVM
– No need to always copy data over to DRAM for reading
• Cache hot data in DRAM
– Dynamically migrate cold data to SSD
– And keep warm data on NVM
OPEN PROBLEM: How do NVM capacity and access latencies affect the performance of the DBMS?
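The hot/warm/cold split can be sketched as a simple placement function in Python. The thresholds, the use of raw access counts, and the function name place are illustrative assumptions; the slides do not prescribe a specific policy.

```python
def place(access_counts, hot_threshold=100, cold_threshold=5):
    """Assign each page to a tier from its recent access count.

    The thresholds and the use of raw counts are illustrative assumptions.
    """
    placement = {}
    for page, count in access_counts.items():
        if count >= hot_threshold:
            placement[page] = "DRAM"   # hot: cache a copy in DRAM
        elif count <= cold_threshold:
            placement[page] = "SSD"    # cold: migrate out to SSD
        else:
            placement[page] = "NVM"    # warm: read it in place on NVM
    return placement
```

How to pick the thresholds for a given NVM capacity and latency is exactly the kind of question the open problem above raises.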
ACCESS METHODS
NVM-AWARE ACCESS METHODS
• Read-write asymmetry & wear-leveling
– Writes might take longer to complete compared to reads
– Excessive writes to a single NVM cell can destroy it
• Write-limited access methods
– NVM-aware B+tree, hash table
Perform fewer writes, and instead do more reads
FPTREE: A HYBRID SCM-DRAM PERSISTENT AND CONCURRENT B-TREE FOR STORAGE CLASS MEMORY (SIGMOD 2016)
NVM-AWARE B+TREE
• Leave the entries in the leaf node unsorted
  – Require a linear scan instead of a binary search
  – But, fewer writes associated with shuffling entries
[Figure: unsorted leaf (1 5 3 2 4), fewer writes, versus sorted leaf (1 2 3 4 5), more writes.]
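The trade-off on this slide can be made concrete by counting slot writes in a toy leaf node. The sketch below is an assumption-laden illustration, not the FPTree design: `SortedLeaf` and `UnsortedLeaf` are hypothetical classes, a "write" means one entry slot stored, and bookkeeping such as validity bitmaps or entry counts is ignored.

```python
import bisect


class SortedLeaf:
    """Leaf that keeps keys in order; inserts shift later entries."""

    def __init__(self):
        self.keys = []
        self.writes = 0

    def insert(self, key):
        pos = bisect.bisect_left(self.keys, key)
        # Every entry at or after pos moves one slot right, plus the new entry.
        self.writes += (len(self.keys) - pos) + 1
        self.keys.insert(pos, key)

    def contains(self, key):
        # Binary search is possible because the keys are sorted.
        pos = bisect.bisect_left(self.keys, key)
        return pos < len(self.keys) and self.keys[pos] == key


class UnsortedLeaf:
    """Leaf that appends keys; lookups fall back to a linear scan."""

    def __init__(self):
        self.keys = []
        self.writes = 0

    def insert(self, key):
        self.keys.append(key)  # one slot written, nothing shifted
        self.writes += 1

    def contains(self, key):
        return key in self.keys  # linear scan over the leaf
```

Inserting the slide's keys in the order 1, 5, 3, 2, 4 costs one write per key in the unsorted leaf, while the sorted leaf also pays for every shifted entry.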
NVM-AWARE B+TREE
• Temporarily relax the balance of the tree
  – Extra node reads, fewer writes associated with balancing nodes
[Figure: unbalanced tree (fewer writes) versus balanced tree (more writes).]
NVM-AWARE ACCESS METHODS
• More design principles will be covered in the next tutorial
  – Data Structures Engineering For NVM, Ismail Oukid and Wolfgang Lehner, TU Dresden
OPEN PROBLEM: Synthesizing other NVM-aware access methods.
BLUEPRINT OF AN NVM DBMS
[Figure: blueprint of an NVM DBMS in three numbered layers.]
• ACCESS INTERFACES: ALLOCATOR INTERFACE, FILESYSTEM INTERFACE
• STORAGE MANAGER: LOGGING & RECOVERY, DATA PLACEMENT, ACCESS METHODS
• EXECUTION ENGINE: PLAN EXECUTOR, QUERY OPTIMIZER, SQL EXTENSIONS
HOW TO BUILD A NON-VOLATILE MEMORY DBMS, SIGMOD 2017 (TUTORIAL)
EXECUTION ENGINE
PLAN EXECUTOR
• Query processing algorithms
  – Sorting algorithm
  – Join algorithm
• Reduce the number of writes
  – Adjusting the write-intensity knob
  – Write-limited algorithms
WRITE-LIMITED SORTS AND JOINS FOR PERSISTENT MEMORY, VLDB 2014
SEGMENT SORT
• Hybrid sorting algorithm
  – Run merge sort on a part of the input (segment): x%
  – Run selection sort on the rest of the input: (100-x)%
  – Adjust “x” to limit the number of writes
[Figure: example input 1 5 3 9 4 7 2 10 11 12 6 8, split between selection sort (fewer writes) and merge sort (more writes).]
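A minimal in-memory sketch of the hybrid idea follows. It is a simplification under stated assumptions rather than the algorithm from the VLDB 2014 paper: `x` is a fraction in [0, 1] instead of a percentage, the function names and the `counter` dictionary are hypothetical, and a "write" is counted each time an element is copied into an output run.

```python
import heapq


def merge_sort(items, counter):
    """Top-down merge sort; each element copied into a merged run is one write."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort(items[:mid], counter)
    right = merge_sort(items[mid:], counter)
    merged = list(heapq.merge(left, right))
    counter["reads"] += len(merged)
    counter["writes"] += len(merged)
    return merged


def selection_sort(items, counter):
    """Write-limited selection sort: rescan the input for every output slot,
    so each element is written once at the cost of quadratic reads."""
    out = []
    taken = [False] * len(items)  # bookkeeping assumed to live in DRAM, not counted
    for _ in range(len(items)):
        best = None
        for i, value in enumerate(items):
            counter["reads"] += 1
            if not taken[i] and (best is None or value < items[best]):
                best = i
        taken[best] = True
        out.append(items[best])
        counter["writes"] += 1
    return out


def segment_sort(items, x):
    """Merge-sort the first x fraction, selection-sort the rest, then merge both runs."""
    counter = {"reads": 0, "writes": 0}
    split = int(len(items) * x)
    run_a = merge_sort(items[:split], counter)
    run_b = selection_sort(items[split:], counter)
    result = list(heapq.merge(run_a, run_b))
    counter["writes"] += len(result)
    return result, counter
```

Calling `segment_sort(data, 0.0)` and `segment_sort(data, 1.0)` on the same input returns the same sorted list, but the counters show the knob at work: a lower `x` trades extra reads for fewer writes.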
SEGMENT GRACE JOIN
• Hybrid join algorithm
  – Materialize a part of the input partitions: x%
  – Iterate over input for remaining partitions: (100-x)%
  – Adjust “x” to limit the number of writes
[Figure: partitions P1, P2, P3; iterating without materializing gives fewer writes, materializing gives more writes.]
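The same knob can be sketched for the join. The code below is an illustrative toy, not the published algorithm: rows are plain tuples, `key` is the index of the join column, `x` is a fraction in [0, 1], partitions are chosen with Python's built-in `hash`, and the function name and `stats` counters are hypothetical.

```python
from collections import defaultdict


def segment_grace_join(left, right, key, num_partitions, x):
    """Equi-join two lists of tuples on column `key`.

    The first int(num_partitions * x) partitions are materialized (written)
    during one partitioning pass; every remaining partition is rebuilt by
    rescanning both inputs, which costs reads instead of writes.
    """
    materialized = int(num_partitions * x)
    stats = {"reads": 0, "writes": 0}

    def part(row):
        return hash(row[key]) % num_partitions

    # Partitioning pass: write only rows that fall in a materialized partition.
    lparts, rparts = defaultdict(list), defaultdict(list)
    for rows, parts in ((left, lparts), (right, rparts)):
        for row in rows:
            stats["reads"] += 1
            p = part(row)
            if p < materialized:
                parts[p].append(row)
                stats["writes"] += 1

    out = []
    for p in range(num_partitions):
        if p < materialized:
            l_rows, r_rows = lparts[p], rparts[p]
            stats["reads"] += len(l_rows) + len(r_rows)
        else:
            # Iterate over the full inputs again instead of materializing.
            l_rows = [row for row in left if part(row) == p]
            r_rows = [row for row in right if part(row) == p]
            stats["reads"] += len(left) + len(right)
        # Per-partition hash join; the hash table is assumed to fit in DRAM.
        table = defaultdict(list)
        for row in l_rows:
            table[row[key]].append(row)
        for row in r_rows:
            for match in table.get(row[key], []):
                out.append((match, row))
    return out, stats
```

With `x = 0.0` nothing is written and every partition rescans the inputs; with `x = 1.0` each input row is written once, as in a conventional Grace join.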
SQL EXTENSIONS
SQL EXTENSIONS
• Allow the user to control data placement on NVM
  – Certain performance-critical tables and materialized views
• Store only a subset of the columns on NVM
  – Exclude certain columns from being stored on NVM
ALTER TABLESPACE nvm_table_space DEFAULT ON_NVM;
ALTER TABLE orders ON_NVM EXCLUDE(order_tax);
NVM-RELATED SQL EXTENSIONS
• Need to construct new NVM-related extensions
  – Standardize these extensions
OPEN PROBLEM: Need to construct new extensions and standardize them.
QUERY OPTIMIZER
QUERY OPTIMIZATION
• Cost-based query optimizer
  – Distinguish between sequential & random accesses
  – But not between reads and writes
• NVM-oriented redesign
  – Differentiate between reads and writes in cost model
MAKING COST-BASED QUERY OPTIMIZATION ASYMMETRY-AWARE, DAMON 2012
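SEQUENTIAL SCAN
• Accounts for sequential access of all pages in table
  – Does not distinguish reads and writes
  – $\mathrm{Cost}(\text{sequential scan}) = \mathrm{Cost}_{\text{sequential}} \times \lVert \text{Table} \rVert_{\text{page-count}}$
• Updated cost function
  – $\mathrm{Cost}(\text{sequential scan}) = \mathrm{Cost}_{\text{sequential-reads}} \times \lVert \text{Table} \rVert_{\text{page-count}}$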
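HASH JOIN
• Function accounts for reading and writing all data once
  – Does not distinguish reads and writes
  – $\mathrm{Cost}(\text{hash join}) = (\mathrm{Cost}_{\text{sequential}} + \mathrm{Cost}_{\text{random}}) \times (\lVert \text{Inner-Table} \rVert_{\#\text{pages}} + \lVert \text{Outer-Table} \rVert_{\#\text{pages}})$
• Updated cost function
  – $\mathrm{Cost}(\text{hash join}) = (\mathrm{Cost}_{\text{sequential-reads}} + \mathrm{Cost}_{\text{random-writes}}) \times (\lVert \text{Inner-Table} \rVert_{\#\text{pages}} + \lVert \text{Outer-Table} \rVert_{\#\text{pages}})$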
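The updated sequential scan and hash join cost functions translate directly into code. In this sketch the `CostParams` class and the numeric per-page costs are hypothetical placeholders chosen only to show the asymmetry; they are not values from the DaMoN 2012 paper or from any real optimizer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostParams:
    """Per-page cost constants; the values used below are illustrative only."""
    seq_read: float
    rand_write: float


def seq_scan_cost(params, table_pages):
    # Cost(sequential scan) = Cost_sequential-reads * ||Table||_page-count
    return params.seq_read * table_pages


def hash_join_cost(params, inner_pages, outer_pages):
    # Cost(hash join) = (Cost_sequential-reads + Cost_random-writes)
    #                   * (||Inner-Table||_#pages + ||Outer-Table||_#pages)
    return (params.seq_read + params.rand_write) * (inner_pages + outer_pages)


# Symmetric model: a write costs the same as a read.
symmetric = CostParams(seq_read=1.0, rand_write=1.0)
# Asymmetry-aware model: writes are assumed (hypothetically) to cost 8x a read.
asymmetric = CostParams(seq_read=1.0, rand_write=8.0)
```

Under these placeholder values the scan cost is identical in both models, while the hash join becomes much more expensive once writes are priced separately, which is the kind of shift that can change the plan the optimizer picks.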
EVALUATION
• Compare different cost models on NVM emulator
  – Traditional cost model
  – NVM-aware cost model
• TPC-H benchmark on Postgres
• Performance impact
  – 50% speedup of queries
  – Maximum speedup: 500% (!)
  – Maximum slowdown: 1%
NVM-ORIENTED DESIGN
• Page-oriented cost functions
  – NVM is byte-addressable
OPEN PROBLEM: Update cost model to factor in byte-addressability of NVM
LESSONS LEARNED
LESSONS LEARNED
• Important to reexamine design choices
  – To leverage the raw device performance differential
  – Across different components of the DBMS
  – Helpful to think about an NVM-only hierarchy
[Figure: DBMS on top of an NVM-only storage hierarchy.]
LESSONS LEARNED
• NVM invalidates multiple long-held assumptions
  – Storage is several orders of magnitude slower than DRAM
  – Large performance gap between sequential & random accesses
  – Memory read and write latencies are symmetric
BLUEPRINT OF AN NVM DBMS
[Figure: blueprint of an NVM DBMS in three numbered layers.]
• ACCESS INTERFACES: ALLOCATOR INTERFACE, FILESYSTEM INTERFACE
• STORAGE MANAGER: LOGGING & RECOVERY, DATA PLACEMENT, ACCESS METHODS
• EXECUTION ENGINE: PLAN EXECUTOR, QUERY OPTIMIZER, SQL EXTENSIONS
HOW TO BUILD A NON-VOLATILE MEMORY DBMS, SIGMOD 2017 (TUTORIAL)
FUTURE WORK
• Highlighted a set of open problems
  – Data placement
  – Access methods
  – Query optimization
• Improvement in performance of storage layer
  – By several orders of magnitude over a short period of time
  – We anticipate high-impact research in this space