Systems | Fueling future disruptions
Research Faculty Summit 2018
What Should You Do With Persistent Memory?
Steven Swanson
Director, Non-Volatile Systems Lab
Computer Science and Engineering
UC San Diego
Non-volatile main memory (NVMM)
• Byte-addressable
• Denser than DRAM
• DRAM-comparable latency
• Higher bandwidth than SSD
• Ready for DMA / RDMA
[Figure: a CPU with cores, L1/L2 caches, and a last-level cache; memory controllers attach DRAM DIMMs (memory) and Intel 3D XPoint NVDIMMs (NVMM)]
What Should You Do With NVMM?
1. Use files and a conventional (distributed) file system
2. Use files and a better (distributed) file system
3. Build persistent data structures
4. Use it as slow DRAM
[Figure labels: EXT4, CEPH etc., Orion, NOVA, Log]
[Figure labels: File IO, Atomicity, Fault Tolerance, Speed, Direct Access (DAX)]
NOVA: A File System for NVMM
• A NOVA FS is a tree of logs
• One log per inode
  – Inode points to head and tail
  – Logs are not contiguous
• Many logs -> high concurrency
• Strong consistency guarantees
• Log-structured + journals + copy-on-write
[Figure: per-inode logging. The inode holds head and tail pointers into its inode log, which contains committed and uncommitted entries]
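The per-inode log on this slide can be sketched in a few lines. This is a toy in-memory model, not NOVA's code: the class and method names are hypothetical, a Python list stands in for log pages in NVMM, and the tail update is assumed to be a single atomic pointer-sized store.

```python
from dataclasses import dataclass, field


@dataclass
class InodeLog:
    """Toy model of one per-inode log (hypothetical names)."""
    entries: list = field(default_factory=list)  # stands in for log pages in NVMM
    head: int = 0   # index of the oldest live entry
    tail: int = 0   # entries in [head, tail) are committed

    def append(self, entry, crash_before_commit=False):
        # Step 1: write the entry past the tail. It is not yet visible.
        self.entries.append(entry)
        if crash_before_commit:
            return False
        # Step 2: commit by advancing the tail, modeled here as one
        # atomic pointer-sized store.
        self.tail = len(self.entries)
        return True

    def recover(self):
        # After a crash, anything past the tail was never committed.
        del self.entries[self.tail:]

    def committed(self):
        return self.entries[self.head:self.tail]


log = InodeLog()
log.append(("chmod", 0o644))
log.append(("write", "page 7"), crash_before_commit=True)
log.recover()
print(log.committed())
```

Because each inode has its own log and tail, appends to different files do not contend with each other in this model, which is the concurrency argument the slide makes.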
Atomicity: Logging for Simple Metadata Operations
• Combines log-structuring, journaling, and copy-on-write
• Log-structuring for single-log updates
  – Write, msync, chmod, etc.
  – Lower overhead than journaling and shadow paging
[Figure: a file log and its tail pointer]
Atomicity: Lightweight Journaling for Complex Metadata Operations
• Lightweight journaling for updates across logs
  – Unlink, rename, etc.
  – Journal log tails instead of metadata or data
[Figure: a file log and a directory log, each with its own tail; the journal holds the directory tail and the file tail]
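A minimal sketch of journaling log tails for an operation that touches two logs. It assumes an undo-style journal that records each log's old tail and is cleared to commit; the slide does not spell out the recovery protocol, so that part, along with every name below, is an assumption for illustration.

```python
class Log:
    """One per-inode log; entries in [0, tail) are committed."""

    def __init__(self):
        self.entries = []
        self.tail = 0


def update_across_logs(journal, updates, crash_after=None):
    """Append one entry to each log and commit all tails together.

    journal: list standing in for a small persistent journal.
    updates: list of (log, entry) pairs.
    crash_after: simulate a crash after this many tail updates.
    """
    # 1. Write the new entries past each tail (not yet visible).
    for log, entry in updates:
        log.entries.append(entry)
    # 2. Journal only the old tail values, not metadata or data.
    journal.extend((log, log.tail) for log, _ in updates)
    # 3. Advance the tails one by one; a crash here leaves them torn.
    for done, (log, _) in enumerate(updates):
        if crash_after is not None and done == crash_after:
            return False
        log.tail = len(log.entries)
    # 4. Commit by clearing the journal.
    journal.clear()
    return True


def recover(journal):
    """Roll every journaled tail back, then drop uncommitted entries."""
    for log, old_tail in journal:
        log.tail = old_tail
        del log.entries[old_tail:]
    journal.clear()


journal = []
file_log, dir_log = Log(), Log()
update_across_logs(
    journal,
    [(dir_log, "add dentry b"), (file_log, "link +1")],
    crash_after=1,
)
recover(journal)
print(dir_log.tail, file_log.tail)
```

The journal stays small because it holds one tail value per affected log, regardless of how large the log entries themselves are.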
Atomicity: Copy-on-write for file data
• Copy-on-write for file data
  – Log only contains metadata
  – Log is short
  – Instant data GC
[Figure: a file log and its tail, with data pages Data 0, Data 1, and Data 2]
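The copy-on-write path can be modeled as follows. This is a sketch under simplifying assumptions: writes are page-aligned and keyed by file offset, a dict stands in for NVMM data pages, and all names are hypothetical.

```python
class CowFile:
    """Toy copy-on-write file: the log holds only metadata."""

    def __init__(self):
        self.pages = {}      # page id -> bytes, stands in for NVMM data pages
        self.next_page = 0
        self.log = []        # committed entries: (file_offset, page_id)
        self.index = {}      # file_offset -> page_id, derived from the log

    def write(self, offset, data):
        # 1. Copy the new data into a freshly allocated page.
        page_id = self.next_page
        self.next_page += 1
        self.pages[page_id] = bytes(data)
        # 2. Commit by appending a small metadata entry to the log.
        self.log.append((offset, page_id))
        # 3. The page this write replaced is dead as soon as the entry
        #    commits, so it can be reclaimed immediately.
        old = self.index.get(offset)
        self.index[offset] = page_id
        if old is not None:
            del self.pages[old]

    def read(self, offset):
        return self.pages[self.index[offset]]


f = CowFile()
f.write(0, b"version 1")
f.write(0, b"version 2")
print(f.read(0), len(f.pages))
```

Since the log records only `(offset, page)` pairs, it stays short, and a superseded data page never needs a separate garbage-collection pass in this model.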
Filebench throughput
[Figure: bar chart of throughput in Kops per second (y-axis 0 to 400) for the Fileserver, Varmail, Webproxy, and Webserver workloads, comparing Ext4-datajournal, Ext4-DAX, xfs-DAX, and NOVA]
What Should You Do With NVMM?
1. Use files and a conventional (distributed) file system
2. Use files and a better (distributed) file system
3. Build persistent data structures
4. Use it as slow DRAM
[Figure labels: ???, NOVA]
Existing Distributed File Systems are Slow
[Figure labels: Application, Client FS, Distributed File System Client, Network Stack, Network Stack, Distributed File System Server, Local File System]
Orion: A Distributed Persistent Memory File System
[Figure labels: Application, OrionFS, RDMA, OrionFS]
Orion: Key Features
• Based on NOVA
• Mirrored metadata on client
  – Clients keep a local NVMM copy of each inode’s log
  – Leases + simple arbitration for concurrent updates
• Mostly-local operation
  – Local read cache
  – CoW creates new, local copy
• Pervasive RDMA
  – All addresses/pointers are RDMA-friendly
  – Zero-copy IO for most transfers (NOVA data structures are RDMA targets)
  – Single-ended remote data access
Application performance on Orion
What Should You Do With NVMM?
1. Use files and a conventional (distributed) file system
2. Use files and a better (distributed) file system
3. Build persistent data structures
4. Use it as slow DRAM
[Figure label: Log]
Building NVMM Data Structures Is Hard
• All existing programming errors are still possible
  – Memory leaks
  – Multiple frees
  – Locking errors
• There are new kinds of errors
  – Pointers between NV memory pools
  – Pointers from NVMM to DRAM
• Programmers get this stuff wrong
• Rebooting/restarting won’t help!
• Language + compiler support will come, but slowly
Optimizing RocksDB
What Should You Do With NVMM?
1. Use files and a conventional (distributed) file system
2. Use files and a better (distributed) file system
3. Build persistent data structures
4. Use it as slow DRAM
Easy; ~5x gains
Pretty easy; ~10x gains
Really hard; ~30x gains
The NVMM Programmability Gap
Optimizing RocksDB
File Emulation
File Emulation
• Normal write-ahead logging
  – open();
  – write(); sync();
• Emulate read/write in user space
  – open(); mmap();
  – memcpy() + clwb + fence
• Almost POSIX semantics
  – Minimal changes to app logic
  – No complex logging, allocation, or locking
• Almost persistent data structure performance
  – Just 10% slower.
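The emulation pattern on this slide can be sketched with Python's standard `mmap` module. The `EmulatedFile` wrapper and its method names are hypothetical. Python cannot issue `clwb` or a fence, so `mmap.flush()` (an msync-style flush) stands in for that step here; it is heavier than the cache-line flush a C implementation on a DAX-mounted file system would use.

```python
import mmap
import os
import tempfile


class EmulatedFile:
    """User-space write()/pread() on a memory-mapped file (hypothetical wrapper)."""

    def __init__(self, path, size):
        self.fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        os.ftruncate(self.fd, size)          # fix the mapping size up front
        self.map = mmap.mmap(self.fd, size)  # open(); mmap();
        self.pos = 0

    def write(self, data):
        end = self.pos + len(data)
        self.map[self.pos:end] = data        # the memcpy() step
        self.map.flush()                     # stand-in for clwb + fence
        self.pos = end
        return len(data)

    def pread(self, offset, length):
        return bytes(self.map[offset:offset + length])

    def close(self):
        self.map.close()
        os.close(self.fd)


with tempfile.TemporaryDirectory() as tmp:
    wal = EmulatedFile(os.path.join(tmp, "wal.log"), 4096)
    wal.write(b"put k1 v1\n")
    wal.write(b"put k2 v2\n")
    print(wal.pread(0, 20))
    wal.close()
```

The application still sees a write-then-read file interface, which is why the slide describes the change to app logic as minimal: only the calls underneath the write-ahead log are swapped out.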
File Emulation Speedups
[Figure: bar chart of file emulation speedup (y-axis 0 to 60) for SQLite, Kyoto Cabinet, and RocksDB on XFS-DAX, Ext4-DAX, and NOVA]
What Should You Do With NVMM?
• You should study it!
  – Many interesting, open problems remain
  – Lots of PhDs to come
• You should use it!
  – Use a file system!
  – Want more performance? Use file emulation!
  – Want more performance? Build persistent data structures.
NOVA is open source. We are preparing it for “upstreaming” into Linux.
To help or try it out: https://github.com/NVSL/linux-nova
Thanks!