Persistent Memory Support in Red Hat Enterprise Linux
Jeff Moyer, Red Hat, Inc.
Andy Rudoff, Intel
June 30, 2016
Agenda
● A Bit of Background Information
● Software Architecture
● pmem Configuration and Management
● pmem Advantages/Challenges
● pmem Examples
Background
Persistent Memory
● Performance within an order of magnitude of DRAM
● Byte-addressable
● Persistent
● DMA Target
● High capacity
● Use Cases:
  – Rapid start-up (data set already in memory)
  – Random, odd-shaped accesses (avoid transferring blocks)
  – Fast write-cache
Flavors of NVDIMMs
● NVDIMM-N
  – Energy-backed DRAM
  – Flash used for persistence (not exposed to OS)
  – Performance on par with DRAM
  – Small Capacity
  – Expensive
● NVDIMM-P
  – Same order of magnitude performance as DRAM (read: may be slightly slower)
  – Much larger capacity
  – Cheaper (?)
Image Source: SNIA_NVDIMM
Software Architecture
NVM Programming Model
36+ Member Companies
http://snia.org/sites/default/files/NVMProgrammingModel_v1.pdf
NVM.PM Modes
Source: ProgModel
Major Kernel Subsystems
● Architecture Support
● Platform Support (ACPI, etc)
● Device Drivers
● Block Layer
● Network Core
● VFS
● File systems: ext2, ext4, xfs, ...
● Virtual Memory
● Process Control
● System Call Interface
Modified Kernel Subsystems
● Architecture Support
● Platform Support (ACPI, etc)
● Device Drivers
● Block Layer
● Network Core
● VFS
● File systems: ext2, ext4, xfs, ...
● Virtual Memory
● Process Control
● System Call Interface
Software Architecture
Source: Namespace
pmem Configuration and Management
PMEM Namespace Configurations
RAW
● Default, but don't use it!
SECTOR
● Atomic Sector Updates (provided by the BTT)
● Configurable Sector Size (includes DIF/DIX)
● Applicable to both PMEM and BLK namespaces
MEMORY
● DAX Support
● Applies only to PMEM namespaces
● Requires space for kernel data structures
“Memory” Namespaces
● Need to reserve space for kernel page structures
● 2 options:
  1) Eat up DRAM
  2) Lose storage space
● 64 bytes of page structure per 4K page = 16 GB per TB of persistent memory
● A 32 GB DIMM needs 512 MB
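The overhead figures above follow from simple arithmetic. A minimal sketch, assuming 64-byte page structures and 4 KiB pages as stated on the slide, and binary units (1 TB taken as 2^40 bytes); the helper name is illustrative and is not a kernel or ndctl interface:

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions taken from the slide: one 64-byte page structure
 * for every 4 KiB page of persistent memory. */
#define PAGE_STRUCT_BYTES 64ULL
#define PAGE_BYTES        4096ULL

/* Bytes needed for the page structures that describe pmem_bytes of
 * persistent memory (hypothetical helper for illustration only). */
uint64_t page_struct_overhead(uint64_t pmem_bytes)
{
    assert(pmem_bytes % PAGE_BYTES == 0);
    return (pmem_bytes / PAGE_BYTES) * PAGE_STRUCT_BYTES;
}
```

The ratio is 64/4096, or 1/64 of the capacity, whichever of the two options ends up paying for it.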
Configuring DAX
# ndctl list
[
  {
    "dev":"namespace0.0",
    "mode":"raw",
    "size":17179869184,
    "blockdev":"pmem0"
  }
]
# fdisk -l /dev/pmem0

Disk /dev/pmem0: 17.2 GB, 17179869184 bytes, 33554432 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Configuring DAX using DRAM to host struct pages
# ndctl create-namespace -f -e namespace0.0 --mode=memory --map=mem
{
  "dev":"namespace0.0",
  "mode":"memory",
  "size":17177772032,
  "uuid":"3c88e67f-8b25-4661-adf9-f0ed390cbd6a",
  "blockdev":"pmem0"
}
# fdisk -l /dev/pmem0

Disk /dev/pmem0: 17.2 GB, 17177772032 bytes, 33550336 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes

Note: the namespace is now 2 MB shy of 16 GB.
Configuring DAX using the NVDIMM to host struct pages
# ndctl create-namespace -f -e namespace0.0 --mode=memory --map=dev
{
  "dev":"namespace0.0",
  "mode":"memory",
  "size":16909336576,
  "uuid":"b5c852b2-75c2-4e8b-94b2-06694d6ff243",
  "blockdev":"pmem0"
}
# fdisk -l /dev/pmem0

Disk /dev/pmem0: 16.9 GB, 16909336576 bytes, 33026048 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
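The sizes that ndctl reports in the two memory-mode examples line up with the overhead arithmetic: relative to the 17179869184-byte raw namespace, --map=mem gives up 2 MiB, and --map=dev gives up a further 64 bytes per 4 KiB page. The sketch below only checks that observation against the numbers on the slides. The 2 MiB figure and the formula are read off those numbers; treat them as an assumption about the layout, not a description of how the kernel computes it. Helper names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

#define RAW_BYTES      17179869184ULL  /* raw namespace size from ndctl list */
#define MAP_MEM_BYTES  17177772032ULL  /* size reported with --map=mem */
#define MAP_DEV_BYTES  16909336576ULL  /* size reported with --map=dev */

/* Space given up relative to the raw namespace (illustrative helper). */
uint64_t reserved_bytes(uint64_t raw, uint64_t usable)
{
    assert(usable <= raw);
    return raw - usable;
}

/* 64 bytes of page structure per 4 KiB page, as on the
 * "Memory" Namespaces slide. */
uint64_t page_struct_bytes(uint64_t pmem_bytes)
{
    return (pmem_bytes / 4096ULL) * 64ULL;
}
```

With these inputs, the --map=dev namespace is 258 MiB smaller than the raw one: 2 MiB plus 256 MiB of page structures for 16 GiB.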
Configuring a BTT Namespace
# ndctl list
[
  {
    "dev":"namespace0.0",
    "mode":"raw",
    "size":17179869184,
    "blockdev":"pmem0"
  }
]
# ndctl create-namespace -f -e namespace0.0 -m sector
{
  "dev":"namespace0.0",
  "mode":"sector",
  "uuid":"9e24b27a-bb46-44ad-b7fb-81ebfee0a3d6",
  "sector_size":4096,
  "blockdev":"pmem0s"
}
# fdisk -l /dev/pmem0s

Disk /dev/pmem0s: 17.2 GB, 17162027008 bytes, 4189948 sectors
Units = sectors of 1 * 4096 = 4096 bytes
Sector size (logical/physical): 4096 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
File System Setup for DAX
# mkfs -t xfs -d su=1g,sw=1 /dev/pmem0
# mount -t xfs -o dax /dev/pmem0 /mnt/dax

# mkfs -t ext4 /dev/pmem0
# mount -t ext4 -o dax /dev/pmem0 /mnt/dax

NOTE: Inconsistent Behavior:
  – ext4 mount fails if DAX is unavailable
  – XFS mounts anyway and logs a message
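Because an XFS mount can succeed without DAX, a program may want to confirm that the option is actually in effect. A minimal sketch of the check, assuming the option shows up as a `dax` token (bare, or with a value such as `dax=always` on newer kernels) in the comma-separated options field of the mount's line in /proc/mounts. The helper is hypothetical and the exact spelling of the option is an assumption that may vary by kernel version:

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if the comma-separated mount option string `opts` contains
 * the option `name`, either bare ("dax") or with a value ("dax=..."),
 * and 0 otherwise. Hypothetical helper for illustration. */
int has_mount_option(const char *opts, const char *name)
{
    assert(opts != NULL && name != NULL);
    size_t n = strlen(name);
    const char *p = opts;

    while (*p != '\0') {
        /* Find the end of the current comma-separated token. */
        const char *end = strchr(p, ',');
        size_t len = end ? (size_t)(end - p) : strlen(p);

        /* Match only whole tokens, so "nodax" does not match "dax". */
        if (len >= n && strncmp(p, name, n) == 0 &&
            (len == n || p[n] == '='))
            return 1;

        if (end == NULL)
            break;
        p = end + 1;
    }
    return 0;
}
```

A caller would read the options field for /mnt/dax from /proc/mounts and pass it in, for example `has_mount_option("rw,relatime,dax", "dax")`.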
pmem Advantages/Challenges
pmem Challenges
● Non-transparent usage means application changes
  – App must decide what data lives in each tier
  – Any app change is impactful
● Do volatile memory algorithms “just work”?
  – Sure, for volatile use cases
  – Algorithms for persistence are different
● Primary challenge: decide where to spend effort
pmem Examples
Programming Model Summary
● pmem exposed as memory-mapped files
  – Always safe to use standard API: msync()
● Only when Linux says it is safe:
  – Optimized flush from user space
    ● CLFLUSH or CLFLUSHOPT+fence or CLWB+fence or NT store+fence
  – libpmem's pmem_is_pmem() function tells you if it is safe
● Only when Linux says platform supports it (future use):
  – CPU caches are part of persistence domain
  – libpmem's pmem_persist() will handle this
● Standard API may flush to smaller failure domain than optimized flush
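The flush instructions listed above operate on whole cache lines, so an optimized user-space flush walks every line that overlaps the stored range and then issues a fence. A minimal sketch of that loop, assuming a 64-byte cache line; the flush itself is passed in as a hypothetical callback so that the example stays portable, and none of these names come from libpmem:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64u  /* assumed cache line size */

/* Hypothetical callback standing in for one flush instruction
 * (CLFLUSH, CLFLUSHOPT or CLWB) applied to the line at line_addr. */
typedef void (*flush_line_fn)(uintptr_t line_addr, void *ctx);

/* Calls `flush` once for every cache line that overlaps
 * [addr, addr + len) and returns the number of lines visited. */
size_t flush_range(uintptr_t addr, size_t len, flush_line_fn flush, void *ctx)
{
    assert(flush != NULL);
    if (len == 0)
        return 0;

    size_t lines = 0;
    uintptr_t end = addr + len;
    /* Round the start down to a cache line boundary. */
    uintptr_t line = addr & ~(uintptr_t)(CACHE_LINE - 1);

    for (; line < end; line += CACHE_LINE) {
        flush(line, ctx);
        lines++;
    }
    /* With CLFLUSHOPT or CLWB, a store fence would be issued here. */
    return lines;
}

/* Example callback that only counts lines; it flushes nothing. */
void count_line(uintptr_t line_addr, void *ctx)
{
    (void)line_addr;
    (*(size_t *)ctx)++;
}
```

A 12-byte store that starts 4 bytes before a line boundary touches two lines, which is why the loop rounds the start address down instead of stepping from `addr` itself.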
POSIX Load/Store Persistence
open(…);
pmem = mmap(…);
strcpy(pmem, "hello");
msync(pmem, 6, MS_SYNC);
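The elided open() and mmap() arguments depend on the application. As one possible filling-in, the sketch below maps a 4 KiB file with MAP_SHARED, stores the string, and calls msync(). The function name, mapping length, and file mode are assumptions made for illustration; on a DAX mount the path would point at a file under a directory such as /mnt/dax, but the same code runs against any file system.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define MAP_LEN 4096  /* illustrative mapping length */

/* Stores "hello" in the file at `path` through a shared mapping and
 * flushes it with msync(). Returns 0 on success, -1 on failure. */
int store_hello(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    /* Make sure the file is large enough to back the mapping. */
    if (ftruncate(fd, MAP_LEN) != 0) {
        close(fd);
        return -1;
    }

    char *pmem = mmap(NULL, MAP_LEN, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, 0);
    close(fd);  /* the mapping remains valid after close() */
    if (pmem == MAP_FAILED)
        return -1;

    strcpy(pmem, "hello");

    /* The mapping address is page aligned, as msync() requires. */
    int rc = msync(pmem, strlen("hello") + 1, MS_SYNC);

    munmap(pmem, MAP_LEN);
    return rc == 0 ? 0 : -1;
}
```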
pmem Programming Model Load/Store Persistence
open(…);
pmem = mmap(…);
assert(pmem_is_pmem(pmem, len));
strcpy(pmem, "hello");
pmem_persist(pmem, 6);
Storing More Than 8 Aligned Bytes
open(…);
pmem = mmap(…);
assert(pmem_is_pmem(pmem, len));
strcpy(pmem, "hello there");
pmem_persist(pmem, 12);

If a crash occurs before pmem_persist() completes, any of these may be found afterwards:
"\0\0\0\0\0\0\0\0\0\0\0\0"
"hello th\0\0\0\0"
"\0\0\0\0\0\0\0\0ere\0"
"hello there\0"
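The four outcomes come from the 12-byte string spanning two 8-byte units, each of which may or may not have reached persistence when the crash hits. The toy model below reproduces them in ordinary memory. It assumes 8-byte powerfail atomicity and that units persist independently of one another; the function and its bitmask parameter are illustrative and do not model real hardware ordering.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ATOMIC_UNIT 8  /* powerfail atomicity granularity, in bytes */

/* Builds in `out` the contents found after a crash if only the 8-byte
 * units whose bit is set in `persisted_mask` reached persistence.
 * `before` holds the old contents and `after` the intended new ones;
 * all three buffers are `len` bytes long. Toy model for illustration. */
void torn_state(char *out, const char *before, const char *after,
                size_t len, unsigned persisted_mask)
{
    assert(out != NULL && before != NULL && after != NULL);

    size_t unit = 0;
    for (size_t off = 0; off < len; off += ATOMIC_UNIT, unit++) {
        size_t n = (len - off < ATOMIC_UNIT) ? (len - off) : ATOMIC_UNIT;
        const char *src = ((persisted_mask >> unit) & 1u) ? after : before;
        memcpy(out + off, src + off, n);
    }
}
```

Masks 0, 1, 2, and 3 give the four states in order: nothing persisted, only the first unit ("hello th"), only the second unit ("ere\0"), and both.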
Visibility versus Powerfail Atomicity
Feature: Atomicity
● Atomic Store: 8 byte powerfail atomicity; much larger visibility atomicity
● TSX: Programmer must comprehend that XABORT, cache flush can abort
● LOCK CMPXCHG: Non-blocking algorithms depend on CAS, but CAS doesn’t include flush to persistence
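The CAS point can be made concrete with C11 atomics. In the sketch below, a successful compare-and-swap is followed by an explicit call to a persist hook. The hook is a hypothetical function pointer, standing in for something like a wrapper around msync() or pmem_persist(), and the names are illustrative. The sketch does not close the window the slide warns about: other threads can observe the new value after the CAS and before the flush.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hook that makes [addr, addr + len) persistent. */
typedef void (*persist_fn)(const void *addr, size_t len);

/* Atomically replaces *slot with `desired` if it still holds
 * `expected`, then flushes the new value through `persist`.
 * Returns true if the swap happened. */
bool cas_and_persist(_Atomic uint64_t *slot, uint64_t expected,
                     uint64_t desired, persist_fn persist)
{
    assert(slot != NULL && persist != NULL);

    if (!atomic_compare_exchange_strong(slot, &expected, desired))
        return false;

    /* The new value is visible to other threads here, but it is not
     * yet guaranteed to survive a power failure. */
    persist((const void *)slot, sizeof *slot);
    return true;
}

/* Stand-in persist hook that only counts calls. */
static int persist_calls;

void counting_persist(const void *addr, size_t len)
{
    (void)addr;
    (void)len;
    persist_calls++;
}
```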
NVM Libraries
● Transactions
  – Hardest part to get right, still non-trivial to use in library
● Persistent Memory Allocation
  – Always-consistent heap (no persistent memory leaks)
● Common Set of Atomic Operations
  – Lists, Allocation onto/off of lists
● Replication
  – Local active/passive now
  – Remote active/passive next
  – More flexible later
● More transparent usages supported over time
Transactional Object Store
Layers, top to bottom:
● application
● libpmemobj: BEGIN, END, ABORT; Allocate, Free
● libpmem: is_pmem(), persist()
● pmem
Simple pmemobj Transaction
TX_BEGIN_LOCK(pop, TX_LOCK_MUTEX, &op->mylock) {
    TX_STRCPY(op->greeting, "hello there");
} TX_END

struct myobj {
    PMEMmutex mylock;
    char greeting[GREETINGLEN];
};
Two Types of Atomicity
TX_BEGIN_LOCK(pop, TX_LOCK_MUTEX, &op->mylock) {
    TX_STRCPY(op->greeting, "hello there");
} TX_END

● Multi-Thread Atomicity: provided by the lock
● Powerfail Atomicity: provided by the transaction
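Powerfail atomicity is the less familiar of the two. One common way to get it is an undo log: snapshot the old contents before modifying them, and roll back on recovery if the transaction never committed. The toy model below shows that idea in ordinary memory. It is not libpmemobj's implementation; every name in it is hypothetical, it leaves out the lock, and a real library would also have to flush the log to persistence before touching the data.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define GREETINGLEN 32  /* illustrative buffer size */

/* Toy undo log covering a single fixed-size range. */
struct toy_tx {
    char *target;            /* range being modified */
    char undo[GREETINGLEN];  /* snapshot of the old contents */
    bool active;             /* true between begin and commit */
};

/* Snapshot the old contents before any modification is made. */
void toy_tx_begin(struct toy_tx *tx, char *target)
{
    assert(tx != NULL && target != NULL);
    tx->target = target;
    memcpy(tx->undo, target, GREETINGLEN);
    tx->active = true;
}

/* Mark the transaction complete; the snapshot is no longer needed. */
void toy_tx_commit(struct toy_tx *tx)
{
    assert(tx != NULL);
    tx->active = false;
}

/* Run at restart: roll back a transaction that never committed. */
void toy_tx_recover(struct toy_tx *tx)
{
    assert(tx != NULL);
    if (tx->active) {
        memcpy(tx->target, tx->undo, GREETINGLEN);
        tx->active = false;
    }
}
```

If a crash interrupts the copy between begin and commit, recovery restores the old greeting in full, so a reader sees either the old string or the new one and never a torn mix.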
NVM Library: pmem.io
Figure labels:
● User Space: Application, Application, Library, Standard File API, Load/Store
● Kernel Space: pmem-Aware File System, pmem-Aware File System, MMU Mappings
● NVDIMM
● Transactional

● Open Source
● http://pmem.io
● libpmem
● libpmemobj
● libpmemblk
● libpmemlog
● libvmem
● libvmmalloc
Summary
● Persistent Memory products available today
  – Capacities about to explode
● Linux is prepared
  – pmem driver stack, DAX, ext4, xfs, etc.
● RHEL is prepared
  – ndctl & other tools, validation
● Potential value of pmem programming is quite large
  – Applications re-organize data into memory, storage, and pmem
● Numerous challenges
  – NVM Libraries provide some solutions that applications can leverage
References
● ProgModel – http://www.snia.org/tech_activities/standards/curr_standards/npm
● Namespace – http://pmem.io/documents/NVDIMM_Namespace_Spec.pdf
● SNIA_NVDIMM – http://www.snia.org/forums/sssi/NVDIMM
● Williams_Vault – http://events.linuxfoundation.org/sites/events/files/slides/Managing%20Persistent%20Memory_0.pdf
● WIKI – https://nvdimm.wiki.kernel.org/