2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved.

NTT Confidential

Challenges for Implementing PMEM Aware Application with PMDK

Yoshimi Ichiyanagi
NTT Software Innovation Center

Outline
1. Introduction
2. Background and Motivation
3. How to use PMEM
4. Challenges for implementing PMEM-aware applications
5. Challenges for performance evaluation to get valid results

Introduction
• NTT Software Innovation Center is part of NTT Laboratories
• Worked on system software
  • Distributed file system (HDFS)
  • Operating system (Linux kernel)
• Trying to rewrite open-source software using new storage and a new library

Background
• Persistent memory (PMEM) is starting to ship
  • NVDIMM-N
  • Intel® Optane™ DC Persistent Memory
• PMEM features
  • Memory-like features: low latency, byte-addressable
  • Storage-like features: large capacity, non-volatile

[Figure: memory and storage hierarchy below the CPU, in the order DRAM, persistent memory (PMEM), solid state disk (SSD), hard disk (HDD). Latency goes from low to high and capacity from small to large along the hierarchy.]

Motivation
• Trying to rewrite storage applications so that they take advantage of PMEM features
  • RDBMS (e.g. PostgreSQL)
  • Message queue systems (e.g. Apache Kafka)
  • etc.
• Let me share the challenges I faced with PMEM
  • Implementation
  • Performance evaluation

HW/SW components
• What kind of PMEM is used?
  • NVDIMM-N
• What Linux kernel support is necessary to use PMEM?
  • Direct Access for files (DAX)
  • Supported by ext4 and xfs
• What user library is used?
  • Persistent Memory Development Kit (PMDK)

DAX FS and PMDK

[Figure: three I/O paths from a user-space application down to PMEM (NVDIMM-N). Traditional FS: file I/O APIs, context switch into the kernel, page cache; access like storage. DAX FS: file I/O APIs, context switch into the kernel, no page cache; access like storage. DAX FS and PMDK: memory-mapped file accessed through the library (PMDK) with CPU instructions; access like memory.]

Benefits of DAX FS and PMDK
• With DAX FS only
  • Not necessary to rewrite applications
• With DAX FS and PMDK
  • Necessary to rewrite applications
  • Performance of I/O-intensive workloads is greatly improved

PMDK features
• Application accesses PMEM like memory
  • Memory-mapped file (mmap file)
• Application developers can choose a fine-grained sync size
  • Details are on the next slide
• CPU instructions suited to the size of the copied data are selected
  • 8 / 16 / 32 / 64-byte registers
  • MOVNT and SFENCE, bypassing the CPU caches

PMDK sync functions
• pmem_msync()
  • File metadata and written data are flushed
  • pmem_msync() calls the msync syscall
  • The msync syscall is the general sync API for mmap files
• pmem_drain()
  • Only written data is flushed
  • pmem_drain() is faster than pmem_msync()

              pmem_msync()                                  pmem_drain()
Durability    〇 (flushes file metadata and written data)   △ (flushes written data only)
Performance   ×                                             〇
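Since pmem_msync() calls the msync syscall, the slower but metadata-safe path can be sketched with plain POSIX calls. This is only an illustration under assumptions: the helper name mapped_write_msync() is hypothetical, and mmap()/msync() stand in for the PMDK calls so that the sketch does not depend on libpmem.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not a PMDK API): copy len bytes to offset off of a
 * memory-mapped file and make them durable with msync().
 * Returns 0 on success, -1 on failure. */
int mapped_write_msync(const char *path, size_t file_len,
                       size_t off, const void *src, size_t len)
{
    assert(path != NULL && src != NULL);
    if (off > file_len || len > file_len - off)
        return -1;                      /* write would not fit in the file */

    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)file_len) != 0) {
        close(fd);
        return -1;
    }

    char *addr = mmap(NULL, file_len, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, 0);
    close(fd);                          /* the mapping stays valid */
    if (addr == MAP_FAILED)
        return -1;

    memcpy(addr + off, src, len);       /* store through the mapping */

    /* msync() needs a page-aligned address, so sync the whole mapping. */
    int rc = msync(addr, file_len, MS_SYNC);
    munmap(addr, file_len);
    return rc;
}
```

With PMDK, the memcpy()/msync() pair would be replaced by pmem_memcpy_nodrain() followed by either pmem_msync() or pmem_drain(), depending on whether file metadata also has to be flushed.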

Implementing application with PMDK
• Trying to rewrite storage applications with PMDK
  • RDBMS: PostgreSQL (PG)
  • Message queue systems: Apache Kafka
• Let me share know-how gained by rewriting PG
  • Checkpoint file
    • Many writes occur during checkpoint
  • Write-ahead logging (WAL)
    • Critical for transaction performance

4. Challenges for implementation
1. How to resize checkpoint file
2. How to select sync function for WAL

Resizing checkpoint file
• Huge tables and similar objects consist of multiple checkpoint files
  • Variable length, up to 1 GB each
  • Checkpoint files therefore need to be resized
• PMDK provides APIs for mmap files
  • Difficult to resize an mmap file without overhead
  • Best practice is to access only fixed-size files
• We changed the access method only for checkpoint files that are 1 GB

How to resize mmap file

[Figure: two steps, "Enlarge file" then "Remap file". After the file is enlarged, the existing file mapping in the virtual address space covers only part of the file, so only part of the file can be accessed with PMDK. After the file is remapped, the mapping covers the whole file, so the whole file can be accessed with PMDK.]

Implementation to enlarge file

          DAX FS                  DAX FS and PMDK
Open      fd = open(path, ...);   addr1 = pmem_map_file(path, len1, ...);
Unmap                             pmem_unmap(addr1, len1);
Extend                            truncate(path, len2);
Remap                             addr2 = pmem_map_file(path, len2, ...);
Close     close(fd);              pmem_unmap(addr2, len2);

3 function calls are added to use PMDK

Implementation to shrink file

          DAX FS                  DAX FS and PMDK
Open      fd = open(path, ...);   addr1 = pmem_map_file(path, len1, ...);
Unmap                             pmem_unmap(addr1, len1);
Shrink    ftruncate(fd, len2);    truncate(path, len2);
Remap                             addr2 = pmem_map_file(path, len2, ...);
Close     close(fd);              pmem_unmap(addr2, len2);

2 function calls are added to use PMDK
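The unmap, truncate, remap sequence used for both enlarging and shrinking can be sketched as follows. This is a minimal sketch under assumptions: map_file() and resize_mapping() are hypothetical helper names, and POSIX mmap()/munmap() stand in for pmem_map_file()/pmem_unmap() so that the example builds without libpmem.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: map the first len bytes of an existing file
 * read/write. Stands in for pmem_map_file(). Returns NULL on failure. */
static void *map_file(const char *path, size_t len)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return NULL;
    void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                          /* the mapping stays valid */
    return addr == MAP_FAILED ? NULL : addr;
}

/* Hypothetical helper: unmap -> truncate -> remap.
 * The same three steps handle both enlarging and shrinking; the old
 * address must not be used after this call. Returns NULL on failure. */
void *resize_mapping(const char *path, void *old_addr,
                     size_t old_len, size_t new_len)
{
    assert(path != NULL && new_len > 0);
    if (old_addr != NULL && munmap(old_addr, old_len) != 0)
        return NULL;                    /* Unmap */
    if (truncate(path, (off_t)new_len) != 0)
        return NULL;                    /* Extend or shrink */
    return map_file(path, new_len);     /* Remap */
}
```

Every call to resize_mapping() repeats the open, mmap, and munmap work, which is the overhead that makes frequent resizing expensive.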

Resizing file with PMDK
• Difficult to use PMDK unless the file size is fixed
  • Repeated remapping degrades performance
    • Remapping a file has large overhead
    • With PMDK, the munmap()/close()/open()/mmap() syscalls are called again on every remap
  • Mapping a large file may fill up the file system
• Best practice is to use fixed-size files only

Selecting sync function for WAL
• How to write log to WAL file
  • Initialization: create the file and fill it with zeros
    • Necessary to flush file metadata
  • Synchronous logging: sequential synchronous writes
    • Necessary to flush only written data
• PMDK sync APIs
  • pmem_msync(): file metadata and written data are flushed
  • pmem_drain(): only written data is flushed

PMDK sync function for WAL
• Initialization of WAL file
  • Necessary to flush WAL file metadata
  • pmem_msync()
• Synchronous logging
  • Necessary to flush only written data
  • pmem_drain() or pmem_msync()
    • pmem_drain() is faster than pmem_msync()
    • We selected pmem_drain()

Comparing PMDK sync functions
• I ran a microbenchmark to compare pmem_msync() with pmem_drain()
  • Preprocessing
    • WAL initialization
    • Create the file, fill it with zeros, and flush file metadata
  • Measurement processing
    • Synchronous logging
    • Overwrite data and flush written data

Microbenchmark

Synchronous logging, 3 patterns:

         1. DAX FS        2. DAX FS and PMDK       3. DAX FS and PMDK
            fdatasync()      pmem_msync()             pmem_drain()
Open     open()           pmem_map_file()          pmem_map_file()
         while () {       while () {               while () {
Write      write()          pmem_memcpy_nodrain()    pmem_memcpy_nodrain()
Sync       fdatasync()      pmem_msync()             pmem_drain()
Loop N   }                }                        }
Close    close()          pmem_unmap()             pmem_unmap()
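Pattern 1 (write() followed by fdatasync() in a loop) can be sketched as below, together with a helper that converts a byte count and an elapsed time into GB/s. This is a sketch under assumptions: sync_write_loop() and throughput_gbps() are hypothetical names, timing is left to the caller, and 1 GB is taken as 10^9 bytes. Patterns 2 and 3 would keep the same loop shape with the PMDK calls listed in the table.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

enum { BLOCK_SIZE = 8192 };             /* 8 KB block, as in the evaluation */

/* Hypothetical helper for pattern 1: write nblocks blocks sequentially and
 * call fdatasync() after each write. Returns the number of bytes written
 * and synced, or -1 on error. */
long long sync_write_loop(const char *path, size_t nblocks)
{
    char block[BLOCK_SIZE];
    memset(block, 0xAB, sizeof block);  /* arbitrary payload */

    int fd = open(path, O_WRONLY | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    long long total = 0;
    for (size_t i = 0; i < nblocks; i++) {
        ssize_t n = write(fd, block, sizeof block);       /* Write */
        if (n != (ssize_t)sizeof block || fdatasync(fd) != 0) {  /* Sync */
            close(fd);
            return -1;
        }
        total += n;
    }
    close(fd);
    return total;
}

/* Hypothetical helper: throughput in GB/s, with 1 GB = 10^9 bytes. */
double throughput_gbps(long long bytes, double seconds)
{
    assert(seconds > 0.0);
    return (double)bytes / 1e9 / seconds;
}
```

A caller would take timestamps around sync_write_loop(), for example with clock_gettime(CLOCK_MONOTONIC, ...), and pass the elapsed seconds to throughput_gbps().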

Evaluation setup

[Figure: two NUMA nodes, each with a CPU and 32 GB DRAM. Node1 also has 48 GB PMEM and runs the benchmarks.]

Hardware
  CPU               E5-2667 v4 x 2 (8 cores per node)
  DRAM              [Node0/1] 32 GB each
  PMEM (NVDIMM-N)   [Node1] 48 GB (HPE 8GB NVDIMM x 6)

Software
  Distro            Ubuntu 16.04
  Linux kernel      4.17.9*
  PMDK              1.4.1
  Filesystem        ext4 (DAX available)

*: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git

Performance evaluation - microbenchmark
• pmem_drain() is the fastest of the 3 patterns, as expected
• Total written data is 10 GB
• Block size is 8 KB

[Figure: bar chart of throughput [GB/s] for DAX FS with fdatasync(), DAX FS and PMDK with pmem_msync(), and DAX FS and PMDK with pmem_drain(). Bar labels: 15 GB/s, 4.3 GB/s, 0.0023 GB/s.]

PMDK sync functions
• pmem_drain() greatly improves performance of I/O-intensive workloads
• You should use pmem_drain() with caution
  • pmem_drain() can’t flush file metadata
    • pmem_msync() should be called by applications that use file metadata such as
      • Time of last modification
      • Time of last access
  • pmem_drain() doesn’t work without pmem_memcpy_nodrain()
    • pmem_drain() only waits for earlier flushes to complete, so a call that flushes the data must come first

5. Challenges for performance evaluation
• Difficult to get valid results in PG performance evaluation
• What is a valid result?
  • Avoid Non-Uniform Memory Access (NUMA) effects
  • Avoid CPUs becoming hotspots
    • Tune the application for PMEM

5. Challenges for performance evaluation
1. NUMA effect
2. Tuning application for PMEM

NUMA effect
• Synchronous write is about 1.5 times faster on the local NUMA node than on the remote NUMA node

[Figure: two NUMA nodes, each with a CPU and 32 GiB DRAM. Node1 also has 48 GiB PMEM. Synchronous write throughput: 15 GB/s on the local node, 11 GB/s on the remote node (x1.5).]

Tuning application for PMEM
• Important to avoid calculation processing becoming a hotspot
  • Better to use Stored Procedure in PG
    • Stored Procedure improves PG performance since user-defined functions are pre-compiled and stored in the PG server

pgbench -c 16 -j 16 -T 1800 -r [db_name] -M prepared

Evaluation setup

[Figure: two NUMA nodes, each with a CPU and 32 GB DRAM. Node1 also has 48 GB PMEM and runs the PG server. The figure also shows a 750 GB SSD and the PG clients.]

Hardware
  CPU               E5-2667 v4 x 2 (8 cores per node)
  DRAM              [Node0/1] 32 GB each
  PMEM (NVDIMM-N)   [Node1] 48 GB (HPE 8GB NVDIMM x 6)
  SSD               Intel® Optane™ SSD DC P4800X Series 750GB

Software
  Distro            Ubuntu 16.04
  Linux kernel      4.17.9
  PMDK              1.4.1
  Filesystem        ext4 (DAX available)
  PostgreSQL        10.4 Beta*

*: https://www.postgresql.org/message-id/C20D38E97BCB33DAD59E3A1%40lab.ntt.co.jp

Stored Procedure
• The improvement ratio from Stored Procedure is 12 percentage points higher with PMEM (x1.70) than with SSD (x1.58)

                                             Without Stored Procedure   With Stored Procedure   Ratio
Intel® Optane™ SSD DC P4800X Series 750GB    18,396 tps                 29,125 tps              x1.58
PMEM with DAX FS + PMDK                      21,406 tps                 36,449 tps              x1.70
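The x1.58 and x1.70 ratios and the 12-point gap follow directly from the tps figures. A small sketch of that arithmetic, where speedup() and gain_points() are hypothetical helper names introduced only for illustration:

```c
#include <assert.h>

/* Hypothetical helper: speedup factor, i.e. throughput with the
 * optimization divided by throughput without it. */
double speedup(double tps_without, double tps_with)
{
    assert(tps_without > 0.0);
    return tps_with / tps_without;
}

/* Hypothetical helper: difference between two speedup factors,
 * expressed in percentage points. */
double gain_points(double base_factor, double new_factor)
{
    return (new_factor - base_factor) * 100.0;
}
```

For the SSD, speedup(18396, 29125) is about 1.58; for PMEM, speedup(21406, 36449) is about 1.70; gain_points() on those two factors gives roughly 12.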

Conclusion
• Difficult to implement PMEM-aware applications
  • Resizing a file incurs overhead
    • Best practice is to access only fixed-size files
  • Selecting an inappropriate sync function seriously degrades performance or durability
• Difficult to get valid results in performance evaluation
  • Avoiding NUMA effects
  • Avoiding CPUs becoming hotspots