daniel bittman peter alvaro darrell d. e. long ethan l. miller · 26/02/2019 · daniel bittman...

Optimizing Systems for Byte-Addressable NVM by Reducing Bit Flipping

Daniel Bittman Peter Alvaro Darrell D. E. Long Ethan L. Miller

FAST '19 · 2019-02-26

CRSS Confidential

Byte-addressable Non-volatile Memory

BNVM is coming, and with it, new optimization targets

Byte-addressable Non-volatile Memory

0 1 0 1 0 0 0 1

0 1 1 0 0 1 0 1

It’s not just writes... ...it’s the bits flipped by those writes
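As a minimal sketch of what "bits flipped by a write" means, the two bit patterns shown above can be compared with XOR and a population count. The function name `bit_flips` is an illustrative choice, not something from the talk, and the values are assumed to be non-negative integers.

```python
def bit_flips(old: int, new: int) -> int:
    """Count the bit positions that change when `old` is overwritten by `new`."""
    # XOR leaves a 1 exactly where the two values differ.
    return bin(old ^ new).count("1")


old = 0b01010001  # first pattern from the slide
new = 0b01100101  # second pattern from the slide

print(bit_flips(old, new))  # 3
print(bit_flips(old, old))  # 0: rewriting the same value flips nothing
```

Both writes touch the same byte, but only the first one changes any cells, which is the distinction the slide draws between counting writes and counting flipped bits.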

BNVM power usage

Can we take advantage of this?

Software vs. hardware?

How hard is it to reason about bit flips?

How do we design data structures to reduce bit flips?

Reducing Bit Flips in Software

XOR linked lists

Traditional doubly linked list: prev, next

XOR linked list: value, xptr

xptr = next ⊕ prev
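The relation xptr = next ⊕ prev can be illustrated with a small simulation. This is a sketch under stated assumptions, not the authors' implementation: node "addresses" are plain integers handed out by a hypothetical bump allocator, a dictionary stands in for memory, and 0 plays the role of NULL.

```python
class XorList:
    """XOR linked list over simulated addresses (ints); 0 acts as NULL."""

    def __init__(self):
        self.nodes = {}            # address -> [value, xptr]
        self.head = 0
        self.tail = 0
        self._next_addr = 0x1000   # hypothetical allocator start address

    def append(self, value):
        addr = self._next_addr
        self._next_addr += 0x40    # hypothetical node spacing
        # New tail has prev = old tail and next = NULL, so xptr = 0 ^ old tail.
        self.nodes[addr] = [value, self.tail]
        if self.tail:
            # Old tail's next changes from NULL to addr: fold addr into its xptr.
            self.nodes[self.tail][1] ^= addr
        else:
            self.head = addr
        self.tail = addr
        return addr

    def forward(self):
        prev, cur = 0, self.head
        while cur:
            value, xptr = self.nodes[cur]
            yield value
            prev, cur = cur, xptr ^ prev   # next = xptr ^ prev

    def backward(self):
        nxt, cur = 0, self.tail
        while cur:
            value, xptr = self.nodes[cur]
            yield value
            nxt, cur = cur, xptr ^ nxt     # prev = xptr ^ next


lst = XorList()
for v in ("a", "b", "c"):
    lst.append(v)
print(list(lst.forward()))   # ['a', 'b', 'c']
print(list(lst.backward()))  # ['c', 'b', 'a']
```

Traversal needs the address of the node just visited in order to decode the next one, which is why each loop carries two addresses instead of one.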

Pointers!

Some actual pointers

A = 0x000055b7bda8f260

B = 0x000055b7bda8f6a0

A ⊕ B = 0x4C0 = 0b10011000000
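The arithmetic on this slide can be checked directly. The sketch below uses the two pointer values from the slide; `popcount` is an illustrative helper name.

```python
A = 0x000055B7BDA8F260
B = 0x000055B7BDA8F6A0


def popcount(x: int) -> int:
    """Number of 1 bits in a non-negative integer."""
    return bin(x).count("1")


xptr = A ^ B
print(hex(xptr))       # 0x4c0
print(bin(xptr))       # 0b10011000000
print(popcount(xptr))  # 3

# Nearby allocations share their high-order bits, so the XOR of two pointers
# has far fewer set bits than either raw pointer.
print(popcount(xptr) < popcount(A))  # True
```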

Using XOR in hash tables

xnext  D
00000  0
xxxxx  1

Both indicate “entry is empty”
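One way to read the two rows above is that an entry counts as empty either when its whole word is zero or when the D flag is set, whatever xnext holds. The sketch below encodes that reading only. It assumes D lives in the low bit of the same word as xnext (plausible for aligned pointers, but not stated on the slide), and all names are hypothetical.

```python
FLAG_D = 0x1  # assumption: D is the low bit of the stored word


def is_empty(word: int) -> bool:
    """Empty if the word is all zero, or if D is set regardless of xnext."""
    return word == 0 or bool(word & FLAG_D)


def mark_empty(word: int) -> int:
    """Mark an entry empty by setting D, leaving the xnext bits untouched."""
    return word | FLAG_D


def bit_flips(old: int, new: int) -> int:
    return bin(old ^ new).count("1")


word = 0x4C0  # hypothetical occupied entry
print(is_empty(word))                       # False
print(is_empty(mark_empty(word)))           # True
print(bit_flips(word, mark_empty(word)))    # 1: only D changes
print(bit_flips(word, 0))                   # 3: zeroing clears every set bit
```

Under this reading, emptying an entry costs a single flipped bit instead of one flip per set bit in the stored pointer.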

From XOR linked lists to Red Black Trees

Standard 3-pointer red-black tree design

From XOR linked lists to Red Black Trees

L, R, P → LX, RX

LX = L ⊕ P

RX = R ⊕ P

Now 2-pointer, and XOR pointers
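The two relations LX = L ⊕ P and RX = R ⊕ P are enough to recover every neighbour of a node as long as one of the three addresses is already known. The following is a sketch of that decoding with hypothetical integer addresses and hypothetical function names, not the authors' tree code.

```python
def encode(left: int, right: int, parent: int):
    """Fold the parent address into both child pointers: returns (LX, RX)."""
    return left ^ parent, right ^ parent


def children(lx: int, rx: int, parent: int):
    """Walking down: the parent address is known, so recover L and R."""
    return lx ^ parent, rx ^ parent


def parent_from_left(lx: int, left: int) -> int:
    """Walking up from the left child: P = LX ^ L."""
    return lx ^ left


def parent_from_right(rx: int, right: int) -> int:
    """Walking up from the right child: P = RX ^ R."""
    return rx ^ right


# Hypothetical addresses for a node's left child, right child, and parent.
L, R, P = 0x1040, 0x1080, 0x1000
LX, RX = encode(L, R, P)

print(children(LX, RX, P) == (L, R))      # True
print(parent_from_left(LX, L) == P)       # True
print(parent_from_right(RX, R) == P)      # True
```

A descent from the root always knows the address of the node it came from, so the parent pointer does not need to be stored separately.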

Evaluation

Experimental framework

, at the memory controller

Test different data structures, with different cache parameters, over a varying number of operations
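To make the idea of counting flips where writes reach memory concrete, here is a toy model. It is a deliberately simplified sketch and not the experimental framework used in the talk: it has no caches, treats each address as one integer word, and assumes memory starts out as all zeros. The class and method names are hypothetical.

```python
class FlipCountingMemory:
    """Toy word-addressed memory that tallies bit flips on every store."""

    def __init__(self):
        self.cells = {}   # address -> current word; missing addresses read as 0
        self.flips = 0    # total bits changed by all stores so far
        self.writes = 0   # total number of stores

    def store(self, addr: int, value: int) -> None:
        old = self.cells.get(addr, 0)
        self.flips += bin(old ^ value).count("1")
        self.writes += 1
        self.cells[addr] = value

    def load(self, addr: int) -> int:
        return self.cells.get(addr, 0)


mem = FlipCountingMemory()
mem.store(0x10, 0b01010001)  # 3 flips from the all-zero initial state
mem.store(0x10, 0b01100101)  # 3 more flips
mem.store(0x10, 0b01100101)  # same value again: a write, but no flips
print(mem.writes, mem.flips)  # 3 6
```

Running the same sequence of operations against two data structure variants on a model like this gives a flip count per variant that can be compared directly.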

Bit flips: calling malloc()

Cross-over point between 40 and 48 bytes

Bit flips: XOR Linked Lists

XOR linked lists reduce bit flips dramatically

Bit flips: Hash table

Hash table already had few flips to save: chains should be short

Bit flips: Red-black Trees

Significant savings

Bit flips: L2 Cache Behavior

L2 cache has ultimately little effect!

...even when increasing in size

Performance: RBT insert

Performance is not significantly affected!

a) Performance cost of XORs

b) Performance benefit of smaller node size

Performance: hash table

Almost no effect on performance

Conclusions

Savings are significant with little performance impact.

We can design around bit flips, and we should.

Bit flip/write inversion

Data structures

System simulation

Caching effects

Pointer distance

Power and wear estimates

Full systems

Real hardware

More data structures

Additional techniques

Bit flips from algorithms

ABI modifications

Thank You! Questions?

Daniel Bittman

darrell@ucsc.edu elm@ucsc.edu palvaro@ucsc.edu

https://gitlab.soe.ucsc.edu/gitlab/crss/opensource-bitflipping-fast19

Backup slides

Bit flips: instrumentation

Performance: RBT lookup

Performance: XOR Linked List

Operation    XOR Linked List    Doubly Linked List
             45 +/- 1           45 +/- 1
             27 +/- 1           28 +/- 1
             2.6 +/- 0.1        2.2 +/- 0.1

Stack frames

arg0 pc

csr0 sp

arg0 pc

csr1 sp

fn0 fn1

Stack frames

arg0 pc

csr0 sp

arg0 pc sp

fn0 fn1

“Wasted space”

Bit flips: stack frames
