Theory of Memory W. Paul Saarland University and DFKI bmb+f Projekt Verisoft-XT joint work with Ulan Degebaev and Norbert Schirmer Saarland University


Post on 18-Jan-2016

TRANSCRIPT

Page 1:

Theory of Memory

W. Paul Saarland University and DFKI

bmb+f Projekt Verisoft-XT

joint work with Ulan Degebaev and Norbert Schirmer

Saarland University

Page 2:

Why might this be important?

• Unites theories of
  – store buffers
  – interlocking
  – caches
  – cache coherence
  – out of order execution
  – X64 instruction set
  – address translation
  – optimized compilation
  – structured parallel C semantics

• Explains why hypervisor might run structured parallel C

• VCC is supposed to mirror structured parallel C semantics

• Thus VCC might be(come) sound

Page 3:

Specifying Memory

[Figure labels: address x, content M(x)]

Page 4:

Store Buffer

[Figure labels: memory M, w(i), r(j), sbuf(y)]
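The store buffer picture can be made concrete with a small executable model. The following is a minimal sketch, assuming a single processor, a FIFO store buffer, and forwarding of the newest pending write on reads. The class and method names are hypothetical and are not taken from the talk.

```python
from collections import deque


class StoreBufferedMemory:
    """Toy model (an assumption, not the talk's formalism):
    one processor with a FIFO store buffer in front of memory M."""

    def __init__(self):
        self.M = {}          # memory: address -> value, unset addresses read as 0
        self.sbuf = deque()  # pending writes, oldest on the left

    def write(self, addr, value):
        # A write is only queued; memory M is not changed yet.
        self.sbuf.append((addr, value))

    def read(self, addr):
        # Forward the newest pending write to addr, if there is one.
        for a, v in reversed(self.sbuf):
            if a == addr:
                return v
        return self.M.get(addr, 0)

    def drain_one(self):
        # Retire the oldest pending write into memory M.
        if self.sbuf:
            a, v = self.sbuf.popleft()
            self.M[a] = v

    def flush(self):
        # Retire all pending writes in FIFO order.
        while self.sbuf:
            self.drain_one()
```

With forwarding, the processor that issued the writes always reads its own latest value, while M itself lags behind until the buffer is drained. This is the sense in which a store buffer can be invisible to a single processor but visible to others.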

Page 5:

Store Buffer

[Figure labels: memory M, w(i), r(j), sbuf(y)]

Page 6:

Caches

[Figure labels: M, ca]

Page 7:

Many Caches: Snooping

[Figure labels: M, ca(1), ca(p)]

Page 8:

Many Caches

[Figure labels: M, ca(1), ca(p), x.la, x.off]

Page 9:

Many Caches

[Figure labels: M, ca(1), ca(p), x.la, x.off]

Page 10:

Many Caches

[Figure labels: M, ca(1), ca(p), x.off]
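The labels x.la and x.off suggest that an address x is split into two components. The sketch below assumes that x.la is the line address and x.off the offset within a cache line, and it assumes a line size of 64 bytes. Neither assumption is stated on the slides, and the function names are hypothetical.

```python
LINE_BYTES = 64                            # assumed cache line size (power of two)
OFFSET_BITS = LINE_BYTES.bit_length() - 1  # 6 offset bits for 64-byte lines


def split_address(x):
    """Split address x into (la, off): line address and offset in the line."""
    la = x >> OFFSET_BITS
    off = x & (LINE_BYTES - 1)
    return la, off


def join_address(la, off):
    """Inverse of split_address."""
    return (la << OFFSET_BITS) | off
```

Under this reading, caches and the coherence protocol operate on whole lines identified by x.la, and x.off only selects bytes inside a line.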

Page 11:

Overlapping Transactions

[Figure labels: public (a), a, b, c]

Page 12:

Sequentially Consistent Memory (lemma 5)

[Figure labels: public (a), a, b, c]

Page 13:

Tomasulo Schedulers for OOO

[Figure labels: IF, issue, reservation stations, funct. units, CDB, ROB, WB]

Page 14:

Two Memory Units

[Figure labels: MMU, LS, RS, RS, sbuf, m, funct. units, CDB, ROB]

Page 15:

Single Processor OOO correctness (lemma 6)

[Figure labels: MMU, LS, RS, RS, sbuf, m, funct. units, CDB, ROB]

Page 16:

Multi Processor OOO implementation

[Figure labels: MMU, LS, RS, RS, sbuf, m, funct. units, CDB, ROB, data(i,j)]

Page 17:

Multi Processor OOO correctness (lemma 7)

[Figure labels: MMU, LS, RS, RS, sbuf, m, funct. units, CDB, ROB, data(i,j)]

Page 18:

Multi Processor OOO correctness (lemma 7)

[Figure labels: MMU, LS, RS, RS, sbuf, m, funct. units, CDB, ROB, data(i,j)]

Page 19:

X64 architecture

• CPU core
  – R: user registers
  – SR: system registers
    • CR3
  – acc: access
  – segmentation

• mmu: memory management unit
  – tlb: translation lookaside buffer

• memory system
  – mm: main memory
  – ca: cache
  – sbuf: store buffer

[Figure labels: core, R, CR3, acc, segmentation, mmu, tlb, acc, sbuf, ca, mm]

Page 20:

Segmentation off (lemma 8)

• 1 segment
• large as entire address space
• segmentation invisible

[Figure labels: core, R, CR3, acc, segmentation, mmu, tlb, acc, sbuf, ca, mm]

Page 21:

Bad news: cache state is visible

• CPU core
  – acc: access
    • acc.adr: address
    • acc.r: rights (user, write, exe)
    • acc.data
    • acc.mmode: memory mode
      – WB: write back
      – WT: write through ...
      – NC: no cache

[Figure labels: core, R, CR3, acc, mmu, tlb, acc, sbuf, ca, mm or devices]

Page 22:

Good News: no device, no NC mode

• acc.mmode: memory mode
  – WB: write back
  – WT: write through ...
  – NC: no cache (not used)

[Figure labels: core, R, CR3, acc, mmu, tlb, acc, sbuf, ca, mm]

Page 23:

Sequentially Consistent Physical Memory (lemma 9)

• acc.mmode: memory mode
  – WB: write back
  – WT: write through ...

mix on same address

• PM: sequentially consistent physical memory abstraction
  – Proof: MOESI invariants are maintained

[Figure labels: core, R, CR3, acc, mmu, tlb, acc, sbuf, PM]
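The slide names "MOESI invariants" without listing them. As an illustration only, the sketch below checks the usual textbook compatibility rules for the states Modified, Owned, Exclusive, Shared, and Invalid of one cache line across all caches. The exact invariants used in the proof may differ, and the function name is hypothetical.

```python
def moesi_ok(states):
    """Check one cache line across all caches.

    states: list with one state letter per cache, each in 'MOESI'.
    Textbook rules (assumed, not taken from the talk):
      - a line in M or E in one cache is I in every other cache;
      - at most one cache holds the line in O, the others are then S or I.
    """
    for s in states:
        if s not in ("M", "O", "E", "S", "I"):
            raise ValueError(f"unknown state: {s!r}")
    valid = [s for s in states if s != "I"]
    if any(s in ("M", "E") for s in valid):
        return len(valid) == 1
    return valid.count("O") <= 1
```

A coherence proof in this style shows that every protocol step preserves such a predicate for every line, so that at any time there is a well-defined most recent copy of each line.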

Page 24:

Initialize page tables

• 1 processor
  – sbuf invisible

• operating mode: paging disabled
  – mmu invisible

• set up page table tree in PM

[Figure labels: core, R, CR3, acc, mmu, tlb, acc, sbuf, PM, page tables]

Page 25:

Translated Linear Memory

• many processors
• operating mode: paging enabled
• keep tlb consistent

[Figure labels: core, R, CR3, acc, mmu, tlb, acc, sbuf, PM, page tables]
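Setting up a page table tree in physical memory and then translating through it can be sketched as follows. This is a simplified model of x86-64 4-level paging with 4 KiB pages: 9 index bits per level, a 12-bit page offset, 8-byte entries, and the present bit in bit 0. Access rights, large pages, and the tlb are ignored. Physical memory is modelled as a Python dict, and all function names are hypothetical.

```python
from itertools import count

PRESENT = 1 << 0
ADDR_MASK = 0x000F_FFFF_FFFF_F000  # bits 12..51: address of next table or frame
ENTRY_BYTES = 8


def index_at(va, level):
    # 9-bit table index for level 0 (top) .. 3 (leaf) of the tree.
    return (va >> (39 - 9 * level)) & 0x1FF


def map_page(pm, cr3, va, frame, alloc_table):
    """Make the page of va point to the physical frame, creating tables on demand.

    alloc_table() must return a fresh 4 KiB aligned physical address.
    """
    table = cr3 & ADDR_MASK
    for level in range(3):
        slot = table + ENTRY_BYTES * index_at(va, level)
        if not pm.get(slot, 0) & PRESENT:
            pm[slot] = alloc_table() | PRESENT
        table = pm[slot] & ADDR_MASK
    pm[table + ENTRY_BYTES * index_at(va, 3)] = (frame & ADDR_MASK) | PRESENT


def walk(pm, cr3, va):
    """Translate va to a physical address, or return None if unmapped."""
    table = cr3 & ADDR_MASK
    for level in range(4):
        entry = pm.get(table + ENTRY_BYTES * index_at(va, level), 0)
        if not entry & PRESENT:
            return None  # real hardware would raise a page fault here
        table = entry & ADDR_MASK
    return table | (va & 0xFFF)


# Usage: build one mapping with paging "disabled" (direct writes to pm),
# then translate through the tree as the mmu would.
pm = {}
cr3 = 0x1000
fresh = count(0x2000, 0x1000)
map_page(pm, cr3, 0x7F12_3456_7000, 0x9000, lambda: next(fresh))
```

In this model, keeping the tlb consistent would correspond to making sure that no cached result of walk is used after the entries it was computed from have been changed.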

Page 26:

Translated Consistent Linear Memory + sbufs (lemma 10)

• many processors
• operating mode: paging enabled
• keep tlb consistent

[Figure labels: core, R, CR3, acc, sbuf, LM, page tables]

Page 27:

C0: Pascal with C syntax

configurations

• c = (pr, rd, lms, hm, gm)
  – pr: program rest
  – rd: recursion depth
  – lms: [0 : recursion depth] → {local memories}
  – hm: heap memory
  – gm: global memory

• subvariables
  – (m,i)[17].gpr[3]

• value of pointers: subvariables!

[Figure labels: memory m, ba(m,i), size(m,i), va(c,(m,i))]
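The configuration tuple and the subvariable notation can be illustrated with a small data model. This is a sketch under assumptions: memories are Python dicts from variable names to nested values, a subvariable is encoded as a triple (memory, variable, selectors), and the variable name "pcb" is invented for the example. None of this encoding is taken from the talk.

```python
from dataclasses import dataclass


@dataclass
class Config:
    pr: list   # program rest: statements still to be executed
    rd: int    # recursion depth
    lms: list  # lms[0..rd]: one local memory per stack frame
    hm: dict   # heap memory
    gm: dict   # global memory


def memory(c, m):
    """Select a memory of c: 'gm', 'hm', or ('lm', k) for frame k."""
    if m == "gm":
        return c.gm
    if m == "hm":
        return c.hm
    _, k = m
    return c.lms[k]


def va(c, sub):
    """Value of subvariable sub = (m, i, selectors) in configuration c.

    selectors is a tuple of array indices and field names, so that
    (m, i, (17, 'gpr', 3)) plays the role of (m,i)[17].gpr[3].
    """
    m, i, selectors = sub
    value = memory(c, m)[i]
    for s in selectors:
        value = value[s]
    return value


# Usage: a global array of 20 records, each with 8 registers,
# and a local pointer p whose value is itself a subvariable.
c = Config(pr=[], rd=0, lms=[{}], hm={},
           gm={"pcb": [{"gpr": [0] * 8} for _ in range(20)]})
c.gm["pcb"][17]["gpr"][3] = 42
c.lms[0]["p"] = ("gm", "pcb", (17, "gpr", 3))
```

Dereferencing p is then two applications of va: first to obtain the subvariable stored in p, then to obtain that subvariable's value. No numeric addresses appear at this level.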

Page 28:

Parallel C

• c = (pr, rd, lms, hm, gm)
  – pr: program rest
  – rd: recursion depth
  – lms: [0 : recursion depth] → {local memories}
  – hm: heap memory
  – gm: global memory

• Share
  – gm
  – hm

• Interleave at steps of the small-step semantics

[Figure labels: memory m, ba(m,i), size(m,i), va(c,(m,i))]

Page 29:

Parallel C

• c = (pr, rd, lms, hm, gm)
  – pr: program rest
  – rd: recursion depth
  – lms: [0 : recursion depth] → {local memories}
  – hm: heap memory
  – gm: global memory

• Share
  – gm
  – hm

• Interleave at steps of the small-step semantics

• Problem:
  – Processor interleaves instructions of compiled programs code(p)

[Figure labels: memory m, ba(m,i), size(m,i), va(c,(m,i))]
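Why the interleaving granularity matters can be seen in a toy example. The sketch below enumerates all interleavings of two threads that each execute x = x + 1 on a shared global x, where the statement is split into a load step and a store step, roughly as compiled code would do it. The step functions and names are hypothetical and only model the idea.

```python
def interleavings(a, b):
    """Yield every merge of step lists a and b that keeps each list's order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def make_increment(tid):
    """x = x + 1 for thread tid, split into two smaller steps."""
    def load(gm, lms):
        lms[tid]["t"] = gm["x"]      # read shared x into a local temporary

    def store(gm, lms):
        gm["x"] = lms[tid]["t"] + 1  # write back the incremented value

    return [load, store]


def run(schedule):
    gm = {"x": 0}        # shared global memory
    lms = {0: {}, 1: {}}  # one private local memory per thread
    for step in schedule:
        step(gm, lms)
    return gm["x"]


schedules = list(interleavings(make_increment(0), make_increment(1)))
results = sorted({run(s) for s in schedules})  # [1, 2]
```

If the two increments were interleaved only as whole C statements, the final value would always be 2. At the finer granularity the value 1 also occurs, which is the gap the slide points at.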

Page 30:

simulation relation consis(c, alloc, d)

[Figure labels: p, y, alloc(c,p), alloc(c,y), LM]
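One plausible reading of the figure is that consis(c, alloc, d) relates a C configuration c to a machine configuration d via an allocation function: an elementary variable's value is found in linear memory at its allocated address, and a pointer p to y is represented by the address alloc(c,y) stored at alloc(c,p). The sketch below encodes only this reading. The flat dictionaries standing in for c and d are assumptions made for the example.

```python
def consis(values, pointers, alloc, lm):
    """Simplified consistency check between C-level state and linear memory.

    values:   elementary variable -> value          (part of c)
    pointers: pointer variable -> target variable   (part of c)
    alloc:    variable -> allocated address         (alloc(c, .))
    lm:       address -> word                       (linear memory of d)
    """
    for v, val in values.items():
        if lm.get(alloc[v]) != val:
            return False
    for p, y in pointers.items():
        # A pointer is stored as the allocated address of its target.
        if lm.get(alloc[p]) != alloc[y]:
            return False
    return True
```

A compiler correctness proof in this style shows that if consis holds before a step of the source program, then after the corresponding machine steps it holds again, possibly for an updated alloc.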

Page 31:

Non-optimizing compiler: step-by-step simulation

Page 32:

Optimizing compiler: simulation between IO-steps

Page 33:

IO-steps (1): volatile accesses

Page 34:

Volatiles Sequentially Consistent (lemma 11)

Page 35:

Structured Parallel C

• Implement locks using volatiles
• IO-steps (2): lock release
• Run processors alone on locked portions of linear memory
• Lemma 1: sbufs invisible
• Lemma 10: Ordinary C code in linear memory

Page 36:

Summary

• Implement locks using volatiles
• IO-steps (2): lock release
• Run processors alone on locked portions of linear memory
• Lemma 1: sbufs invisible
• Lemma 10: Ordinary C code in linear memory

• Outlined correctness proof for implementation of structured parallel C
  – initialisation
  – compilation