partial-coherence abstractions for relaxed memory models

1

PARTIAL-COHERENCE ABSTRACTIONS FOR RELAXED MEMORY MODELS

Presented by Michael Kuperstein, Technion
Joint work with Martin Vechev, IBM Research, and Eran Yahav, Technion

2

Sequential Consistency

We expect our programs to have:
  “Interleaving semantics”
  Consistency with program order

“The result of any execution is the same as if the operations of all the processors were executed in some sequential order, and the operations of each individual processor appear in this sequence in the order specified by its program.” – Leslie Lamport, 1979
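The definition above can be made concrete with a small enumeration. The sketch below is not from the talk: it assumes a two-thread "store buffering" test (thread names `T0`/`T1`, variables `x`/`y`, and registers `r0`/`r1` are all illustrative) and runs every interleaving that respects program order against a single shared memory.

```python
# Each thread is a list of operations executed in program order.
# ("store", var, value) writes memory; ("load", var, reg) reads into a register.
T0 = [("store", "x", 1), ("load", "y", "r0")]
T1 = [("store", "y", 1), ("load", "x", "r1")]

def interleavings(a, b):
    """Yield every merge of a and b that keeps each thread's own order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest

def run(schedule):
    """Execute one sequential order against a single shared memory."""
    memory = {"x": 0, "y": 0}
    regs = {}
    for kind, var, arg in schedule:
        if kind == "store":
            memory[var] = arg
        else:
            regs[arg] = memory[var]
    return regs["r0"], regs["r1"]

sc_outcomes = {run(s) for s in interleavings(T0, T1)}
print(sorted(sc_outcomes))  # [(0, 1), (1, 0), (1, 1)]
```

Under sequential consistency the outcome `(0, 0)` never appears, because at least one store precedes both loads in every interleaving.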

3

Process 0:
    flag[0] := true
    while flag[1] = true {
        if turn ≠ 0 {
            flag[0] := false
            while turn ≠ 0 { }
            flag[0] := true
        }
    }
    // critical section
    turn := 1
    flag[0] := false

Process 1:
    flag[1] := true
    while flag[0] = true {
        if turn ≠ 1 {
            flag[1] := false
            while turn ≠ 1 { }
            flag[1] := true
        }
    }
    // critical section
    turn := 0
    flag[1] := false

Dekker’s Algorithm for Mutual Exclusion

Specification: mutual exclusion over critical section

4

[Figure: processors P0 and P1 connected to main memory through buffers; the buffers hold entries for X, Y, Z]

Store Buffer Based Models
  TSO & PSO
  x86 ~ TSO

Memory Fences
  Restore order
  Every store before the fence becomes globally visible before anything after the fence executes

[Figure labels: store, flush, load, fence]
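The four operations named on this slide (store, flush, load, fence) can be sketched as a toy machine. This is an illustrative model only, assuming a TSO-style single FIFO buffer per processor (PSO would keep one buffer per variable); the class name `TSOMachine` and the helper `store_buffering` are hypothetical, and the fence is modelled as draining the issuing processor's buffer.

```python
from collections import deque

class TSOMachine:
    """Toy TSO machine: one FIFO store buffer per processor (assumed model)."""

    def __init__(self, nprocs, variables):
        self.memory = {v: 0 for v in variables}
        self.buffers = [deque() for _ in range(nprocs)]

    def store(self, p, var, value):
        # A store is only queued; other processors cannot see it yet.
        self.buffers[p].append((var, value))

    def load(self, p, var):
        # A processor sees its own newest buffered store to var, if any.
        for v, value in reversed(self.buffers[p]):
            if v == var:
                return value
        return self.memory[var]

    def flush(self, p):
        # The oldest buffered store of p reaches main memory.
        var, value = self.buffers[p].popleft()
        self.memory[var] = value

    def fence(self, p):
        # Drain p's buffer: all earlier stores become globally visible.
        while self.buffers[p]:
            self.flush(p)

def store_buffering(with_fences):
    """Both processors store, then each loads the other's variable."""
    m = TSOMachine(2, ["x", "y"])
    m.store(0, "x", 1)
    m.store(1, "y", 1)
    if with_fences:
        m.fence(0)
        m.fence(1)
    return m.load(0, "y"), m.load(1, "x")

print(store_buffering(with_fences=False))  # (0, 0)
print(store_buffering(with_fences=True))   # (1, 1)
```

Without fences both loads read stale memory, giving an outcome that no sequentially consistent run allows; with a fence after each store that outcome disappears in this schedule.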

5

Process 0:
    flag[0] := true
    fence
    while flag[1] = true {
        if turn ≠ 0 {
            flag[0] := false
            fence
            while turn ≠ 0 { }
            flag[0] := true
            fence
        }
    }
    // critical section
    turn := 1
    fence
    flag[0] := false
    fence

Memory Fences

Fences are expensive
  10s-100s of cycles

Practical Significance
  Data structures
  Linux Kernel spinlocks

Placing fences manually
  Overfencing: hurts performance
  Underfencing: subtle bugs

6

Process 0:
    flag[0] := true
    fence
    while flag[1] = true {
        if turn ≠ 0 {
            flag[0] := false
            while turn ≠ 0 { }
            flag[0] := true
        }
    }
    // critical section
    turn := 1
    flag[0] := false


7

Automatic Solutions

Equivalence to Sequential Consistency
  Reduce program behaviors to sequentially consistent (SC) runs
  High-level specifications are ignored
  Goes back to Shasha & Snir [TOPLAS ’88]

Place fences to satisfy provided specification
  Using the specification may forbid fewer executions
  May require fewer fences

[Figure: sets of executions labeled Safe, SC, PSO]

8

Goal

BLENDER
  Inputs: Finite-State Program P, Safety Specification S, Memory Model M
  Output: Program P’ with Fences

P’ satisfies the specification S under M

9

General Recipe

1. Compute reachable states

2. Compute weakest constraints that guarantee all “bad states” are avoided

3. Implement the constraints with fences
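The recipe can be tried end to end on a tiny loop-free program. The sketch below is not the constraint-based algorithm from the talk: it assumes a toy TSO semantics, explores all reachable states, and then replaces step 2 with a brute-force search over fence placements, smallest sets first. The program, the "bad state" (`r0 = r1 = 0`), and all function names are illustrative assumptions.

```python
from itertools import combinations

VARS = ("x", "y")
PROGRAM = (
    (("store", "x", 1), ("load", "y", "r0")),
    (("store", "y", 1), ("load", "x", "r1")),
)

def with_fences(program, sites):
    """Insert a fence before instruction i of thread t for each (t, i) in sites."""
    out = []
    for t, ops in enumerate(program):
        new_ops = []
        for i, op in enumerate(ops):
            if (t, i) in sites:
                new_ops.append(("fence",))
            new_ops.append(op)
        out.append(tuple(new_ops))
    return tuple(out)

def successors(program, state):
    pcs, bufs, mem, regs = state
    for t, ops in enumerate(program):
        if bufs[t]:  # flush: oldest buffered store of thread t reaches memory
            (var, val), rest = bufs[t][0], bufs[t][1:]
            new_mem = tuple((v, val if v == var else old) for v, old in mem)
            yield (pcs, bufs[:t] + (rest,) + bufs[t + 1:], new_mem, regs)
        if pcs[t] == len(ops):
            continue
        op = ops[pcs[t]]
        next_pcs = pcs[:t] + (pcs[t] + 1,) + pcs[t + 1:]
        if op[0] == "store":
            buf = bufs[t] + ((op[1], op[2]),)
            yield (next_pcs, bufs[:t] + (buf,) + bufs[t + 1:], mem, regs)
        elif op[0] == "load":  # read own newest buffered store, else memory
            pending = [val for var, val in bufs[t] if var == op[1]]
            val = pending[-1] if pending else dict(mem)[op[1]]
            yield (next_pcs, bufs, mem, tuple(sorted(regs + ((op[2], val),))))
        elif op[0] == "fence" and not bufs[t]:  # fence waits for an empty buffer
            yield (next_pcs, bufs, mem, regs)

def reachable(program):
    """Step 1: worklist exploration of the state space."""
    init = ((0,) * len(program), ((),) * len(program),
            tuple((v, 0) for v in VARS), ())
    seen, work = {init}, [init]
    while work:
        for nxt in successors(program, work.pop()):
            if nxt not in seen:
                seen.add(nxt)
                work.append(nxt)
    return seen

def is_bad(state):
    regs = dict(state[3])
    return regs.get("r0") == 0 and regs.get("r1") == 0

def infer_fences(program):
    """Steps 2-3, brute force: smallest fence set that avoids all bad states."""
    sites = [(t, i) for t, ops in enumerate(program) for i in range(1, len(ops))]
    for n in range(len(sites) + 1):
        for chosen in combinations(sites, n):
            fenced = with_fences(program, set(chosen))
            if not any(is_bad(s) for s in reachable(fenced)):
                return chosen
    return None

print(infer_fences(PROGRAM))  # ((0, 1), (1, 1))
```

For this program a single fence is not enough; the search returns one fence between the store and the load in each thread. Exhaustive search only works here because the program has no loops, so its buffers stay bounded.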

10

Constraints

Constraint language
Not every transition can be prevented using a fence

[Figure: transition-system diagram whose states show P1 (1 2 3), P2 (A B C), and X; edges are labeled "P1 : (D) LOAD R1 = X" and "P2 : (D) LOAD R1 = X"; one transition is marked "Unavoidable"; constraints shown: [A < D], [B < D], [C < D]]

11

Concrete Transition System

Building transition system under TSO/PSO is hard
  No a-priori bound on buffer length

Unbounded state-space
  Even for programs that were finite-state under SC

Reachability has non-primitive recursive complexity [Atig et al., POPL ’10]

12

Abstract Memory Models (AMM)

Bounded approximation of unbounded buffers
  Strictly weaker than concrete TSO/PSO
  Finite-state programs remain finite-state

Reachability becomes effectively computable
  Construct finite (abstract) transition system
  Apply fence inference
  Can also be used for verification

[Figure: sets of executions labeled Safe, SC, PSO, AMM]

13

Partial Coherence Abstractions

[Figure: P0 and P1 connected to main memory through buffers for X, Y, Z, shown first with unbounded buffers and then with abstract buffers; the abstract buffer is labeled "Recent value", "Bounded length k", and "Unordered elements"]

Allows precise fence semantics
Allows precise loads from buffer
Keeps the analysis precise for “well behaved” programs
Record what values appeared (without order or number)

14

Partial Coherence Abstractions

[Figure: a concrete buffer holding 1 2 3 4 5 6 7 and its abstraction, in which the values 2, 3, 4, 5 are collapsed into the unordered set {2,3,4,5}]
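One way to read this abstraction is as a function from a concrete buffer to a three-part abstract buffer. The sketch below is an assumption-laden illustration, not the representation from the paper: it takes the buffer oldest-first, keeps the k oldest stores exactly, keeps the newest store, and collapses everything in between into a set. The names `AbstractBuffer` and `abstract` are hypothetical, and the orientation of the slide's figure (which end is newest) is assumed.

```python
from typing import FrozenSet, NamedTuple, Optional, Tuple

class AbstractBuffer(NamedTuple):
    head: Tuple[int, ...]    # the k oldest stores, kept in order
    middle: FrozenSet[int]   # values in between: no order, no multiplicity
    recent: Optional[int]    # newest store beyond the head, if there is one

def abstract(buffer, k):
    """Abstract a concrete buffer that is given oldest store first."""
    head, rest = tuple(buffer[:k]), buffer[k:]
    if not rest:
        # Buffers of length at most k are represented exactly.
        return AbstractBuffer(head, frozenset(), None)
    return AbstractBuffer(head, frozenset(rest[:-1]), rest[-1])

# Values 1..7 as on the slide, read so that 7 is the oldest store (assumed).
concrete = [7, 6, 5, 4, 3, 2, 1]
a = abstract(concrete, k=2)
print(a.head, sorted(a.middle), a.recent)   # (7, 6) [2, 3, 4, 5] 1

# Reordering or duplicating the middle values gives the same abstract buffer.
reordered = [7, 6, 2, 2, 3, 5, 4, 1]
print(abstract(reordered, k=2) == a)        # True
```

The second call shows what the abstraction forgets: order and count in the middle of the buffer. The exact head and the recent value are what keep flushes and loads precise for short buffers.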

15

Abstract Fence Inference

1. Compute reachable abstract states

2. Compute constraints. Precision depends on abstraction.

3. Implement the constraints with fences

16

Fence Inference Results

Benchmarks are mutual exclusion primitives
k - the bound on the FIFO part of the abstract buffer
PD more “aggressive” than FD

[Table: columns Program, FD k=0, FD k=1, FD k=2, PD k=0, PD k=1, PD k=2; rows Sense0, Pet0, Dek0, Lam0, Fast0, Fast1a, Fast1b, Fast1c]

17

Summary

Partial-coherence abstractions
  Verification without arbitrary bounds
  Abstraction precision affects quality of results

Synthesis of fences
  Can infer optimal fences for mutual exclusion primitives

[Figure: BLENDER takes P, S, and M and produces P’]

18

Questions

19

Related Work

Under-approximation
  CheckFence [Burckhardt et al., PLDI ’07]
  Fender [KVY, FMCAD ’10]
  And more…

Over-approximation
  Equivalence to SC
    Very imprecise
    Goes back to Shasha & Snir [TOPLAS ’88]
  Abstract Interpretation
    Varying precision
    Regular Abstraction [Linden et al., SPIN ’10]
    Partial-Coherence [KVY, PLDI ’11]
