![Page 1: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/1.jpg)
Shared Memory Consistency Models: A Tutorial
Sarita V. Adve, Kourosh Gharachorloo
Western Research Laboratory
September 1995
![Page 2: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/2.jpg)
Goals
• Expand intuition about concurrent program behavior
• Explore execution sequences due to compiler or hardware optimizations
• Introduce shared memory consistency models
• Explore execution sequences due to a particular memory model
• Demonstrate Memory Barriers (“fences”)
![Page 3: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/3.jpg)
What happens?
Example of mutual exclusion (“Dekker’s Algorithm”)
Global variables initially: Flag1 = 0, Flag2 = 0
P1: Flag1 = 1; if (Flag2 == 0) critical section
P2: Flag2 = 1; if (Flag1 == 0) critical section
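One way to explore the question is to enumerate every interleaving that keeps each process's own program order and run it against a single shared memory. The sketch below does that in Python. It assumes no buffering or reordering, and the names `interleavings` and `run` are illustrative only, not part of the tutorial.

```python
from itertools import combinations

# Each process runs in program order: write its own flag, then read the other's.
P1 = [("write", "Flag1"), ("read", "Flag2")]
P2 = [("write", "Flag2"), ("read", "Flag1")]

def interleavings(a, b):
    """Yield every merge of a and b that keeps each list's internal order."""
    n = len(a) + len(b)
    for slots in combinations(range(n), len(a)):
        ia, ib = iter(a), iter(b)
        yield [("P1", next(ia)) if i in slots else ("P2", next(ib)) for i in range(n)]

def run(schedule):
    """Execute one schedule on a single shared memory (assumed model)."""
    memory = {"Flag1": 0, "Flag2": 0}
    entered = set()
    for proc, (op, var) in schedule:
        if op == "write":
            memory[var] = 1
        elif memory[var] == 0:  # the read saw the other flag still clear
            entered.add(proc)
    return entered

results = [run(s) for s in interleavings(P1, P2)]
both = sum(1 for r in results if r == {"P1", "P2"})
print(len(results), "interleavings,", both, "with both processes in the critical section")
```

Under these assumptions no interleaving lets both processes in. The slides that follow show what changes once the hardware stops behaving like this simple model.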
![Page 4: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/4.jpg)
Uniprocessor Hardware
P1: Flag1 = 1; if (Flag2 == 0) critical section
P2: Flag2 = 1; if (Flag1 == 0) critical section
Memory: Flag1 == 0, Flag2 == 0
T0: Flag1 = 0 and Flag2 = 0
![Page 5: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/5.jpg)
Uniprocessor Hardware
P1: Flag1 = 1; if (Flag2 == 0) critical section
P2: Flag2 = 1; if (Flag1 == 0) critical section
Memory: Flag1 == 1, Flag2 == 0
T0: Flag1 = 0 and Flag2 = 0
T1: P1 writes Flag1 = 1
![Page 6: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/6.jpg)
Uniprocessor Hardware
P1: Flag1 = 1; if (Flag2 == 0) critical section
P2: Flag2 = 1; if (Flag1 == 0) critical section
Memory: Flag1 == 1, Flag2 == 0
T0: Flag1 = 0 and Flag2 = 0
T1: P1 writes Flag1 = 1
T2: P1 reads Flag2 == 0
![Page 7: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/7.jpg)
Uniprocessor Hardware
P1: Flag1 = 1; if (Flag2 == 0) critical section
P2: Flag2 = 1; if (Flag1 == 0) critical section
Memory: Flag1 == 1, Flag2 == 1
T0: Flag1 = 0 and Flag2 = 0
T1: P1 writes Flag1 = 1
T2: P1 reads Flag2 == 0
T3: P2 writes Flag2 = 1
![Page 8: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/8.jpg)
Uniprocessor Hardware
P1: Flag1 = 1; if (Flag2 == 0) critical section
P2: Flag2 = 1; if (Flag1 == 0) critical section
Memory: Flag1 == 1, Flag2 == 1
T0: Flag1 = 0 and Flag2 = 0
T1: P1 writes Flag1 = 1
T2: P1 reads Flag2 == 0
T3: P2 writes Flag2 = 1
T4: P2 reads Flag1 == 1
Critical Section Protected
![Page 9: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/9.jpg)
Uniprocessor Hardware Optimizations
Buffer (Cache)
• Writes take about 100 cycles
• Reads take about 1 cycle
• Use Write Buffer Bypass
![Page 10: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/10.jpg)
Uniprocessor Hardware Optimizations
Write Buffer Bypass
P1: Flag1 = 1; if (Flag2 == 0) critical section
P2: Flag2 = 1; if (Flag1 == 0) critical section
Memory: Flag1 == 0, Flag2 == 0
T0: Flag1 = 0 and Flag2 = 0
![Page 11: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/11.jpg)
Uniprocessor Hardware Optimizations
Write Buffer Bypass
P1: Flag1 = 1; if (Flag2 == 0) critical section
P2: Flag2 = 1; if (Flag1 == 0) critical section
Write buffer: Flag1 = 1
Memory: Flag1 == 0, Flag2 == 0
T0: Flag1 = 0 and Flag2 = 0
T1: P1 writes Flag1 = 1
![Page 12: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/12.jpg)
Uniprocessor Hardware Optimizations
Write Buffer Bypass
P1: Flag1 = 1; if (Flag2 == 0) critical section
P2: Flag2 = 1; if (Flag1 == 0) critical section
Write buffer: Flag1 = 1
Memory: Flag1 == 0, Flag2 == 0
T0: Flag1 = 0 and Flag2 = 0
T1: P1 writes Flag1 = 1
T2: P1 reads Flag2 == ?
![Page 13: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/13.jpg)
Uniprocessor Hardware Optimizations
Write Buffer Bypass
P1: Flag1 = 1; if (Flag2 == 0) critical section
P2: Flag2 = 1; if (Flag1 == 0) critical section
Write buffer: Flag1 = 1
Memory: Flag1 == 0, Flag2 == 0
T0: Flag1 = 0 and Flag2 = 0
T1: P1 writes Flag1 = 1
T2: P1 reads Flag2 == 0
![Page 14: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/14.jpg)
Uniprocessor Hardware Optimizations
Write Buffer Bypass
P1: Flag1 = 1; if (Flag2 == 0) critical section
P2: Flag2 = 1; if (Flag1 == 0) critical section
Write buffer: Flag1 = 1, Flag2 = 1
Memory: Flag1 == 0, Flag2 == 0
T0: Flag1 = 0 and Flag2 = 0
T1: P1 writes Flag1 = 1
T2: P1 reads Flag2 == 0
T3: P2 writes Flag2 = 1
![Page 15: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/15.jpg)
Uniprocessor Hardware Optimizations
Write Buffer Bypass
P1: Flag1 = 1; if (Flag2 == 0) critical section
P2: Flag2 = 1; if (Flag1 == 0) critical section
Write buffer: Flag1 = 1, Flag2 = 1
Memory: Flag1 == 0, Flag2 == 0
T0: Flag1 = 0 and Flag2 = 0
T1: P1 writes Flag1 = 1
T2: P1 reads Flag2 == 0
T3: P2 writes Flag2 = 1
T4: P2 reads Flag1 == ?
![Page 16: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/16.jpg)
Uniprocessor Hardware Optimizations
Write Buffer Bypass
P1: Flag1 = 1; if (Flag2 == 0) critical section
P2: Flag2 = 1; if (Flag1 == 0) critical section
Write buffer: Flag1 = 1, Flag2 = 1
Memory: Flag1 == 0, Flag2 == 0
T0: Flag1 = 0 and Flag2 = 0
T1: P1 writes Flag1 = 1
T2: P1 reads Flag2 == 0
T3: P2 writes Flag2 = 1
T4: P2 reads Flag1 == 1
Critical Section Protected
![Page 17: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/17.jpg)
Multiprocessor Hardware Optimizations
Write Buffer Bypass
P1: Flag1 = 1; if (Flag2 == 0) critical section
P2: Flag2 = 1; if (Flag1 == 0) critical section
Interconnect: Shared Bus
Memory: Flag1 == 0, Flag2 == 0
T0: Flag1 = 0 and Flag2 = 0
![Page 18: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/18.jpg)
Multiprocessor Hardware Optimizations
Write Buffer Bypass
P1: Flag1 = 1; if (Flag2 == 0) critical section
P2: Flag2 = 1; if (Flag1 == 0) critical section
Write buffer (P1): Flag1 = 1
Interconnect: Shared Bus
Memory: Flag1 == 0, Flag2 == 0
T0: Flag1 = 0 and Flag2 = 0
T1: P1 writes Flag1 = 1
![Page 19: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/19.jpg)
Multiprocessor Hardware Optimizations
Write Buffer Bypass
P1: Flag1 = 1; if (Flag2 == 0) critical section
P2: Flag2 = 1; if (Flag1 == 0) critical section
Write buffer (P1): Flag1 = 1
Interconnect: Shared Bus
Memory: Flag1 == 0, Flag2 == 0
T0: Flag1 = 0 and Flag2 = 0
T1: P1 writes Flag1 = 1
T2: P1 reads Flag2 == ?
![Page 20: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/20.jpg)
Multiprocessor Hardware Optimizations
Write Buffer Bypass
P1: Flag1 = 1; if (Flag2 == 0) critical section
P2: Flag2 = 1; if (Flag1 == 0) critical section
Write buffer (P1): Flag1 = 1
Interconnect: Shared Bus
Memory: Flag1 == 0, Flag2 == 0
T0: Flag1 = 0 and Flag2 = 0
T1: P1 writes Flag1 = 1
T2: P1 reads Flag2 == 0
![Page 21: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/21.jpg)
Multiprocessor Hardware Optimizations
Write Buffer Bypass
P1: Flag1 = 1; if (Flag2 == 0) critical section
P2: Flag2 = 1; if (Flag1 == 0) critical section
Write buffer (P1): Flag1 = 1
Write buffer (P2): Flag2 = 1
Interconnect: Shared Bus
Memory: Flag1 == 0, Flag2 == 0
T0: Flag1 = 0 and Flag2 = 0
T1: P1 writes Flag1 = 1
T2: P1 reads Flag2 == 0
T3: P2 writes Flag2 = 1
![Page 22: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/22.jpg)
Multiprocessor Hardware Optimizations
Write Buffer Bypass
P1: Flag1 = 1; if (Flag2 == 0) critical section
P2: Flag2 = 1; if (Flag1 == 0) critical section
Write buffer (P1): Flag1 = 1
Write buffer (P2): Flag2 = 1
Interconnect: Shared Bus
Memory: Flag1 == 0, Flag2 == 0
T0: Flag1 = 0 and Flag2 = 0
T1: P1 writes Flag1 = 1
T2: P1 reads Flag2 == 0
T3: P2 writes Flag2 = 1
T4: P2 reads Flag1 == ?
![Page 23: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/23.jpg)
Multiprocessor Hardware Optimizations
Write Buffer Bypass
P1: Flag1 = 1; if (Flag2 == 0) critical section
P2: Flag2 = 1; if (Flag1 == 0) critical section
Write buffer (P1): Flag1 = 1
Write buffer (P2): Flag2 = 1
Interconnect: Shared Bus
Memory: Flag1 == 0, Flag2 == 0
T0: Flag1 = 0 and Flag2 = 0
T1: P1 writes Flag1 = 1
T2: P1 reads Flag2 == 0
T3: P2 writes Flag2 = 1
T4: P2 reads Flag1 == 0
Both in Critical Section
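The contrast between the uniprocessor and multiprocessor walkthroughs can be replayed with a small model. This is a sketch under stated assumptions: buffered writes never drain to memory during the run, a read checks the issuing processor's write buffer before memory, and on the uniprocessor both processes share one buffer. `run_dekker` and `private_buffers` are hypothetical names.

```python
def run_dekker(schedule, private_buffers):
    """Replay Dekker's flag test; reads bypass memory via the write buffer."""
    memory = {"Flag1": 0, "Flag2": 0}   # never updated: writes stay buffered (assumption)
    shared = {}
    buffers = {"P1": {}, "P2": {}} if private_buffers else {"P1": shared, "P2": shared}
    entered = []
    for proc, op, var in schedule:
        if op == "write":
            buffers[proc][var] = 1
        else:
            value = buffers[proc].get(var, memory[var])  # write buffer bypass
            if value == 0:
                entered.append(proc)
    return entered

# The T1..T4 order used on the slides.
schedule = [
    ("P1", "write", "Flag1"),  # T1
    ("P1", "read", "Flag2"),   # T2
    ("P2", "write", "Flag2"),  # T3
    ("P2", "read", "Flag1"),   # T4
]

uniprocessor = run_dekker(schedule, private_buffers=False)
multiprocessor = run_dekker(schedule, private_buffers=True)
print("uniprocessor:", uniprocessor)
print("multiprocessor:", multiprocessor)
```

With one shared buffer, P2's read of Flag1 finds P1's pending write, so only P1 enters. With a private buffer per processor, P2 cannot see that write and both enter.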
![Page 24: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/24.jpg)
Producer Consumer
Example of a Producer and Consumer
Global variables initially: Data = 0, Head = 0
P1: Data = 2; Head = 1
P2: while (Head == 0); print Data;
![Page 25: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/25.jpg)
Multiprocessor Hardware Optimizations
Overlapped Writes
P1: Data = 2; Head = 1
P2: while (Head == 0); print Data
Interconnect: General Interconnect (GI)
Memory: Data == 0, Head == 0
Writes in flight from P1: Data = 2, Head = 1
T0: Data = 0, Head = 0
![Page 26: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/26.jpg)
Multiprocessor Hardware Optimizations
Overlapped Writes
P1: Data = 2; Head = 1
P2: while (Head == 0); print Data
Interconnect: General Interconnect (GI)
Memory: Data == 0, Head == 1
Writes in flight from P1: Data = 2, Head = 1
T0: Data = 0, Head = 0
T1: GI delivers Head = 1
![Page 27: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/27.jpg)
Multiprocessor Hardware Optimizations
Overlapped Writes
P1: Data = 2; Head = 1
P2: while (Head == 0); print Data
Interconnect: General Interconnect (GI)
Memory: Data == 0, Head == 1
Writes in flight from P1: Data = 2, Head = 1
T0: Data = 0, Head = 0
T1: GI delivers Head = 1
T2: P2 reads Head == 1
![Page 28: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/28.jpg)
Multiprocessor Hardware Optimizations
Overlapped Writes
P1: Data = 2; Head = 1
P2: while (Head == 0); print Data
Interconnect: General Interconnect (GI)
Memory: Data == 0, Head == 1
Writes in flight from P1: Data = 2, Head = 1
T0: Data = 0, Head = 0
T1: GI delivers Head = 1
T2: P2 reads Head == 1
T3: P2 reads Data == 0
![Page 29: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/29.jpg)
Multiprocessor Hardware Optimizations
Overlapped Writes
P1: Data = 2; Head = 1
P2: while (Head == 0); print Data
Interconnect: General Interconnect (GI)
Memory: Data == 2, Head == 1
Writes in flight from P1: Data = 2, Head = 1
T0: Data = 0, Head = 0
T1: GI delivers Head = 1
T2: P2 reads Head == 1
T3: P2 reads Data == 0
T4: GI delivers Data = 2
Wrong Data
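The effect of overlapped writes can be reproduced with a toy model in which P1's two writes reach memory in a chosen order. The sketch assumes P2 reads Data as soon as it observes Head set. `consumer_sees` is a hypothetical helper, not something from the tutorial.

```python
def consumer_sees(arrival_order):
    """Return the Data value P2 prints when P1's writes land in arrival_order."""
    memory = {"Data": 0, "Head": 0}
    for var, value in arrival_order:
        memory[var] = value
        if memory["Head"] != 0:
            # P2's spin loop exits once Head is set, then it reads Data at once
            return memory["Data"]
    return None  # Head never arrived, so P2 would still be spinning

in_order = [("Data", 2), ("Head", 1)]     # writes arrive in program order
overlapped = [("Head", 1), ("Data", 2)]   # Head overtakes Data in the interconnect

print("in order:", consumer_sees(in_order))
print("overlapped:", consumer_sees(overlapped))
```

When the writes arrive in program order P2 prints 2. When Head overtakes Data, P2 prints the stale 0.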
![Page 30: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/30.jpg)
What was expected?
Example of a Producer and Consumer
Global variables initially: Data = 0, Head = 0
P1: Data = 2; Head = 1
P2: while (Head == 0); print Data;
![Page 31: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/31.jpg)
Simplify Example and the Operations
Simple Program
Global variables initially: A = 0, B = 0
P1: A = 1 (WX); B = 2 (WY)
P2: print A (RX); print B (RY)
![Page 32: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/32.jpg)
Reason about possible sequences
Expected Output
P1: A = 1 (WX); B = 2 (WY)
P2: print A (RX); print B (RY)
WX WY RX RY: prints 1 2
WX RX WY RY: prints 1 2
WX RX RY WY: prints 1 0
RX WX RY WY: prints 0 0
RX WX WY RY: prints 0 2
RX RY WX WY: prints 0 0
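The expected outputs can be derived mechanically. The sketch below enumerates the orderings that keep program order (WX before WY, RX before RY) and records what P2 prints for each. The helper names are illustrative assumptions.

```python
from itertools import permutations

PROGRAM_ORDER = [("WX", "WY"), ("RX", "RY")]  # P1: A = 1, B = 2; P2: print A, print B

def keeps_program_order(seq):
    return all(seq.index(first) < seq.index(second) for first, second in PROGRAM_ORDER)

def output(seq):
    """Values printed by P2 (A, then B) when operations reach memory in this order."""
    memory = {"A": 0, "B": 0}
    printed = {}
    for op in seq:
        if op == "WX":
            memory["A"] = 1
        elif op == "WY":
            memory["B"] = 2
        elif op == "RX":
            printed["A"] = memory["A"]
        else:  # RY
            printed["B"] = memory["B"]
    return printed["A"], printed["B"]

table = {seq: output(seq)
         for seq in permutations(["WX", "WY", "RX", "RY"])
         if keeps_program_order(seq)}
for seq, (a, b) in table.items():
    print(" ".join(seq), "->", a, b)
```

Six orderings survive the program-order filter, and between them they produce four distinct outputs: 1 2, 1 0, 0 2 and 0 0.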
![Page 33: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/33.jpg)
Reason about possible sequences.
We get them all?
P1: A = 1 (WX); B = 2 (WY)
P2: print A (RX); print B (RY)
WX WY RX RY
WX RX WY RY
WX RX RY WY
RX WX RY WY
RX WX WY RY
RX RY WX WY
![Page 34: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/34.jpg)
Similar Reasoning
Example of a Producer and Consumer
Global variables initially: Data = 0, Head = 0
P1: Data = 2 (WX); Head = 1 (WY)
P2: while (Head == 0) (RY); ... = Data (RX)
![Page 35: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/35.jpg)
Reason about possible sequences.
Expected Outcomes
P1: Data = 2 (WX); Head = 1 (WY)
P2: while (Head == 0) (RY); print Data (RX)
WX WY RY RX
WX RY WY RX
WX RY RX WY
RY WX RX WY
RY WX WY RX
RY RX WX WY
Output: 2
![Page 36: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/36.jpg)
Reason about possible sequences.
Expected Outcomes
P1: Data = 2 (WX); Head = 1 (WY)
P2: while (Head == 0) (RY); print Data (RX)
WX WY RY RX
WX RY WY RX
WX RY RX WY
RY WX RX WY
RY WX WY RX
RY RX WX WY
Output: 2
Output: 0 (Expect This?)
![Page 37: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/37.jpg)
Multiprocessor Hardware Optimizations
Overlapped Writes
P1: Data = 2; Head = 1
P2: while (Head == 0); print Data
Interconnect: General Interconnect (GI)
Memory: Data == 0, Head == 0
Writes in flight from P1: Data = 2, Head = 1
T0: Data = 0, Head = 0
![Page 38: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/38.jpg)
Multiprocessor Hardware Optimizations
Overlapped Writes
P1: Data = 2; Head = 1
P2: while (Head == 0); print Data
Interconnect: General Interconnect (GI)
Memory: Data == 0, Head == 1
Writes in flight from P1: Data = 2, Head = 1
T0: Data = 0, Head = 0
T1: GI delivers Head = 1
Sequence so far: WY
![Page 39: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/39.jpg)
Multiprocessor Hardware Optimizations
Overlapped Writes
P1: Data = 2; Head = 1
P2: while (Head == 0); print Data
Interconnect: General Interconnect (GI)
Memory: Data == 0, Head == 1
Writes in flight from P1: Data = 2, Head = 1
T0: Data = 0, Head = 0
T1: GI delivers Head = 1
T2: P2 reads Head == 1
Sequence so far: WY RY
![Page 40: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/40.jpg)
Multiprocessor Hardware Optimizations
Overlapped Writes
P1: Data = 2; Head = 1
P2: while (Head == 0); print Data
Interconnect: General Interconnect (GI)
Memory: Data == 0, Head == 1
Writes in flight from P1: Data = 2, Head = 1
T0: Data = 0, Head = 0
T1: GI delivers Head = 1
T2: P2 reads Head == 1
T3: P2 reads Data == 0
Sequence so far: WY RY RX
![Page 41: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/41.jpg)
Multiprocessor Hardware Optimizations
Overlapped Writes
P1: Data = 2; Head = 1
P2: while (Head == 0); print Data
Interconnect: General Interconnect (GI)
Memory: Data == 2, Head == 1
Writes in flight from P1: Data = 2, Head = 1
T0: Data = 0, Head = 0
T1: GI delivers Head = 1
T2: P2 reads Head == 1
T3: P2 reads Data == 0
T4: GI delivers Data = 2
Sequence: WY RY RX WX
Output: 0
Wrong Data
![Page 42: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/42.jpg)
Compiler Optimizations
• Constant Propagation
• Register Allocation
• Loop Transformation
• Instruction Scheduling
• Common Subexpression Elimination
• Et Cetera
![Page 43: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/43.jpg)
More H/W Optimizations
• Speculative Execution
• Execution reordering (e.g. pipelining)
• Speculative Store
• Read to Write reordering
• Write to Read reordering
• Write to Write reordering
• Read to Read reordering
• Et Cetera
![Page 44: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/44.jpg)
Possible Outcomes
P1: Data = 2 (WX); Head = 1 (WY)
P2: while (Head == 0) (RY); print Data (RX)
Expected sequences (output 2):
WX WY RY RX
WX RY WY RX
WX RY RX WY
RY WX RX WY
RY WX WY RX
RY RX WX WY
Get These Too (output 0 for each):
WY RY RX WX
WY RX RY WX
WY RX WX RY
RX WX RY WY
RX WY RY WX
RX WY WX RY
![Page 45: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/45.jpg)
What’s missing?
P1: A = 1 (WX); B = 2 (WY)
P2: print A (RX); print B (RY)
WX WY RX RY
WX RX WY RY
WX RX RY WY
RX WX RY WY
RX WX WY RY
RX RY WX WY
![Page 46: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/46.jpg)
Simple Program
All possible sequences
P1: A = 1 (WX); B = 2 (WY)
P2: print A (RX); print B (RY)
WX WY RX RY
WX WY RY RX
WX RX WY RY
WX RX RY WY
WX RY RX WY
WX RY WY RX
WY WX RX RY
WY WX RY RX
WY RX WX RY
WY RY WX RX
WY RX RY WX
WY RY RX WX
RX WX RY WY
RX WX WY RY
RX WY RY WX
RX WY WX RY
RX RY WX WY
RX RY WY WX
RY WX WY RX
RY WX RX WY
RY WY WX RX
RY WY RX WX
RY RX WX WY
RY RX WY WX
![Page 47: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/47.jpg)
Dekker’s Algorithm
Simplify the Operations
Example of mutual exclusion (“Dekker’s Algorithm”)
Global variables initially: Flag1 = 0, Flag2 = 0
P1: Flag1 = 1 (WX); if (Flag2 == 0) (RY) critical section
P2: Flag2 = 1 (WY); if (Flag1 == 0) (RX) critical section
![Page 48: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/48.jpg)
Dekker’s Algorithm
All possible sequences
WX WY RX RY
WX WY RY RX
WX RX WY RY
WX RX RY WY
WX RY RX WY
WX RY WY RX
WY WX RX RY
WY WX RY RX
WY RX WX RY
WY RY WX RX
WY RX RY WX
WY RY RX WX
RX WX RY WY
RX WX WY RY
RX WY RY WX
RX WY WX RY
RX RY WX WY
RX RY WY WX
RY WX WY RX
RY WX RX WY
RY WY WX RX
RY WY RX WX
RY RX WX WY
RY RX WY WX
Example of a Synchronization (“Dekker’s Algorithm”)
Which of these sequences will prevent concurrent execution?
![Page 49: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/49.jpg)
Dekker’s Algorithm
Sequences and Outcomes
P1: Flag1 = 1 (WX); if (Flag2 == 0) (RY) critical section
P2: Flag2 = 1 (WY); if (Flag1 == 0) (RX) critical section
WX WY RX RY: OK
WX WY RY RX: OK
WX RX WY RY: OK
WX RX RY WY: OK
WX RY RX WY: OK
WX RY WY RX: OK
WY WX RX RY: OK
WY WX RY RX: OK
WY RX WX RY: OK
WY RY WX RX: OK
WY RX RY WX: OK
WY RY RX WX: OK
RX WX RY WY: Wrong
RX WX WY RY: OK
RX WY RY WX: OK
RX WY WX RY: OK
RX RY WX WY: Wrong
RX RY WY WX: Wrong
RY WX WY RX: OK
RY WX RX WY: OK
RY WY WX RX: OK
RY WY RX WX: Wrong
RY RX WX WY: Wrong
RY RX WY WX: Wrong
![Page 50: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/50.jpg)
Dekker’s Algorithm
Sequences and Outcomes
P1: Flag1 = 1 (WX); if (Flag2 == 0) (RY) critical section
P2: Flag2 = 1 (WY); if (Flag1 == 0) (RX) critical section
OK OK OK OK OK OK
OK OK OK OK OK OK
Wrong OK OK OK Wrong Wrong
OK OK OK Wrong Wrong Wrong
Need to restrict certain sequences
![Page 51: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/51.jpg)
Dekker’s Algorithm
Sequences and Outcomes
P1: Flag1 = 1 (WX); if (Flag2 == 0) (RY) critical section
P2: Flag2 = 1 (WY); if (Flag1 == 0) (RX) critical section
WX WY RX RY: OK
WX WY RY RX: OK
WX RX WY RY: OK
WX RX RY WY: OK
WX RY RX WY: OK
WX RY WY RX: OK
WY WX RX RY: OK
WY WX RY RX: OK
WY RX WX RY: OK
WY RY WX RX: OK
WY RX RY WX: OK
WY RY RX WX: OK
RX WX RY WY: Wrong
RX WX WY RY: OK
RX WY RY WX: OK
RX WY WX RY: OK
RX RY WX WY: Wrong
RX RY WY WX: Wrong
RY WX WY RX: OK
RY WX RX WY: OK
RY WY WX RX: OK
RY WY RX WX: Wrong
RY RX WX WY: Wrong
RY RX WY WX: Wrong
Works whenever WX precedes RX or WY precedes RY
![Page 52: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/52.jpg)
Dekker’s Algorithm
All possible sequences
WX WY RX RY
WX WY RY RX
WX RX WY RY
WX RX RY WY
WX RY RX WY
WX RY WY RX
WY WX RX RY
WY WX RY RX
WY RX WX RY
WY RY WX RX
WY RX RY WX
WY RY RX WX
RX WX RY WY
RX WX WY RY
RX WY RY WX
RX WY WX RY
RX RY WX WY
RX RY WY WX
RY WX WY RX
RY WX RX WY
RY WY WX RX
RY WY RX WX
RY RX WX WY
RY RX WY WX
P1: Flag1 = 1 (WX); if (Flag2 == 0) (RY) critical section
P2: Flag2 = 1 (WY); if (Flag1 == 0) (RX) critical section
Works whenever WX precedes RX or WY precedes RY
18 are OK
6 are Wrong
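The 18/6 split follows directly from the stated rule and can be checked by brute force. A minimal sketch, with `protected` as an illustrative name for the rule on this slide:

```python
from itertools import permutations

def protected(seq):
    """Slide rule: mutual exclusion holds if WX precedes RX or WY precedes RY."""
    return seq.index("WX") < seq.index("RX") or seq.index("WY") < seq.index("RY")

sequences = list(permutations(["WX", "WY", "RX", "RY"]))
wrong = [seq for seq in sequences if not protected(seq)]
ok_count = len(sequences) - len(wrong)

print(ok_count, "are OK;", len(wrong), "are Wrong")
for seq in wrong:
    print("Wrong:", " ".join(seq))
```

The six failing sequences are exactly those in which each read comes before the other process's write, so both processes see a flag of 0.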
![Page 53: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/53.jpg)
Simple Program
All possible sequences
P1: A = 1 (WX); B = 2 (WY)
P2: print A (RX); print B (RY)
WX WY RX RY
WX WY RY RX
WX RX WY RY
WX RX RY WY
WX RY RX WY
WX RY WY RX
WY WX RX RY
WY WX RY RX
WY RX WX RY
WY RY WX RX
WY RX RY WX
WY RY RX WX
RX WX RY WY
RX WX WY RY
RX WY RY WX
RX WY WX RY
RX RY WX WY
RX RY WY WX
RY WX WY RX
RY WX RX WY
RY WY WX RX
RY WY RX WX
RY RX WX WY
RY RX WY WX
No ordering requirement
![Page 54: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/54.jpg)
Simple Program
All possible sequences
P1: A = 1 (WX); B = 2 (WY)
P2: print A (RX); print B (RY)
WX WY RX RY
WX WY RY RX
WX RX WY RY
WX RX RY WY
WX RY RX WY
WX RY WY RX
WY WX RX RY
WY WX RY RX
WY RX WX RY
WY RY WX RX
WY RX RY WX
WY RY RX WX
RX WX RY WY
RX WX WY RY
RX WY RY WX
RX WY WX RY
RX RY WX WY
RX RY WY WX
RY WX WY RX
RY WX RX WY
RY WY WX RX
RY WY RX WX
RY RX WX WY
RY RX WY WX
No ordering requirement
All 24 are “OK”
0 are “Wrong”
![Page 55: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/55.jpg)
Producer Consumer
All sequences
P1: Data = 2 (WX); Head = 1 (WY)
P2: while (Head == 0) (RY); print Data (RX)
WX WY RX RY
WX WY RY RX
WX RX WY RY
WX RX RY WY
WX RY RX WY
WX RY WY RX
WY WX RX RY
WY WX RY RX
WY RX WX RY
WY RY WX RX
WY RX RY WX
WY RY RX WX
RX WX RY WY
RX WX WY RY
RX WY RY WX
RX WY WX RY
RX RY WX WY
RX RY WY WX
RY WX WY RX
RY WX RX WY
RY WY WX RX
RY WY RX WX
RY RX WX WY
RY RX WY WX
Require WX precede RX and WY precede RY and WY precede RX
![Page 56: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/56.jpg)
Producer Consumer
All sequences
P1: Data = 2 (WX); Head = 1 (WY)
P2: while (Head == 0) (RY); print Data (RX)
WX WY RX RY
WX WY RY RX
WX RX WY RY
WX RX RY WY
WX RY RX WY
WX RY WY RX
WY WX RX RY
WY WX RY RX
WY RX WX RY
WY RY WX RX
WY RX RY WX
WY RY RX WX
RX WX RY WY
RX WX WY RY
RX WY RY WX
RX WY WX RY
RX RY WX WY
RX RY WY WX
RY WX WY RX
RY WX RX WY
RY WY WX RX
RY WY RX WX
RY RX WX WY
RY RX WY WX
Require WX precede RX and WY precede RY and WY precede RX
5 are OK
19 are Wrong
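The 5/19 count can be reproduced by filtering all 24 orderings against the three precedence requirements. A minimal sketch, where `REQUIRED` and `satisfies` are illustrative names:

```python
from itertools import permutations

# (earlier, later) pairs taken from the requirement stated on this slide
REQUIRED = [("WX", "RX"), ("WY", "RY"), ("WY", "RX")]

def satisfies(seq):
    return all(seq.index(earlier) < seq.index(later) for earlier, later in REQUIRED)

sequences = list(permutations(["WX", "WY", "RX", "RY"]))
ok = [seq for seq in sequences if satisfies(seq)]

print(len(ok), "are OK;", len(sequences) - len(ok), "are Wrong")
for seq in ok:
    print("OK:", " ".join(seq))
```

Every surviving sequence has both writes before RX, which is what guarantees the consumer prints the new Data value.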
![Page 57: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/57.jpg)
Producer Consumer
All sequences
P1: Data = 2 (WX); Head = 1 (WY)
P2: while (Head == 0) (RY); print Data (RX)
WX WY RX RY
WX WY RY RX
WX RX WY RY
WX RX RY WY
WX RY RX WY
WX RY WY RX
WY WX RX RY
WY WX RY RX
WY RX WX RY
WY RY WX RX
WY RX RY WX
WY RY RX WX
RX WX RY WY
RX WX WY RY
RX WY RY WX
RX WY WX RY
RX RY WX WY
RX RY WY WX
RY WX WY RX
RY WX RX WY
RY WY WX RX
RY WY RX WX
RY RX WX WY
RY RX WY WX
When RY precedes WY, while-RY-loop spins. Eventually we get WY < RY.
5 are OK
19 are Wrong
![Page 58: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/58.jpg)
Producer Consumer
All sequences
P1: Data = 2 (WX); Head = 1 (WY)
P2: while (Head == 0) (RY); print Data (RX)
WX WY RX RY
WX WY RY RX
WX RX WY RY
WX RX RY WY
WX RY RX WY
WX RY WY RX
WY WX RX RY
WY WX RY RX
WY RX WX RY
WY RY WX RX
WY RX RY WX
WY RY RX WX
RX WX RY WY
RX WX WY RY
RX WY RY WX
RX WY WX RY
RX RY WX WY
RX RY WY WX
RY WX WY RX
RY WX RX WY
RY WY WX RX
RY WY RX WX
RY RX WX WY
RY RX WY WX
We have RY, RY, RY … Sequences with RY < WY will eventually end with RY
5 are OK
19 are Wrong?
![Page 60: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/60.jpg)
Producer Consumer
All sequences
P1: Data = 2 (WX); Head = 1 (WY)
P2: while (Head == 0) (RY); print Data (RX)
WX WY RX RY
WX WY RY RX
WX RX WY RY
WX RX RY WY RY
WX RY RX WY RY
WX RY WY RX RY
WY WX RX RY
WY WX RY RX
WY RX WX RY
WY RY WX RX
WY RX RY WX
WY RY RX WX
RX WX RY WY RY
RX WX WY RY
RX WY RY WX
RX WY WX RY
RX RY WX WY RY
RX RY WY WX RY
RY WX WY RX RY
RY WX RX WY RY
RY WY WX RX RY
RY WY RX WX RY
RY RX WX WY RY
RY RX WY WX RY
The consumer spins, so RY repeats: RY, RY, RY … Any sequence with RY < WY reads Head == 0, loops, and eventually ends with another RY.
5 are OK
19 are Wrong?
![Page 61: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/61.jpg)
Producer Consumer
All sequences
P1: Data = 2 (WX)
    Head = 1 (WY)
P2: while(Head == 0); (RY)
    print Data; (RX)
WX WY RX RY
WX WY RY RX
WX RX WY RY
WX RX RY WY RY
WX RY RX WY RY
WX RY WY RX RY
WY WX RX RY
WY WX RY RX
WY RX WX RY
WY RY WX RX
WY RX RY WX
WY RY RX WX
RX WX RY WY RY
RX WX WY RY
RX WY RY WX
RX WY WX RY
RX RY WX WY RY
RX RY WY WX RY
RY WX WY RX RY
RY WX RX WY RY
RY WY WX RX RY
RY WY RX WX RY
RY RX WX WY RY
RY RX WY WX RY
We can remove the earlier RY in those sequences.
5 are OK
19 are Wrong?
![Page 62: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/62.jpg)
Producer Consumer
All sequences
P1: Data = 2 (WX)
    Head = 1 (WY)
P2: while(Head == 0); (RY)
    print Data; (RX)
WX WY RX RY
WX WY RY RX
WX RX WY RY
WX RX WY RY
WX RX WY RY
WX WY RX RY
WY WX RX RY
WY WX RY RX
WY RX WX RY
WY RY WX RX
WY RX RY WX
WY RY RX WX
RX WX WY RY
RX WX WY RY
RX WY RY WX
RX WY WX RY
RX WX WY RY
RX WY WX RY
WX WY RX RY
WX RX WY RY
WY WX RX RY
WY RX WX RY
RX WX WY RY
RX WY WX RY
Remove all of the duplicated sequences
5 are OK
19 are Wrong?
![Page 63: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/63.jpg)
Producer Consumer
All sequences
P1: Data = 2 (WX)
    Head = 1 (WY)
P2: while(Head == 0); (RY)
    print Data; (RX)
WX WY RX RY
WX WY RY RX
WX RX WY RY
WY WX RX RY
WY WX RY RX
WY RX WX RY
WY RY WX RX
WY RX RY WX
WY RY RX WX
RX WX WY RY
RX WY RY WX
RX WY WX RY
Remove all of the duplicated sequences
5 are OK
7 are Wrong
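The reduction from 24 sequences to 12 can be reproduced mechanically. The sketch below is a hedged illustration of the two steps on the preceding slides: wherever RY comes before WY, drop that early RY and re-issue it at the end, then discard duplicates. The function name `retry_read` is an assumption made for this example.

```python
from itertools import permutations

def retry_read(ops):
    """If RY precedes WY the loop spins: drop that RY and re-issue it last."""
    ops = list(ops)
    if ops.index("RY") < ops.index("WY"):
        ops.remove("RY")
        ops.append("RY")
    return tuple(ops)

raw = list(permutations(("WX", "WY", "RY", "RX")))
rewritten = [retry_read(p) for p in raw]

# A set drops the duplicates created by the rewrite step.
unique = sorted(set(rewritten))

print(len(raw), "raw sequences")
print(len(unique), "unique sequences after the rewrite")
for seq in unique:
    print(" ".join(seq))
```

Every surviving sequence has WY before RY, because a read of Head that misses the write is simply retried.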
![Page 64: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/64.jpg)
Producer Consumer
Possible sequences w/write acknowledge
P1: Data = 2 (WX)
    Head = 1 (WY)
P2: while(Head == 0); (RY)
    print Data; (RX)
WX WY RX RY
WX WY RY RX
WX RX WY RY
WY WX RX RY
WY WX RY RX
WY RX WX RY
WY RY WX RX
WY RX RY WX
WY RY RX WX
RX WX WY RY
RX WY RY WX
RX WY WX RY
Some H/W provides write acknowledgment (i.e. wait for pending writes to complete)
5 are OK
7 are Wrong
![Page 65: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/65.jpg)
Producer Consumer
Possible sequences w/write acknowledge
P1: Data = 2 (WX)
    Head = 1 (WY)
P2: while(Head == 0); (RY)
    print Data; (RX)
WX WY RX RY
WX WY RY RX
WX RX WY RY
WY WX RX RY
WY WX RY RX
WY RX WX RY
WY RY WX RX
WY RX RY WX
WY RY RX WX
RX WX WY RY
RX WY RY WX
RX WY WX RY
Remove all sequences where WY < WX.
5 are OK
7 are Wrong
![Page 66: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/66.jpg)
Producer Consumer
Possible sequences w/write acknowledge
P1: Data = 2 (WX)
    Head = 1 (WY)
P2: while(Head == 0); (RY)
    print Data; (RX)
WX WY RX RY
WX WY RY RX
WX RX WY RY
RX WX WY RY
Remove all sequences where WY < WX.
2 are OK
2 are Wrong
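The effect of write acknowledgment can be checked by filtering the 12 candidate sequences. This is a minimal sketch, assuming write acknowledge is modelled as the single constraint WX before WY and that a sequence counts as OK when WX and WY both precede RX (the rule the slides state for this example); the names are illustrative.

```python
from itertools import permutations

def before(seq, a, b):
    """True when operation a appears earlier than operation b in seq."""
    return seq.index(a) < seq.index(b)

# The 12 sequences in which the consumer's final read of Head sees the write.
candidates = [p for p in permutations(("WX", "WY", "RY", "RX"))
              if before(p, "WY", "RY")]

# Write acknowledge: the write of Data completes before the write of Head.
acked = [p for p in candidates if before(p, "WX", "WY")]

ok = [p for p in acked if before(p, "WX", "RX") and before(p, "WY", "RX")]
wrong = [p for p in acked if p not in ok]

print(len(acked), "sequences remain;", len(ok), "OK,", len(wrong), "wrong")
```

The two wrong sequences are the ones where the read of Data still runs ahead of the write of Head, which write acknowledgment alone does not prevent.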
![Page 67: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/67.jpg)
Review. What does the H/W provide?
• Reordering of loads and stores – doesn’t help
• Write acknowledge – almost helps
• Memory Models
![Page 68: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/68.jpg)
Memory Models
Sequential Consistency
Definition: [A multiprocessor system is sequentially consistent if] the result of any execution is the same as if the operations of all the processors were executed in some sequential order, and the operations of each individual processor appear in this sequence in the order specified by its program. [Lamport 1979]
Pros
• Simple view of program
• OK for Uniprocessor environments
Cons
• Not OK for Multiprocessor environments
• Too restrictive for processor performance
![Page 69: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/69.jpg)
Memory Models
Relaxed Consistency
Description: Relaxed memory consistency models are already implemented on available multiprocessors. They specify which memory operations the hardware may reorder.
• Write to Read
• Write to Write
• Read to Read / Write
• Read Others’ Write Early
• Read Own Write Early
They all have methods to force a particular ordering; these are known as the Safety Net.
![Page 70: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/70.jpg)
Available Relaxed Memory Models
Relaxation columns: W->R Order, W->W Order, R->RW Order, Read Others’ Write Early, Read Own Write Early, Safety Net
Safety Net by model:
IBM 370: serialization instructions
TSO: RMW
PC: RMW
PSO: RMW, STBAR
WO: synchronization
RCsc: release, acquire, nsync, RMW
RCpc: release, acquire, nsync, RMW
Alpha: MB, WMB
RMO: various MEMBARs
PowerPC: SYNC
![Page 71: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/71.jpg)
Producer Consumer
Relaxed W->R memory model
WX WY RX RY
WX WY RY RX
WX RX WY RY
WX RX RY WY
WX RY RX WY
WX RY WY RX
WY WX RX RY
WY WX RY RX
WY RX WX RY
WY RY WX RX
WY RX RY WX
WY RY RX WX
RX WX RY WY
RX WX WY RY
RX WY RY WX
RX WY WX RY
RX RY WX WY
RX RY WY WX
RY WX WY RX
RY WX RX WY
RY WY WX RX
RY WY RX WX
RY RX WX WY
RY RX WY WX
Which of these sequences can be expected with all the memory models listed?
![Page 72: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/72.jpg)
Producer Consumer
All sequences
P1: Data = 2 (WX)
    Head = 1 (WY)
P2: while(Head == 0); (RY)
    print Data; (RX)
WX WY RX RY
WX WY RY RX
WX RX WY RY
WX RX RY WY
WX RY RX WY
WX RY WY RX
WY WX RX RY
WY WX RY RX
WY RX WX RY
WY RY WX RX
WY RX RY WX
WY RY RX WX
RX WX RY WY
RX WX WY RY
RX WY RY WX
RX WY WX RY
RX RY WX WY
RX RY WY WX
RY WX WY RX
RY WX RX WY
RY WY WX RX
RY WY RX WX
RY RX WX WY
RY RX WY WX
Require WX precede RX and WY precede RY and WY precede RX
![Page 73: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/73.jpg)
Producer Consumer
Possible sequences
P1: Data = 2 (WX)
    Head = 1 (WY)
P2: while(Head == 0); (RY)
    print Data; (RX)
WX WY RX RY
WX WY RY RX
WX RX WY RY
WY WX RX RY
WY WX RY RX
WY RX WX RY
WY RY WX RX
WY RX RY WX
WY RY RX WX
RX WX WY RY
RX WY RY WX
RX WY WX RY
Require WX precede RX and WY precede RY and WY precede RX
5 are OK
7 are Wrong
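The 5/7 split follows directly from the three stated requirements. A small sketch, assuming the 12 possible sequences are the permutations with WY before RY and that `is_ok` (a name invented here) encodes the rule exactly as written on the slide:

```python
from itertools import permutations

def before(seq, a, b):
    """True when operation a appears earlier than operation b in seq."""
    return seq.index(a) < seq.index(b)

def is_ok(seq):
    """WX precedes RX, WY precedes RY, and WY precedes RX."""
    return (before(seq, "WX", "RX")
            and before(seq, "WY", "RY")
            and before(seq, "WY", "RX"))

possible = [p for p in permutations(("WX", "WY", "RY", "RX"))
            if before(p, "WY", "RY")]
ok = [p for p in possible if is_ok(p)]
wrong = [p for p in possible if not is_ok(p)]

for seq in possible:
    print(" ".join(seq), "OK" if is_ok(seq) else "Wrong")
print(len(ok), "are OK,", len(wrong), "are Wrong")
```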
![Page 74: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/74.jpg)
Producer Consumer
with sequential consistency
P1: Data = 2 (WX)
    Head = 1 (WY)
P2: while(Head == 0); (RY)
    print Data; (RX)
WX WY RX RY
WX WY RY RX
WX RX WY RY
WY WX RX RY
WY WX RY RX
WY RX WX RY
WY RY WX RX
WY RX RY WX
WY RY RX WX
RX WX WY RY
RX WY RY WX
RX WY WX RY
Start with Sequential Consistency
5 are OK
7 are Wrong
![Page 75: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/75.jpg)
Producer Consumer
with sequential consistency
P1: Data = 2 (WX)
    Head = 1 (WY)
P2: while(Head == 0); (RY)
    print Data; (RX)
WX WY RY RX
Start with Sequential Consistency
1 is OK
0 are Wrong
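Sequential consistency allows only interleavings that keep each processor's program order. The sketch below generates those interleavings for two instruction lists and then keeps the ones in which the loop sees Head == 1. The function `interleavings` is a hypothetical helper written for this illustration, not something from the tutorial.

```python
def interleavings(p1, p2):
    """Yield every merge of two instruction lists that keeps each list's order."""
    if not p1:
        yield tuple(p2)
        return
    if not p2:
        yield tuple(p1)
        return
    # Either the next global step comes from p1 ...
    for rest in interleavings(p1[1:], p2):
        yield (p1[0],) + rest
    # ... or it comes from p2.
    for rest in interleavings(p1, p2[1:]):
        yield (p2[0],) + rest

producer = ["WX", "WY"]   # Data = 2; Head = 1
consumer = ["RY", "RX"]   # final read of Head; read of Data

sc = list(interleavings(producer, consumer))
# The loop only exits once the read of Head follows the write of Head.
exits_loop = [s for s in sc if s.index("WY") < s.index("RY")]

print(len(sc), "program-order interleavings")
print("loop exits in:", [" ".join(s) for s in exits_loop])
```

Program order gives WX before WY and RY before RX; adding WY before RY leaves a single total order, matching the one sequence on the slide.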
![Page 76: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/76.jpg)
Producer Consumer
Relaxed W->R ordering sequences
P1: Data = 2 (WX)
    Head = 1 (WY)
P2: while(Head == 0); (RY)
    print Data; (RX)
WX WY RY RX
Add sequences due to the relaxation of W->R ordering
1 is OK
0 are Wrong
![Page 77: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/77.jpg)
Producer Consumer
Relaxed W->R ordering sequences
P1: Data = 2 (WX)
    Head = 1 (WY)
P2: while(Head == 0); (RY)
    print Data; (RX)
WX WY RY RX
No change
1 is OK
0 are Wrong
![Page 78: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/78.jpg)
Producer Consumer
Relaxed W->R and W->W ordering sequences
P1: Data = 2 (WX)
    Head = 1 (WY)
P2: while(Head == 0); (RY)
    print Data; (RX)
WX WY RY RX
Most processors also relax W->W ordering.
1 is OK
0 are Wrong
![Page 79: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/79.jpg)
Producer Consumer
Relaxed W->R and W->W ordering sequences
P1: Data = 2 (WX)
    Head = 1 (WY)
P2: while(Head == 0); (RY)
    print Data; (RX)
Started with sequential consistency, then added relaxed w->r and w->w orderings
3 are OK
1 is Wrong
WX WY RY RX
WY WX RY RX
WY RY WX RX
WY RY RX WX
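The four sequences can be derived by dropping the W->W program-order constraint from the sequentially consistent case. A hedged sketch, assuming the remaining constraints are RY before RX (read order is still kept) and WY before RY (the loop must see Head == 1); relaxing W->R adds nothing here because neither processor performs a write followed by a read.

```python
from itertools import permutations

def before(seq, a, b):
    """True when operation a appears earlier than operation b in seq."""
    return seq.index(a) < seq.index(b)

# WX is no longer pinned before WY, so it may land anywhere.
relaxed = [p for p in permutations(("WX", "WY", "RY", "RX"))
           if before(p, "RY", "RX") and before(p, "WY", "RY")]

# The consumer prints the new value only if Data was written before it is read.
ok = [p for p in relaxed if before(p, "WX", "RX")]
wrong = [p for p in relaxed if not before(p, "WX", "RX")]

for seq in relaxed:
    print(" ".join(seq), "OK" if seq in ok else "Wrong")
```

The single wrong sequence is the one where the write of Data drifts past both reads.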
![Page 80: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/80.jpg)
Dekker’s Algorithm
Relaxed W->R memory model
WX WY RX RY
WX WY RY RX
WX RX WY RY
WX RX RY WY
WX RY RX WY
WX RY WY RX
WY WX RX RY
WY WX RY RX
WY RX WX RY
WY RY WX RX
WY RX RY WX
WY RY RX WX
RX WX RY WY
RX WX WY RY
RX WY RY WX
RX WY WX RY
RX RY WX WY
RX RY WY WX
RY WX WY RX
RY WX RX WY
RY WY WX RX
RY WY RX WX
RY RX WX WY
RY RX WY WX
Which of these sequences can be expected with all the memory models listed?
![Page 81: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/81.jpg)
Dekker’s Algorithm
Relaxed W->R ordering sequences
WX WY RX RY
WX WY RY RX
WX RX WY RY
WX RX RY WY
WX RY RX WY
WX RY WY RX
WY WX RX RY
WY WX RY RX
WY RX WX RY
WY RY WX RX
WY RX RY WX
WY RY RX WX
RX WX RY WY
RX WX WY RY
RX WY RY WX
RX WY WX RY
RX RY WX WY
RX RY WY WX
RY WX WY RX
RY WX RX WY
RY WY WX RX
RY WY RX WX
RY RX WX WY
RY RX WY WX
P1: Flag1 = 1 (WX)
    If(Flag2 == 0) (RY)
        Critical section
P2: Flag2 = 1 (WY)
    If(Flag1 == 0) (RX)
        Critical section
Works whenever WX precedes RX or WY precedes RY
18 are OK
6 are Wrong
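The 18/6 split can be verified by applying the stated rule to all 24 orderings. A minimal sketch, assuming X is Flag1 and Y is Flag2, so that mutual exclusion fails only when both processors read 0 (RY before WY and RX before WX); the function name is illustrative.

```python
from itertools import permutations

def before(seq, a, b):
    """True when operation a appears earlier than operation b in seq."""
    return seq.index(a) < seq.index(b)

def mutual_exclusion_holds(seq):
    """Works whenever WX precedes RX or WY precedes RY."""
    return before(seq, "WX", "RX") or before(seq, "WY", "RY")

sequences = list(permutations(("WX", "RY", "WY", "RX")))
ok = [s for s in sequences if mutual_exclusion_holds(s)]
wrong = [s for s in sequences if not mutual_exclusion_holds(s)]

print(len(ok), "are OK,", len(wrong), "are Wrong")
for seq in wrong:
    print("both enter:", " ".join(seq))
```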
![Page 82: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/82.jpg)
Dekker’s Algorithm
Relaxed W->R ordering sequences
WX WY RX RY
WX WY RY RX
WX RX WY RY
WX RX RY WY
WX RY RX WY
WX RY WY RX
WY WX RX RY
WY WX RY RX
WY RX WX RY
WY RY WX RX
WY RX RY WX
WY RY RX WX
RX WX RY WY
RX WX WY RY
RX WY RY WX
RX WY WX RY
RX RY WX WY
RX RY WY WX
RY WX WY RX
RY WX RX WY
RY WY WX RX
RY WY RX WX
RY RX WX WY
RY RX WY WX
P1: Flag1 = 1 (WX)
    If(Flag2 == 0) (RY)
        Critical section
P2: Flag2 = 1 (WY)
    If(Flag1 == 0) (RX)
        Critical section
Start with Sequential Consistency
18 are OK
6 are Wrong
![Page 83: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/83.jpg)
Dekker’s Algorithm
Relaxed W->R ordering sequences
WX WY RX RY
WX WY RY RX
WX RY WY RX
WY WX RX RY
WY WX RY RX
WY RX WX RY
P1: Flag1 = 1 (WX)
    If(Flag2 == 0) (RY)
        Critical section
P2: Flag2 = 1 (WY)
    If(Flag1 == 0) (RX)
        Critical section
Start with Sequential Consistency
6 are OK
0 are Wrong
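Under sequential consistency each processor's write stays ahead of its own read, and that alone rules out every failing sequence. The sketch below filters the 24 orderings by program order and confirms that none of the survivors lets both processors enter; the helper names are assumptions for illustration.

```python
from itertools import permutations

def before(seq, a, b):
    """True when operation a appears earlier than operation b in seq."""
    return seq.index(a) < seq.index(b)

def both_enter(seq):
    """Both processors read 0: RY before WY and RX before WX."""
    return before(seq, "RY", "WY") and before(seq, "RX", "WX")

# Program order: P1 does WX then RY, P2 does WY then RX.
sc = [s for s in permutations(("WX", "RY", "WY", "RX"))
      if before(s, "WX", "RY") and before(s, "WY", "RX")]
failures = [s for s in sc if both_enter(s)]

print(len(sc), "sequentially consistent sequences,", len(failures), "wrong")
```

A failure would need RX < WX < RY < WY < RX, which is a cycle, so no total order can satisfy it.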
![Page 84: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/84.jpg)
Dekker’s Algorithm
Relaxed W->R ordering sequences
WX WY RX RY
WX WY RY RX
WX RY WY RX
WY WX RX RY
WY WX RY RX
WY RX WX RY
P1: Flag1 = 1 (WX)
    If(Flag2 == 0) (RY)
        Critical section
P2: Flag2 = 1 (WY)
    If(Flag1 == 0) (RX)
        Critical section
Add sequences due to relaxed memory model
6 are OK
0 are Wrong
![Page 85: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/85.jpg)
Dekker’s Algorithm
Relaxed W->R ordering sequences
WX WY RX RY
WX WY RY RX
WX RX WY RY
WX RX RY WY
WX RY RX WY
WX RY WY RX
WY WX RX RY
WY WX RY RX
WY RX WX RY
WY RY WX RX
WY RX RY WX
WY RY RX WX
RX WX RY WY
RX WX WY RY
RX WY RY WX
RX WY WX RY
RX RY WX WY
RX RY WY WX
RY WX WY RX
RY WX RX WY
RY WY WX RX
RY WY RX WX
RY RX WX WY
RY RX WY WX
P1: Flag1 = 1 (WX)
    If(Flag2 == 0) (RY)
        Critical section
P2: Flag2 = 1 (WY)
    If(Flag1 == 0) (RX)
        Critical section
Add sequences due to relaxed memory model
18 are OK
6 are Wrong
![Page 86: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/86.jpg)
Safety Nets
• Atomic instruction (RMW)
• Code delineation (serialization instructions)
• Synchronization instructions (SYNC)
• Identify Data and Synch operations (Weak Ordering model, and Release Consistency model)
• Memory Bars (aka “fences”)
![Page 87: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/87.jpg)
Producer Consumer w/Fence
Insert a memory barrier between the instructions we want ordered.
Global variables initially: Data = 0, Head = 0
P1: Data = 2 (WX)
    Head = 1 (WY)
P2: while(Head == 0); (RY)
    ... = Data; (RX)
![Page 88: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/88.jpg)
Producer Consumer w/Fence
Example of a Producer and Consumer with a Memory Barrier applied.
Global variables initially: Data = 0, Head = 0
P1: Data = 2 (WX)
    memory_barrier
    Head = 1 (WY)
P2: while(Head == 0); (RY)
    memory_barrier
    ... = Data; (RX)
All memory operations before the memory barrier must complete before proceeding to memory operations after the memory barrier.
![Page 89: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/89.jpg)
Producer Consumer w/Fence
Relaxed W->R and W->W ordering sequences
P1: Data = 2 (WX)
    Head = 1 (WY)
P2: while(Head == 0); (RY)
    print Data; (RX)
Started with sequential consistency, then added relaxed w->r and w->w orderings
3 are OK
1 is Wrong
WX WY RY RX
WY WX RY RX
WY RY WX RX
WY RY RX WX
![Page 90: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/90.jpg)
Producer Consumer w/Fence
Relaxed W->R and W->W ordering sequences
P1: Data = 2 (WX)
    memory_barrier
    Head = 1 (WY)
P2: while(Head == 0); (RY)
    memory_barrier
    print Data; (RX)
Add memory barriers to force WX < WY and RY < RX
3 are OK
1 is Wrong
WX WY RY RX
WY WX RY RX
WY RY WX RX
WY RY RX WX
![Page 91: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/91.jpg)
Producer Consumer w/Fence
Relaxed W->R and W->W ordering sequences
P1: Data = 2 (WX)
    memory_barrier
    Head = 1 (WY)
P2: while(Head == 0); (RY)
    memory_barrier
    print Data; (RX)
Looks the same.
3 are OK
1 is Wrong
WX WY RY RX
WY WX RY RX
WY RY WX RX
WY RY RX WX
![Page 92: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/92.jpg)
Producer Consumer w/Fence
Relaxed W->R and W->W ordering sequences
P1: Data = 2 (WX)
    memory_barrier
    Head = 1 (WY)
P2: while(Head == 0); (RY)
    memory_barrier
    print Data; (RX)
With WX < WY and RY < RX enforced with memory barriers, RX < WX is not possible.
3 are OK
1 is Wrong
WX WY RY RX
WY WX RY RX
WY RY WX RX
WY RY RX WX
![Page 93: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/93.jpg)
Producer Consumer w/Fence
Relaxed W->R and W->W ordering sequences
P1: Data = 2 (WX)
    memory_barrier
    Head = 1 (WY)
P2: while(Head == 0); (RY)
    memory_barrier
    print Data; (RX)
Due to MB, RY < RX is enforced
3 are OK
1 is Wrong
WX WY RY RX
WY WX RY RX
WY RY WX RX
WY RY RX WX
![Page 94: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/94.jpg)
Producer Consumer w/Fence
Relaxed W->R and W->W ordering sequences
P1: Data = 2 (WX)
    memory_barrier
    Head = 1 (WY)
P2: while(Head == 0); (RY)
    memory_barrier
    print Data; (RX)
WY < RY < MB: the while loop keeps re-reading Head (RY) until it sees WY.
3 are OK
1 is Wrong
WX WY RY RX
WY WX RY RX
WY RY WX RX
WY RY RX WX
![Page 95: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/95.jpg)
Producer Consumer w/Fence
Relaxed W->R and W->W ordering sequences
P1: Data = 2 (WX)
    memory_barrier
    Head = 1 (WY)
P2: while(Head == 0); (RY)
    memory_barrier
    print Data; (RX)
Due to MB, WX < WY is enforced
3 are OK
1 is Wrong
WX WY RY RX
WY WX RY RX
WY RY WX RX
WY RY RX WX
![Page 96: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/96.jpg)
Producer Consumer w/Fence
Relaxed W->R and W->W ordering sequences
P1: Data = 2 (WX)
    memory_barrier
    Head = 1 (WY)
P2: while(Head == 0); (RY)
    memory_barrier
    print Data; (RX)
WX < WY, WY < RY, and RY < RX are enforced; therefore WX < RX is enforced.
3 are OK
1 is Wrong
WX WY RY RX
WY WX RY RX
WY RY WX RX
WY RY RX WX
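The transitivity argument (WX < WY, WY < RY, RY < RX, hence WX < RX) can be checked by brute force. This sketch is a simplification: it treats each memory barrier as a hard ordering constraint on one global sequence, which is an assumption of the model rather than a description of any particular hardware. The constant and function names are illustrative.

```python
from itertools import permutations

# (earlier, later) pairs that every allowed sequence must respect.
CONSTRAINTS = [
    ("WX", "WY"),  # producer's memory barrier
    ("WY", "RY"),  # the loop exits only after it sees Head == 1
    ("RY", "RX"),  # consumer's memory barrier
]

def allowed(seq):
    """True when seq respects every ordering constraint."""
    return all(seq.index(a) < seq.index(b) for a, b in CONSTRAINTS)

survivors = [s for s in permutations(("WX", "WY", "RY", "RX")) if allowed(s)]
stale_reads = [s for s in survivors if s.index("RX") < s.index("WX")]

print("sequences with RX before WX:", len(stale_reads))
```

No surviving sequence reads Data before it is written, which is the conclusion the slide draws from the chain of inequalities.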
![Page 97: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/97.jpg)
Producer Consumer w/Fence
Relaxed W->R and W->W ordering sequences
P1: Data = 2 (WX)
    memory_barrier
    Head = 1 (WY)
P2: while(Head == 0); (RY)
    memory_barrier
    print Data; (RX)
With WX < WY and RY < RX enforced with memory barriers, RX < WX is not possible.
3 are OK
1 is Wrong
WX WY RY RX
WY WX RY RX
WY RY WX RX
WY RY RX WX
![Page 98: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/98.jpg)
Producer Consumer w/Fence
Relaxed W->R and W->W ordering sequences
P1: Data = 2 (WX)
    memory_barrier
    Head = 1 (WY)
P2: while(Head == 0); (RY)
    memory_barrier
    print Data; (RX)
With WX < WY and RY < RX enforced with memory barriers, RX < WX is not possible.
WX WY RY RX
WY WX RY RX
WY RY WX RX
3 are OK
0 are Wrong
![Page 99: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/99.jpg)
Dekker’s Algorithm w/Fence
Example of a mutual exclusion (“Dekker’s Algorithm”)
Global variables initially: Flag1 = 0, Flag2 = 0
P1: Flag1 = 1 (WX)
    memory_barrier
    If(Flag2 == 0) (RY)
        Critical section
P2: Flag2 = 1 (WY)
    memory_barrier
    If(Flag1 == 0) (RX)
        Critical section
All memory operations before the memory barrier must complete before proceeding to memory operations after the memory barrier.
![Page 100: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/100.jpg)
Dekker’s Algorithm w/Fence
Relaxed W->R ordering sequences
WX WY RX RY
WX WY RY RX
WX RX WY RY
WX RX RY WY
WX RY RX WY
WX RY WY RX
WY WX RX RY
WY WX RY RX
WY RX WX RY
WY RY WX RX
WY RX RY WX
WY RY RX WX
RX WX RY WY
RX WX WY RY
RX WY RY WX
RX WY WX RY
RX RY WX WY
RX RY WY WX
RY WX WY RX
RY WX RX WY
RY WY WX RX
RY WY RX WX
RY RX WX WY
RY RX WY WX
P1: Flag1 = 1 (WX)
    If(Flag2 == 0) (RY)
        Critical section
P2: Flag2 = 1 (WY)
    If(Flag1 == 0) (RX)
        Critical section
Started with sequential consistency, then added relaxed w->r orderings
18 are OK
6 are Wrong
![Page 101: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/101.jpg)
Dekker’s Algorithm w/Fence
Relaxed W->R ordering sequences
WX WY RX RY
WX WY RY RX
WX RX WY RY
WX RX RY WY
WX RY RX WY
WX RY WY RX
WY WX RX RY
WY WX RY RX
WY RX WX RY
WY RY WX RX
WY RX RY WX
WY RY RX WX
RX WX RY WY
RX WX WY RY
RX WY RY WX
RX WY WX RY
RX RY WX WY
RX RY WY WX
RY WX WY RX
RY WX RX WY
RY WY WX RX
RY WY RX WX
RY RX WX WY
RY RX WY WX
Add memory barriers to force WX < RY and WY < RX
18 are OK
6 are Wrong
P1: Flag1 = 1 (WX)
    memory_barrier
    If(Flag2 == 0) (RY)
        Critical section
P2: Flag2 = 1 (WY)
    memory_barrier
    If(Flag1 == 0) (RX)
        Critical section
![Page 102: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/102.jpg)
Dekker’s Algorithm w/Fence
Relaxed W->R ordering sequences
Add memory barriers to force WX < RY and WY < RX
6 are OK
0 are Wrong
P1: Flag1 = 1 (WX)
    memory_barrier
    If(Flag2 == 0) (RY)
        Critical section
P2: Flag2 = 1 (WY)
    memory_barrier
    If(Flag1 == 0) (RX)
        Critical section
WX WY RX RY
WX WY RY RX
WX RY WY RX
WY WX RX RY
WY WX RY RX
WY RX WX RY
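To see what the fences buy, compare the failing sequences with and without them. This is a hedged sketch that models each memory barrier as the ordering constraint it is meant to enforce (WX < RY on P1, WY < RX on P2); real barrier instructions differ by architecture, and the names below are made up for the example.

```python
from itertools import permutations

def before(seq, a, b):
    """True when operation a appears earlier than operation b in seq."""
    return seq.index(a) < seq.index(b)

def both_enter(seq):
    """Both processors read 0 and enter the critical section together."""
    return before(seq, "RY", "WY") and before(seq, "RX", "WX")

# Relaxed W->R without fences: any ordering of the four operations.
unfenced = list(permutations(("WX", "RY", "WY", "RX")))
# With a barrier between each write and the following read.
fenced = [s for s in unfenced
          if before(s, "WX", "RY") and before(s, "WY", "RX")]

failures_without = [s for s in unfenced if both_enter(s)]
failures_with = [s for s in fenced if both_enter(s)]

print("without fences:", len(failures_without), "of", len(unfenced), "fail")
print("with fences:   ", len(failures_with), "of", len(fenced), "fail")
```

The fenced set is the same six sequences that sequential consistency allows, so the barriers restore exactly the ordering the algorithm relies on.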
![Page 103: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/103.jpg)
Serialization of Writes (Fig 6) w/Fence
Insert a memory barrier between the instructions we want ordered.
Global variables initially: A = 0, B = 0, C = 0
P1: A = 1 (WX)
    B = 1 (WY)
P2: A = 2 (WX)
    C = 1 (WZ)
P3: while(B != 1); (RY)
    while(C != 1); (RZ)
    Register1 = A
P4: while(B != 1); (RY)
    while(C != 1); (RZ)
    Register2 = A
W1 W2
![Page 104: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/104.jpg)
Higher Level Abstractions
• Lower level of complexity
• Explicit Parallel Constructs
  – Fortran 90
  – MPI
![Page 105: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/105.jpg)
Conclusion
• The Uniprocessor programming model is simple, but does not work on Multiprocessors
• Hardware and compilers make many optimizations that reorder loads and stores
• Memory models exist on the hardware and need to be considered for program correctness
• The Sequential Consistency model was considered for concurrent programs on the Uniprocessor
• Relaxed Memory Consistency models are considered on the Multiprocessor because SC is too restrictive for hardware performance.
• Use memory barriers (fences) to override the relaxed memory model wherever ordering between memory operations must be maintained.
![Page 106: Shared Memory Consistency Models: A Tutorial Sarita V. Adve Kouroush Ghrachorloo Western Research Laboratory September 1995](https://reader035.vdocuments.site/reader035/viewer/2022062520/56649ea25503460f94ba5749/html5/thumbnails/106.jpg)
Other Processors
R to R
W to R
W to W
R to W