Memory Hierarchy Part 2
Faculty of Computer Science
CMPUT 229 © 2006
Memory Hierarchy Part 2
Refreshing Memory
© 2006
Department of Computing Science
CMPUT 229
Writing Cache-Conscious Programs
Problem: Write C code for a function that computes the sum of the elements of a two dimensional array, a[M][N], of integers.
int SumArray(int M, int N, int a[M][N])

    /* Row by row. C needs the column count in the type of an array
       parameter, so M and N are passed before a (C99 variable-length
       array parameter). */
    int SumArrayRows(int M, int N, int a[M][N])
    {
        int i, j;
        int sum = 0;

        for (i = 0; i < M; i++)
            for (j = 0; j < N; j++)
                sum += a[i][j];
        return sum;
    }

    /* Column by column: same result, different access order. */
    int SumArrayCols(int M, int N, int a[M][N])
    {
        int i, j;
        int sum = 0;

        for (j = 0; j < N; j++)
            for (i = 0; i < M; i++)
                sum += a[i][j];
        return sum;
    }

Bryant/O’Hallaron, p. 508
SumArrayRows Data Access Order
[Figure, built up over eight slides. Memory: the elements of a, stored row by row with six elements per row, at consecutive 4-byte addresses from 0x8000 4000 (a[0][0]) through 0x8000 4058 (a[3][4]). Cache: lines that hold four consecutive elements. The reference to a[0][0] brings a[0][0] a[0][1] a[0][2] a[0][3] into the cache, so the next three references find their data there; the reference to a[0][4] then brings in a[0][4] a[0][5] a[1][0] a[1][1].]
Bryant/O’Hallaron, p. 508
SumArrayCols Data Access Order
[Figure, built up over three slides. Memory: a stored row by row, six 4-byte elements per row, starting at 0x8000 4000. Cache: lines that hold four consecutive elements. The reference to a[0][0] brings in a[0][0] a[0][1] a[0][2] a[0][3]; the next reference, a[1][0], brings in a[0][4] a[0][5] a[1][0] a[1][1]; the one after that, a[2][0], brings in a[2][0] a[2][1] a[2][2] a[2][3]. Each of these references touches a different cache line.]
Bryant/O’Hallaron, p. 508
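The difference between the two access orders can be counted with a small cache model. The sketch below is only an illustration: the 16-byte line matches the four-element lines drawn on the slides, but the two-line direct-mapped cache, the name count_misses, and its parameters are assumptions made here, not something the slides specify.

```c
#include <assert.h>

#define LINE_BYTES 16u  /* four 4-byte ints per line, as in the figure */
#define NUM_LINES   2u  /* hypothetical: a tiny direct-mapped cache */

/* Count cache misses for one pass over an M x N array of 4-byte ints.
   row_major != 0 walks a[i][j] with j in the inner loop (SumArrayRows);
   row_major == 0 walks it with i in the inner loop (SumArrayCols). */
unsigned count_misses(unsigned M, unsigned N, int row_major)
{
    unsigned long tag[NUM_LINES] = {0};
    int valid[NUM_LINES] = {0};
    unsigned misses = 0;
    unsigned outer = row_major ? M : N;
    unsigned inner = row_major ? N : M;

    assert(M > 0 && N > 0);
    for (unsigned o = 0; o < outer; o++) {
        for (unsigned k = 0; k < inner; k++) {
            unsigned i = row_major ? o : k;
            unsigned j = row_major ? k : o;
            /* Byte offset of a[i][j] from the start of the array. */
            unsigned long offset = 4ul * ((unsigned long)i * N + j);
            unsigned long block = offset / LINE_BYTES;
            unsigned long slot = block % NUM_LINES;
            if (!valid[slot] || tag[slot] != block) {
                misses++;            /* line not present: load it */
                valid[slot] = 1;
                tag[slot] = block;
            }
        }
    }
    return misses;
}
```

For M = 4 and N = 6 the row-order walk misses once per cache line, that is 6 times for 24 elements, while the column-order walk in this model misses far more often because consecutive references land on different lines.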
The Cost of Programming Productivity
Easy-to-read and easy-to-maintain code often results in lower runtime performance.
[Figure labels: Student, Class, University]
The Cost of Programming Productivity
Abstraction
Inheritance
[Figure: Person, with Student, Professor, and Support Staff beneath it.]
The Cost of Programming Productivity
Data Encapsulation
Person: Name, Date of Birth, Gender, Address, Citizenship, Driver Lic.
Student: Univ. ID, Date of Adm, Faculty, Department, Program, Classes Enr., Grades
Data Locality Primer
AMD Athlon 64 X2
Data Locality Primer: Cache Organization
POWER5 Cache Organization
– L1 Data Cache: 32 Kbytes, 128-byte cache lines
– L2 Cache: 1.44 Mbytes, 128-byte cache lines
– L3 Cache: 32 Mbytes, 512-byte cache lines
Data Locality Primer: Cache Organization
Student:
– Univ. ID: 4 bytes
– Date of Adm: 4 bytes
– Faculty: 1 byte
– Department: 1 byte
– Program: 2 bytes
– Classes Enr., Grades: 4 bytes, 4 bytes, 4 bytes
Person:
– Name: 32 bytes
– Date of Birth: 4 bytes
– Gender: 1 byte
– Citizenship: 16 bytes
– Address: 32 bytes
– Driver Lic.: 4 bytes
[Figure: the cache drawn as a grid of cache lines 0 through 255, each holding bytes 0 through 127.]
Data Locality Primer: Data in Memory
[Figure, built up over three slides: memory drawn as rows of 128 bytes (byte columns 0 through 127). Student records are stored back to back in the rows at addresses 0, 128, 256, and 384, each record laid out as Univ. ID, Date of Adm., Fa., De, Progr., •••, Classes Enr., Grades; a record may continue from the end of one row into the start of the next. Person records are stored the same way in the rows labeled 768, 1024, 1152, and 1280, each laid out as Name, DofB, Ge, Citizens., Address, Dr. Lic., with byte ticks running from 0 to 89.]
Example: A search through the data structures
How many Computing Science students are younger than 23 years old?
[Figure, built up over four slides: the cache drawn as lines 0 through 255 of 128 bytes each. A Student record (Univ. ID, Date of Adm., Fa., De, Progr., •••, Classes Enr., Grades) is loaded into one cache line, then a Person record (Name, DofB, Ge, Citizens., Address, Dr. Lic.) into another.]
Loads 128 bytes and uses 5 bytes!
Loads 128 bytes and uses 5.3 bytes!
Loads 128 bytes and uses 5.8 bytes!
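The query above can be sketched in C to show how little of each record it reads. Everything in this sketch beyond the field names and widths is an assumption: the C types, the field order, the yyyymmdd date encoding, pairing students[i] with persons[i], and the names count_young and BYTES_USED. One plausible reading of the 5-byte figure is a 1-byte department code plus a 4-byte date of birth; the slides do not say so explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical rendering of the Person record (widths from the slides). */
struct person {
    char     name[32];
    uint32_t date_of_birth;   /* assumed encoding: yyyymmdd */
    uint8_t  gender;
    char     citizenship[16];
    char     address[32];
    uint32_t driver_lic;
};

/* Hypothetical rendering of the Student record (widths from the slides). */
struct student {
    uint32_t univ_id;
    uint32_t date_of_adm;
    uint8_t  faculty;
    uint8_t  department;
    uint16_t program;
    uint32_t classes_enr;     /* assumed to be a 4-byte handle */
    uint32_t grades;          /* assumed to be a 4-byte handle */
};

/* Bytes the query reads per individual: department + date of birth. */
enum { BYTES_USED = sizeof(uint8_t) + sizeof(uint32_t) };

/* Count students in department dept who were born after cutoff.
   s[i] and p[i] are assumed to describe the same individual. */
size_t count_young(const struct student *s, const struct person *p,
                   size_t n, uint8_t dept, uint32_t cutoff)
{
    size_t count = 0;

    assert(s != NULL && p != NULL);
    for (size_t i = 0; i < n; i++)
        if (s[i].department == dept && p[i].date_of_birth > cutoff)
            count++;
    return count;
}
```

Each iteration touches one byte of a Student record and four bytes of a Person record, yet the hardware still brings in a full cache line for each of them.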
Data Reshaping for Arrays of Structures
Student *ListOfStudents;
….
ListOfStudents = (Student*)malloc(….);
[Figure: before reshaping, memory holds whole records one after another, each laid out as Univ. ID, Date of Adm., Fa., De, Progr., •••, Classes Enr., Grades. After reshaping, the values of each field are stored together: the Univ. ID values of all records, then the Date of Adm. values, then the Fa. values, and so on.]
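A reshaped array of structures can be written by hand as one array per field. The following is a minimal sketch under stated assumptions: the type student_arrays, its functions, and the integer types chosen for each field are hypothetical, and only the first five Student fields are shown.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Reshaped layout: one array per field instead of one array of records. */
struct student_arrays {
    size_t    count;
    uint32_t *univ_id;
    uint32_t *date_of_adm;
    uint8_t  *faculty;
    uint8_t  *department;
    uint16_t *program;
};

void student_arrays_free(struct student_arrays *sa)
{
    free(sa->univ_id);
    free(sa->date_of_adm);
    free(sa->faculty);
    free(sa->department);
    free(sa->program);
}

/* Allocate zeroed field arrays for n students.
   Returns 0 on success, -1 if any allocation fails. */
int student_arrays_init(struct student_arrays *sa, size_t n)
{
    assert(sa != NULL && n > 0);
    sa->count       = n;
    sa->univ_id     = calloc(n, sizeof *sa->univ_id);
    sa->date_of_adm = calloc(n, sizeof *sa->date_of_adm);
    sa->faculty     = calloc(n, sizeof *sa->faculty);
    sa->department  = calloc(n, sizeof *sa->department);
    sa->program     = calloc(n, sizeof *sa->program);
    if (!sa->univ_id || !sa->date_of_adm || !sa->faculty ||
        !sa->department || !sa->program) {
        student_arrays_free(sa);
        return -1;
    }
    return 0;
}

/* Scan only the department array: every byte loaded is a byte used. */
size_t count_in_department(const struct student_arrays *sa, uint8_t dept)
{
    size_t hits = 0;

    for (size_t i = 0; i < sa->count; i++)
        hits += (sa->department[i] == dept);
    return hits;
}
```

With this layout a 128-byte cache line of the department array holds the department codes of 128 students, instead of a handful of whole records.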
Reshaping Linked Data Structures
E.g. A linked list of students
    struct student {
        int age;
        int studentNumber;
        int studentProgram;
        float averageGrade;
        struct student *next;
    };
[Figure: list nodes in memory, each holding age, num, prog, gpa, followed by the next node.]
Maximal Structure Splitting
Before:
age1 num1 prog1 gpa1
age2 num2 prog2 gpa2
age3 num3 prog3 gpa3
…
After:
age1 age2 age3
num1 num2 num3
prog1 prog2 prog3
gpa1 gpa2 gpa3
next1 next2 next3
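A hand-written version of the split layout might look like the sketch below. It is an illustration only: the fixed capacity MAX_STUDENTS, the use of array indices in place of next pointers, and the function count_younger_than are assumptions introduced here.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_STUDENTS 1024   /* assumption: fixed capacity keeps the sketch short */
#define NIL (-1)            /* end-of-list marker */

/* One array per field of struct student (maximal splitting). */
struct split_students {
    int   age[MAX_STUDENTS];
    int   studentNumber[MAX_STUDENTS];
    int   studentProgram[MAX_STUDENTS];
    float averageGrade[MAX_STUDENTS];
    int   next[MAX_STUDENTS];   /* index of the next node, or NIL */
};

/* Walk the list from head and count the students younger than limit.
   Only the age and next arrays are touched during the walk. */
int count_younger_than(const struct split_students *s, int head, int limit)
{
    int count = 0;

    assert(s != NULL);
    for (int i = head; i != NIL; i = s->next[i]) {
        assert(i >= 0 && i < MAX_STUDENTS);
        if (s->age[i] < limit)
            count++;
    }
    return count;
}
```

Because the ages sit next to each other, one cache line of the age array serves many list nodes, whereas the unsplit list loads a whole node to read a single age.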
Is it safe to transform a given data structure?
Build alias set
– If a pointer P points to the structure
• Then all the objects in the points-to set of P must have the same layout.
• The layout of two structures is the same if each field has the same offset and the same length.
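The layout test stated above (same offset and same length for every field) can be sketched as a comparison of two field tables. The field_layout type and same_layout function are hypothetical names for illustration; a compiler would apply the test to its own internal type descriptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One field of a structure: where it starts and how long it is. */
struct field_layout {
    size_t offset;
    size_t length;
};

/* Two layouts match when they have the same number of fields and each
   pair of corresponding fields has the same offset and the same length. */
bool same_layout(const struct field_layout *a, size_t na,
                 const struct field_layout *b, size_t nb)
{
    assert(a != NULL && b != NULL);
    if (na != nb)
        return false;
    for (size_t i = 0; i < na; i++)
        if (a[i].offset != b[i].offset || a[i].length != b[i].length)
            return false;
    return true;
}
```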
Pool Allocation
Intercept mallocs and replace by pool allocation: each structure layout gets its own pool.
If a pool is full, another pool can be allocated.
[Figure, built up over eight slides: the first pool fills one object at a time until it holds
age1 age2 age3
num1 num2 num3
prog1 prog2 prog3
gpa1 gpa2 gpa3
next1 next2 next3
Objects allocated later (age4 num4 prog4 gpa4 next4, and so on) are drawn outside the first pool, with the same field-by-field layout.]
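The allocation policy described here can be sketched as follows. This is a simplified illustration, not the compiler transformation itself: POOL_SLOTS, the slot handle, and the functions pool_alloc and pool_free_all are hypothetical, and the next field is left out to keep the example short.

```c
#include <assert.h>
#include <stdlib.h>

#define POOL_SLOTS 3   /* assumption: three objects per pool */

/* A pool for one structure layout, stored field by field. */
struct student_pool {
    int   age[POOL_SLOTS];
    int   num[POOL_SLOTS];
    int   prog[POOL_SLOTS];
    float gpa[POOL_SLOTS];
    int   used;                        /* slots handed out so far */
    struct student_pool *older_pool;   /* previously filled pool, if any */
};

/* An object is identified by its pool and its slot index. */
struct slot {
    struct student_pool *pool;
    int index;
};

/* Stand-in for an intercepted malloc(sizeof(struct student)).
   Returns a slot with pool == NULL if memory runs out. */
struct slot pool_alloc(struct student_pool **newest)
{
    struct slot s = { NULL, -1 };

    assert(newest != NULL);
    if (*newest == NULL || (*newest)->used == POOL_SLOTS) {
        /* The current pool is full (or absent): allocate another one. */
        struct student_pool *p = calloc(1, sizeof *p);
        if (p == NULL)
            return s;
        p->older_pool = *newest;
        *newest = p;
    }
    s.pool = *newest;
    s.index = (*newest)->used++;
    return s;
}

/* Release every pool in the chain. */
void pool_free_all(struct student_pool *newest)
{
    while (newest != NULL) {
        struct student_pool *older = newest->older_pool;
        free(newest);
        newest = older;
    }
}
```

A field is then reached as s.pool->age[s.index], so objects allocated one after another end up with neighbouring ages, numbers, and so on.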
Pointer Dereferencing - Before
    struct student {
        int age;
        int studentNumber;
        int studentProgram;
        float averageGrade;
        struct student *next;
    };
    struct student *s = malloc(sizeof(struct student));
    s->age = 21;
    s->averageGrade = 3.8;
Here s + k denotes the address k bytes past s:
s->age == *(s + 0)
s->averageGrade == *(s + 12)
[Figure: s points to a node whose fields age, num, prog, gpa start at byte offsets 0, 4, 8, 12.]
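The byte offsets above can be checked in C with offsetof. This is a small sketch; the helper grade_via_offset is a hypothetical name, and the offset of 12 holds only where int and float are 4 bytes wide.

```c
#include <assert.h>
#include <stddef.h>

struct student {
    int age;
    int studentNumber;
    int studentProgram;
    float averageGrade;
    struct student *next;
};

/* Read averageGrade the way compiled code does: start from the base
   address of the object and add a constant byte offset. */
float grade_via_offset(const struct student *s)
{
    assert(s != NULL);
    const char *base = (const char *)s;
    return *(const float *)(base + offsetof(struct student, averageGrade));
}
```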
Uniform Structure Splitting
Requires that all fields in the structure have the same number of bytes
– Advantage
• Simpler address computation
– Disadvantage
• Either restricts the application of the technique
• Or wastes memory with padding to create same-length fields
Uniform Splitting Pointer Transformation
age1 age2 age3
num1 num2 num3
prog1 prog2 prog3
gpa1 gpa2 gpa3
next1 next2 next3
s1 points to age1.
s1->age == *(s1 + 0)
s1->gpa == *(s1 + (3 * pool_field_len))
pool_field_len is the same for each field.
[Figure: pool_field_len is marked as the distance from age1 to num1, and 3 * pool_field_len as the distance from age1 to gpa1.]
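The address computation for uniform splitting can be sketched with byte arithmetic. The names SLOTS, FIELD_BYTES, pool_t, object_ref, field_addr, store_float, and load_float are assumptions for this illustration, and the next field is omitted so that every field fits the assumed 4-byte width.

```c
#include <assert.h>
#include <string.h>

#define SLOTS 3                                /* objects per pool (assumption) */
#define FIELD_BYTES 4                          /* every field is 4 bytes wide */
#define POOL_FIELD_LEN (SLOTS * FIELD_BYTES)   /* bytes in one field array */

enum field { AGE, NUM, PROG, GPA, NFIELDS };

/* A pool is NFIELDS field arrays laid end to end. */
typedef struct {
    unsigned char bytes[NFIELDS * POOL_FIELD_LEN];
} pool_t;

/* An object is named by the address of its first field (its age slot). */
unsigned char *object_ref(pool_t *p, int index)
{
    assert(p != NULL && index >= 0 && index < SLOTS);
    return p->bytes + index * FIELD_BYTES;
}

/* Field f of object s lives f * POOL_FIELD_LEN bytes past s. */
unsigned char *field_addr(unsigned char *s, enum field f)
{
    return s + f * POOL_FIELD_LEN;
}

void store_float(unsigned char *s, enum field f, float v)
{
    assert(sizeof v == FIELD_BYTES);
    memcpy(field_addr(s, f), &v, sizeof v);
}

float load_float(unsigned char *s, enum field f)
{
    float v;
    memcpy(&v, field_addr(s, f), sizeof v);
    return v;
}
```

Reading the gpa of object s1 is then load_float(s1, GPA), which adds 3 * POOL_FIELD_LEN to s1 exactly as in the expression above.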
Non-Uniform Structure Splitting
Requires pools to be aligned by the size of the pool. E.g. If the pools are 4k then they must be aligned on 4k boundaries.
More general
Address calculation is more involved
Non-Uniform Example
    struct example {
        type_2 a; /* 2 bytes */
        type_8 b; /* 8 bytes */
        type_4 c; /* 4 bytes */
    };
How can the compiler find the address to access s->c?
pool_base = s & ~0xFFF (4k pools: clear the low 12 bits of s)
index = (s - pool_base) / 2
field_base = (2+8)*num_structs_per_pool
s->c = *(s + field_base + 4*index - index*2)
s->c = *(s + field_base + 4*index - s + pool_base)
s->c = *(field_base + 4*index + pool_base)
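The calculation can be written out with integer address arithmetic. This sketch assumes 4k pools aligned on 4k boundaries, a pool that holds as many whole structures as fit in 4096 bytes, and an object reference s that is the address of its a slot; the names addr_of_c, POOL_SIZE, and STRUCTS_PER_POOL are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define POOL_SIZE 4096u                    /* 4k pools on 4k boundaries */
#define STRUCT_BYTES (2u + 8u + 4u)        /* sizes of a, b, and c */
#define STRUCTS_PER_POOL (POOL_SIZE / STRUCT_BYTES)

/* s is the address of an object's a slot; returns the address of its c slot. */
uintptr_t addr_of_c(uintptr_t s)
{
    /* The pool is aligned to its size, so masking off the low bits of s
       yields the address where the pool starts. */
    uintptr_t pool_base = s & ~(uintptr_t)(POOL_SIZE - 1);
    /* The a slots are 2 bytes apart, so the distance from the pool base
       gives the object's index. */
    uintptr_t index = (s - pool_base) / 2;
    /* The c array starts after the whole a array and the whole b array. */
    uintptr_t field_base = (2u + 8u) * STRUCTS_PER_POOL;

    assert(index < STRUCTS_PER_POOL);
    return pool_base + field_base + 4 * index;
}
```

With these sizes a pool holds 292 structures, so the c array starts 2920 bytes into the pool and the object with index 5 finds its c field at pool_base + 2920 + 20.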
Experiments - Micro Benchmarks
[Figures, one slide each: a pair of charts, Power 4 and Power 5, for each of Speedup, Instruction Count, CPI, DTLB Misses, L1D Misses, L2 Misses, and L3 Misses.]
Experiments
Evaluated SPEC 2000, Olden and LLU
Many opportunities in SPEC missed
– Pointer analysis didn’t have enough precision to identify opportunities in the SPEC 2000 benchmarks
– Could only identify small opportunities
– No impact on performance
Experiments - Olden & LLU (Speedup)
[Figure: speedup charts for Power 4 and Power 5 over the benchmarks bh, em3d, health, power, tsp, and llu.]