![Page 1: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/1.jpg)
Machine Programming – Branching
CENG331: Introduction to Computer Systems
6th Lecture
Instructor: Erol Sahin
Acknowledgement: Most of the slides are adapted from the ones prepared by R.E. Bryant, D.R. O’Hallaron of Carnegie-Mellon Univ.
![Page 2: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/2.jpg)
Conditional Branch Example

    int absdiff(int x, int y)
    {
        int result;
        if (x > y) {
            result = x-y;
        } else {
            result = y-x;
        }
        return result;
    }

    absdiff:
        pushl %ebp              # Setup
        movl  %esp, %ebp        # Setup
        movl  8(%ebp), %edx     # Body1
        movl  12(%ebp), %eax    # Body1
        cmpl  %eax, %edx        # Body1
        jle   .L7               # Body1
        subl  %eax, %edx        # Body1
        movl  %edx, %eax        # Body1
    .L8:
        leave                   # Finish
        ret                     # Finish
    .L7:
        subl  %edx, %eax        # Body2
        jmp   .L8               # Body2
![Page 3: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/3.jpg)
Conditional Branch Example (Cont.)

    int goto_ad(int x, int y)
    {
        int result;
        if (x <= y) goto Else;
        result = x-y;
    Exit:
        return result;
    Else:
        result = y-x;
        goto Exit;
    }

C allows “goto” as a means of transferring control
- Closer to machine-level programming style
- Generally considered bad coding style

    absdiff:
        pushl %ebp
        movl  %esp, %ebp
        movl  8(%ebp), %edx
        movl  12(%ebp), %eax
        cmpl  %eax, %edx
        jle   .L7
        subl  %eax, %edx
        movl  %edx, %eax
    .L8:
        leave
        ret
    .L7:
        subl  %edx, %eax
        jmp   .L8
![Page 8: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/8.jpg)
General Conditional Expression Translation

C Code

    val = Test ? Then-Expr : Else-Expr;

Example:

    val = x>y ? x-y : y-x;

Goto Version

        nt = !Test;
        if (nt) goto Else;
        val = Then-Expr;
    Done:
        . . .
    Else:
        val = Else-Expr;
        goto Done;

- Test is an expression returning an integer: = 0 is interpreted as false, ≠ 0 is interpreted as true
- Create separate code regions for the then and else expressions
- Execute the appropriate one
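The translation template above can be tried out directly in C. The sketch below applies it to the slide's example `val = x>y ? x-y : y-x`. The function names `absdiff_cond` and `absdiff_goto` are made up for this illustration and do not appear in the slides.

```c
/* Conditional-expression form, as written in the slide's example. */
int absdiff_cond(int x, int y)
{
    return x > y ? x - y : y - x;
}

/* The same computation written with the goto template:
 * negate the test, branch to the else region, rejoin at Done. */
int absdiff_goto(int x, int y)
{
    int val;
    int nt = !(x > y);      /* nt = !Test */
    if (nt) goto Else;
    val = x - y;            /* Then-Expr */
Done:
    return val;
Else:
    val = y - x;            /* Else-Expr */
    goto Done;
}
```

Both functions should agree for any inputs where the subtraction does not overflow.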
![Page 10: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/10.jpg)
Conditionals: x86-64

    int absdiff(int x, int y)
    {
        int result;
        if (x > y) {
            result = x-y;
        } else {
            result = y-x;
        }
        return result;
    }

    absdiff:                    # x in %edi, y in %esi
        movl   %edi, %eax       # eax = x
        movl   %esi, %edx       # edx = y
        subl   %esi, %eax       # eax = x-y
        subl   %edi, %edx       # edx = y-x
        cmpl   %esi, %edi       # x:y
        cmovle %edx, %eax       # eax = edx if <=
        ret

Conditional move instruction
- cmovC src, dest
- Move value from src to dest if condition C holds
- More efficient than conditional branching (simple control flow)
- But overhead: both branches are evaluated
![Page 11: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/11.jpg)
General Form with Conditional Move

C Code

    val = Test ? Then-Expr : Else-Expr;

Conditional Move Version

    val1 = Then-Expr;
    val2 = Else-Expr;
    val1 = val2 if !Test;

- Both values get computed
- Overwrite the then-value with the else-value if the condition doesn't hold
- Don't use when:
  - The then or else expression has side effects
  - The then or else expression is too expensive
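A small C sketch can make the side-effect restriction concrete. It mimics the conditional-move form by computing both values and then selecting one. The counter `calls` and the helper `bump` are hypothetical names introduced only to make evaluation observable. Whether a real compiler emits `cmov` for any of this code depends on the compiler and its options.

```c
/* Hypothetical instrumentation: counts how many operand
 * expressions were actually evaluated. */
static int calls;

static int bump(int v)
{
    calls++;        /* the side effect */
    return v;
}

/* Branching form: only the selected expression is evaluated. */
int select_branch(int test, int a, int b)
{
    return test ? bump(a) : bump(b);
}

/* Conditional-move form: both expressions are evaluated first,
 * then the then-value is overwritten if the test fails. */
int select_both(int test, int a, int b)
{
    int val1 = bump(a);         /* Then-Expr */
    int val2 = bump(b);         /* Else-Expr */
    if (!test) val1 = val2;     /* stands in for the conditional move */
    return val1;
}
```

Both functions return the same value, but `select_both` runs the side effect twice. That is why a compiler cannot use this form when either expression has side effects.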
![Page 12: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/12.jpg)
“Do-While” Loop Example

C Code

    int fact_do(int x)
    {
        int result = 1;
        do {
            result *= x;
            x = x-1;
        } while (x > 1);
        return result;
    }

Goto Version

    int fact_goto(int x)
    {
        int result = 1;
    loop:
        result *= x;
        x = x-1;
        if (x > 1) goto loop;
        return result;
    }

- Use backward branch to continue looping
- Only take branch when “while” condition holds
![Page 13: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/13.jpg)
“Do-While” Loop Compilation

Goto Version

    int fact_goto(int x)
    {
        int result = 1;
    loop:
        result *= x;
        x = x-1;
        if (x > 1) goto loop;
        return result;
    }

Registers:

    %edx    x
    %eax    result

Assembly

    fact_goto:
        pushl %ebp            # Setup
        movl  %esp,%ebp       # Setup
        movl  $1,%eax         # eax = 1
        movl  8(%ebp),%edx    # edx = x
    .L11:
        imull %edx,%eax       # result *= x
        decl  %edx            # x--
        cmpl  $1,%edx         # Compare x : 1
        jg    .L11            # if > goto loop
        movl  %ebp,%esp       # Finish
        popl  %ebp            # Finish
        ret                   # Finish
![Page 15: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/15.jpg)
General “Do-While” Translation

C Code

    do
        Body
    while (Test);

Goto Version

    loop:
        Body
        if (Test) goto loop

Body:

    {
        Statement1;
        Statement2;
        …
        Statementn;
    }

Test returns an integer: = 0 is interpreted as false, ≠ 0 is interpreted as true
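As a quick check of the template, the factorial functions from the earlier do-while slide can be compiled side by side. This is a minimal sketch; both functions are taken from the slides, with only the layout changed.

```c
/* do-while form */
int fact_do(int x)
{
    int result = 1;
    do {
        result *= x;
        x = x - 1;
    } while (x > 1);
    return result;
}

/* goto form: Body, then a backward branch taken while Test holds */
int fact_goto(int x)
{
    int result = 1;
loop:
    result *= x;            /* Body */
    x = x - 1;              /* Body */
    if (x > 1) goto loop;   /* Test */
    return result;
}
```

Because the body always runs once before the test, both versions return 0 for `x = 0`.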
![Page 16: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/16.jpg)
“While” Loop Example

C Code

    int fact_while(int x)
    {
        int result = 1;
        while (x > 1) {
            result *= x;
            x = x-1;
        }
        return result;
    }

Goto Version #1

    int fact_while_goto(int x)
    {
        int result = 1;
    loop:
        if (!(x > 1)) goto done;
        result *= x;
        x = x-1;
        goto loop;
    done:
        return result;
    }

- Is this code equivalent to the do-while version?
- Must jump out of loop if test fails
![Page 17: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/17.jpg)
Alternative “While” Loop Translation

C Code

    int fact_while(int x)
    {
        int result = 1;
        while (x > 1) {
            result *= x;
            x = x-1;
        }
        return result;
    }

Goto Version #2

    int fact_while_goto2(int x)
    {
        int result = 1;
        if (!(x > 1)) goto done;
    loop:
        result *= x;
        x = x-1;
        if (x > 1) goto loop;
    done:
        return result;
    }

- Historically used by GCC
- Uses same inner loop as do-while version
- Guards loop entry with extra test
![Page 18: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/18.jpg)
General “While” Translation

While Version

    while (Test)
        Body

Do-While Version

        if (!Test) goto done;
        do
            Body
        while (Test);
    done:

Goto Version

        if (!Test) goto done;
    loop:
        Body
        if (Test) goto loop;
    done:
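The guard test is what separates a while loop from a bare do-while. The sketch below puts the slide's `fact_while` next to a guarded do-while and an unguarded do-while. The name `fact_while_guarded` is made up for this illustration.

```c
/* Plain while loop: the body may run zero times. */
int fact_while(int x)
{
    int result = 1;
    while (x > 1) {
        result *= x;
        x = x - 1;
    }
    return result;
}

/* Guarded do-while: the extra entry test restores while semantics. */
int fact_while_guarded(int x)
{
    int result = 1;
    if (!(x > 1)) goto done;
    do {
        result *= x;
        x = x - 1;
    } while (x > 1);
done:
    return result;
}

/* Unguarded do-while: the body always runs at least once. */
int fact_do(int x)
{
    int result = 1;
    do {
        result *= x;
        x = x - 1;
    } while (x > 1);
    return result;
}
```

For `x = 0` the two while-style versions return 1, while `fact_do` returns 0. So the while and do-while versions are not equivalent unless the guard is added.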
![Page 19: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/19.jpg)
New Style “While” Loop Translation

C Code

    int fact_while(int x)
    {
        int result = 1;
        while (x > 1) {
            result *= x;
            x = x-1;
        }
        return result;
    }

Goto Version

    int fact_while_goto3(int x)
    {
        int result = 1;
        goto middle;
    loop:
        result *= x;
        x = x-1;
    middle:
        if (x > 1) goto loop;
        return result;
    }

- Recent technique for GCC
  - Both IA32 & x86-64
- First iteration jumps over body computation within loop
![Page 20: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/20.jpg)
Jump-to-Middle While Translation

C Code

    while (Test)
        Body

Goto Version

        goto middle;
    loop:
        Body
    middle:
        if (Test) goto loop;

Goto (Previous) Version

        if (!Test) goto done;
    loop:
        Body
        if (Test) goto loop;
    done:

- Avoids duplicating test code
- Unconditional goto incurs no performance penalty for loops compiled in similar fashion
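To see that the jump-to-middle template is not specific to factorial, here is a sketch applying it to a different while loop. The Euclidean GCD functions `gcd_while` and `gcd_jtm` are not from the slides and were chosen only as a second illustration.

```c
/* Ordinary while loop (illustrative example, not from the slides). */
unsigned gcd_while(unsigned a, unsigned b)
{
    while (b != 0) {
        unsigned t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Jump-to-middle translation: one copy of the test, placed after
 * the body, reached first through an unconditional jump. */
unsigned gcd_jtm(unsigned a, unsigned b)
{
    unsigned t;
    goto middle;
loop:
    t = a % b;      /* Body */
    a = b;
    b = t;
middle:
    if (b != 0) goto loop;  /* Test */
    return a;
}
```

The test appears once in `gcd_jtm`, whereas the guarded do-while form would need it twice.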
![Page 21: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/21.jpg)
Jump-to-Middle Example

    int fact_while(int x)
    {
        int result = 1;
        while (x > 1) {
            result *= x;
            x--;
        }
        return result;
    }

    # x in %edx, result in %eax
        jmp   .L34            # goto Middle
    .L35:                     # Loop:
        imull %edx, %eax      # result *= x
        decl  %edx            # x--
    .L34:                     # Middle:
        cmpl  $1, %edx        # x:1
        jg    .L35            # if >, goto Loop
![Page 22: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/22.jpg)
“For” Loop Example: Square-and-Multiply

    /* Compute x raised to nonnegative power p */
    int ipwr_for(int x, unsigned p)
    {
        int result;
        for (result = 1; p != 0; p = p>>1) {
            if (p & 0x1)
                result *= x;
            x = x*x;
        }
        return result;
    }

Algorithm
- Exploit bit representation: $p = p_0 + 2p_1 + 2^2 p_2 + \dots + 2^{n-1} p_{n-1}$
- Gives: $x^p = z_0 \cdot z_1^{2} \cdot (z_2^{2})^{2} \cdots (\dots((z_{n-1}^{2})^{2})\dots)^{2}$, where the last factor is squared $n-1$ times
  - $z_i = 1$ when $p_i = 0$
  - $z_i = x$ when $p_i = 1$
- Complexity $O(\log p)$

Example

$3^{10} = 3^2 \cdot 3^8 = 3^2 \cdot ((3^2)^2)^2$
![Page 23: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/23.jpg)
ipwr Computation

    /* Compute x raised to nonnegative power p */
    int ipwr_for(int x, unsigned p)
    {
        int result;
        for (result = 1; p != 0; p = p>>1) {
            if (p & 0x1)
                result *= x;
            x = x*x;
        }
        return result;
    }

Trace for x = 3, p = 10:

| before iteration | result | x | p |
|---|---|---|---|
| 1 | 1 | 3 | 10 = 1010₂ |
| 2 | 1 | 9 | 5 = 101₂ |
| 3 | 9 | 81 | 2 = 10₂ |
| 4 | 9 | 6561 | 1 = 1₂ |
| 5 | 59049 | 43046721 | 0 |
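The trace above can be reproduced mechanically. The sketch below is a variant of `ipwr_for` that records the state before each iteration. The `struct row` type, the `MAX_ROWS` bound (which assumes a 32-bit `unsigned`) and the function name `ipwr_trace` are all assumptions made for this illustration.

```c
#define MAX_ROWS 33   /* assumes 32-bit unsigned: up to 32 iterations + final state */

struct row {
    int result;
    int x;
    unsigned p;
};

/* Same loop as ipwr_for, but stores (result, x, p) as seen before
 * each iteration, plus one final row once p reaches 0.
 * rows must have room for MAX_ROWS entries. */
int ipwr_trace(int x, unsigned p, struct row rows[], int *nrows)
{
    int result;
    int n = 0;
    for (result = 1; p != 0; p = p >> 1) {
        rows[n].result = result;
        rows[n].x = x;
        rows[n].p = p;
        n++;
        if (p & 0x1)
            result *= x;
        x = x * x;
    }
    rows[n].result = result;    /* state when the loop exits */
    rows[n].x = x;
    rows[n].p = p;
    n++;
    *nrows = n;
    return result;
}
```

Calling it with `x = 3` and `p = 10` should fill five rows matching the table. The first row is (1, 3, 10) and the last is (59049, 43046721, 0).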
![Page 24: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/24.jpg)
“For” Loop Example

General Form

    for (Init; Test; Update)
        Body

Example

    int result;
    for (result = 1; p != 0; p = p>>1) {
        if (p & 0x1)
            result *= x;
        x = x*x;
    }

Init

    result = 1

Test

    p != 0

Update

    p = p >> 1

Body

    {
        if (p & 0x1)
            result *= x;
        x = x*x;
    }
![Page 25: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/25.jpg)
“For” → “While” → “Do-While”

For Version

    for (Init; Test; Update)
        Body

While Version

    Init;
    while (Test) {
        Body
        Update;
    }

Do-While Version

        Init;
        if (!Test) goto done;
        do {
            Body
            Update;
        } while (Test);
    done:

Goto Version

        Init;
        if (!Test) goto done;
    loop:
        Body
        Update;
        if (Test) goto loop;
    done:
![Page 26: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/26.jpg)
For-Loop: Compilation #1

For Version

    for (Init; Test; Update)
        Body

Goto Version

        Init;
        if (!Test) goto done;
    loop:
        Body
        Update;
        if (Test) goto loop;
    done:

Applied to the example:

    for (result = 1; p != 0; p = p>>1) {
        if (p & 0x1)
            result *= x;
        x = x*x;
    }

        result = 1;
        if (p == 0) goto done;
    loop:
        if (p & 0x1)
            result *= x;
        x = x*x;
        p = p >> 1;
        if (p != 0) goto loop;
    done:
![Page 27: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/27.jpg)
“For” → “While” (Jump-to-Middle)

For Version

    for (Init; Test; Update)
        Body

While Version

    Init;
    while (Test) {
        Body
        Update;
    }

Goto Version

        Init;
        goto middle;
    loop:
        Body
        Update;
    middle:
        if (Test) goto loop;
    done:
![Page 28: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/28.jpg)
For-Loop: Compilation #2

For Version

    for (Init; Test; Update)
        Body

Goto Version

        Init;
        goto middle;
    loop:
        Body
        Update;
    middle:
        if (Test) goto loop;
    done:

Applied to the example:

    for (result = 1; p != 0; p = p>>1) {
        if (p & 0x1)
            result *= x;
        x = x*x;
    }

        result = 1;
        goto middle;
    loop:
        if (p & 0x1)
            result *= x;
        x = x*x;
        p = p >> 1;
    middle:
        if (p != 0) goto loop;
    done:
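The two for-loop translations can be compared against the original loop by wrapping each goto fragment in a function. This is a sketch: `ipwr_for` comes from the slides, while the wrapper names `ipwr_goto1` and `ipwr_goto2` are made up here.

```c
/* Original for loop from the slides. */
int ipwr_for(int x, unsigned p)
{
    int result;
    for (result = 1; p != 0; p = p >> 1) {
        if (p & 0x1)
            result *= x;
        x = x * x;
    }
    return result;
}

/* Compilation #1: guarded do-while form. */
int ipwr_goto1(int x, unsigned p)
{
    int result;
    result = 1;                 /* Init */
    if (p == 0) goto done;      /* !Test */
loop:
    if (p & 0x1)                /* Body */
        result *= x;
    x = x * x;
    p = p >> 1;                 /* Update */
    if (p != 0) goto loop;      /* Test */
done:
    return result;
}

/* Compilation #2: jump-to-middle form. */
int ipwr_goto2(int x, unsigned p)
{
    int result;
    result = 1;                 /* Init */
    goto middle;
loop:
    if (p & 0x1)                /* Body */
        result *= x;
    x = x * x;
    p = p >> 1;                 /* Update */
middle:
    if (p != 0) goto loop;      /* Test */
    return result;
}
```

All three should return the same value for any inputs where the multiplications stay within the range of `int`.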
![Page 29: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/29.jpg)
Implementing Loops

IA32
- All loops translated into form based on “do-while”

x86-64
- Also make use of “jump to middle”

Why the difference?
- IA32 compiler developed for machine where all operations costly
- x86-64 compiler developed for machine where unconditional branches incur (almost) no overhead
![Page 30: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/30.jpg)
Switch Statement Example

    long switch_eg(long x, long y, long z)
    {
        long w = 1;
        switch(x) {
        case 1:
            w = y*z;
            break;
        case 2:
            w = y/z;
            /* Fall Through */
        case 3:
            w += z;
            break;
        case 5:
        case 6:
            w -= z;
            break;
        default:
            w = 2;
        }
        return w;
    }

- Multiple case labels
  - Here: 5, 6
- Fall through cases
  - Here: 2
- Missing cases
  - Here: 4
![Page 31: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/31.jpg)
Jump Table Structure

Switch Form

    switch(x) {
    case val_0:
        Block 0
    case val_1:
        Block 1
        • • •
    case val_n-1:
        Block n–1
    }

Jump Table

    jtab:   Targ0
            Targ1
            Targ2
            • • •
            Targn-1

Jump Targets

    Targ0:      Code Block 0
    Targ1:      Code Block 1
    Targ2:      Code Block 2
                • • •
    Targn-1:    Code Block n–1

Approximate Translation

    target = JTab[x];
    goto *target;
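Standard C has no `goto *target`, so the sketch below approximates the jump table with an array of function pointers. This is an analogy, not what the compiler emits. The function `switch_eg` is the one from the switch example slide. The names `block_fn`, `blk_*`, `jtab` and `switch_eg_table` are invented for this illustration. Because functions cannot fall through, the case-2 block folds in the case-3 code it would fall into.

```c
/* switch_eg as given in the slides. */
long switch_eg(long x, long y, long z)
{
    long w = 1;
    switch (x) {
    case 1:  w = y * z; break;
    case 2:  w = y / z; /* Fall through */
    case 3:  w += z; break;
    case 5:
    case 6:  w -= z; break;
    default: w = 2;
    }
    return w;
}

/* One function per code block; each starts from w = 1. */
typedef long (*block_fn)(long y, long z);

static long blk_default(long y, long z) { (void)y; (void)z; return 2; }
static long blk_1(long y, long z)       { return y * z; }
static long blk_2(long y, long z)       { return y / z + z; }  /* w = y/z, then case 3 */
static long blk_3(long y, long z)       { (void)y; return 1 + z; }
static long blk_56(long y, long z)      { (void)y; return 1 - z; }

/* Table indexed by x; missing cases 0 and 4 point at the default. */
static const block_fn jtab[7] = {
    blk_default, blk_1, blk_2, blk_3, blk_default, blk_56, blk_56
};

long switch_eg_table(long x, long y, long z)
{
    /* One unsigned comparison rejects both x < 0 and x > 6. */
    if ((unsigned long)x > 6)
        return blk_default(y, z);
    return jtab[x](y, z);       /* target = JTab[x]; goto *target; */
}
```

The lookup costs the same for every case value, which is the point of a jump table compared with a chain of comparisons.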
![Page 33: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/33.jpg)
Switch Statement Example (IA32)

    long switch_eg(long x, long y, long z)
    {
        long w = 1;
        switch(x) {
            . . .
        }
        return w;
    }

Setup:

    switch_eg:
        pushl %ebp              # Setup
        movl  %esp, %ebp        # Setup
        pushl %ebx              # Setup
        movl  $1, %ebx          # w = 1
        movl  8(%ebp), %edx     # edx = x
        movl  16(%ebp), %ecx    # ecx = z
        cmpl  $6, %edx          # x:6
        ja    .L61              # if > goto default
        jmp   *.L62(,%edx,4)    # goto JTab[x] (indirect jump)

Jump table

    .section .rodata
        .align 4
    .L62:
        .long .L61    # x = 0
        .long .L56    # x = 1
        .long .L57    # x = 2
        .long .L58    # x = 3
        .long .L61    # x = 4
        .long .L60    # x = 5
        .long .L60    # x = 6
![Page 34: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/34.jpg)
Assembly Setup Explanation

Table Structure
- Each target requires 4 bytes
- Base address at .L62

Jumping
- Direct: jmp .L61
  - Jump target is denoted by label .L61
- Indirect: jmp *.L62(,%edx,4)
  - Start of jump table: .L62
  - Must scale by a factor of 4 (labels are 32 bits = 4 bytes on IA32)
  - Fetch target from effective address .L62 + edx*4
  - Only for 0 ≤ x ≤ 6

Jump table

    .section .rodata
        .align 4
    .L62:
        .long .L61    # x = 0
        .long .L56    # x = 1
        .long .L57    # x = 2
        .long .L58    # x = 3
        .long .L61    # x = 4
        .long .L60    # x = 5
        .long .L60    # x = 6
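The range check and address arithmetic behind `jmp *.L62(,%edx,4)` can be written out in C. The sketch below computes where the table entry for a given `x` lives, that is, the table base plus 4 times `x`. The function name `jump_slot_address` and its parameters are hypothetical. The table base is passed in as a plain number, and the test uses 0x80488dc only as a sample value.

```c
#include <stdint.h>

/* Returns 1 and stores the address of the jump-table entry for x
 * if 0 <= x <= 6; returns 0 when the default case would be taken.
 * `ja` is an unsigned comparison, so a negative x, viewed as a
 * large unsigned value, also goes to the default. */
int jump_slot_address(int32_t x, uint32_t table_base, uint32_t *slot)
{
    if ((uint32_t)x > 6u)
        return 0;                               /* ja default */
    *slot = table_base + 4u * (uint32_t)x;      /* base + index * 4 */
    return 1;
}
```

The single unsigned comparison is why the assembly needs no separate test for `x < 0`.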
![Page 35: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/35.jpg)
Jump Table

    .section .rodata
        .align 4
    .L62:
        .long .L61    # x = 0
        .long .L56    # x = 1
        .long .L57    # x = 2
        .long .L58    # x = 3
        .long .L61    # x = 4
        .long .L60    # x = 5
        .long .L60    # x = 6

    switch(x) {
    case 1:         // .L56
        w = y*z;
        break;
    case 2:         // .L57
        w = y/z;
        /* Fall Through */
    case 3:         // .L58
        w += z;
        break;
    case 5:
    case 6:         // .L60
        w -= z;
        break;
    default:        // .L61
        w = 2;
    }
![Page 36: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/36.jpg)
Code Blocks (Partial)

    switch(x) {
        . . .
    case 2:         // .L57
        w = y/z;
        /* Fall Through */
    case 3:         // .L58
        w += z;
        break;
        . . .
    default:        // .L61
        w = 2;
    }

    .L61:                       # Default case
        movl  $2, %ebx          # w = 2
        movl  %ebx, %eax        # Return w
        popl  %ebx
        leave
        ret
    .L57:                       # Case 2:
        movl  12(%ebp), %eax    # y
        cltd                    # Div prep
        idivl %ecx              # y/z
        movl  %eax, %ebx        # w = y/z
                                # Fall through
    .L58:                       # Case 3:
        addl  %ecx, %ebx        # w += z
        movl  %ebx, %eax        # Return w
        popl  %ebx
        leave
        ret
![Page 37: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/37.jpg)
x86-64 Switch Implementation

- Same general idea, adapted to 64-bit code
- Table entries 64 bits (pointers)
- Cases use revised code

Jump Table

    .section .rodata
        .align 8
    .L62:
        .quad .L55    # x = 0
        .quad .L50    # x = 1
        .quad .L51    # x = 2
        .quad .L52    # x = 3
        .quad .L55    # x = 4
        .quad .L54    # x = 5
        .quad .L54    # x = 6

    switch(x) {
    case 1:         // .L50
        w = y*z;
        break;
        . . .
    }

    .L50:                     # Case 1:
        movq  %rsi, %r8       # w = y
        imulq %rdx, %r8       # w *= z
        movq  %r8, %rax       # Return w
        ret
![Page 38: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/38.jpg)
IA32 Object Code

Setup
- Label .L61 becomes address 0x8048630
- Label .L62 becomes address 0x80488dc

Assembly Code

    switch_eg:
        . . .
        ja   .L61               # if > goto default
        jmp  *.L62(,%edx,4)     # goto JTab[x]

Disassembled Object Code

    08048610 <switch_eg>:
        . . .
        8048622: 77 0c                   ja   8048630
        8048624: ff 24 95 dc 88 04 08    jmp  *0x80488dc(,%edx,4)
![Page 39: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/39.jpg)
IA32 Object Code (cont.)

Jump Table
- Doesn’t show up in disassembled code
- Can inspect using GDB:

      gdb asm-cntl
      (gdb) x/7xw 0x80488dc

  - Examine 7 hexadecimal format “words” (4 bytes each)
  - Use command “help x” to get format documentation

      0x80488dc:
          0x08048630
          0x08048650
          0x0804863a
          0x08048642
          0x08048630
          0x08048649
          0x08048649
![Page 40: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/40.jpg)
Disassembled Targets

    8048630: bb 02 00 00 00    mov   $0x2,%ebx
    8048635: 89 d8             mov   %ebx,%eax
    8048637: 5b                pop   %ebx
    8048638: c9                leave
    8048639: c3                ret
    804863a: 8b 45 0c          mov   0xc(%ebp),%eax
    804863d: 99                cltd
    804863e: f7 f9             idiv  %ecx
    8048640: 89 c3             mov   %eax,%ebx
    8048642: 01 cb             add   %ecx,%ebx
    8048644: 89 d8             mov   %ebx,%eax
    8048646: 5b                pop   %ebx
    8048647: c9                leave
    8048648: c3                ret
    8048649: 29 cb             sub   %ecx,%ebx
    804864b: 89 d8             mov   %ebx,%eax
    804864d: 5b                pop   %ebx
    804864e: c9                leave
    804864f: c3                ret
    8048650: 8b 5d 0c          mov   0xc(%ebp),%ebx
    8048653: 0f af d9          imul  %ecx,%ebx
    8048656: 89 d8             mov   %ebx,%eax
    8048658: 5b                pop   %ebx
    8048659: c9                leave
    804865a: c3                ret
![Page 41: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/41.jpg)
Matching Disassembled Targets

Each jump table entry is the address of the first instruction of a code block in the disassembly:

| x | Jump table entry | Instruction at that address |
|---|---|---|
| 0 | 0x08048630 | mov $0x2,%ebx |
| 1 | 0x08048650 | mov 0xc(%ebp),%ebx |
| 2 | 0x0804863a | mov 0xc(%ebp),%eax |
| 3 | 0x08048642 | add %ecx,%ebx |
| 4 | 0x08048630 | mov $0x2,%ebx |
| 5 | 0x08048649 | sub %ecx,%ebx |
| 6 | 0x08048649 | sub %ecx,%ebx |
![Page 42: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/42.jpg)
x86-64 Object Code

Setup
- The default-case label (.L55 in the assembly code) becomes address 0x0000000000400716
- The jump-table label (.L56 in the assembly code) becomes address 0x0000000000400990

Assembly Code

    switch_eg:
        . . .
        ja   .L55               # if > goto default
        jmp  *.L56(,%rdi,8)     # goto JTab[x]

Disassembled Object Code

    0000000000400700 <switch_eg>:
        . . .
        40070d: 77 07                   ja    400716
        40070f: ff 24 fd 90 09 40 00    jmpq  *0x400990(,%rdi,8)
![Page 43: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/43.jpg)
x86-64 Object Code (cont.)

Jump Table
- Can inspect using GDB:

      gdb asm-cntl
      (gdb) x/7xg 0x400990

  - Examine 7 hexadecimal format “giant words” (8 bytes each)
  - Use command “help x” to get format documentation

      0x400990:
          0x0000000000400716
          0x0000000000400739
          0x0000000000400720
          0x000000000040072b
          0x0000000000400716
          0x0000000000400732
          0x0000000000400732
![Page 44: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/44.jpg)
Summarizing

C Control
- if-then-else
- do-while
- while, for
- switch

Assembler Control
- Conditional jump
- Conditional move
- Indirect jump
- Compiler must generate assembly code to implement more complex control

Standard Techniques
- IA32 loops converted to do-while form
- x86-64 loops use jump-to-middle
- Large switch statements use jump tables
- Sparse switch statements may use decision trees (not shown)

Conditions in CISC
- CISC machines generally have condition code registers
![Page 45: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/45.jpg)
Machine Programming – Procedures and IA32 Stack
CENG331: Introduction to Computer Systems
7th Lecture
Instructor: Erol Sahin
Acknowledgement: Most of the slides are adapted from the ones prepared by R.E. Bryant, D.R. O’Hallaron of Carnegie-Mellon Univ.
![Page 46: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/46.jpg)
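IA32 Stack

- Region of memory managed with stack discipline
- Grows toward lower addresses
- Register %esp contains lowest stack address = address of “top” element

[Figure: stack diagram. Stack “bottom” at the high-address end, stack “top” at the low-address end; addresses increase toward the bottom, the stack grows down; stack pointer %esp points at the “top” element.]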
![Page 47: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/47.jpg)
IA32 Stack: Push

pushl Src
- Fetch operand at Src
- Decrement %esp by 4
- Write operand at address given by %esp

[Figure: stack diagram. Stack “bottom” at the high-address end; the stack grows down; after the push, stack pointer %esp has moved down by 4 to the new “top”.]
![Page 48: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/48.jpg)
IA32 Stack: Pop

popl Dest
- Read operand at address %esp
- Increment %esp by 4
- Write operand to Dest

[Figure: stack diagram. Stack “bottom” at the high-address end; the stack grows down; after the pop, stack pointer %esp has moved up by 4 to the new “top”.]
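The push and pop steps listed on these slides can be modelled in a few lines of C. This is a toy sketch, not how hardware is implemented. The `struct machine` type, the 64-byte memory size and the helper names are all assumptions, and there is no overflow or underflow checking.

```c
#include <stdint.h>
#include <string.h>

#define MEM_SIZE 64   /* hypothetical tiny memory; the stack starts at its top */

struct machine {
    uint8_t  mem[MEM_SIZE];
    uint32_t esp;               /* offset of the current stack "top" */
};

void machine_init(struct machine *m)
{
    memset(m->mem, 0, MEM_SIZE);
    m->esp = MEM_SIZE;          /* empty stack: esp just past the highest address */
}

/* pushl Src: decrement esp by 4, then write the operand at esp. */
void pushl(struct machine *m, uint32_t src)
{
    m->esp -= 4;
    memcpy(&m->mem[m->esp], &src, 4);
}

/* popl Dest: read the operand at esp, then increment esp by 4. */
uint32_t popl(struct machine *m)
{
    uint32_t dest;
    memcpy(&dest, &m->mem[m->esp], 4);
    m->esp += 4;
    return dest;
}
```

Pushing two values and popping twice returns them in reverse order, and `esp` ends up where it started.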
![Page 49: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/49.jpg)
Procedure Control Flow

- Use stack to support procedure call and return
- Procedure call: call label
  - Push return address on stack
  - Jump to label
- Return address:
  - Address of instruction beyond call
  - Example from disassembly:

        804854e: e8 3d 06 00 00    call  8048b90 <main>
        8048553: 50                pushl %eax

  - Return address = 0x8048553
- Procedure return: ret
  - Pop address from stack
  - Jump to address
![Page 50: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/50.jpg)
Procedure Call Example

    804854e: e8 3d 06 00 00    call  8048b90 <main>
    8048553: 50                pushl %eax

Executing call 8048b90:

| | Before call | After call |
|---|---|---|
| %esp | 0x108 | 0x104 |
| %eip | 0x804854e | 0x8048b90 |
| Memory at 0x108 | 123 | 123 |
| Memory at 0x104 | (not part of the stack) | 0x8048553 |

%eip: program counter
![Page 51: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/51.jpg)
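Procedure Return Example
  8048591: c3    ret
Before ret: %esp = 0x104, %eip = 0x8048591; the word at address 0x104 holds the return address 0x8048553, and the word at 0x108 holds 123.
After ret: %esp = 0x108, %eip = 0x8048553; the popped value is still in memory at 0x104 but is no longer part of the stack.
%eip: program counter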
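The call and return examples can be replayed in a small C model. This is a sketch under stated assumptions: the Cpu struct, STACK_WORDS, do_call, and do_ret are hypothetical names, and the instruction length (5 bytes for the call shown) is passed in by hand rather than decoded.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 0x80        /* hypothetical: words covering addresses 0x000..0x1ff */

typedef struct {
    uint32_t mem[STACK_WORDS];  /* mem[a / 4] holds the word at byte address a */
    uint32_t esp;               /* stack pointer */
    uint32_t eip;               /* program counter */
} Cpu;

/* call target: push the address of the next instruction, then jump. */
void do_call(Cpu *c, uint32_t target, uint32_t insn_len)
{
    uint32_t return_addr = c->eip + insn_len;
    assert(c->esp >= 4 && c->esp % 4 == 0 && c->esp / 4 <= STACK_WORDS);
    c->esp -= 4;
    c->mem[c->esp / 4] = return_addr;
    c->eip = target;
}

/* ret: pop the return address from the stack and jump to it. */
void do_ret(Cpu *c)
{
    assert(c->esp % 4 == 0 && c->esp / 4 < STACK_WORDS);
    c->eip = c->mem[c->esp / 4];
    c->esp += 4;
}
```

With esp = 0x108 and eip = 0x804854e, do_call(&c, 0x8048b90, 5) stores 0x8048553 at address 0x104; a later do_ret restores esp to 0x108 and sets eip to 0x8048553, matching the two examples.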
![Page 52: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/52.jpg)
Stack-Based Languages
Languages that support recursion
  e.g., C, Pascal, Java
  Code must be “Reentrant”: multiple simultaneous instantiations of a single procedure
  Need some place to store state of each instantiation: arguments, local variables, return pointer
Stack discipline
  State for a given procedure is needed for a limited time: from when it is called to when it returns
  Callee returns before caller does
Stack allocated in frames
  A frame holds the state for a single procedure instantiation
![Page 53: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/53.jpg)
Call Chain Example
yoo(…)
{
  • • •
  who();
  • • •
}
who(…)
{
  • • •
  amI();
  • • •
  amI();
  • • •
}
amI(…)
{
  • • •
  amI();
  • • •
}
Example call chain: yoo calls who; who calls amI twice; the first amI calls amI, which calls amI once more.
Procedure amI is recursive
![Page 54: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/54.jpg)
Stack Frames
Contents
  Local variables
  Return information
  Temporary space
Management
  Space allocated when procedure is entered: “Set-up” code
  Deallocated on return: “Finish” code
Figure: the previous frame, then the frame for proc; the frame pointer %ebp points to the start of the current frame and the stack pointer %esp to the stack “top”.
![Page 55: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/55.jpg)
Example
Frames on the stack at each step of the call chain (oldest frame first); %ebp and %esp always delimit the last frame listed.
  1. yoo is running: yoo
  2. yoo calls who: yoo, who
  3. who calls amI: yoo, who, amI
  4. amI calls amI: yoo, who, amI, amI
  5. amI calls amI: yoo, who, amI, amI, amI
  6. innermost amI returns: yoo, who, amI, amI
  7. amI returns: yoo, who, amI
  8. amI returns to who: yoo, who
  9. who calls amI again (this call does not recurse): yoo, who, amI
  10. amI returns to who: yoo, who
  11. who returns to yoo: yoo
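The frame sequence in this example can be reproduced with a short C sketch that records a "frame" on entry to each procedure and drops it on return. The bookkeeping (frames, enter_frame, leave_frame, and the recursion count passed to amI) is hypothetical and only mimics what the hardware stack does implicitly.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPTH 16                      /* hypothetical bound on nesting */

static const char *frames[MAX_DEPTH];     /* names of the live frames, oldest first */
static int depth = 0;                     /* number of live frames */
static int max_depth = 0;                 /* deepest nesting seen so far */

static void enter_frame(const char *name)
{
    assert(depth < MAX_DEPTH);
    frames[depth++] = name;
    if (depth > max_depth)
        max_depth = depth;
}

static void leave_frame(void)
{
    assert(depth > 0);
    depth--;
}

/* amI recurses n more times before returning. */
void amI(int n) { enter_frame("amI"); if (n > 0) amI(n - 1); leave_frame(); }

/* who calls amI twice: once recursing two levels, once not recursing. */
void who(void)  { enter_frame("who"); amI(2); amI(0); leave_frame(); }

void yoo(void)  { enter_frame("yoo"); who(); leave_frame(); }

int live_frames(void)         { return depth; }
int deepest_nesting(void)     { return max_depth; }
const char *top_frame(void)   { return depth > 0 ? frames[depth - 1] : NULL; }
```

After yoo() returns, no frames remain, and the deepest nesting reached is five frames (yoo, who, amI, amI, amI), which mirrors the callee-returns-before-caller discipline.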
![Page 66: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/66.jpg)
IA32/Linux Stack Frame
Current stack frame (“top” to bottom)
  “Argument build”: parameters for the function about to be called
  Local variables, if they can’t be kept in registers
  Saved register context
  Old frame pointer
Caller stack frame
  Return address, pushed by call instruction
  Arguments for this call
Figure (highest address first): caller frame ending with Arguments and Return Addr; then Old %ebp (frame pointer %ebp points here), Saved Registers + Local Variables, and Argument Build (stack pointer %esp points here).
![Page 67: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/67.jpg)
Revisiting swap
void swap(int *xp, int *yp)
{
  int t0 = *xp;
  int t1 = *yp;
  *xp = t1;
  *yp = t0;
}

int zip1 = 15213;
int zip2 = 91125;

void call_swap()
{
  swap(&zip1, &zip2);
}

Calling swap from call_swap
call_swap:
  • • •
  pushl $zip2   # Global Var
  pushl $zip1   # Global Var
  call swap
  • • •
Resulting stack (highest address first): • • •, &zip2, &zip1, Rtn adr (%esp points here)
![Page 68: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/68.jpg)
Revisiting swap
void swap(int *xp, int *yp)
{
  int t0 = *xp;
  int t1 = *yp;
  *xp = t1;
  *yp = t0;
}

swap:
  # Set Up
  pushl %ebp
  movl %esp,%ebp
  pushl %ebx
  # Body
  movl 12(%ebp),%ecx
  movl 8(%ebp),%edx
  movl (%ecx),%eax
  movl (%edx),%ebx
  movl %eax,(%edx)
  movl %ebx,(%ecx)
  # Finish
  movl -4(%ebp),%ebx
  movl %ebp,%esp
  popl %ebp
  ret
![Page 69: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/69.jpg)
swap Setup
swap:
  pushl %ebp
  movl %esp,%ebp
  pushl %ebx
Entering stack (highest address first): • • •, &zip2, &zip1, Rtn adr (%esp points here); %ebp still points into the caller's frame.
After pushl %ebp: • • •, yp, xp, Rtn adr, Old %ebp (%esp points here); %ebp is unchanged.
After movl %esp,%ebp: same stack contents; %ebp and %esp both point at Old %ebp.
After pushl %ebx: • • •, yp, xp, Rtn adr, Old %ebp (%ebp points here), Old %ebx (%esp points here).
Offset relative to %ebp: yp at 12, xp at 8, Rtn adr at 4, Old %ebp at 0.
  movl 12(%ebp),%ecx   # get yp
  movl 8(%ebp),%edx    # get xp
  . . .
![Page 74: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/74.jpg)
swap Finish
  movl -4(%ebp),%ebx
  movl %ebp,%esp
  popl %ebp
  ret
swap’s stack before the finish code (highest address first): • • •, yp, xp, Rtn adr, Old %ebp (%ebp points here), Old %ebx (%esp points here).
Finish #1, after movl -4(%ebp),%ebx: stack unchanged; %ebx holds its saved value again.
Finish #2, after movl %ebp,%esp: %esp and %ebp both point at Old %ebp; Old %ebx is no longer on the stack.
Finish #3, after popl %ebp: • • •, yp, xp, Rtn adr (%esp points here); %ebp again points into the caller's frame.
Finish #4, after ret: • • •, yp, xp (%esp points here); execution continues at the return address.
Observation
  Saved & restored register %ebx
  Didn’t do so for %eax, %ecx, or %edx
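The setup and finish sequences can be traced in C on a simulated stack. This is a rough sketch, not the actual machine: the Frame32 struct, WORDS, and the helper names are hypothetical, and only the three registers that matter here (%esp, %ebp, %ebx) are modeled.

```c
#include <assert.h>
#include <stdint.h>

#define WORDS 64                 /* hypothetical: words covering addresses 0x00..0xff */

typedef struct {
    uint32_t mem[WORDS];         /* mem[a / 4] is the word at byte address a */
    uint32_t esp, ebp, ebx;
} Frame32;

void push(Frame32 *m, uint32_t v)
{
    assert(m->esp >= 4 && m->esp % 4 == 0 && m->esp / 4 <= WORDS);
    m->esp -= 4;
    m->mem[m->esp / 4] = v;
}

uint32_t pop(Frame32 *m)
{
    uint32_t v;
    assert(m->esp % 4 == 0 && m->esp / 4 < WORDS);
    v = m->mem[m->esp / 4];
    m->esp += 4;
    return v;
}

uint32_t load(const Frame32 *m, uint32_t addr)
{
    assert(addr % 4 == 0 && addr / 4 < WORDS);
    return m->mem[addr / 4];
}

/* Setup: pushl %ebp; movl %esp,%ebp; pushl %ebx */
void setup(Frame32 *m)
{
    push(m, m->ebp);
    m->ebp = m->esp;
    push(m, m->ebx);
}

/* Finish (without ret): movl -4(%ebp),%ebx; movl %ebp,%esp; popl %ebp */
void finish(Frame32 *m)
{
    m->ebx = load(m, m->ebp - 4);
    m->esp = m->ebp;
    m->ebp = pop(m);
}

/* Argument n (0-based) sits at offset 8 + 4*n from %ebp. */
uint32_t arg(const Frame32 *m, uint32_t n)
{
    return load(m, m->ebp + 8 + 4 * n);
}
```

Pushing yp, xp, and a return address before setup puts xp at 8(%ebp) and yp at 12(%ebp); after finish, %ebx and %ebp hold their old values and %esp points at the return address again.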
![Page 81: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/81.jpg)
Disassembled swap
080483a4 <swap>:
  80483a4: 55          push %ebp
  80483a5: 89 e5       mov %esp,%ebp
  80483a7: 53          push %ebx
  80483a8: 8b 55 08    mov 0x8(%ebp),%edx
  80483ab: 8b 4d 0c    mov 0xc(%ebp),%ecx
  80483ae: 8b 1a       mov (%edx),%ebx
  80483b0: 8b 01       mov (%ecx),%eax
  80483b2: 89 02       mov %eax,(%edx)
  80483b4: 89 19       mov %ebx,(%ecx)
  80483b6: 5b          pop %ebx
  80483b7: c9          leave
  80483b8: c3          ret
Calling Code
  8048409: e8 96 ff ff ff    call 80483a4 <swap>
  804840e: 8b 45 f8          mov 0xfffffff8(%ebp),%eax
![Page 82: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/82.jpg)
Register Saving Conventions
When procedure yoo calls who:
  yoo is the caller
  who is the callee
Can a register be used for temporary storage?
yoo:
  • • •
  movl $15213, %edx
  call who
  addl %edx, %eax
  • • •
  ret
who:
  • • •
  movl 8(%ebp), %edx
  addl $91125, %edx
  • • •
  ret
Contents of register %edx are overwritten by who, so yoo cannot rely on %edx after the call
![Page 83: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/83.jpg)
Register Saving Conventions
When procedure yoo calls who:
  yoo is the caller
  who is the callee
Can a register be used for temporary storage?
Conventions
  “Caller Save”: caller saves temporary in its frame before calling
  “Callee Save”: callee saves temporary in its frame before using
![Page 84: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/84.jpg)
IA32/Linux Register Usage
%eax, %edx, %ecx (caller-save temporaries)
  Caller saves prior to call if values are used later
%eax
  Also used to return integer value
%ebx, %esi, %edi (callee-save temporaries)
  Callee saves if it wants to use them
%esp, %ebp (special)
![Page 85: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/85.jpg)
Recursive Factorial
int rfact(int x)
{
  int rval;
  if (x <= 1)
    return 1;
  rval = rfact(x-1);
  return rval * x;
}

  .globl rfact
  .type rfact,@function
rfact:
  pushl %ebp
  movl %esp,%ebp
  pushl %ebx
  movl 8(%ebp),%ebx
  cmpl $1,%ebx
  jle .L78
  leal -1(%ebx),%eax
  pushl %eax
  call rfact
  imull %ebx,%eax
  jmp .L79
  .align 4
.L78:
  movl $1,%eax
.L79:
  movl -4(%ebp),%ebx
  movl %ebp,%esp
  popl %ebp
  ret
Registers
  %eax used without first saving
  %ebx used, but saved at beginning & restored at end
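A compilable version of rfact makes the register choice easier to see: x is still needed after the recursive call returns, which is why the compiled code keeps it in the callee-save register %ebx. The assert bound of 12 is an added assumption (12! is the largest factorial that fits in a 32-bit int), not part of the original code.

```c
#include <assert.h>

/* Recursive factorial, as on the slide, with an added overflow guard. */
int rfact(int x)
{
    int rval;
    assert(x <= 12);          /* assumption: 32-bit int; 13! would overflow */
    if (x <= 1)
        return 1;
    rval = rfact(x - 1);      /* x must survive this call (hence %ebx) */
    return rval * x;          /* result is returned in %eax */
}
```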
![Page 86: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/86.jpg)
Pointer Code
Recursive procedure
void s_helper(int x, int *accum)
{
  if (x <= 1)
    return;
  else {
    int z = *accum * x;
    *accum = z;
    s_helper(x-1, accum);
  }
}
Top-level call
int sfact(int x)
{
  int val = 1;
  s_helper(x, &val);
  return val;
}
Pass pointer to update location
![Page 87: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/87.jpg)
Creating & Initializing Pointer
int sfact(int x)
{
  int val = 1;
  s_helper(x, &val);
  return val;
}
Initial part of sfact
_sfact:
  pushl %ebp          # Save %ebp
  movl %esp,%ebp      # Set %ebp
  subl $16,%esp       # Allocate 16 bytes
  movl 8(%ebp),%edx   # edx = x
  movl $1,-4(%ebp)    # val = 1
Variable val must be stored on stack
  Because: need to create pointer to it
Compute pointer as -4(%ebp)
Push on stack as second argument
Stack (offset from %ebp): 8: x; 4: Rtn adr; 0: Old %ebp (%ebp points here); -4: val = 1; -8 to -16: unused temp. space (%esp points at -16).
![Page 89: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/89.jpg)
Passing Pointer
int sfact(int x)
{
  int val = 1;
  s_helper(x, &val);
  return val;
}
Calling s_helper from sfact
  leal -4(%ebp),%eax   # Compute &val
  pushl %eax           # Push on stack
  pushl %edx           # Push x
  call s_helper        # call
  movl -4(%ebp),%eax   # Return val
  • • •                # Finish
Stack at time of call (offset from %ebp): 8: x; 4: Rtn adr; 0: Old %ebp (%ebp points here); -4: val = 1; -8 to -16: unused; then the pushed arguments &val and x (%esp points at x).
After s_helper returns: val = x!
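The pointer-passing version can be run as ordinary C to confirm that val ends up holding x!. The code follows the slide; only the null-pointer assert is an added assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Multiplies *accum by x, x-1, ..., 2. The caller's variable is updated in place. */
void s_helper(int x, int *accum)
{
    assert(accum != NULL);
    if (x <= 1)
        return;
    else {
        int z = *accum * x;
        *accum = z;
        s_helper(x - 1, accum);
    }
}

/* val must live in memory (on the stack) because its address is taken. */
int sfact(int x)
{
    int val = 1;
    s_helper(x, &val);
    return val;
}
```

Every recursive call receives the same address, so all of them update the single val in sfact's frame.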
![Page 91: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/91.jpg)
IA32 Procedure Summary
The stack makes recursion work
  Private storage for each instance of procedure call
    Instantiations don’t clobber each other
    Addressing of locals + arguments can be relative to stack positions
  Managed by stack discipline
    Procedures return in inverse order of calls
IA32 procedures: combination of instructions + conventions
  Call / Ret instructions
  Register usage conventions
    Caller / Callee save
    %ebp and %esp
  Stack frame organization conventions
Figure (highest address first): caller frame ending with Arguments and Return Addr; then Old %ebp (%ebp points here), Saved Registers + Local Variables, and Argument Build (%esp points here).
![Page 92: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/92.jpg)
Today
Procedures (x86-64)
Arrays
  One-dimensional
  Multi-dimensional (nested)
  Multi-level
Structures
![Page 93: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/93.jpg)
x86-64 Integer Registers
Twice the number of registers
Accessible as 8, 16, 32, 64 bits
  64-bit   32-bit      64-bit   32-bit
  %rax     %eax        %r8      %r8d
  %rbx     %ebx        %r9      %r9d
  %rcx     %ecx        %r10     %r10d
  %rdx     %edx        %r11     %r11d
  %rsi     %esi        %r12     %r12d
  %rdi     %edi        %r13     %r13d
  %rsp     %esp        %r14     %r14d
  %rbp     %ebp        %r15     %r15d
![Page 94: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/94.jpg)
x86-64 Integer Registers
  %rax   Return value       %r8    Argument #5
  %rbx   Callee saved       %r9    Argument #6
  %rcx   Argument #4        %r10   Callee saved
  %rdx   Argument #3        %r11   Used for linking
  %rsi   Argument #2        %r12   C: Callee saved
  %rdi   Argument #1        %r13   Callee saved
  %rsp   Stack pointer      %r14   Callee saved
  %rbp   Callee saved       %r15   Callee saved
![Page 95: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/95.jpg)
x86-64 Registers
Arguments passed to functions via registers
  If more than 6 integral parameters, then pass rest on stack
  These registers can be used as caller-saved as well
All references to stack frame via stack pointer
  Eliminates need to update %ebp/%rbp
Other registers
  6+1 callee saved
  2 or 3 have special uses
![Page 96: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/96.jpg)
x86-64 Long Swap
void swap(long *xp, long *yp)
{
  long t0 = *xp;
  long t1 = *yp;
  *xp = t1;
  *yp = t0;
}
swap:
  movq (%rdi), %rdx
  movq (%rsi), %rax
  movq %rax, (%rdi)
  movq %rdx, (%rsi)
  ret
Operands passed in registers
  First (xp) in %rdi, second (yp) in %rsi
  64-bit pointers
No stack operations required (except ret)
Avoiding stack
  Can hold all local information in registers
![Page 97: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/97.jpg)
x86-64 Locals in the Red Zone
/* Swap, using local array */
void swap_a(long *xp, long *yp)
{
  volatile long loc[2];
  loc[0] = *xp;
  loc[1] = *yp;
  *xp = loc[1];
  *yp = loc[0];
}
swap_a:
  movq (%rdi), %rax
  movq %rax, -24(%rsp)
  movq (%rsi), %rax
  movq %rax, -16(%rsp)
  movq -16(%rsp), %rax
  movq %rax, (%rdi)
  movq -24(%rsp), %rax
  movq %rax, (%rsi)
  ret
Avoiding stack pointer change
  Can hold all information within small window beyond stack pointer
Stack (offset from %rsp): 0: rtn Ptr; −8: unused; −16: loc[1]; −24: loc[0]
![Page 98: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/98.jpg)
x86-64 NonLeaf without Stack Frame
long scount = 0;

/* Swap a[i] & a[i+1] */
void swap_ele_se(long a[], int i)
{
  swap(&a[i], &a[i+1]);
  scount++;
}
swap_ele_se:
  movslq %esi,%rsi            # Sign extend i
  leaq (%rdi,%rsi,8), %rdi    # &a[i]
  leaq 8(%rdi), %rsi          # &a[i+1]
  call swap                   # swap()
  incq scount(%rip)           # scount++;
  ret
No values held while swap is being invoked
No callee save registers needed
![Page 100: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/100.jpg)
x86-64 Call using Jump
long scount = 0;

/* Swap a[i] & a[i+1] */
void swap_ele(long a[], int i)
{
  swap(&a[i], &a[i+1]);
}
swap_ele:
  movslq %esi,%rsi            # Sign extend i
  leaq (%rdi,%rsi,8), %rdi    # &a[i]
  leaq 8(%rdi), %rsi          # &a[i+1]
  jmp swap                    # swap()
When swap executes ret, it will return from swap_ele
Possible since swap is a “tail call” (no instructions afterwards)
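The difference between the two callers can be seen side by side in C. The sketch below uses the functions from the slides plus an added null check in swap; whether a compiler actually emits jmp for swap_ele depends on the compiler and optimization level, so treat the comments as what the slides' compiled code did rather than a guarantee.

```c
#include <assert.h>
#include <stddef.h>

long scount = 0;

void swap(long *xp, long *yp)
{
    long t0, t1;
    assert(xp != NULL && yp != NULL);
    t0 = *xp;
    t1 = *yp;
    *xp = t1;
    *yp = t0;
}

/* Tail call: nothing runs after swap returns, so the call may become jmp swap. */
void swap_ele(long a[], int i)
{
    swap(&a[i], &a[i + 1]);
}

/* Not a tail call: scount++ runs after swap returns, so call/ret is required. */
void swap_ele_se(long a[], int i)
{
    swap(&a[i], &a[i + 1]);
    scount++;
}
```

Both functions swap the same pair of elements; only swap_ele_se has work left to do after swap returns.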
![Page 101: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/101.jpg)
x86-64 Stack Frame Example
long sum = 0;
/* Swap a[i] & a[i+1] */
void swap_ele_su(long a[], int i)
{
  swap(&a[i], &a[i+1]);
  sum += a[i];
}
swap_ele_su:
  movq %rbx, -16(%rsp)
  movslq %esi,%rbx
  movq %r12, -8(%rsp)
  movq %rdi, %r12
  leaq (%rdi,%rbx,8), %rdi
  subq $16, %rsp
  leaq 8(%rdi), %rsi
  call swap
  movq (%r12,%rbx,8), %rax
  addq %rax, sum(%rip)
  movq (%rsp), %rbx
  movq 8(%rsp), %r12
  addq $16, %rsp
  ret
Keeps values of a and i in callee save registers
Must set up stack frame to save these registers
![Page 102: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/102.jpg)
Understanding x86-64 Stack Frame
swap_ele_su:
  movq %rbx, -16(%rsp)        # Save %rbx
  movslq %esi,%rbx            # Extend & save i
  movq %r12, -8(%rsp)         # Save %r12
  movq %rdi, %r12             # Save a
  leaq (%rdi,%rbx,8), %rdi    # &a[i]
  subq $16, %rsp              # Allocate stack frame
  leaq 8(%rdi), %rsi          # &a[i+1]
  call swap                   # swap()
  movq (%r12,%rbx,8), %rax    # a[i]
  addq %rax, sum(%rip)        # sum += a[i]
  movq (%rsp), %rbx           # Restore %rbx
  movq 8(%rsp), %r12          # Restore %r12
  addq $16, %rsp              # Deallocate stack frame
  ret
![Page 103: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/103.jpg)
Understanding x86-64 Stack Frame
Before subq $16, %rsp: rtn addr at (%rsp); %r12 saved at −8(%rsp) and %rbx at −16(%rsp).
After subq $16, %rsp: %rbx at (%rsp), %r12 at +8(%rsp), rtn addr at +16(%rsp).
![Page 104: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/104.jpg)
Interesting Features of Stack Frame
Allocate entire frame at once
  All stack accesses can be relative to %rsp
  Do by decrementing stack pointer
  Can delay allocation, since safe to temporarily use red zone
Simple deallocation
  Increment stack pointer
  No base/frame pointer needed
![Page 105: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/105.jpg)
x86-64 Procedure Summary
Heavy use of registers
  Parameter passing
  More temporaries since more registers
Minimal use of stack
  Sometimes none
  Allocate/deallocate entire block
Many tricky optimizations
  What kind of stack frame to use
  Calling with jump
  Various allocation techniques
![Page 106: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/106.jpg)
Today
Procedures (x86-64)
Arrays
  One-dimensional
  Multi-dimensional (nested)
  Multi-level
Structures
![Page 107: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/107.jpg)
Basic Data Types
Integral
  Stored & operated on in general (integer) registers
  Signed vs. unsigned depends on instructions used
  Intel          GAS   Bytes   C
  byte           b     1       [unsigned] char
  word           w     2       [unsigned] short
  double word    l     4       [unsigned] int
  quad word      q     8       [unsigned] long int (x86-64)
Floating Point
  Stored & operated on in floating point registers
  Intel       GAS   Bytes      C
  Single      s     4          float
  Double      l     8          double
  Extended    t     10/12/16   long double
![Page 108: Machine Programming – Branching CENG331: Introduction to Computer Systems 6 th Lecture Instructor: Erol Sahin Acknowledgement: Most of the slides are adapted](https://reader031.vdocuments.site/reader031/viewer/2022032203/56649e2a5503460f94b18011/html5/thumbnails/108.jpg)
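Array Allocation

Basic Principle
  T A[L];
  Array of data type T and length L
  Contiguously allocated region of L * sizeof(T) bytes

| Declaration | Element addresses | End of array |
|---|---|---|
| char string[12]; | x, x + 1, ..., x + 11 | x + 12 |
| int val[5]; | x, x + 4, x + 8, x + 12, x + 16 | x + 20 |
| double a[3]; | x, x + 8, x + 16 | x + 24 |
| char *p[3]; (IA32) | x, x + 4, x + 8 | x + 12 |
| char *p[3]; (x86-64) | x, x + 8, x + 16 | x + 24 |
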
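The allocation rule can be checked directly with sizeof. The sketch below declares the four arrays from the slide; the helper val_offset is a hypothetical name introduced here, and the concrete byte counts depend on the platform's sizeof(int), sizeof(double), and pointer size.

```c
#include <stddef.h>

/* The arrays from the slide; the start address x is chosen by the compiler. */
char   string[12];
int    val[5];
double a[3];
char  *p[3];

/* Hypothetical helper: byte distance from val[0] to val[i].
   Valid for 0 <= i <= 5 (i == 5 is the one-past-the-end address). */
size_t val_offset(size_t i)
{
    return (size_t)((char *)&val[i] - (char *)&val[0]);
}
```

On a typical IA32 or x86-64 system, sizeof(val) is 20 and sizeof(p) is 12 or 24 respectively, matching the table.
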
Array Access

Basic Principle
  T A[L];
  Array of data type T and length L
  Identifier A can be used as a pointer to array element 0: Type T*

int val[5]; holds 1 5 2 1 3 at addresses x, x + 4, x + 8, x + 12, x + 16 (ends at x + 20)

| Reference | Type | Value |
|---|---|---|
| val[4] | int | 3 |
| val | int * | x |
| val+1 | int * | x + 4 |
| &val[2] | int * | x + 8 |
| val[5] | int | ?? |
| *(val+1) | int | 5 |
| val + i | int * | x + 4*i |

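A minimal sketch of the table's in-range rows in C. The helper names byte_offset and deref are hypothetical; the out-of-range reference val[5] is deliberately not evaluated, since reading it is undefined.

```c
#include <stddef.h>

int val[5] = { 1, 5, 2, 1, 3 };

/* val + i and &val[i] name the same address, i * sizeof(int) bytes past val.
   Valid for 0 <= i <= 5. */
ptrdiff_t byte_offset(int i)
{
    return (char *)(val + i) - (char *)val;
}

/* *(val + i) is by definition the same as val[i]; valid for 0 <= i <= 4. */
int deref(int i)
{
    return *(val + i);
}
```
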
Array Example

    typedef int zip_dig[5];

    zip_dig cmu = { 1, 5, 2, 1, 3 };
    zip_dig mit = { 0, 2, 1, 3, 9 };
    zip_dig ucb = { 9, 4, 7, 2, 0 };

Declaration “zip_dig cmu” equivalent to “int cmu[5]”
Example arrays were allocated in successive 20 byte blocks
  Not guaranteed to happen in general

| Array | Values | Element addresses | End |
|---|---|---|---|
| zip_dig cmu; | 1 5 2 1 3 | 16 20 24 28 32 | 36 |
| zip_dig mit; | 0 2 1 3 9 | 36 40 44 48 52 | 56 |
| zip_dig ucb; | 9 4 7 2 0 | 56 60 64 68 72 | 76 |

Array Accessing Example

    int get_digit(zip_dig z, int dig)
    {
        return z[dig];
    }

IA32

    # %edx = z
    # %eax = dig
    movl (%edx,%eax,4),%eax   # z[dig]

Register %edx contains starting address of array
Register %eax contains array index
Desired digit at 4*%eax + %edx
Use memory reference (%edx,%eax,4)

zip_dig cmu; holds 1 5 2 1 3 at addresses 16 20 24 28 32 (ends at 36)

Referencing Examples

| Reference | Address | Value | Guaranteed? |
|---|---|---|---|
| mit[3] | 36 + 4*3 = 48 | 3 | Yes |
| mit[5] | 36 + 4*5 = 56 | 9 | No |
| mit[-1] | 36 + 4*-1 = 32 | 3 | No |
| cmu[15] | 16 + 4*15 = 76 | ?? | No |

No bound checking
Out of range behavior implementation-dependent
No guaranteed relative allocation of different arrays

| Array | Values | Element addresses | End |
|---|---|---|---|
| zip_dig cmu; | 1 5 2 1 3 | 16 20 24 28 32 | 36 |
| zip_dig mit; | 0 2 1 3 9 | 36 40 44 48 52 | 56 |
| zip_dig ucb; | 9 4 7 2 0 | 56 60 64 68 72 | 76 |

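The address arithmetic in the table can be replayed without actually indexing out of bounds. The sketch below is a hypothetical model: it assumes the example's layout (cmu, mit, ucb back to back starting at address 16), which C does not guarantee, and the names memory, elem_addr, and load are introduced only for illustration.

```c
/* Model of the slide's memory: byte addresses 16..75 hold cmu, mit, ucb.
   The back-to-back layout is an assumption taken from the example. */
enum { BASE = 16, CMU = 16, MIT = 36, UCB = 56, WORDS = 15 };

static const int memory[WORDS] = {
    1, 5, 2, 1, 3,   /* cmu at 16 */
    0, 2, 1, 3, 9,   /* mit at 36 */
    9, 4, 7, 2, 0    /* ucb at 56 */
};

/* Address the machine code computes for base[i] with 4-byte ints. */
int elem_addr(int base, int i)
{
    return base + 4 * i;
}

/* Value stored at a 4-byte-aligned addr in the model, or -1 if addr
   lies outside the modeled region (the "??" case). */
int load(int addr)
{
    int slot = (addr - BASE) / 4;
    if (addr < BASE || slot >= WORDS)
        return -1;
    return memory[slot];
}
```

Here load(elem_addr(MIT, 5)) yields the first digit of ucb, which is why the value on the slide is 9 yet not guaranteed.
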
Array Loop Example

Original

    int zd2int(zip_dig z)
    {
        int i;
        int zi = 0;
        for (i = 0; i < 5; i++) {
            zi = 10 * zi + z[i];
        }
        return zi;
    }

Transformed
  As generated by GCC
  Eliminate loop variable i
  Convert array code to pointer code
  Express in do-while form (no test at entrance)

    int zd2int(zip_dig z)
    {
        int zi = 0;
        int *zend = z + 4;
        do {
            zi = 10 * zi + *z;
            z++;
        } while (z <= zend);
        return zi;
    }

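Both forms can be compiled side by side to confirm they agree. In this sketch the two versions are renamed zd2int_index and zd2int_ptr (hypothetical names) so they can coexist in one file.

```c
typedef int zip_dig[5];

zip_dig cmu = { 1, 5, 2, 1, 3 };
zip_dig mit = { 0, 2, 1, 3, 9 };
zip_dig ucb = { 9, 4, 7, 2, 0 };

/* Original form: index-based for loop. */
int zd2int_index(zip_dig z)
{
    int zi = 0;
    for (int i = 0; i < 5; i++)
        zi = 10 * zi + z[i];
    return zi;
}

/* Transformed form: pointer walk in do-while shape, no test at entrance. */
int zd2int_ptr(zip_dig z)
{
    int zi = 0;
    int *zend = z + 4;       /* address of the last digit */
    do {
        zi = 10 * zi + *z;
        z++;                 /* advances by sizeof(int) bytes */
    } while (z <= zend);
    return zi;
}
```

For cmu both functions return 15213.
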
Array Loop Implementation (IA32)

    int zd2int(zip_dig z)
    {
        int zi = 0;
        int *zend = z + 4;
        do {
            zi = 10 * zi + *z;
            z++;
        } while (z <= zend);
        return zi;
    }

    # %ecx = z
    xorl %eax,%eax            # zi = 0
    leal 16(%ecx),%ebx        # zend = z+4
    .L59:
    leal (%eax,%eax,4),%edx   # 5*zi
    movl (%ecx),%eax          # *z
    addl $4,%ecx              # z++
    leal (%eax,%edx,2),%eax   # zi = *z + 2*(5*zi)
    cmpl %ebx,%ecx            # z : zend
    jle .L59                  # if <= goto loop

Registers
  %ecx z
  %eax zi
  %ebx zend
Computations
  10*zi + *z implemented as *z + 2*(zi+4*zi)
  z++ increments by 4

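The multiply-free computation of 10*zi + *z can be written out in C, one statement per leal. This is a sketch of the arithmetic only; step_lea and step_c are hypothetical names, not compiler output.

```c
/* One loop iteration the way the two leal instructions compute it. */
int step_lea(int zi, int digit)
{
    int t = zi + 4 * zi;      /* leal (%eax,%eax,4),%edx : 5*zi        */
    return digit + 2 * t;     /* leal (%eax,%edx,2),%eax : *z + 2*5*zi */
}

/* The same step as written in the C source. */
int step_c(int zi, int digit)
{
    return 10 * zi + digit;
}
```
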
Nested Array Example

    #define PCOUNT 4
    zip_dig pgh[PCOUNT] =
      {{1, 5, 2, 0, 6},
       {1, 5, 2, 1, 3},
       {1, 5, 2, 1, 7},
       {1, 5, 2, 2, 1}};

“zip_dig pgh[4]” equivalent to “int pgh[4][5]”
  Variable pgh: array of 4 elements, allocated contiguously
  Each element is an array of 5 int’s, allocated contiguously
“Row-Major” ordering of all elements guaranteed

| Row | Start address | Values |
|---|---|---|
| pgh[0] | 76 | 1 5 2 0 6 |
| pgh[1] | 96 | 1 5 2 1 3 |
| pgh[2] | 116 | 1 5 2 1 7 |
| pgh[3] | 136 | 1 5 2 2 1 |

Array ends at address 156

Multidimensional (Nested) Arrays

Declaration
  T A[R][C];
  2D array of data type T
  R rows, C columns
  Type T element requires K bytes
Array Size
  R * C * K bytes
Arrangement
  Row-Major Ordering

Logical layout: R rows, from A[0][0] ... A[0][C-1] down to A[R-1][0] ... A[R-1][C-1]
Memory layout of int A[R][C]: A[0][0] ... A[0][C-1], A[1][0] ... A[1][C-1], ..., A[R-1][0] ... A[R-1][C-1] (4*R*C bytes)

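Row-major ordering and the R * C * K size rule can be observed from C itself. In the sketch below the dimensions R = 3 and C = 4 are arbitrary values chosen for illustration, and total_bytes and next_row_follows are hypothetical helper names.

```c
#include <stddef.h>

enum { R = 3, C = 4 };   /* hypothetical dimensions, for illustration only */
int A[R][C];

/* Whole array occupies R * C * K bytes, with K = sizeof(int). */
size_t total_bytes(void)
{
    return sizeof A;
}

/* Row-major: the first element of row i+1 sits exactly K bytes after the
   last element of row i. Valid for 0 <= i < R-1. */
int next_row_follows(size_t i)
{
    return (char *)&A[i + 1][0] - (char *)&A[i][C - 1] == (ptrdiff_t)sizeof(int);
}
```
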
Nested Array Row Access

Row Vectors
  A[i] is array of C elements
  Each element of type T requires K bytes
  Starting address A + i * (C * K)

Layout of int A[R][C]: row A[0] (A[0][0] ... A[0][C-1]) starts at A, row A[i] (A[i][0] ... A[i][C-1]) starts at A+i*C*4, row A[R-1] (A[R-1][0] ... A[R-1][C-1]) starts at A+(R-1)*C*4

Nested Array Row Access Code

    #define PCOUNT 4
    zip_dig pgh[PCOUNT] =
      {{1, 5, 2, 0, 6},
       {1, 5, 2, 1, 3},
       {1, 5, 2, 1, 7},
       {1, 5, 2, 2, 1}};

    int *get_pgh_zip(int index)
    {
        return pgh[index];
    }

    # %eax = index
    leal (%eax,%eax,4),%eax   # 5 * index
    leal pgh(,%eax,4),%eax    # pgh + (20 * index)

Row Vector
  pgh[index] is array of 5 int’s
  Starting address pgh+20*index
IA32 Code
  Computes and returns address
  Compute as pgh + 4*(index+4*index)

Nested Array Element Access

Array Elements
  A[i][j] is element of type T, which requires K bytes
  Address A + i * (C * K) + j * K = A + (i * C + j) * K

Layout of int A[R][C]: row A[0] starts at A, row A[i] starts at A+i*C*4, and element A[i][j] is at A+i*C*4+j*4; the last row A[R-1] starts at A+(R-1)*C*4

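The element address formula can be compared against what the compiler actually does. A minimal sketch, assuming hypothetical dimensions ROWS = 4 and COLS = 5 (the shape of pgh); formula_offset and actual_offset are names introduced here.

```c
#include <stddef.h>

enum { ROWS = 4, COLS = 5 };   /* hypothetical dimensions */
int A[ROWS][COLS];

/* Offset predicted by the formula (i * C + j) * K, with K = sizeof(int). */
size_t formula_offset(size_t i, size_t j)
{
    return (i * COLS + j) * sizeof(int);
}

/* Offset of A[i][j] from the start of A as laid out by the compiler.
   Valid for 0 <= i < ROWS and 0 <= j < COLS. */
size_t actual_offset(size_t i, size_t j)
{
    return (size_t)((char *)&A[i][j] - (char *)A);
}
```
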
Nested Array Element Access Code

    int get_pgh_digit(int index, int dig)
    {
        return pgh[index][dig];
    }

    # %ecx = dig
    # %eax = index
    leal 0(,%ecx,4),%edx         # 4*dig
    leal (%eax,%eax,4),%eax      # 5*index
    movl pgh(%edx,%eax,4),%eax   # *(pgh + 4*dig + 20*index)

Array Elements
  pgh[index][dig] is int
  Address: pgh + 20*index + 4*dig
IA32 Code
  Computes address pgh + 4*dig + 4*(index+4*index)
  movl performs memory reference

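Strange Referencing Examples

| Reference | Address | Value | Guaranteed? |
|---|---|---|---|
| pgh[3][3] | 76+20*3+4*3 = 148 | 2 | Yes |
| pgh[2][5] | 76+20*2+4*5 = 136 | 1 | Yes |
| pgh[2][-1] | 76+20*2+4*-1 = 112 | 3 | Yes |
| pgh[4][-1] | 76+20*4+4*-1 = 152 | 1 | Yes |
| pgh[0][19] | 76+20*0+4*19 = 152 | 1 | Yes |
| pgh[0][-1] | 76+20*0+4*-1 = 72 | ?? | No |

Code does not do any bounds checking
Ordering of elements within array guaranteed

| Row | Start address | Values |
|---|---|---|
| pgh[0] | 76 | 1 5 2 0 6 |
| pgh[1] | 96 | 1 5 2 1 3 |
| pgh[2] | 116 | 1 5 2 1 7 |
| pgh[3] | 136 | 1 5 2 2 1 |

Array ends at address 156
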
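The rows of this table follow from treating pgh as 20 consecutive ints. The sketch below models that view with an explicit flat array instead of indexing pgh out of range; the start address 76 comes from the example, and flat, pgh_addr, and pgh_load are hypothetical names.

```c
enum { PGH = 76, PCOUNT = 4, DIGITS = 5 };

/* pgh viewed as the 20 consecutive ints that row-major order lays out. */
static const int flat[PCOUNT * DIGITS] = {
    1, 5, 2, 0, 6,
    1, 5, 2, 1, 3,
    1, 5, 2, 1, 7,
    1, 5, 2, 2, 1
};

/* Address formula from the slide: pgh + 20*index + 4*dig. */
int pgh_addr(int index, int dig)
{
    return PGH + 20 * index + 4 * dig;
}

/* Value at that address if it falls inside pgh, otherwise -1 ("??"). */
int pgh_load(int index, int dig)
{
    int slot = DIGITS * index + dig;
    if (slot < 0 || slot >= PCOUNT * DIGITS)
        return -1;
    return flat[slot];
}
```

Note that pgh[4][-1] and pgh[0][19] land on the same slot, the last digit of the last row.
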
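Multi-Level Array Example

    zip_dig cmu = { 1, 5, 2, 1, 3 };
    zip_dig mit = { 0, 2, 1, 3, 9 };
    zip_dig ucb = { 9, 4, 7, 2, 0 };

    #define UCOUNT 3
    int *univ[UCOUNT] = {mit, cmu, ucb};

Variable univ denotes array of 3 elements
Each element is a pointer
  4 bytes
Each pointer points to array of int’s

| Entry | Address | Stored pointer | Points to |
|---|---|---|---|
| univ[0] | 160 | 36 | mit |
| univ[1] | 164 | 16 | cmu |
| univ[2] | 168 | 56 | ucb |

| Array | Values | Element addresses | End |
|---|---|---|---|
| cmu | 1 5 2 1 3 | 16 20 24 28 32 | 36 |
| mit | 0 2 1 3 9 | 36 40 44 48 52 | 56 |
| ucb | 9 4 7 2 0 | 56 60 64 68 72 | 76 |
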
Element Access in Multi-Level Array

    int get_univ_digit(int index, int dig)
    {
        return univ[index][dig];
    }

    # %ecx = index
    # %eax = dig
    leal 0(,%ecx,4),%edx      # 4*index
    movl univ(%edx),%edx      # Mem[univ+4*index]
    movl (%edx,%eax,4),%eax   # Mem[...+4*dig]

Computation (IA32)
  Element access Mem[Mem[univ+4*index]+4*dig]
  Must do two memory reads
    First get pointer to row array
    Then access element within array

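To make the two memory reads visible at the C level, the access can be split into two statements. This is a sketch equivalent to univ[index][dig]; the local variable row is introduced only for illustration.

```c
typedef int zip_dig[5];

zip_dig cmu = { 1, 5, 2, 1, 3 };
zip_dig mit = { 0, 2, 1, 3, 9 };
zip_dig ucb = { 9, 4, 7, 2, 0 };

#define UCOUNT 3
int *univ[UCOUNT] = { mit, cmu, ucb };

/* univ[index][dig] with the two reads written out separately.
   Valid for 0 <= index < UCOUNT and 0 <= dig < 5. */
int get_univ_digit(int index, int dig)
{
    int *row = univ[index];   /* first read: pointer stored in univ */
    return row[dig];          /* second read: element of that array */
}
```
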
Array Element Accesses

Nested array

    int get_pgh_digit(int index, int dig)
    {
        return pgh[index][dig];
    }

Multi-level array

    int get_univ_digit(int index, int dig)
    {
        return univ[index][dig];
    }

Access looks similar, but the element is located differently:
  Nested array: Mem[pgh+20*index+4*dig]
  Multi-level array: Mem[Mem[univ+4*index]+4*dig]

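Strange Referencing Examples

| Reference | Address | Value | Guaranteed? |
|---|---|---|---|
| univ[2][3] | 56+4*3 = 68 | 2 | Yes |
| univ[1][5] | 16+4*5 = 36 | 0 | No |
| univ[2][-1] | 56+4*-1 = 52 | 9 | No |
| univ[3][-1] | ?? | ?? | No |
| univ[1][12] | 16+4*12 = 64 | 7 | No |

Code does not do any bounds checking
Ordering of elements in different arrays not guaranteed

| Entry | Address | Stored pointer | Points to |
|---|---|---|---|
| univ[0] | 160 | 36 | mit |
| univ[1] | 164 | 16 | cmu |
| univ[2] | 168 | 56 | ucb |

| Array | Values | Element addresses | End |
|---|---|---|---|
| cmu | 1 5 2 1 3 | 16 20 24 28 32 | 36 |
| mit | 0 2 1 3 9 | 36 40 44 48 52 | 56 |
| ucb | 9 4 7 2 0 | 56 60 64 68 72 | 76 |
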
Using Nested Arrays

Strengths
  C compiler handles doubly subscripted arrays
  Generates very efficient code
    Avoids multiply in index computation
Limitation
  Only works for fixed array size

    #define N 16
    typedef int fix_matrix[N][N];

    /* Compute element i,k of fixed matrix product */
    int fix_prod_ele(fix_matrix a, fix_matrix b, int i, int k)
    {
        int j;
        int result = 0;
        for (j = 0; j < N; j++)
            result += a[i][j]*b[j][k];
        return result;
    }

Diagram: i-th row of a x k-th column of b

Dynamic Nested Arrays

Strength
  Can create matrix of any size
Programming
  Must do index computation explicitly
Performance
  Accessing single element costly
  Must do multiplication

    int *new_var_matrix(int n)
    {
        return (int *) calloc(n*n, sizeof(int));
    }

    int var_ele(int *a, int i, int j, int n)
    {
        return a[i*n+j];
    }

    movl 12(%ebp),%eax        # i
    movl 8(%ebp),%edx         # a
    imull 20(%ebp),%eax       # n*i
    addl 16(%ebp),%eax        # n*i+j
    movl (%edx,%eax,4),%eax   # Mem[a+4*(i*n+j)]

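A small self-contained sketch of the dynamic matrix in use. The writer var_set is a hypothetical helper not present in the slides; it is included only so that an element can be stored and read back through the explicit i*n + j index.

```c
#include <stdlib.h>

/* Allocate a zero-filled n x n matrix as one flat block; NULL on failure. */
int *new_var_matrix(int n)
{
    return calloc((size_t)n * (size_t)n, sizeof(int));
}

/* Read element (i, j) through the explicit row-major index i*n + j. */
int var_ele(const int *a, int i, int j, int n)
{
    return a[i * n + j];
}

/* Hypothetical helper (not in the slides): write element (i, j). */
void var_set(int *a, int i, int j, int n, int value)
{
    a[i * n + j] = value;
}
```

The caller owns the block and releases it with free.
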
Dynamic Array Multiplication

    /* Compute element i,k of variable matrix product */
    int var_prod_ele(int *a, int *b, int i, int k, int n)
    {
        int j;
        int result = 0;
        for (j = 0; j < n; j++)
            result += a[i*n+j] * b[j*n+k];
        return result;
    }

Without Optimizations
  Multiplies: 3
    2 for subscripts
    1 for data
  Adds: 4
    2 for array indexing
    1 for loop index
    1 for data

Optimizing Dynamic Array Multiplication

Optimizations
  Performed when set optimization level to -O2
Code Motion
  Expression i*n can be computed outside loop
Strength Reduction
  Incrementing j has effect of incrementing j*n+k by n
Operations count
  4 adds, 1 mult
Compiler can optimize regular access patterns

Before

    {
        int j;
        int result = 0;
        for (j = 0; j < n; j++)
            result += a[i*n+j] * b[j*n+k];
        return result;
    }

After

    {
        int j;
        int result = 0;
        int iTn = i*n;
        int jTnPk = k;
        for (j = 0; j < n; j++) {
            result += a[iTn+j] * b[jTnPk];
            jTnPk += n;
        }
        return result;
    }

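The two loop bodies can be wrapped as complete functions and compared on a small input. In this sketch the optimized variant is named var_prod_ele_opt and the 2x2 matrices ma and mb are hypothetical test data; the transformation is applied by hand here, whereas the slide attributes it to the compiler at -O2.

```c
/* Straightforward version: two index multiplies per iteration. */
int var_prod_ele(const int *a, const int *b, int i, int k, int n)
{
    int result = 0;
    for (int j = 0; j < n; j++)
        result += a[i * n + j] * b[j * n + k];
    return result;
}

/* Hand-applied code motion and strength reduction. */
int var_prod_ele_opt(const int *a, const int *b, int i, int k, int n)
{
    int result = 0;
    int iTn = i * n;     /* code motion: computed once, outside the loop */
    int jTnPk = k;       /* tracks j*n + k, starting from j = 0 */
    for (int j = 0; j < n; j++) {
        result += a[iTn + j] * b[jTnPk];
        jTnPk += n;      /* strength reduction: add n instead of multiplying */
    }
    return result;
}

/* Hypothetical 2x2 test data, stored row-major. */
const int ma[4] = { 1, 2, 3, 4 };
const int mb[4] = { 5, 6, 7, 8 };
```
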
Structures

Concept
  Contiguously-allocated region of memory
  Refer to members within structure by names
  Members may be of different types

    struct rec {
        int i;
        int a[3];
        int *p;
    };

Memory Layout: i at offset 0, a at offset 4, p at offset 16, end at offset 20

Accessing Structure Member

    void set_i(struct rec *r, int val)
    {
        r->i = val;
    }

IA32 Assembly

    # %eax = val
    # %edx = r
    movl %eax,(%edx)   # Mem[r] = val

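The member offsets in the layout can be queried with the standard offsetof macro. The sketch below also spells r->i = val as base address plus offset; set_i_by_offset is a hypothetical name. On x86-64 the offsets of i, a, and p match the slide (0, 4, 16), but the structure ends at 24 rather than 20 because pointers are 8 bytes.

```c
#include <stddef.h>

struct rec {
    int i;
    int a[3];
    int *p;
};

/* Same effect as r->i = val, written as base address plus member offset. */
void set_i_by_offset(struct rec *r, int val)
{
    *(int *)((char *)r + offsetof(struct rec, i)) = val;
}
```
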
Generating Pointer to Structure Member

    struct rec {
        int i;
        int a[3];
        int *p;
    };

Memory layout: i at offset 0, a at offset 4, p at offset 16, end at offset 20; r points to offset 0, and r+4+4*idx points to element idx of a

Generating Pointer to Array Element
  Offset of each structure member determined at compile time

    int *find_a(struct rec *r, int idx)
    {
        return &r->a[idx];
    }

    # %ecx = idx
    # %edx = r
    leal 0(,%ecx,4),%eax     # 4*idx
    leal 4(%eax,%edx),%eax   # r+4*idx+4

Structure Referencing (Cont.)

C Code

    struct rec {
        int i;
        int a[3];
        int *p;
    };

    void set_p(struct rec *r)
    {
        r->p = &r->a[r->i];
    }

    # %edx = r
    movl (%edx),%ecx         # r->i
    leal 0(,%ecx,4),%eax     # 4*(r->i)
    leal 4(%edx,%eax),%eax   # r+4+4*(r->i)
    movl %eax,16(%edx)       # Update r->p

Memory layout: i at offset 0, a at offset 4, p at offset 16, end at offset 20
After set_p, p points to element i of a

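The address arithmetic behind find_a and set_p can be written out in C with explicit byte offsets. In this sketch find_a_bytes is a hypothetical variant of find_a that mirrors the leal pair (base + offset of a + 4*idx); it assumes 0 <= idx <= 3 and, for set_p, that r->i is a valid index into a.

```c
#include <stddef.h>

struct rec {
    int i;
    int a[3];
    int *p;
};

/* Address of r->a[idx] computed as r + offsetof(a) + sizeof(int)*idx. */
int *find_a_bytes(struct rec *r, int idx)
{
    char *base = (char *)r;
    return (int *)(base + offsetof(struct rec, a) + sizeof(int) * (size_t)idx);
}

/* As on the slide: make p point at element r->i of a. */
void set_p(struct rec *r)
{
    r->p = &r->a[r->i];
}
```
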