– 1 –CSCE 212H Spring 2012
Lecture 5: Assembly Language
Topics
  Assembly Language
  Lab 2
January 27, 2011
CSCE 212 Computer Architecture
– 2 –CSCE 212H Spring 2012
Overview
Last Time
  Covered through slides 11… of Lecture 4
  Floating point: review, rounding to even, multiplication, addition
  Compilation steps
New
  Architecture (Fred Brooks): Assembly Programmer's View
  Address modes
  Swap
Next Time: Lab02 - Datalab
– 3 –CSCE 212H Spring 2012
Pop Quiz - denormals
1. What is the representation of the largest denormalized IEEE float (in binary)?
   Denormal: expField = 0000 0000
   Largest denormal: all frac bits are 1, i.e., frac = 111 1111 … 1111
   Largest denormal representation = 0 0000 0000 111 … 1
2. In hex? 0x007FFFFF
3. What is its value as an expression, i.e., (-1)^sign * m * 2^exp?
   Largest denormal's value = 0.111 1111 … 1111 x 2^(1-BIAS) = (1 - 2^-23) x 2^-126
4. How many floats are there between 1.0 and 2.0?
– 4 –CSCE 212H Spring 2012
5. What is a/the representation of minus infinity?
   expField = 0xFF, sign bit = 1, frac = 0 (23 zeroes)
   -infinity = 0xFF80 0000
6. In C are there more ints or doubles?
   #doubles = (distinct exp) * (number of doubles with same exp)
   #doubles = (2^11 - 2) * (2^52) = 2^63 - 2^53
   (Note: this does not count NaN or +/- infinity as a double. It counts only the positive normalized values and ignores 0 and the denormals.)
7. In math are there more rationals than integers?
   Argument for no: the sets have the same cardinality. Both are countably infinite, whereas the reals are uncountable.
   Argument for yes: every integer is a rational, and ½ is a rational that is not an integer.
   So, given the way the question is worded, what is the best answer?
8. Extra credit for pop quiz 1: what is aleph-0? http://mathworld.wolfram.com/Aleph-0.html
– 5 –CSCE 212H Spring 2012
Pop Quiz - FP multiplication
1. If x = 1.5, what is the representation of x as a float (in hex)?
2. And if y = (2-ε)*2^37 (note: 1.111…1 = (2-ε)), then what is the frac field of the float z = x*y?
3. And what is the exponent (not the exponent field) of z?
4. What is the largest gap between consecutive floats?
Note
       1.11…11     (24 bits)
   x   1.1
   -----------
        111…11     (24 bits)
       111…11      (24 bits)
   -----------
      10.11…101    (26 bits?)
– 6 –CSCE 212H Spring 2012
Printf conversion specifications
Example: % -#0 12 .4 L d
  %      Start specification
  -#0    Flags
  12     Minimum field width
  .4     Precision
  L      Size modifier
  d      Conversion type
Figure taken from page 368 of "C: A Reference Manual" by Harbison and Steele
– 7 –CSCE 212H Spring 2012
CYGWIN
Unix emulation under Windows
  Provides a bash window
  .bash_profile
Others: CH, …
Other direction: Wine
Downloading CYGWIN
  Google CYGWIN
  startxwin: runs an X Window server under CYGWIN
Virtual machines: VirtualBox, VMware
– 8 –CSCE 212H Spring 2012
Homework 1 problem 2.90
/usr/include/math.h
  # define M_PI 3.14159265358979323846 /* pi */
Hex rep given in problem: pi = 0x40490fdb
Binary rep: 0100 0000 0100 1001 0000 1111 1101 1011
sign +, ExpField = 1000 0000 = 128, Exp = 128 - BIAS = 1
Binary val = 1.100 1001 0000 1111 1101 1011 * 2^1
Now 22/7
– 9 –CSCE 212H Spring 2012
Setting Variables and aliases in .bash_profile
PATH=$HOME/bin:${PATH:-/usr/bin:.}
PATH=$PATH:/usr/local/simplescalar/bin:/usr/local/simplescalar/simplesim-3.0
# list of directories separated by colons, used to specify where to find commands
export PATH
PS1="`hostname`> "
w5=/class/csce574-001/web/
w=/class/csce212-501/Code/
alias h=history
alias lsl="ls -lrt | grep ^d"
# later you can use the variables in commands like "ls $w"
– 10 –CSCE 212H Spring 2012
Intel Registers figure 3.2
Intel microprocessor evolution
  4004, 8008, 8080, 8086, 80x86
Backward compatibility
Registers of 8086
  A: AH - AL
  C: CH - CL
  D: DH - DL
  B: BH - BL
  SI, DI, SP, BP
– 11 –CSCE 212H Spring 2012
Homework page 105 of text
1. 2.56
2. 2.57
3. 2.58: give hex representations and the value as an expression of the form 1.x_-1 x_-2 … x_-n * 2^exp
– 12 –CSCE 212H Spring 2012
2.56
Fill in the return value for the following procedure that tests whether its first argument is greater than or equal to its second. Assume the function f2u returns an unsigned 32-bit number having the same bit representation as its floating-point argument. You can assume that neither argument is NaN. The two flavors of zero, +0 and -0, are considered equal.
int float_ge(float x, float y) {
    unsigned ux = f2u(x);
    unsigned uy = f2u(y);
    /* Get the sign bits */
    unsigned sx = ux >> 31;
    unsigned sy = uy >> 31;
    /* Give an expression using only ux, uy, sx and sy */
    return /* … */ ;
}
– 13 –CSCE 212H Spring 2012
2.57
Given a floating point format with a k-bit exponent and an n-bit fraction, write formulas for the exponent E, significand M, the fraction f, and the value V for the quantities that follow. In addition, describe the bit representation.
A. The number 5.0.
B. The largest odd integer that can be represented exactly.
C. The reciprocal of the smallest positive normalized value.
– 14 –CSCE 212H Spring 2012
2.58' - changed table columns
Intel-compatible processors also support an "extended precision" floating-point format with an 80-bit word divided into a sign bit, k = 15 exponent bits, a single integer bit, and n = 63 fraction bits. The integer bit is an explicit copy of the implied bit in the IEEE floating-point representation. That is, it equals 1 for normalized values and 0 for denormalized values. Fill in the following table giving the appropriate values of some "interesting" numbers in this format:
Description              Representation    Value as Expression
Smallest denormalized
Smallest normalized
Largest normalized
– 15 –CSCE 212H Spring 2012
New Species: IA64
Name        Date    Transistors
Itanium     2001    10M
  Extends to IA64, a 64-bit architecture
  Radically new instruction set designed for high performance
  Will be able to run existing IA32 programs
    On-board "x86 engine"
  Joint project with Hewlett-Packard
Itanium 2   2002    221M
  Big performance boost
– 16 –CSCE 212H Spring 2012
Assembly Programmer's View
Programmer-Visible State
  EIP: Program counter
    Address of next instruction
  Register file
    Heavily used program data
  Condition codes
    Store status information about most recent arithmetic operation
    Used for conditional branching
  Memory
    Byte addressable array
    Code, user data, (some) OS data
    Includes stack used to support procedures
[Figure: the CPU (EIP, registers, condition codes) sends addresses to memory (object code, program data, OS data, stack) and exchanges data and instructions with it.]
– 17 –CSCE 212H Spring 2012
Moving Data
movl Source,Dest: Move 4-byte ("long") word
  Lots of these in typical code
Operand Types
  Immediate: Constant integer data
    Like C constant, but prefixed with '$'
    E.g., $0x400, $-533
    Encoded with 1, 2, or 4 bytes
  Register: One of 8 integer registers
    But %esp and %ebp reserved for special use
    Others have special uses for particular instructions
  Memory: 4 consecutive bytes of memory
    Various "address modes"
Integer registers: %eax, %edx, %ecx, %ebx, %esi, %edi, %esp, %ebp
– 18 –CSCE 212H Spring 2012
Simple Addressing Modes
Normal          (R)      Mem[Reg[R]]
  Register R specifies memory address
  movl (%ecx),%eax
Displacement    D(R)     Mem[Reg[R]+D]
  Register R specifies start of memory region
  Constant displacement D specifies offset
  movl 8(%ebp),%edx
– 19 –CSCE 212H Spring 2012
Using Simple Addressing Modes
void swap(int *xp, int *yp)
{
    int t0 = *xp;
    int t1 = *yp;
    *xp = t1;
    *yp = t0;
}

swap:
    # Set Up
    pushl %ebp
    movl %esp,%ebp
    pushl %ebx
    # Body
    movl 12(%ebp),%ecx
    movl 8(%ebp),%edx
    movl (%ecx),%eax
    movl (%edx),%ebx
    movl %eax,(%edx)
    movl %ebx,(%ecx)
    # Finish
    movl -4(%ebp),%ebx
    movl %ebp,%esp
    popl %ebp
    ret
– 20 –CSCE 212H Spring 2012
Understanding Swap
void swap(int *xp, int *yp)
{
    int t0 = *xp;
    int t1 = *yp;
    *xp = t1;
    *yp = t0;
}

Register   Variable
%ecx       yp
%edx       xp
%eax       t1
%ebx       t0

Stack (offset from %ebp)
  Offset   Contents
    12     yp
     8     xp
     4     Rtn adr
     0     Old %ebp   <- %ebp
    -4     Old %ebx

movl 12(%ebp),%ecx   # ecx = yp
movl 8(%ebp),%edx    # edx = xp
movl (%ecx),%eax     # eax = *yp (t1)
movl (%edx),%ebx     # ebx = *xp (t0)
movl %eax,(%edx)     # *xp = eax
movl %ebx,(%ecx)     # *yp = ebx
– 21 –CSCE 212H Spring 2012
Understanding Swap
%ebp = 0x104 throughout. Memory before the body executes:
  Address   Offset   Contents
  0x124              123
  0x120              456
  0x11c
  0x118
  0x114
  0x110     12       yp = 0x120
  0x10c      8       xp = 0x124
  0x108      4       Rtn adr
  0x104      0                     <- %ebp
  0x100     -4

State after each instruction:
  Instruction                              %eax   %edx    %ecx    %ebx   Mem[0x124]   Mem[0x120]
  (before body)                                                          123          456
  movl 12(%ebp),%ecx   # ecx = yp                         0x120          123          456
  movl 8(%ebp),%edx    # edx = xp                 0x124   0x120          123          456
  movl (%ecx),%eax     # eax = *yp (t1)    456    0x124   0x120          123          456
  movl (%edx),%ebx     # ebx = *xp (t0)    456    0x124   0x120   123    123          456
  movl %eax,(%edx)     # *xp = eax         456    0x124   0x120   123    456          456
  movl %ebx,(%ecx)     # *yp = ebx         456    0x124   0x120   123    456          123
– 28 –CSCE 212H Spring 2012
Indexed Addressing Modes
Most General Form
  D(Rb,Ri,S)     Mem[Reg[Rb]+S*Reg[Ri]+D]
    D: Constant "displacement" 1, 2, or 4 bytes
    Rb: Base register: Any of 8 integer registers
    Ri: Index register: Any, except for %esp
      Unlikely you'd use %ebp, either
    S: Scale: 1, 2, 4, or 8
Special Cases
  (Rb,Ri)        Mem[Reg[Rb]+Reg[Ri]]
  D(Rb,Ri)       Mem[Reg[Rb]+Reg[Ri]+D]
  (Rb,Ri,S)      Mem[Reg[Rb]+S*Reg[Ri]]
– 29 –CSCE 212H Spring 2012
Address Computation Examples
  %edx = 0xf000
  %ecx = 0x100

  Expression        Computation         Address
  0x8(%edx)         0xf000 + 0x8        0xf008
  (%edx,%ecx)       0xf000 + 0x100      0xf100
  (%edx,%ecx,4)     0xf000 + 4*0x100    0xf400
  0x80(,%edx,2)     2*0xf000 + 0x80     0x1e080
– 30 –CSCE 212H Spring 2012
Address Computation Instruction
leal Src,Dest
  Src is address mode expression
  Set Dest to address denoted by expression
Uses
  Computing address without doing memory reference
    E.g., translation of p = &x[i];
  Computing arithmetic expressions of the form x + k*y
    k = 1, 2, 4, or 8.
– 31 –CSCE 212H Spring 2012
Some Arithmetic Operations
Two Operand Instructions
  Format             Computation
  addl Src,Dest      Dest = Dest + Src
  subl Src,Dest      Dest = Dest - Src
  imull Src,Dest     Dest = Dest * Src
  sall Src,Dest      Dest = Dest << Src     Also called shll
  sarl Src,Dest      Dest = Dest >> Src     Arithmetic
  shrl Src,Dest      Dest = Dest >> Src     Logical
  xorl Src,Dest      Dest = Dest ^ Src
  andl Src,Dest      Dest = Dest & Src
  orl Src,Dest       Dest = Dest | Src
– 32 –CSCE 212H Spring 2012
Some Arithmetic Operations
One Operand Instructions
  Format         Computation
  incl Dest      Dest = Dest + 1
  decl Dest      Dest = Dest - 1
  negl Dest      Dest = - Dest
  notl Dest      Dest = ~ Dest
– 33 –CSCE 212H Spring 2012
Using leal for Arithmetic Expressions
int arith(int x, int y, int z)
{
    int t1 = x+y;
    int t2 = z+t1;
    int t3 = x+4;
    int t4 = y * 48;
    int t5 = t3 + t4;
    int rval = t2 * t5;
    return rval;
}

arith:
    # Set Up
    pushl %ebp
    movl %esp,%ebp
    # Body
    movl 8(%ebp),%eax
    movl 12(%ebp),%edx
    leal (%edx,%eax),%ecx
    leal (%edx,%edx,2),%edx
    sall $4,%edx
    addl 16(%ebp),%ecx
    leal 4(%edx,%eax),%eax
    imull %ecx,%eax
    # Finish
    movl %ebp,%esp
    popl %ebp
    ret
– 34 –CSCE 212H Spring 2012
Understanding arith
int arith(int x, int y, int z)
{
    int t1 = x+y;
    int t2 = z+t1;
    int t3 = x+4;
    int t4 = y * 48;
    int t5 = t3 + t4;
    int rval = t2 * t5;
    return rval;
}

Stack (offset from %ebp)
  Offset   Contents
    16     z
    12     y
     8     x
     4     Rtn adr
     0     Old %ebp   <- %ebp

movl 8(%ebp),%eax         # eax = x
movl 12(%ebp),%edx        # edx = y
leal (%edx,%eax),%ecx     # ecx = x+y (t1)
leal (%edx,%edx,2),%edx   # edx = 3*y
sall $4,%edx              # edx = 48*y (t4)
addl 16(%ebp),%ecx        # ecx = z+t1 (t2)
leal 4(%edx,%eax),%eax    # eax = 4+t4+x (t5)
imull %ecx,%eax           # eax = t5*t2 (rval)
– 35 –CSCE 212H Spring 2012
Understanding arith
int arith(int x, int y, int z)
{
    int t1 = x+y;
    int t2 = z+t1;
    int t3 = x+4;
    int t4 = y * 48;
    int t5 = t3 + t4;
    int rval = t2 * t5;
    return rval;
}

# eax = x
    movl 8(%ebp),%eax
# edx = y
    movl 12(%ebp),%edx
# ecx = x+y (t1)
    leal (%edx,%eax),%ecx
# edx = 3*y
    leal (%edx,%edx,2),%edx
# edx = 48*y (t4)
    sall $4,%edx
# ecx = z+t1 (t2)
    addl 16(%ebp),%ecx
# eax = 4+t4+x (t5)
    leal 4(%edx,%eax),%eax
# eax = t5*t2 (rval)
    imull %ecx,%eax
– 36 –CSCE 212H Spring 2012
Another Example
int logical(int x, int y)
{
    int t1 = x^y;
    int t2 = t1 >> 17;
    int mask = (1<<13) - 7;
    int rval = t2 & mask;
    return rval;
}

logical:
    # Set Up
    pushl %ebp
    movl %esp,%ebp
    # Body
    movl 8(%ebp),%eax      # eax = x
    xorl 12(%ebp),%eax     # eax = x^y (t1)
    sarl $17,%eax          # eax = t1>>17 (t2)
    andl $8185,%eax        # eax = t2 & 8185
    # Finish
    movl %ebp,%esp
    popl %ebp
    ret

2^13 = 8192, 2^13 - 7 = 8185
– 37 –CSCE 212H Spring 2012
CISC Properties
Instruction can reference different operand types
  Immediate, register, memory
Arithmetic operations can read/write memory
Memory reference can involve complex computation
  Rb + S*Ri + D
  Useful for arithmetic expressions, too
Instructions can have varying lengths
  IA32 instructions can range from 1 to 15 bytes
– 38 –CSCE 212H Spring 2012
Summary: Abstract Machines
Machine Models             Data                              Control
C                          1) char                           1) loops
  (mem, proc)              2) int, float                     2) conditionals
                           3) double                         3) switch
                           4) struct, array                  4) Proc. call
                           5) pointer                        5) Proc. return
Assembly                   1) byte                           3) branch/jump
  (mem, regs, alu,         2) 2-byte word                    4) call
   processor, Stack,       3) 4-byte long word               5) ret
   Cond. Codes)            4) contiguous byte allocation
                           5) address of initial byte
– 39 –CSCE 212H Spring 2012
Pentium Pro (P6)
History
  Announced in Feb. '95
  Basis for Pentium II, Pentium III, and Celeron processors
  Pentium 4 similar idea, but different details
Features
  Dynamically translates instructions to more regular format
    Very wide, but simple instructions
  Executes operations in parallel
    Up to 5 at once
  Very deep pipeline
    12-18 cycle latency
– 40 –CSCE 212H Spring 2012
PentiumPro Block Diagram
[Figure: PentiumPro block diagram. Source: Microprocessor Report, 2/16/95]
– 41 –CSCE 212H Spring 2012
PentiumPro Operation
Translates instructions dynamically into "Uops"
  118 bits wide
  Holds operation, two sources, and destination
Executes Uops with "Out of Order" engine
  Uop executed when
    Operands available
    Functional unit available
  Execution controlled by "Reservation Stations"
    Keeps track of data dependencies between uops
    Allocates resources
Consequences
  Indirect relationship between IA32 code & what actually gets executed
  Tricky to predict / optimize performance at assembly level
– 42 –CSCE 212H Spring 2012
Whose Assembler?
Intel/Microsoft Differs from GAS
  Operands listed in opposite order
    mov Dest, Src              movl Src, Dest
  Constants not preceded by '$'; hex denoted with 'h' at end
    100h                       $0x100
  Operand size indicated by operands rather than operator suffix
    sub                        subl
  Addressing format shows effective address computation
    [eax*4+100h]               0x100(,%eax,4)

Intel/Microsoft Format                 GAS/Gnu Format
lea eax,[ecx+ecx*2]                    leal (%ecx,%ecx,2),%eax
sub esp,8                              subl $8,%esp
cmp dword ptr [ebp-8],0                cmpl $0,-8(%ebp)
mov eax,dword ptr [eax*4+100h]         movl 0x100(,%eax,4),%eax
– 43 –CSCE 212H Spring 2012
Overview
Last Time
  Lecture 03: slides 1-14, 16?
  Denormalized floats
  Special floats, Infinity, NaN
  Tiny floats
  Error in show bytes code!!!
New
  Finish denormals from last time
  Special floats, Infinity, NaN
  Tiny floats
  Rounding, multiplication, addition
  Lab 1 comments
    Libraries
    Masks
    Unions
  Assembly Language
– 44 –CSCE 212H Spring 2012
211 Review of Base-r to Decimal Conversions
Converting base-r to decimal by definition
  $(d_n d_{n-1} d_{n-2} \ldots d_2 d_1 d_0)_r = d_n r^n + d_{n-1} r^{n-1} + \cdots + d_2 r^2 + d_1 r^1 + d_0 r^0$
Example
  $\mathrm{4F0C.A}_{16} = 4 \cdot 16^3 + F \cdot 16^2 + 0 \cdot 16^1 + C \cdot 16^0 + A \cdot 16^{-1}$
  = 4*4096 + 15*256 + 0 + 12*1 + 10/16
  = 16384 + 3840 + 12 + 5/8
  = 20236.625
– 45 –CSCE 212H Spring 2012
211 Review of Decimal to Base-r Conversion
Repeated division algorithm
Justification:
  $(d_n d_{n-1} d_{n-2} \ldots d_2 d_1 d_0)_r = d_n r^n + d_{n-1} r^{n-1} + \cdots + d_2 r^2 + d_1 r^1 + d_0 r^0$
Dividing each side by r yields
  $(d_n d_{n-1} d_{n-2} \ldots d_2 d_1 d_0)_r / r = d_n r^{n-1} + d_{n-1} r^{n-2} + \cdots + d_2 r^1 + d_1 r^0 + d_0 r^{-1}$
So $d_0$ is the remainder of the first division, and the quotient is $q_1 = d_n r^{n-1} + \cdots + d_2 r^1 + d_1 r^0$.
  $q_1 / r = d_n r^{n-2} + d_{n-1} r^{n-3} + \cdots + d_3 r^1 + d_2 r^0 + d_1 r^{-1}$
So $d_1$ is the remainder of the next division,
and $d_2$ is the remainder of the next division,
…
– 46 –CSCE 212H Spring 2012
211 Review of Decimal to Base-r Conversion Example
Repeated division algorithm example
Convert 4343 to hex
  4343/16 = 271   remainder = 7
  271/16  = 16    remainder = 15
  16/16   = 1     remainder = 0
  1/16    = 0     remainder = 1
So 4343_10 = 10F7_16
To check the answer, convert back to decimal
  10F7_16 = 1*16^3 + 0*16^2 + 15*16 + 7*1 = 4096 + 0 + 240 + 7 = 4343
– 47 –CSCE 212H Spring 2012
211 Review of Decimal Fractions to Hex
Repeatedly multiply the fractional portion by the base (16); the integer part of each product is the next digit.
  .884 * 16 = 14.144 (14 = E in hex)   .884_10 ~ .E_16
  .144 * 16 = 2.304                    .884_10 ~ .E2_16
  .304 * 16 = 4.864                    .884_10 ~ .E24_16
  .864 * 16 = 13.824                   .884_10 ~ .E24D_16
  .824 * 16 = 13.184                   .884_10 ~ .E24DD_16
  .184 * 16 = 2.944
So if we are rounding to five hex digits, since 2 < ½ * 16 we round down, and
  .884_10 = .E24DD_16
– 48 –CSCE 212H Spring 2012
Convert -4343.884 to IEEE 754 float
4343.884_10 = 10F7.E24DD_16, now converting to binary
  0001 0000 1111 0111.1110 0010 0100 1101 1101 * 2^0
  = 1.0000 1111 0111 1110 0010 0100 1101 1101 * 2^12
So ExpField = 12 + 127 = 139 = 128 + 8 + 2 + 1 = 1000 1011
Sign bit = 1 (to represent a negative), and the fraction is the first 23 bits above, almost, because we round.
  .0000 1111 0111 1110 0010 010 ^ 0 1101 1101
The rounding rule usually is round to nearest (round to even if exactly ½).
In this case the next bit is a 0, so we round down.
Rep = 1 - 1000 1011 - 0000 1111 0111 1110 0010 010
    = 1100 0101 1000 0111 1011 1111 0001 0010
    = 0x C 5 8 7 B F 1 2
– 49 –CSCE 212H Spring 2012
Pop Quiz - Normal floats
Value
  float F = 212;   212_10 = ______ (base 2)
Significand
  M = 1.______ (base 2)
  frac = ______ (base 2)
Exponent
  E = ______   Bias = ______
  Exp = ______ = ______ (base 2)
Floating Point Representation:
  Hex:
  Binary:
  exponent:
  212:
– 50 –CSCE 212H Spring 2012
Jan 25 Pop Quiz - denormals
1. What is the representation of the largest denormalized IEEE float (in binary)?
2. In hex?
3. What is its value as an expression, i.e., (-1)^sign * m * 2^exp?
4. How many floats are there between 1.0 and 2.0?
5. What is a/the representation of minus infinity?
6. In C are there more ints or doubles?
7. In math are there more rationals than integers?
8. Extra credit for pop quiz 1: what is aleph-0?
– 51 –CSCE 212H Spring 2012
Lab01
msb.c: extract and print most significant byte
  Unions
  Pointers
  Masks and such
sin.c: using math library
  gcc sin.c -lm
– 52 –CSCE 212H Spring 2012
label_show_bytes
/* pointer is assumed to be a typedef for unsigned char * */
void label_show_bytes(char *label, pointer start, int len)
{
    int i;
    printf("%s ", label);
    for (i = 0; i < len; i++)
        /* %p expects a void * and usually prints its own 0x prefix */
        printf("%p\t0x%.2x", (void *)(start + i), start[i]);
    printf("\n");
}
– 53 –CSCE 212H Spring 2012
Unions and such
float f, pi;
union {
    float fl;
    unsigned int ui;
} un;
pi = 3.14159265358979323846; /* what precision! */
un.fl = -1*pi;
printf("float %f assigned to unsigned %u\n", un.fl, un.ui);
label_show_bytes("un.fl", (pointer)&un.fl, 4);
label_show_bytes("un.ui", (pointer)&un.ui, 4);
– 54 –CSCE 212H Spring 2012
Pointers
Declarations
Dereferences
Address-of operator
Explicit casting
– 55 –CSCE 212H Spring 2012
Masks and such
– 56 –CSCE 212H Spring 2012
Math library
/usr/lib
ar t /usr/lib/libm.a
gcc sin.c -lm
– 57 –CSCE 212H Spring 2012
FP Multiplication
Operands
  (-1)^s1 M1 2^E1 * (-1)^s2 M2 2^E2
Exact Result
  (-1)^s M 2^E
  Sign s: s1 ^ s2
  Significand M: M1 * M2
  Exponent E: E1 + E2
Fixing
  If M ≥ 2, shift M right, increment E
  If E out of range, overflow
  Round M to fit frac precision
Implementation
  Biggest chore is multiplying significands
– 58 –CSCE 212H Spring 2012
FP Addition
Operands
  (-1)^s1 M1 2^E1
  (-1)^s2 M2 2^E2
  Assume E1 > E2
Exact Result
  (-1)^s M 2^E
  Sign s, significand M: Result of signed align & add
  Exponent E: E1
Fixing
  If M ≥ 2, shift M right, increment E
  If M < 1, shift M left k positions, decrement E by k
  Overflow if E out of range
  Round M to fit frac precision
[Figure: (-1)^s2 M2 is shifted right by E1-E2 positions to align with (-1)^s1 M1; the aligned values are added to give (-1)^s M.]
– 59 –CSCE 212H Spring 2012
Floating Point in C
C Guarantees Two Levels
  float     single precision
  double    double precision
Conversions
  Casting between int, float, and double changes numeric values
  Double or float to int
    Truncates fractional part
    Like rounding toward zero
    Not defined when out of range
      Generally saturates to TMin or TMax
  int to double
    Exact conversion, as long as int has ≤ 53 bit word size
  int to float
    Will round according to rounding mode
– 60 –CSCE 212H Spring 2012
IEEE 754 Rounding Algorithms
1. Round to nearest, ties to even: rounds to the nearest value; if the number falls midway it is rounded to the nearest value with an even (zero) least significant bit, which occurs 50% of the time; this is the default algorithm for binary floating-point and the recommended default for decimal
2. Round to nearest, ties away from zero: rounds to the nearest value; if the number falls midway it is rounded to the nearest value above (for positive numbers) or below (for negative numbers)
3. Round toward 0: directed rounding towards zero (also called truncation)
4. Round toward +∞: directed rounding towards positive infinity
5. Round toward -∞: directed rounding towards negative infinity
http://en.wikipedia.org/wiki/IEEE_754
– 61 –CSCE 212H Spring 2012
Ariane 5
  Exploded 37 seconds after liftoff
  Cargo worth $500 million
Why
  Computed horizontal velocity as floating point number
  Converted to 16-bit integer
  Worked OK for Ariane 4
  Overflowed for Ariane 5
    Used same software
– 62 –CSCE 212H Spring 2012
IA32 Processors
Totally Dominate Computer Market
Evolutionary Design
  Starting in 1978 with 8086
  Added more features as time goes on
  Still support old features, although obsolete
Complex Instruction Set Computer (CISC)
  Many different instructions with many different formats
    But, only small subset encountered with Linux programs
  Hard to match performance of Reduced Instruction Set Computers (RISC)
  But, Intel has done just that!
– 63 –CSCE 212H Spring 2012
X86 Evolution: Programmer's View
Name     Date    Transistors
8086     1978    29K
  16-bit processor. Basis for IBM PC & DOS
  Limited to 1MB address space. DOS only gives you 640K
80286    1982    134K
  Added elaborate, but not very useful, addressing scheme
  Basis for IBM PC-AT and Windows
386      1985    275K
  Extended to 32 bits. Added "flat addressing"
  Capable of running Unix
  Linux/gcc uses no instructions introduced in later models
X86 Evolution: Programmer’s View
Name         Date  Transistors
486          1989  1.9M
Pentium      1993  3.1M
Pentium/MMX  1997  4.5M
  Added special collection of instructions for operating on 64-bit vectors of 1, 2, or 4 byte integer data
PentiumPro   1995  6.5M
  Added conditional move instructions
  Big change in underlying microarchitecture
X86 Evolution: Programmer’s View
Name         Date  Transistors
Pentium III  1999  8.2M
  Added “streaming SIMD” instructions for operating on 128-bit vectors of 1, 2, or 4 byte integer or floating point data
  Our fish machines
Pentium 4    2001  42M
  Added 8-byte formats and 144 new instructions for streaming SIMD mode
X86 Evolution: Clones
Advanced Micro Devices (AMD)
  Historically
    AMD has followed just behind Intel
    A little bit slower, a lot cheaper
  Recently
    Recruited top circuit designers from Digital Equipment Corp.
    Exploited fact that Intel distracted by IA64
    Now are close competitors to Intel
  Developing own extension to 64 bits
X86 Evolution: Clones
Transmeta
  Recent start-up
    Employer of Linus Torvalds
  Radically different approach to implementation
    Translates x86 code into “Very Long Instruction Word” (VLIW) code
    High degree of parallelism
  Shooting for low-power market
New Species: IA64
Name       Date  Transistors
Itanium    2001  10M
  Extends to IA64, a 64-bit architecture
  Radically new instruction set designed for high performance
  Will be able to run existing IA32 programs
    On-board “x86 engine”
  Joint project with Hewlett-Packard
Itanium 2  2002  221M
  Big performance boost
Assembly Programmer’s View
Programmer-Visible State
  EIP: Program Counter
    Address of next instruction
  Register File
    Heavily used program data
  Condition Codes
    Store status information about most recent arithmetic operation
    Used for conditional branching
  Memory
    Byte addressable array
    Code, user data, (some) OS data
    Includes stack used to support procedures
[Figure: CPU (EIP, Registers, Condition Codes) connected to Memory (Object Code, Program Data, OS Data, Stack); the CPU sends Addresses, and Data and Instructions travel between the two]
Turning C into Object Code
  Code in files p1.c p2.c
  Compile with command: gcc -O p1.c p2.c -o p
    Use optimizations (-O)
    Put resulting binary in file p

  text    C program (p1.c p2.c)
            Compiler (gcc -S)
  text    Asm program (p1.s p2.s)
            Assembler (gcc or as)
  binary  Object program (p1.o p2.o) + Static libraries (.a)
            Linker (gcc or ld)
  binary  Executable program (p)
Compiling Into Assembly
C Code
  int sum(int x, int y)
  {
    int t = x+y;
    return t;
  }
Generated Assembly
  _sum:
    pushl %ebp            # save caller's frame pointer
    movl %esp,%ebp        # set up frame pointer for sum
    movl 12(%ebp),%eax    # eax = y
    addl 8(%ebp),%eax     # eax = y + x (t, the return value)
    movl %ebp,%esp        # restore stack pointer
    popl %ebp             # restore caller's frame pointer
    ret
Obtain with command
  gcc -O -S code.c
Produces file code.s
Assembly Characteristics
Minimal Data Types
  “Integer” data of 1, 2, or 4 bytes
    Data values
    Addresses (untyped pointers)
  Floating point data of 4, 8, or 10 bytes
  No aggregate types such as arrays or structures
    Just contiguously allocated bytes in memory
Primitive Operations
  Perform arithmetic function on register or memory data
  Transfer data between memory and register
    Load data from memory into register
    Store register data into memory
  Transfer control
    Unconditional jumps to/from procedures
    Conditional branches
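The point that aggregates are "just contiguously allocated bytes" can be seen from C itself. In this sketch, struct pair and read_i32_at are hypothetical names introduced for illustration; the helper reaches a field or array element using nothing but a base address and a byte offset, which is all the machine level has to work with.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical aggregate used only for this illustration. */
struct pair {
    int32_t first;
    int32_t second;
};

/* Reads the 4-byte integer located `offset` bytes past `base`.
 * No type information about the aggregate is needed, only the offset. */
int32_t read_i32_at(const void *base, size_t offset)
{
    int32_t value;
    memcpy(&value, (const unsigned char *)base + offset, sizeof value);
    return value;
}
```

For a struct pair p, read_i32_at(&p, offsetof(struct pair, second)) yields p.second, and for an int32_t array a, read_i32_at(a, 2 * sizeof(int32_t)) yields a[2].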
Object Code
Code for sum
  0x401040 <sum>:
    0x55 0x89 0xe5 0x8b 0x45 0x0c 0x03 0x45 0x08 0x89 0xec 0x5d 0xc3
  • Total of 13 bytes
  • Each instruction 1, 2, or 3 bytes
  • Starts at address 0x401040
Assembler
  Translates .s into .o
  Binary encoding of each instruction
  Nearly-complete image of executable code
  Missing linkages between code in different files
Linker
  Resolves references between files
  Combines with static run-time libraries
    E.g., code for malloc, printf
  Some libraries are dynamically linked
    Linking occurs when program begins execution
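The 13 bytes of sum split into seven instructions of 1, 2, or 3 bytes each. The sketch below stores the bytes from the slide together with the instruction lengths (taken from the disassembly of sum shown later in these slides) and computes where each instruction starts. The array and function names are invented for the example.

```c
#include <stddef.h>

/* The 13 object-code bytes of sum, as listed on the slide. */
static const unsigned char sum_code[] = {
    0x55, 0x89, 0xe5, 0x8b, 0x45, 0x0c, 0x03,
    0x45, 0x08, 0x89, 0xec, 0x5d, 0xc3
};

/* Lengths in bytes of: push, mov, mov, add, mov, pop, ret. */
static const size_t sum_lengths[] = { 1, 2, 3, 3, 2, 1, 1 };

#define SUM_BASE 0x401040ul   /* start address given on the slide */

/* Returns the address of instruction number `index` (0-based, at most 7)
 * by adding up the lengths of the instructions that come before it. */
unsigned long instruction_address(size_t index)
{
    unsigned long addr = SUM_BASE;
    for (size_t i = 0; i < index; i++)
        addr += sum_lengths[i];
    return addr;
}
```

instruction_address(3) gives 0x401046, the address of the add instruction, and instruction_address(7) gives 0x40104d, the first byte past the 13-byte function.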
Machine Instruction Example
C Code
  int t = x+y;
  Add two signed integers
Assembly
  addl 8(%ebp),%eax
  Add 2 4-byte integers
    “Long” words in GCC parlance
    Same instruction whether signed or unsigned
  Operands:
    y: Register %eax (loaded from 12(%ebp) by the preceding movl)
    x: Memory M[%ebp+8]
    t: Register %eax
      » Return function value in %eax
  Similar to expression y += x
Object Code
  0x401046: 03 45 08
  3-byte instruction
  Stored at address 0x401046
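That addl is the "same instruction whether signed or unsigned" follows from two's complement arithmetic: the bit pattern of the sum does not depend on how the operands are interpreted. A small C check, with made-up function names, assuming 32-bit two's complement integers:

```c
#include <stdint.h>

/* Adds two 32-bit patterns the way addl does: modulo 2^32, with no
 * notion of signedness. Unsigned arithmetic wraps, so this is well
 * defined for all inputs. */
uint32_t add_bits(uint32_t a, uint32_t b)
{
    return a + b;
}

/* Signed addition for comparison. Callers must avoid overflow, which is
 * undefined behavior for signed types in C. */
int32_t add_signed(int32_t a, int32_t b)
{
    return a + b;
}
```

For inputs that do not overflow, such as -5 and 3, add_bits applied to the operands' bit patterns produces the same 32 bits as add_signed.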
Disassembling Object Code
Disassembled
  00401040 <_sum>:
     0: 55         push %ebp
     1: 89 e5      mov  %esp,%ebp
     3: 8b 45 0c   mov  0xc(%ebp),%eax
     6: 03 45 08   add  0x8(%ebp),%eax
     9: 89 ec      mov  %ebp,%esp
     b: 5d         pop  %ebp
     c: c3         ret
     d: 8d 76 00   lea  0x0(%esi),%esi
Disassembler
  objdump -d p
  Useful tool for examining object code
  Analyzes bit pattern of series of instructions
  Produces approximate rendition of assembly code
  Can be run on either a.out (complete executable) or .o file
Alternate Disassembly
Object
  0x401040:
    0x55 0x89 0xe5 0x8b 0x45 0x0c 0x03 0x45 0x08 0x89 0xec 0x5d 0xc3
Disassembled
  0x401040 <sum>:     push %ebp
  0x401041 <sum+1>:   mov  %esp,%ebp
  0x401043 <sum+3>:   mov  0xc(%ebp),%eax
  0x401046 <sum+6>:   add  0x8(%ebp),%eax
  0x401049 <sum+9>:   mov  %ebp,%esp
  0x40104b <sum+11>:  pop  %ebp
  0x40104c <sum+12>:  ret
  0x40104d <sum+13>:  lea  0x0(%esi),%esi
Within gdb Debugger
  gdb p
  disassemble sum
    Disassemble procedure
  x/13b sum
    Examine the 13 bytes starting at sum
What Can be Disassembled?
  Anything that can be interpreted as executable code
  Disassembler examines bytes and reconstructs assembly source

% objdump -d WINWORD.EXE

WINWORD.EXE: file format pei-i386

No symbols in "WINWORD.EXE".
Disassembly of section .text:

30001000 <.text>:
30001000: 55               push %ebp
30001001: 8b ec            mov  %esp,%ebp
30001003: 6a ff            push $0xffffffff
30001005: 68 90 10 00 30   push $0x30001090
3000100a: 68 91 dc 4c 30   push $0x304cdc91
Moving Data
  movl Source,Dest: Move 4-byte (“long”) word
  Lots of these in typical code
Operand Types
  Immediate: Constant integer data
    Like C constant, but prefixed with ‘$’
    E.g., $0x400, $-533
    Encoded with 1, 2, or 4 bytes
  Register: One of 8 integer registers
    But %esp and %ebp reserved for special use
    Others have special uses for particular instructions
  Memory: 4 consecutive bytes of memory
    Various “address modes”
Integer registers: %eax, %edx, %ecx, %ebx, %esi, %edi, %esp, %ebp
movl Operand Combinations
Source  Destination  Example             C Analog
Imm     Reg          movl $0x4,%eax      temp = 0x4;
Imm     Mem          movl $-147,(%eax)   *p = -147;
Reg     Reg          movl %eax,%edx      temp2 = temp1;
Reg     Mem          movl %eax,(%edx)    *p = temp;
Mem     Reg          movl (%eax),%edx    temp = *p;
Cannot do memory-memory transfers with single instruction
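Because a single movl cannot move memory to memory, a C assignment such as *dst = *src takes two instructions: a load into a register followed by a store. The sketch below writes the two steps out explicitly. The function name copy_word and the registers named in the comment are illustrative choices, not compiler output.

```c
#include <stdint.h>

/* Copies one 4-byte value from *src to *dst in two steps, mirroring
 *   movl (%eax),%ecx    Mem -> Reg
 *   movl %ecx,(%edx)    Reg -> Mem
 * where %eax holds src and %edx holds dst in this hypothetical example. */
void copy_word(int32_t *dst, const int32_t *src)
{
    int32_t temp = *src;   /* load from memory into a "register" */
    *dst = temp;           /* store the register back to memory */
}
```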
Simple Addressing Modes
Normal  (R)  Mem[Reg[R]]
  Register R specifies memory address
  movl (%ecx),%eax
Displacement  D(R)  Mem[Reg[R]+D]
  Register R specifies start of memory region
  Constant displacement D specifies offset
  movl 8(%ebp),%edx
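Displacement addressing is a pointer plus a constant byte offset. A minimal C model follows, assuming 4-byte values; load_disp is a hypothetical helper, with `reg` standing in for the register's contents and `disp` for the constant D, which may be negative.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Models movl D(R),dest: reads the 4 bytes at address Reg[R] + D. */
int32_t load_disp(const unsigned char *reg, ptrdiff_t disp)
{
    int32_t value;
    memcpy(&value, reg + disp, sizeof value);
    return value;
}
```

If `reg` points into the middle of a block of 4-byte slots, load_disp(reg, 8) reads the slot two positions above it and load_disp(reg, -4) reads the slot just below, which is how 8(%ebp) and -4(%ebp) reach different parts of a stack frame.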
Using Simple Addressing Modes
  void swap(int *xp, int *yp)
  {
    int t0 = *xp;
    int t1 = *yp;
    *xp = t1;
    *yp = t0;
  }

  swap:
    # Set Up
    pushl %ebp
    movl %esp,%ebp
    pushl %ebx
    # Body
    movl 12(%ebp),%ecx
    movl 8(%ebp),%edx
    movl (%ecx),%eax
    movl (%edx),%ebx
    movl %eax,(%edx)
    movl %ebx,(%ecx)
    # Finish
    movl -4(%ebp),%ebx
    movl %ebp,%esp
    popl %ebp
    ret
Understanding Swap
  void swap(int *xp, int *yp)
  {
    int t0 = *xp;
    int t1 = *yp;
    *xp = t1;
    *yp = t0;
  }

  movl 12(%ebp),%ecx   # ecx = yp
  movl 8(%ebp),%edx    # edx = xp
  movl (%ecx),%eax     # eax = *yp (t1)
  movl (%edx),%ebx     # ebx = *xp (t0)
  movl %eax,(%edx)     # *xp = eax
  movl %ebx,(%ecx)     # *yp = ebx

Register  Variable
%ecx      yp
%edx      xp
%eax      t1
%ebx      t0

Stack
Offset  Contents
        •••
  12    yp
   8    xp
   4    Rtn adr
   0    Old %ebp   <- %ebp
  -4    Old %ebx
Understanding Swap
State before the body executes

  movl 12(%ebp),%ecx   # ecx = yp
  movl 8(%ebp),%edx    # edx = xp
  movl (%ecx),%eax     # eax = *yp (t1)
  movl (%edx),%ebx     # ebx = *xp (t0)
  movl %eax,(%edx)     # *xp = eax
  movl %ebx,(%ecx)     # *yp = ebx

Register  Value
%eax
%edx
%ecx
%ebx
%esi
%edi
%esp
%ebp      0x104

Address  Offset  Contents
0x124            123
0x120            456
0x11c
0x118
0x114
0x110    12      0x120 (yp)
0x10c     8      0x124 (xp)
0x108     4      Rtn adr
0x104     0                  <- %ebp
0x100    -4
Understanding Swap
After movl 12(%ebp),%ecx   # ecx = yp

Register  Value
%eax
%edx
%ecx      0x120
%ebx
%esi
%edi
%esp
%ebp      0x104

Address  Offset  Contents
0x124            123
0x120            456
0x110    12      0x120 (yp)
0x10c     8      0x124 (xp)
0x108     4      Rtn adr
0x104     0                  <- %ebp
0x100    -4
Understanding Swap
After movl 8(%ebp),%edx   # edx = xp

Register  Value
%eax
%edx      0x124
%ecx      0x120
%ebx
%esi
%edi
%esp
%ebp      0x104

Address  Offset  Contents
0x124            123
0x120            456
0x110    12      0x120 (yp)
0x10c     8      0x124 (xp)
0x108     4      Rtn adr
0x104     0                  <- %ebp
0x100    -4
Understanding Swap
After movl (%ecx),%eax   # eax = *yp (t1)

Register  Value
%eax      456
%edx      0x124
%ecx      0x120
%ebx
%esi
%edi
%esp
%ebp      0x104

Address  Offset  Contents
0x124            123
0x120            456
0x110    12      0x120 (yp)
0x10c     8      0x124 (xp)
0x108     4      Rtn adr
0x104     0                  <- %ebp
0x100    -4
Understanding Swap
After movl (%edx),%ebx   # ebx = *xp (t0)

Register  Value
%eax      456
%edx      0x124
%ecx      0x120
%ebx      123
%esi
%edi
%esp
%ebp      0x104

Address  Offset  Contents
0x124            123
0x120            456
0x110    12      0x120 (yp)
0x10c     8      0x124 (xp)
0x108     4      Rtn adr
0x104     0                  <- %ebp
0x100    -4
Understanding Swap
After movl %eax,(%edx)   # *xp = eax

Register  Value
%eax      456
%edx      0x124
%ecx      0x120
%ebx      123
%esi
%edi
%esp
%ebp      0x104

Address  Offset  Contents
0x124            456
0x120            456
0x110    12      0x120 (yp)
0x10c     8      0x124 (xp)
0x108     4      Rtn adr
0x104     0                  <- %ebp
0x100    -4
Understanding Swap
After movl %ebx,(%ecx)   # *yp = ebx

Register  Value
%eax      456
%edx      0x124
%ecx      0x120
%ebx      123
%esi
%edi
%esp
%ebp      0x104

Address  Offset  Contents
0x124            456
0x120            123
0x110    12      0x120 (yp)
0x10c     8      0x124 (xp)
0x108     4      Rtn adr
0x104     0                  <- %ebp
0x100    -4
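The six-instruction walk-through can be replayed in C with a toy model of the registers and the stack region 0x100 to 0x127. This is a sketch for checking the trace by hand, not an emulator; struct machine and the helper names are invented here, and only the registers the swap body touches are modeled.

```c
#include <stdint.h>
#include <string.h>

#define MEM_BASE 0x100u
#define MEM_SIZE 0x28u   /* covers addresses 0x100 through 0x127 */

/* Toy machine state: registers used by the swap body plus a small memory. */
struct machine {
    uint32_t eax, ebx, ecx, edx, ebp;
    unsigned char mem[MEM_SIZE];   /* mem[0] is address 0x100 */
};

static uint32_t load32(const struct machine *m, uint32_t addr)
{
    uint32_t v;
    memcpy(&v, &m->mem[addr - MEM_BASE], sizeof v);
    return v;
}

static void store32(struct machine *m, uint32_t addr, uint32_t v)
{
    memcpy(&m->mem[addr - MEM_BASE], &v, sizeof v);
}

/* Sets up the stack picture from the slides. */
void init_machine(struct machine *m)
{
    memset(m, 0, sizeof *m);
    m->ebp = 0x104;
    store32(m, 0x124, 123);     /* *xp */
    store32(m, 0x120, 456);     /* *yp */
    store32(m, 0x110, 0x120);   /* yp, at 12(%ebp) */
    store32(m, 0x10c, 0x124);   /* xp, at 8(%ebp)  */
}

/* Executes the six instructions of the swap body in order. */
void run_swap_body(struct machine *m)
{
    m->ecx = load32(m, m->ebp + 12);   /* movl 12(%ebp),%ecx */
    m->edx = load32(m, m->ebp + 8);    /* movl 8(%ebp),%edx  */
    m->eax = load32(m, m->ecx);        /* movl (%ecx),%eax   */
    m->ebx = load32(m, m->edx);        /* movl (%edx),%ebx   */
    store32(m, m->edx, m->eax);        /* movl %eax,(%edx)   */
    store32(m, m->ecx, m->ebx);        /* movl %ebx,(%ecx)   */
}

/* Reads a 4-byte value back out of the toy memory. */
uint32_t peek32(const struct machine *m, uint32_t addr)
{
    return load32(m, addr);
}
```

After init_machine and run_swap_body, the model holds %ecx = 0x120, %edx = 0x124, %eax = 456, %ebx = 123, with 456 stored at 0x124 and 123 at 0x120, matching the final frame of the slides.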
Indexed Addressing Modes
Most General Form
  D(Rb,Ri,S)   Mem[Reg[Rb]+S*Reg[Ri]+D]
    D: Constant “displacement” 1, 2, or 4 bytes
    Rb: Base register: Any of 8 integer registers
    Ri: Index register: Any, except for %esp
      Unlikely you’d use %ebp, either
    S: Scale: 1, 2, 4, or 8
Special Cases
  (Rb,Ri)      Mem[Reg[Rb]+Reg[Ri]]
  D(Rb,Ri)     Mem[Reg[Rb]+Reg[Ri]+D]
  (Rb,Ri,S)    Mem[Reg[Rb]+S*Reg[Ri]]
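One common use of the scaled form, offered here as an illustration rather than something stated on the slide, is array indexing: the base register holds the array's address, the index register holds i, and the scale matches the element size. The sketch assumes 4-byte elements; element_address is a made-up name.

```c
#include <stddef.h>
#include <stdint.h>

/* Address of element i in an array of 4-byte integers, computed the way
 * (Rb,Ri,4) would compute it: base + 4 * index, in bytes. */
const int32_t *element_address(const int32_t *base, size_t i)
{
    const unsigned char *bytes = (const unsigned char *)base;
    return (const int32_t *)(bytes + 4 * i);
}
```

For an int32_t array a, element_address(a, 2) is the same pointer as &a[2]. The allowed scales 1, 2, 4, and 8 line up with element sizes of 1, 2, 4, and 8 bytes.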
Address Computation Examples
  %edx  0xf000
  %ecx  0x100

Expression      Computation        Address
0x8(%edx)       0xf000 + 0x8       0xf008
(%edx,%ecx)     0xf000 + 0x100     0xf100
(%edx,%ecx,4)   0xf000 + 4*0x100   0xf400
0x80(,%edx,2)   2*0xf000 + 0x80    0x1e080
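The four rows above all follow from the general formula Reg[Rb] + S*Reg[Ri] + D. A small C function makes the computation explicit; effective_address is a hypothetical helper, and an omitted register is modeled by passing 0.

```c
#include <stdint.h>

/* Computes the address for D(Rb,Ri,S): Reg[Rb] + S*Reg[Ri] + D.
 * Pass 0 for a register that is omitted and 1 for an omitted scale.
 * The result wraps modulo 2^32, like 32-bit address arithmetic, so a
 * negative displacement can be passed as (uint32_t)-4, for example. */
uint32_t effective_address(uint32_t d, uint32_t rb, uint32_t ri, uint32_t s)
{
    return rb + s * ri + d;
}
```

With %edx = 0xf000 and %ecx = 0x100, effective_address(0x80, 0, 0xf000, 2) reproduces the last row, 0x1e080, and effective_address(0, 0xf000, 0x100, 4) reproduces 0xf400.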