CA226 — Advanced Computer Architecture
MIPS
MIPS is:
• a RISC instruction-set architecture:
• all ALU operations are register-register
• initially 32-bit, later 64-bit
Its design is heavily influenced by opportunities for instruction-level parallelism:
• and we’ll talk much more about that later
MIPS Overview
We will cover:
• 64-bit MIPS, as simulated by the WinMIPS64 simulator
• there is a summary of the (WinMIPS64) MIPS instruction set here [./mips/winmips64.html]
MIPS Basics
Basics:
• fixed-sized, 32-bit instructions
• r0 is always 0
• 31 general-purpose integer registers (r1, …, r31)
• 32 floating-point registers (f0, …, f31)
• byte, half-word, word and double-word addressing
• displacement addressing with 16-bit displacements
Integer Loads
ld r1,64(r2) load double word from Mem[r2+64]
lw r1,64(r3) load word from Mem[r3+64]
lh r1,64(r4) load half word from Mem[r4+64]
lb r1,64(r5) load byte from Mem[r5+64]
All addressing is displacement addressing
Displacement addressing:
ld r1,64(r2)
Direct addressing:
ld r1,1024(r0) ; use r0 (always 0)
Register-indirect addressing:
ld r1,0(r2) ; use a displacement of 0
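The three forms above all reduce to one calculation: base register plus displacement. A minimal C sketch of that calculation follows, assuming the 16-bit displacement is sign-extended before the addition (standard MIPS behaviour). The function name effective_address is hypothetical and used only for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: the address used by a MIPS load or store.
 * base is the value held in the base register; disp is the 16-bit
 * displacement field, assumed to be sign-extended by the hardware. */
uint64_t effective_address(uint64_t base, int16_t disp)
{
    /* Widen the displacement to 64 bits (keeping its sign), then add.
     * Unsigned addition wraps modulo 2^64, so negative displacements work. */
    return base + (uint64_t)(int64_t)disp;
}
```

With base = 0 (r0) this gives direct addressing, and with disp = 0 it gives register-indirect addressing.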
Immediate Addressing
int i = 123;
Immediate addressing:
• is not supported for loads/stores
Note
However, immediate addressing can be emulated (covered later).
Sign Extension
For words, half words and bytes:
• loads are sign extended [http://en.wikipedia.org/wiki/Sign_extension]
• unmatched bits in the target register are padded with copies of the most significant bit from the source
Therefore:
• the appropriate two's-complement [http://en.wikipedia.org/wiki/Twos_Compliment] sign and value are retained for both loads and stores
Sign Extension — Examples
Example — lb with 01000000:
• 0000000000000000....01000000
Example — lb with 10000000:
• 1111111111111111....10000000
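The two lb examples can be reproduced with a short C sketch. It models only the register contents after a signed load. The name sign_extend and its parameters are hypothetical, and bits is assumed to be 8, 16 or 32 (the widths of lb, lh and lw).

```c
#include <stdint.h>

/* Hypothetical model of a signed load: take the low `bits` bits of
 * `value` and fill the remaining high bits of a 64-bit register with
 * copies of the most significant loaded bit. */
uint64_t sign_extend(uint64_t value, unsigned bits)
{
    uint64_t low_mask = (1ULL << bits) - 1;   /* e.g. 0xFF for a byte    */
    uint64_t sign_bit = 1ULL << (bits - 1);   /* e.g. 0x80 for a byte    */

    value &= low_mask;                        /* keep only loaded bits   */
    if (value & sign_bit)
        value |= ~low_mask;                   /* pad with 1s if negative */
    return value;
}
```

Calling sign_extend(0x40, 8) leaves the upper bits at 0, while sign_extend(0x80, 8) sets them all to 1, matching the two bit patterns above.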
Unsigned Integer Loads
lwu r1,64(r3) load unsigned word from Mem[r3+64]
lhu r1,128(r4) load unsigned half word from Mem[r4+128]
lbu r1,256(r5) load unsigned byte from Mem[r5+256]
Note
Unmatched bits are 0 (no sign extension).
Integer Stores
sd r1,32(r2) store double word to Mem[r2+32]
sw r1,64(r3) store word to Mem[r3+64]
sh r1,128(r4) store half word to Mem[r4+128]
sb r1,256(r5) store byte to Mem[r5+256]
Note
Unmatched bits are discarded (which is correct for both signed data and unsigned data).
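To see why discarding the upper bits is safe for both signed and unsigned data, here is a small C sketch of what sb writes to memory. The function name stored_byte is hypothetical. It models only the truncation, not the memory access itself.

```c
#include <stdint.h>

/* Hypothetical model of sb: only the low 8 bits of the 64-bit
 * register value reach memory; the upper 56 bits are discarded. */
uint8_t stored_byte(uint64_t reg)
{
    return (uint8_t)(reg & 0xFF);
}
```

A register holding -128 (all upper bits 1) and a register holding 128 (all upper bits 0) both store the byte 0x80. The signed/unsigned distinction only reappears when the byte is loaded again with lb or lbu.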
Note
All loads and stores use:
• 16 bits of displacement (although we’ll come nowhere near having to worry about that)
Summary
Loads and stores:
• displacement addressing only (direct and register-indirect addressing via 0-valued address components)
• b, h, w and d variants of all load/store operations (where necessary)
Loads:
• signed loads with sign extension
• unsigned loads (lwu, etc)
Floating-Point Loads/Stores
There are also:
• 32 64-bit floating-point registers (always 64 bits)
Load 64-bit floating-point value:
l.d f1,1024(r2)
Store 64-bit floating-point value:
s.d f1,1024(r2)
Integer ALU Instructions
Integer arithmetic instructions are:
• register-register or register-immediate
• 64-bit arithmetic
• signed and unsigned variants
Register-Register Integer Arithmetic
dadd r1,r2,r3 r1 = r2 + r3
dsub r1,r2,r3 r1 = r2 - r3
dmul r1,r2,r3 r1 = r2 * r3
ddiv r1,r2,r3 r1 = r2 / r3
Note
ddiv is integer division; remainders are discarded.
Unsigned Integer Arithmetic
daddu r1,r2,r3 r1 = r2 + r3
dsubu r1,r2,r3 r1 = r2 - r3
dmulu r1,r2,r3 r1 = r2 * r3
ddivu r1,r2,r3 r1 = r2 / r3
Immediates
daddi r1,r2,100 r1 = r2 + 100
daddi r1,r2,-1 r1 = r2 - 1
Immediate Addressing (for "loads")
Load an immediate value into a register:
int i = 123;
daddi r1,r0,123
Logical Operations (Bitwise)
and r1,r2,r3 r1 = r2 and r3
or r1,r2,r3 r1 = r2 or r3
xor r1,r2,r3 r1 = r2 xor r3
andi r1,r2,1 r1 = r2 and 1
ori r1,r2,1 r1 = r2 or 1
xori r1,r2,1 r1 = r2 xor 1
Logical Shifts
Table 1. Logical shifts:
dsll r1,r2,1 r1 = r2 << 1
dsrl r1,r2,3 r1 = r2 >> 3
Table 2. Logical shifts (by variable amounts):
dsllv r1,r2,r3 r1 = r2 << r3
dsrlv r1,r2,r3 r1 = r2 >> r3
Table 3. Arithmetic right shifts (sign extended):
dsra r1,r2,1 r1 = r2 >> 1
dsrav r1,r2,r3 r1 = r2 >> r3
Examples
Multiply r2 by 4, result in r1:
dsll r1,r2,2
Multiply r2 by 3, result in r1:
dsll r1,r2,1
dadd r1,r1,r2
Divide r2 by 2, result in r1:
dsrl r1,r2,1 ; if r2 is unsigned (or known positive)
; or...
dsra r1,r2,1 ; general arithmetic case
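The shift tricks above can be checked with a C sketch that models the three instructions on 64-bit register values. The function names are hypothetical, and shift amounts are assumed to lie in 0..63. The arithmetic shift is written out by hand because right-shifting a negative signed value in C is implementation-defined.

```c
#include <stdint.h>

/* dsll r1,r2,1 followed by dadd r1,r1,r2: x*2 + x == x*3
 * (the result wraps modulo 2^64, as it would in the register). */
uint64_t times3(uint64_t x)
{
    return (x << 1) + x;
}

/* dsrl: logical right shift; vacated high bits become 0. */
uint64_t shift_right_logical(uint64_t x, unsigned n)
{
    return x >> n;                       /* n assumed in 0..63 */
}

/* dsra: arithmetic right shift; vacated high bits copy the sign bit. */
uint64_t shift_right_arithmetic(uint64_t x, unsigned n)
{
    uint64_t result = x >> n;            /* n assumed in 0..63 */
    if ((x >> 63) != 0 && n > 0)
        result |= ~0ULL << (64 - n);     /* refill the top n bits with 1s */
    return result;
}
```

For a register holding -8, the logical shift yields a large positive number, while the arithmetic shift yields -4. For odd negative values the arithmetic shift rounds towards negative infinity (-7 becomes -4), which differs from C's truncating division.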
Aside
Note
Addition, subtraction and shift instructions require considerably fewer cycles than multiplication and division, even for integers.
Set a value (set conditions)
Signed:
slt r1,r2,r3 ; r1 = (r2 < r3) ? 1 : 0
slti r1,r2,100 ; r1 = (r2 < 100) ? 1 : 0
Unsigned:
sltu r1,r2,r3 ; r1 = (r2 < r3) ? 1 : 0
sltiu r1,r2,100 ; r1 = (r2 < 100) ? 1 : 0
Note
Often used immediately before a branch.
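The signed and unsigned variants give different answers for the same bit patterns, as the C sketch below shows. The function names are hypothetical. The signed comparison flips the sign bit of both operands so that the whole sketch stays within unsigned arithmetic.

```c
#include <stdint.h>

#define SIGN_BIT (1ULL << 63)

/* Models sltu: compare the registers as unsigned 64-bit values. */
uint64_t set_less_than_unsigned(uint64_t a, uint64_t b)
{
    return a < b ? 1 : 0;
}

/* Models slt: compare the registers as two's-complement values.
 * Flipping the sign bit maps the signed order onto the unsigned order. */
uint64_t set_less_than_signed(uint64_t a, uint64_t b)
{
    return (a ^ SIGN_BIT) < (b ^ SIGN_BIT) ? 1 : 0;
}
```

With a register of all 1s compared against 1, the signed version sets the result (since -1 < 1) while the unsigned version clears it (since 2^64 - 1 is not below 1).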
Branches and Jumps
Branches: conditional changes to the programme counter
Jumps: unconditional changes to the programme counter
Branches and Jumps
Branches: typically for and while loop conditions, if statements
Jumps: typically loops, also subroutine and function calls
Branches
Branch on equality/inequality:
beq r1,r2,1024 ; branch to 1024 if r1 equals r2
bne r1,r2,1024 ; branch to 1024 if r1 doesn't equal r2
Branch on zero/not zero:
beqz r1,1024 ; branch to 1024 if r1 equals 0
bnez r1,1024 ; branch to 1024 if r1 doesn't equal 0
Branch Example
Example:
• branch iff value in r1 is less than 100
slti r2,r1,100 ; set r2 iff r1 < 100
bnez r2,1024 ; branch to 1024 if r2 is set
Branch Example
If statement:
if ( x == 3 ) {
    ...
}
// rest...

        ; assume x is initially in r1
        daddi r3,r0,3      ; load 3 into r3
        bne   r1,r3,rest   ; if ( x == 3 ) {
        ...                ;   ...
rest:                      ; }
        ...                ; // rest
Jumps
Jumps are unconditional:
j 1024 ; jump to immediate 1024
jr r20 ; jump to register r20
Jump and Link (function call)
For function calls:
jal 1024 ; jump to 1024
jalr r20 ; jump to r20
In both cases:
• leave the return address (PC+4) in r31 (in this regard, r31 is also a special-purpose register)
Branches and Jumps
With fixed, 32-bit instructions:
• it makes no sense to branch/jump to an address which is not divisible by four
• so the target field stored in the instruction omits the two low-order bits, and is shifted left by two bits (so multiplied by four) when it is used
Note
This happens transparently in assembly language (where target addresses are symbolic).
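As a rough illustration of the shift, the C sketch below computes a conditional branch target the way the standard MIPS encoding defines it: the 16-bit offset field counts instructions rather than bytes, and is relative to the instruction after the branch. That WinMIPS64 follows this encoding exactly is an assumption here, and the name branch_target is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: target of a conditional branch located at `pc`.
 * offset_field is the signed 16-bit field from the instruction word,
 * counted in 4-byte instructions. */
uint64_t branch_target(uint64_t pc, int16_t offset_field)
{
    /* Multiplying by 4 is the "shift left by two bits". */
    int64_t byte_offset = (int64_t)offset_field * 4;

    /* Offsets are taken relative to the next instruction (pc + 4). */
    return pc + 4 + (uint64_t)byte_offset;
}
```

Under this encoding, dropping the two always-zero low bits lets the 16-bit field reach four times as far as a byte offset would.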
Example
Write a MIPS program involving a loop to:
• sum the numbers from 1 to 100
• leaving the result in r1
Note
Note to self. See mips/sum-to-100-template.s.
Example
        .text
main:   daddi r1,r0,0      ; int s = 0;
        daddi r2,r0,1      ; int i = 1;
        daddi r3,r0,101    ; int N = 101;
loop:                      ;
        beq   r2,r3,done   ; while ( i < N ) {
        dadd  r1,r1,r2     ;   s += i;
        daddi r2,r2,1      ;   i += 1;
        j     loop         ; }
done:   halt               ; R1 now contains 5050 (= 0x13ba)
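For comparison, here is a C sketch of the same loop. It mirrors the assembly closely, including the fact that beq tests for equality, so the loop exits when i equals the limit rather than when i exceeds it. The function name sum_below is hypothetical.

```c
#include <stdint.h>

/* Hypothetical C counterpart of the assembly loop.
 * Like `beq r2,r3,done`, the loop exits when i == limit, so limit
 * must be reachable by counting up from 1 (that is, limit >= 1). */
int64_t sum_below(int64_t limit)
{
    int64_t s = 0;                 /* r1 */
    int64_t i = 1;                 /* r2 */
    while (i != limit) {           /* beq r2,r3,done */
        s += i;                    /* dadd  r1,r1,r2 */
        i += 1;                    /* daddi r2,r2,1  */
    }
    return s;
}
```

Calling sum_below(101) adds the numbers 1 to 100 and returns 5050, the value the assembly version leaves in r1.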
Done