Chapter 2: Instruction Sets - National Chung Hsing University (國立中興大學)
![Page 1: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/1.jpg)
Chapter 2
Instruction Sets
Hsung-Pin Chang
Department of Computer Science
National Chung Hsing University
![Page 2: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/2.jpg)
Outline
Instructions
Preliminaries
ARM Processor
SHARC Processor
![Page 3: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/3.jpg)
2.1 Instructions
Instruction sets
    The programmer's interface to the hardware
Two CPUs as examples
    ARM processor: ARM7
    SHARC: a digital signal processor (DSP)
![Page 4: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/4.jpg)
2.2 Preliminaries
2.2.1 Computer architecture taxonomy
2.2.2 Assembly language
![Page 5: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/5.jpg)
2.2.1 Computer Architecture Taxonomy
Von Neumann architecture
Harvard architectures
RISC vs. CISC
![Page 6: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/6.jpg)
Von Neumann architecture
A computer whose memory holds both data and instructions
The CPU has several internal registers
    Program counter, general-purpose registers, ...
The CPU fetches instructions from memory at the address held in the program counter
The separation of the instruction memory from the CPU
    Distinguishes a stored-program computer from a general finite-state machine
![Page 7: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/7.jpg)
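A von Neumann Architecture Computer
[Figure: the CPU, holding the PC (value 200) and the IR (ADD r5,r1,r3), is connected to a single memory by address and data lines; memory location 200 holds the instruction ADD r5,r1,r3.]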
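The fetch-and-execute behaviour in the figure can be sketched as a toy stored-program machine. This is a hedged illustration in Python, not ARM code: the tuple encoding of instructions, the `HALT` marker, and the `run` helper are assumptions made only for this sketch.

```python
# Toy von Neumann machine: one memory is addressed by the PC to fetch instructions.
# The tuple instruction encoding and the HALT marker are illustrative assumptions.

def run(memory, regs, pc):
    """Repeatedly fetch the instruction addressed by the PC and execute it."""
    while True:
        ir = memory[pc]              # fetch: IR <- memory[PC]
        pc += 4                      # sequential access moves to the next word
        if ir[0] == "HALT":
            return regs
        if ir[0] == "ADD":           # ADD rd,rn,rm
            _, rd, rn, rm = ir
            regs[rd] = (regs[rn] + regs[rm]) & 0xFFFFFFFF
        else:
            raise ValueError(f"unknown instruction {ir!r}")

memory = {200: ("ADD", 5, 1, 3), 204: ("HALT",)}
regs = [0] * 16
regs[1], regs[3] = 7, 35
run(memory, regs, pc=200)
print(regs[5])
```

Because the program lives in ordinary memory, changing `memory[200]` changes what the machine does, which is the property that separates a stored-program computer from a fixed finite-state machine.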
![Page 8: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/8.jpg)
Harvard Architecture
Separate memories for data and program
The program counter points to the program memory
    Hard to write self-modifying programs
Used for one very simple reason
    Provides higher performance for digital signal processing
Most DSPs are Harvard architectures
    Most phone calls go through at least two DSPs, one at each end of the call
![Page 9: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/9.jpg)
Harvard Architecture
[Figure: the CPU (with its PC) connects to a data memory and to a separate program memory, each over its own address and data lines.]
![Page 10: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/10.jpg)
von Neumann vs. Harvard
Harvard cannot easily use self-modifying code.
Harvard allows two simultaneous memory fetches.
Most DSPs use Harvard architecture for streaming data:
    greater memory bandwidth
    more predictable bandwidth
Streaming data
    Data sets that arrive continuously and periodically
![Page 11: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/11.jpg)
RISC vs. CISC
Another axis relates to instructions and how they are executed
Complex instruction set computer (CISC)
    A wide variety of instructions
Reduced instruction set computer (RISC)
    Fewer and simpler instructions
    Load/store
    Pipelined processors
![Page 12: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/12.jpg)
Instruction Set Characteristics
Fixed vs. variable length.
Addressing modes.
Number of operands.
Types of operands.
![Page 13: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/13.jpg)
Programming Model
Programming model: the set of registers available for use by programs
    Some registers are unavailable to the programmer (e.g., the IR).
![Page 14: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/14.jpg)
2.2.2 Assembly language
The textual description of instructions
Basic features:
    One instruction per line
    Labels
        Provide names for memory locations
        Start in the first column
    Instructions often start in later columns
        To distinguish them from labels
    Comments run from some designated comment character to the end of the line
![Page 15: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/15.jpg)
An Example of ARM Assembly Language
label1  ADR r4,c
        LDR r0,[r4]   ; a comment
        ADR r4,d
        LDR r1,[r4]
        SUB r0,r0,r1  ; comment
![Page 16: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/16.jpg)
Pseudo-ops
Some assembler directives don't correspond directly to instructions
    They help the programmer create complete assembly language programs
For example
    Define the current address
    Reserve storage
    Define constants
![Page 17: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/17.jpg)
Pseudo-ops (Cont.)
Examples
    In ARM:
BIGBLOCK %10   ; allocate a 10-byte block of memory and initialize it to 0
![Page 18: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/18.jpg)
2.3 ARM Processor
ARM is a family of RISC architectures that has been extended over several versions
    ARM610, ARM7, ARM9, ARM10, ARM11
ARM7
    Von Neumann architecture machine
ARM9
    Harvard architecture machine
We will concentrate on ARM7
![Page 19: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/19.jpg)
ARM Assembly Language
Fairly standard assembly language
        LDR r0,[r8]   ; a comment
label   ADD r4,r0,r1
![Page 20: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/20.jpg)
ARM Data Type
The standard ARM word is 32 bits long
    A word may be divided into four 8-bit bytes
ARM allows addresses to be 32 bits long
    An address refers to a byte, not a word
    Word 0 is at location 0
    Word 1 is at location 4
    The PC is incremented by 4 in sequential access
Can be configured at power-up to address the bytes in a word in either little-endian or big-endian mode
![Page 21: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/21.jpg)
Byte Organization Within an ARM Word
Little-endian mode
    The lowest-order byte resides in the low-order bits of the word
Big-endian mode
    The lowest-order byte is stored in the highest bits of the word
little-endian:  bit 31 | byte 3 | byte 2 | byte 1 | byte 0 | bit 0
big-endian:     bit 0  | byte 0 | byte 1 | byte 2 | byte 3 | bit 31
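The two byte orders can be checked with a short Python sketch. It uses the standard `int.to_bytes` method; the helper name `bytes_in_memory` and the sample word `0x11223344` are assumptions chosen only for illustration.

```python
# Show which byte of a 32-bit word lands at each byte address (lowest address first).

def bytes_in_memory(word, mode):
    """Return the four bytes of a 32-bit word in increasing address order."""
    return list(word.to_bytes(4, mode))  # mode is "little" or "big"

word = 0x11223344
little = bytes_in_memory(word, "little")  # byte 0 holds the low-order bits
big = bytes_in_memory(word, "big")        # byte 0 holds the high-order bits
print([hex(b) for b in little])
print([hex(b) for b in big])
```

In little-endian mode the byte at the lowest address is `0x44`, the low-order byte of the word; in big-endian mode it is `0x11`, the high-order byte.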
![Page 22: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/22.jpg)
ARM Data Operation
ARM is a load-store architecture
    Arithmetic and logical operations cannot be performed directly on memory locations
    Data operands must first be loaded into CPU registers, and results are then stored back to main memory
![Page 23: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/23.jpg)
ARM Programming Model
ARM has 16 general-purpose registers
    r0~r15
    r15 is also used as the program counter
        Allows the program counter value to be used as an operand in computations
![Page 24: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/24.jpg)
ARM Programming Model
[Figure: the sixteen registers r0 through r15, with r15 serving as the PC, and the 32-bit CPSR (bits 31 to 0) whose top four bits are the N, Z, C, and V flags.]
![Page 25: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/25.jpg)
ARM Programming Model (Cont.)
CPSR: Current Program Status Register
    Its condition bits are set by comparison instructions and by arithmetic, logical, or shifting operations that request it with the S suffix (e.g., ADDS)
Top four bits of the CPSR
    N: set when the result is negative in two's-complement arithmetic
    Z: set when every bit of the result is zero
    C: set when there is a carry out of the operation
    V: set when an arithmetic operation results in an overflow
![Page 26: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/26.jpg)
ARM Status Bits
Examples:
    -1 + 1 = 0: NZCV = 0110
        0xffffffff + 0x1 = 0x0
    0 - 1 = -1: NZCV = 1000
        0x0 - 0x1 = 0xffffffff
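The two status-bit examples can be reproduced with a small Python model of 32-bit addition and subtraction. This is a sketch, not an ARM emulator: the helper names `adds`, `subs`, and `nzcv` are assumptions, and subtraction is modelled as `a + NOT(b) + 1`, so C is the carry out of that addition (C = 0 means a borrow occurred).

```python
MASK = 0xFFFFFFFF  # 32-bit word

def nzcv(result, carry, overflow):
    """Format the four flags as a string in N, Z, C, V order."""
    n = (result >> 31) & 1
    z = int(result == 0)
    return f"{n}{z}{int(carry)}{int(overflow)}"

def adds(a, b):
    """32-bit add that also returns the NZCV flags."""
    a &= MASK
    b &= MASK
    full = a + b
    result = full & MASK
    carry = full > MASK
    # Signed overflow: both operands have the same sign and the result differs.
    overflow = (((a ^ result) & (b ^ result)) >> 31) & 1
    return result, nzcv(result, carry, overflow)

def subs(a, b):
    """32-bit subtract, computed as a + NOT(b) + 1, returning the NZCV flags."""
    a &= MASK
    b &= MASK
    full = a + (~b & MASK) + 1
    result = full & MASK
    carry = full > MASK
    # Signed overflow: operands have different signs and the result sign differs from a.
    overflow = (((a ^ b) & (a ^ result)) >> 31) & 1
    return result, nzcv(result, carry, overflow)

print(adds(0xFFFFFFFF, 0x1))
print(subs(0x0, 0x1))
```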
![Page 27: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/27.jpg)
ARM Data Instructions
Basic format:
    ADD r0,r1,r2
        Computes r1+r2, stores in r0.
Immediate operand:
    ADD r0,r1,#2
        Computes r1+2, stores in r0.
Data instructions
    Arithmetic
    Logical
    Shift/rotate
![Page 28: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/28.jpg)
ARM Arithmetic Instructions
ADD : add
ADC : add with carry
    ADC r0, r1, r2 ; r0 = r1 + r2 + C
SUB : subtract
SBC : subtract with carry
    SBC r0, r1, r2 ; r0 = r1 - r2 + C - 1
RSB : reverse subtract
    RSB r0, r1, r2 ; r0 = r2 - r1
RSC : reverse subtract with carry
    RSC r0, r1, r2 ; r0 = r2 - r1 + C - 1
MUL : multiply
MLA : multiply and accumulate
    MLA r0, r1, r2, r3 ; r0 = r1 x r2 + r3
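The register-transfer formulas for the carry-based instructions can be written out as plain Python functions on 32-bit values. This is an illustrative sketch: the function names mirror the mnemonics, `add_with_carry_out` and `add64` are hypothetical helpers, and the 64-bit addition is shown only as one plausible use of ADC.

```python
MASK = 0xFFFFFFFF  # 32-bit word

def add_with_carry_out(a, b):
    """32-bit add returning (result, carry out), like an add that sets C."""
    full = (a & MASK) + (b & MASK)
    return full & MASK, full >> 32

def adc(r1, r2, c):
    return (r1 + r2 + c) & MASK          # r0 = r1 + r2 + C

def sbc(r1, r2, c):
    return (r1 - r2 + c - 1) & MASK      # r0 = r1 - r2 + C - 1

def rsb(r1, r2):
    return (r2 - r1) & MASK              # r0 = r2 - r1

def rsc(r1, r2, c):
    return (r2 - r1 + c - 1) & MASK      # r0 = r2 - r1 + C - 1

def mla(r1, r2, r3):
    return (r1 * r2 + r3) & MASK         # r0 = r1 x r2 + r3

def add64(a_lo, a_hi, b_lo, b_hi):
    """Add two 64-bit values held as (low, high) word pairs."""
    lo, carry = add_with_carry_out(a_lo, b_lo)  # low words, producing C
    hi = adc(a_hi, b_hi, carry)                 # high words plus C
    return lo, hi

print(add64(0xFFFFFFFF, 0x0, 0x1, 0x0))
```

With C = 1 (no borrow pending), `sbc` behaves like an ordinary subtract; with C = 0 it subtracts one more, which is what chains multi-word subtraction together.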
![Page 29: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/29.jpg)
ARM Logical Instructions
AND : bit-wise and
ORR : bit-wise or
EOR : bit-wise exclusive-or
BIC : bit clear
    BIC r0, r1, r2 ; r0 = r1 AND NOT(r2)
![Page 30: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/30.jpg)
ARM Shift/Rotate Instructions
LSL : logical shift left
    Fills with zeroes
LSR : logical shift right
ASL : arithmetic shift left
    = LSL
ASR : arithmetic shift right
    Copies the sign bit into the most significant bit
ROR : rotate right
RRX : rotate right extended with C
    Performs a 33-bit rotate, with the CPSR's C bit being inserted above the sign bit of the word
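The shift and rotate operations can be modelled on 32-bit values in Python. This is a sketch under stated assumptions: shift amounts are taken to be in the range 0 to 31, the function names simply mirror the mnemonics, and `rrx` returns the new carry alongside the result to show the 33-bit rotation through C.

```python
MASK = 0xFFFFFFFF  # 32-bit word

def lsl(x, n):
    return (x << n) & MASK                 # zeroes enter from the right

def lsr(x, n):
    return (x & MASK) >> n                 # zeroes enter from the left

def asr(x, n):
    x &= MASK
    if x & 0x80000000:                     # negative: reinterpret as signed
        x -= 1 << 32
    return (x >> n) & MASK                 # sign bit is copied in from the left

def ror(x, n):
    x &= MASK
    n %= 32
    return ((x >> n) | (x << (32 - n))) & MASK

def rrx(x, c):
    """Rotate right by one through the carry: returns (result, new carry)."""
    x &= MASK
    return (c << 31) | (x >> 1), x & 1

print(hex(asr(0x80000000, 4)), hex(lsr(0x80000000, 4)))
```

The printed pair shows the difference between the two right shifts: ASR keeps the value negative, while LSR fills the top bits with zeroes.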
![Page 31: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/31.jpg)
ARM Comparison Instructions
Do not modify general-purpose registers; they only set the NZCV bits of the CPSR
CMP : compare
    CMP r0, r1 ; compute (r0 - r1) and set NZCV
CMN : negated compare
    CMN r0, r1 ; compute (r0 + r1) and set NZCV
TST : bit-wise AND test
    TST r0, r1 ; compute (r0 AND r1) and set NZCV
TEQ : bit-wise exclusive-or test
    TEQ r0, r1 ; compute (r0 EOR r1) and set NZCV
![Page 32: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/32.jpg)
ARM Move Instructions
MOV : move
    MOV r0, r1 ; r0 = r1
MVN : move (negated)
    MVN r0, r1 ; r0 = NOT(r1)
![Page 33: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/33.jpg)
ARM Load/Store Instructions
LDR : load word
LDRH : load half-word
LDRB : load byte
STR : store word
STRH : store half-word
STRB : store byte
![Page 34: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/34.jpg)
Addressing Modes
Register
Immediate
Indirect or register indirect
Base-plus-offset
    Base register
        r0 to r15
    Offset: add or subtract an unsigned number
        Immediate
        Register (not PC)
        Scaled register: only available for word and unsigned byte instructions
![Page 35: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/35.jpg)
Register Indirect Addressing
LDR r0,[r1]
    If r1 = 0x100
    Set r0 = mem32[0x100]
STR r0,[r1]
    If r1 = 0x100
    Set mem32[0x100] = r0
![Page 36: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/36.jpg)
Base-Plus-Offset Addressing
Pre-indexing
    LDR r0,[r1,#4]   ; r0 := mem32[r1+4]
    Offset up to 4K, added or subtracted (e.g., #-4)
Post-indexing
    LDR r0,[r1],#4   ; r0 := mem32[r1], then r1 := r1+4
Auto-indexing
    LDR r0,[r1,#4]!  ; r0 := mem32[r1+4], then r1 := r1+4
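The difference between the three forms lies in when the offset is applied and whether the base register is updated. A minimal Python sketch follows; memory is modelled as a dictionary of word addresses, and the helper names `ldr_pre`, `ldr_post`, and `ldr_auto` are assumptions made for illustration.

```python
# Model of LDR with base-plus-offset addressing; regs is a list, mem maps address -> word.

def ldr_pre(mem, regs, rd, rn, offset):
    """LDR rd,[rn,#offset]: use rn+offset as the address, leave rn unchanged."""
    regs[rd] = mem[regs[rn] + offset]

def ldr_post(mem, regs, rd, rn, offset):
    """LDR rd,[rn],#offset: use rn as the address, then add the offset to rn."""
    regs[rd] = mem[regs[rn]]
    regs[rn] += offset

def ldr_auto(mem, regs, rd, rn, offset):
    """LDR rd,[rn,#offset]!: add the offset to rn, then use the new rn as the address."""
    regs[rn] += offset
    regs[rd] = mem[regs[rn]]

mem = {0x100: 10, 0x104: 20, 0x108: 30}
regs = [0] * 16
regs[1] = 0x100

ldr_pre(mem, regs, 0, 1, 4)
pre_result = (regs[0], regs[1])     # loads mem[0x104]; r1 stays 0x100
ldr_post(mem, regs, 0, 1, 4)
post_result = (regs[0], regs[1])    # loads mem[0x100]; r1 becomes 0x104
ldr_auto(mem, regs, 0, 1, 4)
auto_result = (regs[0], regs[1])    # r1 becomes 0x108; loads mem[0x108]
print(pre_result, post_result, auto_result)
```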
![Page 37: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/37.jpg)
ARM ADR Pseudo-op
An address cannot be referred to directly in an instruction.
    Its value is generated by performing arithmetic on the PC.
ADR is a pseudo-operation (or pseudo-instruction) that simplifies these steps
    ADR r1,FOO ; if location 0x100 is given the name FOO, this loads 0x100 into r1
![Page 38: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/38.jpg)
Example: C assignments
C: x = (a + b) - c;
Assembler:
    ADR r4,a      ; get address for a
    LDR r0,[r4]   ; get value of a
    ADR r4,b      ; get address for b, reusing r4
    LDR r1,[r4]   ; get value of b
    ADD r3,r0,r1  ; compute a+b
    ADR r4,c      ; get address for c
    LDR r2,[r4]   ; get value of c
    SUB r3,r3,r2  ; complete computation of x
    ADR r4,x      ; get address for x
    STR r3,[r4]   ; store value of x
![Page 39: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/39.jpg)
Example: C assignment
C: y = a*(b+c);
Assembler:
    ADR r4,b      ; get address for b
    LDR r0,[r4]   ; get value of b
    ADR r4,c      ; get address for c
    LDR r1,[r4]   ; get value of c
    ADD r2,r0,r1  ; compute partial result
    ADR r4,a      ; get address for a
    LDR r0,[r4]   ; get value of a
    MUL r2,r0,r2  ; compute final value for y (on ARM7 the destination must differ from the first source register)
    ADR r4,y      ; get address for y
    STR r2,[r4]   ; store y
![Page 40: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/40.jpg)
Example: C assignment
C: z = (a << 2) | (b & 15);
Assembler:
    ADR r4,a          ; get address for a
    LDR r0,[r4]       ; get value of a
    MOV r0,r0,LSL #2  ; perform shift
    ADR r4,b          ; get address for b
    LDR r1,[r4]       ; get value of b
    AND r1,r1,#15     ; perform AND
    ORR r1,r0,r1      ; perform OR
    ADR r4,z          ; get address for z
    STR r1,[r4]       ; store value for z
![Page 41: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/41.jpg)
ARM Flow of Control
Branch instruction: PC-relative
    B #100   ; add 400 to the current PC value
    B LABEL  ; branch to LABEL
Conditional branch by testing CPSR value
    EQ : equal (Z = 1)
    NE : not equal (Z = 0)
    CS : carry set (C = 1)
    CC : carry clear (C = 0)
![Page 42: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/42.jpg)
ARM Flow of Control (Cont.)
MI : minus (N = 1)
PL : nonnegative (N = 0)
VS : overflow (V = 1)
VC : no overflow (V = 0)
HI : unsigned higher (C = 1 and Z = 0)
LS : unsigned lower or same (C = 0 or Z = 1)
GE : signed greater than or equal (N = V)
LT : signed less than (N != V)
GT : signed greater than (Z = 0 and N = V)
LE : signed less than or equal (Z = 1 or N != V)
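The condition codes are Boolean functions of the four flags, which a lookup table makes explicit. The sketch below is in Python; the `CONDITIONS` table and the `condition_passed` helper are names assumed for illustration, and flags are passed as a four-character string in N, Z, C, V order.

```python
# Each condition code maps the N, Z, C, V flags to True (taken) or False (not taken).
CONDITIONS = {
    "EQ": lambda n, z, c, v: z == 1,
    "NE": lambda n, z, c, v: z == 0,
    "CS": lambda n, z, c, v: c == 1,
    "CC": lambda n, z, c, v: c == 0,
    "MI": lambda n, z, c, v: n == 1,
    "PL": lambda n, z, c, v: n == 0,
    "VS": lambda n, z, c, v: v == 1,
    "VC": lambda n, z, c, v: v == 0,
    "HI": lambda n, z, c, v: c == 1 and z == 0,
    "LS": lambda n, z, c, v: c == 0 or z == 1,
    "GE": lambda n, z, c, v: n == v,
    "LT": lambda n, z, c, v: n != v,
    "GT": lambda n, z, c, v: z == 0 and n == v,
    "LE": lambda n, z, c, v: z == 1 or n != v,
}

def condition_passed(cond, nzcv):
    """Evaluate a condition code against flags given as a string such as '1000'."""
    n, z, c, v = (int(bit) for bit in nzcv)
    return CONDITIONS[cond](n, z, c, v)

# Flags after comparing 0 with 1 (0 - 1 gives NZCV = 1000): 0 is less than 1.
print(condition_passed("LT", "1000"), condition_passed("GE", "1000"))
```

Each signed or unsigned pair (GE/LT, GT/LE, HI/LS) is an exact complement, so for any flag setting exactly one of the two conditions holds.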
![Page 43: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/43.jpg)
Example: if statement
C: if (a < b) { x = 5; y = c + d; } else x = c - d;
Assembler:
; compute and test condition
    ADR r4,a      ; get address for a
    LDR r0,[r4]   ; get value of a
    ADR r4,b      ; get address for b
    LDR r1,[r4]   ; get value for b
    CMP r0,r1     ; compare a with b
    BGE fblock    ; if a >= b, branch to false block
![Page 44: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/44.jpg)
If statement, cont'd.
; true block
    MOV r0,#5     ; generate value for x
    ADR r4,x      ; get address for x
    STR r0,[r4]   ; store x
    ADR r4,c      ; get address for c
    LDR r0,[r4]   ; get value of c
    ADR r4,d      ; get address for d
    LDR r1,[r4]   ; get value of d
    ADD r0,r0,r1  ; compute y
    ADR r4,y      ; get address for y
    STR r0,[r4]   ; store y
    B after       ; branch around false block
![Page 45: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/45.jpg)
If statement, cont'd.
; false block
fblock  ADR r4,c      ; get address for c
        LDR r0,[r4]   ; get value of c
        ADR r4,d      ; get address for d
        LDR r1,[r4]   ; get value for d
        SUB r0,r0,r1  ; compute c-d
        ADR r4,x      ; get address for x
        STR r0,[r4]   ; store value of x
after ...
![Page 46: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/46.jpg)
Example: Conditional instruction implementation
; true block
    MOVLT r0,#5     ; generate value for x
    ADRLT r4,x      ; get address for x
    STRLT r0,[r4]   ; store x
    ADRLT r4,c      ; get address for c
    LDRLT r0,[r4]   ; get value of c
    ADRLT r4,d      ; get address for d
    LDRLT r1,[r4]   ; get value of d
    ADDLT r0,r0,r1  ; compute y
    ADRLT r4,y      ; get address for y
    STRLT r0,[r4]   ; store y
![Page 47: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/47.jpg)
Conditional instruction implementation, cont'd.
; false block
    ADRGE r4,c      ; get address for c
    LDRGE r0,[r4]   ; get value of c
    ADRGE r4,d      ; get address for d
    LDRGE r1,[r4]   ; get value for d
    SUBGE r0,r0,r1  ; compute c-d
    ADRGE r4,x      ; get address for x
    STRGE r0,[r4]   ; store value of x
![Page 48: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/48.jpg)
Example: switch statement
C: switch (test) { case 0: … break; case 1: … }
Assembler:
          ADR r2,test             ; get address for test
          LDR r0,[r2]             ; load value for test
          ADR r1,switchtab        ; load address for switch table
          LDR r15,[r1,r0,LSL #2]  ; index switch table and jump by loading the entry into the PC
switchtab DCD case0
          DCD case1
...
![Page 49: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/49.jpg)
Example: FIR filter
C:
for (i=0, f=0; i<N; i++)
    f = f + c[i]*x[i];
Assembler
; loop initiation code
    MOV r0,#0     ; use r0 for i
    MOV r8,#0     ; use separate index for arrays
    ADR r2,N      ; get address for N
    LDR r1,[r2]   ; get value of N
    MOV r2,#0     ; use r2 for f
![Page 50: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/50.jpg)
FIR filter, cont'd.
        ADR r3,c        ; load r3 with base of c
        ADR r5,x        ; load r5 with base of x
; loop body
loop    LDR r4,[r3,r8]  ; get c[i]
        LDR r6,[r5,r8]  ; get x[i]
        MUL r4,r6,r4    ; compute c[i]*x[i] (on ARM7 the destination must differ from the first source register)
        ADD r2,r2,r4    ; add into running sum
        ADD r8,r8,#4    ; add one word offset to array index
        ADD r0,r0,#1    ; add 1 to i
        CMP r0,r1       ; exit?
        BLT loop        ; if i < N, continue
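The loop keeps two counters: the element index i and a separate byte offset that advances by one word (4 bytes) per iteration. A Python reference version that mirrors this structure is sketched below; the function name `fir` and the variable names are assumptions, and sums are wrapped to 32 bits to match register arithmetic. Like the assembly loop, which tests the exit condition at the bottom, the sketch assumes N is at least 1.

```python
MASK = 0xFFFFFFFF  # 32-bit word

def fir(c, x, n):
    """Compute sum(c[i] * x[i]) for i < n, mirroring the register usage of the loop."""
    i = 0          # r0: loop counter
    offset = 0     # r8: byte offset into both arrays
    f = 0          # r2: running sum
    while True:
        ci = c[offset // 4]            # LDR r4,[r3,r8]
        xi = x[offset // 4]            # LDR r6,[r5,r8]
        f = (f + ci * xi) & MASK       # MUL, then ADD into the running sum
        offset += 4                    # one word per element
        i += 1
        if not i < n:                  # CMP r0,r1 / BLT loop
            break
    return f

print(fir([1, 2, 3], [4, 5, 6], 3))
```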
![Page 51: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/51.jpg)
ARM Subroutine Linkage
Branch and link instruction:
    BL foo
        Saves the return address (the address of the instruction after the BL) in r14
        And then jumps to foo
To return from subroutine:
    MOV r15,r14
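The call and return sequence amounts to two register transfers. A minimal Python sketch follows; the helper names `bl` and `ret`, the sample addresses, and the choice to model the PC as the address of the BL instruction itself are all assumptions made for illustration.

```python
# Model of subroutine linkage: r14 is the link register, r15 the program counter.

def bl(regs, target):
    """BL target: save the address of the next instruction in r14, then jump."""
    regs[14] = regs[15] + 4   # return address (instruction after the BL)
    regs[15] = target

def ret(regs):
    """MOV r15,r14: resume at the saved return address."""
    regs[15] = regs[14]

regs = [0] * 16
regs[15] = 0x1000             # address of the BL instruction in this sketch
bl(regs, 0x2000)
in_subroutine = (regs[15], regs[14])
ret(regs)
print(hex(regs[15]))
```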
![Page 52: Chapter 2 Instructions Sets - 國立中興大學](https://reader034.vdocuments.site/reader034/viewer/2022052616/628e2358163ec46c8a1da1b7/html5/thumbnails/52.jpg)
Summary
Load/store architecture
Most instructions are RISCy and operate in a single cycle.
    Some multi-register operations take longer.
All instructions can be executed conditionally.