4 instn sets

Computer Instruction Set
An instruction consists of an op-code followed by its operand(s), e.g.
ADD R0 100 (op-code ADD; operands R0 and 100)
1) zero-address
2) one-address
3) two-address
4) three-address
The total number of instructions and the types and formats of the operands determine the length of an instruction.
The shorter the instruction, the faster it can be fetched and decoded.
Shorter instructions are better than longer ones:
(i) they take up less space in memory
(ii) they are transferred to the CPU faster
*
- Little-endian: bytes in a word are ordered right-to-left, e.g. Intel
Differing byte orders create havoc when transferring data between machines; the byte order of transferred words must be swapped
2.10, 2.11
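The byte-swapping problem can be illustrated with a minimal sketch using Python's standard struct module. The value 0x12345678 and the helper name swap32 are arbitrary choices for illustration, not taken from the slides.

```python
import struct

value = 0x12345678

# Pack the same 32-bit word in both byte orders.
little = struct.pack("<I", value)  # least significant byte first (e.g. Intel)
big = struct.pack(">I", value)     # most significant byte first


def swap32(word):
    """Reverse the byte order of a 32-bit word."""
    return int.from_bytes(word.to_bytes(4, "little"), "big")


# A little-endian word read by a big-endian machine gives the wrong value
# until its bytes are swapped.
misread = struct.unpack(">I", little)[0]
recovered = swap32(misread)
```

Here misread holds the bytes in the wrong order, and swap32 restores the original value.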
1. Block-code technique
To each of the 2^K instructions a unique binary bit pattern of length K is assigned.
A K-to-2^K decoder can then be used to decode all the instructions, for example a 3-to-8 decoder when K = 3.
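A decoder of this kind can be modelled in a few lines. This is only a software sketch of the idea (the function name decode is hypothetical); a real decoder is combinational hardware.

```python
def decode(opcode, k):
    """Model a K-to-2^K decoder: return 2^K output lines with exactly one set."""
    if not 0 <= opcode < 2 ** k:
        raise ValueError("op-code does not fit in k bits")
    return [1 if line == opcode else 0 for line in range(2 ** k)]


# 3-to-8 decoder: op-code 101 activates output line 5 only.
lines = decode(0b101, 3)
```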
2. Expanding op-code technique
Consider a 4+12 bit instruction with a 4-bit op-code and three 4-bit addresses.
It can at most encode 16 three-address instructions.
If there are only 15 such three-address instructions, the one unused op-code (1111) can be used to expand to two-address, one-address or zero-address instructions:
1 1 1 1 | op-code | Address 1 | Address 2
Again, this expanded op-code can encode at most 16 two-address instructions. If there are fewer than 16 such instructions, the op-code can be expanded further:
1 1 1 1 | 1 1 1 1 | op-code | Address 1
Opcode Encoding
Note that the three address fields need not encode three separate addresses; they can be used together as a single 12-bit one-address operand.
Part of the op-code can also be used to specify the instruction format and/or length.
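The expanding op-code scheme can be sketched as a decoder for a 16-bit instruction. The sketch assumes that 1111 is the escape value at each of the first three levels (so each level holds at most 15 instructions) and that all 16 values are usable at the zero-address level; the function name and format labels are hypothetical.

```python
def decode_expanding(instr):
    """Split a 16-bit instruction into (format, op-code, address fields)."""
    # Four 4-bit fields, most significant first.
    fields = [(instr >> shift) & 0xF for shift in (12, 8, 4, 0)]
    # Each leading field equal to 1111 escapes into the next format,
    # which has one address field fewer.
    escapes = 0
    while escapes < 3 and fields[escapes] == 0xF:
        escapes += 1
    names = ["three-address", "two-address", "one-address", "zero-address"]
    return names[escapes], fields[escapes], fields[escapes + 1:]
```

For example, 0xF2BC decodes as a two-address instruction with op-code 2 and addresses B and C, because its first field is the escape value 1111.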
*
Op-code Encoding
Huffman encoding
Given the probability of occurrence of each instruction, it is possible to encode all the instructions with a minimal number of bits, with the following property:
Fewer bits are used for the most frequently used instructions and more for the least frequently used ones.
Example (leaf probabilities and codes from the Huffman tree figure, in the order listed):
Instruction   Probability   Code
LOAD          1/4           11
STO           1/4           10
SHIFT         1/8           011
NOT           1/8           010
JUMP          1/16          0011
HALT          1/16          0010
AND           1/16          0001
ADD           1/16          0000
Huffman encoding algorithm:
1. Initialize the leaf nodes each with a probability of an instruction. All nodes are unmarked.
2. Find the two unmarked nodes with the smallest values and mark them. Add a new unmarked node with a value equal to the sum of the chosen two.
3. Repeat step (2) until all nodes have been marked except the last one, which has a value of 1.
4. The encoding for each instruction is found by tracing the path from the unmarked node (the root) to that instruction.
- may mark branches arbitrarily with 0, 1
Advantage: minimal number of bits
Disadvantage: instructions must be decoded bit-by-bit (can be slow).
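The marking procedure above can be sketched with a priority queue, where popping a node plays the role of marking it. The probabilities below are assumed leaf values for the eight example instructions (two of 1/4, two of 1/8, four of 1/16). Because branches may be labelled 0 or 1 arbitrarily, the bit patterns produced can differ from a hand-drawn tree, but the code lengths agree.

```python
import heapq
from fractions import Fraction
from itertools import count


def huffman_codes(probabilities):
    """Return {symbol: bit string} by repeatedly merging the two smallest nodes."""
    tie = count()  # tie-breaker so the heap never has to compare trees
    # Heap entries are (probability, tie-breaker, tree);
    # a tree is either a symbol (leaf) or a (left, right) pair.
    heap = [(p, next(tie), sym) for sym, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, t0 = heapq.heappop(heap)  # smallest unmarked node
        p1, _, t1 = heapq.heappop(heap)  # second smallest
        heapq.heappush(heap, (p0 + p1, next(tie), (t0, t1)))

    codes = {}

    def walk(tree, prefix):
        # Trace the path from the root down to each leaf.
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix or "0"

    walk(heap[0][2], "")
    return codes


PROBS = {
    "LOAD": Fraction(1, 4), "STO": Fraction(1, 4),
    "SHIFT": Fraction(1, 8), "NOT": Fraction(1, 8),
    "JUMP": Fraction(1, 16), "HALT": Fraction(1, 16),
    "AND": Fraction(1, 16), "ADD": Fraction(1, 16),
}
codes = huffman_codes(PROBS)
```

Fractions are used instead of floats so that the sums of probabilities stay exact.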
*
absolute - an instruction contains the memory address of its operand
register - an instruction contains the register address of its operand
immediate - an instruction contains or immediately precedes its operand value
CLI ; clear the interrupt flag
ADD #250, R1 % R1 := R1 + 250;
ADD 250, R1 % R1 := R1 + *(250);
ADD R2, R1 % R1 := R1 + R2;
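How the source operand is obtained in the three modes above can be sketched as follows. Registers and memory are modelled as plain dictionaries, and the function name and sample contents are hypothetical.

```python
def fetch_operand(mode, field, registers, memory):
    """Return the operand value for the three basic addressing modes."""
    if mode == "immediate":  # ADD #250, R1: the field is the operand itself
        return field
    if mode == "absolute":   # ADD 250, R1: the field is a memory address
        return memory[field]
    if mode == "register":   # ADD R2, R1: the field is a register number
        return registers[field]
    raise ValueError(f"unknown mode: {mode}")


registers = {1: 10, 2: 7}  # assumed sample register contents
memory = {250: 99}         # assumed sample memory contents
```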
*
Addressing Modes
register indirect - the register address in an instruction specifies the address of its operand
indexed - an offset is added to a register to give the address of the operand
ADD @R2, @R1 % *R1 := *R1 + *R2;
auto-decrement or auto-increment - the contents of the register are automatically decremented or incremented before or after the execution of the instruction
MOV (R2)+, R1 % R1 := *(R2); R2 := R2 + k;
MOV -(R2), R1 % R2 := R2 - k; R1 := *(R2);
base-register - a displacement is added to an implicit or explicit base register to give the address of the operand
relative - same as base-register mode except that the instruction pointer is used as the base register
MOV 2(R2), R1 % R1 := R2[2];
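The register-based modes above can be sketched in the same style. The operand size K = 2 and the sample register and memory contents are assumptions for illustration only.

```python
K = 2  # assumed operand size in bytes


def register_indirect(regs, mem, r):
    """@Rr: the register holds the address of the operand."""
    return mem[regs[r]]


def auto_increment(regs, mem, r):
    """(Rr)+: use the register as an address, then add K to it."""
    value = mem[regs[r]]
    regs[r] += K
    return value


def auto_decrement(regs, mem, r):
    """-(Rr): subtract K from the register, then use it as an address."""
    regs[r] -= K
    return mem[regs[r]]


def indexed(regs, mem, r, offset):
    """offset(Rr): the operand address is the register plus an offset."""
    return mem[regs[r] + offset]


regs = {2: 100}                  # assumed sample register contents
mem = {98: 5, 100: 11, 102: 22}  # assumed sample memory contents
```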
*
Addressing modes
Indirect addressing mode in general also applies to absolute addresses, not just register addresses; the absolute address is a pointer to the operand.
The offset added to an index register may be as large as the entire address space. On the other hand, the displacement added to a base register is generally much smaller than the entire address space.
*
Instruction Types
The instructions of most modern computers may be classified into the following six groups:
Data transfer (40% of user program instructions)
Arithmetic
Logical
System-control
I/O
Test-And-Set
Unconditional branch
Conditional branch
Subroutine call
ADBLEQ R5, R6, LOOP % repeat until R5>R6
CALL SUB % push PC; branch to SUB
RET % pop PC
*
Instruction types
Typical branch instructions test the value of some flags called conditions. Certain instructions cause these flags to be set automatically.
The registers used in implementing a subroutine call are called linkage registers; they typically include the instruction pointer and the stack pointer.
The parameters passed between the caller and the called subroutine are to be established by programming conventions. Very few computers support parameter-passing mechanisms in the hardware.
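The CALL and RET linkage can be sketched with a stack of return addresses. This is a toy model: the class name is hypothetical, and every instruction is assumed to occupy one address.

```python
class Machine:
    """Toy model of subroutine linkage using the PC and a stack."""

    def __init__(self):
        self.pc = 0      # instruction pointer
        self.stack = []  # return addresses, top of stack at the end

    def call(self, target):
        # CALL: push the address of the next instruction, then branch.
        self.stack.append(self.pc + 1)
        self.pc = target

    def ret(self):
        # RET: pop the saved address back into the PC.
        self.pc = self.stack.pop()
```

Because return addresses are kept on a stack, nested calls return in the reverse order in which they were made.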
*
B. Ross COSC 3p92
Examples: Intel Pentium 4
back-compatible to 8088 (16 bit, 8 bit data bus), 8086 (16 bit), 80286 (16 bit, larger addr), 80386 (32 bit), ...
3 operating modes:
2. virtual 8086 - protected
little endian words
registers: [5.3]
EAX, EBX, ECX, EDX - general purpose, but have special uses (eg. EAX = arithmetic, ...)
ESI, EDI, EBP, ESP - addr registers
*
Registers:
32 64-bit general regs, 32 FP regs
global var regs: used by all procedures
*
64 KB address space for programs, 64 KB for data
prog in ROM, data in RAM
lots of memory configurations:
others
2-bits in PSW determine which one is current
permits rapid interrupt processing: register set switching
all registers are addressable in memory space
eg. R0 and addr 0 are same
above the 4 reg banks are 16 bytes that are bit-addressable (bits 0 thru 127)
permits status processing w/o fetching entire bytes
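The memory layout described above can be sketched as two address calculations. The sketch assumes the standard 8051 arrangement of eight registers per bank, so the four banks occupy addresses 0 to 31 and the bit-addressable bytes start at address 32; the function names are hypothetical.

```python
def register_address(bank, n):
    """Memory address of register Rn in the given bank (4 banks of 8 registers)."""
    if not (0 <= bank <= 3 and 0 <= n <= 7):
        raise ValueError("bank must be 0..3 and register 0..7")
    return bank * 8 + n


def bit_location(bit):
    """Map bit address 0..127 to (byte address, bit position within the byte)."""
    if not 0 <= bit <= 127:
        raise ValueError("bit address must be 0..127")
    return 32 + bit // 8, bit % 8
```

With bank 0 selected, R0 is at address 0, which matches the note that R0 and address 0 are the same.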
Special registers
carry, aux carry, reg set, overflow, parity
IE: interrupt enable/disable
IP: interrupt priority for each interrupt
low or high
TCON: timer control
*
formats are complex, irregular, with variable-sized fields (due to historical evolution)
no memory-to-memory instructions
Some fields:
3 bit register REG, R/M
SIB (scale, index, base) array manipulation codes
1,2,4 more bytes for operands, constants
*
• 8 modes, 8 regs -- regs 6 & 7 are stack, PC
• "orthogonal" addressing -- addressing and opcodes are independent
x111 -> use longer opcode
first 2 bits help decode instruction format
to encode a 32 bit constant, need to do it in 2 separate instructions!
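Two instructions are needed because a fixed 32-bit instruction cannot hold a full 32-bit constant alongside its op-code. A minimal sketch, assuming the SPARC-style split in which one instruction sets the upper 22 bits and a second ORs in the lower 10 bits (the function names are hypothetical):

```python
def split_constant(value):
    """Split a 32-bit constant into its upper 22 bits and lower 10 bits."""
    hi22 = (value >> 10) & 0x3FFFFF
    lo10 = value & 0x3FF
    return hi22, lo10


def rebuild(hi22, lo10):
    """Combine the two parts, as the second instruction's OR would."""
    return (hi22 << 10) | lo10
```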
Example: 8051
6 simple formats; 1, 2 or 3 bytes
*
• 386 -- if 16-bit segments used, then use above
- if 32-bit segments, use following...
old 5.35
• SIB mechanism: [5.27] --> arrays
adding to Base register
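The SIB calculation for array access can be sketched as below, assuming the usual form base + index * 2^scale + displacement with a 2-bit scale field. The function name and the sample array address are hypothetical.

```python
def effective_address(base, index, scale_bits, displacement=0):
    """Return base + index * 2**scale_bits + displacement (scale_bits is 0..3)."""
    if not 0 <= scale_bits <= 3:
        raise ValueError("the scale field is 2 bits wide")
    return base + (index << scale_bits) + displacement


# Element 5 of an array of 4-byte integers starting at address 0x1000.
address = effective_address(0x1000, 5, 2)
```

The scale factor lets the index register hold an element number rather than a byte offset.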
*
• eg. mode 6 with PC (reg 7)
5.33
5.34
Example: UltraSPARC addressing
all instructions use immediate or register addressing, except those that address memory
only 3 instructions address memory: Load, Store, and a multiproc. synch
use indirect addressing
13 bit constants for immediate
Example: 8051
5 modes [fig 5-29]
some instns use accumulator implicitly (no code telling such... means instns are more compact!)
some modes (reg indirect) require operand to be in bottom 256 bytes of memory, because that’s where registers are residing
64 KB of memory addressed by loading 2-byte (16-bit) offsets
*
Pentium: specialized formats, addressing schemes
386 - 32 bit addressing is more general
RISC (Ultra): simpler instructions, fewer modes
Compilers will generate required addressing, so a simple scheme will suffice
Specialized modes and formats make instruction parallelism (pipelining) more difficult
fewer modes are preferable to specialized modes
simplicity means better compilers
