emulation: interpretation and binary translation
DESCRIPTION
Emulation: Interpretation and Binary Translation. Haiyu Li [email protected]. Content. Emulation Interpreter Decode-and-Dispatch Interpreter Threaded Interpretation Direct Threaded Interpretation with Predecoding Comparison - PowerPoint PPT PresentationTRANSCRIPT
Content
Emulation Interpreter
Decode-and-Dispatch Interpreter Threaded Interpretation Direct Threaded Interpretation with Predecoding
Comparison Interpreting a Complex Instruction Set (CISC)
Emulation
IA-32 program
FX!32 emulator
Alpha ISA HostTarget ISA
Host
GuestSource ISA
emulator
Emulation
Emulation : Interpretation, binary translation
Interpretation:
Fetching analyzing performing
next
Interpreter
Code
Data
Stack
Program Counter
Condition Codes
Reg 0
Reg 1
……
Reg n-1
Interpreter Code
Source Memory StateSource Context
Block
Interpreter Overview
Decode-and-dispatch interpreter
decodes an instruction
dispatches it to an interpretation routine based on the type of instruction.
Decode-and-dispatch interpreter
While (!halt&&interrupt){
inst=code[PC];
opcode=extract(inst,31,6);
switch(opcode){
case LoadWordAndZero:LoadWordAndZero(inst);
case ALU:ALU(inst);
case Branch: Branch(inst);
· · · · ·
}
Code for Interpreting the PowerPC Instruction Set Architecture.
Decode-and-dispatch interpreter
Instruction function list
LoadWordAndZero(inst){
RT=extract(inst,25,5);
RA=extract(inst,20,5);
displacement=
source=regs[RA];
address=source+displacement;
regs[RT]=(data[address]<<32)>>32
PC=PC+4;
}
ALU(inst){
RT=extract(inst,25,5);
RA=extract(inst,20,5);
RB=extract(inst,15,5);
source1=
source2=
extended_opcode=extract(inst,10,10);
switch(extended_opcode){
case Add:Add(inst);
case AddCarrying:
· · · · · · ·}
PC=PC+4;
}
Decode-and-dispatch interpreter
Advantage: Low memory requirements Zero star-up time
Disadvantage: Steady-state performance is slow- A source instruction must be parsed each time
it is emulated- Put a lot of pressure on the cache- Branches .
Decode-and-dispatch interpreter
Switch(opcode)
case
return
1.Switch statement->case Indirect
2.ALU(inst) direct
3.Return from the routine Indirect
4.Loop back-edge direct
While (!halt&&interrupt){ switch(opcode){ case ALU:ALU(inst); · · · · ·}
Threaded Interpretation
Instruction function list Add: RT=extract(inst,25,5); RA=extract(inst,20,5); RB=extract(inst,15,5); source1=regs[RA]; source2=regs[RB]; sum=source1+source2; regs[RT]=sum; PC=PC+4; If (halt || interrupt) goto exit; inst=code[PC]; opcode=extract(inst,31,6); extended_opcode=extract(inst,10,10); routine=dispatch[opcode,extended_opcode]; goto *routine;}
Switch(opcode)
case
case
case
Put the dispatch code to the end of each of the instruction interpretation routines.
Threaded Interpretation
Advantage: low Memory requirements (more than decode-
and-dispatch interpretation) Start-up time is zero
Disadvantage: Steady-state performance is slow (better than
decode-and-dispatch interpretation) - Indirect branch
Predecoding
lwz r1, 8(r2) ;load word and zeroAdd r3, r3, r1 ; r3=r3+r1stw r3, 0(r4) ;store word
LoadWordAndZero(inst) RT=extract(inst,25,5); RA=extract(inst,20,5); RB=extract(inst,15,5);
Parsing an instruction, putting it in a form (intermediate form)
07
1 2 08
Intermediate form
RT= code[TPC].dest;RA= code[TPC].src1;displacement=code[TPC].src2
Predecoding
Struct instruction{
unsigned long op;
unsigned char dest;
unsigned char src1;
unsigned int src2;
} code [CODE_SIZE
Fast interpretation, but Memory requirements are high.
Direct Threaded Interpretation
the instruction codes contained in the intermediate code can be replaced with the actual addresses of the interpreter routines.
001048d0
1 2 08
07
1 2 08
If (halt || interrupt) goto exit; opcode= code[TPC].op; routine=dispatch [opcode]; goto *routine;
If (halt || interrupt) goto exit; routine= code[TPC].op; goto *routine;
Comparison
Source code source code Interpreter routines Source code Interpreter routines
dispatch loop
(a) (b ) ( c)
Comparison Intermediate
code
Interpreter routines
Source code
(d)
Predecoder
ComparisonDecode-and-Dispatch Interpretation
Indirect Threaded Interpreter
Direct Threaded Interpreter with Predecoding
Memory requirements
Low Low(more than the first one)
High
Start-up performance
Fast Fast Slow
Steady-state
performance Slow Slow
(better than the first one)
Medium
Code portability
Good Good Medium
RISC ISA & CISC ISA RISC ISA (Power PC) 32 bit register. 32bit
length.
Op Rd Rs1 Rs2 OpxRegister-register
Op Rd Rs1 Const
31 25 20 15 10 0
31 25 20 15 0
Register-immediate
Op Const opx
2
Jump/call
Interpreting a Complex Instruction Set
Prefixes Opcode ModR/M SIB Displacement Immediate
CISC instruction sets have wide variety of formats, variable instruction lengths, and even variable field lengths
Up to fourPrefixes of 1 byte each(optional)
1-,2-,or 3-byteopcode
1byte(if required)
1byte(if required)
AddressDisplacement
Of 1,2,or 4Bytes or none
Immediate data
Of 1,2,or 4Bytes or none
Mod Reg/
Opcode
R/M Scale Index Base
7 6 5 3 2 0 7 6 5 3 2 0
IA-32 Instruction Format
IA-32 Instruction set
ModR/M:e.g. Mod=10 R/M=000 REG=001 ModR/M=10001000=88H Effective Address = [EAX]+disp32
IA-32 Instruction set
SIB e.g. SS=01=2H; Index=000; Base=001; SIB value=01000001=41H; Scales Index=[EAX*2] Mod bits Effective Address 00 [scales index]+disp32 01 [scales index]+disp8+[EBP] 10 [scales index]+disp32+[EBP]
IA-32 Instruction set
Add AX, 8[ESI]; Effective addr=ESI+disp Disp=8 AX=AX+mem[8+[ESI]];
Prefixes Opcode ModR/M SIB Displacement Immediate
0 ADD 01 000 110 000000…….1000
AX
Mod=01;
R/M=110;
Reg=000;
ADD Mod Reg R/M Displacement
Interpreting a Complex Instruction Set
Program Flow
for a Basic CISC ISA Interpreter Figure 2.12 (code) General
Decode(fill-in instruction
Structure)
Dispatch
Inst.1Specialized
routine
Inst.1Specialized
routine
Inst.1Specialized
routine
Interpreting a Complex Instruction Set Interpreter Based on
Optimization of Common Cases
SimpleInst.1
Specializedroutine
SimpleInst.m
Specializedroutine
ComplexInst.m+1
Specializedroutine
ComplexInst.m+1
Specializedroutine
PrefixSet flags
Dispatch On
first byte
Shared routines
Interpreting a Complex Instruction Set
Threaded Interpreter for a CISC ISA
SimpleInstructionSpecialized
routine
SimpleDecode/Dispatch
SimpleInstructionSpecialized
routine
SimpleDecode/Dispatch
SimpleInstructionSpecialized
routine
SimpleDecode/Dispatch
SimpleDecode/Dispatch
SimpleInstructionSpecialized
routine
ComplexDecode/Dispatch