emulation: interpretation and binary translation

26
CAPP Emulation: Interpretation and Binary Translation Haiyu Li [email protected]

Upload: oralee

Post on 15-Jan-2016

226 views

Category:

Documents


12 download

DESCRIPTION

Emulation: Interpretation and Binary Translation. Haiyu Li [email protected]. Content. Emulation Interpreter Decode-and-Dispatch Interpreter Threaded Interpretation Direct Threaded Interpretation with Predecoding Comparison - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Emulation: Interpretation and Binary Translation

CAPP

Emulation: Interpretation and Binary Translation

Haiyu Li

[email protected]

Page 2: Emulation: Interpretation and Binary Translation

Content

Emulation Interpreter

Decode-and-Dispatch Interpreter Threaded Interpretation Direct Threaded Interpretation with Predecoding

Comparison Interpreting a Complex Instruction Set (CISC)

Page 3: Emulation: Interpretation and Binary Translation

Emulation

IA-32 program

FX!32 emulator

Alpha ISA HostTarget ISA

Host

GuestSource ISA

emulator

Page 4: Emulation: Interpretation and Binary Translation

Emulation

Emulation : Interpretation, binary translation

Interpretation:

Fetching analyzing performing

next

Page 5: Emulation: Interpretation and Binary Translation

Interpreter

Code

Data

Stack

Program Counter

Condition Codes

Reg 0

Reg 1

……

Reg n-1

Interpreter Code

Source Memory StateSource Context

Block

Interpreter Overview

Page 6: Emulation: Interpretation and Binary Translation

Decode-and-dispatch interpreter

decodes an instruction

dispatches it to an interpretation routine based on the type of instruction.

Page 7: Emulation: Interpretation and Binary Translation

Decode-and-dispatch interpreter

While (!halt&&interrupt){

inst=code[PC];

opcode=extract(inst,31,6);

switch(opcode){

case LoadWordAndZero:LoadWordAndZero(inst);

case ALU:ALU(inst);

case Branch: Branch(inst);

· · · · ·

}

Code for Interpreting the PowerPC Instruction Set Architecture.

Page 8: Emulation: Interpretation and Binary Translation

Decode-and-dispatch interpreter

Instruction function list

LoadWordAndZero(inst){

RT=extract(inst,25,5);

RA=extract(inst,20,5);

displacement=

source=regs[RA];

address=source+displacement;

regs[RT]=(data[address]<<32)>>32

PC=PC+4;

}

ALU(inst){

RT=extract(inst,25,5);

RA=extract(inst,20,5);

RB=extract(inst,15,5);

source1=

source2=

extended_opcode=extract(inst,10,10);

switch(extended_opcode){

case Add:Add(inst);

case AddCarrying:

· · · · · · ·}

PC=PC+4;

}

Page 9: Emulation: Interpretation and Binary Translation

Decode-and-dispatch interpreter

Advantage: Low memory requirements Zero star-up time

Disadvantage: Steady-state performance is slow- A source instruction must be parsed each time

it is emulated- Put a lot of pressure on the cache- Branches .

Page 10: Emulation: Interpretation and Binary Translation

Decode-and-dispatch interpreter

Switch(opcode)

case

return

1.Switch statement->case Indirect

2.ALU(inst) direct

3.Return from the routine Indirect

4.Loop back-edge direct

While (!halt&&interrupt){ switch(opcode){ case ALU:ALU(inst); · · · · ·}

Page 11: Emulation: Interpretation and Binary Translation

Threaded Interpretation

Instruction function list Add: RT=extract(inst,25,5); RA=extract(inst,20,5); RB=extract(inst,15,5); source1=regs[RA]; source2=regs[RB]; sum=source1+source2; regs[RT]=sum; PC=PC+4; If (halt || interrupt) goto exit; inst=code[PC]; opcode=extract(inst,31,6); extended_opcode=extract(inst,10,10); routine=dispatch[opcode,extended_opcode]; goto *routine;}

Switch(opcode)

case

case

case

Put the dispatch code to the end of each of the instruction interpretation routines.

Page 12: Emulation: Interpretation and Binary Translation

Threaded Interpretation

Advantage: low Memory requirements (more than decode-

and-dispatch interpretation) Start-up time is zero

Disadvantage: Steady-state performance is slow (better than

decode-and-dispatch interpretation) - Indirect branch

Page 13: Emulation: Interpretation and Binary Translation

Predecoding

lwz r1, 8(r2) ;load word and zeroAdd r3, r3, r1 ; r3=r3+r1stw r3, 0(r4) ;store word

LoadWordAndZero(inst) RT=extract(inst,25,5); RA=extract(inst,20,5); RB=extract(inst,15,5);

Parsing an instruction, putting it in a form (intermediate form)

07

1 2 08

Intermediate form

RT= code[TPC].dest;RA= code[TPC].src1;displacement=code[TPC].src2

Page 14: Emulation: Interpretation and Binary Translation

Predecoding

Struct instruction{

unsigned long op;

unsigned char dest;

unsigned char src1;

unsigned int src2;

} code [CODE_SIZE

Fast interpretation, but Memory requirements are high.

Page 15: Emulation: Interpretation and Binary Translation

Direct Threaded Interpretation

the instruction codes contained in the intermediate code can be replaced with the actual addresses of the interpreter routines.

001048d0

1 2 08

07

1 2 08

If (halt || interrupt) goto exit; opcode= code[TPC].op; routine=dispatch [opcode]; goto *routine;

If (halt || interrupt) goto exit; routine= code[TPC].op; goto *routine;

Page 16: Emulation: Interpretation and Binary Translation

Comparison

Source code source code Interpreter routines Source code Interpreter routines

dispatch loop

(a) (b ) ( c)

Page 17: Emulation: Interpretation and Binary Translation

Comparison Intermediate

code

Interpreter routines

Source code

(d)

Predecoder

Page 18: Emulation: Interpretation and Binary Translation

ComparisonDecode-and-Dispatch Interpretation

Indirect Threaded Interpreter

Direct Threaded Interpreter with Predecoding

Memory requirements

Low Low(more than the first one)

High

Start-up performance

Fast Fast Slow

Steady-state

performance Slow Slow

(better than the first one)

Medium

Code portability

Good Good Medium

Page 19: Emulation: Interpretation and Binary Translation

RISC ISA & CISC ISA RISC ISA (Power PC) 32 bit register. 32bit

length.

Op Rd Rs1 Rs2 OpxRegister-register

Op Rd Rs1 Const

31 25 20 15 10 0

31 25 20 15 0

Register-immediate

Op Const opx

2

Jump/call

Page 20: Emulation: Interpretation and Binary Translation

Interpreting a Complex Instruction Set

Prefixes Opcode ModR/M SIB Displacement Immediate

CISC instruction sets have wide variety of formats, variable instruction lengths, and even variable field lengths

Up to fourPrefixes of 1 byte each(optional)

1-,2-,or 3-byteopcode

1byte(if required)

1byte(if required)

AddressDisplacement

Of 1,2,or 4Bytes or none

Immediate data

Of 1,2,or 4Bytes or none

Mod Reg/

Opcode

R/M Scale Index Base

7 6 5 3 2 0 7 6 5 3 2 0

IA-32 Instruction Format

Jaemok Lee
X86 명령어는 피연산자로서, 메모리에 있는 값을 사용할 수 있는 경우가 상당히 많습니다.즉... 다음과 같은 명령어가 가능합니다.ADD AX, [SI]이건 SI 레지스터가 가르키고 있는 주소에 있는 값을, AX 라는 레지스터에 더하는 명령입니다.RISC 명령으로 표현하자면,LD R1, [SI]ADD AX, AX, R1이렇게 되겠죠...이렇게 피연산자가 Register 인 경우도 있고, Memory 인 경우도 있습니다. 그것을 구별하기 위해서 ModR/M 이라는 플래그가 있습니다.
Page 21: Emulation: Interpretation and Binary Translation

IA-32 Instruction set

ModR/M:e.g. Mod=10 R/M=000 REG=001 ModR/M=10001000=88H Effective Address = [EAX]+disp32

Page 22: Emulation: Interpretation and Binary Translation

IA-32 Instruction set

SIB e.g. SS=01=2H; Index=000; Base=001; SIB value=01000001=41H; Scales Index=[EAX*2] Mod bits Effective Address 00 [scales index]+disp32 01 [scales index]+disp8+[EBP] 10 [scales index]+disp32+[EBP]

Page 23: Emulation: Interpretation and Binary Translation

IA-32 Instruction set

Add AX, 8[ESI]; Effective addr=ESI+disp Disp=8 AX=AX+mem[8+[ESI]];

Prefixes Opcode ModR/M SIB Displacement Immediate

0 ADD 01 000 110 000000…….1000

AX

Mod=01;

R/M=110;

Reg=000;

ADD Mod Reg R/M Displacement

Page 24: Emulation: Interpretation and Binary Translation

Interpreting a Complex Instruction Set

Program Flow

for a Basic CISC ISA Interpreter Figure 2.12 (code) General

Decode(fill-in instruction

Structure)

Dispatch

Inst.1Specialized

routine

Inst.1Specialized

routine

Inst.1Specialized

routine

Page 25: Emulation: Interpretation and Binary Translation

Interpreting a Complex Instruction Set Interpreter Based on

Optimization of Common Cases

SimpleInst.1

Specializedroutine

SimpleInst.m

Specializedroutine

ComplexInst.m+1

Specializedroutine

ComplexInst.m+1

Specializedroutine

PrefixSet flags

Dispatch On

first byte

Shared routines

Page 26: Emulation: Interpretation and Binary Translation

Interpreting a Complex Instruction Set

Threaded Interpreter for a CISC ISA

SimpleInstructionSpecialized

routine

SimpleDecode/Dispatch

SimpleInstructionSpecialized

routine

SimpleDecode/Dispatch

SimpleInstructionSpecialized

routine

SimpleDecode/Dispatch

SimpleDecode/Dispatch

SimpleInstructionSpecialized

routine

ComplexDecode/Dispatch