Peter S. Magnusson, Magnus Christensson, Jesper Eskilson, Daniel Forsgren, Gustav Hållberg, Johan Högberg, Fredrik Larsson, Andreas Moestedt.
Presented by Eduardo Cuervo
Simulation is an important step
◦ Research
◦ Evaluation
◦ Computer design
Not enough to simulate only user-level code
◦ Not accurate enough
◦ Need for full system simulation (slower)
Simulation must be able to interface with detailed HW models in a timely manner.
Scope (what is modeled?)
◦ Full system
◦ User level
Level of abstraction
◦ Functional behavior (what)
◦ Timing behavior (when)
Realistic workloads
◦ Functional: boot, run unmodified OS, benchmarks
◦ Timing: support hardware engineering
◦ Fast enough to run real workloads
Full system simulator
Instruction-set level
Support for multiple architectures
◦ SPARC
◦ Alpha
◦ x86
◦ Itanium
◦ MIPS
◦ ARM
Unmodified OS support
Processor models
Device models
◦ Accurate enough for real drivers and firmware
Simics Central
◦ Creation of full-scale distributed systems
◦ Router
◦ Connects multiple Simics instances
◦ Multiple nodes of the same architecture per instance
◦ Synchronizes instances
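The synchronization role of Simics Central can be illustrated with a minimal lockstep sketch. This is not Simics code: the `Instance` class, the `host_cost` field, and the fixed quantum are hypothetical names and assumptions chosen only to show why a synchronized set of simulators advances at the pace of its slowest member.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """Hypothetical stand-in for one simulator process."""
    name: str
    host_cost: float       # assumed host seconds needed to simulate one quantum
    virtual_time: int = 0  # simulated cycles completed so far

    def run_until(self, deadline: int) -> None:
        # A real simulator would execute instructions here; the sketch
        # only advances the virtual clock.
        self.virtual_time = deadline


def run_lockstep(instances, quantum, total):
    """Advance every instance one quantum at a time so none runs ahead.

    Returns the host time spent, assuming the instances run in parallel
    and each quantum ends only when the slowest instance has finished.
    """
    now = 0
    wall_time = 0.0
    while now < total:
        now = min(now + quantum, total)
        for inst in instances:
            inst.run_until(now)
        wall_time += max(inst.host_cost for inst in instances)
    return wall_time


nodes = [Instance("node0", 0.5), Instance("node1", 2.0), Instance("node2", 1.0)]
wall = run_lockstep(nodes, quantum=1000, total=4000)
```

In this sketch the host time is governed entirely by `node1`, the instance with the largest cost per quantum, no matter how fast the other two are.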
Microprocessor design
◦ Interaction with memory manager and scheduler
◦ Approximate cache and I/O timing models
◦ Scalable to full server workloads (TPC-C)
◦ OoO support, custom ROB but no pipeline
Memory studies
◦ Memory spaces, address spaces
◦ Extendable with timing models
OS development
◦ Implementation of very specific breakpoints
Debugging
Simics Central
◦ Synchronizes virtual time
◦ Simulation speed = speed of the slowest Simics process
Configuration
◦ Object oriented
Command Line Interface (CLI) and scripting
◦ Built-in Python runtime environment
◦ Scripts tied to events (TLB misses, I/O operations)
Devices
◦ Timer, floppy, keyboard, mouse, DMA, interrupt, etc.
OBJECT cpu0 TYPE x86-hammer {
    freq_mhz: 3500
    physical_memory: phys_mem0
}
OBJECT phys_mem0 TYPE memory-space {
    map: ((0xa0000, vga0, 1, 0, 0x20000),
          (0x100000, mem0, 0, 0x100000, 0xff00000),
          ...
}
OBJECT con0 TYPE gfx-console {
    queue: cpu0
    x-size: 720
    y-size: 400
    keyboard: kbd0
    mouse: kbd0
}
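How a memory-space map routes an access can be sketched as follows. The field order (base, target object, function, offset, length) is an assumption read off the two entries in the configuration, and `Mapping` and `resolve` are hypothetical names, not part of Simics.

```python
from typing import NamedTuple, Optional, Tuple


class Mapping(NamedTuple):
    base: int      # first physical address covered by the entry
    target: str    # name of the object that receives the access
    function: int  # target-specific function number (unused in this sketch)
    offset: int    # offset added inside the target
    length: int    # size of the mapped window in bytes


# The two entries visible in the configuration; the elided ones are left out.
MAP = [
    Mapping(0xA0000, "vga0", 1, 0, 0x20000),
    Mapping(0x100000, "mem0", 0, 0x100000, 0xFF00000),
]


def resolve(address: int) -> Optional[Tuple[str, int]]:
    """Return (target, address inside target), or None if nothing is mapped."""
    for m in MAP:
        if m.base <= address < m.base + m.length:
            return m.target, address - m.base + m.offset
    return None


vga_hit = resolve(0xA0010)
ram_hit = resolve(0x100004)
unmapped = resolve(0x50000)
```

Under these assumptions an access to 0xA0010 lands in `vga0` at offset 0x10, while an address outside both windows resolves to nothing.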
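from sim_core import *
import conf

# Stop the simulation only if eax exceeds ecx when the breakpoint is hit.
def break_handler(id):
    if conf.cpu0.eax > conf.cpu0.ecx:
        raise SimExc_Break

# Set an execute breakpoint at physical address 0x000f2501.
id = SIM_breakpoint(conf.phys_mem0, Break_Physical, Break_Execute,
                    0x000f2501, 1, 0)
# Call break_handler whenever the breakpoint event fires.
SIM_hap_register_callback("Core_Breakpoint", break_handler, id)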
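The script above only runs inside Simics. To show the callback pattern it relies on, here is a self-contained mock that runs in plain Python. `HapRegistry`, `BreakSimulation`, and the `registers` dictionary are hypothetical stand-ins, not the Simics API.

```python
from collections import defaultdict


class BreakSimulation(Exception):
    """Stand-in for the exception a handler raises to stop the simulation."""


class HapRegistry:
    """Maps event names to the callbacks registered for them."""

    def __init__(self):
        self._callbacks = defaultdict(list)

    def register(self, hap_name, callback, user_data):
        self._callbacks[hap_name].append((callback, user_data))

    def trigger(self, hap_name):
        # Call every handler registered for this event, in registration order.
        for callback, user_data in self._callbacks[hap_name]:
            callback(user_data)


registers = {"eax": 7, "ecx": 3}  # mock processor state


def break_handler(breakpoint_id):
    # Conditional breakpoint: stop only while eax is greater than ecx.
    if registers["eax"] > registers["ecx"]:
        raise BreakSimulation(breakpoint_id)


haps = HapRegistry()
haps.register("Core_Breakpoint", break_handler, 1)


def hit_breakpoint():
    """Simulate execution reaching the breakpoint address once."""
    try:
        haps.trigger("Core_Breakpoint")
        return "continued"
    except BreakSimulation:
        return "stopped"


first = hit_breakpoint()   # eax (7) > ecx (3), so the handler stops the run
registers["eax"] = 2
second = hit_breakpoint()  # condition is now false, so execution continues
```

Keeping the condition in a script rather than in the simulator core is what makes very specific breakpoints cheap to express.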
HDL interface
◦ Link Simics to Verilog through C interface
Simics API
◦ Makes Simics extensible
◦ Write new device models, commands, routines
Memory
◦ Biggest performance challenge
◦ Simulator translation cache (STC)
Speeds up loads, stores and fetches
Pointers to simulated memory
◦ Indexed by virtual address
A hit guarantees no side effects
◦ No alignment exception, TLB miss, cache miss, or breakpoint
Interpreter cache
Hit inlined in the kernel
Most complex construct
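The fast path described above (a cache of pointers to simulated memory, indexed by virtual address) can be sketched as follows. The page size, the identity address mapping, and the class names `SlowPath` and `PointerCache` are assumptions for illustration only; the real structure is far more involved.

```python
PAGE_SIZE = 4096  # assumed page granularity


class SlowPath:
    """Hypothetical full memory model: translation, checks, timing, and so on."""

    def __init__(self):
        self.pages = {}  # physical page number -> backing bytearray
        self.calls = 0   # how often the expensive path was taken

    def lookup(self, vaddr):
        self.calls += 1
        ppage = vaddr // PAGE_SIZE  # identity mapping, assumed for simplicity
        return self.pages.setdefault(ppage, bytearray(PAGE_SIZE))


class PointerCache:
    """Caches direct references to simulated pages, keyed by virtual page."""

    def __init__(self, slow):
        self.slow = slow
        self.entries = {}

    def _page(self, vaddr):
        vpage = vaddr // PAGE_SIZE
        page = self.entries.get(vpage)
        if page is None:
            # Miss: take the slow path and remember where the page lives.
            page = self.slow.lookup(vaddr)
            self.entries[vpage] = page
        return page

    def load_byte(self, vaddr):
        return self._page(vaddr)[vaddr % PAGE_SIZE]

    def store_byte(self, vaddr, value):
        self._page(vaddr)[vaddr % PAGE_SIZE] = value & 0xFF

    def invalidate(self, vpage):
        # Needed whenever a mapping changes or a breakpoint is placed on the page.
        self.entries.pop(vpage, None)


slow = SlowPath()
cache = PointerCache(slow)
cache.store_byte(0x1234, 0xAB)  # first touch of the page: slow path
value = cache.load_byte(0x1234)  # same page: served from the cache
calls_same_page = slow.calls
cache.load_byte(0x5000)          # different page: slow path again
calls_two_pages = slow.calls
```

Only pages whose accesses are known to be free of side effects may be entered in such a cache, which is why an entry has to be invalidated when that guarantee no longer holds.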
Two event queues
◦ Step queue: triggered by program counter steps
◦ Time queue: resolution of a processor clock cycle
Mix of time-driven and event-driven components
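A minimal sketch of the two-queue idea: one queue keyed by executed steps and one keyed by clock cycles. The `EventQueue` class and the fixed `CYCLES_PER_STEP` ratio are hypothetical assumptions; a real timing model would vary the cycles consumed per instruction.

```python
import heapq
import itertools


class EventQueue:
    """Priority queue of (due point, event name) pairs."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker keeps posting order stable

    def post(self, when, name):
        heapq.heappush(self._heap, (when, next(self._order), name))

    def pop_due(self, now):
        """Remove and return the names of all events due at or before `now`."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            _, _, name = heapq.heappop(self._heap)
            due.append(name)
        return due


CYCLES_PER_STEP = 2  # assumed fixed cost of one instruction

step_queue = EventQueue()  # measured in executed instructions (steps)
time_queue = EventQueue()  # measured in processor clock cycles
step_queue.post(3, "step event")
time_queue.post(4, "time event")

log = []
steps = 0
cycles = 0
for _ in range(4):
    steps += 1                 # execute one instruction
    cycles += CYCLES_PER_STEP  # and charge its cycles
    for name in time_queue.pop_due(cycles):
        log.append((steps, cycles, name))
    for name in step_queue.pop_due(steps):
        log.append((steps, cycles, name))
```

The time event fires once cycle 4 is reached (after the second step), while the step event fires after the third instruction regardless of how many cycles have passed.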
Specification language: SimGen
Generates all permitted combinations
Produces a better interpreter than is practical to write by hand
Outputs an interpreter in C
// IA32/x86-64 add to left instruction
instruction ADD_L({REG}, {REG_OR_MEM})
    pattern op_h2 == 0 && opl == 0 && d == 1 && opm == 0
    syntax "add {REG},{REG_OR_MEM}"
    semantics #{
        ireg_t op1 = {REG};
        ireg_t op2 = {REG_OR_MEM};
        ireg_t dst = op1 + op2;
        EFLAGS_ADD(dst,op1,op2,w,os);
        SET({REG_W}, dst);
    #}
    attributes type = IT_ALU
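The idea of generating one specialized routine per permitted operand combination can be sketched in Python. This is only an analogy: the tool described here emits C, while the sketch emits Python source from a template. The operand lists, the template, and the omission of flag updates are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical operand choices standing in for {REG} and {REG_OR_MEM}.
REG_CHOICES = ["eax", "ecx"]
REG_OR_MEM_CHOICES = ["eax", "ecx", "mem"]

# One generic description of the instruction; flag handling is omitted.
TEMPLATE = '''def add_{dst}_{src}(state):
    op1 = state["{dst}"]
    op2 = {src_expr}
    state["{dst}"] = (op1 + op2) & 0xFFFFFFFF
'''


def generate():
    """Emit and compile one specialized handler per operand combination."""
    handlers = {}
    for dst, src in product(REG_CHOICES, REG_OR_MEM_CHOICES):
        if src == "mem":
            src_expr = 'state["mem"][state["addr"]]'
        else:
            src_expr = f'state["{src}"]'
        source = TEMPLATE.format(dst=dst, src=src, src_expr=src_expr)
        namespace = {}
        exec(source, namespace)  # compile the generated source text
        handlers[(dst, src)] = namespace[f"add_{dst}_{src}"]
    return handlers


handlers = generate()
state = {"eax": 0xFFFFFFFF, "ecx": 2, "mem": {0x10: 5}, "addr": 0x10}
handlers[("eax", "ecx")](state)  # register + register, wraps at 32 bits
handlers[("ecx", "mem")](state)  # register + memory operand
```

Each generated handler has its operands fixed ahead of time, so no operand decoding is left to do when it runs. Writing every such variant by hand would be impractical, which is the case for generating them from one specification.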
OS boot workloads
◦ Modeled with 7 processor architectures
Scalability shown on Ultra II Enterprise Servers
◦ Increasing number of CPUs
Lower performance on OoO versions
IBM's first emulator (7070)
PDP-11
g88
gsim
◦ Based on g88
◦ Rewritten as the first version of Simics
SimOS
◦ MIPS-based processor
◦ Similar goals and solutions
◦ More general solution
◦ Three CPU simulators