virtualized system development · why use virtual systems? “because hardware is no fun” because...

72
Virtualized System Development Dr. Jakob Engblom Virtutech

Upload: others

Post on 22-Sep-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Virtualized System Development

Dr. Jakob Engblom

Virtutech

Page 2: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Virtuali-what?

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

• Making a computer program behave like a computer for the purpose of running software

• Mechanisms for making some computer resource less subject to physical constraints

• Virtual memory, disk virtualization, virtual machines, …

• Very hip in the data center world currently

Virtualization

• A piece of software that simulates something

• Used to perform experiments and gain insight not possible in the real world

• Control and insight into the internals of the process big advantages

• For computing, often associated with being slow

• Physics, computers, weather, games, …

Simulation

• A piece of software that mimics some computer software or hardware

• Focused on the execution of existing software

• Terminal emulators, OS emulators, console emulators

Emulation

Virtutech Simics is really doing a bit of all of these, and it has been called all three

2

Page 3: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

What is a Virtual Platform?

A piece of software

Running on a regular PC, server,

or workstation

Functionally identical to the

target hardware

Runs the same software as the

physical hardware system

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

Virtual Platform

3

Page 4: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Virtutech Core Technology

Model any electronic system on a PC or workstation

‒ Simics is a software program, no hardware required

Run the exact same software as the physical target (complete binary)

Run it fast (100s of MIPS)

Model any target system

‒ Networks, SoCs, boards, ASICs, ... no limits

For the benefit of software developers and hardware providers

Enables process change in software development

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

SimicsSimics

User application code

Host hardwareHost hardware

Host operating systemHost operating system

Virtual target hardware

Target operating system (s)

Middleware and libraries

Typically, an embedded or real-time control computer system

4

Page 5: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Why use Virtual Systems?

“Because hardware is no fun”

Page 6: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Because Hardware Is...

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

Not yet available Flaky prototype stage Not available anymore

?

Photo: Computer History Museum

Photo: Freescale

6

Page 7: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Because Hardware Is...

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

Inconvenient Dangerous Inaccessible

Photo: ESA

Photo: www.mil.se, Bromma Conquip

7

Page 8: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Because Hardware Is...

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

Impractical in scale Limited Inflexible

8

Page 9: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Virtual Platform Advantages and Features

Page 10: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Example: Early Hardware

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

Hardware/Software

Integration and Test

Hardware-dependent

software development

Hardware design and production

Simulator

development

Hardware-dependent

software development

Hardware/Software

Integration and Test

First successful power-on and boot

Reduced project time-to-ship using simulated

hardware

Hardware design and production

10

Page 11: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Handy Features of Simulation

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

Checkpointing

‒ Store current state; pick up and continue later

‒ Position workload once, use many times

‒ Spread a system state to multiple developers

‒ Package error reports (see demo)

‒ Can checkpoint an entire network of machines

‒ Key for repeating executions

Determinism/Repeatability

‒ Same initial state gives same execution;

‒ Repeat the same execution any number of times

‒ Investigate a problem time after time

‒ For multiprocessor systems & network systems

‒ Very useful for complex systems, where repeatable runs otherwise do not happen

11

Page 12: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Checkpointing in Simics

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

Save checkpoint

Restore to same machine, same model version

Restore to different host machine, same model version

Restore to same machine, updated model version (bug fix)

Restore to an updated and upgraded version of the model

12

Page 13: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Handy Features of Simulation

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

Visibility (insight without intrusion)

‒ All state can be observed

‒ All events can be traced and logged

Controllability

‒ Any part of machine or state can be changed

‒ Fault injection

Virtual time

‒ Time is completely virtual

‒ Global synchronization across all machines in a network

‒ Global stop across all processors in a multiprocessor

13

Page 14: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Example system insight: Load over time

For Uppsala University RT Course, Copyright Virtutech 20092009-11-1214

Page 15: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Convenience: Loading Flash

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

SimicsSimulator

FLASH

bin

Simics

Flash

Programmer

bin

Much shorter turn-around time for changing target software setup

No risk of “bricking” a target

15

Page 16: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Hardware Availability

Wide Availability

Virtual system is ”just software”

Trivial to copy

Trivial to distribute

Each engineer can have a custom

hardware system at their desk

Scalable

No physical supply limit

‒ Any number of each type of board

‒ Any type of system in ”infinite” supply

A virtual system can be big or small

by simple software (re)configuration

For Uppsala University RT Course, Copyright Virtutech 20092009-11-1216

Page 17: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Handy Features of Simulation

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

Configurability

‒ Any parameter of system can be changed

Sandboxing

‒ Allows investigating ”nasty code”

‒ Simulated machine complete isolated

‒ Networks can be isolated

‒ Simics undetectable by malware

Complete hardware simulation, no virtualization tricks

Reverse execution

‒ Roll back execution to previous state

‒ Reverse breakpoints

‒ Investigate details of program errors

17

Page 18: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Virtualization = Infinite Longevity

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

Host hardwareToday: 32-bit PC

Host operating systemWindows

SimicsSimics for x86/win

PPC 750fx Card

Target OS

Applications

Host hardwareTomorrow: 64-bit PC

Host operating systemLinux

SimicsSimics for AMD64/linux

PPC 750fx Card

Target OS

Applications

Host hardwareFuture: X Hardware

Host operating systemY OS

SimicsSimics for X/Y

PPC 750fx Card

Target OS

Applications

Time

...but the simulated target hardware stays the same

And the target software keeps working

The host machines available change over time...

18

Page 19: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

System-Level Features

Checkpoint and restore Multicore, processor, board Real-world connections

Repeatable fault injection on

any system component

Scripting Mixed endianness, word

sizes, heterogeneitycon0.wait-for-string "$“

con0.record-start

con0.input "./ptest.elf 5\n"

con0.wait-for-string "."

$r = con0.record-stop

if ($r == "fail.”) {

echo ”test failed”

}

2009-11-12 For Uppsala University RT Course, Copyright Virtutech 200919

Page 20: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Simics Debugging Features

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

Synchronous stop

for entire system

Determinism and

repeatability

Reverse execution

Unlimited and powerful

breakpoints

Trace anything Insight into all devices

break –x 0x0000->0x1F00

break-io uart0

break-exception int13

20

Page 21: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

What Types of Systems Can be Simulated?

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

Complete Systems & Networks

• Satellite, telecom network,

backbone net

Racks of Boards

& Backplanes

• Telecom rack, avionics

bay, blade server

Complete Boards

• MPC8548CDS,

MPC8572DSDevices &

Buses

• PCI, PCI-X, RapidIO,

Custom ASICS

SoC Devices

• Freescale QorIQ P4080,

MPC8572E, MPC8548E

Processor

& Memory

• Processor cores such as

e300, e500mc, e600

Examples

21

Page 22: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Debugging with Virtual Platforms

Page 23: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Three Steps of Debugging

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

1. Provoking errors

‒ Forcing the system to a state where things break

2. Reproducing errors

‒ Recreating a provoked error reliably

3. Locating the source of errors

‒ Investigating the program flow and data

‒ Depends on success in reproduction

A simulator can help with all three steps

23

Page 24: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Repeatability and Reverse Debugging

Repeat any run trivially

‒ No need to rerun and hope for bug to

reoccur

Stop & go back in time

‒ Instead of rerunning program from

start

‒ Breakpoints & watchpoints

backwards in time

‒ Investigate exactly what happened

this time

This control and reliable

repeatability is very powerful for

parallel code!

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

On hardware, only some runs reproduce an error

On virtual hardware, debugging is much easier

24

Page 25: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Code is not just about CPUs

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

On a modern SoC, the processor cores are just one part of the system

Much application functionality is implemented by using special accelerators... and you need to debug their interaction with the processors & software

25

Page 26: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Divide-by-zero in OS Kernel

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

Operating-system kernel crash in virtual model

‒ Divide-by-zero right in the kernel

‒ Algorithm to determine and compensate for clock skew

‒ Division by difference in time between two processors

Virtual model had zero clock skew = provoked error

‒ Could have happened on a real system

‒ Just not very likely

‒ Typical rare problem in the field

‒ Essentially testing a rare corner case in system state

26

Page 27: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Race Condition in Serial Driver

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

The problem:

‒ Dual-core MPC8641D machine

‒ Changed clock frequency from 800 to 833 Mhz

‒ OS froze on startup – quite unexpectedly

Investigation:

‒ Only happened at 832.9 to 833.3 MHz

‒ Determinism: 100% reproduction of error trivial

‒ Time control: single-step code feasible

‒ Insight: look at complete system state, log interrupts, check the call stack at the point of the freeze, check lock state

What we found:

‒ An interrupt service routine attempted to take a lock, before re-enabling interrupts. In the case that froze, the lock was already taken when the service routine was entered, and with no interrupts enabled there was no way for it to be released.

27

Page 28: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Custom Scripts to Debug Software

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

Scripting is a great tool for debugging

Simics debugger is fully scriptable

Available information:

‒ Looking up where functions, variables are in memory

‒ Mapping of instruction addresses to functions and code lines

‒ Tracking executing processes

‒ Switching processor debug context as target operating system executes

Also known as ”OS Awareness”

28

Page 29: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

OS awareness and debug scripting examples

For Uppsala University RT Course, Copyright Virtutech 2009

Output example: Analyzing a multithreaded queue

[bp] Thread 1401, writing variable done with value 1.At rule30_packet_queue_signal_done, line 62 Prev. state: Done: 0 Empty: 0 Full: 0 Tail: 0 Head: 99 Elems: 1

[bp] Thread 1396, writing variable empty with value 1.At rule30_packet_queue_get, line 152 Prev. state: Done: 1 Empty: 0 Full: 0 Tail: 0 Head: 0 Elems: 0

[bp] Thread 1396, writing variable full with value 0.At rule30_packet_queue_get, line 153 Prev. state: Done: 1 Empty: 1 Full: 0 Tail: 0 Head: 0 Elems: 0

Code behind the above

def general_breakpoint_handle(target,user_arg,context,bpno,memop):...cpu = SIM_get_mem_op_initiator(memop)tid = process_tracker.iface.tracker.active_trackee(cpu)pc = cpu.iface.processor_info.get_program_counter()now = cpu.iface.cycle.get_cycle_count(cpu)value = SIM_get_mem_op_value_cpu(memop)(file,line,func) = context.symtable.source_at[pc]done = read_32b_variable(pq_done)empty = read_32b_variable(pq_empty)full = read_32b_variable(pq_full)head = read_32b_variable(pq_head)tail = read_32b_variable(pq_tail)

2009-11-12

Source: http://www.virtutech.com/whitepapers/fixing-intermittent-multicore-bug-simics-807.html

29

Page 30: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

The Disk Corruption

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

Distributed fault-tolerant file system got corrupted

‒ Not shared-memory machine, all boards single-processor connected by several networks

‒ Intermittent error

‒ Error seen as a composite state across multiple disks

‒ Months spent chasing it on physical hardware

Simics solution:

‒ Reproduce corruption in Simics model of target

‒ Pin-point time when it happens, by interval halving

‒ Around the critical time, take periodic snapshots of disks

‒ Check consistency of disk states in offline scripts

Result:

‒ Found the precise instruction causing the problem

‒ Could capture the network traffic pattern causing issue

‒ Communicated the complete setup to the file system creator, allowing the root cause to be fixed

30

Page 31: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Computer System Simulation Technology

Page 32: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Embedded Computer System

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

So

ftwa

re s

tac

k

Communications

networks

Controlled

Environment

Human user interface

BootROM, drivers, HAL

Operating system

Middleware, libraries

Applications

32

Page 33: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Simulating Embedded Computer System

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

So

ftwa

re s

tac

k

Communications

networks

Controlled

Environment

Human user interface

BootROM, drivers, HAL

Operating system

Middleware, libraries

Applications

Simulation: “fake” one or more of the system pieces to enable work on other pieces. Some parts may be physical, while others are virtual.

Each piece has its own simulation issues and specialized simulation tools

We focus on running the software, and simulating the networks. Other parts are done using integrations with other companies’ tools.

33

Page 34: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Simics Architecture

Mo

de

l libra

ry

Simics

Target Machine(s)

Processor

Memory DevicesDevicesDevices

Processors

DevicesDevicesNetworks

and IO

Target operating system

Target hardware drivers

User program Middleware

Target boot code

User program

Target ISA decoder

JIT

CompilerInterpreter

Simics

Core

Configuration

management

Core Services

API

Inspection

Control

Features Memory

GUI

Scripting

Built-in

Debugger

Device Network CPU core SoCInter-connect

Intercon

nect

Configscripts

VMP

External

world

connections

Ethernet

Serial

Keyboard

Mouse

Ext. Debuggers

...

DML, C, C++,

Python,

Simics script,

SystemC

Event queue

and time

Multithreading

and scaling

2009-11-12 For Uppsala University RT Course, Copyright Virtutech 200934

Page 35: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Detail level determines speed

‒ The more detail, the slower the simulation

Abstraction: timing precision, implementation details

Functionality must always be correct!

Simulation detail levelTypical

slowdown

Approximate

speed in “MIPS”

Time to simulate one

real-world minute

Gate-level simulation 1000000 0.002 2 years

Cycle-accurate simulation 10000 0.2 7 days

Cycle-approximate simulation 500 4 8 hours

Fast functional simulation 5 400 5 minutes

Full-System Simulation

2009-11-12 For Uppsala University RT Course, Copyright Virtutech 200935

Page 36: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Cardinal Rule of Simulation

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

Scope of

modeled system

Quarks

Atom

Galaxy

Galaxies

Reasonable to simulate: scope

proportional to abstraction

Universe

Planets

Units of the

simulation

36

Page 37: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

The Art of Fast Simulation

”Know when to bluff”

‒ You are in a poker game against the

software

It is an art to implement just

enough to fool the software, but

not more

‒ Details cost dev time and execution

speed

‒ Implement the what and not the how

‒ Do work in largest possible units

entire Ethernet packets

DMA in a single step

‒ ”transaction-level modeling”

For Uppsala University RT Course, Copyright Virtutech 20092009-11-1237

Page 38: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Simics Modeling Level: Processor

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

Instruction-set simulation (ISS)

Complete and correct processor functionality

‒ Endianess, word lengths, arithmetic results

‒ All instructions semantics bit-correct vs real machine

‒ Supervisor-mode & user-mode

‒ Runs the complete target instruction set

Including Altivec, SSE, 3dNow, VIS, etc. extensions

‒ All accessible values represented

User-level registers

Supervisor-level registers

Model-specific registers, ASIs, debug register, etc.

Memory-management unit

Timing abstracted

‒ Fixed execution time per instruction

‒ No cache model in default mode

38

Page 39: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Some things have to be modeled…

Correct endianness Incorrect endianness

For Uppsala University RT Course, Copyright Virtutech 20092009-11-1239

Page 40: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Simics Model of Memory Bus

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

DDR

SDRAM

Core0

UART

Coherency

module

DDR MC

L1 cache

Core1

DDR MC

Core2

L1 cache L1 cache

Shared

L2 cache

DDR

SDRAMFast interconnect for

high-bandwidth devices

Bridge

Slower interconnect for

other devices Flash MC

Flash

memory

Timer

I2C

Ethernet

Accelerator

RapidIO

Typical hardware

structure of a generic

modern SoC

40

Page 41: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Simics Model of Memory Bus

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

DDR

SDRAM

UART

Coherency

module

DDR MC

L1 cache

DDR MC

L1 cache L1 cache

Shared

L2 cache

DDR

SDRAMFast interconnect for

high-bandwidth devices

Bridge

Slower interconnect for

other devices Flash MC

Flash

memory

Timer

I2C

Ethernet

Accelerator

RapidIO

Coherency

module

L1 cache L1 cache L1 cache

Shared

L2 cache

Coherency

module

L1 cache L1 cache L1 cache

Fast interconnect for

high-bandwidth devices

Shared

L2 cache

Coherency

module

L1 cache L1 cache L1 cache

Bridge

Fast interconnect for

high-bandwidth devices

Shared

L2 cache

Coherency

module

L1 cache L1 cache L1 cache

Slower interconnect for

other devices

Bridge

Fast interconnect for

high-bandwidth devices

Shared

L2 cache

Coherency

module

L1 cache L1 cache L1 cache

Simics does not model

cache timing & coherency

protocol

Memory traffic goes directly

to memory store, memory

controller, bridges, etc only

modeled for their effect on

the configuration of the

memory map

A single global memory

map directly routes

memory accesses to

devices or memory

Memory map

Core0 Core1 Core2

41

Page 42: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Simics Modeling Level: Devices

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

Hardware modeled as a set of devices

‒ Memory map of machine (as seen by processor)

‒ At the programming register level

Model the program-visible behavior

‒ Configuration registers

‒ Control register

‒ Data transmitted & received

Transaction-level modeling

‒ Reads, writes, DMA transfers, network packets

Reactive, passive models

‒ Only execute code when a transaction occurs

ASICs & FPGAs

‒ Model programming interface behavior

‒ Not detailed implementation

42

Page 43: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Simics Modeling Level: Networks

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

Interfaced using “real” network devices

Networks modeled at message level

‒ Entire messages (packets, frames, ...) delivered as a unit

Hardware addressing used

‒ Ethernet MAC

‒ Does not care about higher-level protocols

‒ Ethernet allows IPv4, IPv6, TCP, UDP, SCP, ICMP, ...

Any topology or addressing scheme

‒ Broadcast, unicast, switched, point-to-point, etc.

Perfect network by default

‒ Introduce latencies

‒ Introduce bandwidth limits

‒ Introduce faults

43

Page 44: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Key Technology: Temporal Decoupling

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

Board

Chipset

Board

SoC

Core Core PCIe

EthRTOS

SW

MW RTC

PIC

RAM FLASH

CPU

Core Core UART

EthRTOS

SW

MW RTC

PIC

RAM

DiskUART

USB

SATA

ROM

Board

SoC

Core IO

EthRTOS

SW

MW RTC

PIC

RAM FLASH

UART

network

Simulation progress, temporal decoupling

Core

Core Core Core Core Core Core Core Core Core Core Core Core Core Core Core

Simulation progress, cycle-by-cycle interleave

Core Core Core Core Core Core Core Core Core Core Core Core Core Core

44

Page 45: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Key Technology: Hypersimulation

Fast-forward the simulation

when a processor is idle

‒ Detect nothing to run for a while

‒ Relies on Simics isolating hardware

models from the outside world

Performance effect drastic

‒ Effectively removes idle processors

from the simulation of parallel

systems

‒ Can simulate at 100s of GHz

Examples of idle:

‒ Processor stop/power-down until an

interrupt happens

X86 “halt” instruction

PPC nap mode

Etc.

‒ Instruction sequence with known

outcome and no side effects

PPC “bndz 0”, which loops COUNT

times

Patterns for operating-system idle

loops, if not using power-down or other

obvious idle instructions

For Uppsala University RT Course, Copyright Virtutech 20092009-11-1245

Page 46: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Key Technology: Multithreading

For Uppsala University RT Course, Copyright Virtutech 2009

Simple system Complex systemComplex system with

Simics Accelerator

Host WorkstationHost Workstation

Simics Simics

Single thread

Simics

Host Workstation

Target

simulatio

n speed

Total

simulator

work

25% 1.0100% 4.0100% 1.0

2009-11-1246

Page 47: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Core Core Core Core Core Core Core Core Core Core Core Core Core Core Core

Simulation progress, cycle-by-cycle interleave

Multithreading the Simulator

2009-11-12For Uppsala University RT Course, Copyright Virtutech 2009

Board

Chipset

Board

SoC

Core Core PCIe

EthRTOS

SW

MW RTC

PIC

RAM FLASH

CPU

Core Core UART

EthRTOS

SW

MW RTC

PIC

RAM

DiskUART

USB

SATA

ROM

Board

SoC

Core IO

EthRTOS

SW

MW RTC

PIC

RAM FLASH

UART

network

Simulation progress, parallel execution

Core Core Core Core Core Core

Core Core Core Core Core Core

Core Core Core

We need to synchronize the threads every once in a while, to maintain

semantics: no more apart than they would be under sequential execution

of the same system

Thread 1 Thread 2 Thread 3

Load balancing is a new limiter of simulation performnace, never an

issue in a single-threaded simulation

47

Page 48: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Speed Impact

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

-20%

0%

20%

40%

60%

80%

100%

120%

0

20

40

60

80

100

120

10

10

0

10

00

10

00

0

10

00

00

10

00

00

0

Time quantum length

Rela

tiv

e S

peed

Computation-Intense Benchmark on P4080 Model

Single P4080

With second P4080 idling

With second P4080 multithreaded

Overhead of second P4080 idling

48

Page 49: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Linux boot on MPC8572E

‒ Contains an spin-lock loop algorithm that is affected by temporal decoupling

‒ But still, things work out fine with a long time slice

0

5 000 000 000

10 000 000 000

15 000 000 000

20 000 000 000

25 000 000 000

30 000 000 000

35 000 000 000

40 000 000 000

45 000 000 000

50 000 000 000

0

100

200

300

400

500

600

700

10

50

10

0

50

0

10

00

10

00

0

10

00

00

50

00

00

10

00

00

0

15

00

00

0

wall time

total instr

Temporal Decoupling Affects Execution

2009-11-12 For Uppsala University RT Course, Copyright Virtutech 2009

Fastest execution at 10000, but not the

least number of target instructions

49

Page 50: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

SIMULATIONS DO NOT NEED TO BE

COMPLETE

For Uppsala University RT Course, Copyright Virtutech 20092009-11-1250

Page 51: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Virtual Platforms Evolve with your Hardware

Start with simple board: CPU/SoC + memory

Add details as the hardware design evolves

Virtual platform does not need to be complete until the end

Enables hardware/software co-design‒ Develop software as soon as hardware is designed

Fast feedback at each iteration

Reduces time to market, improves product quality

Physic

al

Hard

ware

Vir

tual

Hard

ware

2009-11-12 For Uppsala University RT Course, Copyright Virtutech 200951

Page 52: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Dummy Devices

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

Many devices lack interesting behavior

‒ From the perspective of the software for a particular system

Only affect low-level system timing

Not used by current software setup

No interesting effects from using them

‒ Replace with dummies that do nothing

But do not give “access out of memory” errors

Examples: memory timing setup, performance counters, error

detection registers, ...

‒ Note that you can add them later if effects are needed

= Simulation runs faster, takes less time to create

52

Page 53: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Stubs

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

Not all parts of a system need to be modeled

Replace by stubs

‒ Find an appropriate (narrow) interface to cut at

‒ Replace complete model with its behavior

53

Page 54: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Stubs Example

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

Control card

Main

processor

SoC

Line card

Interface

processor Rack back-plane

DSP DSP

DSP DSP

DSP DSP

Control card

Main

processor

SoC

Line card

Interface

processor Rack back-plane

DSP DSP

DSP DSP

DSP DSP

For simulation of rack management, stub out the

DSPs on the line cards

For testing control-plane algorithms, stub out the

entire line card

54

Page 55: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Simulating Networks in Simics: Mixing Abstractions

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

SimicsSimics

Simulated HW

OS

Application

Simics

Network Link

Simulation

Simulated HW

OS

Application

Network connect

Simulated HW

OS

Application

Network connect

Real-network

Traffic gen

Network

testerRest-of-network

model

System under test, fully simulated

Physical HW

OS

Application

Real-world

network test

equipment

Physical HW

OS

Application

Dedicated test system that injects packets and checks the

replies

Instrumentation

module

Other fully simulated nodes on the

simulated network

Simplified behavioral simulation of other nodes, based on network

I/O

55

Page 56: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

MIXING ABSTRACTION LEVELS

2009-11-12 For Uppsala University RT Course, Copyright Virtutech 200956

Page 57: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Abstraction Levels for Virtual Platforms

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

MeaningSystemTiming

Timing points for memory transaction

Memory transactiondelay

Arbitration, contention, bus bandwidth

Temporal decoupling

Simulationdriver

Max obs. speed

UTUntimed; SystemC, C, CSP

no none none No no Events 1 MTrans

ST Software timing:Simics, Qemu, Mambo

yes 1 Zero No yes Processor 1000s MIPS

LT SystemC TLM-2.0 Loosely Timed

yes 2 Loose No Optional Processor 100s MIPS

ATSystemC TLM-2.0Approximately Timed

yes 4 (2 phases) Approximate YesNot meaningful

Processor or Devices

10 MIPS

CC Clock-cycle yes Each clock Accurate Yes no Clock 100 Kcycles

RTL Register-transfer level

yes The truth The truth Yes no Hardware clocks n/a

57

Page 58: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

System Simulation use Cases

System-on-Chip Design

‒ Focus on hardware designer needs

‒ Architecture exploration

‒ Sizing, performance, optimization of hardware

Fidelity to target is primary driver for models

‒ Timing

‒ Bandwidth

‒ Latency

‒ Bus structure

All components are equals

UT, LT, AT, CC

Software Development

‒ Focus on software developer needs

‒ Execute large workloads

‒ Debug code

Speed of execution is the primary driver for model

‒ Abstract as far as possible

‒ Approximate timing

Work from the processor outwards

Clear difference between processors and other devices

ST, LT

For Uppsala University RT Course, Copyright Virtutech 20092009-11-1258

Page 59: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Performance of HW acceleration vs SW impl

‒ Including the overhead of Linux device drivers

‒ Different driver modes and different hardware latencies

‒ Packet lengths from 8 to 256 bits

‒ Total of 400 runs, 200 billion target instructions

‒ ST mode, ideal memory

0

200000

400000

600000

800000

1000000

1200000

1400000

1600000

1800000

2000000

8 16 24 32 40 48 56 64 72 80 88 96 104112120128136144152160168176184192200208216224232240248256

bit

char

hw-1

hw-10

hw-100

hw-1000

hw-10000

hw-mmap-1

hw-mmap-10

hw-mmap-100

hw-mmap-1000

hwo-1

Example Measurement: Execution Time

2009-11-12 For Uppsala University RT Course, Copyright Virtutech 200959

Page 60: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Same setup as previous slide

Shows impact of OS on result latencies (aka jitter)

‒ Note how different modes have different susceptibility to jitter

0,00

1,00

2,00

3,00

4,00

5,00

6,00

7,00

8,00

9,00

8 16 24 32 40 48 56 64 72 80 88 96 104112120128136144152160168176184192200208216224232240248256

Ma

xim

um

Ex

ec

uti

on

Tim

e D

ivid

ed

by

Ave

rag

e

Ex

ec

uti

on

Tim

e

Packet Length

Maximum Compared to Average Time

char

hw-100

hw-1000

hw-10000

hw-mmap-100

hw-mmap-1000

Example Measurement: Jitter

2009-11-12 For Uppsala University RT Course, Copyright Virtutech 200960

Page 61: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Shows value of memory latency simulation in Simics, “LT” mode

Parallel processing benchmark

‒ Shared memory restricted single access and high latencies

‒ Testing two different transfer modes, 1 packet and 4 packets per transmission

‒ Scalability quite different

0,00

1,00

2,00

3,00

4,00

5,00

6,00

7,00

8,00

9,00

10,00

1 2 3 4 5 6 7 8 9

Pero

frm

an

ce r

ela

tive t

o o

ne w

ork

er

nd

oe

Number of worker nodes

Scaling as Worker Nodes are Added

Perfect memory100 cycles, single port200 cycles, single port500 cycles, single portPerfect memory, 4 packets/trans100 cycles, single port, 4 packets/trans200 cycles, single port, 4 packets/trans500 cycles, single port, 4 packets/trans

Example Measurement: Memory Speed Impact

2009-11-12 For Uppsala University RT Course, Copyright Virtutech 200961

Page 62: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Hybrid Simulation

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

Mix fast functional (ST) and clock-cycle-accurate (CC) models

‒ Two models of each device: ST, CC

‒ Use fast simulation to get to interesting places

‒ Zoom in selectively using detailed models

‒ Switch mode using a checkpoint: repeatable, archived, restartable

Additional Simics implementation mode: support detailed models

‒ You need a more complicated bus model

‒ Allows out-of-order transactions and timed buses

‒ Supports full bus hierarchy, not just a streamlined memory map

‒ A single simulation kernel for the entire system

First product: the Freescale QorIQ P4080

‒ Functional fast models from Virtutech

‒ Detailed models from Freescale (internal engineering)

62

Page 63: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Hybrid: What it Means

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

Mix temporally: change from functional to detailed simulation when

workload reaches interesting point

‒ Use fast mode to position in reasonable time

Mix spatially: combine fast and detailed models in the same

simulation setup

‒ Leverage fast models to get complete system

‒ Speed up simulation by only simulating what is relevant

time

Functional simulation

Detailed

simulation

Drop into detailed mode at

interesting points

Virtual board

Virtual model of new SoC

CPU

Pattern

Matching

Timer

Interrupt MemCtrl

UART

Ethernet

Ethernet

CPU

CryptoBuffer

Memory

TCP

Offload

Buffer

Memory

CPURT clock

RAM FLASH

A packet processing pipeline in detail, rest of virtual platform

functional

Flow control

63

Page 64: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Simics

CC Simulator Bridge

Another Option: Co-Simulation with Detailed System

Integrate a complete subsystem

from a CC (or AT) simulator

‒ Fully-detailed subsystem simulation

‒ Multiple CC models with a CC

interconnect between them

‒ Use existing proven CC bus models

‒ Transactor between the worlds,

typically bus-to-bus

‒ Let the Simics system run as today,

pass transactions in and out of the

CC subsystem

Dual simulation kernels

‒ Automatic reuse of second simulator

Simics bus model

STCPU

Transactor

STCPU

STdevice

ST Memory

CC Simulator

CC Device

CC Device

CC CPU

CC DSP

CC

Bu

s M

od

el

2009-11-12 For Uppsala University RT Course, Copyright Virtutech 200964

Page 65: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Simics

Wrapper device model

Another Option: Individual CC models

Individual CC models inside a

Simics ST framework

‒ Bus is Simics bus model

‒ Transactor to convert from ST to CC

traffic, locally for each model

‒ Transactor creates local clocks

‒ Will not get correct bus contention,

arbitration, etc.

Main point: to check the internal

behavior of a device model

‒ Maybe run the RTL vs Driver

CC Device

Simics bus model

STCPU

Transactor

STCPU

STdevice

STMemory

2009-11-12 For Uppsala University RT Course, Copyright Virtutech 200965

Page 66: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Questions?

Page 67: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Work for Us!

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

Exjobb finnes!

67

Page 68: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Spares

Page 69: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Locking Test Program

0

0,2

0,4

0,6

0,8

1

1,2

10 100 1000 10000

Time quantum

Execu

tio

n t

ime

no locking

fake locking

proper locking

Test program

‒ 2 threads

‒ 1000000 iterations

‒ MPC8641D virtual target

All locking disciplines

Time quantum 10-10000

Notes:

‒ Locking overhead visible

‒ Lock contention visible

‒ Only proper locking varies in

execution time

On real hardware:

‒ no << fake << proper

‒ Same relation seen in simulation

2009-11-12 For Uppsala University RT Course, Copyright Virtutech 2009

Lock contention execution time

effect clearly visible

72

Page 70: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Locking Test Program: Find Race

Seriously broken program with an unprotected variable access in a tight loop

Test on single-core and dual-core setups

‒ Range of frequencies

‒ Test program run 20 times on each setup

‒ Count percentage of runs triggering race

Results:

‒ Race always triggers in dual-core mode

‒ Triggers around 10% in single-core mode

‒ Higher clock = less chance to trigger in single-core

Simplified timing does not hide the race, this was run on standard Simics

1 CPU2 CPUs

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

1

3

10

100

200

500

800

950

977

1000

1013

10000

Clock freqency (MHz)

Percentage of runs triggering race

2009-11-12For Uppsala University RT Course, Copyright Virtutech 2009

Simulator shows the difference between single-core and multi-core setups

in bug aggressiveness

73

Page 71: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Units of Simulation

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

Processor Cores Devices

The CPUs running code

Special case to gain performance,

simulated using ISS, JIT, API, etc. – buy or

borrow!

Comparatively limited in variants,

compared to devices

Typically provided by tool vendors

Anything that the system contains that does

things and that is not a user-programmable

CPU

Written by tool vendors and their users

Examples: Timers, interrupt controllers,

ADC, DAC, network interfaces, I2C

controllers, serial ports, LEDs, displays,

media accelerators, pattern matches, table

lookup engines, memory controllers, ...

Memories Interconnects

RAM, ROM, FLASH, EEPROM, ...

Store code and data

Usually special simulation case for

performance reasons, closely integrated

with processor core simulators

Typically provided by tool vendors

Connecting devices, chips, boards, cabinets,

systems together

I2C, Serial, Ethernet, PCI, PCIe, RapidIO,

ATM, CAN, FireWire, USB, MIL-STD-1553,

MII, VME, HyperTransport, memory bus, ...

Typically provided by tool vendors

74

Page 72: Virtualized System Development · Why use Virtual Systems? “Because hardware is no fun” Because Hardware Is... 2009 -11 12 For Uppsala University RT Course, Copyright Virtutech

Virtual Platform Block Diagram

For Uppsala University RT Course, Copyright Virtutech 20092009-11-12

Virtual MPC8572E Board

MPC8572E

DDR

SDRAM

Core0

EBC

eTSEC

PCIe

DUART

OS

Apps

BSP

I2C

PHY

PHY

PHY

ECM

PHY

PHY

eTSEC

eTSEC

eTSEC

RapidIO

FEC

SEC

DDR MC

L2$ /

SRAM

Core1

OS

Apps

BSP

TLU

Deflate

PME

PIC

DMA

DDR MC

Flash

DeviceMemory Processor

Eth

ern

et l

ink

Se

ria

l lin

k

Interconnect

RapidIO link

PCIe link

I2C link

75