Performance Simulators José Nelson Amaral CMPUT 429 Dept. of Computing Science University of Alberta

Post on 03-Jan-2016

Page 1:

Performance Simulators

José Nelson Amaral
CMPUT 429

Dept. of Computing Science
University of Alberta

Page 2:

Reading material

• Section 1.3.2 Performance Simulators in Baer’s textbook.

Page 3:

Circuit Design Simulation (SPICE)

• Wires, gates, transistors, CMOS, electric signals, etc.

Page 4:

Logical Design Simulation

• Arithmetic and Logic Units (ALUs), Programmable Logic Arrays (PLAs)

• Hardware description languages (VHDL, Verilog)

Page 5:

Register Transfer Level (RTL)

• Microarchitecture level: data flow between basic blocks; control lines

RETRO: Univ. of Western Australia
rtlib: Universität Hamburg

Page 6:

Processor and Memory Hierarchy Description (SimpleScalar)

• ISA definition, cache specifications, etc.

Page 7:

System level simulators

• I/O, multithreading, multiprocessing

Page 8:

Flavors of Simulation

• Trace-driven simulators: input is a sequence of instructions that have been executed by a program.
  – Needs trace collection
    • hardware monitors: imprecise
    • software monitors: slow and interfere with execution
    • need lots of storage for the traces

• Execution-driven simulators: input is from a program interpreter.
  – Level of detail is a design choice

Baer, p. 19
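
The trace-driven flavor can be illustrated with a minimal sketch: a previously collected trace of memory addresses is replayed through a model of a direct-mapped cache. The class name `DirectMappedCache`, the function `simulate_trace`, the cache geometry, and the six-address trace are all assumptions made up for this illustration. They do not come from the textbook or from any real simulator.

```python
from dataclasses import dataclass


@dataclass
class DirectMappedCache:
    """Toy direct-mapped cache model (hypothetical geometry)."""
    num_lines: int = 4    # assumed number of cache lines
    line_bytes: int = 16  # assumed line size in bytes

    def __post_init__(self):
        self.tags = [None] * self.num_lines  # one stored tag per line
        self.hits = 0
        self.misses = 0

    def access(self, address: int) -> bool:
        """Look up one address; return True on a hit."""
        block = address // self.line_bytes
        index = block % self.num_lines
        tag = block // self.num_lines
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.tags[index] = tag  # fill the line on a miss
        self.misses += 1
        return False


def simulate_trace(trace, cache):
    """Replay a recorded address trace and return the hit rate."""
    for address in trace:
        cache.access(address)
    return cache.hits / len(trace) if trace else 0.0


# Made-up trace of byte addresses standing in for a collected trace.
trace = [0, 4, 16, 64, 0, 20]
cache = DirectMappedCache()
hit_rate = simulate_trace(trace, cache)
print(f"hits={cache.hits} misses={cache.misses} hit rate={hit_rate:.2f}")
```

The simulator never executes the program. It only consumes the trace, which is why trace collection and trace storage become the main costs of this approach.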

Page 9:

Drawbacks of simulators

• Difficult to simulate I/O
• Simulation takes a long time
  – slowdown of 30 to 40 times!
    • Takes more than 5 hours to simulate a 2-minute program.

Baer, p. 20

Page 10:

Speeding Up Simulation

• Simulate only the first billion instructions
  – Probably not representative of the actual execution.

• Fast-forward the first billion instructions and simulate the second billion instructions
  – Again, only one contiguous portion of the program is simulated.

• Sample execution intervals (e.g., every 10th interval of 100 million instructions)

• Detect similar phases in the program.

Baer, p. 21
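
The first three strategies differ only in which intervals of the execution are handed to the detailed simulator. The sketch below makes that selection explicit. It assumes a hypothetical program of 50 intervals of 100 million instructions each, so that one billion instructions corresponds to 10 intervals. The function names are invented for this illustration.

```python
def first_n(num_intervals, n):
    """Simulate only the first n intervals."""
    return list(range(min(n, num_intervals)))


def fast_forward(num_intervals, skip, n):
    """Skip the first `skip` intervals, then simulate the next n."""
    return list(range(skip, min(skip + n, num_intervals)))


def periodic_sample(num_intervals, period):
    """Simulate one interval out of every `period` intervals."""
    return list(range(0, num_intervals, period))


NUM_INTERVALS = 50  # assumed: 5 billion instructions in 100M-instruction intervals

first = first_n(NUM_INTERVALS, 10)              # first billion instructions
second = fast_forward(NUM_INTERVALS, 10, 10)    # second billion instructions
sampled = periodic_sample(NUM_INTERVALS, 10)    # every 10th interval

print(first)    # intervals 0 through 9
print(second)   # intervals 10 through 19
print(sampled)  # [0, 10, 20, 30, 40]
```

The first two selections cover a single contiguous region, while the periodic sample touches the whole run at half the detailed-simulation cost in this example.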

Page 11:

Speeding Up Simulation (Phase-based simulation)

1. Divide execution into intervals (e.g., 100 million instructions).

2. Give each interval a signature (e.g., average frequency of execution of each basic block).

   0.1 0.8 0.4 0.7 0.2 0.5 0.9 0.3 0.1 0.3 0.8 0.6 0.1

3. Cluster intervals based on signature.

   0.1 0.8 0.4 0.7 0.2 0.5 0.9 0.3 0.1 0.3 0.8 0.6 0.0

4. Simulate a limited number of samples from each cluster.

   0.4 0.9 0.1 0.3 0.6

5. Weight the results of the simulation based on cluster frequency.

Baer, p. 21
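
The five steps above can be sketched in a few lines. This is only an illustration under stated assumptions: the signatures are made-up two-element basic-block frequency vectors, the clustering is a simple greedy "leader" scheme chosen for brevity (the slides do not say which clustering algorithm is used), and the per-interval CPI table stands in for a detailed simulation run. All names are hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two signatures."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_intervals(signatures, threshold):
    """Greedy leader clustering (an assumed stand-in for step 3).

    An interval joins the first cluster whose leader (first member) is
    within `threshold`; otherwise it starts a new cluster.
    """
    clusters = []
    for i, sig in enumerate(signatures):
        for members in clusters:
            if distance(sig, signatures[members[0]]) <= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def estimate_metric(signatures, simulate, threshold):
    """Steps 3-5: cluster, simulate one sample per cluster, weight by size."""
    clusters = cluster_intervals(signatures, threshold)
    total = len(signatures)
    estimate = 0.0
    for members in clusters:
        weight = len(members) / total             # cluster frequency
        estimate += weight * simulate(members[0])  # simulate the leader only
    return estimate, clusters


# Steps 1-2 (assumed data): six intervals, each with a 2-block signature.
signatures = [(0.9, 0.1), (0.85, 0.15), (0.2, 0.8),
              (0.25, 0.75), (0.9, 0.1), (0.2, 0.8)]
# Made-up per-interval CPI; a real flow would run the detailed simulator.
cpi = [1.0, 1.1, 2.0, 2.1, 1.0, 2.0]

estimate, clusters = estimate_metric(signatures, lambda i: cpi[i], threshold=0.2)
print(clusters)
print(f"estimated CPI = {estimate:.3f}, full-run CPI = {sum(cpi) / len(cpi):.3f}")
```

Only two of the six intervals are simulated in detail here, yet the weighted estimate stays close to the average over all intervals because intervals in the same cluster behave alike.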

Page 12:

Simulation Accuracy

[Figure: simulation accuracy compared for three approaches: first billion instructions, fast forwarding, and phase-based]

Baer, p. 22

Page 13:

Smaller inputs

• Handcraft smaller inputs to a set of benchmarks
  – Smaller runs should be statistically equivalent to the original benchmark runs
  – Advantage: no sampling
  – Disadvantage: no automation, difficult to find adequate reduced inputs

Baer, p. 22

Page 14:

Further reading

• T. Austin, E. Larson, and D. Ernst, “SimpleScalar: An Infrastructure for Computer System Modeling,” IEEE Computer, 35, 2, Feb. 2002, 59-67

• How to find it:
  1. Go to http://www.library.ualberta.ca/
  2. Click

Page 15:

How to find papers in library

1. Go to http://www.library.ualberta.ca/
2. Click on Science on the left-hand side.
3. Click on Computing Science.
4. Select the database (most common are ACM, IEEE, Springer). For this paper click on IEEE Xplore.
5. If you are off-campus, click on Web Access and enter your CCID and password.