chapter01 fundamentals
TRANSCRIPT
-
8/4/2019 Chapter01 Fundamentals
1/36
Chapter 1 - Fundamentals 1
Computer Architecture
Chapter 1
Fundamentals
-
Introduction
1.1 Introduction
1.2 The Task of a Computer Designer
1.3 Technology and Computer Usage Trends
1.4 Cost and Trends in Cost
1.5 Measuring and Reporting Performance
1.6 Quantitative Principles of Computer Design
1.7 Putting It All Together: The Concept of Memory Hierarchy
-
Art and Architecture
What's the difference between Art and Architecture?
Lyonel Feininger, Marktkirche in Halle
-
Art and Architecture
What's the difference between Art and Architecture?
Notre Dame de Paris
-
What's Computer Architecture?
"The attributes of a [computing] system as seen by the programmer, i.e., the conceptual structure and functional behavior, as distinct from the organization of the data flows and controls, the logic design, and the physical implementation."
Amdahl, Blaauw, and Brooks, 1964
SOFTWARE
-
What's Computer Architecture?
1950s to 1960s: Computer Architecture Course
  Computer Arithmetic.
1970s to mid 1980s: Computer Architecture Course
  Instruction Set Design, especially ISA appropriate for compilers. (What we'll do in Chapter 2)
1990s to 2000s: Computer Architecture Course
  Design of CPU, memory system, I/O system, Multiprocessors. (All evolving at a tremendous rate!)
-
The Task of a Computer Designer
[Figure: design cycle. Evaluate Existing Systems for Bottlenecks -> Simulate New Designs and Organizations -> Implement Next Generation System. Labels on the cycle: Technology Trends, Benchmarks, Workloads, Implementation Complexity]
-
Technology and Computer Usage Trends
When building a Cathedral, numerous very practical considerations need to be taken into account:
  available materials
  worker skills
  willingness of the client to pay the price.
Similarly, Computer Architecture is about working within constraints:
  What will the market buy?
  Cost/Performance
  Tradeoffs in materials and processes
-
Trends
Gordon Moore (co-founder of Intel) observed in 1965 that the number of transistors that could be crammed on a chip doubles every year. In 1975 he revised the rate to a doubling roughly every two years, and exponential growth has continued since then.
[Figure: Transistors Per Chip, log scale from 1.E+03 to 1.E+08, versus year, 1970 to 2005. Plotted processors: 4004, 8086, 80286, 386, 486, Pentium, Power PC 601, Pentium Pro, Pentium II, Power PC G3, Pentium 3]
-
Trends
Processor performance, as measured by the SPEC benchmark, has also risen dramatically.
[Figure: SPEC performance (0 to 5000) versus year, 1987 to 2000. Plotted systems: Sun-4/260, MIPS M2000, IBM RS/6000, DEC AXP/500, DEC Alpha 4/266, DEC Alpha 5/500, DEC Alpha 21264/600, Alpha 6/833]
-
Trends
Memory Capacity (and Cost) have changed dramatically in the last 20 years.
[Figure: memory size (log scale, 1000 to 1000000000) versus year, 1970 to 2000]
year   size (Mb)   cycle time
1980   0.0625      250 ns
1983   0.25        220 ns
1986   1           190 ns
1989   4           165 ns
1992   16          145 ns
1996   64          120 ns
2000   256         100 ns
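To make the table's trend concrete, the sketch below estimates how long DRAM capacity and DRAM speed each took to double between 1980 and 2000. It assumes steady exponential growth between the first and last rows; the helper name `doubling_period` is illustrative and not from the slides.

```python
import math

# (year, capacity in Mb, cycle time in ns), copied from the table above
dram = [
    (1980, 0.0625, 250),
    (1983, 0.25, 220),
    (1986, 1, 190),
    (1989, 4, 165),
    (1992, 16, 145),
    (1996, 64, 120),
    (2000, 256, 100),
]

def doubling_period(start_value, end_value, years):
    """Years per doubling, assuming steady exponential growth (an assumption)."""
    return years * math.log(2) / math.log(end_value / start_value)

first, last = dram[0], dram[-1]
years = last[0] - first[0]

capacity_doubling = doubling_period(first[1], last[1], years)
# Cycle time shrinks over time, so compare speed = 1 / cycle time.
speed_doubling = doubling_period(1 / first[2], 1 / last[2], years)

print(f"capacity doubles about every {capacity_doubling:.2f} years")
print(f"speed doubles about every {speed_doubling:.1f} years")
```

Capacity grew by a factor of 4096 over the 20 years while cycle time improved by only 2.5x, which is the imbalance the following slides build on.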
-
Trends
Based on SPEED, the CPU has improved dramatically, but memory and disk have improved only a little. This has led to dramatic changes in architecture, Operating Systems, and Programming practices.
        Capacity        Speed (latency)
Logic   2x in 3 years   2x in 3 years
DRAM    4x in 3 years   2x in 10 years
Disk    4x in 3 years   2x in 10 years
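The speed column implies a widening gap between logic and DRAM. The sketch below compounds the two rates from the table over a chosen horizon; the function name `growth` and the 10-year horizon are illustrative choices, not part of the slides.

```python
def growth(factor, period_years, elapsed_years):
    """Improvement after elapsed_years if the quantity improves by `factor` every period_years."""
    return factor ** (elapsed_years / period_years)

HORIZON_YEARS = 10  # illustrative horizon

logic_speed = growth(2, 3, HORIZON_YEARS)    # logic: 2x in 3 years
dram_speed = growth(2, 10, HORIZON_YEARS)    # DRAM: 2x in 10 years
gap = logic_speed / dram_speed               # how much further logic has pulled ahead

print(f"logic: {logic_speed:.2f}x, DRAM: {dram_speed:.2f}x, gap: {gap:.2f}x")
```

Under these rates logic pulls roughly five times further ahead of DRAM each decade, which is the pressure that motivates the memory hierarchy at the end of the chapter.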
-
Measuring And Reporting Performance
This section talks about:
1. Metrics: how do we describe in a numerical way the performance of a computer?
2. What tools do we use to find those metrics?
-
Metrics
Time to run the task (ExTime)
  Execution time, response time, latency
Tasks per day, hour, week, sec, ns (Performance)
  Throughput, bandwidth
Plane              DC to Paris   Speed      Passengers   Throughput (pmph)
Boeing 747         6.5 hours     610 mph    470          286,700
BAC/Sud Concorde   3 hours       1350 mph   132          178,200
-
Metrics - Comparisons
"X is n times faster than Y" means
$$ n = \frac{\mathrm{ExTime}(Y)}{\mathrm{ExTime}(X)} = \frac{\mathrm{Performance}(X)}{\mathrm{Performance}(Y)} $$
Speed of Concorde vs. Boeing 747
Throughput of Boeing 747 vs. Concorde
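The two comparisons can be worked directly from the plane figures given earlier. This is a minimal sketch; the dictionary layout and the helper names `times_faster` and `throughput_pmph` are illustrative.

```python
# Figures from the plane table: trip time (hours), speed (mph), passengers
planes = {
    "Boeing 747": {"hours": 6.5, "mph": 610, "passengers": 470},
    "Concorde": {"hours": 3.0, "mph": 1350, "passengers": 132},
}

def times_faster(ex_time_y, ex_time_x):
    """'X is n times faster than Y': n = ExTime(Y) / ExTime(X)."""
    return ex_time_y / ex_time_x

def throughput_pmph(plane):
    """Passenger-miles per hour = passengers * speed."""
    return plane["passengers"] * plane["mph"]

concorde_speedup = times_faster(planes["Boeing 747"]["hours"],
                                planes["Concorde"]["hours"])
throughput_ratio = (throughput_pmph(planes["Boeing 747"])
                    / throughput_pmph(planes["Concorde"]))

print(f"Concorde is {concorde_speedup:.2f} times faster (execution time)")
print(f"Boeing 747 has {throughput_ratio:.2f} times the throughput")
```

Which plane "wins" depends on whether the metric is latency or throughput, which is the point of the slide.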
-
Metrics - Comparisons
Pat has developed a new product, "rabbit", about which she wishes to determine performance. There is special interest in comparing the new product, rabbit, to the old product, turtle, since the product was rewritten for performance reasons. (Pat had used Performance Engineering techniques and thus knew that rabbit was "about twice as fast" as turtle.) The measurements showed:
Performance Comparisons
Product   Transactions / second   Seconds / transaction   Seconds to process transaction
Turtle    30                      0.0333                  3
Rabbit    60                      0.0166                  1
Which of the following statements reflect the performance comparison of rabbit and turtle?
o Rabbit is 100% faster than turtle.
o Rabbit is twice as fast as turtle.
o Rabbit takes 1/2 as long as turtle.
o Rabbit takes 1/3 as long as turtle.
o Rabbit takes 100% less time than turtle.
o Rabbit takes 200% less time than turtle.
o Turtle is 50% as fast as rabbit.
o Turtle is 50% slower than rabbit.
o Turtle takes 200% longer than rabbit.
o Turtle takes 300% longer than rabbit.
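One way to check the statements is to compute the ratios from the measurements and see which metric each statement is talking about. The sketch below does that; the variable names are illustrative, and it makes no claim about which answers the author intended.

```python
# Measurements from the table: transactions per second and seconds to process one transaction
turtle = {"tps": 30, "seconds": 3}
rabbit = {"tps": 60, "seconds": 1}

# Throughput view: how many times more transactions per second
throughput_ratio = rabbit["tps"] / turtle["tps"]

# Response-time view: how many times longer turtle takes per transaction
response_time_ratio = turtle["seconds"] / rabbit["seconds"]

# Percentage phrasings of the response-time difference
rabbit_less_time_pct = 100 * (turtle["seconds"] - rabbit["seconds"]) / turtle["seconds"]
turtle_longer_pct = 100 * (turtle["seconds"] - rabbit["seconds"]) / rabbit["seconds"]

print(f"throughput ratio: {throughput_ratio:.1f}")
print(f"response-time ratio: {response_time_ratio:.1f}")
print(f"rabbit takes {rabbit_less_time_pct:.1f}% less time per transaction")
print(f"turtle takes {turtle_longer_pct:.0f}% longer per transaction")
```

The throughput and response-time views give different factors, so a statement such as "twice as fast" is only meaningful once the metric is named.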
-
Metrics - Throughput
[Figure: system levels, top to bottom: Application; Programming Language; Compiler; ISA; Datapath, Control; Function Units; Transistors, Wires, Pins]
Throughput metrics used at these levels:
  Answers per month
  Operations per second
  (millions) of Instructions per second: MIPS
  (millions) of (FP) operations per second: MFLOP/s
  Megabytes per second
  Cycles per second (clock rate)
-
Methods For Predicting Performance
Benchmarks, Traces, Mixes
Hardware: Cost, delay, area, power estimation
Simulation (many levels)
  ISA, RT, Gate, Circuit
Queuing Theory
Rules of Thumb
Fundamental Laws/Principles
-
Benchmarks
SPEC: System Performance Evaluation Cooperative
First Round 1989
  10 programs yielding a single number (SPECmarks)
Second Round 1992
  SPECint92 (6 integer programs) and SPECfp92 (14 floating point programs)
  Compiler flags unlimited. March 93 of DEC 4000 Model 610:
    spice: unix.c:/def=(sysv,has_bcopy,bcopy(a,b,c)=memcpy(b,a,c)
    wave5: /ali=(all,dcom=nat)/ag=a/ur=4/ur=200
    nasa7: /norecu/ag=a/ur=4/ur2=200/lc=blas
Third Round 1995
  New set of programs: SPECint95 (8 integer programs) and SPECfp95 (10 floating point)
  Benchmarks useful for 3 years
  Single flag setting for all programs: SPECint_base95, SPECfp_base95
-
Benchmarks
CINT2000 (Integer Component of SPEC CPU2000):
Program Language What Is It
164.gzip C Compression
175.vpr C FPGA Circuit Placement and Routing
176.gcc C C Programming Language Compiler
181.mcf C Combinatorial Optimization
186.crafty C Game Playing: Chess
197.parser C Word Processing
252.eon C++ Computer Visualization
253.perlbmk C PERL Programming Language
254.gap C Group Theory, Interpreter
255.vortex C Object-oriented Database
256.bzip2 C Compression
300.twolf C Place and Route Simulator
http://www.spec.org/osg/cpu2000/CINT2000/
-
Benchmarks
CFP2000 (Floating Point Component of SPEC CPU2000):
Program Language What Is It
168.wupwise Fortran 77 Physics / Quantum Chromodynamics
171.swim Fortran 77 Shallow Water Modeling
172.mgrid Fortran 77 Multi-grid Solver: 3D Potential Field
173.applu Fortran 77 Parabolic / Elliptic Differential Equations
177.mesa C 3-D Graphics Library
178.galgel Fortran 90 Computational Fluid Dynamics
179.art C Image Recognition / Neural Networks
183.equake C Seismic Wave Propagation Simulation
187.facerec Fortran 90 Image Processing: Face Recognition
188.ammp C Computational Chemistry
189.lucas Fortran 90 Number Theory / Primality Testing
191.fma3d Fortran 90 Finite-element Crash Simulation
200.sixtrack Fortran 77 High Energy Physics Accelerator Design
301.apsi Fortran 77 Meteorology: Pollutant Distribution
http://www.spec.org/osg/cpu2000/CFP2000/
-
Benchmarks
Sample Results For SPECint2000
Base Base Base Peak Peak Peak
Benchmarks Ref Time Run Time Ratio Ref Time Run Time Ratio
164.gzip 1400 277 505* 1400 270 518*
175.vpr 1400 419 334* 1400 417 336*
176.gcc 1100 275 399* 1100 272 405*
181.mcf 1800 621 290* 1800 619 291*
186.crafty 1000 191 522* 1000 191 523*
197.parser 1800 500 360* 1800 499 361*
252.eon 1300 267 486* 1300 267 486*
253.perlbmk 1800 302 596* 1800 302 596*
254.gap 1100 249 442* 1100 248 443*
255.vortex 1900 268 710* 1900 264 719*
256.bzip2 1500 389 386* 1500 375 400*
300.twolf 3000 784 382* 3000 776 387*
SPECint_base2000 438
SPECint2000 442
http://www.spec.org/osg/cpu2000/results/res2000q3/cpu2000-20000718-00168.asc
Intel OR840 (1 GHz Pentium III processor)
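The columns in this result sheet are related by simple arithmetic: in SPEC CPU2000 each ratio is 100 times the reference time divided by the run time, and the overall score is the geometric mean of the per-benchmark ratios. The sketch below recomputes the base numbers from the table; the helper names are illustrative.

```python
import math

# (benchmark, reference time, base run time, reported base ratio) from the table above
results = [
    ("164.gzip", 1400, 277, 505),
    ("175.vpr", 1400, 419, 334),
    ("176.gcc", 1100, 275, 399),
    ("181.mcf", 1800, 621, 290),
    ("186.crafty", 1000, 191, 522),
    ("197.parser", 1800, 500, 360),
    ("252.eon", 1300, 267, 486),
    ("253.perlbmk", 1800, 302, 596),
    ("254.gap", 1100, 249, 442),
    ("255.vortex", 1900, 268, 710),
    ("256.bzip2", 1500, 389, 386),
    ("300.twolf", 3000, 784, 382),
]

def spec_ratio(ref_time, run_time):
    """SPEC CPU2000 ratio: reference time over measured time, scaled by 100."""
    return 100.0 * ref_time / run_time

def geometric_mean(values):
    """n-th root of the product, computed via logs for numerical stability."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

recomputed = {name: spec_ratio(ref, run) for name, ref, run, _ in results}
base_score = geometric_mean([ratio for _, _, _, ratio in results])

for name, ref, run, reported in results:
    print(f"{name:12s} reported {reported}  recomputed {recomputed[name]:.1f}")
print(f"geometric mean of reported base ratios: {base_score:.1f}")
```

The recomputed ratios differ from the reported ones by a point or so, presumably because the published run times are rounded, and the geometric mean lands close to the reported SPECint_base2000 of 438.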
-
Benchmarks
Performance Evaluation
For better or worse, benchmarks shape a field.
Good products are created when we have:
  Good benchmarks
  Good ways to summarize performance
Given that sales are in part a function of performance relative to the competition, companies invest in improving the product as reported by the performance summary.
If the benchmarks/summary are inadequate, then the choice is between improving the product for real programs vs. improving the product to get more sales; sales almost always wins!
Execution time is the measure of computer performance!
-
Benchmarks
Management would like to have one number.
Technical people want more:
1. They want to have evidence of reproducibility: there should be enough information so that you or someone else can repeat the experiment.
2. There should be consistency when doing the measurements multiple times.
How to Summarize Performance
How would you report these results?
                    Computer A   Computer B   Computer C
Program P1 (secs)   1            10           20
Program P2 (secs)   1000         100          20
Total Time (secs)   1001         110          40
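One possible way to report the table is to give total time plus per-program ratios, so that the single number and the detail behind it are both visible. The sketch below is only that, one option; the helper `times_faster` and the data layout are illustrative.

```python
# Execution times in seconds, from the table above
times = {
    "A": {"P1": 1, "P2": 1000},
    "B": {"P1": 10, "P2": 100},
    "C": {"P1": 20, "P2": 20},
}

# Summary 1: total execution time per computer
totals = {machine: sum(runs.values()) for machine, runs in times.items()}

# Summary 2: arithmetic mean time per program
means = {machine: totals[machine] / len(runs) for machine, runs in times.items()}

def times_faster(x, y, program=None):
    """How many times faster computer x is than y, on one program or on total time."""
    if program is None:
        return totals[y] / totals[x]
    return times[y][program] / times[x][program]

print("total time:", totals)
print("mean time:", means)
print(f"C vs A on total time: {times_faster('C', 'A'):.3f}x")
print(f"A vs C on P1 alone:   {times_faster('A', 'C', 'P1'):.0f}x")
```

The last two lines show why one number is not enough: C is far faster than A on total time, yet A is far faster than C on P1.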
-
Quantitative Principles of Computer Design
Make the common case fast.
Amdahl's Law:
  Relates total speedup of a system to the speedup of some portion of that system.
-
Amdahl's Law
Suppose that enhancement E accelerates a fraction F of the task by a factor S, and the remainder of the task is unaffected.
Speedup due to enhancement E:
$$ \mathrm{Speedup}(E) = \frac{\text{Execution Time Without Enhancement}}{\text{Execution Time With Enhancement}} = \frac{\text{Performance With Enhancement}}{\text{Performance Without Enhancement}} $$
[Figure: task timeline with the enhanced fraction marked]
-
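Amdahl's Law
$$ \mathrm{ExTime}_{new} = \mathrm{ExTime}_{old} \times \left[ (1 - \mathrm{Fraction}_{enhanced}) + \frac{\mathrm{Fraction}_{enhanced}}{\mathrm{Speedup}_{enhanced}} \right] $$
$$ \mathrm{Speedup}_{overall} = \frac{\mathrm{ExTime}_{old}}{\mathrm{ExTime}_{new}} = \frac{1}{(1 - \mathrm{Fraction}_{enhanced}) + \dfrac{\mathrm{Fraction}_{enhanced}}{\mathrm{Speedup}_{enhanced}}} $$
[Figure: ExTime_old and ExTime_new bars, with the enhanced fraction marked]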
-
Amdahl's Law
Floating point instructions improved to run 2X; but only 10% of actual instructions are FP.
$$ \mathrm{ExTime}_{new} = \mathrm{ExTime}_{old} \times (0.9 + 0.1/2) = 0.95 \times \mathrm{ExTime}_{old} $$
$$ \mathrm{Speedup}_{overall} = \frac{1}{0.95} = 1.053 $$
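The floating point example can be checked with a few lines of code. This is a minimal sketch of Amdahl's Law as stated on the preceding slides; the function name `amdahl_speedup` is illustrative.

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup when fraction_enhanced of the time is sped up by speedup_enhanced."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if speedup_enhanced <= 0:
        raise ValueError("speedup_enhanced must be positive")
    new_time = (1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced
    return 1.0 / new_time

# The slide's example: 10% of instructions are FP and they run 2x faster.
fp_speedup = amdahl_speedup(0.10, 2.0)

# Upper bound: even an infinitely fast FP unit cannot beat 1 / (1 - 0.10).
fp_limit = 1.0 / (1.0 - 0.10)

print(f"overall speedup: {fp_speedup:.3f}")
print(f"limit as the FP speedup grows without bound: {fp_limit:.3f}")
```

The limit line is why the slides say to make the common case fast: the unenhanced 90% caps the benefit no matter how large the FP speedup becomes.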
-
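Cycles Per Instruction
$$ \mathrm{CPI} = \frac{\text{CPU Time} \times \text{Clock Rate}}{\text{Instruction Count}} = \frac{\text{Cycles}}{\text{Instruction Count}} $$
$$ \text{CPU Time} = \text{Cycle Time} \times \sum_{i=1}^{n} \mathrm{CPI}_i \times I_i $$
where $I_i$ is the number of instructions of type $i$.
Instruction Frequency
$$ \mathrm{CPI} = \sum_{i=1}^{n} \mathrm{CPI}_i \times F_i \quad \text{where} \quad F_i = \frac{I_i}{\text{Instruction Count}} $$
Invest Resources where time is Spent!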
-
Cycles Per Instruction
Suppose we have a machine where we can count the frequency with which instructions are executed. We also know how many cycles it takes for each instruction type.
Base Machine (Reg / Reg)
Op          Freq   Cycles   CPI(i)   (% Time)
ALU         50%    1        .5       (33%)
Load        20%    2        .4       (27%)
Store       10%    2        .2       (13%)
Branch      20%    2        .4       (27%)
Total CPI                   1.5
How do we get CPI(i)?
How do we get % time?
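Both questions have short arithmetic answers: CPI(i) is the frequency of an instruction class times its cycle count, and the share of time is CPI(i) divided by the total CPI. The sketch below reproduces the table; the data layout and names are illustrative.

```python
# (operation, frequency, cycles) from the Base Machine table
instruction_mix = [
    ("ALU", 0.50, 1),
    ("Load", 0.20, 2),
    ("Store", 0.10, 2),
    ("Branch", 0.20, 2),
]

# CPI(i) = F_i * CPI_i: each class's contribution to the overall CPI
cpi_contribution = {op: freq * cycles for op, freq, cycles in instruction_mix}

# Total CPI is the sum of the contributions
total_cpi = sum(cpi_contribution.values())

# % time = contribution / total CPI
time_share = {op: 100.0 * c / total_cpi for op, c in cpi_contribution.items()}

for op in cpi_contribution:
    print(f"{op:7s} CPI(i) = {cpi_contribution[op]:.1f}  time = {time_share[op]:.0f}%")
print(f"Total CPI = {total_cpi:.1f}")
```

ALU operations are half of the instructions but only about a third of the time, so the time share, not the raw frequency, shows where an improvement pays off.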
-
Locality of Reference
Programs access a relatively small portion of the address space at any instant of time.
There are two different types of locality:
  Temporal Locality (locality in time): If an item is referenced, it will tend to be referenced again soon (loops, reuse, etc.)
  Spatial Locality (locality in space/location): If an item is referenced, items whose addresses are close by tend to be referenced soon (straight line code, array access, etc.)
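The effect of the two kinds of locality can be seen by replaying address streams through a toy cache. The sketch below is an assumption-laden illustration, not a model of any real machine: a direct-mapped cache with 8 blocks of 4 words each, both sizes chosen arbitrarily, and the function name `hit_rate` is hypothetical.

```python
def hit_rate(addresses, num_blocks=8, block_words=4):
    """Replay word addresses through a tiny direct-mapped cache; return the fraction that hit."""
    cache = [None] * num_blocks            # each slot remembers which memory block it holds
    hits = 0
    for addr in addresses:
        block = addr // block_words        # memory block containing this word
        slot = block % num_blocks          # direct-mapped placement
        if cache[slot] == block:
            hits += 1
        else:
            cache[slot] = block            # miss: bring the block in from the lower level
    return hits / len(addresses)

sequential = list(range(64))               # array walk: neighbours share a block (spatial locality)
strided = list(range(0, 256, 4))           # one word per block: no spatial locality to exploit
loop = list(range(16)) * 4                 # same 16 words reused 4 times (temporal locality)

print("sequential:", hit_rate(sequential))
print("strided:   ", hit_rate(strided))
print("loop:      ", hit_rate(loop))
```

The sequential walk misses only on the first word of each block, the strided walk misses every time, and the loop misses only during its first pass.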
-
The Concept of Memory Hierarchy
Fast memory is expensive.
Slow memory is cheap.
The goal is to minimize the price/performance for a particular price point.
-
Memory Hierarchy
[Figure: Registers -> Level 1 cache -> Level 2 cache -> Memory -> Disk, with a row labeled "Typical Size" beginning "4 - 64"]
-
Memory Hierarchy
Hit: data appears in some block in the upper level (example: Block X)
  Hit Rate: the fraction of memory accesses found in the upper level
  Hit Time: time to access the upper level, which consists of RAM access time + time to determine hit/miss
Miss: data needs to be retrieved from a block in the lower level (Block Y)
  Miss Rate = 1 - (Hit Rate)
  Miss Penalty: time to replace a block in the upper level + time to deliver the block to the processor
Hit Time << Miss Penalty
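These terms combine into the usual average memory access time formula, hit time plus miss rate times miss penalty. The slides do not state this formula or give numbers, so the sketch below uses the standard textbook form with hypothetical values.

```python
def average_access_time(hit_time, hit_rate, miss_penalty):
    """Average memory access time = hit time + miss rate * miss penalty."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    miss_rate = 1.0 - hit_rate             # Miss Rate = 1 - (Hit Rate)
    return hit_time + miss_rate * miss_penalty

# Hypothetical numbers, not from the slides: 1-cycle hit, 95% hit rate, 50-cycle miss penalty
amat = average_access_time(hit_time=1, hit_rate=0.95, miss_penalty=50)
print(f"average access time: {amat:.2f} cycles")
```

Because the miss penalty is much larger than the hit time, even a small miss rate dominates the average.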
-
Memory Hierarchy
[Figure: Registers -> Level 1 cache -> Level 2 cache -> Memory -> Disk]
What is the cost of executing a program if:
  Stores are free (there's a write pipe)
  Loads are 20% of all instructions
  80% of loads hit (are found) in the Level 1 cache
  97% of loads hit in the Level 2 cache.
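The question cannot be answered from the slide alone because no latencies are given. The sketch below shows the shape of the calculation under loudly hypothetical assumptions: every cycle count is invented, every non-load instruction is assumed to cost one cycle, and the Level 2 figure is read as cumulative (97% of all loads are satisfied by Level 1 or Level 2).

```python
# All latencies below are hypothetical; the slide does not give any.
L1_HIT_CYCLES = 1
L2_HIT_CYCLES = 10
MEMORY_CYCLES = 100
BASE_CPI = 1.0                  # assumed cost of every non-load instruction (stores are free of stalls)

# Figures from the slide
LOAD_FRACTION = 0.20            # loads are 20% of all instructions
L1_HIT_RATE = 0.80              # 80% of loads hit in Level 1
L2_CUMULATIVE_HIT_RATE = 0.97   # assumption: 97% of loads hit in Level 1 or Level 2

def cycles_per_load():
    """Expected cycles for one load, weighting each level by the share of loads it serves."""
    served_by_l2 = L2_CUMULATIVE_HIT_RATE - L1_HIT_RATE
    served_by_memory = 1.0 - L2_CUMULATIVE_HIT_RATE
    return (L1_HIT_RATE * L1_HIT_CYCLES
            + served_by_l2 * L2_HIT_CYCLES
            + served_by_memory * MEMORY_CYCLES)

def overall_cpi():
    """Average cycles per instruction across loads and everything else."""
    return (1.0 - LOAD_FRACTION) * BASE_CPI + LOAD_FRACTION * cycles_per_load()

print(f"cycles per load: {cycles_per_load():.2f}")
print(f"overall CPI:     {overall_cpi():.2f}")
```

With these made-up latencies the 3% of loads that reach memory account for more than half of the load cost, which is the kind of conclusion the exercise is after; different latencies or a different reading of the 97% figure would change the numbers.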
-
Wrap Up
1.1 Introduction
1.2 The Task of a Computer Designer
1.3 Technology and Computer Usage Trends
1.4 Cost and Trends in Cost
1.5 Measuring and Reporting Performance
1.6 Quantitative Principles of Computer Design
1.7 Putting It All Together: The Concept of Memory Hierarchy