lecture 1: flynn’s taxonomy

44
CDA 4150 Eurípides Montagne University of Ce ntral Florida 1 Lecture 1: Flynn’s Taxonomy

Upload: leola

Post on 22-Feb-2016

318 views

Category:

Documents


46 download

DESCRIPTION

Lecture 1: Flynn’s Taxonomy. The Global View of Computer Architecture. Applications. Parallelism. History. Technology. Computer Architecture -instruction set design -Organization -Hardware/Software boundary. Programming Languages. OS. Measurement and Evaluation. Compilers. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

1

Lecture 1: Flynn’s Taxonomy

Page 2: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

2

The Global View of Computer Architecture

Computer Architecture-instruction set design-Organization-Hardware/Software boundary

HistoryApplications Parallelism

Technology

OS

CompilersInterface Design (ISA)

Measurement and

Evaluation

Programming Languages

Page 3: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

3

The Task of A Computer Designer

– Determine what attributes are important for a new machine.

– Design a machine to maximize performance while staying within cost constrains.

Page 4: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

4

Flynn’s Taxonomy

• Michael Flynn (from Stanford)– Made a characterization of computer systems which

became known as Flynn’s Taxonomy

Computer

Instructions Data

Page 5: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

5

Flynn’s Taxonomy• SISD – Single Instruction Single Data

Systems

SI SISD SD

Page 6: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

6

Flynn’s Taxonomy

• SIMD – Single Instruction Multiple Data Systems “Array Processors”

SI

SISD

SISD

SISD

SD

SD

SD

Multiple Data

Page 7: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

7

Flynn’s Taxonomy

• MIMD Multiple Instructions Multiple Data System: “Multiprocessors”

Multiple Instructions Multiple Data

SI

SI

SI

SISD

SISD

SISD

SD

SD

SD

Page 8: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

8

Flynn’s Taxonomy

• MISD- Multiple Instructions / Single Data System– Some people say “pipelining” lies here, but

this is debatable. Multiple Instructions Single Data

SI

SISD

SISD

SISD

SD

SI

SI

Page 9: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

9

Abbreviations:SISD one address Machine.

• IP: Instruction pointer• MAR: Memory Address Register• MDR: Memory Data Register• A: Accumulator• ALU: Arithmetic Logic Unit• IR: Instruction Register• OP: Opcode• ADDR: Address

IP

MAR

MEMORY

MDRADDR AOP

DECODER ALU

Page 10: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

10

LOAD X– MAR IP– MDR M[MAR] || IP IP + 1

– IR MDR– DECODER IR.OP– MAR IR.ADDR– MDR M[MAR]– A MDR

IP

MAR

MEMORY

MDRADDR AOP

DECODER ALU

OP ADDRESS

One address format

Page 11: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

11

IP

MAR

MEMORY

MDRADDR AOP

DECODER ALU

ADD X - MAR IP

– MDR M[MAR] || IP IP + 1 – IR MDR– DECODER IR.OP– MAR IR.ADDR– MDR M[MAR]– A A + MDR

OP ADDRESS

One address format

Page 12: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

12

IP

MAR

MEMORY

MDRADDR AOP

DECODER ALU

STORE X

- MAR IP- MDR M[MAR] || IP IP + 1 - IR MDR- DECODER IR.OP- MAR IR.ADDR- MDR A- M[MAR] MDR

OP ADDRESS

One address format

Page 13: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

13

SISD Stack Machine

First Stack Machine

• B5000

IP

MAR

MEMORY

MDRADDROP

DECODER

1234

ALU

Stack

Page 14: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

14

IP

MAR

MEMORY

MDRADDROP

DECODER

1234

ALU

Stack

PUSH? ST[4] ||ST[4] ST[3] ||ST[3] ST[2] ||ST[2] ST[1] ||ST[1] MDR ||

LOAD XMAR IP MDR M[MAR] || IP IP + 1IR MDRDECODER IR.OPMAR IR.ADDRMDR M[MAR]PUSH

“||” means in parallel

ST

Page 15: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

15

IP

MAR

MEMORY

MDRADDROP

DECODER

1234

ALU

Stack

POPMDR ST[1] ||ST[1] ST[2] ||ST[2] ST[3] ||ST[3] ST[4] ||ST[4] 0

STORE XMAR IPMDR M[MAR]IR MDRDECODER IR.OPMAR IR.ADDRPOPM[MAR] MDR

“||” means in parallel

Page 16: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

16

IP

MAR

MEMORY

MDRADDROP

DECODER

1234

ALU

Stack

ADD MAR IPMDR M[MAR]IR MDRDECODER IR.OPST[2] ST[1] + ST[2]ST[1] ST[2]ST[2] ST[3]ST[3] ST[4]ST[4] 0

Zero address format

OP NOT USED

Page 17: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

17

Example

• Stack Trace– Loadi 1 _ _ _ _ – Loadi 2 – Add – Store X

Page 18: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

18

Cont’

IP

MAR

MEMORY

MDRADDROP

DECODER

1

ALU

Stack

Stack TraceLoadi “1” 1 _ _ _Loadi “2” Add Store X

Page 19: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

19

Cont’

• Stack Trace– Loadi 1 1 _ _ _– Loadi 2 2 1 _ _– Add – Store X

IP

MAR

MEMORY

MDRADDROP

DECODER

21

ALU

Stack

Page 20: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

20

ADD Step 1

IP

MAR

MEMORY

MDRADDROP

DECODER

23

ALU

Stack

Stack TracePush 1 1 _ _ _Push 2 2 1 _ _Add 2 3 _ _Store X

Page 21: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

21

ADD step 2

IP

MAR

MEMORY

MDRADDROP

DECODER

3

ALU

Stack

Stack TracePush 1 1 _ _ _Push 2 2 1 _ _Add 3 _ _ _Store X

Page 22: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

22

Before Store X is executed

IP

MAR

MEMORY

MDRADDROP

DECODER

3

ALU

Stack

Stack TracePush 1 1 _ _ _Push 2 2 1 _ _Add 3 _ _ _Store X 3 _ _ _

Page 23: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

23

After Store X is executed

IP

MAR

MEMORY

MDRADDROP

DECODERALU

Stack

Stack TracePush 1 1 _ _ _Push 2 2 1 _ _Add 2 3 _ _Store _ _ _ _

Page 24: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

24

SIMD(Array Processor)

IP

MAR

MEMORY

MDRADDROP

DECODER

A1 B1 C1 A2 B2 C2 AN BN CN

ALU ALU ALU

Page 25: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

25

Array Processors • One of the first Array

Processors was the ILLIIAC IV– Load A1, V[1]– Load B1,Y[1]– Load A2, V[2]– Load B2, Y[2]– ………………– ………………– Load An, V[n]– Load Bn, Y[n]

– ADD– Store C1, W[1]– Store C2, W[2]– Store C3, W[3]– ………………..– ………………..– Store Cn, W[n]

Page 26: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

26

Matrix Access

•Parallel access by Rows

•Parallel access by Columns

•Parallel access by Diagonals

•Skewed matrix representation

Page 27: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

27

Parallel access by Row

a11 a12 a13 a14

a21 a22 a23 a24

a31 a32 a33 a34

a41 a42 a43 a44

P1 P2 P3 P4

Serial accessby Columns

Page 28: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

28

Parallel access by Columnreordering is necessary

a11 a21 a31 a41

a12 a22 a32 a42

a13 a23 a33 a43

a14 a24 a34 a44

P1 P2 P3 P4

Serial accessby Rows

Page 29: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

29

Parallel access by Diagonals

a11 a12 a13 a14

a24 a21 a22 a23

a33 a34 a31 a32

a42 a43 a44 a41

P1 P2 P3 P1 P4

Page 30: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

30

Parallel Column/Row Access

a11 a12 a13 a14

a24 a21 a22 a23

a33 a34 a31 a32

a42 a423 a44 a41

a11

a21

a31

a41

a13

a23

a33

a43

Skewed matrix representation

P1 P2 P3 P4

Page 31: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

31

Parallel Column/Row/Diagonal Access

a11 a12 a13 a14

a21 a22 a23 a24

a34 a31 a32 a33

a43 a44 a41 a42

P1 P2 P3 P4

Interconnection Network

Skewed matrix representation (5 banks)

Page 32: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

32

Pipelining

• Definition: Pipelining is an implementation technique whereby multiple instructions are overlapped in execution, taking advantage of parallelism that exists among actions needed to execute an instruction.

• Example: Pipelining is similar to an automobile assembly line.

Page 33: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

33

Pipelining: Automobile Assembly line• T0: Frame|• T1: Frame| wheels• T2: Frame| wheels| Engine• T3: Frame| Wheels| Engine| Body New Car

If it takes 1 hour to complete one car, and each of the above stages takes 15 minutes, then building the first car takes 1 hour; one car can produced every 15 minutes. This same principle can be applied to computer instructions.

Page 34: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

34

Pipelining: Floating point addition

• Suppose a floating point addition operation could be divided into 4 stages, each completion ¼ of the total addition operation.

• Then the following chart would be possible.

STAGE1 STAGE2 STAGE3 STAGE4

Page 35: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

35

Example

• 0.9056 10^2• 3.7401 10^4 .3749156 10^5

• 0.9056 10^2• 374.01 10^2 374.9156 10^2

Alignment

Compare Exponents

Add

Normalize

+

Page 36: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

36

Floating point unit

Compare Exponents

Alignments

Add

Normalization

ALU

Page 37: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

37

The steps

• Step one: Compare and choose the exponents.

• Step two: Set both numbers to the same exponent.

• Step three: Perform addition/subtraction on the two numbers.

• Last step: Normalization.

Page 38: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

38

Vector processors

Vector processor Characteristics:- Pipelining is used in processing.- Vector Registers are used to provide the ALU with constant input.- Memory interleaving is used to load input to the vector registers.- AKA Supercomputers.- the CRAY-1 is regarded as the first of these types.

Page 39: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

39

Vector Processors: DiagramIP

MAR

MEMORY

MDRADDROP

DECODER

A B C

ALU

Page 40: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

40

Vector Processing: Compilers• The addition of vector processing has led to

vectorization in compiler design.Example: the following loop construct:

FOR I, 1 to NC[i] A[i]+ B[i]

Can be unrolled into the following instruction:C[1] A[1]+B[1]C[2] A[2]+B[2]C[3] A[3]+B[3]…..

Page 41: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

41

Memory Interleaving

• Definition: Memory Interleaving is a design used to gain faster access to memory, by organizing memory into separate memory banks, each with their own MAR (memory address register). This allows parallel access and eliminates the required wait for a single MAR to finish a memory access.

Page 42: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

42

Memory Interleaving: DiagramMAR

MAR2 MAR3 MAR4MAR1

MEMORY1 MEMORY2 MEMORY3 MEMORY4

MDR1 MDR2 MDR3 MDR4

MDR

Page 43: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

43

Vector Processors & Memory Interleaving Diagram

IP

MAR

MDROP ADDR

A B C

ALUDECODER

Vector Register

Page 44: Lecture 1: Flynn’s Taxonomy

CDA 4150 Eurípides Montagne University of Central Florida

44

Multiprocessor Machines (MIMD)

MEMORY

CPU CPU CPU