computer organization cs224 fall 2012 lesson 28. pipelining analogy pipelined laundry: overlapping...

Computer Organization CS224 Fall 2012 Lesson 28

Upload: david-clark

Post on 04-Jan-2016

245 views

Category:

Documents

8 download

Report

Download

Embed Size (px):

TRANSCRIPT

Computer OrganizationCS224

Fall 2012

Lesson 28

Pipelining Analogy

Pipelined laundry: overlapping execution Parallelism improves performance

§4.5 An O

verview of P

ipelining Four loads: Speedup

= 8/3.5 = 2.3 Non-stop:

Speedup= 2n/0.5n + 1.5 ≈ 4= number of stages

Page 3: Computer Organization CS224 Fall 2012 Lesson 28. Pipelining Analogy Pipelined laundry: overlapping execution l Parallelism improves performance §4.5

MIPS Pipeline

Five stages, one step per stage1. IF: Instruction fetch from memory

2. ID: Instruction decode & register read

3. EX: Execute operation or calculate address

4. MEM: Access memory operand

5. WB: Write result back to register

Page 4: Computer Organization CS224 Fall 2012 Lesson 28. Pipelining Analogy Pipelined laundry: overlapping execution l Parallelism improves performance §4.5

Pipeline Performance

Assume time for stages is 100ps for register read or write 200ps for other stages

Compare pipelined datapath with single-cycle datapath

Instr Instr fetch Register read

ALU op Memory access

Total time

lw 200ps 100 ps 200ps 200ps 100 ps 800ps

sw 200ps 100 ps 200ps 200ps 700ps

R-format 200ps 100 ps 200ps 100 ps 600ps

beq 200ps 100 ps 200ps 500ps

Page 5: Computer Organization CS224 Fall 2012 Lesson 28. Pipelining Analogy Pipelined laundry: overlapping execution l Parallelism improves performance §4.5

Pipeline Performance

Single-cycle (Tc= 800ps)

Pipelined (Tc= 200ps)

Page 6: Computer Organization CS224 Fall 2012 Lesson 28. Pipelining Analogy Pipelined laundry: overlapping execution l Parallelism improves performance §4.5

Pipeline Speedup

If all stages are balanced i.e., all take the same time

Time between instructionspipelined

= Time between instructionsnonpipelined

Number of stages

If not balanced, speedup is less

Speedup due to increased throughput Latency (time for each instruction) does not decrease

Page 7: Computer Organization CS224 Fall 2012 Lesson 28. Pipelining Analogy Pipelined laundry: overlapping execution l Parallelism improves performance §4.5

Pipelining and ISA Design

MIPS ISA designed for pipelining All instructions are 32-bits

- Easier to fetch and decode in one cycle

- c.f. x86: 1- to 17-byte instructions Few instruction formats, very regular

- Can decode and read registers in one step Load/store addressing

- Can calculate address in 3rd stage, access memory in 4th stage

Alignment of memory operands

- Memory access takes only one cycle

Page 8: Computer Organization CS224 Fall 2012 Lesson 28. Pipelining Analogy Pipelined laundry: overlapping execution l Parallelism improves performance §4.5

Hazards

Situations that prevent starting the next instruction in the next cycle

Structure hazards A required resource is busy

Data hazard Need to wait for previous instruction to complete its data

read/write

Control hazard Deciding on control action depends on previous instruction

Page 9: Computer Organization CS224 Fall 2012 Lesson 28. Pipelining Analogy Pipelined laundry: overlapping execution l Parallelism improves performance §4.5

Structure Hazards

Conflict for use of a resource

In MIPS pipeline with a single memory Load/store requires data access Instruction fetch would have to stall for that cycle

- Would cause a pipeline “bubble”

Hence, pipelined datapaths require separate instruction/data memories

Or separate instruction/data caches

Page 10: Computer Organization CS224 Fall 2012 Lesson 28. Pipelining Analogy Pipelined laundry: overlapping execution l Parallelism improves performance §4.5

Data Hazards

An instruction depends on completion of data access by a previous instruction

add $s0, $t0, $t1sub $t2, $s0, $t3

Page 11: Computer Organization CS224 Fall 2012 Lesson 28. Pipelining Analogy Pipelined laundry: overlapping execution l Parallelism improves performance §4.5

Forwarding (aka Bypassing)

Use result when it is computed Don’t wait for it to be stored in a register Requires extra connections in the datapath

Page 12: Computer Organization CS224 Fall 2012 Lesson 28. Pipelining Analogy Pipelined laundry: overlapping execution l Parallelism improves performance §4.5

Load-Use Data Hazard

Can’t always avoid stalls by forwarding If value not computed when needed Can’t forward backward in time!

Page 13: Computer Organization CS224 Fall 2012 Lesson 28. Pipelining Analogy Pipelined laundry: overlapping execution l Parallelism improves performance §4.5

Code Scheduling to Avoid Stalls

Reorder code to avoid use of load result in the next instruction

C code for A = B + E; C = B + F;

lw $t1, 0($t0)lw $t2, 4($t0)add $t3, $t1, $t2sw $t3, 12($t0)lw $t4, 8($t0)add $t5, $t1, $t4sw $t5, 16($t0)

stall

lw $t1, 0($t0)lw $t2, 4($t0)lw $t4, 8($t0)add $t3, $t1, $t2sw $t3, 12($t0)add $t5, $t1, $t4sw $t5, 16($t0)

11 cycles13 cycles

Transport Layer3-1 Pipelined protocols Pipelining: sender allows multiple, “in-flight”, yet-to- be-acknowledged pkts m range of sequence numbers must be

Feed Forward Pipelined Accumulate Unit for the Machine ...the most prominent methodologies for expanding the activity clock recurrence. In spite of the fact that pipelining is an effective

ECE260: Fundamentals of Computer Engineering …...ECE260: Fundamentals of Computer Engineering 3 Pipelining Analogy • Pipelined laundry — overlapping execution of different stages

CS224 – Final Project Report A Study on Social Behaviors

Pipelined Iterative Solvers with Kernel Fusion for ... · overall number of kernel launches, not exhibiting any pipelining effect. Pipelining and kernel fusion are then applied to

EECE 476: Computer Architecture Slide Set #4: Pipelining Instructor: Tor Aamodt Slide background: Die photo of the Intel 486 (first pipelined x86 processor)

Constructive Computer Architecture: Non-Pipelined and Pipelined Processors Arvind

MIPS Processor · PDF fileDesign of Pipelined MIPS Processor Sept. 24 & 26, 1997 Topics • Instruction processing • Principles of pipelining • Inserting pipe registers • Data

Simple Instruction Pipelining · 2019-09-12 · Pipelining History 6.823 Some very early machines had limited pipelined execution (e.g., Zuse Z4, WWII) Usually overlap fetch of next

Computer Architecture (CS-213) · • How much cycles does first instruction take, in a pipelined processor, to execute ? • How do we manage control signals in pipelining ? Pipeline

1 Naïve Pipelined Implementation. 2 Outline General Principles of Pipelining –Goal –Difficulties Naïve PIPE Implementation Suggested Reading 4.4, 4.5

Pipelined Iterative Solvers with Kernel Fusion for ...juengel/publications/pdf/p14rupp.pdf · the system matrix, pipelining and kernel fusion techniques are presented in Section 2

A HIGH-PERFORMANCE, HYBRID WAVE-PIPELINED LINEAR … · The advantages of hybrid wave-pipelining will be explored using a Linear Feedback Shift Register (LFSR) which enables the study

061 MIPS Pipelined Implementation 061 · 065 Pipeline Details 065 Pipelining Idea ... MIPS implementations above only subject to RAW hazards. RAR not a hazard since read order irrelevant

Data Link Layer - University of MichiganARQ.pdf · Sliding Window: Pipelined Flow Control Pipelining: sender allows multiple “in-ﬂight,” yet-to-be-acknowledged, packets Sliding

VLSI Implementation of Hybrid Wave-Pipelined 2D DWT Using ...downloads.hindawi.com/archive/2008/512746.pdfis aimed at combining the advantages of both pipelining and wave-pipelining

Pipelined Datapath - Professional Web Presencepwp.gatech.edu/ece-ece3056-sy/wp-content/uploads/sites/546/2019/02/pipelined-datapath.pdf11 (21) Pipelining Example Instruction memory

Integral University, Lucknow Department of Computer ... · Pipelining Processing and overlapped parallelism: Principle of Linear Pipelining, Classification of Pipelined Processor,

Example of the Pipelined CISC and RISC Chapter 06 ... of the Pipelined CISC and RISC Processors Chapter 06: Instruction Pipelining and Parallel Processing Objective • • To understand

PIPELINED MULTITHREADING · transformation called Decoupled Software Pipelining (DSWP). DSWP, in particular, and PMT techniques, in general, are able to tolerate inter-core latencies,

Chapter 8. Pipelining. Overview Pipelining is widely used in modern processors. Pipelining improves system performance in terms of throughput. Pipelined

Computer Organization CS224 Fall 2012 Lessons 9 and 10

55:132/22C:160 Spring 2011 Ideal Pipelining Pipelined ...user.engineering.uiowa.edu/~hpca/LectureNotes/Lecture3spring2011.pdf · 55:132/22C:160 Spring 2011 Jon Kuhl 4 Development

LECTURE 8 Pipelining: Datapath and Controlcarnahan/cda3101/Lecture8_cda3101.pdf · PIPELINED DATAPATH One way to visualize pipelining is to consider the execution of each instruction

Pipelining 國立清華大學資訊工程學系黃婷婷教授. Outline An overview of pipelining A pipelined datapath Pipelined control Hazards: types of hazard Handling data

Pipelining - University of Toronto · 2005-09-17 · Pipelining • Principles of pipelining † Simple pipelining † Structural Hazards † Data Hazards † Control Hazards †

Pipelining - · PDF filenya bisa dimulai tanpa menunggu selesainya pekerjaan sebelumnya Sekuensial : Pipelined : 1 pekerjaan (B ) ... PC ðßTarget atau Bila syarat percabangan

Concetti di base del PIPELININGgunetti/DIDATTICA/architettureII/02-pipeline-1.pdf · 9 L’architettura MIPS pipelined • Sulla schematizzazione del pipelining del lucido precedente

CS224 Spring 2011 Computer Organization CS224 Chapter 6A: Disk Systems With thanks to M.J. Irwin, D. Patterson, and J. Hennessy for some lecture slide

Computer Organization CS224 Fall 2012 Lessons 49 & 50

Computer Organization CS224

Pipelining the MIPS Datapath - GitHub Pages · Today’s lecture Pipeline implementation Single-cycle Datapath Pipelining performance Pipelined datapath Example. Refresher: The Full

PIPELINING basics - · PIPELINING basics • A pipelined architecture for MIPS • Hurdles in pipelining • Simple solutions to pipelining hurdles • Advanced pipelining

Pipelining & Parallel Processing - ics.kaist.ac.krics.kaist.ac.kr/ee878_2018f/[EE878]3 Pipelining and Parallel Processing.pdf · Pipelining processing By using pipelining latches

Pipelining. Overview Pipelining is widely used in modern processors. Pipelining improves system performance in terms of throughput. Pipelined organization