Computer Science and Engineering College of Engineering The Ohio State University Instruction Processing Cycle Lecture 2

Post on 17-Jun-2020


Source: web.cse.ohio-state.edu/~sivilotti.1/teaching/3903.recent/lectures/lecture02.pdf

Computer Science and Engineering College of Engineering The Ohio State University

Instruction Processing Cycle

Lecture 2


Expectations Survey: Results

Greatest hopes for the class?

Greatest fear about the class?

Would you take it even if it were not required?


Our Virtual Machine

A simple, fictitious architecture
  Memory
  CPU (central processing unit)
  Registers (PC, CCRs, general purpose)

[Diagram labels: Memory; CPU; PC; R0 R1 R2 R7; CCRs (N Z P); Output Device; Input Device]


Memory

Organized as a sequence of “cells”
  The smallest addressable unit
  N cells, addressed 0..N-1
Each cell has a “width” (k bits)
  Typically k = 8 (a byte)
  Other widths are possible
Adjacent cells can be combined to form a single quantity (a word)

[Diagram: a column of cells at addresses 0, 1, 2, 3, 4, 5, 6, ..., N-2, N-1, each k bits wide]
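As a minimal sketch of combining adjacent cells into a word: the code below assumes k = 8 and that the cell at the lowest address holds the most significant bits (big-endian). The slide does not specify a cell order, and the names `memory` and `read_word` are hypothetical.

```python
# Memory as a sequence of cells; each cell is k = 8 bits wide (an assumption).
CELL_BITS = 8
memory = [0x4D, 0x9A, 0x00, 0xFF]  # N = 4 cells, addressed 0..N-1


def read_word(mem, address, cells_per_word=2):
    """Combine adjacent cells into one quantity (a word).

    Assumes the cell at the lowest address holds the most significant
    bits (big-endian); a real machine may use the opposite order.
    """
    word = 0
    for offset in range(cells_per_word):
        word = (word << CELL_BITS) | mem[address + offset]
    return word


word = read_word(memory, 0)
print(hex(word))  # 0x4d9a
```

With two 8-bit cells per word, the result is a 16-bit quantity; other cell widths and word sizes only change the two constants.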


Questions

How many different values can a cell have? A function of ___

How many bits are needed to represent an address? A function of ___
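One way to explore these two questions numerically is the sketch below. The helper names are illustrative only. It counts the bit patterns a k-bit cell can hold, and the smallest number of bits that can distinguish N addresses.

```python
def distinct_values(k):
    """Number of different bit patterns a k-bit cell can hold."""
    return 2 ** k


def address_bits(n):
    """Smallest number of bits that can distinguish addresses 0..n-1."""
    if n < 1:
        raise ValueError("memory needs at least one cell")
    return (n - 1).bit_length()


print(distinct_values(8))    # 256
print(address_bits(65536))   # 16
print(address_bits(1000))    # 10
```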


Instruction Cycle

Repeat:
1. Fetch - get the instruction
2. Decode - figure out what it means
3. Evaluate Addresses - calculate addresses of operands
4. Fetch Operands - get the operands
5. Execute - do it!
6. Store Result - update memory, CCRs

Also called: instruction processing cycle (IPC), fetch-decode-execute cycle
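The six phases can be sketched as a loop over a toy machine. Everything specific here is an assumption made for illustration: the opcodes (`LOAD`, `ADDI`, `HALT`), the tuple instruction layout, and the helper names are not the course's architecture.

```python
# A toy machine, NOT the course's architecture: opcodes, operand layout,
# and register count are all made up for illustration.
def set_ccr(state, value):
    """Record whether a result is negative (N), zero (Z), or positive (P)."""
    state["ccr"] = "N" if value < 0 else "Z" if value == 0 else "P"


def run(memory, state):
    while True:
        # 1. Fetch: copy the instruction at PC into the CPU, then increment PC.
        instruction = memory[state["pc"]]
        state["pc"] += 1
        # 2. Decode: split the instruction into an opcode and its fields.
        opcode, dest, operand = instruction
        if opcode == "HALT":
            return state
        if opcode == "LOAD":
            # 3. Evaluate address and 4. fetch the operand from memory.
            value = memory[operand]
        elif opcode == "ADDI":
            # 4. Fetch the operand from a register, 5. execute the addition.
            value = state["regs"][dest] + operand
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
        # 6. Store result: update the destination register and the CCRs.
        state["regs"][dest] = value
        set_ccr(state, value)


memory = [
    ("LOAD", 0, 4),        # R0 <- memory[4]
    ("ADDI", 0, -7),       # R0 <- R0 + (-7)
    ("HALT", None, None),
    None,                  # unused cell
    7,                     # data
]
state = run(memory, {"pc": 0, "regs": [0] * 8, "ccr": "Z"})
print(state["regs"][0], state["ccr"], state["pc"])  # 0 Z 3
```

Not every instruction uses every phase: `HALT` stops after decode, and `ADDI` never evaluates a memory address.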


Fetch

First copy (contents of) memory location indicated by PC to the CPU

Then increment PC

[Diagram labels: CPU; PC; R0 R1 R2 R7; CCRs (N Z P); Output Device; Input Device; values 0x4D9A and 0x4D9B]


Incrementing the PC

Note that the increment is part of the Fetch phase

Therefore, if a subsequent phase uses the PC, its value is the address of the next instruction

E.g. “branch to subroutine” instruction
  Execute phase: store PC value, then change PC to be the address of the subroutine
  Next fetch phase will get the first instruction of the subroutine
  The stored value is the address of the instruction after the branch
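A small sketch of this ordering, under stated assumptions: the opcode name `JSR`, the `link` slot for the saved value, and one instruction per address are all hypothetical choices, not details of the course's machine.

```python
# Hypothetical model; "JSR" and the "link" slot are illustrative names,
# not taken from the course's instruction set.
def step(memory, cpu):
    # Fetch phase: read the instruction at PC, then increment PC.
    opcode, target = memory[cpu["pc"]]
    cpu["pc"] += 1
    # Execute phase: PC already holds the address of the next instruction.
    if opcode == "JSR":
        cpu["link"] = cpu["pc"]   # store the return address
        cpu["pc"] = target        # next fetch gets the subroutine's first instruction
    return opcode


memory = {
    0x4D9A: ("JSR", 0x5000),   # branch to subroutine at 0x5000
    0x4D9B: ("NOP", None),     # instruction after the branch
    0x5000: ("NOP", None),     # first instruction of the subroutine
}
cpu = {"pc": 0x4D9A, "link": None}
step(memory, cpu)
print(hex(cpu["link"]), hex(cpu["pc"]))  # 0x4d9b 0x5000
```

Because the increment happens during fetch, the saved value needs no adjustment when the subroutine later returns.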


The IPC is Fast

Clock ticks control this cycle
  Intel Core i7: 3.4 GHz (3.4 billion ticks/s)
  Light bulb: 60 Hz (120 flickers/s)
  Every flicker: about 28 million clock ticks!
Not every phase is needed every time
  Basic phases: fetch, decode, execute
Some phases can be combined
  E.g. operands can be fetched from registers in the same tick as instruction decoding


Reduced Instruction Set (RISC)

Repeat:
1. Instruction Fetch (IF)
2. Instruction Decode (ID)
3. Execute (EX)
4. Memory Access (MEM)
5. Writeback (WB)

Evaluate addresses, fetch operands, execute


Speeding Things Up: Pipelining

Metaphor: Washer, dryer, folder

Assume
  Each phase takes 30 min
  There are 4 loads to do
Without pipeline: 30 min x 3 x 4 = 6 hrs
With a pipeline?
Abstraction: Client doesn’t care how laundry service completes the 4 loads


Pipelining: Diagram

[Diagram: timeline with rows Load 0, Load 1, Load 2, Load 3; horizontal axis: time, in 30 min units]


Speed-up

How much faster is the pipeline?
  For 1 load: no improvement (1.5 hrs)
  For 2 loads: 2 hrs vs 3 hrs
  For 3 loads: 2.5 hrs vs 4.5 hrs

As the number of loads goes to infinity, what is the improvement?

What determines the asymptotic speedup of a pipeline?
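The laundry numbers can be reproduced with a short sketch. It assumes equal-length stages and no stalls, and the function names are illustrative. Under those assumptions the first load takes one slot per stage and each later load adds one slot, so the ratio of the two times tends toward the number of stages as the number of loads grows.

```python
def sequential_hours(loads, stages=3, stage_hours=0.5):
    """Each load finishes all stages before the next load starts."""
    return loads * stages * stage_hours


def pipelined_hours(loads, stages=3, stage_hours=0.5):
    """The first load takes `stages` slots; each later load adds one slot."""
    if loads == 0:
        return 0.0
    return (stages + loads - 1) * stage_hours


for loads in (1, 2, 3, 4, 1000):
    seq = sequential_hours(loads)
    pipe = pipelined_hours(loads)
    # Columns: loads, hours without pipeline, hours with pipeline, speedup.
    print(loads, seq, pipe, round(seq / pipe, 3))
```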


Pipelining: RISC architecture

CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=140179


Pipelining Hazards

Unlike loads of laundry, instructions are not independent
  One instruction may change the effect of the very next instruction
  Examples:
    Write to a register read by next instruction
    Change the PC
Solution: Stall the next instruction until it is safe
  Creates a “bubble” (an idle cycle) that moves through the pipeline
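A rough sketch of stalling on the first kind of hazard (a register written by one instruction and read by the next). The instruction format, the `issue_order` helper, and the one-cycle `STALL_CYCLES` penalty are all assumptions made for illustration. Real pipelines differ in how many bubbles they need, and this sketch does not model changes to the PC.

```python
# Hypothetical instruction format: (name, destination register, source registers).
# STALL_CYCLES is an assumed penalty; real pipelines differ.
STALL_CYCLES = 1


def issue_order(program):
    """Return the issue sequence with bubbles inserted after hazards."""
    slots = []
    previous_dest = None
    for name, dest, sources in program:
        # Hazard: this instruction reads the register the previous one writes.
        if previous_dest is not None and previous_dest in sources:
            slots.extend(["bubble"] * STALL_CYCLES)
        slots.append(name)
        previous_dest = dest
    return slots


program = [
    ("ADD R1", "R1", ("R2", "R3")),
    ("SUB R4", "R4", ("R1", "R5")),   # reads R1 written by the previous instruction
    ("AND R6", "R6", ("R2", "R5")),   # independent of SUB
]
print(issue_order(program))  # ['ADD R1', 'bubble', 'SUB R4', 'AND R6']
```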


Speeding Things Up: Concurrency

When things are independent, they can be done concurrently
  Quad-core laundry service: 4 washers, dryers, and folders
  Each core is still pipelined
Challenges
  Identifying things that are independent
  Synchronizing things that are not
See CSE 2431 (Systems II)


Microarchitecture (μarch)

Low-level structure: processor, registers, data path, pipeline, cache…

https://upload.wikimedia.org/wikipedia/commons/thumb/d/d6/AMD_K10_Arch.svg/718px-AMD_K10_Arch.svg.png


Instruction Set Architecture: ISA

Higher-level description: machine instruction set, programmer accessible registers, memory addressing modes

Recall CSE 2421 (Systems I)

By en:User:Booyabazooka - Mips32_addi.svg, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=1362890


Lab 1: Simulator

Goal: Functional correctness at the instruction set architecture (ISA) level

You do NOT need to account for other aspects of the microarchitecture
  No pipelining
  No concurrency
  No cache


Overview of Labs

Assembly language (e.g. LOAD r1,Total)
  -> Assembler ->
Object file (e.g. ...T003F2200...)
  -> Linking Loader ->
Linked machine code (e.g. ...0010001000110010...)
  -> Simulator ->
Executing program


Lab 1 Requirements

Given an object file (a text file)
  Contents of file denote initialization for a chunk of memory
  Format: header record, sequence of text records, end record
Develop a simulator that allows us to:
  Initialize the machine
  Load the object file into memory
  Simulate execution of the loaded program
Simulation has 3 modes
  Quiet: no output (except for that of the machine itself)
  Trace: output machine state before and after execution, as well as the effect of each instruction (memory and registers)
  Step: same as trace mode, but pause after each instruction
Optional: extra functionality for usability
  View state of machine, modify registers and memory, etc.
Robustness: Test it thoroughly!


Lab 1 Milestones

Preliminary Documentation: Sept 7
  Programmer’s Guide is particularly important
  We will look at and comment on everything you turn in
Mandatory Design Review: Sept 10, 11
  Sign up for a 30 minute slot
  Everyone in group must attend
Completed Documentation: Sept 24
Interactive Grading: Sept 24, 25
  Sign up for a 60 minute slot
  Everyone in group must attend


Group Formation

Form groups of 4
Exchange contact information
Set regular in-person meeting times
Note: Lab 0 is due soon!


Summary

Instruction processing cycle
  PC is incremented during fetch phase
Pipelining
  Overlap parts of execution to increase throughput
Microarchitecture vs ISA
  Abstraction
Lab overview
Group formation