Lecture #06: Inner Workings of the CPU


Upload: kanapathipillai-shujeevan

Post on 17-Dec-2014


Page 1: Lecture#06   inner workings of the cpu

ICT1012 - Computer Systems
Lecture 6 - Inner Workings of the Central Processing Unit
Lakshman Jayaratne

Learning Objectives
• Computer architecture
• Components of a simple central processing unit:
  o registers, ALU, control unit and buses
• Other hardware components of a computer:
  o buses, clocks, peripheral devices, memory
• Features of computers
• Speed and reliability
• Components and CPU registers
• Memory organization
• Fetch-decode-execute cycle and its use to execute instructions in a simple computer

Page 2: Lecture#06   inner workings of the cpu

Hardware Components of a Typical Computer
[Diagram: Peripheral Devices, Central Processing Unit (CPU) and Memory, connected by buses]
• Buses allow components to pass data to each other

Hardware Components of a Typical Computer - CPU
Central Processing Unit (CPU)
• Performs the basic operations
• Consists of two parts:
  o Arithmetic / Logic Unit (ALU) - data manipulation
  o Control Unit - coordinates the machine's activities

Page 3: Lecture#06   inner workings of the cpu

Central Processing Unit (CPU)
• Fetches, decodes and executes program instructions
• Two principal parts of the CPU
  o Arithmetic-Logic Unit (ALU)
    - Connected to registers and memory by a data bus
    - All three comprise the datapath
  o Control unit
    - Sends signals to CPU components to perform sequenced operations

CPU: Registers, ALU and Control Unit
• Registers
  o Hold data that can be readily accessed by the CPU
  o Implemented using D flip-flops
    - A 32-bit register requires 32 D flip-flops
• Arithmetic-logic unit (ALU)
  o Carries out logical and arithmetic operations
  o Often affects the status register (e.g., overflow, carry)
  o Operations are controlled by the control unit
• Control unit (CU)
  o Policeman or traffic manager
  o Determines which actions to carry out according to the values in a program counter register and a status register
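To make the link between the ALU and the status register concrete, here is a minimal Python sketch of an 8-bit addition that also reports carry and overflow flags. The 8-bit width, the function name `alu_add` and the flag names are assumptions made for this illustration; they do not come from the lecture.

```python
def alu_add(a: int, b: int, width: int = 8):
    """Add two `width`-bit values and report carry and overflow flags."""
    mask = (1 << width) - 1          # e.g. 0xFF for 8 bits
    sign = 1 << (width - 1)          # position of the sign bit
    a &= mask
    b &= mask
    total = a + b
    result = total & mask            # keep only the low `width` bits
    carry = total > mask             # the unsigned sum did not fit
    # Signed overflow: both operands have the same sign, the result does not.
    overflow = bool(~(a ^ b) & (a ^ result) & sign)
    return result, {"carry": carry, "overflow": overflow}


print(alu_add(200, 100))  # unsigned sum 300 does not fit in 8 bits: carry is set
print(alu_add(100, 100))  # 200 does not fit in signed 8 bits: overflow is set
```

A control unit could then inspect these flags, for example to decide whether a conditional branch is taken.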

Page 4: Lecture#06   inner workings of the cpu

Hardware Components of a Typical Computer - Memory
Main Memory
• Holds programs and data
• Stores bits in fixed-sized chunks: "word" (8, 16, 32 or 64 bits)
• Each word has a unique address
• The words can be accessed in any order: random-access memory or "RAM"

Memory
• Consists of a linear array of addressable storage cells
• A memory address is represented by an unsigned integer
• Can be byte-addressable or word-addressable
  o Byte-addressable: each byte has a unique address
  o Word-addressable: a word (e.g., 4 bytes) has a unique address

Page 5: Lecture#06   inner workings of the cpu

Memory: Example
• The memory word size of a machine is 16 bits
• A 4M × 16 RAM chip gives us 4M (2^22) 16-bit memory locations
  o 4M = 2^2 × 2^20 = 2^22 = 4,194,304 unique locations (each location contains a 16-bit word)
  o Memory locations range from 0 to 4,194,303 in unsigned integers
• 2^N addressable units of memory require N bits to address each location
  o Thus, the memory bus of this system requires at least 22 address lines
  o The address lines "count" from 0 to 2^22 - 1 in binary
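The arithmetic in this example can be checked with a few lines of Python. This is only a sketch: the helper name `address_lines` is made up for the illustration.

```python
def address_lines(locations: int) -> int:
    """Return the smallest N such that 2**N >= locations."""
    if locations < 1:
        raise ValueError("need at least one addressable location")
    return (locations - 1).bit_length()


locations = 4 * 2**20               # 2^2 * 2^20 = 2^22 locations
print(locations)                    # 4194304
print(locations - 1)                # highest address: 4194303
print(address_lines(locations))     # 22
```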

Hardware Components of a Typical Computer - Peripheral Devices that Communicate with the Outside World
• Input/Output (I/O)
  o Input: keyboard, mouse, microphone, scanner, sensors (camera, infra-red), punch-cards
  o Output: video, printer, audio speakers, etc.
• Communication
  o modem, ethernet card

Page 6: Lecture#06   inner workings of the cpu

Hardware Components of a Typical Computer - Peripheral Devices that Store Data Long Term
• Secondary (mass) storage
  o Stores information for long periods of time as files
  o Examples: hard drive, floppy disk, tape, CD-ROM (Compact Disk Read-Only Memory), flash drive, DVD (Digital Video/Versatile Disk)

Hardware Components of a Typical Computer - Buses
Buses
• Used to share data between system components inside and outside the CPU
• Set of wires (lines) that
  o act as a shared path
  o allow parallel movement of bits

Page 7: Lecture#06   inner workings of the cpu

Typical Bus Transactions
• Sending an address (for performing a read or write)
• Transferring data from memory to register and vice versa
• Transferring data for I/O reads and writes from peripheral devices

Buses
• Physically a bus is a group of conductors that allows all the bits in a binary word to be copied from a source component to a destination component
• Buses move binary values inside the CPU between registers and other components
• Buses are also used outside the CPU, to copy values between the CPU registers and main memory, and between the CPU registers and the I/O sub-system

Page 8: Lecture#06   inner workings of the cpu

Types of Buses: Source and Destination
• Point-to-point: connects two specific components
• Multi-point: a shared resource that connects several components
  o access to it is controlled through protocols, which are built into the hardware

Types of Buses: Contents
• Data bus: conveys bits from one device to another
• Control bus: determines the direction of data flow and when each device can access the bus
• Address bus: determines the location of the source or destination of the data

Page 9: Lecture#06   inner workings of the cpu

Clock
• Every computer contains at least one clock that synchronizes the activities of its components
  o A fixed number of clock cycles are required to carry out each data movement or computational operation
  o The clock frequency determines the speed of all operations
    - Measured in megahertz (MHz) or gigahertz (GHz)
• Generally the term clock refers to the CPU (master) clock
  o Buses can have their own clocks, which are usually slower
• Most machines are synchronous
  o Controlled by a master clock signal
  o Registers must wait for the clock to tick before loading new data

Clock Speed (I)
• Clock cycle time is the reciprocal of clock frequency
  o Example: an 800 MHz clock has a cycle time of 1.25 ns
    - 1/800,000,000 = 0.00000000125 = 1.25 × 10^-9 seconds
• Clock speed ≠ CPU performance
• The CPU time required to run a program is given by the general performance equation:
    CPU time = seconds/program = (instructions/program) × (average cycles/instruction) × (seconds/cycle)

Page 10: Lecture#06   inner workings of the cpu

Clock Speed (II)
• Therefore, we can improve CPU throughput when we reduce
  o the number of instructions in a program
  o the number of cycles per instruction
  o the number of nanoseconds per clock cycle
• But, in general
  o Multiplication takes longer than addition
  o Floating point operations require more cycles than integer operations
  o Accessing memory takes longer than accessing registers
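The three factors listed above are the terms of the standard CPU performance equation: CPU time = instructions × average cycles per instruction × seconds per cycle. The following Python sketch evaluates it. The function names and the workload figures (one million instructions, 2 cycles per instruction) are hypothetical values chosen only for illustration; the 800 MHz clock is the one used in the lecture's cycle-time example.

```python
def cycle_time_ns(frequency_hz: float) -> float:
    """Clock cycle time in nanoseconds: the reciprocal of the frequency."""
    return 1e9 / frequency_hz


def cpu_time_s(instructions: int, cycles_per_instruction: float,
               frequency_hz: float) -> float:
    """CPU time = instructions * cycles/instruction * seconds/cycle."""
    return instructions * cycles_per_instruction / frequency_hz


print(cycle_time_ns(800e6))                 # 1.25 (ns), as in the 800 MHz example
# Hypothetical workload: 1,000,000 instructions at an average of 2 cycles each.
print(cpu_time_s(1_000_000, 2, 800e6))      # 0.0025 (seconds)
```

Halving any one of the three factors halves the CPU time, which is why a higher clock frequency alone does not guarantee a faster program.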

Features of Computers: Speed and Reliability
• Speed
  o CPU speed
  o System-clock / Bus speed
  o Memory-access speed
  o Peripheral device speed
• Reliability

Page 11: Lecture#06   inner workings of the cpu

CPU Speed
• CPU clock speed: in cycles per second ("hertz")
  o Example: 700 MHz Pentium III, 3 GHz Pentium IV
  o but different CPU designs do different amounts of work in one clock cycle
• Other measures of speed
  o "flops" (floating-point operations per second)
  o "mips" (million instructions per second)

System-Clock / Bus Speed
• Speed of communication between CPU, memory and peripheral devices
• Depends on main board design
• Examples:
  o Intel 1.50 GHz Pentium-4 works on a 400 MHz bus speed

Page 12: Lecture#06   inner workings of the cpu

Memory-Access Speed
• RAM
  o about 60 ns (1 nanosecond = a billionth of a second), and getting faster
  o may be rated with respect to "bus speed" (e.g., PC-100)
• Cache memory
  o faster than main memory (about 20 ns access speed), but more expensive
  o contains data which the CPU is likely to use next

Peripheral Device Speed
• Mass storage
  o Examples:
    - 3.5 in 1.4 MB floppy disk: about 200 kb/sec at 300 rpm (revolutions per minute)
    - Hard drive: up to 160 GB of storage, average seek time about 6 milliseconds, and 7,200 rpm
• Communications
  o Examples: modems at 56 kilobits per second, and network cards at 10 or 100 megabits per second
• I/O
  o Examples: ISA, PCI, IDE, SCSI, ATA, USB, etc.

Page 13: Lecture#06   inner workings of the cpu

Cache Memory and Virtual Memory
• Cache memory - random access memory that a processor can access more quickly than regular RAM
• Virtual memory - an "extension" of RAM using the hard disk
  o allows the computer to behave as though it has more memory than what is physically available

Interrupts and Exceptions
• Events that alter the normal execution of a program
• Exceptions are triggered within the processor
  o Arithmetic errors, overflow or underflow
  o Invalid instructions
  o User-defined break points
• Interrupts are triggered outside the processor
  o I/O requests
• Each type of interrupt or exception is associated with a procedure that directs the actions of the CPU
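The idea that each event type is associated with its own procedure can be pictured as a lookup table from event to handler. The sketch below is a software analogy only: the event names, handler functions and `dispatch` helper are all hypothetical, and real hardware typically uses a table of handler addresses rather than Python functions.

```python
def on_overflow() -> str:
    return "exception: arithmetic overflow"


def on_invalid_instruction() -> str:
    return "exception: invalid instruction"


def on_io_request() -> str:
    return "interrupt: I/O request serviced"


# Hypothetical table: one handler procedure per event type.
HANDLERS = {
    "overflow": on_overflow,                        # triggered inside the processor
    "invalid_instruction": on_invalid_instruction,  # triggered inside the processor
    "io_request": on_io_request,                    # triggered outside the processor
}


def dispatch(event: str) -> str:
    """Run the procedure associated with the given event type."""
    handler = HANDLERS.get(event)
    if handler is None:
        raise ValueError(f"no handler registered for {event!r}")
    return handler()


print(dispatch("io_request"))
```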

Page 14: Lecture#06   inner workings of the cpu

Fetch-decode-execute Cycle
A computer runs programs by performing fetch-decode-execute cycles

1. fetch next instruction from memory (word pointed to by PC) and place in IR
   Example: instruction word at mem[PC] is 0x20A9FFFD
2. decode instruction in the IR to determine type
   Example: opcode 8 is "add immediate", source reg is $5, "target" reg is $9, add amount is -3
3. execute instruction
   Example: send reg $5 and -3 to ALU, add them, put result in reg $9
4. go to the next instruction (next word in memory)
   Example: PC = PC + 4

Instruction word 0x20A9FFFD in binary: 001000 00101 01001 1111111111111101
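The decode step for this example word can be reproduced in Python by slicing the 32 bits into the four fields shown in the binary grouping (6, 5, 5 and 16 bits). This is a sketch for this one instruction format only; the function name `decode_i_type` is an assumption made for the illustration.

```python
def decode_i_type(word: int):
    """Split a 32-bit word into opcode, source reg, target reg and signed immediate."""
    opcode = (word >> 26) & 0x3F     # top 6 bits
    source = (word >> 21) & 0x1F     # next 5 bits
    target = (word >> 16) & 0x1F     # next 5 bits
    immediate = word & 0xFFFF        # low 16 bits
    if immediate & 0x8000:           # sign bit set: interpret as negative
        immediate -= 0x10000
    return opcode, source, target, immediate


print(decode_i_type(0x20A9FFFD))  # (8, 5, 9, -3)
```

The result matches the slide: opcode 8, source register $5, target register $9 and an add amount of -3.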

Accessing Memory (I)
• Every memory access needs an address word to be sent from CPU to memory
  o Address range is 0x00000000 to 0xFFFFFFFF
    - about 4 billion bytes of addressable space
• Addresses output by the CPU go to the Memory Address Register (MAR)
  o During a fetch access, the PC value is copied to MAR
  o During a load/store access, a "computed address" from the ALU is copied to MAR

Page 15: Lecture#06   inner workings of the cpu

Accessing Memory (II)
• Why compute load/store addresses?
  o 32 (instruction bits) - 6 (opcode bits) = 26 (available bits)
  o insufficient to hold a full memory address
• Solution: register-based addressing
  o use the 26 bits to specify a base address GPR, a target GPR, plus a 16-bit signed offset
  o ALU computes memory reference address "on the fly" as: MAR = base GPR + offset
  o target GPR receives/supplies memory data
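The address computation MAR = base GPR + offset can be sketched in Python as follows. The function names and the sample base address are assumptions made for the illustration; the sketch also assumes that the sum wraps to 32 bits, which fits the 0x00000000 to 0xFFFFFFFF address range.

```python
def sign_extend_16(value: int) -> int:
    """Interpret the low 16 bits of `value` as a signed two's complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def effective_address(base: int, offset16: int) -> int:
    """MAR = base GPR + signed 16-bit offset, wrapped to 32 bits (assumed)."""
    return (base + sign_extend_16(offset16)) & 0xFFFFFFFF


# Hypothetical base register value in the data segment, with an offset of -4.
print(hex(effective_address(0x10000010, 0xFFFC)))  # 0x1000000c
print(hex(effective_address(0x10000010, 0x0008)))  # 0x10000018
```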

Memory Segments
Memory is organized into segments, each with its own purpose

[Figure: memory map by address, from low to high:
  0x00000000  reserved for OS (kernel code)
  0x00400000  text segment (user's code)
  0x10000000  data segment, followed by the (heap)
              free space, grows and shrinks as stack/data segments change
              stack segment
  0x80000000  reserved for the Operating System (OS) (kernel code and data)
  0xFFFFFFFF  top of memory]

Page 16: Lecture#06   inner workings of the cpu

Text Segment
• Starts at memory address 0x00400000
  o runs up to address 0x0FFFFFFF
• Contains user's executable program code (often called the code segment)
• PC register value is a CPU "reference" into this memory segment

Data Segment
• Starts at memory address 0x10000000
  o expands upwards towards stack
• Contains program's static data, i.e., data and variables whose location in memory is fixed (and known to the assembler)
  o In Java: public, static objects
  o In C: global variables, string constants

Page 17: Lecture#06   inner workings of the cpu

Stack Segment
• Starts at memory address 0x7FFFFFFF
  o grows in the direction of decreasing memory addresses (i.e., towards the data segment)
• Contains system stack
  o Used for temporary storage of:
    - local variables of functions
    - function parameter values
    - return addresses of functions
    - saved register values

Heap
• Technically part of data segment
  o located at end of data segment, after all static data
• Empty at start of program execution
• Dynamically allocated memory is taken from heap for program to use
• Freed memory (by user or garbage collection) is returned to heap
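The segment boundaries given on these slides can be summarised as a small address classifier. This is a sketch under stated assumptions: the function name `segment_of` is made up, and because the boundary between the data segment, heap, free space and stack moves while a program runs, the sketch reports that whole range as a single region.

```python
def segment_of(address: int) -> str:
    """Classify a 32-bit address using the fixed boundaries from the slides."""
    if not 0 <= address <= 0xFFFFFFFF:
        raise ValueError("address outside the 32-bit range")
    if address < 0x00400000:
        return "reserved for OS"
    if address < 0x10000000:
        return "text segment"
    if address < 0x80000000:
        # Static data and heap grow upwards, the stack grows downwards;
        # where they meet depends on the running program.
        return "data segment / heap / free space / stack segment"
    return "reserved for OS"


print(segment_of(0x00400000))  # text segment
print(segment_of(0x7FFFFFFF))  # data segment / heap / free space / stack segment
```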

Page 18: Lecture#06   inner workings of the cpu

Block Diagram of the System
[Figure: a Von Neumann machine. INPUT and OUTPUT connect to the CENTRAL PROCESSING UNIT (Control Unit and Arithmetic Logic Unit), which is linked by a BUS to MEMORY (Code Segment and Data Segment)]

Arithmetic Logic Unit
• ALU
  o The part of a computer that performs all arithmetic computations, such as addition and multiplication, and all comparison operations
  o A typical schematic symbol for an ALU: A & B are operands; R is the output; F is the input from the Control Unit; D is an output status

Page 19: Lecture#06   inner workings of the cpu

Arithmetic Logic Unit (continued)
• The component where data is held temporarily
• Calculations occur here
• It knows how to perform operations such as ADD, SUB, LOAD, STORE, SHIFT
• It knows the commands that make up the machine language of the CPU
• It is the calculator

Control Unit
• A computer's control unit keeps things synchronized
  o Makes sure that the correct components are activated as the components are needed
  o Sends bits down control lines to trigger events
    - E.g., when Add is performed, the control signal tells the ALU to Add
• How do these control lines become asserted?
  o Hardwired control: controllers implement this program using digital logic components
  o Microprogrammed control: a small program is placed into read-only memory in the microcontroller

Page 20: Lecture#06   inner workings of the cpu

Control Unit: Hardwired Control
• Physically connect all of the control lines to the actual machine instruction
  o Instructions are divided into fields and different bits are combined with various digital logic components (which drive the control line)
• The control unit is implemented using hardware
  o The digital circuit uses inputs to generate the control signal to drive various components
• Advantage: very fast
• Disadvantage: instruction set and digital logic are locked

Control Unit: Microprogrammed Control
• Microprogram: software stored in the CPU control unit
  o Converts machine instructions (binary) into control signals
  o One subroutine for each machine instruction
• Advantage: very flexible
• Disadvantage: additional layer of interpretation
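One way to picture "one subroutine for each machine instruction" is a table that maps each instruction to the sequence of control signals it needs. The Python sketch below is only an analogy: the signal names and the three-instruction table are hypothetical, and a real control store holds bit patterns rather than strings.

```python
# Hypothetical micro-routines: each machine instruction expands into an
# ordered list of control signals.
MICROPROGRAM = {
    "LOAD":  ["alu_compute_address", "memory_read", "register_write"],
    "STORE": ["alu_compute_address", "register_read", "memory_write"],
    "ADD":   ["register_read", "alu_add", "register_write"],
}


def control_signals(instruction: str) -> list:
    """Look up the micro-routine for one machine instruction."""
    try:
        return MICROPROGRAM[instruction]
    except KeyError:
        raise ValueError(f"no micro-routine for {instruction!r}") from None


for signal in control_signals("ADD"):
    print(signal)
```

Supporting a new instruction here means adding a table entry, which mirrors the flexibility advantage; the lookup itself is the extra layer of interpretation.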

Page 21: Lecture#06   inner workings of the cpu

Registers
• "A register is a single, permanent storage location within the CPU used for a PARTICULAR, defined purpose"
• "A register is used to hold a binary value temporarily for storage, for manipulation, and/or for simple calculations"
• Registers have special addresses

Von Neumann Machine Model
[Figure: a CPU (Control Unit, ALU and Program Counter) connected by a bus to Main Memory, which holds data and instructions as 8-bit cells, with Input and Output]
CPU Cycle:
1. Fetch an instruction from the memory cell where the PC points
2. Decode the instruction
3. Execute the instruction
4. Increment the PC

Page 22: Lecture#06   inner workings of the cpu

Registers
[Figure: a CPU containing registers R0, R1, ..., Rn, the Arithmetic/Logic Unit and the Control Unit, connected by buses to input devices, output devices, main memory and secondary storage]
• Registers are used to hold the data immediately applicable to the operation at hand
• Main memory is used to hold the data that will be needed in the near future
• Secondary storage is used to hold data that will likely not be needed in the near future

Example: Machine Architecture
• Consider a machine with
  o 256-byte main memory: addresses 00-FF
  o 16 general purpose registers: 0-F
  o 16-bit instructions
  o 8-bit integer format (2's complement)
  o 8-bit floating point format
    - 1 sign bit
    - 3 exponent bits
    - 4-bit mantissa
  o 16 instructions: 1-F
[Figure: main memory cells at addresses 00, 01, 02, 03, 04, ..., FF, holding 8-bit values such as 0001 0001, 0011 0000, 0001 0010, 0100 0000 and 0011 0001]
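The 8-bit two's complement integer format of this example machine can be illustrated with two small conversion helpers. The function names are assumptions made for the sketch; the floating point format is not covered because the slide does not specify how the exponent is encoded.

```python
def from_twos_complement8(bits: int) -> int:
    """Interpret an 8-bit pattern as a signed two's complement integer."""
    bits &= 0xFF
    return bits - 256 if bits & 0x80 else bits


def to_twos_complement8(value: int) -> int:
    """Encode an integer in the range -128..127 as an 8-bit pattern."""
    if not -128 <= value <= 127:
        raise ValueError("value does not fit in 8-bit two's complement")
    return value & 0xFF


print(from_twos_complement8(0b10011001))        # -103
print(format(to_twos_complement8(-3), "08b"))   # 11111101
```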

Page 23: Lecture#06   inner workings of the cpu

Example: Addition Operation
1. Load the first number from memory cell A into register R1
2. Load the second number from memory cell B into register R2
3. Add the numbers in these two registers and put the result in register R0
4. Store the result in R0 into the memory cell X

LOAD R1 , A
LOAD R2 , B
ADD R0 , R1 , R2
STORE R0 , X

[Figure: memory cells A, B and X and registers R1, R2 and R0 feeding an ALU that computes A+B; the bit patterns shown are 1001 1001, 0110 1101 and 0101 0100]
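The four-instruction program can be traced with a tiny interpreter. This is a minimal sketch, not the lecture's machine: the tuple encoding of instructions, the `run` function and the sample values stored in A and B are assumptions chosen for illustration, and results are kept to 8 bits to match the example machine's integer format.

```python
# Arbitrary sample values for the two operands (not taken from the slide).
memory = {"A": 20, "B": 22, "X": 0}
registers = {"R0": 0, "R1": 0, "R2": 0}

program = [
    ("LOAD", "R1", "A"),
    ("LOAD", "R2", "B"),
    ("ADD", "R0", "R1", "R2"),
    ("STORE", "R0", "X"),
]


def run(program, registers, memory):
    """Execute LOAD, ADD and STORE instructions one after another."""
    for op, *args in program:
        if op == "LOAD":                      # memory cell -> register
            reg, cell = args
            registers[reg] = memory[cell]
        elif op == "ADD":                     # register + register -> register
            dest, src1, src2 = args
            registers[dest] = (registers[src1] + registers[src2]) & 0xFF
        elif op == "STORE":                   # register -> memory cell
            reg, cell = args
            memory[cell] = registers[reg]
        else:
            raise ValueError(f"unknown operation: {op}")


run(program, registers, memory)
print(memory["X"])  # 42
```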

Block Diagram of the CPU
• CPU - Central Processing Unit
• MAR - Memory Address Register
• IR - Instruction Register
• MDR - Memory Data Register
• PC - Program Counter
• ALU - Arithmetic Logic Unit

Instruction Fetch

• The address in the Program Counter is placed in the MAR
• The addressed instruction is read from memory (through the MDR) and placed into the Instruction Register

Instruction Execute

• The Instruction Decoder examines the instruction in the Instruction Register and sends appropriate signals to other parts of the CPU to carry out the actions specified by the instruction. This may include:
  o Reading operands from memory or registers into the Arithmetic Logic Unit,
  o Enabling the circuits of the Arithmetic Logic Unit to perform arithmetic or other computations,
  o Storing data values into memory or registers,
  o Changing the value of the Program Counter
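The fetch step can be sketched in Python. This is only an illustration, not part of the lecture: the function name `fetch` and the list-based memory are assumptions, and the sketch assumes that a 16-bit instruction occupies two consecutive 8-bit memory cells, so the PC to MAR to MDR transfer happens twice per instruction.

```python
def fetch(memory, pc):
    """Fetch one 16-bit instruction from two consecutive 8-bit cells.

    Returns the Instruction Register (IR) value and the updated PC.
    The two-cell layout is an assumption made for this sketch.
    """
    ir = 0
    for _ in range(2):
        mar = pc                 # address in the Program Counter goes to the MAR
        mdr = memory[mar]        # the addressed cell is read into the MDR
        ir = (ir << 8) | mdr     # MDR contents are shifted into the IR
        pc = (pc + 1) & 0xFF     # 256-byte memory, so addresses wrap at FF
    return ir, pc


memory = [0] * 256
memory[0x10], memory[0x11] = 0x11, 0x30   # hypothetical instruction bytes
ir, pc = fetch(memory, 0x10)
print(f"IR = {ir:04X}, PC = {pc:02X}")
```

Keeping the MAR and MDR as explicit variables mirrors the register names in the block diagram; a real control unit performs these transfers in hardware.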

The CPU Cycle

• The processor endlessly repeats the cycle: fetch, execute, fetch, execute, fetch, execute, fetch ...

Fetch and Execute Cycle

• At the beginning of each cycle the CPU presents the value of the program counter on the address bus
• The CPU then fetches the instruction from main memory (possibly via a cache and/or a pipeline) via the data bus into the instruction register
• From the instruction register, the data forming the instruction is decoded and passed to the control unit
• The control unit sends a sequence of control signals to the relevant function units of the CPU to perform the actions required by the instruction, such as reading values from registers, passing them to the ALU to add them together and writing the result back to a register
• The program counter is then incremented to address the next instruction and the cycle is repeated

Instruction Set Architecture (ISA)

• Instruction sets – definition and features
• Instruction types
• Operand organization
• Number of operands and instruction length
• Addressing
• Instruction execution – pipelining
• Features of two machine instruction sets (CISC and RISC)

Instruction Set Architecture (ISA)

• Instruction format
  o Machine instructions
  o Opcodes and operands
• High level languages
  o Hide detail of the architecture from the programmer
  o Easier to program
• Why learn computer architectures and assembly language?
  o To understand how the computer works
  o To write more efficient programs

Instruction Set Architecture (ISA)

Instruction sets are differentiated by

• Instructions
  o types of instructions
  o instruction length and number of operands
• Operands
  o type (addresses, numbers, characters) and access mode
  o location (CPU or memory)
  o organization (stack or register based)
  o number of addressable registers
• Memory organization
  o byte- or word-addressable
• CPU instruction execution
  o with/without pipelining

Instruction Set Architecture (ISA)

• The instruction set format is critical to the machine's architecture
• Performance of instruction set architectures is measured by
  o Main memory space occupied by a program
  o Instruction complexity
  o Instruction length (in bits)
  o Total number of instructions

Instruction Set Architecture (ISA)

• Instruction types
• Operand organization
• Number of operands and instruction length
• Addressing
• Instruction execution – pipelining

Instruction Set Architecture (ISA)

• An instruction set, or instruction set architecture (ISA), describes the aspects of a computer architecture visible to a programmer, including the native data types, instructions, registers, addressing modes, memory architecture, interrupt and exception handling, and external I/O (if any)
• An ISA includes a specification of the set of all binary codes (opcodes) that are the native form of commands implemented by a particular CPU design
• The set of opcodes for a particular ISA is also known as the machine language for the ISA

Instruction Set Architecture (ISA)

• ISAs commonly implemented in hardware
  o Alpha AXP (DEC Alpha)
  o ARM (Acorn RISC Machine) (Advanced RISC Machine, now ARM Ltd)
  o IA-64 (Itanium)
  o MIPS
  o Motorola 68k
  o PA-RISC (HP Precision Architecture)
  o IBM POWER
  o PowerPC
  o SPARC
  o SuperH
  o VAX (Digital Equipment Corporation)
  o x86 (IA-32, Pentium, Athlon) (AMD64, EM64T)

Machine Instructions

• Data Transfer: transfer data between registers and memory cells
• Arithmetic/Logic Operations: perform addition, AND, OR, XOR, etc.
• Control Operations: control the execution of the program

Data Transfer Instructions

1. L R , A       LOAD the register R with the content of memory cell A
2. LI R , I      LOAD the register R with I (I is called an immediate number)
3. ST R , A      STORE the content of the register R to the memory cell whose address is A
4. LR R1 , R2    LOAD the register R1 with the content of the register R2

Example: Data Transfer Instructions

Swap the content of two memory cells 30(16) and 40(16)

    L 1 , 30     /* Load R1 with the content in memory cell 30 */
    L 2 , 40     /* Load R2 with the content in memory cell 40 */
    ST 1 , 40    /* Store R1 to 40 */
    ST 2 , 30    /* Store R2 to 30 */

[Figure: before the swap, cell 30 holds 0110 1101 and cell 40 holds 1001 1010; the two loads copy them into R1 = 01101101 and R2 = 10011010; after the two stores, cell 30 holds 10011010 and cell 40 holds 01101101]

Arithmetic/Logic Instructions (I)

Arithmetic Instructions

5. ADD R0, R1, R2    ADD the numbers in R1 and R2, represented in 2's complement, and place the result in R0
6. AFP R0, R1, R2    ADD the numbers in R1 and R2, represented in floating point, and place the result in R0

Arithmetic/Logic Instructions (I)

Example: Addition

    L 1 , A0
    L 2 , A1
    ADD 0 , 1 , 2
    ST 0 , X0

Memory:
  A0 = 11100111  (= -25)
  A1 = 01101101  (= 109)
  X0 = 01010100  (= 84)

Registers:
  R1 = 11100111
  R2 = 01101101
  R0 = 01010100
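The addition of -25 and 109 can be checked with a short Python sketch. The helper names `to_bits8` and `from_bits8`, the dictionary-based memory and the register list are assumptions made for illustration; the sketch encodes the operands in 8-bit 2's complement, as the machine description specifies, and drops the carry out of bit 7.

```python
def to_bits8(value):
    """Encode an integer in the range -128..127 as an 8-bit 2's complement pattern."""
    return value & 0xFF


def from_bits8(bits):
    """Decode an 8-bit 2's complement pattern into a signed integer."""
    return bits - 256 if bits & 0x80 else bits


memory = {"A0": to_bits8(-25), "A1": to_bits8(109), "X0": 0}
reg = [0] * 16

reg[1] = memory["A0"]                 # L 1 , A0
reg[2] = memory["A1"]                 # L 2 , A1
reg[0] = (reg[1] + reg[2]) & 0xFF     # ADD 0 , 1 , 2 (carry out of bit 7 is dropped)
memory["X0"] = reg[0]                 # ST 0 , X0

print(f"{reg[1]:08b} + {reg[2]:08b} = {reg[0]:08b} ({from_bits8(reg[0])})")
# prints: 11100111 + 01101101 = 01010100 (84)
```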

Arithmetic/Logic Instructions (II)

Logic Instructions

7. OR R0, R1, R2     OR the bit patterns in R1 and R2 and place the result in R0
8. AND R0, R1, R2    AND the bit patterns in R1 and R2 and place the result in R0
9. XOR R0, R1, R2    XOR the bit patterns in R1 and R2 and place the result in R0

Arithmetic/Logic Instructions (II)

Example: Mask the first 4 bits of the binary string in memory A0

    L 1 , A0
    LI 2 , 0F
    AND 0 , 1 , 2
    ST 0 , X0

Memory:
  A0 = 10011011
  X0 = 00001011

Registers:
  R1 = 10011011
  R2 = 00001111
  R0 = 00001011  (R1 AND R2: the four high-order bits are cleared)

Arithmetic/Logic Instructions (II)

Example: Masking

    L 1 , A0
    L 2 , A1
    LI 3 , 0F
    LI 4 , F0
    AND 1 , 1 , 3
    AND 2 , 2 , 4
    OR 0 , 1 , 2
    ST 0 , X0

Memory:
  A0 = 1001 1001
  A1 = 1101 1011
  X0 = 1101 1001

Registers after the loads:
  R1 = 10011001
  R2 = 11011011
  R3 = 00001111
  R4 = 11110000

Registers after the AND and OR instructions:
  R1 = 00001001
  R2 = 11010000
  R0 = 11011001
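The masking example maps directly onto Python's bitwise operators. The function name `merge_nibbles` is an assumption made for this sketch; the comments name the machine instruction that each line stands for.

```python
def merge_nibbles(a0, a1):
    """Combine the low-order 4 bits of a0 with the high-order 4 bits of a1."""
    r3, r4 = 0x0F, 0xF0     # LI 3 , 0F and LI 4 , F0
    r1 = a0 & r3            # AND 1 , 1 , 3  keeps the low-order nibble of A0
    r2 = a1 & r4            # AND 2 , 2 , 4  keeps the high-order nibble of A1
    return r1 | r2          # OR 0 , 1 , 2   joins the two halves


result = merge_nibbles(0b10011001, 0b11011011)
print(f"{result:08b}")      # prints: 11011001
```

A single AND with 0F, as in the shorter masking example, is the special case where the second operand contributes nothing: `merge_nibbles(0b10011011, 0)` gives 00001011.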

Arithmetic/Logic Instructions (III)

Bit String Operating Instructions

B. RR R , I    ROTATE the bit pattern in R to the right I times. Each time, place the bit that started at the low-order end at the high-order end

Example: RR 0 , 02

  Original String:       1 0 1 1 0 0 0 1
  After one rotation:    1 1 0 1 1 0 0 0
  Resulting String:      0 1 1 0 1 1 0 0
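A rotate-right on an 8-bit register can be written with two shifts and a mask. The function name `rotate_right8` is an assumption for this sketch; it reproduces the RR 0 , 02 example on the bit string 10110001.

```python
def rotate_right8(value, count):
    """Rotate an 8-bit pattern right; bits leaving the low end re-enter at the high end."""
    count %= 8
    return ((value >> count) | (value << (8 - count))) & 0xFF


original = 0b10110001
print(f"{rotate_right8(original, 1):08b}")   # prints: 11011000
print(f"{rotate_right8(original, 2):08b}")   # prints: 01101100
```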

Control Instructions

E. JMP R , A    JUMP to the instruction located in the memory cell A if the bit pattern in R is equal to the one in R0
F. HALT         HALT the execution

Example: Control Instructions

  Address   Instruction       Effect
  30        LI 0 , 0A         R0 = 0A
  32        LI 1 , 00         R1 = 00
  34        LI 2 , 01         R2 = 01
  36        ADD 3 , 1, 2      R3 = R1 + R2
  38        JMP 3 , 3E        R3 = R0 ? Yes: jump to 3E
  3A        LR 1 , 3          No: R1 = R3
  3C        JMP 0 , 36        R0 always equals R0, so jump back to 36
  3E        HALT

[Figure: flowchart of the loop above, with the register contents after the first pass: R0 = 00001010, R1 = 00000000, R2 = 00000001, R3 = 00000001]
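The control flow of this program can be traced with a direct Python transliteration. The function name `count_up` and the `additions` counter are assumptions added for illustration; each line is annotated with the machine instruction it stands for, and the conditional jump is modelled as "jump if the register equals R0".

```python
def count_up(limit=0x0A):
    """Mirror the loop at addresses 30-3E: add 1 until R3 equals R0."""
    r0, r1, r2 = limit, 0x00, 0x01     # LI 0 , 0A / LI 1 , 00 / LI 2 , 01
    additions = 0
    while True:
        r3 = (r1 + r2) & 0xFF          # ADD 3 , 1 , 2 (8-bit result)
        additions += 1
        if r3 == r0:                   # JMP 3 , 3E: taken when R3 equals R0
            return r3, additions       # HALT
        r1 = r3                        # LR 1 , 3
        # JMP 0 , 36: R0 always equals itself, so this jump is unconditional


r3, additions = count_up()
print(f"R3 = {r3:02X} after {additions} additions")   # prints: R3 = 0A after 10 additions
```

Comparing a register with R0 is the only test this instruction set offers, which is why an unconditional jump is written as JMP 0 , A.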

The CPU Cycle

[Figure: block diagram of the machine. The Control Unit (Program Counter and Instruction Register) and the ALU (circuits and General Purpose Registers) are connected to Main Memory by address lines and an 8-bit bus. Main Memory is divided into a Code Segment and a Data Segment.]

Code Segment (row start address, then contents in hexadecimal):

  30:  21 17 31 80 21 F5 31 81 11 80 12 81
  3C:  23 FF 94 23 23 01 52 34 53 12 33 82
  48:  11 82 22 80 83 12 20 00 E3 5E 11 80
  54:  12 81 31 7F 31 80 12 7F 32 81 F0 00

Data Segment: rows starting at addresses 74, 80, 8C and 98

Operand Organization

• Three choices
  o Accumulator architecture
  o General Purpose Register (GPR) architecture
  o Stack architecture

Operand Organization – Accumulator Architecture

• One operand of a binary operation is implicitly in the accumulator
• Advantages
  o Minimizes the internal complexity of the machine
  o Allows for very short instructions
• Disadvantages
  o Memory traffic is very high
  o Programming is cumbersome

Operand Organization – General Purpose Register (GPR) Architecture

• Uses sets of general purpose registers
• Advantages
  o Register sets are faster than memory
  o Easy for compilers to deal with
  o Due to low costs, large numbers of these registers are being added
• Disadvantage
  o Results in longer instructions (longer fetch and decode times)

Operand Organization – General Purpose Register (GPR) Architecture

Three types:

• Memory-memory
  o may have two or three operands in memory
  o an instruction may perform an operation without requiring any operand to be in a register
• Register-memory
  o at least one operand must be in a register and one in memory
• Load-store
  o requires data to be moved into registers before any operation is performed

Operand Organization – Stack Architecture

• Uses a stack to execute instructions
• Operations:
  o PUSH – put a value on top of the stack
  o POP – read the top value and move down the "stack pointer"
• Example:
  o POP
  o PUSH 9

[Figure: a stack holding the values 5, 7 and 2, and the value 9 being pushed]

Operand Organization – Stack Architecture

• Instructions implicitly refer to values at the top of the stack
  o data can be accessed only from the top of the stack, one word at a time
• Advantages
  o Good code density
  o Simple model for evaluation of expressions
• Disadvantages
  o Restricts the sequence of operand processing
  o Execution bottleneck (the stack is located in memory)

Operand Organization – Stack Architecture

• Stack architecture requires us to think about arithmetic expressions in a new way
  o We are used to infix notation, e.g., Z = X + Y
  o Stack arithmetic requires postfix notation, e.g., Z = XY+
  o Postfix notation is also known as Reverse Polish Notation

Stack Architecture – Postfix Notation

• Postfix notation doesn't need parentheses
• E.g., the infix expression Z = (X * Y) + (W * U) is the postfix expression Z = X Y * W U * +
• Calculating Z = X Y * W U * + in a stack ISA:

    PUSH X
    PUSH Y
    MULT
    PUSH W
    PUSH U
    MULT
    ADD
    POP Z

Binary operators
• pop the two operands on the stack top, and
• push the result on the stack
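The stack program for Z = X Y * W U * + can be run with a few lines of Python. This is a sketch under stated assumptions: the interpreter `run_stack_program`, the dictionary-based memory and the sample values for X, Y, W and U are all illustrative and not taken from the lecture.

```python
def run_stack_program(program, memory):
    """Execute PUSH / POP / ADD / MULT instructions on a zero-operand stack machine."""
    stack = []
    for line in program:
        op, *args = line.split()
        if op == "PUSH":
            stack.append(memory[args[0]])      # copy a memory value onto the stack top
        elif op == "POP":
            memory[args[0]] = stack.pop()      # move the stack top into memory
        elif op in ("ADD", "MULT"):
            right = stack.pop()                # binary operators pop two operands ...
            left = stack.pop()
            stack.append(left + right if op == "ADD" else left * right)  # ... and push the result
        else:
            raise ValueError(f"unknown instruction: {line}")
    return memory


program = ["PUSH X", "PUSH Y", "MULT", "PUSH W", "PUSH U", "MULT", "ADD", "POP Z"]
memory = run_stack_program(program, {"X": 2, "Y": 3, "W": 4, "U": 5})
print(memory["Z"])   # prints: 26
```

Note that MULT and ADD carry no operands at all; their inputs are implied by the stack top, which is what keeps stack code dense.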

Number of Operands and Instruction Length

• The number of operands in each instruction affects the length of the instruction
• Instruction length can be
  o Fixed – quick to decode but wastes space
  o Variable – more complex to decode but saves space
• All architectures limit the number of operands allowed per instruction
  o Stack architecture has 0 or 1 explicit operand
  o Accumulator architecture has 0 or 1 explicit operand
  o GPR architecture has 1, 2 or 3 operands

Number of Operands – Example

• Calculating the infix expression Z = X * Y + W * U

One operand (the accumulator is the destination for the result of the instruction):

    LOAD X
    MULT Y
    STORE TEMP
    LOAD W
    MULT U
    ADD TEMP
    STORE Z

Two operands:

    LOAD R1,X
    MULT R1,Y
    LOAD R2,W
    MULT R2,U
    ADD R1,R2
    STORE Z,R1

Three operands (the first operand is often the destination for the result of the instruction):

    MULT R1,X,Y
    MULT R2,W,U
    ADD Z,R1,R2
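The one-operand version can be traced with a small accumulator interpreter. The function `run_accumulator`, the memory-access counter and the sample values are assumptions made for this sketch; the counter is there only to show that every instruction in this style touches memory.

```python
def run_accumulator(program, memory):
    """Run one-operand code; the accumulator is the implicit second operand."""
    acc = 0
    memory_accesses = 0
    for line in program:
        op, name = line.split()
        memory_accesses += 1           # every instruction names one memory cell
        if op == "LOAD":
            acc = memory[name]
        elif op == "MULT":
            acc *= memory[name]
        elif op == "ADD":
            acc += memory[name]
        elif op == "STORE":
            memory[name] = acc
        else:
            raise ValueError(f"unknown instruction: {line}")
    return memory, memory_accesses


program = ["LOAD X", "MULT Y", "STORE TEMP", "LOAD W", "MULT U", "ADD TEMP", "STORE Z"]
memory, accesses = run_accumulator(program, {"X": 2, "Y": 3, "W": 4, "U": 5})
print(memory["Z"], accesses)   # prints: 26 7
```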

Coding Instruction

16-bit Instruction (2 bytes)

The machine code 0010010001111100 represents the instruction LI 4 , 7C

  High-Order Byte        Low-Order Byte
  0 0 1 0   0 1 0 0      0 1 1 1   1 1 0 0
  LI        4            7C

  Bits 0-3:   OpCode
  Bits 4-15:  Operands

Instruction Formats

16-bit Instruction (2 bytes)

  Format      Bits 4-7         Bits 8-11    Bits 12-15
  Format 1    Register         Immediate Value
  Format 2    Register         Memory Address
  Format 3    Register         Register     Register
  Format 4    Unused (zero)    Register     Register
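Splitting a 16-bit instruction into its fields takes only shifts and masks. The function name `decode` and the returned tuple layout are assumptions for this sketch. The slides number bits from the left, so "bits 0-3" are the four most significant bits of the high-order byte.

```python
def decode(instruction):
    """Split a 16-bit instruction into (opcode, first register field, low-order byte)."""
    opcode = (instruction >> 12) & 0xF     # bits 0-3 in the slide's numbering
    register = (instruction >> 8) & 0xF    # bits 4-7
    low_byte = instruction & 0xFF          # bits 8-15: immediate, address or two registers
    return opcode, register, low_byte


opcode, register, low_byte = decode(0b0010010001111100)
print(f"opcode={opcode:X} register={register:X} low byte={low_byte:02X}")
# prints: opcode=2 register=4 low byte=7C
```

For Format 3 and Format 4 instructions the low-order byte would be split once more into two 4-bit register numbers, for example `low_byte >> 4` and `low_byte & 0xF`.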

Format 1 Instruction

  Format 1:  Register | Immediate Value

  Opcode   Instruction   Meaning
  2        LI R , I      Load Immediate
  A        RL R , I      Rotate Left
  B        RR R , I      Rotate Right
  C        SL R , I      Shift Left
  D        SR R , I      Shift Right

Format 1 Instruction

  Format 1:  Register | Immediate Value

1. Copy the bit pattern in the low-order byte into the specified register, or
2. Shift/rotate the bits in the specified register the number of places specified in the low-order byte.

Format 2 Instruction

  Format 2:  Register | Memory Address

  Opcode   Instruction   Meaning
  1        L R , A       Load from Memory
  3        ST R , A      Store to Memory
  E        JMP R , A     Conditional Jump

Format 2 Instruction

  Format 2:  Register | Memory Address

1. Load - Copy the value stored at the Memory Address into the specified register
2. Store - Copy the value in the specified register to the Memory Address
3. Jump - Compare the contents of the specified register and the contents of Register 0. If equal, reset the Program Counter to the Memory Address

Format 3 Instruction

  Format 3:  Register | Register | Register

  Opcode   Instruction         Meaning
  5        ADD R0, R1, R2      Add (2's complement)
  6        AFP R0, R1, R2      Add (floating point)
  7        OR R0, R1, R2       Bitwise OR
  8        AND R0, R1, R2      Bitwise AND
  9        XOR R0, R1, R2      Bitwise XOR

Format 3 Instruction

  Format 3:  Register | Register | Register

Apply the operation to the two values in the registers specified in the low-order byte and store the result in the register specified in the high-order byte

Format 4 Instruction

  Format 4:  Unused (zero) | Register | Register

  Opcode   Instruction    Meaning
  4        LR R1 , R2     Load Register

Format 4 Instruction

  Format 4:  Unused (zero) | Register | Register

Copy the value in the second register specified in the low-order byte to the first register specified in the low-order byte

Full Instruction Set

  1. L R , A               9. XOR R0 , R1 , R2
  2. LI R , I              A. RL R , I
  3. ST R , A              B. RR R , I
  4. LR R1 , R2            C. SL R , I
  5. ADD R0 , R1 , R2      D. SR R , I
  6. AFP R0 , R1 , R2      E. JMP R , A
  7. OR R0 , R1 , R2       F. HALT
  8. AND R0 , R1 , R2

Examples of OpCode

  Name   Comment               Syntax
  TRANSFER
  MOV    Move (copy)           MOV Dest,Source
  PUSH   Push onto stack       PUSH Source
  POP    Pop from stack        POP Dest
  IN     Input                 IN Dest, Port
  OUT    Output                OUT Port, Source
  ARITHMETIC
  ADD    Add                   ADD Dest,Source
  SUB    Subtract              SUB Dest,Source
  DIV    Divide (unsigned)     DIV Op
  MUL    Multiply (unsigned)   MUL Op
  INC    Increment             INC Op
  DEC    Decrement             DEC Op
  CMP    Compare               CMP Op1,Op2

Examples of OpCode

  Name   Comment                      Syntax
  LOGIC
  NEG    Negate (two's complement)    NEG Op
  NOT    Invert each bit              NOT Op
  AND    Logical and                  AND Dest,Source
  OR     Logical or                   OR Dest,Source
  XOR    Logical exclusive or         XOR Dest,Source
  JUMPS
  CALL   Call subroutine              CALL Proc
  JMP    Jump                         JMP Dest
  JE     Jump if Equal                JE Dest
  JZ     Jump if Zero                 JZ Dest
  RET    Return from subroutine       RET
  JNE    Jump if not Equal            JNE Dest
  JNZ    Jump if not Zero             JNZ Dest

Coding Program: Example

  Assembler     Machine Code           Hexa
  L 1 , 30      0001 0001 0011 0000    1130
  L 2 , 40      0001 0010 0100 0000    1240
  ST 1 , 40     0011 0001 0100 0000    3140
  ST 2 , 30     0011 0010 0011 0000    3230

[Figure: the program stored in main memory, one byte per cell, at addresses 10-17 (0001 0001, 0011 0000, 0001 0010, 0100 0000, 0011 0001, 0100 0000, 0011 0010, 0011 0000); data cell 30 (A) holds 0110 1101 and cell 40 (B) holds 1001 1001; the loads copy them into R1 = 0110 1101 and R2 = 1001 1001]
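The translation from assembler to machine code in this table can be reproduced with a tiny assembler. This sketch is illustrative only: the function `assemble` and the `OPCODES` table are assumptions, and it handles just the two Format 2 mnemonics used by the swap program.

```python
OPCODES = {"L": 0x1, "ST": 0x3}   # only the Format 2 instructions needed here


def assemble(line):
    """Encode 'MNEMONIC register , address' as a 16-bit Format 2 instruction."""
    mnemonic, operands = line.split(maxsplit=1)
    register, address = (int(field.strip(), 16) for field in operands.split(","))
    return (OPCODES[mnemonic] << 12) | (register << 8) | address


program = ["L 1 , 30", "L 2 , 40", "ST 1 , 40", "ST 2 , 30"]
machine_code = [assemble(line) for line in program]
print(" ".join(f"{word:04X}" for word in machine_code))
# prints: 1130 1240 3140 3230
```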

CPU Cycle

[Cycle diagram with four steps]

• Fetch an instruction from the memory cell where the PC points
• Decode the instruction
• Execute the instruction
• Increment the PC

CPU Cycle (Machine Cycle)

FETCH
1. Retrieve the next instruction from memory (as indicated by the program counter) and then increment the program counter

DECODE
2. Decode the bit pattern in the instruction register

EXECUTE
3. Perform the action requested by the instruction in the instruction register
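The three phases of the machine cycle can be put together in a minimal simulator. This is a sketch under several assumptions that are not stated in the lecture: the function name `run`, support for only the L, ST and HALT opcodes, a program loaded at address 10 (hex), and a HALT instruction encoded as F000 appended so that the loop terminates.

```python
def run(memory, pc):
    """Repeat fetch, decode, execute until a HALT instruction is reached."""
    reg = [0] * 16
    while True:
        # FETCH: read two bytes at the PC, then increment the PC
        ir = (memory[pc] << 8) | memory[pc + 1]
        pc += 2
        # DECODE: split the instruction register into its fields
        opcode, r, address = ir >> 12, (ir >> 8) & 0xF, ir & 0xFF
        # EXECUTE: perform the requested action
        if opcode == 0x1:            # L R , A
            reg[r] = memory[address]
        elif opcode == 0x3:          # ST R , A
            memory[address] = reg[r]
        elif opcode == 0xF:          # HALT
            return reg
        else:
            raise ValueError(f"opcode {opcode:X} is not handled in this sketch")


memory = [0] * 256
code = [0x11, 0x30, 0x12, 0x40, 0x31, 0x40, 0x32, 0x30, 0xF0, 0x00]
memory[0x10:0x10 + len(code)] = code          # swap program followed by HALT
memory[0x30], memory[0x40] = 0b01101101, 0b10011001
run(memory, 0x10)
print(f"{memory[0x30]:08b} {memory[0x40]:08b}")   # prints: 10011001 01101101
```

Adding the remaining opcodes would mean one more branch per instruction in the execute step; the fetch and decode steps stay the same because every instruction has the same 16-bit length.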

Program Execution: Swap Example

    L 1 , 30      1130
    L 2 , 40      1240
    ST 1 , 40     3140
    ST 2 , 30     3230

[Figure: the CPU (PC, registers R0, R1, R2, ..., RF and the FETCH, DECODE and EXECUTE stages) next to main memory. Cells 10-17 hold the program bytes 0001 0001, 0011 0000, 0001 0010, 0100 0000, 0011 0001, 0100 0000, 0011 0010, 0011 0000; cell 30 holds 0110 1101 and cell 40 holds 1001 1001]

Execute a Program

FETCH

  Instruction: 0001 0001 0011 0000

DECODE

  Operation-code : 0001
  Register       : 0001
  Memory address : 0011 0000

L 1 , 30

L 2 , 40

ST 1 , 40

ST 2 , 30

L 1 , 30 L 1 , 30

L 2 , 40 L 2 , 40

ST 1 , 40 ST 1 , 40

ST 2 , 30 ST 2 , 30

RR00

RR11

0001 00010001 0001

0011 00000011 0000

0001 00100001 0010

0100 00000100 0000

0011 00010011 0001

0100 00000100 0000

0011 00100011 0010

0011 00000011 0000

0110 11010110 1101

1001 10011001 1001

1010

1111

1212

1313

1414

1515

1616

1717

3030

4040

FETCHFETCH

DECODEDECODE

EXECUTEEXECUTE

PCPC

RR22

RRFF

……....

Execute a ProgramExecute a Program

Instruction:Instruction:

0001 0001 0011 00000001 0001 0011 0000

OperationOperation--code :code : 00010001

RegisterRegister :: 00010001

Memory address :Memory address : 0011 00000011 0000

0110 11010110 1101

Page 52: Lecture#06   inner workings of the cpu

52

L 1 , 30

L 2 , 40

ST 1 , 40

ST 2 , 30

L 1 , 30 L 1 , 30

L 2 , 40 L 2 , 40

ST 1 , 40 ST 1 , 40

ST 2 , 30 ST 2 , 30

RR00

RR11

0001 00010001 0001

0011 00000011 0000

0001 00100001 0010

0100 00000100 0000

0011 00010011 0001

0100 00000100 0000

0011 00100011 0010

0011 00000011 0000

0110 11010110 1101

1001 10011001 1001

1010

1111

1212

1313

1414

1515

1616

1717

3030

4040

FETCHFETCH

DECODEDECODE

EXECUTEEXECUTE

PCPC

RR22

RRFF

……....

Execute a ProgramExecute a Program

Instruction:Instruction:

0001 0001 0011 00000001 0001 0011 0000

OperationOperation--code :code : 00010001

RegisterRegister :: 00010001

Memory address :Memory address : 0011 00000011 0000

0110 11010110 1101
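The DECODE step shown above amounts to cutting the 16-bit pattern into three fields. A minimal sketch, assuming the 4-bit operation code, 4-bit register and 8-bit address layout shown on the slide; the function name `decode` is hypothetical.

```python
def decode(bits):
    """Split a 16-bit instruction string into (opcode, register, address)."""
    bits = bits.replace(" ", "")
    if len(bits) != 16 or set(bits) - {"0", "1"}:
        raise ValueError("expected 16 binary digits")
    opcode = int(bits[0:4], 2)     # first 4 bits: operation code
    register = int(bits[4:8], 2)   # next 4 bits: register number
    address = int(bits[8:16], 2)   # last 8 bits: memory address
    return opcode, register, address


opcode, register, address = decode("0001 0001 0011 0000")
print(f"opcode={opcode:X} register=R{register:X} address={address:02X}")
# prints: opcode=1 register=R1 address=30
```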

Execute a Program: L 2 , 40

Instruction:     0001 0010 0100 0000
Operation code:  0001 (L, load)
Register:        0010 (R2)
Memory address:  0100 0000 (cell 40)

After EXECUTE, R2 holds 1001 1001, the contents of cell 40. R1 still holds 0110 1101.

Execute a Program: ST 1 , 40

Instruction:     0011 0001 0100 0000
Operation code:  0011 (ST, store)
Register:        0001 (R1)
Memory address:  0100 0000 (cell 40)

After EXECUTE, cell 40 holds 0110 1101, the value of R1.

Execute a Program: ST 2 , 30

Instruction:     0011 0010 0011 0000
Operation code:  0011 (ST, store)
Register:        0010 (R2)
Memory address:  0011 0000 (cell 30)

After EXECUTE, cell 30 holds 1001 1001, the value of R2.

Coding Program: An Example

Assembler     Machine Code           Hexa
L  1 , 30     0001 0001 0011 0000    1130
L  2 , 40     0001 0010 0100 0000    1240
ST 1 , 40     0011 0001 0100 0000    3140
ST 2 , 30     0011 0010 0011 0000    3230

[Figure: memory and registers after the program has run. Cells 10-17 still hold the program. Cell 30 (A) now holds 1001 1001 and cell 40 (B) holds 0110 1101. R1 holds 0110 1101 and R2 holds 1001 1001.]
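The translation from the Assembler column to the Machine Code and Hexa columns is mechanical. The sketch below is an illustration, not the assembler used in the course: the `OPCODES` table and the function name `assemble` are assumptions, and only the two mnemonics of this example are covered.

```python
OPCODES = {"L": 0x1, "ST": 0x3}  # only the mnemonics used in the swap example


def assemble(line):
    """Translate e.g. 'L 1 , 30' into a 16-bit word (operands are hex digits)."""
    mnemonic, operands = line.split(None, 1)
    register, address = (int(part.strip(), 16) for part in operands.split(","))
    return (OPCODES[mnemonic] << 12) | (register << 8) | address


for source in ["L 1 , 30", "L 2 , 40", "ST 1 , 40", "ST 2 , 30"]:
    word = assemble(source)
    print(f"{source:<10} {word:016b} {word:04X}")
```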


Assembler Code for A:=23, B:=-11;

LI 1 , 17     LOAD 23 IN HEX INTO R1
ST 1 , A      STORE VALUE AT A
LI 1 , F5     LOAD -11 IN HEX INTO R1
ST 1 , B      STORE VALUE AT B

Machine Code for A:=23, B:=-11;

LI 1 , 17     2117    00100001 00010111
ST 1 , A      3180    00110001 10000000
LI 1 , F5     21F5    00100001 11110101
ST 1 , B      3181    00110001 10000001
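The operand F5 for -11 is the 8-bit two's complement pattern of that number. A short sketch of the conversion in both directions; the function names and the default width of 8 bits are assumptions chosen to match the 8-bit operands on the slide.

```python
def to_twos_complement(value, bits=8):
    """Return the unsigned bit pattern that represents value in two's complement."""
    if not -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
        raise ValueError("value does not fit in the given width")
    return value & ((1 << bits) - 1)


def from_twos_complement(pattern, bits=8):
    """Interpret an unsigned bit pattern as a signed two's complement number."""
    sign_bit = 1 << (bits - 1)
    return pattern - (1 << bits) if pattern & sign_bit else pattern


print(f"{to_twos_complement(23):02X}")   # 17
print(f"{to_twos_complement(-11):02X}")  # F5
print(from_twos_complement(0xF5))        # -11
```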


Assembler Code for C:=A-B;

L   1 , A         LOAD A INTO R1
L   2 , B         LOAD B INTO R2
LI  3 , FF        SET MASK TO FLIP B
XOR 4 , 2 , 3     FLIP B
LI  3 , 01        LOAD 1 INTO R3
ADD 2 , 3 , 4     ADD 1 TO FLIPPED B (R2 = -B)
ADD 3 , 1 , 2     NOW DO R3 = A + (-B)
ST  3 , C         STORE R3 AT C

Machine Code for C:=A-B;

L   1 , A         1180    00010001 10000000
L   2 , B         1281    00010010 10000001
LI  3 , FF        23FF    00100011 11111111
XOR 4 , 2 , 3     9423    10010100 00100011
LI  3 , 01        2301    00100011 00000001
ADD 2 , 3 , 4     5234    01010010 00110100
ADD 3 , 1 , 2     5312    01010011 00010010
ST  3 , C         3382    00110011 10000010
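The machine has no subtract instruction in this listing, so A-B is computed as A plus the two's complement of B: flip every bit of B with XOR FF, add 1, then add the result to A. The sketch below mirrors those three instructions; the function name `subtract` and the 8-bit register width are assumptions.

```python
MASK = 0xFF  # registers are assumed to be 8 bits wide


def subtract(a, b):
    flipped = b ^ 0xFF              # XOR 4,2,3 with R3 = FF: flip every bit of B
    negated = (1 + flipped) & MASK  # ADD 2,3,4 with R3 = 01: add 1, giving -B
    return (a + negated) & MASK     # ADD 3,1,2: A + (-B)


c = subtract(0x17, 0xF5)  # A = 23, B = -11
print(f"{c:02X}")  # 22, i.e. 34 in decimal
```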


Example Program

PROGRAM Sort;
VAR
  A, B, C : INTEGER;

PROCEDURE Swap (VAR X, Y : INTEGER);
VAR
  Temp : INTEGER;
BEGIN {Swap}
  Temp := X;
  X := Y;
  Y := Temp;
END {Swap};

BEGIN {Sort}
  C := A - B;
  IF C < 0 THEN
    Swap (A, B);
END {Sort}.

Assembler and Machine Code

Address   Assembler        Hexa
30        LI  1,17         2117
32        ST  1,A          3180
34        LI  1,F5         21F5
36        ST  1,B          3181
38        L   1,A          1180
3A        L   2,B          1281
3C        LI  3,FF         23FF
3E        XOR 4,2,3        9423
40        LI  3,01         2301
42        ADD 2,3,4        5234
44        ADD 3,1,2        5312
46        ST  3,C          3382
48        L   1,C          1182
4A        LI  2,80         2280
4C        AND 3,1,2        8312
4E        LI  0,00         2000
50        JMP 3,5E         E35E
52        L   1,A          1180
54        L   2,B          1281
56        ST  1,TEMP       317F
58        ST  2,A          3280
5A        L   2,TEMP       127F
5C        ST  2,B          3281
5E        HALT             F000


Code Loaded in Memory

Address   Contents
30        21 17 31 80 21 F5 31 81 11 80 12 81
3C        23 FF 94 23 23 01 52 34 53 12 33 82
48        11 82 22 80 83 12 20 00 E3 5E 11 80
54        12 81 31 7F 32 80 12 7F 32 81 F0 00

Rows 74, 80, 8C and 98 follow with no contents shown. TEMP, A, B and C are at addresses 7F, 80, 81 and 82.
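The loaded code can be executed by a small simulator of the fetch-decode-execute cycle. This is a sketch under stated assumptions, not the course's reference machine: the opcode meanings are read off the assembler listing (1 L, 2 LI, 3 ST, 5 ADD, 8 AND, 9 XOR, F HALT), JMP R,XY (opcode E) is assumed to jump to XY when register R equals R0, registers are assumed to be 8 bits wide, and the two bytes at 58-59 are taken as 32 80 to match the assembler text ST 2,A.

```python
CODE = bytes.fromhex(
    "2117 3180 21F5 3181 1180 1281"
    "23FF 9423 2301 5234 5312 3382"
    "1182 2280 8312 2000 E35E 1180"
    "1281 317F 3280 127F 3281 F000"  # 58-59 taken as 3280 (ST 2,A), an assumption
)


def run(code, start=0x30):
    memory = bytearray(256)
    memory[start:start + len(code)] = code
    reg = [0] * 16
    pc = start
    while True:
        # FETCH: two bytes, one after the other, then advance the PC by 2
        high, low = memory[pc], memory[pc + 1]
        pc += 2
        # DECODE: operation code, register R, and either address XY or registers S, T
        opcode, r = high >> 4, high & 0xF
        s, t = low >> 4, low & 0xF
        # EXECUTE
        if opcode == 0x1:      # L R,XY
            reg[r] = memory[low]
        elif opcode == 0x2:    # LI R,XY
            reg[r] = low
        elif opcode == 0x3:    # ST R,XY
            memory[low] = reg[r]
        elif opcode == 0x5:    # ADD R,S,T
            reg[r] = (reg[s] + reg[t]) & 0xFF
        elif opcode == 0x8:    # AND R,S,T
            reg[r] = reg[s] & reg[t]
        elif opcode == 0x9:    # XOR R,S,T
            reg[r] = reg[s] ^ reg[t]
        elif opcode == 0xE:    # JMP R,XY: assumed to jump when R equals R0
            if reg[r] == reg[0]:
                pc = low
        elif opcode == 0xF:    # HALT
            return memory, reg
        else:
            raise ValueError(f"unknown opcode {opcode:X}")


memory, reg = run(CODE)
print(f"A={memory[0x80]:02X} B={memory[0x81]:02X} C={memory[0x82]:02X}")
# prints: A=17 B=F5 C=22
```

With A = 23 and B = -11 the difference C is 34 (hex 22). Its sign bit is clear, so the AND with mask 80 gives zero, the jump to 5E is taken, and the swap at 52-5C is skipped.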

The CPU Cycle

[Figure: the CPU beside main memory. The control unit contains the program counter, the instruction register and a FETCH / DECODE / EXECUTE cycle status (illustration only). The ALU contains the circuits and the general purpose registers. Main memory shows the code segment (rows 30, 3C, 48, 54) and the data segment (rows 74, 80, 8C, 98). Labels on the connections: address, 8 bit bus.]


The CPU Cycle (step by step)

[Figure sequence: the same diagram stepped through FETCH, DECODE and EXECUTE for the first instructions of the program. The PC starts at 30. In each FETCH the two bytes of the instruction arrive in the instruction register one after the other, and the PC then advances by 2.]

PC before   Instruction register   Decoded   PC after fetch   Effect of EXECUTE
30          21 17                  LI        32               R1 = 17
32          31 80                  ST        34               cell 80 (A) = 17
34          21 F5                  LI        36               R1 = F5
36          31 81                  ST        38               cell 81 (B) = F5
38          11 80                  L         3A               R1 = 17

The cycle continues in the same way for the remaining instructions.

Instruction Execution - Pipelining

• Some CPUs divide the fetch-decode-execute cycle into smaller steps
• Instruction Level Pipelining overlaps these smaller steps for consecutive instructions in order to increase throughput
• Need to balance the time taken by each pipeline stage

Instruction Level Pipelining - Example

• Suppose a fetch-decode-execute cycle were broken into the following smaller steps:
  1. Fetch instruction
  2. Decode opcode
  3. Calculate the address of operands
  4. Fetch operands
  5. Execute instruction
  6. Store result
• For every clock cycle, one small step is carried out, and the stages are overlapped
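The overlap of the six steps can be pictured as a chart with one row per instruction and one column per clock cycle. The sketch below prints such a chart for an ideal pipeline with no conflicts; the two-letter stage labels and the function name `pipeline_chart` are hypothetical.

```python
# Hypothetical abbreviations for the six steps listed on the slide
STAGES = ["FI", "DO", "CA", "FO", "EX", "SR"]


def pipeline_chart(n_instructions, stages=STAGES):
    """One row per instruction; column c shows the stage occupied in clock cycle c + 1."""
    total_cycles = len(stages) + n_instructions - 1
    rows = []
    for i in range(n_instructions):
        row = ["--"] * total_cycles
        for s, name in enumerate(stages):
            row[i + s] = name  # instruction i is in stage s during cycle i + s
        rows.append(" ".join(row))
    return rows


for line in pipeline_chart(4):
    print(line)
```

Each row is shifted one column to the right of the row above it, so from the sixth cycle onward one instruction completes in every cycle.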

Instruction Level Pipelining - Speed

• There are $n$ instructions
• There are $k$ stages in the pipeline, and the time per stage is $t_p$
• The first instruction requires $k \times t_p$ time to complete
• The remaining $(n - 1)$ instructions emerge from the pipeline one per stage time $t_p$
• The total time to complete the remaining instructions is $(n - 1)\, t_p$
• Thus, the time required to complete $n$ tasks using a $k$-stage pipeline is

  $(k \times t_p) + (n - 1)\, t_p = (k + n - 1)\, t_p$
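A quick numeric check of the formula, built up the same way as the bullets: the first instruction takes k stage times and each remaining instruction adds one more. The figures used (100 instructions, 6 stages, 2 time units per stage) are made-up illustration values.

```python
def pipelined_time(n, k, t_p):
    """Time for n instructions on a k-stage pipeline with stage time t_p."""
    first = k * t_p             # the first instruction passes through all k stages
    remaining = (n - 1) * t_p   # every later instruction emerges one stage time apart
    return first + remaining


# hypothetical figures: 100 instructions, 6 stages, 2 time units per stage
print(pipelined_time(100, 6, 2))  # 210
```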

Instruction Level Pipelining - Speed

• Speedup gained by using a pipeline:

  $$\text{Speedup} = \frac{\text{time without pipeline}}{\text{time with pipeline}} = \frac{n \times k\, t_p}{(k + n - 1)\, t_p}$$

• As $n$ approaches infinity, $(k + n - 1)$ approaches $n$, which results in a theoretical speedup of

  $$\text{Speedup} = \frac{n \times k\, t_p}{n\, t_p} = k$$
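Evaluating the speedup ratio for growing n shows how it approaches k. A minimal sketch; the function name `speedup` and the choice of k = 6 are illustrative assumptions.

```python
def speedup(n, k):
    """Unpipelined time n*k*t_p divided by pipelined time (k + n - 1)*t_p; t_p cancels."""
    return (n * k) / (k + n - 1)


for n in (1, 10, 100, 10_000):
    print(n, round(speedup(n, k=6), 3))
```

For a single instruction the pipeline gives no gain (speedup 1), and as n grows the value climbs towards, but never reaches, the number of stages.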

Instruction Level Pipelining - Issues

• Assumptions
  o the architecture supports fetching instructions and data in parallel
  o the pipeline can be kept filled at all times
    - This is not always the case due to pipeline conflicts
• It may appear that more stages imply faster performance, but
  o the amount of control logic increases with the number of stages
  o pipeline conflicts affect the execution of instructions

Instruction Level Pipelining - Pipeline Conflicts

• Resource conflicts
  o One instruction is storing a value to memory while another instruction is being fetched from memory
• Data dependencies
  o When the not-yet-available result of one instruction is the operand of a subsequent instruction
• Conditional branch statements
  o Several instructions can be fetched and decoded before the execution of a preceding branch instruction is finished
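A data dependency can be spotted by checking whether an instruction reads a register that a nearby earlier instruction writes. The sketch below applies this to four instructions from the C:=A-B code. The `Instr` record, the function name and the `window` parameter (how many following instructions are still close enough to conflict) are all assumptions made for illustration; real hardware detects this differently.

```python
from typing import NamedTuple, Tuple


class Instr(NamedTuple):
    text: str
    dest: int                 # register written by the instruction
    sources: Tuple[int, ...]  # registers read by the instruction


def data_dependencies(program, window=1):
    """List (producer, consumer) index pairs where a result is read within `window` instructions."""
    hazards = []
    for i, producer in enumerate(program):
        for j in range(i + 1, min(i + 1 + window, len(program))):
            if producer.dest in program[j].sources:
                hazards.append((i, j))
    return hazards


program = [
    Instr("XOR 4,2,3", dest=4, sources=(2, 3)),
    Instr("LI 3,01", dest=3, sources=()),
    Instr("ADD 2,3,4", dest=2, sources=(3, 4)),
    Instr("ADD 3,1,2", dest=3, sources=(1, 2)),
]
print(data_dependencies(program, window=1))  # [(1, 2), (2, 3)]
```

Here ADD 2,3,4 needs the 1 that LI 3,01 has just loaded, and ADD 3,1,2 needs the R2 that the previous ADD has just produced, so both pairs would stall a pipeline that offers no way to pass results forward early.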

SCS1003 - Computer Systems

Thank You