computer architecture overvieesb/2018fall.ics332/aug22.pdf · 2018-08-22 · history von neumann...
TRANSCRIPT
History
Von Neumann Model
Fetch-Decode-Execute Cycle
Speeding Things Up
Conclusion
Computer Architecture Overview
ICS332: Operating Systems
Henri Casanova ([email protected])
Spring 2018
Henri Casanova ([email protected]) Computer Architecture Overview
1946 — ENIAC
Electronic Numerical Integrator And Computer, aka “Giant Brain”
First electronic general-purpose computer
Before that, “computers” were humans, who could use non-programmable mechanical and later electrical computation tools
Could be reprogrammed (Stored-Program Computer instead of Fixed-Program Computer)
Main sponsor: University of Pennsylvania / Ballistic Research Laboratory ($487k, equivalent to about $7M in 2016)
Designers: Mauchly and Eckert
First operators (i.e., programmers): The 6 “ENIAC Girls” (McNulty, Jennings, Snyder, Wescoff, Bilas, and Lichterman)
1946 — ENIAC (Features)
1000x faster than (specialized) electro-mechanical equivalent
2400x faster than (specialized) human being (30 seconds instead of 20 hours)
100 kHz / 5 kIPS (now: 4 GHz / 5,000 MIPS)
1,000 bits of RAM (i.e., 0.12 KiB)
150 kW (now: 200 W)
17,468 vacuum tubes (failure prone, power hungry)
8 × 3 × 100 ft; 27 metric tons (60,000 pounds)
1946 — ENIAC (Pictures)
Von Neumann
ENIAC design frozen in 1943; Eckert and Mauchly work on a new design: the EDVAC
1944: Von Neumann (1903-1957) joins Eckert and Mauchly, writes a memo formalizing their ideas
This became the Von Neumann Architecture Model
A Central Processing Unit performs operations and controls the sequence of operations
A Memory Unit contains code and data
Some kind of Input and Output mechanisms (I/O)
Von Neumann Model
Amazingly, it is still possible to think of the computer this way at a conceptual level (model from ∼70 years ago!)
CPU ⇐⇒ Memory
⇕
I/O
Today a computer looks more like:
[Diagram: Memory, CPU, Disk Controller, USB Controller, and Graphics Adapter, all attached to the Memory Bus]
Von Neumann Model: Origins
1847: Boolean algebra – Truth value (true / false), Boolean logic, Bit (binary digit)
1937: Shannon’s MS Thesis – Any logical, numerical relationship can be built using Boolean algebra
Therefore, any “information” can be represented in binary form, and therefore we can build computers that only understand binary
Building computers this way is technologically convenient:
0 Volt: False (0)
∼5 Volt: True (1)
The Von Neumann Architecture
CPU ⇐⇒ Memory
⇕
I/O
Memory Unit
Called Memory or RAM (Random Access Memory) for short
I will say “memory” or “RAM” interchangeably
The basic unit of memory is the byte (or octet, or octad, or octade)
1 Byte = 8 bits, e.g., “0110 1011”
Memory Unit
The memory contains numerical “information” / “data” / “content”
Content
3
1
4
1
25
9
2
167
-5
...
Memory Unit
The “data” are represented in memory in binary as bytes
Content      (Human)
0000 0011    3
0000 0001    1
0000 0100    4
0000 0001    1
0001 1001    25
0000 1001    9
0000 0010    2
1010 0111    167
1111 1011    -5
...          ...
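As a rough illustration of the table above, here is a small Python sketch that prints the 8-bit pattern of each value. Python and the helper name `to_bits` are assumptions made for this example, not part of the slides. Negative values are encoded in two's complement, which is what the -5 row appears to use.

```python
def to_bits(value: int) -> str:
    """Return the 8-bit pattern of value as two groups of four bits.

    Negative values are encoded in two's complement by masking to 8 bits.
    """
    if not -128 <= value <= 255:
        raise ValueError("value does not fit in one byte")
    bits = format(value & 0xFF, "08b")
    return bits[:4] + " " + bits[4:]


# Values taken from the slide's table.
for v in [3, 1, 4, 1, 25, 9, 2, 167, -5]:
    print(to_bits(v), v)
```

The pattern 1111 1011 would read as 251 if treated as an unsigned byte: the bits alone do not say which reading is intended.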
Memory Unit
To be used, the data need to be located precisely in memory: addresses
Address    Content      (Human)
0          0000 0011    3
1          0000 0001    1
2          0000 0100    4
3          0000 0001    1
4          0001 1001    25
5          0000 1001    9
6          0000 0010    2
7          1010 0111    167
8          1111 1011    -5
...        ...          ...
Memory Unit
... but because computers only understand binary, the addresses are binary too:
Address      Content      (Human)
0000 0000    0000 0011    3
0000 0001    0000 0001    1
0000 0010    0000 0100    4
0000 0011    0000 0001    1
0000 0100    0001 1001    25
0000 0101    0000 1001    9
0000 0110    0000 0010    2
0000 0111    1010 0111    167
0000 1000    1111 1011    -5
...          ...          ...
Memory Unit
Each byte in memory is labeled by a unique address
We talk of a byte-addressable memory
All addresses on a computer have the same number of bits (e.g., 16-bit addresses)
The CPU has instructions like “Read the byte at address X and give me its value” and “Write this value into the byte at address Y”
The Memory Unit (Bus + RAM) has the hardware to make these instructions happen
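The read and write operations described above can be sketched in Python as follows. This is only a toy model under stated assumptions: the class name `Memory` and its methods are hypothetical, and the memory is zero-filled for simplicity, whereas real uninitialized memory holds arbitrary values.

```python
class Memory:
    """Toy byte-addressable memory with fixed-width addresses (hypothetical model)."""

    def __init__(self, address_bits: int = 16) -> None:
        self.address_bits = address_bits
        # One byte per address: 2**16 = 65,536 bytes for 16-bit addresses.
        # Zero-filled here for simplicity.
        self.cells = bytearray(2 ** address_bits)

    def _check(self, address: int) -> None:
        if not 0 <= address < len(self.cells):
            raise IndexError(
                f"address {address} does not fit in {self.address_bits} bits"
            )

    def read(self, address: int) -> int:
        """'Read the byte at address X and give me its value.'"""
        self._check(address)
        return self.cells[address]

    def write(self, address: int, value: int) -> None:
        """'Write this value into the byte at address Y.'"""
        self._check(address)
        self.cells[address] = value & 0xFF  # keep only the low 8 bits


mem = Memory(address_bits=16)
mem.write(0b0000_0000_0000_0011, 0b0000_0001)
print(mem.read(3))  # same address, written in decimal
```

With 16-bit addresses there are exactly 65,536 distinct addresses, so that is the largest byte-addressable memory this model can describe.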
Conceptual View of Memory (16-bit addresses example)
Address                Content
0000 0000 0000 0000    0000 0011
0000 0000 0000 0001    0000 0001
0000 0000 0000 0010    0000 0100
0000 0000 0000 0011    0000 0001
0000 0000 0000 0100    0000 0101
0000 0000 0000 0101    0000 1001
0000 0000 0000 0110    0000 0010
0000 0000 0000 0111    0000 0110
0000 0000 0000 1000    0000 0101
...                    ...
1111 1111 1111 1111    0010 0101
At address 0000 0000 0000 0011 the content is 0000 0001
(The contents of uninitialized memory are random)
Conceptual View of Memory (8-bit addresses example)
Let’s consider a memory with 8-bit addresses and this initial state.
We can write a program that does “At address 1000 0000, store the address of the first ’9’ (0000 1001) in memory”
Before:
Address      Content
0000 0000    0000 0011
0000 0001    0000 0001
0000 0010    0000 0100
0000 0011    0000 0001
0000 0100    0000 0101
0000 0101    0000 1001
0000 0110    0000 0010
0000 0111    0000 0110
0000 1000    0000 0101
...          ...
1000 0000    0110 0101
1000 0001    1001 0111
=⇒
After:
Address      Content
0000 0000    0000 0011
0000 0001    0000 0001
0000 0010    0000 0100
0000 0011    0000 0001
0000 0100    0000 0101
0000 0101    0000 1001
0000 0110    0000 0010
0000 0111    0000 0110
0000 1000    0000 0101
...          ...
1000 0000    0000 0101
1000 0001    1001 0111
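One possible way to express that program, as a Python sketch rather than real machine code: the memory is modeled as a `bytearray` (an assumption made for this example), and the variable names are hypothetical.

```python
# Initial state from the slide (8-bit addresses, so 256 bytes).
memory = bytearray(256)
memory[0:9] = bytes([3, 1, 4, 1, 5, 9, 2, 6, 5])
memory[0b1000_0000] = 0b0110_0101
memory[0b1000_0001] = 0b1001_0111

# Scan addresses in increasing order for the first byte equal to 9.
address_of_nine = memory.index(9)

# Store that address (not the value 9 itself) at address 1000 0000.
memory[0b1000_0000] = address_of_nine

print(format(memory[0b1000_0000], "08b"))
```

Only the byte at address 1000 0000 changes; the byte at 1000 0001 keeps its initial value.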
Indirection
An address is just information
In the previous slide we’ve done indirection
The content at a memory location is the address of another memory location: we call this a pointer/reference
At that other memory location is some content that we care about
which in our case is the value ’9’
but which could be yet another address
It’s the job of the programmer to know what memory content means (the CPU has no idea), which is a source of bugs
Very well-known difficulty when writing assembly (ICS312/ICS331)
High-level programming languages help, but in C you can do whatever:
e.g., on a typical 64-bit Linux or macOS system a C pointer is just a 64-bit number, the same size as an unsigned long
unsigned long x = 42;
int *ptr = (int *)x; // bogus pointer!
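To make indirection concrete, here is a minimal Python sketch that follows a pointer, then a pointer to a pointer, in a toy 256-byte memory. The layout and the helper name `load` are assumptions made for this example.

```python
memory = bytearray(256)
memory[5] = 9                       # the content we care about
memory[0b1000_0000] = 5             # a pointer: the address of the 9
memory[0b1000_0001] = 0b1000_0000   # a pointer to the pointer


def load(address: int) -> int:
    """Read one byte; memory does not know whether it is a value or an address."""
    return memory[address]


pointer = load(0b1000_0000)              # 5, which we choose to treat as an address
value = load(pointer)                    # one level of indirection
value2 = load(load(load(0b1000_0001)))   # two levels of indirection

# Nothing stops us from treating the value 9 as an address too,
# which is the kind of bug the slide warns about.
bogus = load(value)                      # reads address 9, which holds 0 here
print(value, value2, bogus)
```

Both `value` and `value2` end up holding 9, while `bogus` is whatever happens to sit at address 9.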
Hello World! (Well... not really)
Let’s consider the following pseudo-code:
Step 1) Set the content of variable A to the content at address 1000 0000
Step 2) Set the content of variable B to the content at address 1000 0001
Step 3) Add A and B together and store the result in A
Step 4) Set the content at address 1000 0001 to the contents of A
Step 5) Go back to Step 1
or in assembly (pseudo-)instructions:
// MIPS-like (ICS 331)
S1: LOAD A, (1000 0000)
S2: LOAD B, (1000 0001)
S3: ADD A, B
S4: STORE A, (1000 0001)
S5: JMP S1
// x86-like (ICS 312)
S1: MOV AL, [1000 0000]
S2: MOV BL, [1000 0001]
S3: ADD AL, BL
S4: MOV [1000 0001], AL
S5: JMP S1
Binary Instruction Encoding
Instructions are encoded in binary, based on the specification of the microprocessor your computer uses
Here are some x86 instruction encodings:
Instruction         Encoding (in hex)    Size
ADD EAX, 1          83C001               3 bytes
ADD EAX, -1         83C0FF               3 bytes
ADD EAX, -100000    056079FEFF           5 bytes
ADD EAX, EBX        01D8                 2 bytes
Some instructions are shorter than others, which impacts the size of the executable
An assembler transforms assembly code into binary code, so programmers typically don’t know the binary code for instructions
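The immediate-operand rows of the table can be reproduced with a short Python sketch. The helper names `add_eax_imm8` and `add_eax_imm32` are hypothetical, and the sketch covers only these two encoding forms; it is in no way a general assembler.

```python
def add_eax_imm8(imm: int) -> bytes:
    """ADD EAX, imm8: opcode 83, ModRM byte C0, then a signed 8-bit immediate."""
    return bytes([0x83, 0xC0]) + imm.to_bytes(1, "little", signed=True)


def add_eax_imm32(imm: int) -> bytes:
    """ADD EAX, imm32: opcode 05, then a signed 32-bit little-endian immediate."""
    return bytes([0x05]) + imm.to_bytes(4, "little", signed=True)


for name, code in [
    ("ADD EAX, 1", add_eax_imm8(1)),
    ("ADD EAX, -1", add_eax_imm8(-1)),
    ("ADD EAX, -100000", add_eax_imm32(-100000)),
]:
    print(f"{name:18} {code.hex().upper():12} {len(code)} bytes")
```

The value -100000 does not fit in a signed 8-bit immediate (range -128 to 127), which is why that instruction needs the longer 5-byte form.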
The program is stored in RAM along with data
Address      Content (hex)    Meaning
0000 0000    83               ADD EAX, 1
0000 0001    C0
0000 0010    01
0000 0011    01               ADD EAX, EBX
0000 0100    D8
0000 0101    05               ADD EAX, -100000
0000 0110    60
0000 0111    79
0000 1000    FE
0000 1001    FF
...          ...
1000 0000    05               Some data
1000 0001    4F               Some data
1000 0010    2C               Some data
1000 0011    00               Some data
Once a program is loaded in memory its address space contains both code and data
The CPU can’t tell the difference, only the programmer can
This is conveniently hidden from the programmer, unless you write assembly
It’s the CPU’s job to understand that 83C001 means ADD EAX, 1
Memory Unit: Conclusions
The memory is basically an indexed array of bytes
The memory contents have various useful meanings:
integers, character codes, floating-point numbers, ... but also higher level abstractions: RGB values, coordinates in space-time, images...
addresses (pointers)
instructions (i.e., executable code) understood by a CPU
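As a small illustration of the point that meaning lies with the programmer, the Python sketch below reads the same four bytes in three different ways using the standard `struct` module. The byte values are chosen for this example and do not come from the slides.

```python
import struct

raw = bytes([0x00, 0x00, 0x80, 0x3F])  # four bytes, nothing more

as_uint = struct.unpack("<I", raw)[0]   # one unsigned 32-bit integer
as_float = struct.unpack("<f", raw)[0]  # one 32-bit floating-point number
as_bytes = struct.unpack("<4B", raw)    # four unsigned bytes, e.g. R, G, B, A

print(as_uint)
print(as_float)
print(as_bytes)
```

Nothing in the bytes themselves selects one of these readings; the format string passed to `struct.unpack` plays the role of the programmer's knowledge.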
Central Processing Unit
The CPU reads data from memory into registers, writes data from registers to memory, and computes
The component that performs the computational operations is called the ALU (Arithmetic and Logic Unit)
It can perform what you expect (+, -, /, *, OR, AND, XOR, ...)
Operands and results of operations must all be in registers
Unfortunately, there are very few registers
e.g., Intel i7: 8 × 32-bit or 16 × 64-bit general-purpose registers (and 16 FP registers of 128 or 256 bits)
This is a pain when writing assembly by hand
But the compiler does all that work for us when we use high-level languages
Central Processing Unit
The CPU also controls the execution of the program’s instructions
The Control Unit is the component in charge of controlling the program execution, and it uses dedicated registers:
Program Counter: Contains the address of the next instruction that should be executed; it is incremented after each instruction but can be set to whatever address when there is a change in control flow
Current Instruction: The binary code of the instruction which is currently being executed
Other registers: Stack Pointer, Frame Pointer, ...
The Control Unit decodes the instructions (i.e., interprets their bits) and makes them happen
This is a main topic of a Computer Architecture course
Fetch-Decode-Execute Cycle
The Fetch-Decode-Execute Cycle
The Control Unit fetches the next program instruction from memory using the program counter
The instruction is decoded and signals are sent to hardware components (memory controller, ALU, I/O controller)
The instruction is executed:
Values are fetched from memory and put in the registers
Computation is performed by the ALU and results are stored in registers
Register values are pushed back to memory
Program state is modified (Program Counter, Stack Pointer, ...)
Repeat
Computers implement many variations on this cycle, with tons of bells and whistles to make it as fast as possible
But one can still program with the above model in mind (though certainly without fully understanding performance issues)
Fetch-Decode-Execute
Let’s consider a simplistic hypothetical Von Neumann architecture
Memory contains 256 × 1 byte
CPU has 2 “data” registers (A and B), 2 “control” registers (Program Counter and Current Instruction)
CPU instructions encoded on 1 byte (8 bits): 3-bit “opcode” (operation code) and 5-bit operands:
Opcode 000: Load to register A from memory
Opcode 001: Load to register B from memory
Opcode 010: Add B to A; store the result in A
Opcode 011: Store the value of A to memory
Opcode 100: Jump
Opcode 111: Halt (program terminates)
We will assume that initially A = 5 and B = 151
Sample Execution Decoding
From the previous slide, our instructions are as follows:
Opcode 000: Load to register A from memory
Opcode 001: Load to register B from memory
Opcode 010: Add B to A; store the result in A
Opcode 011: Store the value of A to memory
Opcode 100: Jump
Opcode 111: Halt (program terminates)
So, for instance, here are the meanings of example instructions:
00010111: Load the byte in RAM at address 00010111 into register A (“LOAD A, (10111)” in MIPS-like assembly)
010?????: A = A + B (we don’t care what the 5 trailing bits are because this instruction takes no operand)
10000011: Jump to the instruction at address 00000011 and execute it
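The decoding rule above (3-bit opcode, then 5-bit operand) can be sketched in Python with a shift and a mask. The function names `decode` and `describe` and the mnemonic strings are made up for illustration; only the opcode values come from the slide.

```python
# Opcode values are from the slide; the mnemonic strings are illustrative labels.
OPCODES = {
    0b000: "LOAD A",
    0b001: "LOAD B",
    0b010: "ADD A, B",
    0b011: "STORE A",
    0b100: "JMP",
    0b111: "HALT",
}


def decode(instruction):
    """Split a 1-byte instruction into (opcode, operand)."""
    opcode = (instruction >> 5) & 0b111  # top 3 bits
    operand = instruction & 0b11111      # bottom 5 bits
    return opcode, operand


def describe(instruction):
    """Return a readable form of a 1-byte instruction."""
    opcode, operand = decode(instruction)
    name = OPCODES.get(opcode, "UNKNOWN")
    return f"{name} (operand {operand:05b})"


print(describe(0b00010111))  # LOAD A (operand 10111)
print(describe(0b10000011))  # JMP (operand 00011)
```

For ADD and HALT the operand bits are still extracted, but a machine following the slide would simply ignore them.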
(Initialization)-Fetch-Decode-Execute
CPU
  ALU Registers: A = undefined, B = undefined
  Control Unit (CU) Registers: PC = undefined, CI = undefined
Memory
  Address     Content     Meaning
  0000 0100   000 10000   LOAD A, (10000)
  0000 0101   001 10001   LOAD B, (10001)
  0000 0110   010 00000   ADD A, B
  0000 0111   011 10001   STORE A, (10001)
  0000 1000   100 00100   JMP (00100)
  ...         ...         ...
  0001 0000   0000 0101   5d
  0001 0001   1001 0111   151d
  ...         ...         ...
The Program (its Code and its Data) is loaded into memory (Guess who does that?)
CPU state: A = undefined, B = undefined, PC = 0000 0100, CI = undefined
The Program Counter is set to the address of the first instruction of the program (Guess who does that?)
CPU state: A = undefined, B = undefined, PC = 0000 0100, CI = undefined
A request is put on the Address Bus to retrieve the value in memory at address PC = 0000 0100
CPU state: A = undefined, B = undefined, PC = 0000 0100, CI = 0001 0000
The Memory Unit puts the requested data on the Data Bus and the CPU puts it into the CI register
CPU state: A = undefined, B = undefined, PC = 0000 0101, CI = 0001 0000
The PC register value is incremented: its new value is the address of the next instruction to execute
CPU state: A = undefined, B = undefined, PC = 0000 0101, CI = 0001 0000
The instruction is decoded: 00010000 means “000 = LOAD A from address 000(10000)”
CPU state: A = undefined, B = undefined, PC = 0000 0101, CI = 0001 0000
The instruction is executed: The value of the memory at address 00010000 is requested (using the address bus)
CPU state: A = 0000 0101, B = undefined, PC = 0000 0101, CI = 0001 0000
The instruction is executed: The content at address 10000, that is, 0000 0101, is put on the Data Bus and written to register A
Fetch-Decode-Execute-(Repeat)
Repeat!
CPU state: A = 0000 0101, B = undefined, PC = 0000 0110, CI = 0011 0001
Fetch (Note that the value of PC is incremented)
CPU state: A = 0000 0101, B = undefined, PC = 0000 0110, CI = 0011 0001
The instruction is decoded: 00110001 means “001 = LOAD B from address 000(10001)”
CPU state: A = 0000 0101, B = 1001 0111, PC = 0000 0110, CI = 0011 0001
The instruction is executed: Value read at address 00010001, that is, 1001 0111, is written to register B
CPU state: A = 0000 0101, B = 1001 0111, PC = 0000 0111, CI = 0100 0000
Fetch
CPU state: A = 0000 0101, B = 1001 0111, PC = 0000 0111, CI = 0100 0000
The instruction is decoded: 01000000 means “010 = ADD A, B (the operand is ignored)”
CPU state: A = 1001 1100, B = 1001 0111, PC = 0000 0111, CI = 0100 0000
The instruction is executed (A ← A+B)
CPU state: A = 1001 1100, B = 1001 0111, PC = 0000 1000, CI = 0111 0001
Fetch
CPU state: A = 1001 1100, B = 1001 0111, PC = 0000 1000, CI = 0111 0001
Memory: address 0001 0001 now contains 1001 1100 (156d)
(Let’s skip the Decode part) Execute
CPU state: A = 1001 1100, B = 1001 0111, PC = 0000 1001, CI = 1000 0100
Fetch
CPU state: A = 1001 1100, B = 1001 0111, PC = 0000 0100, CI = 1000 0100
Memory: address 0001 0001 contains 1001 1100 (156d)
Execute - the JMP instruction modifies the value of a control register (PC)
The next instruction to execute will be LOAD A, (10000)
And like that we have implemented an infinite loop...
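To see what this infinite loop does to memory, here is a rough Python sketch of the same five instructions. It is not a model of the hardware: the `memory` dictionary and the three-iteration bound are assumptions added so the example terminates, and the 8-bit wraparound (`& 0xFF`) is also an assumption, since the slides do not say what happens on overflow.

```python
# Data bytes from the slides: address 10000 holds 5, address 10001 holds 151.
memory = {0b10000: 5, 0b10001: 151}

history = []
for _ in range(3):            # bounded here; the real JMP loops forever
    a = memory[0b10000]       # LOAD A, (10000)
    b = memory[0b10001]       # LOAD B, (10001)
    a = (a + b) & 0xFF        # ADD A, B (assumed to wrap at 8 bits)
    memory[0b10001] = a       # STORE A, (10001)
    history.append(a)         # JMP (00100): back to the first LOAD

print(history)  # [156, 161, 166]
```

Each pass through the loop adds 5 to the byte stored at address 10001.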
Fetch-Decode-Execute Practice
It’s a pretty good idea to review these slides: go back to the first slide (initialization) and see if you can go through the fetch-decode-execute cycle yourself
We’ll have a simple homework assignment along these lines
But just in case, let’s do one together right now...
In-class activity (just to make sure we’re all on board)
CPU
  Registers: A = undefined, B = 0000 0110
  CU Registers: PC = 0000 0001, CI = undefined
Opcode   Meaning                             5-bit operand
000      Load to register A from memory      address
001      Load to register B from memory      address
010      Add B to A; store the result in A   ignored
011      Store the value of A to memory      address
100      Jump                                address
111      Halt                                ignored
Memory
  Address     Content
  0000 0000   001 10010
  0000 0001   000 10011
  0000 0010   010 00000
  0000 0011   011 10111
  0000 0100   001 10111
  0000 0101   111 00000
  ...         ...
  0001 0010   0000 0110
  0001 0011   1000 0111
What is the decimal value of register B when the program terminates?
In-class activity solution
CPU
  Registers: A = 1000 1101, B = 1000 1101
  CU Registers: PC = 0000 0110, CI = 1110 0000
Memory
  Address     Content     Meaning
  0000 0000   001 10010   (not executed: PC starts at 0000 0001)
  0000 0001   000 10011   A ← 135d
  0000 0010   010 00000   A ← 135d + 6d = 141d
  0000 0011   011 10111   (00010111) ← 141d
  0000 0100   001 10111   B ← (00010111) = 141d
  0000 0101   111 00000   Halt
  ...         ...
  0001 0010   0000 0110   6d
  0001 0011   1000 0111   135d
Answer: the decimal value of B is 141
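One way to check this answer is to run the program on a tiny simulator of the toy machine. The sketch below is only an illustration: the function name `run`, the `max_steps` guard, reading unset memory as 0, modelling the undefined register A as 0, and wrapping additions at 8 bits are all assumptions that the slides do not specify.

```python
def run(memory, pc, a=0, b=0, max_steps=1000):
    """Run the toy machine until Halt; return (a, b, pc, memory)."""
    memory = dict(memory)                  # work on a copy
    for _ in range(max_steps):
        ci = memory.get(pc, 0)             # fetch into the Current Instruction register
        pc += 1                            # PC now points at the next instruction
        opcode = ci >> 5                   # decode: top 3 bits
        operand = ci & 0b11111             # decode: bottom 5 bits
        if opcode == 0b000:                # Load to register A from memory
            a = memory.get(operand, 0)
        elif opcode == 0b001:              # Load to register B from memory
            b = memory.get(operand, 0)
        elif opcode == 0b010:              # Add B to A; store the result in A
            a = (a + b) & 0xFF             # assumed 8-bit wraparound
        elif opcode == 0b011:              # Store the value of A to memory
            memory[operand] = a
        elif opcode == 0b100:              # Jump
            pc = operand
        elif opcode == 0b111:              # Halt
            return a, b, pc, memory
        else:
            raise ValueError(f"unknown opcode {opcode:03b}")
    raise RuntimeError("step limit reached without Halt")


# Memory contents copied from the in-class activity slide.
program = {
    0b00000000: 0b00110010,
    0b00000001: 0b00010011,
    0b00000010: 0b01000000,
    0b00000011: 0b01110111,
    0b00000100: 0b00110111,
    0b00000101: 0b11100000,
    0b00010010: 0b00000110,  # 6d
    0b00010011: 0b10000111,  # 135d
}

a, b, pc, final_memory = run(program, pc=0b00000001, a=0, b=0b00000110)
print(a, b, pc)  # 141 141 6
```

The final PC is 0000 0110 because the PC is incremented during the fetch of the Halt instruction, which matches the solution slide.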
There is more to Fetch-Decode-Execute
This was a simplified view of the way things work
Control and data paths are implemented by several hardware components
There is usually more than one ALU
There are caches between the CPU and the memory
There are even multiple CPUs
The cycle is pipelined: Fetch the instruction i + 1 while instruction i is being executed
Decades of computer architecture research have gone into improving speed, thus often leading to high hardware complexity (and doing smart things in hardware requires more logic gates and wires, thus increasing CPU cost)
But, conceptually, it is still Fetch-Decode-Execute.
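To get a feel for why pipelining helps, here is a small Python sketch of an idealized pipeline. The assumptions are mine, not the slides': exactly three stages, each taking one time step, and no stalls of any kind. Real pipelines have more stages and do stall.

```python
STAGES = ["Fetch", "Decode", "Execute"]


def sequential_steps(n):
    """Time steps for n instructions when each one finishes before the next starts."""
    return n * len(STAGES)


def pipelined_steps(n):
    """Time steps for n instructions in an ideal pipeline (no stalls)."""
    if n == 0:
        return 0
    return len(STAGES) + (n - 1)


def schedule(n):
    """Return {time_step: [(instruction, stage), ...]} for the ideal pipeline."""
    table = {}
    for i in range(n):
        for offset, stage in enumerate(STAGES):
            table.setdefault(i + offset, []).append((i, stage))
    return table


for step, work in sorted(schedule(3).items()):
    print(step, work)

print(sequential_steps(5), pipelined_steps(5))  # 15 7
```

At time step 1 the schedule shows instruction 0 in Decode while instruction 1 is in Fetch, which is the overlap described above.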
The Von Neumann Architecture
[Diagram: CPU ⇐⇒ Memory, with I/O connected by a vertical double arrow]
I/O
Let’s leave this topic for (much) later...
Let’s just assume that there is an I/O Controller and that the CPU can talk to it to make I/O happen (reads and writes)
After all there is a Memory Controller and at the conceptual level they are not so different
The RAM is slow
A big speed issue: the memory is slow
Accessing a register is very fast
e.g., a 4GHz CPU can update a register in 0.25 nanosecond (1 cycle)
Accessing the memory takes about 10 ns
The memory is ∼40 times slower than the CPU
What does the CPU do while it’s waiting for the memory to give it data?
NOTHING!! (yes, this is a problem)
This is the famous “Von-Neumann Bottleneck”
Many techniques have been developed to address this
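The “∼40 times slower” figure follows directly from the two numbers above. A quick check in Python (the variable names are just for illustration):

```python
clock_hz = 4e9                       # 4 GHz CPU
cycle_ns = 1e9 / clock_hz            # duration of one cycle, in nanoseconds
memory_ns = 10.0                     # one memory access, in nanoseconds

stall_cycles = memory_ns / cycle_ns  # cycles spent waiting for one memory access

print(cycle_ns)      # 0.25
print(stall_cycles)  # 40.0
```

So with these numbers, every access that goes all the way to memory costs the CPU about 40 cycles of waiting.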
Several levels of RAM
We would like a gigantic and fast memory
Could we just build the memory as gazillions of registers?
No!!! Cost/physics make it impossible
Instead, we play a trick to provide the illusion of a fast memory
This trick is called the memory hierarchy
The Memory Hierarchy
From fast and small (top) to slow and large (bottom):
  Level             Size             Access time   Managed by
  (CPU) Registers   Few 100s Bytes   < 1 ns        Compiler
  Cache             kB to MB         1 ns          Hardware
  Memory            GB               10 ns         OS
  I/O Devices       TB               1+ ms         OS
Memory is reached over the Memory Bus and I/O Devices over the I/O Bus
The Memory Hierarchy in a Nutshell
When a program accesses a byte in memory:
It checks whether the byte is in cache, and if so, it just gets it
Otherwise, the byte value is brought from the (slow) memory into the (fast) cache
The values around the byte are also brought into the cache
Analogy:
To write a paper you need a reference book from the library
You go to the library and find the book on a shelf, noticing that the books around it are on the same topic! You can...
Leave the book at the library and go to the library each time you need one reference
Take only the one book... but if it makes a reference to another book on the same topic you’ll have to go back to the library
Or take the one book and the books around it and put them on your desk... and if THE reference makes a reference maybe you’ll have the referred book right there
In this last option your desk is a “cache for the library”
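The three steps above (check the cache, fetch on a miss, bring the neighbors along) can be sketched as a toy Python class. Everything here is an assumption made for illustration: the name `ToyCache`, the 8-byte block size, and the fact that the cache never runs out of room. Real caches are built in hardware and have to evict data.

```python
class ToyCache:
    """Cache that loads a whole block of neighbouring bytes on a miss."""

    def __init__(self, memory, block_size=8):
        self.memory = memory          # the "slow" memory (a bytes object)
        self.block_size = block_size  # assumed block size, in bytes
        self.blocks = {}              # block number -> copy of that block
        self.hits = 0
        self.misses = 0

    def read(self, address):
        block_number = address // self.block_size
        if block_number in self.blocks:
            self.hits += 1            # the byte is already on the "desk"
        else:
            self.misses += 1          # trip to the "library"
            start = block_number * self.block_size
            # Bring the requested byte and its neighbours into the cache.
            self.blocks[block_number] = self.memory[start:start + self.block_size]
        return self.blocks[block_number][address % self.block_size]


memory = bytes(range(64))             # 64 bytes, byte i holds the value i
cache = ToyCache(memory)
values = [cache.read(address) for address in (16, 17, 18, 40, 16)]

print(values)                    # [16, 17, 18, 40, 16]
print(cache.hits, cache.misses)  # 3 2
```

Only the first access to each block pays for the trip to memory; addresses 17 and 18 hit because they came along with address 16.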
Why does it work?
TEMPORAL LOCALITY
A program tends to reference addresses it has already referenced
e.g., Counters
The first access is expensive: Fetching the value takes many cycles
Subsequent accesses are cheap: The value is in cache
The “I need that same book again” analogy
Why does it work?
SPATIAL LOCALITY
A program tends to reference addresses next to addresses it has already referenced
e.g., When manipulating arrays (i.e., contiguous bytes in memory)
The access to element i is expensive: Fetching the value takes many cycles
Accesses to elements i + 1, i + 2, ... are cheap: The values are in cache!
The “I need a book on that same shelf” analogy
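Both kinds of locality show up if we count cache misses for a few access patterns. The sketch below makes simplifying assumptions that are not in the slides: 64-byte blocks, 4-byte array elements, and a cache that never evicts anything. The function name `count_misses` is made up for the example.

```python
def count_misses(addresses, block_size=64):
    """Count misses for a sequence of addresses, assuming a cache that never evicts."""
    cached_blocks = set()
    misses = 0
    for address in addresses:
        block = address // block_size
        if block not in cached_blocks:
            misses += 1
            cached_blocks.add(block)
    return misses


ELEMENT_SIZE = 4   # assumed bytes per array element
N = 1024           # number of accesses in each pattern

counter = [0] * N                                   # temporal: same address every time
sequential = [i * ELEMENT_SIZE for i in range(N)]   # spatial: walk through an array
strided = [i * 64 for i in range(N)]                # neither: a new block every time

print(count_misses(counter))     # 1
print(count_misses(sequential))  # 64
print(count_misses(strided))     # 1024
```

With these numbers the array walk misses once per 16 elements, while the strided pattern, which never reuses a block, misses on every access.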
The Memory Hierarchy: Memory Caches
In reality there is more than one level of cache (L1, L2, L3)
Trade-offs between size, speed, and cost
L1 (the closest/fastest to the CPU) is actually split into a Data Cache and an Instruction Cache
Chunks of data are brought from (far-away) memory and are copied and kept around in (nearby) caches
The same data exist in multiple levels of memory at once, which leads to interesting issues/problems we might discuss (see ICS 432)
Cache Hit: When a data item is found in cache (e.g., we would talk of an “L2 cache hit”)
Cache Miss: When a data item is not found in cache (e.g., we would talk of an “L1 cache miss”)
We’ll use this hit/miss terminology for several OS concepts...
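The hit/miss vocabulary across several levels can be sketched as follows. The `lookup` function and the use of plain sets as cache levels are hypothetical simplifications: real caches have limited capacity and eviction policies, which are ignored here.

```python
def lookup(address, levels):
    """Search the levels from closest to farthest and report where the data was found.

    levels is a list of (name, set_of_cached_addresses) pairs.
    Every level that missed receives a copy, so the same data ends up
    in several levels at once.
    """
    missed = []
    for name, contents in levels:
        if address in contents:
            found = name + " hit"
            break
        missed.append(contents)
    else:
        found = "memory"  # missed in every cache level
    for contents in missed:
        contents.add(address)
    return found


l1, l2, l3 = set(), {0x40}, {0x40, 0x80}
levels = [("L1", l1), ("L2", l2), ("L3", l3)]

results = [
    lookup(0x40, levels),  # L1 miss, then L2 hit
    lookup(0x40, levels),  # now an L1 hit, since a copy was kept in L1
    lookup(0x80, levels),  # L1 and L2 miss, then L3 hit
    lookup(0xC0, levels),  # misses everywhere, served from memory
]
print(results)
```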
Direct Memory Access (DMA)
Often, one has to copy large chunks of data to/from RAM from/to some peripheral device (graphics card, network card, sound card, disk)
In the pure Von-Neumann model, the CPU has to be involved for each copy operation
The problem is that memory copies take a long time (even with caches), and the CPU spends its life twiddling its thumbs while the copies are taking place
It would be better to have copies occur independently so that the CPU can do something useful while the memory copy is taking place
This is called Direct Memory Access (DMA)
Direct Memory Access (DMA)
DMA is used on all modern computers
e.g., the Intel i7 has an on-chip DMA controller
How DMA works (without getting into details):
The CPU simply tells the DMA controller to initiate a RAM copy
When the copy is complete the DMA controller tells the CPU “it’s done” by generating an interrupt (more on interrupts very soon)
In the meantime, the CPU was free to do whatever
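As a loose software analogy of these three steps, a background thread can stand in for the DMA controller and a callback for the interrupt. Everything here is hypothetical (`DmaController`, `start_copy`, `interrupt_handler`); a real controller is programmed through device registers, and Python threads do not give true hardware parallelism, so this only mirrors the control flow.

```python
import threading


class DmaController:
    """Hypothetical stand-in for a DMA controller; a thread plays the hardware."""

    def start_copy(self, src, dst, on_complete):
        def transfer():
            dst[:] = src    # the copy proceeds without the "CPU" (the main thread)
            on_complete()   # the "interrupt": tell the CPU the copy is done

        worker = threading.Thread(target=transfer)
        worker.start()
        return worker


copy_done = threading.Event()


def interrupt_handler():
    copy_done.set()


src = bytearray(b"x" * 1_000_000)
dst = bytearray(len(src))

dma = DmaController()
worker = dma.start_copy(src, dst, interrupt_handler)  # 1. the CPU initiates the copy

useful_work = sum(range(100_000))  # 3. meanwhile, the CPU does something else

copy_done.wait()  # 2. the CPU learns the copy has completed
worker.join()
print(dst == src, useful_work)
```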
DMA is not free
To perform data transfers the DMA controller uses the memory bus
In the meantime, the code executed by the CPU likely also uses the memory bus
Therefore, they can interfere with each other
There are several ways in which this interference can be managed (give priority to DMA, to CPU, weight usage, ...)
See a Computer Architecture course
In general, using DMA leads to much better performance anyway and (good) software should use it as often as possible
Current Architectures
Current architectures are much more complex than what we just described
Because manufacturers cannot increase clock rates much further (power/heat issues), our current CPUs are multi-core
Multiple “low” clock rate CPUs on a single chip
This is a great solution to a problem, but most users/programmers would rather have a single 100 GHz core than fifty 2 GHz cores
We’ll talk about multi-core architectures later in the semester
Example of a real-life system
Picture obtained with lstopo
(sudo apt-get install hwloc)
Conclusion
If you want to know more:
Take ICS312 / ICS331
Take Computer Architecture (EE 461, ICS431)
Computer Organization and Design, Patterson and Hennessy
We will have a quiz on these lecture notes next week