cache organizationcsg.csail.mit.edu/6.823s20/lectures/l03split.pdf · 2020. 2. 11. · cache...

95
L03-1 MIT 6.823 Spring 2020 Mengjia Yan Computer Science and Artificial Intelligence Laboratory M.I.T. Based on slides from Daniel Sanchez Cache Organization

Upload: others

Post on 18-Aug-2021

19 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

L03-1MIT 6.823 Spring 2020

Mengjia YanComputer Science and Artificial Intelligence Laboratory

M.I.T.

Based on slides from Daniel Sanchez

Cache Organization

Page 2: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

CPU-Memory Bottleneck

February 11, 2020

MemoryCPU

L03-2

Page 3: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

CPU-Memory Bottleneck

February 11, 2020

MemoryCPU

Performance of high-speed computers is usually

L03-2

Page 4: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

CPU-Memory Bottleneck

February 11, 2020

MemoryCPU

Performance of high-speed computers is usuallylimited by memory bandwidth & latency

• Latency (time for a single access)Memory access time >> Processor cycle time

• Bandwidth (number of accesses per unit time)

L03-2

Page 5: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

CPU-Memory Bottleneck

February 11, 2020

MemoryCPU

Performance of high-speed computers is usuallylimited by memory bandwidth & latency

• Latency (time for a single access)Memory access time >> Processor cycle time

• Bandwidth (number of accesses per unit time)if fraction m of instructions access memory,

Þ1+m memory references / instructionÞ CPI = 1 requires 1+m memory refs / cycle

L03-2

Page 6: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Memory Technology• Early machines used a variety of memory technologies

– Manchester Mark I used CRT Memory Storage– EDVAC used a mercury delay line

• Core memory was first large scale reliable main memory– Invented by Forrester in late 40s at MIT for Whirlwind project– Bits stored as magnetization polarity on small ferrite cores threaded onto 2

dimensional grid of wires

February 11, 2020 L03-3

Page 7: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Memory Technology• Early machines used a variety of memory technologies

– Manchester Mark I used CRT Memory Storage– EDVAC used a mercury delay line

• Core memory was first large scale reliable main memory– Invented by Forrester in late 40s at MIT for Whirlwind project– Bits stored as magnetization polarity on small ferrite cores threaded onto 2

dimensional grid of wires

• First commercial DRAM was Intel 1103– 1Kbit of storage on single chip– charge on a capacitor used to hold value

• Semiconductor memory quickly replaced core in 1970s– Intel formed to exploit market for semiconductor memory

February 11, 2020 L03-3

Page 8: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Memory Technology• Early machines used a variety of memory technologies

– Manchester Mark I used CRT Memory Storage– EDVAC used a mercury delay line

• Core memory was first large scale reliable main memory– Invented by Forrester in late 40s at MIT for Whirlwind project– Bits stored as magnetization polarity on small ferrite cores threaded onto 2

dimensional grid of wires

• First commercial DRAM was Intel 1103– 1Kbit of storage on single chip– charge on a capacitor used to hold value

• Semiconductor memory quickly replaced core in 1970s– Intel formed to exploit market for semiconductor memory

• Flash memory– Slower, but denser than DRAM. Also non-volatile, but with wearout issues

• Phase change memory (PCM, 3D XPoint)– Slightly slower, but much denser than DRAM and non-volatile

February 11, 2020 L03-3

Page 9: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

DRAM Architecture

February 11, 2020

Memory cell(one bit)

L03-4

Page 10: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

DRAM Architecture

February 11, 2020

Row

Add

ress

D

ecod

erCol.1

Col.2M

Row 1

Row 2N

Column Decoder & Sense Amplifiers

M

N

N+M

bit linesword lines

DData

• Bits stored in 2-dimensional arrays on chip

Memory cell(one bit)

L03-4

Page 11: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

DRAM Architecture

February 11, 2020

Row

Add

ress

D

ecod

erCol.1

Col.2M

Row 1

Row 2N

Column Decoder & Sense Amplifiers

M

N

N+M

bit linesword lines

DData

• Bits stored in 2-dimensional arrays on chip• Question: why read the entire row?

Memory cell(one bit)

L03-4

Page 12: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

DRAM Architecture

February 11, 2020

Row

Add

ress

D

ecod

erCol.1

Col.2M

Row 1

Row 2N

Column Decoder & Sense Amplifiers

M

N

N+M

bit linesword lines

DData

• Bits stored in 2-dimensional arrays on chip• Question: why read the entire row?• Modern chips have around 8 logical banks on each chip

– each logical bank physically implemented as many smaller arrays

Memory cell(one bit)

L03-4

Page 13: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

DRAM timing

February 11, 2020

DRAM Spec:CL, tRCD, tRP, tRAS, e.g., 9-9-9-24

L03-5

Page 14: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Processor-DRAM Gap (latency)

February 11, 2020 L03-6

Page 15: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Processor-DRAM Gap (latency)

February 11, 2020

Four-issue 2GHz superscalar accessing 100ns DRAM could execute 800 instructions during time for one memory access!

L03-6

Page 16: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Little’s Law

February 11, 2020

Throughput (T) = Number in Flight (N) / Latency (L)

MemoryCPUTable of

accesses in flight

Example:--- Assume infinite-bandwidth memory--- 100 cycles / memory reference--- 1 + 0.2 memory references / instruction

L03-7

Page 17: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Little’s Law

February 11, 2020

Throughput (T) = Number in Flight (N) / Latency (L)

MemoryCPUTable of

accesses in flight

Example:--- Assume infinite-bandwidth memory--- 100 cycles / memory reference--- 1 + 0.2 memory references / instruction

Þ Table size = 1.2 * 100 = 120 entries

120 independent memory operations in flight!L03-7

Page 18: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

6-Transistor SRAM Cell

bit bit

word(row select)

10

0 1

Basic Static RAM Cell

February 11, 2020

• Write:1. Drive bit lines (bit=1, bit=0)2. Select word line

• Read:1. Precharge bit and bit to Vdd2. Select word line3. Cell pulls one bit line low4. Column sense amp detects difference between bit & bit

bit bit

word

10

L03-8

Page 19: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Multilevel Memory

February 11, 2020

Strategy: Reduce average latency using small, fast memories called caches.

Caches are a mechanism to reduce memory latency based on the empirical observation that the patterns of memory references made by a processor are often highly predictable:

L03-9

Page 20: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Multilevel Memory

February 11, 2020

Strategy: Reduce average latency using small, fast memories called caches.

Caches are a mechanism to reduce memory latency based on the empirical observation that the patterns of memory references made by a processor are often highly predictable:

PC… 96

loop: add r2, r1, r1 100subi r3, r3, #1 104bnez r3, loop 108… 112

L03-9

Page 21: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Typical Memory Reference Patterns

February 11, 2020

Address

Time

Instructionfetches

n loop iterations

L03-10

Page 22: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Typical Memory Reference Patterns

February 11, 2020

Address

Time

Instructionfetches

Stackaccesses

n loop iterations

L03-10

Page 23: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Typical Memory Reference Patterns

February 11, 2020

Address

Time

Instructionfetches

Stackaccesses

n loop iterations

L03-10

Page 24: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Typical Memory Reference Patterns

February 11, 2020

Address

Time

Instructionfetches

Stackaccesses

n loop iterations

subroutine call

L03-10

Page 25: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Typical Memory Reference Patterns

February 11, 2020

Address

Time

Instructionfetches

Stackaccesses

n loop iterations

subroutine call

argument access

L03-10

Page 26: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Typical Memory Reference Patterns

February 11, 2020

Address

Time

Instructionfetches

Stackaccesses

n loop iterations

subroutine call

subroutine return

argument access

L03-10

Page 27: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Typical Memory Reference Patterns

February 11, 2020

Address

Time

Instructionfetches

Stackaccesses

Dataaccesses

n loop iterations

subroutine call

subroutine return

argument access

L03-10

Page 28: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Typical Memory Reference Patterns

February 11, 2020

Address

Time

Instructionfetches

Stackaccesses

Dataaccesses

n loop iterations

subroutine call

subroutine return

argument access

L03-10

Page 29: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Typical Memory Reference Patterns

February 11, 2020

Address

Time

Instructionfetches

Stackaccesses

Dataaccesses

n loop iterations

subroutine call

subroutine return

argument access

vector access

L03-10

Page 30: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Typical Memory Reference Patterns

February 11, 2020

Address

Time

Instructionfetches

Stackaccesses

Dataaccesses

n loop iterations

subroutine call

subroutine return

argument access

vector access

scalar accesses

L03-10

Page 31: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Common Predictable Patterns

Two predictable properties of memory references:

– Temporal Locality: If a location is referenced it is likely to be referenced again in the near future.

– Spatial Locality: If a location is referenced it is likely that locations near it will be referenced in the near future.

February 11, 2020 L03-11

Page 32: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Memory Hierarchy

February 11, 2020

• size: Register << SRAM << DRAM why?• latency: Register << SRAM << DRAM why?• bandwidth: on-chip >> off-chip why?

Small,Fast

Memory(RF, SRAM)

CPUBig, Slow Memory(DRAM)

A B

holds frequently used data

L03-12

Page 33: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Memory Hierarchy

February 11, 2020

• size: Register << SRAM << DRAM why?• latency: Register << SRAM << DRAM why?• bandwidth: on-chip >> off-chip why?

On a data access:hit (data Î fast memory) Þ low latency accessmiss (data Ï fast memory) Þ long latency access (DRAM)

Small,Fast

Memory(RF, SRAM)

CPUBig, Slow Memory(DRAM)

A B

holds frequently used data

L03-12

Page 34: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Typical memory hierarchies

February 11, 2020 L03-13

Page 35: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Management of Memory Hierarchy

February 11, 2020

• Small/fast storage, e.g., registers– Address usually specified in instruction– Generally implemented directly as a register file

• but hardware might do things behind software’s back, e.g., stack management, register renaming

• Large/slower storage, e.g., memory– Address usually computed from values in register– Generally implemented as a cache hierarchy

• hardware decides what is kept in fast memory• but software may provide “hints”, e.g., don’t cache or

prefetch

L03-14

Page 36: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Inside a Cache

February 11, 2020

CACHEProcessor MainMemory

Address Address

DataData

AddressTag

Data Block

DataByte

DataByte

DataByte

Line100

304

6848

copy of main memorylocation 100

copy of main memorylocation 101

416

L03-15

Page 37: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Inside a Cache

February 11, 2020

CACHEProcessor MainMemory

Address Address

DataData

AddressTag

Data Block

DataByte

DataByte

DataByte

Line100

304

6848

copy of main memorylocation 100

copy of main memorylocation 101

416

Q: How many bits needed in tag? ___________________________Q: Why not use very small/big block size?

L03-15

Page 38: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Inside a Cache

February 11, 2020

CACHEProcessor MainMemory

Address Address

DataData

AddressTag

Data Block

DataByte

DataByte

DataByte

Line100

304

6848

copy of main memorylocation 100

copy of main memorylocation 101

416

Q: How many bits needed in tag? ___________________________Q: Why not use very small/big block size?

Enough to uniquely identify block

L03-15

Page 39: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Direct-Mapped Cache

February 11, 2020

Tag Data BlockV

2k

lines

L03-16

Page 40: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Direct-Mapped Cache

February 11, 2020

Tag Data BlockV

2k

lines

OffsetTag Index

Block number Block offset

L03-16

Page 41: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Direct-Mapped Cache

February 11, 2020

Tag Data BlockV k

2k

lines

OffsetTag Index

Block number Block offset

L03-16

Page 42: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Direct-Mapped Cache

February 11, 2020

Tag Data BlockV k

=t

t

HIT

2k

lines

OffsetTag Index

Block number Block offset

L03-16

Page 43: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Direct-Mapped Cache

February 11, 2020

Tag Data BlockV k

=t

t

HIT

b

Data Word or Byte

2k

lines

OffsetTag Index

Block number Block offset

L03-16

Page 44: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Direct-Mapped Cache

February 11, 2020

Tag Data BlockV k

=t

t

HIT

b

Data Word or Byte

2k

lines

OffsetTag Index

Block number Block offset

Q: What is a bad reference pattern? ____________________

L03-16

Page 45: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Direct-Mapped Cache

February 11, 2020

Tag Data BlockV k

=t

t

HIT

b

Data Word or Byte

2k

lines

OffsetTag Index

Block number Block offset

Q: What is a bad reference pattern? ____________________Strided at size of cache

L03-16

Page 46: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Direct Map Address Selectionhigher-order vs. lower-order address bits

February 11, 2020

Tag Data BlockV

=

OffsetIndextk

b

t

HIT Data Word or Byte

2k

lines

Tag

Q: Why might this be undesirable? ________________________

L03-17

Page 47: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Direct Map Address Selectionhigher-order vs. lower-order address bits

February 11, 2020

Tag Data BlockV

=

OffsetIndextk

b

t

HIT Data Word or Byte

2k

lines

Tag

Q: Why might this be undesirable? ________________________Spatially local blocks conflict

L03-17

Page 48: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Hashed Address Mapping

February 11, 2020

Tag Data BlockV

=

Offset

tb

t

HIT Data Word or Byte

2k

lines

Address

Hash

Q: What are the tradeoffs of hashing?

L03-18

Page 49: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Hashed Address Mapping

February 11, 2020

Tag Data BlockV

=

Offset

tb

t

HIT Data Word or Byte

2k

lines

Address

Hash

Q: What are the tradeoffs of hashing?Good: Regular strides don’t conflictBad: Hash adds latency

Tag is largerL03-18

Page 50: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

2-Way Set-Associative Cache

February 11, 2020

Tag Data BlockV

BlockOffset

Tag Index

Tag Data BlockV

L03-19

Page 51: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

2-Way Set-Associative Cache

February 11, 2020

Tag Data BlockV

BlockOffset

Tag Index

kTag Data BlockV

L03-19

Page 52: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

2-Way Set-Associative Cache

February 11, 2020

Tag Data BlockV

BlockOffset

Tag Index

kTag Data BlockV

=

t

HIT

=

t

L03-19

Page 53: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

2-Way Set-Associative Cache

February 11, 2020

Tag Data BlockV

BlockOffset

Tag Index

kTag Data BlockV

b

DataWordor Byte

=

t

HIT

=

t

L03-19

Page 54: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Fully Associative Cache

February 11, 2020

Tag Data BlockV

=

Blo

ckO

ffse

tTa

g

t

b

HIT

DataWordor Byte

=

=

t

L03-20

Page 55: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Placement Policy

February 11, 2020

Set Number

Cache

0 1 2 3 4 5 6 7 8 91 1 1 1 1 1 1 1 1 1 0 1 2 3 4 5 6 7 8 9

2 2 2 2 2 2 2 2 2 2 0 1 2 3 4 5 6 7 8 9

3 30 1

Memory

Block Number

block 12 can be placed

0 1 2 3 4 5 6 7

DirectMappedonly intoblock 4

(12 mod 8)

L03-21

Page 56: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Placement Policy

February 11, 2020

Set Number

Cache

0 1 2 3 4 5 6 7 8 91 1 1 1 1 1 1 1 1 1 0 1 2 3 4 5 6 7 8 9

2 2 2 2 2 2 2 2 2 2 0 1 2 3 4 5 6 7 8 9

3 30 1

Memory

Block Number

block 12 can be placed

0 1 2 3 4 5 6 7

DirectMappedonly intoblock 4

(12 mod 8)

0 1 2 3

(2-way) SetAssociativeanywhere in

set 0(12 mod 4)

L03-21

Page 57: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Placement Policy

February 11, 2020

Set Number

Cache

0 1 2 3 4 5 6 7 8 91 1 1 1 1 1 1 1 1 1 0 1 2 3 4 5 6 7 8 9

2 2 2 2 2 2 2 2 2 2 0 1 2 3 4 5 6 7 8 9

3 30 1

Memory

Block Number

block 12 can be placed

0 1 2 3 4 5 6 7

DirectMappedonly intoblock 4

(12 mod 8)

FullyAssociativeanywhere

0 1 2 3

(2-way) SetAssociativeanywhere in

set 0(12 mod 4)

L03-21

Page 58: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Cache Algorithm (Read)

February 11, 2020

Look at Processor Address, search cache tags to find match. Then either

L03-22

Page 59: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Cache Algorithm (Read)

February 11, 2020

Look at Processor Address, search cache tags to find match. Then either

Found in cachea.k.a. HIT

Return copyof data fromcache

L03-22

Page 60: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Cache Algorithm (Read)

February 11, 2020

Look at Processor Address, search cache tags to find match. Then either

Found in cachea.k.a. HIT

Return copyof data fromcache

Not in cachea.k.a. MISS

Read block of data fromMain Memory

Wait …

Return data to processorand update cache

L03-22

Page 61: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Cache Algorithm (Read)

February 11, 2020

Look at Processor Address, search cache tags to find match. Then either

Found in cachea.k.a. HIT

Return copyof data fromcache

Not in cachea.k.a. MISS

Read block of data fromMain Memory

Wait …

Return data to processorand update cache

Which line do we replace?

L03-22

Page 62: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Improving Cache Performance

February 11, 2020

Average memory access time =Hit time + Miss rate x Miss penalty

To improve performance:• reduce the hit time• reduce the miss rate (e.g., larger, better policy)• reduce the miss penalty (e.g., L2 cache)

What is the simplest design strategy?

L03-23

Page 63: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Improving Cache Performance

February 11, 2020

Average memory access time =Hit time + Miss rate x Miss penalty

To improve performance:• reduce the hit time• reduce the miss rate (e.g., larger, better policy)• reduce the miss penalty (e.g., L2 cache)

What is the simplest design strategy?

Biggest cache that doesn’t increase hit time past 1-2 cycles (approx. 16-64KB in modern technology)

[design issues more complex with out-of-order superscalar processors]

L03-23

Page 64: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Causes for Cache Misses

February 11, 2020

• Compulsory: First reference to a block a.k.a. cold start misses

- misses that would occur even with infinite cache

• Capacity:cache is too small to hold all data the program needs

- misses that would occur even under perfectplacement & replacement policy

• Conflict:misses from collisions due to block-placement strategy

- misses that would not occur with full associativity

L03-24

Page 65: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Effect of Cache Parameters on Performance

February 11, 2020

Largercapacity

cache

Higher associativity

cache

Larger block size cache *

Compulsory misses

Capacity misses

Conflict misses

Hit latency

Miss latency

* Assume substantial spatial localityL03-25

Page 66: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Effect of Cache Parameters on Performance

February 11, 2020

Largercapacity

cache

Higher associativity

cache

Larger block size cache *

Compulsory misses

Capacity misses

Conflict misses

Hit latency

Miss latency

* Assume substantial spatial localityL03-25

Page 67: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Effect of Cache Parameters on Performance

February 11, 2020

Largercapacity

cache

Higher associativity

cache

Larger block size cache *

Compulsory misses

Capacity misses

Conflict misses

Hit latency

Miss latency

* Assume substantial spatial localityL03-25

Page 68: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Effect of Cache Parameters on Performance

February 11, 2020

Largercapacity

cache

Higher associativity

cache

Larger block size cache *

Compulsory misses

Capacity misses

Conflict misses

Hit latency

Miss latency

* Assume substantial spatial localityL03-25

Page 69: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Effect of Cache Parameters on Performance

February 11, 2020

Largercapacity

cache

Higher associativity

cache

Larger block size cache *

Compulsory misses

Capacity misses

Conflict misses

Hit latency

Miss latency

* Assume substantial spatial localityL03-25

Page 70: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Effect of Cache Parameters on Performance

February 11, 2020

Largercapacity

cache

Higher associativity

cache

Larger block size cache *

Compulsory misses

Capacity misses

Conflict misses

Hit latency

Miss latency

* Assume substantial spatial localityL03-25

Page 71: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Effect of Cache Parameters on Performance

February 11, 2020

Largercapacity

cache

Higher associativity

cache

Larger block size cache *

Compulsory misses

Capacity misses

Conflict misses

Hit latency

Miss latency

* Assume substantial spatial localityL03-25

Page 72: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Effect of Cache Parameters on Performance

February 11, 2020

Largercapacity

cache

Higher associativity

cache

Larger block size cache *

Compulsory misses

Capacity misses

Conflict misses

Hit latency

Miss latency

* Assume substantial spatial localityL03-25

Page 73: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Effect of Cache Parameters on Performance

February 11, 2020

Largercapacity

cache

Higher associativity

cache

Larger block size cache *

Compulsory misses

Capacity misses

Conflict misses

Hit latency

Miss latency

* Assume substantial spatial localityL03-25

Page 74: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Effect of Cache Parameters on Performance

February 11, 2020

Largercapacity

cache

Higher associativity

cache

Larger block size cache *

Compulsory misses

Capacity misses

Conflict misses

Hit latency

Miss latency

* Assume substantial spatial localityL03-25

Page 75: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Effect of Cache Parameters on Performance

February 11, 2020

Largercapacity

cache

Higher associativity

cache

Larger block size cache *

Compulsory misses

Capacity misses

Conflict misses

Hit latency

Miss latency

* Assume substantial spatial localityL03-25

Page 76: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Effect of Cache Parameters on Performance

February 11, 2020

Largercapacity

cache

Higher associativity

cache

Larger block size cache *

Compulsory misses

Capacity misses

Conflict misses

Hit latency

Miss latency

* Assume substantial spatial localityL03-25

Page 77: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Effect of Cache Parameters on Performance

February 11, 2020

Largercapacity

cache

Higher associativity

cache

Larger block size cache *

Compulsory misses

Capacity misses

Conflict misses

Hit latency

Miss latency

* Assume substantial spatial localityL03-25

Page 78: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Effect of Cache Parameters on Performance

February 11, 2020

Largercapacity

cache

Higher associativity

cache

Larger block size cache *

Compulsory misses

Capacity misses

Conflict misses

Hit latency

Miss latency

?

* Assume substantial spatial localityL03-25

Page 79: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Effect of Cache Parameters on Performance

February 11, 2020

Largercapacity

cache

Higher associativity

cache

Larger block size cache *

Compulsory misses

Capacity misses

Conflict misses

Hit latency

Miss latency

?

* Assume substantial spatial localityL03-25

Page 80: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Effect of Cache Parameters on Performance

February 11, 2020

Largercapacity

cache

Higher associativity

cache

Larger block size cache *

Compulsory misses

Capacity misses

Conflict misses

Hit latency

Miss latency

?

* Assume substantial spatial localityL03-25

Page 81: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Block-level Optimizations

• Tags are too large, i.e., too much overhead– Simple solution: Larger blocks, but miss penalty could be

large.

February 11, 2020 L03-26

Page 82: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Block-level Optimizations

• Tags are too large, i.e., too much overhead– Simple solution: Larger blocks, but miss penalty could be

large.

• Sub-block placement (aka sector cache)– A valid bit added to units smaller than the full block, called

sub-blocks– Only read a sub-block on a miss– If a tag matches, is the sub-block in the cache?

February 11, 2020

100300204

1 1 1 1 1 1 0 00 1 0 1

L03-26

Page 83: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Replacement Policy

February 11, 2020

Which block from a set should be evicted?

• Random

L03-27

Page 84: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Replacement Policy

February 11, 2020

Which block from a set should be evicted?

• Random

L03-27

Page 85: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Replacement Policy

February 11, 2020

Which block from a set should be evicted?

• Random

• Least Recently Used (LRU)• LRU cache state must be updated on every access• true implementation only feasible for small sets (2-way)• pseudo-LRU binary tree was often used for 4-8 way

L03-27

Page 86: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Replacement Policy

February 11, 2020

Which block from a set should be evicted?

• Random

• Least Recently Used (LRU)• LRU cache state must be updated on every access• true implementation only feasible for small sets (2-way)• pseudo-LRU binary tree was often used for 4-8 way

• First In, First Out (FIFO) a.k.a. Round-Robin• used in highly associative caches

L03-27

Page 87: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Replacement Policy

February 11, 2020

Which block from a set should be evicted?

• Random

• Least Recently Used (LRU)• LRU cache state must be updated on every access• true implementation only feasible for small sets (2-way)• pseudo-LRU binary tree was often used for 4-8 way

• First In, First Out (FIFO) a.k.a. Round-Robin• used in highly associative caches

• Not Least Recently Used (NLRU)• FIFO with exception for most recently used block or blocks

L03-27

Page 88: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Replacement Policy

February 11, 2020

Which block from a set should be evicted?

• Random

• Least Recently Used (LRU)• LRU cache state must be updated on every access• true implementation only feasible for small sets (2-way)• pseudo-LRU binary tree was often used for 4-8 way

• First In, First Out (FIFO) a.k.a. Round-Robin• used in highly associative caches

• Not Least Recently Used (NLRU)• FIFO with exception for most recently used block or blocks

• One-bit LRU• Each way represented by a bit. Set on use, replace first unused.

L03-27

Page 89: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Multilevel Caches

• A memory cannot be large and fast• Add level of cache to reduce miss penalty

– Each level can have longer latency than level above– So, increase sizes of cache at each level

February 11, 2020

CPU L1 L2 DRAM

Metrics:

Local miss rate = misses in cache/ accesses to cache

Global miss rate = misses in cache / CPU memory accesses

Misses per instruction (MPI) = misses in cache / number of instructions

L03-29

Page 90: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Victim Caches (HP 7200)

February 11, 2020

Victim cache is a small associative back up cache, added to a direct mapped cache, which holds recently evicted lines• First look up in direct mapped cache• If miss, look in victim cache• If hit in victim cache, swap hit line with line now evicted from L1• If miss in victim cache, L1 victim -> VC, VC victim->?Fast hit time of direct mapped but with reduced conflict misses

L1 Data Cache

Unified L2 Cache

RF

CPU

Evicted data from L1

Evicted data from VCwhere ?

Hit data (miss in L1)Victim CacheFA, 4 blocks

L03-30

Page 91: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Inclusion Policy

• Inclusive multilevel cache: – Inner cache holds copies of data in outer cache– External access need only check outer cache– Most common case

• Exclusive multilevel caches:– Inner cache may hold data not in outer cache– Swap lines between inner/outer caches on miss– Used in AMD Athlon with 64KB primary and 256KB secondary

cache

• Non-inclusive multilevel caches:– Intel Skylake– ARM

February 11, 2020 L03-31

Page 92: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Inclusion Policy

• Inclusive multilevel cache: – Inner cache holds copies of data in outer cache– External access need only check outer cache– Most common case

• Exclusive multilevel caches:– Inner cache may hold data not in outer cache– Swap lines between inner/outer caches on miss– Used in AMD Athlon with 64KB primary and 256KB secondary

cache

• Non-inclusive multilevel caches:– Intel Skylake– ARM

Why choose one type or the other?February 11, 2020 L03-31

Page 93: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

HBM DRAM or MCDRAM

February 11, 2020

Source: AMD

L03-32

Page 94: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

MIT 6.823 Spring 2020

Mixed technology caching(Intel Knights Landing)

February 11, 2020

MCDRAM(as mem)

DDRL2

CPU

MCDRAM(as cache)

L2

CPUDDR

MCDRAM(as cache)

MCDRAM(as mem)

L2

CPU DDR

L03-33

Page 95: Cache Organizationcsg.csail.mit.edu/6.823S20/Lectures/L03split.pdf · 2020. 2. 11. · Cache Organization. MIT 6.823 Spring 2020 CPU-Memory Bottleneck February 11, 2020 CPU Memory

L03-100MIT 6.823 Spring 2020

Thank you!

Next lecture:Virtual memory