Module #7 – Memory Management

▪ Lecture #2

• Cache memory

▪ Readings: Silberschatz, 9.1-9.8

Exploiting Locality of Reference

▪ Exploit locality of reference by keeping a subset of the instructions and data values in high-speed storage (with a mechanism to change that subset when necessary).

▪ The processor checks high-speed storage first; if the item is not found, it is copied from slower storage.

Exploiting Locality of Reference

▪ Some systems have two or three levels of cache.

▪ Level 1 cache (usually split into an instruction cache and a data cache) is smallest and fastest.

▪ Level 2 cache is larger and slower (Level 3 cache is even larger and slower).

Cache and RAM Configuration

▪ RAM is much larger than cache, so cache can only hold a subset of RAM.

▪ RAM is viewed as a sequence of fixed-size blocks (N bytes).

▪ Each cache slot (line) can hold one block (N bytes).

▪ Each cache slot has associated control bits (valid, tag, etc.).

▪ The unit of transfer between RAM and cache is one block.
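As a rough illustration of the block view of RAM, the sketch below computes which block an address falls in and where the byte sits inside that block. The 4-byte block size and the helper name `locate` are assumptions chosen for the example, not part of the slides.

```python
BLOCK_SIZE = 4  # bytes per block (assumed for illustration)

def locate(address):
    """Return (block number, byte offset within the block) for an address."""
    return address // BLOCK_SIZE, address % BLOCK_SIZE

block, offset = locate(0x1006)
print(f"block {block:#x}, offset {offset}")  # address 0x1006 lies in block 0x401 at offset 2
```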

[Figure: RAM and cache; each cache line holds control bits and a data block (4 bytes)]

Read (Load) Operation

Read hit – if desired item is already present in cache, simply copy item from cache to CPU.

• Control info sent to cache – hit

• Desired item copied from cache to CPU

[Figure: CPU, Cache, RAM]

Read (Load) Operation

Read miss – if desired item is not already present in cache, copy block containing item from RAM to cache.

• Control info sent to cache – miss

• Address sent to RAM

• Block containing desired item copied into cache

• Desired item copied from cache to CPU

[Figure: CPU, Cache, RAM]
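The read hit and read miss paths can be sketched roughly as follows. This is a toy model under stated assumptions: the cache is an unbounded dictionary keyed by block number (no placement or eviction), blocks are 4 bytes, and the names `ram`, `cache`, and `read_byte` are hypothetical.

```python
BLOCK_SIZE = 4                  # bytes per block (assumed)
ram = bytearray(range(64))      # toy RAM: byte i holds the value i
cache = {}                      # block number -> copy of that block

def read_byte(address):
    """Return (value, 'hit' or 'miss') for a one-byte load."""
    block = address // BLOCK_SIZE
    if block in cache:
        outcome = "hit"
    else:
        outcome = "miss"
        start = block * BLOCK_SIZE
        # On a miss the whole block is copied from RAM into the cache.
        cache[block] = bytearray(ram[start:start + BLOCK_SIZE])
    # In both cases the item is delivered to the CPU from the cache copy.
    return cache[block][address % BLOCK_SIZE], outcome

print(read_byte(9))   # first access to block 2
print(read_byte(10))  # second access to the same block
```

The second load is served from the cached block, which is the spatial-locality benefit of transferring a whole block on a miss.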

Write (Store) Operation

Write hit – if desired item is already present in cache, simply copy item from CPU to cache.

• Control info sent to cache – hit

• Desired item copied from CPU to cache

[Figure: CPU, Cache, RAM]

Write (Store) Operation

Write miss – if desired item is not already present in cache, copy block containing item from RAM to cache.

• Control info sent to cache – miss

• Address sent to RAM

• Block containing desired item copied into cache

• Desired item copied from CPU to cache

[Figure: CPU, Cache, RAM]

Write Policies

▪ After a write operation, the copy of the block in cache differs from the copy in RAM, so a strategy is needed to bring RAM up to date.

▪ Write through: whenever a cache block is changed, the block is written (copied) to RAM.

▪ Write back: cache block is only written (copied) to RAM when the cache line is evicted (replaced).

• multiple store instructions can occur before block has to be written to RAM

• modified bit used to indicate that block has been changed (and must be written to RAM)
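To make the difference between the two policies concrete, here is a minimal sketch that counts how many times a block is copied to RAM under each policy. The class `ToyLine`, the helper `ram_writes_for`, and the 4-byte block size are all assumptions made for illustration.

```python
BLOCK_SIZE = 4  # bytes per block (assumed for illustration)

class ToyLine:
    """One cache line holding one block; counts block copies to RAM."""
    def __init__(self, ram, block, write_back):
        self.ram, self.block, self.write_back = ram, block, write_back
        start = block * BLOCK_SIZE
        self.data = bytearray(ram[start:start + BLOCK_SIZE])
        self.modified = False   # the modified (dirty) bit
        self.ram_writes = 0

    def _copy_to_ram(self):
        start = self.block * BLOCK_SIZE
        self.ram[start:start + BLOCK_SIZE] = self.data
        self.ram_writes += 1
        self.modified = False

    def store(self, offset, value):
        self.data[offset] = value
        if self.write_back:
            self.modified = True    # defer the RAM update until eviction
        else:
            self._copy_to_ram()     # write through: update RAM immediately

    def evict(self):
        if self.modified:           # only a changed block must be copied
            self._copy_to_ram()

def ram_writes_for(write_back, stores=4):
    """Store to one block several times, evict it, and count RAM block writes."""
    ram = bytearray(16)
    line = ToyLine(ram, block=1, write_back=write_back)
    for offset in range(stores):
        line.store(offset, 0xAA)
    line.evict()
    return line.ram_writes

print("write through:", ram_writes_for(False))  # one RAM write per store
print("write back:   ", ram_writes_for(True))   # one RAM write at eviction
```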

Cache Organizations

Several different cache organizations have been developed:

• Direct mapped

• Fully associative

• Set associative

Direct mapped and fully associative are the two ends of the spectrum; set associative is in between.

Direct Mapped

Mapping function: I = J mod M

I = cache line number

J = main memory block number

M = number of lines in the cache
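The mapping function can be tried out directly. In this small sketch the cache size of 8 lines and the function name `cache_line` are assumptions picked to keep the numbers readable.

```python
def cache_line(block_number, num_lines):
    """Direct-mapped placement: I = J mod M."""
    return block_number % num_lines

M = 8  # number of cache lines (assumed small for illustration)
for J in (3, 11, 19, 20):
    print(f"block {J} -> line {cache_line(J, M)}")
```

Blocks 3, 11, and 19 all land on line 3, so only one of them can be in the cache at a time.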

Example #1

▪ Block size: 4 bytes

▪ RAM: 16 MB (24-bit addresses)

▪ RAM is viewed as 2^22 blocks of 4 bytes each (16 MB / 4 bytes)

▪ Cache: 64 KB for data blocks

▪ Cache is organized as 2^14 lines, where each line holds 4 bytes (64 KB / 4 bytes)

▪ Control bits associated with each cache line

Example (2)

Address (24 bits) viewed as three fields:

• Offset: 2 bits to identify byte within block

• Line: 14 bits to identify cache line

• Tag: 8 bits (remaining bits)

Tag      Line      Offset
8 bits   14 bits   2 bits

Example (3)

Address: 16339C (hexadecimal)

in binary:

000101100011001110011100

Tag: 00010110 (hex 16)

Line: 00110011100111 (hex 0CE7)

Byte: 00 (hex 0)
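The field split for this address can be checked with shifts and masks. The sketch below assumes the 8/14/2 layout of Example #1; the function name `split_address` is hypothetical.

```python
OFFSET_BITS = 2   # byte within block
LINE_BITS = 14    # cache line number

def split_address(address):
    """Split a 24-bit address into (tag, line, offset)."""
    offset = address & ((1 << OFFSET_BITS) - 1)
    line = (address >> OFFSET_BITS) & ((1 << LINE_BITS) - 1)
    tag = address >> (OFFSET_BITS + LINE_BITS)
    return tag, line, offset

tag, line, offset = split_address(0x16339C)
print(f"tag={tag:02X} line={line:04X} offset={offset}")  # tag=16 line=0CE7 offset=0
```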

Example (4)

Cache line    Addresses of RAM blocks
0             000000, 010000, …, FF0000
1             000004, 010004, …, FF0004
2             000008, 010008, …, FF0008
...
2^14 - 1      00FFFC, 01FFFC, …, FFFFFC
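Each row of this table can be regenerated by fixing the line field and stepping through all 256 tag values. This is a sketch under the Example #1 layout (8-bit tag, 14-bit line, 2-bit offset); `blocks_for_line` is a hypothetical helper name.

```python
OFFSET_BITS, LINE_BITS, TAG_BITS = 2, 14, 8

def blocks_for_line(line):
    """Start addresses of every RAM block that maps to the given cache line."""
    return [(tag << (LINE_BITS + OFFSET_BITS)) | (line << OFFSET_BITS)
            for tag in range(1 << TAG_BITS)]

addrs = blocks_for_line(1)
print(len(addrs), f"{addrs[0]:06X} {addrs[1]:06X} {addrs[-1]:06X}")  # 256 000004 010004 FF0004
```

So 256 different RAM blocks compete for each cache line in this configuration.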

Determining Hit or Miss

When the cache controller checks a particular cache line, it needs to determine if the desired item is already in the cache or not (hit or miss).

• Check the Valid bit.

• Compare the tag from the address and the tag from the cache line.

• If the entry is valid and the tags match, the item is present in the cache.
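The hit test described above comes down to two checks. The following sketch models a cache line as a named tuple; `CacheLine` and `is_hit` are names assumed for illustration.

```python
from collections import namedtuple

CacheLine = namedtuple("CacheLine", "valid tag")

def is_hit(line, address_tag):
    """Hit only if the entry is valid AND its tag equals the address tag."""
    return line.valid == 1 and line.tag == address_tag

print(is_hit(CacheLine(valid=1, tag=0x16), 0x16))  # True: valid and tags match
print(is_hit(CacheLine(valid=0, tag=0x16), 0x16))  # False: entry not valid
print(is_hit(CacheLine(valid=1, tag=0x17), 0x16))  # False: tags differ
```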

Example #2

▪ Address (32 bits) viewed as three fields:

• Byte offset: 8 bits to identify byte within block

• Line: 4 bits to identify cache line

• Tag: 20 bits (remaining bits)

▪ Example: FFF7C408

11111111111101111100010000001000

Example (2)

▪ How many lines in the cache?

2^4 = 16 lines

▪ How many bytes in one block?

2^8 = 256 bytes

▪ How many control bits in one line?

V + M + Tag = 1 + 1 + 20 = 22 bits

▪ How many total bits in one line?

control + data = 22 + 2048 = 2070 bits
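The same arithmetic, written out as a short check. The variable names are assumptions; the field widths are the ones given for Example #2.

```python
TAG_BITS, LINE_BITS, OFFSET_BITS = 20, 4, 8

lines = 2 ** LINE_BITS                        # number of cache lines
block_bytes = 2 ** OFFSET_BITS                # bytes in one block
control_bits = 1 + 1 + TAG_BITS               # V + M + tag
total_bits = control_bits + 8 * block_bytes   # control bits plus data bits

print(lines, block_bytes, control_bits, total_bits)  # 16 256 22 2070
```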

Example (3)

      V  M  Tag           V  M  Tag
      -  -  -----         -  -  -----
[0]:  1  0  FF641   [8]:  0  0  0004A
[1]:  1  0  00014   [9]:  1  0  00028
[2]:  1  0  0003A   [A]:  1  0  00028
[3]:  0  1  FF593   [B]:  1  1  FFF7C
[4]:  1  1  FFF7C   [C]:  0  1  00EA1
[5]:  1  0  00014   [D]:  1  0  00028
[6]:  0  0  00014   [E]:  1  1  0003A
[7]:  1  0  00014   [F]:  1  1  0003A

Example (4)

▪ Index – line number (not stored)

▪ Valid bit (V) – initially 0, set to 1 when that entry in the cache is in use

▪ Modified bit (M) – set to 1 when at least one byte in the block has been modified by a "write" operation (sometimes called the dirty bit)

▪ Tag bits – compared to tag bits from address

▪ Block – 256 bytes (not shown)

Example (5)

▪ Consider the cache entry at index 4:

[4]: 1 1 FFF7C

• What are the addresses of the first and last bytes in that cache entry?

first byte: FFF7C400

last byte: FFF7C4FF

• Have the contents of that cache block been modified?

Yes, M = 1
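The first and last byte addresses follow from putting the tag and the line number back together. A minimal sketch, assuming the 20/4/8 layout of Example #2 and a hypothetical helper `block_range`:

```python
OFFSET_BITS, LINE_BITS = 8, 4

def block_range(tag, line):
    """Return (first, last) byte address of the block held in a cache line."""
    first = (tag << (LINE_BITS + OFFSET_BITS)) | (line << OFFSET_BITS)
    last = first | ((1 << OFFSET_BITS) - 1)   # set all offset bits to 1
    return first, last

first, last = block_range(0xFFF7C, 4)
print(f"{first:08X} {last:08X}")  # FFF7C400 FFF7C4FF
```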

Example (6)

▪ Consider a request to read from address 00028A14

Line in address is A, so check cache line at index A:

[A]: 1 0 00028

Hit: V = 1 and tag in cache line matches tag in address

Transfer 4 bytes (offsets 14, 15, 16, 17) from cache block to CPU

Example (7)

▪ Consider a request to read from address 0007260C

Line in address is 6, so check cache line at index 6:

[6]: 0 0 00014

Miss: V = 0

Transfer 256 bytes from RAM to cache

Set V bit to 1

Set M bit to 0

Set tag to 00072

Transfer 4 bytes (offsets 0C, 0D, 0E, 0F) from cache block to CPU

Example (8)

▪ Consider a request to write to address 0003AED8

Line in address is E, so check cache line at index E:

[E]: 1 1 0003A

Hit: V = 1 and tag in cache line matches tag in address

Transfer 4 bytes from CPU to cache block (offsets D8, D9, DA, DB)

Set M bit to 1 (it was already 1 for this line)

▪ Note that some of the 256 bytes in the cache block are no longer the same as the corresponding bytes in RAM (copy block to RAM later)

Example (9)

▪ Consider a request to write to address 0003A344

Line in address is 3, so check cache line at index 3:

[3]: 0 1 FF593

Miss: V = 0 (the entry is not in use, so its M bit is ignored and nothing is written back)

Transfer 256 bytes from RAM to cache

Set V bit to 1

Set M bit to 0

Set tag to 0003A

Transfer 4 bytes (offsets 44, 45, 46, 47) from CPU to cache block

Set M bit to 1

Example (10)

▪ Consider a request to read from address 002C5934

Line in address is 9, so check cache line at index 9:

[9]: 1 0 00028

Miss: V = 1, but tags don't match (M = 0, so no write back is needed)

Transfer 256 bytes from RAM to cache

Set V bit to 1

Set M bit to 0

Set tag to 002C5

Transfer 4 bytes (offsets 34, 35, 36, 37) from cache block to CPU

Example (11)

▪ Consider a request to read from address 002D1F98

Line in address is F, so check cache line at index F:

[F]: 1 1 0003A

Miss: V = 1, but tags don't match

M = 1, so transfer 256 bytes from cache to RAM (write back)

Transfer 256 bytes from RAM to cache

Set V bit to 1

Set M bit to 0

Set tag to 002D1

Transfer 4 bytes (offsets 98, 99, 9A, 9B) from cache block to CPU
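The six requests in Examples (6) through (11) can be replayed against the control bits from the Example (3) table. The sketch below tracks only V, M, and the tag (no data bytes) and assumes write back on eviction of a valid, modified line; the names `Line`, `TABLE`, and `access` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Line:
    valid: int
    modified: int
    tag: int

# (V, M, tag) for lines 0..F, copied from the Example (3) table.
TABLE = [
    (1, 0, 0xFF641), (1, 0, 0x00014), (1, 0, 0x0003A), (0, 1, 0xFF593),
    (1, 1, 0xFFF7C), (1, 0, 0x00014), (0, 0, 0x00014), (1, 0, 0x00014),
    (0, 0, 0x0004A), (1, 0, 0x00028), (1, 0, 0x00028), (1, 1, 0xFFF7C),
    (0, 1, 0x00EA1), (1, 0, 0x00028), (1, 1, 0x0003A), (1, 1, 0x0003A),
]

def access(cache, address, is_write):
    """Apply one request; return (line index, outcome)."""
    tag = address >> 12             # upper 20 bits
    index = (address >> 8) & 0xF    # next 4 bits
    line = cache[index]
    if line.valid and line.tag == tag:
        outcome = "hit"
    else:
        # The old block is written back only if it is valid and modified.
        dirty = line.valid and line.modified
        outcome = "miss, write back" if dirty else "miss"
        line.valid, line.modified, line.tag = 1, 0, tag
    if is_write:
        line.modified = 1
    return index, outcome

REQUESTS = [
    (0x00028A14, False), (0x0007260C, False), (0x0003AED8, True),
    (0x0003A344, True), (0x002C5934, False), (0x002D1F98, False),
]

cache = [Line(*entry) for entry in TABLE]
results = [access(cache, addr, wr) for addr, wr in REQUESTS]
for (addr, wr), (index, outcome) in zip(REQUESTS, results):
    print(f"{'write' if wr else 'read '} {addr:08X} -> line {index:X}: {outcome}")
```

Only the last request triggers a write back, because line F is the only line replaced while both valid and modified.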

Block Size

▪ To exploit spatial locality, a cache slot must hold more than one item (one word).

▪ Block size is always a power of 2 (use the least significant N bits of the address to identify a specific byte within a block of 2^N bytes).

▪ Typical block sizes are 32 to 256 bytes, with 64 and 128 bytes as the most common.
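One reason a power-of-2 block size is convenient: the byte offset is just the low N bits of the address, so it can be extracted with a mask instead of a division. A small sketch, with `byte_offset` as an assumed helper name:

```python
def byte_offset(address, block_size):
    """Offset within a block; valid only when block_size is a power of 2."""
    assert block_size > 0 and block_size & (block_size - 1) == 0
    return address & (block_size - 1)   # keep the low N bits

for size in (32, 64, 128, 256):
    # The masked value agrees with the remainder for every power-of-2 size.
    print(size, byte_offset(0x1234, size) == 0x1234 % size)
```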

Block Size

[Figure: miss rate vs. block size for one benchmark]

Example: 64-byte blocks

▪ Addresses are 32 bits

▪ Cache characteristics:

• direct mapped

• write through

• 256 slots

• 16 words (64 bytes) per block

▪ Address subdivided into 3 fields:

• 18 bits (tag) + 8 bits (line) + 6 bits (byte offset)
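The 18 + 8 + 6 split follows from the slot count and the block size. A quick check, with variable names assumed for illustration:

```python
ADDRESS_BITS = 32
SLOTS = 256
BLOCK_BYTES = 64

offset_bits = BLOCK_BYTES.bit_length() - 1   # log2(64), exact for powers of 2
line_bits = SLOTS.bit_length() - 1           # log2(256)
tag_bits = ADDRESS_BITS - line_bits - offset_bits

print(tag_bits, line_bits, offset_bits)  # 18 8 6
```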
