LIRS : An Efficient Replacement Policy to Improve Buffer Cache Performance Song Jiang 1 and Xiaodong Zhang 1,2 1 College of William and Mary 2 National Science Foundation



LIRS: An Efficient Replacement Policy to Improve Buffer Cache Performance

Song Jiang (1) and Xiaodong Zhang (1, 2)

(1) College of William and Mary

(2) National Science Foundation


The Problem of LRU Replacement

• File scanning: blocks accessed only once are not replaced in a timely manner;

• Loop-like accesses: the blocks that will be accessed soonest can be replaced;

• Accesses with distinct frequencies: frequently accessed blocks can be replaced.

Inability to cope with weak access locality


Why does LRU Fail Sometimes?

• A recently used block will not necessarily be used again soon.

• LRU cannot deal with a working set larger than the available cache size.
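
The loop-like failure can be reproduced with a few lines of Python. This is a minimal sketch, not code from the talk: the function name `lru_hits` and the toy trace are assumptions chosen for illustration. It counts LRU hits on a loop over four blocks, first with a cache one block too small and then with a cache that fits the loop.

```python
from collections import OrderedDict


def lru_hits(trace, cache_size):
    """Count the hits of an LRU cache holding `cache_size` blocks."""
    cache = OrderedDict()  # keys ordered from least to most recently used
    hits = 0
    for block in trace:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # refresh the block's recency
        else:
            if len(cache) >= cache_size:
                cache.popitem(last=False)  # evict the least recently used block
            cache[block] = True
    return hits


# Hypothetical workload: a loop over 4 blocks, repeated 5 times.
loop = [0, 1, 2, 3] * 5
print(lru_hits(loop, 3))  # cache is one block smaller than the loop
print(lru_hits(loop, 4))  # cache holds the whole loop
```

With three cache blocks, every block is evicted just before it is needed again, so LRU never hits; with four blocks, only the first pass misses.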


LRU Merits

• Simplicity: affordable implementation

• Adaptability: responsive to access pattern changes


Our Objectives

Significant efforts have been made to improve LRU, but they are either

• Case by case; or

• High in runtime overhead

Our objectives:

• Address the limits of LRU fundamentally.

• Retain the low overhead and adaptability merits of LRU.


Outline

• Related Work

• The LIRS Algorithm

• LIRS Implementation Using LRU Stack

• Performance Evaluation

• Sensitivity and Overhead Analysis

• Conclusions


Related Work

• Aided by user-level hints

• Detection and adaptation of access regularities

• Tracing and utilizing deeper history information


User-level Hints

• Application-controlled file caching [Cao et al, USENIX’94]

• Application-informed prefetching and caching [Patterson et al, SOSP’95]

Rely on users’ understanding of data access patterns


Detection and Adaptation of Regularities

• SEQ: sequential access pattern detection [Glass et al, Sigmetrics’97]

• EELRU: on-line analysis of aggregate recency distributions of referenced blocks [Smaragdakis et al, Sigmetrics’99]

• DEAR: detection of multiple block reference patterns [Choi et al, USENIX’99]

• AFC: Application/File-level Characterization [Choi et al, Sigmetrics’00]

• UBM: Unified Buffer Management [Kim et al, OSDI’00]

Case-by-case oriented approaches


Tracing and Utilizing Access History

• LRFU: combine LRU and LFU [Lee et al, Sigmetrics’99]

• LRU-K: replacement decision based on the time of the Kth-to-last reference [O'Neil et al, Sigmod’93]

• 2Q: use two queues to quickly remove cold blocks [Johnson et al, VLDB’94]

Either high implementation cost, or workload-dependent performance


Outline

• Related Work

• The LIRS Algorithm

• LIRS Implementation Using LRU Stack

• Performance Evaluation

• Sensitivity and Overhead Analysis

• Conclusions


Observation of Data Flow in LRU Stack

• Blocks are ordered by recency in the LRU stack;

• Blocks enter from stack top, and leave from its bottom;

A block evicted from the bottom of the stack should have been evicted much earlier!

[Figure: LRU stack of numbered blocks.]


Inter-Reference Recency (IRR)

IRR of a block: number of other unique blocks accessed between two consecutive references to the block.

Recency: number of other unique blocks accessed from last reference to the current time.

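Reference stream: 1 2 3 4 3 1 5 6 5

For block 1: IRR = 3 (blocks 2, 3, and 4 are accessed between its two references) and R = 2 (blocks 5 and 6 are accessed after its last reference).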
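
The two definitions can be checked mechanically. The sketch below is an illustration only; the helper names `recency` and `irr` are assumptions, not names from the talk. Both helpers take the reference stream seen so far as a list, and `recency` expects the block to appear in it at least once.

```python
def recency(trace, block):
    """Number of other distinct blocks accessed after the last reference to `block`."""
    last = len(trace) - 1 - trace[::-1].index(block)
    return len(set(trace[last + 1:]) - {block})


def irr(trace, block):
    """Number of distinct blocks between the last two references to `block`.

    Returns infinity when the block has been referenced fewer than two times.
    """
    positions = [i for i, b in enumerate(trace) if b == block]
    if len(positions) < 2:
        return float("inf")
    prev, last = positions[-2], positions[-1]
    return len(set(trace[prev + 1:last]))


# Reference stream from the slide.
trace = [1, 2, 3, 4, 3, 1, 5, 6, 5]
print(irr(trace, 1), recency(trace, 1))
```

For block 1 this reproduces the values on the slide, IRR = 3 and R = 2.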


Principles of Our Replacement

If a block’s IRR is high, its next IRR is likely to be high again. We select the blocks with high IRRs for replacement.

Once IRR is out of date, we rely on the recency.

LIRS: Low Inter-reference Recency Set Replacement Policy

We keep the blocks with low IRRs in cache.


Basic LIRS Idea: Keep LIR Blocks in Cache

Low IRR (LIR) block and High IRR (HIR) block

[Figure: the LIR block set (size Llirs) and the HIR block set mapped onto the physical cache, which is divided into parts of size Llirs and Lhirs.]

Cache size L = Llirs + Lhirs


An Example for LIRS

Llirs = 2, Lhirs = 1

Virtual time 1-10 (X = one reference)

Block   References   R   IRR
A       X X X        1   1
B       X X          3   1
C       X            4   inf
D       X X          2   3
E       X            0   inf

LIR block set = {A, B}, HIR block set = {C, D, E}


Mapping to Cache Block Sets

[Figure: the LIR block set {A, B} (Llirs = 2) and the HIR block set {C, D, E} mapped onto the physical cache. The resident blocks are A, B, and E (Lhirs = 1).]


Which Block Is Replaced? Replace a HIR Block

D is referenced at time 10

Virtual time 1-10 (X = one reference)

Block   References   R   IRR
A       X X X        1   1
B       X X          3   1
C       X            4   inf
D       X X X        0   3
E       X            1   inf

The resident HIR block (E) is replaced!


How Is the LIR Set Updated? The Recency of the LIR Block Is Used

Virtual time 1-10 (X = one reference)

Block   References   R   IRR
A       X X X        2   1
B       X X          3   1
C       X            4   inf
D       X X X        0   2
E       X            1   inf


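After D Is Referenced at Time 10

Virtual time 1-10 (X = one reference)

Block   References   R   IRR
A       X X X        2   1
B       X X          3   1
C       X            4   inf
D       X X X        0   2
E       X            1   inf

E is replaced, and D enters the LIR set: D's new IRR (2) is smaller than the maximum recency of the LIR blocks (B's recency, 3).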


If the Reference Is to C at Time 10 ...

Virtual time 1-10 (X = one reference)

Block   References   R   IRR
A       X X X        2   1
B       X X          4   1
C       X X          0   4
D       X X          3   3
E       X            1   inf

E is replaced, but C cannot enter the LIR set: its new IRR (4) is not smaller than the maximum recency of the LIR blocks.
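
The status decision in this example can be sketched in Python. The reference order in `trace` is an assumption: it is one order that reproduces the R and IRR columns of the example table at time 9, not necessarily the order drawn on the slide. The function names are also hypothetical. The rule encoded in `enters_lir_set` is the one the example illustrates: a referenced HIR block's recency becomes its new IRR, and the block joins the LIR set only if that value is smaller than the largest recency among the current LIR blocks.

```python
INF = float("inf")


def recency(trace, block):
    """Distinct other blocks accessed since the last reference (inf if never referenced)."""
    if block not in trace:
        return INF
    last = len(trace) - 1 - trace[::-1].index(block)
    return len(set(trace[last + 1:]) - {block})


def irr(trace, block):
    """Distinct blocks between the last two references (inf if fewer than two)."""
    positions = [i for i, b in enumerate(trace) if b == block]
    if len(positions) < 2:
        return INF
    return len(set(trace[positions[-2] + 1:positions[-1]]))


def enters_lir_set(trace, lir_set, block):
    """True if referencing `block` next gives it a new IRR below Rmax."""
    r_max = max(recency(trace, b) for b in lir_set)
    new_irr = recency(trace, block)  # the recency at reference time becomes the new IRR
    return new_irr < r_max


# Assumed reference order for virtual times 1..9.
trace = list("ADBCBADAE")
for b in "ABCDE":
    print(b, recency(trace, b), irr(trace, b))

print(enters_lir_set(trace, {"A", "B"}, "D"))  # D referenced at time 10
print(enters_lir_set(trace, {"A", "B"}, "C"))  # C referenced at time 10
```

Under this assumed order, D qualifies for the LIR set and C does not, matching the two cases above.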


The Power of LIRS Replacement

• File scanning: blocks accessed only once will be replaced in a timely manner;

• Loop-like accesses: the blocks that will be accessed soonest will NOT be replaced;

• Accesses with distinct frequencies: frequently accessed blocks will NOT be replaced.

Capability to cope with weak access locality


Outline

• Related Work

• The LIRS Algorithm

• LIRS Implementation Using LRU Stack

• Performance Evaluation

• Sensitivity and Overhead Analysis

• Conclusions


LIRS Efficiency: O(1)

[Figure: IRR_HIR (the new IRR of the HIR block) shown against Rmax (the maximum recency of the LIR blocks).]

This efficiency is achieved by our LIRS stack.

LRU stack + LIR block with Rmax recency in its bottom ==> LIRS stack.


Differences between LRU and LIRS Stacks

[Figure: an LRU stack holding blocks 5, 3, 2, 1, 6 and a LIRS stack holding blocks 5, 3, 2, 1, 6, 9, 4, 8, for cache size L = 5 (Llir = 3, Lhir = 2). Legend: resident block, LIR block, HIR block.]

• The LRU stack size is decided by the cache size and is fixed; the LIRS stack size is decided by the LIR block with Rmax recency and varies.

• The LRU stack holds only resident blocks; the LIRS stack holds any blocks whose recencies are no more than Rmax.

• The LRU stack does not distinguish “hot” and “cold” blocks; the LIRS stack distinguishes LIR and HIR blocks and dynamically maintains their statuses.


How Does the LIRS Stack Help?

[Figure: LIRS stack, marking Rmax (the maximum recency of the LIR blocks) and IRR_HIR (the new IRR of the HIR block).]

Blocks in the LIRS stack ==> IRR < Rmax

Other blocks ==> IRR > Rmax


LIRS Operations

[Figure: LIRS stack S holding blocks 5, 3, 2, 1, 6, 9, 4, 8 and resident HIR stack Q holding blocks 5 and 3, for cache size L = 5 (Llir = 3, Lhir = 2). Legend: resident in cache, LIR block, HIR block.]

• Initialization: all referenced blocks are given LIR status until the LIR block set is full. We place resident HIR blocks in stack Q.

• Upon accessing a LIR block (a hit)

• Upon accessing a resident HIR block (a hit)

• Upon accessing a non-resident HIR block (a miss)
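
The four cases above can be sketched in Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the class name `LIRSCache`, its fields, and the use of `OrderedDict` for S and Q are all choices made here. It assumes that a HIR block found in S on access is promoted to LIR (its new IRR is below Rmax) while the LIR block at the bottom of S is demoted to the end of Q, and that HIR blocks are pruned from the bottom of S so that a LIR block always sits there. S is left unbounded in this sketch.

```python
from collections import OrderedDict

LIR, HIR = "LIR", "HIR"


class LIRSCache:
    """Sketch of LIRS with stack S and resident-HIR list Q (hypothetical layout)."""

    def __init__(self, lir_size, hir_size):
        assert lir_size >= 1 and hir_size >= 1
        self.lir_size = lir_size
        self.capacity = lir_size + hir_size
        self.S = OrderedDict()  # block -> status; the first item is the stack bottom
        self.Q = OrderedDict()  # resident HIR blocks; the first item is the next victim
        self.resident = set()
        self.lir_count = 0      # only grows, during initialization

    def _prune(self):
        # Drop HIR blocks from the bottom of S until a LIR block is at the bottom.
        while self.S and next(iter(self.S.values())) == HIR:
            self.S.popitem(last=False)

    def access(self, block):
        """Reference `block`; return True on a hit and False on a miss."""
        hit = block in self.resident
        if self.S.get(block) == LIR:            # hit on a LIR block
            self.S.move_to_end(block)
            self._prune()
            return True
        if not hit:
            if self.lir_count < self.lir_size:  # initialization: fill the LIR set
                self.S[block] = LIR
                self.lir_count += 1
                self.resident.add(block)
                return False
            if len(self.resident) >= self.capacity:
                victim, _ = self.Q.popitem(last=False)
                self.resident.discard(victim)   # may stay in S as a non-resident HIR block
            self.resident.add(block)
        if block in self.S:
            # The HIR block is still in S, so its new IRR is below Rmax: promote it.
            self.S[block] = LIR
            self.S.move_to_end(block)
            self.Q.pop(block, None)
            demoted, _ = self.S.popitem(last=False)  # the bottom of S is a LIR block
            self.Q[demoted] = True                   # it becomes a resident HIR block
            self._prune()
        else:
            # Not in S: the block stays HIR and moves to the end of Q.
            self.S[block] = HIR
            self.Q[block] = True
            self.Q.move_to_end(block)
        return hit


# Hypothetical workload: a loop over 4 blocks with room for 3 (Llirs = 2, Lhirs = 1).
cache = LIRSCache(lir_size=2, hir_size=1)
print(sum(cache.access(b) for b in [1, 2, 3, 4] * 3))
```

On this loop the two LIR blocks stay resident and are hit on every pass after the first, whereas an LRU cache of the same size would miss on every reference.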


Access a LIR block (a Hit)

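[Figure: LIRS stack S and resident HIR stack Q before and after the accesses to LIR blocks 4 and 8. Initially S holds blocks 5, 3, 2, 1, 6, 9, 4, 8 and Q holds blocks 5 and 3. Cache size L = 5 (Llir = 3, Lhir = 2). Legend: resident in cache, LIR block, HIR block.]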


Access a HIR Resident block (a Hit)

[Figure: LIRS stack S and resident HIR stack Q before and after the accesses to resident HIR blocks 3 and 5. Cache size L = 5 (Llir = 3, Lhir = 2). Legend: resident in cache, LIR block, HIR block.]


Access a Non-Resident HIR block (a Miss)

[Figure: LIRS stack S and resident HIR stack Q before and after the access to non-resident block 7. Cache size L = 5 (Llir = 3, Lhir = 2). Legend: resident in cache, LIR block, HIR block.]


Access a HIR Non-Resident block (a Miss) (Cont)

[Figure: LIRS stack S and resident HIR stack Q before and after the accesses to non-resident blocks 9 and 5. Cache size L = 5 (Llir = 3, Lhir = 2). Legend: block number, resident in cache, LIR block, HIR block.]


Outline

• Related Work

• The LIRS Algorithm

• LIRS Implementation Using LRU Stack

• Performance Evaluation

• Sensitivity and Overhead Analysis

• Conclusions


Workload Traces

• cpp is a GNU C compiler pre-processor trace.

• cs is an interactive C source program examination tool trace.

• glimpse is a text information retrieval utility trace.

• postgres is a trace of join queries among four relations in a relational database system.

• sprite is from the Sprite network file system.

• multi1 is obtained by executing two workloads, cs and cpp, together.

• multi2 is obtained by executing three workloads, cs, cpp, and postgres, together.


Representative Access Patterns

• Looping references: all blocks are accessed repeatedly with a regular interval;

• Temporally-clustered references: blocks accessed more recently are the ones more likely to be accessed again soon.

• Probabilistic references: each block has a stationary reference probability, and all blocks are accessed independently with the associated probabilities.
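
To make the three patterns concrete, the sketch below generates a small synthetic trace of each kind. It is an illustration only: the function names and the parameters `reuse_prob` and `window` are assumptions, and these toy generators are not the traces evaluated in the talk.

```python
import random


def looping_trace(num_blocks, passes):
    """Every block is re-referenced at a regular interval of `num_blocks`."""
    return [b for _ in range(passes) for b in range(num_blocks)]


def temporally_clustered_trace(num_blocks, length, rng, reuse_prob=0.8, window=4):
    """Mostly re-reference one of the `window` (>= 1) most recently used blocks."""
    trace = [rng.randrange(num_blocks)]
    recent = [trace[0]]  # distinct blocks, most recently used last
    while len(trace) < length:
        if rng.random() < reuse_prob:
            block = rng.choice(recent)         # reuse a recently accessed block
        else:
            block = rng.randrange(num_blocks)  # jump to an arbitrary block
        trace.append(block)
        if block in recent:
            recent.remove(block)
        recent.append(block)
        del recent[:-window]                   # keep only the last `window` blocks
    return trace


def probabilistic_trace(weights, length, rng):
    """Block i is drawn independently with a stationary probability proportional to weights[i]."""
    return rng.choices(range(len(weights)), weights=weights, k=length)


rng = random.Random(0)  # fixed seed so that runs are repeatable
print(looping_trace(4, 2))
print(temporally_clustered_trace(100, 12, rng))
print(probabilistic_trace([0.7, 0.2, 0.1], 12, rng))
```

Feeding such traces to a cache simulator is one way to see how a policy behaves on each pattern in isolation.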


Cache Partition

• 1% of the cache size is for HIR blocks

• 99% of the cache size is for LIR blocks

• Performance is not sensitive to the partition.


Looping Pattern: cs (Time-space map)


Looping Pattern: cs (Hit Rates)


Looping Pattern: postgres (Time-space map)


Looping Pattern: postgres (Hit Rates)


Looping Pattern: postgres (Hit Rates)


Probabilistic Pattern: cpp (Time-space map)


Probabilistic Pattern: cpp (Hit Rates)


Temporally-Clustered Pattern: sprite (Time-space map)


Temporally-Clustered Pattern: sprite (Hit Rates)


Mixed Pattern: multi1 (Time-space map)


Mixed Pattern: multi1 (Hit Rates)


Mixed Pattern: multi2 (Time-space map)


Mixed Pattern: multi2 (Hit Rates)


Outline

• Related Work

• The LIRS Algorithm

• LIRS Implementation Using LRU Stack

• Performance Evaluation

• Sensitivity and Overhead Analysis

• Conclusions


Sensitivity to the Change of Lhirs


Sensitivity to the Change of Lhirs


LIRS with Limited Stack Sizes


LIRS with Limited Stack Sizes


Conclusions

• Effectively uses deeper access history without explicit regularity detection or high-cost operations.

• Outperforms existing replacement policies.

• Its implementation is as simple as that of LRU.

• Applicable to virtual memory and database buffer management.