October 14, 2002 MASCOTS 2002 1 Workload Characterization in Web Caching Hierarchies Guangwei Bai Carey Williamson Department of Computer Science University of Calgary

Page 1:

Workload Characterization in Web Caching Hierarchies

Guangwei Bai

Carey Williamson

Department of Computer Science

University of Calgary

Page 2:

Talk Outline

1. Problem Statement

2. Experimental Methodology

3. Simulation Results

4. Modeling Results

5. Summary and Conclusions

Page 3:

1. Introduction

World Wide Web: One of the most popular applications on today’s Internet

Web proxy caching: A technique used for improving the performance and scalability of the Internet

Page 4:

Illustration of Web Proxy Cache Filtering Effect

[Figure: Web clients send the original request stream to the Web proxy caching system; the filtered request stream continues across the Internet to the Web servers.]

Page 5:

Example of Web cache filter effect

Arriving Request Stream (into the Web proxy cache):

Time   ID
0.001  A
0.025  B
0.150  C
0.689  A
0.890  D
1.358  B
1.777  B
2.190  A
2.460  E

Filtered Request Stream (out of the Web proxy cache):

Time   ID
0.001  A
0.025  B
0.150  C
0.890  D
1.358  B
2.460  E
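The filtering step in this example can be sketched in a few lines of Python. The slide does not say which replacement policy or cache size produced the filtered stream; the sketch below assumes an LRU cache that holds three objects, which happens to reproduce it. The name `lru_filter` is illustrative and is not taken from the talk.

```python
from collections import OrderedDict

# Arriving request stream from the slide: (time, object ID).
ARRIVING = [
    (0.001, "A"), (0.025, "B"), (0.150, "C"), (0.689, "A"), (0.890, "D"),
    (1.358, "B"), (1.777, "B"), (2.190, "A"), (2.460, "E"),
]


def lru_filter(requests, capacity):
    """Return the requests that miss in an LRU cache holding `capacity` objects."""
    cache = OrderedDict()  # least recently used object first
    misses = []
    for time, obj in requests:
        if obj in cache:
            cache.move_to_end(obj)  # hit: absorbed by the cache
        else:
            misses.append((time, obj))  # miss: forwarded upstream
            cache[obj] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used object
    return misses


# Assumption: LRU with room for three objects (not stated on the slide).
FILTERED = lru_filter(ARRIVING, capacity=3)
print(FILTERED)
```

With a capacity of five objects every repeat request would hit, so only the five distinct objects would appear in the filtered stream.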

Page 6:

Example of Web cache filter effect (continued, same request streams)

Frequency-domain effect: the cache changes how often each object is requested. In the example, A drops from three requests to one and B from three to two.

Page 7:

Example of Web cache filter effect (continued, same request streams)

Time-domain effect: the cache changes the arrival pattern of requests over time. In the example, nine arrivals become six, and the gaps between successive requests grow on average.
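As a rough illustration of the time-domain view, the sketch below takes the arrival times from the example and computes inter-arrival gaps and per-interval request counts before and after the cache. The 1-second interval width and the helper names are assumptions chosen for this tiny example; the talk itself works with much longer traces and longer intervals.

```python
from statistics import mean

# Arrival times from the example, before and after the cache.
ARRIVING_TIMES = [0.001, 0.025, 0.150, 0.689, 0.890, 1.358, 1.777, 2.190, 2.460]
FILTERED_TIMES = [0.001, 0.025, 0.150, 0.890, 1.358, 2.460]


def interarrival_times(times):
    """Gaps between successive arrivals in a time-ordered list."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


def counts_per_interval(times, width, num_intervals):
    """Number of arrivals falling in each interval of length `width`."""
    counts = [0] * num_intervals
    for t in times:
        index = int(t // width)
        if 0 <= index < num_intervals:
            counts[index] += 1
    return counts


for name, times in (("arriving", ARRIVING_TIMES), ("filtered", FILTERED_TIMES)):
    gaps = interarrival_times(times)
    counts = counts_per_interval(times, width=1.0, num_intervals=3)
    print(name, "mean gap:", round(mean(gaps), 4), "counts per second:", counts)
```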

Page 8:

Goal of this Work:

Time-domain analysis of cache filter effects in Web caching hierarchies:
o Study the impact of a cache on the structural characteristics of the Web request workload (mean, peak, variance, self-similarity)
o Study the sensitivity of the filter effect to cache configuration (cache size and cache replacement policy)
o Characterize aggregate Web request streams in a multi-level Web proxy caching hierarchy

Page 9:

Multi-Level Web Proxy Caching System

[Figure: hierarchy diagram with Web Proxy Cache 1, Web Proxy Cache 2, and Web Proxy Cache 3 arranged in a Child Level and a Parent Level; the request streams are labeled 1, 2, and 3.]

Page 10:

Experimental Methodology

• Trace-driven simulation
• Web proxy cache simulator
• Synthetic Web proxy workloads
  o Controllable characteristics
  o Trace length: about 1M requests
  o Zipf slope: -0.75, -0.8
  o Request arrival process: Deterministic, Poisson, Self-Similar
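To make the workload parameters concrete, here is a minimal sketch of a synthetic trace generator with Zipf-like object popularity and Poisson arrivals. It is not the generator used in the talk; the function name, the object count, the arrival rate, and the seed are all assumptions for illustration, and the self-similar arrival case is not covered.

```python
import itertools
import random


def synthetic_trace(num_requests, num_objects, zipf_slope=-0.75, rate=1.0, seed=0):
    """Generate (time, object ID) pairs.

    Popularity of the object with rank r is proportional to r ** zipf_slope,
    and inter-arrival times are exponential with mean 1 / rate (Poisson arrivals).
    """
    rng = random.Random(seed)
    object_ids = list(range(1, num_objects + 1))  # ID equals popularity rank
    weights = [rank ** zipf_slope for rank in object_ids]
    cum_weights = list(itertools.accumulate(weights))
    objects = rng.choices(object_ids, cum_weights=cum_weights, k=num_requests)

    trace = []
    now = 0.0
    for obj in objects:
        now += rng.expovariate(rate)  # exponential gap between requests
        trace.append((now, obj))
    return trace


trace = synthetic_trace(num_requests=20000, num_objects=1000, seed=1)
print(trace[:3])
```

A deterministic arrival process could be obtained by replacing the exponential gap with a constant `1 / rate`.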

Page 11:

General Observations: Filter Effects

[Figure: two time-series panels over the same period (time-of-day labels from 12:00 to 16:00).
Arrival Counts: Requests per 5-minute Interval versus Time (sec)
Cache Hit Ratio: Hit Ratio (0 to 1) versus Time (sec)]

Page 12:

Effect of Cache Configuration

Experimental factors:
• Cache size: determines the maximum number of Web content bytes that can be held in the cache at one time
• Cache replacement policy: determines which object(s) to remove from the cache when more space is needed to store an incoming object (e.g. RAND, FIFO, LRU, LFU, GDS)

(Assumption: arrival process is Poisson)
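The two factors can be sketched as parameters of a small cache simulation. The code below is only an illustration under simplifying assumptions: it supports just LRU and FIFO (not RAND, LFU, or GDS), it never caches an object larger than the whole cache, and the function name `hit_ratio` is hypothetical rather than part of the simulator used in the talk.

```python
from collections import OrderedDict


def hit_ratio(requests, capacity_bytes, policy="LRU"):
    """Request hit ratio for a byte-limited cache.

    `requests` is a list of (object ID, size in bytes) pairs.
    """
    if policy not in ("LRU", "FIFO"):
        raise ValueError("policy must be 'LRU' or 'FIFO'")
    cache = OrderedDict()  # object ID -> size; next eviction victim first
    used = 0
    hits = 0
    for obj, size in requests:
        if obj in cache:
            hits += 1
            if policy == "LRU":
                cache.move_to_end(obj)  # FIFO leaves the insertion order alone
            continue
        if size > capacity_bytes:
            continue  # assumption: oversized objects bypass the cache
        while used + size > capacity_bytes:
            _, evicted_size = cache.popitem(last=False)
            used -= evicted_size
        cache[obj] = size
        used += size
    return hits / len(requests) if requests else 0.0


demo = [("A", 1), ("B", 1), ("C", 1), ("A", 1), ("D", 1), ("B", 1)]
print(hit_ratio(demo, 3, "LRU"), hit_ratio(demo, 3, "FIFO"))
```

On this tiny demo stream FIFO happens to get more hits than LRU; on realistic workloads the ranking depends on the reference pattern and the cache size.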

Page 13:

Effect of Cache Size on Traffic Structure

[Figure (a) Effect of cache size: marginal distribution plot (pdf). x-axis: Requests per 1-minute Interval (0 to 120); y-axis: Frequency in Percent (0 to 25).]

Page 14:

Effect of Cache Replacement Policy

[Figure (b) Effect of cache policy (8 KB). x-axis: Requests per 1-minute Arrival (0 to 120); y-axis: Frequency (0 to 25).]

Page 15:

Input: Deterministic Arrival Process

Main Observations:
• Reduces mean arrival rate of filtered request stream
• Increases variance of the filtered request stream

Statistics          Before   Cache Size (MB)
                    Cache    1       4       16      64      256     1024
Hit Ratio           -        38.8%   47.8%   52.7%   55.5%   59.1%   62.7%
Mean                60.00    36.88   31.45   28.71   27.31   25.37   23.03
Standard Deviation  0.00     4.84    4.60    4.01    4.00    4.31    4.78
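A quick arithmetic check on the table: if the hit ratio is a request hit ratio, the mean of the filtered stream should be close to the input mean times (1 - hit ratio). The sketch below compares that prediction with the reported means. The interpretation of the hit ratio and the variable names are assumptions; the slides do not spell out this relationship.

```python
# Values copied from the deterministic-arrival table.
MEAN_BEFORE = 60.00
CACHE_SIZES_MB = [1, 4, 16, 64, 256, 1024]
HIT_RATIOS = [0.388, 0.478, 0.527, 0.555, 0.591, 0.627]
MEANS_AFTER = [36.88, 31.45, 28.71, 27.31, 25.37, 23.03]


def predicted_mean(mean_before, hit_ratio):
    """Mean of the miss stream if a fraction `hit_ratio` of requests is absorbed."""
    return mean_before * (1.0 - hit_ratio)


REL_ERRORS = []
for size, h, observed in zip(CACHE_SIZES_MB, HIT_RATIOS, MEANS_AFTER):
    predicted = predicted_mean(MEAN_BEFORE, h)
    REL_ERRORS.append(abs(predicted - observed) / observed)
    print(f"{size:>5} MB  predicted {predicted:6.2f}  reported {observed:6.2f}")
```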

Page 16:

Input: Poisson Arrival Process

Main Observations:
• Large impact on mean; little impact on variance
• Variance-to-mean ratio increases with cache size
• For small cache sizes, the filtered stream is well-characterized as a Poisson process

Statistics          Before   Cache Size (MB)
                    Cache    1       4       16      64      256     1024
Hit Ratio           -        38.8%   47.8%   52.7%   55.5%   59.1%   62.7%
Mean                60.10    36.81   31.38   28.65   27.26   25.33   23.00
Standard Deviation  7.82     6.77    6.07    5.43    5.31    5.39    5.62
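The variance-to-mean ratio mentioned in the observations can be computed directly from the table as the squared standard deviation divided by the mean. For counts from a Poisson process this ratio is 1. The sketch below is only a convenience calculation on the reported figures; the variable names are illustrative.

```python
# Columns: before cache, then cache sizes 1, 4, 16, 64, 256, 1024 MB.
LABELS = ["before", "1 MB", "4 MB", "16 MB", "64 MB", "256 MB", "1024 MB"]
MEANS = [60.10, 36.81, 31.38, 28.65, 27.26, 25.33, 23.00]
STD_DEVS = [7.82, 6.77, 6.07, 5.43, 5.31, 5.39, 5.62]


def variance_to_mean(mean, std_dev):
    """Index of dispersion of a count series: variance divided by mean."""
    return std_dev ** 2 / mean


VMR = [variance_to_mean(m, s) for m, s in zip(MEANS, STD_DEVS)]
for label, value in zip(LABELS, VMR):
    print(f"{label:>8}: {value:.3f}")
```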

Page 17:

Input: Self-Similar Arrival Process

Main Observations:
• Large impact on mean; little impact on variance
• Variance-to-mean ratio increases with cache size
• Filtered request stream retains self-similar structure

Statistics          Before   Cache Size (MB)
                    Cache    1       4       16      64      256     1024
Hit Ratio           -        38.8%   47.8%   52.7%   55.5%   59.1%   62.7%
Mean                62.87    38.50   32.79   29.88   28.27   26.05   23.49
Standard Deviation  12.24    9.03    7.98    7.12    6.94    7.02    7.14

Page 18:

Background: Self-Similar Traffic

Network traffic self-similarity: The statistical characterization of the traffic is essentially invariant with time scale.

Main measure: Hurst parameter, 0.5 < H < 1

Examination:
o autocorrelation (long-range dependence)
o variance-time plot
o rescaled adjusted range statistic (R/S)
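The variance-time method listed above can be sketched as follows: average the count series over blocks of size m, plot log10 of the variance of the block means against log10(m), and fit a line. If the slope is -β, the estimate is H = 1 - β/2. This is a minimal standard-library sketch, not the analysis code behind the talk; the aggregation levels and function names are assumptions.

```python
import math
import random
import statistics


def aggregate(series, m):
    """Average the series over non-overlapping blocks of length m."""
    num_blocks = len(series) // m
    return [sum(series[i * m:(i + 1) * m]) / m for i in range(num_blocks)]


def hurst_variance_time(series, levels=(1, 2, 4, 8, 16, 32, 64)):
    """Estimate the Hurst parameter from the slope of the variance-time plot."""
    xs, ys = [], []
    for m in levels:
        block_means = aggregate(series, m)
        xs.append(math.log10(m))
        ys.append(math.log10(statistics.pvariance(block_means)))
    # Ordinary least-squares slope of log10(variance) versus log10(m).
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return 1.0 + slope / 2.0  # slope = -beta, H = 1 - beta / 2


# Sanity check on independent noise, which should give H near 0.5.
rng = random.Random(1)
noise = [rng.gauss(0.0, 1.0) for _ in range(8192)]
H_NOISE = hurst_variance_time(noise)
print(round(H_NOISE, 3))
```

A self-similar count series would show a shallower slope (between -1 and 0) and hence an estimate between 0.5 and 1.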

Page 19:

Traffic Characterization in a Web Proxy Caching Hierarchy

Filter effects of the first-level cache on Web workload

Statistical multiplexing of filtered Web request streams after the first-level cache

Modeling aggregate request stream offered to the second-level cache

Page 20:

Multi-Level Web Proxy Caching System

[Figure: hierarchy diagram with Web Proxy Cache 1, Web Proxy Cache 2, and Web Proxy Cache 3 arranged in a Child Level and a Parent Level; the request streams are labeled 1, 2, and 3.]

Page 21:

Synthetic Self-Similar Workload Traces offered to the first-level cache

[Figure: two time-series panels, one per trace. x-axis: Time (sec.); y-axis: Requests per Interval.
Trace 1 (H=0.70, Zipf slope=0.75)
Trace 2 (H=0.80, Zipf slope=0.80)]

Page 22:

Evidence of Self-Similar Request Arrival Process for Filtered Web Proxy Workload

[Figure: four panels, annotated H=0.699.
(a) Time Series: Count of Arrival/Interval versus Time Interval
(b) Autocorrelation: Autocorrelation versus Lag
(c) Variance-Time Plot: Log10(Variance) versus Log10(Aggregation level)
(d) R/S Pox Plot: Log10(R/S) versus Log10(Sample Size)]

Page 23:

Superposition of Web Workload in time-domain

[Figure: Characteristics of aggregate request arrival process 3 (the diagram labels request streams 1, 2, and 3). x-axis: Request Arrival (0 to 140); y-axis: Frequency (%) (0 to 8).]
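Superposition in the time domain amounts to merging time-ordered request streams into a single time-ordered stream. The sketch below shows this for two tiny made-up streams and confirms that per-interval counts simply add. The stream contents, interval width, and function names are assumptions for illustration only.

```python
import heapq


def superpose(*streams):
    """Merge time-ordered (time, object ID) streams into one time-ordered stream."""
    return list(heapq.merge(*streams))


def counts_per_interval(stream, width, num_intervals):
    """Number of requests falling in each interval of length `width`."""
    counts = [0] * num_intervals
    for time, _ in stream:
        index = int(time // width)
        if 0 <= index < num_intervals:
            counts[index] += 1
    return counts


# Hypothetical filtered streams leaving two first-level caches.
stream_1 = [(0.2, "A"), (1.1, "B"), (2.7, "C")]
stream_2 = [(0.5, "D"), (0.9, "A"), (2.1, "E")]

stream_3 = superpose(stream_1, stream_2)  # aggregate offered to the next level
print(stream_3)
print(counts_per_interval(stream_3, width=1.0, num_intervals=3))
```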

Page 24:

Evidence of Self-Similarity for Aggregate Request Arrival Process 3

[Figure: four panels, annotated H=0.76.
(a) Time series: Requests per Interval versus Time (sec.)
(b) Autocorrelation function: Autocorrelation versus Lag
(c) Variance-Time Plot: Log10(variance) versus Log10(aggregation level)
(d) R/S Pox Plot: Log10(R/S) versus Log10(sample size)]

Page 25:

Modeling of Aggregate Workload

• Gamma Distribution

$$f(x) = \frac{\left(\frac{x-\mu}{\beta}\right)^{\gamma-1} e^{-\frac{x-\mu}{\beta}}}{\beta\,\Gamma(\gamma)}, \qquad x \ge \mu$$

γ : shape parameter

β : scale parameter

μ : location parameter
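One simple way to pick gamma parameters for a count distribution is the method of moments: with location μ fixed, the scale is variance / (mean - μ) and the shape is (mean - μ) / scale. The slides do not say how the authors fitted their model, so the sketch below is only an assumed approach. As sample input it uses the mean and standard deviation reported earlier for the self-similar input after a 1 MB cache (38.50 and 9.03), with μ = 0 assumed.

```python
import math


def gamma_pdf(x, shape, scale, loc=0.0):
    """Three-parameter gamma density; returns 0 at or below the location."""
    if x <= loc:
        return 0.0
    z = (x - loc) / scale
    return z ** (shape - 1.0) * math.exp(-z) / (scale * math.gamma(shape))


def fit_gamma_moments(mean, std_dev, loc=0.0):
    """Method-of-moments estimates (shape, scale) for a fixed location."""
    variance = std_dev ** 2
    scale = variance / (mean - loc)
    shape = (mean - loc) / scale
    return shape, scale


# Assumption: location parameter 0, moments taken from the earlier table.
SHAPE, SCALE = fit_gamma_moments(mean=38.50, std_dev=9.03)
print(round(SHAPE, 2), round(SCALE, 2))
print(gamma_pdf(38.5, SHAPE, SCALE))
```

By construction the fitted distribution has mean `SHAPE * SCALE` and variance `SHAPE * SCALE ** 2`, matching the inputs.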

Page 26:

Modeling of Aggregate Workload

Page 27:

Summary and Conclusions

• Recap: Trace-driven simulation of Web proxy caching hierarchy, with synthetic Web workloads

• Cache reduces peak and mean request arrival rate

• Cache filter effect does not remove self-similarity

• Superposition of Web request streams results in a bursty aggregate request stream

• Gamma distribution: a flexible and robust means to characterize request arrival count distribution at different stages in a Web caching hierarchy

Page 28:

Future Work

• Bigger traces, more general workloads

• Studying the mathematical relationships between gamma (shape) and beta (scale) parameters versus cache size and hit ratio

• For more information:
  – Email: {bai,carey}@cpsc.ucalgary.ca
  – http://www.cpsc.ucalgary.ca/~carey