October 14, 2002 MASCOTS 2002 1
Workload Characterization in Web Caching Hierarchies
Guangwei Bai
Carey Williamson
Department of Computer Science
University of Calgary
Talk Outline
1. Problem Statement
2. Experimental Methodology
3. Simulation Results
4. Modeling Results
5. Summary and Conclusions
1. Introduction
World Wide Web: One of the most popular applications on today’s Internet
Web proxy caching: A technique used for improving performance and scalability of the Internet
Illustration of Web Proxy Cache Filtering Effect
[Figure: Web clients send the original request stream to the Web proxy caching system, which forwards the filtered request stream across the Internet to the Web servers.]
Example of Web cache filter effect

Arriving Request Stream (into the Web proxy cache)
Time    ID
0.001   A
0.025   B
0.150   C
0.689   A
0.890   D
1.358   B
1.777   B
2.190   A
2.460   E

Filtered Request Stream (out of the Web proxy cache)
Time    ID
0.001   A
0.025   B
0.150   C
0.890   D
1.358   B
2.460   E
Example of Web cache filter effect (continued)
Frequency-domain effect: the cache changes how often each object is requested. Object A appears three times in the arriving request stream but only once in the filtered request stream.
Example of Web cache filter effect (continued)
Time-domain effect: the cache changes the request arrival process. Nine arriving requests become six filtered requests, so the gaps between successive requests change.
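A minimal sketch of the filter effect in the example above. The slides do not say which cache produced the filtered stream; the assumption here is an LRU cache that holds three objects, which happens to reproduce the filtered stream shown. The function name `lru_filter` is illustrative only.

```python
from collections import Counter, OrderedDict

def lru_filter(requests, capacity):
    """Return the requests that miss in an LRU cache holding `capacity` objects."""
    cache = OrderedDict()                  # keys ordered from least to most recently used
    misses = []
    for time, obj in requests:
        if obj in cache:
            cache.move_to_end(obj)         # hit: absorbed by the cache
        else:
            misses.append((time, obj))     # miss: forwarded to the next level
            cache[obj] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used object
    return misses

# Arriving request stream from the slide.
arriving = [(0.001, "A"), (0.025, "B"), (0.150, "C"), (0.689, "A"), (0.890, "D"),
            (1.358, "B"), (1.777, "B"), (2.190, "A"), (2.460, "E")]

# Assumed configuration: LRU, three objects (not stated on the slides).
filtered = lru_filter(arriving, capacity=3)

# Frequency-domain view: per-object request counts before and after the cache.
popularity_before = Counter(obj for _, obj in arriving)
popularity_after = Counter(obj for _, obj in filtered)

print(filtered)
print(popularity_before, popularity_after)
```

The per-object counts give the frequency-domain view; the timestamps that survive in `filtered` give the time-domain view.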
Goal of this Work:
Time-domain analysis of cache filter effects in Web caching hierarchies:
o Study impact of a cache on the structural characteristics of Web request workload (mean, peak, variance, self-similarity)
o Sensitivity of filter effect to cache configuration (cache size and cache replacement policy)
o Characterizing aggregate Web request streams in a multi-level Web proxy caching hierarchy
Multi-Level Web Proxy Caching System
[Figure: Web Proxy Caches 1, 2 and 3 arranged across a child level and a parent level, with request streams labeled 1, 2 and 3.]
Experimental Methodology
• Trace-driven simulation
• Web proxy cache simulator
• Synthetic Web proxy workloads
o Controllable characteristics
o Trace length: about 1M requests
o Zipf slope: -0.75, -0.8
o Request arrival process: Deterministic, Poisson, Self-Similar
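A rough sketch of how a synthetic workload with these controllable characteristics could be generated. This is not the authors' workload generator; the function names, the object count and the request rate are assumptions for illustration. Only the deterministic and Poisson arrival processes are covered; a self-similar arrival process needs a dedicated generator and is left out.

```python
import random
from collections import Counter

def zipf_weights(num_objects, slope):
    """Popularity of the object with rank r is proportional to r**slope (slope < 0)."""
    return [rank ** slope for rank in range(1, num_objects + 1)]

def synthetic_trace(num_requests, num_objects, slope, rate, arrivals="poisson", seed=1):
    """Return a list of (timestamp, object_id) pairs; object 0 is the most popular."""
    rng = random.Random(seed)
    weights = zipf_weights(num_objects, slope)
    objects = rng.choices(range(num_objects), weights=weights, k=num_requests)
    trace, now = [], 0.0
    for obj in objects:
        if arrivals == "poisson":
            now += rng.expovariate(rate)   # exponential inter-arrival times
        elif arrivals == "deterministic":
            now += 1.0 / rate              # constant inter-arrival times
        else:
            raise ValueError("unsupported arrival process: " + arrivals)
        trace.append((now, obj))
    return trace

# Hypothetical parameters: 10,000 requests over 100 objects at 1 request per second.
poisson_trace = synthetic_trace(10_000, 100, slope=-0.75, rate=1.0)
fixed_trace = synthetic_trace(10_000, 100, slope=-0.75, rate=1.0, arrivals="deterministic")
most_popular = Counter(obj for _, obj in poisson_trace).most_common(1)[0][0]
print(most_popular, poisson_trace[-1][0], fixed_trace[-1][0])
```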
General Observations: Filter Effects
[Figure: two time-series panels plotted against Time (sec). Arrival Counts panel: Requests per 5-minute Interval. Cache Hit Ratio panel: Hit Ratio, on a scale of 0 to 1.]
Effect of Cache Configuration
Experimental factors:
• Cache size determines the maximum number of Web content bytes that can be held in the cache at one time
• Cache replacement policy determines which object(s) to remove from the cache when more space is needed to store an incoming object (e.g. RAND, FIFO, LRU, LFU, GDS)
(Assumption: arrival process is Poisson)
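To make the two experimental factors concrete, here is a small sketch of a byte-limited cache with a selectable replacement policy. It is an illustration under stated assumptions, not the simulator used in the study: only FIFO and LRU are shown (RAND, LFU and GDS are omitted), and the function name `hit_ratio` and the toy request list are hypothetical.

```python
from collections import OrderedDict

def hit_ratio(requests, capacity_bytes, policy="LRU"):
    """requests: iterable of (object_id, size_bytes). Returns hits / requests."""
    if policy not in ("LRU", "FIFO"):
        raise ValueError("unsupported policy: " + policy)
    cache = OrderedDict()      # object_id -> size; the front entry is evicted first
    used = hits = total = 0
    for obj, size in requests:
        total += 1
        if obj in cache:
            hits += 1
            if policy == "LRU":
                cache.move_to_end(obj)   # FIFO leaves insertion order untouched
            continue
        if size > capacity_bytes:
            continue                     # object can never fit, so it is not cached
        while used + size > capacity_bytes:
            _, evicted_size = cache.popitem(last=False)
            used -= evicted_size         # free space until the new object fits
        cache[obj] = size
        used += size
    return hits / total if total else 0.0

# Toy request list with unit-size objects and room for three of them.
toy = [("A", 1), ("B", 1), ("C", 1), ("A", 1), ("D", 1), ("A", 1)]
lru_ratio = hit_ratio(toy, capacity_bytes=3, policy="LRU")
fifo_ratio = hit_ratio(toy, capacity_bytes=3, policy="FIFO")
print(lru_ratio, fifo_ratio)
```

On this toy list LRU keeps A in the cache because it was just re-referenced, while FIFO evicts it as the oldest entry, so the two policies forward different request streams to the next level.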
Effect of Cache Size on Traffic Structure
[Figure: Marginal Distribution Plot (pdf), panel (a) Effect of cache size. x-axis: Requests per 1-minute Interval (0 to 120); y-axis: Frequency in Percent (0 to 25).]
Effect of Cache Replacement Policy
[Figure: panel (b) Effect of cache policy (8 KB). x-axis: Requests per 1-minute Arrival (0 to 120); y-axis: Frequency (0 to 25).]
Input: Deterministic Arrival Process
Main Observations:
• Reduces mean arrival rate of filtered request stream
• Increases variance of the filtered request stream

                     Before    Cache Size (MB)
Statistics           Cache     1       4       16      64      256     1024
Hit Ratio                      38.8%   47.8%   52.7%   55.5%   59.1%   62.7%
Mean                 60.00     36.88   31.45   28.71   27.31   25.37   23.03
Standard Deviation   0.00      4.84    4.60    4.01    4.00    4.31    4.78
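A quick arithmetic check on the table above: if the cache absorbs a fraction h of requests, the filtered mean should be roughly the input mean times (1 - h). The sketch below compares that estimate with the reported means. It assumes the six cache-size columns are 1, 4, 16, 64, 256 and 1024 MB, in that order; the slides do not explain the small remaining gap.

```python
input_mean = 60.00                                   # mean before the cache
cache_mb = [1, 4, 16, 64, 256, 1024]                 # assumed column order
hit_ratios = [0.388, 0.478, 0.527, 0.555, 0.591, 0.627]
reported_means = [36.88, 31.45, 28.71, 27.31, 25.37, 23.03]

rel_errors = []
for size, h, reported in zip(cache_mb, hit_ratios, reported_means):
    predicted = input_mean * (1.0 - h)               # requests that miss the cache
    rel_error = abs(predicted - reported) / reported
    rel_errors.append(rel_error)
    print(f"{size:5d} MB  predicted {predicted:6.2f}  reported {reported:6.2f}  "
          f"relative error {rel_error:.1%}")
```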
Input: Poisson Arrival Process
Main Observations:
• Large impact on mean; little impact on variance
• Variance-to-mean ratio increases with cache size
• For small cache sizes, the filtered stream is well-characterized as a Poisson process.

                     Before    Cache Size (MB)
Statistics           Cache     1       4       16      64      256     1024
Hit Ratio                      38.8%   47.8%   52.7%   55.5%   59.1%   62.7%
Mean                 60.10     36.81   31.38   28.65   27.26   25.33   23.00
Standard Deviation   7.82      6.77    6.07    5.43    5.31    5.39    5.62
Input: Self-Similar Arrival Process
Main Observations:
• Large impact on mean; little impact on variance
• Variance-to-mean ratio increases with cache size
• Filtered request stream retains self-similar structure

                     Before    Cache Size (MB)
Statistics           Cache     1       4       16      64      256     1024
Hit Ratio                      38.8%   47.8%   52.7%   55.5%   59.1%   62.7%
Mean                 62.87     38.50   32.79   29.88   28.27   26.05   23.49
Standard Deviation   12.24     9.03    7.98    7.12    6.94    7.02    7.14
Background: Self-Similar Traffic
• Network traffic self-similarity: the statistical characterization of the traffic is essentially invariant with time scale.
• Main measure: Hurst parameter, 0.5 < H < 1
• Examination:
o autocorrelation (long-range dependence)
o variance-time plot
o rescaled adjusted range statistic (R/S)
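As one illustration of the examination methods listed above, the following sketch estimates H with the variance-time method: for a self-similar series, the variance of the m-aggregated series decays like m^(2H-2), so the slope of log10(variance) against log10(m) gives H = 1 + slope/2. The function name, the aggregation levels and the test series are assumptions; the slides do not describe the authors' estimation code.

```python
import math
import random
import statistics

def hurst_variance_time(counts, levels=(1, 2, 4, 8, 16, 32, 64, 128)):
    """Estimate the Hurst parameter from the slope of the variance-time plot."""
    xs, ys = [], []
    for m in levels:
        blocks = len(counts) // m
        # Average the series over non-overlapping blocks of m samples.
        means = [sum(counts[i * m:(i + 1) * m]) / m for i in range(blocks)]
        xs.append(math.log10(m))
        ys.append(math.log10(statistics.pvariance(means)))
    # Least-squares slope of log10(variance) against log10(aggregation level).
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return 1.0 + slope / 2.0

# Independent samples have no long-range dependence, so H should be near 0.5.
rng = random.Random(7)
independent_counts = [rng.gauss(60.0, 8.0) for _ in range(32768)]
h_estimate = hurst_variance_time(independent_counts)
print(round(h_estimate, 2))
```

A series with long-range dependence would yield a shallower slope and therefore an estimate between 0.5 and 1.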
Traffic Characterization in a Web Proxy Caching Hierarchy
Filter effects of the first-level cache on Web workload
Statistical multiplexing of filtered Web request streams after the first-level cache
Modeling aggregate request stream offered to the second-level cache
Multi-Level Web Proxy Caching System
[Figure repeated from the earlier slide of the same title: Web Proxy Caches 1, 2 and 3 across a child level and a parent level, with request streams 1, 2 and 3.]
Synthetic Self-Similar Workload Traces offered to the first-level cache
[Figure: two time-series panels, Requests per Interval against Time (sec.). Trace 1 (H=0.70, Zipf slope=0.75); Trace 2 (H=0.80, Zipf slope=0.80).]
Evidence of Self-Similar Request Arrival Process for Filtered Web Proxy Workload
[Figure, four panels: (a) Time Series, Count of Arrival/Interval against Time Interval; (b) Autocorrelation against Lag; (c) Variance-Time Plot, Log10(Variance) against Log10(Aggregation level); (d) R/S Pox Plot, Log10(R/S) against Log10(Sample Size). Stream label on the slide: 1. Annotated estimate: H=0.699.]
Superposition of Web Workload in time-domain
[Figure: Characteristics of aggregate request arrival process 3. x-axis: Request Arrival (0 to 140); y-axis: Frequency (%). Stream labels on the slide: 1, 2, 3.]
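A small sketch of superposition in the time domain: two time-sorted request streams are merged into one aggregate stream, and the aggregate count in each interval is the sum of the per-stream counts. The function names and the toy streams are hypothetical and only illustrate the idea.

```python
import heapq
from collections import Counter

def superpose(*streams):
    """Merge time-sorted (timestamp, object_id) streams into one time-sorted stream."""
    return list(heapq.merge(*streams))

def counts_per_interval(stream, interval):
    """Number of requests falling in each consecutive interval of the given length."""
    bins = Counter(int(t // interval) for t, _ in stream)
    num_bins = max(bins) + 1 if bins else 0
    return [bins.get(i, 0) for i in range(num_bins)]

# Two toy filtered streams standing in for request streams 1 and 2.
stream_1 = [(0.5, "A"), (1.5, "B"), (2.5, "C")]
stream_2 = [(0.7, "D"), (0.9, "E"), (2.1, "F")]
aggregate = superpose(stream_1, stream_2)

counts_1 = counts_per_interval(stream_1, 1.0)
counts_2 = counts_per_interval(stream_2, 1.0)
counts_aggregate = counts_per_interval(aggregate, 1.0)
print(counts_1, counts_2, counts_aggregate)
```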
Evidence of Self-Similarity for Aggregate Request Arrival Process 3
[Figure, four panels: (a) Time series, Requests per Interval against Time (sec.); (b) Autocorrelation function against Lag; (c) Variance-Time Plot, Log10(variance) against Log10(aggregation level); (d) R/S Pox Plot, Log10(R/S) against Log10(sample size). Annotated estimate: H=0.76.]
Modeling of Aggregate Workload
• Gamma Distribution
$$f(x) = \frac{1}{\beta\,\Gamma(\gamma)} \left(\frac{x-\mu}{\beta}\right)^{\gamma-1} e^{-\frac{x-\mu}{\beta}}$$
γ : shape parameter
β : scale parameter
μ : location parameter
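A hedged sketch of how the gamma model might be matched to a measured mean and standard deviation. The slides do not state how the parameters were fitted; the method of moments and a location parameter of 0 are assumptions here, and the function names are illustrative. The sample values are the mean and standard deviation from the 1 MB column of the self-similar table.

```python
import math

def gamma_pdf(x, shape, scale, loc=0.0):
    """Three-parameter gamma density, evaluated in log space to avoid overflow."""
    if x <= loc:
        return 0.0
    z = (x - loc) / scale
    return math.exp((shape - 1.0) * math.log(z) - z - math.lgamma(shape)) / scale

def fit_gamma_moments(mean, std, loc=0.0):
    """Method-of-moments estimates: the gamma mean is loc + shape * scale
    and the gamma variance is shape * scale**2."""
    centered_mean = mean - loc
    variance = std ** 2
    shape = centered_mean ** 2 / variance
    scale = variance / centered_mean
    return shape, scale

# Mean and standard deviation taken from the self-similar input, 1 MB cache.
shape, scale = fit_gamma_moments(38.50, 9.03)
print(round(shape, 2), round(scale, 2))

# Sanity check: the fitted density should integrate to about 1.
step = 0.01
total_mass = sum(gamma_pdf(i * step, shape, scale) * step for i in range(1, 20001))
print(round(total_mass, 3))
```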
Summary and Conclusions
• Recap: Trace-driven simulation of Web proxy caching hierarchy, with synthetic Web workloads
• Cache reduces peak and mean request arrival rate
• Cache filter effect does not remove self-similarity
• Superposition of Web request streams results in a bursty aggregate request stream
• Gamma distribution: a flexible and robust means to characterize request arrival count distribution at different stages in a Web caching hierarchy
Future Work
• Bigger traces, more general workloads
• Studying the mathematical relationships between gamma (shape) and beta (scale) parameters versus cache size and hit ratio
• For more information:
– Email: {bai,carey}@cpsc.ucalgary.ca
– http://www.cpsc.ucalgary.ca/~carey