
Ubiquitous Web Caching

Wenzheng Gu

Ph.D. Defense

CISE Department, University of Florida

November 25, 2003

Outline

Introduction: Overview, Challenges, Contributions, Related Work

Extended-ICP Protocol: Design, Emulation and Analysis

Adaptation and Negotiation with Caching: Priority Fidelity Markup Language, Partiality Adaptation, Versioning Negotiation


Ubiquitous Computing

Trends on Wireless and Internet Growth

Web Caching

[Figure: a Client, a Proxy Server, and a Remote Server exchange HTTP requests and responses; caches are shown, and a cache hit is marked.]

Benefits of Web Caching

Reduces network bandwidth usage

Lessens user-perceived delays

Lightens loads on origin servers

Internet Caching Protocol (ICP)

Two types of relationship: parent and sibling

[Figure: caching hierarchy with Parent 1 and Parent 2 above Child 1, Child 2, and Child 3; parent links and sibling links are marked.]
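The two relationships can be illustrated with a small sketch. It is a simplification under stated assumptions: real ICP sends UDP queries to all peers and uses the first HIT reply, whereas this sketch asks peers one at a time in memory. The class name `Proxy`, the method names, and the `origin` dictionary standing in for origin servers are hypothetical. The behavior shown is the usual one: a sibling only serves objects it already holds, while a parent may fetch a missing object on the child's behalf.

```python
from typing import Dict, List, Tuple


class Proxy:
    """Hypothetical in-memory stand-in for an ICP-capable cache."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.cache: Dict[str, bytes] = {}
        self.siblings: List["Proxy"] = []
        self.parents: List["Proxy"] = []

    def icp_query(self, url: str) -> bool:
        # Stand-in for an ICP query: True means HIT, False means MISS.
        return url in self.cache

    def fetch(self, url: str, origin: Dict[str, bytes]) -> Tuple[bytes, str]:
        """Return (body, source) and keep a copy in the local cache."""
        if url in self.cache:
            return self.cache[url], "local"
        for peer in self.siblings + self.parents:
            if peer.icp_query(url):
                body, source = peer.cache[url], "hit:" + peer.name
                break
        else:
            # Every peer missed: only a parent resolves the miss for us.
            if self.parents:
                body, _ = self.parents[0].fetch(url, origin)
                source = "via-parent:" + self.parents[0].name
            else:
                body, source = origin[url], "origin"
        self.cache[url] = body
        return body, source
```

With one parent and two children configured as siblings, the first child to request a URL is served through the parent, and the second child then gets a sibling hit.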


The Impact of Mobility on Web Access and Web Caching (1/2)

Currently, there is no mobile Web caching protocol.

Changing network:

By leaving the home network, mobile users are disconnected from their home cache servers.

By returning home or visiting other networks, users are disconnected from the cache servers they just visited.

Hence, users experience degraded performance while mobile and upon their return.

Changing devices:

Users lose client-cached objects, favorites, and cookies.

Users lose their personal calendar and contact information.

The Impact of Mobility on Web Access and Web Caching (2/2)

Heterogeneity of Devices

Wide Variety of Web Contents

Lack of Automated User Intent

Wireless Network Limitations: Low Bandwidth, Disconnection/Handoff, Address Migration

Lack of Context Awareness

Lack of Security


Contributions

Extended ICP Protocol (x-ICP)

Support for mobility in Web caching

Experimentally demonstrated and quantified the benefits of x-ICP in terms of cache hit rate

Adaptation Mechanisms to Cope with Device Heterogeneity and Web Content Variety

Adaptive Web content

Adaptive client-side and server-side algorithms

Experimentally demonstrated the benefits of our adaptive mechanism

Architecture of Ubiquitous Web Caching

[Figure: heterogeneous clients (IBM-compatible PC, laptop computer, Power Mac G4, handheld computer, cell phone) reach Web servers through a proxy; links are labeled PFML, HTTP, and X-ICP.]


Mobile IP Routing [PER98]

[Figure: a Mobile Client shown in the Home Network (Home Proxy) and in a Foreign Network (Foreign Proxy), with an Internet Host.]

Addressing Device Heterogeneity: CC/PP [CCP99]

CC/PP stands for Composite Capabilities/Preferences Profiles.

CC/PP describes and manages software and hardware profiles that include information on the user agent's capabilities and the user's specified preferences within the user agent's set of options.

Content Negotiation [HOL98]

[Figure: four message-sequence panels, labeled Server-driven Negotiation, Agent-driven Negotiation, Transparent Negotiation, and Versioning Negotiation. In every panel the client first sends a container page request and receives the container page response, then requests the embedded object. The panel contents, in extraction order, are:

Origin server and client: the embedded object request carries Accept headers; variant selection is based on the variant list and CC/PP; the selected object is returned.

Cache server and client: the embedded object request carries Accept headers; variant selection is based on the variant list, properties, and Accept headers; the selected object is returned.

Cache server and client: plain embedded object request; variant selection is based on the variant list, properties, and client information; the selected object is returned.

Origin server and client: plain embedded object request; a variant list is returned; variant selection is based on the variant list and client information; the client requests the object URI of a specific version and receives the selected object.]

Content Adaptation [SMI98]


Overview of X-ICP

A Web caching protocol to support mobile users

Automatically connects the user's Foreign Proxy with the user's Home Proxy when the user changes the point of attachment on the network

Delivers the user's profile

If the network situation permits, delivers cached objects from the Home Proxy instead of from the origin Web site

Collects all downloaded objects and stores them on the Home Proxy while the user is on the move, so that the contents continue to be available upon the user's return
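The behavior listed above can be sketched as a lookup order: local cache on the Foreign Proxy, then the Home Proxy, then the origin site, with downloaded objects copied back to the Home Proxy. This is only a sketch under assumptions, not the x-ICP implementation: `CacheProxy`, `xicp_fetch`, the `home_reachable` flag (standing in for "if the network situation permits"), and the `origin` dictionary are all hypothetical names.

```python
from typing import Dict, Tuple


class CacheProxy:
    """Hypothetical proxy holding an in-memory object cache."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.cache: Dict[str, bytes] = {}


def xicp_fetch(url: str, foreign: CacheProxy, home: CacheProxy,
               origin: Dict[str, bytes],
               home_reachable: bool = True) -> Tuple[bytes, str]:
    """Resolve a request from a visiting user; return (body, source)."""
    if url in foreign.cache:
        return foreign.cache[url], "foreign"
    if home_reachable and url in home.cache:
        # Served from the user's Home Proxy instead of the origin site.
        body, source = home.cache[url], "home"
    else:
        body, source = origin[url], "origin"
        if home_reachable:
            # Duplicate the object so it is available on the user's return.
            home.cache[url] = body
    foreign.cache[url] = body
    return body, source
```

A request that misses on the Foreign Proxy but hits on the Home Proxy never reaches the origin site, which is the case the emulation later measures.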

X-ICP Infrastructure

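[Figure: the Home network and other networks connected through the Internet; the legend distinguishes cache exchange, motion, and connection.]

Modules of X-ICP

[Figure: a proxy server containing a Server Side Process (to the Web), a Client Side Process (from the client), a Storage Manager, a Node Monitor (exchanges with other Node Monitors), and a Cache Copier (exchanges with other Cache Copiers).]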

X-ICP Processes

Proxy and X-ICP Services Discovery

Mobile Node Registration

Web Object Delivery

Cache Contents Duplication


Emulation Environment

Proxy logs from the CISE department at UF are used as trace data. The URL field is the main field used to measure the cache hit rate.

Three of the five subnets on the CISE network, with different population sizes, were chosen. Each trace ran for 25 days.

ICP was implemented to query Aswan and/or Cairo in order to determine which proxy holds each object.

Mobility is emulated by clearing the cache every day, so the compulsory miss rate is higher.

Aswan and Cairo are configured as siblings in Squid.
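A trace-driven hit-rate measurement with a daily cache reset might look like the sketch below. It is not the emulation code used in the study. It assumes the trace has already been reduced to `(day, url)` pairs in log order, that the cache is unbounded, and that every URL is cacheable. The function name `daily_hit_rates` is hypothetical.

```python
from typing import Dict, Hashable, Iterable, Tuple


def daily_hit_rates(trace: Iterable[Tuple[Hashable, str]]) -> Dict[Hashable, float]:
    """Replay (day, url) records and return the hit rate for each day.

    The cache is cleared whenever the day changes, which emulates a user
    arriving at a proxy that has never seen that user's requests.
    """
    hits: Dict[Hashable, int] = {}
    total: Dict[Hashable, int] = {}
    cache = set()
    current_day = None
    for day, url in trace:
        if day != current_day:
            cache.clear()
            current_day = day
        total[day] = total.get(day, 0) + 1
        if url in cache:
            hits[day] = hits.get(day, 0) + 1
        else:
            cache.add(url)  # compulsory miss: first reference that day
    return {day: hits.get(day, 0) / total[day] for day in total}
```

Because the cache starts empty each day, the first reference to every URL is a miss, which is why the slide notes a higher compulsory miss rate.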

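[Figure: a client on the LAN, the Home Proxy (Aswan), and the Foreign Proxy (Cairo); the proxy hosts are labeled Solaris 6/Squid 2.4 and Solaris 8/Squid 2.4, communicate via ICP, and connect to the Internet.]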

Emulation Results(1/2)

[Figure: hit ratio (0 to 0.7) versus day (1 to 17) for the Home Proxy and the Foreign Proxy.]

Emulation Results(2/2)

[Figure: hit ratio (0 to 0.9) versus day (1 to 17) for the Home Proxy and the Foreign Proxy.]

Emulation Conclusion

With x-ICP deployed, an additional 21% hit rate can be achieved: this is the hit rate on the Home Proxy when users are attached to a Foreign Proxy.

Definitions

$$\text{x-ICP Speedup} = \frac{\text{Execution time for entire task without x-ICP}}{\text{Execution time for entire task using x-ICP when possible}}$$

Cf: cache hit rate on a Foreign Proxy

Ch: cache hit rate on a Home Proxy

Do: round-trip delay between Foreign Proxy and Origin Server

Dh: round-trip delay between Foreign Proxy and Home Proxy

[Figure: the Foreign Proxy (Cf) reaches the Home Proxy (Ch) with delay Dh and the Origin Site with delay Do.]

Performance Analysis of x-ICP

$$\text{Speedup} = \frac{(1 - C_f)\,D_o}{(1 - C_f)\,[\,D_h + (1 - C_h)\,D_o\,]}$$

$$\text{Speedup} = \frac{D_o}{D_h + (1 - C_h)\,D_o}$$

Performance Analysis of x-ICP

Let Do = 65 ms, based on Cottrell's study on "Internet Monitoring at SLAC" [COT00].

Let Ch = 21%, from our emulation study based on CISE Web caching logs.

$$\text{Speedup} = \frac{65}{D_h + 0.79 \times 65}$$

Analysis Results on x-ICP

[Figure: round-trip time Dh (ms, 0 to 16) versus speedup (1 to 1.25); the value 13.65 is marked, and the plot is annotated with the fit Y = 0.024X + 5.8887.]

With x-ICP deployed, a geographic distance of up to 201 miles between the two proxy servers is allowed.

RTT < 2 ms on the campus network.

The speedup is 1.22 with x-ICP deployed.
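The numbers on this slide can be checked with a few lines of arithmetic. The sketch below takes the speedup expression to be Do / (Dh + (1 - Ch) * Do), which is the reading consistent with the quoted values. It also assumes that the fit Y = 0.024X + 5.8887 maps a distance X in kilometers to an RTT Y in milliseconds. That unit is an inference, chosen because it reproduces the 201-mile figure. The function names are hypothetical.

```python
KM_PER_MILE = 1.609344


def speedup(d_o: float, c_h: float, d_h: float) -> float:
    """Speedup = Do / (Dh + (1 - Ch) * Do), delays in ms."""
    return d_o / (d_h + (1.0 - c_h) * d_o)


def break_even_dh(d_o: float, c_h: float) -> float:
    """Largest Dh for which the speedup is still >= 1, i.e. Dh = Ch * Do."""
    return c_h * d_o


def distance_km(rtt_ms: float, slope: float = 0.024,
                intercept: float = 5.8887) -> float:
    """Invert the assumed RTT-versus-distance fit Y = slope * X + intercept."""
    return (rtt_ms - intercept) / slope


if __name__ == "__main__":
    print(round(speedup(65, 0.21, 2), 2))            # campus network, Dh = 2 ms
    dh_max = break_even_dh(65, 0.21)                 # 13.65 ms
    print(round(distance_km(dh_max) / KM_PER_MILE))  # distance in miles
```

Setting the speedup to 1 gives Dh = Ch * Do = 13.65 ms, which is where the marked value on the plot comes from.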

Sensitivity Analysis - Do

[Figure: speedup (0 to 1.4) versus average regional RTT within North America (0 to 100 ms), with curves for Dh = 2 and Dh = 13.65; Do = 65 is marked.]

Generally speaking, the average regional RTT value has no significant impact on the speedup.
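A short worked form may help explain why the speedup is insensitive to Do. Assuming the speedup expression Do / (Dh + (1 - Ch) Do) from the performance analysis, dividing numerator and denominator by Do gives:

$$\text{Speedup} = \frac{1}{D_h / D_o + (1 - C_h)} \;\longrightarrow\; \frac{1}{1 - C_h} \quad \text{as } D_o \to \infty$$

With Ch = 0.21 the limit is 1/0.79, about 1.27. At Do = 65 ms and Dh = 2 ms the speedup is already about 1.22, so larger values of Do can raise it only slightly.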

Sensitivity Analysis - Ch

[Figure: speedup (0 to 35) versus cache hit rate on the Home Proxy, Ch (0 to 1), with curves for Dh = 2 and Dh = 13.65; Ch = 21% is marked.]

The increase in speedup is negligible when the cache hit rate is small.

Evaluation on X-ICP

With x-ICP deployed, a 21% cache hit rate can be achieved on the Home Proxy.

With that 21% hit rate:

A speedup of 1.22 can be gained on a campus-wide high-speed network.

The two proxy servers can be up to about 201 miles apart in the current Internet environment.

Summary on X-ICP

X-ICP extends the ICP caching protocol to support mobility.

X-ICP reduces the user's response time.

Under x-ICP, the user's profile follows the user while mobile. This provides a seamless Web experience.


Content Service Overlay Networks

[Figure: layered view with a packet network, a content network overlay with edge nodes, and a content services network overlay with servers; clients and the origin server attach to the network.]

Content Types

XHTML pages

Content files: video, image, text, audio

Content Delivery to Heterogeneous Devices

Existing approaches: content adaptation, content negotiation

Our approach: Priority Fidelity Markup Language (PFML)

Take advantage of the index page

Insert two types of new tags as metadata: Priority tag and Fidelity tag

The Hierarchy of PFML Elements

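[Figure: element hierarchy. PFML contains Priority; Priority contains Fidelity and other HTML tags; Fidelity contains Choice; Choice contains Img, Script, or Embed.]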

The Document Type Definition of PFML

<?xml version="1.0"?>
<!DOCTYPE PFML SYSTEM "PFML.dtd">
<!ELEMENT PFML (Priority*)>

<!ELEMENT Priority ANY>
<!ATTLIST Priority name CDATA #IMPLIED>
<!ATTLIST Priority value (0|1|2|3|4|5|6|7|8|9) '9'>
<!ATTLIST Priority fixed (Y|N) 'Y'>

<!ELEMENT Fidelity (choice*)>
<!ELEMENT choice (img* | script* | embed*)>
<!ATTLIST choice
    sourceQuality CDATA '1'
    type CDATA #IMPLIED
    charset CDATA #IMPLIED
    language CDATA #IMPLIED
    feature CDATA #IMPLIED
    … … >

Processing on PFML

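[Figure: a page with segments of priority "9", "5", and "0" and an embedded Foo.gif is processed in two steps. After step 1 only the "9" and "5" segments appear, still with Foo.gif; after step 2 the "9" and "5" segments appear with Foo.png.]

1. Partiality Adaptation

2. Versioning Negotiation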


Priority Tag

The Priority tag is used to divide a Web page into several portions.

Advantages:

Enables users of different devices to share the same index page, which increases the cache hit rate and reduces traffic and response time

Reduces the number of embedded objects to download, which saves bandwidth

Example of Priority Tag

<?xml version="1.0"?>
<Priority value='9' fixed='Y'>
<HTML>
<!-- Foo's personal Web site. -->
<HEAD> <TITLE> Foo's Home </TITLE> </HEAD>
<BODY>
<!-- self-introduction -->
<P> I am … </P>
</Priority>
<Priority value='5' fixed='N'>
<!-- Personal picture -->
<IMG SRC="Foo.gif" BORDER…>
<!-- My interests -->
<P> I like sports and music… </P>
<!-- friends' link -->
<P>Foo1 <A HREF=HTTP://…></P>
<P>Foo2 <A HREF=HTTP://…></P>
</Priority>
<Priority value='9' fixed='Y'>
<!-- contact information -->
<P> Phone #: (123)456-7890 </P>
</BODY>
</HTML>
</Priority>
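One way to picture partiality adaptation on such a page is a filter that keeps only the Priority segments at or above a device-specific threshold. The sketch below is not the deck's algorithm. It assumes that a higher value means a more important segment, uses a simplified, well-formed stand-in for the example page (`SAMPLE`), and introduces the hypothetical helpers `adapt` and `embedded_objects`. The default of 9 for a missing `value` attribute follows the DTD.

```python
import xml.etree.ElementTree as ET

# Simplified, well-formed stand-in for the example page (hypothetical).
SAMPLE = """
<PFML>
  <Priority value="9" fixed="Y"><P>I am Foo.</P></Priority>
  <Priority value="5" fixed="N"><IMG SRC="Foo.gif"/><P>I like sports.</P></Priority>
  <Priority value="9" fixed="Y"><P>Phone #: (123)456-7890</P></Priority>
</PFML>
"""


def adapt(pfml_text: str, min_priority: int) -> ET.Element:
    """Return the page tree without segments below min_priority."""
    root = ET.fromstring(pfml_text)
    for segment in list(root.findall("Priority")):
        value = int(segment.get("value", "9"))  # DTD default is '9'
        if value < min_priority:
            root.remove(segment)
    return root


def embedded_objects(root: ET.Element) -> list:
    """List the image sources that would still have to be downloaded."""
    return [img.get("SRC") for img in root.iter("IMG")]


if __name__ == "__main__":
    phone_view = adapt(SAMPLE, min_priority=9)
    laptop_view = adapt(SAMPLE, min_priority=0)
    print(embedded_objects(phone_view), embedded_objects(laptop_view))
```

In this sketch a small device asking only for priority 9 skips the segment that embeds Foo.gif, which matches the stated advantage of fewer embedded objects to download.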

The Adaptive Priority Decision Algorithm

Page Segment Priority Decision Algorithm

Agent priority Decision Algorithm

Algorithm Complexity

Maintained in O(log n)

Inserted or Deleted in O(log n)

Constructed in O(n)

Web Caching in Partiality Adaptation

A mobile device can take advantage of the copy of a Web page previously downloaded by some other device, for example a desktop, in a caching hierarchy.

A device with more capabilities can use the partial copy of a Web page downloaded previously by a smaller device, and send it to the user directly.

The user community is bigger, so the cache hit rate could be higher.

Experiments on Priority Tags

[Figure: a laptop computer, a PDA, and a cell phone connect to a proxy server, which reaches the Web server over the Internet.]

Experiment One on Partiality Adaptation (1/2)

[Figure: download time (ms, logarithmic scale from 1 to 100000) for CNN and Google pages on a laptop, a PDA, and a phone, comparing three cases: Remote, Extracted, and Cached.]

Experiment One on Partiality Adaptation (2/2)

When the speed of the wireless network is above 11 Mbps, it is 9 times faster to download a 50k Web page in the Extracted case than from the origin site.

Questions:

What if the speed of the wireless network is not fast enough?

What if the Web page is not big enough?

More experimentation is needed.

Experiment Two on Partiality Adaptation (1/2)

[Figure: downloading time (ms, 0 to 20000) versus page size (bytes, 0 to 30000) for the Remote and Extracted cases.]

Experiment Two on Partiality Adaptation (2/2)

Simulation data, for an average download of 4843 bytes:

With Priority Tags: 5910 ms

Without Priority Tags: 6857 ms

According to our simulation, using Priority Tags reduces the response time for cellular phone users browsing the Internet by about 1 second (947 ms, roughly 14%).


Fidelity Tag

Fidelity tags are mainly used for content negotiation.

They allow the Web server to insert the object lists and their attributes into a Web page at the point where the corresponding Web object is embedded.

Advantages:

Lets the user make the decision, which eliminates the CC/PP file fetching and parsing

Reduces the number of round-trips

Example of Fidelity Tag

<Fidelity>
  <choice sourceQuality="1" type="img/gif">
    <img src="/images/foo.gif" width="276" height="110" />
  </choice>
  <choice sourceQuality="0.6" type="img/png">
    <img src="/images/foo.png" width="76" height="30" />
  </choice>
  <choice> foo </choice>
</Fidelity>

<Fidelity>
  <choice sourceQuality="0.9" type="text/html" language="en">
    <doc src="/document/paper.html.en" />
  </choice>
  <choice sourceQuality="0.7" type="text/html" language="fr">
    <doc src="/document/paper.html.fr" />
  </choice>
  <choice sourceQuality="1.0" type="application/postscript" language="en">
    <doc src="/document/paper.ps.en" />
  </choice>
</Fidelity>
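A client-side selection over such a Fidelity block could be sketched as follows. This is an illustration under assumptions rather than the deck's PAVN implementation. The client is assumed to hold a table `accepted` that maps media types to a preference between 0 and 1. A choice is scored as `sourceQuality` times that preference, and a choice with no `type` attribute is treated as a plain-text fallback. The function name `select_variant` is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional

FIDELITY = """
<Fidelity>
  <choice sourceQuality="1" type="img/gif">
    <img src="/images/foo.gif" width="276" height="110" />
  </choice>
  <choice sourceQuality="0.6" type="img/png">
    <img src="/images/foo.png" width="76" height="30" />
  </choice>
  <choice> foo </choice>
</Fidelity>
"""


def select_variant(fidelity_xml: str, accepted: Dict[str, float]) -> Optional[str]:
    """Pick the object URI (or fallback text) the client should use."""
    root = ET.fromstring(fidelity_xml)
    best_src, best_score = None, 0.0
    for choice in root.findall("choice"):
        media_type = choice.get("type")
        obj = choice.find("*")  # first embedded element, if any
        if media_type is None or obj is None:
            continue
        quality = float(choice.get("sourceQuality", "1"))  # DTD default '1'
        score = quality * accepted.get(media_type, 0.0)
        if score > best_score:
            best_src, best_score = obj.get("src"), score
    if best_src is not None:
        return best_src
    # No typed variant is acceptable: fall back to a text-only choice.
    for choice in root.findall("choice"):
        text = (choice.text or "").strip()
        if choice.get("type") is None and text:
            return text
    return None
```

Because the variant list travels inside the page, the client can decide locally and then request one specific URI, which is where the saving in round-trips would come from.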

Experiment on Fidelity Tags

[Figure: an i95cl handset connects through a Nextel tower and the Internet to an Apache Web server; the CCPP and PAVN modules are labeled.]

Total Roundtrip Time

[Figure: round-trip time (ms, 0 to 12000) versus object size (bytes) for PAVN and CCPP.]

Time Measured on the Server Side

[Figure: server-side time in seconds (0 to 1.6) for CCPP and PAVN, broken down into total, system, and user time.]

Evaluation on Fidelity Tags

We saved about 1 second on the server side by using the PAVN module instead of the CC/PP module.

We saved about 0.8 second of total round-trip time in our implementation by using PAVN instead of CC/PP.

Summary on PFML

With the simple Priority and Fidelity metadata and associated algorithms, we give a better solution to the following problems:

Heterogeneity of Devices

Wide Variety of Web Contents

Lack of Automated User Intent

Low Bandwidth on Wireless Network

Conclusion

Designed a mobile Web caching protocol that:

Delivers Web contents from a nearby proxy

Delivers the user's personal profile

Designed adaptive PFML and associated algorithms, with:

Content adaptation

Content negotiation

Future Work

To model users' behavior while mobile:

How far away the foreign proxy is

How long the user lingers

What devices are being used

The speed of movement

How frequently the user moves

This will help in deploying proxies and in determining their functionality.

Publications

“Extended Internet Caching Protocol: A Foundation for Building Web Caching to Nomadic Users,” ACM Symposium on Applied Computing, Melbourne, FL, January 2003.

“Ubiquitous Web Caching,” submitted to Wireless Communication and Mobile Computing Magazine, John Wiley and Sons, 2003.

“Adaptive Content Delivery with XML,” to be submitted, ITC Specialist Seminar on Performance Evaluation of Wireless and Mobile Systems, Antwerp, Belgium, August 2004.

Questions

Thanks!

In the core: Flash, HTML, WML, DHTML, ASP, PHP, JPEG, PNG, GIF, MPEG1, Real, Windows Media, Quicktime, MPEG4

At the end: Desktop, PDA, Palmtop, Integrated Chip, Embedded Devices, WAP Phone, Laptop

Mobile IP Tunneling

Scenario on X-ICP Registration

Hops Detection(1/4)

Deploying x-ICP on different networks can bring more overhead.

If the two proxy servers are too far away from each other, x-ICP sibling configuration shouldn’t take place.

Hops, RTT, or physical distance of the two servers should be detected.

Sibling Proxy Configuration (2/4)

On the Foreign Proxy (proxyE):

cache_peer proxy4.Net1 sibling http-port icp-port

On the Home Proxy (proxy4):

acl ProxyE src ProxyE.Net2
http_access allow ProxyE
icp_access allow ProxyE

Register with Node Monitor (3/4)

[Figure: the Node Monitor keeps a ForeignUserList with entries Foreign User 1 through Foreign User n; the labels show a Care_of_Address and, for each user, an object list of URL 1 through URL n.]
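The registration structure in this figure could be represented roughly as below. The sketch is hypothetical throughout: the class and method names, the use of a user identifier as the key, and the choice to return the object list on deregistration are illustration choices, not details taken from the x-ICP design.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ForeignUser:
    """One ForeignUserList entry (field names are assumptions)."""
    care_of_address: str
    object_list: List[str] = field(default_factory=list)


class NodeMonitor:
    """Hypothetical sketch of the per-proxy registry of visiting users."""

    def __init__(self) -> None:
        self.foreign_user_list: Dict[str, ForeignUser] = {}

    def register(self, user_id: str, care_of_address: str) -> None:
        self.foreign_user_list[user_id] = ForeignUser(care_of_address)

    def record_download(self, user_id: str, url: str) -> None:
        # Track each URL once so it can later be copied to the Home Proxy.
        entry = self.foreign_user_list[user_id]
        if url not in entry.object_list:
            entry.object_list.append(url)

    def deregister(self, user_id: str) -> List[str]:
        """Remove the user and return the URLs downloaded while visiting."""
        return self.foreign_user_list.pop(user_id).object_list
```

The returned list is what a cache copier would need in order to duplicate the visiting user's objects on the Home Proxy.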

User Profile Delivery (4/4)

Bookmarks, history links, contact information, cookies, …

RTT vs. Distance (1/3) (courtesy of Stanford Linear Accelerator Center)

[Figure: delay (ms, 0 to 18) over six time periods of a day, with one series per hop for hop 1 through hop 7.]

RTT vs. Distance (2/3)

This is a traceroute-like simulation conducted in our lab. Requests are made to randomly selected top-100 Web sites. Round-trip times for the first 7 hops (routers) are collected. The first 6 hops are on campus. It shows that the RTT is < 2 ms on the campus backbone.

Page Segment Priority Value Decision Algorithm (1/2)

# Nc: total number of clicks; incremented upon each click
# Ns: total number of segments of a page
# t: a function to calculate a specific threshold with parameters
# Pi: priority value for segment i
# Ti: the time stamp at which Pi was generated
# Tnow: the current time
# Ci: total number of clicks on segment i
# Ci': total number of clicks on segment i sent from the client agent

# executed on each access
for each segment i {
    Ci ← Ci + Ci';
    Nc ← Nc + Ci';
}

Page Segment Priority Value Decision Algorithm (2/2)

# executed periodically
for each segment i {
    # priority value increment
    if ( Ci > t(Ci, Pi, Nc, Ns) and Pi < 9 ) {
        Pi ← Pi + 1;
        Ti ← Tnow;
    }
    # priority value decrement
    # expired means the segment hasn't been touched for a period
    else if ( Ti expired and Pi > 0 ) {
        Pi ← Pi - 1;
        Ti ← Tnow;
    }
}
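The two phases of the page segment algorithm (accumulating reported clicks on each access, then adjusting priorities periodically) can be made runnable once a threshold is chosen. The slides leave the threshold function t unspecified, so the sketch below substitutes the average number of clicks per segment. That choice is an assumption, as are the initial priority of 5, the `expiry` parameter, and the class names.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Segment:
    priority: int = 5   # Pi (initial value is an assumption)
    clicks: int = 0     # Ci
    stamp: float = 0.0  # Ti


def threshold(ci: int, pi: int, nc: int, ns: int) -> float:
    """Hypothetical stand-in for t(): average clicks per segment."""
    return nc / ns if ns else 0.0


class Page:
    def __init__(self, n_segments: int, expiry: float = 3600.0) -> None:
        self.segments: List[Segment] = [Segment() for _ in range(n_segments)]
        self.total_clicks = 0  # Nc
        self.expiry = expiry   # how long Ti stays fresh (assumption)

    def record_access(self, reported: Sequence[int]) -> None:
        """Executed on each access: add the counts Ci' sent by the client agent."""
        for seg, count in zip(self.segments, reported):
            seg.clicks += count
            self.total_clicks += count

    def periodic_update(self, now: float) -> None:
        """Executed periodically: raise popular segments, decay stale ones."""
        ns = len(self.segments)
        for seg in self.segments:
            limit = threshold(seg.clicks, seg.priority, self.total_clicks, ns)
            if seg.clicks > limit and seg.priority < 9:
                seg.priority += 1
                seg.stamp = now
            elif now - seg.stamp > self.expiry and seg.priority > 0:
                seg.priority -= 1
                seg.stamp = now
```

With two segments where only the first is clicked, repeated updates raise the first segment toward 9 while the second decays toward 0 once its time stamp expires.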

Client Agent Priority Value Decision Algorithm (1/2)

# Nj,c: total number of clicks on a page; incremented upon each click
# Nj,s: total number of segments of a page
# Np: total number of pages
# t: a function to calculate a specific threshold with parameters
# Pj,i: priority value for segment i
# Pj,c: priority value for a page
# Tj,c: the time stamp at which Pj,c was generated
# Vk: the total number of pages having priority k, where 0 <= k <= 9
# Pa: priority value for a client agent
# Ta: the time stamp at which Pa was generated
# Cj,i: total number of clicks on segment i, page j

# upon each click
if (new page)
    Np ← Np + 1;
    Initialize new (Cj,i)s to 0;
    Initialize new Pj,c to 0;
Cj,i ← Cj,i + 1;
Nj,c ← Nj,c + 1;

Client Agent Priority Value Decision Algorithm (2/2)

# at the idle time
for each page j {
    Pj,c' ← Pj,c;
    # change priority of page
    for each segment i
        if ( Cj,i > t(Nj,c, Nj,s) and Pj,c > Pj,i )
            Pj,c ← Pj,i;
            Tj,c ← Tnow;
        else if ( Tj,c expired )
            if ( Pj,c < 9 )
                Pj,c ← Pj,c + 1;
                Tj,c ← Tnow;
    # change priority of agent
    if ( Pj,c <> Pj,c' )
        k ← Pj,c'; Vk ← Vk - 1;
        k ← Pj,c; Vk ← Vk + 1;
        if ( Vk > t(Np) and Pa > k )
            Pa ← k;
            Ta ← Tnow;
        else if ( Ta expires and Pa < 9 )
            Pa ← Pa + 1;
            Ta ← Tnow;
}

RVSA details:

The overall quality Q of a variant is the value of

Q = round5( qs * qt * qc * ql * qf )

qs: the source quality factor in the variant description
qt: the media type quality factor
qc: the charset quality factor
ql: the language quality factor
qf: the features quality factor

Example of RVSA

Variant list:

{"paper.html.en" 0.9 {type text/html} {language en}},
{"paper.html.fr" 0.7 {type text/html} {language fr}},
{"paper.ps.en" 1.0 {type application/postscript} {language en}}

Request Accept- headers:

Accept: text/html;q=1.0, */*;q=0.8
Accept-Language: en;q=1.0, fr;q=0.5

Computations:

                round5( qs  * qt  * qc  * ql  * qf  ) = Q
paper.html.en:          0.9 * 1.0 * 1.0 * 1.0 * 1.0   = 0.90000
paper.html.fr:          0.7 * 1.0 * 1.0 * 0.5 * 1.0   = 0.35000
paper.ps.en:            1.0 * 0.8 * 1.0 * 1.0 * 1.0   = 0.80000
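The computation above can be reproduced in a few lines. The sketch takes round5 to mean rounding to five decimal places and takes the per-variant quality factors directly from the worked example; the function and variable names are hypothetical. Parsing the Accept headers and matching them against variants is left out.

```python
def round5(x: float) -> float:
    """Round to five decimal places."""
    return round(x, 5)


def overall_quality(qs: float, qt: float = 1.0, qc: float = 1.0,
                    ql: float = 1.0, qf: float = 1.0) -> float:
    """Q = round5(qs * qt * qc * ql * qf)."""
    return round5(qs * qt * qc * ql * qf)


# Quality factors copied from the worked example above.
VARIANTS = {
    "paper.html.en": dict(qs=0.9, qt=1.0, ql=1.0),
    "paper.html.fr": dict(qs=0.7, qt=1.0, ql=0.5),
    "paper.ps.en": dict(qs=1.0, qt=0.8, ql=1.0),
}

scores = {name: overall_quality(**factors) for name, factors in VARIANTS.items()}
best = max(scores, key=scores.get)

if __name__ == "__main__":
    for name, q in scores.items():
        print(f"{name}: {q:.5f}")
    print("best variant:", best)
```

The English HTML variant wins because the PostScript variant is penalized by the 0.8 media type factor and the French variant by the 0.5 language factor.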
