1
Berkeley-Helsinki Summer Course
Lecture #12: Introspection and Adaptation
Randy H. Katz
Computer Science Division
Electrical Engineering and Computer Science Department
University of California
Berkeley, CA 94720-1776
2
Outline
• Introspection Concept and Methods
• SPAND Content Level Adaptation
• MIT Congestion Manager/TCP Layer Adaptation
• ICAP Cache-Layer Adaptation
3
Outline
• Introspection Concept and Methods
• SPAND Content Level Adaptation
• MIT Congestion Manager/TCP Layer Adaptation
• ICAP Cache-Layer Adaptation
4
Introspection
• From Latin introspicere, “to look within”
– Process of observing the operations of one’s own mind with a view to discovering the laws that govern the mind
• Within the context of computer systems
– Observing how a system is used (observe): usage patterns, network activity, resource availability, denial of service attacks, etc.
– Extracting a behavioral model from such use (discover)
– Use this model to improve the behavior of the system, making it more proactive, rather than reactive, to how it is used
– Improve performance and fault tolerance, e.g., deciding when to make replicas of objects and where to place them
5
Introspection in Computer Systems
• Locality of Reference
– Temporal: objects that are used are likely to be used again in the near future
– Geographic: objects near each other are likely to be used together
• Exploited in many places
– Hardware caches, virtual memory mechanisms, file caches
– Object interrelationships
– Adaptive name resolution
– Mobility patterns
• Implications
– Prefetching/prestaging
– Clustering/grouping
– Continuous refinement of behavioral model
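The observe/discover/act cycle behind locality-driven prefetching can be sketched in a few lines. This is a minimal illustration, not anything from the OceanStore code; all names (`PrefetchModel`, `observe`, `prefetch_candidate`) are made up for the example.

```python
from collections import defaultdict

class PrefetchModel:
    """Toy behavioral model: record which object tends to follow which
    (temporal/geographic locality) and suggest a prefetch candidate."""

    def __init__(self):
        self.follows = defaultdict(lambda: defaultdict(int))
        self.last = None

    def observe(self, obj):
        # Observe: record the access and the (previous -> current) transition.
        if self.last is not None:
            self.follows[self.last][obj] += 1
        self.last = obj

    def prefetch_candidate(self, obj):
        # Act: suggest the object most often seen right after `obj`.
        successors = self.follows.get(obj)
        if not successors:
            return None
        return max(successors, key=successors.get)

model = PrefetchModel()
for access in ["a", "b", "a", "b", "a", "c"]:
    model.observe(access)

print(model.prefetch_candidate("a"))  # "b" followed "a" twice, "c" once
```

Continuous refinement of the model falls out naturally: every new access updates the transition counts, so the prefetch decision adapts as usage changes.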
6
Example: Wide-Area Routing and Data Location in OceanStore
• Requirements
– Find data quickly, wherever it might reside
» Locate nearby data without global communication
» Permit rapid data migration
– Insensitive to faults and denial of service attacks
» Provide multiple routes to each piece of data
» Route around bad servers and ignore bad data
– Repairable infrastructure
» Easy to reconstruct routing and location information
• Technique: Combined Routing and Data Location
– Packets are addressed to GUIDs, not locations
– Infrastructure gets the packets to their destinations and verifies that servers are behaving
John Kubiatowicz
7
Two-levels of Routing
• Fast, probabilistic search for “routing cache”
– Built from attenuated Bloom filters
– Approximation to gradient search
– Not going to say more about this today
• Redundant Plaxton Mesh used for underlying routing infrastructure:
– Randomized data structure with locality properties
– Redundant, insensitive to faults, and repairable
– Amenable to continuous adaptation to adjust for:
» Changing network behavior
» Faulty servers
» Denial of service attacks
John Kubiatowicz
8
Basic Plaxton Mesh
Incremental suffix-based routing

[Figure: Plaxton mesh routing example over nodes 0x13FE, 0xABFE, 0x1290, 0x239E, 0x73FE, 0x423E, 0x79FE, 0x23FE, 0x73FF, 0x555E, 0x035E, 0x44FE, 0x9990, 0xF990, 0x993E, and 0x04FE; numbered hops (1–4) resolve one suffix digit at a time toward the destination NodeID 0x43FE.]
John Kubiatowicz
10
Use of the Plaxton Mesh(Tapestry Infrastructure)
• As in original Plaxton scheme:
– Scheme to directly map GUIDs to root node IDs
– Replicas publish toward a document root
– Search walks toward root until pointer located: locality!
• OceanStore enhancements for reliability:
– Documents have multiple roots (salted hash of GUID)
– Each node has multiple neighbor links
– Searches proceed along multiple paths
» Tradeoff between reliability and bandwidth?
– Routing-level validation of query results
• Dynamic node insertion and deletion algorithms
– Continuous repair and incremental optimization of links
John Kubiatowicz
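The incremental suffix-based routing step can be sketched as follows. This is a simplified illustration of the idea (greedily move to a neighbor sharing a longer trailing-digit suffix with the destination), not Tapestry's actual routing-table structure; the node IDs and neighbor table are hypothetical.

```python
def shared_suffix_len(a, b):
    """Number of trailing hex digits two IDs have in common."""
    n = 0
    while n < len(a) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n

def next_hop(current, dest, neighbors):
    """One routing step: pick a neighbor matching the destination in
    more trailing digits than the current node does; None at the root."""
    have = shared_suffix_len(current, dest)
    best = max(neighbors, key=lambda n: shared_suffix_len(n, dest), default=None)
    if best is None or shared_suffix_len(best, dest) <= have:
        return None
    return best

# Hypothetical 4-digit IDs in the style of the mesh figure.
table = {
    "0290": ["993E", "04FE"],   # resolve toward suffix "...FE"
    "04FE": ["43FE", "73FE"],   # resolve toward suffix "...3FE"
}
node, dest, route = "0290", "43FE", []
while node != dest:
    route.append(node)
    node = next_hop(node, dest, table[node])
print(route + [node])  # each hop fixes at least one more suffix digit
```

Because every hop strictly increases the matched suffix, the route length is bounded by the number of digits in an ID, which is what gives the mesh its locality and repairability properties.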
11
OceanStore Domains for Introspection
• Network Connectivity, Latency
– Location tree optimization, link failure recovery
• Neighbor Nodes
– Clock synchronization, node failure recovery
• File Usage
– File migration
– Clustering related files
– Prefetching, hoarding
• Storage Peers
– Accounting, archive durability, blacklisting
• Meta-Introspection
– Confidence estimation, stability
Dennis Geels, geels@cs.Berkeley.edu
12
Common Functionality
• These targets share some requirements:
– High input rates
» Watch every file access, heartbeat, packet transmission
– Both short- and long-term decisions
» Respond to changes immediately
» Extract patterns from historical information
– Hierarchical, distributed analysis
» Low levels make decisions based on local information
» Higher levels possess broader, approximate knowledge
» Nodes must cooperate to solve problem
• We can build shared infrastructure
Dennis Geels, geels@cs.Berkeley.edu
13
Architecture for Wide-Area Introspection
• Fast Event-Driven Handlers
– Filter and aggregate incoming events
– Respond immediately if necessary
• Local Database, Periodic Analysis
– Store historical information for trend-watching
– Allow more complicated, off-line algorithms
• Location-Independent Routing
– Flexible coordination, communication
Dennis Geels, geels@cs.Berkeley.edu
14
Event-Driven Handlers
• Treat all incoming data as events: messages, timeouts, etc.
– Leads to natural state-machine design
– Events cause state transitions, finite processing time
– A few common primitives could be powerful: average, count, filter by predicate, etc.
• Implemented in “small language”
– Contains important primitives for aggregation, database access
– Facilitates implementation of introspective algorithms
» Allows greater exploration, adaptability
– Can verify security, termination guarantees
• E.g., EVENT.TYPE == “file access”: increment COUNT in EDGES where SRC == EVENT.SRC and DST == EVENT.DST
Dennis Geels, geels@cs.Berkeley.edu
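The slide's example rule can be mimicked with handlers as (predicate, action) pairs applied to each incoming event. This is a toy sketch of the idea, not the actual "small language"; the `EDGES` table and event fields follow the slide's example, everything else is illustrative.

```python
from collections import defaultdict

EDGES = defaultdict(int)   # COUNT per (SRC, DST) edge
handlers = []

def on(predicate, action):
    """Register a handler: run `action` on every event matching `predicate`."""
    handlers.append((predicate, action))

def dispatch(event):
    # Each event causes a state transition with finite processing time.
    for predicate, action in handlers:
        if predicate(event):
            action(event)

def count_edge(e):
    # "increment COUNT in EDGES where SRC==EVENT.SRC and DST==EVENT.DST"
    EDGES[(e["src"], e["dst"])] += 1

on(lambda e: e["type"] == "file access", count_edge)

dispatch({"type": "file access", "src": "n1", "dst": "n2"})
dispatch({"type": "heartbeat",   "src": "n1", "dst": "n2"})  # filtered out
dispatch({"type": "file access", "src": "n1", "dst": "n2"})
print(EDGES[("n1", "n2")])  # 2
```

Restricting handlers to a small set of primitives (count, average, filter) is what makes the security and termination checks mentioned above tractable.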
15
Local Database, Periodic Analysis
• Database Provides Powerful, Flexible Storage
– Persistent data allows long-term analysis
– Standard interface for event handler scripting language
– Leverage existing aggregation functionality
» Considerable work from Telegraph Project
– Can be lightweight
• Sophisticated Algorithms Run on Databases
– Too resource-intensive to operate directly on events
– Allow use of full programming language
– Security, termination still checkable; should use common mechanisms
• E.g., expensive clustering algorithm operating over edge graph, using sparse-matrix operations to extract eigenvectors representing related files
Dennis Geels, geels@cs.Berkeley.edu
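The eigenvector idea can be illustrated with plain power iteration on a tiny co-access matrix. This is only a sketch of the flavor of such an off-line algorithm (a real system would use a sparse-matrix library and a proper spectral method); the matrix and threshold are made up.

```python
def power_iteration(matrix, steps=50):
    """Approximate the dominant eigenvector of a small dense matrix."""
    n = len(matrix)
    v = [1.0] * n
    for _ in range(steps):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = max(abs(x) for x in w) or 1.0
        v = [x / norm for x in w]          # renormalize each step
    return v

# Co-access counts from the EDGES graph: files 0 and 1 are used
# together often, file 2 mostly alone.
A = [[0, 5, 0],
     [5, 0, 0],
     [0, 0, 1]]
v = power_iteration(A)
related = [i for i, x in enumerate(v) if abs(x) > 0.5]
print(related)  # the dominant eigenvector highlights the {0, 1} cluster
```

Files with large components in the same eigenvector are candidates for clustering together (and for joint migration or prefetching).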
16
Location-Independent Routing
• Not a very good name for a rather simple idea. Interesting introspective problems are inherently distributed, and coordination among nodes is difficult. Needed:
– Automatically create/locate parents in aggregation hierarchy
– Path redundancy for stability, availability
– Scalability
– Fault tolerance, responsiveness to fluctuation in workload
• The OceanStore data location system shares these requirements. This coincidence is not surprising, as each is an instance of wide-area distributed problem solving.
• Leverage OceanStore Location/Routing System
Dennis Geels, geels@cs.Berkeley.edu
17
Summary: Introspection in OceanStore
• Recognize and share a few common mechanisms
– Efficient event-driven handlers
– More powerful, database-driven algorithms
– Distributed, location-independent routing
• Leverage common architecture to allow system designers to concentrate on developing & optimizing domain-specific algorithms
Dennis Geels, geels@cs.Berkeley.edu
18
Outline
• Introspection Concept and Methods
• SPAND Content Level Adaptation
• MIT Congestion Manager/TCP Layer Adaptation
• ICAP Cache-Layer Adaptation
21
What is Needed
• An efficient, accurate, extensible and time-aware system that makes shared, passive measurements of network performance
• Applications that use this performance measurement system to enable or improve their functionality
Mark Stemm
22
Issues to Address
• Efficiency: What are the bandwidth and response time overheads of the system?
• Accuracy: How closely does predicted value match actual client performance?
• Extensibility: How difficult is it to add new types of applications to the measurement system?
• Time-aware: How well does the system adapt to and take advantage of temporal changes in network characteristics?
Mark Stemm
24
Related Work
• Previous work to solve this problem
– Use active probing of network
– Depend on results from a single host (no sharing)
– Measure the wrong metrics (latency, hop count)
• NetDyn, NetNow, Imeter
– Measure latency and packet loss probability
• Packet Pair, bprobes
– With Fair Queuing, measures “fair share” of bottleneck link b/w
– Without Fair Queuing, unknown (min close to link b/w)
• Pathchar
– Combines traceroute & packet pair to find hop-by-hop latency & link b/w
• Packet Bunch Mode
– Extends back-to-back technique to multiple packets for greater accuracy
Mark Stemm
25
Related Work
• Probing Algorithms
– Cprobes: sends small group of echo packets as a simulated connection (w/o flow or congestion control)
– Treno: like above, but with TCP flow/congestion control algorithms
– Network Probe Daemon: traces route or makes short connection to other network probe daemons
– Network Weather Service: makes periodic transfers to distributed servers to determine b/w and CPU load on each
Mark Stemm
26
Related Work
• Server Selection Systems
– DNS to map name to many servers
» Either round-robin or load balancing
– Boston University: uses cprobes, bprobes
– Harvest: uses round trip time
– Harvard: uses geographic location
– Using routing metrics:
» IPv6 Anycast
» HOPS
» Cisco Distributed Director
» University of Colorado
– IBM WOM: uses ping times
– Georgia Tech: uses per-application, per-domain probe clients
Mark Stemm
27
Comparison with Shared Passive Measurement
• What is measured?
– Others: latency, link b/w, network b/w
– SPAND: actual response time, application specific
• Where is it implemented?
– Others: internal network, at server
– SPAND: only in client domain
• How much additional traffic is introduced?
– Others: tens of Kbytes per probe
– SPAND: small performance reports and responses
• How realistic are the probes?
– Others: artificially generated probes that don’t necessarily match realistic application workloads
– SPAND: actual observed performance from applications
Mark Stemm
28
Comparison with Shared Passive Measurement
• Does the probing use flow/congestion control?
– Others: no
– SPAND: whatever the application uses (usually yes)
• Do clients share performance information?
– Others: no; sometimes probes are made on behalf of clients
– SPAND: yes
Mark Stemm
29
Benefits of Sharing and Passive Measurements
• Two similarly connected hosts are likely to observe same performance of distant hosts
• Sharing measurements implies redundant probes can be eliminated
Mark Stemm
32
Design of SPAND
• Modified Clients
– Make Performance Reports to Performance Servers
– Send Performance Requests to Performance Servers
• Performance Servers
– Receive reports from clients
– Aggregate/post-process reports
– Respond to requests with Performance Responses
• Packet Capture Host
– Snoops on local traffic
– Makes Performance Reports on behalf of unmodified clients
Mark Stemm
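The report/request cycle above can be sketched as a tiny in-memory Performance Server. This is an illustration of the data flow only, assuming a simple mean as the aggregate; the class and method names are made up, not SPAND's actual interfaces.

```python
from collections import defaultdict

class PerformanceServer:
    """Toy SPAND-style server: aggregates passive client reports
    keyed by (address, application class)."""

    def __init__(self):
        self.reports = defaultdict(list)

    def report(self, addr, app_class, response_time_s):
        # A client passively observed this response time; share it.
        self.reports[(addr, app_class)].append(response_time_s)

    def request(self, addr, app_class):
        # Answer a Performance Request with the aggregate (here: mean).
        samples = self.reports.get((addr, app_class))
        if not samples:
            return None        # no shared measurements yet
        return sum(samples) / len(samples)

server = PerformanceServer()
server.report("server-a", "bulk", 2.0)   # reported by one local client
server.report("server-a", "bulk", 4.0)   # reported by another
print(server.request("server-a", "bulk"))  # 3.0, without any active probe
```

The point of the design shows up in the last line: a client that has never contacted `server-a` can still get a performance estimate, because similarly connected hosts in its domain already have.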
34
Design of SPAND
• Application Classes
– Way in which an application uses the network
– Examples:
» Bulk transfer: uses flow control, congestion control, reliable delivery
» Telnet: uses reliability
» Real-time: uses flow control and reliability
– (Addr, Application Class) is target of a Performance Request/Report
Mark Stemm
35
Issues
• Accuracy
– Is net performance stable enough to make meaningful Performance Reports?
– How long does it take before the system can service the bulk of the Performance Requests?
– In steady state, what % of Performance Requests does the system service?
– How accurate are Performance Responses?
• Stability
– Performance results must not vary much with time
• Implications of Connection Lengths
– Short TCP connections dominated by round trip time; long connections by available bandwidth
Mark Stemm
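The connection-length point is easy to see with a back-of-the-envelope model: response time is roughly a round trip of setup cost plus size divided by available bandwidth. The numbers below are illustrative, not measurements from SPAND.

```python
def transfer_time(size_bytes, rtt_s, bandwidth_Bps):
    """Crude model: one RTT of overhead plus serialization at the
    available bandwidth."""
    return rtt_s + size_bytes / bandwidth_Bps

rtt, bw = 0.1, 1_000_000                      # 100 ms RTT, ~1 MB/s
short = transfer_time(10_000, rtt, bw)        # 10 KB page
long_ = transfer_time(10_000_000, rtt, bw)    # 10 MB download
print(round(short, 3), round(long_, 3))
# the 10 KB transfer (~0.11 s) is mostly RTT;
# the 10 MB transfer (~10.1 s) is mostly bandwidth
```

This is why a Performance Report needs the application class and transfer size context: the metric that matters (RTT vs. available bandwidth) depends on how long the connection is.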
42
Content Negotiation Results
• Network is the bottleneck for clients and servers
• Content negotiation can reduce download times of web clients
• Content negotiation can increase throughput of web servers
• Actual benefit depends on fraction of negotiable documents
Mark Stemm
43
Outline
• Introspection Concept and Methods
• SPAND Content Level Adaptation
• MIT Congestion Manager/TCP Layer Adaptation
• ICAP Cache-Layer Adaptation
44
Congestion Manager (Hari@MIT, Srini@CMU)
• The Problem:
– Communication flows compete for the same limited bandwidth resource (especially on slow start!), each implementing its own congestion response within the end node, with no shared learning: inefficient
• The Power of Shared Learning and Information Sharing
• The Power of Shared Learning and Information Sharing
[Figure: a client and server exchanging flows f1, f2, …, f(n) across the Internet; each flow manages congestion independently.]
45
Adapting to Network
• New applications may not use TCP
– Implement new protocol
– Often do not adapt to congestion: not “TCP-friendly”

[Figure: a non-TCP flow f1 between server and client across the Internet; its congestion behavior is unknown.]

Need system that helps applications learn and adapt to congestion
46
State of Congestion Control
• Increasing number of concurrent flows
• Increasing number of non-TCP apps
Congestion Manager (CM): an end-system architecture for congestion management
47
The Big Picture
[Figure: the CM sits between IP and the transports/applications (HTTP over TCP1 and TCP2, audio and video over UDP). All congestion management tasks are performed in the CM; applications learn and adapt using its API, sharing per-macroflow statistics (cwnd, rtt, etc.).]
48
Problems
• How does CM control when and whose transmissions occur?
– Keep application in control of what to send
• How does CM discover network state?
– What information is shared?
– What is the granularity of sharing?

Key issues: API and information sharing
49
The CM Architecture
[Figure: CM architecture. On the sender, applications (TCP, conferencing app, etc.) sit above the API, which fronts a Congestion Controller, Scheduler, and Prober; the receiver runs a Responder and Congestion Detector; sender and receiver communicate via the CM protocol.]
50
Feedback about Network State
• Monitoring successes and losses
– Application hints
– Probing system
• Notification API (application hints)
51
Probing System
• Receiver modifications necessary
– Support for separate CM header
– Uses sequence number to detect losses
– Sender can request count of packets received
• Receiver modifications detected/negotiated via handshake
– Enables incremental deployment
[Figure: the CM header is inserted between the IP header and the IP payload.]
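Loss detection from CM-header sequence numbers can be sketched as a receiver that counts packets and infers gaps. This is an illustration of the mechanism only; the class and field names are made up, not from the CM implementation.

```python
class Receiver:
    """Toy CM-style receiver: infer losses from sequence-number gaps
    and keep the count of packets received for the sender to query."""

    def __init__(self):
        self.expected = 0
        self.received = 0
        self.lost = 0

    def on_packet(self, seq):
        if seq > self.expected:
            # A gap in sequence numbers means packets were lost.
            self.lost += seq - self.expected
        self.received += 1
        self.expected = seq + 1

r = Receiver()
for seq in [0, 1, 3, 4, 7]:        # packets 2, 5, and 6 never arrive
    r.on_packet(seq)
print(r.received, r.lost)           # 5 received, 3 inferred lost
```

Because this logic needs only the extra header and a counter, a receiver upgrade is small, and the handshake mentioned above lets unmodified receivers keep working: incremental deployment.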
52
Congestion Controller
• Responsible for deciding when to send a packet
• Window-based AIMD with traffic shaping
• Exponential aging when feedback low
– Halve window every RTT (minimum)
• Plug in other algorithms
– Selected on a “macro-flow” granularity
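The controller's window rule can be sketched as textbook AIMD plus the exponential-aging behavior described above. This is a minimal illustration, not CM's actual controller; the constants and class name are made up.

```python
class AIMDController:
    """Toy window-based AIMD: additive increase per ack,
    multiplicative decrease on loss, and exponential aging
    (halve the window each RTT with no feedback)."""

    def __init__(self, cwnd=1.0):
        self.cwnd = cwnd

    def on_ack(self):
        self.cwnd += 1.0 / self.cwnd      # ~ +1 packet per window

    def on_loss(self):
        self.cwnd = max(1.0, self.cwnd / 2)

    def on_silent_rtt(self):
        # Exponential aging: an RTT with no feedback halves the window.
        self.cwnd = max(1.0, self.cwnd / 2)

c = AIMDController(cwnd=8.0)
c.on_loss()          # 8 -> 4
for _ in range(4):
    c.on_ack()       # grows by roughly one packet over the window
c.on_silent_rtt()    # aged while no feedback arrived
print(round(c.cwnd, 2))
```

Because the window lives in the CM rather than in each flow, a new flow in the same macro-flow starts from this shared state instead of probing from scratch.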
53
Scheduler
• Responsible for deciding who should send a packet
• Hierarchical round robin
• Hints from application or receiver
– Used to prioritize flows
• Plug in other algorithms
– Selected on a “macro-flow” granularity
– Prioritization interface may be different
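The scheduler's job (deciding who sends next, with hints raising a flow's share) can be sketched as a weighted round robin over the flows of a macro-flow. This is a stand-in for the hierarchical scheme, not CM's actual scheduler; all names and weights are illustrative.

```python
from collections import deque

class RoundRobinScheduler:
    """Toy weighted round robin: hinted flows get more slots per cycle."""

    def __init__(self):
        self.queue = deque()               # (flow_id, weight)

    def add_flow(self, flow_id, weight=1):
        self.queue.append((flow_id, weight))

    def schedule(self, n_slots):
        """Return the send order for the next n_slots transmissions."""
        order = []
        while len(order) < n_slots and self.queue:
            flow, weight = self.queue.popleft()
            order.extend([flow] * min(weight, n_slots - len(order)))
            self.queue.append((flow, weight))   # rotate for round robin
        return order

s = RoundRobinScheduler()
s.add_flow("audio", weight=2)   # hinted as latency-sensitive
s.add_flow("http")
print(s.schedule(6))  # ['audio', 'audio', 'http', 'audio', 'audio', 'http']
```

The application stays in control of *what* to send; the scheduler only decides *who* sends *when*, which is the split of responsibilities the CM API is built around.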
54
CM Web Performance
[Figure: sequence number vs. time for web transfers with CM and with TCP NewReno; CM greatly improves predictability and consistency.]
55
Layered Streaming Audio
Audio adapts to available bandwidth; the combination of TCP & audio competes equally with normal TCP.

[Figure: sequence number vs. time (0–25 s) for a competing TCP flow, TCP/CM, and Audio/CM.]
56
Congestion Manager Summary
• CM enables proper & stable congestion behavior
• Simple API enables app to learn/adapt to network state
• Improves consistency/predictability of net transfers
• CM provides benefit even when deployed at senders alone
57
Outline
• Introspection Concept and Methods
• SPAND Content Level Adaptation
• MIT Congestion Manager/TCP Layer Adaptation
• ICAP Cache-Layer Adaptation
58
How Internet Content is Delivered Today
[Figure: today’s delivery path. Applications and multiple versions of content (English, Spanish) are centralized in server farms, databases, mainframes, and component solutions; clients in cities such as Boston and New York reach them across the Internet over last-mile broadband access (cable modems, DSL, dial-up, wireless). Caching & Internet content delivery localizes content.]
59
What is iCAP?
• iCAP lets clients send HTTP messages to servers for “adaptation”
– In essence, an “RPC” mechanism (Remote Procedure Call) for HTTP messages
• An adapted message might be a request:
– Modify request method, URL being requested, etc.
• ...or, it might be a reply:
– Change any aspect of delivered content
• iCAP enables edge services
60
What iCAP is not (for now)
• A way to specify adaptation policy
• A configuration protocol
• A protocol that establishes trust between previously unrelated parties
• In other words: iCAP defines the how, not the who, when, or why
61
iCAP Makes Content Smarter!
iCAP Enables Local Services

[Figure: clients in an ISP network are served from local sources of content (a content distribution network or cache) instead of crossing a congested, slow, distant, and/or expensive link through a large backbone ISP to the server farms. Local sources of content: better for everyone (client, network, server).]
62
Why iCAP?
• Fast, Simple, Scalable
• Allows services to be customized

[Figure: a web server or proxy surrounded by iCAP servers for compute-intensive operations: virus checker, transcoder, content filter, language translator, ad insertion.]
63
ICAP Benefits
• Very simple operation
– iCAP builds on HTTP GET and POST
• No proprietary APIs required
• Standards-based
• Leverages the latest Internet infrastructure developments
• Fast, simple, scalable, and reliable
• Allows you to customize services
64
iCAP general design
• Simple, simple, simple: CGI should be able to turn a web server into an iCAP server
• Based on HTTP (+ special headers)
• Three modes:
– Modify a request
– Satisfy a request (like any other proxy)
– Modify a response
65
Request Modification
• The request is passed to the iCAP server (almost) unmodified, just like a proxy would pass it
• The iCAP server sends back a modified request, encapsulated in response headers
– Body, if any (e.g. POST), may also be modified
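The request-modification flow can be sketched as a function that takes the encapsulated HTTP request and hands back a modified one. This illustrates only the data flow, not the real iCAP wire format; the server logic (rewriting a URL to a language variant, in the spirit of the English/Spanish example earlier) and all names are hypothetical.

```python
# Toy iCAP-style request modification: the iCAP client forwards the
# original request; the "server" returns a rewritten request.
def icap_server(encapsulated_request):
    method, url, headers = encapsulated_request
    if headers.get("Accept-Language", "").startswith("es"):
        # Adapt the request: point Spanish-speaking clients at the
        # Spanish variant (hypothetical URL layout).
        url = url.replace("/index.html", "/es/index.html")
    return (method, url, headers)

original = ("GET", "http://example.com/index.html",
            {"Accept-Language": "es-MX"})
modified = icap_server(original)
print(modified[1])  # http://example.com/es/index.html
```

The cache (the iCAP client) then sends the modified request on toward the origin server; the origin never needs to know adaptation happened at the edge.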
66
Request Modification
[Figure: numbered message flow (1–6). The client’s request goes to the proxy cache (the iCAP client), which passes it to the iCAP server; the iCAP server modifies the request, and the modified request continues on its way to the origin server; the response returns through the proxy to the client.]
67
Response Modification
• The iCAP client always uses POST to send the body
• Also encapsulated in the POST headers may be:
– Headers used by the user to request the object
– Headers used by the origin server in its reply
• The iCAP server replies with modified content
68
Response Modification
[Figure: numbered message flow (1–6) among client, proxy cache (iCAP client), iCAP server, and origin server. The iCAP server modifies a response from the origin server; this might happen once, as the object is cached, or once per client served.]
69
Request Satisfaction
[Figure: numbered message flow (1–6) among client, proxy cache (iCAP client), iCAP server, and origin server. The iCAP server satisfies the request just like a proxy; the origin server MAY be contacted by the iCAP server (or not).]
70
Infinite Variations
• Allows innovations: you choose 3rd-party applications
• iCAP enables many different kinds of apps!
– Edge content sources can pass pages to ad servers
– Expensive operations can be offloaded
– Content filters can respond either with an unmodified request or HTML (“Get Back to Work!”)
71
Next Steps
• iCAP supporters continue to enhance the protocol
– Learn from solutions and fix “bugs”
– Build future functionality later
• IETF
– The ICAP Forum will submit the specification to the IETF for draft RFC status in mid-2000
• Additional partners
– Software developers, infrastructure companies, and Internet content delivery service providers will be solicited for participation
– Need to get on the same page