tpact: the transparent proxy agent control protocol presented to cs558 march 26, 1999 alberto cerpa...

TPACT: the Transparent Proxy Agent Control proTocol

Presented to CS558

March 26, 1999

Alberto Cerpa & Jeremy Elson

outline of our talk

• History and Motivations

– What is TPACT? Why is it needed?

• Design & Standardization Effort

– Complex, even before implementation

• Implementation Details

– Who’s doing what; current status

part one

a brief history of the web, and the stories of those prophets who have come before

the book of genesis (c. 1993)

Genesis of the WWW: Few clients, few servers, no problem.

Client

Server (with monitor; i.e. cheap)

Internet Backbone

the book of numbers (c. 1995)

Increasing numbers of clients: Network & Server loads grow.

Internet Backbone

Clients

Cheap, Overloaded Server

the book of numbers (c. 1996)

Increasing numbers of clients: So, add more server capacity.

Clients

Server Farms (Fabio’s Territory)

Internet Backbone

the book of exodus (c. 1997)

Exodus of clients: From backbone providers to small ISPs

Clients

ISP Network

Internet (Large Backbone ISP)

To Server Farms

Congested, Slow, and/or Expensive

ISP Network

the book of peter (c. 1998)

Transparent Caches: Reduce traffic without client configuration

Clients

Internet (Large Backbone ISP)

To Server Farms

Congested, Slow, and/or Expensive

$

New!! Non-E2E Internet!

the book of peter (c. 1998)

ISP User Network

Proxy Caches

Router

L4 Switch

Critical observation: Cache usually knows what should be intercepted, but the Switch is the interceptor

Internet (Large Backbone ISP)

the book of jeremiah and alberto (5:58)

Proxy Caches (or any other type of transparent proxy)

L4 Switch

TPACT allows the cache and switch to exchange control traffic

what control traffic?

• When caches come up, they can tell the switch: “add me to your cache group”

• Switches use periodic KEEPALIVEs to detect dead caches and immediately stop sending them work

• Caches can tell switches to allow direct connections for clients (e.g., on auth failure)

• When caches are not on same LAN as switch, dest IP is lost; cache can get it from switch
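A minimal sketch of the first two bullets, seen from the switch: caches join a group, and a cache that misses its KEEPALIVEs drops out. The class name, the 3-second timeout, and the injectable clock are assumptions made for illustration; none of them come from the TPACT draft.

```python
import time


class CacheGroup:
    """Switch-side record of which caches are alive (illustrative only)."""

    def __init__(self, timeout_s=3.0, clock=time.monotonic):
        self.timeout_s = timeout_s  # hypothetical KEEPALIVE timeout
        self.clock = clock          # injectable so the logic can be tested
        self.last_seen = {}         # cache address -> time of last message

    def join(self, cache):
        # "add me to your cache group"
        self.last_seen[cache] = self.clock()

    def keepalive(self, cache):
        # Only caches that have already joined are tracked.
        if cache in self.last_seen:
            self.last_seen[cache] = self.clock()

    def live_caches(self):
        # A cache is live if it was heard from within the timeout.
        now = self.clock()
        return sorted(c for c, t in self.last_seen.items()
                      if now - t <= self.timeout_s)
```

The switch would consult `live_caches()` when choosing where to send an intercepted connection, so a silent cache stops receiving work once the timeout expires.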

part two

writing the next book, and convincing the world that they should read it

why write a standard?

• There are many switch vendors and many cache vendors -- no one makes both (except Cisco)

• An open standard with a good, open-source reference implementation promotes use

• Standards are subject to peer review

• Doing it right, once, will (hopefully) prevent others from needing to reinvent the wheel

why not snmp?

• Initially, it seemed perfect to us -- it’s a generic way for net devices to interoperate

• But:

– retransmission policy?

– hard state vs. soft state?

– authentication?

– too heavyweight for wire speed?

– can we do it in 2 months? -- straw that broke the camel’s back

the new camel

• TCP

• Hard state

• Our own data format (very simple)

• Possible to map easily to SNMP in the future, if necessary

redirection semantics

• If you ask the switch to allow a client through, do existing flows break?

– Do we add a “redirect client except for the following ephemeral ports” command?

– Do we assume that all switches keep per-flow state, and can redirect new connections without breaking old ones?

– Ostrich Algorithm: let the connections break?

• Multiple services -- only some redirected?

nat buzz

• Switches actually do 2 types of forwarding:

– Forward the actual, unchanged IP datagram to a LAN where the cache is promiscuously listening

– Change the IP destination to the cache and route the modified datagram normally

• If we overwrite the IP destination, we still need to know where the client wanted to go

– Some application-level protocols tell us

– If not, use TPACT to ask switch

• Even with app headers, seems like we still need IP dest so we can forge the reply to the client!

authentication

• Both sides share a secret (say, a password)

• Sender:

– appends the secret to its message

– calculates an MD5 hash

– replaces the secret with the MD5

• Receiver:

– Saves the MD5

– Replaces the MD5 with the secret

– Calculates the MD5 (should match)

• Sequence numbers to prevent replay attacks

• Note: this is authentication, not encryption
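The sender and receiver steps above can be sketched as follows. The message layout (a 4-byte sequence number, then the payload, then the 16-byte digest) and the function names are assumptions for illustration, not the wire format from the draft.

```python
import hashlib
import hmac
import struct

DIGEST_LEN = 16  # size of an MD5 digest in bytes


def sign(seq, payload, secret):
    """Sender: hash (message + secret), then send message + digest."""
    body = struct.pack("!I", seq) + payload  # hypothetical layout
    digest = hashlib.md5(body + secret).digest()
    return body + digest


def verify(message, secret, last_seq):
    """Receiver: recompute the digest; return (seq, payload) or None."""
    if len(message) < 4 + DIGEST_LEN:
        return None
    body, received = message[:-DIGEST_LEN], message[-DIGEST_LEN:]
    expected = hashlib.md5(body + secret).digest()
    if not hmac.compare_digest(received, expected):
        return None  # wrong secret or tampered message
    (seq,) = struct.unpack("!I", body[:4])
    if seq <= last_seq:
        return None  # replayed or stale sequence number
    return seq, body[4:]
```

For example, `verify(sign(7, b"KEEPALIVE", b"s3cret"), b"s3cret", 6)` returns `(7, b"KEEPALIVE")`, while presenting the same message again with `last_seq=7` is rejected as a replay. The payload travels in the clear, which matches the note that this is authentication, not encryption.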

details, details (or, writing a standard is harder than it looks)

• Byte ordering

• Which 1 byte do you use in a 4-byte int?

• Authentication specified for operations, or for connections?

• Sequence number space exhaustion problems

• etc., etc….
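To illustrate the byte-ordering question: the same 32-bit value has two different wire images, so a standard has to pick one. The sketch below assumes network (big-endian) order, which is the usual convention for Internet protocols; the slides do not say which order TPACT chose.

```python
import struct

value = 0x0A000001  # 10.0.0.1 as a 32-bit integer

network = struct.pack("!I", value)  # big-endian ("network order")
little = struct.pack("<I", value)   # little-endian, as stored on x86


def decode_u32(data):
    """Decode a 4-byte field, assuming network byte order."""
    (n,) = struct.unpack("!I", data)
    return n
```

Here `network` is `b"\x0a\x00\x00\x01"` and `little` is `b"\x01\x00\x00\x0a"`; a receiver that guesses the wrong order decodes a different number.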

current status

• Internet-Draft just about done

– Major issues decided

– Some details not filled in yet

• Some details and semantics will probably change as we get implementation experience

• It’s being reviewed by people at NetApp and will be submitted to IETF soon (hopefully)

part three

turning an idea into reality, and trying to avoid learning lessons the hard way

division of labor

• Library of common functions (Alberto & Jer)

• Server-Side (Jer)

– FreeBSD with IP Filter package

• Client-Side (Alberto)

– Squid

freebsd implementation

• Adding support to, say, an Alteon switch might be better; do you have the code?

• We think we can emulate the behavior of a (slow) switch using FreeBSD + IP Filter

• Having a FreeBSD-L4 switch may be useful

• FreeBSD = public reference implementation

experimental topology

L4 Emulator (+ IP Filter)

Squid 1 (+ IP Filter)

Client


Roqueta(router)

Internet

ip filter limitations?

• A “real” switch (e.g., the Alteon) can decide where a connection goes when it starts

• With the IP Filter package, we have to install rules beforehand

• Will changing this mapping break existing connections?

– Maybe not; per-connection state may be persistent even when rules change

– If so, use stable hash function à la multicast rendezvous point selection from set
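One possible reading of the "stable hash" idea is highest-random-weight (rendezvous) hashing: score every (destination, cache) pair and pick the cache with the top score. This is a sketch of that general technique, not necessarily the function the authors had in mind; the use of MD5 and the key format are assumptions.

```python
import hashlib


def pick_cache(dest_ip, caches):
    """Choose a cache for dest_ip by highest-random-weight hashing."""
    def score(cache):
        # Hypothetical key format: destination and cache joined by "|".
        h = hashlib.md5(f"{dest_ip}|{cache}".encode()).digest()
        return int.from_bytes(h[:8], "big")
    return max(caches, key=score)
```

The property that matters here is stability: when a cache leaves the set, only the destinations that were mapped to it move, so precomputed rules for every other destination stay valid.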

what you get is not what you see

• IP Filter’s packet generation circumvents tcpdump; need a hub and another machine

• The switches are having a confounding effect; they may be preventing delivery of “spoofed” packets

• Trying to combine multiple services into one machine can be very confusing

squid implementation

• “Helper” vs. Library.

• Advantage of a helper:

– Nice way of plugging TPACT into Squid.

• Disadvantage:

– We’d have to implement TPACT twice, or invent a 2nd protocol from helper to squid

• Library has standard API that can be used by any cache or switch implementation

operation in squid

• “I am here” upon Squid initialization.

• New Event in charge of sending KEEPALIVEs and flow control.

• “What server’s IP” if no “Host:” info.

– Other cases?

• “Direct” this client to the Internet.

• Send HTTP redirect after ACK from NE.

direct/redirect example

L4 Emulator (+ IP Filter)


Client

Internet

Step 1: Client request to Internet gets intercepted; sent to cache

Step 2: Cache sends request to Internet

Step 3: Server rejects client’s request (IP addr not authenticated)

Step 4: Cache tells switch to direct that client’s requests to Internet

Step 5: Switch ACKs the updated redirection list

Step 6: Cache tells client to retry (by sending HTTP redirect)

Step 7: Client sends same request; this time not intercepted
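Steps 4 through 7 can be modelled with a toy switch that keeps a set of clients allowed straight through. All names here are hypothetical, and the real exchange would be TPACT messages over TCP rather than method calls.

```python
class Switch:
    """Toy L4 switch: intercepts clients unless told to let them through."""

    def __init__(self):
        self.direct_clients = set()

    def allow_direct(self, client_ip):
        # Steps 4-5: update the redirection list, then acknowledge.
        self.direct_clients.add(client_ip)
        return "ACK"

    def route(self, client_ip):
        return "internet" if client_ip in self.direct_clients else "cache"


def cache_on_server_reject(switch, client_ip, url):
    """Step 6: redirect the client only after the switch has ACKed."""
    if switch.allow_direct(client_ip) == "ACK":
        return f"HTTP/1.1 302 Found\r\nLocation: {url}\r\n\r\n"
    return None
```

Waiting for the ACK before sending the HTTP redirect matters: if the client retried before the switch had updated its list, the retry would be intercepted again.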

anti-calamari (or, how to keep squid from getting fried)

• Flow control: based on no. of fd’s open

– Start/Stop with some hysteresis.

• Performance metrics:

– Load: same as before.

– Average response time: statistics?

• Recovery from crashes:

– process: nicely handled by the OS (TCP reset)

– machine: KEEPALIVE timeout.
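The Start/Stop flow control with hysteresis might look like the sketch below: stop accepting work at a high-water mark of open file descriptors, and resume only after falling to a lower mark. The thresholds and the "STOP"/"START" strings are assumptions for illustration.

```python
class FlowControl:
    """Start/stop signalling with hysteresis on the open-fd count."""

    def __init__(self, high=900, low=700):
        if low >= high:
            raise ValueError("low mark must be below high mark")
        self.high, self.low = high, low  # hypothetical thresholds
        self.accepting = True

    def update(self, open_fds):
        """Return "STOP" or "START" when the state flips, else None."""
        if self.accepting and open_fds >= self.high:
            self.accepting = False
            return "STOP"
        if not self.accepting and open_fds <= self.low:
            self.accepting = True
            return "START"
        return None
```

The gap between the two marks keeps the cache from flapping between Start and Stop when the fd count hovers near a single threshold.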

that’s all, folks!

Thank you!
