tbit: tcp behavior inference tool - icir · 2000. 10. 24. · rfc 2414: min (4*mss, max (2*mss,...

TBIT: TCP Behavior Inference Tool

Jitendra Padhye

Sally Floyd

AT&T Center for Internet Research at ICSI (ACIRI)

http://www.aciri.org/tbit/

1 of 24

Outline of talk

• Motivation

• Description of the tool

• Results

• Future work

2 of 24

Motivation

• TCP handles a majority of today’s Internet traffic

• Understanding TCP behavior is important: OS vendors, ISPs

• RFCs and other documents specify how TCP should behave

3 of 24

Needless to say ....

Implementations do not always match specifications!

4 of 24

Example

• Initial window used by TCP: amount of data sent out in a “burst” before any ACKs are received.

• RFC 2414: min (4*MSS, max (2*MSS, 4380 bytes))

• MSS 512 → burst of about 2000 bytes (4 × 512 = 2048)

• We have found TCPs (www.uwm.edu) that send 8000+ bytes with MSS of 512!

• Large bursts of packets → buffering problems, loss, delays.
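
The RFC 2414 formula above can be evaluated directly. The following is a minimal sketch; the function name `rfc2414_initial_window` is illustrative and is not taken from TBIT.

```python
def rfc2414_initial_window(mss: int) -> int:
    """Upper bound on the initial window in bytes, per the RFC 2414 formula
    min(4*MSS, max(2*MSS, 4380 bytes))."""
    return min(4 * mss, max(2 * mss, 4380))


if __name__ == "__main__":
    # Print the bound for a few MSS values, including those used in the talk.
    for mss in (100, 512, 1460):
        print(mss, rfc2414_initial_window(mss))
```

For MSS 512 the bound is 2048 bytes, so a server that sends 8000+ bytes before any ACK arrives exceeds it roughly fourfold.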

5 of 24

How to detect misbehaving TCPs

• Passive detection: Vern Paxson analyzed thousands of tcpdump traces and detected several anomalies (1996-97)

• Passive detection has limitations

• TBIT actively probes TCP stacks at web servers to test behavior

6 of 24

How it works: The basic idea

• Send “fabricated” TCP packets over raw IP sockets.

• Host firewall prevents kernel from seeing response packets.

• BPF delivers blocked packets to user process.

• Net effect: a user-level, user-controllable TCP, without kernel changes.

Based on “Sting” project at Univ. of Washington by Stefan Savage
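
To make "fabricated" packets concrete, the sketch below builds the bytes of a TCP SYN segment carrying an MSS option and fills in the checksum. It is an illustration only, not TBIT's code: the function names, addresses, ports and window value are assumptions, and actually sending the segment over a raw socket (which needs root privileges and the firewall/BPF setup described above) is left out.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement checksum used by TCP/IP."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_syn(src_ip: str, dst_ip: str, src_port: int, dst_port: int,
              seq: int, mss: int) -> bytes:
    """Return a 24-byte TCP SYN segment (20-byte header + MSS option)."""
    options = struct.pack("!BBH", 2, 4, mss)   # option kind 2 = MSS, length 4
    offset_flags = (6 << 12) | 0x02            # data offset 6 words, SYN flag
    header = struct.pack("!HHIIHHHH", src_port, dst_port, seq, 0,
                         offset_flags, 65535, 0, 0) + options
    # The checksum covers a pseudo-header: addresses, protocol, TCP length.
    pseudo = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
              + struct.pack("!BBH", 0, socket.IPPROTO_TCP, len(header)))
    csum = internet_checksum(pseudo + header)
    return header[:16] + struct.pack("!H", csum) + header[18:]


if __name__ == "__main__":
    # Documentation-range addresses, chosen only for illustration.
    segment = build_syn("192.0.2.1", "198.51.100.2", 40000, 80, 1000, 512)
    print(segment.hex())
```

Because the MSS is just a field in the fabricated SYN, a user-level prober can advertise any value it likes, which is what makes the small-MSS tests later in the talk possible.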

7 of 24

Example

Determine TCP initial window used by a web server.

• Send SYN. Wait to receive SYN-ACK.

• Send HTTP GET request for “/”

• Do not ACK any incoming packets.

• Wait until first retransmission.

• Initial window ≈ max. sequence number received (relative to the start of the data).

Can check with several MSS values!
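
The inference step of this test can be sketched as follows, assuming the prober has already recorded the data segments it received as `(relative_sequence_number, payload_length)` pairs in arrival order. The function name and the input representation are assumptions made for illustration; they are not TBIT's actual data structures.

```python
def infer_initial_window(segments):
    """Return the number of bytes sent before the first retransmission.

    segments: iterable of (relative_seq, payload_len) in arrival order,
    where relative_seq counts from the first data byte (0).
    """
    seen = set()
    highest = 0
    for seq, length in segments:
        if seq in seen:
            # First retransmission: the initial burst is over.
            break
        seen.add(seq)
        highest = max(highest, seq + length)
    return highest


if __name__ == "__main__":
    # Four 512-byte segments, then a retransmission of the first one.
    trace = [(0, 512), (512, 512), (1024, 512), (1536, 512), (0, 512)]
    print(infer_initial_window(trace))
```

Since no ACKs are sent, the server cannot open its window further, so everything seen before the first repeated sequence number belongs to the initial burst.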

8 of 24

Tests implemented so far

• Handshake tests: Timestamp used? SACK-capable?

• Congestion response: Reduce congestion window? NewReno/Reno/Tahoe?

• SACK: Construct SACKs correctly? Respond to SACKs correctly?

• Other: Initial window? ECN-capable?
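
The handshake tests amount to inspecting the TCP options in the server's SYN-ACK. A minimal sketch of such a check, assuming the raw option bytes have already been extracted from the segment, is given below. The option kinds (2 = MSS, 4 = SACK-permitted, 8 = Timestamps) are the standard values; the function name is illustrative.

```python
import struct


def parse_tcp_options(options: bytes) -> dict:
    """Map each TCP option kind found to its value bytes."""
    found = {}
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == 0:        # End of Option List
            break
        if kind == 1:        # No-Operation (padding), single byte
            i += 1
            continue
        if i + 1 >= len(options):
            break            # truncated option, stop parsing
        length = options[i + 1]
        if length < 2 or i + length > len(options):
            break            # malformed length, stop parsing
        found[kind] = options[i + 2:i + length]
        i += length
    return found


if __name__ == "__main__":
    # MSS 512, SACK-permitted, Timestamps (TSval=1, TSecr=0), two NOPs.
    raw = (struct.pack("!BBH", 2, 4, 512) + bytes([4, 2])
           + struct.pack("!BBII", 8, 10, 1, 0) + bytes([1, 1]))
    opts = parse_tcp_options(raw)
    print("timestamps:", 8 in opts, "SACK-permitted:", 4 in opts)
```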

9 of 24

Results: Background

• Two lists of web sites:

– 100hot.com: approx. 200 unique IP addresses.

– Trace from an ISP proxy (courtesy Dax Kelson): approx. 27,000 unique IP addresses.

• Tests repeated at least twice at different times.

• Results reported only if consistent across runs.

• Not allowed to run NMAP: hard to correlate with OS

10 of 24

Initial Window

638 tests from Proxy list. 10/12/00. MSS 512.

Results:

– 4 hosts had initial windows of 8000+ bytes (17 packets with MSS 512, 80 packets with MSS 100): www.uwm.edu (2), endeavor.med.nyu.edu, www.monash.com.

– 12% of hosts reported initial windows of � packets.

11 of 24

Timestamps

• Timestamps enable better estimation of RTO

• 136 completed tests from Hot list. 7/15/00.

• 25% of the servers tested did not use timestamps. For example: www.ebay.com, www.hp.com

• AIX hosts send garbage. Problem reported to IBM, fix in works.

• Have not tested if timestamps are used correctly.

12 of 24

Congestion window reduction

TCP is expected to halve its sending rate on a packet drop. Essential to the stability of the Internet!

6485 tests from Proxy list. 10/19/00. MSS 100.

Drop one packet when window reaches 8, and count outstanding packets.

Results: 72 hosts (1.11%) reduced congestion window only to 7 packets, rather than halving it. For example: www.adobe.com, members.zdnet.com
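
The pass/fail criterion of this test can be written down as a small check. This is a sketch under the assumption that "halving" means the number of outstanding packets after the drop is at most half the window at the time of the drop; the exact threshold TBIT applies is not stated in the slides, and both function names are illustrative.

```python
def halves_window(window_at_drop: int, outstanding_after: int) -> bool:
    """True if the sender cut its window to half or less after the drop."""
    return outstanding_after <= window_at_drop // 2


def percent(part: int, total: int) -> float:
    """Share of `part` in `total`, in percent, rounded to two decimals."""
    return round(100.0 * part / total, 2)


if __name__ == "__main__":
    print(halves_window(8, 4))   # a conforming sender
    print(halves_window(8, 7))   # the behavior reported for 72 hosts
    print(percent(72, 6485))     # share of tested hosts with that behavior
```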

13 of 24

Congestion window reduction: Examples

[Figure: two plots of sequence number vs. time, each marking received ACKs (RcvdAck) and the dropped packet (Drop). Panels: 12c4.com 63.95.221.61, window reduced; www.adobe.com 192.150.12.101, window not reduced.]

14 of 24

Claim SACK-capable

• SACK (Selective Acknowledgment) reduces RTOs, improves performance.

• 136 tests from Hot list. 7/15/00.

• Results:

– 42% not SACK-capable. For example: home.netscape.com, www.cnn.com

– Many SACK-capable hosts do not seem to use SACKs correctly.

15 of 24

Correct SACK usage

• 2278 tests from Proxy list. 10/18/00. MSS 100.

• Drop packets 6 and 8, and see if they are retransmitted together.

• Results: Only about 6% of the hosts used SACK correctly.
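
For this test the prober has to acknowledge what it received while reporting the holes left by the dropped packets. The sketch below computes the cumulative ACK and the SACK blocks for a set of received packets and encodes them as a TCP SACK option (kind 5). It assumes 1-based packet numbers, fixed-size packets of MSS bytes, and sequence numbers relative to the first data byte. Blocks are emitted in ascending order for simplicity, whereas RFC 2018 asks for the most recently received block first. The function names are illustrative.

```python
import struct


def ack_and_sack_blocks(received, mss):
    """Return (cumulative_ack, [(left_edge, right_edge), ...])."""
    cum = 0
    while (cum + 1) in received:   # highest packet received in order
        cum += 1
    blocks = []
    for pkt in sorted(p for p in received if p > cum):
        left, right = (pkt - 1) * mss, pkt * mss
        if blocks and blocks[-1][1] == left:
            # Adjacent to the previous block: extend it.
            blocks[-1] = (blocks[-1][0], right)
        else:
            blocks.append((left, right))
    return cum * mss, blocks


def encode_sack_option(blocks):
    """Encode SACK blocks as a TCP option: kind 5, length 2 + 8 per block."""
    body = b"".join(struct.pack("!II", left, right) for left, right in blocks)
    return struct.pack("!BB", 5, 2 + len(body)) + body


if __name__ == "__main__":
    # Packets 6 and 8 dropped, MSS 100, packets 1-10 sent.
    ack, blocks = ack_and_sack_blocks({1, 2, 3, 4, 5, 7, 9, 10}, 100)
    print(ack, blocks, encode_sack_option(blocks).hex())
```

A sender that uses this information correctly can see both holes at once and retransmit packets 6 and 8 together, which is the behavior the test looks for.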

16 of 24

SACK Usage examples

[Figure: two plots of sequence number vs. time, each marking received ACKs (RcvdAck) and dropped packets (Drop). Panels: 63.95.221.61, SACK works (correct usage); 63.226.117.70, SACK ignored (SACK info ignored).]

17 of 24

ECN

• Negotiated during SYN/ACK exchange.

• 26,447 tests from Proxy list.

• 8% of web servers unreachable from ECN-capable clients.

• Sometimes, problem with Cisco Local Director (Dax Kelson). Fixed.
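
The negotiation can be illustrated with the TCP flag bits involved. This sketch follows the conventions later standardized in RFC 3168 (a SYN with ECE and CWR set, answered by a SYN-ACK with ECE set and CWR clear), which is an assumption about the exact variant in use at the time of the talk. The function names are illustrative.

```python
# TCP flag bits in the flags byte of the header.
SYN, ACK, ECE, CWR = 0x02, 0x10, 0x40, 0x80


def ecn_setup_syn_flags() -> int:
    """Flags for a SYN that offers ECN: SYN with ECE and CWR set."""
    return SYN | ECE | CWR


def server_accepted_ecn(synack_flags: int) -> bool:
    """True if the SYN-ACK agrees to ECN: ECE set, CWR clear."""
    is_synack = (synack_flags & (SYN | ACK)) == (SYN | ACK)
    return is_synack and bool(synack_flags & ECE) and not (synack_flags & CWR)


if __name__ == "__main__":
    print(hex(ecn_setup_syn_flags()))
    print(server_accepted_ecn(SYN | ACK | ECE))  # ECN-capable server
    print(server_accepted_ecn(SYN | ACK))        # server declines ECN
```

An unreachable server in the sense of the slide is one that returns no usable SYN-ACK at all to such a SYN, which is distinct from a server that answers but declines ECN.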

18 of 24

TCP flavor

• 136 tests from Hot list. 7/15/00. MSS 100.

• Results:

– 61% NewReno, 22% Reno, rest Tahoe.

– Microsoft servers took timeout for every packet loss for small transfers. Problem reported to Microsoft, fix will be available in next version of Windows 2000.

19 of 24

TCP flavor: NewReno vs. Reno

[Figure: two plots of sequence number vs. time, each marking received ACKs (RcvdAck) and dropped packets (Drop). Panels: www.tminterzines.com 192.225.36.138, rx=2 to=1, Reno; home.netscape.com 205.188.247.65, rx=2 to=0, NewReno.]

20 of 24

TCP flavor: NewReno vs. Tahoe

[Figure: two plots of sequence number vs. time, each marking received ACKs (RcvdAck) and dropped packets (Drop). Panels: www.microsoft.com 207.46.130.14, rx=3 to=1, TahoeNoFR (Tahoe, no Fast Retransmit); home.netscape.com 205.188.247.65, rx=2 to=0, NewReno.]

21 of 24

Difficulties

• Too few packets: set smaller MSS?

• Lost packets: repeat test multiple times.

• Multiple hosts answering same IP address: non-repeatable results?

• No easy way to test without a web server.

22 of 24

Future Work

• Full conformance checking for TCP.

• Automatic generation of simulator models.

• Extend this approach to investigate other behaviors of the Internet infrastructure

• Suggestions? Beyond TCP?

• Run NMAP?

23 of 24

Finally ....

• Source code, detailed results and a preliminary report are available: http://www.aciri.org/tbit/

• We encourage people to use the software and add their own tests.

24 of 24
