Katz, Stoica F04 EECS 122: Introduction to Computer Networks Congestion Control Computer Science Division Department of Electrical Engineering and Computer Sciences University of California, Berkeley Berkeley, CA 94720-1776

Post on 19-Dec-2015


Katz, Stoica F04

EECS 122: Introduction to Computer Networks

Congestion Control

Computer Science Division

Department of Electrical Engineering and Computer Sciences

University of California, Berkeley

Berkeley, CA 94720-1776


Today’s Lecture: 10

[Figure: protocol stack (Application, Transport, Network (IP), Link, Physical), with each layer annotated with the numbers of the lectures that cover it]


Finishing Last Lecture


Big Picture

Where do IP routers belong?

Communication Network
- Switched Communication Network
  - Circuit-Switched Communication Network
  - Packet-Switched Communication Network
    - Datagram Network
    - Virtual Circuit Network
- Broadcast Communication Network


Packet (Datagram) Switching Properties

Expensive forwarding
- Forwarding table size depends on number of different destinations
- Must look up in forwarding table for every packet

Robust
- Link and router failure may be transparent for end-hosts

High bandwidth utilization
- Statistical multiplexing

No service guarantees
- Network allows hosts to send more packets than available bandwidth, which leads to congestion and then to dropped packets


Virtual Circuit (VC) Switching

Packets not switched independently
- Establish virtual circuit before sending data

Forwarding table entry
- (input port, input VCI, output port, output VCI)
- VCI – Virtual Circuit Identifier

Each packet carries a VCI in its header

Upon a packet arrival at interface i
- Input port uses i and the packet’s VCI v to find the routing entry (i, v, i’, v’)
- Replaces v with v’ in the packet header
- Forwards packet to output port i’
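
The lookup-and-swap steps above can be sketched in a few lines of Python. This is only an illustration, not how any real switch is built: the names `VCSwitch`, `add_circuit` and `forward`, the dictionary-based packet, and the port and VCI values are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class VCSwitch:
    """Toy virtual-circuit switch (hypothetical names, for illustration only)."""
    # Maps (input port, input VCI) -> (output port, output VCI).
    table: dict = field(default_factory=dict)

    def add_circuit(self, in_port, in_vci, out_port, out_vci):
        """Install one entry, as a signaling protocol would during VC setup."""
        self.table[(in_port, in_vci)] = (out_port, out_vci)

    def forward(self, in_port, packet):
        """Look up (i, v), rewrite the VCI to v', and return (i', packet)."""
        out_port, out_vci = self.table[(in_port, packet["vci"])]
        rewritten = dict(packet, vci=out_vci)  # replace v with v' in the header
        return out_port, rewritten


switch = VCSwitch()
switch.add_circuit(in_port=1, in_vci=5, out_port=3, out_vci=11)
port, pkt = switch.forward(1, {"vci": 5, "payload": "hello"})
print(port, pkt)
```

Note that the table is keyed by circuit, not by destination, so its size grows with the number of circuits through the switch.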


VC Forwarding: Example

[Figure: a virtual circuit from source to destination crossing three switches, each with ports numbered 1 to 4. Each switch has a forwarding table with columns in, in-VCI, out, out-VCI, and the packet’s VCI is rewritten at every hop.]


VC Forwarding (cont’d)

A signaling protocol is required to set up the state for each VC in the routing table

- A source needs to wait for one RTT (round trip time) before sending the first data packet

Can provide per-VC QoS
- When we set up the VC, we can also reserve bandwidth and buffer resources along the path


VC Switching Properties

Less expensive forwarding
- Forwarding table size depends on number of different circuits
- Must look up in forwarding table for every packet

Much higher delay for short flows
- 1 RTT delay for connection setup

Less robust
- End host must spend 1 RTT to establish new connection after link and router failure

Flexible service guarantees
- Either statistical multiplexing or resource reservations


Circuit Switching

Packets not switched independently
- Establish circuit before sending data

Circuit is a dedicated path from source to destination

- E.g., old style telephone switchboard, where establishing circuit means connecting wires in all the switches along path

- E.g., modern dense wavelength division multiplexing (DWDM) form of optical networking, where establishing circuit means reserving an optical wavelength in all switches along path

No forwarding table


Circuit Switching Properties

Cheap forwarding
- No table lookup

Much higher delay for short flows
- 1 RTT delay for connection setup

Less robust
- End host must spend 1 RTT to establish new connection after link and router failure

Must use resource reservations


Forwarding Comparison

                        pure packet switching   virtual circuit switching   circuit switching
forwarding cost         high                    low                         none
bandwidth utilization   high                    flexible                    low
resource reservations   none                    flexible                    yes
robustness              high                    low                         low


Summary

Routers
- Key building blocks of today’s networks in general, and the Internet in particular

Main functionalities implemented by a router
- Packet forwarding
- Buffer management
- Packet scheduling
- Packet classification

Forwarding techniques
- Datagram (packet) switching
- Virtual circuit switching
- Circuit switching


Starting New Lecture

Congestion Control


What We Know

We know:
- How to process packets in a switch
- How to route packets in the network
- How to send packets reliably

We don’t know:
- How fast to send


What’s at Stake?

Send too slow: link is not fully utilized
- wastes time

Send too fast: link is fully utilized but....
- queue builds up in router buffer (delay)
- overflow buffers in routers
- overflow buffers in receiving host (ignore)

Why are buffer overflows a problem?
- packet drops (mine and others)
- Interesting history....(Van Jacobson rides to the rescue)


Abstract View

We ignore internal structure of router and model it as having a single queue for a particular input-output pair

[Figure: sending host A, a buffer in the router, and receiving host B, connected in a line]


Three Congestion Control Problems

Adjusting to bottleneck bandwidth

Adjusting to variations in bandwidth

Sharing bandwidth between flows


Single Flow, Fixed Bandwidth

Adjust rate to match bottleneck bandwidth
- without any a priori knowledge
- could be gigabit link, could be a modem

[Figure: A sends to B over a 100 Mbps bottleneck link]


Single Flow, Varying Bandwidth

Adjust rate to match instantaneous bandwidth
- assuming you have rough idea of bandwidth

[Figure: A sends to B over a link whose bandwidth BW(t) varies with time]


Multiple Flows

Two Issues:
- Adjust total sending rate to match bandwidth
- Allocation of bandwidth between flows

[Figure: senders A1, A2 and A3 share a 100 Mbps link on the way to receivers B1, B2 and B3]


Reality

Congestion control is a resource allocation problem involving many flows, many links, and complicated global dynamics


General Approaches

Send without care
- many packet drops
- not as stupid as it seems

Reservations
- pre-arrange bandwidth allocations
- requires negotiation before sending packets
- low utilization

Pricing
- don’t drop packets for the high-bidders
- requires payment model


General Approaches (cont’d)

Dynamic Adjustment
- probe network to test level of congestion
- speed up when no congestion
- slow down when congestion
- suboptimal, messy dynamics, simple to implement

All three techniques have their place
- but for generic Internet usage, dynamic adjustment is the most appropriate
- due to pricing structure, traffic characteristics, and good citizenship


TCP Congestion Control

TCP connection has window
- controls number of unacknowledged packets

Sending rate: ~Window/RTT

Vary window size to control sending rate


Congestion Window (cwnd)

Limits how much data can be in transit
- Implemented as # of bytes
- Described as # packets in this lecture

EffectiveWindow = MaxWindow – (LastByteSent – LastByteAcked)

MaxWindow = min(cwnd, AdvertisedWindow)

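[Figure: sequence-number line, with sequence numbers increasing to the right. MaxWindow starts at LastByteAcked; EffectiveWindow is the part of MaxWindow that lies beyond LastByteSent.]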
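
The two window formulas above can be written as a small Python helper. This is a sketch under stated assumptions: the function name `effective_window` and the byte counts in the example are made up, and the clamp at zero is an addition that the slide does not show.

```python
def effective_window(cwnd, advertised_window, last_byte_sent, last_byte_acked):
    """Bytes the sender may still put in flight (all quantities in bytes)."""
    max_window = min(cwnd, advertised_window)
    in_flight = last_byte_sent - last_byte_acked
    # Assumption: clamp at zero so a shrunken window never yields a negative value.
    return max(0, max_window - in_flight)


print(effective_window(cwnd=8000, advertised_window=16000,
                       last_byte_sent=5000, last_byte_acked=2000))
```

Here cwnd is the smaller of the two limits, 3000 bytes are unacknowledged, and so 5000 more bytes may be sent.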


Two Basic Components

Detecting congestion

Rate adjustment algorithm
- depends on congestion or not

- three subproblems within adjustment problem

• finding fixed bandwidth

• adjusting to bandwidth variations

• sharing bandwidth


Detecting Congestion

Packet dropping is best sign of congestion
- delay-based methods are hard and risky

How do you detect packet drops? ACKs
- TCP uses ACKs to signal receipt of data
- ACK denotes last contiguous byte received

• actually, ACKs indicate next segment expected

Two signs of packet drops
- No ACK after certain time interval: time-out
- Several duplicate ACKs (ignore for now)


Rate Adjustment

Basic structure:
- Upon receipt of ACK (of new data): increase rate

- Upon detection of loss: decrease rate

But what increase/decrease functions should we use?

- Depends on what problem we are solving


Problem #1: Single Flow, Fixed BW

Want to get a first-order estimate of the available bandwidth
- Assume bandwidth is fixed
- Ignore presence of other flows

Want to start slow, but rapidly increase rate until packet drop occurs (“slow-start”)

Adjustment:
- cwnd initially set to 1
- cwnd++ upon receipt of ACK


Slow-Start

cwnd increases exponentially: cwnd doubles every time a full cwnd of packets has been sent

- Each ACK releases two packets

- Slow-start is called “slow” because of starting point

[Figure: sender timeline. cwnd = 1: segment 1 is sent; cwnd = 2: segments 2 and 3; cwnd = 3 and then 4 as their ACKs arrive: segments 4 to 7; cwnd = 8 once those are acknowledged.]
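
The doubling can be reproduced with a short Python sketch of the rule "cwnd++ upon receipt of ACK". It assumes an idealized connection in which nothing is lost and every packet sent in one round trip is acknowledged before the next one begins; the function name `slow_start` is made up for the example.

```python
def slow_start(rounds):
    """Return cwnd (in packets) at the start of each round trip."""
    cwnd = 1
    history = []
    for _ in range(rounds):
        history.append(cwnd)
        acks = cwnd            # every packet sent this round is acknowledged
        for _ in range(acks):
            cwnd += 1          # cwnd++ upon receipt of each ACK
    return history


print(slow_start(5))  # [1, 2, 4, 8, 16]
```

Each ACK adds one packet to the window, so a full window of ACKs doubles it.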


Problems with Slow-Start

Slow-start can result in many losses
- roughly the size of cwnd ~ BW*RTT

Example:
- at some point, cwnd is enough to fill “pipe”
- after another RTT, cwnd is double its previous value
- all the excess packets are dropped!

Therefore, we need a more gentle adjustment algorithm once we have a rough estimate of the bandwidth


Problem #2: Single Flow, Varying BW

Want to be able to track available bandwidth, oscillating around its current value

Possible variations: (in terms of RTTs)
- multiplicative increase or decrease: cwnd → a*cwnd
- additive increase or decrease: cwnd → cwnd + b

Four alternatives:
- AIAD: gentle increase, gentle decrease
- AIMD: gentle increase, drastic decrease
- MIAD: drastic increase, gentle decrease (too many losses)
- MIMD: drastic increase and decrease


Problem #3: Multiple Flows

Want steady state to be “fair”

Many notions of fairness, but here all we require is that two identical flows end up with the same bandwidth

This eliminates MIMD and AIAD

AIMD is the only remaining solution!
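
The elimination argument can be checked with a small Python simulation. It is a sketch under simplifying assumptions that are not from the lecture: both flows see the same congestion signal in the same round, congestion means the summed rate exceeds the capacity, and the constants (capacity 50, starting rates 5 and 40, multiplicative increase factor 1.1) are arbitrary. The names `share` and `RULES` are made up.

```python
def share(increase, decrease, x=5.0, y=40.0, capacity=50.0, steps=1000):
    """Two flows get the same congestion signal and apply the same rule."""
    for _ in range(steps):
        rule = decrease if x + y > capacity else increase
        x, y = rule(x), rule(y)
    return x, y


# Each entry is (increase function, decrease function).
RULES = {
    "AIMD": (lambda r: r + 1.0, lambda r: r * 0.5),
    "AIAD": (lambda r: r + 1.0, lambda r: r - 1.0),
    "MIMD": (lambda r: r * 1.1, lambda r: r * 0.5),
}

results = {name: share(inc, dec) for name, (inc, dec) in RULES.items()}
for name, (x, y) in results.items():
    print(f"{name}: x={x:.2f} y={y:.2f} gap={abs(x - y):.2f} ratio={y / x:.2f}")
```

Under these assumptions AIAD keeps the initial gap between the two rates, MIMD keeps their initial ratio, and only AIMD shrinks the gap (it halves at every decrease), so only AIMD drives identical flows toward equal rates.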


Buffer and Window Dynamics

No congestion: x increases by one packet/RTT every RTT
Congestion: decrease x by factor 2

[Figure: A sends to B at rate x over a link of capacity C = 50 pkts/RTT. Plot over time of the rate x (pkts/RTT) and the backlog in the router (pkts); the router is congested if the backlog is > 20.]
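
The sawtooth in this setup can be approximated with a few lines of Python. This is a rough sketch, not the model used to draw the original plot: it assumes time advances in whole RTTs, the link drains C packets per RTT, and the sender learns about congestion in the same RTT in which the backlog passes the threshold. The function name `simulate` is made up.

```python
def simulate(capacity=50, threshold=20, rtts=200):
    """Single AIMD flow through one queue; returns (rates, backlogs) per RTT."""
    x, backlog = 1.0, 0.0
    rates, backlogs = [], []
    for _ in range(rtts):
        # The link drains `capacity` packets per RTT; the excess queues up.
        backlog = max(0.0, backlog + x - capacity)
        rates.append(x)
        backlogs.append(backlog)
        if backlog > threshold:
            x = x / 2          # congestion: multiplicative decrease
        else:
            x = x + 1          # no congestion: additive increase
    return rates, backlogs


rates, backlogs = simulate()
print(max(rates), max(backlogs))
```

With these assumptions the rate overshoots the capacity for a few RTTs while the backlog builds, is then halved, and climbs back one packet per RTT.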


AIMD Sharing Dynamics

No congestion: rate increases by one packet/RTT every RTT
Congestion: decrease rate by factor 2

[Figure: flow x (A to B) and flow y (D to E) share a bottleneck link. Plot of the two rates over time.]

Rates equalize: fair share


AIAD Sharing Dynamics

No congestion: x increases by one packet/RTT every RTT
Congestion: decrease x by 1

[Figure: flow x (A to B) and flow y (D to E) share a bottleneck link. Plot of the two rates over time.]


AIMD

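[Figure: flow x (A to B) and flow y (D to E) share a link of capacity C. Plot of rate y against rate x, with C marked on both axes.]

Limit rates: x = y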


AIAD

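[Figure: flow x (A to B) and flow y (D to E) share a link of capacity C. Plot of rate y against rate x, with C marked on both axes.]

Limit rates: x and y depend on initial values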


Implementing AIMD

After each ACK
- increment cwnd by 1/cwnd (cwnd += 1/cwnd)
- as a result, cwnd is increased by one only if all segments in a cwnd have been acknowledged

But need to decide when to leave slow-start and enter AIMD
- use ssthresh variable


Slow Start/AIMD Pseudocode

Initially:
    cwnd = 1;
    ssthresh = infinite;

New ack received:
    if (cwnd < ssthresh)
        /* Slow Start */
        cwnd = cwnd + 1;
    else
        /* Congestion Avoidance */
        cwnd = cwnd + 1/cwnd;

Timeout:
    /* Multiplicative decrease */
    ssthresh = cwnd/2;
    cwnd = 1;
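
A runnable Python rendering of the pseudocode above is given here as a sketch. The class name `TcpSender` and its method names are made up for the example, cwnd is kept as a float counted in packets, and real TCP stacks track the window in bytes and handle many cases that are left out.

```python
import math


class TcpSender:
    """Window logic from the pseudocode above; cwnd is counted in packets."""

    def __init__(self):
        self.cwnd = 1.0
        self.ssthresh = math.inf

    def on_new_ack(self):
        if self.cwnd < self.ssthresh:
            self.cwnd += 1.0               # slow start
        else:
            self.cwnd += 1.0 / self.cwnd   # congestion avoidance

    def on_timeout(self):
        self.ssthresh = self.cwnd / 2      # multiplicative decrease
        self.cwnd = 1.0


sender = TcpSender()
for _ in range(15):
    sender.on_new_ack()        # slow start: cwnd grows from 1 to 16
sender.on_timeout()            # ssthresh = 8, cwnd = 1
for _ in range(7):
    sender.on_new_ack()        # slow start again: cwnd grows from 1 to 8
sender.on_new_ack()            # cwnd has reached ssthresh, so add 1/8
print(sender.cwnd, sender.ssthresh)
```

After the timeout, slow start stops at the remembered ssthresh and the sender switches to the gentler 1/cwnd increase.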


The big picture (with timeouts)

[Figure: cwnd versus time, alternating slow-start and AIMD phases. Each timeout resets cwnd to 1, and slow start then runs until cwnd reaches ssthresh.]


Congestion Detection Revisited

Wait for Retransmission Time Out (RTO)
- RTO kills throughput

In BSD TCP implementations, RTO is usually more than 500ms

- the granularity of RTT estimate is 500 ms

- retransmission timeout is RTT + 4 * mean_deviation

Solution: Don’t wait for RTO to expire
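
The formula "RTT + 4 * mean_deviation" can be sketched in Python as an estimator that keeps running averages. The gains of 1/8 and 1/4 and the initial deviation of half the first sample are assumptions taken from common practice, not from this lecture; the class name `RtoEstimator` is made up, and the 500 ms timer granularity mentioned above is not modeled.

```python
class RtoEstimator:
    """RTO = smoothed RTT + 4 * mean deviation (gains are assumed values)."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha              # gain for the RTT average
        self.beta = beta                # gain for the mean deviation
        self.srtt = first_sample
        self.mean_dev = first_sample / 2

    def update(self, sample):
        """Fold one RTT measurement (seconds) into the averages; return the RTO."""
        error = sample - self.srtt
        self.srtt += self.alpha * error
        self.mean_dev += self.beta * (abs(error) - self.mean_dev)
        return self.rto()

    def rto(self):
        return self.srtt + 4 * self.mean_dev


est = RtoEstimator(first_sample=0.100)
print(est.update(0.100), est.update(0.300))
```

A stable RTT lets the deviation term decay, while a single late sample inflates the timeout well beyond the average RTT.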


Fast Retransmits

Resend a segment after 3 duplicate ACKs
- a duplicate ACK means that an out-of-sequence segment was received

Notes:
- ACKs are for next expected packet
- packet reordering can cause duplicate ACKs
- window may be too small to get enough duplicate ACKs

[Figure: sender timeline. Segment 1 (cwnd = 1) is answered by ACK 2; segments 2 and 3 (cwnd = 2) by ACK 3 and ACK 4; segments 4 to 7 (cwnd = 4) are then sent, and the receiver returns ACK 4 three more times: 3 duplicate ACKs.]
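
Counting duplicate ACKs can be sketched as a small Python generator. It assumes ACKs carry the number of the next expected segment, as in the notes above; the function name `fast_retransmits` and the example ACK streams are made up.

```python
def fast_retransmits(acks, threshold=3):
    """Yield the segment to resend each time `threshold` duplicate ACKs arrive."""
    last_ack, dup_count = None, 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == threshold:
                yield ack        # the ACK names the next expected segment
        else:
            last_ack, dup_count = ack, 0


print(list(fast_retransmits([2, 3, 4, 4, 4, 4])))  # [4]
```

The first ACK 4 acknowledges new data; only the three repeats that follow count as duplicates, which is why a single reordered packet does not trigger a resend.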


Fast Recovery: After a Fast Retransmit

ssthresh = cwnd / 2
cwnd = ssthresh
- instead of setting cwnd to 1, cut cwnd in half (multiplicative decrease)

for each dup ack arrival
- dupack++
- MaxWindow = min(cwnd + dupack, AdvWin)
- indicates packet left network, so we may be able to send more

receive ack for new data (beyond initial dup ack)
- dupack = 0
- exit fast recovery

But when RTO expires still do cwnd = 1
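
The bookkeeping on this slide can be sketched as a small Python class. The class name `FastRecovery` and its method names are made up, windows are counted in packets, and the timeout handler reuses the "ssthresh = cwnd/2; cwnd = 1" rule from the earlier pseudocode.

```python
class FastRecovery:
    """Window bookkeeping during fast recovery; units are packets."""

    def __init__(self, cwnd, adv_win):
        self.cwnd = cwnd
        self.adv_win = adv_win
        self.ssthresh = None
        self.dupack = 0

    def on_fast_retransmit(self):
        self.ssthresh = self.cwnd / 2
        self.cwnd = self.ssthresh        # halve instead of dropping to 1

    def on_dup_ack(self):
        self.dupack += 1                 # one more packet has left the network

    def on_new_ack(self):
        self.dupack = 0                  # exit fast recovery

    def on_timeout(self):
        self.ssthresh = self.cwnd / 2
        self.cwnd = 1                    # an RTO still restarts from cwnd = 1
        self.dupack = 0

    def max_window(self):
        return min(self.cwnd + self.dupack, self.adv_win)


fr = FastRecovery(cwnd=16, adv_win=64)
fr.on_fast_retransmit()
fr.on_dup_ack()
fr.on_dup_ack()
print(fr.cwnd, fr.max_window())
```

Each further duplicate ACK temporarily inflates MaxWindow by one packet, and the ACK for new data removes that inflation.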


Fast Retransmit and Fast Recovery

Retransmit after 3 duplicated acks
- Prevent expensive timeouts

Reduce slow starts

At steady state, cwnd oscillates around the optimal window size

[Figure: cwnd versus time. An initial slow start is followed by an AI/MD sawtooth in which each fast retransmit cuts cwnd.]


TCP Congestion Control Summary

Measure available bandwidth
- slow start: fast, hard on network
- AIMD: slow, gentle on network

Detecting congestion
- timeout based on RTT

• robust, causes low throughput

- Fast Retransmit: avoids timeouts when few packets lost

• can be fooled, maintains high throughput

Recovering from loss
- Fast recovery: don’t set cwnd=1 with fast retransmits


Issues to Think About

What about short flows? (setting initial cwnd)
- most flows are short
- most bytes are in long flows

How does this work over wireless links?
- packet reordering fools fast retransmit
- loss not always congestion related

High speeds?
- to sustain 10 Gbps, packet losses can occur at most about once every 90 minutes!

Why are losses bad?
- Tornado codes: can reconstruct data proportional to packets that get through. Why not send at maximal rate?

Fairness: how do flows with different RTTs share link?
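
The 90-minute figure in the high-speed bullet can be roughly reproduced with back-of-the-envelope arithmetic in Python. The packet size (1500 bytes) and round-trip time (100 ms) are assumptions, since the slide does not state them, and the calculation uses an idealized AIMD sawtooth with one loss per cycle.

```python
link_bps = 10e9          # target rate from the slide
packet_bytes = 1500      # assumed packet size
rtt_s = 0.1              # assumed round-trip time

# Average window (packets in flight) needed to fill the link: BW * RTT.
avg_window = link_bps * rtt_s / (8 * packet_bytes)
# In an AIMD sawtooth the window ramps from W/2 to W, averaging 3W/4.
peak_window = avg_window * 4 / 3
# Growing back from W/2 to W at one packet per RTT takes W/2 round trips.
rtts_between_losses = peak_window / 2
minutes_between_losses = rtts_between_losses * rtt_s / 60

print(round(avg_window), round(minutes_between_losses, 1))
```

Under these assumptions the window must average about 83,000 packets, and a single loss costs more than an hour and a half of additive increase to recover from.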


Bonus Question

Why is TCP like Blanche Dubois?

Because it “relies on the kindness of strangers...”

What happens if not everyone cooperates?