
TRANSCRIPT

Page 1: Balaji Prabhakar

High Performance Switching and Routing
Telecom Center Workshop: Sept 4, 1997.

Balaji Prabhakar

Coping with (exploiting) heavy tails

Balaji Prabhakar

Departments of EE and CS, Stanford University

[email protected]

Page 2: Balaji Prabhakar

2

Overview

• SIFT: A simple algorithm for identifying large flows
– reducing average flow delays
– with smaller router buffers
with Konstantinos Psounis and Arpita Ghosh

• Bandwidth at wireless proxy servers
– TDM vs FDM
– or, how many servers suffice
with Pablo Molinero-Fernandez and Konstantinos Psounis

Page 3: Balaji Prabhakar

3

SIFT: Motivation

• Egress buffers on router line cards at present serve packets in a FIFO manner

• The bandwidth sharing that results from this, together with the actions of transport protocols like TCP, translates to some service order for flows that isn’t well understood; that is, at the flow level do we have:
– FIFO? PS? SRPT?

(none of the above)

Egress Buffer

Page 4: Balaji Prabhakar

4

SIFT: Motivation

• But, serving packets according to the SRPT (Shortest Remaining Processing Time) policy at the flow level
– would minimize average delay
– given the heavy-tailed nature of the Internet flow size distribution, the reduction in delay can be huge

Page 5: Balaji Prabhakar

5

SRPT at the flow level

• Next packet to depart under FIFO– green

• Next packet to depart under SRPT– orange

Egress Buffer

Page 6: Balaji Prabhakar

6

But …

• SRPT is unimplementable
– the router needs to know residual flow sizes for all enqueued flows: virtually impossible to implement

• Other pre-emptive schemes like SFF (shortest flow first) or LAS (least attained service) are likewise too complicated to implement

• This has led researchers to consider tagging flows at the edge, where the number of distinct flows is much smaller
– but, this requires a different design of edge and core routers
– more importantly, it needs extra space in IP packet headers to signal flow size

• Is something simpler possible?

Page 7: Balaji Prabhakar

7

SIFT: A randomized algorithm

• Flip a coin with bias p (= 0.01, say) for heads on each arriving packet, independently from packet to packet

• A flow is “sampled” if one of its packets has a head on it

[Figure: a stream of arriving packets, each marked H (heads) or T (tails)]

Page 8: Balaji Prabhakar

8

SIFT: A randomized algorithm

• A flow of size X has roughly a 0.01·X chance of being sampled
– flows with fewer than 15 packets are sampled with prob < 0.15
– flows with more than 100 packets are sampled with prob > 0.63
– the precise probability is: 1 – (1 – 0.01)^X

• Most short flows will not be sampled, most long flows will be
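The sampling probability above can be checked with a short sketch (the function name and the specific flow sizes are illustrative, not from the talk):

```python
# Probability that a flow of X packets is sampled when each packet
# independently gets "heads" with bias p (SIFT's basic scheme):
# P(sampled) = 1 - (1 - p)^X.
def prob_sampled(flow_size, p=0.01):
    return 1.0 - (1.0 - p) ** flow_size

short = prob_sampled(15)    # ~0.14: most short flows escape sampling
long_ = prob_sampled(500)   # ~0.99: long flows are almost surely sampled
```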

Page 9: Balaji Prabhakar

9

• Ideally, we would like to sample like the blue curve

• Sampling with prob p gives the red curve– there are false positives and false negatives

• Can we get the green curve?

The accuracy of classification

[Figure: Prob(sampled) vs. flow size]

Page 10: Balaji Prabhakar

10

• Sample with a coin of bias q = 0.1
– say that a flow is “sampled” if it gets two heads!
– this reduces the chance of making errors
– but, you have to count the number of heads
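As a hedged sketch of the two-heads idea (the function name is an assumption, not from the talk), the probability that a flow of X packets gets at least two heads with a coin of bias q is:

```python
# P(at least two heads among flow_size independent flips of bias q).
# SIFT+ declares a flow "sampled" once it has seen two heads.
def prob_sampled_plus(flow_size, q=0.1):
    no_heads = (1.0 - q) ** flow_size
    one_head = flow_size * q * (1.0 - q) ** (flow_size - 1)
    return 1.0 - no_heads - one_head
```

Compared with the single-head scheme at p = 0.01, the transition from "rarely sampled" to "almost surely sampled" is sharper, which is what reduces the false positives and false negatives.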

• So, how can we use SIFT at a router?

SIFT+

Page 11: Balaji Prabhakar

11

SIFT at a router

• Sample incoming packets
• Place any packet with a head (or the second such packet) in the low priority buffer
• Place all further packets from this flow in the low priority buffer (to avoid mis-sequencing)
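The enqueue rule above can be sketched as follows (the data structures and names are illustrative, not the actual router implementation):

```python
import random

# Table of flow ids that have been "sampled" (had a packet with a head).
sampled_flows = set()

def enqueue(flow_id, p=0.01, rng=random.random):
    """Return which logical queue a packet joins: 'low' priority for
    sampled (presumed large) flows, 'high' priority for the rest."""
    if flow_id in sampled_flows:
        return "low"            # later packets follow the flow: no mis-sequencing
    if rng() < p:               # coin flip on each arriving packet
        sampled_flows.add(flow_id)
        return "low"
    return "high"
```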

Page 12: Balaji Prabhakar

12

• Simulation results with ns-2

• Topology:

Simulation results

Page 13: Balaji Prabhakar

13

Overall Average Delays

Page 14: Balaji Prabhakar

14

Average delay for short flows

Page 15: Balaji Prabhakar

15

Delay for long flows

Page 16: Balaji Prabhakar

16

• SIFT needs
– two logical queues in one physical buffer
– to sample arriving packets
– a table for maintaining the ids of sampled flows
– to check whether an incoming packet belongs to a sampled flow or not

• All quite simple to implement

Implementation requirements

Page 17: Balaji Prabhakar

17

• The buffer of the short flows has very low occupancy
– so, can we simply reduce it drastically without sacrificing performance?

• More precisely, suppose
– we reduce the buffer size for the small flows, increase it for the large flows, and keep the total the same as FIFO

A big bonus

Page 18: Balaji Prabhakar

SIFT incurs fewer drops

Buffer_Size(Short flows) = 10; Buffer_Size(Long flows) = 290; Buffer_Size(Single FIFO Queue) = 300

SIFT ------ FIFO ------

Page 19: Balaji Prabhakar

19

• Suppose we reduce the buffer size of the long flows as well

• Questions:– will packet drops still be fewer?– will the delays still be as good?

Reducing total buffer size

Page 20: Balaji Prabhakar

Drops under SIFT with less total buffer

Buffer_Size(PRQ0) = 10; Buffer_Size(PRQ1) = 190; Buffer_Size(One Queue) = 300

One Queue SIFT ------ FIFO ------

Page 21: Balaji Prabhakar

Delay histogram for short flows

SIFT ------ FIFO ------

Page 22: Balaji Prabhakar

Delay histogram for long flows

SIFT ------ FIFO ------

Page 23: Balaji Prabhakar

23

• SIFT is a randomized scheme; preliminary results show that
– it has low implementation complexity
– it reduces delays drastically (users are happy)
– with 30–35% smaller buffers at egress line cards (router manufacturers are happy)

• A lot more work is needed
– at the moment we have a good understanding of how to sample, and extensive (and encouraging) simulation tests
– we need to understand the effect of reduced buffers on end-to-end congestion control algorithms

Conclusions for SIFT

Page 24: Balaji Prabhakar

24

• Motivation: Wireless and satellite

• Problem: Single transmitter, multiple receivers
– bandwidth available for transmission: W bits/sec
– should files be transferred to one receiver at a time? (TDM)
– or, should we divide the bandwidth into K channels of W/K bits/sec and transmit to K receivers at a time? (FDM)

• For heavy-tailed jobs, K > 1 minimizes mean delay

• Questions:– What is the right choice of K? – How does it depend on flow-size distributions?

How many servers do we need?

Page 25: Balaji Prabhakar

25

A simulation: HT file sizes

[Figure: average response time (s) vs. number of servers (K)]

Page 26: Balaji Prabhakar

26

The model

• Use an M/Heavy-Tailed/K queueing system
– service times X: bimodal to begin with; this generalizes
– P(X = A) = α = 1 – P(X = B), where A < E(X) ≪ B and α ≈ 1
– the arrival rate is λ

• Let S_K, W_K and D_K be the service time, waiting time and total delay in the K-server system
– E(S_K) = K·E(X); E(D_K) = K·E(X) + E(W_K)

• Main idea in determining K*
– have enough servers to take care of the long jobs, so that short jobs aren’t waiting for long amounts of time
– but no more, because otherwise the service times become too big
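As a worked illustration of the bimodal model (the values of α, A and B below are made-up numbers consistent with A < E(X) ≪ B and α ≈ 1, not values from the talk):

```python
# Bimodal service times: X = A with prob alpha, X = B with prob 1 - alpha.
alpha, A, B = 0.9995, 1.0, 10_000.0
E_X = alpha * A + (1 - alpha) * B    # mean service time at the full bandwidth W

def mean_service(K):
    """E(S_K) = K * E(X): each of the K channels runs at rate W/K,
    so every job's service time stretches by a factor of K."""
    return K * E_X
```

mean_service grows linearly in K, so the optimal K* balances this growth against the drop in waiting time E(W_K).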

Page 27: Balaji Prabhakar

27

Approximately determining W_K

• Consider two states: servers blocked or not
– Blocked: all K servers are busy serving long jobs

E(W_K) = E[W_K | blocked]·P_B + E[W_K | unblocked]·(1 – P_B)

• P_B ≈ P(there are at least K large arrivals in K·B time slots)
– this is actually a lower bound, but it is accurate for large B
– with Poisson arrivals, P_B is easy to compute

• E(W_K | unblocked) ≈ 0

• E(W_K | blocked) ≈ E(W_1), which can be determined from the Pollaczek–Khinchine (P-K) formula
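A sketch of this approximation under the stated Poisson-arrival assumption (parameter names such as lam_large are hypothetical, and the formulas follow the slide's two-state argument):

```python
from math import exp, factorial

def prob_blocked(K, lam_large, B):
    """P_B ~ P(Poisson(lam_large * K * B) >= K): at least K large
    arrivals while each long job occupies a server for ~B slots."""
    mean = lam_large * K * B
    return 1.0 - sum(exp(-mean) * mean**i / factorial(i) for i in range(K))

def mean_wait(K, lam, E_X2, rho, lam_large, B):
    """E(W_K) ~ E(W_1) * P_B, with E(W_1) from the M/G/1
    Pollaczek-Khinchine formula: lam * E(X^2) / (2 * (1 - rho))."""
    E_W1 = lam * E_X2 / (2.0 * (1.0 - rho))
    return E_W1 * prob_blocked(K, lam_large, B)
```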

Page 28: Balaji Prabhakar

28

Putting it all together

Page 29: Balaji Prabhakar

29

α = 0.9995, load = 0.50

[Figure: average response time (s) vs. number of servers (K)]

M/Bimodal/K

Page 30: Balaji Prabhakar

30

α = 1.1, load = 0.50

[Figure: average response time (s) vs. number of servers (K)]

M/Pareto/K

Page 31: Balaji Prabhakar

31

α = 1.1, load = 0.50

[Figure: standard deviation of response time (s) vs. number of servers (K)]

M/Pareto/K: higher moments

Page 32: Balaji Prabhakar

32

The optimal number of servers

load    Pareto α    K* (simulation)    K* (formula)
0.1     1.1         3                  3
0.5     1.1         8                  9
0.8     1.1         50                 46
0.1     1.3         2                  2
0.5     1.3         6                  6
0.8     1.3         20                 22