
INFOCOM 2002 1Aman Shaikh: June 02

UCSC

Avoiding Instability during Graceful Shutdown of OSPF

Aman Shaikh, UCSC

Joint work with

Rohit Dube, Xebeo Communications Inc.

Anujan Varma, UCSC

INFOCOM – June 2002

Software Upgrade is a Pain

• Upgrade of routing software on routers is a fact of life
  – Extensions to routing protocols, new functionality, version upgrades, bug fixes
  – Critical need for seamless upgrades
• Current practice
  – During upgrade, network operators withdraw the “router-under-upgrade” from forwarding service
    • Route flaps, traffic disruption, instability
  – Operators have to carefully schedule upgrades
    • Schedule them at night when load is moderate
    • Stagger upgrades of different routers
  – A painful job

We Can do Better

• Router can continue forwarding even while its routing process is inactive, at least for a while
  – Current routers have separate routing and forwarding paths
    • Routing in software (CPU), forwarding in hardware (switching)
• Routing protocols need to be extended since they always try to route around an inactive router
• Our proposal: IBB (I’ll Be Back) Extension to OSPF
• Other proposals
  – OSPF: Hitless restart proposal by John Moy
    • Internet draft: draft-ietf-ospf-hitless-restart-02.txt
  – BGP: Graceful restart proposal by Sangli et al.
    • Internet draft: draft-ietf-idr-restart-05.txt

Router Model

[Figure: router model. Labels: Route Processor (CPU) running the OSPF Process, which exchanges LSAs and holds the Topology view and the Shortest Path Tree (SPT); Forwarding Info. Base (FIB); two interface cards, each doing Forwarding; Switching Fabric carrying data packets between the interface cards.]

IBB Proposal in a Nutshell

• OSPF process on router R needs to be shut down
• Before shutdown, R informs other routers that it is going to be inactive for a while
• R specifies a time period (IBB Timeout) by which it expects to become operational again
• Other routers continue using R for forwarding during the IBB Timeout period
• If R comes back within the IBB Timeout period, no routing instability or flaps
• Else other routers start forwarding packets around R
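The timeout rule above can be sketched from the point of view of another router. This is a minimal illustration, not the paper's implementation: the class name `IBBNeighborState`, its fields, and the use of plain seconds are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class IBBNeighborState:
    """What another router remembers about inactive router R (hypothetical structure)."""
    announced_at: float   # time at which R announced its shutdown
    ibb_timeout: float    # period within which R expects to be operational again
    back: bool = False    # set once R's OSPF process has come back

    def mark_back(self, now: float) -> bool:
        """Record R's return; True if R came back within the IBB Timeout."""
        in_time = now <= self.announced_at + self.ibb_timeout
        self.back = True
        return in_time

    def keep_forwarding_via_r(self, now: float) -> bool:
        """Keep using R if it is back, or if the IBB Timeout has not expired yet."""
        return self.back or now <= self.announced_at + self.ibb_timeout


state = IBBNeighborState(announced_at=0.0, ibb_timeout=300.0)
print(state.keep_forwarding_via_r(now=120.0))  # True: still inside the timeout
print(state.keep_forwarding_via_r(now=301.0))  # False: start routing around R
```

The sketch covers only the timeout; it does not model topology changes during the inactive period.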

What if Topology Changes

• R cannot update its forwarding table to reflect the change
  – Can lead to loops or black holes

[Figure: (a) Topology when R went down: routers A, B, and R with link weights 3, 2, and 6. (b) Topology changes while R is inactive: the weight-3 link now has weight 10; the other two weights are unchanged.]

Handling Changes: Options

• Don’t do anything
• Stop using R: Moy’s proposal
  – Inadvertent changes during upgrade are likely
    • Flapping due to a bad interface somewhere
  – But not all changes are bad
    • They do not always lead to loops or black holes
• Stop using R only when a loop or black hole gets formed (our approach)
  – And only for those destinations for which there is a problem
  – Needs algorithms, which is what the bulk of the paper is about


Roadmap of Algorithm

• Single area, single inactive router case

– Loop formation

– Black hole formation

• Single area, multiple inactive routers case

• Multiple areas

Single Area, Single Inactive Router

• Problem Formulation
  – Inactive Router = R
  – All routers other than R have the same image of the topology graph
  – R’s image is that of the past: the time at which it went down
  – Source = S, Destination = D
  – Next hop(R, D) = Y
  – Actual path a packet takes from S to D = P(S->D)

Loop Detection

• P(S->D) has a loop iff S and Y have R on their paths to D in their SPTs (Shortest Path Trees)

[Figure: panels over routers S, Y, R, and D. Topology when R went down: link weights 1, 2, 3, 6, and 20. Topology changes while R is inactive: the weight-3 link becomes 10. SPT panels: S and Y have R on their paths to D in their SPTs (link weights 1, 2, and 6).]

• If there is a loop, a neighbor can always detect it
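The detection condition can be checked with two shortest-path computations. The sketch below is illustrative only: the helper names are hypothetical, and the link weights (S-R 1, R-Y 2, R-D 6, S-D 20, and Y-D changing from 3 to 10) are an assumed reading of the figure, not values confirmed by the text.

```python
import heapq


def shortest_path(graph, src, dst):
    """Dijkstra; returns the nodes on a shortest path from src to dst, or None."""
    dist, prev, done = {src: 0}, {}, set()
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        for v, w in graph[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v], prev[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    if dst not in dist:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


def undirected(edges):
    """Build an adjacency map from (a, b, weight) triples."""
    graph = {}
    for a, b, w in edges:
        graph.setdefault(a, {})[b] = w
        graph.setdefault(b, {})[a] = w
    return graph


def has_loop(graph, s, y, r, d):
    """Slide condition: loop iff both S and Y have R on their shortest paths to D."""
    paths = (shortest_path(graph, s, d), shortest_path(graph, y, d))
    return all(p is not None and r in p for p in paths)


# Assumed weights; only the Y-D link differs between the two topologies.
common = [("S", "R", 1), ("R", "Y", 2), ("R", "D", 6), ("S", "D", 20)]
when_r_went_down = undirected(common + [("Y", "D", 3)])
after_change = undirected(common + [("Y", "D", 10)])

print(has_loop(when_r_went_down, "S", "Y", "R", "D"))  # False
print(has_loop(after_change, "S", "Y", "R", "D"))      # True
```

With the assumed weights, Y's path to D switches from the direct link to Y-R-D after the change, while R still forwards to Y, which is the loop.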

Loop Prevention

• Every router needs to calculate a path to D such that R does not appear on it

[Figure: changed topology while R is inactive (routers S, Y, R, D; link weights 1, 2, 6, 10, and 20), next to the paths S and Y calculate to D w/o R on them: S to D with weight 20, Y to D with weight 10.]
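One way to compute a path that keeps R off it is to run a shortest-path search that never enters R. The following is a sketch under assumptions: the function name and the `avoid` parameter are hypothetical, and the link weights are an assumed reading of the figure.

```python
import heapq


def shortest_path_avoiding(graph, src, dst, avoid=frozenset()):
    """Dijkstra that never enters a node in `avoid`; returns (cost, path) or None."""
    dist, prev, done = {src: 0}, {}, set()
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == dst:
            break
        for v, w in graph[u].items():
            if v in avoid:
                continue  # skip forbidden routers entirely
            if d + w < dist.get(v, float("inf")):
                dist[v], prev[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    if dst not in dist:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return dist[dst], path[::-1]


# Assumed weights for the changed topology (R is inactive).
graph = {
    "S": {"R": 1, "D": 20},
    "R": {"S": 1, "Y": 2, "D": 6},
    "Y": {"R": 2, "D": 10},
    "D": {"R": 6, "Y": 10, "S": 20},
}

print(shortest_path_avoiding(graph, "S", "D"))                # (7, ['S', 'R', 'D'])
print(shortest_path_avoiding(graph, "S", "D", avoid={"R"}))   # (20, ['S', 'D'])
print(shortest_path_avoiding(graph, "Y", "D", avoid={"R"}))   # (10, ['Y', 'D'])
```

The two constrained results match the weights shown in the figure for the paths without R (20 for S, 10 for Y).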

Loop Avoidance Procedure

• R sends its forwarding table to neighbors before shutdown
  – Thus, Y knows that next hop(R, D) is Y
• Detection: during SPF (Shortest Path First) calculation, neighbors detect loops
  – Y checks whether R exists on its path to D
• Upon detection, neighbors send avoid messages to other routers in the domain
  – avoid(R, D) = avoid using R for reaching D
• Prevention: upon receiving the avoid(R, D) message, other routers calculate a new path to D such that R does not appear on it
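The message flow of the procedure can be sketched as follows. Everything named here is hypothetical (the `Avoid` tuple, `detect`, `receive`, and the table layouts); the paths are assumed to come from the neighbor's own SPF run, and the example data follows the assumed S, Y, R, D topology.

```python
from typing import NamedTuple


class Avoid(NamedTuple):
    """avoid(R, D): avoid using `router` for reaching `dest`."""
    router: str
    dest: str


def detect(me, inactive, inactive_fwd_table, my_paths):
    """Run at a neighbor of the inactive router after its SPF calculation.

    inactive_fwd_table: dest -> next hop, as sent by the router before shutdown.
    my_paths: dest -> list of routers on this neighbor's current shortest path.
    """
    messages = []
    for dest, next_hop in inactive_fwd_table.items():
        if dest == me or next_hop != me:
            continue  # the inactive router does not hand this traffic to us
        if inactive in my_paths.get(dest, []):
            messages.append(Avoid(inactive, dest))  # we would send it straight back
    return messages


def receive(forbidden, message):
    """Run at any router: remember which routers to keep off the path to a destination."""
    forbidden.setdefault(message.dest, set()).add(message.router)


# R's forwarding table when it went down, and Y's paths after the change (assumed).
r_table = {"S": "S", "Y": "Y", "D": "Y"}
y_paths = {"D": ["Y", "R", "D"], "S": ["Y", "R", "S"], "R": ["Y", "R"]}

messages = detect("Y", "R", r_table, y_paths)
forbidden_at_s = {}
for message in messages:
    receive(forbidden_at_s, message)

print(messages)        # [Avoid(router='R', dest='D')]
print(forbidden_at_s)  # {'D': {'R'}}
```

A router holding `forbidden_at_s` would then recompute its path to D with R excluded, and keep using R for every other destination.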

Multiple Inactive Routers

• Set of inactive routers: R1, R2, …, Rn
• Loop avoidance procedure applies for each inactive router
  – Detection
    • Router detects loops for all its inactive neighbors
  – Prevention
    • A router can get avoid(Ri, D) messages for j inactive routers (j <= n)
    • The router avoids these j forbidden routers on its path to D
• Problem: set of forbidden routers can be different for different destinations
  – O(n) shortest path calculations, one per destination in the worst case
    • Here n = number of vertices

Simplification

• Router avoids all inactive routers if it has some forbidden routers on its path to D
  – Calculate two SPTs:
    1. SPT with all inactive routers on it
    2. SPT w/o any inactive router on it
  – If the path to D does not contain any forbidden routers,
    • pick next hop for D from the first SPT
  – Else,
    • pick next hop for D from the second SPT
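The two-SPT selection rule can be sketched as below. This is an illustration under assumptions rather than the GateD prototype: the function names, the `forbidden` map (destination to set of routers named in avoid messages), and the link weights are all hypothetical.

```python
import heapq


def spt(graph, root, excluded=frozenset()):
    """Predecessor map of a shortest path tree rooted at `root`, skipping `excluded`."""
    dist, prev, done = {root: 0}, {}, set()
    heap = [(0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        for v, w in graph[u].items():
            if v in excluded:
                continue
            if d + w < dist.get(v, float("inf")):
                dist[v], prev[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    return prev


def path_to(prev, root, dest):
    """Walk the predecessor map back from dest; None if dest is unreachable."""
    if dest != root and dest not in prev:
        return None
    path = [dest]
    while path[-1] != root:
        path.append(prev[path[-1]])
    return path[::-1]


def next_hops(graph, root, inactive, forbidden):
    """Pick each next hop from SPT 1 unless its path holds a forbidden router."""
    first = spt(graph, root)                                  # 1. with inactive routers
    second = spt(graph, root, excluded=frozenset(inactive))   # 2. without any of them
    table = {}
    for dest in graph:
        if dest == root:
            continue
        path = path_to(first, root, dest)
        if path is not None and forbidden.get(dest, set()) & set(path):
            path = path_to(second, root, dest)
        table[dest] = path[1] if path else None
    return table


graph = {
    "S": {"R": 1, "D": 20},
    "R": {"S": 1, "Y": 2, "D": 6},
    "Y": {"R": 2, "D": 10},
    "D": {"R": 6, "Y": 10, "S": 20},
}
print(next_hops(graph, "S", inactive={"R"}, forbidden={"D": {"R"}}))
# {'R': 'R', 'Y': 'R', 'D': 'D'}
```

Only two SPF runs are needed regardless of how the forbidden sets differ per destination; S still forwards through R to reach Y, and bypasses it only for D.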

Performance

• Maximum effect on the SPF calculation
  – Quantify overhead
  – Impact of
    • Topology size
    • Number of inactive routers
• Prototype Implementation
  – IBB extension incorporated into GateD 4.0.7

Testbed Setup

[Figure: two panels. Physical Topology: the SUT and TopTracker (TT) on a LAN, with LSAs passing between them. Emulated topology (SUT’s view of the Topology): SUT and TT; routers under upgrade R1, R2, …, Rm; routers R’1, R’2, …, R’m; a complete graph with n nodes (M1, …); link weights of 1, with one link of weight 20.]

Experiment Sequence

Time (mins)   GateD on SUT               IBB-GateD on SUT
T = 0         Bring m rtrs down          Bring m rtrs down in IBB mode
T = 4                                    Send avoid(Ri, Mj) messages to SUT (1 <= i <= m, 1 <= j <= n)
T = 8         Bring m inactive rtrs up   Bring m inactive rtrs up

Case A: m inactive rtrs
Case B: m inactive rtrs, avoid them

Overhead = (mean SPF time in Case B) / (mean SPF time in Case A)

Result

• Sources of overhead:
  – Second SPF calculation
  – Graph in case B is larger than in case A
    • Gets larger as m increases

[Plot: Overhead (y-axis, 0 to 3.5) versus # of nodes in connected component, n (x-axis, 50 to 100), with one curve each for m = 1, m = 2, m = 5, and m = 10.]

Conclusions

• IBB proposal: extend OSPF so that a router can be used for forwarding even while its OSPF process is inactive
• Main contribution: an algorithm that gracefully handles topological changes
  – Stops using the inactive router for a destination when using the router can lead to loops or black holes
  – Overhead of the algorithm is modest
    • Shows good scaling behavior in terms of topology size and number of inactive routers

Future Directions

• Incremental deployment
  – Can the algorithm be modified so that only a subset of routers need to support it?
• Measuring other aspects of overhead
  – Messaging
• Reducing the overhead
  – SPF calculation: incremental algorithm for second pass
  – Better data structures in prototype
• Other protocols …


Backup

OSPF Background

• Link-state routing protocol
  – All routers in the domain come to a consistent view of the topology by exchange of Link State Advertisements (LSAs)
    • Set of LSAs (self-originated + received) at a router = topology
• SPF Calculation
  – Each router calculates a single source shortest path tree
• Forwarding Information Base (FIB)
  – Each router uses the tree to build its FIB, which governs packet forwarding
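The step from the shortest path tree to the FIB can be sketched by carrying the first hop through the SPF run. This is a minimal illustration: `build_fib` and the small five-router topology are hypothetical and are not taken from the slides.

```python
import heapq


def build_fib(graph, me):
    """Single-source SPF from `me`; returns dest -> next hop (a simplified FIB)."""
    dist = {me: 0}
    fib = {}
    done = set()
    heap = [(0, me, None)]  # (distance, router, first hop used to get there)
    while heap:
        d, u, hop = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if hop is not None:
            fib[u] = hop
        for v, w in graph[u].items():
            if v in done:
                continue
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                # Direct neighbors are their own first hop; others inherit it.
                heapq.heappush(heap, (d + w, v, v if u == me else hop))
    return fib


# Hypothetical topology built from the router's link-state database.
graph = {
    "A": {"B": 1, "C": 3, "E": 1},
    "B": {"A": 1, "C": 1},
    "C": {"A": 3, "B": 1, "D": 1},
    "D": {"C": 1},
    "E": {"A": 1},
}
print(build_fib(graph, "A"))
```

In this example A reaches C and D through B, because A-B-C (cost 2) beats the direct A-C link (cost 3).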

OSPF Overview: Example

[Figure: two panels. OSPF Domain (single area): routers A through J connected by links with weights 1, 2, and 3. SPT at G: the shortest path tree computed at router G over the same routers.]