Evolution of IP/OL Performance Management

Robert Doverspike, Jennifer Yates, Jorge Pastor, Martin Birk (AT&T Labs Research)


Post on 07-Feb-2016



Outline

• Key Takeaways
  – Performance management must consider interlayer interaction (focus on IP)
• Evolution story for IP/OL
  – Architecture for Long Haul Networks
• Example problems
• Next chapter in evolution
  – Let’s get it right this time

Key Takeaways

• Optical PM goals should focus on use in the IP layer
  – Links in the IP layer form connections in the optical layer
  – Virtually all high-rate connections are IP links (between either routers or Ethernet switches)
• Perfect optical layer detection is a lofty goal, but
  – it will fall short if architected in isolation
    • E.g., need to have strong inter-layer coordination
• Why do we stress this for OL?
  – Inter-layer fault management has many flaws in practice, even after 15 years of perfecting SONET
  – Need adequate mechanisms across layers to handle scenarios when things go wrong or confusion reigns

Evolution Story for Long Haul Networks

[Figure, built up over four slides: layered network diagram showing the IP Layer, SONET Ring Layer, DCS Layer, Pt-Pt WDM Layer, and ULH/WXC Layer, annotated with 1st Generation, 2nd Generation, and 3rd Generation. Legend: Router; ADM; DCS/Intelligent Optical Switch; Degree-n OADM/WXC; WDM Terminal.]

Some of the problems we’ve encountered

Ring switching impact on higher layers

• Upper layer has a timer: it waits for the lower layer to restore. Done!
• Wrong! It is not a simple decision when to take an IP link up and down

[Figure: a failure (X) in the SONET Ring Layer beneath the IP Layer.]
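As a rough illustration of the timer approach in the first bullet, here is a minimal Python sketch. The function name `link_down_intervals`, the 100 ms hold-down value, and the event times are all hypothetical and not taken from any router implementation. It only models loss of signal, so it says nothing about errored-but-present signals, which is part of why the real decision is not simple.

```python
from typing import List, Tuple


def link_down_intervals(
    los_events: List[Tuple[float, float]],
    holddown_ms: float,
) -> List[Tuple[float, float]]:
    """Return the intervals (in ms) during which the IP link is declared down.

    los_events:  (start_ms, clear_ms) pairs for loss of signal at the lower layer.
    holddown_ms: how long the upper layer waits for the lower layer to restore
                 before taking the link down (hypothetical value).
    """
    down = []
    for start, clear in los_events:
        if clear - start > holddown_ms:
            # The lower layer did not restore in time: the link goes down when
            # the timer expires and comes back up when the signal clears.
            down.append((start + holddown_ms, clear))
    return down


# Hypothetical events: a 40 ms outage (restored in time) and a 400 ms outage.
events = [(0.0, 40.0), (1000.0, 1400.0)]
print(link_down_intervals(events, holddown_ms=100.0))
```

With these assumed values, the 40 ms outage is masked by the timer and only the 400 ms outage takes the link down.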

Some of the problems we’ve encountered: 1st Generation of IP/OL

• SONET alarms received by the upper layer are ambiguous and conflicting
  • Many error types in SONET: BER, AIS, P-LOS, clear during protection switching
  • They arrive at different times
• Software bugs: routers don’t behave as expected
• Inconsistencies in calculation of BER and IP layer holddown timer

[Figure: a failure (X) in the SONET Ring Layer beneath the IP Layer. Alarm sequences shown: AIS-P, BER-P, CLR; LOS-L, LOS-L; AIS-P, BER-P, CLR; AIS-P. IP-layer annotation: PPP ACK; OSPF ping.]

What is the source of these problems?

• No standards for inter-layer interaction
  – Physical layer: testers need requirement scripts to test; no standard, no script
  – No industry requirement often means no testing, no sharing of behavior
  – Historically, L1 and L3 labs have been separate
    • Some members of the Telecom community have integrated their labs
  – Software bugs: routers don’t behave as expected
  – No specification of common parameters and metrics
    • Example: Router measures BER in fixed timer intervals
    • Router takes link down upon TCA (threshold exceeded)
    • Protection switching results in a VERY short but high burst of errors
    • The burst crosses the router threshold even though it is << 10 ms!
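The BER example above can be made concrete with a little arithmetic. The Python sketch below is illustrative only: the line rate, burst length, error density within the burst, measurement intervals, and TCA threshold are all assumed values, not figures from the slides or from any router.

```python
def interval_ber(errored_bits: float, line_rate_bps: float, interval_s: float) -> float:
    """BER as seen over one fixed measurement interval."""
    return errored_bits / (line_rate_bps * interval_s)


LINE_RATE_BPS = 10e9   # assumed 10 Gb/s link
BURST_S = 0.001        # assumed 1 ms error burst during protection switching
BURST_BER = 0.5        # assumed: half of the bits inside the burst are wrong
TCA_THRESHOLD = 1e-6   # assumed router threshold-crossing alert level

# Total errored bits produced by the burst.
errored = LINE_RATE_BPS * BURST_S * BURST_BER

# Compare two hypothetical fixed measurement intervals.
for interval_s in (1.0, 60.0):
    ber = interval_ber(errored, LINE_RATE_BPS, interval_s)
    crossed = "yes" if ber > TCA_THRESHOLD else "no"
    print(f"interval={interval_s:>5}s  BER={ber:.1e}  TCA={crossed}")
```

Under these assumptions the 1 ms burst alone pushes the averaged BER above the threshold for both interval lengths, so a router that takes the link down on TCA would do so even though the event was far shorter than 10 ms.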

What is the source of these problems?

• Shared Risk Groups are still not well modeled
  – A single failure at the lower layer results in multiple, scattered link failures at the higher layer; the network is unprepared to restore
  – Example: portions of dual IP access links are routed over the same ring; both links are taken down due to the previous confusion

[Figure: the IP (logical) layer and the physical (fibre) layer, each drawn over the nodes LA, SF, Washington, and NY, with a common SRLG marked.]
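One way to picture the shared-risk modeling problem is as an inversion of the routing map: from "IP link to fibre spans" into "fibre span to IP links". The Python sketch below assumes such a routing map is available; the function name `shared_risk_groups`, the span names, and the routes are made up for illustration (only the city names echo the figure).

```python
from collections import defaultdict
from typing import Dict, List, Set


def shared_risk_groups(ip_link_routes: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Map each physical span to the IP links carried over it, keeping only
    spans that are shared by two or more IP links."""
    by_span: Dict[str, Set[str]] = defaultdict(set)
    for ip_link, spans in ip_link_routes.items():
        for span in spans:
            by_span[span].add(ip_link)
    return {span: links for span, links in by_span.items() if len(links) > 1}


# Hypothetical routing of IP links over fibre spans.
routes = {
    "ip:SF-NY": ["span:SF-LA", "span:LA-Washington", "span:Washington-NY"],
    "ip:LA-NY": ["span:LA-Washington", "span:Washington-NY"],
    "ip:SF-Washington": ["span:SF-Washington"],
}

for span, links in sorted(shared_risk_groups(routes).items()):
    print(span, sorted(links))
```

In this made-up topology, a cut on `span:LA-Washington` would take down both `ip:SF-NY` and `ip:LA-NY`, which look independent at the IP layer.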

Identity Crisis: 2nd Generation of IP/OL

• High-speed (2.5/10/40 Gb/s) IP links skip the SONET ring/cross-connect layer and instead route over long sequences of point-to-point WDM systems, interconnected by O/E/O optical transponders
  – Should the optical path pretend to be transparent (like dark fiber)?
    • E.g., no AIS/BER TCA; retransmit all LOS/LOP to Path Termination Points
    • How does one isolate faults for repair (OTs, amplifiers, WDM terminals)?
  – OR: Should it display characteristics of the SONET Section/Line/Path fault management architecture?
    • However, confusion similar to 1st Gen IP/SONET ring then occurs
• Practicality dictated that industry implemented a combination of both approaches

ULH/WXC: The Final Solution? 3rd Generation of IP/OL

• Use long-term model of all-optical path to IP layer link
• Two major issues to resolve
  – What if intermediate OEO exists in the near term?
  – How do we model restoration at the OL, and how does the IP layer interact?
• IP layer responsible for deciding link health
  – Fast link layer detection (LOS)
  – GigE and other signals are going to be transported over the 3rd Gen OL
    • Is the set of PM alarms and TCAs we inherit from SONET appropriate for the 3rd Gen OL?
    • If not, which existing ones should we use, or which new ones should we define?

Some potential approaches

• OL only passes simple alarms to the upper layer, e.g., LOS
  – Upper layer makes its own assessments of BER, packet coding violations, ACK failures
  – OL still does fault isolation for OEO components or amplifiers (e.g., to WDM Term or EMS or Fault OSS), but this is NOT passed up to the IP layer
  – Where/how do we do this? Standards, fora, vendor interactions, carrier requirements?
• Repair process:
  – Need to correlate what fails in the OL with what fails in the IP layer (1-to-many map)
  – Network discovery of IP/OL relationships (e.g., SRLG) across layers would facilitate the fault correlation process