International Conference on Communications (ICC 2004)
Broadband Communications and Distributed Systems Group, Universitat de Girona
Enhancing fault management performance of two-step QoS routing algorithms in GMPLS networks
Eusebi Calle, Jose L. Marzo, Anna Urra, L. Fabrega
Contents
Background (Fault Management)
The failure probability and impact
Two-step and one-step routing methods
Experimental results
Summary and conclusions
1. Fault Management
1.1 MPLS/GMPLS fault management
• Protection Switch LSR (PSL): switches protected traffic from the working path to the corresponding backup path.
• Protection Merge LSR (PML): merges the working path traffic and the corresponding backup path traffic into a single outgoing LSP or, if it is itself the destination, passes the traffic on to the higher-layer protocols.
• FIS: Fault Indication Signal.
[Figure: a working LSP and a backup LSP between the PSL node and the PML node on a nine-node topology (nodes 1 to 9).]
Contents
1. Fault Management
2. Reducing failure probability/ impact
3. Two-step versus One-step routing
4. Experimental results
5. Summary and Conclusions
1.2 Classes of impairments (IETF RFC 3469)
Path Failure (PF) ...
Path Degraded (PD) ...
Link Failure (LF) is an indication from a lower layer that the link over which the path is carried has failed. If the lower layer supports detection and reporting of this fault (that is, any fault that indicates link failure, for example SONET Loss of Signal (LoS)), this may be used by the MPLS recovery mechanism.
Link Degraded (LD) ...
Single link failures
[Figure: a working LSP and a backup LSP on the nine-node topology (nodes 1 to 9).]
1.3 The M:N model
M is the number of backup LSPs used to protect N working LSPs.
• 1:1: one working LSP is protected/restored by one backup LSP.
• M:1: one working LSP is protected/restored by M backup LSPs.
• 1:N: one backup LSP is used to protect/restore N working LSPs (shared backups).
• M:N: N working LSPs are restored by M backup LSPs.
• 1:0 (labeled 0:1 in the figure): no protection (for instance, best-effort traffic).
• 1+1: traffic is sent concurrently on both the working LSP and the backup LSP.
[Figure: working paths and backup paths for the 1:1, M:1, 1:N, M:N, 0:1 and 1+1 models.]
1.4 a) Path provisioning classification
Path provisioning
• Computed on demand
• Pre-computed
• Established on demand
• Pre-established
• Resource pre-allocated
• Resource allocated on demand
1.4 b) Resource allocation classification
Resource allocation
• Dedicated (1:1 or 1+1)
• Shared (1:N, M:N)
• No resources (1:0)
1.5.a) Global Backup Path
[Figure: working path and global backup path between the ingress node (PSL) and the egress node (PML) on the nine-node topology (nodes 1 to 9).]
Advantages
• Path protection (1 PSL, 1 PML)
Disadvantages
• Slow failure recovery time
• Packet loss
1.5.b) Reverse Backup Path
[Figure: working path, global backup path and reverse backup path between the ingress node and the egress node on the nine-node topology (nodes 1 to 9).]
Advantages
• Path protection
• Low packet loss
Disadvantages
• Slow failure recovery time
• Packet reordering
• High resource consumption
1.5.c) Local Backup Path
[Figure: working path and local backup path between the ingress node and the egress node on the nine-node topology (nodes 1 to 9).]
Advantages
• Fast failure recovery time
• Low packet loss
Disadvantages
• High resource consumption (path protection)
1.5.d) Segment Backup Path
[Figure: working path and segment backup path between the ingress node and the egress node on the nine-node topology (nodes 1 to 9).]
Advantages
Disadvantages
1.5.e) 1+1 Protection
[Figure: two paths (Path 1 and Path 2) between the ingress node and the egress node on the nine-node topology (nodes 1 to 9).]
Advantages
• Path protection
• Very low packet loss
• Fast failure recovery time
Disadvantages
• High resource consumption
2. Reducing failure probability and impact
2.1. Enhanced fault recovery methods for protected traffic services in GMPLS networks
Drawbacks and shortcomings
• No protection considerations: protection is a secondary routing objective (no specific backup routing information)
• High complexity (in terms of computation time)
• High resource consumption (1+1)
• No traffic differentiation
• No physical network considerations (availability and reliability)
• Failure impact (fault recovery time, packet loss, ...)
Objectives
• Protection as a main routing objective
• Low complexity
• Low resource consumption
• Traffic differentiation
• Failure probabilities
• Reducing failure impact
2.3 Minimization of the failure recovery time (Failure Impact)
Recovery phase | Features | Time reduction
Fault detection (TDET) | Depends on the technology | Cannot be reduced (except in the case of monitoring techniques)
Hold-off time (THOF) | Depends on the lower layers | Setup (0-50 ms)
Notification time (TNOT) | Depends on the Failure Notification Distance and notification method | Minimizing the Failure Notification Distance and optimizing the process
New backup creation (TBR + TBS) | Depends on the routing and signaling method applied | Pre-establishing the backup
Backup activation (TBA) | Depends on the backup distance and signaling cross-connection process | Minimizing the backup distance and optimizing the process
Switchover (TSW) | Depends on the node technology | Cannot be reduced
Complete recovery (TCR) | Depends on the backup distance | Minimizing the backup distance
IETF CCAMP (Common Control and Measurement Plane) Internet Drafts: Rabbat, Sharma...
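As a rough illustration of how the recovery phases add up, the sketch below sums the per-phase times. It assumes the phases run strictly one after another, and that pre-establishing the backup removes the backup creation term (TBR + TBS, read here as backup routing and backup signaling). All millisecond values and the names `PHASES_MS` and `recovery_time_ms` are hypothetical placeholders, not figures from the talk.

```python
# Hypothetical per-phase times in milliseconds (illustrative only,
# not measurements from the presentation).
PHASES_MS = {
    "T_DET": 10.0,  # fault detection
    "T_HOF": 0.0,   # hold-off time (the slide gives a 0-50 ms setup range)
    "T_NOT": 5.0,   # notification
    "T_BR": 20.0,   # backup routing (assumed meaning of TBR)
    "T_BS": 30.0,   # backup signaling (assumed meaning of TBS)
    "T_BA": 5.0,    # backup activation
    "T_SW": 1.0,    # switchover
    "T_CR": 5.0,    # complete recovery
}


def recovery_time_ms(phases, backup_pre_established=False):
    """Sum the phase times, assuming the phases are strictly sequential.

    With a pre-established backup, the backup creation terms
    (T_BR and T_BS) are skipped.
    """
    skipped = {"T_BR", "T_BS"} if backup_pre_established else set()
    return sum(t for name, t in phases.items() if name not in skipped)


on_demand = recovery_time_ms(PHASES_MS)
pre_established = recovery_time_ms(PHASES_MS, backup_pre_established=True)
print(on_demand, pre_established)
```

With these placeholder values, the gap between the two totals is exactly the backup creation time, which is why pre-establishing the backup appears in the time reduction column.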
2.4 Failure Probability
[Figure: link failure probability estimation. Labels: geographical conditions; failure probability models (MIL-HDBK-217, Bellcore/Telcordia); statistical failure values of network components (MTTR, MTBF, FR, ...); initial link failure probability; current link failure probability (LFP).]
Label switched path failure probability, for an LSP that traverses N links with link failure probabilities LFP_1, LFP_2, LFP_3, ..., LFP_N:
$$FP\_LSP = \sum_{i=1}^{N} LFP_i$$
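The path failure probability can be sketched in a few lines. The first function is the additive form from the slide. The second is the probability that at least one link fails, which holds only under the added assumption that link failures are independent; for small link probabilities the two nearly coincide. The function names and the two-link example are illustrative.

```python
import math


def lsp_failure_probability(link_fps):
    """Additive form: sum of the link failure probabilities along the LSP."""
    return sum(link_fps)


def lsp_failure_probability_exact(link_fps):
    """Probability that at least one link fails.

    Assumes independent link failures (an assumption added here).
    """
    return 1.0 - math.prod(1.0 - p for p in link_fps)


# Two links with LFP = 1e-4 and LFP = 4e-4.
links = [1e-4, 4e-4]
approx = lsp_failure_probability(links)
exact = lsp_failure_probability_exact(links)
print(approx, exact)
```

The sum slightly overestimates the independent-failure value, so it acts as a conservative and cheap routing metric that can be accumulated link by link.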
2.5 Residual Failure Probability
Residual label switched path failure probability (RFP), in units of 10^-4, for a working path whose links have LFP = 1·10^-4 and LFP = 4·10^-4:
• No backup: RFP = (1 + 4) = 5 = LFP
• Local backup: RFP = 1
• Local backups: RFP = 0
• Segment backup: RFP = 0
• Global backup: RFP = 0
• 1+1: RFP = 0
[Figure: six panels on the seven-node topology (nodes 1 to 7), one per case above, each showing the working path and its backup.]
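The residual failure probability values above are consistent with summing the failure probabilities of the working-path links that no backup covers. The sketch below assumes that reading; the link names and the choice of which link the single local backup protects are hypothetical, and values are kept in units of 10^-4 as on the slide.

```python
def residual_failure_probability(link_fps, protected_links):
    """Sum the LFP of every working-path link not covered by a backup."""
    return sum(fp for link, fp in link_fps.items() if link not in protected_links)


# Working path with two links; LFP in units of 1e-4. Link names are hypothetical.
WORKING_PATH = {("A", "B"): 1, ("B", "C"): 4}

no_backup = residual_failure_probability(WORKING_PATH, set())
one_local = residual_failure_probability(WORKING_PATH, {("B", "C")})
full_cover = residual_failure_probability(WORKING_PATH, set(WORKING_PATH))
print(no_backup, one_local, full_cover)
```

Under this reading, a single local backup placed on the high-probability link removes most of the residual risk while consuming fewer resources than protecting every link.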
2.6 Case Study
[Figure: working paths on the seven-node topology (nodes 1 to 7) with links marked as low or high failure probability. Two cases are shown: separated links to be protected and consecutive links to be protected. Panel labels: local backup, local backups, segment backup, global backup.]
2.7 GMPLS protection with traffic differentiation
Protection assignment for class types based on the network failure probability and failure impact
Protected traffic services
• High-resilience traffic services: traffic that is very sensitive to network faults (such as EF DiffServ traffic). The residual failure probability and the failure impact should be zero. 1+1 or local backup paths can be used to achieve these values.
• Medium-resilience traffic services: traffic that is sensitive to network faults (such as AF1 or AF2 DiffServ traffic). Resource consumption should also be taken into account when routing the working and backup paths. Residual failure probability and failure impact should be bounded so that the desired QoS is achieved with appropriate resource consumption. Segment and global backups can be used to protect these services.
Non-protected traffic services
• No-resilience traffic services: no protection is required (BE traffic).
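The class-to-protection assignment described above can be written as a small lookup. This is a minimal sketch: the DiffServ classes and backup types come from the slide, while the names `Resilience`, `CLASS_RESILIENCE`, `CANDIDATE_BACKUPS` and `candidate_backups` are hypothetical.

```python
from enum import Enum


class Resilience(Enum):
    HIGH = "high"
    MEDIUM = "medium"
    NONE = "none"


# DiffServ class -> resilience requirement, following the examples on the slide.
CLASS_RESILIENCE = {
    "EF": Resilience.HIGH,
    "AF1": Resilience.MEDIUM,
    "AF2": Resilience.MEDIUM,
    "BE": Resilience.NONE,
}

# Resilience requirement -> backup methods that may be used.
CANDIDATE_BACKUPS = {
    Resilience.HIGH: ("1+1", "local"),
    Resilience.MEDIUM: ("segment", "global"),
    Resilience.NONE: (),
}


def candidate_backups(diffserv_class):
    """Return the backup methods allowed for a DiffServ class."""
    return CANDIDATE_BACKUPS[CLASS_RESILIENCE[diffserv_class]]


print(candidate_backups("EF"), candidate_backups("AF1"), candidate_backups("BE"))
```

A routing algorithm could consult such a table before path computation, so that best-effort requests skip backup routing entirely.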
3. Two-step vs One-step routing algorithms
3.1 Two-step versus One-step routing
[Figure: eight-node topologies (nodes 1 to 8) with links marked as low or high failure probability. Panels: the shortest working path; two-step routing with working path and backup path; one-step routing with working path and backup path.]
Trap topologies (MHA + global protection)
Smart protection
Advantages
• Fast recovery time
• Low packet loss
• Low resource consumption
• Low failure probability
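To make the trap-topology problem concrete, here is a minimal sketch of two-step routing: compute the shortest working path, remove its links, then search for a backup on what remains. The four-node topology and its link costs are hypothetical, and a shortest-cost (Dijkstra) search is used here rather than the MHA named on the slide. In this topology a link-disjoint pair exists (1-2-4 and 1-3-4), yet the two-step search finds no backup because the shortest working path uses links from both.

```python
import heapq


def shortest_path(links, src, dst):
    """Dijkstra over undirected links {(u, v): cost}; returns a node list or None."""
    adj = {}
    for (u, v), cost in links.items():
        adj.setdefault(u, []).append((v, cost))
        adj.setdefault(v, []).append((u, cost))
    dist = {src: 0}
    prev = {}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, cost in adj.get(u, []):
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


def two_step(links, src, dst):
    """Step 1: shortest working path. Step 2: backup on the remaining links."""
    working = shortest_path(links, src, dst)
    if working is None:
        return None, None
    used = set(zip(working, working[1:]))
    remaining = {
        (u, v): c
        for (u, v), c in links.items()
        if (u, v) not in used and (v, u) not in used
    }
    return working, shortest_path(remaining, src, dst)


# Hypothetical trap topology: the cheapest path 1-2-3-4 blocks both alternatives.
TRAP = {(1, 2): 1, (2, 3): 1, (3, 4): 1, (1, 3): 3, (2, 4): 3}
trap_working, trap_backup = two_step(TRAP, 1, 4)

# Hypothetical ring: two-step routing succeeds.
RING = {(1, 2): 1, (2, 4): 1, (1, 3): 1, (3, 4): 1}
ring_working, ring_backup = two_step(RING, 1, 4)
print(trap_working, trap_backup, ring_working, ring_backup)
```

A one-step method avoids this case by choosing the working and backup paths jointly instead of committing to the working path first.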
4. Experimental results
4.1 Failure Probability Analysis (*)
[Figure: request rejection ratio (0.06 to 0.2) versus trial number (1 to 10), for no traffic differentiation and protected traffic.]
[Figure: failure probability distribution, number of LSPs (0 to 3000) per failure probability bin (<2, 2, 3, 4, 5, 6, 7), for WSP, PWSP_FPT (NPT) and PWSP_FPT (PT).]
[Figure: LSP failure probability (×10^-4, 0 to 6) versus time, for non-protected traffic, protected traffic and no traffic differentiation.]
Captions:
• Failure Probability evaluation. Traffic differentiation. Dynamic Traffic
• Failure Probability evaluation. Traffic differentiation. Incremental Traffic
• Residual Failure Probability evaluation. Request Rejection Ratio
Experiment settings: no protection; incremental and dynamic experiments; traffic differentiation; modified WSP.
4.2 Residual Failure Probability Analysis
Local and segment protection
Dynamic traffic, traffic differentiation, modified WSP
[Figures: two plots of network residual failure probability (0 to 0.5) versus time, for non-protected traffic and protected traffic. Captions: "Residual Failure Probability evaluation. Segment Backups and traffic differentiation." and "Residual Failure Probability evaluation. Local Backups and traffic differentiation."]
Both methods give a similar RFP; local backups require more resources but have a lower failure impact.
5. Summary and conclusions
5.1 Summary and conclusions
Failure Impact
Minimum Failure Notification
Minimum resource consumption (Segment + Probabilities)
Minimum Residual Failure Probabilities
Network Availability and Reliability
Failure probability evaluation models
Resource Consumption
Protected-Traffic Services
Enhanced routing algorithms
Two-step routing methods
Quality of protection degree
Thank you!