Globally Optimal Distributed Batch Reconfiguration for Hazard-free Dynamic Provisioning
DESCRIPTION
Globally Optimal Distributed Batch Reconfiguration for Hazard-free Dynamic Provisioning: How an Entire Network can “Think Globally and Act Locally”. Wayne D. Grover, [email protected], University of Alberta and TRLabs, Edmonton, AB, Canada. DRCN 2007, La Rochelle, France, Oct. 7-10.
TRANSCRIPT
Globally Optimal Distributed Batch Reconfiguration for Hazard-free Dynamic Provisioning:
How an Entire Network can “Think Globally and Act Locally”
Wayne D. Grover
[email protected]
University of Alberta and TRLabs
Edmonton, AB, Canada
DRCN 2007, La Rochelle, France, Oct. 7-10
Wayne D. Grover, University of Alberta and TRLabs
Globally Optimal Distributed Synchronous Batch Reconfiguration
Setting the stage... what motivates this proposal?
• US National Science Foundation:
– Calls for “completely new approaches to network operations.”
– Identifies robust networking as one of the “grand challenges” in networking science
• Concern that existing peer-to-peer asynchronous distributed provisioning schemes risk network state incoherence
– E.g. [1] Pandi & Wosinska, ICTON-RONEXT 2005
• Separately, other industries are moving to explore the applications and benefits of “on-line O.R.”
– Existing distributed provisioning schemes can only employ greedy solution methods
• What if a whole network could “think globally, but act locally”?
– Greater resource efficiencies, greatly reduced signalling, hazard-free operation, continual near-optimality (“self-consolidating”)
What problem(s) are we trying to solve in dynamic protected service provisioning?
1. The inherent risk of schemes that operate dynamically, on the per-connection timescale, assuming global state coherency at all times.
• Risky!
• High signalling volumes
Rather than trying to quantify and lower the risk: is there some approach that fundamentally avoids the risk in the first place?
2. In existing concepts, provisioning is per-path, with no chance to globally optimize.
• Periodic re-optimization of the overall network configuration is awkward or unaddressed.
Overview
• What is the problem?
– Review key aspects of current dynamic provisioning concept
• Key Concepts of New Proposal
– Outline of Operation
• Sub-study: Benefits of Batch Incremental Re-optimization
– Sample results
• Summary of Advantages and Disadvantages
• Research Directions
SBPP Dynamic Protected Service Provisioning Concept (1)
[Figure: 13-node network (nodes 0 to 12). Labels: “Establish a protected connection to node 11”; “Establish a protected connection to node 2”; “Spare capacity sharing”.]
SBPP Dynamic Protected Service Provisioning Concept (2)
[Figure: SBPP route computation and signaling process on the 13-node network (nodes 0 to 12). Request at node 0: “I want to establish a protected connection to node 11”. PATH and RESV messages are shown along the working and protection routes, followed by flooded LSA messages.]
1. Compute working and protection routes
Working: 0-4-8-11; Protection: 0-3-7-9-12-11
2. Establish working path
3. Establish protection path
4. Flood LSA messages
LSA: Link State Advertisement
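As a rough illustration of the signaling cost of this one request, the sketch below checks that the two example routes are disjoint and counts PATH/RESV messages. It assumes one PATH and one RESV message per hop, which is what the message labels in the figure suggest; the function names are illustrative only, and the flooded LSA messages are not counted.

```python
def spans(route):
    """Return the set of undirected spans (node pairs) along a route."""
    return {frozenset(pair) for pair in zip(route, route[1:])}


def are_disjoint(working, protection):
    """True if the routes share no span and no intermediate node."""
    shared_spans = spans(working) & spans(protection)
    shared_nodes = set(working[1:-1]) & set(protection[1:-1])
    return not shared_spans and not shared_nodes


def signaling_messages(route):
    """PATH + RESV count, assuming one of each per hop (an assumption)."""
    hops = len(route) - 1
    return 2 * hops


# Routes taken from the slide.
working = [0, 4, 8, 11]
protection = [0, 3, 7, 9, 12, 11]
total = signaling_messages(working) + signaling_messages(protection)
print(are_disjoint(working, protection), total)
```

Under these assumptions a single protected connection costs 16 path-setup messages before any LSA flooding starts.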
Observations / Concerns about Dynamic SBPP
• Every node needs and assumes a complete and current network state database, including the current protection capacity sharing relationships.
• Link state updates are advertised on a per-connection basis.
• Link state updates are disseminated asynchronously by any node at the same time as other nodes are relying and acting upon time-critical state information.
• The total database of network state that is operationally “critical” grows at least as O(n^3) with the size of the network or operating domain, and also intensifies with the frequency of changes in the network.
Alternatives for Dynamic Automated Provisioning
• Centralized Control: Global view, one operation at a time.
– Safe (in the present regard) but other downsides
• Apply packet priorities to update messages, use TE summary packets, etc.
– i.e., measures that only try to mitigate the risk
– Will eventually crash when provisioning is dynamic enough
• “Protected Working Capacity Envelope” Concept
– Removes protection arrangements from the per-connection time scale
Refs: Grover, Comm Mag; Shen & Grover; Shen PhD; available at www.ece.ualberta.ca/~grover
• Proposal: “Globally Optimal” Distributed Synchronous Batch Re-optimization
– Eliminates the hazard of database incoherence
– Framework yields other advantages
Key Concepts of New Proposal
• Nodes in these networks have “precise time”! Can we exploit that?
– YES: Time synchronization can help in data synchronization
• “Small-batch incremental reoptimization” provisioning
– not path-by-path instantaneous asynchronous provisioning
• Globally synchronous change actions, not asynchronous actions
– Reliance on “precise time” to coordinate actions and decisions.
• Relegating all operationally critical signaling for state update to non-real-time communication requirements
– Robust confirmation of global state database coherence before any reliance upon it for network actions
• Solving for a globally optimal reconfiguration solution
– But nodes act locally to put into effect only their parts of the globally optimal reconfiguration plans.
How it works:
Operational phases:
• Batch Change Accumulation
• Change Dissemination and Confirmation
• Globally Optimal Reconfiguration Solution
• Local Change Activation
Step 1. Batch Change Accumulation
• 5 to 10 minute interval envisaged
– More generally, the period is relative to the connection holding time and request rate (i.e., provisioning traffic intensity)
• Nodes make no changes to network connection state during this time
• Nodes observe, as they arise at their location only:
– New requests
– Departures (released connections)
• At the end of the period, nodes emit a change summary packet
– Like an existing LSA, but contains batch change info
– Robust error detection / correction encoded on packet
– This dissemination is not real-time-critical
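A minimal sketch of the accumulation step, under stated assumptions: the class name, the JSON packet layout, and the use of CRC-32 are all illustrative choices, not the format proposed in the talk. CRC-32 only detects errors; a real encoding with error correction would replace it.

```python
import json
import zlib
from dataclasses import dataclass, field


@dataclass
class BatchAccumulator:
    """Collects locally observed changes for one accumulation period."""
    node_id: int
    batch: int
    requests: list = field(default_factory=list)    # (src, dst) pairs
    departures: list = field(default_factory=list)  # released connection ids

    def on_request(self, dst):
        # Only recorded: no connection state changes during the period.
        self.requests.append((self.node_id, dst))

    def on_departure(self, connection_id):
        self.departures.append(connection_id)

    def summary_packet(self):
        """Serialize the batch and append a CRC-32 for error detection."""
        body = json.dumps(
            {"node": self.node_id, "batch": self.batch,
             "requests": sorted(self.requests),
             "departures": sorted(self.departures)},
            sort_keys=True,
        ).encode()
        return body + zlib.crc32(body).to_bytes(4, "big")


def parse_packet(packet):
    """Return the decoded batch, or None if the CRC check fails."""
    body, crc = packet[:-4], int.from_bytes(packet[-4:], "big")
    return json.loads(body) if zlib.crc32(body) == crc else None


# Hypothetical activity at node 0 during batch 7.
acc = BatchAccumulator(node_id=0, batch=7)
acc.on_request(11)
acc.on_request(2)
acc.on_departure("c-41")
packet = acc.summary_packet()
```

Because the packet is emitted once per period rather than once per connection, its delivery can be retried at leisure, which is the sense in which dissemination is not real-time-critical.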
Step 2. Change Dissemination and Confirmation
• (Again, no change is made to network state in this phase)
• Nodes receive “batch change” summary packets from each other.
– LSA-like forwarding as in OSPF (Internet)
– May include pre-arranged scheduled service requests
– This data exchange is not real-time critical
– This process overlaps with the next change accumulation phase
• Nodes integrate the change packets received into a single network-wide re-provisioning summary view of the new requirements.
• Each node then emits a global change summary checksum
• Each node waits until an intermediate time mark: if every “heard” checksum matches its own: proceed
– Else: flood out “wave-off: go around” signal
[Figure: nodes exchange partial change lists, then each emits a checksum of the integrated overall network change list.]
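The confirmation rule above can be sketched as follows. This is an illustration under assumptions: the change-tuple layout, the canonical ordering by sorting, and the choice of SHA-256 as the checksum are not specified in the talk.

```python
import hashlib
import json


def integrate(partial_lists):
    """Merge per-node change lists into one canonical network-wide list."""
    merged = [tuple(change) for changes in partial_lists.values()
              for change in changes]
    return sorted(merged)  # canonical order, independent of arrival order


def change_checksum(change_list):
    """SHA-256 digest of the canonical list (the hash choice is an assumption)."""
    encoded = json.dumps(change_list, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


def decide(own_checksum, heard_checksums):
    """Proceed only if every checksum heard by the time mark matches our own."""
    if all(c == own_checksum for c in heard_checksums.values()):
        return "proceed"
    return "wave-off"


# Hypothetical views held by three nodes at the intermediate time mark.
view_a = {0: [("add", 0, 11)], 5: [("release", 5, 9)]}
view_b = {5: [("release", 5, 9)], 0: [("add", 0, 11)]}  # same packets, other order
view_c = {0: [("add", 0, 11)]}                           # missed node 5's packet

sum_a = change_checksum(integrate(view_a))
sum_b = change_checksum(integrate(view_b))
sum_c = change_checksum(integrate(view_c))
print(decide(sum_a, {"b": sum_b}), decide(sum_a, {"b": sum_b, "c": sum_c}))
```

A single node with an incomplete view is enough to wave off the whole batch, so no node ever acts on a database that another node disagrees with.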
Step 3. “Thinking Globally”: Optimal Reconfiguration Solution
• Each node locally solves an instance of the globally optimal reconfiguration problem
– May be any problem version the network operator prefers
– Example: Route and protect the maximum number of the new service requests
• While reclaiming capacity from released connections
• With or without permission to re-optimize existing protection
Other variants:
– Multiple priorities or protection classes (multi-QoP)
– Permission to re-arrange selected working paths
– Strategies to include hedging against future uncertainty
– Impairment-aware, availability-aware routing, etc.
• Nodal solutions have to be “identical not just equivalent”
• Prospect here for true “on-line O.R.”
– Any reduced-complexity version of the optimal problem can also be substituted here
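To make “identical not just equivalent” concrete, here is a sketch of a reduced-complexity greedy stand-in (not the optimal model from the paper). It assumes unit-capacity requests and fewest-hop routing; determinism comes from processing requests in sorted order and breaking ties between equal-length paths by the lexicographically smallest node sequence. All names are hypothetical.

```python
import heapq


def shortest_path(free, src, dst):
    """Fewest-hop path over spans with free capacity; ties are broken by
    choosing the lexicographically smallest node sequence."""
    heap, seen = [(0, (src,))], set()
    while heap:
        hops, path = heapq.heappop(heap)
        node = path[-1]
        if node == dst:
            return list(path)
        if node in seen:
            continue
        seen.add(node)
        for span, capacity in free.items():
            if capacity > 0 and node in span:
                (other,) = span - {node}
                if other not in seen:
                    heapq.heappush(heap, (hops + 1, path + (other,)))
    return None  # blocked: no path with free capacity


def solve(span_capacity, requests):
    """Greedy stand-in for the optimal solve: route requests in canonical order."""
    free = {frozenset(span): cap for span, cap in span_capacity.items()}
    plan = []
    for src, dst in sorted(requests):
        path = shortest_path(free, src, dst)
        plan.append(((src, dst), path))
        for a, b in zip(path or [], (path or [])[1:]):
            free[frozenset((a, b))] -= 1  # consume one unit on each span used
    return plan


# Two nodes holding the same data in different internal orders.
spans_a = {(0, 1): 1, (1, 3): 1, (0, 2): 1, (2, 3): 1}
spans_b = {(3, 2): 1, (2, 0): 1, (3, 1): 1, (1, 0): 1}
requests = [(0, 3), (0, 3)]
plan_a = solve(spans_a, requests)
plan_b = solve(spans_b, list(reversed(requests)))
```

Both paths 0-1-3 and 0-2-3 are equally good for the first request, so an arbitrary tie-break would give equivalent but different plans at different nodes; the fixed tie-break is what makes the plans identical.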
Step 4. “Acting Locally”: Nodes do their part of the solution (only)
• On the next globally precise-time mark:
– Each node activates the switching matrix changes to put into effect its part (only) of the complete network reconfiguration solution.
– No continuing existing connection is altered
– Service access nodes observe the turn-up of their new connections and test end-to-end.
• New operating phase commences
• Change request accumulation continues
Note that this results in the creation of a complete set of new service paths and their protection arrangements simultaneously, in parallel across the network, with no signaling. Correctness of the outcome is independently validated by each end-node pair (as it would be in any case).
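The “own part only” idea can be sketched as a filter over the shared plan. The plan format (connection id mapped to a node sequence) and the second connection are hypothetical; only the route 0-4-8-11 comes from the slides.

```python
def local_cross_connects(plan, node):
    """Extract the (connection, in_neighbor, out_neighbor) settings this node
    must make. End nodes get None on the side where the connection terminates."""
    actions = []
    for connection_id, path in sorted(plan.items()):
        for i, hop in enumerate(path):
            if hop == node:
                prev_hop = path[i - 1] if i > 0 else None
                next_hop = path[i + 1] if i + 1 < len(path) else None
                actions.append((connection_id, prev_hop, next_hop))
    return actions


# Every node holds the same plan; "c2" is a made-up second connection.
plan = {"c1": [0, 4, 8, 11], "c2": [3, 4, 5]}
print(local_cross_connects(plan, 4))
print(local_cross_connects(plan, 11))
```

Since every node derives its actions from an identical plan, the per-node settings line up end to end without any PATH/RESV exchange.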
Overall Network Synchronous Phases
[Figure: timeline of overlapping phases along the time axis. Accumulate n is followed by Disseminate n, Compute n and Activate n, which run while Accumulate n+1 is in progress; Disseminate n+1, Compute n+1 and Activate n+1 then overlap Accumulate n+2. * = precise-time instant.]
Network state only changes at these synchronous instants… they are like the clock edges in a digital logic circuit
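The overlap of phases can be written down as a simple schedule. This sketch assumes, beyond what the figure states, that accumulation fills exactly one period and that dissemination, computation and activation of batch n all complete within the following period; the function and key names are hypothetical.

```python
def phase_schedule(batch, period):
    """Start/end times of each phase for a batch, in the units of `period`.
    Assumes batch n is disseminated, solved and activated during the
    accumulation period of batch n+1 (an assumption, not from the talk)."""
    start = batch * period
    end = start + period
    return {
        "accumulate": (start, end),
        "disseminate_and_compute": (end, end + period),
        "activate_at": end + period,
    }


# With a hypothetical 10-minute period:
batch0 = phase_schedule(0, 10)
batch1 = phase_schedule(1, 10)
print(batch0, batch1)
```

Under these assumptions, a request arriving at the very start of an accumulation period waits at most two periods before its connection is activated.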
Sub-Study: Benefits of Optimal Batch Incremental Re-Provisioning (with Z. Pandi on COST 270 STSM to TRLabs)
• Simulation of an “on-line O.R.” application for batch incremental re-optimization that is made possible by this framework.
• Statistically non-stationary random traffic demand
– i.e., not just random but spatially and temporally evolving random arrival / departure traffic
– Tests / illustrates the ability of the scheme to inherently track and re-optimize for time-evolving demand patterns
• Each node accumulates batch change info
• At the end of each batch period, the globally optimal incremental reconfiguration problem is solved (on a single CPU)
– Global changes put into effect locally in the simulated network
• Compared performance against asynchronous independent provisioning using the best known SBPP provisioning algorithm
Simulation Details
• SBPP protection principle vs. small batch incremental reoptimization. (AMPL model is in the Appendix to the paper)
• Spare capacity allocations re-optimized each interval, as well as new and released working paths routed
• Networks
– Sparse topology
– High degree topology
• Scenarios
– Stationary random
– Temporal overload
– Temporo-spatial N-S and E-W evolutions
• Accumulation intervals from 0.2 to 0.4 holding times
Test Networks
[Figure: two panels, “Original EU Model” and “Sparse Version”. Caption: Time-space non-stationary statistical evolution of demand pattern.]
Sample Performance Results
[Figure: three panels: “Full topology, general overload”; “Full topology, spatial evolution”; “Sparse topology, spatial evolution”. Axes: time (unit holding times) vs. total number of blocking events. Batching interval = 0.4 mean holding time.]
Summary: Properties, Advantages, Disadvantages
• (+) Eliminates the hazard of database incoherency under asynchronous operation
– (+) All critical state exchange becomes non-time-critical
• (+) Network enjoys the efficiency and adaptability of on-line continual global re-optimization of network state
• (-?) New connections are provisioned in the next provisioning cycle, not “instantaneously” (service activation delay)
• Nodes synchronize their actions using existing network time/frequency assets.
– Analogy to clocked logic circuit robustness
• Service protection still acts at any time in response to an actual failure
– The provisioning cycle is skipped so that the protection action is reflected in the next change accumulation period.
Research Directions within this framework
• Incremental re-optimization models and strategies that the framework enables
– Options such as spare capacity re-optimization or not
– Multi-QoP classes, priorities
– Working re-arrangeable service classes?
– Maximum revenue, minimum load, etc.: different objectives
– Multi-QoP provisioning solutions
– Incremental on-line grooming optimization
• Different approaches to “identical not just equivalent” solution of the disparate instances of the same global optimization problem.
• Links to the “scheduled lightpath” connection planning problem and the “network consolidation problem.”
• Accommodation for a top-priority no-delay service class
– If thought essential
• Extension to domains rather than nodes
• Links to the PWCE concept
• Collaborations in this area have already begun with B. Jaumard, Networks OR group at Concordia U., Montreal
Your Feedback and Questions are most welcomed
Wayne Grover
Extra slides
Example: p-Cycle-Based Protected Working Capacity Envelope (PWCE)
[Figure: three views of a 6-node network (nodes 0 to 5). “Initial network”: total deployed capacity = 16 on each span. “Working network”: per-span working capacities, carrying the protected services. “Protecting network”: per-span protection capacities.]
No per-connection protection path establishment
No protection path signaling required for failure recovery
PWCE: Operational Steps for Service Provisioning
[Figure: PWCE route computation and signaling process on the 13-node network (nodes 0 to 12). Request at node 0: “I want to establish a protected connection to node 11”. PATH and RESV messages are shown along the working route only.]
1. Compute working route: 0-4-8-11
2. Establish working path
No LSA flooding unless the envelope capacity on the span is used up
Key Ideas / Philosophy of PWCE
• For protected services, “if you can route it (through the PWCE), it IS protected.”
• PWCE does not disseminate link state information per connection, or any protection information during service provisioning.
• PWCE provides observability on the approach to blocking, i.e., toward the edge of the operating envelope.
– Onset of blocking under SBPP is less observable.
• If the demand pattern evolves, one can adapt the envelope by changing the partitioning of total capacity between working and protection.
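A minimal sketch of “if you can route it, it IS protected”, assuming a per-span counter of protected working channels. The class name, the envelope sizes, and the way exhausted spans are reported are illustrative assumptions; only the route 0-4-8-11 is from the slides.

```python
class Envelope:
    """Per-span working-capacity envelope; protection is assumed to be
    pre-arranged for everything that fits inside it."""

    def __init__(self, envelope_capacity):
        # envelope_capacity: {(a, b): protected working channels on the span}
        self.free = {frozenset(s): c for s, c in envelope_capacity.items()}

    def admit(self, route):
        """Admit the route if every span has free envelope capacity.
        Returns (admitted, spans_now_exhausted)."""
        route_spans = [frozenset(p) for p in zip(route, route[1:])]
        if any(self.free.get(s, 0) < 1 for s in route_spans):
            return False, []
        exhausted = []
        for s in route_spans:
            self.free[s] -= 1
            if self.free[s] == 0:
                exhausted.append(tuple(sorted(s)))  # would trigger an LSA
        return True, exhausted


# Hypothetical envelope sizes on the spans of the example working route.
envelope = Envelope({(0, 4): 2, (4, 8): 1, (8, 11): 2})
first = envelope.admit([0, 4, 8, 11])
second = envelope.admit([0, 4, 8, 11])
print(first, second)
```

The first request is admitted with no protection signaling and reports only the span whose envelope has just run out; the second is blocked at that span, which is the observable “edge of the operating envelope”.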