usda training mpls
IP MPLS Virtual Private Networks
Presented by: Chris Chase
2
MPLS Concept
Outgrowth of IP Switching (e.g., MPOA, Ipsilon’s IP Switching, Cisco’s tag switching)
Key concept:
Separate routing (the selection of paths through the network) from forwarding/switching plus an abstraction of aggregation
3
Non-MPLS Routing
• Hierarchical topology - edge and backbone routers
• Forward packet - lookup route at each hop
BFR - Big Fast Router; PER - Provider Edge Router; CR - Customer Router
[Diagram: provider router network. CRs attach to PERs; PERs interconnect through BFRs. A packet for route A (A.1) is forwarded by hop-by-hop route lookup. Routes are chosen using an interior routing protocol (OSPF).]
4
Routing with MPLS
• Interior routes are assigned labels that identify a connection/path, called a Label Switched Path (LSP) instead of a PVC
LSR - Label Switch Router; PER - Provider Edge Router; CR - Customer Router
[Diagram: CRs attach to PERs; PERs interconnect through LSRs. Routes are chosen using an interior routing protocol (OSPF). LSP: the route for A is looked up once and the associated label is assigned to the packet.]
6
Routing with MPLS
• Traffic Engineering – can use alternative to the IGP shortest path
LSR - Label Switch Router; PER - Provider Edge Router; CR - Customer Router
[Diagram: CRs, PERs and LSRs as before. Routes are chosen using an interior routing protocol (OSPF). LSP: route lookup once and associated label assigned to packet.]
8
MPLS: Decouples Routing and Forwarding
IP packet header only examined at ingress PER
Hierarchy of routing/label stacking
  – Interior knows nothing about external addresses or routes
  – Only needs to know how to get between edges (PERs)
Enables very efficient explicit routing
  – Explicit routing in IPv4/v6 is expensive
  – Use explicit route for LSP instead of OSPF route
VPN & Scale
Traffic Engineering
9
History
IP cut-through switching
  – Improve performance and provide QoS to IP
    • Multiprotocol over ATM (MPOA)
    • Ipsilon’s IP Switching
    • Ascend’s IP Navigator
    • Cisco’s tag switching
    • IBM’s ARIS
1997
  – Needed alternative to SVC service for FR and ATM
    • Provider-based IP VPN concept conceived
  – MPLS work initiated at IETF
    • A technology solution looking for a problem
10
Killer Applications of MPLS
– IP VPNs
    • Provider based, simple, scalable, “layer 2” security
    • Overlapping, private addressing plans
– Layer 2 VPNs – FR, Ethernet, circuit services
– Traffic Engineering
    • Deliver service guarantees similar to FR/ATM
    • Fast reroute
– Hierarchical networks
    • Carrier’s carrier
– Universal control plane
    • Label = Optical (Lambda), SONET/TDM, Spatial (ports/conduits)
    • GMPLS and “Optical UNI”
* Not really an advantage: Performance
11
The Basics
12
Generic MPLS Encapsulation
• MPLS does not define a link layer protocol – no framing provided
• A “shim” header between link and network protocol
• New LLC and PID defined for Ethernet, PPP, ATM, FR to carry label
• Can stack tags/labels. Stack bit indicates end of stack.
• There is no protocol ID field to indicate type of encapsulated packet.
    • Protocol of encapsulated packet is implied by the label
    • Indicated when the label is signaled (next slide)
Layer 2 Header | PID | MPLS Label 1 | MPLS Label 2 | … | MPLS Label n | Layer 3 Packet
Each label entry: Label (20 bits) | CoS (3 bits) | Stack (1 bit) | TTL (8 bits)
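The 32-bit label entry above can be packed and parsed with a few shifts. The following is a minimal sketch, not any vendor's implementation; the function names and the dictionary layout are assumptions made for illustration.

```python
import struct


def encode_label_entry(label: int, cos: int, bottom: bool, ttl: int) -> bytes:
    """Pack one label stack entry: Label(20) | CoS(3) | Stack(1) | TTL(8)."""
    if not (0 <= label < 2**20 and 0 <= cos < 8 and 0 <= ttl < 256):
        raise ValueError("field out of range")
    word = (label << 12) | (cos << 9) | (int(bottom) << 8) | ttl
    return struct.pack("!I", word)  # network byte order, 4 bytes


def decode_label_stack(data: bytes):
    """Return (entries, payload); parsing stops at the entry with the Stack bit set."""
    entries = []
    offset = 0
    while True:
        (word,) = struct.unpack_from("!I", data, offset)
        offset += 4
        entry = {
            "label": word >> 12,
            "cos": (word >> 9) & 0x7,
            "bottom": bool((word >> 8) & 0x1),
            "ttl": word & 0xFF,
        }
        entries.append(entry)
        if entry["bottom"]:
            # No protocol ID follows: the payload type is implied by the label.
            return entries, data[offset:]


# Two stacked labels in front of an opaque payload.
frame = (
    encode_label_entry(417, 0, False, 64)
    + encode_label_entry(666, 5, True, 64)
    + b"payload"
)
stack, payload = decode_label_stack(frame)
```

Note that the decoder has no way to tell what `payload` is. That matches the slide: the type of the encapsulated packet is known only from the label that was signaled.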
13
Forwarding Equivalence Class (FEC) and Hierarchy
FEC = All packets with the same forwarding requirements
  – i.e., same path, same QoS (policing, scheduling, discard)
    • CoS bits can modify packet handling
  – Many different FEC types:
    • IPv4, IPv6, FR, ATM, Ethernet VLAN
FEC label – all packets in this class get the same label
Can stack labels (end of stack bit)
Hierarchy of equivalence classes
    • Hierarchy of routing
    • VPNs – L3 and L2
    • Traffic engineering
14
Multi-protocol
Forwarding/switching is content agnostic
  – Can carry IP, FR, ATM, Ethernet, anything
  – Label represents base common treatment shared by all packets with that label (FEC)
Control plane (routing and signaling) is content agnostic
  – IP control plane
    • Routing – OSPF, IS-IS, BGP, PIM
    • Signaling – LDP, CR-LDP, RSVP-TE, BGP+ext, PIM+ext
    • CoS – Diff-serv
Many Layer 2 technologies, e.g., FR and ATM, have been fitted to MPLS
MPLS is not ATM
  – But ATM switches can be MPLS switches
15
Standards
IETF
  – First RFCs
    • 2702 (TE reqs), 3031 (arch), 3032 (stack encoding), 3034 (FR), 3035 (ATM VC), 3036 (LDP), 2547 (VPN), 3107 (BGP), etc.
  – Drafts: GMPLS, BGP, Multicast, Fast Recovery, L2 VPNs, …
  – http://www.ietf.org/html.charters/mpls-charter.html
  – L3 VPN:
    • http://www.ietf.org/html.charters/ppvpn-charter.html
    • draft-ietf-ppvpn-rfc2547bis-04.txt
  – L2 VPN
    • http://www.ietf.org/html.charters/pwe3-charter.html
Additional ITU work, MPLS and ATM Forum
16
Layer 3 MPLS VPN
The Next Generation IP WAN
Based on 2547 draft
Another tool in the WAN toolbox for the network architect
17
Traditional Point-to-Point WANs
Rely on a hub architecture
[Diagram: remote nodes (C) each connected point-to-point to a single hub (H).]
18
Dual Star - Redundancy
[Diagram: remote nodes (C) each connected to two hubs (H).]
19
Aggregation/Distribution LayerScaling through hierarchy
[Diagram: remote nodes (C) connect to hubs (H), which connect upward to aggregation nodes (A).]
20
Domains of Enterprise WANs
Private lines
FR/ATM VC
  – Private line replacement
  – Hub-and-spoke
  – Very reliable, trusted, common
Site-to-site Internet VPN (i.e., IPSEC tunnels)
  – Point-to-point topologies (typically hub-and-spoke)
  – Extranets (also SSL)
  – Remote access
  – Footprint
  – Outsourced versus do-it-yourself
L3 MPLS VPN
    • Layer 3 IP routing “outsourced to carrier”
    • Following slides
They complement each other
L3 MPLS VPN – 2547 style
Provider-based VPN
    • Vis-à-vis CPE-based tunneling VPN, e.g., L2TP with IPSEC
    • Others: Virtual router VPNs, Layer 2 MPLS VPNs
IP MPLS VPN defined as a set of interfaces
  – Interface: PPP, FR/ATM VC, Ethernet VLAN, L2TP
  – VPN membership assigned when provisioned
Customer interface: standard IP, no MPLS
VPN appears as an Autonomous System (AS)
  – Customer router peers with this AS - a transit-only AS in between customer’s sites
  – “Private” - separated from other VPNs
Like having your own little “Internet”
MPLS VPN Layer 3 IP Architecture
PER = Provider Edge Router; CER = Customer Edge Router; LSR = Label Switch Router
[Diagram: CER, PER, LSRs (MPLS network), PER, CER in a chain. CER to PER: BGP, other protocol, or static routes over the access IP serial link, encapsulated in PPP or FR/ATM PVC or Ethernet. PER to PER: IBGP. Interior: OSPF.]
23
MPLS VPN Value Adds vs. Other VPNs
Any-to-any IP connectivity – optimal routing without SVCs
    • Improved delay by avoiding tandem routing through a hub
    • Offload hub router
Any IP address scheme - intranets and extranets
Circuit consolidation – eliminate aggregation layer
Diversity via IP routing - simplified DRO
Ease of network expansion
Access technology agnostic
  – FR, ATM, PPP over DS0-OC48; Ethernet
IP Class of Service
Provider-based IP VPN
    • No CPE-based tunneling and encryption equipment/software nor PKI management.
24
Combined Services
[Diagram: combined services around the IP MPLS VPN. Elements shown: CERs (access: FR/ATM/DSL, Ethernet, P-L PPP), service GW, FW, Internet, generic GW, international MPLS VPN, remote access network (cable, DSL, dial), IPSEC tunnel, L2TP (optionally IPSEC). Caption: MPLS VPN and Edge VPN (IP-VPN).]
25
Load Balancing:From VPN toward customer
[Diagram: PEs of the MPLS VPN with CE1 to CE6 attached. The customer site holding network A connects over two point-to-point links, Link 1 and Link 2, each running BGP and advertising route A.]
All flows from remote CEs (3-6) matching route A will load balance across Links 1 and 2 (even from CE4). Note: the load balancing decision is made (using MPLS) at the ingress to the network.
26
Outbound Route Filtering (ORF)
Allowing a CER to communicate route filter to PER
  – Dynamically transferred through BGP
[Diagram: CER connected to PER over the AC, running BGP; the PER faces the MPLS network. CER inbound filter: permit 0/0, deny all. PER outbound filter: permit 0/0, deny all.]
BGP Route Refresh message carries any inbound prefix-based filter
Any inbound prefix-based filter is applied as outbound filter on the PER
PER = Provider Edge Router; CER = Customer Edge Router; AC = Access Connection; BGP = Border Gateway Protocol; MPLS = Multi Protocol Label Switching
27
Class of Service Concepts
The ability for user to differentiate traffic
  – A provider could differentiate in many ways:
    • Isolation - keep traffic in different classes from unfairly impacting each other
    • Performance
      • Bandwidth
      • Delay
      • Discard
    • Service
      • Availability
      • Support
  – Network engineer view as a toolset to manage traffic
    • As opposed to the marketing/management view around perception
28
IP Header Class of Service Marking
Bit offsets: 0, 8, 16, 24, 32
Version | IHL | Type of Service/Diffserv codepoint | Total length
Identification | Flags | Fragment Offset
TTL | Protocol | Header Checksum
Source Address
Destination Address
Type of Service byte, old: Prec (3 bits) | D | T | R | x | x
Type of Service byte, new: DSCP (6 bits) | x | x
29
CoS via IP Packet Marking
CE classifies traffic per packet via marking
  – IP Diffserv codepoints (Precedence bits)
Marking interpreted
  – Separate queuing per class
  – Per class resource scheduling, e.g.,
    • Priority queuing
    • Bandwidth scheduling (WFQ)
  – Drop differentiation
    • Packets marked “discard eligible” above class bandwidth
    • Transmitted when not congested
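The marking itself lives in the Type of Service byte of the IP header. As a small sketch of how the old Precedence view and the new DSCP view read the same byte (the helper names are illustrative assumptions; the AF11 and EF values are the standard Diffserv codepoints):

```python
def precedence(tos: int) -> int:
    """Old interpretation: top 3 bits of the ToS byte."""
    return (tos >> 5) & 0x7


def dscp(tos: int) -> int:
    """New interpretation: top 6 bits of the ToS byte."""
    return (tos >> 2) & 0x3F


def tos_from_dscp(codepoint: int) -> int:
    """Build a ToS byte from a DSCP, leaving the low two bits at zero."""
    return (codepoint & 0x3F) << 2


AF11 = 0b001010  # decimal 10
EF = 0b101110    # decimal 46

af11_tos = tos_from_dscp(AF11)
ef_tos = tos_from_dscp(EF)
```

Because the Precedence bits are the top three bits of the DSCP, a device that only understands Precedence still sees a sensible class: AF11 reads as precedence 1 and EF as precedence 5.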
30
VPN CoS
[Diagram: CERs attach to PERs on ports; PERs connect to LSRs over trunks, with an MPLS LSP across the core. Elements shown: classification (application/policy level) and session control (H.323, SIP, gatekeeper); policer with per-class policing; class servicing with PQ and bursty queues on the interface.]
31
CoS Marking Transparency Using MPLS
Users don’t want packet markings to change
  – And some older systems’ TCP breaks
Provider can indicate reclassification by marking label instead of remarking IP packet
[Diagram: a packet marked IP, DSCP=AF11 passes through the policer. One output is Label | CoS=AF11 | IP, DSCP=AF11; the reclassified output is Label | CoS=AF12 | IP, DSCP=AF11. The IP marking is unchanged in both.]
32
RFC2547 Constrained Route Distribution
Route Targets (RT) are used to constrain connectivity
  – Keeps VPNs separate
  – Creates topology within VPN
Concept of hub and spoke route policies used as building blocks
  – Hubs and spokes have a certain RT import and export list
  – Hub sites can see other hub and spoke routes
  – Spokes see only hub routes
Combine/compose to create VPN topology
  – Union of multiple hubs and spokes creates arbitrary topologies
These slides don’t show the explicit RT import/export lists
  – Hub types are shown as Hi and spoke types as Si
Any-to-Any Topology
[Diagram: three H0 interfaces attached to the VPN.]
34
Hub and Spoke Topology
Hi = “hub” interfaces, Si = “spoke” interfaces. The terms “hub” and “spoke” just refer to how routes are constrained. Si can only exchange routes with Hi. Hi exchanges routes with all Hi and with Si. Specifically, in terms of route targets (RTs), Hi exports RT_Hi and imports {RT_Hi, RT_Si}, while Si exports RT_Si and imports RT_Hi.
[Diagram: interfaces H1, H1, H2, S1, S1, S2, S2 attached to the VPN.]
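The import/export rule for Hi and Si can be checked mechanically: an interface sees another interface's routes when the other side exports at least one RT that it imports. A minimal sketch, with interface names and the `hub`/`spoke` helpers invented for illustration:

```python
def hub(i: int) -> dict:
    """Hi exports RT_Hi and imports {RT_Hi, RT_Si}."""
    return {"export": {f"RT_H{i}"}, "import": {f"RT_H{i}", f"RT_S{i}"}}


def spoke(i: int) -> dict:
    """Si exports RT_Si and imports RT_Hi."""
    return {"export": {f"RT_S{i}"}, "import": {f"RT_H{i}"}}


def visible_routes(interfaces: dict) -> dict:
    """Map each interface to the set of interfaces whose routes it imports."""
    seen = {}
    for name, cfg in interfaces.items():
        seen[name] = {
            other
            for other, other_cfg in interfaces.items()
            if other != name and other_cfg["export"] & cfg["import"]
        }
    return seen


# Hypothetical VPN with two hubs and two spokes of type 1.
interfaces = {
    "hub_a": hub(1),
    "hub_b": hub(1),
    "spoke_a": spoke(1),
    "spoke_b": spoke(1),
}
seen = visible_routes(interfaces)
```

Here `seen["spoke_a"]` contains only the two hubs, while each hub sees the other hub and both spokes. Only route distribution is modeled; nothing here filters packets.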
35
Hub and Spoke Topology
Here we combine connectivity policies. Using H0, all hubs talk to each other. By taking such unions of policies, completely arbitrary bi-directional connectivity graphs can be realized (in fact, completely arbitrary uni-directional graphs could be achieved, which might be applicable for something like a firewall).
[Diagram: interfaces H1, H1, H2, S1, S1, S2, S2 attached to the VPN.]
Note – RT’s only Constrain Routes
This does not use access lists that filter packets!
  – BGP MPLS VPN technology constrains route distribution (i.e., connectivity); it does not require per-packet manipulation (which neither scales nor manages well).
  – Packets always follow routes (in the reverse direction of route flow)
But constraint of a specific route is not sufficient to constrain the reachability of a destination matching the route!
  – Overlapping routes (e.g., aggregates or defaults) can cause problems.
36
Tradeoffs of L3 MPLS VPN
37
For all the IP value-adds, the drawbacks are:
  – Have to route with provider!
    • Troubleshooting is more difficult than L2 WANs
      • L2 has clear demarc: connection up or down.
    • Convergence is slower
      • Route changes have to propagate through provider routers
    • Certain routing problems are more difficult to solve
      • Some problems are more easily solved with direct topology manipulation
      • E.g., hub connectivity based on source
    • Peering model rather than flat model – a different paradigm
  – Only IP: no IPX, SNA, DECNET, Appletalk
    • Have to tunnel
  – Technology not as mature
Customer Support
38
Layer 2 services are easier to support
  – A customer doesn’t call if his FR-connected routers aren’t seeing the same set of routes
Layer 3 VPN
  – Customer calls: “I can’t see my route. Help me troubleshoot my network.”
  – Customer-visible provider-based tools can help sectionalize and show customer whether there is a problem with provider network, without getting a technician on the line.
39
Comments about CER-PER Protocol
2547 VPNs are a peering architecture!
  – Static
    • Stable, but fill out order form
    • Only can detect local link failure
  – BGP
    • Many policies; geared towards multihoming; peering
    • Load balancing and ORF
  – OSPF
    • Changes intra-area to inter-area; backdoor always preferred
  – EIGRP
    • Proprietary
    • Without ability to summarize, need ability to avoid going active
    • CER acts as stub; can loop (count to infinity) without new feature
How MPLS and 2547 VPNs Work
40
Quick BGP intro
LDP operation
2547 VPNs
  – Follow the VPN route and label
  – Follow the packet
41
BGP Basics
BGP - fairly simple protocol
  – Uses TCP for reliable delivery
  – Distributes appearance/change and withdraw/disappearance of routes in route table
    • Routing Information Base (RIB) = BGP route table
  – eBGP = between AS
  – iBGP = within AS
AS_PATH – list of where route has been
Next hop
Other attributes are about policy, i.e., which route is “best”
[Diagram: R1 and R2, each with a RIB (example entries A | z and W | s,r), exchanging updates over a TCP connection.]
Route Reflector – scaling IBGP among AS edges
42
[Diagram: two route reflectors (RR) connected by IBGP; PEs attach to the RRs as IBGP clients; CEs attach to the PEs by EBGP.]
Route Reflector
43
Updates in RIB from inbound peer type are sent to outbound peer type in table below
Client is a special kind of IBGP neighbor
  – Any BGP router with a neighbor designated as a client is a “route reflector”

Inbound \ Outbound | EBGP | IBGP | Client
EBGP               |  X   |  X   |  X
IBGP               |  X   |      |  X
Client             |  X   |  X   |  X
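The table can be read as a lookup from the type of peer an update arrived from to the peer types it may be sent on to. The sketch below assumes the standard reflection rule, in which a route learned from a non-client IBGP peer is not passed to other non-client IBGP peers; the peer names are hypothetical.

```python
# Inbound peer type -> outbound peer types that receive the update.
REFLECT = {
    "EBGP": {"EBGP", "IBGP", "Client"},
    "IBGP": {"EBGP", "Client"},  # not re-sent to plain IBGP peers
    "Client": {"EBGP", "IBGP", "Client"},
}


def receivers(sender: str, peers: dict) -> list:
    """Return the sorted peer names that get an update learned from `sender`."""
    allowed = REFLECT[peers[sender]]
    return sorted(
        name
        for name, peer_type in peers.items()
        if name != sender and peer_type in allowed
    )


# Hypothetical neighbors of one route reflector.
peers = {"ce1": "EBGP", "rr2": "IBGP", "pe1": "Client", "pe2": "Client"}

from_client = receivers("pe1", peers)
from_ibgp = receivers("rr2", peers)
```

A route learned from client `pe1` reaches every other neighbor, which is what lets the PEs avoid a full IBGP mesh among themselves.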
A Label Switched Path – LSP
44
The downstream node assigns label
Often called an MPLS tunnel: payload headers are not inspected inside of an LSP.
[Diagram: a label switched path from the “head end” to the “tail end”. The head end pushes label 233, the following LSRs swap 233 to 666 and 666 to 417, and the tail end pops 417 and delivers the data.]
LSP’s are Unidirectional
45
Destination FEC based – Can’t distinguish upstream sender– LSP’s merge– Results in multipoint-to-point LSPs
417 IP1 823 IP1 IP1 IP1
IP2 IP2
233 IP1
417 IP2912 IP2565 IP2
LSP merge
Penultimate Hop Popping
46
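[Diagram, without penultimate hop popping: head end pushes 233, swap to 666, swap to 417, tail end pops 417 and then does the IP lookup.]
[Diagram, with penultimate hop popping: head end pushes 233, swap to 666, the penultimate LSR pops, and the tail end only does the IP lookup.]
Look up the label, pop, then look up the header underneath
  – Why even bother sending a label?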
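The difference between the two cases can be traced step by step. This is only a sketch: the per-LSR tables are plain dictionaries, the node names (`head`, `lsr1`, `tail`) are made up, and the label values are the ones in the diagram.

```python
def walk_lsp(push_label: int, transit_tables: list) -> list:
    """Trace one packet along an LSP.

    transit_tables holds one table per transit LSR, in path order, mapping
    in_label -> ("swap", out_label) or ("pop", None).
    """
    steps = [f"head: push {push_label}"]
    label = push_label
    for hop, table in enumerate(transit_tables, start=1):
        action, out_label = table[label]
        if action == "swap":
            steps.append(f"lsr{hop}: swap {label}->{out_label}")
        else:
            steps.append(f"lsr{hop}: pop {label}")
        label = out_label
    if label is None:
        # The penultimate hop already removed the label.
        steps.append("tail: ip lookup")
    else:
        # The tail end needs two lookups: the label, then the header underneath.
        steps.append(f"tail: pop {label} + ip lookup")
    return steps


without_php = walk_lsp(233, [{233: ("swap", 666)}, {666: ("swap", 417)}])
with_php = walk_lsp(233, [{233: ("swap", 666)}, {666: ("pop", None)}])
```

With penultimate hop popping the tail end does a single lookup instead of two, which is the point of the question on the slide.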
47
Follow the Route and Follow the Packet
CR1 at Site 1 has a packet addressed to a host in network Z at Site 2. How does it get there?
[Diagram: CR1 at Site 1 attaches to PER1; the MPLS VPN cloud contains LSR1, LSR2 and LSR3; PER2 attaches to CR2 and network Z at Site 2.]
48
Label Distribution in Interior
For links configured for label switching
  – Node sends out periodic UDP Hello (“all router subnet broadcast”) on the well-known LDP UDP port
    • Contains IP address for desired LDP session and desired label space
  – Creates a TCP-based LDP session to any node answering Hello
    • Session is initiated by node with higher advertised IP address
Each node advertises routes (FEC) and labels to all LDP peers
  – Node picks a local label for each route in its routing table and advertises this to everyone
    • Called downstream unsolicited, independent control mode
    • The upstream node only installs into its label forwarding table where the downstream node is the next hop for the route (FEC)
    • Note in this mode if there is a reroute that changes the next hop, the label is just rebound locally – no signaling upstream
Much faster reroute compared to ATM, but local label assignment does not guarantee LSP is in place
    • ATM-LSRs work differently
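The install rule can be sketched as a join between three pieces of state on one LSR: the labels it advertised, its routing table, and the labels it received from every neighbor. The data shapes, node names and label numbers below are assumptions for illustration only.

```python
def build_lfib(local_labels: dict, routing_table: dict, adverts: dict) -> dict:
    """Build in_label -> (out_label, next_hop) for one LSR.

    local_labels:  FEC -> label this node advertised to all peers
    routing_table: FEC -> current next-hop neighbor (from the IGP)
    adverts:       neighbor -> {FEC: label} learned over LDP
    Only the label advertised by the current next hop is installed.
    """
    lfib = {}
    for fec, in_label in local_labels.items():
        next_hop = routing_table.get(fec)
        out_label = adverts.get(next_hop, {}).get(fec)
        if out_label is not None:
            lfib[in_label] = (out_label, next_hop)
    return lfib


local_labels = {"PER2": 17}
adverts = {"LSR2": {"PER2": 22}, "LSR3": {"PER2": 35}}

before = build_lfib(local_labels, {"PER2": "LSR2"}, adverts)
# The IGP reroutes: the next hop for PER2 changes from LSR2 to LSR3.
after = build_lfib(local_labels, {"PER2": "LSR3"}, adverts)
```

The label from LSR3 was already held, so the reroute only rebinds the outgoing label locally. The incoming label 17 does not change, so nothing has to be signaled upstream.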
49
LSP Setup for OSPF Route to PER2
[Diagram: CR1, PER1, LSR1, LSR2, LSR3, PER2 and CR2 across the IPFR cloud, showing the LSP for the OSPF route to reach PER2. Label table entries shown: PER2 via L1; L1 to L2; L4 to L2; L2 to pop.]
Li - labels requested via LDP from next hop neighbor for each routing table entry
50
How MPLS VPNs Work
1) Follow the routes
  – Each VPN on a PER has a private routing table
    • Called a Virtual Routing and Forwarding (vrf) table
    • vrf is assigned attributes that are unique to the VPN
      • Route Targets (RT) - attached to VPN routes.
        • only vrfs with common RTs share routes with each other
      • Route Distinguishers (RD) - prepended to routes to ensure uniqueness even if VPNs have overlapping address spaces
        • Creates a new address family called vpnv4 = RD+ipv4
    • Note: RTs and RDs are not applied to packets
2) Follow the packet
  – A stack of two labels is used to forward the packet on the interior LSP and then external interface
51
VPN extensions
Route Target (RT)
  – BGP 64-bit extended community value
  – First 16 bits identify it as RT type. Other 48 bits are variable
    • Conventional format – ASN:X, i.e., 16b:32b
Route Distinguisher
  – 64 bits, first 16 identify RD type
    • 48 bits selectable with format convention ASN:X, i.e., 16b:32b
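A short sketch of why the RD makes overlapping customer prefixes unique. It assumes the ASN:X form uses type value 0 in the first 16 bits; the ASN, the assigned numbers and the function names are hypothetical.

```python
import ipaddress
import struct


def encode_rd(asn: int, number: int) -> bytes:
    """64-bit RD in ASN:X form: 16-bit type (0), 16-bit ASN, 32-bit number."""
    return struct.pack("!HHI", 0, asn, number)


def vpnv4(rd: bytes, prefix: str):
    """vpnv4 = RD + ipv4: return (12-byte address, prefix length)."""
    network = ipaddress.ip_network(prefix)
    return rd + network.network_address.packed, network.prefixlen


rd_a = encode_rd(65000, 1)  # customer A
rd_b = encode_rd(65000, 2)  # customer B

# Both customers use the same private prefix.
route_a = vpnv4(rd_a, "10.1.0.0/16")
route_b = vpnv4(rd_b, "10.1.0.0/16")
```

`route_a` and `route_b` differ only in the RD part, so BGP can carry both. The RD never appears in a data packet; which vrfs import each route is still decided by the RT.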
52
Distributing Customer Routes
[Diagram: CR1 attaches to PER1 over LNK1; LSR1, LSR2 and LSR3 sit in the IPFR cloud; CR2 and network Z attach to PER2 over LNK2. LSP and Li - labels.]
PER2 learns Rt Z via BGP or is statically configured with Rt Z.
At PER2: LNK2 data: vrf1. vrf1: RT1, RD1. Table: Rt Z via LNK2.
53
Customer Routes Distributed via IBGP with Label
[Diagram: same network as the previous slide. PER2 sends an IBGP message carrying RD1+Z, L4, RT1, PER2.]
At PER2: LNK2 data: vrf1. vrf1: RT1, RD1. Table: Rt Z via L4, CR2, LNK2.
54
Only vrfs with Matching RTs Import Route
[Diagram: same network as the previous slides.]
At PER1: LNK1 data: vrf1. vrf1: RT1, RD2. Table: Rt Z via L4, PER2; PER2 via L1, LSR1.
At PER2: LNK2 data: vrf1. vrf1: RT1, RD1. Table: Rt Z via L4, CR2, LNK2.
55
Purpose of BGP Label
Indicates which vrf and optionally which interface on the egress PER
Locally, the egress PER will treat labels in two possible ways:
  – Non-aggregate label is associated with an external route
    • Will be switched directly to an outgoing interface
    • IP header is not examined
  – Aggregate label is associated with a locally originated or directly connected route
    • Packet will be looked up in the vrf context
56
CR1 learns Rt Z via BGP (or statically configured)
[Diagram: same network as the previous slides.]
At CR1: table: Rt Z via PER1.
At PER1: LNK1 data: vrf1. vrf1: RT1, RD2. Table: Rt Z via L4, PER2; PER2 via L1, LSR1.
At PER2: LNK2 data: vrf1. vrf1: RT1, RD1. Table: Rt Z via L4, CR2, LNK2.
57
Packet for Rt Z forwarded by CR1
[Diagram: same network as the previous slides. CR1 sends Z|packet toward PER1.]
At CR1: table: Rt Z via PER1, LNK1.
At PER1: LNK1 data: vrf1. vrf1: RT1, RD1. Table: Rt Z via L4, PER2; PER2 via L1, LSR1.
At PER2: LNK2 data: vrf1. vrf1: RT1, RD1. Table: Rt Z via L4, CR2, LNK2.
58
Top label is label-switched through interior
[Diagram: same network as the previous slides. PER1 sends L1|L4|Z|packet into the LSP. Label table entries along the LSP: L1 to L2, then L2 to pop.]
At PER1: LNK1 data: vrf1. vrf1: RT1, RD1. Table: Rt Z via L4, PER2; PER2 via L1, LSR1.
At PER2: LNK2 data: vrf1. vrf1: RT1, RD1. Table: Rt Z via L4, CR2, LNK2.
59
Top label popped at end of LSP
[Diagram: same network as the previous slides. With the top label popped, L4|Z|packet arrives at PER2.]
At PER2: LNK2 data: vrf1. vrf1: RT1, RD1. Table: Rt Z via L4, CR2, LNK2.
60
Inner label determines egress interface and then is popped.
[Diagram: same network as the previous slides. PER2 forwards Z|packet toward CR2.]
At PER2: LNK2 data: vrf1. vrf1: RT1, RD1. Table: Rt Z via L4, CR2, LNK2.
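The whole walk from PER1 to CR2 can be replayed with the table entries shown on these slides. This is a sketch only: the tables are plain dictionaries, and placing the "L2 pop" entry on LSR2 is an assumption, since the transcript does not say which LSR holds it.

```python
# State taken from the slides (labels L1, L2, L4 are the slide's names).
vrf1_at_per1 = {"Z": ("L4", "PER2")}      # VPN route: inner label, BGP next hop
lsp_at_per1 = {"PER2": ("L1", "LSR1")}    # LSP to the next hop: outer label, neighbor
lfib = {
    "LSR1": {"L1": ("swap", "L2", "LSR2")},
    "LSR2": {"L2": ("pop", None, "PER2")},  # assumed penultimate hop
}
vpn_labels_at_per2 = {"L4": ("CR2", "LNK2")}


def follow_packet(dest: str):
    """Return (trace, neighbor, link); trace lists the stack leaving each node."""
    inner, bgp_next_hop = vrf1_at_per1[dest]
    outer, node = lsp_at_per1[bgp_next_hop]
    stack = [outer, inner]  # top of stack first
    trace = [("PER1", list(stack))]
    while node in lfib:
        action, out_label, next_node = lfib[node][stack[0]]
        if action == "swap":
            stack[0] = out_label
        else:
            stack.pop(0)
        trace.append((node, list(stack)))
        node = next_node
    # Egress PER: the inner label alone selects the outgoing interface.
    neighbor, link = vpn_labels_at_per2[stack.pop(0)]
    trace.append((node, list(stack)))
    return trace, neighbor, link


trace, neighbor, link = follow_packet("Z")
```

The LSRs only ever touch the top label, so they need no knowledge of Rt Z or of the vrf, and PER2 never has to look at the IP header for this non-aggregate label.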
61
MPLS in Core Not Needed
MPLS for IGP domain serves as a tunneling method among PERs
Could use other tunneling methods
Advantages to MPLS:
  – Full mesh of LSP tunnels automatically created
  – Can use MPLS TE
Internet draft to use IP or GRE tunneling
  – Automatically (treat vpnv4 BGP next hop as a recursive encapsulation)
62
MPLS VPN Security
There is a private routing table for each VPN (vrf)
VPN membership identity associated with each access connection
  – VPN membership is not determined by IP header, only by interface (e.g., DLCI, VPI/VCI, PPP, VLAN tag).
  – Label and RT for VPN attached to routes advertised for interface.
  – Route and its matching label are only imported by routing tables that match the VPN RT.
  – Impossible for a packet on a PVC in one vrf to spoof its way or jump into another vrf
63
MPLS VPN Security
Requires correct provisioning of connections and RTs
Same as FR/ATM security
    • Given correct provisioning it is impossible for packet to “jump” from one PVC to another PVC
  – If you don’t need encryption on FR/ATM/P-L then don’t need it here
    • If you would encrypt over FR/ATM/P-L then you would also encrypt here
64
MPLS VPN Scale for the Carrier
Can the MPLS VPN technology scale to meet the size of the market?
Can it be managed at scale?
Does this have anything to do with the Internet?
65
All Those Routes
Can be a lot of routes in the service
  – The aggregate over all VPNs
  – There is no summarization of routes!
BUT …
  – No VPN state in backbone LSRs, only in PERs.
  – PER only holds routes for VPNs touching it.
  – Route Reflectors (RR) only handle VPNs they touch
66
Large Corporate Intranet vs Internet
Intranet = Private Corporate Network
  – aka VPN
  – P/L or FR/ATM VCs
    • Recent survey - total number of U.S. enterprises using FR at ~35,000
  – Tens to thousands of sites
    • 95% T1
Internet – corporate access from Intranet
  – 10 Corporate gateways
  – T1
67
Modeling Large Enterprise VPNs
Based on customer observations and business nature of large corps
Large # sites, N, for N = 100 – 10,000
#routes = K*N + C
  – K = 2, 3, … 10
    • one route always for CER-PER link
Largest FR/ATM customer N ~20,000 sites
95% < T1
BW utilization < 25%
68
Constraints
User Plane – forwarding (pps)
  – vrf/MPLS cost is small
Control Plane
    • Interior (IGP) component is independent of MPLS/VPN
  – Space (memory)
    • vrf and BGP session overhead small compared to route space
  – Signaling/Routing (CPU)
    • most important in transient situations (e.g., link failures)
Public resources
  – e.g., registered IPv4 addresses, RTs among partners
OSS
69
Divide and Conquer
How to keep dimensions within constraints?
  – (i) State reduction
  – (ii) Partition
  – (iii) Distribute
    • Forwarding
    • Control plane
70
State Reduction
Route summarization
  – In “middle” of customer network
  – Not really an option, maybe greenfield intranets
Limit the routes in a VPN
  – Keep # allowed routes commensurate with number of interfaces purchased
    • vrf route limit
  – Use carrier’s carrier
    • “Providers”, non-enterprise, that have lots of routes, few ports
71
Partitioning
Limit VPNs touching a PER
  – To avoid poor PER utilization – aggregate and fan-out
    • Groom up interfaces rather than push PER toward CE
    • Fan out across PERs in a POP
Limit VPNs touching RR
  – (i) Via RTs and ORF – requires RT assignment strategy
  – (ii) Via communities
  – (iii) Via PER-RR mutually exclusive VPN subsets
PERs and RRs only need to handle the largest enterprise customer VPN
72
Distributing Forwarding and Control
Distributed forwarding
  – All modern routers have distributed user (forwarding) plane
  – BUT most have centralized control
    • Low CER speeds and lots of CERs per PER: more likely to be constrained by control plane limits than by packets per second
Distributed control
  – CER-PER routing and vrf tables limited to necessary interfaces
  – Central controller has no vrf tables, vpnv4 route tables, or CER peering protocols
73
IP Multicast VPN Solution
It is based on Multicast Domain (draft-rosen-vpn-mcast-05.txt)
P and PE routers multicast enabled
Provider internal multicast routing tables
Globally PEs configured to run PIM (global instance) with adjacent P routers
PEs maintain PIM adjacencies with CE devices
Normal PIM configuration in customer network
  – PIM modes, RPs, multicast addressing
[Diagram: CEs with PIM adjacencies to PEs (VRF/mVRF); PEs with PIM adjacencies to P routers in the backbone, which runs backbone multicast.]
74
Multicast Tunnels and the default MDTs
Per mvrf default multicast distribution tree (default MDT) using traditional PIM within backbone
MDT used to distribute end customer multicast packets and PIM control messages
Access to the MDT is via a multicast GRE tunnel interface on PE
Each PE in VPN is a leaf _and_ root on the MDT
For efficiency (but more state) can launch per session (S,G) MDTs for a VPN
[Diagram: provider network with a per-VRF MDT connecting three PEs, each with an attached CE.]
75
Using an MDT
Forwarding onto the MDT is done in encapsulated packets off PE
  – GRE or IP-in-IP
C-packets - customer control and data packets
P-packets – provider control and data packets
  – Destination address = MDT group address for VRF
  – Source address = IP address of PE M-BGP peering address
C-packet becomes a P-packet when encapsulated
MPLS is NOT used!
[Diagram: the source behind a CE sends a C-packet (SRC=PC1, DST=225.1.1.1) to its PE. The PE encapsulates it as a P-packet (SRC=Lo1, DST=234.10.10.1, the MDT group address). The remote PE delivers the C-packet (SRC=PC1, DST=225.1.1.1) through its CE to the receiver.]
76
IP Traffic Engineering and MPLS
Improving utilization of backbone resources
77
The Multi-Commodity Flow Problem
[Diagram: nodes ni and nj joined by a link with cost cij.]
Demands d(i,j) from node i to j
Constraints - link capacity b(i,j)
Costs, e.g., link costs C(i,j)
Path (route) p(k) variables for each demand
The traffic engineering problem:
1) Find a feasible solution
2) Find a min cost solution
3) Find feasible and min cost solution with single node or link deletion
78
Explicit Routing
Solutions to the arbitrary TE problem require specifying the explicit route (path) for each demand
Could calculate explicit routes satisfying constraints offline
– Then specify explicit routes in network without constraints
79
Constraint-based Shortest Path First (CSPF)
Can let the network enforce constraints
CSPF distributed algorithm
  – Given full knowledge of network resource allocation
  – Route a demand by
    1) Pruning network to only feasible paths
    2) Picking shortest path
    • Compromise to solving the full TE problem
80
IP TE
Metric manipulation
  – i.e., pick OSPF weights to create feasible solution
    • Limited in problems that it can solve
Simple topology and capacity augmentation
  – Tends to over-engineer or restrict topology
Source route
  – IPv4 option that allows explicit route
  – Very costly, not practical
No efficient explicit routing nor knowledge of network resource allocation
81
Making it Fit with Plain IP Routing
[Diagram: routers 1, 2, 3 and 4 interconnected through nodes A, B, C and D.]
Link size = 1, d(1,2) = 0.75, d(1,3) = 0.5, d(1,4) = 0.5
Can’t pick OSPF weights that work
82
ATM TE
ATM routing (PNNI) has knowledge of resource usage
  – Bandwidth booked per trunk
Performs CSPF to find feasible path for
  – New demand
  – Rerouted demand
  – Feasibility referred to as Call Admission Control
83
Making it Fit with ATM switching
Link size = 1, d(1,2) = 0.75, d(1,3) = 0.5, d(1,4) = 0.5
[Diagram: the same topology (routers 1 to 4, nodes A to D) with ATM switches in the core.]
84
IP over ATM
The way to build ISP backbones not too long ago
Allowed efficiently utilizing a limited number of costly facilities shared among routers
Typically a full mesh of ATM PVCs is created among the backbone routers
  – PVCs sized to router endpoint demand
But …
  – Led to N^2 IP peering
  – IP router investment outstripped speed of ATM
85
MPLS TE for IP
Provides efficient explicit routing for IP
Can communicate resource constraints
But not an overlay routing design
  – Routers not in a full peering mesh
Uses IP-based control plane protocols rather than a different protocol
  – RSVP-TE uses extensions of RSVP to carry labels and additional constraints
86
How RSVP-TE Works
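PATH downstream contains explicit hops and bandwidth
RESV upstream contains labels
[Diagram: router 1 signals PATH <A, B, C, 3> for 0.5 Mbps; RESV returns hop by hop with labels (18, 51, pop). A packet for 3.1 carrying LDP label 9 (LDP: 3, L9) is shown on the tunnel as 3.1|9|18, then 3.1|9|51, then 3.1|9 after the pop, and finally as 3.1.]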
87
Online CSPF with OSPF-TE
Can use RSVP-TE without resource reservation
  – Calculate constrained paths offline
For online CSPF need:
  – Knowledge of resource assignment in network
    • Add resources to OSPF link states
      • i.e., bandwidth available per class (diff-serv)
    • Flood changes in resource allocation
      • Unlike normal OSPF which just floods when link up/down changes
  – Now use RSVP-TE with non-zero reservations per class (diff-serv)
  – Similar to ATM PNNI
88
MPLS Fast Reroute
Using MPLS TE to improve availability
  – RSVP-TE creates backup tunnels
  – On failure of protected LSP, packets are shoved down backup LSP tunnel
  – Switchover is faster than waiting for CSPF to calculate and signal a new LSP
For local repair (link or node) can recover in ~100 ms or better
  – Backup LSP is already in place, so as soon as the failure is detected locally the headend just needs to reprogram the label FIB
89
Link Protection
Create backup LSP around link to next hop
With or without reservation
  – Can also back up normal LDP LSP
[Diagram: protected LSP through the network with labels 18, 51, 45 and a final pop. The backup tunnel goes around the protected link and pushes label 51 onto the tunnel.]
90
Node Protection
Create backup tunnel LSP for two hops away (next-next hop)
Backs up RSVP-TE tunnel
  – Learns labels from RESV recorded route of protected tunnel
[Diagram: protected LSP with labels 18, 51, 45 and a final pop. The backup tunnel bypasses the protected node and pushes label 45 onto the tunnel.]
91
Path Protection
Create an end-to-end diverse backup tunnel
Slower than local protection – have to wait for headend to detect failure
[Diagram: protected LSP (labels 18, 51, 45, pop) and a diverse end-to-end backup LSP.]
92
What are Layer 2 VPNs?
Defined at the IETF PPVPN and PWE3 groups
http://www.ietf.org/internet-drafts/draft-ietf-ppvpn-l2vpn-requirements-00.txt
http://www.ietf.org/html.charters/pwe3-charter.html
Point-to-point
  – Virtual Private Wire Service (VPWS)
  – Offers FR, ATM and Ethernet “pvc”-like services
    • Nothing new here – have been available for many years
Multi-point Ethernet bridging
  – Virtual Private LAN Service (VPLS)
    • Similar to the Transparent LAN Services
      • Around for a while using standard Ethernet switching
      • But VPLS is more scalable over the WAN
93
So what?
IP or MPLS as the multi-service carrier core
  – Was ATM, but ATM didn’t keep up with IP investments
  – On one core network carrier can put
    • Internet, Voice (trunking and service), FR, Ethernet, ATM, L3 IP VPN, IPSEC VPN
    • Finally, network convergence for the carrier??
New market for struggling carriers
  – Some newer providers only built fiber transport and IP backbone for Internet service and no ATM backbone
    • They are eager to go after the Enterprise WAN business
    • L2 VPNs can be built on their existing IP infrastructure
For customers – nothing really new, just more competitors for their WAN
94
[Chart: Business Communications Review, Jan 2002, chart from Vertical Systems]
95
Tunneling
PE-to-PE tunnel
  – L2TP
  – MPLS
Multiplexer field
  – One tunnel, many connections called pseudo wires
Control field
  – Optional sequence number (detect out of order packets)
  – Protocol specific control bits (e.g., DE, FECN, CLP, PTI)
96
Encapsulations
L2TPv3: IP header (20B) | Session ID (4B) | Cookie (8B) | Control word (opt) | Payload
MPLS: Tunnel label (4B) | VC label (4B) | Control word (opt) | Payload
L2TPv3 – purely connectionless with IP header
  – No new technology in carrier IP backbone … but
  – Spoofable
    • Cookie provides no strong verification
  – No QoS other than diff-serv
MPLS
  – Less overhead
  – Can use MPLS TE
97
L2 MPLS VPN: Example FR
“Directed” LDP between PE pair exchanges FEC and label for a particular pseudo wire
PVCs within tunnels
[Diagram: CERs attach to PERs around the MPLS network on DLCI 100, DLCI 200 and DLCI 300. The FR PVC from DLCI 100 to DLCI 300 is carried between PERs as L1 | L2 | Cw | FR PDU.]
98
VPLS – Virtual Private LAN Service
Multipoint-to-multipoint service
  – Any-to-any
Does LAN bridging, MAC address learning
While Ethernet frame based
  – IT IS ACCESS TECHNOLOGY AGNOSTIC
    • Don’t have to use GigE
    • Can use Ethernet bridging over other access types
      • i.e., bridged over FR/ATM/PPP for NxDS0, T1, NxT1, T3 or bridged over SONET
  – Any protocol – not just IP [e.g., IPX, DECNET]
No routing with carrier! … but
  – More than a few dozen sites on a VPN (single LAN)?
  – No Spanning Tree, so just connect routers
99
State of the Art
OAM still needs a lot of work
  – Fault detection/isolation, performance measurement, probing
Little “Call Admission Control”
  – How to map bandwidth resources and classes onto tunnels
No multi-AS implementations
Minimal legacy interworking
  – Just glue connections at “dumb” interconnect to ATM
100
MPLS to Prem or in the Enterprise?
Can run MPLS to CER
  – By running BGP CER-CER
    • Can create own VPNs (vpnv4) on top of provider’s
      • Tenant service
    • Hierarchy of ipv4 routing
      • 3rd tier ISP backbone outsourcing (carrier’s carrier)
  – But don’t need MPLS for tunneling CER-CER
    • Use IP tunneling with transparent interoperability with carrier
MPLS in private network
  – Create own VPNs (essentially an internal carrier)
  – For traffic engineering IP
101
Completed PHASE I - MPLS
Please Continue to the Next Phase
Performance Engineering in MPLS-based VPNs
Susan Hilton
Enterprise Network Consultant
103
Performance Engineering
Rationale
CoS Foundation Technologies
Service Implementation
Applied Performance Engineering
104
Not All Traffic is Equal
APPLICATIONS       Interactive Data   3-Tier ERP   Bulk Transfer   Interactive Voice
SERVICE METRICS
BANDWIDTH          LOW                MEDIUM       HIGH            MEDIUM
DELAY              LOW                MEDIUM       HIGH            LOW
JITTER             MEDIUM             MEDIUM       HIGH            LOW
105
Multi-Application Networks
Mixing applications with ‘similar’ traffic characteristics and similar performance requirements is simply a sizing exercise
– Statistical multiplexing
Mixing applications with conflicting traffic characteristics often causes some to not meet Response Time requirements
– Even with sufficient bandwidth deployed!
106
Traffic Profiles and basic QoS requirements: Voice, Video and Data
Voice
– Smooth, Drop Sensitive, Delay Sensitive, UDP Priority
– One-way requirements: Latency ≤ 150 ms, Jitter ≤ 30 ms, Loss ≤ 1%
– Bandwidth per call depends on codec and sampling rate
Video-Conf
– Bursty, Drop Sensitive, Delay Sensitive, UDP Priority
– One-way requirements: Latency ≤ 150 ms, Jitter ≤ 30 ms, Loss ≤ 1%
– Similar performance requirements as VoIP, but radically different traffic patterns
Data
– Smooth/Bursty, Drop Insensitive, Delay Insensitive, TCP Retransmits
– Data Classes: Mission-Critical Apps, Transactional/Interactive Apps, Bulk Data Apps, Best Effort Apps (Default)
– Traffic patterns for Data vary among applications (and even among different versions of the same application)
107
Data Classifications: Application Examples
Interactive
– Example applications: Telnet, Citrix, Oracle Thin-Clients, AOL Instant Messenger, Yahoo Instant Messenger, PlaceWare (Conference), Netmeeting Whiteboard
– Application / traffic properties: Highly interactive applications with tight user feedback requirements.
– Packet / message sizes: Average message size < 100 B; max message size < 1 KB
Transactional
– Example applications: SAP, PeopleSoft - Vantive, Oracle – Financials + Internet Procurement + B2B + Supply Chain Mgmt + Application Server, Oracle 8i Database, Ariba Buyer, I2, Siebel, E.piphany, Broadvision, IBM Bus 2 Bus, Microsoft SQL, Lotus Notes, Microsoft Outlook, BEA Systems, Email Download (SMTP), DLSw+
– Application / traffic properties: Transactional applications typically use a client-server protocol model. User initiated client based queries followed by server response. Query response may consist of many messages between client and server. Query response may consist of many TCP and FTP sessions running simultaneously (e.g. HTTP based applications)
– Packet / message sizes: Depends on application. Could be anywhere from 1 KB to 50 MB
Bulk
– Example applications: Database Syncs, Network based Backups, Video Content Distribution, Large ftp file transfers
– Application / traffic properties: Long file transfers. Always invokes TCP congestion management
– Packet / message sizes: Average message size 64 KB or greater
Best-Effort
– Example applications: All non-critical traffic, HTTP Web Browsing + Other Miscellaneous traffic
108
Performance Engineering
Includes:
– Network Engineering
– Capacity Planning
– Traffic Engineering
– Bandwidth management
• congestion management/avoidance to ensure the availability of high priority traffic and at the same time increase the network efficiency
What is performance engineering?
– The process of engineering a network to assure that applications attain their required performance.
– Provide defined service metrics for different applications
109
Service Metrics
Bandwidth:
Delay/Latency:
– Time it takes a packet to travel from origination to destination
– Distance, switching, insertion, queuing
Jitter (Variability in Delay):
– Latency that is unpredictable; 1st packet 10ms delay, 2nd packet 30ms delay. (Early/late packets)
Packet Loss:
– Buffer Overflows, Selective discards, Line Errors
110
Technology Evolution - General Attributes
[Figure: PL, FR, ATM, L2 VPN and I VPNS compared]
Increase in: Connectivity, Shared Resources (<$?), Path Variance, Delay, Delay Variance
Decrease in: Per connection engineering
One end: Most connectivity, Most delay, Least cost
Other end: Least connectivity, Least delay, Most cost
111
Technology Evolution - Future Direction
[Figure: PL, FR, ATM, L2 VPN and I VPNS against Connectivity and Shared Resources (<$?), shown again with COS added]
Class-of-Service features added to improve:
• Path Variance
• Delay
• Delay Variance
112
QoS / CoS…What’s the Difference?
QoS – Quality of Service
– Absolute Metrics, ‘Contracted’ parameters
– Each flow must be engineered independently
– Addresses the service requirements of different applications in order to provide more than “best effort” service for specific applications
CoS – Class of Service
– Relative treatment of contending flows
– Implies that flows can be categorized or differentiated!
– The implementation that provides QoS
Used interchangeably in this session
113
Is Bandwidth the Answer?
Just deploy more bandwidth
– Queuing Delay is f(link speed)
– Bandwidth is cheap
– QoS is complicated
• MAYBE for LAN, MAN
• Maybe even for WAN backbone
• WAN edge will remain a bottleneck for foreseeable future
114
Why IP QoS is Needed
Enterprise networks are migrating toward IP transport
Best effort is not good enough
– ‘Engineered’ performance is required for enterprise applications.
Emerging applications (VoIP, Streaming) are highly sensitive to delay, jitter (delay variation), and packet loss.
Need performance/reliability of private networks with ubiquity/cost advantage of Internet.
• One approach
– MPLS
115
CoS for IP VPNs
Traditional techniques do not work for mesh topologies as we will see.
Egress port speed is still a bottleneck
– The Service needs to participate in CoS solution.
116
CoS Foundation Technologies
Advanced Queuing
Queue Management
Traffic Shaping
Classification and Marking
Fragmentation
117
Advanced Queuing
Advanced Queuing is any technique that transmits packets in a different order than they were received.
– I.e., not FIFO
These techniques only kick in when there is congestion. (I.e., if there is no queuing, then there is no ‘advanced’ queuing.)
– Advanced queuing only makes sense where there is a speed mismatch.
• I.e., arrival rate is greater than departure rate.
118
Priority Queuing
Prioritization allows specified traffic to preempt competing traffic.
Can be multiple levels of priority.
– (4 in Cisco)
Higher priority traffic can ‘starve’ lower priority traffic.
In
Out
119
Bandwidth Allocation
Also called:
– Custom Queuing
– Weighted Round Robin
• Each traffic type gets a relative allocation of the bandwidth.
• bits, bytes, packets…
In
105311
Out
120
Bandwidth Allocation Considerations
Cycle Time
– All buckets are served in each cycle. (No ‘priority’)
– More buckets means longer cycle time
Unused allocation shared proportionately across remaining traffic types.
Cisco de-queuing quirk
– Always serve at least 1 packet, even if it is bigger than allocation
– No concept of ‘credit’ or ‘deficit’.
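The de-queuing behavior described on this slide can be sketched in a few lines. This is an illustration only, not Cisco's implementation: the queue names, packet lengths and byte allocations are hypothetical, and the cycle simply keeps sending from a queue until its byte count for the cycle is reached, without carrying any credit or deficit forward.

```python
from collections import deque


def custom_queue_cycle(queues, byte_allocations):
    """Serve one round-robin cycle; return (queue_name, packet_len) in send order."""
    sent = []
    for name, allocation in byte_allocations.items():
        queue = queues[name]
        served = 0
        # Keep sending while this cycle's byte count is below the allocation.
        # Packets are never split, so the last one may overshoot, and nothing
        # is remembered for the next cycle (no credit, no deficit).
        while queue and served < allocation:
            length = queue.popleft()
            served += length
            sent.append((name, length))
    return sent


# Hypothetical traffic: both types are allocated 1000 bytes per cycle.
queues = {
    "bulk": deque([1500, 1500, 1500]),
    "interactive": deque([100, 100, 100]),
}
allocations = {"bulk": 1000, "interactive": 1000}
cycle = custom_queue_cycle(queues, allocations)
print(cycle)
```

In this run the bulk queue sends a full 1500-byte packet against a 1000-byte allocation, while the interactive queue only has 300 bytes to send, so the real split of the link differs from the configured one.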
121
Fair Queue
Not all packets are equal
– Large and small packets
– Part of sparse or heavy peer to peer
– More or less time sensitive to application
Scheduling algorithms – basic
– When a packet within a flow arrives, calculate when it would get served as part of each flow
– Process packets in this order (not necessarily at this time)
1
3
2
4
5
122
Weighted and Flow Based
Simple algorithm works well to give everyone a fair share.
Some problems -
– Sparse flows must wait even though they require little bandwidth (high jitter)
– High priority packets must wait
Assign weights to scheduled times based on
– Precedence
– Sparse or heavy flow (Cisco Flow Based WFQ)
– Other
123
Weighted Fair Queuing
No ‘defined’ traffic classes
– Packets ‘hashed’ to 1 of 64 queues
– Hash f (SourceIP, Source Port, DestIP, DestPort)
– Possibility of bulk and interactive with same hash
‘Weighted Fair’ means each queue has equal weight (1)
Each Packet is ‘scheduled’ at arrival time
– Schedule Time = Queue Tail + (Weight * Length)
– De-Queue based on Schedule Time
• ‘Calendar Queuing’
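The scheduling rule above can be sketched as a small calendar queue. This is a minimal sketch under stated assumptions, not router code: the CRC32 hash, the class and method names, and the choice to start an empty queue's tail at the schedule time of the last packet sent are all illustrative.

```python
import heapq
import zlib

NUM_QUEUES = 64  # the slide's "1 of 64 queues"


def queue_index(src_ip, src_port, dst_ip, dst_port):
    """Hash the flow 4-tuple onto a queue (the hash function is an assumption)."""
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key) % NUM_QUEUES


class WfqScheduler:
    def __init__(self):
        self.queue_tail = {}  # queue index -> schedule time of its last packet
        self.calendar = []    # min-heap of (schedule_time, arrival_seq, label)
        self.now = 0          # schedule time of the most recently sent packet
        self.seq = 0

    def enqueue(self, queue, length, weight=1, label=None):
        # Schedule Time = Queue Tail + (Weight * Length)
        tail = max(self.queue_tail.get(queue, 0), self.now)
        schedule_time = tail + weight * length
        self.queue_tail[queue] = schedule_time
        heapq.heappush(self.calendar, (schedule_time, self.seq, label))
        self.seq += 1
        return schedule_time

    def dequeue(self):
        # De-queue based on schedule time: smallest first.
        schedule_time, _, label = heapq.heappop(self.calendar)
        self.now = schedule_time
        return label


sched = WfqScheduler()
ftp_q = queue_index("10.0.0.1", 20, "10.0.0.2", 1025)
telnet_q = (ftp_q + 1) % NUM_QUEUES  # force a different queue for the demo
for i in range(3):
    sched.enqueue(ftp_q, 1500, label=f"ftp-{i + 1}")
sched.enqueue(telnet_q, 60, label="telnet")
order = [sched.dequeue() for _ in range(4)]
print(order)  # ['telnet', 'ftp-1', 'ftp-2', 'ftp-3']
```

The small Telnet packet arrives last but is scheduled at 60, ahead of the 1500-byte packets at 1500, 3000 and 4500. If both flows hashed to the same queue, the Telnet packet would instead sit behind the bulk packets, which is the collision risk the slide mentions.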
123
64 60
250
200
150
125
1000
300
310
120180240300
0
Hash
124
Flow Based WFQ – more detail
Detects bandwidth of layer 4 flows (also known as conversations)
Classifies traffic into as many as 64 bins
Allocates bandwidth equally across all flows
Light flows get the bandwidth they need, heavy flows share the remaining bandwidth
On by default in Cisco low speed interfaces
125
Class-Based Weighted Fair Queuing
A Hybrid
– ‘Class-Based’
• Defined Classes instead of hashing
– ‘Weighted Fair Queuing’
• Weighted Fair Queuing with defined ‘weight’
• Schedule Time = Queue Tail + (Weight * Length)
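[Figure: class-mapping of packets into queues A, B and C; weights 4, 2 and 10; time starts at 0]
– 60-byte packets at weight 10: schedule times 600, 1200
– 100-byte packet at weight 4: schedule time 400
– 150-byte packets at weight 2: schedule times 300, 600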
126
Class-Based Weighted Fair Queuing
Weight of a traffic class is implied by Bandwidth
– Weight ~ 1/(BW Percent)
Behavior is opposite of WFQ
– Higher BW = Higher Priority
Example:
Class A – FTP, BW 10%, Weight 10
Class B – Telnet, BW 90%, Weight 1.111
[Figure: class-mapping starting at time 0. Class A: one 1500-byte packet, schedule time 15,000. Class B: 60-byte packets at weight 1.1, schedule times 66, 132, 198, 264.]
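The FTP/Telnet example can be checked with a short calculation. This is a sketch only: the scale factor of 100 in `cbwfq_weight` is an assumption chosen so that 10% gives weight 10 and 90% gives weight 1.111 as on the slide, and both function names are hypothetical.

```python
def cbwfq_weight(bandwidth_percent):
    """Weight ~ 1/(BW percent), scaled by 100 (assumed) to match the slide's numbers."""
    return 100.0 / bandwidth_percent


def schedule_times(packet_lengths, weight, queue_tail=0.0):
    """Apply Schedule Time = Queue Tail + (Weight * Length) to a run of packets."""
    times = []
    for length in packet_lengths:
        queue_tail += weight * length
        times.append(queue_tail)
    return times


weight_ftp = cbwfq_weight(10)      # Class A
weight_telnet = cbwfq_weight(90)   # Class B
ftp_times = schedule_times([1500], weight_ftp)
telnet_times = schedule_times([60] * 4, weight_telnet)
print(weight_ftp, round(weight_telnet, 3))
print(ftp_times, [round(t, 1) for t in telnet_times])
```

All four Telnet packets are scheduled well before the single FTP packet, which is the "higher BW = higher priority" behavior the slide describes.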
127
CBWFQ - Cisco
Unlike WFQ that applies relative priority, CBWFQ enables absolute guarantees
Assigns flows to classes
– Allocation to a class can be based on almost anything
Made up of many parts
– FQ ~ Fair Queue, nobody gets it all
– W ~ Weights applied to queues
– CB ~ Class Based, uses class to define queues
None of this applies unless there is a queue to manage. When no congestion, no priority is assigned.
Only available Cisco solution for high speed router ports (above E1, with possible exception of DS3 Frame Relay)
Weights are applied per class
Generally uses Flow Based WFQ within a class
Includes a special class – Low Latency Queue (LLQ)
128
Configuring CBWFQ (Cisco)
Policy-map defines the classes
Class-map assigns packets to a class
Service-policy invokes the policy on an interface

class-map class1
 match access-group 101
class-map class2
 match input-interface s0
!
policy-map policy1
 class class1
  bandwidth 50
  queue-limit 100
 class class2
  bandwidth 20
  queue-limit 35
 class class-default
  fair-queue
interface atm0.1 point-to-point
 ip address 10.10.10.1 255.255.255.252
 pvc atlanta 1/105
  vbr-nrt 40000 72000 32
  service-policy out policy1
129
Low Latency Queue - LLQ
A special queue defined by the policy-map
Applies strict priority up to the bandwidth specified
– will not serve any other queue until the LLQ is empty
Drops packets above the specified bandwidth
– WRED and queue depth do not apply
Invoked by using the “priority” command in place of the “bandwidth” command
130
Queue Management
Queue Depth is specified in policy-map
CBWFQ defines how queues are served but what happens when a particular queue gets too big?
Packets are discarded
– Tail Drop drops all packets arriving after the queue is full
– Weighted Random Early Detection (WRED)
131
Queue Management – Tail Drop
Tail Drop, Global Synchronization, WRED
• Tail drop tends to affect all flows in a queue. This effect is called global synchronization. All flows crank up their windows, congestion occurs, tail drop drops all arriving packets, all windows reset.
Queue Full
IndividualTCP
Sessions
TotalBW
UtilizationAverage Utilization After tail drop.
132
Queue Management - WRED
Weighted Random Early Detect
– WRED drops a few random packets before congestion reaches the queue depth threshold. This causes a small number of flows to reset their TCP/IP window, while the remaining flows continue to use available bandwidth.
– Effective for ‘large’ number of flows, questionable for enterprise.
– Can specify differing WRED threshold based on IP Prec.
Queue FullWRED Thresh
IndividualTCP
Sessions
TotalBW
UtilizationAverage Utilization After WRED drop.
133
Tail Drop vs. WRED
Tail drop tends to affect all flows in a queue. This effect is called global synchronization. All flows crank up their windows, congestion occurs, tail drop drops all arriving packets, all windows reset.
WRED drops a few random packets before congestion reaches the queue depth threshold. This causes a small number of flows to reset their TCP/IP window.
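To make the early-drop idea concrete, here is a minimal sketch of the usual RED-style drop curve with a different threshold per IP Precedence. The thresholds, the maximum drop probability and the function names are hypothetical values chosen for illustration, not settings from any router.

```python
import random


def red_drop_probability(avg_depth, min_th, max_th, max_p):
    """Drop probability rises linearly from 0 at min_th to max_p at max_th."""
    if avg_depth < min_th:
        return 0.0          # no early drops below the minimum threshold
    if avg_depth >= max_th:
        return 1.0          # behaves like tail drop once the maximum is reached
    return max_p * (avg_depth - min_th) / (max_th - min_th)


# Hypothetical per-precedence profiles: (min threshold, max threshold, max_p).
# Higher precedence starts dropping later, so it is protected longer.
WRED_PROFILES = {0: (20, 40, 0.10), 5: (35, 40, 0.10)}


def should_drop(avg_depth, precedence, rng=random.random):
    min_th, max_th, max_p = WRED_PROFILES[precedence]
    return rng() < red_drop_probability(avg_depth, min_th, max_th, max_p)


print(red_drop_probability(30, 20, 40, 0.10))  # halfway up the curve for precedence 0
print(red_drop_probability(30, 35, 40, 0.10))  # precedence 5 is still below its threshold
```

Because only a random few packets are dropped in the early region, only some TCP sessions back off at any moment, rather than all of them at once as with tail drop.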
134
Traffic Shaping
• Traffic shaping is a tool to ‘move’ a queue from one place in a network to another.
• Queues only occur where there is a speed mismatch.
• If arrival rate > departure rate -> queue
• Traffic Shaping forces a speed mismatch in a router to prevent a speed mismatch in a network.
135
Traffic Shaping
FRNetwork
T1 In / 64K Out: Queue in the Network
Traffic Shape to ~ 64K – No Queue in Network
Queue is in Router, where advanced queuing can be used
64K Port
T1 Port
RouterRouter
136
Mesh Based Shaping
In mesh topologies, where is the choke point? That is where a queue will build!
Router Router
Q
Can we control this queue at the high speed end? NO, because of other remote sites.
137
Value of MPLS Traffic Shaping
Unlike mesh PVCs, MPLS services can apply shaping in the cloud to manage the queue
Router Router
Q
138
Policing / Shaping
Two Sides of the Same Coin
A service subscription has an implied traffic contract
– Service provider Polices arriving packets against the contract
– Customer Shapes traffic to assure conformance to the contract
139
Policing
Policing
– Enforce a traffic ‘contract’
– Pass all traffic within contract
– Out of contract
• Drop
• Or ‘mark’ as out of contract
• DE Bit in frame relay networks
• CLP in ATM networks
• IP Precedence or DSCP in IP networks
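One common way to check traffic against a contract is a token bucket, sketched below. The slides do not say which algorithm the service uses, so treat this as an assumption; the class name, rate, burst size and action strings are all hypothetical.

```python
class Policer:
    """Single-rate token bucket: in-contract packets pass, the rest are dropped or marked."""

    def __init__(self, rate_bps, burst_bytes, exceed_action="drop"):
        self.rate_bytes_per_s = rate_bps / 8.0
        self.burst_bytes = burst_bytes
        self.tokens = float(burst_bytes)   # bucket starts full
        self.last_time = 0.0
        self.exceed_action = exceed_action  # "drop" or "mark"

    def police(self, arrival_time, length_bytes):
        # Refill tokens for the time elapsed, capped at the burst size.
        elapsed = arrival_time - self.last_time
        self.last_time = arrival_time
        self.tokens = min(self.burst_bytes,
                          self.tokens + elapsed * self.rate_bytes_per_s)
        if length_bytes <= self.tokens:
            self.tokens -= length_bytes
            return "pass"              # within contract
        return self.exceed_action      # out of contract


# Hypothetical contract: 64 kbps with a 1500-byte burst, excess is marked.
policer = Policer(rate_bps=64000, burst_bytes=1500, exceed_action="mark")
results = [
    policer.police(0.00, 1500),  # bucket is full
    policer.police(0.01, 1500),  # only 80 bytes of tokens have accumulated
    policer.police(0.20, 1500),  # bucket has refilled
]
print(results)  # ['pass', 'mark', 'pass']
```

A shaper on the customer side would use the same accounting but delay the out-of-contract packet until enough tokens accumulate, instead of dropping or marking it.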
140
What happens to Non-Conforming Traffic
Mark but do not discard, unless congestion
OR
Drop
141
Classification and Marking
Need to identify packets in order to determine what service level is required (classification)
– Supported by marking or coloring
– Marking is done in the IP header
[Figure: IP header, 20 bytes]
Ver | Hdr | Type of Service | Length
ID | Flag | Fragment Offset
TTL | Protocol | Header Checksum
Source IP Address
Destination IP Address
Options
Data
142
Marking IP Precedence
Type of Service provides 8 bits
– Bits 0-2 IP Precedence
Value  Bits  Name
0      000   Routine
1      001   Priority
2      010   Immediate
3      011   Flash
4      100   Flash Override
5      101   Critical
6      110   Internet Control
7      111   Network Control
143
Marking DSCP
Start with IP Precedence 3 bits (Class Selectors)
Add Drop Precedence Levels of 3 bits
Precedence
7  Same (network control)
6  Same (internet control)
5  Expedited Forwarding (EF)
4  Class 4
3  Class 3
2  Class 2
1  Class 1
0  Best Effort
144
Marking DSCP
The second three bits are used for Per Hop Behavior or Drop Probability
Applies to Class 1-4 or Assured Forwarding (AF)
Provides more flexibility
Drop      Class 1               Class 2               Class 3               Class 4
Low       001010 AF11 DSCP 10   010010 AF21 DSCP 18   011010 AF31 DSCP 26   100010 AF41 DSCP 34
Medium    001100 AF12 DSCP 12   010100 AF22 DSCP 20   011100 AF32 DSCP 28   100100 AF42 DSCP 36
High      001110 AF13 DSCP 14   010110 AF23 DSCP 22   011110 AF33 DSCP 30   100110 AF43 DSCP 38
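The bit layout in the two marking slides can be verified with a few shifts and masks. A minimal sketch; the function names are hypothetical, and the sample byte 0xB8 is simply a Type of Service value whose top six bits are 101 110.

```python
def precedence_from_tos(tos_byte):
    """IP Precedence is the top 3 bits of the 8-bit Type of Service field."""
    return (tos_byte >> 5) & 0x7


def dscp_from_tos(tos_byte):
    """DSCP is the top 6 bits of the same field."""
    return (tos_byte >> 2) & 0x3F


def af_dscp(af_class, drop_precedence):
    """DSCP for AF<class><drop>: 3 class bits, then 2 drop bits, then a 0 bit."""
    if not (1 <= af_class <= 4 and 1 <= drop_precedence <= 3):
        raise ValueError("AF classes are 1-4, drop precedences are 1-3")
    return (af_class << 3) | (drop_precedence << 1)


print(af_dscp(1, 1), format(af_dscp(1, 1), "06b"))  # 10 001010
print(af_dscp(4, 3), format(af_dscp(4, 3), "06b"))  # 38 100110
print(precedence_from_tos(0xB8), dscp_from_tos(0xB8))  # 5 46
```

Reading the first three DSCP bits alone gives the class selector, which is why a device that only understands IP Precedence still sees AF31 (011010) as precedence 3.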
145
Fragmentation
On low speed (<768K) ports, queueing is not enough
Insertion (serialization) delay is an issue
Insertion = packet size (bits)/line speed
– Example: (1500*8) bits / 56 Kbps = 214 msec
Objective for voice is 10 msec insertion delay per packet
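The insertion delay formula is easy to turn into a helper for checking numbers like the one above. A small sketch; the function name is hypothetical and link speeds are taken as bits per second.

```python
def insertion_delay_ms(packet_bytes, line_speed_bps):
    """Insertion (serialization) delay = packet size in bits / line speed."""
    return packet_bytes * 8 * 1000.0 / line_speed_bps


print(round(insertion_delay_ms(1500, 56_000)))   # 214
print(insertion_delay_ms(100, 128_000))          # 6.25
# Delay adds up across every link the packet is serialized onto.
print(insertion_delay_ms(500, 128_000) + insertion_delay_ms(500, 64_000))  # 93.75
```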
146
Insertion Delays
Insertion delay is Fn(line speed) and Fn(packet length)
– 100 bytes at 56Kbps: 800 bits / 56K = 14ms (56Kbps = 7 bytes/ms, so 100 bytes / 7 = 14ms)
– 100 bytes over 128Kbps and 64Kbps links: 6.25ms + 12.5ms; (8*100)*(1/128K + 1/64K) = 18.75ms
– 500 bytes over 128Kbps and 64Kbps links: 31.25ms + 62.5ms; (8*500)*(1/128K + 1/64K) = 93.75ms
147
Fragmentation & Compression
Even with optimal queuing treatment, performance may not be acceptable.
The best treatment that can be obtained with strict prioritization is queuing delay of O(1/2 of a packet).
• I.e., best case, prioritized packet gets transmitted immediately.
• Worst case, prioritized packet has to wait for the currently transmitting packet to finish.
• On average, wait time for prioritized packet is ½ of the transmission time of a non-prioritized packet.
This is still substantial delay (and jitter) for low speed ports.
Fragmentation & Compression is a means to make the low priority packets smaller, so the O(1/2 packet) delay is smaller.
148
Head of Line Blocking
Real time traffic arrives but a 1500 byte packet is just starting transmission
LLQ/CBWFQ only gives priority if the line has not started to send the packet… There is no preemptive capability.
The 1500-byte frame takes 187.5 ms to serialize on a 64-kbps access. Real time traffic has to wait. This is HOL Blocking…
Link Fragmentation and Interleaving (LFI) is the mechanism to fragment large data frames into regularly sized pieces and to interleave small real time packets into the flow.
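A fragment size for LFI follows directly from the target insertion delay. The sketch below assumes the 10 msec per-packet objective for voice stated earlier and ignores fragment header overhead; the function names are hypothetical.

```python
import math


def max_fragment_bytes(line_speed_bps, target_delay_ms=10.0):
    """Largest fragment whose serialization time stays within the target delay."""
    return int(line_speed_bps * target_delay_ms / 1000.0 / 8)


def fragment_count(frame_bytes, fragment_bytes):
    """Number of pieces a frame is cut into (header overhead ignored)."""
    return math.ceil(frame_bytes / fragment_bytes)


frag_64k = max_fragment_bytes(64_000)
print(frag_64k)                        # 80
print(fragment_count(1500, frag_64k))  # 19
```

With 80-byte fragments on the 64-kbps access, a voice packet waits at most about 10 ms behind a data fragment instead of up to 187.5 ms behind a whole 1500-byte frame.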
149
Head of Line Blocking
150
Fragmentation
MTU
– Dangerous – many apps set Do Not Fragment
– Can be done at source, but not very practical
FRF.12
– Fragment data packets
– Prioritize voice packets
– Voice only, not suitable for priority data
• No LFI
• ATM Interworking is complicated
ML-PPP
– Best ‘generic’ solution
– More overhead than FRF.12, but OK with compression
151
Fragmentation
T (Transmit 1 Packet) mS
Link Speed   Packet Size (bytes)
(Kbps)       50    100   250   500   1000  1500
1536         0     1     1     3     5     8
768          1     1     3     5     10    16
512          1     2     4     8     16    23
256          2     3     8     16    31    47
192          2     4     10    21    42    63
128          3     6     16    31    63    94
64           6     13    31    63    125   188
56           7     14    36    71    143   214
152
Why Compression?
IP header
UDP header
RTP header
– Real-time Transport Protocol, synchronizes packets
IP Header (20 bytes) | UDP Header (8 bytes) | RTP Header (12 bytes) | 2 Voice Samples (20 bytes)
Total = 60 bytes, 67% is overhead!
60 bytes * 8 bits/byte * 50 PPS = 24 Kbps!
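The per-call arithmetic can be repeated for a compressed header to see what compression buys. In this sketch the 2-byte compressed header is an assumption (a typical figure for compressed RTP without UDP checksums), layer 2 overhead is ignored, and the function name is hypothetical.

```python
IP_HEADER, UDP_HEADER, RTP_HEADER = 20, 8, 12  # bytes
VOICE_PAYLOAD = 20                              # bytes, 2 voice samples
PACKETS_PER_SECOND = 50


def voip_bandwidth_bps(payload_bytes, header_bytes, packets_per_second):
    """Per-call bandwidth in bits per second, excluding layer 2 overhead."""
    return (payload_bytes + header_bytes) * 8 * packets_per_second


uncompressed = voip_bandwidth_bps(
    VOICE_PAYLOAD, IP_HEADER + UDP_HEADER + RTP_HEADER, PACKETS_PER_SECOND)
# Assumed: header compression shrinks the 40-byte IP/UDP/RTP header to 2 bytes.
compressed = voip_bandwidth_bps(VOICE_PAYLOAD, 2, PACKETS_PER_SECOND)
print(uncompressed, compressed)  # 24000 8800
```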
153
Why Compression?
154
How does it work?
• Compressor and decompressor share consistent state that includes fixed fields, first order differences and second order difference fields (delta encoding)
• There is a Context ID (CID) that identifies flows and is used as a database index.
• Flows are hashed against IP source & destination address and UDP source & destination ports and assigned a CID at the compressor
• Decompressor uses the CID as a database index
• There is a sequence number to detect packet loss
• Bandwidth vs CPU
155
How it all works together!
156
Service Implementation Example
AT&T’s IP Enabled Frame Relay/ATM
– Marking
– Shaping
– Queuing
– Profiles
– Policing
– Futures
157
Service Architecture
CE
MPLSCore
CE
PER
Port
PER
Port
Frame Relay or ATM
Access PVC(CDR)
158
Marking
CER marks TOS or DSCP bits
– Real Time
• 101 110 or 101 000
• Dropped if rate exceeded
– Bursty High
• 011 010 or 011 000 in contract
• Remarked to 011 100 if out of contract
– Bursty Low
• 010 010 or 010 000 in contract
• 010 100 out of contract
– Best Effort
• 000 000
159
Shaping
Based on egress port speed
– Contracted rate not a factor
– Shaped to egress port speed
– Moves queue into the network router
160
IPFR/ATM CoS Implementation
AT&T uses Low Latency Queueing (LLQ) and Class Based Weighted Fair Queueing (CBWFQ) in the IP FR/ATM network to implement CoS
LLQ has strict priority, best for delay sensitive applications like voice
CBWFQ allows critical data classes to get more bandwidth allocation than other classes
[Figure: LLQ (FIFO) carrying voice (V) packets plus Class 1, Class 2 and Class 3 (FIFO) queues feeding the Transmit Queue (FIFO) on the Interface]
– LLQ drained completely whenever packets queued (up to Max BW during congestion)
– Other classes typical CB-WFQ
161
Queuing
4 Queues (‘Classes of Service’)
– Real Time: LLQ
– Bursty-Hi: CBWFQ with WRED
– Bursty-Lo: CBWFQ with WRED
– Best Effort: WFQ with WRED
– Mapping based on IP Precedence or DiffServ marking
162
CoS Profiles
AT&T’s IPFR/ATM Service provides for 4 separate classes
– Real Time
– Bursty High
– Bursty Low
– Best Effort
The 4 classes can be thought of as a priority hierarchy
The hierarchy is controlled by bandwidth allocation
– Simply, the more bandwidth that is allocated, the higher the priority
– Real time class is given a strict bandwidth allocation
– Other classes are assigned a percentage of CDR bandwidth
163
Real Time Class
Best for voice, maybe video (not both if port speed is <768K)
Strictly policed: excess over the allocated bandwidth is dropped
Must be purchased in increments of 20% of CDR
Determine real time allocation based on call requirements
164
Policing – Real Time class
RT
– Ingress – Police to contracted value
• Drop excess
• No burst
– Egress – Police only when port is congested
• Drop excess
• No burst
165
Policing – Bursty Data
Ingress
– Remark Excess (out of contract)
• Contract based on allocation profile
– Burst Size?
Egress
– No policing function
– Queuing treatment of ‘out of contract’ identical to ‘in contract’
– ‘Lower’ drop thresholds
166
Futures
MPLS EXP bits used for CoS in the backbone
– Will no longer remark customer packets
MPLS Traffic Engineering (TE)
167
Applied Performance Engineering
Optimizing response times
– Identify traffic classes & requirements
– Implement policies at network bottlenecks.
– Calculate ‘expected’ behavior.
168
Applied Performance Engineering
Remember the relevant parameters:
– Bandwidth
– Delay
– Jitter
– Loss
169
Performance Engineering Guidelines
Potential queuing (Bottlenecks) exists at any point where the arriving rate can exceed the departing rate.
– i.e. find the speed mismatch points
The dominant bottleneck is at the ‘slowest’ link in the end to end connection.
– TCP protocols tend to adapt to the available bandwidth. This means that the only place where there can be a sustained congestion is at the slowest link. Any ‘faster’ links will only experience ‘transient’ congestion.
For ‘Low Speed’ ports, priority is more important
For ‘High Speed’ resources, BW allocation is sufficient
170
Frame Relay Performance Engineering
FR, ATM
– Speed Mismatch -> Network Buffering
App1
App2
App3
Router Router
Q
171
Application Inventory
List all applications
Include ‘administrative’ apps
– Routing
– DNS
– OS
Establish Requirements
– Response Time, Bandwidth
172
What other applications are out there?
Applications that have low delay tolerance (sub second)
– Telnet
– Citrix
– DLSw
– TN3270
Applications that have multi-second response times
– ERPs (SAP, PeopleSoft, Siebel)
– Credit card authorizations
– Reservations
– HTTP applications
Background applications
– Email
– FTP
– Database synchronization
173
Response Time Requirements
Group applications according to (response time requirements) delay sensitivity.
[Figure: applications placed on a scale from Very Delay Sensitive to Not Delay Sensitive]
Applications shown: Voice, P.O.S, Telnet, Web, My-SAP, PeopleSoft 7, E-Mail, FTP
174
Grouping
If voice is present
– Put it in RT
Group adjacent application classes into available bins.
• Use caution if RT is to be used for data apps
• You don’t have to use all available classes
Real Time: VOICE
Bursty-Hi: P.O.S, Telnet, DNS
Bursty-Lo: ERP, Web
Best Effort: Mail, FTP
175
Capacity Planning
Establish BW / Application– To meet stated requirements
Determine required port speed
176
What is the “best” profile?
Only one type of data applications -> Bursty High
Mix of sub-second and background
– Sub-second -> Bursty High
– Background -> Best Effort
Mix of multi-second and background
– Multi-second -> Bursty High
– Background -> Best Effort
Mix of all types
– Sub-second -> Bursty High
– Multi-second -> Bursty Low
– Background -> Best Effort
This process will help to identify the “best” profile!
177
Profile Selection
COS Package           RT % of CDR   BH % of CDR   BL % of CDR   BE
Multimedia High       80            20            0             --
                      60            40            0             --
                      60            20            20            --
Multimedia Standard   40            60            0             --
                      40            40            20            --
                      20            80            0             --
                      20            60            20            --
                      20            40            40            --
                      10            80            10            --
                      10            60            30            --
                      10            40            50            --
Critical Data         0             100           0             --
                      0             80            20            --
                      0             40            60            --
Business Data         0             0             100           --
Economy               0             0             0             100
178
Insertion Delay Exercise
Given a 1500 byte packet, what is the insertion delay
– at 56K?
– at 128K?
If you wanted to run voice on the same 128K port as the 1500 byte packet, what size would you recommend fragmenting the packet to in order to minimize jitter? (hint: 10 msec should be the insertion delay of the fragment)
179
CoS Exercise 1
Need to support 3 simultaneous calls with a G.729 codec
Rest of data is web surfing
What port size do I need?
What profile should I pick?
180
CoS Exercise 2
No voice requirements
Lots of telnet traffic
Some http-based ERP applications
Rest is web surfing and email
What profile should I consider? Why?
181
Completed PHASE II
Congratulations
Completed All Phases