ece544: communication networks-ii, spring 2006
DESCRIPTION
ECE544: Communication Networks-II, Spring 2006. D. Raychaudhuri Lecture 5. Includes teaching materials from L. Peterson. Today’s Lecture. Routing metrics Scalable IP routing IPv6 Inter-domain routing (BGP). Routing Metrics. Metric choices. Static metrics (e.g., hop count) - PowerPoint PPT PresentationTRANSCRIPT
ECE544: Communication Networks-II, Spring 2006
D. Raychaudhuri
Lecture 5
Includes teaching materials from L. Peterson
Today’s Lecture
• Routing metrics• Scalable IP routing• IPv6• Inter-domain routing (BGP)
Routing Metrics
Metric choices
• Static metrics (e.g., hop count)– good only if links are homogeneous– not the case in the Internet
• Static metrics do not take into account:– link delay– link capacity– link load (hard to measure)
Original ARPANET metric
• Cost proportional to queue size– instantaneous queue length as delay
estimator
• Problems:– did not take into account link speed– poor indicator of expected delay due to
rapid fluctuations– delay may be longer even if queue size is
small due to contention for other resources
New metric
• Delay = (depart time - arrival time) + transmission time + link propagation delay– (depart time - arrival time) captures queuing– transmission time captures link capacity– link propagation delay captures the physical
length of the link
• Measurements averaged over 10 seconds– Update sent if difference > threshold, or
every 50 seconds
Performance of new metric
• Works well for light to moderate load– static values dominate
• Oscillates under heavy load– queuing dominates
• Reason: there is no correlation between original and new values of delay after re-routing!
Specific problems
• Range is too wide– 9.6 Kbps highly loaded link can appear 127
times costlier than 56 Kbps lightly loaded link– can make a 127-hop path look better than 1-
hop
• No limit in reported delay variation• All nodes calculate routes simultaneously
– triggered by link update
Consequences
• Low network utilization (50% in example)
• Congestion can spread elsewhere• Routes could oscillate between short
and long paths• Large swings lead to frequent route
updates– more messages– frequent SPT re-calculation
Revised link metric
• Better metric: packet delay = f(queueing, transmission, propagation).
• When lightly loaded, transmission and propagation are good predictors
• When heavily loaded queueing delay is dominant and so transmission and propagation are bad predictors
Routing metric v.s. link utilization
0
30
60
140
75
50% 100%25% 75%
225New metric(routing units)
Utilization
9.6 satellite
9.6 terrestrial
56 terrestrial
56 satellite
90
Observations
• Cost of highly loaded link never more than 3*cost when idle
• Most expensive link is 7 * least expensive link
• High-speed satellite link is more attractive than low-speed terrestrial link
Routing dynamics
0
1.0
4.0
0.5
0.75
0.25
2.0 3.01.0 1.5 2.5 3.50.5
Link reported cost
UtilizationBoundedoscillation
Metric map
Network response
Routing dynamics
0
1.0
4.0
0.5
0.75
0.25
2.0 3.01.0 1.5 2.5 3.50.5
Reported cost
Utilization Metric map
Network response
Easing ina new link
Scalable IP Routing
How to Make Routing Scale
• Flat versus Hierarchical Addresses• Inefficient use of Hierarchical Address Space
– class C with 2 hosts (2/255 = 0.78% efficient)– class B with 256 hosts (256/65535 = 0.39%
efficient)
• Still Too Many Networks– routing tables do not scale– route propagation protocols do not scale
Internet StructureRecent Past
NSFNET backboneStanford
BARRNETregional
BerkeleyPARC
NCAR
UA
UNM
Westnetregional
UNL KU
ISU
MidNetregional…
Internet StructureToday
Backbone service provider
Peeringpoint
Peeringpoint
Large corporation
Large corporation
Smallcorporation
“Consumer ” ISP
“Consumer” ISP
“ Consumer” ISP
Subnetting• Add another level to address/routing
hierarchy: subnet• Subnet masks define variable partition of host
part• Subnets visible only within site
Network number Host number
Class B address
Subnet mask (255.255.255.0)
Subnetted address
111111111111111111111111 00000000
Network number Host IDSubnet ID
Subnet Example
Forwarding table at router R1Subnet Number Subnet Mask Next
Hop128.96.34.0 255.255.255.128
interface 0128.96.34.128 255.255.255.128
interface 1128.96.33.0 255.255.255.0 R2
Subnet mask: 255.255.255.128Subnet number: 128.96.34.0
128.96.34.15 128.96.34.1
H1R1
128.96.34.130Subnet mask: 255.255.255.128Subnet number: 128.96.34.128
128.96.34.129128.96.34.139
R2H2
128.96.33.1128.96.33.14
Subnet mask: 255.255.255.0Subnet number: 128.96.33.0
H3
Supernetting (CIDR)• Assign block of contiguous network numbers
to nearby networks• Called CIDR: Classless Inter-Domain Routing• Protocol uses a (length, value) pair length = # of bits in network prefix
• Use CIDR bit mask to identify block size• All routers must understand CIDR addressing• Routers can aggregate routes with a single
advertisement -> use longest prefix match
Supernetting (CIDR)• Routers can aggregate routes with a single
advertisement -> use longest prefix match• Hex/length notation for CIDR address:
– C4.50.0.0/12 denotes a netmask with 12 leading 1 bits, i.e. FF.F0.0.0
• Routing table uses “longest prefix match”– 171.69 (16 bit prefix) = port #1– 171.69.10 (24 bit prefix) = port #2– then DA=171.69.10.5 matches port #1– and DA = 171.69.20.3 matches port#2
Chapter 4, Figure 26
Border gateway(advertises path to
11000000000001)
Regional network
Corporation X(11000000000001000001)
Corporation Y(11000000000001000000)
Route Aggregation with CIDR
IP Version 6• Features
– 128-bit addresses (classless)– multicast– real-time service– authentication and security – autoconfiguration – end-to-end fragmentation– protocol extensions
• Header– 40-byte “base” header– extension headers (fixed order, mostly fixed length)
• fragmentation• source routing• authentication and security• other options
IP ServiceIP Service IPv4 SolutionIPv4 Solution IPv6 SolutionIPv6 Solution
Mobile IP with Direct Routing
Mobile IP with Direct Routing
DHCPDHCP
Mobile IPMobile IP
IGMP/PIM/Multicast BGP
IGMP/PIM/Multicast BGP
IP MulticastIP Multicast MLD/PIM/Multicast BGP,Scope IdentifierMLD/PIM/Multicast
BGP,Scope Identifier
MobilityMobility
AutoconfigurationAutoconfigurationServerless,
Reconfiguration, DHCPServerless,
Reconfiguration, DHCP
IPv6 Technology Scope
32-bit, Network Address Translation
32-bit, Network Address Translation
128-bit, MultipleScopes
128-bit, MultipleScopes
Addressing RangeAddressing Range
Quality-of-ServiceQuality-of-Service Differentiated Service, Integrated Service
Differentiated Service, Integrated Service
Differentiated Service, Integrated Service
Differentiated Service, Integrated Service
SecuritySecurity IPSec Mandated, works End-to-EndIPSec Mandated,
works End-to-EndIPSecIPSec
IPv4 & IPv6 Header Comparison
Version IHLType of Service
Total Length
Identification FlagsFragment
Offset
Time to Live
Protocol Header Checksum
Source Address
Destination Address
Options Padding
Version Traffic Class Flow Label
Payload LengthNext
HeaderHop Limit
Source Address
Destination Address
IPv4 HeaderIPv4 Header IPv6 HeaderHeader
- field’s name kept from IPv4 to IPv6
- fields not kept in IPv6
- Name & position changed in IPv6
- New field in IPv6Leg
end
27
IPv6 Addressing • IPv6 Addressing rules are covered by multiples
RFC’s–Architecture defined by RFC 2373
• Address Types are :–Unicast : One to One (Global, Link local, Site local, Compatible)–Anycast : One to Nearest (Allocated from Unicast)–Multicast : One to Many–Reserved
• A single interface may be assigned multiple IPv6 addresses of any type (unicast, anycast, multicast)
–No Broadcast Address -> Use Multicast
IPv6 Address Representation
• 16-bit fields in case insensitive colon hexadecimal representation
• 2031:0000:130F:0000:0000:09C0:876A:130B
• Leading zeros in a field are optional:• 2031:0:130F:0:0:9C0:876A:130B
• Successive fields of 0 represented as ::, but only once in an address:
• 2031:0:130F::9C0:876A:130B• 2031::130F::9C0:876A:130B• 0:0:0:0:0:0:0:1 => ::1• 0:0:0:0:0:0:0:0 => ::
• IPv4-compatible address representation• 0:0:0:0:0:0:192.168.30.1 = ::192.168.30.1 = ::C0A8:1E01
29
IPv6 Addressing
• Prefix Format (PF) Allocation–PF = 0000 0000 : Reserved–PF = 001 : Aggregatable Global Unicast Address–PF = 1111 1110 10 : Link Local Use Addresses (FE80::/10)–PF = 1111 1110 11 : Site Local Use Addresses (FEC)::/10)–PF = 1111 1111 : Multicast Addresses (FF00::/8)–Other values are currently Unassigned (approx. 7/8th of total)
• All Prefix Formats have to support EUI-64 bits Interface ID setting
–But Multicast
Aggregatable Global Unicast Addresses
• Aggregatable Global Unicast addresses are:–Addresses for generic use of IPv6–Structured as a hierarchy to keep the aggregation
• See draft-ietf-ipngwg-addr-arch-v3-07
Interface IDGlobal Routing Prefix SLA
001
64 bits3 45 bits 16 bits
Provider Site Host
Address Allocation
• The allocation process is under reviewed by the Registries: –IANA allocates 2001::/16 to registries–Each registry gets a /23 prefix from IANA–Formely, all ISP were getting a /35–With the new proposal, Registry allocates a /36 (immediate allocation) or /32 (initial allocation) prefix to an IPv6 ISP–Policy is that an ISP allocates a /48 prefix to each end customer–ftp://ftp.cs.duke.edu/pub/narten/ietf/global-ipv6-assign-2002-04-25.txt
2001 0410
ISP prefix
Site prefix
LAN prefix
/32 /48 /64
Registry
/23
Bootstrap process - RFC2450
Interface ID
Hierarchical Addressing & Aggregation
–Larger address space enables:•Aggregation of prefixes announced in the global routing table.•Efficient and scalable routing.
ISP
2001:0410::/32
Customerno 2
IPv6 Internet
2001::/16
2001:0410:0002:/48
2001:0410:0001:/48
Customerno 1
Only announces the /32 prefix
• Link-local addresses for use during auto-configuration and when no routers are present:
• Site-local addresses for independence from Global Reachability, similar to IPv4 private address space
Link-Local & Site-Local Unicast Addresses
1111 1110 10 0 interface ID
1111 1110 11 0 interface IDSLA*
Multicast Addresses (RFC 2375)
• low-order flag indicates permanent / transient group; three other flags reserved
• scope field: 1 - node local–2 - link-local–5 - site-local–8 - organization-local–B - community-local–E - global–(all other values reserved)
4 112 bits8
group IDscopeflags11111111
4
35
more on IPv6 Addressing
80 bits 32 bits16 bits
IPv4 Address00000000……………………………0000
IPv6 Addresses with Embedded IPv4 AddressesIPv6 Addresses with Embedded IPv4 Addresses
80 bits 32 bits16 bits
IPv4 AddressFFFF0000……………………………0000
IPv4 mapped IPv6 addressIPv4 mapped IPv6 address
IPv6 Addressing Examples
LAN: 3ffe:b00:c18:1::/64
Ethernet0
MAC address: 0060.3e47.1530
router# show ipv6 interface Ethernet0Ethernet0 is up, line protocol is up IPv6 is enabled, link-local address is FE80::260:3EFF:FE47:1530Global unicast address(es): 2001:410:213:1:260:3EFF:FE47:1530, subnet is 2001:410:213:1::/64 Joined group address(es): FF02::1:FF47:1530 FF02::1 FF02::2 MTU is 1500 bytes
interface Ethernet0 ipv6 address 2001:410:213:1::/64 eui-64
BGP Overview
BGP-4: Border Gateway Protocol
• AS (Autonomous System) Types– stub AS: has a single connection to one other AS
• carries local traffic only– multihomed AS: has connections to more than one
AS• refuses to carry transit traffic
– transit AS: has connections to more than one AS• carries both transit and local traffic
• Each AS has:– one or more border routers– one BGP speaker that advertises:
• local networks• other reachable networks (transit AS only)• gives path information
Example
1 2
3
1.11.2
2.1 2.2
3.1 3.2
2.2.1
44.1 4.2
5
5.1 5.2
EGP
IGP
EGPEGP
IGP
IGP
IGPIGP
EGPEGP
Path Suboptimality
1 2
3
1.11.2
2.1 2.2
3.1 3.2
2.2.1
3 hop red pathvs2 hop green path
Choices• Link state or distance vector?
– no universal metric - policy decisions
• Problems with distance-vector:– Bellman-Ford algorithm may not converge
• Problems with link state:– metric used by routers not the same - loops– LS database too large - entire Internet– may expose policies to other AS’s
Solution: Path Vectors• Each routing update carries the
entire path• Loops are detected as follows:
– when AS gets route check if AS already in path– if yes, reject route– if no, add self and advertise route further
• Advantage:– metrics are local - AS chooses path, protocol
ensures no loops
Problems
• Routing table size– need an entry for all paths to all networks
• Required memory= O(N + M*A) * K)– N: number of networks– M: mean AS distance– A: number of AS’s– K: number of BGP peers
• Problem reduced with CIDR
Routing table size
networks(NLRI)
mean ASdistance
# of AS’s BGPpeers/net
memory
2,100 5 59 3 27,000
4,000 10 100 6 108,000
10,000 15 300 10 490,000
100,000 20 3,000 20 1,040,000
BGP Example• Speaker for AS2 advertises reachability to P and Q
– network 128.96, 192.4.153, 192.4.32, and 192.4.3, can be reached directly from AS2
• Speaker for backbone advertises– networks 128.96, 192.4.153, 192.4.32, and 192.4.3 can be
reached along the path (AS1, AS2).• Speaker can cancel previously advertised paths
Backbone network(AS 1)
Regional provider A(AS 2)
Regional provider B(AS 3)
Customer P(AS 4)
Customer Q(AS 5)
Customer R(AS 6)
Customer S(AS 7)
128.96192.4.153
192.4.32192.4.3
192.12.69
192.4.54192.4.23
Interior BGP peers• IGP cannot propagate all the
information required by BGP• External routers in an AS use
interior BGP (IBGP) connections to communicate
• External routers agree on routes and inform IGP
IBGP
Interconnecting BGP peers
• BGP uses TCP to connect peers• Advantages:
– BGP much simpler– no need for periodic refresh– incremental updates
• Disadvantages– congestion control on a routing
protocol?
Hop-by-hop model
• BGP advertises to neighbors only those routes that it uses– consistent with the hop-by-hop
Internet paradigm– e.g., AS1 cannot tell AS2 to route to
other AS’s in a manner different than what AS2 has chosen (need source routing for that)
Policy with BGP
• BGP provides capability for enforcing various policies
• Policies are not part of BGP: they are provided to BGP as configuration information
• BGP enforces policies by choosing paths from multiple alternatives and controlling advertisement to other AS’s
Examples of BGP policies
• A multihomed AS refuses to act as transit– limit path advertisement
• A multihomed AS can become transit for some AS’s– only advertise paths to some AS’s
• An AS can favor or disfavor certain AS’s for traffic transit from itself
BGP-4
• Latest version of BGP• BGP-4 supports CIDR
Routing information bases (RIB)
• Routes are stored in RIBs• Adj-RIBs-In: routing info that has been
learned from other routers (unprocessed routing info)
• Loc-RIB: local routing information selected from Adj-RIBs-In (routes selected locally)
• Adj-RIBs-Out: info to be advertised to peers (routes to be advertised)
BGP common header
Length Type
0 1 2 3
Marker (security and message delineation)
Types: OPEN, UPDATE, NOTIFICATION, KEEPALIVE
Optional parameters <type, length, value>
BGP OPEN message
Length Type: update
0 1 2 3
Marker (security and message delineation)
versionMy autonomous system Hold time
BGP identifierParameter length
My AS: id assigned to that ASHold timer: max interval between KEEPALIVE or UPDATE messagesBGP ID: address of one interface (same for all messages)
BGP UPDATE message
Length Type: open
0 1 2 3
Marker (security and message delineation)
Withdrawn....routes len Withdrawn routes (variable)
Path attribute len Path attributes (variable)
Network layer reachability information (NLRI) (variable)
UPDATE message reports information on a SINGLE path,but can report multiple withdrawn routes
...
NLRI
• Network Level Reachability Information– list of IP address prefixes encoded as
follows:
Length (1 byte) Prefix (variable)
Data
BGP NOTIFICATION message
Length Type: NOTIFICATION
0 1 2 3
Marker (security and message delineation)
Error code
Error sub-code
Used for error notification
BGP KEEPALIVE message
Length Type: KEEPALIVE
0 1 2 3
Marker (security and message delineation)
Sent periodically to peers to ensure connectivityIf hold_time is zero, messages are not sent
Policy routing
T Z XY
U V
Assume Y forbids T’s trafficT cannot reach X, but X can reach T!
Options
• Advertise all paths:– Path 1: through T can reach 197.8.0.0/23– Path 2: through T can reach 197.8.2.0/24– Path 3: through T can reach 197.8.3.0/24
• But this does not reduce routing tables! We would like to advertise:– Path 1: through T can reach 197.8.0.0/22
Sources
• RFC1771: main BGP RFC• RFC1772-3-4: application, experiences,
and analysis of BGP• RFC1965: AS confederations for BGP• Christian Huitema’s book “Routing in the
Internet”, chapters 8 and 9.• http://www.academ.com/nanog/
feb1997/BGPTutorial/sld022.htm (Cisco tutorial)
69
Today’s Homework• Peterson & Davie, Chap 4
4.31 (3rd Ed)4.334.404.45
Download and browse IPv6 and BGP RFC’s