Troubleshooting EIGRP Networks
Steven Moore
BRKRST-2331
• Scope
• Questions
• Additional Materials
• CiscoLive.com
• Additional Courses
• BRKRST-2336: EIGRP Deployment in Modern Networks
• BRKRST-3336: WAN Virtualization using Over-the-Top (OTP)
• BRKARC-2002: Techniques of a Network Detective
Troubleshooting EIGRP Networks - Housekeeping
• Troubleshooting Methodology
• Troubleshooting the Essentials
• Neighbor Formation
• Route Computation & Propagation
• Advanced Issues
• Resource Depletion
• High Speed Links
• IPv6 Specifics
Troubleshooting EIGRP Networks
Troubleshooting Methodology
Troubleshooting Methodology
• Knowledge of the System
• Expected Behaviors / Baseline
• Relationships
• Where to look for information!
• Logical Sequence of Events
• EIGRP:
• Peers Form
• Routes Exchanged
• Path Computation (DUAL)
• Routing Table Updated (if necessary)
• Peers Updated (if necessary)
[Diagram: routers A and B]
Troubleshooting Methodology
• Peer Issues
• Show ip eigrp neighbor
• Show log (with log-neighbor-changes enabled)
• Debug ip eigrp packet (with care)
• Path Computation and Propagation Issues
• Show ip eigrp event
• Show ip eigrp topology
• Info for TAC
• Show ip eigrp plugin detail
• Show eigrp tech detail
• Show run | s router eigrp
Where to Look for Information – Cheat Sheet
Show Commands
• The most useful command for checking neighbor status is show ip eigrp neighbors
• Some of the important information provided by this command:
• Hold time—time left that you’ll wait for an EIGRP packet from this peer before declaring him down
• Uptime—how long it’s been since the last time this peer was initialized
• SRTT (Smooth Round Trip Time)—average amount of time it takes to get an Ack for a reliable packet from this peer
• RTO (Retransmit Time Out)—how long to wait between retransmissions if Acks are not received from this peer
Neighbors
RtrA#show ip eigrp neighbors
IP-EIGRP neighbors for process 1
H Address Interface Hold Uptime SRTT RTO Q Seq
(sec) (ms) Cnt Num
2 10.1.1.1 Et0 12 6d16h 20 200 0 233
1 10.1.4.3 Et1 13 2w2d 87 522 0 452
0 10.1.4.2 Et1 10 2w2d 85 510 0 3
Seconds Remaining Before Declaring Neighbor Down
How Long Since the Last Time Neighbor Was Discovered
How Long It Takes for This Neighbor to Respond to Reliable Packets
How Long We’ll Wait Before Retransmitting if No Acknowledgement
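When many routers need checking, the fields annotated above can be pulled out of the command output with a short script. The following Python sketch is illustrative only: the function name `parse_eigrp_neighbors` and the dictionary keys are assumptions made for this example, not part of IOS, and the column layout is assumed to match the sample output shown above.

```python
import re

# Sample copied from the slide; real output may differ by IOS version.
SAMPLE = """\
IP-EIGRP neighbors for process 1
H   Address     Interface  Hold  Uptime  SRTT  RTO  Q    Seq
                           (sec)         (ms)       Cnt  Num
2   10.1.1.1    Et0        12    6d16h   20    200  0    233
1   10.1.4.3    Et1        13    2w2d    87    522  0    452
0   10.1.4.2    Et1        10    2w2d    85    510  0    3
"""

# A neighbor row starts with the handle (H) followed by an IPv4 address.
ROW = re.compile(
    r"^\s*(\d+)\s+(\d+\.\d+\.\d+\.\d+)\s+(\S+)\s+(\d+)\s+(\S+)"
    r"\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)"
)


def parse_eigrp_neighbors(text):
    """Return one dict per neighbor row; header lines are skipped."""
    neighbors = []
    for line in text.splitlines():
        match = ROW.match(line)
        if not match:
            continue
        handle, address, interface, hold, uptime, srtt, rto, q_cnt, seq = match.groups()
        neighbors.append({
            "handle": int(handle),
            "address": address,
            "interface": interface,
            "hold_s": int(hold),      # seconds left before the neighbor is declared down
            "uptime": uptime,         # time since the neighbor was last discovered
            "srtt_ms": int(srtt),     # smoothed round trip time
            "rto_ms": int(rto),       # retransmit timeout
            "q_cnt": int(q_cnt),
            "seq": int(seq),
        })
    return neighbors


for neighbor in parse_eigrp_neighbors(SAMPLE):
    print(neighbor["address"], neighbor["hold_s"], neighbor["uptime"], neighbor["srtt_ms"])
```

Keeping the raw uptime string avoids guessing at every uptime format IOS can print.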
Neighbors
Show IP EIGRP Neighbors
rtr302-ce1#show ip eigrp neighbor detail
IP-EIGRP neighbors for process 1
H Address Interface Hold Uptime SRTT RTO Q Seq Type
(sec) (ms) Cnt Num
1 17.17.17.2 Et1/0 14 00:00:03 394 2364 0 124
Version 12.0/1.2, Retrans: 0, Retries: 0
Stub Peer Advertising ( CONNECTED SUMMARY ) Routes
0 50.10.10.1 Et0/0 13 04:04:39 55 330 0 13
Version 12.0/1.2, Retrans: 2, Retries: 0
Neighbors
• The big brother of the show ip eigrp neighbor command; the additional information available via the detailed version of this command includes
• Number of retransmissions and retries for each neighbor
• Version of Cisco IOS and EIGRP
• Stub information (if configured)
Show IP EIGRP Neighbor Detail
Show Commands
• Show eigrp plugin detail gives information about all of the EIGRP capabilities and what version each is running; EIGRP has gone to a plug-in model for packaging features (loosely similar to Windows DLLs), and a particular platform/branch may have different plugins or different versions of those plugins
• EIGRP is also what’s known as a True Component, meaning it is built in a portable way, so the first line is critical for knowing which version of the True Component you’re running
Show EIGRP Plugin Detail
Show Commands
• This command may or may not exist in the version you’re running, since it was recently added
• While this command may not be all that useful to you, the TAC will probably ask you for it to identify exactly the version and capabilities of EIGRP on your router
Show EIGRP Plugin Detail
r3#sh eigrp plugin detail
EIGRP feature plugins:::
eigrp-release : 5.01.00 : Portable EIGRP Release
: 2.00.03 : Source Component Release(Portable EIGRP Release(rel5_1))
igrp2 : 3.00.00 : Reliable Transport/Dual Database
external-client : 1.02.00 : Service Distribution Client Support
bfd : 1.01.00 : BFD Platform Support
ipv4-af : 2.01.01 : Routing Protocol Support
ipv4-sf : 1.01.00 : Service Distribution Support
ipv6-af : 2.01.01 : Routing Protocol Support
ipv6-sf : 1.01.00 : Service Distribution Support
snmp-agent : 1.01.01 : SNMP/SNMPv2 Agent Support
r3#
Show Commands
• This gives you the plugins again, with a lot of other internal info; TAC will probably be asking for this one, too
Show EIGRP Tech
r3#show eigrp tech
EIGRP feature plugins:::
eigrp-release : 5.01.00 : Portable EIGRP Release
: 2.00.03 : Source Component Release(Portable EIGRP
Release(rel5_1))
igrp2 : 3.00.00 : Reliable Transport/Dual Database
external-client : 1.02.00 : Service Distribution Client Support
bfd : 1.01.00 : BFD Platform Support
ipv4-af : 2.01.01 : Routing Protocol Support
ipv4-sf : 1.01.00 : Service Distribution Support
ipv6-af : 2.01.01 : Routing Protocol Support
ipv6-sf : 1.01.00 : Service Distribution Support
snmp-agent : 1.01.01 : SNMP/SNMPv2 Agent Support
EIGRP Internal Process States
procinfoQ: 2
deadQ:
ddbQ: 2
EIGRP-IPv4 Protocol for AS(1)
{vrid:1 afi:1 as:1 tableid:0 vrfid:0 tid:0 name: }
PIDs: Hello: 26 PDM: 25
Router-ID: 172.18.176.153
Threads: procinfo: 0x18DD1A0 ddb: 0x18DD380
workQ:
iidbQ:
temp_iidbQ:
passive_iidbQ:
peerQ:
static_peerQ:
suspendQ:
networkQ: 1
summaryQ:
Socket Queue: 0/2000/0/0 (current/max/highest/drops)
Input Queue: 0/2000/0/0 (current/max/highest/drops)
GRS/NSF: enabled hold-timer: 240
Active Timer: 3 min
Distance: internal 90 external 170
Max Path: 4
Max Hopcount: 100
Variance: 1
Show Commands
Show EIGRP Tech
EIGRP-IPv6 Protocol for AS(1)
{vrid:0 afi:2 as:1 tableid:0 vrfid:0 tid:0 name: }
PIDs: Hello: (no process) PDM: (no process)
Router-ID: 172.18.176.153
Threads: procinfo: 0x185ECA0 ddb: 0x7B45B940
workQ:
iidbQ:
temp_iidbQ:
passive_iidbQ:
peerQ:
static_peerQ:
suspendQ:
summaryQ:
Socket Queue: %EIGRP(ERROR): invalid socket
Input Queue: 0/2000/0/0 (current/max/highest/drops)
Active Timer: 3 min
Distance: internal 90 external 170
Max Path: 16
Max Hopcount: 100
Variance: 1
r3#
Show Commands
Show EIGRP Tech
Show Commands
• Show eigrp tech has tons of internal information that may seem meaningless; there may be nuggets you can glean from it (summaryq entries, etc.), but mainly I wanted to expose you to it so that it won’t be completely foreign to you when TAC asks you for it
• Again, this is a new command and may not exist in the version you’re running—if it’s not available now, it will be available in a later upgrade
• Note that there is also a “show eigrp tech detail” with even more undecipherable info
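One nugget that is easy to glean from this output is the drops counter on the queue lines. The Python sketch below is a hypothetical helper (the name `queue_stats` is an assumption for illustration); it relies only on the "current/max/highest/drops" layout visible in the sample output and skips lines that do not follow it, such as the "invalid socket" error line.

```python
import re

# Lines copied from the show eigrp tech sample above.
SAMPLE = """\
Socket Queue: 0/2000/0/0 (current/max/highest/drops)
Input Queue: 0/2000/0/0 (current/max/highest/drops)
Socket Queue: %EIGRP(ERROR): invalid socket
"""

QUEUE = re.compile(
    r"^\s*(?P<name>[\w ]+ Queue):\s+"
    r"(?P<cur>\d+)/(?P<max>\d+)/(?P<high>\d+)/(?P<drops>\d+)"
)


def queue_stats(text):
    """Return (name, current, max, highest, drops) for each parsable queue line."""
    stats = []
    for line in text.splitlines():
        match = QUEUE.match(line)
        if match:
            stats.append((
                match.group("name").strip(),
                int(match.group("cur")),
                int(match.group("max")),
                int(match.group("high")),
                int(match.group("drops")),
            ))
    return stats


for name, current, maximum, highest, drops in queue_stats(SAMPLE):
    status = "drops seen" if drops else "no drops"
    print(f"{name}: highest {highest} of {maximum}, {status}")
```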
Show EIGRP Tech
Show Commands
• To make commands consistent across address/service families, the form of the show commands is changing
• EIGRP will become the second argument followed by the address-family
Show IP EIGRP commands changing in the future
Router#show eigrp ?
address-family EIGRP address-family show commands
plugins EIGRP feature plugin installed
protocols Show EIGRP protocol info
service-family EIGRP service-family show commands
tech-support Show EIGRP internal tech support
information
Router#show eigrp address-family ?
ipv4 EIGRP IPv4 Address-Family
ipv6 EIGRP IPv6 Address-Family
Router#show eigrp address-family ipv4 ?
<1-65535> Autonomous System
accounting Prefix Accounting
events Events logged
interfaces interfaces
multicast Select a multicast instance
neighbors Neighbors
timers Timers
topology Select Topology
traffic Traffic Statistics
vrf Select a VPN Routing/Forwarding
instance
Show Commands
• We’ve had a bit of an issue over the years with inconsistencies in the command structure in EIGRP. Sometimes, the IPv4 commands and the IPv6 commands were slightly different
• With the addition of service-family command (SAF, not covered in this presentation), we decided to make things consistent. Unfortunately, this also meant that things needed to be different
• Both the old and new forms of the show commands are currently supported, but in the future as new features are rolled out, some commands may only be available in the new command format
• You might as well start getting used to them now, if you’re running a version that supports the new format!
Show IP EIGRP command changes
Show Commands
RtrB#show ip eigrp traffic
IP-EIGRP Traffic Statistics for AS 1
Hellos sent/received: 574/558
Updates sent/received: 5/7
Queries sent/received: 2/2
Replies sent/received: 2/2
Acks sent/received: 11/7
Input queue high water mark 2, 0 drops
SIA-Queries sent/received: 1/1
SIA-Replies sent/received: 1/1
Hello Process ID: 64
PDM Process ID: 63
Show IP EIGRP Traffic
Show Commands
• Show ip eigrp traffic can be very useful to see what kind of activity has been occurring on your network; some of the most interesting information includes:
• Input queue high water mark: this shows how many packets have been queued inside the router waiting to be processed. When packets are received from the IP layer, EIGRP accepts the packets and queues them up for processing; if the router is so busy that the queue isn’t getting serviced, the queue could build up. Unless there are drops, there is nothing to worry about, but it can give you an indication of how hard EIGRP is working
• SIA-queries sent/received: this is useful to determine how often the router has stayed active for at least one and one-half minutes (as mentioned in the earlier section on stuck-in-active routes). This number should be relatively low; if it’s not, replies to queries are taking a while to arrive, and it might be worth exploring why
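The two checks described above (input queue drops and the SIA-query count) can be scripted against saved command output. This is a minimal Python sketch, assuming the output format matches the sample on the earlier slide; `parse_traffic` and the dictionary layout are hypothetical names chosen for this example.

```python
import re

# Sample copied from the show ip eigrp traffic slide.
SAMPLE = """\
IP-EIGRP Traffic Statistics for AS 1
Hellos sent/received: 574/558
Updates sent/received: 5/7
Queries sent/received: 2/2
Replies sent/received: 2/2
Acks sent/received: 11/7
Input queue high water mark 2, 0 drops
SIA-Queries sent/received: 1/1
SIA-Replies sent/received: 1/1
"""

COUNTER = re.compile(r"^\s*([\w-]+) sent/received:\s*(\d+)/(\d+)")
QUEUE = re.compile(r"Input queue high water mark (\d+), (\d+) drops")


def parse_traffic(text):
    """Map each packet type to (sent, received) and record the input queue stats."""
    stats = {}
    for line in text.splitlines():
        counter = COUNTER.match(line)
        if counter:
            stats[counter.group(1)] = (int(counter.group(2)), int(counter.group(3)))
            continue
        queue = QUEUE.search(line)
        if queue:
            stats["input_queue"] = {
                "high_water": int(queue.group(1)),
                "drops": int(queue.group(2)),
            }
    return stats


traffic = parse_traffic(SAMPLE)
print("input queue drops:", traffic["input_queue"]["drops"])
print("SIA-queries sent/received:", traffic["SIA-Queries"])
```

Comparing two snapshots taken some time apart is more informative than a single reading, since the counters are cumulative.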
Show IP EIGRP Traffic
Show Commands
RtrA#show ip protocol
*** IP Routing is NSF aware ***
Routing Protocol is "eigrp 200"
Outgoing update filter list for all interfaces is not set
Incoming update filter list for all interfaces is not set
Default networks flagged in outgoing updates
Default networks accepted from incoming updates
EIGRP metric weight K1=1, K2=0, K3=1, K4=0, K5=0
EIGRP maximum hopcount 100
EIGRP maximum metric variance 1
Redistributing: eigrp 200
EIGRP NSF-aware route hold timer is 240s
EIGRP NSF enabled
NSF signal timer is 20s
NSF converge timer is 120s
Show IP Protocol
Show Commands
• There are many fields in the show ip protocol which are useful in the troubleshooting process
• Some of the most interesting include:
• Outgoing and incoming filter lists
• Variance setting
• Redistribution configured (note that if a router is not redistributing other protocols, it still shows that it is redistributing itself)
• NSF configuration and timers
Show IP Protocol
Show Commands
Automatic network summarization is not in effect
Address Summarization:
40.0.0.0/8 for Vlan301
Summarizing with metric 1536
Maximum path: 4
Routing for Networks:
40.80.0.0/16
192.168.107.0
Routing Information Sources:
Gateway Distance Last Update
(this router) 90 00:01:10
40.80.24.33 90 01:13:47
40.80.12.19 90 01:13:47
40.80.23.31 90 01:13:48
40.80.8.13 90 01:13:48
Distance: internal 90 external 170
Show IP Protocol
Show Commands
• More show ip protocol information:
• Summarization defined (both auto and manual) along with the metric associated with each summary
• Max-path setting
• Network statements
• Distance settings
• Note that the routing information sources section is not useful for EIGRP, since EIGRP doesn’t send periodic updates; we didn’t rip it out of the display, but it tells you little about EIGRP
Show IP Protocol
Event Log
• The two primary weapons at your disposal are debugs and the event log; realize that the output of both debugs and the event log are cryptic and probably not tremendously useful to you (so why am I telling you about them?)
• There are times when the output of debugs or the event log is enough to lead you in a direction, even if you don’t really understand all that it is telling you; don’t expect to be an expert at EIGRP through the use of debugs or the event log, but they can help
• Don’t forget, debugs can kill your router—don’t do a debug if you don’t know how heavy the overhead is; I may tell you below about some debugs, but don’t consider this approval from Cisco to run them on your production network
• The event log is non-disruptive, so it is much safer; just display it and see what’s been happening lately
EIGRP Troubleshooting Tools
• On a busy, unstable network debugs can be hazardous to your network’s health
• Event log is non-disruptive—it’s already running
• (unless you turned it off)
• Defaults to 500 lines (configurable)
• EIGRP event-log-size <number of lines>
• Maximum event-log-size is half of available memory
• Most recent events at top of log by default
• Read from the bottom to top
• Later versions support displaying the event log in reverse!
Event Log
• A separate event log is kept for each AS
• 500 lines are not very much; on a network where there is significant instability or activity, 500 lines may only be a second or two (or less) — you can change the size of the event log (if needed) by the command
• eigrp event-log-size <number of lines>
• Recent IOS limits to half of available memory
• If number of lines set to 0, it disables the log
• You can clear the event log by typing
• clear ip eigrp event
• Most recent events are at the top of the log by default, so time flows from bottom to top
Event Log
• New parameters available for showing the event log
Event Log
Rtr2#show ip eigrp event ?
<1-4294967295> Starting event number
errmsg Show Events being logged
reverse Show most recent event last
sia Show Events being logged
type Show Events being logged
| Output modifiers
<cr>
• Three different event types can be logged
• EIGRP log-event-type [dual][xmit][transport]
• Default is dual—normally most useful
• Dual is FSM (decisions in finite state machine)
• xmit and transport are different aspects of actually sending packets to peers
• Any combination of the three can be on at the same time
• Work is in progress to add additional debug information to event log
Event Log
RtrA#show ip eigrp events
Event information for AS 1:
1 01:52:51.223 NDB delete: 30.1.1.0/24 1
2 01:52:51.223 RDB delete: 30.1.1.0/24 10.1.3.2
3 01:52:51.191 Metric set: 30.1.1.0/24 4294967295
4 01:52:51.191 Poison squashed: 30.1.1.0/24 lost if
5 01:52:51.191 Poison squashed: 30.1.1.0/24 metric chg
6 01:52:51.191 Send reply: 30.1.1.0/24 10.1.3.2
7 01:52:51.187 Not active net/1=SH: 30.1.1.0/24 1
8 01:52:51.187 FC not sat Dmin/met: 4294967295 46738176
9 01:52:51.187 Find FS: 30.1.1.0/24 46738176
10 01:52:51.187 Rcv query met/succ met: 4294967295 4294967295
11 01:52:51.187 Rcv query dest/nh: 30.1.1.0/24 10.1.3.2
12 01:52:36.771 Change queue emptied, entries: 1
13 01:52:36.771 Metric set: 30.1.1.0/24 46738176
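Because the newest event is listed first, it can help to flip a captured log into chronological order before reading it (on versions without the reverse keyword). The sketch below is a hypothetical offline helper in Python, not an IOS feature; it assumes each event line starts with the event number and a timestamp, as in the excerpt above.

```python
# Excerpt (events 10 to 13) copied from the event log sample above.
SAMPLE = """\
Event information for AS 1:
10 01:52:51.187 Rcv query met/succ met: 4294967295 4294967295
11 01:52:51.187 Rcv query dest/nh: 30.1.1.0/24 10.1.3.2
12 01:52:36.771 Change queue emptied, entries: 1
13 01:52:36.771 Metric set: 30.1.1.0/24 46738176
"""


def chronological(text):
    """Return (event_number, timestamp, description) tuples, oldest first."""
    events = []
    for line in text.splitlines():
        parts = line.split(None, 2)
        # Event lines start with a number; the header line does not.
        if len(parts) == 3 and parts[0].isdigit():
            events.append((int(parts[0]), parts[1], parts[2]))
    # Event 1 is the most recent, so the highest number is the oldest.
    return sorted(events, key=lambda event: event[0], reverse=True)


for number, timestamp, description in chronological(SAMPLE):
    print(timestamp, description)
```

Sorting on the event number rather than the timestamp keeps the order stable when several events share the same millisecond.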
Event Log
Debugs
• Remember—debugs can be dangerous!
• Use only in the lab or if advised by the TAC
• To make a little safer
• Logging buffered <size>
• No logging console
Debugs
• By enabling logging buffered and shutting off the console log, you improve your odds of not killing your router when you do a debug; still no guarantees
• We used to change the scheduler interval when we performed debugs in the TAC, as well; this command is version dependent, so I’m not going to give you syntax here
Debugs
• Use modifiers to limit the scope of route events or packet debugs
• Limit debugs to a particular neighbor
• debug ip eigrp neighbor AS address
• Limit debugs to a particular prefix
• debug ip eigrp AS network mask
Debugs
• Both packet debugs and route event debugs create so much output that you would have a hard time sorting through it for the pieces you care about; the two modifier commands above allow you to limit what the debug output will include
• “Debug ip eigrp neighbor AS address” will limit the output to those entries pertaining to a particular neighbor
• “Debug ip eigrp AS network mask” will limit the output to only those entries that pertain to the prefix identified
• Unfortunately, you have to enable the debug (packet or route events) prior to putting the modifier on, so you could kill your router before you are able to get the limits placed on the output; sorry, but that’s the way it is
Debugs
Debugs
RtrA#debug ip eigrp
IP-EIGRP Route Events debugging is on
RtrA#debug ip eigrp neighbor 1 10.1.6.2
IP Neighbor target enabled on AS 1 for 10.1.6.2
IP-EIGRP Neighbor Target Events debugging is on
RtrA#clear ip eigrp neighbor
IP-EIGRP: 10.1.8.0/24 - do advertise out Serial1/2
IP-EIGRP: Int 10.1.8.0/24 metric 28160 - 25600 2560
IP-EIGRP: 10.1.7.0/24 - do advertise out Serial1/2
IP-EIGRP: 10.1.1.0/24 - do advertise out Serial1/2
IP-EIGRP: Int 10.1.1.0/24 metric 28160 - 25600 2560
IP-EIGRP: Processing incoming UPDATE packet
IP-EIGRP: 10.1.6.0/24 - do advertise out Serial1/1
Debug IP EIGRP (Route Events) – neighbor filtering
Debugs
• In this debug, we are looking at route events recorded when neighbors are cleared (in reality, the debugs produced were far, far more—this is only a snapshot of the debug); a modifier was included to limit the output to only the events related to a single EIGRP neighbor, 10.1.6.2
• Notice that the debug output doesn’t identify which neighbors are involved in any of the events; without knowing the address used in the modifier command, you really can’t tell which neighbors you are interacting with in the debug output
• This output is often useful when trying to determine what EIGRP thinks is happening when there are route changes in the network
Debug IP EIGRP (Route Events) – neighbor filtering
Debugs
RtrA#debug ip eigrp
IP-EIGRP Route Events debugging is on
RtrA#debug ip eigrp 1 10.1.7.0 255.255.255.0
IP Target enabled on AS 1 for 10.1.7.0/24
IP-EIGRP AS Target Events debugging is on
RtrA#clear ip eigrp neighbor
IP-EIGRP: 10.1.7.0/24 - do advertise out Serial1/2
IP-EIGRP: 10.1.7.0/24 - do advertise out Serial1/1
IP-EIGRP: Int 10.1.7.0/24 metric 20512000 20000000 512000
IP-EIGRP: 10.1.7.0/24 - do advertise out Serial1/2
IP-EIGRP: Processing incoming UPDATE packet
IP-EIGRP: 10.1.7.0/24 - do advertise out Serial1/1
Debug IP EIGRP (Route Events) – prefix filtering
Debugs
• Again, this is the output of debugging EIGRP routing events, this time modified to only display output related to a single prefix in the network; this modifier can be very useful when trying to troubleshoot a problem with a single prefix (or representative route)
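The numbers on the "metric" line in this debug (20512000 20000000 512000) appear to be the composite metric followed by its bandwidth and delay parts. The following Python sketch reproduces them under stated assumptions: the classic 32-bit metric with the default K values shown in show ip protocol (K1=1, K3=1, others 0), and interface values of 128 kbps bandwidth and 20000 microseconds delay, which are inferred from the numbers rather than given in the slides. The function name `classic_metric` is hypothetical.

```python
def classic_metric(bandwidth_kbps, delay_usec):
    """Classic EIGRP composite metric with default K values (K1=1, K3=1).

    bandwidth_kbps: lowest bandwidth along the path, in kbit/s
    delay_usec: cumulative delay along the path, in microseconds
    Returns (composite, bandwidth_part, delay_part).
    """
    bandwidth_part = 256 * (10**7 // bandwidth_kbps)  # scaled inverse bandwidth
    delay_part = 256 * (delay_usec // 10)             # delay in tens of microseconds
    return bandwidth_part + delay_part, bandwidth_part, delay_part


# Assumed inputs (inferred, not stated in the deck): 128 kbps, 20000 usec.
print(classic_metric(128, 20000))
# Assumed inputs for the 28160 route: 100 Mbps, 100 usec.
print(classic_metric(100_000, 100))
```

If the parts in a debug line do not add up this way, non-default K values or wide metrics may be in use.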
Debug IP EIGRP (Route Events) – prefix filtering
Debugs
RtrA#debug eigrp packet ?
ack EIGRP ack packets
hello EIGRP hello packets
ipxsap EIGRP ipxsap packets
probe EIGRP probe packets
query EIGRP query packets
reply EIGRP reply packets
request EIGRP request packets
stub EIGRP stub packets
retry EIGRP retransmissions
terse Display all EIGRP packets except Hellos
update EIGRP update packets
verbose Display all EIGRP packet
Debug EIGRP Packet
Debugs
• This debug is used in a variety of problems and circumstances; debug eigrp packet hello is used to troubleshoot neighbor establishment/maintenance problems
• Debug eigrp packet query, reply, update, etc., are also often used to try to determine the process occurring when a problem occurs—be careful; I’ve crashed/hung more than one router by doing a debug on a router that was too busy
• Probably the most commonly used debug eigrp packet option is terse, which includes all of the above except hellos; an example follows on the next page
Debug EIGRP Packet
Debugs
RtrA#debug eigrp packet terse
EIGRP Packets debugging is on
(UPDATE, REQUEST, QUERY, REPLY, IPXSAP, PROBE, ACK, STUB)
EIGRP: Sending UPDATE on Serial1/0 nbr 10.1.1.2
AS 1, Flags 0x0, Seq 2831/1329 idbQ 0/0 iidbQ un/rely 0/0 peerQ un/rely 0/1 serno 19707-19707
EIGRP: Sending UPDATE on Serial1/1 nbr 10.1.2.2
AS 1, Flags 0x0, Seq 2832/1708 idbQ 0/0 iidbQ un/rely 0/0 peerQ un/rely 0/1 serno 19707-19707
EIGRP: Sending UPDATE on Serial1/2 nbr 10.1.3.2
AS 1, Flags 0x0, Seq 2833/1680 idbQ 0/0 iidbQ un/rely 0/0 peerQ un/rely 0/1 serno 19707-19707
EIGRP: Received ACK on Serial1/0 nbr 10.1.1.2
AS 1, Flags 0x0, Seq 0/2831 idbQ 0/0 iidbQ un/rely 0/0 peerQ un/rely 0/1
EIGRP: Serial1/0 multicast flow blocking cleared
EIGRP: Received ACK on Serial1/1 nbr 10.1.2.2
AS 1, Flags 0x0, Seq 0/2832 idbQ 0/0 iidbQ un/rely 0/0 peerQ un/rely 0/1
EIGRP: Serial1/1 multicast flow blocking cleared
Debug EIGRP Packet Terse
Debugs
Debug IP EIGRP Notifications
RtrA#debug ip eigrp notifications
IP-EIGRP Event notification debugging is on
RtrA#clear ip route *
RtrA#
IP-EIGRP: Callback: reload_iptable
IP-EIGRP: iptable_redistribute into eigrp AS 1
IP-EIGRP: Callback: redist frm static AS 0 100.100.100.0/24
into: eigrp AS 1 event: 1
IP-EIGRP: Callback: redist frm static AS 0 200.200.200.0/24
into: eigrp AS 1 event: 1
Debugs
• This debug is used to debug problems between EIGRP and the routing/interface infrastructure; for example, if you’re having problems with redistribution, the actual place where EIGRP gets the redistributed routes is the RIB/routing table—these callbacks are the mechanism to signal changes between the routing infrastructure and EIGRP when routes come and go
• Callbacks are also used from the interface infrastructure as interfaces are shutdown/no shut, deleted, addresses added, etc.
• Any problem with EIGRP’s interaction with other parts of the system will probably come through this mechanism, thus this debug is your best bet
Debug IP EIGRP Notifications
Debugs
RtrA#debug eigrp fsm
EIGRP FSM Events/Actions debugging is on
RtrA#clear ip route *
RtrA#
DUAL: Find FS for dest 10.1.8.0/24. FD is 28160, RD is 28160
DUAL: 0.0.0.0 metric 28160/0 found Dmin is 28160
DUAL: Find FS for dest 10.1.3.0/24. FD is 21024000, RD is 21024000
DUAL: 10.1.6.2 metric 21024000/2169856 found Dmin is 21024000
DUAL: RT installed 10.1.3.0/24 via 10.1.6.2
DUAL: Find FS for dest 10.1.2.0/24. FD is 21536000, RD is 21536000
Debug EIGRP FSM (Finite State Machine)
Debugs
• Debug eigrp fsm is very, very similar to dual event log; since the dual event log is non-disruptive and this debug could certainly cause problems, I rarely use this debug in real life—sometimes it’s useful in conjunction with another debug to see how the different parts of EIGRP interact (debug ip packet, debug eigrp transport, and debug eigrp fsm to see how all the pieces fit together)
• FSM stands for Finite State Machine, which describes the behavior of DUAL, the path selection part of EIGRP
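The "Find FS" lines in the debug show DUAL looking for a feasible successor. As a rough illustration of that check, the Python sketch below applies the feasibility condition: a neighbor qualifies when the distance it reports is strictly lower than the current feasible distance (FD). The values for 10.1.6.2 are taken from the debug line "metric 21024000/2169856", read here as computed metric/reported distance; the second neighbor and the function name `feasible_successors` are hypothetical.

```python
def feasible_successors(feasible_distance, neighbors):
    """Return next hops that satisfy the feasibility condition, best metric first.

    neighbors maps next hop -> (computed_metric, reported_distance).
    """
    candidates = sorted(neighbors.items(), key=lambda item: item[1][0])
    return [
        next_hop
        for next_hop, (metric, reported) in candidates
        if reported < feasible_distance
    ]


neighbors = {
    "10.1.6.2": (21024000, 2169856),    # from the debug output above
    "10.1.5.2": (46738176, 46226176),   # hypothetical neighbor that fails the check
}
print(feasible_successors(21024000, neighbors))
```

This is only the selection test, not the full state machine; query and reply handling are not modeled.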
Debug EIGRP FSM
• Neighbors
• Route Computation and Propagation
• Understanding the Topology Table
• Stuck In Active Routes
• Summarization
• Redistribution
• Additional Propagation Issues
Troubleshooting the Essentials
• The most useful command for checking neighbor status is show ip eigrp neighbors
• Some of the important information provided by this command:
• Hold time—time left that you’ll wait for an EIGRP packet from this peer before declaring him down
• Uptime—how long it’s been since the last time this peer was initialized
• SRTT (Smooth Round Trip Time)—average amount of time it takes to get an Ack for a reliable packet from this peer
• RTO (Retransmit Time Out)—how long to wait between retransmissions if Acks are not received from this peer
Neighbors
RtrA#show ip eigrp neighbors
IP-EIGRP neighbors for process 1
H Address Interface Hold Uptime SRTT RTO Q Seq
(sec) (ms) Cnt Num
2 10.1.1.1 Et0 12 6d16h 20 200 0 233
1 10.1.4.3 Et1 13 2w2d 87 522 0 452
0 10.1.4.2 Et1 10 2w2d 85 510 0 3
Seconds Remaining Before Declaring Neighbor Down
How Long Since the Last Time Neighbor Was Discovered
How Long It Takes for This Neighbor to Respond to Reliable Packets
How Long We’ll Wait Before Retransmitting if No Acknowledgement
Neighbors
Show IP EIGRP Neighbors
Outstanding Packets
Last Reliable Packet Sent
rtr302-ce1#show ip eigrp neighbor detail
IP-EIGRP neighbors for process 1
H Address Interface Hold Uptime SRTT RTO Q Seq Type
(sec) (ms) Cnt Num
1 17.17.17.2 Et1/0 14 00:00:03 394 2364 0 124
Version 12.0/1.2, Retrans: 0, Retries: 0
Stub Peer Advertising ( CONNECTED SUMMARY ) Routes
0 50.10.10.1 Et0/0 13 04:04:39 55 330 0 13
Version 12.0/1.2, Retrans: 2, Retries: 0
Neighbors
• The big brother of the show ip eigrp neighbor command; the additional information available via the detailed version of this command includes
• Number of retransmissions and retries for each neighbor
• Version of Cisco IOS and EIGRP
• Stub information (if configured)
Show IP EIGRP Neighbor Detail
RtrA#config terminal
Enter configuration commands, one per line. End with CNTL/Z.
RtrA(config)#router eigrp 1
RtrA(config-router)#eigrp log-neighbor-changes
RtrA(config-router)#logging buffered 10000
RtrA(config)#service timestamps log datetime msec
Neighbors
• EIGRP Log-Neighbor-Changes has been on by default since 12.2(12)
• Turn it on and leave it on
• Best to send to buffer log
• EIGRP log-neighbor-changes is the best tool you have to understand why neighbor relationships are not stable. It should be enabled on every router in your network—CSCdx67706 (12.2(12)) made it the default behavior; as explained on the previous slide, the uptime value from show ip eigrp neighbors will tell you the last time a neighbor bounced, but not how often or why—with log-neighbor-changes on and logging buffered, you keep not only a history of when neighbors have been reset, but the reason why… absolutely invaluable
• Logging buffered is also recommended, because logging to a syslog server is not bulletproof; for example, if the neighbor bouncing is between the router losing neighbors and the syslog server, the messages could be lost—it’s best to keep these types of messages locally on the router, in addition to the syslog server
• It may also be useful to increase the size of the buffer log in order to capture a greater duration of error messages—you would hate to lose the EIGRP neighbor messages because of flapping links filling the buffer log; if you aren’t starved for memory, change the buffer log size using the command logging buffered 10000 in configuration mode
• The service timestamps command above puts more granular timestamps in the log, so it’s easier to tell when the neighbor stability problems occurred
Neighbors
Neighbor 10.1.1.1 (Ethernet0) is down: peer restarted
Neighbor 10.1.1.1 (Ethernet0) is up: new adjacency
Neighbor 10.1.1.1 (Ethernet0) is down: holding time expired
Neighbor 10.1.1.1 (Ethernet0) is down: retry limit exceeded
Others, but not often
Neighbors
• So this tells us why the neighbor is bouncing—but what do they mean?
• Hint: peer restarted means you have to ask the peer; he’s the one that restarted the session
Log-Neighbor-Changes Messages
• Peer restarted—the other router reset our neighbor relationship; you need to go to him to see why it thought our relationship had to be bounced
• New adjacency—established a new neighbor relationship with this neighbor; happens at initial startup and after recovering from a neighbor going down
• Holding time expired—we didn’t hear any EIGRP packets from this neighbor for the duration of the hold time; this is typically 15 seconds for most media (180 seconds for low-speed NBMA)
• Retry limit exceeded—this neighbor didn’t acknowledge a reliable packet after at least 16 retransmissions (actual duration of retransmissions is also based on the hold time, but there were at least 16 attempts)
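Once the buffer log has collected a history of neighbor changes, counting the reasons per neighbor shows which of the causes above dominates. The Python sketch below is a hypothetical offline helper (the names `summarize` and `HINTS` are assumptions); it matches the "Neighbor ... is up/down: reason" wording shown on the slides and pairs each reason with a one-line reminder drawn from the descriptions above.

```python
import re
from collections import Counter

LOG = re.compile(r"Neighbor (\S+) \((\S+)\) is (up|down): (.+)$")

# Reminders paraphrased from the bullet descriptions above.
HINTS = {
    "peer restarted": "the peer reset the session; check the peer's log",
    "new adjacency": "neighbor relationship (re)established",
    "holding time expired": "no EIGRP packet heard within the hold time",
    "retry limit exceeded": "reliable packet unacknowledged after at least 16 retransmissions",
}


def summarize(lines):
    """Count (neighbor address, reason) pairs found in log lines."""
    counts = Counter()
    for line in lines:
        match = LOG.search(line.strip())
        if match:
            address, _interface, _state, reason = match.groups()
            counts[(address, reason)] += 1
    return counts


log_lines = [
    "Neighbor 10.1.1.1 (Ethernet0) is down: holding time expired",
    "Neighbor 10.1.1.1 (Ethernet0) is up: new adjacency",
    "Neighbor 10.1.1.1 (Ethernet0) is down: holding time expired",
    "Neighbor 10.1.1.1 (Ethernet0) is down: retry limit exceeded",
]
for (address, reason), count in summarize(log_lines).most_common():
    print(f"{address}: {reason} x{count} -> {HINTS.get(reason, 'see TAC')}")
```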
Log-Neighbor-Changes Messages
Holding Time Expired
• The holding time expires when an EIGRP packet is not received during hold time
• Typically caused by congestion or physical errors
• Ping the multicast address (224.0.0.10) from the other router
• If there are a lot of interfaces or neighbors, you should use extended ping and specify the source address or interface
Neighbor 10.1.1.1 (Ethernet0) is down: holding time expired
[Diagram: routers A and B; labels: Hello, Ping 224.0.0.10, Hello]
Neighbor 10.1.1.2 (Ethernet0) is down: peer restarted
• When an EIGRP packet is received from a neighbor, the hold timer for that neighbor resets to the hold time supplied in that neighbor’s hello packet, then the value begins decrementing
• The hold timer for each neighbor is reset back to the hold time when each EIGRP packet is received from that neighbor (long ago and far away, it needed to be a hello received, but now any EIGRP packet will reset the timer)
• Since hellos are sent every five seconds on most networks, the hold time value in a show ip eigrp neighbors is normally between 10 and 15 (resetting to hold time (15), decrementing to hold time minus hello interval or less, then going back to hold time)
• Why would a router not see EIGRP packets from a neighbor?
• He may be gone (crashed, powered off, disconnected, etc.)
• He (or we) may be overly congested (input/output queue drops, etc.)
• Network between us may be dropping packets (CRC errors, frame errors, excessive collisions)
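The "normally between 10 and 15" rule of thumb follows directly from the timers: the Hold column restarts at the hold time and counts down for about one hello interval before the next packet arrives. A minimal Python sketch of that arithmetic, assuming the 15-second hold time and 5-second hello interval mentioned above (function names are hypothetical):

```python
def usual_hold_window(hold_time=15, hello_interval=5):
    """Range of Hold values expected when packets arrive on schedule."""
    return hold_time - hello_interval, hold_time


def hold_in_usual_window(observed_hold, hold_time=15, hello_interval=5):
    """True if an observed Hold value sits inside the usual window.

    A value below the window suggests at least one expected packet was
    missed or late; it is a hint to look closer, not proof of a fault.
    """
    low, high = usual_hold_window(hold_time, hello_interval)
    return low <= observed_hold <= high


print(usual_hold_window())
# Hold values from the sample neighbor table (12, 13, 10), plus a low reading.
for value in (12, 13, 10, 4):
    print(value, hold_in_usual_window(value))
```

On low-speed NBMA links, pass the larger timers for that medium instead of the defaults.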
Holding Time Expired
RtrA# debug eigrp packet hello
EIGRP Packets debugging is on (HELLO)
19:08:38.521: EIGRP: Sending HELLO on Serial1/1
19:08:38.521: AS 1, Flags 0x0, Seq 0/0 idbQ 0/0 iidbQ un/rely 0/0
19:08:38.869: EIGRP: Received HELLO on Serial1/1 nbr 10.1.6.2
19:08:38.869: AS 1, Flags 0x0, Seq 0/0 idbQ 0/0 iidbQ un/rely 0/0
19:08:39.081: EIGRP: Sending HELLO on FastEthernet0/0
19:08:39.081: AS 1, Flags 0x0, Seq 0/0 idbQ 0/0 iidbQ un/rely 0/0
Holding Time Expired
Remember—Any Debug Can Be Hazardous
• Another troubleshooting tool available is the command “debug eigrp packet hello”; this will produce debug output to the console or buffer log (depending on how you have it configured) that will show the frequency of hellos sent and received
• You should make sure the timestamps for the debugs are granular enough that you can actually see the frequency; something like:
• service timestamps debug datetime msec
• Remember that any time you enable a debug on a production router, you are taking a calculated risk; it’s always better to use all of the safer troubleshooting techniques before resorting to debugs—sometimes they’re necessary, however
Holding Time Expired
• EIGRP sends both unreliable and reliable packets
• Hellos and Acks are unreliable
• Updates, Queries, Replies, SIA-queries and SIA-replies are reliable
• Reliable packets are sequenced and require an acknowledgement
• Reliable packets are retransmitted up to 16 times if not acknowledged
Retry Limit Exceeded
• Exceeding the retry limit means that we’re sending reliable packets which are not getting acknowledged by a neighbor. When a reliable packet is sent to a neighbor, the neighbor must respond with a unicast acknowledgement; if a router is sending reliable packets and not getting acknowledgements, one of two things is probably happening
• The reliable packet is not being delivered to the neighbor
• The acknowledgement from the neighbor is not being delivered to the sender of the reliable packet
• These errors are normally due to problems with delivery of packets, either on the link between the routers or in the routers themselves—congestion, errors, and other problems can all keep unicast packets from being delivered properly; look for queue drops, errors, etc., when the problem occurs, and try to ping the unicast address of the neighbor to see if unicasts in general are broken or whether the problem is specific to EIGRP
Retry Limit Exceeded
Retry Limit Exceeded
• Reliable packets are re-sent after Retransmit Time Out (RTO)
• Typically 6 x Smooth Round Trip Time (SRTT)
• Minimum 100 ms
• Maximum 5000 ms (five seconds)
• 16 retransmits takes between roughly 40 and 80 seconds
• If a reliable packet is not acknowledged before 16 retransmissions and the hold timer duration has passed, re-initialize the neighbor
Neighbor 10.1.1.1 (Ethernet0) is down: retry limit exceeded
[Diagram: routers A and B; labels: Packet, Ack]
Neighbor 10.1.1.2 (Ethernet0) is down: peer restarted
• The Retransmit Timeout (RTO) is used to determine when to retry sending a packet when an Ack has not been received, and is (generally) based on 6 x Smooth Round Trip Time (SRTT); the SRTT is derived from previous measurements of how long it took to get an Ack from this neighbor. The minimum RTO is 100 ms and the maximum is 5000 ms; each retry backs off to 1.5 times the last interval
• The minimum time required for 16 retransmits is approximately 40 seconds (minimum interval of 100 ms with a max interval of 5000 ms); for example, if there isn’t an acknowledgement after 100 ms, the packet is retransmitted and we set a timer for 150 ms. If it expires, we send it again and set the timer for 225 ms, then 337 ms, etc., until 5000 ms is reached; 5000 ms is then repeated until a total of 16 retransmissions have been sent
• The maximum time for 16 retransmits is approximately 80 seconds, if the initial retry is 5000 ms and all subsequent retries are also 5000 ms
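The 40 and 80 second figures can be checked with a few lines of arithmetic. This is a minimal sketch of the backoff as described above (1.5x per retry, clamped between 100 ms and 5000 ms); the function and parameter names are made up for illustration and are not taken from any EIGRP implementation.

```python
def retransmit_schedule(initial_rto_ms, retries=16, backoff=1.5,
                        min_rto_ms=100, max_rto_ms=5000):
    """Return the wait before each retransmission, in milliseconds."""
    # Clamp the starting RTO into the allowed range.
    interval = min(max(initial_rto_ms, min_rto_ms), max_rto_ms)
    waits = []
    for _ in range(retries):
        waits.append(interval)
        # Each retry backs off 1.5 times the last interval, capped at the maximum.
        interval = min(interval * backoff, max_rto_ms)
    return waits


fastest = retransmit_schedule(100)    # best case: RTO starts at the minimum
slowest = retransmit_schedule(5000)   # worst case: RTO starts at the maximum

print(fastest[:4])                    # 100, 150, 225, 337.5 ms
print(round(sum(fastest) / 1000))     # about 41 seconds
print(sum(slowest) / 1000)            # 80.0 seconds
```

In the best case the interval reaches the 5000 ms cap on the eleventh retry, so the last six waits account for 30 of the roughly 41 seconds.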
Retry Limit Exceeded
• If a reliable packet is retransmitted 16 times without an acknowledgement, EIGRP checks to see if the duration of the retries has reached the hold time, as well
• Since the hold time is typically 15 sec on anything but low-speed NBMA, it normally isn’t a factor in the retry limit; NBMA links that are T1 or less, however, wait an additional period of time after re-trying 16 times, until the hold-time period (180 seconds) has been reached before declaring a neighbor down due to retry limit exceeded
• This was done to give the low-speed NBMA networks every possible chance to get the Acks across before downing the neighbor
• Remember this if you modify the hold times!
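The interaction between the retry limit and the hold time can be sketched as below. This assumes the rule is simply "both conditions must be met", as the bullets above describe; the function name is hypothetical and the 41.3 second retry duration is the best-case figure from the backoff arithmetic.

```python
def seconds_until_neighbor_down(retry_duration_s, hold_time_s):
    """Assumed rule: the 16 retries AND the hold time must both have elapsed."""
    return max(retry_duration_s, hold_time_s)


# Typical link: 15 second hold time, so the retries dominate.
print(seconds_until_neighbor_down(41.3, 15))    # 41.3

# Low-speed NBMA: 180 second hold time, so the router keeps waiting after retry 16.
print(seconds_until_neighbor_down(41.3, 180))   # 180
```

Raising the hold time above the retry duration on any link therefore also delays the retry limit exceeded reset.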
Retry Limit Exceeded
Unidirectional Links
[Figure: routers A and B; labels: Hello, Update]
RtrA#show ip eigrp neighbors
IP-EIGRP neighbors for process 1
RtrA#
%DUAL-5-NBRCHANGE: IP-EIGRP 1: Neighbor 10.1.5.4 (Serial1) is down:
retry limit exceeded
RtrB#show ip eigrp neighbors
IP-EIGRP neighbors for process 1
H Address Interface Hold Q Seq
Cnt Num
1 10.1.102.2 Et0 14 4 0
• In this example, we see what happens when a link is only working in one direction; unidirectional links can occur because of a duplicate IP address, a wedged input queue, link errors, or any other reason you can think of that would allow packets to be delivered only in one direction on a link
• RtrA doesn’t even realize that RtrB exists. RtrA is sending out its hellos, waiting for a neighbor to show up on the network; what it doesn’t realize is that RtrB is already out there and trying to bring up the neighbor relationship
• RtrB, on the other hand, sees the hellos from RtrA, sends his own hellos and then sends an update to RtrA to try to get their topology tables/routing tables populated—unfortunately, since the updates are also not being received by RtrA, it of course isn’t sending acknowledgements; RtrB tries it 16 times and then resets his relationship with RtrA and starts over
• You’ll spot this symptom by the retry limit exceeded messages on RtrB, RtrB having RtrA in his neighbor table with a continual Q count, and RtrA not seeing RtrB, at all
• CSCdy45118 has been implemented to create a reliable neighbor establishment process (three-way handshake) and reliable neighbor maintenance (neighbor taken down more quickly when a unidirectional link is encountered); available in 12.2T, 12.3, and up
Unidirectional Links
Retry Limit Exceeded
• Ping the neighbor’s unicast address
• Vary the packet size
• Try large numbers of packets
• This ping can be issued from either neighbor; the results should be the same
• Common causes
• Mismatched MTU
• Unidirectional link
• Dirty link
RtrB# ping
Protocol[ip]:
Target IP address: 10.1.1.1
Repeat count [5]: 100
Datagram Size: 1500
Timeout in seconds[2]:
Extended commands[n]: y
....
• Some manual configuration changes can also reset EIGRP neighbors, depending on the Cisco IOS version
• Summary changes (manual and auto)
• Route filter changes
• Stub setting changes
• This is normal behavior for much older code
• CSCdy20284 removed many of these neighbor resets
• Implemented in 12.2S, 12.3T, and 12.4 (approximately 2005)
• Mismatched K-values (metric weights) will also prevent peers from forming (best not to change them at all).
Manual Changes
• Summary changes
• When a summary changes on an interface, components of the summary may need to be removed from any neighbors reached through that interface; neighbors through that interface are reset to synch up topology entries
• Route filter changes
• Similar to summary explanation above; neighbors are bounced if a distribute-list is added/removed/changed on an interface in order to synch up topology entries
• In the past, we also bounced neighbors when interface metric info changed (delay, bandwidth), but we no longer do that (CSCdp08764)
• CSCdy20284 was implemented to stop bouncing neighbors when many manual changes occur; in late 12.2S, 12.3T, and 12.4, summary and filter changes no longer bounce neighbors
• Changing Stub setting or router-ids still resets peers! Remember to make these changes during maintenance windows
Manual Changes
Show Commands
• There is also a show ip eigrp interface command which contains a subset of this info; you may want to just use that if you don’t need all the detail
• This command supplies a lot of information about how the interfaces are being used and how well they are behaving; some of the interesting information available via this command is:
• Retransmissions sent—this shows how many times EIGRP has had to retransmit packets on this interface, indicating that it didn’t get an ack for a reliable packet; having retransmits is not terrible, but if this number is a large percentage of packets sent on this interface, something is keeping neighbors from receiving (and acking) reliable packets
• Out-of-sequence rcvd—this shows how often packets are received out of order, which should be a relatively unusual occurrence; again, it’s nothing to worry about if you get occasional out-of-order packets since the underlying delivery mechanism is best-effort—if the number is a large percentage of packets sent on the interface, however, then you may want to look into what’s happening on the interface—are there errors?
• You can also use this command to see if an interface only contains stub neighbors and if authentication is enabled
Show IP EIGRP Interface Detail
Neighbors
RtrB#show ip eigrp interface detail
IP-EIGRP interfaces for process 1
Xmit Queue Mean Pacing Time Multicast Pending
Interface Peers Un/Reliable SRTT Un/Reliable Flow Timer Routes
Et0/0 1 0/0 737 0/10 5376 0
Hello interval is 5 sec
Next xmit serial <none>
Un/reliable mcasts: 0/3 Un/reliable ucasts: 6/3
Mcast exceptions: 0 CR packets: 0 ACKs suppressed: 0
Retransmissions sent: 0 Out-of-sequence rcvd: 0
Authentication mode is not set
Et1/0 1 0/0 885 0/10 6480 0
Hello interval is 5 sec
Next xmit serial <none>
Un/reliable mcasts: 0/2 Un/reliable ucasts: 5/3
Mcast exceptions: 0 CR packets: 0 ACKs suppressed: 0
Retransmissions sent: 0 Out-of-sequence rcvd: 0
Authentication mode is not set
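To turn "a large percentage" into a number, the counters above can be pulled out of the command output. This is a rough sketch, not a supported parser: the regular expressions are fitted to the sample output on this slide, and using reliable multicasts plus reliable unicasts as the denominator is an assumption.

```python
import re

# Lines copied from the slide output (only the ones the sketch reads).
SAMPLE = """\
Et0/0 1 0/0 737 0/10 5376 0
Un/reliable mcasts: 0/3 Un/reliable ucasts: 6/3
Retransmissions sent: 0 Out-of-sequence rcvd: 0
Et1/0 1 0/0 885 0/10 6480 0
Un/reliable mcasts: 0/2 Un/reliable ucasts: 5/3
Retransmissions sent: 0 Out-of-sequence rcvd: 0
"""


def retransmit_ratios(text):
    """Map interface name -> retransmissions / reliable packets sent."""
    ratios, iface, reliable = {}, None, 0
    for raw in text.splitlines():
        line = raw.strip()
        m = re.match(r"Un/reliable mcasts: \d+/(\d+)\s+Un/reliable ucasts: \d+/(\d+)", line)
        if m:
            # Reliable multicasts plus reliable unicasts (assumed denominator).
            reliable = int(m.group(1)) + int(m.group(2))
            continue
        m = re.match(r"Retransmissions sent: (\d+)", line)
        if m and iface:
            ratios[iface] = int(m.group(1)) / reliable if reliable else 0.0
            continue
        # An interface summary row such as "Et0/0 1 0/0 737 0/10 5376 0".
        m = re.match(r"([A-Z][A-Za-z]*\d+(?:/\d+)*)\s+\d+\s", line)
        if m:
            iface, reliable = m.group(1), 0
    return ratios


print(retransmit_ratios(SAMPLE))
```

Both interfaces in the slide output come out at 0.0; a ratio that stays high over repeated samples is the signal worth chasing.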
Show IP EIGRP Interface Detail
• On a busy, unstable network debugs can be hazardous to your network’s health
• Event log is non-disruptive—it’s already running
• (unless you turned it off)
• Defaults to 500 lines (configurable) – per AS
• eigrp event-log-size <number of lines>
• Maximum event-log-size is half of available memory
• Most recent events at top of log by default
• Read from the bottom to top
• Later versions support displaying the event log in reverse
• Clear ip eigrp event resets the log buffer
Event Log
• A separate event log is kept for each AS
• 500 lines are not very much; on a network where there is significant instability or activity, 500 lines may only be a second or two (or less) — you can change the size of the event log (if needed) by the command
• eigrp event-log-size <number of lines>
• Recent IOS limits to half of available memory
• If number of lines set to 0, it disables the log
• You can clear the event log by typing
• clear ip eigrp event
• Most recent events are at the top of the log by default, so time flows from bottom to top
Event Log
• Three different event types can be logged
• eigrp log-event-type [dual][xmit][transport]
• Default is dual—normally most useful
• Dual is FSM (decisions in finite state machine)
• xmit and transport are different aspects of actually sending packets to peers
• Any combination of the three can be on at the same time
• Work is in progress to add additional debug information to event log
Event Log
The EIGRP Topology Table
• The topology table is probably the most critical structure in EIGRP
• Contains building blocks used by DUAL
• Used to create updates for neighbors
• Used to populate the routing table
• Understanding the topology table contents is extremely important in troubleshooting EIGRP problems
Topology Table
• One of the reasons that EIGRP is called an advanced distance vector protocol is that it retains more information than just the best path for each route it receives—this means that it can potentially make decisions more quickly when changes occur, because it has a more complete view of the network than RIP, for example; the place this additional information is stored is in the topology table
• The topology table contains an entry for every route EIGRP is aware of, and includes information about the paths through all neighbors that have reported this route to him—when a route is withdrawn by a neighbor, EIGRP will look in the topology table to see if there is a feasible successor, which is another downstream neighbor that is guaranteed to be loop-free; if so, EIGRP will use that neighbor and never have to go looking farther
• Contrary to popular belief, the topology table also contains routes which are not feasible; these are called possible successors and may be promoted to feasible successors, or even successors if the topology of the network were to change
• The following slides show a few different ways to look at the topology table and give hints on how to evaluate it
Topology Table
RtrA#sh ip eigrp topology sum
IP-EIGRP Topology Table for AS(200)/ID(40.80.0.17)
Head serial 1, next serial 1526
589 routes, 0 pending replies, 0 dummies
IP-EIGRP(0) enabled on 12 interfaces, neighbors present on 4 interfaces
Quiescent interfaces: Po3 Po6 Po2 Gi8/5
Callouts:
• Head serial, next serial: internal data structures used to manage the topology table
• Pending replies: number of Replies to send from this router
• Routes: total number of routes in the local topology table
• Quiescent interfaces: interfaces with no outstanding packets to be sent or acknowledged
Topology Table
Show IP EIGRP Topology Summary
Topology Table
• Displays a list of successors and feasible successors for all destinations known by EIGRP
Show IP EIGRP Topology
RtrA#show ip eigrp topology
IP-EIGRP Topology Table for AS(1)/ID(10.1.6.1)
..snip…..
P 10.200.1.0/24, 1 successors, FD is 21026560
via 10.1.1.2 (21026560/20514560), Serial1/0
via 10.1.2.2 (46740736/20514560), Serial1/1
Callouts:
• FD: feasible distance
• First number in parentheses: computed distance
• Second number in parentheses: reported distance
• 10.1.1.2: successor
• 10.1.2.2: feasible successor
[Figure: routers A, B, C, D, and E; networks 10.1.2.0/24, 10.1.5.0/24, and 10.200.1.0/24; 128K and 56K links]
Topology Table
• The most common way to look at the topology table is with the generic show ip eigrp topology command; this command displays all of the routes in the EIGRP topology table, along with their successors and feasible successors
• In the above example, the P on the left side of the topology entry displayed means the route is Passive—if it has an A, it means the route is Active; the destination being described by this topology entry is for 10.200.1.0 255.255.255.0—this route has one successor, and the feasible distance is 21026560; the feasible distance is normally the metric that would appear in the routing table if you did the command show ip route 10.200.1.0 255.255.255.0 (but not always)
• Following the information on the destination network, the successors and feasible successors are listed—the successors (one or more) are listed first, then the feasible successors are listed; the entry for each next-hop includes the IP address, the computed distance through this neighbor, the reported distance this neighbor told us, and which interface is used to reach him
• 10.1.2.2 is a feasible successor because his reported distance (20514560) is less than our current feasible distance (21026560) (It’s a smaller distance, and that means it is closer.)
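The feasibility check itself is a single comparison. The sketch below uses the numbers from the output above; the function name is made up for illustration.

```python
def is_feasible_successor(reported_distance, feasible_distance):
    """A neighbor is feasible when its reported distance is strictly below our FD."""
    return reported_distance < feasible_distance


FD = 21026560  # feasible distance for 10.200.1.0/24 on RtrA

# 10.1.2.2 reports 20514560, which is closer to the destination than we are.
print(is_feasible_successor(20514560, FD))   # True
```

Note that the computed distance through 10.1.2.2 (46740736) plays no part in the check; only the distance the neighbor reported matters.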
Show IP EIGRP Topology
Topology Table
• Displays a list of all neighbors who are providing EIGRP with an alternative path to each destination
Show IP EIGRP Topology All-links
RtrA#show ip eigrp topology all-links
IP-EIGRP Topology Table for AS(1)/ID(10.1.6.1)
…..snip…..
P 10.200.1.0/24, 1 successors, FD is 21026560
via 10.1.1.2 (21026560/20514560), Serial1/0
via 10.1.2.2 (46740736/20514560), Serial1/1
via 10.1.3.2 (46740736/46228736), Serial1/2
Callouts:
• FD: feasible distance
• First number in parentheses: computed distance
• Second number in parentheses: reported distance
• 10.1.1.2: successor
• 10.1.2.2: feasible successor
• 10.1.3.2: possible successor
Topology Table
• If you want to display all of the paths which EIGRP contains in its topology table, use the show ip eigrp topology all-links command
• You’ll notice in the above output that not only are the successor (10.1.1.2) and feasible successor (10.1.2.2) shown, but another router that doesn’t qualify as either is also displayed; the reported distance from 10.1.3.2 (46228736) is far worse than the current feasible distance (21026560), so it isn’t feasible
• This command is often useful to understand the true complexity of network convergence—I’ve been on networks with pages of non-feasible alternative paths in the topology table because of a lack of summarization/distribution lists; these large numbers of alternative paths can cause EIGRP to work extremely hard when transitions occur and can actually keep EIGRP from successfully converging
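On a large network it can help to count how many of the paths in the all-links output are not feasible. The following is a sketch only: the regular expressions are fitted to the sample output on this slide, and treating "computed distance equals FD" as the mark of the successor is a simplification that holds for the passive route shown here.

```python
import re

# Entry copied from the all-links output on the slide.
SAMPLE = """\
P 10.200.1.0/24, 1 successors, FD is 21026560
via 10.1.1.2 (21026560/20514560), Serial1/0
via 10.1.2.2 (46740736/20514560), Serial1/1
via 10.1.3.2 (46740736/46228736), Serial1/2
"""


def classify_paths(text):
    """Return {prefix: [(next_hop, role), ...]} from all-links style output."""
    result, prefix, fd = {}, None, None
    for raw in text.splitlines():
        line = raw.strip()
        head = re.match(r"[PA]\s+(\S+), \d+ successors, FD is (\S+)", line)
        if head:
            prefix = head.group(1)
            # "FD is Inaccessible" entries are skipped (fd stays None).
            fd = int(head.group(2)) if head.group(2).isdigit() else None
            result[prefix] = []
            continue
        via = re.match(r"via (\S+) \((\d+)/(\d+)\)", line)
        if via and fd is not None:
            hop, computed, reported = via.group(1), int(via.group(2)), int(via.group(3))
            if computed == fd:
                role = "successor"
            elif reported < fd:
                role = "feasible successor"
            else:
                role = "possible successor"
            result[prefix].append((hop, role))
    return result


for hop, role in classify_paths(SAMPLE)["10.200.1.0/24"]:
    print(hop, role)
```

Counting the "possible successor" entries per prefix gives a quick measure of how much extra work a transition will cause.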
Show IP EIGRP Topology All-links
Topology Table
• Displays detailed information for all paths received for a particular destination
Show IP EIGRP Topology <net><mask>
RtrA#show ip eigrp topology 10.200.1.0/24
IP-EIGRP topology entry for 10.200.1.0/24
State is Passive, Query origin flag is 1, 1 Successor(s), FD is 21026560
Routing Descriptor Blocks:
10.1.1.2 (Serial1/0), from 10.1.1.2, Send flag is 0x0
Composite metric is (21026560/20514560), Route is Internal
Vector metric:
....
10.1.2.2 (Serial1/1), from 10.1.2.2, Send flag is 0x0
Composite metric is (46740736/20514560), Route is Internal
Vector metric:
....
10.1.3.2 (Serial1/2), from 10.1.3.2, Send flag is 0x0
Composite metric is (46740736/46228736), Route is Internal
Vector metric:
....
Topology Table
• If you really want to know all of the information EIGRP stores about a particular route, use the command show ip eigrp topology <network><mask>
• Note that the mask can be supplied in dotted decimal or /xx form
• In the above display, you’ll see that EIGRP not only stores which next-hops have reported a path to the target network, it stores the metric components used to reach the total (composite) metric
• You also may notice that EIGRP contains a hop count in the vector metrics—the hop count isn’t actually used in calculating the metric, but instead was included to limit the apparent maximum diameter of the network; in EIGRP’s early days, developers wanted to ensure that routes wouldn’t loop forever and put this safety net in place—in today’s EIGRP, it actually isn’t necessary any longer, but is retained for compatibility
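As a reminder of how the vector metrics combine into the composite metric, here is the classic EIGRP calculation with default K values (bandwidth and delay only). The slide elides the actual vector metrics, so the inputs below are assumptions: 128 and 56 kbps are taken from the link labels in the diagram, and the 40100 microsecond delay is an assumed value that happens to reproduce the composite values shown.

```python
def classic_eigrp_metric(min_bandwidth_kbps, total_delay_usec):
    """Classic composite metric with default K values (K1 = K3 = 1, others 0)."""
    scaled_bandwidth = 10_000_000 // min_bandwidth_kbps   # 10^7 / slowest link in kbps
    scaled_delay = total_delay_usec // 10                  # delay in tens of microseconds
    return 256 * (scaled_bandwidth + scaled_delay)


# Assumed inputs, not values read from the slide's elided vector metric.
print(classic_eigrp_metric(128, 40100))   # 21026560
print(classic_eigrp_metric(56, 40100))    # 46740736
```

Hop count does not appear anywhere in the function, which matches the point above that it is carried but not used in the calculation.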
Show IP EIGRP Topology <network><mask>
RtrA#show ip eigrp topology 30.1.1.0/24
IP-EIGRP topology entry for 30.1.1.0/24
State is Passive, Query origin flag is 1, 1 Successor(s), FD is 46738176
Routing Descriptor Blocks:
10.1.3.2 (Serial1/2), from 10.1.3.2, Send flag is 0x0
....
External data:
Originating router is 64.1.4.14
AS number of route is 0
External protocol is Static, external metric is 0
Administrator tag is 0 (0x00000000)
Callout: Static Route to 30.1.1.0/24 is redistributed into EIGRP
Topology Table
• Showing the topology table entry for an external route shows additional information about the route
External Topology Table Entry
Topology Table
• If you perform the command show ip eigrp topology <network><mask> for an external route (one redistributed into EIGRP from another protocol), even more information is displayed
• The initial part of the display is identical to the command output for an internal (native) route—the one exception is the identifier of the route as being external; another section is appended to the first part, however, containing external information—the most interesting parts of the external data are the originating router and the source of the route
• The originating router is the router that initially redistributed the route into EIGRP. Note that the value for the originating router is the router-id of the source router, which doesn’t necessarily need to belong to an EIGRP-enabled interface; the router-id is selected in the same way OSPF selects router-ids, starting with the highest IP address on a loopback interface, if any are defined, or using the highest IP address on the router if there aren’t loopback interfaces. Note also that if a router receives an external route and the originating router field is the same as the receiver’s router-id, it rejects the route; this is noted in the event log as ignored, dup router
• The originating routing protocol (where it was redistributed from) is also identified in the external data section; this is often useful when unexpected routes are received and you are hunting the source
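The router-id selection and the duplicate router-id check described above can be sketched as follows. This is an illustration of the rules as stated in the notes, with hypothetical function names; it is not how any particular IOS release implements them.

```python
import ipaddress


def select_router_id(loopback_ips, other_ips):
    """Highest loopback address if any exist, otherwise highest interface address."""
    candidates = loopback_ips or other_ips
    # Compare numerically, not as strings ("9.9.9.9" would sort above "10.0.0.1").
    return str(max(ipaddress.IPv4Address(ip) for ip in candidates))


def accept_external_route(originating_router, own_router_id):
    """Reject an external route whose originating router matches our own router-id."""
    return originating_router != own_router_id


rid = select_router_id(["10.0.0.1", "9.9.9.9"], ["192.168.1.1"])
print(rid)                                       # 10.0.0.1 (loopbacks win)
print(accept_external_route("64.1.4.14", rid))   # True: accepted
print(accept_external_route(rid, rid))           # False: ignored, dup router
```

Two routers that end up with the same router-id will silently drop each other's external routes, which is why this check is worth remembering when redistributed routes go missing.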
External Topology Table Entry
Topology Table
• Zero successor routes are those that fail to get installed in the routing table by EIGRP because there is a route with a better admin distance already installed
Show IP EIGRP Topology Zero
RtrA#show ip eigrp topology zero
IP-EIGRP Topology Table for AS(1)/ID(10.1.6.1)
....
P 10.200.1.0/24, 0 successors, FD is Inaccessible
via 10.1.1.2 (21026560/20514560), Serial1/0
via 10.1.2.2 (46740736/20514560), Serial1/1
via 10.1.3.2 (46740736/46228736), Serial1/2
RtrA#show ip route 10.200.1.0 255.255.255.0
Routing entry for 10.200.1.0/24
Known via "static", distance 1, metric 0
Routing Descriptor Blocks:
* 10.1.1.2
Route metric is 0, traffic share count is 1
Topology Table
• And last, the show ip eigrp topology zero command is available to display the topology table entries that are not actually being used by the routing table
• Typically, zero successor entries are ones that EIGRP attempted to install into the routing table, but found a better alternative there already; in our example above, when EIGRP tried to install its route (with an administrative distance of 90), it found a static route already there (with an administrative distance of one) and thus couldn’t install it—in case the better route goes away, EIGRP retains the information in the topology table, and will try to install the route again if it is notified that the static (or whatever) route is removed
• Routes that are active sometimes also show up as zero successor routes, but they are transient and don’t remain in that state
• This command isn’t often used or useful
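The administrative distance comparison behind a zero successor entry can be sketched as below. The dictionary standing in for the routing table and the function name are hypothetical; the distances (static 1, EIGRP internal 90) are the ones from the example above.

```python
def try_install(routing_table, prefix, source, admin_distance):
    """Install the route only if nothing with a lower or equal distance is present."""
    current = routing_table.get(prefix)
    if current is None or admin_distance < current[1]:
        routing_table[prefix] = (source, admin_distance)
        return True
    return False


rib = {"10.200.1.0/24": ("static", 1)}

# EIGRP (distance 90) loses to the static route: zero successors in the topology table.
print(try_install(rib, "10.200.1.0/24", "eigrp", 90))   # False

# Once the static route is removed, EIGRP tries again and succeeds.
del rib["10.200.1.0/24"]
print(try_install(rib, "10.200.1.0/24", "eigrp", 90))   # True
```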
Show IP EIGRP Topology Zero
Stuck in Active
• Indicates at least two problems
• A route went active
• It got stuck
• Both the ‘stuck’ and the ‘active’ have already occurred prior to the message being logged!
• Event-logs are your friend
Stuck-in-Active Routes (SIA)
%DUAL-3-SIA: Route 10.1.1.0 255.255.255.0 stuck-in-active state in IP-EIGRP 100. Cleaning up
The Active Process
• RtrA loses its route to 10.10.10.0/24
• RtrA has no other path to this destination, so it marks the route as Active and sends a Query to RtrB
• RtrB receives this Query from its successor and has no other paths to reach the destination
• RtrB marks 10.10.10.0/24 as Active, and sends a Query to RtrC
[Figure: routers A, B, and C; network 10.10.10.0/24; labels: Query, No other path, Query]
The Active Process
• RtrC receives the Query and has no more neighbors to Query and no alternate paths to 10.10.10.0/24
• RtrC marks the route as unreachable, and sends a Reply to RtrB
• RtrB receives the Reply, marks 10.10.10.0/24 as unreachable, and sends a Reply to RtrA
• RtrA receives the Reply and since it didn’t learn any viable paths to reach 10.10.10.0/24, it deletes the route from the topology and routing tables
[Figure: routers A, B, and C; network 10.10.10.0/24; labels: Query, No other path, Query, Reply, Reply]
The Active Process
• What happens if RtrC’s Reply isn’t sent, or doesn’t make it to RtrB?
• While RtrC is trying to send the Reply, RtrA’s Active timer is running
• After 90 seconds, RtrA sends an SIA query to RtrB
• If RtrB is still waiting on RtrC, it sends an SIA reply to RtrA
[Figure: routers A, B, and C; network 10.10.10.0/24; labels: No other path, Query, Reply, Query, Active Timer, SIA Query, SIA Reply]
• SIA-queries are sent to a neighbor up to three times
• May attempt to get a reply from a neighbor for a total of six minutes
• If a Reply is not received by the end of this process, the route is considered stuck through this neighbor
• On the router that doesn’t get a reply after three SIA-queries
• Reinitializes neighbor(s) who didn’t answer
• Goes active on all routes known through bounced neighbor(s)
• Re-advertises to bounced neighbor all routes that were previously advertised
The SIA Query Process
• Sometimes the active process doesn’t complete normally; this can be due to a number of different problems which are covered later in this presentation… what happens when things go wrong?
• If RtrB doesn’t respond to RtrA within 1.5 minutes because it’s still waiting for a Reply from RtrC, RtrA will send an SIA-query to RtrB checking the status—if RtrB is still waiting for a Reply itself, it will respond to RtrA with an SIA-reply; this resets the SIA timer on RtrA so it will wait another 1.5 minutes
• Eventually, the problem keeping RtrC from responding to RtrB will take the neighbor relationship down between RtrB and RtrC, which will cause RtrB to reply to A, ending the Query process
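The six minute figure follows from the 90 second timer and the three SIA-queries. The sketch below lays out the timeline for the simple case where every SIA-query is answered with an SIA-reply but the real Reply never arrives; the function name and event wording are made up for illustration.

```python
def sia_timeline(interval_s=90, max_sia_queries=3):
    """Return (seconds, event) pairs for a neighbor that never sends its Reply."""
    events = [(0, "Query sent, active timer started")]
    for n in range(1, max_sia_queries + 1):
        # Each SIA-reply resets the timer for another interval.
        events.append((n * interval_s, f"SIA-query {n} sent, SIA-reply received"))
    events.append(((max_sia_queries + 1) * interval_s,
                   "still no Reply: route is stuck, neighbor reinitialized"))
    return events


for seconds, event in sia_timeline():
    print(f"{seconds:>4}s  {event}")
```

The last entry lands at 360 seconds, which is the six minutes quoted on the slide.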
The SIA Query Process
• Two (probably) unrelated causes of the problem — stuck and active
• Need to troubleshoot both parts
• Cause of active often easier to find
• Cause of stuck more important to find
Troubleshooting SIAs
• If routes never went active in the network, we would never have to worry about any getting stuck; unfortunately, in a real network there are often link failures and other situations that will cause routes to go active—one of our jobs is to minimize them, however
• If there are routes that regularly go active in the network, you should absolutely try to understand why they are not stable; while you cannot ensure that routes will never go active on the network, a network manager should work to minimize the number of routes going active by finding and resolving the causes
• Even if you reduce the number of routes going active to the minimum possible, if you don’t eliminate the reasons that they get stuck you haven’t fixed the most important part of the problem; the next time you get an active route, you could again get stuck
• The direct impact of an active route is small; the possible impact of a stuck-in-active route can be far greater
Troubleshooting SIAs
• Determine what is common to routes going active
• Known network problems?
• Flapping link(s)?
• From the same region of the network?
• Resolve whatever is causing them to go active (if possible)
Troubleshooting the Active Part of SIAs
• The syslog may tell you which routes are going active, causing you to get stuck. Since the SIA message reports the route that was stuck, it seems rather straightforward to determine which routes are going active. This is only partially true: once SIAs are occurring in the network, many routes will go active due to the reaction to the SIA; you need to determine which routes went active early in the process in order to determine the trigger
• Additionally, you can do show ip eigrp topology active on the network when SIAs are not occurring and see if you regularly catch the same set of routes going active
• If you are able to determine which routes are regularly going active, determine what is common to those routes—are links flapping (bouncing up and down) causing the routes (and everything behind it) to regularly go active?
• Are most or all of the routes coming from the same area of the network? If so, you need to determine what is common in the topology to them so that you can determine why they are not stable
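If you collect the active prefixes from repeated runs of show ip eigrp topology active, a simple tally shows which routes keep coming back. This is a sketch with made-up sample data; how the samples are collected is left out, and the prefixes below are hypothetical.

```python
from collections import Counter


def frequent_active_routes(samples, min_hits=2):
    """Return (prefix, hits) for prefixes seen active in at least min_hits samples."""
    counts = Counter(prefix for sample in samples for prefix in set(sample))
    return [(prefix, hits) for prefix, hits in counts.most_common() if hits >= min_hits]


# Hypothetical samples: each list is the set of active prefixes seen in one run.
samples = [
    ["20.1.1.0/24", "10.9.0.0/16"],
    ["20.1.1.0/24"],
    ["20.1.1.0/24", "10.8.0.0/16"],
]
print(frequent_active_routes(samples))   # [('20.1.1.0/24', 3)]
```

The prefixes at the top of the list are the ones to compare for a common link or region.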
Troubleshooting the Active Part of SIAs
• Show ip eigrp topology active
• Useful only while the problem is occurring
• If the problem isn’t occurring at the time, it is very difficult to find the reason the routes are getting stuck
Troubleshooting the Stuck Part of SIAs
• Our best weapon to use to find the cause of routes getting stuck-in-active is the command show ip eigrp topology active; it provides invaluable information about routes that are in transition—examples of the output of this command and how to evaluate it will be in the next several slides
• Unfortunately, this command only shows routes that are currently in transition; it isn’t useful after the fact when you are trying to determine what happened earlier—if you aren’t chasing it while the problem is occurring, there aren’t really any tools that will help you find the cause
Troubleshooting the Stuck Part of SIAs
Chasing Active Routes
Why Is RtrA Reporting SIA Routes?
Let’s Look at a Problem in Progress
RtrA Is Waiting on RtrB
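[Figure: routers A, B, C, and D in a line; 20.1.1.0/24 attached to A; links 10.1.1.0/24 (A to B), 10.1.2.0/24 (B to C), and 10.1.3.0/24 (C to D), with .1 on the left router and .2 on the right router of each link]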
RtrA#show ip eigrp topology active
IP-EIGRP Topology Table for AS(1)/ID(20.1.1.1)
A 20.1.1.0/24, 1 successors, FD is Inaccessible
1 replies, active 00:01:17, query-origin: Local origin
via Connected (Infinity/Infinity), Ethernet1/0
Remaining replies:
via 10.1.1.2, r, Ethernet0/0
• In our example network, we’ve noticed dual-3-sia messages in the log of RtrA and we know the trigger is an unstable network off of this router; instead of just shutting down the unstable link, we decide to try to determine the cause of the stuck part of stuck-in-active
• In the above output, we see that RtrA is active on the route 20.1.1.0/24 (note the “A” in the left column) and has been waiting for an answer from 10.1.1.2 (RtrB) for one minute and 17 seconds. We know that we are waiting on RtrB because of the lower case r after the IP address; sometimes, the lower case r comes after the metric in the upper part of the output (not under remaining replies). Don’t be fooled: the lower case r is the key, not whether it’s under the remaining replies area or not
• Since we know we are staying active on the route because RtrB hasn’t answered us, we need to go to him (RtrB) to see why he’s taking so long to answer
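Spotting the lower case r by eye gets tedious when many routes are active. The sketch below pulls the neighbors that still owe a reply out of the command output; it is fitted to the two forms that appear in these slides (a flag of r or rs on the via line) and treating exactly those tokens as the reply flag is an assumption.

```python
import re

# Output copied from the RtrA slide.
RTRA = """\
A 20.1.1.0/24, 1 successors, FD is Inaccessible
1 replies, active 00:01:17, query-origin: Local origin
via Connected (Infinity/Infinity), Ethernet1/0
Remaining replies:
via 10.1.1.2, r, Ethernet0/0
"""


def waiting_on(text):
    """Return {prefix: [neighbors still owing a reply]} from 'topology active' output."""
    waiting, prefix = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        head = re.match(r"A\s+(\S+),", line)
        if head:
            prefix = head.group(1)
            waiting[prefix] = []
            continue
        via = re.match(r"via (\d+\.\d+\.\d+\.\d+)(.*)", line)
        if via and prefix:
            tokens = [t.strip() for t in via.group(2).split(",")]
            # The flag may sit under "Remaining replies" or after the metric.
            if any(re.fullmatch(r"r[a-z]?", t) for t in tokens):
                waiting[prefix].append(via.group(1))
    return waiting


print(waiting_on(RTRA))   # {'20.1.1.0/24': ['10.1.1.2']}
```

Because the check looks at the flag tokens rather than the section heading, it does not matter where in the entry the via line appears.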
Chasing Active Routes
Chasing Active Routes
So Why Hasn’t RtrB Replied?
RtrB#show ip eigrp topology active
IP-EIGRP Topology Table for AS(1)/ID(10.1.2.1)
A 20.1.1.0/24, 1 successors, FD is Inaccessible
1 replies, active 00:01:26, query-origin: Successor Origin
via 10.1.1.1 (Infinity/Infinity), Ethernet0/0
Remaining replies:
via 10.1.2.2, r, Ethernet1/0
RtrB is Waiting on RtrC
• We repeat the show ip eigrp topology active command on RtrB and we get the results seen above
• We see that RtrB probably isn’t the cause of our stuck-in-active routes, since it is also waiting on another router downstream to answer his query before it can reply; again, the lower case r beside the IP address of 10.1.2.2 tells us it is the neighbor slow to reply
• We now need to go to 10.1.2.2 (RtrC) and see why it isn’t answering RtrB
Chasing Active Routes (Step 1)
Chasing Active Routes
What’s RtrC’s Problem?
RtrC#show ip eigrp topology active
IP-EIGRP Topology Table for AS(1)/ID(10.1.3.1)
A 20.1.1.0/24, 1 successors, FD is Inaccessible, Qqr
1 replies, active 00:01:33, query-origin: Successor Origin, retries(1)
via 10.1.2.1 (Infinity/Infinity), Ethernet0/0, serno 20
via 10.1.3.2 (Infinity/Infinity), rs, q, Ethernet1/0, serno 19, anchored
RtrC Is Waiting on RtrD
• On RtrC we repeat the show ip eigrp topology active command and see what it thinks of the route
• Again, he’s waiting on another neighbor downstream to answer him before it can answer RtrB… you are probably getting the idea of how exciting this process can be; of course, in a real network you probably have users/managers breathing down your neck making it a bit more interesting
• As I’m sure you suspect our next step should be to see why 10.1.3.2 (RtrD) isn’t answering RtrC’s query
Chasing Active Routes (Step 2)
Chasing Active Routes
Why Isn’t RtrD Answering?
RtrD#show ip eigrp topology active
IP-EIGRP Topology Table for AS(1)/ID(10.1.3.2)
RtrD#
No active routes… back to RtrC!
• And again, we look at the active topology table entries, this time on RtrD
• Wait… RtrD isn’t waiting on anyone for any routes; did the replies finally get returned and the route is no longer active? We need to go back to RtrC and see if it is still active on the route
Chasing Active Routes (Step 3)
Chasing Active Routes
No; RtrC Is Still Waiting on RtrD; What’s the Deal?
RtrC#show ip eigrp topology active
IP-EIGRP Topology Table for AS(1)/ID(10.1.3.1)
A 20.1.1.0/24, 1 successors, FD is Inaccessible, Qqr
1 replies, active 00:01:52, query-origin: Successor Origin, retries(1)
via 10.1.2.1 (Infinity/Infinity), Ethernet0/0, serno 20
via 10.1.3.2 (Infinity/Infinity), rs, q, Ethernet1/0, serno 19, anchored
RtrC is waiting on RtrD
• Hmmm… RtrC still thinks the route is active and it’s gotten even older
• There appears to be a problem, Houston. RtrC thinks it needs a reply from RtrD, yet RtrD isn’t active on the route; we need to take a look at the neighbor relationship between these two routers to try to identify what is going wrong
Chasing Active Routes (Step 4)
Chasing Active Routes
We need to investigate and see why they don’t seem to agree about the active route
RtrC#show ip eigrp neighbors
IP-EIGRP neighbors for process 1
H Address Interface Hold Uptime SRTT RTO Q Seq
(sec) (ms) Cnt Num
0 10.1.3.2 Et1/0 13 00:00:14 0 5000 1 0
1 10.1.2.1 Et0/0 13 01:22:54 227 1362 0 385
Looks like something’s broken between RtrC and RtrD
• It appears that RtrC is having a bit of a problem communicating with RtrD—the neighbor relationship isn’t even making it completely up based on the Q count on RtrC; we also notice in the log that the neighbor keeps bouncing due to retry limit exceeded
• Now we need to use our normal troubleshooting methodology to determine why these two routers can’t talk to each other properly
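The Q count check can be scripted when there are many neighbors to look through. This sketch splits the neighbor table on whitespace and assumes the nine-column layout shown above; a nonzero Q count in a single sample can be transient, so it is the neighbors that show up repeatedly that deserve attention.

```python
# Output copied from the RtrC slide.
RTRC_NEIGHBORS = """\
IP-EIGRP neighbors for process 1
H Address Interface Hold Uptime SRTT RTO Q Seq
(sec) (ms) Cnt Num
0 10.1.3.2 Et1/0 13 00:00:14 0 5000 1 0
1 10.1.2.1 Et0/0 13 01:22:54 227 1362 0 385
"""


def suspect_neighbors(text):
    """Return addresses of neighbors with packets still queued (Q Cnt > 0)."""
    suspects = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have nine columns and start with the numeric handle (H).
        if len(fields) == 9 and fields[0].isdigit():
            address, q_count = fields[1], int(fields[7])
            if q_count > 0:
                suspects.append(address)
    return suspects


print(suspect_neighbors(RTRC_NEIGHBORS))   # ['10.1.3.2']
```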
Chasing Active Routes (Step 5)
Chasing Active Routes
RtrC#ping 10.1.3.2
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.1.3.2, timeout is 2 seconds:
.....
Success rate is 0 percent (0/5)
Okay, we can’t ping; we need to fix this before EIGRP stands a chance of working
• How does basic connectivity look? A ping between RtrC and RtrD isn’t succeeding either; we’ll need to find out why they can’t talk to each other
• Whatever is causing them to not talk to each other is undoubtedly a contributing factor to the SIAs we’re seeing in the network; we need to find and fix the problem with this link and remove the cause of the SIA routes
Chasing Active Routes (Step 6)
• It’s not always this easy to find the cause of an SIA
• Sometimes you chase the waiting neighbors in a circle
• If so, summarize and simplify
• Easier after CSCdp33034 (circa 2000)
• SIA should happen closer to the location of the cause of the problem
Troubleshooting the Stuck Part of SIAs
• Our example of chasing SIA routes was intentionally made very easy in order to demonstrate the tools and techniques—in a real event on a network, there would probably be many more routes active, and many more neighbors replying; this can make chasing the waiting neighbors significantly more challenging
• Usually, you will be able to succeed at tracking the waiting neighbors back to the source of the problem—occasionally, you can’t—on highly redundant networks, in particular, you can find yourself chasing neighbors in circles without reaching an endpoint cause of the waiting; if you run into this case, you may need to temporarily reduce the redundancy in order to simplify the network for troubleshooting and convergence
Troubleshooting the Stuck Part of SIAs
• Bad or congested links
• Query range is “too long”
• Excessive redundancy
• Overloaded router (high CPU)
• Router memory shortage
• Software defects (seldom)
Likely Causes for Stuck-in-Active
• Remember that the cause of the SIA route could be a different location than where the SIA message and bounced neighbors happened; this is particularly true with code older than CSCdp33034
• Some of the possible causes of SIAs are:
• Links that are either experiencing high CRC or other physical errors or are congested to the point of dropping a significant number of frames—queries, replies, or acknowledgements could be lost
• The time it takes for a query to go from one end of the network to the other is too long and the active timer expires before the query process completes; I don’t think I’ve ever seen a network where this is true, by the way
• The complexity in the network is so great due to excessive redundancy that EIGRP is required to work so hard at sending and replying to queries that it cannot complete them in time
• A router is so low on memory that it is able to send hellos, which are very small, but is unable to send queries or replies
• There have occasionally been software defects that caused SIAs (CSCdi83660, CSCdv85419, CSCtc31545)
Likely Causes for SIAs
• What is excessive redundancy? Isn’t redundancy a good thing, not something to avoid? I categorize excessive redundancy as alternative paths that exist in the network that provide little if any real benefit of improved reliability, and are often unplanned and unexpected
• In the above example, the four subnets on the left (which could be VLANs through a switch or any other media, for that matter) are there to provide users with access to the network; there are two routers connected to each VLAN in order to provide redundancy (probably via HSRP) so that the users will have failover capability if there is a problem
• Unfortunately, the designer may have created a network topology a little different from what was intended
Excessive Redundancy
Excessive Redundancy
[Diagram: RtrA and RtrB, with 1.1.1.0/24]
RtrA#show ip eigrp topo | begin 1.1.1.0
P 1.1.1.0/24, 1 successors, FD is 128256
        via Connected, Loopback1
P 10.0.11.0/24, 1 successors, FD is 9048064
...snip...
RtrA#show ip route | begin 1.1.1.0
C 1.1.1.0 is directly connected, Loopback1
...snip...
RtrA#show ip eigrp topo all | begin 1.1.1.0
P 1.1.1.0/24, 1 successors, FD is 128256, serno 2673915
        via Connected, Loopback1
        via 10.0.19.2 (9690112/9173248), FastEthernet6/0.19
        via 10.0.20.2 (9690368/9173248), FastEthernet6/0.20
        via 10.0.13.2 (9688576/9173248), FastEthernet6/0.13
        via 10.0.45.2 (9696768/9173248), FastEthernet6/0.45
        via 10.0.27.2 (9692160/9173248), FastEthernet6/0.27
        via 10.0.28.2 (9692416/9173248), FastEthernet6/0.28
        via 10.0.22.2 (9690880/9173248), FastEthernet6/0.22
        via 10.0.42.2 (9696000/9173248), FastEthernet6/0.42
...snip...
Wow, where did all of these alternative paths come from?
• If you just define network statements under EIGRP covering all of your interfaces, each of the user subnets will be treated by EIGRP as a possible alternative path to reach every destination in the network! It is rarely the network designer’s goal to have these user subnets used as transit paths to reach other parts of the network for anyone other than the users that reside on that segment, but EIGRP doesn’t know what the designer intended, only what was configured
• As the output above shows, when something changes in the network EIGRP has to converge over each of the user subnets as part of the query path
Excessive Redundancy
Excessive Redundancy
[Diagram: RtrA and RtrB, with 1.1.1.0/24]
router eigrp 1
 passive-interface fastethernet6/0.1
 passive-interface fastethernet6/0.2
Or
router eigrp 1
 passive-interface default
 no passive-interface fastethernet0/0
• A simple solution to this particular problem is to use the passive-interface command. In EIGRP, defining an interface as passive means that the subnet on that interface is included in the EIGRP topology table and propagated to the rest of the network, but no peers will be formed across these interfaces; this means that they will not be in the transit path and will greatly simplify EIGRP’s apparent topology and the associated complexity of convergence
• If you don’t plan to have a link as a transit path, make it passive!
• Note that if you didn’t want those interfaces to show up in EIGRP at all, you could define more specific network statements to only cover the interfaces you’re interested in. Our assumption above is that those interfaces are needed in EIGRP, just not for peer formation.
Excessive Redundancy
• Decrease query scope (involve fewer routers in the query process)
• Summarization (filters out knowledge of the route to downstream peer)
• Route filters (filters out knowledge of the route to downstream peer)
• Define spoke/edge routers as stubs (explicitly defined as non-transit device)
• Run a Cisco IOS which includes CSCdp33034
Minimizing SIA Routes
• We’ve now talked about the impact that SIA routes can have on your network and how to track down the root cause of SIA events; while you may not be able to completely rid your network of SIA routes, there are techniques you can use to minimize your exposure
• Decrease query scope—in our example network, you saw the queries sent to each router in a chain; if a router receives a query on a route that it doesn’t have in its topology table, it immediately answers and doesn’t send the query onward—this is a very good thing; you do this through:
• Summarization—auto-summary (seldom used) or manual summary to summarize within a major network or to summarize external routes
• Route filtering—used to limit knowledge of routes; particularly on dual-homed remotes, which tend to reflect all routes back to the other leg of the dual home connection
• Use hierarchy—if the network doesn’t have hierarchy, the two techniques above cannot adequately be used
• Define spoke/edge routers as stubs so they aren’t queried at all
• Run a Cisco IOS with CSCdp33034 included
Minimizing SIA Routes
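The effect of query scope can be sketched in a few lines of Python. This is only an illustration, not EIGRP's actual state machine: the four-router chain mirrors the earlier example, the function name `query_scope` is hypothetical, and it assumes every router that knows the route has no feasible successor and therefore passes the query on.

```python
from collections import deque

def query_scope(neighbors, origin, knows_route, stubs=frozenset()):
    """Return the set of routers that receive a query started at origin."""
    queried = set()
    seen = {origin}
    pending = deque([origin])
    while pending:
        router = pending.popleft()
        for peer in neighbors[router]:
            if peer in seen or peer in stubs:
                continue              # stubs are not queried at all
            seen.add(peer)
            queried.add(peer)
            if peer in knows_route:
                pending.append(peer)  # knows the route: asks its own peers
            # otherwise it replies immediately and the query stops here
    return queried

# Chain A - B - C - D, query started by A.
chain = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
everyone = query_scope(chain, "A", knows_route={"B", "C", "D"})
summarized = query_scope(chain, "A", knows_route={"B"})   # C and D only see a summary
with_stub = query_scope(chain, "A", knows_route={"B", "C", "D"}, stubs={"D"})
print(sorted(everyone), sorted(summarized), sorted(with_stub))
```

With summarization or a filter at RtrB, RtrC answers right away and RtrD never sees the query; marking RtrD as a stub has the same effect on RtrD.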
• Summary Metrics
• Summary Admin Distance
• Summary Black Holes
Summary Problems
Summary Basics
• Summarization is an information-hiding technique that sends less-specific routes to represent a block of prefixes
• 192.168.1.0/24, 192.168.2.0/24, and 192.168.3.0/24 can be aggregated to 192.168.0.0/22
• Rather than advertising three networks, each representing 256 addresses (254 usable hosts), RtrA advertises a single network representing 1024 addresses
[Diagram: RtrA aggregates 192.168.1.0/24, 192.168.2.0/24, and 192.168.3.0/24 (3 networks, 256 addresses each) into 192.168.0.0/22 (1 network, 1024 addresses)]
• Summarization is an information-hiding technique used to minimize the number of prefixes advertised while still maintaining full reachability—summarization will be most effective if the network is designed in a hierarchical way so that multiple prefixes can be represented at some point in the network by a single, less specific prefix; one typical place of summarization is from distribution routers toward spokes that only need to know a default route (or at least some subset of total routes) in order to reach the remainder of the network
• When summarization is used in EIGRP networks, scalability is greatly enhanced both because of the fewer number of prefixes known throughout the network as well as the decreased query scope that summarization brings; the query scope aspect will be explained in more detail later in this presentation
Summary Basics
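The aggregation on the slide can be checked with Python's standard `ipaddress` module. This is a minimal sketch; the helper name `covering_summary` is hypothetical and simply widens the first prefix until every component fits.

```python
import ipaddress

def covering_summary(prefixes):
    """Return the smallest single prefix that contains every given prefix."""
    nets = [ipaddress.ip_network(p) for p in prefixes]
    summary = nets[0]
    # Widen by one bit at a time until all components fit inside.
    while not all(n.subnet_of(summary) for n in nets):
        summary = summary.supernet()
    return summary

components = ["192.168.1.0/24", "192.168.2.0/24", "192.168.3.0/24"]
summary = covering_summary(components)
print(summary, summary.num_addresses)
```

Note that the resulting /22 also covers 192.168.0.0/24, which is not one of the three components; a summary can advertise reachability to address space the router does not actually have.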
Summary Metrics
• In EIGRP, the metric of a summary is based on the metrics of its components
• EIGRP chooses the metric of the lowest cost component route as the metric of the summary
[Diagram: routers A, B, and C. 10.1.0.0/24 (cost 10) and 10.1.1.0/24 (cost 20) are summarized as 10.1.0.0/23 with cost 10; 10.2.0.0/24 (cost 10) and 10.2.1.0/24 (cost 20) are summarized as 10.2.0.0/23 with cost 10]
• When EIGRP creates a summary route, it has to determine the metric to include with the route advertisement. EIGRP examines every entry in the database (topology table) looking for components of the summary that will be suppressed by (and thus represented by) the summary; it finds the component with the best composite metric and then copies the metric details from it (bandwidth, delay, etc.) into the summary topology table entry
• Note that it does not take the best delay, best bandwidth, etc., but takes the best composite metric and grabs the attributes from it.
• This works fine except for the fact that components of the summary may come and go, which means EIGRP has to continually make sure the summary is still using the lowest metric contained in a summary component
Summary Metrics
Summary Metrics
• If the component from which the metric was derived flaps, then summary updates are required as well!
• The summary is used to hide reachability information, yet changes to the metric information cause the routers beyond the summary to perform work to keep up with the metric changes
• There is also processing overhead for EIGRP to recalculate the summary metric each time a component changes
[Diagram: routers A, B, and C. Components 10.1.0.0/24 (cost 10), 10.1.1.0/24 (cost 20), 10.2.0.0/24 (cost 10), and 10.2.1.0/24 (cost 20); summary 10.1.0.0/23 is labeled Cost 10, and summary 10.2.0.0/23 is labeled with both Cost 10 and Cost 20]
• This recalculation of the summary metric when components change causes two significant things to happen:
• Every time the component with the best metric changes, the summary needs to be re-advertised to all of its peers, so the desire to hide topology changes behind the summary is only partially met; while it hides the changes for each component prefix, it still causes updates and processing if the best component is the one that changed
• Even if the best component isn’t the one that changed, EIGRP internally has to look at every topology table entry to make sure the summary metric wasn’t affected; with large numbers of components or large numbers of summaries, this can be significant processing
Summary Metrics
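A small Python sketch of the selection described above. It assumes the classic composite metric with default K values (256 × (10^7 / minimum bandwidth in kbps + delay in tens of microseconds)); the `Component` class, the `summary_metric` helper, and the bandwidth/delay values are illustrative, not taken from the slides.

```python
from dataclasses import dataclass

@dataclass
class Component:
    prefix: str
    bandwidth_kbps: int   # minimum bandwidth along the path
    delay_usec: int       # total delay along the path

    def composite(self) -> int:
        # Classic EIGRP composite metric with default K values (K1 = K3 = 1).
        return 256 * (10**7 // self.bandwidth_kbps + self.delay_usec // 10)

def summary_metric(components):
    """Pick the component with the best (lowest) composite metric and reuse
    its whole vector, rather than mixing best bandwidth and best delay."""
    if not components:
        return None  # no components left: nothing to derive a metric from
    return min(components, key=Component.composite)

table = [
    Component("10.1.0.0/24", bandwidth_kbps=10000, delay_usec=1000),
    Component("10.1.1.0/24", bandwidth_kbps=10000, delay_usec=2000),
]
before = summary_metric(table)
# The best component flaps away: the summary metric changes and the summary
# has to be re-advertised, even though the summary prefix itself is unchanged.
after = summary_metric(table[1:])
print(before.prefix, before.composite())
print(after.prefix, after.composite())
```

Every call to `summary_metric` walks the whole component list, which is the local processing cost the notes mention for large topology tables.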
Summary Metrics—Solutions
• Use a loopback interface to force the metric to remain constant
• Create a loopback interface within the summary range with a lower metric than any other component
• Generally best to use a /32 for the prefix and use delay to force the metric value
• The summary will use the metric of the loopback, which will never go down
[Diagram: routers A and B. Components 10.1.0.0/24 (cost 20) and 10.1.1.0/24 (cost 20); summary 10.1.0.0/23 is advertised with cost 10, taken from the loopback]
interface Loopback0
 ip address 10.1.1.1 255.255.255.255
 delay 1
• One way to minimize/remove the first problem (metric changing downstream due to component changes) is to create a loopback on the router doing the summarization and ensure that it has the best metric of any component of the summary; since it will remain up unless administratively shut down, the metric of the summary will not change in its updates to upstream peers
• Note that this approach does nothing to change the second summary metric issue; i.e., router cpu processing required to recalculate on the router doing the summarization—that’s next
Summary Metrics
Summary Metrics—Solutions
• In recent EIGRP code, you can define the “summary-metric” command in router mode in order to specify the metric to be used on the summary, regardless of the metrics of the component routes
• This is similar to defining the metric on redistribution statements in router mode
• This eliminates metric churn downstream as well as local processing
router eigrp ROCKS
 address-family ipv4 autonomous-system 1
  network 10.0.0.0
  topology base
   summary-metric 10.1.0.0 255.255.254.0 10000 100 255 1 1500
[Diagram: routers A and B. Components 10.1.0.0/24 (cost 20) and 10.1.1.0/24 (cost 20); summary 10.1.0.0/23 is advertised with cost 10]
• The recent implementations of EIGRP (release five and newer) contain the new “summary-metric” command under the router prompt which allows you to specify the metric to use on the summary so that learning the metric from summary components is unnecessary; since the metric is fixed, both the route churn problem for downstream peers and local database searching processing are removed
• This new command will greatly improve scalability in networks using summarization with large topology tables, which is where summarization is most useful!
Summary Metrics
Admin Distance Basics
• The administrative distance has nothing to do with distance but instead should be considered preference or believability
• If a route is being installed in the RIB from multiple sources, the admin distance defines which one wins
• The chart supplied here shows the default distances; important to this discussion is the fact that EIGRP summary routes are preferred (AD 5) over routes learned from EIGRP peers (AD 90 for internals, 170 for externals)
• Note: Lower = better
Route Source               Default Distance
Connected Interface        0
Static Route               1
EIGRP Summary Route        5
eBGP                       20
Internal EIGRP             90
IGRP                       100
OSPF                       110
IS-IS                      115
RIP                        120
On Demand Routing (ODR)    160
External EIGRP             170
iBGP                       200
Unknown                    255
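The "lower wins" rule can be sketched as a lookup over the table above. The dictionary keys and the `rib_winner` helper are hypothetical names for illustration; the sketch treats a distance of 255 as "never installed", which is how an unusable route behaves.

```python
DEFAULT_AD = {
    "connected": 0, "static": 1, "eigrp_summary": 5, "ebgp": 20,
    "eigrp_internal": 90, "igrp": 100, "ospf": 110, "isis": 115,
    "rip": 120, "odr": 160, "eigrp_external": 170, "ibgp": 200,
}

def rib_winner(candidates):
    """candidates: list of (source, admin_distance), where None means default.
    Returns the source that is installed, or None if nothing is usable."""
    usable = []
    for source, ad in candidates:
        ad = DEFAULT_AD[source] if ad is None else ad
        if ad < 255:              # 255 is never installed in the RIB
            usable.append((ad, source))
    return min(usable)[1] if usable else None

# Same prefix offered by a local summary and by an external EIGRP route.
default_case = rib_winner([("eigrp_summary", None), ("eigrp_external", None)])
raised_case = rib_winner([("eigrp_summary", 200), ("eigrp_external", None)])
print(default_case, raised_case)
```

With default distances the local summary (5) beats the external route (170); raising the summary's distance to 200 lets the external route win instead.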
Summary Admin Distance
• Original summary Admin Distance problem
• Default summary is defined on distribution routers to spokes
• Internet access point provides 0.0.0.0 external for default route to the Internet
• AD of five for the summary is better than the AD of 170 from the external EIGRP route, so the Internet default route is rejected!
• So let’s just add the AD to the summary!
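[Diagram: distribution routers RtrA and RtrB both connect to spoke RtrC; a 0.0.0.0 external default route is received from the Internet access point]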
RtrA#sh ip route 0.0.0.0 0.0.0.0
Routing entry for 0.0.0.0/0, supernet
Known via "eigrp 1", distance 5, metric 25600, candidate default path, type internal
ip summary-address eigrp 1 0.0.0.0 0.0.0.0
• A common implementation of summaries in an EIGRP network is the generation of a default route (0.0.0.0/0) from the distribution layer to the access routers; this installs a 0.0.0.0/0 Null0 summary (discard) route on the distribution layer router
• The problem occurs when there is an actual 0.0.0.0/0 default route generated elsewhere in the network that the distribution layer needs to receive in order to do proper routing; since the 0.0.0.0/0 summary route has an AD of 5 and the 0.0.0.0/0 learned across the EIGRP network has an AD of 90 for internals or 170 for externals, the local summary wins and the received route is discarded
• In order to solve this problem, we added the ability to define a manual admin distance to the summary several years ago—this turned out to be a good idea only in limited deployment scenarios
Summary Admin Distance
Summary Admin Distance
• Since the summaries created on RtrA and RtrB now have a worse AD than the external route received, the external route wins and is propagated to RtrC
• Components of the summary are still suppressed
• RtrC has equal cost paths to 0.0.0.0/0
• But what happens if RtrA loses the path to the Internet?
D*EX 0.0.0.0/0 [170/409600] via 10.1.2.2, 00:00:10, Serial1/0
[170/409600] via 10.1.1.1, 00:00:10, Serial0/0
ip summary-address eigrp 1 0.0.0.0 0.0.0.0 200
[Diagram: distribution routers RtrA and RtrB both connect to spoke RtrC; 0.0.0.0 external default route]
• When the summary is defined with an admin distance of 200 to make it worse than the external route learned across the EIGRP network, it works just like we intended; the summary is created internally to EIGRP so that the components are still suppressed to the access routers, but the external route received across the EIGRP network wins the installation into the RIB and is advertised to the access layer routers
• The problem occurs if the distribution layer router loses the 0.0.0.0/0 from the EIGRP network for some reason; since it’s the receipt of this route that keeps the local summary from being installed in the RIB and advertised to the access layer peers, its disappearance changes things
Summary Admin Distance
Summary Admin Distance
• Since RtrA no longer has the external route, it creates the local summary route and advertises it to RtrC
• Now RtrC receives an external route from RtrB and an internal route from A
• The route from RtrA wins! Now RtrC’s default route points to RtrA, which doesn’t have access to the Internet and maybe not even the company’s intranet!
D* 0.0.0.0/0 [90/409600] via 10.1.1.1, 00:00:10, Serial0/0
ip summary-address eigrp 1 0.0.0.0 0.0.0.0 200
[Diagram: distribution routers RtrA and RtrB both connect to spoke RtrC; 0.0.0.0 external default route]
• When RtrA loses the 0.0.0.0/0 it received from the EIGRP network, it installs its local summary route into the RIB with an admin distance of 200 and happily advertises it to the access layer peers; this is the interesting part: summary routes only appear as a special route type on the router that generates the summary. On peers that receive the summary, it appears like any other internal route!
• That means that RtrC will receive an external route from RtrB with an AD of 170 and an internal route from RtrA with an AD of 90, so the route from RtrA wins! The problem is that now RtrC points at RtrA for all of its traffic, yet RtrA no longer has access to the Internet and maybe not even the internal company network… drats!
• This problem doesn’t occur on single-homed access layer routers; if there’s only one path out from the access router, it doesn’t really matter if it’s internal or external
Summary Admin Distance
Summary Admin Distance
• How do you resolve this problem?
• Define the Admin Distance on the summary as 255 instead of something lower
• The route will only be advertised if the external exists!
• Note: Don’t use 255 for single-homed remotes!
D*EX 0.0.0.0/0 [170/409600] via 10.1.2.2, 00:00:10, Serial1/0
ip summary-address eigrp 1 0.0.0.0 0.0.0.0 255
[Diagram: distribution routers RtrA and RtrB both connect to spoke RtrC; 0.0.0.0 external default route]
• For dual-homed sites, using the summary admin distance in a slightly different way can provide a decent solution; if the summary is defined with an AD of 255, it will lose to the external route and allow that route to be propagated through to the access routers as before—if the external route disappears, however, the AD of 255 keeps the local summary from being installed in the RIB and thus it won’t be advertised to the access routers; the only remaining route will be the external through RtrB
• Note:
• If you put an AD of 255 on a single-homed remote and the external default route goes away, the access router will not receive the route at all!
• You may need a floating static on the access router to avoid this side-effect
Summary Admin Distance
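The difference between a summary distance of 200 and 255 on a dual-homed spoke can be walked through with a short Python sketch. It is a simplified model of the behavior described in these notes, not of the EIGRP implementation; the function names are hypothetical, and RtrA is assumed to have lost the external default while RtrB still has it.

```python
INTERNAL_AD, EXTERNAL_AD = 90, 170

def distribution_advert(summary_ad, has_external_default):
    """What a distribution router advertises to the spoke for 0.0.0.0/0."""
    if has_external_default and summary_ad > EXTERNAL_AD:
        return "external"   # received external wins the RIB and is passed on
    if summary_ad < 255:
        return "internal"   # local summary installed; peers see an internal route
    return None             # AD 255: summary never installed, nothing advertised

def spoke_default(adverts):
    """adverts: {router: 'internal' | 'external' | None}; lowest AD wins on the spoke."""
    ad = {"internal": INTERNAL_AD, "external": EXTERNAL_AD}
    usable = {r: ad[kind] for r, kind in adverts.items() if kind}
    if not usable:
        return []
    best = min(usable.values())
    return sorted(r for r, d in usable.items() if d == best)

def scenario(summary_ad):
    # RtrA has lost the external default; RtrB still has it.
    return spoke_default({
        "RtrA": distribution_advert(summary_ad, has_external_default=False),
        "RtrB": distribution_advert(summary_ad, has_external_default=True),
    })

print(scenario(200))  # RtrC follows RtrA, which has no path to the Internet
print(scenario(255))  # RtrC follows RtrB, the only router still advertising
```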
Summary Admin Distance
• Another way to resolve the problem
• Don’t use summary admin distance if sending to dual-homed remotes
• Instead, use distribute-list to permit 0.0.0.0 to remotes with floating static on remotes
router eigrp 1
 distribute-list 1 out Serial0/0
….
access-list 1 permit 0.0.0.0
ip route 0.0.0.0 0.0.0.0 10.1.1.1 200
ip route 0.0.0.0 0.0.0.0 10.1.2.2 200
[Diagram: distribution routers RtrA and RtrB both connect to spoke RtrC; 0.0.0.0 external default route]
• An alternative solution is to not use summarization at all on the distribution layer and instead use a distribute-list out to permit only the 0.0.0.0/0 route to the access layer; of course, that means that if the 0.0.0.0/0 route disappears, the access layer routers are stranded and can’t reach the internal, company network… not good
• To avoid this problem, the distribute-list out on the distribution layer should be used in tandem with floating static routes on the access-layer routers
Summary Admin Distance
Summary Admin Distance
• Another situation encountered when adding the admin distance to a summary
• Same summary can be defined on multiple interfaces
• Only one topology table/routing table entry is actually created
• The last defined Admin distance wins!
ip summary-address eigrp 1 0.0.0.0 0.0.0.0 200
ip summary-address eigrp 1 0.0.0.0 0.0.0.0 240
Router#sh ip route 0.0.0.0
Routing entry for 0.0.0.0/0, supernet
Known via "eigrp 1", distance 240, metric 128256, candidate default path, type internal
Router#show run int e0/0
interface Ethernet0/0
 ip address 10.1.1.1 255.255.255.0
 ip summary-address eigrp 1 0.0.0.0 0.0.0.0 200
Router#show run int e0/1
interface Ethernet0/1
 ip address 10.1.2.1 255.255.255.0
 ip summary-address eigrp 1 0.0.0.0 0.0.0.0 240
• One other “interesting” aspect that may be unexpected when using the admin distance option on the summary commands is that while a summary can be entered on many different interfaces, only one summary topology entry and one routing table entry is actually created
• This means that there is really only one distance associated with the summary, regardless of how many different distances you enter for the same summary on different interfaces
• If you enter the same summary on different interfaces, it’s really only adding interfaces to a single summary queue entry; if each of these summaries has a different admin distance, the last one entered will be the one used
Summary Admin Distance
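ip summary-address eigrp 1 10.1.0.0 255.255.0.0
[Diagram: core router X connects to distribution routers RtrA and RtrB, which connect to spokes C, D, and E; the spoke networks are 10.1.1.0/24 (behind RtrC), 10.1.2.0/24, and 10.1.3.0/24]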
Summary Black Holes
• This network implements manual summarization from the distribution routers toward the core
• These summaries represent all spoke networks and links to the spokes
• It normally doesn’t matter whether RtrA or RtrB is used to reach an address on a spoke from X
RtrX#sh ip route | sec 10.1.0.0
D 10.1.0.0/16 [90/307200] via 10.2.1.52, 00:02:01, Ethernet0/0
[90/307200] via 10.2.1.51, 00:02:01, Ethernet0/0
• In this example network, the designer implemented manual summarization to hide the specific routes located at the remote sites by summarizing from the distribution layer toward the core; on each of the interfaces of RtrA and RtrB toward router X is the summary statement ip summary-address eigrp 1 10.1.0.0 255.255.0.0—this blocks the specific prefixes from being advertised to X, and instead only advertises the 10.1.0.0/16 prefix there
• Normally, this works great; minimal info is known at the core and proper routing takes place just fine. Transitions in the remotes are hidden from the core, which makes the core more stable. But what happens if a problem occurs?
Summary Black Holes
Summary Black Holes
• What happens if the RtrA to RtrC Link fails?
• RtrX is still receiving the summary from both RtrA and RtrB and may choose RtrA as its path for packets going to 10.1.1.0/24
• RtrA builds a discard route to null0 with an administrative distance of five when the summary is configured
• The traffic will be dropped at RtrA!
RtrA#sh ip route 10.1.1.0
Routing entry for 10.1.0.0/16
  Known via "eigrp 1", distance 5,
  * directly connected, via Null0
RtrX#ping 10.1.1.53
.....
Success rate is 0 percent (0/5)
RtrX#
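[Diagram: core router X connects to distribution routers RtrA and RtrB, which connect to spokes C, D, and E; the spoke networks are 10.1.1.0/24 (behind RtrC), 10.1.2.0/24, and 10.1.3.0/24]
ip summary-address eigrp 1 10.1.0.0 255.255.0.0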
• The problem is that a summary will be installed and sent if any component of the summary exists; if a router doing summarization loses access to one or more of the components of the summary, it will still advertise reachability to the entire summary, even though packets destined to the missing component(s) cannot be delivered
• This isn’t a problem if the summarizing router is the only path to the lost network, but often it’s not; in the diagram above, both RtrA and RtrB have access to the remotes and are summarizing them toward the core of the network. If RtrA loses access to 10.1.1.0/24, routers downstream (like RtrX) could send packets to RtrA that it cannot deliver; if the packets went to RtrB, however, they would have been delivered successfully
• Note that this problem is common to all routing protocols which can do summarization. Information hiding is a good thing, but there are possible down-sides, as well.
Summary Black Holes
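The black hole comes down to longest-prefix matching on RtrA. A minimal sketch, assuming a toy RIB keyed by prefix; the next-hop labels and the placement of 10.1.2.0/24 and 10.1.3.0/24 behind RtrD and RtrE are assumptions for illustration.

```python
import ipaddress

def lookup(rib, destination):
    """Longest-prefix match: rib maps prefix string -> next hop."""
    addr = ipaddress.ip_address(destination)
    matches = [ipaddress.ip_network(p) for p in rib
               if addr in ipaddress.ip_network(p)]
    if not matches:
        return None
    best = max(matches, key=lambda n: n.prefixlen)
    return rib[str(best)]

rtr_a = {
    "10.1.0.0/16": "Null0",   # discard route created by the summary
    "10.1.1.0/24": "RtrC",
    "10.1.2.0/24": "RtrD",    # assumed placement
    "10.1.3.0/24": "RtrE",    # assumed placement
}
before = lookup(rtr_a, "10.1.1.53")
del rtr_a["10.1.1.0/24"]      # the RtrA-to-RtrC link fails
after = lookup(rtr_a, "10.1.1.53")
print(before, after)
```

While the /24 exists, it is the longest match and traffic is forwarded; once it disappears, the only remaining match is the summary's discard route, so the packet is dropped even though RtrB could still have delivered it.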
[Diagram labels: GRE Tunnel, No Summarization; New Link, No Summarization]
Summary Black Holes
• Possible Solutions
• Add a new link between RtrA and RtrB without summarization configured
• Add a GRE tunnel between RtrA and RtrB without summarization configured
RtrA#sh ip ei to all | sec 10.1.1.0
P 10.1.1.0/24, 1 successors, FD is 307200, serno 19
        via 10.1.10.53 (307200/281600), Ethernet0/1
        via 10.1.20.52 (1587200/307200), Ethernet1/0
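[Diagram: core router X connects to distribution routers RtrA and RtrB, which connect to spokes C, D, and E; the spoke networks are 10.1.1.0/24 (behind RtrC), 10.1.2.0/24, and 10.1.3.0/24]
ip summary-address eigrp 1 10.1.0.0 255.255.0.0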
• The normal method to avoid/resolve this problem is to have another link between summarizing routers (another fast/gigEthernet, PVC, etc.) and not summarize across this link; in that way, RtrA would be getting component routes from RtrB and would know how to deliver packets to 10.1.1.0 through RtrB
• Another approach used if the cost of another link is too high is to put a GRE tunnel between RtrA and RtrB and allow all component routes to be advertised across this tunnel
• There’s also some discussion inside of EIGRP development of ways to solve this problem dynamically. A sys-wish bug has been filed against it (CSCdw68502) but we haven’t started the work yet; we’re still discussing the best way to solve it
Summary Black Holes
• Redistribution metrics
• Static route to connected interface
• Multiple points of redistribution
Redistribution Problems
• Redistribution is used to advertise routes learned in another routing protocol (or another EIGRP AS) into the EIGRP network; routes that are redistributed into EIGRP are considered less trustworthy than native EIGRP routes because of the loss of specific topology information/metrics. Because they’re less trustworthy, they’re given a worse admin distance so that routes that are learned internally within the AS are preferred
• Redistribution is a fact of life in many networks, with the foreign route sources coming from suppliers, other divisions, other companies when there are mergers, etc.; redistribution isn’t evil, but it needs to be controlled so that the EIGRP network remains stable
• Another use of redistribution is for MPLS/VPN over BGP using PE-CE support; this creates its own interesting troubleshooting challenges
Redistribution Basics
Redistribution Metrics
• One of the most common problems with redistribution into EIGRP is when a redistribution metric isn’t defined
• RtrA is redistributing 10.1.1.0/24 from RIP, but RtrB and RtrC do not have the route installed
• The first thing to check is whether RtrA has a redistribution metric configured via either
• Default-metric <metric>
• Redistribute rip metric <metric>
• EIGRP can’t directly turn a hop count or cost into an EIGRP metric, so it won’t redistribute routes unless it knows what metric to assign to them
[Diagram: RtrA learns 10.1.1.0/24 via RIP and is connected to RtrB and RtrC]
router eigrp 100
 redistribute rip
What Metric Should I Use?
RtrC#show ip route 10.1.1.0
RtrC#
RtrB#show ip route 10.1.1.0
RtrB#
• As demonstrated, EIGRP must have a redistribution metric to use for the routes being redistributed; the two forms of supplying this redistribution metric serve similar, but not identical purposes—make sure you use the one that matches your requirements
• If you want to set the metric to be used for routes from a particular redistribution source, use the metric keyword on the redistribution statement
router eigrp 1
 redistribute rip metric 10000 100 255 1 1500
 redistribute ospf 1 metric 1000 200 255 1 1500
• If you want all redistribution sources to have the same metric applied to the redistributed routes, you can use the default-metric command instead
router eigrp 1
 redistribute rip
 redistribute ospf 1
 default-metric 10000 100 255 1 1500
Redistribution Metrics
• EIGRP will automatically derive the redistribution metrics from:
• A connected interface for redistribute connected
• The interface through which a static route is reached for redistribute static (note: this isn’t always reliable; you’re better off specifying the redistribution metric!)
• The metric of an EIGRP route from another AS
• If none of those are true, you must supply the metric for redistribution
• On the redistribute statement explicitly
• Or a default metric statement
• Just Configure it explicitly! Configure with Intent.
Redistribution Metrics
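The rules above can be captured in a short Python sketch. The helper names are hypothetical; the composite calculation assumes the classic metric with default K values, with the bandwidth given in kbps and the delay in tens of microseconds, as on the redistribute and default-metric statements.

```python
AUTO_DERIVED = {"connected", "static", "eigrp"}   # sources EIGRP can derive a metric from

def will_redistribute(source, redistribute_metric=None, default_metric=None):
    """True if EIGRP has some metric to assign to routes from this source."""
    return (redistribute_metric is not None
            or default_metric is not None
            or source in AUTO_DERIVED)

def composite(bandwidth_kbps, delay_tens_usec, reliability=255, load=1, mtu=1500):
    # With default K values only bandwidth and delay contribute to the metric.
    return 256 * (10**7 // bandwidth_kbps + delay_tens_usec)

rip_no_metric = will_redistribute("rip")
rip_with_metric = will_redistribute("rip", redistribute_metric=(10000, 100, 255, 1, 1500))
ospf_with_default = will_redistribute("ospf", default_metric=(10000, 100, 255, 1, 1500))
print(rip_no_metric, rip_with_metric, ospf_with_default)
print(composite(10000, 100, 255, 1, 1500))
```

The first case is the one on the slide: redistribute rip with no metric anywhere, so RtrB and RtrC never see 10.1.1.0/24.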
Static Route to Connected Interface
• Another surprise you could hit isn’t really a “problem” but could be unexpected
• A static route with the next-hop pointing to a local interface may be automatically redistributed
• This happens only if the destination network is covered by a network statement
router eigrp 1
 network 10.0.0.0
!
ip route 10.2.2.0 255.255.255.0 null0
r31#show ip eigrp topology 10.2.2.0/24
IP-EIGRP (AS 1): Topology entry for 10.2.2.0/24
  State is Passive, Query origin flag is 1, 1 Successor(s), FD is 256
  Routing Descriptor Blocks:
  0.0.0.0, from Rstatic, Send flag is 0x0
      Composite metric is (256/0), Route is Internal
      Vector metric:
        Minimum bandwidth is 10000000 Kbit
        Total delay is 0 microseconds
        Reliability is 0/255
        Load is 0/255
        Minimum MTU is 1500
        Hop count is 0
r32#show ip route
     10.0.0.0/24 is subnetted, 3 subnets
C       10.1.2.0 is directly connected, Ethernet0/0
D       10.2.2.0 [90/281600] via 10.1.2.31, 00:19:20, Ethernet0/0
D       10.1.1.0 [90/307200] via 10.1.2.31, 00:24:19, Ethernet0/0
[Diagram: r31 and r32]
• One situation that isn’t necessarily a problem but creates calls to the TAC and questions on our internal email aliases is the apparently bizarre behavior that EIGRP will automatically redistribute a static route even if redistribute static isn’t configured in some circumstances—how could this not be a bug?
• When a static route is configured and the next-hop is a local interface (including null0) it sets a bit on the route identifying that it is pseudo-connected which requires things like ARP to reach hosts within the subnet
• Since the connected bit is set on the route, EIGRP picks it up if the destination of the route is also covered by a network statement; this has always been the case and is operating as designed
• One really unusual aspect of the route that is redistributed is that it shows up as a redistributed internal, so peers will see the route as an internal even though it’s redistributed… confusing, huh?
Static Route to Connected Interface
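The two conditions described above (next hop is a local interface, destination covered by a network statement) can be expressed as a small Python check. This is a sketch of the rule as stated in these notes, not of the IOS implementation; the function name is hypothetical, and the classful `network 10.0.0.0` statement is written out as 10.0.0.0/8 by the caller.

```python
import ipaddress

def _is_ip(text):
    try:
        ipaddress.ip_address(text)
        return True
    except ValueError:
        return False

def picked_up_as_internal(static_prefix, next_hop, network_statements):
    """True when EIGRP would advertise the static without 'redistribute static'."""
    points_at_interface = not _is_ip(next_hop)   # e.g. "null0" or "serial 0/0"
    dest = ipaddress.ip_network(static_prefix)
    covered = any(dest.subnet_of(ipaddress.ip_network(n))
                  for n in network_statements)
    return points_at_interface and covered

slide_case = picked_up_as_internal("10.2.2.0/24", "null0", ["10.0.0.0/8"])
ip_next_hop = picked_up_as_internal("10.2.2.0/24", "10.1.2.32", ["10.0.0.0/8"])
not_covered = picked_up_as_internal("10.2.2.0/24", "null0", ["192.168.0.0/16"])
print(slide_case, ip_next_hop, not_covered)
```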
[Diagram: EIGRP domain and OSPF domain]
Multiple Points of Redistribution
• A route is injected into EIGRP as an external; this route is redistributed into OSPF by RtrB
• The route is transmitted through OSPF to RtrA, which redistributes it back into EIGRP
• Depending on the manually set metrics, RtrB may prefer this redistributed route, building a routing loop
• Depending on the timing, the loop can be persistent or transient. Either way, a bad thing!
[Diagram: RtrA and RtrB each connect the EIGRP and OSPF domains; 10.1.1.0/24 is the redistributed route; metric labels shown: 10, 2816000, 2688000, 25, 2560256]
• As mentioned before, redistribution isn’t evil in itself, but a network designer needs to be particularly careful if there are multiple points of redistribution between routing protocols; also as mentioned before, a redistribution metric is normally supplied manually at the redistribution point—this artificial setting of the metric hides where the redistributed routes actually exist in the network
• Because of the loss of specific topology information due to resetting the metrics, suboptimal routing is likely if there are multiple points of redistribution—how can you know which redistribution point is closest to the actual destination? You can’t
• Not only that, it’s possible to create routing loops if the redistribution metric at some point is better than the original metric; in the above example, a route that originates in EIGRP as an external that is then redistributed into OSPF and back into EIGRP could have a better metric at the inbound OSPF redistribution point than at the original redistribution point… broken
Multiple Points of Redistribution
• There are three primary methods used to prevent this routing loop:
• Redistributing live routing information in only one direction
• Filtering routes based on the prefixes advertised to prevent feedback loop
• Filtering routes using routing tags to prevent feedback
Redistribution Design
• The possible routing loop previously discussed occurs because of the mutual redistribution between protocols at multiple points; if the redistribution occurs in only one direction, the invalid improvement in metric cannot occur
• One way to change this from a mutual redistribution scenario to one-way redistribution is to provide the routes in one direction either through summarization or through a redistributed static
• One direction uses dynamic redistribution and other direction is static
Multiple Points of Redistribution
[Diagram: EIGRP and OSPF routing domains]
Multiple Points of Redistribution
• Redistribute a static in one direction, and between protocols in the other direction
• A route is injected into EIGRP as an external; this route is then redistributed into OSPF by RtrB
• The route is transmitted to RtrA through OSPF; the route is not redistributed back into EIGRP, since redistribution between OSPF and EIGRP is not configured
Live Routing Information in Only One Direction
router ospf 100
 redistribute eigrp 100 metric 10
router eigrp 100
 redistribute static metric 10000 1000 255 1 1500
ip route 10.2.0.0 255.255.0.0 serial 0/0
[Diagram: RtrA and RtrB between 10.1.0.0/16 and 10.2.0.0/16; 10.1.1.0/24 shown with Metric 10, Metric 25, Metric 2816000, and Metric 2560256]
• Another way to eliminate the routing loop is by filtering routes that originate in EIGRP from being relearned back into EIGRP from the OSPF-EIGRP redistribution point; in other words, if the route originated in EIGRP, we have no need to accept the route back into EIGRP after it’s been redistributed into OSPF
• This filtering can be done with distribute-lists if the prefixes involved are easily identified blocks—if not, we have other techniques
Multiple Points of Redistribution
Multiple Points of Redistribution
• Configure access/prefix lists which match the address ranges used in each section of the network and filter based on these ACLs
• The route is injected into EIGRP as an external; this route is then redistributed into OSPF by RtrB
• The route is transmitted through OSPF and reaches RtrA
• The route is now blocked by distribute list 20, which breaks the routing loop
Filtering Based on Prefixes
router ospf 100
 redistribute eigrp 100 metric 10
 distribute-list 10 out
router eigrp 100
 redistribute ospf 100 metric 1000 1 255 1 1500
 distribute-list 20 out
access-list 10 permit 10.1.0.0 0.0.255.255
access-list 20 permit 10.2.0.0 0.0.255.255
[Diagram: RtrA and RtrB between the EIGRP domain (10.1.0.0/16) and the OSPF domain (10.2.0.0/16); 10.1.1.0/24 shown with Metric 10, Metric 25, Metric 2816000, and Metric 2560256]
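The effect of the two access lists above can be checked with a small Python sketch. It is an illustration only, not a model of everything a router does: `acl_permits` is a hypothetical helper that applies one standard ACL entry (base address plus wildcard mask) to a route's network address, and a `False` result stands in for the implicit deny at the end of each list.

```python
import ipaddress

def acl_permits(prefix: str, base: str, wildcard: str) -> bool:
    """Return True if the route's network address matches base under the wildcard mask."""
    network = int(ipaddress.IPv4Network(prefix).network_address)
    base_bits = int(ipaddress.IPv4Address(base))
    # Wildcard 1-bits are "don't care", so invert them to get the bits that must match.
    care_bits = ~int(ipaddress.IPv4Address(wildcard)) & 0xFFFFFFFF
    return (network & care_bits) == (base_bits & care_bits)

# access-list 10 (applied under OSPF) permits the 10.1.0.0/16 block.
print(acl_permits("10.1.1.0/24", "10.1.0.0", "0.0.255.255"))  # True
# access-list 20 (applied under EIGRP) permits only the 10.2.0.0/16 block,
# so 10.1.1.0/24 is not passed back into EIGRP.
print(acl_permits("10.1.1.0/24", "10.2.0.0", "0.0.255.255"))  # False
```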
• If the prefixes generated in each routing protocol are not in well-defined blocks that are easily specified in an access-list, filtering can also be done using route-tags
• The tags can be applied as a route is redistributed from EIGRP into OSPF (as well as OSPF into EIGRP)
• Filters can be set to deny the routes from re-entering the domains in which they originated
• This is a far more flexible filtering method since it doesn’t require that the routes being filtered be in well-defined blocks; any route that has the tag set will be matched in the filtering process
Multiple Points of Redistribution
[Diagram label: 10.2.0.0/16]
Multiple Points of Redistribution
• Set route tags when redistributing between the protocols; deny tagged routes at the redistribution point
• The route is injected into EIGRP as an external; it is redistributed into OSPF by RtrB and a tag is set
• The route is transmitted to RtrA through OSPF
• The route is blocked from being redistributed into EIGRP because of the route tag
Filtering Based on Tags
router ospf 100
 redistribute eigrp 100 metric 10 route-map usetags
router eigrp 100
 redistribute ospf 100 metric 1000 1 255 1 1500 route-map usetags
route-map usetags deny 10
 match tag 1000
route-map usetags permit 20
 set tag 1000
[Diagram: RtrA and RtrB between the EIGRP domain (10.1.0.0/16) and the OSPF domain; 10.1.1.0/24 shown with Metric 10, Metric 25, Metric 2816000, and Metric 2560256]
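The logic of the usetags route-map above can be traced in a few lines of Python. This is a simplified model, not router code: the `Route` class and the `usetags` function are hypothetical names, and only the tag value 1000 comes from the configuration.

```python
from dataclasses import dataclass, replace
from typing import Optional

LOOP_TAG = 1000  # tag value used by the usetags route-map

@dataclass(frozen=True)
class Route:
    prefix: str
    tag: int = 0

def usetags(route: Route) -> Optional[Route]:
    """Two-entry route-map: deny routes already tagged, tag everything else."""
    if route.tag == LOOP_TAG:              # sequence 10: deny, match tag 1000
        return None
    return replace(route, tag=LOOP_TAG)    # sequence 20: permit, set tag 1000

# The EIGRP external is redistributed into OSPF at RtrB and picks up the tag.
into_ospf = usetags(Route("10.1.1.0/24"))
# The same route reaches RtrA through OSPF and is offered back to EIGRP.
back_into_eigrp = usetags(into_ospf)
print(into_ospf, back_into_eigrp)
```

Because the same route-map is used in both directions, any route that has crossed one redistribution point is refused at the next one, whatever its prefix.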
• Zero successor routes
• Duplicate Router-ID
Additional Propagation Problems
Zero Successor Routes
• Zero successor routes happen when EIGRP attempts to install a route in the RIB and it is rejected
• This normally occurs when there is a route in the RIB with a better admin distance than EIGRP
• EIGRP cannot propagate zero successor routes to peers
[Diagram: RtrA, RtrB, RtrC; 10.10.10.10/32]
RtrA#show ip eigrp topology
P 10.10.10.10/32, 1 successors, FD is 128256
        via Connected, Loopback0
RtrB#show ip eigrp topology
P 10.10.10.10/32, 0 successors, FD is Inaccessible
        via 10.1.1.30 (409600/128256), Ethernet0/0
RtrB#show ip route static
     10.0.0.0/8 is variably subnetted
S       10.10.10.10/32 [1/0] via 10.1.2.2
RtrC#show ip eigrp topology | incl 10.10.10.10
RtrC#
• A route that shows up in the topology table with 0 successors and an FD of inaccessible is unusable by EIGRP for some reason; typically that means that we received the prefix from a peer and when we tried to install it into the RIB, the RIB rejected the installation
• Normally, the RIB rejects a route installation when there is already a better route in the table—remember our discussion about admin distance earlier in this presentation? Here’s what happens when we’re the loser with a worse admin distance
• One of the side-effects of a route being flagged as 0 successor/inaccessible is that we aren’t permitted to advertise a route to peers that we didn’t succeed at installing in the RIB; if we couldn’t put the route in the RIB, we can’t verify that the destination will be reachable, thus we can’t tell our peers about it
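The sequence described above (offer to the RIB, rejection, no advertisement) can be sketched in Python. This is a deliberately simplified model: `RibRoute` and `try_install` are hypothetical names, and the only inputs taken from the example are the static route's distance of 1 and EIGRP's internal distance of 90.

```python
from dataclasses import dataclass

@dataclass
class RibRoute:
    source: str
    admin_distance: int

def try_install(rib: dict, prefix: str, candidate: RibRoute) -> bool:
    """Install the candidate only if it beats the current route's admin distance."""
    current = rib.get(prefix)
    if current is None or candidate.admin_distance < current.admin_distance:
        rib[prefix] = candidate
        return True
    return False  # RIB rejects the route; equal distances are simply rejected in this sketch

# RtrB already holds a static route (distance 1) for the prefix.
rib = {"10.10.10.10/32": RibRoute("static", 1)}
installed = try_install(rib, "10.10.10.10/32", RibRoute("eigrp", 90))

successors = 1 if installed else 0   # shown as "0 successors, FD is Inaccessible"
advertise_to_peers = installed       # a route that failed installation is not advertised
print(successors, advertise_to_peers)  # 0 False
```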
Zero Successor Routes
Zero Successor Routes
• Another case of zero successor routes happens with overlapping EIGRP Autonomous Systems
• If a prefix is known in both AS with the same AD, only one AS can install it in the RIB
• The EIGRP AS that failed to install it will not propagate the route to its peers
[Diagram: RtrA, RtrB, RtrC, RtrD; 10.10.10.10/32]
RtrA#show ip eigrp 1 topology
P 10.10.10.10/32, 1 successors, FD is 128256
        via Connected, Loopback0
RtrA#show ip eigrp 2 topology
P 10.10.10.10/32, 1 successors, FD is 128256
        via Connected, Loopback0
RtrB#show ip eigrp 1 topology
P 10.10.10.10/32, 1 successors, FD is 409600
        via 10.1.1.30 (409600/128256), Ethernet0/0
RtrB#show ip eigrp 2 topology
P 10.10.10.10/32, 0 successors, FD is Inaccessible
        via 10.1.1.30 (409600/128256), Ethernet0/0
RtrD#show ip eigrp 2 topology | incl 10.10.10.10
RtrD#
RtrC#sh ip eigrp 1 topology
P 10.10.10.10/32, 1 successors, FD is 435200
        via 10.1.2.2 (435200/409600), Ethernet0/0
• Another case of the zero successor route problem is when there are overlapping EIGRP Autonomous Systems. Sometimes a network designer will define two EIGRP Autonomous Systems with the same network statement, covering the same interfaces and expect both of them to propagate the associated routes; this is often done during AS transition time when they’re trying to combine Autonomous Systems
• The problem is that a prefix cannot be installed in the RIB by both Autonomous Systems at the same time, so one will be accepted and one will be rejected; the one that is rejected will not be sent to peers in the losing Autonomous System
Zero Successor Routes
Duplicate Router-ID
• A problem previously limited to redistributed routes happens when there are duplicate router-ids
• Please note that this problem is no longer limited to external routes!
• In this example, RtrB sees the route redistributed from RIP on RtrA just fine, but RtrC does not see it
[Diagram: RtrA, RtrB, RtrC; 10.1.1.0/24 via RIP; 192.168.1.1]
router eigrp 100
 redistribute rip
 default-metric ....
RtrC#show ip route 10.1.1.0
RtrC#
RtrB#show ip route 10.1.1.0
....
10.1.1.0/24 via [RtrA]
Duplicate Router-ID
• Looking at RtrB’s topology table, we can see the originating router ID field in the external route is set to 192.168.1.1
• But, that’s RtrC’s loopback address!
[Diagram: RtrA, RtrB, RtrC]
RtrB#show ip eigrp topology 10.1.1.0
IP-EIGRP (AS 1) topology entry for 10.1.1.0/24
....
External data:
  Originating router is 192.168.1.1
• In the example above, we have a problem where routes are being redistributed into the network, but one router elsewhere in the network is refusing to install the routes into its topology table or routing table
• As the slide shows, the problem is that the router-id of the redistributing router matches the router-id of the router refusing to install the route
• To block routing loops, a router doing redistribution will not accept a route from a neighbor if it is the one that originated it via redistribution; this is known by the originating router field in the external data section of the topology table entry
• Since the router-ids are the same on RtrA and RtrC, RtrC thinks RtrA’s external routes originated on RtrC and it rejects them
Duplicate Router-ID
• Impact on route installation
• EIGRP includes the router ID of the originating router in external routing information in older code, and both internal and external routes in newer code
• If a router receives a route with a router ID matching its own local router ID, it discards the route
• This prevents routing loops/SIA for routes originated locally but also learned from others
• You need to make sure your router-ids are unique by either not duplicating addresses on loopback interfaces or explicitly defining the router-id!
Duplicate Router-ID
• The EIGRP router ID is derived from
• The router-id command
• Highest IP address on a loopback interface
• Highest non-loopback interface IP address if no loopbacks
• NOTE: Interface used for router-id must reside in the same routing table as the EIGRP process creating the router-id
• Once the router ID is set, it won’t be changed without manual intervention, even if the interface from which it’s taken is removed from the router
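The selection order listed above can be written as a short Python function. This is a sketch under stated assumptions: `select_router_id` is a hypothetical helper, the addresses are examples, and it does not model the routing-table restriction or the fact that the ID stays fixed once chosen.

```python
import ipaddress
from typing import Optional, Sequence

def select_router_id(configured: Optional[str],
                     loopbacks: Sequence[str],
                     other_interfaces: Sequence[str]) -> Optional[str]:
    """Pick a router ID: explicit command, else highest loopback, else highest other address."""
    if configured:
        return configured
    for addresses in (loopbacks, other_interfaces):
        if addresses:
            # Compare as IPv4 addresses, not as strings.
            return str(max(ipaddress.IPv4Address(a) for a in addresses))
    return None

print(select_router_id(None, ["10.0.0.1", "192.168.1.1"], ["203.0.113.1"]))  # 192.168.1.1
print(select_router_id("1.1.1.1", ["192.168.1.1"], []))                      # 1.1.1.1
```

Two routers that happen to share the same highest loopback address end up with the same ID unless one of them is set explicitly.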
Duplicate Router-ID
Environments containing mixed versions of IOS can create interesting problems if duplicate router-ids exist in your network
If a prefix is learned through a path that runs code that doesn’t support the internal router-id, the information gets stripped out
This could cause inconsistent results!
Duplicate Router-ID
• The EIGRP router ID was added to internal routes as part of release 5 EIGRP
• Code before release 5 exchanges older TLV types which do not contain the router-id for Internal routes
• If a router running release 5 or later sends updates to an older peer (pre-release 5), it sends the older TLV type without the router-id in the packet
• This can cause inconsistencies in your network
Duplicate Router-ID
Duplicate Router-ID
In the network shown here, RtrA receives updates for 10.1.1.0/24 through two paths
Because the internal router-id is not supported by RtrC, RtrA would install the prefix learned through RtrC
The prefix learned through RtrB would be rejected due to the duplicate router-id
This could also mean that the prefix works sometimes and not others!
[Diagram: RtrA, RtrB, RtrC, RtrD; 10.1.1.0/24; one router labeled Pre-Rel 5 and three labeled Rel 5; router ID 192.168.1.1 shown on two routers]
• In the preceding slide, a prefix could be learned via two different paths. If the prefix is learned from one peer, it doesn’t contain the router-id and is accepted and installed. If the prefix is learned from the other peer, the router-id is included in the update and the prefix is rejected due to the duplicate router-id.
• This example shows a straightforward result: one path is rejected, which stops the equal-cost load balancing that the topology should provide.
• In some cases, a prefix might be installed when certain links are up (or down) but not when other links are up (or down). This inconsistency could be extremely difficult to troubleshoot because the symptoms appear random.
• Just remember, router-ids need to be unique!
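The mixed-version behavior described above reduces to a small check, sketched here in Python. The function name `accept_update` is hypothetical, and a missing router-id is represented as `None` to stand for an update relayed by a peer that uses the older TLV type.

```python
from typing import Optional

def accept_update(local_router_id: str, originating_router_id: Optional[str]) -> bool:
    """Reject an update whose originating router ID equals the local router ID."""
    if originating_router_id is None:
        # Older TLV: no router-id on the route, so there is nothing to compare.
        return True
    return originating_router_id != local_router_id

LOCAL_ID = "192.168.1.1"  # RtrA, which shares its router ID with the originator

via_rtrc = accept_update(LOCAL_ID, None)            # relayed by the pre-release 5 peer
via_rtrb = accept_update(LOCAL_ID, "192.168.1.1")   # relayed by the release 5 peer
print(via_rtrc, via_rtrb)  # True False
```

The same prefix is accepted or rejected depending only on which path delivered it, which is why the symptoms can look intermittent.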
Duplicate Router-ID
Duplicate Router-ID
• In older versions of Cisco IOS® software, the only way to find out a router’s router ID was to go to an adjacent router and look for some redistributed route from that router
router# show ip eigrp topology 10.1.1.0 255.255.255.0
IP-EIGRP (AS 7): topology entry for 10.1.1.0/24
  State is Passive, Query origin flag is 1, 1 Successor(s), FD is 2560000256
  Routing Descriptor Blocks:
  10.1.2.1 (Ethernet0/0), via 10.1.2.1, Send flag is 0x0
      Composite metric is (2560000256/0), Route is External
      ....
      External data:
        Originating router is 192.168.1.1
        AS number of route is 0
        External protocol is RIP, external metric is 1
        Administrator tag is 0 (0x00000000)
• In current versions of Cisco IOS software, the router ID is listed in the output of show ip eigrp topology
router-1# show ip eigrp topology
IP-EIGRP Topology Table for AS(7)/ID(192.168.1.1)
....
• If your event log is large enough, or things are happening slowly enough, you might also see the problem indicated in your event log
1 02:30:18.591 Ignored route, metric: 192.168.1.0 2297856
2 02:30:18.591 Ignored route, neighbor info: 10.1.1.0/24 Serial0/3
3 02:30:18.591 Ignored route, dup router: 192.168.1.1
Duplicate Router-ID
Duplicate Router-ID
• To determine the EIGRP release to see if it supports the internal router-id, use the command “show eigrp plugin”
• If the command is rejected, you’re definitely running older code
• If the command is accepted, look at the version of EIGRP in the output to see if it’s before or after release 5
RtrA#show eigrp plugin
EIGRP feature plugins:::
eigrp-release : 5.01.00 : Portable EIGRP Release
: 2.02.34 : Source Component Release(Portable EIGRP Release(rel5_1))
igrp2 : 3.00.00 : Reliable Transport/Dual Database
external-client : 1.02.00 : Service Distribution Client Support
…Snip …
• To determine the release of EIGRP you’re running on a router, use the command “show eigrp plugin” or “show eigrp plugin detail”. These commands are explained a little later in this presentation but I thought you could use the info here, as well.
• Older code (Release3 and earlier, IIRC) did not support the “show eigrp plugin” command and will reject it as invalid.
• Newer code will show you the version of EIGRP you’re running in addition to which subsystems/features are loaded into EIGRP.
Duplicate Router-ID
• Resource Depletion
• Problems with Links > 1G
• IPv6 Unique Issues
Advanced Issues
• Shared Media
• Scalability
Resource Depletion
• EIGRP by default will use up to 50% of the link bandwidth for EIGRP packets
• This parameter is manually configurable by using the command:
• ip bandwidth-percent eigrp <AS-number> <nnn>
EIGRP Resource Depletion
• Prior to CSCdi36031 (roughly 10.3), EIGRP had huge problems with bandwidth depletion—EIGRP would use all of the link at the expense of layer 2 keep alives!
• With CSCdi36031, there was a significant change in the way we build and transmit packets that still affects low-speed links; occasionally the TAC still gets cases or questions about the behavior, so it’s worth explaining here so you can design accordingly
• The biggest part of the change was to implement packet pacing based on the defined bandwidth of the interface. This packet pacing puts enough gaps between the packets to ensure that we don’t overwhelm the interface with EIGRP packets causing the layer two keepalives to be missed and the interface to drop
• There are some circumstances where the bandwidth on the interface isn’t a good measure of what pacing should be, however
EIGRP Resource Depletion
Bandwidth over Multipoint Interfaces
• EIGRP over multipoint interfaces such as DMVPN and mGRE has to share the available bandwidth among peers
• EIGRP uses the bandwidth on the main interface divided by the number of neighbors on that interface to get the bandwidth available per neighbor
• Hint: IWAN Topology
• Some interface types appear to EIGRP to be a shared interface but in reality they’re provided by a point-to-point mechanism (like DMVPN, Dynamic Multipoint Virtual Private Networks, which uses mGRE for transport), and the capacity of the underlying network may not match the bandwidth defined on the interface; for example, if an mGRE outbound interface is Gigabit Ethernet but the tunnels traverse an ISP’s network, we can’t actually send at Gigabit rates and expect all of the packets to be delivered at that rate
• EIGRP divides the defined bandwidth by the number of peers seen, giving each approximately equal shares of the available bandwidth; this may not be exactly right, but it’s the best we can guess
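A rough Python sketch of the arithmetic: the interface bandwidth is split evenly among the peers, and EIGRP limits itself to a percentage of that share. Applying the 50% default to the per-neighbor share is an assumption of this sketch, and `per_neighbor_budget` and the example figures are hypothetical.

```python
def per_neighbor_budget(interface_bw_kbps: float, neighbors: int,
                        bandwidth_percent: float = 50.0) -> tuple:
    """Return (per-neighbor bandwidth, EIGRP share of it), both in kbps."""
    per_neighbor = interface_bw_kbps / neighbors
    eigrp_share = per_neighbor * bandwidth_percent / 100.0
    return per_neighbor, eigrp_share

# Tunnel configured with 1,000,000 kbps and 800 peers (example values).
share, eigrp = per_neighbor_budget(1_000_000, 800)
print(share, eigrp)  # 1250.0 625.0
```

The per-peer figure shrinks as peers are added, which is why the configured bandwidth should reflect what the circuit can really deliver.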
EIGRP Resource Depletion
• Set the bandwidth on the multipoint interface to a value that most closely defines the actual circuit speed and ability to deliver traffic to the peers.
• Don’t artificially scale the bandwidth configuration down or up for traffic control!
• Use the DMVPN Per-Tunnel QoS feature to control bandwidth, along with the EIGRP “ip bandwidth-percent” command.
Bandwidth over Multipoint Interfaces
• For DMVPN, other resources can also be depleted (and often are)
• Several processes are involved, each of which has overhead and can run into resource depletion
• NHRP, IPsec, etc.
• Make sure interface queues and buffers are tuned to minimize/eliminate drops
• Monitor EIGRP process queues for drops in large scale scenarios with instability
• No way to adjust these queues; amount of information or pacing needs to be adjusted via good summarization (with summary metrics!) or filtering
Bandwidth over Multipoint Interfaces
Show Commands
RtrB#show ip eigrp traffic
EIGRP-IPv4 VR(IWAN) Address-Family Traffic Statistics for AS(1)
Hellos sent/received: 822997/205769
Updates sent/received: 4/4
Queries sent/received: 1/1
Replies sent/received: 1/1
Acks sent/received: 6/4
SIA-Queries sent/received: 0/0
SIA-Replies sent/received: 0/0
Hello Process ID: 632
PDM Process ID: 627
Socket Queue: 0/10000/2/0 (current/max/highest/drops)
Input Queue: 0/10000/2/0 (current/max/highest/drops)
Show IP EIGRP Traffic
Show Commands
• Show ip eigrp traffic can be very useful to see what kind of activity has been occurring on your network; some of the most interesting information includes:
• Input queue high water mark—this shows how many packets have been queued inside of the router to be processed—when packets are received from the IP layer, EIGRP accepts the packets and queues them up for processing; if the router is so busy that the queue isn’t getting serviced, the queue could build up—unless there are drops, there is nothing to worry about, but it can give you an indication of how hard EIGRP is working
• SIA-queries sent/received: this is useful to determine how often the router has stayed active for at least one and one-half minutes (as mentioned in the earlier section on stuck-in-active routes); this number should be relatively low. If it’s not, it’s taking a bit of time for replies to be received for queries, and it might be worth exploring why
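When checking many routers, the queue counters can be pulled out of saved command output with a few lines of Python. This is a sketch, not a supported tool: `parse_queues` is a hypothetical helper, and the sample text is copied from the output shown in this section.

```python
import re

# Matches lines such as "Input Queue: 0/10000/2/0 (current/max/highest/drops)"
QUEUE_RE = re.compile(r"^\s*(Socket|Input) Queue:\s*(\d+)/(\d+)/(\d+)/(\d+)")

def parse_queues(output: str) -> dict:
    """Extract current/max/highest/drops for the Socket and Input queues."""
    queues = {}
    for line in output.splitlines():
        match = QUEUE_RE.match(line)
        if match:
            current, maximum, highest, drops = (int(v) for v in match.groups()[1:])
            queues[match.group(1)] = {
                "current": current, "max": maximum,
                "highest": highest, "drops": drops,
            }
    return queues

sample = """Socket Queue: 0/10000/2/0 (current/max/highest/drops)
Input Queue: 0/10000/2/0 (current/max/highest/drops)"""

queues = parse_queues(sample)
# A nonzero drop count is the value worth alerting on; the high-water mark is informational.
print(queues["Input"]["highest"], queues["Input"]["drops"])  # 2 0
```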
Show IP EIGRP Traffic
• DMVPN/mGRE is a very popular technology which presents some interesting challenges in troubleshooting and avoiding problems; from an EIGRP perspective, the mGRE Tunnel interface is multicast, and our sending process (and pacing) is based on the assumption that we send a multicast packet out to all of the peers on the Tunnel interface. In reality, the multicast packet is replicated by the NHRP code (Next Hop Resolution Protocol), which then delivers the replicated packets to (normally) IPsec for encryption before delivering on the Tunnel
• Each step along the way has queues and buffers and other resources required to do the job of packet delivery; each of these resources is a potential place of resource depletion. The following slides give you a few commands we’ve found useful when troubleshooting large scale DMVPN environments
EIGRP Resource Depletion
• Many commands are useful to see how DMVPN/mGRE are behaving resource-wise
Show ip nhrp summary
Show ip nhrp multicast
Show ip nhrp traffic
Show buffers | include failures
Show interface tunnel 1 | include nput
Show interface Gig 0/1 | include nput (for outbound interface used by tunnel)
Show buffer input-interface Gig 0/1 header
Show ip eigrp topo summary
Show ip eigrp traffic
Show ip eigrp interface detail tunnel 1
EIGRP Resource Depletion
hub2#show ip nhrp summary
IP NHRP cache 800 entries, 288000 bytes
0 static 800 dynamic 0 incomplete
hub2#show ip nhrp
106.1.0.2/32 via 106.1.0.2
Tunnel2 created 1w6d, expire 00:14:37
Type: dynamic, Flags: unique registered
NBMA address: 4.1.0.2
106.1.0.6/32 via 106.1.0.6
Tunnel2 created 1w6d, expire 00:14:37
Type: dynamic, Flags: unique registered
NBMA address: 4.1.0.6
EIGRP Resource Depletion
hub2#show ip nhrp multicast
I/F NBMA address
Tunnel2 2.2.2.2 Flags: static
Tunnel2 4.9.0.142 Flags: dynamic
Tunnel2 4.9.0.10 Flags: dynamic
Tunnel2 4.9.0.58 Flags: dynamic
Tunnel2 4.2.0.150 Flags: dynamic
Tunnel2 4.2.0.22 Flags: dynamic
hub2#show ip nhrp traffic
Tunnel2: Max-send limit:65535Pkts/10Sec, Usage:0%
Sent: Total 3014400
0 Resolution Request 0 Resolution Reply 0 Registration Request
3014400 Registration Reply 0 Purge Request 0 Purge Reply
0 Error Indication 0 Traffic Indication
Rcvd: Total 3014400
0 Resolution Request 0 Resolution Reply 3014400 Registration Request
0 Registration Reply 0 Purge Request 0 Purge Reply
0 Error Indication 0 Traffic Indication
EIGRP Resource Depletion
hub2#show buffers | include failures
0 failures (0 no memory)
0 failures (0 no memory)
0 failures (0 no memory)
0 failures (0 no memory)
0 failures (0 no memory)
hub2#show interface tunnel 2 | include nput
Last input 00:00:00, output 00:00:01, output hang never
Input queue: 0/4096/0/0 (size/max/drops/flushes); Total output drops: 0
30 second input rate 146000 bits/sec, 189 packets/sec
198829081 packets input, 1620463316 bytes, 0 no buffer
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort
EIGRP Resource Depletion
hub2#show interface gig 0/3 | include nput
output flow-control is XON, input flow-control is unsupported
Last input 00:00:00, output 00:00:01, output hang never
Input queue: 1/4096/0/0 (size/max/drops/flushes); Total output drops: 0
5 minute input rate 150000 bits/sec, 173 packets/sec
199154019 packets input, 128670948 bytes, 0 no buffer
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
0 watchdog, 664336 multicast, 0 pause input
EIGRP Resource Depletion
hub2#show buffer input-interface gig 0/3 header
Buffer information for Middle buffer at 0x7A59A7C
data_area 0x7890CFE4, refcount 1, next 0x0, flags 0x280
linktype 7 (IP), enctype 1 (ARPA), encsize 14, rxtype 1
if_input 0x5A9DA04 (GigabitEthernet0/3), if_output 0x0 (None)
inputtime 1w6d (elapsed 00:00:00.004)
outputtime 00:00:00.000 (elapsed never), oqnumber 65535
datagramstart 0x7890D02A, datagramsize 108, maximum size 756
mac_start 0x7890D02A, addr_start 0x7890D02A, info_start 0x0
network_start 0x7890D038, transport_start 0x7890D04C, caller_pc 0x22DCC58
source: 4.19.0.82, destination: 2.2.2.2, id: 0xCA33, ttl: 252, prot: 47
EIGRP Resource Depletion
EIGRP Resource Depletion
hub2#show ip eigrp topology summary
EIGRP-IPv4 Topology Table Summary for AS(1)/ID(3.21.66.1)
Head serial 1, next serial 8011
1679 routes, 0 pending replies, 0 dummies
Enabled on 877 interfaces, 800 neighbors present on 1 interfaces
Quiescent interfaces:
  Tu2
hub2#show ip eigrp traffic
EIGRP-IPv4 Traffic Statistics for AS(1)
Hellos sent/received: 214154002/409376558
Updates sent/received: 99630/12123
Queries sent/received: 0/0
Replies sent/received: 0/0
Acks sent/received: 2609/119749
SIA-Queries sent/received: 0/0
SIA-Replies sent/received: 0/0
Hello Process ID: 260
PDM Process ID: 259
Socket Queue: 0/2000/864/0 (current/max/highest/drops)
Input Queue: 0/2000/864/0 (current/max/highest/drops)
DMVPN with Multiple Next Hops
• In DMVPN phase 2 setup, a hub router can learn two or more equal-cost paths to a site.
• However, the hub router will only advertise one of the paths to other spokes in the DMVPN network.
Implication:
• Spoke-to-spoke tunnels will only be established to a single router and cannot leverage this multi-router setup
• Phase 3 of DMVPN mitigates the issue
[Diagram: 10.1.5.0/24 behind routers .5 and .6]
10.1.5.0 [90/18600] via 172.16.1.5, Tunnel1
                    via 172.16.1.6, Tunnel1
10.1.5.0 [90/32600] via 172.16.1.5, Tunnel1
• While this isn't a route propagation problem, per se, it's still a situation that may take you by surprise and therefore may be useful to understand
• One of the designs being implemented with DMVPN uses multiple paths from the hub to reach spoke subnets. This could be two paths to the same spoke or through two spokes (as shown on the previous slide)
• The problem is that EIGRP still uses normal distance vector rules and sends updates based on the top topology table entry. Even if there are two equal cost paths, EIGRP sends updates based on the top entry. Since historically distance vector protocols only report how far they are metric-wise from a destination, this has been enough. Now it’s not quite enough information
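The consequence of advertising only the top entry can be seen in a tiny Python sketch. The data structure and the `advertised_path` function are hypothetical; the next hops and the metric 18600 are the values shown on the slide.

```python
# Hub topology table entry for the spoke prefix: two equal-cost paths.
hub_paths = [
    {"next_hop": "172.16.1.5", "metric": 18600},
    {"next_hop": "172.16.1.6", "metric": 18600},
]

def advertised_path(paths: list) -> dict:
    """Return the single path an update is based on: the first of the best-metric entries."""
    best = min(p["metric"] for p in paths)
    equal_cost = [p for p in paths if p["metric"] == best]
    return equal_cost[0]  # the second equal-cost entry is never reported to the spokes

print(advertised_path(hub_paths)["next_hop"])  # 172.16.1.5
```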
DMVPN with Multiple Next Hops (Multiple IWAN Hubs)
DMVPN with Multiple Next Hops
• One way to avoid this situation is to use a different hub for each preferred spoke path
• Each hub could use either a metric type command (offset-list) or distance (if all internals) to prefer one path or the other in order to propagate the desired next-hop information to the other spokes
[Diagram: 10.1.5.0/24 behind routers .5 and .6]
10.1.5.0 [90/18600] via 172.16.1.5
10.1.5.0 [90/32600] via 172.16.1.5
                    via 172.16.1.6
10.1.5.0 [90/18600] via 172.16.1.6
• There is also new functionality (add-path) allowing a route to be advertised with multiple next hops, addressing this very situation: Multiple DMVPN hubs advertising the same prefix. If your code version supports this functionality, it is the ideal solution and the most elegant.
• Without the add-path functionality, though, it's not that hard to design your network so that the two paths to the prefix at the spoke are learned through two different hubs rather than through one
• You could then use metric (offset-list possibly) or the distance command to have each hub prefer a different path to the spoke prefix. It could then advertise the next-hop value associated with its choice, avoiding the problem.
DMVPN with Multiple Next Hops
• This might also have some relevance to EIGRP OTP; see the additional information slides later.
• IWAN Architecture addresses this in several different ways:
• DMVPN Phase 3 – no longer requiring the routes to be advertised for spoke-to-spoke tunnel establishment
• PFRv3’s ability to look at the additional routing information for path selection and load sharing
• Ability to carry additional next hops and advertise shared prefixes from the hub site/dual POPs.
DMVPN with Multiple Next Hops
• Scaled Bandwidth Problem
• Delay Granularity Problem
• Path Selection Problem
• The Solution – Wide Metrics
• Remaining Issues
Problems with Links > 1G
Problems with Links > 1G
• EIGRP was created when FastEthernet was considered fast
• In order to simplify metric calculation and comparison, bandwidth was scaled
• Scaled Bandwidth is 10^7/BW (with BW in kbps)
• This worked fine until the BW was 10^7 or greater
• 10G is 10^7 in kbps, so the scaled value becomes 10^7/10^7 or a value of 1
• If the interface bandwidth is > 10G, the value becomes 0!
• For an 11G link, 10^7/(1.1 × 10^7) is < 1, therefore the value becomes 0 (integer math)
• While the code is smart enough to discount a BW value of 0 for the calculation, it can still cause invalid routing decisions
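The integer division described above is easy to reproduce in Python. The function name `scaled_bandwidth` is just a label for this sketch; the formula and the kbps unit are the ones given on the slide.

```python
def scaled_bandwidth(bw_kbps: int) -> int:
    """Classic EIGRP scaled bandwidth: 10^7 / BW with BW in kbps, using integer math."""
    return 10**7 // bw_kbps

for label, bw_kbps in [("100M", 100_000), ("1G", 1_000_000),
                       ("10G", 10_000_000), ("20G", 20_000_000)]:
    print(label, scaled_bandwidth(bw_kbps))
# 100M 100
# 1G 10
# 10G 1
# 20G 0
```

Everything faster than 10G collapses to 0, so the bandwidth term can no longer separate, for example, a 20G link from a 40G link.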
Scaled Bandwidth Problem
Problems with Links > 1G
• EIGRP was created in the early 1990s and was an enhancement to IGRP, which dated back to the 1980s
• In order to scale the metrics larger for EIGRP, the IGRP metric was increased from 24 to 32 bits in size. This wasn’t enough.
• Similar to IGRP, we scaled the Bandwidth value in order to ease transport and comparisons. By making the stored bandwidth value 10^7/BW, larger bandwidth values create a smaller number, making comparisons easier. (Bigger bandwidth = smaller value/better metric)
• Of course, when you go beyond 10^7 as a bandwidth value, the formula breaks down. 10G is 10^7 (since it’s measured in kbps). Links above 10G weren’t envisioned in the early 1990s, and that lack of foresight causes us to have to deal with it now.
Scaled Bandwidth Problem
Problems with Links > 1G
Scaled Bandwidth Problem
C6K-MCORE-1#show interface port-channel 12
Port-channel12 is up, line protocol is up (connected)
…
MTU 9216 bytes, BW 20000000 Kbit, DLY 10 usec,
…
C6K-MCORE-1#show ip eigrp topology 10.1.1.0/30
EIGRP-IPv4 Topology Entry for AS(1)/ID(1.0.0.23) for 10.1.1.0/30
State is Passive, Query origin flag is 1, 1 Successor(s), FD is 256
Descriptor Blocks:
0.0.0.0 (Port-channel12), from Connected, Send flag is 0x0
Composite metric is (256/0), route is Internal
Vector metric:
Minimum bandwidth is 0 Kbit
Total delay is 10 microseconds
…
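The FD of 256 in the output above can be reproduced from the interface values. The sketch below assumes the default K values (K1 = K3 = 1, all others 0), under which the classic composite metric is 256 × (scaled bandwidth + delay in tens of microseconds); `classic_metric` is a name chosen for this example.

```python
def classic_metric(min_bw_kbps: int, total_delay_usec: int) -> int:
    """Classic EIGRP composite metric with default K values (assumed here)."""
    scaled_bw = 10**7 // min_bw_kbps        # integer math, becomes 0 above 10G
    scaled_delay = total_delay_usec // 10   # delay is carried in tens of microseconds
    return 256 * (scaled_bw + scaled_delay)

# Port-channel12: BW 20000000 Kbit, DLY 10 usec
print(classic_metric(20_000_000, 10))  # 256
```

The whole metric for the 20G port-channel comes from the delay term, because the bandwidth term has already been truncated to 0.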
Problems with Links > 1G
• Interface delay values are defined by the core code
• EIGRP just pulls in the value provided by the infrastructure
• Just like EIGRP with scaled BW, the core code didn’t foresee such high BW links
• 1 Gigabit – Delay is 10 microseconds
• 10 Gigabit – Delay is 10 microseconds
• 20 Gigabit – Delay is 10 microseconds
• Etc.
• Since EIGRP uses aggregate delay to make path selection decision, this is trouble
• A 1G, 10G, and 20G link all show the same delay value, so the 20G link isn’t preferred
Delay Granularity Problem
Problems with Links > 1G
• While this problem wasn’t really EIGRP’s fault, the cause is similar to the Bandwidth problem described above.
• The Interface Delay values were determined back in the 1980s and the range of Delay values in microseconds became too small too quickly.
• Since a 1G link had a delay value of 10 microseconds and the value is specified in 10s of microseconds, you can’t get any smaller delay values on the interface.
• Since EIGRP uses this Delay value in the metric calculations, we were unable to tell a 1G link from a 2G link from a 10G link, etc.
• Not our fault, but our problem to fix!
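A short Python illustration of the granularity limit, using the interface delay values listed in this section. The names are chosen for this sketch only.

```python
# Interface delay in microseconds, as listed for each link speed in this section.
INTERFACE_DELAY_USEC = {"1G": 10, "10G": 10, "20G": 10}

def classic_delay_units(delay_usec: int) -> int:
    """Classic EIGRP carries delay in tens of microseconds."""
    return delay_usec // 10

units = {speed: classic_delay_units(d) for speed, d in INTERFACE_DELAY_USEC.items()}
print(units)  # {'1G': 1, '10G': 1, '20G': 1}
```

With 1 as the smallest nonzero unit, there is no room left to give a faster link a lower delay than a 1G link.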
Delay Granularity Problem
Problems with Links > 1G
• Due to the problem with Scaled BW and Delay granularity, path selection is negatively impacted
• Links could be seen as equal cost when they’re not
• Higher BW links are not preferred over lower BW links
• While this will not create routing loops, sub-optimal routing could easily occur
Path Selection
Problems with Links > 1G
• The lack of Delay granularity and the inability of EIGRP to store and compare BW values > 10G combined to cause EIGRP to make invalid or suboptimal routing decisions when links above 1G are used.
• Routing loops will not occur since all routers make the same (wrong) decision, but you can’t use the higher speed links in your network in the way you want. There’s a reason you put in a 10G link (or higher) and we should pay attention to that!
• Unfortunately, the standard formula, interface delay values, and transport wouldn’t allow us to fix the problem using the old metric components.
Path Selection
Problems with Links > 1G
Path Selection
[Diagram: RtrD reaches 10.1.1.0/24 at RtrA through RtrB over 10G links or through RtrC over 1G links]
RtrD sees two equal cost paths to reach 10.1.1.0/24
One path definitely better, but not preferred
– Underutilize 10G path?
– Overrun 1G path?
Problems with Links > 1G
• In the preceding diagram, EIGRP on RtrD will be unable to distinguish the difference in metric for the paths to 10.1.1.0/24 through RtrB and RtrC.
• Since the 10G link will have the same scaled bandwidth value as the 1G link, and both will have a delay value of 10 microseconds, both paths are seen as being the same.
• This is either inefficient or terrible. EIGRP on RtrD will attempt to install equal cost paths through RtrB and RtrC, using both equally in the forwarding plane.
• Since the path through RtrC cannot handle nearly as much traffic as the path through RtrB, it’s likely either the path through RtrC will be overwhelmed or the path through RtrB will be underutilized.
Path Selection
Problems with Links > 1G
Path Selection
[Diagram: RtrA, RtrB, RtrC, RtrD, RtrE; 10.1.1.0/24; links labeled 10G, 10G, 10G, 1G, 1G]
An even worse example:
RtrD sees one best path to reach 10.1.1.0/24 through RtrC
The path taken is definitely worse than the other, but EIGRP is not aware and thus makes the wrong routing decision
Problems with Links > 1G
• In the past, the only solution to this problem was to manually set the delay values so that you could impact the path selection to make sense in your network
• This wasn’t a good answer, mainly because it was administratively burdensome and hard to design/implement
• We also saw cases where customers built in routing loops by changing the delay values differently on different ends of the links!
• OSPF solved this a few years ago by defining a reference bandwidth that must be set the same throughout an area
• EIGRP doesn’t have areas and we couldn’t make customers upgrade all routers in the network in order to support a different metric method
• How can we solve this elegantly?
The Old Solution
Problems with Links > 1G
• EIGRP has implemented a feature called “Wide Metric” support, which no longer scales the BW and dynamically creates the delay in picoseconds
• This feature is backward compatible (though mixed environments can have suboptimal routing)
• Implemented in EIGRP Release 8.0 and later (Release 18.0 is current!)
• NOTE: only available when configured in Named Mode!
• On by default when configured in Named Mode
http://www.cisco.com/c/en/us/products/collateral/ios-nx-os-software/enhanced-interior-gateway-routing-protocol-eigrp/whitepaper_C11-720525.html
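The effect of dropping the bandwidth scaling can be sketched with a little arithmetic. This is a minimal illustration, not router code: the constants (a 10^7 reference bandwidth and a 65536 scale factor), the rule that derives delay from bandwidth on fast links, and the function names are assumptions based on the published description of wide metrics, so verify them against your release. It assumes default K values and considers a single link only.

```python
# Hedged sketch of wide-metric arithmetic for one link, default K values.
# All constants and names are assumptions for illustration.

EIGRP_BANDWIDTH = 10**7    # reference bandwidth in kbit/s
EIGRP_WIDE_SCALE = 65536   # wide-metric scale factor
EIGRP_DELAY_PICO = 10**6   # picoseconds per microsecond


def default_delay_ps(bw_kbps: int) -> int:
    """Delay derived from bandwidth (assumed rule for high-speed links).

    At 1G this yields 10,000,000 ps, which is 10 microseconds.
    """
    return (EIGRP_BANDWIDTH * EIGRP_DELAY_PICO) // bw_kbps


def wide_metric(bw_kbps: int, delay_ps: int) -> int:
    """Composite metric as throughput plus latency, no bandwidth truncation."""
    throughput = (EIGRP_BANDWIDTH * EIGRP_WIDE_SCALE) // bw_kbps
    latency = (delay_ps * EIGRP_WIDE_SCALE) // EIGRP_DELAY_PICO
    return throughput + latency


for name, bw in (("1G", 1_000_000), ("10G", 10_000_000)):
    print(name, wide_metric(bw, default_delay_ps(bw)))
# 1G 1310720
# 10G 131072
```

Under these assumptions the 10G link gets a metric ten times lower than the 1G link, so the two paths in the earlier diagrams would no longer look identical.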
The Solution
Problems with Links > 1G
Router# show eigrp address-family ipv4 topology
EIGRP-IPv4 VR(WideMetric) Topology Entry for AS(4453)/ID(3.3.3.3) for 100.1.0.0/16
  State is Passive, Query origin flag is 1, 1 Successor(s), FD is 262144, RIB is 2048
  Descriptor Blocks:
  2.0.0.2 (Ethernet0/2), from 2.0.0.2, Send flag is 0x0
      Composite metric is (262144/196608), route is Internal
      Vector metric:
        Minimum bandwidth is 20000000 Kbit
        Total delay is 3000000 picoseconds
        Reliability is 255/255
        Load is 1/255
        Minimum MTU is 1500
        Hop count is 2
        Originating router is 100.1.1.1
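The output above shows two numbers for the same route: "FD is 262144" and "RIB is 2048". A short sketch of how they relate, assuming the wide metric is divided by a RIB scale factor of 128 before being handed to the routing table (the default is an assumption here; the `rib_metric` helper is hypothetical):

```python
# Hedged sketch: relating the wide composite metric to the value shown as "RIB".
RIB_SCALE = 128  # assumed default scale factor


def rib_metric(composite: int, rib_scale: int = RIB_SCALE) -> int:
    """Scale a wide composite metric down to the routing-table metric."""
    return composite // rib_scale


print(rib_metric(262144))  # FD taken from the sample output -> 2048
```

When comparing a topology entry with the routing table on a Named Mode router, expect the routing table value to be the scaled one.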
The Solution
Problems with Links > 1G
• Possible Routing Loop with IOS and IOS/XR PEs
• Granularity Problem for Offset-lists between Classic and Wide Metrics
Remaining Issues
Problems with Links > 1G
Possible Routing Loop with IOS and IOS/XR PEs
• PE - Provider Edge router
• CE - Customer Edge router
• VPN - Virtual Private Network
• VRF - Virtual Routing and Forwarding instance (routing table)
• Backdoor link - link between sites which doesn’t use the VPN
[Figure: EIGRP Site 1 and EIGRP Site 2 each connect through a CE to a PE of the Service Provider VPN; a backdoor link joins the two CEs directly.]
Problems with Links > 1G
• In an MPLS/VPN PE/CE environment with both IOS and IOS/XR routers running wide metrics, IOS/XR evaluates the cost community attribute of the prefixes differently than IOS
• This can cause the IOS and IOS/XR PE to make different decisions about the exit point for a prefix, potentially leading to a routing loop
• The workaround is to use 32 bit metrics on the IOS/XR router
• Before CSCul96943: test metric-type 32-bit
• After CSCul96943: metric-type 32-bit
Possible Routing Loop with IOS and IOS/XR PEs
Problems with Links > 1G
• One PE is running an IOS/XR version with Wide Metric support and another is running IOS with Wide Metric support
• There is also a backdoor link between the two EIGRP sites
• PE1 could prefer PE2 and PE2 could prefer PE1
• This can cause a routing loop!
• Define metric-type 32-bit on IOS/XR!
Possible Routing Loop with IOS and IOS/XR PEs
[Figure: PE1 (IOS) and PE2 (IOS/XR) in the Service Provider network each connect to a CE, one in EIGRP Site 1 and one in EIGRP Site 2; a backdoor link joins the two sites.]
Problems with Links > 1G
• Wide metrics and Classic metrics use different scales for delay value
• Classic – 10s of microseconds
• Wide – picoseconds
• Offset-lists use changes in delay to “offset” the metric
• If the change is made on a wide-metric router and then sent to a classic router, granularity is lost and a different value than expected may be received
• Workaround – verify changes to metric after implementing offset-list and adjust as needed!
• Note that this could impact match statements if you’re looking for a specific value!
Granularity Problem for Offset-lists between Classic and Wide Metrics
Problems with Links > 1G
Granularity Problem for Offset-lists between Classic and Wide Metrics
[Figure: R1 (Release 12) and R2 (Release 6), with network 10.1.23.1/24.]
R1#show run | sec router eigrp
router eigrp 100
 network 10.0.0.0
 offset-list 10 out 1000
R1#show run | sec access-list
access-list 10 permit any
Before:
R2#show ip eigrp topology 10.1.23.0/24
EIGRP-IPv4 Topology Entry for AS(100)/ID(10.1.21.1) for 10.1.23.0/24
  State is Passive, Query origin flag is 1, 1 Successor(s), FD is 307200
  Descriptor Blocks:
  10.1.12.2 (Ethernet0/0), from 10.1.12.2, Send flag is 0x0
      Composite metric is (307200/281600), route is Internal
      …
After:
R2#show ip eigrp topology 10.1.23.0/24
EIGRP-IPv4 Topology Entry for AS(100)/ID(10.1.21.1) for 10.1.23.0/24
  State is Passive, Query origin flag is 1, 1 Successor(s), FD is 307200
  Descriptor Blocks:
  10.1.12.2 (Ethernet0/0), from 10.1.12.2, Send flag is 0x0
      Composite metric is (307968/282368), route is Internal
      …
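The granularity loss can be illustrated with plain arithmetic. The sketch below assumes a delay expressed in picoseconds is truncated when converted to classic units of 10 microseconds (whether a given release truncates or rounds is an assumption), and the delay value used is hypothetical. It also checks the before/after output above: the metric on R2 moved by 768 rather than by the configured 1000, and 768 is a whole multiple of the classic scale factor of 256.

```python
# Hedged sketch: converting a wide (picosecond) delay to classic units.
PS_PER_CLASSIC_UNIT = 10 * 10**6  # 10 microseconds expressed in picoseconds
CLASSIC_SCALE = 256               # classic composite metric multiplier


def to_classic_units(delay_ps: int) -> tuple[int, int]:
    """Return (classic delay units, picoseconds lost to truncation)."""
    return divmod(delay_ps, PS_PER_CLASSIC_UNIT)


# Hypothetical delay that does not fall on a 10-microsecond boundary.
units, lost = to_classic_units(34_567_890)
print(units, lost)  # 3 4567890

# Values taken from the before/after output on R2.
before, after = 307200, 307968
step = after - before
print(step, step // CLASSIC_SCALE, step % CLASSIC_SCALE)  # 768 3 0
```

Because the classic router can only see whole delay units, always confirm the received metric after applying an offset-list rather than assuming the configured value arrives intact.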
• IPv6 Router-ID
• IPv6 Interfaces
• IPv6 Peer Addresses
• IPv6 Shutdown
IPv6 Unique Issues
• EIGRP IPv6 topology table entries include the router-id of the originating router, just like IPv4
• In previous versions, external routes only
• Now true for Internal routes, as well
• The router-id used by IPv6 is a four-byte number
• Actually uses IPv4 address, just like IPv4!
• Why use an IPv4 address for EIGRP IPv6 router-id?
• Seemed overkill to use 128-bit number for router-id
• Routers may not have a global IPv6 address defined
IPv6 Router-id
• EIGRP IPv6 topology table entries have router-id fields, just like their IPv4 equivalents; this router-id is used to identify the place in the network that the prefix was originated to help eliminate routing/redistribution loops
• When designing the EIGRP IPv6 implementation, we discussed whether to use one of the IPv6 addresses for the router-id or whether to use an IPv4 address as we did for IPv4 EIGRP
• Why use an IPv4 address for the router-id?
• First, the router-id is only used as a label to identify where a prefix originated—an IPv4 address is 32 bits and an IPv6 address is 128 bits; to sacrifice 128 bits for the router-id for every prefix would significantly decrease the number of prefixes we could fit into an Update packet, all for a field that is effectively a label
• Additionally, IPv6 has the interesting characteristic that a router isn’t required to have any globally reachable addresses; since a router could contain only link-local addresses, the usefulness of an IPv6 address as a router-id could be minimal—it may or may not give you any indication of where the originating router exists in the network
IPv6 Router-id
• EIGRP IPv6 will not work without a router-id
• EIGRP uses the same router-id selection process used by IPv4
• Highest IPv4 address on a Loopback interface
• If no Loopbacks, highest IPv4 address on non-loopback interface
• If no IPv4 address is available to use, manually set the router-id under the “ipv6 router eigrp x” configuration
• “eigrp router-id 1.1.1.1”
• Note that in some older versions, the leading “eigrp” in the command above wasn’t required.
• Be intentional with your configuration and define your router-id’s!
IPv6 Router-id
• IPv6 uses exactly the same router-id selection criteria as IPv4
• If there are one or more loopback interfaces, use the highest IP address on a loopback interface
• If no loopback interfaces, use the highest IP address on whatever interfaces you have
• Interesting limitations exist, however
• The interfaces must exist in the same table (IPv4 version) as the IPv6 instance; in other words, if the interfaces belong to a VRF (Virtual Routing and Forwarding table), then they aren’t eligible to be used as router-ids for IPv6
• Note that once a router-id is selected, it isn’t changed to reflect changes in addresses; in other words, if a “no ip address” is issued for the address used as a router-id, it won’t change the router-id used until the next reload
• If there aren’t any interfaces available with IPv4 addresses, manually specify the address using the “eigrp router-id x.x.x.x” command
• Some versions don’t use the leading “eigrp”; we haven’t been terribly consistent with this, but we will be from now on… really
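The selection rules above can be sketched as a small function. This is an illustration only: the `Interface` type and `select_router_id` name are hypothetical, treating a manually configured router-id as taking precedence is an assumption, and the sketch does not model the fact that a selected router-id is kept until the next reload.

```python
# Hedged sketch of the router-id selection rules described above.
from dataclasses import dataclass
from ipaddress import IPv4Address
from typing import Optional, Sequence


@dataclass
class Interface:
    name: str
    ipv4: Optional[str] = None  # None when the interface has no IPv4 address
    in_vrf: bool = False        # VRF interfaces are not eligible


def select_router_id(interfaces: Sequence[Interface],
                     manual: Optional[str] = None) -> Optional[IPv4Address]:
    if manual is not None:  # assumed: explicit configuration wins
        return IPv4Address(manual)
    eligible = [i for i in interfaces if i.ipv4 and not i.in_vrf]
    loopbacks = [i for i in eligible if i.name.lower().startswith("loopback")]
    pool = loopbacks or eligible  # prefer loopbacks when any exist
    if not pool:
        return None  # no router-id available: EIGRP IPv6 will not work
    return max(IPv4Address(i.ipv4) for i in pool)


interfaces = [
    Interface("Loopback0", "10.0.0.1"),
    Interface("Loopback1", "10.0.0.9"),
    Interface("Loopback2", "172.16.0.1", in_vrf=True),
    Interface("GigabitEthernet0/0", "192.168.1.1"),
]
print(select_router_id(interfaces))  # 10.0.0.9
```

A `None` result corresponds to the case where the router-id has to be set by hand.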
IPv6 Router-id
• EIGRP does not use a network statement to specify its IPv6 interfaces
• Interfaces may not have a routable address (may have only link-local)
• Need some other way to identify which should run EIGRP
• Two different methods exist, depending on the configuration method
• Classic mode uses the “ipv6 eigrp <AS>” command on each interface
• Named mode defaults to having all interfaces enabled
• Will need to “shutdown” under af-interface in order to not include some interfaces
IPv6 Interfaces
IPv6 Interfaces
Classic Mode
R2#conf t
Enter configuration commands, one per line. End with CNTL/Z.
R2(config)#ipv6 router eigrp 1
R2(config-rtr)#interface s4/0
R2(config-if)#ipv6 eigrp 1
R2#sh run interface s4/0
interface Serial4/0
 ipv6 address 1:2::2/64
 ipv6 eigrp 1
end
R2#sh run | section ipv6 router
ipv6 router eigrp 1
R2#sh ipv6 eigrp topology
P 1:2::/64, 1 successors, FD is 2169856
        via Connected, Serial4/0
Named Mode
R1# conf t
Enter configuration commands, one per line. End with CNTL/Z.
R1(config)#router eigrp foo
R1(config-router)#address-family ipv6 unicast auto 1
R1#sh run | section router eigrp
router eigrp foo
 !
 address-family ipv6 unicast autonomous-system 1
  !
  topology base
  exit-af-topology
 exit-address-family
R1#sh eigrp address-family ipv6 topology
P 1:1::/64, 1 successors, FD is 1735175958
        via Connected, Serial4/0
• EIGRP sends hellos sourced from the link-local interface address
• Note that IPv6 interfaces are not required to have globally routable addresses
• Many routers will NOT have a global address on an interface that is facing other routers!
• A couple of interesting differences due to this use of link-local addresses
• Addresses in neighbor tables are only useful if you already know which routers are reachable on each interface
• Since a router can assign the same link-local address to multiple interfaces, you may see multiple peers with the same address if you peer across multiple links!
IPv6 Peer Addresses
• Output of “show ipv6 eigrp neighbors” with neighbor seen through multiple interfaces:
IPv6 Peer Addresses
EIGRP-IPv6 Neighbors for AS(1)
H   Address                     Interface   Hold  Uptime    SRTT   RTO   Q    Seq
                                            (sec)           (ms)         Cnt  Num
3   Link-local address:         Gi1/12      14    00:00:05  8      200   0    20
    FE80::216:9CFF:FE6E:2C40
2   Link-local address:         Gi1/13      13    00:00:06  7      200   0    16
    FE80::216:9CFF:FE6E:2C40
1   Link-local address:         Gi1/11      13    00:00:06  13     200   0    17
    FE80::216:9CFF:FE6E:2C40
0   Link-local address:         Gi1/10      13    00:00:06  4      200   0    18
    FE80::216:9CFF:FE6E:2C40
• EIGRP IPv6 sends hellos (and actually all EIGRP packets) with the source address of the link-local address on the interface
• Note for those not familiar: IPv6 has two primary address classes
• Link-local - not routable and significant only on the link (broadcast domain) on which they exist
• Global - advertised and used for remote reachability
• Since a peer relationship is limited to the link the peers see each other on, routers use the link-local address for communication
• An interesting aspect of this is that some platforms will assign the same link-local address to multiple interfaces
• This works just great since the address is only meaningful on the local link
• This does cause the above behavior, however—note that all four peers seen on this router have exactly the same address! Not a problem, but probably unexpected
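If you collect neighbor data in a script, the practical consequence is that the address alone is not a usable key. A minimal sketch using the four entries from the output above (the variable names are hypothetical):

```python
# Hedged sketch: link-local peers must be keyed by (interface, address).
from collections import Counter

neighbors = [
    ("Gi1/12", "FE80::216:9CFF:FE6E:2C40"),
    ("Gi1/13", "FE80::216:9CFF:FE6E:2C40"),
    ("Gi1/11", "FE80::216:9CFF:FE6E:2C40"),
    ("Gi1/10", "FE80::216:9CFF:FE6E:2C40"),
]

# Keyed by address only, the four peers collapse into a single entry.
by_address = Counter(address for _, address in neighbors)

# Keyed by (interface, address), each peering stays distinct.
by_interface_and_address = set(neighbors)

print(len(by_address), len(by_interface_and_address))  # 1 4
```

The same reasoning applies when reading the table by eye: identify the peer by the interface column first, then by the address.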
IPv6 Peer Addresses
• When EIGRP for IPv6 was originally coded, the router process was defined to be Shutdown by default
• This was done because of the lack of network statement and the fact that interface commands could start EIGRP before filtering was defined
• This default behavior has confused users and testers so we’ve changed it in the latest code
• Just be warned that the default behavior for shutdown in EIGRP IPv6 is different in different versions!
IPv6 Shutdown
• Many, many years ago (in an IOS far, far away) we made the decision to have EIGRP IPv6 start off shutdown by default; since there are no network statements in EIGRP IPv6, as soon as an interface was defined to use EIGRP IPv6, processes would start, peers would form, and routes would be exchanged
• Since all of this could happen before filtering was put into place, we decided it was safer to require the user to explicitly do “no shut” under the ipv6 router statement to kick off the processing
• We’ve since come to regret this decision since it’s so drastically different than IPv4 EIGRP behavior; in recent code, we’ve changed the default so that an explicit “no shut” is no longer required. Be warned of our inconsistency
IPv6 Shutdown
• EIGRP IPv6 does not support VRFs in “classic” configuration mode
• There is a new mode of defining EIGRP you’ll be hearing a lot more about in the future that is being used for all advanced features
• Named mode doesn’t include the AS number on the router line and uses address-family commands to create EIGRP processes
• Also with Named mode configuration, all EIGRP interface commands are performed under the router command rather than on the interfaces themselves
• This change was made because of the way processes started in the old configuration method; we required more flexibility in when information was provided
IPv6 and VRF Support
IPv6 and VRF Support
Router#sh run | sec router eigrp
router eigrp foo
 !
 address-family ipv6 unicast autonomous-system 1
  !
  topology base
  exit-af-topology
 exit-address-family
 address-family ipv4 unicast autonomous-system 2
  !
  topology base
  exit-af-topology
  network 10.0.0.0
 exit-address-family
• This slide is a preview of coming attractions in addition to an explanation of how to define EIGRP IPv6 on a VRF. For the last several years, advanced features have been coded to use “named mode” configuration rather than classic
• We ran into limitations trying to put many features in classic mode because of the assumptions made when the router command was issued. We decided then to start a new mode for advanced features but not take away “classic” EIGRP configuration for the more “normal” or simple implementations
• Making things more complicated when you wanted to do the same things you’ve been doing for years made no sense to us. Using a new mode for more complicated features seemed more reasonable
• We hope you agree!
IPv6 and VRF Support
• EIGRP Over-the-Top
• NX-OS
EIGRP ETC
EIGRP OTP
EIGRP Over-the-Top Peering
• Control Plane peering is accomplished with EIGRP “neighbor” statement
• Full Mesh, Partial Mesh, EIGRP Route-Reflectors
• Data Plane packet delivery is accomplished with LISP encapsulation
• No longer dependent upon the Service Provider – Natively maintain the network’s EIGRP internal routing
Extend the EIGRP Network over MPLS or Internet Natively
[Figure: CE-1 and CE-2, both in EIGRP AS 4453, exchange Hellos across the Service Provider MPLS VPN; data between them is carried with LISP encapsulation.]
router eigrp ROCKS
 address-family ipv4 unicast auto 4453
  neighbor 192.168.2.2 Serial1/0 remote 100 lisp-encap
  ...
router eigrp ROCKS
 address-family ipv4 unicast auto 4453
  neighbor 192.168.1.1 Serial1/0 remote 100 lisp-encap
  ...
WAN Virtualization using OTP
• EIGRP “end-to-end” solution with:
NO special requirement on Service Provider
NO special requirement on Enterprise
NO routing protocol on CE/PE link
NO need for route redistribution
NO need for default or static routes
EIGRP Support for WAN Transparency
PE / CE                      EIGRP OTP
BGP Complexity               EIGRP Simplicity
Carrier Involvement          Carrier Independence
Multiple Redistribution      Zero Redistribution
Public & Unsecure            Private & Secure
Availability
- ASR 1000 Series: XE 3.10 / IOS 15.3(3)S
- ISR, ISR G2, 7200 Series: IOS 15.4(3)T
WAN Virtualization using OTP
• No additional routing protocol to administer
• No routing protocol is needed on CE to PE link
• All user traffic appears as unicast IP data packets
• Limits impact on the Service Provider’s network
• Customer routes are NOT carried in MPLS VPN backbone
• Customer route flaps do not generate BGP convergence events
• Smaller BGP routing tables, smaller memory foot print, lower CPU usage
• Works with existing PE equipment
• Multivendor PE support
• No upgrade requirements for PE or any MPLS VPN backbone router
Service Provider Benefits
WAN Virtualization using OTP
• Single routing protocol solution
• Simple configuration and deployment for both IPv4 and IPv6
• Convergence does not depend on the Service Provider
• Only the CE needs to be upgraded
• Routes are carried over the Service Provider’s network, not through it
• No artificial limitation on number of routes being exchanged between sites
• Convergence speed not impacted by BGP timers
• Works with both traditional managed and non-managed internet connections
• Complements an L3 Any-to-Any architecture (optional hair pinning of traffic)
• Support for multiple MPLS VPN connections
• Support for connections not part of the MPLS VPN (“backdoor” links)
Enterprise Benefits
OTP – How it Works
• CE Routers have ‘private’ and ‘public’ interfaces
• Private interfaces use addresses that are part of the Enterprise network
• Public interfaces use addresses that are part of the Service Provider’s network
• For OTP neighbors to form, the Public interface must also be included in the EIGRP topology database (covered by the “network” command in IPv4)
• Sourcing packets from/to the public interface address eliminates the need for static routes
• EIGRP packets which are normally sent via multicast (Hello, Update, etc.) are sent unicast via the public interface
• Site-to-site traffic is encapsulated using LISP and sent unicast from/to the public interface address
Properties of CE Routers
OTP – How it Works
• EIGRP creates the LISP0 interface, and starts sending Hello packets to remote site via the Public interface
• Once neighborship is formed, EIGRP sends and receives routes from the peer, installing the routes into the RIB with the nexthop interface LISP0
• Traffic that arrives on the router destined for the remote side is first sent to LISP0
• LISP encaps the traffic and then sends it to the Public interface
EIGRP, LISP, and RIB – Oh My!
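[Figure: inside a CE router, EIGRP sends route updates to the RIB; site-to-site traffic from the inside interface goes through LISP0 to the public interface, while default traffic goes to the public interface directly.]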
OTP – Data Plane
• Why use LISP to encapsulate the data as it traverses the WAN?
• It’s “stateless” tunneling, so it:
• Requires NO tunnels to configure or manage
• Is transparent to the endpoints and to the IP core
• Supports both hair-pin and site-to-site traffic
• Supports both IPv4 and IPv6 traffic
• Provides an overlay solution that enables transparent extension of network across WAN
• IP-based for excellent transport independence
• Service provider picks optimal traffic path for site to site data
• Supports multicast and VLANs to allow for future enhancements
LISP Data Encapsulation
OTP – Data Plane
• Path MTU needs to be considered when deploying OTP
• LISP encapsulation adds 36 bytes (20 IP + 8 UDP + 8 LISP) for IPv4 (56 bytes for IPv6)
• This could be significant for small packets (e.g., a VoIP packet)
• LISP handles packet fragmentation
• If the DF bit is set, it will generate an ICMP Destination Unreachable message
• LISP does not handle packet reassembly
• As a consequence, it is required to adjust the MTU to ensure the control plane does not fragment
• Best practice - set the MTU to 1444 bytes (or lower).
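The 1444-byte recommendation follows from the overhead figures above. A small sketch of the arithmetic, assuming a 1500-byte path MTU across the provider (an assumption; measure your own path) and using hypothetical helper names:

```python
# Hedged sketch: largest inner packet that fits after LISP encapsulation.
OUTER_IP_HEADER = {"ipv4": 20, "ipv6": 40}  # bytes
OUTER_UDP_HEADER = 8
LISP_HEADER = 8


def lisp_overhead(transport: str) -> int:
    """Bytes added by the outer IP, UDP, and LISP headers."""
    return OUTER_IP_HEADER[transport] + OUTER_UDP_HEADER + LISP_HEADER


def max_inner_mtu(path_mtu: int, transport: str) -> int:
    """Largest inner packet that avoids fragmentation on the given path."""
    return path_mtu - lisp_overhead(transport)


for transport in ("ipv4", "ipv6"):
    print(transport, lisp_overhead(transport), max_inner_mtu(1500, transport))
# ipv4 36 1464
# ipv6 56 1444
```

Choosing 1444 covers the larger IPv6 overhead, so the same setting is safe for either transport on a 1500-byte path.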
LISP Data Encapsulation Properties
OTP – How it Works
• EIGRP OTP is deployed in one of two ways
• Remote Routers
• Used for configuring a router to peer with one specific neighbor
• Forms a full mesh topology
• Configured with the command
neighbor [ipv4/v6 address] [interface] remote [max-hops] lisp-encap [lisp-id]
• Route Reflectors
• Used to configure a router as a ‘hub’
• Forms a Hub and Spoke topology
• Configured with the command
remote-neighbors source [interface] unicast-listen lisp-encap
Modes of Deployment
• For more on how to deploy and operate EIGRP OTP, please see the session focused on this solution, BRKRST-3336.
• Can be deployed as a series of full and partial mesh static remote peers – or – utilizing one or more devices as an EIGRP Route-Reflector.
EIGRP OTP
EIGRP OTP
• MTU internal to the provider/LISP Interface (normal tunneling MTU issues)
• LISP Interface State
• Be sure the provider is properly delivering packets across their network
• Ping from public (WAN) interface to remote public interface
• Ping with full mtu and df: ping 192.168.10.1 size 1500 df-bit
• If utilizing route-reflectors, be especially diligent about the Split-horizon and next-hop-self
• For multiple next hops in dual route-reflector scenario – add path option is needed.
• Network statement has to cover public (WAN) interface – else this can result in recursive route lookup failures for dual home sites
Common ‘Gotchas’
NX-OS
• Show commands
• VPC L3 Peering
EIGRP and NX-OS
EIGRP and NX-OS
• NX-OS has a slightly different CLI.
• Configuration Guides provide the latest CLI for enabling features in EIGRP, but some of the show and troubleshooting commands are different and samples have been included.
Show commands
EIGRP and NX-OS
lassie# sh ip eigrp
IP-EIGRP AS 1 ID 1.1.48.10 VRF default
Process-tag: 1
Instance Number: 1
Status: running
Authentication mode: none
Authentication key-chain: none
Metric weights: K1=1 K2=0 K3=1 K4=0 K5=0
IP proto: 88 Multicast group: 224.0.0.10
Int distance: 90 Ext distance: 170
Max paths: 8
Number of EIGRP interfaces: 1 (0 loopbacks)
Number of EIGRP passive interfaces: 0
Number of EIGRP peers: 3
Graceful-Restart: Enabled
Stub-Routing: Disabled
NSF converge time limit/expiries: 120/0
NSF route-hold time limit/expiries: 240/0
NSF signal time limit/expiries: 20/0
Redistributed max-prefix: Disabled
Show commands
EIGRP and NX-OS
lassie# sh ip ei int
IP-EIGRP interfaces for process 1 VRF default
                  Xmit Queue    Mean   Pacing Time   Multicast    Pending
Interface   Peers Un/Reliable   SRTT   Un/Reliable   Flow Timer   Routes
Vlan48      3     0/0           483    0/0           3624         0
Hello interval is 5 sec
Holdtime interval is 15 sec
Next xmit serial <none>
Un/reliable mcasts: 0/9 Un/reliable ucasts: 32/15
Mcast exceptions: 3 CR packets: 2 ACKs suppressed: 1
Retransmissions sent: 1 Out-of-sequence rcvd: 0
Authentication mode is not set
Use multicast
Classic/wide metric peers: 3/0
Show commands
EIGRP and NX-OS
lassie# sh ip ei event ?
cli EIGRP CLI related events
errors Error log of EIGRP
fsm FSM log of EIGRP
msgs Message log of EIGRP
packet Packet log of EIGRP
rib RIB log of EIGRP
statistics State and size of the buffers
Show commands
lassie# sh ip ei traff
IP-EIGRP Traffic Statistics for AS 1 VRF default
Hellos sent/received: 18246/54575
Updates sent/received: 23/35
Queries sent/received: 0/1
Replies sent/received: 1/0
Acks sent/received: 32/30
Input queue high water mark 3, 0 drops
SIA-Queries sent/received: 0/0
SIA-Replies sent/received: 0/0
Hello Process ID: (no process)
PDM Process ID: (no process)
EIGRP and NX-OS
lassie# deb ip eigrp ?
<CR>
1 EIGRP process tag
A.B.C.D Network to display information about
A.B.C.D/LEN IP prefix <network>/<length>, e.g., 192.168.0.0/16
all Enable all EIGRP debugs
fsm EIGRP Dual Finite State Machine events/actions
graceful-restart EIGRP Graceful-Restart
ha EIGRP HA
mpls IP-EIGRP MPLS debugging
neighbor IP-EIGRP neighbor debugging
notifications IP-EIGRP event notifications
packets EIGRP packets
route-map EIGRP route-map
summary IP-EIGRP summary route processing
transmit EIGRP transmission events
urib IP-EIGRP URIB interaction event debugging
vrf-events IP-EIGRP VRF event debugging
Show commands
EIGRP and NX-OS
lassie# sh ip eigrp internal ?
<CR>
> Redirect it to a file
>> Redirect it to a file in append mode
event-history Event History of EIGRP
library-info Show various event logs of library
mem-stats Show memory allocation statistics
| Pipe command output to filter
lassie# sh ip eigrp internal
IP-EIGRP AS 1 ID 1.1.48.10 VRF default
Process-tag: 1
Instance Number: 1
UUID: 1090519344
Process Linux Pid: 7084
Show commands
EIGRP and NX-OS
[Figure: lab topology with Nexus switches Lassie, Harmon, and Jules (N5K-5548P) and Pointer (N5K-5548UP), plus ASR1001 routers Charlie2 and Charlie3.]
VPC L3 Peering
• Be careful! TTL issues can occur and some scenarios are unsupported.
• Refer to: http://www.cisco.com/c/dam/en/us/td/docs/switches/datacenter/sw/design/vpc_design/vpc_best_practices_design_guide.pdf
• Specifically: Layer 3 and vPC: Guidelines and Restrictions
• Troubleshooting Methodology
• Troubleshooting the Essentials
• Neighbor Formation
• Route Computation & Propagation
• Advanced Issues
• Resource Depletion
• High Speed Links
• IPv6 Specifics
Summary: Troubleshooting EIGRP Networks
• EIGRP contains many tools and techniques that can be used to keep the EIGRP network running smoothly and efficiently
• What now?
• Armed with Information
• Preventative Steps
• Hopefully this session taught you how to make the best use of these tools and techniques and translate them in a manner which will be relevant to your network design
Summary: Troubleshooting EIGRP Networks
Recommended Reading for BRKRST-2331
ASIN: 1578701651
ISBN: 0201657732
ISBN 1587051877
Open-EIGRP: draft-savage-eigrp-02
ISBN-13: 978-1587144233
Thank you