1 how scalable is the capacity of (electronic) ip routers? nick mckeown professor of electrical...
DESCRIPTION
3 Why it’s hard for capacity to keep up with link rates Packet processing power Link rateTRANSCRIPT
1
Hi gh Pe rf orm a nceSwi tc hi ng and Routi ngTe lec om Ce nter W orks ho p: Sep t 4 , 19 97.
How scalable is the capacity of(electronic) IP routers?
Nick McKeownProfessor of Electrical Engineering and Computer Science, Stanford [email protected]://www.stanford.edu/~nickm
2
Why ask the question?
Widely held assumption: Electronic IP routers will not keep up with link capacity.
Background: Router Capacity = (number of lines) x (line-rate)
• Biggest router capacity 4 years ago ~= 10Gb/s• Biggest router capacity 2 years ago ~= 40Gb/s• Biggest router capacity today ~= 160Gb/s• Next couple of generations: ~1-40Tb/s
3
Why it’s hard for capacity to keep up with link rates
Packet processing
power
Link rate
4
0,1
1
10
100
1000
10000
1985 1990 1995 2000
Spec
95In
t CPU
resu
lts
Why it’s hard for capacity to keep up with link rates
0,1
1
10
100
1000
10000
1985 1990 1995 2000
Fibe
r Cap
acity
(Gbi
t/s)
TDM DWDM
Packet processing Power Link Speed
2x / 2 years 2x / 7 months
Source: SPEC95Int & David Miller, Stanford.
5
Instructions per packet
time
Instructionsper packet
What we’d like: (more features)QoS, Multicast, Security, …
What will happen
6
What limits a router’s capacity?• It’s a packet switch:
– Must be able to buffer every packet for an unpredictable amount of time.
• Hop-by-hop routing:– Once per ~1000bits it must index into
a forwarding table with ~100k entries.
• [Optional QoS support– Very complex per-packet processing]
Limited by memory random access time
Limited by memory random access time
7
What really limits the capacity?
• At first glance: the random access time to memory.
• In fact, this can be solved by more parallelism (replication and pipelining).
• Dilemma: But parallelism requires more power and space.
8
What really limits the capacity?
Suggestion: – Don’t assume optics will oust CMOS in
IP routers because of increased system capacity.
– It might oust CMOS because of reduced (power x space) for a given capacity.
9
Outline
• A brief history of IP routers• Where they will go next
– Incorporating optics into routers– More parallelism (with or without
optics)
10
RouteTableCPU Buffer
Memory
LineInterface
MAC
LineInterface
MAC
LineInterface
MAC
Typically <0.5Gb/s aggregate capacity
First Generation RoutersShared Backplane
Line Interface
CPUMemory
11
First Generation RoutersQueueing Structure: Centralized Shared
MemoryLarge, single dynamically allocated memory buffer:N writes per “cell” timeN reads per “cell” time.
Limited by memory bandwidth.
Numerous work has proven and made possible QoS:– Fairness– Delay Guarantees– Delay Variation Control– Loss Guarantees– Statistical Guarantees
MemoryController
Output 1Output 2
Output N
Input 1Input 2
Input N
12
Second Generation RoutersRouteTableCPU
LineCard
BufferMemory
LineCard
MAC
BufferMemory
LineCard
MAC
BufferMemory
FwdingCache
FwdingCache
FwdingCache
MAC
BufferMemory
Typically <5Gb/s aggregate capacity
13
Second Generation RoutersQueueing Structure: Combined Input and
Output Queueing
Bus
1 write per “cell” time 1 read per “cell” timeRate of writes/reads determined by bus speed
14
Third Generation Routers
LineCard
MAC
LocalBuffer
Memory
CPUCard
LineCard
MAC
LocalBuffer
Memory
Switched Backplane
Line Interface
CPUMemory Fwding
Table
RoutingTable
FwdingTable
Typically <50Gb/s aggregate capacity
15
Arbiter
Third Generation RoutersQueueing Structure
Switch
1 write per “cell” time 1 read per “cell” timeRate of writes/reads determined by switch
fabric speedup
Per-flow/class or per-output queues (VOQs)
Per-flow/class or per-input queues
Flow-controlbackpressure
16
Third Generation Routers
19” or 23”
7’
• Size-constrained: 19” or 23” wide.
• Power-constrained.
17
Complex linecards
PhysicalLayer
Framing&
Maintenance
PacketProcessing
Buffer Mgmt&
Scheduling
Buffer Mgmt&
Scheduling
Buffer & StateMemory
Buffer & StateMemory
Typical IP Router Linecard
OC192c linecard:~10-30M gates~2Gbits of memory~2 square feet>$10k cost100’s of Watts
LookupTables
“Backplane”
SwitchFabric
Arbitration
Optics
18
Fourth Generation Routers/Switches
Optics inside a router for the first time
Switch Core Linecards
Optical links
1000’sof feet
The LCS Protocol
0.3 - 10Tb/s routers in development
19
Where next?
• Incorporating (more) optics into a router.
• More parallelism (with or without optics).
20
Incorporating optics into a router
• Replacing the switch fabric with an optical datapath.
• Increasing the internal “cell” size to reduce rate of arbitration and reconfiguration.
21
Replacing the switch fabric with optics
SwitchFabric
Arbitration
PhysicalLayer
Framing&
Maintenance
PacketProcessing
Buffer Mgmt&
Scheduling
Buffer Mgmt&
Scheduling
Buffer & StateMemory
Buffer & StateMemory
Typical IP Router LinecardLookupTables
OpticsPhysical
LayerFraming
& Maintenance
PacketProcessing
Buffer Mgmt&
Scheduling
Buffer Mgmt&
Scheduling
Buffer & StateMemory
Buffer & StateMemory
Typical IP Router LinecardLookupTables
Opticselectrical
SwitchFabric
Arbitration
PhysicalLayer
Framing&
Maintenance
PacketProcessing
Buffer Mgmt&
Scheduling
Buffer Mgmt&
Scheduling
Buffer & StateMemory
Buffer & StateMemory
Typical IP Router LinecardLookupTables
OpticsPhysical
LayerFraming
& Maintenance
PacketProcessing
Buffer Mgmt&
Scheduling
Buffer Mgmt&
Scheduling
Buffer & StateMemory
Buffer & StateMemory
Typical IP Router LinecardLookupTables
Optics
optical
Req/Grant Req/Grant
Candidate technologies 1. MEMs2. Fast tunable lasers + passive optical
couplers
22
Replacing the switch fabric with optics
• Most common internal “cell” size is 64 bytes (50ns @ OC192, 12ns @ OC768)
• Too fast for arbitration • Too fast for reconfiguration• What we’ll see:
– Increased cell length– E.g. switch bursts of cells – But less efficient.
23
More parallelism
• Parallel packet buffers• Parallel lookup tables
• Multiple parallel routers
24
Multiple parallel routers
What we’d like:R
R R
R
The building blocks we’d like to use:
R
RR
R
NxN
IP Router capacity 100s of Tb/s
25
Why this might be a good idea
• Larger overall capacity• Faster line rates• Redundancy• Familiarity
– “After all, this is how the Internet is built”
26
Multiple parallel routers Load Balancing architectures
R R
R
12……k
R
RR
R/k R/k
R
RR
27
Method #1: Random packet load-balancing
Method: As packets arrive they are randomly distributed, packet by packet over each router.
Advantages: – Almost unlimited capacity– Load-balancer is simple– Load-balancer needs no packet buffering
Disadvantages:– Random fluctuations in traffic each router is loaded
differently • Packets within a flow may become mis-sequenced• It is not possible to predict the system performance
28
Method #2: Random flow load-balancing
Method: Each new flow (e.g. TCP connection) is randomly assigned to a router. All packets in a
flow follow the same path.Advantages:
– Almost unlimited capacity– Load-balancer is simple (e.g. hashing of flow ID).– Load-balancer needs no packet buffering.– No mis-sequencing of packets within a flow.
Disadvantages:– Random fluctuations in traffic each router is loaded
differently • It is not possible to predict the system performance
29
Observations
• Random load-balancing: It’s hard to predict system performance.
• Flow-by-flow load-balancing: Worst-case performance is very poor.
If designers, system builders, network operators etc. need to know the worst case performance, random load-balancing will not suffice.
30
Method #3: Intelligent packet load-balancing
Goal: Each new packet is carefully assigned to a router so that:
• Packets are not mis-sequenced.• The throughput is maximized and
understood.• Delay of each packet can be controlled.
We call this “Parallel Packet Switching”
31
Method #3: Intelligent packet load-balancing
Parallel Packet Switching
1
2
k
1
N
rate, R
rate, R
rate, R
rate, R
1
N
Router
Bufferless
R/k R/k
32
Parallel Packet Switching
• Advantages– Single-stage of buffering– No excess link capacity– kpower per subsystem – kmemory bandwidth – klookup rate
33
Precise Emulation of a Shared Memory Switch
N N
Shared Memory Switch1
N
Parallel Packet Switch
= ?
1
N
1
N
34
Parallel Packet SwitchTheorem
1. If S > 2k/(k+2) 2 then a parallel packet switch can precisely emulate a FCFS shared memory switch for all traffic.
35
Example of an IP Router with Parallel Packet Switching
1
2
16
1
1024
160Gb/s
160Gb/s
rate, R
rate, R
1
1024
10Tb/s routerR/k R/k
Overall capacity 160Tb/s
36
My conclusions
• The capacity of electronic IP routers will scale a long way yet.
• The opportunity of optics is to reduce power and space– By using optics within the router.– By replacing routers with circuit
switches.