Opticomm 2001, Nick McKeown
High Performance Switching and Routing. Telecom Center Workshop: Sept 4, 1997.
Do Optics Belong in Internet Core Routers?
Keynote, Opticomm 2001, Denver, Colorado
Nick McKeown, Professor of Electrical Engineering and Computer Science, Stanford University
[email protected]
www.stanford.edu/~nickm
There seem to be 3 opinions…
1. “Optics and routers don’t belong together”
   • Optics are ill-suited to packet switching.
   • CMOS technology and architectural techniques will scale just fine.
2. “Optical circuit switches will kill off (core) routers”
   • Optical circuit switches are simpler and faster than routers.
   • We don’t need packet switching anymore.
3. “Optics and routers belong together”
   • Electronic switched “backplanes” will be replaced by optics.
   • Leads to lower power, higher density routers.
1. “Optics and routers don’t belong together”
Optics are ill-suited to packet switching
• Buffering
   Packet switches inherently require buffers for times of congestion.
   Buffers provide statistical multiplexing for outgoing links.
   Optical buffers are not economically feasible.
• Processing
   Packet processing is too complex to be done optically.
Buffering
[Figure: rate-versus-time plots for two sources, A and B, each with peak rate x, multiplexed as A+B onto a single link of capacity C < 2x.]
The Internet is built on the assumption of expensive, congested links.
Statistical multiplexing enables sharing of expensive links.
All routers have big buffers.
Rule of thumb: buffer size ~= RTT * line-rate. At 10Gb/s: 2.5Gbits.
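The rule of thumb can be checked with a few lines of arithmetic. This is a minimal sketch: the round-trip time of 250 ms is an assumption (it is the value implied by 2.5 Gbits at 10 Gb/s, not stated on the slide), and the function name is illustrative.

```python
def buffer_bits(rtt_seconds: float, line_rate_bps: float) -> float:
    """Rule-of-thumb buffer size: round-trip time multiplied by line rate."""
    return rtt_seconds * line_rate_bps


RTT = 0.25        # seconds; assumed, implied by 2.5 Gbits at 10 Gb/s
LINE_RATE = 10e9  # bits per second

size = buffer_bits(RTT, LINE_RATE)
print(f"{size / 1e9:.1f} Gbit of buffering per 10 Gb/s port")
```

Because the buffer grows linearly with line rate, every step up in port speed needs proportionally more (and faster) memory on the linecard.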
Processing
[Figure: typical IP router linecard. Blocks: optics, physical layer, framing & maintenance, packet processing (with lookup tables), buffer management & scheduling (with buffer & state memory) on each side of the backplane, and a buffered or bufferless fabric with arbitration.]
OC192c linecard:
• 30M gates
• 2.5Gbits of memory
• 2 square feet of board
• 200W
• $20k cost
1. “Optics and routers don’t belong together”
CMOS and router architectures will scale just fine
Growth in capacity of electronic routers:
• Capacity 1992 ~ 2Gb/s
• Capacity 1995 ~ 10Gb/s
• Capacity 1998 ~ 40Gb/s
• Capacity 2001 ~ 160Gb/s
• Capacity 2003 ~ 1-40Tb/s
Main techniques for increasing capacity in electronic routers:
• Separating linecards from switch cores.
• Parallelism and load-balancing.
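As a rough check on the growth trend, the capacity figures listed above can be turned into a doubling time. This is only a sketch: the inputs are the approximate values from the slide (the 2003 range is left out because it is a forecast), and the helper name is illustrative.

```python
import math

# Approximate capacities from the slide, in Gb/s.
capacity_gbps = {1992: 2, 1995: 10, 1998: 40, 2001: 160}


def doubling_time_years(y0: int, c0: float, y1: int, c1: float) -> float:
    """Years needed to double, assuming exponential growth between two points."""
    return (y1 - y0) / math.log2(c1 / c0)


years = sorted(capacity_gbps)
for y0, y1 in zip(years, years[1:]):
    t = doubling_time_years(y0, capacity_gbps[y0], y1, capacity_gbps[y1])
    print(f"{y0}-{y1}: capacity doubles about every {t:.2f} years")
```

A 4x increase every three years, as between 1995 and 1998, corresponds to a doubling time of 1.5 years.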
Current “3rd generation” Routers
[Figure: line cards (line interface, MAC, local buffer memory, forwarding table) and a CPU card (CPU, memory, routing table) connected by a switched backplane.]
Typically <=160Gb/s aggregate capacity
3rd Generation Routers: Queueing Structure
[Figure: queues on either side of a switch controlled by an arbiter, with flow-control backpressure. Queue labels: per-flow/class or per-output queues (VOQs); per-flow/class or per-input queues.]
1 write per “cell” time, 1 read per “cell” time.
Rate of writes/reads determined by switch fabric speedup.
3rd Generation Routers
[Figure: equipment rack, 19” or 23” wide and 7’ tall.]
Size-constrained: 19” or 23” wide.
Power-constrained: 5kW for 640Gb/s is typical.
Separating linecards from switch cores
4th Generation Routers/Switches
[Figure: linecards connected to a switch core by optical links running the LCS protocol, over distances of 1000s of feet.]
0.3 - 10Tb/s routers in development
1. “Optics and routers don’t belong together”
CMOS and router architectures will scale just fine
Growth in capacity of electronic routers:
• Capacity 1992 ~= 2Gb/s
• Capacity 1995 ~= 10Gb/s
• Capacity 1998 ~= 40Gb/s
• Capacity 2001 ~= 160Gb/s
• Capacity 2003 ~ 1-40Tb/s
Main techniques for increasing capacity in electronic routers:
• Separating linecards from switch cores.
• Parallelism and load-balancing.
Parallelism and Load-Balancing
Techniques in development for linecards at 10’s of Gb/s, but not discussed here:
• Parallel packet buffers.
• Parallel lookup tables.
Discussed here:
• Load-balancing across multiple parallel routers.
Multiple parallel routers
[Figure: a big NxN router with line rate R on each port, built from smaller router building blocks.]
IP Router capacity 100s of Tb/s
Multiple parallel routers: Load Balancing architectures
[Figure: each external line of rate R is load-balanced over k parallel routers, numbered 1, 2, …, k, using internal links of rate R/k.]
Method #1: Random packet load-balancing
Method: As packets arrive they are randomly distributed, packet by packet, over the routers.
Advantages:
• Almost unlimited capacity
• Load-balancer is simple
• Load-balancer needs no packet buffering
Disadvantages:
• Random fluctuations in traffic mean each router is loaded differently
• Packets within a flow may become mis-sequenced
• It is not possible to predict the system performance
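The load fluctuation named in the first disadvantage is easy to see in a toy simulation. This is a minimal sketch, not a model of any real load-balancer: it assumes equal-sized packets, a uniform random choice per packet, and illustrative values for the packet count and k.

```python
import random
from collections import Counter


def random_packet_balance(num_packets: int, k: int, seed: int = 0) -> list[int]:
    """Send each packet to a uniformly random router; return per-router counts."""
    rng = random.Random(seed)
    counts = Counter(rng.randrange(k) for _ in range(num_packets))
    return [counts[router] for router in range(k)]


NUM_PACKETS = 10_000  # illustrative
K = 16                # illustrative number of parallel routers

loads = random_packet_balance(NUM_PACKETS, K)
print("ideal load per router:", NUM_PACKETS / K)
print("lightest router:", min(loads), "heaviest router:", max(loads))
```

The sketch only counts packets. It does not model queueing delay, which is what would cause packets of one flow to leave different routers out of order.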
Method #2: Random flow load-balancing
Method: Each new flow (e.g. TCP connection) is randomly assigned to a router. All packets in a flow follow the same path.
Advantages:
• Almost unlimited capacity
• Load-balancer is simple (e.g. hashing of flow ID).
• Load-balancer needs no packet buffering.
• No mis-sequencing of packets within a flow.
Disadvantages:
• Random fluctuations in traffic mean each router is loaded differently
• It is not possible to predict the system performance
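Hashing of the flow ID can be sketched as follows. The slide does not say which fields or which hash are used, so the 5-tuple flow ID and the CRC32 hash here are assumptions chosen for illustration.

```python
import zlib


def router_for_flow(src_ip: str, dst_ip: str, src_port: int,
                    dst_port: int, protocol: str, k: int) -> int:
    """Map a flow to one of k routers by hashing its (assumed) 5-tuple ID."""
    flow_id = f"{src_ip},{dst_ip},{src_port},{dst_port},{protocol}".encode()
    return zlib.crc32(flow_id) % k


K = 16  # illustrative number of parallel routers

# Every packet of the same flow hashes to the same router, so order is kept.
first = router_for_flow("10.0.0.1", "10.0.0.2", 1234, 80, "tcp", K)
later = router_for_flow("10.0.0.1", "10.0.0.2", 1234, 80, "tcp", K)
print("router for this flow:", first, "| same for later packets:", first == later)
```

The choice is stateless, which is why no packet buffering is needed, but a few heavy flows that hash to the same router will overload it regardless of how idle the others are.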
Observations
• Random load-balancing: It’s hard to predict system performance.
• Flow-by-flow load-balancing: Worst-case performance is very poor.
If designers, system builders, network operators etc. need to know the worst-case performance, random load-balancing will not suffice.
(Conversely: If they don’t, then it will.)
Method #3: Intelligent packet load-balancing
Goal: Each new packet is carefully assigned to a router so that:
• Packets are not mis-sequenced.
• The throughput is maximized and understood.
• Delay of each packet can be controlled.
We call this “Parallel Packet Switching”
Method #3: Intelligent packet load-balancing
Parallel Packet Switching
[Figure: N input ports and N output ports, each at rate R, connected through bufferless load-balancers to k parallel routers, numbered 1, 2, …, k, over internal links of rate R/k.]
Parallel Packet Switching
Advantages:
• Single stage of buffering
• No excess link capacity
• Power per subsystem reduced by a factor of k
• Memory bandwidth reduced by a factor of k
• Lookup rate reduced by a factor of k
Example of an IP Router with Parallel Packet Switching
[Figure: 1024 input ports and 1024 output ports, each at rate R = 160Gb/s, load-balanced over k = 16 parallel 10Tb/s routers using internal links of rate R/k.]
Overall capacity 160Tb/s
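The numbers in this example are consistent with one another, as a short calculation shows. This sketch assumes the figure is read as 1024 ports at R = 160 Gb/s spread over k = 16 routers; the variable names are illustrative.

```python
N_PORTS = 1024  # external ports
R = 160e9       # external line rate in bits per second
K = 16          # number of parallel routers

internal_rate = R / K                       # rate of each internal link
per_router_capacity = N_PORTS * internal_rate
overall_capacity = N_PORTS * R

print(f"internal link rate R/k: {internal_rate / 1e9:.0f} Gb/s")
print(f"capacity of each parallel router: {per_router_capacity / 1e12:.2f} Tb/s")
print(f"overall capacity: {overall_capacity / 1e12:.2f} Tb/s")
```

Each of the 16 routers only has to terminate 10 Gb/s links, which is why a roughly 10Tb/s building block is enough for a system of about 160Tb/s.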
1. “Optics and routers don’t belong together”
Summary
• If optics cannot buffer or process packets, and
• If electronic CMOS-based routers can be built that are fast enough, then
• Why would anyone try and build an optical router?
There seem to be 3 opinions…
1. “Optics and routers don’t belong together”
   • Optics are ill-suited to packet switching.
   • CMOS technology and architectural techniques will scale just fine.
2. “Optical circuit switches will kill off (core) routers”
   • Optical circuit switches are simpler and faster than routers.
   • We don’t need packet switching anymore.
3. “Optics and routers belong together”
   • Electronic switched “backplanes” will be replaced by optics.
   • Leads to lower power, higher density routers.
2. “Optical circuit switches will kill off (core) routers”
Optical circuit switches are simpler and faster than routers.
• A survey of available equipment suggests that, with electronics, you can build a circuit switch that has about 10x the capacity of a packet switch.
• This is because a packet switch requires lots of complex per-packet processing, …
• While a circuit switch requires no per-packet processing.
Processing steps
IP Router
   Per packet:
   • IP lookup.
   • Update header & CRC.
   • Forward to correct output.
   • Schedule departure.
   Per route:
   • Maintain routing entry.
Circuit Switch
   Continuously:
   • Transfer bits, bytes, photons from input to output.
   Per circuit:
   • Establish circuit.
   • Remove circuit.
Why it’s hard for capacity to keep up with link rates
[Figure: two log-scale plots covering 1985 to 2000, with y-axes from 0.1 to 10000: SPEC95Int CPU results, and fiber capacity (Gbit/s) with TDM and DWDM.]
Packet processing power: 2x / 2 years
Link speed: 2x / 7 months
Source: SPEC95Int & David Miller, Stanford.
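The two doubling times quoted here compound into a rapidly widening gap. The following sketch assumes both trends are pure exponentials with exactly those doubling periods; the function names and the choice of horizons are illustrative.

```python
def growth_factor(months_elapsed: float, doubling_months: float) -> float:
    """Multiplicative growth after months_elapsed for a given doubling period."""
    return 2 ** (months_elapsed / doubling_months)


def link_to_processing_gap(months_elapsed: float,
                           processing_doubling: float = 24,
                           link_doubling: float = 7) -> float:
    """How much faster link speed has grown than packet processing power."""
    return (growth_factor(months_elapsed, link_doubling)
            / growth_factor(months_elapsed, processing_doubling))


for years in (1, 2, 5):
    gap = link_to_processing_gap(12 * years)
    print(f"after {years} year(s), link speed has outgrown processing by {gap:.1f}x")
```

Under these assumptions the processing budget per bit shrinks every year, which is the squeeze on per-packet work that the following slides describe.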
Instructions per packet
[Figure: instructions per packet versus time, with two trends labeled “What we’d like” and “What will happen”.]
What we’d like (more per-packet processing features): more efficient use of links, differentiated services, multicast, security, …
[Figure: normalized number of instructions per packet, 1996 to 2001, on a log scale from 1 to 100000.]
2. “Optical circuit switches will kill off (core) routers”
We don’t need packet switching anymore.
• Original reasons for packet switching no longer hold.
• There are new techniques, such as MPLambdaS, burst switching, and TCP Switching, that all make it possible to use circuit switching in the core.
• Actually, most of the core is circuit switched already!
Original reasons for packet switching
1. Efficient use of expensive links: “Circuit switching is rarely used for data networks, ... because of very inefficient use of the links” – Gallager.
2. Resilience to failure of links & routers: “For high reliability, ... [the Internet] was to be a datagram subnet, so if some lines and [routers] were destroyed, messages could be ... rerouted” – Tanenbaum.
Source: Networking 101
Neither reason is true today
1. Link capacity is abundant and underused
   • Most links are unused due to lack of switching capacity.
   • Most links are utilized < 10%.
   • Utilization continues to decrease.
2. Routers rarely fail
   • They are designed for <5s down-time per year.
   • They take >1min to recover when they do (circuit switches must recover in <50ms).
How networking people think the Internet is
[Figure: network diagram; label: Router.]
How the Internet really is
[Figure: circuit switched (SONET) and packet switched (IP routers); labels: $35Bn, $6Bn.]
How the Internet really is
[Figure: IP routers on each side, next to your local CO, connected through SONET/SDH.]
2. “Optical circuit switches will kill off (core) routers”
Summary
• If the original rationale for packet switching no longer holds, and
• If circuit switching is inherently faster and cheaper than packet switching, and
• If circuit switching is already working fine for most of the Internet, then
• Packet switching doesn’t appear to have a long-term future.
There seem to be 3 opinions…
1. “Optics and routers don’t belong together”
   • Optics are ill-suited to packet switching.
   • CMOS technology and architectural techniques will scale just fine.
2. “Optical circuit switches will kill off (core) routers”
   • Optical circuit switches are simpler and faster than routers.
   • We don’t need packet switching anymore.
3. “Optics and routers belong together”
   • Electronic switched “backplanes” will be replaced by optics.
   • Leads to lower power, higher density routers.
3. “Optics and routers belong together”
Electronic switched “backplanes” will be replaced by optics.
• The first step is already happening: physical separation of linecards and switch cores.
• Optical switching is feasible.
• Scheduling/arbitration is hard.
Separating linecards from switch cores
4th Generation Routers/Switches
[Figure: linecards connected to a switch core by optical links.]
Replacing the switch fabric with optics
[Figure: two versions of the same router. In each, typical IP router linecards (optics, physical layer, framing & maintenance, packet processing with lookup tables, buffer management & scheduling with buffer & state memory) exchange Req/Grant messages with a scheduler that controls the switch fabric. In the first version the switch fabric is electrical; in the second it is optical.]
Candidate technologies: MEMS, gratings, passive optical couplers + tunable lasers, holography, …
Req/Grant scheduling: but this is the difficult part…
Architecture of most routers today
[Figure: per-output queues (VOQs) feeding a switch fabric controlled by a scheduler.]
1. Scheduler picks new configuration each “cell” time (<50ns for OC192).
2. Scheduling decisions are complex:
   • “Ideal” algorithm: O(N^3) [maximum weight bipartite matching]
   • “Good” algorithm: O(N^2) [maximal size bipartite matching]. Requires speedup, which reduces cell time.
3. Scheduler chip is typically several million gates.
4. It is hard to use a distributed algorithm.
The scheduler is often the bottleneck in the system.
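To make the distinction between a maximal and a maximum matching concrete, here is a minimal sketch of a sequential greedy maximal-size matching over VOQ occupancies. It is an illustration only, not the algorithm used in any particular scheduler chip, and the function name and example matrix are hypothetical.

```python
def greedy_maximal_matching(voq: list[list[int]]) -> dict[int, int]:
    """Greedy maximal-size matching in O(N^2).

    voq[i][j] is the number of cells queued at input i for output j.
    Returns a dict mapping each matched input to its output.
    """
    n = len(voq)
    output_taken = [False] * n
    match: dict[int, int] = {}
    for i in range(n):
        for j in range(n):
            # Take the first free output for which input i has a cell.
            if voq[i][j] > 0 and not output_taken[j]:
                match[i] = j
                output_taken[j] = True
                break
    return match


# Input 0 has cells for outputs 0 and 1, input 1 only for output 0,
# and input 2 only for output 2.
occupancy = [[1, 1, 0],
             [1, 0, 0],
             [0, 0, 1]]
print(greedy_maximal_matching(occupancy))  # {0: 0, 2: 2}
```

In this example the greedy result matches two inputs, while a maximum matching would match all three (0 to 1, 1 to 0, 2 to 2). That lost throughput is the reason a maximal-matching scheduler needs speedup to compensate.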
Overcoming the scheduler bottleneck
1. Increase the internal “cell” size to reduce rate of arbitration and reconfiguration.
   • Today: 64B is common.
   • Expect 100s or 1000s of bytes per cell [Kar].
   • Throughput is not affected.
   • When does it become circuit switching?
2. Eliminate the need for a scheduler.
   • Two-stage switch [Chang].
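The effect of cell size on the scheduler's time budget can be sketched numerically. The calculation below assumes a nominal 10 Gb/s for OC192 and ignores framing overhead; the speedup parameter and the list of cell sizes are illustrative.

```python
def cell_time_ns(cell_bytes: int, line_rate_bps: float = 10e9,
                 speedup: float = 1.0) -> float:
    """Time available per scheduling decision, in nanoseconds.

    The fabric runs at line_rate_bps * speedup, so speedup shortens the cell time.
    """
    return cell_bytes * 8 / (line_rate_bps * speedup) * 1e9


for size in (64, 256, 1024, 4096):
    print(f"{size:5d}B cell: {cell_time_ns(size):8.1f} ns per decision, "
          f"{cell_time_ns(size, speedup=2.0):8.1f} ns with speedup 2")
```

At 64 bytes the budget is about 51 ns before any speedup is applied, so moving to cells of hundreds or thousands of bytes gives the scheduler proportionally more time per decision.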
Two-Stage Switch: Background
[Figure: a single-stage switch with inputs 1 to N and outputs 1 to N under simple round-robin scheduling.]
It is known that if traffic is uniform and non-bursty, then a single stage, with virtual output queues and trivial round-robin (“TDM”) scheduling, gives 100% throughput.
Of course, real traffic is non-uniform and bursty.
Two-Stage Switch
[Figure: external inputs 1 to N connect through a first round-robin stage (load balancing) to internal inputs 1 to N, which connect through a second round-robin stage to external outputs 1 to N.]
Switch gives 100% throughput for non-uniform, bursty traffic, without a scheduler or speedup!
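The reason no scheduler is needed is that each stage follows a fixed, traffic-independent connection pattern. The sketch below assumes the round-robin pattern is a cyclic shift (port i connects to port (i + t) mod N in time slot t), which is one common way to realize it; it shows only the connection pattern and does not simulate queues or demonstrate the throughput result.

```python
def round_robin_connection(port: int, t: int, n: int) -> int:
    """Port that `port` is connected to in time slot t (assumed cyclic shift)."""
    return (port + t) % n


def stage_schedule(n: int) -> list[list[int]]:
    """Connection pattern for n consecutive time slots; row t is slot t."""
    return [[round_robin_connection(i, t, n) for i in range(n)]
            for t in range(n)]


N = 4  # illustrative port count
for t, row in enumerate(stage_schedule(N)):
    print(f"slot {t}: input i -> internal input {row}")
```

Every slot is a permutation, so there is never any contention to arbitrate, and over N slots each external input visits every internal input exactly once, spreading its traffic evenly whatever the destinations are.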
An optical two-stage switch
[Figure: linecards 1, 2 and 3 connected through an optical fabric that operates in two phases, Phase 1 and Phase 2.]
3. “Optics and routers belong together”
Summary
• Optical switches can replace electronic crossbar switches now.
• Arbitration requires:
   Faster (compromised?) schedulers, or
   A 2-stage switch fabric.
So which will it be…?
1. “Optics and routers don’t belong together”
   • Optics are ill-suited to packet switching.
   • CMOS technology and architectural techniques will scale just fine.
2. “Optical circuit switches will kill off (core) routers”
   • Optical circuit switches are simpler and faster than routers.
   • We don’t need packet switching anymore.
3. “Optics and routers belong together”
   • Electronic switched “backplanes” will be replaced by optics.
   • Leads to lower power, higher density routers.
A bit of both for a few years:
• Continued scaling of electronic routers.
• Novel routers incorporating optics.
A prediction: By 2010, almost all of the Internet core will be optical and circuit switched.