
Madan Jampani, 3/12/2015

Networking Seminar, Stanford University
netseminar.stanford.edu/seminars/03_12_15.pdf

Can the SDN control plane scale without sacrificing abstractions and performance?

[Diagram: Control Plane on top of the Data Plane]

A simple yet powerful abstraction: the Global Network View

Scaling strong consistency

Managing Complexity

What is the simplest controller?

Observe → Program

[Scorecard (Abstractions, Performance, Correctness, Availability, Scale): Single Instance]

How to improve availability?

Primary/Standby: when the primary fails, the standby takes over as the new primary.
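To make the failover concrete, here is a minimal sketch, assuming heartbeat-based failure detection; the class, method names, and timeout are illustrative, not from the talk:

```java
import java.time.Duration;
import java.time.Instant;

// Illustrative sketch of primary/standby failover: the standby watches the
// primary's heartbeats and promotes itself when they stop arriving.
class StandbyController {
    private static final Duration FAILOVER_TIMEOUT = Duration.ofSeconds(3); // assumed value

    private volatile Instant lastHeartbeat = Instant.now();
    private volatile boolean primary = false;

    void onHeartbeatFromPrimary() {
        lastHeartbeat = Instant.now();
    }

    // Called periodically; if the primary has been silent for too long,
    // the standby takes over as the new primary.
    void checkPrimaryLiveness() {
        if (!primary && Instant.now().isAfter(lastHeartbeat.plus(FAILOVER_TIMEOUT))) {
            primary = true; // standby becomes primary
        }
    }
}
```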

[Scorecard (Abstractions, Performance, Correctness, Availability, Scale): Single Instance, Primary/Standby]

Why might you need to scale?

[Diagram: the network grows until a single Primary/Standby pair can no longer keep up]

Fully Distributed Control Plane

[Scorecard (Abstractions, Performance, Correctness, Availability, Scale): Single Instance, Primary/Standby, Multi-Instance]

“Tell me about your slice?”

Cache: instances keep a locally cached copy of their peers' slices instead of asking every time.

[Scorecard (Abstractions, Performance, Correctness, Availability, Scale): Single Instance, Primary/Standby, Multi-Instance, Multi-Instance + Cache]

Can we build a complete solution?

Let’s look at prior art...

[Diagram (prior art): topology events are written to a Distributed Topology Store; each instance reads topology state back from the store, in later variants through a local cache]

Can we build a solution that meets all criteria?

Topology as a State Machine

Events are Switch/Port/Link up/down

[Diagram: applying an event takes the Current Topology to the Updated Topology; each event E is broadcast to all instances, and each instance applies it to its local copy]
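To make the state-machine view concrete, here is a minimal sketch in Java; the event types come straight from the slide, while the class and field names are illustrative, not ONOS code:

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch: the network topology as a state machine.
// Applying a switch/port/link up/down event to the current topology
// yields the updated topology.
enum EventType { SWITCH_UP, SWITCH_DOWN, PORT_UP, PORT_DOWN, LINK_UP, LINK_DOWN }

record TopologyEvent(EventType type, String elementId) {}

class Topology {
    private final Set<String> activeSwitches = new HashSet<>();
    private final Set<String> activePorts = new HashSet<>();
    private final Set<String> activeLinks = new HashSet<>();

    // The single state-transition function: current topology + event -> updated topology.
    void apply(TopologyEvent e) {
        switch (e.type()) {
            case SWITCH_UP   -> activeSwitches.add(e.elementId());
            case SWITCH_DOWN -> activeSwitches.remove(e.elementId());
            case PORT_UP     -> activePorts.add(e.elementId());
            case PORT_DOWN   -> activePorts.remove(e.elementId());
            case LINK_UP     -> activeLinks.add(e.elementId());
            case LINK_DOWN   -> activeLinks.remove(e.elementId());
        }
    }
}
```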

Pros
● Simple
● Fast

Cons
● Dropped messages
● Reordered messages

[Diagram: a later event F overtakes an earlier event E at some instances, so replicas apply them in different orders and diverge]

We have a single-writer problem

Switch Mastership Terms

[Diagram: mastership of a switch moves C1 → C2 → C1 over terms 1, 2, 3]

We track mastership terms in a strongly consistent store.

Switch Event Numbers

[Diagram: within each term the master numbers the switch's events sequentially: 1.1 ... 1.4, 2.1 ... 2.8, 3.1 ... 3.3]
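A sketch of how a mastership term might be bumped; `MastershipService` and the in-memory map standing in for the consensus-backed store are illustrative, not ONOS code:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Illustrative sketch: mastership terms per switch, tracked in a strongly
// consistent store. Each time mastership changes hands the term is bumped,
// so (term, event number) totally orders all events from one switch.
class MastershipService {
    record Mastership(String controllerId, long term) {}

    // Stand-in for a strongly consistent map backed by consensus.
    private final ConcurrentMap<String, Mastership> store = new ConcurrentHashMap<>();

    // Controller 'me' becomes master of 'switchId'; the atomic compute()
    // models a linearizable conditional update in the real store.
    Mastership acquireMastership(String switchId, String me) {
        return store.compute(switchId, (sw, current) ->
                current == null
                        ? new Mastership(me, 1)
                        : new Mastership(me, current.term() + 1));
    }
}
```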

Partial Ordering of Topology Events

Each event has a unique logical timestamp

(Switch ID, Term Number, Event Number)

[Diagram: event E carries timestamp (S1, 1, 2); a later event F carries (S1, 2, 1). F's term number is higher, so F supersedes E, and an instance that receives E after F drops E as stale]
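The (Switch ID, Term Number, Event Number) timestamp and the stale-drop rule fit in a few lines; a sketch with illustrative names:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: the per-event logical timestamp (Switch ID, Term
// Number, Event Number) and the "stale events are dropped on receipt" rule.
record LogicalTimestamp(String switchId, long term, long eventNumber)
        implements Comparable<LogicalTimestamp> {

    // Orders events from the same switch: term first, then event number.
    @Override
    public int compareTo(LogicalTimestamp other) {
        int byTerm = Long.compare(term, other.term);
        return byTerm != 0 ? byTerm : Long.compare(eventNumber, other.eventNumber);
    }
}

class TopologyReplica {
    // Latest timestamp applied per switch (simplified; a real store would
    // track timestamps per topology element).
    private final Map<String, LogicalTimestamp> latestApplied = new HashMap<>();

    // Apply the event only if it is newer than everything already seen for
    // that switch; e.g. E = (S1, 1, 2) arriving after F = (S1, 2, 1) is dropped.
    boolean onEvent(LogicalTimestamp ts) {
        LogicalTimestamp last = latestApplied.get(ts.switchId());
        if (last != null && ts.compareTo(last) <= 0) {
            return false; // stale or duplicate: drop
        }
        latestApplied.put(ts.switchId(), ts);
        return true;
    }
}
```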

To summarize

Each instance has a full copy of network topology

Events are timestamped on arrival and broadcast

Stale events are dropped on receipt

There is one additional step...

What about dropped messages?

“Did you hear about the port that went offline?”

Anti-Entropy

Lightweight, gossip-style, peer-to-peer

Quickly bootstraps newly joined nodes
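A minimal sketch of the anti-entropy exchange, assuming a per-switch version digest; all names are illustrative, since the slides do not give protocol details:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch of gossip-style anti-entropy: each node keeps a digest
// (per-switch version of the topology state it holds); periodically it picks
// a random peer, exchanges digests, and pulls anything it is missing.
class AntiEntropyTask {
    private final Map<String, Long> localDigest = new HashMap<>(); // switchId -> version

    void recordVersion(String switchId, long version) {
        localDigest.merge(switchId, version, Math::max);
    }

    // Compare a peer's digest against ours and list the switches where the
    // peer is ahead; the caller then fetches that state from the peer. This
    // both repairs dropped broadcasts and bootstraps newly joined nodes,
    // whose digest is initially empty.
    List<String> entriesToPullFrom(Map<String, Long> peerDigest) {
        List<String> behind = new ArrayList<>();
        peerDigest.forEach((switchId, peerVersion) -> {
            long mine = localDigest.getOrDefault(switchId, 0L);
            if (mine < peerVersion) {
                behind.add(switchId);
            }
        });
        return behind;
    }
}
```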

[Scorecard (Abstractions, Performance, Correctness, Availability, Scale): Single Instance, Primary/Standby, Multi-Instance]

A model for state tracking

If you are the observer, Eventual Consistency is the best option

View should be consistent with the network, not with other views

Hiding the complexity

EventuallyConsistentMap<K, V, T>

Plug in your own timestamp generation
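A guess at the shape of such an abstraction, modeled on the EventuallyConsistentMap<K, V, T> named on the slide but not the actual ONOS interface: updates are stamped by a caller-supplied timestamp function, and older-stamped updates lose.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiFunction;

// Illustrative sketch of an eventually consistent map with pluggable
// timestamps. T orders concurrent updates: last-writer-wins according to
// the timestamp function supplied by the caller.
class EventuallyConsistentMap<K, V, T extends Comparable<T>> {
    private record Entry<V, T>(V value, T timestamp) {}

    private final Map<K, Entry<V, T>> entries = new ConcurrentHashMap<>();
    private final BiFunction<K, V, T> timestampFn; // plug in your own timestamps

    EventuallyConsistentMap(BiFunction<K, V, T> timestampFn) {
        this.timestampFn = timestampFn;
    }

    // Local put: stamp the update, apply it if it is newer, then (in a real
    // system) broadcast (key, value, timestamp) to peers.
    void put(K key, V value) {
        merge(key, value, timestampFn.apply(key, value));
    }

    // Called for local puts and for updates received from peers: updates with
    // an older timestamp are dropped, so all replicas converge.
    void merge(K key, V value, T timestamp) {
        entries.merge(key, new Entry<>(value, timestamp),
                (current, incoming) ->
                        incoming.timestamp().compareTo(current.timestamp()) > 0
                                ? incoming : current);
    }

    V get(K key) {
        Entry<V, T> e = entries.get(key);
        return e == null ? null : e.value();
    }
}
```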

What about other Control Plane state...

● Switch to Controller mapping
● Resource Allocations
● Flows
● Various application-generated state


In all cases, strong consistency is either required or highly desirable

Consistency => Coordination

2PC: all participants need to be available.

Consensus: only a majority needs to participate.
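The availability difference can be stated as a one-line commit rule for each; an illustrative sketch:

```java
// Illustrative sketch of the availability difference:
// 2PC can only commit when every participant answers,
// while consensus commits once a majority has accepted.
class CommitRules {
    // 2PC: a single unreachable participant blocks the commit.
    static boolean twoPhaseCommitCanCommit(int acks, int participants) {
        return acks == participants;
    }

    // Consensus (e.g. Paxos/Raft): a majority of acceptors suffices,
    // so the system tolerates a minority of failures.
    static boolean consensusCanCommit(int acks, int participants) {
        return acks > participants / 2;
    }
}
```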

State Consistency through Consensus

[Diagram: controllers agree on a single replicated log via consensus; scaling out means running many replicated logs side by side]

Scaling Consistency

[Diagram: the state is partitioned into shards, each replicated by its own consensus group]

Consistency Cost

● Atomic updates within a shard are “cheap”
● Atomic updates spanning shards use 2PC
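A sketch of where the cost shows up, assuming hash partitioning of keys across shards (illustrative, not from the talk): an update confined to one shard is a single consensus commit, while an update spanning shards needs 2PC across them.

```java
import java.util.List;

// Illustrative sketch of the consistency cost model: keys are hashed across
// N shards, each backed by its own consensus group. An update touching one
// shard is one consensus round; an update spanning shards needs 2PC.
class ShardedStore {
    private final int numShards;

    ShardedStore(int numShards) {
        this.numShards = numShards;
    }

    int shardOf(String key) {
        return Math.floorMod(key.hashCode(), numShards);
    }

    // "Cheap" path: all keys land in one shard -> one consensus commit.
    // Expensive path: keys span shards -> two-phase commit across shards.
    boolean needsTwoPhaseCommit(List<String> keysInUpdate) {
        return keysInUpdate.stream()
                .map(this::shardOf)
                .distinct()
                .count() > 1;
    }
}
```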

Hiding Complexity

ConsistentMap<K, V>
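A hedged sketch of the kind of contract such a map offers (the interface and helper below are illustrative, not the real ONOS API): because every operation is linearizable, atomic claims like resource allocation are safe across controllers.

```java
import java.util.Optional;

// Illustrative sketch of a strongly consistent map's contract. Every
// operation is linearizable, so atomic read-modify-write primitives like
// putIfAbsent and replace are safe for cluster-wide claims such as
// resource allocation or switch-to-controller mapping.
interface ConsistentMap<K, V> {
    Optional<V> get(K key);
    Optional<V> putIfAbsent(K key, V value);      // atomic claim
    boolean replace(K key, V expected, V update); // atomic compare-and-set
}

class ResourceAllocator {
    private final ConsistentMap<String, String> allocations;

    ResourceAllocator(ConsistentMap<String, String> allocations) {
        this.allocations = allocations;
    }

    // Claim a resource for an owner; exactly one controller succeeds even
    // if several try concurrently, because putIfAbsent is atomic.
    boolean tryAllocate(String resourceId, String owner) {
        return allocations.putIfAbsent(resourceId, owner).isEmpty();
    }
}
```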

To Conclude

● SDN at scale is possible if we can take care of state management

● Abstractions are important

Thank you
