1.10.2012 1 seattle - seattle - a scalable ethernet architecture for large enterprises t-110.6120...

40
1.10.2012 1 SEATTLE SEATTLE - - A Scalable Ethernet Architecture for Large Enterprises T-110.6120 – Special Course in Future Internet Technologies M.Sc. Pekka Hippeläinen IBM phippela@gmail

Upload: arabella-washington

Post on 23-Dec-2015

217 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: 1.10.2012 1 SEATTLE - SEATTLE - A Scalable Ethernet Architecture for Large Enterprises T-110.6120 – Special Course in Future Internet Technologies M.Sc

1.10.2012 1

SEATTLESEATTLE- - A Scalable Ethernet Architecturefor Large Enterprises

T-110.6120 – Special Course in Future Internet Technologies

M.Sc. Pekka Hippeläinen

IBM

phippela@gmail

Page 2: 1.10.2012 1 SEATTLE - SEATTLE - A Scalable Ethernet Architecture for Large Enterprises T-110.6120 – Special Course in Future Internet Technologies M.Sc

1.10.2012 2

SEATTLE

Based on and pictures borrowed from:Changhoon,K;Caesar,M;Rexford,J. Floodless in SEATTLE: A Scalable Ethernet Architecture for Large Enterprises

Is it possible to build a protocol that maintains the same configuration-free properties as Ethernet bridging, yet scales to large networks?

Page 3: 1.10.2012 1 SEATTLE - SEATTLE - A Scalable Ethernet Architecture for Large Enterprises T-110.6120 – Special Course in Future Internet Technologies M.Sc

1.10.2012 3

Contents

Motivation: network management challenge

Ethernet features: ARP and DHCP broadcasts 1) Ethernet Bridging 2) Scaling with Hybrid networks 3) Scaling with VLANs Distributed Hashing SEATTLE approach

Results Conclusions

Page 4: 1.10.2012 1 SEATTLE - SEATTLE - A Scalable Ethernet Architecture for Large Enterprises T-110.6120 – Special Course in Future Internet Technologies M.Sc

1.10.2012 4

Network management challenge IP Networks require massive effort to

configure and manage Even 70% of an enterprise network’s cost

goes to maintenance and configuration Ethernet is much simpler to manage However Ethernet does not scale well

beyond small LANs SEATTLE architecture aims to provide

scalability of IP with simplicity of Ethernet management

Page 5: 1.10.2012 1 SEATTLE - SEATTLE - A Scalable Ethernet Architecture for Large Enterprises T-110.6120 – Special Course in Future Internet Technologies M.Sc

1.10.2012 5

Why Ethernet is so wonderful ? Easy to setup, easy to manage DHCP server, some hubs, plug’n play

Page 6: 1.10.2012 1 SEATTLE - SEATTLE - A Scalable Ethernet Architecture for Large Enterprises T-110.6120 – Special Course in Future Internet Technologies M.Sc

18.09.2012 6

Flooding query 1: DHCP requests Lets say node A joins the ethernet To get IP / confirm IP – node A sends a DHCP

request as a broadcast Request floods through the broadcast domain

Page 7: 1.10.2012 1 SEATTLE - SEATTLE - A Scalable Ethernet Architecture for Large Enterprises T-110.6120 – Special Course in Future Internet Technologies M.Sc

1.10.2012 7

Flooding query 2: ARP In order for node A to communicate to node

B in the same broadcast domain, the sender needs MAC address of the node B

Lets assume that node B IP is know Node A sends and Address Request Protocol

(ARP) broadcast – to find out MAC address of node B

Similarly to DHCP broadcast – the request is flooded through the whole broadcast domain

This is basically {IP -> MAC} mapping

Page 8: 1.10.2012 1 SEATTLE - SEATTLE - A Scalable Ethernet Architecture for Large Enterprises T-110.6120 – Special Course in Future Internet Technologies M.Sc

1.10.2012 8

Why flooding is bad ? Large Ethernet deployments contain vast

number of hosts and thousands of bridges Ethernet was not designed to such a scale Virtualization and mobile deployments can

cause many dynamic events – causing control traffic

Broadcast messages need to be processed in the end hosts – interrupting cpu

The bridges forwarding tables grow roughly linearly with number of hosts

Page 9: 1.10.2012 1 SEATTLE - SEATTLE - A Scalable Ethernet Architecture for Large Enterprises T-110.6120 – Special Course in Future Internet Technologies M.Sc

1.10.2012 9

1) Ethernet bridging Ethernet consists of segments each

comprising a single physical layer Ethernet bridges are used to interconnect

segments to multi-hop network i.e. LAN This forms a single broadcast domain Bridge learns how to reach a host – by

inspecting the incoming frames and associating the source MAC with the incoming port

A bridge stores this information to a forwarding table – using the table to forward packets to correct direction

Page 10: 1.10.2012 1 SEATTLE - SEATTLE - A Scalable Ethernet Architecture for Large Enterprises T-110.6120 – Special Course in Future Internet Technologies M.Sc

1.10.2012 10

Bridge spanning tree One bridge is configured to be the root

bridge Other bridges collectively compute a

spanning tree based on the distance to the root

Thus traffic is not routed through shortest path but along the spanning tree

This approach avoids broadcast storms

Page 11: 1.10.2012 1 SEATTLE - SEATTLE - A Scalable Ethernet Architecture for Large Enterprises T-110.6120 – Special Course in Future Internet Technologies M.Sc

1.10.2012 11

Page 12: 1.10.2012 1 SEATTLE - SEATTLE - A Scalable Ethernet Architecture for Large Enterprises T-110.6120 – Special Course in Future Internet Technologies M.Sc

1.10.2012 12

2) Hybrid IP/Ethernet In this approach multiple LANs are

interconnected with IP routing In hybrid networks each LAN contains at

most a few hundred of hosts that form IP subnet

IP subnet is associated with the IP prefix Assigning IP prefixes to subnet and

associating subnets with router interfaces is a manual process

Unlike MAC which is host identifier – IP address denotes the hosts current location in the network

Page 13: 1.10.2012 1 SEATTLE - SEATTLE - A Scalable Ethernet Architecture for Large Enterprises T-110.6120 – Special Course in Future Internet Technologies M.Sc

1.10.2012 13

Page 14: 1.10.2012 1 SEATTLE - SEATTLE - A Scalable Ethernet Architecture for Large Enterprises T-110.6120 – Special Course in Future Internet Technologies M.Sc

1.10.2012 14

Drawbacks of Hybrid approach Biggest drawback is the configuration

overhead Router interfaces must be configured Host must have correct IP address corresponding

to the subnet it is located (DHCP can be used) Networking policies are defined usually per

network prefix i.e. topology When network changes the policies must be

updated Limited mobility support

Mobile users & virtualized hosts at datacenters If IP is constant – the user should stay on the same

subnet

Page 15: 1.10.2012 1 SEATTLE - SEATTLE - A Scalable Ethernet Architecture for Large Enterprises T-110.6120 – Special Course in Future Internet Technologies M.Sc

1.10.2012 15

3) Virtual LANs Overcomes some problems of Ethernet

and IP Networks Administrators can logically groups hosts

into same broadcast domain VLANS can be configured to overlap –

configuring bridges not the hosts Now broadcast overhead can be reduced

by the isolates domains Mobility is simplified – IP address can be

retained while moving between bridges

Page 16: 1.10.2012 1 SEATTLE - SEATTLE - A Scalable Ethernet Architecture for Large Enterprises T-110.6120 – Special Course in Future Internet Technologies M.Sc

1.10.2012 16

Virtual LANs Traffic from B1 to B2 can be ‘trunked’

over multiple bridges Inter domain traffic needs to be routed

Page 17: 1.10.2012 1 SEATTLE - SEATTLE - A Scalable Ethernet Architecture for Large Enterprises T-110.6120 – Special Course in Future Internet Technologies M.Sc

1.10.2012 17

Drawbacks of VLANs Trunk configuration overhead

Extending VLAN across multiple bridges requires VLAN to be configured at each of the bridges participating. Often manual work.

Limited control plane scalability Forwarding table entries and broadcast traffic for

every active host and every VLAN visible Insufficient data plane efficiency

Single spanning tree is still used within each VLAN

Inter-VLAN traffic must be routed via IP gateways

Page 18: 1.10.2012 1 SEATTLE - SEATTLE - A Scalable Ethernet Architecture for Large Enterprises T-110.6120 – Special Course in Future Internet Technologies M.Sc

1.10.2012 18

Distributed Hash Tables Hash tables are used to store {key -> value}

pairs In case of multiple nodes there is nice way to

make Nodes symmetric Distribute the hash table entries evenly among

nodes Keep reshuffling of entries small in case of

adding/removing nodes Idea is to calculate H(key) that is mapped to a

host – one can visualize this to mapping to an angle (or to a point on a circle)

Page 19: 1.10.2012 1 SEATTLE - SEATTLE - A Scalable Ethernet Architecture for Large Enterprises T-110.6120 – Special Course in Future Internet Technologies M.Sc

1.10.2012 19

Distributed Hash Tables

Each node is mapped to randomly distributed points on the circle

Thus each node is mapped to multiple buckets

One calculates the H(key) – and stores the entry to the node owning this bucket

If node is removed – the values are now assigned to next buckets

If node is added – entries are moved to the new buckets

Page 20: 1.10.2012 1 SEATTLE - SEATTLE - A Scalable Ethernet Architecture for Large Enterprises T-110.6120 – Special Course in Future Internet Technologies M.Sc

1.10.2012 20

SEATTLE approach 1/2 1) Switches calculate shortest

path among themselves This is link state protocol – basically

Dijkstra Switch level discovery protocol – Ethernet

hosts do not respond Switch topology much more stable than at

host level Much more scalable than at host level Each switch has an ID – one MAC address of

the switch interfaces

Page 21: 1.10.2012 1 SEATTLE - SEATTLE - A Scalable Ethernet Architecture for Large Enterprises T-110.6120 – Special Course in Future Internet Technologies M.Sc

1.10.2012 21

SEATTLE approach 2/2 2) DHT used in switches

{IP->MAC} mapping This is essentially ARP request avoiding

flooding {MAC->location} mapping

When switch is located – routing along the shortest path can be used

DCHP Service location can also be stored SEATTLE thus reduces flooding, allows

usage of shortest path and offers a nice way to locate DHCP service

Page 22: 1.10.2012 1 SEATTLE - SEATTLE - A Scalable Ethernet Architecture for Large Enterprises T-110.6120 – Special Course in Future Internet Technologies M.Sc

1.10.2012 22

SEATTLE Control overhead reduced with consistent

hashing When set of switches changes due to network

failure or recovery – only some entries must be moved

Balancing load with virtual switches If some switches are more powerful – the switch

can represent itself as many – getting more load Enabling flexible service discovery

This is mainly DHCP – but could be something like {“PRINTER”->location}

Page 23: 1.10.2012 1 SEATTLE - SEATTLE - A Scalable Ethernet Architecture for Large Enterprises T-110.6120 – Special Course in Future Internet Technologies M.Sc

1.10.2012 23

Topology changes Adding and removing switches/links

can alter topology Switch/link failures and recoveries can

also lead to partitioning events (more rare)

Non-partitioning link failures are easy to handle – the resolver for hash entry is not changed

Page 24: 1.10.2012 1 SEATTLE - SEATTLE - A Scalable Ethernet Architecture for Large Enterprises T-110.6120 – Special Course in Future Internet Technologies M.Sc

1.10.2012 24

Switch failures If switch fails or recovers hash entries

need to be moved The switch that published value – monitors

the liveliness of resolver. Republishing entry when needed

The entries have TTL

Page 25: 1.10.2012 1 SEATTLE - SEATTLE - A Scalable Ethernet Architecture for Large Enterprises T-110.6120 – Special Course in Future Internet Technologies M.Sc

1.10.2012 25

Partitioning events Each switch has to book keep also

locally-stored location entries If switch s_old is removed / not reachable –

all the switches need to remove these location entries

This approach correctly handles partitioning events

Page 26: 1.10.2012 1 SEATTLE - SEATTLE - A Scalable Ethernet Architecture for Large Enterprises T-110.6120 – Special Course in Future Internet Technologies M.Sc

1.10.2012 26

Scaling:location Hosts use directory service to publish and

maintain {mac->location} mappings When host a with mac_a arrives – it accesses

switch S_a (steps 1-3) Switch s_a publishes {mac_a,location}, by

calculating the correct bucket F(mac_a) i.e. switch/resolver

When node b wants to send message to node a F(mac_a) is calculated to fetch the location

’Reactive resolution’ – also cache misses do not lead flooding

Page 27: 1.10.2012 1 SEATTLE - SEATTLE - A Scalable Ethernet Architecture for Large Enterprises T-110.6120 – Special Course in Future Internet Technologies M.Sc

1.10.2012 27

Scaling:ARP When node b makes ARP request – SEATTLE

converts this to a {F(IP_a) -> mac_a} request The resolver/switch for F(IP_a) is usually

different from F(mac_a) Optimization for hosts making ARP request

F(IP_a) address resolver can also store mac_a and S_a

When node b makes F(IP_a) ARP request also mac_a->S_a mapping is cached to S_b

Shortest path (-> path 10) can now be used

Page 28: 1.10.2012 1 SEATTLE - SEATTLE - A Scalable Ethernet Architecture for Large Enterprises T-110.6120 – Special Course in Future Internet Technologies M.Sc

1.10.2012 28

Handling host dynamics

Location change Wireless handoff VM moved but retaining MAC

Host MAC address changes NIC card replaced Failover event VM migration forcing MAC change

Host changes IP DHCP leave expires Manual reconfiguration

Page 29: 1.10.2012 1 SEATTLE - SEATTLE - A Scalable Ethernet Architecture for Large Enterprises T-110.6120 – Special Course in Future Internet Technologies M.Sc

1.10.2012 29

Insert, delete and update Location change

Host h moves from s_old to s_new s_new updates the existing mac-to-location

entry MAC change

IP-to-MAC update MAC-to-location deletion (old) and insertion

(new) IP change

S_h deletes old IP-to-MAC and inserts new IP-to-MAC

Page 30: 1.10.2012 1 SEATTLE - SEATTLE - A Scalable Ethernet Architecture for Large Enterprises T-110.6120 – Special Course in Future Internet Technologies M.Sc

1.10.2012 30

Ethernet: Bootstrapping hosts Host discovered by access switches

SEATTLE switches snoop ARP requests Most OSes generate ARP request at boot up / if up Aldo DHCP messages or host down can be used

Host configuration without broadcast DHCP_SERVER hashes string “DHCP_SERVER” and

stores the location to the switches The “DHCP_SERVER” string is used to locate

service No need to broadcast for ARP or DHCP

Page 31: 1.10.2012 1 SEATTLE - SEATTLE - A Scalable Ethernet Architecture for Large Enterprises T-110.6120 – Special Course in Future Internet Technologies M.Sc

1.10.2012 31

Scalable and flexible VLANs To support broadcasts – the authors suggest

using groups Similar to VLAN - groups is defined as a set of

hosts who share the same broadcast domain The groups are not limited to layer-2

reachability Multicast-based group-wide broadcasting

Multicast tree with broadcast root for each group F(group_id) used for broadcast root location

Page 32: 1.10.2012 1 SEATTLE - SEATTLE - A Scalable Ethernet Architecture for Large Enterprises T-110.6120 – Special Course in Future Internet Technologies M.Sc

1.10.2012 32

Simulations 1) Campus ~40 000 students

517 routers and switches

2) AP-Large (Access Provider) 315 routers

3) Datacenter (DC) 4 core routes with 21 aggregation switches

Routers were converted to SEATTLE switches

Page 33: 1.10.2012 1 SEATTLE - SEATTLE - A Scalable Ethernet Architecture for Large Enterprises T-110.6120 – Special Course in Future Internet Technologies M.Sc

1.10.2012 33

Cache timeout and AP-large with 50k hosts

Shortest path cache timeouthas impact on number oflocation lookups

Even with 60s time out 99.98%packets were forwarded without lookup

Control overhead (blue) decreases very fast – where as the table size increases only moderately

Shortest path is used in majority of routing in these simulations

Page 34: 1.10.2012 1 SEATTLE - SEATTLE - A Scalable Ethernet Architecture for Large Enterprises T-110.6120 – Special Course in Future Internet Technologies M.Sc

1.10.2012 34

Table size increase in DC

Ethernet bridges stores entryfor each destination ~ O(sh)behavior across network

SEATTLE requires only ~O(h) state since only access and resolver switches need to store and location information for each hosts

With this topology the table size was reduced by factor of 22

In AP-large case the factor was increased to 64

Page 35: 1.10.2012 1 SEATTLE - SEATTLE - A Scalable Ethernet Architecture for Large Enterprises T-110.6120 – Special Course in Future Internet Technologies M.Sc

1.10.2012 35

Control overhead in AP-large Number of control messages

over all links in the topologydivided by the number switchesand duration of the trace

SEATTLY significantly reduces control overhead in the simulations

This is mainly because Ethernet generates network wide floods for a significant number of packets

Page 36: 1.10.2012 1 SEATTLE - SEATTLE - A Scalable Ethernet Architecture for Large Enterprises T-110.6120 – Special Course in Future Internet Technologies M.Sc

1.10.2012 36

Effect of switch failure in DC Switches were allowed to fail

randomly The average recover time was

30 seconds SEATTLE can use all the links in the

topology, where as Ethernet is restricted to the spanning tree

Ethernet must re-compute the tree causing outages

Page 37: 1.10.2012 1 SEATTLE - SEATTLE - A Scalable Ethernet Architecture for Large Enterprises T-110.6120 – Special Course in Future Internet Technologies M.Sc

1.10.2012 37

Effect of host mobility in Campus Hosts were randomly moved

between access switches For high mobility rates,

SEATLLES loss rate was lower than Ethernet

On Ethernet it takes sometime for switches to evict the stale information location information and re-learn the new location

SEATTLE provided low loss and broadcast overhead

Page 38: 1.10.2012 1 SEATTLE - SEATTLE - A Scalable Ethernet Architecture for Large Enterprises T-110.6120 – Special Course in Future Internet Technologies M.Sc

1.10.2012 38

What was omitted Authors suggest multi-level one-hop DHTs

With large dynamic networks – it can be beneficial that entries are stored close

This is achieved with regions and backbone – border switches connect to the backbone switches

With topology changes Approach to seamless mobility is described in the

paper Updating remote host caches is required with

switch based MAC revocation lists Some simulation results Authors also made sample implementation

Page 39: 1.10.2012 1 SEATTLE - SEATTLE - A Scalable Ethernet Architecture for Large Enterprises T-110.6120 – Special Course in Future Internet Technologies M.Sc

1.10.2012 39

Conlusions Operators today face challenges in managing

and configuring large networks. This is largely to complexity of administering IP networks.

Ethernet is not a viable alternative poor scaling and inefficient path selection

SEATTLE promises scalable self-configuring routing Simulations suggest efficient routing, low latency

with quick recovery Host mobility supported with low control overhead

Ethernet stacks at end hosts are not modified

Page 40: 1.10.2012 1 SEATTLE - SEATTLE - A Scalable Ethernet Architecture for Large Enterprises T-110.6120 – Special Course in Future Internet Technologies M.Sc

1.10.2012 40

Thank you for your attention!Questions? Comments?