planetlab: evolution vs intelligent design in global network infrastructure larry peterson princeton...
TRANSCRIPT
![Page 1: PlanetLab: Evolution vs Intelligent Design in Global Network Infrastructure Larry Peterson Princeton University](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649f585503460f94c7d591/html5/thumbnails/1.jpg)
PlanetLab: Evolution vs Intelligent Design in Global
Network Infrastructure
Larry Peterson, Princeton University
![Page 2: PlanetLab: Evolution vs Intelligent Design in Global Network Infrastructure Larry Peterson Princeton University](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649f585503460f94c7d591/html5/thumbnails/2.jpg)
Case for PlanetLab
This chasm is a major barrier to realizing the Future Internet
[Figure: Maturity plotted against Time. Stages shown: Foundational Research; Simulation and Research Prototypes; Small Scale Testbeds; Deployed Future Internet]
![Page 3: PlanetLab: Evolution vs Intelligent Design in Global Network Infrastructure Larry Peterson Princeton University](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649f585503460f94c7d591/html5/thumbnails/3.jpg)
PlanetLab
• 637 machines spanning 302 sites and 35 countries
  – nodes within a LAN-hop of > 2M users
• Supports distributed virtualization
  – each of 350+ network services runs in its own slice
![Page 4: PlanetLab: Evolution vs Intelligent Design in Global Network Infrastructure Larry Peterson Princeton University](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649f585503460f94c7d591/html5/thumbnails/4.jpg)
Slices
![Page 5: PlanetLab: Evolution vs Intelligent Design in Global Network Infrastructure Larry Peterson Princeton University](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649f585503460f94c7d591/html5/thumbnails/5.jpg)
Slices
![Page 6: PlanetLab: Evolution vs Intelligent Design in Global Network Infrastructure Larry Peterson Princeton University](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649f585503460f94c7d591/html5/thumbnails/6.jpg)
Slices
![Page 7: PlanetLab: Evolution vs Intelligent Design in Global Network Infrastructure Larry Peterson Princeton University](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649f585503460f94c7d591/html5/thumbnails/7.jpg)
User Opt-in
Server
NAT
Client
![Page 8: PlanetLab: Evolution vs Intelligent Design in Global Network Infrastructure Larry Peterson Princeton University](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649f585503460f94c7d591/html5/thumbnails/8.jpg)
Per-Node View
Virtual Machine Monitor (VMM)
Node Mgr
Local Admin
VM1 VM2 … VMn
![Page 9: PlanetLab: Evolution vs Intelligent Design in Global Network Infrastructure Larry Peterson Princeton University](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649f585503460f94c7d591/html5/thumbnails/9.jpg)
Long-Running Services
• Content Distribution
  – CoDeeN: Princeton
  – Coral: NYU
  – Cobweb: Cornell
• Internet Measurement
  – ScriptRoute: Washington, Maryland
• Anomaly Detection & Fault Diagnosis
  – PIER: Berkeley, Intel
  – PlanetSeer: Princeton
• DHT
  – Bamboo (OpenDHT): Berkeley, Intel
  – Chord (DHash): MIT
![Page 10: PlanetLab: Evolution vs Intelligent Design in Global Network Infrastructure Larry Peterson Princeton University](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649f585503460f94c7d591/html5/thumbnails/10.jpg)
Services (cont)
• Routing
  – i3: Berkeley
  – Virtual ISP: Princeton
• DNS
  – CoDNS: Princeton
  – CoDoNs: Cornell
• Storage & Large File Transfer
  – LOCI: Tennessee
  – CoBlitz: Princeton
  – Shark: NYU
• Multicast
  – End System Multicast: CMU
  – Tmesh: Michigan
![Page 11: PlanetLab: Evolution vs Intelligent Design in Global Network Infrastructure Larry Peterson Princeton University](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649f585503460f94c7d591/html5/thumbnails/11.jpg)
Usage Stats
• Slices: 350 - 425
• AS peers: 6000
• Users: 1028
• Bytes-per-day: 2 - 4 TB
• IP-flows-per-day: 190M
• Unique IP-addrs-per-day: 1M
![Page 12: PlanetLab: Evolution vs Intelligent Design in Global Network Infrastructure Larry Peterson Princeton University](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649f585503460f94c7d591/html5/thumbnails/12.jpg)
Architectural Questions
• What is the PlanetLab architecture?
  – more a question of synthesis than cleverness
• Why is this the right architecture?
  – non-technical requirements
  – technical decisions that influenced adoption
• What is a system architecture anyway?
  – how does it accommodate change (evolution)
![Page 13: PlanetLab: Evolution vs Intelligent Design in Global Network Infrastructure Larry Peterson Princeton University](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649f585503460f94c7d591/html5/thumbnails/13.jpg)
Requirements
1) Global platform that supports both short-term experiments and long-running services.
  – services must be isolated from each other
    • performance isolation
    • name space isolation
  – multiple services must run concurrently
Distributed Virtualization
  – each service runs in its own slice: a set of VMs
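The slice abstraction named on this slide can be pictured as a named set of per-node VMs. The sketch below is illustrative only: the class and method names (`Sliver`, `Slice`, `add_node`, `slices_on_node`) and the example hostnames are assumptions, and it further assumes one VM per node per slice. It is not PlanetLab's actual data model.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Sliver:
    """One VM belonging to a slice on a single node (hypothetical name)."""
    slice_name: str
    node: str


@dataclass
class Slice:
    """A slice: a network-wide set of VMs (assumed: at most one per node)."""
    name: str
    slivers: dict = field(default_factory=dict)  # node hostname -> Sliver

    def add_node(self, node: str) -> Sliver:
        # Adding the same node twice returns the VM that already exists.
        return self.slivers.setdefault(node, Sliver(self.name, node))


def slices_on_node(slices, node):
    """Names of the slices that run concurrently on one node."""
    return sorted(s.name for s in slices if node in s.slivers)


# Two services sharing a node, each inside its own slice.
codeen = Slice("princeton_codeen")
bamboo = Slice("ucb_bamboo")
for host in ("node1.example.org", "node2.example.org"):
    codeen.add_node(host)
bamboo.add_node("node1.example.org")
```

Here `slices_on_node([codeen, bamboo], "node1.example.org")` lists both slices, which is the "multiple services must run concurrently" requirement seen from a single node.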
![Page 14: PlanetLab: Evolution vs Intelligent Design in Global Network Infrastructure Larry Peterson Princeton University](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649f585503460f94c7d591/html5/thumbnails/14.jpg)
Requirements
2) It must be available now, even though no one knows for sure what “it” is.
  – deploy what we have today, and evolve over time
  – make the system as familiar as possible (e.g., Linux)
Unbundled Management
  – independent mgmt services run in their own slice
  – evolve independently; best services survive
  – no single service gets to be “root” but some services require additional privilege
![Page 15: PlanetLab: Evolution vs Intelligent Design in Global Network Infrastructure Larry Peterson Princeton University](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649f585503460f94c7d591/html5/thumbnails/15.jpg)
Requirements
3) Must convince sites to host nodes running code written by unknown researchers.
  – protect the Internet from PlanetLab
Chain of Responsibility
  – explicit notion of responsibility
  – trace network activity to responsible party
![Page 16: PlanetLab: Evolution vs Intelligent Design in Global Network Infrastructure Larry Peterson Princeton University](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649f585503460f94c7d591/html5/thumbnails/16.jpg)
Requirements
4) Sustaining growth depends on support for autonomy and decentralized control.
  – sites have the final say about the nodes they host
  – sites want to provide “private PlanetLabs”
  – regional autonomy is important
Federation
  – universal agreement on minimal core (narrow waist)
  – allow independent pieces to evolve independently
  – identify principals and trust relationships among them
![Page 17: PlanetLab: Evolution vs Intelligent Design in Global Network Infrastructure Larry Peterson Princeton University](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649f585503460f94c7d591/html5/thumbnails/17.jpg)
Requirements
5) Must scale to support many users with minimal resources available.
  – expect under-provisioned state to be the norm
  – shortage of logical resources too (e.g., IP addresses)
Decouple slice creation from resource allocation
Overbook with recovery
  – support both guarantees and best effort
  – recover from wedged states under heavy load
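One simple way to read "support both guarantees and best effort" is: honor reserved shares first, then split whatever is left evenly among the remaining slices. The function below is a minimal sketch of that policy under stated assumptions; the name `allocate_cpu`, its parameters, and the even split are all hypothetical and are not taken from PlanetLab's scheduler.

```python
def allocate_cpu(capacity, guarantees, best_effort):
    """Return {slice: share}.

    capacity    -- total share available on the node (e.g. 100 for percent)
    guarantees  -- {slice: reserved share}, honored first
    best_effort -- list of slice names that split the remainder evenly
    """
    reserved = sum(guarantees.values())
    if reserved > capacity:
        # Guarantees themselves are never overbooked in this sketch.
        raise ValueError("guarantees exceed capacity")
    shares = dict(guarantees)
    leftover = capacity - reserved
    for name in best_effort:
        shares[name] = leftover / len(best_effort)
    return shares
```

For example, `allocate_cpu(100, {"a": 20}, ["b", "c"])` reserves 20 for slice `a` and gives `b` and `c` 40 each. Creating another best-effort slice needs no new reservation; it only shrinks the even split, which is one reading of decoupling slice creation from resource allocation.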
![Page 18: PlanetLab: Evolution vs Intelligent Design in Global Network Infrastructure Larry Peterson Princeton University](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649f585503460f94c7d591/html5/thumbnails/18.jpg)
Tension Among Requirements
• Distributed Virtualization / Unbundled Management
  – isolation vs one slice managing another
• Federation / Chain of Responsibility
  – autonomy vs trusted authority
• Under-provisioned / Distributed Virtualization
  – efficient sharing vs isolation
• Other tensions
  – support users vs evolve the architecture
  – evolution vs clean slate
![Page 19: PlanetLab: Evolution vs Intelligent Design in Global Network Infrastructure Larry Peterson Princeton University](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649f585503460f94c7d591/html5/thumbnails/19.jpg)
Synergy Among Requirements
• Unbundled Management
  – third party management software
• Federation
  – independent evolution of components
  – support for autonomous control of resources
![Page 20: PlanetLab: Evolution vs Intelligent Design in Global Network Infrastructure Larry Peterson Princeton University](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649f585503460f94c7d591/html5/thumbnails/20.jpg)
Architecture (1)
• Node Operating System
  – isolate slices
  – audit behavior
• PlanetLab Central (PLC)
  – remotely manage nodes
  – bootstrap service to instantiate and control slices
• Third-party Infrastructure Services
  – monitor slice/node health
  – discover available resources
  – create and configure a slice
  – resource allocation
![Page 21: PlanetLab: Evolution vs Intelligent Design in Global Network Infrastructure Larry Peterson Princeton University](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649f585503460f94c7d591/html5/thumbnails/21.jpg)
Trust Relationships
Princeton, Berkeley, Washington, MIT, Brown, CMU, NYU, ETH, Harvard, HP Labs, Intel, NEC Labs, Purdue, UCSD, SICS, Cambridge, Cornell, …
princeton_codeen, nyu_d, cornell_beehive, att_mcash, cmu_esm, harvard_ice, hplabs_donutlab, idsl_psepr, irb_phi, paris6_landmarks, mit_dht, mcgill_card, huji_ender, arizona_stork, ucb_bamboo, ucsd_share, umd_scriptroute, …
N x N
Trusted Intermediary (PLC)
![Page 22: PlanetLab: Evolution vs Intelligent Design in Global Network Infrastructure Larry Peterson Princeton University](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649f585503460f94c7d591/html5/thumbnails/22.jpg)
Trust Relationships (cont)
[Figure: Node Owner, PLC, and Service Developer (User), connected by trust relationships numbered 1 to 4]
1) PLC expresses trust in a user by issuing it credentials to access a slice
2) Users trust PLC to create slices on their behalf and inspect credentials
3) Owner trusts PLC to vet users and map network activity to right user
4) PLC trusts owner to keep nodes physically secure
![Page 23: PlanetLab: Evolution vs Intelligent Design in Global Network Infrastructure Larry Peterson Princeton University](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649f585503460f94c7d591/html5/thumbnails/23.jpg)
Trust Relationships (cont)
[Figure: Node Owner, Mgmt Authority, Slice Authority, and Service Developer (User), connected by trust relationships numbered 1 to 6]
1) PLC expresses trust in a user by issuing credentials to access a slice
2) Users trust PLC to create slices on their behalf and inspect credentials
3) Owner trusts PLC to vet users and map network activity to right user
4) PLC trusts owner to keep nodes physically secure
5) MA trusts SA to reliably map slices to users
6) SA trusts MA to provide working VMs
![Page 24: PlanetLab: Evolution vs Intelligent Design in Global Network Infrastructure Larry Peterson Princeton University](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649f585503460f94c7d591/html5/thumbnails/24.jpg)
Architecture (2)
[Figure: Owner 1, Owner 2, Owner 3, …, Owner N; PlanetLab Nodes; Management Authority; Slice Authority; Service Developers (Users)]
Interactions labeled in the figure:
• Software updates
• Auditing data
• Create slices
• Request a slice
• New slice ID
• Access slice
• Identify slice users (resolve abuse)
• Learn about nodes
![Page 25: PlanetLab: Evolution vs Intelligent Design in Global Network Infrastructure Larry Peterson Princeton University](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649f585503460f94c7d591/html5/thumbnails/25.jpg)
Architecture (3)
[Figure components: Node (NM + VMM, Owner VM, SCS, VM); MA with node database; SA with slice database; Node Owner; Service Developer]
![Page 26: PlanetLab: Evolution vs Intelligent Design in Global Network Infrastructure Larry Peterson Princeton University](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649f585503460f94c7d591/html5/thumbnails/26.jpg)
Per-Node Mechanisms
Virtual Machine Monitor (VMM)
Node Mgr
Owner VM
VM1 VM2 … VMn
Linux kernel (Fedora Core)
  + Vservers (namespace isolation)
  + Schedulers (performance isolation)
  + VNET (network virtualization)
SliverMgr, Proper
PlanetFlow, SliceStat, pl_scs, pl_mom
![Page 27: PlanetLab: Evolution vs Intelligent Design in Global Network Infrastructure Larry Peterson Princeton University](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649f585503460f94c7d591/html5/thumbnails/27.jpg)
VMM
• Linux
  – significant mind-share
• Vserver
  – scales to hundreds of VMs per node (12MB each)
• Scheduling
  – CPU
    • fair share per slice (guarantees possible)
  – link bandwidth
    • fair share per slice
    • average rate limit: 1.5Mbps (24-hour bucket size)
    • peak rate limit: set by each site (100Mbps default)
  – disk
    • 5GB quota per slice (limit run-away log files)
  – memory
    • no limit
    • pl_mom resets biggest user at 90% utilization
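Two of the numbers on this slide lend themselves to a quick worked example. The sketch below (a) works out how many bytes a slice could send per day at a 1.5 Mbps average, reading the "24-hour bucket size" as a budget averaged over one day, and (b) mimics the stated memory policy of resetting the biggest user at 90% utilization. Both function names and the input format are assumptions for illustration; this is not how `pl_mom` or the bandwidth scheduler is actually implemented.

```python
def daily_byte_budget(rate_bps=1_500_000, window_s=24 * 3600):
    """Bytes a slice may send per window at the given average bit rate."""
    return rate_bps * window_s // 8


def pick_slice_to_reset(memory_by_slice, total_memory, threshold=0.90):
    """Return the slice using the most memory once utilization reaches
    the threshold; return None while the node is below it."""
    used = sum(memory_by_slice.values())
    if not memory_by_slice or used / total_memory < threshold:
        return None
    return max(memory_by_slice, key=memory_by_slice.get)
```

Under that reading, the daily budget is 1,500,000 × 86,400 / 8 = 16,200,000,000 bytes, or about 16.2 GB per slice per node. The memory policy reflects the "overbook with recovery" approach: there is no per-slice limit, and the largest consumer is reset only once the node is nearly out of memory.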
![Page 28: PlanetLab: Evolution vs Intelligent Design in Global Network Infrastructure Larry Peterson Princeton University](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649f585503460f94c7d591/html5/thumbnails/28.jpg)
VMM (cont)
• VNET
  – socket programs “just work”
    • including raw sockets
  – slices should be able to send only…
    • well-formed IP packets
    • to non-blacklisted hosts
  – slices should be able to receive only…
    • packets related to connections that they initiated (e.g., replies)
    • packets destined for bound ports (e.g., server requests)
  – essentially a switching firewall for sockets
    • leverages Linux's built-in connection tracking modules
  – also supports virtual devices
    • standard PF_PACKET behavior
    • used to connect to a “virtual ISP”
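The send and receive rules listed for VNET can be written down as two predicates. The sketch below is a simplified model, not VNET itself: VNET relies on Linux connection tracking, whereas this example substitutes a plain set of connection tuples, and every name in it (`SliceNetState`, `may_send`, `may_receive`) is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SliceNetState:
    """Per-slice socket state consulted by the filter (hypothetical)."""
    bound_ports: set = field(default_factory=set)
    # Connections the slice initiated: (local_port, remote_ip, remote_port)
    initiated: set = field(default_factory=set)


def may_send(dst_ip, well_formed, blacklist):
    """Outbound rule: well-formed IP packets to non-blacklisted hosts only."""
    return well_formed and dst_ip not in blacklist


def may_receive(state, src_ip, src_port, dst_port):
    """Inbound rule: replies to connections the slice initiated,
    or packets destined for a port the slice has bound."""
    is_reply = (dst_port, src_ip, src_port) in state.initiated
    return is_reply or dst_port in state.bound_ports


# A slice that runs a server on port 8080 and has one outgoing connection.
state = SliceNetState(bound_ports={8080})
state.initiated.add((40000, "192.0.2.10", 80))
```

With this state, a reply from `192.0.2.10:80` to local port 40000 is delivered, a request to port 8080 is delivered, and an unsolicited packet to port 22 is not, which is the "switching firewall for sockets" behavior the slide describes.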
![Page 29: PlanetLab: Evolution vs Intelligent Design in Global Network Infrastructure Larry Peterson Princeton University](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649f585503460f94c7d591/html5/thumbnails/29.jpg)
Node Manager
• SliverMgr
  – creates VM and sets resource allocations
  – interacts with…
    • bootstrap slice creation service (pl_scs)
    • third-party slice creation & brokerage services (using tickets)
• Proper: PRivileged OPERations
  – grants unprivileged slices access to privileged info
  – effectively “pokes holes” in the namespace isolation
  – examples
    • files: open, get/set flags
    • directories: mount/unmount
    • sockets: create/bind
    • processes: fork/wait/kill
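The idea of "poking holes" in namespace isolation can be sketched as a policy lookup: a privileged helper performs an operation on behalf of a slice only if a rule allows that slice, that operation, and that target. The policy format, the class name `ProperSketch`, the operation name `open_file`, and the slice name `monitor_slice` are all invented for illustration; Proper's real interface is not shown in these slides.

```python
import posixpath


class ProperSketch:
    """Toy policy check for privileged operations requested by slices."""

    def __init__(self, policy):
        # policy maps (slice name, operation) -> list of allowed path prefixes
        self.policy = policy

    def is_allowed(self, slice_name, operation, target):
        # Normalize first so "/proc/../etc/shadow" cannot pass a "/proc" rule.
        path = posixpath.normpath(target)
        for prefix in self.policy.get((slice_name, operation), []):
            root = posixpath.normpath(prefix)
            if path == root or path.startswith(root.rstrip("/") + "/"):
                return True
        return False


# Hypothetical rule: one monitoring slice may open files under /proc.
proper = ProperSketch({("monitor_slice", "open_file"): ["/proc"]})
```

A request such as `proper.is_allowed("monitor_slice", "open_file", "/proc/loadavg")` is granted, while the same request from any other slice, or for a path outside `/proc`, is refused. Each hole is scoped to one slice and one kind of operation, so no slice ends up with root.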
![Page 30: PlanetLab: Evolution vs Intelligent Design in Global Network Infrastructure Larry Peterson Princeton University](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649f585503460f94c7d591/html5/thumbnails/30.jpg)
Auditing & Monitoring
• PlanetFlow
  – logs every outbound IP flow on every node
    • accesses ulogd via Proper
    • retrieves packet headers, timestamps, context ids (batched)
  – used to audit traffic
  – aggregated and archived at PLC
• SliceStat
  – has access to kernel-level / system-wide information
    • accesses /proc via Proper
  – used by global monitoring services
  – used to debug the performance of services
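Turning batched packet headers into per-flow records, as PlanetFlow is described as doing, amounts to grouping by context id plus the usual 5-tuple. The sketch below shows that aggregation under stated assumptions: the record field names (`ctx`, `src`, `sport`, `dst`, `dport`, `proto`, `length`, `ts`) are invented here, and nothing is implied about PlanetFlow's actual log format.

```python
from collections import defaultdict


def aggregate_flows(packets):
    """Collapse packet records into per-flow counters.

    Each packet is a dict with hypothetical keys:
    ctx (context id of the sending slice), src, sport, dst, dport,
    proto, length (bytes), ts (timestamp in seconds).
    """
    flows = defaultdict(
        lambda: {"packets": 0, "bytes": 0, "first": None, "last": None}
    )
    for p in packets:
        key = (p["ctx"], p["src"], p["sport"], p["dst"], p["dport"], p["proto"])
        f = flows[key]
        f["packets"] += 1
        f["bytes"] += p["length"]
        f["first"] = p["ts"] if f["first"] is None else min(f["first"], p["ts"])
        f["last"] = p["ts"] if f["last"] is None else max(f["last"], p["ts"])
    return dict(flows)
```

Keeping the context id in the key is what allows a flow to be traced back to the slice that generated it.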
![Page 31: PlanetLab: Evolution vs Intelligent Design in Global Network Infrastructure Larry Peterson Princeton University](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649f585503460f94c7d591/html5/thumbnails/31.jpg)
Infrastructure Services
• Brokerage Services
  – Sirius: Georgia
  – Bellagio: UCSD, Harvard, Intel
  – Tycoon: HP
• Environment Services
  – Stork: Arizona
  – AppMgr: MIT
• Monitoring/Discovery Services
  – CoMon: Princeton
  – PsEPR: Intel
  – SWORD: Berkeley
  – IrisLog: Intel
![Page 32: PlanetLab: Evolution vs Intelligent Design in Global Network Infrastructure Larry Peterson Princeton University](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649f585503460f94c7d591/html5/thumbnails/32.jpg)
Evolution vs Intelligent Design
• Favor evolution over clean slate
• Favor design principles over a fixed architecture
• Specifically…
  – leverage existing software and interfaces
  – keep VMM and control plane orthogonal
  – exploit virtualization
    • vertical: mgmt services run in slices
    • horizontal: stacks of VMs
  – give no one root (least privilege + level playing field)
  – support federation (decentralized control)
![Page 33: PlanetLab: Evolution vs Intelligent Design in Global Network Infrastructure Larry Peterson Princeton University](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649f585503460f94c7d591/html5/thumbnails/33.jpg)
Other Lessons
• Inferior tracks lead to superior locomotives
• Empower the user: yum
• Build it and they (research papers) will come
• Overlays are not networks
• PlanetLab: We debug your network
• From universal connectivity to gated communities
• If you don’t talk to your university’s general counsel, you aren’t doing network research
• Work fast, before anyone cares
![Page 34: PlanetLab: Evolution vs Intelligent Design in Global Network Infrastructure Larry Peterson Princeton University](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649f585503460f94c7d591/html5/thumbnails/34.jpg)
Collaborators
• Andy Bavier
• Marc Fiuczynski
• Mark Huang
• Scott Karlin
• Aaron Klingaman
• Martin Makowiecki
• Reid Moran
• Steve Muir
• Stephen Soltesz
• Mike Wawrzoniak
• David Culler, Berkeley
• Tom Anderson, UW
• Timothy Roscoe, Intel
• Mic Bowman, Intel
• John Hartman, Arizona
• David Lowenthal, UGA
• Vivek Pai, Princeton
• David Parkes, Harvard
• Amin Vahdat, UCSD
• Rick McGeer, HP Labs
![Page 35: PlanetLab: Evolution vs Intelligent Design in Global Network Infrastructure Larry Peterson Princeton University](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649f585503460f94c7d591/html5/thumbnails/35.jpg)
Available CPU Capacity
[Figure: plot with axes labeled “Pct of CPU Available” and “Pct of 360 Nodes”; data from Feb 1-8, 2005 (week before SIGCOMM deadline)]
![Page 36: PlanetLab: Evolution vs Intelligent Design in Global Network Infrastructure Larry Peterson Princeton University](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649f585503460f94c7d591/html5/thumbnails/36.jpg)
Node Boot/Install
Participants: Node, Boot Manager, PLC (MA) Boot Server
1. Boots from BootCD (Linux loaded)
2. Hardware initialized
3. Read network config from floppy
4. Contact PLC (MA)
5. Send boot manager
6. Execute boot mgr
7. Node key read into memory from floppy
8. Invoke Boot API
9. Verify node key, send current node state
10. State = “install”, run installer
11. Update node state via Boot API
12. Verify node key, change state to “boot”
13. Chain-boot node (no restart)
14. Node booted
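Steps 8 through 13 of the boot sequence form a small state machine between the boot manager on the node and the boot server at PLC. The following is a minimal sketch of that exchange, assuming an in-memory server and a shared node key compared in constant time. The class and method names (`BootServer`, `get_state`, `set_booted`, `boot_manager`) are hypothetical and do not reflect the real Boot API, and the behavior for a node already in the “boot” state (chain-boot directly) is an assumption.

```python
import hmac


class BootServer:
    """Stand-in for the PLC (MA) boot server; names are hypothetical."""

    def __init__(self):
        self.nodes = {}  # node_id -> {"key": str, "state": str}

    def register(self, node_id, key, state="install"):
        self.nodes[node_id] = {"key": key, "state": state}

    def _verify(self, node_id, key):
        rec = self.nodes.get(node_id)
        if rec is None or not hmac.compare_digest(rec["key"].encode(), key.encode()):
            raise PermissionError("node key rejected")
        return rec

    def get_state(self, node_id, key):
        # Step 9: verify node key, send current node state.
        return self._verify(node_id, key)["state"]

    def set_booted(self, node_id, key):
        # Step 12: verify node key, change state to "boot".
        self._verify(node_id, key)["state"] = "boot"


def boot_manager(server, node_id, key):
    """Steps 8-13 as seen from the node; returns the actions taken."""
    actions = []
    state = server.get_state(node_id, key)   # steps 8-9
    if state == "install":                   # step 10
        actions.append("run installer")
        server.set_booted(node_id, key)      # steps 11-12
    actions.append("chain-boot")             # step 13
    return actions
```

A freshly registered node runs the installer once and is then marked “boot”; on later boots the sketch skips straight to the chain-boot step. A node that presents the wrong key is refused at step 9.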
![Page 37: PlanetLab: Evolution vs Intelligent Design in Global Network Infrastructure Larry Peterson Princeton University](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649f585503460f94c7d591/html5/thumbnails/37.jpg)
Chain of Responsibility
• Join Request: PI submits Consortium paperwork and requests to join
• PI Activated: PLC verifies PI, activates account, enables site (logged)
• User Activated: Users create accounts with keys, PI activates accounts (logged)
• Slice Created: PI creates slice and assigns users to it (logged)
• Nodes Added to Slices: Users add nodes to their slice (logged)
• Slice Traffic Logged: Experiments generate traffic (logged by PlanetFlow)
• Traffic Logs Centrally Stored: PLC periodically pulls traffic logs from nodes
Network Activity → Slice → Responsible Users & PI
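The mapping from network activity to a slice and then to responsible users and PI can be pictured as two lookups: find the logged flows that match a complaint, then look up who is attached to each slice involved. This is a sketch only; the function name `resolve_complaint`, the flow record fields (`slice`, `dst`, `start`, `end`), and the slice database layout are assumptions made for illustration.

```python
def resolve_complaint(flow_log, slice_db, dst_ip, when):
    """Return the slices (with users and PI) that sent traffic to
    dst_ip at time `when`.

    flow_log -- list of dicts with hypothetical keys: slice, dst, start, end
    slice_db -- {slice name: {"pi": str, "users": iterable of str}}
    """
    slices = sorted(
        {f["slice"] for f in flow_log
         if f["dst"] == dst_ip and f["start"] <= when <= f["end"]}
    )
    return [
        {"slice": s, "pi": slice_db[s]["pi"], "users": sorted(slice_db[s]["users"])}
        for s in slices
    ]
```

The lookup only works because every earlier step in the chain was logged: if a slice could be created or a user added without a record, the second lookup would have nothing to return.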
![Page 38: PlanetLab: Evolution vs Intelligent Design in Global Network Infrastructure Larry Peterson Princeton University](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649f585503460f94c7d591/html5/thumbnails/38.jpg)
Slice Creation
[Figure: PLC (SA) and a node running VMM, NM, and VMs]
• PI: SliceCreate( ), SliceUsersAdd( )
• User: SliceAttributeSet( ), SliceGetTicket( )
• (distribute ticket to slice creation service: pl_scs)
• SliverCreate(rspec)
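The call sequence on this slide can be walked through with an in-memory mock. The operation names come from the slide, but their signatures are not given there, so every parameter below is an assumption. The ticket is modeled as an rspec plus an HMAC computed with a demo secret shared between PLC and the node; a real deployment would more plausibly use public-key signatures, and here `SliverCreate` takes the whole ticket rather than a bare rspec so that the node can check it.

```python
import hashlib
import hmac
import json

PLC_SECRET = b"demo-only-secret"  # stand-in for PLC's signing key (assumption)


def _sign(rspec_text):
    return hmac.new(PLC_SECRET, rspec_text.encode(), hashlib.sha256).hexdigest()


class SliceAuthority:
    """Mock PLC (SA); method names from the slide, signatures assumed."""

    def __init__(self):
        self.slices = {}

    def SliceCreate(self, pi, name):
        self.slices[name] = {"pi": pi, "users": set(), "attrs": {}}

    def SliceUsersAdd(self, name, users):
        self.slices[name]["users"].update(users)

    def SliceAttributeSet(self, name, key, value):
        self.slices[name]["attrs"][key] = value

    def SliceGetTicket(self, name):
        rspec = json.dumps(
            {"slice": name, "attrs": self.slices[name]["attrs"]}, sort_keys=True
        )
        return {"rspec": rspec, "sig": _sign(rspec)}


class NodeManager:
    """Mock node manager that creates a sliver from a valid ticket."""

    def __init__(self):
        self.slivers = {}

    def SliverCreate(self, ticket):
        if not hmac.compare_digest(_sign(ticket["rspec"]), ticket["sig"]):
            raise PermissionError("ticket was not issued by PLC")
        rspec = json.loads(ticket["rspec"])
        self.slivers[rspec["slice"]] = rspec["attrs"]
        return rspec["slice"]
```

In this mock, the PI calls `SliceCreate` and `SliceUsersAdd`, a user calls `SliceAttributeSet` and `SliceGetTicket`, and whoever holds the ticket (pl_scs in the slide) presents it to each node's `SliverCreate`. The node never contacts PLC during creation; it only verifies the ticket.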
![Page 39: PlanetLab: Evolution vs Intelligent Design in Global Network Infrastructure Larry Peterson Princeton University](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649f585503460f94c7d591/html5/thumbnails/39.jpg)
Brokerage Service
[Figure: PLC (SA), Broker, and a node running VMM, NM, and VMs]
• SliceAttributeSet( ), SliceGetTicket( )
• (distribute ticket to brokerage service)
• rcap = PoolCreate(rspec)
![Page 40: PlanetLab: Evolution vs Intelligent Design in Global Network Infrastructure Larry Peterson Princeton University](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649f585503460f94c7d591/html5/thumbnails/40.jpg)
Brokerage Service (cont)
[Figure: PLC (SA), Broker, User, and a node running VMM, NM, and VMs]
• User → Broker: BuyResources( )
• (broker contacts relevant nodes)
• PoolSplit(rcap, slice, rspec)
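The two brokerage slides suggest a two-step pattern: the broker first obtains a resource pool on a node and holds a capability (`rcap`) for it, then carves pieces of that pool out for slices whose users buy resources. The sketch below models only the node-side bookkeeping. `PoolCreate` and `PoolSplit` are named on the slides, but their semantics, the rspec layout (a dict of resource name to amount), and the use of a random token as the capability are assumptions.

```python
import secrets


class NodeManagerPools:
    """Node-side bookkeeping for broker-owned resource pools (a sketch)."""

    def __init__(self):
        self.pools = {}    # rcap -> remaining resources in the pool
        self.slivers = {}  # slice name -> resources granted so far

    def PoolCreate(self, rspec):
        # The returned capability is an unguessable token naming the pool.
        rcap = secrets.token_hex(16)
        self.pools[rcap] = dict(rspec)
        return rcap

    def PoolSplit(self, rcap, slice_name, rspec):
        pool = self.pools[rcap]  # KeyError if the capability is unknown
        # Check everything first so a failed split leaves the pool untouched.
        for res, amount in rspec.items():
            if pool.get(res, 0) < amount:
                raise ValueError(f"pool has too little {res}")
        granted = self.slivers.setdefault(slice_name, {})
        for res, amount in rspec.items():
            pool[res] -= amount
            granted[res] = granted.get(res, 0) + amount
```

For example, a broker might call `rcap = nm.PoolCreate({"cpu_share": 10})` once, and later call `nm.PoolSplit(rcap, "ucsd_share", {"cpu_share": 4})` after a purchase, leaving 6 units in the pool. Because the node honors whoever holds `rcap`, the broker can implement any allocation policy it likes without the node or PLC knowing about it.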