Dynamic Circuit Services in US LHCNet
Artur Barczyk, Caltech. Joint Techs Workshop, Honolulu, 01/23/2008.
[Figure labels: CERN Tier-0; FNAL Tier-1; BNL Tier-1; University Tier-2s; University Tier-3s; storage systems: CASTOR, dCache, xrootd, REDDNet; network services: US LHCNet + VINCI, DRAGON, OSCARS, PLaNetS + VINCI]
US LHCNet Overview
  Mission-oriented network: provide trans-Atlantic network infrastructure to support the US LHC program
  Four PoPs:
    CERN
    Starlight (→ Fermilab)
    Manlan (→ Brookhaven)
    SARA
  2008: 30 (40) Gbps trans-Atlantic bandwidth (roadmap: 80 Gbps by 2010)
Large Hadron Collider @ CERN
  Start in 2008
  pp collisions, $\sqrt{s} = 14$ TeV, $L = 10^{34}$ cm$^{-2}$ s$^{-1}$
  27 km tunnel in Switzerland & France
  Experiments: ALICE, ATLAS, CMS, LHCb
  Higgs, SUSY, Extra Dimensions, CP Violation, QG Plasma, … the Unexpected
  6000+ Physicists & Engineers, 250+ Institutes, 60+ Countries
  Challenges:
    Analyze petabytes of complex data cooperatively
    Harness global computing, data & network resources
The LHC Data Grid Hierarchy
  Emerging Vision: A Richly Structured, Global Dynamic System
  CERN/Outside ratio ~1:4; T0/(T1)/(T2) ~1:2:2
  ~40% of resources in Tier2s
  US T1s and T2s connect to US LHCNet PoPs
  Outside/CERN ratio larger; expanded role of Tier1s & Tier2s: greater reliance on networks
  [Figure labels: Online; 10 Gbps; 10 – 40 Gbps; GEANT2 + NRENs; USLHCNet + ESnet; Germany T1; BNL T1]
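The ratios quoted on this slide can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: it takes the approximate 1:2:2 split at face value, and the names `shares`, `share_of_total` and `outside_to_cern` are made up for the example.

```python
from fractions import Fraction

# Relative resource shares T0 : T1 : T2 as quoted on the slide (~1:2:2).
shares = {"Tier0": 1, "Tier1": 2, "Tier2": 2}
total = sum(shares.values())

# Fraction of all resources held by each tier.
share_of_total = {tier: Fraction(n, total) for tier, n in shares.items()}

# Resources outside CERN (Tier1 + Tier2) relative to CERN (Tier0).
outside_to_cern = Fraction(shares["Tier1"] + shares["Tier2"], shares["Tier0"])

for tier, share in share_of_total.items():
    print(f"{tier}: {float(share):.0%}")
print(f"CERN : outside = 1 : {outside_to_cern}")
```

With a 1:2:2 split, Tier2s hold 2/5 of the total, which matches the ~40% figure, and the resources outside CERN are four times those at CERN, which matches the ~1:4 ratio.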
The Roles of Tier Centers
  11 Tier1s, over 100 Tier2s → LHC computing will be more dynamic & network-oriented
  Tier 0 (CERN): prompt calibration and alignment; reconstruction; store complete set of RAW data
  Tier 1: reprocessing; store part of processed data
  Tier 2: Monte Carlo production; physics analysis
  Tier 3: physics analysis
  This division of roles defines the dynamism of data transfers
Requirements for Dynamic Circuit Services in US LHCNet
CMS Data Transfer Volume (May – Aug. 2007)
  10 petabytes transferred over 4 months = 8.0 Gbps avg. (15 Gbps peak)
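The average rate follows directly from the volume and the duration. The sketch below redoes that arithmetic under two assumptions that the slide does not state: decimal petabytes (10^15 bytes) and the 123 calendar days from May through August. The function name `average_gbps` is illustrative.

```python
def average_gbps(volume_bytes: float, duration_seconds: float) -> float:
    """Average throughput in gigabits per second (decimal giga)."""
    return volume_bytes * 8 / duration_seconds / 1e9


# Assumptions (not stated on the slide): decimal petabytes and the
# 123 calendar days of May, June, July and August.
PETABYTE = 1e15
SECONDS_PER_DAY = 86_400
DAYS_MAY_TO_AUG = 31 + 30 + 31 + 31

avg = average_gbps(10 * PETABYTE, DAYS_MAY_TO_AUG * SECONDS_PER_DAY)
print(f"average throughput: {avg:.1f} Gbps")
```

Under these assumptions the result is roughly 7.5 Gbps, in the same range as the 8.0 Gbps quoted. The exact value depends on the unit convention (binary petabytes give a higher number) and on the precise measurement window.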
End-system capabilities growing
  88 Gbps peak; 80+ Gbps sustainable for hours, storage-to-storage
  40 G in, 40 G out
Managed Data Transfers
  The scale of the problem and the capabilities of the end-systems require a managed approach with scheduled data transfer requests
  The dynamism of the data transfers defines the requirements for scheduling
    Tier0 → Tier1, linked to duty cycle of the LHC
    Tier1 → Tier1, whenever data sets are reprocessed
    Tier1 → Tier2, distribute data sets for analysis
    Tier2 → Tier1, distribute MC produced data
  Transfer Classes
    Fixed allocation
    Preemptible transfers
    Best effort
  Priorities
  Preemption
    Use LCAS to squeeze low(er) priority circuits
  Interact with End-Systems
    Verify and monitor capabilities
  All of this will happen "on demand" from the experiments' Data Management systems
  Needs to work end-to-end: collaboration in GLIF, DICE
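The interplay of transfer classes, priorities and preemption can be sketched as a small admission routine. This is a minimal illustration, not the VINCI implementation: `TransferClass`, `Circuit`, `squeeze_for`, the "larger number means higher priority" convention and the `min_gbps` floor are all hypothetical. Bandwidth is treated as a continuous quantity, whereas LCAS on real equipment resizes a circuit in discrete member units.

```python
from dataclasses import dataclass
from enum import Enum


class TransferClass(Enum):
    FIXED = "fixed allocation"
    PREEMPTIBLE = "preemptible"
    BEST_EFFORT = "best effort"


@dataclass
class Circuit:
    name: str
    transfer_class: TransferClass
    priority: int           # hypothetical convention: larger = more important
    bandwidth_gbps: float
    min_gbps: float = 0.0   # hypothetical floor a circuit is never squeezed below


def squeeze_for(new: Circuit, active: list[Circuit], capacity_gbps: float) -> bool:
    """Try to admit `new` by shrinking lower-priority, non-fixed circuits.

    On success the victims are shrunk, `new` is appended to `active`, and
    True is returned. On failure nothing is modified.
    """
    free = capacity_gbps - sum(c.bandwidth_gbps for c in active)
    needed = new.bandwidth_gbps - free
    # Candidates: lower priority than the newcomer and not a fixed
    # allocation; the lowest priority is squeezed first.
    victims = sorted(
        (c for c in active
         if c.priority < new.priority
         and c.transfer_class is not TransferClass.FIXED),
        key=lambda c: c.priority,
    )
    plan = []
    for c in victims:
        if needed <= 0:
            break
        cut = min(needed, c.bandwidth_gbps - c.min_gbps)
        if cut > 0:
            plan.append((c, cut))
            needed -= cut
    if needed > 0:
        return False  # not enough squeezable bandwidth
    for c, cut in plan:
        c.bandwidth_gbps -= cut
    active.append(new)
    return True


active = [
    Circuit("T0-T1 raw data", TransferClass.FIXED, 3, 4.0),
    Circuit("T1-T2 analysis", TransferClass.PREEMPTIBLE, 1, 4.0, min_gbps=1.0),
    Circuit("background", TransferClass.BEST_EFFORT, 0, 2.0),
]
urgent = Circuit("T1-T1 reprocessing", TransferClass.PREEMPTIBLE, 2, 4.0)
admitted = squeeze_for(urgent, active, capacity_gbps=10.0)
print(admitted, [(c.name, c.bandwidth_gbps) for c in active])
```

In this example the link is full, so the best-effort circuit is reduced first and the lower-priority preemptible circuit second, while the fixed allocation is left untouched.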
Managed Network Services: Operations Scenario
  Receive request, check capabilities, schedule network resources
    "Transfer N Gigabytes from A to B with target throughput R1"
    Authenticate/authorize/prioritize
    Verify end-host rate capabilities R2 (achievable rate)
    Schedule bandwidth B > R2; estimate time to complete T(0)
    Schedule path with priorities P(i) on segment S(i)
  Check progress periodically
    Compare rate R(t) to R2, update the time-to-complete estimate from T(i-1) to T(i)
  Trigger on behaviours requiring further action
    Error (e.g. segment failure)
    Performance issues (e.g. poor progress, channel underutilized, long waits)
    State change (e.g. new high priority transfer submitted)
  Respond dynamically to match policies and optimize throughput
    Change channel size(s)
    Build alternative path(s)
    Create new channel(s) and squeeze others in class
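The estimate-and-check loop in this scenario can be sketched as follows. The slide does not say how T(0) is computed or what counts as poor progress, so this sketch assumes that T(0) is derived from the achievable end-host rate R2 and that progress is poor when the observed rate R(t) falls below a fraction of R2. `Transfer`, `estimate_seconds`, `check_progress` and `tolerance` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Transfer:
    size_gb: float          # N, gigabytes to move
    target_gbps: float      # R1, requested throughput
    achievable_gbps: float  # R2, verified end-host rate
    done_gb: float = 0.0    # gigabytes already transferred


def estimate_seconds(remaining_gb: float, rate_gbps: float) -> float:
    """Time to complete at a given rate (gigabytes -> gigabits -> seconds)."""
    if rate_gbps <= 0:
        return float("inf")
    return remaining_gb * 8 / rate_gbps


def check_progress(t: Transfer, observed_gbps: float, tolerance: float = 0.8):
    """One periodic check: return the updated estimate T(i) and an action.

    `tolerance` is an assumed threshold: below tolerance * R2 the transfer
    is flagged as making poor progress.
    """
    new_estimate = estimate_seconds(t.size_gb - t.done_gb, observed_gbps)
    if observed_gbps <= 0:
        action = "error"
    elif observed_gbps < tolerance * t.achievable_gbps:
        action = "poor-progress"
    else:
        action = "ok"
    return new_estimate, action


t = Transfer(size_gb=1000, target_gbps=5.0, achievable_gbps=4.0)
t0 = estimate_seconds(t.size_gb, t.achievable_gbps)   # T(0)
t.done_gb = 250
t1, action = check_progress(t, observed_gbps=2.0)     # T(1) and trigger
print(t0, t1, action)
```

A "poor-progress" or "error" result corresponds to the trigger step, after which the service would respond by resizing the channel or building an alternative path.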
Managed Network Services: End-System Integration
  Integration of network services and end-systems
    Requires end-to-end view of the network and end-systems, real-time monitoring
    Robust, real-time and scalable messaging infrastructure
    Information extraction and correlation
      e.g. network state, end-host state, transfer queue state
      Obtain via interactions between network services and end-host agents (EHAs)
      Provide sufficient information for decision support
  Cooperation of EHAs and network services
    Automate some operational decisions using accumulated experience
    Increase level of automation to respond to increases in usage, number of users, and competition for scarce network resources
  Required for a robust end-to-end production system
Lightpaths in US LHCNet domain
  VINCI (Virtual Intelligent Networks for Computing Infrastructures in Physics)
  Dynamic setup and reservation of lightpaths has been successfully demonstrated by the VINCI project controlling optical switches
  [Figure labels: Control Plane; Data Plane]
Planned Interfaces
  Most, if not all, LHC data transfers will cross more than one domain
    E.g. in order to transfer data from CERN to Fermilab: CERN → US LHCNet → ESnet → Fermilab
    VINCI Control Plane for intra-domain, DCN (DICE/GLIF) IDC for inter-domain provisioning
  I-NNI: VINCI (custom) protocols
  E-NNI: Web Services (DCN IDC)
  UNI: VINCI custom protocol, client = EHA
  UNI: DCN IDC? LambdaStation? TeraPaths?
Protection Schemes
  Mesh-protection at Layer 1
    US LHCNet links are assigned to primary users
      CERN – Starlight for CMS
      CERN – Manlan for ATLAS
    In case of link failure, cannot blindly use bandwidth belonging to the other collaboration
    Carefully choose protection links, e.g. use the indirect path (CERN-SARA-Manlan)
    Designated Transit Lists, and DTL-Sets
  High-level protection features implemented in VINCI
    Re-provision lower priority circuits
    Preemption, LCAS
  Needs to work end-to-end: collaboration in GLIF, DICE
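The rule "do not blindly use the other collaboration's bandwidth" can be illustrated by filtering an ordered set of candidate transit lists. This is a sketch under assumptions, not the CoreDirector or VINCI logic: the "shared" owner label, the Starlight to Manlan link, and the names `Link`, `find_link`, `usable` and `pick_protection_path` are invented for the example. Only the two primary-user assignments and the CERN-SARA-Manlan detour come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    a: str
    b: str
    owner: str  # primary user; "shared" is a hypothetical label


LINKS = [
    Link("CERN", "Starlight", "CMS"),        # from the slide
    Link("CERN", "Manlan", "ATLAS"),         # from the slide
    Link("CERN", "SARA", "shared"),          # assumption
    Link("SARA", "Manlan", "shared"),        # assumption
    Link("Starlight", "Manlan", "shared"),   # assumption
]


def find_link(a, b, links):
    """Return the link joining a and b in either direction, or None."""
    for link in links:
        if {link.a, link.b} == {a, b}:
            return link
    return None


def usable(dtl, collaboration, failed, links):
    """A transit list is usable if every hop exists, is up, and is either
    shared or owned by the requesting collaboration."""
    for a, b in zip(dtl, dtl[1:]):
        link = find_link(a, b, links)
        if link is None or link in failed:
            return False
        if link.owner not in (collaboration, "shared"):
            return False
    return True


def pick_protection_path(dtl_set, collaboration, failed, links):
    """Return the first usable transit list in preference order, or None."""
    for dtl in dtl_set:
        if usable(dtl, collaboration, failed, links):
            return dtl
    return None


dtl_set = [
    ["CERN", "Manlan"],               # primary path
    ["CERN", "Starlight", "Manlan"],  # would borrow the CMS link
    ["CERN", "SARA", "Manlan"],       # indirect path named on the slide
]
failed = {find_link("CERN", "Manlan", LINKS)}
choice = pick_protection_path(dtl_set, "ATLAS", failed, LINKS)
print(choice)
```

With the CERN to Manlan link down, the path through Starlight is rejected because its first hop belongs to CMS, and the indirect path through SARA is selected.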
Basic Functionality To-Date
  Pre-production (R&D) setup:
    Local domain: routing of private IP subnets onto tagged VLANs
    Core network (TDM): VLAN-based Virtual Circuits
  Semi-automatic intra-domain circuit provisioning
  Bandwidth adjustment (LCAS)
  End-host tuning by the End-Host Agent
  End-to-End monitoring
  [Figure labels: High performance servers; Ciena CoreDirectors; US LHCNet routers; Ultralight routers]
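Routing private IP subnets onto tagged VLANs amounts to a lookup from destination address to VLAN tag. The sketch below shows that lookup with Python's standard `ipaddress` module. The subnets and VLAN tags are hypothetical, since the slides do not list the real ones, and `SUBNET_TO_VLAN` and `vlan_for` are illustrative names.

```python
import ipaddress

# Hypothetical private subnets and VLAN tags; the real values are not
# given in the slides.
SUBNET_TO_VLAN = {
    ipaddress.ip_network("10.1.0.0/24"): 3001,
    ipaddress.ip_network("10.2.0.0/24"): 3002,
}


def vlan_for(address: str):
    """Return the VLAN tag whose subnet contains `address`, or None."""
    ip = ipaddress.ip_address(address)
    for subnet, vlan in SUBNET_TO_VLAN.items():
        if ip in subnet:
            return vlan
    return None


print(vlan_for("10.1.0.17"))
print(vlan_for("192.168.0.1"))
```

Traffic whose destination falls in one of the private subnets is steered onto the matching VLAN and from there onto the corresponding virtual circuit in the core. Anything else returns None and would follow the normal routed path.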
MonALISA: Monitoring the US LHCNet Ciena CDCI Network
  [Figure labels: CERN Geneva; Manlan; SARA; Starlight; USLHCnet]
Roadmap Ahead
  The current capabilities include
    End-to-End monitoring
    Intra-domain circuit provisioning
    End-host tuning by the End-Host Agent
  Towards a production system (intra-domain)
    Integrate existing end-host agent, monitoring and measurement services
    Provide a uniform user/application interface
    Integration with experiments' Data Management Systems
    Automated fault handling
    Priority-based transfer scheduling
    Include Authorisation, Authentication and Accounting
  Towards a production system (inter-domain)
    Interface to DCN IDC
    Work with DICE, GLIF on IDC protocol specification
    Topology exchange, routing, end-to-end path calculation
    Extend AAA infrastructure to multi-domain
Summary and Conclusions
  Movement of LHC data will be highly dynamic
    Follow LHC data grid hierarchy
    Different data sets (size, transfer speed and duration), different priorities
  Data Management requires network-awareness
    Guaranteed bandwidth end-to-end (storage-system to storage-system)
    End-to-end monitoring including end-systems
  We are developing the intra-domain control plane for US LHCNet
    VINCI project, based on MonALISA framework
    Many services and agents are already developed or in advanced state
  Use Internet2's IDC protocol for inter-domain provisioning
    Collaboration with Internet2, ESnet, LambdaStation, TeraPaths on end-to-end circuit provisioning