The Science DMZ: Recent Developments
Eli Dart, Network Engineer, ESnet Science Engagement, Lawrence Berkeley National Laboratory
WRNP17
Belém, Brazil
May 16, 2017
© 2017, Energy Sciences Network
Overview
• Science DMZ As Platform
• Modern Research Data Portal
• Pacific Research Platform
– PRP
– NRP
• Note: This talk assumes you already understand the Science DMZ
– If you haven't encountered the Science DMZ, several folks in RNP can help you, including Leandro Ciuffo and Alex Moura
– Or check out the fasterdata knowledge base:
• http://fasterdata.es.net/science-dmz/
2 – ESnet Science Engagement ([email protected]) - 5/15/17
Science DMZ As A Platform
• Once there are many Science DMZs in your network, more things become possible
• Easy file transfer is good, but what else can we do?
– Update the architecture of data portals
– Build services between institutions
– Interconnect facilities
• Several efforts underway to do these things
Science Data Portals
• Large repositories of scientific data
– Climate data
– Sky surveys (astronomy, cosmology)
– Many others
– Data search, browsing, access
• Many scientific data portals were designed 15+ years ago
– Single-web-server design
– Data browse/search, data access, user awareness all in a single system
– All the data goes through the portal server
• In many cases by design
• E.g. embargo before publication (enforce access control)
Legacy Portal Design
[Figure: legacy portal architecture. Labeled elements: WAN, border router (10GE), firewall, enterprise network, two perfSONAR hosts, portal server (10GE), and filesystem (data store). The browsing path, query path, and data path all go through the portal server, which runs the web server, search, database, authentication, and data service.]
• Very difficult to improve performance without architectural change
– Software components all tangled together
– Difficult to put the whole portal in a Science DMZ because of security
– Even if you could put it in a DMZ, many components aren't scalable
• What does architectural change mean?
Example of Architectural Change – CDN
• Let's look at what Content Delivery Networks did for web applications
• CDNs are a well-deployed design pattern
– Akamai and friends
– Entire industry in CDNs
– Assumed part of today's Internet architecture
• What does a CDN do?
– Store static content in a separate location from dynamic content
• Complexity isn't in the static content – it's in the application dynamics
• Web applications are complex, full-featured, and slow
– Databases, user awareness, etc.
– Lots of integrated pieces
• Data service for static content is simple by comparison
– Separation of application and data service allows each to be optimized
Classical Web Server Model
• Web browser fetches pages from web server
– All content stored on the web server
– Web applications run on the web server
• Web server may call out to local database
• Fundamentally all processing is local to the web server
– Web server sends data to client browser over the network
• Perceived client performance changes with network conditions
– Several problems in the general case
– Latency increases time to page render
– Packet loss + latency cause problems for large static objects
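The bullets above say that packet loss combined with latency hurts large static objects. One common way to see why is the simplified Mathis et al. approximation for steady-state TCP throughput, rate ≤ MSS / (RTT · √p). The talk does not name this model, so treat it as an illustrative assumption; the MSS, RTT, and loss values below are made up for the example.

```python
import math

def tcp_throughput_bound_bps(mss_bytes: float, rtt_s: float, loss_rate: float) -> float:
    """Upper bound on TCP throughput (bits/s) from the simplified Mathis model.

    Assumption: rate <= MSS / (RTT * sqrt(p)), with the model's constant
    factor taken as 1. Real TCP stacks will differ.
    """
    if rtt_s <= 0 or loss_rate <= 0:
        raise ValueError("rtt_s and loss_rate must be positive")
    return (mss_bytes * 8) / (rtt_s * math.sqrt(loss_rate))

# Hypothetical paths: same loss rate, different round-trip times.
MSS = 1460      # bytes, typical for a 1500-byte MTU
LOSS = 0.0001   # 0.01% packet loss (illustrative)
for label, rtt in [("nearby (5 ms)", 0.005), ("long distance (100 ms)", 0.100)]:
    mbps = tcp_throughput_bound_bps(MSS, rtt, LOSS) / 1e6
    print(f"{label}: about {mbps:.1f} Mbit/s")
```

Under this model the bound scales inversely with round-trip time for a fixed loss rate, which is one reason moving large objects closer to the client helps.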
[Figure: classical web server model. A browser on residential broadband reaches a web server at a hosting provider across a transit network; the WEB path is long distance / high latency.]
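Solution: Place Large Static Objects Near Client
[Figure: the same browser on residential broadband, transit network, and hosting-provider web server as in the classical model, with the WEB path still long distance / high latency. A CDN data server now sits near the client, and the DATA path to it is short distance / low latency.]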
• CDN provides static content “close” to client
– Latency goes down
• Time to page render goes down
• Static content performance goes up
– Load on web server goes down (no need to serve static content)
– Web server still manages complex behavior
• Local reasoning / fast changes for application owner
• Significant win for web application performance
Client Simply Sees Increased Performance
• Client doesn't see the CDN as a separate thing
– Web content is all still viewed in a browser
• Browser fetches what the page tells it to fetch
• Different content comes from different places
• User doesn't know/care
• CDNs provide an architectural solution to a performance problem
– Not brute-force
– Work smarter, not harder
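A minimal sketch of the separation described above: the web application renders the dynamic page, and the page simply points the browser at a different host for static objects. The host name and the `static_url` and `render_page` helpers are hypothetical, chosen only for illustration.

```python
from html import escape
from urllib.parse import urljoin

# Hypothetical CDN edge host that serves static files close to the client.
CDN_HOST = "https://cdn.example.net/"

def static_url(path: str) -> str:
    """Return the CDN URL for a static object (image, video, archive)."""
    return urljoin(CDN_HOST, path.lstrip("/"))

def render_page(user: str, image_paths: list[str]) -> str:
    """Dynamic part: built per request by the web application."""
    imgs = "\n".join(f'<img src="{static_url(p)}">' for p in image_paths)
    return f"<html><body><h1>Hello, {escape(user)}</h1>\n{imgs}\n</body></html>"

page = render_page("alice", ["/img/sky-survey.png", "img/climate.png"])
print(page)
```

The browser just follows the URLs in the page, so the user never has to know that the images came from a different server than the HTML.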
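[Figure: two diagrams of a browser reaching a web server across the 'net. In the one with a CDN, the WEB path to the web server is labeled "Rich, Slow" and the DATA path to the CDN data server is labeled "Simple, Fast"; the other shows only the WEB path between browser and web server.]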
Architectural Examination of Data Portals
• Common data portal functions (most portals have these)
– Search/query/discovery
– Data download method for data access
– GUI for browsing by humans
– API for machine access – ideally incorporates search/query + download
• Performance pain is primarily in the data handling piece
– Rapid increase in data scale eclipsed legacy software stack capabilities
– Portal servers often stuck in enterprise network
• Can we “disassemble” the portal and put the pieces back together better?
– Use Science DMZ as a platform for the data piece
– Avoid placing complex software in the Science DMZ
Legacy Portal Design
[Figure: the legacy portal architecture diagram shown earlier, repeated.]
Next-Generation Portal Leverages Science DMZ
[Figure: next-generation portal architecture. Labeled elements: WAN, border router, Science DMZ switch/router, firewall, enterprise network, perfSONAR hosts, API DTNs (data access governed by portal), filesystem (data store), and portal server, connected by 10GE links. The portal server runs the web server, search, database, and authentication; the data service is no longer listed among its applications. Paths shown: data transfer path and portal query/browse path.]
Put The Data On Dedicated Infrastructure
• We have separated the data handling from the portal logic
• Portal is still its normal self, but enhanced
– Portal GUI, database, search, etc. all function as they did before
– Query returns pointers to data objects in the Science DMZ
– Portal is now freed from ties to the data servers (run it on Amazon if you want!)
• Data handling is separate, and scalable
– High-performance DTNs in the Science DMZ
– Scale as much as you need to without modifying the portal software
• Outsource data handling to computing centers or campus central storage
– Computing centers are set up for large-scale data
– Let them handle the large-scale data, and let the portal do the orchestration of data placement
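One way to picture "query returns pointers to data objects in the Science DMZ" is a portal search function that returns metadata plus a DTN URL for each match and never touches the bytes itself. Everything named here (the catalog, the DTN host name, the `search` function, the file names and sizes) is a hypothetical illustration, not an API from the talk.

```python
from dataclasses import dataclass

# Hypothetical DTN endpoint in the Science DMZ that serves the data files.
DTN_BASE_URL = "https://dtn.example.edu/data"

@dataclass(frozen=True)
class DataPointer:
    """What the portal hands back: metadata and a location, not the data."""
    name: str
    size_bytes: int
    url: str

# Stand-in for the portal's database: dataset name -> (path on data store, size).
CATALOG = {
    "climate/run-042.nc": ("climate/run-042.nc", 11_313_896_248),
    "sky/survey-tile-7.fits": ("sky/survey-tile-7.fits", 52_428_800),
}

def search(term: str) -> list[DataPointer]:
    """Portal-side query: runs on the portal server, returns pointers only."""
    return [
        DataPointer(name=name, size_bytes=size, url=f"{DTN_BASE_URL}/{path}")
        for name, (path, size) in sorted(CATALOG.items())
        if term in name
    ]

for ptr in search("climate"):
    print(ptr.name, ptr.size_bytes, ptr.url)
```

Because the portal only returns URLs in this sketch, the pool of DTNs could grow or move without touching the portal code beyond the base URL.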
The Pacific Research Platform Creates a Regional End-to-End Science-Driven “Big Data Freeway System”
NSF CC*DNI Grant $5M 10/2015-10/2020
• PI: Larry Smarr, UC San Diego Calit2
• Co-PIs:
- Camille Crittenden, UC Berkeley CITRIS
- Tom DeFanti, UC San Diego Calit2
- Philip Papadopoulos, UC San Diego SDSC
- Frank Wuerthwein, UC San Diego Physics and SDSC
PRP Provides Interoperability
• Science DMZs at participating sites ensure interoperability
• PRP engineers work to ensure they interoperate
– Globus data transfer between DTNs
– perfSONAR
• Some variation in DTNs
– Some have FIONA DTNs
• FIONA == Flash I/O Network Appliance
• Designed by PRP engineers at UC San Diego
• https://fasterdata.es.net/science-dmz/DTN/fiona-flash-i-o-network-appliance/
– Some have DTNs connected to HPC storage
• Key – they all interoperate, removing integration burden from scientists
PRP Science Drivers
• Multiple science areas
– Astronomy and astrophysics
– Biomedical applications
– Life sciences
– Particle physics
– Virtual reality and data visualization
• http://prp.ucsd.edu/
National Research Platform (NRP)
• Replicate the PRP on a national scale
• Interoperable, high-performance cyberinfrastructure
– Built to serve domain science
– Scale up to ~200 institutions
• First workshop to be held this summer
– Domain science input
– Policy questions
– Architecture, scalability
– Include campus IT, regional networks, national networks, funding agencies, etc. in a common conversation.
Petascale DTN Project
• Another example of building on the Science DMZ
• Supports all data-intensive applications which require large-scale data placement
• Collaboration between HPC facilities
– ALCF, NCSA, NERSC, OLCF
• Goal: per-Globus-job performance at 1PB/week level
– 15 gigabits per second
– With checksums turned on, etc.
– No special shortcuts, no arcane options
• Reference data set is 4.4TB of astrophysics model output
– Mix of file sizes
– Many directories
– Real data!
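As a back-of-the-envelope check on the goal above, the sketch below converts 1 PB/week into a sustained bit rate. It assumes decimal units (1 PB = 10^15 bytes), which the slide does not specify.

```python
SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604,800 seconds

def sustained_gbps(total_bytes: float, seconds: float) -> float:
    """Average rate in gigabits per second needed to move total_bytes in seconds."""
    return total_bytes * 8 / seconds / 1e9

petabyte = 1e15  # assumption: decimal petabyte
rate = sustained_gbps(petabyte, SECONDS_PER_WEEK)
print(f"1 PB/week needs about {rate:.1f} Gbit/s sustained")
```

Under that assumption the requirement works out to roughly 13 Gbit/s of continuous transfer, so the 15 gigabits per second on the slide sits somewhat above the bare minimum.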
Petascale DTN Project
[Figure: transfer rates measured between the DTN clusters at four facilities, March 2017, using the L380 data set. Endpoints: alcf#dtn_mira (ALCF), nersc#dtn (NERSC), olcf#dtn_atlas (OLCF), ncsa#BlueWaters (NCSA). Link rates shown: 10.0, 17.6, 14.8, 19.3, 17.4, 17.0, 32.4, 25.3, 18.3, 16.3, 24.1, and 24.0 Gbps.]
Data set: L380
Files: 19260
Directories: 211
Other files: 0
Total bytes: 4442781786482 (4.4T bytes)
Smallest file: 0 bytes (0 bytes)
Largest file: 11313896248 bytes (11G bytes)
Size distribution:
1 - 10 bytes: 7 files
10 - 100 bytes: 1 files
100 - 1K bytes: 59 files
1K - 10K bytes: 3170 files
10K - 100K bytes: 1560 files
100K - 1M bytes: 2817 files
1M - 10M bytes: 3901 files
10M - 100M bytes: 3800 files
100M - 1G bytes: 2295 files
1G - 10G bytes: 1647 files
10G - 100G bytes: 3 files
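For a sense of scale, the sketch below estimates how long the 4.4 TB reference data set would take at the lowest and highest rates shown in the figure. It assumes the rate is sustained for the whole transfer, which is an idealization.

```python
L380_TOTAL_BYTES = 4_442_781_786_482  # "Total bytes" from the data set summary

def transfer_minutes(total_bytes: int, rate_gbps: float) -> float:
    """Minutes to move total_bytes at a constant rate_gbps (gigabits per second)."""
    return total_bytes * 8 / (rate_gbps * 1e9) / 60

for rate in (10.0, 32.4):  # lowest and highest rates shown on the slide
    minutes = transfer_minutes(L380_TOTAL_BYTES, rate)
    print(f"{rate:>5.1f} Gbit/s: about {minutes:.0f} minutes")
```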
Thanks!
[email protected] (ESnet), Lawrence Berkeley National Laboratory
http://fasterdata.es.net/
http://my.es.net/
http://www.es.net/
Extra Slides
What Is Science Engagement?
• Technology people working with scientists to help solve problems
– Improve data transfer performance
– Improve data workflows (e.g. to require less human effort)
– Improve experiment operations
– …and more…
• Using experience gained from helping scientists to improve cyberinfrastructure
– Network design
– Tool design
– System design
Engagement Is Important: Old Model
• Scientist as integrator
– Requires scientists to discover new technologies
– Requires scientists to become expert in new technologies
– Requires scientists to assemble distinct technologies into an integrated solution that works for them
– Some scientists do this brilliantly – most do not
Engagement Is Important: New Model
• Scientist as collaborator
– Technologists understand technology
– Technologists understand enough of the science to see how technology fits
– Technologists help scientists adopt a useful solution
– This is much more productive, and requires science engagement