UCSD CC-NIE Science Projects - Internet2
4/30/2014
University of California, San Diego
• Prism@UCSD – Science DMZ
– PI: P. Papadopoulos, co-PI: L. Smarr
– 01/01/2013 to 12/31/2014
• CHERuB – 100G campus gateway
– PI: M. Norman, co-PI: T. Hutton, V. Polichar
– 01/01/2014 to 12/31/2015
UCSD and its environment
Scripps Institution of Oceanography
Salk Institute
Venter Institute
General Atomics
Calit2
SDSC
Physics
Medical School
Skaggs
In addition to the 3 main UCSD units:
- General Campus
- Medical School
- Scripps I.O.
there are many other research organizations on and around campus.
NCMIR
Stem Cell Institute
Connecting YOU on UCSD Campus with the World
By Creating a Big Data Freeway System
NSF CC-NIE Has Awarded Prism@UCSD Optical Switch
Phil Papadopoulos, SDSC, Calit2, PI
CHERuB
Prism@UCSD: A Researcher Defined 10 and 40Gbit/s Campus Scale Data Carrier
• high-bandwidth end-to-end optical connections
• routed by next generation Arista switches (7504)
• connects lab “data producers” with SDSC data-intensive computing & storage resources
• 10 Terabit/s of aggregate bandwidth with full bisection, similar to in-machine-room clusters, but deployed at campus scale
• builds upon and upgrades the Quartzite "campus-scale network laboratory" NSF MRI (awarded 2006)
• adds IPv6 and OpenFlow
• existing optical fiber connection to the SDSC is being expanded to 120Gbps as a high-bandwidth bridge to cloud/parallel storage and NSF XSEDE resources
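To give a feel for what the 10, 40 and 120 Gbit/s figures above mean for a lab, the following is a rough back-of-the-envelope sketch of ideal transfer times. The 1 TB dataset size and the function name are assumptions chosen for illustration only. Real transfers will be slower because of protocol overhead and storage limits.

```python
def transfer_seconds(size_bytes: float, link_gbps: float) -> float:
    """Ideal time to move size_bytes over a link (decimal units, no overhead)."""
    bits = size_bytes * 8
    return bits / (link_gbps * 1e9)


DATASET_BYTES = 1e12  # hypothetical 1 TB dataset (assumption, not from the slides)

# Link speeds named on the slide: 10 and 40 Gbit/s lab links, 120 Gbit/s to SDSC.
for gbps in (10, 40, 120):
    secs = transfer_seconds(DATASET_BYTES, gbps)
    print(f"{gbps:>3} Gbit/s: {secs:7.1f} s")
```

Under these assumptions the same dataset takes minutes at 10 Gbit/s and about a minute at 120 Gbit/s.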
Project in Brief
PRISM Puts SDSC’s Big Data Gordon Supercomputer
and Data Oasis Storage Into Your Lab
PRISM is Connecting CERN’s CMS Experiment
To Our Physics Department
80 Gbps PRISM Connection Has Been Made
UCSD is a Tier-2 LHC Data Center:
CMS Flow into UCSD Physics Dept. Peaks at 2.4 Gbps
Source: Frank Wuerthwein, Physics UCSD
Dan Cayan
USGS Water Resources Discipline
Scripps Institution of Oceanography, UC San Diego
With much support from Mary Tyree, Mike Dettinger, Guido Franco and other colleagues
Sponsors: California Energy Commission, NOAA RISA program, California DWR, DOE, NSF
Planning for climate change in California: substantial shifts on top of already high climate variability
SIO Campus Climate Researchers Need to Download
Results from Remote Supercomputer Simulations
to Make Regional Climate Change Forecasts
Figure: average summer afternoon temperature (GFDL A2, downscaled to 1km)
Hugo Hidalgo, Tapash Das, Mike Dettinger
Ultra High Resolution Microscopy Images
Created at the National Center for Microscopy Imaging
NIH National Center for Microscopy & Imaging Research
Integrated Infrastructure of Shared Resources
Source: Steve Peltier, Mark Ellisman, NCMIR
Local SOM
Infrastructure
Scientific
Instruments
End User
FIONA Workstation
Shared Infrastructure
PRISM Links Calit2’s VROOM to NCMIR to Explore
Confocal Light Microscope Images of Rat Brains
Protein Data Bank (PDB) Needs
Bandwidth to Connect Resources and Users
• Archive of experimentally determined 3D structures of proteins, nucleic acids, complex assemblies
• One of the largest scientific resources in life sciences
Source: Phil Bourne and Andreas Prlić, PDB
Hemoglobin
Virus
PDB Usage Is Growing Over Time
• More than 300,000 Unique Visitors per Month
• Up to 300 Concurrent Users
• ~10 Structures are Downloaded per Second 24/7/365
• Increasingly Popular Web Services Traffic
Source: Phil Bourne and Andreas Prlić, PDB
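As a quick sanity check on the usage bullets above, the per-second download rate can be scaled up to a day and a month. This is only a sketch: it treats "~10 structures per second" as a constant rate, and the 30-day month and the function name are assumptions for illustration.

```python
DOWNLOADS_PER_SECOND = 10          # "~10 structures per second" from the slide
SECONDS_PER_DAY = 24 * 60 * 60     # 86,400


def downloads_over(days: int, per_second: float = DOWNLOADS_PER_SECOND) -> int:
    """Structures downloaded over a number of days at a constant rate."""
    return int(per_second * SECONDS_PER_DAY * days)


print(f"per day:   {downloads_over(1):,}")
print(f"per month: {downloads_over(30):,}")  # 30-day month is an assumption
```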
2010 FTP Traffic
RCSB PDB: 159 million entry downloads
PDBe: 34 million entry downloads
PDBj: 16 million entry downloads
Source: Phil Bourne and Andreas Prlić, PDB
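The 2010 FTP figures above can be turned into per-site shares with a few lines of arithmetic. The sketch below uses only the three download counts from the slide. The variable and function names are illustrative, not part of any PDB tooling.

```python
# 2010 FTP entry downloads, in millions, as listed on the slide.
ftp_downloads_millions = {"RCSB PDB": 159, "PDBe": 34, "PDBj": 16}

total_millions = sum(ftp_downloads_millions.values())


def share_percent(site: str) -> float:
    """Share of the combined 2010 FTP entry downloads served by one site."""
    return 100.0 * ftp_downloads_millions[site] / total_millions


print(f"total: {total_millions} million entry downloads")
for site in ftp_downloads_millions:
    print(f"{site:>8}: {share_percent(site):5.1f} %")
```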
PDB Plans to Establish Global Load Balancing
• Why is it Important?
– Enables PDB to Better Serve Its Users by Providing Increased Reliability and Quicker Results
• How Will it be Done?
– By More Evenly Allocating PDB Resources at Rutgers and UCSD
– By Directing Users to the Closest Site
• Need High Bandwidth Between Rutgers & UCSD Facilities
Source: Phil Bourne and Andreas Prlić, PDB
PRISM Will Link Computational Mass Spectrometry
and Genome Sequencing Cores to the Big Data Freeway
ProteoSAFe: Compute-intensive discovery MS at the click of a button
MassIVE: repository and identification platform for all MS data in the world
Source: proteomics.ucsd.edu
SAN DIEGO SUPERCOMPUTER CENTER
at the UNIVERSITY OF CALIFORNIA, SAN DIEGO
http://cherub.ucsd.edu
CHERuB*: SDSC-ACT partner to bring 100Gbps connectivity to UCSD
UCSD/SDSC
New 100G path
LBL - CMMAP
UNL - OSG
UWisc Madison - OSG
Pink line – New CENIC 100G
Blue lines – Existing/planned ANI 100G
Green lines – Existing PacWave 100G
Maroon lines – XSEDE 10G network
Thin lines – Other existing 10G or lower
NICS - CMMAP
UCR
FNAL - Tier-1 LHC
Austin/TACC
UCSB
NERSC - POLARBEAR, CAIDA
Production late 2014
*Configurable, High-speed, Extensible Research Bandwidth
The Plumbing (ask Tom Hutton)
PacWave, CENIC,
Internet2, NLR, ESnet,
StarLight, XSEDE & other R&E networks
DWDM 100G transponders
DWDM 100G transponders
818 W. 7th, Los Angeles, CA
10100 Hopkins Drive, La Jolla, CA
up to 3 add'l 100G transponders can be attached
up to 3 add'l 100G transponders can be attached
to CENIC/ PacWave switch L2
UCSD/SDSC Gateway Juniper MX960 "MX0"
New 2x100G/8x10G line card + optics
New 40G line card + optics
SDSC Juniper MX960 "Medusa"
New 100G card/optics
Other SDSC
resources
UCSD Primary Node Cisco 6509 "Node B"
PRISM@UCSD Arista 7504
PRISM@UCSD - many UCSD big data users
mult. 40G+ connections
UCSD Production users
mult. 10G connections
GORDON
compute
cluster
2x40G 4x10G
100G
100G
mult. 40G connections
NEW
UCSD
Key:
Green/dashed lines - new component/equipment in proposal
Pink/black - existing UCSD infrastructure
UCSD/SDSC Cisco 6509
UCSD
DYNES
add'l 10G card/optics
100G
Equinix/L3/CENIC POP
SDSC NAP
existing CENIC fiber
Nx10G
10G
Existing ESnet
SD router
10G
Dual Arista 7508 "Oasis"
SDSC
DYNES
128x10G
256x10G
DataOasis/
SDSC Cloud
CENIC/ESnet 100G Connection enables Big Data science collaborations between
NERSC and SDSC
UCSD/SDSC
New 100G path
LBL - CMMAP
UNL - OSG
UWisc Madison - OSG
Pink line – New CENIC 100G
Blue lines – Existing/planned ANI 100G
Green lines – Existing PacWave 100G
Maroon lines – XSEDE 10G network
Thin lines – Other existing 10G or lower
NICS - CMMAP
UCR
FNAL - Tier-1 LHC
Austin/TACC
UCSB
NERSC - POLARBEAR, CAIDA
A Unique, Powerful, Data-Intensive Testbed for Scientific Discovery
EDISON HPC SYSTEM
2 PF, 434 TB RAM
6 PB
150 GB/s 100 GB/s
4.5 PB DTN DTN ESnet/CENIC
100 Gb/s
GORDON HPD SYSTEM
0.3 PF, 364 TB RAM+SSD
POLARBEAR Cosmology Telescope UC Berkeley/NERSC-UCSD/SDSC
• Goal: Measure B-mode polarization in the CMB from inflation era
• Data path: Chile (obs)-UCB/NERSC (analysis)-UCSD/SDSC (analysis)
• Data acquisition rates:
– 22 GB/mo. (current)
– 3 TB/mo. (2014-2016)
• Map making data analysis NERSC & SDSC
– 100 MC realizations of 100 TB data = 10 PB
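The 10 PB figure in the last bullet follows directly from the two numbers given, and it is easy to see why a 100 Gb/s path matters at that scale. The sketch below redoes the arithmetic and estimates an ideal transfer time. It assumes decimal units and that the whole set would cross a single 100 Gb/s link with no overhead, which the slides do not state.

```python
TB = 1e12  # bytes, decimal units
PB = 1e15

realizations = 100          # Monte Carlo realizations (from the slide)
tb_per_realization = 100    # TB of data each (from the slide)

total_bytes = realizations * tb_per_realization * TB
total_pb = total_bytes / PB


def ideal_transfer_days(size_bytes: float, link_gbps: float) -> float:
    """Days to move size_bytes at the full link rate, ignoring all overhead."""
    seconds = size_bytes * 8 / (link_gbps * 1e9)
    return seconds / 86400


print(f"total data: {total_pb:.0f} PB")
print(f"at 100 Gb/s: {ideal_transfer_days(total_bytes, 100):.2f} days")
print(f"at  10 Gb/s: {ideal_transfer_days(total_bytes, 10):.2f} days")
```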
Atacama Desert, Chile
Next Generation Network Measurement CAIDA (SDSC)-NERSC
• CAIDA operates the UCSD Network Telescope, which collects Internet Background Radiation
• Data paths: global internet, ESnet
• Data rates: 3-4 TB/mo
• Using NERSC tape archive to replicate 100 TB historical data
• Other projects: network measurement tools, Future Internet Architecture
100’s TB archival data
SDSC/NERSC
unassigned
IPv4
addresses
High Energy Physics LHC/US-CMS UCSD Tier-2—US-CMS collaboration
• Goals: Higgs boson, supersymmetry, BSM
• Data Paths: CERN-FNAL (Tier 1)-UCSD (Tier 2) via ESnet and CENIC/I2
• Peak Bandwidths:
– Current: 10+5 Gbps
– 2015: 40 Gbps when LHC operates @ 14 TeV
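The peak-bandwidth bullets can be read as a growth factor and as a fraction of a 100 Gbps path. The sketch below is illustrative only. Comparing against 100 Gbps assumes the CMS traffic would ride the new 100G connection, and the function names are hypothetical.

```python
CURRENT_PEAK_GBPS = 10 + 5   # "Current: 10+5 Gbps" from the slide
PEAK_2015_GBPS = 40          # projected for 2015 on the slide
PATH_GBPS = 100              # assumed: traffic uses the new 100G path


def growth_factor(before_gbps: float, after_gbps: float) -> float:
    """How many times larger the later peak is than the earlier one."""
    return after_gbps / before_gbps


def path_fraction(peak_gbps: float, path_gbps: float = PATH_GBPS) -> float:
    """Fraction of the path capacity consumed at peak."""
    return peak_gbps / path_gbps


print(f"growth: {growth_factor(CURRENT_PEAK_GBPS, PEAK_2015_GBPS):.2f}x")
print(f"current peak uses {path_fraction(CURRENT_PEAK_GBPS):.0%} of 100G")
print(f"2015 peak uses    {path_fraction(PEAK_2015_GBPS):.0%} of 100G")
```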
Education & Training: UCSD Telemedicine Center
CHERuB Implementation Status
• January, 2014
– Project funded, equipment on order
• February, 2014
– Equipment received
– Production network switch upgraded
• March, 2014
– Campus gateway upgraded, connected to regional 100G feed
– Successful border-to-regional test @100Gbps
• Next steps (April/May):
– Connect Prism switch, test @2x40Gbps
– Connect SDSC infrastructure, test @100Gbps
– Connect production switch, test @4x10Gbps
• Production Goal: September 2014
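For orientation, the three planned test rates in the next steps can be added up and compared with the regional 100G feed. This is a sketch only: the tests are listed per connection and the slides do not say they run at the same time, so the ratio below is an illustrative upper bound, and the names are hypothetical.

```python
REGIONAL_FEED_GBPS = 100  # regional 100G feed from the March 2014 milestone

# Planned test rates from the "Next steps" bullets (links x Gbit/s per link).
planned_tests_gbps = {
    "Prism switch": 2 * 40,
    "SDSC infrastructure": 1 * 100,
    "production switch": 4 * 10,
}


def oversubscription(downstream_gbps: dict, uplink_gbps: float) -> float:
    """Ratio of summed downstream capacity to the uplink capacity."""
    return sum(downstream_gbps.values()) / uplink_gbps


total_gbps = sum(planned_tests_gbps.values())
ratio = oversubscription(planned_tests_gbps, REGIONAL_FEED_GBPS)
print(f"downstream total: {total_gbps} Gbit/s")
print(f"ratio to regional feed: {ratio:.1f}:1")
```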
Comet is a ~2000 TeraFLOP System Architected for the “Long Tail of Science”
NSF Track 2 award to SDSC
$12M NSF award to acquire
$3M/yr x 4 yrs to operate
Production early 2015
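The Comet funding lines above add up as follows. This is a minimal sketch using only the figures on the slide. The variable names are illustrative, and the petaFLOP conversion simply restates ~2000 TeraFLOP in the units used for the other systems in this deck.

```python
ACQUISITION_MUSD = 12        # "$12M NSF award to acquire"
OPERATIONS_MUSD_PER_YEAR = 3 # "$3M/yr"
OPERATION_YEARS = 4          # "x 4 yrs to operate"

operations_total_musd = OPERATIONS_MUSD_PER_YEAR * OPERATION_YEARS
total_musd = ACQUISITION_MUSD + operations_total_musd

PEAK_TFLOPS = 2000           # "~2000 TeraFLOP", approximate
peak_pflops = PEAK_TFLOPS / 1000

print(f"operations over {OPERATION_YEARS} years: ${operations_total_musd}M")
print(f"acquisition plus operations: ${total_musd}M")
print(f"peak performance: ~{peak_pflops:.0f} PF")
```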