
Page 1: Scientific Computing at Fermilab

Scientific Computing at Fermilab

Page 2: Scientific Computing at Fermilab

Our “Mission”

• Provide computing, software tools and expertise to all parts of the Fermilab scientific program, including theory simulations (e.g. Lattice QCD and Cosmology) and accelerator modeling

• Work closely with each scientific program – as collaborators (where a scientist/staff from SCD is involved) and as valued customers

• Create a coherent Scientific Computing program from the many parts and many funding sources – encouraging sharing of facilities, common approaches, and re-use of software wherever possible

• Work closely with CCD as part of an overall coherent program

Scientific Computing - Fermilab S&T Review, Sept 5 2012

Page 3: Scientific Computing at Fermilab

A Few Points

• We are ~160 strong, made up almost entirely of technically trained staff

• 26 Scientists in the Division

• As the lab changes its mission, scientific computing is having to adapt to this new and more challenging landscape

• Scientific Computing is a very “matrixed” organization. I will not try to cover all we do but pick and choose things that are on my mind right now…

Page 4: Scientific Computing at Fermilab

Scientific Discovery – the reason we are here

• The computing capability needed for scientific discovery is bounded only by human imagination

• The next generation of scientific breakthroughs requires major new advances in computing technology
  - Energy-efficient hardware, algorithms, applications, and systems software
  - Data “explosion” – Big Data is here (observational data, sensor networks, and simulation)
  - Computing/data throughput challenges

Page 5: Scientific Computing at Fermilab

About to Experience a Paradigm Shift in Computing

• For the last decade, Grid and computing resources have been very stable

• However…
  - End of Moore’s Law is looming
  - New computing technologies are on the near horizon: phase-change memories, stacked dies
  - Exponential growth in parallelism in HPC, with IBM Blue Gene leading the charge
  - Heterogeneous systems delivering higher performance/watt (Titan)
  - Power is a constraint
  - Programmability…

Page 6: Scientific Computing at Fermilab

Computing Landscape will change…

• HEP is going to have to adapt to this changing world

• While the future for the next few years is clear, we don’t really know where we will be in the next decade or 20 years

• Ultimately market forces will determine the future

• We need to turn this into a positive force for both High Energy Physics and High Performance computing.


Page 7: Scientific Computing at Fermilab

Think Back on your Computing Careers…


Page 8: Scientific Computing at Fermilab

And Today….


Ask Yourself… Has Computing Gotten any Easier in the last 30 years?

Lattice b_c machine…

Page 9: Scientific Computing at Fermilab

Starting to Make the Bridge to the Future

New Funding Initiatives

• COMPASS SciDAC Project (3 year) $2.2M/year
• US Lattice QCD Project (5 year) ~$5M/year
• Geant4 Parallelization – joint with ASCR (2 year) $1M/year
• CMS on HPC machines (1 year) $150k
• PDACS – Galaxy Simulation Portal – joint with Argonne (1 year) $250k
• Science Framework for DES (1 year) $150k
• Tevatron Data Preservation (2 year) $350k/year
• Partnering with NSF through OSG
• Will be aggressive in upcoming data and knowledge discovery opportunities at DOE

Page 10: Scientific Computing at Fermilab

Geant4


Workshop held between HEP and ASCR

• Discussed how to transform Geant4 to run efficiently on modern and future multi-core computers and hybrids

• Workshop chairs were Robert Lucas (USC) and RR

• Funded at $1M/year for 2 years

Here: algorithmic development to be able to utilize multi-core architectures, and porting of Geant4 sections to GPUs (a generic multi-core sketch follows below)
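The heart of this effort is event-level parallelism: simulated events are independent, so they can be farmed out across cores. Below is a minimal, generic C++ sketch of that pattern using only the standard library; it is illustrative only and does not use the actual Geant4 multithreading API (simulate_event and EventResult are hypothetical stand-ins).

```cpp
// Minimal sketch of event-level parallelism, assuming events are
// independent (as in detector simulation). simulate_event() and
// EventResult are hypothetical stand-ins, not Geant4 classes.
#include <algorithm>
#include <atomic>
#include <thread>
#include <vector>

struct EventResult { double energyDeposit = 0.0; };

EventResult simulate_event(long eventId) {
    // Placeholder for the real per-event tracking and physics work.
    return EventResult{eventId * 0.001};
}

int main() {
    const long nEvents = 10000;
    const unsigned nThreads = std::max(1u, std::thread::hardware_concurrency());

    std::vector<EventResult> results(nEvents);
    std::atomic<long> next{0};  // shared counter handing out event ids

    auto worker = [&]() {
        for (long i = next++; i < nEvents; i = next++) {
            results[i] = simulate_event(i);  // each event is independent
        }
    };

    std::vector<std::thread> pool;
    for (unsigned t = 0; t < nThreads; ++t) pool.emplace_back(worker);
    for (auto& th : pool) th.join();
    return 0;
}
```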

Page 11: Scientific Computing at Fermilab

CMS

• CMS would like to maintain current trigger thresholds for the 2015 run to allow full Higgs characterization
• Thus the nominal 350 Hz output would increase to ~1 kHz (see the back-of-the-envelope sketch below)
• Computing budgets are expected to remain constant – not grow
• Need to take advantage of leadership-class computing facilities
• Need to incorporate more parallelism into the software
• Algorithms need to be more efficient (faster)
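The scale of that challenge follows directly from the numbers above: roughly tripling the output rate at a flat computing budget means the software itself has to absorb roughly a factor of three. A minimal back-of-the-envelope sketch, using only the rates quoted on this slide:

```cpp
// Back-of-the-envelope scaling from the slide's numbers: with a roughly
// flat computing budget, the required per-event speedup is simply the
// ratio of the new trigger output rate to the old one.
#include <cstdio>

int main() {
    const double oldRateHz = 350.0;   // nominal output rate
    const double newRateHz = 1000.0;  // desired ~1 kHz for the 2015 run
    std::printf("Required speedup at flat budget: ~%.1fx\n",
                newRateHz / oldRateHz);  // prints ~2.9x
    return 0;
}
```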

Page 12: Scientific Computing at Fermilab

PDACS

• Portal for Data Analysis Services for Cosmological Simulation. Joint project with Argonne, Fermilab, and NERSC; Salman Habib (Argonne) is the PI

• Cosmological data/analysis service at scale – a workflow management system

• Portal based on that used for computational biology – the idea is to facilitate the analysis/simulation effort for those not familiar with advanced computing techniques

Figure: dark energy, matter; cosmic gas; galaxies – simulations connect fundamentals with observables

Page 13: Scientific Computing at Fermilab

Data Archival Facility


• Would like to offer archive facilities for the broader community

• Will require work on front ends to simplify use for non-HEP users

• Have had discussions with IceCube

One of seven 10k-slot tape robots at FNAL

Page 14: Scientific Computing at Fermilab

We Can’t forget our day job….


Page 15: Scientific Computing at Fermilab

CMS Tier-1 at Fermilab

• The CMS Tier-1 facility at Fermilab and the experienced team who operate it enable CMS to reprocess data quickly and to distribute the data reliably to the user community around the world

• We lead US CMS and overall CMS in software and computing

Fermilab also operates:
• LHC Physics Center (LPC)
• Remote Operations Center
• U.S. CMS Analysis Facility

Page 16: Scientific Computing at Fermilab

Intensity Frontier Program (Diverse)


Page 17: Scientific Computing at Fermilab

Intensity Frontier Strategy

• Common approaches/solutions are essential to support this broad range of experiments with limited SCD staff. Examples include artdaq, art, SAM IF, LArSoft, Jobsub, …

• SCD has established liaisons between ourselves and the experiments to ensure communication and understand needs/requirements

• Completing the process of establishing MOUs between SCD and each experiment to clarify our roles/responsibilities

Page 18: Scientific Computing at Fermilab

Intensity Frontier Strategy - 2

• A shared analysis facility where we can quickly and flexibly allocate computing to experiments

• Continue to work to “grid enable” the simulation and processing software – good success with MINOS, MINERvA, and Mu2e

• All experiments use shared storage services – for data and local disk – so we can allocate resources when needed

• The perception that the intensity frontier will not be computing intensive is wrong

Page 19: Scientific Computing at Fermilab

artdaq Introduction

artdaq is a toolkit for creating data acquisition systems to be run on commodity servers

• It is integrated with the art event reconstruction and analysis framework for event filtering and data compression

• It provides data transfer, event building, process management, system and process state behavior, control messaging, message logging, infrastructure for DAQ process and art module configuration, and writing of data to disk in ROOT format (a generic sketch of the event-building pattern follows after this list)

• The goal is to provide the common, reusable components of a DAQ system and allow experimenters to focus on the experiment-specific parts of the system: the software that reads out and configures the experiment-specific front-end hardware, the analysis modules that run inside of art, and the online data-quality monitoring modules

• As part of our work in building the DAQ software systems for upcoming experiments, such as Mu2e and DarkSide-50, we will be adding more features
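As a rough illustration of the event-building role described above, the sketch below shows the generic pattern: fragments arriving from several front-end sources are grouped by event number, and an event is complete once every source has contributed. The types and names here (Fragment, EventBuilder) are hypothetical illustrations of the concept, not the artdaq API.

```cpp
// Generic event-building pattern: fragments from several front-end
// sources are grouped by event number; an event is complete once every
// source has contributed. Hypothetical types, not the artdaq API.
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

struct Fragment {
    std::uint64_t eventNumber;
    int sourceId;
    std::vector<unsigned char> payload;
};

class EventBuilder {
public:
    explicit EventBuilder(int nSources) : nSources_(nSources) {}

    // Add one fragment; returns a pointer to the completed event when the
    // last expected fragment for that event number arrives, else nullptr.
    const std::vector<Fragment>* addFragment(Fragment frag) {
        auto& frags = pending_[frag.eventNumber];
        frags.push_back(std::move(frag));
        if (static_cast<int>(frags.size()) == nSources_) {
            const std::uint64_t key = frags.front().eventNumber;
            completed_ = std::move(frags);  // hand the event to the caller
            pending_.erase(key);
            return &completed_;
        }
        return nullptr;  // still waiting on other sources
    }

private:
    int nSources_;
    std::map<std::uint64_t, std::vector<Fragment>> pending_;
    std::vector<Fragment> completed_;
};

int main() {
    EventBuilder builder(2);                          // two front-end sources
    builder.addFragment({1, 0, {0x01}});              // first fragment of event 1
    auto done = builder.addFragment({1, 1, {0x02}});  // completes event 1
    return done == nullptr ? 1 : 0;
}
```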

Page 20: Scientific Computing at Fermilab

artdaq Introduction

We are currently working with the DarkSide-50 collaboration to develop and deploy their DAQ system using artdaq.

• The DS-50 DAQ reads out ~15 commercial VME modules into four front-end computers using commercial PCIe cards and transfers the data to five event-builder and analysis computers over a QDR InfiniBand network.

• The maximum data rate through the system will be 500 MB/s, and we have achieved a data compression factor of five.

• The DAQ system is being commissioned at LNGS, and it is being used to collect data and monitor the performance of the detector as it is being commissioned. (plots of phototube response?)

artdaq will be used for the Mu2e DAQ, and we are working toward a demonstration system which reads data from the candidate commercial PCIe cards, builds complete events, runs sample analysis modules, and writes the data to disk for later analysis.

• The Mu2e system will have 48 readout links from the detector into commercial PCIe cards, and the data rate into the PCIe cards will be ~30 GB/s. Event fragments will be sent to 48 commodity servers over a high-speed network, and the online filtering algorithms will be run in the commodity servers (see the rate arithmetic sketched below).

• We will be developing the experiment-specific artdaq components as part of creating the demonstration system, and this system will be used to validate the performance of the baseline design in preparation for the CD review early next year.
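The rates quoted above translate directly into per-link and to-disk requirements. The sketch below works out that arithmetic using only the figures on this slide; an even split of the load across links is an assumption made here, not a statement from the talk.

```cpp
// Rough throughput arithmetic from the slide's figures. A uniform split
// of the load across readout links is an assumption.
#include <cstdio>

int main() {
    // DarkSide-50: 500 MB/s through the system, compression factor ~5.
    const double ds50InputMBps   = 500.0;
    const double ds50Compression = 5.0;
    std::printf("DS-50 rate to disk: ~%.0f MB/s\n",
                ds50InputMBps / ds50Compression);  // ~100 MB/s

    // Mu2e: ~30 GB/s into 48 PCIe readout links, built on 48 servers.
    const double mu2eInputGBps = 30.0;
    const int    nLinks        = 48;
    std::printf("Mu2e rate per readout link: ~%.2f GB/s\n",
                mu2eInputGBps / nLinks);           // ~0.63 GB/s
    return 0;
}
```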

Page 21: Scientific Computing at Fermilab

Cosmic Frontier

• Continue to curate data for SDSS
• Support data and processing for Auger, CDMS, and COUPP
• Will maintain an archive copy of the DES data and provide modest analysis facilities for Fermilab DES scientists
  - Data management is an NCSA (NSF) responsibility
  - Helping NCSA by “wrappering” science codes needed for 2nd light when NCSA completes its framework
• DES uses Open Science Grid resources opportunistically and will make heavy use of NERSC
• Writing the Science Framework for DES – hope to extend it to LSST
• DarkSide-50 is writing their DAQ system using artdaq

Page 22: Scientific Computing at Fermilab

Tevatron (Data) Knowledge Preservation

• Maintaining full analysis capability for the next few years, though building software to get away from custom systems

• Successful FWP funded; hired two domain-knowledgeable scientists to lead the preservation effort on each experiment (plus 5 FTE of SCD effort)

• Knowledge preservation – need to plan and execute the following…
  - Preserve analysis notes, electronic logs, etc.
  - Document how to do analysis well
  - Document sample analyses as cross checks
  - Understand job submission, database, and data handling issues
  - Investigate/pursue virtualization

• Try to keep the CDF/D0 strategies in sync and leverage common resources/solutions

Page 23: Scientific Computing at Fermilab

Synergia at Fermilab

• Synergia is an accelerator simulation package combining collective effects and nonlinear optics
  - Developed at Fermilab, partially funded by SciDAC

• Synergia utilizes state-of-the-art physics and computer science
  – Physics: state of the art in collective effects and optics simultaneously
  – Computer science: scales from desktops to supercomputers

• Efficient running on 100k+ cores

• Best practices: test suite, unit tests

Synergia is being used to model multiple Fermilab machines:
• Main Injector for Project-X and Recycler for ANU
• Booster instabilities and injection losses
• Mu2e: resonant extraction from the Debuncher

Weak scaling to 131,072 cores (a sketch of how weak-scaling efficiency is computed follows below)
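For reference, weak scaling means the problem size grows with the core count so that the work per core stays fixed; ideal behavior is constant run time, and efficiency is the reference time divided by the measured time. A minimal sketch of that calculation follows; the timing values are hypothetical placeholders, not Synergia measurements.

```cpp
// Weak-scaling efficiency: problem size grows with core count, so ideal
// behavior is constant run time. Efficiency = t(reference) / t(N cores).
// The timings below are hypothetical placeholders, not Synergia data.
#include <cstdio>
#include <vector>

int main() {
    struct Point { long cores; double seconds; };
    const std::vector<Point> runs = {
        {1024, 100.0}, {8192, 104.0}, {65536, 112.0}, {131072, 123.0}};

    const double tRef = runs.front().seconds;
    for (const auto& p : runs) {
        std::printf("%7ld cores: efficiency %.0f%%\n",
                    p.cores, 100.0 * tRef / p.seconds);
    }
    return 0;
}
```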

Page 24: Scientific Computing at Fermilab

Synergia collaboration with CERN for LHC injector upgrades

• CERN has asked us to join in an informal collaboration to model space charge in the CERN injector accelerators

• Currently engaged in a benchmarking exercise
  - Current status reviewed at the Space Charge 2013 workshop at CERN
  - Most detailed benchmark of PIC space-charge codes to date, using both data and analytic models

• Breaking new ground in accelerator simulation
  - Synergia has emerged as the leader in fidelity and performance
  - PTC-Orbit has been shown to have problems reproducing individual particle tunes

Figures: individual particle tune vs. initial position – PTC-Orbit displays noise and growth over time, while Synergia results are smooth and stable; phase space showing trapping (benchmark vs. Synergia)

Page 25: Scientific Computing at Fermilab

SCD has more work than human resources

• Insufficiently staffed at the moment:
  - Improving event generators – especially for the Intensity Frontier
  - Modeling of the neutrino beamline/target
  - Simulation effort – all IF experiments want more resources, both technical and analysis
  - Muon Collider simulation – both accelerator and detector
  - R&D in software-defined networks

Page 26: Scientific Computing at Fermilab

Closing Remarks

• SCD has transitioned to fully support the Intensity Frontier

• We also have a number of projects underway to prepare for the paradigm shift in computing

• We are short-handed and are having to make choices


Page 27: Scientific Computing at Fermilab

Back Up – At the Moment…


Page 28: Scientific Computing at Fermilab

Riding the Wave of Progress…


Page 29: Scientific Computing at Fermilab

MKIDs (Microwave Kinetic Inductance Devices)

• Pixelated micro-size resonator array

• Superconducting sensors with meV energy gap. Not only a single-photon detector:
  - Theoretically, they allow for an energy resolution (E/ΔE) of about 100 in the visible and near-infrared spectrum
  - Best candidate to provide medium-resolution spectroscopy of >1 billion galaxies, QSOs, and other objects from LSST data if the energy resolution is improved to 80 or better (currently at ~16). Note that scanning that number of galaxies is outside the reach of current fiber-based spectrometers
  - An MKID array of 100,000 pixels will be enough to obtain medium-resolution spectroscopic information for all LSST galaxies up to magnitude 24.5 with an error .
  - High bandwidth: allows for filtering of atmospheric fluctuations at ~100 Hz or faster

Page 30: Scientific Computing at Fermilab

Multi-10K-pixel instrument and science with MKIDs

• PPD and SCD have teamed up to build an instrument with a number of pixels between 10K and 100K.
  - External collaborators: UCSB (Ben Mazin, Giga-Z), ANL, U. Michigan. Potential collaboration: strong coupling with the next CMB instrument proposed by John Carlstrom (U. Chicago) and Clarence Chang (ANL), which also requires the same DAQ readout electronics.

• Steve Heathcote, director of the SOAR telescope, Cerro Tololo, has expressed interest in hosting the MKID R&D instrument in 2016. (Ref. Steve Heathcote letter to Juan Estrada (FNAL).)

• SOAR telescope operations in late 2016: 10 nights x 10 hours/night would give a limiting magnitude of ~25. Potential science (under consideration): photometric redshift calibration for DES, clusters of galaxies, supernova host-galaxy redshifts, strong lensing.

• SCD/ESE will design the DAQ for up to a 100K-pixel instrument: 1000 to 2000 MKIDs per RF feedline, 50 feedlines. Input bandwidth: 400 GB/s, triggerless DAQ. Data reduction: ~200 MB/s to storage. Digital signal processing on FPGAs, GPUs, processors, etc. (see the bandwidth arithmetic sketched below)

• Status: adiabatic demagnetization refrigerator (ADR) functioning at SiDet. Test of low-noise electronics underway. MKID testing to start this summer. Electronics system design underway.
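The quoted DAQ figures imply both the per-feedline load and the size of the in-line reduction. A minimal sketch of that arithmetic, using only the numbers above; an even split across the 50 feedlines is an assumption.

```cpp
// Bandwidth arithmetic from the quoted MKID DAQ figures; an even split
// across the 50 RF feedlines is an assumption made here.
#include <cstdio>

int main() {
    const double inputGBps     = 400.0;  // triggerless input bandwidth
    const int    nFeedlines    = 50;
    const double toStorageMBps = 200.0;  // after in-line data reduction

    std::printf("Per-feedline input: ~%.0f GB/s\n", inputGBps / nFeedlines);
    std::printf("Overall reduction factor: ~%.0fx\n",
                inputGBps * 1000.0 / toStorageMBps);  // ~2000x
    return 0;
}
```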

Page 31: Scientific Computing at Fermilab

Open Science Grid (OSG)

• The Open Science Grid (OSG) advances science through open distributed computing. The OSG is a multi-disciplinary partnership to federate local, regional, community and national cyberinfrastructures to meet the needs of research and academic communities at all scales.

• Total of 95 sites; ½ million jobs a day, 1 million CPU hours/day, 1 million files transferred/day.

• It is cost effective, it promotes collaboration, it is working!

The US contribution and partnership with the LHC Computing Grid is provided through OSG for CMS and ATLAS.