CHEP 2003 General Summary
Torre Wenaus, BNL/CERN
CHEP 2003, UC San Diego, La Jolla
March 28, 2003
I agree with all the other summaries.
Thank you to the organizers, and have a safe journey home
CHEP 2003 Summary, March 28 2003 Slide 3
Torre Wenaus, BNL/CERN
Outline: The CHEP03 Zeitgeist
Themes and observations
Rising trends
Important developments
Receding trends
Underrepresented
Open questions
Concerns
Major challenges
Conclusions
Thanks
Google zeitgeist: http://www.google.com/press/zeitgeist.html
zeit·geist | Pronunciation: 'tsIt-"gIst, 'zIt | Function: noun | Etymology: German, from Zeit (time) + Geist (spirit) | Date: 1884 | Meaning: the general intellectual, moral, and cultural climate of an era
Themes and observations
Lesson from the past: Make it simple (R. Brun)
No more complex than necessary
Users want consolidation, ease of use, and stability
Must also consider the needs of the future; a longer view of maintainability and evolution
In the interests of long-term stability
OO and C++ is the accepted paradigm
No major OO/C++ migration or usage angst at this conference; it is done and accepted
Offline and online: “Triumph of C++ for HEP DAQ confirmed” – DAQ summary
Now we are hearing reports on Nth generation C++ software
L. Sexton Kennedy, CDF: Every component has been rewritten at least once. Implementations have now stabilized such that every new arrival doesn’t start by discarding and rewriting software
“Many more talks about redesign than about design” – Data management summary
And on the maturation and emergence of tools as broad standards, after years of development and refinement
e.g. Geant4, ROOT I/O
Themes and observations
The tyranny of Moore’s Law
Wolbers: it is not a substitute for more efficient & faster code, smaller data size
It works against thinking before doing
Optimize wherever possible
Addressing the digital divide in networking (H.Newman)
HEP is obligated as a community to work on this
A world problem in which our field can have visible impact
Farm challenges
Don’t underestimate farm installation and operations (R.Divia)
Big issues are power, cooling, space! (S.Wolbers)
Watts/$ steadily rising (R.Mount)
Tape-disk random access performance gap in analysis is receding as an issue, but disk-memory gap is hardly being addressed (R.Mount)
Random-Access Storage Performance
[Plot: retrieval rate (Mbytes/s, logarithmic axis from 0.000000001 to 1000) versus log10 (object size in bytes, 0 to 10), with curves for PC2100, WD200GB and STK9940B]
R. Mount
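The shape of such a plot can be sketched with a simple model in which every object retrieved costs one access latency plus its transfer time. The latency and bandwidth figures below are order-of-magnitude assumptions chosen for illustration, not values read off the slide, and the function and dictionary names are hypothetical.

```python
def retrieval_rate_mb_s(object_size_bytes, latency_s, bandwidth_mb_s):
    """Effective rate when each object costs one access latency plus its transfer time."""
    size_mb = object_size_bytes / 1e6
    return size_mb / (latency_s + size_mb / bandwidth_mb_s)


# Assumed order-of-magnitude device parameters (NOT taken from the slide).
DEVICES = {
    "memory": {"latency_s": 1e-7, "bandwidth_mb_s": 2000.0},
    "disk": {"latency_s": 1e-2, "bandwidth_mb_s": 50.0},
    "tape": {"latency_s": 60.0, "bandwidth_mb_s": 30.0},
}

if __name__ == "__main__":
    # Scan object sizes from 1 byte to 10 GB, as on the plot's x axis.
    for exponent in range(0, 11, 2):
        size = 10 ** exponent
        row = ", ".join(
            f"{name}={retrieval_rate_mb_s(size, **params):.3g} MB/s"
            for name, params in DEVICES.items()
        )
        print(f"10^{exponent} bytes: {row}")
```

In this model small objects are latency-dominated, so the rate grows linearly with object size, while large objects approach the streaming bandwidth of the device.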
Rising trends
ROOT
For analysis, I/O, and much else
Now fully supported at CERN: EP/SFT section
Close interaction with experiments on new developments
Run II, RHIC, ALICE, LCG, BaBar, …
Foreign classes, PROOF, geometry, grid integration, …
Mentioned in 47+ talks at this conference
Open source databases (MySQL, Postgres, …)
Metadata, distributed computing, conditions, …
Empowering software: easy and potent
MySQL mentioned in 37 talks! Postgres in 8, Oracle in 27
Online – offline continuum
Similar Linux farm environments, attainable time budgets
Same framework, maybe same algorithms, in HLT as in offline (V.Boisvert, ATLAS)
Stringent performance/robustness requirements on software
Rising trends
Common projects
Joint projects one of the CDF/D0 successes (Wolbers)
But hard to align running experiments with LHC
LHC Computing Grid project
Grid projects in general
Laudable but difficult; increasingly forced by the circumstances
Resource constraints and increasing scale and complexity make go-it-alone N times too costly
cf. comments in online/DAQ context by G. Dubois-Feldmann today: somewhat less success in online where it is even harder than offline, but possible LHC inroads
Related is software reuse…
Respect what we know about long software development timescales
Rene’s time to develop plot
LCG?
LCG must effectively re-use and leverage existing software, or fail
LCG?
This is the approach taken: cf. POOL, SEAL talks. Time will tell! cf. next CHEP
Rising trends
Modular component architectures
Many examples in offline; also in online/DAQ (XDAQ – CMS) [also in open source…]
Associated infrastructure: white boards, centrality of dictionary, plug-ins, …
XML
The no-brainer for small scale structured data storage and exchange.
[The more humane applications leave the XML generation to the computer and not the humans]
ASCII lovers [count me in] now have their standard
Many talks in many areas involving XML applications
Detector description, conditions info, configuration, monitoring, graphics, object models, data/object interchange, dictionary generation, not to mention layered apps (e.g. SOAP)…
37 talks mention XML (same 37 as MySQL?)
But XML in itself does not define common format/schema, and much divergence and duplication exists in how XML is used
e.g. detector description
We heard (I.Foster) about an OGSA community clearing house, we have similar things ourselves (CLHEP, FreeHEP), maybe we need one for XML applications
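The remark about leaving XML generation to the computer can be illustrated with a small round trip using Python's standard `xml.etree.ElementTree` module. This is only a sketch: the `conditions` and `channel` element names and the gain values are hypothetical and do not correspond to any experiment's schema.

```python
import xml.etree.ElementTree as ET


def conditions_to_xml(run_number, gains):
    """Generate the XML from a plain dict, so nobody has to type tags by hand."""
    root = ET.Element("conditions", {"run": str(run_number)})
    for channel_name in sorted(gains):
        channel = ET.SubElement(root, "channel", {"name": channel_name})
        channel.text = repr(gains[channel_name])
    return ET.tostring(root, encoding="unicode")


def conditions_from_xml(text):
    """Parse the same (hypothetical) format back into a run number and a dict."""
    root = ET.fromstring(text)
    gains = {c.get("name"): float(c.text) for c in root.findall("channel")}
    return int(root.get("run")), gains


if __name__ == "__main__":
    text = conditions_to_xml(1234, {"ecal_07": 1.02, "ecal_03": 0.98})
    print(text)
    print(conditions_from_xml(text))
```

The sketch also shows the divergence problem noted above: nothing in XML itself stops another group from choosing different element names for the same information.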
Rising trends
Open source in general
“Open source, please. Your interests rarely in commercial vendor’s interests” (M.Purschke, PHENIX)
In the CDF/D0 success column, similarly all over
DBs, Qt, utility libraries, … and Linux, it goes without saying
Extraordinary capability and quality
Java, to a degree
Important limitations being addressed, e.g. manageable C++ interoperability (JACE autogeneration of interface)
JAS, NLC sw, IceCube, CDMS DAQ, …
But not broadly competing with C++ in usage so far
HENP as CS partner and collaborator
To our mutual benefit in the Grid and in networking
Rising trends
“New” simulation engines: Geant4, FLUKA
Geant4 as a production tool
In production in BaBar: EM validation in hand, hadronic beginning, robust and reasonably fast
ATLAS on the way to completing G4 transition after two years of physics validation
CMS, LHCb also transitioning over next year
GLAST using LHCb/Gaudi Geant4 interface
FLUKA not new – established and widely used – but new integration efforts as a detector simulation engine for the four LHC experiments
FLUGG interface to G4 geometry
ALICE Virtual Monte Carlo as uniform interface to multiple engines (FLUKA, Geant4, Geant3)
Interest from other experiments; joint LCG project starting
Used for Geant4 testing
FLUKA integration in progress
Rising trends
Automation in software development/management
Heard about several automated tools for code building and testing, release integration & tag management, configuration management
Popular new software web portal at CERN LCG/SPI http://savannah.cern.ch
Automated textual and statistical analysis of test outputs
Rising trends – The Grid
The central importance of distributed computing to future (increasingly, present) HENP is long known
‘The Grid’ as the means to that is now established
Major, broad successes in funding and in attracting collaboration with CS
F.Berman, Grid 2003: “HEP has set a model for integration, focus, coordination”
Progress in applying Grid software and infrastructure to real problems
Batch production
Clearly the chosen path; success to be proven, but has promise and broad commitment
The Grid
F.Berman, Grids on the horizon:
Must be useful, usable, stable; supported
More cooperative than competitive [Not always the case today!]
Applications are key to success
Not a “Field of Dreams” “build it and they will come” R&D field any more
Grid killer app: a focus on data. Good match to us
Still a long way to go
The Grid
Miron Livny:
Benefit to science: democratization of computing
Still very manpower intensive: when the support team goes on holiday, so does the Grid (CMS testbed in Dec)
Best practice middleware requires
True collaboration, “open minds” (cf. Berman)
Testing, deployment/adoption, evaluation metrics, robustness, professional support, longevity, responsiveness to show stoppers, …
Much to do and improve but important progress
E.g. VDT as standard middleware suite
Important developments
Community consensus on a C++ object store: ROOT I/O
Though many approaches to its use
Combined with RDBMS for physics data storage
CDF, RHIC, LHC, BaBar, GLAST, …
Software engineering is catching up to us – F. Carminati
“High ceremony processes” are not an obvious success
And we are not alone…
“Agile methodologies”, Extreme Programming (XP), is SE’s response
Extremely close to a successful HENP working model
Adaptive, simple, incremental, tight iterations, plan for change, adjust the methodology for your environment
“I just learned we use XP”, comment from CDF
Means of responsibly formalizing and addressing – in a useful way – software engineering in HEP, and software management
Both must be effective and lightweight: Agile
Important developments
Major strides in networking
HENP a leading applications driver and a co-developer of global networks (H. Newman)
Require rapid global access to event samples and analyzed physics results drawn from massive data stores
PB by 2002, ~100 PB by 2007, ~1 Exabyte by ~2012
Rate of Progress >> Moore’s Law
Factor of ~1M in 1985-2005 (~5k during 1995-2005) in global HENP network bandwidth
Factor of 25-100 Gain in max sustained throughput in 15 months on some US+TransAtlantic routes
Network providers see us as an opportunity because we push real production applications
Future promise: Optiputer (P.Papadopoulos)
“Key driving applications keep the IT research focused”
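The "Rate of Progress >> Moore's Law" comparison can be checked with a few lines of arithmetic that turn the quoted bandwidth factors into annual growth factors and doubling times. The 18-month doubling period used for Moore's Law is an assumption of this sketch (the slide does not state one), and the function names are illustrative.

```python
import math


def annual_growth_factor(total_factor, years):
    """Constant yearly multiplier that compounds to total_factor over the period."""
    return total_factor ** (1.0 / years)


def doubling_time_years(total_factor, years):
    """Time to double, assuming constant exponential growth over the period."""
    return years * math.log(2.0) / math.log(total_factor)


# Assumption: Moore's Law taken as a doubling every 18 months.
MOORE_DOUBLING_YEARS = 1.5

if __name__ == "__main__":
    cases = {
        "1985-2005 (factor ~1M)": (1e6, 20),
        "1995-2005 (factor ~5k)": (5e3, 10),
    }
    for label, (factor, years) in cases.items():
        print(
            f"{label}: x{annual_growth_factor(factor, years):.2f} per year, "
            f"doubling every {doubling_time_years(factor, years):.2f} years"
        )
    moore = annual_growth_factor(2.0, MOORE_DOUBLING_YEARS)
    print(f"Moore's Law (assumed 18-month doubling): x{moore:.2f} per year")
```

Under these assumptions the quoted factors correspond to bandwidth roughly doubling every year over 1985-2005, and growing faster still over 1995-2005, against about 1.6x per year for the assumed Moore's Law rate.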
Important developments
The LHC Computing Grid Project
Major new internationally supported effort to build the distributed computing environment of the LHC
Encompasses
the distributed computing facility
Site fabrics (facilities), middleware selection, integration, testing, deployment at distributed sites, operations and support, …
the common physics applications software
Persistency, core libraries and services, physics analysis interfaces, simulation and other frameworks, all in a distributed environment
Must succeed if LHC computing is to succeed!
An impressive effort by the experiments together with CERN to work in accord across the scope of computing
Managed so as to ensure comprehensive oversight by the experiments
First testbed deployment is this summer (LCG-1)
Including the first major applications deployment, POOL persistency framework (ROOT I/O + MySQL hybrid)
Important developments
Success of mass stores
Castor “reliable and effective” (ALICE)
D0/CDF convergence on successful Enstore/SAM
HPSS successful at RHIC
Exciting new generation of specialized lattice gauge computers (B. Sugar)
Two tracks:
QCD on a chip: QCDOC, a “technical marvel”, project with IBM
$1M/Tflop, aiming at 10+ Tflop at BNL in 04
Optimized commodity clusters
Pentium 4, Myrinet/Gbit Ethernet
10+ Tflop at FNAL and JLAB by 06
SCIDAC grant to improve software usability
Receding trends
Objectivity and ODBMS in general
“Jury still out” at CHEP 2000 (P.Sphicas), but now clear
Objectivity dropped or being phased out by LHC experiments, COMPASS, BaBar event store
In PHENIX “becoming a liability” (compiler issues); augmented with RDBMSs
Not due to technical failure but a mix of technical problems, commercial concerns, manpower costs, availability of an alternative
Its replacements are not other ODBMSes but files (often ROOT) + RDBMS (MySQL, Oracle, Postgres…) for metadata
Magnetic tape (apart from archival)
PASTA: “unlimited” multi-PB disk caches technically possible but true cost is unclear (reliability, manageability)
File system access under urgent investigation
“tapes as random access device no longer a viable option” – large disk caches needed for LHC analysis
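The "files + RDBMS for metadata" pattern mentioned above can be sketched as a small file catalog: event data stays in ordinary files, and a relational table records which file holds which events. The sketch uses Python's built-in `sqlite3` as a stand-in for MySQL, Oracle or Postgres; the table layout, column names and file paths are hypothetical, not any experiment's actual schema.

```python
import sqlite3


def create_catalog(conn):
    """Create the (hypothetical) metadata table; the event data itself stays in files."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS event_files (
               path TEXT PRIMARY KEY,
               run INTEGER NOT NULL,
               first_event INTEGER NOT NULL,
               last_event INTEGER NOT NULL)"""
    )


def register_file(conn, path, run, first_event, last_event):
    """Record which run and event range a data file covers."""
    conn.execute(
        "INSERT INTO event_files VALUES (?, ?, ?, ?)",
        (path, run, first_event, last_event),
    )


def files_for_event(conn, run, event):
    """Look up the file(s) holding a given event, without opening any data file."""
    rows = conn.execute(
        "SELECT path FROM event_files "
        "WHERE run = ? AND ? BETWEEN first_event AND last_event ORDER BY path",
        (run, event),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_catalog(conn)
    register_file(conn, "/data/run1001_a.root", 1001, 1, 5000)
    register_file(conn, "/data/run1001_b.root", 1001, 5001, 10000)
    print(files_for_event(conn, 1001, 7500))
```

The division of labour is the point of the sketch: the database answers "where is it?" queries, and the bulk data is read from files only once the right file is known.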
Receding trends
Commercial software? No…
Some in decline (Objy, LHC++), but new prospects opening (IBM, Sun, MS, …) in Grid
Open source now has an important commercial element we derive great benefit from (even post-.com crash)
Red Hat, MySQL, Qt, …
Underrepresented
Collaborative tools
Was represented this week, but only lightly
Vital for distributed collaboration on software development and physics analysis
H. Newman: need culture of collaboration
Distributed and remote collaboration should be the norm
Not solely, or even predominantly?, a matter of tool development in the community
How is the exponential commercial side evolving and how can we leverage it
What is the evolutionary path, strategy, role for community-developed tools such as VRVS
Why is the user experience often poor
Poor physical facilities/configurations, instabilities, heterogeneous tools/protocols, support issues, …
Current experience sometimes competes unsuccessfully with the telephone, despite all the shortcomings
Open questions
Distributed analysis
What will it look like? What development line(s) are taking us there?
Still very much R&D pursued in multiple directions
Several models (e.g. R.Brun) with varying degrees of Grid exploitation/distributed character
H.Newman – where is the comprehensive global system architecture?
M.Livny – have to proceed incrementally, step by step, from the bottom up
Some efforts were reported which are incrementally extending established analysis tools into Grid-based analysis
PROOF, JAS
Others working from various starting points
Genius, Ganga, Clarens, …
Open questions
Distributed analysis continued
Production: environments more well-defined, tools more advanced, a few in production, varying levels of middleware usage
AliEn (ALICE), SAM-Grid (Run2), CMS tool suite, GRAT, Magda (ATLAS), DIRAC (LHCb) …
Not a lot of sharing/collaboration above the middleware level!!
Necessary precursor to the more complex analysis environment, and hard in itself
What analysis improvements will the Grid really provide? (panel discussion)
What analysis improvements will the Grid really provide? (panel discussion)
Some of the comments… (what I heard, not what was said)
Murphy’s Law needs to be beaten, not Moore’s Law (V.Innocente)
From a technical point of view, the realization of a successful grid will be a single integrated distributed computing center (R.Mount)
But beyond the technical, a successful grid will grow human resources, drawing in distributed people not otherwise involved, as well as material resources (M. Kasemann)
The grid is more than this. The LHC will build the first global collaboration, reaching out to uninvolved countries. This incurs on us an obligation. Through the grid we must make their participation possible and their resources useful. (H. Newman)
It is an unprecedented opportunity to screw up. But we have no choice, we cannot put it all in one place. Focus on reliability.
Grid panel 2
The grid is something new. We can’t let a ‘one virtual computing center’ be the dominating thing. There should be no dominant force and we should avoid centralized decision making. This will help analysis. (L. Robertson)
Grids enable collaboration at a scale not attempted before. Distributed efforts are motivated to compete with one another and with the central site, and this brings benefits and resources. Analysis groups are teams, spread across continents and time zones. How do they collaborate? The grid should provide the solution. Also, provenance is largely overlooked, but it is key to analysis. (P. Avery)
We have no model for how 5000 users will use a globally distributed facility. System issues must be addressed now. (H. Newman)
Physicists should not see the grid at all. It should be transparent. (P. Mato)
Grid panel 3
The grid will be successful if we make it simple. Will force some coherence in the development of distributed analysis tools. Too much process will kill the process. There is not enough prototyping going on. (R.Brun)
Agree, we need more prototyping. We need candidate strategies, then build prototypes, and see what works. You have to do this before you will be able to abstract from experience and automate & make transparent the approaches that work. (H. Newman)
Funding agencies, computer scientists, other sciences are excited by the HEP grid work, e.g. on provenance. Possibility of infusion of funding. Could pursue Google-like response to what now takes 3 months. (R. Mount)
The grid will enable collaborative work and harness distributed brainpower. It will allow HENP to be more present as a field at the home university. This is important for the health of our field. (H. Newman)
There is lots to learn from existing experiments. (R. Brun)
Open questions
Impact of facility security on Grid computing Site security in the grid era – Dane Skow
Avoid complexity in designing security; it is the bane of secure systems
Must be agile in the face of change; resistant to attack
Risk management, not elimination; must accept some risk to carry on work
No clear answers in the bottom line, there is much yet to be resolved and understood, and many are working
Workable resolution is vital, since you don’t have a usable grid if the walls don’t have sockets
Open questions
Impact of OGSA migration (Globus) on middleware
Open Grid Services Architecture
Leveraging industry standard web services
Much industry involvement
IBM, Sun, NEC, Oracle, …
Attention given to backward compatibility
Promising approach; may the migration go well!
Alpha is under test; production release in June
Major dependency given Globus’ foundation role in our middleware
Current Globus2 will be supported for some time but we will be interested in new functionality
Open questions
Utility and practicality of generate-on-demand virtual data (‘virtual data by materialization’)
Networking going well; cost/complexity equation favors copying
Interesting talk (C.Jones) on successful implementation and use for many years in CLEO
Relies on user discipline to ensure regenerated data is trustworthy
Utility of data provenance management, needed for secure trust of on-demand data, is a separate question
Should have important utility, not only for virtual data (reproducibility, trust) but as a communication mechanism in widely distributed collaborations
Cannot allow reliance on hallway conversations with production gurus
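The idea of generate-on-demand data backed by recorded provenance can be sketched as a catalog that keys each data product by its recipe (transformation name, version, parameters), materializes it on first request, and reuses it afterwards. This is a hypothetical illustration only; the class and method names are invented here and do not describe the CLEO implementation or any Grid virtual data system.

```python
import hashlib
import json


class VirtualDataCatalog:
    """Materialize a data product on first request; record how it was made."""

    def __init__(self):
        self._products = {}
        self.provenance = {}

    @staticmethod
    def recipe_key(transform_name, version, params):
        # The key depends only on the recipe, so identical requests share one product.
        recipe = json.dumps(
            {"transform": transform_name, "version": version, "params": params},
            sort_keys=True,
        )
        return hashlib.sha256(recipe.encode("utf-8")).hexdigest()

    def materialize(self, transform, version, params):
        key = self.recipe_key(transform.__name__, version, params)
        if key not in self._products:
            self._products[key] = transform(**params)
            # Provenance record: enough to regenerate or audit the product later.
            self.provenance[key] = {
                "transform": transform.__name__,
                "version": version,
                "params": dict(params),
            }
        return self._products[key]


def calibrated_energies(raw, gain):
    """Stand-in transformation; a real one would run reconstruction code."""
    return [gain * value for value in raw]


if __name__ == "__main__":
    catalog = VirtualDataCatalog()
    request = {"raw": [1.0, 2.0], "gain": 2.0}
    print(catalog.materialize(calibrated_energies, "v1", request))
    print(len(catalog.provenance), "provenance record(s)")
```

Note that the version string is supplied by the caller, which mirrors the point above: the regenerated data is only as trustworthy as the discipline with which versions and parameters are declared.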
Concerns
Data analysis as “the last wheel of the car” (R. Brun)
Clear message from current generation (e.g. Run 2, BaBar): don’t leave data analysis systems and infrastructure too late, it will lead to problems
Vastly more true when we are talking about doing globally distributed analysis, for the first time
with unprecedented volume and complexity, e.g. Terabyte scale at the LHC
Making distributed analysis both very difficult and mandatory
We cannot bootstrap ourselves into a global analysis system; it will take long incremental work, so we had better be working in a coordinated & effective way now
R. Brun: Will not converge on one system; will be multiple competing systems, and that will not be bad [hopefully a small number]
Concerns
Are we doing enough to ensure senior people can contribute directly to physics analysis?
How do we interpret the fact (R. Brun) that PAW usage is still rising?
Has everyone bought the C++/OO paradigm shift? Are we developing and/or providing the right tools?
Is there enough engagement of senior physicists in the (limited) exploratory work being done on future physics analysis environments?
Almost certainly no, and may be difficult to attract their attention unless/until attractive prototypes can be turned loose on them
Major Challenges
Storage architecture “possibly biggest challenge for LHC” (PASTA)
Seamless integration from CPU caches to deep archive
Currently very poor data management tools for storage systems
More architectural work needed in next 2 years
Future ALICE Data Challenges
New technologies: CPUs, servers, network
[Plot: axis in MBytes/s]
R. Divia
Conclusions (1)
Coming experiments must learn from prior generations: give early (i.e. for LHC, immediate) attention to data analysis
It will take generations of incremental iterations of design, prototyping and stressful deployment to get it right
Particularly in the unprecedented global collaborative environment of the LHC
C++ is a mature and accepted standard
Several generations of C++ code in production experiments (BaBar, Run 2, …)
Maturation of tools into broad usage (Geant4, ROOT I/O)
No sign of a major new language migration so far [thank goodness]
But beware excessive complexity and remember the promise of accessible, usable software
Conclusions (2)
Grids and networking are making great strides
HENP is a successful and valued partner with CS
We provide a community focused on challenging large-scale deployments in real research settings
But Murphy’s Law is a potent adversary today; far from robust transparency, and much much more to do
Global collaborative computing must become a successful norm for us
Down to the global researcher at the home institute
Rich leadership potential for our field
Important new common endeavours like the Grid and LCG have much invested in their success… will be interesting to measure the degree of success at next CHEP
Thanks
Thanks to Jim Branson and his team of organizers for giving us
A stimulating program and comfortable schedule
More-than-pleasant facilities and surroundings
Terrific banquet, I hear!
A very successful conference.
I for one will return to La Jolla any time