Central Database Support
Summary of Activities of IT/DB Group for LHC Computing Review
http://wwwinfo.cern.ch/db/
Overview
• Overview of IT/DB Group
• ORACLE activities
• Objectivity/DB activities
• Espresso
• Conclusions Concerning Alternatives
• Summary
IT/DB Group
• Formed as part of IT restructuring Jan 2K
  • 2 sections: ORACLE (dbr); Objectivity/DB (odb)
• Manpower issues:
  • urgent staff replacements
    • retirements in pipeline in DBR
    • resignation (already) in ODB
  • ensure adequate staffing in 2005+
    • e.g. many staff on short-term contracts
• Streamlining working methods
• Primary focus is production
Working Methods
• Ensure that all activities have both a responsible + backup(s)
• Identify areas where common solutions can be applied
• Use appropriate tools to improve response; maximize knowledge sharing
  • Problem Tracking, Newsgroups, FAQs, Web, …
• Leverage many years experience with production ORACLE services
ORACLE Activities
ORACLE Activities
• “Mission Critical” activities include:
  • Support for EDMS project
  • Support & Running of Accelerator Services
  • Support & Operations of Central DB Servers
• Major new activities include:
  • ORACLE for “physics” applications
    • e.g. ALEPH / LHCb book-keeping; LHC detectors
  • LEP decommissioning; LHC construction
  • Windows 2000; Forms to Web + Java
• Everybody at CERN is an ORACLE user!
ORACLE Services
• Engineering Data Management System
  • manage engineering data for LHC project, machine and experiments (+SL, ST, PS, …)
• Accelerator Services
  • LEP logging, LEP+SPS measurements, …
  • Anticipate similar usage for LHC
• Central Services
  • Network DB (LANDB)
• Physics-related Activities
• And the CERN network...
ORACLE Summary
• ORACLE production fundamental to CERN
• Increased usage in physics experiments
• Should exploit “career building” value of ORACLE experience
  • attract short term staff & visitors
• But must retain core expertise to run these critical services
Objectivity Activities
Objy Production Services
• Used in production by numerous groups
  • CHORUS, COMPASS, CMS, NA45, ALEPH …
• Major CMS production in progress
  • Expect few TB of data over several weeks
• ATLAS plan to significantly increase use of Objy in 2000
• COMPASS production - starts May 2000
Objectivity Successes
• Standards “influenced” ODBMS successfully deployed for numerous HEP experiments at several sites
• Major milestones, including 170MB/s data rate (CMS), 35+TB total data (BaBar) met
• Important enhancement requests (MSS interface, security hooks, other AMS extensions) delivered and in production
Objectivity Problems
• Concerns over market size / company stability
  • Plans to go public this year; sink or swim?
• Pending enhancement requests (VLDB)
  • Planned for end-2000 release
• Support issues need to be addressed
  • Better response to problems; improved information flow; support for RCs etc.
Objectivity Issues
• Recent visit to Objy to pursue major issues
• Classified as eXpress, Short, Medium and Long term
  • X - build for Red Hat 6.1; AMS instabilities
  • S - DBID API
  • M - VLDB support
  • L - “Multi-FD” issues
• 
  • X - V5.2.1 for RH6.1 now available; AMS being worked on
  • S - DBID in V6 (~summer 2000?)
  • M - draft spec ~May; final spec ~August
  • L - enhancements in V6, V6++; revisit later
• Cautiously confident that technical issues will be solved
Objectivity Summary
• Usable and used in production
• Still need some enhancements to meet LHC baseline requirements
• Baseline assumption for ATLAS and CMS
• If market takes off (successful IPO), then growth of company, local support, etc. will follow
  • But will we be able to influence product?
• A fallback strategy is mandatory
Risk Analysis: Issues
• Choice of Technology
  • ODBMS, ORDBMS, RDBMS, “light-weight” Persistency, files + meta-data, ...
• Choice of Vendor (historically)
  • #1 Objectivity, #2 Versant
• Size of market
  • Did not take off as anticipated; unlikely to grow significantly in short-medium term
Persistency: Conclusions
• Objectivity/DB is viable technically
• No viable alternative commercial ODBMS
• Other possibilities include:
  • “Open Source” (?) ODBMS solution
  • ORDBMS-based solution (also for event data)
  • Meta-data + files
• RD45 investigating & directly based on experience at FNAL / BNL ...
RDBMS Investigations
ORDBMS Questions
• To what extent can ORDBMSs scale?
• What would be the impact on DBA; developer; user
• Oracle being used to store meta-data
• Project in CMS to study extended RDBMS for event data (Informix)
• Possible studies also with Oracle
Espresso
Espresso
• Espresso is a proof-of-concept prototype built to answer questions from Risk Analysis
  • Could we build an alternative to Objectivity/DB?
  • How much manpower would be required?
  • Can we overcome limitations of Objy’s current architecture?
    • Support for VLDBs, multi-FD work-arounds etc.
• Test / validate important architectural choices
Espresso - Current Status
• A working prototype has been produced, implementing the ODMG C++ binding on which HepODBMS is layered
• LHC++ Histograms (HTL), tags, and other applications have been successfully ported
  • plans to port G4 examples, Iguana, ORCA, ...
• Successfully demonstrates feasibility, but more work on scalability / performance / robustness required
Espresso - Next Steps
• Start detailed requirement discussion with experiments and other interested institutes
• Continue Scalability & Performance Test
  • Storage Manager: larger files (>100GB)
  • Page Server: connections > 500
  • Lock Server: number of locks > 20k
  • C++ Binding & Schema Manager: port Geant4 persistency examples and Conditions-DB
• By summer this year
  • Written Architectural Overview of the Prototype
  • Development Plan with detailed manpower estimates
  • Single user evaluation system
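The scalability targets on this slide (file size, connections, locks) can be written down as a small checklist. The Python sketch below is illustrative only: the dictionary keys, the function name and the measured values are assumptions made for the example, and only the three thresholds come from the slide.

```python
# Thresholds taken from the slide; every name here is a hypothetical label.
TARGETS = {
    "storage_manager_file_gb": 100,   # Storage Manager: files larger than 100 GB
    "page_server_connections": 500,   # Page Server: more than 500 connections
    "lock_server_locks": 20_000,      # Lock Server: more than 20k locks
}


def unmet_targets(measured):
    """Return, sorted by name, the targets that a test run has not yet exceeded."""
    return sorted(
        name
        for name, threshold in TARGETS.items()
        if measured.get(name, 0) <= threshold
    )


# Placeholder numbers for illustration, not results from any real test.
example_run = {
    "storage_manager_file_gb": 120,
    "page_server_connections": 350,
    "lock_server_locks": 20_000,
}
print(unmet_targets(example_run))
```

The comparison is strict because the slide states each target as "greater than", so a run that only reaches a threshold still counts as unmet.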
Espresso - Summary
• Initial prototype suggests that it is technically feasible
• Discussions with other sites suggest that interest goes well beyond HEP
• Manpower estimates / possible resources indicate “project” would have to start “soon”
Persistency - Summary
• “ODBMS-like” solution is still preferred
• Functional & support requirements should be available by ~October 2000
• Investigations of other possibilities will proceed in parallel
• Information on all approaches should be available in time for “2001 decision”
IT/DB Summary
• Production Database Services are the “raison d’être” of the IT/DB group
• Production services based on ORACLE and Objectivity/DB must & will continue
http://wwwinfo.cern.ch/db/
End of Presentation
Background slides follow...
RD45 Activities
• Proposed activities presented at LCB Review (November 1999) and CHEP 2K
• Basically consist of:
  • Production activities
    • IT/DB Group
  • Preparation for “2001 choice”
    • Requirements WGs, Risk Analysis, Customer / HEP visits etc.
• Some slides from LCB / CHEP follow…
Guiding Principles
• “In particular, the data should be presented in as consistent a way as possible. The data themselves may be stored in a variety of formats but this should be hidden from the user…”
• “The ODMG ... binding is based on one fundamental principle: the programmer should perceive the binding as a single language for expressing both database and programming operations, not two separate languages with arbitrary boundaries between them.”
• Capability of scaling to LHC data volumes & rates
• Capable of satisfying wide variety of HEP needs
  • DAQ, SIM, REC, Analysis, ...
• Use of “standard”, widely-used solutions if applicable
Database Production Service - What is missing?
• Transparent non-blocking interface with MSS
• User capability to:
  • export, extract, replicate data and schema
  • manipulate data and schema outside production database and while accessing data and schema from production database
• Fully functional, reliable high-quality database system including
  • VLDB support (>>1PB)
  • management tools
From L. Silvestris: Review of application software services for the LHC era, FOCUS 07/10/99
[Slide callouts: CMS; Objy V5.2; Objy V6?; BaBar]
O(R)DBMS Evolution
From CMS Computing Technical Proposal:
“If the ODBMS industry flourishes it is very likely that by 2005 CMS will be able to obtain products, embodying thousands of man-years of work, that are well matched to its worldwide data management and access needs. The cost of such products to CMS will be equivalent to at most a few man-years. We believe that the ODBMS industry and the corresponding market are likely to flourish. However, if this is not the case, a decision will have to be made in approximately the year 2000 to devote some tens of man-years of effort to the development of a less satisfactory data management system for the LHC experiments.”
[Diagram: ODBMS / RDBMS / ORDBMS comparison. Recoverable labels:]
• RDBMS + “object extensions”: can store ADTs; “methods” on server
• Complex data with queries
• $8B in 1996
• Likely to become dominant DBMS technology
• Complex data; performance, scalability; tight language binding
• OQL - SQL3 query subset
• Growth similar to RDBMS in ’80s
• ~$1B market by 2001
• ~$100M?
Risk Analysis: Issues
• Choice of Technology
  • ODBMS, ORDBMS, RDBMS, light-weight POM, files + meta-data etc.
• Choice of Vendor
  • #1 Objectivity, #2 Versant
• The Home-Grown approach
  • Estimate resources required
  • Implies proof-of-concept prototype
Versant
Risk Analysis: Summary of Options
• Evaluate C++ binding to e.g. ORACLE
• Add ESCROW clause to Objectivity contract
• Pursue possibility of source license
• Visit key Objectivity customers
• Produce new requirements list
• Estimate manpower to support Objy in house
• Estimate manpower for “clean-sheet” solution
• Continue to monitor alternatives
The LCB agrees with the other suggested steps to mitigate risk, with the addition of trying to insure that user code in reconstruction and analysis programs is kept as standards compliant as possible.
Risk Analysis: Conclusions
• A solution is certainly possible!
• How much should we align ourselves with industry trends / standards?
• ODBMS unlikely to dominate DBMS market
  • Likely to survive foreseeable future - market!
• Need to complete current prototype to make meaningful manpower estimates
Future Activities
• Production Services
  • Considered essential by several experiments
  • Tools, documentation, regular releases, … general production level support
  • Push for VLDB and other enhancements
• “2001” milestone
  • Revise requirements
  • Visit other HEP labs (BNL, FNAL, SLAC, …)
  • Provide ODBMS-independent s/w layer
  • Estimate man-power for alternative POM
  • Evaluate ORDBMS technology
Summary (+)
• We have a good understanding of ODBMS technology & Objectivity/DB in particular
• System has been demonstrated to work in production up to level of today’s (BaBar) experiments
• Many enhancements have been delivered, others in pipeline
• Production experience will be invaluable for LHC (product enhancements, tools, etc.)
Summary (-)
• The ODBMS market has not taken off as was previously predicted
• We need to assure ourselves that there is sufficient non-HEP demand (and $$$)
• We need to (in any case) understand how an eventual migration could be handled
• We need to develop at least one realistic fallback scenario
Conclusions
• R&D phase of RD45 has now led to production ODBMS services
• Risks of current strategy well understood - risk management must continue
• We are well placed to prepare for “2001 milestone”
• Future focus:
  • Production
  • Road-map to 2001 and beyond
RD45 - Future Activities
• Revise requirements
  • establish WGs, together with experiments
• Visit other HEP labs (BNL, FNAL, SLAC, …)
  • Recent SLAC visit; BNL ~Sep 2K; FNAL 2001?
• Provide ODBMS-independent s/w layer
  • Extension of existing HepODBMS
• Estimate man-power for alternative POM
  • Preliminary estimates available: ~15MY
• Evaluate ORDBMS technology
  • ORACLE meeting Oct 2K; work in CMS with Informix
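The idea of an ODBMS-independent software layer can be illustrated with a small sketch. It is written in Python for brevity, whereas the layer discussed on this slide extends the C++ HepODBMS package. The class and method names (`PersistencyBackend`, `InMemoryBackend`, `EventStore`) are hypothetical and are not taken from HepODBMS or from the ODMG binding.

```python
from abc import ABC, abstractmethod


class PersistencyBackend(ABC):
    """Hypothetical minimal interface that any storage backend must provide."""

    @abstractmethod
    def store(self, obj):
        """Persist a dict-like object and return an opaque identifier."""

    @abstractmethod
    def load(self, oid):
        """Return a copy of the object previously stored under oid."""


class InMemoryBackend(PersistencyBackend):
    """Stand-in backend; a real one might wrap an ODBMS, an ORDBMS or plain files."""

    def __init__(self):
        self._objects = {}
        self._next_oid = 1

    def store(self, obj):
        oid = self._next_oid
        self._next_oid += 1
        self._objects[oid] = dict(obj)  # copy, so later edits by the caller do not leak in
        return oid

    def load(self, oid):
        return dict(self._objects[oid])


class EventStore:
    """User-facing layer: application code sees only this class, never the backend."""

    def __init__(self, backend):
        self._backend = backend

    def save_event(self, run, event, payload):
        return self._backend.store({"run": run, "event": event, "payload": payload})

    def read_event(self, oid):
        return self._backend.load(oid)


store = EventStore(InMemoryBackend())
oid = store.save_event(run=1, event=42, payload="raw hits")
print(store.read_event(oid))
```

Because application code calls only `EventStore`, replacing the in-memory stand-in with a different backend would leave user code untouched, which is the purpose of such a layer.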
Requirements WGs
• Functional
  • e.g. scalability to LHC data volumes & rates
  • platform / language heterogeneity
  • transactional safety and crash recovery
  • navigational access at ~disk / network speed
• Support / Release
  • e.g. notification of new & withdrawn features
  • support for new “platforms” within X months
  • advance notice of release schedule
  • automatic acknowledgement of PRs, change of state, etc.
Examples of possible functional / support requirements
RD45 Summary
• Experiments have requested continuation of:
  • Meetings; Workshops; White-papers
  • Workshop prior to CHEP 2K; next July 4-5; Oct-Nov?
• In addition, proposed R&D items are:
  • Support for the choice of database system
  • Manpower estimate for an Alternative Persistent Object Manager
  • A database independent software layer based largely on the ODMG interface standard
  • The analysis and revision of LHC database requirements
  • The potential use of mainstream ORDBMS products, such as ORACLE 8i