Reconstruction and Analysis on Demand:
A Success Story
Christopher D. Jones
Cornell University, USA
C. Jones CHEP03 2
Overview
• Describe “Standard” processing model
• Describe “On Demand” processing model
  – Similar to GriPhyN’s “Virtual Data Model”
• What we’ve learned
• User reaction
• Conclusion
Standard Processing System
• Designed for reconstruction
  – All objects are supposed to be created for each event
• Each processing step is broken into its own module
  – E.g., track finding and track fitting are separate
• The modules are run in a user-specified sequence
• Each module adds its data to the ‘event’ when the module is executed
• Each module can halt the processing of an event
[Diagram: InputModule → Track Finder → Track Fitter → OutputModule]
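The fixed-sequence model above can be sketched in a few lines of C++. This is an illustrative mock-up, not the actual CLEO framework API: the `Event` blackboard, `Module`, and `processEvent` names are assumptions made for the example.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// The 'event' is a blackboard: each module adds its data products to it.
using Event = std::map<std::string, std::vector<double>>;

struct Module {
    std::string name;
    // Returns false to halt processing of the current event.
    std::function<bool(Event&)> run;
};

// Modules execute in exactly the order the user listed them; any module
// returning false stops the event (e.g., a filter rejecting it).
bool processEvent(Event& evt, const std::vector<Module>& sequence) {
    for (const auto& m : sequence)
        if (!m.run(evt)) return false;  // a module halted this event
    return true;
}
```

Note how the burden of ordering falls on the user: if `TrackFitter` were listed before `TrackFinder`, the job would silently operate on missing data.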
Critique of Standard Design
• Good
  – Simple mental model
    • Users can feel confident they know how the program works
  – Easy to debug
    • Simple to determine which module had a problem
• Bad
  – User must know inter-module dependencies in order to place the modules in the correct sequence
    • Users often run jobs with many modules they do not need in order to avoid missing a module they might need
  – Optimization of the module sequence must be done by hand
  – Reading back from storage is inefficient
    • Must create all objects from storage even if the job does not use them
On-demand System
• Designed for analysis batch processing
  – Not all objects need to be created each event
• Processing is broken into different types of modules
  – Providers
    • Source: reads data from a persistent store
    • Producer: creates data on demand
  – Requestors
    • Sink: writes data to a persistent store
    • Processor: analyzes and filters ‘events’
• Data providers register what data they can provide
• Processing sequence is set by the order of data requests
• Only Processors can halt the processing of an ‘event’
[Diagram: Source → Processor A → Processor B → Sink]
Data Model
A Record holds all data that are related by lifetime, e.g., the Event Record holds Raw Data, Tracks, Calorimeter Showers, etc.
A Stream is a time-ordered sequence of Records
A Frame is a collection of Records that describe the state of the detector at an instant in time.
All data are accessed via the exact same interface and mechanism
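The Record/Stream/Frame relationship can be sketched with minimal C++ types. The class names follow the slide, but the representations (string-keyed maps, raw pointers) are assumptions for illustration, not the actual CLEO classes:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// A Record holds all data that share a lifetime
// (e.g., one event's worth of Raw Data, Tracks, Showers, ...).
struct Record {
    std::map<std::string, std::vector<double>> data;
};

// A Stream is a time-ordered sequence of Records of one kind
// (an event stream, a calibration stream, ...).
using Stream = std::vector<Record>;

// A Frame is a collection of Records, one per Stream, describing
// the state of the detector at an instant in time.
struct Frame {
    std::map<std::string, const Record*> records;  // stream name -> current Record
    const Record& record(const std::string& stream) const {
        return *records.at(stream);  // throws if no such stream
    }
};
```

The point of the Frame is uniform access: an event Record and a calibration Record are reached through the exact same `record(...)` call, matching the “same interface and mechanism” rule above.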
Data Flow: Frame as Data Bus
[Diagram: the Frame acts as a data bus connecting the Event Database, Calibration Database, TrackFinder, TrackFitter, SelectBtoKPi, EventDisplay, and Event List]
• Sources: data from storage
• Producers: data from algorithms
• Processors: analyze and filter data
• Sinks: store data
• Data Providers: return data when requested
• Data Requestors: run sequentially for each new Record from a source
Callback Mechanism
• Provider registers a Proxy for each data type it can create
• Proxies are placed in the Record and indexed with a key
  – Type: the object type returned by the Proxy
  – Usage: an optional string describing the use of the object
  – Production: an optional run-time settable string
• Users access data via a type-safe templated function call
    List<FitPion> pions;
    extract( iFrame.record(kEvent), pions );
  – (based on ideas from BaBar’s Ifd package)
• The extract call builds the key and asks the Record for the Proxy
• The Proxy runs its algorithm to deliver the data
  – Proxy caches the data in case of another request
  – If a problem occurs, an exception is thrown
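The proxy-with-caching mechanism can be sketched as follows. This is a simplified illustration of the idea, not the CLEO or Ifd code: the key is collapsed to a single string, the payload to `std::vector<double>`, and the `Proxy`, `Record`, and `extract` shapes are assumptions.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

struct Proxy {
    std::function<std::vector<double>()> make;  // algorithm that creates the data
    bool cached = false;
    std::vector<double> value;                  // cached result for repeat requests
    const std::vector<double>& get() {
        if (!cached) { value = make(); cached = true; }  // run the algorithm once
        return value;
    }
};

struct Record {
    std::map<std::string, Proxy> proxies;  // key -> Proxy, registered by Providers
};

// extract(): build the key, ask the Record for the Proxy, and let the Proxy
// deliver the data on demand; a missing Proxy raises an exception.
const std::vector<double>& extract(Record& rec, const std::string& key) {
    auto it = rec.proxies.find(key);
    if (it == rec.proxies.end())
        throw std::runtime_error("no Proxy registered for " + key);
    return it->second.get();
}
```

The cache is what makes repeated requests cheap: the producing algorithm runs at most once per Record, no matter how many requestors ask for the same data.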
Callback Example: Algorithm
[Diagram: request chain — the Processor SelectBtoKPi requests fitted tracks from the Producer Track Fitter (FitPionsProxy, FitKaonsProxy, …), which requests tracks from the Track Finder (TracksProxy), which requests calibrated hits from the HitCalibrator (CalibratedHitsProxy), which in turn requests constants from the Source Calibration DB (PedestalProxy, AlignmentProxy, …) and raw data from the Raw Data File (RawDataProxy)]
Callback Example: Storage
[Diagram: the Processor SelectBtoKPi requests data directly from the Source Event Database (FitPionsProxy, FitKaonsProxy, RawDataProxy, …)]
In both examples, the same SelectBtoKPi shared object can be used
Critique of On-demand System
• Good
  – Can be used for all data access needs
    • Online software trigger, online data quality monitoring, online event display, calibration, reconstruction, MC generation, offline event display, analysis
  – Self-organizes the calling chain
    • Users can add Producers in any order
  – Optimizes access from storage
    • Sources only need to say when a new Record (e.g., event) is available
    • Data for a Record is retrieved/decoded on demand
• Bad
  – Can be harder to debug since there is no explicit call order
    • Use of exceptions is key to simplifying debugging
  – Performance testing is more challenging
What We Have Learned
• First release of the system was September 1998
• Callback mechanism can be made fast
  – Proxy lookup takes less than 1 part in 10⁷ of CPU time in a simple job that processed 2,000 events/s on a moderate computer
• Cyclical dependencies are easy to find and fix
  – Only happened once and was found immediately on the first test
• Do not need to modify data once it is created
  – Preliminary versions of data are given their own key
• Automatically optimizes performance of reconstruction
  – Trivially added a filter to remove junk events by using FoundTracks
• Optimize analysis by storing many small objects
  – Only need to retrieve and decode the data needed for the current job
User Reactions
• In general, user response has been very positive
  – Previously CLEO used a ‘standard system’ written in FORTRAN
• Reconstruction coders like the system
  – We have code skeleton generators for Proxy/Producer/Processor
    • Only need to add their specific code
  – Easy for them to test their code
• Analysis coders can still program the ‘old way’
  – All analysis code in the ‘event’ routine
• Some analysis coders are pushing the bounds
  – Place selectors (e.g., cuts for tracks) in Producers
    • Users share selectors via dynamically loaded Producers
  – Processor only used to fill Histograms/Ntuples
  – If selections are stored, only the Processor needs to be rerun when reprocessing data
Conclusion
• It is possible to build an ‘on demand’ system that is
  – efficient
  – debuggable
  – capable of dealing with all data (not just data in an event)
  – easy to write components for
  – good for reconstruction
  – acceptable to users
• Some reasons for success
  – Skeleton code generators
    • User only has to write new code, not infrastructure ‘glue’
  – Users do not need to register what data they may request
    • Data reads occur more frequently than writes
  – Simple rule for when algorithms run
    • If you add a Producer, it takes precedence over a Source
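The last rule — a Producer takes precedence over a Source for the same data — can be sketched as a registration policy. The `Registry` and `ProviderKind` names are illustrative assumptions, not the CLEO API:

```cpp
#include <cassert>
#include <map>
#include <string>

enum class ProviderKind { Source, Producer };

struct Registry {
    std::map<std::string, ProviderKind> providers;  // data key -> who delivers it

    void registerProvider(const std::string& key, ProviderKind kind) {
        auto it = providers.find(key);
        if (it == providers.end()) { providers[key] = kind; return; }
        // A Producer always displaces a Source for the same key, so freshly
        // computed data wins over stored data; a Source never displaces a
        // Producer that is already registered.
        if (kind == ProviderKind::Producer) it->second = kind;
    }
};
```

This one rule is what lets the same job configuration either recompute data (Producer loaded) or read it back from storage (Producer absent), with no change to the requesting code.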