Page 1: Statistical Testing Project

Maria Grazia Pia, INFN Genova

on behalf of the Statistical Testing Team

http://www.ge.infn.it/geant4/analysis/TandA

LCG-Application Meeting, CERN, 27 November 2002

Page 2: History and background

Page 3: What is it?

A project to develop a statistical analysis system, to be used in Geant4 testing

Provide tools for the statistical comparison of distributions
– equivalent reference distributions (for instance, regression testing)
– experimental measurements
– data from reference sources
– functions deriving from theoretical calculations or from fits

Main application areas in Geant4:
– physics validation
– regression testing
– system testing

Interest in other areas, not only Geant4? LCG?

Page 4: History

“Statistical testing” agreed in the Geant4 Collaboration as a major objective for 2002

Initial ideas presented at Geant4 TSB meeting, November 2001

Open brainstorming session at a Geant4-WG workshop, 31 May 2002

Inception phase, summer 2002
– Informal discussions with STT, Geant4 collaborators and interested potential developers
– Initial collection of user requirements in Geant4
– First version of software process deliverables: Vision, URD, Risk List

Presentation at Geant4 Workshop + parallel sessions, October 2002
– http://www.ge.infn.it/geant4/talks/G4workshop/CERN/pia/tanda-2002.ppt

Launch of the project

Page 5: The team

Development team

Pablo Cirrone, INFN Southern National Lab

Stefania Donadio, Univ. and INFN Genova

Susanna Guatelli, CERN/IT/API Technical Student and INFN Genova

Alberto Lemut, Univ. and INFN Genova

Barbara Mascialino, Univ. and INFN Genova

Sandra Parlati, INFN Gran Sasso National Lab

Andreas Pfeiffer, CERN/IT/API

Maria Grazia Pia, INFN Genova

Geant4 system integration team

Gabriele Cosmo, CERN/IT/API - Geant4 Release Manager

Sergei Sadilov, CERN/IT/API - Geant4 System Testing Coordinator

Statistical consultancy

Paolo Viarengo, Univ. Genova, Statistician

Interested collaborators are welcome!

+ requirements, suggestions, β-testing by many other Geant4 Collaborators (M. Maire, A. Ribon, L. Urban et al.)

Page 6: The vision

Page 7: Vision: the basics

Rigorous software process

Have a vision for the project
– An internal tool for Geant4 physics & STT?
– Also for Geant4 physics validation in the experiments?
– Parties other than Geant4 interested?

Who are the stakeholders?

Who are the users?

Who are the developers?

Build on a solid architecture

Clearly define scope, objectives

Flexible, extensible, maintainable system

Software quality

Clearly define roles

Page 8: Scope of the project

The project will provide tools for statistical testing of Geant4
– physics comparisons and regression testing
– multiple comparison algorithms

Generality (for application also in other areas) should be pursued
– facilitated by a component-based architecture

The statistical tools should be used in Geant4 (and in other frameworks)
– tool to be used in testing frameworks
– not a testing framework itself

Re-use existing tools whenever possible
– no attempt to re-invent the wheel
– but critical, scientific evaluation of candidate tools

Page 9: Architectural guidelines

The project adopts a solid architectural approach
– to offer the functionality and the quality needed by the users
– to be maintainable over a long time scale
– to be extensible, to accommodate future evolutions of the requirements

Component-based approach
– Geant4-specific components + general components
– to facilitate re-use and integration in diverse frameworks

AIDA
– adopt a (HEP) standard
– no dependence on any specific analysis tool

Python

The approach adopted is compatible with the recommendations of the LCG Architecture Blueprint RTAG

Page 10: The reason why we are here…

Core statistics comparison component + user layer
– can be generalised to a wider scope than Geant4 only

This is the reason why we present the project to LCG
– to establish a scientific discussion on a topic of common interest
– to see if there are any interested users
– to see if there are any interested collaborators

We would all benefit from a collaborative approach to a common problem
– share expertise, ideas, tools, resources…

Page 11: Software process guidelines

Significant experience in the team
– in Geant4 and in other projects

Guidance from ISO 15504
– standard!

USDP, specifically tailored to the project
– practical guidance and tools from the RUP
– both rigorous and lightweight
– mapping onto ISO 15504

Open to using tools provided by the LCG Software Process Infrastructure project

Page 12: Who are the stakeholders?

Name | Description | Responsibilities
Geant4 STT Coordinator | Coordinates system testing | Ensure that the system meets the needs of Geant4 System Testing
Geant4 physics coordinators | Coordinate Geant4 std EM, lowE EM, hadronic WGs | Ensure that the system meets the needs of Geant4 Physics Testing
Geant4 TSB | Is responsible for Geant4 technical matters | Provide guidelines, monitor progress
INFN Computing Committee | National committee to which some of the developers report; has appointed 4 referees | Recommend funding; review the project, monitor progress

Others? Who? LCG? Requirements? Expertise?

Page 13: Who are the users?

Groups | Responsibilities
Geant4 physics Working Groups | Provide and document requirements, provide feedback on prototypes, perform β-testing on preliminary releases of the product, provide use cases for acceptance testing
Geant4 STT | Provide and document requirements, perform formal acceptance testing for adoption in system testing

Other potential users:
– users of the Geant4 Toolkit, wishing to compare the results of their applications to reference data or to their own experimental results
– other projects with requirements for statistical comparisons of distributions (e.g. the LHC Computing Grid project)

Page 14: Some use cases

Regression testing
– Throughout the software life-cycle

Online DAQ
– Monitoring detector behaviour w.r.t. a reference

Simulation validation
– Comparison with experimental data

Reconstruction
– Comparison of reconstructed vs. expected distributions

Physics analysis
– Comparisons of experimental distributions (ATLAS vs. CMS Higgs?)
– Comparison with theoretical distributions (data vs. Standard Model)

Page 15: What do the users want?

User requirements from Geant4 (physics, system testing) elicited, analysed, specified and reviewed with the users
– User Requirements Document
– http://www.ge.infn.it/geant4/analysis/TandA/URD_TandA.html
– Use case model in progress

Specific user requirements related to the core statistical component
– Detail in progress (URD in preparation)
– Input from LCG?

Requirement traceability
– Analysis/design, implementation, test, documentation, results

Page 16: Are there any constraints?

Geant4 constraint requirements

Based on AIDA

No concrete dependencies on specific AIDA implementations should appear in the code of the system tests

Available on Geant4 supported platforms

The system should not require additional licenses beyond those required for Geant4 development

Other non-functional requirements?

Page 17: The core statistical component

Page 18: HBOOK, PAW & Co.

Based on considerations such as those given above, as well as considerable computational experience, it is generally believed that tests like the Kolmogorov or Smirnov-Cramer-Von-Mises (which is similar but more complicated to calculate) are probably the most powerful for the kinds of phenomena generally of interest to high-energy physicists. […]

The value of PROB returned by HDIFF is calculated such that it will be uniformly distributed between zero and one for compatible histograms, provided the data are not binned. […]

The value of PROB should not be expected to have exactly the correct distribution for binned data.

HBOOK manual, 1994

CDF Collaboration, Inclusive jet cross section in p pbar collisions at sqrt(s) = 1.8 TeV, Phys. Rev. Lett. 77 (1996) 438

but…

Page 19: Goodness-of-fit tests

Pearson’s χ² test

Kolmogorov test

Kolmogorov – Smirnov test

Lilliefors test

Cramer-von Mises test

Anderson-Darling test

Kuiper test

It is a difficult domain…

Implementing algorithms is easy

But comparing real-life distributions is not easy

Incremental and iterative software process

Collaboration with statistics experts

Patience, humility, time…

System open to extension and evolution

Suggestions welcome!

Page 20: Pearson’s χ²

Applies to discrete distributions

It can also be useful for continuous distributions, but the data must be grouped into classes

Cannot be applied if the theoretical (expected) frequency in any class is < 5

When a class falls below this threshold, one can merge contiguous classes until the minimum theoretical frequency is reached
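The class-merging rule above can be sketched as follows. This is a minimal illustration, not the project's implementation: the function names, the greedy left-to-right merging order and the example counts are all assumptions, and the degrees of freedom are taken as (number of classes − 1), which holds only when the expected counts are normalised to the observed total and no parameters were fitted.

```python
def merge_small_classes(observed, expected, minimum=5.0):
    """Greedily merge contiguous classes until each expected count >= minimum.

    Hypothetical helper: the left-to-right merging order is an assumption.
    """
    merged_obs, merged_exp = [], []
    acc_obs, acc_exp = 0.0, 0.0
    for o, e in zip(observed, expected):
        acc_obs += o
        acc_exp += e
        if acc_exp >= minimum:
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
            acc_obs, acc_exp = 0.0, 0.0
    # Fold any leftover tail into the last merged class.
    if acc_exp > 0.0:
        if merged_exp:
            merged_obs[-1] += acc_obs
            merged_exp[-1] += acc_exp
        else:
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
    return merged_obs, merged_exp


def pearson_chi2(observed, expected):
    """Return (chi2, degrees of freedom) for already-merged classes."""
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, len(observed) - 1


# Illustrative counts only (not data from the talk).
observed = [1, 3, 9, 14, 10, 2, 1]
expected = [2.0, 4.0, 8.0, 12.0, 8.0, 4.0, 2.0]
obs_m, exp_m = merge_small_classes(observed, expected)
chi2, ndf = pearson_chi2(obs_m, exp_m)
```

Merging trades resolution for validity: the first two and the last two classes are combined here, so any shape difference inside those merged tails is no longer visible to the test.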

Page 21: Kolmogorov test

The simplest of the non-parametric tests

Verifies how well a sample drawn from a continuous random variable fits a theoretical distribution

Based on the maximum distance between the empirical distribution function $F_O(x)$ and the theoretical cumulative distribution function $F_T(x)$

Test statistic:

$D = \sup_x |F_O(x) - F_T(x)|$
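A minimal sketch of how the statistic D can be evaluated for an unbinned sample. Because the empirical distribution function is a step function, the supremum is reached at a sample point, just before or just after the step. The function names and the uniform reference distribution are assumptions chosen for illustration.

```python
def kolmogorov_distance(sample, cdf):
    """Maximum distance between the empirical CDF of `sample` and `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x: check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def uniform_cdf(x):
    """Theoretical CDF of the uniform distribution on [0, 1] (example only)."""
    return min(1.0, max(0.0, x))


sample = [0.1, 0.4, 0.7]
d = kolmogorov_distance(sample, uniform_cdf)
```

For this three-point sample the largest gap occurs just after the last point, where the empirical function reaches 1 while the uniform CDF is 0.7.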

Page 22: Kolmogorov-Smirnov test

Two-sample problem
– mathematically similar to Kolmogorov’s

Instead of comparing an empirical distribution with a theoretical one, find the maximum difference between the empirical distribution functions of the two samples, $F_n$ and $G_m$:

$D_{mn} = \sup_x |F_n(x) - G_m(x)|$

Can be applied only to continuous random variables

Conover (1971) and Gibbons and Chakraborti (1992) tried to extend it to cases of discrete random variables
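A sketch of the two-sample distance for unbinned data, together with the usual asymptotic Kolmogorov series for the p-value. The function names are hypothetical. The p-value uses the large-sample approximation with effective size nm/(n+m), so for samples as small as the ones below it is only indicative.

```python
import math


def ks_two_sample_distance(a, b):
    """D_mn = sup_x |F_n(x) - G_m(x)| for two unbinned samples."""
    xs, ys = sorted(a), sorted(b)
    n, m = len(xs), len(ys)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(xs[i], ys[j])
        # Advance past every value equal to x in both samples (handles ties).
        while i < n and xs[i] == x:
            i += 1
        while j < m and ys[j] == x:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


def kolmogorov_pvalue(d, n, m):
    """Asymptotic p-value: Q(lam) = 2 * sum_k (-1)^(k-1) * exp(-2 k^2 lam^2)."""
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.2:
        return 1.0  # the series converges slowly here; the probability is ~1
    total = 0.0
    for k in range(1, 101):
        total += (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
    return min(1.0, max(0.0, 2.0 * total))


sample_a = [1.0, 2.0, 3.0, 4.0]
sample_b = [3.0, 4.0, 5.0, 6.0]
d_mn = ks_two_sample_distance(sample_a, sample_b)
p_value = kolmogorov_pvalue(d_mn, len(sample_a), len(sample_b))
```

Ties between the two samples are stepped over together, which is the simplest convention. It does not address the discrete-variable problem mentioned above, where the distribution of the statistic itself changes.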

Page 23: Lilliefors test

Similar to the Kolmogorov test

Based on the null hypothesis that the continuous random variable is normally distributed, $N(\mu, \sigma^2)$, with $\mu$ and $\sigma^2$ unknown

Performed by comparing the empirical distribution function $F_O$ of the standardized sample $z_1, z_2, \ldots, z_n$ with the standard normal distribution function $\Phi(z)$:

$D^* = \sup_z |F_O(z) - \Phi(z)|$
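A sketch of the statistic D* only. The sample is standardized with its own mean and standard deviation (the unbiased n − 1 estimator is an assumption here) and then compared with the standard normal distribution function. Because the parameters are estimated from the data, Kolmogorov's critical values do not apply: a p-value would need dedicated Lilliefors tables or a Monte Carlo calibration, which this example does not attempt.

```python
import math


def normal_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def lilliefors_statistic(sample):
    """Kolmogorov-type distance between the standardized sample and N(0, 1)."""
    n = len(sample)
    mean = sum(sample) / n
    # Sample variance with the n - 1 denominator (an assumption of this sketch).
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)
    std = math.sqrt(var)
    zs = sorted((x - mean) / std for x in sample)
    d = 0.0
    for i, z in enumerate(zs):
        phi = normal_cdf(z)
        d = max(d, (i + 1) / n - phi, phi - i / n)
    return d


data = [1.0, 2.0, 3.0, 4.0, 5.0]
d_star = lilliefors_statistic(data)
# Standardization makes the statistic invariant under shift and rescaling.
d_star_rescaled = lilliefors_statistic([10.0 * x + 3.0 for x in data])
```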

Page 24: Cramer-von Mises test

Based on the test statistic:

$\omega^2 = \int [F_O(x) - F_T(x)]^2 \, dF_T(x)$

Can be performed both on continuous and discrete variables

Satisfactory for symmetric and right-skewed distributions
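For an unbinned sample of size n the integral reduces to a finite sum over the ordered data. The sketch below uses the standard computational form, which returns n times the integral (often written W²). The function names and the uniform reference distribution are assumptions for illustration.

```python
def cramer_von_mises(sample, cdf):
    """W^2 = 1/(12n) + sum_i [F_T(x_(i)) - (2i - 1)/(2n)]^2  (n times the integral)."""
    xs = sorted(sample)
    n = len(xs)
    total = 1.0 / (12.0 * n)
    for i, x in enumerate(xs, start=1):
        total += (cdf(x) - (2 * i - 1) / (2.0 * n)) ** 2
    return total


def uniform_cdf(x):
    """Theoretical CDF of the uniform distribution on [0, 1] (example only)."""
    return min(1.0, max(0.0, x))


# Points placed exactly at the mid-quantiles give the smallest possible value.
w2_best = cramer_von_mises([0.25, 0.75], uniform_cdf)
w2_shifted = cramer_von_mises([0.5, 0.9], uniform_cdf)
```

Unlike the Kolmogorov distance, every sample point contributes to the sum, so the statistic responds to a discrepancy spread over the whole range rather than to the single largest gap.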

Page 25: Anderson-Darling test

Based on the test statistic:

$A^2 = \int \frac{[F_O(x) - F_T(x)]^2}{F_T(x)\,[1 - F_T(x)]} \, dF_T(x)$

Can be performed both on continuous and discrete variables

Seems to be suitable for any data set (Aksenov and Savageau, 2002) with any skewness (symmetric distributions, left or right skewed)

Seems to be sensitive to fat tails of distributions
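The weight 1 / [F_T (1 − F_T)] grows near both ends of the distribution, which is where the tail sensitivity comes from. A sketch using the standard computational form for an unbinned sample (it returns n times the integral) is given below. The function names are assumptions, and the sample must lie strictly inside the support of F_T so that the logarithms are defined.

```python
import math


def anderson_darling(sample, cdf):
    """A^2 = -n - (1/n) * sum_i (2i - 1) [ln u_(i) + ln(1 - u_(n+1-i))]."""
    # cdf is monotonic, so sorting the transformed values orders the sample.
    us = sorted(cdf(x) for x in sample)
    n = len(us)
    s = 0.0
    for i in range(1, n + 1):
        s += (2 * i - 1) * (math.log(us[i - 1]) + math.log(1.0 - us[n - i]))
    return -n - s / n


def uniform_cdf(x):
    """Theoretical CDF of the uniform distribution on [0, 1] (example only)."""
    return min(1.0, max(0.0, x))


a2 = anderson_darling([0.25, 0.75], uniform_cdf)
# Moving one point far into the tail increases the statistic noticeably.
a2_tail = anderson_darling([0.25, 0.999], uniform_cdf)
```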

Page 26: Kuiper test

Based on a quantity that remains invariant under any shift or re-parameterization

Does not work well on tails

$D^* = \max_x [F_O(x) - F_T(x)] + \max_x [F_T(x) - F_O(x)]$
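A sketch of the Kuiper statistic for an unbinned sample: the largest positive and the largest negative deviations are tracked separately and then added. The function names and the uniform reference distribution are assumptions for illustration.

```python
def kuiper_statistic(sample, cdf):
    """Sum of the largest deviations of the empirical CDF above and below `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    d_plus = d_minus = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        d_plus = max(d_plus, (i + 1) / n - f)   # empirical CDF above theory
        d_minus = max(d_minus, f - i / n)       # empirical CDF below theory
    return d_plus + d_minus


def uniform_cdf(x):
    """Theoretical CDF of the uniform distribution on [0, 1] (example only)."""
    return min(1.0, max(0.0, x))


v = kuiper_statistic([0.1, 0.4, 0.7], uniform_cdf)
```

For the same three points the Kolmogorov distance keeps only the larger of the two deviations (0.3), while the Kuiper statistic adds both (0.3 + 0.1).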

Page 27: Work in progress

Page 28: OOAD

Preliminary design of the statistical component in progress

Core statistics comparison package

User layer

Policy-based class design

http://www.ge.infn.it/geant4/rose/statistics/

Validation of the design through use cases

Some open issues identified, to be addressed in next design iteration
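One way to picture a policy-based split between a core comparison package and a user layer is sketched below. It is purely illustrative: the slides do not show the design, policy-based design is usually expressed with C++ templates, and every class and function name here is hypothetical. The idea shown is only that the comparison class stays fixed while the distance measure is supplied as an interchangeable policy.

```python
from typing import Callable, Sequence

Cdf = Sequence[float]
DistancePolicy = Callable[[Cdf, Cdf], float]


def max_distance(f: Cdf, g: Cdf) -> float:
    """Kolmogorov-like policy: largest absolute difference."""
    return max(abs(a - b) for a, b in zip(f, g))


def squared_distance(f: Cdf, g: Cdf) -> float:
    """Cramer-von Mises-like policy: sum of squared differences."""
    return sum((a - b) ** 2 for a, b in zip(f, g))


class ComparisonAlgorithm:
    """Hypothetical core class: the distance measure is injected as a policy."""

    def __init__(self, distance: DistancePolicy) -> None:
        self._distance = distance

    def compare(self, cdf_a: Cdf, cdf_b: Cdf) -> float:
        if len(cdf_a) != len(cdf_b):
            raise ValueError("distributions must be sampled on the same points")
        return self._distance(cdf_a, cdf_b)


# Two cumulative distributions evaluated on the same three points (made up).
cdf_a = [0.2, 0.6, 1.0]
cdf_b = [0.1, 0.7, 1.0]
ks_like = ComparisonAlgorithm(max_distance).compare(cdf_a, cdf_b)
cvm_like = ComparisonAlgorithm(squared_distance).compare(cdf_a, cdf_b)
```

Adding a new algorithm then means writing one more policy function, without touching the comparison class or the code that calls it.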

Page 29: Work in progress

[Figure not preserved in the transcript]

+ more algorithms

Page 30: Work in progress

[Figure not preserved in the transcript]

Page 31: Work in progress

Use case: compare two continuous distributions

Page 32: Work in progress

Implementation and test of preliminary design

What can be re-used?
– Algorithms in GSL, NAG libraries (to be evaluated)

Studies in progress
– Transformation between continuous and discrete distributions
– Strategies to use Kolmogorov-Smirnov with discrete distributions (E. Dagum + original ideas)
– How to deal with experimental errors (not only statistical!)
– Multi-dimensional distributions
– Bayesian approach

In the to-do list
– Conversion from AIDA objects to distributions
– “Pythonisation”

Revision of the initial documents (Vision, URD, Risks)
– Based on the recent evolutions in the project
– Input from today’s meeting?

Page 33: Work in progress: Geant4-specific

Development of general physics tests in the E.M. domain, for comparison of reference distributions
– Compilation of existing tests
– Evaluation, documentation of tests
– Elicitation of requirements for tests among the Geant4 physics groups
– Collection of reference data/distributions

Prototype for automated comparison w.r.t. reference databases
– NIST, Sandia etc., directly downloaded from the web
– Prototype as a risk mitigation strategy

Integration in the Geant4 system testing framework

Integration in Geant4 physics testing frameworks

Page 34: Where?

Geant4-specific stuff
– In Geant4
– May be included in public distribution, if of interest to users

Core statistical component
– Developed in an independent CVS repository
– Code, documentation, software process deliverables

Web site
– http://www.ge.infn.it/geant4/analysis/TandA/index.html

Contact persons
– [email protected], [email protected]

Page 35: Time scale

Aggressive time scale driven by Geant4 needs
– incremental and iterative software process

OOAD + implementation already started

Prototype at CHEP

Advanced functional system summer 2003

Open to the needs/suggestions of LCG
– compatible with the available resources and Geant4 needs

Page 36: Conclusions…

Geant4 requires a statistical testing system for physics validation and regression testing
– to provide a high quality product to its user communities

Core statistical component (of potential general interest)

Geant4-specific components

Project compatible with LCG architecture blueprint
– component-based approach, AIDA, Python…

Rigorous software process
– to contribute to the quality of the product

Aggressive time scale dictated by Geant4 needs

Open to scientific collaboration

…Beginning