HEP data analysis using ROOT
week 3 ▪ ROOT Maths and Physics Libraries
▪ ROOT Geometries
Mark Hodgkinson
1
Week 3
• ROOT maths and physics libraries
– vectors and their operations
– data modelling with RooFit
• ROOT geometries
– internal TGeo classes
– interfaces to other platforms
2
ROOT vectors
• TLorentzVector – probably the most useful for HEP data analysis
– initialised to (0,0,0,0)
• Other 2D and 3D vectors available
root [1] TLorentzVector v
root [2] v.Print()
(x,y,z,t)=(0.000000,0.000000,0.000000,0.000000) (P,eta,phi,E)=(0.000000,0.000000,0.000000,0.000000)
root [3] TVector
TVectorT<double>
TVectorT<float>
TVector2
TVector3
3
TLorentzVector
• Assumes beam along z-axis
– transverse variables derived accordingly
• e.g.
– DeltaPhi: angle between two 4-momenta in the transverse plane
– Et: transverse energy
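As a rough illustration of what these two transverse quantities mean, here is a minimal standalone C++ sketch that does not use ROOT. The function names deltaPhi and transverseEnergy are made up for this example; in an analysis you would call the TLorentzVector methods instead of writing your own.

```cpp
#include <cassert>
#include <cmath>

// Signed azimuthal separation phi1 - phi2, wrapped into [-pi, pi].
double deltaPhi(double phi1, double phi2) {
    const double twoPi = 2.0 * std::acos(-1.0);
    return std::remainder(phi1 - phi2, twoPi);
}

// Transverse energy Et = E * sin(theta) = E * pt / |p|, with the beam along z.
double transverseEnergy(double px, double py, double pz, double e) {
    const double pt = std::hypot(px, py);
    const double p = std::sqrt(px * px + py * py + pz * pz);
    assert(p > 0.0 && "Et is undefined here for zero momentum");
    return e * pt / p;
}
```

The wrap matters because two directions at phi = 3.0 and phi = -3.0 are close together in the transverse plane, even though the naive difference is 6.0.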
4
Extended ROOT vectors
• Extended vector libs now available
– many useful operations
• e.g. vector projections
root [1] gSystem->Load("libGenVector.so");
5
GenVector example
• Find mathcoreVectorCollection.C in $ROOTSYS
– copy to your working directory and run (compiled)
$ root mathcoreVectorCollection.C+
Time for new Vector 0.113157 0.12
******************************************************************************
*Tree    :t1        : Tree with new LorentzVector                            *
*Entries :    10000 : Total =         1854232 bytes  File  Size =    1667895 *
*        :          : Tree compression factor =   1.11                       *
******************************************************************************
*Br    0 :tracks    : Int_t tracks_                                          *
*Entries :    10000 : Total  Size=      84854 bytes  File Size  =      24060 *
*Baskets :        4 : Basket Size=      32000 bytes  Compression=   3.34     *
*............................................................................*
*Br    1 :tracks.fCoordinates.fX : Double_t fX[tracks_]                      *
*Entries :    10000 : Total  Size=     443266 bytes  File Size  =     412915 *
*Baskets :       16 : Basket Size=      32000 bytes  Compression=   1.07     *
*............................................................................*
*Br    2 :tracks.fCoordinates.fY : Double_t fY[tracks_]                      *
*Entries :    10000 : Total  Size=     443266 bytes  File Size  =     412942 *
*Baskets :       16 : Basket Size=      32000 bytes  Compression=   1.07     *
*............................................................................*
6
GenVector example
[Figure: six output histograms, h1 to h6. Statistics boxes: h1 Entries 10000, Mean 210.3, RMS 130.5; h2 Entries 10000, Mean 5.011, RMS 2.228; h3 Entries 60105, Mean 34.47, RMS 36.82; h4 Entries 50105, Mean 12.58, RMS 6.559; h5 Entries 50105, Mean -0.008359, RMS 1.731; h6 Entries 50105, Mean -0.002458, RMS 0.8178]
7
GenVector example
• Look at the macro
– writes an STL vector of LorentzVectors to a TTree
• Don’t use magic numbers to define fundamental constants!
root [1] TDatabasePDG db
root [2] db.GetParticle(211).Mass()
(const Double_t)1.39570000000000000e-01
8
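When a database lookup like the one above is not available, for example in standalone code without ROOT, the next best option is a single named constant with its unit in the name. The sketch below assumes that situation. The names kChargedPionMassGeV and energyFromMomentum are invented for illustration, and the mass is the value printed by TDatabasePDG above.

```cpp
#include <cassert>
#include <cmath>

// Charged pion mass in GeV: the value printed by TDatabasePDG above.
// Defined once and named, instead of typing 0.13957 wherever it is needed.
constexpr double kChargedPionMassGeV = 0.13957;

// Energy of a particle from its momentum magnitude and mass (natural units).
double energyFromMomentum(double p, double mass) {
    assert(p >= 0.0 && mass >= 0.0);
    return std::sqrt(p * p + mass * mass);
}
```

A call such as energyFromMomentum(p, kChargedPionMassGeV) then says what the number is and where it came from.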
Task
• Here you will practise using a TTree to fill some histograms - in this case we need to make a plot of the visible tau mass.
• ROOT files with trees from MC simulation of Z->tautau are available.
• Run the example and look at the printed visible tau pt - it is always zero!
• In TauTruthClassifiers_MC12.cxx you need to fill in the setVisibleVectors function - use the getNProng function as an example of what to do.
• Then create and fill a histogram of the true visible mass (i.e. don’t include invisible particles - remember each tau always decays into a W boson and a neutrino, followed by the W decay to leptons or hadrons), and write it out to a file.
• Need Monte Carlo particle numbering scheme here.
• Will use your knowledge of TLorentzVector, C++ vectors, ROOT histograms and file I/O. Also notice it uses a std::map, but this is set up for you.
• Map is the C++ way of storing data with a key - it maps a key to a variable. In this case the key is an integer and the data retrieved is a TLorentzVector. A web search will turn up many pages explaining maps and other C++ data structures (pair, set etc) that you might find useful.
• Copy the folder from /home/hodgkinson/ROOTTutorial_Week3 to your own area.
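To make the idea of a map of visible four-vectors concrete, here is a minimal sketch in plain C++. It is not the code from TauTruthClassifiers_MC12.cxx. FourVector, TruthParticle and visibleVectors are hypothetical stand-ins (the tutorial uses TLorentzVector), and the integer key is assumed to be an index identifying each tau. The neutrino codes 12, 14 and 16 come from the Monte Carlo particle numbering scheme.

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>
#include <map>
#include <vector>

// Hypothetical stand-in for TLorentzVector, just enough for this sketch.
struct FourVector {
    double px, py, pz, e;
    FourVector& operator+=(const FourVector& o) {
        px += o.px; py += o.py; pz += o.pz; e += o.e;
        return *this;
    }
    // Invariant mass; returns 0 if rounding makes m^2 slightly negative.
    double mass() const {
        const double m2 = e * e - px * px - py * py - pz * pz;
        return m2 > 0.0 ? std::sqrt(m2) : 0.0;
    }
};

// Hypothetical truth record: PDG code plus four-momentum.
struct TruthParticle { int pdgId; FourVector p4; };

// Neutrinos have PDG codes 12, 14, 16 (negative for antineutrinos).
bool isNeutrino(int pdgId) {
    const int id = std::abs(pdgId);
    return id == 12 || id == 14 || id == 16;
}

// For each tau (map key), sum the four-vectors of its visible daughters.
std::map<int, FourVector> visibleVectors(
    const std::map<int, std::vector<TruthParticle>>& daughters) {
    std::map<int, FourVector> visible;
    for (const auto& entry : daughters) {
        FourVector sum{};  // all components start at zero
        for (const auto& d : entry.second) {
            if (!isNeutrino(d.pdgId)) sum += d.p4;
        }
        visible[entry.first] = sum;
    }
    return visible;
}
```

The visible mass of a given tau is then visible.at(key).mass(), which is the quantity to fill into the histogram.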
9
Task
• You can build an “executable” by typing “make”.
– Next week we will discuss more details about compilation.
• Then run the ROOTTutorial executable and specify the range of events to process - in the above we process all events between 0 and 1000.
• Input files have 45500 events in total.
• What is actually being run starts from the code inside main.cxx, which in turn calls functions in the other source code files.
• We have instructions to submit jobs to our Linux cluster here - so you might also think about running jobs in parallel to speed things up, and then hadding the output files.
• If you get errors when you add your code in, typical things to do are:
– use google - often lots of hits from the stackoverflow website about the exact errors you get
– ask someone more experienced for help (people in your office etc)
10
RooFit
• Introduction from main author W. Verkerke (Nikhef) here.
• As with all software tools, you need to judge if you need this or if native ROOT fitting is enough - it is if all you want to do is fit a simple Gaussian for example.
e.g. for one of my projects, we are just using TH1F->Fit() functionality
11
• Taken from link on previous slide
• Short answer - a sufficiently complex fit might be better done in RooFit.
• RooFit is an extension of ROOT, and nowadays comes by default when you install ROOT (the default ROOT version on HEP0, now 5.34-36, has it - other versions may not, but you can ask M. Robinson to install it if needed).
12
Simulating data with RooFit
• Very often hear people talking about ‘Toy MC’
– use parameterisations of input PDFs for speed
• often useful to reweight PDFs to study propagation of errors
• RooFit provides such functionality
– worked example
• from https://root.cern.ch/roofit-20-minutes
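The toy MC idea can be sketched without RooFit: draw events directly from a parameterised PDF instead of running a full simulation. The example below assumes a Gaussian parameterisation and uses only the C++ standard library. generateToy and sampleMean are names invented here, not RooFit functions.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Draw nEvents values from a Gaussian parameterisation.
// A fixed seed makes the toy sample reproducible.
std::vector<double> generateToy(std::size_t nEvents, double mean,
                                double width, unsigned seed) {
    assert(width > 0.0);
    std::mt19937 rng(seed);
    std::normal_distribution<double> pdf(mean, width);
    std::vector<double> sample;
    sample.reserve(nEvents);
    for (std::size_t i = 0; i < nEvents; ++i) sample.push_back(pdf(rng));
    return sample;
}

// Arithmetic mean of a toy sample.
double sampleMean(const std::vector<double>& sample) {
    assert(!sample.empty());
    double sum = 0.0;
    for (double x : sample) sum += x;
    return sum / static_cast<double>(sample.size());
}
```

Repeating the generation with different seeds and shifted parameters is one simple way to see how an uncertainty on the input PDF propagates to a measured quantity.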
13
RooFit example
• PDF functions in RooFit
– Landau: parameterisation of dE/dx
– Argus: empirical function describing n-body decay
– Breit-Wigner: (∗Gaussian) describes a resonance
– Crystal-Ball: radiative energy loss
– Decay: can be symmetric and convoluted
• This example uses the Argus function
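For reference, the commonly quoted Argus shape is m (1 - (m/m0)^2)^p exp(c (1 - (m/m0)^2)) below the endpoint m0, and zero above it. The sketch below evaluates that unnormalised shape in plain C++. argusShape is a name chosen for this example, and the default p = 0.5 is assumed to match the usual convention; check the RooArgusBG documentation before relying on it.

```cpp
#include <cassert>
#include <cmath>

// Unnormalised Argus shape:
//   m * (1 - (m/m0)^2)^p * exp(c * (1 - (m/m0)^2))   for 0 < m < m0,
// and 0 outside that range. m0 is the kinematic endpoint and c the
// shape parameter (negative values suppress the region well below m0).
double argusShape(double m, double m0, double c, double p = 0.5) {
    assert(m0 > 0.0);
    if (m <= 0.0 || m >= m0) return 0.0;
    const double t = 1.0 - (m / m0) * (m / m0);
    return m * std::pow(t, p) * std::exp(c * t);
}
```

The sharp cut-off at m0 is what makes this shape suitable for background close to a kinematic limit.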
14
RooFit Example
• RooRealVar has a name, a title and a range. Optionally can specify a unit as well.
• RooRealVar by default initialises the value at mid-range.
• But can also specify both initial value and range.
• Finally can create a Gaussian PDF of the mass with a mean and width.
15
RooFit Example
• We can put the Gaussian PDF in a RooPlot, and then draw the RooPlot.
Task: Set up a Landau PDF and draw that.
16
RooFit example
• Argus background component
• Signal + Background PDF
// --- Build Argus background PDF ---
RooRealVar argpar("argpar","argus shape parameter",-20.0,-100.,-1.) ;
RooConstVar argconst("argconst","argus constant",5.291);
RooArgusBG argus("argus","Argus PDF",mes,argconst,argpar) ;
// --- Construct signal+background PDF ---
RooRealVar nsig("nsig","#signal events",200,0.,10000) ;
RooRealVar nbkg("nbkg","#background events",800,0.,10000) ;
RooAddPdf sum("sum","g+a",RooArgList(gauss,argus),RooArgList(nsig,nbkg)) ;
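A sum of PDFs weighted by event yields amounts to mixing the normalised component PDFs according to the fraction of events each one contributes. The plain C++ sketch below shows that arithmetic under this assumption. signalFraction and sumPdf are illustrative names and are not part of RooFit.

```cpp
#include <cassert>
#include <cmath>

// Fraction of all events expected to be signal, given the two yields.
double signalFraction(double nsig, double nbkg) {
    assert(nsig >= 0.0 && nbkg >= 0.0 && nsig + nbkg > 0.0);
    return nsig / (nsig + nbkg);
}

// Value of the combined PDF at one point, given the values of the
// normalised signal (Gaussian) and background (Argus) PDFs there.
double sumPdf(double gaussValue, double argusValue,
              double nsig, double nbkg) {
    const double f = signalFraction(nsig, nbkg);
    return f * gaussValue + (1.0 - f) * argusValue;
}
```

With the starting yields above (200 signal and 800 background events), the signal fraction is 0.2.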
17
RooFit example
• Generate unbinned toy data
– and perform Max Likelihood fit
// --- Generate a toyMC sample from composite PDF ---
RooDataSet *data = sum.generate(mes,2000);
// --- Perform extended ML fit of composite PDF to toy data ---
sum.fitTo(*data);
// --- Plot toy data and composite PDF overlaid ---
RooPlot* mesframe = mes.frame();
data->plotOn(mesframe);
sum.plotOn(mesframe);
sum.plotOn(mesframe,RooFit::Components(argus),RooFit::LineStyle(kDashed));
mesframe->Draw();
18
Task
• Let’s re-use tree2.root, which we generated in previous lecture tasks.
• We can import the destep variable into a RooDataSet and then use a RooPlot to draw it.
• Task: Set up an appropriate PDF and use it to fit this data. Compare this to using native ROOT fit functionality with an appropriate shape.
20
Fitting
• Have only scratched the surface - enough to get you going, but much more you can learn.
• Other tools exist, e.g. HistFitter, developed by ATLAS, extends RooFit to make fits of multiple data regions more convenient (e.g. a typical ATLAS search for supersymmetry has histograms from 10 signal regions, each of which may have say 6 control and validation regions to constrain the PDFs).
• Task: Remake the tau visible mass histogram, only for 3-prong taus (Hint: you need only change 1-2 lines of code for this). Then read in the tau visible mass histogram, choose a shape and do a fit to the mass (use either RooFit or standard fitting, as you prefer).
• Hint: You will need to run on all 45k events eventually. Most of these 3-prong taus decay via the a1 resonance - a Breit-Wigner shape might be a good bet according to the PDG book.
• In the ROOT tutorials (in the tutorials folder underneath $ROOTSYS) there are many RooFit examples.
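As a reminder of what a Breit-Wigner shape looks like, here is a minimal sketch of the non-relativistic form, normalised over the whole real line. breitWigner is a name chosen here. The mass and width of the resonance are left as inputs, since they are what the fit should determine.

```cpp
#include <cassert>
#include <cmath>

// Non-relativistic Breit-Wigner (Lorentzian) centred at m0 with full
// width at half maximum gamma, normalised to unit area over all m:
//   f(m) = (1/pi) * (gamma/2) / ((m - m0)^2 + (gamma/2)^2)
double breitWigner(double m, double m0, double gamma) {
    assert(gamma > 0.0);
    const double pi = std::acos(-1.0);
    const double halfWidth = 0.5 * gamma;
    return (halfWidth / pi) / ((m - m0) * (m - m0) + halfWidth * halfWidth);
}
```

The function falls to half of its peak value at m0 ± gamma/2, which is a quick sanity check on a fitted width.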
21
ROOT Geometries
• Many HEP experiments use ROOT compatible geometry definitions
• Some cross-platform interfaces – VMC, GEANT, XML, GDML
22
ROOT TGeo
• examples in tutorials/geom
– e.g. rootgeom.C
• run it
– invoke the OpenGL viewer
• try some of the others
26
Geant4 Virtual MC
• Extension to ROOT
– built against Geant4 (or Geant3)
• Permits use of different transport codes without changing user code and geometry
– a little beyond the scope of this course
28
Tasks Recap
• Run the mathcoreVectorCollection.C example and read it.
• Plot the visible tau mass using ROOTTutorial_Week3
• Plot a Landau shape using RooFit
• Import data from tree2.root into RooFit and perform a fit of the destep variable.
• Fit the tau mass from 3-prong decays using ROOTTutorial_Week3
• Have a look at the ROOT geometry tutorials.
• If you don’t have time for all of this, the geometry tutorials are probably the least important, as this is a rather specialised use case that most people won’t ever need.
29
Closing remarks
• Many physics objects and mathematical operations supported
– only skimmed the surface today
• Don’t rewrite/reinvent unnecessarily
– shouldn’t be coding e.g. coordinate transformations and vector products
• Magic numbers are bad
– saw some today
30
Closing remarks
• Introduced macro compilation
– lots more on this next week
• Next time:
– compiling binaries and libs
– ROOT as a dependency
– bindings to other languages
• Any questions?
31