



Automating the Calibration and Alignment revision process in Hera-B

Master Thesis

Galina Deyanova

University of Sofia “St. Kl. Ohridski”
Faculty of Physics
Department of Atomic Physics

Supervisors:
Ass. Prof. Dr. Leandar Litov
Eng. Vasco Amaral


Contents

1 Introduction

2 The Hera-B experiment
  2.1 The HERA-B Physics Program
  2.2 The HERA-B Detector
    2.2.1 The beam
    2.2.2 Internal wire target
    2.2.3 Vertex Detector System (VDS)
    2.2.4 Tracker system
    2.2.5 Ring Imaging Cherenkov Detector (RICH)
    2.2.6 Electromagnetic Calorimeter (ECAL)
    2.2.7 Muon chamber
    2.2.8 High Pt system
  2.3 HERA-B Trigger System
    2.3.1 Pretrigger
    2.3.2 First level trigger (FLT)
    2.3.3 Second/Third Level Trigger (SLT/TLT)
    2.3.4 Fourth Level Trigger (4LT)
  2.4 Data Acquisition and Online Computing Systems
    2.4.1 Data Acquisition System (DAQ)
    2.4.2 The DAQ design requirements
    2.4.3 Online Data Flow
    2.4.4 Event data transmission protocol
    2.4.5 The event data logging
    2.4.6 The Fourth Level Farm Software
    2.4.7 The data processing
    2.4.8 The usage of Calibration and Alignment (CnA)
    2.4.9 Event Data Re-processing
    2.4.10 The role of the database

3 Databases: preliminary concepts
  3.1 Relational Database Management System (RDBMS)
  3.2 Data independence
  3.3 Query in a DBMS
  3.4 Structure of a DBMS
  3.5 Data catalog

4 Hera-B databases
  4.1 Hera-B databases purposes
  4.2 The database architecture
    4.2.1 Data Storage Layer (DSL)
    4.2.2 Data Definition Layer (DDL)
    4.2.3 Object Manager Layer (OML)
  4.3 Detector configuration databases
  4.4 Offline access to the data
  4.5 CnA constants storage and retrieval requirements
    4.5.1 Run book-keeping
    4.5.2 CnA keys
    4.5.3 Keyrelease mechanism
    4.5.4 Dbedit

5 Used technology
  5.1 The MySQL Database and RDBMS
  5.2 Web server
  5.3 Program language
    5.3.1 Scripting languages
    5.3.2 Perl
    5.3.3 Common Gateway Interface
    5.3.4 Database Driver Interface

6 Design and implementation
  6.1 The requirements
  6.2 Implementation details
    6.2.1 The function of Web applications

7 Conclusions

8 Glossary

9 Acknowledgement


Abstract

In Hera-B both the Data Acquisition System (DAQ) and the Analysis Frame make use of a data model in which the physics event is stored together with its corresponding detector calibration and alignment (CnA) information.

After the data-taking period some tuning of this information is usually required, which in turn calls for a re-reconstruction of the physics data from the stored raw information.

Due to the complexity of the software system, the task of improving the CnA used to be done manually by the system operator in synchronization with the sub-detector experts.

This cumbersome procedure was inefficient, required permanent manpower and did not provide the final user doing analysis with centralized feedback. This motivated us to propose and build a centralized system using Web-based technology integrated with a database.

In this thesis we explain the problem and the software, and place them in the context of the Hera-B users' requirements. We then present the architecture of the implemented system.



Chapter 1

Introduction

The HERA-B experiment is one of four experiments at HERA (Hadron-Elektron-Ring-Anlage), the largest accelerator of DESY (Deutsches Elektronen-Synchrotron), a large particle physics laboratory in Hamburg, Germany.

It was designed to measure CP violation in the B-meson system. The current Hera-B physics program includes a wide range of heavy-flavor physics: studies of charmonium production and nuclear suppression effects, glueball and exotic states, and the measurement of the bb cross section.

Because of the rarity of the events containing a B meson (about one in every 10^11 events), a sophisticated multi-level trigger system was developed and implemented.

All trigger levels depend on and need access to the geometry, calibration and alignment constants - the so-called CnA. The set of CnA constants used during data taking at any time is identified by a given index object in the database, the so-called CnA key.

After data processing and analysis, improved calibration and alignment constants are produced. These settings are provided by the experts of each sub-detector group.

They inform the database administrator, who produces an update of the CnA keys, the so-called keyrelease. The event data can then be reprocessed using the improved CnA version.

This procedure used to be rather complicated and time consuming when done manually, considering the available technology. A solution to this problem using a Web interface was developed to make the procedure of setting changes and reviewing the database easier for the subgroup experts and the database administrator. Our Web application solution makes use of technologies such as Perl CGI together with a MySQL database. It is permanently on-line on the “Database group” Web site.

The Web interface gives an easy and fast way to check the information that is stored in the databases. It allows the experts to indicate, independently of the database administrator, which changes have been made and must be set in the database.
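To give a flavour of this approach (a minimal sketch only; the database, table and column names below are illustrative and not the actual implementation, which is described in chapter 6), such a CGI script reads the pending CnA changes from MySQL and renders them as an HTML page:

    #!/usr/bin/perl
    # Minimal illustrative CGI sketch; 'pending_updates' and its columns are
    # hypothetical, not the real Hera-B keyrelease schema.
    use strict;
    use warnings;
    use CGI qw(:standard escapeHTML);
    use DBI;

    print header(-type => 'text/html');

    my $dbh = DBI->connect("DBI:mysql:database=herab;host=db.example.org",
                           "webuser", "secret", { RaiseError => 1 });

    # List the CnA table updates flagged by the sub-detector experts but not
    # yet released by the database administrator.
    my $sth = $dbh->prepare(q{
        SELECT table_name, major, minor, expert
        FROM   pending_updates
        WHERE  released = 0
        ORDER  BY table_name
    });
    $sth->execute;

    print "<html><body><h1>Pending CnA updates</h1><ul>\n";
    while (my ($table, $major, $minor, $expert) = $sth->fetchrow_array) {
        printf "<li>%s: major=%d, minor=%d (flagged by %s)</li>\n",
               escapeHTML($table), $major, $minor, escapeHTML($expert);
    }
    print "</ul></body></html>\n";
    $dbh->disconnect;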

In this thesis the automation of the keyrelease procedure is explained and placed in the context of the physics experiment and its requirements.

After this brief introduction in chapter 1, the Hera-B detector, the Data Acquisition System and the physics program are described in chapter 2. Chapter 3 gives an introduction to databases, Database Management Systems and the advantages of using them.



How the Hera-B database is organized and which data it operates on is presented in chapter 4. The technology used for preparing the Web application is discussed in chapter 5. The design and purpose of the CnA revision tools are explained in chapter 6.


Chapter 2

The Hera-B experiment

2.1 The HERA-B Physics Program

HERA-B is not in the position to contribute competitive measurements in the area of CP violation in the B-meson system. However, it can measure and improve knowledge in the wide field of heavy-flavour physics and QCD. It can also help to test and improve theoretical models such as the Color Singlet Model, the Color Evaporation Model and the Color Octet Model.

Figure 2.1: Invariant mass plots of both J/ψ channels for the 2002/2003 data sets. In the µ-sample, the ψ′ signal is clearly visible. The analysis has been performed on about 60 % of the total available statistics.

The following physics topics are studied [4]:


1. The measurement of the bb cross section is of interest for HERA-B since it would give an indication of the feasibility of possible future contributions to the physics of b quarks. Existing measurements are incompatible and have poor precision.

2. Measurement of J/ψ, ψ′ and χc production, A dependence, differential distributions of charmonium states as a function of pT and xF, and nuclear suppression effects: the color singlet and color octet models are still viable. HERA-B has a unique sensitivity to the kinematic region of charmonium production at negative xF. Therefore, the first systematic measurement of the nuclear suppression of charmonium production in the negative xF region, where models show a different behavior, can be produced for the different charmonium states.

3. Measurements of J/ψ polarization, double charm and double J/ψ production, and Υ production.

4. Measurement of direct photon production at high pT: the hard photon spectrum is sensitive to the gluon structure function. Previous measurements show significant deviations from NLO QCD calculations.

5. Searches for glueballs and exotic states. The color singlet components of the beam baryons are dominated by the Pomeron. According to the results of the H1 and ZEUS experiments at HERA, the Pomeron is mainly a di-gluon system which is a soft momentum component of the proton, with a momentum fraction of its host proton near zero. Because of the dominance of gluons in these systems, Pomeron-Pomeron collisions are believed to provide a rich source of the QCD-required bound-gluon systems called glueballs.

6. Minimum bias physics: inclusive distributions and correlation studies of identified particles, A-dependencies, hyperon production and polarization, open and hidden charm production, and rare hyperon decays.


2.2 The HERA-B Detector

The HERA-B detector is a single-magnet forward spectrometer with an open geometry and a length of ∼ 20 m. It covers polar angles ranging from 10 mrad to about 200 mrad, corresponding to about 90 % solid-angle coverage in the center-of-mass system. The layout of the HERA-B spectrometer is shown in Fig. 2.2. The detector consists of a set of sub-systems used for different applications: a set of tracking chambers for track reconstruction and momentum determination, a vertex detector for vertex reconstruction, and several detectors for particle identification.

2.2.1 The beam

The HERA ring has a circumference of 6.3 km. The protons are accelerated in several steps up to an energy of 920 GeV. The HERA beam is structured in 220 bunches that are 1 ns long and separated by 96 ns in time. Only 180 bunches are filled with protons.

Table 2.1: Parameters of the HERA proton beam in 2000.

Beam circumference        6336 m
Proton beam energy        920 GeV
Typical proton current    100 mA
Number of bunches         220 (180)
Bunch spacing             96 ns
√s at HERA-B              41.6 GeV

2.2.2 Internal wire target

b and c quarks are produced via interactions of 920 GeV protons on target wires, which are operated in the beam halo at typical distances of about five beam standard deviations from the beam center. The target configuration consists of two stations of four wires each (see Fig. 2.3). These stations are separated by 5 cm along the beam direction. The distance between the stations allows the separation of different interactions in a single bunch. The wires can be inserted individually into the proton beam halo until the desired interaction rate (luminosity) is achieved. The wires are made of various materials: Al, Ti, C and W. All wires have a 50 µm transverse width, while their thickness (in the beam direction) is 500 µm for the aluminum, titanium and tungsten wires and 1000 µm for the carbon wire.

2.2.3 Vertex Detector System (VDS)

The VDS is a central component of the HERA-B experiment as it has to provide the track coordinates for reconstructing the J/ψ → µ+µ−, e+e− decay vertices and the impact parameters of all tagging particles. This requires an accurate determination of the parameters of the charged tracks close to the target.

Figure 2.2: The layout of the HERA-B spectrometer (top and side views), showing the target wires, the Si-strip vertex detector and vertex vessel, the magnet, the inner/outer tracker, the RICH (C4F10 radiator, spherical and planar mirrors, photon detector), the TRD, the calorimeter and the muon detector.


Figure 2.3: The target wire configuration: two stations (station 1 and station 2), each with four wires (inner, outer, above, below) around the proton beam.

The VDS contains 64 double-sided silicon micro-strip detectors arranged in 8 super-layers perpendicular to the beam axis. Each super-layer is built up of four quadrants surrounding the beam. To ensure good acceptance for tracks at small angles, the minimum distance between the active area of a silicon module and the beam should be 1 cm. During injection the fluctuations in the position of the proton beam could seriously damage the silicon detectors. To avoid this, seven super-layers are mounted in a Roman Pot System, operated in a secondary vacuum and separated from the primary vacuum and the proton beam by a 150 µm thick aluminum RF-shield (see Fig. 2.4). A sophisticated system of motors, which moves the modules during injection far from the beam to a safe position, was designed and successfully implemented.

Recent tests have proven that the detectors perform well, even after receiving a radiation dose equivalent to one year of HERA-B operation. Individual detectors exhibit signal-to-noise ratios of about 25 (18) on the n-side (p-side), and hit efficiencies of about 95-99 %. A single-hit resolution of about 12 µm is also achieved.

2.2.4 Tracker system

In the region from z ∼ 2 m (end of the VDS) to z ∼ 13 m, 13 (7 in the magnet) super-layers of the Inner and Outer Tracker systems (ITR and OTR), whose granularity and technology vary with the radial distance from the beam pipe, provide pattern recognition and measure the momentum of the charged particles over a wide momentum range.

The Inner Tracker covers an area from 5 up to 25 cm from the beam pipe. It is made of Micro-Strip Gas Chambers with Gas Electron Multipliers (MSGC-GEM) (see Fig. 2.5). A strip pitch of 300 µm, an anode pitch of 10 µm and a cathode pitch of 170 µm ensure a hit resolution of 100 µm. The Outer Tracker (OTR) is a gaseous detector based on honeycomb drift chamber technology with cells of 5 and 10 mm, and module lengths up to 4.5 m. Each tracking super-layer consists of several layers of MSGC-GEM detectors and honeycomb drift chambers arranged in three different orientations. This enables tracks to be reconstructed in all three spatial coordinates, while keeping low the combinatorial ambiguities for space point reconstruction.

Figure 2.4: The Roman Pot System of the Silicon Vertex Detector.

There are three parts of the tracking system, each with a different purpose:

• Pattern Recognition Chambers (PC) are the main part of the tracking system.

It is a set of four pattern recognition chambers, installed in the field-free region between the magnet and the RICH detector. Each super-layer consists of many densely staggered layers, facilitating the search for charged tracks in the absence of a magnetic field.

• Trigger Chambers (TC) are used in the First Level Trigger.

The First Level Trigger of HERA-B performs a track search, initiated by pretriggers from the calorimeter or the MUON system. To sufficiently narrow down the search window for track finding, two tracking stations are installed close to the calorimeter.

• Magnet Chambers (MC):

Seven magnet chambers are set inside the magnetic field to enable an efficient reconstruction of K0_S decays. These chambers facilitate the extrapolation to the vertex detector of the track segments reconstructed in the PC chambers.

Figure 2.5: Schematic cross-section of an MSGC-GEM detector.

2.2.5 Ring Imaging Cherenkov Detector (RICH)

The RICH detector is designed to separate kaons from pions efficiently over a wide momentum range (between 5 and 50 GeV). It uses the Cherenkov effect to determine the particle velocity:

when a particle crosses a medium with a velocity higher than the velocity of light in that medium, it produces a cone of light whose opening angle depends on the velocity of the particle, β = v/c. The momentum threshold for emitting Cherenkov light is 2.6 GeV/c for pions, 9 GeV/c for kaons and 17.2 GeV/c for protons.

Knowing the momentum p from the tracking chambers, the mass m of the particle, and therefore its identity, can be extracted from the opening angle θC of the Cherenkov ring using:

cos θC = 1/(nβ)  →  θC² ≈ θ0² − m²/p²
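The second relation follows from the first in the limit of small angles and high momentum (a sketch of the derivation, assuming n ≈ 1 for the gas radiator; it is not spelled out in the text): writing β = p/√(p² + m²) ≈ 1 − m²/(2p²) for p ≫ m gives cos θC = 1/(nβ) ≈ (1/n)(1 + m²/(2p²)); expanding cos θC ≈ 1 − θC²/2 and defining θ0 by cos θ0 = 1/n (the ring angle of a β = 1 particle) then yields θC² ≈ θ0² − m²/p².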

The HERA-B RICH is built as a tank filled with C4F10 as radiator gas. The Cherenkov light is focused by spherical mirrors and directed by planar mirrors onto an array of multi-channel photo-multipliers located outside the detector acceptance (see Fig. 2.6).

2.2.6 Electromagnetic Calorimeter (ECAL)

The Electromagnetic Calorimeter (ECAL) measures the energy deposited by particles and allows electromagnetically showering particles (e, γ) to be separated from hadrons.

Electrons are particles that leave all their energy in the calorimeter, while hadrons leave just a small fraction of their total energy. Photons are identified as candidates that have no associated track in the tracking chambers corresponding to the cluster (energy deposition) in the ECAL.

To ensure an overall occupancy of less than 10 %, the ECAL is segmented into three parts of different granularity: Inner, Middle and Outer. The calorimeter is divided into 2344 modules and has a depth of 23 X0 in the inner and 20 X0 in the middle and outer regions. The module size of 11.15 × 11.15 cm2 remains constant throughout the calorimeter, but the number of cells in one module increases from 1 in the Outer part to 4 in the Middle and 25 in the innermost part. Each module is manufactured out of alternating layers of absorber and plastic scintillators - the 'Shashlik' structure (see Fig. 2.7). Wavelength-shifter optical fibres cross the module perpendicularly and guide the light to photo-multipliers (PMTs).

Figure 2.6: The RICH detector, showing the spherical and planar mirrors, the photon detectors and the C4F10 radiator.

2.2.7 Muon chamber

The main purpose of the muon system is to separate muons from hadrons in a wide range of momentum (from a few GeV up to 200 GeV) and to provide a pretrigger signal to the First Level Trigger (FLT). The muon system consists of the muon filter interleaved with four super layers of muon chambers (MU1-MU4), as depicted in Fig. 2.8. The absorbers cut off muons with momentum below 4.5 GeV/c. The first two super layers MU1 and MU2 consist of three double layers of proportional tube chambers with stereo angles of 0◦ and ±20◦. In the central region of these super layers pixel chambers are employed due to the higher occupancy. A gas pixel chamber is realized as square cells formed by one sense wire and four potential wires, which are oriented along the beam axis. The third and fourth super layers (MU3, MU4) each consist of a single 0◦ double layer of pad chambers, and pixel chambers in the central region. Signals from the MU1, MU3 and MU4 super layers provide information for the muon pretrigger and the first level trigger hardware, while MU2 is used only in the off-line analysis.

2.2.8 High Pt system

The high-Pt chambers were designed to be used at the trigger level to detect high-pt hadrons coming from the decay of a beauty hadron. The system consists of three planes of detectors, placed in the middle of the magnet, with two different technologies for the inner and the outer part. The inner part is made of gas-pixel chambers, similar to those used in the muon chambers, with around 12000 readout channels. The outer high-Pt system is made of straw tubes with a pad readout. The high-Pt chambers have a projective segmentation to detect the high-pt hadrons, which leave a different hit pattern than the background particles coming from the pN interaction.

Figure 2.7: ECAL modules ('Shashlik' structure: absorber plates and scintillator tiles, 40 layers, 23 X0, WLS fibres guiding the light to the photomultiplier and base; module width 11.2 cm).

2.3 HERA-B Trigger System

The HERA-B experiment was designed to run with an interaction rate of 40 MHz and the initial bunch crossing rate of 10 MHz. The 52000 channels of the detector produce 4.7 TBytes/s of data. This, together with the fact that the golden B0_d decay happens only once every 10^11 interactions, means that a very selective trigger has to be used. Therefore, a powerful four-level trigger system starting with pretriggers was developed for the experiment and is described in Fig. 2.9, where the input/output rates and trigger decision time at each level are also indicated. The trigger system is specifically designed to select events containing a J/ψ decaying to µ+µ− and e+e−.

2.3.1 Pretrigger

Three different detectors can supply 'fast' pretriggers: clusters in the calorimeter with a transverse energy larger than a certain threshold, pad coincidences in the third and fourth muon super layers, or three-fold coincidences in the high-pt system inside the magnet. The first two systems search for lepton candidates, while the third one searches for hadrons with a high transverse momentum.

Figure 2.8: Perspective view of the HERA-B muon system (super layers MU1-MU4 interleaved with the hadron absorber).

2.3.2 First level trigger (FLT)

The FLT, based on a Kalman filtering technique, starts from the Region of Interest (RoI) defined by a pretrigger system, which points to the location of a possible electron, muon or high-pt hadron candidate in the event, and attempts to follow the track through the tracking stations downstream of the target. A track candidate is passed as a message between custom-made processors, each of which covers a part of the acceptance of one super-layer. For the tracks that are followed through all super-layers, the FLT decision is based on a high-pT track or on the invariant mass of track pairs (J/ψ → l+l− trigger).

2.3.3 Second/Third Level Trigger (SLT/TLT)

The second level trigger (SLT) and third level trigger (TLT) are integrated into a single system, implemented as a 240-node farm of Pentium PCs running the Linux operating system, a high-bandwidth switch, and a system of so-called 'Second Level Buffers' (SLB), which store event data while the SLT decision is being made. The SLT works on RoIs defined by the FLT and processes the event, improving the track parameters of tracks found by the FLT, propagating them through additional tracking layers (using drift time information) and finally performing a track fit. Successful fit candidates are tracked through the vertex detector and a vertex fit is performed. The trigger decision is based on the outcome of the vertex fit and/or the track impact parameters (detached vertices).

Figure 2.9: The four trigger levels of the HERA-B experiment, with the input/output trigger rates and the trigger decision times indicated at each level.

2.3.4 Fourth Level Trigger (4LT)

The 4LT runs on 200 nodes with Pentium CPUs to provide the full reconstruction of the events, and can also classify the events into different 'physics classes' to be used in different physics analyses. These 200 nodes run a full reconstruction program on the event and make a somewhat looser selection based on the off-line analysis algorithms. With a rate of 2 kHz and a latency time of 10 ms, the interaction information is pipelined to a farm of 100 microprocessors, the farm cluster. The design input rate to the 4LT is ∼ 40 Hz. Finally, online reconstructed events are written to tape at a rate of 10 Mbyte/s.

2.4 Data Acquisition and Online Computing Systems

2.4.1 Data Acquisition System (DAQ)

The HERA-B Data Acquisition System (DAQ) integrates the high trigger levels (SLT and 4LT) and the logging protocol, plus the hardware and software interconnections between them (see Fig. 2.10). It serves the data to the SLT and the 4LT and provides the structure for data logging. The system was built such that the SLT code and the reconstruction code are inserted in a data stream without any actual knowledge of the data transmission protocols. The DAQ also provides the data transmission protocol for online monitoring and calibration data.

The system has been built using commodity hardware - Linux PC farms - with custom-made electronic modules (Digital Processor Boards) to match the high-speed and low-latency data switch requirements of the Second Level Trigger.

2.4.2 The DAQ design requirements

The HERA-B DAQ system was designed to meet the following requirements: it must support a front-end event rate of 10 MHz and at the same time it has to support FLT accept rates of 50 kHz, with an event data size of 500 KBytes.

The data transfer to the Second Level Buffers (SLB) must be done at rates of up to 25 GB/s. During the triggering and event building at the second level trigger, the DAQ must be able to support data transfer rates of about 1.0 GB/s. The Second Level Buffers must keep the data stored during the SLT decision in queues of up to 280 buffers, corresponding to a SLT trigger farm of 240 nodes. The transmission to the 4LT level (reconstruction farm) must be done at rates of 40 Hz for an event size of 100 KBytes after data sparsification. After the data reconstruction, the logging system must cope with rates of up to 40 Hz for an event size of about 200 KBytes.
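As a rough consistency check of these figures (not spelled out in the original text): 50 kHz × 500 KBytes per event ≈ 25 GB/s, matching the quoted transfer rate into the Second Level Buffers, and 40 Hz × 200 KBytes per reconstructed event ≈ 8 MB/s, consistent with the 10 MB/s logging performance required in section 2.4.5.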

The DAQ is in charge of providing calibration and monitoring data for all trigger levels and the data stream for the initial loading and online updates of the calibration data.

Figure 2.10: The HERA-B DAQ: detector front-end pipelines, the Fast Control System (FCS) and Event Controller, the DSP buffers and switches, the SLT/TLT trigger PC farm, the Ethernet switch, the 4LT farm and the logging node writing to tape.

2.4.3 Online Data Flow

The data are stored during the First Level Trigger decision in the 128-deep front-end pipelines at an event rate of 10 MHz. Once the FLT has accepted the event, the data are transferred in parallel into the Second Level Buffers for further processing. The system was designed to transfer up to 25 GBytes/s from the front-end pipelines to the SLB system, allowing a maximum FLT accept rate of 50 kHz. The Fast Control System must control the data flux from the front-end pipelines to the SLB. The behavior of the slowest pipeline in the system is emulated internally in the FCS mother board and controlled via two parameters that can be adjusted. The trigger is inhibited when the emulation predicts a FIFO-full situation in the system.


2.4.4 Event data transmission protocol

The data transmission protocol is designed to fulfill the requirements of the trigger levels. The protocol at the SLT is based on an event pull from the Second Level Buffer queue managed by the Event Controller (EVC). From the SLT the events are pushed to the 4LT using a queue of free 4LT nodes managed by the 4LT controller.

2.4.5 The event data logging

A multi-threaded logger process receives the reconstructed events from the 4LT farm and writes them to disk. There is a sender process in each of the 4LT nodes reading the event data buffers which hold the event output of the online reconstruction processes (normally 2 per node). At boot time the senders establish a TCP connection with the logger, which stays open. No controller is required for the 4LT-to-logger event data transmission. The required logging performance is 10 MB/s.

The archiving of event data runs simultaneously with the logging, using the local disk of the logger node as a temporary buffer for the data. A multi-threaded archiver process moves the data files to tape using the OSM technology. The tape write speed is approximately 8 MB/s per tape drive. On average, 1.5 drives are available to the experiment, yielding a maximum archiving performance of 12 MB/s, limited by the mass storage system.

2.4.6 The Fourth Level Farm Software

The reconstruction, event classification and selection packages are contained in a frame program called the Analysis and Reconstruction Tool (ARTE). It provides I/O and memory management for event data. ARTE is also used for analysis, including Monte Carlo event generation and detector simulation. It contains interfaces to the event data files for offline purposes as well as to shared memory segments for online usage on the 4LT farm.

2.4.7 The data processing

The essential features of the data processing can be described as three stages of processing:

• Event Reconstruction. This is the primary reconstruction and converts the raw data into physical quantities such as hits, tracks, vertices and clusters of energy. It may be done more than once in the case of corrections or the development of improved analysis techniques.

• Data Summary Tape (DST) Analysis. At this stage, the events for different physics processes are separated into event samples, and any physics-specific reconstruction is performed (e.g. constrained vertices).

This stage is usually carried out separately a few times for each event. The number of times is highly dependent upon the physics topics. The data processed for the different topics are written into mini-DST files.

• Physics Analysis. This comprises multiple scans over an event sample, applying cuts based on an examination of the output statistics (in the form of histograms or similar visualization techniques), and iterating towards final physics event samples that are used for detailed analysis of particular physics processes.

This stage is carried out many more times for each event than the DST Analysis stage.

2.4.8 The usage of Calibration and Alignment (CnA)

During the reconstruction procedure, the data needed to align and calibrate the detector online are derived. To increase the available statistics, a scheme was developed to collect all data in parallel from all the nodes in a central place called the Gatherer. The Gatherer data are then used to compute the alignment constants, which are updated in the online databases. These data are stored in a Berkeley Database, using the Mizzi Database Management System (DBMS), and they must be broadcast simultaneously from the database to a few hundred processors. The updated CnA constants are produced online not only in the reconstruction farm but also in dedicated second level nodes [3].

The set of CnA constants used during data taking at any time is identified by an index object, the so-called CnA key. It is written in the event header, and it allows an event to be associated with all the calibration and alignment data used during the trigger process and the online reconstruction. A push architecture has been designed for a low-latency distribution of the CnA data to the SLT nodes. The updated CnA constants are multi-cast using the fast and reliable SHARC links. On the other hand, the larger processing time of the online reconstruction nodes (of the order of seconds) allows a slower distribution using a pull architecture: the reconstruction nodes fetch the updated CnA data from the fast database memory caches via fast Ethernet. The volume of the CnA constants is 650 MB, and the attained performance of the loading of constants into the second level processes through the SHARC switch was 1 GB/s.

Every data-taking session begins by booting all the processes and establishing a tree-like control infrastructure to operate the system according to the phase of the data taking. A uniform interface based on the HERA-B DAQ messaging system lets one start up and terminate processes remotely. The system boot-up procedure is controlled by a number of special processes called process managers. During the booting procedure, they request the process servers to start processes according to the information stored in the HERA-B DAQ database. One or more processes are involved in booting every subcomponent, i.e. all subcomponents are booted up simultaneously. Once all the processes have been launched, the control infrastructure is established by means of a state-machine protocol applied by half of the processes in the system. The protocol provides the control and monitoring of the subcomponent states with respect to their readiness to take data. To get ready to take data, different subcomponents may require a different number of internal initialization states.

2.4.9 Event Data Re-processing

When the reconstruction packages are further developed and/or improved CnA constants are produced, the event data need to be reprocessed. Therefore, a system to exploit the CPU power of the 4LT farm for event data re-processing has been set up. It works similarly to the usual online processing scheme and makes use of the online protocols, in particular the logging and the archiving facilities. Only the data source is different: instead of passing raw event data from the DAQ system to the 4LT nodes, a process retrieves data files from tape and distributes the events to the 4LT farm nodes.

The event re-processing rate is only limited by the farm processing power: 50 Hz for an average event reconstruction time of 4 s.
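As a rough consistency check (not in the original text): with the roughly 200 reconstruction nodes of the 4LT farm mentioned in section 2.3.4 and an average reconstruction time of 4 s per event, the farm can sustain about 200/4 = 50 events per second, which matches the quoted re-processing rate.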

2.4.10 The role of the database

The intensive usage of information on a large number of machines requires an elegant method to manipulate the data, manage the level of priority of different users and provide fast access to the data. The description of the implemented database system and the advantages it provides are given in the next chapter.


Chapter 3

Databases: preliminary concepts

Databases are collections of data describing the activities of related organizations. The strength of databases lies in the processing of large volumes of data.

A Database Management System (DBMS) is software designed to assist in maintaining and utilizing large collections of data. Using a DBMS to manage data gives many advantages:

• Data independence: the application programs should be as independent as possible from the details of data representation and storage.

• Efficient data access: the DBMS utilizes a variety of sophisticated techniques to store and retrieve data efficiently.

• Data integrity and security: if the data is always accessed through the DBMS, the DBMS can enforce integrity constraints on the data. The DBMS can also enforce access controls that govern which data is visible to different classes of users.

• Data administration: management of the data - responsibility for organizing the data representation to minimize redundancy and for fine-tuning the storage of the data to make retrieval efficient.

• Concurrent access and crash recovery: a DBMS schedules concurrent accesses to the data in such a manner that users can think of the data as being accessed by only one user at a time.

3.1 Relational Database Management System (RDBMS)

A DBMS allows the user to define the data to be stored in terms of a data model. Most Database Management Systems are based on the relational data model. The central data description construct in this model is a relation, which can be thought of as a set of records. A relational database allows the definition of data structures, storage and retrieval operations, and integrity constraints. In such a database the data and the relations between them are organized in tables.

A table is a collection of rows or records, and each row in a table contains the same fields. Certain fields may be designated as keys, which means that searches for specific values of those fields will use indexing to speed them up.

Where fields in two different tables take values from the same set, a join operation can be performed to select related records in the two tables by matching values in those fields. Often, but not always, the fields will have the same name in both tables. This can be extended to joining multiple tables on multiple fields. Because these relationships are only specified at retrieval time, relational databases are classed as dynamic database management systems.
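As an illustration (not taken from the original text; the tables runs and cna_keys and their columns are hypothetical), such a join on a shared run_number field can be issued from Perl through the DBI interface that is used later in this work:

    #!/usr/bin/perl
    # Illustrative sketch only: the table and column names are hypothetical,
    # not the actual Hera-B schema.
    use strict;
    use warnings;
    use DBI;

    my $dbh = DBI->connect("DBI:mysql:database=herab;host=localhost",
                           "reader", "secret", { RaiseError => 1 });

    # Select related records from two tables by matching values of run_number.
    my $sth = $dbh->prepare(q{
        SELECT r.run_number, r.start_time, k.key_id, k.version
        FROM   runs r
        JOIN   cna_keys k ON r.run_number = k.run_number
        WHERE  r.run_number = ?
    });
    $sth->execute(20103);

    while (my ($run, $start, $key, $version) = $sth->fetchrow_array) {
        print "run $run (started $start): CnA key $key, version $version\n";
    }
    $dbh->disconnect;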

A description of data in terms of a data model is called a schema. In the relational model, the schema for a relation specifies its name and the name and type of each of its fields.

The data in a DBMS is described at three levels of abstraction [5]: conceptual, physical, and external schemas.

A Data Definition Language (DDL) is used to define the external and conceptual schemas.

All DBMS vendors support SQL (Structured Query Language) commands to describe aspects of the physical schema.

Information about the conceptual, external and physical schemas is stored in the system catalogs.

3.2 Data independence

A very important advantage of using a DBMS is that it offers data independence: application programs are insulated from changes in the way the data is structured and stored.

Data independence is achieved through the use of three levels of data abstraction. The relations in the external schema are in principle generated on demand from the relations corresponding to the conceptual schema. If the underlying data is reorganized, that is, the conceptual schema is changed, the definition of a view relation can be modified so that the same relation is computed as before. Thus, users can be shielded from changes in the logical structure of the data, or changes in the choice of relations to be stored. This property is called logical data independence. The conceptual schema insulates users from changes in the physical storage of the data: it hides details such as how the data is actually laid out on disk, the file structure and the choice of indexes.
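A small illustration of logical data independence (the view and table names are hypothetical, not the Hera-B schema): applications query a view, so that a later reorganization of the underlying tables only requires the view definition to be adapted.

    # Illustrative sketch: the view shields client programs from changes in
    # the underlying (conceptual) schema.
    use strict;
    use warnings;
    use DBI;

    my $dbh = DBI->connect("DBI:mysql:database=herab", "admin", "secret",
                           { RaiseError => 1 });

    # Clients only ever read from 'released_keys'; if cna_keys is split or
    # renamed later, only this definition has to change.
    $dbh->do(q{
        CREATE OR REPLACE VIEW released_keys AS
        SELECT key_id, run_number, version
        FROM   cna_keys
        WHERE  released = 1
    });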

3.3 Query in a DBMS

The questions involving the data stored in a DBMS are called queries.

A DBMS provides a special language, called the query language, in which queries can be posed. A DBMS takes great care to evaluate queries as efficiently as possible. The efficiency of query evaluation is determined to a large extent by how the data is stored physically. A good choice of indexes for the underlying relations can speed up such queries considerably.

3.4 Structure of a DBMS

The DBMS accepts SQL commands generated from a variety of user interfaces, produces query evaluation plans, executes these plans against the database, and returns the answers.

When the user issues a request, the parsed query is presented to a query optimizer, which uses information about how the data is stored to produce an efficient execution plan for evaluating the query. An execution plan is a blueprint for evaluating a query, and it is usually represented as a tree of relational operators that serve as the building blocks for evaluating queries posed against the data. The code that implements the relational operators sits on top of the file and access methods layer.

This layer includes a variety of software for supporting the concept of a file, which, in a DBMS, is a collection of pages or records. This layer supports a heap file, or file of unordered pages, as well as indexes.

The files and access methods layer code sits on top of the buffer manager, which brings pages in from disk to main memory as needed in response to read requests. The lowest layer of the DBMS software deals with the management of space on disk, where the data is stored. Higher layers allocate, de-allocate, read and write pages through this layer, called the disk space manager.

3.5 Data catalog

A fundamental property of a database system is that it maintains a description of all the data that it contains.

A relational DBMS maintains [6] information about every relation and index that it contains. A relation can be stored using one of several alternative file structures, with one or more indexes - each stored as a file - on every relation. The DBMS also maintains information about views, for which no tuples are stored explicitly. This information is stored in a collection of relations, maintained by the system, called catalog relations or the system catalog. The system catalog contains the following information (a small query sketch is given after the list):

• For each relation:

Its relation name.

The file name and the file structure of the file in which it is stored.

The attribute name and type of each of its attributes.

The index name of each index on the relation.

The integrity constraints, e.g. primary key and foreign key constraints, on the relation.

• For each index:

The index name and the structure of the index.

The search key attributes.

Statistics about relations and indexes are also stored in the system catalogs and updated periodically.
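For instance, the MySQL server used later in this work exposes its catalog through statements such as SHOW TABLES and DESCRIBE, which can be queried from Perl like any other relation (the table name cna_keys is again hypothetical):

    # Illustrative sketch only: lists the relations known to the server's
    # catalog and the attributes of one (hypothetical) table.
    use strict;
    use warnings;
    use DBI;

    my $dbh = DBI->connect("DBI:mysql:database=herab", "reader", "secret",
                           { RaiseError => 1 });

    # Every relation name known to the catalog of the 'herab' database.
    for my $table (@{ $dbh->selectcol_arrayref("SHOW TABLES") }) {
        print "relation: $table\n";
    }

    # Attribute names and types of the hypothetical table 'cna_keys'.
    my $columns = $dbh->selectall_arrayref("DESCRIBE cna_keys");
    printf "  %-20s %s\n", $_->[0], $_->[1] for @$columns;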

In Hera-B the requirements on the efficiency and speed of access to the used data are very strict [7]. For this reason, a very particular, non-standard “in house” database management system was specially developed. Some standard RDBMS technology is also part of the system.


Chapter 4

Hera-B databases

4.1 Hera-B databases purposes

The Hera-B database system was set up to supply database support and services, and to provide advice on database creation and maintenance. It supports, and is part of, the integrated solution of the data acquisition, reprocessing and analysis information systems.

The complexity [7] of the detector, the magnitude of the data and the real-time needs of the system impose strong efficiency requirements, implying that the database system plays a central role in the overall efficiency.

It is responsible for the management, maintenance and monitoring of setup, period and event meta-data information.

• Setup information includes cabling connections, geometry and farm software configuration, and basic High Energy Physics (HEP) constants. It occupies a few megabytes.

• The period information contains calibration, alignment, status and luminosity - all conditions of the detector status. It is around 200 GBytes.

• Event information: run bookkeeping for fast access, sub-detector FED bit patterns.

Most of the client applications and the information stored in the database system are created and maintained by other groups as part of their responsibilities.

The database system centralizes the information of the following phases: online computing systems and data acquisition, reconstruction CnA keys (on-line and off-line), and off-line analysis processes. It provides the management infrastructure to:

• Deal with the detector configuration databases.

• Manage calibration and alignment.

• Distribute the information to the reconstruction farms.

• Provide separate offline and online access.

• Keep available slow control and data quality information.


Figure 4.1: The role of the database group as part of the Hera-B experiment: detector front-end electronics, FCS and event control, SLT/TLT and 4LT farms, logger PC, and the database group services (configuration, calibration, alignment, storage, data reprocessing and physical storage to tape).


4.2 The database architecture

The database structure contains several layers. It was implemented as a custom-made database using different software parts, including open-source products.

It includes the Berkeley Database, which is used for the physical level of storage, the Mizzi data definition layer and management system on top of the Berkeley layer, and the Leda management layer, which provides object persistence and manages associations between objects in related containers.

In spite of having to deal with fast responsiveness, the Hera-B database system is not a real-time database, since the transactions do not have to be executed within a deadline.

4.2.1 Data Storage Layer (DSL)

For the data storage layer, the Berkeley database technology, an embedded database toolkit, was chosen [7]. This decision was mainly based on the fact that the Berkeley database is a very well tested programming toolkit that provides fast, reliable, scalable and mission-critical database support to software developers.

The Berkeley database's high performance and stability support thousands of simultaneous users working on databases as large as 256 terabytes.

4.2.2 Data Definition Layer (DDL)

The data definition layer is called Mizzi. It is a C-array type formatter: Mizzi formats variable-length data arrays of several types in a machine (platform) independent way. It also defines the name and time/version encoding of the keys used to access the data. This formatter, with a very simple C binding, is very efficient when storing large numbers of very small objects, which appear in use cases like managing the detector channel status. The access patterns are also ideal for this approach.

4.2.3 Object Manager Layer (OML)

Leda is an object manager layer [7] that was developed to provide object persistence and manage associations between objects in related containers. The implemented many-to-many associations are navigated, with the help of iterators, using hash tables. Keys are used as object identifiers that have the scope of classes, and associations can only be followed when the classes of objects have explicitly been loaded or saved. It has been extensively used for the purpose of the configuration databases.

Leda is a simplified object manager layer that provides C and C++ APIs working on the client side, allowing a simplified interaction with the basic Mizzi serializer. It gives the client application the possibility to fix the table data definitions and relation schema according to dedicated C preprocessor macros, hiding the declaration, instantiation and initialization of a complex set of C structures containing all the relevant information on the whole database schema.

Relations among rows of tables are set by associating the primary keys and are realized in the client application memory as a system of C pointers, optimizing in this way the local data navigation. They are stored in the permanent database in special relation tables as explicit key associations.


The fact that it includes, in parallel to the C++ binding, a simple C interface, together with its availability on embedded PCs running the real-time LynxOS operating system, has proven to be very useful.

The Leda package does not allow the database server to optimize queries by following object associations. A relational database is used in the case of the tag and data-quality databases, where this limitation has turned out to be problematic.

4.3 Detector configuration databases

The configuration of the detector involves a complicated network of interrelated objects, and it is the domain for the usage of object features in the DBMS.

The performance required during the initialization of the nodes in the trigger farms at each setup period forces the use of simple implementations.

4.4 Offline access to the data

To have a steadily running database system during the acquisition phase, data is replicated between the online system and the servers accessed by offline applications. Offline access to the database is achieved by a replication mechanism that isolates the online servers from the load fluctuations induced by offline use; updates collected within a time window of one hour are immediately replicated to a mirror site, and from there to the rest of the outside institutes.

The replication infrastructure decoupled the online computing infrastructure from the load fluctuations of offline user access. It also reduced the dependence between the two networks.

4.5 CnA constants storage and retrieval requirements

The efficient use of event tag databases is difficult due to the huge number of events that should be tagged. The runs extend over periods in which conditions change several times. Therefore, tagged data sets have been introduced, which are the basis for the data quality classification. They are managed by a relational database system that is periodically updated by a process that uses the stored available information.

Such methods to catalog the useful scientific data are extremely important to reduce the large event data set mined during the analysis phase.

The data set properties used as selection criteria are the run conditions for each sub-detector and the data quality assessments generated on-line for periods where the conditions did not change. The classification of the data in these periods can also be performed in relation to specific physics channels.

Due to the huge number of events, the investigation of approaches to classification by event tags is centered on distributed database servers, with each node being responsible for the data corresponding to a given period. Requests are sent simultaneously to the different server nodes, and the results are merged before being sent to the client.

4.5.1 Run book-keeping

Associated with the 600 K input channels of the Hera-B detector, a large amount of status information, such as high and low voltages, temperatures and gas system parameters, is monitored very frequently; only a very limited part of it actually changes at each update [7].

The most straightforward options to store slow control data in the database were either to put all the data from the information source into one object containing all the values, even if most did not change, or to store each channel in a separate object. The first option has the disadvantage of consuming a lot of space, since all values would be stored regardless of whether they have changed or not. The second option implies a huge number of small objects in the database, which is not desirable because of the enormous growth of the index tables. Storing each individual channel update in a relational table would also lead to a huge and impractical number of entries.

In order to avoid the presence of too many small objects, the system incorporates the possibility to update individual channels that are part of large collection objects. The slow control interface defines schema, data and update objects. The client API has been implemented to hide the update objects from the user. The main part of the work is done at the server level, thus allowing good performance. Upon request, the history of values is re-clustered on the database server before being sent to the client.
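The idea can be pictured with a small sketch (the structures and channel names below are invented for the illustration and are not the real Mizzi/Leda objects): a collection object holds a full snapshot of all channel values, while each update object records only the channels that changed, and the history is rebuilt by applying the updates in order.

#!/usr/bin/perl
# Sketch of the collection/update-object idea (hypothetical structures, not the
# real Mizzi/Leda API): one full snapshot plus small per-update deltas.
use strict;
use warnings;

# Full collection object: one value per slow-control channel.
my %snapshot = (HV_ch001 => 1500.0, HV_ch002 => 1498.5, TEMP_ch001 => 22.3);

# Update objects: only the channels that actually changed at each update.
my @updates = (
    { time => 1, changes => { HV_ch002   => 1501.0 } },
    { time => 2, changes => { TEMP_ch001 => 22.9, HV_ch001 => 1499.0 } },
);

# Re-clustering the history: apply the deltas in time order on top of the snapshot.
for my $u (@updates) {
    $snapshot{$_} = $u->{changes}{$_} for keys %{ $u->{changes} };
    printf "t=%d  %s\n", $u->{time},
        join ' ', map { "$_=$snapshot{$_}" } sort keys %snapshot;
}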

While it would be trivial to design a solution that stamps the events with a time-stamp, the overhead [7] that would fall on the database querying system would be rather high, since all the events are analysed one by one and each would have to query the database for possible updates. The association between events and the objects in the database must be efficient and flexible. The object index distribution mechanism adds an extra reference level into the database system, stamping each event with a key index pointing to the related information in the database. The association from events to the key object is dynamic, allowing not only the reproduction of the trigger conditions but also improved re-processing. The data concerning calibration, alignment, slow control and monitoring are stored in so-called Mizzi tables.

There are data tables and tables with indexes towards these data. This is an optimization for data storage and retrieval based on an index key field. Each table is completely defined by two index numbers: major and minor. These can have three different meanings. In the general case the major number simply enumerates the existing tables; in the second case the major is the run number, and in the third the major index is a time stamp. The minor number in all cases gives the version of the table, i.e. the minor indicates the update number. It is incremented automatically after each update of the corresponding calibration or alignment table. The procedure of improving the CnA is called a release, and the update of the CnA keys a keyrelease. The direct navigation from the event to all the related database information, without querying intervals of validity, is crucial to enable the use of cache servers that can verify whether the required information is locally available without contacting the main database servers.
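The (major, minor) addressing can be illustrated with a small sketch (all server, database, table and index values below are hypothetical, chosen only for the example): a CnA key behaves like a record that maps each constants table to the full address of the version valid for that run, so an event carrying the key index can reach its constants directly.

#!/usr/bin/perl
# Sketch of CnA-key addressing (hypothetical data, not the real keytable layout):
# an event carries a CnA key; the key lists, per constants table, the full address
# (server, database, table, major, minor) of the version valid for that run.
use strict;
use warnings;

my $cna_key = 1042;   # key index stored with the event
my %keytable = (
    $cna_key => [
        { server => 'vds_db',  database => 'vds_align', table => 'wafer_pos',
          major  => 20123,      minor => 3 },   # major = run number here
        { server => 'ecal_db', database => 'ecal_cal',  table => 'cell_gain',
          major  => 1055196000, minor => 7 },   # major = time stamp here
    ],
);

# Direct navigation from the event to its constants, no interval-of-validity query:
for my $entry (@{ $keytable{$cna_key} }) {
    printf "%s/%s/%s  major=%d minor=%d\n",
        @{$entry}{qw(server database table major minor)};
}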

Users cannot make requests directly to the databases. The maintenance of the Berkeley database is centralized. Mizzi encapsulates the Berkeley DB API and provides the infrastructure for client/server requests of indexed, unformatted objects.

4.5.2 CnA keys

The relations between the different calibration and alignment constants that belong to a given run are given by index objects called CnA keys. They are stored in index tables and saved in an “index” database, called key table, in Berkeley/Mizzi.

The database that keeps all the information about what is happening during the run is accessed via the RUN LOG server, in the run config database table. There the user can check, in the key table server, the keytable and keytablemc (keytable for Monte Carlo) databases.

The keytables give the full path to all the CnA for a given run: the servers which operate the different databases, the database names, and the major and minor indexes that completely describe the tables. The major is the CnA key number and the minor refers to the release version.

After these settings are given, the user can see the proper calibration or alignment constants.

Figure 4.2: Keyrelease. The online indexes I1 ... In (version 0) created during the production phase point to the calibration and alignment tables used for event processing and storage; the offline revision phase produces updated tables (Calibration 1', Alignment 1', ...) and a new set of indexes (revision 1) pointing to them.

4.5.3 Keyrelease mechanism

Before re-processing it is necessary to indicate the new CnA constants that are going to be used. This is done by setting a new keytable, i.e. by creating a new version of the keytable with the new addresses.

The mechanism for updating the keytable major index is called keyrelease and it includes the following steps (a minimal sketch of these steps is given after the list):

• Make a copy of the current keytables from the online machine.

• Update the copy version.

• Add the updated keytables to the online machine.
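The steps can be sketched against the MySQL mirror of the keytable (the table and column names are hypothetical; the real procedure acts on the Berkeley/Mizzi keytables through the database tools, not through SQL):

#!/usr/bin/perl
# Sketch of a keyrelease on the MySQL mirror (hypothetical schema: keytable rows
# identified by cna_key/version): copy the current rows, bump the version, add them back.
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('DBI:mysql:database=keytabledb;host=localhost',
                       'dbuser', 'secret', { RaiseError => 1 });

my ($cna_key, $old_version) = (1042, 3);
my $new_version = $old_version + 1;

# 1. Copy the current keytable entries.
my $rows = $dbh->selectall_arrayref(
    'SELECT server, db, tab, major, minor FROM keytable
      WHERE cna_key = ? AND version = ?', undef, $cna_key, $old_version);

# 2. + 3. Update the version of the copy and add the updated entries back.
my $ins = $dbh->prepare(
    'INSERT INTO keytable (cna_key, version, server, db, tab, major, minor)
     VALUES (?, ?, ?, ?, ?, ?, ?)');
$ins->execute($cna_key, $new_version, @$_) for @$rows;

$dbh->disconnect;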

Because of the different criteria for making indexes, there are three different ways to make the keyrelease. The first is when the major number corresponds to the run number: the major is then set equal to the run number. The second is to set the major equal to a time stamp. The third is to set a group of keytables, giving the CnA keys of the first and the last keytable that define the range of the keyrelease.

When the keyrelease is started, the program checks by itself whether the keytables to be updated already exist. There are two options in this case: if the software does not find an existing table, it can either stop the procedure or create a new table. These two options are selected by plus and minus.

4.5.4 Dbedit

The different databases in Mizzi are manipulated via the database editor. It is an interface, written in Tcl/Tk, which allows the user to check, edit and drop the information about all the servers and databases in the configuration databases.

Figure 4.3: Dbedit tool for management and maintenance of distributed DB.

The idea behind the developed interface was to create a persistent Web interface to inspect the keytable database instead of using dbedit.

Before proceeding to the description of the problem and the architecture of the solution, the next chapter introduces some of the technologies that serve as a ground base for the design of the solution.

Chapter 5

Used technology

In order to engineer the implementation of our Web solution, programs written in Perl using the CGI and DBI modules have been used. They extract information from a MySQL database, which holds a special copy of the keytable database for Web access to the data. In this chapter the advantages of the used technology are reviewed and the client/server mechanism is explained.

5.1 The MySQL Database and RDBMS

MySQL is the most popular open source SQL [6] (Structured Query Language) relational database of the client/server type. It belongs to the class of Relational Database Management Systems, which act as a broker between the physical database and the users of that database.

MySQL packs a large feature set into a very small and fast engine. It includes the SQL server, client programs for server access and administration, and an Application Programming Interface (API). The SQL server is widely exploited for Web applications. MySQL is flexible and works under various systems such as Solaris, Irix, Windows or Linux. It is a good choice for middle-sized databases.

The MySQL system catalog consists of all the databases and tables that the server governs. They have a tree structure: each database has its own directory in the system catalog, and each table is represented as files in the database directory.

The system catalog also comprises some status files (log files), which are generated by the server. These status files contain important information about the server's work.

Everything in the system catalog is managed by a single program, mysqld (the MySQL server). The client programs never handle the data directory directly; the server is responsible for this.

It is the link between the client programs and the data that the users want to use. When the server is started, it opens the log files and activates the network interface for access to the system catalog. To access the databases, the application programs make a connection with the server and send queries as SQL statements for executing the required operations. The server executes all requests and sends the result back to the client. The server is multi-threaded and can serve many simultaneous client connections.
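A minimal example of this client/server exchange from a Perl client (the database name and credentials are placeholders):

#!/usr/bin/perl
# Minimal client/server exchange: connect to the mysqld server, send one SQL
# statement, receive the result set (placeholder database name and credentials).
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('DBI:mysql:database=test;host=localhost',
                       'dbuser', 'secret', { RaiseError => 1 });

my $sth = $dbh->prepare('SELECT VERSION(), NOW()');
$sth->execute;                               # the server executes the query ...
my ($version, $now) = $sth->fetchrow_array;  # ... and returns the result to the client
print "server $version at $now\n";

$dbh->disconnect;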

Usually one server governs the databases of one host computer, but it is possible to start several servers if each of them controls the data in a different system catalog.

It is also possible to start several servers working with one data catalog. The update log records the queries which modify the database; this file is kept in the system catalog on the server. Each database managed by the MySQL server has its own catalog, which exists as a subdirectory of the basic system catalog with the same name as the database.

Each table in the database exists as a collection of three files: the form file, the data file and the index file.

• The form file contains the table description.

• The data file includes the data.

• The index file contains the index of each relation.

5.2 Web server

The World Wide Web is an Internet-distributed information retrieval system which originated at CERN. Information on the Web is presented to the user as hypertext objects in HyperText Markup Language (HTML) format. HTML is a hypertext document format which is the standard for Web page preparation.

Hypertext links refer to other documents by their URLs. These can refer to local or remote resources accessible via ftp, telnet or via the http protocol used to transfer hypertext documents.

The World Wide Web is based on a client-server model. The server provides resources and the client requests them. The client program (known as a browser) runs on the user's computer and provides two basic navigation operations: to follow a link or to send a query to a server. The mechanism which allows the two to communicate with each other is the hypertext transfer protocol (http).

When the user makes a request, the browser establishes a connection with the server and sends the request using http.

This protocol exchanges the HTML documents. The server does not do all the work itself: it is a daemon (process) which waits for the users' requests and delegates complex tasks to an external program or script. In the case of database access, the script acts as a gateway between the server and the data repository.

When the server receives a request to access the database, it passes the request to a gateway program, which gets the data and returns the result to the server.

The server then repackages the information from the script and forwards it back to the client.

In our case, the details of this interaction are specified by the Common Gateway Interface (CGI).

The CGI protocol defines the input [9] that a script can expect to receive from the server and the output which it must return to the server. The database interface makes a connection with the database and fetches the result of the query, which the server then sends to the browser. The most popular and most widely used Web server is Apache, an open source http server for Linux, WindowsNT and other platforms.

Apache is fast and reliable. It provides http services in line with the current http standards. The client-server interface operates by making available an external interface which is continually monitored by the server.

The external interface may be provided by a file, a named pipe or a socket with a well-known location in the file system.

Clients may write directly to this interface through the use of explicit embedded code.

5.3 Program language

5.3.1 Scripting languages

Scripting languages are higher level than system programming languages. They are also referred to as glue languages or system integration languages. They are used for "gluing" applications together, and they assume that a collection of useful components written in other languages already exists. In order to simplify the task of connecting components, scripting languages tend to be type-less. The strongly typed nature of system programming languages encourages the creation of a variety of incompatible interfaces [10]. Scripting languages use interpreters, which are less efficient than the compiled code of system programming languages. However, the operations in scripting languages have greater functionality, flexibility and compatibility.

Their increasing importance reflects a shift in the application mix towards gluing applications.

The most important of these application areas are graphical user interfaces (GUIs), the Internet, and component frameworks.

5.3.2 Perl

Perl is a high-level programming language. It is able to process strings, including regular-expression string manipulation.

This capability makes Perl a good choice for database programming, since the majority of the information stored within databases is textual in nature. Perl scripts tend to be far smaller than equivalent C programs and are generally portable to other operating systems that run Perl with few or no modifications (flexibility). Using Perl as an Apache module increases its performance.
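A small example of the kind of text handling meant here (the "server/db/table major.minor" string format is invented for the illustration):

#!/usr/bin/perl
# Regular-expression string manipulation: pull the major and minor indexes out of
# a textual table address (the address format is invented for this sketch).
use strict;
use warnings;

my $line = 'vds_db/vds_align/wafer_pos 20123.3';
if ($line =~ m{^(\S+)/(\S+)/(\S+)\s+(\d+)\.(\d+)$}) {
    my ($server, $db, $table, $major, $minor) = ($1, $2, $3, $4, $5);
    print "table $table on $server: major=$major minor=$minor\n";
}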

Perl has the ability to load external modules dynamically. Perl's basic function (application) on the Web is to create CGI programs, and it uses the CGI module for this.

5.3.3 Common Gateway Interface

The Common Gateway Interface (CGI) permits interactivity between a client and a host operating system through the World Wide Web via the HyperText Transfer Protocol. It is a standard for external gateway programs to interface with information servers, such as http or Web servers.

A plain HTML document that the Web server delivers is static, which means it does not change. A CGI program is executed in real time, so that it can output dynamic information - perhaps a weather reading, or the latest results from a database query.

CGI allows users to run a program on the server machine that performs a specified task. Gateways are programs which handle information requests and return the appropriate document, or generate a document on the fly. They can be used for a variety of purposes, the most common being the handling of FORM requests for http. An http server is often used as a gateway to a legacy information system. CGI is a convention between HTTP server implementors about how to integrate such gateway scripts and programs.

The server can serve information which is not in a form readable by the client (e.g. an SQL database), and act as a mediator between the two to produce something which clients can use.
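A minimal CGI script in the sense described above, generating its output at request time (no real database access; the content is only illustrative):

#!/usr/bin/perl
# Minimal CGI program: the Web server runs it once per request and it returns a
# dynamically generated HTML page (here just the server-side time).
use strict;
use warnings;
use CGI;

my $q = CGI->new;
print $q->header('text/html');
print $q->start_html('Dynamic page'),
      $q->h1('Generated on the fly'),
      $q->p('Server time: ' . scalar localtime),
      $q->end_html;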

5.3.4 Database Driver Interface

The DBI is a database access module for the Perl programming language [6]. It defines a set of methods, variables and conventions that provide a consistent database interface, independent of the actual database being used. The Application Programming Interface (API) defines the call interface and the variables that Perl scripts use. The API is implemented by the Perl DBI extension.

The DBI "dispatches" the method calls to the appropriate driver for actual execution. The DBI is also responsible for the dynamic loading of drivers, error checking and handling, providing default implementations for methods, and many other non-database-specific duties. Each driver contains implementations of the DBI methods using the private interface functions of the corresponding database engine. Multiple simultaneous connections to multiple databases through multiple drivers can be made via the DBI: simply make one connect call for each database and keep a copy of each returned database handle.
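For example (the database names and credentials are placeholders), two independent handles can be held at the same time:

#!/usr/bin/perl
# Multiple simultaneous connections through the DBI: one connect call per database,
# one handle kept per connection (names and credentials are placeholders).
use strict;
use warnings;
use DBI;

my $dbh_key = DBI->connect('DBI:mysql:database=keytabledb;host=dbhost',
                           'reader', 'secret', { RaiseError => 1 });
my $dbh_run = DBI->connect('DBI:mysql:database=runlogdb;host=dbhost',
                           'reader', 'secret', { RaiseError => 1 });

my ($n_keys) = $dbh_key->selectrow_array('SELECT COUNT(*) FROM keytable');
my ($n_runs) = $dbh_run->selectrow_array('SELECT COUNT(*) FROM run_config');
print "$n_keys keytable entries, $n_runs runs\n";

$_->disconnect for ($dbh_key, $dbh_run);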

Chapter 6

Design and implementation

In order to provide fast and easy access to the data and to centralize the information in case of changes, a Web client interface was developed. The reduced database crew was another reason to look for automation of the time-consuming tasks. With the new interface, the different subdetector group experts have the possibility to indicate which changes have been inserted into the database and which information in the keytables has to be updated.

6.1 The requirements

The calibration and alignment constants are necessary for the data acquisition and analysis phases. To improve the analysis results, physicists improve the estimation of these constants. The data are then reprocessed using the new CnA constants.

For different kinds of analysis, different versions of the CnA constants may be used. The user has to know the range of validity of the different constants and the time when they were produced. The index validity corresponds to the validity of the particular calibration and alignment constants. If the user knows the revision version of the keytable, he/she can see which changes are valid for that version.

When the data are processed with ARTE, several parameters have to be defined in the script file, called a kumac, to indicate the runs and geometry which are used and the paths to the data. These are the run number, the two indexes of the keytable (the CnA key and the version of this keytable) and the reprocessing version of the data.

Before starting the physics analysis of the data, the user has to:

• Find out which keytable corresponds to which run and vice versa.

• Check the list of revisions for a given run or keytable.

• Check the list of revision tables (to see which CnA improvements are valid for a certain run).

The next group of users are the subcomponent experts, who are responsible for the improvements in the CnA data for each of the subdetector databases.

Figure 6.1: “Use cases” scheme.

When they make changes in the subdetector databases, they inform the database administrator, who finally makes the changes in the keytable database (see fig. 6.1).

The subcomponent experts have to:

• Make a review of the keytable: check the list of CnA tables which are used for a given run.

• Have a tool to check the list of new changes in the CnA tables that will be used in the keytable.

• Set conditions for making keyrelease.

• Have the possibility to delete changes in case of errors, before the data are submitted to the Berkeley/Mizzi databases.

Finally, the database administrator produces a new revision online. He can also do it manually from the prompt.

The CnA revisions can be done using dbedit in a few steps:

• Set the server RUN LOG and the table run config. Set as major the number of the required run.

The Run log database keeps information about the run itself. Here the user also finds the CnA key for the certain run. The meanings of the major and minor indexes in the Run log database are: the run number for the major and the time stamp for the minor.

• Set the server key table and the table keytable. Set as major the CnA key and as minor the version of the keytable, which corresponds to the number of all the updates done until now.

This setting gives the content of the keytable: the names of all the servers and databases which were used during the certain run and the tables with the calibrations which belong to these databases. The latter are presented with their major and minor indexes. Because the different CnA modifications are the responsibility of different subgroup experts, there is no standard for the meaning of major and minor in this case; the major could be a run number, a time stamp or some other number. However, after an update of some calibration or alignment parameters the minor number is incremented automatically, which shows that the minor indicates the number of updates of the table.

• The command Go and the options Next, Previous, Last and First are used to check how many revisions have been made for a given keytable (how many different updates of one or another CnA).

• In order to review a calibration or alignment it is necessary to get the major and minor of the required calibration and then to set the proper server, table and these indexes.

In the following, the development of new tools to automate the revision tasks, which up to now had to be done manually in HERA-B via dbedit, is presented.

6.2 Implementation details

In order to cope with the described requirements, a Web design was chosen. The built Web site provides its services through navigation over several pages. Fig. 6.2 shows the data chain from client to user.

The architecture of the new solution includes the following techniques: the Web applications are developed using the scripting language Perl with the CGI and DBI modules. Perl is installed as a module of the Apache server, which speeds up the Perl interpreter. These applications are written for a MySQL database.

The mechanism of the program is easy to learn and the code is simple. The usage of Web applications requires care about security; this is the main reason to use a copy of the Berkeley/Mizzi keytable databases for Web access. The MySQL database is a good choice for middle-sized data volumes.

The chain from keytable creation and usage to the keytable visualization tools is the following:

In the online analysis and reprocessing, the most recent CnA keys are used; they are broadcast simultaneously, via the SHARK network to the SLT and via a tree of cache database servers to the 4LT. In case of changes in the keytables, the cache servers send requests to Berkeley/Mizzi. When a new run is started, the keytable is created automatically.

Separate databases are used for online and offline analysis. The isolation is achieved by replicating the database for offline access. A copy of the keytable databases in a MySQL database is provided specifically for Web access.

The procedure to transmit the recent keytable from Berkeley/Mizzi to the offline database and the MySQL version is the following: the cron daemon is set up to check each of the Mizzi tables for a new version. A C program extracts this information from Mizzi. The Perl DBI module connects to the MySQL database and allows it to be operated on. SQL queries specify the different manipulations of the data: insert, delete, update, and joining information from different tables so that it can be visualized in one table.
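The transfer into MySQL can be pictured with a sketch of the kind of script such a cron job would run (the file format, table and column names are hypothetical; in the real chain a C program first dumps the new Mizzi versions):

#!/usr/bin/perl
# Sketch of the periodic sync step run from cron (hypothetical dump format and
# schema): read the entries extracted from Mizzi and insert the new versions into MySQL.
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('DBI:mysql:database=keytabledb;host=localhost',
                       'writer', 'secret', { RaiseError => 1 });
my $ins = $dbh->prepare(
    'INSERT IGNORE INTO keytable (cna_key, version, server, db, tab, major, minor)
     VALUES (?, ?, ?, ?, ?, ?, ?)');

open my $dump, '<', '/tmp/mizzi_keytable.dump' or die "cannot open dump: $!";
while (my $line = <$dump>) {
    chomp $line;
    my @fields = split /\s+/, $line;   # cna_key version server db tab major minor
    next unless @fields == 7;
    $ins->execute(@fields);
}
close $dump;
$dbh->disconnect;

A script of this kind would be invoked periodically, for example once per hour, from a crontab entry such as: 0 * * * * perl /path/to/sync_keytable.pl (path and schedule are again only illustrative).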

Figure 6.2: The data chain from client to the user.

The Apache Web server is used to reply to the users' requests.

A client sends a request to a server according to the http protocol, asking for information, and the server responds. When the server receives the query, the Perl DBI program interacts with the database (MySQL) and extracts the proper information. The Perl CGI program can access the information in the database and format the results as HTML.

In the next section the functionality of the developed applications is presented.

6.2.1 The function of Web applications

The developed applications are available as links on a Web page of the Hera-B database group's site. Below, a brief explanation of each of the linked programs is presented. The appearance of the Web page with the CnA key revision tools is shown in fig. 6.3.

• The first program allows Arte users to obtain the keytable for a given run or to convert the keytable number to the run number (see fig. 6.4); a minimal sketch of such a lookup is given after this list. The keytable numbers are presented as links to the list of all calibration and alignment tables. This allows users to see all calibration and alignment tables that were used in the given run, the list of server names, and the primary and secondary keys (major and minor) (see fig. 6.5).

• The program run/release version gives information on how many releases are valid for a given run (see fig. 6.6). It allows the user to understand how many times the different indexes were updated after changes of the calibration or alignment constants.

• The table of keybook revisions gives all keytable release versions for each run range and keytable range (see fig. 6.7). It allows the user to see the changes for all runs.
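The run-to-keytable lookup mentioned in the first item can be sketched as a CGI+DBI program of the following kind (the table and column names are hypothetical and the error handling is reduced to the bare minimum):

#!/usr/bin/perl
# Sketch of the run -> keytable lookup as a CGI+DBI program (hypothetical schema):
# the run number arrives as a CGI parameter, the keytable number comes from MySQL.
use strict;
use warnings;
use CGI;
use DBI;

my $q   = CGI->new;
my $run = $q->param('run') || 0;
$run =~ s/\D//g;                       # keep digits only, a simple sanity check

my $dbh = DBI->connect('DBI:mysql:database=keytabledb;host=localhost',
                       'reader', 'secret', { RaiseError => 1 });
my ($cna_key, $version) = $dbh->selectrow_array(
    'SELECT cna_key, version FROM run_keytable WHERE run = ?', undef, $run);
$dbh->disconnect;

print $q->header('text/html'), $q->start_html("Keytable for run $run");
print defined $cna_key
    ? $q->p("Run $run uses CnA key $cna_key, release version $version.")
    : $q->p("No keytable found for run $run.");
print $q->end_html;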

The next few tools are used to indicate which update of the CnA tables has been performed and which indexes have to be updated in the keytables. The purpose of these tools is to make the subcomponent experts independent of the database administrator. The subcomponent experts have to indicate which changes have been made: on which server and database, and whether a new table has been created or an old one has been updated.

• A list of subcomponent entries: shows all servers and tables for the selected subdetector which have been modified and have to be submitted, i.e. need a keyrelease. The database administrator can use this tool to see if there is new information to be updated: it allows a review of the current changes which require a keyrelease (see fig. 6.9).

• Insert the subcomponent changes: inserts information about which tables have been modified. One of the options of this program supports the keyrelease phase: the users insert into the database all the parameters which are needed to produce a keyrelease. There are three options for making an update on the keytables, depending on the kind of indexes, and two options according to whether it is necessary to create a new table reference in the keytable or to update an old one. The first group of options defines the possibility to create an index by run number, by time stamp or in some other way chosen by the subgroup expert. The second group of options is used if during the reprocessing a new table is created or a new server or database is used; in this case the creation of a new reference in the keytable is required.

In the program called “Insert a new entry”, seven combinations of these two groups of criteria are presented (see fig. 6.11). The options run and time stamp update an index which was created by run number or time stamp. For the option “major and minor”, the major and minor of the CnA tables are required.

The “forced” run, time stamp, and major and minor options allow the creation of new tables.

Figure 6.3: The Web page with links to MySQL applications.

Figure 6.4: Run-Keytable conversion.

The option “forced all” activates the possibility of making a keyrelease, i.e. it sets a group of tables to be updated: the keytable range (the keytables for a group of runs with the same CnA conditions), the server and table name which are going to be updated, and the major and minor indexes which correspond to the proper tables.

The option force gives all the parameters which completely define the keyrelease.

The next step is to extract this information from MySQL and to produce an update of the keytable.

• Delete subcomponent changes: in case of errors when using the program “Insert entries”, the user can immediately delete still unsubmitted settings for the selected subdetector, server and table (see fig. 6.10).

• Making a new revision: sets the release number, the run validity range and the keytable validity range, and indicates this new release version online (see fig. 6.12). The option ”submit” indicates whether this release version exists only in MySQL (position ”no”) or the keyrelease has already been done for it. This indication appears in the tool Table of release versions.

Figure 6.5: Keytable addresses: server names, databases and tables, indicated by major and minor.

Figure 6.6: Keytable releases per given run.

Figure 6.7: The revision table with a run validity per given CnA release.

Figure 6.8: Run-Keyrelease conversion.

Figure 6.9: List of recent changes which require a keyrelease.

Figure 6.10: Delete component entries.

Figure 6.11: The keyrelease options set.

Figure 6.12: Application for inserting a new Revision.

Chapter 7

Conclusions

Nowadays HEP experiments are becoming more demanding in terms of data volume and system complexity, so subsystems such as databases are becoming more relevant and essential for the overall performance of the whole machine.

The complexity of the detector, the magnitude of the data and the real-time needs of the system impose strong efficiency requirements, implying that the database system plays a central role in the overall efficiency.

In the Hera-B experiment, the database group is responsible for the management, maintenance and monitoring of the setup, and for the run and event meta-data.

It provides the management infrastructure for the detector configuration databases and the calibration, alignment, slow control and data quality information.

The application developed and described in this thesis concerns the automation of the calibration and alignment revision process. It serves the special kind of database which keeps the relation objects between all the calibration and alignment databases corresponding to a given run.

The reduced database crew made it necessary to develop Web applications which centralize the information in case of changes and are not time consuming.

The programs are written in Perl CGI, using a MySQL database to store the data. The developed Web applications support analysis users and subcomponent experts.

The solution to these requirements was to develop Web applications that centralize the information in case of changes and provide an easier interface towards the database.

Chapter 8

Glossary

Apache An open source HTTP server. It features highly configurable error messages, DBM-based authentication databases, and content negotiation.

API Application Programming Interface, the interface by which an application program accesses the operating system and other services. It defines how the client's programs can connect and communicate with a server.

ARTE or Analysis and Reconstruction Tool, the central place where the general data structures for the Hera-B software are defined. The data are stored in 2-dimensional structures called ARTE tables. It is developed and tested under UNIX systems.

Berkeley Database An embedded database library which is used in Hera-B as the physical storage level of the slow control, CnA and monitoring data.

Client-server technology A common form of distributed system in which the software is split between server tasks and client tasks. A client sends requests to a server, according to some protocol, asking for information or action, and the server responds.

DBI Database Interface, a database access interface. It defines a set of methods, variables and conventions that provide a consistent database interface, independent of the actual database being used.

DBMS The Database Management System allows a user to define the data to be stored in terms of a data model.

CGI Common Gateway Interface, a standard for running external programs from a World-Wide Web HTTP server. CGI specifies how to pass arguments to the executing program as part of the HTTP request.

Index A sequence of (key, pointer) pairs where each pointer points to a record in a database which contains the key value in some particular field. The index is sorted on the key values to allow rapid searching for a particular key value, using e.g. binary search. The index is ”inverted” in the sense that the key value is used to find the record rather than the other way round. For databases in which the records may be searched on more than one field, multiple indices may be created, sorted on those keys.

An index may contain gaps to allow new entries to be added in the correct sort order without always requiring the following entries to be shifted out of the way.

HTML is a hypertext document format used on the World Wide Web.

MIZZI A server framework or so-called data definition layer which is used together with the Berkeley Database in the Hera-B database system.

MySQL An open source relational database management system. It is flexible and well suited to middle-sized databases.

Perl the Practical Extraction and Report Language, with a highly flexible syntax and concise regular expression operators.

Reprocessing The reprocessing of raw data with a new (improved) calibration and alignment.

Relational Database Model allows the definition, storage and retrieval operations and integrity constraints. In this model the data and the relations between them are organized in tables. Certain fields may be designated as keys, which means that searches for specific values of those fields will use indexing to speed them up.

RDBMS Relational Database Management System, the management layer over a relational database. Relationships are only specified at retrieval time. Relational databases are classed as dynamic database management systems.

Script A program expected to respond within some small upper limit of response time.

Scripting languages Also referred to as glue languages or system integration languages. They offer great functionality, flexibility and compatibility.

SQL Structured Query Language, a standard for the definition and management of persistent, complex objects; it provides basic language constructs for defining and manipulating tables of data.

Web An Internet client-server hypertext distributed information retrieval system which originated at CERN.

Chapter 9

Acknowledgement

I am grateful to Leandar Litov and Bernhard Schmidt for their support and their belief in me. I am grateful to Vasco Amaral for his advice, precise editing and understanding. Special thanks to Patricia Conde, who taught me a lot of important things; I am grateful for her continuous care. Thanks to Maxym Titov, Luis Silva and Joao Batista for their help, advice and patience.

Bibliography

[1] HERA-B Collaboration, An Experiment to Study CP Violation in the B System Using an Internal Target at the HERA Proton Ring, DESY-PRC 94/02 (1994).

[2] M. Bruinsma, Performance of the First Level Trigger of HERA-B and Nuclear Effects in J/ψ Production.

[3] J. M. Hernandez, PC Farms for Triggering and Online Reconstruction at HERA-B, HERA-B.

[4] HERA-B Collaboration, Plans for HERA-B after the Shutdown 2003, 15-04-2003.

[5] R. Ramakrishnan, J. Gehrke, Database Management Systems, second edition, McGraw-Hill, 2000.

[6] P. DuBois, MySQL, InfoDAR, first edition, 2002.

[7] V. Amaral, The Role of Databases in High Energy Physics Experiment Systems - The HERA-B Case Study, Technical note, Ref. HERA-B 03-023, Subgroup 03-006, Hamburg, Germany.

[8] V. Amaral, Database System.

[9] S. Brenner, E. Aoki, Introduction to CGI/Perl, 1996.

[10] J. K. Ousterhout, Higher Level Programming for the 21st Century.

[11] V. Amaral, L. Silva, J. Batista, A. Amorim, Standard Multi-tier Software Technologies for HEP Solutions, HERA-B Technical note, Ref. 02-045, Software 02-011.

[12] A. Amorim, V. Amaral et al., The HERA-B Database Management for Detector Configuration, Calibration, Alignment, Slow Control and Data Classification, CHEP 2000, Padova, Italy.

[13] V. Amaral, A. Amorim et al., Operational Experience Running the HERA-B Database System, Proc. of CHEP'01, International Conference on Computing in High Energy and Nuclear Physics (Sept. 2001), Science Press 6-11:396-397, 2001, Beijing, China.

[14] M. Dam, A. Gellrich et al., HERA-B Data Acquisition System, October 2002.

[15] V. Amaral, G. Moerkotte et al., Studies for Optimisation of Data Analysis Queries for HEP Using HERA-B Commissioning Data.

[16] Azalov, The Databases, third edition, 1992.

[17] Database Managers, HERA-B 99-121, Software 99-019.

[18] D. Olson, J. Siegrist, D. Quarrie et al., Data Access and Analysis of Massive Datasets for High-Energy and Nuclear Physics.