
Reliability and Availability of the Large Hadron Collider (LHC) Machine Protection System

Jan Uythoven

CERN, Geneva, Switzerland

Thanks to R. Schmidt, B. Goddard, R. Filippini* and the many other colleagues working on the LHC Protection System

*Presently at PSI, Zürich


Jan Uythoven, CERN

ITER RAMI Workshop 6-7 December 2007


The Large Hadron Collider (LHC) at CERN - Geneva

The world's largest particle accelerator, with a circumference of 27 km

1232 Superconducting dipole magnets operating at 1.9 K

Operation with beam foreseen for 2008


LHC Layout


LHC Stored Energy

For nominal beam intensity at 7 TeV:
• Energy stored in one beam: 360 MJ
• Energy stored in the superconducting magnets: 10 GJ

[Figure: energy stored in the beam [MJ] versus momentum [GeV/c], logarithmic axes. Labelled points: ISR, SNS, LEP2, SPS fixed target, HERA, TEVATRON, SPS ppbar, RHIC proton, SPS batch to LHC, LHC injection (12 SPS batches), LHC top energy, LHC energy in magnets. Annotation: factor ~200.]

Energy to heat and melt one kg of copper: 700 kJ
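As a rough cross-check of the stored-energy figures, the sketch below recomputes the energy of one beam and converts it into an equivalent mass of melted copper. The bunch count (2808) and bunch population (1.15e11 protons) are assumptions taken from the nominal LHC design and do not appear on the slide; the 700 kJ/kg figure is the one quoted above.

```python
# Back-of-the-envelope check of the stored beam energy and its damage potential.
# ASSUMPTION (not on the slide): nominal filling of 2808 bunches with
# 1.15e11 protons per bunch.
EV_TO_JOULE = 1.602176634e-19  # one electronvolt in joules (exact SI value)


def beam_energy_joules(n_bunches, protons_per_bunch, energy_ev):
    """Total energy stored in one beam, in joules."""
    return n_bunches * protons_per_bunch * energy_ev * EV_TO_JOULE


def copper_melted_kg(energy_joules, joules_per_kg=700e3):
    """Mass of copper the energy could heat and melt (700 kJ/kg from the slide)."""
    return energy_joules / joules_per_kg


stored = beam_energy_joules(2808, 1.15e11, 7e12)
print(f"stored energy per beam: {stored / 1e6:.0f} MJ")
print(f"copper melted by one beam: {copper_melted_kg(stored):.0f} kg")
```

Under these assumptions the result lands close to the 360 MJ quoted for one beam, which corresponds to roughly half a tonne of copper.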


Quench Protection and Energy Extraction System

• When one magnet quenches, quench heaters are fired for this magnet
• The current in the quenched magnet decays in about 200 ms
• The current in series from the other magnets flows through the bypass diode, which can stand the current for about 100-200 seconds

[Diagram: Power Converter feeding a series chain of Magnet 1, Magnet 2, … Magnet i, … Magnet 154]


13 kA Energy Extraction in tunnel adjacent to accelerator

Resistors absorbing the energy

Switches - for switching the resistors into series with the magnets


Quench Protection and Energy Extraction System

• 8 separate systems: one for each sector
• Energies per sector similar to the HERA and Tevatron accelerators
• Needs to work very reliably, as damage potential is huge
• Reliability studies of the system have been done
• ‘Traditional’ technologies
• Limited dependence on other systems
• This talk is mainly on protection from beam energy

PhD thesis A. Vergara: http://documents.cern.ch/cgi-bin/setlink?base=preprint&categ=cern&id=cern-thesis-2004-019


How to protect the machine from the Beam Energy?

Machine Protection System which
• Detects “any fault” in the machine:
– Hardware not working properly, although fault tolerant design of safety critical systems
– Effect of failures, including beam instabilities, leading to beam losses
• Safely dumps the beam before it can cause any damage
– Fast reaction time
– Beams to be dumped within 3 turns of detection of problem = 300 µs

Beam Dump Block: where the beam should go in case of any ‘problems’ detected
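A quick check of the three-turn reaction time, using the 27 km circumference quoted earlier in the talk and assuming the beam travels at essentially the speed of light. This is only an order-of-magnitude sketch; the function names are illustrative.

```python
# Revolution period of a particle moving at (nearly) the speed of light
# around a ring with the quoted 27 km circumference.
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def revolution_period_s(circumference_m):
    """Time for one turn, in seconds."""
    return circumference_m / SPEED_OF_LIGHT


def dump_delay_s(turns, circumference_m=27_000.0):
    """Time taken by the given number of turns, in seconds."""
    return turns * revolution_period_s(circumference_m)


three_turns = dump_delay_s(3)
print(f"one turn:    {revolution_period_s(27_000.0) * 1e6:.1f} microseconds")
print(f"three turns: {three_turns * 1e6:.0f} microseconds")
```

One turn takes about 90 microseconds, so three turns come to a little under 300 microseconds.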


Systems detecting failures and LHC Beam Interlocks

[Diagram: systems feeding the Beam Interlock System, which acts on the Beam Dumping System and the Injection Interlock. Labels, in slide order: Powering Interlocks sc magnets; Powering Interlocks nc magnets; QPS (several 1000); Power Converters (~1500); AUG; UPS; Power Converters; Magnets; Magnet Current Monitor; Cryo OK; RF System; Movable Detectors; LHC Experiments; Beam Loss Monitors BCM; Experimental Magnets; Collimation System; Collimator Positions; Environmental parameters; Transverse Feedback; Beam Aperture Kickers; Beam Lifetime FBCM; Screens / Mirrors BTV; Access System; Doors EIS; Vacuum System; Vacuum valves; Access Safety Blocks; RF Stoppers; Beam loss monitors BLM; Special BLMs; Monitors at aperture limits (some 100); Monitors in arcs (several 1000); Timing System (Post Mortem Trigger); Operator Buttons CCC; Safe LHC Parameter; Software Interlocks; LHC Devices; Sequencer; Safe Beam Parameter Distribution; Safe Beam Flag. Annotation: little beam dependence.]


Principle of the LHC Machine Protection System

• ‘User systems’ can detect failures and send hardwired signal to beam interlock system

• Range from Experimental Detectors to Vacuum Valves

• Each user system provides a status signal, the user permit signal.

• The beam interlock system combines the user permits and produces the beam permit

• The beam permit is a hardwired signal that is provided to the dump kicker

• The Beam Dumping System combines many high technology techniques

[Diagram: user permit signals enter the Beam Interlock System, which provides the beam ‘permit’ to the LHC dump kicker. Labels: hardware links / systems, fully redundant; “many different technologies” (shown twice).]
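The permit logic described on this slide can be sketched as a logical AND over the user permits. The class, field and channel names below are hypothetical, and the masking rule is a simplified reading of the safe beam flag mentioned later in the talk; the real system is hardwired, so this only illustrates the logic.

```python
from dataclasses import dataclass


@dataclass
class UserPermit:
    """Status signal of one user system (all names here are illustrative)."""
    name: str
    ok: bool
    maskable: bool = False  # hypothetical: channel may be ignored for a safe beam


def beam_permit(permits, safe_beam_flag=False):
    """Return True only if every user permit is present.

    A missing permit on a maskable channel is ignored while the
    safe beam flag is set; any other missing permit removes the beam permit.
    """
    for permit in permits:
        if permit.ok:
            continue
        if permit.maskable and safe_beam_flag:
            continue
        return False
    return True


permits = [
    UserPermit("vacuum_valves", ok=True),
    UserPermit("beam_loss_monitors", ok=True),
    UserPermit("experiment_detector", ok=False, maskable=True),
]
print(beam_permit(permits))                       # one permit is missing
print(beam_permit(permits, safe_beam_flag=True))  # missing channel is masked
```

Losing the beam permit is what would request the dump; the sketch deliberately leaves out timing, redundancy of the signal paths and the dump kicker itself.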


Organisation for the LHC

• Machine Protection includes many different hardware systems
• Many different departments and groups responsible for their equipment
• Coordination of machine protection by two working groups
– General coordination: definition of the system
– Commissioning working group: accent is on procedures to be applied
• Reviews and external audits are used for obtaining external advice
– General review of the LHC Machine Protection System
– Audit of Beam Interlock Controller done
– Audit of Beam Dumping System planned
– Audit of Beam Loss Monitoring System requested


Requirements concerning Machine Protection System

Safety Assessment (‘reliability’)
• IEC 61508 standard defining the different Safety Integrity Levels (SIL), ranking from SIL1 to SIL4
• Based on Risk Classes = Consequence x Frequency
• Machine Protection System for the LHC should be SIL3, taking the definition of Protection Systems, with a probability of failure between 10^-8 and 10^-7 per hour (because of short mission times)
• Catastrophe = beam should have been dumped and this did not take place; can possibly cause large damage

Availability
Definition:
• Beam is dumped when it was not required
• Operation cannot take place because the protection system does not give the green light (is not ready)
• Definition not according to any standard
Requirement:
• Downtime comparable to other accelerator equipment; maximum tens of operations per year


Approach Adopted: “Strategy”

• End of the ’90s: start of an “Interlock Manager”, which later continued as a Machine Protection System
– Until then Particle Accelerators mainly considered Equipment Protection
– Since then ‘Machine Protection’ has become a common approach in high power accelerators
• Dual Approach
– Prevent fault at the source (= old fashioned approach) &
– Detect the effect resulting from any fault, including beam instabilities, and react fast enough to prevent damage
• Deployment in SPS accelerator to test concepts


Are the requirements fulfilled?

Reduce the Protection System to the basic elements. The other systems give an additional protection.

• BIC: Beam Interlock Controller
• LBDS: Beam Dumping System
• BLM: Beam Loss Monitors (6 BLMs per sc quad, 4000 in total)
• PIC: Power Interlock Controller
• QPS: Quench Protection System


Systems detecting failures and LHC Beam Interlocks

[Diagram repeated from the earlier slide of the same title: all systems feeding the Beam Interlock System, the Beam Dumping System and the Injection Interlock.]


Main Systems

• Thorough design from the start
– Based on redundancy
• For each of the 5 main components of the Machine Protection System, dependability numbers (= reliability & availability) have been calculated
– Basically one PhD thesis per system!
– Some details for the Beam Dumping System calculations are given later
– Assume operational scenario
• Combination of these numbers gives the Machine Protection Dependability estimate
– Shows weak links


Resulting Unsafety and Availability Numbers

System       Unsafety probability per year                               False dumps per year (average +/- std. dev.)
LBDS (OP1)   2.4 × 10^-7 (2x)                                            4 (2x) +/- 1.9
BIC          1.4 × 10^-8                                                 0.5 +/- 0.5
BLM          1.44 × 10^-3 (front-end), 0.06 × 10^-3 (back-end VME)       17 +/- 4.0
PIC          0.5 × 10^-3                                                 1.5 +/- 1.2
QPS          0.4 × 10^-3                                                 15.8 +/- 3.9
MPS          2.3 × 10^-4 = 5.75 × 10^-8/h (SIL3)                         41 +/- 6.0

ASSUMPTIONS
• Operational scenario: 200 days/year of operations: 400 beam operations (10 h each) followed by checks (2 h).
• Diagnostics effectiveness: LBDS and BIC “as good as new” after checks (BLM, partially); QPS and PIC “as good as new” after periodic inspection or power abort.
• DR (dump request) apportionment: 60% planned dumps, 15% fast beam losses, 15% slow beam losses, 10% others.
• Redundancy: no cross-redundancy within the Beam Loss Monitors (P = 0, worst-case).


Sensitivity of Safety to the Model Parameters

Sensitivity to the type of dump request
• The fast beam losses contribute by two orders of magnitude more to the overall unsafety.
• With 45% of fast beam losses assumed instead of 15%, safety moves from 2.3 × 10^-4 /y to 6.8 × 10^-4 /y (SIL2).

Sensitivity to the redundancy of the BLM
• Same dump request apportionment, but a beam loss is detectable by two monitors with a probability 0<P<1.
• If P moves from 0 to 1, the safety will be recovered from 6.8 × 10^-4 /y to 2.8 × 10^-5 /y (SIL 4).

RESULTS on LOG scale!
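The SIL labels attached to the yearly unsafety figures can be reproduced with a short calculation. The sketch assumes that the yearly numbers are converted to hourly ones by dividing by the 400 × 10 h of beam operation in the stated scenario (this reproduces the quoted 5.75 × 10^-8/h), and it uses the IEC 61508 bands for high-demand or continuous mode. The function names are illustrative.

```python
# SIL bands as probability of dangerous failure per hour
# (IEC 61508, high-demand / continuous mode): (level, lower bound, upper bound).
SIL_BANDS = [
    (4, 1e-9, 1e-8),
    (3, 1e-8, 1e-7),
    (2, 1e-7, 1e-6),
    (1, 1e-6, 1e-5),
]

# ASSUMPTION: 400 missions of 10 h each count as the exposure time per year.
OPERATING_HOURS_PER_YEAR = 400 * 10


def per_hour(unsafety_per_year, hours=OPERATING_HOURS_PER_YEAR):
    """Convert a yearly unsafety probability into an hourly one."""
    return unsafety_per_year / hours


def sil_level(failure_prob_per_hour):
    """Return the SIL whose band contains the value, or None if outside all bands."""
    for level, low, high in SIL_BANDS:
        if low <= failure_prob_per_hour < high:
            return level
    return None


cases = [
    ("baseline (15% fast losses, P = 0)", 2.3e-4),
    ("45% fast losses", 6.8e-4),
    ("45% fast losses, P = 1", 2.8e-5),
]
for label, yearly in cases:
    hourly = per_hour(yearly)
    print(f"{label}: {hourly:.2e} per hour, SIL{sil_level(hourly)}")
```

With this conversion the three cases fall into SIL3, SIL2 and SIL4 respectively, matching the labels on the slides.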


Failure Rates of a Single Sub-System

(…open brackets…

[Unsafety and false-dump table repeated from the “Resulting Unsafety and Availability Numbers” slide; label: LHC Beam Dumping System]


The LBDS
LHC Beam Dumping System

LBDS inventory
Extraction:   15 Kicker Magnets + 15 generators; 15 Septum Magnets + 1 power converter
Dilution:     10 Kicker Magnets + 10 generators
Absorption:   One dump block
Electronics:  Beam energy measurement (BEM); Beam energy tracking (BET); Triggering and re-triggering; Post mortem diagnostics (check of every beam dump)
Beam line:    975 m from extraction point to TDE

1) MKD: The 15 kicker magnets deflect the beam horizontally
2) Q4: The quadrupole enhances the horizontal deflection
3) MSD: The 15 septum magnets deflect the beam vertically
4) MKB: The 10 kicker magnets dilute the beam energy
5) TDE: The beam is absorbed in a graphite block

[Figure: the beam sweep at the front face of the TDE absorber at 450 GeV]


The LBDS: Safety in Design
Fault Tolerant Features

No single point of failure should exist in the LBDS
• Redundancy is introduced to allow failures up to a certain threshold.
• Surveillance detects failures and issues a fail safe dump request.

[Diagram annotations, in slide order]
• Redundancy: 14 out of 15 MKD, 1 out of 2 MKD generator branches. Surveillance: energy tracking, retriggering.
• Redundancy: 1 out of 4 MKBH, 1 out of 6 MKBV. Surveillance: energy tracking.
• Surveillance: energy tracking, fast current change monitoring.
• Redundancy: 1 out of 2 trigger generation and distribution. Surveillance: synchronization tracking.
• Surveillance: TX/RX error detection, voting of inputs.
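To give a feeling for what the "k out of n" figures buy, the sketch below evaluates the probability of losing a function for the redundancy schemes listed on this slide. It assumes identical, independent units and a purely hypothetical per-unit failure probability of 10^-3 per mission; neither assumption comes from the talk.

```python
from math import comb


def prob_at_least_k_work(k, n, p_fail):
    """Probability that at least k of n identical, independent units work."""
    p_ok = 1.0 - p_fail
    return sum(comb(n, i) * p_ok**i * p_fail ** (n - i) for i in range(k, n + 1))


P_FAIL = 1e-3  # HYPOTHETICAL per-unit failure probability over one mission

schemes = [
    ("14 out of 15 MKD", 14, 15),
    ("1 out of 2 MKD generator branches", 1, 2),
    ("1 out of 4 MKBH", 1, 4),
    ("1 out of 6 MKBV", 1, 6),
]
for label, k, n in schemes:
    lost = 1.0 - prob_at_least_k_work(k, n, P_FAIL)
    print(f"{label}: P(function lost) = {lost:.3e}")
```

The absolute values depend entirely on the assumed unit probability; the point is the relative gain of each scheme over a single unit.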


The Modeling Framework

FMECA = Failure Modes, Effects and Criticality Analysis

No detailed assessment of fault consequences. Two failure modes only:

•Fail Safe

•Fail Unsafe


Reliability Prediction

• Failure rates are deduced at component level from standard literature (i.e. Military Handbook 217F).
• The logic expressions of the failure modes are translated into probabilities and into failure rates.
• Example: the failure mode F1MKD of the MKD system:

1. Logic Expression

2 out of 15 [(PT1A AND PT1B) OR (SP1A AND SP1B) OR (SC1A AND SC1B) OR (CP2A AND CP2B) OR (COS12A AND COS12B) OR (COS22A AND COS22B) OR M]

2. Probability

$$P_{F1\,MKD} = 1 - 15\,(1 - P_{MKD\_silent})^{14} + 14\,(1 - P_{MKD\_silent})^{15}$$

$$P_{MKD\_silent} = 1 - (1 - P_{PT1}^{2})(1 - P_{SP1}^{2})(1 - P_{SC1}^{2})(1 - P_{CP2}^{2})(1 - P_{COS12}^{2})(1 - P_{COS22}^{2})(1 - P_{M})$$

3. Failure rate

$$P_{F1\,MKD}(t) = 1 - e^{-\int_0^t \lambda_{F1\,MKD}(\tau)\,d\tau} \quad\Longrightarrow\quad \lambda_{F1\,MKD}(t) = \frac{dP_{F1\,MKD}(t)/dt}{1 - P_{F1\,MKD}(t)}$$
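The three steps of the example (logic expression, probability, failure rate) can be followed numerically. The sketch below assumes that the two branches of each redundant pair are identical and independent, and all component probabilities are hypothetical placeholders rather than values from the study.

```python
from math import comb, exp


def p_silent(pair_probs, p_magnet):
    """OR over the redundant pairs and M; a pair fails only if both branches fail."""
    survive = 1.0 - p_magnet
    for p in pair_probs:
        survive *= 1.0 - p * p  # (A AND B) for two identical, independent branches
    return 1.0 - survive


def p_two_or_more_closed(p, n=15):
    """Closed form for 'at least 2 out of n': 1 - n*q^(n-1) + (n-1)*q^n, q = 1 - p."""
    q = 1.0 - p
    return 1.0 - n * q ** (n - 1) + (n - 1) * q**n


def p_two_or_more_binomial(p, n=15):
    """Same quantity as an explicit binomial sum, used as a cross-check."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(2, n + 1))


def hazard_rate(prob, t, dt=1e-3):
    """Failure rate (dP/dt) / (1 - P), with a central-difference derivative."""
    dp_dt = (prob(t + dt) - prob(t - dt)) / (2.0 * dt)
    return dp_dt / (1.0 - prob(t))


# HYPOTHETICAL inputs: six redundant pairs at 1e-3 per branch, magnet at 1e-5.
p_one_mkd = p_silent([1e-3] * 6, 1e-5)
print(f"P(one MKD silent)       = {p_one_mkd:.3e}")
print(f"P(2 or more of 15)      = {p_two_or_more_closed(p_one_mkd):.3e}")
print(f"closed form vs binomial = {p_two_or_more_closed(0.05):.6f} "
      f"vs {p_two_or_more_binomial(0.05):.6f}")

# Sanity check of the rate formula on an exponential law with constant rate 0.01.
print(hazard_rate(lambda t: 1.0 - exp(-0.01 * t), 5.0))
```

The last line recovers the constant rate of the exponential law, which is the expected behaviour of the failure-rate definition in step 3.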


Results
Failure Modes and Rates of the LBDS

MKD

The FMECA and reliability prediction have been performed for all sub-systems in the LBDS.

More than 2100 failure modes have been classified at component level.

They have been arranged into 21 failure modes at system level.


Operation Scenarios for one Mission
State Transition Diagrams

Failsafe rates FS\Xk are decreasing with time

Fail unsafe rates FU\Xk are increasing with time

STATES

Available X0

X1 (no BETS)

X2 (no RTS)

X3 (no BETS, RTS)

Failsafe X4

Failed unsafe X5

Compact State Based Approach


State Transition Diagrams
The Sequence of Missions and Checks

Missions are driven either by internal false dumps or by external dump requests.

At checks the system is recovered to the initial state.

The process starts in X = 0 of Mission 1 and stops when one year of operation is reached.

The sequence of N missions and checks is a non-homogeneous Markov process of 2N5 states.


Operational Scenario

• Missions of random duration alternate with 2 hours of checks, over 200 days of operations.
– In addition to a false dump, the end of the mission is determined by an external dump request, which is either a planned dump request (Weibull) or beam induced.

• The dump request rate is:

[Equation not recoverable from the transcript. Its legend names a shape factor, a scale factor, the value at t = 0 and the value at a second time whose symbol was lost.]

Planned dump: parameters 5 and 1/11 (symbols lost)

Beam induced dump: parameters 0.001 and 0.1 (symbols lost; the second carries a subscript 0)

[Figure: distribution of dump requests]
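The planned-dump values 5 and 1/11 are consistent with the 10 h missions of the operational scenario if they are read as a Weibull shape factor and a scale factor in 1/h. That reading is an assumption, since the parameter symbols did not survive in the transcript. The sketch below evaluates the standard Weibull failure rate and mean under it.

```python
from math import gamma

SHAPE = 5.0          # ASSUMED to be the shape factor of the planned-dump law
SCALE_RATE = 1 / 11  # ASSUMED to be the scale factor, in units of 1/h


def weibull_hazard(t, shape=SHAPE, rate=SCALE_RATE):
    """Weibull rate in the rate parametrisation: shape * rate * (rate*t)^(shape-1)."""
    return shape * rate * (rate * t) ** (shape - 1)


def weibull_mean(shape=SHAPE, rate=SCALE_RATE):
    """Mean time to the event: Gamma(1 + 1/shape) / rate."""
    return gamma(1.0 + 1.0 / shape) / rate


print(f"mean time to planned dump: {weibull_mean():.1f} h")
for t in (2, 6, 10, 12):
    print(f"t = {t:2d} h: planned dump request rate = {weibull_hazard(t):.3f} per hour")
```

Under this reading the mean mission length comes out at about 10 h, and the request rate rises steeply towards the end of a mission, as expected for a planned dump.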


…close brackets…)

[Unsafety and false-dump table repeated from the “Resulting Unsafety and Availability Numbers” slide; label: LHC Beam Dumping System]

PhD thesis Roberto Filippini: http://doc.cern.ch/archive/electronic/cern/preprints/thesis/thesis-2006-054.pdf

Availability of other systems not studied; can be done if required


Also Analysis being done with a Different Approach

Hybrid methodology combining fault tree for component failure rates and simulations in the time domain for the complete system

Results concerning protection system reliability and beam availability

Option to disable part of a system and see the effect

Collaboration with Laboratory for Safety Analysis, ETH Zürich, Sigrid Wagner

Ongoing…

[Diagram: hybrid methodology. System level: global frame, agent-based approach. Component level: established reliability method, e.g. fault tree. Labels: User System, BIS, LBDS, MPS, dangerous conditions, Dump, fault tree, individual failure behaviour.]


Key issues concerning Design of Sub-Systems

Requirements to obtain a “safe system”
• No single point of failure
– Redundancy of critical components
– Redundancy of signal paths between (sub-)systems
• Periodic checks to get back to a state which is ‘as good as new’
– Failure rates of redundant systems increase in time; the checks bring them back to zero (different from aging)
• Surveillance of critical signals
• Safe mission abort
• Trade off between availability and reliability
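The remark that failure rates of redundant systems grow in time and return to zero after a check can be illustrated with a 1-out-of-2 pair of branches, each failing at a constant rate. The rate used below is hypothetical, and a check is assumed to restore the pair fully to "as good as new".

```python
from math import exp


def p_pair_failed(rate, t):
    """Probability that both branches of a 1-out-of-2 pair have failed by time t."""
    return (1.0 - exp(-rate * t)) ** 2


def pair_hazard(rate, t):
    """Failure rate of the pair, (dP/dt) / (1 - P): zero at t = 0, rising with t."""
    single = 1.0 - exp(-rate * t)
    dp_dt = 2.0 * single * rate * exp(-rate * t)
    return dp_dt / (1.0 - single**2)


def p_failed_with_checks(rate, total_time, check_interval):
    """Failure probability when each check restores the 'as good as new' state."""
    n_intervals = total_time / check_interval
    return 1.0 - (1.0 - p_pair_failed(rate, check_interval)) ** n_intervals


RATE = 1e-4    # HYPOTHETICAL branch failure rate per hour
YEAR = 4000.0  # operating hours per year, as in the operational scenario

print(f"pair failure rate at 0 h, 10 h, 1000 h: "
      f"{pair_hazard(RATE, 0.0):.2e}, {pair_hazard(RATE, 10.0):.2e}, "
      f"{pair_hazard(RATE, 1000.0):.2e}")
print(f"P(pair failed) without checks:         {p_pair_failed(RATE, YEAR):.3e}")
print(f"P(pair failed) with a check every 10 h: "
      f"{p_failed_with_checks(RATE, YEAR, 10.0):.3e}")
```

With these placeholder numbers the yearly failure probability drops by more than two orders of magnitude when the redundancy is confirmed after every mission, which is why the presence of the assumed redundancy has to be tested regularly.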


Following the Design Studies and Manufacturing

• Test equipment in operational environment
– Quench Protection System operational during Hardware Commissioning of the LHC magnets
– Reliability run starting for the Beam Dumping System with about 3 months of continuous operation; can give upper limit of failure rate of most critical components because of redundancy
– Logging and Post Mortem systems (analysis of events using logging data, and special ‘fast’ buffers triggered after a beam dump) used during Hardware Commissioning
• Install similar equipment or components in operational accelerators
– Beam Interlock System installed and operational in the LHC injection chain
– Fast Magnet Current Change Monitor already operational
– Energy tracking system of the LHC beam dump working for the extraction system of the SPS injector


General Test Procedures

• Before operation with beam:
– Thorough testing required of all installed equipment
– Definition and follow-up of test procedures for the individual equipment
– Machine Protection System Commissioning Working Group which approves the test procedures
• Tests with beam required
– Define tests before going into a next beam commissioning phase
– Example: provoke a quench of a magnet and check Beam Loss Monitoring signals
– Measure delays between detection and actual beam dump
– Safe beam flag to allow masking of some interlock channels in case of low intensity / low energy beams
• How to enforce these tests?


Lessons Learned from the exercise

• Absolute failure rate levels depend largely on model assumptions, but do indicate the weak links in the system
– Confidence in relative numbers and sensitivity effects
• Hardware of some systems was adapted to obtain reliability numbers similar to the other systems
– Add redundancy
• Periodic testing, sometimes several times per day, will contribute to the safety of the system
– Test the presence of the assumed redundancy


Human Aspects

Hardware
• Design
• Dependability Studies
• Testing of prototypes
• Testing of series in Laboratory
• Testing once installed
• Tests with beam

Procedures
• Testing during production
• Testing after installation
• During Operation

• Confirm Redundancy
• Post Mortem
• Re-establish confidence

• When changing hardware
• When changing settings

Gained experience

A lot of discussions…


Example of Human Aspects

Beam accident in 2004 while extracting a high intensity beam from the SPS injector, in which a vacuum chamber was damaged

Noise on temperature sensors induced by the beam caused magnet interlock, stopping the magnet power converter

Error in the protection logic: Magnet power converter was stopped before inhibiting extraction

No clear procedures what to do: the experiment was continued without sorting out the problem

No clear responsibility: several people were in charge at the same time and nobody said ‘stop’

Created a lot of awareness of potential problems for the LHC


LHC Strategy presently under Discussion

• How to change Beam Loss Monitor thresholds & masking of signals
– Thousands of values: avoid errors
– Are they correct when put in for the first time?
– Who is allowed to adapt the thresholds?
– What will be the procedures?
• The Post Mortem Analysis of the Beam Dumping System indicates a fault
– What are the procedures to recover?
– Who can give the ‘ok’ again?
– “The same problem happened last month; after 1 day of testing we just continued. We are near the end of the physics run of this year…”
– Who is in charge?
– Will there be a group of ‘safety experts’ and what will be their role?


Conclusions

• Safety and Reliability has become an accepted topic for high power accelerators
• The LHC has a coherent Machine Protection System following interdisciplinary work for almost 20 years
• Producing dependability numbers is very time consuming and the result depends largely on the model assumptions
• However the benefits are that
– The weak links can be shown
– Designs have been adapted accordingly
– Awareness has been raised
• On paper the numbers look good, but testing is required during installation, cold check-outs and operation with beam
• Procedures during normal operation: checks required almost continuously to confirm the redundancy of the systems
• Procedures in case an abnormality is detected
– Who is responsible in the control room?
• Organisational issues will be important
– Enforcing procedures / exceptions