-
Reliability analysis of the LHC quadrupole quench protection system
TE-MPE-TM #117
13/09/2018
Miriam Blumenschein
Andrea Apollonio, Reiner Denz, Jelena Spasic, Jens Steckert, Jan Uythoven, Daniel Wollmann
-
Motivation and Overview
Motivation:
• Upgrade of the 392 quench detection units for the LHC main quadrupole magnets (DQLPU-B) in LS2
• Upgrade is part of the QPS maintenance plan
• No new quench detection functionalities
• Enhanced diagnostic functionalities
Overview
1. RIRE step by step, using the quadrupole quench protection system as an example
2. Example of a detailed study: the trigger coupling
3. Outlook
-
Objectives and principle of RIRE
Reliability Requirements and Initial Risk Evaluation (RIRE)
1. Risk matrix: accelerator reliability targets
2. Adapted FMEA: system failure behaviour
3. System reliability requirements
4. Risk evaluation: necessary reliability actions
-
Step 1: Accelerator reliability requirements
LHC risk matrix
Recovery time (columns): ∞ (S7), year (S6), month (S5), week (S4), day (S3), hours (S2), minutes (S1)
Frequency (rows): 1 / hour, 1 / day, 1 / week, 1 / month, 1 / year, 1 / 10 years, 1 / 100 years, 1 / 1000 years
-
Step 2: Adapted FMEA (system failure behaviour)
2.1) System context
2.2) System structure
2.3) System functions
2.4) Context dependent functions
2.5) Failure modes and effects
-
2.1 Context: Powering magnets and Interlock for 1 sector (out of 8)
[Figure: powering and interlock scheme of the RQD and RQF circuits for one sector. Recoverable labels:
• Magnets: Quadrupole MQ 1, MQ 2, MQ 47/51 with MQD 1 / MQF 1, MQD 2 / MQF 2, MQD 47/51 / MQF 47/51 (Beam 1, Beam 2), diodes, quench heaters; MBA, MBB, MBC
• Powering: power converter, current leads and energy extraction system (EE: resistor, switch) for RQD and RQF; even point, odd point
• Quench protection: DQHDS, DQLPU_B (1/2), DQLPU_S, DQGPU-D, Quench Loop Controller DQQLC
• Interlocks: Quench Interlock Loop QIL (open, read), PIC, BIS, FPA, Open/Close
• Signals: PC_DISCHARGE_REQUEST, PC_FAST_ABORT, CIRCUIT_QUENCH, DISCHARGE_REQUEST, beam dump request
• Legend: SC equipment to be protected, Quench Protection, Circuit quench loop, Discharge loop, Quench Interlock Loop
• Upgrade in LS2: DQLPU_B]
-
Step 2.2: System structure
End effect level:
1. Quadrupole
2. Beam operation (beam dump, injection)
Immediate effect level:
1. Energy extraction, n = 1
2. Quench heater, n = 2 * 51
3. Quench interlock loop, n = 1
Quench detection:
1. Quench Detection QD, n = 51
   1. DYPQ Yellow protection rack quadrupole, n = 47/51
-
Step 2.3: System functions
[Figure: interfaces of the DYPQ. Recoverable labels:
• Units: DYPQ, DYPB-S, DQHDS, UPS 1, UPS 2
• Magnet inputs (MQF, MQD): voltage taps ext_A, int_A, ext_B, int_B
• Interlock and trigger signals: Interlock IN, Interlock OUT, QPS_OK, DQHDS trigger, DQHDS interlock
• Supervision and tools: WinCC supervision (MQF + MQD), Expert tool
• Supervision functions: Reset, simple commands; Logging; Logging, PM; Reset, change configuration]
-
Step 2.4: Context dependent functions
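[Figure: operating contexts of the quench detection system and the transitions between them. Recoverable labels:
• Contexts: Commissioning; Normal operation (~4800 h/a); Post quench I (trigger latched, ~5-10 min); Post quench II (trigger unlatched, ~10 min); Quench event analysis (~ hours); Maintenance, repair, tests; Revalidation
• Transitions: Quench; OK: Reset detection board; Capacitor bank charged (810 V); Switching off/on; Sending post mortem data; OK; not OK]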
-
Step 2.4: Context dependent functions
Normal operation (~4800 h/a):
• Keep quench interlock loop closed
• Keep quench heater power supply charged
• …
Quench:
• Open quench interlock loop
• Discharge quench heater power supply
• …
Post quench I (trigger latched, ~5-10 min):
• Keep quench interlock loop opened
• Keep quench heater power supply latched
• …
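The context-dependent functions on this slide can be read as a lookup from operating context to required functions. The following is a minimal Python sketch of that idea. The names `Context` and `REQUIRED_FUNCTIONS` are illustrative and not part of any QPS software, the assignment of each bullet group to a context is inferred from the wording of the bullets, and the items elided on the slide (…) are left out.

```python
from enum import Enum


class Context(Enum):
    """Operating contexts named on the slide (illustrative subset)."""
    NORMAL_OPERATION = "Normal operation"
    QUENCH = "Quench"
    POST_QUENCH_I = "Post quench I (trigger latched)"


# Functions the quench detection system must provide in each context.
# Only the functions named on the slide are listed; the slide elides more.
REQUIRED_FUNCTIONS = {
    Context.NORMAL_OPERATION: [
        "Keep quench interlock loop closed",
        "Keep quench heater power supply charged",
    ],
    Context.QUENCH: [
        "Open quench interlock loop",
        "Discharge quench heater power supply",
    ],
    Context.POST_QUENCH_I: [
        "Keep quench interlock loop opened",
        "Keep quench heater power supply latched",
    ],
}


def required_functions(context: Context) -> list[str]:
    """Return the functions required in the given operating context."""
    return REQUIRED_FUNCTIONS[context]
```

Keying the functions by context makes it explicit that the same hardware has opposite duties before and after a quench, which is why the FMEA is carried out per context.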
-
Step 2.5: Failure modes and effects
FMEA black box level: quench detection system
Context | Normal operation (~4800 h/a) | Asymmetric quench | …
Function | Keep quench interlock loop closed | Open quench loop | …
ID failure mode | OP.1 | AQ.1 | …
Failure mode | Quench interlock loop opened 1oo2 or 2oo2 | Quench interlock loop not opened 1oo2 | …
Immediate effect | False energy extraction, no firing of the quench heaters, false circuit quench interlock | … | …
End effect | False beam dump | … | …
Severity of EE | 2 | … | …
Detection method | Quench interlock loop monitoring indicates loop status | … | …
FMEA report on EDMS (2010822)
https://edms.cern.ch/file/2010822/1/Report_FMEABlackBox_2018_08_03.pdf
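One way to hold such an FMEA table in code is one record per failure mode. The sketch below is an assumption about a convenient data layout, not the format of the EDMS report. The class name `FmeaRow` and its field names are hypothetical, and fields that the slide elides (…) are stored as `None` rather than filled in.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FmeaRow:
    """One failure mode of the black-box FMEA; None marks a field elided on the slide."""
    failure_mode_id: str
    context: str
    function: str
    failure_mode: str
    immediate_effect: Optional[str] = None
    end_effect: Optional[str] = None
    severity: Optional[int] = None
    detection_method: Optional[str] = None

    def is_complete(self) -> bool:
        """True when no field of the row is elided."""
        return None not in (
            self.immediate_effect,
            self.end_effect,
            self.severity,
            self.detection_method,
        )


rows = [
    FmeaRow(
        failure_mode_id="OP.1",
        context="Normal operation (~4800 h/a)",
        function="Keep quench interlock loop closed",
        failure_mode="Quench interlock loop opened 1oo2 or 2oo2",
        immediate_effect=(
            "False energy extraction, no firing of the quench heaters, "
            "false circuit quench interlock"
        ),
        end_effect="False beam dump",
        severity=2,
        detection_method="Quench interlock loop monitoring indicates loop status",
    ),
    FmeaRow(
        failure_mode_id="AQ.1",
        context="Asymmetric quench",
        function="Open quench loop",
        failure_mode="Quench interlock loop not opened 1oo2",
        # Effects, severity and detection method are elided on the slide.
    ),
]

# IDs of rows that still need to be completed from the full report.
incomplete_ids = [row.failure_mode_id for row in rows if not row.is_complete()]
```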
-
Step 2.5: Summary failure effects
Quadrupole
• DYPQ_EE1: False quenching, S2 (hours)
• DYPQ_EE2: Quadrupole damaged, S5 (month)
Beam operation
• DYPQ_EE3: Injection delayed, S2 (hours)
• DYPQ_EE4: False beam dump, S2 (hours)
• DYPQ_EE5: Missing abort trigger by DYPQ, beam dump by another protection system:
  • nQPS works: S3 (days), analysis time
  • nQPS does not work, BLM works: S5 (month), quadrupole damaged
-
Step 3: System reliability requirements
The accelerator reliability targets are allocated to the end effects.
-
Step 3: Reliability targets for failure effects
• Recovery time includes the time needed for maintenance or intervention and the time to bring the LHC back to the state at which the failure occurred
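LHC risk matrix
Recovery time (columns): ∞, year, month, week, day, hours, minutes
Frequency (rows): 1 / hour, 1 / day, 1 / week, 1 / month, 1 / year, 1 / 10 years, 1 / 100 years, 1 / 1000 years
End effects allocated to frequency rows:
• 1 / year: EE1, EE3, EE4
• 1 / 10 years: EE5
• 1 / 100 years: EE2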
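The allocation in the risk matrix can be written down as a small lookup and used to check a predicted failure frequency against its target. This is a sketch under one assumption: each frequency row is read as the maximum acceptable frequency of the end effects placed in it. The names `TARGET_PER_YEAR` and `meets_target` are illustrative.

```python
# Target frequencies in events per year, read from the risk matrix rows.
# Assumption: the row gives the maximum acceptable frequency of the end effect.
TARGET_PER_YEAR = {
    "EE1": 1.0,       # 1 / year
    "EE3": 1.0,       # 1 / year
    "EE4": 1.0,       # 1 / year
    "EE5": 1 / 10,    # 1 / 10 years
    "EE2": 1 / 100,   # 1 / 100 years
}


def meets_target(end_effect: str, predicted_per_year: float) -> bool:
    """True if the predicted frequency does not exceed the allocated target."""
    return predicted_per_year <= TARGET_PER_YEAR[end_effect]
```

For example, `meets_target("EE2", 0.001)` is true, while `meets_target("EE5", 0.5)` is false.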
-
Step 4: Risk evaluation of necessary reliability actions
Purpose:
• Estimate the necessary extent of reliability actions
-
Step 4: Risk evaluation
The necessary extent of reliability actions is estimated:
• Severity: S3 (day) to S7 (infinite)
  • 1 end effect in severity category 5
  • 1 end effect in severity category 3
• Undetectable:
  • 6 failure modes: recommended actions to improve detectability
-
Step 4: Visualization of the FMEA table
[Figure: FMEA table visualized by severity categories, end effects and failure modes, feeding the reliability modelling (36 inputs). One contributor: trigger circuit, missing trigger DQHDS]
Fault Tree Report on EDMS (2010822)
https://edms.cern.ch/file/2010822/1/Report_FaultTree_2018_08_03.pdf
-
2. Detailed study - trigger coupling
[Figure labels: MQF, MQD]
The trigger circuit is a contributor to:
• FM: heater series is not fired 2oo2
• EE: Quadrupole damage (S5, month)
-
2. Trigger coupling - Analysis techniques
• Failure rate prediction according to the handbook 217Plus for the estimation of occurrence probabilities of electronic components
• Inductive FMECA according to IEC 60812 for single failure analysis
• Quantitative fault tree analysis according to IEC 61025 for multiple failure analysis
• Supported by Isograph software
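The quantitative fault tree step combines component failure probabilities through AND and OR gates. The sketch below illustrates only that arithmetic. It assumes constant failure rates and independent events, it ignores common-cause failures, and the failure rate used is a made-up placeholder, not a 217Plus prediction and not a value from the study. The function names are illustrative.

```python
import math


def failure_probability(rate_per_hour: float, hours: float) -> float:
    """P(failure within `hours`) for a constant failure rate: 1 - exp(-rate * t)."""
    return 1.0 - math.exp(-rate_per_hour * hours)


def and_gate(probabilities):
    """Probability that all independent inputs fail."""
    result = 1.0
    for p in probabilities:
        result *= p
    return result


def or_gate(probabilities):
    """Probability that at least one independent input fails."""
    survive = 1.0
    for p in probabilities:
        survive *= 1.0 - p
    return 1.0 - survive


# Hypothetical numbers, chosen only to exercise the gates.
HEATER_CHANNEL_RATE = 1e-7  # failures per hour (placeholder, not from the study)
MISSION_HOURS = 4800.0      # one year of normal operation (~4800 h/a)
N_UNITS = 392               # quench detection units for the main quadrupoles

p_channel = failure_probability(HEATER_CHANNEL_RATE, MISSION_HOURS)
p_both_not_fired = and_gate([p_channel, p_channel])       # 2oo2 heaters not fired
p_any_unit = or_gate([p_both_not_fired] * N_UNITS)        # at least one of 392 units
```

The AND gate corresponds to the 2oo2 failure mode (both heater series must fail), and the OR gate scales a single-unit result to the whole population of units.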
-
2. Trigger coupling - Results
Objective 1: Compare design alternatives
Chosen design:
• trigger coupling with diode in single configuration
• 4 DQQDL diodes
• 1 DQCSU resistor
• no cross triggering
Objective 2: Weaknesses in the DYPQ trigger circuit?
• Single DQHDS entry
Objective 3: Estimate DYPQ trigger circuit reliability
• The probability that, for one of the 392 quadrupoles, two out of two heaters are not fired within 100 years is estimated to be 0.1 %.
Documentation on EDMS (2010831)
https://edms.cern.ch/nav/P:CERN-0000076638:V0/D:2010831:V1
-
2. Trigger coupling - Results
LHC risk matrix: the target frequency allocated to EE2 is 1 / 100 years (EE1, EE3, EE4: 1 / year; EE5: 1 / 10 years).
DYPQ_EE2: Quadrupole damaged, S5 (month)
• Due to trigger circuit: 0.1%
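To compare the 0.1 % result with the 1 / 100 years target for EE2, the probability can be converted into an equivalent frequency. The sketch below assumes a constant failure rate, reads the 0.1 % as the probability of at least one such event among all 392 quadrupoles within 100 years, and reads the risk matrix row as a maximum acceptable frequency. These readings are interpretations of the slide, not statements from the study.

```python
import math

P_100_YEARS = 0.001        # 0.1 % within 100 years (from the slide)
PERIOD_YEARS = 100.0
TARGET_PER_YEAR = 1 / 100  # EE2 row of the risk matrix

# Constant-rate assumption: P = 1 - exp(-rate * T)  =>  rate = -ln(1 - P) / T
rate_per_year = -math.log(1.0 - P_100_YEARS) / PERIOD_YEARS

# Factor by which the trigger circuit contribution stays below the target.
margin = TARGET_PER_YEAR / rate_per_year
```

Under these assumptions the trigger circuit contributes roughly 1e-5 events per year, about a factor of 1000 below the EE2 target, which is consistent with the conclusion that the chosen design is acceptable.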
-
Summary and conclusion
• RIRE is a CERN-tailored methodology for the experience-based derivation of quantitative reliability targets
• RIRE was applied to the upgraded quadrupole quench protection system DYPQ, a complex system with context-dependent functions
• Several critical failure modes were identified; the qualitative study affected design and procedures
• The usefulness of a quantitative reliability study at board level was identified (contributors to S5 and S3 failure modes)
• The trigger link, a contributor to S5, was studied at component level; result: chosen design acceptable
• Some S3 to S7 failure modes remain to be studied to fully qualify the system
-
Step 2.5: LHC severity table
Severity level | Description | Recovery time
7 | Catastrophic | Infinite
6 | Very high | Year
5 | High | Month
4 | Critical | Week
3 | Major | Days
2 | Moderate | Hours
1 | Low | Minutes
-
1.2 DQLPU-B Upgrade in LS2
Current
1. DYPQ Yellow Protection rack Quadrupole, n = 47/51
   1. DQLPU-B Local Protection Unit type B (iQPS), n = 1
      1. DQQDL Quench Detection Local, n = 4
      2. DQAMC Acquisition and Monitoring Controller, n = 1
      3. SYKO Power Supply, n = 2
   2. DQHDS Heaters Discharge power Supply, n = 2
   3. Crawford box, n = 1
   4. Dispatching box, n = 1
After upgrade
1. DYPQ Yellow Protection rack Quadrupole, n = 47/51
   1. DQLPU-B II Local Protection Unit type B (iQPS), n = 1
      1. DQQDL: new board, new features
      2. DQAMC: minor upgrade (different firmware)
      3. DQLPR (dipole), n = 2
   2. DQHDS: minor upgrade (fuse to earth)
   3. Crawford box
   4. Dispatching box
   5. System board controller: PS monitoring, quench heater supervision, triggering and timing controller
DQLIM (dipole)