wbs 3.2 – data acquisition
DESCRIPTION
WBS 3.2 – Data Acquisition. Paris Sphicas/CERN-MIT, US CMS Level-2 DAQ manager. DOE/NSF Review, May 8, 2001. Outline: Overview of DAQ & High Level Trigger; Status and Technical Progress; Scope and Contingency Since Last Review; Committee Concerns and Issues; Plans; Summary and Conclusions.
US CMS DOE/NSF Review: May 8-10, 2001 1
WBS 3.2 – Data Acquisition
Paris Sphicas/CERN-MIT
US CMS Level-2 DAQ manager
DOE/NSF Review
May 8, 2001
Outline
• Overview of DAQ & High Level Trigger
• Status and Technical Progress
• Scope and Contingency Since Last Review
• Committee Concerns and Issues
• Plans
• Summary and Conclusions
System Overview: DAQ
Computing Services
16 Million channels
Charge Time Pattern
40 MHz COLLISION RATE
75 kHz 1 MB EVENT DATA
1 Terabit/s READOUT
50,000 data channels
200 GB buffers ~ 400 Readout memories
3 Gigacell buffers
500 Gigabit/s
5 TeraIPS
~ 400 CPU farms
Gigabit/s SERVICE LAN
Petabyte ARCHIVE
Energy Tracks
100 Hz FILTERED EVENT
EVENT BUILDER. A large switching network (400+400 ports) with total throughput ~ 400Gbit/s forms the interconnection between the sources (deep buffers) and the destinations (buffers before farm CPUs).
EVENT FILTER. A set of high performance commercial processors organized into many farms convenient for on-line and off-line applications.
SWITCH NETWORK
LEVEL-1 TRIGGER
DETECTOR CHANNELS
Original design: Lvl1 @ 100 kHz
Rescope in 1997: 75 kHz
But design all elements to be able to do 100 kHz
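The rate and event-size figures above can be turned into a raw bandwidth estimate. The sketch below is only illustrative arithmetic: it assumes "1 MB" means 10^6 bytes and counts payload bits only (no protocol or formatting overhead), and the function name is hypothetical. The payload alone comes to 0.6 to 0.8 Tbit/s, the same order as the 1 Terabit/s readout figure quoted on the slide.

```python
def readout_gbit_per_s(l1_rate_hz: float, event_bytes: float) -> float:
    """Raw event-data bandwidth in Gbit/s for a given Level-1 accept rate."""
    return l1_rate_hz * event_bytes * 8 / 1e9


# ASSUMPTION: "1 MB EVENT DATA" is taken as 10**6 bytes of payload.
EVENT_BYTES = 1e6

for label, rate_hz in (("rescoped Lvl-1, 75 kHz", 75e3),
                       ("design Lvl-1, 100 kHz", 100e3)):
    gbit = readout_gbit_per_s(rate_hz, EVENT_BYTES)
    print(f"{label}: {gbit:.0f} Gbit/s")
```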
DAQ architecture
Must reduce 1 GHz of input interactions to 100 Hz
• Do it in steps/successive approximations: “Trigger Levels”
“Traditional”: 3 physical levels (Lvl-1, Lvl-2, Lvl-3)
CMS: 2 physical levels (Lvl-1, HLT)
[Diagrams: both chains are drawn with Detectors, Front end pipelines, Readout buffers, Switching network and Processor farms]
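The size of each step can be made explicit with a little arithmetic. The sketch below uses the rates quoted in these slides (1 GHz of interactions, the 100 kHz Level-1 design value, 100 Hz after the HLT); the list layout and function name are illustrative choices, not part of the CMS design documents.

```python
# Rates quoted in the slides; the Lvl-1 value is the 100 kHz design figure.
RATES_HZ = [
    ("interactions", 1e9),
    ("Lvl-1 output", 100e3),
    ("HLT output", 100.0),
]


def rejection_factors(rates):
    """Return (stage name, input rate / output rate) for each trigger step."""
    return [(out_name, rate_in / rate_out)
            for (_, rate_in), (out_name, rate_out) in zip(rates, rates[1:])]


factors = rejection_factors(RATES_HZ)
for name, factor in factors:
    print(f"{name}: rate reduced by a factor of {factor:,.0f}")

overall = RATES_HZ[0][1] / RATES_HZ[-1][1]
print(f"overall reduction: {overall:,.0f}")
```

With two physical levels, the whole second factor has to be delivered by software running on the processor farm.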
2 vs 3 physical levels
Three Physical Levels (Lvl-1, Lvl-2, Lvl-3)
• Investment in:
  • Control Logic
  • Specialized processors (possibly)
Two Physical Levels (Lvl-1, HLT)
• Investment in:
  • Bandwidth
  • Commercial Processors
[Diagram labels: Data, Data Access, Processing Units, Bandwidth, Model]
CMS DAQ: US contribution
US: Event Manager + Builder Units
• CERN: Inputs + Switch
• US: Outputs + EVM
  • Filter Units not included in “outputs”
Other responsibilities:
• Detector Front-Ends: detector groups
• Computing Services: infrastructure
[Diagram labels: Detector Front-end, Level 1 Trigger, Readout Systems, Event Manager, Builder Networks, Builder and Filter Systems (BU, FU), Run Control, Computing Services; the US part is marked “US”]
Developments last year (I)
Multistep Event Building no longer necessary
• Initial decision to invest in networking and computing technologies proving correct
• Today: two alternatives: Myrinet 2000 (2.5 Gb/s links) and/or two Gbit/s Ethernet links/RU
• Tomorrow: + Infiniband (?)
[Diagram: the multistep event building scheme that is no longer necessary]
• Sub-event LVL-2 data (Calorimeter, muon): 100 kHz, 250 Gbit/s; Level-2 uses 25% of the data
• Events accepted to higher levels: 10%
• Full event LVL-3 data (Track information): e.g. 10 kHz, 75 Gbit/s; Level-3 uses 75% of the data
• EVM: 350 Gbit/s
[Diagram labels: 500 readout units, EVM, 500 filter units, Level-2 Cal. & Muon, High levels (tracker data)]
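The bandwidth figures of the multistep scheme can be reproduced with simple arithmetic. The sketch below rests on one assumption that is not stated on the slide: roughly 10 Mbit moved per 1 MB event (payload plus overhead), chosen because it reproduces the quoted 250 and 75 Gbit/s. The function and constant names are hypothetical.

```python
# ASSUMPTION: about 10 Mbit transferred per 1 MB event (payload plus overhead).
# This value is not on the slide; it is chosen because it reproduces the
# quoted 250 Gbit/s and 75 Gbit/s figures.
BITS_PER_EVENT = 10e6


def step_gbit_per_s(rate_hz: float, data_fraction: float) -> float:
    """Bandwidth needed to move `data_fraction` of each event at `rate_hz`."""
    return rate_hz * data_fraction * BITS_PER_EVENT / 1e9


lvl2 = step_gbit_per_s(100e3, 0.25)    # calorimeter + muon data for every Lvl-1 accept
lvl3 = step_gbit_per_s(10e3, 0.75)     # remaining data for the 10% that pass Level-2
single = step_gbit_per_s(100e3, 1.0)   # full event for every Lvl-1 accept

print(f"multistep: {lvl2:.0f} + {lvl3:.0f} = {lvl2 + lvl3:.0f} Gbit/s")
print(f"single step: {single:.0f} Gbit/s")
```

Under this assumption the multistep total (325 Gbit/s) is close to the 350 Gbit/s label, while single-step building needs about three times as much bandwidth, which is what faster links make affordable.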
Developments last year (II)
Physics Reconstruction and Selection (PRS): new project in CMS; along with CCS and TriDAS (online): CPT
[Organization chart labels]
• Joint Technical Board
• Core Computing & Software
• Physics Reconstruction & Selection
• TRIDAS (Online farm)
• Reconstruction Group. RPROM (Stephan Wynhoff)
• Simulation Group. SPROM (Albert De Roeck)
• Architecture Task Force. CAFE (Jim Branson)
• Paris Sphicas, Sergio Cittolin, Martti Pimia, David Stickland
…
Developments last year (III)
High Level Trigger: included in “PRS”
• Defining “Level-2” as anything doable without tracking information, Level-2 is ~ complete
New LHC schedule: new date for DAQ TDR
• First beams in early 06, first physics in Aug 06
• Submission date was always set to T0(LHC)-3.5 yrs.
• With new schedule, submission goes to end (Nov 30) 2002
Schedule & Milestones:
• Unchanged, especially for the HLT/PRS part(s)
• What gets delayed is decisions on technologies to use, etc., but not the results of the studies.
• However, with another year’s technology available, we can expect most of the data transfer issues to be gone, so we just concentrate on (a) the algorithm itself and (b) the CPU needed
Progress Since Last Review
16x16 Event Builder Demonstrator complete:
• Based on Myrinet-2000:
  • Barrel-shifter works at close to 100% (raw) efficiency
• Based on Gbit Ethernet:
  • Looks very promising – especially if 10 Gbit Ethernet arrives in time
Designs for 500x500 switch available
• Simulation results very promising
Builder Unit prototype:
• Two solutions being looked at:
  • Custom-made board (commercial components)
  • Recycling of units made for Readout into a PC
High Level Trigger:
• “Level-2” equivalent algorithms in place
• Now working on “Level-3” (~ includes tracker information)
Progress: Readout Unit
Aim: complete chain test in 2001
Progress: switch
16x16 EVB based on Myrinet and on Gbit Ethernet now complete
• Barrel-shifting gives non-blocking behavior
[Diagram: RU0 to RU3 sending to BU0 to BU3; labels 2k, 4k]
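Why barrel-shifting avoids blocking can be seen from a toy schedule. The slides do not spell out the exact scheme used in the demonstrator, so the sketch below assumes the textbook cyclic rule: in time slot t, source i sends to destination (i + t) mod N. The function names are illustrative.

```python
def barrel_shift_schedule(n: int) -> list[list[int]]:
    """schedule[t][src] is the destination that source `src` sends to in slot t."""
    return [[(src + t) % n for src in range(n)] for t in range(n)]


def is_conflict_free(schedule: list[list[int]]) -> bool:
    """True if every slot is a permutation: no destination is hit twice."""
    n = len(schedule)
    return all(sorted(slot) == list(range(n)) for slot in schedule)


schedule = barrel_shift_schedule(4)
for t, slot in enumerate(schedule):
    pairs = ", ".join(f"RU{src}->BU{dst}" for src, dst in enumerate(slot))
    print(f"slot {t}: {pairs}")

print("conflict-free:", is_conflict_free(schedule))
```

Each slot pairs every RU with a different BU, so no output port is contended, and after N slots every RU has sent one fragment to every BU.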
Progress: BU prototype
Current device
• Done at UCSD
• Copper Gbit Ethernet NIC
• PowerPC CPU
• RAMlink interface
[Plot: SysKonnect Performance. Xfer Rate (MB/s), axis 0 to 140, vs. Frame Size (Bytes), axis 0 to 1600]
Progress: HLT algorithms
PRS groups in place since 4/99; priority on HLT
• Using new (OO) software reconstruction (ORCA)
• “Level-2” equivalent code in place; now “Level-3”
• The question: when should we add tracking information?
Progress: control software
[Diagram labels: GUI, RUN MANAGER, PARTITION i, PARTITION j, PARTITION k; Sub-Systems Managers: CS Mngr, DCS Mngr, Trigger Mngr, EF Mngr, EVB Mngr; Sub-Systems Resources: CS System, DCS System, CMS Sub-System, Trigger System, EF System, EVB System]
Sub-Systems:
- EVB = Event Builder
- EF = Event Filters
- DCS = Detector Control System
- CS = Computing Service
- LHC = LHC Main Control
Progress: towards a composite switch (I)
Using Myrinet 2000 (available today)
• bisection bandwidth 1 Tbps
• 6 layer - 512 minimal routes for each source-destination pair
Clos-128 switch
Issue: design a 500x500 switch fabric out of smaller (e.g. 32x32, 64x64) basic switches
[Diagrams: Clos Network (93), drawn with input switches S1 … Sr (n×m), middle switches (r×r) and output switches D1 … Dr (m×n); Banyan Network (46), drawn with n×n switches]
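A rough way to size a three-stage Clos fabric from small switches is sketched below. It uses the standard textbook conditions (rearrangeably non-blocking for m ≥ n, strictly non-blocking for m ≥ 2n − 1) and models each basic switch as a unidirectional radix × radix crossbar. Both the model and the function name are assumptions for illustration; this does not reproduce the Myrinet Clos-128 design mentioned above.

```python
import math


def clos_plan(ports: int, radix: int, strict: bool = False) -> dict:
    """Size a 3-stage Clos network: r input (n x m), m middle (r x r), r output (m x n).

    ASSUMPTION: each basic switch is a unidirectional radix x radix crossbar.
    """
    # Strictly non-blocking needs m >= 2n - 1; rearrangeable needs only m >= n.
    n = (radix + 1) // 2 if strict else radix
    m = 2 * n - 1 if strict else n
    r = math.ceil(ports / n)
    if m > radix or r > radix:
        raise ValueError("three stages of this radix cannot reach the port count")
    return {"n": n, "m": m, "r": r, "switches": 2 * r + m}


for strict in (False, True):
    kind = "strictly non-blocking" if strict else "rearrangeably non-blocking"
    print(kind, clos_plan(500, 32, strict=strict))
```

In this model a 500x500 fabric fits in three stages of 32x32 switches either way; the strict variant simply spends more switches on the outer stages.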
Progress: towards a composite switch (II)
[Diagrams: two composite-switch layouts connecting groups of 20 RUs to groups of 20 BUs, with elements numbered 1 to 25. First layout labels: 20 Ports, 25 Ports, 40 Ports. Second layout labels: 2 Ports 10G, 25 Ports 10G, 40 Ports]
DAQ - BCWS and BCWP
Cumulative BCWP/BCWS = 95%; little schedule slippage
• DAQ has completed BCWP/EAC = 18% of the project.
[Plot: cumulative BCWS (K$), BCWP (K$) and ACWP (K$) in AYK$, axis $0 to $1,200. Annotations: “Change in accounting (AY$) (+ delayed actuals reported)”; “FNAL Software Engr added. And work starts going faster”; “Change Control to account for demonstrator schedule”]
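The ratios quoted on this slide are standard earned-value quantities. The sketch below only illustrates the definitions; the numeric inputs are hypothetical placeholders chosen to give a 95% schedule index and are not read off the plot.

```python
def schedule_performance_index(bcwp: float, bcws: float) -> float:
    """Work performed relative to work scheduled (BCWP/BCWS)."""
    return bcwp / bcws


def cost_performance_index(bcwp: float, acwp: float) -> float:
    """Work performed relative to actual cost (BCWP/ACWP)."""
    return bcwp / acwp


def fraction_complete(bcwp: float, eac: float) -> float:
    """Share of the project earned so far (BCWP/EAC)."""
    return bcwp / eac


# HYPOTHETICAL inputs in k$, for illustration only (not taken from the slide).
bcws, bcwp, acwp = 1000.0, 950.0, 900.0

spi = schedule_performance_index(bcwp, bcws)
cpi = cost_performance_index(bcwp, acwp)
print(f"BCWP/BCWS = {spi:.0%}, BCWP/ACWP = {cpi:.2f}")
```

A schedule index just below 1 means slightly less work has been earned than planned to date, which is what "little schedule slippage" describes.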
DAQ - Contingency Use
[Plot: TRIDAS, (EAC - Base)/Base, axis -0.03 to 0.02, Aug-98 to Feb-00]
DAQ decreased its cost in FY00$ – drop of ~ 3%. Most of the change due to M&S (prices dropping). Effect increased in AY$ units.
[Plot: BCWP/EAC and (EAC-Base)/Base, axis 0.000 to 0.300, Jan-00 to Mar-01. Annotation: “Recosting (mainly)”]
Moved profile to later: 2003-04 procurements (now) scheduled for 2004-05. AY$ increase.
DAQ - Yearly BCWS
[Plot: DAQ BCWS by FY, FY96 to FY05, axis 0 to 1,800,000]
Old schedule: most of the cost (3.3M$ out of 4.4M$ total) in FY03 & FY04
New schedule: same cost is now distributed in years FY03-FY04-FY05
[Plot: DAQ BCWS by Fiscal Year, FY96 to FY05, axis 0 to 3,000,000; bar labels 2406, 861, 404]
Not final… exact schedule for (crucial) 04-05 period to be defined at TDR
DAQ – ML 1-2
All of them have been met
• Two in past 12 months
Only change: DAQ TDR is anticipated for end 2002
• Will determine set of milestones for “production/building” stage @ TDR time
Last Review Concerns
Concerns from last time:
1. Add a physicist or software professional familiar with data acquisition to the data acquisition effort. This project has made good progress with the manpower it has from the CMS project and the support of the base high energy physics program, but additional manpower, as recommended last year, is still important. It would be best to hire an individual in the next year who could then participate in the development of the TDR for data acquisition and would remain committed to CMS through the turn-on of the data acquisition system in 2005.
Response: US CMS have made a high priority request to the base program for additional support at U.C. San Diego. This request was made at the meeting between US CMS and DOE/NSF on Sept. 11 and it was well received. It is therefore assumed that an additional postdoc will be available to work on the DAQ effort. Should that not come about, the recommendation will be revisited in order to find an alternative solution.
Plans for this year
HLT: complete Level-3 equivalent code
• Goal is to get rate down to ~ few kHz (Lvl-3)
• Create first trigger table for O(100) Hz output (Lvl-4)
DAQ: complete demonstrator to 32x32
• Complete comparison with simulation
• Test out 2-Gbit scenarios
Vertical chain test
• Integrate Readout Unit, Switch, Builder Unit + Event Manager in one testbed
• Check hardware/software interoperability
TDR: aim for first draft at end of 2001
Summary & Conclusions
Technology is moving fast in the right direction
• Single-step EVB is now the baseline design
EVB prototype program
• Very good results from traffic shaping (16x16)
• EVM and BU on track
High Level Trigger
• Organizational changes: PRS project
• Full Lvl-2 results in July 2000; now on Lvl-3
Project Management
• Schedule reasonable (95% on track)
• Cost experience so far
TDR: new date: end 2002; aim for draft end 2001
DAQ - Estimate to Complete
All amounts in k$ except Cont (%); "-" marks an empty cell.
WBS    Description                                EDIA     M&S  Mfg Labor  Base Cost    Cont  Cont (%)  Total Cost
-      Estimate at Completion (AY$)                  -       -          -     12'983       -         -      18'297
-      FY96-FY99 (AY$)                               -       -          -      2'311       -         -       2'311
-      Estimate to Complete (AY$)                3'257   7'404         10     10'671   5'314        50      15'985
-      Escalation (DOE January 2000 indices)       153     440          0        593       -         -           -
3      Trigger and Data Acquisition              3'105   6'963         10     10'078   5'012        50      15'090
3.1    Trigger                                   1'972   3'706         10      5'689   2'642        46       8'331
3.2    Data Acquisition                          1'132   3'257          -      4'389   2'371        54       6'760
3.2.1  Prototypes: RU                               58      40          -         97      49        51         147
3.2.2  Prototypes: FU                                -       -          -          -       -         -           -
3.2.3  Prototypes: Event Builder                   178      31          -        209     146        70         355
3.2.4  Demonstrator for TDR                        273     207          -        481     285        59         766
3.2.5  Production: Builder Unit                    320   2'827          -      3'147   1'566        50       4'713
3.2.6  Production: Event Builder                   154     152          -        306     244        80         550
3.2.7  DAQ Tests/Installation                      149       -          -        149      80        54         229
[Pie chart: EDIA 26% ($1.132 M), M&S 74% ($3.257 M)]
DAQ Cost to complete: 4.389 M$
Contingency: 2.371 M$ (54%)
(adequate, given most of cost is in M&S)
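The quoted percentages can be cross-checked against the WBS 3.2 line items. The sketch below uses only numbers from the estimate table above (in k$); the dictionary layout and variable names are illustrative choices.

```python
# WBS 3.2 line items from the estimate table: (description, base k$, contingency k$).
ITEMS = {
    "3.2.1": ("Prototypes: RU", 97, 49),
    "3.2.3": ("Prototypes: Event Builder", 209, 146),
    "3.2.4": ("Demonstrator for TDR", 481, 285),
    "3.2.5": ("Production: Builder Unit", 3147, 1566),
    "3.2.6": ("Production: Event Builder", 306, 244),
    "3.2.7": ("DAQ Tests/Installation", 149, 80),
}

# WBS 3.2 totals as quoted on the slide (k$).
EDIA, M_AND_S, CONTINGENCY = 1132, 3257, 2371

base_total = sum(base for _, base, _ in ITEMS.values())
cont_total = sum(cont for _, _, cont in ITEMS.values())

print(f"base cost from line items: {base_total} k$ (EDIA + M&S = {EDIA + M_AND_S} k$)")
print(f"contingency from line items: {cont_total} k$ (quoted: {CONTINGENCY} k$)")
print(f"EDIA share: {EDIA / base_total:.0%}, M&S share: {M_AND_S / base_total:.0%}")
print(f"contingency / base: {CONTINGENCY / base_total:.0%}")
```

The line items add up to the quoted base cost exactly; the contingency sum differs from the quoted total by 1 k$, which is consistent with rounding of the individual entries.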
SoW 01 – DAQ
DAQ SOWs FY01 -- $0.2M
• University of California-Los Angeles
• University of California-San Diego
• Fermilab
• MIT
DAQ Resource Usage
Engineering and Technical resources are compared to the people called out in the annual SOW. This tracking ensures that the needed labor is deployed.
[Plot: DAQ Resource Usage, FTEs (Tech, Eng, Phys) by fiscal year, FY98 to FY05, axis 0 to 12]