Online monitoring and filtering
Graham, July 2009
Monitoring and filtering in CODA v2
✦ Up to 32 ROCs.
✦ A single event builder (EB).
✦ EB output is a stream of single events.
✦ EB is connected to the Event Transport (ET) system.
✦ ET has one or more online analysis, filter and monitor programs attached.
✦ Event recorder attaches to ET and takes all events that survive filtering.
CODA v2 system
Simplified ET
✦ ET has the following features:
  ✦ Can be more than one data producer per ET.
  ✦ Each station can have a user-provided filter algorithm that looks at the data tags.
  ✦ Can be more than one data consumer per station, but the algorithm is shared.
  ✦ System has “fair play” algorithms.
    ✦ Round robin vs. first free, etc.
  ✦ Stations can be configured to accept all events, to accept a sample of events, or to be skipped when their FIFO is full.
✦ Since data moves “on a track”, programs attached to stations after the producers but before the data recorder can modify or filter data.
✦ Similarly, programs attached to stations after the data recorder can monitor the data and, if configured to skip events when their input is full, do not introduce dead time.
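The station behaviour described above can be illustrated with a toy model. This is a minimal sketch in plain Python, not the ET API: the `Station` class, its `offer` method, and the event layout (a dict with `id` and `tags`) are all hypothetical names chosen for illustration. It shows a user filter that looks only at the tags, and a station that is skipped when its FIFO is full so that it never stalls the stream. A blocking station would make the producer wait; the sketch models that case by raising an error.

```python
from collections import deque


class Station:
    """Toy model of one ET station (hypothetical, not the real ET API)."""

    def __init__(self, name, capacity, select=None, blocking=True):
        self.name = name
        self.fifo = deque()
        self.capacity = capacity
        # User-provided filter: sees only the event tags, not the payload.
        self.select = select or (lambda tags: True)
        self.blocking = blocking
        self.skipped = 0

    def offer(self, event):
        """Return True if the event was queued here, False if it passes by."""
        if not self.select(event["tags"]):
            return False
        if len(self.fifo) >= self.capacity:
            if self.blocking:
                # A real blocking station would hold up the producer here.
                raise RuntimeError(f"{self.name} is full: producer would stall")
            self.skipped += 1  # non-blocking: the event goes by, no dead time
            return False
        self.fifo.append(event)
        return True


# A monitor that only wants events whose first tag is 1 and is skipped when full.
monitor = Station("monitor", capacity=2,
                  select=lambda tags: tags[0] == 1, blocking=False)

events = [{"id": i, "tags": [i % 2]} for i in range(10)]
accepted = [e["id"] for e in events if monitor.offer(e)]
```

Because nothing drains the monitor's FIFO in this example, it accepts the first two matching events and lets the rest pass, which is the behaviour that keeps a slow monitor from introducing dead time.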
Hall B [diagram; labels: ET1, ET2, ET3, EB, ER, ECAL, TOF, CerD, Tagger, DC, LA-CAL]
Online farm
✦ Distributed
  ✦ Need processing cycles
  ✦ Need high bandwidth
  ✦ Must survive node problems
✦ Two modes:
  ✦ Filter
  ✦ Monitor
Reminder of EB architecture
Online farm proposal
Proposal
✦ Each EMU in the final stage of the EB writes to an ET.
  ✦ The ET provides one station per farm node.
  ✦ It is configured to load balance between nodes.
  ✦ The EMU has one or more backup ETs in case the preferred ET is full.
✦ Each node has a local ET and several jobs.
  ✦ The local ET gets data from the remote ET.
  ✦ Each job gets data from, and puts data to, the local ET.
✦ After filter/monitor, the local ET puts to a remote ET.
✦ One or more event recorders pull data from this ET.
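The backup-ET idea in the proposal can be sketched as a simple preference list. This is an illustrative model only: `EtSink`, `try_put`, and `emu_write` are hypothetical names, and a fixed count of free buffers stands in for whatever "full" means in a real ET system.

```python
class EtSink:
    """Stand-in for an ET system with a limited number of free buffers."""

    def __init__(self, name, free_buffers):
        self.name = name
        self.free_buffers = free_buffers
        self.events = []

    def try_put(self, event):
        """Store the event if there is room; report whether it was stored."""
        if self.free_buffers == 0:
            return False
        self.free_buffers -= 1
        self.events.append(event)
        return True


def emu_write(event, sinks):
    """Write to the first ET, in preference order, that has room."""
    for sink in sinks:
        if sink.try_put(event):
            return sink.name
    return None  # every ET is full; the caller must wait or apply back-pressure


preferred = EtSink("ET-preferred", free_buffers=2)
backup = EtSink("ET-backup", free_buffers=3)
destinations = [emu_write(n, [preferred, backup]) for n in range(6)]
```

With nothing draining either sink, the first two events go to the preferred ET, the next three spill over to the backup, and the last one finds no room anywhere.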
How it works
✦ First ET is a source of data for one or more nodes.
  ✦ Load balance and fault tolerance between nodes.
✦ Second ET, local to the node, is the source for several jobs.
  ✦ Load balance and fault tolerance between jobs.
✦ Last ET has data sources from one or more nodes.
✦ Control nodes and jobs using AFECS.
Why it works
✦ Distributed and parallel
✦ Only requires configuration of ET systems
  ✦ Can tune parameters to alter behavior.
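Load balancing and fault tolerance between nodes can be pictured with two dispatch policies, round robin and first free. The sketch below is a plain-Python illustration under stated assumptions: `Node`, `first_free`, and `make_round_robin` are hypothetical names, and a node "problem" is modelled as a flag that removes the node from consideration. It is not how ET or AFECS implement this.

```python
from itertools import cycle


class Node:
    """Toy farm node with a number of free input slots."""

    def __init__(self, name, slots):
        self.name = name
        self.slots = slots
        self.alive = True

    def can_take(self):
        return self.alive and self.slots > 0


def first_free(nodes):
    """Pick the first node, in fixed order, that can take an event."""
    return next((n for n in nodes if n.can_take()), None)


def make_round_robin(nodes):
    """Return a picker that rotates through nodes, skipping unusable ones."""
    ring = cycle(nodes)

    def pick():
        for _ in range(len(nodes)):
            node = next(ring)
            if node.can_take():
                return node
        return None  # no node can take an event right now

    return pick


nodes = [Node("node1", 4), Node("node2", 4), Node("node3", 4)]
pick = make_round_robin(nodes)
nodes[1].alive = False  # simulate a node problem

rr_order = []
for _ in range(4):
    node = pick()
    node.slots -= 1
    rr_order.append(node.name)
```

Round robin spreads the events over the surviving nodes, while `first_free` keeps filling the first usable node; which policy suits the farm is a tuning choice of the kind the slide refers to.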
Issues
✦ What does the data look like at this stage?
  ✦ Events?
  ✦ Blocks of events?
  ✦ Does it matter?
✦ What do we do with “non-physics” events?
✦ Does it matter if event N appears before or after event N+1?
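If event order does turn out to matter, one possible remedy is to reorder events by number before recording. The sketch below is an assumption-laden illustration, not part of the proposal: `restore_order` is a hypothetical helper, events are modelled as `(number, payload)` pairs, and it assumes every event number eventually arrives (a filter that drops events would need to report the gaps, or the buffer would wait forever).

```python
import heapq


def restore_order(events, first_number=0):
    """Yield (number, payload) pairs in event-number order.

    Out-of-order events are held in a min-heap until the next expected
    number arrives.
    """
    pending = []
    expected = first_number
    for number, payload in events:
        heapq.heappush(pending, (number, payload))
        # Release every event that is now contiguous with what was emitted.
        while pending and pending[0][0] == expected:
            yield heapq.heappop(pending)
            expected += 1


# Parallel farm jobs finishing at different times can interleave like this.
arrived = [(1, "b"), (0, "a"), (3, "d"), (2, "c")]
ordered = list(restore_order(arrived))
```

The cost is extra buffering and latency in front of the recorder, which is one reason the question of whether ordering matters is worth settling first.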