D. Saltzberg, 7 Dec 01 L2 Review

CDF

Level-2 Interface Board Status

David Saltzberg for L2 Group

Level-Two Trigger Review

December 7, 2001

CDF Overview

Phase 1: L1 interface, Clist, XTRPlist, SVTlist
Phase 2: ISOlist, RECES
Phase 3: Muon board (not in this talk)

(Phase 1 and Phase 2 have been done in parallel)

CDF Responsible Physicists

L1 interface: Greg Feild *
Clist: Monica Tecchio, Heather Ray *
XTRPlist, SVTlist: Matt Worcester *, Jane Nachtman, D.S.
RECES: Masa Tanaka *, Karen Byrum *
ISOlist: Steve Kuhlmann *, Bob Blair *

* = lives within 50 miles of Fermilab

ANL engineers (L1, RECES, ISOlist): John Dawson, Bill Haberichter

Special operatives: Stephen Miller, Ted Liu, Peter Wittich

CDF Theory of Operation - I

Input data from “Clients”

L1 interface, RECES:
- one word/event, no handshake

Clist, XTRPlist, SVTlist, ISOlist:
- variable-length data, buffered by FIFOs
- terminated by EE word
- some information transferred about BC, L2B, or event count for sync checking
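
As an illustration of the variable-length framing described above, the sketch below splits a flat stream of FIFO words into per-event lists at a terminator word. The terminator value `EE_WORD = 0xEE` and the function name `split_events` are assumptions made for this example; the slides only say that the data are terminated by an EE word and do not give its encoding.

```python
from typing import Iterable, List

# Assumed terminator value, for illustration only; the slides do not give the encoding.
EE_WORD = 0xEE


def split_events(words: Iterable[int], terminator: int = EE_WORD) -> List[List[int]]:
    """Group a flat stream of FIFO words into events, one list per terminator seen.

    Raises ValueError if the stream ends with data but no terminator, which is
    the case a reader cannot resolve on its own without some kind of timeout.
    """
    events: List[List[int]] = []
    current: List[int] = []
    for word in words:
        if word == terminator:
            events.append(current)  # terminator closes the current event
            current = []
        else:
            current.append(word)
    if current:
        raise ValueError("stream ended without terminator word")
    return events
```

An event with no data words (two terminators in a row) comes back as an empty list, so the number of events always equals the number of terminators seen.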

CDF Theory of Operation - II

Output: Control and Data signals via Magic Bus

Master mode (currently all boards except RECES)
- L2P issues “STARTLOAD”
- When ready, interface board requests “Boss”
- Board is granted Boss from upstream
- Board drives block-mode data transfer on bus
- Boss is released by interface board, and MOD_DONE asserted
- When all MOD_DONE bits are set, L2P begins processing

Slave mode
- Board is addressed over Magic Bus and read in single-word transfers

Alternate output (TRACKlist boards): VME readout
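
The master-mode sequence above can be walked through in a few lines of code. This is only a software sketch of the ordering of steps, not a model of the hardware: the function name, the log strings, and the choice to serve boards in dictionary order are assumptions for illustration, and the real grant order is set by the hardware chain.

```python
from typing import Dict, List


def run_master_mode_load(board_data: Dict[str, List[int]]) -> List[str]:
    """Return the ordered log of one load cycle with every board in master mode.

    Boards take Boss one at a time in dictionary order here; the real
    arbitration order depends on the hardware grant chain and is not modeled.
    """
    log = ["L2P: STARTLOAD"]
    mod_done = {name: False for name in board_data}
    for name, words in board_data.items():
        log.append(f"{name}: request Boss")
        log.append(f"{name}: granted Boss")
        log.append(f"{name}: block transfer of {len(words)} words")
        log.append(f"{name}: release Boss, assert MOD_DONE")
        mod_done[name] = True
    # L2P starts processing only once every board has asserted MOD_DONE.
    if all(mod_done.values()):
        log.append("L2P: all MOD_DONE set, begin processing")
    return log
```

For example, `run_master_mode_load({"Clist": [1, 2], "XTRPlist": [3]})` returns a ten-line log that starts with the STARTLOAD and ends with the L2P beginning to process.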

CDF General Error Detection & Handling

In L2P (every event, tens of kHz)
- L2P has 600 µsec timeout for all MOD_DONE signals
- BC, L2B, or counters checked where possible, event by event
- Checks for exactly 1 Magic Bus word from L1 board
- If error, pull CDF_ERROR (or equivalent) and ask for automatic Halt-Recover-Run to resynch FIFOs

In TrigMon (~2 Hz)
- Check number of words transferred for each board
- Check BC across system
- Exact bit-for-bit comparison of data vs. emulation and/or alternate source

Offline
- Run selected parts of TrigMon & Monica’s validation code on look area, stream-g, stream-b, l2-torture runs (~1M events lately)
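
The per-event checks listed for the L2P can be sketched as a single function. Everything here is a hypothetical software stand-in: the `BoardReadout` fields, the board key `"L1"`, and the error strings are invented for the example, and the timeout is assumed to be in microseconds.

```python
from dataclasses import dataclass
from typing import Dict, List

TIMEOUT_US = 600.0  # MOD_DONE timeout from the slide, assumed to be in microseconds


@dataclass
class BoardReadout:
    mod_done_time_us: float  # time at which MOD_DONE was seen; inf if never
    l2_buffer: int           # L2 buffer number reported by the board
    n_words: int             # number of Magic Bus words received


def check_event(expected_l2_buffer: int, boards: Dict[str, BoardReadout],
                timeout_us: float = TIMEOUT_US) -> List[str]:
    """Return a list of error strings; an empty list means the event is clean."""
    errors = []
    for name, b in boards.items():
        if b.mod_done_time_us > timeout_us:
            errors.append(f"{name}: MOD_DONE timeout")
        if b.l2_buffer != expected_l2_buffer:
            errors.append(f"{name}: L2 buffer mismatch")
    # The L1 board sends one word per event, so anything else is an error.
    l1 = boards.get("L1")
    if l1 is not None and l1.n_words != 1:
        errors.append("L1: expected exactly 1 word")
    return errors
```

In this sketch a non-empty return value is the point at which the real system would pull CDF_ERROR and request a Halt-Recover-Run.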

CDF Testing Performance of system

Without beam
- “L2 torture” nominally runs at ~20 kHz
- Occasionally have run system at ~40 kHz
- Runs system with high L2B occupancy
- Test patterns in for COT tracks, SVT tracks, emulate clusters
- 9 interface boards & up to 3 alphas connected

With beam
- Same configuration
- Get real XFT tracks but often have to run SVT test patterns (no SVX)
- Have found other problems (sometimes systemwide) that tests without beam do not show (real-world stuff that no teststand will anticipate)
- Extensive tests before Oct. shutdown, preliminary Dec. tests

CDF Current Boss Arb. Kludge

Glitch on BOSSGROUT (PECL) when taking Boss can lead to two boards taking Boss.
Since it is in hardware (not firmware), cannot make simple glitch protection.

Solution:
- Reduce collision rate by putting different delays in boards’ receiving of STARTLOAD (limits deadtimeless L1A rate to 20 kHz; we should have such problems)
- Handle remaining collisions with L2P error handling
- New backplane
- In a pinch, could it be fixed with TTL?
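
To make the staggered-delay idea concrete, here is a toy timing model. It assumes, purely for illustration, that two boards are at risk of both taking Boss when their requests fall within some window of each other; the window, the ready times, and the delay values are all invented numbers, and the real glitch mechanism on the backplane is not modeled.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def colliding_pairs(ready_us: Dict[str, float], delay_us: Dict[str, float],
                    window_us: float) -> List[Tuple[str, str]]:
    """List board pairs whose Boss requests fall within window_us of each other.

    A board's request time is its data-ready time plus its STARTLOAD delay;
    boards missing from delay_us get no extra delay.
    """
    request = {b: ready_us[b] + delay_us.get(b, 0.0) for b in ready_us}
    return [(a, b) for a, b in combinations(sorted(request), 2)
            if abs(request[a] - request[b]) < window_us]
```

With three boards ready at 2.0, 2.0 and 2.5 (arbitrary units) and a window of 1.0, all three pairs are at risk with no delays, and none are at risk once two of the boards are delayed by 3.0 and 6.0. The added delays lengthen the load, which is the rate cost noted on the slide.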


CDF Overview Plot of L2 crate

CDF Board-by-Board Status (follows...)

Status of “best” board
- Highest rate tested & error rate
- Limit on (or measurement of) bit error rate
- Cooperation with other boards
- Plans for further work

Status of spares
- Number and status of spares
- Known problems?

Status of documentation
Debugging tools, here and elsewhere
Plans
Other comments

CDF L1 Interface Board

L2 torture tests
- tested at 20-40 kHz, no problems
- tested ~1M events offline, no errors
- no collisions with other boards (by construction)

Known problems
- noisier than others, but protected in time
- still have to connect ground shield & see
- Solving noise here may solve it elsewhere


CDF L1 Interface Board Plots

No errors in bit-for-bit comparison

CDF L1 Interface Spares & Debugging tools

Spares
- S/N 1: OK
- S/N 2: OK (in crate)
- S/N 3: 3/4 stuffed

Debugging tools
- Bit-for-bit check available offline
- If more or less than one word is sent, L2P pulls error (pretty simple board, no need for complex diagnostics)
- Teststand: can set bit patterns, check in real time or later
  - data source: FRED
  - data sink: MB to emulator board
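
A bit-for-bit check of the kind mentioned above amounts to XOR-ing each data word with its emulated counterpart and reporting which bits differ. The sketch below is a generic version of that idea, not the TrigMon or offline code; the function name and the choice to treat a word-count mismatch as a separate error are assumptions.

```python
from typing import List, Sequence, Tuple


def bit_diffs(data: Sequence[int], emulation: Sequence[int]) -> List[Tuple[int, int]]:
    """Return (word_index, bit_index) for every bit that differs.

    Words are taken to be non-negative integers. Raises ValueError when the
    word counts differ, since too many or too few words is a different class
    of error from a flipped bit.
    """
    if len(data) != len(emulation):
        raise ValueError(f"word count mismatch: {len(data)} vs {len(emulation)}")
    diffs = []
    for i, (d, e) in enumerate(zip(data, emulation)):
        x = d ^ e  # set bits mark disagreements
        bit = 0
        while x:
            if x & 1:
                diffs.append((i, bit))
            x >>= 1
            bit += 1
    return diffs
```

Reporting the bit position, rather than just pass or fail, is what makes a recurring single bad bit easy to spot across many events.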

CDF L1 Interface Documentation/Plans

Docs
- CDFNOTE 4971
- Webpage: http://hepwww.physics.yale.edu/www_info/yale_cdf/l1crate.html
- Schematics have hardcopy in control room
- PDF files recently sent to Greg; will put on web and in trigger room

Plans
- Keep running
- Finish stuffing board #3 (2nd spare) and test
- Look into noise problem, not urgent. Wait until after new MB installed

CDF Clist Board

Responsibles: Monica Tecchio, Heather Ray
Gets data by fiber from each Locos board

L2 torture tests
- works at 20-40 kHz, no errors
- no errors found in ~1M events offline

Known problems
- crate 04 had bit 02 stuck low (probably trivial)
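
A stuck bit like the one noted above can be flagged from a sample of words by tracking which bits were ever seen as 1 and which were ever seen as 0. This is a generic sketch; the 16-bit default width and the function name are assumptions, and a bit that never toggles in a given sample may simply reflect the data rather than a fault.

```python
from typing import Iterable, List, Tuple


def find_stuck_bits(words: Iterable[int], width: int = 16) -> Tuple[List[int], List[int]]:
    """Return (stuck_low, stuck_high) bit positions over a sample of words."""
    mask = (1 << width) - 1
    seen_one = 0   # OR of all words: bits that were 1 at least once
    seen_zero = 0  # OR of all inverted words: bits that were 0 at least once
    n = 0
    for w in words:
        seen_one |= w & mask
        seen_zero |= ~w & mask
        n += 1
    if n == 0:
        raise ValueError("need at least one word")
    stuck_low = [b for b in range(width) if not (seen_one >> b) & 1]
    stuck_high = [b for b in range(width) if not (seen_zero >> b) & 1]
    return stuck_low, stuck_high
```

Feeding it words from a test pattern that is known to exercise every bit gives the cleanest answer, since any bit left in either list then points at the hardware.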


CDF Clist board plots

No errors in bit-for-bit comparisons


CDF L2 cutting on Jets

CDF Clist Debugging tools

Bit-for-bit comparisons done in online/offline monitoring
If L2 buffer number disagrees, L2P pulls error
Clusters can be set:
- pulling cable in DCAS crate makes a known cluster
- in principle software exists to make arbitrary cluster pattern at B0 (need to verify)

Michigan teststand capabilities:
- Standalone board tests using VME
- Data source: Locos
- Data sink: MB & L2P
- Test full clustering chain DCAS ---> L2P via MB with tracer generating multiple L1As

CDF Clist Spares/Documentation/Plans

Spares
- S/N 1: OK (in system)
- S/N 2: flaky VME, otherwise works
- S/N 3: being stuffed

Documentation
- webpage for aces, experts & non-experts: http://www-cdf.fnal.gov/internal/cdfoperations/trigger/level2/my.html
- will become general L2 webpage (need more disk space)
- schematics online in Michigan
- hardcopies in trigger room

Plans
- Keep running stably with board #1, monitor robustness
- Fix flaky VME on board #2
- Make board #3 a second “hot spare”

CDF SVTlist Board Tests

Responsibles: Jane Nachtman, Matt Worcester, D. Saltzberg

L2 torture testing:
- 20-40 kHz L1A, no errors (SVX off, running SVT test pattern)
- Tested with ~1M events, no bit errors
- Special run with checks inside alpha: BER < 10^-6
- No collisions with other boards

Problems
- Gets confused if no EE word from SVT; L2P pulls error
  - Due to SVX not sending info to SVT
  - Known problems in SVX have been fixed, others?
  - Bill A. thinking about an SVT timeout to pull error
  - Only happens with beam. Checked (painfully) before shutdown & it worked (could even have taken special Oct. SVT runs with it)

No firmware changes to TRACKlist boards in last 2 months!
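
For reference, a limit from an error-free sample follows from a standard calculation: if zero errors are seen in N independent trials, the largest per-trial error probability p still compatible with that at a given confidence level solves (1 - p)^N = 1 - CL. The slides do not say how the quoted BER limit was derived, so the sketch below is only the textbook calculation, with a hypothetical function name.

```python
import math


def upper_limit_zero_errors(n_trials: int, confidence: float = 0.95) -> float:
    """Upper limit on a per-trial error probability when 0 errors are seen in n_trials.

    Solves (1 - p)**n_trials = 1 - confidence for p, using log1p/expm1 to keep
    precision when p is very small.
    """
    if n_trials <= 0:
        raise ValueError("n_trials must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return -math.expm1(math.log1p(-confidence) / n_trials)
```

For 10^6 error-free events this gives roughly 3 × 10^-6 per event at 95% confidence (the familiar "rule of three"). A per-bit limit is smaller by the number of bits checked in each event, which the slides do not state.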


CDF Some SVTList Plots

No errors in bit-for-bit comparisons

CDF L2 SVT Cutting (before shutdown)

CDF XTRPlist Board Tests

Responsibles: Jane Nachtman, Matt Worcester, D. Saltzberg

L2 torture testing:
- 20-40 kHz L1A, no errors
- Tested with 1M events, no detectable errors
  - XTRD bank has known errors that cause Ntracks mismatch
  - Correct at L2, wrong in readout
  - No errors when cut on Ntrack agreement
  - Handscan of other events looks okay
- No collisions with other boards

Problems
- Illinois to fix XTRD bank filling errors
- One bad pT bit from one XTRP board


CDF XTRPlist plots

No errors in bit-for-bit comparisons when number of tracks agrees.

CDF Spares for TRACKlist

SVTlist & XTRPlist are both instances of one board: TRACKlist
- CPLD change with JTAG connector
- one jumper change

Six production TRACKlist boards
- Currently 2 in L2P crate (permanent)
- Currently 2 in SVT crate (1 or both temporary?)
  - one makes nominal SVTD bank. Convenient for booking SVT crate for test runs
  - having separate boards effectively makes a cable check
  - another board in SVT crate makes XTRP list; could be removed soon?

Six production boards, at least 2 required in system, maybe 3. Right now using 4.

CDF TRACKlist spares

- S/N 1 & 2: prototypes, no longer used
- S/N 3: XTRPlist, OK (in L2P crate)
- S/N 4: SVTlist, OK (used for SVTD bank)
- S/N 5: XTRPlist, OK (“hot spare”)
- S/N 6: SVTlist, MB not working, bad connection
- S/N 7: SVTlist, stuck chisq bit for MB (used for SVTD bank)
- S/N 8: SVTlist, OK (in L2P crate)

All boards work for VME readout

CDF TRACKlist debugging tools

- Can send arbitrary pattern from SVT easily
- Can send arbitrary pattern from XTRP (more difficult)
- Bit-by-bit checking in TrigMon
- Can test BC from XTRP & SVT on every event
- UCLA teststand:
  - data source: merger board
  - data sink: MB and emulator board and/or VME

CDF TRACKlist plans

- Keep running stably
- Fix one SVT spare (bad connection makes MB error)
- Fix one bad bit on another SVT spare
- Wean SVT off of second SVT board
- Make sure all six boards are “hot spares”
- Print hardcopies of schematics & firmware

CDF TRACKlist Documentation

Web pages:
- Specs: http://buggs.physics.ucla.edu/~nachtman/board/specifications_v1.ps
- TIB instructions: http://www-b0.fnal.gov:8000/level2/tib/tib_main.html
- TIB database: http://www-b0.fnal.gov:8000/level2/tib/tib_status.html
- TIB schematics etc.: http://buggs.physics.ucla.edu/~nachtman/tib.html

Schematics on web in .eps format
Need updated hardcopies printed out

CDF ISOlist status

Responsibles: Steve Kuhlmann, Bob Blair
Calculates 5 isolation sums
- DCAS -> ISOpick -> ISOlist
- Clique -> ISOclique -> ISOlist

L2 torture tests (or cosmics)
- need to require eta-phi match (~1-3% failure)
- perfect at 20-40 kHz in all 5 sums

Problems
- with collisions, see eta-phi match (still 1-3% failure), but L2P can check and pass the event
- in 0.5% of events, also scatter of expected vs. seen in all 5 sums (less than analog jitter in Run 1). N.B. the whole scatter comes from crate 1, eta=17.
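
The eta-phi match that the L2P can check might look like the following in software. The data layout is hypothetical: clusters are represented as (eta, phi) integer pairs and isolation entries as a dictionary keyed by the same address, neither of which is taken from the board documentation.

```python
from typing import Dict, List, Tuple

EtaPhi = Tuple[int, int]


def match_isolation(clusters: List[EtaPhi],
                    iso_entries: Dict[EtaPhi, List[int]]
                    ) -> Tuple[Dict[EtaPhi, List[int]], List[EtaPhi]]:
    """Attach isolation sums to clusters by (eta, phi) address.

    Returns (matched, unmatched): matched maps each cluster address to its
    isolation sums; unmatched lists clusters with no isolation entry.
    """
    matched: Dict[EtaPhi, List[int]] = {}
    unmatched: List[EtaPhi] = []
    for addr in clusters:
        sums = iso_entries.get(addr)
        if sums is None:
            unmatched.append(addr)
        else:
            matched[addr] = sums
    return matched, unmatched
```

Keeping the unmatched clusters in a separate list lets a caller decide what to do with them, for example passing the event without an isolation cut, rather than treating a mismatch as a fatal error.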


CDF ISOlist plots

CDF ISOlist spares

In DCAS crates
- Need 1 ISOclique (have 2)
- Need 6 ISOpicks (have 8, 1 with stuck bit)

In L2P crate
- Need 1 ISOlist (have 2)

All spares are “hot spares” except for 1 ISOpick with stuck bit.

CDF ISOlist Debugging Tools

Standard running
- ISOpick times out if DCAS does not send data

Standalone code:
- writes a seed to ISOclique (only board with VME)
- tells it to read out fixed values to isolation system
- can load different values for different buffer numbers
- with a switch, can read energies from DCAS. Essentially this “factors” the problem.

TrigMon & offline code
- Incorporated isolation variables into Monica’s code
- Need to debug some boundary values against the hardware

Teststand at ANL
- data source: ISOpick
- data sink: MB to emulator board

CDF ISOlist Documentation/Plans

Docs
- CDFnote 5788
- Schematics in hardcopy in binders at ANL, but will come to trigger room
- PDF files of schematics (firmware & hardware) are available, will be placed on web by Heather

Plans
- Continue running & monitor robustness
- Go after eta/phi mismatch (needs coordination between ANL and Michigan)
- Find & fix flaky bit in DCAS crate

CDF RECES status

Responsibles: Masa Tanaka, Karen Byrum
Four boards in L2P crate receive information from SMXR by fiber

During L2 torture tests (36 kHz)
- In crate, on backplane, but not used by default table
- No negative interactions

Special L2 executable (TEST_RECES table)
- L1 input is crossing trigger and 4 GeV electron, 8 GeV photon
- runs at 20 kHz L1 input, 100 Hz L2A
- Maybe small bit errors (few thousand events)
- All SMXR to RECES is okay (at end of shutdown)

Problems
- Accidental collisions on Alpha readout
  - Solutions: Arnd’s special retry readout code. Stephen will modify FPGA
- possible bit errors (10^-3)
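
The slides do not describe how the retry readout code works. As a generic illustration of one way a retry can guard against a read corrupted by an accidental collision, the sketch below repeats a read until two consecutive results agree. The function name, the agreement criterion, and the attempt limit are all assumptions and should not be read as a description of the actual code.

```python
from typing import Callable, Optional


def read_with_retry(read_word: Callable[[], int], max_attempts: int = 4) -> Optional[int]:
    """Read until two consecutive reads agree; return None if they never do.

    read_word is any zero-argument callable returning one word; at most
    max_attempts reads are made in total.
    """
    previous = read_word()
    for _ in range(max_attempts - 1):
        current = read_word()
        if current == previous:
            return current
        previous = current
    return None
```

A caller would treat a `None` result as a readout error for that event rather than using a possibly corrupted word.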


CDF Reces Plots

CDF RECES Spares/Docs/Plans

Need 4 RECES boards in system
- 4 in top crate OK
- 2 spare boards OK

Docs
- CDF 5132
- Need to put schematics on web & hardcopies in trigger room

Plans
- Keep RECES on backplane during default running
- Fix readout problem
- Search for BER < 10^-4 in standard datataking & fix

CDF RECES Debugging tools

Special standalone code
- VME based
- Set trigger threshold, load SMXRs
- Send bit patterns to RECES board, Alpha reads through VME
- Check bit-for-bit (checks all bits)
- 10 Hz (tens of thousands of events OK)

ANL teststand
- Not needed any more

TrigMon plots
- temperature plots
- checks bit-for-bit errors

CDF Interface Board status by run (documented for collaboration)

CDF Interface Boards: The Bottom Line

- L2 crate with Clist, XTRPlist, SVTlist, L1 interface, ISOlist all work at up to full speed (20 kHz) as-is.
- Their bit-error rates are measured < 10^-6 (RECES not tested to this level yet).
- Essentially all documentation exists. Some tweaks in progress.
- There is at least one working spare for every board.
- Every board has a real expert living close by.
- Work in progress fixing up extra boards’ bad bits etc.
- In current configuration we can fulfill the charge of running jets, electrons and SVT at 5e31 right now, as-is (assuming all clients are working). “Backups” will only distract.

CDF Goals of Sept. workshop (for interface boards)

- sync errors < 10^-6: DONE
- cut on jets / “reliable Clist”: DONE
- “reliable L1 board”: DONE
- automated HRR: DONE
- “solve XTRP problem”: DONE (don’t remember what it was, but it works)
- reliable SVTlist: DONE
- SVT kludge path: DONE
- alpha code for cutting on SVT: simple code DONE, complete cdf4718-lite underway
- solve Clist eta/phi errors for electrons: DONE for electrons (iso needs work)
- alpha electron code: debugging
- prepare firmware without delays for MB testing: DONE
- test boards on new MB: NOT DONE
- test ISOlist and RECES: DONE
- “improve documentation”: DONE (more to do, as always)

CDF Suggestions - I

- Spares should not be kept in lower crate unless being used. Otherwise water leak (it has happened before!) will destroy all boards. Currently squatting on other spare space; could use space allocated specifically for L2 spares.
- Need more disk space for L2 webpages on B0 machine.
- SVT group should use XTRP list in TL2D and free up spare TRACKlist board.
- “Clients” should be kept in stable configuration.
- D-sized plotter in B0 for printing updated firmware schematics (.eps or .pdf).

CDF Suggestions - II

- Need more of the “good” jumpers (white).
- Make MagicBus document a CDFNOTE.
- File cabinet for all L2 docs. Can be different sized schematics and also text documents, so folders would work better than one binder.
- Web “clearing house” for all L2 web documents. Good documentation exists for all boards, just need a list of links (Heather is working on this). I think we should not over-structure this at this point; leave the microstructure to the individual groups.
- When given choice of testing kludge path vs. real path, try real first.

CDF Suggestions - III

- In next 3-6 months, experts (and their supervisors) should think about training their successors.
- Need to implement bit-for-bit emulation SIXD --> TL2D into TrigMon.
- Need someone to write/implement XFLD --> XTRD emulation.
- A MB “display” module would be a critical debugging tool (LEDs on each line), much like the old Fastbus display module.