Overclocking the V1495

Upload: floyd

Post on 05-Feb-2016


DESCRIPTION

Overclocking the V1495. Why do we want to overclock the V1495? Data compression and transfer over multiple clock cycles allow more detailed information to be passed to the track correlator, which can help make the trigger decision more accurate.

TRANSCRIPT

Page 1: Overclocking the V1495

Overclocking the V1495

Page 2: Overclocking the V1495


Why do we want to overclock the V1495?

Each V1495 supports at most 160 (32 x 5) LVDS/ECL/PECL inputs, with 32 outputs.

Page 3: Overclocking the V1495


How do we overclock the V1495?

Cyclone FPGAs offer phase-locked loops (PLLs) and a global clock network for clock management solutions. Cyclone PLLs offer clock multiplication and division, phase shifting, programmable duty cycle, and external clock outputs, allowing system-level clock management and skew control.

Page 4: Overclocking the V1495


RTL Viewer figure: PLL block with ports inclk0, c0, and locked, where c0 = inclk0 x 4.
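A minimal sketch of how such a PLL might be wired into a design. The port names inclk0, c0, and locked come from the figure; the component name pll_x4 (standing in for whatever wrapper the vendor tool generates), the entity name clock_top, and the signal names lclk, fast_clk, and pll_ok are hypothetical. The x4 multiplication itself would be configured inside the generated PLL, which is not shown here.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical top-level wrapper; only the PLL port names follow the figure.
entity clock_top is
  port (
    lclk     : in  std_logic;  -- reference clock (hypothetical name)
    fast_clk : out std_logic;  -- multiplied clock, c0 = inclk0 x 4
    pll_ok   : out std_logic   -- high once the PLL has locked
  );
end entity clock_top;

architecture rtl of clock_top is
  -- Declaration of the tool-generated PLL wrapper (name is an assumption).
  component pll_x4 is
    port (
      inclk0 : in  std_logic;
      c0     : out std_logic;
      locked : out std_logic
    );
  end component;
begin
  u_pll : pll_x4
    port map (
      inclk0 => lclk,
      c0     => fast_clk,
      locked => pll_ok
    );
end architecture rtl;
```

Downstream logic would normally ignore data until pll_ok is high, since c0 is not guaranteed to be stable before lock.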

Page 5: Overclocking the V1495

Flow chart of data transfer over multiple clock cycles:

Front level FPGA: 160 bits of data (all X/Y planes) → data handling → 32 bits

Second level FPGA: 32 bits → data collection → data handling → trigger


Because the input/output ports are limited to a working frequency of 200 MHz and the beam runs at 53 MHz, we can only “overclock” the data transfer by a factor of 3 (32 bits → 87 bits).
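As a rough check of these numbers: the factor of 3 is the largest whole multiple of the beam frequency that stays within the port limit. The 87-bit figure can be reproduced if 3 of the 32 output lines are reserved as cycle markers; that split is an inference from the three marker lines C(18) to C(20) in the transfer code, not something the slide states.

```latex
\[
  \left\lfloor \frac{200\ \text{MHz}}{53\ \text{MHz}} \right\rfloor = 3,
  \qquad
  3 \times 53\ \text{MHz} = 159\ \text{MHz} \le 200\ \text{MHz},
  \qquad
  3 \times (32 - 3)\ \text{bits} = 87\ \text{bits}.
\]
```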

Page 6: Overclocking the V1495


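Data handling code

datatopattern : process (A, B, C)
begin
  -- Pattern match on the inputs: set the first buffer bit and flag
  -- that data is ready to be transferred (ccye = "01").
  if (A(0)='1') or (A(3)='1' and A(1)='1' and A(2)='1') then
    ccye <= "01";
    databuffer(0) <= '1';
    C(24) <= '1';
  else
    databuffer(0) <= '0';
    ccye <= "00";
    C(24) <= '0';
  end if;
end process;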

Page 7: Overclocking the V1495


Transferring code

if c0'event and c0 = '1' then
  if (ccy="0000") then
    -- Idle cycle: clear marker and data lines.
    C(18)<='0'; C(19)<='0'; C(20)<='0';
    C(21)<='0'; C(22)<='0'; C(23)<='0';
    ccy<="0001"; ccyd<="01";
  elsif (ccy="0001" and ccye="01") then
    -- First word: marker on C(18), data bits 0 to 2.
    -- The state holds here until ccye = "01".
    C(18)<='1'; C(19)<='0'; C(20)<='0';
    C(21)<=databuffer(0); C(22)<=databuffer(1); C(23)<=databuffer(2);
    ccy<="0010";
  elsif (ccy="0010") then
    -- Second word: marker on C(19), data bits 3 to 5.
    C(18)<='0'; C(19)<='1'; C(20)<='0';
    C(21)<=databuffer(3); C(22)<=databuffer(4); C(23)<=databuffer(5);
    ccy<="0100";
  elsif (ccy="0100") then
    -- Third word: marker on C(20), data bits 6 to 8.
    C(18)<='0'; C(19)<='0'; C(20)<='1';
    C(21)<=databuffer(6); C(22)<=databuffer(7); C(23)<=databuffer(8);
    ccy<="0000"; ccyd<="00";
  end if;
end if;
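The transfer logic can also be written as a self-contained entity with an enumerated state machine. This is only a sketch under stated assumptions, not the code that was run on the board: the entity name tdm_sender, the data_ready input (standing in for ccye = "01"), and the split of the output into marker and payload vectors are all hypothetical. Unlike the fragment on the slide, this version waits in the idle state rather than in the first-word state.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity tdm_sender is
  port (
    c0         : in  std_logic;                     -- fast PLL clock
    data_ready : in  std_logic;                     -- hypothetical: replaces ccye = "01"
    databuffer : in  std_logic_vector(8 downto 0);  -- nine bits to send
    marker     : out std_logic_vector(2 downto 0);  -- C(18), C(19), C(20) on the slide
    payload    : out std_logic_vector(2 downto 0)   -- C(21), C(22), C(23) on the slide
  );
end entity tdm_sender;

architecture rtl of tdm_sender is
  type state_t is (idle, word0, word1, word2);
  signal state : state_t := idle;
begin
  send : process (c0)
  begin
    if rising_edge(c0) then
      case state is
        when idle =>
          -- Clear all lines and wait for data.
          marker  <= "000";
          payload <= "000";
          if data_ready = '1' then
            state <= word0;
          end if;
        when word0 =>
          marker  <= "001";                   -- marker bit 0 = C(18)
          payload <= databuffer(2 downto 0);
          state   <= word1;
        when word1 =>
          marker  <= "010";                   -- marker bit 1 = C(19)
          payload <= databuffer(5 downto 3);
          state   <= word2;
        when word2 =>
          marker  <= "100";                   -- marker bit 2 = C(20)
          payload <= databuffer(8 downto 6);
          state   <= idle;
      end case;
    end if;
  end process send;
end architecture rtl;
```

The one-hot marker tells the receiving FPGA which third of the buffer is on the payload lines, so the receiver does not need to share a cycle counter with the sender.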

Page 8: Overclocking the V1495


Data collecting & handling code

if c0'event and c0 = '1' then
  -- A(2), A(3), A(4) are the cycle markers; A(5) to A(7) carry the data.
  if (A(2)='1' and A(3)='0' and A(4)='0') then
    buff(0)<= A(5); buff(1)<= A(6); buff(2)<= A(7);
  elsif (A(2)='0' and A(3)='1' and A(4)='0') then
    buff(3)<= A(5); buff(4)<= A(6); buff(5)<= A(7);
  elsif (A(2)='0' and A(3)='0' and A(4)='1') then
    buff(6)<= A(5); buff(7)<= A(6); buff(8)<= A(7);
    C(20)<='1';  -- all three words received
  elsif (C(20)='1') then
    -- Evaluate the collected bits, then clear the buffer.
    if (buff(0)='1') then
      C(21)<='1';
    end if;
    if (buff(0)='1' and buff(8)='1') then
      C(22)<='1';
    end if;
    buff(0)<= '0'; buff(1)<= '0'; buff(2)<= '0';
    buff(3)<= '0'; buff(4)<= '0'; buff(5)<= '0';
    buff(6)<= '0'; buff(7)<= '0'; buff(8)<= '0';
    C(20)<='0';
  elsif (C(21)='1') then
    C(21)<='0';
  elsif (C(22)='1') then
    C(22)<='0';
  end if;
end if;

Page 9: Overclocking the V1495


V1495 System Architecture

Original plan: 3 V1495s: 2 to do tracks in X/Y → lower/upper halves of detector, with 2 32-bit outputs to the V1495 track correlator (given 160 input channels).

Concern: trade-offs between speed of trigger, resolution of information passed to second level, problems with multiple tracks, …

A simple calculation gives fewer than 1876 track trigger conditions in each X-plane V1495 and fewer than 4404 trigger conditions in each Y-plane V1495.

Considering the Y plane alone, we need 15 bits to encode the track information; if there are multiple tracks, the available output bits are far from sufficient.
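To illustrate how quickly the output runs out, take the 15-bit figure for one Y-plane track and a single 32-bit output port, and assume (for illustration only) that each track is encoded independently:

```latex
\[
  n_{\text{tracks}} \times 15\ \text{bits} \le 32\ \text{bits}
  \;\Longrightarrow\;
  n_{\text{tracks}} \le \left\lfloor \frac{32}{15} \right\rfloor = 2 .
\]
```

Under that assumption two Y tracks already use 30 of the 32 bits, and a third track does not fit in a single clock cycle.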

Page 10: Overclocking the V1495


V1495 Simple / Sample Trigger Matrix / Track Program - 1D

“Toy” Example: 3 hits / 4 planes, inputs on A, B, D, E, output on C:

if ( (A( 1)='1' AND B( 1)='1' AND D( 1)='1' )
  OR (A( 1)='1' AND B( 1)='1' AND D( 2)='1' )
  . . .
  OR (B(32)='1' AND D(32)='1' AND E(32)='1' ) ) then
  C(3)<='1';
else
  C(3)<='0';
end if;

Tested with up to several thousand conditions coded; several input conditions all led to output in < 40 ns (32 ns to 38 ns).
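A compilable sketch of the same idea, for readers who want to try it in a simulator. The entity name toy_trigger_matrix, the 0-based 32-bit port widths, and the particular set of "roads" (straight tracks only, any 3 of the 4 planes in the same channel) are assumptions for illustration; the real trigger matrix lists several thousand explicit conditions that are not reproduced here.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity; only the A/B/D/E inputs and the C(3) output
-- follow the slide.
entity toy_trigger_matrix is
  port (
    A, B, D, E : in  std_logic_vector(31 downto 0);
    C          : out std_logic_vector(31 downto 0)
  );
end entity toy_trigger_matrix;

architecture rtl of toy_trigger_matrix is
begin
  trigger : process (A, B, D, E)
    variable hit : std_logic;
  begin
    hit := '0';
    -- Illustrative roads only: a straight track through channel i
    -- seen by at least 3 of the 4 planes.
    for i in A'range loop
      if (A(i) = '1' and B(i) = '1' and D(i) = '1') or
         (A(i) = '1' and B(i) = '1' and E(i) = '1') or
         (A(i) = '1' and D(i) = '1' and E(i) = '1') or
         (B(i) = '1' and D(i) = '1' and E(i) = '1') then
        hit := '1';
      end if;
    end loop;
    C    <= (others => '0');  -- unused outputs stay low
    C(3) <= hit;              -- later assignment overrides bit 3
  end process trigger;
end architecture rtl;
```

The process is purely combinational, as on the slide, so the output delay is set by the depth of the synthesized AND/OR logic rather than by a clock.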

Page 11: Overclocking the V1495

Some Possible V1495 System Architectures

1: 5 V1495s: 4 to do tracks in 4 quadrants, with 32-bit outputs to the V1495 track correlator. Twice as much information is transferred, but there are efficiency problems at the edges.

Each front-level V1495 handles only one quadrant. If a track crosses between quadrants, we lose that information.

Page 12: Overclocking the V1495

Some Possible V1495 System Architectures

2: 5 V1495s: 4 in lower X, lower Y, upper X, upper Y, with 32-bit outputs, but all paddle hits are transferred to the V1495 track correlator over 3 clock cycles. Each front-level V1495 needs to send 80 bits of data to the next V1495. All data processing is done in the second-level V1495. Are the 20,000 logic elements enough for all desired triggers?

Page 13: Overclocking the V1495

Some Possible V1495 System Architectures

3: 3 V1495s: 1 tracks all X, 1 tracks all Y, with 32-bit outputs and data transferred over multiple clock cycles to the V1495 track correlator to triple the information. Efficiency is even, but there are issues with resolution and multiple tracks.

There is less track information in the X paddles than in the Y paddles, so maybe we can put the kinetic energy information in the X paddles.

Page 14: Overclocking the V1495


Track correlator

Original plan: Since “front” FPGAs look at entire halves of detectors, expect to be able to eliminate the vertical inefficiency band.

Possible plan: The S1 scheme has both vertical and horizontal inefficiency bands. The S2 and S3 schemes can also eliminate the horizontal and vertical inefficiency bands. How important is this? Is the track rate so high in the horizontal plane that we want an inefficiency stripe there?

Intuitions please!

C.A. Gagliardi et al., "Hardware Trigger System for Fermilab E866", Nucl. Instrum. Methods A 418, 322 (1998)