Page 1: An Operand Routing Network for an SFQ Reconfigurable Data-Paths Processor

An Operand Routing Network for an SFQ Reconfigurable Data-Paths Processor

I. Kataeva¹, H. Akaike¹, A. Fujimaki¹, N. Yoshikawa², N. Takagi³, K. Murakami⁴

¹ Dept. of Quantum Engineering, Nagoya University
² Dept. of Electrical and Computer Engineering, Yokohama National University
³ Dept. of Information Engineering, Nagoya University
⁴ Research Institute for Information Technology, Kyushu University

Page 2: An Operand Routing Network for an SFQ Reconfigurable Data-Paths Processor

Outline

Architectures of the Operand Routing Network: NDRO-based ORN, crossbar-based ORN

A crossbar switch with a multicasting function

Experimental results: crossbar switches, a 1-to-2 ORN prototype

Conclusion

Page 3: An Operand Routing Network for an SFQ Reconfigurable Data-Paths Processor

Reconfigurable Data-Path processor: an architecture suitable for SFQ implementation

[Figure: block diagram of the processor with blocks labeled LM, SMAC, SB, ORN and FPU]

a two-dimensional array of Floating Point Units (FPUs) connected by Operand Routing Networks (ORNs)

the ORN is reconfigured to fit a data-flow graph (DFG)

Page 4: An Operand Routing Network for an SFQ Reconfigurable Data-Paths Processor

Requirements for the Operand Routing Network

each FPU can be connected to one or more FPUs in the next row

connections are established between the FPUs in the immediate vicinity of each other

number of the connections, N, is odd

one FPU’s output can be connected to either or both inputs of an FPU in the next stage

[Figure: FPUs in adjacent rows and the connections between them]
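The requirements above can be read as a simple validity check on a routing configuration. The following is a minimal sketch under stated assumptions: `m` is taken to be the number of FPUs in a row, the N reachable FPUs are assumed to be centered on the source column, and the names `reachable_targets` and `invalid_routes` are hypothetical, not from the slides.

```python
from typing import Dict, List, Tuple

Route = Tuple[Tuple[int, int], int]  # ((destination column, input port), source column)


def reachable_targets(src: int, n: int, m: int) -> List[int]:
    """Columns in the next row that the FPU in column `src` may drive.

    Assumption: the N candidate FPUs are centered on the source column,
    so the window is src - (N-1)/2 .. src + (N-1)/2, clipped to the row.
    """
    if n % 2 == 0:
        raise ValueError("N must be odd")
    half = (n - 1) // 2
    return [c for c in range(src - half, src + half + 1) if 0 <= c < m]


def invalid_routes(routes: Dict[Tuple[int, int], int], n: int, m: int) -> List[Route]:
    """Return the routes that violate the vicinity requirement.

    `routes` maps (destination column, input port 0 or 1) to a source column.
    One source may appear several times, which models multicasting to
    several FPUs or to both inputs of one FPU.
    """
    bad = []
    for (dst, port), src in routes.items():
        if port not in (0, 1) or dst not in reachable_targets(src, n, m):
            bad.append(((dst, port), src))
    return bad
```

For example, with N=3 and four FPUs per row, the FPU in column 1 may feed both inputs of the FPU in column 1 and one input of the FPU in column 2, but it cannot reach column 3.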

Page 5: An Operand Routing Network for an SFQ Reconfigurable Data-Paths Processor

NDRO-based Operand Routing Network

“+”: 2×M×N NDRO cells; small number of Josephson junctions

[Figure: NDRO-based ORN; NDRO cells connecting two rows of FPUs]

“–”: irregular, non-pipelined structure; becomes cumbersome as the complexity increases

Page 6: An Operand Routing Network for an SFQ Reconfigurable Data-Paths Processor

Crossbar-based Operand Routing Network

[Figure: crossbar-based ORN; crossbar (CB) and half-crossbar (½CB) switches connecting rows of FPUs]

“+”: scalable; pipelined; easily re-designed for any N and M

“–”: large number of Josephson junctions: M ½CB switches and (2×M+1)(N-1)/2 CB switches
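The cell and switch counts quoted for the two architectures are easy to tabulate. The sketch below assumes the expressions are read as 2×M×N NDRO cells for the NDRO-based ORN, and M half-crossbars plus (2×M+1)(N-1)/2 full crossbars for the crossbar-based ORN; the function names are hypothetical.

```python
from typing import Tuple


def ndro_cell_count(n: int, m: int) -> int:
    """NDRO cells in the NDRO-based ORN: 2 x M x N."""
    return 2 * m * n


def crossbar_switch_counts(n: int, m: int) -> Tuple[int, int]:
    """Return (half crossbars, full crossbars) for the crossbar-based ORN.

    Uses M half crossbars and (2M+1)(N-1)/2 full crossbars; N must be odd
    so that the second expression is an integer.
    """
    if n % 2 == 0:
        raise ValueError("N must be odd")
    return m, (2 * m + 1) * (n - 1) // 2


if __name__ == "__main__":
    for n, m in [(3, 8), (5, 10), (9, 32)]:
        print(n, m, ndro_cell_count(n, m), crossbar_switch_counts(n, m))
```

The counts grow linearly in M for both architectures, but the crossbar count also carries the factor (N-1)/2 on full switches, each of which is far larger than a single NDRO cell.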

Page 7: An Operand Routing Network for an SFQ Reconfigurable Data-Paths Processor

Comparison of the ORN architectures

NDRO-based ORN

ORN complexity   Latency, ps   Minimum interval   Control lines   Bias current, A   Power, mW   JJs (incl. control block)
N=2, M=4         200           nf + 200 ps        32              0.2               0.5         1710
N=3, M=8         288           nf + 288 ps        96              0.7               1.75        6840
N=5, M=10        not given     not given          200             not given         not given   not given
N=9, M=32        not given     not given          1152            not given         not given   not given

Crossbar-based ORN

ORN complexity   Latency, clocks   Minimum interval   Control lines   Bias current, A   Power, mW   JJs     JJs (incl. control block)
N=2, M=4         6                 nf                 48              0.3               0.75        3000    4020
N=3, M=8         6                 nf                 100             0.63              1.575       6230    7684
N=5, M=10        10                nf                 208             1.41              3.525       13930   14954
N=9, M=32        18                nf                 1168            8.28              20.7        77440   99088

The JJ counts for the NDRO-based ORN in the table are estimates based on the design of the switch for the RDP prototype (N=3, M=4), which consisted of 3420 JJs (Iwasaki, not yet published)

A crossbar switch with broadcasting function: 296 JJs
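The power and bias-current columns of the comparison are related by P = I × V. The slide does not state the supply voltage; the sketch below assumes 2.5 mV, a value inferred only from the ratio of the tabulated numbers, and checks that every complete row is consistent with it. The names are hypothetical.

```python
SUPPLY_VOLTAGE_MV = 2.5  # assumption: inferred from the table, not stated on the slide

# (bias current in A, power in mW) as listed in the comparison
NDRO_ROWS = {"N=2, M=4": (0.2, 0.5), "N=3, M=8": (0.7, 1.75)}
CROSSBAR_ROWS = {
    "N=2, M=4": (0.3, 0.75),
    "N=3, M=8": (0.63, 1.575),
    "N=5, M=10": (1.41, 3.525),
    "N=9, M=32": (8.28, 20.7),
}


def static_power_mw(bias_current_a, supply_voltage_mv=SUPPLY_VOLTAGE_MV):
    """Static power in mW: amperes times millivolts gives milliwatts."""
    return bias_current_a * supply_voltage_mv


def max_mismatch_mw(rows):
    """Largest difference between the computed and the tabulated power."""
    return max(abs(static_power_mw(i) - p) for i, p in rows.values())


if __name__ == "__main__":
    print(max_mismatch_mw(NDRO_ROWS), max_mismatch_mw(CROSSBAR_ROWS))
```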

Page 8: An Operand Routing Network for an SFQ Reconfigurable Data-Paths Processor

Crossbar switch with a multicasting function, ver.1

[Figure: schematic of the crossbar switch, ver.1, built from DFF, NDRO, AND (00AND, 01AND, 10AND, 11AND) and NOT cells; signals clkin, clkout, din0, din1, dout0, dout1, cross0, bar0, cross1, bar1; pattern labels 00, 01, 10, 11]

788 JJs

area: 1.28 mm × 1 mm

bias current: 87 mA

clock frequency: 27 GHz

4 pipeline stages
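A behavioral model can make the multicasting function concrete. The slides name the control signals cross0, bar0, cross1 and bar1 but do not define their encoding, so the sketch below rests on an assumed meaning: bar_i passes din_i straight to dout_i, and cross_i passes din_i to the opposite output. It ignores clocking and pipeline stages entirely, and the function name is hypothetical.

```python
from typing import Tuple


def crossbar_2x2(din0: bool, din1: bool,
                 bar0: bool, cross0: bool,
                 bar1: bool, cross1: bool) -> Tuple[bool, bool]:
    """Return (dout0, dout1) for one data pulse slot.

    Assumed semantics (not taken from the slides):
      bar_i   routes din_i to dout_i,
      cross_i routes din_i to the other output.
    Setting both bar_i and cross_i multicasts din_i to both outputs.
    """
    dout0 = (din0 and bar0) or (din1 and cross1)
    dout1 = (din1 and bar1) or (din0 and cross0)
    return dout0, dout1


if __name__ == "__main__":
    # Multicast din0 to both outputs
    print(crossbar_2x2(True, False, bar0=True, cross0=True, bar1=False, cross1=False))
```

Under this reading, a half crossbar is the same model with din1 tied low, which leaves one input and two outputs.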

Page 9: An Operand Routing Network for an SFQ Reconfigurable Data-Paths Processor

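Crossbar switch with a multicasting function, ver.2

[Figure: schematic of the crossbar switch, ver.2, built from NDROC02, AND and DFF cells; signals clkin, clkout, din0, din1, dout0, dout1, cross0, bar0, cross1, bar1; pattern labels 00, 01, 10, 11]

296 JJs, 30 mA; 62% reduction of the number of JJs; 2 pipeline stages instead of 4

[Figure: schematic of the ½ crossbar switch, ver.2 (one data input, din0), built from NDRO, AND and DFF cells; signals clkin, clkout, din0, dout0, dout1, cross0, bar0, cross1, bar1; pattern labels 00, 01, 10, 11]

141 JJs, 14 mA; 70% reduction of the number of JJs; 2 pipeline stages instead of 4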
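The reduction figures can be reproduced from the JJ counts in the deck. The sketch below assumes the baselines are the ver.1 designs: 788 JJs for the full crossbar and 455 JJs for the ½ crossbar (the latter taken from the experimental-results slides). The function name is hypothetical.

```python
def reduction_percent(old: float, new: float) -> float:
    """Relative reduction from `old` to `new`, in percent."""
    return 100.0 * (old - new) / old


if __name__ == "__main__":
    # Full crossbar: ver.1 (788 JJs) to ver.2 (296 JJs)
    print(round(reduction_percent(788, 296), 1))
    # Half crossbar: ver.1 (455 JJs) to ver.2 (141 JJs)
    print(round(reduction_percent(455, 141), 1))
```

With these baselines the full crossbar comes out at about 62% and the ½ crossbar at about 69%, which is consistent with the rounded 70% quoted on the slide.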

Page 10: An Operand Routing Network for an SFQ Reconfigurable Data-Paths Processor

Experimental results

[Figure: test setup; the circuit under test with a ladder block and signals data_in, data_out, clkin_lfin, clkin_lfout, clkin_hf]

Circuits tested:

crossbar switch with a multicasting function, ver.1

½ crossbar switch with a multicasting function, ver.1

1-to-2 ORN prototype

Page 11: An Operand Routing Network for an SFQ Reconfigurable Data-Paths Processor

Crossbar switch with a multicasting function, ver.1

[Figure: test waveforms for control patterns 00, 01, 10 and 11; signals din0, din1, bar0, bar1, cross0, cross1, clkin_lfin, clkin_lfout, clkin_hf, dout0, dout1, clkout, clkout1]

Pattern   Bias current   Lower margin, %   Upper margin, %
00        bias1          -13.450           4.213
00        bias2          -4.702            15.014
01        bias1          -13.450           7.745
01        bias2          -7.989            5.156
10        bias1          -15.217           7.745
10        bias2          -4.702            5.156
11        bias1          -13.450           4.213
11        bias2          -4.702            11.728

788 JJs; area: 1.28 mm × 1 mm; bias current: 87 mA; clock frequency: 27 GHz

Total: 1484 JJs; bias current: 172 mA
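The per-pattern margins can be combined into a single operating window for each bias line: the switch works for every pattern only inside the intersection of the individual windows. This is a small sketch of that arithmetic using the measured values above; the names `MARGINS` and `common_window` are hypothetical.

```python
from typing import Dict, Tuple

# (lower margin %, upper margin %) per control pattern, from the measured table
MARGINS: Dict[str, Dict[str, Tuple[float, float]]] = {
    "bias1": {
        "00": (-13.450, 4.213),
        "01": (-13.450, 7.745),
        "10": (-15.217, 7.745),
        "11": (-13.450, 4.213),
    },
    "bias2": {
        "00": (-4.702, 15.014),
        "01": (-7.989, 5.156),
        "10": (-4.702, 5.156),
        "11": (-4.702, 11.728),
    },
}


def common_window(per_pattern: Dict[str, Tuple[float, float]]) -> Tuple[float, float]:
    """Intersection of the margin windows over all control patterns."""
    lower = max(lo for lo, _ in per_pattern.values())  # tightest lower bound
    upper = min(hi for _, hi in per_pattern.values())  # tightest upper bound
    return lower, upper


if __name__ == "__main__":
    for name, table in MARGINS.items():
        print(name, common_window(table))
```

The intersection shows that bias2 is the tighter of the two lines, with a window of roughly ±5% once all four patterns are taken into account.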

Page 12: An Operand Routing Network for an SFQ Reconfigurable Data-Paths Processor

½ crossbar switch with a multicasting function, ver.1

[Figure: test waveforms for control patterns 00, 01 and 10; signals din0, bar0, bar1, cross0, cross1, clkin_lfin, clkin_lfout, dout0, dout1, clkout, clkout1]

Pattern   Bias current lower margin, %   Bias current upper margin, %
00        2.402                          15.512
01        2.7                            17.797
10        1.011                          18.293

Tested at frequencies from 22 to 36 GHz

455 JJs; area: 1 mm × 0.52 mm; bias current: 50 mA; clock frequency: 27 GHz

Total: 1066 JJs; bias current: 127 mA

Page 13: An Operand Routing Network for an SFQ Reconfigurable Data-Paths Processor

1-to-2 ORN: low frequency test

completely functional, exhaustive test

bias_kern0 = -14.6/5.3 %, does not depend on the pattern

minimum margins:

bias_kern1 = -16.1/18.3 % for din0 -> dout11, dout12

bias_kern2 = -20.7/12.6 % for din0 -> dout11, dout12

maximum margins:

bias_kern1 = -40.3/17.2 % for din1 -> dout01

bias_kern2 = -38/12.6 % for din2 -> dout02, dout12

Total: 2031 JJs; total bias current: 224.4 mA

Example (open466, no. 4, chip F2): din0 -> dout01, dout02, dout12; control pattern: CBT0 “10”, CBT1 “01”, CBT2 “10”

[Figure: test waveforms; signals din0, cross10, bar00, clkin_lfin, dout01, dout11, clkout, clkout1, clkout2, dout02, dout12, cross11, cross01, bar02, bar12]

bias_kern0 = -14.6/5.3 %, bias_kern1 = -23/17.2 %, bias_kern2 = -23/12.6 %

Page 14: An Operand Routing Network for an SFQ Reconfigurable Data-Paths Processor

1-to-2 ORN: high frequency test and bias margins frequency dependence

[Plot: bias_kern1 margins for din0 -> dout11 routing; upper and lower margin; horizontal axis ticks from 10.842 to 23.480, vertical axis ticks from -30 to 20]

[Plot: bias_kern1 margins for din0 -> dout01 routing; upper and lower margin; horizontal axis ticks from 10.842 to 21.854, vertical axis ticks from -30 to 20]

[Plot: bias_kern2 margins for din0 -> dout02, dout12 routing; upper and lower margin; horizontal axis ticks from 10.842 to 20.345, vertical axis ticks from -30 to 15]

Example (open466, no. 4, chip F2): din0 -> dout11; control pattern: CBT0 “00”, CBT1 “00”

[Figure: test waveforms; signals din0, clkin_lfin, clkin_lfout1, clkin_hf, dout01, dout11, dout02, dout12, clkout, clkout1, clkout2, bar00, bar10, bar01, bar11]

bias_kern0 margins were frequency independent, measured at frequencies up to 23.5 GHz

Page 15: An Operand Routing Network for an SFQ Reconfigurable Data-Paths Processor

Conclusion

We have proposed two architectures for an ORN: NDRO-based and crossbar-based.

A complexity comparison has been made, and the crossbar-based ORN is considered the better option because of its scalability and pipelined structure.

A new version of the crossbar switch has been designed.

Two versions of a crossbar with a multicasting function have been designed for a 2.5 kA/cm² process and successfully tested at frequencies up to 28 GHz.

A 1-to-2 ORN has been designed and successfully tested at frequencies up to 23.5 GHz.