
Firefly: Illuminating Future Network-on-Chip with Nanophotonics

Yan Pan, Prabhat Kumar, John Kim†, Gokhan Memik, Yu Zhang, Alok Choudhary

EECS Department, Northwestern University, Evanston, IL, USA
{panyan,prabhat-kumar,g-memik,yu-zhang,a-choudhary}@northwestern.edu

† CS Department, KAIST, Daejeon, Korea

ISCA 2009, Yan Pan

On-Chip Network Topologies

► Mesh: [MIT RAW], [TILE64], [Teraflops]
► C-Mesh: [Balfour’06], [Cianchetti’09]
► Crossbar: [Vantrease’08], [Kirman’06]
► Others: Torus [Shacham’07], Flattened Butterfly [Kim’07], Dragonfly [Kim’08], Hierarchical (Bus & Mesh) [Das’08], Clos [Joshi’09], Ring [Larrabee], ...

► Network-on-chip is critical for performance.


Signaling technologies

► Electrical signaling
– Repeater insertion needed
– Bandwidth density up to 8 Gbps/μm [Chang HPCA’08]

► Nanophotonics
– Bandwidth density ~100 Gbps/μm !!! [Batten HOTI’08]
– Generally distance-independent power consumption
– Speed-of-light low latency
• Propagation
• Switching [Cianchetti ISCA’09]


Nanophotonic components

► Basic components
– Off-chip laser source
– Coupler
– Waveguide
– Resonant modulators
– Resonant detectors (Ge-doped)


Resonant Rings

► Selective
– Couple optical energy of a specific wavelength

► Factors that set the resonant wavelength
– Radius r: baseline wavelength
– Temperature t: manufacturing error correction
– Carrier density d: fast tuning by charge injection


Putting it together

► Modulation & detection
– 64 wavelengths DWDM
– 3 ~ 5 μm waveguide pitch
– 10 Gbps per link
– ~100 Gbps/μm bandwidth density [Batten HOTI’08]

[Figure: example bit streams 11010101 and 10001011 modulated onto and detected from a waveguide]
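The bandwidth-density figure can be sanity-checked with simple arithmetic. The sketch below is only a back-of-the-envelope estimate: it assumes that "10 Gbps per link" means 10 Gbps per wavelength and that density is aggregate waveguide bandwidth divided by waveguide pitch. The function name is made up for illustration.

```python
def bandwidth_density_gbps_per_um(wavelengths, gbps_per_wavelength, pitch_um):
    """Aggregate bandwidth of one waveguide divided by its pitch (assumed model)."""
    return wavelengths * gbps_per_wavelength / pitch_um


# Values taken from the slide: 64 wavelengths, 10 Gbps each, 3 to 5 um pitch.
for pitch_um in (3.0, 5.0):
    density = bandwidth_density_gbps_per_um(64, 10.0, pitch_um)
    print(f"pitch {pitch_um} um -> {density:.0f} Gbps/um")
```

Under these assumptions the estimate falls between roughly 128 and 213 Gbps/μm, the same order of magnitude as the quoted ~100 Gbps/μm.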


What’s the catch?

► Power cost
– Ring heating
– Laser power
– E/O & O/E conversions
– Distance insensitive

► For short links (2.5 mm)
– Nanophotonics
– Electrical
• RC lines with repeater insertion
[Batten HOTI’08] [Cheng ISCA’06]

[Figure: per-bit energy (fJ/b), axis 0 to 700, for Nanophotonics vs. RC Line; stacked components: Optical Components, Ring Heating, Laser, Electrical]

► For long links
– Nanophotonics
• Cost stays the same
– Electrical
• Cost increases
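The short-link versus long-link trade-off can be sketched as a crossover calculation. This is a toy model, not the authors' energy model: it assumes electrical per-bit energy grows linearly with link length while nanophotonic per-bit energy is constant. Both constants below are hypothetical placeholders, not values read from the chart.

```python
# Hypothetical placeholder values, NOT taken from the slides.
OPTICAL_FJ_PER_BIT = 400.0            # assumed constant cost of an optical link
ELECTRICAL_FJ_PER_BIT_PER_MM = 100.0  # assumed cost per mm of repeated RC line


def electrical_energy_fj(length_mm, fj_per_bit_per_mm=ELECTRICAL_FJ_PER_BIT_PER_MM):
    """Per-bit energy of an electrical link under the assumed linear model."""
    return length_mm * fj_per_bit_per_mm


def crossover_length_mm(optical_fj=OPTICAL_FJ_PER_BIT,
                        fj_per_bit_per_mm=ELECTRICAL_FJ_PER_BIT_PER_MM):
    """Link length at which both signaling options cost the same per bit."""
    return optical_fj / fj_per_bit_per_mm


def pick_signaling(length_mm):
    """Choose the cheaper option for a link of the given length."""
    if electrical_energy_fj(length_mm) <= OPTICAL_FJ_PER_BIT:
        return "electrical"
    return "nanophotonic"


print(crossover_length_mm())
print(pick_signaling(2.5), pick_signaling(10.0))
```

With real technology parameters substituted for the placeholders, the same comparison indicates which links are long enough to justify the fixed optical overhead.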


Here is the idea ...

► Design an architecture that differentiates traffic.
– Use electrical signaling for short links.
– Use nanophotonics only for long-range traffic.

► What do we gain?
– Low latency
– High bandwidth density
– High power efficiency
– Localized arbitration
– Scalability


Outline

► Motivation
► Architecture of Firefly
► Evaluation
► Conclusion


Layout View of 64-core Firefly

► Concentration
– 4 cores share a router
– 16 routers

[Figure: 16 tiles, each with four cores (P0 to P3) attached to one router (R)]


Layout View of 64-core Firefly

► Concentration
► Clusters
– Electrically connected
– Mesh topology
– 4 routers per cluster
– 4 clusters

[Figure: the 16 routers grouped into Cluster 0 (C0) to Cluster 3 (C3); the routers of cluster Cx are labeled CxR0 to CxR3]


Layout View of 64-core Firefly

► Concentration
► Clusters
► Assemblies
– Routers from different clusters
– Optically connected
– Logical crossbars

[Figure: routers with the same index in each cluster are grouped into assemblies, e.g. A0 = {C0R0, C1R0, C2R0, C3R0} and A1 = {C0R1, C1R1, C2R1, C3R1}]
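The concentration, cluster, and assembly hierarchy can be sketched as an index mapping. The sketch assumes cores are numbered linearly (router-major, then cluster-major), which the slides do not specify; `locate` and `assembly_members` are illustrative names, not part of any Firefly tooling.

```python
from typing import List, NamedTuple

CORES_PER_ROUTER = 4     # "4 cores share a router"
ROUTERS_PER_CLUSTER = 4  # "4 routers per cluster"
NUM_CLUSTERS = 4         # "4 clusters"


class Location(NamedTuple):
    cluster: int
    router: int
    port: int


def locate(core_id: int) -> Location:
    """Map a core id (0..63) to its cluster, router, and router port.

    The linear numbering scheme is an assumption made for this example.
    """
    total = CORES_PER_ROUTER * ROUTERS_PER_CLUSTER * NUM_CLUSTERS
    if not 0 <= core_id < total:
        raise ValueError(f"core_id must be in [0, {total})")
    router_global, port = divmod(core_id, CORES_PER_ROUTER)
    cluster, router = divmod(router_global, ROUTERS_PER_CLUSTER)
    return Location(cluster, router, port)


def assembly_members(assembly: int) -> List[str]:
    """Routers with the same index across all clusters form one assembly."""
    return [f"C{c}R{assembly}" for c in range(NUM_CLUSTERS)]


print(locate(21))
print(assembly_members(0))
```

Here core 21 lands on router C1R1, which belongs to assembly A1 along with C0R1, C2R1, and C3R1.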


Layout View of 64-core Firefly

► Clusters
– Electrical CMESH
► Assemblies
– Nanophotonic crossbars

[Figure: the four clusters overlaid with assemblies A0 to A3, each a nanophotonic crossbar]

Efficient nanophotonic crossbars needed!


Nanophotonic crossbars

► Single-Write-Multiple-Read (SWMR) [Kirman’06] (CMXbar††)
– Dedicated sending channel
– Multicast in nature
– Receiver compare & discard
– High fan-out laser power

[Figure: SWMR crossbar. Routers R0 to R(N-1); data channels CH0 to CH(N-1), each of width w]

†† [Joshi NOCS’09]


Nanophotonic crossbars

► Multiple-Write-Single-Read (MWSR) [Vantrease’08] (DMXbar††)
– Dedicated receiving channel
– Demux to channel
– Global arbitration needed!

[Figure: MWSR crossbar. Routers R0 to R(N-1); data channels CH0 to CH(N-1), each of width w]

†† [Joshi NOCS’09]


Reservation-assisted SWMR

► Goal
– Avoid global arbitration
– Reduce power

► Proposed design
– Reservation channels
• Narrow
– Multicast to reserve
• Destination ID
• Packet length
– Uni-cast data packet

[Figure: R-SWMR crossbar. Routers R0 to R(N-1); reservation channels CH0a to CH(N-1)a, each of width log(Ns); data channels CH0 to CH(N-1), each of width w]
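The reservation step can be sketched as packing a destination ID and a packet-length code into a narrow flit. This is only an illustration under stated assumptions: it reads log(Ns) as log2 of N routers times s packet-length options, and the bit layout and all function names are hypothetical rather than the authors' encoding.

```python
from typing import Tuple


def bits_for(count: int) -> int:
    """Bits needed to encode values 0..count-1."""
    return (count - 1).bit_length()


def reservation_width(num_routers: int, num_lengths: int) -> int:
    """Assumed reservation channel width: log2(N) + log2(s) bits."""
    return bits_for(num_routers) + bits_for(num_lengths)


def encode_reservation(dest: int, length_code: int,
                       num_routers: int, num_lengths: int) -> int:
    """Pack destination ID (high bits) and length code (low bits)."""
    if not 0 <= dest < num_routers:
        raise ValueError("dest out of range")
    if not 0 <= length_code < num_lengths:
        raise ValueError("length_code out of range")
    return (dest << bits_for(num_lengths)) | length_code


def decode_reservation(flit: int, num_lengths: int) -> Tuple[int, int]:
    """Unpack a reservation flit into (dest, length_code)."""
    length_bits = bits_for(num_lengths)
    return flit >> length_bits, flit & ((1 << length_bits) - 1)


def should_listen(flit: int, my_id: int, num_lengths: int) -> bool:
    """Every receiver sees the multicast reservation; only the destination
    then listens for the unicast data packet."""
    dest, _ = decode_reservation(flit, num_lengths)
    return dest == my_id


flit = encode_reservation(dest=3, length_code=1, num_routers=4, num_lengths=2)
print(reservation_width(4, 2), flit, decode_reservation(flit, 2))
```

With 4 routers and 2 packet lengths the reservation flit needs only 3 bits, which is why the reservation channels can be much narrower than the data channels.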


Router Microarchitecture

► Virtual-channel router
– Added optical link ports and extra buffer.
– Dedicated sending channel for all traffic.
– Separate receiving channels from other clusters.

[Figure: virtual-channel router with routing computation, VC allocator, switch allocator, and crossbar switch; inputs Inject (Input 1) to Input k, each with VC 1 to VC v; outputs Eject (Output 1) to Output k; an arbiter and E/O conversion on the global output; O/E conversion and input buffers on global inputs 1 to g]


Routing

► Routing
– Intra-cluster routing
– Traversing optical link

[Figure: example route from C0R0 to cluster C5 (routers C5R0 to C5R3), with pipeline timelines for head, body, and tail flits built from stages RT, RB, LT, and OA; the router microarchitecture and R-SWMR crossbar diagrams are repeated with FIREFLY_src and FIREFLY_dest marked]
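The two routing steps can be sketched as a small function. It assumes that a packet first moves electrically inside the source cluster to the router whose index matches the destination router, then crosses that assembly's optical crossbar once. The mesh hops inside a cluster are collapsed into a single logical step, and `firefly_route` is an illustrative name only.

```python
from typing import List, Tuple

Hop = Tuple[str, str, str]  # (link type, from router, to router)


def firefly_route(src: Tuple[int, int], dst: Tuple[int, int]) -> List[Hop]:
    """Return the logical hops from src to dst, each given as (cluster, router).

    Assumed policy: intra-cluster (electrical) routing first, then at most
    one optical traversal within the destination router's assembly.
    """
    (src_cluster, src_router), (dst_cluster, dst_router) = src, dst
    hops: List[Hop] = []
    if src_cluster == dst_cluster:
        # Same cluster: electrical links only.
        if src_router != dst_router:
            hops.append(("electrical",
                         f"C{src_cluster}R{src_router}",
                         f"C{dst_cluster}R{dst_router}"))
        return hops
    if src_router != dst_router:
        # Reach the local router that sits in the destination's assembly.
        hops.append(("electrical",
                     f"C{src_cluster}R{src_router}",
                     f"C{src_cluster}R{dst_router}"))
    # Single optical traversal to the destination cluster.
    hops.append(("optical",
                 f"C{src_cluster}R{dst_router}",
                 f"C{dst_cluster}R{dst_router}"))
    return hops


print(firefly_route((0, 0), (2, 3)))
```

For example, a packet from C0R0 to C2R3 first moves electrically to C0R3 and then takes the optical link of assembly A3 to C2R3.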


Firefly – another look

► Clusters
– Short electrical links
– Concentrated mesh

► Assemblies
– Long nanophotonic links
– Partitioned crossbars

► Benefits
– Traffic locality
– Reduced hardware
– Localized arbitration
– Distributed inter-cluster bandwidth

[Figure: logical view of clusters C0 to C3, each router serving four cores (P), with assemblies A0 to A3 linking same-index routers across clusters]


Evaluation Setup

► Cycle-accurate simulator (Booksim)
► Firefly vs. CMESH, Dragonfly† and OP_XBAR
► Synthetic traffic patterns and traces

Code Name | Signaling | Topology | Global Routing | Min #VC
CMESH | Electrical | Concentrated mesh | Dimension-ordered routing | 1
DFLY_MIN | Hybrid | Dragonfly topology mapped to on-chip network | Minimal routing, traversing nanophotonics at most once | 2
DFLY_VAL | Hybrid | Dragonfly topology mapped to on-chip network | Nonminimal routing, traversing nanophotonics up to twice | 3
OP_XBAR | Optical | All-optical crossbar using token-based global arbitration | Destination-based routing | 1
FIREFLY | Hybrid | Proposed hybrid architecture with multiple logical optical inter-cluster crossbars | Intra-cluster routing in the source cluster before traversing nanophotonics | 1

† [Kim et al., ISCA’08]


Load / Latency Curve

► Throughput
– Up to 4.8x over OP_XBAR
– At least +70% over Dragonfly

[Figure: four panels, (a) to (d), plotting latency (#cycles) against injection rate; panel titles include "Bitcomp, 1-cycle" and "Uniform, 1-cycle"; annotations mark 4.8x and 70%]


Energy Breakdown

► Reduced hardware by partitioning
– Reduced heating
► Throughput impact
► Locality
– 34% energy reduction over OP_XBAR with locality

[Figure: average per-packet energy (nJ), axis 0 to 1.2, for CMESH, DFLY_MIN, DFLY_VAL, OP_XBAR, and FIREFLY under Taper_L0.7D7 and Bitcomp traffic; components: Router / DEMUX, Electrical Link, Optical Link, Laser, Ring Heating]


Technology Sensitivity

► α is heating ratio and β is laser ratio.
► Firefly favors traffic locality.

[Figure: sensitivity plots for bitcomp and taper_L0.7D7 traffic]


Conclusion

► Technology impacts architecture
– New opportunities in nanophotonics
• Low latency, high bandwidth density
– Tailored architectures needed

► Firefly benefits from nanophotonics by providing
– Power efficiency
• Hybrid signaling
• Partitioned R-SWMR crossbars: reduced hardware/power
– Scalability
• Scalable inter-cluster bandwidth
• Low-radix routers/crossbars