microring weight banks for sth3b.1.pdf cleo 2018 © osa 2018 · techniques for extracting weight...

2
STh3B.1.pdf CLEO 2018 © OSA 2018 Microring Weight Banks for Neuromorphic Silicon Photonics A. N. Tait, T. Ferreira de Lima, M. A. Nahmias, B. J. Shastri, and P. R. Prucnal Department of Electrical Engineering, Princeton University, Princeton, NJ, 08544 USA [email protected] Abstract—Research in photonic information processing has experienced a recent resurgence. Microring weight banks enable reconfigurable neural network approaches on silicon photonics. We discuss the latest results in neuromorphic silicon photonic networks and microring weight banks. As compared to electronics, optics lacks key physical re- quirements for implementing digital logic; however, these same physical qualities make it superior at analog signal processing. For example, radio frequency (RF) and microwave photonics (MWP) exploit the bandwidth, linearity, and tun- ability of optics to address next-generation wireless system needs [1], but their processing repertoire is usually constrained to single-input linear operations. Extending this performance to much more complex tasks would require scalable models and photonic implementations of network-based information processing. Optical neural networks have been explored using free-space optics, but were effectively considered untenable about two decades ago because they were large, expensive, sub-GHz, and severely limited in scalability. Neuromorphic photonics [2] aims to map physical models of optoelectronic systems to abstract models of neural networks. Today, rapid advances in photonic manufacturing could revo- lutionize large-scale photonic systems in terms of size, cost, device performance, reliability, and scalability. The processing repertoire of single neurons is relatively simple while that of overall neural networks can be incredibly complex and varied. Network connection strengths (a.k.a. weights) are closely tied to computational function, not just data communication. Neural networks owe their name to biology, but are in fact a class of well-studied mathematical models. This extensive knowledge of how to relate weight profiles to function can be leveraged by hardware that can be made isomorphic to a neural network framework – neuromorphic systems. Neuromorphic electronic architectures utilizing this strategy have recently attracted tremendous research interest [3]. Here, we discuss recent advances in integrated neuromorphic photonics and silicon photonic weight banks. Interest in integrated lasers with neuron-like spiking behav- ior has flourished over the past several years [4]. These lasers exhibit excitable (a.k.a spiking) dynamics where the outputs are pulses of fixed amplitude and continuous in time. The differential equations describing a laser cavity with embedded saturable absorber can be designed to map exactly to those of a common neuron model called leaky integrate-and-fire [5], except on timescales roughly 10 7 × faster than their biological 50µm W D M w 11 w 12 w 13 w 1in w 21 w 22 w 23 w 2in w 31 w 32 w 33 w 3in LD λ1 LD λ2 LD λ3 LD λ4 W D M MRR weight bank BPD * LD:λ4 MRR weight bank BPD * LD:λ1 ... -20 -10 0 10 20 W F = 0.449 s 1 s 2 Neuron state [s 1 , s 2 ] (mV) -20 -10 0 10 20 W F = 0.529 Time (ms) -0.5 0 0.5 -20 -10 0 10 20 W F = 0.629 State 1 : s 1 (normalized) W win = WF -1 0 1 WF 0 Neuron state [s 1 , s 2 ] (mV) a) b) c) d) a) c) b) Fig. 1. Silicon Photonic Neural Networks. a) Broadcast-and-weight archi- tecture. b) Demonstrated broadcast-and-weight network based on microring weight banks. c) Observed oscillatory bifurcation in the experiment in (b). counterparts. Experimental work on excitable lasers has so far focused on isolated neurons [6] and fixed, cascadable chains [7], [8]. The shortage of research on networks of these lasers might be explained by the challenges of implementing low-loss, compact, and tunable filters in the active III/V platforms required for laser gain. The “broadcast-and-weight” architecture (Fig. 1a) was pro- posed as a protocol for implementing networks of photonic neurons using integrated photonic devices [9]. Neuron outputs are wavelength division multiplexed (WDM) and broadcast to other neurons. At each neuron input, these WDM signals are weighted by reconfigurable, continuous-valued filters called microring (MRR) weight banks and then summed by total power detection. This electrical weighted sum then modulates the corresponding WDM carrier through a nonlinear electro- optic device [10]. The first demonstration of a broadcast-and- weight network [11] established a dynamical correspondence with a continuous-time recurrent neural network (CTRNN) model. Evidence of this dynamical correspondence was found by reproducing bifurcations expected in the CTRNN. This correspondence allows for the application of neural network design tools to analog photonic networks, making them

Upload: others

Post on 13-Oct-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Microring Weight Banks for STh3B.1.pdf CLEO 2018 © OSA 2018 · Techniques for extracting weight vectors from time-domain WDM measure-ments and for precise control of MRR weights

STh3B.1.pdf CLEO 2018 © OSA 2018

Microring Weight Banks forNeuromorphic Silicon Photonics

A. N. Tait, T. Ferreira de Lima, M. A. Nahmias, B. J. Shastri, and P. R. PrucnalDepartment of Electrical Engineering, Princeton University, Princeton, NJ, 08544 USA

[email protected]

Abstract—Research in photonic information processing hasexperienced a recent resurgence. Microring weight banks enablereconfigurable neural network approaches on silicon photonics.We discuss the latest results in neuromorphic silicon photonicnetworks and microring weight banks.

As compared to electronics, optics lacks key physical re-quirements for implementing digital logic; however, thesesame physical qualities make it superior at analog signalprocessing. For example, radio frequency (RF) and microwavephotonics (MWP) exploit the bandwidth, linearity, and tun-ability of optics to address next-generation wireless systemneeds [1], but their processing repertoire is usually constrainedto single-input linear operations. Extending this performanceto much more complex tasks would require scalable modelsand photonic implementations of network-based informationprocessing. Optical neural networks have been explored usingfree-space optics, but were effectively considered untenableabout two decades ago because they were large, expensive,sub-GHz, and severely limited in scalability.

Neuromorphic photonics [2] aims to map physical models ofoptoelectronic systems to abstract models of neural networks.Today, rapid advances in photonic manufacturing could revo-lutionize large-scale photonic systems in terms of size, cost,device performance, reliability, and scalability. The processingrepertoire of single neurons is relatively simple while that ofoverall neural networks can be incredibly complex and varied.Network connection strengths (a.k.a. weights) are closelytied to computational function, not just data communication.Neural networks owe their name to biology, but are in facta class of well-studied mathematical models. This extensiveknowledge of how to relate weight profiles to function can beleveraged by hardware that can be made isomorphic to a neuralnetwork framework – neuromorphic systems. Neuromorphicelectronic architectures utilizing this strategy have recentlyattracted tremendous research interest [3]. Here, we discussrecent advances in integrated neuromorphic photonics andsilicon photonic weight banks.

Interest in integrated lasers with neuron-like spiking behav-ior has flourished over the past several years [4]. These lasersexhibit excitable (a.k.a spiking) dynamics where the outputsare pulses of fixed amplitude and continuous in time. Thedifferential equations describing a laser cavity with embeddedsaturable absorber can be designed to map exactly to those ofa common neuron model called leaky integrate-and-fire [5],except on timescales roughly 107× faster than their biological

50µm

W

D

M

w11

w12

w13

w1in

w21

w22

w23

w2in

w31

w32

w33

w3in

LD

λ1

LD

λ2

LD

λ3

LD

λ4

W

D

M

MRR weight bank BPD *LD:λ4

MRR weight bank BPD *LD:λ1

...

-20

-10

0

10

20 WF = 0.449

s1

s2

Neuro

n s

tate

[s

1, s

2] (m

V)

-20

-10

0

10

20 WF = 0.529

Time (ms)-0.5 0 0.5

-20

-10

0

10

20 WF = 0.629

Sta

te 1

: s

1(n

orm

aliz

ed

)

W win

=

WF −1 0

1 WF 0

Neuro

n s

tate

[s

1, s

2] (m

V)

a)

b)

c)

d)

a)

c)

b)

Fig. 1. Silicon Photonic Neural Networks. a) Broadcast-and-weight archi-tecture. b) Demonstrated broadcast-and-weight network based on microringweight banks. c) Observed oscillatory bifurcation in the experiment in (b).

counterparts. Experimental work on excitable lasers has sofar focused on isolated neurons [6] and fixed, cascadablechains [7], [8]. The shortage of research on networks of theselasers might be explained by the challenges of implementinglow-loss, compact, and tunable filters in the active III/Vplatforms required for laser gain.

The “broadcast-and-weight” architecture (Fig. 1a) was pro-posed as a protocol for implementing networks of photonicneurons using integrated photonic devices [9]. Neuron outputsare wavelength division multiplexed (WDM) and broadcast toother neurons. At each neuron input, these WDM signals areweighted by reconfigurable, continuous-valued filters calledmicroring (MRR) weight banks and then summed by totalpower detection. This electrical weighted sum then modulatesthe corresponding WDM carrier through a nonlinear electro-optic device [10]. The first demonstration of a broadcast-and-weight network [11] established a dynamical correspondencewith a continuous-time recurrent neural network (CTRNN)model. Evidence of this dynamical correspondence was foundby reproducing bifurcations expected in the CTRNN. Thiscorrespondence allows for the application of neural networkdesign tools to analog photonic networks, making them

mdizon
COPYRIGHT
Page 2: Microring Weight Banks for STh3B.1.pdf CLEO 2018 © OSA 2018 · Techniques for extracting weight vectors from time-domain WDM measure-ments and for precise control of MRR weights

not merely reconfigurable, but programmable. For example,a simulated continuous-time recurrent neural network withmodulator-class neurons can be programmed to solve nonlin-ear differential equations with a 294× speedup compared to aCPU baseline [11].

MRR weight banks are the seat of reconfigurability inintegrated analog photonic networks. Their performance istherefore closely tied to the potential of these overall systems.In a MRR weight bank, the transmission seen by a WDMchannel is configured by thermally tuning that filter on andoff resonance [12]. A bank of these filters coupled to twobus waveguides independently weights all WDM channels. Acomplementary –1 to +1 weight range is achieved by balancedphotodetection of the multiplexed bus WGs. Techniques forextracting weight vectors from time-domain WDM measure-ments and for precise control of MRR weights were introducedin Ref. [13]. The demonstrated weight resolution of 4.1 bitsplus a sign bit (i.e. 34 distinguishable levels) is on par withthe weight resolution of digital neuromorphic electronics.

In a MRR-based WDM device, the channel count limit is de-termined by the finesse of the resonators and the channel spac-ing normalized to the filter linewidth. In MRR demultiplexers,this channel density parameter trades off with inter-channelcrosstalk. In a weight bank, all WDM signals exit the samepair of waveguides, so the concept of inter-channel crosstalkbreaks down. Instead, the channel density is limited by theability to weight neighboring signals independently [14]. Aunique property of the weight bank is the presence of two buswaveguides between filters that act upon neighboring WDMchannels. Coherent paths can be formed for wavelengths thatpartially couple through the neighboring filter.

The nature of inter-filter interference is fundmentally dif-ferent for MRR filters that are odd-pole vs. even-pole. In aodd-pole bank, a channel partially coupled through a neigh-boring filter returns through the opposite bus WG to completea resonator-like feedback path; in an even-pole bank, thepartially dropped channel continues in the same directionto instead complete an interferometer-like feedforward path.Interferometer-like interference depends on a path length dif-ference, rather than a sum, so changes that affect both bus WGsequally should not change the interferometric phase condition.Two-pole designs investigated in [15] can exploit inter-filterinterference to achieve a WDM density improvement of 3.4×.Furthermore, the inter-channel phase condition is tolerant todynamic tuning only in 2-pole weight banks.

Neuromorphic silicon photonics [2] combines physicalmodels of optoelectronic systems with computational modelsof neural networks. It represents a new opportunity for ma-chine information processing on sub-nanosecond timescales.The strategy of neuromorphic engineering is to externalize therisk of developing computational theory alongside hardware.The strategy of remaining compatible with silicon photonicsexternalizes the risk of platform development. Silicon pho-tonic manufacturing introduces unprecedented opportunitiesfor large-scale, analog photonic systems with wide reconfig-urability. By applying neural abstractions for programming

c)

DROP

THRUIN

20 µm

IN

TH

RU

DROP

100

µm

RF

out

d)

WDM

in

IN THRU

DROP

PD+

PD–

Channel 1 weight-1 -0.5 0 0.5 1

Channel 2 w

eig

ht

-1

-0.5

0

0.5

1

Channel 1 weight

Ch

an

nel 2

weig

ht

Time (ns)

a)

b)

Fig. 2. Microring Weight Banks. a) Concept and device with tunable MRRs– IN, DROP, and THRU are all multiplexed. b) Precise two-channel controlresults – black grid is target; red is mean error; blue is variance. c) One-pole(A) vs. Two-pole (B, C) designs. Two-pole designs are robust to tuning withthe asymmetric device (C) exhibiting the deepest isolation between peaks.

and learning interconnects, these systems could find applica-tion in new regimes of information processing where speed,adaptibility, and energy efficiency are paramount. Neuromor-phic photonics is likely to first impact microwave signalprocessing and scientific computing. It might also be appliedto ultrafast control and machine learning [16]. Since thisultrafast computational regime is largely unexplored, it couldfurthermore enable applications that are as of yet unknown.

REFERENCES

[1] D. Perez et al., Nature Communications, vol. 8, no. 1, p. 636, 2017.[2] P. R. Prucnal and B. J. Shastri. CRC Press, May 2017.[3] G. Indiveri and S. C. Liu, Proceedings of the IEEE, vol. 103, no. 8, pp.

1379–1397, Aug 2015.[4] P. R. Prucnal et al., Adv. Opt. Photon., vol. 8, no. 2, pp. 228–299, Jun

2016.[5] M. A. Nahmias et al., IEEE J. Sel. Top. Quantum Electron., vol. 19,

no. 5, pp. 1–12, Sept 2013.[6] F. Selmi et al., Phys. Rev. Lett., vol. 112, p. 183902, May 2014.[7] T. V. Vaerenbergh et al., Opt. Express, vol. 20, no. 18, pp. 20 292–20 308,

Aug 2012.[8] B. J. Shastri et al., Sci. Rep., vol. 5, p. 19126, Dec. 2015.[9] A. N. Tait et al., J. Lightwave Technol., vol. 32, no. 21, pp. 4029–4041,

Nov 2014.[10] M. A. Nahmias et al., Applied Physics Letters, vol. 108, no. 15, 2016.[11] A. N. Tait et al., Scientific Reports, vol. 7, no. 1, p. 7430, 2017.[12] A. Tait et al., Photonics Technol. Lett., vol. 28, no. 8, pp. 887–890, April

2016.[13] A. N. Tait et al., Opt. Express, vol. 24, no. 8, pp. 8895–8906, Apr 2016.[14] ——, IEEE J. Sel. Top. Quantum Electron., vol. 22, no. 6, 2016.[15] ——, Opt. Lett., vol. (peer review), 2018.[16] T. Ferreira de Lima et al., Nanophotonics, vol. 6, no. 3, 2017.

CLEO 2018 © OSA 2018STh3B.1.pdf