using magnetic tunnel junctions to compute like the brain...stochastic computing with...

36
· PML · NDCD · Alternative Computing Group 1 Using magnetic tunnel junctions to compute like the brain Mark Stiles with help from a cast of dozens Why? To understand how the brain works To develop machine consciousness To do cognitive computing more efficiently

Upload: others

Post on 18-Jun-2021

8 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Using magnetic tunnel junctions to compute like the brain...Stochastic computing with superparamagnetic tunnel junctions 27 Neural network Stochastic bit streams AND and OR gates,

· PML · NDCD · Alternative Computing Group 1

Using magnetic tunnel junctions to compute like the brainMark Stiles with help from a cast of dozens

Why?

To understand how the brain works

To develop machine consciousness

To do cognitive computing more efficiently

Page 2: Using magnetic tunnel junctions to compute like the brain...Stochastic computing with superparamagnetic tunnel junctions 27 Neural network Stochastic bit streams AND and OR gates,

· PML · NDCD · Alternative Computing Group

Hardware for Artificial Intelligence – NIST Gaithersburg

2

Me Vivek Amin

Advait Madhavan Mark

Anders

Philippe Talatchian

Jonathan Goodwill

Matthew Daniels Brian

Hoskins

Siyuan Huang

Alice Mizrahi

Nikolai Zhitenev

Jabez McClelland

Page 3: Using magnetic tunnel junctions to compute like the brain...Stochastic computing with superparamagnetic tunnel junctions 27 Neural network Stochastic bit streams AND and OR gates,

· PML · NDCD · Alternative Computing Group

Computers are designed to solve numerical problems, the brain excels at categorical problems

3

Natural language processing

Image: cmu.edu

Satellite trajectories

Image: wikimedia.org

Digital Computer Neuromorphic System

High precision numerical Categorical

Page 4: Using magnetic tunnel junctions to compute like the brain...Stochastic computing with superparamagnetic tunnel junctions 27 Neural network Stochastic bit streams AND and OR gates,

· PML · NDCD · Alternative Computing Group

Computers separate long term memory from processing,

while in the brain, they are intimately connected

4

Neurons (1011)

Synapses (1015)

Small scale structure of the brain:

Memory Unit

Central Processing Unit

Control Unit

Logic Unit

Input

Output

von Neumann architecture:

Digital Computer Neuromorphic System

Memory-Processor Bottleneck Collocation of processor and memory

High precision numerical Categorical

Page 5: Using magnetic tunnel junctions to compute like the brain...Stochastic computing with superparamagnetic tunnel junctions 27 Neural network Stochastic bit streams AND and OR gates,

· PML · NDCD · Alternative Computing Group

Computers use a rigid encoding scheme,

the brain prioritizes resiliency with more flexible schemes

5

Neuronal spiking

timeImage: Wikipedia

Clocked processing

Image: Wikipedia

Digital Computer Neuromorphic System

Discrete, Deterministic, & Synchronous Analog, Stochastic, & Asynchronous

Memory-Processor Bottleneck Collocation of processor and memory

High precision numerical Categorical

Rossant et al., Frontiers in Neuroscience 5, 9 (2011)

Neuro

n I

ndex

Binary Representation

64 32 16 8 4 2 1

1 0 0 1 1 0 1

1 s

Page 6: Using magnetic tunnel junctions to compute like the brain...Stochastic computing with superparamagnetic tunnel junctions 27 Neural network Stochastic bit streams AND and OR gates,

· PML · NDCD · Alternative Computing Group

Neural networks provide the simplest implementation of brain-like cognitive computing (rate coding)

6

inputs

idealoutputs

information flow

● Neurons output a rate representing a spike train.

● The rate from each neuron is multiplied by synaptic weights for each neuron pair.

● Neurons sum the incoming weights and output a non-linear function of the sum (activation function).

● Network non-linearly transforms space to bring similar inputs together

A: 1.00

B: 0.73

C: 0.53

Page 7: Using magnetic tunnel junctions to compute like the brain...Stochastic computing with superparamagnetic tunnel junctions 27 Neural network Stochastic bit streams AND and OR gates,

· PML · NDCD · Alternative Computing Group

A whole range of approaches to more efficient computing

7

Machine Learning Specialized

Hardware, GPU, TPU

Specialty Chips, TrueNorth, Loihi, BrainScales

Hardware for Artificial Intelligence

Software

Architectures

Encoding

Circuits

Physical devices

Page 8: Using magnetic tunnel junctions to compute like the brain...Stochastic computing with superparamagnetic tunnel junctions 27 Neural network Stochastic bit streams AND and OR gates,

· PML · NDCD · Alternative Computing Group

CMOS has developed well-defined abstraction layers,but with novel devices, designing across the stack is necessary

8

Details depend on what the device can do.

What is important depends on what the architecture needs

Software

Architectures

Encoding

Circuits

Physical devices

Page 9: Using magnetic tunnel junctions to compute like the brain...Stochastic computing with superparamagnetic tunnel junctions 27 Neural network Stochastic bit streams AND and OR gates,

· PML · NDCD · Alternative Computing Group

Magnetic tunnel junctions are multifunctional devices

9

● Non-volatile embedded memory

● Spin-torque nano-oscillators

● Superparamagnetic tunnel junctions

0 2 4 6 8

Re

sis

tan

ce

Time (s)

RAP

RP

RAP

RP

Intel: MRAM integrated into 22nm FinFET CMOS

Neuronal spiking

timeImage: Wikipedia

Rossant et al., Frontiers in Neuroscience 5, 9 (2011)

Neuro

n I

ndex

Page 10: Using magnetic tunnel junctions to compute like the brain...Stochastic computing with superparamagnetic tunnel junctions 27 Neural network Stochastic bit streams AND and OR gates,

· PML · NDCD · Alternative Computing Group

Bigger energy barriers mean longer data retention, but more energy needed to write.

10

Energy

barrier

60 kT

14 kT

years

ms

μs

Memory for storage (MRAM)

Short term memory (embedded/cache)

Superparamagnetic (random signals)

7 kT

Retention

Time

AP

P

“1”

“0”

low-resistance high-resistance

Page 11: Using magnetic tunnel junctions to compute like the brain...Stochastic computing with superparamagnetic tunnel junctions 27 Neural network Stochastic bit streams AND and OR gates,

· PML · NDCD · Alternative Computing Group

Magnetic tunnel junctions can be controlled by current.

11

Electron flow through tunnel junction→ Spin-transfer torque

Electron flow through heavy metal underlayer→ Spin-orbit torque

IHM

IMTJ

low-resistance high-resistance

Energy

Parallel Antiparallel

Negative current

low-resistance high-resistance

Energy

Parallel Antiparallel

Positive current

Page 12: Using magnetic tunnel junctions to compute like the brain...Stochastic computing with superparamagnetic tunnel junctions 27 Neural network Stochastic bit streams AND and OR gates,

· PML · NDCD · Alternative Computing Group

Superparamagnetic magnetic tunnel junctions: Control rate with spin-transfer torque (or spin-orbit torque)

12

0.100 0.125 0.150

400

500

600

700

Re

sis

tan

ce

(

)

Time (s)

-50 0 500

500

1000

Rate

(H

z)

Current (A)

Mean R

esi

stance

(Pro

babili

ty)

𝑅𝐴𝑃

𝑅𝑃

IMTJ

AP

kBT

P

Page 13: Using magnetic tunnel junctions to compute like the brain...Stochastic computing with superparamagnetic tunnel junctions 27 Neural network Stochastic bit streams AND and OR gates,

· PML · NDCD · Alternative Computing Group

Superparamagnetic magnetic tunnel junctions: Control rate with spin-transfer torque (or spin-orbit torque)

13

0.100 0.125 0.150

400

500

600

Resis

tan

ce (

)

Time (s)

0.100 0.125 0.150

400

500

600

700

Re

sis

tan

ce

(

)

Time (s)

0.100 0.125 0.150

400

500

600

Resis

tan

ce (

)

Time (s)

-50 0 500

500

1000

Rate

(H

z)

Current (A)

Mean R

esi

stance

(Pro

babili

ty)

𝑅𝐴𝑃

𝑅𝑃

IMTJ

AP

kBT

I > 0

Stabilizes

antiparallel

state

P

I < 0

Stabilizes

parallel

stateP AP

kBT

AP

kBT

P

Page 14: Using magnetic tunnel junctions to compute like the brain...Stochastic computing with superparamagnetic tunnel junctions 27 Neural network Stochastic bit streams AND and OR gates,

· PML · NDCD · Alternative Computing Group

Computing with superparamagnetic tunnel junctions

14

● Random number generation

● Probabilistic computing

● Rate coding

Rate

Current

Pro

babili

ty

Current

𝑝 = 0.5

Borders et al., Nature 573, 390-393 (2019)

p -bit

Won Ho Choi et al., 2014 IEEE IEDM pp. 12.5.1-12.5.4,

D. Vodenicarevic et al., Phys. Rev. Applied 8, 054045 (2017)

XOR gate

Direction

Page 15: Using magnetic tunnel junctions to compute like the brain...Stochastic computing with superparamagnetic tunnel junctions 27 Neural network Stochastic bit streams AND and OR gates,

· PML · NDCD · Alternative Computing Group

Many reasons to incorporate Complementary Metal Oxide Semiconductor (CMOS)

15

pmos

nmos

𝑉𝑔

𝑉𝑔

On for 𝑉𝑔 = 0

Off for 𝑉𝑔 = 𝑉𝑑𝑑

Off for 𝑉𝑔 = 0

On for 𝑉𝑔 = 𝑉𝑑𝑑

Author: Cephidan, https://commons.wikimedia.org/wiki/File:LDD-MOS_transistor_-_CMOS_with_STI.svg

Page 16: Using magnetic tunnel junctions to compute like the brain...Stochastic computing with superparamagnetic tunnel junctions 27 Neural network Stochastic bit streams AND and OR gates,

· PML · NDCD · Alternative Computing Group

Many reasons to incorporate Complementary Metal Oxide Semiconductor (CMOS)

16

pmos

nmos

𝑉𝑔

𝑉𝑔

On for 𝑉𝑔 = 0

Off for 𝑉𝑔 = 𝑉𝑑𝑑

Off for 𝑉𝑔 = 0

On for 𝑉𝑔 = 𝑉𝑑𝑑

A

Inverter

A

pmos

nmos

AA

𝑉𝑑𝑑

0

Voltage pins Output voltage from voltage pins, not input

One transistor always off except during switching,

In 2014, 250 × 1018

transistors manufactured

What can other devices do better?

● Non-volatility (local memory)

● Plasticity (local learning)

● Stochasticity

● Oscillators

→ cheap substrate

→ local energy source → fanout

→ 30 aJ and up

Page 17: Using magnetic tunnel junctions to compute like the brain...Stochastic computing with superparamagnetic tunnel junctions 27 Neural network Stochastic bit streams AND and OR gates,

· PML · NDCD · Alternative Computing Group

Energies vary over many orders of magnitude

17

1 meV

1 eV

1 keV

1 MeV

1 aJ

1 zJ

1 fJ

1 pJ

1 GeV

1 nJ

Room temperature

Gate transition

MRAM barrier

Synaptic event

MRAM switching

Neural spike

Ion channel

Liquid Helium temperature

Page 18: Using magnetic tunnel junctions to compute like the brain...Stochastic computing with superparamagnetic tunnel junctions 27 Neural network Stochastic bit streams AND and OR gates,

· PML · NDCD · Alternative Computing Group

Software

Architectures

Encoding

Circuits

Physical devices

Population coding with superparamagnetic tunnel junctions

18

● Rate/Population Coding

● Transition rate, binary

● Pre-charge sense amplifier

● Superparamagnetic tunnel junctions, Short term memory

A. Mizrahi, T. Hirtzlin, A. Fukushima, H. Kubota, S. Yuasa, J. Grollier, and D. Querlioz, Nature Comm. 9, 1533 (2018);

A. Mizrahi, J. Grollier, D. Querlioz, and M. D. Stiles, J. Appl. Phys. 124, 152111 (2018).

Page 19: Using magnetic tunnel junctions to compute like the brain...Stochastic computing with superparamagnetic tunnel junctions 27 Neural network Stochastic bit streams AND and OR gates,

· PML · NDCD · Alternative Computing Group

Bigger energy barriers mean longer data retention, but more energy needed to write.

19

Energy

barrier

60 kT

14 kT

years

ms

μs

Memory for storage (MRAM)

Short term memory (synaptic weights)

Superparamagnetic (spiking neurons)

7 kT

Retention

Time

low-resistance high-resistance

AP

P

Page 20: Using magnetic tunnel junctions to compute like the brain...Stochastic computing with superparamagnetic tunnel junctions 27 Neural network Stochastic bit streams AND and OR gates,

· PML · NDCD · Alternative Computing Group

Population coding – represent continuous degrees of freedom with multiple spike rates

20

Direction

θ

+ 180°

- 180°

Neurons

Neuron

Firin

g r

ate

Firin

g r

ate

𝐻Σ

𝑤𝑖

𝑟𝑖

Weights

Page 21: Using magnetic tunnel junctions to compute like the brain...Stochastic computing with superparamagnetic tunnel junctions 27 Neural network Stochastic bit streams AND and OR gates,

· PML · NDCD · Alternative Computing Group

Simple sense, respond, measure, and learn test

21

Sensory input

Motor response

SMTJ-based CMOS only

Area 12,000 μm2 20,000 μm2

Energy 7.8 nJ 20 nJ

T. Hirtzlin at C2N:

Key advantage = stochastic analog to digital conversion

Weights stored in stable MTJs.Can we save energy there?

Page 22: Using magnetic tunnel junctions to compute like the brain...Stochastic computing with superparamagnetic tunnel junctions 27 Neural network Stochastic bit streams AND and OR gates,

· PML · NDCD · Alternative Computing Group

Continuous learning: the key to a robust system that can adapt to changes

22

Usually system is only trained once because it requires a lot of energyBUT unable to adapt to changes!

Initial Training

Untrained network

Use of the network

Unexpected change Loss of precision,

failure

Untrained network

Initial Training

Weight modification

Use of the network

Unexpected change

System adapted

System equipped with continuous learning

Page 23: Using magnetic tunnel junctions to compute like the brain...Stochastic computing with superparamagnetic tunnel junctions 27 Neural network Stochastic bit streams AND and OR gates,

· PML · NDCD · Alternative Computing Group

Continuous learning overcomes unreliable synapses

23

Here, ΔE = 15 kBT magnetic tunnel junctions can be used for memory with no precision loss

Ew = 100 kBT

No loss

Ew = 6 kBT

Ew = 8 kBT

Ew = 10 kBT

Ew = 15 kBT

0 1000 2000 3000 4000 5000 60001

10

Err

or

(%)

Learning steps

Page 24: Using magnetic tunnel junctions to compute like the brain...Stochastic computing with superparamagnetic tunnel junctions 27 Neural network Stochastic bit streams AND and OR gates,

· PML · NDCD · Alternative Computing Group

2 3 4 5 6 7 8 9 10 11 12

0.001

0.01

0.1

1

Error (%)

No

rma

lize

d p

ow

er

co

nsu

mp

tio

n

46 neurons

Increase energy

barrier ΔEw

Using unreliable synapses lowers the energy consumption

24

𝐼 ∝ ∆𝐸𝑤

𝐸𝑛𝑒𝑟𝑔𝑦 ∝ ∆𝐸𝑤2

Write energy

Write current

(Sato et al., APL 2014)

Page 25: Using magnetic tunnel junctions to compute like the brain...Stochastic computing with superparamagnetic tunnel junctions 27 Neural network Stochastic bit streams AND and OR gates,

· PML · NDCD · Alternative Computing Group

2 3 4 5 6 7 8 9 10 11 12

0.001

0.01

0.1

1

10 neurons

20 neurons

100 neurons

Error (%)

No

rma

lize

d p

ow

er

co

nsu

mp

tio

n

46 neurons

Using unreliable synapses lowers the energy consumption

25

𝐼 ∝ ∆𝐸𝑤

𝐸𝑛𝑒𝑟𝑔𝑦 ∝ ∆𝐸𝑤2

Write energy

Write current

(Sato et al., APL 2014)

Page 26: Using magnetic tunnel junctions to compute like the brain...Stochastic computing with superparamagnetic tunnel junctions 27 Neural network Stochastic bit streams AND and OR gates,

· PML · NDCD · Alternative Computing Group

0.001

0.01

0.1

1

10

100

2 3 4 5 6 7 8 9 101112

10

12

14

16

18

20

100

46

20

10 neurons

OptimumNo

rma

lize

d p

ow

er

co

nsu

mp

tio

nN

um

be

r

of

ne

uro

ns

En

erg

y b

arr

ier

(k

BT

)

Error (%)

Using unreliable synapses lowers the energy consumption

26

Example:For 3% precision→ 54 neurons in each population (2916 weights)→ ΔEw = 12 kBT

𝐼 ∝ ∆𝐸𝑤

𝐸𝑛𝑒𝑟𝑔𝑦 ∝ ∆𝐸𝑤2

Write energy

Write current

(Sato et al., APL 2014)

Page 27: Using magnetic tunnel junctions to compute like the brain...Stochastic computing with superparamagnetic tunnel junctions 27 Neural network Stochastic bit streams AND and OR gates,

· PML · NDCD · Alternative Computing Group

Software

Architectures

Encoding

Circuits

Physical devices

Stochastic computing with superparamagnetic tunnel junctions

27

● Neural network

● Stochastic bit streams

● AND and OR gates, PCSA

● Superparamagnetic tunnel junctions

M. W. Daniels, A. Madhavan, P. Tatatchian, A. Mizrahi, and M. D. Stiles, Physical Review Applied, 13, 034016 (2020).

Page 28: Using magnetic tunnel junctions to compute like the brain...Stochastic computing with superparamagnetic tunnel junctions 27 Neural network Stochastic bit streams AND and OR gates,

· PML · NDCD · Alternative Computing Group

Stochastic computing represents numbers as probabilistic bit

streams

28

observed = 0.19

0.5

0.4

0.7

● Resilient to white noise

● Incorrect bit ⟹ error ≈ 𝑂 1

● Compare to binary: error ≈ 𝑂(2𝑁)

● Naturally coherent with bilevel devices (magnetic tunnel junctions).

● Interpreting these “spike trains” as rates or probabilities restricts us to 0 ≤ 𝑝 ≤ 1.

Time →

0.63

0.35

0.69

p = 0.2

Page 29: Using magnetic tunnel junctions to compute like the brain...Stochastic computing with superparamagnetic tunnel junctions 27 Neural network Stochastic bit streams AND and OR gates,

· PML · NDCD · Alternative Computing Group

Unbiased superparamagnetic tunnel junction plus CMOSfor low energy random bitstreams

29

Pre-Charge Sense Amplifier (PCSA)Superparamagnetictunnel junction

𝑅𝐴𝑃 > 𝑅𝑟𝑒𝑓 > 𝑅𝑃

Clock

Reference resistor

Clock

Clock

Low energy random bits0.100 0.125 0.150

400

500

600

700

Resis

tan

ce (

)

Time (s)

IMTJ <<IC

AP

kBT

P

𝑝 = 0.5

Page 30: Using magnetic tunnel junctions to compute like the brain...Stochastic computing with superparamagnetic tunnel junctions 27 Neural network Stochastic bit streams AND and OR gates,

· PML · NDCD · Alternative Computing Group

AND gates for multiplication – Programmable bitstream generator: 10 fJ per bit

30

Example

1001010100 (0.4)

x 1010001110 (0.5)

= 1000000100 (0.2)

AND gate (multiply)

𝒑𝐀𝐍𝐃 = 𝒑𝒂𝒑𝒃

Typical stochastic computing uses Linear Feedback Shift Registers→ Short period pseudorandom numbers

1/2

1/4

3/4

1/4, 3/4

1/8, 3/8

7/8, 5/8

1/8, 3/8 , 5/8, 7/8

1/16, 3/16, 5/16, 7/16

15/16, 13/16, 11/16, 9/16

AND gate Inverter Multiplexer

Page 31: Using magnetic tunnel junctions to compute like the brain...Stochastic computing with superparamagnetic tunnel junctions 27 Neural network Stochastic bit streams AND and OR gates,

· PML · NDCD · Alternative Computing Group

Synapses from bitstream generators (weights) and AND gates (multiplication)

31

AND-gate synapseSynaptic weight:

Bit stream from neuron

Programmable Bit stream generator

Page 32: Using magnetic tunnel junctions to compute like the brain...Stochastic computing with superparamagnetic tunnel junctions 27 Neural network Stochastic bit streams AND and OR gates,

· PML · NDCD · Alternative Computing Group

Neurons from OR gates for nonlinear summation

32

Example

1001010100 (0.4)

+ 1010001110 (0.5)

= 1011011110 (0.7)

OR gate (nonlinear add)

𝒑𝐎𝐑 = 𝒑𝒂 + 𝒑𝒃 − 𝒑𝒂𝒑𝒃

OR-gate neuron does not work with correlated bit streams

Page 33: Using magnetic tunnel junctions to compute like the brain...Stochastic computing with superparamagnetic tunnel junctions 27 Neural network Stochastic bit streams AND and OR gates,

· PML · NDCD · Alternative Computing Group

Low area, high efficiency neural network with stochastic information throughout the calculation

33

AND-gate synapseSynaptic weight: Bit stream from programmable SMTJ generator

Bit stream from neuron

OR-gate neuron

Page 34: Using magnetic tunnel junctions to compute like the brain...Stochastic computing with superparamagnetic tunnel junctions 27 Neural network Stochastic bit streams AND and OR gates,

· PML · NDCD · Alternative Computing Group

Efficient random bitstream generation and avoiding domain translation – energy efficiency on a standard problem

34

A standard deep neural network structure Standard test dataset

● Energy/performance tradeoff by tuning circuit parameters.

● Running for longer or shorter total time (~N) gives

energy/performance tradeoff.

Stochastic computing implementations

Future: implement primitives to check feasibility with realistic MTJs

Page 35: Using magnetic tunnel junctions to compute like the brain...Stochastic computing with superparamagnetic tunnel junctions 27 Neural network Stochastic bit streams AND and OR gates,

· PML · NDCD · Alternative Computing Group

Collaborators

35

Brian Hoskins, Matthew Daniels, Guru Khalsa, Mark Anders, Jonathan Goodwill, Jabez McClelland, Nikolai Zhitenev, William Rippard, Matthew Pufall, Emilie Jué, Mark Stiles

Energy Frontier Research CenterDepartment of Energy

Alice Mizrahi, Advait Madhavan, Philippe Talatchian

Sumito Tsunegi, Kay Yakushiji, AkioFukushima, Hitoshi Kubota, Shinji Yuasa

Tifenn Hirtzlin, Damien Querlioz

Jacob Torrejon, Mathieu Riou, Flavio Abreu Araujo, Paolo Bortolotti, Vincent Cros, Julie Grollier

George Tzimpragos, Tim Sherwood

Siyuan Huang, Gina Adam

Vivek Amin, Jonathan Gibbons, Paul Haney, Axel Hoffmann, Julie Grollier

Page 36: Using magnetic tunnel junctions to compute like the brain...Stochastic computing with superparamagnetic tunnel junctions 27 Neural network Stochastic bit streams AND and OR gates,

· PML · NDCD · Alternative Computing Group

Using magnetic tunnel junctions to compute like the brain

36

Architecture Encoding Devices

Population coding Fluctuation rates Superparamagnetic tunnel junctions

Synaptic weights Binary Unstable magnetic tunnel junctions

Neural network with stochastic computing

Random bitstreams Superparamagnetic tunnel junctions, Repurposed CMOS Logic gates

● Emulating features of the brain can increase efficiency for cognitive computing.

● Applications require designing across the computational stack

● Architecture● Encoding● Circuitry● Devices

● CMOS is likely to play an important role in any room-temperature computing scheme.

● Magnetic tunnel junctions

● Multifunctional● Already in CMOS Fabs

Software

Architectures

Encoding

Circuits

Physical devices