Simulating a quantum annealer with GPU-based Monte Carlo algorithms
Mayssam Mohammadi Nevisi, Mani Ranjbar, James King, Sheir Yarkoni, Jeremy P. Hilton, Catherine C. McGeoch
April 6, 2016
©2016 D-Wave Systems Inc. All rights reserved.
Introduction
D-Wave QPU
- Quantum annealing chip
- Highly specialized co-processor
- Physical implementation of an NP-hard optimization problem
- Physical heuristic algorithm runs on the chip
Ising Minimization
Given:
- A graph $G = (V, E)$
- A collection of weights $h = \{h_i : i \in V\}$ and $J = \{J_{ij} : (i, j) \in E\}$ (the Hamiltonian)

Assign:
- Values from $\{-1, +1\}$ to $n$ spin variables $s = \{s_i\}$

Such that we minimize the energy function:

$$E(s) = \sum_{i \in V} h_i s_i + \sum_{(i,j) \in E} J_{ij} s_i s_j .$$
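The energy function above can be evaluated directly. The following is a minimal sketch, assuming the weights are held in Python dictionaries; the function names and the 3-spin example problem are hypothetical and only serve to illustrate the definition. The brute-force search is exponential in the number of spins and is only usable for tiny instances.

```python
from itertools import product


def ising_energy(h, J, s):
    """Return E(s) = sum_i h_i s_i + sum_{(i,j)} J_ij s_i s_j."""
    field_term = sum(h[i] * s[i] for i in h)
    coupling_term = sum(J_ij * s[i] * s[j] for (i, j), J_ij in J.items())
    return field_term + coupling_term


def brute_force_minimum(h, J):
    """Try all 2^n spin assignments and return (lowest energy, assignment)."""
    nodes = sorted(h)
    best = None
    for values in product((-1, +1), repeat=len(nodes)):
        s = dict(zip(nodes, values))
        e = ising_energy(h, J, s)
        if best is None or e < best[0]:
            best = (e, s)
    return best


# Hypothetical 3-spin chain 0 - 1 - 2 (illustration only).
h = {0: 0.5, 1: -1.0, 2: 0.0}
J = {(0, 1): -1.0, (1, 2): 1.0}
s = {0: +1, 1: +1, 2: -1}
energy = ising_energy(h, J, s)
ground_energy, ground_state = brute_force_minimum(h, J)
```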
Chimera topology
- $C_k$ is a $k \times k$ grid of dense $K_{4,4}$ "unit cells"

Processor     Topology   Qubits
D-Wave One    C4         128
D-Wave Two    C8         512
D-Wave 2X     C12        1152

- Chimera topologies are bipartite
- Any graph can be embedded in a Chimera graph via minor embedding
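The qubit counts and the bipartite property can be checked with a small construction. This is a sketch under an assumed labelling: each qubit is a tuple `(row, col, side, index)`, where `side` 0 and 1 are the two halves of a $K_{4,4}$ cell, side-0 qubits couple to the cell below, and side-1 qubits couple to the cell to the right. The labelling and the function names are illustrative choices, not taken from the slides.

```python
def chimera_graph(k):
    """Build C_k: a k x k grid of K_{4,4} unit cells with inter-cell couplers."""
    nodes = [(r, c, side, i)
             for r in range(k) for c in range(k)
             for side in (0, 1) for i in range(4)]
    edges = []
    for r in range(k):
        for c in range(k):
            for i in range(4):
                # 16 in-cell couplers: every side-0 qubit to every side-1 qubit.
                for j in range(4):
                    edges.append(((r, c, 0, i), (r, c, 1, j)))
                # Inter-cell couplers to the cell below and the cell to the right.
                if r + 1 < k:
                    edges.append(((r, c, 0, i), (r + 1, c, 0, i)))
                if c + 1 < k:
                    edges.append(((r, c, 1, i), (r, c + 1, 1, i)))
    return nodes, edges


def colour(node):
    """Two-colouring that certifies bipartiteness under this labelling."""
    r, c, side, _ = node
    return (r + c + side) % 2


nodes, edges = chimera_graph(12)
is_bipartite = all(colour(u) != colour(v) for u, v in edges)
```

Every edge joins qubits of different colours, so all qubits of one colour can be updated without reading each other's state.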
Simulated (Thermal) Annealing
- Heuristic optimization algorithm that simulates classical thermal annealing
- System of spins moves randomly in state space
- Cools slowly from hot (random/explorative) to cold (greedy/exploitative)
- Uses thermal activation to jump over energy barriers

[Figure: energy (cost) versus configuration (state), showing a thermal jump over an energy barrier]
Quantum Annealing
- Quantum annealing (QA) is related to adiabatic quantum computing (AQC)

$$H(t) = A(t) \cdot H_{\text{init}} + B(t) \cdot H_{\text{prob}}$$

- Takes advantage of thermal activation just like classical annealing
- Also has a new complementary resource: quantum tunneling.

[Figure: energy (cost) versus configuration (state), showing a thermal jump over an energy barrier and quantum tunneling through it]
Motivation for GPU Solvers
Why develop optimized GPU implementations?
- Quantum computers are hard to simulate
- Even approximate simulations via Monte Carlo methods can be slow

"Between some quantiles and system sizes we observe a prefactor advantage [for D-Wave] as high as $10^8$."
- Denchev et al. (2015)
Why develop optimized GPU implementations?
- Software solvers slow down our experiments

"This experiment occupied millions of processor cores for several days to tune and run the classical algorithms for these benchmarks."
- Denchev et al. (2015)

- Faster solvers → faster experimental cycle → improved understanding of our chips
- Fast GPU simulation leads to better quantum computers!
Algorithms and GPU suitability
- Good/interesting classical solvers for Chimera Ising problems fall into two categories:
  - Low-treewidth local search
  - Single-spin Monte Carlo algorithms
- Low-treewidth local search is not suitable.
  - Memory requirements are too high
  - Limited parallelizability.
- Single-spin Monte Carlo algorithms are ideal!
  - Very low memory requirements
  - Highly parallelizable.
Algorithms
Simulated Annealing
- Single-spin updates
  - Flipping this spin would lead to a change in energy $\Delta E$
  - Probability of accepting the spin flip is $\min(1, e^{-\beta \Delta E})$

Algorithm 1: Simulated Annealing
    for each sample to be taken do
        for i = 1 to num_sweeps do
            β := betas[i]
            for spin in spins do
                calculate ∆E_spin
                flip spin with probability min(1, e^(−β ∆E_spin))
            end for
        end for
    end for

[Figure: nested loop structure over samples, sweeps, and spins]

Because the graph is bipartite, half of the spin updates can be done in parallel.
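Algorithm 1 can be sketched for a single sample as follows. This is an illustrative CPU version under stated assumptions, not the GPU implementation from the talk: the function name, the dictionary data layout, the `colour_classes` argument, and the 4-cycle test problem are all hypothetical. Spins are visited one colour class at a time; within a class no two spins share a coupling, which is what would allow those updates to run in parallel.

```python
import math
import random


def simulated_annealing(h, J, colour_classes, betas, rng):
    """Run one simulated annealing sample and return the final spin state."""
    # Adjacency list so each local field costs O(degree) to compute.
    neighbours = {i: [] for i in h}
    for (i, j), J_ij in J.items():
        neighbours[i].append((j, J_ij))
        neighbours[j].append((i, J_ij))

    s = {i: rng.choice((-1, +1)) for i in h}   # random initial state
    for beta in betas:                         # one sweep per beta value
        for colour_class in colour_classes:    # spins in a class are independent
            for i in colour_class:
                local_field = h[i] + sum(J_ij * s[j] for j, J_ij in neighbours[i])
                delta_e = -2.0 * s[i] * local_field   # energy change if s[i] flips
                # Accept with probability min(1, exp(-beta * delta_e)).
                if delta_e <= 0 or rng.random() < math.exp(-beta * delta_e):
                    s[i] = -s[i]
    return s


# Hypothetical 4-cycle, bipartite as {0, 2} versus {1, 3}; the strong negative
# fields make the all +1 state the ground state.
h = {i: -3.0 for i in range(4)}
J = {(0, 1): 1.0, (1, 2): -1.0, (2, 3): 1.0, (0, 3): -1.0}
betas = [0.1 * (t + 1) for t in range(200)]   # hot to cold
sample = simulated_annealing(h, J, [[0, 2], [1, 3]], betas, random.Random(0))
```

The early-accept branch for `delta_e <= 0` also avoids evaluating the exponential when it is not needed.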
Parallel Tempering
- Instead of one Markov chain that slowly goes from high to low temperature:
  - Use an ensemble of fixed-temperature Markov chains ("replicas")
  - Replicas form a "temperature ladder"
  - Replicas can exchange temperatures with neighbouring chains on the ladder with probability $\min(1, e^{(E_i - E_j)(\beta_i - \beta_j)})$
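The exchange rule can be sketched as below. This is a minimal illustration, assuming the states and their energies are stored in lists indexed by ladder rung; swapping the states held at two rungs is equivalent to swapping the temperatures of the two replicas. The function names and the two-rung example are hypothetical.

```python
import math
import random


def swap_probability(E_i, E_j, beta_i, beta_j):
    """min(1, exp((E_i - E_j) * (beta_i - beta_j))), without overflow."""
    exponent = (E_i - E_j) * (beta_i - beta_j)
    return 1.0 if exponent >= 0 else math.exp(exponent)


def exchange_sweep(states, energies, betas, rng):
    """Attempt one swap between each pair of neighbouring rungs, in place."""
    for r in range(len(betas) - 1):
        p = swap_probability(energies[r], energies[r + 1], betas[r], betas[r + 1])
        if rng.random() < p:
            states[r], states[r + 1] = states[r + 1], states[r]
            energies[r], energies[r + 1] = energies[r + 1], energies[r]


# Two rungs: the hotter rung (beta = 1) currently holds the lower-energy state,
# so the swap is always accepted and that state moves to the colder rung.
states = ["state_a", "state_b"]
energies = [-5.0, -3.0]
betas = [1.0, 2.0]
exchange_sweep(states, energies, betas, random.Random(0))
```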
Approximate Simulations of Quantum Annealing
Quantum Monte Carlo†

- Many replicas of the system (Trotter slices) representing different points in imaginary time
- Path-integral Monte Carlo method
- We implement the 'discrete time' variant

† QMC can reproduce QA equilibrated statistics, but doesn't simulate its dynamics.

Spin Vector Monte Carlo

- Mean-field approximation
- Simulates coherence but no entanglement
- Each spin is represented by an angle
GPU Simulated Annealing Implementation
Thread Structure — Hamiltonian
- One unit cell per thread
- Cell Hamiltonian stored as floats in 40 registers:

      8 fields (h)
     16 in-tile couplings (J)
   + 16 inter-tile couplings† (J)
   = 40 registers

- Compiler uses additional 39 registers per thread

† Each inter-tile coupling is stored in two threads
Thread Structure — States
- Each state is +1 or −1
- Each state is accessed by multiple threads for energy calculation

States must be stored in shared memory!

- $8k^2$ states per sample
- Storing as floats is faster than packing bits; registers are still the limiting factor†

† For parallel tempering and quantum Monte Carlo we pack bits because we have up to 64 replicas
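The footnote's bit packing can be illustrated as follows. The slides do not describe the actual layout, so this sketch assumes one 64-bit word per qubit, with bit `r` holding that qubit's state in replica `r` (set means +1, clear means −1). All function names are hypothetical.

```python
def pack_spins(spins):
    """Pack up to 64 values from {-1, +1} into one integer word."""
    if len(spins) > 64:
        raise ValueError("at most 64 replicas fit in one 64-bit word")
    word = 0
    for r, s in enumerate(spins):
        if s == +1:
            word |= 1 << r      # set bit r when replica r holds +1
    return word


def unpack_spin(word, r):
    """Read the spin of replica r back as -1 or +1."""
    return +1 if (word >> r) & 1 else -1


def flip_spin(word, r):
    """Flip the spin of replica r with a single XOR."""
    return word ^ (1 << r)


word = pack_spins([+1, -1, -1, +1])   # replicas 0 and 3 hold +1
flipped = flip_spin(word, 1)          # replica 1 becomes +1
```

Under this assumed layout, a float per state costs 32 bits per replica, while the packed form costs one bit per replica.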
Block Structure
- 79 registers per thread
- $k^2$ threads per sample
- 65,536 registers per SM (Maxwell)
- Each SM can run $\left\lfloor \frac{65{,}536}{79 k^2} \right\rfloor$ samples in parallel

Topology                     C4    C8    C12
Concurrent samples per SM    51    12    5
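The table entries follow from integer division of the register file by the registers one sample needs. A short check, with a hypothetical function name and the slide's figures as defaults (79 registers per thread, made up of 40 for the cell Hamiltonian plus 39 used by the compiler):

```python
def concurrent_samples_per_sm(k, registers_per_thread=79, registers_per_sm=65_536):
    """floor(registers_per_sm / (registers_per_thread * k^2)) for a C_k problem."""
    threads_per_sample = k * k          # one thread per unit cell
    return registers_per_sm // (registers_per_thread * threads_per_sample)


samples = {k: concurrent_samples_per_sm(k) for k in (4, 8, 12)}
```

The count shrinks quadratically with `k`, which is why the number of registers per SM is the quantity to watch on larger topologies.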
Fast Random Number Generation
- A significant fraction of running time is used to generate random numbers.
- We use xorshift random number generators
  - 2-3 times faster than cuRAND
  - Imperfect but still suitable for applications that are not highly sensitive to RNG quality.
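The slides do not say which xorshift variant is used. As an illustration only, here is Marsaglia's 32-bit xorshift with the common shift triple (13, 17, 5), written in Python with explicit masking to emulate 32-bit unsigned arithmetic. The helper `uniform_stream` is a hypothetical convenience for turning the integer state into floats in (0, 1).

```python
MASK32 = 0xFFFFFFFF


def xorshift32(state):
    """Advance a non-zero 32-bit xorshift state by one step."""
    state ^= (state << 13) & MASK32
    state ^= state >> 17
    state ^= (state << 5) & MASK32
    return state


def uniform_stream(seed, count):
    """Return `count` floats in (0, 1) from consecutive xorshift32 states."""
    state = seed & MASK32
    if state == 0:
        raise ValueError("xorshift state must be non-zero")
    values = []
    for _ in range(count):
        state = xorshift32(state)
        values.append(state / 2**32)
    return values


first = xorshift32(1)
draws = uniform_stream(seed=2016, count=5)
```

Each step is three shifts and three XORs on a single word of state, which keeps both the instruction count and the register footprint small.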
Fast Approximations of Mathematical Functions
- Exponentiation is necessary to determine flip probabilities
- Sine and cosine are used in Spin Vector Monte Carlo
- CPU implementations often cache function values in lookup tables
  - Not feasible for GPUs due to memory restrictions
- CUDA to the rescue! Intrinsic fast math functions are:
  - Faster than regular math functions or Taylor approximations
  - Accurate enough for our Monte Carlo algorithms

[Figure: plots of $f(x) = \sin x$ and $f(x) = \min(1, e^{-x})$ against $x$]
Results
Implementation Speeds
- Code is still being fine-tuned
- Significant speedup over CPU seen in all four algorithms
- Huge spin flip/nanosecond/dollar improvement over CPUs
- Actual numbers to be released in a forthcoming paper
Breakdown of Runtime — Simulated Annealing
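[Figure: pie chart of simulated annealing runtime. Legend, in extracted order: delta energy calculation, random number generation, spin flip tests (exp function), other. Slice labels, in extracted order: 35%, 25%, 35%, 5%.]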
Conclusion
Recap
- Quantum processors are very hard to simulate classically
- Monte Carlo algorithms are among the best tractable approximations
- Monte Carlo algorithms with single-spin updates are ideal for GPU
- We can achieve significant speedups even over a more expensive CPU
Looking to the Future
- Future D-Wave chips will be bigger and denser
- Future NVIDIA chips will be bigger and faster (more registers per SM?)
- GPUs should continue to beat CPUs for Monte Carlo algorithms with single-spin updates
- Algorithms with low-treewidth updates unlikely to become feasible for GPUs