

Easing the Monte Carlo sign problem

Dominik Hangleiter,1,∗ Ingo Roth,1 Daniel Nagaj,2 and Jens Eisert1

1 Dahlem Center for Complex Quantum Systems, Freie Universität Berlin, Germany

2 RCQI, Institute of Physics, Slovak Academy of Sciences, Bratislava, Slovakia

Quantum Monte Carlo (QMC) methods are the gold standard for studying equilibrium properties of quantum many-body systems: their phase transitions, ground and thermal state properties. However, in many interesting situations QMC methods are faced with a sign problem, causing the severe limitation of an exponential increase in the sampling complexity and hence the run-time of the QMC algorithm. In this work, we develop a systematic, generally applicable, and practically feasible methodology for easing the sign problem by efficiently computable basis changes and use it to rigorously assess the sign problem. Our framework introduces measures of non-stoquasticity that, as we demonstrate analytically and numerically, at the same time provide a practically relevant and efficiently computable figure of merit for the severity of the sign problem. We show that those measures can be put to good practical use to ease the sign problem. To do so, we use geometric algorithms for optimization over the orthogonal group and ease the sign problem of frustrated Heisenberg ladders. Complementing this pragmatic mindset, we prove that easing the sign problem in terms of those measures is in general an NP-complete task for nearest-neighbour Hamiltonians and simple basis choices by a polynomial reduction to the MAXCUT-problem. Intriguingly, easing remains hard even in cases in which we can efficiently assert that no exact solution exists.

Quantum Monte Carlo (QMC) techniques are central to our understanding of the equilibrium physics of many-body quantum systems. They provide arguably one of the most powerful workhorses for efficiently calculating expectation values of observables in ground and thermal states of various classes of many-body Hamiltonians [1–4]. For a Hamiltonian $H$ in dimension $D$, the idea at the heart of the most prominent variant of QMC is to sample out world lines in a corresponding $(D+1)$-dimensional system, where the additional dimension is the (Monte Carlo) time dimension. These world lines correspond to paths through an $m$-fold expansion of $e^{-\beta H} = (e^{-\beta H/m})^m$, where an entry of $e^{-\beta H/m}$ in a local basis is selected in each step. Each such path is associated with a probability which is proportional to the product of the selected entries. To sample from the resulting distribution, one can construct a suitable Markov chain of paths satisfying detailed balance, which, if gapped, eventually converges to its equilibrium distribution representing the thermal state. Generally speaking, concentration-of-measure phenomena often make such a procedure efficient.

In the classical variant of Monte Carlo, the Hamiltonian is always diagonal, giving rise to positive weights. In QMC, in contrast, positive (in general even complex) off-diagonal matrix elements of $H$ potentially give rise to negative weights of the paths. This leads to what is famously known as the sign problem of QMC, namely that now one is faced with the task of sampling a quasi-probability distribution (normalized but non-positive) as opposed to a non-negative probability distribution. This task can be achieved by introducing a suitable probability distribution that reproduces the desired sampling averages but typically comes at the cost of an exponential increase in the sampling complexity and hence the runtime of the algorithm. For example, in world-line Monte Carlo one takes the absolute value of the quasi-probability distribution and then computes the average sign, which is given by the expectation value of the signs of the quasi-probabilities with respect to the new distribution. The sign problem is particularly severe for fermionic Hamiltonians, as the particle-exchange anti-symmetry forces their matrix elements to have alternating signs in the standard basis. Naturally, though, it also appears for bosonic or spin Hamiltonians. The sign problem therefore severely limits our understanding of quantum materials. One can go as far as to see it as dividing strongly correlated systems into easy and intractable cases.

∗ Corresponding author: [email protected]
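To make the average sign concrete, the following minimal sketch evaluates it exactly for a small matrix. It assumes closed paths of length $m$, so that the signed and unsigned sums over paths become traces of matrix powers, and it computes $e^{-\beta H/m}$ by exact diagonalization. The function names `transfer_matrix` and `average_sign` and the three-level example matrices are illustrative choices that do not appear in the text.

```python
import numpy as np

def transfer_matrix(H, beta, m):
    """Return exp(-beta * H / m) for a real symmetric matrix H."""
    evals, evecs = np.linalg.eigh(H)
    return (evecs * np.exp(-beta * evals / m)) @ evecs.T

def average_sign(H, beta=1.0, m=100):
    """Ratio of the signed to the unsigned sum over closed m-step paths."""
    T = transfer_matrix(H, beta, m)
    # Sum over closed paths of the product of selected entries of T.
    signed = np.trace(np.linalg.matrix_power(T, m))
    # Same sum after taking the entrywise absolute value of every weight.
    unsigned = np.trace(np.linalg.matrix_power(np.abs(T), m))
    return signed / unsigned

# Hypothetical three-level example: every off-diagonal entry equals +1.
H_frus = np.ones((3, 3)) - np.eye(3)
# Flipping the sign makes every off-diagonal entry non-positive.
H_stoq = -H_frus

sign_frus = average_sign(H_frus)
sign_stoq = average_sign(H_stoq)
```

In this toy example the matrix with positive off-diagonal entries has an average sign strictly between 0 and 1, whereas its sign-flipped counterpart has unit average sign.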

A basic but fundamental insight is that the QMC sign problem is a basis-dependent property [5, 6]. For this reason, saying that ‘a Hamiltonian does or does not exhibit a sign-problem’ is meaningless without specifying a basis. Since physical quantities of interest are independent of the basis choice, the observation that the sign problem is basis-dependent gives immediate hope to actually mitigate the sign problem of QMC by expressing the Hamiltonian in a suitable basis. This is not guaranteed to improve the overall runtime of QMC, which is governed not only by the sampling complexity but also by the computational complexity of producing an individual sample. Nonetheless, mitigating the sign problem is widely expected to render QMC efficient in many situations.

In this work, we establish a comprehensive novel framework for assessing, understanding, and optimizing the sign problem computationally, asking the questions: What is the optimal computationally meaningful local basis choice for a QMC simulation of a Hamiltonian problem, can we find it, and how hard is this task in general?

Curing the sign problem

In fact, it is known that one can completely cure the sign problem using basis rotations in certain situations. For specific models, sign-problem free bases can be found analytically, involving non-local bases, for example by using so-called auxiliary-field [7], Jordan-Wigner [8] or Majorana [9, 10] transformations. One can also exploit specific known properties of the system such as that the system dimerizes [11–14]. Such findings motivate the quest for a more broadly applicable systematic search for basis changes that avoid the sign problem, in a way that does not depend on the specific physics of the problem at hand. After all, in a QMC simulation one wants to learn about the physics of a system in the first place and, indeed, the optimal basis choice may very well be closely related to that physics.

arXiv:1906.02309v3 [quant-ph] 1 Aug 2020

Clearly, a useful notion of curing has to restrict the set of allowed basis transformations such that expressing the Hamiltonian in the new basis is still computationally tractable. For example, in its eigenbasis every Hamiltonian is diagonal and thus sign-problem free, but even writing down this basis typically requires an exponential amount of resources. The intrinsic sign problem of a Hamiltonian is thus a property of its equivalence classes under conjugation with some suitable subgroup of the unitary group. The simplest examples of such choices include local Hadamard, Clifford or unitary transformations. Most generally, one can allow for quasi-local circuits which are efficiently computable [6], including short circuits and matrix product unitaries [15, 16], but also invertible transformations [17].

A both useful and simple sufficient condition for the absence of a sign problem, independent of the specifics of a simulation, is that the Hamiltonian matrix is stoquastic, i.e., has only non-positive off-diagonal entries. In fact, stoquasticity provides a useful framework to assess the computational complexity of a systematic approach to curing the sign problem [18]. Only recently has the curing problem, to decide whether a stoquastic local basis exists, been shown to be an NP-complete task under on-site unitary transformations for 2-local Hamiltonians with additional local fields [19, 20], while it remains efficiently solvable for strictly 2-local Hamiltonians [20, 21]. But any such approach is faced with the question: Is all hope lost for simulating a Hamiltonian problem via QMC more efficiently even when a stoquastic basis cannot be found in polynomial time?

A pragmatic approach: Easing the sign problem

This leads us to the first part of the initially posed question: what is the optimal computationally meaningful choice of basis? In any Monte Carlo algorithm, computational hardness due to a sign problem is manifested in a super-polynomial increase in its sample complexity as the system size grows. Intuitively speaking, the sample complexity increases because the variance of the Monte Carlo estimator does. In this mindset, finding a QMC algorithm with feasible runtime for Hamiltonians with a sign problem does not require the much stronger task of finding a basis in which the Hamiltonian is fully stoquastic. Indeed, in many cases such a basis may not even exist within a given subgroup of the unitaries. Rather, often it is sufficient to merely find a basis in which the Hamiltonian is approximately stoquastic so that the scaling of the variance of the corresponding estimator with the system size is more favourable, ideally polynomial. More pragmatically still, practitioners in QMC are increasingly less worried about small sign problems for which simulations are still feasible for reasonable system sizes using state-of-the-art computing power. This remains true even if the sampling effort may strictly speaking diverge exponentially with the system size. Consequently, we argue that practical computational approaches towards the sign problem, rather than focusing on exactly curing it, should target the less ambitious yet practically meaningful task of approximately solving or easing it in the best possible way.

Here, we propose a systematic, generally applicable, and practically feasible methodology for easing the sign problem via basis rotations that allows for a meaningful rigorous assessment of this task. An appealing feature of our framework is that it neither requires any a priori knowledge about the physics of a problem nor depends on specifics of a given simulation procedure, in contrast to other known refinements of QMC. At the heart of our approach lies a formulation of the easing problem in terms of a simple, efficiently computable measure of approximate stoquasticity that generically quantifies the sampling complexity.

The sample complexity of a QMC algorithm can be linked to the size of the inverse average sign, which directly bounds the variance of the QMC estimator [18]. In an attempt to ease the sign problem of a given Hamiltonian it is therefore natural to try and improve the average sign. For a few specific models such improvements have indeed been achieved by different means: for example, one can exploit known physics to find bases with improved average sign [14, 22] that are often induced by sparse representations [17, 23, 24]. For particular observables, one can also exploit clever decompositions of the Monte Carlo estimator into clusters with non-negative sign [25–31].
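The link between the average sign and the sample complexity can be made explicit with the standard reweighting argument, sketched here for real, non-zero path weights $p(x)$. The symbols $p$, $x$, $O(x)$, $N$ and $\epsilon$ are notation introduced for this illustration only.

```latex
\langle O \rangle
  = \frac{\sum_x p(x)\, O(x)}{\sum_x p(x)}
  = \frac{\sum_x |p(x)| \operatorname{sgn}(p(x))\, O(x) \,\big/ \sum_x |p(x)|}
         {\sum_x |p(x)| \operatorname{sgn}(p(x)) \,\big/ \sum_x |p(x)|}
  = \frac{\langle \operatorname{sgn} \cdot\, O \rangle_{|p|}}
         {\langle \operatorname{sgn} \rangle_{|p|}},
\qquad
\operatorname{Var}_{|p|}(\operatorname{sgn})
  = 1 - \langle \operatorname{sgn} \rangle_{|p|}^{2}.
```

Estimating the denominator from $N$ independent samples of $|p|$ to relative error $\epsilon$ therefore takes roughly $N \gtrsim (1 - \langle \operatorname{sgn} \rangle^2) / (\epsilon^2 \langle \operatorname{sgn} \rangle^2)$ samples, which grows with the square of the inverse average sign.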

However, the sample complexity of computing the average sign via QMC is given by its very value and typically scales exponentially in the system size. Ironically, easing the sign problem by optimizing the average sign is therefore typically infeasible whenever there is a sign problem. One would hence like to quantify the severity of the sign problem in terms of a quantity that is efficiently computable for physical Hamiltonians, a crucial property to be practically useful in a general approach to easing the sign problem.

Building on the notion of stoquasticity, for a real $D \times D$ Hamiltonian matrix $H$, we propose the sum of all non-stoquastic matrix entries

$$\nu_1(H) := D^{-1} \|H_\neg\|_{\ell_1}, \tag{1}$$

as a natural measure of non-stoquasticity [32] in order to quantify the sampling complexity of a QMC algorithm in generic instances. Here, as throughout this work, we denote the non-stoquastic part of the Hamiltonian by $H_\neg$, which is defined by $(H_\neg)_{i,j} = h_{i,j}$ for $h_{i,j} > 0$ and $i \neq j$, and zero otherwise. Moreover, $\|H\|_{\ell_1} = \sum_{i,j} |h_{i,j}|$ is the vector-$\ell_1$-norm.

For local Hamiltonians on bounded-degree graphs such as regular lattices this measure can be efficiently computed from the non-stoquastic entries of the local terms themselves, for translation-invariant Hamiltonians even with constant effort. But we can also go beyond that and prove that, for 2-local Hamiltonians acting on any graph, the measure $\nu_1$ can be efficiently approximated up to any inverse polynomial error; see Theorem 6. This result renders our measure applicable to problems with long-range and low-degree interactions as they arise, for example, in quantum chemistry.
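For a matrix that is small enough to be stored explicitly, Eq. (1) can be evaluated directly. The sketch below is a brute-force illustration on dense matrices, not the efficient local evaluation described above. The function name `nu1` and the example matrices are choices made here for illustration.

```python
import numpy as np

def nu1(H):
    """Sum of the positive off-diagonal entries of H, divided by its dimension."""
    H = np.asarray(H, dtype=float)
    off_diag = H - np.diag(np.diag(H))
    return off_diag[off_diag > 0].sum() / H.shape[0]

# Pauli matrices.
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Y = np.array([[0.0, -1.0j], [1.0j, 0.0]])
Z = np.diag([1.0, -1.0])

# Anti-ferromagnetic Heisenberg term XX + YY + ZZ, which is real in the
# computational basis although Y itself is complex.
heisenberg = (np.kron(X, X) + np.kron(Y, Y) + np.kron(Z, Z)).real

nu_x = nu1(X)              # one positive off-diagonal pair
nu_minus_x = nu1(-X)       # stoquastic, hence zero
nu_heis = nu1(heisenberg)  # positive exchange entries in the computational basis
```

A matrix is stoquastic exactly when this quantity vanishes, which is why it can serve as a continuous relaxation of the stoquasticity condition.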

In principle, one can also conceive of other measures of non-stoquasticity such as the $\ell_{1\to 1}$-norm or the $\ell_2$-norm of the non-stoquastic part of $H$. We argue that the $\ell_1$-norm is the most meaningful measure that is agnostic to any particular structure of the Hamiltonian matrix and therefore the most versatile measure for a general approach to easing the sign problem. What is more, it acts as a natural regularizer promoting a sparse representation [33] in the spirit of Refs. [17, 23, 24].

But how does the non-stoquasticity relate to the sample complexity of a QMC simulation? We find that it is in fact impossible to directly connect a continuous measure of non-stoquasticity to the average sign, which takes on its maximal value at unity and achieves this value for stoquastic Hamiltonians: We can construct exotic examples of highly non-stoquastic Hamiltonians with large positive off-diagonal entries which also have unit average sign. Conversely, we provide an example of a Hamiltonian with arbitrarily small non-stoquasticity for which the average sign nearly vanishes.

On the one hand, our examples demonstrate a high sensitivity of the average sign to the Monte Carlo parameters. On the other hand, they also require a malicious interplay between the Hamiltonian matrix entries and highly fine-tuned Monte Carlo parameters. We therefore expect that, in generic situations, the non-stoquasticity measure $\nu_1$ meaningfully quantifies the sample complexity of QMC. We give analytical arguments that this is actually the case and numerically find that the average sign of generic two-local Hamiltonians scales exponentially in $\nu_1$; see Sec. II. Thus, we provide evidence that the non-stoquasticity of a local Hamiltonian meaningfully quantifies its sign problem.

Easing in practice

This leads us to the question: Can we practically ease the sign problem of physical Hamiltonians by minimizing non-stoquasticity? To study this second question, we consider translation-invariant nearest-neighbour Hamiltonians in a quasi one-dimensional geometry [34]. Quasi one-dimensional systems, such as anti-ferromagnetic Heisenberg Hamiltonians on ladder geometries [35, 36], are the simplest non-trivial systems that exhibit a sign problem since they admit the phenomenon of geometric frustration [37]. Frustration gives rise to a plethora of phenomena arising in quasi one-dimensional systems such as the emergence of quantum spin liquids [38, 39] and the interplay of spin-1/2 and spin-1 physics [40]. They are also somewhat more realistic descriptions of actual low-dimensional experimental situations than simple one-dimensional chains, serving as a model for small couplings in the transverse direction [36, 41, 42]. Therefore quasi one-dimensional systems are often seen as a stepping stone towards studying higher dimensions [43], where the sign problem inhibits QMC simulations [44], and thus serve as the perfect playground for a proof of principle.

As a meaningful simple ansatz class, we consider on-site orthogonal transformations $O \in O(d)$ of the type

$$H = \sum_{i=1}^{n} T_i(h) \;\mapsto\; O^{\otimes n} H (O^T)^{\otimes n}, \tag{2}$$

for Hamiltonians $H$ acting on $n$ qudits with local dimension $d$. Here, $T_i(h)$ denotes the translation of a two-local term $h$ to site $i$. On-site transformations can be handled particularly well as they preserve locality and translation-invariance of local Hamiltonians. In particular, for such transformations, the global non-stoquasticity measure can be expressed locally in terms of the transformed term $O^{\otimes 2} h (O^T)^{\otimes 2}$ so that the optimization problem has constant complexity in the system size. This constitutes an exponential improvement over approaches that directly optimize the average sign.
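The transformation in Eq. (2) can be checked by brute force on a short chain. The sketch below assumes an open chain of $n = 4$ qubits with the two-local term $h = X \otimes X$ and uses the Hadamard matrix as the on-site orthogonal transformation. These choices, and the helper names `translate`, `chain_hamiltonian` and `onsite_rotate`, are illustrative assumptions rather than part of the text.

```python
import numpy as np
from functools import reduce

def nu1(H):
    """Sum of the positive off-diagonal entries of H, divided by its dimension."""
    H = np.asarray(H, dtype=float)
    off_diag = H - np.diag(np.diag(H))
    return off_diag[off_diag > 0].sum() / H.shape[0]

def translate(h, i, n, d=2):
    """Embed the two-local term h on sites (i, i+1) of an open n-site chain."""
    left = np.eye(d ** i)
    right = np.eye(d ** (n - i - 2))
    return np.kron(np.kron(left, h), right)

def chain_hamiltonian(h, n, d=2):
    """Sum of the translates of h over all nearest-neighbour bonds."""
    return sum(translate(h, i, n, d) for i in range(n - 1))

def onsite_rotate(H, O, n):
    """Conjugate H with the n-fold tensor power of the on-site matrix O."""
    On = reduce(np.kron, [O] * n)
    return On @ H @ On.T

X = np.array([[0.0, 1.0], [1.0, 0.0]])
hadamard = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)

n = 4
H = chain_hamiltonian(np.kron(X, X), n)
H_rot = onsite_rotate(H, hadamard, n)

nu_before = nu1(H)
nu_after = nu1(H_rot)
```

Since the Hadamard matrix maps $X$ to $Z$, the rotated Hamiltonian is a sum of diagonal $Z \otimes Z$ terms, so its non-stoquasticity vanishes up to rounding error. The dense construction scales exponentially in $n$ and serves only as a consistency check.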

To optimize the non-stoquasticity in this setting, we have implemented a geometric optimization method suitable for group manifolds, namely, a conjugate gradient descent algorithm over the orthogonal group $O(d)$ [45, 46]. In Fig. 1(a) we show that, generically, the algorithm accurately recovers an on-site stoquastic basis for random Hamiltonians which are known to admit such a basis a priori. This shows that the heuristic algorithm successfully minimizes the non-stoquasticity and thus serves as a benchmark for its functioning.
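As a toy stand-in for this optimization, the sketch below minimizes the non-stoquasticity of a single transformed local term for $d = 2$. It deliberately swaps the conjugate gradient method for a plain grid search over the rotation angle, and it covers only rotations, not reflections. The term $h = X \otimes X$, the grid size and all function names are assumptions made for illustration.

```python
import numpy as np

def nu1(H):
    """Sum of the positive off-diagonal entries of H, divided by its dimension."""
    H = np.asarray(H, dtype=float)
    off_diag = H - np.diag(np.diag(H))
    return off_diag[off_diag > 0].sum() / H.shape[0]

def rotation(theta):
    """Rotation matrix in SO(2)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def rotated_term(h, theta):
    """Two-local term after the same rotation is applied on both sites."""
    O2 = np.kron(rotation(theta), rotation(theta))
    return O2 @ h @ O2.T

def grid_search(h, num=181):
    """Evaluate nu1 on an angle grid over [0, pi] and return the best point."""
    thetas = np.linspace(0.0, np.pi, num)
    values = np.array([nu1(rotated_term(h, t)) for t in thetas])
    k = int(np.argmin(values))
    return thetas[k], values[k]

X = np.array([[0.0, 1.0], [1.0, 0.0]])
h = np.kron(X, X)

nu_initial = nu1(h)
best_theta, best_nu = grid_search(h)
```

For this term the search finds an angle with $\cos(2\theta) \approx 0$, at which $X$ is rotated into $\pm Z$ and the term becomes diagonal. A grid search does not scale to larger local dimensions, which is where manifold optimization methods such as the one used in the text become necessary.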

We now apply the algorithm to frustrated anti-ferromagnetic Heisenberg Hamiltonians on different ladder geometries; see Fig. 1(b) and (c). Ladder geometries are not only interesting for the reasons described above, but also because in spite of frustration effects they often admit sign-problem free QMC methods [11, 13, 14]. For both the J0-J1-J2-J3-model studied in Ref. [11] and the frustrated Heisenberg ladder studied in Refs. [13, 14], we find a rich optimization landscape in which a relative improvement of the non-stoquasticity by a factor of 2 to 5 can be achieved depending on the region in the phase diagram. Importantly and in spite of those seemingly moderate improvements of non-stoquasticity, we find that the sample complexity of QMC as governed by the inverse average sign is greatly diminished to approximate unity in large regions of the parameter space for the frustrated ladder model; see Fig. 2.

It may well be the case that no stoquastic dimer basis exists even though other variants of QMC do not incur a sign problem for such basis choices: in Ref. [11] a stoquastic but non-local basis of the J0-J1-J2-J3-model is identified for values of J2 ≥ J0 + J1, indicating that more general ansatz classes may well help to further improve the non-stoquasticity. We also observe that first-order optimization algorithms such as the employed conjugate gradient method encounter obstacles due to the rugged non-stoquasticity landscape. Intuitively, this landscape is governed by the combinatorial increase of possible assignments of signs to the Hamiltonian matrix elements.

The findings of our proof-of-principle study are twofold: on the one hand, they show that one can in fact efficiently optimize the non-stoquasticity for translation-invariant problems that admit a stoquastic basis lying within the ansatz orbit. They also further substantiate the claim that optimizing non-stoquasticity typically eases the sign problem and dampens the increase in sampling complexity. What is more, they indicate that more general ansatz classes such as quasi-local circuits hold the promise to further reduce the non-stoquasticity of ladder models. We therefore expect that optimizing non-stoquasticity is a feasible and promising means to reduce the sign problem for many different systems, including two-dimensional lattice systems, by exploiting the flexibility offered by larger ansatz classes within our framework. On the other hand, already in our small study we encountered obstacles preventing efficient optimization of the non-stoquasticity in the guise of a complicated and rugged optimization landscape.

[Figure 1: three panels. (a) "Random curable Hamiltonian": relative non-stoquasticity $\nu_1(OHO^T)/\nu_1(H)$ on a logarithmic scale versus local dimension. (b) "J0-J1-J2-J3 model": $\nu_1(OHO^T)/\nu_1(H)$ as a function of $J_2/J$ and $J_3/J$. (c) "Frustrated Ladder Model": $\nu_1(OHO^T)/\nu_1(H)$ as a function of $J_\perp/J_\parallel$ and $J_\times/J_\parallel$, with an inset showing the ladder couplings.]

Figure 1. We optimize the non-stoquasticity $\nu_1$ of translation-invariant, two-local Hamiltonians over on-site orthogonal transformations $\mathcal{O} = O^{\otimes n}$ using a conjugate gradient method for manifold optimization [45, 46]. Figure (a) shows the relative non-stoquasticity improvement of random two-local Hamiltonians that are known to admit an on-site stoquastic basis. For each local dimension 100 instances are drawn and the results displayed as a box plot according to Ref. [47, 2.16], where whiskers are placed at 1.5 times the interquartile range and circles denote outliers. This serves as a benchmark of our algorithm, which for almost all instances accurately recovers a stoquastic on-site basis. Figure (b) displays the optimized non-stoquasticity of the anti-ferromagnetic J0-J1-J2-J3-Heisenberg model relative to the computational basis as a function of $J_2/J$, $J_3/J$, where $J_0 = J_1 = J$. The algorithm is initialized in a Haar random orthogonal on-site basis. This model is known to admit a non-local stoquastic basis for $J_3 \geq J_0 + J_1$ [11]. Figure (c) shows the optimized non-stoquasticity of the anti-ferromagnetic Heisenberg ladder illustrated in the inset with couplings $J_\parallel$, $J_\perp$, $J_\times$ relative to the computational basis as a function of $J_\perp/J_\parallel$ and $J_\times/J_\parallel$. We initialized the algorithm at the identity matrix (that was randomly perturbed by a small amount). The phase diagram of the non-stoquasticity qualitatively agrees with the findings of Ref. [14], where the stochastic series expansion (SSE) QMC method was studied. There, it was found that the sign problem can be completely eliminated for a completely frustrated arrangement where $J_\times = J_\parallel$, while the sign problem remains present for partially frustrated couplings $J_\times \neq J_\parallel$. However, throughout the parameter regime the stoquasticity remains non-trivial, which may be due to the fact that the optimization algorithm converges to local minima.

[Figure 2: colour map of $\log(\langle \mathrm{sign} \rangle^{-1}_{OHO^T}) / \log(\langle \mathrm{sign} \rangle^{-1}_{H})$ as a function of $J_\perp/J_\parallel$ and $J_\times/J_\parallel$.]

Figure 2. Improvement of the inverse average sign $\langle \mathrm{sign} \rangle^{-1}$ concomitant with the improvement in non-stoquasticity of Fig. 1(c) for the frustrated ladder model as measured by the ratio of its logarithm before optimization compared to that after optimization. We compute the average sign via exact diagonalization for a ladder of $2 \times 4$ sites, $m = 100$ Monte Carlo steps and inverse temperature $\beta = 1$.

The computational complexity of SignEasing

Fundamentally, our findings thus raise the third question: How far can an approach to easing the sign problem using optimization over local bases carry in principle? In our main complexity-theoretic result, we systematically study the fundamental limits of minimizing non-stoquasticity as a means to ease the sign problem. To do so, we complement the pragmatic mindset of this work with the rigorous machinery of computational complexity theory, asking the question: What is the computational complexity of optimally easing the sign problem? In order to formalize this question, we introduce the corresponding decision problem:

Definition 1 (SignEasing). Given an $n$-qubit Hamiltonian $H$, constants $B > A \geq 0$ with $B - A \geq 1/\mathrm{poly}(n)$, and a set of allowed unitary transformations $\mathcal{U}$, decide which of the following is the case:

$$\text{YES}: \quad \exists\, U \in \mathcal{U}: \ \nu_1(U H U^\dagger) \leq A, \quad \text{or} \tag{3}$$

$$\text{NO}: \quad \forall\, U \in \mathcal{U}: \ \nu_1(U H U^\dagger) \geq B. \tag{4}$$

[Figure 3: two panels, (a) and (b), illustrating the mapping from $ZZ$-interactions on a graph to $XX$-interactions with on-site transformations $Z^{s_i}$, $Z^{s_j}$ and ancilla penalty terms.]

Figure 3. Constructing a Hamiltonian whose sign problem is NP-hard to ease under orthogonal on-site transformations. (a) To prove NP-completeness of SignEasing, we reduce it to the MAXCUT-problem which asks for the ground-state energy of an anti-ferromagnetic Ising Hamiltonian $H$ on a graph $G$. (b) In our encoding, we map $H$ to a Hamiltonian $H'$ in which all $ZZ$-interactions are replaced by $XX$-interactions and translate the spin configurations $(s_1, \ldots, s_n) \in \{0, 1\}^n$ of the anti-ferromagnetic Ising model to on-site transformations $Z_1^{s_1} \cdots Z_n^{s_n}$. To achieve this restriction, we penalize all other transformations by adding an ancilla qubit $\xi_{i,j}$ for every edge $(i, j)$ of $G$ and adding the interaction term $C(Z_i Z_j - Z_i Z_{\xi_{i,j}} - Z_j Z_{\xi_{i,j}})$ with a suitably chosen constant $C > 0$. We obtain that $\nu_1(H')$ can be eased below a certain value if and only if the ground state energy of $H$ is below that value to begin with, thus establishing the reduction.

We derive the computational complexity of the sign easing problem in simple settings, namely for 2-local Hamiltonians, allowing for on-site orthogonal Clifford operations as well as for on-site general orthogonal transformations. We prove that under both classes of transformations SignEasing is NP-complete. Intriguingly, this holds true even in cases in which the curing problem can be decided efficiently, namely, for strictly 2-local XYZ Hamiltonians of the type considered in Refs. [20, 21].

Theorem 2 (Complexity of SignEasing). SignEasing is NP-complete for 2-local (XYZ) Hamiltonians under

i. on-site orthogonal Clifford transformations, and

ii. on-site general orthogonal transformations.

From a practical perspective, our results pose limitations on the worst-case runtime of algorithms designed to find optimal QMC bases for the physically relevant case of 2-local Hamiltonians. From a complexity-theoretic perspective, they manifest a sign problem variant of the dichotomy between the efficiently solvable 2SAT-problem to decide whether there exists a satisfying assignment for a 2-local sentence, and the NP-complete MAX2SAT-problem asking what is the least possible number of broken clauses. They thus complete the picture drawn by Refs. [19–21] regarding the connection between satisfiability problems and the problems of curing and easing the sign problem on arbitrary graphs, a state of affairs which we illustrate in Table I. It is natural to ask how far this connection extends and what we can learn from it about efficiently solvable instances. For example, one may ask whether results about the hard regions of 3SAT and MAX2SAT carry over to the problems of curing and easing the sign problem.

We prove Theorem 2 i and ii as Theorems 8 and 9. The essential idea of our proof, sketched below and illustrated in Fig. 3, is to design a corresponding Hamiltonian such that if the sign problem could be optimally eased for this Hamiltonian under the respective ansatz class, one could also find the ground state energy of the original anti-ferromagnetic Ising Hamiltonian, a task that is NP-hard to begin with. It is straightforward to prove versions of Theorem 2 for any $\ell_p$-norm of the non-stoquastic part of $H$ with finite $p$ as a measure of non-stoquasticity. Our result is therefore independent of the particular choice of ($\ell_p$) non-stoquasticity measure.

Proof sketch. SignEasing for arbitrary 2-local Hamiltonians is contained in NP: given a basis transformation, we can approximate the measure of non-stoquasticity from the transformed local terms up to any inverse polynomial error and hence verify the YES-case (3); see Theorem 6.

The key idea of the harder direction of the proof is to encode the promise version of the MAXCUT-problem into the SignEasing-problem. An instance of MAXCUT is given by a graph $G = (V,E)$, and the problem is to decide whether the ground-state energy of the anti-ferromagnetic (AF) Ising Hamiltonian

$$ H = \sum_{(i,j) \in E} Z_i Z_j , \tag{5} $$

is below a constant $A$ or above $B$. Here, $Z_i$ is the Pauli-$Z$ operator acting on site $i$. We now define a Hamiltonian $H'$ in which we replace every $Z_iZ_j$ interaction of $H$ by an $X_iX_j$ interaction as we illustrate in Fig. 3. To understand our embedding, suppose that we perform basis changes only by applying $Z$ or $1$ at every site. In this case a Hamiltonian term can be made stoquastic if and only if $X_iX_j \mapsto -X_iX_j$, which is achieved by a transformation $Z^{s_i} Z^{s_j}$ with $(s_i, s_j) = (0,1) \vee (1,0)$. A term remains non-stoquastic for $(s_i, s_j) = (1,1) \vee (0,0)$. This provides a direct mapping between spin configurations $(1,0)$ and $(0,1)$, which do not contribute to the ground state energy of the anti-ferromagnetic Ising model, and transformations that make local terms in $H'$ stoquastic and thus decrease the non-stoquasticity.

To prove the theorem for arbitrary on-site Clifford and orthogonal transformations, we introduce an additional qubit $\xi_{i,j}$ for every edge $(i,j)$ and add interaction terms $C(Z_iZ_j - Z_iZ_{\xi_{i,j}} - Z_jZ_{\xi_{i,j}})$ to $H'$ with constant $C = 2\deg(G)$, where $\deg(G)$ is the degree of the interaction graph $G$, see Fig. 3(b). These terms penalize all other transformations such that the optimal non-stoquasticity of $H'$ is always achieved for transformations of the form $Z_1^{s_1} \cdots Z_n^{s_n}$ with $(s_1, \ldots, s_n) \in \{0,1\}^n$. For example, suppose that we apply Hadamard transformations to all sites $i, j, \xi_{i,j}$; then the $ZZ$ interactions and $XX$ interactions change roles so that the non-stoquasticity cannot be decreased by such a transformation. Showing this for all possible transformations constitutes the main technical part of the proof.

Satisfiability   Stoquasticity                 Complexity    Refs.
3SAT             Curing 2+1-local H            NP-complete   [19, 20]
2SAT             Curing strictly 2-local H     in P          [20, 21]
MAX2SAT          Easing strictly 2-local H     NP-complete   here

Table I. The satisfiability equivalent of curing the sign problem is to decide whether a given sentence is satisfiable, while the equivalent of easing is to find the minimal number of clauses that are violated by a sentence. Similarly, results on the computational complexity of curing and easing the non-stoquasticity of a local Hamiltonian $H$ are in one-to-one correspondence with the hardness of satisfiability problems.

Since MAXCUT is a variant of the MAX2SAT-problem, our results not only manifest but also crucially utilise the 2SAT-MAX2SAT dichotomy. Notice that since the MAXCUT-problem is NP-hard already on subgraphs of the double-layered square lattice [48], which has degree six, hard instances of the sign-easing problem occur already for low-dimensional lattices with small (constant) interaction strength.
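The correspondence between cuts and sign-flipping transformations described in the proof sketch can be checked by brute force on a small graph. The following is a minimal sketch, not the construction used in the formal proof: the graph `edges`, the helper names `kron_all` and `nu1`, and the omission of the ancilla penalty terms are all assumptions made for illustration. It only considers transformations of the form $Z_1^{s_1} \cdots Z_n^{s_n}$ and confirms that, for these, the non-stoquasticity $\nu_1$ of the $XX$ Hamiltonian equals the number of uncut edges.

```python
import itertools
import numpy as np

X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])
I2 = np.eye(2)


def kron_all(ops):
    """Tensor product of a list of single-qubit operators."""
    out = np.array([[1.0]])
    for op in ops:
        out = np.kron(out, op)
    return out


def nu1(H):
    """nu_1: summed positive off-diagonal entries divided by the dimension."""
    off = H - np.diag(np.diag(H))
    return np.clip(off, 0.0, None).sum() / H.shape[0]


# Hypothetical example graph: a triangle with one pendant vertex.
n = 4
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]

# H' with an X_i X_j term on every edge (ancilla penalty terms omitted).
H_prime = np.zeros((2**n, 2**n))
for i, j in edges:
    H_prime += kron_all([X if k in (i, j) else I2 for k in range(n)])

# Scan all transformations Z^{s_1} ... Z^{s_n} and record (nu_1, uncut edges).
table = {}
for s in itertools.product([0, 1], repeat=n):
    U = kron_all([Z if bit else I2 for bit in s])
    uncut = sum(1 for i, j in edges if s[i] == s[j])
    table[s] = (nu1(U @ H_prime @ U.T), uncut)

best_nu1 = min(value for value, _ in table.values())
max_cut = max(len(edges) - uncut for _, uncut in table.values())
print(best_nu1, max_cut)
```

Within this restricted family of transformations, minimizing $\nu_1$ is therefore the same task as maximizing the cut; the penalty terms of the full construction serve to rule out all other on-site transformations.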

In our complexity-theoretic analysis, we have focused on the computational complexity of easing the sign problem as the size of an arbitrary input graph is scaled up, in the same mindset as Refs. [18–21]. We expect, however, that the complexity of SignEasing scales similarly in the size of the lattice unit cell and the local dimension of translation-invariant systems such as those discussed above.

Summary

Let us summarize: Our work introduces the sign easing methodology as a systematic novel paradigm useful for assessing and understanding the sign problem of QMC simulations. We ask and answer three central questions using complementary methods from theoretical and applied computer science as well as from physics. First, we define a measure of non-stoquasticity suitable for easing the sign problem and extensively discuss its relation to the average sign. Second, we demonstrate that one can feasibly optimize this measure over local bases in simple settings by applying geometric optimization methods. Finally, we establish the computational complexity of sign easing in a broader but still simple setting.

In this way, our work not only identifies a means of easing the sign problem and demonstrates its feasibility and potential, but also reveals its fundamental limitations in terms of computational complexity. Moreover, we are confident that the framework of our work provides both valuable guidance and the practical means for future research on systematically easing the sign problem of Hamiltonians that are particularly interesting and relevant in condensed-matter and material science applications.

Outlook

As a first general and systematic attempt at easing the sign problem, we have restricted the focus of this work in several ways. As such, a number of questions, generalizing our results in different directions, are left open.

First, we have restricted our discussion to the prominent world-line Monte Carlo method to maintain clarity throughout the manuscript. We are confident, however, that our results find immediate application for other Monte Carlo methods such as stochastic series expansion Monte Carlo and determinantal Monte Carlo [37, 49] as well as diffusion Monte Carlo techniques such as full-configuration-interaction Monte Carlo [50]. Similar sign problems involving the sampling from quasi-probability distributions also appear in different contexts, for example, in approaches to the classical simulation of quantum circuits [51–53] or high-energy physics [54]. In these contexts, too, the problem of finding better bases in which to perform the sampling appears. While the framework developed in this work uses the specific features of QMC, the general idea and mindset behind it applies to all basis-dependent sign problems. Our work thus paves the way towards easing sign problems in a plethora of contexts.

Second, we have only considered real-valued Hamiltonians and transformations which preserve this property. For general complex-valued Hamiltonians, the sign problem takes the form of a complex phase problem. A natural follow-up of our work is to explore how our results on easing the sign problem generalize to the complex phase problem.

Third, we have put an emphasis on the conjugation of Hamiltonians under on-site Clifford and orthogonal circuits. In principle, one may also allow for arbitrary quasi-local circuits, as long as the conjugation can be efficiently computed, albeit at an effort that increases exponentially with the support of the involved unitaries. This leads to the interesting insight that within the trivial phase of matter, one can always remove the sign problem: One has to conjugate the Hamiltonian with the quasi-local unitary that brings a given Hamiltonian into an on-site form of a fixed point Hamiltonian. For given Hamiltonians, this may be impractical, of course. In this sense, one can identify trivial quantum phases of matter as efficiently computable phases of matter, an intriguing state of affairs from a conceptual perspective. Conversely, for topologically ordered systems, there may be topological obstructions to curing the sign problem by any quasi-local circuit [6, 55], giving rise to an entire phase of matter that exhibits an intrinsic sign problem. For example, the fixed point Hamiltonians of the most general class of non-chiral topologically ordered systems, the Levin-Wen models [56], are associated with 12-local Hamiltonians, many of which are expected not to be curable of their sign problem. This insight further motivates studying the sign easing problem for efficiently computable subgroups of local unitaries from the perspective of topological phases of matter.

Our work also opens up several paths for future research. The immediate and practically most relevant direction is of course to find the best possible way of minimizing the non-stoquasticity of translation-invariant systems and to explore how well the sign problem can be eased in systems that are not yet amenable to QMC. We have already introduced a flexible optimization approach which can be straightforwardly applied to a wide range of translation-invariant systems and ansatz classes in any dimensionality. In this respect, it will be interesting to compare possible ways of optimizing the sign problem via different measures [57] and optimization algorithms [58] in various systems [59].

Furthermore, in our hardness proof we have shown that the easing problem is intricately related to satisfiability problems. Building on this connection, an exciting direction of research is to combine highly efficient SAT-solvers that are capable of exploring combinatorially large sets with manifold optimization techniques that are able to handle rich geometrical structures, in the spirit of recent work [60]. While our hardness result reveals fundamental limitations of SignEasing in the general case, it thus also opens the door to potentially solving the sign easing problem in relevant instances by applying methods well known in computer science to relaxed versions of the easing problem. One may thus hope that for large classes of relevant instances minimizing non-stoquasticity is actually tractable.

A question closely related to the sign easing problem is the following: How hard is it to find the ground state energy of a stoquastic Hamiltonian, a sub-problem of the so-called local Hamiltonian problem? The computational complexity of this stoquastic local Hamiltonian problem poses fundamental limitations on the classical simulatability of Hamiltonians which do not suffer from a sign problem and are therefore amenable to QMC simulations. It has been shown that the 2-local stoquastic Hamiltonian problem is complete for the class StoqMA [61, 62], a class intermediate between AM and MA that also functions as a genuinely intermediate class in the complexity classification of local Hamiltonian problems [63], even when extending to the full low-energy spectrum [64]. The results of Ref. [61] also imply that we cannot expect to efficiently find a stoquastic local basis for arbitrary local Hamiltonians unless the unlikely complexity-theoretic equality AM = QMA holds.

Indeed, for efficiently curable Hamiltonians, the local Hamiltonian problem is reduced to a stoquastic local Hamiltonian problem. Conversely, both the easing problem and the stoquastic local Hamiltonian problem contribute to the hardness of a QMC procedure. For a given Hamiltonian, QMC may thus be computationally intractable for two reasons: it is hard to find a basis in which the Hamiltonian is stoquastic, or cooling to its ground state is computationally hard in its own right. In a QMC algorithm, the latter hardness is manifested as a Markov chain Monte Carlo algorithm not converging in polynomial time. This may be the case even for classical models such as Ising spin glasses [48].

An important open question is how the hardness of easing the sign problem and the hardness of sampling from the estimator distribution are related in specific cases. For example, when improving the average sign, the hardness of a problem that was manifest in an increased sample complexity of the Monte Carlo estimator might be 'transferred' to the hardness of sampling from the resulting distribution. On the other hand, there might be instances in which the only obstacle in the way of an efficient simulation is to find a certain basis in which the corresponding Hamiltonian has a large average sign, but, given that basis, QMC runs efficiently.

Overview

The plan for the technical part of this work is as follows: In Section I we sketch the idea of world-line QMC methods and explain how the sign problem arises there. In Section II we then discuss the relation between the average sign and non-stoquasticity. There, we construct examples showing that the two are in general unrelated (II A), but then continue to argue both analytically (II B) and numerically (II C) that the non-stoquasticity $\nu_1$ defined in Eq. (1) is a meaningful and efficiently computable (II D) measure of the sign problem. In Section III we perform a proof-of-principle numerical study showing that easing is both feasible and meaningful for translationally invariant models with a sign problem. In Section IV we then study the fundamental limitations of a systematic approach to the sign problem by proving the computational hardness of SignEasing when allowing for both orthogonal Clifford (Theorem 8) and general orthogonal transformations (Theorem 9).

I. THE SIGN PROBLEM OF QUANTUM MONTE CARLO

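We begin the technical part of this work with an exposition of the basics of Quantum Monte Carlo methods. For the purpose of this work, we focus on the prominent world-line Monte Carlo method of calculating partition functions and thermal expectation values of a Hamiltonian $H$ at inverse temperature $\beta$ [49]. Here, both quantities are expressed as

$$ Z_{\beta,H} \simeq \mathrm{Tr}[T_m^m] = \sum_{\vec\lambda \in \Lambda_{m+1},\ \lambda_{m+1} = \lambda_1} a(\vec\lambda) \tag{6} $$

$$ \langle O \rangle_{\beta,H} \simeq \frac{1}{Z_{\beta,H}} \mathrm{Tr}[T_m^m O] = \frac{1}{Z_{\beta,H}} \sum_{\vec\lambda \in \Lambda_{m+1}} a(\vec\lambda)\, O(\lambda_m|\lambda_1), \tag{7} $$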

for large enough $m \in \mathbb{N}$ Monte Carlo steps in terms of the amplitudes

$$ a(\vec\lambda) = T_m(\lambda_1|\lambda_2)\, T_m(\lambda_2|\lambda_3) \cdots T_m(\lambda_m|\lambda_{m+1}), \tag{8} $$

on the configuration space $\Lambda_{m+1} = [\dim H]^{\times(m+1)}$. Here, we have defined the transfer matrix $T_m(\lambda'|\lambda) = \langle\lambda'|1 - \beta H/m|\lambda\rangle$ and in general denote the entries of a matrix $A$ as $A(\lambda_1|\lambda_2) = \langle\lambda_1|A|\lambda_2\rangle$. The computation of the partition function involves a summation over all closed paths of length $m$ (i.e., paths with periodic boundary conditions); the computation of general observables involves a summation over all open paths.
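The path-sum representation of Eq. (6) can be made concrete on a toy system. The sketch below is illustrative only: the two-qubit Hamiltonian `H`, the parameters `beta` and `m`, and all function names are assumptions chosen for this example. It enumerates every closed path, multiplies the transfer-matrix entries along it as in Eq. (8), and compares the result with $\mathrm{Tr}[T_m^m]$.

```python
import itertools
import numpy as np


def transfer_matrix(H, beta, m):
    """T_m = 1 - beta * H / m."""
    return np.eye(H.shape[0]) - beta * H / m


def amplitude(T, path):
    """a(lambda) for a path (lambda_1, ..., lambda_{m+1}), as in Eq. (8)."""
    a = 1.0
    for lam, lam_next in zip(path[:-1], path[1:]):
        a *= T[lam, lam_next]
    return a


def partition_function_by_paths(T, m):
    """Sum the amplitudes of all closed paths (lambda_{m+1} = lambda_1)."""
    dim = T.shape[0]
    total = 0.0
    for head in itertools.product(range(dim), repeat=m):
        total += amplitude(T, head + (head[0],))
    return total


# Hypothetical toy Hamiltonian on two qubits.
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])
H = np.kron(X, X) + 0.5 * np.kron(Z, np.eye(2))

beta, m = 1.0, 4
T = transfer_matrix(H, beta, m)
Z_paths = partition_function_by_paths(T, m)
Z_trace = np.trace(np.linalg.matrix_power(T, m))
Z_exact = np.sum(np.exp(-beta * np.linalg.eigvalsh(H)))
print(Z_paths, Z_trace, Z_exact)
```

The first two numbers agree exactly, while the third is only approached as $m$ grows. The enumeration visits $(\dim H)^m$ paths, which is precisely the exponential cost that Monte Carlo sampling is meant to avoid.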

For non-negative path weights, both quantities may be rewritten as expectation values in a probability distribution $q(\vec\lambda) = a(\vec\lambda)/\sum_{\vec\lambda} a(\vec\lambda)$, which reduces to $q(\vec\lambda) = a(\vec\lambda)/Z_{\beta,H}$ when computing the expectation value of diagonal observables. The sign problem is manifested in the fact that the off-diagonal entries of $H$ may be positive, potentially implying that $a(\vec\lambda) < 0$. Therefore $q(\vec\lambda)$ is in general a quasi-probability distribution.

To compute the quantities (6) and (7) via Monte Carlo sampling, one constructs a linear estimator as the expectation value $\langle f \rangle_p = \sum_{\vec\lambda} p(\vec\lambda) f(\vec\lambda)$ of a random variable $f$ distributed according to a probability distribution $p$. By Chebyshev's inequality the statistical error $\epsilon$, i.e. the deviation from the mean, when averaging $s$ samples of an i.i.d. random variable $X$ is upper bounded in terms of its variance as

$$ \epsilon \le \sqrt{\mathrm{Var}(X)/(s(1-\delta))} , \tag{9} $$

with probability at least $1 - \delta$. Hence, to achieve any relative error $\epsilon$, the number of samples needs to grow with the variance of the random variable normalized by its expectation value. In fact, it can be easily shown that the variance-optimal estimator for the partition function $Z_{\beta,H}$ is given by the probability distribution $p(\vec\lambda) = |a(\vec\lambda)|/\|a\|_{\ell_1}$ with $\|a\|_{\ell_1} = \sum_{\vec\lambda} |a(\vec\lambda)|$ and the estimator $f(\vec\lambda) = \mathrm{sign}(a(\vec\lambda)) \cdot \|a\|_{\ell_1}$ [53]. The variance of this estimator is given by

$$ \mathrm{Var}_p(f) = \|a\|_{\ell_1}^2 \left(\|q\|_{\ell_1}^2 - 1\right) \tag{10} $$

and hence the relative error of the approximation by

$$ \frac{\mathrm{Var}_p(f)}{\langle f \rangle_p^2} = \|q\|_{\ell_1}^2 - 1 \equiv \langle \mathrm{sign} \rangle_p^{-2} - 1, \tag{11} $$

where $\langle \mathrm{sign} \rangle_p = 1/\|q\|_{\ell_1}$ is called the average sign of the quasi-probability distribution $q$. One may interpret the average sign as the ratio between the partition functions of the original system with Hamiltonian $H$ acting on $n$ qubits and a corresponding 'bosonic system' with Hamiltonian $H' = (H - 2H_\neg)$ as $\langle \mathrm{sign} \rangle_p = \mathrm{Tr}[e^{-\beta H}]/\mathrm{Tr}[e^{-\beta H'}]$. Generically, such a quantity is expected to scale as $e^{-\beta n \Delta f}$, that is, inverse exponentially in the particle number $n$, the inverse temperature $\beta$, and the free energy density difference $\Delta f = f' - f \ge 0$ between 'bosonic' and original system [18].
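The relation between the estimator's relative variance and the average sign in Eq. (11) can be verified by exact enumeration on a small example. The sketch below is a toy check under stated assumptions: the three-level 'Hamiltonian' `H` with all-positive off-diagonal entries, the parameters `beta` and `m`, and the function name `closed_path_amplitudes` are hypothetical choices, not taken from the text.

```python
import itertools
import numpy as np


def closed_path_amplitudes(T, m):
    """Amplitudes a(lambda) of all closed paths of length m."""
    dim = T.shape[0]
    amps = []
    for path in itertools.product(range(dim), repeat=m):
        a = 1.0
        for k in range(m):
            a *= T[path[k], path[(k + 1) % m]]
        amps.append(a)
    return np.array(amps)


# Hypothetical frustrated toy model: three levels, all couplings positive.
H = np.ones((3, 3)) - np.eye(3)
beta, m = 1.0, 5
T = np.eye(3) - beta * H / m

amps = closed_path_amplitudes(T, m)
norm_a = np.abs(amps).sum()          # ||a||_{l1}
p = np.abs(amps) / norm_a            # sampling distribution p
f = np.sign(amps) * norm_a           # estimator f

mean_f = np.sum(p * f)               # reproduces the sum of all amplitudes
var_f = np.sum(p * f**2) - mean_f**2
avg_sign = mean_f / norm_a           # <sign>_p = 1 / ||q||_{l1}
rel_var = var_f / mean_f**2

sign_from_traces = (np.trace(np.linalg.matrix_power(T, m))
                    / np.trace(np.linalg.matrix_power(np.abs(T), m)))
print(avg_sign, sign_from_traces, rel_var, avg_sign**-2 - 1)
```

The first two printed numbers coincide, as do the last two, so the number of samples needed for a fixed relative error grows as $\langle \mathrm{sign} \rangle_p^{-2}$.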

In order to minimize the relative approximation error of a QMC algorithm, we therefore need to minimize the inverse average sign, or equivalently $\|q\|_{\ell_1}$, over the allowed set of basis choices which we denote by $\mathcal{U}$. To optimally ease the sign problem in terms of its sample (and hence computational) complexity one therefore needs to solve the following minimization problem

$$ \min_{U \in \mathcal{U}} \|q\|_{\ell_1}^2 - 1 = \min_{U \in \mathcal{U}} \frac{\mathrm{Tr}[|U T_m U^\dagger|^m]^2}{\mathrm{Tr}[T_m^m]^2} - 1, \tag{12} $$

where as throughout this work $|\cdot|$ denotes taking the entry-wise absolute value and not the matrix absolute value.
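To illustrate how the objective in Eq. (12) depends on the basis, the following sketch evaluates it for one hypothetical example: three qubits with $X_iX_j$ couplings on a triangle, once in the original basis and once after an on-site Hadamard transformation. The model, the parameters and the function names are assumptions for illustration; the exact evaluation via matrix powers is only feasible for very small systems.

```python
import numpy as np

X = np.array([[0.0, 1.0], [1.0, 0.0]])
I2 = np.eye(2)
HAD = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)  # real orthogonal


def kron_all(ops):
    """Tensor product of a list of single-qubit operators."""
    out = np.array([[1.0]])
    for op in ops:
        out = np.kron(out, op)
    return out


def sign_objective(H, U, beta, m):
    """Objective of Eq. (12) for a real orthogonal basis change U."""
    dim = H.shape[0]
    T = np.eye(dim) - beta * H / m
    T_rot = U @ T @ U.T
    numerator = np.trace(np.linalg.matrix_power(np.abs(T_rot), m))
    denominator = np.trace(np.linalg.matrix_power(T, m))  # basis independent
    return (numerator / denominator) ** 2 - 1.0


# Hypothetical frustrated example: X_i X_j on all three edges of a triangle.
n = 3
H = np.zeros((2**n, 2**n))
for pair in [(0, 1), (1, 2), (0, 2)]:
    H += kron_all([X if k in pair else I2 for k in range(n)])

beta, m = 1.0, 10
obj_identity = sign_objective(H, np.eye(2**n), beta, m)
obj_hadamard = sign_objective(H, kron_all([HAD] * n), beta, m)
print(obj_identity, obj_hadamard)
```

In this example the Hadamard transformation maps every $XX$ term to a diagonal $ZZ$ term, so the objective drops to zero up to rounding, whereas it is strictly positive in the original basis.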

II. THE RELATION BETWEEN THE AVERAGE SIGN AND NON-STOQUASTICITY

The difficulty in dealing with the minimization problem (12) is manifold. First, determining the quantity $\|q\|_{\ell_1} = \mathrm{Tr}[|T_m|^m]/\mathrm{Tr}[T_m^m]$ via QMC suffers from the very sign problem it quantifies: it can easily be checked that the relative variance of $\langle \mathrm{sign} \rangle_p$ is precisely given by $\langle \mathrm{sign} \rangle_p^{-2} - 1$. It thus inherits the complexity of computing the partition function $Z_{\beta,H}$ in the first place. Naïve optimization of the term $\mathrm{Tr}[|T_m|^m]/\mathrm{Tr}[T_m^m]$ even incurs the cost of diagonalizing the exponential-size matrices $T_m$ and $|T_m|$. Second, the optimization problem is non-convex and highly non-linear in the unitary transformation $T \mapsto UTU^\dagger$ with $U \in \mathcal{U}$.

While it might be possible to minimize the unitarily dependent term $\mathrm{Tr}[|T_m|^m]$ and its gradient stochastically via QMC in some cases [57, 65], such approaches cannot yield certificates for the quality of the obtained basis as the average sign itself is not computed. Moreover, they are dependent on the distribution defined by $|T_m|$ being well-behaved (i.e., ergodic and satisfying detailed balance) for QMC algorithms.

It therefore seems infeasible to find a converging and efficient algorithm for minimizing the average sign for general Hamiltonians directly. Ideally, one could find a simple quantity measuring the non-stoquasticity of the Hamiltonian which can be connected to the inverse average sign in a meaningful way while at the same time admitting efficient evaluation.

A. Case studies

We now show that this hope is in vain in its most general formulation. Specifically, we provide an example of a Hamiltonian which has large positive entries but is nevertheless sign-problem free (has unit average sign) for specific choices of $\beta$ and $m$, as well as an example of a Hamiltonian that is close to stoquastic but incurs an arbitrarily small average sign for certain choices of $\beta$ and $m$ in a specific QMC procedure.

Here, as throughout this work, whenever we consider systems of multiple qubits, for $A \in \mathbb{C}^{2 \times 2}$ we define

$$ A_i = A \otimes 1_{\{i\}^c}, \tag{13} $$

to be the operator that acts as $A$ on qubit $i$ and trivially on its complement $\{i\}^c$.

Example 3 (Highly non-stoquastic but sign-problem free Hamiltonians). Let us define a Hamiltonian term acting on two qubits with label $i, j$ as

$$ h_{i,j} = -\frac{1}{2}(X_iX_j - Y_iY_j) + X_i. \tag{14} $$

Then this Hamiltonian term is non-stoquastic with total weight $\nu_1(h_{i,j}) = 1$. What is more, the $n$-qubit Hamiltonian

$$ H = 1 + \sum_{i<j}^{n} h_{i,j} \tag{15} $$

is highly non-stoquastic with total weight $\nu_1(H) = n$. At the same time, the QMC algorithm for computing the partition function of $H$ with parameters $\beta, m$ has average sign $\langle \mathrm{sign} \rangle_{\beta,m} = 1$.

Proof. We first determine the non-stoquasticity of $H$ as

$$ \nu_1(H) = \sum_i \nu_1(X_i) = n. \tag{16} $$

To see why the QMC algorithm has unit average sign, note that the transfer matrix $T_m = 1 - \beta H/m$ has negative entries $T_m(\lambda|\lambda') < 0$ only if the parity of $\lambda \oplus \lambda'$ is odd, since for these terms only a single $X$ term contributes. Whenever $\lambda \oplus \lambda' = 0$, i.e., has even parity, we have $T_m(\lambda|\lambda') \ge 0$ since only $XX - YY$ terms or the diagonal contribute, both of which have non-negative matrix elements.

In the calculation of the partition function, the summation runs over closed paths only. But for any closed path $\lambda_1 \to \lambda_2 \to \cdots \to \lambda_m \to \lambda_1$, it is necessary that the total parity $(\lambda_1 \oplus \lambda_2) \oplus \ldots \oplus (\lambda_m \oplus \lambda_1)$ vanishes. In particular, this implies that every allowed path incurs an even number of odd-parity steps and therefore an even number of negative signs. Therefore, only non-negative paths contribute to the path integral and the average sign is attained at unity.
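The parity argument of this proof can be checked numerically on a few qubits. The sketch below builds the Hamiltonian of Eq. (15) for $n = 3$ and evaluates the average sign as $\mathrm{Tr}[T_m^m]/\mathrm{Tr}[|T_m|^m]$; the values of `n`, `beta` and `m` and the helper names `embed` and `h_term` are assumptions made for this illustration.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
I2 = np.eye(2, dtype=complex)


def embed(ops_by_site, n):
    """Tensor product with the given operators on chosen sites, identity elsewhere."""
    out = np.array([[1.0 + 0.0j]])
    for k in range(n):
        out = np.kron(out, ops_by_site.get(k, I2))
    return out


def h_term(i, j, n):
    """h_{i,j} = -(X_i X_j - Y_i Y_j)/2 + X_i, which is a real matrix."""
    xx = embed({i: X, j: X}, n)
    yy = embed({i: Y, j: Y}, n)
    return (-0.5 * (xx - yy) + embed({i: X}, n)).real


n = 3
dim = 2**n
H = np.eye(dim)
for i in range(n):
    for j in range(i + 1, n):
        H += h_term(i, j, n)

beta, m = 1.0, 7
T = np.eye(dim) - beta * H / m
avg_sign = (np.trace(np.linalg.matrix_power(T, m))
            / np.trace(np.linalg.matrix_power(np.abs(T), m)))

# Positive off-diagonal weight of H, normalized by the dimension.
off = H - np.diag(np.diag(H))
nu1_H = np.clip(off, 0.0, None).sum() / dim
print(avg_sign, nu1_H, T.min())
```

Although the transfer matrix contains negative entries and $\nu_1(H)$ is non-zero, the average sign equals one up to rounding, because every closed path contains an even number of single-flip steps.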

Example 4 (Barely non-stoquastic Hamiltonians with arbitrarily small average sign). Let us define the 2-qubit Hamiltonian

$$ \begin{aligned} H_{a,b} = \frac{m}{\beta} \Big( & 1 \otimes 1 - 1 \otimes X - \tfrac{1}{2}(X \otimes X + Y \otimes Y) \\ & + \tfrac{1}{2}\big[(a+b)\, X \otimes Z + (b-a)\, X \otimes 1\big] \Big), \end{aligned} \tag{17, 18} $$

with $b \ge a > 0$ positive numbers and $m \in 2\mathbb{N} + 1$. The non-stoquasticity of $H_{a,b}$ is given by $\nu_1(H_{a,b}) = bm/(2\beta)$, while the average sign of QMC with parameters $\beta$ and $m$ is bounded as $|\langle \mathrm{sign} \rangle_{\beta,m}| \le C(b-a)/a$, where $C$ is a constant. Thus, even for arbitrarily small non-stoquasticity we can make the sign problem unboundedly severe as we tune $a$ to be close to $b$.

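Proof. We derive the bound on the average sign. For the given Hamiltonian, the corresponding transfer matrix for a QMC algorithm for inverse temperature $\beta$ with $m$ steps is given by

$$ T_m \equiv T_{a,b} = \begin{pmatrix} 0 & 1 & -b & 0 \\ 1 & 0 & 1 & a \\ -b & 1 & 0 & 1 \\ 0 & a & 1 & 0 \end{pmatrix}. \tag{19} $$

Recall that the average sign can be written as

$$ \langle \mathrm{sign} \rangle_{\beta,m} = \frac{\mathrm{Tr}[T_m^m]}{\mathrm{Tr}[|T_m|^m]}. \tag{20} $$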

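We denote by $\bar{T}_m$ a matrix similar to $T_m$ but where the positions of $a$ and $-b$ are exchanged. Due to the symmetry of the problem we have that $\mathrm{Tr}[T_m^m] = \mathrm{Tr}[\bar{T}_m^m]$ and $\mathrm{Tr}[|T_m|^m] = \mathrm{Tr}[|\bar{T}_m|^m]$. Hence,

$$ \mathrm{Tr}[T_m^m] = \frac{1}{2}\Big[\mathrm{Tr}[T_m^m] + \mathrm{Tr}[\bar{T}_m^m]\Big] \tag{21} $$

$$ = \frac{1}{2} \sum_{\vec\lambda \in \Lambda_m} \Big[ T_m(\lambda_1|\lambda_2) \cdots T_m(\lambda_m|\lambda_1) + \bar{T}_m(\lambda_1|\lambda_2) \cdots \bar{T}_m(\lambda_m|\lambda_1) \Big] \tag{22, 23} $$

$$ = \frac{1}{2} \sum_{\vec\lambda} \Big[ a^{f(\vec\lambda)} (-b)^{g(\vec\lambda)} + a^{g(\vec\lambda)} (-b)^{f(\vec\lambda)} \Big], \tag{24} $$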

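where in the last line we have used the fact that every summand is a polynomial in the entries of $T_{a,b}$. The functions $g, f : \Lambda_m \to [m]$ describe the corresponding exponents. A little thought reveals that since all paths are closed and $m$ is odd, $g(\vec\lambda) + f(\vec\lambda)$ is larger than 1 and also odd for all $\vec\lambda$. We thus find that one of the two terms for each $\vec\lambda$ must be negative and

$$ |\mathrm{Tr}\, T_m^m| \le \frac{1}{2} \sum_{\vec\lambda} \big| a^{f(\vec\lambda)} b^{g(\vec\lambda)} - b^{f(\vec\lambda)} a^{g(\vec\lambda)} \big| \tag{25} $$

$$ \le \frac{1}{2} \sum_{\vec\lambda} \big(2^{g(\vec\lambda)} - 1\big)\, a^{f(\vec\lambda) + g(\vec\lambda) - 1}\, |b - a|. \tag{26} $$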

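Furthermore, we have

$$ \big|\mathrm{Tr}\, |T_m|^m\big| = \frac{1}{2} \sum_{\vec\lambda} \big( a^{f(\vec\lambda)} b^{g(\vec\lambda)} + b^{f(\vec\lambda)} a^{g(\vec\lambda)} \big) \tag{27} $$

$$ \ge \sum_{\vec\lambda} a^{f(\vec\lambda) + g(\vec\lambda)}. \tag{28} $$

Combining these two bounds and using $g(\vec\lambda) \le m$, we conclude that

$$ |\langle \mathrm{sign} \rangle| \le \Big( 2^{m-1} - \frac{1}{2} \Big) \frac{|b - a|}{a}. \tag{29} $$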

The second example shows that in principle also Hamiltonians with arbitrarily small positive entries can cause a severe increase of the sampling complexity of specific Monte Carlo algorithms. Interestingly, in this situation the sign problem cannot be eased by a basis change: the average sign vanishes since the unitarily invariant term $|\mathrm{Tr}\, T_m^m|$ is tuned to be small. On the contrary, the sign problem can be completely avoided in this example by choosing the Monte-Carlo path length to be even instead of odd.
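The behaviour claimed in Example 4 can be inspected directly from the transfer matrix of Eq. (19). The sketch below is a small numerical check under assumed parameter values (`a`, `b`, `beta` and the odd path lengths `m = 3` and `m = 5` are arbitrary illustrative choices); the helper names are hypothetical.

```python
import numpy as np


def transfer_ab(a, b):
    """Transfer matrix T_{a,b} of Eq. (19)."""
    return np.array([[0.0, 1.0, -b, 0.0],
                     [1.0, 0.0, 1.0, a],
                     [-b, 1.0, 0.0, 1.0],
                     [0.0, a, 1.0, 0.0]])


def average_sign(T, m):
    """<sign> = Tr[T^m] / Tr[|T|^m], Eq. (20)."""
    return (np.trace(np.linalg.matrix_power(T, m))
            / np.trace(np.linalg.matrix_power(np.abs(T), m)))


def nu1(H):
    """nu_1: summed positive off-diagonal entries divided by the dimension."""
    off = H - np.diag(np.diag(H))
    return np.clip(off, 0.0, None).sum() / H.shape[0]


a, beta, m = 1.0, 1.0, 3
rows = []
for b in [1.0, 1.01, 1.1, 1.5]:
    T = transfer_ab(a, b)
    H = (m / beta) * (np.eye(4) - T)          # invert T = 1 - beta * H / m
    bound = (2 ** (m - 1) - 0.5) * abs(b - a) / a
    rows.append((b, average_sign(T, m), bound, nu1(H)))
    print(rows[-1])

sign_m5_equal = average_sign(transfer_ab(a, a), 5)
```

For $m = 3$ only two triangles contribute to the traces, so the average sign reduces to $(a - b)/(a + b)$; it vanishes at $a = b$ even though $\nu_1(H_{a,b}) = bm/(2\beta)$ stays finite, in line with the bound of Eq. (29).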

These simple examples illustrate the following important observation: The sign problem as measured by the average sign can in certain situations be avoided or intensified by fine-tuning the problem and parameters of the QMC procedure independently of the actual magnitude of the positive entries of the Hamiltonian. But such examples seem to rely on an intricate conspiracy of the structure of the Hamiltonian and the chosen QMC procedure, e.g., the discretization. It is plausible to assume that the most pathological cases are unlikely to appear in practical applications, and can at least be rather easily overcome by slightly modifying the QMC algorithm.

B. Measures of non-stoquasticity

In this work, our goal is to develop a more general methodology for the task of easing the sign problem that is independent of the details of the QMC algorithm and the combinatorial properties of potential paths that can be constructed from the entries of the transfer matrix. Very much in the spirit of the notion of stoquasticity we aim at a property of the Hamiltonian in a given basis to measure its deviation from having a good sampling complexity. Natural candidates for such a non-stoquasticity measure of a Hamiltonian are entry-wise norms of its positive entries. For any $p \ge 1$ we define the non-stoquasticity of $H$ as

$$ \nu_p(H) = D^{-1} \|H_\neg\|_{\ell_p}, \tag{30} $$

where $\|\cdot\|_{\ell_p}$ denotes the vector-$\ell_p$ norm. For every $p$, $\nu_p$ is efficiently computable for local Hamiltonians on bounded-degree graphs and therefore suitable for easing the sign problem of a large class of Hamiltonians by local basis choices. It is also obviously a measure of the non-stoquasticity in the sense that $\nu_p(H) = 0$ if and only if $H$ is stoquastic. We note that we have chosen our definition such that the non-stoquasticity measure $\nu_p$ scales extensively in the number of non-stoquastic terms of a local Hamiltonian. This is because every non-stoquastic local Hamiltonian term creates on the order of $2^n$ positive matrix entries in a global $n$-qubit Hamiltonian matrix due to tensoring with identities on the complement of its support.
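For a Hamiltonian given as an explicit real matrix, the definition in Eq. (30) translates into a few lines of code. The sketch below assumes that $H_\neg$ denotes the matrix of positive off-diagonal entries of $H$ and uses the hypothetical function names `nonstoquastic_part` and `nu_p`; working with the full matrix is of course only possible for small systems.

```python
import numpy as np


def nonstoquastic_part(H):
    """Positive off-diagonal entries of a real matrix H, all others set to zero."""
    off = H - np.diag(np.diag(H))
    return np.clip(off, 0.0, None)


def nu_p(H, p=1):
    """nu_p(H): entry-wise l_p norm of the non-stoquastic part divided by D."""
    D = H.shape[0]
    return np.linalg.norm(nonstoquastic_part(H).ravel(), ord=p) / D


X = np.array([[0.0, 1.0], [1.0, 0.0]])
I2 = np.eye(2)

print(nu_p(X))                 # a single positive X term
print(nu_p(-X))                # stoquastic: no positive off-diagonal entries
print(nu_p(np.kron(X, I2)))    # tensoring with the identity leaves nu_1 unchanged
print(nu_p(np.kron(X, I2) + np.kron(I2, X)))  # two terms: nu_1 adds up
```

The last two lines illustrate the normalization by $D$: a local term contributes the same amount regardless of the size of the surrounding system, so $\nu_1$ grows with the number of non-stoquastic terms.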

Our examples in the previous section show that it is notoriously difficult if not impossible to connect any notion of non-stoquasticity to the actual sample complexity incurred by a QMC procedure as measured by the inverse average sign. This is due to the dependence of the average sign on the combinatorics of Monte Carlo paths. However, those examples involved a large degree of fine-tuning. Therefore, one might hope to find a connection between non-stoquasticity and average sign for generic cases.

So let us look at the connection between optimizing a non-stoquasticity measure $\nu_p$ and optimizing the QMC sampling complexity as in (12). Our measure can be expressed in terms of the transfer matrix $T_m$ as

$$ \nu_p(H) = \frac{1}{D} \frac{m}{2\beta} \big\| |T_m| - T_m \big\|_{\ell_p}, \tag{31} $$

where we assume that $\mathrm{diag}(\beta H/m) \le 1$.

Due to the unitary invariance of the trace, the optimization of the sampling complexity via (12) is equivalent to minimising

$$ S(U) = \mathrm{Tr}[|U T_m U^\dagger|^m] - \mathrm{Tr}[T_m^m]. \tag{32} $$

Let us, for the sake of clarity, suppress the explicit dependence on the unitary $U$ and write $T_m$ for $U T_m U^\dagger$. If we define the positive and negative entries of the transfer matrix respectively as $\Delta_\pm = \frac{1}{2}(|T_m| \pm T_m)$, we can write

$$ S(U) = \mathrm{Tr}[|T_m|^m] - \mathrm{Tr}[T_m^m] \tag{33} $$

$$ = 2 \sum_{\vec{s} \in \{\pm\}^m :\, \vec{s} \text{ odd}} \mathrm{Tr}[\Delta_{s_1} \cdots \Delta_{s_m}]. \tag{34} $$

The summation in the last line is restricted to all $\vec{s} \in \{\pm\}^m$ with an odd number of negative signs. The resulting expression thus involves a summation over closed paths that contain an odd number of negative contributions such that $\Delta_{s_1}(\lambda_1|\lambda_2) \cdots \Delta_{s_m}(\lambda_m|\lambda_1) < 0$. In particular, every such path contains at least one step with a negative contribution.

The size of $S(U)$ thus depends both on the number of 'negative paths' and their individual weight. From Eq. (34) we find that the contribution to $S(U)$ of paths with exactly one negative step has the form

$$ 2m \sum_{\lambda_1, \lambda_2} \Delta_-(\lambda_1|\lambda_2)\, \Delta_+^{m-1}(\lambda_2|\lambda_1), \tag{35} $$

using the cyclicity of the trace. This expression (35) is a weighted sum over the negative entries of $T_m$, where the weights are given by the contribution $\Delta_+^{m-1}(\lambda_2|\lambda_1)$ of all positive paths of length $m - 1$.
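The decomposition of $S(U)$ in Eq. (34) and the single-negative-step contribution of Eq. (35) can be confirmed numerically for a small matrix. In the sketch below, the random symmetric matrix `T` stands in for a transformed transfer matrix; it, the path length `m`, and the variable names are assumptions made purely for illustration.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4))
T = (A + A.T) / 2            # hypothetical symmetric "transfer matrix"
m = 5

delta_plus = 0.5 * (np.abs(T) + T)    # positive entries of T
delta_minus = 0.5 * (np.abs(T) - T)   # magnitudes of the negative entries
parts = {+1: delta_plus, -1: delta_minus}

# Left-hand side: Eq. (33).
S_direct = (np.trace(np.linalg.matrix_power(np.abs(T), m))
            - np.trace(np.linalg.matrix_power(T, m)))

# Right-hand side: Eq. (34), summing over sign strings with an odd number of minus signs.
S_odd = 0.0
S_single = 0.0
for signs in itertools.product([+1, -1], repeat=m):
    n_neg = signs.count(-1)
    if n_neg % 2 == 1:
        prod = np.eye(T.shape[0])
        for s in signs:
            prod = prod @ parts[s]
        S_odd += 2.0 * np.trace(prod)
        if n_neg == 1:
            S_single += 2.0 * np.trace(prod)

# Eq. (35): contribution of paths with exactly one negative step.
linear_term = 2.0 * m * np.trace(
    delta_minus @ np.linalg.matrix_power(delta_plus, m - 1))
print(S_direct, S_odd, S_single, linear_term)
```

The first two numbers agree, as do the last two; the difference between the pairs is the contribution of paths with three or more negative steps.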

For a transfer matrix in which the positive entries do not significantly differ and their distribution relative to the negative entries is unstructured, we have constant $\Delta_+^{m-1}(\lambda_2|\lambda_1) \approx \|\Delta_+^{m-1}\|_{\ell_\infty}$. Therefore, the linear term (35) scales approximately as

$$ 2m \|\Delta_-\|_{\ell_1} \|\Delta_+^{m-1}\|_{\ell_\infty} \propto D \nu_1(H). \tag{36} $$

For higher-order negative contributions, we expect that $S(U)$ or, correspondingly, the inverse average sign scales as $\exp(c \cdot D\nu_1(H))$ for some $c > 0$. Our expectation is based on the following observation: in the calculation of the inverse average sign, all paths of length $m$ with an odd number of negative steps contribute. Potentially, in each step every negative entry of $T_m$ appears. Then the sum of all negative entries of $T_m$ contributes. But the number of paths with $k \in 2\mathbb{N}_0 + 1$ negative steps scales as $\binom{m}{k}$, which leads to an exponential growth in $\|H_\neg\|_{\ell_1}$ and hence $D\nu_1(H)$. In the following section, we provide a brief numerical analysis confirming this expectation.

We further observe that if the positive entries of $T_m$ are more structured, the weights appearing in Eq. (35) might deviate from a uniform distribution. In such a case, other $\nu_p$-measures become meaningful as a measure of the inverse average sign since they saturate a corresponding Hölder bound.

[Figure 4: plot of the inverse average sign $1/\langle \mathrm{sign} \rangle_{\beta,m}(H_\alpha)$ (logarithmic vertical axis, $10^0$ to $10^6$) against $\nu_1(H_\alpha) \cdot 2^n$ (horizontal axis, 0 to 1).]

Figure 4. The figure shows the inverse average sign for 100 randomly chosen instances of 5-qubit Hamiltonians $H_\alpha$ for $\beta = 1$ and $m = 100$ Monte Carlo steps as a function of $d\nu_1(H_\alpha)$. We find a roughly exponential dependence of the inverse average sign on $\nu_1(H_\alpha)$ as $1/\langle \mathrm{sign} \rangle_{\beta,m}(H_\alpha) \propto \exp(a \cdot d\nu_1(H_\alpha))$ for $a > 0$.

C. Numerical analysis

In this subsection, we provide evidence that $\nu_1(H)$ is indeed a meaningful measure of the sample complexity and hence the inverse average sign by exactly calculating the inverse average sign as a function of $\nu_1(H)$. We do so by randomly drawing 2-local Hamiltonians on a line of $n$ qubits of the form

$$ H = \sum_{i=1}^{n} T_i(h), \tag{37} $$

where $h \in \mathbb{R}^{4 \times 4}$ is a nearest-neighbour interaction term and the translation operator $T_i$ acts as $T_i(h) = 1_d^{\otimes (i-1)} \otimes h \otimes 1_d^{\otimes (n-i-1)}$. We choose each local term $h$ in an i.i.d. fashion from the zero-centered Gaussian measure and project to Hermitian matrices. For each random instance $H$, we then consider the one-parameter Hamiltonian family

$$ H_\alpha = H - H_\neg + \alpha \frac{H_\neg}{2^n \nu_1(H_\neg)}. \tag{38} $$

Note that $\nu_1(H_\alpha) = \alpha/2^n$. Fig. 4 shows that, generically, the average sign depends monotonically on the non-stoquasticity. Indeed, as expected for large $m$, the dependence is an exponential one.
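A scaled-down version of this numerical experiment can be sketched as follows. This is not the code used to produce Fig. 4: the system size `n = 3`, the open-chain convention for the sum in Eq. (37), the random seed, the grid of `alphas` and all function names are assumptions chosen so that the example runs quickly with exact matrix powers.

```python
import numpy as np


def chain_hamiltonian(h, n):
    """Sum of the nearest-neighbour term h over an open chain of n qubits."""
    H = np.zeros((2**n, 2**n))
    for i in range(n - 1):
        H += np.kron(np.kron(np.eye(2**i), h), np.eye(2**(n - i - 2)))
    return H


def nonstoquastic_part(H):
    """Positive off-diagonal entries of H."""
    off = H - np.diag(np.diag(H))
    return np.clip(off, 0.0, None)


def nu1(H):
    return nonstoquastic_part(H).sum() / H.shape[0]


def h_alpha(H, alpha):
    """One-parameter family of Eq. (38), with 2^n equal to the dimension of H."""
    H_neg = nonstoquastic_part(H)
    return H - H_neg + alpha * H_neg / (H.shape[0] * nu1(H_neg))


def inverse_average_sign(H, beta, m):
    T = np.eye(H.shape[0]) - beta * H / m
    return (np.trace(np.linalg.matrix_power(np.abs(T), m))
            / np.trace(np.linalg.matrix_power(T, m)))


rng = np.random.default_rng(1)
g = rng.normal(size=(4, 4))
h = (g + g.T) / 2                      # real symmetric local term
n, beta, m = 3, 1.0, 100
H = chain_hamiltonian(h, n)

alphas = [0.0, 0.25, 0.5, 0.75, 1.0]
inv_signs = [inverse_average_sign(h_alpha(H, al), beta, m) for al in alphas]
for al, value in zip(alphas, inv_signs):
    print(al, nu1(h_alpha(H, al)) * 2**n, value)
```

The middle column reproduces $\nu_1(H_\alpha) \cdot 2^n = \alpha$, and the inverse average sign equals one at $\alpha = 0$, where $H_\alpha$ is stoquastic. How quickly it grows with $\alpha$ depends on the instance and on $n$, $\beta$ and $m$.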

D. Computing the non-stoquasticity

Above, we have argued that a key problem of the average sign lies in the fact that it is not efficiently computable whenever there is a sign problem. But how does the non-stoquasticity measure $\nu_1$ fare in this respect? We now show that for arbitrary 2-local Hamiltonians the non-stoquasticity measure $\nu_1$ can in fact be approximated up to an inverse polynomially small additive error in polynomial time. While this is not possible for arbitrary local Hamiltonians as a simple example shows, any $\nu_p$-measure can be efficiently computed exactly in polynomial time for local Hamiltonians acting on bounded-degree graphs.

We write a real 2-local Hamiltonian with 1-local terms as

$$ H_{2+1} = \sum_{i<j} \big( a_{i,j} X_i X_j + b_{i,j} Y_i Y_j + c_{i,j} Z_i Z_j + x_{i,j} X_i Z_j + x_{j,i} Z_i X_j \big) + \sum_i \big( \alpha_i X_i + \gamma_i Z_i \big), \tag{39} $$

parametrized by real coefficient vectors a, b, c ∈Rn(n−1)/2, x ∈ Rn(n−1) which are non-zero only onthe edges (i, j) ∈ E of the interaction Hamiltonian graphG = (V,E) as well as vectors α, γ ∈ Rn. The Hamiltonianinteraction graph is defined by a set V of sites or vertices andthe edge set

E ={

(i, j) ∈ V × V :

¬(ai,j = bi,j = ci,j = xi,j = xj,i = 0)}. (40)

We call $\mathcal{N}(i) = \{j : (i,j) \in E\}$ the neighbourhood of site $i$ on the graph $G$, $\deg(i) = |\mathcal{N}(i)|$ the degree of site $i$, and $\deg(G) = \max_{i\in V}\deg(i)$ the degree of the overall graph. Notice that obtaining an expression for the non-stoquasticity is non-trivial since several local Hamiltonian terms may contribute to a particular entry of the global Hamiltonian matrix.

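Lemma 5 (Non-stoquasticity of (2 + 1)-local Hamiltonians). The non-stoquasticity measure $\nu_1$ of real 2-local Hamiltonians with 1-local terms of the form $H_{2+1}$ satisfies

$$\nu_1(H_{2+1}) = \sum_{i<j} \nu_1\left(a_{i,j}X_iX_j + b_{i,j}Y_iY_j\right) + \sum_i \nu_1\Big(\alpha_i X_i + \sum_{j \in \mathcal{N}_{XZ}(i)} x_{i,j} X_i Z_j\Big), \tag{41}$$

and it holds that

$$\nu_1\left(a_{i,j}X_iX_j + b_{i,j}Y_iY_j\right) = \frac{1}{2} \left( \max\{a_{i,j}+b_{i,j},0\} + \max\{a_{i,j}-b_{i,j},0\} \right), \tag{42}$$

$$\nu_1\Big(\alpha_i X_i + \sum_{j\in \mathcal{N}_{XZ}(i)} x_{i,j} X_i Z_j\Big) = 2^{-\deg_{XZ}(i)} \sum_{\lambda_{\mathcal{N}_{XZ}(i)}} \max\Big\{ \alpha_i + \sum_{j \in \mathcal{N}_{XZ}(i)} (-1)^{\lambda_j} x_{i,j},\, 0 \Big\}. \tag{43}$$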

Here, we have defined the XZ neighbourhood $\mathcal{N}_{XZ}(i) = \{j : x_{i,j} \neq 0\}$ of site $i$ as all vertices $j$ connected to $i$ by an $X_iZ_j$-edge. As useful shorthands, we also define the XZ degree $\deg_{XZ}(i) = |\mathcal{N}_{XZ}(i)|$ and the restriction of a spin configuration $\lambda \in \{0,1\}^n$ to an XZ neighbourhood as $\lambda_{\mathcal{N}_{XZ}(i)} = (\lambda_j)_{j \in \mathcal{N}_{XZ}(i)} \in \{0,1\}^{\deg_{XZ}(i)}$. We conceive of summation over an empty set (the case that $\mathcal{N}_{XZ}(i) = \{\}$) as resulting in 0 so that the corresponding term in Eq. (43) vanishes.

Notice that the non-stoquasticity of an XZ term does not depend on the sign of its weight, while for XX and X terms it crucially does.
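Lemma 5 can be checked numerically on a small system by comparing Eqs. (41)–(43) with a brute-force evaluation on the dense matrix. The following is a minimal sketch, assuming $\nu_1(H)$ is the sum of positive off-diagonal entries divided by $2^n$; the helper names (`op`, `build_H`, `nu1_dense`, `nu1_lemma5`) are illustrative and not taken from the authors' code.

```python
import itertools
import numpy as np

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])

def op(n, factors):
    """Tensor product with the given single-site matrices, identity elsewhere."""
    out = np.eye(1)
    for site in range(n):
        out = np.kron(out, factors.get(site, I2))
    return out

def build_H(n, a, b, c, x, alpha, gamma):
    """Dense matrix of H_{2+1}, Eq. (39); x[i, j] multiplies X_i Z_j."""
    H = np.zeros((2**n, 2**n), dtype=complex)
    for i, j in itertools.combinations(range(n), 2):
        H += a[i, j] * op(n, {i: X, j: X}) + b[i, j] * op(n, {i: Y, j: Y})
        H += c[i, j] * op(n, {i: Z, j: Z})
        H += x[i, j] * op(n, {i: X, j: Z}) + x[j, i] * op(n, {i: Z, j: X})
    for i in range(n):
        H += alpha[i] * op(n, {i: X}) + gamma[i] * op(n, {i: Z})
    return H.real  # all terms are real matrices, since Y (x) Y is real

def nu1_dense(H):
    """Brute force: positive off-diagonal entries, normalized by the dimension."""
    off = H - np.diag(np.diag(H))
    return np.maximum(off, 0.0).sum() / H.shape[0]

def nu1_lemma5(n, a, b, x, alpha):
    """Closed-form expression of Lemma 5."""
    total = 0.0
    for i, j in itertools.combinations(range(n), 2):  # Eq. (42)
        total += 0.5 * (max(a[i, j] + b[i, j], 0) + max(a[i, j] - b[i, j], 0))
    for i in range(n):  # Eq. (43)
        nbrs = [j for j in range(n) if j != i and x[i, j] != 0]
        acc = 0.0
        for lam in itertools.product([0, 1], repeat=len(nbrs)):
            field = alpha[i] + sum((-1) ** l * x[i, j] for l, j in zip(lam, nbrs))
            acc += max(field, 0.0)
        total += acc / 2 ** len(nbrs)
    return total

rng = np.random.default_rng(0)
n = 4
a, b, c, x = (rng.normal(size=(n, n)) for _ in range(4))
alpha, gamma = rng.normal(size=n), rng.normal(size=n)
H = build_H(n, a, b, c, x, alpha, gamma)
print(nu1_dense(H), nu1_lemma5(n, a, b, x, alpha))  # should coincide
```

The dense evaluation scales as $4^n$, whereas the closed form only enumerates the XZ neighbourhood of each site.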


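Proof. We can determine the $\ell_1$-norm of the off-diagonal part of the Hamiltonian $H_{2+1}$ as

$$\begin{aligned}
\left\| H_{2+1} - \mathrm{diag}(H_{2+1}) \right\|_{\ell_1}
&= \sum_{\lambda\in\{0,1\}^n} \Big\{ \sum_{i<j} \left| a_{i,j} - (-1)^{\lambda_i+\lambda_j} b_{i,j} \right| + \sum_i \Big| \alpha_i + \sum_{j\in \mathcal{N}_{XZ}(i)} (-1)^{\lambda_j} x_{i,j} \Big| \Big\} \\
&= 2^{n-1}\sum_{i<j} \left( |a_{i,j} + b_{i,j}| + |a_{i,j} - b_{i,j}| \right) + \sum_i 2^{n-\deg_{XZ}(i)} \sum_{\lambda_{\mathcal{N}_{XZ}(i)}} \Big| \alpha_i + \sum_{j\in \mathcal{N}_{XZ}(i)} (-1)^{\lambda_j} x_{i,j} \Big|.
\end{aligned} \tag{44}$$

From Eq. (44) we can then directly calculate the non-stoquasticity $\nu_1$ of $H_{2+1}$ by discarding all matrix entries with negative sign before taking the $\ell_1$-norm and dividing by $2^n$.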

Now, clearly we can efficiently compute the term (42) for arbitrary graphs, as the sum over pairs $i<j$ runs over at most $n(n-1)/2$ terms. In the term (43), in contrast, the sum over spin configurations $\lambda_{\mathcal{N}_{XZ}(i)}$ in the XZ neighbourhood of site $i$ runs over $2^{\deg_{XZ}(i)}$ terms, and hence this term is efficiently computable exactly if the vertex degree $\deg_{XZ}(i)$ of every vertex $i$ grows at most logarithmically with $n$. This shows that for bounded-degree graphs such as regular lattices the non-stoquasticity can be computed efficiently.

But what if the degree of the graph grows faster than logarithmically with $n$, so that the sum runs over super-polynomially many non-trivial terms? The following theorem shows that even in this case, that is, for 2-local Hamiltonians acting on arbitrary graphs, the non-stoquasticity can be efficiently approximated up to any inverse polynomially small additive error using Monte Carlo sampling.

Theorem 6. The sum (43) can be efficiently approximated up to additive error $\varepsilon$ via Monte Carlo sampling with failure probability $\delta$ from $16 \deg_{XZ}(i)\,(\max_j |x_{i,j}|)^2 \log(2/\delta)/\varepsilon^2$ i.i.d. samples.

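Proof. For the proof we will use concentration of measure for Lipschitz functions. To this end, we begin by noticing that the sum (43), including its prefactor $2^{-\deg_{XZ}(i)}$, can be rewritten as a uniform expectation value over $k_i = \deg_{XZ}(i)$ Rademacher ($\pm 1$) random variables as

$$2^{-k_i} \sum_{\lambda_{\mathcal{N}_{XZ}(i)}} \max\Big\{ \alpha_i + \sum_{j \in \mathcal{N}_{XZ}(i)} (-1)^{\lambda_j} x_{i,j},\, 0 \Big\} = \mathbb{E}_{\sigma\in\{\pm 1\}^{k_i}} \left[ f^{(i)}_{\alpha,x}(\sigma) \right], \tag{45}$$

where we have defined

$$f^{(i)}_{\alpha,x} : \mathbb{R}^{k_i} \to \mathbb{R}, \qquad s \mapsto \max\Big\{ \alpha_i + \sum_{j \in \mathcal{N}_{XZ}(i)} s_j x_{i,j},\, 0 \Big\}. \tag{46}$$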

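It can easily be seen that the function $f^{(i)}_{\alpha,x}$ is Lipschitz with constant $(\max_j |x_{i,j}|)\, k_i^{1/2}$:

$$\begin{aligned}
\left| f^{(i)}_{\alpha,x}(s) - f^{(i)}_{\alpha,x}(s') \right| &\le \Big| \sum_{j=1}^{\deg_{XZ}(i)} x_{i,j} (s_j - s'_j) \Big| && (47) \\
&\le \Big( \max_j |x_{i,j}| \Big) \| s - s' \|_{\ell_1} && (48) \\
&\le \Big( \max_j |x_{i,j}| \Big)\, k_i^{1/2}\, \| s - s' \|_{\ell_2}. && (49)
\end{aligned}$$

Here, we have used the fact that the $\ell_p$ norms on $\mathbb{R}^n$ satisfy $\|s\|_{\ell_1} \le n^{1/2} \|s\|_{\ell_2}$. Moreover, $f^{(i)}_{\alpha,x}$ is clearly separately convex, that is, for each $j = 1, 2, \ldots, k_i$ the function $s_j \mapsto f^{(i)}_{\alpha,x}(s_1, \ldots, s_{j-1}, s_j, s_{j+1}, \ldots, s_{k_i})$ is convex for each fixed $(s_1, \ldots, s_{j-1}, s_{j+1}, \ldots, s_{k_i}) \in \mathbb{R}^{k_i-1}$. We can then apply [66, Theorem 3.4] to obtain that the estimator

$$\hat{f}^{(i)}_{\alpha,x} = \frac{1}{m} \sum_{l=1}^{m} f^{(i)}_{\alpha,x}(\sigma^{(l)}), \tag{50}$$

for $m$ Rademacher vectors $\sigma^{(l)} \in \{\pm 1\}^{k_i}$ drawn i.i.d. uniformly at random satisfies

$$\mathbb{P}_\sigma\left[ \left| \hat{f}^{(i)}_{\alpha,x} - \mathbb{E}_\sigma[f^{(i)}_{\alpha,x}(\sigma)] \right| \ge \varepsilon \right] \le 2 \exp\left( - \frac{m \varepsilon^2}{16\, k_i (\max_j |x_{i,j}|)^2} \right). \tag{51}$$

This implies that with probability $1-\delta$ the estimator satisfies

$$\left| \hat{f}^{(i)}_{\alpha,x} - \mathbb{E}_\sigma[f^{(i)}_{\alpha,x}(\sigma)] \right| \le \varepsilon \tag{52}$$

whenever the number $m$ of independently drawn Rademacher vectors satisfies

$$m \ge \frac{16\, k_i (\max_j |x_{i,j}|)^2}{\varepsilon^2} \log(2/\delta). \tag{53}$$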

In total we thus obtain a polynomial worst-case complexity of computing the non-stoquasticity of a (2 + 1)-local Hamiltonian of the form (39) up to additive error $\varepsilon$ with failure probability $\delta$, as given by

$$\frac{n(n-1)}{2} + \frac{16 \sum_i \deg_{XZ}(i)\, (\max_{i,j} |x_{i,j}|)^2}{(\varepsilon/n)^2} \log\left(\frac{2}{\delta}\right). \tag{54}$$
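The sampling procedure of Theorem 6 can be sketched as follows. This is an illustrative implementation under the assumptions of the proof above (uniform Rademacher samples, sample count from Eq. (53)); the function names and the test values of $\alpha_i$, $x_{i,j}$, $\varepsilon$ and $\delta$ are arbitrary choices, not taken from the paper.

```python
import itertools
import math
import numpy as np

def f(alpha_i, x_i, s):
    """f(s) = max{alpha_i + sum_j s_j x_{i,j}, 0}, evaluated for each row of s."""
    return np.maximum(alpha_i + s @ x_i, 0.0)

def required_samples(x_i, eps, delta):
    """Number of samples prescribed by Eq. (53)."""
    k = len(x_i)
    return math.ceil(16 * k * np.max(np.abs(x_i)) ** 2 * math.log(2 / delta) / eps**2)

def estimate(alpha_i, x_i, eps, delta, rng):
    """Monte Carlo estimator of Eq. (50)."""
    m = required_samples(x_i, eps, delta)
    sigma = rng.choice([-1.0, 1.0], size=(m, len(x_i)))
    return f(alpha_i, x_i, sigma).mean()

def exact(alpha_i, x_i):
    """Exact expectation by enumerating all 2^k sign patterns (small k only)."""
    s = np.array(list(itertools.product([-1.0, 1.0], repeat=len(x_i))))
    return f(alpha_i, x_i, s).mean()

rng = np.random.default_rng(0)
x_i = rng.uniform(-1.0, 1.0, size=8)  # couplings x_{i,j} in the XZ neighbourhood
alpha_i = 0.3
eps, delta = 0.05, 0.01
approx = estimate(alpha_i, x_i, eps, delta, rng)
print(approx, exact(alpha_i, x_i))
```

The bound (53) is a worst-case guarantee, so in practice the estimator is typically much closer to the exact value than $\varepsilon$.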

III. EASING THE SIGN PROBLEM: AN ALGORITHMIC APPROACH

To demonstrate the feasibility of SignEasing and to put our findings more closely into the context of practical condensed matter problems, we numerically optimize the non-stoquasticity of certain nearest-neighbour Hamiltonians in quasi one-dimensional ladder geometries. Such systems are effectively described by translation-invariant Hamiltonians on $n$ $d$-dimensional quantum systems of the form (37) with nearest-neighbour interaction term $h \in \mathbb{R}^{d^2\times d^2}$. For the sake of simplicity, we specialize here to closed boundary conditions, identifying $n + 1 = 1$.

We then optimize the non-stoquasticity of $H$ over on-site orthogonal basis choices of the type

$$H \mapsto O^{\otimes n} H (O^T)^{\otimes n}. \tag{55}$$

On-site transformations are particularly simple to handle as they preserve both locality and translation-invariance of the Hamiltonian. Due to the translation-invariance of the problem, the global non-stoquasticity measure can be expressed locally in terms of the transformed term $O^{\otimes 2} h (O^T)^{\otimes 2}$, so that the problem has constant complexity in the system size. More precisely, for Hamiltonians of the form (37) we can express the non-stoquasticity measure $\nu_1(H) = n 2^{n-3} \nu_1(h)$ in terms of an effective local measure

$$\nu_1(h) = \sum_{ijk;lmn:\, j \neq m,\, k = n} \max\left\{ 0,\, (h \otimes \mathbb{1} + \mathbb{1} \otimes h)_{ijk;lmn} \right\}. \tag{56}$$
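To make the basis change (55) concrete, the following sketch evaluates $\nu_1$ for a two-qubit toy Hamiltonian under on-site transformations $O \otimes O$ with $O \in O(2)$ and scans a grid of angles. The grid scan is only a crude stand-in for the conjugate gradient method described in the text, and the toy Hamiltonian, function names and grid size are assumptions made for illustration.

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])

def nu1(H):
    """Sum of positive off-diagonal entries divided by the dimension."""
    off = H - np.diag(np.diag(H))
    return np.maximum(off, 0.0).sum() / H.shape[0]

def orthogonal(theta, a):
    """Element of O(2): a rotation for a = +1, a reflection for a = -1."""
    return np.array([[np.cos(theta), -a * np.sin(theta)],
                     [np.sin(theta), a * np.cos(theta)]])

def onsite_transform(H, O, n):
    """Return O^{(x) n} H (O^T)^{(x) n} as in Eq. (55)."""
    On = np.eye(1)
    for _ in range(n):
        On = np.kron(On, O)
    return On @ H @ On.T

# Toy example: X (x) X plus a positive transverse field, which is non-stoquastic.
H = np.kron(X, X) + 0.5 * (np.kron(X, I2) + np.kron(I2, X))

angles = np.linspace(0.0, np.pi, 65)[:-1]  # grid with step pi/64
candidates = [(theta, a) for a in (+1, -1) for theta in angles]
best = min(candidates, key=lambda p: nu1(onsite_transform(H, orthogonal(*p), 2)))
nu1_before = nu1(H)
nu1_after = nu1(onsite_transform(H, orthogonal(*best), 2))
print(nu1_before, nu1_after, best)
```

For this toy model an on-site Hadamard-type transformation maps every $X$ to $\pm Z$, so all terms become diagonal and the optimized value is zero up to rounding.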

Optimizing $\nu_1(H)$ for the global Hamiltonian is therefore equivalent to the much smaller problem of minimizing $\nu_1(h)$. While the non-stoquasticity measure $\nu_1$ can be efficiently evaluated, thus satisfying a necessary criterion for an efficient optimization algorithm, minimizing $\nu_1$ will in general still be a non-trivial task, an intuition we make rigorous below. This is because in optimizing the basis-dependent measure $\nu_1$ over quasi-local basis choices one is faced with a highly non-convex optimization problem of a high-order polynomial over the manifold of orthogonal matrices. Among the best developed multi-purpose methods for optimization over group manifolds such as the orthogonal group are conjugate gradient descent methods [45]. Compared to simple gradient-descent algorithms, conjugate gradient algorithms are better able to incorporate the underlying group structure, with the effect that they achieve much faster runtimes and better convergence properties.

To practically minimize the non-stoquasticity $\nu_1$ over the orthogonal group $O(d)$, we have implemented a conjugate gradient descent algorithm following Ref. [45]. Our implementation is publicly available [46] and detailed in Appendix A.

We first benchmark the algorithm on Hamiltonians which we know to admit an on-site stoquastic basis. Specifically, we apply the algorithm to recover an on-site stoquastic basis of the random translation-invariant Hamiltonian

$$H = \sum_{i=1}^{n} T_i\left( O^{\otimes 2} (h - h_\neg) (O^T)^{\otimes 2} \right) \tag{57}$$

on $n$ qudits, where the two-local term $h \in \mathbb{R}^{d^2\times d^2}$ is a Hamiltonian term with uniformly random spectrum expressed in a Haar-random basis and $O \in O(d)$ is a Haar-random on-site orthogonal matrix. In Fig. 1(a) we show the performance of the algorithm on randomly chosen instances of (57) for different values of the local dimension $d$. In all but very few instances our algorithm essentially recovers the stoquastic basis of the random Hamiltonian. By construction, the remaining failures can only be due to the fact that the algorithm gets stuck in local minima, indicating a potential limitation of first-order optimization techniques as a tool for easing the sign problem of general Hamiltonians.

Figure 5. Quasi one-dimensional models with a sign problem. Figure (a) shows the lattice of the $J_0$-$J_1$-$J_2$-$J_3$-Heisenberg model on a triangular quasi one-dimensional lattice as given in Eq. (58). Figure (b) shows the lattice structure of the frustrated Heisenberg model (59) with couplings $J_\perp$, $J_\parallel$ and $J_\times$ on a square-lattice ladder with cross coupling. In our simulations, we group sites to dimers as indicated in the figures and then optimize the measure $\nu_1(h)$ of the effective 2-local terms $h$ over on-site orthogonal transformations $O^{\otimes 2}$.
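A benchmark instance of the kind described by Eq. (57) could be generated along the following lines. This is a sketch under assumptions: the spectrum is drawn uniformly from $[-1, 1]$ (the text does not fix the range), Haar-random orthogonal matrices are obtained from a QR decomposition of a Gaussian matrix, and only the local term is constructed.

```python
import numpy as np

def haar_orthogonal(dim, rng):
    """Haar-random orthogonal matrix from the QR decomposition of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.normal(size=(dim, dim)))
    return q * np.sign(np.diag(r))  # fix the sign ambiguity of the decomposition

def positive_offdiag(h):
    """Matrix of positive off-diagonal entries (the non-stoquastic part)."""
    off = h - np.diag(np.diag(h))
    return np.maximum(off, 0.0)

rng = np.random.default_rng(1)
d = 3

# Two-local term with uniformly random spectrum in a Haar-random basis.
U = haar_orthogonal(d * d, rng)
spectrum = rng.uniform(-1.0, 1.0, size=d * d)  # assumed range
h = U @ np.diag(spectrum) @ U.T

h_stoq = h - positive_offdiag(h)   # h - h_neg is stoquastic by construction
O = haar_orthogonal(d, rng)        # hidden on-site basis change
O2 = np.kron(O, O)
h_hidden = O2 @ h_stoq @ O2.T      # local term of Eq. (57)
h_back = O2.T @ h_hidden @ O2      # undoing the rotation recovers a stoquastic term

print(positive_offdiag(h_hidden).sum(), positive_offdiag(h_back).sum())
```

An optimizer succeeds on such an instance if it finds some on-site $O'$ for which the transformed term has no positive off-diagonal entries; $O' = O^T$ is one such solution.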

We then study frustrated anti-ferromagnetic Heisenberg Hamiltonians, i.e., Hamiltonians with positively weighted interaction terms $\vec{S}_i \cdot \vec{S}_j$, on different ladder geometries. Here, $\vec{S}_i = (X_i, Y_i, Z_i)^T$ is the spin operator at site $i$. The sign problem of frustrated ladder systems can in many cases actually be removed by going to a dimer basis [11, 13, 14]. However, and this is important for our approach, in those cases the sign problem is not removed by finding a stoquastic local basis, but rather by exploiting specific properties of the Monte Carlo simulation at hand, for example, that no negative paths occur in the simulation [11], or specific properties of the Monte Carlo implementation [13, 14]. Therefore, frustrated Heisenberg ladders constitute the ideal playground to explore the methodology of easing the sign problem by (quasi-)local basis choices.
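As a worked illustration of why such models are non-stoquastic in the computational basis, consider a single anti-ferromagnetic bond with $J > 0$ and apply Eq. (42) with $a_{i,j} = b_{i,j} = J$. Conjugating one of the two sites with $Z$ flips the signs of the $XX$ and $YY$ terms and removes the non-stoquasticity of that bond. This is only a sketch for one isolated bond; on a frustrated geometry such as a triangle, no assignment of $Z$-flips makes all bonds stoquastic simultaneously.

```latex
\begin{align*}
\nu_1\bigl(J (X_iX_j + Y_iY_j + Z_iZ_j)\bigr)
  &= \tfrac{1}{2}\bigl(\max\{J + J, 0\} + \max\{J - J, 0\}\bigr) = J, \\
Z_i \, J (X_iX_j + Y_iY_j + Z_iZ_j) \, Z_i
  &= J (-X_iX_j - Y_iY_j + Z_iZ_j), \\
\nu_1\bigl(J (-X_iX_j - Y_iY_j + Z_iZ_j)\bigr)
  &= \tfrac{1}{2}\bigl(\max\{-2J, 0\} + \max\{0, 0\}\bigr) = 0 .
\end{align*}
```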

The first model we study is the $J_0$-$J_1$-$J_2$-$J_3$-model [11], whose Hamiltonian is given by (see Fig. 5(a))

$$H_{\vec{J}} = \sum_{i=1}^{n} \left( J_0\, \vec{S}^1_i \cdot \vec{S}^1_{i+1} + J_1\, \vec{S}^2_i \cdot \vec{S}^2_{i+1} + J_2\, \vec{S}^1_i \cdot \vec{S}^2_i + J_3\, \vec{S}^1_{i+1} \cdot \vec{S}^2_i \right), \tag{58}$$

where $\vec{S}^1_i$ denotes the spin operator at site $i$ on the lower rung and $\vec{S}^2_i$ on the upper rung of the ladder, respectively, and $J_i \ge 0$ for all $i$. Intriguingly, this Hamiltonian does not have a sign problem in the singlet-triplet dimer basis even though the Hamiltonian is not stoquastic in that basis. However, there exists a non-local stoquastic basis for values of $J_3 \ge J_0 + J_1$ [11]. We show the results of optimizing the non-stoquasticity of $H_{\vec{J}}$ with $J_0 = J_1 = J$ over a translation-invariant dimer basis (see Fig. 5(a)) in Fig. 1(b). We initialize our simulations in a Haar-random orthogonal on-site basis. Interestingly, we find an improvement of the non-stoquasticity under on-site orthogonal basis choices that does not seem to correlate with the region in which a non-local stoquastic basis was found in Ref. [11]. We view this as an indication that less local ansatz classes such as quasi-local circuits can further improve the non-stoquasticity for this model.

Figure 6. Improvement of the average sign concomitant with an improvement of the non-stoquasticity of the frustrated Heisenberg ladder (59). Figure (a) shows the optimized non-stoquasticity $\nu_1$ in terms of its relative improvement $\nu_1(OHO^T)/\nu_1(H)$ compared to the computational basis. (b) We expect the inverse average sign to scale exponentially in the non-stoquasticity. Therefore, we plot the ratio of the logarithm of the inverse average sign before optimization to that after optimization. We compute the average sign via exact diagonalization for a ladder of $2 \times 4$ sites, $m = 100$ Monte Carlo steps and inverse temperature $\beta = 1$. We also plot the logarithm of the inverse average sign (c) before and (d) after optimization of a local orthogonal basis. In all panels the axes are $J_\perp/J_\parallel$ and $J_\times/J_\parallel$.

We now apply the algorithm to the anti-ferromagnetic Heisenberg ladder studied in Refs. [13, 14]. The Hamiltonian of this system is given by (see Fig. 5(b))

$$H_{J_\parallel, J_\perp, J_\times} = \sum_{i=1}^{n} \Big( J_\parallel \left( \vec{S}^1_i \cdot \vec{S}^1_{i+1} + \vec{S}^2_i \cdot \vec{S}^2_{i+1} \right) + J_\perp\, \vec{S}^1_i \cdot \vec{S}^2_i + J_\times \left( \vec{S}^1_i \cdot \vec{S}^2_{i+1} + \vec{S}^1_{i+1} \cdot \vec{S}^2_i \right) \Big), \tag{59}$$

with interaction constants $J_\parallel, J_\perp, J_\times \ge 0$. For this geometry, the situation is somewhat more involved: Refs. [13, 14] find that a sign-problem free QMC procedure exists, albeit for a slightly different QMC procedure than the one we consider here, namely stochastic series expansion (SSE) Monte Carlo [37]. Similar to the world-line Monte Carlo method discussed here, SSE is based on an expansion of the exponential $\exp(-\beta H)$, but via a Taylor expansion as opposed to a product expansion. Specifically, for the partially frustrated case in which $J_\times \neq J_\parallel$ their solution of the sign problem exploits a specific sublattice structure of the Hamiltonian in combination with the SSE approach. We optimize the non-stoquasticity over dimer basis choices as shown in Fig. 5(b), starting from a random initial point that is close to the identity. Our results, shown in Fig. 1(c), qualitatively reflect the findings of Wessel et al. [14] for SSE in terms of stoquasticity, in that the non-stoquasticity can be significantly reduced for the fully frustrated case $J_\parallel = J_\times$, while it can be only slightly improved for the partially frustrated case.

At the same time, the algorithm does not recover a fully stoquastic basis for the frustrated ladder model $H_{J_\parallel, J_\perp, J_\times}$ as might be expected. There may be several reasons for this: either the nearly sign-problem free QMC procedure found in Refs. [13, 14] is in fact specific to SSE, in that no stoquastic dimer basis and hence no sign-problem free world-line Monte Carlo method exists in the orbit of orthogonal dimer bases, or the conjugate gradient algorithm gets stuck in local minima. In any case, the performance of our algorithm for both frustrated ladders demonstrates that the optimization landscape is generically an extremely rugged one, reflecting the computational hardness of the optimization problem in general.

We now turn to showing the improvement of the average sign concomitant with the improvement in non-stoquasticity in Fig. 6. We first observe that Figs. 6(a) and (b) are compatible with an exponential dependence of the inverse average sign on the non-stoquasticity, $\langle \mathrm{sign} \rangle^{-1} \propto \exp(c\, \nu_1(H))$, as conjectured above: in the regions in which a significant improvement of the non-stoquasticity could be achieved by local basis choices, the inverse average sign could also be strongly improved. Importantly, while the Hamiltonian could not be made entirely stoquastic, the improvement in the inverse average sign reaches an extent to which nearly no sign problem remains in those regions. This shows that even moderate improvements in non-stoquasticity can yield tremendous improvements of the average sign. At the same time, a severe sign problem remains, and actually becomes worse, in a small region of the parameter space (around $J_\perp/J_\parallel \gtrsim 3/4$ and $J_\times/J_\parallel \lesssim 1/2$) even though the non-stoquasticity could be reduced to some extent in that region. This may reflect open questions about the relation between average sign and non-stoquasticity that arose in our earlier discussion in Section II: while in generic cases the two notions of severeness of the sign problem are expected to be closely related, there is no general simple correspondence between them.
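For very small systems the average sign can be evaluated exactly. The sketch below assumes the world-line definition $\langle \mathrm{sign} \rangle = \mathrm{Tr}[T^m]/\mathrm{Tr}[|T|^m]$ with $T = e^{-\beta H/m}$ and $|T|$ the entrywise absolute value; this definition is an assumption inferred from the description of the method and may differ in detail from the one used for Fig. 6. The frustrated three-qubit example is likewise chosen only for illustration.

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])

def op(n, factors):
    """Tensor product with the given single-site matrices, identity elsewhere."""
    out = np.eye(1)
    for site in range(n):
        out = np.kron(out, factors.get(site, I2))
    return out

def average_sign(H, beta, m):
    """Tr[T^m] / Tr[|T|^m] for T = exp(-beta H / m), with |T| taken entrywise."""
    w, V = np.linalg.eigh(H)
    T = (V * np.exp(-beta * w / m)) @ V.T
    numerator = np.trace(np.linalg.matrix_power(T, m))
    denominator = np.trace(np.linalg.matrix_power(np.abs(T), m))
    return numerator / denominator

# Frustrated triangle of XX couplings on three qubits.
n = 3
H_frustrated = sum(op(n, {i: X, j: X}) for i, j in [(0, 1), (1, 2), (0, 2)])

sign_frustrated = average_sign(H_frustrated, beta=1.0, m=10)
sign_stoquastic = average_sign(-H_frustrated, beta=1.0, m=10)
print(sign_frustrated, sign_stoquastic)
```

The stoquastic Hamiltonian $-H$ has average sign one, while the frustrated triangle has an average sign strictly below one.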

Our findings demonstrate both the feasibility of minimizing the non-stoquasticity in order to ease the sign problem by optimizing over suitably chosen ansatz classes of unitary/orthogonal transformations and potential obstacles to a universal solution of the sign problem. In particular, for translation-invariant problems the complexity of the optimization problem, while it may well be computationally infeasible, only scales with the locality of the Hamiltonian, the local dimension and the depth of the circuit. We expect, however, that there exists no algorithm with polynomial runtime in all of these parameters that always solves the optimization problem.

Our findings also indicate that more general ansatz classes yield the potential to further improve non-stoquasticity. Different classes of orthogonal transformations can be straightforwardly incorporated in our algorithmic approach. A detailed study of different ansatz classes and their potential for easing the sign problem is, however, beyond the scope of this work. It is the subject of ongoing and future efforts to study the optimal basis choice in terms of the non-stoquasticity for both deeper circuits and further models as well as the connection between the average sign and the non-stoquasticity.

IV. EASING THE SIGN PROBLEM: COMPUTATIONAL COMPLEXITY

Let us now focus on a more fundamental question, namely, how far an approach that optimizes the non-stoquasticity can carry in principle. We have explored the potential of easing using state-of-the-art optimization algorithms; let us now turn to its boundaries, a glimpse of which we have already witnessed in the shape of a rugged optimization landscape. The method of choice for this task is the theory of computational complexity.

We analyze the computational complexity of easing the sign problem under particularly simple basis choices, namely, real on-site Clifford and orthogonal transformations. In both cases, we show that easing the sign problem is an NP-complete task even in cases in which deciding whether the sign problem can be cured is efficiently solvable [21], namely for XYZ Hamiltonians as given by

$$H_{XYZ} = \sum_{i<j} \left( a_{i,j} X_i X_j + b_{i,j} Y_i Y_j + c_{i,j} Z_i Z_j \right). \tag{60}$$

Like Refs. [18–21], we allow for arbitrary interaction graphs.

A central ingredient in proving Theorem 2 is an expression for the non-stoquasticity measure $\nu_1$ of strictly 2-local Hamiltonians of the form

$$H_2 = \sum_{i<j} \left( a_{i,j} X_i X_j + c_{i,j} Z_i Z_j + x_{i,j} X_i Z_j + x_{j,i} Z_i X_j \right). \tag{61}$$

It is sufficient to restrict to Hamiltonians of the form (61) because the orbit of XYZ Hamiltonians under on-site orthogonal (Clifford) transformations does not reach $YY$ terms.

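It is a direct consequence of Lemma 5 that

$$\nu_1(H_2) = \sum_{i<j} \nu_1(a_{i,j} X_i X_j) + \sum_i \nu_1\Big( \sum_{j \in \mathcal{N}_{XZ}(i)} x_{i,j} X_i Z_j \Big), \tag{62}$$

where the XZ neighbourhood $\mathcal{N}_{XZ}(i)$ of a vertex $i$ and related notions are defined in Sec. II D. More specifically, following Eqs. (42) and (43) we find that

$$\nu_1(a_{i,j} X_i X_j) = \max\{a_{i,j}, 0\}, \tag{63}$$

$$\nu_1\Big( \sum_{j \in \mathcal{N}_{XZ}(i)} x_{i,j} X_i Z_j \Big) = 2^{-\deg_{XZ}(i)} \sum_{\lambda_{\mathcal{N}_{XZ}(i)}} \max\Big\{ \sum_{j \in \mathcal{N}_{XZ}(i)} (-1)^{\lambda_j} x_{i,j},\, 0 \Big\}. \tag{64}$$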


Since for the proof of hardness we need analytical expressions of the non-stoquasticity, we cannot resort to the sampling algorithm proposed in Sec. II D to evaluate the non-stoquasticity of XZ terms. We analytically bound the contribution of a vertex with non-trivial XZ neighbourhood with the following lemma.

Lemma 7 (XZ non-stoquasticity). The following bound is true for $k \in \mathbb{N}$:

$$\sum_{\lambda\in\{0,1\}^k} \max\Big\{ \sum_{j=1}^{k} (-1)^{\lambda_j} x_j,\, 0 \Big\} \ge \max_j |x_j| \cdot 2^{k-1}. \tag{65}$$

Proof. Let us assume without loss of generality that $x_1 \ge x_2 \ge \ldots \ge x_k \ge 0$, that is, all terms are non-negative and ordered non-increasingly. This does not restrict generality as all possible combinations of signs appear in the sum (65). We prove the claim by induction. For $k = 1$, the statement is true by immediate inspection. For the induction step, we use the following inequality for $a, b \in \mathbb{R}$,

$$\max\{a+b, 0\} + \max\{a-b, 0\} \ge 2 \max\{a, 0\}, \tag{66}$$

which can be easily verified by checking the three cases $a \ge |b|$, $a \le -|b|$ and $-|b| < a < |b|$. We then calculate

$$\begin{aligned}
\sum_{\lambda\in\{0,1\}^k} \max\Big\{ \sum_{j=1}^{k} (-1)^{\lambda_j} x_j,\, 0 \Big\}
&= \sum_{\lambda_1,\ldots,\lambda_{k-1}\in\{0,1\}} \max\Big\{ x_k + \sum_{j=1}^{k-1} (-1)^{\lambda_j} x_j,\, 0 \Big\} && (67) \\
&\quad + \sum_{\lambda_1,\ldots,\lambda_{k-1}\in\{0,1\}} \max\Big\{ -x_k + \sum_{j=1}^{k-1} (-1)^{\lambda_j} x_j,\, 0 \Big\} && (68) \\
&\ge 2 \sum_{\lambda'\in\{0,1\}^{k-1}} \max\Big\{ \sum_{j=1}^{k-1} (-1)^{\lambda'_j} x_j,\, 0 \Big\} && (69) \\
&\overset{\text{I.H.}}{\ge} 2 \cdot 2^{k-2} |x_1| = 2^{k-1} |x_1|, && (70)
\end{aligned}$$

where we have used (66) in the second to last and the induction hypothesis in the last step. This proves the claim.
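The bound of Lemma 7 is easy to test by brute force for small $k$. The following sketch enumerates all sign patterns for random coefficient vectors; the function names and the range of $k$ are arbitrary choices for illustration.

```python
import itertools
import numpy as np

def lhs(x):
    """Left-hand side of Eq. (65): sum over all sign patterns of the positive part."""
    return sum(
        max(sum(s * xj for s, xj in zip(signs, x)), 0.0)
        for signs in itertools.product([1, -1], repeat=len(x))
    )

def rhs(x):
    """Right-hand side of Eq. (65)."""
    return max(abs(xj) for xj in x) * 2 ** (len(x) - 1)

rng = np.random.default_rng(0)
violations = 0
for k in range(1, 9):
    for _ in range(50):
        x = rng.normal(size=k)
        if lhs(x) < rhs(x) - 1e-12:
            violations += 1
print("violations:", violations)
```

For $k = 1$ and for $x = (1, 1)$ the bound holds with equality, which shows that the factor $2^{k-1}$ cannot be improved in general.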

In the proof of Theorem 2 we will use that Lemma 5 implies that every term $a_{i,j} X_i X_j$ contributes an additional cost $\max\{a_{i,j}, 0\}$ to the non-stoquasticity of $H_2$. Moreover, since $\max_{j\in[k]} |x_j| \ge \sum_{j=1}^{k} |x_j|/k$, Lemmas 5 and 7 imply that we can view every term $x_{i,j} X_i Z_j$ of $H_2$ as contributing at least a cost $|x_{i,j}|/(2 \deg(G))$ to the non-stoquasticity of $H_2$, where $G$ is the interaction graph of $H_2$.

We are now ready to show that, with respect to the non-stoquasticity measure $\nu_1$, easing the sign problem for 2-local XYZ Hamiltonians with on-site Cliffords is NP-complete on arbitrary graphs. We restate Theorem 2i here.

Theorem 8 (SignEasing under orthogonal Clifford transformations). SignEasing is NP-complete for 2-local Hamiltonians on an arbitrary graph, in particular for XYZ Hamiltonians, under on-site orthogonal Clifford transformations, that is, the real group generated by $\{X, Z, W\}$ with $W$ the Hadamard matrix.

Proof. Clearly the problem is in NP, since one can simply receive a (polynomial-size) description of the transformation in the Yes-case, and then calculate the measure of non-stoquasticity efficiently for XYZ Hamiltonians, verifying the solution.

To prove NP-hardness, we encode the MAXCUT problem in the SignEasing problem. A MAXCUT instance can be phrased in terms of asking whether an anti-ferromagnetic (AF) Ising model on a graph $G = (V,E)$ with $e = |E|$ edges on $v = |V|$ spins,

$$H = \sum_{(i,j)\in E} Z_i Z_j, \tag{71}$$

has ground-state energy $\lambda_{\min}(H)$ below $A$ or above $B$ with constants $B - A \ge 1/\mathrm{poly}(v)$. This is because in the Ising model one gets energy $-1$ for a $(0,1)$ or $(1,0)$ edge and $+1$ for a $(0,0)$ or $(1,1)$ edge.

Let us now encode the MAXCUT problem phrased in terms of the AF Ising model problem into SignEasing for the XYZ Hamiltonian. We will design a Hamiltonian $H'$, and ask if on-site orthogonal Clifford transformations can decrease its measure of non-stoquasticity $\nu_1$ below $A$, or whether it remains above $B$ for any Clifford basis choice.

For each AF edge between spins $i, j$ in the AF Ising model, the new Hamiltonian $H'$ will have an edge

$$h_{i,j} = X_i X_j. \tag{72}$$

On top of that, for every edge $(i,j) \in E$ we add one ancilla qubit $\xi_{i,j}$ as shown in Figure 3, and interactions

$$h^{(\xi)}_{i,j} = C \left( Z_i Z_j - Z_i Z_{\xi_{i,j}} - Z_{\xi_{i,j}} Z_j \right), \tag{73}$$

where $C = 4 \deg(G)$. Note that the additional terms are diagonal and hence stoquastic. The total new Hamiltonian then reads

$$H' = \sum_{(i,j)\in E} \left[ X_i X_j + C \left( Z_i Z_j - Z_i Z_{\xi_{i,j}} - Z_{\xi_{i,j}} Z_j \right) \right], \tag{74}$$

and acts on $n = v + e$ qubits. We construct $H'$ so that an attempt to decrease the non-stoquasticity $\nu_1$ by swapping $Z$ and $X$ operators via Hadamard transformations will fail, and so the best one can do is to choose a sign in front of each local $X$ operator. Of course, this then becomes the original, hard, MAXCUT problem in disguise. Let us prove this.

Let $\mathcal{N}'(i) = \{j : (i,j) \in E'\} = \mathcal{N}(i) \cup \{\xi_{i,j} : (i,j) \in E\}$ be the neighbourhood of site $i$ on the augmented graph $G' = (V', E')$ on which $H'$ lives. We start the proof by observing that the degree $\deg'(i) = |\mathcal{N}'(i)|$ of a site $i$ on the augmented graph satisfies $\deg'(i) = 2 \deg(i)$, and hence $\deg'(G) = 2 \deg(G)$.
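The gadget construction (74) can be explored numerically on a tiny graph. The sketch below builds $H'$ for a triangle, applies $Z$-flips on the original vertices only, and checks that the resulting $\nu_1$ counts the uncut edges of the corresponding spin configuration. It assumes the normalization $\nu_1(H) = 2^{-n}\|H_\neg\|_{\ell_1}$; the helper names are hypothetical.

```python
import itertools
import numpy as np

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])

def op(n, factors):
    """Tensor product with the given single-site matrices, identity elsewhere."""
    out = np.eye(1)
    for site in range(n):
        out = np.kron(out, factors.get(site, I2))
    return out

def nu1(H):
    off = H - np.diag(np.diag(H))
    return np.maximum(off, 0.0).sum() / H.shape[0]

def gadget_hamiltonian(v, edges):
    """H' of Eq. (74): one ancilla qubit per edge and C = 4 deg(G)."""
    deg = max(sum(1 for e in edges if u in e) for u in range(v))
    C = 4 * deg
    n = v + len(edges)
    H = np.zeros((2**n, 2**n))
    for idx, (i, j) in enumerate(edges):
        anc = v + idx  # index of the ancilla attached to edge (i, j)
        H += op(n, {i: X, j: X})
        H += C * (op(n, {i: Z, j: Z}) - op(n, {i: Z, anc: Z}) - op(n, {anc: Z, j: Z}))
    return H, n

v, edges = 3, [(0, 1), (1, 2), (0, 2)]  # triangle graph
H_prime, n = gadget_hamiltonian(v, edges)

costs = {}
for s in itertools.product([0, 1], repeat=v):
    flip = op(n, {i: Z for i, si in enumerate(s) if si})  # Z^{s_i} on each vertex
    costs[s] = nu1(flip @ H_prime @ flip)
    uncut = sum(1 for i, j in edges if s[i] == s[j])
    assert costs[s] == uncut  # each uncut edge keeps a non-stoquastic +XX term
print(min(costs.values()))
```

For the triangle at most two of the three edges can be cut, so one non-stoquastic $XX$ term always remains under $Z$-flips.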


a. Orthogonal Clifford transformations. First, let us note that any element of the orthogonal single-qubit Clifford group can be written as

$$C = \pm W^w X^x Z^z, \tag{75}$$

where we denote the Hadamard matrix with $W$ and $w, x, z \in \{0,1\}$. Since the global sign is irrelevant, a real $n$-qubit Clifford of the form $C = C_1 \otimes \cdots \otimes C_n$ is parametrized by binary vectors $\vec{w}, \vec{x}, \vec{z} \in \{0,1\}^n$.

How does $H'$ transform under real single-qubit Clifford transformations? By definition $C Z C^\dagger \in \{\pm Z, \pm X\}$ and likewise for $X$. Therefore, the transformed Hamiltonian will be of the form (61). Throughout the proof, we will use that every term $a_{i,j} X_i X_j$ contributes at least $\max\{a_{i,j}, 0\}$ to the non-stoquasticity, while every term $x_{i,j} X_i Z_j$ contributes at least $|x_{i,j}|/(2 \deg(G')) = |x_{i,j}|/(4 \deg(G))$, as shown by Lemmas 5 and 7 above.

We now show that MAXCUT can be embedded into SignEasing under on-site orthogonal Clifford transformations. To do so, we need to show two things: first, that in the yes-case that $\lambda_{\min}(H) \le A$, the non-stoquasticity of $H'$ can be brought below $A$ using on-site orthogonal Clifford transformations. Second, we show that in the no-case that $\lambda_{\min}(H) \ge B$, the non-stoquasticity of $H'$ cannot be brought below $B$ using on-site orthogonal Clifford transformations.

b. Yes-case: (Diagonal) transformations that map $X$ to $\pm X$ ($\vec{w} = 0$). These transformations only change the sign in front of $X_i$, keeping its form. At the same time they only change the signs of the $Z_iZ_j$ terms, keeping them diagonal and hence stoquastic. The transformed $X_iX_j$ terms (72) will be stoquastic if and only if exactly one of the transformations of the $X$ at sites $i, j$ is a $Z$-flip.

We can view the coefficient $z_i$ as a spin $s_i$ in the original AF Ising model: for $z_i = 1$, $X_i \to -X_i$, corresponding to a spin $s_i = 1$ in the original AF Ising model, while for $z_i = 0$, $X_i \to X_i$, which we view as the Ising spin $s_i = 0$.

[Diagram: under $Z^{s_i}$ and $Z^{s_j}$ on the two vertices, the edge term $XX$ of the gadget triangle becomes $(-1)^{s_i+s_j} XX$, while the couplings $ZZ$, $-ZZ$, $-ZZ$ are unchanged.]

Each such Clifford transformation thus corresponds to a particular state of the original AF Ising model as given by a spin configuration $\vec{s} \in \{0,1\}^v$. Whenever the transformations on neighbouring sites result in a stoquastic interaction $-X_iX_j$ in the transformed XYZ Hamiltonian, we have a $(0,1)$ or $(1,0)$ anti-ferromagnetic Ising edge with cost 0. On the other hand, each non-stoquastic $X_iX_j$ term in the XYZ Hamiltonian has cost 1, while the corresponding edge in the Ising model is $(0,0)$ or $(1,1)$, also with cost 1.

What is the amount of sign easing we can hope to achieve? We have argued above that only diagonal transformations which map $X_i \mapsto \pm X_i$ potentially ease the sign problem, since we designed the interactions so that a Hadamard transformation always incurs a larger cost than keeping an $X_iX_j$ term non-stoquastic. For those transformations, we have a one-to-one correspondence with the ground state of the original AF Ising model. Hence, the original AF Ising model ground state energy $\lambda_{\min}(H)$ is also the optimal number of non-stoquastic terms $X_iX_j$ which one can achieve via on-site orthogonal Clifford transformations, each adding an additional cost 1 to the non-stoquasticity measure $\nu_1$.

In the yes-case we can therefore achieve non-stoquasticity

$$\nu_1^{(\mathrm{yes})} \le A, \tag{76}$$

by choosing $\vec{x}, \vec{w} = 0$ and $(z_1, \ldots, z_v)^T = \vec{s}_0$, the ground state of $H$.

We now show that in the no-case, the non-stoquasticity measure will be at least

$$\nu_1^{(\mathrm{no})} \ge B. \tag{77}$$

c. No-case: (Hadamard) transformations that map $X$ to $\pm Z$ ($\vec{w} \neq 0$). We have designed the additional Hamiltonian term (73) so that such transformations result in large non-stoquasticity. Specifically, we show that for any choice of $\vec{z}$, choosing $\vec{x} = \vec{w} = 0$ achieves the optimal non-stoquasticity in the orbit of orthogonal Clifford transformations.

It is sufficient to show that any Clifford transformation on an edge $(i,j)$ (and its ancilla qubit $\xi_{i,j}$) that is non-stoquastic for a given choice of $\vec{z}$ can only increase the non-stoquasticity.

To begin with, note that choosing $x_i = 1$ results in $Z_i \mapsto -Z_i$, $X_i \mapsto X_i$, and choosing $w_i = 1$ maps $X_i \mapsto Z_i$ and $Z_i \mapsto X_i$. We obtain the following transformation rules of Pauli $Z_i$ and $X_i$, given choices of $x_i$ and $w_i$:

x_i   w_i   Z_i ->   X_i ->
0     0     Z_i      X_i
0     1     X_i      Z_i
1     0     -Z_i     X_i
1     1     -X_i     Z_i
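The transformation rules in the table can be verified directly with $2\times 2$ matrices. The sketch below assumes the parametrization $C = W^w X^x$ from Eq. (75) with $z = 0$ and conjugation $P \mapsto C P C^T$; the dictionary `expected` simply transcribes the table.

```python
import numpy as np

X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])
W = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)  # Hadamard matrix

def clifford(x, w):
    """C = W^w X^x, Eq. (75) with z = 0 and global sign +1."""
    return np.linalg.matrix_power(W, w) @ np.linalg.matrix_power(X, x)

def conjugate(C, P):
    """C P C^T for a real orthogonal C."""
    return C @ P @ C.T

# (x_i, w_i) -> (image of Z_i, image of X_i), as listed in the table.
expected = {
    (0, 0): (Z, X),
    (0, 1): (X, Z),
    (1, 0): (-Z, X),
    (1, 1): (-X, Z),
}

table_ok = all(
    np.allclose(conjugate(clifford(x, w), Z), z_img)
    and np.allclose(conjugate(clifford(x, w), X), x_img)
    for (x, w), (z_img, x_img) in expected.items()
)
print("table verified:", table_ok)
```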

First, suppose that for an edge $(i,j)$, a Hadamard transformation is performed on qubit $i$, but not $j$, so that we have $w_i = 1$, $w_j = 0$. Then for some choice of $x_i, x_j$ the transformed edge is given by

$$W_i \left( X_i X_j \pm C Z_i Z_j \right) W_i = Z_i X_j \pm C X_i Z_j, \tag{78}$$

[Diagram: a Hadamard $W$ on vertex $i$ maps the gadget terms $XX$, $ZZ$, $-ZZ$, $-ZZ$ to $ZX$, $XZ$, $-XZ$, $-ZZ$.]

and has non-stoquasticity at least $(C + 1)/(2 \deg G')$.


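Now, suppose that a Hadamard transformation is performed on both qubit $i$ and $j$ but not its ancilla qubit. Then the transformed term is given by

$$W_i W_j X_i^{x_i} X_j^{x_j} X_{\xi_{i,j}}^{x_{\xi_{i,j}}} \left( h_{i,j} + h^{(\xi)}_{i,j} \right) X_i^{x_i} X_j^{x_j} X_{\xi_{i,j}}^{x_{\xi_{i,j}}} W_i W_j = \pm Z_i Z_j + C \left( \pm X_i X_j \pm X_i Z_{\xi_{i,j}} \pm Z_{\xi_{i,j}} X_j \right), \tag{79}$$

[Diagram: Hadamards on both vertices map the gadget terms $XX$, $ZZ$, $-ZZ$, $-ZZ$ to $\pm ZZ$, $\pm XX$, $\pm XZ$, $\pm ZX$.]

with non-stoquasticity cost at least

$$\nu_1\left( C \left( \pm X_i Z_{\xi_{i,j}} \pm Z_{\xi_{i,j}} X_j \right) \right) = 2C/(2 \deg G'). \tag{80}$$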

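Could the edge possibly be cured by performing a Hadamard transformation on the ancilla qubit as well? In this case, we get

$$W_i W_j W_{\xi_{i,j}} \left( h_{i,j} + h^{(\xi)}_{i,j} \right) W_i W_j W_{\xi_{i,j}} = Z_i Z_j + C \left( X_i X_j - X_i X_{\xi_{i,j}} - X_{\xi_{i,j}} X_j \right), \tag{81}$$

[Diagram: Hadamards on both vertices and the ancilla map the gadget terms $XX$, $ZZ$, $-ZZ$, $-ZZ$ to $ZZ$, $XX$, $-XX$, $-XX$.]

with non-stoquasticity

$$\nu_1\left( W_i W_j W_{\xi_{i,j}} \left( h_{i,j} + h^{(\xi)}_{i,j} \right) W_i W_j W_{\xi_{i,j}} \right) = C. \tag{82}$$

Because of the frustrated arrangement of the signs of the $ZZ$ terms, no local sign flip of those terms (achieved by choices of $x_i, x_j, x_{\xi_{i,j}} \neq 0$) can cure the sign problem of an ancillary triangle, leaving it lower bounded by (82).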

On the other hand, the original cost incurred from local sign flips via $Z$-transformations is given by

$$\nu_1(X_i X_j) = 1, \tag{83}$$

which is always smaller than the cost incurred if additional $X$ or $W$ transformations are applied, since we chose $C$ such that $C/(2 \deg G') = 1$. Therefore, in the no-case of the original AF Ising model the non-stoquasticity of $H'$ cannot be brought below

$$\nu_1^{(\mathrm{no})} \ge B, \tag{84}$$

with the optimal choice achieved for $\vec{x}, \vec{w} = 0$ and $(z_1, \ldots, z_v)^T = \vec{s}_0$.

Theorem 9 (SignEasing under orthogonal transformations). SignEasing is NP-complete for 2-local Hamiltonians on an arbitrary graph, in particular, for XYZ Hamiltonians under on-site orthogonal transformations.

Proof. We proceed analogously to the proof for orthogonal Clifford transformations, showing that in the yes-case, there exists a product orthogonal transformation $O = O_1 \cdots O_n$ such that $\nu_1(O H' O^T) \le A$, while in the no-case there exists no such transformation $P$ with $\nu_1(P H' P^T) \le B$.

The yes-case is clear: In this case, the energy of $\vec{s}$ is below $A$. Then by our construction, the sign problem can be eased below $A$ with the orthogonal transformation

$$O = \prod_{i\in V} Z_i^{s_i}. \tag{85}$$

We now need to show that in the no-case, any orthogonal transformation incurs non-stoquasticity above $B$. We first remark that the orthogonal group $O(2)$ decomposes into two sectors with determinant $\pm 1$, respectively. Therefore, any $2 \times 2$ orthogonal matrix can be written as

$$O_a(\theta) = \begin{pmatrix} \cos\theta & -a \sin\theta \\ \sin\theta & a \cos\theta \end{pmatrix}, \tag{86}$$

which for $a = \det(O_a(\theta)) = -1$ is a reflection and for $a = \det(O_a(\theta)) = +1$ a rotation by an angle $\theta$. Note that the following composition laws hold:

$$\begin{aligned}
O_{-1}(\theta)\, O_{1}(\phi) &= O_{-1}(\theta - \phi), && (87) \\
O_{1}(\theta)\, O_{-1}(\phi) &= O_{-1}(\theta + \phi), && (88) \\
O_{1}(\theta)\, O_{1}(\phi) &= O_{1}(\theta + \phi), && (89) \\
O_{-1}(\theta)\, O_{-1}(\phi) &= O_{1}(\theta - \phi). && (90)
\end{aligned}$$

Now observe three facts: First, any reflection by an angle θcan be written as a product of a reflection across the X-axisand a rotation as(

cos θ sin θsin θ − cos θ

)=

(cos θ − sin θsin θ cos θ

)Z = R(θ)Z, (91)

where R(θ) is the rotation by an angle θ. Second, for any Hermitian matrix H and any angle θ, it holds that

$$ O(\theta) H O(\theta)^T = O(\theta + \pi) H O(\theta + \pi)^T, \tag{92} $$

so that it suffices to restrict to angles θ ∈ [−π/4, 3π/4] in an interval of length π. Third, a rotation by an angle π/2 can be decomposed into two reflections as

$$ R(\pi/2) = XZ. \tag{93} $$

Taken together, these facts imply that an arbitrary single-qubit orthogonal transformation is given by

$$ O(\theta, z, p) = R\big(\theta + (\pi/2) p\big) \cdot Z^{z} = R(\theta) \cdot X^{p} Z^{p+z}, \tag{94} $$

where the rotation angle is given by θ ∈ [−π/4, π/4], z ∈ {0, 1} fixes whether or not a Z-flip is applied, and p ∈ {0, 1} mods out a rotation by an angle π/2. Now define $O(\vec\theta, \vec z, \vec p) = \prod_i O_i(\theta_i, z_i, p_i)$.

We now need to show that, in the no-case, for any choice of $\vec\theta \in [-\pi/4, \pi/4]^n$ and $\vec z, \vec p \in \{0, 1\}^n$ it holds that

$$ \nu_1\big( O(\vec\theta, \vec z, \vec p)\, H'\, O(\vec\theta, \vec z, \vec p)^T \big) \geq B. \tag{95} $$
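The reduction to the standard form (94) can be traced step by step in code. The sketch below is an illustration under our own naming (`standard_form`, `build` and the other helpers are hypothetical): it maps an arbitrary element O_a(φ) of O(2) to a triple (θ, z, p) with θ in [−π/4, π/4] and lets one check that conjugating a symmetric matrix by either form gives the same result, which is all that matters for the non-stoquasticity.

```python
import math

X = [[0.0, 1.0], [1.0, 0.0]]
Z = [[1.0, 0.0], [0.0, -1.0]]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(p):
    return [list(row) for row in zip(*p)]

def o2(a, theta):
    """O_a(theta) as in Eq. (86)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -a * s], [s, a * c]]

def conj(o, h):
    """Conjugation h -> o h o^T."""
    return matmul(matmul(o, h), transpose(o))

def standard_form(a, phi):
    """Return (theta, z, p) such that O_a(phi) = +/- R(theta) X^p Z^(p+z)."""
    z = 0 if a == 1 else 1                              # Eq. (91): reflection = R(phi) Z
    phi = (phi + math.pi / 4) % math.pi - math.pi / 4   # Eq. (92): angles modulo pi
    p = 1 if phi >= math.pi / 4 else 0                  # Eq. (93): split off R(pi/2) = XZ
    return phi - p * math.pi / 2, z, p

def build(theta, z, p):
    """R(theta) X^p Z^(p+z), the right-hand side of Eq. (94)."""
    out = o2(1, theta)
    if p:
        out = matmul(out, X)
    if (p + z) % 2:
        out = matmul(out, Z)
    return out

def close(p, q, tol=1e-9):
    return all(abs(p[i][j] - q[i][j]) < tol for i in range(2) for j in range(2))
```

The overall sign that is dropped in the second step is irrelevant because it cancels under conjugation.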

To complete the proof, we use that the action of R(θ) on Pauli-X and Z matrices is given by

\begin{align}
R(\theta) Z R(\theta)^T &= \cos(2\theta) Z + \sin(2\theta) X, \tag{96} \\
R(\theta) X R(\theta)^T &= \cos(2\theta) X - \sin(2\theta) Z. \tag{97}
\end{align}
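The transformation rules (96) and (97) follow from multiplying out 2 × 2 matrices; a short numerical sketch (our own helper names, standard library only) makes the doubling of the angle explicit.

```python
import math

X = [[0.0, 1.0], [1.0, 0.0]]
Z = [[1.0, 0.0], [0.0, -1.0]]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def rotate(theta, pauli):
    """Return R(theta) P R(theta)^T for a 2x2 matrix P."""
    c, s = math.cos(theta), math.sin(theta)
    r = [[c, -s], [s, c]]
    r_t = [[c, s], [-s, c]]
    return matmul(matmul(r, pauli), r_t)

def combination(cx, cz):
    """The matrix cx * X + cz * Z."""
    return [[cx * X[i][j] + cz * Z[i][j] for j in range(2)] for i in range(2)]

def close(p, q, tol=1e-12):
    return all(abs(p[i][j] - q[i][j]) < tol for i in range(2) for j in range(2))

def rules_hold(theta):
    """Check Eqs. (96) and (97) at a single angle."""
    c2, s2 = math.cos(2 * theta), math.sin(2 * theta)
    return (close(rotate(theta, Z), combination(s2, c2))        # (96)
            and close(rotate(theta, X), combination(c2, -s2)))  # (97)
```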

In Appendix B, we show that given any choice of $\vec z$ and $\vec p$, no choice of $\vec\theta$ can decrease the non-stoquasticity of an edge with non-zero non-stoquasticity below 1. That is, we analyze transformations of the following form:

[Diagram: a three-qubit gadget with coupling labels XX, ZZ, −ZZ, −ZZ and on-site rotations R(θ_i), R(θ_j), R(θ_{a_ij}).]

We do this by using the standard form (94) of the local transformations in terms of X, Z and restricted rotation matrices with angles in [−π/4, π/4]. We split the proof into three parts, analyzing three different (continuous) regions for rotation angle choices. The only difference in the construction when compared to the Clifford case is that here we choose C = (2 deg G′)^2. This completes the proof.

Notice that since MAXCUT is NP-hard already on subgraphs of the double-layered square lattice [48], which has constant degree, there are hard instances of the SignEasing problem on low-dimensional lattices with constant coupling strength.

We remark that one can easily extend the proofs of Theorems 8 and 9 for different non-stoquasticity measures ν_p with 1 < p < ∞. To see this, note that the decision problem for ν_p(H) is equivalent to the problem for ν_p(H)^p.

For terms $x_{i,j} X_i Z_j$ we can then use the trivial bound

$$ \sum_{\lambda \in \{0,1\}^k} \max\Big\{ \sum_{j=1}^{k} (-1)^{\lambda_j} x_{i,j},\, 0 \Big\}^{p} \geq 2^{p(k - \deg(i))} \sum_j |x_{i,j}|^p \tag{98} $$

instead of Lemma 7. Thus, every term $x_{i,j} X_i Z_j$ contributes at least $2^{-p \deg G'} |x_{i,j}|^p$ to $\nu_p^p$, while a term $a_{i,j} X_i X_j$ contributes

$$ \nu_p(a_{i,j} X_i X_j)^p = (\max\{a_{i,j}, 0\})^p \tag{99} $$

to the total non-stoquasticity of H′. For general ℓ_p-non-stoquasticity measures ν_p one therefore need merely choose $C = 2^{\deg(G')} = 4^{\deg(G)}$ to prove Theorem 8 and $C = 2^{2\deg(G')} = 16^{\deg(G)}$ for Theorem 9.

ACKNOWLEDGEMENTS

We are immensely grateful for the many fruitful discussions which have helped shape this work – with Albert Werner, Martin Schwarz, Juani Bermejo-Vega, and Christian Krumnow in early stages of the project; more recently with Matthias Troyer, Joel Klassen, Marios Ioannou, Maria Laura Baez, Hakop Pashayan, Simon Trebst, Augustine Kshetrimayum, Alexander Studt and Alex Nietner. We also thank Barbara Terhal, Marios Ioannou, Jarrod McClean and Maria Laura Baez for helpful comments on our draft. D. H., I. R. and J. E. acknowledge financial support from the ERC (TAQ), the Templeton Foundation, the DFG (EI 519/14-1, EI 519/9-1, EI 519/7-1, CRC 183 in project B01), and the European Union’s Horizon 2020 research and innovation programme under grant agreement No 817482 (PASQuanS). D. N. has received funding from the People Programme (Marie Curie Actions) EU’s 7th Framework Programme under REA grant agreement No. 609427. His research has been further co-funded by the Slovak Academy of Sciences, as well as by the Slovak Research and Development Agency grant QETWORK APVV-14-0878 and VEGA MAXAP 2/0173/17.

[1] J. E. Hirsch, R. L. Sugar, D. J. Scalapino, and R. Blankenbecler, Monte Carlo simulations of one-dimensional fermion systems, Phys. Rev. B 26, 5033 (1982).

[2] M. Troyer, F. Alet, S. Trebst, and S. Wessel, Non-local Updates for Quantum Monte Carlo Simulations, AIP Conf. Proc. 690, 156 (2003), arXiv:physics/0306128.

[3] L. Pollet, Recent developments in Quantum Monte-Carlo simulations with applications for cold gases, Rep. Prog. Phys. 75, 094501 (2012), arXiv:1206.0781.

[4] S. Trotzky, L. Pollet, F. Gerbier, U. Schnorrberger, I. Bloch, N. Prokof’ev, B. Svistunov, and M. Troyer, Suppression of the critical temperature for superfluidity near the Mott transition: validating a quantum simulator, Nature Phys. 6, 998 (2010).

[5] N. Hatano and M. Suzuki, Representation basis in quantum Monte Carlo calculations and the negative-sign problem, Physics Letters A 163, 246 (1992).

[6] M. B. Hastings, How quantum are non-negative wavefunctions? J. Math. Phys. 57, 015210 (2015), arXiv:1506.08883.


[7] C. Wu, J.-P. Hu, and S.-C. Zhang, Exact SO(5) Symmetry in the Spin-3/2 Fermionic System, Phys. Rev. Lett. 91 (2003), 10.1103/PhysRevLett.91.186402, arXiv:cond-mat/0302165.

[8] K. Okunishi and K. Harada, Symmetry-protected topological order and negative-sign problem for SO(N) bilinear-biquadratic chains, Phys. Rev. B 89, 134422 (2014), arXiv:1312.2643.

[9] Z.-X. Li, Y.-F. Jiang, and H. Yao, Solving the fermion sign problem in quantum Monte Carlo simulations by Majorana representation, Phys. Rev. B 91 (2015), 10.1103/PhysRevB.91.241117, arXiv:1601.05780.

[10] Z.-X. Li, Y.-F. Jiang, and H. Yao, Majorana-Time-Reversal Symmetries: A Fundamental Principle for Sign-Problem-Free Quantum Monte Carlo Simulations, Phys. Rev. Lett. 117, 267002 (2016), arXiv:1601.05780.

[11] T. Nakamura, Vanishing of the negative-sign problem of quantum Monte Carlo simulations in one-dimensional frustrated spin systems, Phys. Rev. B 57, R3197 (1998), arXiv:cond-mat/9707019.

[12] F. Alet, K. Damle, and S. Pujari, Sign-Problem-Free Monte Carlo Simulation of Certain Frustrated Quantum Magnets, Phys. Rev. Lett. 117, 197203 (2016), arXiv:1511.01586.

[13] A. Honecker, S. Wessel, R. Kerkdyk, T. Pruschke, F. Mila, and B. Normand, Thermodynamic properties of highly frustrated quantum spin ladders: Influence of many-particle bound states, Phys. Rev. B 93, 054408 (2016), arXiv:1511.01501.

[14] S. Wessel, B. Normand, F. Mila, and A. Honecker, Efficient Quantum Monte Carlo simulations of highly frustrated magnets: the frustrated spin-1/2 ladder, SciPost Physics 3, 005 (2017).

[15] J. I. Cirac, D. Perez-Garcia, N. Schuch, and F. Verstraete, Matrix product unitaries: structure, symmetries, and topological invariants, J. Stat. Mech. 2017, 083105 (2017), arXiv:1703.09188.

[16] M. B. Sahinoglu, S. K. Shukla, F. Bi, and X. Chen, Matrix product representation of locality preserving unitaries, Phys. Rev. B 98, 245122 (2018), arXiv:1704.01943.

[17] W. Dobrautz, H. Luo, and A. Alavi, Compact numerical solutions to the two-dimensional repulsive Hubbard model obtained via nonunitary similarity transformations, Phys. Rev. B 99, 075119 (2019), arXiv:1811.03607.

[18] M. Troyer and U.-J. Wiese, Computational complexity and fundamental limitations to fermionic quantum Monte Carlo simulations, Phys. Rev. Lett. 94, 170201 (2005), arXiv:cond-mat/0408370.

[19] M. Marvian, D. A. Lidar, and I. Hen, On the computational complexity of curing non-stoquastic Hamiltonians, Nature Comm. 10, 1571 (2019), arXiv:1802.03408.

[20] J. Klassen, M. Marvian, S. Piddock, M. Ioannou, I. Hen, and B. Terhal, Hardness and Ease of Curing the Sign Problem for Two-Local Qubit Hamiltonians, (2019), arXiv:1906.08800.

[21] J. Klassen and B. M. Terhal, Two-local qubit Hamiltonians: when are they stoquastic? Quantum 3, 139 (2019), arXiv:1806.05405.

[22] H. Shinaoka, Y. Nomura, S. Biermann, M. Troyer, and P. Werner, Negative sign problem in continuous-time quantum Monte Carlo: Optimal choice of single-particle basis for impurity problems, Phys. Rev. B 92, 195126 (2015), arXiv:1508.06741.

[23] J. R. McClean and A. Aspuru-Guzik, Clock Quantum Monte Carlo: an imaginary-time method for real-time quantum dynamics, Phys. Rev. A 91, 012311 (2015), arXiv:1410.1877.

[24] R. E. Thomas, Q. Sun, A. Alavi, and G. H. Booth, Stochastic Multiconfigurational Self-Consistent Field Theory, J. Chem. Theory Comput. 11, 5316 (2015), arXiv:1510.03635.

[25] W. Bietenholz, A. Pochinsky, and U. J. Wiese, Meron-Cluster Simulation of the θ Vacuum in the 2D O(3) Model, Phys. Rev. Lett. 75, 4524 (1995), arXiv:hep-lat/9505019.

[26] S. Chandrasekharan and U.-J. Wiese, Meron-cluster solution of fermion sign problems, Phys. Rev. Lett. 83, 3116 (1999), arXiv:cond-mat/9902128.

[27] P. Henelius and A. W. Sandvik, Sign problem in Monte Carlo simulations of frustrated quantum spin systems, Phys. Rev. B 62, 1102 (2000), arXiv:cond-mat/0001351.

[28] M. Nyfeler, F.-J. Jiang, F. Kämpfer, and U.-J. Wiese, Nested Cluster Algorithm for Frustrated Quantum Antiferromagnets, Phys. Rev. Lett. 100, 247206 (2008), arXiv:0803.3538.

[29] E. Huffman and S. Chandrasekharan, Solution to sign problems in models of interacting fermions and quantum spins, Phys. Rev. E 94 (2016), 10.1103/PhysRevE.94.043311, arXiv:1605.07420.

[30] C. T. Hann, E. Huffman, and S. Chandrasekharan, Solution to the sign problem in a frustrated quantum impurity model, Ann. Phys. 376, 63 (2017), arXiv:1608.05144.

[31] I. Hen, Resolution of the sign problem for a frustrated triplet of spins, Phys. Rev. E 99, 033306 (2019), arXiv:1811.03027.

[32] In contrast to Ref. [19] where term-wise stoquasticity is considered, our definition remains on the level of the global Hamiltonian. This is because positive matrix elements of the local Hamiltonian terms may cancel in the global matrix representation. Term-wise stoquasticity is thus meaningful (only) in the following sense: if a Hamiltonian admits a term-wise stoquastic local basis, it has no sign problem on an arbitrary lattice. However, for any given graph, it might well be possible to fully cure the sign problem using an allowed set of transformations even if the Hamiltonian cannot be made term-wise stoquastic.

[33] S. Foucart and H. Rauhut, A Mathematical Introduction to Compressive Sensing, Applied and Numerical Harmonic Analysis (Springer New York, New York, NY, 2013).

[34] H.-J. Mikeska and A. K. Kolezhuk, in Quantum Magnetism, Lecture Notes in Physics, edited by U. Schollwöck, J. Richter, D. J. J. Farnell, and R. F. Bishop (Springer Berlin Heidelberg, Berlin, Heidelberg, 2004) pp. 1–83.

[35] E. Dagotto and T. M. Rice, Surprises on the Way from One- to Two-Dimensional Quantum Magnets: The Ladder Materials, Science 271, 618 (1996).

[36] M. Takano, Spin ladder compounds, Physica C: Superconductivity Proceedings of the International Symposium on Frontiers of High-Tc Superconductivity, 263, 468 (1996).

[37] A. W. Sandvik, Computational Studies of Quantum Spin Systems, AIP Conference Proceedings 1297, 135 (2010), arXiv:1101.3281.

[38] T. Meng, T. Neupert, M. Greiter, and R. Thomale, Coupled-wire construction of chiral spin liquids, Phys. Rev. B 91, 241106 (2015), arXiv:1503.05051.

[39] P.-H. Huang, J.-H. Chen, A. E. Feiguin, C. Chamon, and C. Mudry, Coupled spin-1/2 ladders as microscopic models for non-Abelian chiral spin liquids, Phys. Rev. B 95, 144413 (2017), arXiv:1611.02523.

[40] A. Nietner, C. Krumnow, E. J. Bergholtz, and J. Eisert, Composite symmetry protected topological order and effective models, Phys. Rev. B 96, 235138 (2017), arXiv:1704.02992.

[41] Y. Yoshida, H. Ito, M. Maesato, Y. Shimizu, H. Hayama, T. Hiramatsu, Y. Nakamura, H. Kishida, T. Koretsune, C. Hotta, and G. Saito, Spin-disordered quantum phases in a quasi-one-dimensional triangular lattice, Nature Physics 11, 679 (2015).

[42] K. T. Lai and M. Valldor, Coexistence of spin ordering on ladders and spin dimer formation in a new-structure-type compound Sr2Co3S2O3, Scientific Reports 7, 43767 (2017).

[43] V. I. Iglovikov, E. Khatami, and R. T. Scalettar, Geometry dependence of the sign problem in quantum Monte Carlo simulations, Phys. Rev. B 92, 045110 (2015).

[44] J. Carrasquilla, Z. Hao, and R. G. Melko, A two-dimensional spin liquid in quantum kagome ice, Nature Communications 6, 7421 (2015), arXiv:1407.0037.

[45] T. Abrudan, J. Eriksson, and V. Koivunen, Conjugate gradient algorithm for optimization under unitary matrix constraint, Signal Processing 89, 1704 (2009).

[46] D. Hangleiter and I. Roth, (2019), Gitlab repository at https://gitlab.com/ingo.roth/signease.

[47] ISO 16269-4:2010, Statistical interpretation of data – Part 4: Detection and treatment of outliers, Standard (International Organization for Standardization, Geneva, CH, 2010).

[48] F. Barahona, On the computational complexity of Ising spin glass models, J. Phys. A 15, 3241 (1982).

[49] D. P. Landau and K. Binder, A guide to Monte Carlo simulations in statistical physics (Cambridge University Press, 2000).

[50] G. H. Booth, A. J. W. Thom, and A. Alavi, Fermion Monte Carlo without fixed nodes: A game of life, death, and annihilation in Slater determinant space, 131, 054106.

[51] C. M. Dawson, H. L. Haselgrove, A. P. Hines, D. Mortimer, M. A. Nielsen, and T. J. Osborne, Quantum computing and polynomial equations over the finite field Z_2, Quant. Inf. Comp. 5, 102 (2005), arXiv:quant-ph/0408129.

[52] S. P. Jordan, D. Gosset, and P. J. Love, Quantum-Merlin-Arthur–complete problems for stoquastic Hamiltonians and Markov matrices, Phys. Rev. A 81 (2010), 10.1103/PhysRevA.81.032331, arXiv:0905.4755.

[53] H. Pashayan, J. J. Wallman, and S. D. Bartlett, Estimating outcome probabilities of quantum circuits using quasiprobabilities, Phys. Rev. Lett. 115 (2015), 10.1103/PhysRevLett.115.070501, arXiv:1503.07525.

[54] K. N. Anagnostopoulos and J. Nishimura, New approach to the complex-action problem and its application to a nonperturbative study of superstring theory, Phys. Rev. D 66, 106008 (2002), arXiv:hep-th/0108041.

[55] Z. Ringel and D. L. Kovrizhin, Quantized gravitational responses, the sign problem, and quantum complexity, 3, e1701758.

[56] M. A. Levin and X.-G. Wen, String-net condensation: A physical mechanism for topological phases, Phys. Rev. B 71, 045110 (2005), arXiv:cond-mat/0404617.

[57] R. Levy and B. K. Clark, Mitigating the Sign Problem Through Basis Rotations, (2019), arXiv:1907.02076.

[58] G. Torlai, J. Carrasquilla, M. T. Fishman, R. G. Melko, and M. P. A. Fisher, Wavefunction positivization via automatic differentiation, (2019), arXiv:1906.04654.

[59] A. J. Kim, P. Werner, and R. Valentí, Alleviating the Sign Problem in Quantum Monte Carlo Simulations of Spin-Orbit-Coupled Multi-Orbital Hubbard Models, (2019), arXiv:1907.11298.

[60] Y. Shoukry, P. Nuzzo, A. L. Sangiovanni-Vincentelli, S. A. Seshia, G. J. Pappas, and P. Tabuada, SMC: Satisfiability Modulo Convex Optimization, in Proceedings of the 20th International Conference on Hybrid Systems: Computation and Control, HSCC ’17 (ACM, New York, NY, USA, 2017) pp. 19–28, event-place: Pittsburgh, Pennsylvania, USA.

[61] S. Bravyi, D. P. DiVincenzo, R. I. Oliveira, and B. M. Terhal, The Complexity of Stoquastic Local Hamiltonian Problems, Quant. Inf. Comp. 8, 0361 (2008), arXiv:quant-ph/0606140.

[62] S. Bravyi and B. Terhal, Complexity of stoquastic frustration-free Hamiltonians, SIAM J. Comput. 39, 1462 (2009), arXiv:0806.1746.

[63] T. Cubitt and A. Montanaro, Complexity Classification of Local Hamiltonian Problems, SIAM J. Comput. 45, 268 (2016), arXiv:1311.3161.

[64] T. S. Cubitt, A. Montanaro, and S. Piddock, Universal quantum Hamiltonians, PNAS 115, 9497 (2018), arXiv:1701.05182.

[65] J. S. Spencer, N. S. Blunt, and W. M. Foulkes, The sign problem and population dynamics in the full configuration interaction quantum Monte Carlo method, The Journal of Chemical Physics 136, 054110 (2012), arXiv:1110.5479.

[66] M. J. Wainwright, High-Dimensional Statistics: A Non-Asymptotic Viewpoint, 1st ed. (Cambridge University Press, 2019).

[67] M. Schmidt, G. Fung, and R. Rosales, Fast Optimization Methods for L1 Regularization: A Comparative Study and Two New Approaches, in Machine Learning: ECML 2007, Lecture Notes in Computer Science, edited by J. N. Kok, J. Koronacki, R. L. d. Mantaras, S. Matwin, D. Mladenic, and A. Skowron (Springer Berlin Heidelberg, 2007) pp. 286–297.

Appendix A: Conjugate gradient descent for sign easing

In this appendix, we provide details on the numerical implementation [46] of the conjugate gradient descent algorithm for minimizing the non-stoquasticity ν1 over orthogonal circuits. In this work, we focus on the ansatz class of translation-invariant on-site orthogonal transformations, but other classes such as constant-depth circuits can be implemented analogously.

1. Translation-invariant formulation of the non-stoquasticity measure

We begin by deriving a simple expression for the non-stoquasticity measure for translation-invariant, one-dimensional, nearest-neighbour Hamiltonians. Our formulation is based on the observation that by translation-invariance, the measure ν1(H) only depends on the local Hamiltonian term h and is therefore independent of the system size. Since Hamiltonian terms with overlapping support can contribute to the same matrix element in the global Hamiltonian matrix, it is thus sufficient to optimize certain sums of matrix elements of h rather than make h itself stoquastic, a condition required only by the stronger notion of term-wise stoquasticity.


Recall that we consider real nearest-neighbour Hamiltonians with closed boundary conditions on a line of n local systems with dimension d,

$$ H = \sum_{i=1}^{n} T_i(h), \tag{A1} $$

where $h \in \mathbb{R}^{d^2 \times d^2}$ is the local term and we use the notation $T_i(h) = \mathbb{1}^{\otimes (i-1)} \otimes h \otimes \mathbb{1}^{\otimes (n-i-1)}$. Closed boundary conditions identify n + 1 = 1.

Specifically, we can calculate

$$ \nu_1(H) = n\, 2^{n-3} \sum_{i,j,k;\, l,m,n:\; k \neq l,\; m = n} \max\big\{ 0, (h \otimes \mathbb{1} + \mathbb{1} \otimes h)_{i,k,m;\, j,l,n} \big\} \equiv n\, 2^{n-3}\, \nu_1(h). \tag{A2} $$

This reduces the problem of minimizing ν1(H) over on-site orthogonal bases to minimizing ν1(h) over such bases. To derive Eq. (A2) we re-express the global non-stoquasticity measure as follows:

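\begin{align}
\nu_1(H) &= \sum_{(i_1,\ldots,i_n) \neq (j_1,\ldots,j_n)} \max\Big\{ 0, \langle i_1, \ldots, i_n | \sum_{l=1}^{n} T_l(h) | j_1, \ldots, j_n \rangle \Big\} \tag{A3} \\
&= \sum_{p=1}^{n} \sum_{\substack{i_1,\ldots,i_n,\, j_p,\, j_{p+1}: \\ i_{p+1} \neq j_{p+1}}} \max\big\{ 0, \langle i_1, \ldots, i_n | (T_p(h) + T_{p+1}(h)) | i_1, \ldots, i_{p-1}, j_p, j_{p+1}, i_{p+2}, \ldots, i_n \rangle \big\} \tag{A4} \\
&= 2^{n-3} \sum_{p=1}^{n} \sum_{\substack{i_p,\ldots,i_{p+2},\, j_p,\, j_{p+1}: \\ i_{p+1} \neq j_{p+1}}} \max\big\{ 0, \langle i_p, \ldots, i_{p+2} | (h \otimes \mathbb{1} + \mathbb{1} \otimes h) | j_p, j_{p+1}, i_{p+2} \rangle \big\} \tag{A5} \\
&= n\, 2^{n-3} \sum_{\substack{i_1,\ldots,i_3,\, j_1,\, j_2: \\ i_2 \neq j_2}} \max\big\{ 0, \langle i_1, i_2, i_3 | (h \otimes \mathbb{1} + \mathbb{1} \otimes h) | j_1, j_2, i_3 \rangle \big\}. \tag{A6}
\end{align}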

In the first step, we have used that the condition (i1, . . . , in) ≠ (j1, . . . , jn) implies that at least one of the indices differs. Summing over p, we can let this index be the (p + 1)st one. To avoid double-counting, we then divide the sum over all strings which differ on at most two nearest neighbours into a sum over all strings which potentially differ on two nearest neighbours left of the (p + 1)st index. Patches which differ on more than two nearest neighbours vanish for nearest-neighbour Hamiltonians. In the second step, we use that all terms with support left of the pth qubit vanish, since the basis strings differ on the (p + 1)st qubit. In the last step, we use the translation-invariance again to account for the sum over p, incurring a factor n.
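The identity ν1(H) = n 2^(n−3) ν1(h) can be checked by brute force on short chains. The sketch below is a self-contained illustration for qubits (d = 2) with periodic boundary conditions; it is not the implementation of Ref. [46], and all function names are our own. The local term h is stored as a 4 × 4 nested list indexed by pairs of bits.

```python
import itertools
import random

def h_elem(h, ia, ib, ja, jb):
    """Matrix element <ia, ib| h |ja, jb> of the two-site term."""
    return h[2 * ia + ib][2 * ja + jb]

def global_elem(h, i, j):
    """<i| sum_p T_p(h) |j> on a ring of len(i) qubits."""
    n = len(i)
    total = 0.0
    for p in range(n):
        q = (p + 1) % n  # closed boundary conditions: site n + 1 = site 1
        if all(i[s] == j[s] for s in range(n) if s not in (p, q)):
            total += h_elem(h, i[p], i[q], j[p], j[q])
    return total

def nu1_global(h, n):
    """Left-hand side: sum of positive off-diagonal entries of H, Eq. (A3)."""
    states = list(itertools.product((0, 1), repeat=n))
    return sum(max(0.0, global_elem(h, i, j))
               for i in states for j in states if i != j)

def nu1_local(h):
    """Right-hand side without prefactor: the three-site sum in Eq. (A6)."""
    total = 0.0
    for i1, i2, i3, j1, j2 in itertools.product((0, 1), repeat=5):
        if i2 == j2:
            continue
        elem = h_elem(h, i1, i2, j1, j2)         # h (x) 1, third index fixed
        if i1 == j1:
            elem += h_elem(h, i2, i3, j2, i3)    # 1 (x) h, first index fixed
        total += max(0.0, elem)
    return total

def random_symmetric(seed):
    """A random real symmetric 4x4 local term, for testing only."""
    rng = random.Random(seed)
    a = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(4)]
    return [[(a[r][c] + a[c][r]) / 2 for c in range(4)] for r in range(4)]
```

For a given `h`, comparing `nu1_global(h, n)` with `n * 2 ** (n - 3) * nu1_local(h)` for n = 3, 4, 5 exercises every step of the derivation, including the wrap-around term.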

2. Gradient of the objective function

As an ansatz class we choose on-site orthogonal transformations in O(d), where d is the dimension of a local constituent of the system. More precisely, we consider transformations of the type (see Eq. (55))

$$ H = \sum_{i=1}^{n} T_i(h) \mapsto O^{\otimes n} H (O^T)^{\otimes n}, \tag{A7} $$

which locally amounts to

$$ h \mapsto h(O) := (O \otimes O)\, h\, (O^T \otimes O^T). \tag{A8} $$

The key ingredient for the conjugate gradient descent algorithm is the derivative of the objective function ν1(h(O)) with respect to the orthogonal matrix O. The gradient is given by

$$ \partial_O \nu_1(h(O)) = \sum_{i,j} \frac{\partial \nu_1}{\partial h(i|j)} \cdot \frac{\partial h(i|j)}{\partial O}. \tag{A9} $$

We can further expand the terms of the global gradient (A9): in particular, we can express the gradient of the effective local terms as a conjugation $h(O) = C(O)\, h\, C^\dagger(O)$ of the local term h by the orthogonal circuit $C(O) = O \otimes O$. We will now derive expressions for the measure and the gradients that the algorithm has to evaluate.


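a. The objective function gradient. We first determine the gradient of the objective function. Since we will also make use of different measures ν_p as defined in Eq. (31), we write the objective function for different p ≥ 1 as

$$ \nu_p^p(h) = \sum_{i,j,k;\, l,m,n:\; k \neq l,\; m = n} \max\big\{ (h \otimes \mathbb{1} + \mathbb{1} \otimes h)_{i,k,m;\, j,l,n},\, 0 \big\}^{p}. \tag{A10} $$

Then, the gradient of the objective function is given by

$$ \left. \frac{\partial \nu_p^p}{\partial h(x|y)} \right|_{h(0)} = p \sum_{i,j,k;\, l,m,n:\; k \neq l,\; m = n} \big( |x\rangle\langle y| \otimes \mathbb{1} + \mathbb{1} \otimes |x\rangle\langle y| \big)_{i,k,m;\, j,l,n} \max\big\{ (h \otimes \mathbb{1} + \mathbb{1} \otimes h)_{i,k,m;\, j,l,n},\, 0 \big\}^{p-1}. \tag{A11} $$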

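b. Gradient of the transformed Hamiltonian. We can expand the gradient of the transformed Hamiltonian term by the orthogonal matrix as

$$ \frac{\partial h(i|j)}{\partial O} = \sum_{m,n} \frac{\partial\, \mathrm{adj}_h(C)(i|j)}{\partial C(m|n)} \frac{\partial C(m|n)}{\partial O}, \tag{A12} $$

where by $\mathrm{adj}_h(C)$ we denote the adjunction map $h \mapsto C h C^T$. The derivative of the adjoint action of C on h is given by

\begin{align}
\frac{\partial\, \mathrm{adj}_h(C)(i|j)}{\partial C(m|n)} &= \sum_{k,l} \partial_{C(m|n)}\, C(i|k)\, h(k|l)\, C(j|l) \tag{A13} \\
&= \sum_{k,l} \big[ \delta_{m,i} \delta_{n,k}\, h(k|l)\, C(j|l) + C(i|k)\, h(k|l)\, \delta_{m,j} \delta_{n,l} \big] \tag{A14} \\
&= \langle m|i\rangle \langle j| C h^T |n\rangle + \langle m|j\rangle \langle i| C h |n\rangle. \tag{A15}
\end{align}

From this expression, we can directly read off its matrix form

$$ \frac{\partial\, \mathrm{adj}_h(C)(i|j)}{\partial C} = |i\rangle\langle j| C h^T + |j\rangle\langle i| C h. \tag{A16} $$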
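The matrix form (A16) can be validated against finite differences. The following sketch treats C as an unconstrained real matrix, as the derivation of Eqs. (A13)-(A15) does; the helper names are ours and the dimension and random inputs are arbitrary test choices.

```python
import random

def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def adj(c, h):
    """Adjoint action adj_h(C) = C h C^T."""
    return matmul(matmul(c, h), transpose(c))

def grad_entry(c, h, i, j):
    """Matrix of derivatives of adj_h(C)(i|j) w.r.t. C(m|n), following Eq. (A16)."""
    d = len(c)
    c_ht = matmul(c, transpose(h))
    c_h = matmul(c, h)
    g = [[0.0] * d for _ in range(d)]
    for n in range(d):
        g[i][n] += c_ht[j][n]  # contribution of |i><j| C h^T
        g[j][n] += c_h[i][n]   # contribution of |j><i| C h
    return g

def finite_difference(c, h, i, j, step=1e-6):
    """Central finite differences of the same derivative, entry by entry."""
    d = len(c)
    g = [[0.0] * d for _ in range(d)]
    for m in range(d):
        for n in range(d):
            plus = [row[:] for row in c]
            minus = [row[:] for row in c]
            plus[m][n] += step
            minus[m][n] -= step
            g[m][n] = (adj(plus, h)[i][j] - adj(minus, h)[i][j]) / (2 * step)
    return g

def max_deviation(d=4, seed=0):
    """Largest entry-wise gap between Eq. (A16) and finite differences."""
    rng = random.Random(seed)
    c = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(d)]
    h = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(d)]
    return max(abs(a - b)
               for i in range(d) for j in range(d)
               for ra, rb in zip(grad_entry(c, h, i, j),
                                 finite_difference(c, h, i, j))
               for a, b in zip(ra, rb))
```

Because the adjoint action is quadratic in C, central differences are exact up to rounding, so the deviation should be at the level of floating-point noise.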

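It remains to compute the gradient of the circuit with respect to the orthogonal matrix. We obtain with $\langle m| = \langle m_1| \otimes \langle m_2|$

$$ \frac{\partial C(m|n)}{\partial O(k|l)} = \sum_{i=1}^{2} \delta_{m_i,k}\, \delta_{n_i,l}\, O(m_1|n_1)\, O(m_2|n_2) = \sum_{i=1}^{2} \langle m_i|k\rangle \langle l|n_i\rangle\, O(m_1|n_1)\, O(m_2|n_2), \tag{A17} $$

which expressed in matrix form is then given by

$$ \frac{\partial C(m|n)}{\partial O} = |n_1\rangle\langle m_1|\, \langle m_2|O|n_2\rangle + |n_2\rangle\langle m_2|\, \langle m_1|O|n_1\rangle. \tag{A18} $$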

3. Algorithmic procedure

We start our conjugate gradient algorithm either at the identity matrix (with or without a small perturbation) or a Haar-randomly chosen orthogonal matrix as indicated at the respective places in the main text.

Since the minimization of the ν1-measure is numerically not well behaved, we improve the performance of the algorithm in several ways. First, we observe that the measure ν2 given by the Frobenius norm of the non-stoquastic part of the Hamiltonian is numerically much better behaved. This is due to the ℓ2-norm being continuously differentiable while the ℓ1-norm is only subdifferentiable. In particular, at its minima the gradient of the ℓ1-norm is discontinuous and never vanishes. For this reason, rather than optimizing the ℓ1-norm of the non-stoquastic part, we optimize a smooth approximation thereof as given by [67]

$$ \nu_{1,\alpha}(H) := \sum_{i \neq j} f_\alpha(H_{i,j}), \tag{A19} $$

with $f_\alpha(x) = x + \alpha^{-1} \log(1 + \exp(-\alpha x))$.

To achieve the best possible performance, we then carry out a hybrid approach: First, we pre-optimize by minimizing ν2 using our conjugate gradient descent algorithm. Second, starting at the minimizer obtained in the Frobenius norm optimization, we minimize the smooth non-stoquasticity measure ν1,α. We choose the values α = 50, α = 100 and α = 40 for the random stoquastic Hamiltonians, the J0-J1-J2-J3-model, and the frustrated ladder model, respectively. We then compare the result to a direct minimization of the non-stoquasticity ν1,α starting from the original initial point and choose whichever of the minimizations performed best. The exact details of our optimization algorithms can be found at Ref. [46] together with code to reproduce the figures shown here. Our code framework can be easily adapted for optimization of other Hamiltonian models and more general circuit architectures.
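To illustrate how closely the smooth measure ν1,α of Eq. (A19) tracks ν1, the sketch below evaluates both on a small matrix. It is not taken from Ref. [46]; the function names are ours, and the rewriting of f_α as max{x, 0} + α^(−1) log(1 + exp(−α|x|)) is an algebraically equivalent form that we choose because it avoids overflow for large |αx|.

```python
import math

def f_alpha(x, alpha):
    """Smooth surrogate f_alpha(x) = x + log(1 + exp(-alpha * x)) / alpha."""
    # Equivalent overflow-safe form: max(x, 0) + log1p(exp(-alpha * |x|)) / alpha.
    return max(x, 0.0) + math.log1p(math.exp(-alpha * abs(x))) / alpha

def nu1(H):
    """Sum of the positive off-diagonal entries of H."""
    n = len(H)
    return sum(max(H[i][j], 0.0) for i in range(n) for j in range(n) if i != j)

def nu1_smooth(H, alpha):
    """Smooth measure of Eq. (A19): sum of f_alpha over off-diagonal entries."""
    n = len(H)
    return sum(f_alpha(H[i][j], alpha)
               for i in range(n) for j in range(n) if i != j)
```

Each off-diagonal entry contributes a gap between 0 and log(2)/α, so for the values of α quoted above the smooth measure overestimates ν1 by at most a few percent of the number of off-diagonal entries.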

Appendix B: Orthogonal transformations of the penalty terms (proof of Theorem 9)

Proof of Theorem 9 (continued). As in the proof for orthogonal Clifford transformations, we will show that for any given choice of Z-transformations one cannot further decrease the non-stoquasticity by exploiting the additional freedom offered by the full orthogonal group. Above, we have argued that applying an arbitrary orthogonal transformation at a single site can be reduced to applying $R(\theta) X^p Z^{p+z}$ with θ ∈ [−π/4, π/4] and z, p ∈ {0, 1}. We will now show that for any choice of $\vec z$ and $\vec p$, a rotation by angles $\vec\theta \in [-\pi/4, \pi/4]^n$ cannot decrease the non-stoquasticity any further.

Analogously to the proof for Clifford transformations, we discuss all possible transformations by dividing them into different cases. In each case the non-stoquasticity of an uncured edge (i, j) and its ancilla qubit ξi,j cannot be eased below its previous value of 1. The additional difficulty we encounter here is that the orthogonal group is continuous as opposed to the orthogonal Clifford group, which is a discrete and rather ‘small’ group.

Given a choice of $\vec z$, consider an edge (i, j) with a non-trivial contribution to ν1 and its corresponding ancilla qubit ξi,j. We begin by supposing that pi = pj = pξi,j = 0, so that the X-flips act trivially on all three qubits. We now analyze the effect of the remaining rotations R(θ) on each of the qubits.

[Diagram: a three-qubit gadget with coupling labels XX, ZZ, −ZZ, −ZZ and on-site rotations R(θ_i), R(θ_j), R(θ_{a_ij}).]

Specifically, we apply rotations with angles θi/2, θj/2, θξi,j/2 with θi, θj, θξi,j ∈ [−π/2, π/2] to the three qubits. Note that we consider rotations by half-angles θ → θ/2 while at the same time doubling the interval [−π/4, π/4] → [−π/2, π/2] to ease notation later in the proof. The effect of rotations on each vertex of an edge (i, j) is given by

\begin{align}
X_i X_j + C Z_i Z_j \mapsto\; & [C \cos(\theta_i) \cos(\theta_j) + \sin(\theta_i) \sin(\theta_j)]\, Z_i Z_j \nonumber \\
& + [C \cos(\theta_i) \sin(\theta_j) - \sin(\theta_i) \cos(\theta_j)]\, Z_i X_j \nonumber \\
& + [C \sin(\theta_i) \cos(\theta_j) - \cos(\theta_i) \sin(\theta_j)]\, X_i Z_j \nonumber \\
& + [C \sin(\theta_i) \sin(\theta_j) + \cos(\theta_i) \cos(\theta_j)]\, X_i X_j, \tag{B1}
\end{align}

and likewise for the edges (i, ξi,j) and (j, ξi,j).

[Figure 7 plot: the (θi, θj) square with both axes ranging from −π/2 to π/2, partitioned into sectors labelled Case 1, Case 2 and Case 3. Case 1 and Case 3 are annotated with the first two terms of the bound, (B3) + (B4), and Case 2 with the last two, (B5) + (B6).]

Figure 7. We divide the derivation of a lower bound on the non-stoquasticity of an edge (i, j) and its ancilla qubit ξi,j, on which we apply rotations Ri(θi/2) Rj(θj/2) Rξi,j(θξi,j/2) with θi, θj, θξi,j ∈ [−π/2, π/2], into three distinct cases. In each case we use different terms of the expression (B3)-(B6) to lower bound the non-stoquasticity.

The non-stoquasticity of the transformed Hamiltonian terms corresponding to the three qubits is then given by

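\begin{align}
& \nu_1\Big( R_i(\theta_i/2) R_j(\theta_j/2) R_{\xi_{i,j}}(\theta_{\xi_{i,j}}/2) \big( X_i X_j + C (Z_i Z_j - Z_i Z_{\xi_{i,j}} - Z_j Z_{\xi_{i,j}}) \big) R_i(\theta_i/2)^T R_j(\theta_j/2)^T R_{\xi_{i,j}}(\theta_{\xi_{i,j}}/2)^T \Big) \tag{B2} \\
& \geq \max\{ C \sin(\theta_i) \sin(\theta_j) + \cos(\theta_i) \cos(\theta_j),\, 0 \} \tag{B3} \\
& \quad + (2 \deg G')^{-1} \cdot \big( |C \sin(\theta_i) \cos(\theta_j) - \cos(\theta_i) \sin(\theta_j)| + |C \sin(\theta_j) \cos(\theta_i) - \cos(\theta_j) \sin(\theta_i)| \big) \tag{B4} \\
& \quad + \big( \max\{ -C \sin(\theta_i) \sin(\theta_{\xi_{i,j}}),\, 0 \} + \max\{ -C \sin(\theta_j) \sin(\theta_{\xi_{i,j}}),\, 0 \} \big) \tag{B5} \\
& \quad + (2 \deg G')^{-1} \cdot C \big( |\cos(\theta_i) \sin(\theta_{\xi_{i,j}})| + |\sin(\theta_i) \cos(\theta_{\xi_{i,j}})| + |\cos(\theta_j) \sin(\theta_{\xi_{i,j}})| + |\sin(\theta_j) \cos(\theta_{\xi_{i,j}})| \big). \tag{B6}
\end{align}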

Note that the terms (B3) and (B5) stem from the XX interactions with a positive sign and therefore depend on the signs of the sin terms. Conversely, the terms (B4) and (B6) stem from the XZ interactions and therefore involve absolute values.

We divide the allowed rotations into different sectors corresponding to the different combinations of the signs of sin(θi) and sin(θj) as shown in Fig. 7. Which of the terms in (B3) and (B5) are non-trivial depends precisely on these combinations. We divide the cases as follows: first, the sectors in which sign(θi) = sign(θj). Second, the sectors in which sign(θi) = − sign(θj) and |θi| + |θj| ≥ π/2. Third, the sectors in which sign(θi) = − sign(θj) and |θi| + |θj| ≤ π/2. Taken together, the three cases cover the entire range of allowed angles. Moreover, in all cases we allow arbitrary choices of θξi,j ∈ [−π/2, π/2]. We now proceed to lower-bound the non-stoquasticity of the Hamiltonian terms acting on the three qubits, where in each case we will use different terms of Eqs. (B3)-(B6).

We first discuss the case in which sign(θi) = sign(θj). In this case, it suffices to consider the terms (B3) and (B4). Observe that in this case both C sin(θi) sin(θj) ≥ 0 and cos(θi) cos(θj) ≥ 0. Moreover, due to our choice of C and noting that |C sin(θi) cos(θj) − cos(θi) sin(θj)| + |C sin(θj) cos(θi) − cos(θj) sin(θi)| ≥ C |sin(θi − θj)|, we obtain

(B3) + (B4) ≥ cos(θi − θj) + C/(2 degG′) · | sin(θi − θj)| ≥ 1. (B7)
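The bound (B7) can be probed numerically on a grid of same-sign angles. The sketch below is only a sanity check, not a proof; it assumes C = (2 deg G′)^2 as chosen for Theorem 9, and the function names and the sample degrees are our own choices.

```python
import math

def b3(ti, tj, C):
    """Term (B3): XX contribution on the edge (i, j)."""
    return max(C * math.sin(ti) * math.sin(tj) + math.cos(ti) * math.cos(tj), 0.0)

def b4(ti, tj, C, deg):
    """Term (B4): XZ and ZX contributions, weighted by 1 / (2 deg G')."""
    a = abs(C * math.sin(ti) * math.cos(tj) - math.cos(ti) * math.sin(tj))
    b = abs(C * math.sin(tj) * math.cos(ti) - math.cos(tj) * math.sin(ti))
    return (a + b) / (2 * deg)

def b7_middle(ti, tj, C, deg):
    """Middle expression of Eq. (B7)."""
    return math.cos(ti - tj) + C / (2 * deg) * abs(math.sin(ti - tj))

def case1_points(steps=40):
    """Grid of angle pairs in [-pi/2, pi/2]^2 with equal signs (Case 1)."""
    grid = [k * (math.pi / 2) / steps for k in range(steps + 1)]
    return [(s * ti, s * tj) for s in (1, -1) for ti in grid for tj in grid]

def case1_minimum(deg, steps=40):
    """Smallest value of (B3) + (B4) found on the Case 1 grid."""
    C = (2 * deg) ** 2
    return min(b3(ti, tj, C) + b4(ti, tj, C, deg) for ti, tj in case1_points(steps))
```

On such a grid the minimum is attained at θi = θj = 0, where (B3) equals 1 and (B4) vanishes.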

[Figure 8 plot: the (θi, θj) plane with axes from −π/2 to π/2, showing the diagonal |θi| = |θj|, the line |θi| + |θj| = α(ε3), and contour lines for values ε1 > ε2 > ε3 > ε4 = 0.]

Figure 8. Illustration of the lower bound in case 3: Every θi and θj such that |θi| + |θj| ≤ π/2 defines a contour line ε(θi, θj) = ε1, ε2, ε3, ε4 = 0 (shades of pink). Given θi, θj and defining ε = ε(θi, θj), Lemma 10 implies the lower bound |θi| + |θj| ≥ α(ε) as defined in Eq. (B22). We then obtain (B3) + (B4) ≥ ε + (C + 1) sin(α(ε)) ≥ 1.

Second, we discuss the case in which sign(θi) = − sign(θj) and π/2 < |θi| + |θj| ≤ π. In this case, we consider the terms (B5) and (B6):

\begin{align}
\text{(B5)} + \text{(B6)} \geq\; & C/(2 \deg G') \cdot \Big( |\cos(\theta_{\xi_{i,j}})| \big( \sin(|\theta_i|) + \sin(|\theta_j|) \big) \tag{B8} \\
& + |\sin(\theta_{\xi_{i,j}})| \big( 2 \deg(G') \min\{ \sin(|\theta_i|), \sin(|\theta_j|) \} + |\cos(\theta_i)| + |\cos(\theta_j)| \big) \Big) \tag{B9} \\
\geq\; & 1. \tag{B10}
\end{align}

Here, we have used that

$$ \sin(|\theta_i|) + \sin(|\theta_j|) = 2 \sin\Big( \frac{|\theta_i| + |\theta_j|}{2} \Big) \cos\Big( \frac{|\theta_i| - |\theta_j|}{2} \Big) \geq 1, \tag{B11} $$

for π/2 ≤ |θi| + |θj| ≤ π so that |θi| − |θj| ≤ π/2, and the definition of C = (2 deg G′)^2.

Finally, we have the remaining case sign(θi) = − sign(θj) and |θi| + |θj| ≤ π/2. In this case, it is again sufficient to consider terms (B3) and (B4). This is the hardest case since the sin terms in (B3) increase much faster than the cos terms decrease due to the factor of C > 1. Therefore, we cannot find a bound in terms of a sum-of-angles rule as in the previous cases. Instead, to lower-bound the terms (B3) and (B4) in this case, we proceed as follows: We reduce the problem of showing (B3) + (B4) ≥ 1 to a one-dimensional problem by noting two facts: first, the term (B4) depends only on the sum |θi| + |θj|. Moreover, it increases monotonically in this sum. Second, for every choice of θi, θj, the value of (B3) takes on its minimal value ε at |θi| = |θj| =: α(ε)/2. Hence, the sum (B3) + (B4) is lower bounded by the sum of ε and (B4) evaluated at α(ε). The proof is concluded by a lower bound on the latter term. We now elaborate those steps one-by-one.

Let us begin by expressing (B4) as a function of |θi| + |θj|:

\begin{align}
& |C \sin(\theta_i) \cos(\theta_j) - \cos(\theta_i) \sin(\theta_j)| + |C \sin(\theta_j) \cos(\theta_i) - \cos(\theta_j) \sin(\theta_i)| \tag{B12} \\
& = C \sin(|\theta_i|) \cos(\theta_j) + \cos(\theta_i) \sin(|\theta_j|) + C \sin(|\theta_j|) \cos(\theta_i) + \cos(\theta_j) \sin(|\theta_i|) \tag{B13} \\
& = (C + 1) \sin(|\theta_i| + |\theta_j|), \tag{B14}
\end{align}

where we have used the fact that sign(sin(θi)) = − sign(sin(θj)) and that the cosines are non-negative. For |θi| + |θj| ≤ π/2 this is a monotonically increasing function in |θi| + |θj|. We now define the value of the term (B3) to be

$$ \varepsilon(\theta_i, \theta_j) := \max\{ -C \sin|\theta_i| \sin|\theta_j| + \cos|\theta_i| \cos|\theta_j|,\, 0 \} \geq 0. \tag{B15} $$

For every choice of α = α(θi, θj) := |θi| + |θj|, the minimal value of

$$ \text{(B3)} + \text{(B4)} = \varepsilon(\theta_i, \theta_j) + (C + 1)/(2 \deg G') \cdot \sin(\alpha(\theta_i, \theta_j)) \tag{B16} $$

is therefore attained at the minimal value of ε(θi, θj) subject to the constraint |θi| + |θj| = α ≤ π/2. Conversely, the value is attained at the minimal value of α(θi, θj) subject to the constraint (B15). This reduces the problem to a one-dimensional problem, which we exploit explicitly in the following lemma. The intuition behind this lemma is shown in Fig. 8.

Lemma 10. For any fixed value π/2 ≥ |θi| + |θj| = α ≥ 0, the minimal value ε(α) of ε(θi, θj) is achieved at |θi| = |θj| = α/2. Moreover, for every θi, θj such that ε(θi, θj) ≥ ε(α) it holds that |θi| + |θj| ≥ α.

Proof. Let |θi| = (α− δ)/2, |θj | = (α+ δ)/2 for 0 ≤ δ ≤ π/2. Then

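\begin{align}
\varepsilon(\theta_i, \theta_j) &= -C \sin\left(\frac{\alpha - \delta}{2}\right)\sin\left(\frac{\alpha + \delta}{2}\right) + \cos\left(\frac{\alpha + \delta}{2}\right)\cos\left(\frac{\alpha - \delta}{2}\right) \tag{B17}\\
&= \frac{C}{2}(\cos\alpha - \cos\delta) + \frac{1}{2}(\cos\alpha + \cos\delta) \tag{B18}\\
&= \frac{1}{2}\bigl((C + 1)\cos\alpha - (C - 1)\cos\delta\bigr), \tag{B19}
\end{align}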

which is minimal at δ = 0. The second part of the lemma can be seen by contraposition: Assume |θi| + |θj| < α(ε). Then

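\begin{align}
\varepsilon(\theta_i, \theta_j) &= \frac{1}{2}\bigl((C + 1)\cos(|\theta_i| + |\theta_j|) - (C - 1)\cos(|\theta_i| - |\theta_j|)\bigr) \tag{B20}\\
&\leq \frac{1}{2}(C + 1)\cos(|\theta_i| + |\theta_j|) < \frac{1}{2}(C + 1)\cos\alpha \equiv \varepsilon(\alpha), \tag{B21}
\end{align}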

where we have used the assumption and the monotonicity of the cosine in the interval [0, π/2] in the last inequality and its non-negativity in the interval [−π/2, π/2] in the second to last inequality.
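The first part of Lemma 10 can be illustrated numerically. The sketch below is a sanity check under stated assumptions rather than a replacement for the proof: the function names are hypothetical, the clipping max{·, 0} from (B15) is omitted as in (B17), and δ is restricted to [0, α] so that both |θi| and |θj| stay non-negative. It evaluates the product form (B17) and the simplified form (B19), and locates the minimizing δ on a grid.

```python
import math


def eps_product_form(alpha, delta, C):
    """Right-hand side of (B17), without the max{., 0} clipping of (B15)."""
    a = (alpha - delta) / 2
    b = (alpha + delta) / 2
    return -C * math.sin(a) * math.sin(b) + math.cos(b) * math.cos(a)


def eps_sum_form(alpha, delta, C):
    """Simplified expression (B19)."""
    return 0.5 * ((C + 1) * math.cos(alpha) - (C - 1) * math.cos(delta))


def argmin_delta(alpha, C, steps=200):
    """Grid search for the delta in [0, alpha] that minimizes (B17)."""
    grid = [alpha * k / steps for k in range(steps + 1)]
    return min(grid, key=lambda d: eps_product_form(alpha, d, C))
```

For C > 1 the grid minimizer is expected to sit at δ = 0, that is, at |θi| = |θj| = α/2, in line with the lemma.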

Now, for every choice of θi, θj, the minimal value ε(α) of ε(θi, θj) corresponding to α = |θi| + |θj| is attained at |θi| = |θj| = α/2. Correspondingly, we can re-express α in terms of ε as

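\begin{equation}
\varepsilon \equiv \varepsilon(\alpha) = -C \sin^2(\alpha/2) + \cos^2(\alpha/2) \quad\Leftrightarrow\quad \alpha(\varepsilon) = 2\arctan\sqrt{\frac{1 - \varepsilon}{C + \varepsilon}}. \tag{B22}
\end{equation}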
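The equivalence (B22) follows from dividing by cos²(α/2) and solving for tan(α/2). As a quick plausibility check, the following sketch (with hypothetical function names and an arbitrarily chosen C) maps α to ε and back.

```python
import math


def eps_of_alpha(alpha, C):
    """Left part of (B22): epsilon as a function of alpha."""
    return -C * math.sin(alpha / 2) ** 2 + math.cos(alpha / 2) ** 2


def alpha_of_eps(eps, C):
    """Right part of (B22): alpha as a function of epsilon, for -C < eps <= 1."""
    return 2 * math.atan(math.sqrt((1 - eps) / (C + eps)))
```

For α in [0, π/2], `alpha_of_eps(eps_of_alpha(alpha, C), C)` should reproduce `alpha` up to rounding.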

The second part of Lemma 10 states that for all θi, θj such that ε(θi, θj) ≥ ε(α) ≥ 0 we have |θi| + |θj| ≥ α(ε) and consequently sin(|θi| + |θj|) ≥ sin(α(ε)). Given θi, θj and defining ε := ε(θi, θj) this implies the lower bound

(B3) + (B4) ≥ ε+ (C + 1)/(2 degG′) · sin(α(ε)), (B23)

where we have used the equivalence (B22). It remains to lower-bound sin(α(ε)). Define x = √((1 − ε)/(C + ε)). We can then rewrite

\begin{equation}
\sin(\alpha(\varepsilon)) = \sin(2\arctan x) = 2\sin(\arctan x)\cos(\arctan x) = \frac{2x}{1 + x^2} \geq x, \tag{B24}
\end{equation}

for x ≤ 1, where we have used that sin(arctan(x)) = x cos(arctan(x)) = x/√(1 + x²). We can also bound

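\begin{equation}
x = \sqrt{\frac{1 - \varepsilon}{C + \varepsilon}} \geq \sqrt{\frac{1}{C}}\sqrt{\frac{1 - \varepsilon}{1 + \varepsilon/C}} \geq \sqrt{\frac{1}{C}}\sqrt{\frac{1 - \varepsilon}{1 + \varepsilon}} \geq \frac{1 - \varepsilon}{\sqrt{C}}, \tag{B25}
\end{equation}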

where the last inequality can be seen by squaring both sides and using 0 ≤ ε < 1. Combining everything we obtain

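\begin{equation}
\text{(B3)} + \text{(B4)} \geq \varepsilon + \frac{\sqrt{C}}{2\deg G'}\cdot(1 - \varepsilon) = \varepsilon\left(1 - \frac{\sqrt{C}}{2\deg G'}\right) + \frac{\sqrt{C}}{2\deg G'} = 1 \tag{B26}
\end{equation}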

due to our choice of C = (2 deg G′)². To conclude the proof, we discuss the effect of applying X-flips to each of the sites. Applying X_i X_j or X_{ξ_{i,j}} merely alters the signs of the terms in (B5). But since we did not constrain the sign of θ_{ξ_{i,j}} in the proof, everything remains unchanged. Suppose an X-flip is applied to either qubit i or j, or to both qubit i and ξ_{i,j}, or to both qubit j and ξ_{i,j}. Assuming w.l.o.g. that qubit i is X-flipped, we achieve the same lower bounds as before by identifying θi ↦ −θi.
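The chain of estimates (B23) to (B26) can also be checked numerically. The sketch below is an illustrative sanity check only: the function names, the grid over ε, and the tested degrees are assumptions not taken from the text. It evaluates the right-hand side of (B23) with C = (2 deg G′)² and α(ε) from (B22), and confirms on a grid that it does not drop below 1.

```python
import math


def x_of_eps(eps, C):
    """The auxiliary quantity x = sqrt((1 - eps) / (C + eps))."""
    return math.sqrt((1 - eps) / (C + eps))


def lower_bound_b23(eps, deg):
    """Right-hand side of (B23) for C = (2 * deg)^2, with deg standing for deg G'."""
    C = (2 * deg) ** 2
    alpha = 2 * math.atan(x_of_eps(eps, C))  # alpha(eps) from (B22)
    return eps + (C + 1) / (2 * deg) * math.sin(alpha)


def min_bound(deg, steps=1000):
    """Smallest value of the (B23) bound over a uniform grid of eps in [0, 1]."""
    return min(lower_bound_b23(k / steps, deg) for k in range(steps + 1))
```

At ε = 1 the angle α(ε) vanishes, so the bound equals 1 there; for smaller ε the factor C + 1 in (B23), compared with the C kept in (B26), leaves some slack.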