metastable states, quasi-stationary distributions and soft ... · hypotheses for markov chains on a...

44
Metastable states, quasi-stationary distributions and soft measures Alessandra Bianchi, Alexandre Gaudilliere To cite this version: Alessandra Bianchi, Alexandre Gaudilliere. Metastable states, quasi-stationary distribu- tions and soft measures. Stochastic Processes and their Applications, Elsevier, 2015, <10.1016/j.spa.2015.11.015>. <hal-00573852v2> HAL Id: hal-00573852 https://hal.archives-ouvertes.fr/hal-00573852v2 Submitted on 29 Jan 2016 HAL is a multi-disciplinary open access archive for the deposit and dissemination of sci- entific research documents, whether they are pub- lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers. L’archive ouverte pluridisciplinaire HAL, est destin´ ee au d´ epˆ ot et ` a la diffusion de documents scientifiques de niveau recherche, publi´ es ou non, ´ emanant des ´ etablissements d’enseignement et de recherche fran¸cais ou ´ etrangers, des laboratoires publics ou priv´ es.

Upload: others

Post on 25-Feb-2021

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

Metastable states, quasi-stationary distributions and

soft measures

Alessandra Bianchi, Alexandre Gaudilliere

To cite this version:

Alessandra Bianchi, Alexandre Gaudilliere. Metastable states, quasi-stationary distribu-tions and soft measures. Stochastic Processes and their Applications, Elsevier, 2015,<10.1016/j.spa.2015.11.015>. <hal-00573852v2>

HAL Id: hal-00573852

https://hal.archives-ouvertes.fr/hal-00573852v2

Submitted on 29 Jan 2016

HAL is a multi-disciplinary open accessarchive for the deposit and dissemination of sci-entific research documents, whether they are pub-lished or not. The documents may come fromteaching and research institutions in France orabroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, estdestinee au depot et a la diffusion de documentsscientifiques de niveau recherche, publies ou non,emanant des etablissements d’enseignement et derecherche francais ou etrangers, des laboratoirespublics ou prives.

Page 2: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABLE STATES,QUASI-STATIONARY DISTRIBUTIONS AND SOFT MEASURES

ALESSANDRA BIANCHI AND ALEXANDRE GAUDILLIERE

ABSTRACT. We establish metastability in the sense of Lebowitz and Penrose under practical and simplehypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted ensemble and quasi-stationary measures, and introducing soft measures as interpo-lation between the two, we prove asymptotic exponential exit law and, on a generally different timescale, asymptotic exponential transition law. By using potential-theoretic tools, and introducing “(κ, λ)-capacities”, we give sharp estimates on relaxation time, as well as mean exit time and transition time.We also establish local thermalization on shorter time scales.

1. METASTABILITY AFTER LEBOWITZ AND PENROSE

1.1. Phenomenology and modelization. Lebowitz and Penrose characterized metastable thermo-dynamic states by the following properties [4]:

(a) only one thermodynamic phase is present,(b) a system that starts in this state is likely to take a long time to get out,(c) once the system has gotten out, it is unlikely to return.

We can think, for example, about freezing fog made of small droplets in which only one phase ispresent (liquid phase) that remains for a long time in such a state (until collision with ground ortrees, forming then hard rime) and that once frozen will typically not return to liquid state beforepressure or temperature have changed.

To model such a state they considered in [4] a deterministic dynamics with equilibrium measureµ. First, they associated with the metastable phase a subset R of the configuration space, anddescribed this metastable state by the restricted ensemble µR = µ(·|R). Second, they proved thatthe escape rate from R of the system started in µR is maximal at time t = 0, and that this initialescape rate is very small. Last, they used standard methods of equilibrium statistical mechanicsto deal with (c). As an estimate of the returning probability to the metastable state they used thefraction of members of the equilibrium ensemble that have configurations inR and they noted ([4],Section 8):

This amounts to assuming that a system whose dynamical state has just left R isno more likely to return to it than one whose dynamical state was never anywherenear R. The validity of this assumption, at least in the short run, is dubious, butat least it provides us with some indication of what to expect.

In this paper we want to give a different model for the same phenomenology that overcomes thelast difficulty. We will work with stochastic processes rather than deterministic dynamics, but theLebowitz-Penrose modelization will be our guideline. We will try to recover this phenomenologyunder simple and practical hypotheses only. Since the study of metastability has been considerablyenriched after the Lebowitz and Penrose work, we want also to incorporate in our modeling asmuch as possible of what was previously achieved. We will then make a brief and partial reviewof these achievements. Our goals and starting ideas will depend on this review but not our proofs,since we want to make this paper as self-contained as possible. Our model and results are presentedin Section 2. Examples of applications are given in Section 7.

Date: June 11, 2015.2010 Mathematics Subject Classification. 82C26, 60J27, 60J75, 60J45.Key words and phrases. Metastability, restricted ensemble, quasi-stationary measure, soft measures, exponential law,

spectral gap, mixing time, potential theory.Supported by GDRE 224 GREFI-MEFI, by the European Research Council through the Advanced Grant PTRELSS 228032,

and by the FIRB Project RBFR10N90W (MIUR, Italy). A. Bianchi acknowledges the University of Padua that provided partialfinancial support through the Project Stochastic Processes and Applications to Complex Systems (CPDA123182).

1

Page 3: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 2

1.2. A partial review. Since the Lebowitz and Penrose paper, an enormous amount of work hasbeen done to describe the metastability phenomenon. In particular Cassandro, Galves, Olivieri andVares introduced the path-wise approach, which focused, in the context of stochastic processes, ontime averages associated with an asymptotic exponential law [6]. This was further developed bythe pioneering works of Neves and Schonmann who studied the typical paths for stochastic Isingmodel in a given volume in the low temperature regime ([8], [10]). This work was then extended tohigher dimensions, infinite-volume and fixed-temperature regimes, locally conservative dynamicsand probabilistic cellular automata ([15], [13], [19], [27], [33], [45]).

As developed in [28], a crucial role was played by large deviation tools inherited from Wentzelland Freidlin in their reduction procedure from continuous stochastic processes to finite-configuration-space Markov chains with exponentially small transition rates [7]. This is especially true for very-low-temperature regimes, but the same kind of reduction procedure made it possible to deal invarious cases with large-volume rather than low-temperature limits (see [18] for the Curie-Weissmodel under random magnetic field, see [28] for further examples).

Then, using potential-theoretic rather than large deviations tools, Bovier, Eckhoff, Gayrard andKlein developed a set of general techniques to compute sharp asymptotics of the expected valueof asymptotic exponential laws associated with the metastability phenomenon, and revisited (af-ter [5], [12]) the relation between spectrum of the generator of the stochastic dynamics andmetastability ([20], [23], [26]). This allowed, for example, to go beyond logarithmic asymp-totics for stochastic Ising models in the low-temperature regime ([21], [29]) and to prove the firstrigorous results in the fully conservative case ([38]), to deal with metastability for the randomhopping-time dynamics associated with the Random Energy Model ([22]), to make a detailed anal-ysis of Sinai’s random walk spectrum ([32]), and to extend the study of the disordered Curie-Weissmodel to the case of continuous magnetic field distribution ([34], [44]). We refer to [48] for acomprehensive account of this approach.

We then reached an essentially complete comprehension of the metastability phenomenon inat least two classes of models: very low temperature dynamics in finite fixed volumes and largevolume or continuous-configuration space dynamics that can be reduced via a Wentzell-Freidlinprocedure to the previous case. Of course, specific and often nontrivial computations have to bemade for each specific model, but there exists a general approach to the problem that is developedin [28] and, as far as the potential-theoretic part is concerned, [20], [23], and [26] together with[41], [46] that bridges between potential theory and typical path description by reinforcing andgeneralizing the results of [11] (and it is worth noting that [41], after [37, 43], contemplates alsothe case of polynomially small rather than only exponentially small transition probabilities). Forboth classes of models, like one-dimensional metastable systems as considered in [32] or [31], arecurrence property for a very localized subset of the configuration space (single configurationsidentified to metastable states in the first case, small neighborhoods of the dynamics attractors inthe second case) plays an important role.

Beyond these two classes of models there are many limit cases, special cases, and partial results.For example, in [22] we are far from a finite-fixed-volume situation but single configurations canstill be identified with metastable states and have still enough mass at equilibrium for potential-theoretic or renewal techniques to work. This is not the case in [38], where potential-theoretictools give only expected values of some hitting times when the system is started from some specificharmonic measures that are very different from what one would expect to be a “metastable state”(here, like in the sequel and following Lebowitz and Penrose, we mean a whole measure whenreferring to a metastable “state” and not to a single configuration of the configuration space). Anykind of exponential law is presently also lacking in this case. The same difficulty is faced in [34],but it is overcome in [44] by mean of a specific coupling argument that gives point-wise estimatesand opens the way to the exponential law. We also note that [46] develops some general martingaleideas to deal with the same issues within the framework built from [37, 41]. In fact, though workingwith a different setup, we share some of the leading ideas developed in [46], which uses some ofthe key objects that we introduced through this work. Such ideas also inspired [47] where the non-reversible situation is contemplated. Finally, the beautiful paper by Schonmann and Shlosman [19]achieves the tour de force of using essentially equilibrium statistical mechanics computations to dealwith the dynamical problem of metastability. In this case also the exponential law is lacking as well

Page 4: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 3

as sharp estimates on the relaxation time, and even the simple formulation of such properties is notcompletely obvious in this fixed-temperature and vanishing-magnetic-field regime.

1.3. Starting ideas. In the present paper we want to elaborate some tools to describe the metasta-bility phenomenon beyond the case of a dynamics with a recurrence property for a very localizedsubset of the configuration space. We will focus on exponential laws and sharp asymptotics oftheir expected values. We note that the exponential law itself suggests some kind of recurrenceproperty. If it is not a recurrence property for a very localized subset, it has to be in some sensea recurrence property to a whole “spread measure”. And this measure should coincide with ourmetastable state. Now, following Lebowitz and Penrose, if we associate the metastable state withsome subset R of the configuration space X , then, considering property (b), we have at least twocandidates to describe our metastable state: one is the restricted ensemble µR = µ(·|R), the otheris the quasi-stationary distribution

µ∗R = limt→+∞

PµR(X(t) ∈ ·|τX\R > t) (1.1)

where X(t) is the configuration of the system at time t and τX\R is the exit time of R (we willbe more precise in the next section). Notice that Eq. (1.1) provides the stationarity of µ∗R for theprocess conditioned to not having exit R,

Pµ∗R(X(t) ∈ ·|τX\R > t

)= µ∗R , (1.2)

and thus explains the name of quasi-stationary distributions.The main advantage of µR is that µR is often an explicit measure one can compute with, while

µ∗R is only implicitly defined. The main advantage of µ∗R is that the exit law of R for the systemstarted in µ∗R is an exponential law. Our first results will then start, as in [9], with a comparisonbetween µR and µ∗R. We will give simple and practical hypotheses to ensure that they are close insome sense, then we will be able to prove some kind of recurrence property for µ∗R. In doing so wewill also answer some problems left open in [9] (see our comment after formula (2.31)). All thiswill be done in the simplest possible setup: considering a Markov process on a finite configurationspace in some asymptotic regime (including the possibility of sending to infinity the cardinality ofthe configuration space).

In the present work we will essentially build on the ideas of four different papers: [4] for theformulation of the problem, [6] for the focus on exponential laws, [20] for the introduction ofpotential-theoretic techniques in the metastability field to get sharp estimates on some mean hittingtimes, and Miclo’s work [40] where some concepts of local equilibrium, and “hitting times” of suchequilibriums, are introduced. As far as this last paper is concerned, it will only work as a source ofinspiration: we will not require a full spectrum knowledge, and we will not introduce any notionof dependence of a local equilibrium on the initial condition. Finally, we note that the idea ofconsidering quasi-stationary measures as metastable states was already contemplated in [24]. Eventhough some of our results echo some of [24], we were not able to make any clear comparison,essentially because of the much more analytical point of view of [24] and the many hypothesesintroduced in the results of [24]. We note that [24] deals with a much more general setup thanours since the authors consider non-reversible Markov processes on a continuous configurationspace, while we look at reversible Markov processes on finite configuration space. However, thereason why we assume reversibility is to be able to use potential-theoretic results to get sharpestimates on mean times via variational principles, a question that is not considered in [24].

1.4. Two new objects. In this section we provide a brief explanation on the two main new objectsof this paper, that we will progressively describe in the sequel : (κ, λ)-capacities and soft-measures.To understand their meaning, beyond their definitions, we can start from the main formula intro-duced in the context of metastable dynamics by [20]. Given a reversible and irreducible Markovprocess X : t 7→ X(t) ∈ X , and for any two disjoint and non-empty subsets A and B of X , it holds

EνA [τB ] =µ(VA,B)

C(A,B), (1.3)

where νA is the so-called harmonic measure on A (which actually depends also on B), τB is thehitting time of B, µ(VA,B) is the mean value, w.r.t. the equilibrium measure µ, of the “equilibriumpotential” between A and B, and C(A,B) is the capacity between A and B. This formula hadin particular two crucial advantages. First, it allowed to describe the metastability phenomenon

Page 5: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 4

essentially only through the computation of mean hitting times. Second, the most relevant partin the right-hand side is the capacity appearing in the denominator, which has the key property ofsatisfying two variational principles which, in turn, can be used to get sharp estimates just by usingtest functions to obtain upper bounds and test flows to obtain lower bounds. Using this formulaone has however to cope with three interlinked difficulties, which, depending on the consideredmodel, can or cannot be easily overcame:

i) the choice of the family of sets A and B can be delicate;ii) there is in general no variational principle to help in estimating the mean potential at the

numerator of the right-hand side;iii) the harmonic measure νA is in general very different from the natural measures associated

with metastability, say µR or µ∗R.Let us rapidly explain these three points. Formula (1.3) is in its very nature associated with theMarkov process stopped at time τB . Our previous discussion explains why it will not be sufficientjust to choose B = X \ R. Thus, in general, one has to consider a family of sets B that are“deep inside X \ R”, and for a symmetric reason, the family of sets A should be chosen “deepinside R” too. But then, the deeper are these chosen sets, the harder turns the estimation of themean potential. Moreover, while µR and µ∗R are usually concentrated deep inside R (and A), theharmonic measure νA of formula (1.3) has support on the border of A. In general, this makesuneasy the comparison between the Markov process started from νA and the system started from a“metastable equilibrium”. We point out that this last difficulty is actually the exact counterpart in Aof the fact that (1.3) deals with a Markov process stopped in B (this possibly not obvious fact canbe well understood by looking at the proof of (1.3)).

The two objects that we introduce in this work, are partially intended to deal with these dif-ficulties. The (κ, λ)-capacities are capacities computed in an extended network that is associatedwith a Markov process stopped at rate λ in B and for which κ plays a symmetrical role in A (justlike, when discussing (1.3), we noted that the fact that νA was concentrated on the border of Awas the counterpart of the fact that (1.3) was dealing with a process stopped in B). We will thenbe able to build on (1.3) with a Markov process that can penetrate B. This will allow to simplychoose B = X \ R, rather than a family of subsets of X \ R, and to compute the “mean transi-tion times” from metastable to stable states by estimating the (κ, λ)-capacities. Symmetrically, theparameter κ will be used to deal with measures that are concentrated deep inside A = R. In ad-dition, the parameter λ will be used to interpolate between the restricted ensemble µR (at λ = 0)and the quasi-stationary measure µ∗R (at λ = +∞). These interpolating measures will be our soft-measures; they are the quasi-stationary measures of the trace on R of the process killed outside Rat rate λ. In some sense, they are intended to keep the idea of characterizing metastability throughthe computation of mean hitting times, for which we can benefit of the classical potential theoryand of its variational principles. Though formula (1.3) will be used in our proofs, we will derivenew (asymptotic) equations expressing these mean hitting times in terms of quantities that satisfytwo-sided variational principles only, and do not involve mean potentials.

Finally, we stress that the difficulties arising when using (1.3) to describe a metastable dynamicswill not magically disappear by using soft-measures or (κ, λ)-capacities instead of standard capac-ities. They are actually deferred into the estimation of the local relaxation times, called γ−1

R andγ−1X\R in the sequel. However, in doing so, we can benefit from the huge mathematical literature

dealing with the computation of rates of convergence to equilibrium. In this paper we will alsoprove a new Poincare inequality, adding one more tool in this respect. And at this point, we shouldstress that the hypotheses that these local relaxation times should satisfy in order to apply our re-sults, do not require sharp estimates. Rough estimates will be enough to find large windows inwhich choosing our parameters κ and λ to obtain, through the use of variational principles, sharpestimates on the global relaxation time (γ−1 in the sequel).

2. MODEL AND RESULTS

2.1. Quasi-stationary measure and restricted ensemble. We consider a continuous-time Markovprocess X on a finite set X with generator defined by

Lf(x) =∑y∈X

p(x, y)(f(y)− f(x)) (2.1)

Page 6: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 5

for x in X and f : X → R, and where p is such that∑y

p(x, y) = 1 . (2.2)

Since X is finite, any generator can be written like in (2.1) up to time rescaling. We assume thatX is irreducible and reversible with respect to some probability measure µ, we denote by 〈· , ·〉 thescalar product in `2(µ), by ‖ · ‖ the associated 2-norm, by D the Dirichlet form defined by

D(f) = 〈f,−Lf〉 =1

2

∑x,y∈X

c(x, y) [f(x)− f(y)]2 (2.3)

where each conductance c(x, y) is equal to

c(x, y) = µ(x)p(x, y), (2.4)

and by γ the spectral gap

γ = minVarµ(f)6=0

D(f)

Varµ(f). (2.5)

For R ⊂ X we define in each x ∈ R the escape probability (or rate)

eR(x) =∑y 6∈R

p(x, y) (2.6)

and we denote by XR the reflected process (or restricted process) with generator given by

LRf(x) =∑y∈R

pR(x, y)(f(y)− f(x)) (2.7)

for x in R and f : R → R, and where, for all x, y in R,

pR(x, y) =

{p(x, y) if x 6= y,p(x, x) + eR(x) if x = y.

(2.8)

We will only consider subsets R such that both XR and XX\R are irreducible and we note that XRinherits from X the reversibility property with respect to the restricted ensemble

µR = µ(·|R). (2.9)

We identify `2(µR) with the subset of `2(µ) of functions f : X → R such that f |X\R ≡ 0 and wedenote by 〈· , ·〉R, ‖ · ‖R, DR, cR(x, y) and γR the associated scalar product, 2-norm, Dirichlet form,conductances for x, y in R and spectral gap.

We denote by p∗R the sub-Markovian kernel on R such that, for all x, y in R,

p∗R(x, y) = p(x, y). (2.10)

We know from [3] and the Perron-Frobenius theorem that there exists φ∗R > 0 such that 1 − φ∗R isthe spectral radius of p∗R and that there is a unique quasi-stationary measure µ∗R such that µ∗Rp

∗R =

(1− φ∗R)µ∗R. In addition we have, for all x, y in R and t ≥ 0, with τX\R the exit time from R, i.e.,the hitting time of X \ R,

limt→+∞ Px(X(t) = y|τX\R > t) = µ∗R(y), (2.11)

Pµ∗R(τX\R > t) = e−φ∗Rt, (2.12)

µ∗R(eR) = φ∗R. (2.13)

The limit in (2.11) is called a Yaglom limit after Yaglom showed the existence of such limits in thecase of branching processes [2]. In our context of finite state spaces, the existence of such a limit,that does not depend on the starting point x, simply follows from the Perron-Frobenius theorem.In Sections 2.3 and 6 these properties will be rederived in a slightly more general context.

Our first result states that if 1/φ∗R, the mean exit time for the system started in µ∗R, is large withrespect to 1/γR, the relaxation time of the reflected process, then the quasi-stationary measure µ∗R

Page 7: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 6

is close to the restricted ensemble µR. This is similar to Lemma 10 (b) in [9]. More precisely, forall x in R, let us define

ε∗R =φ∗RγR

(2.14)

h∗R(x) =µ∗R(x)

µR(x)(2.15)

and notice that h∗R is a right eigenvector of p∗R with eigenvalue 1− φ∗R. We prove the following.

Proposition 2.1. If ε∗R < 1, then

VarµR(h∗R) = ‖h∗R − 1R‖2R ≤ε∗R

1− ε∗R(2.16)

Proof. See Section 3.1. �

Remark. When proving that ε∗R goes to 0 in some asymptotic regime (for example when the cardi-nality of the configuration space goes to infinity like in [6], when some parameter of the dynamicsgoes to 0 like the temperature in [8] or when both happen like in [36]) one has to give upperbounds on φ∗R and lower bounds on γR. φ∗R satisfies a variational principle through which onecan get such upper bounds using suitable test functions. In particular, since one can often easilycompute with µR, and eR is often explicit, one can usually estimate

φR = µR(eR) (2.17)

and then bound φ∗R with φR. In some cases, for example in the low-temperature regime, thisestimate will already be good enough. More generally and precisely, we have the following lemma,that we prove in Section 3.2.

Lemma 2.2. φ∗R = minf 6=0

f|X\R=0

D(f)

‖f‖2≤ 1

EµR[τX\R

] ≤ φR.

Lower bounds on γR can be more difficult to obtain. However we note, first, that rough lowerbounds will often be sufficient to our ends, second, that the new Poincare inequality we will provein this paper (Theorem 2.10) can be used to this purpose (see Section 7.2).

As a consequence of this first result we can control the convergence rate of the Yaglom limitin (2.11). We note that, by the reversibility of X with respect to µ, p∗R is a self-adjoint operator on`2(µR) and has real eigenvalues. By the Perron-Frobenius theorem, this implies the existence of aspectral gap γ∗R > 0 equal to the difference between the first and the second largest eigenvalue ofp∗R.

Proposition 2.3. If ε∗R < 13 , then

1

γ∗R≤ 1

γR

{1− ε∗R1− 3ε∗R

}. (2.18)

Proof. See Section 3.3. �

Remark. Since, after the static study made in [42], we intend to apply our results to the dynamicalstudy of the cavity algorithm introduced in [30], for which finite-volume effects are of first impor-tance, we need to give asymptotics with quantitative control of corrective terms. This producesquite long formulas and to simplify the reading we put between curly brackets any terms that go to1 in a suitable asymptotic regime.

Then we setζ∗R = min

x∈RµR(x)h∗R

2(x) = minx∈R

µ∗R(x)h∗R(x) , (2.19)

which is the mass of the smallest atom of the measure µR biased by h∗R2, and define, if ε∗R < 1/3

and for any δ ∈]0, 1[,

T ∗δ,R =1

γR

(ln

2

δ(1− δ)ζ∗R

){(1 +

√ε∗R

1− ε∗R

)(1− ε∗R1− 3ε∗R

)}(2.20)

to get point-wise mixing estimates for Yaglom limits.

Page 8: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 7

Theorem 2.4 (Mixing towards quasi-stationary measure). If ε∗R < 1/3, then for all x, y ∈ R andδ ∈]0, 1[, ∣∣∣∣Px(X(t) = y | τX\R > t)

µ∗R(y)− 1

∣∣∣∣ < δ as soon as t > T ∗δ,R (2.21)

Proof. See Section 3.4. �

Remark. In words, this says that either the system leaves R before time T ∗δ,R, or it is describedafter that time by µ∗R in the strongest possible sense. This theorem is useful only if one can provideupper bounds on T ∗δ,R. Bounding T ∗δ,R depends on the control we have on ε∗R and on this newparameter ζ∗R. As far as the latter is concerned, we note that it only appears in the formula throughits logarithm. Crude or very crude estimates of ζ∗R will then often be sufficient. One has for examplethe following lemma.

Lemma 2.5.i) With ζR = minx∈R µR(x) and αR = maxx∈R eR(x), it holds

ln1

ζ∗R≤ ln

4

ζR+αRγR

[ln

4ε∗R(1− ε∗R)ζR

]+

, (2.22)

where the brackets [·]+ stand for the positive part.ii) If p(x, x) > 0 for all x ∈ R, then

ln1

ζ∗R≤ ln

1

minx∈R µ∗R2(x)

≤ 2∆RDR , (2.23)

where∆R = max{− ln pR(x, y) : pR(x, y) > 0 , ∀x, y ∈ R}DR = min{k ≥ 0 : pkR(x, y) > 0 , ∀x, y ∈ R} .

Proof. See Appendix A. �

Also, since h∗R is superharmonic onRwith respect to L (see Appendix A), it reaches its minimumon the internal border of R,

∂−R = {x ∈ R : ∃y 6∈ R, p(x, y) > 0} . (2.24)

Then we always have

ζ∗R ≥(

minx∈R

µR(x)

)(minx∈∂−R

h∗R(x)

)2

, (2.25)

and in the special case when ∂−R reduces to a singleton, this gives, by (2.13) and (2.17),

ζ∗R ≥ minx∈R

µR(x)

(φ∗RφR

)2

. (2.26)

Bounding ζ∗R from below essentially reduces in this case to giving a lower bound on φ∗R, which isone of the main goals of this paper (see Theorem 2.9).

We will make a special choice for the parameter δ in (2.20): we define

T ∗R = T ∗ε∗R,R. (2.27)

We then have

φ∗RT∗R ≤ ε∗R

(ln

3

ε∗Rζ∗R

){(1 +

√ε∗R

1− ε∗R

)(1− ε∗R1− 3ε∗R

)}(2.28)

as soon as ε∗R < 1/3. We will sometimes refer in the sequel to the regime φ∗RT∗R � 1. Equation

(2.28) provides a sufficient and practical condition for being in such a regime. We close this sectionwith a first asymptotic exponential law in this particular regime.

Theorem 2.6 (Asymptotic exit law). For any probability measure ν onR, define πR(ν) = Pν(τX\R <T ∗R) . If ε∗R < 1/3, then, for all t ≥ φ∗RT ∗R,{

Pν(τX\R > tφ∗R

) ≤ (1− πR(ν))e−t{eφ∗RT∗R(1 + ε∗R)

}Pν(τX\R > t

φ∗R) ≥ (1− πR(ν))e−t

{eφ∗RT∗R(1− ε∗R)

} .

Proof. See Section 4.1 �

Page 9: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 8

Remark. The theorem gives more than an asymptotic exponential exit law. It says that, providedπR(ν) converges to some limit and in the regime φ∗RT

∗R � 1, the normalized mean exit time φ∗RτX\R

converges in law to a convex combination between a Dirac mass in 0 and an exponential law withmean 1.

As an example of an application we can consider the case of the restricted ensemble.

Lemma 2.7. It holds

πR(µR) = PµR(τX\R ≤ T ∗R) ≤ 1

2

√ε∗R

1− ε∗R+ φ∗RT

∗R . (2.29)

Proof. See Section 4.2. �

This shows an asymptotic exponential exit law in the regime φ∗RT∗R � 1 for the system started

in the restricted ensemble.Another consequence of Theorem 2.6 is that, in the regime φ∗RT

∗R � 1, µ∗R asymptotically

maximizes the mean exit time on the set of all possible starting measures. This can be seen ina different way by following [9]. Consider, for any t > 0 the natural coupling up to time t ∧ τX\Rbetween X started from a measure ν and a process that starts from X(0), follows the law of thereflected process up to time t, and then the same law as the original process. This last processcannot escape from R before X and we get,

Eν[τX\R

]≤ t+

(1 +

e−γRt

ζR

)EµR

[τX\R

], (2.30)

with, as previously, ζR = minx∈R µR(x). Using Lemma 2.2 and optimizing in t one gets

Eν[τX\R

]≤ 1

φ∗R

{1 + ε∗R + ε∗R ln

1

ε∗RζR

}. (2.31)

We already mentioned that φ∗R can be estimated from above by using test functions in a vari-ational principle (see Lemma 2.2). One of the questions raised in [9] is that of upper bounds onmean exit times, i.e., that of lower bounds for φ∗R. This is the question we will now adress.

2.2. (κ, λ)-capacities, mean exit times and a new Poincare inequality. In this section we in-troduce a new object which extends the notion of capacity between sets. For any κ, λ > 0 andA,B ⊂ X , we first extend the electrical network (X , c), with c(x, y) = µ(x)p(x, y) = µ(y)p(y, x) forall distinct x, y ∈ X , into a larger electrical network (X , c) by attaching a dangling edge (a, a) withconductance κµ(a) to each a ∈ A and a dangling edge (b, b) with conductance λµ(b) to each b ∈ B(this extension is related with some Markov chain modification considered in [14]. More precisely,we add |A|+ |B| nodes and edges to the network by setting

X = X ∪ {a : a ∈ A} ∪ {b : b ∈ B}

and, for all distinct x, y ∈ X we define

c(x, y) =

c(x, y) if (x, y) = (x, y) ∈ X × Xκµ(a) if (x, y) = (a, a) for some a ∈ Aλµ(b) if (x, y) = (b, b) for some b ∈ B0 otherwise

. (2.32)

This extended network is naturally associated with a family of “two level Markov processes” thatevolve like X in X , “go down” from A and B in X to A and B at rate κ and λ respectively, and “goup” from A and B to A and B in X at some rates tuning the equilibrium measure of such processesin X . (We will use in the proof of our results this liberty in choosing the rates of this family ofprocesses associated with this unique extended electrical network.)

Page 10: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 9

Definition 2.8. The (κ, λ)-capacity, Cλκ (A,B), is defined as the capacity between the sets A and Bin the electrical network (X , c), and then is given, according to Dirichlet principle, by

Cλκ (A,B) = minf :X 7→R

1

2

∑x,y∈X

c(x, y)[f(x)− f(y)]2; f|A = 1 , f|B = 0

= minf :X 7→R

D(f) + κ∑a∈A

µ(a)[f(a)− 1]2 + λ∑b∈B

µ(b)[f(b)− 0]2

= minf :X 7→R

D(f) + κµ(A)EµA[(f|A − 1)2

]+ λµ(B)EµB

[(f|B − 0)2

].

(2.33)

Remarks.

i) Since all the points of A and B are at potential 1 and 0 respectively in formula (2.33), theyare electrically equivalent and we could have defined the (κ, λ)-capacity between A andB by adding just two nodes to the electrical network (X , c). However, our definition withdangling edges will be more useful in the sequel.

ii) A (κ, λ)-capacity is in some sense easy to estimate since it satisfies a two-sided variationalprinciple. On one hand, by definition, it is the infimum of some functional, and any testfunction will provide an upper bound. On the other hand it is the supremum of anotherfunctional on flows from A to B, which are antisymmetric functions of oriented edges withnull divergence in X , i.e., on functions ψ : X × X 7→ R such that for all x ∈ X \ (A ∪ B),divxψ =

∑x∈X ψ(x, x) = 0. This is Thomson’s principle that goes back to [1], Chapter 1,

Appendix A (see also lecture notes [35] for a more modern presentation or textbook [16]for the proof of an almost equivalent result). Letting

D(ψ) = 12

∑x,y∈X

ψ(x, y)2

c(x, y),

be the energy dissipated by the flow ψ in the network (X , c), and Ψ1(A, B) the set ofunitary flows from A to B, that is, the set of flows ψ from A to B such that∑

a∈A

divaψ =∑a∈A

∑x∈X

ψ(a, x) = 1 = −∑b∈B

divbψ = −∑b∈B

∑x∈X

ψ(b, x) , (2.34)

we haveCλκ (A,B) = max

ψ∈Ψ1(A,B)D(ψ)−1 . (2.35)

If A ∩B = ∅ this gives

Cλκ (A,B) = maxψ∈Ψ1(A,B)

(D(ψ) +

∑a∈A

(divaψ)2

κµ(a)+∑b∈B

(divbψ)2

λµ(b)

)−1

(2.36)

= maxψ∈Ψ1(A,B)

(D(ψ) + 1

κµ(A)EµA[(

divψµA

)2]

+ 1λµ(B)EµB

[(divψµB

)2])−1

,

where Ψ1(A,B) is the set of unitary flows ψ from A to B and

D(ψ) = 12

∑x,y∈X

ψ(x, y)2

c(x, y).

Then, any test flow provides a lower bound on Cλκ (A,B).iii) We know ([16], [35]) that the infimum and supremum in (2.33) and (2.36), are realized,

respectively, by the equilibrium potential V λκ = P(·)(`−1A (σκ) < `−1

B (σλ)), where `−1A and

`−1B are the right continuous inverses of the local times in A and B, while σκ and σλ are

independent exponential times with rates κ and λ, and by its associated normalized current

− c∇V λκCλκ (A,B)

: (x, y) ∈ X × X 7−→ c(x,y)Cλκ (A,B)

(V λκ (x)− V λκ (y)) . (2.37)

We will say more on such quantities in the next section.

Page 11: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 10

iv) The previous definitions and observations extend to the case when κ and λ are equal to+∞. In that case we identify A with A in the extended network if κ = +∞, or B withB if λ = +∞, and we drop the infinite upper or lower index in the notation, so that, forexample, Cκ(A,B) = C∞κ (A,B). However, when κ and λ are both equal to +∞, to avoidany ambiguity we need to require that A ∩ B = ∅. In that case the notation becomesC(A,B) = C∞∞ (A,B) and we recover indeed the usual notion of capacity.

We then get sharp asymptotics on mean exit times for the system started in the quasi-stationarymeasure.

Theorem 2.9 (Mean exit time estimates). For all κ > 0, it holds

Cκ(R,X\R)µ(R)

{1− ε∗R − κ

γR

}≤ φ∗R ≤

Cκ(R,X\R)µ(R)

{1− Cκ(R,X\R)

κµ(R)

}−2

(2.38)

Proof. See Section 5. �

Remarks.

i) In the regime ε∗R � 1, one can choose κ such that φ∗R � κ � γR and infer, by the lowerbound in (2.38), that κ� Cκ(R,X\R)

µ(R) . In turns, this yields an asymptotical matching upperbound.

ii) Both bounds are in some sense easy to estimate since capacities satisfy a two-sided vari-ational principle. Moreover, compared with the formula for mean exit time provided bypotential-theoretic techniques (see, e.g., [20]), the above inequalities require no residualaverage potential estimates. (Such estimates, as well as some harmonic measures will onlyplay a role in the proof of the theorem.)

Our (κ, λ)-capacities provide also spectral gap estimates and a new general Poincare inequality.For κ, λ > 0 and A,B ⊂ X we set

φλκ(A,B) =Cλκ (A,B)

µ(A)µ(B)= φκλ(B,A) . (2.39)

Theorem 2.10 (Relaxation time estimates). For all κ, λ > 0 and any R ⊂ X such that XR andXX\R are both irreducible Markov processes, 1

γ ≥1

φλκ(R,X\R)

{1− Cκ(R,X\R)

κµ(R) − Cλ(R,X\R)λµ(X\R)

}2

1γ ≤

1φλκ(R,X\R)

{1 + max

(κ+φλκ(R,X\R)

γR,λ+φλκ(R,X\R)

γX\R

)} . (2.40)

Proof. See Section 5. �

Remarks.

i) Without loss of generality, we can assume µ(R) ≤ µ(X \R) so that, by (2.39), φλκ ≤2Cλκ (R,X \R)/µ(R). Then, as a consequence of the previous theorem and of the mono-tonicity in κ and λ of (κ, λ)-capacities, we get matching bounds on 1/γ in the regimeε∗R + ε∗X\R + φ∗R/γX\R � 1. One can indeed choose κ such that φ∗R � κ � γR, just asfor Theorem 2.9 (Remark i)), and λ such that φ∗R, φ

∗X\R � λ � γX\R. In addition and like

previously, all the relevant quantities can be estimated by two-sided variational principles.ii) The lower bound is a generalization of the classical isoperimetrical estimate that is recov-

ered for κ = λ = +∞.iii) The upper bound is a new Poincare inequality. This inequality, or an easy-to-derive version

when one divides the configuration space into more than two subsets, echoes Poincareinequalities given in [25]. We are not able to compare in full generality our result withthat of [25] but we note that because of the presence of some global parameter called γin [25] one gets generally in our metastable situation an extra factor 1/min(γR, γX\R) byapplying the results of [25].

iv) The proof of this upper bound, when considering more than two subsets, extends verbatimto obtain the following result.

Page 12: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 11

Lemma 2.11. If R1, R2, . . . , Rm form a partition of X for which each of the restrictedprocesses XRi is irreducible, if κ1, κ2, . . . , κm are positive real numbers and if we write γi forγRi and φ(i, j) = C

κjκi (Ri,Rj)/(µ(Ri)µ(Rj)), then

1

γ≤

1

2

∑i 6=j

1

φ(i, j)

1 +maxi

1γi

{1 +

∑j 6=i

κiφ(i,j)

}12

∑i 6=j

1φ(i,j)

. (2.41)

2.3. Soft measures, local thermalization, transition and mixing times. We address now the dif-ficulty raised by Lebowitz and Penrose. Whatever the measure we choose to describe our metastablestate, restricted ensemble or quasi-stationary measure, it is associated with some subset R of theconfiguration space. Then there is an ambiguity when one looks at property (b): what is “gettingout” of the metastable state? One is tempted to say that it corresponds in our model to the exitfrom R. But doing so we are very unlikely to modelize in any satisfactory way property (c): wecan expect that “on the edge”, when the system just exited R, it has probabilities of the same orderto “proceed forward” and thermalize in X \R and to go “backward” and thermalize in R. Thus wewould like to define what would be a “true escape” fromR. Theorem 2.4 suggests an answer in theregime φ∗X\RT

∗X\R � 1. We could define the true escape as the first excursion of length T ∗X\R inside

X \R. Since time randomization is almost always a good idea, we are led to consider a randomtimer, which is independent of the dynamics and has exponential distribution in order to keep theMarkovianity of the process. The timer starts when the dynamics exits R, but if it does not ringbefore returning to R, the excursion to X \R is ignored in the sense that it is not considered a “trueescape” from R. A “true escape” happens only when the timer rings during one of the excursionoutside R. This will lead to an extension of the concept of quasi-stationary distribution that inter-polate between µ∗R and µR and we will see (Theorem 2.19 below) that the system will actually beclose to equilibrium the first time when the timer will ring during an excursion outside R: it willhave truly escaped from metastability.

For any A ⊂ X we call

`A(t) =

∫ t

0

1A(X(s))ds (2.42)

the local time spent in A up to time t and we denote by `−1A the right-continuous inverse of `A:

`−1A (t) = inf{s ≥ 0 : `A(s) > t}. (2.43)

Recall that the process X can be seen as the process updated, according to its discrete version withtransition probability matrix p, at each ring of a Poissonian clock with intensity 1. Let us then callτ the first ringing time. For σλ an exponential time with mean 1/λ that is independent from X, wedefine for all x and y in R

p∗R,λ(x, y) = Px(X(τ+

R) = y, `X\R(τ+R) ≤ σλ

)(2.44)

with τ+R the return time in R after the first clock ring, i.e., τ+

R = τ + τR ◦ θτ with θ the usual shiftoperator. We also define, for all x in R,

eR,λ(x) = Px(`X\R(τ+R) > σλ) = 1−

∑y∈R

p∗R,λ(x, y) (2.45)

and for all x and y in R

pR,λ(x, y) =

{p∗R,λ(x, y) if x 6= y,

p∗R,λ(x, x) + eR,λ(x) if x = y.(2.46)

The Markov process XR,λ on R with generator defined by

LR,λf(x) =∑y∈R

pR,λ(x, y)(f(y)− f(x)) (2.47)

is reversible with respect to µR and has spectral gap

γR,λ = minVarµR (f)6=0

DR,λ(f)

VarµR(f)(2.48)

Page 13: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 12

where

DR,λ(f) =1

2

∑x,y

cR,λ(x, y)(f(x)− f(y))2 (2.49)

withcR,λ(x, y) = µR(x)pR,λ(x, y) = pR,λ(y, x)µR(y). (2.50)

In addition we defineT := `−1

X\R(σλ) (2.51)

τX\R,λ = `R(`−1X\R(σλ)). (2.52)

We will refer to τX\R,λ as the transition time, since, for suitable choices of λ, this is the time spentby the process in R before “truly escaping” from R, as seen by formula (2.65) in Theorem 2.19below.

Remark. While T is the global time such that the time spent in X \ R, during possibly manyexcursions, is equal to σλ, the time τX\R,λ is the local time on R associated to T . On one hand, itmay thus look natural to address the study toward the characterization of the global time T . Onthe other hand, the transition time τX\R,λ is a natural generalization of the exit time τX\R, givenin such a way that when the time τX\R,λ is reached, not only the dynamics has exited R but it hasalso spent in X \ R a time equal to σλ. With a little effort, we will then derive for τX\R,λ similarresults to those we have obtained for τX\R, and in particular its asymptotic exponential law (seeTheorem2.17). At this point one may then think to derive information on T by the identity

T = σλ + τX\R,λ ,

but since the random variables σλ and τX\R,λ are not independent, this representation of T is notimmediately useful. However, we will show that for a suitable range of λ the global time T andthe local time τX\R,λ are asymptotically of the same order, which is also the order of the relaxationtime (see Theorem 2.19 and remark below).

We know by the Perron-Frobenius theorem that the spectral radius of p∗R,λ is a simple positiveeigenvalue that is smaller than or equal to 1 and has left and right eigenvectors with positivecoordinates. We call it 1 − φ∗R,λ and denote by µ∗R,λ the unique associated left eigenvector that isalso a probability measure on R. We then have the following lemma.

Lemma 2.12. It holdsi) φ∗R,λ = µ∗R,λ(eR,λ) ;

ii) Pµ∗R,λ(τX\R,λ > t) = e−tφ∗R,λ , ∀t ≥ 0 ;

iii) limt→∞

Px(X ◦ `−1R (t) = y | τX\R,λ > t) = µ∗R,λ(y) , ∀x, y ∈ R.

Proof. See Section 6.1. �

We say that µ∗R,λ is a quasi-stationary measure associated with a soft barrier, or a soft quasi-stationary measure, or, more simply, a soft measure. Indeed, µ∗R,λ is the limiting distribution of theprocess conditioned to survival when it is killed at rate λ outsideR. So, the hardest quasi-stationarymeasure associated with R, corresponding to λ = +∞, is the quasi-stationary measure µ∗R, whilethe softest measure, corresponding to λ = 0, is the restricted ensemble µR (φ∗R,0 = 0 and µ∗R,0 isthe equilibrium measure associated with p∗R,0 = pR,0, which is reversible with respect to µR). Moreprecisely we have the following.

Lemma 2.13. The function λ ∈ [0,+∞] 7→ µ∗R,λ ∈ `2(µ∗R) is a continuous interpolation between therestricted ensemble µR and the quasi-stationary distribution µ∗R. In particular, for any λ0 ∈ [0,+∞]and y ∈ R, we have

limλ→λ0

µ∗R,λ(y) = µ∗R,λ0(y) (2.53)

and for all x ∈ R it holds the limit commutation property

limλ→λ0

limt→∞

Px(X ◦ `−1R (t) = y | τX\R,λ > t) = lim

t→∞limλ→λ0

Px(X ◦ `−1R (t) = y | τX\R,λ > t) . (2.54)

Proof. See Section 6.2. �

Page 14: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 13

Analogously to what was done in the case λ = +∞ we set ε∗R,λ = φ∗R,λ/γR,λ, h∗R,λ = µ∗R,λ/µRand we call γ∗R,λ the gap between the largest and the second eigenvalue of p∗R,λ (since p∗R,λ isself-adjoint with respect to 〈·, ·〉R it has only real eigenvalues). We also define φR,λ = µR(eR,λ).

Proposition 2.14. The parameters γR,λ, φ∗R,λ, ε∗R,λ and φR,λ depend continuously on λ. In addition,when λ decreases to 0, so do φ∗R,λ, ε∗R,λ and φR,λ, while γR,λ increases.

Proof. See Section 6.3. �

The proofs of Sections 3 carry over this more general setup, and we get, by defining the analo-gous T ∗δ,R,λ and αR,λ (while ζR, which is associated with µR rather than µ∗R, has no “λ-extension”),the following theorem.

Theorem 2.15 (Mixing towards soft measures). For all λ ≥ 0, φ∗R,λ ≤ φ∗R, γR,λ ≥ γR and ε∗R,λ ≤ε∗R, Proposition 2.1, Proposition 2.3, Theorem 2.4 and Lemma 2.5 hold with an extra index λ andwriting X ◦ `−1

R instead of X.

Remark. By continuity and monotonicity, the hypotheses ε∗R,λ < 1 and ε∗R,λ < 1/3 are alwayssatisfied for λ small enough.

We are now ready to deal with local thermalization: we will identify a “short” time scale onwhich, for any given starting point, the system will relax towards a mixture of “local equilibria”that are our quasi-stationary measures with soften barriers.

For a given κ ≥ 0, let σκ be an exponential time with mean 1/κwhich is independent fromX andfrom σλ. We think to σk as to the random time which enters in the construction of soft measuresover X \R, in the same way the random time σλ entered in the construction of soft measure overR. We define inductively, for κ, λ ≥ 0, the stopping times τi for i ≥ 0:

τ0 = 0, (2.55)

τ1 = `−1R (σκ) ∧ `−1

X\R(σλ), (2.56)

τi+1 = τi + τ1 ◦ θτi (2.57)

Then for δ ∈ (0, 1) we call i0 the smallest i ≥ 1 such that one of the two following conditions holds,

i) X(τi) ∈ R and `R(τi)− `R(τi−1) > T ∗δ,R,λ, (2.58)

ii) X(τi) 6∈ R and `X\R(τi)− `X\R(τi−1) > T ∗δ,X\R,κ, (2.59)

and we set τδ = τi0 .

Theorem 2.16 (Local thermalization). For any δ ∈ (0, 1) and any probability measure ν on X , ifε∗R,λ < 1/3 and ε∗X\R,κ < 1/3, then

max

(maxx∈R

∣∣∣∣∣Pν(X(τδ) = x |X(τδ) ∈ R)

µ∗R,λ(x)− 1

∣∣∣∣∣ , maxx∈X\R

∣∣∣∣∣Pν(X(τδ) = x |X(τδ) 6∈ R)

µ∗X\R,κ(x)− 1

∣∣∣∣∣)< δ .

(2.60)Moreover if ξ = max

(eκT

∗δ,R,λ − 1, eλT

∗δ,X\R,κ − 1

)< 1, it holds

Pν(τδ > t

(1κ + 1

λ

))≤ e−t

{1

1− ξ

}. (2.61)

Proof. See Section 6.4. �

Remark. For κ and λ small enough, we have ε∗R,λ < 1/3 and ε∗X\R,λ < 1/3. Then, when κ and λ

decrease to 0, we have non-increasing upper bounds on T ∗δ,R,λ and T ∗δ,X\R,κ. As a consequence, thecondition ξ < 1 will always be satisfied for κ and λ small enough and the theorem says that startingfrom any configuration the system is close to a random mixture of two states (µ∗R,λ and µ∗X\R,κ,close to µR and µX\R respectively) after a time of order T ∗δ,R,λ + T ∗δ,X\R,κ.

As previously we make special choices for the parameter δ and we set

T ∗R,λ = T ∗ε∗R,λ,R,λ and T ∗X\R,κ = T ∗ε∗X\R,κ,X\R,κ(2.62)

Page 15: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 14

We then have, as soon as ε∗R,λ < 1/3,

φ∗R,λT∗R,λ ≤ ε∗R,λ

(ln

3

ε∗R,λζ∗R,λ

){(1 +

√ε∗R,λ

1− ε∗R,λ

)(1− ε∗R,λ1− 3ε∗R,λ

)}. (2.63)

Now the proofs of Section 4 carry over this more general setup and we get asymptotic exponen-tial laws for the transition time τX\R,λ.

Theorem 2.17 (Asymptotic transition law). For all λ ≥ 0, Theorem 2.6, Lemma 2.7 and inequal-ity (2.31) hold with an extra index λ.

We can also give sharp estimates on the mean transition time and asymptotics of the mixingtime.

Theorem 2.18 (Mean transition time estimates). For all κ, λ > 0, setting φλκ = φλκ(R,X \R) (re-call (2.39)), it holds φ∗R,λ ≥

Cλκ (R,X\R)µ(R)

{1−µ(R)−2φ∗R,λ/λ

1−µ(R)

}{1−max

(κ+φλκγR

,λ+φλκγX\R

)},

φ∗R,λ ≤Cλκ (R,X\R)

µ(R)

{1 + ε∗R,λ + ε∗R,λ ln 1

ε∗R,λζR+

φ∗R,λκ

}.

(2.64)

Proof. See Section 6.5. �

Remarksi) In the regime ε∗R + ε∗X\R + φ∗R/γX\R � 1 and assuming µ(R) ≤ µ(X \R) one can chooseκ and λ in such a way that φ∗R � κ � γR and φ∗R, φ

∗X\R � λ � γX\R, and then we get

matching bounds provided ε∗R,λ � ln(1/ζR). Once again, all the relevant quantities can beestimated via a two-sided variational principle.

ii) This logarithmic term in the upper bound looks spurious. An upper bound without such aterm should hold but we were not able to derive it.

Theorem 2.19 (Mixing time asymptotics). For κ, λ > 0 and any x ∈ X , we define T = `−1X\R(σλ)

and νx = Px(X(T ) = ·). Then, if ε∗X\R,κ < 1/3

‖νx − µX\R‖TV ≤ 1

2ε∗X\R,κ + λT ∗X\R,κ, (2.65)

‖νx − µ‖TV ≤ µ(R) +

√ε∗X\R,κ

1− ε∗X\R,κ+ λT ∗X\R,κ . (2.66)

In addition, if

η = µ(R) + 2

(√ε∗X\R,κ

1− ε∗X\R,κ+ λT ∗X\R,κ

)<

1

2, (2.67)

then, with

tmix = inft≥0

{maxx∈X‖Px(X(t) = ·)− µ‖TV ≤

1

2

(η +

1

2

)}, (2.68)

we have

tmix ≤2

φ∗R,λ(

12 − µ(R)

) {1 + ε∗R,λ + ε∗R,λ ln1

ε∗R,λζR+φ∗R,λλ

}. (2.69)

Proof. See Section 6.6. �

Remark. The theorem makes sense in the regime ε∗R+ε∗X\R+φ∗R/γX\R � 1. One can then choose λsuch that φ∗R,λ � λ � γX\R,κ. If λT ∗X\R,κ � 1 then (1 + 2η)/4 can be made as close as (1 +

2µ(R))/4 < 1/2 as we want. If ε∗R,λ ln(1/ζR)� 1, then the theorem provides the correct order forthe mixing time, since the spectral gap goes like φ∗R,λ/µ(X \R) and µ(X \R) ≥ 1/2.

Let us finally summarize our results. To have a mathematical model of the metastability phe-nomenon described by properties (a)-(c), we first consider a reversible Markov process on a finitestate space X , and a subset R of X such that µ(R) < µ(X \ R), with µ the equilibrium measureof the process. We the describe metastable states by soft measures associated with R in the regimeε∗R + ε∗X\R + φ∗R/γX\R � 1. In this regime all soft measures are close to the restricted ensemble

Page 16: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 15

(Theorem 2.15). If we choose κ and λ such that φ∗R � κ� γR and φ∗R, φ∗X\R � λ� γX\R then we

can show

i) local thermalization towards the soft measure µR,λ or µX\R,κ starting from any configura-tion in X and on a short time scale 1

κ + 1λ (Theorem 2.16),

ii) exponential asymptotic transition time on a long time scale 1φ∗R,λ

∼ µ(R)Cλκ (R,X\R)

(Theo-

rems 2.17 and 2.18),iii) return time to metastable state on a still longer time scale 1

φ∗X\R,κ∼ µ(X\R)

Cλκ (R,X\R)(Theo-

rem 2.18 applied to X \R in place of R).

In addition relaxation and mixing times are of the same order as the mean transition time (Theo-rems 2.10 and 2.19) - in particular the relaxation time has the same exact asymptotic up to a factorµ(X \R) - while exit times are on long, but generally shorter, time scale (Theorem 2.9). And wenote once again, that all relevant quantities can be estimated via two-sided variational principles.

3. ANALYSIS IN `2(µR)

3.1. Proof of Proposition 2.1. We recall that the reflected process XR is reversible w.r.t. µRwith spectral gap γR. In particular, for any function f ∈ `2(µR), we have the Poincare inequalityVarµR(f) ≤ 1

γRDR(f), where DR(f) is the Dirichlet form of f given by

DR(f) = 〈f,−LRf〉µ =∑x,y∈R

µR(x)f(x)(δx(y)− pR(x, y))f(y) . (3.1)

Applying the Poincare inequality to h∗R, and exploiting the definition of pR and p∗R, we get

VarµR(h∗R) ≤ 1

γRDR(h∗R) =

1

γR

∑x,y∈R

µR(x)h∗R(x) (δx(y)− pR(x, y))h∗R(y)

=1

γR

µR(h∗R2)−

∑x,y∈R

µ∗R(x)pR(x, y)h∗R(y)

=

1

γR

µR(h∗R2)−

∑x,y∈R

µ∗R(x)(p(x, y) + δx(y)eR(x))h∗R(y)

≤ 1

γR

µR(h∗R2)−

∑x,y∈R

µ∗R(x)p∗R(x, y)h∗R(y)

.

(3.2)

From the last line, using that µ∗R is a left eigenvector of p∗R with eigenvalue (1 − φ∗R) and thatµ∗R(h∗R) = µR(h∗R

2) , we get

VarµR(h∗R) ≤ φ∗RγR

µR(h∗R2) =

φ∗RγR

(VarµR(h∗R) + 1) . (3.3)

Finally, rearranging the terms in the above inequality and from the hypothesis ε∗R =φ∗RγR

< 1, weobtain the required upper bound.

3.2. Proof of Lemma 2.2. Let us denote by L∗R the sub-Markovian generator associated to thekernel p∗R. For any function f ∈ `2(µR), this is defined as

(L∗

Rf)(x) = −f(x) +∑y∈R

p∗R(x, y)f(y) , (3.4)

and we have the following useful lemma:

Lemma 3.1. For all f ∈ `2(µR), it holds

DR(f) ≤ D(f)

µ(R)= 〈f,−L

Rf〉R . (3.5)

Page 17: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 16

Proof of Lemma 3.1 . For all x, y ∈ R with x 6= y, pR(x, y) = p(x, y). Then we have

DR(f) =1

2

∑x,y∈R

µR(x)pR(x, y) [f(x)− f(y)]2

=1

2

∑x,y∈R

µR(x)p(x, y) [f(x)− f(y)]2,

(3.6)

since only the terms in x 6= y matter in this sum. Thus, extending the sum to all x, y ∈ X ,

DR(f) ≤ 1

2

∑x,y∈X

µR(x)p(x, y) [f(x)− f(y)]2 ≤ D(f)

µ(R), (3.7)

and this provides the stated upper bound.To prove the equality, we recall that the space `2(µR) is identified with the subset of functions

f ∈ `2(µ) with f|X\R ≡ 0. Since, for all x, y ∈ R, it holds that µR(x) = µ(x)/µ(R) and p∗R(x, y) =

p(x, y), we have

D(f)

µ(R)=

1

µ(R)

∑x,y∈X

µ(x)f(x) (δx(y)− p(x, y)) f(y)

=∑x,y∈R

µR(x)f(x) (δx(y)− p∗R(x, y)) f(y)

= 〈f,−L∗

Rf〉R ,

(3.8)

which concludes the proof. �

We can now proceed with the proof of Lemma 2.2. Since 1− φ∗R is the largest eigenvalue of p∗R,we have

φ∗R = minf :R→Rf 6=0

〈f,−L∗Rf〉R〈f, f〉R

, (3.9)

then the equality in Lemma 2.2 is a consequence of Lemma 3.1. Taking f = 1R as test function in(3.9), we get

φ∗R ≤∑x∈R

µR(x)

1−∑y∈R

p∗R(x, y)

=∑x∈R

µR(x)

1−∑y∈R

p(x, y)

=∑x∈R

µR(x)eR(x) = φR ,

(3.10)

and it remains to prove that EµR[τX\R

]lies between 1/φR and 1/φ∗R.

Since, for any k ∈ N, (1− φ∗R)k is the largest eigenvalue of (p∗R)k, the same argument gives

1− (1− φ∗R)k ≤∑x∈R

µR(x)

1−∑y∈R

(p∗R)k

(x, y)

(3.11)

namely,

(1− φ∗R)k ≥∑x∈R

µR(x)∑y∈R

(p∗R)k

(x, y) . (3.12)

By summing on k ≥ 0 and with X the discrete time version of X, such that X follows X at eachring of a Poissonian clock of intensity 1, we have, with obvious notation,

1

φ∗R=∑k≥0

(1− φ∗R)k ≥∑k≥1

PµR(τX\R ≥ k) = EµR [τX\R] ≥ PµR(τX\R = 1) =1

φR. (3.13)

Page 18: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 17

3.3. Proof of Proposition 2.3. The second smallest eigenvalue of the sub-Markovian generatorL∗R, φ∗R + γ∗R, satisfies the variational formula

φ∗R + γ∗R = min

{〈f,−L∗Rf〉R〈f, f〉R

: f 6= 0 , 〈f, h∗R〉R = 0

}= min

{〈f,−L

Rf〉R : 〈f, h∗R〉R = 0 , 〈f, f〉R = 1} (3.14)

Let f be a function on R that realizes the minimum in the above definition, with 〈f, f〉R = 1. Since〈f, h∗R〉R = 0, we have

〈f, h∗R − 1R〉R = −〈f,1R〉R = −µR(f)

and then, by the Cauchy-Schwartz inequality together with Proposition 2.1,

µ2R(f) ≤ ‖f‖2R · ‖h∗R − 1R‖2R ≤

ε∗R1− ε∗R

. (3.15)

Now, writing the orthogonal decomposition f = µR(f) + g, with µR(g) = 0, we have

1 = ‖f‖2R = µ2R(f) + ‖g‖2R

and thus, from (3.15),

‖g‖2R = 1− µ2R(f) ≥ 1− ε∗R

1− ε∗R=

1− 2ε∗R1− ε∗R

.

Using g as a test function in

γR = min

{DR(h)

‖h‖2R: h 6= 0 , µR(h) = 0

}, (3.16)

we get

γR ≤1− ε∗R1− 2ε∗R

DR(g) =1− ε∗R1− 2ε∗R

DR(f) . (3.17)

From Lemma 3.1, and using that f was chosen in order to have 〈f,−L∗Rf〉R = φ∗R + γ∗R, we get

γR ≤1− ε∗R1− 2ε∗R

〈f,−L∗

Rf〉R =1− ε∗R1− 2ε∗R

(φ∗R + γ∗R) . (3.18)

Setting φ∗R = ε∗RγR and rearranging the terms in the last inequality, we get(1− 3ε∗R + ε∗R

2

1− 2ε∗R

)γR ≤

(1− ε∗R1− 2ε∗R

)γ∗R ,

which, under the hypothesis ε∗R < 1/3, implies

1

γ∗R≤ 1

γR

{1− ε∗R1− 3ε∗R

}.

3.4. Proof of Theorem 2.4. The proof is based on a classical trick to control mixing times withrelaxation times. For any probability measure ν on R, any f : R → R such that µ∗R(f) 6= 0 and anys, t ≥ 0, one can check that

Eν [f(X(s+ t))1{τX\R>s+t}]− µ∗R(f)Pν(τX\R > s+ t)

=∑y∈R

(Pν(X(s) = y , τX\R > s)− Pν(τX\R > s)µ∗R(y)

)×(Ey[f(X(t))1{τX\R>t}]− Py(τX\R > t)µ∗R(f)

).

(3.19)

Indeed, one can rewrite the right-hand side of the above equality as the sum of four terms, two ofwhich coincide with the two terms in the left-hand side by the Markov property, while the othertwo terms cancel using the quasi-stationarity property, i.e.

Eµ∗R[f(X(t)) | τX\R > t

]= µ∗R(f) . (3.20)

Page 19: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 18

As a consequence, by the Cauchy-Schwartz inequality one gets∣∣Eν [f(X(s+ t))1{τX\R>t}]− µ∗R(f)Pν(τX\R > s+ t)

∣∣≤∥∥∥Pν(X(s)=· , τX\R>s)

µR(·) − Pν(τX\R > s)h∗R(·)∥∥∥R

×∥∥∥E(·)[f(X(t))1{τX\R>t}]− P(·)(τX\R > t)µ∗R(f)

∥∥∥R.

(3.21)

We now estimate these two factors. Noting that

Pν(X(s) = · , τX\R > s) = νesL∗R(·) and E(·)[f(X(t))1{τX\R>t}] = etL

∗Rf(·) ,

and diagonalizing the self-adjoint operator L∗R in an orthonormal basis, one gets∥∥∥Pν(X(s)=· , τX\R>s)µR(·) − ‖ ν

µR‖R

h∗R‖h∗R‖R

cos θν e−φ∗Rs

∥∥∥2

R≤ ‖ ν

µR‖2R

sin2 θνe−2s(φ∗R+γ∗R) , (3.22)

with θν ∈ [0, π/2[ such that ‖ νµR‖R‖h∗R‖R cos θν = 〈 νµR , h

∗R〉 = ν(h∗R) , and∥∥∥E(·)[f(X(t))1{τX\R>t}]− ‖f‖R

h∗R‖h∗R‖R

cos θf e−φ∗Rt

∥∥∥2

R≤ ‖f‖2

Rsin2 θfe

−2t(φ∗R+γ∗R) (3.23)

with θf ∈ [0, π] \ {π/2} such that ‖f‖R‖h∗R‖R cos θf = 〈f, h∗R〉 = µ∗R(f) .Moreover, since

Pν(τX\R > s) = µR

(Pν(X(s)=· , τX\R>s)

µR(·)

),

by the Cauchy-Schwartz inequality and using (3.22) we get∣∣∣Pν(τX\R > s)− ‖ νµR‖R cos θν‖h∗R‖R

e−φ∗Rs∣∣∣ =

∣∣∣µR (Pν(X(s)=· , τX\R>s)µR(·) − ‖ ν

µR‖R

h∗R‖h∗R‖R

cos θν e−φ∗Rs

)∣∣∣≤⟨1R,

∣∣∣Pν(X(s)=· , τX\R>s)µR(·) − ‖ ν

µR‖R

h∗R‖h∗R‖R

cos θν e−φ∗Rs

∣∣∣⟩R

≤ ‖ νµR‖R sin θνe

−s(φ∗R+γ∗R) .

(3.24)

Using inequalities (3.22) and (3.24), we finally get∥∥∥Pν(X(s)=· , τX\R>s)µR(·) − Pν( τX\R > s)h∗R(·)

∥∥R

≤∥∥∥Pν(X(s)=· , τX\R>s)

µR(·) − ‖ νµR‖R

h∗R‖h∗R‖R

cos θν e−φ∗Rs

∥∥∥R

+∥∥∥(Pν(τX\R > s)− ‖ ν

µR‖R cos θν‖h∗R‖R

e−φ∗Rs)h∗R

∥∥∥R

≤ ‖ νµR‖R(1 + ‖h∗R‖R) sin θν e

−s(φ∗R+γ∗R) .

(3.25)

which provides an estimate of the first factor in (3.21).To what concerns the second factor, noting that

P(·)(τX\R > t) = E(·)[1R(X(t))1{τX\R>t}]

and that, from the definition of cos θf applied to f = 1R,

cos θ1R = 1‖h∗R‖R

and sin2 θ1R =‖h∗R‖

2

R−1

‖h∗R‖2R=

VarµR (h∗R)

‖h∗R‖2R,

from inequality (3.23) we get∥∥∥P(·)(τX\R > t)− h∗R‖h∗R‖2R

e−φ∗Rt∥∥∥2

≤ VarµR (h∗R)

‖h∗R‖2Re−2t(φ∗R+γ∗R) . (3.26)

Then, from inequalities (3.23) and (3.26),

Page 20: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 19

∥∥E(·)[f(X(t)) 1{τX\R>t}]− P(·)(τX\R > t)µ∗R(f)∥∥∥R

≤∥∥∥E(·)[f(X(t))1{τX\R>t}]− ‖f‖R

h∗R‖h∗R‖R

cos θf e−φ∗Rt

∥∥∥R

+∥∥∥P(·)(τX\R > t)‖f‖R‖h∗R‖R cos θf − ‖f‖R

h∗R‖h∗R‖R

cos θf e−φ∗Rt

∥∥∥R

≤ ‖f‖R sin θfe−t(φ∗R+γ∗R) + ‖f‖R‖h∗R‖R

√VarµR (h∗R)

‖h∗R‖Rcos θf e

−t(φ∗R+γ∗R)

=

(‖f‖R sin θf +

µ∗R(f)√

VarµR (h∗R)

‖h∗R‖R

)e−t(φ

∗R+γ∗R) .

(3.27)

which provides an estimate of the second factor in (3.21).Inserting (3.25) and (3.27) in (3.21), we then obtain∣∣Eν [f(X(s+ t))1{τX\R>t}]− µ

∗R(f)Pν(τX\R > s+ t)

∣∣≤ ‖ ν

µR‖R(1 + ‖h∗R‖R) sin θν

(‖f‖R sin θf +

µ∗R(f)√

VarµR (h∗R)

‖h∗R‖R

)e−(s+t)(φ∗R+γ∗R) .

(3.28)

To conclude our proof we will make two more steps. First notice that from (3.24) one also getsthat, for any t ≥ 0,

Pν(τX\R > t) ≥ ‖ νµR‖R(

cos θν‖h∗R‖R

e−tφ∗R − sin θν e

−t(φ∗R+γ∗R)).

In particular, as soon as the following condition is verified

sin θν e−t(φ∗R+γ∗R) ≤ δ cos θν

‖h∗R‖Re−φ

∗Rt , (3.29)

that is‖h∗R‖R tan θν e

−φ∗Rt ≤ δ , (3.30)

it holdsPν(τX\R > t) ≥ (1− δ)‖ ν

µR‖R cos θν‖h∗R‖R

e−φ∗Rt . (3.31)

Now, dividing both terms of (3.28) by µ∗R(f)Pν(τX\R > s+ t), we reach an inequality that controlsthe Yaglom limit and that, provided condition (3.30) holds and then using the last inequality, readsas ∣∣∣Eν [f(X(t)) | τX\R>t]

µ∗R(f) − 1∣∣∣ ≤ 1+‖h∗R‖R

1−δ tan θν

(tan θf +

√VarµR(h∗R)

)e−γ

∗Rt . (3.32)

As a final step we apply this inequality to ν = δx and f = δy. For this choice of ν and f , and bydefinition of θν and θf , one has

tan θν ≤1

cos θν=

‖h∗R‖R√µR(x)h∗R(x)

and tan θf ≤1

cos θf=

‖h∗R‖R√µR(y)h∗R(y)

,

Thus, from (3.32), we obtain that under condition (3.30)∣∣∣∣Px((X(t)=y) | τX\R>t)µ∗R(y) − 1

∣∣∣∣ ≤ e−γ∗Rt (1+‖h∗R‖R )‖h∗R‖2

R(1−δ)

√µR(x)h∗R(x)

(1√

µR(y)h∗R(y)+

√VarµR (h∗R)

‖h∗R‖R

)

≤ e−γ∗Rt 1

1−δ

(1 +

√1 +

ε∗R1−ε∗R

)(1 +

ε∗R1−ε∗R

)1√ζ∗R

(1√ζ∗R

+

√ε∗R

1−ε∗R

),

(3.33)

where in the second line we used that ‖h∗R‖R ≥ 1, the estimate given in Proposition 2.1, and weintroduced the quantity ζ∗R defined in (2.19).

The right-hand side of the last inequality is smaller than δ as soon as

t ≥ 1γ∗R

[ln 2

δ(1−δ)ζ∗R+ ln

((12 + 1

2

√1 +

ε∗R1−ε∗R

)(1 +

ε∗R1−ε∗R

)(1 +

√ζ∗Rε

∗R

1−ε∗R

))], (3.34)

which also implies (3.30).

Page 21: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 20

Finally, from the hypothesis ε∗R < 1/3, from the concavity of the logarithm and of the squareroot function, and using that ζ∗R ≤ 1, then δ(1 − δ) ≤ 1/4 and ln 8 ≥ 1 + 5/(4

√2), after some

computation one obtains that the condition (3.34) is implied by the stronger condition

t ≥ 1γ∗R

(ln 2

δ(1−δ)ζ∗R

){1 +

√ε∗R

1−ε∗R

}, (3.35)

which, in turns, follows from t > T ∗δ,R by using Proposition 2.3.

4. AROUND THE EXPONENTIAL LAW

4.1. Proof of Theorem 2.6. We write

Pν(τX\R > tφ∗R

) = πR(ν)Pν(τX\R > tφ∗R| τX\R < T ∗R)

+ (1− πR(ν))Pν(τX\R > tφ∗R| τX\R > T ∗R) .

If t ≥ φ∗RT ∗R, the first term in the r.h.s equals zero and we get

Pν(τX\R > t

φ∗R

)= (1− πR(ν))Pν

(τX\R > t

φ∗R| τX\R > T ∗R

).

By Theorem 2.4, we also have∣∣∣∣Pν (τX\R > tφ∗R| τX\R > T ∗R

)− e−φ∗R(

tφ∗R−T∗R)

∣∣∣∣ ≤ ε∗Re−φ∗R(tφ∗R−T∗R)

,

which together with the previous equality completes the proof.

4.2. Proof of Lemma 2.7. On the one hand we have

Pµ∗R(τX\R ≤ T ∗R) = 1− e−φ∗RT∗R ≤ φ∗RT ∗R . (4.1)

On the other hand, denoting by dTV (µ, ν) the total variation distance between µ and ν, from the`2(µR) estimate given in Proposition 2.1, together with the Cauchy-Schwartz inequality, we get

dTV (µR, µ∗R) =

1

2

∑x∈R|µR(x)− µ∗R(x)| = 1

2

∑x∈R

µR(x) |1− h∗R(x)|

≤ 1

2VarµR(hR)1/2 ≤ 1

2

√ε∗R

1− ε∗R.

(4.2)

We then derive the desired result by using an optimal coupling.

5. WORKING WITH (κ, λ)-CAPACITIES

5.1. Proof of the upper bound in Theorem 2.9. Let X denote the continuous-time Markov chainon X defined, for κ > 0,by the generator

(Lf)(x) =

κ(f(x)− f(x)) if x = x ∈ R(Lf)(x) + κ(f(x)− f(x)) if x = x ∈ R(Lf)(x) if x = x ∈ X \R

. (5.1)

This is a reversible process with respect to a measure µ defined as

µ(x) =

{µ(x) if x = x ∈ Xκκµ(x) if x = x ∈ R . (5.2)

Note that µ is not a probability measure. Let us denote by νR the harmonic measure on R associatedwith X \R, i.e., the probability measure on R defined by

νR(x) =−µ(x)(LVκ)(x)

Cκ(R,X \R)(5.3)

and with

Vκ(x) =

Vκ(x) = Px(σκ < τX\R) if x = x ∈ R1 if x = x ∈ R0 if x = x ∈ X \R

.

Page 22: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 21

With obvious notation, we then have

EνR [τX\R] =µ(Vκ)

Cκ(R,X \R). (5.4)

Such kind of formula was introduced into the study of metastability in [20]. We refer to lecturenotes [35] for a derivation.

Now setting ν(x) = νR(x) for all x ∈ R, we can write

EνR [τX\R] = 1κ + Eν [τX\R] + Eν [τX\R] · κ · 1

κ = 1κ + Eν [τX\R](1 + κ

κ ) ,

where the first of the three summands stands for the mean time to go from R to R, the secondone for the mean time spent when moving inside R before reaching X \R, and the last one for themean time spent moving back and forth between R and R. From (5.2) we also have

µ(Vκ)

Cκ(R,X \R)=µ(Vκ) +

∑x∈R µ(x)

Cκ(R,X \R)=µ(Vκ) + κ

κµ(R)

Cκ(R,X \R).

Inserting the above equalities in (5.4) and multiplying by κ, we then get

1 + Eν [τX\R](κ+ κ) =κµ(Vκ) + κµ(R)

Cκ(R,X \R). (5.5)

Note that µ(R), µ(Vκ), Cκ(R,X \R) and Eν [τX\R] do not depend on κ. Then, in the limit of avanishing κ, it holds

1 + κEν [τX\R] =κµ(R)

Cκ(R,X \R). (5.6)

This already provides, by (2.31), an upper bound on φ∗R.To get the more practical upper bound stated in (2.38), let first note that dividing (5.5) by κ,

and then sending κ to +∞, we get

Eν [τX\R] =µ(Vκ)

Cκ(R,X \R). (5.7)

Together with (5.6), this implies µ(Vκ)Cκ(R,X\R) = µ(R)

Cκ(R,X\R) −1κ or equivalently

Lemma 5.1.

µR(Vκ) = 1− Cκ(R,X \R)

κµ(R)(5.8)

We now exploit the variational principle for φ∗R provided by Lemma 2.2 and take Vκ as testfunction. Noting that Vκ is the equilibrium potential of the electrical network on X defined in(2.32), from (2.33) we get

D(Vκ(x)) ≤ Cκ(R,X \R) . (5.9)By the Jensen inequality and (5.8),

‖Vκ‖2 = µ(R)∑x∈R

µR(x)Px(σκ < τX\R)2

≥ µ(R)µR(Vκ)2

= µ(R)(

1− Cκ(R,X\R)κµ(R)

)2

. (5.10)

Finally inserting these inequalities in the variational principle for φ∗R, we get the stated upperbound.

5.2. Proof of the upper bound in Theorem 2.10. For any f ∈ `2(µ), we have

Varµ(f) = µ(Varµ(f |1R)) + Varµ(µ(f |1R))

= µ(R)VarµR(f|R) + µ(X \R)VarµX\R(f|X\R)

+ µ(R)µ(X \R)(µR(f|R)− µX\R(f|X\R)

)2

.

(5.11)

Now, using the test function

f =f − µX\R(f|X\R)

µR(f|R)− µX\R(f|X\R)

Page 23: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 22

in the definition (2.33) of (κ, λ)-capacity, we get

Cλκ (R,X \R) ≤ D(f) + κµ(R)VarµR(f|R) + λµ(X \R)VarµX\R(f|X\R)

=(µR(f|R)− µX\R(f|X\R)

)−2(D(f) + κµ(R)VarµR(f|R) + λµ(X \R)VarµX\R(f|X\R)

),

which provides an upper bound on(µR(f|R)− µX\R(f|X\R)

)2

. Applying that bound in Eq. (5.11),

and from the definition of φλκ(A,B), we get

Varµ(f) ≤ µ(R)VarµR(f|R) + µ(X \R)VarµX\R(f|X\R)

+(D(f) + κµ(R)VarµR(f|R) + λµ(X \R)VarµX\R(f|X\R)

)φλκ(R,X \R)

−1(5.12)

≤ D(f)φλκ(R,X\R)

+µ(R)DR(f|R )

γR

(1 + κ

φλκ(R,X\R)

)+

µ(X\R)DX\R(f|X\R )

γX\R

(1 + λ

φλκ(R,X\R)

)≤ D(f)

φλκ(R,X\R)

{1 + max

(φλκ(R,X\R)+κ

γR;φλκ(R,X\R)+λ

γX\R

)},

where in the last step we used that

D(f) ≤ µ(R)DR(f|R) + µ(X \R)DX\R(f|X\R) .

The upper bound in (2.40) follows directly.

5.3. Proof of the lower bound of Theorem 2.9. From inequality (5.12) applied to f = h∗R andλ = +∞, and since h∗R|X\R = 0, we get

Varµ(h∗R) ≤ µ(R)VarµR(h∗R) +D(h∗R)

Cκ(R,X \R)µ(R)(1− µ(R))

{1 + κ

γR

}. (5.13)

On the other hand, by (5.11),

Varµ(h∗R) = µ(R)VarµR(h∗R) + µ(R)(1− µ(R)) .

Inserting this formula in (5.13), the term µ(R)VarµR(h∗R) becomes zero and then, dividing byµ(R)(1− µ(R)), we have

1 ≤ D(h∗R)

Cκ(R,X \R)

{1 + κ

γR

}, (5.14)

or equivalently

Cκ(R,X \R){

1 + κγR

}−1

≤ D(h∗R) . (5.15)

Now, dividing by µ(R)‖h∗R‖2R and using that, by Prop. 2.1, ‖h∗R‖2R = VarµR(h∗R) + 1 ≤ 1/(1− ε∗R),we get

Cκ(R,X \R)

µ(R)

{1− ε∗R1 + κ

γR

}≤ Cκ(R,X \R)

µ(R)‖h∗R‖2R

{1 + κ

γR

}−1

≤ D(h∗R)

‖h∗R‖2Rµ(R)= φ∗R , (5.16)

where the last equality comes from Lemma 3.1 and the fact that

〈h∗R,−L∗

Rh∗R〉R = φ∗R .

Finally, using the convexity of the function x 7→ 11+x , we obtain

Cκ(R,X \R)

µ(R)

{1− ε∗R − κ

γR

}≤ φ∗R , (5.17)

which concludes the proof of the lower bound in (2.38).

Page 24: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 23

5.4. Proof of the lower bound in Theorem 2.10. We use the test function V λκ , for which we knowthat (see (2.33))

D(V λκ ) + κµ(R)EµR[(V λκ |R − 1

)2]

+ λµ(X \R)EµX\R

[(V λκ |X\R − 0

)2]

= Cλκ (R,X \R)

so thatD(V λκ ) ≤ Cλκ (R,X \R) . (5.18)

We then look for a lower bound on Varµ(V λκ ). From (5.11) we have

Varµ(V λκ ) ≥ µ(R)µ(X \R)(µR(V λκ |R)− µX\R(V λκ |X\R)

)2

,

thus we need to estimate µR(V λκ |R) and µX\R(V λκ |X\R).By the monotonicity in λ, for all x ∈ R we get

V λκ (x) = Px(`−1R (σκ)) < `−1

X\R(σλ)) ≥ Px(σκ < τX\R) = Vκ(x) ,

which implies, together with Lemma 5.1,

µR(V λκ |R) ≥ 1− Cκ(R,X\R)κµ(R) .

In the same way we have

µX\R(V λκ |X\R) ≤ Cλ(R,X\R)λµ(X\R) .

Altogether, we finally get

γ ≤ D(V λκ )

Varµ(V λκ )≤ Cλκ (R,X\R)

µ(R)µ(X\R)

(1− Cκ(R,X\R)

κµ(R) − Cλ(R,X\R)λµ(X\R)

)−2

. (5.19)

6. WORKING WITH SOFT MEASURES

6.1. Proof of Lemma 2.12. If λ = 0 the first statement holds trivially since, in that case, φ∗R,λ =

0 = µ∗R,λ(eR,λ). If λ > 0, we can write

Pµ∗R,λ(τX\R,λ ≤ t) =∑k≥1

Pµ∗R,λ(NR(t) ≥ k)(1− φ∗R,λ)k−1µ∗R,λ(eR,λ) ,

where NR(t) is the number of clock rings inside R for the Poissonian clock associated to X. Takingthe limit as t→∞ in the above equation, we get that

1 = µ∗R,λ(eR,λ)/φ∗R,λ ,

which provides identity i).Let us now define the operator L∗R,λ on `2(µR) as

(L∗

R,λf)(x) = −f(x) +∑y∈R

p∗R,λ(x, y)f(y) ∀x ∈ R , f ∈ `2(µR) (6.1)

and notice that, for any probability measure ν on X , it holds

Eν[f(X ◦ `−1

R (t))1{τX\R,λ>t}

]= ν

(etL∗R,λf

). (6.2)

The exponential law given in ii) follows from the above identity applied to ν = µ∗R,λ and f = 1R.Finally, since 1−φ∗R,λ is a simple eigenvalue equal to the spectral radius of p∗R,λ, for any x, y ∈ R

and in the large t regime, we have

Px(X ◦ `−1R (t) = y , τX\R,λ > t) ∼ cxµ∗R,λ(y)e−tφ

∗R,λ , (6.3)

where cxµ∗R,λ is the canonical projection of δx on the one-dimensional eigenspace associated withµ∗R,λ (cx is strictly positive as a consequence of the positivity of µ∗R,λ). From (6.3), and taking thelimit when t goes to infinity, it follows

limt→∞

Px(X ◦ `−1R (t) = y | τX\R,λ > t) = µ∗R,λ(y) .

Page 25: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 24

6.2. Proof of Lemma 2.13. The result is once again a consequence of the Perron-Frobenius the-orem. Let χλ(y) denote the characteristic polynomial of L∗R,λ, which can be written as χλ(y) =

(y + φ∗R,λ)a(y). If a(y) = (y + φ∗R,λ)q(y) + a(−φ∗R,λ) is the Euclidian division of a(y) by (y − φ∗R,λ),we have the Bezout identity

1a(−φ∗R,λ)a(y)− 1

a(−φ∗R,λ)q(y)(y + φ∗R,λ) = 1 . (6.4)

In particular, for any x ∈ R, 1a(−φ∗R,λ)δxa(L∗R,λ) = cxµ

∗R,λ is the canonical projection of δx on the

eigenspace associated to µ∗R,λ, and since cx > 0 as previously noticed, we have

µ∗R,λ =δxa(L∗R,λ)∑

y∈R δxa(L∗R,λ)1{y}. (6.5)

Since a(y) = χλ(y)(y+φ∗R,λ) , the above equation expresses the map λ 7→ µ∗R,λ as a composition of contin-

uous functions of λ.

6.3. Proof of Proposition 2.14. As far as φR,λ is concerned, continuity and monotonicity followfrom continuity and monotonicity of eR,λ(x) for any x ∈ R. We then consider the other parameters.The continuity follows from the continuity of the eigenvalues as root of the characteristic polyno-mial. To prove the monotonicity, we notice that when λ decreases to zero, p∗R,λ(x, y) grows for allx and y in R as well as cR,λ(x, y) for any distinct x, y ∈ R. From the variational characterizationof φ∗R,λ, i.e.

φ∗R,λ = min{〈f,−L

R,λf〉R : 〈f, f〉R = 1 , f > 0}

(6.6)

= min〈f,f〉R=1f>0

∑x,y∈R

µR(x)f(x)

f(x)−∑y∈R

p∗R,λ(x, y)f(y)

(6.7)

where the restriction f > 0 comes from the fact that, by the Perron-Frobenius theorem, the righteigenvector has positive coordinates, we see that φ∗R,λ is decreasing in λ. Similarly, using

γR,λ = min

12

∑x,y∈R

cR,λ(x, y)(f(x)− f(y))2 : VarµR(f) = 1

, (6.8)

we see that γR,λ is increasing in λ. As a consequence ε∗R,λ is decreasing in λ, and we have

ε∗R,0 =φ∗R,0γR,0

=µ∗R,0(eR,0)

γR,0= 0 . (6.9)

6.4. Proof of Theorem 2.16. Proof of (2.60): We first write

Pν(X(τδ) = x |X(τδ) ∈ R) =1

Pν(X(τδ) ∈ R)

∑i≥0

∑xi∈X

Pν(i0 > i,X(τi) = xi) (6.10)

× Pxi(X ◦ `−1R (σκ) = x, `−1

R (σκ) < `−1X\R(σλ) , σκ > T ∗δ,R,λ)

Now, conditioning on σκ and setting Pσκxi = Pxi(· |σκ), we get

Pν(X(τδ) = x |X(τδ) ∈ R) =1

Pν(X(τδ) ∈ R)

∑i≥0

∑xi∈X

Pν(i0 > i,X(τi) = xi)

× E[Pσκxi (X ◦ `−1

R (σκ) = x, `−1R (σκ) < `−1

X\R(σλ) , σκ > T ∗δ,R,λ)]

=1

Pν(X(τδ) ∈ R)

∑i≥0

∑xi∈X

Pν(i0 > i,X(τi) = xi)

× E[1{σκ>T∗δ,R,λ}P

σκxi (X ◦ `−1

R (σκ) = x |σκ < τX\R,λ)Pσκxi (σκ < τX\R,λ)],

(6.11)

where the second equality comes from the independence between X, σκ and σλ. Since

Pν(X(τδ) ∈ R) =∑i≥0

∑xi∈X

Pν(i0 > i,X(τi) = xi)E[1{σκ>T∗δ,R,λ}P

σκxi (σκ < τX\R,λ)

], (6.12)

Page 26: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 25

from (6.11) we get

Pν(X(τδ) = x |X(τδ) ∈ R)

µ∗R,λ(x)− 1 =

1

Pν(X(τδ) ∈ R)

∑i≥0

∑xi∈X

Pν(i0 > i,X(τi) = xi)

× E

[1{σκ>T∗δ,R,λ}P

σκxi (σκ < τX\R,λ)

(Pσκxi (X ◦ `−1

R (σκ) = x |σκ < τX\R,λ)

µ∗R,λ(x)− 1

)].

(6.13)

An analogous expression can be found for Pν(X(τδ)=x |X(τδ)∈X\R)µ∗X\R,κ(x) − 1. The result then follows from

Theorem 2.15, and in particular from the equivalent of Theorem 2.4.To prove inequality (2.61), we first state the following lemma.

Lemma 6.1. Let T > 0 and {σi : i ≥ 1} be a sequence of independent exponential random variablesof rate κ such that eκT − 1 < 1. If N = min{i ≥ 1 : σi > T}, then

P

(N∑i=1

σi >t

κ

)≤ e−t

1− (eκT − 1)(6.14)

Proof of Lemma 6.1. Using the property of the exponential distribution, we have

P

(N∑i=1

σi >t

κ

)=∑n≥1

P(N = n)P

n∑j=1

σi >t

κ

∣∣σ1 < T, . . . , σn−1 < T, σn > T

≤∑n≥1

P(N = n)P(σn >

t

κ− (n− 1)T

∣∣σn > T

)

=∑n≥1

P(N = n)P(σn >

t

κ− nT

)=∑n≥1

(1− e−κT )n−1e−κT e−t+nκT

= e−t∑n≥1

(eκT − 1)n−1 =e−t

1− (eκT − 1),

(6.15)

which concludes the proof. �

Coming back to the proof of Th. 2.16, we first notice that if τδ > t( 1κ + 1

λ ), then `R(τδ) >tκ or

`X\R(τδ) >tλ . As a consequence, defining

AR = {κ`R(τδ) ∨ λ`X\R(τδ) = κ`R(τδ) > t}AX\R = {κ`R(τδ) ∨ λ`X\R(τδ) = λ`X\R(τδ) > t} (6.16)

so that P(AR) + P(AX\R) ≤ 1, we have

Pν(τδ > t( 1κ + 1

λ )) = Pν(τδ > t( 1κ + 1

λ )|AR)Pν(AR) + Pν(τδ > t( 1κ + 1

λ )|AX\R)Pν(AX\R) .

Using the independence between σκ, σλ and X, together with the previous lemma, we finally get

Pν(τδ > t

(1κ + 1

λ

))≤ e−t

{1

1−ξ

}(Pν(AR) + Pν(AX\R)

)≤ e−t

{1

1−ξ

}.

6.5. Proof of Theorem 2.18. To prove the upper bound we consider the extended electrical net-work associated with Cλκ (A,B) and follow the first steps of the proof of the upper bound in Theorem2.9 (see subsection 5.1). We then reach, for some probability measure ν on R, to

1 + κEν [τX\R,λ] =κµ(R)

Cλκ (R,X \R). (6.17)

instead of equation (5.6). Using then, from Theorem 2.17, the analogous of equation (2.31), weobtain

1 +κ

φ∗R,λ

{1 + ε∗R,λ + ε∗R,λ ln

1

ε∗R,λζR

}≥ κµ(R)

Cλκ (R,X \ R), (6.18)

Page 27: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 26

or1

φ∗R,λ

{1 + ε∗R,λ + ε∗R,λ ln

1

ε∗R,λζR+φ∗R,λκ

}≥ µ(R)

Cλκ (R,X \ R), (6.19)

which gives the desired upper bound on φ∗R,λ.The proof of the lower bound will be similar to the proof of the lower bound of Theorem 2.9,

where we used a partial Poincare inequality to control the mean exit time from R. The differencehere is that we will have to work on the whole space X and not only on R. Since µ∗R,λ is con-centrated on R, we will first compare its associated exit time τX\R,λ with the exit time of anotherquasi-stationary measure, µ∗X , that spreads on the whole space X . Then we will control φ∗X , theescape rate from X , with the spectral gap estimated in Theorem 2.10.

Let µ∗X be the quasi-stationary measure on X associated with the Markovian process X on X =

X ∪ X \R with generator L defined, for some λ > 0, by

(Lf)(x) =

(Lf)(x) if x = x ∈ R(Lf)(x) + λ(f(x)− f(x)) if x = x ∈ X \Rλ(f(x)− f(x)) if x = x ∈ X \R

. (6.20)

The associated escape rate φ∗X is, with obvious notation and for any probability measure ν on X ,the rate of exponential decay of Pν(τX\R > t) when t goes to infinity. Since

Pµ∗R,λ

(τX\R > t

)≥ Pµ∗R,λ

(`R

(τX\R

)> t)

= Pµ∗R,λ(τX\R,λ > t

)= e−φ

∗R,λt ,

(6.21)

we have φ∗R,λ ≥ φ∗X .We then have to estimate φ∗X and we do so by comparison with the spectral gap. By Lemma 3.1

applied with R = X and the correct normalizations, and taking, with obvious notation, f = h∗X ,which is indeed the minimizer in the variational principle satisfied by φ∗X , we have

φ∗X ≥D(h∗X )

‖h∗X ‖2=‖h∗X ‖2 − 1

‖h∗X ‖2D(h∗X )

Varµ(h∗X )≥

(1− 1

‖h∗X ‖2

)γ. (6.22)

Now,

‖h∗X ‖2 ≥∑x∈R

µ(x)

(µ∗X (x)

µ(x)

)2

= µ(R)∑x∈R

µR(x)

(µ∗X (x)

µ(x)

)2

≥ µ(R)

(∑x∈R

µR(x)µ∗X (x)

µ(x)

)2

=1

µ(R)(µ∗X (R))

2.

(6.23)

Since the escape from X occurs at rate λ in each point of X \R and there are no direct connectionsbetween R and X \ R, one has

µ∗X (X \ R) · λ = φ∗X ≤ φ∗R,λ (6.24)

or

µ∗X (R) ≥{

1−φ∗R,λλ

}. (6.25)

From φ∗R,λ ≥ φ∗X , (6.22), Theorem 2.10 and (6.23) we obtain

φ∗R,λ ≥

1− µ(R){1− φ∗R,λ

λ

}2

Cλκ (R,X \R)

µ(R)(1− µ(R))

1

1 + max(κ+φλκγR

,λ+φλκγX\R

) . (6.26)

Developing the square, dropping a few terms and using the convexity of x 7→ 1/(1 +x), this implies

φ∗R,λ ≥Cλκ (R,X \ R)

µ(R)

{1− µ(R)− 2φ∗R,λ/λ

1− µ(R)

}{1−max

(κ+ φλκγR

,λ+ φλκγX\R

,

)}, (6.27)

which is the desired result.

Page 28: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 27

6.6. Proof of Theorem 2.19. By Theorem 2.15, for all x in X ,

maxy∈R

∣∣∣∣∣∣Px(X(T ) = y

∣∣∣σλ > T ∗X\R,κ

)µ∗X\R,κ(y)

− 1

∣∣∣∣∣∣ ≤ ε∗X\R,κ. (6.28)

Then

‖νx − µ∗X\R,κ‖TV ≤1

2ε∗X\R,κ + P

(σλ < T ∗X\R,κ

)=

1

2ε∗X\R,κ + 1− e−λT

∗X\R,κ

≤ 1

2ε∗X\R,κ + λT ∗X\R,κ .

(6.29)

Also

‖µ∗X\R,κ − µX\R‖TV ≤ 1

2

√ε∗X\R

1− ε∗X\R, (6.30)

‖µX\R − µ‖TV = µ(R) , (6.31)

and the upper on ‖νx − µ‖TV follows from the triangular inequality.Now, for all t > 0,

‖Px(X(t) = ·)− µ‖TV ≤ ‖νx − µ‖TV + Px(T > t) (6.32)

and to prove our mixing time estimate, it is sufficient to show

Px (T > t) ≤ 1

2

(1

2+ η

)− µ(R)−

√ε∗X\R

1− ε∗X\R− λT ∗X\R,0 =

1

4− µ(R)

2(6.33)

for

t ≥ 2(12 − µ(R)

)φ∗R,λ

{1 + ε∗R,λ + ε∗R,λ ln

1

ε∗R,λζR+φ∗R,λλ

}. (6.34)

To obtain such an estimate we give an upper bound on the mean value of T and use Markovinequality. With T ′ = σλ ∧ τR, we have, using (2.31) adapted to soft measures,

Ex[T ] ≤ E[σλ] + Ex[EX(T ′) [τR,λ]

∣∣∣X(T ′) ∈ R]

≤ 1

λ+

1

φ∗R,λ

{1 + ε∗R,λ + ε∗R,λ ln

1

ε∗R,λζR

}

=1

φ∗R,λ

{1 + ε∗R,λ + ε∗R,λ ln

1

ε∗R,λζR+φ∗R,λλ

},

(6.35)

so that (6.34) implies (6.33).

7. TWO EXAMPLES

In this section we want to illustrate the analysis method of Section 2 with reference to toymodels. We will recover known results for the Glauber dynamics of the Curie-Weiss model andgive sharp asymptotics of its relaxation time, we will also study a variation on the so-called “n-dog”theme considered in [17] that illustrates the variety of scenarios one can encounter in proving ourbasic hypothesis on ε∗R or controlling T ∗R.

7.1. Metastable behavior of the Curie-Weiss model.

Page 29: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 28

7.1.1. Model, dynamics and one-dimensional representation. Les us consider the Curie-Weiss modelwhich is a mean-field spin system described by N spin variables, σ = (σ1, . . . , σN ) ∈ X = {−1, 1}N ,with Hamiltonian

HN,h(σ) = − 1

2N

N∑i,j=1

σiσj − hN∑i=1

σi , (7.1)

where h > 0 is called the external field. The corresponding Gibbs probability measure on X is

µN,h,β(σ) =e−βH(σ)

ZN,h,β, (7.2)

where β > 0 is the inverse of the temperature, and ZN,h,β =∑σ∈X e

−βHN,h(σ) is the normalizingfactor called the partition function. To make the notation simpler, we set H(σ) ≡ HN,h(σ), µ(σ) ≡µN,h,β(σ) and Z ≡ ZN,h,β .

For every N ∈ N, and setting [−1, 1]N := {−1,−1 + 2N , . . . , 1}, let us define the total magnetiza-

tion, mN : X 7→ [−1, 1]N , as

mN (σ) ≡ 1

N

N∑i=1

σi . (7.3)

Notice that it allows rewriting the Hamiltonian as a function of a one-dimensional parameter, i.e.

H(σ) = Nu(mN (σ)) , (7.4)

with u(m) = −m2

2 − hm, for m ∈ [−1, 1]. For simplicity, in the sequel we will identify functions de-fined on the discrete set [−1, 1]N with functions defined on [−1, 1] by setting f(m) ≡ f([2Nm]/2N).

For m ∈ [−1, 1], let us consider the functions

fN (m) = − 1

βNln

∑σ :mN (σ)=m

e−βH(σ) (7.5)

f(m) = limN→∞

fN (m) . (7.6)

A standard computation shows that

fN (m) = u(m)− 1

βs(m) +

1

βNln

(√(1−m)2πN

2 (1 + o(1))

)= f(m) +

1

βNln

(√(1−m)2πN

2 (1 + o(1))

),

(7.7)

where s(m) = −(

1+m2 ln 1+m

2 + 1−m2 ln 1−m

2

)is the entropy of Bernoulli random variables. More-

over, the critical points of the function f(m) satisfy the equation

m = tanh(β(m+ h)) (7.8)

and one gets, for β > 1 and 0 < h <√

β−1β + 1

β ln(√β +√β − 1

), that the graph of f(m) is given

by a double-well with two minima m− < 0 < m+ and a maximum m0 < 0.We then consider the time evolution of this system provided by a heat-bath Glauber dynamics.

This is a Markov chain on X , denoted by X = (X(t))t≥0, defined through the following (normal-ized) rates

p(σ, σi) = 1N

e−βH(σi)

e−βH(σ)+e−βH(σi), for i = 1, . . . , N

p(σ, σ) = 1−∑Ni=1 p(σ, σ

i)p(σ, σ′) = 0 , elsewhere

(7.9)

where σi denotes the configuration obtained from σ by a spin-flip at the site i. An easy check showsthat X is reversible w.r.t µ.

It turns out that the induced dynamics on the space [−1, 1]N , X(t) := mN (X(t)), is also Markov-ian with transition rates

p(m,m± 2N ) =

(1∓m

2

) ( 1+tanh(β∆±)2

)p(m,m) = 1− p(m,m+ 2

N )− p(m,m− 2N )

p(m,m′) = 0 , elsewhere, (7.10)

Page 30: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 29

where ∆± := ±m± h+ 1N . Moreover, an easy check shows that the induced dynamics is reversible

w.r.t. the probability measure µ on [−1,+1]N , given by

µ(m) :=e−βNfN (m)

ZN=

∑σ :mN (σ)=m

µ(σ) , (7.11)

where fN was defined in (7.7). When parameterized by m, the evolution of our system can beviewed as a one-dimensional random walk driven by a double-well potential. We thus consider themetastable region R ⊂ X and the corresponding one-dimensional projection R ⊂ [−1,+1]N :

R := {σ ∈ X : mN (σ) ≤ m0} ; R := {m ∈ [−1, 1]N : m ≤ m0} . (7.12)

7.1.2. Verifying hypotheses: First part. In order to apply our Theorems (2.6) (and the related in-equality (2.31), we first have to verify the hypotheses and in particular provide a suitable upperbound on ε∗R =

φ∗RγR

and on ζ∗R.By Lemma 2.2 we get

φ∗R ≤ φR = µR(eR) ≤ µR(∂−R) =µ(m0)

µ(R)≤ exp (−βNΓ)(1 + o(1)) , (7.13)

where in the last inequality we set Γ := f(m0)− f(m−) and used (7.7). The hypothesis on ε∗R thenfollows immediately by applying the (N logN)−1 lower bound on γR that was derived in [39] bya very precise computation. Since we just need a rough control of this quantity, that will be thencompared to φ∗R, we provide here a new simpler argument that yields a bound of order N−3/2.

We first notice that the dynamics defined by (7.9) can be compared to a random walk on thehypercube. This suggests that a simple way to control the spectral gap is by mixing time estimates.To this aim we consider the dynamics reflected in R, XR = (XR(t))t≥0, and the related mixingtime,

τmix,R( 14 ) = inf

t≥0{maxσ∈R‖Pσ(XR(t) = · )− µR‖TV ≤ 1

4} . (7.14)

In what follows we will denote by c(β) a constant depending on β but independent of N , whoseparticular value may change from line to line. With the above notation it holds the following:

Proposition 7.1.

τmix,R( 14 ) ≤ c(β)N

32 . (7.15)

Proof. The idea of the proof is based on coupling techniques, and we thus define the followingcoupling:

a) We consider two independent dynamics XσR and Xη

R, with initial states σ, η ∈ R, and letthem run independently until they reach the same magnetization.

b) We then run an analogous coupling to that defined in [39], where the only difference hereis the reflection on R. This coupling is defined for initial states σ, σ′ ∈ R with mN (σ) =mN (σ′), and is defined in such a way so as to keep the magnetizations coupled along thedynamics.

Let Tσ,η denote the coupling time for the global coupled dynamics (XσR(t), Xη

R(t)), that is

Tσ,η = inf{t ≥ 0 : XσR(t) = Xη

R(t)} . (7.16)

To provide an estimate on τmix,R( 14 ), it is then enough to find t such that

maxσ,η∈R

P(Tσ,η > t) ≤ 14 . (7.17)

The coupling time Tσ,η can be controlled by first estimating the time to couple the magnetizations,and then the coupling time of the dynamics starting in configurations with equal magnetization.

Formally, let (XmR (t), Xm′

R (t)) be the induced coupled dynamics with initial states m,m′ ∈ R,and define

Tm,m′ = inf{t ≥ 0 : XmR (t) = Xm′

R (t)} . (7.18)

Page 31: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 30

Then, for any σ, η ∈ R,

P(Tσ,η > t) ≤ P

maxm,m′∈R

Tm,m′ + maxσ,σ′∈R:

mN (σ)=mN (σ′)

Tσ,σ′ > t

≤ maxm,m′∈R

P(Tm,m′ >t2 ) + max

σ,σ′∈R:mN (σ)=mN (σ′)

P(Tσ,σ′ >t2 )

(7.19)

Following the same argument of [39] (see Lemma 2.9 and its proof) it is easy to prove that, forany σ, σ′ ∈ R such that mN (σ) = mN (σ′),

P(Tσ,σ′ > c(β)N logN) ≤ 1N (7.20)

for a constant c(β) depending on β but not in N .To control the time Tm,m′ , let τm denote the stopping time in m for XR. With some abuse of

notation, let Em denote the average over the dynamics XR with initial state m ∈ R and define

T := maxm∈R

Em(τm−) = max{E−1(τm−);Em0

(τm−)}

(7.21)

where the second equality is due to an obvious geometric fact. Then the following Lemmas hold.

Lemma 7.2. With the above notation, it holds

T ≤ c(β)N32 . (7.22)

Lemma 7.3. For all t ≥ 40T , it holds

maxm,m′∈R

P(Tm,m′ > t) = P(T−1,m0> t) ≤ 1

2 . (7.23)

Before proving the above Lemmas, let us conclude the proof of Prop. 7.1.By inequalities (7.19)-(7.20) and Lemma 7.3, it follows that if t = max{240T,N

32 }, N ≥ 8, and for

any σ, η ∈ R,

P(Tσ,η > t) ≤ P(T−1,m0 > 120T ) + maxσ,σ′∈R:

mN (σ)=mN (σ′)

P(Tσ,σ′ > N32 )

≤ 1

8+

1

N≤ 1

4

(7.24)

By Lemma 7.2 the above inequality holds whenever t > c(β)N32 and the statement of the Proposi-

tion follows. �

We now come back to the proofs of the two Lemmas.

Proof of Lemma 7.2. Since XR is a one-dimensional dynamics, for any two states x, y ∈ R we havethe formula

Ex(τy) =µ(Vx,y)

C(x, y)(7.25)

where Vx,y and C(x, y) are, respectively, the equilibrium potential and the capacity between x andy. Moreover, if x < y, we have

Vx,y(m) = Pm(τx < τy) =

1 if m ≤ x0 if m ≥ yC(x,y)C(m,y) if x < m < y

, (7.26)

C(x, y)−1 =

(y−x)N2 −1∑

k=0

(c(x+ 2k

N , x+ 2(k+1)N

))−1

, (7.27)

with c(x, y) = µ(x)p(x, y). Analogous formulas hold when x > y.

Remark. Since we are considering the dynamics reflecting in R, the classical version for the meanexit time would be Ex(τy) =

µR(Vx,y)CR(x,y) rather than Eq. (7.25), where CR(x, y) is the capacity defined

through conductances c(x, y) = µR(x)pR(x, y). However, it is easy to verify that for points x, y ∈ Rthe two formulas are equivalent.

Page 32: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 31

In Appendix B we will show that, if there are no local maxima of fN in [x, y],

C(x, y)−1 ≤ c(β)√NZN max

z∈{x,y}eβNfN (z) . (7.28)

Putting together (7.25)-(7.28), and since f(y) > f(m−) for any y ∈ [−1,m−), we get

E−1(τm−) =

(m−+1)N2 −1∑

j=0

µR(−1 + 2jN )C(−1 + 2j

N ,m−)−1 ≤ c(β)N32 (7.29)

Analogous computations can be done for Em0(τm−), providing the same estimate. This concludes

the proof of the Lemma. �

Proof of Lemma 7.3. The first identity of (7.23) is obvious, due to the geometry of the problem.We then focus on the two dynamics X−1

R and Xm0

R , and define recursively the stopping timessk, τk and s′k, for k ≥ 1:

s1 := inft≥0{X−1R (t) = m−}

τk := inft≥sk{Xm0

R (t) = m−}

sk+1 := inft≥τk{X−1R (t) = m−}

s′k := supt≤τk{X−1R (t) = m−}

(7.30)

Letting τ(t) denote the first clock ring after time t , we can define the event

A := {∃k ≤ 2 : s′k = τk or X−1R (s′k + τ(s′k)) > m−} . (7.31)

Notice that, since s′k is the time of the last visit in m− of the dynamics X−1R before τk, and because

−1 < m− < m0, the occurrence of the event {X−1R (s′k + τ(s′k)) > m−} implies that T−1,m0 < τk. In

particular, we have A ∪ {τ2 ≤ t} ⊂ {T−1,m0 ≤ t} and then

P(T−1,m0> t) ≤ P(Ac) + P(τ2 > t) . (7.32)

From the definition (7.10) of rates p, and using that 1−m−2 > 1+m−

2 together with the properties ofthe hyperbolic tangent, one can show that

p(m−,m− − 2N ) ≤ p(m−,m− + 2

N )⇐⇒ p(m−,m− − 2N ) ≤ 1

2 (1− p(m−,m−)) . (7.33)

Thus

P(Ac) ≤ P(X−1R (s′k + τ(s′k)) < m− , for k = 1, 2)

≤(P(X−1

R (t+ τ(t)) = m− − 2N |X

−1R (t) = m−, X

−1R (t+ τ(t)) 6= m−)

)2=

(p(m−,m− − 2

N )

1− p(m−,m−)

)2

≤ 1

4

(7.34)

In order to estimate P(τ2 > t), we divide the interval [0, t] in k = b t8T c subintervals of length 8T ,where T was defined in (7.21). The event {τ2 > t} is then included in the event that, in at leastk − 2 subintervals, at most one of the process has arrivals in m−. On each interval, this happenswith probability bounded above by

P−1(τm− > 8T ) + Pm0(τm− > 8T ) ≤ 1

8T(E−1(τm−) + Em0

(τm−)) ≤ 1

4, (7.35)

by Markov’s inequality, and then

P(τ2 > t) ≤(k − 2

k

)(1

4

)k−2

≤ 2−k+3 . (7.36)

The statement follows taking t ≥ 40T , so that k ≥ 5 and P(τ2 > t) ≤ 14 , and finally, from (7.32)

and the previous estimates,

P(T−1,m0> t) ≤ 1

2. (7.37)

Page 33: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 32

Coming back to the hypotheses on ε∗R, from the well known inequality γ−1R ≤ τmix,R( 1

4 ), and by(7.13) and (7.15), we obtain

ε∗R =φ∗RγR≤ c(β)N

32 exp (−βNΓ) (1 + o(1)) , (7.38)

which goes to 0 for any N large enough, and thus satisfies the hypothesis of our main theorems.Moreover, from Lemma 2.5 and the trivial bounds ∆R ≥ N and DR ≥ c(β), we get

ζ∗R ≥ e−∆RDR ≥ e−Nc(β) (7.39)

By inequality 2.28 and the previous estimates, this implies that for N large enough the conditionφ∗R · T ∗R � 1 is satisfied.

7.1.3. Asympotic law of the exit time. Applying Th. 2.6 and by the related inequality (2.31), we getthat in the limit N →∞ and for all distributions ν on R,{

Eν(τX\R) ≤ φ∗R−1(1 + o(1))

Eν(τX\R) ≥ (1− πR(ν))φ∗R−1(1 + o(1))

(7.40)

and for all t > 0,Pν(φ∗RτX\R > t) = (1− πR(ν))e−t(1 + o(1)) . (7.41)

In particular, for ν = µR,EµR(τX\R) = φ∗R

−1(1 + o(1)) (7.42)

PµR(φ∗R · τX\R > t) = e−t(1 + o(1)) , ∀t ≥ 0 . (7.43)

The next step concerns the estimation of φ∗R. By Th. 2.9, assuming that N is large enough tohave ε∗R + φ∗R · T ∗R � 1, and choosing k such that φ∗R � k � γR, we have

φ∗R =Ck(R,X \R)

µ(R). (7.44)

In order to estimate Ck(R,X \R), we use its two variational characterizations, (2.33) and (2.36),with suitable test functions. The one-dimensional nature of the model suggests that the capacitiesof the dynamics over X could be well approximated by the analogous quantities computed forthe induced dynamics over [−1, 1]N . This has the advantage that the equilibrium potential of theone-dimensional chain, namely the minimizer in (2.33) for k, λ =∞, can be explicitly given.

Following this idea, for the upper bound we consider the test function V (σ) := Vm−,m0(mN (σ)),

where Vm−,m0 is the function defined in (7.26). In other words, V (σ) is the equilibrium potentialassociated to the one-dimensional chain, with boundary conditions V (m−) = 1 and V (m0) = 0.Explicitly, for m ∈ [−1, 1]N ,

Vm−,m0(m) = Pm(τm− < τm0

) =

1 if m ≤ m−0 if m ≥ m0C(m−,m0)C(m,m0) otherwise

, (7.45)

where C(x, y)−1 =

(y−x)N2 −1∑

k=0

(c(x+ 2k

N , x+ 2(k+1)N

))−1

and c(x, y) = µ(x)p(x, y).

Then we have

Ck(R,X \R) ≤ D(V ) + k∑σ∈R

µ(σ)(V (σ)− 1)2

= C(m−,m0) + k

m0∑m=m−

e−βNfN (m)

ZN

(C(m−,m0)

C(m−,m)

)2

≤ C(m−,m0) + kc(β)NC(m−,m0) ,

(7.46)

where the last inequality is due to (7.28) which is derived in Appendix B. Since k � γR ≤c(β)N−3/2, it holds

Ck(R,X \R) ≤ C(m−,m0)(1 + o(1)) . (7.47)

Page 34: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 33

Similarly, for the lower bound onCk(R,X \R) we consider a unitary test flow ψ which is constanton all couples (σ, σ′) of given magnetization. Specifically we set ψ(σ, σ′) := Ψ(mN (σ),mN (σ′)) anddefine {

Ψ(m,m+ 2N ) =

(S(m) (1−m)N

2

)−1

∀m ∈ [m−,m0]N ,

Ψ(m,m′) = 0 otherwise, (7.48)

with S(m) = |{σ : mN (σ) = m}|. With this definition, the flow Ψ is the unitary flow from m− tom0 that realized the minimum in the Thompson principle for the one-dimensional chain. Insertingthe test flow in (2.36), we then have

Ck(R,X \R)−1 ≤ D(ψ) +µ(R)

kS(m−)

e−βNu(m−)

µ(R)ZN

(ZN · e−βNu(m−)

S(m−)

)2

=

m0∑m=m−

ZN · eβNfN (m)

p(m,m+ 2N )

+1

kZN · eβNfN (m−)

= C(m−,m0) +1

kµ(m−)−1 .

(7.49)

Since k−1 � (φ∗R)−1 ≤ µ(m−)Ck(R,X \R)−1 (by (7.44)), we get

Ck(R,X \R) ≥ C(m−,m0)(1 + o(1)) . (7.50)

From (7.44)) and with the above estimate, we then have

φ∗R =C(m−,m0)

µ(R)(1 + o(1)) . (7.51)

Finally, µ(R) and the capacity C(m−,m0) defined in (7.27) can be both evaluated for large N (seeAppendix B), providing the following asymptotic expressions:

C(m−,m0) =

√(1−m2

0)β|f ′′(m0)|πN

e−βNf(m0)

ZN(1 + o(1)) (7.52)

µ(R) =e−βNf(m−)

ZN ·√βf ′′(m−)(1−m2

−)(1 + o(1)) . (7.53)

Altogether, under the same hypotheses of before, we have

EµR(τX\R) =πN · eβN(f(m0)−f(m−))

β√|f ′′(m0)|f ′′(m−)(1−m2

0)(1−m2−)

(1 + o(1)) (7.54)

7.1.4. Verifying hypotheses: Second part. To move to the second part of the analysis, which goesfrom Th. 2.10 to Th. 2.19, we first have to estimate the quantities φ∗X\R, γX\R and ε∗X\R related tothe dynamics over X \R. As for φ∗R, we can find easily a rough (but sufficient) upper bound overφ∗X\R by Lemma 2.2. By trivial estimates we get

φ∗X\R ≤ φX\R = µX\R(eX\R) ≤ µX\R(∂+R) =µ(m0 + 2

N )

µ(X \R)≤ exp (−βNΓ′)(1 + o(1)) , (7.55)

where in the last inequality we set Γ′ := f(m0 + 2N )− f(m+) and used (7.7).

To get a lower bound over γX\R, as for γR we proceed by first estimating the mixing time of thedynamics reflected in X \R, XX\R = (XX\R(t))t≥0. With obvious notation, it holds the following:

Proposition 7.4.τmix,X\R( 1

4 ) ≤ c(β)N32 . (7.56)

Proof. The proof is the same as for Prop. 7.1, and can write down just replacing R with X \R, thestates −1,m− and m0 respectively with m+, −1 and m0 + 2

N and the time T defined in (7.21) with

T ′ = maxm∈X\R

Em(τm+) = max{E+1(τm+);Em0+ 2

N(τm+)

}.

Page 35: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 34

As a consequence of (7.55) and Prop. 7.4, we obtain

ε∗X\R =φ∗X\R

γX\R≤ c(β)N

32 exp (−βNΓ′) (1 + o(1)) , (7.57)

and alsoφ∗RγX\R

≤ c(β)N32 exp (−βNΓ) (1 + o(1)) , (7.58)

which are both� 1 for any N large enough.

7.1.5. Relaxation, transition and mixing times. From inequalities (7.57) and (7.58), we can choosek, λ in Theorems 2.10-2.19, such that φR � k � γR and φR + φX\R � λ � γX\R, and thenget matching upper and lower bound over on the relaxation time γ and the mean transition time.Explicitly, by Th. 2.10, 2.17 and 2.18, it holds that in the limit N → ∞ and for k, λ such thatφ∗R � k � γR and max{φ∗R, φ∗X\R} � λ� γX\R,

i)

γ−1 =µ(R)µ(X \R)

Cλk (R,X \R)(1 + o(1)) (7.59)

ii) For all distribution ν over R,{Eν(τX\R,λ) ≤ φ∗R,λ

−1(1 + o(1))

Eν(τX\R,λ) ≥ (1− πR(ν))φ∗R,λ−1(1 + o(1))

(7.60)

and for all t > 0,

Pν(φ∗R,λτX\R,λ > t) = (1− πR(ν))e−t(1 + o(1)) . (7.61)

In particular, for ν = µR,

EµR(τX\R,λ) = φ∗R,λ−1(1 + o(1)) (7.62)

PµR(φ∗R,λτX\R,λ > t) = e−t(1 + o(1)) , ∀t ≥ 0 . (7.63)iii)

φ∗R,λ =Cλk (R,X \R)

µ(R)(1 + o(1)) (7.64)

To provide quantitative estimates on the relaxation and transition time, it thus remains to esti-mate the capacity Cλk (R,X \R). As for Ck(R,X \R), we make use of the variational characteriza-tions (2.33) and (2.36), with suitable test functions. The functions that we consider are extensionsof those defined for Ck(R,X \R), in the sense that they are defined similarly but on a bigger sup-port.

Explicitly, let V (σ) := Vm−,m+(mN (σ)), with Vm−,m+

defined in (7.26). Plugging V into (2.33),we obtain the upper bound

Cλk (R,X \R) ≤ D(V ) + k∑σ∈R

µ(σ)(V (σ)− 1)2 + λ∑

σ∈X\R

µ(σ)(V (σ))2 (7.65)

Since V is defined as the equilibrium potential of the one-dimensional chain, with boundary condi-tion V (m−) = 1 and V (m+) = 0, we have that D(V ) = C(m−,m+). Using inequality (7.28) and

choosing k, λ� γX\R ≤ c(β)N−32 , the second and third terms of (7.65) are bounded as

k∑σ∈R

µ(σ)(V (σ)− 1)2 = k

m0∑m=m−

e−βNfN (m)

ZN

(C(m−,m+)

C(m−,m− 2N )

)2

≤ kc(β)N32 (C(m−,m+))

2ZN e

βNfN (m0)

≤ c(β) (C(m−,m+))2ZN e

βNfN (m0)

λ∑

σ∈X\R

µ(σ)(V (σ))2 = λ

m+∑m=m0

e−βNfN (m)

ZN

(C(m−,m+)

C(m,m+)

)2

≤ λc(β)N32 (C(m−,m+))

2ZN e

βNfN (m0)

≤ c(β) (C(m−,m+))2ZN e

βNfN (m0)

Page 36: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 35

In Appendix B, the capacity C(m−,m+) is evaluated for large N and the following formula isobtained

C(m−,m+) =

√(1−m2

0)β|f ′′(m0)|2πN

e−βNf(m0)

ZN(1 + o(1)) . (7.66)

This implies that the second and third terms above are o(C(m−,m+)) and then

Cλk (R,X \R) ≤ C(m−,m+)(1 + o(1)) . (7.67)

For the lower bound we consider a test unitary flow ψ(σ, σ′) := Ψ(mN (σ),mN (σ′)) with{Ψ(m,m+ 2

N ) =(S(m) (1−m)N

2

)−1

∀m ∈ [m−,m+]N ,

Ψ(m,m′) = 0 otherwise. (7.68)

Inserting the test flow in (2.36), we then have

Cλk (R,X \R)−1 ≤ D(ψ) +µ(R)

kS(m−)

e−βNu(m−)

µ(R)ZN

(ZN · e−βNu(m−)

S(m−)

)2

+µ(X \R)

λS(m+)

e−βNu(m+)

µ(X \R)ZN

(ZN · e−βNu(m+)

S(m+)

)2

=

m0∑m=m−

ZN · eβNfN (m)

p(m,m+ 2N )

+1

kZN · eβNfN (m−) +

1

λZN · eβNfN (m+)

≤ C(m−,m+)−1(1 + o(1)) ,

(7.69)

where in the last step we used that k−1 � φ∗R−1 = µ(m−)C(m−,m0)−1, λ−1 � φ∗X\R

−1 =

µ(m+)C(m0,m+)−1, and the fact that C(m−,m0) , C(m0,m+) and C(m−,m+) are all of orderN−1e−βNf(m0) (see Appendix B). From (7.64) and with the above estimates, we then have

φ∗R,λ =C(m−,m+)

µ(R)(1 + o(1)) . (7.70)

Finally, if we choose λ so that λT ∗X\R � 1, for example λ� N−5/2, then we can apply Th. 2.19 andget an upper bound on the mixing time of the same order of the transition and relaxation times.Altogether, in the limit N → ∞ and for k, λ such that e−βNΓ � k � N−3/2 and e−βNΓ} � λ �N−5/2, it holds

i)

γ−1 = EµR(τX\R,λ)(1 + o(1))

=2πN · eβN(f(m0)−f(m−))

β√|f ′′(m0)|f ′′(m−)(1−m2

0)(1−m2−)

(1 + o(1))(7.71)

(ii)τmix( 1

4 ) ≤ 4γ−1(1 + o(1)) (7.72)

Remark. Notice that in the Curie-Weiss model the mean exit time and the mean transition timediffer asymptotically only by a factor 2 (see Eqs. (7.54)and (7.71)). This is a slight difference butone that clarifies the different rule of the exit time from the transition time. Notice also that by thewell-known bound τmix( 1

4 ) ≥ γ−1, the second result shows that the mixing time and the relaxationtime are of the same order, which is N · eβN(f(m0)−f(m−)).

7.2. The wasp graph. Given three positive real numbers ra, rt and rw and a positive integer n,we set la = bnrac, lt = bnrtc and lw = bnrwc. We then consider two cubic lattices with verticesindexed by {0, . . . , la}3 and {0, . . . , lt}3, four copies of the square lattice with vertices indexed by{0, . . . , lw}2 and we attached them together by identifying some corners as in Figure 1, formingthen the “wasp graph” with its four “wings” and its “abdomen” attached to its central “thorax”. Wefinally place ourself in the regime n � 1 and consider the random walk with constant fixed rate αbetween nearest-neighbour, with α ≤ 1/6 to satisfy our hypothesis (2.2).

Without wings and with ra = rt = 1 our wasp would be the three-dimensional “n-dog” of [17]and we would have a relaxation time and mixing time of order n3. We will reprove this result byusing our (κ, λ)-capacities, actually considering the same kind of flows as those used in [17] but,

Page 37: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 36

lw

la

lt

FIGURE 1. A wasp without a head, and maybe misplaced wings.

as we will see, with some more flexibility in building such flows. We will also show that, as onecould expect, adding the wings will not change the spectral gap and mixing time asymptotics. Thisis, in particular, to illustrate how one can recursively apply Theorem (2.10): γR can be estimatedby applying the theorem to the restricted dynamics itself. The last reason why we introduced thistoy model that combines two and three-dimensional graphs is that it illustrates some limits of ourresult: while using the three-dimensional parts of our graph we will be able to estimate easily themixing time of our random walk and prove asymptotic exponential laws for exit and transitiontimes, the two-dimensional “wing pair restricted” random walk is associated with a too slowlydecreasing ε∗R to control more than the relaxation time: we are in the regime ε∗R � 1 but outsidethe regime φ∗RT

∗R � 1.

To fix some notation, let us call Rt the cubic lattice {0, . . . , lt}3 and Ra this other cubic latticefrom which one corner is removed to have a partition of the vertices Xb = Rt ∪ Ra that describethe wasp body obtained after remotion of the wings. In the same way we call R1, R2, R3 andR4 the four square lattices from which one point has been removed to obtain a partition of R =

Rt ∪⋃4i=1Ri that is the front part of the wasp. We then have a partition of the whole graph

X = R∪RaLet us start with the study of the Xb-restricted random walk. We will write φ∗t , γt and ε∗t instead

of φ∗Rt , γRt and ε∗Rt . From Lemma 2.2 we have, with obvious notation, φ∗t ≤ φt = 3α/(1 + lt)3. As

far as γt is concerned we can use the following lemma.

Lemma 7.5. For d ≥ 1, if γd is the spectral gap of the random walk on the d-dimensional lattice{0, . . . , l}d with nearest-neighbour jump rate α, then

1

γd≤ dl(l + 1)

2α/e. (7.73)

In addition, if we call 0 the all 0 coordinate vertex and γ′d the spectral gap of restricted random walkon {0, . . . , l}d \ {0}, then

1

γ′d≤ d(l + 1)2

2αd+1/e. (7.74)

Proof. The estimate (7.73) is obtained by the standard coordinate by coordinate coupling for thelazy version of the original random walk. The same coupling can be used for the random walk onthe graph with one removed corner to bring each coordinate of two lazy random walks at distance1 at most. Since one can then build another coupling making them meet in d steps at most withprobability αd at least, and start the coupling again from the beginning if they do not, the meanmeeting time of these lazy random walks is bounded from above by

1

αd

(dl(l + 1)

2α+ d

)≤ d(l + 1)2

2αd+1. (7.75)

Markov’s inequality makes then possible to bound the mixing time of the lazy walk, from whichone deduces (7.74) for the original walk. �

The first part of the lemma together with the previous estimate on φ∗t gives then

ε∗t ≤3lt(lt + 1)

2α/e

(lt + 1)3≤ 9e

1 + lt. (7.76)

Page 38: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 37

To prove that l3t is the correct order of the exit time fromRt we apply Theorem 2.9 to estimate φ∗tfrom below and then just have to build a unitary flow fromRa toRt to estimate from below a (κ, λ)-capacity with λ = +∞. We send a flow of strength 1 to the junction corner (this will be modifiedwhen working with λ < +∞) and have to absorb it inRt. Since we know the probabilistic meaningof the optimal flow and κ has, heuristically, to be small enough to be close to local equilibrium atabsorption, we should absorb a fraction of order 1/(1+ lt)

3 of this unitary flow in each vertex ofRt.But we are not constrained to realize this exactly and this is where we have some flexibility thathelps in computation. Since we also know from the electrical network picture that the optimal flowshould in some sense be radially distributed, we build our flow as the mean of a random simpleflow with some spherical symmetry. Let us first explain how two build a certain random path ξ.We begin by choosing a point Q with positive coordinates in the origin centered euclidean ball ofradius (1 + lt) according to the normalized Lebesgue measure. This point belongs to some unitarycube with integer coordinates corners and we call Q′ the corner with the smallest coordinates. Wethen approximate the radius [0, Q] by a coordinate non-decreasing path that starts from 0, is onlymade of edges along the unitary cubes crossed by [0, Q], and ends in Q′. Such a path is in particulara shortest path on the lattice that links 0 with Q′ and the fact that is exists can be shown by shownby recurrence on the dimension and by using projections along coordinate axes. For such a randompath ξ (since Q is random) we define a flow ψξ by ψξ(x, y) = 1{(x,y)∈ξ} − 1{(y,x)∈ξ}. We finally useas test flow in Thomson’s principle the associated mean flow, that is the flow ψ such that, for x andy nearest neighbours with ‖y‖2 > ‖x‖2, ψ(x, y) = P((x, y) ∈ ξ). Since the distance between theseapproximating paths and their associated radius is smaller than

√3, the chosen point has to be in

cone of half angle α, with sinα =√

3/‖y‖2 for y to be used in the approximating path. It followsthat

ψ(x, y) ≤ 118

4π(1+lt)3

3

3(1 + lt)

3(1− cosα) = 4

(1−

√1− 3

‖y‖22

)≤ 12

‖y‖2∞. (7.77)

Also, for all x ∈ Rt, we have divxΨ = P(Q′ = x) and

EµRt

[(divΨ

µRt

)2]

=∑x∈Rt

1

(1 + lt)3(1 + lt)

6P

2 (Q′ = x)

=∑x∈Rt

(1 + lt)3 Vol(C(x) ∩B1/8)2(

18

43π(1 + lt)3

)2(7.78)

where C(x) is the unitary cube with x as smallest coordinate coordinate corner, B1/8 is the positivecoordinate part of the ball of radius (1 + lt) and Vol stands for the Lebesgue measure. It followsthat

EµRt

[(divΨ

µRt

)2]≤∑x∈Rt

1

(1 + lt)3

Vol(C(x) ∩B1/8)(18

43π)2 =

6

π(7.79)

Thomson’s principle gives then, with µb the uniform measure on Xb, κ > 0 and Cκ,b(Rt, Ra) theκ-capacity between Rt and Ra that is computed relatively to the restricted random walk in Xb,

µb (Rt)Cκ,b (Rt,Ra)

≤ (1 + lt)3

α+

lt−1∑k=0

3(1 + lt)

3

α

[3(1 + k)2 − 3(1 + k) + 1

] 144

(1 + k)4+

1

κ

1

8

6

π

≤ (1 + lt)3

α

1 + 432∑k≥1

3

k2

+6

κπ

≤ 2161(1 + lt)

3

α+

6

κπ.

(7.80)

Choosing 1/κ� n3 this shows that l3t = brtnc3 is the correct order for the exit time.To estimate transition and relaxation time with the same tools, we have to estimate (κ, λ)-

capacities with finite λ. Using not only randomly chosen sinks but randomly chosen sources also,

Page 39: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 38

we can define a mean flow as previously to prove, with obvious notation,

µb (Rt)µb (Ra)

Cλκ,b (Rt,Ra)≤ µb (Ra)

lt−1∑k=0

3(1 + lt)

3

α

[3(1 + k)2 − 3(1 + k) + 1

] 144

(1 + k)4

+ µb (Rt)la−1∑k=0

3(1 + la)3

α

[3(1 + k)2 − 3(1 + k) + 1

] 144

(1 + k)4

+ µb (Ra)6

κπ+ µb (Rt)

6

λπ

≤ 2160

(µb (Ra)

(1 + lt)3

α+ µb (Rt)

(1 + la)3

α

)+

6

κπ+

6

λπ,

(7.81)

to get, by choosing also 1/λ� n3,

µb (Rt)µb (Ra)

Cλκ,b (Rt,Ra)≤ 2160

2r3t r

3a

r3t + r3

a

n3

α+ o(n3) . (7.82)

By Theorem 2.10, using (7.74) and choosing κ, λ � 1/n2, this gives an upper bound on the re-laxation time with the same asymptotics. Theorem 2.18 also provides a similar upper bound onthe mean transition time. Going to lower bounds on 1/γb = 1/γXb and 1/φ∗t,λ one could estimate(κ, λ)-capacities through Dirichlet principle, but it is better to recall that µb(Rt)/Cλκ,b(Rt,Ra) ≥(1− ε)/φ∗t,λ for any ε and large enough n and 1/φ∗t,λ ≥ 1/φ∗t ≥ (1 + lt)

3/(3α).As far as exponential asymptotic laws and mixing time asymptotics are concerned, our results

depend on our ability, with obvious notation, to control ζ∗t and show that ε∗t ln(1/ζ∗t ) goes to zeroand ensure φ∗tT

∗t � 1. This cannot be achieved by using Lemma 2.5, since ε∗t /γt = φ∗t /γ

2t � 1

and ε∗tDt is of order 1. (Estimates provided by (2.23) and (2.22) would actually be enough indimension four and five respectively.) We are, however, in the special case where (2.26) holds andproves, since φ∗t and φt are of the same order, that ε∗t ln(1/ζ∗t )� 1. This proves local thermalizationon time scale n2 lnn and exponential asymptotic laws immediately follow. This also proves that themixing time goes like n3 as soon as rt 6= ra.

We prove now that these asymptotics on relaxation, transition, exit and mixing times are stillvalid on the full wasp graph, wings included. To do so we note that our previous flow used toestimate (κ, λ)-capacities between Rt and Ra in Xb can still be used to estimate (κ, λ)-capacitiesbetween R (wings included) and Ra in the full space X . The key point now is to control γR. If ourwasp had only one wing R1, we could have use Theorem 2.10 directly on R = R1 ∪ (R \R1). Wewill use instead Lemma 2.11 and, anyway, will have to estimate (κ, λ)-capacities between R1 andRt and compare it with γ1 = γR1

and γt = γRt .Let us start by estimating φ∗1 = φ∗R1

. In this two-dimensional case the easy bound φ∗1 ≤ φ1 is nota good one. We then use the variational principle satisfied by φ∗1 (see Lemma 2.2) with the samekind of test function we would have use to estimate Cκ(R1,Rt). With V (x) = (ln(‖x‖∞))/(1+ln l1)for x ∈ R1, we have, with obvious notation,

D1(V ) =

l1−1∑k=0

2(k + 1)α

(1 + l1)2

[ln(k + 2)− ln(k + 1)

1 + ln l1

]2

≤ 2α

(1 + ln l1)2

(1 + l1)2

l1−1∑k=0

1

k + 1

≤ 2α

(1 + ln l1)2

(1 + l1)2 (1 + ln l1) =

(1 + ln l1) (1 + l1)2

(7.83)

We also have

µ1(V 2) =

l1∑k=0

(2k + 1)1

(1 + l1)2

ln2(1 + k)

(1 + ln l1)2

≥ 1

(1 + l1)2(1 + ln l1)2

∫ 1+l1

1

(2x− 1) ln2 x dx

≥ 1

3for l1 ≥ 20.

(7.84)

Page 40: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 39

Since V|∂−Rt ≡ 0 it follows that φ∗1 ≤ 6α/((1 + l1)2(1 + ln l1)) for l1 ≥ 20, and ε∗1 decreases at leastlike 1/ ln l1.

To see that we have found the right order for φ∗1 we estimate the κ-capacity between R1 and Rtby using the same kind of flow as previously. For nearest neighbours x and y with ‖y‖2 > ‖x‖2 sucha flow ψ satisfies, with sinα ≤

√2/‖y‖2 and α ≤ π/4, so that sinα ≥ α/

√2

ψ(x, y) ≤ 114πl

21

αl21 ≤4

π

√2

√2

‖y‖2≤ 8

π‖y‖∞. (7.85)

Thomson principle then gives

µR(R1)

Cκ(R1,Rt)≤ (1 + l1)2

α+

l1−1∑k=0

2(2k + 1)(1 + l1)2

α

(8

π(k + 1)

)2

+1

κ

114π

≤ (1 + l1)2

α

(1 +

256

π2

l1−1∑k=0

1

k + 1

)+

4

κπ

≤ (1 + l1)2

α(1 + 26(1 + ln l1)) +

4

κπ,

(7.86)

which proves, choosing 1/κ� n2 lnn that 1/φ∗1 is of order n2 lnn.Combining these two- and three-dimensional flows we get, denoting by Cλκ,R(·, ·) the (κ, λ)-

capacity that is computed relatively to the random walk restricted in R.

µR(R1)µR(Rt)Cλκ,R (R1,Rt)

≤ 26µR(Rt)(1 + l1)2(1 + ln l1)

α+ 2160µR(R1)

(1 + lt)3

α

+ µR(Rt)4

κπ+ µR(R1)

π

6λ.

(7.87)

With 1/κ� n2 lnn as previously and 1/λ� n3, since µR(Rt) is of order 1 and µR(R1) is of order1/n, this leads to

µR(R1)µR(Rt)Cλκ,R (R1,Rt)

≤ 26c21n2 lnn

α+O(n2) (7.88)

and, choosing also 1/κ, 1/λ � n2, one has in the same way, using the previous test function V inDirichlet principle,

µR(R1)µR(Rt)Cλκ,R (R1,Rt)

≥ 1

2c21n2 lnn

α+O(n2) . (7.89)

From Lemma 2.11, (7.73) and (7.74) it follows that 1/γR = o(n3) and the results obtained for thewasp without wings holds with the wings also.

We note however that when applying the previous two-dimensional computation to the randomwalk restricted to a pair of wings, we obtain similarly a good spectral gap control but we are notable to show the asymptotic exponential law or derive mixing time estimates, since, in this caseT ∗R1

and 1/φ∗1 are of the same order.

APPENDIX A. ESTIMATING ζ∗R

A.1. Crude and very crude estimates. We prove here lemma 2.5.

Proof of i). One has, for all x in R and t > 0,

µ∗R(x) = Pµ∗R(X(t) = x

∣∣ τX\R > t)≥ Pµ∗R

(X(t) = x, τX\R > t

). (A.1)

By the natural coupling between X and XR up to time τX\R and stochastic domination of τX\R byan exponential random variable with parameter αR that is independent from XR, it follows

µ∗R(x) ≥ Pµ∗R (XR(t) = x) e−αRt . (A.2)

By Cauchy-Schwarz inequality, Proposition 2.1, and the standard trick to control `∞(µR) normswith `2(µR) norms (the same we used in the proof of Theorem 2.4) we get

µ∗R(x) ≥

(1− e−γRt

√ε∗R

(1− ε∗R)µR(x)

)e−αRtµR(x) . (A.3)

Page 41: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 40

To make this bound useful, we notice that the term inside the bracket is larger than or equal to 1/2if

t ≥ t0 :=1

2γRln

(4ε∗R

(1− ε∗R)µR(x)

)If t0 > 0 for all x ∈ R, that is if 4ε∗R

(1−ε∗R)ζR> 1, then we can plug in its value in (A.3) and get, by

definition of ζ∗R,

ζ∗R ≥ minx∈R

µR(x)

4

(√4ε∗R

(1− ε∗R)µR(x)

)− 2αRγR

≥ ζR4

(√4ε∗R

(1− ε∗R)ζR

)− 2αRγR

. (A.4)

On the other hand, if 4ε∗R(1−ε∗R)ζR

≤ 1, we can just take the value t = 0 in (A.3) and get

ζ∗R ≥ minx∈R

(1−

√ε∗R

(1− ε∗R)µR(x)

)2

µR(x) ≥ ζR4. (A.5)

Taking the logarithm of 1/ζ∗R and putting things together, we obtain the stated inequality.

Proof of ii). The first inequality is obvious from the definition of ζ∗R. Let X denote the discreteversion of X like in Section 3.2 and let N(t) be the number of rings up to time t. Then, for z ∈ R,we have

µ∗R(z) = limt→∞

Px(X(t) = z | τX\R > t)

= limt→∞

∑k≥0

Px(X(k) = z | τX\R > k)P(N(t) = k)

≥ limt→∞

∑k≥0

Px(X(k +DR) = z | τX\R > k +DR)P(N(t) = k +DR)

= limt→∞

∑k≥0

∑y∈R

Px(X(k) = y | τX\R > k)Py(X(DR) = z | τX\R > DR)

× P(N(t) = k +DR) ,

(A.6)

where we used the notation τX\R for the hitting time of the chain X on X \R. Since for all y ∈ Rwe have

Py(X(DR) = z | τX\R > DR) ≥ Py(X(DR) = z , τX\R > DR) ≥ e−∆RDR ,

we get

µ∗R(z) ≥ e−∆RDR limt→∞

∑k≥0

∑y∈R

Px(X(k) = y | τX\R > k)P(N(t) = k +DR)

= e−∆RDR limt→∞

P(N(t) ≥ DR) = e−∆RDR .

(A.7)

A.2. Superharmonicity of h∗R. To prove that h∗R is a super-harmonic function, notice that, for allx ∈ R,

(Lh∗R)(x) = −h∗R(x) +∑y∈X

p(x, y)h∗R(y)

= −h∗R(x) +∑y∈R

p(x, y)µ∗R(y)

µR(y)

= −h∗R(x) +∑y∈R

p∗R(x, y)µ∗R(y)

µR(y)

= −h∗R(x) +∑y∈R

p∗R(y, x)µ∗R(y)

µR(x)

= −φ∗Rh∗R(x) ≤ 0 ,

(A.8)

where in the last two lines we used the reversibility of p∗R w.r.t. µR and that µ∗Rp∗R = (1− φ∗R)µ∗R.

Page 42: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 41

APPENDIX B. COMPUTATION OF RELEVANT QUANTITIES IN THE CURIE-WEISS MODEL

Here we provide some accurate estimates over relevant quantities in the characterization of themetastable behavior for the Curie-Weiss model.

B.1. Measure of the metastable set. Here we prove formula (7.53) which provides the asymptoticexpression of µ(R). By definition on µ and R and using (7.7), we have

ZN · µ(R) =

(m0−m−)N2∑

k=(−1−m−)N2

e−βNfN (m−+ 2kN )

=

√2

N

1√π(1−m2

−)

(m0−m−)N2∑

k=(−1−m−)N2

e−βNf(m−+ 2kN )(1 + o(1))

=

√2

N

1√π(1−m2

−)e−βNf(m−)

bN23 c∑

k=−dN23 e

e−βNf′′(m−)

2 ( 2kN )2

(1 + o(1))

(B.1)

where in the last step we use Taylor approximation and observe that∑|k|≥dN

23 e

e−βNf′′(m−)

2 ( 2kN )2

≤ Ne−c(β)N13 .

Approximating the sum in (B.1) with an integral, we finally get

ZN · µ(R) =1√

π(1−m2−)e−βNf(m−)

∫Re−βf

′′(m−)x2

dx(1 + o(1))

=1√

βf ′′(m−)(1−m2−)e−βNf(m−)(1 + o(1)) .

(B.2)

B.2. Capacities between points in the macroscopic scale. Here we provide an asymptotic ex-pression for the capacity between points in the one-dimensional dynamics with transition rates(7.10), that is the dynamics induced on [−1, 1]N by the Curie-Weiss heath-bath dynamics.

As recalled in Section 7.1, for points x < y ∈ [−1, 1]N it holds

C(x, y)−1 =

(y−x)N2 −1∑

k=0

(c(x+ 2k

N , x+ 2(k+1)N

))−1

,

with c(x, y) = µ(x)p(x, y). In the following, we will first provide an asymptotic approximationfor C(x, y)−1 when m0 6∈ [x, y], and then compute the asymptotic formulas of C(m−,m0)−1,C(m0,m+)−1 and C(m−,m0)−1 .

If m0 6∈ [x, y], we can assume w.l.o.g. that f(x) > f(z) for all z ∈ (x, y). Bounding below therates p with a positive constant c(β) and from (7.7), we get

C(x, y)−1 ≤ c(β)√NZN

(y−x)N2 −1∑

k=0

eβNf(x+2kN )

≤ c(β)√NZNe

βNf(x)

(y−x)N2 −1∑

k=0

e−β|f′(ξk)||2k

≤ c(β)√NZNe

βNf(x)

, (B.3)

where in the second line we used f(x + 2kN ) − f(x) = −|f(ξk)| 2kN for some ξk ∈ (x, x + 2k

N ) andthat there exists a constant c > 0 such that |f(ξk)| > c uniformly in k. If instead f(y) > f(z) forall z ∈ (x, y), then is enough to switch x and y and run the argument above, as C(x, y) = C(y, x).Altogether, this provides inequality (7.28).

Page 43: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 42

To compute C(m−,m0), we first notice that, since m0 is a critical point of f , we can write

tanh(β∆±(x+ 2kN )) = ± tanh(β(m0 + h))(1 + o(1)) = ±m0(1 + o(1))

and then get the approximation p(x ± 2kN , x ±

2(k+1)N ) =

(1−m20)

4 (1 + o(1)). Proceeding as for thecomputation of µ(R), we have

C(m−,m0)−1 =4

(1−m20)ZN

(m0−m−)N2 −1∑

k=0

eβNfN (m0−2kN )(1 + o(1))

= 2

√2Nπ

(1−m20)ZNe

βNf(m0)

bN23 c∑

k=0

e−−βN|f′′(m0)|

2 ( 2kN )2

(1 + o(1))

= 2N

√π

(1−m20)ZNe

βNf(m0)

∫ +∞

0

e−β|f′′(m0)|x2

dx (1 + o(1))

=πN√

(1−m20)β|f ′′(m0)|

ZNeβNf(m0)(1 + o(1)) .

(B.4)

This provides formula (7.52).Similarly we can compute C(m0,m+)−1 and C(m−,m+)−1. In the first case we let the sum over

k of (B.4) run from (m0 − m+)N2 to 0, and then get the same result as for C(m−,m0)−1. Whencomputing C(m−,m+)−1, we let the sum over k run from (m0 −m+)N2 to (m0 −m−)N2 − 1. Wethen approximate the sum by an integral over all R that finally produces an extra factor of 2 withrespect to (B.4). This yields formula (7.66).

REFERENCES

[1] W. THOMSON, P. G. TAIT, Treatise on Natural Philosophy, Oxford (1867).[2] A. M. YAGLOM, Certain limit theorems of the theory of branching stochastic processes, Dokl. Akad. Nauk. SSSR 56,

795-798 (1947).[3] J. N. DARROCH, E. SENETA, On quasi-stationary distributions in absorbing discrete-time finite Markov chains, J. Appl.

Probability 2, 88-100 (1965).[4] O. PENROSE, J. L. LEBOWITZ, Rigorous treatment of metastable states in the van der Waals-Maxwell Theory, J. Stat.

Phys. 3, 211-241 (1971).[5] A. D. WENTZELL, Formulas for eigenfunctions and eigenmeasures that are connected with a Markov process, Teor.

Verojatnost. i Primenen. 18, 329 (1973).[6] M. CASSANDRO, A. GALVES, E. OLIVIERI AND M. E. VARES, Metastable behaviour of stochastic dynamics: a pathwise

approach, J. Stat. Phys 35, 603-634 (1984).[7] M. I. FREIDLIN, A. D. WENTZELL, Random Perturbations of Dynamical Systems, Springer-Verlag (1984).[8] E. J. NEVES, R. SCHONMANN, Critical droplets and metastability for a Glauber dynamics at very low temperature,

Comm. Math. Phys. 137, 209-230 (1991).[9] D. J. ALDOUS, M. BROWN, Inequalities for rare events in time-reversible Markov chains I, Stochastic Inequalities, ed.

by M. Shaked and Y. L. Tong, IMS Lecture Notes in Statistics 22, 1-16 (1992).[10] E. J. NEVES, R. SCHONMANN, Behaviour of droplets for a class of Glauber dynamics at very low temperature, Prob.

Theory Relat. Fields 91, 331-354 (1992).[11] E. SCOPPOLA, Metastability for Markov chains: a general procedure based on renormalization group ideas, in Proba-

bility theory of spatial disorder and phase transition Cambridge (1993) Ed. G.R.Grimmett - Kluwer Acad.Publ.(1994)303-322.

[12] E. SCOPPOLA, Renormalization and graph methods for Markov chains, in Advances in Dynamical Systems and Quan-tum Physics S.Albeverio, R.Figari, E.Orlandi, A.Teta Ed. - World Scientific 1995.

[13] G. BEN AROUS, R. CERF, Metastability of the three-dimensional Ising model on a torus at very low temperature,Electron. J. Prob. 1 (1996).

[14] L. MICLO, Sur les problemes de sortie discrets inhomogenes, Ann. Appl. Probab. 6, 1112-1156 (1996).[15] P. DEHGHANPOUR, R. SCHONMANN, Metropolis dynamics relaxation via nucleation and growth, Comm. Math. Phys.

188, 89-119 (1997).[16] J. R. NORRIS, Markov Chains, Cambridge University Press (1997).[17] L. SALOFF-COSTE, Lectures on finite Markov chains Lecture Notes in Mathematics 1665, 301-413 (1997).[18] P. MATHIEU, P. PICCO, Metastability and convergence to equilibrium for the random field Curie-Weiss model, J. Statist.

Phys. 91, 679-732 (1998).[19] R. SCHONMANN, S. SHLOSMAN, Wulff droplets and the metastable relaxation of kinetic Ising models, Comm. Math.

Phys. 194, 389-462 (1998).[20] A. BOVIER, M. ECKHOFF, V. GAYRARD AND M. KLEIN, Metastability and low lying spectra in reversible Markov chains,

Commun. Math. Phys. 228, 219-255 (2002).

Page 44: Metastable states, quasi-stationary distributions and soft ... · hypotheses for Markov chains on a finite configuration space in some asymptotic regime. By com-paring restricted

METASTABILITY AND QUASI-STATIONARY MEASURES 43

[21] A. BOVIER, F. MANZO, Metastability in Glauber dynamics in the low temperature limit: Beyond exponential asymp-totics J. Statist. Phys. 107, 757-779 (2002).

[22] G. BEN AROUS, A. BOVIER AND V. GAYRARD, Glauber dynamics of the random energy model. 1. Metastable motion onthe extreme states, Commun. Math. Phys. 235, 379-425 (2003)

[23] A. BOVIER, M. ECKHOFF, V. GAYRARD AND M. KLEIN, Metastability in reversible diffusion processes 1. Sharp estimatesfor capacities and exit times. J. Eur. Math. Soc. 6, 399-424 (2004).

[24] W. HUISINGA, S. MEYN, C. SCHUTTE, Phase transitions and metastability in Markovian and molecular systems, Ann.Appl. Prob. 14, 419-458 (2004).

[25] M. JERRUM, J. B. SON, P. TETALI AND E. VIGODA, Elementary bounds on Poincare and log-Sobolev constants fordecomposable Markov chains, Ann. Appl. Prob. 14, 1741-1765 (2004).

[26] A. BOVIER, V. GAYRARD AND M. KLEIN, Metastability in reversible diffusion processes. 2. Precise estimates for smalleigenvalues J. Eur. Math. Soc. 7, 69-99 (2005).

[27] A. GAUDILLIERE, E. OLIVIERI AND E. SCOPPOLA, Nucleation pattern at low temperature for local Kawasaki dynamicsin two dimensions, Markov Processes Relat. Fields 11, 553-628 (2005).

[28] E. OLIVIERI, M. E. VARES, Large deviations and metastability, Cambridge University Press (2005).[29] A. BOVIER, F. DEN HOLLANDER AND F. NARDI, Sharp asymptotics for Kawasaki dynamics on a finite box with open

boundary conditions Probab. Theor. Rel. Fields. 135, 265-310 (2006).[30] A. IOVANELLA, B. SCOPPOLA AND E. SCOPPOLA, Some spin glass ideas applied to the clique problem, J. Stat. Phys. 126,

895-915 (2007).[31] O. BERTONCINI, J. BARRERA M. AND R. FERNANDEZ, Cut-off and exit from metastability: two sides of the same coin,

C. R. Math. 346, 691-696 (2008).[32] A. BOVIER, A. FAGGIONATO, Spectral analysis of Sinai’s walk for small eigenvalues Ann. Probab. 36, 198-254 (2008).[33] E. N. M. CIRILLO, F. R. NARDI AND C. SPITONI, Metastability for reversible probabilistic cellular automata with self-

interaction, J. Stat. Phys. 132, 431-471 (2008).[34] A. BIANCHI, A. BOVIER AND D. IOFFE, Sharp asymptotics for metastability in the Random Field Curie-Weiss model, EJP

14, 1541-1603 (2009).[35] A. GAUDILLIERE, Condenser physics applied to Markov chains - A brief introduction to potential theory,

arXiv:0901.3053.[36] A. GAUDILLIERE, F. DEN HOLLANDER, F. R. NARDI, E. OLIVIERI AND E. SCOPPOLA, Ideal gas approximation for a

two-dimensional rarefied gas under Kawasaki dynamics, Stoch. Proc. and Appl. 119, 737-774 (2009).[37] J. BELTRAN, C. LANDIM, Tunneling and metastability of continuous time Markov chains J. Stat. Phys. 140, 1065-1114

(2010).[38] A. BOVIER, F. DEN HOLLANDER AND C. SPITONI, Homogeneous nucleation for Glauber and Kawasaki dynamics in large

volumes and low temperature Ann. Prob. 38, 661-713 (2010).[39] D.A. LEVIN, M. LUCZAK, Y. PERES,, Glauber dynamics for the mean field Ising model: cut-off, critical power law, and

metastability, Probab. Theory. and Related Fields 146, no. 1-2, 223-265 (2010).[40] L. MICLO, On absorption times and Dirichlet eigenvalues, ESIAM : PS 14, 117-150 (2010).[41] J. BELTRAN, C. LANDIM, Metastability of reversible finite state Markov processes, Stoch. Proc. and Appl. 121, 1633-

1677 (2011).[42] A. GAUDILLIERE, B. SCOPPOLA, E. SCOPPOLA AND M. VIALE, Phase transitions for the cavity approach to the clique

problem on random graphs, Journal of Stat. Phys. 145, 1127-1155 (2011).[43] J. BELTRAN, C. LANDIM, Metastability of reversible condensed zero range processes on a finite set, PROBAB. THEORY

AND RELATED FIELDS 152, 781-807 (2012).[44] A. BIANCHI, A. BOVIER AND D. IOFFE, Point-wise estimates and exponential laws in metastable systems via coupling

methods, Ann. Prob. 40, 339-371 (2012).[45] R. CERF, F. MANZO, Nucleation and growth for the Ising model in d dimensions at very low temperatures, Ann. Prob.

41, 3697-3785 (2013).[46] J. BELTRAN, C. LANDIM, A Martingale approach to metastability, Prob. Therory and Related Fields (2014).[47] R. FERNANDEZ, F. MANZO, F. NARDI, E. SCOPPOLA AND J. SOHIER Conditioned, quasi-stationary, restricted measures

and escape from metastable states, arXiv:1410.4814.[48] A. Bovier and F. den Hollander, Metastability: a potential-theoretic approach. Monograph, xx + 569 pp. to appear in

Grundlehren der mathematischen Wissenschaften, Vol. 351 Springer (2015), draft version available from A. Bovier’sweb page.

A. BIANCHI, DIP. DI MATEMATICA - UNIVERSITA DI PADOVA, VIA TRIESTE, 63 - 35121 PADOVA, ITALY

E-mail address: [email protected]

A. GAUDILLIERE, AIX MARSEILLE UNIVERSITE, CNRS, CENTRALE MARSEILLE, I2M, UMR 7373, 13453 MARSEILLE,FRANCE

E-mail address: [email protected]