on non-negative unbiased estimators
DESCRIPTION
Talk on the design on non-negative unbiased estimators, useful to perform exact inference for intractable target distributions. Corresponds to the article http://arxiv.org/abs/1309.6473TRANSCRIPT
On non-negative unbiased estimators
Pierre E. Jacob@ University of Oxford
& Alexandre H. Thiery@ National University of Singapore
Banff – March 2014
Pierre Jacob Non-negative unbiased estimators 1/ 28
Outline
1 Exact inference
2 Unbiased estimators and sign problem
3 Existence and non-existence results
4 Discussion
Pierre Jacob Non-negative unbiased estimators 2/ 28
Outline
1 Exact inference
2 Unbiased estimators and sign problem
3 Existence and non-existence results
4 Discussion
Pierre Jacob Non-negative unbiased estimators 2/ 28
Exact inference
For a target probability distribution with unnormalised density π, anumerical method is “exact” if for any test function φ,∫
φ(θ)π(θ)dθ∫π(θ)dθ
can be approximated with arbitrary precision, at the cost of morecomputational effort.
⇒ In this sense MCMC is exact.
Pierre Jacob Non-negative unbiased estimators 3/ 28
Exact inference
With exact methods, no systematic error.
No guarantees that for a fixed computational budget, exactmethods should be preferred over approximate methods.
Still important to know in which settings exact methods areavailable.
Pierre Jacob Non-negative unbiased estimators 4/ 28
Exact inference using Metropolis-Hastings
Metropolis-Hastings algorithm.
1: Set some θ(1).2: for i = 2 to Nθ do3: Propose θ⋆ ∼ q(·|θ(i−1)).4: Compute the ratio:
α = min(
1,π(θ⋆)
π(θ(i−1))q(θ(i−1)|θ⋆)q(θ⋆|θ(i−1))
).
5: Set θ(i) = θ⋆ with probability α, otherwise set θ(i) = θ(i−1).6: end for
Pierre Jacob Non-negative unbiased estimators 5/ 28
Exact inference with unbiased estimators
Pseudo-marginal Metropolis-Hastings algorithm.For each θ, we can sample Z (θ) with E(Z (θ)) = π(θ).
1: Set some θ(1) and sample Z (θ(1)).2: for i = 2 to Nθ do3: Propose θ⋆ ∼ q(·|θ(i−1)) and sample Z (θ⋆).4: Compute the ratio:
α = min(
1,Z (θ⋆)
Z (θ(i−1))q(θ(i−1)|θ⋆)q(θ⋆|θ(i−1))
).
5: Set θ(i) = θ⋆, Z (θ(i)) = Z (θ⋆) with probability α, otherwiseset θ(i) = θ(i−1), Z (θ(i)) = Z (θ(i−1)).
6: end for
Pierre Jacob Non-negative unbiased estimators 6/ 28
Exact inference with unbiased estimators
Game-changer when one has access to efficient unbiasedestimators of the target density.
Especially in parameter inference for hidden Markov models,with particle MCMC methods based on particle filters toestimate the likelihood.
What if one doesn’t have access to straightforward unbiasedestimators? Are there general schemes to obtain thoseunbiased estimators?
Pierre Jacob Non-negative unbiased estimators 7/ 28
Exact inference with unbiased estimators
Example: big dataObservations yi
iid∼ fθ for i = 1, . . . , n, and n is very large.Can we do exact inference without computing the full likelihoodevery time we try a new parameter value?
Unbiased estimator of the log-likelihood
ℓ(θ) = (n/m)m∑
i=1log f (yσi | θ)
for m < n and σi corresponding to some subsampling scheme.
It doesn’t directly provide an unbiased estimator of thelikelihood.
Pierre Jacob Non-negative unbiased estimators 8/ 28
Exact inference with unbiased estimators
Doubly intractable distributionsPosterior density decomposable into
π(θ | y) = 1C (θ)
f (y; θ)p(θ).
One can typically get an unbiased estimator of C (θ) usingimportance sampling.
It doesn’t directly provide an unbiased estimator of 1/C (θ).
Pierre Jacob Non-negative unbiased estimators 9/ 28
Exact inference with unbiased estimators
Reference priorsStarting from an arbitrary prior π⋆, define
fk(θ) = exp{∫
p(y1, . . . , yk | θ) log π⋆(θ | y1, . . . , yk)dy1 . . . dyk
}and the reference prior is, for any θ0 in the interior of Θ,
f (θ) = limk→∞
fk(θ)fk(θ0)
.
(Berger, Bernardo, Sun 2009.)Unbiased estimators of f (θ)?
Pierre Jacob Non-negative unbiased estimators 10/ 28
Outline
1 Exact inference
2 Unbiased estimators and sign problem
3 Existence and non-existence results
4 Discussion
Pierre Jacob Non-negative unbiased estimators 10/ 28
Unbiased estimators
Von Neuman & Ulam (∼ 1950), Kuti (∼ 1980), Rychlik (∼ 1990),McLeish, Rhee & Glynn (∼ 2010). . .
Removing the bias off consistent estimatorsIntroduce
a random variable S with E(S) = λ ∈ R,a sequence (Sn)n≥0 converging to S in L2,N be an integer valued random variable andwn = 1/P(N ≥ n) < ∞ for all n ≥ 0,
then
Y =N∑
n=0wn ×
(Sn − Sn−1
)has expectation E(Y ) = E(S) = λ.
Pierre Jacob Non-negative unbiased estimators 11/ 28
Unbiased estimators
If ∞∑n=1
wn × E(
|S − Sn−1|2)
< ∞, (1)
then the variance of Y is finite.
Denote by tn the expected computing time to obtainSn − Sn−1.Then the computing time of Y , denoted by τ , shouldpreferably satisfy
E(τ) =∞∑
n=0w−1
n × tn < ∞. (2)
Success story in multi-level Monte Carlo.
Pierre Jacob Non-negative unbiased estimators 12/ 28
Unbiased estimators
Even if the consistent estimators Sn are each almost-surelynon-negative, Y is not in general almost-surely non-negative:
Y =N∑
n=0wn ×
(Sn − Sn−1
),
unless we manage to construct ordered consistent estimators, ie:
P(Sn−1 ≤ Sn) = 1.
Direct implementation of the pseudo-marginal approach is difficultin the presence of possibly negative acceptance probabilities.
Pierre Jacob Non-negative unbiased estimators 13/ 28
Dealing with negative values
One can still perform exact inference by noting∫φ(θ)π(θ)dθ∫
π(θ)dθ=∫
φ(θ)σ(π(θ))|π(θ)|dθ∫σ(π(θ))|π(θ)|dθ
which suggests using the absolute values of Z (θ) in the MHacceptance ratio.The integral is recovered using the importance samplingestimator: ∑N
i=1 σ(Z (θ(i)))φ(θ(i))∑Ni=1 σ(Z (θ(i)))
As an importance sampler, deteriorates with the dimension.Called the sign problem in lattice quantum chromodynamics.
Pierre Jacob Non-negative unbiased estimators 14/ 28
Avoiding the sign problem
Can we avoid the sign problem by directly designingnon-negative unbiased estimators?
Given an unbiased estimator of λ > 0, can I generate anon-negative unbiased estimator of λ?
Let f be any function f : R → R+. Given an unbiasedestimator of λ ∈ R, can I generate a non-negative unbiasedestimator of f (λ)?
Pierre Jacob Non-negative unbiased estimators 15/ 28
Outline
1 Exact inference
2 Unbiased estimators and sign problem
3 Existence and non-existence results
4 Discussion
Pierre Jacob Non-negative unbiased estimators 15/ 28
f -factory
Let X be a subset of R and f : conv(X ) → R+ a function.
DefinitionAn X -algorithm A is an f -factory if, given as inputs
any i.i.d sequence X = (Xk)k≥1 with expectation λ ∈ R,an auxiliary random variable U ∼ Uniform(0, 1) independentof (Xk)k≥1,
then Y = A(U , X) is a non-negative unbiased estimator of f (λ).
Pierre Jacob Non-negative unbiased estimators 16/ 28
X -algorithm
Let X be a subset of R.
DefinitionAn X -algorithm A is a pair
(T , φ
)where
T = (Tn)n≥0 is a sequence of Tn : (0, 1) × X n → {0, 1},φ = (φn)n≥0 is a sequence of φn : (0, 1) × X n → R+.
A ≡ (T , φ) takes u ∈ (0, 1) and x = (xi)i≥1 ∈ X ∞ as inputs andproduces as output
exit time: τ = τ(u, x) = inf{n ≥ 0 : Tn(u, x1, . . . , xn) = 1}exit value: A(u, x) = φτ (u, x1, . . . , xτ )
Set A(u, x) = ∞ if Tn never gives 1.
Pierre Jacob Non-negative unbiased estimators 17/ 28
General non-existence of f -factories
TheoremFor any non constant function f : R → R+, no f -factory exists.
LemmaGiven i.i.d copies of an unbiased estimator of λ > 0 and a uniformrandom variable U , there is no algorithm producing a non-negativeunbiased estimator of λ.
The lemma is not directly implied by the theorem but the proof isvery similar.
Pierre Jacob Non-negative unbiased estimators 18/ 28
Proof
For the sake of contradiction, introducea non-constant function f : R → R+, and λ1, λ2 ∈ R withf (λ1) > f (λ2),an f -factory (φ, T ).
Consider an i.i.d sequence X = (Xn)n≥1 with expectation λ1.Then
A(U , X) = φτX (U , X1, . . . , XτX )
has expectation f (λ1), and
τX = inf{n : Tn(U , X1, . . . , Xn) = 1}
is almost surely finite.
Pierre Jacob Non-negative unbiased estimators 19/ 28
Proof
An f -factory should work for any input sequence.Introduce Bernoulli variables (Bn)n≥1, with P(Bn = 0) = ε and
Yn = Bn Xn + λ2 − λ1(1 − ε)ε
(1 − Bn)
so that E(Yn) = λ2.Then
A(U , Y ) = φτY (U , Y1, . . . , YτY )
has expectation f (λ2) < f (λ1), and
τY = inf{n : Tn(U , Y1, . . . , Yn) = 1}
is almost surely finite.
Pierre Jacob Non-negative unbiased estimators 20/ 28
Proof
By construction we can tune the probability (1 − ε)n of
Mn = {(Y1, . . . , Yn) = (X1, . . . , Xn)},
by changing ε. On the events
{(Y1, . . . , Yn) = (X1, . . . , Xn)}
the algorithm has to “compensate”, so that
f (λ2) = E[A(U , Y )] < E[A(U , X)] = f (λ1).
But the algorithm cannot ouptut values lower than zero⇒ for ε small enough it leads to a contradiction.
Pierre Jacob Non-negative unbiased estimators 21/ 28
Other cases
By putting more restrictions on X we get different results.
The case where X ⊂ R+ and f is decreasing also leads to anon-existence result.
The case where X ⊂ R+ and f is increasing allows someconstructions, for instance for real analytic functions.
Pierre Jacob Non-negative unbiased estimators 22/ 28
Other cases
No full characterisation of increasing functions allowingf -factories for X ⊂ R+, yet.
The case where X = [a, b] is related to the Bernoulli factory.Necessary and sufficient condition: f continuous and thereexist n, m ∈ N and ε > 0 such that
∀x ∈ [a, b] f (x) ≥ ε min ((x − a)m , (b − x)n)
Pierre Jacob Non-negative unbiased estimators 23/ 28
Case X = [a, b]
Assume
∀x ∈ [a, b] f (x) ≥ ε min ((x − a)m , (b − x)n) .
Introduceg : x 7→ f (x)/{(x − a)m(b − x)n}
bounded away from zero.Hence g can be approximated from below by polynomials.Introduce
P1(x) =∑
(i,j)∈I1
α(1)i,j (x − a)i(b − x)j
with non-negative coefficients, and such that P1(x) ≤ g(x).Then approximate g − P1 from below by P2, g − P1 − P2 by P3,etc.
Pierre Jacob Non-negative unbiased estimators 24/ 28
Case X = [a, b]
We obtain a sum of polynomials∑n
k=0 Pk(x) converging to g(x)when n → ∞.We multiply by (x − a)m(b − x)n to estimate f (x) instead.This leads to a sequence of estimators
Sn =n∑
k=0
∑(i,j)∈Ik
a(k)i,j
{ i∏p=1
(Xp − a)i}{ j∏
q=1(b − Xi+q)j
}
for which P(Sn−1 ≤ Sn) = 1, yielding a non-negative unbiasedestimator of f (λ).
Pierre Jacob Non-negative unbiased estimators 25/ 28
Outline
1 Exact inference
2 Unbiased estimators and sign problem
3 Existence and non-existence results
4 Discussion
Pierre Jacob Non-negative unbiased estimators 25/ 28
Back to statistics
No f -factory for X = R and any non-constant f .⇒ without lower bounds on the log-likelihood estimator, nonon-negative unbiased likelihood estimators.
No f -factory for decreasing functions f and X = R+.⇒ without lower and upper bounds on the estimator of C (θ),no non-negative unbiased estimators of 1/C (θ).
For the reference prior, it seems hopeless unless X = [a, b].
Pierre Jacob Non-negative unbiased estimators 26/ 28
Discussion
No answer for the case f “slowly” increasing and X ⊂ R+.
We only considered the transformation of an unbiasedestimator of λ to an unbiased estimator of f (λ).
Should we tolerate negative values and come up withappropriate methodology?
Should we aim for exact inference?
Thanks!
Pierre Jacob Non-negative unbiased estimators 27/ 28
References
On non-negative unbiased estimators, Jacob, Thiery, 2014(arXiv)
Playing Russian Roulette with Intractable Likelihoods,Girolami, Lyne, Strathmann, Simpson, Atchade, 2013 (arXiv)
Computational complexity and fundamental limitations tofermionic quantum Monte Carlo simulations, Troyer, Wiese,2005 (Phys. rev. let. 94)
Unbiased Estimation with Square Root Convergence for SDEModels , Rhee, Glynn, 2013 (arXiv)
Pierre Jacob Non-negative unbiased estimators 28/ 28