Page 1: A Marginalisation Paradox Example

A Marginalisation Paradox Example

Dennis Prangle

28th October 2009

Page 2: A Marginalisation Paradox Example

Overview

Bayesian inference recap

Example of error due to a marginalisation paradox

(Very) rough overview of general issues

Page 3: A Marginalisation Paradox Example

Part I

Bayesian Inference

Page 4: A Marginalisation Paradox Example

Bayesian Inference

Prior distribution on parameters θ: p(θ)

Model for the data X: f(X|θ)

Posterior distribution is (using Bayes’ theorem):

$$f(\theta \mid X) = \frac{p(\theta)\, f(X \mid \theta)}{\int p(\theta)\, f(X \mid \theta)\, d\theta}$$

n.b. p(θ) only needed up to proportionality

Bayesian inference performed using computational Monte Carlo methods (e.g. MCMC)

Typically also don’t need normalisation constant for p(θ) as ratios used
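A minimal sketch of why normalisation constants drop out, assuming a random-walk Metropolis sampler. The N(0, 10²) prior, the N(θ, 1) likelihood, the toy data and all function names are illustrative assumptions, not taken from the talk. Only the ratio of unnormalised posterior values enters the accept step.

```python
import math
import random


def log_unnormalised_posterior(theta, data):
    # Illustrative choices: N(0, 10^2) prior and N(theta, 1) likelihood,
    # both written WITHOUT their normalising constants.
    log_prior = -0.5 * (theta / 10.0) ** 2
    log_lik = -0.5 * sum((x - theta) ** 2 for x in data)
    return log_prior + log_lik


def metropolis(data, n_iter=20000, step=0.5, seed=1):
    rng = random.Random(seed)
    theta = 0.0
    current = log_unnormalised_posterior(theta, data)
    samples = []
    for _ in range(n_iter):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_unnormalised_posterior(proposal, data)
        # Accept with probability min(1, ratio); any constant factor in the
        # prior or likelihood cancels in this ratio.
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate
        samples.append(theta)
    return samples


data = [1.2, 0.8, 1.5, 0.9, 1.1]  # toy data, for illustration only
samples = metropolis(data)
kept = samples[2000:]  # discard burn-in
posterior_mean = sum(kept) / len(kept)
```

For this conjugate toy model the exact posterior mean is sum(data) / (n + 1/100), so the sampler's output can be checked against it.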

Page 5: A Marginalisation Paradox Example

Improper Prior

A probability density p(θ) (roughly speaking!) satisfies:

1. p(θ) ≥ 0

2. ∫ p(θ) dθ = 1

An improper prior doesn’t require condition 2

Instead can have ∫ p(θ) dθ = ∞

Example: p(θ) = 1 (“improper uniform”)

Sometimes used to represent prior ignorance

Resulting posterior often a proper distribution

⇒ meaningful conclusions (. . . or are they?!)
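A standard textbook illustration, not taken from the talk: for a single observation x from N(θ, 1) with the improper uniform prior p(θ) = 1, Bayes’ theorem still yields a density that integrates to one.

```latex
f(\theta \mid x)
  = \frac{1 \cdot \exp\{-\tfrac{1}{2}(x-\theta)^2\}}
         {\int_{-\infty}^{\infty} \exp\{-\tfrac{1}{2}(x-\theta)^2\}\, d\theta}
  = \frac{1}{\sqrt{2\pi}} \exp\{-\tfrac{1}{2}(\theta - x)^2\},
\qquad \text{i.e. } \theta \mid x \sim N(x, 1).
```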

Page 6: A Marginalisation Paradox Example

Part II

Example: Tuberculosis in San Francisco

Page 7: A Marginalisation Paradox Example

Background: Tuberculosis

Tuberculosis is an infectious disease spread by bacteria

Epidemiological interest lies in estimating rates of transmission and recovery

Conjectured that data on bacteria mutation provides information → more accurate inference

Page 8: A Marginalisation Paradox Example

Background: Paper

Tanaka et al. (2006) investigated a Tuberculosis outbreak in San Francisco in 1991/2

473 samples of Tuberculosis bacteria taken at a particular date

Genotyped according to a particular genetic marker

Samples split into clusters which share the same genotype

Cluster size:          1   2   3   4   5   8  10  15  23  30
Number of clusters:  282  20  13   4   2   1   1   1   1   1
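As a quick consistency check on the table (the variable names are illustrative), the cluster sizes weighted by their counts should recover the 473 samples:

```python
cluster_sizes = [1, 2, 3, 4, 5, 8, 10, 15, 23, 30]
cluster_counts = [282, 20, 13, 4, 2, 1, 1, 1, 1, 1]

# Each cluster of size s contributes s samples.
total_samples = sum(s * c for s, c in zip(cluster_sizes, cluster_counts))
total_clusters = sum(cluster_counts)
```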

Page 9: A Marginalisation Paradox Example

Model: Underlying disease process

Assume initially there is one case

3 event types: birth, death, mutation (→ new genotype)

Suppose there are N cases at some time

Rate of births: αN

Rate of deaths: δN

Rate of mutations: θN

Defines a continuous time Markov process model

We don’t care about times (no data) so can reduce to discrete time Markov process

Page 10: A Marginalisation Paradox Example

Model: Producing data

Run the disease process until there are 10,000 cases

(If the disease dies out, rerun)

Take a simple random sample of 473 cases

Convert to data on genotype frequencies
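The data-generating steps above can be sketched as follows. This is one reading of the slides, not the code used by Tanaka et al.: it assumes each event hits a uniformly chosen current case, that every mutation creates a never-before-seen genotype, and the function name and parameter values in the usage line are hypothetical.

```python
import random
from collections import Counter


def simulate_clusters(alpha, delta, theta, target_cases, sample_size, seed=0):
    """Return a Counter mapping cluster size -> number of clusters in the sample."""
    rng = random.Random(seed)
    total = alpha + delta + theta
    p_birth = alpha / total
    p_death = delta / total
    while True:  # rerun whenever the disease dies out
        cases = [0]  # genotype label of each current case; start with one case
        next_genotype = 1
        while 0 < len(cases) < target_cases:
            u = rng.random()
            i = rng.randrange(len(cases))  # case affected by this event
            if u < p_birth:
                cases.append(cases[i])  # birth: new case shares the genotype
            elif u < p_birth + p_death:
                cases[i] = cases[-1]  # death: overwrite case i with the last
                cases.pop()           # case, then drop the last entry
            else:
                cases[i] = next_genotype  # mutation: assumed brand-new genotype
                next_genotype += 1
        if cases:
            break
    sample = rng.sample(cases, sample_size)  # simple random sample of cases
    genotype_counts = Counter(sample)        # genotype -> cluster size
    return Counter(genotype_counts.values())


# Hypothetical parameter values, chosen only so that alpha > delta.
clusters = simulate_clusters(0.7, 0.2, 0.198, target_cases=10_000, sample_size=473)
```

Only the relative sizes of the three rates enter the simulation, because event times are never used.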

Page 11: A Marginalisation Paradox Example

Prior

Some information on θ from previous studies

Prior distribution N(0.198, 0.06735²) chosen

Corresponding density denoted p(θ)

Ignorance for other parameters

Proposed (improper) overall prior:

$$p(\alpha, \delta, \theta) = \begin{cases} p(\theta) & \text{if } 0 < \delta < \alpha \\ 0 & \text{otherwise} \end{cases}$$

Motivation:

Marginal for θ is p(θ)

Marginal for (α, δ) is improper uniform: 1 if 0 < δ < α, 0 otherwise

Restriction α > δ ⇒ zero prior probability on parameters where epidemic usually dies out

Page 12: A Marginalisation Paradox Example

Results

See Tanaka et al paper

Note change from prior

Page 13: A Marginalisation Paradox Example

Parameter Redundancy

All parameters are proportional to rates

Multiplying all by a constant affects only rate of events

But this is irrelevant to our model

Model is over-parameterised:

(α, δ, θ) and (kα, kδ, kθ) give same likelihood
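A small sketch of the redundancy, with illustrative names and values. In the discrete-time chain the next event is a birth, death or mutation with probability proportional to its rate (the common factor N cancels), so rescaling all three rates leaves these probabilities unchanged.

```python
import math


def event_probabilities(alpha, delta, theta):
    # Probability that the next event is a birth, death or mutation.
    total = alpha + delta + theta
    return (alpha / total, delta / total, theta / total)


base = event_probabilities(0.7, 0.2, 0.198)    # hypothetical rates
scaled = event_probabilities(7.0, 2.0, 1.98)   # the same rates times k = 10
same = all(math.isclose(p, q) for p, q in zip(base, scaled))
```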

Page 14: A Marginalisation Paradox Example

Reparameterisation

Reparameterise to:

a = α/(α + δ + θ)

d = δ/(α + δ + θ)

θ = θ

Motivation: keep θ as have prior info for it

a and d tell us everything about relative rates

Only θ has info on absolute rates. . .

. . . and θ has info on absolute rates only

Parameter constraints:

α, δ, θ > 0 ⇒ a, d, θ ≥ 0 and also a + d ≤ 1

Requirement α > δ in prior ⇒ a > d

Page 15: A Marginalisation Paradox Example

Paradox (intuitive)

In new parameterisation, θ is equivalent to absolute rate info

But data has no information on absolute rates

So (marginal) θ posterior should equal prior?

Page 16: A Marginalisation Paradox Example

Analytic Results 1: Jacobian

Recall:

a = α/(α + δ + θ)

d = δ/(α + δ + θ)

θnew = θ

Solve to give:

α = aθnew/(1− a− d)

δ = dθnew/(1− a− d)

θ = θnew

Differentiate for Jacobian:

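$$J = (1-a-d)^{-2} \begin{pmatrix} \theta_{\text{new}}(1-d) & a\,\theta_{\text{new}} & a(1-a-d) \\ d\,\theta_{\text{new}} & \theta_{\text{new}}(1-a) & d(1-a-d) \\ 0 & 0 & (1-a-d)^{2} \end{pmatrix}$$

$$|J| = \theta_{\text{new}}^{2}\,(1-a-d)^{-3}$$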
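The determinant can be checked numerically. The sketch below uses central finite differences at one arbitrarily chosen point; the function names, the test point and the step size are illustrative assumptions.

```python
def to_original(a, d, theta_new):
    # Inverse map from (a, d, theta_new) back to (alpha, delta, theta).
    s = 1.0 - a - d
    return (a * theta_new / s, d * theta_new / s, theta_new)


def numerical_jacobian_det(a, d, theta_new, h=1e-6):
    point = [a, d, theta_new]
    m = []
    for j in range(3):
        up = list(point)
        down = list(point)
        up[j] += h
        down[j] -= h
        f_up = to_original(*up)
        f_down = to_original(*down)
        # m[j][i] approximates d(output i) / d(input j); this is the transpose
        # of the Jacobian, which has the same determinant.
        m.append([(u - v) / (2 * h) for u, v in zip(f_up, f_down)])
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


a, d, theta_new = 0.5, 0.2, 0.198  # arbitrary point with a + d < 1
numeric = numerical_jacobian_det(a, d, theta_new)
closed_form = theta_new ** 2 * (1 - a - d) ** -3
```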

Page 17: A Marginalisation Paradox Example

Analytic Results 2: Reparameterised prior

Recall p(α, δ, θ) = p(θ)I [0 < δ < α]

(where p(θ) is a normal pdf)

Then:

$$p(a, d, \theta_{\text{new}}) = p(\theta)\, I[0 < \delta < \alpha]\, |J| = \theta_{\text{new}}^{2}\, p(\theta_{\text{new}})\, I[0 < d < a]\, (1-a-d)^{-3}$$

Page 18: A Marginalisation Paradox Example

Analytic Results 3: Posterior

Recall likelihood depends on a, d only

i.e. f(X | a, d, θnew) = f(a, d)

So posterior is:

$$\pi(a, d, \theta_{\text{new}}) \propto \theta_{\text{new}}^{2}\, p(\theta_{\text{new}})\, I[0 < d < a]\, (1-a-d)^{-3}\, f(a, d)$$

If this is proper, then posterior marginal for θ is:

$$\pi(\theta_{\text{new}}) \propto \theta_{\text{new}}^{2}\, p(\theta_{\text{new}})$$

Matches results graph
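To get a feel for the size of the shift, the sketch below compares the mean of the N(0.198, 0.06735²) prior with the mean of the density proportional to θ² times that prior. The restriction to 0 < θ < 1, the midpoint rule and the function names are assumptions made for illustration.

```python
import math

MU, SIGMA = 0.198, 0.06735


def prior_density(theta):
    z = (theta - MU) / SIGMA
    return math.exp(-0.5 * z * z) / (SIGMA * math.sqrt(2 * math.pi))


def mean_on_grid(weight, lower=0.0, upper=1.0, n=20000):
    """Mean of the density proportional to weight(theta) on [lower, upper],
    computed with the midpoint rule."""
    h = (upper - lower) / n
    grid = [lower + (i + 0.5) * h for i in range(n)]
    w = [weight(t) for t in grid]
    return sum(t * wi for t, wi in zip(grid, w)) / sum(w)


prior_mean = mean_on_grid(prior_density)
posterior_mean = mean_on_grid(lambda t: t * t * prior_density(t))
```

The θ² factor moves the mean from about 0.20 to about 0.24, a change that owes nothing to the data.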

Page 19: A Marginalisation Paradox Example

Paradox and explanation

The prior was constructed to have marginal p(θ)

The data contain no information on θ

But we have shown that the posterior acts like ∝ θ² p(θ)

(easy to falsely conclude that change is due to data)

PARADOX

The problem is that marginal distributions are not well defined for improper priors

i.e. ∫ p(α, δ, θ) dα dδ is not a pdf (integral not 1)

Attempting to normalise gives ∞/∞ problems

Prior didn’t really have claimed marginal
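One way to see this, offered as a sketch rather than as part of the talk: the integral over (α, δ) diverges for every θ, and the "marginal" obtained by truncating the region and then normalising depends on how the truncation is done. The bounds M and c below are hypothetical truncation constants.

```latex
\int\!\!\int p(\alpha,\delta,\theta)\, d\alpha\, d\delta
  = p(\theta) \int_0^\infty \!\! \int_0^\alpha d\delta\, d\alpha
  = p(\theta) \int_0^\infty \alpha\, d\alpha = \infty .

% Truncation 1: restrict to \alpha \le M
p(\theta) \int_0^M \alpha\, d\alpha = \tfrac{1}{2} M^2\, p(\theta) \;\propto\; p(\theta).

% Truncation 2: restrict to \alpha + \delta \le c\,\theta
% (equivalently a + d \le c/(1+c)); the region is a triangle of area c^2\theta^2/4
p(\theta) \cdot \tfrac{1}{4} c^2 \theta^2 \;\propto\; \theta^2\, p(\theta).
```

The first truncation recovers the claimed marginal p(θ); the second recovers θ² p(θ), for every value of c.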

Page 20: A Marginalisation Paradox Example

Practical resolution

Prior aimed to combine ignorance on α, δ with prior knowledge on θ

In (a, d , θ) reparameterisation, range of (a, d) is finite

Combine p(θ) with a uniform marginal on (a, d) using independence

In this parameterisation, this does give a proper prior

So priors are well defined

(side issue: is uniform best representation of ignorance?)
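A minimal sketch of drawing from such a proper prior, under assumptions not spelled out on the slide: (a, d) uniform on the triangle 0 < d < a, a + d ≤ 1 (keeping the a > d restriction), θ drawn independently from N(0.198, 0.06735²), and negative θ draws rejected since rates are positive. The function name is hypothetical.

```python
import random


def sample_proper_prior(rng):
    # Rejection sampling: uniform on the unit square, keep points in the
    # triangle 0 < d < a, a + d <= 1 (assumed support for (a, d)).
    while True:
        a, d = rng.random(), rng.random()
        if 0.0 < d < a and a + d <= 1.0:
            break
    # Independent draw for theta; redraw the rare negative values
    # (an added assumption, since rates must be positive).
    while True:
        theta = rng.gauss(0.198, 0.06735)
        if theta > 0.0:
            break
    return a, d, theta


rng = random.Random(0)
draws = [sample_proper_prior(rng) for _ in range(1000)]
```

Because the support of (a, d) has finite area, both marginals of this prior are genuine probability distributions.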

Page 21: A Marginalisation Paradox Example

Part III

Marginalisation Paradoxes: theory

Page 22: A Marginalisation Paradox Example

Subjective Bayes viewpoint

Priors should represent prior beliefs

Only a probability distribution represents beliefs coherently

Therefore don’t use improper priors

(this is the resolution used earlier)

Page 23: A Marginalisation Paradox Example

Objective Bayes viewpoint

Conclusions shouldn’t depend on subjective beliefs (cf. frequentist analysis)

Instead use objective reference priors

Lots of theory for choosing these

Will often be improper (e.g. Jeffreys prior)

So marginalisation paradoxes are a real issue

Page 24: A Marginalisation Paradox Example

The marginalisation paradox

Well-known Bayesian inference paradox

From Dawid, Stone, Zidek (RSS B 1973; read paper)

For models with a particular structure. . .

. . . there are two marginalisation approaches to Bayesian inference

For improper priors, these typically do not agree

Large literature; claims of resolution but not fully acknowledged

Is my example a special case of this?

Page 25: A Marginalisation Paradox Example

Part IV

Conclusion

Page 26: A Marginalisation Paradox Example

Conclusion

Be wary of marginalisation issues for improper priors!

Page 27: A Marginalisation Paradox Example

Bibliography

A. P. Dawid, M. Stone, and J. V. Zidek. Marginalization paradoxes in Bayesian and structural inference. JRSS(B), 35:189–233, 1973.

Mark M. Tanaka, Andrew R. Francis, Fabio Luciani, and S. A. Sisson. Using Approximate Bayesian Computation to Estimate Tuberculosis Transmission Parameters from Genotype Data. Genetics, 173:1511–1520, 2006.