
Jouni Kerman, Statistical Methodology, Novartis Pharma AG, Basel. BAYES2012, May 10, Aachen

Neutral Bayesian reference models for incidence rates of (rare) clinical events

Outline

§ Motivation – why reference (default) models?

§ Selection criteria for the reference models

§  Investigating candidates for reference models

§ A proposal for neutral reference models

• Augmenting the proposed reference analysis with historical data

2 | BAYES2012 | J Kerman | May 10 | Neutral reference analyses

Motivation


Reference analyses for comparison

§ We do more and more complex analyses...

• E.g., meta-analyses

“Reality check: are the results reasonable?”



§ Comparing with point estimates to reveal discrepancies

• Are the results reasonable?

• Any “excessive” shrinkage?



§ Plotting just the data points is not enough

• Must visualize the uncertainty around the point estimates

• Need simple Bayesian models to produce point estimates and “reference” uncertainty intervals!



§ Stratified analyses

• Model the rate within a single treatment (sub)group

• Model a rate difference (e.g., LoR, RR) for two (sub)groups

§ Pooled analyses

• Analyses with pooled studies/subgroups (i.e., assuming identical rates between studies or groups)


Stratified and pooled reference analyses “Looking at the raw data”


Stratified and pooled reference analyses “Looking at the differences”


Reference (‘default’) analyses - Example: Safety

§ Example: Kidney transplantation; one single study


Treatment   Deaths at 12 months
A           7 / 251
B           9 / 274
C           6 / 384
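A stratified reference analysis of this table can be sketched as follows. This is only an illustration: it assumes the Beta(1/3, 1/3) prior proposed later in the talk, and it approximates the conjugate posterior by sampling with Python's standard library. The function name, the number of draws and the seed are arbitrary choices, not part of the original slides.

```python
import random

def reference_interval(y, n, a=1/3, draws=40_000, seed=1):
    """Posterior median and central 95% interval for a proportion.

    Prior Beta(a, a); the conjugate posterior is Beta(a + y, a + n - y),
    approximated here by sorted Monte Carlo draws.
    """
    rng = random.Random(seed)
    sample = sorted(rng.betavariate(a + y, a + n - y) for _ in range(draws))
    pick = lambda q: sample[int(q * (draws - 1))]
    return pick(0.025), pick(0.5), pick(0.975)

# Deaths at 12 months per treatment group, as in the table above.
deaths = {"A": (7, 251), "B": (9, 274), "C": (6, 384)}
intervals = {arm: reference_interval(y, n) for arm, (y, n) in deaths.items()}

for arm, (low, med, high) in intervals.items():
    y, n = deaths[arm]
    print(f"{arm}: y/n = {y / n:.4f}  median = {med:.4f}  "
          f"95% interval = ({low:.4f}, {high:.4f})")
```

Each group is analyzed on its own, so the intervals show what the raw data say before any pooling or hierarchical modeling.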

Considering selection criteria for the reference models


Binomial/Poisson models and shrinkage

§ Shrinkage is unavoidable!

• Consider y=0

• The point estimate and the length of the posterior intervals (with respect to the scale n) are determined completely by the prior

•  (Recall: there are no “uninformative” models...)
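The y=0 case can be checked numerically. The sketch below assumes the Beta(a, a) conjugate prior used in the illustration, so that the posterior for y=0 is Beta(a, a + n), and reports the 95% upper bound on the scale of n. The choice of a values, n and the Monte Carlo settings are illustrative assumptions.

```python
import random

def upper_bound_y0(n, a, draws=40_000, seed=2):
    """95th posterior percentile of theta when y = 0 under a Beta(a, a) prior."""
    rng = random.Random(seed)
    # With y = 0 the posterior is Beta(a, a + n): the data only contribute n.
    sample = sorted(rng.betavariate(a, a + n) for _ in range(draws))
    return sample[int(0.95 * (draws - 1))]

n = 1000
# Express the bound in units of 1/n to show how strongly it depends on a.
bounds = {a: n * upper_bound_y0(n, a) for a in (0.01, 1/3, 1.0)}
for a, b in bounds.items():
    print(f"Beta({a:.2f}, {a:.2f}): 95% upper bound = {b:.3f} / n")
```

For a = 1 the bound is close to 3/n, and it shrinks toward zero as a decreases, so the interval length is set by the prior alone.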


Illustration: Binomial-beta conjugate model with prior Beta(a, a)

Binomial/Poisson models and shrinkage

§ Shrinkage is unavoidable!

• Consider y=1

• The point estimate and the posterior intervals are strongly influenced by the prior: Pr( θ > y/n | y ) ≈ 0.74 or Pr( θ > y/n | y ) ≈ 0.37?

• As y increases, the influence of the prior diminishes; for small y it persists even when n is arbitrarily large


Illustration: Binomial-beta conjugate model with prior Beta(a, a)

Choosing a reference model

§ The choice of shrinkage ... is yours

• By choosing a reference model, we are in fact deciding on the amount of shrinkage

• What is an acceptable “default amount of shrinkage” ?


Neutrality as a criterion

§ A neutral model for rates and proportions

• Pr( θ > MLE | y ) ≈ 50% consistently for all possible outcomes and sample sizes whenever the MLE is not at the boundary of the parameter space

•  “A priori doesn’t favor high or low values relative to the MLE (sample mean)”

• Exact neutrality cannot be achieved – but some priors are “more neutral” than others
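The neutrality criterion can be probed by computing Pr( θ > MLE | y ) over a grid of outcomes. The sketch below is a rough Monte Carlo check, not the author's procedure; the grid of (y, n) values, the two priors compared and the simulation size are assumptions made for illustration.

```python
import random

def prob_above_mle(y, n, a, rng, draws=20_000):
    """Monte Carlo estimate of Pr(theta > y/n | y) under the prior Beta(a, a)."""
    mle = y / n
    hits = sum(rng.betavariate(a + y, a + n - y) > mle for _ in range(draws))
    return hits / draws

def worst_deviation(a, seed=3):
    """Largest |Pr(theta > MLE | y) - 0.5| over a small grid of outcomes."""
    rng = random.Random(seed)
    grid = [(y, n) for n in (10, 100, 1000) for y in (1, 2, 5)]
    return max(abs(prob_above_mle(y, n, a, rng) - 0.5) for y, n in grid)

priors = {"Beta(1/3, 1/3)": 1 / 3, "Beta(1, 1)": 1.0}
deviation = {label: worst_deviation(a) for label, a in priors.items()}
for label, dev in deviation.items():
    print(f"{label}: max |Pr(theta > MLE | y) - 50%| = {100 * dev:.1f} percentage points")
```

A prior is "more neutral" in this sense when its worst-case deviation from 50% over the grid is smaller.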


[Figure: posterior density with MLE = 0.2; the dotted line marks the posterior median; Pr( θ > MLE | y ) = 50.2%]

Neutrality for the differences

§ A neutral default model

• Pr(θ1 - θ2 > d | y ) ≈ 50%

• where d is the observed difference on some scale, e.g. log or logit or original scale

• Equivalently, ‘d’ should be as close to the posterior median as possible


A reference model should provide neutral inferences for both rates and differences

Investigating candidates for reference models


Candidates for reference models (Binomial)

§ Conjugate models

• yi ~ Binomial(ni, θi), i=1, 2

• θi ~ Beta(a, a); a in (0, 1)

§ Logistic regression with different parameterizations and different vague prior distributions (Normal or scaled Student’s t) – total 116 models


             Model “A”   Model “B”   Model “C”
logit(θ1) =  µ1          µ           µ - Δ / 2
logit(θ2) =  µ2          µ + Δ       µ + Δ / 2

Candidates for reference models (Poisson)

§ Conjugate models

• yi ~ Poisson(Xi θi), i=1, 2

• θi ~ Gamma(a, 0); a in (0, 1)

§ Poisson regression (log link) with different parameterizations and different vague prior distributions (Normal or scaled Student’s t) – total 116 models


            Model “A”   Model “B”   Model “C”
log(θ1) =   µ1          µ           µ - Δ / 2
log(θ2) =   µ2          µ + Δ       µ + Δ / 2

An apparent ‘bias’ in rate estimates: An example

§ A “noninformative” analysis?

• y=1 event out of n=1000

• Statisticians (a), (b), and (c) use different “noninformative” models

Statistician   Median estimate   Pr( est > 0.001 | y )   Model
(a)            0.7 / 1000        36.8%                   Beta(0.01, 0.01)
(b)            1.0 / 1000        50.8%                   Beta(1/3, 1/3)
(c)            1.7 / 1000        73.5%                   Beta(1, 1)
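The three rows of this example follow from the conjugate update Beta(a + y, a + n - y) with y=1 and n=1000. A minimal sketch that approximates the medians and tail probabilities by simulation is given below; the function name, draw count and seed are illustrative assumptions, and the printed values carry Monte Carlo error.

```python
import random

def summarize(y, n, a, draws=40_000, seed=4):
    """Posterior median and Pr(theta > y/n | y) under a Beta(a, a) prior."""
    rng = random.Random(seed)
    sample = sorted(rng.betavariate(a + y, a + n - y) for _ in range(draws))
    median = sample[draws // 2]
    prob_above = sum(t > y / n for t in sample) / draws
    return median, prob_above

# One event in 1000, analyzed with the three priors from the example.
results = {a: summarize(1, 1000, a) for a in (0.01, 1/3, 1.0)}
for a, (median, prob) in results.items():
    print(f"Beta({a:.2f}, {a:.2f}): median = {1000 * median:.2f} / 1000, "
          f"Pr(theta > 0.001 | y) = {100 * prob:.1f}%")
```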

An apparent ‘bias’ in log-risk ratio estimates: An example

§ A “noninformative” analysis?

• Experimental: y=3 events out of n=1000

• Placebo: y=1 event out of n=1000

• Statisticians (a), (b), and (c) use different “noninformative” models

Statistician   Median odds   Pr( odds > 3 | y )   Model   Priors
(a)            3.9           58%                  “C”     µ ~ N(0, 100²), Δ ~ N(0, 10²)
(b)            2.95          49%                  “A”     µ1 ~ N(0, 5²), µ2 ~ N(0, 5²)
(c)            2.25          39%                  “B”     µ ~ N(0, 5²), Δ ~ N(0, 2.5²)

Asymmetric estimates in log-risk ratio estimates: An example

§ A “noninformative” analysis?

• Experimental: y=1 event out of n=1000

• Placebo: y=1 event out of n=1000

• Statisticians (a), (b), and (c) use different “noninformative” models

Statistician   Median odds   Pr( odds > 3 | y )   Logistic model   Priors
(a)            0.64          65%                  “B”              µ ~ N(0, 5²), Δ ~ N(0, 5²)
(b)            0.90          47%                  “B”              µ ~ t(0, 10, 5), Δ ~ t(0, 5, 5)
(c)            1.00          50%                  “B”              µ ~ N(0, 100²), Δ ~ N(0, 5²)

“What is your point estimate?”

A proposal for default models


Neutral models for proportions and probabilities

§ The Binomial-Beta conjugate model with shape parameter 1/3

• y ~ Binomial(n, θ)

• θ ~ Beta(1/3, 1/3)

• Behaves consistently, for all sample sizes n and outcomes y


Neutral models for rates

§ Poisson-Gamma conjugate model with the shape parameter 1/3

• y ~ Poisson(λX)

• X = exposure

• λ ~ Gamma(1/3, 0)

• Behaves consistently, for all exposures X and outcomes y
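For the Poisson-Gamma model the conjugate update gives the posterior λ | y ~ Gamma(1/3 + y, X), with X acting as the rate parameter. The sketch below checks approximate neutrality for y=1 and X=1000 by simulation. Note that `random.gammavariate` takes a scale, not a rate, so the scale is set to 1/X; the example data and simulation settings are assumptions.

```python
import random

def gamma_posterior_summary(y, exposure, a=1/3, draws=40_000, seed=5):
    """Median of lambda and Pr(lambda > y/X | y) under the prior Gamma(a, 0)."""
    rng = random.Random(seed)
    # Posterior is Gamma(shape = a + y, rate = exposure); scale = 1 / rate.
    sample = sorted(rng.gammavariate(a + y, 1.0 / exposure) for _ in range(draws))
    median = sample[draws // 2]
    prob_above = sum(lam > y / exposure for lam in sample) / draws
    return median, prob_above

median, prob = gamma_posterior_summary(y=1, exposure=1000.0)
print(f"median = {1000 * median:.2f} / 1000, Pr(lambda > y/X | y) = {100 * prob:.1f}%")
```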


Neutral models for differences and ratios

§ Treatment groups are estimated separately, then differences computed

• E.g., the Binomial-beta model:

• ( θ1 | y ) ~ Beta(1/3 + y1, 1/3 + n1 - y1)

• ( θ2 | y ) ~ Beta(1/3 + y2, 1/3 + n2 - y2)

• Compute δ = θ2 - θ1

• Compute Δ = logit(θ2) - logit(θ1)

• E.g., by simulation

• Δ and δ are neutral (approximately centered at the point estimate), consistently

• Δ and δ are symmetric when y, n are equal in both groups
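The simulation step can be sketched as below: draw θ1 and θ2 independently from their Beta posteriors, then form δ and Δ from each pair of draws. The data (1/1000 versus 3/1000, and 1/1000 in both groups), the function name and the simulation settings are illustrative assumptions; the code requires 0 < y < n in both groups so that the observed logits exist.

```python
import math
import random

def logit(p):
    return math.log(p / (1.0 - p))

def compare_groups(y1, n1, y2, n2, a=1/3, draws=50_000, seed=6):
    """Simulate delta = theta2 - theta1 and Delta = logit(theta2) - logit(theta1)."""
    rng = random.Random(seed)
    d_obs = y2 / n2 - y1 / n1                    # observed rate difference
    lor_obs = logit(y2 / n2) - logit(y1 / n1)    # observed log-odds ratio
    deltas, lors = [], []
    for _ in range(draws):
        t1 = rng.betavariate(a + y1, a + n1 - y1)
        t2 = rng.betavariate(a + y2, a + n2 - y2)
        deltas.append(t2 - t1)
        lors.append(logit(t2) - logit(t1))
    lors.sort()
    return {
        "pr_delta_above_obs": sum(d > d_obs for d in deltas) / draws,
        "pr_lor_above_obs": sum(l > lor_obs for l in lors) / draws,
        "median_odds_ratio": math.exp(lors[draws // 2]),
    }

unequal = compare_groups(1, 1000, 3, 1000)
equal = compare_groups(1, 1000, 1, 1000, seed=7)
print("1/1000 vs 3/1000:", unequal)
print("1/1000 vs 1/1000:", equal)
```

With equal data in both groups the two posteriors are identical, so the simulated odds ratio is centered at 1 up to Monte Carlo error.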


Behavior of the Binomial models

§ The Beta(1/3, 1/3) conjugate model behaves the most consistently

§  Displayed: max. absolute bias (%) for estimated rates or odds in all models

§  (Worst case scenario, y=1 for one of the arms)


[Figure: max. absolute bias (%) by model; the Beta(1/3, 1/3) model is labeled]

Behavior of the Poisson models

§ The Gamma(1/3, 0) conjugate model behaves the most consistently

§  Displayed: max. absolute bias (%) for estimated rate or rate ratio in all models

§  (Worst case scenario, y=1 for one of the arms)


[Figure: max. absolute bias (%) by model; the Gamma(1/3, 0) model is labeled]

Neutral models for differences and ratios

§ Examples of ‘worst cases’ (one group has y=1)


Data 1   Data 2   Median point estimate θ1   Median point estimate θ2   Median odds estimate   Pr( odds > obs | y )
1/1000   2/1000   0.0010                     0.0020                     2.0                    50%
1/1000   3/1000   0.0010                     0.0030                     3.0                    50%
1/1000   4/1000   0.0010                     0.0040                     3.9                    50%
1/1000   5/1000   0.0010                     0.0050                     4.9                    50%

Example: Meta-analysis

§ Viewing posterior intervals from many multilevel models at once

§ Green: pooled

§ Gray: fully stratified reference intervals


Augmenting the default analysis with external information


Augmenting the default reference analysis: Binomial model

§ A family of informative Beta priors

Beta(1/3 + mp, 1/3 + m(1-p))

• Fix ‘p’ (a priori observed point estimate)

• Use ‘m’ to adjust prior precision

• Beta(1/3, 1/3) is the “prior of all priors”

• Neither shape parameter is ever < 1/3


posterior median ≈ [m / (m + n)] · p + [n / (m + n)] · (sample mean)
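With the prior Beta(1/3 + mp, 1/3 + m(1-p)) the posterior is Beta(1/3 + mp + y, 1/3 + m(1-p) + n - y), and its median is approximately the weighted average of p and the sample mean y/n with weights m/(m+n) and n/(m+n). The sketch below compares a simulated posterior median with that weighted average. The data are treatment A from the kidney transplantation example; the historical rate p = 0.03 and the values of m are hypothetical, chosen only for illustration.

```python
import random

def posterior_median(y, n, p, m, draws=40_000, seed=7):
    """Simulated posterior median under the prior Beta(1/3 + m*p, 1/3 + m*(1 - p))."""
    rng = random.Random(seed)
    alpha = 1 / 3 + m * p + y
    beta = 1 / 3 + m * (1 - p) + (n - y)
    sample = sorted(rng.betavariate(alpha, beta) for _ in range(draws))
    return sample[draws // 2]

def weighted_average(y, n, p, m):
    """Approximation: m/(m+n) * p + n/(m+n) * (y/n)."""
    return m / (m + n) * p + n / (m + n) * (y / n)

y, n = 7, 251    # treatment A in the kidney transplantation example
p = 0.03         # hypothetical historical point estimate (assumption)
comparison = {m: (posterior_median(y, n, p, m), weighted_average(y, n, p, m))
              for m in (0, 50, 500)}
for m, (simulated, approx) in comparison.items():
    print(f"m = {m:3d}: simulated median = {simulated:.4f}, weighted average = {approx:.4f}")
```

Setting m = 0 recovers the Beta(1/3, 1/3) reference analysis, and larger m pulls the median toward p.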

Augmenting the default reference analysis: Poisson model

§ A family of informative Gamma conjugate priors

Gamma(1/3 + ky, kX)

• Fix ‘y / X’ (a priori observed point estimate)

• Use ‘k’ within (0,1) to adjust prior precision

• Gamma(1/3, 0) is the “prior of all priors”
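The Poisson analogue can be sketched in the same way. With prior Gamma(1/3 + ky, kX) built from historical data and new data with their own event count and exposure, the posterior adds the new count to the shape and the new exposure to the rate. In the code below the historical data are named `y_hist` and `x_hist` to keep them apart from the new data; all numbers are hypothetical.

```python
import random

def augmented_rate_median(y, exposure, y_hist, x_hist, k, draws=40_000, seed=8):
    """Simulated posterior median of the rate under the prior Gamma(1/3 + k*y_hist, k*x_hist)."""
    rng = random.Random(seed)
    shape = 1 / 3 + k * y_hist + y
    rate = k * x_hist + exposure
    # random.gammavariate expects a scale parameter, i.e. 1 / rate.
    sample = sorted(rng.gammavariate(shape, 1.0 / rate) for _ in range(draws))
    return sample[draws // 2]

y, exposure = 3, 1000.0         # new data (hypothetical)
y_hist, x_hist = 12, 4000.0     # historical data (hypothetical)
medians = {k: augmented_rate_median(y, exposure, y_hist, x_hist, k)
           for k in (0.1, 0.5, 0.9)}
for k, med in medians.items():
    pooled = (k * y_hist + y) / (k * x_hist + exposure)
    print(f"k = {k}: posterior median = {med:.5f}, discounted pooled rate = {pooled:.5f}")
```

The posterior median stays close to the pooled rate in which the historical events and exposure are both discounted by k.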


Conclusion

§ The classical point estimates (sample means and their differences) remain the reference points that are inevitably compared to model-based inferences

§ Recognizing that shrinkage is unavoidable in these count data models, we propose (approximate) neutrality as a criterion for reference models

§ The proposed conjugate models perform consistently for all outcomes and sample sizes

• Symmetry and minimal “bias”

• Easily computable without MCMC

• Intuitively augmentable by external information


References

§ Kerman (2011) Neutral noninformative and informative conjugate beta and gamma prior distributions. Electronic Journal of Statistics 5:1450-1470

§ Kerman (2012) Neutral Bayesian reference models for incidence rates of clinical events (Working paper)


A look at the neutral Beta prior (Log-odds scale)

• Beta(1, 1) – Uniform

• Beta(1/2, 1/2) – “Jeffreys”

• Beta(1/3, 1/3) – “Neutral”

• Beta(0.001, 0.001) – “Approximate Haldane”
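The four priors can be compared on the log-odds scale through a change of variables: if θ ~ Beta(a, a) and z = logit(θ), the density of z is θ^a (1-θ)^a / B(a, a) evaluated at θ = 1/(1 + e^(-z)). The sketch below evaluates this density on a few points; the function name and the grid of z values are illustrative assumptions.

```python
import math

def logit_scale_density(z, a):
    """Density of z = logit(theta) when theta ~ Beta(a, a)."""
    # log B(a, a) = 2 * lgamma(a) - lgamma(2a)
    log_beta_fn = 2 * math.lgamma(a) - math.lgamma(2 * a)
    # log(theta) and log(1 - theta) at theta = 1 / (1 + exp(-z))
    log_theta = -math.log1p(math.exp(-z))
    log_one_minus_theta = -math.log1p(math.exp(z))
    return math.exp(a * (log_theta + log_one_minus_theta) - log_beta_fn)

priors = {"Uniform": 1.0, "Jeffreys": 0.5, "Neutral": 1 / 3, "Approx. Haldane": 0.001}
for label, a in priors.items():
    row = "  ".join(f"{logit_scale_density(z, a):.5f}" for z in (0, 2, 5, 10))
    print(f"{label:16s} density at z = 0, 2, 5, 10: {row}")
```

On this scale a smaller a gives a flatter density with heavier tails, and the near-Haldane prior is almost flat over the range shown.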


Reference model candidates investigated: Binomial & Poisson regression models

                  For µ                       For Δ
Normal model      σ = 3.3, 5, 10, 100         σ = 2.5, 5, 10
Student-t model   Scale = 3.3, 5, 10, 100     Scale = 2.5, 3.3, 5, 10
                  Df = 2, 5, 10               Df = same as for µ

Possible reference models (Binomial): yi ~ Binomial(ni, θi), i=1, 2

§ Model A

• Beta: θi ~ Beta(a, a); δ = θ2 - θ1

• Normal: logit(θi) ~ N(0, σ²); δ = logit(θ2) - logit(θ1)

• Scaled t: logit(θi) ~ t(0, σ, df); δ = logit(θ2) - logit(θ1)

§ Model B

• Normal: logit(θ1) ~ N(0, σ1²); δ ~ N(0, σ2²); logit(θ2) = logit(θ1) + δ

• Scaled t: logit(θ1) ~ t(0, σ1, df1); δ ~ t(0, σ2, df2); logit(θ2) = logit(θ1) + δ

§ Model C

• Normal: logit(µ) ~ N(0, σ1²); δ ~ N(0, σ2²); logit(θ1) = logit(µ) - δ / 2; logit(θ2) = logit(µ) + δ / 2

• Scaled t: logit(µ) ~ t(0, σ1, df1); δ ~ t(0, σ2, df2); logit(θ1) = logit(µ) - δ / 2; logit(θ2) = logit(µ) + δ / 2

Possible reference models (Poisson): yi ~ Poisson(Xi θi), i=1, 2

§ Model A

• Gamma: θi ~ Gamma(a, ε); δ = θ2 - θ1

• Normal: log(θi) ~ N(0, σ²); δ = log(θ2) - log(θ1)

• Scaled t: log(θi) ~ t(0, σ, df); δ = log(θ2) - log(θ1)

§ Model B

• Normal: log(θ1) ~ N(0, σ1²); δ ~ N(0, σ2²); log(θ2) = log(θ1) + δ

• Scaled t: log(θ1) ~ t(0, σ1, df1); δ ~ t(0, σ2, df2); log(θ2) = log(θ1) + δ

§ Model C

• Normal: log(µ) ~ N(0, σ1²); δ ~ N(0, σ2²); log(θ1) = log(µ) - δ / 2; log(θ2) = log(µ) + δ / 2

• Scaled t: log(µ) ~ t(0, σ1, df1); δ ~ t(0, σ2, df2); log(θ1) = log(µ) - δ / 2; log(θ2) = log(µ) + δ / 2
