bruno lecoutre c.n.r.s. et université de rouen e-mail: [email protected]

67
Bruno Lecoutre Bruno Lecoutre C.N.R.S. et Université de Rouen C.N.R.S. et Université de Rouen E-mail: [email protected] Internet: http://www.univ-rouen.fr/LMRS/Persopage/Lecoutre/Eris E E quipe quipe R R aisonnement aisonnement I I nduction nduction S S MaxEnt 2006 Twenty sixth International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Engineering CNRS, Paris, France, July 9, 2006 And if you were a Bayesian without And if you were a Bayesian without knowing it? knowing it?

Upload: ovid

Post on 15-Jan-2016

22 views

Category:

Documents


0 download

DESCRIPTION

MaxEnt 2006 Twenty sixth International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Engineering CNRS, Paris, France, July 9, 2006. And if you were a Bayesian without knowing it?. Bruno Lecoutre C.N.R.S. et Université de Rouen - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

Bruno LecoutreBruno Lecoutre

C.N.R.S. et Université de Rouen C.N.R.S. et Université de Rouen

E-mail: [email protected]

Internet: http://www.univ-rouen.fr/LMRS/Persopage/Lecoutre/Eris

EEquipe quipe RRaisonnement aisonnement IInduction nduction SStatistiquetatistique

MaxEnt 2006 Twenty sixth International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and EngineeringCNRS, Paris, France, July 9, 2006

And if you were a Bayesian without And if you were a Bayesian without knowing it?knowing it?

Page 2: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

Bayes, Thomas (b. 1702, London - d. 1761, Tunbridge Wells, Kent), mathematician who first used probability inductively and established a mathematical basis for probability inference (a means of calculating, from the number of times an event has not occured, the probability that it will occur in future trials)

Page 3: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr
Page 4: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

ProbabilityProbability

andandStatistical InferenceStatistical Inference

Page 5: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

 

Many statistical users misinterpret the p-values of significance tests as “inverse” probabilities:

1-p is “the probability that the alternative hypothesis is true”

(Mis)intepretations of (Mis)intepretations of pp-values-valuesin Bayesian termsin Bayesian terms

Page 6: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

Frequentist interpretation of a 95% confidence interval:

  In the long run 95% of computed confidence intervals will contain

the “true value” of the parameter

  Each interval in isolation has either a 0 or 100% probability

of containing it

(Mis)intepretations of confidence levels(Mis)intepretations of confidence levelsin Bayesian termsin Bayesian terms

This “correct” interpretation does not make sense for most users!

  It is the interpretation in (Bayesian) terms of

“a fixed interval having a 95% chance of including the true value

of interest”

which is the appealing feature of confidence intervals

Page 7: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

  Even experienced users and experts in statistics are not

immune from conceptual confusions

“In these conditions [a p-value of 1/15], the odds of 14 to 1

that this loss was caused by seeding [of clouds]

do not appear negligible to us”Neyman et al., 1969

(Mis)intepretations of frequentist procedures(Mis)intepretations of frequentist proceduresin Bayesian termsin Bayesian terms

  All the attempts to rectify these interpretations have been

a loosing battle

We ask themselves:

“And if you were a Bayesian without knowing it?”

  Virtually all users interpret frequentist confidence intervals

in a Bayesian fashion

Page 8: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

Two main definitions of probabilityTwo main definitions of probability(already in Bernoulli, 17th century)(already in Bernoulli, 17th century)

“Frequentist (“classical”, “orthodox”, “sampling theory”) conception

A measure of the degree of belief (or confidence) in the occurrence of an event or more generally in a proposition

The long-run frequency of occurrence of an event, either in a sequence of repeated trials or in an ensemble of “identically” prepared systems

Seems to make probability an objective property, existing in the nature independently of us, that should be based on empirical frequencies

The “Bayesian” conception A much more general definition: Ramsey, 1931; Savage, 1954; de Finetti, 1974

Jaynes, E.T. (2003) Probability Theory: The Logic of Science (Edited by G.L. Bretthorst)

Cambridge, England: Cambridge University Press

Page 9: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

 

The Bayesian definition fits the meaning of the term probability in everyday language

 

  The Bayesian probability theory appears to be much more closely related to how people intuitively reason in the presence of uncertainty

 

“It is beyond any reasonable doubt that for most people,probabilities about single events do make sense

even though this sense may be naïve and fall short from numerical accuracy”Rouanet, in Rouanet et al., 2000, page 26

Page 10: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

Frequentist approachSelf-proclaimed “objective” contrary to the “Bayesian” inference that should be necessary “subjective”

Bayesian approachThe Bayesian definition can serve to describe “objective knowledge”,

in particular based on symmetry arguments or on frequency data

  Bayesian statistical inference is no less objective

than frequentist inference

It is even the contrary in many contexts

Page 11: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

Statistical InferenceStatistical Inference

Statistical inference is typically concerned with both known quantities - the observed data - and unknown quantities - the parameters and the data that have not been observed.

“The raw material of a statistical investigation is a set of observations; these are the values taken on by random variables X whose distribution P is at least partly unknown.

Lehmann, 1959

Page 12: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

 Bayesian inference

Parameters can also be probabilized

 Frequentist inference

All probabilities (in fact frequencies) are conditional on [unknown] parameters

Significance tests (parameter value fixed by hypothesis)

Confidence intervals

Distributions of probabilities that express our uncertainty

before observations (does nor depend on data): prior probabilities

after observations (conditional on data): posterior (or revised) probabilities

also about future data: predictive probabilities

Page 13: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

Known data0 0 0 1 0

f = 1/5

Unknownparameter

= ?

A finite population of size N=20With a dichotomous variable

1 (success) – 0 (failure)Proportion of success

A sample of size n=5from this populationhas been observed

A simple illustrative situationA simple illustrative situation

Page 14: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

Inductive reasoning: generalisation from known to unknown

Known data0 0 0 1 0

f = 1/5

Unknownparameter

= ?

In the frequentist framework:  no probabilities no solution

Page 15: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

Frequentist inference: from unknown to known

Known data0 1 0 0 0

f = 1/5

Unknownparameter

= ?

 no more solution

Page 16: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

Frequentist inference

Imaginary repetitions of the observations

f = 0/5 : 0.00006f = 1/5 : 0.005f = 2/5 : 0.068f = 3/5 : 0.293f = 4/5 : 0.440f = 5/5 : 0.194

One sample have been observed out 15 503 possible samples

Sampling probabilities = frequencies

ParameterFixed value

Example:

= 15/20 = 0.75

Data: 0 0 0 1 0 (f = 1/5 = 0.20)

Page 17: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

If = 0.75 one find in 99.5% of the repetitionsa value f > 1/5 (greater than the observation f=0.20)

The null hypothesis = 0.75 is rejected (Significant: p = 0.00506)

Imaginary repetitions of the observations

f = 0/5 : 0.00006f = 1/5 : 0.005f = 2/5 : 0.068f = 3/5 : 0.293f = 4/5 : 0.440f = 5/5 : 0.194

Null hypothesis

Example 1: = 0.75 (15/20)

Level: = 0.050.995

Data0 0 0 1 0 f = 0.20

Frequentist significance test

Page 18: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

However, this conclusion is based on the probabilityof the samples that have not been observed!

“If P is small, that means that there have been unexpectedly large departures

from prediction.

But why should these be stated in terms of P?

The latter gives the probability of departures, measured in a particular way,equal to or greater than the observed set, and the contribution from theactual value is nearly always negligible.

What the use of P implies, therefore, is that a hypothesis that may be truemay be rejected because it has not predicted observable results that havenot occurred.

This seems a remarkable procedure.”’Jeffreys, 1961

Page 19: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

Null hypothesis

Example 2: = 0.50 (10/20)

Level: = 0.05

Data0 0 0 1 0 f = 0.20

Frequentist significance test

Imaginary repetitions of the observations f = 0/5 : 0.00006

f = 1/5 : 0.005 f = 2/5 : 0.068

f = 3/5 : 0.293f = 4/5 : 0.440f = 5/5 : 0.194

0.848

If = 0.50 one find in 84.8% of the repetitionsA value f > 1/5 (greater than than the observation f =0.20)

The null hypothesis = 0.50 is not rejected (Non significant: p = 0.152)

Obviously this does not prove that = 0.50!

Page 20: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

Set of possible values for that are not rejected at level

Data0 0 0 1 0 f = 0.20

Frequentist confidence interval

Example = 0.05One get the “95% confidence” interval

[0.05 , 0.60]

How to interpret the “95% confidence”?

Page 21: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

The frequentist interpretation is based on the universal statement:

“Whatever the fixed value of the parameter is, in 95% (at least)of the repetitions the interval that should be computedincludes this value”

Interpretation of frequentist confidence?

Page 22: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

A very strange interpretation:it does not involve the data in hand!

It is at least unrealistic

“Objection has sometimes been made that the method of calculatingConfidence Limits by setting an assigned value such as 1% on the

frequency of observing 3 or less (or at the other end of observing 3 or

more) is unrealistic in treating the values less than 3, which have not been

observed, in exactly the same manner as the value 3, which is the one that has been observed.

This feature is indeed not very defensible save as an approximation.”

Fisher, 1990/1973, page 71

Page 23: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

Set of all possible valuesof the unknown parameter

= 0/20, 1/20, 2/20… 20/20

Bayesian inference

Probabilities that express our  uncertainty

(in addition to sampling probabilities)Known data0 0 0 1 0

f = 1/5

Return to the inductive reasoning:Generalisation from known to unknown

“As long as we are uncertain about values of parameters,we will fall into the Bayesian camp”

Iversen, 2000

Page 24: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

Bayesian inference

All the frequentist probabilities associated with the data

Pr(f = 1/5 | )

Likelihood function

= 0/20 0 = 10/20 0.135= 1/20 0.250 = 11/20 0.089= 2/20 0.395 = 12/20 0.054= 3/20 0.461 = 13/20 0.029= 4/20 0.470 = 14/20 0.014= 5/20 0.440 = 15/20 0.005= 6/20 0.387 = 16/20 0.001= 7/20 0.323 = 17/20 0= 8/20 0.255 = 18/20 0= 9/20 0.192 = 19/20 0

= 20/20 0

Data0 0 0 1 0

f = 1/5

Page 25: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

joint probabilities (a simple product):

Pr( and f=1/5) = Pr(f=1/5 |) × Pr()

likelihood × prior probability

Bayesian inference

Predictive probabilities (sum of the joint probabilities)

Pr(f=1/5)

A weighted average of the likelihood function

We assume prior probabilities Pr() (before observation)

posterior probabilities (A simple application of the definition of conditional probabilities)

Pr( | f=1/5) = Pr( and f=1/5) / Pr(f)

The normalized product of the prior and the likelihood

Page 26: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

“Bayesian statistics is difficult in the sense that thinking is difficult”

Berry, 1997

We can conclude with Berry

Page 27: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

Considerable difficultiesConsiderable difficulties

with the frequentist approachwith the frequentist approach

Page 28: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

The mysterious and unrealistic useThe mysterious and unrealistic useof the sampling distributionof the sampling distribution

Frequent questions asked by students and statistical users

“why one considers the probability of samples outcomes that are more extreme than the one observed?”

“why must one calculate the probability of samples that have not been observed?”

etc.

No such difficulties with the Bayesian inference

Involves the sampling probability of the data , via the likelihood function

that writes the sampling distribution in the natural order :

“from unknown to known”

Page 29: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

Experts in statistics are not immuneExperts in statistics are not immunefrom conceptual confusionsfrom conceptual confusionsAbout confidence intervalsAbout confidence intervals

A methodological paper by Rosnow and Rosenthal (1996)

They take the example of an observed difference between two means d=+0.266

They consider the interval [0,+532] whose bounds are the “null hypothesis” (0) and what they call the “counternul value” (2d=+0.532), the symmetrical value of 0 with regard to d

They interpret this specific interval [0,+532] as “a 77% confidence interval”

(0.77=1-2×0.115, where 0.115 is the one-sided p-value for the usual t test)

Clearly, 0.77 is here a data dependent probability,which needs a Bayesian approach to be correctly interpreted

Page 30: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

Experimental research and statistical inference:Experimental research and statistical inference:A paradoxical situationA paradoxical situation

Null Hypothesis Significance Testing (NHST)

An unavoidable norm in most scientific publications

BUT

 Innumerable misinterpretations and misuses

Often appears as a label of scientificness

Use explicitly denounced by the most eminent and most experienced scientists

“The test provides neither the necessary nor the sufficient scope or type of knowledge that basic scientific social research requires”

Morrison & Henkel, 1969

Page 31: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

Today is a crucial timeToday is a crucial time

Users' uneasiness is ever growing

In all fields necessity of changes in reporting experimental results

routinely report effect size indicators

and their interval estimatesin addition to or in place of the results of NHST

Page 32: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

Common misinterpretations of NHSTCommon misinterpretations of NHST

Emphasized by empirical studiesRosenthal & Gaito, 1963; Nelson, Rosenthal & Rosnow, 1986;Oakes, 1986; Zuckerman, Hodgins, Zuckerman & Rosenthal, 1993;Falk & Greenbaum, 1995; Mittag & Thompson, 2000; Gordon, 2001;M.-P. Lecoutre (2000), B. Lecoutre, M.-P. Lecoutre & Poitevineau, 2001

Shared by most methodology instructorsHaller & Krauss, 2001

Professional applied statisticians are not immune to misinterpretationsM.-P. Lecoutre, Poitevineau & B.Lecoutre (2003) - Even statisticians are not immune

to misinterpretations of Null Hypothesis Significance Tests. International Journal of Psychology, 38, 37-45

Page 33: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

Why these misinterpretations?Why these misinterpretations?

An individual's lack of mastery?

    This explanation is hardly applicable to professional statisticians

“Judgmental adjustments” or “adaptative distorsions”'(M.-P. Lecoutre, in Rouanet et al., 2000, page 74)

    designed to make an ill-suited tool fit their true needs

Examples: - Confusion between “statistical significance” and “scientific significance” - Improper uses of nonsignificant results as “proof of the null hypothesis” - “Incorrect” (“non frequentist”) interpretations of p-values as inverse probabilities

Page 34: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

NHST does not address questions that are of primary interest for the scientific research

This suggests that

“users really want to make a different kind of inference”Robinson & Wainer,

2002, page 270

Page 35: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

A more or less “naïve” mixture of NHST resultsA more or less “naïve” mixture of NHST resultsand other informationand other information

BUT

this is not an easy task!

The task of statisticians in pharmaceutical companies

“Actually, what an experienced statistician does when looking at p-values is to combine them with information on sample size, nullhypothesis, test statistic, and so forth to form in his mind something that is pretty much like a Confidence interval to be able to interpret the p-values in a reasonable way”

Schmidt, 1995, page 490

Page 36: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

A set of recipes and ritualsA set of recipes and rituals

Many attempts to remedy the inadequacy of usual significance tests

See for instance: the “Task Force” of the American Psychological Association (Wilkinson et al. 1999)

They do not supply real statistical thinking

“We need statistical thinking, not rituals”

Gigerenzer, 1998

They are both partially technically redundant and conceptually incoherent

Page 37: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

Confidence intervals could quickly become a compulsory Confidence intervals could quickly become a compulsory norm in experimental publicationsnorm in experimental publications

In practice two probabilities can be routinely associated with a specific interval estimate computed from a particular sample

The first probability is “the proportion of repeated intervals that contain the parameter”

It is usually termed the coverage probability

The second is the Bayesian “posterior probability that this interval contains the parameter” (given the data in hand), assuming a noninformative prior distribution

In the frequentist approach, it is forbidden to use the second probability

In the Bayesian approach, the two probabilities are valid

Moreover, an “objective Bayes” interval is often “a great frequentist procedure” (Berger, 2004)

Page 38: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

The debates can be expressed on these terms:

“whether the probabilities should only refer to data and be based on frequency or whether they should also apply to parameters and be regarded as measures of beliefs”

Page 39: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

The ambivalence of statistical instructorsThe ambivalence of statistical instructors

It is undoubtedly the natural (Bayesian) interpretation “a fixed interval having a 95% chance of including the true value of interest”that is the appealing feature of confidence intervals

Most statistical instructors tolerate and even usethis heretic interpretation

Page 40: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

In a popular statistical textbook (whose objective is to allow the reader “accessing the deep intuitions in the field”), one can found the following interpretation of the confidence interval for a proportion:

“Si dans un sondage de taille 1000, on trouve P [la proportion observée]= 0.613, la proportion 1 à estimer a une probabilité 0.95 de se trouver dansla fourchette: [0.58,0.64]”

“If in a public opinion poll of size 1000, one find P [the observed proportion]

= 0.613, the proportion 1 to be estimated has a 0.95 probability to be in the

range: [0.58,0.64]''

Claudine Robert, 1995, page 221

The ambivalence of statistical instructorsThe ambivalence of statistical instructors

Page 41: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

In an other book that claims the goal of understanding statistics, a 95% confidence interval is described as

“an interval such that the probability is 0.95 that the interval contains the population value]”

Pagano, 1990, page 228

The ambivalence of statistical instructorsThe ambivalence of statistical instructors

Page 42: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

The ambivalence of statistical instructorsThe ambivalence of statistical instructors

“It would not be scientifically sound to justify a procedure by frequentist arguments and to interpret it in Bayesian terms”

Rouanet, 2000

Page 43: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

 Other authors claim that the “correct” frequentist interpretationthey advocate can be expressed as :

 

Hard to imagine that readers can understand that “confident” refers here to a frequentist view of probability!

 “We can be 95% confident that the population mean

is between 114.06 and 119.94”

Kirk, 1982 

“We may claim 95% confidence that the population value of multiple R2

is no lower than 0.266”

Smithson, 2001

We will distinguish between probability as frequency, termed probability, and probability as information/uncertainty, termed confidence”

Schweder & Hjort (2002)

Page 44: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

 

Teaching the frequentist interpretation:

a losing battle

“we are fighting a losing battle”Freeman, 1993  

Page 45: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

Most statistical users are likely to be Bayesian“without knowing it”!

“It could be argued that since most physicians use statement A [the probability the true mean value is in the interval is 95%] to describe ‘confidence’ intervals, what they really want are ‘probability’ intervals. Since to get them they must use Bayesian methods, then they are really Bayesians at heart!”

Grunkemeier & Payne, 2002

Page 46: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

The Bayesian therapyThe Bayesian therapy

Page 47: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

It is not acceptable that that future statistical inference methods users will continue using non appropriate procedures because they know no other alternative

Since most people use “inverse probability” statements to interpret NHST and confidence intervals, the Bayesian definition of probability, conditional probabilities and Bayes’ formula are already - at least implicitly - involved in the use of frequentist methods

Which is simply required by the Bayesian approach is a very natural shift of emphasis about these concepts, showing that they can be used consistently and appropriately in statistical analysis (Lecoutre, 2006)

Page 48: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

A better understanding of frequentist procedures

“Students [exposed to a Bayesian approach] come to understand the frequentist conceptsof confidence intervals and P values better than do students

exposed only to a frequentist approachBerry, 1997

Combining descriptive statistics and significance tests

A basic situation: the inference about the difference between two normal means

Let us denote by d (assuming d≠0) the observed difference and by t the value of the Student's test statistic

Assuming the usual non informative prior, the posterior for is ageneralized (or scaled) t distribution (with the same degrees of freedomAs the t test), centered on d and with scale factor the ratio e=d/t

(see e.g. Lecoutre, 2006)

Page 49: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

Conceptual links

Bayesian interpretation of the p-value

The one-sided p-value of the t test is exactly the posterior Bayesian

probability that the difference has the opposite sign of the observeddifference

Bayesian interpretation of the confidence interval

It becomes correct to say that “there is a 95% [for instance] probability of

being included between the fixed bounds of the interval” (conditionally on the data)

If d>0, there is a p posterior probability of a negative difference and a 1-p complementary probability of a positive difference

In the Bayesian framework these statements are statistically correct

Page 50: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

Some decisive advantages

overcoming usual difficulties

In this way, Bayesian methods allow users to overcome usual difficulties encountered with the frequentist approach

“public use” statements

The use of noninformative priors has a privileged status in order to gain “public use” statements

Combining information

when “good prior information is available” other Bayesian techniques also have an important role to play in experimental investigations

Page 51: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

Bayesian procedures are no more arbitrarythan frequentist ones

Many potential users of Bayesian methods continue to think that they are too subjective to be scientifically acceptable

BUT

Page 52: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

frequentist methods are full of ad hoc conventions

Thus the p-value is traditionally based on the samples that are “more extreme” than the observed data (under the null hypothesis)

ExampleFor instance, let us consider the usual Binomial one-tailed test for the null hypothesis =0 against the alternative <0

This test is conservative, but if the observed data are excluded, it becomes liberal

A typical solution to overcome this problem consists in considering a “mid-p-value”, but it has only it ad hoc justifications

But, for discrete data, it depends on whether the observed data are included or not

Page 53: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

The choice of a noninformative prior distribution cannot avoid conventions

For Binomial sampling, different priors have been proposed for an objective Bayesian analysis (for a discussion, see e.g. Lee, 1989, pages 86-90)

It exists two extreme noninformative priors that are respectively the more unfavourable and the more favourable priors with respect to the null hypothesisThey are respectively the Beta distribution of parameters 1 and 0 and the Beta distribution of parameters 0 and 1

The observed significance levels of the inclusive and exclusive conventions

are exactly the posterior Bayesian probabilities that is greater than

0respectively associated with these two extreme priors

These two priors constitute an a priori “ignorance zone”' (Bernard, 1996), which is related to the notion of imprecise probability (see Walley, 1996)

But the particular choice of such a prior is an exact counterpart of the arbitrariness involved within the frequentist approach

Page 54: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

The usual criticism of frequentists towards the divergence of Bayesians with respect to the choice of a non informative prior can be easily reversed

Furthermore, the Jeffreys prior, which is very naturally the intermediate Beta distribution of parameters ½ and ½ gives an intermediate value, fully justified, close to the observed mid-p-value

The Jeffreys prior credible interval has remarkable frequentist propertiesIts coverage probability is very close to the nominal level, even for small-size samplesIt is undoubtedly an objective procedure that can be favourably compared to most frequentist intervals

“We revisit the problem of interval estimation of a binomial proportion…We begin by showing that the chaotic coverage properties of the Wald interval

are far more persistent than is appreciated...We recommend the Wilson interval or the equal tailed Jeffreys prior interval for small n”

Brown, Cai and DasGupta, 2001, page 101

Page 55: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

In this case, the observed significance levels of the inclusive and exclusive conventions are exactly the posterior Bayesian probabilities associated with the two respective priors Beta(0,0) and Beta(0,1)

The preceding results can be generalized to more general situations of comparisons between proportions

(see for the case of a 2×2 contingency table Lecoutre & Charron, 2000)

Similar results are obtained for negative-Binomial (or Pascal) sampling

This suggests that the intermediate Beta distribution of parameters 0 and ½ is an objective procedureIt is precisely the Jeffreys prior

This result concerns a very important issue related to the “likelihood principle”

Page 56: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

The predictive probabilities:A very appealing tool

The predictive idea is central in experimental investigations

“The essence of science is replication.A scientist should always be concerned about what would happen

if he or another scientist were to repeat his experiment”

Guttman, 1983

Page 57: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

Bayesian predictive probabilities:a very appealing method to answer essential questions such as

Planning: How many subjects?

“How big should be the experiment to have a reasonable chance of demonstrating a given conclusion?”

Monitoring: When to stop?

“Given the current data, what is the chance that the final result will be in some sense conclusive, or on the contrary inconclusive?”

These questions are unconditional in that they require consideration of all possible value of parameters

Whereas traditional frequentist practice does not address these questions, predictive probabilities give them direct and natural answer

“An essential aspect of the process of evaluating design strategies is the ability to calculate predictive probabilities of potential results”

Berry, 1991

Page 58: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

The stopping rule principle:A need to rethink

Experimental designs often involve interim looks at the data

Most experimental investigators feel that the possibility of early stopping cannot be ignored, since it may induce a bias on the inference that must be explicitly corrected

Page 59: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

Consequently, they regret the fact that the Bayesian methods, unlike the frequentist practice, generallyignore this specificity of the design

Bayarri and Berger (2004) consider this desideratum as an area of current disagreement between the frequentist and Bayesian approaches

This is due to the compliance of most Bayesians with the likelihood principle (a consequence of Bayes' theorem), which implies the stopping rule principle in interim analysis:

“Once the data have been obtained, the reasons for stopping experimentationshould have no bearing on the evidence reported

about unknown model parameters”

Bayarri and Berger, 2004, page 81

Would the fact that “people resist an idea so patently right” (Savage, 1954) be fatal to the claim that “they are Bayesian without knowing it”?

Page 60: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

This is not so sure, experimental investigators could well be right!

They feel that the experimental design (incorporating the stopping rule) is prior to the sampling information and thatthe information on the design is one part of the evidence

It is precisely the point of view developed by de Cristofaro (1996, 2004, 2006),who persuasively argued that the correct version of Bayes' formula must integrate

the parameter the design d

the initial evidence (prior to designing) e0

the statistical information i

Consequently Bayes' formula must be written in the following form:

p(| i, e0 ,d) p(| e0 ,d) p(i |,e0,d)

Page 61: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

It becomes evident that the prior depends on d

p(| i, e0 ,d) p(| e0 ,d) p(i |,e0,d)

With this formulation, both the likelihood principle and the stopping rule principle are no longer an automatic consequence

It is not true that, under the same likelihood, the inference about is the same, irrespective of d

Box and Tiao (1973, pages 45-46), stated that the Jeffreys priors are different for the Binomial and Pascal sampling as the two sampling models are also different

In both cases, the resulting posterior distribution have remarkable frequentist properties (i.e. coverage probabilities of credible intervals)

Page 62: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

This result can be extended to general stopping rules (Bunouf, 2006)

The basic principle is that the design information, which is ignored in the likelihood function, can be recovered in the Fisher information (which is related to Shannon's notion of entropy)

Within this framework, we can get a coherent and fully justified Bayesian answer to the issue of sequential analysis,which furthermore satisfy the experimental investigators desideratum (Bunouf and Lecoutre, 2006)

Page 63: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

ConclusionConclusion

Page 64: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

In actual fact I suggest that such a theory is by no means a speculativeviewpoint but on the contrary is perfectly feasible (see especially, Berger, 2004)It is better suited to the needs of users than frequentist approach and provide scientists with relevant answers to essential questions raised by experimental data analysis

“A widely accepted objective Bayes theory, which fiducial inference was

intended to be, would be of immense theoretical and practical importance.

A successful objective Bayes theory would have

to provide good frequentist properties in familiar situations, for instance,

reasonable coverage probabilities for whatever replaces confidence intervals”Efron, 1998, page 106

Page 65: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

Why scientists really appear to want a different kind Why scientists really appear to want a different kind of inference but seem reluctant to use Bayesian of inference but seem reluctant to use Bayesian

inferential procedures in practice? inferential procedures in practice?

“This state of affairs appears to be due to a combination of factors including

philosophical conviction,

tradition,

statistical training,

lack of ‘availability’,

computational difficulties,

reporting difficulties,

and perceived resistance by journal editors”

WinklerWinkler, 1974

Page 66: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

The times we are living in at the moment appear to be crucial

“we [statisticians] will all be Bayesians in 2020,

and then we can be a united profession”Lindley in Smith, 1995, page 317

One of the decisive factors could be the recent “draft guidance document” of the US Fud and Drug Administration (FDA, 2006)

This document reviews “the least burdensome way of addressing the relevant issues related to the use of Bayesian statistics in medical device clinical trials”

It opens the possibility for experimental investigators to really be Bayesian in practice

Page 67: Bruno Lecoutre C.N.R.S. et Université de Rouen  E-mail: bruno.lecoutre@univ-rouen.fr

“It is their straightforward, natural approach to inference

that makes them [Bayesian methods] so attractive”

Schmitt, 1969

Text and references available upon requestMail to : [email protected]