
Estimating the Transfer Function from Neuronal Activity to BOLD

Maria Joao Rosa

SPM Homecoming 2008

Wellcome Trust Centre for Neuroimaging

Statistic formulations

• P(A): probability of event A occurring

• P(A|B): probability of A occurring given B occurred

• P(B|A): probability of B occurring given A occurred

• P(A,B): probability of A and B occurring simultaneously (joint probability of A and B)

Joint probability of A and B

P(A,B) = P(A|B)*P(B) = P(B|A)*P(A)

P(B|A) = P(A|B)*P(B)/P(A)

which is Bayes’ Rule

Bayes’ Rule is very often referred to as Bayes’ Theorem, but it is not really a theorem, and should more properly be referred to as Bayes’ Rule (Hacking, 2001).

Reverend Thomas Bayes (1702 – 1761)

• Reverend Thomas Bayes was a minister interested in probability and stated a form of his famous rule in the context of solving a somewhat complex problem involving billiard balls

• It was first stated by Bayes in his ‘Essay towards solving a problem in the doctrine of chances’, published in the Philosophical Transactions of the Royal Society of London in 1764.

Conditional probability

P(A|B): conditional probability of A given B

Q: When are we considering conditional probabilities?

A: Almost always!

Examples:

• Lottery chances

• Dice tossing

Conditional probability

Examples (cont’):

• P(Brown eyes|Male): (P(A|B) with A := Brown eyes, B := Male)

1. What is the probability that a person has brown eyes, ignoring everyone who is not a male?

2. Ratio: (being a male with brown eyes)/(being a male)

3. Probability ratio: probability that a person is both male and has brown eyes to the probability that a person is male

P(Male) = P(B) = 0.52

P(Brown eyes) = P(A) = 0.78

P(Male with brown eyes) = P(A,B) = 0.38

P(A|B) = P(B|A)*P(A)/P(B) = P(A,B)/P(B) = 0.38/0.52 = 0.73…

Flipping it around (Bayes idea):

You could now also calculate the probability of being a male if you have brown eyes:

P(B|A) = P(A|B)*P(B)/P(A) = P(A,B)/P(A) = 0.38/0.78 = 0.4871…
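The arithmetic of the brown-eyes example can be checked with a few lines of Python. This is only a sketch using the three probabilities quoted on the slide; the variable names are illustrative.

```python
# Probabilities quoted in the example
p_b = 0.52    # P(B)   = P(Male)
p_a = 0.78    # P(A)   = P(Brown eyes)
p_ab = 0.38   # P(A,B) = P(Male with brown eyes)

# Conditional probability from the joint probability
p_a_given_b = p_ab / p_b                  # P(Brown eyes | Male)

# Bayes' Rule: flip the conditioning around
p_b_given_a = p_a_given_b * p_b / p_a     # P(Male | Brown eyes)

print(round(p_a_given_b, 4))
print(round(p_b_given_a, 4))
```

Note that the second result is the same as P(A,B)/P(A), so Bayes' Rule adds no new information here; it only re-expresses the joint probability.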

Statistic terminology

• P(A) is called the marginal or prior probability of A (since it is the probability of A prior to having any information about B)

Similarly:

• P(B): the marginal or prior probability of B

• P(A|B), viewed as a function of B for fixed A, is called the likelihood function for B

• P(B|A): the posterior probability of B given A (since it depends on having information about A)

Bayes’ Rule: P(B|A) = P(A|B)*P(B)/P(A)

• P(B|A): “posterior” probability of B given A

• P(A|B): “likelihood” function for B (for fixed A)

• P(B), P(A): prior probabilities of B, A (“priors”)

It relates the conditional density of a parameter (posterior probability) to its unconditional density (prior, since it depends on information present before the experiment).

The likelihood is the probability of the data given the parameter and represents the data now available.

Bayes’ Theorem for a given parameter θ

p(θ | data) = p(data | θ) p(θ) / p(data)

1/p(data) is basically a normalizing constant

Posterior ∝ likelihood × prior

The prior is the probability of the parameter and represents what was thought before seeing the data.

The posterior represents what is thought given both prior information and the data just seen.
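A grid approximation makes the role of the normalizing constant concrete. The sketch below assumes a hypothetical coin-flip experiment (k heads in n flips) with a flat prior over the heads-probability; none of these choices come from the slides.

```python
from math import comb

def grid_posterior(k, n, steps=101):
    """Posterior over a coin's heads-probability on an evenly spaced grid."""
    grid = [i / (steps - 1) for i in range(steps)]
    prior = [1.0 / steps] * steps                  # flat prior (assumption)
    likelihood = [comb(n, k) * t**k * (1 - t)**(n - k) for t in grid]
    unnormalised = [l * p for l, p in zip(likelihood, prior)]
    p_data = sum(unnormalised)                     # normalizing constant p(data)
    posterior = [u / p_data for u in unnormalised]
    return grid, posterior

# Hypothetical data: 7 heads in 10 flips
grid, posterior = grid_posterior(k=7, n=10)
best = grid[posterior.index(max(posterior))]       # most probable grid value
print(best)
```

Dividing by p(data) changes only the scale of the curve, not its shape, which is why the posterior is often written as proportional to likelihood × prior.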

Data and hypotheses…

– We have hypotheses H0 (null) and H1

– We have data (Y)

– We want to check whether the model that we have (H1) fits our data (accept H1 / reject H0) or not (H0)

Inferential statistics: what is the probability that we can reject H0 and accept H1 at some level of significance (α, p)

These are a-priori decisions, made even when we don’t know what the data will be and how it will behave.

Bayes: We get some evidence for the model (“likelihood”) and can then even compare “likelihoods” of different models

Where does Bayes’ Rule come in handy?

• In diagnostic cases where we are trying to calculate P(Disease | Symptom) we often know P(Symptom | Disease), the probability that you have the symptom given the disease, because this data has been collected from previous confirmed cases.

• In scientific cases where we want to know P(Hypothesis | Result), the probability that a hypothesis is true given some relevant result, we may know P(Result | Hypothesis), the probability that we would obtain that result given that the hypothesis is true. This is often statistically calculable, as when we have a p-value.
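The diagnostic case can be sketched in Python. All three input numbers below are hypothetical and chosen only for illustration; the function name is likewise made up.

```python
def p_disease_given_symptom(prior, p_sym_given_dis, p_sym_given_healthy):
    """Bayes' Rule for P(Disease | Symptom).

    prior               -- P(Disease), e.g. prevalence in the population
    p_sym_given_dis     -- P(Symptom | Disease)
    p_sym_given_healthy -- P(Symptom | no Disease)
    """
    # P(Symptom) by the law of total probability
    p_symptom = p_sym_given_dis * prior + p_sym_given_healthy * (1 - prior)
    return p_sym_given_dis * prior / p_symptom

# Hypothetical numbers: 1% prevalence, symptom seen in 90% of cases
# and in 5% of people without the disease
posterior = p_disease_given_symptom(0.01, 0.90, 0.05)
print(round(posterior, 3))
```

Even with a symptom that is common among patients, a low prior keeps the posterior modest, which is the usual argument for stating priors explicitly.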

Applicability to (f)MRI

• Let’s take fMRI as a relevant example

Y = X * β + ε

• We have:

– Measured data: Y

– Model: X

– Model estimates: β, ε (error/variance)

What do we get with inferential statistics?

• T-statistics on the betas (β = (β1, β2, …)) (taking error into account): for a specific voxel we would ONLY get the chance (e.g. < 5%) of seeing data this extreme if there were NO effect (e.g. of β1 > β2)

• But what about the likelihood of the model???

• What are the chances/likelihood that β1 > β2 at some voxel or region?

• Could we get some quantitative measure on that?

What do we get with Bayes statistics?

Here, the idea (Bayes) is to use our post-hoc knowledge (our data) to estimate the model (also allowing us to compare hypotheses (models) and see which fits our data best)

P(X|Y) = P(Y|X)*P(X)/P(Y)

i.e. P(β|Y) = P(Y|β)*P(β)/P(Y)

• P(X|Y): “posterior” distribution for X given Y

• P(Y|X): “likelihood” of Y given X

• P(Y), P(X): prior probabilities of Y, X (“priors”)

Now to Steve about the practical sides in SPM…

Bayes for Beginners: Applications

Bayes in SPM

SPM uses priors for estimation in…

• spatial normalization

• segmentation

• EEG source localisation

and Bayesian inference in…

• Posterior Probability Maps (PPM)

• Dynamic Causal Modelling (DCM)

Standard approach in science is the null hypothesis significance test (NHST)

Low p value suggests “there is not nothing”

Assumption is H0 = noise; randomness

p(D | H0)

Null hypothesis significance testing

H0 = molecules are randomly arranged in space

Looking unlikely…

Krueger (2001) American Psychologist

Something vs nothing

t = D / (s/√n)

…If there is any effect, the test statistic grows with √n

Our interpretations ultimately depend on p(H0)

“Risky” vs “safe” research…

Better to be explicit – incorporate subjectivity when specifying hypotheses.

Belief change = p(H0) – p(H0 | D)

If the underlying effect δ ≠ 0, no matter how small, the test statistic grows in size – is this physiological?

The case for the defence

• Law of large numbers means that the test statistic will identify a consistent trend (δ ≠ 0) with a sufficient sample size

• In SPM, we look at images of statistics, not effect sizes

• A highly significant statistic may reflect a small non-physiological difference, with large N

• BUT… as long as we are aware of this, classical inference works well for common sample sizes
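The growth of the test statistic with sample size can be illustrated numerically. The sketch below assumes the standard one-sample t form, with the sample mean and standard deviation replaced by hypothetical population values (delta and sigma), so it shows the expected trend rather than a real test.

```python
from math import sqrt

def expected_t(delta, sigma, n):
    """Approximate size of a one-sample t statistic for true effect delta,
    noise standard deviation sigma and sample size n."""
    return delta / (sigma / sqrt(n))

# A tiny, hypothetical effect: 1% of the noise standard deviation
for n in (10, 1_000, 100_000):
    print(n, round(expected_t(delta=0.01, sigma=1.0, n=n), 3))
```

With these made-up values the statistic passes the conventional 1.96 cut-off only at the largest n, although the effect itself never changes.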

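Posterior Probability Distribution

precision λ = 1/σ²

λpost = λd + λp

Mpost = (λd Md + λp Mp) / λpost

[Figure: Gaussian prior (mean Mp, variance λp⁻¹), data (mean Md, variance λd⁻¹) and posterior (mean Mpost, variance λpost⁻¹)]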
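The precision-weighted combination of a Gaussian prior with Gaussian data can be sketched as follows. The function name and the example numbers are illustrative assumptions; only the rule itself (precisions add, means are precision-weighted) is taken from the slide.

```python
def combine_gaussians(m_prior, var_prior, m_data, var_data):
    """Posterior mean and variance for a Gaussian prior and Gaussian data."""
    prec_prior = 1.0 / var_prior          # precision = 1 / variance
    prec_data = 1.0 / var_data
    prec_post = prec_data + prec_prior    # precisions add
    m_post = (prec_data * m_data + prec_prior * m_prior) / prec_post
    return m_post, 1.0 / prec_post

# Hypothetical values: vague prior around 0, sharper data around 2
m_post, var_post = combine_gaussians(m_prior=0.0, var_prior=4.0,
                                     m_data=2.0, var_data=1.0)
print(m_post, var_post)
```

The posterior mean lies between the prior and data means, closer to whichever is more precise, and the posterior variance is smaller than either input variance.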

BUT!!! What is p(H0) for randomness?!

Reframe the question – compare alternative hypotheses/models:

H1 vs H2 vs H3 vs H4 vs etc…

(1) Bayesian model comparison

Bayes: p(θ|y) = p(y|θ) p(θ) / p(y)

If only one model, then p(y) is a normalising constant…

For model Hi: p(θ|y,Hi) = p(y|θ,Hi) p(θ|Hi) / p(y|Hi)

Model evidence for Hi: p(y|Hi)

Practical example (1)

Dynamic causal modelling (DCM)

[Figure: three candidate DCMs (H=1, H=2, H=3) of the connections among V1, V5 and SPC, with inputs Photic, Motion and Attention, annotated with estimated connection strengths]

Model Evidence: p(y|Hi) = ∫ p(y|θ,Hi) p(θ|Hi) dθ

Bayes factor: B12 = p(y|H1) / p(y|H2)
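A toy Bayes factor shows how model evidence is obtained by integrating the likelihood over the prior. The example below is a sketch with hypothetical coin-flip data and two made-up models; it has nothing to do with the DCMs above and uses simple midpoint integration.

```python
from math import comb

def evidence_fixed(k, n, theta=0.5):
    """Evidence for H1: heads-probability fixed at theta (no free parameter)."""
    return comb(n, k) * theta**k * (1 - theta)**(n - k)

def evidence_uniform(k, n, steps=10_000):
    """Evidence for H2: likelihood integrated over a uniform prior on theta."""
    width = 1.0 / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * width          # midpoint of each sub-interval
        total += comb(n, k) * theta**k * (1 - theta)**(n - k) * width
    return total

# Hypothetical data: 9 heads in 10 flips
k, n = 9, 10
b12 = evidence_fixed(k, n) / evidence_uniform(k, n)
print(round(b12, 3))   # values below 1 favour H2 over H1
```

The more flexible model H2 is automatically penalised because its evidence averages the likelihood over all parameter values, including poor ones.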

What are the priors?

General Linear Model: Y = Xβ + ε, with ε ~ N(0, Cε)

• In “classical” SPM, no (flat) priors

• In “full” Bayes, priors might be from theoretical arguments or from independent data

• In “empirical” Bayes, priors derive from the same data, assuming a hierarchical model for generation of the data

Parameters of one level can be made priors on distribution of parameters at lower level

Parameters and hyperparameters at each level can be estimated using EM algorithm

(2) Priors about the null hypothesis

Shrinkage prior

General Linear Model: Y = Xβ + ε⁽¹⁾, p(ε⁽¹⁾) = N(0, C⁽¹⁾)

Shrinkage prior: p(β) = N(0, C⁽²⁾)

[Figure: prior density p(β), centred on 0]

In the absence of evidence to the contrary, parameters will shrink to zero
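The effect of a zero-mean Gaussian prior on a regression parameter can be sketched for the simplest case of a single regressor. The data, noise variance and prior variance below are hypothetical, and the variances are treated as known rather than estimated, so this is not how SPM fits its hierarchical model.

```python
def ols_estimate(x, y):
    """Classical least-squares estimate for y = x * beta + noise."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

def shrinkage_estimate(x, y, noise_var, prior_var):
    """Posterior mean of beta under a zero-mean Gaussian prior."""
    data_precision = sum(a * a for a in x) / noise_var
    prior_precision = 1.0 / prior_var
    weighted_data = sum(a * b for a, b in zip(x, y)) / noise_var
    return weighted_data / (data_precision + prior_precision)

# Hypothetical regressor and measurements
x = [1.0, 2.0, 3.0]
y = [1.1, 1.9, 3.2]
print(round(ols_estimate(x, y), 3))
print(round(shrinkage_estimate(x, y, noise_var=1.0, prior_var=0.1), 3))
```

A tight prior (small prior_var) pulls the estimate towards zero; as prior_var grows, the estimate approaches the classical one.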

Bayesian Inference

p(β|y) ∝ p(y|β) p(β)   (Posterior ∝ Likelihood × Prior)

SPMs: t = f(y), p(t | β = 0), threshold u (Classical T-test)

PPMs: p(β | y) (Bayesian test)

Changes with search volume

Practical example (2)

SPM5 Interface

(2) Posterior Probability Maps

Mean (Cbeta_*.img)

Std dev (SDbeta_*.img)

PPM (spmP_*.img)

Activation threshold

Probability

Posterior probability distribution p(β|Y)
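One way to read a posterior probability map is as the probability, at each voxel, that the effect exceeds an activation threshold. The sketch below assumes a Gaussian posterior summarised by a mean and a standard deviation (as in the mean and std dev images listed above); the function name and numbers are hypothetical, and SPM's own computation may differ.

```python
from math import erf, sqrt

def posterior_prob_exceeds(mean, sd, threshold):
    """P(beta > threshold | y) for a Gaussian posterior N(mean, sd^2)."""
    z = (threshold - mean) / sd
    cdf = 0.5 * (1.0 + erf(z / sqrt(2.0)))   # standard normal CDF at z
    return 1.0 - cdf

# Hypothetical voxel: posterior mean 1.0, posterior std dev 0.5, threshold 0
p = posterior_prob_exceeds(mean=1.0, sd=0.5, threshold=0.0)
print(round(p, 4))
```

A voxel would then be shown in the map if this probability exceeds a chosen level, which involves two user choices: the activation threshold and the probability.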

(3) Use informative priors (cutting edge!)

• Spatial constraints on fMRI activity (e.g. grey matter)

• Spatial constraints on EEG sources, e.g. using fMRI blobs

?

(4) Tasters – The Bayesian Brain

(4a) Taster: Modelling behaviour…

Ernst & Banks (2002) Nature

(4b) Taster: Modelling the brain…

Friston (2005) Phil Trans R Soc B

Acknowledgements and further reading

• Previous MFD talks

• Jean & Guillaume’s SPM course slides

• Krueger (2001) Null hypothesis significance testing Am Psychol 56: 16-26

• Penny et al. (2004) Comparing dynamic causal models. Neuroimage 22: 1157-1172

• Friston & Penny (2003) Posterior probability maps and SPMs Neuroimage 19: 1240-1249

• Friston (2005) A theory of cortical responses Phil Trans R Soc B

• www.ualberta.ca/~chrisw/BayesForBeginners.pdf

• www.fil.ion.ucl.ac.uk/spm/doc/books/hbf2/pdfs/Ch17.pdf


Bayes’ ending

Bunhill Fields Burial Ground, off City Road, EC1
