Page 1: A general statistical analysis for fMRI data

A general statistical analysis for fMRI data

Keith Worsley [1,2], Chuanhong Liao [1], John Aston [1,2,3], Jean-Baptiste Poline [4], Gary Duncan [5], Vali Petre [2], Alan Evans [2]

[1] Department of Mathematics and Statistics, McGill University

[2] Brain Imaging Centre, Montreal Neurological Institute

[3] Imperial College, London

[4] Service Hospitalier Frédéric Joliot, CEA, Orsay

[5] Centre de Recherche en Sciences Neurologiques, Université de Montréal

Page 2: A general statistical analysis for fMRI data

Choices …

• Time domain / frequency domain?

• AR / ARMA / state space models?

• Linear / non-linear time series model?

• Fixed HRF / estimated HRF?

• Voxel / local / global parameters?

• Fixed effects / random effects?

• Frequentist / Bayesian?

Page 3: A general statistical analysis for fMRI data

More importantly ...

• Fast execution / slow execution?

• Matlab / C?

• Script (batch) / GUI?

• Lazy / hard working … ?

• Why not just use SPM?

• Develop new ideas ...

Page 4: A general statistical analysis for fMRI data

Aim: Simple, general, valid, robust, fast analysis of fMRI data

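• Linear model: $Y_t = (\mathrm{stimulus}_t * \mathrm{HRF})\, b + \mathrm{drift}_t\, c + \mathrm{error}_t$

• AR(p) errors: $\mathrm{error}_t = a_1\, \mathrm{error}_{t-1} + \dots + a_p\, \mathrm{error}_{t-p} + \sigma\, \mathrm{WN}_t$

• Unknown parameters: $b$, $c$, $a_1, \dots, a_p$ and $\sigma$; $\mathrm{WN}_t$ is white noise.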

Page 5: A general statistical analysis for fMRI data

MATLAB: reads MINC or Analyze format (www.math.mcgill.ca/keith/fmristat)

• FMRIDESIGN: Sets up stimulus, convolves it with the HRF and its derivatives (for estimating delay).

• FMRILM: Fits model, estimates effects (contrasts in the magnitudes, b), standard errors, T and F statistics.

• MULTISTAT: Combines effects from separate runs/sessions/subjects in a hierarchical fixed / random effects analysis.

• TSTAT_THRESHOLD: Uses random field theory / Bonferroni to find thresholds for corrected P-values for peaks and clusters of T and F maps.

Page 6: A general statistical analysis for fMRI data
Page 7: A general statistical analysis for fMRI data

Example: Pain perception

[Figure: three panels plotted against time, t (seconds), from 0 to 400.]

(a) Stimulus, s(t): alternating hot and warm stimuli on forearm, separated by rest (9 seconds each).

(b) Hemodynamic response function, h(t): difference of two gamma densities (Glover, 1999).

(c) Response, x(t): sampled at the slice acquisition times every 3 seconds.
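The three panels can be approximated with a short sketch. The gamma-curve parameters below (peaks at 5.4 s and 10.8 s, FWHM 5.2 s and 7.35 s, undershoot ratio 0.35) are assumed values in the style of Glover (1999), not figures taken from the slide. The function names and the block order (hot, rest, warm, rest, 9 s each) are illustrative.

```python
import math

def glover_hrf(t, peak1=5.4, fwhm1=5.2, peak2=10.8, fwhm2=7.35, dip=0.35):
    """Difference of two gamma-shaped curves at time t (seconds).

    All parameter values are assumptions, not taken from the slides.
    """
    if t <= 0:
        return 0.0
    c = 8.0 * math.log(2.0)
    alpha1, beta1 = c * peak1 ** 2 / fwhm1 ** 2, fwhm1 ** 2 / (c * peak1)
    alpha2, beta2 = c * peak2 ** 2 / fwhm2 ** 2, fwhm2 ** 2 / (c * peak2)
    g1 = (t / peak1) ** alpha1 * math.exp(-(t - peak1) / beta1)
    g2 = (t / peak2) ** alpha2 * math.exp(-(t - peak2) / beta2)
    return g1 - dip * g2

def stimulus(t, kind, block=9.0):
    """1 while the given stimulus ('hot' or 'warm') is on, else 0."""
    phase = (t % (4 * block)) // block  # 0: hot, 1: rest, 2: warm, 3: rest
    return 1.0 if phase == {"hot": 0, "warm": 2}[kind] else 0.0

def response(kind, frame_times, dt=0.1, hrf_length=30.0):
    """x(t) = (s * h)(t), approximated by a Riemann sum with step dt."""
    n_lags = int(hrf_length / dt)
    h = [glover_hrf(k * dt) for k in range(n_lags)]
    return [
        dt * sum(h[k] * stimulus(t - k * dt, kind)
                 for k in range(n_lags) if t - k * dt >= 0)
        for t in frame_times
    ]

frame_times = [3.0 * i for i in range(120)]  # one frame every 3 seconds
x_hot = response("hot", frame_times)
x_warm = response("warm", frame_times)
```

The two lists `x_hot` and `x_warm` play the role of the regressors multiplied by b in the linear model; a hot minus warm contrast is then a difference of their coefficients.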

Page 8: A general statistical analysis for fMRI data
Page 9: A general statistical analysis for fMRI data

First step: estimate the autocorrelation

AR(1) model: $\mathrm{error}_t = a_1\, \mathrm{error}_{t-1} + \sigma\, \mathrm{WN}_t$

• Fit the linear model using least squares

• $\mathrm{error}_t = Y_t - \text{fitted } Y_t$

• $\hat a_1 = \mathrm{Correlation}(\mathrm{error}_t, \mathrm{error}_{t-1})$

• Estimating the $\mathrm{error}_t$'s changes their correlation structure slightly, so $\hat a_1$ is slightly biased.

[Figure: maps of $\hat a_1$ on a scale from -0.1 to 0.4, in three panels: raw autocorrelation, smoothed 15mm, bias corrected. Annotated values: ~ -0.05 and ~ 0.]
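The first step can be sketched for a single voxel as follows. This is a minimal illustration on simulated data, with an assumed design (constant plus one on/off regressor) and assumed true values; it omits the bias correction and the spatial smoothing mentioned on the slide, and the function names are hypothetical rather than FMRILM's.

```python
import random

def fit_line(x, y):
    """Ordinary least squares for y = c + b*x; returns (c, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - b * mx, b

def lag1_autocorrelation(resid):
    """a1_hat = sum_t e_t e_{t-1} / sum_t e_t^2 (no bias correction)."""
    num = sum(resid[t] * resid[t - 1] for t in range(1, len(resid)))
    den = sum(e * e for e in resid)
    return num / den

# Simulated voxel: assumed true values c = 2, b = 3, a1 = 0.3, sigma = 1.
random.seed(0)
n, a1_true = 20000, 0.3
x = [1.0 if (t // 6) % 2 == 0 else 0.0 for t in range(n)]
err, y = 0.0, []
for t in range(n):
    err = a1_true * err + random.gauss(0.0, 1.0)  # AR(1) error
    y.append(2.0 + 3.0 * x[t] + err)

c_hat, b_hat = fit_line(x, y)
resid = [yi - (c_hat + b_hat * xi) for xi, yi in zip(x, y)]
a1_hat = lag1_autocorrelation(resid)
```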

Page 10: A general statistical analysis for fMRI data

Second step: refit the linear model

Pre-whiten: $Y_t^* = Y_t - \hat a_1 Y_{t-1}$, then fit using least squares.

[Figure: three maps: effect (hot - warm), scale -6 to 6; sd of effect, scale 0 to 2.5; T statistic = effect / sd, scale -6 to 6.]

T > 4.86 (P < 0.05, corrected)
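A minimal single-voxel sketch of the second step, assuming an AR(1) coefficient has already been estimated. Both the data and every column of the design are pre-whitened; with a constant plus one regressor, the whitened constant stays constant, so an ordinary fit with intercept is equivalent. The simulated values and function names are assumptions for illustration only.

```python
import math
import random

def prewhiten(series, a1):
    """Y*_t = Y_t - a1 * Y_{t-1} for t = 1..n-1 (the first frame is dropped)."""
    return [series[t] - a1 * series[t - 1] for t in range(1, len(series))]

def fit_effect(x, y, a1):
    """Pre-whiten x and y, then least squares with intercept.

    Returns (effect, sd, T) for the coefficient of x.
    """
    xs, ys = prewhiten(x, a1), prewhiten(y, a1)
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((u - mx) ** 2 for u in xs)
    effect = sum((u - mx) * (v - my) for u, v in zip(xs, ys)) / sxx
    resid = [v - my - effect * (u - mx) for u, v in zip(xs, ys)]
    sigma2 = sum(r * r for r in resid) / (n - 2)  # two fitted coefficients
    sd = math.sqrt(sigma2 / sxx)
    return effect, sd, effect / sd

# Simulated voxel with AR(1) errors (assumed: effect 3, a1 = 0.3).
random.seed(1)
n = 5000
x = [1.0 if (t // 6) % 2 == 0 else 0.0 for t in range(n)]
err, y = 0.0, []
for t in range(n):
    err = 0.3 * err + random.gauss(0.0, 1.0)
    y.append(2.0 + 3.0 * x[t] + err)

effect, sd, t_white = fit_effect(x, y, a1=0.3)  # correlation modeled
_, _, t_naive = fit_effect(x, y, a1=0.0)        # correlation ignored
```

In this simulation `t_naive` comes out larger than `t_white`, the same direction of bias the slides report when the temporal correlation is ignored.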

Page 11: A general statistical analysis for fMRI data

Higher order AR model? Try AR(4):

[Figure: maps of $\hat a_1$, $\hat a_2$, $\hat a_3$ and $\hat a_4$, each on a scale from -0.1 to 0.4. The higher order coefficients $\hat a_2$, $\hat a_3$, $\hat a_4$ are annotated ~ 0.]

AR(1) seems to be adequate

Page 12: A general statistical analysis for fMRI data

… has no effect on the T statistics:

[Figure: T statistic maps, scale -6 to 6, for AR(1), AR(2), AR(4), and for a fit that ignores the correlation.]

But ignoring correlation … biases T up ~12%, giving more false positives.

Page 13: A general statistical analysis for fMRI data
Page 14: A general statistical analysis for fMRI data

Results from 4 runs on the same subject

[Figure: maps for Run 1, Run 2, Run 3 and Run 4 (columns); rows: effect $E_i$, scale -5 to 5; sd $S_i$, scale 0 to 2.5; T stat $E_i / S_i$, scale -5 to 5.]

Page 15: A general statistical analysis for fMRI data

MULTISTAT: combines effects from different runs/sessions/subjects:

• $E_i$ = effect for run/session/subject $i$ (from FMRILM)

• $S_i$ = standard error of effect (from FMRILM)

• Mixed effects model:

$E_i = \mathrm{covariates}_i\, c + S_i\, \mathrm{WN}_i^F + \sigma\, \mathrm{WN}_i^R$

• $\mathrm{covariates}_i$: usually 1, but could add group, treatment, age, sex, ...

• $S_i\, \mathrm{WN}_i^F$: 'fixed effects' error, due to variability within the same run

• $\sigma\, \mathrm{WN}_i^R$: random effect, due to variability from run to run

Page 16: A general statistical analysis for fMRI data

REML estimation using the EM algorithm

• Slow to converge (10 iterations by default).

• Stable (maintains estimate $\hat\sigma^2 > 0$), but $\hat\sigma^2$ is biased if $\sigma^2$ (random effect) is small, so:

• Re-parametrise the variance model:

$\mathrm{Var}(E_i) = S_i^2 + \sigma^2 = (S_i^2 - \min_j S_j^2) + (\sigma^2 + \min_j S_j^2) = S_i^{*2} + \sigma^{*2}$

• $\hat\sigma^2 = \hat\sigma^{*2} - \min_j S_j^2$ (less biased estimate)
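The re-parametrisation can be illustrated for the simplest case, where the only covariate is the constant 1. The sketch below uses the standard REML EM update for a single variance component, $\sigma^2 \leftarrow \sigma^2 + \sigma^4 (y'PPy - \mathrm{tr}\,P)/n$; that this matches the MULTISTAT implementation is an assumption, and the function names are hypothetical.

```python
import math

def reml_em(effects, sds, n_iter=10):
    """EM iterations for sigma^2 in Var(E_i) = S_i^2 + sigma^2 (intercept only)."""
    n = len(effects)
    mean = sum(effects) / n
    sigma2 = sum((e - mean) ** 2 for e in effects) / (n - 1)  # starting value
    for _ in range(n_iter):
        w = [1.0 / (s * s + sigma2) for s in sds]
        w_sum = sum(w)
        mean_w = sum(wi * e for wi, e in zip(w, effects)) / w_sum
        py = [wi * (e - mean_w) for wi, e in zip(w, effects)]  # P y
        trace_p = w_sum - sum(wi * wi for wi in w) / w_sum     # tr(P)
        sigma2 += sigma2 ** 2 * (sum(v * v for v in py) - trace_p) / n
    return sigma2

def reml_em_reparametrised(effects, sds, n_iter=10):
    """Estimate sigma*^2 with S*_i^2 = S_i^2 - min_j S_j^2, then shift back."""
    min_s2 = min(s * s for s in sds)
    star_sds = [math.sqrt(s * s - min_s2) for s in sds]
    sigma2_star = reml_em(effects, star_sds, n_iter)
    return sigma2_star - min_s2  # can be negative: not truncated at zero

# Hypothetical effects from four runs with a small random effect.
effects = [1.0, 1.1, 0.9, 1.0]
sds = [1.0, 1.0, 1.0, 1.0]
direct = reml_em(effects, sds)
shifted = reml_em_reparametrised(effects, sds)
```

With these numbers `direct` stays positive, as the EM update guarantees, while `shifted` is negative, which is how the re-parametrised estimate avoids the upward bias when the random effect is small.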

Page 17: A general statistical analysis for fMRI data

[Figure: maps for Run 1, Run 2, Run 3, Run 4 and MULTISTAT (columns); rows: effect $E_i$, scale -5 to 5; sd $S_i$, scale 0 to 2; T stat $E_i / S_i$, scale -5 to 5.]

Problem: 4 runs, 3 df for random effects sd ...

… very noisy sd:

… and T > 15.96 for P < 0.05 (corrected):

… so no response is detected …

Page 18: A general statistical analysis for fMRI data

Solution: Spatial regularization of the sd

• Basic idea: increase df by spatial smoothing (local pooling) of the sd.

• Can't smooth the random effects sd directly: too much anatomical structure.

• Instead,

sd = smooth( random effects sd / fixed effects sd ) × fixed effects sd

which removes the anatomical structure before smoothing.
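A one-dimensional sketch of the regularization: the ratio of random effects sd to fixed effects sd is smoothed, then multiplied back by the fixed effects sd. The Gaussian smoother on a 1-D grid of voxels is a stand-in chosen for brevity (the slides smooth 3-D images), and the names and numbers are hypothetical.

```python
import math

def gaussian_smooth(values, fwhm, spacing=1.0):
    """Normalised Gaussian kernel smoother on a 1-D grid (requires fwhm > 0)."""
    sigma = fwhm / math.sqrt(8.0 * math.log(2.0))
    out = []
    for i in range(len(values)):
        weights = [math.exp(-0.5 * ((i - j) * spacing / sigma) ** 2)
                   for j in range(len(values))]
        # Dividing by the weight sum renormalises the kernel at the edges.
        out.append(sum(w * v for w, v in zip(weights, values)) / sum(weights))
    return out

def regularized_sd(random_sd, fixed_sd, fwhm, spacing=1.0):
    """smooth(random sd / fixed sd) * fixed sd, voxel by voxel."""
    ratio = [r / f for r, f in zip(random_sd, fixed_sd)]
    smoothed = gaussian_smooth(ratio, fwhm, spacing)
    return [s * f for s, f in zip(smoothed, fixed_sd)]

# The fixed effects sd carries the "anatomical" structure (alternating 1 and 5).
fixed_sd = [1.0, 5.0, 1.0, 5.0, 2.0, 4.0]
random_sd = [1.5 * f for f in fixed_sd]  # constant ratio of 1.5
result = regularized_sd(random_sd, fixed_sd, fwhm=3.0)
```

Because the ratio here is constant, smoothing leaves it unchanged and `result` keeps the sharp structure of `fixed_sd`; smoothing `random_sd` directly would have blurred it.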

Page 19: A general statistical analysis for fMRI data

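[Figure: maps illustrating the regularization. Random effects sd (3 df), scale 0 to 4; fixed effects sd (448 df); the ratio random effects sd / fixed effects sd, scale 0 to 3, smoothed 15mm; regularized sd (112 df) = smoothed ratio × fixed effects sd. Annotated ratio values: ~1, ~1.6 (over scans), ~3 (over subjects).]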

Page 20: A general statistical analysis for fMRI data

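Effective df

$\mathrm{df}_{ratio} = \mathrm{df}_{random} \left( 2\, \frac{\mathrm{FWHM}_{ratio}^2}{\mathrm{FWHM}_{data}^2} + 1 \right)^{3/2}$

$\frac{1}{\mathrm{df}_{eff}} = \frac{1}{\mathrm{df}_{ratio}} + \frac{1}{\mathrm{df}_{fixed}}$

e.g. $\mathrm{df}_{random}$ = 3, $\mathrm{df}_{fixed}$ = 448, $\mathrm{FWHM}_{data}$ = 6mm:

FWHM_ratio (mm): 0 | 5 | 10 | 15 | 20 | infinite

df_eff: 3 | 11 | 45 | 112 | 192 | 448

No smoothing gives the random effects analysis (low df, high variability); infinite smoothing gives the fixed effects analysis (high df, but bias); 15mm is the compromise!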
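The effective df table can be checked numerically. The sketch assumes the formula reads df_ratio = df_random (2 FWHM_ratio² / FWHM_data² + 1)^(3/2) with 1/df_eff = 1/df_ratio + 1/df_fixed, and takes df_fixed = 448, the value the table reaches at infinite smoothing; the function name is hypothetical.

```python
import math

def effective_df(fwhm_ratio, fwhm_data, df_random, df_fixed):
    """Effective df of the regularized sd under the assumed formula."""
    df_ratio = df_random * (2.0 * (fwhm_ratio / fwhm_data) ** 2 + 1.0) ** 1.5
    return 1.0 / (1.0 / df_ratio + 1.0 / df_fixed)

fwhms = (0, 5, 10, 15, 20, math.inf)
table = [round(effective_df(f, 6.0, 3, 448)) for f in fwhms]
print(table)
```

Under these assumptions the rounded values agree with the df_eff row of the table (3, 11, 45, 112, 192, 448).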

Page 21: A general statistical analysis for fMRI data

[Figure: maps for Run 1, Run 2, Run 3, Run 4 and MULTISTAT (columns); rows: effect $E_i$, scale -5 to 5; sd $S_i$, scale 0 to 2; T stat $E_i / S_i$, scale -5 to 5.]

Final result: 15mm smoothing, 112 effective df …

… less noisy sd:

… and T > 4.86 for P < 0.05 (corrected):

… and now we can detect a response!

Page 22: A general statistical analysis for fMRI data
Page 23: A general statistical analysis for fMRI data

T > 4.86 (P < 0.05, corrected)

Page 24: A general statistical analysis for fMRI data

Conclusion

• Largest portion of variance comes from the last stage, i.e. combining over subjects:

$\frac{\mathrm{sd}_{run}^2}{n_{run}\, n_{sess}\, n_{subj}} + \frac{\mathrm{sd}_{sess}^2}{n_{sess}\, n_{subj}} + \frac{\mathrm{sd}_{subj}^2}{n_{subj}}$

• If you want to optimize total scanner time, take more subjects.

• What you do at early stages doesn't matter very much!
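A small numerical illustration of the conclusion, assuming the variance of the final effect is sd_run² / (n_run n_sess n_subj) + sd_sess² / (n_sess n_subj) + sd_subj² / n_subj. The unit sds are hypothetical, chosen only to show which count matters most.

```python
def variance_of_effect(sd_run, sd_sess, sd_subj, n_run, n_sess, n_subj):
    """Variance of the effect averaged over runs, sessions and subjects."""
    return (sd_run ** 2 / (n_run * n_sess * n_subj)
            + sd_sess ** 2 / (n_sess * n_subj)
            + sd_subj ** 2 / n_subj)

# Hypothetical design: 2 runs x 2 sessions x 2 subjects, all sds equal to 1.
baseline = variance_of_effect(1.0, 1.0, 1.0, n_run=2, n_sess=2, n_subj=2)
more_runs = variance_of_effect(1.0, 1.0, 1.0, n_run=4, n_sess=2, n_subj=2)
more_subjects = variance_of_effect(1.0, 1.0, 1.0, n_run=2, n_sess=2, n_subj=4)
```

Doubling the runs and doubling the subjects both double the scanner time, but only the subject count divides every term, so `more_subjects` is far smaller than `more_runs`.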

Page 25: A general statistical analysis for fMRI data

P.S. Estimating the delay of the response

• Delays or latency in the neuronal response are modeled as a temporal scale shift in the reference HRF:

[Figure: HRF against t (seconds), 0 to 25, for three scale shifts: delay = 5.4 seconds, log scale shift = 0 (reference HRF, $h_0$); delay = 4.0 seconds, log scale shift = -0.3; delay = 7.3 seconds, log scale shift = +0.3.]

• Fast voxel-wise delay estimator is found by adding the derivative of the reference HRF with respect to the log scale shift as an extra term to the linear model.

• Bias correction using the second derivative.

• Shrunk to the reference delay by a factor of $1/(1+1/T^2)$, where $T$ is the T statistic for the magnitude.
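The shrinkage step alone can be sketched as below. Applying the factor 1/(1 + 1/T²) directly to the difference between the raw delay and the reference delay, in seconds, is an assumption about how the slide's rule is used; the function name and example values are hypothetical.

```python
def shrink_delay(raw_delay, reference_delay, t_magnitude):
    """Pull a raw delay estimate towards the reference delay.

    The factor T^2 / (T^2 + 1) equals 1 / (1 + 1/T^2) and is also defined at T = 0.
    """
    factor = t_magnitude ** 2 / (t_magnitude ** 2 + 1.0)
    return reference_delay + factor * (raw_delay - reference_delay)

no_signal = shrink_delay(7.3, 5.4, t_magnitude=0.0)       # fully shrunk
strong_signal = shrink_delay(7.3, 5.4, t_magnitude=4.86)  # barely shrunk
```

Where there is no detectable magnitude (T near 0) the estimate falls back to the reference delay, and where the magnitude is strongly significant the raw estimate is left almost untouched.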

Page 26: A general statistical analysis for fMRI data

Delay of the hot stimulus

[Figure: four maps: T stat for magnitude = 0, scale -5 to 5; T stat for delay = 5.4 secs, scale -5 to 5; delay (secs), scale 0 to 10; sd of delay (secs), scale 0 to 5.]

Page 27: A general statistical analysis for fMRI data

Varying the delay of the reference HRF

[Figure: three rows of maps, for reference delay = 4.0, 5.4 and 7.3 seconds; columns: T stat for magnitude, T stat for delay, delay, sd of delay. Annotations: T stat for magnitude > 4.86, T stat for delay ~ 0, delay ~ 5.4s, sd of delay 0.6s; the delay is annotated ~ 5.4s in each row.]

Page 28: A general statistical analysis for fMRI data

[Figure: map of the delay (secs), colour scale from 4 to 6.5, shown where T > 4.86 (P < 0.05, corrected).]

Page 29: A general statistical analysis for fMRI data

[Figure: map of the delay (secs), colour scale from 4 to 6.5, shown where T > 4.86 (P < 0.05, corrected).]

Page 30: A general statistical analysis for fMRI data

Comparison: SPM'99 vs. fmristat

• Different slice acquisition times: SPM'99 adds a temporal derivative; fmristat shifts the model.

• Drift removal: SPM'99 uses low frequency cosines (flat at the ends); fmristat uses polynomials (free at the ends).

• Temporal correlation: SPM'99 uses AR(1), global parameter, bias reduction not necessary; fmristat uses AR(p), voxel parameters, bias reduction.

• Estimation of effects: SPM'99 band pass filters, then least squares, then correction for temporal correlation; fmristat pre-whitens, then least squares (no further corrections needed).

• Rationale: SPM'99 is more robust, but lower df; fmristat is more accurate, higher df.

• Random effects: SPM'99 has no regularization, low df; fmristat has regularization, high df.

• Map of the delay: SPM'99 no; fmristat yes.

Page 31: A general statistical analysis for fMRI data
Page 32: A general statistical analysis for fMRI data
Page 33: A general statistical analysis for fMRI data
Page 34: A general statistical analysis for fMRI data
Page 35: A general statistical analysis for fMRI data

References

• http://www.math.mcgill.ca/keith/fmristat

• Worsley et al. (2000). A general statistical analysis for fMRI data. NeuroImage, 11:S648, and submitted.

• Liao et al. (2001). Estimating the delay of the fMRI response. NeuroImage, 13:S185 and submitted.

Page 36: A general statistical analysis for fMRI data

False Discovery Rate (FDR)

Benjamini and Hochberg (1995), Journal of the Royal Statistical Society

Benjamini and Yekutieli (2001), Annals of Statistics

Genovese et al. (2001), NeuroImage

• FDR controls the expected proportion of false positives amongst the discoveries, whereas

• Bonferroni / random field theory controls the probability of any false positives

• No correction controls the proportion of false positives in the volume

Page 37: A general statistical analysis for fMRI data

[Figure: four panels on axes from -4 to 4: the signal with Gaussian white noise added (signal region and noise region), then the same image thresholded three ways, marking true + and false +.]

• P < 0.05 (uncorrected), Z > 1.64: 5% of volume is false +

• FDR < 0.05, Z > 2.82: 5% of discoveries is false +

• P < 0.05 (corrected), Z > 4.22: 5% probability of any false +

Page 38: A general statistical analysis for fMRI data

Comparison of thresholds

• FDR depends on the ordered P-values: $P_1 < P_2 < \dots < P_n$. To control the FDR at $\alpha$, find $K = \max\{i : P_i < (i/n)\,\alpha\}$ and threshold the P-values at $P_K$.

Proportion of true +: 1 | 0.1 | 0.01 | 0.001 | 0.0001

Threshold Z: 1.64 | 2.56 | 3.28 | 3.88 | 4.41

• Bonferroni thresholds the P-values at $\alpha/n$:

Number of voxels: 1 | 10 | 100 | 1000 | 10000

Threshold Z: 1.64 | 2.58 | 3.29 | 3.89 | 4.42

• Random field theory: resels = volume / FWHM³:

Number of resels: 0 | 1 | 10 | 100 | 1000

Threshold Z: 1.64 | 2.82 | 3.46 | 4.09 | 4.65
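The FDR rule and the Bonferroni row can be sketched in a few lines, assuming α = 0.05 and one-sided Gaussian (Z) statistics. The FDR function uses the usual "less than or equal" form of the Benjamini and Hochberg rule, which differs from the strict inequality on the slide only when a P-value falls exactly on the boundary. Function names and the example P-values are hypothetical.

```python
from statistics import NormalDist

def bonferroni_z(n_voxels, alpha=0.05):
    """One-sided Z threshold for P < alpha / n_voxels."""
    return NormalDist().inv_cdf(1.0 - alpha / n_voxels)

def fdr_threshold(p_values, alpha=0.05):
    """Return P_K with K = max{i : P_(i) <= (i/n) alpha}, or None if no i qualifies."""
    ordered = sorted(p_values)
    n = len(ordered)
    passing = [p for i, p in enumerate(ordered, start=1) if p <= i / n * alpha]
    return passing[-1] if passing else None

z_row = [round(bonferroni_z(n), 2) for n in (1, 10, 100, 1000, 10000)]

p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
p_cut = fdr_threshold(p_values)  # P-values at or below p_cut are discoveries
```

Under these assumptions `z_row` reproduces the Bonferroni thresholds in the table. In the example only the two smallest P-values pass their (i/n) α bounds, so `p_cut` is 0.008.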

Page 39: A general statistical analysis for fMRI data

Which do you prefer?

• FDR < 0.05, Z > 2.91: 5% of discoveries is false +

• P < 0.05 (corrected), Z > 4.86: 5% probability of any false +

• P < 0.05 (uncorrected), Z > 1.64: 5% of volume is false +