multiple testing

Multiple testing

Justin ChumbleyLaboratory for Social and Neural Systems ResearchInstitute for Empirical Research in EconomicsUniversity of Zurich

With many thanks for slides & images to:FIL Methods group

Overview of SPM

Realignment Smoothing

Normalisation

General linear model

Statistical parametric map (SPM)Image time-series

Parameter estimates

Design matrix

Template

Kernel

Gaussian field theory

p <0.05

Statisticalinference

Inference at a single voxel

contrast ofestimated

parameters

varianceestimate

H0 , H1: zero/non-zero activation

parameters

varianceestimate

Decision:H0 , H1: zero/non-zero activation

parameters

varianceestimate

Decision:H0 , H1: zero/non-zero activation

parameters

varianceestimate

Multiple tests

parameters

varianceestimate

What is the problem?

Multiple tests

parameters

varianceestimate

hp or more FP FWERFPE FDR

All positives

Multiple tests

parameters

varianceestimate

Bonferonni

FWER N

Convention: Choose h to limit assuming family-wise H0

Smooth errors: the facts• intrinsic smoothness

– MRI signals are aquired in k-space (Fourier space); after projection on anatomical space, signals have continuous support

– diffusion of vasodilatory molecules has extended spatial support

• extrinsic smoothness– resampling during preprocessing– matched filter theorem

deliberate additional smoothing to increase SNR

Model errors as Gaussian random fields

1 2 3 4 5 6 7 8 9-5

Surprising qualitative features in this landscape?

Height Spatial extentLocationTotal number

• General form for expected Euler characteristic• 2, F, & t fields• restricted search regions

<Q ; h> = S Rd (W) rd (h)

Number of regions, Q?

Rd (W): RESEL count; depends on the search region – how big, how smooth, what shape ?

rd (h): EC density; depends on type of field (eg. Gaussian, t) and thethreshold, h.

Worsley et al. (1996), HBM

• General form for expected Euler characteristic• 2, F, & t fields• restricted search regions

<Q ; h> = S Rd (W) rd (h)

Number of regions, Q?

Rd (W): RESEL count

R0(W) = (W) Euler characteristic ofWR1(W) = resel diameterR2(W) = resel surface areaR3(W) = resel volume

rd (h): d-dimensional EC density –E.g. Gaussian RF:

r0(h) = 1- (u)

r1(h) = (4 ln2)1/2 exp(-u2/2) / (2p)

r2(h) = (4 ln2) exp(-u2/2) / (2p)3/2

r3(h) = (4 ln2)3/2 (u2 -1) exp(-u2/2) / (2p)2

r4(h) = (4 ln2)2 (u3 -3u) exp(-u2/2) / (2p)5/2

Worsley et al. (1996), HBM

height, h

Size of a region, S?)|( hHsp ii

Location of events?

‘Poisson clumping heuristic’high maxima locations follow a spatial Poisson process.

Number of events: Poisson distributed, under high thresholds (Adler & Hasofer, 1981).

Location of events?

‘Poisson clumping heuristic’high maxima locations follow a spatial Poisson process - Thinned

Number of events: Poisson distributed, under high thresholds (Adler & Hasofer, 1981).

‘F W E’ decisions1. Induce a set of regions,

2. Ensure

Each element of A departure from flat signal

95.0)0|(| , shAp

},...,,{ ,,,, 2211 NN shshshsh AAAA

))(;Poi(~|| , sSphQA ish

• Advantage– Adaptive to spatial noise (unlike voxel-wise

approaches, voxBonf/voxFDR).

• Disadvantage– Stingy

• Consider A … – not a set of positive decisions – a set of candidates, with unknown noise/ signal

partition• Decide on a partition?

– BH algorithm 1. Arranges candidates according to ‘improbability’ 2. Determines a partition

},{ hs

hn AAA

},{ hs

• Select desired limit on FDR• Order p-values,

p(1) p(2) ... p(V)

• Let r be largest i such that

i.e. decides on a partition of A controlling

JRSS-B (1995)57:289-300

|)(|||)( AcAip i

sr Appp ˆ)()2()1( ,...,,

(i/|A|) p-va

i/|A|= proportion of all selected regions

ˆ, AAAA

EFDR ns

Conclusions

• There is a multiple testing problem (From either ‘voxel’ or ‘blob’ perspective)

• ‘Corrections’ necessary to counterbalance this• FWE

– Random Field Theory• Inference about blobs (peaks, clusters)• Excellent for large sample sizes (e.g. single-subject analyses or large

group analyses)• Afford littles power for group studies with small sample size

consider non-parametric methods (not discussed in this talk)

• FDR• represents false positive risk over whole set of regions

• Height, spatial extent, total number

Further reading• Friston KJ, Frith CD, Liddle PF, Frackowiak RS. Comparing

functional (PET) images: the assessment of significant change. J Cereb Blood Flow Metab. 1991 Jul;11(4):690-9.

• Genovese CR, Lazar NA, Nichols T. Thresholding of statistical maps in functional neuroimaging using the false discovery rate. Neuroimage. 2002 Apr;15(4):870-8.

• Worsley KJ Marrett S Neelin P Vandal AC Friston KJ Evans AC. A unified statistical approach for determining significant signals in images of cerebral activation. Human Brain Mapping 1996;4:58-73.

Thank you

multiple testing

s rd w rd hnumber of

set of regions

resel volume rd h

f w eadvantageadaptive

w euler characteristic

f w e control

spatial noise

resel diameterr2w

Documents

multiple hypothesis testing, adjusting for latent...

testing reading through multiple-choice? an insight within...

forecast combination and multiple testing

multiple testing adjustments

asynchronous online testing of multiple hypotheses

multiple environment over stress testing (meost)

multiple testing

recommendations on multiple testing adjustment in multi

sensitivity, specificity, roc multiple testing independent...

multiple regression and classical assumption testing ·...

“there’s no confidence in multiple-choice testing …”

multiple dimensions of load testing

global testing under sparse alternatives: anova, multiple

multiple testing etc

multiple testing with an empirical alternative hypothesis

candidate marker detection and multiple testing

multiple hypothesis testing - sci2s

multiple hypothesis testing - statpower

9 multiple testing

statistical analysis of multiple choice · pdf...