Wavelet-Based Statistical Analysis of fMRI Data


Post on 20-Apr-2022


UCLA, Ivo Dinov Slide 1

Stat 233 Statistical Methods in Biomedical Imaging

Wavelet-Based Statistical Analysis of fMRI Data

Ivo Dinov

http://www.stat.ucla.edu/~dinov

UCLA, Ivo DinovSlide 3

fMRI Simulation movie

UCLA, Ivo DinovSlide 4

Problems with BOLD

How localized is the BOLD response to the site of neuronal activity?

Is the signal from draining veins rather than the tissue itself?

How is the signal change coupled to neuronal activity?

Do changes in timing of BOLD responses reliably tell us about changes in timing of neural activity?

UCLA, Ivo DinovSlide 5

Resting state versus active state, e.g., finger tapping, word recognition.

Whole brain scanned in ~3 seconds using a high speed imaging technique (EPI).

Perform analysis to detect regions which show a signal increase in response to the stimulus.

Activation Imaging using BOLD

UCLA, Ivo DinovSlide 6

fMRI Data Analysis Tools

Brain Voyager http://www.brainvoyager.de/

CARET http://brainmap.wustl.edu/resources/caretnew.html

FSL http://www.fmrib.ox.ac.uk/fsl/

SCIrun http://software.sci.utah.edu/scirun.html

LONI Pipeline http://www.loni.ucla.edu

McGill BIC www.bic.mni.mcgill.ca/usr/local/matlab5/toolbox/fmri/

AFNI http://afni.nimh.nih.gov/afni/

SPM http://www.fil.ion.ucl.ac.uk/spm/

UCLA, Ivo Dinov Slide 7

Hypotheses vs. Data driven Approaches

Hypothesis-driven
Examples: t-tests, correlations, general linear model (GLM)
• An a priori model of activation is suggested
• Data is checked to see how closely it matches components of the model
• Most commonly used approach (e.g., SPM)

Data-driven – Independent Component Analysis (ICA)
• No prior hypotheses are necessary
• Multivariate techniques determine the patterns in the data that account for the most variance across all voxels
• Can be used to validate a model (see if the math comes up with the components you would’ve predicted)
• Can be inspected to see if there are things happening in your data that you didn’t predict
• Can be used to identify confounds (e.g., head motion)
• Need a way to organize the many possible components
• New and upcoming


UCLA, Ivo DinovSlide 8

A. Region-Of-Interest (ROI) approach
1. Use one or more localizer runs to find a region (e.g., show moving rings to find the middle temporal (MT) area)
2. Extract time course information from that region in separate independent runs
3. See if the trends in that region are statistically significant

The runs that are used to generate the area are independent from those used to test the hypothesis.

Example study: Tootell et al, 1995, Motion Aftereffect

Two approaches: Global vs. Regional

Localize “motion area” MT in a run comparing moving vs. stationary rings

Extract time courses from MT in subsequent runs while subjects see illusory motion (motion aftereffect)

MT

UCLA, Ivo DinovSlide 9

Two Approaches: Whole Brain Stats

Source: Tootell et al

B. Whole volume statistical approach
1. Make predictions about what differences you should see if your hypotheses are correct
2. Decide on statistical measures to test for predicted differences (e.g., t-tests, correlations, GLMs)
3. Determine an appropriate statistical threshold
4. See if statistical estimates are significant

Statistics available
1. T-test
2. Correlation
3. Frequency-based (Fourier/Wavelet/Fractal) modeling
4. General Linear Model: an overarching statistical model that lets you perform many types of statistical analyses (incl. correlation/regression, ANOVA)

UCLA, Ivo DinovSlide 10

Comparing the two approaches

Region of Interest (ROI) Analyses
• Gives you more statistical power because you do not have to correct for the number of comparisons
• Hypothesis-driven
• ROI is not smeared due to intersubject averaging
• Easy to analyze and interpret
• Neglects other areas which may play a fundamental role
• Popular in North America

Whole Brain Analysis
• Requires no prior hypotheses about areas involved
• Includes entire brain
• Can lose spatial resolution with intersubject averaging
• Can produce meaningless lists of areas that are difficult to interpret
• Depends highly on statistics and threshold selected
• Popular in Europe

NOTE: Though different experimenters tend to prefer one method over the other, they are NOT mutually exclusive. You can check ROIs you predicted and then check the data for other areas.

UCLA, Ivo DinovSlide 11

Why do we need statistics?

MR signal intensities are random:
- variation caused by the scanner (magnet, coil, time)
- variation in neurophysiology (area, task, subject)

Rest-state scans are used against activation tasks in a subtraction paradigm to correct for some of these effects within the same run.

Statistical analyses help us confirm whether the eyeball tests of significance based on visual thresholding are real/correct.

Because we do so many comparisons (~10^6), we need a way to compensate for the resulting increase in Type I error.

UCLA, Ivo DinovSlide 12

Paradigm: Event-related design to assess between-population differences in the amplitude and variance of the hemodynamic response to a visual stimulus (14 young, 14 old non-AD and 13 AD subjects)

Stimulus required a right-hand index-finger click

16 coronal slices
64x64 in-plane voxels
volume time = 2 sec
3 x 3 x 5 mm^3 voxels

4 runs per subject
128 volumes per run
2 (randomly chosen) conditions (always 7+8)
1 trial = 21.44 sec (8 vol’s)
Total 60 trials/subject

fMRI of Young, Nondemented and Demented Adults

15 Random Trials / Run

[Figure: trial timelines for Runs 1 to 4, each spanning volumes 1 to 128 in steps of 8; legend: blank, 1-trial, 2-trial]

1-trial condition – 1.5 second 8 Hz B/W checkerboard flicker

2-trial condition – 2 visual stimuli 5.36 sec apart

Buckner et al. J. Cog. Neurosci. 2000

UCLA, Ivo DinovSlide 13

One Approach – Voxel-Based T-tests

Look for differences b/w One-Trial & Two-Trial activation, for a given voxel:

Measure the average MR signal and SD over the volumes in which 1-Trial stimuli were presented (7-8 epochs x 8 volumes/epoch = 56-64 volumes).

Measure the average MR signal and SD over the volumes in which 2-Trial stimuli were presented (56-64 volumes).

Stat-significant mean difference? Calculate the observed t value, t_o. Look up the p value for that number of degrees of freedom (df >= 56 x 2 - 2 = 110), e.g., for ~110 df, t_o > 1.98 → p < 0.05 and t_o > 3.39 → p < 0.001.

Repeat this process for each of the 65,536 voxels (64x64x16).

If p = .001 and there are 65,536 voxels, 65,536*0.001 ≈ 66 voxels could be significant purely by chance.

To look for 2-Trial > 1-Trial activation, look at the negative tail of the 1-Trial – 2-Trial comparison.

[Figure: fMRI signal in the 1-Trial and 2-Trial conditions]
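As a rough illustration of the per-voxel test, here is a minimal sketch of a pooled-variance two-sample t statistic. The function name `voxel_t`, the synthetic signal values and the choice of a pooled (equal-variance) test are assumptions made for this example, not something the slides specify.

```python
import numpy as np

def voxel_t(one_trial, two_trial):
    """Pooled-variance two-sample t statistic for a single voxel (hypothetical helper)."""
    a = np.asarray(one_trial, dtype=float)
    b = np.asarray(two_trial, dtype=float)
    n1, n2 = a.size, b.size
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * a.var(ddof=1) + (n2 - 1) * b.var(ddof=1)) / df
    t = (a.mean() - b.mean()) / np.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return t, df

# Synthetic voxel: 56 volumes per condition, 2-Trial signal slightly higher (made-up numbers).
rng = np.random.default_rng(0)
one = rng.normal(100.0, 1.0, size=56)
two = rng.normal(101.0, 1.0, size=56)
t_obs, df = voxel_t(one, two)

# Expected number of false positives at p = 0.001 over a 64 x 64 x 16 volume.
expected_false_positives = 64 * 64 * 16 * 0.001
```

A negative `t_obs` here falls in the negative tail of the 1-Trial minus 2-Trial comparison; in a real analysis the same computation would be repeated for every voxel.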


UCLA, Ivo DinovSlide 14

T-test: Maps

For each voxel in the brain, we can now color code a map based on the computed t and p values:

We can do this for the positive tail (1 Trial – 2 Trial)
Orange = low significance
Yellow = high significance

And we can also do this for the negative tail (2 Trial – 1 Trial)
Blue = low significance
Green = high significance

Schmutz or ESP voxels?

UCLA, Ivo DinovSlide 15

Another Statistical Analysis – Basic Correlation
• Find voxels whose time course is correlated with a reference function
• Can incorporate the hemodynamic response function (HRF) to predict the time course more accurately

For each voxel:
• Find the correlation between the predictor and the MR signal
• Extract the correlation (r value) and find the corresponding p value (Pearson/Spearman)
• Determine whether it is statistically significant
• Similar in spirit to a T-test

[Figure: fMRI signal (56 data points) plotted against the value of the 0/1 reference-model predictor (56 data points); calculate r]

Remember, r^2 is the proportion of variance accounted for by our predictor, e.g., if r = 0.7, then r^2 = 0.49, i.e., 49%.
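The correlation step can be sketched as follows. This is only an illustration under assumed inputs: the 0/1 boxcar predictor, the noise level and the helper name `voxel_correlation` are made up for the example.

```python
import numpy as np

def voxel_correlation(signal, predictor):
    """Pearson correlation r between one voxel's time course and a predictor."""
    s = np.asarray(signal, dtype=float)
    p = np.asarray(predictor, dtype=float)
    s = s - s.mean()
    p = p - p.mean()
    return float(np.dot(s, p) / np.sqrt(np.dot(s, s) * np.dot(p, p)))

# Hypothetical 0/1 reference function with 56 points (4 volumes off, 4 volumes on).
predictor = np.tile([0.0] * 4 + [1.0] * 4, 7)

# Synthetic voxel time course: baseline + response + noise (made-up numbers).
rng = np.random.default_rng(1)
signal = 100.0 + 2.0 * predictor + rng.normal(0.0, 1.0, size=predictor.size)

r = voxel_correlation(signal, predictor)
r_squared = r ** 2  # proportion of variance accounted for by the predictor
```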

UCLA, Ivo DinovSlide 16

Predictor – Hemodynamic Response Function - HRF

% signal change = (point – baseline)/baseline, usually 0.5-3%

Initial dip
- more focal and potentially a better measure
- somewhat elusive so far, not everyone can find it

Time to rise: signal begins to rise soon after stimulus begins

Time to peak: signal peaks 4-6 sec after stimulus begins

Post-stimulus undershoot: signal suppressed after stimulation ends

[Figure: percent change from baseline vs. time from stimulus onset (sec), with stimulus onset marked]

UCLA, Ivo DinovSlide 17

Correlation: Incorporating the HRF

We can model the expected curve of the data by convolving our predictor with the hemodynamic response function.

To find a 1-Trial responsive area, we can correlate the convolved 1-Trial predictor with each voxel time course.

[Figure: the 0/1 predictor m(x) is convolved with the kernel g(x) to give the model (m*g)(x) (56 model points), which is compared with the data f(x) (56 data points)]
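A minimal sketch of the convolution step, assuming a simple single-gamma kernel as a stand-in for g(x). The gamma shape, the 2-second sampling and the event positions are illustrative assumptions; this kernel peaks near 5 seconds but does not model the initial dip or the post-stimulus undershoot.

```python
import numpy as np

TR = 2.0  # assumed sampling interval in seconds

def gamma_hrf(t, shape=6.0, scale=1.0):
    """Single-gamma kernel g(t), normalized to unit sum (illustrative, not a fitted HRF)."""
    h = t ** (shape - 1.0) * np.exp(-t / scale)
    return h / h.sum()

t = np.arange(0.0, 30.0, TR)   # kernel support: 0..28 s
g = gamma_hrf(t)               # kernel g(x)

m = np.zeros(56)               # predictor m(x): 56 volumes
m[[4, 20, 36]] = 1.0           # hypothetical stimulus onsets (volume indices)

# Model (m*g)(x), truncated to the length of the run.
model = np.convolve(m, g)[: m.size]
```

The resulting `model` can then be correlated with each voxel time course in place of the raw 0/1 predictor.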

UCLA, Ivo DinovSlide 18

Correlations: Negative Tail

Looking for anything that shows a reduced response during 1-Trial condition (compared to both 2-Trial and fixation/rest)

UCLA, Ivo DinovSlide 19

Two Main fMRI Designs

Block design
• Compare long periods (e.g., 16 sec) of one condition with long periods of another
• Traditional approach
• Most statistically powerful approach
• Less dependent on how well you model the HRF (hemo)
• Habituation effects?!?

Event-related design
• Compare brief trials (1 sec) of one condition with brief trials of another
• Newer approach (1997+)
• Less statistically powerful but has some advantages
• Trials can be well-spaced to allow the MR signal to return to baseline b/w trials (e.g., 21+ sec apart) or closely spaced (e.g., every 2 sec)


UCLA, Ivo DinovSlide 20

Problems with T-tests and Correlation Analyses

1) How do we evaluate runs with different orders?
We could average our 4 runs (60 trials) by 1-Trial or 2-Trial stimuli and then do stats comparing the functional differences between the 2 paradigms. There is no way to collapse between orders.

2) If we test more subjects, how can we evaluate all subjects together?As with the single subject runs, we could average all the subjects together (after morphing them into an atlas space) but that still means we have to run all of them in the same order.

3) We can get nice hemodynamic predictors for 1-Trial or 2-Trial stimuli, but how can we compare them accurately?

If this predictor is significant, we won’t know if it’s because

1-Trial > 2-Trial OR 1-Trial > Fixation

A Solution: Use General Linear Modeling

UCLA, Ivo Dinov Slide 21

General Linear Model for fMRI

Adapted from Brain Voyager course slides

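Parse out variance in the voxel’s time course to the contributions of six predictors plus residual noise (what the predictors can’t account for):

$$\text{fMRI signal} = \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \beta_6 x_6 + \text{residuals},$$

where each $x_i$ is one predictor time course (one column of the design matrix).

Design matrix examples: Stimulus, Subject, Run, Trial, Group (AD/Young), ROI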
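The decomposition into six weighted predictors plus residuals can be sketched with an ordinary least-squares fit. The design matrix below is random and purely illustrative (a constant column plus five made-up predictors); real predictors would be stimulus, run or group regressors.

```python
import numpy as np

def fit_glm(y, X):
    """Least-squares estimate of beta in y = X @ beta + residuals."""
    beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    return beta, residuals

rng = np.random.default_rng(2)
n_volumes = 128

# Hypothetical design matrix: six predictors (columns), one row per volume.
X = np.column_stack([np.ones(n_volumes), rng.normal(size=(n_volumes, 5))])

# Synthetic voxel time course generated from known weights plus noise.
true_beta = np.array([100.0, 2.0, 0.0, -1.0, 0.5, 0.0])
y = X @ true_beta + rng.normal(0.0, 0.1, size=n_volumes)

beta_hat, residuals = fit_glm(y, X)
```

By construction the residuals are orthogonal to every predictor, which is what "what the predictors can’t account for" means in the least-squares sense.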

UCLA, Ivo DinovSlide 22

Completeness and Efficiency of Signal Representation – Wavelet functions 1D

SOCR Wavelet/Fourier Demo!

http://socr.stat.ucla.edu/

UCLA, Ivo DinovSlide 23

Mostly no global multi-scale self-similarity.

Contain both man-made and natural features

Mostly no simple and universal underlying physical or biological processes that generate the patterns in a general image.

Features of Natural Images

Ref. www.math.umn.edu/~jhshen

UCLA, Ivo DinovSlide 24

Fourier Representation

Claim: Harmonic waves are bad vision neurons…

Because a typical Fourier neuron is

$$\phi(x) = \exp(iax).$$

To process a simple bright spot $\delta(x)$ in the visual field, all such neurons have to respond to it (!) since

$$\langle \delta, \phi \rangle = \int \delta(x)\, e^{-iax}\, dx = 1.$$

UCLA, Ivo DinovSlide 25

Efficiency of representation

David Field (Cornell U, Vision psychologist):

“To discriminate between objects, an effective transform (representation) encodes each image using the smallest possible number of neurons, chosen from a large pool.”

UCLA, Ivo DinovSlide 26

Asking our own “headtop”…

Psychologists show that visual neurons are spatially organized, and each behaves like a small sensor (receptor) that can respond strongly to spatial changes such as edge contours of objects (Field, 1990).

UCLA, Ivo DinovSlide 27

Marr’s Edge Detection

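Detection of edge contours is a critical ability of human vision (Marr, 1982). Marr and Hildreth (1980) proposed a model for human detection of edges at all scales. This is Marr’s Theory of Zero-Crossings:

Gaussian kernel:

$$G_\sigma(x, y) = \frac{1}{2\pi\sigma^2} \exp\left(-\frac{x^2 + y^2}{2\sigma^2}\right).$$

The Laplacian of the Gaussian is:

$$\Delta G_\sigma = \frac{\partial^2 G_\sigma}{\partial x^2} + \frac{\partial^2 G_\sigma}{\partial y^2} = -\frac{1}{\pi\sigma^4}\left(1 - \frac{x^2 + y^2}{2\sigma^2}\right) \exp\left(-\frac{x^2 + y^2}{2\sigma^2}\right).$$

An edge occurs in I where

$$(\Delta G_\sigma * I) = 0.$$

Why?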

UCLA, Ivo DinovSlide 28

Marr’s Edge Detection

Laplacian of (an isotropic) Gaussian Kernel (at (0,0)) (http://www.dai.ed.ac.uk/HIPR2/flatjavasrc/)

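$$G_\sigma(x, y) = \frac{1}{2\pi\sigma^2} \exp\left(-\frac{x^2 + y^2}{2\sigma^2}\right), \qquad \Delta G_\sigma = \frac{\partial^2 G_\sigma}{\partial x^2} + \frac{\partial^2 G_\sigma}{\partial y^2}.$$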

UCLA, Ivo DinovSlide 29

Laplacian of a Gaussian –Why an Edge at Zero-Crossing?

Uses the second spatial derivatives of an image. The operator response will be zero in areas where the image has a constant intensity (i.e. where the intensity gradient is zero).

Near a change in image intensity the operator response will be positive on the darker side, and negative on the lighter side.

For a sharp edge between two regions of different uniform intensities, the Laplacian of Gaussian response will be:
• zero away from the edge,
• positive to one side of the edge,
• negative to the other side,
• zero on the edge in b/w.

Example: Response of a 1D Laplacian of Gaussian (σ = 3 pixels) to a 1D step edge.
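The 1D example can be reproduced numerically. The sketch below samples the second derivative of a Gaussian with σ = 3 pixels, convolves it with a step edge, and inspects the sign of the response. The kernel half-width (5σ), the zero-sum correction and the edge-value padding are implementation choices assumed here, not taken from the slides.

```python
import numpy as np

def log_kernel_1d(sigma, half_width):
    """Sampled second derivative of a 1D Gaussian, shifted to sum exactly to zero."""
    x = np.arange(-half_width, half_width + 1, dtype=float)
    g = np.exp(-x ** 2 / (2.0 * sigma ** 2)) / (np.sqrt(2.0 * np.pi) * sigma)
    k = (x ** 2 / sigma ** 4 - 1.0 / sigma ** 2) * g
    return k - k.mean()

def log_response(signal, sigma=3.0):
    """Convolve a 1D signal with the LoG kernel, replicating edge values at the borders."""
    half_width = int(5 * sigma)
    kernel = log_kernel_1d(sigma, half_width)
    padded = np.pad(np.asarray(signal, dtype=float), half_width, mode="edge")
    return np.convolve(padded, kernel, mode="valid")

# Step edge: dark (0) for samples 0..49, light (1) for samples 50..99.
step = np.concatenate([np.zeros(50), np.ones(50)])
response = log_response(step)

# The sign change between neighbouring samples marks the zero crossing (the edge).
zero_crossings = np.where(np.sign(response[:-1]) * np.sign(response[1:]) < 0)[0]
```

In this setup the response is positive on the darker side, negative on the lighter side, and changes sign between samples 49 and 50, which is where the step sits.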

UCLA, Ivo DinovSlide 30

Laplacian of a Gaussian –Why an Edge at Zero-Crossing?

A zero crossing detector looks for places where the Laplacian changes sign. Such points often occur around edges in images.

But they also occur at places that are not as easy to associate with edges. Zero crossings always lie on closed contours.

The starting point for the zero crossing detector is an image which has been convolved with a Laplacian of Gaussian filter.

The zero crossings that result are strongly influenced by the size of the (isotropic) Gaussian, change filter-size in applet demo.

As the smoothing is increased then fewer and fewer zero crossing contours will be found, and those that do remain will correspond to features of larger and larger scale in the image.

UCLA, Ivo DinovSlide 31

Laplacian of a Gaussian –Why an Edge at Zero-Crossing?

Demos:

C:\Ivo.dir\UCLA_Classes\Applets.dir\ImageProcessing\LaplacianOfGaussian

LaplacianOfGaussianDemo.html

LaplacianDemo.html


UCLA, Ivo DinovSlide 32

Marr vs. Haar’s edge encoding

Marr’s edge detector uses the second derivative to locate the maxima of the first derivative (which the edge contours pass through).

The Haar basis (1909) encodes the edges into the image representation via the first derivative operator (i.e., moving average/difference):

$$(\dots, x_{2n}, x_{2n+1}, \dots) \ \overset{\text{Haar}}{\longleftrightarrow}\ \left(\dots,\ a_n = \frac{x_{2n} + x_{2n+1}}{2},\ d_n = \frac{x_{2n} - x_{2n+1}}{2},\ \dots\right).$$
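The average/difference encoding is easy to check on a short signal. This is a minimal sketch; the helper names and the sign convention for the difference (first sample minus second) are assumptions for illustration.

```python
import numpy as np

def haar_step(x):
    """Pairwise averages a_n and differences d_n of an even-length signal."""
    x = np.asarray(x, dtype=float)
    a = (x[0::2] + x[1::2]) / 2.0
    d = (x[0::2] - x[1::2]) / 2.0
    return a, d

def haar_unstep(a, d):
    """Recover the signal from its averages and differences."""
    x = np.empty(2 * a.size)
    x[0::2] = a + d
    x[1::2] = a - d
    return x

a, d = haar_step([4.0, 2.0, 5.0, 5.0])
# a = [3, 5]: local means; d = [1, 0]: non-zero only where the signal changes within a pair.
restored = haar_unstep(a, d)
```

The differences vanish wherever the signal is locally constant, which is the edge-respecting behaviour the slide attributes to the Haar basis.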

UCLA, Ivo DinovSlide 33

A good representation should respect edges

Edges are important features of images.

A good image representation (or basis) should be capable of providing the edge information easily.

Edge is a local feature. Local operators, like differentiation, must be incorporated into the representation, as in the coding by the Haar basis.

Wavelets improve Haar encoding, while respecting the above edge representation principle.

UCLA, Ivo DinovSlide 34

What is a good image representation?

Mathematically rigorous (i.e., a clean and stable analysis and synthesis program exists, e.g., FT & IFT…).

Having an independent fast digital formulation framework (e.g., FFT, FWT… ).

Capturing the characteristics of the input signals, and thus many existing processing operators (e.g. image indexing, image searching …) are directly permitted on such representation.

Improving statistical inference power.

UCLA, Ivo DinovSlide 35

Mathematics of wavelet image representation

Let Ω denote the collection of all images. What is the mathematical structure of Ω? Suppose that $f \in \Omega$ is an observed image. Then Ω should be invariant under

Euclidean motion of the scanner/imager:

$$f(x) \to f(Qx + a), \quad Q \in O(2),\ a \in R^2,$$

Brightness/Flashing:

$$f(x) \to \mu f(x), \quad \mu \in R^+,$$

or, more generally, a morphological transform:

$$f(x) \to h(f(x)), \quad h: R \to R,\ h' > 0,$$

Scaling:

$$f(x) \to f(\lambda x), \quad \lambda \in R^+.$$

UCLA, Ivo DinovSlide 36

E.g., Zooming in 2-D

UCLA, Ivo DinovSlide 37

What is zooming?

Zooming (aiming) center: a.

Zooming scale: h.

Zoom into the h-neighborhood at a in a given image I:

$$I_{a,h}(x) = I(a + h \cdot x), \quad x \in \Omega, \text{ the visual field};$$

$$I_{a,h}(y) = I(y) \cdot 1_\Omega\left(\frac{y - a}{h}\right), \text{ the aperture}.$$

Zooming is one of the most fundamental and characteristic operators for image analysis and visual communication. It reflects the multi-scale nature of images and vision.


UCLA, Ivo DinovSlide 38

The zooming/scaling and wavelets

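The base function:

$$\psi(x).$$

Aiming (a) and zooming (h), in-or-out:

$$\psi_{a,h}(x) = \frac{1}{h}\,\psi\left(\frac{x - a}{h}\right).$$

Generating response (to centering/scaling of signals):

$$I_{a,h} = \langle I, \psi_{a,h} \rangle = \int I(x)\, \psi_{a,h}(x)\, dx.$$

The zooming space:

$$(a, h) \in R \times R^+.$$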

UCLA, Ivo DinovSlide 39

Good Signal representation should be adaptable

A good representation should react strongly to abrupt changes, and weakly to stable intensities

That means, for a stable image, e.g., constant, I = c, the responses $I_{a,h}$ should be approximately zero:

$$I_{a,h} = \langle I, \psi_{a,h} \rangle \approx 0.$$

This is the “differentiating” property of the encoding, e.g., d/dx:

$$\int_R \psi(x)\, dx = 0.$$

UCLA, Ivo DinovSlide 40

The continuous wavelet representation

Definition:

A differentiating zooming function ψ(x) is said to be a (continuous) wavelet. Representing a given image I(x) by all the scale/shift responses

$$I_{a,h} = \langle I, \psi_{a,h} \rangle$$

is the corresponding wavelet representation.

Questions:
Is there a best wavelet ψ(x)?
Does such a wavelet representation allow perfect image reconstruction?

UCLA, Ivo DinovSlide 41

Synthesizing a wavelet representation

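Goal: to perfectly recover an image signal I from its wavelet representation I(a, h).

(Continuous) wavelet synthesis (where ^ denotes the FT):

$$I(a, h) = \langle I, \psi_{a,h} \rangle = \langle \hat{I}, \hat{\psi}_{a,h} \rangle = \langle \hat{I}(\xi),\ \hat{\psi}(h\xi)\, e^{-ia\xi} \rangle,$$

which is in the form of an IFT. Let

$$J(h, \xi) = \hat{I}(\xi)\, \overline{\hat{\psi}(h\xi)};$$

then J can be perfectly recovered via the a-FT of I(a, h). Then $\hat{I}$ can be perfectly recovered from J via

$$\hat{I}(\xi) = \int_{[0,\infty)} J(h, \xi)\, \hat{\psi}(h\xi)\, \frac{dh}{h}, \quad \text{as} \quad \int_{[0,\infty)} |\hat{\psi}(h\xi)|^2\, \frac{dh}{h} = 1.$$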

UCLA, Ivo DinovSlide 42

The admissibility condition & differentiation

The admissibility condition of a continuous wavelet:

$$\int_0^\infty \frac{|\hat{\psi}(h)|^2}{h}\, dh < \infty.$$

A differentiating zooming operator satisfies the admissibility condition since:

$$\hat{\psi}(0) = \int_R \psi(x)\, dx = 0, \quad \text{and} \quad \hat{\psi}(h) = ch + o(h).$$

Examples:

The (1D Laplacian of Gaussian) Marr wavelet (Mexican-hat): second derivative of Gaussian.

The Shannon wavelet:

$$\psi(x) = 2\,\mathrm{sinc}(2x) - \mathrm{sinc}(x), \quad \mathrm{sinc}(t) = \frac{\sin(\pi t)}{\pi t}.$$

UCLA, Ivo DinovSlide 43

The discrete set of zooming scales

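Make a log-linear discretization of the scale parameter h:

$$h \to j = -\log_2 h, \quad j = 0, \pm 1, \pm 2, \dots$$

Make a scale-adaptive discretization of the zooming centers:

$$\text{at scale } h_j = 2^{-j}: \quad a \to k, \quad a_k = k h_j = k / 2^j, \quad k = 0, \pm 1, \pm 2, \dots$$

The discrete set of zooming operators:

$$\psi_{j,k}(x) = h_j^{-1/2}\, \psi\left(\frac{x - k h_j}{h_j}\right) = 2^{j/2}\, \psi(2^j x - k).$$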


UCLA, Ivo DinovSlide 44

The discrete wavelet representation

The wavelet coefficients:

$$d_{j,k} = \langle I, \psi_{j,k} \rangle = 2^{j/2} \int_R I(x)\, \psi(2^j x - k)\, dx,$$

$$d_{j,k} = 2^{-j/2}\, I_{k 2^{-j},\, 2^{-j}}, \quad \text{in terms of the continuous WT}.$$

Questions:
Does the set of all wavelet coefficients still encode the complete information of each input image I?
Is the set of wavelets $\{\psi_{j,k}(x) : j, k \in Z\}$ a basis? Or, more precisely, what space has these basis functions?

We don't know. But let's check out some examples...

UCLA, Ivo DinovSlide 45

Example 1: Haar wavelet

The Haar “aperture” function is

$$\psi^{(Haar)}(x) = I_{0 \le x < 1/2}(x) - I_{1/2 \le x < 1}(x).$$

[Figure: the Haar wavelet, equal to +1 on [0, 1/2) and −1 on [1/2, 1)]

Haar’s theorem (1909): All Haar wavelets

$$\{\psi^{(Haar)}_{j,k} \mid j \in Z;\ k \in Z\},$$

together with the constant function $I_{[0,1]}(x)$, form an orthonormal basis for the Hilbert space of all square integrable functions on [0, 1].

UCLA, Ivo DinovSlide 46

Haar wavelets

Haar’s mother wavelet:

$$\psi^{(Haar)}(x) = I_{0 \le x < 1/2}(x) - I_{1/2 \le x < 1}(x).$$

[Figure: the Haar mother wavelet, taking the values +1 and −1]

Why orthonormal basis?

$$\langle \psi_{j,k}, \psi_{l,m} \rangle = \begin{cases} 1, & j = l \ \&\ k = m, \\ 0, & \text{otherwise}. \end{cases}$$

Completeness is due to the fact that all dyadically piecewise constant functions are dense in $L^2(0,1)$.

UCLA, Ivo DinovSlide 47

Haar wavelets

Three Haar wavelets and the mean (constant) encode all the information of the piecewise constant approximation (or, the analog-to-digital transition).

I(x)

1 constant + 3 wavelets are enough.

UCLA, Ivo DinovSlide 48

Example 2: The Shannon wavelets

Shannon’s “aperture” function is:

$$\psi^{(Shannon)}(x) = 2\,\mathrm{sinc}(2x) - \mathrm{sinc}(x), \quad \mathrm{sinc}(x) = \frac{\sin(\pi x)}{\pi x}.$$

Theorem:

$$\{\psi^{(Shannon)}_{j,k}(x) : j, k \in Z\} \text{ is an orthonormal basis of } L^2(R).$$

UCLA, Ivo DinovSlide 49

Shannon wavelets

An orthonormal basis is visualized better in Fourier space!

$$\psi^{(Shannon)}(x) = 2\,\mathrm{sinc}(2x) - \mathrm{sinc}(x).$$

[Figure: Fourier transform of sinc(x), supported on (−π, π), and Fourier transform of 2 sinc(2x), supported on (−2π, 2π)]

Shannon proved that:
• All (−π, π) band-limited signals can be represented by sinc(x − n)…
• (−2π, −π) ∪ (π, 2π) band-limited signals by $\psi_{0,n} = \psi(x - n)$
• (−4π, −2π) ∪ (2π, 4π) band-limited signals by $\psi_{1,n} = \sqrt{2}\,\psi(2x - n)$


UCLA, Ivo DinovSlide 50

Shannon wavelets

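Shannon proved that:
• All (−π, π) band-limited signals can be represented by sinc(x − n)
• (−2π, −π) ∪ (π, 2π) band-limited signals by $\psi_{0,n} = \psi(x - n)$
• (−4π, −2π) ∪ (2π, 4π) band-limited signals by $\psi_{1,n} = \sqrt{2}\,\psi(2x - n)$
• …

[Figure: partition of the frequency axis ξ at π/2, π, 2π and 4π; the band (π/2, π) is represented by the $\psi_{-1,n} = \frac{1}{\sqrt{2}}\psi(x/2 - n)$'s, the band (π, 2π) by the $\psi_{0,n} = \psi(x - n)$'s, and the band (2π, 4π) by the $\psi_{1,n} = \sqrt{2}\,\psi(2x - n)$'s]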

UCLA, Ivo DinovSlide 51

Partition of the Time-Frequency space

Heisenberg’s uncertainty principle postulates that for each particle’s position and momentum we have:

$$\Delta t \cdot \Delta x \ge 2\pi.$$

So, to get an optimal spatial-localization, we must influence the particle’s scale or frequency content.

[Figure: tiling of the space/time vs. frequency plane by sinc(x-n), sinc(2x-m) and sinc(4x-k)]

UCLA, Ivo DinovSlide 52

Multiresolution analysis (MRA)

Mallat and Meyer (1986):

An (orthogonal) multiresolution of $L^2(R)$ is a chain of closed subspaces indexed by all integers:

$$\dots \subset V_{-2} \subset V_{-1} \subset V_0 \subset V_1 \subset V_2 \subset \dots$$

subject to the following three conditions:

(completeness)

$$\lim_{n \to \infty} V_n = L^2(R), \quad \lim_{n \to -\infty} V_n = \{0\}.$$

(scale similarity)

$$f(x) \in V_n \iff f(2x) \in V_{n+1}.$$

(translation seed) $V_0$ has an orthonormal basis consisting of all integral translates of a single function $\phi(x)$:

$$\{\phi(x - n) : n \in Z\}.$$

UCLA, Ivo Dinov Slide 53

Equations for designing MRA

The refinement/dilation equation for the father-wavelet function:

$$\phi(x) = 2 \sum_n h_n\, \phi(2x - n), \quad \text{for a suitable set of } h_n\text{'s}.$$

This father-wavelet function is often called: scaling function, shape function.

$$\dots \subset V_{-2} \subset V_{-1} \subset V_0 \subset V_1 \subset V_2 \subset \dots$$

Let $W_0$ denote the orthogonal complement of $V_0$ in $V_1$. Then $W_0$ is also orthogonally spanned by the integer translates of a single function ψ(x), the (mother) wavelet!

$$\psi(x) = 2 \sum_n g_n\, \phi(2x - n), \quad \text{for some suitable set of } g_n\text{'s}.$$

UCLA, Ivo DinovSlide 54

Wavelets representation

Theorem:

$$\{\psi_{j,k} = 2^{j/2}\, \psi(2^j x - k) : j, k \in Z\} \text{ is an orthonormal basis for } L^2.$$

Wavelets representation of a signal:

$$I \approx I_j \in V_j, \qquad I_j = I_{j-1} + d_{j-1}, \quad I_{j-1} \in V_{j-1},\ d_{j-1} \in W_{j-1}, \quad \dots$$

$$I_j = d_{j-1} + d_{j-2} + \dots + d_0 + I_0, \quad d_0 \in W_0,\ I_0 \in V_0.$$

Coarse scale + wavelet detail coefficients

UCLA, Ivo DinovSlide 55

An example of wavelet decomposition

One level wavelet decomposition of a 1-D signal


UCLA, Ivo DinovSlide 56

2-Channel filter bank: Analysis bank
Forward Discrete Wavelet Transform

• H’ is the low-pass filter;
• G’ is the high-pass filter;
• ↓2 is the down-sampling operator: (1 3 4 6 5 8 7) → (1 4 5 7).

[Figure: the input x passes through H’ and ↓2 on the low-pass channel to give the scales s, and through G’ and ↓2 on the high-pass channel to give the details d]

UCLA, Ivo DinovSlide 57

2-channel filter bank: Synthesis bank
Inverse Discrete Wavelet Transform

• H is the low-pass filter;
• G is the high-pass filter;
• ↑2 is the up-sampling operator: (1 4 5 7) → (1 0 4 0 5 0 7).

[Figure: s passes through ↑2 and H on the low-pass channel, d passes through ↑2 and G on the high-pass channel, and the two outputs are summed to give y]

UCLA, Ivo DinovSlide 58

A biorthogonal filter bank

[Figure: analysis bank (x → H’, ↓2 → s; x → G’, ↓2 → d) followed by synthesis bank (s → ↑2, H; d → ↑2, G; summed → y)]

Biorthogonal (or perfect) filter bank: if y = x for all inputs x.
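The perfect-reconstruction property y = x can be checked numerically. The sketch below uses the orthonormal Haar filters as a stand-in for H, G, H’ and G’ (an assumption; the slides do not fix a particular filter pair here). The one-sample offset before down-sampling is an implementation choice that compensates for the delay of the causal two-tap filters.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)

# Synthesis filters (assumed: orthonormal Haar pair).
H = np.array([1.0, 1.0]) / SQRT2    # low-pass
G = np.array([1.0, -1.0]) / SQRT2   # high-pass
# Analysis filters: time reversals of the synthesis filters.
H_prime = H[::-1]
G_prime = G[::-1]

def downsample(v):
    """Keep every other sample: (1 3 4 6 5 8 7) -> (1 4 5 7)."""
    return np.asarray(v)[::2]

def upsample(v):
    """Insert a zero after every sample: (1 4 5 7) -> (1 0 4 0 5 0 7 0)."""
    out = np.zeros(2 * len(v))
    out[::2] = v
    return out

def analysis(x):
    """Filter with H' and G', drop the one-sample filter delay, then down-sample."""
    s = downsample(np.convolve(x, H_prime)[1:])
    d = downsample(np.convolve(x, G_prime)[1:])
    return s, d

def synthesis(s, d):
    """Up-sample both channels, filter with H and G, and add."""
    y = np.convolve(upsample(s), H) + np.convolve(upsample(d), G)
    return y[: 2 * len(s)]

x = np.array([1.0, 3.0, 4.0, 6.0, 5.0, 8.0, 7.0, 2.0])
s, d = analysis(x)
y = synthesis(s, d)   # equals x up to rounding
```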

UCLA, Ivo DinovSlide 59

An orthogonal filter bank

[Figure: analysis bank (x → H’, ↓2 → s; x → G’, ↓2 → d) followed by synthesis bank (s → ↑2, H; d → ↑2, G; summed → y)]

Definition: An orthogonal filter bank is a biorthogonal filter bank in which both analysis filters H’ and G’ are the time reversals of the synthesis filters H & G: e.g., H = (1, 2, 3) → H’ = (3, 2, 1).

UCLA, Ivo DinovSlide 60

The fundamental theorem of MRA

An orthogonal Mallat-Meyer MRA corresponds to an orthogonal filter bank with the synthesis filters:

$$H = (h_n : n \in Z), \quad G = (g_n : n \in Z),$$

where the $h_n$'s and $g_n$'s are the 2-scale connection coefficients in the dilation and wavelet equations:

$$\phi(x) = 2 \sum_n h_n\, \phi(2x - n), \quad \psi(x) = 2 \sum_n g_n\, \phi(2x - n).$$

And the multiresolution wavelet decomposition of f corresponds to the iteration of the analysis bank with the φ-coefficients of f as the input digital data.

UCLA, Ivo DinovSlide 61

The fundamental theorem (cont’d)

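$$I_j \in V_j, \qquad I_j = I_{j-1} + d_{j-1}, \quad I_{j-1} \in V_{j-1},\ d_{j-1} \in W_{j-1}, \quad \dots$$

$$I_j = d_{j-1} + d_{j-2} + \dots + d_0 + I_0, \quad d_0 \in W_0,\ I_0 \in V_0.$$

Suppose j = 2, and

$$I_2 = \sum_k c_2(k)\, \phi_{2,k}(x).$$

[Figure: iterated analysis bank; c2 passes through H’, ↓2 to give c1 and through G’, ↓2 to give d1; c1 then passes through H’, ↓2 to give c0 and through G’, ↓2 to give d0]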


UCLA, Ivo Dinov Slide 62

The Discrete Wavelet Transform – DWT

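Alternating quadrature mirror – Smooth (S) and Detail (D) filters for the Daub4 filter bank:

$$C = \begin{pmatrix}
c_0 & c_1 & c_2 & c_3 & & & & \\
c_3 & -c_2 & c_1 & -c_0 & & & & \\
 & & c_0 & c_1 & c_2 & c_3 & & \\
 & & c_3 & -c_2 & c_1 & -c_0 & & \\
 & & & & \ddots & & & \\
c_2 & c_3 & & & & & c_0 & c_1 \\
c_1 & -c_0 & & & & & c_3 & -c_2
\end{pmatrix}_{N \times N}$$

Rows alternate between the smooth filter S and the detail filter D.

Half the results of the S and D filters are discarded (decimation): N → N/2 Smooth/Low-Pass components and N/2 Detail/High-Pass components.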
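A small numerical check of this matrix, as a sketch: the code builds the N×N Daub4 matrix with wrap-around rows and verifies that it is orthogonal. The closed-form values used for c0..c3 are the standard Daubechies-4 coefficients; the function name `daub4_matrix` is hypothetical.

```python
import numpy as np

# Standard Daubechies-4 filter coefficients.
SQRT3 = np.sqrt(3.0)
DENOM = 4.0 * np.sqrt(2.0)
c0, c1, c2, c3 = (1 + SQRT3) / DENOM, (3 + SQRT3) / DENOM, (3 - SQRT3) / DENOM, (1 - SQRT3) / DENOM

def daub4_matrix(n):
    """N x N matrix with alternating smooth (S) and detail (D) rows, wrapped periodically."""
    if n < 4 or n % 2:
        raise ValueError("n must be even and at least 4")
    C = np.zeros((n, n))
    smooth = [c0, c1, c2, c3]
    detail = [c3, -c2, c1, -c0]
    for r in range(n // 2):
        cols = [(2 * r + k) % n for k in range(4)]  # last rows wrap around to column 0
        C[2 * r, cols] = smooth
        C[2 * r + 1, cols] = detail
    return C

C = daub4_matrix(8)
y = np.ones(8)        # constant input
w = C @ y             # interleaved output: s1 d1 s2 d2 ...
smooth_part, detail_part = w[0::2], w[1::2]
```

For a constant input the detail outputs vanish, and because C is orthogonal the transform is inverted by multiplying with the transpose of C.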

UCLA, Ivo DinovSlide 63

DWT – Pyramidal Algorithm (forward)

Alternating quadrature mirror – Smooth (S) and Detail (D) filters for the Daub4 filter bank.

Forward DWT of 8 data points:

Data (y): y1 y2 y3 y4 y5 y6 y7 y8
Apply matrix C: s1 d1 s2 d2 s3 d3 s4 d4
Permute elements: s1 s2 s3 s4 d1 d2 d3 d4
Apply matrix C (to s1..s4): ss1 dd1 ss2 dd2 d1 d2 d3 d4
Permute elements: ss1 ss2 dd1 dd2 d1 d2 d3 d4

ss1, ss2: smooth components, mother function coefficients.
dd1, dd2, d1..d4: details, wavelet coefficients.

UCLA, Ivo DinovSlide 64

DWT – Pyramidal Algorithm (inverse)

Alternating quadrature mirror – Smooth (S) and Detail (D) filters for the Daub4 filter bank.

Inverse DWT (C is an orthogonal matrix, so each forward step is undone by applying C^T):

Coefficients (w): ss1 ss2 dd1 dd2 d1 d2 d3 d4
Permute elements: ss1 dd1 ss2 dd2 d1 d2 d3 d4
Apply C^T: s1 s2 s3 s4 d1 d2 d3 d4
Permute elements: s1 d1 s2 d2 s3 d3 s4 d4
Apply C^T: y1 y2 y3 y4 y5 y6 y7 y8 (data)

UCLA, Ivo DinovSlide 65

Major Applications of Wavelets

• FBI fingerprints.
• JPEG2000 image compression file format.
• Image indexing and image search engines (for databanks).
• Signal modeling.
• Image denoising and restorations (optimal representation).
• Texture analysis.
• Brain Mapping and Neuroimaging.
• Algorithm speeding up based on multi-resolution representation (e.g., solving large sparse-matrix linear equ’s).
• Time series analysis.

UCLA, Ivo DinovSlide 67

Completeness and Efficiency of Signal Representation – Wavelet functions 2D

A 2D Daubechies wavelet illustrating the compact support, fast decay and oscillatory properties of wavelets.

UCLA, Ivo DinovSlide 68

Completeness and Efficiency of Signal Representation – Wavelet functions 3D

Signal & its 3D WT

UCLA, Ivo Dinov Slide 69

Completeness and Efficiency of Signal Representation – Wavelet functions 3D

Signal & its 3D WT

UCLA, Ivo Dinov Slide 70

Completeness and Efficiency of Signal Representation – Wavelet functions 3D

Signal & its 3D WT

UCLA, Ivo Dinov Slide 71

Completeness and Efficiency of Signal Representation – Wavelet functions 3D

Signal & its 3D WT

UCLA, Ivo DinovSlide 72

Wavelets

The three curves on this graph represent the original signal (HeavySine function, thin red curve), its wavelet transform (thin blue curve) and the function estimate reconstructed using only the largest 2% of the wavelet parameters. Note the space-frequency decorrelation of the original data in the wavelet space (blue curve).
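The idea of reconstructing from the largest 2% of the coefficients can be sketched as follows. Several things here are assumptions: the transform is a plain orthonormal Haar pyramid (the wavelet used for the slide's figure is not stated), the test signal uses the commonly quoted Donoho–Johnstone formula for the HeaviSine function, and the helper names are hypothetical.

```python
import numpy as np

def haar_forward(x):
    """Full orthonormal Haar pyramid; the length of x must be a power of two."""
    w = np.asarray(x, dtype=float).copy()
    n = w.size
    while n > 1:
        s = (w[0:n:2] + w[1:n:2]) / np.sqrt(2.0)   # smooth components
        d = (w[0:n:2] - w[1:n:2]) / np.sqrt(2.0)   # detail components
        w[: n // 2], w[n // 2 : n] = s, d          # permute: smooth first, details after
        n //= 2
    return w

def haar_inverse(w):
    """Invert haar_forward level by level, coarsest level first."""
    x = np.asarray(w, dtype=float).copy()
    n = 1
    while n < x.size:
        s, d = x[:n].copy(), x[n : 2 * n].copy()
        x[0 : 2 * n : 2] = (s + d) / np.sqrt(2.0)
        x[1 : 2 * n : 2] = (s - d) / np.sqrt(2.0)
        n *= 2
    return x

def keep_largest(w, fraction):
    """Zero all but (roughly) the given fraction of largest-magnitude coefficients."""
    k = max(1, int(round(fraction * w.size)))
    threshold = np.sort(np.abs(w))[-k]
    return np.where(np.abs(w) >= threshold, w, 0.0)

t = np.arange(1024) / 1024.0
signal = 4.0 * np.sin(4.0 * np.pi * t) - np.sign(t - 0.3) - np.sign(0.72 - t)

w = haar_forward(signal)
approx = haar_inverse(keep_largest(w, 0.02))
relative_error = np.linalg.norm(signal - approx) / np.linalg.norm(signal)
```

Because the transform is orthonormal, the squared reconstruction error equals the energy of the discarded coefficients, so keeping the largest ones is the best choice for a fixed number of retained terms in this basis.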

UCLA, Ivo DinovSlide 73

Wavelets

If Time Permits:C:\Ivo.dir\Research\movies\WaveletMovie.mpg

Also available online at the Wavelet Analysis of Image Registration site: http://www.loni.ucla.edu/~dinov/WAIR.html

UCLA, Ivo DinovSlide 74

Atlas–Based Wavelet–Space Stat Analysis

Design protocol for the wavelet-based construction and utilization of the fMRI functional atlas (F&A SVPA). The construction of the anatomical probabilistic atlas is augmented by including the regional distributions of the wavelet coefficients of the functional signals for one population. Functional variability between a new fMRI volume and the F&A SVPA atlas is assessed statistically following wavelet-space shrinkage.

UCLA, Ivo DinovSlide 75

Distribution of the wavelet coefficients

Displayed is the frequency histogram of the wavelet coefficients at 1,000 randomly selected locations (random indices of w_{j,k}), averaged across all 578 MRI volumes that are part of the ICBM database [Mazziotta et al., 1995]. Notably, most of the wavelet coefficients are near the origin, with some having sporadic, but large, magnitudes. Heavy-tailed distribution models will be appropriate for these data.

[Figure: histogram of wavelet coefficient magnitudes, horizontal axis from -5 to 5]

UCLA, Ivo DinovSlide 76

Distr’n of the wavelet coefficients – 3 volumes

Shown here are the frequency distributions of the wavelet coefficients for three separate MRI volumes (randomly selected from the pool of 578). There is little variation between these three individuals. However, the overall shape of the distribution of the magnitudes of the wavelet coefficients of each individual across the 1,000 random locations is more regular, though still heavy-tailed, than the averaged distribution across subjects depicted in the previous figure.

UCLA, Ivo Dinov Slide 77

Wavelet Coefficient distributions MRI data

This image illustrates a portion of all of the individual distributions of the wavelet parameters for the 578 ICBM MRI volumes. The horizontal axis represents the magnitude of the wavelet coefficients in the range [-4 ; 4], the vertical axis labels the subject index, and the row color map indicates the frequencies of occurrence of a wavelet coefficient of a certain magnitude across all 1,000 voxels. Bright colors indicate high, and dark colors represent low, frequencies.

[Figure labels: Voxel Location; Distribution Across 578 Subjects]

UCLA, Ivo Dinov Slide 78

fMRI data – 1 subject wavelet coefficient distributions

This diagram depicts the frequency histogram of the average magnitude of a wavelet coefficient across all 128 fMRI 3D time-volumes (part of the fMRI study of Buckner and colleagues [Buckner et al., 2000]) at 1,000 randomly selected locations (random indices of w_{j,k}). Again, we observe the heavy-tailedness of the data.

UCLA, Ivo Dinov Slide 79

A 2D image displaying all of the individual distributions for each of the 128 3D time points (1 run, 15 trials) of an fMRI time series. The horizontal axis is the magnitude of the wavelet coefficient and the vertical axis labels the time point of the fMRI epoch.

Colors indicate the frequencies of occurrence of a wavelet coefficient of a certain magnitude across all 1,000 voxel locations. The average across rows is effectively what the previous figure shows.

Wavelet coefficient Distribution – 1 run

[Figure axes: horizontal, magnitude of wavelet coefficients (-4 to 4); vertical, time (1 to 128); color, frequency]

UCLA, Ivo Dinov Slide 80

Distribution of 4D wavelet coefficients

This graph shows the frequency distribution of the wavelet coefficients of the complete 4D WT of the fMRI time series. Note the slow asymptotic decay of the tails of this distribution (data size: 64 × 64 × 16 × 128 in x, y, z, t; floating point). Here the x-axis represents the magnitude of the wavelet coefficients and the vertical axis indicates the frequency of occurrence of a wavelet coefficient of the given magnitude within the 4D dataset. Sporadic behavior of the large-magnitude wavelet coefficients is clearly visible at the tails of this empirical distribution.

[Figure: frequency histogram; horizontal axis from -10 to 10]

UCLA, Ivo Dinov Slide 81

Distribution of (and Models for) the Wavelet Coefficients of the 4D fMRI volume

Several heavy-tailed distribution models are fitted to the frequency histogram of the 4D wavelet coefficients. These include the double-exponential, Cauchy, Student's t, double-Pareto and Bessel K form models. Because our statistical tests will be applied to the coefficients that survive wavelet shrinkage, it is important to use a distribution model that provides an accurate asymptotic approximation to the real data in the tail regions.

UCLA, Ivo Dinov Slide 82

Right Tail of Distribution of (and Models for) the Wavelet Coefficients of the 4D fMRI volume

The figure shows the extreme left tail of the data distribution (the data are symmetric, so the right tail mirrors it). The double-exponential and the T distribution models underestimate the tails of the data asymptotically, but provide good fits around the mean. The Cauchy, Bessel K form and double-Pareto distribution models, in that order, provide increasingly heavier tails, with Cauchy being the most likely candidate for the best fit to the observed wavelet coefficients across the entire range. The double-Pareto and the Bessel K form densities provide the heaviest tails; however, they are inadequate in the central range [-3 : 3] and undefined near the mean of zero.
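To make the difference in tail behavior concrete, the sketch below compares the two-sided tail probability P(|X| > x) of a double-exponential (Laplace) and a Cauchy model, with a normal model as a light-tailed reference. It is illustrative only: all scale parameters are set to 1 as an assumption (the fitted parameters are not given on the slide), and the T, double-Pareto and Bessel K form models are omitted.

```python
import math

def laplace_tail(x, scale=1.0):
    """P(|X| > x) for a zero-mean double-exponential (Laplace) variable."""
    return math.exp(-x / scale)

def cauchy_tail(x, scale=1.0):
    """P(|X| > x) for a Cauchy variable centered at zero."""
    return 1.0 - (2.0 / math.pi) * math.atan(x / scale)

def normal_tail(x, sigma=1.0):
    """P(|X| > x) for a zero-mean normal variable (light-tailed reference)."""
    return math.erfc(x / (sigma * math.sqrt(2.0)))

# Unit scales are an assumption made only for this comparison.
for x in (1.0, 3.0, 5.0, 10.0):
    print(x, normal_tail(x), laplace_tail(x), cauchy_tail(x))
```

The Laplace tail decays exponentially, whereas the Cauchy tail decays only polynomially (roughly like 1/x), so the Cauchy model assigns far more probability to the large-magnitude coefficients that survive shrinkage.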

UCLA, Ivo Dinov Slide 83

Wavelet-domain Atlas-based fMRI Analysis

• Anatomical Sub-Volume Probabilistic Atlas (SVPA)

• Analysis using conventional time-domain SVT (sub-volume thresholding). Subtraction paradigm: AD - Elderly NC

[Figure: statistical map; color scale from 0 to 20]

UCLA, Ivo Dinov Slide 84

Wavelet-domain Atlas-based fMRI Analysis

• Raw results following analysis in the wavelet-domain (has ringing effects outside of the true ROI where the activation supposedly occurred).

• Post-processed wavelet domain statistics [final uniform thresholding (top 1%), reconstruction of all wavelet-space stats into a single volume – regional calculations]:

UCLA, Ivo Dinov Slide 85

Bayesian Mixture Models for fMRI Analysis

• We use two-component mixture prior distributions on the wavelet coefficients θj,k:

$$\theta_{j,k} \mid \pi_j, \tau_j \sim \pi_j\, N(0, \tau_j^2) + (1 - \pi_j)\, \delta(0),$$

where πj is a proportion between 0 and 1, δ(0) is the Dirac point mass at zero, and τj > 0. In other words, there is a level-dependent positive probability 1 − πj a priori that each wavelet coefficient will be exactly zero. Otherwise, the coefficient is normally distributed with mean zero and a level-specific standard deviation τj.
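A short sampling sketch may help in reading this prior. It follows the mixture formula, in which the normal component N(0, τj²) carries weight πj and the point mass at zero carries weight 1 − πj. The function name `sample_prior` and the parameter values (πj = 0.2, τj = 3) are hypothetical and chosen only for illustration.

```python
import random

def sample_prior(pi_j, tau_j, size, rng):
    """Draw wavelet coefficients from pi_j * N(0, tau_j^2) + (1 - pi_j) * delta(0)."""
    return [rng.gauss(0.0, tau_j) if rng.random() < pi_j else 0.0
            for _ in range(size)]

rng = random.Random(0)                       # fixed seed for repeatability
theta = sample_prior(0.2, 3.0, 10000, rng)   # illustrative values, not from the slides

# Fraction of coefficients that are exactly zero; it should be close to 1 - pi_j.
frac_zero = sum(1 for t in theta if t == 0.0) / len(theta)
print("fraction exactly zero:", frac_zero)
```

A small πj therefore encodes the expectation that the wavelet representation at level j is sparse.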

UCLA, Ivo Dinov Slide 86

Bayesian Mixture Models for fMRI Analysis

• Given the observed data over an ROI, y = f + ε, the corresponding wavelet representation is w = θ + z, where W is the discrete WT, w = Wy, θ = Wf and z = Wε. With the above prior distribution for the true wavelet coefficients, the posterior distributions of θj,k are again independent two-component mixtures:

$$\theta_{j,k} \mid w_{j,k} \sim \lambda_{j,k}\, N\!\left(\frac{\tau_j^2}{\sigma^2+\tau_j^2}\, w_{j,k},\ \frac{\sigma^2 \tau_j^2}{\sigma^2+\tau_j^2}\right) + (1-\lambda_{j,k})\,\delta(0),$$

where λj,k = 1/(1 + ρj,k) and the posterior odds that θj,k is exactly zero are:

$$\rho_{j,k} = \frac{1-\pi_j}{\pi_j}\, \frac{\sqrt{\tau_j^2+\sigma^2}}{\sigma}\, \exp\!\left(-\frac{\tau_j^2\, w_{j,k}^2}{2\sigma^2(\tau_j^2+\sigma^2)}\right).$$
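The posterior quantities can be evaluated directly for a single coefficient, as in the sketch below. It rests on several assumptions: σ is taken to be the standard deviation of the noise coefficients z (the slide does not define it), the posterior odds are taken in the form ρ = ((1 − π)/π) · (√(τ² + σ²)/σ) · exp(−τ²w² / (2σ²(τ² + σ²))), and the posterior mean is used as the shrinkage rule, which is only one possible choice that the slides do not specify. The function names are hypothetical.

```python
import math

def posterior_odds_zero(w, pi_j, tau_j, sigma):
    """Posterior odds rho that the true coefficient is exactly zero (0 < pi_j < 1)."""
    s2, t2 = sigma ** 2, tau_j ** 2
    return ((1.0 - pi_j) / pi_j) * (math.sqrt(t2 + s2) / sigma) \
        * math.exp(-t2 * w * w / (2.0 * s2 * (t2 + s2)))

def posterior_mean(w, pi_j, tau_j, sigma):
    """Posterior mean of theta given w: lambda * tau^2 / (sigma^2 + tau^2) * w."""
    s2, t2 = sigma ** 2, tau_j ** 2
    lam = 1.0 / (1.0 + posterior_odds_zero(w, pi_j, tau_j, sigma))
    return lam * t2 / (s2 + t2) * w

# Small observed coefficients are pulled strongly toward zero; large ones
# are only mildly attenuated. Parameter values are illustrative.
for w in (0.5, 2.0, 10.0):
    print(w, posterior_mean(w, pi_j=0.5, tau_j=3.0, sigma=1.0))
```

The odds ρ shrink rapidly as |w| grows, so λ approaches 1 and the rule behaves like a linear attenuation by τ²/(σ² + τ²) for large coefficients, while small coefficients are shrunk almost to zero.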

UCLA, Ivo Dinov Slide 87

Completeness and Efficiency of Signal Representation

UCLA, Ivo Dinov Slide 88

Completeness and Efficiency of Signal Representation - Functional Bases

UCLA, Ivo Dinov Slide 89

Completeness and Efficiency of Signal Representation - Functional Bases

UCLA, Ivo Dinov Slide 90

Completeness and Efficiency of Signal Representation - Functional Bases

UCLA, Ivo Dinov Slide 93

Completeness and Efficiency of Signal Representation – Why use Wavelet functions?

[Figure panels: HeavySine; IWT (Uniform 5%)]

UCLA, Ivo Dinov Slide 94

Wavelet-space Shrinkage

UCLA, Ivo Dinov Slide 95

Wavelet-space Shrinkage

UCLA, Ivo Dinov Slide 96

Wavelet-space Shrinkage

UCLA, Ivo Dinov Slide 97

Wavelet-space Shrinkage

UCLA, Ivo Dinov Slide 98

Wavelet-space Shrinkage

UCLA, Ivo Dinov Slide 99

Stereotactic data alignment - warping

[Figure panels: Data, Target, Warp 1, Warp 2, Warp 3]

UCLA, Ivo Dinov Slide 100

Wavelet-space Warp Classification

Five PET (left) and corresponding MRI (right) data volumes (left columns), their LS, MI and Affine warps (columns 2-4) and the target (right column).

Which one is the best warp?

UCLA, Ivo Dinov Slide 101

Wavelet-space Warp Classification

Spread Group Classification (SGC) Analysis of PET data: separation of saccade PET frequency clouds

UCLA, Ivo Dinov Slide 102

Wavelet-space Warp Classification

Cluster Group Classification (CGC) Analysis of MRI data: how close together are all 5 MRI volumes under each warp strategy?

[Figure annotation: smaller classifying values, better warps. Rank labels as shown: 1, 4, 2, 3]