multivariate models of inter-subject anatomical variability · introduction geometric variability...
TRANSCRIPT
Introduction, Geometric Variability, Similarity Measures, Real data
Multivariate models of inter-subject anatomical variability
John Ashburner
Wellcome Trust Centre for Neuroimaging, UCL Institute of Neurology,
12 Queen Square, London WC1N 3BG,
UK.
John Ashburner Anatomical Features
Introduction: Prediction, Binary Classification, Curse of dimensionality
“The only relevant test of the validity of a hypothesis is comparison of prediction with experience.”
Milton Friedman
Choosing models/hypotheses/theories
MacKay, DJC. “Bayesian interpolation.” Neural Computation 4, no. 3 (1992): 415-447.
Evidence-based Science
...also just known as “science”.
Researchers claim to find differences between groups. Do those findings actually discriminate?
How can we most accurately diagnose a disorder from image data?
Pharma wants biomarkers. How do we most effectively identify them?
There are lots of potential imaging biomarkers. Which are most (cost) effective?
Pattern recognition provides a framework to compare data (or preprocessing strategy) to determine the most accurate approach.
Biological variability is multivariate
A generative classification approach
p(x,y=0) = p(x|y=0) p(y=0)
p(x,y=1) = p(x|y=1) p(y=1)
p(x) = p(x,y=0) + p(x,y=1)
p(y=0|x) = p(x,y=0)/p(x)
[Figure: four panels, one for each quantity above, plotted over Feature 1 (horizontal) and Feature 2 (vertical).]
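The four quantities on this slide can be sketched in a few lines. This is only an illustration: the slide does not say which class-conditional densities were used, so the Gaussian form, the means, the covariances and the equal priors below are all assumptions.

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Multivariate Gaussian density evaluated at each row of x."""
    d = x.shape[1]
    diff = x - mean
    # Mahalanobis distance of each row from the mean.
    maha = np.sum(diff * np.linalg.solve(cov, diff.T).T, axis=1)
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(-0.5 * maha) / norm

def posterior_class0(x, params0, params1, prior0=0.5):
    """p(y=0|x) = p(x,y=0) / p(x), with p(x) = p(x,y=0) + p(x,y=1)."""
    joint0 = gaussian_pdf(x, *params0) * prior0          # p(x|y=0) p(y=0)
    joint1 = gaussian_pdf(x, *params1) * (1.0 - prior0)  # p(x|y=1) p(y=1)
    return joint0 / (joint0 + joint1)

# Hypothetical class parameters (mean, covariance), chosen for illustration only.
params0 = (np.array([1.0, -5.0]), np.eye(2))
params1 = (np.array([3.0, -3.0]), np.eye(2))

x = np.array([[1.0, -5.0],   # at the class-0 mean
              [3.0, -3.0],   # at the class-1 mean
              [2.0, -4.0]])  # midway between the two means
post = posterior_class0(x, params0, params1)
```

With equal priors and equal covariances the midpoint is equally likely to belong to either class, which gives a quick sanity check on the implementation.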
Discriminative classification approaches
[Figure: four panels plotted over Feature 1 and Feature 2: Ground truth, FLDA, SVC, Simple Logistic Regression.]
Bayesian classification
[Figure: four panels plotted over Feature 1 and Feature 2: Simple Logistic Regression, Hyperplane Uncertainty, and two panels labelled Bayesian Logistic Regression.]
Why Bayesian?
To deal with different priors.
Consider a method with 90% sensitivity and specificity.
Consider using this to screen for a disease afflicting 1% of the population.
On average, out of 100 people there would be 10 wrongly assigned to the disease group.
A positive diagnosis suggests only about a 10% chance of having the disease (closer to 8% when computed exactly).
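\[
P(\text{Disease} \mid \text{Pred}+) = \frac{P(\text{Pred}+ \mid \text{Disease})\,P(\text{Disease})}{P(\text{Pred}+ \mid \text{Disease})\,P(\text{Disease}) + P(\text{Pred}+ \mid \text{Healthy})\,P(\text{Healthy})}
\]
\[
= \frac{\text{Sensitivity} \times P(\text{Disease})}{\text{Sensitivity} \times P(\text{Disease}) + (1 - \text{Specificity}) \times P(\text{Healthy})}
\]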
Better decision-making by accounting for utility functions.
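The screening example can be checked numerically. The sketch below simply evaluates the formula on this slide with the stated figures (90% sensitivity and specificity, 1% prevalence); the function name is made up for illustration.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """P(Disease | Pred+) from Bayes' rule."""
    true_pos = sensitivity * prevalence                    # P(Pred+|Disease) P(Disease)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)   # P(Pred+|Healthy) P(Healthy)
    return true_pos / (true_pos + false_pos)

ppv = positive_predictive_value(sensitivity=0.9, specificity=0.9, prevalence=0.01)
# ppv = 0.009 / (0.009 + 0.099), i.e. roughly 1 in 12
```

Changing the prevalence argument shows how strongly the answer depends on the prior, which is the point the slide makes.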
Curse of dimensionality
Large p, small n.
Nearest-neighbour classification
[Figure: nearest-neighbour decision regions plotted over Feature 1 and Feature 2.]
Not nice smooth separations.
Lots of sharp corners.
May be improved with K-nearest neighbours.
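A minimal sketch of K-nearest-neighbour classification, assuming Euclidean distance and binary labels in {0, 1}. The toy data are hypothetical and include one deliberately mislabelled point, to show how a larger K can smooth over it.

```python
import numpy as np

def knn_predict(train_x, train_y, test_x, k=1):
    """Label each test point by majority vote among its k nearest training points."""
    # Squared Euclidean distance from every test point to every training point.
    d2 = ((test_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]        # indices of the k closest points
    votes = train_y[nearest]                        # shape (n_test, k)
    return (votes.mean(axis=1) > 0.5).astype(int)   # ties (even k) go to class 0

# Hypothetical toy data: two clusters, plus one mislabelled point near the first.
train_x = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                    [2.0, 2.0], [2.0, 3.0], [3.0, 2.0],
                    [0.3, 0.1]])
train_y = np.array([0, 0, 0, 1, 1, 1, 1])
test_x = np.array([[0.25, 0.1], [2.5, 2.5]])

pred_1nn = knn_predict(train_x, train_y, test_x, k=1)  # follows the mislabelled point
pred_3nn = knn_predict(train_x, train_y, test_x, k=3)  # outvotes it
```

With k=1 the first test point inherits the label of the mislabelled neighbour; with k=3 the two correctly labelled neighbours win the vote.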
Rule-based approaches
[Figure: decision region in the (x, y) plane, both axes running from 0 to 4.]
((x<0.3) & (y<2)) | ((x<0.75) & (y<1.25)) | (y<0.4)
Not nice smooth separations.
Lots of sharp corners.
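The rule on this slide is already close to executable code. As a small sketch, it can be evaluated on arrays with NumPy; the function name and the sample points are made up for illustration.

```python
import numpy as np

def rule(x, y):
    """Evaluate the slide's rule element-wise; True where the rule fires."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return ((x < 0.3) & (y < 2)) | ((x < 0.75) & (y < 1.25)) | (y < 0.4)

# Hypothetical sample points (x, y).
xs = [0.1, 0.5, 0.5, 3.0, 3.0]
ys = [1.5, 1.0, 1.5, 0.2, 1.0]
labels = rule(xs, ys)
```

Each clause is an axis-aligned box, so the union has the stepped boundary with sharp corners that the slide describes.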
Corners matter in high-dimensions
[Figure: volume of a hypersphere of radius r = 1/2 (vertical axis, 0 to 1) against number of dimensions (horizontal axis, 0 to 20). Annotations: circle area = \( \pi r^2 \), sphere volume = \( \frac{4}{3} \pi r^3 \).]
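The curve on this slide can be reproduced from the standard formula for the volume of an n-dimensional ball, \( V_n(r) = \pi^{n/2} r^n / \Gamma(n/2 + 1) \). The slide does not state the formula, so the sketch below assumes it is the one plotted.

```python
import math

def hypersphere_volume(n_dims, radius=0.5):
    """Volume of an n-dimensional ball of the given radius."""
    return math.pi ** (n_dims / 2) * radius ** n_dims / math.gamma(n_dims / 2 + 1)

# Radius 1/2 means the ball just fits inside a unit hypercube.
volumes = [hypersphere_volume(n) for n in range(1, 21)]
```

For n = 2 and n = 3 this reduces to the circle area and sphere volume annotated on the plot. The fraction of the unit cube occupied by the ball shrinks towards zero as n grows, so almost all of the volume ends up in the corners.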
Dimensionality ≠ number of voxels
Little evidence to suggest that most voxel-based feature selection methods help.
Little or no increase in predictive accuracy.
Commonly perceived as being more “interpretable”.
Prior knowledge derived from independent data is the most reliable way to improve accuracy.
e.g. search the literature for clues about which regions to weight more heavily.
Cuingnet, Remi, Emilie Gerardin, Jerome Tessieras, Guillaume Auzias, Stephane Lehericy, Marie-Odile Habert, Marie Chupin, Habib Benali, and Olivier Colliot. “Automatic classification of patients with Alzheimer’s disease from structural MRI: a comparison of ten methods using the ADNI database.” Neuroimage 56, no. 2 (2011): 766-781.
Chu, Carlton, Ai-Ling Hsu, Kun-Hsien Chou, Peter Bandettini, and ChingPo Lin. “Does feature selection improve classification accuracy? Impact of sample size and feature selection on classification using anatomical magnetic resonance images.” Neuroimage 60, no. 1 (2012): 59-70.
See winning strategies in http://www.ebc.pitt.edu/PBAIC.html
Linear versus Nonlinear methods
Linear methods are more interpretable.
Nonlinear methods usually increase dimensionality.
Better to preprocess to obtain features that behave more linearly.
[Figure: Body Mass Index contours (BMI=18.5, BMI=25, BMI=30) of Height (m) against Weight (kg), shown as a linear plot and as a log-log plot.]
[Figure: Feature 2 against Feature 1, shown as Raw Data and as Log-Transformed.]
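The Body Mass Index example can be made concrete. Since BMI = weight / height², taking logarithms gives log(height) = 0.5 log(weight) − 0.5 log(BMI), so a contour of constant BMI is curved in raw coordinates but a straight line in log-log coordinates. The sketch below checks this with a least-squares line fit; the chosen BMI value and weight range are taken from the plot labels, everything else is illustrative.

```python
import numpy as np

bmi = 25.0
weight = np.linspace(40.0, 160.0, 50)   # kg
height = np.sqrt(weight / bmi)          # m, from BMI = weight / height**2

# Straight-line fit in raw coordinates: the residuals show the curvature.
raw_fit = np.polyfit(weight, height, 1)
raw_resid = height - np.polyval(raw_fit, weight)

# Straight-line fit in log-log coordinates: exact up to rounding error.
slope, intercept = np.polyfit(np.log(weight), np.log(height), 1)
log_resid = np.log(height) - (slope * np.log(weight) + intercept)
```

The fitted slope is 0.5 and the intercept is −0.5 log(BMI), so a linear method applied to the log-transformed features can represent the BMI boundary exactly.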
Geometric Variability: Manifolds, Principal Components
Transformed images fall on manifolds
Rotating an image leads to points on a 1D manifold.
[Figure: plot illustrating the 1D manifold traced out by rotating an image.]
Rigid-body motion leads to a 6-dimensional manifold (not shown).
Local linearisation through smoothing
Spatial smoothing can make the manifolds more linear with respect to small misregistrations.
Some information is inevitably lost.
One mode of geometric variability
[Figure panels: Simulated images; Principal components.]
A suitable model would reduce these data to a single dimension.
Two modes of geometric variability
[Figure panels: Simulated images; Principal components.]
A suitable model would reduce these data to two dimensions.
Similarity Measures: Distances, Examples of features
Similarity Measures
Many methods are based on similarity measures.
A common similarity measure is the dot product.
Similarity: \( k(x, y) = \sum_k x_k y_k \)
Nonlinear methods are often based on distances.
Distance: \( d(x, y) = \sqrt{\sum_k (x_k - y_k)^2} \)
Similarity: \( k(x, y) = \exp(-\lambda\, d(x, y)^2) \)
How do we best measure distances between brain images?
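The three measures on this slide translate directly into code. A minimal sketch for feature vectors held as NumPy arrays; the function names and the example vectors are made up for illustration.

```python
import numpy as np

def dot_similarity(x, y):
    """k(x, y) = sum_k x_k y_k"""
    return float(np.dot(x, y))

def euclidean_distance(x, y):
    """d(x, y) = sqrt(sum_k (x_k - y_k)^2)"""
    return float(np.sqrt(np.sum((x - y) ** 2)))

def rbf_similarity(x, y, lam=1.0):
    """k(x, y) = exp(-lambda * d(x, y)^2)"""
    return float(np.exp(-lam * euclidean_distance(x, y) ** 2))

x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 0.0, 3.0])

k_dot = dot_similarity(x, y)           # 1*1 + 2*0 + 3*3
d_xy = euclidean_distance(x, y)        # only the middle element differs, by 2
k_rbf = rbf_similarity(x, y, lam=0.5)  # exp(-0.5 * d_xy**2)
```

The last measure depends on the data only through the distance, which is why the choice of distance between brain images matters for nonlinear methods.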
Image Registration
Image registration measures distances between images.
Often involves minimising the sum of two terms:
Distance between the image intensities.
Distance of the deformation from zero.
The sum of these terms gives the distance.
[Figure panels: \( \mu \circ \theta \), \( f \circ \phi \), \( \phi \), \( |J_\phi| \), \( \theta \), \( |J_\theta| \).]
Different ways of measuring distances
Two simulated images
Metrics
Distances need to satisfy the properties of a metric:
1 d(x, y) ≥ 0 (non-negativity)
2 d(x, y) = 0 if and only if x = y (identity of indiscernibles)
3 d(x, y) = d(y, x) (symmetry)
4 d(x, z) ≤ d(x, y) + d(y, z) (triangle inequality).
Satisfying (3) requires inverse-consistent image registration.
Satisfying (4) requires a specific family of image registration algorithms.
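The four properties can be tested numerically on a matrix of pairwise distances. This is a sketch only: the function name is hypothetical, the items are assumed to be distinct, and in practice the matrix would come from registering every pair of images rather than from the toy points used here.

```python
import numpy as np

def is_metric(d, tol=1e-9):
    """Check the four metric properties on a square matrix of pairwise
    distances between distinct items."""
    n = d.shape[0]
    off_diag = ~np.eye(n, dtype=bool)
    non_negative = bool(np.all(d >= -tol))                           # (1)
    identity = bool(np.all(np.abs(np.diag(d)) <= tol)
                    and np.all(d[off_diag] > tol))                   # (2)
    symmetric = bool(np.allclose(d, d.T, atol=tol))                  # (3)
    # (4): d[i, k] <= d[i, j] + d[j, k] for every i, j, k.
    triangle = bool(np.all(d[:, None, :] <= d[:, :, None] + d[None, :, :] + tol))
    return non_negative and identity and symmetric and triangle

# Three points on a line at 0, 1 and 2.
pts = np.array([[0.0], [1.0], [2.0]])
euclid = np.abs(pts - pts.T)

euclid_ok = is_metric(euclid)        # ordinary distance
squared_ok = is_metric(euclid ** 2)  # squared distance: 4 > 1 + 1
```

Squared distance is symmetric and non-negative but breaks the triangle inequality, which illustrates why property (4) constrains the choice of registration algorithm.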
Non-Euclidean geometry
Distances are not always measured along a straight line.
“Shapes are the ultimate non-linear sort of thing”
Linear approximations to nonlinear problems
Example Images
Some example (non-brain) images.
Registered Images
We could register the images to their average shape...
Deformations
...and study the deformations...
Jacobian Determinants
...or the relative volumes...
Scalar Momentum
... or “scalar momentum”
Reconstructed Images
Reconstructions from template and scalar momenta.
Real data: Data, Features, Results
Real data
Used 550 T1w brain MRI from the IXI (Information eXtraction from Images) dataset. http://www.brain-development.org/
Data from three different hospitals in London:
Hammersmith Hospital using a Philips 3T system
Guy’s Hospital using a Philips 1.5T system
Institute of Psychiatry using a GE 1.5T system
Grey and White Matter
Segmented into GM and WM.
Approximately aligned via rigid-body.
Ashburner, J & Friston, KJ. Unified segmentation. NeuroImage 26(3):839–851 (2005).
Diffeomorphic Alignment
All GM and WM were diffeomorphically aligned to their common average-shaped template.
Ashburner, J & Friston, KJ. Diffeomorphic registration using geodesic shooting and Gauss-Newton optimisation. NeuroImage 55(3):954–967 (2011).
Ashburner, J & Friston, KJ. Computing average shaped tissue probability templates. NeuroImage 45(2):333–341 (2009).
Volumetric Features
A number of features were used for pattern recognition.
Firstly, two features relating to relative volumes.
Initial velocity divergence is similar to logarithms of Jacobian determinants.
[Figure panels: Jacobian Determinants; Initial Velocity Divergence.]
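One way to see why the two features are similar is a first-order expansion. This rests on an assumption not stated on the slide: that the deformation is close to the identity plus a small displacement given by the initial velocity, \( \phi(x) \approx x + v_0(x) \).

```latex
\[
\log \lvert J_\phi(x) \rvert
  = \log \det\bigl(I + \nabla v_0(x)\bigr)
  \approx \operatorname{tr}\bigl(\nabla v_0(x)\bigr)
  = \nabla \cdot v_0(x)
\]
```

The approximation holds for small deformations; for larger ones the higher-order terms of the expansion matter and the two features diverge.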
Grey Matter Features
[Figure panels: Rigidly Registered GM; Nonlinearly Registered GM; Registered and Jacobian Scaled GM.]
“Scalar Momentum” Features
“Scalar momentum” actually has two components because GM was matched with GM and WM was matched with WM.
[Figure panels: First Momentum Component; Second Momentum Component.]
Age Regression
Linear Gaussian Process Regression to predict subject ages.
[Figure, first panel: Bayesian Model Evidence, log likelihood against smoothing (mm, 0 to 20). Second panel: 8-Fold Cross-Validation, RMS error (years) against smoothing (mm, 0 to 20). Curves in both panels: Jacobians, Divergences, Rigid GM, Unmodulated GM, Modulated GM, Scalar Momentum.]
Rasmussen, CE & Williams, CKI. Gaussian processes for machine learning. Springer (2006).
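A minimal sketch of Gaussian process regression with a linear kernel, following the standard Cholesky-based recipe in Rasmussen & Williams. It is not the code behind the plots: the function names are hypothetical, the targets are assumed to be zero-mean, and the hyperparameters (signal_var, noise_var) are fixed by hand, whereas in practice they would normally be chosen by maximising the log evidence.

```python
import numpy as np

def linear_gp_fit(X, y, signal_var=1.0, noise_var=1.0):
    """Fit a GP with kernel K = signal_var * X X^T.
    Returns the weights alpha = C^{-1} y and the log marginal likelihood."""
    n = X.shape[0]
    C = signal_var * (X @ X.T) + noise_var * np.eye(n)
    L = np.linalg.cholesky(C)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    log_evidence = (-0.5 * (y @ alpha)
                    - np.sum(np.log(np.diag(L)))   # equals 0.5 * log det C
                    - 0.5 * n * np.log(2.0 * np.pi))
    return alpha, log_evidence

def linear_gp_predict(X_train, alpha, X_test, signal_var=1.0):
    """Posterior mean at the test points: K(test, train) @ alpha."""
    return signal_var * (X_test @ X_train.T) @ alpha

# Hypothetical synthetic data standing in for image features and ages.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.01 * rng.normal(size=40)

alpha, log_ev = linear_gp_fit(X, y, signal_var=1.0, noise_var=1e-4)
X_new = rng.normal(size=(5, 3))
pred = linear_gp_predict(X, alpha, X_new)
```

Only the n-by-n matrix of dot products between subjects enters the computation, so the cost depends on the number of subjects rather than the number of voxels. The log evidence returned here is the quantity compared across feature sets in the Bayesian Model Evidence plot.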
Sex Classification
Linear Gaussian Process Classification (EP) to predict sexes.
[Figure, first panel: Bayesian Model Evidence, log likelihood against smoothing (mm, 0 to 20). Second panel: Gaussian Process (EP), AUC against smoothing (mm, 0 to 20). Curves in both panels: Jacobians, Divergences, Rigid GM, Unmodulated GM, Modulated GM, Scalar Momentum.]
Rasmussen, CE & Williams, CKI. Gaussian processes for machine learning. Springer (2006).
Predictive Accuracies
Age
[Figure: Scalar Momentum (10mm FWHM); predicted age (years) against actual age (years), both axes 0 to 100.]
Sex
[Figure: ROC Curve (AUC=0.9769); axes labelled Sensitivity and Specificity.]
Conclusions
Scalar momentum (with about 10mm smoothing) appears to be a useful feature set.
Jacobian-scaled warped GM is surprisingly poor.
Amount of spatial smoothing makes a big difference.
Further dependencies on the details of the registration still need exploring.