Shape, Colour and Texture in Mitosis Detection
Violet Snell
Supervisors:
Prof. J. Kittler & Dr. W. Christmas
Centre for Vision, Speech and Signal Processing
University of Surrey
Guildford, UK
Violet Snell, CVSSP2
Pipeline
Colour Matching
Colour-based likelihood
Candidate Locations
Grey-scale Conversion
Patch Extraction & Segmentation
Patch Image & Object(s) Mask
Feature Extraction
Classifier
Pre-processing: stain variation
[Figure: Green histograms, per patient; Red histograms, per patient]
Pre-processing: White holes
Strong bias on histograms
Not clinically significant
Mask based on Green Thresholding
Threshold found as dip in Green histogram
[Figure: Green Histogram for single HPF (%), with Threshold = 209 marked]
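A minimal sketch of how such a dip could be located, assuming an 8-bit Green channel. The smoothing width and the `white_min` level that separates the tissue peak from the white peak are illustrative assumptions, not values from the slides.

```python
import numpy as np

def white_hole_threshold(green, smooth=9, white_min=220):
    """Grey level of the histogram dip between the tissue and white peaks.

    smooth and white_min are hypothetical parameters chosen for illustration.
    """
    hist = np.bincount(green.ravel(), minlength=256).astype(float)
    # Box-smooth the histogram so single-bin noise does not create false dips.
    hist = np.convolve(hist, np.ones(smooth) / smooth, mode="same")
    # Assumed: the white-hole peak is the strongest mode at or above white_min,
    # and the main tissue peak is the strongest mode below it.
    white_peak = white_min + int(np.argmax(hist[white_min:]))
    tissue_peak = int(np.argmax(hist[:white_min]))
    # The dip is the lowest point of the smoothed histogram between the peaks.
    return tissue_peak + int(np.argmin(hist[tissue_peak:white_peak + 1]))

def white_hole_mask(green, threshold):
    """True where a pixel is brighter than the threshold (treated as white hole)."""
    return green > threshold
```

The resulting mask can then be used to leave white-hole pixels out of the per-patient histograms.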
White Hole Exclusion
[Figure: Red histograms, per patient; Blue histograms, per patient]
Histogram Matching
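The slide gives no detail, so the following is only a generic sketch of histogram matching for one 8-bit channel, via cumulative distributions. The choice of reference histogram and the optional `valid` mask (for example, to leave out white holes) are assumptions.

```python
import numpy as np

def match_channel(source, reference_hist, valid=None):
    """Remap one 8-bit channel so that its histogram follows reference_hist.

    valid is an optional boolean mask of pixels used to build the source
    histogram (hypothetical parameter, e.g. the non-white-hole pixels).
    """
    if valid is None:
        valid = np.ones(source.shape, dtype=bool)
    src_cdf = np.cumsum(np.bincount(source[valid], minlength=256).astype(float))
    src_cdf /= src_cdf[-1]
    ref_cdf = np.cumsum(np.asarray(reference_hist, dtype=float))
    ref_cdf /= ref_cdf[-1]
    # For each source level, take the first reference level whose cumulative
    # frequency reaches the same value.
    lut = np.searchsorted(ref_cdf, src_cdf, side="left").clip(0, 255)
    return lut.astype(np.uint8)[source]
```

Applying this to R, G and B separately, with a common reference histogram, is one way to bring all patients onto the same colour scale.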
Colour-based Likelihood
Projections of 3D (R,G,B) histograms, 64 bins in each dimension
All pixels vs Mitotic nuclei
10 pixel radius around ground-truth marked locations
Ratio gives a basic likelihood for each pixel based on its colour alone
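The ratio of the two 3D histograms can be sketched as below, assuming 8-bit RGB pixels. The small `eps` that guards against empty bins is an assumption; the slides do not say how empty bins are handled.

```python
import numpy as np

BINS = 64  # bins per colour channel, as stated on the slide

def _bin_index(pixels):
    # Map 0..255 intensities to 0..63 bin indices.
    return (pixels.astype(np.int64) * BINS) // 256

def colour_histogram(pixels):
    """Normalised 3D (R,G,B) histogram of an (N, 3) array of 8-bit pixels."""
    idx = _bin_index(pixels)
    flat = (idx[:, 0] * BINS + idx[:, 1]) * BINS + idx[:, 2]
    hist = np.bincount(flat, minlength=BINS ** 3).astype(float)
    return (hist / hist.sum()).reshape(BINS, BINS, BINS)

def likelihood_ratio(mitotic_pixels, all_pixels, eps=1e-9):
    """Ratio of the mitotic-nuclei histogram to the all-pixels histogram."""
    return colour_histogram(mitotic_pixels) / (colour_histogram(all_pixels) + eps)

def pixel_likelihood(image, ratio):
    """Look up the colour-only likelihood for every pixel of an (H, W, 3) image."""
    idx = _bin_index(image)
    return ratio[idx[..., 0], idx[..., 1], idx[..., 2]]
```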
Likelihood to Candidate Locations
Low-pass filter to provide some spatial coherence
5x5 box
Threshold
Closed-contour search
Centre of each contour is initial location
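A rough sketch of these steps using SciPy. Note one substitution: connected-component labelling stands in for the closed-contour search, and the centre of mass of each region stands in for the contour centre. The threshold value is left to the caller.

```python
import numpy as np
from scipy import ndimage

def candidate_locations(likelihood, threshold):
    """Return (row, col) centres of above-threshold regions in a likelihood map."""
    # 5x5 box filter gives the likelihood some spatial coherence.
    smoothed = ndimage.uniform_filter(likelihood.astype(float), size=5)
    mask = smoothed > threshold
    # Substitute for the closed-contour search: label connected regions.
    regions, count = ndimage.label(mask)
    if count == 0:
        return []
    centres = ndimage.center_of_mass(mask.astype(float), regions,
                                     range(1, count + 1))
    return [(float(r), float(c)) for r, c in centres]
```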
Conversion to Grey-scale
PCA of pixels within Mitotic nuclei (10 pixel radius)
Principal axis projection
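As a sketch, the projection could look like this. Centring on the nuclei mean and the sign convention for the axis are assumptions, and any rescaling to an 8-bit grey range is not shown.

```python
import numpy as np

def principal_axis(nuclei_pixels):
    """Mean colour and first principal axis of an (N, 3) array of RGB pixels."""
    X = np.asarray(nuclei_pixels, dtype=float)
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    _, eigvecs = np.linalg.eigh(cov)   # eigenvalues come back in ascending order
    axis = eigvecs[:, -1]              # direction of largest variance
    if axis.sum() < 0:                 # assumed sign convention: brighter = larger
        axis = -axis
    return mean, axis

def to_grey(image, mean, axis):
    """Project every pixel of an (H, W, 3) image onto the principal axis."""
    return (np.asarray(image, dtype=float) - mean) @ axis
```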
Segmentation
70x70 pixel patches
Grey-level threshold search, range: 40-145
Optimised for 2 objectives:
High average gradient across resulting boundary
Low variance within the object
Weighted according to each parameter's standard deviation
Minimum Area Limit
Contrast between background and foreground
Location refined to centroid of segmented object
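One possible reading of the threshold search, as a sketch only. It assumes nuclei are darker than the background, treats the whole below-threshold region as the object, combines the two objectives by dividing each by its standard deviation across thresholds, and uses a hypothetical `min_area`. The contrast check is not shown.

```python
import numpy as np
from scipy import ndimage

def segment_patch(grey, levels=range(40, 146), min_area=30):
    """Pick the grey-level threshold that best balances the two objectives."""
    grey = np.asarray(grey, dtype=float)
    gy, gx = np.gradient(grey)
    grad = np.hypot(gx, gy)
    candidates = []
    for t in levels:
        mask = grey < t                      # assumed: object darker than background
        if mask.sum() < min_area:            # minimum area limit (hypothetical value)
            continue
        boundary = mask & ~ndimage.binary_erosion(mask)
        candidates.append((t, grad[boundary].mean(), grey[mask].var()))
    if not candidates:
        return None, None, None
    ts, g, v = (np.array(col, dtype=float) for col in zip(*candidates))
    # High boundary gradient is rewarded, high internal variance penalised,
    # each scaled by its own standard deviation over the tested thresholds.
    score = g / (g.std() + 1e-9) - v / (v.std() + 1e-9)
    best = int(ts[np.argmax(score)])
    mask = grey < best
    return best, mask, ndimage.center_of_mass(mask)   # centroid refines the location
```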
Segmentation
Telophase pairs
2nd object present in patch
Comparable areas
Comparable contrast
101K patches in training set
180:1 class imbalance
145:1 for single objects
800:1 for pairs
Questions?
Feature Extraction
Area
Circularity = Perimeter²/Area
Convex Hull Area relative to object Area
Elongation of minimum area enclosing rectangle
Boundary Radial profile
High-pass filter
Fourier Shape Descriptors
Normalised magnitudes
1-5 as separate features
Sum of the higher terms
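A sketch of the circularity and Fourier descriptor features, assuming the object boundary is available as an ordered list of points. Taking the radial profile about the mean of the boundary points and normalising by the zero-frequency term are assumptions; the high-pass filtering step is not shown.

```python
import numpy as np

def shape_features(boundary):
    """Shape features from an (N, 2) array of ordered points on a closed boundary."""
    boundary = np.asarray(boundary, dtype=float)
    x, y = boundary[:, 0], boundary[:, 1]
    # Shoelace formula for the enclosed area.
    area = 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))
    steps = np.roll(boundary, -1, axis=0) - boundary
    perimeter = np.linalg.norm(steps, axis=1).sum()
    # Radial profile: distance of each boundary point from the boundary mean.
    radii = np.linalg.norm(boundary - boundary.mean(axis=0), axis=1)
    mags = np.abs(np.fft.rfft(radii))
    mags = mags / mags[0]                    # normalised magnitudes
    return {
        "area": area,
        "circularity": perimeter ** 2 / area,    # as defined on the slide
        "fourier_1_to_5": mags[1:6],             # terms 1-5 as separate features
        "fourier_higher_sum": mags[6:].sum(),    # sum of the higher terms
    }
```

With this definition a circle gives the minimum circularity value, 4π, and near-zero Fourier terms.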
Feature Extraction
Contrast
Background excluding white holes
Segmentation level relative to object's mean intensity
Average Gradient across segmentation boundary
Average Edge steepness
Contrast-independent
Morphology at intermediate thresholds
1/3 and 2/3 of [min..max] interval
Average object area at each one
Feature Extraction
Local Variance
Inside object 7x7
Background 5x5
Low-pass filtered background variance
High-pass filtered background variance
High-pass filtered foreground variance
Object internal variance
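A sketch of the first two local variance features, assuming they are averages of a sliding-window variance over the object (7x7 window) and over the background (5x5 window). The low-pass and high-pass filtered variants are not shown.

```python
import numpy as np
from scipy import ndimage

def local_variance(grey, size):
    """Variance of the grey level in a size x size window around each pixel."""
    g = np.asarray(grey, dtype=float)
    mean = ndimage.uniform_filter(g, size=size)
    mean_sq = ndimage.uniform_filter(g * g, size=size)
    # Var = E[x^2] - E[x]^2, clipped at zero against rounding error.
    return np.maximum(mean_sq - mean * mean, 0.0)

def variance_features(grey, object_mask):
    """Average local variance inside the object (7x7) and in the background (5x5)."""
    inside = local_variance(grey, 7)[object_mask].mean()
    outside = local_variance(grey, 5)[~object_mask].mean()
    return inside, outside
```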
Classifier A: SVM
23 rotation-invariant features, normalised to unit variance
RBF kernel
Class imbalance 145:1 & Training set size 75K
Model averaging combined with random sub-sampling
All the positive examples
A different random portion of negative examples
Class weights
Parameter optimisation
Cross-validation is patient-based
Standard grid search fails for F1
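The sub-sampling and averaging scheme can be sketched independently of the classifier. Here `fit` is a hypothetical callable that trains one model and returns a scoring function; `centroid_fit` is only a stand-in so that the example runs without an SVM library. In practice an RBF-kernel SVM with class weights would take its place.

```python
import numpy as np

def train_ensemble(X, y, fit, n_models=10, neg_per_model=1000, seed=0):
    """Train several models, each on all positives plus a fresh random negative subset."""
    rng = np.random.default_rng(seed)
    pos = np.flatnonzero(y == 1)
    neg = np.flatnonzero(y == 0)
    models = []
    for _ in range(n_models):
        take = min(neg_per_model, neg.size)
        sub = rng.choice(neg, size=take, replace=False)   # different portion each time
        idx = np.concatenate([pos, sub])
        models.append(fit(X[idx], y[idx]))
    return models

def predict_average(models, X):
    """Model averaging: mean of the individual models' scores."""
    return np.mean([m(X) for m in models], axis=0)

def centroid_fit(X, y):
    """Stand-in classifier (assumption): compares distances to the class centroids."""
    mu_pos, mu_neg = X[y == 1].mean(axis=0), X[y == 0].mean(axis=0)
    return lambda Z: (np.linalg.norm(Z - mu_neg, axis=1)
                      - np.linalg.norm(Z - mu_pos, axis=1))
```

Each model sees a far milder class imbalance than 145:1, while the ensemble as a whole still draws on a large share of the negatives.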
Pairs
Features describe a single object
Telophase pairs need special treatment
Class imbalance over 800:1
Each object assessed separately first
At least one has a high enough prediction
Reduce class imbalance to 30:1, training set size ~1K
Features that assess balance of constituent objects' attributes
Average and ratio for a subset of single-object features
Total of the two objects' prediction scores
Classifier A: Results
Cross-validation F-score ~45%
Recall slightly higher than precision
Submission #1 optimised for overall F-score
Submission #2 optimised for weighted average of patient F-scores
Weighted by number of images – still over-representing high grade
Submission #3 biased in favour of Recall
Submission   Precision   Recall   F1
#1           41.2%       26.5%    32.2%
#2           38.2%       28.0%    32.3%
#3           35.7%       33.2%    34.4%
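For reference, F1 is the harmonic mean of precision and recall. The short check below recomputes it from the rounded percentages in the table, so the last digit can differ slightly from the reported values.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)

# Precision and recall per submission, in percent, copied from the table.
submissions = {"#1": (41.2, 26.5), "#2": (38.2, 28.0), "#3": (35.7, 33.2)}
recomputed = {name: f1_score(p, r) for name, (p, r) in submissions.items()}
```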
Results Analysis
Big gap between cross-validation and test
Recall consistently lower than precision in test
Training set still not large enough to cover patient and tissue variation
Total number of detections too low on all submissions
Good correlation to expected number of detections for each patient
Correlation coefficient 0.82
Patients with fewer mitoses are “harder”
Under-represented in training set
Questions?
Classifier B: GP-LVM
Joint submission
Sheffield Institute for Translational Neuroscience, Sheffield University, UK
Teo de Campos
GP-LVM
Gaussian Process Latent Variable Model
Probabilistic
Generative
Non-linear dimensionality reduction
Used for data-driven human motion synthesis
Latent Variable Models
Mapping from latent space X, related to underlying physical processes, to observed variables Y, e.g. pixel intensity values
Gaussian Process used to optimise the model's fit to example points through kernel parameters θ
Estimate of noise level, i.e. parts of signal not explained by the model
Computational complexity
O(N³) in number of training data points
Test samples require an iterative optimisation to estimate their position in latent space, and hence likelihood
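For reference, the standard GP-LVM objective is shown below for N training points with D observed dimensions. The kernel is an assumption for illustration (an RBF term plus a white-noise term, with θ = (α, γ, β)); the slides do not state which kernel was used.

```latex
\log p(Y \mid X, \theta)
  = -\frac{DN}{2}\log 2\pi
    - \frac{D}{2}\log\lvert K \rvert
    - \frac{1}{2}\operatorname{tr}\!\left(K^{-1} Y Y^{\top}\right),
\qquad
K_{ij} = \alpha \exp\!\left(-\frac{\gamma}{2}\lVert x_i - x_j \rVert^{2}\right)
         + \beta^{-1}\delta_{ij}
```

K is an N×N matrix, so the determinant and inverse account for the O(N³) cost, and β⁻¹ plays the role of the estimated noise level.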
Pipeline
Colour Matching
Colour-based likelihood
Candidate Locations
Grey-scale Conversion
Patch Extraction, Segmentation & Rotation
Patch Images
Classifier
GP-LVM Experiments
Separate models for Positive and Negative classes
Spatial pyramid to provide connections between pixels
6555-dimensional observed space
Negatives for training selected by clustering
Equal number of Positive and Negative training points
Very high levels of noise (2 dB SNR) compared to other applications (~30 dB)
Negative model produces much higher likelihoods than Positive model
Weights based on average density of mitotic figures
Very slow in test as well as training
May benefit from GPU acceleration
GP-LVM Results
[Figure: Positive Model, Dimensions 2 & 18]
[Figure: Positive Model, Dimensions 8 & 17]
F1=11.3%
Summary
Histogram matching to cancel stain variations
Colour-based likelihood as 1st phase of detection
Two very different approaches to classification
Traditional feature extraction and SVM
Balance of shape, intensity and texture attributes
Latent Variable Models with Gaussian Process
Future work
Deep GP-LVM
Direct to grade, or a mitotic count bracket, rather than locations
Requires much larger data sets, but a lot less labelling
Questions?
Likelihood Threshold
Threshold controls trade-off between
Number of missed mitoses
Number of candidate locations requiring further assessment
And therefore the class imbalance
Err on the side of caution
14 of 550 training locations missed (2.5%)
115K potentials to check