TRANSCRIPT
NA-MIC, National Alliance for Medical Image Computing, http://na-mic.org
Competitive Evaluation & Validation of Segmentation Methods
Martin Styner, UNC
NA-MIC Core 1 and 5
Main Activities
• DTI tractography: afternoon, Sonia Pujol
• Segmentation algorithms
– Competitions at MICCAI
– NA-MIC: co-sponsor
– Largest MICCAI workshops
– Continued online competition
– '07: Caudate, liver
– '08: Lesion, liver tumor, coronary artery
– '09: Prostate, head & neck, cardiac LV
– '10: Knee bones, cartilage?
Data Setup
• Open datasets with expert "ground truth"
• 3 sets of data:
1. Training data with ground truth for all
2. Testing data released prior to the workshop, for the proceedings
3. Testing data released at the workshop
• Workshop test data is a hard test
– Several methods failed under time pressure
– Rankings on sets 2 and 3 have always differed thus far
• Ground truth disseminated only for the training data
• Sets 2 & 3 fused for the online competition
• Additional STAPLE composite built from the submissions
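The STAPLE composite mentioned above fuses all submitted segmentations into a probabilistic reference via expectation-maximization (Warfield et al.). The sketch below is a simplified binary-label version with a scalar foreground prior, not the tool actually used by the challenges; the function name and defaults are illustrative assumptions.

```python
import numpy as np

def staple_binary(segs, prior=None, n_iter=50, tol=1e-6):
    """Simplified STAPLE for binary segmentations (illustrative sketch).

    segs: (R, N) array of R rater label maps over N voxels, values 0/1.
    Returns per-voxel foreground probability W plus per-rater
    sensitivity p and specificity q estimates.
    """
    segs = np.asarray(segs, dtype=float)
    R, N = segs.shape
    if prior is None:
        prior = segs.mean()              # scalar foreground prior (assumption)
    p = np.full(R, 0.9)                  # initial sensitivities
    q = np.full(R, 0.9)                  # initial specificities
    W = np.full(N, prior)
    for _ in range(n_iter):
        # E-step: posterior probability that each voxel is foreground
        a = prior * np.prod(np.where(segs == 1, p[:, None], 1 - p[:, None]), axis=0)
        b = (1 - prior) * np.prod(np.where(segs == 0, q[:, None], 1 - q[:, None]), axis=0)
        W_new = a / np.maximum(a + b, 1e-12)
        # M-step: re-estimate each rater's performance parameters
        p = (segs * W_new).sum(axis=1) / np.maximum(W_new.sum(), 1e-12)
        q = ((1 - segs) * (1 - W_new)).sum(axis=1) / np.maximum((1 - W_new).sum(), 1e-12)
        if np.abs(W_new - W).max() < tol:
            W = W_new
            break
        W = W_new
    return W, p, q
```

Thresholding `W` at 0.5 yields the consensus label map; the full algorithm also supports multi-label segmentations and spatially varying priors.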
Tools for Evaluation
• Open datasets
• Publicly available evaluation tools
– Adjusted for each application
– Automated, unbiased evaluation
• Score: composite of multiple metrics
– Normalized against expert variability
• Reliability/repeatability evaluation
– Scan/rescan datasets => coefficients of variation
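The scan/rescan reliability measure above reduces to a coefficient of variation: the standard deviation of repeated measurements divided by their mean. A minimal sketch, with the volume values invented purely for illustration:

```python
from statistics import mean, stdev

def coefficient_of_variation(volumes):
    """CoV in percent: 100 * sample standard deviation / mean."""
    return 100.0 * stdev(volumes) / mean(volumes)

# Hypothetical caudate volumes (mm^3) from repeated scans of one subject
vols = [3510, 3475, 3540, 3498, 3522, 3485, 3530, 3502, 3490, 3515]
cov = coefficient_of_variation(vols)  # small CoV => repeatable method
```

A low CoV across the 10 scan/rescan acquisitions of the same subject indicates that a method's volume estimates are repeatable.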
Example: Caudate Segmentation
• Caudate: part of the basal ganglia
– Implicated in schizophrenia, Parkinson's, Fragile X, autism
• Datasets from UNC & BWH
– Segmentations from 2 labs
– Pediatric, adult & elderly scans
– 33 training, 29 + 5 testing
• 10 scan/rescan acquisitions of a single subject
Metrics/Scores
• General metrics/scores
– Absolute volume difference (percent)
– Volumetric overlap
– Surface distance (mean/RMS/max)
• Volume metric for standard neuroimaging studies
• Shape metrics for shape analysis and parcellations
• Scores are relative to expert variability
– Intra-expert variability would score 90
– A score is computed for each metric
– Scores are averaged for each dataset
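The volume-based metrics above are straightforward to compute from binary label maps; surface distances additionally require boundary extraction and are omitted here. The score mapping below is an assumption consistent with the slide (zero error scores 100, intra-expert error scores 90, linear in between, floored at 0); the challenges' exact normalization may differ.

```python
import numpy as np

def volume_diff_percent(seg, ref):
    """Absolute volume difference as a percentage of the reference volume."""
    return 100.0 * abs(seg.sum() - ref.sum()) / ref.sum()

def volumetric_overlap_error(seg, ref):
    """100 * (1 - Jaccard overlap): 0 is a perfect match, 100 is no overlap."""
    inter = np.logical_and(seg, ref).sum()
    union = np.logical_or(seg, ref).sum()
    return 100.0 * (1.0 - inter / union)

def metric_score(error, expert_error):
    """Map an error to a score, normalized against expert variability:
    0 error -> 100, intra-expert error -> 90 (per the slide), linear,
    floored at 0. The challenges' actual mapping may differ."""
    return max(0.0, 100.0 - 10.0 * error / expert_error)
```

Per-metric scores computed this way are then averaged to give the overall score for each dataset.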
Results
• Automatic generation of tables & figures
• Atlas-based methods performed best
Online Evaluation
• Continued evaluation for new methods
• 6 new submissions in '09
– 2 prior to publication, needed for a favorable review
• Not all competitions have working online evaluations
Papers
• Workshop proceedings
– Open publication in the Insight Journal / MIDAS
• Papers in IEEE TMI & MedIA:
– Schaap et al. Standardized evaluation methodology and reference database for evaluating coronary artery centerline extraction algorithms. Medical Image Analysis (2009) vol. 13 (5), pp. 701-714
– Heimann et al. Comparison and evaluation of methods for liver segmentation from CT datasets. IEEE Transactions on Medical Imaging (2009) vol. 28 (8), pp. 1251-1265
• Caudate paper with updated online evaluation in preparation
Discussion
• Very positive echo from the community
• Evaluation workshops are proliferating
– DTI tractography at MICCAI '09
– Lung CT registration at MICCAI '10
• Many topics remain unaddressed
• Dataset availability is the biggest problem
• NA-MIC is a strong supporter