description and retrieval of medical visual information based on language modelling

Antonio Foncubierta-Rodríguez

Post on 15-Jul-2015


Page 1: Description and retrieval of medical visual information based on language modelling


Description and retrieval of medical visual information based on language modelling

Antonio Foncubierta-Rodríguez


Table of contents

Motivation and introduction

Technical contributions

Experiments

Concluding remarks


Evolution of medical images

• 1895: Conrad Röntgen discovers X–rays
• Approximately 100 years later: anatomical, functional, motion
• Any aspect can be visualized and quantified

Imaging modalities:
• Microscopy
• Visible light
• Magnetic resonance
• X–rays
• Nuclear imaging
• Ultrasound


Use of medical images

Geneva University Hospitals during 2012:
• X–rays: 30,645 CT exams
• Magnetic resonance: 12,819 MRI exams
• Nuclear imaging: 1,426 PET exams

30% of world storage capacity* (* estimation)


Dimensions of medical images

• 2D. E.g.: dermatography, radiography, angiography.
• 2D + time. E.g.: echography, endoscopy.
• 3D. E.g.: CT, MRI, PET.
• 3D + time. E.g.: functional MRI.
• 3D + other. E.g.: Dual Energy CT.


Computer Aided Tools

• Multimodal information
• Partly annotated
• Multidimensional

How to make sense of it?
• CAD (computer–aided diagnosis)
• CBIR (content–based image retrieval)


Visual features

[Diagram: taxonomy of visual feature descriptors]

High dimensional approaches:
• Shape description
• Point–based
• Surface–based
• Topology–based
• Full–support description
• Geometry–based
• Spectral–based
• Statistical & stochastic methods
• Video specific methods

Low dimensional approaches:
• Spin images
• Silhouettes and depth images
• Slice & frame analysis


Visual similarity

$$I_i = \log \frac{1}{P_i}$$

• Information:
    • Specific definition
    • Low level features
• Similarity:
    • General definition
    • Higher level concepts (semantic gap)


Bag of visual words

• BoVW aims at shortening the semantic gap
• Consists of:
    1. Partition an n–dimensional feature space into K disjoint regions
    2. Measure features at m sampling points of an image
    3. Assign each sample to one of the K regions
    4. The K–bin histogram is the image descriptor
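
The four steps above can be sketched as follows. This is a minimal illustration, not the implementation used in the talk: it assumes the K regions are given by centroids (for example, learned beforehand with k-means) and that a sample belongs to the region of its nearest centroid. The function name `bovw_histogram` and all data values are hypothetical.

```python
from math import dist

def bovw_histogram(samples, centroids):
    """Return the K-bin visual-word histogram of one image."""
    histogram = [0] * len(centroids)
    for feature in samples:
        # Assign the sample to the region of its nearest centroid (assumption).
        nearest = min(range(len(centroids)),
                      key=lambda k: dist(feature, centroids[k]))
        histogram[nearest] += 1
    return histogram

# Hypothetical 2-dimensional feature space partitioned into K = 3 regions.
centroids = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
# Hypothetical features measured at m = 5 sampling points of one image.
samples = [(0.1, -0.2), (0.9, 1.2), (4.8, 5.1), (0.2, 0.1), (5.3, 4.9)]

print(bovw_histogram(samples, centroids))  # [2, 1, 2]
```

The histogram always sums to m, so images with different numbers of sampling points are usually normalized before comparison.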


Scientific contributions

Feature extraction and modelling using BOVW:
• Multiscale texture descriptors
• Multiscale analysis of ROIs
• Optimal vocabulary size
• Optimal bag length
• Optimal vocabularies in DECT
• Vocabulary pruning
• Language modelling
• Ground truth generation


Section outline

Motivation and introduction

Technical contributions
    Multi–scale texture description
    A visual grammar
    ROI detector

Experiments

Concluding remarks


Multidimensional description

• 3D models:
    • External structure
    • Shape analysis
    • Deformation quantification
• Volumetric images:
    • Internal structure
    • Pattern analysis
    • Early stage detection


Multi–scale texture description

Texture: The feel, appearance or consistency of a surface or a substance.

— Oxford Dictionaries

Texture contains important information about the structural arrangement of surfaces and their relationship to the surrounding environment.

— Haralick et al.


Wavelet analysis

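$$\psi_{s,\tau}(t) = \frac{1}{\sqrt{s}}\,\psi\!\left(\frac{t-\tau}{s}\right)$$

$$\Psi_{s,\tau}(\omega) = \frac{1}{\sqrt{s}}\,|s|\,\Psi(s\omega)\,e^{-j\omega\tau}$$

• $\psi(t)$ must be zero mean
• $\Psi(\omega)$ is a bandpass filter
• Finite set of scale parameters $s$
• Scaling function $\varphi(t)$ used to cover the low frequencies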


Wavelet analysis: filterbanks

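[Figure: magnitude responses $|\Psi_s(\omega)|$ of the filterbank over $\omega$ for $s = 1$, $s = 2$ and $s = 4$; the bands are annotated with bandwidths $B$ and $B/2$.]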


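Isotropic wavelet analysis

• Gaussian–based functions to analyze isotropic image texture
• Difference of Gaussians is an approximation to Laplacian of Gaussians (Mexican Hat)

Difference of Gaussians:

$$g_{\sigma}(\mathbf{x}) = \frac{1}{\sigma_x \sigma_y \sigma_z \sqrt{(2\pi)^3}}\,\exp\!\left(-\left(\frac{(x\delta_x)^2}{2\sigma_x^2} + \frac{(y\delta_y)^2}{2\sigma_y^2} + \frac{(z\delta_z)^2}{2\sigma_z^2}\right)\right)$$

$$\psi_j(\mathbf{x}) = g_{\sigma_1}(\mathbf{x}) - g_{\sigma_2}(\mathbf{x}), \qquad \sigma_2 = 1.6\,\sigma_1$$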
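
The difference of Gaussians above can be evaluated point by point. The sketch below assumes an isotropic standard deviation per Gaussian (the same value on the three axes) and interprets $\delta_x, \delta_y, \delta_z$ as voxel spacings; the function names are hypothetical.

```python
from math import exp, pi, sqrt

def gaussian_3d(index, sigma, spacing=(1.0, 1.0, 1.0)):
    """Evaluate g_sigma at a voxel index, scaling each axis by its spacing."""
    norm = sigma[0] * sigma[1] * sigma[2] * sqrt((2 * pi) ** 3)
    exponent = sum((i * d) ** 2 / (2 * s ** 2)
                   for i, d, s in zip(index, spacing, sigma))
    return exp(-exponent) / norm

def dog_3d(index, sigma1, spacing=(1.0, 1.0, 1.0), ratio=1.6):
    """psi(x) = g_sigma1(x) - g_sigma2(x) with sigma2 = ratio * sigma1."""
    narrow = (sigma1,) * 3          # isotropic Gaussian (assumption)
    wide = (ratio * sigma1,) * 3
    return gaussian_3d(index, narrow, spacing) - gaussian_3d(index, wide, spacing)

centre = dog_3d((0, 0, 0), sigma1=1.0)   # positive lobe at the centre
ring = dog_3d((3, 0, 0), sigma1=1.0)     # negative surround further out
print(centre > 0, ring < 0)              # True True
```

The positive centre and negative surround are what make the filter respond to blob-like structures of a size set by $\sigma_1$.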


Riesz transform

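• Multidimensional extension of the Hilbert transform
• Steerable

Nth order 3D Riesz transform:

$$\widehat{\mathcal{R}^{(n_1,n_2,n_3)} f}(\boldsymbol{\omega}) = \sqrt{\frac{n_1+n_2+n_3}{n_1!\,n_2!\,n_3!}}\;\frac{(-j\omega_1)^{n_1}(-j\omega_2)^{n_2}(-j\omega_3)^{n_3}}{\|\boldsymbol{\omega}\|^{n_1+n_2+n_3}}\;\hat{f}(\boldsymbol{\omega})$$

for all combinations of $(n_1,n_2,n_3)$ with $n_1+n_2+n_3 = N$ and $n_{1,2,3} \in \mathbb{N}$. This gives $\binom{N+2}{2}$ templates $\mathcal{R}^{(n_1,n_2,n_3)}$.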
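
As a quick check of the template count, the index triplets can be enumerated directly. This sketch only lists the combinations $(n_1, n_2, n_3)$; it does not compute the transform itself, and `riesz_templates` is a hypothetical helper name.

```python
from math import comb

def riesz_templates(order):
    """List all (n1, n2, n3) with non-negative entries that sum to `order`."""
    return [(n1, n2, order - n1 - n2)
            for n1 in range(order + 1)
            for n2 in range(order - n1 + 1)]

for order in (1, 2, 3):
    templates = riesz_templates(order)
    # The count matches the binomial coefficient (N + 2 choose 2).
    assert len(templates) == comb(order + 2, 2)
    print(order, len(templates))
```

For orders 1, 2 and 3 this prints 3, 6 and 10 templates respectively.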


Riesz filterbanks

• Multiscale
• Steerable bandpass filters
• Fourier domain


Beyond bag of visual words

• Widely used
• Strong performance variation
• Clustering:
    • Large clusters, small vocabularies
    • Small clusters, large vocabularies

Language modelling of BOVW:
• Vocabulary size
• Meaning
• Word to word relations


From words to grammar

Grammar: The whole system and structure of a language or of languages in general, usually taken as consisting of syntax and morphology (including inflections) and sometimes also phonology and semantics.

— Oxford Dictionaries


From words to grammar

[Figure: visual words (marked x) scattered over the feature space.]

Visual grammar:
• Meaning
• Synonymy
• Polysemy


Visual topics

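PLSA–based definition: A visual topic is an unobserved or latent variable $z \in Z = \{z_1, \ldots, z_{N_Z}\}$ so that the probability of observing the word $w_n$ in the visual instance $I_i$ is:

$$P(w_n, I_i) = \sum_{j=1}^{N_Z} P(w_n \mid z_j)\,P(z_j \mid I_i).$$

[Diagram: image, topics, visual words. Images are linked to topics by $P(z_j \mid I_i)$ and topics to visual words by $P(w_n \mid z_j)$, collected in the matrix $W_{N_W \times N_Z}$.]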
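
The sum over topics is a matrix–vector product between the word–topic matrix and the topic mixture of one image. The sketch below uses made-up probabilities and does not estimate them (PLSA fitting is out of scope here); `word_probabilities` is a hypothetical name.

```python
def word_probabilities(p_word_given_topic, p_topic_given_image):
    """Return sum_j P(w_n | z_j) * P(z_j | I_i) for every word n."""
    return [sum(p_wz * p_zi for p_wz, p_zi in zip(row, p_topic_given_image))
            for row in p_word_given_topic]

# Hypothetical word-topic matrix: 3 words x 2 topics, each column sums to 1.
W = [[0.7, 0.1],
     [0.2, 0.3],
     [0.1, 0.6]]
# Hypothetical topic mixture P(z_j | I_i) of one image.
topic_mix = [0.25, 0.75]

probs = word_probabilities(W, topic_mix)
print([round(p, 3) for p in probs])  # [0.25, 0.275, 0.475]
```

Because the columns of `W` and the topic mixture each sum to one, the resulting word probabilities also sum to one.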


The word-topic matrix

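$$W_{N_W \times N_Z} = \begin{pmatrix} P(w_1 \mid z_1) & \cdots & P(w_1 \mid z_{N_Z}) \\ P(w_2 \mid z_1) & \cdots & P(w_2 \mid z_{N_Z}) \\ \vdots & \ddots & \vdots \\ P(w_{N_W} \mid z_1) & \cdots & P(w_{N_W} \mid z_{N_Z}) \end{pmatrix} \;\longrightarrow\; \begin{pmatrix} t_{1,1} & \cdots & t_{1,N_Z} \\ t_{2,1} & \cdots & t_{2,N_Z} \\ \vdots & \ddots & \vdots \\ t_{N_W,1} & \cdots & t_{N_W,N_Z} \end{pmatrix}$$

• Rows: relevant topics for a word
• Columns: relevant words for a topic
• Use the ratio of words as a topic–based significance $t_{n,j}$:
    • $t_{n,j} = 1$ → the most significant word for the topic
    • $t_{n,j} = 1/N_W$ → the least significant word for the topic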
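
One reading of "ratio of words" that reproduces both extreme values is a rank ratio: within each topic, words are ranked by $P(w_n \mid z_j)$ and the rank is divided by $N_W$. The sketch below implements that reading only as an assumption; the slides do not spell out the exact rule, and ties are broken arbitrarily here.

```python
def topic_significance(W):
    """Map each column of W to rank ratios in {1/N_W, ..., 1} (assumed rule)."""
    n_words, n_topics = len(W), len(W[0])
    T = [[0.0] * n_topics for _ in range(n_words)]
    for j in range(n_topics):
        # Words sorted by ascending probability under topic j.
        order = sorted(range(n_words), key=lambda n: W[n][j])
        for rank, n in enumerate(order, start=1):
            T[n][j] = rank / n_words
    return T

# Hypothetical word-topic matrix: 3 words x 2 topics.
W = [[0.7, 0.1],
     [0.2, 0.3],
     [0.1, 0.6]]
T = topic_significance(W)
print(T[0])  # word 1: most significant for topic 1, least for topic 2
```

With this rule every column of the significance matrix contains each value $1/N_W, 2/N_W, \ldots, 1$ exactly once when there are no ties.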


Visual meaningfulness

Definition: The visual meaningfulness of a visual word $w_n$ is its maximum topic–based significance level:

$$m_n = \begin{cases} \max_j t_{n,j} & \text{if } \max_j t_{n,j} \ge T_{\text{meaning}} \\ 0 & \text{otherwise} \end{cases}$$

• Words below the meaningfulness threshold can be truncated.


Meaningfulness transformation

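Definition:

$$h = (n(w_1), n(w_2), \ldots, n(w_{N_W}))^T$$

$$M = \begin{pmatrix} m_1 & 0 & \cdots & 0 \\ 0 & m_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & m_{N_W} \end{pmatrix}$$

$$h_M = M h; \qquad n(w_i^M) = m_i \cdot n(w_i)$$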
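
A small sketch of the meaningfulness weighting, assuming a significance matrix is already available. Since $M$ is diagonal, the product $Mh$ reduces to an element-wise multiplication; the function names and all values below are hypothetical.

```python
def meaningfulness(T, threshold):
    """m_n = max_j t_{n,j} if it reaches the threshold, else 0."""
    return [max(row) if max(row) >= threshold else 0.0 for row in T]

def apply_diagonal(weights, histogram):
    """Compute M h for a diagonal M without building the full matrix."""
    return [m * count for m, count in zip(weights, histogram)]

# Hypothetical significance matrix: 4 words x 2 topics.
T = [[1.0, 0.25],
     [0.5, 0.75],
     [0.5, 0.5],
     [0.25, 1.0]]
h = [4, 8, 6, 2]  # hypothetical word counts n(w_i)

m = meaningfulness(T, threshold=0.75)
h_M = apply_diagonal(m, h)
print(m)    # [1.0, 0.75, 0.0, 1.0]
print(h_M)  # [4.0, 6.0, 0.0, 2.0]
```

Bins whose weight is zero carry no information and can be dropped, which is how the threshold prunes the vocabulary.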


Word to word relations

Example:
• A single class might have several visual appearances
• Several classes might partially share the visual appearance
• Two visual words with the same meaning belong to different histogram bins and cannot be compared
• Identifying synonymy allows these words to be compared

[Figure: feature space with a bimodal class and a partially shared appearance, covered by words 1 to 5.]


Synonymy graphs

• Word 3 is partially linked to words 1 and 2
• Words 4 and 5 are also linked

[Figure: synonymy graph with nodes word 1 to word 5.]


Visual synonymy

Definition: A pair of visual words $w_n, w_m$ can be considered synonyms if the following three conditions are met:
1. There is at least one visual topic $z_j$ to which both $w_n$ and $w_m$ belong.
2. $w_n$ and $w_m$ have a similar contextual distribution with the rest of the words.
3. $w_n$ and $w_m$ have a complementary distribution in the collection.


Synonymy value

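Definition: The synonymy value of two words $w_n, w_m$ is the maximum significance value for which both words are significant for the same visual topic.

$$\begin{pmatrix} t_{1,1} & \cdots & t_{1,N_Z} \\ t_{2,1} & \cdots & t_{2,N_Z} \\ \vdots & \ddots & \vdots \\ t_{N_W,1} & \cdots & t_{N_W,N_Z} \end{pmatrix}$$

$$\sigma_{nm} = \sigma_{mn} = \max_j \left\{ \min\left(t_{n,j},\, t_{m,j}\right) \right\}$$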
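
The max–min rule can be read row by row from the significance matrix. A minimal sketch with a hypothetical matrix follows; it computes only the synonymy value and does not test the three synonymy conditions.

```python
def synonymy_value(T, n, m):
    """sigma_nm = max over topics j of min(t_{n,j}, t_{m,j})."""
    return max(min(a, b) for a, b in zip(T[n], T[m]))

# Hypothetical significance matrix: 3 words x 2 topics.
T = [[1.0, 0.25],
     [0.75, 0.5],
     [0.25, 1.0]]

print(synonymy_value(T, 0, 1))  # 0.75: both words are significant for topic 1
print(synonymy_value(T, 0, 2))  # 0.25: the words peak in different topics
```

The value is symmetric by construction, since swapping the two rows does not change the element-wise minimum.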


Synonymy transformation

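Definition:

$$S = \begin{pmatrix} 1 & s_{12} & \cdots & s_{1N_W} \\ s_{21} & 1 & \cdots & s_{2N_W} \\ \vdots & \vdots & \ddots & \vdots \\ s_{N_W 1} & s_{N_W 2} & \cdots & 1 \end{pmatrix}, \qquad s_{ij} = s_{ji} = \begin{cases} 1 & \text{if } i = j \\ \sigma_{ij} & \text{if } w_i, w_j \text{ are synonyms} \\ 0 & \text{otherwise} \end{cases}$$

Transformed histogram:

$$h_S = S h; \qquad n(w_i^S) = n(w_i) + \sum_{j \ne i} s_{ij}\, n(w_j)$$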
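
A worked instance of the transformed histogram, with a hypothetical synonymy matrix in which only words 1 and 2 are synonyms (value 0.5):

```python
def synonymy_transform(S, h):
    """h_S = S h: each bin receives a weighted share of its synonyms' counts."""
    return [sum(s * count for s, count in zip(row, h)) for row in S]

# Hypothetical symmetric synonymy matrix for 3 words.
S = [[1.0, 0.5, 0.0],
     [0.5, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
h = [4, 2, 6]  # hypothetical word counts

print(synonymy_transform(S, h))  # [5.0, 4.0, 6.0]
```

Bin 1 becomes 4 + 0.5 · 2 = 5 and bin 2 becomes 2 + 0.5 · 4 = 4, while the third word, which has no synonyms, is unchanged.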


Word ambiguity and dimensionality

• Some visual words are sources of ambiguity if they relate to various appearances
• Their presence in the histogram is not discriminative
• Possible solution: identify polysemy and reduce their weight

[Figure: a visual word shared between topic A and topic B.]


Visual polysemy

Definition: A visual word $w_n$ is polysemic in the strict sense if all the following conditions are met:
1. There are at least two visual topics $z_j, z_k$ to which the visual word belongs (wide sense polysemy).
2. There is a visual word $w_m$ which is a synonym of $w_n$ and belongs to the topic $z_j$.
3. There is a visual word $w_l$ which is a synonym of $w_n$ and belongs to the topic $z_k$.
4. $w_m$ and $w_l$ are not synonyms.


Polysemy threshold

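Definition: The polysemy threshold of a visual word $w_n$, $T^n_{\text{polysemy}}$, is the largest value for which there are at least two topics where the word is significant above the threshold:

$$\begin{pmatrix} t_{1,1} & \cdots & t_{1,N_Z} \\ t_{2,1} & \cdots & t_{2,N_Z} \\ \vdots & \ddots & \vdots \\ t_{N_W,1} & \cdots & t_{N_W,N_Z} \end{pmatrix}$$

$$\left|\left\{\, j \in \{1, \ldots, N_Z\} : t_{n,j} \ge T^n_{\text{polysemy}} \,\right\}\right| \ge 2$$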


Polysemy transformation

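Definition:

$$P = \begin{pmatrix} p_1 & 0 & \cdots & 0 \\ 0 & p_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & p_{N_W} \end{pmatrix}; \qquad p_i = 1 - T^i_{\text{polysemy}}$$

Transformed histogram:

$$h_P = P h; \qquad n(w_i^P) = p_i \cdot n(w_i)$$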
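
The largest threshold that still leaves at least two topics above it is simply the second-largest significance in the word's row. The sketch below uses that observation to compute the weights $p_i$; it assumes at least two topics and a hypothetical significance matrix.

```python
def polysemy_threshold(row):
    """Largest T such that at least two entries of `row` are >= T."""
    return sorted(row, reverse=True)[1]  # second-largest significance

def polysemy_transform(T, h):
    """h_P = P h with p_i = 1 - polysemy threshold of word i."""
    weights = [1.0 - polysemy_threshold(row) for row in T]
    return [p * count for p, count in zip(weights, h)]

# Hypothetical significance matrix: 2 words x 3 topics.
T = [[1.0, 0.25, 0.5],    # mostly tied to one topic
     [0.75, 0.75, 0.25]]  # equally significant for two topics
h = [8, 4]                # hypothetical word counts

print([polysemy_threshold(row) for row in T])  # [0.5, 0.75]
print(polysemy_transform(T, h))                # [4.0, 1.0]
```

The second word is strongly significant for two topics, so its bin is down-weighted more heavily than the first.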


Grammatical similarity

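Visual grammar:
• Meaning: vocabulary pruning
• Synonymy: bin to bin weighting
• Polysemy: vocabulary weighting

$$\mathrm{sim}_{\mathrm{gram}}(I_i, I_j) = \frac{(S \cdot P \cdot M \cdot h_i)^T \cdot (S \cdot P \cdot M \cdot h_j)}{\|S \cdot P \cdot M \cdot h_i\| \cdot \|S \cdot P \cdot M \cdot h_j\|}$$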
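
The grammatical similarity is a cosine similarity between histograms after the three transformations. The following sketch chains them with plain lists; the matrices and histograms are hypothetical, and the zero-norm guard is an added assumption.

```python
from math import sqrt

def matvec(A, v):
    """Matrix-vector product for list-of-lists matrices."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def diag(values):
    """Build a diagonal matrix from a list of weights."""
    size = len(values)
    return [[values[i] if i == j else 0.0 for j in range(size)]
            for i in range(size)]

def grammatical_similarity(S, P, M, h_i, h_j):
    """Cosine similarity of S.P.M.h_i and S.P.M.h_j."""
    g_i = matvec(S, matvec(P, matvec(M, h_i)))
    g_j = matvec(S, matvec(P, matvec(M, h_j)))
    dot = sum(a * b for a, b in zip(g_i, g_j))
    norm = sqrt(sum(a * a for a in g_i)) * sqrt(sum(b * b for b in g_j))
    return dot / norm if norm else 0.0  # guard for empty histograms (assumption)

M = diag([1.0, 1.0, 0.0])   # word 3 is not meaningful
P = diag([1.0, 0.5, 1.0])   # word 2 is partly polysemic
S = [[1.0, 0.5, 0.0],       # words 1 and 2 are synonyms
     [0.5, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
h_i, h_j = [2, 0, 5], [0, 4, 1]

print(round(grammatical_similarity(S, P, M, h_i, h_j), 3))  # 0.8
```

In this toy case the two images share no meaningful bin directly, yet the synonymy link between words 1 and 2 makes them similar once the non-meaningful third word is pruned.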


Section outline

Motivation and introduction

Technical contributions
    Multi–scale texture description
    A visual grammar
    ROI detector

Experiments

Concluding remarks


Local analysis

• Medical images contain large amounts of information
• Abnormalities and clinically relevant patterns occur only in reduced regions of interest
• Local context description:
    • Dense sampling
    • Keypoint–based analysis


Geodesic detection of regional extrema

1. Multi–scale difference of Gaussians relates to saliency
2. Use geodesic operations to obtain regional extrema:
    2.1 Fill hole / grind peak
    2.2 Subtract from the original DoG image
    2.3 Label each fully connected component larger than a structuring element
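
The three geodesic steps can be illustrated on a tiny 2D map. This is a simplified sketch under several assumptions, not the detector from the talk: it works in 2D with 4-connectivity, implements "grind peak" as a reconstruction by dilation from the image border, and replaces the structuring-element test with a minimum component size. All names and values are hypothetical.

```python
NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # 4-connectivity (assumed)

def inside(grid, r, c):
    return 0 <= r < len(grid) and 0 <= c < len(grid[0])

def grind_peaks(image):
    """Step 2.1: remove peaks that are not connected to the border."""
    rows, cols = len(image), len(image[0])
    low = min(min(row) for row in image)
    # Marker: keep border values, start every interior pixel at the minimum.
    ground = [[image[r][c] if r in (0, rows - 1) or c in (0, cols - 1) else low
               for c in range(cols)] for r in range(rows)]
    changed = True
    while changed:  # repeat geodesic dilations until nothing changes
        changed = False
        for r in range(rows):
            for c in range(cols):
                best = max([ground[r][c]] + [ground[r + dr][c + dc]
                           for dr, dc in NEIGHBOURS if inside(image, r + dr, c + dc)])
                value = min(best, image[r][c])  # never exceed the original image
                if value != ground[r][c]:
                    ground[r][c] = value
                    changed = True
    return ground

def regional_maxima(image, min_size):
    ground = grind_peaks(image)
    # Step 2.2: subtract the ground image from the original.
    peaks = [[a - b for a, b in zip(ri, rg)] for ri, rg in zip(image, ground)]
    # Step 2.3: label connected components that are large enough.
    seen, components = set(), []
    for r in range(len(peaks)):
        for c in range(len(peaks[0])):
            if peaks[r][c] <= 0 or (r, c) in seen:
                continue
            stack, component = [(r, c)], []
            seen.add((r, c))
            while stack:
                y, x = stack.pop()
                component.append((y, x))
                for dy, dx in NEIGHBOURS:
                    ny, nx = y + dy, x + dx
                    if inside(peaks, ny, nx) and peaks[ny][nx] > 0 and (ny, nx) not in seen:
                        seen.add((ny, nx))
                        stack.append((ny, nx))
            if len(component) >= min_size:
                components.append(sorted(component))
    return components

# Hypothetical saliency map: a 2x2 peak, a single-pixel peak, a bright border pixel.
image = [[1, 1, 1, 1, 1, 5],
         [1, 4, 4, 1, 1, 1],
         [1, 4, 4, 1, 1, 1],
         [1, 1, 1, 1, 3, 1],
         [1, 1, 1, 1, 1, 1],
         [1, 1, 1, 1, 1, 1]]
print(regional_maxima(image, min_size=2))  # [[(1, 1), (1, 2), (2, 1), (2, 2)]]
```

The bright border pixel is treated as background because it is connected to the image boundary, and the single-pixel peak is discarded by the size criterion.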


Section outline

Motivation and introduction

Technical contributions

Experiments
    Texture analysis of 2D lung CT
    Texture analysis of 3D brain MRI
    Texture analysis of 4D lung CT
    Texture analysis of 4D lung CT using ROIs
    Visual grammar for description of 2D images
    Visual grammar for description of 3D medical images

Concluding remarks


Texture analysis of 2D lung CT

• Interstitial lung diseases
• TALISMAN dataset acquired at Geneva University Hospitals
• 90 HRCT scans from 85 patients
• 1679 annotated regions
• 6 classes:
    • fibrosis
    • ground glass
    • emphysema
    • micronodules
    • healthy tissue
    • consolidation


Texture analysis of 2D lung CT

[Pipeline diagram]
1. Dataset
2. Wavelet transform (4 scales)
3. Energy of coefficients
4. k-means clustering
5. Visual vocabulary: word-1 = (f11, f12, ..., f1N), word-2 = (f21, f22, ..., f2N), ..., word-k = (fk1, fk2, ..., fkN)
6. Histogram of visual words for each region (k-dimensional discrete feature space)


Texture analysis of 2D lung CT

• Optimal number of visual words between 100 and 300
• Overall performance decreases with larger vocabularies

Keep only meaningful words

[Figure: P@1 (%) versus number of visual words (0 to 500) for consolidation, emphysema, fibrosis, ground glass, healthy, micronodules and their geometric mean.]


Texture analysis of 3D brain MRI

• Texture–based segmentation of the cerebellum
• IBSR dataset provided by MGH
• MRI from 18 adult subjects
• Manual segmentations:
    • Cerebellum cortex
    • Cerebellum white matter


Texture analysis of 3D brain MRI

[Pipeline diagram]
• Preprocessing: histogram equalization of the training and testing sets
• Feature extraction: DoG 3D wavelet (5 scales); k-means clustering of the training set gives the visual vocabulary; visual word assignment; visual words histogram per NxNxN block
• Classification: nearest neighbor search in the feature space of visual words histograms


Texture analysis of 3D brain MRI

• Performance improves with larger block sizes:
    • Rest of brain
    • Cerebellum cortex
• Performance does not improve:
    • Cerebellum white matter

Data–driven regions of interest


Texture analysis of 4D lung CT

• Pulmonary embolism retrieval
• Dual Energy CT dataset acquired at Geneva University Hospitals
• 25 patients
• 4D data:
    • x, y, z
    • Energy level of acquisition
• Ground truth:
    • Severity (Qanadli index)
    • Lobe based


Texture analysis of 4D lung CT

[Pipeline diagram]
1. Lung lobes mask (lobes 1 to 5)
2. Wavelet transform (5 scales) for each energy level (energy level 1, ..., energy level 11)
3. Energy of coefficients: voxel_i = (fi1, fi2, ..., fiN), a 55-dimensional continuous feature space
4. k-means clustering
5. Visual vocabulary: word-1 = (f11, f12, ..., f1N), word-2 = (f21, f22, ..., f2N), ..., word-k = (fk1, fk2, ..., fkN)
6. voxel_i = closest word
7. Histogram of visual words for each lobe (k-dimensional discrete feature space)


Texture analysis of 4D lung CT

• Performance improves with 4D data:
    • 63% for P@1
    • 62% for P@5
    • 60% for P@10
• Optimal configuration:
    • 2 scales, 100–150 words
• Intensive computation:
    • High dimensional feature space

Analyze only part of the data: ROIs and meaningful words.

Words  Scales  P@1 (%)  P@5 (%)  P@10 (%)
50     1       55       56       56
100    1       58       55       57
150    1       58       56       56
50     2       62       58       55
100    2       62       62       60
150    2       63       62       60
50     3       58       54       55
100    3       60       59       58
150    3       57       62       58
50     5       45       52       51
100    5       57       52       51
150    5       58       52       52
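
The P@k columns report precision at rank k. As a reminder of how that measure is computed for a single query, here is a minimal sketch; the labels are hypothetical and the way relevance was judged in the experiments is not specified here.

```python
def precision_at_k(retrieved_labels, query_label, k):
    """Fraction of the first k retrieved items that share the query's label."""
    top = retrieved_labels[:k]
    return sum(label == query_label for label in top) / k

# Hypothetical ranked retrieval result for one query labelled "severe".
ranked = ["severe", "mild", "severe", "severe", "mild"]

print(precision_at_k(ranked, "severe", 1))  # 1.0
print(precision_at_k(ranked, "severe", 5))  # 0.6
```

Values such as those in the table are obtained by averaging this quantity over all queries.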


Texture analysis of 4D lung CT using ROIs

• Pulmonary embolism detection
• Improvements over previous approaches:
    • ROI–based analysis
    • Optimal combination of energy–based vocabularies


Texture analysis of 4D lung CT using ROIs

• Improvements in performance:
    • Optimal combination of energy–based vocabularies
    • Multi–scale regions of interest

Finer–grain analysis of significant words and synergies among them

Lobe  DECT  Words  Energy levels       SECT
LR    84%   5      (50, 130)           52%
LL    84%   5      (100, 140)          48%
MR    80%   5      (40, 50, 130, 140)  52%
UL    76%   25     (40, 70, 80, 90)    60%
UR    80%   25     (90, 120)           56%


Visual grammar for description of 2D images

• Classification and retrieval of images from the biomedical literature
• ImageCLEFmed modality classification task
• 1000 training and 1000 test images
• 31 hierarchical categories


Visual grammar for description of 2D images

• SIFT–based visual vocabularies
• Varying number of visual topics from 25 to 350 in steps of 25
• Varying meaningfulness threshold from 50% to 100%


Visual grammar for description of 2D images

• Statistically significant improvement over state-of-the-art baseline
• Vocabulary reductions without effect on the accuracy:
    • Up to 20% of the original vocabulary size

Analyze synonymy relations among multiple vocabularies

[Figure: classification accuracy (%) versus effective number of visual words (0 to 500) for the baseline and the grammar, with the statistical significance threshold.]


Visual grammar for description of 3D medical images

• Organ identification task
• VISCERAL dataset
• Full body CT scans:
    • 15 contrast–enhanced
    • 15 not enhanced
    • 10 anatomical structures, 8 classes


Visual grammar for description of 3D medical images

• Riesz–based texture features:
    • 3 scales
    • Riesz order 2
• Organ–specific vocabularies:
    • 1000 random samples within the organ
    • 20 visual words per organ
• Visual grammar transformation


Visual grammar for description of 3D medical images

• Good results for organ identification
• Reduction of vocabulary size with respect to baseline without visual grammar


[Figure: classification accuracy (%) versus vocabulary size (0 to 200).]


Section outline

Motivation and introduction

Technical contributions

Experiments

Concluding remarks


Conclusions

Feature extraction and modelling using BOVW:
• Multiscale texture descriptors: evaluation of DoG and Riesz wavelets and BOVW
• Multiscale analysis of ROIs: data–driven ROI for local analysis of lung texture
• Optimal vocabulary size: optimal vocabulary size by learning informative words
• Optimal bag length: BOVW need to cover anatomically meaningful areas
• Optimal vocabularies in DECT: specific vocabularies, combined, provide better insight into patterns
• Vocabulary pruning: removal of words using language modelling does not impact accuracy
• Language modelling: visual grammar transformations improve accuracy and reduce descriptor size


Shortcomings

• Visual grammar model is slow to train for large vocabularies; synonymy requires further restrictions (sparsity)
• Semantics is covered, but there are other aspects that can still be explored:
    • Variations of a visual word (morphology)
    • Combination rules of words in proximity (syntax)
• Bag of visual words has evolved into VLAD and Fisher Vectors, which in some aspects are more robust.


Future work

• Extend the visual grammar evaluation
• Extend the visual grammar to cover various languages:
    • Synergies between isotropic and steerable texture descriptors
    • Synergies between text and visual description
    • Synergies between color and texture description
• Extend the language modelling to identify:
    • Paradigmatic relations
    • Absence of visual words


Questions
