news image annotation on a large parallel text-image corpus€¦ · introduction image annotation...

26

Upload: others

Post on 11-Oct-2020

18 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: News Image Annotation on a Large Parallel Text-Image Corpus€¦ · Introduction Image annotation ext-ImageT News corpus Experiments Conclusion News Image Annotation on a Large Parallel

IntroductionImage annotation

Text-Image News corpusExperimentsConclusion

News Image Annotation

on a Large Parallel Text-Image Corpus

Pierre Tirilly1, Vincent Claveau2 and Patrick Gros3

1Université de Rennes 1/IRISA � 2CNRS/IRISA3INRIA Rennes-Bretagne Atlantique

LREC - 05/20/2010 - Malta

Pierre Tirilly1, Vincent Claveau2 and Patrick Gros3 News Image Annotation

Page 2: News Image Annotation on a Large Parallel Text-Image Corpus€¦ · Introduction Image annotation ext-ImageT News corpus Experiments Conclusion News Image Annotation on a Large Parallel

IntroductionImage annotation

Text-Image News corpusExperimentsConclusion

Image annotation

Image annotation: textual description of image content

Pierre Tirilly1, Vincent Claveau2 and Patrick Gros3 News Image Annotation

Page 3: News Image Annotation on a Large Parallel Text-Image Corpus€¦ · Introduction Image annotation ext-ImageT News corpus Experiments Conclusion News Image Annotation on a Large Parallel

IntroductionImage annotation

Text-Image News corpusExperimentsConclusion

Image annotation: major issues

How to acquire annotation keywords?

How to associate keywords with visual content?

Pierre Tirilly1, Vincent Claveau2 and Patrick Gros3 News Image Annotation

Page 4: News Image Annotation on a Large Parallel Text-Image Corpus€¦ · Introduction Image annotation ext-ImageT News corpus Experiments Conclusion News Image Annotation on a Large Parallel

IntroductionImage annotation

Text-Image News corpusExperimentsConclusion

Image annotation: approaches

Typical approach: low-level approach

Fixed annotation vocabularyVisual content = low-level features (color, texture...)Machine learning to �nd keywords-visual features associations

→ Problem: semantic gap

Our approach: high-level approach

Open annotation vocabulary acquired on a text corpusDetection of high-level concepts in imagesAssociation of high-level textual and visual concepts

→ Requires a parallel text-image corpus

Pierre Tirilly1, Vincent Claveau2 and Patrick Gros3 News Image Annotation

Page 5: News Image Annotation on a Large Parallel Text-Image Corpus€¦ · Introduction Image annotation ext-ImageT News corpus Experiments Conclusion News Image Annotation on a Large Parallel

IntroductionImage annotation

Text-Image News corpusExperimentsConclusion

Contributions

High-level image annotation method

Named entity extraction from textsUse of named entities to annotate the images containing thesuited visual concepts

New corpus containing texts and images

Pierre Tirilly1, Vincent Claveau2 and Patrick Gros3 News Image Annotation

Page 6: News Image Annotation on a Large Parallel Text-Image Corpus€¦ · Introduction Image annotation ext-ImageT News corpus Experiments Conclusion News Image Annotation on a Large Parallel

IntroductionImage annotation

Text-Image News corpusExperimentsConclusion

Selection of annotation candidatesAnnotation using visual concepts

Outline

1 Image annotationSelection of annotation candidatesAnnotation using visual concepts

2 Text-Image News corpus

3 Experiments

4 Conclusion

Pierre Tirilly1, Vincent Claveau2 and Patrick Gros3 News Image Annotation

Page 7: News Image Annotation on a Large Parallel Text-Image Corpus€¦ · Introduction Image annotation ext-ImageT News corpus Experiments Conclusion News Image Annotation on a Large Parallel

IntroductionImage annotation

Text-Image News corpusExperimentsConclusion

Selection of annotation candidatesAnnotation using visual concepts

Objective

Pierre Tirilly1, Vincent Claveau2 and Patrick Gros3 News Image Annotation

Page 8: News Image Annotation on a Large Parallel Text-Image Corpus€¦ · Introduction Image annotation ext-ImageT News corpus Experiments Conclusion News Image Annotation on a Large Parallel

IntroductionImage annotation

Text-Image News corpusExperimentsConclusion

Selection of annotation candidatesAnnotation using visual concepts

Selection of annotation candidates

Named entity detection and categorization using Nemesis [Fourour, 2002]

Pierre Tirilly1, Vincent Claveau2 and Patrick Gros3 News Image Annotation

Page 9: News Image Annotation on a Large Parallel Text-Image Corpus€¦ · Introduction Image annotation ext-ImageT News corpus Experiments Conclusion News Image Annotation on a Large Parallel

IntroductionImage annotation

Text-Image News corpusExperimentsConclusion

Selection of annotation candidatesAnnotation using visual concepts

Annotation candidates scores

Annotation candidates = named entities extracted from thearticle's text

Two criterias:

Term frequency f : number of occurrences in the textDocument frequency df : number of documents where theterm occurs

Possibles scores:

Frequency fFrequency and document frequency f−dfFrequency and inverse document frequency f−idf

Other criterias (see paper)

Pierre Tirilly1, Vincent Claveau2 and Patrick Gros3 News Image Annotation

Page 10: News Image Annotation on a Large Parallel Text-Image Corpus€¦ · Introduction Image annotation ext-ImageT News corpus Experiments Conclusion News Image Annotation on a Large Parallel

IntroductionImage annotation

Text-Image News corpusExperimentsConclusion

Selection of annotation candidatesAnnotation using visual concepts

Annotation using visual concepts

Pierre Tirilly1, Vincent Claveau2 and Patrick Gros3 News Image Annotation

Page 11: News Image Annotation on a Large Parallel Text-Image Corpus€¦ · Introduction Image annotation ext-ImageT News corpus Experiments Conclusion News Image Annotation on a Large Parallel

IntroductionImage annotation

Text-Image News corpusExperimentsConclusion

Selection of annotation candidatesAnnotation using visual concepts

Annotation using visual concepts

Face detection: OpenCV detectorbased on Haar features and boosteddecision trees [Lienhart, 2002]

Logo detection: detector based onvisual words and Bayesian learning[Tirilly, 2010]

Pierre Tirilly1, Vincent Claveau2 and Patrick Gros3 News Image Annotation

Page 12: News Image Annotation on a Large Parallel Text-Image Corpus€¦ · Introduction Image annotation ext-ImageT News corpus Experiments Conclusion News Image Annotation on a Large Parallel

IntroductionImage annotation

Text-Image News corpusExperimentsConclusion

Selection of annotation candidatesAnnotation using visual concepts

Annotation using visual concepts

Pierre Tirilly1, Vincent Claveau2 and Patrick Gros3 News Image Annotation

Page 13: News Image Annotation on a Large Parallel Text-Image Corpus€¦ · Introduction Image annotation ext-ImageT News corpus Experiments Conclusion News Image Annotation on a Large Parallel

IntroductionImage annotation

Text-Image News corpusExperimentsConclusion

Selection of annotation candidatesAnnotation using visual concepts

Annotation using visual concepts

Pierre Tirilly1, Vincent Claveau2 and Patrick Gros3 News Image Annotation

Page 14: News Image Annotation on a Large Parallel Text-Image Corpus€¦ · Introduction Image annotation ext-ImageT News corpus Experiments Conclusion News Image Annotation on a Large Parallel

IntroductionImage annotation

Text-Image News corpusExperimentsConclusion

Selection of annotation candidatesAnnotation using visual concepts

Visual concepts suited to named entities

Faces ↔ Names of people

Logos ↔ Names of brands, products, events, organizations,�rms, institutions, artistic groups

Pierre Tirilly1, Vincent Claveau2 and Patrick Gros3 News Image Annotation

Page 15: News Image Annotation on a Large Parallel Text-Image Corpus€¦ · Introduction Image annotation ext-ImageT News corpus Experiments Conclusion News Image Annotation on a Large Parallel

IntroductionImage annotation

Text-Image News corpusExperimentsConclusion

Outline

1 Image annotationSelection of annotation candidatesAnnotation using visual concepts

2 Text-Image News corpus

3 Experiments

4 Conclusion

Pierre Tirilly1, Vincent Claveau2 and Patrick Gros3 News Image Annotation

Page 16: News Image Annotation on a Large Parallel Text-Image Corpus€¦ · Introduction Image annotation ext-ImageT News corpus Experiments Conclusion News Image Annotation on a Large Parallel

IntroductionImage annotation

Text-Image News corpusExperimentsConclusion

News corpus: document example

Pierre Tirilly1, Vincent Claveau2 and Patrick Gros3 News Image Annotation

Page 17: News Image Annotation on a Large Parallel Text-Image Corpus€¦ · Introduction Image annotation ext-ImageT News corpus Experiments Conclusion News Image Annotation on a Large Parallel

IntroductionImage annotation

Text-Image News corpusExperimentsConclusion

News corpus: properties

A document contains:

an article (text)one or several imagesone caption per imageone context description per image (facultative)

Corpus properties:

Large size (> 27,000 documents, > 42,000 captioned images)Real-world application (news articles)Texts illustrated by images (not only image descriptions)

Pierre Tirilly1, Vincent Claveau2 and Patrick Gros3 News Image Annotation

Page 18: News Image Annotation on a Large Parallel Text-Image Corpus€¦ · Introduction Image annotation ext-ImageT News corpus Experiments Conclusion News Image Annotation on a Large Parallel

IntroductionImage annotation

Text-Image News corpusExperimentsConclusion

Outline

1 Image annotationSelection of annotation candidatesAnnotation using visual concepts

2 Text-Image News corpus

3 Experiments

4 Conclusion

Pierre Tirilly1, Vincent Claveau2 and Patrick Gros3 News Image Annotation

Page 19: News Image Annotation on a Large Parallel Text-Image Corpus€¦ · Introduction Image annotation ext-ImageT News corpus Experiments Conclusion News Image Annotation on a Large Parallel

IntroductionImage annotation

Text-Image News corpusExperimentsConclusion

Groundtruth and evaluation metrics

Groundtruth = captions of images

→ a correct annotation appears in the image's caption

Evaluation measure = annotation precision (rate of correctannotations) and number of annotated images

one threshold to select annotation candidates → one couple(number of annotated images, annotation precision)various thresholds → recall-precision-like curves

Pierre Tirilly1, Vincent Claveau2 and Patrick Gros3 News Image Annotation

Page 20: News Image Annotation on a Large Parallel Text-Image Corpus€¦ · Introduction Image annotation ext-ImageT News corpus Experiments Conclusion News Image Annotation on a Large Parallel

IntroductionImage annotation

Text-Image News corpusExperimentsConclusion

Face annotation performance

f − idf not suited

→ images tend to content common faces

Pierre Tirilly1, Vincent Claveau2 and Patrick Gros3 News Image Annotation

Page 21: News Image Annotation on a Large Parallel Text-Image Corpus€¦ · Introduction Image annotation ext-ImageT News corpus Experiments Conclusion News Image Annotation on a Large Parallel

IntroductionImage annotation

Text-Image News corpusExperimentsConclusion

Logo annotation performance

f − idf not suited→ images tend to content common logos

Pierre Tirilly1, Vincent Claveau2 and Patrick Gros3 News Image Annotation

Page 22: News Image Annotation on a Large Parallel Text-Image Corpus€¦ · Introduction Image annotation ext-ImageT News corpus Experiments Conclusion News Image Annotation on a Large Parallel

IntroductionImage annotation

Text-Image News corpusExperimentsConclusion

Annotation examples

Libération 8 Gainsbourg 17CE 2 Nelson 2SCPL 2 Melody 2Rotschild 2 Birkin 2Société Civile des Hardy 2Personnels de Libération 1Le Monde 1Comité d'Entreprise 1

Pierre Tirilly1, Vincent Claveau2 and Patrick Gros3 News Image Annotation

Page 23: News Image Annotation on a Large Parallel Text-Image Corpus€¦ · Introduction Image annotation ext-ImageT News corpus Experiments Conclusion News Image Annotation on a Large Parallel

IntroductionImage annotation

Text-Image News corpusExperimentsConclusion

Outline

1 Image annotationSelection of annotation candidatesAnnotation using visual concepts

2 Text-Image News corpus

3 Experiments

4 Conclusion

Pierre Tirilly1, Vincent Claveau2 and Patrick Gros3 News Image Annotation

Page 24: News Image Annotation on a Large Parallel Text-Image Corpus€¦ · Introduction Image annotation ext-ImageT News corpus Experiments Conclusion News Image Annotation on a Large Parallel

IntroductionImage annotation

Text-Image News corpusExperimentsConclusion

Conclusion

New bimodal corpus

Texts and illustrative imagesLarge-scale and real-world application

High-level approach to image annotation

Open annotation vocabulary (named entities)NLP to acquire annotation vocabulary from textsUse of high level visual conceptsSimple technique, but provides good results

Pierre Tirilly1, Vincent Claveau2 and Patrick Gros3 News Image Annotation

Page 25: News Image Annotation on a Large Parallel Text-Image Corpus€¦ · Introduction Image annotation ext-ImageT News corpus Experiments Conclusion News Image Annotation on a Large Parallel

IntroductionImage annotation

Text-Image News corpusExperimentsConclusion

Future work

Improved named entity selection

Length normalizationRelative threshold

Use of other NLP techniques

Anaphora resolutionAcronym identi�cationNamed entity annotation

Extending our results to other corpora

Other news sourcesBlogs, Wikipedia. . .

Pierre Tirilly1, Vincent Claveau2 and Patrick Gros3 News Image Annotation

Page 26: News Image Annotation on a Large Parallel Text-Image Corpus€¦ · Introduction Image annotation ext-ImageT News corpus Experiments Conclusion News Image Annotation on a Large Parallel

IntroductionImage annotation

Text-Image News corpusExperimentsConclusion

News Image Annotation

on a Large Parallel Text-Image Corpus

Pierre Tirilly1, Vincent Claveau2 and Patrick Gros3

1Université de Rennes 1/IRISA � 2CNRS/IRISA3INRIA Rennes-Bretagne Atlantique

LREC - 05/20/2010 - Malta

Pierre Tirilly1, Vincent Claveau2 and Patrick Gros3 News Image Annotation