discriminative relevance feedback with virtual textual representation for efficient image retrieval...

Discriminative Relevance Feedback With Virtual Textual

Representation For Efficient Image Retrieval

Suman Karthik and C.V.Jawahar

Multimedia data is growing exponentially. Cheap high quality digital imaging devices

Sharing of multimedia data on the internet

Content based organization and retrieval is a viable way of accessing this data.

Introduction

What do users look for?

*In CBIR systems users look for ‘things’ not ‘stuff’. More than global image properties Traditional object recognition won’t work Two choices:

Rely on text to identify objects Look at regions (objects or parts of objects)

The region based image retrieval paradigm was successfully applied by Carson et al. in Blobworld[99-2004] and Wang et al. in SIMPLIcity[2001]

*Chad Carson: Blobworld

Practical CBIR systems

Practical large scale deployment of CBIR systems require. Efficient indexing and retrieval of thousands of

documents. Flexible framework for retrieval based on various

types of features. For example Specialized features for highly specific sub domains like faces, vehicles, monuments.

Ability to scale up to millions of images without a significant performance trade off.

Lessons From Text Retrieval

Large scale text retrieval systems have been successfully deployed. Search Engines like

Efficient indexing and retrieval of millions of documents has been achieved.

The text retrieval frameworks are adaptive enough to be applied to specialized domains.

Images as text documents.

Color (YUV), compactness and location of segment are used to encode the segment as text.

Transformation

Virtual Textual Representation

Document

Image

Words

Segments Text

Segmentation

Transformation

Feature SpaceBins represented

by strings or wordsQuantization

Color

Com

pactness

Positio

n

Transformation

Quantization can be achieved in a number of ways. Uniform vector space quantization for data set with

a uniform feature point distribution.

Density based quantization of the feature space can be achieved with simple k-means quantization.

Irrespective of the quantization applied each cell in the vector space has a representative string.

Each image segment is assigned to a cell and is assigned the representative string of the cell.

Discriminative Relevance Feedback Discriminative regions are

given higher weight than representative regions. Image segments that can

differentiate between roses and other flowers are given higher weight with respect to the class roses.

Regions aiding classification rather than clustering are chosen. Image segments containing

humans are able to differentiate between ‘surfer’ and ‘wave’ images.

Transformation

Intuitive way of learning content Over segmentation and subsequent deduction of content

can be achieved if the problem is modeled like this.Document

Image

Words

Segments Text

Segmentation

Perfromance

Discriminative relevance feedback consistently out performed Region based importance method.

Given are the precision data for discriminative relevance feedback and Bayesian region importance relevance feedback.

Indexing

CBIR systems usually use spatial databases to index and retrieve data. Blobworld uses variants or R-trees. *[Megan Thomas, Chad Carson,

Joseph M. Hellerstein Creating a Customized Access Method for Blobworld (2000) ICDE]

Relevance feedback skews the feature space rendering spatial databases inefficient. *[peng et al. Kernel Indexing for Relevance Feedback Image Retrieval]

Elastic Bucket Trie

Null

AB

C

A

B

A

B

B

Nodes

Buckets

CABCBA

OverflowSplitA B

QueryBBC

Retrieved Bucket

Insert

Spatial Data Structures

Become inefficient when used with relevance feedback.

Requires costly arithmetic operations.

Number of splits of the spatial data structure is not fixed

Strictly adheres to spatial characteristics of the feature space.

Spatial data structures VS EBT

Elastic Bucket Trie

Not effected by relevance feedback schemes.

Requires bit opertations

Number of splits of EBT is limited.

The trie need not adhere to an

underlying spatial structure though that is also possible.*

Suman Karthik, C.V. Jawahar, Efficient Region Based Indexing and Retrieval for Images with Elastic Bucket Tries, ICPR(2006)

Relevance feedback and EBT

Typical relevance feedback algorithms need to be modified to work with text.

Keywords emerge with relevance feedback signifying association between key segments.

EBT can be used without any modifications with discriminative relevance feedback.

Bag of words

The scheme is very similar to contemporary bag of words approaches.

Interest point based bag of words approaches can also be adapted to work within our framework.*

Any type of vector quantization of the feature space used by these schemes can use EBT.

*R. Fergus, L. Fei-Fei, P. Perona, and A. Zisserman. Learning object categories from google's image search. ICCV, 2005.

Future Work

Usage of heterogeneous strings to describe an image.

Text encoding of images that is highly distinctive.

Text encoding of images that is robust to segmentation inaccuracies.

Text based image mining to discover concepts and their features.

THE END

discriminative relevance feedback with virtual textual representation for efficient image retrieval...

Documents

text retrieval frameworks

image retrieval paradigm

text documents

retrieval of millions

image segments

retrieval of thousands

data set

precision data