Laboratory I

MPEG-7 implementation exercises

Ing. Marco Bertini

Exercise objectives

•  Implement a CBIR system based on MPEG-7 features

•  Experiment with voting algorithms

•  Experiment with approximate retrieval

Working tools

•  Use the LIRE open source library (Java)
   –  use version 0.8 or the more recent 0.9 alpha
   –  get it from: http://www.semanticmetadata.net/

•  Use the UCID v2 dataset for testing.

•  Use of Eclipse IDE is strongly encouraged.

Task 1

•  Using LIRE and a sample skeleton program, implement a CBIR system that uses:
   –  CLD (Color Layout Descriptor)
   –  SCD (Scalable Color Descriptor)
   –  EHD (Edge Histogram Descriptor)
   and their (weighted) combinations. The user should be able to select which features are to be used.
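One possible way to realise the weighted combination is a weighted mean of per-feature distances. The sketch below is only illustrative: the class `WeightedFusion`, its method, and the sample distances are hypothetical, no LIRE call is used, and it assumes that each descriptor's distance has already been normalised to [0, 1] so that the features are comparable.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of LIRE): fuses per-feature distances with user-chosen weights.
class WeightedFusion {

    // distances: feature name -> distance of one database image from the query
    //            (assumed to be normalised to [0, 1] beforehand).
    // weights:   feature name -> weight; a feature that is absent or has weight <= 0 is not used.
    static double combine(Map<String, Double> distances, Map<String, Double> weights) {
        double weightedSum = 0.0;
        double weightTotal = 0.0;
        for (Map.Entry<String, Double> w : weights.entrySet()) {
            Double d = distances.get(w.getKey());
            if (d == null || w.getValue() <= 0.0) {
                continue; // feature not computed, or deselected by the user
            }
            weightedSum += w.getValue() * d;
            weightTotal += w.getValue();
        }
        if (weightTotal == 0.0) {
            throw new IllegalArgumentException("no usable feature selected");
        }
        return weightedSum / weightTotal; // weighted mean distance
    }

    public static void main(String[] args) {
        // Made-up distances of one image from the query, one per descriptor.
        Map<String, Double> distances = new LinkedHashMap<>();
        distances.put("CLD", 0.20);
        distances.put("SCD", 0.40);
        distances.put("EHD", 0.60);

        // The user selects CLD (weight 2) and EHD (weight 1); SCD is left out.
        Map<String, Double> weights = new LinkedHashMap<>();
        weights.put("CLD", 2.0);
        weights.put("EHD", 1.0);

        System.out.println(combine(distances, weights)); // (2*0.2 + 1*0.6) / 3
    }
}
```

Sorting the database images by this combined distance, smallest first, gives the fused result list; leaving a feature out of the weight map is how the user's feature selection is expressed here.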

•  Test on UCID dataset and evaluate Precision/Recall/AP using ground truth data.
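The evaluation measures can be sketched in plain Java as follows. This assumes that the ground truth for a query is available as a set of relevant image identifiers and that the result list contains no duplicates; the class name `RetrievalMetrics` and the sample identifiers are hypothetical, and how the UCID ground truth file is parsed is left out.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: precision, recall and average precision for one query.
class RetrievalMetrics {

    // Number of relevant images among the first min(k, ranked.size()) results.
    private static int hitsAt(List<String> ranked, Set<String> relevant, int k) {
        int cut = Math.min(k, ranked.size());
        int hits = 0;
        for (int i = 0; i < cut; i++) {
            if (relevant.contains(ranked.get(i))) {
                hits++;
            }
        }
        return hits;
    }

    // Precision at cut-off k: fraction of the inspected results that are relevant.
    static double precisionAt(List<String> ranked, Set<String> relevant, int k) {
        int cut = Math.min(k, ranked.size());
        return cut == 0 ? 0.0 : (double) hitsAt(ranked, relevant, k) / cut;
    }

    // Recall at cut-off k: fraction of all relevant images found in the inspected results.
    static double recallAt(List<String> ranked, Set<String> relevant, int k) {
        return relevant.isEmpty() ? 0.0 : (double) hitsAt(ranked, relevant, k) / relevant.size();
    }

    // Average precision: precision measured at every rank holding a relevant image,
    // summed and divided by the total number of relevant images.
    static double averagePrecision(List<String> ranked, Set<String> relevant) {
        if (relevant.isEmpty()) {
            return 0.0;
        }
        int hits = 0;
        double sum = 0.0;
        for (int i = 0; i < ranked.size(); i++) {
            if (relevant.contains(ranked.get(i))) {
                hits++;
                sum += (double) hits / (i + 1);
            }
        }
        return sum / relevant.size();
    }

    public static void main(String[] args) {
        List<String> ranked = Arrays.asList("a", "x", "b", "y", "c");
        Set<String> relevant = new HashSet<>(Arrays.asList("a", "b", "c"));
        System.out.println(precisionAt(ranked, relevant, 3));   // 2 of the first 3 are relevant
        System.out.println(recallAt(ranked, relevant, 3));      // 2 of the 3 relevant images found
        System.out.println(averagePrecision(ranked, relevant)); // (1/1 + 2/3 + 3/5) / 3, about 0.756
    }
}
```

Averaging `averagePrecision` over all test queries gives the mean average precision of a configuration, which makes different feature combinations directly comparable.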

Task 2

•  Modify the software developed in Task 1 to experiment with voting and rank fusion algorithms. Perform content-based image retrieval (CBIR) using single features, then merge the ranked lists using:
   –  Borda count
   –  Rank product
   –  Inverted Rank Position

Borda count

•  The Borda count is a single-winner election method in which voters rank candidates in order of preference.

•  The Borda count determines the winner of an election by giving each candidate a number of points corresponding to the position in which he or she is ranked by each voter. Once all votes have been counted, the candidate with the most points is the winner.
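Applied to CBIR, each feature acts as a voter and each database image as a candidate. A minimal sketch under stated assumptions: a list of m images awards m points to its first entry, m-1 to the second, and so on; an image missing from a list gets no points from it; ties are broken alphabetically. The class name `BordaCount` and the three sample lists are illustrative only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: merges ranked lists by Borda count (most points first).
class BordaCount {

    static List<String> fuse(List<List<String>> rankedLists) {
        Map<String, Integer> points = new HashMap<>();
        for (List<String> list : rankedLists) {
            int size = list.size();
            for (int pos = 0; pos < size; pos++) {
                // First place earns `size` points, last place earns 1 point.
                points.merge(list.get(pos), size - pos, Integer::sum);
            }
        }
        List<String> fused = new ArrayList<>(points.keySet());
        // Sort by descending points; alphabetical order makes ties deterministic.
        fused.sort((x, y) -> {
            int byPoints = Integer.compare(points.get(y), points.get(x));
            return byPoints != 0 ? byPoints : x.compareTo(y);
        });
        return fused;
    }

    public static void main(String[] args) {
        // Three sample ranked lists, one per feature.
        List<List<String>> lists = Arrays.asList(
                Arrays.asList("a", "b", "c", "e", "d"),
                Arrays.asList("d", "a", "c", "e", "b"),
                Arrays.asList("b", "a", "c", "e", "d"));
        // Points: a = 13, b = 10, c = 9, d = 7, e = 6
        System.out.println(fuse(lists)); // prints [a, b, c, d, e]
    }
}
```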

Rank product

•  Rank product is a simple non-parametric statistical method based on ranks of fold changes. It was derived from DNA microarray analysis, where it is used to detect differentially regulated genes in replicated experiments.

Filled circles represent the ranks of one gene in the different replicates. The rank product for this gene would be (2×1×4×2)^(1/4) = 2.
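For rank fusion, the replicates are replaced by the per-feature ranked lists, and the rank product of an image is the geometric mean of its rank positions (a smaller value means a better overall rank). A small sketch; the class name `RankProduct` is hypothetical, and the computation goes through logarithms only to avoid overflow when many ranks are multiplied.

```java
// Hypothetical helper: rank product = geometric mean of the 1-based ranks of one item.
class RankProduct {

    static double of(int... ranks) {
        if (ranks.length == 0) {
            throw new IllegalArgumentException("at least one rank is required");
        }
        double logSum = 0.0;
        for (int r : ranks) {
            if (r < 1) {
                throw new IllegalArgumentException("ranks are 1-based");
            }
            logSum += Math.log(r);
        }
        // exp(mean of logs) equals the k-th root of the product of k ranks.
        return Math.exp(logSum / ranks.length);
    }

    public static void main(String[] args) {
        // The gene from the slide: ranks 2, 1, 4, 2 in four replicates.
        System.out.println(of(2, 1, 4, 2)); // approximately 2.0
        // An image ranked 1st, 2nd and 2nd by three features.
        System.out.println(of(1, 2, 2));    // cube root of 4, approximately 1.587
    }
}
```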

Inverted Rank Position

•  The Inverted Rank Position algorithm merges the multi-feature similarity lists into a single overall similarity ranking list. For a given image, it uses the inverse of the sum of the inverses of the image's rank positions in the individual feature similarity ranking lists.

q = query image, i = {db images}

464 M. Jovic et al.

Rank   Color feature      Shape feature      Texture feature    Final image
       similarity list    similarity list    similarity list    similarity list
1      a                  d                  b                  a
2      b                  a                  a                  b
3      c                  c                  c                  d
4      e                  e                  e                  c
5      d                  b                  d                  e

Fig. 2. Ordering of the first five retrieved images based on the color-shape-texture features merged by the Inverse Rank Position Algorithm

votes etc.). Finally, for each database image, all the votes from all of the three feature similarity ranking lists are summed up and the image with the highest number of votes is ranked as the most relevant to the query image, winning the election.

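$$\mathrm{BC}(q,i) = \sum_{\text{feature similarity}=1}^{n} \text{rank position}_{\text{feature similarity}} \tag{5}$$

$$\text{feature similarity} \in \{\mathrm{CFSRL}, \mathrm{SFSRL}, \mathrm{TFSRL}\};\quad i \in \{a, b, c, d, e\};\quad n = 3 \tag{6}$$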

Example. According to the sample feature similarity ranking lists given in 2, the overall similarity ranking of the images {a, b, c, d, e} with respect to the query image q is calculated as follows:

$$\mathrm{BC}(a) = 5;\quad \mathrm{BC}(b) = 8;\quad \mathrm{BC}(c) = 9;\quad \mathrm{BC}(d) = 11;\quad \mathrm{BC}(e) = 12 \tag{7}$$

⇒ e > d > c > b > a, meaning that image a is the most relevant image to the query q, image b is the next most relevant, etc. (Fig. 3).
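The values in (7) can be checked by writing out the sum in (5) for each image, with the three terms being the image's rank positions in CFSRL, SFSRL and TFSRL from (1), in that order:

```latex
\begin{aligned}
\mathrm{BC}(a) &= 1 + 2 + 2 = 5 \\
\mathrm{BC}(b) &= 2 + 5 + 1 = 8 \\
\mathrm{BC}(c) &= 3 + 3 + 3 = 9 \\
\mathrm{BC}(d) &= 5 + 1 + 5 = 11 \\
\mathrm{BC}(e) &= 4 + 4 + 4 = 12
\end{aligned}
```

Because (5) sums rank positions rather than awarded votes, the image with the smallest sum is the most relevant one, which is why a heads the final list.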

Leave Out Algorithm (LO) is a third algorithm to merge the multi-feature similarity lists into a single overall similarity ranking list. The elements are inserted into the final similarity ranking list circularly from the three feature similarity ranking lists (see Algorithm 1). Repeating elements from the feature similarity ranking lists are not inserted into the final similarity ranking list if they already appear there. The order in which the next element is selected from the feature similarity ranking lists for insertion into the final similarity ranking list can be arbitrary and will therefore influence the retrieval precision. In the experimental part, the

Image Retrieval Based on Similarity Score Fusion 463

uniquely determined. The number of features used for the image representation determines the number of feature similarity lists. The integrated ranking of the multi-feature similarity ranking lists is then determined by the fusion algorithms. Such a framework is illustrated in Fig. 1(b). The fusion is done in such a way as to optimize retrieval performance. Three feature similarity score fusion algorithms are proposed. As for the image feature representation, color, shape and texture image features are chosen. The color feature is represented by the color moments [3]. The shape feature is represented by the edge-direction histogram [4]. The texture feature is represented by the texture neighborhood [9]. Color feature similarity is measured by the weighted Euclidean distance [2], while shape and texture feature similarity are measured with the help of the city-block distance.

Let us, for a given query image q and with respect to all database images, define three feature similarity ranking lists: the color feature similarity ranking list (CFSRL), the shape feature similarity ranking list (SFSRL) and the texture feature similarity ranking list (TFSRL). Next, let us assume that at the top five positions of CFSRL, SFSRL and TFSRL, the images with identifiers {a, b, c, d, e} are ordered as follows:

$$\mathrm{CFSRL} = (a, b, c, e, d);\quad \mathrm{SFSRL} = (d, a, c, e, b);\quad \mathrm{TFSRL} = (b, a, c, e, d) \tag{1}$$

Inverse Rank Position Algorithm (IRP) is a first algorithm to merge the multi-feature similarity lists into a single overall similarity ranking list. The inverse of the sum of inverses of the feature similarity rank scores for each individual feature for a given image from relevant feature similarity ranking lists is used (3).

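$$\mathrm{IRP}(q,i) = \frac{1}{\sum_{\text{feature similarity}=1}^{n} \frac{1}{\text{rank position}_{\text{feature similarity}}}} \tag{2}$$

$$\text{feature similarity} \in \{\mathrm{CFSRL}, \mathrm{SFSRL}, \mathrm{TFSRL}\};\quad i \in \{a, b, c, d, e\};\quad n = 3 \tag{3}$$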

Example. According to the sample feature similarity ranking lists given in 2, the overall similarity ranking of the images {a, b, c, d, e} with respect to the query image q is calculated as follows:

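$$\mathrm{IRP}(a) = \frac{1}{2};\quad \mathrm{IRP}(b) = \frac{10}{17};\quad \mathrm{IRP}(c) = 1;\quad \mathrm{IRP}(d) = \frac{5}{7};\quad \mathrm{IRP}(e) = \frac{4}{3} \tag{4}$$

⇒ e > c > d > b > a, meaning that image a is the most relevant image to the query q, image b is the next most relevant, etc. (Fig. 2).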
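Equation (2) translates almost directly into code. The sketch below is illustrative only: the class name `InverseRankPosition` is hypothetical, ranks are taken as 1-based positions, and every image is assumed to occur in every list (the paper excerpt does not say how missing images are handled). It uses the sample lists from (1).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper: merges ranked lists with the Inverse Rank Position score of Eq. (2).
class InverseRankPosition {

    // IRP score of one image: 1 / (sum over all lists of 1 / rank position).
    static double score(String image, List<List<String>> rankedLists) {
        double sumOfInverses = 0.0;
        for (List<String> list : rankedLists) {
            int index = list.indexOf(image);
            if (index < 0) {
                throw new IllegalArgumentException(image + " is missing from a list");
            }
            sumOfInverses += 1.0 / (index + 1); // index is 0-based, rank is 1-based
        }
        return 1.0 / sumOfInverses;
    }

    // Final ranking: images sorted by ascending IRP score (smallest score = most relevant).
    static List<String> fuse(List<List<String>> rankedLists) {
        List<String> fused = new ArrayList<>(rankedLists.get(0));
        fused.sort(Comparator.comparingDouble((String img) -> score(img, rankedLists)));
        return fused;
    }

    public static void main(String[] args) {
        List<List<String>> lists = Arrays.asList(
                Arrays.asList("a", "b", "c", "e", "d"),  // CFSRL
                Arrays.asList("d", "a", "c", "e", "b"),  // SFSRL
                Arrays.asList("b", "a", "c", "e", "d")); // TFSRL
        System.out.println(score("a", lists)); // 1 / (1/1 + 1/2 + 1/2) = 0.5
        System.out.println(fuse(lists));       // prints [a, b, d, c, e]
    }
}
```

The fused order matches the final image similarity ranking list shown in Fig. 2.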

Borda Count Algorithm (BC), taken from social theory in voting [16], is a second algorithm to merge the multi-feature similarity lists into a final overall similarity ranking list. An image with the highest rank on each of the feature similarity ranking lists (in an n-way vote) gets n votes. Each subsequent image gets one vote less (so that the number two gets n-1 votes, number three n-2

Bibliography

•  Mladen Jović, Yutaka Hatakeyama, Fangyan Dong and Kaoru Hirota. Image Retrieval Based on Similarity Score Fusion from Feature Similarity Ranking Lists. Fuzzy Systems and Knowledge Discovery, Lecture Notes in Computer Science, Volume 4223, 2006, Pages 461-470. DOI: 10.1007/11881599_54

•  Rainer Breitling, Patrick Armengaud, Anna Amtmann and Pawel Herzyk. Rank products: a simple, yet powerful, new method to detect differentially regulated genes in replicated microarray experiments. FEBS Letters, Volume 573, Issues 1-3, 27 August 2004, Pages 83-92. DOI: 10.1016/j.febslet.2004.07.055
