

A Unified, Modular and Multimodal Approach to Search and Hyperlinking Video

Team: SOTON-WAIS (University of Southampton)

Client: MediaEval 2013, Search and Hyperlinking of Television Content

Members: John Preston, Jonathon Hare, Sina Samangooei, Jamie Davies, Neha Jain, David Dupplaw, Paul Lewis

Contact: Jonathon Hare, [email protected]

Powered by: OpenIMAJ (Intelligent Multimedia Analysis In Java)

Run type                   n.ret   P5     P10    P20    MAP
Subtitles only             5393    0.42   0.35   0.22   0.07
Transcript+concepts        7489    0.35   0.30   0.20   0.06
Transcript+concepts+LSH    7488    0.35   0.30   0.21   0.06

Search engine

A probability density function (PDF) is generated over the timeline of each programme, such that it indicates the instantaneous relevance of the programme to the query at each point in time.

Overall approach

Results

Search engine architecture

[Figure: search engine architecture, from QUERY through Programme #1 to Programme #4 to Ranked Programme Segments]

Segments are generated by finding high-probability portions of the PDF and integrating the PDF over each portion to get a score for ranking.
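The segment step can be sketched as follows. This is only an illustration under stated assumptions: the PDF is taken to be sampled on a regular time grid, "high probability" is taken to mean "above a fixed threshold", and the function name `extract_segments`, the `step` and the `threshold` are hypothetical, not details given on the poster.

```python
def extract_segments(pdf, step, threshold):
    """Return (start_time, end_time, score) tuples, best score first.

    pdf       -- list of relevance values sampled every `step` seconds (assumed)
    threshold -- hypothetical cut-off defining a "high probability" portion
    """
    segments = []
    start = None
    # The -inf sentinel closes a run that reaches the end of the timeline.
    for i, value in enumerate(list(pdf) + [float("-inf")]):
        if value > threshold and start is None:
            start = i
        elif value <= threshold and start is not None:
            # Rectangle-rule integral of the PDF over the run [start, i).
            score = sum(pdf[start:i]) * step
            segments.append((start * step, i * step, score))
            start = None
    return sorted(segments, key=lambda s: s[2], reverse=True)


timeline = [0.0, 0.2, 0.9, 0.8, 0.1, 0.0, 0.6, 0.7, 0.0]
ranked = extract_segments(timeline, step=1.0, threshold=0.5)
```

Here the two runs above the threshold become two segments, and the one with the larger integral is ranked first.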

Generating modules

Weighting modules

The search engine was built in a modular way. Different modules are run in turn, each modifying the PDF of each programme based on the query.

Generating modules increase the probability at specific points on a timeline by adding Gaussian distributions.
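One way to picture a generating module is as code that appends Gaussians to a per-programme timeline. The sketch below assumes the timeline is stored as an unnormalised sum of Gaussians; the class name `ProgrammeTimeline`, the method names and the default width of 10 seconds are hypothetical choices for illustration.

```python
import math


class ProgrammeTimeline:
    """Relevance curve of one programme, stored as a sum of Gaussians."""

    def __init__(self):
        self.gaussians = []  # (centre_seconds, height, width_seconds)

    def add_gaussian(self, centre, height, width=10.0):
        # `width` (standard deviation in seconds) is an illustrative default.
        self.gaussians.append((centre, height, width))

    def value_at(self, t):
        # Sum the contribution of every Gaussian at time t.
        return sum(h * math.exp(-((t - c) ** 2) / (2 * w ** 2))
                   for c, h, w in self.gaussians)


timeline = ProgrammeTimeline()
timeline.add_gaussian(centre=120.0, height=1.0)  # a hit at 2 minutes
timeline.add_gaussian(centre=125.0, height=0.5)  # a weaker hit nearby
```

Nearby hits reinforce each other, so a cluster of hits produces a higher peak than an isolated one.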

Weighting modules change the overall PDF of a programme by scaling any added Gaussian functions up or down.
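A weighting module can then be sketched as a function that rescales every Gaussian of one programme. The tuple layout `(centre, height, width)` and the name `apply_weight` are assumptions made for this example.

```python
def apply_weight(gaussians, factor):
    """Scale the height of every Gaussian on one programme's timeline."""
    return [(centre, height * factor, width)
            for centre, height, width in gaussians]


programme = [(120.0, 1.0, 10.0), (300.0, 0.5, 10.0)]
boosted = apply_weight(programme, 2.0)     # programme looks more relevant
suppressed = apply_weight(programme, 0.0)  # programme drops out of the ranking
```

Because only the heights change, the positions of the peaks on the timeline are preserved while the segment scores move up or down together.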

Transcript search

Visual concept search

Near-duplicate keyframes

Run type                   MRR    mGAP   MASP
Subtitles only             0.21   0.10   0.11
Transcript+concepts        0.15   0.07   0.06
Transcript+concepts+LSH    0.12   0.07   0.05

Search Task

Hyperlinking Task

Synopsis weighting

Title weighting

Channel name weighting

If terms from the title and/or synopsis of the programme match the query, the overall weight is increased, giving the segments from that programme a higher probability and ranking position.

If the query mentions a channel (e.g. "BBC Three") and the programme was not broadcast on that channel, its PDF is weighted to near zero and the programme will not appear in the ranked results.
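The title, synopsis and channel name weighting can be combined into a single scale factor per programme. The sketch below is an assumption-laden illustration: the channel list, the boost of 1.5, the suppression factor of 1e-6 and the whitespace tokenisation are all hypothetical, and the poster does not state the actual values.

```python
KNOWN_CHANNELS = ("bbc one", "bbc two", "bbc three", "bbc four")  # illustrative


def metadata_weight(query, title, synopsis, channel,
                    boost=1.5, suppress=1e-6):
    """Return a scale factor for one programme's PDF (values are assumptions)."""
    q = query.lower()
    terms = set(q.split())
    weight = 1.0
    # Title and synopsis weighting: boost once per field sharing a query term.
    for field in (title, synopsis):
        if terms & set(field.lower().split()):
            weight *= boost
    # Channel name weighting: push programmes from other channels to near zero.
    mentioned = [c for c in KNOWN_CHANNELS if c in q]
    if mentioned and channel.lower() not in mentioned:
        weight *= suppress
    return weight


same = metadata_weight("cooking on bbc three", "Cooking Masterclass",
                       "A chef explains pastry", "BBC Three")
other = metadata_weight("cooking on bbc three", "Cooking Masterclass",
                        "A chef explains pastry", "BBC One")
```

A real system would likely remove stop words before matching; that step is omitted here to keep the example short.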

The transcript is searched for terms in the query, and each hit generates a new Gaussian at the corresponding point on the timeline, with the height of the Gaussian proportional to the inverse document frequency (IDF) of the term.
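A minimal sketch of the transcript search module follows. It assumes a transcript is a list of `(time_seconds, word)` pairs and uses the plain `log(N / df)` form of IDF, which is one common variant; the poster does not say which variant was used, and the function names are hypothetical.

```python
import math


def idf(term, documents):
    """Inverse document frequency of `term` over tokenised transcripts."""
    containing = sum(1 for doc in documents if term in doc)
    return math.log(len(documents) / containing) if containing else 0.0


def transcript_hits(query, transcript, documents):
    """Return (time_seconds, height) for each query term found in a transcript.

    transcript -- list of (time_seconds, word) pairs (assumed format)
    documents  -- one set of words per programme, used for the IDF statistics
    """
    terms = set(query.lower().split())
    return [(time, idf(word, documents))
            for time, word in transcript if word in terms]


documents = [{"the", "queen", "visits"},
             {"the", "weather", "today"},
             {"the", "football", "match"}]
transcript = [(1.0, "the"), (2.5, "queen"), (3.0, "visits")]
hits = transcript_hits("the queen", transcript, documents)
```

Each returned pair gives the centre and height of one Gaussian to add to the timeline. A term that occurs in every programme, such as "the" here, receives a height of zero and so contributes nothing.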

Visual concepts inferred from the query are matched against each programme, causing a Gaussian to be placed on the timeline for every hit.

Similar keyframes are matched using a technique based on locality-sensitive hashing (LSH). Similar keyframes generate Gaussians at the respective time, with height proportional to similarity.
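The poster's figure labels mention random Gaussian p-stable hash functions, a 128-bit LSH sketch and 4 x 32-bit partitions. The sketch below shows one plausible way those pieces fit together for a single feature vector. Taking one bit per hash function from the parity of the bucket index, the bucket width of 4.0 and the function names are assumptions for illustration, not confirmed details of the system.

```python
import math
import random

BITS, PARTS = 128, 4  # 128-bit sketch, 4 partitions of 32 bits


def make_hashes(dim, width=4.0, seed=0):
    """Draw BITS random Gaussian projections with offsets (p-stable, p = 2)."""
    rng = random.Random(seed)
    return [([rng.gauss(0.0, 1.0) for _ in range(dim)],
             rng.uniform(0.0, width), width)
            for _ in range(BITS)]


def sketch(vector, hashes):
    """Return the sketch as PARTS integers of BITS // PARTS bits each."""
    bits = []
    for a, b, w in hashes:
        projection = sum(x * y for x, y in zip(a, vector))
        bucket = math.floor((projection + b) / w)
        bits.append(bucket % 2)  # one bit per hash function (an assumption)
    size = BITS // PARTS
    return [int("".join(map(str, bits[i:i + size])), 2)
            for i in range(0, BITS, size)]


hashes = make_hashes(dim=128)  # SIFT descriptors have 128 dimensions
feature = [float(i % 7) for i in range(128)]
keys = sketch(feature, hashes)
```

Each of the four integers can serve as the key into its own hash table, so two features need to agree on only one 32-bit partition to collide.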

[Figure: near-duplicate keyframe matching. Panel labels: images; difference-of-Gaussian interest points; SIFT features; random Gaussian p-stable hash functions; 128-bit LSH sketching; 4 x 32-bit partitions; hash tables; graph of images; search engine]

Any collision in a hash table results in a graph edge between the colliding images.

The search engine looks up the vertex corresponding to the query and returns the connected vertices ordered by edge weight.
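The graph construction and lookup can be sketched as below. For brevity the example assumes one list of partition keys per image and defines the edge weight as the number of hash-table collisions between two images; both are simplifying assumptions, and the names `build_graph` and `neighbours` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(sketches):
    """sketches: {image_id: [partition keys]} -> {image: {neighbour: weight}}."""
    tables = defaultdict(list)  # (partition index, key) -> images in that bucket
    for image, keys in sketches.items():
        for index, key in enumerate(keys):
            tables[(index, key)].append(image)
    graph = defaultdict(lambda: defaultdict(int))
    for bucket in tables.values():
        for a, b in combinations(bucket, 2):
            # Edge weight = number of collisions (an assumed definition).
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def neighbours(graph, query):
    """Connected vertices of `query`, heaviest edge first."""
    edges = graph.get(query, {})
    return sorted(edges, key=edges.get, reverse=True)


sketches = {"a": [1, 2, 3, 4], "b": [1, 2, 9, 9],
            "c": [7, 2, 8, 8], "d": [0, 0, 0, 0]}
graph = build_graph(sketches)
result = neighbours(graph, "a")
```

Image "b" collides with "a" in two tables and "c" in one, so "b" is returned first; "d" collides with nothing and has no neighbours.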

Visual concepts are inferred from the query. Near-duplicate keyframes are used for query expansion once the initial query has been performed.

Visual concepts are extracted from the anchor. Anchor keyframes are used to find near-duplicate targets. Anchor subtitles are used as the query text for the transcript search module.