Visual instance mining of news videos using a graph-based approach

BY DAVID ALMENDROS GUTIÉRREZ. DIRECTED BY HORST EIDENBERGER AND XAVIER GIRÓ-I-NIETO. 2013-2014. VISUAL INSTANCE MINING OF NEWS VIDEOS USING A GRAPH-BASED APPROACH

Upload: xavier-giro

Post on 27-Jun-2015


Category:

Technology



DESCRIPTION

Full details: https://imatge.upc.edu/web/publications/visual-instance-mining-news-videos-using-graph-based-approach

Author: David Almendros-Gutiérrez
Advisors: Xavier Giró-i-Nieto (UPC) and Horst Eidenberger (TU Wien)
Degree: Telecommunications Engineering (5 years) at Telecom BCN-ETSETB (UPC)

The aim of this thesis is to design a tool that performs visual instance mining for news video summarization, that is, to extract the relevant content of a video so that the storyline of the news can be recognized. First, the video is sampled to obtain frames at a desired rate. Then, relevant content is detected in each frame, focusing on faces, text and several objects that the user can select. Next, a graph-based clustering method is used to recognize these instances with high accuracy and to select the most representative ones for the visual summary. In addition, a graphical user interface was developed in Wt to create an online demo for testing the application.

During development, the tool was tested with the CCMA dataset. We prepared a web-based survey based on four results from this dataset to gather the opinion of users. We also validated our visual instance mining results by comparing them with the results of a video summarization algorithm developed at Columbia University. The algorithm was run on a dataset of a few videos about two events: the 'Boston bombings' and the 'search of the Malaysian airlines flight'. We carried out a second web-based survey in which users could compare our approach with this related work. With these surveys we analyze whether our tool fulfills the requirements we set. We conclude that our system extracts visual instances that show the most relevant content of news videos and can be used to summarize these videos effectively.

TRANSCRIPT

Page 1: Visual instance mining of news videos using a graph-based approach

BY DAVID ALMENDROS GUTIÉRREZ

DIRECTED BY HORST EIDENBERGER AND XAVIER GIRÓ-I-NIETO

2013-2014

VISUAL INSTANCE MINING OF NEWS VIDEOS USING A GRAPH-BASED APPROACH

Page 2: Visual instance mining of news videos using a graph-based approach

CONTENTS

• Introduction
• State of the art
• Requirements analysis
• Developed solution
• Evaluation
• Future work

Page 3: Visual instance mining of news videos using a graph-based approach

CONTENTS

• Introduction
  • Motivation
• State of the art
• Requirements analysis
• Developed solution
• Evaluation
• Future work

Page 4: Visual instance mining of news videos using a graph-based approach


INTRODUCTION

Page 5: Visual instance mining of news videos using a graph-based approach


INTRODUCTION

Motivation

Manel Martos’s thesis (2013): “Content-based video summarization oriented to movie trailers”

Page 6: Visual instance mining of news videos using a graph-based approach


INTRODUCTION

News domain

• Websites

• News bulletins

• Newspaper

Page 7: Visual instance mining of news videos using a graph-based approach

CONTENTS

• Introduction
• State of the art
  • Visual instance mining
  • News summarization
• Requirements analysis
• Developed solution
• Evaluation
• Future work

Page 8: Visual instance mining of news videos using a graph-based approach


STATE OF THE ART

Visual instance mining

From a video

From a large collection of images

* Wei Zhang et al., “Scalable Visual Instance Mining with Threads of Features” (ACM Multimedia 2014)

Page 9: Visual instance mining of news videos using a graph-based approach


STATE OF THE ART

News summarization

News Rover *
Developed at Columbia University

* H. Li et al., “News Rover: exploring topical structures and serendipity in heterogeneous multimedia news” (ACM Multimedia 2013)

Page 10: Visual instance mining of news videos using a graph-based approach

CONTENTS

• Introduction
• State of the art
• Requirements analysis
  • Content requirements
  • Structural requirements
• Developed solution
• Evaluation
• Future work

Page 11: Visual instance mining of news videos using a graph-based approach


REQUIREMENTS ANALYSIS

Content requirements

Barack Obama, President of the USA

Núria Solé, anchorwoman of TV3 news

Flag

Fire truck

Page 12: Visual instance mining of news videos using a graph-based approach


REQUIREMENTS ANALYSIS

Structural requirements

Page 13: Visual instance mining of news videos using a graph-based approach

CONTENTS

• Introduction
• State of the art
• Requirements analysis
• Developed solution
  • Environment
  • System architecture overview
  • Temporal sampling
  • Instance detection
  • Graph-based selection
  • Presentation
• Evaluation
• Future work

Page 14: Visual instance mining of news videos using a graph-based approach


DEVELOPED SOLUTION

Environment

Page 15: Visual instance mining of news videos using a graph-based approach


DEVELOPED SOLUTION

System architecture overview

Page 16: Visual instance mining of news videos using a graph-based approach


DEVELOPED SOLUTION

Temporal sampling

• From the user’s desired frame rate
• Uniform sampling
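As an illustration of uniform temporal sampling at a user-chosen rate, the following minimal sketch computes which frame indices would be kept. The function and parameter names are assumptions made for this example; the slides do not show how the thesis implements this step.

```python
def uniform_sample_indices(total_frames, video_fps, desired_fps):
    """Return the indices of the frames kept by uniform temporal sampling.

    All names here are illustrative; they do not come from the thesis.
    """
    if total_frames < 0 or video_fps <= 0 or desired_fps <= 0:
        raise ValueError("frame count must be >= 0 and both rates must be > 0")
    # Number of source frames between two consecutive samples.
    # A video cannot be sampled faster than its own frame rate, so clamp to 1.
    step = max(1.0, video_fps / desired_fps)
    indices = []
    k = 0
    while int(k * step) < total_frames:
        indices.append(int(k * step))
        k += 1
    return indices


if __name__ == "__main__":
    # A 4-second clip at 25 fps, sampled at 1 frame per second.
    print(uniform_sample_indices(100, 25, 1))  # [0, 25, 50, 75]
```

Working on indices rather than decoded frames keeps the sketch independent of any video library; a real tool would pass these indices to its decoder.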

Page 17: Visual instance mining of news videos using a graph-based approach


DEVELOPED SOLUTION

Instance detection: face detection

• Viola & Jones algorithm
• detectMultiScale method
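The method named on the slide presumably refers to OpenCV's `CascadeClassifier.detectMultiScale`, which runs a trained Viola & Jones cascade at several scales. The sketch below does not reproduce that detector. It only illustrates, in plain Python, the two building blocks the algorithm relies on: the integral image and a two-rectangle Haar-like feature. All function names are hypothetical.

```python
def integral_image(img):
    """Summed-area table with one extra row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels inside a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Haar-like feature: left half of the window minus right half."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum


if __name__ == "__main__":
    image = [[1, 2, 3, 4],
             [5, 6, 7, 8]]
    table = integral_image(image)
    print(rect_sum(table, 0, 1, 2, 2))          # 2 + 3 + 6 + 7 = 18
    print(two_rect_feature(table, 0, 0, 2, 4))  # 14 - 22 = -8
```

Because every rectangle sum costs four lookups regardless of its size, a cascade can evaluate many such features per window, which is what makes multi-scale scanning affordable.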

Page 18: Visual instance mining of news videos using a graph-based approach


DEVELOPED SOLUTION

Object detection: SURF descriptors and matching

Training images

Page 19: Visual instance mining of news videos using a graph-based approach


DEVELOPED SOLUTION

1. Keypoints & SURF descriptors of training images

2. Keypoints & SURF descriptors of frames

3. Matching
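The matching step can be sketched as follows, assuming the descriptors have already been extracted. Nearest-neighbour matching with a ratio test is a common choice for SURF-like descriptors, but the slides do not say which matcher the thesis uses. The decision rule, which thresholds the fraction of matched training descriptors, is likewise an assumption for illustration, as are all names and default values.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def ratio_test_matches(train_desc, frame_desc, ratio=0.7):
    """Match each training descriptor to its nearest frame descriptor.

    A match is kept only if the nearest neighbour is clearly closer than
    the second nearest (the ratio value is an assumed default).
    """
    matches = []
    for i, d in enumerate(train_desc):
        dists = sorted((euclidean(d, f), j) for j, f in enumerate(frame_desc))
        if len(dists) >= 2 and dists[0][0] < ratio * dists[1][0]:
            matches.append((i, dists[0][1]))
    return matches


def is_detected(train_desc, frame_desc, detection_threshold=0.3, ratio=0.7):
    """Hypothetical decision: enough training descriptors found a match."""
    if not train_desc:
        return False
    matched = len(ratio_test_matches(train_desc, frame_desc, ratio))
    return matched / len(train_desc) >= detection_threshold


if __name__ == "__main__":
    training = [(0.0, 0.0), (10.0, 10.0)]
    frame = [(0.0, 0.1), (5.0, 5.0), (10.0, 10.1)]
    print(ratio_test_matches(training, frame))
    print(is_detected(training, frame))
```

Real SURF descriptors have 64 or 128 dimensions; the two-dimensional vectors above are toy data that only keep the example readable.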

Page 20: Visual instance mining of news videos using a graph-based approach


DEVELOPED SOLUTION

Heuristic decision

[Figure: two plots of the percentage of correct detections against the detection threshold, one for the test with ambulances and one for the test with police cars]

Page 21: Visual instance mining of news videos using a graph-based approach

DEVELOPED SOLUTION

Text detection: stroke-width-based algorithm

• Edge detection
• The stroke width of every pixel is computed
• Results

Page 22: Visual instance mining of news videos using a graph-based approach


DEVELOPED SOLUTION

Graph-based selection of representative instances

Pipeline stages

• Pre-processing
• Features extraction
• Similarity graph
• Clustering
• Selection

Pre-processing: increase the accuracy

• Original
• Grayscale
• Cropped
• Resized
• Equalized
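The pre-processing steps named on this slide (grayscale, crop, resize, equalize) can be sketched in plain Python as below. This is only an illustration on nested lists; the grayscale weights, the nearest-neighbour resizing and every function name are assumptions, since the slides do not specify them.

```python
def to_grayscale(rgb):
    """Rows of (r, g, b) tuples to rows of 0-255 gray levels (BT.601 weights)."""
    return [[int(round(0.299 * r + 0.587 * g + 0.114 * b)) for (r, g, b) in row]
            for row in rgb]


def crop(img, top, left, height, width):
    """Keep only the given rectangle, e.g. a detected bounding box."""
    return [row[left:left + width] for row in img[top:top + height]]


def resize_nearest(img, new_h, new_w):
    """Nearest-neighbour resize so that all instances share one size."""
    h, w = len(img), len(img[0])
    return [[img[y * h // new_h][x * w // new_w] for x in range(new_w)]
            for y in range(new_h)]


def equalize(img):
    """Histogram equalization of a 0-255 grayscale image."""
    hist = [0] * 256
    for row in img:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    cdf, running = [0] * 256, 0
    for level in range(256):
        running += hist[level]
        cdf[level] = running
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:  # constant image: nothing to stretch
        return [row[:] for row in img]
    return [[round((cdf[v] - cdf_min) * 255 / (total - cdf_min)) for v in row]
            for row in img]


if __name__ == "__main__":
    gray = to_grayscale([[(0, 0, 0), (255, 255, 255)]])
    print(gray)                                   # [[0, 255]]
    print(equalize([[10, 20], [30, 40]]))         # [[0, 85], [170, 255]]
    print(resize_nearest([[1, 2], [3, 4]], 4, 4))
```

Bringing every instance to the same size and contrast range before feature extraction makes the later histogram comparison less sensitive to lighting and scale.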

Page 23: Visual instance mining of news videos using a graph-based approach

DEVELOPED SOLUTION

Features extraction

• LBPH (Local Binary Patterns Histograms)

Histogram comparison

• Histogram intersection
• Chi-square distance

Similarity value (with α = 1)
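A minimal sketch of the feature and the two comparison measures named on this slide follows. It computes a single global LBP histogram, whereas LBPH implementations usually split the image into a grid of cells and concatenate the per-cell histograms. The chi-square distance is written in its symmetric form, which is one of several variants. The formula that combines the measures into the similarity value is not recoverable from the transcript, so it is not reproduced here. All names are illustrative.

```python
# Offsets of the 8 neighbours, clockwise from the top-left pixel.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, y, x):
    """8-bit local binary pattern of the pixel at (y, x)."""
    center = img[y][x]
    code = 0
    for bit, (dy, dx) in enumerate(NEIGHBOURS):
        if img[y + dy][x + dx] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """Normalized 256-bin histogram of LBP codes (border pixels skipped)."""
    hist = [0] * 256
    for y in range(1, len(img) - 1):
        for x in range(1, len(img[0]) - 1):
            hist[lbp_code(img, y, x)] += 1
    total = sum(hist)
    return [c / total for c in hist] if total else hist


def histogram_intersection(h1, h2):
    """Similarity in [0, 1] for normalized histograms; 1 means identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def chi_square_distance(h1, h2):
    """Symmetric chi-square distance; 0 means identical."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)


if __name__ == "__main__":
    flat = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
    peak = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
    h_flat, h_peak = lbp_histogram(flat), lbp_histogram(peak)
    print(histogram_intersection(h_flat, h_flat))  # 1.0
    print(histogram_intersection(h_flat, h_peak))  # 0.0
    print(chi_square_distance(h_flat, h_peak))     # 2.0
```

Note that the two measures point in opposite directions: intersection grows with similarity while chi-square shrinks, so any combined similarity value has to invert one of them.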

Page 24: Visual instance mining of news videos using a graph-based approach


DEVELOPED SOLUTION

Similarity graph (Full connectivity)

Node = Visual instance

Awn = Visual similarity

Page 25: Visual instance mining of news videos using a graph-based approach

DEVELOPED SOLUTION

Clustering by edge filtering

• Similarity value > threshold
• Subgraphs
• Heuristic decision
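Edge filtering on a fully connected similarity graph can be sketched as follows: edges whose similarity does not exceed the threshold are dropped, and the connected subgraphs that remain are taken as clusters. The function names, the toy similarity function and the threshold value are assumptions for illustration; the thesis sets its threshold heuristically.

```python
def build_similarity_graph(instances, similarity):
    """Fully connected graph as a symmetric weight matrix."""
    n = len(instances)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            w = similarity(instances[i], instances[j])
            weights[i][j] = weights[j][i] = w
    return weights


def cluster_by_edge_filtering(weights, threshold):
    """Connected components of the graph that keeps only edges > threshold."""
    n = len(weights)
    seen = [False] * n
    clusters = []
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        stack, component = [start], []
        while stack:  # depth-first traversal over the surviving edges
            node = stack.pop()
            component.append(node)
            for other in range(n):
                if (not seen[other] and other != node
                        and weights[node][other] > threshold):
                    seen[other] = True
                    stack.append(other)
        clusters.append(sorted(component))
    return clusters


if __name__ == "__main__":
    # Toy instances: one number each; similarity is 1 minus their distance.
    toy = [0.0, 0.1, 0.9, 1.0, 0.5]
    graph = build_similarity_graph(toy, lambda a, b: 1 - abs(a - b))
    print(cluster_by_edge_filtering(graph, 0.8))  # [[0, 1], [2, 3], [4]]
```

A higher threshold yields more and smaller clusters, a lower one merges instances that only loosely resemble each other, which is why the value has to be tuned on real data.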

Page 26: Visual instance mining of news videos using a graph-based approach


DEVELOPED SOLUTION

Selection of the representative visual instances

• Mutual reinforcement scores
• Number of nodes > threshold
• Time of appearance
• Heuristic decision
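One plausible reading of the selection step is sketched below. Mutual reinforcement is interpreted here as an iterative scoring in which a node is important when it is strongly similar to other important nodes of its cluster (a power iteration on the cluster's weight matrix). This interpretation, the function names and the `min_nodes` parameter are assumptions; the slides do not give the exact formulation, and the time-of-appearance criterion is left out.

```python
def mutual_reinforcement_scores(weights, nodes, iterations=50):
    """Score each node by its similarity to other well-scored nodes."""
    scores = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        updated = {n: sum(weights[n][m] * scores[m] for m in nodes if m != n)
                   for n in nodes}
        norm = sum(updated.values())
        if norm == 0:  # isolated node(s): keep the current scores
            break
        scores = {n: v / norm for n, v in updated.items()}
    return scores


def select_representatives(weights, clusters, min_nodes=2):
    """Pick the best-scored node of every cluster with more than min_nodes."""
    representatives = []
    for cluster in clusters:
        if len(cluster) <= min_nodes:  # "Number of nodes > threshold"
            continue
        scores = mutual_reinforcement_scores(weights, cluster)
        representatives.append(max(cluster, key=lambda n: scores[n]))
    return representatives


if __name__ == "__main__":
    w = [[0.0, 0.9, 0.8, 0.1],
         [0.9, 0.0, 0.5, 0.1],
         [0.8, 0.5, 0.0, 0.1],
         [0.1, 0.1, 0.1, 0.0]]
    # Node 0 is the most similar to the rest of its cluster.
    print(select_representatives(w, [[0, 1, 2], [3]]))  # [0]
```

Discarding small clusters removes instances that appear only once or twice, which are unlikely to matter for the storyline.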

Page 27: Visual instance mining of news videos using a graph-based approach


DEVELOPED SOLUTION

Presentation: online graphical user interface (GUI)

Developed with Wt

Initial design

Page 28: Visual instance mining of news videos using a graph-based approach


DEVELOPED SOLUTION

Final result of the GUI

Page 29: Visual instance mining of news videos using a graph-based approach

CONTENTS

• Introduction
• State of the art
• Requirements analysis
• System architecture overview
• Developed solution
• Evaluation
  • User study 1
  • User study 2
  • Conclusions
• Future work

Page 30: Visual instance mining of news videos using a graph-based approach


EVALUATION

User study 1

• 2 complementary web-based surveys
• 4 videos from the CCMA dataset
• 40 participants

Evaluation

• Redundancy
• Understanding
• Quality (Mean Opinion Score, MOS): 1. Unacceptable, 2. Poor, 3. Fair, 4. Good, 5. Excellent
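For reference, a Mean Opinion Score is the arithmetic mean of the ratings given by the participants. The short sketch below computes it from a count of votes per score. The vote counts in the example are invented for illustration and are not the survey's data.

```python
def mean_opinion_score(votes):
    """votes maps each score (1 to 5) to the number of participants who gave it."""
    total = sum(votes.values())
    if total == 0:
        raise ValueError("at least one vote is required")
    return sum(score * count for score, count in votes.items()) / total


if __name__ == "__main__":
    # Hypothetical distribution for 40 participants (not from the thesis).
    example_votes = {1: 0, 2: 2, 3: 10, 4: 20, 5: 8}
    print(round(mean_opinion_score(example_votes), 2))  # 3.85
```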

Page 31: Visual instance mining of news videos using a graph-based approach


EVALUATION

Visual summary 1

Visual summary 4

Page 32: Visual instance mining of news videos using a graph-based approach


EVALUATION

Redundancy

[Pie charts, one per visual summary, legend: YES / NO]

• Visual summary 1: 70% / 30%
• Visual summary 2: 70% / 30%
• Visual summary 3: 70% / 30%
• Visual summary 4: 48% / 53%

Page 33: Visual instance mining of news videos using a graph-based approach


EVALUATION

Understanding

Ranking  Keywords before watching the video  Keywords after watching the video
1        Puerto Rico                         Independence
2        Independence                        Puerto Rico
3        Political party                     Future
4        Election                            Voting
5        Opinion                             Political party

Ranking  Keywords before watching the video  Keywords after watching the video
1        Music                               New schedule
2        Catalunya Radio                     Novelty
3        Programming                         Catalunya Radio
4        Office                              Culture
5        Schedule                            Information

Page 34: Visual instance mining of news videos using a graph-based approach


EVALUATION

Quality

[Bar chart: number of participants per score rate (1 to 5) for each visual summary]

MOS1 = 3.8, MOS2 = 3.57, MOS3 = 3.6, MOS4 = 3.72

Page 35: Visual instance mining of news videos using a graph-based approach


EVALUATION

User study 2

• Web-based survey
• 2 well-known news events
  • 356 videos of “Boston Marathon bombings”
  • 406 videos of “Disappearance of the Malaysia Airlines flight”
• 55 participants

Evaluation

• Comparison with W. Zhang (ACM MM 2014)
• Quality (Mean Opinion Score, MOS): 1. Unacceptable, 2. Poor, 3. Fair, 4. Good, 5. Excellent

Page 36: Visual instance mining of news videos using a graph-based approach


EVALUATION

Boston Marathon bombings

W. Zhang (ACM MM 2014)

Our visual summary

Page 37: Visual instance mining of news videos using a graph-based approach


EVALUATION

[Bar chart: number of participants per score rate (1 to 5), W. Zhang (ACM MM 2014) versus our visual summary]

MOS = 2.2, MOS = 4.15

Page 38: Visual instance mining of news videos using a graph-based approach


EVALUATION

Disappearance of the Malaysia Airlines flight

W. Zhang (ACM MM 2014)

Our visual summary

Page 39: Visual instance mining of news videos using a graph-based approach


EVALUATION

[Bar chart: number of participants per score rate (1 to 5), W. Zhang (ACM MM 2014) versus our visual summary]

MOS = 2.56, MOS = 3.62

Page 40: Visual instance mining of news videos using a graph-based approach


EVALUATION

Conclusions

Pros

• Extracts relevant content
• Summarizes the news video
• Seems to be competitive with the state of the art

Cons

• Some redundancy remains
• Low accuracy of the object detection

Page 41: Visual instance mining of news videos using a graph-based approach

CONTENTS

• Introduction
• State of the art
• Requirements analysis
• System architecture overview
• Developed solution
• Evaluation
• Future work

Page 42: Visual instance mining of news videos using a graph-based approach


CONCLUSION

Future work

Improve the detection

Audio transcription

Content presentation

Interactive prototype

Page 43: Visual instance mining of news videos using a graph-based approach


THANK YOU VERY MUCH FOR YOUR ATTENTION