rich internet application for semi-automatic annotation of semantic shots on keyframes

36
Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes Elisabet Carcel, Manuel Martos, Xavier Giró-i-Nieto, Ferran Marqués 1 1 Wednesday, December 14, 11

Upload: xavier-giro

Post on 03-Jul-2015

72 views

Category:

Technology


1 download

DESCRIPTION

Authors: Elisabet Carcel, Manuel Martos, Xavier Giró-i-Nieto and Ferran Marqués Details: https://imatge.upc.edu/web/publications/rich-internet-application-semi-automatic-annotation-semantic-shots-keyframes This paper describes a system developed for the semi- automatic annotation of keyframes in a broadcasting company. The tool aims at assisting archivists who traditionally label every keyframe manually by suggesting them an automatic annotation that they can intuitively edit and validate. The system is valid for any domain as it uses generic MPEG-7 visual descriptors and binary SVM classifiers. The classification engine has been tested on the multiclass problem of semantic shot detection, a type of metadata used in the company to index new con- tent ingested in the system. The detection performance has been tested in two different domains: soccer and parliament. The core engine is ac- cessed by a Rich Internet Application via a web service. The graphical user interface allows the edition of the suggested labels with an intuitive drag and drop mechanism between rows of thumbnails, each row representing a different semantic shot class. The system has been described as complete and easy to use by the professional archivists at the company.

TRANSCRIPT

Page 1: Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes

Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes

Elisabet Carcel, Manuel Martos, Xavier Giró-i-Nieto, Ferran Marqués

1

1Wednesday, December 14, 11

Page 2: Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes

Outline

Graphical User Interface

Motivation

Requirements

Conclusions

Semantic Shot Classifier

Design

Future work

Evaluation

Design

Video-demo

2

2Wednesday, December 14, 11

Page 3: Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes

Motivation

Digital content:

Multimedia content must be annotated to allow its future retrieval.

RetrievalIngestAV Production

Video recording in digital format

Storage and video indexing in a database system

Search through the metadata to retrieve video content

3

3Wednesday, December 14, 11

Page 4: Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes

DIGITION(MAM) Media Asset Management

Motivation

DIGITIONMedia Asset Management

Asset

02_14_20

4

4Wednesday, December 14, 11

Page 5: Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes

0.056%

99.944%

Daily retrieval in 2010

DIGITION(MAM) Media Asset Management

Repository Ingest

Retrieval

Motivation

0

80,000

160,000

240,000

320,000

400,000

2010 2011 2012

25,000

25,000

25,000

185,000160,000

135,000

StoredIngest

Video hours per year

Ingest and Retrieval

Non-retrieved content75h retrieved

5

5Wednesday, December 14, 11

Page 6: Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes

Motivation

Current solution Proposed new approach

Metadata by strata

• Start time code• End time code• Content description - Action -Semantic shots - People

Non-annotated assets

Annotated assets

GUI for semi-automatic

annotation

6

6Wednesday, December 14, 11

Page 7: Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes

Outline

Graphical User Interface

Motivation

Requirements

Conclusions

Semantic Shot Classifier

Design

Future work

Evaluation

Design

Video-demo

7

7Wednesday, December 14, 11

Page 8: Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes

Semantic Shot Classifier

• Annotation of the extracted keyframes

Semantic Shot

Medium Shot+

Player

Medium Shot+

Box

RequirementsPla semàntic

Semantic ConceptShot Type

8

8Wednesday, December 14, 11

Page 9: Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes

• President close-up

• Close-up

• Medium

• General bureau

• General chamber

• Player close-up

• Player medium

• Stadium overview

• Audience

• Banner

• Box

Requirements

Football Parlament

9

9Wednesday, December 14, 11

Page 10: Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes

Graphical User Interface

• Embeddable in existing system (Digition)

• User friendly

• Domain-generic interface

Requirements

10

10Wednesday, December 14, 11

Page 11: Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes

Outline

Graphical User InterfaceRequirements

Conclusions

Semantic Shot Classifier

Design

Future work

Evaluation

Design

Video-demo

Motivation

11

11Wednesday, December 14, 11

Page 12: Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes

Supervised Machine Learning

Design

Training

Detection

=????

0.23

0.98

>Thr

Decision

Yes

No

?

?

?

?

12

12Wednesday, December 14, 11

Page 13: Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes

System ArchitectureTraining:

Test:

>thr

Decision

Yes

No

MAXconfidence

?

Confidence 1

Confidence 2

Confidence N

Classifier 2

Classifier N

Classifier 1

Domain model

MAX class

Unknown

SVM Trainer

Model class 1

Model class 2

Model class N

Design

SVM Trainer

SVM Trainer

13

13Wednesday, December 14, 11

Page 14: Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes

Annotation

Football datasetFootball datasetFootball dataset

Class 1 Medium 116

Class 2 Closer Up 87

Class 3 Stadium 4

Class 4 Audience 19

Class 5 Banner 4

Class 6 Box 6

Class 7 Unknown 278

514

Parliament datasetParliament datasetParliament dataset

Class 1 President CU 16Class 2 Close-up 63Class 3 Medium 68Class 4 Bureau 13Class 5 Chamber 23Class 6 Unknown 84

276

Design

GAT annotation tool: http://upseek.upc.edu/gat

14

14Wednesday, December 14, 11

Page 15: Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes

Trainer parameters

Unsupervised clustering to support multi-view: -MIN # elements / cluster -MAX cluster radius

Design

Detector parameters

Confidence threshold

>Thr.

Decision

Yes

NoConfidence Unknown

15

15Wednesday, December 14, 11

Page 16: Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes

Outline

Graphical User InterfaceRequirements

Conclusions

Semantic Shot Classifier

Design

Future work

Evaluation

Design

Video-demo

Motivation

16

16Wednesday, December 14, 11

Page 17: Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes

Quality measure: F1-score

Evaluation

Class 1 Class 2 Class 3

Class1

Class 2

Class 3

4 1 0

0 3 0

2 0 3

Automatic

Manua l

17

17Wednesday, December 14, 11

Page 18: Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes

Quality measure: F1-score

Evaluation

Class 1 Class 2 Class 3

Class 1

Class 2

Class 3

4 1 0

0 3 0

2 0 3

Automatic

Manua l

18

18Wednesday, December 14, 11

Page 19: Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes

Cross Validation with Random Partition

• Training

- 80% of positive instances (+)

- Positive instances from other classes are

considered negative (-)

Evaluation

• Detection

- 20% of the positive instances (+)

19

19Wednesday, December 14, 11

Page 20: Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes

3 parameters to tune the classifier:

• MAX radius for a multi-view cluster • MIN elements for a multi-view cluster• Confidence threshold

Threshold (0.0 - 1.0)

Evaluation

Cross-validationiterations (3)

Max radius(0.1 - 1.0)

Min elements (0 - 5)

F1 measure

20

20Wednesday, December 14, 11

Page 21: Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes

Results for Parliament domain

• Optimal parameters

– Threshold: 0.9 – MIN # elements: 3 – MAX radius: 0.8

7. Evaluation

7.3.2. Catalan Parliament optimum results

The maximum mean F1 measure is 0.9720188262233014 and corresponds to the fol-lowing values of the three variables:

– Minimum score: 0.9

– Minimum number of elements: 3

– Maximum radius: 0.8

Figure 7.4 shows the mean F1 measure for all combinations computed of minimum ele-ments and maximum distance with 0.9 minimum score (optimum).

Min Max radius

elements 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0

1 0.718 0.962 0.952 0.949 0.958 0.970 0.957 0.957 0.949 0.946

2 0.715 0.962 0.955 0.952 0.952 0.962 0.951 0.951 0.966 0.965

3 0.769 0.948 0.968 0.960 0.960 0.957 0.959 0.972 0.944 0.958

4 0.763 0.955 0.965 0.962 0.956 0.963 0.968 0.966 0.954 0.964

5 0.719 0.970 0.966 0.947 0.958 0.959 0.958 0.961 0.953 0.967

Figure 7.4.: Catalan Parliament F1 measures for 0.9 minimum score

48

Evaluation

21

21Wednesday, December 14, 11

Page 22: Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes

• “Unknown” labelled as “Close up”: 7.73% error

• Semantic shots labelled as “Unknown”: 1.27% error

F1 Measure

• Close Up president: 0.98• Close Up: 0.95• Medium: 0.99• General bureau: 0.98• General chamber: 0.99• Unknown: 0.94

Cromo

Cromo PP

Classe3

CU Presi.

CU Presi.

Medium

Bureau

Chamber

Unknown

93 0 0 0 0 3

0 376 0 0 0 2

0 0 405 0 0 3

0 0 0 75 0 3

0 0 0 0 135 3

0 39 0 0 0 465

Automatic

Manua l

CU Presi

CU Med Bur Cha Un know

Evaluation

22

22Wednesday, December 14, 11

Page 23: Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes

Results for Football domain

• Optimal parameters

– Threshold: 0.7 – MIN # elements: 1 – MAX radius: 0.8

Evaluation

7. Evaluation

7.3.1. Soccer matches optimum results

The maximum mean F1 measure is 0.8599863416562412 and corresponds to the fol-lowing values of the three variables:

– Minimum score: 0.7

– Minimum number of elements: 1

– Maximum radius: 0.8

Figure 7.1 shows the mean F1 measure for all combinations computed of minimum ele-ments and maximum distance with 0.7 minimum score (optimum).

Min Max radius

elements 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0

1 0.385 0.616 0.691 0.703 0.822 0.734 0.771 0.860 0.829 0.714

2 0 0 0.838 0.546 0.858 0.827 0.828 0.816 0.832 0.659

3 0 0 0.853 0.784 0.764 0.785 0.770 0.818 0.833 0.727

4 0 0 0 0 0 0 0 0 0 0

5 0 0 0 0 0 0 0 0 0 0

Figure 7.1.: Soccer match F1 measures for 0.7 minimum score

44

23

23Wednesday, December 14, 11

Page 24: Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes

Cromo Cromo PP

Classe3

3

Cromo

Cromo PP

C.Beauty

Public

Pancarta

Llotja

Unknown

629 1 0 0 0 0 33

4 482 0 0 0 0 36

1 0 20 0 0 0 3

0 0 0 94 0 0 20

0 0 0 0 18 0 6

0 1 0 0 0 21 14

461 3 0 0 0 0 1204

Automatic

Manua l

Cromo Cromo PP

Cromo Beauty

Public Panc. Llotja Unknow

7.3. Results

The confusion matrix of the optimum combination can be seen on Table 7.3. From thisconfusion matrix the F1 measure, precision and recall for each class are extracted andrepresented on Figure 7.2.

AutomaticClass1 Class2 Class3 Class4 Class5 Class6 Class7

Class1 652 11 0 0 0 0 33Class2 4 482 0 0 0 0 36

Manual Class3 1 0 20 0 0 0 3Class4 0 0 0 94 0 0 20Class5 0 0 0 0 18 0 6Class6 0 1 0 0 0 21 14Class7 461 3 0 0 0 0 1204

Table 7.3.: Best soccer match confusion matrix instances

Class1 Class2 Class3 Class4 Class5 Class6 Class7Precision 0.58 0.97 1.0 1.0 1.0 1.0 0.91Recall 0.93 0.92 0.83 0.82 0.75 0.58 0.72

F1 measure 0.72 0.95 0.91 0.90 0.86 0.74 0.81

Figure 7.2.: Precision, recall and F1 measure bar graph for each soccer match class

45

F1 Measure

• Close-UP: 0.72

• Medium: 0.95

• C.Beauty: 0.91

• Public: 0.90

• Banner: 0.86

• Box: 0.74

• Unknown: 0.81

Evaluation

7. Evaluation

When analizing the confusion matrix it can be seen which classes have been detected asother classes. Figure 7.3 shows all classes which presents some detection problem, theclass that causes that problem and the percentage error of the confusion.

Manual Automatic % Error

Cromo PP Cromo 0.76%

Cromo Cromo PP 1.58%

Cromo Beauty Cromo 4.16%

Llotja Cromo PP 2.77%

Cap pla Cromo 27.63%

Cap pla Cromo PP 0.17%

Totes les classes Cap pla 7.9%

Figure 7.3.: Soccer match detection errors

46

24

24Wednesday, December 14, 11

Page 25: Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes

Outline

Graphical User InterfaceRequirements

Conclusions

Semantic Shot Classifier

Design

Future work

Evaluation

Design

Video-demo

Motivation

25

25Wednesday, December 14, 11

Page 26: Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes

Design

26

26Wednesday, December 14, 11

Page 27: Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes

Web services

• List of keyframes given an asset ID

• Keyframe classifier

System architecture

Data exchange

JSON Format

00_00_09_24 00_00_39_25 00_00_54_48 00_00_59_41

Asset 1257

Class 3Score: 0.83

Class 3Score 0.75

Class 5Score 0.46

Class 2Score 0.98

00_00_39_25FootballMIN elemsMAX radius

00_00_54_48FootballMIN elemsMAX radius

00_00_59_41FootballMIN elemsMAX radius

00_00_09_24FootballMIN elemsMAX radius

Design

27

27Wednesday, December 14, 11

Page 28: Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes

Outline

Graphical User InterfaceRequirements

Conclusions

Semantic Shot Classifier

Design

Future work

Evaluation

Design

Video-demo

Motivation

28

28Wednesday, December 14, 11

Page 29: Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes

Video-demo

29

29Wednesday, December 14, 11

Page 30: Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes

Video-demo

Parliament domain: http://youtu.be/R7o0sCoe-Gs

30

30Wednesday, December 14, 11

Page 31: Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes

Video-demo

Football domain: http://youtu.be/uURx9GoRJBg

31

31Wednesday, December 14, 11

Page 32: Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes

Outline

Graphical User InterfaceRequirements

Conclusions

Semantic Shot Classifier

Design

Future work

Evaluation

Design

Video-demo

Motivation

32

32Wednesday, December 14, 11

Page 33: Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes

• Web service to use new annotations to re-train

• Use semantic shot information to launch additional analysis

- Synthetic and natural signs -- Text recognition

- Close ups -- Face recognition

- Medium Shot Parliament -- Speech recognition

Future work

�����+,���������������������������

( ���� �������

&��������������9���������������,�������������������!���������������������������

������������������� �����������������0��1�� �������������������������������������������

5��������� ��� � ���� ���� ��������� ����� ������������ � ����������������� ���������������� �

���������������I������MT��&�������������������������������+����������������������������

������������������������������������������������������������������������������

�( � �� ����� ������������

�����������������$����������������������������������������������������������

����!��������!�����������!������������������������������������������������������������

��������������������������������

7� � �� � ��� � �������� � ��� � �� � ��� � �������� � �� � �������� � �� � ��� � �����������

��������������������������9*"��&��!��������������������-�������������������������

��������������������������������������������������������������=������������

��������������������������������������������������������������������������-�������

���� ����������������� � ������������������������������������������������������� ���

(�

��������*�1 � ������ �� )�33

33Wednesday, December 14, 11

Page 34: Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes

Outline

Graphical User InterfaceRequirements

Conclusions

Semantic Shot Classifier

Design

Future work

Evaluation

Design

Video-demo

Motivation

34

34Wednesday, December 14, 11

Page 35: Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes

Conclusions

Semantic shot classifier

• Multiclass classification combining binary classifiers

- Positive labels in a class become negative in others

- Grid search for classifier parameters

• Generic framework for multiple domains

• Results good enough for exploitation

Graphical User Interface

• Uses classifier through web services

• Flexible and user friend:

- Sorting by score

- Threshold can be manually changed

- Drag and drop

- Validation by page or shot type

- Keyframe highlight

• Annotated keyframes are valid for:

- Training

- Retrieval

35

35Wednesday, December 14, 11

Page 36: Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes

Outline

Graphical User InterfaceRequirements

Conclusions

Semantic Shot Classifier

Design

Future work

Evaluation

Design

Video-demo

Motivation

Thank you for you attention

Follow our research blog: http://bitsearch.blogspot.com

36

36Wednesday, December 14, 11