Semantic Multimedia Analysis and Search - Future SOC Symposium 2013

Symposium on Future Trends in Service-Oriented Computing, HPI Potsdam, 20-21.06.2013. Semantic Multimedia Analysis and Search. Dr. Harald Sack, Hasso-Plattner-Institut for IT-Systems Engineering, University of Potsdam. Potsdam, 21/06/2013 (Friday, 21 June 2013)

Upload: harald-sack

Post on 16-Jan-2015

DESCRIPTION

Keynote at the FutureSOC Symposium 2013 at HPI, Potsdam, 20-21.06.2013

TRANSCRIPT

Page 1: Semantic Multimedia Analysis and Search - Future SOC Symposium 2013

Symposium on Future Trends in Service-Oriented Computing, HPI Potsdam, 20-21.06.2013

Semantic Multimedia Analysis and Search

Dr. Harald Sack

Hasso-Plattner-Institut for IT-Systems Engineering

University of Potsdam

Potsdam, 21/06/2013

Friday, 21 June 2013

Page 2: Semantic Multimedia Analysis and Search - Future SOC Symposium 2013

Harald Sack, Hasso-Plattner-Institute for IT-Systems Engineering, LDW 2011, Magdeburg, 30. Sep. 2011

Semantic Multimedia Analysis and Search

• Searching Multimedia: Web vs. Archive

• How to Open Up Multimedia Data? Automated Multimedia Analysis

• How to Determine the Meaning of (Multimedia) Metadata? Context-Driven Semantic Analysis

• How to Make Use of Semantic Metadata? Exploratory Search and Intelligent Recommendations

Page 3: Semantic Multimedia Analysis and Search - Future SOC Symposium 2013

Lecture Semantic Web, Dr. Harald Sack, Hasso-Plattner-Institut, University of Potsdam

Searching the Web

Page 4: Semantic Multimedia Analysis and Search - Future SOC Symposium 2013

Searching the Web

Page 5: Semantic Multimedia Analysis and Search - Future SOC Symposium 2013


Page 6: Semantic Multimedia Analysis and Search - Future SOC Symposium 2013

Google Knowledge Graph

= “search results with semantic-search information gathered from a wide variety of sources”

Page 7: Semantic Multimedia Analysis and Search - Future SOC Symposium 2013

Harald Sack, Hasso-Plattner-Institute for IT-Systems Engineering, Workshop ‘Corporate Semantic Web’, XInnovations 2011, Berlin, 19 Sep. 2011

Google Multimedia Search

Page 8: Semantic Multimedia Analysis and Search - Future SOC Symposium 2013

How does Google find Multimedia?

‣ Google Multimedia Search relies on text-based metadata and link context

Page 9: Semantic Multimedia Analysis and Search - Future SOC Symposium 2013

Search by Media Content

Page 10: Semantic Multimedia Analysis and Search - Future SOC Symposium 2013

The Ordinary Archive is a Small World...

Neil Armstrong

Page 11: Semantic Multimedia Analysis and Search - Future SOC Symposium 2013

But wouldn’t it be nice if...

Neil Armstrong

Sorry, no results found for ‘Neil Armstrong’...

...but maybe you are also interested in

- Buzz Aldrin (1 video)
- John Glenn (1 video)
- Juri Gagarin (2 videos)
- Richard Nixon (3 videos)
- Apollo 11 (1 video)
- NASA (20 videos)
- Moon (14 videos)
- space exploration (34 videos)
- technology (1,205 videos)

Page 12: Semantic Multimedia Analysis and Search - Future SOC Symposium 2013

How to Search in Multimedia Archives?

Page 13: Semantic Multimedia Analysis and Search - Future SOC Symposium 2013

vfm seminar: Metadata Management in Media Companies (Metadatenmanagement in Medienunternehmen), 5 September 2012, Bonn; Jörg Waitelonis, Hasso-Plattner-Institut Potsdam

Content-Based Search in Multimedia Archives relies on text-based Metadata

Current Solution: Manual Annotation

Page 14: Semantic Multimedia Analysis and Search - Future SOC Symposium 2013

(Selected) Automated Media Analysis

• Visual Analysis (text / images): Visual Concept Detection, Text Recognition, Face Detection, Logo Detection

• Audio Mining (audio): Automated Speech Recognition, audio event detection

• Structural analysis (audio-visual)

Page 15: Semantic Multimedia Analysis and Search - Future SOC Symposium 2013

Structural Video Analysis

• Decomposition of time-based media into meaningful media fragments of coherent content that can be used as basic elements for indexing and classification

Hierarchy: video > scenes > shots > subshots > frames, with keyframes as representative frames
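The decomposition into shots can be illustrated with a minimal sketch. It assumes that hard cuts show up as large differences between the grey-level histograms of consecutive frames, which is a common baseline and not necessarily the method used in the talk. The function names, the bin count and the threshold are hypothetical choices for illustration.

```python
from typing import List, Sequence, Tuple


def histogram(frame: Sequence[int], bins: int = 8, max_value: int = 256) -> List[float]:
    """Normalised grey-level histogram of one frame (a flat, non-empty list of pixel values)."""
    counts = [0] * bins
    for value in frame:
        counts[value * bins // max_value] += 1
    total = len(frame)
    return [c / total for c in counts]


def shot_boundaries(frames: Sequence[Sequence[int]], threshold: float = 0.5) -> List[int]:
    """Indices i where a hard cut is assumed between frame i-1 and frame i."""
    cuts = []
    previous = histogram(frames[0])
    for i in range(1, len(frames)):
        current = histogram(frames[i])
        # L1 distance between normalised histograms lies in [0, 2].
        distance = sum(abs(a - b) for a, b in zip(previous, current))
        if distance > threshold:
            cuts.append(i)
        previous = current
    return cuts


def split_into_shots(frames: Sequence[Sequence[int]], threshold: float = 0.5) -> List[Tuple[int, int, int]]:
    """Return (first_frame, last_frame, keyframe) per shot; the keyframe is the middle frame."""
    cuts = shot_boundaries(frames, threshold)
    starts = [0] + cuts
    ends = cuts + [len(frames)]
    return [(s, e - 1, (s + e - 1) // 2) for s, e in zip(starts, ends)]


# Toy video: two dark frames followed by three bright frames.
toy_video = [[10] * 16, [12] * 16, [200] * 16, [205] * 16, [198] * 16]
shots = split_into_shots(toy_video)
```

Gradual transitions (fades, dissolves) and the grouping of shots into scenes need more than this per-frame comparison; the sketch only covers the shot and keyframe levels of the hierarchy.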

Page 16: Semantic Multimedia Analysis and Search - Future SOC Symposium 2013

Video Optical Character Recognition (OCR)

Fig. 1. Workflow of the proposed text detection method. (b) is the vertical edge map of (a). (c) is the vertical dilation map of (b). (d) is the binary map of (c). (e) is the result map of the subsequent connected component analysis. (f) shows the binary map after the adaptive projection profile refinement. (g) is the final detection result.

for text detection of natural scene images. The operator computes for each pixel the width of the most likely stroke containing the pixel. The output of the operator is a stroke-feature map, which has the same size as the input image, while each pixel represents the corresponding stroke width value of the input image.

3. TEXT DETECTION IN VIDEO IMAGES

Text detection is the first task of video OCR. Our approach determines whether a single frame of a video file contains text lines, for which a tight bounding box is returned. In order to manage detected text lines efficiently, we have defined a class "text line object" with the following properties: bounding box location (the top-left corner position) and bounding box size. After the first round of text detection, the refinement and the verification procedures ensure the validity of the detection results in order to reduce false alarms.

3.1. Text detector

Before performing the text detection process, a Gaussian smoothing filter is applied to the images that have an entropy value larger than a predefined threshold T_entr. For our purpose, T_entr = 5.25 has proven to work best.

We have developed an edge-based text detector, subsequently referred to as the edge text detector. The advantage of our detector is its computational efficiency compared to other machine learning based approaches, because no computationally expensive training period is required. However, for visually different video sequences a parameter adaptation has to be performed. The best-suited parameter combination for our method was learned from test runs on the given test data.

Fig. 2. Workflow of the proposed adaptive text line refinement procedure

The processing workflow for a single frame is depicted in Fig. 1 (a-e). First, a vertical edge map is produced using the Sobel filter [8] (cf. Fig. 1 (b)). Then, the morphological dilation operation is adopted to link the vertical character edges together (cf. Fig. 1 (c)). Let MinW denote the detected minimal text line width. A rectangle kernel of size 1 × MinW is defined for the vertical dilation operator. Subsequently, a binary mask is generated by using Otsu's thresholding method [9]. Ultimately, we create a binary map after Connected Component

• Video OCR is much more difficult than traditional print OCR
• fast detection/filtering of text candidates
• verification of text candidates
• script separation from background
• visual quality enhancement
• application of standard OCR software
• spell correction w.r.t. context and temporal redundancy
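Two steps of the workflow quoted above, the vertical edge map and the binarisation with Otsu's method, can be sketched in plain Python. This is only an illustration under simplifying assumptions: images are small 2-D lists of grey values in 0..255, the dilation and connected component steps are left out, and the function names are hypothetical rather than taken from the paper.

```python
from typing import List


def vertical_edge_map(img: List[List[int]]) -> List[List[int]]:
    """Absolute horizontal Sobel gradient, clipped to 255; border pixels stay 0."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            out[y][x] = min(255, abs(gx))
    return out


def otsu_threshold(values: List[int], levels: int = 256) -> int:
    """Threshold t that maximises the between-class variance; pixels > t are foreground."""
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(i * n for i, n in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        between = w0 * w1 * (m0 - m1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def binary_edge_mask(img: List[List[int]]) -> List[List[int]]:
    """Vertical edge map binarised with Otsu's threshold (1 = strong vertical edge)."""
    edges = vertical_edge_map(img)
    t = otsu_threshold([v for row in edges for v in row])
    return [[1 if v > t else 0 for v in row] for row in edges]


# Toy frame: dark left half, bright right half, i.e. one vertical edge in the middle.
toy_frame = [[0, 0, 0, 200, 200, 200] for _ in range(5)]
mask = binary_edge_mask(toy_frame)
```

In a full pipeline the mask would next be dilated with the 1 × MinW kernel and passed to connected component analysis to obtain candidate text line boxes.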

Page 17: Semantic Multimedia Analysis and Search - Future SOC Symposium 2013

Video Face Detection, Tracking & Clustering

• Face Detection: Detect candidate image regions in a video frame that depict a human face

• Face Tracking: Track a detected face in video over consecutive frames within shot boundaries

• Face Clustering: Group faces detected and tracked in videos into visually similar sets within a single video

• Face Recognition/Identification: Reliable identification of detected faces

Example detections: person, frontal face: 90%; not a person; person, profile face: 70%
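The tracking step can be sketched as follows, assuming that a face detector already delivers bounding boxes per frame and that a face moves only slightly between consecutive frames of a shot. Linking boxes by intersection over union (IoU) is a simple stand-in and not necessarily the tracker used in the talk; the function names and the 0.5 threshold are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    ih = max(0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union else 0.0


def track_faces(detections_per_frame: List[List[Box]], min_iou: float = 0.5):
    """Greedily link detections of consecutive frames into tracks of (frame_index, box)."""
    tracks = []   # all tracks found so far
    active = []   # indices of tracks that were extended in the previous frame
    for frame_index, boxes in enumerate(detections_per_frame):
        unmatched = list(boxes)
        next_active = []
        for t in active:
            last_box = tracks[t][-1][1]
            best = max(unmatched, key=lambda b: iou(last_box, b), default=None)
            if best is not None and iou(last_box, best) >= min_iou:
                tracks[t].append((frame_index, best))
                unmatched.remove(best)
                next_active.append(t)
        for box in unmatched:  # every unmatched detection starts a new track
            tracks.append([(frame_index, box)])
            next_active.append(len(tracks) - 1)
        active = next_active
    return tracks


# One face drifting slightly, then a detection somewhere else entirely.
toy_detections = [[(10, 10, 20, 20)], [(12, 11, 20, 20)], [(100, 100, 20, 20)]]
toy_tracks = track_faces(toy_detections)
```

The resulting tracks would be the input for clustering: one feature vector per track instead of one per frame reduces the number of items that have to be grouped.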

Page 18: Semantic Multimedia Analysis and Search - Future SOC Symposium 2013

Visual Concept Detection

• Adaptation of the traditional ‘Bag of Words’ approach from text retrieval

• Image is expressed as a vector (histogram) of dictionary codeword frequencies

• Concept assignment (classification) via machine learning (here: Support Vector Machines)
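The histogram representation can be sketched in a few lines. The example assumes that local descriptors have already been extracted and that a codebook (dictionary of visual words) is given; both are tiny made-up 2-D vectors here, and the function names are hypothetical. Training the Support Vector Machine on the resulting histograms is not shown.

```python
from typing import List, Sequence


def nearest_codeword(descriptor: Sequence[float], codebook: Sequence[Sequence[float]]) -> int:
    """Index of the codeword with the smallest squared Euclidean distance."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((d - c) ** 2 for d, c in zip(descriptor, codebook[k])),
    )


def bow_histogram(descriptors: Sequence[Sequence[float]], codebook: Sequence[Sequence[float]]) -> List[float]:
    """Normalised histogram of codeword frequencies for one image."""
    counts = [0] * len(codebook)
    for descriptor in descriptors:
        counts[nearest_codeword(descriptor, codebook)] += 1
    total = sum(counts) or 1  # avoid division by zero for images without descriptors
    return [c / total for c in counts]


# Hypothetical codebook with two visual words and four descriptors of one image.
toy_codebook = [(0.0, 0.0), (10.0, 10.0)]
toy_descriptors = [(1.0, 0.0), (9.0, 9.0), (11.0, 10.0), (0.0, 1.0)]
image_vector = bow_histogram(toy_descriptors, toy_codebook)
```

Each image is thus reduced to a fixed-length vector regardless of how many descriptors it contains, which is what makes a standard classifier applicable.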

Page 19: Semantic Multimedia Analysis and Search - Future SOC Symposium 2013

Annotation of Audiovisual Data

Metadata Extraction

Metadata (e.g. MPEG-7)

...
<SpatialDecomposition>
  <TextAnnotation>
    <KeywordAnnotation>
      <Keyword>Astronaut</Keyword>
    </KeywordAnnotation>
  </TextAnnotation>
  <SpatialMask>
    <SubRegion>
      <Polygon>
        <Coords> 480 150 620 480 </Coords>
      </Polygon>
    </SubRegion>
  </SpatialMask>
  ...
</SpatialDecomposition>
...
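A short sketch of how such an annotation could be read programmatically, using Python's standard `xml.etree.ElementTree`. The fragment below repeats the simplified slide example with the elided parts removed; real MPEG-7 documents carry XML namespaces and many more elements, so the element paths and the `read_annotation` helper are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Simplified fragment from the slide; not a complete or namespace-qualified MPEG-7 document.
FRAGMENT = """
<SpatialDecomposition>
  <TextAnnotation>
    <KeywordAnnotation>
      <Keyword>Astronaut</Keyword>
    </KeywordAnnotation>
  </TextAnnotation>
  <SpatialMask>
    <SubRegion>
      <Polygon>
        <Coords> 480 150 620 480 </Coords>
      </Polygon>
    </SubRegion>
  </SpatialMask>
</SpatialDecomposition>
"""


def read_annotation(xml_text: str):
    """Return the keywords and the coordinate list of the first <Coords> element."""
    root = ET.fromstring(xml_text)
    keywords = [k.text.strip() for k in root.iter("Keyword")]
    coords = [int(v) for v in root.find(".//Coords").text.split()]
    return keywords, coords


keywords, coords = read_annotation(FRAGMENT)
```

The keyword gives the annotation its textual handle for search, while the coordinates tie it to a region of the frame.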

• Multimedia data with spatiotemporal Annotations

Neil Armstrong

Page 21: Semantic Multimedia Analysis and Search - Future SOC Symposium 2013

How to Determine the Meaning of Metadata?

• Authoritative Metadata
  • structured data
  • semi-structured data
  • natural language text

• Non-authoritative Metadata
  • (free) user tags and comments
  • restricted vocabularies

• (Media) Analysis Metadata
  • low level features
  • high level features

• etc.

Semantic Analysis: reliability, context, pragmatics, location dependency, accuracy, time dependency, level of abstraction

Page 22: Semantic Multimedia Analysis and Search - Future SOC Symposium 2013

Neil Armstrong

Astronaut

is a

Person

is a

Science Occupation

subClassOf

Employment

subClassOf

Entities

Ontologies

has an

‘Neil Armstrong’ is more than just a character string

Cosmonaut

same as

Juri Gagarin

is a

is NOT a

!
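What "more than just a character string" buys can be sketched with a tiny class hierarchy. The relations below are one plausible reading of the diagram labels (Neil Armstrong is an Astronaut and a Person, Astronaut is a subclass of Science Occupation, which is a subclass of Employment); the exact edges of the original figure are not recoverable from the transcript, so they are assumptions, as are the function names.

```python
from typing import Dict, List

# Assumed reading of the slide's diagram; not taken verbatim from the source.
IS_A: Dict[str, List[str]] = {"Neil Armstrong": ["Astronaut", "Person"]}
SUBCLASS_OF: Dict[str, List[str]] = {
    "Astronaut": ["Science Occupation"],
    "Science Occupation": ["Employment"],
}


def superclasses(cls: str) -> List[str]:
    """All direct and indirect superclasses of a class (transitive closure)."""
    found: List[str] = []
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in SUBCLASS_OF.get(current, []):
            if parent not in found:
                found.append(parent)
                stack.append(parent)
    return found


def types_of(entity: str) -> List[str]:
    """Asserted classes of an entity plus everything inferred via subClassOf."""
    result: List[str] = []
    for cls in IS_A.get(entity, []):
        for candidate in [cls] + superclasses(cls):
            if candidate not in result:
                result.append(candidate)
    return result


def entities_of_type(cls: str) -> List[str]:
    """Entities that belong to a class, including inferred membership."""
    return [entity for entity in IS_A if cls in types_of(entity)]


armstrong_types = types_of("Neil Armstrong")
```

A plain keyword index would never return a video tagged "Neil Armstrong" for the query "Science Occupation"; with the hierarchy, `entities_of_type("Science Occupation")` does.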

Page 23: Semantic Multimedia Analysis and Search - Future SOC Symposium 2013

Where does the knowledge come from...?

Page 24: Semantic Multimedia Analysis and Search - Future SOC Symposium 2013

Astronaut Person

Neil Armstrong

Science Occupation

Employment

is a is a

is a

is a has a

Web of Data

Page 25: Semantic Multimedia Analysis and Search - Future SOC Symposium 2013

Web of Data = Linked Open Data

But what if there is no trivial unique identification?

Armstrong (user tag)

Page 26: Semantic Multimedia Analysis and Search - Future SOC Symposium 2013

Semantic Web Technologies, Dr. Harald Sack, Hasso Plattner Institute, University of Potsdam

Armstrong

Page 27: Semantic Multimedia Analysis and Search - Future SOC Symposium 2013

Armstrong

Armstrong + Moon

Page 28: Semantic Multimedia Analysis and Search - Future SOC Symposium 2013

Web of Data = Linked Open Data

Understanding requires Context

Armstrong

Moon

Eagle

Space

Page 29: Semantic Multimedia Analysis and Search - Future SOC Symposium 2013

Semantic Analysis

Semantics is determined by Context

“Armstrong landed the Eagle on the Moon.” (Text)

SEMEX Multimedia Context Model

Context Item

Context Dimensions: Temporal Context, Spatial Context, Provenance Context

Contextual Description: Class Diversity, Level of Structure, Source Reliability, Source Diversity

Relevance: determines

Ambiguity: influences

Accuracy: influences

N. Steinmetz, H. Sack: Semantic Multimedia Information Retrieval Based on Contextual Descriptions, 2013

Page 30: Semantic Multimedia Analysis and Search - Future SOC Symposium 2013

Semantic Analysis

Named Entity Mapping

“Armstrong landed the Eagle on the Moon.”

Consider all entities within the same context

Armstrong candidates: George Armstrong Custer; Neil Armstrong; The Armstrong Twins; Armstrong, Florida; Armstrong, Ontario; Armstrong Automobile; Joe Armstrong; Armstrong County, Texas; Armstrong Gun; Craig Armstrong; Armstrong (Moon Crater); Louis Armstrong; Armstrong Tunnel; Louis Armstrong International Airport; Armstrong’s Theorem; Sir Thomas Armstrong; Ian Armstrong; Armstrong (British Columbia); Karen Armstrong; Curtis Armstrong; Gillian Armstrong; Hilary Armstrong; William L. Armstrong

Eagle candidates: Eagle (Bird); Eagle (heraldry); USCGC Eagle; The Eagle (2011 film); Eagle (song); John H. Eagle; Eagle (typeface); Eagle Falls (Washington); Eagle (Moon Crater); Eagle (comic); Eagle (lunar module); Eagle TV; The Eagle (Pub); War Eagle; The Eagle (newspaper); Eagle (racehorse); Angela Eagle; Linda Eagle; James Philipp Eagle

Moon candidates: Man on the Moon (film); Moon (song); Moon Son-Ri; C Moon; The Moon (Tarot card); Edgar Moon; Moon OS; Moon (Band); Moon; Moon 44; Man on the Moon (soundtrack); William Moon; Lottie Moon; Mr. Moon (song); Man on the Moon (musical); Darvin Moon; Moon 83; Francis Moon; Gary Moon; Robert Charles Moon; Black Moon; Allan Moon; Ban-Ki Moon; Fly me to the Moon (song)

Candidate counts shown on the slide: 95 entities, 448 entities, 156 entities

Page 31: Semantic Multimedia Analysis and Search - Future SOC Symposium 2013

Semantic Analysis

Named Entity Recognition

Select matching entities from all possible candidate entities:
• Popularity based strategies
• Linguistic strategies
• Statistical strategies
• Semantic based strategies

• reference text corpus (Wikipedia)
• link graph (Wikipedia)
• semantic graph (DBpedia)

General Approach
1. Make an assumption
2. Do the strategies support or contradict your assumption?
3. Make a decision according to logical and probabilistic rules/constraints

N. Ludwig, H. Sack: “Named entity recognition for user-generated tags”, TIR 2011

Entity Selection Process
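The selection among candidates can be sketched as a joint decision over all terms of a context. The example below assumes a tiny hand-made link graph and made-up popularity scores; a real system would derive both from Wikipedia and DBpedia. It combines one semantic strategy (how many chosen entities are linked to each other) with one popularity strategy as a tie-breaker, which is a simplification of the rule-based combination described on the slide. All names and numbers are hypothetical.

```python
from itertools import combinations, product
from typing import Dict, List

# Small excerpt of candidate entities per term (labels as on the slides).
CANDIDATES: Dict[str, List[str]] = {
    "Armstrong": ["Neil Armstrong", "Louis Armstrong"],
    "Eagle": ["Eagle (lunar module)", "Eagle (Bird)"],
    "Moon": ["Moon", "Moon (song)"],
}

# Hypothetical undirected links between entity pages.
LINKS = {
    frozenset(("Neil Armstrong", "Eagle (lunar module)")),
    frozenset(("Neil Armstrong", "Moon")),
    frozenset(("Eagle (lunar module)", "Moon")),
    frozenset(("Louis Armstrong", "Moon (song)")),
}

# Hypothetical popularity scores, used only to break ties.
POPULARITY = {
    "Neil Armstrong": 0.6, "Louis Armstrong": 0.7,
    "Eagle (lunar module)": 0.2, "Eagle (Bird)": 0.8,
    "Moon": 0.9, "Moon (song)": 0.1,
}


def coherence(entities: List[str]) -> int:
    """Number of entity pairs in the assignment that are linked to each other."""
    return sum(1 for a, b in combinations(entities, 2) if frozenset((a, b)) in LINKS)


def disambiguate(candidates: Dict[str, List[str]]) -> Dict[str, str]:
    """Pick one entity per term so that the whole assignment is most coherent."""
    terms = list(candidates)
    best = max(
        product(*(candidates[t] for t in terms)),
        key=lambda combo: (coherence(list(combo)), sum(POPULARITY[e] for e in combo)),
    )
    return dict(zip(terms, best))


mapping = disambiguate(CANDIDATES)
```

Taken alone, popularity would prefer Louis Armstrong and the bird; considering all entities within the same context is what tips the decision. Enumerating every combination is only feasible for toy inputs, since the number of combinations grows with the product of the candidate list sizes.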

Page 32: Semantic Multimedia Analysis and Search - Future SOC Symposium 2013

Semantic Analysis

Named Entity Recognition

“Armstrong landed the Eagle on the Moon.”

Entity Selection Process

(Semantic) Graph Analysis

Highlighted entities: Neil Armstrong; Eagle (lunar module); Moon; Louis Armstrong; Fly me to the Moon (song)

Remaining Armstrong candidates: George Armstrong Custer; The Armstrong Twins; Armstrong, Florida; Armstrong, Ontario; Armstrong Automobile; Joe Armstrong; Armstrong County, Texas; Armstrong Gun; Craig Armstrong; Armstrong (Moon Crater); Armstrong Tunnel; Louis Armstrong International Airport; Armstrong’s Theorem; Sir Thomas Armstrong; Ian Armstrong; Armstrong (British Columbia); Karen Armstrong; Curtis Armstrong; Gillian Armstrong; Hilary Armstrong; William L. Armstrong

Remaining Eagle candidates: Eagle (Bird); Eagle (heraldry); USCGC Eagle; The Eagle (2011 film); Eagle (song); John H. Eagle; Eagle (typeface); Eagle Falls (Washington); Eagle (Moon Crater); Eagle (comic); Eagle TV; The Eagle (Pub); War Eagle; The Eagle (newspaper); Eagle (racehorse); Angela Eagle; Linda Eagle; James Philipp Eagle

Remaining Moon candidates: Man on the Moon (film); Moon (song); Moon Son-Ri; C Moon; The Moon (Tarot card); Edgar Moon; Moon OS; Moon (Band); Moon 44; Man on the Moon (soundtrack); William Moon; Lottie Moon; Mr. Moon (song); Man on the Moon (musical); Darvin Moon; Moon 83; Francis Moon; Gary Moon; Robert Charles Moon; Black Moon; Allan Moon; Ban-Ki Moon

Candidate counts shown on the slide: 95 entities, 448 entities, 156 entities

N. Steinmetz, H. Sack: Semantic Multimedia Information Retrieval Based on Contextual Descriptions, 2013

Page 33: Semantic Multimedia Analysis and Search - Future SOC Symposium 2013

Semantically Annotated Multimedia

Video Analysis / Metadata Extraction

time-based metadata, e.g., person xy, location yz, event abc

Entity Recognition / Mapping

e.g., bibliographical data, geographical data, encyclopedic data, ...

N. Ludwig, H. Sack: Named Entity Recognition for User-Generated Tags. In Proc. of the 8th Int. Workshop on Text-based Information Retrieval, IEEE CS Press, 2011

Page 34: Semantic Multimedia Analysis and Search - Future SOC Symposium 2013

Entity Based Search

• linguistic ambiguities of traditional keyword based search can be avoided

• enables high precision and high recall retrieval

http://www.yovisto.com/labs/autosuggestion/

• Query string refinement / extension
• entity auto-suggestion
• interpretation of natural language queries
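Entity auto-suggestion can be sketched as prefix matching over entity labels instead of raw keywords. The entity list, the type labels and the usage counts below are made up for illustration, and the ranking by count is an assumption; the yovisto implementation is not described in the slides.

```python
from typing import List, Tuple

# Hypothetical entity index: (label, type, number of annotated videos).
ENTITIES: List[Tuple[str, str, int]] = [
    ("Neil Armstrong", "Person", 120),
    ("Louis Armstrong", "Person", 95),
    ("Armstrong (Moon Crater)", "Place", 4),
    ("Apollo 11", "Mission", 60),
]


def suggest(prefix: str, limit: int = 3) -> List[Tuple[str, str]]:
    """Entities with a label token starting with the typed prefix, most used first."""
    needle = prefix.casefold()
    hits = [
        entity for entity in ENTITIES
        if any(token.startswith(needle)
               for token in entity[0].casefold().replace("(", " ").split())
    ]
    hits.sort(key=lambda entity: -entity[2])
    return [(label, kind) for label, kind, _ in hits[:limit]]


suggestions = suggest("arm")
```

Because each suggestion is an entity with a type rather than a string, the user chooses between the astronaut, the musician and the crater before the search runs, which is how the linguistic ambiguity of the keyword "Armstrong" is avoided.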

J. Osterhoff, J. Waitelonis, H. Sack, Widen the Peepholes! Entity-Based Auto-Suggestion as a rich and yet immediate Starting Point for Exploratory Search, IVDW 2012

Page 35: Semantic Multimedia Analysis and Search - Future SOC Symposium 2013

Page 36: Semantic Multimedia Analysis and Search - Future SOC Symposium 2013

http://mediaglobe.yovisto.com:8080/mggui-dev2/

search facets

C. Hentschel, H. Sack, et al., Open up cultural heritage in video archives with mediaglobe, I2CS 2012

Page 37: Semantic Multimedia Analysis and Search - Future SOC Symposium 2013

Page 38: Semantic Multimedia Analysis and Search - Future SOC Symposium 2013

Explorative Search

dbpedia-owl:mission

dbpedia:Neil_Armstrong

dbpedia:Apollo_11

dbpedia-owl:mission

category:Apollo_program

dcterms:subject

dbpedia:Apollo_13

dcterms:subject

yago:Space_accidents_and_incidents

rdf:type

rdf:type

dbpedia:Space_Shuttle_Challenger

dbpedia-owl:mission

http://mediaglobe.yovisto.com:8080/

J. Waitelonis, H. Sack: Towards exploratory video search using linked data, MTAP Volume 59, Number 2 (2012), 645-672

dbpedia:Buzz_Aldrin

dbpedia:Michael_Collins
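How related entities can be found by walking such a graph is sketched below. The triples are one plausible reading of the diagram labels above and are not queried from DBpedia; the traversal ignores edge direction and simply counts hops, which is a simplification of the ranking heuristics in the cited paper. The function names are hypothetical.

```python
from collections import deque
from typing import Dict, List, Tuple

# Assumed edges, reconstructed from the node and property labels on the slide.
TRIPLES: List[Tuple[str, str, str]] = [
    ("dbpedia:Neil_Armstrong", "dbpedia-owl:mission", "dbpedia:Apollo_11"),
    ("dbpedia:Buzz_Aldrin", "dbpedia-owl:mission", "dbpedia:Apollo_11"),
    ("dbpedia:Michael_Collins", "dbpedia-owl:mission", "dbpedia:Apollo_11"),
    ("dbpedia:Apollo_11", "dcterms:subject", "category:Apollo_program"),
    ("dbpedia:Apollo_13", "dcterms:subject", "category:Apollo_program"),
    ("dbpedia:Apollo_13", "rdf:type", "yago:Space_accidents_and_incidents"),
    ("dbpedia:Space_Shuttle_Challenger", "rdf:type", "yago:Space_accidents_and_incidents"),
]


def neighbours(node: str) -> List[str]:
    """Nodes connected to the given node by any property, in either direction."""
    result = []
    for subject, _, obj in TRIPLES:
        if subject == node and obj not in result:
            result.append(obj)
        elif obj == node and subject not in result:
            result.append(subject)
    return result


def related(start: str, max_hops: int = 2) -> Dict[str, int]:
    """Breadth-first search: every node reachable within max_hops and its distance."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue
        for nxt in neighbours(node):
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)
    del distance[start]
    return distance


two_hops = related("dbpedia:Neil_Armstrong", max_hops=2)
five_hops = related("dbpedia:Neil_Armstrong", max_hops=5)
```

Within two hops the search reaches the other crew members of Apollo 11; with more hops it drifts via the Apollo program to Apollo 13 and on to the Challenger, which is the kind of path an exploratory interface can offer as a suggestion.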

Page 39: Semantic Multimedia Analysis and Search - Future SOC Symposium 2013

Exploratory Search and Serendipity

• Find something that you were not looking for on purpose ...

dbpedia:Buzz_Aldrin

dbpedia:Cookie_Monster

dbpedia:Strictly_Come_Dancing

dbpedia:Transformers

Page 41: Semantic Multimedia Analysis and Search - Future SOC Symposium 2013

Explorative Search & Intelligent Recommendation with yovisto

http://mediaglobe.yovisto.com:8080/

Page 42: Semantic Multimedia Analysis and Search - Future SOC Symposium 2013

Explorative Search & Intelligent Recommendation with yovisto

http://mediaglobe.yovisto.com:8080/

Page 43: Semantic Multimedia Analysis and Search - Future SOC Symposium 2013

Contact:
Dr. Harald Sack
Hasso-Plattner-Institut für Softwaresystemtechnik
Universität Potsdam
Prof.-Dr.-Helmert-Str. 2-3
D-14482 Potsdam

Homepage: http://www.hpi.uni-potsdam.de/meinel/team/sack.html
Blog: http://yovisto.blogspot.com/
E-Mail: [email protected]
Twitter: lysander07 / biblionomicon / yovisto
Slides can be found at http://slideshare.com/lysander07/

Thank you very much for your attention!