MUMIS
Franciska de Jong & Thijs Westerveld
University of Twente
[email protected]
Multimedia Indexing and Searching
OBJECTIVES
• Automatic indexing of video
• Data from different media sources (paper, radio, TV)
• Domain: soccer
• Digitise + ASR
• Extract significant events
• Merge annotations
• Store final annotations
• UI for searching
FACTS SHEET
Title: MUMIS: Multimedia Indexing and Searching Environment
Funding: EU Language Engineering Sector of TAP
Duration: 30 months (July 2000 – January 2003)
Volume: 2.4 M Euro, 385 Person months
Languages: Dutch, English, German (Swedish)
Consortium
• University of Twente (NL)
• Sheffield University (UK)
• University of Nijmegen (NL)
• DFKI LT-Lab (DE)
• Max Planck Institute for Psycholinguistics (DE)
• Esteam (SE)
• VDA (NL)
Offline Processing
[Diagram. Labels: Formal Text, Free Text, Speech Transcr. (ASR) and Speech Signals, per language (EN, DE, NL); Automatic Speech Recognition; Information Extraction (IE); Annotations; Merging; Merged annotated formal text.]
DOMAIN MODELLING
DATA: text, video, audio

Ontology:
<?xml version=…>
<mumis-ontology>
  <version>…</version>
  ...
  <class>
    <name>Defender</name>
    <documentation>a ’Defender’ is a …</documentation>
    <subclass-of>Player</subclass-of>
  </class>
</mumis-ontology>

[Diagram. Labels: Annotations; Multilingual IE; Multilingual Search; Multilingual Lexicons]
[Class diagram. Labels: ENTITY, EVENT, RELATION; Time, Date, Location, Person, Score, Object, Artifact, Official, Player, Defender, Stopper, Goal, Foul]

Event templates:
Goal: Player:… Cause:… Time:… ...
...Player:… Consequence:… Time:… Location:...
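The ontology fragment on this slide can be read with any XML parser. The following is a minimal sketch in Python, assuming a complete file in which every `<class>` has a `<name>` and at most one `<subclass-of>`. Only the Defender/Player link comes from the slide; the sample document, its version number, and the Person and Stopper links are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, completed sample: only "Defender subclass-of Player" is
# taken from the slide; every other class link here is an assumption.
SAMPLE = """<?xml version="1.0"?>
<mumis-ontology>
  <version>0.1</version>
  <class><name>Person</name></class>
  <class><name>Player</name><subclass-of>Person</subclass-of></class>
  <class><name>Defender</name><subclass-of>Player</subclass-of></class>
  <class><name>Stopper</name><subclass-of>Defender</subclass-of></class>
</mumis-ontology>"""


def load_parents(xml_text):
    """Map each class name to its direct superclass (None for roots)."""
    root = ET.fromstring(xml_text)
    parents = {}
    for cls in root.findall("class"):
        name = cls.findtext("name")
        parents[name] = cls.findtext("subclass-of")  # None if absent
    return parents


def ancestors(name, parents):
    """Return the chain of superclasses, nearest first.

    Assumes the hierarchy contains no cycles.
    """
    chain = []
    current = parents.get(name)
    while current is not None:
        chain.append(current)
        current = parents.get(current)
    return chain


parents = load_parents(SAMPLE)
print(ancestors("Stopper", parents))
```

A lookup like `ancestors` is what lets a search for Player events also return events annotated with the more specific Defender or Stopper classes.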
SPEECH RECOGNITION
• Large-vocabulary
• Speaker independent
• Phoneme-based
• Hidden Markov models
• acoustic model
• language model
• Emotionally coloured speech
• Domain language model
• Match-specific vocabularies (player names)
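The effect of a match-specific vocabulary on the language model can be illustrated with a toy example. The sketch below is not the project's recogniser: it assumes an add-one smoothed bigram model, an invented two-sentence training corpus, and a hypothetical `extra_vocab` parameter that stands in for the line-up of a particular match.

```python
import math
from collections import Counter


class BigramLM:
    """Add-one smoothed bigram model over a fixed vocabulary (toy sketch)."""

    def __init__(self, sentences, extra_vocab=()):
        self.history = Counter()   # how often a word occurs as left context
        self.bigrams = Counter()   # how often a word pair occurs
        # Match-specific words join the vocabulary even if they never
        # occur in the training text.
        self.vocab = {w.lower() for w in extra_vocab}
        for sentence in sentences:
            tokens = ["<s>"] + sentence.lower().split()
            self.vocab.update(tokens)
            self.history.update(tokens[:-1])
            self.bigrams.update(zip(tokens, tokens[1:]))

    def prob(self, prev, word):
        if word not in self.vocab:
            return 0.0  # out-of-vocabulary words get no mass in this sketch
        count = self.bigrams[(prev, word)]
        return (count + 1) / (self.history[prev] + len(self.vocab))

    def log_prob(self, sentence):
        tokens = ["<s>"] + sentence.lower().split()
        total = 0.0
        for prev, word in zip(tokens, tokens[1:]):
            p = self.prob(prev, word)
            if p == 0.0:
                return float("-inf")
            total += math.log(p)
        return total


CORPUS = ["Scholes passes to Owen", "Owen shoots wide"]  # invented text
general = BigramLM(CORPUS)
match_specific = BigramLM(CORPUS, extra_vocab=["Jeremies"])

print(general.log_prob("Jeremies passes to Owen"))         # -inf
print(match_specific.log_prob("Jeremies passes to Owen"))  # finite
```

With the name added to the vocabulary, a sentence mentioning a player who never appeared in the training text receives a finite score instead of being ruled out.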
INFORMATION EXTRACTION
• multilingual
• formal descriptions
• closed captions
• tickers
• newspapers
• ASR output (radio/TV commentary)
IE DATA
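Formal text
Schoten op doel (shots on goal)         4   4
Schoten naast doel (shots off target)   6   7
Overtredingen (fouls)                  23  15
Gele kaarten (yellow cards)             1   1
Rode kaarten (red cards)                0   1
Hoekschoppen (corner kicks)             3   5
Buitenspel (offside)                    4   1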
Ticker
24 Scholes beats Jens Jeremies wonderfully, dragging the ball around and past the Bayern Munich man. He then finds Michael Owen on the right wing, but Owen's cross is poor.
TV report
Scholes
Past Jeremies
Owen
Newspaper
Owen header pushed onto the post
Deisler brought the German supporters to their feet with a buccaneering run down the right. Moments later Dietmar Hamann managed the first shot on target but it was straight at David Seaman. Mehmet Scholl should have done better after getting goalside of Phil Neville inside the area from Jens Jeremies’ astute pass but he scuffed his shot.
IE Techniques & resources
• Tokenisation
• Lemmatisation
• POS + morphology
• Named Entities
• Shallow parsing
• Co-reference resolution
• Template filling

Input: 24 Scholes beats Jens Jeremies wonderfully, dragging the ball around and past the Bayern Munich man. He then finds Michael Owen on the right wing, but Owen's cross is poor.
Tokenisation: 24 Scholes beats Jens Jeremies wonderfully, dragging ...
Lemmatisation: 24 Scholes beat Jens Jeremies wonderfull, drag...
POS + morphology: 24 [NUM] Scholes [PROP] beat [VERB, 3p sing] Jens [PROP] Jeremies [PROP] wonderfull [ADV] , [PUNCT] ...
Named Entities: 24 [time] Scholes [player] beat Jens Jeremies [player] wonderfull, …
Shallow parsing: He then finds [VP] Michael Owen on the right wing [NP] but Owen's cross [NP]
Co-reference resolution: He [= Scholes] then finds Michael Owen on the right wing …
Template filling: He then finds Michael Owen on the right wing → PASS: player1 = Scholes, player2 = Owen.
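The last steps of this pipeline (named entities, co-reference resolution, template filling) can be sketched on the ticker sentence from the slide. This is a toy illustration, not the MUMIS extraction system: the gazetteer `PLAYERS`, the single "finds" pattern, and the rule that "He" refers to the first player named in the previous sentence are all assumptions made for the example.

```python
import re

# Hypothetical gazetteer: surface forms mapped to the name used in templates.
PLAYERS = {"Scholes": "Scholes", "Jens Jeremies": "Jeremies",
           "Michael Owen": "Owen", "Owen": "Owen"}

# Longest names first so that "Michael Owen" wins over "Owen".
NAME_RE = re.compile("|".join(
    re.escape(n) for n in sorted(PLAYERS, key=len, reverse=True)))
# Assumed pattern for one event type: "<player or He> (then) finds <player>".
PASS_RE = re.compile(
    r"^(?P<subj>He|" + NAME_RE.pattern + r")\s+(?:then\s+)?finds\s+(?P<obj>"
    + NAME_RE.pattern + r")")


def extract_passes(text):
    """Return PASS templates found in text, resolving 'He' naively."""
    events = []
    last_subject = None  # first player named in the previous sentence
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        match = PASS_RE.match(sentence)
        if match:
            subj = match.group("subj")
            player1 = last_subject if subj == "He" else PLAYERS[subj]
            if player1 is not None:
                events.append({"event": "PASS", "player1": player1,
                               "player2": PLAYERS[match.group("obj")]})
        first = NAME_RE.search(sentence)
        if first:
            last_subject = PLAYERS[first.group(0)]
    return events


TICKER = ("24 Scholes beats Jens Jeremies wonderfully, dragging the ball "
          "around and past the Bayern Munich man. He then finds Michael Owen "
          "on the right wing, but Owen's cross is poor.")
print(extract_passes(TICKER))
```

On the ticker text this yields one PASS template with player1 = Scholes and player2 = Owen, the result shown on the slide. A real system would replace the hand-written pattern with the tokenisation, tagging and parsing steps listed above.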
MERGING
• Fuse annotations and recover from errors and differences:
• Multiple annotations of the same event (possibly with different attributes, e.g. time).
• Wrong event descriptions because of information extraction errors.
• Merging multiple partial annotations, e.g. by resolving open references like “star player”.
• Description logic
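The fusion of multiple partial annotations of the same event can be sketched as follows. The slide names description logic as the merging mechanism; the sketch below swaps in a much simpler rule, assumed only for illustration: two annotations are fused when their times differ by at most `tolerance` minutes and no shared attribute contradicts. The record layout and the sample annotations are invented.

```python
def merge_annotations(annotations, tolerance=1):
    """Greedily fuse compatible annotations that are close in time."""
    merged = []
    for ann in sorted(annotations, key=lambda a: a["time"]):
        for event in merged:
            close = abs(event["time"] - ann["time"]) <= tolerance
            # Compatible: no attribute, including the event type,
            # contradicts a value that is already known.
            compatible = all(event.get(key, value) == value
                             for key, value in ann.items()
                             if key not in ("time", "source"))
            if close and compatible:
                for key, value in ann.items():
                    if key not in ("time", "source"):
                        event.setdefault(key, value)  # fill missing slots
                event["sources"].append(ann["source"])
                break
        else:
            # No existing event fits: start a new one.
            new_event = {k: v for k, v in ann.items() if k != "source"}
            new_event["sources"] = [ann["source"]]
            merged.append(new_event)
    return merged


# Invented annotations from three sources, for illustration only.
ANNOTATIONS = [
    {"event": "Goal", "time": 24, "player": "Owen", "source": "ticker"},
    {"event": "Goal", "time": 25, "cause": "Head", "source": "newspaper"},
    {"event": "Foul", "time": 24, "player": "Jeremies", "source": "tv"},
]
for event in merge_annotations(ANNOTATIONS):
    print(event)
```

Here the two Goal annotations, one minute apart and each missing a slot, become a single event with both player and cause filled, while the Foul stays separate. The merged event keeps the time of the earliest annotation, which is one of several possible choices.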
ON-LINE TASKS
• Search for interesting events with formal questions (user interface in many languages)
• Indicate hits by thumbnails & let user select scene
• Play scene via the Internet & allow scrolling
Give me all goals by Overmars scored with his head in the first half.
Event=Goal; Scorer=Overmars; Cause=Head; Time<=45
PSV - Ajax 1995
Ned - Eng 1998
Ned - Ger 1998
Multilingual Search and Display
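A formal query such as `Event=Goal; Scorer=Overmars; Cause=Head; Time<=45` can be evaluated against stored annotations with a few lines of code. This is a minimal sketch under stated assumptions: the query grammar (conditions separated by `;`, with the operators `=`, `<`, `>`, `<=`, `>=`) is inferred from the single example on the slide, and the annotation records, keyed by the query's field names, are invented.

```python
import operator
import re

OPS = {"<=": operator.le, ">=": operator.ge, "=": operator.eq,
       "<": operator.lt, ">": operator.gt}
# Two-character operators are listed first so "<=" is not read as "<".
COND_RE = re.compile(r"^\s*(\w+)\s*(<=|>=|=|<|>)\s*(.+?)\s*$")


def parse_query(query):
    """Split 'Field op value; ...' into (field, op, value) triples."""
    conditions = []
    for part in query.split(";"):
        if not part.strip():
            continue
        match = COND_RE.match(part)
        if match is None:
            raise ValueError(f"cannot parse condition: {part!r}")
        field, op, value = match.groups()
        # Numeric values (e.g. minutes) are compared as integers.
        conditions.append((field, op, int(value) if value.isdigit() else value))
    return conditions


def matches(annotation, conditions):
    """True if the annotation has every field and meets every condition."""
    return all(field in annotation and OPS[op](annotation[field], value)
               for field, op, value in conditions)


# Invented annotation records, for illustration only.
ANNOTATIONS = [
    {"Event": "Goal", "Scorer": "Overmars", "Cause": "Head", "Time": 30},
    {"Event": "Goal", "Scorer": "Overmars", "Cause": "Head", "Time": 70},
    {"Event": "Goal", "Scorer": "Overmars", "Cause": "Foot", "Time": 12},
]
conditions = parse_query("Event=Goal; Scorer=Overmars; Cause=Head; Time<=45")
hits = [a for a in ANNOTATIONS if matches(a, conditions)]
print(hits)
```

Only the first record satisfies all four conditions. In a multilingual interface, the natural-language question would be mapped to this formal representation before the search runs.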
SUMMARY
• Multimedia and multilingual
• ASR on emotionally coloured speech
• IE on ASR output
• Merging different annotations
• Search archives and play video online
http://parlevink.cs.utwente.nl/projects/mumis.html