
Page 1: TRECVID Evaluations

TRECVID Evaluations

Mei-Chen Yeh, 03/27/2012

Page 2: TRECVID Evaluations

Introduction

• Text REtrieval Conference (TREC)
  – Organized by the National Institute of Standards and Technology (NIST)
  – Supported by government agencies
  – Annual evaluation (NOT a competition)
  – Different “tracks” over the years, e.g. web retrieval, email spam filtering, question answering, routing, spoken documents, OCR, video (standalone conference from 2001)

• TREC Video Retrieval Evaluation (TRECVID)

Page 3: TRECVID Evaluations

Introduction

• Objectives of TRECVID
  – Promote progress in content-based analysis of and retrieval from digital videos
  – Provide open, metrics-based evaluation
  – Model real-world situations

Page 4: TRECVID Evaluations

Introduction

• Evaluation is driven by participants
• The collection is fixed and available in the spring
  – 50% of the data is used for development, 50% for testing
• Test queries become available in July, with one month until submission
• More details: http://trecvid.nist.gov/

Page 5: TRECVID Evaluations

TRECVID Video Collections

• Test data
  – Broadcast news
  – TV programs
  – Surveillance videos
  – Video rushes provided by the BBC
  – Documentary and educational materials supplied by the Netherlands Institute for Sound and Vision (2007–2009)
  – Gatwick airport surveillance videos provided by the UK Home Office (2009)
  – Web videos (2010)

• Languages
  – English
  – Arabic
  – Chinese

Page 6: TRECVID Evaluations

Collection History

Page 7: TRECVID Evaluations

Collection History

• 2011
  – 19,200 online videos (150 GB, 600 hours)
  – 50 hours of airport surveillance videos

• 2012
  – 27,200 online videos (200 GB, 800 hours)
  – 21,000 equal-length short clips of BBC rushes
  – Airport surveillance videos (details not yet announced)
  – A ~4,000-hour collection of Internet multimedia

Page 8: TRECVID Evaluations

Tasks

• Semantic indexing (SIN)
• Known-item search (KIS)
• Content-based copy detection (CCD) – through 2011
• Interactive surveillance event detection (SED)
• Instance search (INS)
• Multimedia event detection (MED)
• Multimedia event recounting (MER) – since 2012

Page 9: TRECVID Evaluations

Semantic indexing

• System task: given the test collection, master shot reference, and concept definitions, return for each concept a list of at most 2,000 shot IDs from the test collection, ranked according to the likelihood that they contain the concept (a toy sketch follows below)
• 500 concepts (since 2011)
• “Concept pair” task (2012)
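To make the ranked-list requirement concrete, here is a minimal Python sketch. Everything in it is invented for illustration: `score_shot` stands in for a trained concept classifier and the shot IDs are synthetic; only the 2,000-result cap comes from the task definition.

```python
def score_shot(concept: str, shot_id: str) -> float:
    """Hypothetical stand-in for a trained per-concept classifier."""
    return (hash((concept, shot_id)) % 1000) / 1000.0  # placeholder likelihood

def build_run(concepts, shot_ids, max_results=2000):
    """For each concept, rank all test shots by score and keep at most 2,000 IDs."""
    run = {}
    for concept in concepts:
        ranked = sorted(shot_ids, key=lambda s: score_shot(concept, s), reverse=True)
        run[concept] = ranked[:max_results]  # the per-concept cap from the task
    return run

run = build_run(["Boy", "Running"], [f"shot{i}_1" for i in range(5000)])
print(len(run["Boy"]))  # 2000
```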

Page 10: TRECVID Evaluations

Examples

• Boy (One or more male children)
• Teenager
• Scientists (Images of people who appear to be scientists)
• Dark-skinned people
• Handshaking
• Running
• Throwing
• Eaters (Putting food or drink in his/her mouth)
• Sadness
• Anger
• Windy (Scenes showing windy weather)

(The full concept list is available via http://trecvid.nist.gov/.)

Page 11: TRECVID Evaluations

Example (concept pair)

• Beach + Mountain
• Old_People + Flags
• Animal + Snow
• Bird + Waterscape_waterfront
• Dog + Indoor
• Driver + Female_Human_Face
• Person + Underwater
• Table + Telephone
• Two_People + Vegetation
• Car + Bicycle
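One simple way a system might score a concept pair is to combine the two single-concept scores so that both must be high. The `pair_score` helper below is an illustrative baseline, not a method prescribed by TRECVID.

```python
def pair_score(score_a: float, score_b: float) -> float:
    """Both concepts must look likely for the pair to score well."""
    return min(score_a, score_b)  # a product would be another common choice

# e.g. "Dog + Indoor": strong dog evidence cannot offset weak indoor evidence
print(pair_score(0.9, 0.2))  # 0.2
```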

Page 12: TRECVID Evaluations

Known-item search

• Models the situation in which someone knows of a video, has seen it before, believes it is contained in a collection, but doesn't know where to look.

• Inputs
  – A text-only description of the desired video
  – A test collection of videos

• Outputs
  – Top-ranked videos (automatic or interactive mode; a baseline sketch follows below)
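Because the query is text-only, a natural baseline is plain text retrieval over whatever textual metadata accompanies the videos (titles, descriptions, ASR output). Below is a minimal TF-IDF sketch; the `videos` metadata dictionary is invented, and real KIS systems are considerably richer.

```python
import math
from collections import Counter

videos = {  # hypothetical metadata, e.g. titles plus ASR transcripts
    "vid1": "guy talking about how it just keeps raining",
    "vid2": "two guys in an apartment arguing about a cleaning schedule",
}

def tokenize(text):
    return text.lower().split()

def rank(query, docs):
    """Rank videos by TF-IDF overlap between the text query and their metadata."""
    n = len(docs)
    df = Counter(t for text in docs.values() for t in set(tokenize(text)))
    idf = {t: math.log(n / df[t]) for t in df}
    scores = {}
    for vid, text in docs.items():
        tf = Counter(tokenize(text))
        scores[vid] = sum(tf[t] * idf.get(t, 0.0) for t in tokenize(query))
    return sorted(scores, key=scores.get, reverse=True)

print(rank("video with the guy talking about raining", videos))  # vid1 first
```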

Page 13: TRECVID Evaluations

Examples

• Find the video with the guy talking about how it just keeps raining.

• Find the video about some guys in their apartment talking about some cleaning schedule.

• Find the video where a guy talks about the FBI and Britney Spears.

• Find the video with the guy in a yellow T-shirt with the big letter M on it.

• More examples: http://www-nlpir.nist.gov/projects/tv2010/ki.examples.html

Page 14: TRECVID Evaluations

Content-based copy detection

• Determines, for each query video, whether it contains a (possibly transformed) copy of a segment from the reference videos and, if so, where (a toy sketch follows below)
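In outline, copy detectors typically fingerprint frames compactly and look for runs of matching fingerprints between a query video and the reference collection. The sketch below is a toy version of that idea; the fingerprint sequences and the `find_copies` helper are invented for illustration, and real systems use transformation-robust perceptual hashes.

```python
def find_copies(query_fps, ref_fps, min_match=0.8):
    """Slide the query fingerprint sequence over the reference sequence and
    report offsets where most per-frame fingerprints agree."""
    hits = []
    qlen = len(query_fps)
    for start in range(len(ref_fps) - qlen + 1):
        window = ref_fps[start:start + qlen]
        matches = sum(q == r for q, r in zip(query_fps, window))
        if matches / qlen >= min_match:
            hits.append(start)  # query likely copied from reference at this offset
    return hits

reference = [1, 2, 3, 4, 5, 6, 7, 8]  # toy per-frame fingerprints
query = [4, 5, 6]
print(find_copies(query, reference))  # [3]
```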

Page 15: TRECVID Evaluations
Page 16: TRECVID Evaluations

Surveillance event detection

• Detects human behaviors in vast amounts of surveillance video, in real time!
• For public safety and security
• Event examples (a detection sketch follows below)
  – Person runs
  – Cell to ear
  – Object put
  – People meet
  – Embrace
  – Pointing
  – …
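A common first step in such a pipeline is to threshold a per-frame detector score and merge consecutive above-threshold frames into detected event intervals. The sketch below assumes invented per-frame scores for a "person runs" style detector; it is not the method of any evaluated system.

```python
def detect_intervals(frame_scores, threshold=0.5):
    """Group consecutive above-threshold frames into (start, end) event intervals."""
    intervals, start = [], None
    for i, s in enumerate(frame_scores):
        if s >= threshold and start is None:
            start = i                         # event begins
        elif s < threshold and start is not None:
            intervals.append((start, i - 1))  # event ends
            start = None
    if start is not None:                     # event runs to the final frame
        intervals.append((start, len(frame_scores) - 1))
    return intervals

scores = [0.1, 0.7, 0.8, 0.2, 0.9, 0.9, 0.6, 0.1]  # invented detector output
print(detect_intervals(scores))  # [(1, 2), (4, 6)]
```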

Page 17: TRECVID Evaluations

Instance search

• Finds video segments of a specific person, object, or place, given a visual example.

Page 18: TRECVID Evaluations

Instance search

• Input
  – A collection of test clips
  – A collection of queries, each delimiting a person, object, or place entity in some example video

• Output
  – For each query, up to the 1,000 clips most likely to contain a recognizable instance of the entity (a baseline sketch follows below)
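A minimal baseline for this input/output contract is to match local descriptors extracted from the query region against each clip's descriptors and rank clips by match count. The `match_count` and `rank_clips` helpers below are invented for illustration; real systems use robust local features (e.g., SIFT) with approximate nearest-neighbor indexing.

```python
def dist2(a, b):
    """Squared Euclidean distance between two descriptor vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def match_count(query_desc, clip_desc, max_dist2=0.1):
    """Count query descriptors that find a close match in the clip."""
    return sum(any(dist2(q, d) <= max_dist2 for d in clip_desc) for q in query_desc)

def rank_clips(query_desc, clips, top_k=1000):
    """Rank clips by match count; keep up to the 1,000-clip limit from the task."""
    ranked = sorted(clips, key=lambda c: match_count(query_desc, clips[c]), reverse=True)
    return ranked[:top_k]

query = [(0.1, 0.2), (0.8, 0.9)]  # toy descriptors from the delimited query region
clips = {"clipA": [(0.1, 0.21), (0.79, 0.9)], "clipB": [(0.5, 0.5)]}
print(rank_clips(query, clips))  # ['clipA', 'clipB']
```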

Page 19: TRECVID Evaluations

Query examples

Page 20: TRECVID Evaluations

Multimedia event detection

• System task: given a collection of test videos and a list of test events, indicate whether each test event is present anywhere in each test video, and give the strength of evidence for each such judgment (a scoring sketch follows below)

• Events in 2010
  – Making a cake: one or more people make a cake
  – Batting a run in: within a single play during a baseball-type game, a batter hits a ball and one or more runners (possibly including the batter) score a run
  – Assembling a shelter: one or more people construct a temporary or semi-permanent shelter for humans that could provide protection from the elements

• 15 new events were released for 2011; the 2012 events have not yet been announced.
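Strength-of-evidence scores of this kind are typically evaluated by thresholding them and measuring missed detections against false alarms, the quantities behind detection-error-tradeoff curves. The sketch below uses invented scores and ground truth; `miss_fa_rates` is a hypothetical helper, not NIST's scoring tool.

```python
def miss_fa_rates(scores, truth, threshold):
    """scores: video_id -> strength of evidence; truth: video_id -> event present?
    Returns (miss rate, false-alarm rate) at the given decision threshold."""
    misses = sum(1 for v, t in truth.items() if t and scores[v] < threshold)
    fas = sum(1 for v, t in truth.items() if not t and scores[v] >= threshold)
    positives = sum(truth.values())
    negatives = len(truth) - positives
    return misses / positives, fas / negatives

scores = {"v1": 0.9, "v2": 0.4, "v3": 0.7, "v4": 0.2}  # invented system output
truth = {"v1": True, "v2": True, "v3": False, "v4": False}
print(miss_fa_rates(scores, truth, 0.5))  # (0.5, 0.5)
```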

Page 21: TRECVID Evaluations

Multimedia event recounting

• New in 2012

• Task: once a multimedia event detection system has found an event in a video clip, it is useful for a human user to be able to examine the evidence on which the system's decision was based. An important goal is for that evidence to be semantically meaningful to a human.

• Input
  – A clip and an event kit: the event name, definition, explication (a textual exposition of the terms and concepts), evidential descriptions, and illustrative video exemplars

• Output
  – A clear, concise, text-only (alphanumeric) recounting or summary of the key evidence that the event does in fact occur in the video

Page 22: TRECVID Evaluations

Schedule

• Feb.: call for participation
• Apr.: complete the guidelines
• Jun.–Jul.: release query data
• Sep.: submission due
• Oct.: return the results
• Nov.: paper submission due
• Dec.: workshop

Page 23: TRECVID Evaluations

Call for partners

• Standardized evaluations and comparisons
• Testing on large collections
• Failures are not embarrassing, and they can be presented at the TRECVID workshop!
• Anyone can participate!
  – A “priceless” resource for researchers