
Verifying Multimedia Use at MediaEval 2015 Christina Boididou1, Katerina Andreadou1, Symeon Papadopoulos1, Duc-Tien Dang-Nguyen2, Giulia Boato2, Michael Riegler3 & Yiannis Kompatsiaris1

1 Information Technologies Institute (ITI), CERTH, Greece

2 University of Trento, Italy

3 Simula Research Lab, Norway

MediaEval 2015 Workshop, Sept. 14-15, 2015, Wurzen, Germany

Real or Fake

#2

Real or Fake

#3

Real photo, captured in April 2011 by the WSJ, but heavily tweeted during Hurricane Sandy (29 Oct 2012). Tweeted by multiple sources and retweeted multiple times. Original online at:

http://blogs.wsj.com/metropolis/2011/04/28/weather-journal-clouds-gathered-but-no-tornado-damage/

Task at a Glance

#4

[Diagram: TWEET + IMAGE + AUTHOR (PROFILE) → MEDIAEVAL SYSTEM → FAKE / REAL]

Systems may use (a baseline classifier sketch follows this list):

• Tweet text

• Tweet metadata

• Twitter user profile

• Image content

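As a rough, non-authoritative illustration of such a system (not the approach of any particular participant), the sketch below turns a few of the listed inputs into hand-crafted features for a scikit-learn classifier. All field names (text, retweet_count, followers_count, verified) and feature choices are assumptions made for the example.

```python
# Minimal illustrative baseline (assumed features, not any team's actual system):
# hand-crafted tweet/user features feeding a scikit-learn classifier.
from sklearn.ensemble import RandomForestClassifier

def tweet_features(tweet):
    """Map one tweet (a dict with assumed fields) to a numeric feature vector."""
    text = tweet.get("text", "")
    user = tweet.get("user", {})
    return [
        len(text),                                # tweet length
        text.count("!") + text.count("?"),        # punctuation emphasis
        1 if "http" in text else 0,               # contains a link
        tweet.get("retweet_count", 0),            # propagation signal
        user.get("followers_count", 0),           # author popularity
        user.get("friends_count", 0),
        1 if user.get("verified", False) else 0,  # verified author
    ]

def train(dev_tweets, dev_labels):
    """dev_labels: 1 = fake, 0 = real (the task's target classes)."""
    X = [tweet_features(t) for t in dev_tweets]
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    clf.fit(X, dev_labels)
    return clf

def classify(clf, test_tweets):
    """Return 1 (fake) / 0 (real) predictions for unseen tweets."""
    return clf.predict([tweet_features(t) for t in test_tweets])
```

Image-content features (e.g. visual or forensic descriptors) would be concatenated to the same vectors; they are omitted here to keep the sketch short.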

A Typology of Fake: Reposting of Real

• Photos from past events reposted as if they were associated with the current event

#5

A Typology of Fake: Reposting of Art

• Artworks presented as real imagery

#6

A Typology of Fake: Speculations

• Speculations associating persons or actions with the current event

#7

A Typology of Fake: Photoshopping

• Digitally manipulated photos

#8

Assessing Multimedia Use

TWEET

#9

Assessing Multimedia Use

LINKED CONTENT

Assessing Multimedia Use

AUTHOR

Ground Truth Generation

• Data (tweet) collection

– Historic (known cases discussed online) using Topsy

– Real-time collection during major events using the Twitter Streaming API

• Tweet set expansion

– Near-duplicate image search + human inspection was used to increase the number of associated tweets (a hashing sketch follows below)

• Label assignment

– Fake/real labels were manually assigned after consulting online reports that were posted after each event

#12
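The slides do not detail how the near-duplicate image search was implemented; below is a minimal sketch of one common approach (a difference hash compared by Hamming distance, using only Pillow). The function names and the distance threshold are assumptions for illustration.

```python
# Near-duplicate image search sketch: difference hash (dHash) + Hamming distance.
# One common technique; the corpus-expansion pipeline may have used something else.
from PIL import Image

def dhash(path, hash_size=8):
    """64-bit difference hash: compare horizontally adjacent pixels of a tiny grayscale image."""
    img = Image.open(path).convert("L").resize((hash_size + 1, hash_size))
    pixels = list(img.getdata())
    bits = 0
    for row in range(hash_size):
        for col in range(hash_size):
            left = pixels[row * (hash_size + 1) + col]
            right = pixels[row * (hash_size + 1) + col + 1]
            bits = (bits << 1) | (1 if left > right else 0)
    return bits

def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")

def near_duplicates(query_image, candidate_images, max_distance=10):
    """Candidates whose hash is within max_distance bits of the query image's hash."""
    q = dhash(query_image)
    return [p for p in candidate_images if hamming(q, dhash(p)) <= max_distance]
```

Per the slide, tweets found this way were still inspected by humans before being added, and fake/real labels were assigned only after consulting post-event reports.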

Annotation Challenges

• Tweets declaring that the embedded image is fake

• Tweets with obvious manipulations

• All such cases were manually checked and removed from both the development and test sets!

#13

Verification Corpus - Dev

#14

Event Name               |  fake: #images  #tweets  #users  |  real: #images  #tweets  #users

Hurricane Sandy          |            62     5,559   5,432  |           148     4,664   4,446
Boston Marathon bombing  |            35       189     187  |            28       344     310
Sochi Olympics           |            26       274     252  |             -         -       -
MH370 Flight             |            29       501     493  |             -         -       -
Bring Back Our Girls     |             7       131     126  |             -         -       -
Columbian Chemicals      |            15       185      87  |             -         -       -
Passport hoax            |             2        44      44  |             -         -       -
Rock Elephant            |             1        13      13  |             -         -       -
Underwater bedroom       |             3       113     112  |             -         -       -
Livr mobile app          |             4         9       9  |             -         -       -
Pig fish                 |             1        14      14  |             -         -       -
Total                    |           185     7,032   6,769  |           176     5,008   4,756

Verification Corpus - Test

#15

Event Name         |  fake: #images  #tweets  #users  |  real: #images  #tweets  #users

Solar Eclipse      |             6       137     135  |             4       140     133
Samurai with girl  |             4       218     212  |             -         -       -
Nepal Earthquake   |            21       356     343  |            11     1,004     934
Garissa Attack     |             2         6       6  |             2        73      72
Syrian boy         |             1     1,786   1,692  |             -         -       -
Varoufakis         |             1        61      59  |             -         -       -

• Evaluation was based on classic IR/ML measures: Precision, Recall, F-measure (target class: fake)

• Participants were allowed to mark a tweet as “unknown”, which was expected to result in reduced recall (see the scoring sketch below)
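The official scorer is not reproduced here; the sketch below only illustrates the scoring logic described above: precision, recall and F-measure computed for the "fake" target class, with "unknown" answers counting as missed predictions (hurting recall but not precision).

```python
# Scoring sketch: precision / recall / F1 on the "fake" class.
# "unknown" predictions are simply not "fake", so a fake tweet marked unknown
# becomes a false negative (lower recall) without adding a false positive.
def score(true_labels, predicted_labels, target="fake"):
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fp = sum(1 for t, p in pairs if t != target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Toy example: one fake tweet left as "unknown" halves recall, precision stays 1.0.
print(score(["fake", "fake", "real"], ["fake", "unknown", "real"]))  # (1.0, 0.5, 0.667)
```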

Results

#16

Team         Run   Recall  Precision  F-Score

MCG-ICT      run1  0.921   0.964      0.942
             run2  0.922   0.937      0.930
UoS-ITI      run1  0.032   1.000      0.063
             run2  0.017   1.000      0.034
             run3  0.034   1.000      0.065
             run4  0.720   1.000      0.837
CERTH-UNITN  run1  0.794   0.733      0.762
             run2  0.749   0.994      0.854
             run3  0.922   0.736      0.819
             run4  0.798   0.860      0.828
             run5  0.967   0.862      0.911

Results: Examples #1

• All participants failed to classify these correctly

• True label: Fake / Predicted: Real

#17

Results: Examples #2

• All participants classified these correctly.

#18

[Example images — true labels: fake, real]

Results: Examples #3

• Only participant #1 predicted these correctly.

#19

[Example images — true labels: fake, real]

Results: Examples #4

• Only participant #2 predicted these correctly.

#20

[Example images — true labels: real, real]

Results: Examples #5

• Only participant #3 predicted these correctly.

#21

[Example images — true labels: fake, real]

Future Plans

• Move beyond tweets + images

– Blog/news articles

– Public Facebook posts (in pages)

– Other?

• Move beyond the simple fake/real distinction

– Real, but inaccurate

– Messages expressing doubt

– Other?

• Use different evaluation measures

– AUC is probably better, especially when there is class imbalance (see the sketch below)

#22
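As a small illustration of the proposed switch to AUC (assuming systems would submit a graded fake-likelihood score per tweet rather than a hard label), scikit-learn's roc_auc_score can be used directly; the scores below are made up for the example.

```python
# AUC sketch: threshold-free ranking measure, less sensitive to class imbalance.
from sklearn.metrics import roc_auc_score

y_true = [1, 1, 0, 0, 1]               # toy ground truth: 1 = fake, 0 = real
y_score = [0.9, 0.3, 0.2, 0.4, 0.8]    # hypothetical per-tweet fake scores

print(roc_auc_score(y_true, y_score))  # ~0.83 for this toy example
```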

Thank you!

• Code:

https://github.com/MKLab-ITI/image-verification-corpus

• Get in touch:

@sympapadopoulos / [email protected]

@CMpoi / [email protected]