Verifying Multimedia Use at MediaEval 2015
Christina Boididou (1), Katerina Andreadou (1), Symeon Papadopoulos (1), Duc-Tien Dang-Nguyen (2), Giulia Boato (2), Michael Riegler (3) & Yiannis Kompatsiaris (1)
(1) Information Technologies Institute (ITI), CERTH, Greece
(2) University of Trento, Italy
(3) Simula Research Lab, Norway
MediaEval 2015 Workshop, Sept. 14-15, 2015, Wurzen, Germany
Real or Fake
Real photo, captured in April 2011 by WSJ, but heavily tweeted during Hurricane Sandy (29 Oct 2012)
Tweeted by multiple sources & retweeted multiple times
Original online at:
http://blogs.wsj.com/metropolis/2011/04/28/weather-journal-clouds-gathered-but-no-tornado-damage/
Task at a Glance
Input: TWEET + IMAGE + AUTHOR (PROFILE) → MEDIAEVAL SYSTEM → Output: FAKE or REAL
Systems may use:
• Tweet text
• Tweet metadata
• Twitter user profile
• Image content
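The input/output contract of the task can be sketched in a few lines of Python. This is only an illustration: the `TweetItem` fields, `run_system`, and the exclamation-mark rule in `toy_classifier` are hypothetical names and placeholders, not part of the task definition or of any participant's system.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable

# Labels a system may emit; "unknown" is allowed by the task.
LABELS = ("fake", "real", "unknown")


@dataclass
class TweetItem:
    """Hypothetical container for the signals a system may use."""
    tweet_id: str
    text: str
    metadata: Dict[str, object] = field(default_factory=dict)
    user_profile: Dict[str, object] = field(default_factory=dict)
    image_id: str = ""


def run_system(items: Iterable[TweetItem],
               classify: Callable[[TweetItem], str]) -> Dict[str, str]:
    """Apply a classifier to every tweet and collect one label per tweet id."""
    predictions = {}
    for item in items:
        label = classify(item)
        if label not in LABELS:
            raise ValueError(f"unexpected label: {label!r}")
        predictions[item.tweet_id] = label
    return predictions


def toy_classifier(item: TweetItem) -> str:
    """Placeholder rule for illustration only, not a real verification method."""
    if not item.text.strip():
        return "unknown"
    return "fake" if item.text.count("!") >= 3 else "real"
```

Any real system would replace `toy_classifier` with a model built on the tweet text, metadata, user profile, and image content listed above.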
A Typology of Fake: Reposting of Real
• Photos from past events reposted as being associated with the current event
A Typology of Fake: Reposting of Art
• Artworks presented as real imagery
A Typology of Fake: Speculations
• Speculations regarding the association of persons or actions to current event
A Typology of Fake: Photoshopping
• Digitally manipulated photos
Assessing Multimedia Use
• TWEET
• LINKED CONTENT
• AUTHOR
Ground Truth Generation
• Data (tweet) collection
– Historic (known cases discussed online) using Topsy
– Real-time during major events using streaming API
• Tweet set expansion
– Near-duplicate image search + human inspection was used to increase the number of associated tweets
• Label assignment
– Fake/real labels were manually assigned after consulting online reports that were posted after each event
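The tweet set expansion step can be illustrated with a small sketch. The slides do not say which near-duplicate search method was used, so the average-hash comparison below is a stand-in chosen for brevity. It assumes each image has already been decoded and downscaled to the same small grayscale grid (for example 8x8), and all function names are hypothetical.

```python
from typing import Dict, List, Sequence

Grid = Sequence[Sequence[int]]  # grayscale pixel values, same shape for all images


def average_hash(pixels: Grid) -> int:
    """Set one bit per pixel: 1 if the pixel is brighter than the image mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def near_duplicates(query: Grid, candidates: Dict[str, Grid],
                    max_distance: int = 2) -> List[str]:
    """Return candidate names whose hash is within max_distance bits of the query."""
    query_hash = average_hash(query)
    return [name for name, pixels in candidates.items()
            if hamming(query_hash, average_hash(pixels)) <= max_distance]
```

Matches returned by such a search are only candidates; as the slide notes, human inspection was still needed before the associated tweets were added.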
Annotation Challenges
• Tweets declaring that the embedded image is fake
• Tweets with obvious manipulations
• All such cases were manually checked and removed from both the development and test sets!
Verification Corpus - Dev
                                    fake                       real
Event Name                 #images  #tweets   #users  #images  #tweets   #users
Hurricane Sandy                 62    5,559    5,432      148    4,664    4,446
Boston Marathon bombing         35      189      187       28      344      310
Sochi Olympics                  26      274      252        -        -        -
MH370 Flight                    29      501      493        -        -        -
Bring Back Our Girls             7      131      126        -        -        -
Columbian Chemicals             15      185       87        -        -        -
Passport hoax                    2       44       44        -        -        -
Rock Elephant                    1       13       13        -        -        -
Underwater bedroom               3      113      112        -        -        -
Livr mobile app                  4        9        9        -        -        -
Pig fish                         1       14       14        -        -        -
Total                          185    7,032    6,769      176    5,008    4,756
Verification Corpus - Test
                                    fake                       real
Event Name                 #images  #tweets   #users  #images  #tweets   #users
Solar Eclipse                    6      137      135        4      140      133
Samurai with girl                4      218      212        -        -        -
Nepal Earthquake                21      356      343       11    1,004      934
Garissa Attack                   2        6        6        2       73       72
Syrian boy                       1    1,786    1,692        -        -        -
Varoufakis                       1       61       59        -        -        -
• Evaluation was based on classic IR/ML measures: Precision, Recall, F-measure (target class: fake)
• Participants were allowed to mark a tweet as “unknown” (expected to result in reduced recall)
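A minimal sketch of how these measures could be computed with "fake" as the target class. It assumes that an "unknown" prediction on a fake tweet counts as a miss (lowering recall) and never counts against precision; the official scoring script may treat details differently, and the function name is hypothetical.

```python
from typing import Dict, Tuple


def fake_class_scores(truth: Dict[str, str],
                      predicted: Dict[str, str]) -> Tuple[float, float, float]:
    """Precision, recall, and F-measure for the 'fake' class.

    Tweets missing from `predicted` are treated like 'unknown'.
    """
    tp = fp = fn = 0
    for tweet_id, true_label in truth.items():
        guess = predicted.get(tweet_id, "unknown")
        if guess == "fake" and true_label == "fake":
            tp += 1          # fake tweet correctly flagged
        elif guess == "fake":
            fp += 1          # real tweet wrongly flagged as fake
        elif true_label == "fake":
            fn += 1          # fake tweet labelled real or unknown
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure
```

Under this assumption, a system that answers "fake" only when it is very sure can reach perfect precision while its recall, and therefore its F-measure, stays low.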
Results
Team         Run   Recall  Precision  F-Score
MCG-ICT      run1  0.921   0.964      0.942
MCG-ICT      run2  0.922   0.937      0.930
UoS-ITI      run1  0.032   1.000      0.063
UoS-ITI      run2  0.017   1.000      0.034
UoS-ITI      run3  0.034   1.000      0.065
UoS-ITI      run4  0.720   1.000      0.837
CERTH-UNITN  run1  0.794   0.733      0.762
CERTH-UNITN  run2  0.749   0.994      0.854
CERTH-UNITN  run3  0.922   0.736      0.819
CERTH-UNITN  run4  0.798   0.860      0.828
CERTH-UNITN  run5  0.967   0.862      0.911
Results: Examples #1
• All participants failed to classify those correctly
• True label: Fake / Predicted: Real
Results: Examples #2
• All participants classified these correctly.
(images labelled: fake, real)
Results: Examples #3
• Only participant #1 predicted those correctly.
(images labelled: fake, real)
Results: Examples #4
• Only participant #2 predicted those correctly.
(images labelled: real, real)
Results: Examples #5
• Only participant #3 predicted those correctly.
(images labelled: fake, real)
Future Plans
• Move beyond tweets + images
– Blog/news articles
– Public Facebook posts (in pages)
– Other?
• Move beyond the simple fake/real distinction
– Real, but inaccurate
– Messages expressing doubt
– Other?
• Use different evaluation measures
– AUC probably better, especially when there is class imbalance
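To make the AUC suggestion concrete, here is a small sketch of ROC AUC computed as a pairwise ranking statistic. It assumes each system outputs a numeric "fakeness" score per tweet rather than a hard label, which the 2015 task did not require; the function name is hypothetical.

```python
from typing import Sequence


def roc_auc(labels: Sequence[str], scores: Sequence[float]) -> float:
    """Probability that a random fake tweet is scored above a random real one.

    Ties count as half a win. The pairwise loop is quadratic, which is fine
    for a sketch but slow for large corpora.
    """
    fake_scores = [s for y, s in zip(labels, scores) if y == "fake"]
    real_scores = [s for y, s in zip(labels, scores) if y != "fake"]
    if not fake_scores or not real_scores:
        raise ValueError("AUC needs at least one fake and one real example")
    wins = 0.0
    for f in fake_scores:
        for r in real_scores:
            if f > r:
                wins += 1.0
            elif f == r:
                wins += 0.5
    return wins / (len(fake_scores) * len(real_scores))
```

Because the value depends only on how fake tweets rank against real ones, it does not shift merely because one class is much larger than the other, which is the property the slide points to.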
Thank you!
• Code: https://github.com/MKLab-ITI/image-verification-corpus
• Get in touch:
– @sympapadopoulos / [email protected]
– @CMpoi / [email protected]