cs531presentation
Based on the paper "Understanding and Classifying Image Tweets" (ACM-MM 2013). Disclaimer: I am not an author of this paper; I used it as the basis for my course project proposal.
MUSTAFA ILKER SARAC 20801528
UNDERSTANDING AND CLASSIFYING IMAGE TWEETS, ACM-MM 2013
Investigating Images Related to Twitter Trending Topics
CS531 - Mustafa Ilker SARAC 04/12/2023
Content
  Introduction
  Motivation
  Image-Tweets
  Image and Text Relation
  Visual/Non-Visual Classification
  Experiments
  Initial Results
Introduction
Image-tweets
  Correlation between a tweet's image and its text
50% of all posts are image-tweets
Image-tweets are retweeted more and survive longer
Motivation
Questions to ask
  What types of images do users embed?
  Do the images differ distinctly from images on image/photo-sharing websites like Flickr?
  Does the textual content of image-tweets differ from that of text-only posts?
Contributions
  Corpus
  Annotated subset
  A classifier that distinguishes two subclasses of image-tweets: visual and non-visual
Image-Tweets
Corpus
  Text-only tweets and image-tweets from Weibo
  7 months in 2012
  ~57M tweets
  Manually annotated subset of ~5K
Image-Tweets
Image Characteristics
  Images are post-processed by Weibo
  45.1% of the corpus are image-tweets
  Images vary in quality and topic
  70% of the annotated corpus are natural photographs
Image-Tweets
Image-tweets vs. Text-only: When? What? Why?
  When? More image-tweets are posted during the daytime
  What? LDA applied to a ~1M-tweet subset of the corpus; k=50 latent topics are learned
  Why? Daily chatter or information sharing
Image and Text Relation
99% of image-tweets have text
  Status (event, time, location)
  Logico-semantic
Image and Text Relation
Visually-relevant image-tweets
  At least one noun or verb corresponds to part of the image
Non-visual image-tweets
  Image and text have no visual correspondence
  Hard to distinguish just by looking at the images
  May exhibit emotional relevance
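The visual/non-visual definition can be read as a simple rule. The following is a minimal sketch, assuming the tweet text is already POS-tagged and that a set of words describing the image content is available. The slides only state the definition and that labels come from human annotators; the function `is_visual`, the tag names, and both inputs are hypothetical.

```python
def is_visual(tagged_tokens, image_concepts):
    """Return True if at least one noun or verb in the text names something in the image.

    tagged_tokens: list of (word, pos) pairs; the "NOUN"/"VERB" tag names are an assumption
    image_concepts: set of lowercase words describing what the image shows (assumed input)
    """
    for word, pos in tagged_tokens:
        if pos in ("NOUN", "VERB") and word.lower() in image_concepts:
            return True
    return False


tweet = [("Lovely", "ADJ"), ("sunset", "NOUN"), ("tonight", "ADV")]
print(is_visual(tweet, {"sunset", "sea"}))  # the noun "sunset" appears in the image
print(is_visual(tweet, {"cat"}))            # no noun or verb matches the image
```

In practice the hard part is obtaining `image_concepts`, which is why the slides describe this distinction as hard to make from the images alone.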
Visual/Non-Visual Classification
Dataset Construction
  Crowdsourcing to label a random subset of the image-tweets as visual or non-visual
  Each image is annotated by 3 different subjects
  4811 image-tweets annotated
    3206 (2/3) visual
    1605 (1/3) non-visual
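With three annotators and two labels, one natural way to obtain a single label per image-tweet is a majority vote. The slides do not say how the three annotations were merged, so the sketch below is an assumption; `majority_label` and the sample annotations are hypothetical.

```python
from collections import Counter


def majority_label(votes):
    """Return the label chosen by most annotators (3 votes over 2 labels cannot tie)."""
    return Counter(votes).most_common(1)[0][0]


# Hypothetical annotations: three votes per image-tweet.
annotations = {
    "tweet_a": ["visual", "visual", "non-visual"],
    "tweet_b": ["non-visual", "non-visual", "visual"],
}
labels = {tweet_id: majority_label(v) for tweet_id, v in annotations.items()}
print(labels)

# Class balance reported on the slide: 3206 visual out of 4811 annotated image-tweets.
visual_share = 3206 / 4811
print(round(visual_share, 3))
```

The reported counts add up to 4811 and give a visual share very close to two thirds, matching the fractions on the slide.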
3 major types of features are used
  Text
  Image
  Context
Visual/Non-Visual Classification
Text Features
  Binary word features
  Previously learned topics from LDA
  Part-of-speech (POS) density features
  Named entities
  Microblog-specific features
    @mentions
    #hashtags
    Geolocation
    URLs
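The microblog-specific features can be sketched as binary flags extracted with regular expressions. This is a simplified illustration, not the paper's feature extractor: the patterns, the function `microblog_features`, and the `has_geolocation` argument (assumed to come from tweet metadata) are all hypothetical.

```python
import re

# Simplified patterns; real tokenizers handle more edge cases.
MENTION = re.compile(r"@\w+")
HASHTAG = re.compile(r"#\w+")
URL = re.compile(r"https?://\S+")


def microblog_features(text, has_geolocation=False):
    """Return binary microblog-specific features for one tweet."""
    return {
        "has_mention": bool(MENTION.search(text)),
        "has_hashtag": bool(HASHTAG.search(text)),
        "has_url": bool(URL.search(text)),
        "has_geolocation": has_geolocation,  # taken from metadata, not from the text
    }


features = microblog_features("@alice look at this #sunset http://t.cn/abc")
print(features)
```

Binary flags keep these features on the same footing as the binary word features listed above.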
Visual/Non-Visual Classification
Image Features
  Face detection
  SIFT features with a bag-of-visual-words representation
    LDA applied with k=35
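A bag-of-visual-words representation assigns each local descriptor to its nearest codebook entry and counts the assignments. The sketch below uses toy 2-D vectors in place of 128-D SIFT descriptors and a hand-written codebook in place of one learned by clustering; both are assumptions for illustration, as are the function names.

```python
def nearest_codeword(descriptor, codebook):
    """Index of the codebook entry closest to the descriptor (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sq_dist(descriptor, codebook[i]))


def bag_of_visual_words(descriptors, codebook):
    """Histogram counting how many descriptors fall on each visual word."""
    hist = [0] * len(codebook)
    for d in descriptors:
        hist[nearest_codeword(d, codebook)] += 1
    return hist


# Toy data: 3 visual words, 4 descriptors from one image.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, 0.2), (0.9, 1.1), (4.8, 5.3), (1.2, 0.8)]
hist = bag_of_visual_words(descriptors, codebook)
print(hist)  # [1, 2, 1]
```

The histogram plays the same role as a word-count vector for text, which is presumably the input to the LDA step with k=35.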
Context Features
  Retweets
  Comments
  Follower ratio
  Posting time, etc.
Experiment
10-fold cross-validation with Naïve Bayes is performed
Macro-averaged F1 score is computed
Baseline uses only words as features
  F1 = 64.8
Each feature is added individually to observe its impact
With all positive features combined
  F1 = 70.5
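Macro-averaged F1 computes F1 separately for each class and then takes the unweighted mean, so the smaller non-visual class counts as much as the visual class. A minimal sketch with made-up labels (the toy predictions below are not results from the paper):

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        scores.append(f1)
    return sum(scores) / len(scores)


# Toy labels for illustration only.
y_true = ["visual", "visual", "visual", "non-visual", "non-visual", "non-visual"]
y_pred = ["visual", "visual", "non-visual", "non-visual", "non-visual", "visual"]
score = macro_f1(y_true, y_pred)
print(round(score, 3))
```

In a 10-fold setup this score would be computed on each held-out fold and averaged; the slide's values (64.8 and 70.5) appear to be expressed as percentages.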
Experiment
Proposed Work
Re-rank the images of image-tweets returned by Twitter search
Select good images to represent Trending Topics
Twitter was scraped and some initial results were obtained using
  Retweets and Favorites as contextual features
  SIFT as image features to compare images
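One way the contextual part of the re-ranking could work is to score each search result by its retweet and favorite counts and sort by that score. The slides do not specify a scoring formula, so everything below is a hypothetical sketch: the log scaling, the weights, the field names, and the sample data are assumptions, and the SIFT-based image comparison is left out.

```python
import math


def engagement_score(retweets, favorites, w_retweet=1.0, w_favorite=1.0):
    """Weighted sum of log-scaled counts; log1p dampens a few viral tweets."""
    return w_retweet * math.log1p(retweets) + w_favorite * math.log1p(favorites)


def rerank(results):
    """Return search results sorted from highest to lowest engagement score."""
    return sorted(
        results,
        key=lambda r: engagement_score(r["retweets"], r["favorites"]),
        reverse=True,
    )


# Hypothetical scraped results for one trending topic.
results = [
    {"id": "a", "retweets": 3, "favorites": 1},
    {"id": "b", "retweets": 120, "favorites": 45},
    {"id": "c", "retweets": 10, "favorites": 0},
]
ranked_ids = [r["id"] for r in rerank(results)]
print(ranked_ids)  # ['b', 'c', 'a']
```

An image-similarity term could be added to the score to favor images that resemble many others returned for the same topic.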
Initial Results
QUESTIONS?
Thank You