cs531presentation

MUSTAFA ILKER SARAC 20801528 UNDERSTANDING AND CLASSIFYING IMAGE TWEETS ACM-MM 2013 Investigating Images Related to Twitter Trending Topics 1 CS531 - Mustafa Ilker SARAC 01/03/2022

Upload: milkers

Post on 13-May-2015


DESCRIPTION

Based on the paper "Understanding and Classifying Image Tweets" (ACM-MM 2013). Disclaimer: I am not an author of this paper. I used it as the basis for my course project proposal.

TRANSCRIPT

Page 1: CS531presentation

MUSTAFA ILKER SARAC 20801528

UNDERSTANDING AND CLASSIFYING IMAGE TWEETS
ACM-MM 2013

Investigating Images Related to Twitter Trending Topics


CS531 - Mustafa Ilker SARAC 04/12/2023

Page 2: CS531presentation

Content

Introduction
Motivation
Image-Tweets
Image and Text Relation
Visual/Non-Visual Classification
Experiments
Initial Results

Page 3: CS531presentation

Introduction

Image-tweets
Correlation between a tweet's image and its text
50% of all posts are image-tweets
Image-tweets are retweeted more and survive longer

Page 4: CS531presentation

Motivation

Questions to ask
  What types of images do users embed?
  Do the images distinctly differ from images on image/photo-sharing websites like Flickr?
  Do the textual contents of image-tweets differ from posts that are text-only?

Contributions
  Corpus
  Annotated subset
  Built a classifier to distinguish two subclasses of image-tweets:
    Visual
    Non-visual

Page 5: CS531presentation

Image-Tweets

Corpus
  Text-only tweets and image-tweets from Weibo
  7 months in 2012
  ~57M tweets
  Manually annotated ~5K subset

Page 6: CS531presentation

Image-Tweets

Image Characteristics
  Images are post-processed by Weibo
  45.1% of the corpus are image-tweets
  Images vary in quality and topic
  70% of the annotated corpus are natural photographs

Page 7: CS531presentation

Image-Tweets

Image-tweets vs. Text-only: When? What? Why?
  When? More image-tweets are posted during daytime
  What? LDA applied to a ~1M subset of the corpus
    k=50 latent topics are learned
  Why? Daily chatter or information sharing

Page 8: CS531presentation

Image and Text Relation

99% of image-tweets have text
Status (event, time, location)
Logico-semantic

Page 9: CS531presentation

Image and Text Relation

Visually-relevant image-tweets
  At least one noun or verb corresponds to part of the image

Non-visual image-tweets
  Image and text have no visual correspondence
  Hard to distinguish by just looking at the images
  May exhibit emotional relevance

Page 10: CS531presentation

Visual/Non-Visual Classification

Dataset Construction
  Crowdsourcing to label a random subset of the image-tweets
    Visual
    Non-visual
  Each image is annotated by 3 different subjects
  4811 image-tweets annotated
    3206 (2/3) visual
    1605 (1/3) non-visual

3 major types of features are used
  Text
  Image
  Context
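The slide says each image was labeled by 3 subjects but does not say how the three labels were merged into one. A minimal sketch, assuming a simple majority vote (an assumption, not something the source confirms); the tweet ids and votes below are made up for illustration:

```python
from collections import Counter

def majority_label(votes):
    """Return the label chosen by most annotators.

    With 3 annotators and 2 possible labels a strict majority always exists.
    """
    label, _count = Counter(votes).most_common(1)[0]
    return label

# Hypothetical annotations: tweet id -> labels given by 3 annotators.
annotations = {
    "t1": ["visual", "visual", "non-visual"],
    "t2": ["non-visual", "non-visual", "non-visual"],
    "t3": ["visual", "non-visual", "non-visual"],
}

# One consolidated label per tweet, plus the resulting class distribution.
gold = {tweet_id: majority_label(votes) for tweet_id, votes in annotations.items()}
class_counts = Counter(gold.values())
```

An odd number of annotators with two classes avoids ties, which is one plausible reason for using 3 subjects per image.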

Page 11: CS531presentation

Visual/Non-Visual Classification

Text Features
  Binary word features
  Previously learned topics from LDA
  Part-of-speech (POS) density features
  Named entities
  Microblog-specific features
    @mentions
    #hashtags
    Geolocation
    URLs
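The microblog-specific features listed above could be extracted roughly as follows. This is only a sketch: the regular expressions assume Twitter-style `@name` and `#tag` syntax, the function and feature names are hypothetical, and geolocation is assumed to arrive as post metadata rather than being parsed from the text.

```python
import re

# Assumed Twitter-style patterns; other platforms may format these differently.
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")
URL_RE = re.compile(r"https?://\S+")

def microblog_features(text, has_geolocation=False):
    """Return binary microblog-specific features for one post."""
    # Strip URLs first so a '#fragment' inside a URL is not counted as a hashtag.
    text_without_urls = URL_RE.sub(" ", text)
    return {
        "has_mention": bool(MENTION_RE.search(text_without_urls)),
        "has_hashtag": bool(HASHTAG_RE.search(text_without_urls)),
        "has_url": bool(URL_RE.search(text)),
        "has_geolocation": has_geolocation,
    }

example = microblog_features("Sunset with @alice #beach http://t.co/abc")
```

Binary indicators are used here to match the binary word features on the same slide; counts would be an equally reasonable choice.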

Page 12: CS531presentation

Visual/Non-Visual Classification

Image Features
  Face detection
  SIFT features with bag-of-visual-words representation
    Applied LDA with k=35

Context Features
  Retweets
  Comments
  Follower ratio
  Posting time
  etc.
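To illustrate the bag-of-visual-words idea from the image features above: each local descriptor is assigned to its nearest codebook entry ("visual word"), and the image is represented by the counts. The sketch below uses toy 2-D points in place of real 128-dimensional SIFT descriptors and a hand-written codebook; in practice the codebook is typically learned by clustering (for example k-means), which is not shown and not specified in the slides.

```python
import math

def nearest_codeword(descriptor, codebook):
    """Index of the codebook entry closest (Euclidean distance) to the descriptor."""
    return min(range(len(codebook)), key=lambda i: math.dist(descriptor, codebook[i]))

def bovw_histogram(descriptors, codebook):
    """Count how many descriptors fall on each visual word."""
    hist = [0] * len(codebook)
    for descriptor in descriptors:
        hist[nearest_codeword(descriptor, codebook)] += 1
    return hist

# Toy data for illustration only (real SIFT descriptors have 128 dimensions).
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, 0.0), (0.9, 1.2), (4.8, 5.1), (5.2, 4.9)]
histogram = bovw_histogram(descriptors, codebook)
```

A histogram like this treats an image as a "document" of visual words, which is what makes it possible to apply a topic model such as LDA to images.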

Page 13: CS531presentation

Experiment

10-fold cross-validation with Naïve Bayes is performed
Macro-averaged F1 score is computed
Baseline uses only words as features
  F1 = 64.8
Each feature is combined with the baseline individually to observe its impact
When all features with a positive impact are combined
  F1 = 70.5
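For reference, macro-averaged F1 can be computed as the unweighted mean of the per-class F1 scores, so the smaller non-visual class counts as much as the visual class. The sketch below uses that definition (the slides do not say which macro-F1 variant was used) on made-up labels:

```python
def f1_for_class(y_true, y_pred, label):
    """F1 score treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        return 0.0  # Avoids division by zero when nothing is correctly predicted.
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, label) for label in labels) / len(labels)

# Made-up gold labels and predictions for illustration.
y_true = ["visual", "visual", "visual", "visual", "non-visual", "non-visual"]
y_pred = ["visual", "visual", "visual", "non-visual", "non-visual", "visual"]
score = macro_f1(y_true, y_pred)
```

In a 10-fold setup, a score like this would be computed on each held-out fold and then averaged, or computed once over the pooled predictions; the slides do not state which.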

Page 14: CS531presentation

Experiment


Page 15: CS531presentation

Proposed Work

Re-rank images of image-tweets returned by Twitter search

Select good images to represent Trending Topics

Twitter was scraped and some initial results were obtained using
  Retweets and Favorites as contextual features
  SIFT as image features to compare images
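One way the contextual part of the proposed re-ranking might look is sketched below. The scoring formula, the weights, and the field names are all assumptions made for illustration; the slides only say that retweets and favorites were used, not how they were combined, and the SIFT-based image comparison is not covered here.

```python
import math

def engagement_score(retweets, favorites, w_retweet=1.0, w_favorite=1.0):
    """Hypothetical score: weighted sum of log-damped engagement counts."""
    # log1p keeps a few viral posts from dominating the ranking.
    return w_retweet * math.log1p(retweets) + w_favorite * math.log1p(favorites)

def rerank(image_tweets):
    """Sort image-tweets from highest to lowest engagement score."""
    return sorted(
        image_tweets,
        key=lambda t: engagement_score(t["retweets"], t["favorites"]),
        reverse=True,
    )

# Made-up search results for one trending topic.
results = [
    {"id": "a", "retweets": 3, "favorites": 10},
    {"id": "b", "retweets": 120, "favorites": 40},
    {"id": "c", "retweets": 0, "favorites": 0},
]
ranked_ids = [t["id"] for t in rerank(results)]
```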

Page 16: CS531presentation

Initial Results


Page 17: CS531presentation

QUESTIONS?


Thank You