Fashion 10000: An Enriched Dataset of Fashion and Clothing

Fashion 10000: An Enriched Dataset of Fashion and Clothing. Presentation: Michael Riegler, Klagenfurt University & TU Delft; Babak Loni, TU Delft; Lei Yen Cheung, TU Delft; Alessandro Bozzon, TU Delft; Luke Gottlieb, ICSI; Martha Larson, TU Delft

Upload: michael-riegler

Post on 22-Apr-2015

DESCRIPTION

Presentation of the Fashion 10000 data set for the ACM Multimedia Systems Conference 2014.

TRANSCRIPT

Page 1: Fashion 10000: An Enriched Dataset of Fashion and Clothing

Fashion 10000: An Enriched Dataset of Fashion and Clothing

Presentation: Michael Riegler, Klagenfurt University & TU Delft
Babak Loni, TU Delft
Lei Yen Cheung, TU Delft
Alessandro Bozzon, TU Delft
Luke Gottlieb, ICSI
Martha Larson, TU Delft

Page 2: Fashion 10000: An Enriched Dataset of Fashion and Clothing

Table of Contents
• Introduction
• Dataset Collection
• Dataset Annotation
  – Statistics
• Applications of Dataset
• Conclusion

Page 3: Fashion 10000: An Enriched Dataset of Fashion and Clothing

The Dataset
• Social images
• At least 10,000 fashion-related images
• Social metadata
• Creative Commons images
• Annotated with different labels

Page 4: Fashion 10000: An Enriched Dataset of Fashion and Clothing

The Collection

Wikipedia

470 Fashion Categories

Flickr

- Query only CC attribution images
- Query should also appear in tags
- Top relevant images

32K images, 262 categories

Flickr Fashion 10000

+ MTurk Annotations + Metadata
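
The three Flickr query constraints on this slide can be sketched as follows. This is a minimal illustration and not the authors' collection script: it assumes the public `flickr.photos.search` REST method, assumes that license id 4 denotes the CC Attribution license, and assumes that tags arrive as one space-separated string of normalized tags (as returned with `extras=tags`). The helper names `build_search_url` and `query_in_tags` and the page size of 200 are hypothetical.

```python
from urllib.parse import urlencode

FLICKR_REST = "https://api.flickr.com/services/rest/"
# Assumption: id 4 is Flickr's "Attribution License" (CC BY).
CC_BY_LICENSE_ID = "4"


def build_search_url(category: str, api_key: str, per_page: int = 200) -> str:
    """Build a flickr.photos.search request for one fashion category."""
    params = {
        "method": "flickr.photos.search",
        "api_key": api_key,
        "text": category,             # the Wikipedia category name is the query
        "license": CC_BY_LICENSE_ID,  # query only CC attribution images
        "sort": "relevance",          # top relevant images first
        "extras": "tags,license",     # ask for tags so they can be checked locally
        "per_page": per_page,
        "format": "json",
        "nojsoncallback": 1,
    }
    return FLICKR_REST + "?" + urlencode(params)


def normalize(text: str) -> str:
    """Lowercase and drop non-alphanumerics, mimicking Flickr's tag cleaning."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def query_in_tags(category: str, photo: dict) -> bool:
    """Keep a photo only if the query also appears among its tags."""
    tags = photo.get("tags", "").split()
    return normalize(category) in {normalize(tag) for tag in tags}
```

Filtering on tags after the search keeps the request simple. Whether the original collection applied this constraint in the query itself or afterwards is not stated on the slide.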

Page 5: Fashion 10000: An Enriched Dataset of Fashion and Clothing

Metadata
• Collected in XML and CSV format

– Title, description, owner, Tags, Location, geo-parameters

• Additional metadata: Info, Geos, Context, Tags, Notes, Favorites, Urls, Comments

Page 6: Fashion 10000: An Enriched Dataset of Fashion and Clothing

General Statistics

Pairs (fashion item, photo): 32,398
Number of distinct fashion categories: 262
Max/avg/min number of photos per fashion item: 200 / 122.95 / 10
Number of photos with geo annotations: 7,933
Total number of comments: 58,578
Max/avg/min number of comments per photo: 575 / 7.35 / 1
Total number of (tag, photo) pairs: 460,907
Total number of distinct tags: 56,275
Max/avg/min number of tags per photo: 136 / 15.15 / 1
Total number of (note, photo) pairs: 5,892
Max/avg/min number of notes per photo: 195 / 5.31 / 1
Total number of favorites: 37,131
Max/avg/min number of favorites per photo: 20 / 3.61 / 1
Total number of contexts: 110,505
Max/avg/min number of contexts per photo: 206 / 3.93 / 1
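
The max/avg/min rows can be reproduced from per-photo counts with a few lines of arithmetic. The sketch below is illustrative only and uses toy numbers, not the dataset. Because every minimum on the slide is 1, it assumes that photos with zero items (for example, no comments) were left out of the per-photo statistics; the slide does not state this. The function name `summarize` is hypothetical.

```python
def summarize(counts_per_photo):
    """Return (total, max, avg, min) over photos that have at least one item.

    Assumption: photos with a count of zero are excluded, which would explain
    why every minimum reported on the slide is 1.
    """
    nonzero = [count for count in counts_per_photo if count > 0]
    if not nonzero:
        return 0, 0, 0.0, 0
    average = round(sum(nonzero) / len(nonzero), 2)
    return sum(nonzero), max(nonzero), average, min(nonzero)


# Toy example: number of comments on five photos (not real data).
comments_per_photo = [0, 3, 1, 0, 8]
total, highest, average, lowest = summarize(comments_per_photo)
print(f"total={total} max/avg/min={highest} / {average} / {lowest}")
```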

Page 10: Fashion 10000: An Enriched Dataset of Fashion and Clothing

Dataset Annotation
• Some images might not be relevant to fashion and clothing
• The ground truth differentiates relevant from non-relevant images

Page 11: Fashion 10000: An Enriched Dataset of Fashion and Clothing

Dataset Annotation
• We used Amazon Mechanical Turk (AMT) to create ground truth for the images
• The fashion category is described with a definition from Wikipedia
• 6 questions to create 6 labels for each of the images
• We also asked workers about their familiarity with the fashion category

Page 12: Fashion 10000: An Enriched Dataset of Fashion and Clothing

HIT Design

Page 13: Fashion 10000: An Enriched Dataset of Fashion and Clothing

HIT Design

Page 14: Fashion 10000: An Enriched Dataset of Fashion and Clothing

HIT Questions (Labels)

Question: Possible values
Q1) Fashion / clothing related: yes – no – notsure
Q2) Specialty clothing item (image category): yes – no – notsure
Q3) Number of people: nopeople – onepeople – manypeople
Q4) Professional model or not?: yes – no – notapp (not applicable)
Q5) Person wearing fashion?: yes – no – noperson – notapp (not applicable)
Q6) Formal / informal: formalmen – formalwomen – informalmen – informalwomen – other (cross-dressing or multiple persons) – notapp (not applicable)
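
The label vocabulary above maps naturally onto a small lookup table. The sketch below validates worker answers against the allowed values and derives one label per image by majority vote. Majority voting is an assumption: the slides do not say how the answers of several workers were merged into the ground truth. The dictionary keys and the function `majority_label` are hypothetical names, while the label values are copied from the slide.

```python
from collections import Counter

# Allowed values per question, as listed on the slide (key names are made up).
LABELS = {
    "q1_fashion_related": {"yes", "no", "notsure"},
    "q2_specialty_item": {"yes", "no", "notsure"},
    "q3_num_people": {"nopeople", "onepeople", "manypeople"},
    "q4_professional_model": {"yes", "no", "notapp"},
    "q5_wearing_fashion": {"yes", "no", "noperson", "notapp"},
    "q6_formality": {
        "formalmen", "formalwomen", "informalmen",
        "informalwomen", "other", "notapp",
    },
}


def majority_label(question, answers):
    """Return the most frequent valid answer, or None for a tie or no answers."""
    invalid = [answer for answer in answers if answer not in LABELS[question]]
    if invalid:
        raise ValueError(f"unknown value(s) for {question}: {invalid}")
    ranked = Counter(answers).most_common()
    if not ranked:
        return None
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # tie: leave the image unlabeled for this question
    return ranked[0][0]


# Example: three workers answer Q3 for the same image.
print(majority_label("q3_num_people", ["onepeople", "onepeople", "manypeople"]))
```

Returning `None` on a tie keeps ambiguous images out of the ground truth instead of picking a label arbitrarily. Other tie-breaking rules are equally plausible.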

Page 15: Fashion 10000: An Enriched Dataset of Fashion and Clothing

Annotation Statistics

Total number of assignments: 24,457
Percentage of rejected assignments: 4%
Total number of unique workers: 1,470
Avg. number of assignments per worker: 17
Avg. completion time: 127 sec
Avg. familiarity of workers with fashion items: 5.8 (range 1-7)

Question     1     2     3     4     5     6
Kappa value  0.66  0.65  0.85  0.51  0.38  0.48
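
The slide reports one kappa value per question but does not name the variant. As an illustration of how such an agreement score can be computed for several workers per image, the sketch below implements Fleiss' kappa. This is an assumed choice, not necessarily the measure used by the authors, and it requires the same number of ratings for every image. The function name `fleiss_kappa` and the toy input are hypothetical.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of raters
    who assigned item i to category j. Every item needs the same number of
    ratings (at least 2)."""
    if not counts:
        raise ValueError("need at least one item")
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("every item needs the same number (>= 2) of ratings")

    # Observed agreement: share of agreeing rater pairs, averaged over items.
    p_bar = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_items

    # Chance agreement from the overall proportion of each category.
    n_categories = len(counts[0])
    p_e = sum(
        (sum(row[j] for row in counts) / (n_items * n_raters)) ** 2
        for j in range(n_categories)
    )
    if p_e == 1.0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return (p_bar - p_e) / (1 - p_e)


# Toy example: 4 images, 3 workers each, categories (yes, no, notsure).
ratings = [[3, 0, 0], [2, 1, 0], [0, 3, 0], [1, 1, 1]]
print(round(fleiss_kappa(ratings), 3))
```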

Page 16: Fashion 10000: An Enriched Dataset of Fashion and Clothing

Dataset Statistics
• Using the generated ground truth, the statistics about the images were calculated

Number of fashion-related images: 18,487
Number of images with many people: 7,417
Number of images with one person: 9,771
Number of images with no person: 13,179
Number of images with intention of showing fashion: 9,096
Number of professional fashion images: 2,814

Page 17: Fashion 10000: An Enriched Dataset of Fashion and Clothing

Applications of the Dataset
• Developing social media content analysis
  – Game with a purpose (domino game)
• Basis for a Brave New Task in the MediaEval multimedia benchmarking initiative
• Use case for the proof of intentional framing

Page 18: Fashion 10000: An Enriched Dataset of Fashion and Clothing

Conclusion
• Fashion dataset
• Six different labels
• AMT-generated ground truth
• Can be used in various research areas
• Evaluated in the MediaEval Benchmark

Page 19: Fashion 10000: An Enriched Dataset of Fashion and Clothing

Michael Riegler

Thank you!