cs 1699: intro to computer vision interactive image search ...kovashka/cs1699_fa15/... · visual...
TRANSCRIPT
![Page 1: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/1.jpg)
CS 1699: Intro to Computer Vision
Interactive Image Searchwith Attributes
Prof. Adriana KovashkaUniversity of Pittsburgh
November 5, 2015
![Page 2: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/2.jpg)
We Need Search to Access Visual Data
• 144,000 hours of video uploaded to YouTube daily
• 4.5 million photos uploaded to Flickr daily
• 150 million top-level domains
• This data pool is too big to simply browse
![Page 3: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/3.jpg)
Some Example Image Search Problems
… “Who did the witness see at the crime scene?”
… “Find all images depictingshiny objects.”
…“Which plant
is this?”
![Page 4: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/4.jpg)
How Is Image Search Done Today?
• Keywords work well for categories with known name
Marionberry?
marionberry
Marionberry.jpg Marion-berries-2.jpg
Marionberry-jam.jpg
![Page 5: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/5.jpg)
How Is Image Search Done Today?
• Unclear how to search without name or image
• Keywords are not enough!
???
![Page 6: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/6.jpg)
Solution? Interactively Describing Visual Content
It’s a yellow fruit.
Is it a lemon?
It has spikes. Is it a durian?
It is green on the inside. Is it a horned
melon?
![Page 7: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/7.jpg)
Solution? Interactively Describing Visual Content
It’s a yellow fruit.
Is it a lemon?
It has spikes. Is it a durian?
It is green on the inside. Is it a horned
melon?
![Page 8: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/8.jpg)
• Potential to communicate more precisely the desired visual content
• Iteratively refine the set of retrieved images
My Goal: Precise Communication for Search
Feedback
Results
![Page 9: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/9.jpg)
Key Questions
1) How to open up communication?
2) How to account for ambiguity of search terms?
“Bright” “Bright”
Target is more smiling than
![Page 10: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/10.jpg)
How is Interactive Search Done Today?
• Traditional binary feedback imprecise; allows only coarse communication between user and system
Keywords + binary relevance feedback
thin white male
[Rui et al. 1998, Zhou et al. 2003, Tong & Chang 2001, Cox et al. 2000, Ferecatu & Geman 2007, …]
relevantirrelevant
![Page 11: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/11.jpg)
Our Idea: Search via Comparisons
• Allow user to “whittle away” irrelevant images via comparative feedback on properties of results
“Like this… but with curlier hair”
Kovashka, Parikh, and Grauman, CVPR 2012
![Page 12: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/12.jpg)
Visual Attributes
• High-level semantic properties shared by objects
smiling large-lips
long-hair
natural
perspective openyellow
spiky smooth
high
heel
redornaments
metallic
![Page 13: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/13.jpg)
Why Infer Properties
We want to be able to infer something about unfamiliar objects
Cat Horse Dog ???
If we can infer category names…
Familiar Objects New Object
Slide credit: Derek HoiemFarhadi et al., CVPR 2009
![Page 14: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/14.jpg)
Why Infer Properties
We want to be able to infer something about unfamiliar objects
Has Stripes
Has Ears
Has Eyes
Has Four Legs
Has Mane
Has Tail
Has Snout
Brown
Muscular
Has Snout
Has Stripes (like cat)
Has Mane and Tail (like horse)
Has Snout (like horse and dog)
Familiar Objects New Object
If we can infer properties…
Slide credit: Derek HoiemFarhadi et al., CVPR 2009
![Page 15: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/15.jpg)
Binary Attributes
bright / not bright
smiling / not smiling
natural / not natural
![Page 16: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/16.jpg)
We need ability to compare images by attribute “strength”
bright
smiling
natural
Relative Attributes
![Page 17: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/17.jpg)
Learning Relative Attributes
• At test time, predict attribute strength of each database image
• Input: Image features x• Output: Real-valued attribute strength am (x)
• At training time, learn a mapping between image features and attribute strength
• Input: Pairs of ordered images with features
• Output: Ranking functions a1, … , aM
Parikh and Grauman, ICCV 2011
![Page 18: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/18.jpg)
Learning Relative Attributes
• We want to learn a spectrum (ranking model) for an attribute, e.g. “brightness”.
• Supervision from human annotators consists of:
Parikh and Grauman, ICCV 2011
Ordered pairs
Similar pairs
![Page 19: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/19.jpg)
Learn a ranking function
that best satisfies the constraints:
Image features
Learned parameters
Learning Relative Attributes
![Page 20: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/20.jpg)
Max-margin learning to rank formulation
Image Relative attribute score
Learning Relative Attributes
Parikh and Grauman, ICCV 2011; Joachims, KDD 2002
wm
![Page 21: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/21.jpg)
We need ability to compare images by attribute “strength”
bright
smiling
natural
Relative Attributes
![Page 22: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/22.jpg)
WhittleSearch with Relative Attribute Feedback
Results Page 1
Update relevance
scores
score=7
score=5
score=4
score=4score=1
?User: “I want somethingmore natural than this.”
Kovashka, Parikh, and Grauman, CVPR 2012
![Page 23: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/23.jpg)
WhittleSearch with Relative Attribute Feedback
natural
perspective“I want
something more natural
than this.” “I want something less naturalthan this.”
“I want something with more perspective than this.”
Kovashka, Parikh, and Grauman, CVPR 2012
+1
+1
+1
+1
+1
+1
+1 +1
+1
+1 +1
![Page 24: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/24.jpg)
More open than
Qualitative Result (Relative Attribute Feedback)
More open than
Less ornaments than
Match
Round 1
Ro
un
d 2
Round 3
Query: “I want a bright, open shoe that is shorton the leg.”
Selected feedback
![Page 25: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/25.jpg)
Shoes [Berg10, Kovashka12]:14,658 shoe images;
10 attributes: “pointy”, “bright”, “high-heeled”, “feminine” etc.
OSR [Oliva01]:2,688 scene images;
6 attributes:“natural”, “perspective”,
“open-air”, “close-depth” etc.
PubFig [Kumar08]:772 face images;
11 attributes:“masculine”, “young”,
“smiling”, “round-face”, etc.
Datasets Data from 147 users
![Page 26: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/26.jpg)
WhittleSearch Results (Summary)
• Binary feedback represents status quo [Rui et al. 1998, Cox et al. 2000, Ferecatu & Geman 2007, …]
• WhittleSearch finds relevant results faster than traditional binary feedback
![Page 27: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/27.jpg)
WhittleSearch Demo
![Page 28: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/28.jpg)
Impact of WhittleSearch: Adobe Font Selection
• Users retrieve fonts that match requested attributes
• Fonts sorted by relative attribute scores
O’Donovan et al., Exploratory Font Selection using Crowdsourced Attributes, SIGGRAPH 2014
![Page 29: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/29.jpg)
• The most relevant images might not be most informative
• Existing active methods largely focus on binary relevance feedback and suffer from expensive selection procedures
[Tong & Chang 2001, Li et al. 2001, Cox et al. 2000, Ferecatu & Geman 2007, …]
Problem: On What Should User Give Feedback?
Page 1
![Page 30: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/30.jpg)
Idea: Attribute Pivots for Guiding Feedback
• Relative 20 questions game
• Select series of most informative visual comparisons that user should make to help deduce target
?More
Less
Are the shoes you seekmore or less feminine than ?
… more or less bright than ?
Kovashka and Grauman, ICCV 2013
![Page 31: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/31.jpg)
Are the shoes you seekmore or less feminine than ?
Idea: Attribute Pivots for Guiding Feedback
#1 pointy#2 open#3 bright…
#M feminine…
X
#1
#2
#3
…
#N
pointy,
open,
bright,
feminine,
Pivots#1
#2
#3
…
#M
O(MN) comparisons total
M: ~11 N: ~15k
O(M) comparisons total
Traditional active approach Using pivots
![Page 32: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/32.jpg)
Key Questions
1) How to open up communication?
2) How to account for ambiguity of search terms?
“Bright” “Bright”
Target is more smiling than
![Page 33: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/33.jpg)
Standard Attribute Learning: One Generic Model
• Learn a generic model by pooling training data regardless of the annotator identity
• Inter-annotator disagreement treated as noise
Vote on labels
“formal”
“not formal”
Crowd
![Page 34: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/34.jpg)
Problem: One Model Does Not Fit All
• There may be valid perceptual differences within an attribute, yet existing methods assume monolithic attribute sufficient
[Lampert et al. CVPR 2009, Farhadi et al. CVPR 2009,Branson et al. ECCV 2010, Kumar et al. PAMI 2011,Scheirer et al. CVPR 2012, Parikh & Grauman ICCV 2011, …]
Formal? User labels: 50% “yes”50% “no”
or
More ornamented? User labels: 50% “first”20% “second”30% “equally”
Binary attribute Relative attribute
![Page 35: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/35.jpg)
Personalization from Scratch
• Collect data from the user and train a classifier
“formal”
“not formal”
“formal”
“not
formal”
User
![Page 36: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/36.jpg)
Idea: Learn User-Specific Attributes
• Treat learning perceived attributes as an adaptation problem
• Adapt generic attribute model with minimal user-specific labeled examples
Standard
approach:Vote on labels
Our idea:
“formal”
“not formal”
“formal”
“not
formal”
“formal”
“not formal”
Crowd
User
Kovashka and Grauman, ICCV 2013
![Page 37: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/37.jpg)
• Adapting binary attribute classifiers:
J. Yang et al. ICDM 2007
Given user-labeled data
and generic model ,
learn adapted model ,
Learning Adapted Attributes
![Page 38: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/38.jpg)
Learning Adapted Attributes
“formal”
“not formal”
“formal”
“not formal”Generic boundary
Adapted boundary
![Page 39: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/39.jpg)
SUN Attributes [Patterson12]:
14,340 scene images12 attributes:
“sailing”, “hiking”, “vacationing”, “open area”,
“vegetation”, etc.
Datasets
Shoes [Berg10, Kovashka12]:14,658 shoe images;
10 attributes: “pointy”, “bright”, “high-heeled”, “feminine” etc.
![Page 40: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/40.jpg)
Adapted Attribute Accuracy
• Result over 3 datasets, 32 attributes, and 75 total users
• Our user-adaptive method most accurately captures perceived attributes
![Page 41: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/41.jpg)
0
10
20
30
40
50
60
70
Shoes-Binary SUN
generic generic+
user-exclusive user-adaptive
Mat
ch r
ate
Personalizing Image Searchwith Adapted Attributes
“white shiny heels”
“shinier than ”
![Page 42: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/42.jpg)
Which Images Most Influence Adaptation?
pointy open bright ornamented shiny high-heeled
long formal sporty feminine
sailing vacationing hiking camping socializing shopping
vegetationcloudsnatural light cold open area horizon far
![Page 43: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/43.jpg)
Attribute Model Spectrum
Generic ? User-adaptive
No personalization Personalized to each user
Robustness to noise via majority vote
No robustness to noise
Assumes all users have thesame notion of the attribute
Assumes each user has aunique notion of the attribute
![Page 44: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/44.jpg)
Problem: User-Specific Extreme
• Different groups of users might subscribe to different shades of meaning of an attribute term
• How can we discover these shades automatically?
‘s color is…
“голубой”(azure)
“青”(blue and
green)
“blue”
![Page 45: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/45.jpg)
Our Idea: Discovering Shades of Attributes
• Discover “schools of thought” among users based on latent factors behind their use of attribute terms
• Allows discovery of the attribute’s “shades of meaning”
Shade 2
Shade 3
Shade 1
Fashionable shoes
Kovashka and Grauman, IJCV 2015
![Page 46: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/46.jpg)
Approach: Collecting Crowd Labels
• Show users definitions but no example images• Ask them to label images with presence of attribute• Also ask for explanations of some responses
Attribute Definition
Pointy having a comparatively sharp point, or having numerous pointed parts
Open having interspersed gaps, spaces, or intervals
Ornate made in an intricate shape or decorated with complex patterns
Comfortable providing physical comfort, ease and relaxation
Formal designed for wear or use at elaborate ceremonial or social events
Brown the color of, for example, chocolate and coffee
Fashionable conforming to the current fashion; stylish; trendy; modern
To clutter to make disorderly or hard to use by filling or covering with objects
To soothe to bring comfort, composure, or relief
Open (area) affording unobstructed passage or view
Modern characteristic or expressive of recent times or the present; contemporary
Rustic of, relating to, or typical of country life or country people
![Page 47: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/47.jpg)
Approach: Recovering Latent Factors
• Given: a partially observed attribute-specific label matrix
• Want to recover its latent factors via matrix factorization
• Gives a representation of each user
Image 1 Image 2 Image 3 Image 4
Annotator 1 1 ? 0 ?
Annotator 2 ? 1 0 ?
Annotator 3 1 ? ? 0
Annotator 4 0 ? 1 ?
Annotator 5 ? ? 1 1
Annotator 6 ? 0 ? 1
Annotator 7 1 0 1 ?
Factor 1 (toe?)
Factor 2(heel?)
Annotator 1 0.85 0.12
Annotator 2 0.72 0.21
Annotator 3 0.91 0.17
Annotator 4 0.07 0.95
Annotator 5 0.50 0.92
Annotator 6 0.15 0.75
Annotator 7 0.45 0.50
Is the attribute “open” present in the image?
![Page 48: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/48.jpg)
Approach: Discovering Shades
• Cluster users in the space of latent factor representations
• Use K-means; select K automatically
• Gives a representation of each shade for this attribute
Factor 1 (toe?)
Factor 2(heel?)
Annotator 1 0.85 0.12
Annotator 2 0.72 0.21
Annotator 3 0.91 0.17
Annotator 4 0.07 0.95
Annotator 5 0.50 0.92
Annotator 6 0.15 0.75
Annotator 7 0.45 0.50
Shade 1
Shade 2
Attribute: “open”
![Page 49: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/49.jpg)
Approach: Using Shades to Predict Attributes
Vote on labels
“open”
“not open”
Crowd
Vote on labels Vote on labels Vote on labels
Adapt Adapt Adapt
![Page 50: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/50.jpg)
Results: Accuracy of Attribute Prediction
• Shades make personalization more robust
• Shades’ advantage transfers to more accurate search
User-adp: [Kovashka and Grauman ICCV 2013]Attr disc: [Rastegari et al. ECCV 2012]
![Page 51: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/51.jpg)
Collected Labels and ExplanationsImage Attribute Present? Explanation
Ornate No "Ornate means decorated with extra items not inherent in the making of the object. This boot has a camo print as part of the
object, but no additional items put on it."
Ornate Yes "The flowerprint pattern is unorthodox for a rubber boot and really stands out against the jet black background."
![Page 52: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/52.jpg)
Collected Labels and ExplanationsImage Attribute Present? Explanation
Ornate No "Ornate means decorated with extra items not inherent in the making of the object. This boot has a camo print as part of the
object, but no additional items put on it."
Ornate Yes "The flowerprint pattern is unorthodox for a rubber boot and really stands out against the jet black background."
Open area Yes "This is an enclosed area, but the room is very large and the ceiling is very high, giving a lot of room. I think that this makes it an
enclosed area that is also an open area. "
Open area No "I do not consider the image to show an open area because the area shown is enclosed by walls. It is a larger space on the interior
of the building so it does have some aspects of an open space."
![Page 53: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/53.jpg)
Collected Labels and ExplanationsImage Attribute Present? Explanation
Ornate No "Ornate means decorated with extra items not inherent in the making of the object. This boot has a camo print as part of the
object, but no additional items put on it."
Ornate Yes "The flowerprint pattern is unorthodox for a rubber boot and really stands out against the jet black background."
Open area Yes "This is an enclosed area, but the room is very large and the ceiling is very high, giving a lot of room. I think that this makes it an
enclosed area that is also an open area. "
Open area No "I do not consider the image to show an open area because the area shown is enclosed by walls. It is a larger space on the interior
of the building so it does have some aspects of an open space."
Comfortable Yes "The heel is shorter and looks more sturdy with the thickness of the heel which would make it more comfortable then your typical
heel."
Formal Yes "I believe the formal aspect of this should is the color and design of the the fabrics on this shoe. I felt this shoe would be used by a
person who wanted to be formal yet comfortable. "
![Page 54: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/54.jpg)
Results: Discovered Shades
COMFORTABLE
![Page 55: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/55.jpg)
Results: Discovered Shades
OPEN AREA
![Page 56: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/56.jpg)
Results: Discovered Shades
SOOTHING
![Page 57: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/57.jpg)
Adapting Attributes by Selecting Features Similar across Domains
Liu and Kovashka, WACV 2016
![Page 58: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/58.jpg)
Adapting Attributes by Selecting Features Similar across Domains
• Idea: Learn model for new domain by adapting a model for an existing domain
• Idea: Rely on features that are similar between the two domains to make the adaptation possible
Liu and Kovashka, WACV 2016
![Page 59: CS 1699: Intro to Computer Vision Interactive Image Search ...kovashka/cs1699_fa15/... · Visual Attributes •High-level semantic properties shared by objects smiling large-lips](https://reader034.vdocuments.site/reader034/viewer/2022051813/60314e958ad38b334908d568/html5/thumbnails/59.jpg)
Adapting Attributes by Selecting Features Similar across Domains
Liu and Kovashka, WACV 2016
A naïve adaptation method fails to outperform a method that learns from scratch, but feature selection makes our method most useful whendata is scarce.