![Page 1: Jitendra Malik University of California at Berkeleylcm.csa.iisc.ernet.in/indous-symposium/slides/jitendra...Jitendra Malik University of California at Berkeley Computer Vision Group](https://reader033.vdocuments.site/reader033/viewer/2022051510/5ff76174797ffc7c191ed0c9/html5/thumbnails/1.jpg)
Visual Recognition:
Prospects for Image & Video Analytics
Jitendra Malik
University of California at Berkeley
![Page 2: Jitendra Malik University of California at Berkeleylcm.csa.iisc.ernet.in/indous-symposium/slides/jitendra...Jitendra Malik University of California at Berkeley Computer Vision Group](https://reader033.vdocuments.site/reader033/viewer/2022051510/5ff76174797ffc7c191ed0c9/html5/thumbnails/2.jpg)
Computer Vision Group
UC Berkeley
Classification & Segmentation
Tiger
Grass
Water
Sand
outdoor
wildlife
Tiger
tail
eye
legs
head
back
shadow
mouth
![Page 3: Jitendra Malik University of California at Berkeleylcm.csa.iisc.ernet.in/indous-symposium/slides/jitendra...Jitendra Malik University of California at Berkeley Computer Vision Group](https://reader033.vdocuments.site/reader033/viewer/2022051510/5ff76174797ffc7c191ed0c9/html5/thumbnails/3.jpg)
PASCAL Visual Object Challenge
![Page 4: Jitendra Malik University of California at Berkeleylcm.csa.iisc.ernet.in/indous-symposium/slides/jitendra...Jitendra Malik University of California at Berkeley Computer Vision Group](https://reader033.vdocuments.site/reader033/viewer/2022051510/5ff76174797ffc7c191ed0c9/html5/thumbnails/4.jpg)
We want to locate the object
Orig. Image Segmentation Orig. Image Segmentation
![Page 5: Jitendra Malik University of California at Berkeleylcm.csa.iisc.ernet.in/indous-symposium/slides/jitendra...Jitendra Malik University of California at Berkeley Computer Vision Group](https://reader033.vdocuments.site/reader033/viewer/2022051510/5ff76174797ffc7c191ed0c9/html5/thumbnails/5.jpg)
Fifty years of computer vision 1963-2013
• 1960s: Beginnings in artificial intelligence, image processing and pattern recognition
• 1970s: Foundational work on image formation: Horn, Koenderink, Longuet-Higgins …
• 1980s: Vision as applied mathematics: geometry, multi-scale analysis, probabilistic modeling, control theory, optimization
• 1990s: Geometric analysis largely completed, vision meets graphics, statistical learning approaches resurface
• 2000s: Significant advances in visual recognition, range of practical applications
![Page 6: Jitendra Malik University of California at Berkeleylcm.csa.iisc.ernet.in/indous-symposium/slides/jitendra...Jitendra Malik University of California at Berkeley Computer Vision Group](https://reader033.vdocuments.site/reader033/viewer/2022051510/5ff76174797ffc7c191ed0c9/html5/thumbnails/6.jpg)
![Page 7: Jitendra Malik University of California at Berkeleylcm.csa.iisc.ernet.in/indous-symposium/slides/jitendra...Jitendra Malik University of California at Berkeley Computer Vision Group](https://reader033.vdocuments.site/reader033/viewer/2022051510/5ff76174797ffc7c191ed0c9/html5/thumbnails/7.jpg)
Handwritten digit recognition
(MNIST,USPS)
• LeCun’s Convolutional Neural Network variations (0.8%, 0.6% and 0.4% error on MNIST)
• Tangent Distance (Simard, LeCun & Denker: 2.5% error on USPS)
• Randomized Decision Trees (Amit, Geman & Wilder: 0.8% error)
• K-NN based Shape context/TPS matching (Belongie, Malik & Puzicha: 0.6% error on MNIST)
![Page 8: Jitendra Malik University of California at Berkeleylcm.csa.iisc.ernet.in/indous-symposium/slides/jitendra...Jitendra Malik University of California at Berkeley Computer Vision Group](https://reader033.vdocuments.site/reader033/viewer/2022051510/5ff76174797ffc7c191ed0c9/html5/thumbnails/8.jpg)
EZ-Gimpy Results (Mori & Malik, 2003)
• 171 of 192 images correctly identified: 92 %
horse
smile
canvas
spade
join
here
![Page 9: Jitendra Malik University of California at Berkeleylcm.csa.iisc.ernet.in/indous-symposium/slides/jitendra...Jitendra Malik University of California at Berkeley Computer Vision Group](https://reader033.vdocuments.site/reader033/viewer/2022051510/5ff76174797ffc7c191ed0c9/html5/thumbnails/9.jpg)
Results on various images submitted to the CMU on-line face detector http://www.vasc.ri.cmu.edu/cgi-bin/demos/findface.cgi
Face Detection Carnegie Mellon University
![Page 10: Jitendra Malik University of California at Berkeleylcm.csa.iisc.ernet.in/indous-symposium/slides/jitendra...Jitendra Malik University of California at Berkeley Computer Vision Group](https://reader033.vdocuments.site/reader033/viewer/2022051510/5ff76174797ffc7c191ed0c9/html5/thumbnails/10.jpg)
Multiscale sliding window
Ask this question repeatedly, varying position, scale, category…
Paradigm introduced by Rowley, Baluja & Kanade 96 for face detection; later used by Viola & Jones 01, Dalal & Triggs 05, and Felzenszwalb, McAllester & Ramanan 08
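The "ask this question repeatedly" loop can be sketched as follows. This is a minimal illustration, not the implementation of any of the cited detectors: the window size, stride, scale step, and the `classifiers` dictionary of per-category scoring functions are all hypothetical, and real systems usually rescale the image rather than the window.

```python
from typing import Callable, Dict, Iterator, List, Tuple


def sliding_windows(img_w: int, img_h: int, win: int = 64, stride: int = 16,
                    scale_step: float = 1.5, max_scales: int = 5
                    ) -> Iterator[Tuple[int, int, int]]:
    """Yield square windows (x, y, size) over all positions and scales."""
    size = float(win)
    for _ in range(max_scales):
        s = int(round(size))
        if s > img_w or s > img_h:
            break  # window no longer fits inside the image
        for y in range(0, img_h - s + 1, stride):
            for x in range(0, img_w - s + 1, stride):
                yield x, y, s
        size *= scale_step


def detect(img_w: int, img_h: int,
           classifiers: Dict[str, Callable[[int, int, int], float]],
           threshold: float = 0.5, **window_args
           ) -> List[Tuple[str, int, int, int, float]]:
    """Score every window with every category classifier (hypothetical API)."""
    hits = []
    for x, y, s in sliding_windows(img_w, img_h, **window_args):
        for category, score_fn in classifiers.items():
            score = score_fn(x, y, s)
            if score >= threshold:
                hits.append((category, x, y, s, score))
    return hits
```

The cost grows with the product of positions, scales, and categories, which is why the per-window classifier has to be cheap.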
![Page 11: Jitendra Malik University of California at Berkeleylcm.csa.iisc.ernet.in/indous-symposium/slides/jitendra...Jitendra Malik University of California at Berkeley Computer Vision Group](https://reader033.vdocuments.site/reader033/viewer/2022051510/5ff76174797ffc7c191ed0c9/html5/thumbnails/11.jpg)
Caltech-101 [Fei-Fei et al. 04]
• 102 classes, 31-300 images/class
![Page 12: Jitendra Malik University of California at Berkeleylcm.csa.iisc.ernet.in/indous-symposium/slides/jitendra...Jitendra Malik University of California at Berkeley Computer Vision Group](https://reader033.vdocuments.site/reader033/viewer/2022051510/5ff76174797ffc7c191ed0c9/html5/thumbnails/12.jpg)
Caltech 101 classification results
(even better by combining cues..)
![Page 13: Jitendra Malik University of California at Berkeleylcm.csa.iisc.ernet.in/indous-symposium/slides/jitendra...Jitendra Malik University of California at Berkeley Computer Vision Group](https://reader033.vdocuments.site/reader033/viewer/2022051510/5ff76174797ffc7c191ed0c9/html5/thumbnails/13.jpg)
PASCAL Visual Object Challenge
![Page 14: Jitendra Malik University of California at Berkeleylcm.csa.iisc.ernet.in/indous-symposium/slides/jitendra...Jitendra Malik University of California at Berkeley Computer Vision Group](https://reader033.vdocuments.site/reader033/viewer/2022051510/5ff76174797ffc7c191ed0c9/html5/thumbnails/14.jpg)
![Page 15: Jitendra Malik University of California at Berkeleylcm.csa.iisc.ernet.in/indous-symposium/slides/jitendra...Jitendra Malik University of California at Berkeley Computer Vision Group](https://reader033.vdocuments.site/reader033/viewer/2022051510/5ff76174797ffc7c191ed0c9/html5/thumbnails/15.jpg)
![Page 16: Jitendra Malik University of California at Berkeleylcm.csa.iisc.ernet.in/indous-symposium/slides/jitendra...Jitendra Malik University of California at Berkeley Computer Vision Group](https://reader033.vdocuments.site/reader033/viewer/2022051510/5ff76174797ffc7c191ed0c9/html5/thumbnails/16.jpg)
![Page 17: Jitendra Malik University of California at Berkeleylcm.csa.iisc.ernet.in/indous-symposium/slides/jitendra...Jitendra Malik University of California at Berkeley Computer Vision Group](https://reader033.vdocuments.site/reader033/viewer/2022051510/5ff76174797ffc7c191ed0c9/html5/thumbnails/17.jpg)
Trying to find stick figures is hard (and unnecessary!)
Generalized Cylinders (Binford, Marr & Nishihara)
Geons (Biederman)
![Page 18: Jitendra Malik University of California at Berkeleylcm.csa.iisc.ernet.in/indous-symposium/slides/jitendra...Jitendra Malik University of California at Berkeley Computer Vision Group](https://reader033.vdocuments.site/reader033/viewer/2022051510/5ff76174797ffc7c191ed0c9/html5/thumbnails/18.jpg)
Person detection is challenging
![Page 19: Jitendra Malik University of California at Berkeleylcm.csa.iisc.ernet.in/indous-symposium/slides/jitendra...Jitendra Malik University of California at Berkeley Computer Vision Group](https://reader033.vdocuments.site/reader033/viewer/2022051510/5ff76174797ffc7c191ed0c9/html5/thumbnails/19.jpg)
Can we build upon the success of faces and pedestrians?
Pattern matching
Capture patterns that are common and visually characteristic
Are these the only two common and characteristic patterns?
Rowley, Baluja, Kanade CVPR96
Viola and Jones, IJCV01
…
Dalal and Triggs, CVPR05
…
![Page 20: Jitendra Malik University of California at Berkeleylcm.csa.iisc.ernet.in/indous-symposium/slides/jitendra...Jitendra Malik University of California at Berkeley Computer Vision Group](https://reader033.vdocuments.site/reader033/viewer/2022051510/5ff76174797ffc7c191ed0c9/html5/thumbnails/20.jpg)
Poselets
We will train classifiers for these different visual patterns
![Page 21: Jitendra Malik University of California at Berkeleylcm.csa.iisc.ernet.in/indous-symposium/slides/jitendra...Jitendra Malik University of California at Berkeley Computer Vision Group](https://reader033.vdocuments.site/reader033/viewer/2022051510/5ff76174797ffc7c191ed0c9/html5/thumbnails/21.jpg)
Segmenting people
[Bourdev, Maji, Brox and Malik, ECCV10]
Best person segmentation on PASCAL 2010 dataset
![Page 22: Jitendra Malik University of California at Berkeleylcm.csa.iisc.ernet.in/indous-symposium/slides/jitendra...Jitendra Malik University of California at Berkeley Computer Vision Group](https://reader033.vdocuments.site/reader033/viewer/2022051510/5ff76174797ffc7c191ed0c9/html5/thumbnails/22.jpg)
Describing people
“A person with long pants”
“A man with short hair and long sleeves”
“A man with short hair, glasses, short sleeves and shorts”
“A woman with long hair, glasses and long pants”(??)
![Page 23: Jitendra Malik University of California at Berkeleylcm.csa.iisc.ernet.in/indous-symposium/slides/jitendra...Jitendra Malik University of California at Berkeley Computer Vision Group](https://reader033.vdocuments.site/reader033/viewer/2022051510/5ff76174797ffc7c191ed0c9/html5/thumbnails/23.jpg)
Male or female?
![Page 24: Jitendra Malik University of California at Berkeleylcm.csa.iisc.ernet.in/indous-symposium/slides/jitendra...Jitendra Malik University of California at Berkeley Computer Vision Group](https://reader033.vdocuments.site/reader033/viewer/2022051510/5ff76174797ffc7c191ed0c9/html5/thumbnails/24.jpg)
Gender classifier per poselet is much easier to train
![Page 25: Jitendra Malik University of California at Berkeleylcm.csa.iisc.ernet.in/indous-symposium/slides/jitendra...Jitendra Malik University of California at Berkeley Computer Vision Group](https://reader033.vdocuments.site/reader033/viewer/2022051510/5ff76174797ffc7c191ed0c9/html5/thumbnails/25.jpg)
Is male
![Page 26: Jitendra Malik University of California at Berkeleylcm.csa.iisc.ernet.in/indous-symposium/slides/jitendra...Jitendra Malik University of California at Berkeley Computer Vision Group](https://reader033.vdocuments.site/reader033/viewer/2022051510/5ff76174797ffc7c191ed0c9/html5/thumbnails/26.jpg)
Has long hair
![Page 27: Jitendra Malik University of California at Berkeleylcm.csa.iisc.ernet.in/indous-symposium/slides/jitendra...Jitendra Malik University of California at Berkeley Computer Vision Group](https://reader033.vdocuments.site/reader033/viewer/2022051510/5ff76174797ffc7c191ed0c9/html5/thumbnails/27.jpg)
Wears long pants
![Page 28: Jitendra Malik University of California at Berkeleylcm.csa.iisc.ernet.in/indous-symposium/slides/jitendra...Jitendra Malik University of California at Berkeley Computer Vision Group](https://reader033.vdocuments.site/reader033/viewer/2022051510/5ff76174797ffc7c191ed0c9/html5/thumbnails/28.jpg)
Wears a hat
![Page 29: Jitendra Malik University of California at Berkeleylcm.csa.iisc.ernet.in/indous-symposium/slides/jitendra...Jitendra Malik University of California at Berkeley Computer Vision Group](https://reader033.vdocuments.site/reader033/viewer/2022051510/5ff76174797ffc7c191ed0c9/html5/thumbnails/29.jpg)
Wears long sleeves
![Page 30: Jitendra Malik University of California at Berkeleylcm.csa.iisc.ernet.in/indous-symposium/slides/jitendra...Jitendra Malik University of California at Berkeley Computer Vision Group](https://reader033.vdocuments.site/reader033/viewer/2022051510/5ff76174797ffc7c191ed0c9/html5/thumbnails/30.jpg)
Wears glasses
![Page 31: Jitendra Malik University of California at Berkeleylcm.csa.iisc.ernet.in/indous-symposium/slides/jitendra...Jitendra Malik University of California at Berkeley Computer Vision Group](https://reader033.vdocuments.site/reader033/viewer/2022051510/5ff76174797ffc7c191ed0c9/html5/thumbnails/31.jpg)
Actions in still images … have characteristic:
• pose and appearance
• interaction with objects and agents
![Page 32: Jitendra Malik University of California at Berkeleylcm.csa.iisc.ernet.in/indous-symposium/slides/jitendra...Jitendra Malik University of California at Berkeley Computer Vision Group](https://reader033.vdocuments.site/reader033/viewer/2022051510/5ff76174797ffc7c191ed0c9/html5/thumbnails/32.jpg)
Some discriminative poselets
![Page 33: Jitendra Malik University of California at Berkeleylcm.csa.iisc.ernet.in/indous-symposium/slides/jitendra...Jitendra Malik University of California at Berkeley Computer Vision Group](https://reader033.vdocuments.site/reader033/viewer/2022051510/5ff76174797ffc7c191ed0c9/html5/thumbnails/33.jpg)
Problem: Human Activity Recognition
12/20/2011 SMARTS Annual Review 2011
Mean Performance: 59.7% correct
Approach: Learn pose and appearance specific for an action
![Page 34: Jitendra Malik University of California at Berkeleylcm.csa.iisc.ernet.in/indous-symposium/slides/jitendra...Jitendra Malik University of California at Berkeley Computer Vision Group](https://reader033.vdocuments.site/reader033/viewer/2022051510/5ff76174797ffc7c191ed0c9/html5/thumbnails/34.jpg)
Results : Top Confusions
![Page 35: Jitendra Malik University of California at Berkeleylcm.csa.iisc.ernet.in/indous-symposium/slides/jitendra...Jitendra Malik University of California at Berkeley Computer Vision Group](https://reader033.vdocuments.site/reader033/viewer/2022051510/5ff76174797ffc7c191ed0c9/html5/thumbnails/35.jpg)
Low-Cost Automated Tuberculosis Diagnostics Using Mobile Microscopy
Jeannette Chang¹, Pablo Arbelaez¹, Neil Switz², Clay Reber², Asa Tapley²,³, Lucian Davis³, Adithya Cattamanchi³, Daniel Fletcher², and Jitendra Malik¹
¹ Department of Electrical Engineering and Computer Science, UC Berkeley
² Department of Bioengineering, UC Berkeley
³ Medical School and San Francisco General Hospital, UC San Francisco
![Page 36: Jitendra Malik University of California at Berkeleylcm.csa.iisc.ernet.in/indous-symposium/slides/jitendra...Jitendra Malik University of California at Berkeley Computer Vision Group](https://reader033.vdocuments.site/reader033/viewer/2022051510/5ff76174797ffc7c191ed0c9/html5/thumbnails/36.jpg)
Why Tuberculosis?
Mortality and Treatment [1]
• TB is the second leading cause of death from infectious disease worldwide (after HIV/AIDS)
• Highly effective antibiotic treatment is available
Current Diagnostics
• Technicians manually screen microscopic images of sputum smears
• Other methods include culture and PCR
Tremendous potential benefit from automated processing or classification
1. http://www.who.int/tb/publications/global_report/2011/gtbr11_full.pdf
2. http://www.thehindu.com/health/rx/article21138.ece
Examples of sputum smears with TB bacteria. Brightfield (top) and fluorescent (bottom) microscopy. [2]
![Page 37: Jitendra Malik University of California at Berkeleylcm.csa.iisc.ernet.in/indous-symposium/slides/jitendra...Jitendra Malik University of California at Berkeley Computer Vision Group](https://reader033.vdocuments.site/reader033/viewer/2022051510/5ff76174797ffc7c191ed0c9/html5/thumbnails/37.jpg)
Input image from CellScope device → Candidate TB Blob Identification → Feature Extraction → Linear SVM Classification
Each candidate TB object is characterized by a feature vector containing 8 Hu moment invariants and 14 geometric/photometric descriptors.
$\mathbf{x} = (x_1, \dots, x_N)^\top$
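As a rough illustration of the feature extraction step, the sketch below computes the first two standard Hu moment invariants (φ1, φ2) and a handful of geometric/photometric descriptors for one binary blob. It is an assumption-laden sketch, not the authors' code: the function names are hypothetical, only a subset of the 22 features is shown, and Extent is assumed to mean the fraction of the bounding box covered by the blob.

```python
import numpy as np


def hu_phi1_phi2(mask):
    """First two Hu invariants from normalized central moments of a binary mask."""
    ys, xs = np.nonzero(mask)
    area = float(xs.size)          # zeroth-order moment of a binary blob
    dx = xs - xs.mean()            # coordinates relative to the centroid
    dy = ys - ys.mean()

    def eta(p, q):
        # Normalized central moment: mu_pq / mu_00^(1 + (p + q) / 2)
        return np.sum(dx ** p * dy ** q) / area ** (1 + (p + q) / 2)

    e20, e02, e11 = eta(2, 0), eta(0, 2), eta(1, 1)
    phi1 = e20 + e02
    phi2 = (e20 - e02) ** 2 + 4 * e11 ** 2
    return phi1, phi2


def feature_vector(mask, intensity):
    """Partial feature vector: [phi1, phi2, Area, Extent, Min/Max/MeanIntensity]."""
    ys, xs = np.nonzero(mask)
    area = float(xs.size)
    bbox_area = (xs.max() - xs.min() + 1) * (ys.max() - ys.min() + 1)
    vals = intensity[mask.astype(bool)]   # pixel intensities inside the blob
    phi1, phi2 = hu_phi1_phi2(mask)
    return np.array([phi1, phi2, area, area / bbox_area,
                     vals.min(), vals.max(), vals.mean()])
```

Because the moments are taken about the centroid and normalized by area, φ1 and φ2 do not change when the blob is translated or rotated by 90°, which is the property that makes them useful for describing bacilli at arbitrary positions and orientations.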
Candidate TB objects sorted by their SVM output confidence scores in decreasing order (row-wise, from top to bottom)
Bar plot with SVM output confidence scores corresponding to sorted candidate TB objects
Array of candidate TB objects
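Scoring and sorting the candidates might look like the sketch below. The slides do not say how the linear SVM output is mapped to a confidence in [0, 1]; the logistic squashing used here is an assumption, and `rank_candidates`, `w` and `b` are hypothetical names for an already-trained linear model.

```python
import numpy as np


def rank_candidates(X, w, b):
    """Return candidate indices and confidences, sorted by decreasing confidence."""
    margins = X @ w + b                           # linear SVM decision values
    conf = 1.0 / (1.0 + np.exp(-margins))         # assumed mapping to [0, 1]
    order = np.argsort(-conf, kind="stable")      # highest confidence first
    return order, conf[order]
```

Displaying the candidate patches in this order puts likely bacilli first, so a technician only needs to inspect the top of the list.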
[Bar plot axes: Candidate Object Index (0 to 100) vs. SVM Output Confidence Score (0 to 1)]
Sample subset of candidate TB objects with corresponding confidence scores: 0.918, 0.885, 0.389, 0.374, 0.008, 0.002, 0.001, 0.000
![Page 38: Jitendra Malik University of California at Berkeleylcm.csa.iisc.ernet.in/indous-symposium/slides/jitendra...Jitendra Malik University of California at Berkeley Computer Vision Group](https://reader033.vdocuments.site/reader033/viewer/2022051510/5ff76174797ffc7c191ed0c9/html5/thumbnails/38.jpg)
Sample positive objects Sample negative objects
Sample Candidate Objects
![Page 39: Jitendra Malik University of California at Berkeleylcm.csa.iisc.ernet.in/indous-symposium/slides/jitendra...Jitendra Malik University of California at Berkeley Computer Vision Group](https://reader033.vdocuments.site/reader033/viewer/2022051510/5ff76174797ffc7c191ed0c9/html5/thumbnails/39.jpg)
Patches in Descending Order of Confidence
![Page 40: Jitendra Malik University of California at Berkeleylcm.csa.iisc.ernet.in/indous-symposium/slides/jitendra...Jitendra Malik University of California at Berkeley Computer Vision Group](https://reader033.vdocuments.site/reader033/viewer/2022051510/5ff76174797ffc7c191ed0c9/html5/thumbnails/40.jpg)
MinIntensity
φ1
Perimeter
FilledArea
Area
φ5
φ7
φ6
φ4
φ11
MaxIntensity
EulerNumber
Extent
φ3
ConvexArea
Solidity
MajorAxisLength
EquivDiameter
φ2
MinorAxisLength
Eccentricity
MeanIntensity
Features listed in descending order of
normalized SVM weights.
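A feature ranking of this kind can be produced from a linear SVM's weight vector. The sketch below assumes that "normalized" means absolute weights divided by the largest absolute weight, which the slides do not state; `rank_features` is a hypothetical helper, and the comparison is only meaningful if the features were scaled to comparable ranges before training.

```python
import numpy as np


def rank_features(names, w):
    """Sort feature names by |weight| / max |weight|, largest first."""
    mag = np.abs(np.asarray(w, dtype=float))
    norm = mag / mag.max()                   # assumed normalization to [0, 1]
    order = np.argsort(-norm, kind="stable")
    return [(names[i], float(norm[i])) for i in order]
```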
SS/RP curves, Avg spec: 0.96744, Avg prec: 0.95389, cost exp: 7
[Plot axes: Sensitivity (Recall) vs. Specificity or Precision, both 0 to 1; curves: train-SS, train-RP, test-SS, test-RP]
Object-Level Performance (Uganda Data)
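Each point on such curves comes from thresholding the confidence scores and counting outcomes against the ground-truth labels. The sketch below shows one operating point, presumably reading SS as sensitivity vs. specificity and RP as recall vs. precision; the function name and the convention of reporting precision 1.0 when nothing is predicted positive are assumptions.

```python
import numpy as np


def operating_point(scores, labels, threshold):
    """Sensitivity (recall), specificity and precision at one score threshold."""
    scores = np.asarray(scores, dtype=float)
    pos = np.asarray(labels).astype(bool)
    pred = scores >= threshold
    tp = int(np.sum(pred & pos))     # true objects detected
    fp = int(np.sum(pred & ~pos))    # non-objects accepted
    fn = int(np.sum(~pred & pos))    # true objects missed
    tn = int(np.sum(~pred & ~pos))   # non-objects rejected
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 1.0
    return sensitivity, specificity, precision
```

Sweeping `threshold` from high to low traces out both curves at once, since sensitivity and recall are the same quantity.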
![Page 41: Jitendra Malik University of California at Berkeleylcm.csa.iisc.ernet.in/indous-symposium/slides/jitendra...Jitendra Malik University of California at Berkeley Computer Vision Group](https://reader033.vdocuments.site/reader033/viewer/2022051510/5ff76174797ffc7c191ed0c9/html5/thumbnails/41.jpg)
Slide-Level Performance (Uganda Data)