Diego Aguirre: Computer Vision Introduction
QUESTION
• What is Computer Vision?
COMPUTER VISION
• Computer Vision is the process of extracting knowledge about the world from one or more digital images.
DIGITAL IMAGES
Color Images are formed with three 2-D arrays, representing the Red, Green and Blue components of the image.
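As a minimal sketch of this layout, a color image can be held as three 2-D arrays of the same size. The 2x2 pixel values and the names `red`, `green`, `blue` and `pixel_at` are made up for illustration, and plain Python lists are used only to keep the example dependency-free:

```python
# Three 2-D arrays (rows x columns), one per color channel.
# The 2x2 values are illustrative, not taken from a real image.
red   = [[255, 0], [0, 255]]
green = [[0, 255], [0, 255]]
blue  = [[0, 0], [255, 255]]

def pixel_at(row, col):
    """Return the (R, G, B) triple of the pixel at (row, col)."""
    return (red[row][col], green[row][col], blue[row][col])

print(pixel_at(0, 0))  # (255, 0, 0): a pure red pixel
print(pixel_at(1, 1))  # (255, 255, 255): a white pixel
```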
APPLICATIONS
• Why do we need Computer Vision?
• Mention a few applications/products that make use of Computer Vision
COMPUTER VISION IS HARD
• Believe me! It is!
• It is often said that 2/3 (60%+) of the brain is "involved" in vision.
ORGANIZATION OF A CV SYSTEM
• The organization of a computer vision system varies a lot from one to another.
• However, these are some of the typical tasks found in these systems:
LEARNING ALGORITHM
• Learning algorithms work with vectors of feature values
• We need to go from matrices to vectors
Extract Features → Classifier → Face / !Face
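The simplest way to go from a matrix to a vector is to flatten the image row by row and use the raw pixels as features. This is only a sketch of the idea: the function name `to_feature_vector` and the small patch are assumptions for illustration, and practical systems usually compute more informative features than raw pixels.

```python
def to_feature_vector(image):
    """Flatten a 2-D image (a list of rows) into a 1-D list, row by row."""
    return [pixel for row in image for pixel in row]

# A small illustrative 2x3 patch of gray-level values.
patch = [[98, 110, 121],
         [99, 110, 120]]
print(to_feature_vector(patch))  # [98, 110, 121, 99, 110, 120]
```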
FACE DETECTION
• First real-time face detector proposed by Viola & Jones in 2005
• “Robust real-time face detection”
FACE DETECTION - VIOLA & JONES, 2005
• Robust (very high true positive rate, very low false positive rate)
• Real Time (very fast)
FACE DETECTION - VIOLA & JONES, 2005
• Type of features they use: Haar-like features
Extract Features → Classifier → Face / !Face
For example:
Image:
 98 110 121 125 122 129
 99 110 120 116 116 129
 97 109 124 111 123 134
 98 112 132 108 123 133
 97 113 147 108 125 142
 95 111 168 122 130 137
 96 104 172 130 126 130
Integral Image:
  98  208  329  454  576  705
 197  417  658  899 1137 1395
 294  623  988 1340 1701 2093
 392  833 1330 1790 2274 2799
 489 1043 1687 2255 2864 3531
 584 1249 2061 2751 3490 4294
 680 1449 2433 3253 4118 5052
Integral Image: A table that holds the sum of all pixel
values to the left and top of a given pixel, inclusive.
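The definition above can be sketched in a few lines of Python. This is one straightforward way to build the table, not necessarily how the original detector implements it; the function name `integral_image` is an assumption, and the input is the 7x6 example image from the slide.

```python
def integral_image(image):
    """Return a table whose entry (r, c) is the sum of image[0..r][0..c]."""
    result = []
    for r, row in enumerate(image):
        running = 0  # sum of the current row from column 0 up to column c
        out_row = []
        for c, value in enumerate(row):
            running += value
            above = result[r - 1][c] if r > 0 else 0  # total of all rows above
            out_row.append(running + above)
        result.append(out_row)
    return result

image = [
    [98, 110, 121, 125, 122, 129],
    [99, 110, 120, 116, 116, 129],
    [97, 109, 124, 111, 123, 134],
    [98, 112, 132, 108, 123, 133],
    [97, 113, 147, 108, 125, 142],
    [95, 111, 168, 122, 130, 137],
    [96, 104, 172, 130, 126, 130],
]
ii = integral_image(image)
print(ii[0])  # [98, 208, 329, 454, 576, 705]
print(ii[6])  # [680, 1449, 2433, 3253, 4118, 5052]
```

Each entry is built from the running sum of its own row plus the entry directly above it, so the whole table costs one pass over the image.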
FACE DETECTION - VIOLA & JONES, 2005
Fast summations of arbitrary rectangles using integral images.
Image and Integral Image (II) are the same as in the example above. To sum the image rectangle covering rows 3 to 6 and columns 3 to 5 (counting from 1), read four entries of II:
• P = II(row 6, column 5) = 3490, at the bottom-right corner of the rectangle
• Q = II(row 2, column 5) = 1137, directly above the top-right corner
• S = II(row 6, column 2) = 1249, directly left of the bottom-left corner
• R = II(row 2, column 2) = 417, diagonally above and left of the top-left corner
Start with II_P, subtract II_Q (the rows above the rectangle) and II_S (the columns to its left), then add back II_R, which was subtracted twice:
Sum = II_P – II_Q – II_S + II_R = 3490 – 1137 – 1249 + 417 = 1521
• Can be computed in constant time with only 4 references, regardless of the rectangle's size
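The four-reference lookup can be sketched as follows. The function name `rect_sum` and its 0-based, inclusive coordinate convention are assumptions for illustration; the table is the integral image from the example, and the boundary checks handle rectangles that touch the top or left edge, where Q, S or R fall outside the table.

```python
# Integral image (II) of the 7x6 example image.
ii = [
    [98, 208, 329, 454, 576, 705],
    [197, 417, 658, 899, 1137, 1395],
    [294, 623, 988, 1340, 1701, 2093],
    [392, 833, 1330, 1790, 2274, 2799],
    [489, 1043, 1687, 2255, 2864, 3531],
    [584, 1249, 2061, 2751, 3490, 4294],
    [680, 1449, 2433, 3253, 4118, 5052],
]

def rect_sum(ii, top, left, bottom, right):
    """Sum of image[top..bottom][left..right] (0-based, inclusive) via II."""
    total = ii[bottom][right]               # P
    if top > 0:
        total -= ii[top - 1][right]         # Q: everything above the rectangle
    if left > 0:
        total -= ii[bottom][left - 1]       # S: everything left of the rectangle
    if top > 0 and left > 0:
        total += ii[top - 1][left - 1]      # R: was subtracted twice, add back
    return total

# Rows 3 to 6 and columns 3 to 5 (1-based) are rows 2..5, columns 2..4 here.
print(rect_sum(ii, 2, 2, 5, 4))  # 1521
```

The cost is the same four lookups whether the rectangle covers four pixels or the whole image.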
FACE DETECTION - VIOLA & JONES, 2005
• Feature extraction – DONE!
• Classifier?
Extract Features → Classifier → Face / !Face
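To connect the integral image back to Haar-like features, here is a hedged sketch of one two-rectangle feature: the difference between the pixel sums of two adjacent rectangles. The function names, the zero-padded table layout, and the sign convention (right half minus left half) are assumptions for illustration; the toy patch is made up.

```python
from itertools import accumulate

def integral_image(image):
    """Integral image padded with a zero row and column (avoids edge cases)."""
    width = len(image[0])
    ii = [[0] * (width + 1)]
    for row in image:
        row_sums = list(accumulate(row))  # running sums within this row
        ii.append([0] + [ii[-1][c + 1] + row_sums[c] for c in range(width)])
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of the height x width rectangle whose top-left pixel is (top, left)."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]

def two_rect_feature(ii, top, left, height, width):
    """Horizontal two-rectangle Haar-like feature: right half minus left half."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return right_sum - left_sum

# Toy patch: dark on the left, bright on the right (a vertical edge).
patch = [[10, 10, 200, 200],
         [10, 10, 200, 200]]
print(two_rect_feature(integral_image(patch), 0, 0, 2, 4))  # 760
```

A large absolute value signals a strong left/right contrast inside the window, while a uniform region gives 0. Each feature costs a handful of table lookups no matter how large the rectangles are.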
FACE DETECTION - VIOLA & JONES, 2005
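• They use a variant of AdaBoost to build a cascade of classifiers, each stage combining a few weak classifiers.
• Stage i is simpler (and faster) than stage i+1, and a window is rejected as soon as any stage labels it "not a face".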
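The control flow of a cascade can be sketched as below. The two stages are toy stand-ins chosen for illustration (a brightness test and a contrast test), not trained AdaBoost classifiers; the names `cascade_predict`, `mean` and `stages` are assumptions.

```python
def cascade_predict(stages, window):
    """Return True only if every stage accepts; reject on the first failure."""
    for stage in stages:
        if not stage(window):
            return False  # early exit: later, costlier stages never run
    return True

def mean(window):
    """Average pixel value of a 2-D window."""
    pixels = [p for row in window for p in row]
    return sum(pixels) / len(pixels)

# Toy stand-ins for trained stages (NOT real face classifiers).
stages = [
    lambda w: mean(w) > 50,                                # cheap, runs on every window
    lambda w: max(map(max, w)) - min(map(min, w)) > 30,    # runs only on survivors
]

print(cascade_predict(stages, [[10, 10], [10, 10]]))      # False (fails stage 1)
print(cascade_predict(stages, [[100, 100], [100, 100]]))  # False (fails stage 2)
print(cascade_predict(stages, [[60, 120], [90, 150]]))    # True (passes both)
```

Because most windows in an image are not faces, rejecting them in the cheap early stages is what keeps the average cost per window low.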
FACE DETECTION - VIOLA & JONES, 2005
• We have a classifier that tells us if a given image is a face or not.
• What if we want to detect multiple faces in an image?
• Sliding window!
Idea:
• Slide windows of different sizes across the image.
• At each location, match the window to a face model.
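A minimal sketch of the sliding-window idea follows. The names `sliding_windows` and `detect` and the `step` parameter are assumptions, and the "classifier" is a toy brightness test standing in for a real face model.

```python
def sliding_windows(height, width, window_size, step=1):
    """Yield (top, left) for every window_size x window_size window that fits."""
    for top in range(0, height - window_size + 1, step):
        for left in range(0, width - window_size + 1, step):
            yield top, left

def detect(image, window_sizes, is_face, step=1):
    """Run the classifier on every window; return (top, left, size) per hit."""
    height, width = len(image), len(image[0])
    hits = []
    for size in window_sizes:
        for top, left in sliding_windows(height, width, size, step):
            window = [row[left:left + size] for row in image[top:top + size]]
            if is_face(window):
                hits.append((top, left, size))
    return hits

# Toy 4x4 image with a bright 2x2 block; the lambda is a placeholder classifier.
image = [[0, 0, 0, 0],
         [0, 255, 255, 0],
         [0, 255, 255, 0],
         [0, 0, 0, 0]]
is_bright = lambda w: all(p > 200 for row in w for p in row)
print(detect(image, [2, 3], is_bright))  # [(1, 1, 2)]
```

A larger `step` trades localization accuracy for fewer classifier evaluations.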
FACE DETECTION
Dealing with multiple scales
Obvious solution:
• Build a detector for each possible scale
Better idea:
• Build a detector for a single scale
• During detection, scale the image
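The "scale the image" idea is often realized as an image pyramid: the same fixed-size detector is run on progressively smaller copies of the image. The sketch below is one way to build such a pyramid, assuming nearest-neighbour resizing, a 24-pixel detector window, and a shrink factor of 1.25; all three are illustrative choices, as are the names `downscale` and `pyramid`.

```python
def downscale(image, factor):
    """Nearest-neighbour resize: shrink the image by `factor` (> 1)."""
    new_h = int(len(image) / factor)
    new_w = int(len(image[0]) / factor)
    return [[image[int(r * factor)][int(c * factor)] for c in range(new_w)]
            for r in range(new_h)]

def pyramid(image, factor=1.25, min_size=24):
    """Yield (scale, image) pairs until the image is smaller than the detector."""
    scale = 1.0
    current = image
    while len(current) >= min_size and len(current[0]) >= min_size:
        yield scale, current
        scale *= factor
        current = downscale(image, scale)  # always resample from the original

# A blank 100x100 image, halved at each level for a short demonstration.
blank = [[0] * 100 for _ in range(100)]
print([(s, len(img)) for s, img in pyramid(blank, factor=2, min_size=24)])
# [(1.0, 100), (2.0, 50), (4.0, 25)]
```

A detection at (top, left) in a level with scale s corresponds roughly to (top * s, left * s) in the original image, with the window size multiplied by s as well.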
FACE DETECTION - VIOLA & JONES, 2005