mosse - hamedkiani.com · mosse kmosse miltrack struck oab(1) semiboost fragtrack our method mean...
TRANSCRIPT
-
IEEE 2015 Conference on Computer Vision and Pattern
Recognition Correlation Filters with Limited Boundaries
Hamed Kiani, Terence Sim and Simon Lucey PAVIS (IIT), SoC (NUS) and RI (CMU)
Section 1 (sizes): Posters boards are 48” tall and 96” wide, but we recommend you leave a little border since you may not be able to pin at the vertical edge. Since PowerPoint does not let one define such a large paper size, this template is designed to be printed at 200%, yielding a 46” x94” poster. You can scale it up or down a bit (e.g. 42” is a common paper size at FexEd). Note there is no direct international A0.. A1 equivalent. The poster size is approximately three A0 boards next to each other, i.e., each column in this example is about one A0 board.
Ideally you want to keep it very readable: this is nots your paper, it is a poster. 32pt here (64 final printing) is good for most text: • Sub-bullets are 28 here (56 final)
– Don’t use smaller than 24pt in this template (which is 48pt in final printing at 200%)
– Insert plenty of graphics and any math you need
When inserting graphics or equations, keep the resolution high (remember this will be printed at 200%). If you can see blocking artifacts at 400% magnification in PowerPoint, consider finding better graphics. This is an example of BAD/LOW RES GRAPHICS
Leave enough margin for pushpin and remember many big plotters cannot get within .5” of the actual paper edge.
You are free to use colored backgrounds and such but they generally reduce readability.
You are free to use what ever fonts you like. • San Serif fonts like Arial are more readable from a distance,
• Serif fonts like times may look more consistent with your mathematics
Section 2 (layout): Remember the poster session will be crowded so design the poster to be read in columns so people can read what is in front of them and move left to right to get the whole story.
The poster should use photos, figures, and tables to tell the story of the study. For clarity, present the information in a sequence that is easy to follow.
There is often way too much text in a poster - there definitely is in this template! Posters primarily are visual presentations; the text should support the graphics. Look critically at the layout. Some poster 'experts' suggest that if there is about 20-25% text, 40-45% graphics and 30-40% empty space, you are doing well.
Section 3: Include more figures than are in the paper so you can talk to them. Include things that are not in the paper and then encourage them to read the paper. Don’t try to just put all the paper here.
If it looks like a cut/paste of the paper, people skip that poster since they can read the papers after the conference. Many people find it better to spend time talking with poster presenters that have more to offer than just redoing the paper content paper in big fonts.
Remember Poster boards look like this.. This is your canvas. Paint us a picture of your work.
CORRELATION FILTERS WITH LIMITED BOUNDARIESHAMED KIANI, TERENCE SIM AND SIMON LUCEY
CORRELATION FILTERSBoundary Effects: Synthetic training patches
���� ∆��
�
IEEE 2015 Conference on Computer Vision and Pattern
Recognition
𝑥𝑖 𝑦𝑖 ℎ
*
𝑥𝑖 𝑦𝑖
*
ℎ
T ≫ D
E(h) =1
2
N∑i=1
||yi − xi ? h||22 +λ
2||h||22
1- Frequency domain: O(ND logD)
E(ĥ) =1
2
N∑i=1
||ŷi − diag(x̂i)>ĥ||22 +λ
2||ĥ||22
(ŷ denotes the DFT of vector y)
2- Spatial domain: O(D3 +ND2)
E(h) =1
2
N∑i=1
D∑j=1
||yi(j)− h> xi[∆τj ]︸ ︷︷ ︸circular shift
||22 +λ
2||h||22
CONTRIBUTIONS1. A new correlation filter objective to drastically reduce the
number of synthetic patches.2. Optimizing the new objective using ADMM with very effi-
cient complexity and memory usage.
KEY IDEA
IEEE 2015 Conference on Computer Vision and Pattern
Recognition
𝑥𝑖 𝑦𝑖 ℎ
*
𝑥𝑖 𝑦𝑖
*
ℎ
T ≫ D
E(h) =1
2
N∑i=1
T∑j=1
||yi(j)− h>Pxi[∆τj ]||22 +λ
2||h||22
𝐏 𝑥 ∆𝜏𝑗
1. # of training patches: T vs. D (T � D)2. # of patches affected by circular shift (synthetic): D−1T vs.
D−1D
3. Complexity: O(D3 +NDT )
E(h, ĝ) =1
2
N∑i=1
||ŷi − diag(x̂i)>ĝ||22 +λ
2||h||22
s.t. ĝ =√DFP>h (F: D ×D discrete Fourier transform matrix)
Augmented Lagrangian
L(ĝ,h, ζ̂) = 12
N∑i=1
||ŷi − diag(x̂i)>ĝ||22 +λ
2||h||22 + ζ̂>(ĝ −
√DFP>h) +
µ
2||ĝ −
√DFP>h||22
Augmented Lagrangian is solved using Alternating Direction Method of Multipliers (ADMM) with a time complexity ofO([N +K]T log T ) and memory usage of O(T ) .
RESULTS (2)
MOSSE KMOSSE MILTrack STRUCK OAB(1) SemiBoost FragTrack Our method
mean {0.80, 11} {0.84, 12} {0.72, 16} {0.91, 12} {0.53, 31} {0.62, 29} {0.51, 37} {0.97, 8}fps 600 100 25 11 25 25 2 100
RESULTS (1)Runtime and Convergence Performance
1 100 200 300 400 500 6000
1
2
3
4
5
6
7
8x 10
4
Number of training images
Tim
e t
o c
on
ve
rge
(s)
Spatial steepest descent (32x32)
Our method (32x32)
Spatial steepest descent (64x64)
Our method (64x64)
0 500 1000 150010
0
102
104
106
108
1010
1012
# of iteration
Obje
ctive
va
lue
Spatial steepest descent
Our method
Localization Performance
1
Lo
ca
liza
tio
n r
ate
at
d <
0.1
0
MOSSE ASEF UMACE MACE OTF Our method
1 50 100 150 200 250 300 350 4000
0.5
1
Number of training images
Localiz
ation r
ate
at d <
0.1
0
0.05 0.10 0.15 0.200
0.5
1
Threshold (fraction of interocular distance)
Localiz
ation r
ate
Tracking Performance: 100 fpsgirl clifbar
10 20 30 40 500
0.2
0.4
0.6
0.8
1
Threshold
Pre
cis
ion
MOSSE
K_MOSSE
MILTrack
STRUCK
FragTrack
IVTs
Our method
10 20 30 40 500
0.2
0.4
0.6
0.8
1
Threshold
Pre
cis
ion
MOSSE
K_MOSSE
MILTrack
STRUCK
FragTrack
IVTs
Our method
100 200 300 4000
50
100
150
200
Frame #
Positio
n E
rror
(pix
el)
MOSSE
K_MOSSE
STRUCK
FragTrack
Our method
50 100 150 200 250 3000
20
40
60
80
100
120
140
Frame #
Positio
n E
rror
(pix
el)
MOSSE
K_MOSSE
STRUCK
FragTrack
Our method
REFERENCES
[1] D. Bolme, J. Beveridge, B. Draper, and Y. Lui. Visual Object Tracking usingAdaptive Correlation Filters CVPR’10
[2] Code and Demo: http://www.hamedkiani.com/cfwlb.html