speeding up probabilistic inference of camera orientation by function approximation and grid...
DESCRIPTION
Slides from my presentation at WSCG2011. They describe some modifications to existing techniques for camera orientation estimation in "Manhattan Worlds", aimed at faster calculation times.
TRANSCRIPT
![Page 1: Speeding up probabilistic inference of camera orientation by function approximation and grid masking - WSCG2011 presentation](https://reader034.vdocuments.site/reader034/viewer/2022042715/558e4ee71a28abfd308b46ff/html5/thumbnails/1.jpg)
LTI–PCS–EPUSP
nic-wscg2011
N. Werneck
1–Introduction
2–Methodology
3–Results
References
Referencias
© N. Werneck
Speeding up probabilistic inference of camera orientation by function approximation and grid masking
Nicolau L. Werneck
Doctoral candidate
Supervisor: Prof. Anna Helena Reali Costa
Intelligent Techniques Laboratory, LTI–PCS–Poli
Universidade de Sao Paulo (USP), Brazil
WSCG’2011, Plzen
Feb/2011
1 / 15
![Page 2: Speeding up probabilistic inference of camera orientation by function approximation and grid masking - WSCG2011 presentation](https://reader034.vdocuments.site/reader034/viewer/2022042715/558e4ee71a28abfd308b46ff/html5/thumbnails/2.jpg)
Introduction
The problem: camera orientation estimation
Environment edges are assumed to be in the three directions of the reference frame. (Lego Land, Manhattan World)
We want to calculate the camera orientation in relation to this reference frame, in real time.
Technique based on continuous optimization. No edge extraction or matching involved. (Maximum likelihood)
![Page 3: Speeding up probabilistic inference of camera orientation by function approximation and grid masking - WSCG2011 presentation](https://reader034.vdocuments.site/reader034/viewer/2022042715/558e4ee71a28abfd308b46ff/html5/thumbnails/3.jpg)
Introduction
Geometrical constraints
Knowing the camera orientation from a picture, we can predict the directions of image edges.
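As a minimal sketch of this constraint, assume a pinhole camera with focal length `f` in pixels, the principal point at the image origin, and environment directions expressed as 3D vectors in camera coordinates. These conventions and the function name are assumptions made for illustration; the slides do not specify them. Under that model, the image of a 3D edge passing through a pixel points along the direction computed below, which is also the direction toward the corresponding vanishing point.

```python
import math

def predicted_edge_direction(direction, pixel, f):
    """Unit 2D direction of the image edge at `pixel` for a 3D edge along `direction`.

    Hypothetical helper: assumes a pinhole camera, principal point at (0, 0),
    and `direction` given in camera coordinates.
    """
    dx, dy, dz = direction
    x, y = pixel
    # Differentiating the projection (f*X/Z, f*Y/Z) along the 3D direction
    # gives a vector proportional to (f*dx - x*dz, f*dy - y*dz).
    vx = f * dx - x * dz
    vy = f * dy - y * dz
    norm = math.hypot(vx, vy)
    if norm == 0.0:
        # The pixel coincides with the vanishing point: direction is undefined.
        raise ValueError("pixel lies on the vanishing point")
    return (vx / norm, vy / norm)

# A direction parallel to the image x axis is seen as horizontal everywhere.
horizontal = predicted_edge_direction((1.0, 0.0, 0.0), (120.0, -35.0), 500.0)

# A direction along the optical axis has its vanishing point at the image
# origin, so edges at (3, 4) point along the line through the origin.
radial = predicted_edge_direction((0.0, 0.0, 1.0), (3.0, 4.0), 500.0)
```

The sign of the returned vector is arbitrary, since an edge direction and its opposite describe the same line.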
![Page 4: Speeding up probabilistic inference of camera orientation by function approximation and grid masking - WSCG2011 presentation](https://reader034.vdocuments.site/reader034/viewer/2022042715/558e4ee71a28abfd308b46ff/html5/thumbnails/4.jpg)
![Page 5: Speeding up probabilistic inference of camera orientation by function approximation and grid masking - WSCG2011 presentation](https://reader034.vdocuments.site/reader034/viewer/2022042715/558e4ee71a28abfd308b46ff/html5/thumbnails/5.jpg)
Introduction
Bayesian camera orientation estimation
The data analyzed is the gradient of the input image.
![Page 6: Speeding up probabilistic inference of camera orientation by function approximation and grid masking - WSCG2011 presentation](https://reader034.vdocuments.site/reader034/viewer/2022042715/558e4ee71a28abfd308b46ff/html5/thumbnails/6.jpg)
Introduction
Bayesian camera orientation estimation
The Bayesian camera orientation estimation works by defining an objective function $L(\vec{\Psi})$ to be optimized. The solution is $\vec{\Psi}^* = \arg\max L(\vec{\Psi})$.
The function $L$ tells how well the arguments “explain” the evidence. (Likelihood function)
In this problem $\vec{\Psi}$ is a set of arguments that model the camera orientation.
$L$ tells how much the edges in the images are aligned to the directions expected from the vanishing points produced by $\vec{\Psi}$.
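To make the argmax formulation concrete, here is a toy sketch. It is not the likelihood from the slides. It assumes a single unknown, an in-plane roll angle `psi`, with two predicted edge directions at `psi` and `psi + 90°`. The objective, the exhaustive grid search, and the synthetic data are all illustrative assumptions.

```python
import math

def toy_objective(psi, gradients):
    """Toy stand-in for L(psi): how well edges align with two predicted directions."""
    c, s = math.cos(psi), math.sin(psi)
    total = 0.0
    for gx, gy in gradients:
        ex, ey = -gy, gx                 # an edge runs perpendicular to the gradient
        along_first = ex * c + ey * s    # projection on the direction at angle psi
        along_second = -ex * s + ey * c  # projection on the direction at psi + 90 degrees
        total += max(along_first ** 2, along_second ** 2)
    return total

def argmax_on_grid(objective, low, high, steps):
    """Return the candidate in [low, high) with the largest objective value."""
    candidates = [low + (high - low) * i / steps for i in range(steps)]
    return max(candidates, key=objective)

# Synthetic unit gradients for edges at 17 and 107 degrees (true roll: 17 degrees).
true_roll = math.radians(17.0)
edge_angles = [true_roll, true_roll + math.pi / 2] * 5
gradients = [(math.cos(a + math.pi / 2), math.sin(a + math.pi / 2)) for a in edge_angles]

# The toy objective repeats every 90 degrees, so searching [0, 90) is enough.
estimate = argmax_on_grid(lambda psi: toy_objective(psi, gradients), 0.0, math.pi / 2, 90)
print(f"estimated roll: {math.degrees(estimate):.1f} degrees")
```

A real estimator would search over a full 3D orientation and use a probabilistic model of the gradients, but the structure is the same: score each candidate orientation against the image gradients and keep the best one.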
![Page 7: Speeding up probabilistic inference of camera orientation by function approximation and grid masking - WSCG2011 presentation](https://reader034.vdocuments.site/reader034/viewer/2022042715/558e4ee71a28abfd308b46ff/html5/thumbnails/7.jpg)
Existing techniques
This work is based on previous research by Coughlan and Yuille [2003], Deutscher et al. [2002], Schindler and Dellaert [2004], Denis et al. [2008].
They are all based on likelihood maximization. The differences lie in:
What parameters are estimated. (Other than orientation.)
What optimization algorithm is employed.
Expression of the likelihood function. (Especially what PDF models are used.)
Subsampling technique.
![Page 8: Speeding up probabilistic inference of camera orientation by function approximation and grid masking - WSCG2011 presentation](https://reader034.vdocuments.site/reader034/viewer/2022042715/558e4ee71a28abfd308b46ff/html5/thumbnails/8.jpg)
Original expression
In Coughlan and Yuille [2003] the image likelihood is a product of the likelihoods of gradients $\vec{E}_{\vec{u}}$ at each pixel $\vec{u}$.
Observation model
Lik. pixel is edge
Lik. orientation match
The expression built is a maximum a posteriori estimator.
Using $M_k$ for $P(m_{\vec{u}} = k)$, $\Phi_k$ for $P(\phi_{\vec{u}} \mid m_{\vec{u}} = k, \vec{\Psi}, \vec{u})$ and taking the log we arrive at the objective function...
![Page 9: Speeding up probabilistic inference of camera orientation by function approximation and grid masking - WSCG2011 presentation](https://reader034.vdocuments.site/reader034/viewer/2022042715/558e4ee71a28abfd308b46ff/html5/thumbnails/9.jpg)
Proposed expression
$$L(\vec{\Psi}) = \sum_{\vec{u}} \log\left( P_{\text{off}}(E_{\vec{u}})\,\Phi_1 M_1 + P_{\text{on}}(E_{\vec{u}})\,\Phi_5 M_5 + P_{\text{on}}(E_{\vec{u}}) \sum_{k=2}^{4} \Phi_k M_k \right)$$
Using $\log(b+a) \approx \frac{a}{b} + \log(b)$, we arrive at
Lik. pixel is edge
Lik. orientation match
There is a weighting coefficient based on the gradient norm multiplied by something that depends on the gradient directions and camera orientation.
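The approximation follows from writing log(b + a) = log(b) + log(1 + a/b) and keeping only the first-order term of log(1 + x) ≈ x, so it is accurate when a is small relative to b. The following sketch, with arbitrary illustrative values, compares the exact and approximate values as a/b grows.

```python
import math

def log_sum_approx(a, b):
    """First-order approximation of log(b + a), accurate when a/b is small."""
    return a / b + math.log(b)

b = 1.0
for a in (0.01, 0.1, 1.0):
    exact = math.log(b + a)
    approx = log_sum_approx(a, b)
    print(f"a/b = {a / b:<4}  exact = {exact:.4f}  approx = {approx:.4f}  "
          f"error = {approx - exact:.4f}")
```

The error grows quickly as a/b approaches 1, which is worth keeping in mind when judging where the approximated objective can deviate from the original one.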
![Page 10: Speeding up probabilistic inference of camera orientation by function approximation and grid masking - WSCG2011 presentation](https://reader034.vdocuments.site/reader034/viewer/2022042715/558e4ee71a28abfd308b46ff/html5/thumbnails/10.jpg)
Gradient norm masking
The mask generating function
$$W'(E_{\vec{u}}) = \left( \frac{P_{\text{off}}(E_{\vec{u}})}{P_{\text{on}}(E_{\vec{u}})} M_1 + M_5 \right)^{-1}$$
Also...
We replaced $W'$ with $W$, based on the logistic function.
We also used vector dot products instead of calculating arctan.
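The two ideas above can be sketched as follows. The slides do not give the exact form or parameters of W, so the `threshold` and `steepness` parameters, the function names, and the squared-cosine alignment score are assumptions for illustration only.

```python
import math

def logistic_weight(grad_norm, threshold, steepness):
    """Hypothetical W: a smooth 0..1 mask that switches on around `threshold`."""
    return 1.0 / (1.0 + math.exp(-steepness * (grad_norm - threshold)))

def alignment(gradient, edge_direction):
    """Squared cosine between the edge implied by `gradient` and `edge_direction`.

    Uses only a dot product and squared norms, so no arctan (or any other
    trigonometric function) is needed.
    """
    gx, gy = gradient
    dx, dy = edge_direction
    # The edge is perpendicular to the gradient: (-gy, gx).
    dot = -gy * dx + gx * dy
    norm_sq = (gx * gx + gy * gy) * (dx * dx + dy * dy)
    return dot * dot / norm_sq if norm_sq > 0.0 else 0.0

# A vertical gradient implies a horizontal edge, which matches (1, 0) exactly.
gradient = (0.0, 8.0)
score = logistic_weight(math.hypot(*gradient), threshold=5.0, steepness=1.0) \
    * alignment(gradient, (1.0, 0.0))
```

The product mirrors the structure described earlier: a weight that depends only on the gradient norm, multiplied by a term that depends on the gradient direction and the predicted edge direction.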
![Page 11: Speeding up probabilistic inference of camera orientation by function approximation and grid masking - WSCG2011 presentation](https://reader034.vdocuments.site/reader034/viewer/2022042715/558e4ee71a28abfd308b46ff/html5/thumbnails/11.jpg)
Grid masking
We select one from every few lines and columns.
Image edges are sampled regularly.
Lines above a minimum length are necessarily sampled.
Better strategy for high resolution images, where edge pixels are “rare”.
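A minimal sketch of such a mask, assuming that "one from every few lines and columns" means keeping every pixel that lies on a selected row or a selected column (the function name and the `step` parameter are illustrative, not from the slides):

```python
def grid_mask_pixels(height, width, step):
    """Return (row, col) for every pixel on a selected row or column.

    Rows and columns whose index is a multiple of `step` are selected.
    """
    return [
        (r, c)
        for r in range(height)
        for c in range(width)
        if r % step == 0 or c % step == 0
    ]

selected = grid_mask_pixels(480, 640, 16)
fraction = len(selected) / (480 * 640)
print(f"{len(selected)} of {480 * 640} pixels kept ({fraction:.1%})")
```

Under this assumption, any straight segment that spans at least `step` pixels horizontally or vertically must cross a selected row or column, which is why sufficiently long edges cannot be missed. The fraction of pixels kept is roughly 2/step for large images.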
![Page 12: Speeding up probabilistic inference of camera orientation by function approximation and grid masking - WSCG2011 presentation](https://reader034.vdocuments.site/reader034/viewer/2022042715/558e4ee71a28abfd308b46ff/html5/thumbnails/12.jpg)
Results
Expression evaluation
Speed
Expressions were implemented in Cython, using SIMD instructions, and tested on c1.xlarge AWS computers. A speedup of 50–64× was measured.
Original: 1100.0 ± 60 ms
Proposed: 18.9 ± 2.4 ms
(4 s per image with the proposal, without subsampling.)
Quality
From 102 tests, the original expression “fixed” the solution on 5 occasions, but ruined 6 good solutions. Mean error went from 4.7° to 5.5°. (Large outliers)
![Page 13: Speeding up probabilistic inference of camera orientation by function approximation and grid masking - WSCG2011 presentation](https://reader034.vdocuments.site/reader034/viewer/2022042715/558e4ee71a28abfd308b46ff/html5/thumbnails/13.jpg)
Results
Grid masking evaluation
Speed increases as solution quality drops.
![Page 14: Speeding up probabilistic inference of camera orientation by function approximation and grid masking - WSCG2011 presentation](https://reader034.vdocuments.site/reader034/viewer/2022042715/558e4ee71a28abfd308b46ff/html5/thumbnails/14.jpg)
Conclusion
The proposed expression is simpler, faster, intuitive and justifies selecting pixels from gradient norm.
The grid masking technique proved to be a good alternative for subsampling images deterministically.
Future work
Develop a complete pixel selection method.
Find best parameters.
Try to use gradient-based optimization.
Thanks! THE END
http://nwerneck.sdf.org
![Page 15: Speeding up probabilistic inference of camera orientation by function approximation and grid masking - WSCG2011 presentation](https://reader034.vdocuments.site/reader034/viewer/2022042715/558e4ee71a28abfd308b46ff/html5/thumbnails/15.jpg)
References
James M. Coughlan and A. L. Yuille. Manhattan world: orientation and outlier detection by Bayesian inference. Neural Comput., 15(5):1063–1088, 2003. ISSN 0899-7667. doi:10.1162/089976603765202668.
Patrick Denis, James H. Elder, and Francisco J. Estrada. Efficient edge-based methods for estimating Manhattan frames in urban imagery. In David A. Forsyth, Philip H. S. Torr, and Andrew Zisserman, editors, ECCV (2), volume 5303 of Lecture Notes in Computer Science, pages 197–210. Springer, 2008. ISBN 978-3-540-88685-3.
Jonathan Deutscher, Michael Isard, and John Maccormick. Automatic camera calibration from a single Manhattan image. In Eur. Conf. on Computer Vision (ECCV), pages 175–205, 2002.
Grant Schindler and Frank Dellaert. Atlanta world: An expectation maximization framework for simultaneous low-level edge grouping and camera calibration in complex man-made environments. In CVPR (1), pages 203–209, 2004.