inverse depth monocular slam - oxford brookes...
Inverse Depth Monocular SLAM
Javier Civera, Andrew J. Davison, J.M.M. Montiel
Problem Statement
• Sequential simultaneous sensor localization and map building at frame rate, 30 Hz.
• The camera moves freely in 3D: 6 DOF camera motion.
• Real outdoor scenes contain close and distant features, even features at infinity.
• Main contribution: coding scene points by their inverse depth, which:
1. Deals with low parallax cases
2. Deals with both distant and close points
3. Initialises map features from just one image
Camera Geometry: Pure Bearing-only Sensor
[Figure: camera with optical center O and focal length f; the scene point X = [X, Y, Z] projects to the image point x = [x, y].]
• A camera detects rays.
• A ray is defined by the optical center O and the observed point.
• The image is used as the method to determine the detected ray.
• Depth is not detected.
[Image: galaxies at $10^{9}$ light years from the Earth.]
Points at Infinity
• Projective cameras do observe points at infinity.
• Parallel lines meet at infinity; a projective camera observes this intersection point as a vanishing point.
• We intend to code and exploit these points at infinity in the monocular SLAM problem.
Parallax
• No-parallax geometries
– Camera rotation
– Camera observing a scene plane
• Low-parallax cases
– Distant features compared with camera translation
– Initial feature observation
Stereo Vision. Sensitivity Analysis
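[Figure: two cameras with optical centers $C_1$ and $C_2$ and focal length f, separated by baseline b, observe the point P at depth z. Labels: image plane, optical center, optical axis; image coordinates $x_1$, $x_2$; ray angles $\theta_1$, $\theta_2$; parallax angle $\beta$.]
• Ray 1 is described by $\tan\theta_1$ and ray 2 by $\tan\theta_2$.
• The three angles in the triangle $C_1 C_2 P$ add up to $\pi$ rad.
• The slide's equations relate the depth z, the baseline b, the image coordinates $x_1$, $x_2$, the ray angles $\theta_1$, $\theta_2$ and the parallax angle $\beta$, including small-angle approximations.
Geometry, depending only on:
• relative camera location
• observed point location
Parallax: the angle defined by the two rays corresponding to the two cameras for a scene point.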
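The two-ray geometry can be illustrated with a minimal planar sketch. The setup is an assumption made here, not taken from the slide: camera 1 at x = 0, camera 2 at x = b, both optical axes along +z, and each ray angle measured from the optical axis. The function name `triangulate` is hypothetical.

```python
import math

def triangulate(theta1, theta2, baseline):
    """Planar two-view triangulation (illustrative sketch).

    Assumed setup: camera 1 at x=0, camera 2 at x=baseline, both optical
    axes along +z, ray angles in radians measured from the optical axis.
    """
    denom = math.tan(theta1) - math.tan(theta2)
    if abs(denom) < 1e-12:
        raise ValueError("parallel rays: no parallax, depth is unobservable")
    z = baseline / denom            # depth along the optical axis
    x = z * math.tan(theta1)        # lateral position
    parallax = theta1 - theta2      # angle between the two rays at the point
    return x, z, parallax

# Example: a point at (x, z) = (1, 10) seen with a baseline of 2.
b = 2.0
theta1 = math.atan2(1.0, 10.0)
theta2 = math.atan2(1.0 - b, 10.0)
x, z, beta = triangulate(theta1, theta2, b)
```

As the point moves away or the baseline shrinks, `denom` and the parallax both go to zero, which is the low-parallax situation the later slides address.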
Image errors
• The error analysis is based on the ray orientation error.
• $\sigma_o$: standard deviation of the orientation error, in radians.
• $\sigma_{px}$: error standard deviation, in pixels.
• As a rule of thumb, the error range is $\pm 2\sigma_{px}$; for example, a clicking error of $\pm 1$ pixel corresponds to a standard deviation of 0.5 pixels.
• An approximate relation between them: $\sigma_o \cong \sigma_{px}\, d / f$, where $d$ is the pixel size (mm) and $f$ is the lens focal length (mm).
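A short sketch of the pixel-to-angle conversion described above. The numeric values (0.01 mm pixels, 2 mm focal length) are illustrative assumptions, and the function name is hypothetical.

```python
import math

def pixel_to_angle_sigma(sigma_px, pixel_size_mm, focal_mm):
    """Approximate angular standard deviation (radians) for a pixel error.

    Uses the small-angle relation sigma_o ~= sigma_px * d / f.
    """
    return sigma_px * pixel_size_mm / focal_mm

# A +/-1 pixel clicking error taken as a +/-2 sigma range gives 0.5 pixels.
sigma_o = pixel_to_angle_sigma(0.5, pixel_size_mm=0.01, focal_mm=2.0)
sigma_o_deg = math.degrees(sigma_o)
```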
Stereo vision. Gaussian propagation
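[Figure: cameras $C_1$ and $C_2$ separated by baseline b observe the point P with parallax $\beta$; image plane and optical axis shown; the uncertainty ellipse at P is oriented along the direction $(\theta_1+\theta_2)/2$.]
• Error propagation in the X, Z directions depends on the directions of the two rays.
• Ellipsoid with its major axis oriented along the bisector of the two rays, $(\theta_1+\theta_2)/2$.
• Depth error: proportional to the baseline and to the inverse squared parallax, $\propto b\,\sigma_o/\beta^{2}$.
• Lateral error: smaller than the depth error, proportional to the baseline and to the inverse parallax to the 1.5th power, $\propto b\,\sigma_o/\beta^{1.5}$.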
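The propagation can be sketched to first order for a planar case. The setup is assumed here rather than taken from the slide: camera 1 at x = 0, camera 2 at x = b, optical axes along +z, and independent angular noise $\sigma_o$ on each ray. The function names are hypothetical, and the sketch only illustrates the orientation and elongation of the ellipse, not the slide's exact constants.

```python
import math

def triangulation_covariance(theta1, theta2, baseline, sigma_o):
    """First-order covariance (pxx, pxz, pzz) of the triangulated point."""
    t1, t2 = math.tan(theta1), math.tan(theta2)
    s1, s2 = 1.0 + t1 * t1, 1.0 + t2 * t2   # sec^2 of each angle
    d = t1 - t2
    k = baseline / (d * d)
    # Jacobian columns: perturbing theta1 slides the point along ray 2,
    # perturbing theta2 slides it along ray 1.
    c1 = (-k * s1 * t2, -k * s1)   # d(x, z) / d(theta1)
    c2 = (k * s2 * t1, k * s2)     # d(x, z) / d(theta2)
    var = sigma_o ** 2
    pxx = var * (c1[0] ** 2 + c2[0] ** 2)
    pzz = var * (c1[1] ** 2 + c2[1] ** 2)
    pxz = var * (c1[0] * c1[1] + c2[0] * c2[1])
    return pxx, pxz, pzz

def major_axis_angle(pxx, pxz, pzz):
    """Angle of the ellipse major axis, measured from the z axis (radians)."""
    return 0.5 * math.atan2(2.0 * pxz, pzz - pxx)

# Symmetric example: point at (1, 10), baseline 2, so the bisector is the z axis.
theta1 = math.atan(0.1)
theta2 = -math.atan(0.1)
pxx, pxz, pzz = triangulation_covariance(theta1, theta2, 2.0, sigma_o=0.001)
angle = major_axis_angle(pxx, pxz, pzz)
depth_over_lateral = math.sqrt(pzz / pxx)
```

In this symmetric example the major axis lies on the bisector and the depth standard deviation is ten times the lateral one.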
Linearity limits of the Gaussian propagation. Inverse depth coding
[Figure, left panel "x z representation": point cloud in x (−2 to 1) versus z (0 to 900). Right panel "ρ θ representation": point cloud in θ (deg, −1 to 0.5) versus ρ = 1/d (0 to 0.018). Plot annotations: α = 0.5°, σ = 0.1°.]
Simulation: computing the depth of a point from 2 views at known camera locations
• Non-Gaussian in XZ
• Gaussian in 1/d, θ
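A minimal Monte Carlo sketch of this kind of simulation, using only the standard library. Everything here is an assumption for illustration: a planar symmetric setup, a unit baseline, the parallax and noise values, and the quantile-based asymmetry measure (close to 1 for a symmetric, Gaussian-like cloud).

```python
import math
import random

def asymmetry(samples):
    """(q97.5 - median) / (median - q2.5) of a sample set."""
    s = sorted(samples)
    n = len(s)
    lo = s[int(0.025 * (n - 1))]
    med = s[n // 2]
    hi = s[int(0.975 * (n - 1))]
    return (hi - med) / (med - lo)

def simulate(parallax_deg=0.5, sigma_deg=0.1, baseline=1.0, n=20000, seed=0):
    """Triangulate a point from two noisy rays; return asymmetry of rho and z."""
    rng = random.Random(seed)
    half = math.radians(parallax_deg) / 2.0
    sigma = math.radians(sigma_deg)
    rho, depth = [], []
    for _ in range(n):
        th1 = half + rng.gauss(0.0, sigma)    # noisy ray from camera 1
        th2 = -half + rng.gauss(0.0, sigma)   # noisy ray from camera 2
        r = (math.tan(th1) - math.tan(th2)) / baseline   # inverse depth
        rho.append(r)
        if r != 0.0:
            depth.append(1.0 / r)             # depth is the nonlinear quantity
    return asymmetry(rho), asymmetry(depth)

asym_rho, asym_z = simulate()
```

Under these assumptions the inverse depth samples stay nearly symmetric, while the depth samples show a long tail towards large depths.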
State of the art I
• SLAM, initially proposed by Smith and Cheeseman, 1986; widespread usage in robotics for multisensor fusion [Castellanos 1999], [Feder 1999], [Thrun et al. 2005]
– Sequential approach
– Ability to close loops, identifying previously observed features as reobserved. Complexity is linked to the scene, not to the number of observations processed.
• SLAM used for computer vision [Castellanos 2000], [Davison 1998], combined with odometry
• Monocular SLAM [Davison 2003]
– Camera "following the laws of mechanics" motion model
– Vision as the only sensor, no odometry
– Synergic usage of vision geometry and vision photometric map
– Low parallax points avoided:
» Points represented as XYZ, which only works with points close to the camera
» Delayed initialization
Side label: SFM, computer vision methods
State of the art II
• Photogrammetric bundle adjustment, 1960s
– Normally only close points
• Computer vision geometry, Hartley & Zisserman [Hartley 2003]
– Robust statistics
– Matching between several shots, enforcing coherence with a projective camera model
– Applied to individual shots and to sequences with varying camera parameters
– Applied to robot navigation [Nister 2003, Mouragnon 2006]
– Not sequential
– Wide-baseline performance
– Routine usage of points at infinity
• Model selection problem [Torr 1998, 1999]
– No parallax: homography model
– Parallax: epipolar geometry model
– As the frame rate increases, the interframe motion approaches a homography
Side label: SLAM methods
State of the art III
Feature initialization in Monocular SLAM
• Delayed approach: image tracking until safe Gaussian triangulation, Bailey 2003, Davison 2003
• Undelayed initialization
– Multiple hypotheses in depth, Kwok & Dissanayake 2004, Sola et al. 2005
Inverse depth usage
• Concept used in different domains
– Parallax with respect to the plane at infinity, Zisserman 2003
– Optical flow, Heeger & Jepson 1992
– Modified polar coordinates in bearing-only TMA, Aidala & Hammel 1983
– Sequential structure estimation from known motion, Okutomi & Kanade 1993
– Pairwise first and individual EKFs, Chowdhury & Chellappa 2003
• Recently used for Monocular SLAM
– Trawny & Roumeliotis 2005
– Eade & Drummond 2006, FastSLAM, a distinguished initialization stage
Camera motion priors
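[Figure: world frame W and successive camera frames C along the trajectory. Camera state: location and orientation $(\mathbf{r}^{WC}, \mathbf{q}^{WC})$, linear velocity $\mathbf{v}^{W}$, angular velocity $\boldsymbol{\omega}^{C}$.]
Constant velocity motion model:
• Smooth camera motion
• Impulse acceleration noise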
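A constant velocity prediction step could look like the sketch below. The exact update form is an assumption, not stated on the slide: velocity impulses `dv` and `domega` are added first, then the position is advanced by the new velocity and the orientation is composed with the incremental rotation. Quaternions are (w, x, y, z); all function names are hypothetical.

```python
import math

def quat_mul(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)

def quat_from_rotvec(w):
    """Unit quaternion for a rotation vector (axis times angle, radians)."""
    angle = math.sqrt(sum(c * c for c in w))
    if angle < 1e-12:
        return (1.0, 0.0, 0.0, 0.0)
    s = math.sin(angle / 2.0) / angle
    return (math.cos(angle / 2.0), w[0] * s, w[1] * s, w[2] * s)

def predict(r, q, v, omega, dt, dv=(0.0, 0.0, 0.0), domega=(0.0, 0.0, 0.0)):
    """One prediction step; dv and domega are the impulse noise samples."""
    v_new = tuple(vi + di for vi, di in zip(v, dv))
    omega_new = tuple(wi + di for wi, di in zip(omega, domega))
    r_new = tuple(ri + vi * dt for ri, vi in zip(r, v_new))
    q_new = quat_mul(q, quat_from_rotvec(tuple(wi * dt for wi in omega_new)))
    return r_new, q_new, v_new, omega_new

# Example: 1 unit/s along x and a quarter turn per second about the camera z axis.
r1, q1, v1, w1 = predict((0.0, 0.0, 0.0), (1.0, 0.0, 0.0, 0.0),
                         (1.0, 0.0, 0.0), (0.0, 0.0, math.pi / 2.0), dt=1.0)
```

With zero impulses the velocities are carried over unchanged, which is what makes the predicted motion smooth.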
Scene point coding in inverse depth. Measurement equation
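[Figure: world frame W; current camera frame C at $(\mathbf{r}^{WC}, \mathbf{q}^{WC})$; the scene point lies at distance $d_i = 1/\rho_i$ along the ray $\mathbf{m}(\theta_i, \phi_i)$ starting at $(x_i, y_i, z_i)^{\top}$; $\alpha$ is the parallax angle; $\mathbf{h}^{C}$ is the ray from the current camera to the point.]
• $x_i, y_i, z_i$: those of the first time the feature is observed.
• $\mathbf{m}(\theta_i, \phi_i)$: ray of the first observation; $\rho_i = 1/d_i$.
• $\alpha$: parallax angle.
• Ray to the point in the camera frame, in XYZ form and scaled by the inverse depth:
$$\mathbf{h}^{C}_{XYZ} = \mathbf{R}^{CW}\left( \begin{pmatrix} x_i \\ y_i \\ z_i \end{pmatrix} + \frac{1}{\rho_i}\,\mathbf{m}(\theta_i,\phi_i) - \mathbf{r}^{WC} \right)$$
$$\mathbf{h}^{C}_{\rho} = \begin{pmatrix} h_x \\ h_y \\ h_z \end{pmatrix} = \mathbf{R}^{CW}\left( \rho_i \left( \begin{pmatrix} x_i \\ y_i \\ z_i \end{pmatrix} - \mathbf{r}^{WC} \right) + \mathbf{m}(\theta_i,\phi_i) \right)$$
• Projection to the image:
$$u = u_0 - \frac{f}{d_x}\frac{h_x}{h_z}, \qquad v = v_0 - \frac{f}{d_y}\frac{h_y}{h_z}$$
• Distant point, $\rho_i \to 0$: parallax goes to zero.
• Close camera locations, low baseline: $(x_i, y_i, z_i)^{\top} - \mathbf{r}^{WC}$ goes to zero.
• At low parallax, a point is observed as $\mathbf{h}^{C} = \mathbf{R}^{CW}\,\mathbf{m}(\theta_i,\phi_i)$.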
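The measurement equation can be sketched in plain Python as follows. The azimuth/elevation convention used for `m` is an assumption (the slide does not spell it out), `R_cw` is a 3x3 rotation given as nested lists, and the calibration values in the example are illustrative.

```python
import math

def m(theta, phi):
    """Unit ray from azimuth theta and elevation phi (assumed convention)."""
    return (math.cos(phi) * math.sin(theta),
            -math.sin(phi),
            math.cos(phi) * math.cos(theta))

def h_rho(feature, r_wc, R_cw):
    """Ray to the feature in the camera frame, scaled by the inverse depth."""
    x, y, z, theta, phi, rho = feature
    ray = m(theta, phi)
    w = [rho * (p - r) + mi for p, r, mi in zip((x, y, z), r_wc, ray)]
    return tuple(sum(R_cw[i][j] * w[j] for j in range(3)) for i in range(3))

def project(h, u0, v0, f_over_dx, f_over_dy):
    """Pinhole projection of the ray h = (hx, hy, hz) to pixel coordinates."""
    hx, hy, hz = h
    return u0 - f_over_dx * hx / hz, v0 - f_over_dy * hy / hz

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
camera = (1.0, 0.0, 0.0)   # camera translated 1 unit along x

# Feature first seen from the origin along the optical axis, at depth 10.
near = (0.0, 0.0, 0.0, 0.0, 0.0, 0.1)
u_near, v_near = project(h_rho(near, camera, identity), 160.0, 120.0, 200.0, 200.0)

# Same ray with rho = 0: a point at infinity, handled without any special case.
far = (0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
u_far, v_far = project(h_rho(far, camera, identity), 160.0, 120.0, 200.0, 200.0)
```

The point at infinity projects to the same pixel whatever the camera translation, while the near point shifts with it.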
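SLAM joint map + camera state vector
The full estimate is coded in a single Gaussian distribution:
1. Camera state vector.
2. ALL the map features.
3. The joint Gaussian has proven to be useful in coding the strong correlation between the observations.
$$\mathbf{x} = \left(\mathbf{x}_v^{\top}, \mathbf{y}_1^{\top}, \ldots, \mathbf{y}_n^{\top}\right)^{\top}, \qquad
\mathbf{P} = \begin{bmatrix}
P_{x_v x_v} & P_{x_v y_1} & \cdots & P_{x_v y_n} \\
P_{y_1 x_v} & P_{y_1 y_1} & \cdots & P_{y_1 y_n} \\
\vdots & \vdots & \ddots & \vdots \\
P_{y_n x_v} & P_{y_n y_1} & \cdots & P_{y_n y_n}
\end{bmatrix}$$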
Active search
[Figure: stored patch]
• Estimation at step k−1
• Prediction at step k
• Update at step k: the stored patch is searched for in the whole acceptance region
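One common way to define an acceptance region is a chi-square gate on the innovation, sketched below. This is an assumption about how such a region can be built, not a description of the authors' implementation: `S` stands for a 2x2 innovation covariance in pixels, and 5.99 is the 95% chi-square threshold for 2 degrees of freedom.

```python
def in_acceptance_region(z, z_pred, S, gate=5.99):
    """True if pixel z lies inside the elliptical gate around z_pred.

    Evaluates (z - z_pred)^T S^-1 (z - z_pred) <= gate for a 2x2 matrix S.
    """
    du, dv = z[0] - z_pred[0], z[1] - z_pred[1]
    (a, b), (c, d) = S
    det = a * d - b * c
    mahalanobis_sq = (d * du * du - (b + c) * du * dv + a * dv * dv) / det
    return mahalanobis_sq <= gate

S = [[4.0, 0.0], [0.0, 1.0]]     # more uncertainty along u than along v
predicted = (100.0, 50.0)
inside = in_acceptance_region((104.0, 50.0), predicted, S)
outside = in_acceptance_region((100.0, 53.0), predicted, S)
```

Only pixels inside the gate need to be compared against the stored patch, which is what keeps the search cheap.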
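Inverse depth feature initialization prior
• New points are observed from just one observation: $x_i, y_i, z_i, \theta_i, \phi_i$ and the corresponding covariance are initialized from the first feature observation.
• $\rho$ is initialized at $\rho_0$, and its covariance $\sigma_\rho$ is set so that the interval $[\rho_0 - 2\sigma_\rho,\ \rho_0 + 2\sigma_\rho]$ covers a region from $d_{\min}$ up to and including infinity.
[Figure: ray $\mathbf{m}(\theta_i, \phi_i)$ from $(x_i, y_i, z_i)^{\top}$, with marks at $d_{\min} = 1/(\rho_0 + 2\sigma_\rho)$, at $1/\rho_0$, at $\infty$, and at $\rho_0 - 2\sigma_\rho$.]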
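A small sketch of how such a prior could be chosen from a minimum depth. The specific rule is an assumption for illustration: the upper end of the ±2σ interval is placed at 1/d_min and the lower end at `rho_low`, which must be zero or negative so that infinity (ρ = 0) is inside the interval.

```python
def inverse_depth_prior(d_min, rho_low=0.0):
    """Return (rho0, sigma_rho) with [rho0 - 2s, rho0 + 2s] = [rho_low, 1/d_min]."""
    if d_min <= 0.0:
        raise ValueError("d_min must be positive")
    if rho_low > 0.0:
        raise ValueError("rho_low must be <= 0 so that infinity is covered")
    rho_high = 1.0 / d_min
    rho0 = 0.5 * (rho_high + rho_low)
    sigma_rho = 0.25 * (rho_high - rho_low)
    return rho0, sigma_rho

rho0, sigma_rho = inverse_depth_prior(d_min=1.0)
closest_depth = 1.0 / (rho0 + 2.0 * sigma_rho)
lower_bound = rho0 - 2.0 * sigma_rho
```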
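Dimensional Monocular SLAM
[Diagram: "Monocular SLAM processing" block.
Inputs: calibrated image measurements ($z_i$, $\sigma_z$); frame rate ($\Delta t$); camera motion priors ($\sigma_a$, $\sigma_\alpha$, and the terms in $V$ and $\omega$); initial depth prior ($\rho_0$, $\sigma_{\rho_0}$).
Outputs: camera location and scene map, $(\mathbf{r}^{WC}, \mathbf{q}^{WC}, \mathbf{v}^{W}, \boldsymbol{\omega}^{C}, \mathbf{y}_1, \ldots, \mathbf{y}_n)^{\top}$.]
• So the estimation process can be considered as a function.
• The dimensions involved are time and length.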
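Dimensionless Monocular SLAM
[Diagram: the same "Monocular SLAM processing" block, with $\Delta t = 1$ and $\rho_0 = 1$ and the remaining inputs replaced by dimensionless groups $\Pi$: calibrated image measurements ($\Pi_{z_i}$, $\Pi_{\sigma_z}$), camera motion priors ($\Pi_{\sigma_a}$, $\Pi_{\sigma_\alpha}$, and the terms in $V$ and $\omega$), initial depth prior ($\Pi_{\sigma_{\rho_0}}$). Outputs: camera location and scene map.]
• The state vector is split into scale, d, and dimensionless coefficients.
• $\Delta t$ and $\rho_0$ are selected to define the dimensionless coefficients [Buckingham 1914].
• A monocular camera cannot observe scale, d. If scale were known: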
Interpreting Dimensionless Parameters as Image Quantities
• Dimensionless camera linear acceleration standard deviation
• Dimensionless camera angular acceleration standard deviation
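The dimensionless acceleration terms can be illustrated by plain dimensional analysis with $\Delta t$ and $\rho_0$ as the base quantities. The formulas below follow from the units alone (m/s² and rad/s²) and are a sketch under that assumption; the function name and the numeric values are hypothetical.

```python
def dimensionless_accelerations(sigma_a, sigma_alpha, rho0, dt):
    """Dimensionless groups built from dt (time) and rho0 (inverse length).

    sigma_a: linear acceleration std, length/s^2
    sigma_alpha: angular acceleration std, rad/s^2
    """
    pi_sigma_a = sigma_a * rho0 * dt ** 2    # length/s^2 * 1/length * s^2
    pi_sigma_alpha = sigma_alpha * dt ** 2   # rad/s^2 * s^2
    return pi_sigma_a, pi_sigma_alpha

dt = 1.0 / 30.0
# The same physical setup expressed in metres and in centimetres.
in_metres = dimensionless_accelerations(6.0, 6.0, rho0=0.5, dt=dt)
in_centimetres = dimensionless_accelerations(600.0, 6.0, rho0=0.005, dt=dt)
```

Both unit systems give the same groups, which is why a monocular estimate can be tuned without knowing the scene scale.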
Inverse depth estimation history
[Figure: panels (a) and (b) show image frames with features 3 and 11 marked; two plots (horizontal axis 0 to 750) show the inverse depth estimates with the "infinity / zero inverse depth" line marked.]
• Near feature (3): eventually excludes infinity from the acceptance region
• Distant feature (11): infinity always included in the acceptance region
Loop Closing
[Figure: six plots against frame number (200 to 600): rx, ry, rz in meters (0 to 1) and ψ (rotx), θ (roty), φ (rotz) in degrees (0 to 2).]
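Inverse depth linearity analysis
• Depth coding, at initialization: $\alpha \approx 0 \Rightarrow \cos\alpha \approx 1$, so $L_d \uparrow$: poor linearity.
• Inverse depth coding, at initialization: $\alpha \approx 0 \Rightarrow 1 - \cos\alpha \approx 0$, so $L_\rho \downarrow$: linearity.
• After parallax gathering, $1 - \cos\alpha$ is no longer close to zero, but $\sigma_\rho$ has been reduced: good linearity performance along the whole estimation.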
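The behaviour of the two indices can be sketched numerically. The exact expressions below are an assumption recalled from the authors' published papers and should be checked against them; the slide itself only shows that $L_d$ depends on $\cos\alpha$ and $L_\rho$ on $1 - \cos\alpha$ and $\sigma_\rho$. Here `d0` is the depth from the first camera, `d1` the depth from the current camera, and all names are hypothetical.

```python
import math

def linearity_index_depth(sigma_d, d1, alpha):
    """Assumed form: L_d = (4 sigma_d / d1) |cos(alpha)|."""
    return 4.0 * sigma_d / d1 * abs(math.cos(alpha))

def linearity_index_inverse_depth(sigma_rho, rho0, d0, d1, alpha):
    """Assumed form: L_rho = (4 sigma_rho / rho0) |1 - (d0 / d1) cos(alpha)|."""
    return 4.0 * sigma_rho / rho0 * abs(1.0 - (d0 / d1) * math.cos(alpha))

# At initialization: no parallax, large uncertainty (illustrative values).
l_d_init = linearity_index_depth(sigma_d=5.0, d1=10.0, alpha=0.0)
l_rho_init = linearity_index_inverse_depth(0.05, 0.1, d0=10.0, d1=10.0, alpha=0.0)

# After gathering parallax: the uncertainty has shrunk.
alpha = math.radians(20.0)
l_rho_later = linearity_index_inverse_depth(0.002, 0.1, d0=10.0, d1=10.0, alpha=alpha)
```

A small index means the measurement equation is close to linear over the uncertainty region, so the Gaussian approximation is adequate.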
Inverse depth to XYZ conversion
• Inverse depth: good performance along the whole estimation process
• Inverse depth coding needs 6 parameters
• XYZ coding: good performance for reduced depth uncertainty
• So it is not mandatory to switch from inverse depth to XYZ, but the computing overhead can be reduced
• Switching criterion based on the linearity index
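The conversion of a single feature from the 6-parameter inverse depth coding to the 3-parameter XYZ coding can be sketched as below. The azimuth/elevation convention for the ray is an assumption, the function names are hypothetical, and only the mean is converted here; a filter would also need to transform the covariance.

```python
import math

def ray(theta, phi):
    """Unit ray from azimuth theta and elevation phi (assumed convention)."""
    return (math.cos(phi) * math.sin(theta),
            -math.sin(phi),
            math.cos(phi) * math.cos(theta))

def inverse_depth_to_xyz(feature):
    """(x, y, z, theta, phi, rho) -> (X, Y, Z); requires rho > 0."""
    x, y, z, theta, phi, rho = feature
    if rho <= 0.0:
        raise ValueError("a point at infinity has no finite XYZ coding")
    mx, my, mz = ray(theta, phi)
    return (x + mx / rho, y + my / rho, z + mz / rho)

# Feature first seen from (1, 2, 3) along the optical axis, at depth 2.
point = inverse_depth_to_xyz((1.0, 2.0, 3.0, 0.0, 0.0, 0.5))
```

Each converted feature shrinks the state vector by three entries, which is where the saving in computation comes from.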
Switching evolution
[Figure: three plots with horizontal axis 0 to 700.
– "state vector dimension" (0 to 600): switching inverse depth to XYZ versus all map features in inverse depth.
– "% dimension reduction for code switching" (70 to 100 %).
– "# map 3D points" (0 to 100): total # 3D points, total # 3D points in inverse depth, total # 3D points in XYZ.]
Switching evolution
[Figure: panels (a), (b), (c), (d), labelled "Inverse depth coding" and "Depth coding".]
Bibliography
J.M.M. Montiel, Javier Civera and Andrew J. Davison: "Unified Inverse Depth Parametrization for Monocular SLAM". Robotics: Science and Systems Conference, 2006.
Javier Civera, Andrew J. Davison and J.M.M. Montiel: "Inverse Depth to Depth Conversion for Monocular SLAM". IEEE Int. Conf. on Robotics and Automation, Rome, April 2007.
Javier Civera, Andrew J. Davison and J.M.M. Montiel: "Dimensionless Monocular SLAM". 3rd Iberian Conference on Pattern Recognition and Image Analysis (IbPRIA), Girona, 2007.
J.M.M. Montiel and Andrew J. Davison: "A visual compass based on SLAM". In Proc. Intl. Conf. on Robotics and Automation, pages 1917-1922, 2006.
J.M.M. Montiel home page: http://webdiis.unizar.es/~josemari/
Andrew Davison home page: http://www.doc.ic.ac.uk/~ajd/