![Page 1: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/1.jpg)
Vision-based SLAM
Simon Lacroix
Robotics and AI group
LAAS/CNRS, Toulouse
With contributions from: Anthony Mallet, Il-Kyun Jung, Thomas Lemaire and Joan Sola
![Page 2: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/2.jpg)
Benefits of vision for SLAM?
• Cameras: low cost, light and power-saving
• Perceive data
– In a volume
– Very far
– Very precisely (1024 x 1024 pixels, 60º x 60º FOV: 0.06º pixel resolution, 1.0 cm at 10.0 m)
• Stereovision
– 2 cameras provide depth
• Images carry a vast amount of information
• A vast know-how exists in the computer vision community
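The resolution figures quoted on this slide can be checked with a few lines of arithmetic. The sketch below assumes the angular resolution is uniform across the field of view (a simplification of a real lens); the function names are illustrative only.

```python
import math

def pixel_resolution_deg(fov_deg, pixels):
    """Approximate angular size of one pixel, assuming a uniform spread over the FOV."""
    return fov_deg / pixels

def footprint_m(resolution_deg, range_m):
    """Lateral size covered by one pixel at a given range."""
    return range_m * math.tan(math.radians(resolution_deg))

res = pixel_resolution_deg(60.0, 1024)   # about 0.06 degrees per pixel
size = footprint_m(res, 10.0)            # about 0.01 m, i.e. 1 cm at 10 m
print(f"{res:.3f} deg/pixel, {size * 100:.1f} cm at 10 m")
```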
![Page 3: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/3.jpg)
0. A few words on stereovision
• The way humans perceive depth
• Very popular in the early 20th century
• Anaglyphs: red/blue, polarization
[Figures: stereo camera, stereo image pair, stereo images viewer]
![Page 4: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/4.jpg)
Principle of stereovision
In 2 dimensions (two linear cameras):
$$d = \frac{b}{\tan(\alpha) + \tan(\beta)}$$
[Figure labels: left camera, right camera, baseline $b$, left image, right image, angle $\alpha$, disparity $d$]
![Page 5: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/5.jpg)
Principle of stereovision
In 3 dimensions (two usual matrix cameras):
1. Establish the geometry of the system (off line)
2. Establish matches between the two images and compute the disparity
3. From the disparity of the matches, compute the 3D coordinates
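The last step, going from a disparity to 3D coordinates, can be sketched as follows. This assumes an ideal rectified pinhole stereo pair; the focal length `f` (in pixels), baseline `b` (in metres) and principal point `(cu, cv)` are hypothetical calibration values standing in for the result of the off-line geometry step, not figures from the slides.

```python
def triangulate(u, v, disparity, f, b, cu, cv):
    """Return (X, Y, depth) of a pixel (u, v) of the left image.

    Assumes rectified images: depth = f * b / disparity, and X, Y follow
    from the pinhole model. All parameter names are illustrative.
    """
    if disparity <= 0:
        raise ValueError("disparity must be positive")
    depth = f * b / disparity
    x = (u - cu) * depth / f
    y = (v - cv) * depth / f
    return x, y, depth

# Hypothetical calibration: f = 500 px, b = 0.4 m, principal point (320, 240).
print(triangulate(370, 240, 20.0, f=500.0, b=0.4, cu=320.0, cv=240.0))
```

A larger baseline or focal length gives a larger disparity for the same depth, which is why both improve the depth resolution.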
![Page 6: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/6.jpg)
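Geometry of stereovision
[Figure labels: left and right camera frames (x, y, z), optical centres Ol and Or, 3D points P, P1 and P2, image points pl, pr, pr1 and pr2]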
![Page 7: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/7.jpg)
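Geometry of stereovision
[Figure labels: left and right camera frames (x, y, z), optical centres Ol and Or, 3D points P and Q, image points pl, pr, Ql and Qr]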
![Page 8: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/8.jpg)
Geometry of stereovision
Epipolar geometry
[Figure labels: left and right camera frames (x, y, z), optical centres Ol and Or, R, epipoles, epipolar lines]
![Page 9: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/9.jpg)
Stereo images rectification
Goal: transform the images so that epipolar lines are parallel
Benefit: reduces the computational cost of the matching process
![Page 10: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/10.jpg)
Dense pixel-based stereovision
Problem: « For each pixel in the left image, find its correspondent in the right image »
Left line: … 3 6 3 7 9 2 8 7 6 8 9 6 4 9 0 9 9 0 …
Right line: … 3 5 7 4 9 6 3 9 6 5 8 6 3 0 1 9 7 5 …
The matches are computed on windows
Several ways to compare windows: “SAD”, “SSD”, “ZNCC”, Hamming distance on census-transformed images…
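As an illustration of window-based matching, here is a minimal sketch of a SAD (sum of absolute differences) search along one rectified scanline. It is only one of the criteria listed above, and the window size, search range and function name are assumptions made for the example.

```python
import numpy as np

def sad_match(left, right, x, half_window=2, max_disparity=8):
    """Find the disparity d minimising the SAD between the window centred
    on left[x] and the window centred on right[x - d]."""
    ref = left[x - half_window: x + half_window + 1]
    best_d, best_cost = 0, np.inf
    for d in range(max_disparity + 1):
        xr = x - d
        if xr - half_window < 0:      # candidate window would leave the line
            break
        cand = right[xr - half_window: xr + half_window + 1]
        cost = np.abs(ref - cand).sum()
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d, best_cost

# Synthetic check: the right line is the left line shifted by 3 pixels.
left = np.array([3, 6, 3, 7, 9, 2, 8, 7, 6, 8, 9, 6, 4, 9, 0, 9, 9, 0], float)
right = np.roll(left, -3)
print(sad_match(left, right, x=10))
```

SSD or ZNCC would only change the line computing `cost`; ZNCC additionally normalises each window, which makes the score robust to gain and offset differences between the two cameras.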
![Page 11: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/11.jpg)
Dense pixel-based stereovision
[Figures: original image, disparity map, 3D image]
![Page 12: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/12.jpg)
Outline
0. A few words on stereovision
0-bis. Visual odometry
![Page 13: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/13.jpg)
Visual odometry principle
1. Stereovision
2. Pixel selection
3. Pixel tracking (followed by stereovision on the new image pair)
4. Motion estimation
![Page 14: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/14.jpg)
![Page 15: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/15.jpg)
Visual odometry
• Fairly good precision (up to 1% on 100 m trajectories)
• But:
– Depends on odometry (to track pixels)
– No error model available
![Page 16: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/16.jpg)
Visual odometry
• Applied on the Mars Exploration Rovers
[Figure label: 50% slip]
![Page 17: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/17.jpg)
Outline
0. A few words on stereovision
0-bis. Visual odometry
1. Stereovision SLAM
![Page 18: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/18.jpg)
What kind of landmarks?
Interest points = sharp peaks of the autocorrelation function
Harris detector (precise version [Schmid 98])
Auto-correlation matrix (s: scale of the detection):
Principal curvatures defined by the two eigenvalues $\lambda_1, \lambda_2$ of the matrix
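To make the role of the two eigenvalues concrete, the sketch below computes them at every pixel for a basic auto-correlation (structure) matrix. It is a simplified stand-in, not the precise Harris variant cited on the slide: gradients come from finite differences and the accumulation uses a plain box window of half-size `r`, both of which are assumptions made for the example.

```python
import numpy as np

def box_sum(a, r):
    """Sum of `a` over a (2r+1) x (2r+1) window, zero-padded at the borders."""
    p = np.pad(a, r)
    out = np.zeros_like(a, dtype=float)
    for dy in range(2 * r + 1):
        for dx in range(2 * r + 1):
            out += p[dy:dy + a.shape[0], dx:dx + a.shape[1]]
    return out

def structure_eigenvalues(img, r=2):
    """Eigenvalues (lam1 >= lam2) of the matrix [[a, b], [b, c]] built
    from windowed products of the image gradients."""
    iy, ix = np.gradient(img.astype(float))
    a = box_sum(ix * ix, r)
    b = box_sum(ix * iy, r)
    c = box_sum(iy * iy, r)
    half_trace = 0.5 * (a + c)
    root = np.sqrt(0.25 * (a - c) ** 2 + b ** 2)
    return half_trace + root, half_trace - root

# Synthetic image with a single corner at (10, 10).
img = np.zeros((20, 20))
img[10:, 10:] = 1.0
lam1, lam2 = structure_eigenvalues(img)
print(lam1[10, 10], lam2[10, 10])   # both large at the corner
print(lam1[15, 10], lam2[15, 10])   # one large, one near zero on an edge
```

An interest point is a location where both eigenvalues are large; along an edge only one of them is, and in a flat region neither is.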
![Page 19: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/19.jpg)
Landmarks : interest points
• Landmark matching
?
![Page 20: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/20.jpg)
Interest points stability
Interest point repeatability:
$$\text{Repeatability} = \frac{\text{Repeated points}}{\text{Detected points}}$$
= 70% (7 repeated points out of 10 detected points)
Interest point similarity: resemblance measure of the two principal curvatures of repeated points
$$S_{p1}(x, x') = \frac{\min(\lambda_1, \lambda_1')}{\max(\lambda_1, \lambda_1')}, \qquad S_{p2}(x, x') = \frac{\min(\lambda_2, \lambda_2')}{\max(\lambda_2, \lambda_2')}$$
Maximum point similarity: 1
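Both measures are simple ratios; a minimal sketch, with function names chosen for the example, is:

```python
def repeatability(n_repeated, n_detected):
    """Fraction of detected points that are found again in the other image."""
    return n_repeated / n_detected

def point_similarity(lam, lam_prime):
    """Ratio min/max of one principal curvature measured at a point and at
    its repeated counterpart; equals 1 when the two curvatures are identical."""
    return min(lam, lam_prime) / max(lam, lam_prime)

print(repeatability(7, 10))          # the slide's example: 7 out of 10
print(point_similarity(2.0, 4.0))    # curvatures differing by a factor 2
```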
![Page 21: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/21.jpg)
Interest points stability
Repeatability and point similarity evaluation:
Evaluated with known artificial rotation and scale changes
![Page 22: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/22.jpg)
Interest points matching
Principle: combine signal and geometric information to match groups of points [Jung ICCV 01]
![Page 23: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/23.jpg)
Landmark matching results
[Figures: consecutive images, large viewpoint change, small overlap]
![Page 24: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/24.jpg)
Landmark matching results
[Figures: 1.5 scale change, 3.0 scale change]
![Page 25: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/25.jpg)
Landmark matching results (cont'd)
[Figures: detected points, matched points, another example]
![Page 26: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/26.jpg)
Stereovision SLAM
– Landmark detection → Vision: interest points
– Relative observations (measures)
• Of the landmark positions → Stereovision
• Of the robot motions → Visual motion estimation
– Observation associations → Interest points matching
– Refinement of the landmark and robot positions → Extended Kalman filter
![Page 27: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/27.jpg)
Dense stereovision is actually not required
Interest point (IP) matching is applied to the stereo frames (even easier!)
![Page 28: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/28.jpg)
![Page 29: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/29.jpg)
Visual motion estimation
1. Stereovision
2. Interest point detection
3. Interest points matching
4. Stereovision
5. Motion estimation
![Page 30: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/30.jpg)
Stereovision SLAM
– Landmark detection → Vision: interest points (OK)
– Relative observations (measures)
• Of the landmark positions → Stereovision (OK)
• Of the robot motions → Visual motion estimation (OK)
– Observation associations → Interest points matching (OK)
– Refinement of the landmark and robot positions → Extended Kalman filter
![Page 31: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/31.jpg)
Setting up the Kalman filter
• System state: $x(k) = [x_p, m_1, \ldots, m_N]$, with $x_p = [\phi, \theta, \psi, t_x, t_y, t_z]$ and $m_i = [x_i, y_i, z_i]$
$$P(k) = \begin{bmatrix} P_{pp}(k) & P_{pm}(k) \\ P_{pm}^{T}(k) & P_{mm}(k) \end{bmatrix}$$
• System equation: $x(k+1) = f(x(k), u(k+1)) + v(k+1)$, $v$ with covariance $P_v(k)$
• Observation equation: $z(k) = h(x(k)) + w(k)$, $w$ with covariance $P_w(k)$
• Prediction: motion estimates, $u(k+1) = (\Delta\phi, \Delta\theta, \Delta\psi, \Delta t_x, \Delta t_y, \Delta t_z)$
• Landmark “discovery”: stereovision, $m_i = [x_i, y_i, z_i]$
• Observation: matching + stereovision, $\upsilon_i(k+1) = z_i(k+1) - \hat{z}_i(k+1|k)$
Need to estimate the errors
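The update step of such a filter can be sketched on a deliberately reduced problem. The example below is an illustration under strong assumptions, not the filter of the slides: the state holds only the robot translation and one landmark, rotations are ignored, and the observation is taken as the landmark position relative to the robot, so that $h$ is linear with Jacobian $H = [-I \;\; I]$.

```python
import numpy as np

def ekf_update(x, P, z, h, H, Pw):
    """One EKF update: innovation, gain, corrected state and covariance."""
    nu = z - h(x)                      # innovation z - z_hat
    S = H @ P @ H.T + Pw               # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)     # Kalman gain
    x_new = x + K @ nu
    P_new = (np.eye(len(x)) - K @ H) @ P
    return x_new, P_new

# State: robot translation (3) followed by one landmark (3).
x = np.array([0.0, 0.0, 0.0, 1.0, 2.0, 3.0])
P = np.diag([0.1] * 3 + [1.0] * 3)              # P_pp and P_mm blocks, P_pm = 0
H = np.hstack([-np.eye(3), np.eye(3)])          # Jacobian of h(x) = m - t
h = lambda s: s[3:] - s[:3]
Pw = 0.01 * np.eye(3)                           # observation noise covariance
z = np.array([1.1, 2.0, 3.0])                   # landmark observed relative to the robot

x_new, P_new = ekf_update(x, P, z, h, H, Pw)
print(x_new)
print(P_new[0, 3])    # robot/landmark cross-covariance is now non-zero
```

After the update the off-diagonal block of the covariance is filled in: it is this robot/landmark correlation that lets a later re-observation of an old landmark correct the whole map.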
![Page 32: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/32.jpg)
Error estimates (1)
• Errors on the disparity estimates
Empirical study: $\sigma_d = f(c)$ [plot label: s]
• Errors on the 3D coordinates
Stereovision error: $x = \dfrac{\alpha}{d} \Rightarrow \sigma_x = \dfrac{\sigma_d}{\alpha}\, x^2$
Maximal errors:
0.4 m baseline: $\sigma_x \leq 10^{-3}\, x^2$
1.2 m baseline: $\sigma_x \leq 3 \cdot 10^{-4}\, x^2$
Online estimation of the errors
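The stereovision error relation is a first-order propagation of the disparity error through $x = \alpha / d$. A minimal sketch, in which `alpha` stands for the constant of that relation and all numeric values are made up for the example:

```python
def depth_from_disparity(alpha, d):
    """x = alpha / d."""
    return alpha / d

def depth_std(alpha, x, sigma_d):
    """First-order propagation: |dx/dd| = x**2 / alpha, so sigma_x = sigma_d * x**2 / alpha."""
    return sigma_d / alpha * x ** 2

alpha, d, sigma_d = 400.0, 40.0, 0.1    # hypothetical values
x = depth_from_disparity(alpha, d)
print(x, depth_std(alpha, x, sigma_d))

# Numerical check against a finite difference of x = alpha / d.
eps = 1e-6
slope = abs(depth_from_disparity(alpha, d + eps) - x) / eps
print(slope * sigma_d)
```

The error grows with the square of the distance, which is why distant landmarks are poorly localised in depth and why a longer baseline (larger `alpha`) helps.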
![Page 33: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/33.jpg)
Error estimates (2)
• Interest point matching error (not mismatching)
- Correlation surface built with rotation- and scale-adaptive correlation, fitted with a Gaussian distribution
[Figures: Gaussian distribution, correlation surface]
• Combination of matching and stereo error
- Driven by the 8 neighbor 3D points, projecting the one-sigma covariance ellipse onto the 3D surface
[Figure labels: 1 pixel, $X_0$, $X_k$, $w_k$]
[Equation: $\sigma_X^2$ obtained from a sum over the 8 neighbors ($k = 1, \ldots, 8$) involving the weights $w_k$, the offsets $X_k - X_0$ and the variances $\sigma_0^2$ and $\sigma_k^2$]
$\sigma_0^2, \sigma_k^2$: variances of the stereovision error
![Page 34: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/34.jpg)
Error estimates (3)
Visual motion estimation error
• Propagating the uncertainty of the 3D matching point set to the optimal motion estimate
- 3D matching point set: $\hat{Q} = Q + \Delta Q = [X_1, \ldots, X_N, X'_1, \ldots, X'_N]$
- Optimal motion estimate: $\hat{u} = u + \Delta u = (\hat{\Theta}, \hat{\Phi}, \hat{\Psi}, \hat{t}_x, \hat{t}_y, \hat{t}_z)$
- Cost function: $J(\hat{u}, \hat{Q}) = \sum_{n=1}^{N} \left( X'_n - R(\hat{\Theta}, \hat{\Phi}, \hat{\Psi})\, X_n - \hat{t}^{T} \right)^2$
• Covariance of the random perturbation $\Delta u$: propagation using a Taylor series expansion of the Jacobian of the cost function around $(\hat{u}, \hat{Q})$
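For a cost function of this form, the minimising rotation and translation have a well-known closed-form solution based on a singular value decomposition (the Kabsch method). The sketch below uses that technique as an illustration; the slides do not say which solver was actually used, and the uncertainty propagation itself is not shown.

```python
import numpy as np

def estimate_rigid_motion(X, Xp):
    """Least-squares R, t such that Xp_n ~ R @ X_n + t.

    X, Xp: (N, 3) arrays of matched 3D points before and after the motion.
    """
    cx, cxp = X.mean(axis=0), Xp.mean(axis=0)
    H = (X - cx).T @ (Xp - cxp)                  # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Force a proper rotation (determinant +1), not a reflection.
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = cxp - R @ cx
    return R, t

def cost(X, Xp, R, t):
    """Sum of squared residuals, as in the cost function J."""
    return float(((Xp - X @ R.T - t) ** 2).sum())

# Synthetic check: 30 degree rotation about z plus a translation.
a = np.radians(30.0)
R_true = np.array([[np.cos(a), -np.sin(a), 0.0],
                   [np.sin(a),  np.cos(a), 0.0],
                   [0.0,        0.0,       1.0]])
t_true = np.array([1.0, 2.0, 3.0])
X = np.random.default_rng(0).normal(size=(20, 3))
Xp = X @ R_true.T + t_true
R, t = estimate_rigid_motion(X, Xp)
print(cost(X, Xp, R, t))
```

With noisy points the residual no longer vanishes, and the covariance of the estimate depends on the covariance of each 3D point, which is what the propagation step quantifies.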
![Page 35: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/35.jpg)
![Page 36: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/36.jpg)
Results
70 m loop, altitude from 25 to 30 m, 90 stereo pairs processed
[Figures: trajectory and landmarks, with landmark error ellipses (x40); position and attitude variances]
![Page 37: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/37.jpg)
Results
70 m loop, altitude from 25 to 30 m, 90 stereo pairs processed
| Frame 1/90 | Reference | Reference Std. Dev. | VME result | VME Abs. error | SLAM result | SLAM Std. Dev. | SLAM Abs. error |
|---|---|---|---|---|---|---|---|
| Θ | 6.19° | 0.18° | 11.93° | 5.74° | 6.01° | 0.16° | 0.18° |
| Φ | 2.31° | 0.66° | 4.00° | 1.69° | 1.42° | 0.55° | 0.89° |
| Ψ | -105.94° | 0.06° | -105.52° | 0.41° | -106.03° | 0.08° | 0.09° |
| tx | 3.17m | 0.26m | 5.31m | 2.14m | 3.13m | 0.09m | 0.04m |
| ty | 0.61m | 0.07m | 2.01m | 1.40m | 0.26m | 0.19m | 0.35m |
| tz | -1.52m | 0.04m | -3.25m | 1.73m | -1.51m | 0.03m | 0.01m |
![Page 38: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/38.jpg)
Results (cont'd)
270 m loop, altitude from 25 to 30 m, 400 stereo pairs processed, 350 landmarks mapped
[Figures: trajectory and landmarks, with landmark error ellipses (x30); position and attitude variances]
![Page 39: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/39.jpg)
Results (cont'd)
270 m loop, altitude from 25 to 30 m, 400 stereo pairs processed, 350 landmarks mapped
| Frame 1/400 | Reference | Reference Std. Dev. | VME result | VME Abs. error | SLAM result | SLAM Std. Dev. | SLAM Abs. error |
|---|---|---|---|---|---|---|---|
| Θ | -0.12° | 0.87° | -0.13° | 0.01° | -3.68° | 0.38° | 3.56° |
| Φ | 2.87° | 1.14° | -4.99° | 7.86° | 5.54° | 0.40° | 1.64° |
| Ψ | 105.44° | 0.23° | 101.82° | 3.62° | 104.32° | 0.19° | 1.12° |
| tx | -4.73m | 0.57m | 5.45m | 10.38m | -3.98m | 0.21m | 0.95m |
| ty | 0.14m | 0.46m | 3.04m | 2.90m | -2.16m | 0.22m | 2.12m |
| tz | 3.89m | 0.15m | 19.81m | 15.94m | 3.46m | 0.11m | 0.43m |
![Page 40: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/40.jpg)
Application to ground rovers
landmark uncertainty ellipses (x5)
• 110 stereo pairs processed, 60m loop
![Page 41: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/41.jpg)
Application to ground rovers
| Frame 1/100 | Reference | Reference Std. Dev. | VME result | VME Abs. error | SLAM result | SLAM Std. Dev. | SLAM Abs. error |
|---|---|---|---|---|---|---|---|
| Θ | 0.52° | 0.31° | 2.75° | 2.23° | 0.88° | 0.98° | 0.36° |
| Φ | 0.36° | 0.25° | -0.11° | 0.47° | 0.72° | 0.74° | 0.36° |
| Ψ | -0.14° | 0.16° | 1.89° | 2.03° | 1.24° | 1.84° | 1.38° |
| tx | -0.012m | 0.010m | 0.057m | 0.069m | -0.077m | 0.069m | 0.065m |
| ty | -0.243m | 0.019m | -1.018m | 0.775m | -0.284m | 0.064m | 0.041m |
| tz | 0.019m | 0.015m | 0.144m | 0.125m | 0.018m | 0.019m | 0.001m |
• 110 stereo pairs processed, 60m loop
![Page 42: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/42.jpg)
Application to indoor robots
About 30 m long trajectory, 1300 stereo image pairs
![Page 43: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/43.jpg)
Application to indoor robots
About 30 m long trajectory, 1300 stereo image pairs
[Figure: covariance ellipses magnified 10 times]
![Page 44: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/44.jpg)
Application to indoor robots
About 30 m long trajectory, 1300 stereo image pairs
[Figures: beginning of loop, middle of loop, end of loop; covariance ellipses magnified 10 times]
![Page 45: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/45.jpg)
Application to indoor robots
About 30 m long trajectory, 1300 stereo image pairs
- The two rotation angles (Phi, Theta) and the elevation must be zero
[Figures: camera frame showing Phi, Theta and Elevation; plots of Phi, Theta and Elevation]
![Page 46: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/46.jpg)
Outline
0. A few words on stereovision
0-bis. Visual odometry
1. Stereovision SLAM
2. Monocular (bearing-only) SLAM
![Page 47: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/47.jpg)
Bearing-only SLAM
Generic SLAM → Stereovision SLAM
– Landmark detection → Vision: interest points
– Relative observations (measures)
• Of the landmark positions → Stereovision
• Of the robot motions → Visual motion estimation
– Observation associations → Interest points matching
– Refinement of the landmark and robot positions → Extended Kalman filter
![Page 48: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/48.jpg)
Bearing-only SLAM
Generic SLAM → Monocular SLAM
– Landmark detection → Vision: interest points
– Relative observations (measures)
• Of the landmark positions → « Multi-view stereovision »
• Of the robot motions → INS, motion model, GPS…
– Observation associations → Interest points matching
– Refinement of the landmark and robot positions → Particle filter + extended Kalman filter
![Page 49: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/49.jpg)
« Observation filter » ≈ Gaussian particles
1. Landmark initialisation
2. Landmark observations
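One way to picture the two steps is a set of depth hypotheses placed along the first bearing, then re-weighted by each later observation. The 2D sketch below is only an illustration of that idea: the geometric spacing of the depths, the bearing noise `sigma` and all function names are assumptions made for the example, and each hypothesis is reduced to a point with a weight instead of a full Gaussian.

```python
import numpy as np

def init_hypotheses(origin, bearing, d_min=1.0, ratio=1.5, n=8):
    """Step 1: place n depth hypotheses along the observed ray (2D)."""
    depths = d_min * ratio ** np.arange(n)
    direction = np.array([np.cos(bearing), np.sin(bearing)])
    points = origin + depths[:, None] * direction
    weights = np.full(n, 1.0 / n)
    return depths, points, weights

def reweight(points, weights, robot_pos, observed_bearing, sigma=0.01):
    """Step 2: weight each hypothesis by the likelihood of a new bearing."""
    delta = points - robot_pos
    predicted = np.arctan2(delta[:, 1], delta[:, 0])
    err = observed_bearing - predicted
    err = np.arctan2(np.sin(err), np.cos(err))      # wrap to [-pi, pi]
    new_w = weights * np.exp(-0.5 * (err / sigma) ** 2)
    return new_w / new_w.sum()

# A landmark at (5, 0) is first seen from the origin, then from (0, 1).
landmark = np.array([5.0, 0.0])
depths, points, weights = init_hypotheses(np.zeros(2), bearing=0.0)
second_pos = np.array([0.0, 1.0])
second_bearing = np.arctan2(landmark[1] - second_pos[1], landmark[0] - second_pos[0])
weights = reweight(points, weights, second_pos, second_bearing)
print(depths[np.argmax(weights)])
```

Once one hypothesis clearly dominates, the others can be pruned and the surviving one added to the map as an ordinary landmark.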
![Page 50: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/50.jpg)
Bearing-only SLAM
![Page 51: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/51.jpg)
Overview of the whole algorithm
Bearing-only SLAM
![Page 52: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/52.jpg)
Comparison stereo / bearing-only
Mapped landmarks (bearing-only case)
![Page 53: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/53.jpg)
Looking forward / looking sidewards
stereovision bearing-only
![Page 54: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/54.jpg)
Using panoramic vision
![Page 55: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/55.jpg)
Data association is still an issue
« View-based » qualitative navigation can help to focus the search
![Page 56: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/56.jpg)
View-based navigation
Indexing with global attributes
• Local characteristics histograms based on Gaussian derivatives
• Color histograms
• Texture histograms
Local Characteristics Histograms Family (LCHF)
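Of the attributes listed, a color histogram is the simplest to sketch. The example below builds a normalised joint RGB histogram and compares two images with one minus the histogram intersection; the bin count, the distance and the function names are choices made for the illustration, not details taken from the slides.

```python
import numpy as np

def color_histogram(image, bins=8):
    """Normalised joint RGB histogram of an (H, W, 3) uint8 image."""
    pixels = image.reshape(-1, 3).astype(float)
    hist, _ = np.histogramdd(pixels, bins=(bins,) * 3, range=((0, 256),) * 3)
    hist = hist.ravel()
    return hist / hist.sum()

def histogram_distance(h1, h2):
    """1 - histogram intersection: 0 for identical histograms, 1 for disjoint ones."""
    return 1.0 - np.minimum(h1, h2).sum()

red = np.zeros((4, 4, 3), dtype=np.uint8)
red[..., 0] = 255
blue = np.zeros((4, 4, 3), dtype=np.uint8)
blue[..., 2] = 255
print(histogram_distance(color_histogram(red), color_histogram(red)))
print(histogram_distance(color_histogram(red), color_histogram(blue)))
```

Because such a signature is global and cheap to compare, it can rank stored views by similarity before any landmark matching is attempted.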
![Page 57: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/57.jpg)
View-based navigation
Empirical relation between image distance and Cartesian distance
![Page 58: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/58.jpg)
Closing the loop
1. Image processing at each image acquisition
![Page 59: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/59.jpg)
Closing the loop
2. SLAM processes at each image acquisition
![Page 60: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/60.jpg)
Closing the loop
![Page 61: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/61.jpg)
Outline
0. A few words on stereovision
0-bis. Visual odometry
1. Stereovision SLAM
2. Monocular (bearing-only) SLAM
3. Bearing-only SLAM using line segments
![Page 62: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/62.jpg)
Using line segments
![Page 63: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/63.jpg)
Initializing line segment landmarks
Line segment representation: Plücker coordinates
In 2 dimensions:
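As a reminder of what the Plücker representation contains, the sketch below builds the coordinates of the 3D line through two points, as a moment vector `n` and a direction vector `v`, and uses them to measure the distance of a point to the line. It illustrates the representation only, not the initialization procedure; the function names are assumptions.

```python
import numpy as np

def plucker_from_points(p1, p2):
    """Plücker coordinates (n, v) of the line through p1 and p2.

    v is the direction p2 - p1 and n = p1 x p2 is the moment; they always
    satisfy the constraint n . v = 0.
    """
    p1, p2 = np.asarray(p1, float), np.asarray(p2, float)
    return np.cross(p1, p2), p2 - p1

def point_line_distance(p, n, v):
    """Distance of point p to the line: ||p x v - n|| / ||v||."""
    return np.linalg.norm(np.cross(p, v) - n) / np.linalg.norm(v)

n, v = plucker_from_points([1.0, 0.0, 0.0], [1.0, 1.0, 0.0])
print(n, v, np.dot(n, v))
print(point_line_distance(np.zeros(3), n, v))               # origin is 1 unit away
print(point_line_distance(np.array([1.0, 0.5, 0.0]), n, v))  # point on the line
```

The six numbers describe an infinite line; the segment endpoints have to be tracked separately if the extent of the segment matters.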
![Page 64: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/64.jpg)
Bearing-only SLAM with line segments
![Page 65: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/65.jpg)
Bearing-only SLAM with line segments
![Page 66: Vision-based SLAM](https://reader036.vdocuments.site/reader036/viewer/2022081503/56814388550346895db0042a/html5/thumbnails/66.jpg)
Summary
0. A few words on stereovision
0-bis. Visual odometry
1. Stereovision SLAM
2. Monocular (bearing-only) SLAM
3. Bearing-only SLAM using line segments