
Error Analysis of a Real-Time Stereo System

Yalin Xiong (Apple Computer, Cupertino, CA 95014)
Larry Matthies (Jet Propulsion Laboratory, Pasadena, CA 91109)

Abstract

Correlation-based real-time stereo systems have proven effective in applications such as robot navigation and elevation map building. This paper provides an in-depth analysis of the major error sources for such a real-time stereo system in the context of cross-country navigation of an autonomous vehicle. Three major types of errors are identified: foreshortening error, misalignment error and systematic error. The combined disparity errors can easily exceed three-tenths of a pixel, which translates to significant range errors. Upon understanding these error sources, we demonstrate different approaches to either correct them or model their magnitudes without excessive additional computations. By correcting those errors, we show that the precision of the stereo algorithm can be improved by 50%.

1 Introduction

Unmanned ground vehicles (UGVs) have been under development for various missions ranging from interplanetary exploration and volcanic exploration to hazardous waste disposal. One critical sensing capability of such vehicles is avoiding obstacles during autonomous or semi-autonomous navigation. During the past decade, real-time stereo systems have emerged as the major alternative for this task to the laser range finder, which is difficult to use in many missions for various reasons. Two main requirements for such a stereo system are speed and precision. It must be fast enough to guide the vehicle in real time, and precise enough to detect far-away obstacles so that the vehicle has enough time to steer around them. The correlation-based stereo method [3, 4] can run at several pairs per second, and it delivers reasonably precise results.

With the current state-of-the-art system as described in the next section, we believe that, in order to achieve an even higher percentage of correct obstacle detections in the context of autonomous navigation, we need a better understanding of the errors in stereo than regarding them uniformly as consequences of image noise, and we need to correct those errors where possible without excessive additional computations. This paper presents our analysis of three major error sources in such correlation-based real-time stereo systems. All three types of error are significant in magnitude, and yet none of them is caused by image noise.

The foreshortening errors result from the fact that the 3D scene is not fronto-parallel. Therefore, for any small patch in the left image, its corresponding patch in the right image is not only translated but also distorted. The correlation-based stereo method models the translation but not the distortion. To measure the distortion effects on disparity values, we compute a "foreshortening sensitivity map", which represents how sensitive the disparity value at any pixel is to the magnitude of the foreshortening. Such a measure is useful in determining the reliability of disparity values. In order to correct the errors caused by the foreshortening effects without changing the stereo algorithm, we can pre-warp the right image such that the foreshortening is zero for an arbitrary 3D plane instead of the fronto-parallel plane. For the task of vehicle navigation, we can set this plane to be the ideal ground plane. In practice, such a simple method usually results in an 80% reduction of the foreshortening errors.
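The pre-warp described above amounts to resampling each row of the right image by the horizontal disparity that the ideal ground plane would induce at that row. The following is a minimal sketch of that idea (our illustration, not the paper's implementation; the function name and sign convention are ours, shown for disparity increasing linearly from top to bottom):

```python
import numpy as np

def prewarp_right_image(right, ground_gradient):
    """Shift each row of the right image horizontally so that the ideal
    ground plane (disparity increasing linearly with the row index)
    becomes the equi-disparity surface.  `ground_gradient` is the
    expected vertical disparity gradient (e.g. 1/8 pixel per row)."""
    h, w = right.shape
    cols = np.arange(w)
    warped = np.empty_like(right, dtype=float)
    for y in range(h):
        shift = ground_gradient * y          # ground-plane disparity at this row
        # linear interpolation handles the subpixel horizontal shift
        warped[y] = np.interp(cols - shift, cols, right[y].astype(float))
    return warped
```

After this warp, a flat ground plane produces near-constant disparity, so the unmodeled deformation inside each correlation window is greatly reduced.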

Precise stereo calibration has been extensively researched during the past decade [2]. The problem with outdoor and cross-country navigation is that extensive mechanical vibration and rough terrain can often perturb camera parameters, and render the pre-calibrated stereo rig either partially or even completely out of calibration. The second major error source is the misalignment error caused by the stereo jig being slightly out of calibration. We will not explore the self-calibration approach, since self-calibration for lenses with significant distortions is beyond the scope of this paper. Instead, we first try to model the effects of misalignment by computing a "misalignment sensitivity map", which measures the sensitivity of the disparity values with respect to the misalignment. Such a measure is also useful in determining the reliability of disparity values. In order to correct the misalignment errors, we model the misalignment field by a low-order bivariate polynomial, and then pre-shift the right image vertically in the opposite direction. We show that such a simple approach usually results in a 50% reduction of misalignment errors.

The third error source for correlation-based stereo is the systematic error, which is the consequence of window effects and the quadratic approximation in the subpixel registration. For typical images used in cross-country navigation, the magnitude of the systematic error is between 0.05 and 0.10 pixels. We test a quartic fitting algorithm to replace the quadratic interpolation. The quartic approach appears to reduce the systematic errors by about 0.03 pixels. Without drastic changes to the simple correlation algorithm and the amount of computation, we believe the systematic error, together with errors from image noise, represents the upper limit of the precision that the real-time stereo system can achieve. We also outline several approaches to alleviate the systematic errors if a significant amount of additional computation is feasible.

2 Background

Figure 1 shows the block diagram of a real-time stereo system for obstacle detection. The image rectification process uses the stereo calibration information to correct radial lens distortion and to align the two images along scanlines. Image pyramids are built for both the left and right images. The integer disparity values are computed by searching along the scanlines using a 7x7 window. The objective function is the Sum-of-Squared-Differences (SSD):

M(d) = \sum_{x=-3}^{3} \sum_{y=-3}^{3} ( I_l(x, y) - I_r(x - d, y) )^2,    (1)

where I_l is the left image and I_r is the right image. The SSD values are evaluated discretely along scanlines of the right image, and the integer disparity value is where the SSD value is smallest. Once the smallest SSD value S_0 is identified, the two adjacent SSD values S_{-1} and S_{+1} on the left and right side of S_0 are used together with S_0 to approximate the SSD curve by a second-order polynomial. The subpixel adjustment on the integer disparity value is the minimum of the quadratic curve:

\delta = \frac{S_{-1} - S_{+1}}{2 (S_{-1} + S_{+1} - 2 S_0)}.    (2)
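As a concrete sketch of Eqs. (1)-(2), the following code (our illustration, not the system's actual implementation; function names are ours) evaluates the SSD curve for one 7x7 window along a scanline and refines the best integer disparity with the quadratic interpolation:

```python
import numpy as np

def ssd_curve(left, right, x, y, max_disp, half=3):
    """SSD of Eq. (1): compare the 7x7 window centered at (x, y) in the
    left image against windows shifted left by d in the right image."""
    win_l = left[y - half:y + half + 1, x - half:x + half + 1].astype(float)
    return np.array([
        np.sum((win_l - right[y - half:y + half + 1,
                              x - d - half:x - d + half + 1]) ** 2)
        for d in range(max_disp + 1)])

def subpixel_disparity(ssd):
    """Integer minimum of the SSD curve plus the quadratic subpixel
    adjustment of Eq. (2)."""
    d0 = int(np.argmin(ssd))
    if d0 == 0 or d0 == len(ssd) - 1:
        return float(d0)            # no neighbors to interpolate with
    sm, s0, sp = ssd[d0 - 1], ssd[d0], ssd[d0 + 1]
    return d0 + (sm - sp) / (2.0 * (sm + sp - 2.0 * s0))
```

For SSD samples (5, 1, 2) around the integer minimum, the parabola through the three points has its vertex at an offset of +0.3, so the refined disparity is d0 + 0.3.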

To achieve robustness, we also employ many additional techniques such as smoothing, statistical outlier detection, blob filtering, consistency checking and other heuristic methods. Since most of these techniques are irrelevant to this paper, we will not specify their details. Interested readers may refer to [3, 4].

In the latest demo of the stereo-guided navigation at Fort Hood, the vehicle successfully traversed 2 kilometers autonomously. Despite the success, we believe that the system needs to be more reliable in detecting even smaller obstacles, and one of the major options for achieving that goal is to improve the precision of the range image from the stereo system without excessive additional computations. Given the current configuration of the stereo system, a disparity error of 0.1 pixels results in a range error of 14 centimeters at 10 meters. We need to detect obstacles 41 centimeters high (i.e., the axle clearance of the HMMWV) at that distance. The three major error sources presented in this paper usually generate a combined error with a magnitude of around 0.3 pixels, which is substantial for the obstacle detection purpose. In the next three sections, we will look into each of the three major error sources in the real-time stereo system, and propose various approaches to either correct or model them.

3 Foreshortening Error

The foreshortening error results from image deformations unmodelled in the original SSD formulation in Eq. 1. When the right image is not only translated but also deformed with respect to the left image, minimizing the SSD function M(d) will not yield the correct disparity. Such a deformation is caused by non-zero disparity gradients, which, for rectified stereo pairs, are caused by the 3D surface not being fronto-parallel. Depending on the image texture and the amount of deformation, the estimated disparity \hat{d} will contain errors of various magnitudes. In the UGV navigation task, since the surface of the terrain is rarely fronto-parallel to the camera pointing direction, there is always a similar deformation. In this section, we will first model quantitatively the foreshortening errors for terrain imagery, and then reduce those errors by pre-warping the right image so that the equi-disparity plane coincides with the ideal ground plane. Similar pre-warp methods were also proposed in [5, 1].

In the SSD minimization in Eq. 1, if the disparity values have a significant gradient in the column (vertical) direction, which is usually true for stereo images shot from the top of a vehicle, the deformation between the left and right images can be modeled as:

I_r(x, y) = I_l(x + d_0 + a y, y),    (3)

where d_0 is the true disparity and a is the non-zero disparity gradient in the column direction.

Taking the Taylor expansion of the above equation, we have

I_l(x + d_0 + a y, y) \approx I_l(x, y) + (d_0 + a y) \frac{\partial I_l}{\partial x}.    (4)

Figure 1: Main Blocks of a Real-Time Stereo System for Obstacle Avoidance [block diagram: Rectify images -> Build pyramids -> Get integer disparity -> Sub-pixel and smooth -> Blob filtering, X-Y-Z -> Detect obstacles; implemented on a Datacube MV-200 and a 50 MHz 68060]

Replacing it into Eq. 1 and simplifying, we then have

M(d) \approx \sum_{x=-3}^{3} \sum_{y=-3}^{3} ( (d - d_0 - a y) \frac{\partial I_l}{\partial x} )^2,    (5)

the minimum of which is

\hat{d} = d_0 + a e_f,    (6)

where the foreshortening sensitivity e_f is defined as:

e_f = \frac{\sum_{x,y} y (\partial I_l / \partial x)^2}{\sum_{x,y} (\partial I_l / \partial x)^2}.    (7)

The foreshortening sensitivity e_f measures at every pixel location the ratio between the magnitude of the disparity error and the magnitude of the disparity gradient which causes the error. We call the image of these sensitivity values the "Foreshortening Sensitivity Map". It can be computed very efficiently.
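Since Eq. (7) is just two window sums over products of the horizontal gradient, the map can indeed be computed cheaply. A sketch (our own illustration; the paper does not give code) using 7x7 windows:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def foreshortening_sensitivity(left):
    """Foreshortening sensitivity e_f of Eq. (7) for every pixel whose
    7x7 window fits inside the image (output is trimmed by 3 pixels on
    each side).  y is the row offset inside the window, -3..+3."""
    img = left.astype(float)
    ix = np.gradient(img, axis=1)            # horizontal gradient
    gx2 = ix ** 2
    y_rel = np.arange(-3, 4, dtype=float)[:, None]
    windows = sliding_window_view(gx2, (7, 7))           # (H-6, W-6, 7, 7)
    num = (windows * y_rel).sum(axis=(-2, -1))           # sum of y * Ix^2
    den = windows.sum(axis=(-2, -1)) + 1e-12             # sum of Ix^2
    return num / den
```

On texture whose gradient energy is symmetric in the vertical direction the sensitivity is near zero; texture concentrated at the top or bottom of the window pushes it toward negative or positive values.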

Figure 2(a) shows a typical image of nearly flat terrain. Figure 3 shows the foreshortening sensitivity map of the terrain image. The sensitivity values are quantized uniformly between [-3, 3]. Based on the stereo jig configuration and the image resolution of 240x256, the magnitude of the vertical foreshortening of the rectified stereo images is always around 1/8, i.e., the disparity increases from about 0 to 30 pixels over 240 rows. Even though in cross-country navigation the magnitude of the gradient varies with the relative orientation of the vehicle with respect to the ground plane, the foreshortening deviates from this nominal amount by only small amounts on average. Figure 2(b) shows the image resulting from horizontally shifting the image in Figure 2(a) by a disparity increasing from top to bottom. The corresponding true disparity map has a vertical gradient of 1/8. We then feed this synthetic stereo pair to the real-time stereo system to measure the disparity errors caused by this foreshortening.

Figure 4 shows the errors caused by the foreshortening at every pixel and their correlation with the foreshortening sensitivity values. The line in the correlation plot is the ideal linear relation predicted in Eq. 6. The errors are quantized uniformly between [-3/8, 3/8]. We conclude that the foreshortening sensitivity map formulated in Eq. 7 is an accurate model of the sensitivity to the disparity gradient at every pixel.

(a) Original (b) Warped

Figure 2: Original and Warped Terrain Image

Figure 3: Foreshortening Sensitivity Map

We also measured the statistics of all the sensitivity values in order to get an idea of the average magnitude of the errors caused by the foreshortening. The mean is about zero, and the standard deviation is 0.71. If the expected disparity gradient is around 1/8, these statistics predict that the foreshortening error has zero mean and a standard deviation of about 0.09 pixels.

Given such a substantial magnitude of the average errors caused by foreshortening, we attempt to correct them. Instead of having fronto-parallel planes be equi-disparity, we can pre-warp the image such that the ideal ground plane is equi-disparity, as we did in Figure 4. Figure 5 is the real right image taken together with the left image in Figure 2. Figure 6 shows side-by-side the disparity maps resulting from the original stereo algorithm and from the pre-warp method. Note that

the disparity gradient in both maps has been factored out, and the disparity maps are quantized uniformly between [1, 4] pixels. Though we do not have ground truth to quantify how much improvement the pre-warp method has introduced, the reduction of small ripples in the disparity maps is obvious. The residual disparity gradient is about 0.01 to 0.02 in both the vertical and horizontal directions. Such a reduction of the disparity gradient results in a more than 80% reduction in the foreshortening and the consequent foreshortening errors.

Figure 4: Errors Caused by Foreshortening and Correlation with Foreshortening Sensitivity Values [left panel: per-pixel errors; right panel: actual disparity errors vs. foreshortening sensitivity values]

Figure 5: Right Image of the Same Terrain

[Footnote: The errors also include the systematic errors presented in the next section, which may explain the differences between what is predicted by the misalignment sensitivity map and the actual disparity errors.]

4 Misalignment Error

When there is a certain amount of misalignment after the rectification, it introduces additional errors. In practice, calibrating a stereo jig so that the misalignment is less than a tenth of a pixel is challenging. Even if it can be calibrated to such high precision, it can rarely survive the severe vibrations of the vehicle and the rough terrain without falling slightly out of calibration. In this section, we will look into this misalignment problem and try to model and correct the disparity errors caused by misalignment.

Figure 6: Disparity Maps [left: original algorithm; right: foreshortening correction]

In the SSD minimization in Eq. 1, if there exists a small vertical misalignment m, the relation between the left and right images can be modeled locally as:

I_r(x, y) = I_l(x + d_0, y + m),    (8)

where, again, d_0 is the true disparity value.

Taking the Taylor expansion of the above equation, we have

I_l(x + d_0, y + m) \approx I_l(x, y) + d_0 \frac{\partial I_l}{\partial x} + m \frac{\partial I_l}{\partial y}.    (9)

Replacing it into Eq. 1, we have

M(d) \approx \sum_{x=-3}^{3} \sum_{y=-3}^{3} ( (d - d_0) \frac{\partial I_l}{\partial x} - m \frac{\partial I_l}{\partial y} )^2,    (10)

the minimum of which is

\hat{d} = d_0 + m e_m,    (11)

where the misalignment sensitivity value e_m is defined as:

e_m = \frac{\sum_{x,y} (\partial I_l / \partial x)(\partial I_l / \partial y)}{\sum_{x,y} (\partial I_l / \partial x)^2}.    (12)

The misalignment sensitivity e_m measures at every pixel the ratio between the magnitude of the disparity error and the magnitude of the misalignment which causes the error. We call the image of these sensitivity values the "Misalignment Sensitivity Map", which can be computed very efficiently.

For the left image shown in Figure 2, the misalignment sensitivity map is shown in Figure 7. The sensitivity values are quantized uniformly in [-3, 3]. We synthetically shifted the same image vertically by 1/5 pixel, and fed the pair to the real-time stereo system to generate the disparity map. Figure 8 shows the disparity errors caused by the misalignment and the correlation between the errors and the misalignment sensitivity values. The line in the correlation plot is the ideal linear relation predicted in Eq. 11. The errors are quantized uniformly in [-3/5, 3/5]. Overall, the misalignment sensitivity map predicts the disparity errors caused by the misalignment fairly accurately.

We also measured the statistics of all the sensitivity values in order to get an idea of the average magnitude of the errors caused by the misalignment.


[Footnote: Again, it also includes systematic errors.]

Figure 7: Misalignment Sensitivity Map

Errors Correlation

Figure 8: Disparity Errors Caused by Misalignment

and Correlation with Sensitivity Values

The mean value is zero, and the standard deviation is 0.51 pixels. Therefore, for a misalignment ranging from 0.1 to 0.2 pixels, the average error caused by the misalignment for road imagery ranges approximately from 0.05 to 0.10 pixels.

Since the errors caused by misalignment are significant, it is desirable to minimize the magnitude of the misalignment. The traditional way to achieve this goal is to calibrate the stereo jig more precisely. Unfortunately, for the reasons cited before, achieving even higher precision in calibration is both difficult and ineffective. Self-calibration may be the only option when the stereo rig is completely out of calibration, e.g., when the magnitude of the misalignment is more than a pixel. But in most cases, the magnitude is well within half a pixel. Therefore, we can directly correct the misalignment in the images instead of compensating for it indirectly by modifying the camera parameters.

Before we adopt the approach of directly correcting the misalignment in the images, we need to verify experimentally that the misalignment field is consistent, i.e., that it does not change from one frame to the next. If it is approximately constant, we then only need to calibrate the misalignment field occasionally to keep our model up to date. Otherwise, calibrating the misalignment field for every stereo pair would be too much computation, and further analysis would be needed to determine the causes of the dynamic misalignment.

In order to calibrate the misalignment field, we augmented the original SSD search in Eq. 1 to include a vertical search in the misalignment calibration mode. In fact, the vertical search only needs to cover one pixel. In the subpixel registration in Eq. 2, we fit a bivariate quadratic equation to a nine-point neighborhood to compute the subpixel shift in both the horizontal and vertical directions. Note that in the experiments shown in this section, we have already applied the foreshortening correction technique presented in the previous section. Figure 9 shows the misalignment fields, quantized between [-0.4, 0.4], for the stereo pair (Figure 2(a) and Figure 5) and for another stereo pair shot on the same trip. It shows that the misalignment fields are fairly consistent. For both pairs, we have significant misalignments on the left side of the image. The maximal misalignments in the lower-left and upper-left corners are roughly 0.4 pixels in magnitude, and the overall standard deviation of the vertical shift is 0.15 pixels.

Figure 9: Misalignment Fields

Given that the misalignment fields are consistent, we can model them by a bivariate polynomial. In practice, we find that a 4th-order bivariate polynomial can model the misalignment fields very well. Modelling the misalignment field by a 2D SSD search and a bivariate polynomial fit is computationally expensive, but we do not need to do it for every pair, and it is much easier and faster than a full calibration.
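Fitting the calibrated vertical shifts with a low-order bivariate polynomial is a linear least-squares problem. A sketch (our illustration; variable names and normalization are ours) for a 4th-order polynomial:

```python
import numpy as np

def fit_bivariate_poly(xs, ys, shifts, order=4):
    """Least-squares fit of vertical shifts m(x, y) by a bivariate
    polynomial  sum_{i+j<=order} c_ij x^i y^j.  `xs`, `ys`, `shifts`
    are flat arrays of sample coordinates (ideally normalized to
    [-1, 1]) and measured shifts.  Returns a callable field."""
    terms = [(i, j) for i in range(order + 1)
                    for j in range(order + 1 - i)]
    A = np.column_stack([xs ** i * ys ** j for i, j in terms])
    coef, *_ = np.linalg.lstsq(A, shifts, rcond=None)
    def field(x, y):
        return sum(c * x ** i * y ** j for c, (i, j) in zip(coef, terms))
    return field
```

Because the model is linear in the coefficients, the fit is a single `lstsq` solve; the resulting smooth field also rejects per-pixel noise in the calibrated shifts.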

After the misalignment field is modeled, we pre-warp the stereo pair such that the misalignment is canceled. The warp is simply a vertical shift at every pixel. Figure 10(a) shows the disparity map for the same stereo pair we used in the previous section, with the misalignment compensated. The improvement in the lower-left area is very obvious, since the misalignment was large there. Figure 10(b) shows the misalignment field after we have compensated for the misalignment. Statistically, the standard deviation of the misalignment in Figure 10(b) is 0.10 pixels over the whole image. Given that the systematic error is at least 0.05 pixels, we conclude that the misalignment compensation method reduces the amount of misalignment by at least 50%.

(a) Disparity Map (b) Misalignment Field

Figure 10: Disparity Map and Misalignment Field after Misalignment Compensation
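The compensation warp is a per-pixel vertical resampling by the modeled field. A minimal sketch (ours; the sign convention depends on how the field was measured), using linear interpolation along each column:

```python
import numpy as np

def cancel_misalignment(right, field):
    """Vertically resample the right image by the modeled misalignment
    `field` (same shape as the image) so that the residual vertical
    misalignment is canceled."""
    h, w = right.shape
    rows = np.arange(h, dtype=float)
    out = np.empty((h, w))
    for x in range(w):
        # sample column x at row positions displaced by the field
        out[:, x] = np.interp(rows + field[:, x], rows,
                              right[:, x].astype(float))
    return out
```

Since the field is smooth and well within half a pixel, simple linear interpolation is adequate here.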

5 Systematic Error

The third type of error source in the real-time stereo system is what we call the "systematic errors", which are caused by implicit assumptions in the algorithm. There are two kinds of significant systematic errors:

Window Effect Errors: When the window mask slides on the right image, every time we move the window to the right by one pixel, the leftmost column is moved out and the rightmost column is moved in. With a finite window, the difference in image content caused by replacing the leftmost column with the new rightmost column is random. Therefore, there is a certain amount of randomness in the SSD values. When the SSD values are interpolated, this randomness causes errors in disparity.

Linearization Errors: The implicit assumption in the quadratic interpolation is that the right image can be linearized using a Taylor expansion around the true disparity. Unfortunately, such an approximation is poor when there is a significant amount of high-frequency information.

Since the window effect errors are caused by the discrepancy between the leftmost column which is moved out and the rightmost column which is moved in, we have the following two observations:

1. The errors have zero mean, because the discrepancy should have a zero-mean distribution.

2. When the image is composed of information only at the harmonic frequencies of the window width, the window effect errors should be zero. In other words, if the frequency of the information is always an integer times the inverse of the window width, the moved-out leftmost column is exactly the same as the moved-in rightmost column.

Figure 11: Bias for Different Frequencies

Let us suppose that an image is composed of one single sinusoid at frequency f which is an integer times the inverse of the window width, and that the true disparity is d. Then, using Parseval's theorem, we have the SSD values at x = -1, 0, +1 as

S_{-1} = \| e^{jf} - e^{-jfd} \|^2,    (13)

S_0 = \| 1 - e^{-jfd} \|^2,    (14)

S_{+1} = \| e^{-jf} - e^{-jfd} \|^2,    (15)

where the magnitude of the sinusoid is omitted since it is the same for all three SSD values. In this case, the only systematic error is the linearization error. After fitting a parabola to the three SSD values above, the estimated disparity \hat{d} in Eq. 2 is:

\hat{d} = \frac{\tan(fd)}{2 \tan(f/2)}.    (16)

Figure 11 shows the true disparity d versus the estimation bias \hat{d} - d. The different curves are for different frequencies f = 2.0, 1.5, 1.0, 0.5, with the outermost corresponding to f = 2.0 and the flattest corresponding to f = 0.5.
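Eq. (16) is easy to check numerically (our sketch, using a full-period summation over the sinusoid rather than the paper's 7x7 window, so that the SSD sums are exact): build the SSD of a sampled sinusoid at shifts -1, 0, +1, apply the quadratic interpolation of Eq. (2), and compare with the closed form.

```python
import numpy as np

def quadratic_bias(f, d, n=70):
    """Subpixel estimate from the quadratic interpolation of Eq. (2)
    for a pure sinusoid of angular frequency f (radians/pixel) at true
    disparity d.  n is chosen so that f * n is a multiple of 2*pi."""
    x = np.arange(n)
    left = np.sin(f * x)
    ssd = [np.sum((left - np.sin(f * (x - k + d))) ** 2) for k in (-1, 0, 1)]
    sm, s0, sp = ssd
    return (sm - sp) / (2 * (sm + sp - 2 * s0))

def predicted(f, d):
    """Closed form of Eq. (16)."""
    return np.tan(f * d) / (2 * np.tan(f / 2))
```

With f = 2*pi*10/70 and d = 0.3, the numerical estimate matches Eq. (16) to machine precision and is smaller than d, i.e., biased toward the integer disparity.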

Of course, a real image contains information at frequencies ranging from 0 to \pi. But the different biases generated by information at different frequencies do not cancel each other, since they all bias in the same direction. As a result, the linearization error is in fact a bias toward integer disparity values. The more high-frequency information the image contains, the larger the magnitude of the bias.

We tested the real-time stereo system on stereo pairs with uniform disparity and zero misalignment, i.e., with both the foreshortening and misalignment errors equal to zero. We translated the whole terrain image horizontally by the following amounts: -3/8, -1/4, -1/8, 0, 1/8, 1/4, 3/8, 1/2 pixels. For every pair, we ran the real-time stereo system on it and computed the statistical distribution of the disparity values. The second set of experiments is the same as the first, except that we also add to the right image white noise with a standard deviation of 3.0 out of 256 grey levels.


References

[1] P. Burt, L. Wixson and G. Salgian, Electronically directed "focal" stereo, in ICCV'95, pp. 94-101.

[2] D. Gennery, Camera Calibration Including Lens Distortion, JPL Technical Report D-8580, Jet Propulsion Laboratory, Pasadena, CA, 1991.

[3] L. H. Matthies and P. Grandjean, Stochastic performance modeling and evaluation of obstacle detectability with imaging range sensors, IEEE Transactions on Robotics and Automation, 10(6), December 1994, pp. 783-791.

[4] L. Matthies, A. Kelly, T. Litwin, and G. Tharp, Obstacle detection for unmanned ground vehicles: a progress report, in Robotics Research: The Seventh International Symposium, Springer, 1996, pp. 475-486.

[5] L. H. Quam, Hierarchical warp stereo, in DARPA Image Understanding Workshop, 1984, pp. 149-155.

(a) Bias (b) RMS Errors

Figure 12: Systematic Bias and RMS Errors from the Experiments

Figure 12(a) shows the disparity bias versus the true disparity for the two sets of experiments, and Figure 12(b) shows the RMS errors, i.e., the zero-mean errors, for the same sets of experiments. The solid curves are from the first set, and the dashed curves from the second set. From these synthetic examples, we draw the following conclusions:

1. The linearization error is a bias modeled by Eq. 16. The maximal bias for terrain imagery is about 0.04 to 0.05 pixels, and is reached around 1/4 pixel from an integer disparity value.

2. The window effect error is zero-mean, with a standard deviation for terrain imagery of about 0.05 pixels.

3. The effect of image noise is relatively insignificant. The RMS errors show only a very small increase when image noise is present.

One way to reduce the linearization errors is to extend the Taylor expansion to quadratic terms. In other words, the SSD curve M(d) should be approximated by a quartic instead of a quadratic curve. In order to fit the SSD values with the quartic curve, we need five SSD sample values: S_{-2}, S_{-1}, S_0, S_{+1}, and S_{+2}. The minimum of the quartic curve is one root of its derivative function. Since we already have a good subpixel solution from the original quadratic approximation, we can use it as an initial estimate of the minimum. Usually the minimum is reached in a couple of iterations of the Newton-Raphson method.

Figure 13 shows the bias and RMS errors for the same stereo pairs using the quartic curve fitting. We can see that the quartic curve method reduces both the linearization error (bias) and the window effect errors (RMS) by about 0.03 pixels.
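The quartic refinement described above can be sketched as follows (our illustration, not the system's code): fit a quartic exactly through the five SSD samples at integer offsets -2..+2 around the integer minimum, then polish the quadratic estimate with Newton-Raphson on the quartic's derivative.

```python
import numpy as np

def quartic_subpixel(ssd5, d0_sub, iters=3):
    """Refine a subpixel disparity by fitting a quartic to the five SSD
    samples at offsets -2, -1, 0, +1, +2 around the integer minimum and
    running Newton-Raphson on its derivative.  `d0_sub` is the initial
    estimate from the quadratic interpolation of Eq. (2)."""
    offsets = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
    coeffs = np.polyfit(offsets, ssd5, 4)     # exact through 5 points
    dp = np.polyder(coeffs)                   # first derivative
    dpp = np.polyder(dp)                      # second derivative
    x = d0_sub
    for _ in range(iters):
        x = x - np.polyval(dp, x) / np.polyval(dpp, x)
    return x
```

Starting from the quadratic solution, the iteration typically converges in two or three steps, since the quartic's derivative is nearly linear around the minimum.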

(a) Bias (b) RMS Errors

Figure 13: Bias and RMS Errors Using Quartic Curve Fitting

There are a number of other approaches which can reduce the systematic errors significantly, though they usually require a significant amount of additional computation:

Sub-pixel Sampling: The magnitude of the bias is proportional to the sampling distance in the SSD function. If we can sample the SSD values at half-pixel distance instead of one-pixel distance, the magnitude of the bias can be cut in half.

Larger Window: Since the window effect errors are the direct consequence of the discrepancy between the moved-out leftmost column and the moved-in rightmost column, increasing the window size will decrease the effects of the discrepancy, simply because there are more columns.

Acknowledgments

This research was conducted while the first author was associated with The Robotics Institute, Carnegie Mellon University. The authors would like to thank Todd Litwin of JPL for help with the data sets and programs.