
An Evaluation Methodology for Stereo Correspondence Algorithms

Ivan Cabezas, Maria Trujillo and Margaret [email protected]

February 25th, 2012

International Conference on Computer Vision Theory and Applications, VISAPP 2012, Rome - Italy

Multimedia and Vision Laboratory

MMV is a research group of the Universidad del Valle in Cali, Colombia


Ayax Inc.

Ayax Inc. offers informatics solutions for decision analysis



Content

Stereo Vision

Canonical Stereo Geometry and Disparity

Ground-truth Based Evaluation

Quantitative Evaluation Methodologies

Middlebury’s Methodology

A* Methodology

A* Groups Methodology

Experimental Results

Final Remarks


Stereo Vision

The stereo vision problem is to recover the 3D structure of the scene using two or more images

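[Figure: a camera system projects the 3D world onto 2D images (optics problem). In the inverse problem, a correspondence algorithm takes the left and right stereo images and produces a disparity map, from which a reconstruction algorithm builds a 3D model.]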

Yang Q. et al., Stereo Matching with Colour-Weighted Correlation, Hierarchical Belief Propagation, and Occlusion Handling, IEEE PAMI 2009


Canonical Stereo Geometry and Disparity

Disparity is the distance between corresponding points in the left and right images

Trucco, E. and Verri A., Introductory Techniques for 3D Computer Vision, Prentice Hall 1998
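In the canonical geometry, depth Z, focal length f, baseline B and disparity d are related by Z = f·B / d. The following minimal sketch illustrates that relation and shows why the same one-pixel disparity error matters more for distant points. The camera values (`F`, `B`) and the function name are hypothetical, chosen only for illustration.

```python
def depth_from_disparity(disparity, focal_length, baseline):
    """Depth Z = f * B / d for a canonical (rectified) stereo pair."""
    if disparity <= 0:
        raise ValueError("disparity must be positive")
    return focal_length * baseline / disparity


# Hypothetical camera: focal length in pixels, baseline in metres.
F, B = 500.0, 0.1

# A one-pixel underestimation of the disparity, for a far and a near point.
far_error = abs(depth_from_disparity(9.0, F, B) - depth_from_disparity(10.0, F, B))
near_error = abs(depth_from_disparity(49.0, F, B) - depth_from_disparity(50.0, F, B))

print(f"depth error for the far point (d = 10 px):  {far_error:.4f} m")
print(f"depth error for the near point (d = 50 px): {near_error:.4f} m")
```

Because depth is inversely proportional to disparity, the far point (small disparity) suffers a much larger depth error than the near point.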


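[Figure: canonical stereo geometry for an accurate estimation (scene point P recovered at depth Z) and an inaccurate estimation (mismatched image point pr’ yields point P’ at depth Z’). Labels: optical centres Cl and Cr, image planes πl and πr, image points pl and pr, baseline B, focal length f.]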

Ground-truth Based Evaluation

Ground-truth based evaluation compares estimated disparity maps against disparity ground-truth data

Scharstein, D. and Szeliski, R., High-accuracy Stereo Depth Maps using Structured Light, CVPR 2003

Tola, E., Lepetit, V. and Fua, P., A Fast Local Descriptor for Dense Matching, CVPR 2008

Strecha, C., et al., On Benchmarking Camera Calibration and Multi-View Stereo for High Resolution Imagery, CVPR 2008

http://www.zf-usa.com/products/3d-laser-scanners/


Quantitative Evaluation Methodologies

Szeliski, R., Prediction Error as a Quality Metric for Motion and Stereo, ICCV 2000

Kostliva, J., Cech, J., and Sara, R., Feasibility Boundary in Dense and Semi-Dense Stereo Matching, CVPR 2007

Tombari, F., Mattoccia, S., and Di Stefano, L., Stereo for Robots: Quantitative Evaluation of Efficient and Low-memory Dense Stereo Algorithms, ICARCV 2010

Cabezas, I. and Trujillo M., A Non-Linear Quantitative Evaluation Approach for Disparity Estimation, VISAPP 2011


The use of a methodology makes it possible to:

Assess specific components and procedures

Tune algorithm parameters

Support decisions by researchers and practitioners

Measure progress in the field

Middlebury’s Methodology


Select Test Bed Images

Select Error Criteria: nonocc, all, disc

Select Error Measures

Select and Apply Stereo Algorithms: ObjectStereo, GC+SegmBorder, PUTv3, PatchMatch, ImproveSubPix, OverSegmBP

Compute Error Measures

Scharstein, D. and Szeliski, R., High-accuracy Stereo Depth Maps using Structured Light, CVPR 2003

Scharstein, D. and Szeliski, R., http://vision.middlebury.edu/stereo/eval/, 2012

Middlebury’s Methodology (ii)




Algorithm        nonocc   all    disc
ObjectStereo     2.20     6.99   6.36
GC+SegmBorder    4.99     5.78   8.66
PUTv3            2.40     9.11   6.56
PatchMatch       2.47     7.80   7.11
ImproveSubPix    2.96     8.22   8.55
OverSegmBP       3.19     8.81   8.89

Apply Evaluation Model

Algorithm        nonocc (rank)   all (rank)   disc (rank)
ObjectStereo     2.20 (1)        6.99 (2)     6.36 (1)
GC+SegmBorder    4.99 (6)        5.78 (1)     8.66 (5)
PUTv3            2.40 (2)        9.11 (6)     6.56 (2)
PatchMatch       2.47 (3)        7.80 (3)     7.11 (3)
ImproveSubPix    2.96 (4)        8.22 (4)     8.55 (4)
OverSegmBP       3.19 (5)        8.81 (5)     8.89 (6)

Algorithm        Average Rank   Final Ranking
ObjectStereo     1.33           1
PatchMatch       3.00           2
PUTv3            3.33           3
GC+SegmBorder    4.00           4
ImproveSubPix    4.00           5
OverSegmBP       5.33           6

Scharstein, D. and Szeliski, R., http://vision.middlebury.edu/stereo/eval/, 2012
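The average-rank model on this slide can be sketched in a few lines. This is a minimal illustration built from the six rows shown here, not Middlebury's actual implementation; the function name `average_ranks` is hypothetical, and ties in the average rank are broken by input order, which is an assumption of this sketch.

```python
# Error percentages from the slide, per criterion (nonocc, all, disc).
errors = {
    "ObjectStereo":  (2.20, 6.99, 6.36),
    "GC+SegmBorder": (4.99, 5.78, 8.66),
    "PUTv3":         (2.40, 9.11, 6.56),
    "PatchMatch":    (2.47, 7.80, 7.11),
    "ImproveSubPix": (2.96, 8.22, 8.55),
    "OverSegmBP":    (3.19, 8.81, 8.89),
}


def average_ranks(errors):
    """Rank the algorithms per criterion (1 = lowest error), then average the ranks."""
    names = list(errors)
    n_criteria = len(next(iter(errors.values())))
    rank_sum = {name: 0 for name in names}
    for c in range(n_criteria):
        ordered = sorted(names, key=lambda name: errors[name][c])
        for rank, name in enumerate(ordered, start=1):
            rank_sum[name] += rank
    return {name: rank_sum[name] / n_criteria for name in names}


avg = average_ranks(errors)
# sorted() is stable, so equal averages keep their input order.
final_order = sorted(avg, key=avg.get)
for position, name in enumerate(final_order, start=1):
    print(f"{name:14s} {avg[name]:.2f} {position}")
```

GC+SegmBorder and ImproveSubPix both obtain an average rank of 4.00, yet they must occupy different final positions; which one comes first depends only on the tie-breaking rule.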

Middlebury’s Methodology (iii)


Scharstein, D. and Szeliski, R., A Taxonomy and Evaluation of Dense Two-Frame Stereo Correspondence Algorithms, IJCV 2002

Scharstein, D. and Szeliski, R., http://vision.middlebury.edu/stereo/eval/, 2012

Apply Evaluation Model (Middlebury’s Evaluation Model)

Algorithm        Average Rank   Final Ranking
ObjectStereo     1.33           1
PatchMatch       3.00           2
PUTv3            3.33           3
GC+SegmBorder    4.00           4
ImproveSubPix    4.00           5
OverSegmBP       5.33           6

Interpret Results

The ObjectStereo algorithm produces accurate results

Middlebury’s Methodology (iv): Weaknesses


Middlebury’s evaluation model has some shortcomings:

In some cases, the ranks are assigned arbitrarily

The same average ranking does not imply the same performance (and vice versa)

The cardinality of the set of top-performer algorithms is a free parameter

It operates on values of incommensurable measures

Middlebury’s Methodology (v): Weaknesses

The percentage of Bad Matched Pixels (BMP) measures the proportion of disparity estimation errors exceeding a threshold


The BMP measure has some shortcomings:

It is sensitive to the threshold selection

It ignores the error magnitude

It ignores the inverse relation between depth and disparity

It may conceal estimation errors of large magnitude, and it may penalise errors that have little impact on the final 3D reconstruction

Cabezas, I., Padilla, V., and Trujillo M., A Measure for Accuracy Disparity Maps Evaluation, CIARP 2011

Gallup, D., et al., Variable Baseline/Resolution Stereo, CVPR 2008
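The first two shortcomings can be seen in a small sketch of the BMP measure. Disparity maps are represented here as flat lists of per-pixel disparities, and the function name, the default threshold of 1.0 and the sample values are assumptions made for illustration.

```python
def bmp(estimated, ground_truth, threshold=1.0):
    """Percentage of pixels whose absolute disparity error exceeds the threshold."""
    if not estimated or len(estimated) != len(ground_truth):
        raise ValueError("maps must be non-empty and of equal size")
    bad = sum(1 for e, g in zip(estimated, ground_truth) if abs(e - g) > threshold)
    return 100.0 * bad / len(estimated)


ground_truth = [10.0, 10.0, 10.0, 10.0]
small_errors = [11.5, 11.5, 10.0, 10.0]  # two errors of 1.5 pixels
large_errors = [40.0, 40.0, 10.0, 10.0]  # two errors of 30 pixels

# Error magnitude is ignored: both maps get the same score.
print(bmp(small_errors, ground_truth))                 # 50.0
print(bmp(large_errors, ground_truth))                 # 50.0

# Threshold sensitivity: the same map looks perfect with a looser threshold.
print(bmp(small_errors, ground_truth, threshold=2.0))  # 0.0
```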

A* Methodology

The A* evaluation methodology brings a theoretical background for the comparison of stereo correspondence algorithms. It is formulated in terms of:

The set of algorithms under evaluation

The set of estimated maps to be compared

The function that produces a vector of error measures

The set of vectors of error measures


Cabezas, I. and Trujillo M., A Non-Linear Quantitative Evaluation Approach for Disparity Estimation, VISAPP 2011

A* Methodology (ii)


Cabezas, I. and Trujillo M., A Non-Linear Quantitative Evaluation Approach for Disparity Estimation, VISAPP 2011

The evaluation model of the A* methodology addresses the comparison of stereo correspondence algorithms as a multi-objective optimisation problem

It defines a partition over the set A (the decision space)

Subject to:

where ≺ denotes the Pareto Dominance relation:

Let p and q be two algorithms

Let Vp and Vq be a pair of vectors belonging to the objective space

Thus, three possible relations are considered: p ≺ q, q ≺ p, or p ~ q (p and q are not comparable)
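The formulas of this slide did not survive in the text. As a hedged reconstruction, the block below states the Pareto Dominance relation in its standard minimisation form and the resulting partition of A into the non-dominated set A* and its complement A’. The component index i is notation introduced here, and the exact constraints on the original slide may be written differently.

```latex
V_p \prec V_q \iff
  \left( \forall i:\; V_p^{(i)} \le V_q^{(i)} \right) \wedge
  \left( \exists i:\; V_p^{(i)} < V_q^{(i)} \right)

A^{*} = \{\, p \in A \mid \nexists\, q \in A : V_q \prec V_p \,\},
\qquad
A' = A \setminus A^{*}
```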

A* Methodology (iii): Pareto Dominance


Van Veldhuizen, D., et al., Considerations in Engineering Parallel Multi-objective Evolutionary Algorithms, IEEE Transactions on Evolutionary Computation, 2003

The Pareto Dominance defines a partial order relation

V_GC+SegmBorder = < 4.99, 5.78, 8.66 >

V_PatchMatch = < 2.47, 7.80, 7.11 >

V_ImproveSubPix = < 2.96, 8.22, 8.55 >

V_GC+SegmBorder ~ V_PatchMatch: < 4.99, 5.78, 8.66 > ~ < 2.47, 7.80, 7.11 >, hence GC+SegmBorder ~ PatchMatch

V_PatchMatch ≺ V_ImproveSubPix: < 2.47, 7.80, 7.11 > ≺ < 2.96, 8.22, 8.55 >, hence PatchMatch ≺ ImproveSubPix
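The two comparisons on this slide can be checked with a short sketch. It assumes the usual minimisation form of Pareto dominance (no worse in every error measure and strictly better in at least one); the function names are hypothetical.

```python
def dominates(vp, vq):
    """True if vp is no worse than vq in every measure and strictly better in one."""
    pairs = list(zip(vp, vq))
    return all(a <= b for a, b in pairs) and any(a < b for a, b in pairs)


def compare(vp, vq):
    """Return which of the three possible relations holds between p and q."""
    if dominates(vp, vq):
        return "p dominates q"
    if dominates(vq, vp):
        return "q dominates p"
    return "not comparable"


v_gc_segm_border = (4.99, 5.78, 8.66)
v_patch_match = (2.47, 7.80, 7.11)
v_improve_sub_pix = (2.96, 8.22, 8.55)

print(compare(v_gc_segm_border, v_patch_match))   # not comparable
print(compare(v_patch_match, v_improve_sub_pix))  # p dominates q
```

GC+SegmBorder is worse under nonocc and disc but better under all, so neither vector dominates the other.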

A* Methodology (iv): Illustration




[Figure: panels for the nonocc, all and disc error criteria]





Scharstein, D. and Szeliski, R., High-accuracy Stereo Depth Maps using Structured Light, CVPR 2003

Scharstein, D. and Szeliski, R., http://vision.middlebury.edu/stereo/eval/, 2012

A* Methodology (v): Illustration

The evaluation model performs the partitioning and the grouping of stereo algorithms under evaluation, based on the Pareto Dominance relation





Algorithm        nonocc   all    disc   Set
ObjectStereo     2.20     6.99   6.36   A*
GC+SegmBorder    4.99     5.78   8.66   A*
PUTv3            2.40     9.11   6.56   A’
PatchMatch       2.47     7.80   7.11   A’
ImproveSubPix    2.96     8.22   8.55   A’
OverSegmBP       3.19     8.81   8.89   A’

A* = { ObjectStereo, GC+SegmBorder }

A’ = { PUTv3, PatchMatch, ImproveSubPix, OverSegmBP }

A* Methodology (vi): Illustration

Interpretation of results is based on the cardinality of the set A*



The ObjectStereo and the GC+SegmBorder algorithms are comparable to each other, and have a superior performance to the rest of the algorithms

A* Evaluation Model

Algorithm        nonocc   all    disc   Set
ObjectStereo     2.20     6.99   6.36   A*
GC+SegmBorder    4.99     5.78   8.66   A*
PUTv3            2.40     9.11   6.56   A’
PatchMatch       2.47     7.80   7.11   A’
ImproveSubPix    2.96     8.22   8.55   A’
OverSegmBP       3.19     8.81   8.89   A’
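The partition in this table can be reproduced by keeping the algorithms that no other algorithm dominates. The sketch below is a straightforward quadratic-time illustration under the standard minimisation form of Pareto dominance, not the authors' implementation; `compute_partition` is a hypothetical name.

```python
def dominates(vp, vq):
    """True if vp is no worse than vq in every measure and strictly better in one."""
    pairs = list(zip(vp, vq))
    return all(a <= b for a, b in pairs) and any(a < b for a, b in pairs)


def compute_partition(vectors):
    """Split the algorithms into A* (non-dominated) and A' (dominated)."""
    a_star = [
        p for p in vectors
        if not any(dominates(vectors[q], vectors[p]) for q in vectors if q != p)
    ]
    a_prime = [p for p in vectors if p not in a_star]
    return a_star, a_prime


# Error vectors (nonocc, all, disc) from the slide.
vectors = {
    "ObjectStereo":  (2.20, 6.99, 6.36),
    "GC+SegmBorder": (4.99, 5.78, 8.66),
    "PUTv3":         (2.40, 9.11, 6.56),
    "PatchMatch":    (2.47, 7.80, 7.11),
    "ImproveSubPix": (2.96, 8.22, 8.55),
    "OverSegmBP":    (3.19, 8.81, 8.89),
}

a_star, a_prime = compute_partition(vectors)
print("A* =", a_star)
print("A' =", a_prime)
```

ObjectStereo dominates all four members of A’, while GC+SegmBorder stays in A* only because it has the lowest error under the all criterion.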

A* Methodology (vii): Strength and Weakness

Cabezas, I. and Trujillo M., A Non-Linear Quantitative Evaluation Approach for Disparity Estimation, VISAPP 2011

Strength: It allows a formal interpretation of results, based on the cardinality of the set A* and with regard to the imagery test-bed considered

Weakness: It does not allow an exhaustive evaluation of the entire set of algorithms under evaluation. It computes the set A* just once, and does not bring information about A’

A* Groups Methodology


It extends the evaluation model of the A* methodology, incorporating the capability of performing an exhaustive evaluation

It introduces the partitioningAndGrouping algorithm:

    A = Set ( { } );
    A.load ( "Algorithms.dat" );
    A* = Set ( { } );
    A’ = Set ( { } );
    group = 1;
    do {
        computePartition ( A, A*, A’, g, ≺ );
        A*.save ( "A*_group_" + group );
        group++;
        A.update ( A’ );       // A = A \ A*
        A*.removeAll ( );      // A* = { }
        A’.removeAll ( );      // A’ = { }
    } while ( ! A.isEmpty ( ) );

subject to:

A* Groups Methodology (ii): Sigma-Z-Error

The A* Groups methodology uses the Sigma-Z-Error (SZE) measure

The SZE measure has the following properties:

It is inherently related to depth reconstruction in a stereo system

It is based on the inverse relation between depth and disparity

It considers the magnitude of the estimation error

It is threshold free

Cabezas, I., Padilla, V., and Trujillo M., A Measure for Accuracy Disparity Maps Evaluation, CIARP 2011
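The SZE formula itself is not preserved in this text. The sketch below is one reading consistent with the listed properties: it sums, over all pixels, the absolute difference between the depth obtained from the ground-truth disparity and the depth obtained from the estimated disparity. The constant `mu` (added to avoid division by zero), the camera values and the function name are assumptions; the cited CIARP 2011 paper gives the authoritative definition.

```python
def sze(estimated, ground_truth, focal_length, baseline, mu=1.0):
    """Sum of absolute depth differences between ground-truth and estimated disparities.

    mu is an assumed small constant that keeps the denominators non-zero.
    """
    if len(estimated) != len(ground_truth):
        raise ValueError("maps must be of equal size")
    total = 0.0
    for d_est, d_true in zip(estimated, ground_truth):
        z_true = focal_length * baseline / (d_true + mu)
        z_est = focal_length * baseline / (d_est + mu)
        total += abs(z_true - z_est)
    return total


F, B = 500.0, 0.1  # hypothetical focal length (pixels) and baseline (metres)

# The same 5-pixel disparity error, at a far point and at a near point.
far_point = sze([4.0], [9.0], F, B)
near_point = sze([44.0], [49.0], F, B)
print(f"far point:  {far_point:.3f}")
print(f"near point: {near_point:.3f}")
```

No threshold is involved, larger errors contribute more, and an identical disparity error is penalised more heavily where it causes a larger depth error.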

A* Groups Methodology (iii): Illustration

Evaluation of the selected algorithms using the proposed methodology




[Figure: panels for the nonocc, all and disc error criteria]





A* Groups Methodology (iv): Illustration

The evaluation model performs the partitioning and the grouping of stereo algorithms under evaluation, based on the Pareto Dominance relation





Algorithm        nonocc   all      disc    Group
GC+SegmBorder    50.48    64.90    24.33   1
PatchMatch       49.95    261.84   32.85   1
PUTv3            99.67    333.37   53.79
ImproveSubPix    50.66    97.94    32.01
OverSegmBP       58.65    108.60   34.58
ObjectStereo     73.88    117.90   36.25

Group 1 = { GC+SegmBorder, PatchMatch }

Remaining algorithms: { ObjectStereo, PUTv3, ImproveSubPix, OverSegmBP }

A* Groups Methodology (v): Illustration


Remaining algorithms: { ObjectStereo, PUTv3, ImproveSubPix, OverSegmBP }

Algorithm        nonocc   all      disc    Group
ImproveSubPix    50.66    97.94    32.01   2
PUTv3            99.67    333.37   53.79
ObjectStereo     73.88    117.90   36.25
OverSegmBP       58.65    108.60   34.58

Group 2 = { ImproveSubPix }

Remaining algorithms: { ObjectStereo, PUTv3, OverSegmBP }

Algorithm        nonocc   all      disc    Group
OverSegmBP       58.65    108.60   34.58   3
PUTv3            99.67    333.37   53.79
ObjectStereo     73.88    117.90   36.25

Group 3 = { OverSegmBP }

Remaining algorithms: { ObjectStereo, PUTv3 }

And so on …

A* Groups Methodology (vi): Illustration

Interpretation of results is based on the cardinality of each group



There are 5 groups of different performance

The GC+SegmBorder and the PatchMatch algorithms are comparable to each other, and have a superior performance to the rest of the algorithms

The ImproveSubPix algorithm is superior to the OverSegmBP, the ObjectStereo, and the PUTv3 algorithms

…

The PUTv3 algorithm has the lowest performance

A* Groups Evaluation Model

Algorithm        nonocc   all      disc    Group
GC+SegmBorder    50.48    64.90    24.33   1
PatchMatch       49.95    261.84   32.85   1
ImproveSubPix    50.66    97.94    32.01   2
OverSegmBP       58.65    108.60   34.58   3
ObjectStereo     73.88    117.90   36.25   4
PUTv3            99.67    333.37   53.79   5
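The five groups in the table above can be reproduced by repeatedly extracting the non-dominated set and removing it from the remaining algorithms, in the spirit of the partitioningAndGrouping algorithm. The sketch below is an illustration under the standard minimisation form of Pareto dominance, using the SZE values from the slides; `partition_and_group` is a hypothetical name and the file handling of the original pseudocode is omitted.

```python
def dominates(vp, vq):
    """True if vp is no worse than vq in every measure and strictly better in one."""
    pairs = list(zip(vp, vq))
    return all(a <= b for a, b in pairs) and any(a < b for a, b in pairs)


def partition_and_group(vectors):
    """Return a list of groups; group k is the non-dominated set of what remains."""
    remaining = dict(vectors)
    groups = []
    while remaining:
        a_star = [
            p for p in remaining
            if not any(dominates(remaining[q], remaining[p]) for q in remaining if q != p)
        ]
        groups.append(a_star)
        for p in a_star:  # A = A \ A*
            del remaining[p]
    return groups


# SZE values (nonocc, all, disc) from the slides.
sze_vectors = {
    "GC+SegmBorder": (50.48, 64.90, 24.33),
    "PatchMatch":    (49.95, 261.84, 32.85),
    "PUTv3":         (99.67, 333.37, 53.79),
    "ImproveSubPix": (50.66, 97.94, 32.01),
    "OverSegmBP":    (58.65, 108.60, 34.58),
    "ObjectStereo":  (73.88, 117.90, 36.25),
}

groups = partition_and_group(sze_vectors)
for number, members in enumerate(groups, start=1):
    print(f"Group {number}: {', '.join(members)}")
```

The loop always terminates because a non-empty finite set has at least one non-dominated element, so every iteration removes at least one algorithm.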

Experimental Results

The conducted evaluation involves the following elements:


Test Bed Images: [figure]

Error Criteria: nonocc, all, disc

Error Measures: SZE, BMP

Evaluation Models: A* Groups, Middlebury

Stereo Algorithms: 112 algorithms from Middlebury’s repository

Scharstein, D. and Szeliski, R., http://vision.middlebury.edu/stereo/eval/, 2012

Experimental Results (ii)


Algorithm        Strategy   Group   Middlebury’s Ranking
DoubleBP         Global     1       4
PatchMatch       Local      1       11
GC+SegmBorder    Global     1       13
FeatureGC        Global     1       18
Segm+Visib       Global     1       29
MultiresGC       Global     1       30
DistinctSM       Local      1       34
GC+occ           Global     1       67
MultiCamGC       Global     1       68

Algorithm        Group   Middlebury’s Ranking
ADCensus         2       1
AdaptingBP       2       2
CoopRegion       2       3
DoubleBP         1       4
RDP              2       5
OutlierConf      2       6
SubPixDoubleBP   2       7
SurfaceStereo    2       8
WarpMat          2       9
ObjectStereo     2       10
PatchMatch       1       11
Undr+OverSeg     2       12
GC+SegmBorder    1       13
InfoPermeable    2       14
CostFilter       2       15

Final Remarks

The A* Groups methodology allows an exhaustive evaluation, as well as an objective interpretation of results

Innovative results on the comparison of stereo correspondence algorithms were obtained using the proposed methodology and the SZE error measure

The introduced methodology offers advantages over the conventional approaches to comparing stereo correspondence algorithms

The authors are working to provide the research community with an accessible way to use the introduced methodology

Thanks!

