Neurocomputing (2017) 1-11, http://dx.doi.org/10.1016/j.neucom.2017.06.043
Contents lists available at ScienceDirect. Journal homepage: www.elsevier.com/locate/neucom

Robust outlier removal using penalized linear regression in multiview geometry

Guoqing Zhou (a), Qing Wang (a,*), Zhaolin Xiao (b)

(a) School of Computer Science and Engineering, Northwestern Polytechnical University, No. 127 West Youyi Road, Xi'an, Shaanxi 710072, China
(b) School of Computer Science and Engineering, Xi'an University of Technology, No. 5 South Jinhua Road, Xi'an, Shaanxi, China

Article history: Received 12 July 2016; Revised 26 April 2017; Accepted 20 June 2017. Communicated by Bin Fan.

Keywords: Computer vision; Multiview geometry; Penalized linear regression; Outlier removal; Masking and swamping

Abstract

In multiview geometry, it is crucial to remove outliers before optimization, since they are adverse factors for parameter estimation. Efficient and very popular methods for this task are RANSAC, MLESAC and their improved variants. However, Olsson et al. have pointed out that mismatches in longer point tracks may go undetected by RANSAC or MLESAC. Although several robust and efficient algorithms have been proposed for outlier removal, little attention has been paid in the community to the masking effect (an outlier is undetected as such) and the swamping effect (an inlier is misclassified as an outlier), which probably bias the fitted model. In this paper, we first cast some typical parameter estimation problems in multiview geometry, such as triangulation, homography estimation and structure from motion (SFM), as a linear regression model. Then, a non-convex penalized regression approach is proposed to effectively remove outliers for robust parameter estimation. Finally, we analyze the robustness of non-convex penalized regression theoretically. We have validated our method on three representative estimation problems in multiview geometry: triangulation, homography estimation and SFM with known camera orientation. Experiments on both synthetic data and real scenes demonstrate that the proposed method outperforms the state-of-the-art methods. The approach can also be extended to more generic problems in which within-profile correlations exist.

© 2017 Elsevier B.V. All rights reserved.

Funding: This work was supported by Grants from the National Natural Science Foundation of China (nos. 61401359, 61531014, 61501370), the Natural Science Basic Research Plan in Shaanxi Province of China (no. 2015JQ6209), the NPU Foundation for Fundamental Research (no. 3102015JSJ0007) and the Aeronautical Science Foundation of China (no. 2015ZC53044).
* Corresponding author. E-mail addresses: [email protected] (G. Zhou), [email protected] (Q. Wang), [email protected] (Z. Xiao).

1. Introduction

Robust parameter estimation plays an important role in many problems of computer vision, such as triangulation, planar homography, fundamental matrix estimation, structure and motion reconstruction, and so on. Typically, parameter estimation can be posed as the minimization of different objective functions according to various error norms (i.e., the L1, L2 or L-infinity norm) [1]. Recent work on multiview geometric problems has exploited convexity properties in order to obtain the global optimum, and many geometric problems are known to be quasi-convex under the L-infinity norm [2-5]. However, such formulations are also known to be extremely vulnerable to outliers [6], which are often unavoidable. As a result, outlier removal has drawn extensive attention in computer vision [1].

Traditionally, RANSAC is probably the most commonly used method for outlier removal [7]. RANSAC allows the user to quickly remove most outliers from the data, and it has therefore met with great success in computer vision. However, as RANSAC only works on a subset of the images, mismatches in longer point tracks may go undetected. Meanwhile, being a randomized algorithm, RANSAC may easily miss a few outliers, which causes disastrous results, especially when L-infinity-norm based optimization methods are employed [2].

The limitation of the RANSAC algorithm motivates the development of various alternative approaches for outlier removal. Among the improved outlier removal frameworks, L-infinity-norm based minimization has drawn a great deal of attention because it can find the global optimum of outlier removal. Sim and Hartley [6] have exploited special properties of strictly quasi-convex objectives in computer vision and, based on them, proposed a simple method of outlier removal: throwing away the bad points with the maximal residuals. However, the set of measurements with the largest residual may contain multiple points, whilst the method only ensures that this set contains at least one outlier. In other words, some inliers are falsely removed in each iteration; on the other hand, the remaining point set may still contain some outliers.





Our work is also motivated by the need to improve the robustness of outlier removal to the benefit of multiview geometric problems. To keep the fitted model as unbiased as possible, we need to consider the masking (an outlier is undetected as such) and swamping (an inlier is misclassified as an outlier) effects [8]. In general, a simple estimation method works very well if there is only one outlier in the observed data. However, when there are multiple outliers, such simple methods fail, and identifying the outlying cases becomes more difficult due to the masking and swamping effects. In outlier detection, masking is usually more serious than swamping: the former can cause gross distortions, whereas the latter is often just a matter of lost efficiency. In this paper, we tackle the masking and swamping effects in parameter estimation for multiview geometry by formulating the problems as a linear regression model.

In this paper, we present a different solution, which first characterizes typical parameter estimation problems as a linear regression model. Then, a non-convex penalized regression approach is proposed to effectively remove outliers for robust parameter estimation. Specifically, we analyze the robustness of non-convex penalized regression theoretically. We then compare our method with the state-of-the-art methods on three representative estimation problems in multiview geometry: triangulation, homography estimation and SFM with known camera orientation. Experiments on both synthetic data and real scenes demonstrate that the proposed method outperforms the state-of-the-art methods.

2. Related work

Outlier removal in computer vision is a classical and important problem, and there is a large body of related research with different ideas. The best-known method is RANSAC [7], and extensive literature exists on robust methods for regression. The M-estimator SAmple Consensus (MSAC) algorithm [9], a typical method used in computer vision, is an improvement of RANSAC that replaces the inlier count with a weighted voting based on an M-estimator. Similar schemes for robust estimation using random sampling and M-estimators were also proposed in [10]. Ma et al. [11] proposed the vector field consensus (VFC) method, which is computationally efficient and robust, to establish matching between two point sets. The VFC algorithm disregards geometric constraints and determines correspondences of features between point sets using the coherence of the underlying motion fields. Tang et al. [12] proposed a novel method for point pattern matching via spectral graph analysis, which formulates feature point set matching as an optimization problem with a one-to-one constraint.

Recently, in an effort to further utilize the whole set of correspondences, slack variables have been introduced into the objective function [13-15], so that outlier removal schemes are devised by analyzing the slack values, replacing the sampling strategy on a subset of the correspondences used by RANSAC and its improved variants. Seo et al. [13] have put a 1-slack variable into the objective function. Since their strategy of outlier removal is similar to that of [6], it is still unable to overcome the masking and swamping effects.

Olsson et al. [14] have introduced N-slack variables into the objective function. Essentially, the L1 norm of the "slacks" vector (or the sum of infeasibilities) is taken into consideration. The L1 method is therefore fast and has been shown to work well in practice; specifically, its performance is better than traditional RANSAC-like methods. However, no theoretical analysis is given in the paper. Its limitation is that the removed data may not include the genuine outliers (masking and swamping), which can lead to arbitrarily bad models.

Yu et al. [15] have pointed out that the 1-slack [13] and N-slack variables with the L1 norm [14] represent two ends of the spectrum of reliability versus speed. Thus, outlier removal is characterized as a game that involves two players of conflicting interests, namely, optimizer and outlier. In essence, this method is a least-squares regression. Although the strategy is able to strike a balance between these two performance factors, swamping, which cannot be avoided in this strategy, could lead one to delete good observations. Meanwhile, masking still appears in this method. It is even more problematic to choose the value of K, i.e. the proportion of outliers: it must be given in advance (heuristically), which is often unrealistic in practical applications.

Apart from the above-mentioned trends, Li proposed to remove nothing but outliers, rather than the whole support set that produces the smallest cost [16]. Essentially, the method belongs to the branch of exhaustive search. Despite its improved efficiency, however, the method is specific to the triangulation problem (low dimensional parameter estimation). Similar methods are also proposed in [17-20], which target global optimization by removing the outliers. However, these methods are either computationally intractable for high dimensional parameter estimation (with a worst-case exponential complexity), or only tailored to a very specific class of applications. For example, in [19], the triangulation problem is solved in O(n^{d+1}), where n is the number of points and d is the dimension of the model. Recently, outlier-robust methods [21-24] have attracted the attention of researchers; these methods are applied to pattern classification, support vector regression, machine learning, and so on.

In most of the previous approaches, the data set containing outliers is pre-cleaned by the RANSAC algorithm. After this preprocessing, the data set contains only a few outliers, so that identifying such observations is not difficult. However, when the set contains many outliers, identifying such cases becomes more difficult, and these methods are unable to overcome the masking and swamping effects simultaneously.

The rest of this paper is organized as follows. In Section 3, we provide a general description of the problem formulation. In Section 4, we introduce the non-convex penalized regression idea as one way to detect outliers, and Section 5 gives its intuition and theoretical analysis. The proposed algorithm is described in detail in Section 6. Section 7 presents experimental results on synthetic data and real scene data. In Section 8, we conclude the paper and discuss future directions.

3. Problem formulations

Structure from motion (SFM) is one of the most challenging problems in multiview geometry. We pick it as a specific example to explain the approach in this paper. The goal of SFM is to infer the scene structure (often 3D points) and/or the camera motion, given the image data.

Let P_i, i = 1, 2, ..., N be the 3 x 4 camera matrix of camera i among N views, X_j, j = 1, 2, ..., M be the 3D point represented by (X_j, Y_j, Z_j, 1)^T, and u_{ij} be the image point represented by (u_{ij}, v_{ij}, 1)^T. Then, the central projection is simply expressed as a linear mapping between their homogeneous coordinates,

u_{ij} \rightarrow P_i X_j.   (1)

For Eq. (1), different forms of unknowns correspond to different multiview geometric problems. For example, the problem of determining a point's 3D position from a set of corresponding image locations and known camera matrices is known as N-View Triangulation. If the point's 3D position and the corresponding image locations are known, the problem of recovering the 3 x 4 camera matrix P is known as Camera Resection. The more popular and challenging problem is SFM with Known Camera Orientation, where the 2D image points and the rotation parameters of the cameras are given, and we infer the 3D structure (3D point positions) of the scene and the camera translation parameters.


In the subsequent parts of this section, we take SFM with known camera orientation as the example for outlier removal and use it to explain the penalized regression. The other problems can be formulated in a similar way.

Sturm and Triggs [25] have pointed out that the affine factorization method cannot be applied directly to projective reconstruction. So Eq. (1), representing the projective mapping, is interpreted as true only up to a constant factor. Writing these constant factors explicitly, we have

d_{ij} u_{ij} = P_i X_j,   (2)

where d_{ij} is the projective depth of the 3D point X_j with respect to camera i. Eq. (2) holds only if the correct weighting factor d_{ij} is applied to each measured point u_{ij}.

If we decompose P_i as K_i [R_i | t^i] (we can omit the known K_i later for simplicity), we need to estimate the pose of each camera t^i and the 3D points X_j = (\tilde{X}_j^\top, 1)^\top. Now, we expand Eq. (2) as follows,

d_{ij} \begin{pmatrix} u_{ij} \\ v_{ij} \\ 1 \end{pmatrix} = \begin{pmatrix} r_1^{i\top} \tilde{X}_j + t_1^i \\ r_2^{i\top} \tilde{X}_j + t_2^i \\ r_3^{i\top} \tilde{X}_j + t_3^i \end{pmatrix},   (3)

where r_1^{i\top}, r_2^{i\top}, r_3^{i\top} are the row vectors of R_i, and t^i = (t_1^i, t_2^i, t_3^i)^\top encodes the camera center.

Since we mainly consider the outlier removal problem, we will, for simplicity, configure Eq. (3) as a linear equation. It is necessary to recover the projective depths d_{ij}. First, we normalize the image points and then use the recovery method proposed in [25], i.e.,

d_{ij} = \frac{(e_i^k \times u_{ij}) \cdot (F_i^k u_{kj})}{\| e_i^k \times u_{ij} \|^2}\, d_{kj},   (4)

where u_{ij}^\top F_i^k u_{kj} = 0, k = 1, 2, ..., N, and e_i^k is the epipole of u_{kj} in image i. In this method, the projective depths can be recovered from fundamental matrices and epipoles alone. Solving Eq. (4) in least squares, we can obtain d_{ij} in terms of the given d_{kj}.
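The chaining of Eq. (4) is straightforward to implement. The sketch below (ours, in Python/NumPy rather than the MATLAB environment used for the experiments later; all names are hypothetical) propagates a known projective depth d_kj in view k to d_ij in view i, which is exactly this recursion.

```python
import numpy as np

def propagate_depth(u_i, u_k, F_ki, e_ki, d_k):
    """One step of Eq. (4): transfer the projective depth of a 3D point
    from view k (depth d_k) to view i, given the fundamental matrix F_ki
    (u_i^T F_ki u_k = 0) and the epipole e_ki of view k in image i.
    u_i, u_k are homogeneous image points of shape (3,)."""
    c = np.cross(e_ki, u_i)                       # e_ki x u_ij
    return float(c @ (F_ki @ u_k)) / float(c @ c) * d_k

# Starting from d_11 = 1, this step is chained over connected view pairs
# until every depth d_ij along a point track has been assigned.
```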

In the SFM problem, we set the first 3D point as the origin of the world coordinate system, i.e., X_1 = (0, 0, 0, 1)^\top. At the same time, we assume the depth of the first 3D point X_1 in the first camera is 1, i.e., t_3^1 = 1. This means that d_{11} can be initialized to 1. Eq. (4) can then be recursively chained together to give all the projective depths of the point X_j. Once all the projective depths are recovered, for the sake of discussion, both sides of Eq. (3) are divided by d_{ij},

\begin{pmatrix} u_{ij} \\ v_{ij} \\ 1 \end{pmatrix} = \begin{pmatrix} r_1^{i\top} & 1 & 0 & 0 \\ r_2^{i\top} & 0 & 1 & 0 \\ r_3^{i\top} & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} \tilde{X}'_j \\ t'_i \end{pmatrix},   (5)

where \tilde{X}'_j = \tilde{X}_j / d_{ij} and t'_i = t^i / d_{ij}, i = 1, 2, ..., N, j = 1, 2, ..., M.

Thus, N cameras and M 3D points provide P = 3 x N x M constraint equations. The number of unknowns is Q = 3 x (M - 1) + 3 x N - 1, where each camera corresponds to three parameters except the first one, which has only two degrees of freedom (we fix its translation vector as t_1 = (t_1^1, t_2^1, 1)^\top), and each space point corresponds to three unknowns except the first one, X_1 = (0, 0, 0, 1)^\top. In accordance with Eq. (5), we have t'_1 = (t'^1_1, t'^1_2, 1)^\top and \tilde{X}'_1 = (0, 0, 0)^\top. We rewrite the equations in vector form,

y = A\beta + \varepsilon,   (6)

where A \in R^{P x Q} is a fixed matrix of predictors, \beta = (\tilde{X}'^\top_2, ..., \tilde{X}'^\top_M, t'^1_1, t'^1_2, t'^\top_2, ..., t'^\top_N)^\top \in R^Q is a fixed (unknown) coefficient vector, and \varepsilon \in R^P is the random error vector, \varepsilon \sim N(0, \sigma^2 I). y = (u_{11}, v_{11}, 1, ..., u_{ij}, v_{ij}, 1, ..., u_{NM}, v_{NM}, 1)^\top \in R^P is the observation vector. Row p is written as y_p = a_p^\top \beta + \varepsilon_p, p = 1, 2, ..., P, where a_p^\top is the p-th row vector of A.

It is important to note that Eq. (6) is a linear regression model (LRM) if d_{ij} is known. As mentioned before, we can obtain the projective depths from image point correspondences. If there exist some outliers and noise, the estimates of the projective depths cannot converge to the global optimum. However, in Section 5.2, we will prove that a not very accurate d_{ij} does not affect the performance of the proposed algorithm.

Since there are slight differences between homography estimation and SFM, we describe the form of the homography independently. A pair of key points, u_i = (u_i, v_i, 1)^\top and u'_i = (u'_i, v'_i, 1)^\top in homogeneous coordinates, are related by

u_i = H u'_i,   (7)

where H is a 3 x 3 homography matrix. The bottom-right entry of H is fixed to 1 and the number of unknowns is 8 [3]. We reformulate H as a column vector h,

H = \begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & 1 \end{pmatrix} \xrightarrow{\text{reshape}} h \triangleq (h_1, h_2, ..., h_8)^\top,   (8)

so that Eq. (7) is expanded as follows,

\begin{pmatrix} u_1 \\ v_1 \\ \vdots \\ u_i \\ v_i \\ \vdots \\ u_N \\ v_N \end{pmatrix} = \begin{pmatrix} u'_1 & v'_1 & 1 & 0 & 0 & 0 & -u_1 u'_1 & -u_1 v'_1 \\ 0 & 0 & 0 & u'_1 & v'_1 & 1 & -v_1 u'_1 & -v_1 v'_1 \\ & & & & \vdots & & & \\ u'_i & v'_i & 1 & 0 & 0 & 0 & -u_i u'_i & -u_i v'_i \\ 0 & 0 & 0 & u'_i & v'_i & 1 & -v_i u'_i & -v_i v'_i \\ & & & & \vdots & & & \\ u'_N & v'_N & 1 & 0 & 0 & 0 & -u_N u'_N & -u_N v'_N \\ 0 & 0 & 0 & u'_N & v'_N & 1 & -v_N u'_N & -v_N v'_N \end{pmatrix} h.   (9)

Eq. (9) is also a linear regression. Triangulation has a similar expression to the homography,

x_i = P_i X,   (10)

where P_i is a 3 x 4 camera matrix, X is the unknown 3D point, and x_i is the corresponding image point. The model fitting problems of SFM, triangulation and homography estimation have thus been re-formulated as specific regression models. Next, we will discuss outlier removal under a unified framework of penalized linear regression. The discussed outlier removal approach applies to any problem expressed in the form of Eq. (6).
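As a sanity check on Eq. (9), the following sketch (ours) stacks the 2N x 8 design matrix from point correspondences and solves for h by plain least squares; reshaping h with a trailing 1 recovers H as in Eq. (8). A robust method would of course replace the plain least-squares solve.

```python
import numpy as np

def fit_homography(pts, pts_prime):
    """Least-squares homography per Eq. (9): pts ~ H @ pts_prime.

    pts, pts_prime : (N, 2) arrays of corresponding inhomogeneous points
    Returns the 3x3 matrix H with H[2, 2] = 1.
    """
    N = pts.shape[0]
    A = np.zeros((2 * N, 8))
    b = pts.reshape(-1)                  # (u_1, v_1, ..., u_N, v_N)
    for i, ((u, v), (up, vp)) in enumerate(zip(pts, pts_prime)):
        A[2 * i]     = [up, vp, 1, 0, 0, 0, -u * up, -u * vp]
        A[2 * i + 1] = [0, 0, 0, up, vp, 1, -v * up, -v * vp]
    h, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.append(h, 1.0).reshape(3, 3)   # Eq. (8) reshape, h_33 = 1
```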

4. A framework for detecting outliers based on penalized linear regression

For the linear regression model, there are different approaches to multiple-outlier identification, separated into two categories, direct and indirect, according to whether they use residuals from a robust fit or not. The success of the direct approaches depends on the initially cleaned subset of the data: the procedure works well for low-leverage outliers but may fail when the sample contains some high-leverage outliers. The indirect approach obtains a robust regression estimate first, and then identifies the outliers by using the residuals from that robust estimate. The most common indirect methods are M-estimation [26] and Least Trimmed Squares (LTS) estimation [27]. To counter masking and swamping, we propose a method for simultaneous variable selection and outlier identification.



4.1. Outlier detection model

In the observed data of multiview geometric problems, noise and outliers in the images cannot be avoided. From the above analysis, a 3D space point and the corresponding 2D image point provide an observation of the linear regression. Considering that the least-squares estimate is not robust, one can take outlier-resistant loss functions into consideration, such as the L1 penalty or the more general Huber function. When observation i is suspected to be an outlier, we can consider the Mean Shift Outlier Model (MSOM). The MSOM gives the same residual sum of squares as the model without the suspect observations, so it can be used for model selection combined with outlier identification. For the role of observations after deleting variables, or the graphical display showing the role of variables after removing observations under the MSOM, please refer to [28].

In this paper, we wish to find a robust method based on the MSOM which can deal with data sets containing multiple outliers. The model is

y = A\beta + \sqrt{n}\,\gamma + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2 I),   (11)

where y = (y_1, y_2, ..., y_n)^\top is an n-dimensional response vector, A = (a_1, a_2, ..., a_n)^\top is an n x p covariate matrix, \beta is a p-dimensional unknown coefficient vector, \gamma = (\gamma_1, \gamma_2, ..., \gamma_n)^\top is an n-dimensional unknown vector whose nonzero elements correspond to outliers, and \varepsilon is an n-dimensional random error vector. Correspondingly, the coefficient of \gamma is assumed to be \sqrt{n} to match its scale with the columns of A. The test for whether observation i is an outlier is the same as testing whether the parameter \gamma_i is zero in the regression: if \gamma_i = 0, case i is good, otherwise it is an outlier. We assume that \gamma is sparse, since outliers should not be common. The goal is to find a robust estimate of \gamma and also of \beta. According to the Gauss-Markov theorem, the Ordinary Least Squares (OLS) estimator is the best unbiased estimator for the linear regression model; in the presence of multiple outliers, the main challenge is to counter the masking and swamping effects. These two observations, in conjunction, lead to the following cost function:

L(\beta, \gamma) = \frac{1}{2n} \| y - A\beta - \sqrt{n}\,\gamma \|_2^2 + \lambda_\beta \sum_{j=1}^{p} w_{\beta,j} |\beta_j| + \lambda_\gamma \sum_{i=1}^{n} w_{\gamma,i} P(\gamma_i),   (12)

to be minimized over \beta \in R^p and \gamma \in R^n, where \lambda_\beta > 0 and \lambda_\gamma > 0 are tuning parameters for \beta and \gamma, and P(.) is a penalty function that encourages sparsity. Supposing that \tilde{\beta} = (\tilde{\beta}_1, ..., \tilde{\beta}_p)^\top and \tilde{\gamma} = (\tilde{\gamma}_1, ..., \tilde{\gamma}_n)^\top are preliminary estimators, w_{\beta,j} = 1/|\tilde{\beta}_j| and w_{\gamma,i} = 1/|\tilde{\gamma}_i| are known weights. Note that in Eq. (12) the noise term \varepsilon has been removed, because the discrepancy is mainly caused by outliers due to their larger magnitudes.

4.2. Optimization framework

General approaches to obtaining \beta in linear regression implicitly assume that all observations have equal influence on the model fitting. However, outlyingness can depend on the individual observation. This fact means that it is better to perform variable selection and outlier identification in linear regression simultaneously. When the function L(\beta, \gamma) is jointly convex in \beta and \gamma, the alternating optimization is guaranteed to converge. To solve the optimization problem of Eq. (12), we introduce an alternating optimization method in Algorithm 1.

Algorithm 1: Alternating iterative optimization.

Step 1. Initialization: k <- 0, \beta^{(0)} <- \beta_{init} and \gamma^{(0)} <- argmin_\gamma L(\beta^{(0)}, \gamma).
Step 2. Update k <- k + 1,

\beta^{(k)} <- argmin_\beta L(\beta, \gamma^{(k-1)}),   (13)
\gamma^{(k)} <- argmin_\gamma L(\beta^{(k)}, \gamma).   (14)

Step 3. If the iterates converge, output the current (\beta^{(k)}, \gamma^{(k)}) and stop; otherwise return to Step 2.

This algorithm surely converges since it satisfies

L(\beta^{(0)}, \gamma^{(0)}) \ge L(\beta^{(1)}, \gamma^{(0)}) \ge L(\beta^{(1)}, \gamma^{(1)}) \ge \cdots \ge 0.   (15)

For Eq. (13), the optimization problem can be solved by classical optimization methods, such as fast algorithms based on Krylov iterative and algebraic multigrid methods [29]. For solving Eq. (14), we set \partial L / \partial \beta = 0 in Eq. (12) and obtain the explicit solution of \beta,

\beta = (A^\top A)^\dagger A^\top (y - \sqrt{n}\,\gamma),   (16)

where (A^\top A)^\dagger is the Moore-Penrose pseudoinverse of (A^\top A). It is defined as

(A^\top A)^\dagger = \lim_{\mu \to 0} ((A^\top A)^\top (A^\top A) + \mu I)^{-1} (A^\top A)^\top,

where I is the identity matrix and \mu is a scalar variable introduced to avoid numerical instability. Therefore, Eq. (16) can be rewritten as

\beta = (A^\top A + \mu I)^{-1} A^\top (y - \sqrt{n}\,\gamma).   (17)

Then, we plug the solution \beta of Eq. (17) back into Eq. (12). Since \lambda_\beta \sum_j w_{\beta,j} |\beta_j| > 0, which is obtained by calculating Eq. (17), the outlier detection can be written as

\hat{\gamma} = \arg\min_{\gamma} \frac{1}{2} \| (I - H)(y - \sqrt{n}\,\gamma) \|_2^2 + \sum_{i=1}^{n} P(\gamma_i; \lambda_i),   (18)

where H = A (A^\top A)^{-1} A^\top is the hat matrix of A and \lambda_i = \lambda_\gamma w_{\gamma,i}, i = 1, ..., n. Once the solution of Eq. (18) is obtained, the estimated \hat{\gamma} can be substituted into Eq. (17), and the estimation of the prediction parameter \beta becomes straightforward. So, the key step of Algorithm 1 is the outlier detection performed by solving Eq. (18), and the remainder of this paper focuses on solving Eq. (18) for outlier removal. Inspired by the sparsity of \gamma, a natural way to accomplish this task is penalized linear regression, for which the penalty function P(.) plays the most important role in Eq. (18). Theoretically, many penalty functions can be employed, such as the L2 and L1 penalties: the L2 penalty \|\gamma\|_2^2 yields a ridge-type regression, while the L1 penalty \|\gamma\|_1 results in the LASSO. Even though convex optimization gives a well-formulated model, in the presence of multiple outliers with moderate or high leverage values such methods fail to eliminate the masking and swamping effects. Experimental results on the HBK data have verified that the L1 approach is not able to identify the correct outliers without swamping [30].
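For concreteness, a minimal sketch of Algorithm 1 under the L0 penalty introduced just below is given here (ours; unit weights w_{gamma,i} = 1 and a fixed ridge parameter mu for Eq. (17) are simplifying assumptions). The beta-step is the explicit solution of Eq. (17), and the gamma-step is coordinate-wise hard thresholding, which minimizes Eq. (12) over gamma for this penalty.

```python
import numpy as np

def hard_threshold(z, lam):
    """Keep entries with |z| > lam, zero the rest (hard thresholding)."""
    return np.where(np.abs(z) > lam, z, 0.0)

def alternating_fit(A, y, lam, mu=1e-8, iters=100, tol=1e-8):
    """Sketch of Algorithm 1 for the L0-type penalty: ridge-regularized
    beta-step per Eq. (17), hard-thresholding gamma-step."""
    n, p = A.shape
    gamma = np.zeros(n)
    AtA = A.T @ A + mu * np.eye(p)          # regularized normal matrix
    for _ in range(iters):
        beta = np.linalg.solve(AtA, A.T @ (y - np.sqrt(n) * gamma))
        g_new = hard_threshold((y - A @ beta) / np.sqrt(n), lam)
        if np.max(np.abs(g_new - gamma)) < tol:
            gamma = g_new
            break
        gamma = g_new
    return beta, gamma      # nonzero entries of gamma flag the outliers
```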


Theoretically, a convex criterion is inherently incompatible with robustness [19]. Recently, robust linear regression in which the penalty function P(.) is non-convex has been introduced for outlier detection. In this paper, we consider the L0 penalty function,

P(\gamma; \lambda) = \frac{\lambda^2}{2} \cdot I_{\gamma \neq 0}.   (19)

The main advantage of the L0 penalty function is that it is not sensitive to outliers; as a result, it is robust to swamping and masking. On the other hand, the two challenges of linear regression with the L0 penalty are algorithmic convergence and tuning parameter selection.

5. Theoretical analysis

5.1. Robustness analysis

In robust statistical theory, there is a universal connection between the thresholding function \Theta(\gamma; \lambda) and the penalty function P(\gamma; \lambda). Antoniadis and Fan [31] proposed a three-step construction of the penalty function from the thresholding function with the smallest curvature. If \Theta(\gamma; \lambda) is an odd monotone unbounded shrinkage rule for \gamma at any \lambda, the corresponding P(\gamma; \lambda) is constructed as follows,

\Theta^{-1}(u; \lambda) = \sup\{ t : \Theta(t; \lambda) \le u \},
s(u; \lambda) = \Theta^{-1}(u; \lambda) - u,
P(\gamma; \lambda) = P(-\gamma; \lambda) = \int_0^{|\gamma|} s(u; \lambda)\, du.   (20)

For the non-convex penalty function of Eq. (19), the corresponding threshold function has the following form,

\Theta(\gamma; \lambda) = \begin{cases} 0, & \text{if } |\gamma| \le \lambda \\ \gamma, & \text{if } |\gamma| > \lambda \end{cases}   (21)

Similarly to the idea in [30], our procedure has a close connection to the \psi-type redescending M-estimator through the equation \psi(\gamma; \lambda) = \gamma - \Theta(\gamma; \lambda). In this paper, the explicit formulation of \psi(\gamma; \lambda) is

\psi(\gamma; \lambda) = \begin{cases} \gamma, & \text{if } |\gamma| \le \lambda \\ 0, & \text{if } |\gamma| > \lambda \end{cases}   (22)

Obviously, \psi(\gamma; \lambda) = 0 for all \gamma with |\gamma| > \lambda. In other words, \psi(\gamma; \lambda) is non-decreasing near the origin but descends to 0 far from the origin. This means that the robust loss function of Eq. (18), which contains the non-convex penalty function, has the redescending property. The redescending property is very attractive computationally, since the solution of Eq. (18) is always unique because the estimating function never reaches zero as |\gamma| \to 0. Due to this property, the penalized linear regression provided in this paper can completely reject gross outliers while not completely ignoring moderately large outliers (like the median). Theoretically, it has a high breakdown point near 1/2. Thus, the loss function of Eq. (18) is robust to masking and swamping.

5.2. Convergence analysis

Since the penalty function in Eq. (18) is non-convex, the corresponding optimization problem of Eq. (12) suffers from local minima. In this subsection, we adopt a theoretical analysis of the global minimum of Algorithm 1 from [32], in which Katayama et al. have summarized some properties of the loss function, the penalty function and the thresholding function of linear regression.

Firstly, the random error vector \varepsilon = (\varepsilon_1, \varepsilon_2, ..., \varepsilon_n)^\top is independently and identically distributed as a zero-mean sub-Gaussian distribution with parameter \sigma > 0.

Secondly, the thresholding function \Theta(.; \lambda) satisfies \Theta(\gamma; \lambda) = 0 if |\gamma| \le \lambda, and |\Theta(\gamma; \lambda) - \gamma| \le \lambda for all \gamma \in R.

Lastly, some notation and hypotheses support the following analysis. Let (\tilde{\beta}, \tilde{\gamma}) be the preliminary estimator of (\beta, \gamma), \hat{S} = supp(\tilde{\beta}) = \{ j | \tilde{\beta}_j \neq 0 \} \subset \{1, ..., p\} and \hat{G} = supp(\tilde{\gamma}) = \{ i | \tilde{\gamma}_i \neq 0 \} \subset \{1, ..., n\}, with sizes \hat{s} = |\hat{S}| and \hat{g} = |\hat{G}|. Similarly, let S = supp(\beta) and G = supp(\gamma), with sizes s = |S| and g = |G|. We suppose there exist a sequence a_{n,1} \to 0 and a constant \kappa > 0 such that \|\tilde{\beta} - \beta\|_2 + \|\tilde{\gamma} - \gamma\|_2 \le \hat{C} a_{n,1} and \delta_{min}(s) \ge \kappa with probability going to one, where \hat{C} is some positive constant. This leads to the screening property that S \subset \hat{S} and G \subset \hat{G} if

\min\left\{ \min_{j \in S} |\beta_j|, \ \min_{i \in G} |\gamma_i| \right\} > \hat{C} a_{n,1}.   (23)

Based on these properties and assumptions, Algorithm 1 has the following algorithmic and statistical convergence property.

Theorem 1 [32]. Let

\lambda_\beta \ge 2 C R_w \sqrt{\frac{\sigma^2 \log p}{n}}, \qquad \lambda_\gamma \le \frac{C n}{R_w \max_{j \in S} \sum_{i \in G} |a_{ij}|} \sqrt{\frac{\sigma^2 \log p}{n}}

for a constant C > \sqrt{2} and R_w > 0, where R_w satisfies \min_{j \in S} w_{\beta,j} \ge R_w^{-1} and \max_{i \in G} w_{\gamma,i} \le R_w. Then, for any iteration \kappa \ge 1 and any initial value \beta_{init} in Algorithm 1,

\| \beta^{(\kappa)} - \beta \|_2 \le \rho^{\kappa} \| \beta_{init} - \beta \|_2 + 2 \sqrt{s}\, \lambda_\beta \left( R_w^{-1} + \max_{j \in S} w_{\beta,j} \right) \sum_{i=0}^{\kappa-1} \rho^{i}   (24)

with probability going to one, where \rho \in (0, 1) is a contraction factor given in [32].

Theorem 1 shows that the output of Algorithm 1 is globally optimal. More importantly, it indicates that even if the preliminary estimates are not highly accurate, we can improve their accuracy by minimizing Eq. (12) with different tuning parameters. This ensures that a not very accurate preliminary depth d_{ij}, which is affected by the outliers, does not affect the performance of the proposed algorithm with the penalty of Eq. (19). In short, Algorithm 1 is robust for outlier removal and converges to the global optimum.

Furthermore, to simplify the discussion and computation, suppose that the spectral decomposition of the hat matrix H is given by H = U D U^\top. Define an index set C = \{ i : D_{ii} = 0 \} and let U_c be formed by taking the corresponding columns of U. Then a reduced model of Eq. (18) can be formulated as follows,

\hat{\gamma} = \arg\min_{\gamma} \frac{1}{2} \| \hat{y} - \sqrt{n}\, B \gamma \|_2^2 + \sum_{i=1}^{n} w_{\gamma,i} P(\gamma_i; \lambda_i),   (25)

where \hat{y} = U_c^\top y and B = U_c^\top \in R^{(n-p) x n}. In the reduced model of Eq. (25), the term A\beta has disappeared, so \beta does not affect the robustness of outlier removal theoretically. Correspondingly, we confirm that a not very accurate depth estimate d_{ij} does not affect outlier removal.
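Numerically, the reduced model of Eq. (25) is formed directly from an eigendecomposition of the hat matrix. The sketch below (ours) selects the eigenvectors with zero eigenvalue, builds B = U_c^T, and checks the orthonormality property B B^T = I that the wavelet-approximation argument of Section 5.3 relies on.

```python
import numpy as np

def reduced_model(A, y, tol=1e-10):
    """Form the reduced model of Eq. (25): y_hat = B @ y with B = U_c^T,
    where U_c collects the eigenvectors of the hat matrix H = U D U^T
    whose eigenvalues are zero."""
    H = A @ np.linalg.solve(A.T @ A, A.T)            # hat matrix
    D, U = np.linalg.eigh(H)                         # H = U diag(D) U^T
    B = U[:, np.abs(D) < tol].T                      # (n - p) x n
    assert np.allclose(B @ B.T, np.eye(B.shape[0]))  # orthonormal rows
    assert np.allclose(B @ A, 0.0)                   # A beta drops out
    return B @ y, B
```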



5.3. Tuning parameter selection

According to the above analysis, we focus on the solution of the penalized linear regression of Eq. (18). The tuning parameter \lambda in the cost function of Eq. (18) is very important for overcoming masking and swamping, and it is an essential trade-off factor: if \lambda is too large, masking will inevitably appear; on the contrary, if \lambda is too small, swamping cannot be avoided. In the presence of noise, the choice of \lambda becomes more difficult. Even if we assume that the standardized robust residuals are i.i.d. N(0, 1), a data-dependent \lambda is still difficult to choose, because a large prediction error can indicate either a good observation with large noise or an outlier. We find that the regression model of Eq. (18) is a wavelet approximation problem [31], since B B^\top = I. This makes it possible to get an optimal \lambda, for all (transformed) observations are clean. Without loss of generality, we could generate the solution path by decreasing \lambda from \| (I - H) y \,./\, \sqrt{diag(I - H)} \|_\infty to 0.

The parameter selection strategy identifies, along the solution path, the tuning parameter value that optimizes some criterion, such as Generalized Cross-Validation (GCV) [33], Akaike's Information Criterion (AIC) [34] or the Bayesian Information Criterion (BIC) [35]. It is well known that GCV-based and AIC-based methods are not consistent for model selection. The BIC is a criterion for model selection among a finite set of models. It is based, in part, on the likelihood function; when fitting models, it is possible to increase the likelihood by adding parameters, and the BIC counteracts this by introducing a penalty term for the number of parameters in the model. BIC-based tuning parameter selection [36] has been shown to be consistent for model selection in several settings, so the BIC may be employed to determine the value of \lambda. We design a BIC from the viewpoint of penalized regression to choose the best parameter \lambda in outlier detection. In this paper, the BIC is given by

BIC = m \log(RSS / m) + k(\lambda)(\log(m) + 1),   (26)

where m = n - p, RSS = \| (I - H)(y - \hat{\gamma}) \|_2^2, and k(\lambda) = DF(\lambda) + 1 = |nz(\lambda)| + 1 with nz(\lambda) = \{ i : \hat{\gamma}_i(\lambda) \neq 0 \}.

Since the maximum a posteriori probability of the model is equivalent to the minimization of the BIC, we choose the optimal \lambda by i* = argmin_i BIC^{(i)} along the regularization path, where i* is the index into the \lambda solution path. After obtaining the optimal \lambda, the corresponding cost function of Eq. (18) can remove outliers robustly against masking and swamping.

6. The algorithm

We again take SFM with known camera orientation as the example to introduce the proposed approach in Algorithm 2. Inspired by the literature [37], we utilize the connection between penalized linear regression and the thresholding function to simplify the calculation. Specifically, we update \gamma^{(j+1)} by \Theta(H\gamma^{(j)} + r; \lambda) in Step 10 instead of solving Eq. (25) directly. This can be carried out without recalculating \beta, which is why a not very accurate depth d_{ij} does not affect the performance of the proposed algorithm. After executing the outlier removal process, much more accurate d_{ij} and \beta can be computed based on the clean data. The algorithms for the other multiview geometry problems are very similar to Algorithm 2.

Algorithm 2: Outlier removal using non-convex penalized regression.

Input: The image points {u_ij}, j = 1, 2, ..., M, containing outliers, and the camera orientations R_i, i = 1, 2, ..., N.
Output: The subset {u_ij} of image points whose RMS error of model fitting is less than a threshold eps > 0.

1   Obtain the initial 3D coordinates of the points X_j and the camera positions t_i by an existing linear algorithm, such as DLT [1];
2   Reformulate the data as the linear regression model y = A beta + sqrt(n) gamma + eps, A in R^{n x p}, y in R^n, lambda = 0, gamma^(0) in R^n;
3   j <- 0, i <- 0, gamma^(j) <- gamma^(0);
4   converged <- FALSE;
5   H <- HatMatrix(A) = A (A^T A)^{-1} A^T;
6   r <- y - H y;
7   lambda_upper <- || (I - H) y ./ sqrt(diag(I - H)) ||_inf;
8   while lambda < lambda_upper do
9       while not converged do
10          gamma^(j+1) <- Theta(H gamma^(j) + r; lambda);
11          converged <- || gamma^(j+1) - gamma^(j) ||_inf < eps;
12          j <- j + 1;
13      end
14      Deliver gamma_hat^(i) <- gamma^(j);
15      BIC^(i) <- m log(RSS / m) + k(lambda)(log(m) + 1);
16      lambda <- lambda + 0.01;
17      i <- i + 1;
18  end
19  i* <- argmin_i BIC^(i), gamma* <- gamma_hat^(i*);
20  According to the indices of the nonzero elements of gamma*, remove the corresponding rows of A; reformulating the data in this way yields the cleaned points u_ij.
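A direct transcription of Algorithm 2 into Python/NumPy is sketched below (ours; it follows the listing above, including the fixed lambda step of 0.01, and uses the BIC of Eq. (26) for the path selection). It assumes A and y have already been assembled as in Section 3 and returns a boolean inlier mask.

```python
import numpy as np

def remove_outliers(A, y, eps=1e-6, lam_step=0.01, max_inner=1000):
    """Sketch of Algorithm 2: hard-thresholding iterations along a lambda
    path, with model selection by the BIC of Eq. (26). Returns a boolean
    mask of the observations kept as inliers (gamma_i = 0)."""
    n, p = A.shape
    H = A @ np.linalg.solve(A.T @ A, A.T)          # hat matrix (Step 5)
    r = y - H @ y                                  # residual (Step 6)
    lam_upper = np.max(np.abs(r) / np.sqrt(np.diag(np.eye(n) - H)))  # Step 7
    m = n - p
    best_bic, best_gamma = np.inf, np.zeros(n)
    lam = 0.0
    while lam < lam_upper:                         # Steps 8-18
        gamma = np.zeros(n)
        for _ in range(max_inner):                 # Steps 9-13
            g_new = H @ gamma + r
            g_new[np.abs(g_new) <= lam] = 0.0      # Theta(.; lam), Step 10
            done = np.max(np.abs(g_new - gamma)) < eps
            gamma = g_new
            if done:
                break
        rss = max(np.sum(((np.eye(n) - H) @ (y - gamma)) ** 2), 1e-12)
        k = np.count_nonzero(gamma) + 1            # k(lam) = |nz(lam)| + 1
        bic = m * np.log(rss / m) + k * (np.log(m) + 1)   # Eq. (26), Step 15
        if bic < best_bic:                         # track Step 19 on the fly
            best_bic, best_gamma = bic, gamma
        lam += lam_step                            # Step 16
    return best_gamma == 0                         # Step 20: keep clean rows
```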


7. Experimental results and analysis

We verify our method on three typical multiview geometric problems: triangulation, homography estimation and SFM with known camera rotation. Although our experiments only consider these problems, the proposed method is general and applicable to any multiview geometric problem, such as camera calibration, the trifocal tensor and so on. We have tested the proposed method on both synthetic and real data. The synthetic data comes from the L-infinity-1.0 toolbox (1); the real data include the VGG data (2) and the Notre Dame data (3). The experimental environment is a standard PC (P8600, 6 GB, 64 bits). Our algorithm is coded in MATLAB 2011a, and we use the MATLAB profiler to report the timings and comparisons. To solve the LP and SOCP problems, we use the freely available MOSEK (4) and SeDuMi [38].

(1) See http://www.maths.lth.se/matematiklth/personal/fredrik/download.html.
(2) See http://www.robots.ox.ac.uk/~vgg/data.html.
(3) See http://www.maths.lth.se/matematiklth/personal/calle.
(4) See http://www.mosek.com/.

7.1. Triangulation

We simulated a 3D scene with 50 points within a cube and set 100 views in front of the scene. The corresponding points of the synthetic images are normalized into (-1, +1) and Gaussian noise up to 0.01 is added randomly. This amounts to a 100-view and 50-point triangulation problem. Each point is processed independently. We add various proportions of outliers to these 5000 image points.


Table 1
Performance of the proposed method on triangulation for synthetic data in terms of the masking (M) and swamping (S) effects. We also report the RMS reprojection error before and after outlier removal.

| Scale | Proportion | M(%) 1-slack | M(%) K-slack | M(%) Ours | S(%) 1-slack | S(%) K-slack | S(%) Ours | RMS before | RMS 1-slack | RMS K-slack | RMS Ours |
|-------|-----------|--------------|--------------|-----------|--------------|--------------|-----------|------------|-------------|-------------|----------|
| 0.1   | 0.10 | 0.1760 | 0.2200 | 0.0240 | 0.7260 | 0.6978 | 0.0000 | 0.0332 | 0.0237 | 0.0252 | 0.0063 |
|       | 0.20 | 0.2020 | 0.2060 | 0.0080 | 0.7198 | 0.6943 | 0.0000 | 0.0451 | 0.0362 | 0.0351 | 0.0067 |
|       | 0.30 | 0.1813 | 0.2167 | 0.0087 | 0.7174 | 0.6911 | 0.0000 | 0.0550 | 0.0424 | 0.0423 | 0.0073 |
|       | 0.40 | 0.2040 | 0.2160 | 0.0110 | 0.7023 | 0.6830 | 0.0007 | 0.0635 | 0.0518 | 0.0512 | 0.0079 |
| 0.25  | 0.10 | 0.0320 | 0.0340 | 0.0000 | 0.7144 | 0.6862 | 0.0007 | 0.0814 | 0.0213 | 0.0199 | 0.0050 |
|       | 0.20 | 0.0250 | 0.0310 | 0.0000 | 0.7028 | 0.6763 | 0.0003 | 0.1117 | 0.0244 | 0.0320 | 0.0049 |
|       | 0.30 | 0.0327 | 0.0347 | 0.0000 | 0.6814 | 0.6569 | 0.0003 | 0.1369 | 0.0409 | 0.0439 | 0.0050 |
|       | 0.40 | 0.0310 | 0.0305 | 0.0000 | 0.6720 | 0.6487 | 0.0010 | 0.1582 | 0.0501 | 0.0479 | 0.0049 |
| 0.5   | 0.10 | 0.0000 | 0.0000 | 0.0000 | 0.7073 | 0.6742 | 0.0002 | 0.1600 | 0.0049 | 0.0048 | 0.0050 |
|       | 0.20 | 0.0000 | 0.0010 | 0.0000 | 0.6850 | 0.6503 | 0.0003 | 0.2235 | 0.0049 | 0.0071 | 0.0050 |
|       | 0.30 | 0.0000 | 0.0000 | 0.0000 | 0.6697 | 0.6171 | 0.0014 | 0.2739 | 0.0050 | 0.0049 | 0.0050 |
|       | 0.40 | 0.0000 | 0.0000 | 0.0000 | 0.6577 | 0.5960 | 0.0010 | 0.3163 | 0.0048 | 0.0047 | 0.0049 |


Table 2
Performance of various methods on homography estimation in terms of the number of removed matches. We also report the required CPU seconds and the RMS reprojection error (pixels).

| Data (matches) | Method  | Removed | CPU   | RMS  |
|----------------|---------|---------|-------|------|
| Keble (167)    | 1-slack | 96      | 2.99  | 0.77 |
|                | K-slack | 76      | 2.99  | 0.76 |
|                | MSAC    | 68      | 0.49  | 0.76 |
|                | VFC     | 25      | 0.45  | 0.59 |
|                | Ours    | 25      | 0.20  | 0.57 |
| Graft (437)    | 1-slack | 259     | 12.67 | 1.01 |
|                | K-slack | 272     | 3.54  | 0.93 |
|                | MSAC    | 166     | 0.41  | 0.90 |
|                | VFC     | 78      | 0.83  | 1.59 |
|                | Ours    | 163     | 2.11  | 0.86 |


The proportion of outliers K ranges from 0.10 to 0.40 (theoretically, 0.50 is the upper bound, and the model changes when the proportion of outliers exceeds 0.50). The number of outliers is 5000 x K. In order to validate the robustness of our algorithm to masking and swamping, we add outlier points with different scales. We evaluate the performance of outlier removal using the following three measures [39]:

- M: the mean masking probability (fraction of undetected true outliers).
- S: the mean swamping probability (fraction of good points labeled as outliers).
- RMS: root mean square reprojection error of the fitted model for points in images.

Ideally, M = 0, S = 0 and RMS = 0. We use the well-known algorithm based on the L-infinity norm [3] to solve the triangulation problem. This algorithm is guaranteed to obtain the globally optimal solution, but it is not robust to outliers. We make comparisons with the state-of-the-art 1-slack and K-slack methods. Of course, in accordance with the principles of triangulation, we ensure that each 3D point has at least 3 corresponding image points.

Table 1 shows the performance of the proposed method and the compared algorithms on triangulation for synthetic data (with outlier scales of 0.1, 0.25 and 0.5). Whatever the scale and proportion of outliers, our method is almost immune to swamping (at most 15 points are mislabeled as outliers). From the RMS reprojection errors, we find that this slight swamping has little influence on our model fitting. On the other hand, the 1-slack and K-slack methods show obvious swamping; this is the reason that their RMS is slightly smaller than ours when the scale is 0.5.

Next, we analyze the effect of masking. When the scale is 0.1, we find that some outliers are not cleared away by any of the methods, but our M value is much smaller than those of the compared methods. The reason is that Gaussian noise up to 0.01 is randomly added to the image points, so some outliers effectively become noise when the outlier scale is relatively small. This conclusion can be verified from the RMS error: the triangulation solver is a global optimization method based on the L-infinity norm [3], which is very sensitive to outliers, so if the undetected (masked) points were real outliers, the RMS error would be larger. In fact, the RMS error of triangulation on the outlier-removed data is far less than that on the original data. When the outlier scale increases to 0.25 and 0.5, our method can still remove all outliers. However, the compared methods show a serious masking effect, leading to bigger RMS; only when the scale factor is 0.5 can the comparison algorithms conquer masking. The experimental results validate that the proposed algorithm is better than the baseline algorithms in both swamping and masking, no matter what scale and proportion of outliers are added to the synthetic data.
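For reproducibility, the masking and swamping rates are easy to compute on synthetic data where the ground-truth outlier labels are known; a minimal sketch (ours) follows.

```python
import numpy as np

def masking_swamping(true_outlier, flagged_outlier):
    """M: fraction of true outliers left undetected (masking);
    S: fraction of good points labeled as outliers (swamping)."""
    true_outlier = np.asarray(true_outlier, bool)
    flagged_outlier = np.asarray(flagged_outlier, bool)
    M = np.mean(~flagged_outlier[true_outlier]) if true_outlier.any() else 0.0
    S = np.mean(flagged_outlier[~true_outlier]) if (~true_outlier).any() else 0.0
    return M, S
```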

7.2. Homography estimate

Our second experiment is point matching for homography estimation. We used two image pairs in the experiments. The SIFT descriptor [40] and the traditional BBF algorithm are used to find tentative matches across images. The Keble data has 167 matches and the Graft data contains 437 matches. It is hard to assess a homography against ground truth for real data, so we consider three performance measures: the overall CPU time required to remove the outliers, the overall number of removed matches, and the model quality measured by the RMS error on the cleaned data. When running the various algorithms, there is a significant drop in the RMS error, indicating that the removed correspondences are really outliers. We compare our method with the 1-slack and K-slack methods (source code and test data) (5), the MSAC method (6) and the VFC method (7). The results of MSAC may not be identical between runs because of its randomized nature, so we repeat MSAC 100 times and take the mean as the experimental result. Table 2 shows the performance of the various algorithms on homography estimation.

(5) http://users.cecs.anu.edu.au/~jinyu/.
(6) See https://cn.mathworks.com/help/vision/ref/estimategeometrictransform.html.
(7) See https://sites.google.com/site/jiayima2013/.

For both the Keble and Graft data, the 1-slack method removes the most image pairs, but its RMS is the largest. The reason is that some outliers fail to be removed while some inliers are wrongly removed.

[Figure] Fig. 1. Matching results and mosaics of two sets.


[Figure] Fig. 2. Histograms of the RMS error on the four data sets, comparing 1-slack, K-slack and our method: (a) Dino, (b) UWO, (c) House, (d) Cathedral.


Table 3
Performance of various methods on the SFM data (number of cameras in parentheses) in terms of the total number of removed 2D observations (with the corresponding number of removed 3D points in parentheses). We also report the required CPU seconds and the RMS reprojection error (pixels) of the model returned by each method.

| Data                                            | Method  | Removed      | CPU    | RMS  |
|-------------------------------------------------|---------|--------------|--------|------|
| Dino (36), 4983 3D pts, 16,432 observ.          | 1-slack | 6094 (1501)  | 747.16 | 0.70 |
|                                                 | K-slack | 6843 (1707)  | 46.97  | 0.42 |
|                                                 | Ours    | 6935 (1610)  | 126.62 | 0.42 |
| UWO (57), 8990 3D pts, 27,722 observ.           | 1-slack | 3742 (679)   | 385.57 | 0.75 |
|                                                 | K-slack | 1565 (262)   | 125.56 | 0.69 |
|                                                 | Ours    | 1274 (257)   | 132.40 | 0.69 |
| House (12), 12,475 3D pts, 35,470 observ.       | 1-slack | 5914 (1252)  | 254.42 | 0.70 |
|                                                 | K-slack | 4989 (847)   | 168.90 | 0.70 |
|                                                 | Ours    | 6507 (838)   | 95.89  | 0.52 |
| Cathedral (17), 16,961 3D pts, 46,045 observ.   | 1-slack | 8613 (2189)  | 591.71 | 0.73 |
|                                                 | K-slack | 6650 (1491)  | 255.85 | 0.76 |
|                                                 | Ours    | 4482 (544)   | 383.81 | 0.61 |


The result of the K-slack method is better than 1-slack, because it removes fewer image pairs while the RMS is almost equal. So, the 1-slack and K-slack methods are not robust to masking and swamping. The MSAC method shows an RMS error close to the K-slack method, but removes fewer pairs. This means that MSAC is more robust to masking than the K-slack method; its main disadvantages are its randomness and its model-dependent parameter settings. The VFC method has the same robustness and efficiency as our method on the Keble data, but it removes the fewest pairs and has the biggest RMS error on the Graft data. The reason is that the VFC method does not consider the geometric constraints between the images, so it is not robust to masking for homography estimation, where an obvious projective transformation is present.

For our method, the number of removed pairs is significantly less than for the compared algorithms. Encouragingly, our RMS error is also obviously less than for the other algorithms, which shows that our method is robust to masking and swamping. In order to show our advantages more intuitively, the matching results of 1-slack, K-slack, MSAC, VFC and ours, together with the mosaics derived from our approach, are shown in Fig. 1.

7.3. SFM with known camera orientation

In this experiment, we evaluate our method on a more challenging problem, i.e. SFM with known camera orientation. We obtained the experimental data sets from the web³; the camera calibration and rotation have been calculated in advance. There are four data sets: Dino, UWO, House and Cathedral. For each set, the number of cameras, of 3D points (visible in at least 2 cameras) and of observed 2D points is listed in Table 3. The Dino data is relatively clean, so we randomly perturbed 15% of the original data by up to 5 pixels to create a more realistic test setting. The other data sets were pre-cleaned by RANSAC and contain only around 1% outliers. Although the outlier and noise levels are low, directly applying a bundle adjustment method to these data results in an RMS error of around 3 pixels [41]. To test whether a lower reprojection error can be achieved by removing outliers, we set the error tolerance ε to 2 pixels in all experiments. Table 3 compares our method with the 1-slack and K-slack methods. For the Dino data, the 1-slack algorithm removes the fewest outliers, but its RMS error is the largest, which means that some real outliers are not successfully removed by the 1-slack method.
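As a sketch of the corruption protocol applied to the Dino data above, the snippet below perturbs a random 15% of the 2D observations by offsets of up to 5 pixels. The uniform offset model and the function name are our assumptions; the paper states only the fraction and the magnitude bound.

    import numpy as np

    rng = np.random.default_rng(0)

    def perturb_observations(obs, frac=0.15, max_shift=5.0):
        # Hedged sketch: corrupt a random fraction of (N, 2) image points
        # by per-coordinate offsets of at most max_shift pixels; return the
        # corrupted copy and the indices of the synthetic outliers.
        obs = np.array(obs, dtype=float)                 # work on a copy
        idx = rng.choice(len(obs), size=int(frac * len(obs)), replace=False)
        obs[idx] += rng.uniform(-max_shift, max_shift, size=(len(idx), 2))
        return obs, idx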


The RMS error of the K-slack method is equivalent to that of our method, but we remove fewer outliers than K-slack does. This shows that the K-slack method removed some inliers (the swamping effect). For the UWO data, the 1-slack method removes the most outliers, yet its RMS error is the largest. This is the worst case, since it means that some inliers are falsely removed while some outliers are left in place. As on the Dino data, our method removes the fewest outliers while its RMS error again matches that of K-slack. The results on House and Cathedral are more encouraging: the RMS errors of our method are clearly lower than those of the 1-slack and K-slack methods, as can also be seen from the histograms of RMS error in Fig. 2, and at the same time the number of removed 3D points is the smallest (Table 3). This confirms that our method sweeps away real outliers while keeping real inliers.

In summary, the non-convex penalized regression approach is more robust to masking and swamping than these state-of-the-art approaches. Its only drawback is that it is slightly less efficient than the K-slack method.
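When ground-truth labels are available, as with the synthetic perturbation described above, the masking and swamping behavior discussed throughout can be quantified directly. The sketch below computes the two rates; the function and variable names are ours, since the paper reports removal counts and RMS errors rather than these rates explicitly.

    import numpy as np

    def masking_swamping_rates(removed_idx, true_outlier_idx, n_points):
        # Masking rate: fraction of true outliers that go undetected.
        # Swamping rate: fraction of true inliers that are wrongly removed.
        removed = np.zeros(n_points, dtype=bool)
        removed[np.asarray(removed_idx, dtype=int)] = True
        outlier = np.zeros(n_points, dtype=bool)
        outlier[np.asarray(true_outlier_idx, dtype=int)] = True
        masking = np.mean(~removed[outlier])
        swamping = np.mean(removed[~outlier])
        return masking, swamping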

8. Conclusion

Identification and removal of potential outliers are very important for robust data regression, and many multiview geometric applications have shown that estimated models are biased in the presence of multiple outliers. In this paper, we first surveyed several state-of-the-art approaches from the viewpoint of masking and swamping and characterized typical model fitting problems in multiview geometry as a linear regression model. We then proposed a novel non-convex penalized regression for outlier removal and analyzed the robustness of the solver. Compared with previous methods, our approach is markedly more robust to masking and swamping. Experimental results on simulated data and real images demonstrate the superior practical performance of our method over those state-of-the-art approaches. Beyond outlier removal, our method can be extended to many other model fitting problems in computer vision.
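For readers who want the flavor of such a penalized regression in code, the sketch below follows the generic mean-shift formulation y = Xβ + γ + ε with a sparse outlier vector γ, alternating a least-squares update of β with residual thresholding in the spirit of She and Owen [30]. It is an illustrative reconstruction under a hard-thresholding (ℓ0-type, non-convex) penalty, not the exact solver of this paper; the threshold lam and the iteration count are placeholder choices.

    import numpy as np

    def penalized_outlier_removal(X, y, lam, n_iter=100):
        # Hedged sketch of thresholding-based penalized regression [30]:
        # alternate a least-squares update of beta with hard thresholding
        # of the residuals to update the sparse outlier vector gamma.
        gamma = np.zeros_like(y)
        for _ in range(n_iter):
            beta = np.linalg.lstsq(X, y - gamma, rcond=None)[0]
            r = y - X @ beta
            gamma = np.where(np.abs(r) > lam, r, 0.0)  # hard threshold
        outliers = np.flatnonzero(gamma)               # indices flagged
        return beta, outliers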

Acknowledgments

We would like to thank Prof. Rene Vidal of the Johns Hopkins University for helpful discussions. The first author also gratefully acknowledges the support of the China Scholarship Council (no. 201606295051) and the Top International University Visiting Program for Outstanding Young Scholars of Northwestern Polytechnical University during his visit to the Johns Hopkins University.

References

[1] R. Hartley, A. Zisserman, Multiple View Geometry in Computer Vision, Cambridge University Press, 2000.
[2] R. Hartley, F. Schaffalitzky, L∞ norm minimization in geometric reconstruction problems, in: Proceedings of Conference on Computer Vision and Pattern Recognition, 1, 2004, pp. 504–509.
[3] F. Kahl, R. Hartley, Multiple-view geometry under the L∞ norm, IEEE Trans. Pattern Anal. Mach. Intell. 30 (9) (2008) 1603–1617.
[4] Q. Ke, T. Kanade, Quasiconvex optimization for robust geometric reconstruction, IEEE Trans. Pattern Anal. Mach. Intell. 29 (10) (2007) 1834–1847.
[5] G. Zhou, Q. Wang, Enhanced continuous tabu search for parameter estimation in multiview geometry, in: Proceedings of the International Conference on Computer Vision, 2013, pp. 3240–3247.
[6] K. Sim, R. Hartley, Removing outliers using the L∞ norm, in: Proceedings of Conference on Computer Vision and Pattern Recognition, 2006, pp. 485–494.
[7] M.A. Fischler, R.C. Bolles, Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography, Commun. ACM 24 (6) (1981) 381–395.
[8] E. Acuna, C. Rodriguez, A meta analysis study of outlier detection methods in classification, Technical report, Department of Mathematics, University of Puerto Rico at Mayaguez, 2004.
[9] P.H. Torr, A. Zisserman, MLESAC: a new robust estimator with application to estimating image geometry, Comput. Vis. Image Underst. 78 (1) (2000) 138–156.
[10] C.V. Stewart, Bias in robust estimation caused by discontinuities and multiple structures, IEEE Trans. Pattern Anal. Mach. Intell. 19 (8) (1997) 818–833.
[11] J. Ma, J. Zhao, J. Tian, A.L. Yuille, Z. Tu, Robust point matching via vector field consensus, IEEE Trans. Image Process. 23 (4) (2014) 1706–1721.
[12] J. Tang, L. Shao, X. Zhen, Robust point pattern matching based on spectral context, Pattern Recognit. 47 (3) (2014) 1469–1484.
[13] Y. Seo, H. Lee, S.W. Lee, Outlier removal by convex optimization for L-infinity approaches, in: Advances in Image and Video Technology, Springer, 2009, pp. 203–214.
[14] C. Olsson, A. Eriksson, R. Hartley, Outlier removal using duality, in: Proceedings of Conference on Computer Vision and Pattern Recognition, 2010, pp. 1450–1457.
[15] J. Yu, A. Eriksson, T.-J. Chin, D. Suter, An adversarial optimization approach to efficient outlier removal, in: Proceedings of the International Conference on Computer Vision, 2011, pp. 399–406.
[16] H. Li, A practical algorithm for L∞ triangulation with outliers, in: Proceedings of Conference on Computer Vision and Pattern Recognition, 2007, pp. 1–8.
[17] C. Olsson, O. Enqvist, F. Kahl, A polynomial-time bound for matching and registration with outliers, in: Proceedings of Conference on Computer Vision and Pattern Recognition, 2008, pp. 738–751.
[18] H. Li, Consensus set maximization with guaranteed global optimality for robust geometry estimation, in: Proceedings of the International Conference on Computer Vision, 2009, pp. 1074–1080.
[19] O. Enqvist, E. Ask, F. Kahl, K. Åström, Robust fitting for multiple view geometry, in: Proceedings of the European Conference on Computer Vision, Springer, 2012, pp. 738–751.
[20] H. Zhao, B. Jiang, Image matching using a local distribution based outlier detection technique, Neurocomputing 148 (2015) 611–618.
[21] K. Zhang, M. Luo, Outlier-robust extreme learning machine for regression problems, Neurocomputing 151 (2015) 1519–1527.
[22] C. Chen, C. Yan, A robust weighted least squares support vector regression based on least trimmed squares, Neurocomputing 168 (2015) 941–946.
[23] G. Barreto, A. Barros, A robust extreme learning machine for pattern classification with outliers, Neurocomputing 176 (2015) 3–13.
[24] Q. Wang, T. Chen, L. Si, Sparse representation with geometric configuration constraint for line segment matching, Neurocomputing 134 (2014) 100–110.
[25] P. Sturm, B. Triggs, A factorization based algorithm for multi-image projective structure and motion, in: Proceedings of the European Conference on Computer Vision, 1996, pp. 709–720.
[26] P.J. Huber, Robust Statistics, Springer, 2011.
[27] P.J. Rousseeuw, A.M. Leroy, Robust Regression and Outlier Detection, 589, John Wiley & Sons, 2005.
[28] S.-S. Kim, S.H. Park, W. Krzanowski, Simultaneous variable selection and outlier identification in linear regression using the mean-shift outlier model, J. Appl. Stat. 35 (3) (2008) 283–291.
[29] A.N. Hirani, K. Kalyanaraman, Least squares ranking on graphs, arXiv:1011.1716v4, 2011.
[30] Y. She, A.B. Owen, Outlier detection using nonconvex penalized regression, J. Am. Stat. Assoc. 106 (494) (2011) 626–639.
[31] A. Antoniadis, J. Fan, Regularization of wavelet approximations, J. Am. Stat. Assoc. 96 (455) (2001) 939–967.
[32] S. Katayama, H. Fujisawa, Sparse and robust linear regression: an optimization algorithm and its statistical properties, arXiv:1505.05257, 2015.
[33] L. Breiman, Heuristics of instability and stabilization in model selection, Ann. Stat. 24 (6) (1996) 2350–2383.
[34] H. Zou, T. Hastie, R. Tibshirani, On the degrees of freedom of the lasso, Ann. Stat. 35 (5) (2007) 2173–2192.
[35] H. Wang, R. Li, C. Tsai, Tuning parameter selectors for the smoothly clipped absolute deviation method, Biometrika 94 (2007) 553–568.
[36] G. Schwarz, Estimating the dimension of a model, Ann. Stat. 6 (2) (1978) 461–464.
[37] Y. She, Thresholding-based iterative selection procedures for model selection and shrinkage, Electron. J. Stat. 3 (2009) 384–415.
[38] J.F. Sturm, Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones, Optim. Methods Softw. 11 (1–4) (1999) 625–653.
[39] L. McCann, Robust Model Selection and Outlier Detection in Linear Regressions, Ph.D. thesis, MIT, 2006.
[40] K. Mikolajczyk, C. Schmid, A performance evaluation of local descriptors, IEEE Trans. Pattern Anal. Mach. Intell. 27 (10) (2005) 1615–1630.
[41] B. Triggs, Factorization methods for projective structure and motion, in: Proceedings of Conference on Computer Vision and Pattern Recognition, IEEE, 1996, pp. 845–851.



Guoqing Zhou received the B.S. degree in Computer Science from the School of Computer Science, Northwestern Polytechnical University, in 2003. He then joined Northwestern Polytechnical University as a Lecturer. In 2009 and 2013 he obtained the Master and Ph.D. degrees in the School of Computer Science, Northwestern Polytechnical University. Dr. Guoqing Zhou is now an Associate Professor in the School of Computer Science, Northwestern Polytechnical University. He worked as a visiting scholar in the Center for Imaging Science (CIS) of The Johns Hopkins University from September 2016 to September 2017. His research interests include computer vision and computational photography, such as 3D reconstruction, global optimization, and light field imaging and processing. He has published more than 10 papers in international journals and conferences, such as TIP, ICCV and CVPR.

Qing Wang received the B.S. degree in information mathematics from the Department of Mathematics, Peking University, in 1991. He then joined Northwestern Polytechnical University as a Lecturer. In 1997 and 2000 he obtained the Master and Ph.D. degrees in the Department of Computer Science and Engineering, Northwestern Polytechnical University. In 2006, he was awarded the outstanding talent program of the new century by the Ministry of Education, China. Dr. Qing Wang is now a Professor and Ph.D. Tutor in the School of Computer Science, Northwestern Polytechnical University. He is a Member of IEEE and ACM, and a Senior Member of the China Computer Federation (CCF). He worked as a research assistant and research scientist in the Department of Electronic and Information Engineering, the Hong Kong Polytechnic University, from 1999 to 2002. He also worked as a visiting scholar in the School of Information Engineering, The University of Sydney, Australia, in 2003 and 2004. In 2009 and 2012, he visited the Human Computer Interaction Institute, Carnegie Mellon University, for six months and the Department of Computer Science, University of Delaware, for one month. Prof. Wang's research interests include image processing, computer vision and computational photography, such as 3D reconstruction, object detection, tracking and recognition in dynamic environments, and light field imaging and processing. He has published more than 100 papers in international journals and conferences.


Zhaolin Xiao received the Bachelor and Master degrees in the Department of Computer Science and Engineering, Xi'an Technological University, in 2006 and 2009 respectively. He received the Ph.D. in the Department of Computer Science and Engineering, Northwestern Polytechnical University, in 2014. He then joined Xi'an University of Technology as a Lecturer. He is a Member of the China Computer Federation (CCF). He worked as a visiting scholar in the School of Computer Science, The University of Wisconsin-Madison, USA, in 2011 and 2012. In 2012, he visited the Department of Computer and Information Sciences, University of Delaware, for one month. Dr. Xiao's research interests include image processing, computer vision and computational photography. At present, his research focuses on image based 3D reconstruction, and light field imaging and processing.
