
Total least squares

Gerard MEURANT

October, 2008

1 Introduction to total least squares

2 Approximation of the TLS secular equation

3 Numerical experiments

Introduction to total least squares

In least squares (LS) we have only a perturbation of the right-hand side, whereas total least squares (TLS) considers perturbations of the vector of observations c and of the m × n data matrix A

minimize ‖(E  r)‖_F over E, r

subject to the constraint (A + E)x = c + r

This is finding the smallest perturbations E and r such that c + r is in the range of A + E

see Golub and Van Loan; Van Huffel and Vandewalle; Paige and Strakoš

Theorem (Golub and Van Loan)

Let C = (A  c) and Uᵀ C V = Σ be its SVD. Assume that the singular values of C are such that

σ_1 ≥ · · · ≥ σ_k > σ_{k+1} = · · · = σ_{n+1}

Then the solution of the TLS problem is given by

min ‖(E  r)‖_F = σ_{n+1}

and

x_TLS = −y / α

where the vector (yᵀ  α)ᵀ of norm 1 with α ≠ 0 is in the subspace S_k spanned by the right singular vectors {v_{k+1}, . . . , v_{n+1}} of V. If there is no such vector with α ≠ 0, the TLS problem has no solution.

The right singular vectors v_i are the eigenvectors of CᵀC and S_k is the invariant subspace associated with the smallest eigenvalue σ²_{n+1}.

The TLS solution x_TLS solves the eigenvalue problem

CᵀC (x; −1) = σ²_{n+1} (x; −1)

where (x; −1) denotes the column vector obtained by appending −1 to x.
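As an illustration, here is a minimal NumPy sketch (not from the slides) of this SVD-based construction; the function name tls_svd and the synthetic test data are only for demonstration.

```python
# Minimal sketch: solve the TLS problem via the SVD of C = (A  c),
# following the Golub-Van Loan theorem above.
import numpy as np

def tls_svd(A, c):
    """Return x_TLS and sigma_{n+1} for min ||(E r)||_F s.t. (A+E)x = c+r."""
    C = np.column_stack([A, c])
    # Right singular vectors are the rows of Vt; smallest singular value last.
    _, s, Vt = np.linalg.svd(C, full_matrices=False)
    v = Vt[-1, :]                 # right singular vector for sigma_{n+1}
    alpha = v[-1]                 # last component; must be nonzero
    if abs(alpha) < 1e-14:
        raise ValueError("alpha = 0: the TLS problem has no solution")
    x_tls = -v[:-1] / alpha       # x_TLS = -y / alpha
    return x_tls, s[-1]

# Small usage example with synthetic data (illustrative only)
rng = np.random.default_rng(0)
A = rng.standard_normal((100, 5))
c = A @ np.ones(5) + 0.01 * rng.standard_normal(100)
x, sigma = tls_svd(A, c)
```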

Theorem

If σ^A_n > σ_{n+1}, then x_TLS exists and is the unique solution of the TLS problem,

x_TLS = (AᵀA − σ²_{n+1} I)⁻¹ Aᵀc

Moreover, σ_{n+1} satisfies the following secular equation

σ²_{n+1} [ 1 + Σ_{i=1}^{n} d²_i / ((σ^A_i)² − σ²_{n+1}) ] = ρ²_LS

where the σ^A_i are the singular values of A, the vector d = Uᵀc (U from the SVD of A), and ρ²_LS = ‖c − Ax_LS‖²

The secular equation can also be written as

σ²_{n+1} = cᵀc − cᵀA (AᵀA − σ²_{n+1} I)⁻¹ Aᵀc

This is obtained by writing

[ AᵀA  Aᵀc ; cᵀA  cᵀc ] (x; −1) = σ²_{n+1} (x; −1)

and eliminating x
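A minimal sketch (illustration only, not from the slides): once σ_{n+1} is known, for instance from the SVD of C = (A  c) as in tls_svd above, the closed form gives x_TLS and the secular equation can be checked numerically.

```python
# Minimal sketch: recover x_TLS = (A'A - sigma_{n+1}^2 I)^{-1} A'c and check
# the secular equation sigma_{n+1}^2 = c'c - c'A (A'A - sigma_{n+1}^2 I)^{-1} A'c.
import numpy as np

def tls_from_secular(A, c, sigma_np1):
    n = A.shape[1]
    Atc = A.T @ c
    M = A.T @ A - sigma_np1**2 * np.eye(n)
    x_tls = np.linalg.solve(M, Atc)
    # Should be close to zero when sigma_np1 is the exact sigma_{n+1}
    residual = sigma_np1**2 - (c @ c - Atc @ x_tls)
    return x_tls, residual
```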

For data least squares (DLS), where only the matrix is perturbed, the secular equation is

cᵀc − cᵀA (AᵀA − σ²I)⁻¹ Aᵀc = 0

This can also be written as

cᵀ (AAᵀ − σ²I)⁻¹ c = 0
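For illustration (an assumption, not the slides' code), the DLS secular function in the second form can be evaluated directly; its relevant zero gives the DLS value of σ².

```python
# Minimal sketch: evaluate the DLS secular function f(s) = c' (A A' - s I)^{-1} c,
# with s standing for sigma^2.
import numpy as np

def dls_secular(A, c, s):
    m = A.shape[0]
    return c @ np.linalg.solve(A @ A.T - s * np.eye(m), c)
```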

[Figure: example of the TLS secular function as a function of σ²]

Approximation of the TLS secular equation

We approximate the quadratic form in the TLS secular equation by using one of the Golub-Kahan bidiagonalization algorithms with c as a starting vector. It reduces A to lower bidiagonal form and generates a matrix

C_k =
  | γ_1                     |
  | δ_1   γ_2               |
  |        .      .         |
  |          δ_{k-1}   γ_k  |
  |                    δ_k  |

a (k + 1) by k matrix such that C_kᵀ C_k = J_k, the tridiagonal matrix generated by the Lanczos algorithm for the matrix AᵀA

At iteration k we approximate the TLS secular equation by

cᵀc − ‖c‖² e_1ᵀ C_k (C_kᵀ C_k − σ²I)⁻¹ C_kᵀ e_1 = σ²

This corresponds to the Gauss quadrature rule

We use the SVD of C_k = U_k S_k V_kᵀ. Let σ^{(k)}_i be the singular values of C_k and ξ^{(k)} = U_kᵀ e_1. Then

(ξ^{(k)}_{k+1})² / σ² − Σ_{i=1}^{k} (ξ^{(k)}_i)² / ((σ^{(k)}_i)² − σ²) = 1 / ‖c‖²

We need to compute the smallest zero. Secular equation solvers use rational interpolation. When an approximate solution σ²_tls has been computed, we solve

x_tls = (AᵀA − σ²_tls I)⁻¹ Aᵀc
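A minimal NumPy/SciPy sketch of this Gauss-rule approach (illustrative, with my own naming, not the slides' implementation): lower bidiagonalization started from c, then the smallest zero of the approximate secular equation found by a bracketed root finder standing in for the rational-interpolation solver mentioned above.

```python
# Minimal sketch: Golub-Kahan lower bidiagonalization with c as starting
# vector, then the smallest zero of the Gauss-rule secular equation.
import numpy as np
from scipy.optimize import brentq

def golub_kahan_lower(A, c, k):
    """Build the (k+1) x k lower bidiagonal C_k (diagonal gamma_i,
    subdiagonal delta_i) with c/||c|| as the starting vector."""
    m, n = A.shape
    Ck = np.zeros((k + 1, k))
    u = c / np.linalg.norm(c)
    w = A.T @ u
    gamma = np.linalg.norm(w); v = w / gamma
    for i in range(k):
        Ck[i, i] = gamma                       # gamma_{i+1}
        w = A @ v - gamma * u
        delta = np.linalg.norm(w); u = w / delta
        Ck[i + 1, i] = delta                   # delta_{i+1}
        w = A.T @ u - delta * v
        gamma = np.linalg.norm(w); v = w / gamma
    return Ck

def secular_smallest_zero(Ck, c_norm2):
    """Smallest zero of the approximate secular equation, written in terms
    of the SVD of C_k as on the slide above (the unknown is s = sigma^2)."""
    U, svals, _ = np.linalg.svd(Ck)            # U is (k+1) x (k+1)
    xi = U[0, :]                               # xi = U_k^T e_1
    k = Ck.shape[1]
    def f(s):
        return xi[k]**2 / s - np.sum(xi[:k]**2 / (svals**2 - s)) - 1.0 / c_norm2
    smin2 = svals[-1]**2
    # The smallest zero lies in (0, sigma_min(C_k)^2)
    return brentq(f, 1e-12 * smin2, (1.0 - 1e-10) * smin2)

# Usage (with A, c as in the earlier sketches and k <= n):
#   Ck = golub_kahan_lower(A, c, k)
#   s_tls = secular_smallest_zero(Ck, c @ c)
#   x_tls = np.linalg.solve(A.T @ A - s_tls * np.eye(A.shape[1]), A.T @ c)
```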

The Gauss–Radau rule

We implement the Gauss–Radau rule by using the other Golub-Kahan bidiagonalization algorithm with Aᵀc as a starting vector. It reduces A to upper bidiagonal form. If

B_k =
  | γ_1   δ_1                  |
  |       γ_2    .             |
  |              .    δ_{k-1}  |
  |                   γ_k      |

the matrix B_k is the Cholesky factor of the Lanczos matrix J_k

To obtain the Gauss–Radau rule we must modify B_k to have a prescribed eigenvalue z. Let ω be the solution of

(B_kᵀ B_k − zI) ω = (γ_{k−1} δ_{k−1})² e_k

and let

ω̃_k = (z + ω_k) − (γ_{k−1} δ_{k−1})² / γ²_{k−1} = (z + ω_k) − δ²_{k−1}

The modified matrix giving the Gauss–Radau rule is

B̃_k =
  | γ_1   δ_1                  |
  |       γ_2    .             |
  |              .    δ_{k-1}  |
  |                   γ̃_k      |

where γ̃_k = √ω̃_k

Using B̃_k we solve the secular equation

‖c‖² − ‖Aᵀc‖² e_1ᵀ (B̃_kᵀ B̃_k − σ²I)⁻¹ e_1 = σ²

with the SVD of B̃_k
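The slides modify the bidiagonal factor directly. As an illustrative alternative (a plainly named swap, not the slides' code), the same prescribed-eigenvalue condition can be imposed on the tridiagonal Lanczos matrix J_k = B_kᵀB_k with the classical Golub-Meurant Gauss-Radau modification of its last diagonal entry; the modified γ̃_k then follows from the Cholesky relation (B̃_kᵀB̃_k)_{kk} = γ̃_k² + δ²_{k−1}.

```python
# Minimal sketch: classical Gauss-Radau modification of the last diagonal
# entry of a symmetric tridiagonal Lanczos matrix J_k so that the prescribed
# value z becomes an eigenvalue of the modified matrix.
# alpha: diagonal of J_k (length k), eta: off-diagonal (length k-1).
import numpy as np

def gauss_radau_last_entry(alpha, eta, z):
    k = len(alpha)
    # Leading (k-1) x (k-1) block of J_k
    J = np.diag(alpha[:k - 1]) + np.diag(eta[:k - 2], 1) + np.diag(eta[:k - 2], -1)
    rhs = np.zeros(k - 1)
    rhs[-1] = eta[k - 2] ** 2
    omega = np.linalg.solve(J - z * np.eye(k - 1), rhs)
    return z + omega[-1]    # modified (k, k) entry of J_k
```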

Numerical experiments

A_s = U_s Σ_s V_sᵀ,   U_s = I − 2 u_s u_sᵀ / ‖u_s‖²,   V_s = I − 2 v_s v_sᵀ / ‖v_s‖²

where u_s and v_s are random vectors and Σ_s is an m × n diagonal matrix with elements [1, . . . , √n]

Let x_s be a vector whose ith component is 1/i and c_s = A_s x_s

A = A_s + ξ randn(m, n)

The right-hand side is

c = c_s + ξ randn(m, 1)
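A minimal sketch of this test-problem construction (my assumption: the diagonal elements of Σ_s are equally spaced between 1 and √n; the function name is illustrative only).

```python
# Minimal sketch: build the synthetic TLS test problem described above.
import numpy as np

def make_tls_test(m, n, xi, seed=0):
    rng = np.random.default_rng(seed)
    us = rng.standard_normal(m)
    vs = rng.standard_normal(n)
    Us = np.eye(m) - 2.0 * np.outer(us, us) / (us @ us)
    Vs = np.eye(n) - 2.0 * np.outer(vs, vs) / (vs @ vs)
    Sigma = np.zeros((m, n))
    # Assumption: diagonal entries equally spaced between 1 and sqrt(n)
    Sigma[:n, :n] = np.diag(np.linspace(1.0, np.sqrt(n), n))
    As = Us @ Sigma @ Vs.T
    xs = 1.0 / np.arange(1, n + 1)        # ith component is 1/i
    cs = As @ xs
    A = As + xi * rng.standard_normal((m, n))
    c = cs + xi * rng.standard_normal(m)
    return A, c, xs

A, c, xs = make_tls_test(m=100, n=50, xi=0.3)   # the small TLS1 setting
```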

A small example

Example TLS1, m = 100, n = 50, BNS1, ε = 10⁻⁶

  ξ            L it.   s it.   sol.                  exact sol.
  0.3 · 10⁻²    30      57     0.01703479103104873   0.01703478979190218
  0.3 · 10⁻¹    26      49     0.169448388286749     0.1694483528865543
  0.3           28      73     1.464892131470029     1.464891451263777
  30            33      64     88.21012648624229     88.21012652906667

A larger example

- We are not able to store A, which is a dense matrix in Matlab
- We use the vectors u_s and v_s to do matrix multiplies with A_s or A_sᵀ
- We perturb the singular values in the same way as the right-hand side

Example TLS3, m = 10000, n = 5000, noise = 0.3, BNS1

  ε        L it.   s it.   min it.   max it.   av. it.   solution
  10⁻⁶      250     273       1         2       1.09     1.418582932414374
  10⁻¹⁰     328     660       1         3       2.01     1.418576233569240

It works fine but it is too expensive

The Gauss–Radau rule

Example TLS1, m = 100, n = 50, Gauss–Radau, noise = 0.3, ε = 10⁻⁶

  Met.   L it.   √z       s it.   min it.   max it.   solution
  Newt    28     σ_min     130       2        14      1.464891376927382
  Newt    28     σ_max      79       2         4      1.464892626809155
  Rat     28     σ_min      98       2         5      1.464891376927382
  Rat     28     σ_max      74       2         3      1.464892626809155

Example TLS3, m = 10000, n = 5000, Gauss–Radau, noise = 0.3, ε = 10⁻⁶

  Met.   L it.   √z       s it.   min it.   max it.   solution
  Newt    250    σ_min     2572      3        31      1.418576232676234
  Newt    250    σ_max     1926      3        26      1.418583305908228
  Rat     250    σ_min      837      2         5      1.418576232676233
  Rat     250    σ_max      653      2         4      1.418583305908227

Optimization of the algorithm

To reduce the cost:

- We monitor the convergence of the smallest singular value of A
- For this we solve a secular equation at every Lanczos iteration
- We use a third-order rational approximation and tridiagonal solves
- The Gauss and Gauss–Radau estimates are only computed at the end

Example TLS3, m = 10000, n = 5000, noise = 0.3, ε = 10⁻⁶

  Met.    L it.   trid   √z            s it.   solution
  -        250     551
  Gauss                  -               2     1.418582932414440
  G–R                    σ_min(B_k)      2     1.418582932414443
  G–R                    σ_max(B_k)      3     1.418583305908306

Example TLS4, m = 100000, n = 50000, noise = 0.3, ε = 10⁻⁶

  Met.    L it.   trid   √z            s it.   solution
  -        755    1775
  Gauss                  -               1     0.8721122166701496
  G–R                    σ_min(B_k)      2     0.8721122166735605
  G–R                    σ_max(B_k)      3     0.8721124331415380

For example TLS3 with m = 10000, n = 5000 and ε = 10⁻⁶:

- The computing time when solving for Gauss and Gauss–Radau at each iteration was 117 seconds
- With the last algorithm it is 12 seconds

J.R. Bunch, C.P. Nielsen and D.C. Sorensen, Rank-one modification of the symmetric eigenproblem, Numer. Math., v 31, (1978), pp 31–48

G.H. Golub and C. Van Loan, An analysis of the total least squares problem, SIAM J. Numer. Anal., v 17 n 6, (1980), pp 883–893

Ren-Cang Li, Solving secular equations stably and efficiently, Report UCB CSD-94-851, University of California, Berkeley, (1994)

A. Melman, A unifying convergence analysis of second-order methods for secular equations, Math. Comp., v 66 n 217, (1997), pp 333–344

A. Melman, A numerical comparison of methods for solving secular equations, J. Comp. Appl. Math., v 86, (1997), pp 237–249

C.C. Paige and Z. Strakoš, Bounds for the least squares residual using scaled total least squares, in Proc. 3rd int. workshop on TLS and error-in-variables modelling, S. Van Huffel and P. Lemmerling eds., Kluwer, (2001), pp 25–34

C.C. Paige and Z. Strakoš, Unifying least squares, total least squares and data least squares, in Proc. 3rd int. workshop on TLS and error-in-variables modelling, S. Van Huffel and P. Lemmerling eds., Kluwer, (2001), pp 35–44

C.C. Paige and Z. Strakoš, Bounds for the least squares distance using scaled total least squares problems, Numer. Math., v 91, (2002), pp 93–115

C.C. Paige and Z. Strakoš, Scaled total least squares fundamentals, Numer. Math., v 91, (2002), pp 117–146

C.C. Paige and Z. Strakoš, Core problems in linear algebraic systems, SIAM J. Matrix Anal. Appl., v 27 n 3, (2006), pp 861–874

S. Van Huffel and J. Vandewalle, The total least squares problem: computational aspects and analysis, SIAM, (1991)
