TRANSCRIPT
Condition Numbers of Numeric and Algebraic Problems
Stephen Vavasis
Department of Combinatorics & Optimization, University of Waterloo
2011-Nov-16 / Fields Inst. Workshop on Hybrid Symbolic-Numeric Computation
1/ 76
Outline
1 Condition numbers in general
2 Condition numbers of linear equations
3 Linear least squares
4 Eigenvalues
5 Linear Programming
6 Geometric condition numbers
7 Polynomial evaluation and roots
2/ 76
Condition numbers in general
Outline
1 Condition numbers in general
2 Condition numbers of linear equations
3 Linear least squares
4 Eigenvalues
5 Linear Programming
6 Geometric condition numbers
7 Polynomial evaluation and roots
3/ 76
Condition numbers in general
Condition number definition
Given a real-number problem, that is, a function Φ : R^m → R^n, the condition number of an instance x means its sensitivity to small perturbations. In particular:

cond num = lim_{ε→0} sup_{‖y‖≤ε} ‖Φ(x + y) − Φ(x)‖ / ‖y‖

(absolute measurement). Or perhaps

cond num = lim_{ε→0} sup_{‖y‖≤ε} (‖Φ(x + y) − Φ(x)‖/‖Φ(x)‖) / (‖y‖/‖x‖)

(relative measurement).
4/ 76
Condition numbers in general
Details
Details to specify:
Precise definition of input and output
Relative or absolute? (applies to both the input and output)
Which part of the data is perturbed?
What norm is used to measure sensitivity?
5/ 76
Condition numbers in general
Uses of a condition number
Condition numbers determine the best possible accuracy of the solution in the presence of approximations made by the computation.
Condition numbers sometimes bound the convergence speed of iterative methods.
Condition numbers sometimes measure the distance of an instance to singularity.
Condition numbers sometimes shed light on preconditioning.
6/ 76
Condition numbers in general
Condition numbers and floating-point algorithms
Condition numbers are properties of an instance (i.e., the data), not of any particular algorithm.
Condition numbers set achievable limits for algorithms.
Condition number analysis may indicate that certain algorithmic choices are unwise.
Condition numbers often reveal useful geometric properties of the instance.
7/ 76
Condition numbers of linear equations
Outline
1 Condition numbers in general
2 Condition numbers of linear equations
3 Linear least squares
4 Eigenvalues
5 Linear Programming
6 Geometric condition numbers
7 Polynomial evaluation and roots
8/ 76
Condition numbers of linear equations
Condition number of linear equations
The most famous classic example (Von Neumann & Goldstine; Turing) is the condition number of solving linear equations.
The function Φ is Φ(A, b) = A^{-1}b, A ∈ R^{n×n}, b ∈ R^n.
Sensitivities are understood in the relative sense. All data may be perturbed. The matrix norm is the one induced by the vector norm.
9/ 76
Condition numbers of linear equations
Condition number of linear equations (cont'd)
Theorem: Suppose x = A^{-1}b and x + Δx = (A + ΔA)^{-1}(b + Δb) with

max(‖ΔA‖/‖A‖, ‖Δb‖/‖b‖) ≤ δ.

Then

‖Δx‖/‖x‖ ≤ 2κ(A)δ + O(δ^2),

where κ(A) = ‖A‖ · ‖A^{-1}‖ is the condition number of A.
10/ 76
Condition numbers of linear equations
Condition number of linear equations (cont'd)
Note κ(A) ≥ 1 and κ(tA) = κ(A) for all t ≠ 0.
Specializing to the Euclidean vector norm and its induced matrix norm, κ(A) = σ_1/σ_n, the ratio of the extremal singular values of A.
Geometrically: the matrix A maps the unit n-ball to an ellipsoid. The condition number is the ratio of the maximum to minimum axis length of this ellipsoid.
11/ 76
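The singular-value characterization is easy to check numerically. A sketch in NumPy (the 2×2 matrix is the example used on the figure slide below): κ(A) computed from the definition ‖A‖·‖A^{-1}‖ matches the ratio σ_1/σ_n.

```python
import numpy as np

A = np.array([[4.0, 2.0], [3.0, 2.0]])

# kappa via the definition ||A|| * ||A^-1|| (spectral norm)
kappa_def = np.linalg.norm(A, 2) * np.linalg.norm(np.linalg.inv(A), 2)

# kappa via the singular-value ratio sigma_1 / sigma_n
s = np.linalg.svd(A, compute_uv=False)
kappa_svd = s[0] / s[-1]

print(kappa_def, kappa_svd)  # both ~16.4
```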
Condition numbers of linear equations
Condition number of linear equations (cont'd)
[Figure: the map x ↦ (4 2; 3 2)x sends the unit circle to an ellipse; κ of this matrix is 16.4, the ratio of the ellipse's axis lengths.]
12/ 76
Condition numbers of linear equations
Condition number of linear equations (cont'd)
The bound on the perturbation to x can be (mostly) achieved.
The bound can be achieved if only A or only b is perturbed.
The condition number does not depend on b.
13/ 76
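A small experiment (my own sketch, not from the slides) illustrating the perturbation theorem: over random perturbations of relative size δ, the observed relative change in x stays below 2κ(A)δ.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
b = rng.standard_normal(5)
x = np.linalg.solve(A, b)
kappa = np.linalg.cond(A)

delta = 1e-8
# worst observed relative change in x over random perturbations of (A, b),
# each scaled so that ||dA||/||A|| = ||db||/||b|| = delta
worst = 0.0
for _ in range(200):
    dA = rng.standard_normal((5, 5))
    dA *= delta * np.linalg.norm(A, 2) / np.linalg.norm(dA, 2)
    db = rng.standard_normal(5)
    db *= delta * np.linalg.norm(b) / np.linalg.norm(db)
    dx = np.linalg.solve(A + dA, b + db) - x
    worst = max(worst, np.linalg.norm(dx) / np.linalg.norm(x))

print(worst, 2 * kappa * delta)  # worst stays below the bound 2*kappa*delta
```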
Condition numbers of linear equations
Condition number and distance to singularity
Theorem: If A ∈ R^{n×n} is nonsingular, then 1/κ(A) is the relative distance from A to the singular matrices, i.e.,

1/κ(A) = inf{ ‖ΔA‖/‖A‖ : A + ΔA is singular }.
Optimal ΔA pushes the smallest singular value to 0.
14/ 76
Condition numbers of linear equations
Condition number and iterative methods
Theorem: Suppose A ∈ R^{n×n} is symmetric and positive definite. Then the ith iterate of the conjugate gradient method for solving Ax = b satisfies

‖x_i − A^{-1}b‖_A ≤ 2‖x_0 − A^{-1}b‖_A · ((√κ(A) − 1)/(√κ(A) + 1))^i.

Note: ‖x‖_A means (x^T A x)^{1/2}.
For a symmetric positive definite A, κ(A) = λ_max(A)/λ_min(A).
Steepest descent minimization applied to φ(x) = x^T A x/2 − b^T x is also bounded in terms of the condition number.
15/ 76
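A textbook conjugate gradient implementation (a sketch; the SPD test matrix with κ(A) = 100 is my own construction) showing the error bound above in action:

```python
import numpy as np

def cg(A, b, x0, iters):
    # plain conjugate gradient for symmetric positive definite A
    x = x0.copy()
    r = b - A @ x
    p = r.copy()
    rs = r @ r
    for _ in range(iters):
        Ap = A @ p
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((50, 50)))
A = Q @ np.diag(np.linspace(1.0, 100.0, 50)) @ Q.T  # SPD with kappa = 100
b = rng.standard_normal(50)
xstar = np.linalg.solve(A, b)

def err_A(x):  # A-norm of the error
    e = x - xstar
    return np.sqrt(e @ A @ e)

kappa = 100.0
rho = (np.sqrt(kappa) - 1) / (np.sqrt(kappa) + 1)
x20 = cg(A, b, np.zeros(50), 20)
print(err_A(x20), 2 * err_A(np.zeros(50)) * rho**20)  # observed error vs. bound
```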
Condition numbers of linear equations
Extension to nonsymmetric systems
For nonsymmetric A, one can apply CG to minimize ‖Ax − b‖; this is equivalent to solving the symmetric system A^T Ax = A^T b.
Note that κ(A^T A) = κ(A)^2.
Other well-known iterative methods for nonsymmetric Ax = b, e.g., GMRES, Bi-CGSTAB, are not governed by κ(A).
16/ 76
Condition numbers of linear equations
Computing the condition number
The condition number of the condition number is the condition number (Demmel).
Meaning: the sensitivity of the condition number itself with respect to perturbations of A is again κ(A).
In practice, this means that very large condition numbers (greater than 10^17 in Matlab) cannot usually be computed accurately, except for matrices with special structure.
Even for well-conditioned matrices, computing the condition number is more expensive than solving Ax = b.
17/ 76
Condition numbers of linear equations
Preconditioning linear equations
Instead of applying conjugate gradient to Ax = b, apply it to CAC^T y = Cb, where C is a square nonsingular matrix; C^T C is called the preconditioner.
Want C such that κ(CAC^T) ≪ κ(A).
Too expensive to compute either quantity.
Tradeoff between the time to compute/apply C versus κ(CAC^T).
18/ 76
Condition numbers of linear equations
Example of preconditioning
A symmetric n × n matrix A is a weighted Laplacian if the diagonal entries are nonnegative, the off-diagonal entries are nonpositive, and the row sums are nonnegative.
The above conditions imply positive semidefiniteness.
Spielman, Teng and others, in a series of papers over the past 10 years, found a graph-theoretic preconditioner for weighted Laplacians.
A consequence is that these systems can be solved in nearly linear time (linear in the number of nonzero entries in A).
19/ 76
Condition numbers of linear equations
Extension to finite element stiffness matrices
Boman, Hendrickson and V. extend Spielman & Teng to finite element discretizations of PDEs of the form ∇ · (σ∇u) = −f.
The finite element stiffness matrix K can be factored as A^T D^{1/2} H D^{1/2} A, where A is a node-arc incidence matrix and D is diagonal and positive semidefinite (⟹ A^T DA is a weighted Laplacian).
If all cells of the mesh are well-shaped, then κ(H) is small, and any preconditioner for A^T DA also works for K.
20/ 76
Linear least squares
Outline
1 Condition numbers in general
2 Condition numbers of linear equations
3 Linear least squares
4 Eigenvalues
5 Linear Programming
6 Geometric condition numbers
7 Polynomial evaluation and roots
21/ 76
Linear least squares
Linear least squares
The problem is: minimize ‖Ax − b‖_2 given A ∈ R^{m×n} and b ∈ R^m.
Equivalent to solving the linear equations A^T Ax = A^T b (first-order optimality condition).
Assume that rank(A) = n, so the solution is unique.
Common algorithms: method of normal equations; QR factorization.
22/ 76
Linear least squares
Condition number of linear least squares
Theorem: If ‖ΔA‖/‖A‖ ≤ δ and ‖Δb‖/‖b‖ ≤ δ, then

‖Δx‖/‖x‖ ≤ 2κ(A)δ + κ(A)^2 δ ‖b − Ax‖/(‖A‖ · ‖x‖),

due to Wedin. Achievable.
Here, κ(A) = σ_1(A)/σ_n(A).
If the linear-systems bound is applied to A^T Ax = A^T b, one obtains a weaker bound.
23/ 76
Linear least squares
Solving linear least squares
Moral 1: Reducing Problem A to Problem B establishes a bound on the condition number, but the bound may be weak.
Moral 2: Solving linear least squares via reduction to linear equations may give a poor answer.
24/ 76
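Both morals can be observed on a Läuchli-type matrix (an illustrative sketch; `np.linalg.lstsq` uses an orthogonal factorization, while forming A^T A squares the condition number):

```python
import numpy as np

eps = 1e-6
A = np.array([[1.0, 1.0], [eps, 0.0], [0.0, eps]])  # Lauchli-type matrix
x_true = np.array([1.0, 1.0])
b = A @ x_true  # consistent system; exact least-squares solution is (1, 1)

x_normal = np.linalg.solve(A.T @ A, A.T @ b)   # reduction to linear equations
x_qr, *_ = np.linalg.lstsq(A, b, rcond=None)   # orthogonal factorization

err_normal = np.linalg.norm(x_normal - x_true)
err_qr = np.linalg.norm(x_qr - x_true)
print(err_qr, err_normal)  # the QR-based answer is typically far more accurate
```

Note that κ(A) ≈ 1.4·10^6 here, so κ(A)^2 ≈ 2·10^12: the normal equations operate right at the edge of double precision.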
Linear least squares
Weighted least squares
Weighted least squares means minimizing ‖D(Ax − b)‖, where D is a positive definite diagonal weight matrix.
Reduces to ordinary linear least squares under the obvious substitution Ā = DA and b̄ = Db.
The perturbation bound for ordinary least squares means the solution can be arbitrarily inaccurate as κ(D) → ∞.
25/ 76
Linear least squares
Weighted least squares (cont’d)
Theorem: There is a bound on ‖x − x̄‖ that depends on χ_A and χ̄_A and is independent of κ(D).
These quantities were introduced by Stewart, Todd and Dikin independently.
Specialized weighted least-squares algorithms (V.; Hough & V.) can achieve this bound.
These algorithms require that dependence among rows of A is detected infallibly. (Independent rows have a condition number bounded by χ̄_A.)
26/ 76
Linear least squares
More about χ_A and χ̄_A
χ_A = sup{‖(A^T DA)^{-1} A^T D‖ : D ∈ D} and χ̄_A = sup{‖A(A^T DA)^{-1} A^T D‖ : D ∈ D}, where D denotes the set of positive definite diagonal matrices.
Dikin; Todd; Stewart: these sups are finite.
Khachiyan; Tuncel: it is NP-hard to compute or accurately approximate χ_A or χ̄_A.
27/ 76
Eigenvalues
Outline
1 Condition numbers in general
2 Condition numbers of linear equations
3 Linear least squares
4 Eigenvalues
5 Linear Programming
6 Geometric condition numbers
7 Polynomial evaluation and roots
28/ 76
Eigenvalues
Condition number of eigenvalues
If Ax = λx and λ is not a repeated eigenvalue, then λ depends smoothly on A. Differentiation shows that if λ + Δλ is the corresponding eigenvalue of A + ΔA, ‖ΔA‖ ≤ δ, then … Exercise: FIGURE IT OUT!
Hint: Normalize the eigenvector; expand (A + ΔA)(x + Δx) = (λ + Δλ)(x + Δx) to first order; multiply on the left by y^H (the normalized left eigenvector for λ).
29/ 76
Eigenvalues
Condition number of eigenvalues
If Ax = λx and λ is not a repeated eigenvalue, then λ depends smoothly on A. Differentiation shows that if λ + Δλ is the corresponding eigenvalue of A + ΔA, ‖ΔA‖ ≤ δ, then

|Δλ| ≤ δ/|y^H x| + O(δ^2).

Here, y is the normalized left eigenvector of A and x is the normalized right eigenvector.
Absolute perturbations make sense in this context since one could have λ = 0 even when A ≠ 0.
If A is a normal matrix, i.e., has an orthonormal basis of eigenvectors, then λ is perfectly conditioned since |y^H x| = 1.
29/ 76
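A numerical illustration of this formula (a sketch with a hypothetical non-normal 2×2 matrix): 1/|y^H x| predicts the observed eigenvalue movement under a small perturbation.

```python
import numpy as np

A = np.array([[1.0, 1000.0], [0.0, 2.0]])  # non-normal; eigenvalues 1 and 2

# right eigenvector x of A and left eigenvector y (eigenvector of A^T)
# for the eigenvalue lambda = 1
w, V = np.linalg.eig(A)
i = np.argmin(np.abs(w - 1.0))
x = V[:, i] / np.linalg.norm(V[:, i])
wl, U = np.linalg.eig(A.T)
j = np.argmin(np.abs(wl - 1.0))
y = U[:, j] / np.linalg.norm(U[:, j])

cond_lam = 1.0 / abs(y @ x)  # eigenvalue condition number, ~1000 here

# perturb one entry of A and watch the eigenvalue move by ~cond_lam * delta
delta = 1e-6
E = np.zeros((2, 2))
E[1, 0] = delta
w2 = np.linalg.eig(A + E)[0]
shift = np.min(np.abs(w2 - 1.0))
print(cond_lam, shift / delta)  # both ~1e3
```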
Eigenvalues
Ill conditioned eigenvalues and pseudoeigenvalues
If an eigenvalue is ill conditioned (|y^H x| ≈ 0), then it is sensitive to perturbations of A and hard to compute.
Eigenvalue criteria applied to the Jacobian are often used to test stability of solutions to differential equations and stability of numerical methods for differential equations.
These criteria are unreliable if they are applied to problems with a highly non-normal Jacobian (Trefethen).
30/ 76
Eigenvalues
Pseudoeigenvalues
Trefethen and others define an ε-pseudoeigenvalue of A to be a complex number λ such that ‖(A − λI)^{-1}‖ ≥ 1/ε.
Equivalently: λ is an ε-pseudoeigenvalue if there exists ΔA s.t. ‖ΔA‖ ≤ ε and λ is an eigenvalue of A + ΔA.
A large condition number indicates that pseudoeigenvalues can be far from the true eigenvalues.
Computational experiments and theoretical analysis show that pseudoeigenvalues can be used reliably in various stability criteria.
31/ 76
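Since ‖(A − λI)^{-1}‖ = 1/σ_min(A − λI), both characterizations can be checked with an SVD (a sketch, reusing a hypothetical non-normal 2×2 matrix):

```python
import numpy as np

A = np.array([[1.0, 1000.0], [0.0, 2.0]])  # non-normal; spectrum is {1, 2}
lam = 1.5                                  # distance 0.5 from the spectrum

# resolvent-norm criterion: lam is an eps-pseudoeigenvalue exactly when
# sigma_min(A - lam*I) <= eps
U, s, Vt = np.linalg.svd(A - lam * np.eye(2))
smin = s[-1]
print(smin)  # tiny, although lam is far from the true eigenvalues

# the equivalent perturbation: ||dA|| = smin and, in exact arithmetic,
# lam is an eigenvalue of A + dA
dA = -smin * np.outer(U[:, -1], Vt[-1])
print(np.linalg.eigvals(A + dA))
```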
Linear Programming
Outline
1 Condition numbers in general
2 Condition numbers of linear equations
3 Linear least squares
4 Eigenvalues
5 Linear Programming
6 Geometric condition numbers
7 Polynomial evaluation and roots
32/ 76
Linear Programming
Linear programming
Linear programming refers to minimizing c^T x subject to Ax = b and x ≥ 0, where A ∈ R^{m×n} is given, as are b ∈ R^m and c ∈ R^n.
Linear programming admits an elegant duality theory. Two competing algorithms for LP are in common usage: simplex and interior point. Simplex is not known to be polynomial time.
Three possible outcomes for LP: an optimal solution is found; the problem is unbounded; the problem is infeasible.
The optimizer is not necessarily unique.
33/ 76
Linear Programming
Inputs and outputs for linear programming
Input is A, b, c. Some subtlety here: what if constraints are structured, e.g., l_i ≤ a_i^T x ≤ u_i?
(At least) two possible outputs from LP: the optimal value c^T x* or the optimizer x* itself.
The optimizer is more sensitive to perturbation than the optimal value.
34/ 76
Linear Programming
Example
min x + (1 + δ)y
s.t. 0 ≤ x, y ≤ 1
x + y ≥ 0.3
δ = −0.01
35/ 76
Linear Programming
Example
min x + (1 + δ)y
s.t. 0 ≤ x, y ≤ 1
x + y ≥ 0.3
δ = 0
35/ 76
Linear Programming
Example
min x + (1 + δ)y
s.t. 0 ≤ x, y ≤ 1
x + y ≥ 0.3
δ = 0.01
35/ 76
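The jump of the optimizer at δ = 0 can be reproduced with an off-the-shelf solver (a sketch using `scipy.optimize.linprog`; note the optimal value moves only slightly while the optimizer jumps between vertices):

```python
from scipy.optimize import linprog

def solve(delta):
    # min x + (1 + delta) y  s.t.  x + y >= 0.3,  0 <= x, y <= 1
    res = linprog(c=[1.0, 1.0 + delta],
                  A_ub=[[-1.0, -1.0]], b_ub=[-0.3],
                  bounds=[(0.0, 1.0), (0.0, 1.0)])
    return res.x, res.fun

x_neg, v_neg = solve(-0.01)  # optimizer ~ (0, 0.3), value 0.297
x_pos, v_pos = solve(+0.01)  # optimizer ~ (0.3, 0), value 0.3
print(x_neg, x_pos, v_neg, v_pos)
```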
Linear Programming
Renegar’s condition number
Start with the LP feasibility problem: given A ∈ R^{m×n} such that rank(A) = m and b ∈ R^m, find an x such that Ax = b, x ≥ 0.
Let the instance be written as d = (A, b). Assume it is feasible. The condition number of the instance according to Renegar is

C(d) = ‖d‖ / inf{‖d − d′‖ : d′ is infeasible}.

Renegar and Vera relate this condition number to the complexity of finding a solution to LP feasibility.
36/ 76
Linear Programming
Followups to Renegar’s work
Many other problems are similar to LP feasibility but with subtly different properties.
Critique of Renegar's theorem: A and b scale differently.
More general critique: the condition number should depend on S = {x : Ax = b} but not on the particular choice of A and b.
37/ 76
Linear Programming
Condition number independent of representation
Epelman and Freund define a condition number for LP feasibility as C_2(d) = 1/sym(H(d), 0), where

sym(D, x) = sup{t : x + v ∈ D ⟹ x − tv ∈ D}

and

H(d) = {bθ − Ax : θ ≥ 0; x ≥ 0; |θ| + ‖x‖ ≤ 1}.

Existence of a feasible point implies 0 ∈ int(H(d)).
sym(D, 0) is invariant under invertible linear mappings of D.
E & F show that C_2(d) ≤ C(d).
38/ 76
Linear Programming
Other forms of linear feasibility
Cheung, Cucker & Pena unify several LP results.
Form the following self-dual feasibility problem:

−c^T x + b^T y ≥ 0,
c x_0 − A^T y ≥ 0,
−b x_0 + Ax = 0,
x_0, x ≥ 0.

Let ρ be the distance of this instance to another instance with a different strictly complementary basis (uniquely determined; Goldman-Tucker 1956).
This distance is equal or related to several previously proposed condition numbers.
39/ 76
Linear Programming
LP complexity bounded by χ̄_A
V. and Ye proposed an interior point method whose running time depends only on χ̄_A, m, n and not on b, c.
No other interior point method has this property. Tardos had earlier shown how to get this kind of complexity with the ellipsoid method.
The new technique is layered least squares, a generalization of weighted least squares.
Some weights are infinitely larger than others.
40/ 76
Linear Programming
Is χ̄_A a condition number?
Perhaps χ̄_A is a condition number for linear programming, but no precise statement is known.
Cheung et al. found relationships between χ̄_A and other LP condition numbers.
Note that χ̄_A depends only on N(A) and not on A itself.
41/ 76
Linear Programming
Semidefinite programming
Semidefinite programming instances can have infinite condition number even when there is a unique optimizer.
Example: maximize x subject to y ≥ 1, x^2 + y^2 ≤ 1.
The unique feasible point (0, 1) has objective value 0.
Perturbing the constraint to x^2 + y^2 ≤ 1 + ε moves the optimizer to (√ε, 1).
Hence the ratio of the perturbation of the optimizer to the perturbation of the data is unbounded as ε → 0.
42/ 76
Linear Programming
Semidefinite programming (cont’d)
In this SDP instance, the Slater condition fails.
43/ 76
Geometric condition numbers
Outline
1 Condition numbers in general
2 Condition numbers of linear equations
3 Linear least squares
4 Eigenvalues
5 Linear Programming
6 Geometric condition numbers
7 Polynomial evaluation and roots
44/ 76
Geometric condition numbers
Boundary value problem
A boundary value problem means solving a differential equation involving one or more spatial independent variables over a domain Ω ⊂ R^d (usually d = 1, 2, 3), given boundary conditions and field values.
Classic example mentioned earlier: ∇ · (σ(x)∇u(x)) = −f(x) for x ∈ Ω, with some specification of u (Dirichlet) or ∂u/∂n (Neumann) at every point of ∂Ω.
The special case σ(x) ≡ 1, f ≡ 0 is called Laplace's equation.
45/ 76
Geometric condition numbers
Geometry and BVP solution
Difficulty of solving a BVP depends partly on the geometry of Ω. (Many technicalities regarding mesh generation, etc.)
Clean problem: assume Ω is simply connected in R^2, the BVP is Laplace's equation, and the boundary data is Dirichlet.
Can reduce the solution to inversion of integral operators whose condition number depends on the harmonic conjugation operator.
Harmonic conjugation means: given the real part of a complex analytic function defined on a simply connected open set Ω ⊂ C, find the imaginary part (unique up to an additive constant).
46/ 76
Geometric condition numbers
Example simply connected plane domains
Well conditioned domains
Poorly conditioned domains
47/ 76
Geometric condition numbers
Complex analysis
In this case, the problem reduces to one in complex analysis: come up with a geometric bound for the harmonic conjugation operator.
In the case that Ω is convex, harmonic conjugation is bounded in terms of the aspect ratio of Ω.
Conjectures made by V. in the 1990s for the general case were mostly solved by C. Bishop in the past few years.
The geometric condition number can be bounded in terms of a tree-like decomposition of Ω into approximately round bodies.
The extension to 3D is wide open.
48/ 76
Polynomial evaluation and roots
Outline
1 Condition numbers in general
2 Condition numbers of linear equations
3 Linear least squares
4 Eigenvalues
5 Linear Programming
6 Geometric condition numbers
7 Polynomial evaluation and roots
49/ 76
Polynomial evaluation and roots
Evaluation of polynomials
Suppose p(x) = a_0 + a_1 x + ··· + a_d x^d (standard monomial form).
If x is perturbed by an absolute amount δ_1 and a by a relative amount δ_2, then p(x) is perturbed by at most |p′(x)|δ_1 + ‖a‖ · ‖pwr(x)‖δ_2, where pwr(x) = [1; x; …; x^d].
Example: it is unstable to use the Taylor series to approximate e^x when x ≪ 0 because ‖a‖ · ‖pwr(x)‖ ≫ |p(x)|.
The same polynomial is fine when x ≈ 0.
The same principles carry over to multivariate polynomials.
50/ 76
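The e^x example is easy to reproduce (a sketch: Horner evaluation of the degree-150 Taylor polynomial at x = −20, where ‖a‖·‖pwr(x)‖ ≫ |p(x)| and cancellation destroys the answer; summing at |x| and inverting is stable):

```python
import math

x = -20.0
d = 150
a = [1.0 / math.factorial(k) for k in range(d + 1)]  # Taylor coefficients of e^x

# Horner evaluation of the truncated series at x = -20
p = 0.0
for coeff in reversed(a):
    p = p * x + coeff
naive_err = abs(p - math.exp(x)) / math.exp(x)

# stable alternative: evaluate the all-positive series at |x| and invert
q = 0.0
for coeff in reversed(a):
    q = q * (-x) + coeff
stable_err = abs(1.0 / q - math.exp(x)) / math.exp(x)

print(naive_err, stable_err)  # naive is wildly wrong; stable is near machine precision
```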
Polynomial evaluation and roots
Evaluation of Bernstein-Bezier polynomials
A Bernstein-Bezier univariate polynomial is specified by coefficients a_0, …, a_d and given by

p(x) = Σ_{i=0}^{d} (d choose i) a_i x^i (1 − x)^{d−i}.

The argument x is usually in [0, 1].
Can show that the perturbation to p(x) is bounded by |p′(x)|δ_1 + ‖a‖δ_2 for this range of x.
51/ 76
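Evaluation in this form is typically done with de Casteljau's algorithm (a sketch; the convex-combination structure is what underlies the favorable ‖a‖δ_2 bound on [0, 1]):

```python
from math import comb

def de_casteljau(a, x):
    # evaluate the Bernstein-Bezier polynomial with coefficients a at x
    # via repeated convex combinations (stable for x in [0, 1])
    b = list(a)
    n = len(b) - 1
    for r in range(1, n + 1):
        for i in range(n - r + 1):
            b[i] = (1 - x) * b[i] + x * b[i + 1]
    return b[0]

def direct(a, x):
    # direct evaluation from the definition, for comparison
    d = len(a) - 1
    return sum(comb(d, i) * a[i] * x**i * (1 - x)**(d - i) for i in range(d + 1))

a = [1.0, -2.0, 0.5, 3.0]
print(de_casteljau(a, 0.3), direct(a, 0.3))  # agree
```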
Polynomial evaluation and roots
Condition number of a root of a univariate polynomial
Given p(x) as above, find a root.
Toh and Trefethen (Gautschi): the condition number of a root x* (relative perturbation to p, absolute perturbation to x*) is … Exercise: FIGURE IT OUT!
Hint: the Cauchy-Schwarz inequality is needed.
52/ 76
Polynomial evaluation and roots
Condition number of a root of a univariate polynomial
Given p(x) as above, find a root.
Toh and Trefethen (Gautschi): the condition number of a root x* (relative perturbation to p, absolute perturbation to x*) is ‖a‖ · ‖pwr(x*)‖/|p′(x*)|.
Blum et al. work with absolute perturbations, so for them the condition number is ‖pwr(x*)‖/|p′(x*)|.
52/ 76
Polynomial evaluation and roots
Multivariate polynomial systems
For the condition of an isolated root, Blum et al. give the following definition of condition number.
Suppose x* is a root of the polynomial system p(x) = 0 and the Jacobian matrix ∇p(x*) is invertible. The implicit function theorem means there is a differentiable root function r(q), defined in a neighborhood O of p, such that q(r(q)) = 0 for all q ∈ O.
Condition number is ‖∇r(p)‖.
53/ 76
Polynomial evaluation and roots
Condition number theorem of Blum et al.
Say that x* is a singular root of p if ∇p(x*) is singular.
Theorem: The reciprocal condition number of a root x* of polynomial p is within a constant factor (depending on degree) of the minimum distance from p to a polynomial p̄ such that x* is a singular root of p̄.
54/ 76
Polynomial evaluation and roots
Roots in [0, 1]
Suppose only roots in [0, 1] are sought. It is not sufficient to consider the condition numbers of roots in this interval, since an ill-conditioned root at 0.5 + εi means a slight perturbation of the polynomial may move the root into [0, 1].
E.g., p(x) = (x − 0.5)^2 + 10^{-9}.
Two solutions to this problem: (a) consider all roots, or (b) use a definition of condition number that looks beyond the behavior at the roots.
55/ 76
Polynomial evaluation and roots
Finding all roots in [0, 1]n
Problem arises in geometric computing.
Suppose p : R^n → R^n is a polynomial, and we wish to compute all roots of p in [0, 1]^n.
The ill-conditioned case is when a small perturbation to p creates a singular root in [0, 1]^n.
The problem considered is not counting the number of roots in [0, 1]^n. In that problem, there is an additional ill-conditioned case: when one of the roots is very close to ∂[0, 1]^n.
56/ 76
Polynomial evaluation and roots
Condition number definition
(Srijuntongsiri & V.) Define

κ(f) = ‖f‖ · sup_{x∈[0,1]^n} min(1/‖f(x)‖, ‖∇f(x)^{-1}‖).

In other words, f is well conditioned if, for each point in [0, 1]^n, either ‖f(x)‖/‖f‖ is not too small or ∇f(x) is not too ill-conditioned.
The condition number is ∞ if there is an x* such that f(x*) = 0 and ∇f(x*) is singular.
Any standard norm may be used for f.
57/ 76
Polynomial evaluation and roots
Condition number bounds perturbations to roots
Theorem: Suppose f, f̄ are two polynomials such that ‖f − f̄‖/‖f‖ ≤ δ, where δκ(f) ≪ 1. Suppose x* ∈ [0, 1]^n and f(x*) = 0. Then

inf_{y∈f̄^{-1}(0)} ‖x* − y‖ ≤ c_d δκ(f).
58/ 76
Polynomial evaluation and roots
Condition number bounds distance to singularity
As mentioned earlier, a singular polynomial f has a root x* ∈ [0, 1]^n such that ∇f(x*) is singular.
Theorem: For a polynomial f, the relative distance to the singular polynomials is equal to 1/κ(f) to within a factor depending on the degree.
59/ 76
Polynomial evaluation and roots
General theorems about condition numbers
Suppose the condition number is the relative distance to singularity, where "singularity" means belonging to a semi-algebraic cone of co-dimension at least 1.
Demmel showed that the mean of the logarithmic condition number is small.
Burgisser, Cucker and Lotz showed that the smoothed logarithmic condition number is small.
60/ 76
Polynomial evaluation and roots
Smoothed condition number
Suppose Ψ(f) is some kind of complexity measure or condition number: a nonnegative real-valued function of an input instance f ∈ F.
Worst-case analysis: sup_{f∈F} Ψ(f).
Average-case analysis: E_{f∈F}[Ψ(f)].
Smoothed analysis (Spielman & Teng): sup_{f∈F} E_{f̄∈B(f,δ)}[Ψ(f̄)].
61/ 76
Polynomial evaluation and roots
Affine invariance
The condition number of Srijuntongsiri & V. is not affinely invariant, but many algorithms are.
Affinely invariant means that κ(f) = κ(Af) for any nonsingular matrix A.
S. & V. solve this problem in a brute-force manner: define κ̄(f) = inf{κ(Af) : A ∈ GL_n(R)}; trivially κ̄(f) is affinely invariant.
An affinely invariant algorithm whose complexity is bounded in terms of κ(f) is automatically also bounded in terms of κ̄(f).
62/ 76
Polynomial evaluation and roots
Algorithms to find roots in [0, 1]n
Problem: Given a polynomial f : R^n → R^n, find all roots in [0, 1]^n.
Focus is on the case of small degree and small n, so that exhaustive search is tractable.
Important problem in computational geometry.
Goal: obtain an algorithm whose complexity isbounded by κ(f).
63/ 76
Polynomial evaluation and roots
An algorithm with no such bound
Consider the classic Toth algorithm. Subdivide [0, 1]^n into 2^n subcubes. For each subcube, use a test that either confirms that f has a unique root in the subcube, confirms that there is no root, or else is inconclusive.
If the test is inconclusive, recursively subdivide.
The test is based on interval arithmetic.
Problem: if a root has a coordinate that is exactly of the form k/2^l (or is very close to such a point), then the Toth algorithm can get stuck (inconclusive at all recursive levels).
64/ 76
Polynomial evaluation and roots
KTS Algorithm
KTS stands for Kantorovich Test Subdivision algorithm.
Also based on recursive subdivision into cubes.
Uses the Kantorovich test and an exclusion test.
The Kantorovich test guarantees a unique root that can be found with Newton's method.
65/ 76
Polynomial evaluation and roots
Kantorovich theorem
Suppose f : D → R^n is differentiable, D an open convex subset of R^n.
Suppose x_0 ∈ D and ‖∇f(x_0)^{-1} f(x_0)‖ ≤ η.
Suppose ‖∇f(x_0)^{-1}(∇f(x) − ∇f(y))‖ ≤ ω‖x − y‖ for all x, y ∈ D.
If h = ηω < 1/2, then f has a root in B(x_0, (1 − √(1 − 2h))/ω). This root is unique in B(x_0, (1 + √(1 − 2h))/ω) ∩ D and is the limit of Newton's method starting at x_0.
66/ 76
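A sketch of the theorem on a hypothetical 2×2 system f(x, y) = (x^2 + y^2 − 1, x − y), chosen because ∇f is affine in (x, y), so the constant ω can be written down explicitly:

```python
import numpy as np

def f(v):
    x, y = v
    return np.array([x**2 + y**2 - 1.0, x - y])

def J(v):
    x, y = v
    return np.array([[2*x, 2*y], [1.0, -1.0]])

x0 = np.array([0.7, 0.7])
Jinv = np.linalg.inv(J(x0))

# Kantorovich quantities: eta >= ||J(x0)^-1 f(x0)||, and
# ||J(x0)^-1 (J(x) - J(y))|| <= omega ||x - y|| with omega = 2 ||J(x0)^-1||
# (J is affine here with Lipschitz constant 2)
eta = np.linalg.norm(Jinv @ f(x0))
omega = 2.0 * np.linalg.norm(Jinv, 2)
h = eta * omega
print(h)  # h < 1/2, so Newton from x0 converges to a unique nearby root

x = x0.copy()
for _ in range(20):
    x = x - np.linalg.solve(J(x), f(x))
print(x)  # ~ (1/sqrt(2), 1/sqrt(2))
```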
Polynomial evaluation and roots
Kantorovich test
For a polynomial function, it is straightforward to estimate η and ω.
Can determine whether the current subcube lies in a disk where there is convergence to a unique root.
As the subcube gets smaller, the disk stays the same size.
67/ 76
Polynomial evaluation and roots
Bernstein-Bezier polynomial
The Bernstein-Bezier form of a polynomial function f : R^n → R^n of individual degree d is

Σ_{i_1=0}^{d} ··· Σ_{i_n=0}^{d} a_{i_1···i_n} Π_{j=1}^{n} (d choose i_j) x_j^{i_j} (1 − x_j)^{d−i_j}.

Theorem: f([0, 1]^n) ⊂ conv(a_{0···0}, …, a_{d···d}). Follows because the Bernstein basis functions are nonnegative and sum to 1.
68/ 76
Polynomial evaluation and roots
Exclusion test
Given a subcube S = [a_1, b_1] × ··· × [a_n, b_n] of [0, 1]^n, one can define the obvious bijection π : [0, 1]^n → S and a new polynomial f̃ : [0, 1]^n → R^n by f̃(x) = f(π(x)).
Rewrite f̃ in B-B form. If the convex hull of the control points of f̃ does not contain 0, then f has no root in S.
69/ 76
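A one-variable sketch of the exclusion test (the monomial-to-Bernstein conversion used here is the standard formula on [0, 1], but this example is my addition, not from the slides):

```python
from math import comb

def bernstein_coeffs(a):
    # convert monomial coefficients a_0..a_d (on [0, 1]) to Bernstein
    # coefficients c_0..c_d via c_i = sum_{k<=i} C(i,k)/C(d,k) * a_k
    d = len(a) - 1
    return [sum(comb(i, k) / comb(d, k) * a[k] for k in range(i + 1))
            for i in range(d + 1)]

# p(x) = x^2 + 0.5 has no root in [0, 1]
c = bernstein_coeffs([0.5, 0.0, 1.0])
print(c)  # [0.5, 0.5, 1.5]

# exclusion test: the control values bracket p([0, 1]); if they all have
# the same sign, the hull misses 0 and p has no root in [0, 1]
no_root_certified = min(c) > 0 or max(c) < 0
print(no_root_certified)  # True
```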
Polynomial evaluation and roots
Main theorem
Suppose S is a subcube of width at most c_d/κ(f)^2.
Then either the Kantorovich test is satisfied by S, and the region containing the root also contains S, or the exclusion test is satisfied by S (or possibly both).
This means that S does not have to be further subdivided.
Yields a complexity bound that depends on κ(f).
70/ 76
Polynomial evaluation and roots
Extension to the case of one degree of freedom
Suppose now that f : [0, 1]^{n+1} → R^n is a polynomial system. The roots in this case are, generically, curves in [0, 1]^{n+1}.
This problem is also important in computational geometry: the surface-surface intersection problem.
Define

κ(f) = ‖f‖ · max_{x∈[0,1]^{n+1}} min(1/‖f(x)‖, ‖∇f(x)^+‖).

Here, B^+ denotes B^T(BB^T)^{-1}, the Moore-Penrose inverse of B in the case that the rows of B are linearly independent.
71/ 76
Polynomial evaluation and roots
KTS algorithm for one degree of freedom
Srijuntongsiri & V. extend the KTS algorithm to this case with similar complexity bounds.
New problem: tracing curves between subcubes to get connected components.
The Kantorovich theorem implies that once the subdivision is fine enough, there is only one possible way to connect the curves together.
72/ 76
Polynomial evaluation and roots
Interval versus floating point arithmetic
The KTS algorithm has been implemented infloating point arithmetic.
In floating-point arithmetic, its correctness is not guaranteed.
Can implement KTS in interval arithmetic toguarantee correctness, which raises the cost.
KTS is affinely invariant in exact arithmetic, but notin floating point or interval arithmetic.
73/ 76
Polynomial evaluation and roots
Yet another condition number for root-finding
It is possible to reduce multivariate polynomial rootfinding to an eigenvalue/eigenvector computation (see e.g. Jonsson & V.) using resultants.
Another avenue for defining a root-finding condition number: use the conditioning of the resulting eigenvalue/eigenvector problem.
Jonsson & V.: besides the condition number at the root, the resulting root-finding condition number deteriorates if the polynomials are close to having a common factor.
74/ 76
Polynomial evaluation and roots
Future directions (I)
Condition-based analysis and floating-point arithmetic: is affine invariance good or bad? For linear system solving, the condition number is obviously not affinely invariant!
Multiple ways to define the condition number of polynomial rootfinding:
Condition number of roots
Condition number involving the function value and first derivative
Condition number involving resultant matrices
Find connections?
75/ 76
Polynomial evaluation and roots
Future directions (II)
Robust optimization and condition numbers
Lots of work to do on geometric condition numbers.
Condition number of semidefinite programming: although Renegar and others have defined this, there are still well-conditioned instances that cause trouble for solvers.
Preconditioning is not widely used outside of solvingAx = b.
76/ 76