ROOT FINDING TECHNIQUES FOR SOLVING NON-LINEAR EQUATIONS

NUMERICAL METHODS FOR ENGINEERS 1

1. Introduction

Life would be extremely boring if all physical systems could be described by linear systems of equations. Most engineering problems are non-linear in nature although, in some instances, reasonable approximations may be obtained by linearisation. In many cases, however, linear approximations are useless, with little or no possibility of obtaining a reasonable solution from them. A particularly good example is turbulent liquid flow, which is chaotic in nature and cannot be described with linear equations. Non-linear equations are required to adequately describe plastic and viscoplastic deformation of solids. Even a simple pendulum requires non-linear equations to describe its behaviour for relatively large oscillations.

In this course of lectures we will focus on algebraic equations and, in particular, on arranging the equations so that their solution can be viewed simply as a root finding exercise. Attention is focused on the so-called fixed-point methods, as these are easy to implement and, if constructed properly, can converge rapidly. A fixed-point method involves the construction of an iterative procedure of the form $x^{(k+1)} = \phi(x^{(k)})$ to solve $f(x) = 0$, i.e. to determine the values of $x$ that are the roots of $f(x)$. If properly constructed, a root of $f(x)$ should be a fixed point of the iterative scheme, i.e. if $f(\alpha) = 0$, then $\alpha = \phi(\alpha)$.

It is expected that the student should familiarise him/herself with the direct search techniques: the Bisection Method, the False-Position Method and the Secant (or Chord) Method. These are based on relatively simple ideas and so will not be covered in this lecture course.

2. Root Multiplicity

An equation of the form $f(x) = 0$ is said to have a root at $x = \alpha$ if $f(\alpha) = 0$.

If the function $f$ can be written as $f(x) = (x - \alpha)^m g(x)$, with $g(\alpha) \neq 0$, then $f(x)$ is said to have a root of multiplicity (or order) $m$ at $x = \alpha$.

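Consider a Taylor series expansion of $f(x)$ at $x = \alpha$, i.e.

$$f(x) = f(\alpha) + (x - \alpha)\frac{df}{dx}(\alpha) + \frac{(x - \alpha)^2}{2!}\frac{d^2 f}{dx^2}(\alpha) + \frac{(x - \alpha)^3}{3!}\frac{d^3 f}{dx^3}(\alpha) + \frac{(x - \alpha)^4}{4!}\frac{d^4 f}{dx^4}(\alpha) + \ldots$$

or more succinctly,

$$f(x) = f(\alpha) + (x - \alpha) f'(\alpha) + \frac{(x - \alpha)^2}{2!} f''(\alpha) + \frac{(x - \alpha)^3}{3!} f'''(\alpha) + \frac{(x - \alpha)^4}{4!} f^{(4)}(\alpha) + \ldots \tag{1}$$

where $f^{(s)}(\alpha) = \dfrac{d^s f}{dx^s}(\alpha)$, i.e. $\dfrac{d^s f}{dx^s}$ evaluated at $x = \alpha$. If $\alpha$ is a root of $f(x)$ then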

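$$f(x) = (x - \alpha)\left[ f'(\alpha) + \frac{(x - \alpha)}{2!} f''(\alpha) + \frac{(x - \alpha)^2}{3!} f'''(\alpha) + \frac{(x - \alpha)^3}{4!} f^{(4)}(\alpha) + \ldots \right] = (x - \alpha)\, g_1(x)$$

and note that $g_1(\alpha) = f'(\alpha)$, so $g_1(\alpha) \neq 0$ if and only if $f'(\alpha) \neq 0$. If, however, $f'(\alpha) = 0$, then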


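$$f(x) = (x - \alpha)^2\left[ \frac{f''(\alpha)}{2!} + \frac{(x - \alpha)}{3!} f'''(\alpha) + \frac{(x - \alpha)^2}{4!} f^{(4)}(\alpha) + \ldots \right] = (x - \alpha)^2\, g_2(x)$$

and $g_2(\alpha) = f''(\alpha)/2!$, so $g_2(\alpha) \neq 0$ if and only if $f''(\alpha) \neq 0$.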

In general, $f(x)$ has a root of multiplicity $m$ at $x = \alpha$ if and only if

$$f(\alpha) = f'(\alpha) = f''(\alpha) = f'''(\alpha) = \ldots = f^{(m-1)}(\alpha) = 0 \quad \text{and} \quad f^{(m)}(\alpha) \neq 0. \tag{2}$$

If $m = 2$ then the root is called a double root, and this occurs when $f(\alpha) = f'(\alpha) = 0$ and $f''(\alpha) \neq 0$.

Example 2.1

Determine the roots of $p(x) = x^2 + 2x + 1$ as well as the multiplicity of each root.

This polynomial can be factorised to give $p(x) = (x + 1)^2$ and so $p(x)$ has a double root at $x = -1$. Note that $p'(x) = 2(x + 1)$ and $p''(x) = 2$, and thus $p(-1) = p'(-1) = 0$ with $p''(-1) \neq 0$.
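The derivative test for multiplicity can be sketched in a few lines of Python. This is only an illustration, not part of the original notes: the helper names `poly_eval`, `poly_deriv` and `root_multiplicity`, the descending-power coefficient convention and the tolerance `tol` are all assumptions made here. The function counts how many successive derivatives of a polynomial vanish at a candidate root.

```python
def poly_eval(coeffs, x):
    """Evaluate a polynomial (coefficients in descending powers) by Horner's rule."""
    result = 0.0
    for c in coeffs:
        result = result * x + c
    return result


def poly_deriv(coeffs):
    """Return the coefficients of the derivative polynomial."""
    n = len(coeffs) - 1  # degree of the polynomial
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]


def root_multiplicity(coeffs, alpha, tol=1e-12):
    """Count how many of p, p', p'', ... vanish at alpha (0 means alpha is not a root)."""
    m = 0
    current = list(coeffs)
    while current and abs(poly_eval(current, alpha)) < tol:
        m += 1
        current = poly_deriv(current)
    return m


# p(x) = x^2 + 2x + 1 from Example 2.1
print(root_multiplicity([1.0, 2.0, 1.0], -1.0))
```

For the polynomial of Example 2.1 the loop finds that $p$ and $p'$ vanish at $x = -1$ while $p''$ does not, which is the double-root condition.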

3. Single Variable Fixed-Point Iteration

One possible approach to obtaining a root of $f(x)$ is via a graphical construction.

If, for example, we define $f_1(x)$ and $f_2(x)$ so that $f(x) = f_1(x) - f_2(x)$, then graphs for $f_1(x)$ and $f_2(x)$ can be constructed on a single plot.

Clearly, $f(x) = 0$ at the points where $f_1(x) = f_2(x)$. Thus, the intersection points of the two graphs for $f_1(x)$ and $f_2(x)$ are roots of $f(x)$.

Example 3.1

Use a graphical construction to find the roots of $f(x) = x - x^2$.

Let $f_1(x) = x$ and $f_2(x) = x^2$ and the graph follows. Clearly, the roots are at $x = 0$ and $x = 1$. Constructions of this type should only be used as a rough guide.

[Figure: graphs of $f_1(x) = x$ and $f_2(x) = x^2$ plotted against $x$.]


The Newton-Raphson Method

Before going on to examine fixed-point methods in general it is worth, at this stage, introducing the Newton-Raphson method, which is an extremely popular fixed-point method. To do this consider the geometrical construction shown in the figure.

Note that

$$f'(x^{(k)}) = \frac{f(x^{(k)})}{x^{(k)} - x^{(k+1)}},$$

which can be rearranged to give the classical Newton-Raphson method, i.e.

$$x^{(k+1)} = x^{(k)} - \frac{f(x^{(k)})}{f'(x^{(k)})}.$$

This is clearly a fixed-point method, since it is of the form $x^{(k+1)} = \phi(x^{(k)})$, where

$$\phi(x^{(k)}) = x^{(k)} - \frac{f(x^{(k)})}{f'(x^{(k)})}.$$

Moreover, if $f(x)$ has a root of multiplicity 1 at $x = \alpha$, then $f'(\alpha) \neq 0$ and

$$\phi(\alpha) = \alpha - \frac{f(\alpha)}{f'(\alpha)} = \alpha - \frac{0}{f'(\alpha)} = \alpha.$$

Thus, any such root of $f(x)$ is a fixed point of the Newton-Raphson method.

Example 3.2

Use the Newton-Raphson method to determine the root of $f(x) = e^x - 2$, taking the initial guess to be $x^{(0)} = 0$.

In this case $f'(x) = e^x$ and so

$$x^{(k+1)} = x^{(k)} - \frac{f(x^{(k)})}{f'(x^{(k)})} = x^{(k)} - \frac{e^{x^{(k)}} - 2}{e^{x^{(k)}}}.$$

Thus, setting $k = 0$ gives

$$x^{(1)} = x^{(0)} - \frac{e^{x^{(0)}} - 2}{e^{x^{(0)}}} = 0 - \frac{e^{0} - 2}{e^{0}} = -\frac{1 - 2}{1} = 1,$$

and setting $k = 1$ gives

$$x^{(2)} = 1 - \frac{e^{1} - 2}{e^{1}} = 0.73576.$$

Continuing in a similar fashion provides

$$x^{(3)} = 0.73576 - \frac{e^{0.73576} - 2}{e^{0.73576}} = 0.69404 \quad \text{and} \quad x^{(4)} = 0.69404 - \frac{e^{0.69404} - 2}{e^{0.69404}} = 0.69315.$$

The exact answer is $\ln 2 \approx 0.69315$. This example demonstrates just how rapidly the Newton-Raphson method can converge to an answer.
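Example 3.2 can be checked numerically with a short script. The sketch below is illustrative only: the function name `newton_raphson` and the choice to run a fixed number of steps (rather than stop on a tolerance) are assumptions made here, not something specified in the notes.

```python
import math


def newton_raphson(f, df, x0, n_steps):
    """Return the list of iterates x(0), x(1), ..., x(n_steps)."""
    xs = [x0]
    for _ in range(n_steps):
        x = xs[-1]
        xs.append(x - f(x) / df(x))  # x(k+1) = x(k) - f(x(k)) / f'(x(k))
    return xs


# f(x) = e^x - 2 with f'(x) = e^x, starting from x(0) = 0
iterates = newton_raphson(lambda x: math.exp(x) - 2.0, math.exp, 0.0, 4)
for k, x in enumerate(iterates):
    print(f"x({k}) = {x:.5f}")
print(f"ln 2  = {math.log(2.0):.5f}")
```

After four steps the iterate agrees with $\ln 2$ to five decimal places.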

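[Figure: geometrical construction for the Newton-Raphson method, showing $f(x)$, the points $x^{(k)}$ and $x^{(k+1)}$, the value $f(x^{(k)})$ and the slope $f'(x^{(k)})$.]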


3.1 General Concepts

Fixed-point iteration can be visualised via a graphical construction called a cobweb. This construction involves forming graphs for $y = x$ and $y = \phi(x)$ on the same plot.

The points where $x = \phi(x)$ are the fixed points of the iterative scheme. This plot can then be used to follow the iterative process.

The figure is typical of a problem that has two fixed points, where one is a fixed point of attraction (i.e. $\alpha$) and the other is a fixed point of repulsion (i.e. $\beta$).

Although a fixed point of repulsion, $\beta$, satisfies $\beta = \phi(\beta)$, the iterative procedure $x^{(k+1)} = \phi(x^{(k)})$ does not converge to $\beta$.

One of the distinct differences between the two fixed points identified in the figure is that $|\phi'(\beta)| > 1$ whilst $|\phi'(\alpha)| < 1$.

It turns out that a sufficient condition for a fixed point $\alpha$ to be a fixed point of attraction is that $|\phi'(\alpha)| < 1$, whereas $|\phi'(\alpha)| > 1$ gives a fixed point of repulsion (the borderline case $|\phi'(\alpha)| = 1$ is inconclusive).

To prove this we need to make use of the Mean Value Theorem.

In addition, if $\alpha$ is a fixed point and $|\phi'(\alpha)| < 1$, then provided $x^{(0)}$ is sufficiently close to $\alpha$, the iterative scheme $x^{(k+1)} = \phi(x^{(k)})$ will converge.

READ SUPPLEMENTARY MATERIAL: PROOFS ARE NON-EXAMINABLE (PAGES 10 TO 11)

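[Figure: cobweb construction showing $y = x$ and $y = \phi(x)$, the iterates $x^{(0)}, x^{(1)}, \ldots, x^{(5)}$, the values $\phi(x^{(0)})$ and $\phi(x^{(1)})$, and the fixed points $\alpha$ and $\beta$.]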


Example 3.1.1

Investigate the behaviour of the following fixed-point iterative schemes,

(i) $x^{(k+1)} = \left( (x^{(k)})^2 + 2 \right)/3$,

(ii) $x^{(k+1)} = \sqrt{3x^{(k)} - 2}$,

at the roots of the polynomial $p(x) = x^2 - 3x + 2$, i.e. at $x = 1$ and $x = 2$.

Note that these schemes were obtained by simply manipulating the equation $x^2 - 3x + 2 = 0$ to give: (i) $x = (x^2 + 2)/3$, (ii) $x = \sqrt{3x - 2}$.

(i) Let $x^{(k+1)} = \left( (x^{(k)})^2 + 2 \right)/3 = \phi(x^{(k)})$ and note that, by construction, the fixed points of this scheme satisfy $\alpha = \phi(\alpha) = (\alpha^2 + 2)/3 \Rightarrow \alpha^2 - 3\alpha + 2 = 0$, and so are roots of $p(x)$.

Moreover, $\phi'(x) = 2x/3$ and $|\phi'(1)| = 2/3 < 1$, $|\phi'(2)| = 4/3 > 1$, so 1 is a fixed point of attraction whilst 2 is a fixed point of repulsion.

In addition, $|\phi'(x)| < 1 \Rightarrow -1 < 2x/3 < 1 \Rightarrow -3/2 < x < 3/2$, and so convergence to the fixed point at 1 is guaranteed if $a_1 = 1/2 < x^{(0)} < 3/2 = b_1$.

(ii) Let $x^{(k+1)} = \sqrt{3x^{(k)} - 2} = \phi(x^{(k)})$ and note that, by construction, the fixed points of this scheme satisfy $\alpha = \phi(\alpha) = \sqrt{3\alpha - 2} \Rightarrow \alpha^2 - 3\alpha + 2 = 0$, and so are roots of $p(x)$.

Moreover, $\phi'(x) = 3/\left( 2\sqrt{3x - 2} \right)$ and $|\phi'(1)| = 3/2 > 1$, $|\phi'(2)| = 3/4 < 1$, so 2 is a fixed point of attraction whilst 1 is a fixed point of repulsion in this case.
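The behaviour of the two schemes can be observed directly by iterating them. The following is a minimal sketch under stated assumptions: the helper `fixed_point`, its stopping rule (successive iterates closer than `tol`), and the starting values 1.9 and 1.1 are choices made here for illustration and do not come from the notes.

```python
import math


def fixed_point(phi, x0, tol=1e-10, max_iter=500):
    """Iterate x(k+1) = phi(x(k)) until successive iterates differ by less than tol."""
    x = x0
    for k in range(1, max_iter + 1):
        x_new = phi(x)
        if abs(x_new - x) < tol:
            return x_new, k
        x = x_new
    raise RuntimeError("no convergence within max_iter iterations")


def phi_i(x):
    """Scheme (i): phi(x) = (x^2 + 2) / 3."""
    return (x**2 + 2.0) / 3.0


def phi_ii(x):
    """Scheme (ii): phi(x) = sqrt(3x - 2)."""
    return math.sqrt(3.0 * x - 2.0)


# Start scheme (i) near its repelling fixed point 2, and scheme (ii) near its
# repelling fixed point 1; each should drift to its attracting fixed point.
root_i, steps_i = fixed_point(phi_i, 1.9)
root_ii, steps_ii = fixed_point(phi_ii, 1.1)
print(f"scheme (i):  {root_i:.8f} after {steps_i} iterations")
print(f"scheme (ii): {root_ii:.8f} after {steps_ii} iterations")
```

Starting scheme (i) close to 2 and scheme (ii) close to 1 illustrates repulsion: each sequence moves away from the nearby root and settles on the other one.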


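4. Rates of Convergence

It is important to appreciate that some fixed-point iterative schemes will converge to a solution at a faster rate (i.e. in fewer iterations) than others. To understand what influences convergence, consider the following:

$$\begin{aligned}
\varepsilon^{(k+1)} = x^{(k+1)} - \alpha &= \phi(x^{(k)}) - \alpha = \phi(\alpha + \varepsilon^{(k)}) - \alpha \\
&= \phi(\alpha) - \alpha + \varepsilon^{(k)} \phi'(\alpha) + \frac{(\varepsilon^{(k)})^2}{2!} \phi''(\alpha) + \frac{(\varepsilon^{(k)})^3}{3!} \phi'''(\alpha) + \ldots \\
&= \varepsilon^{(k)} \phi'(\alpha) + \frac{(\varepsilon^{(k)})^2}{2!} \phi''(\alpha) + \frac{(\varepsilon^{(k)})^3}{3!} \phi'''(\alpha) + \ldots
\end{aligned}$$

where $\phi(\alpha + \varepsilon^{(k)})$ has been replaced by its Taylor series and $\phi(\alpha) = \alpha$ has been used.

If $x^{(k)}$ is close to the fixed point $\alpha$ then higher powers of $\varepsilon^{(k)}$ will be small and so $\varepsilon^{(k+1)} \approx \varepsilon^{(k)} \phi'(\alpha)$.

It is therefore clearly advantageous to have $|\phi'(\alpha)|$ as small as possible, and ideally $\phi'(\alpha) = 0$ is best.

A fixed-point iterative method is said to be of 2nd order if $\phi'(\alpha) = 0$ and $\phi''(\alpha) \neq 0$, where in this case

$$\varepsilon^{(k+1)} \approx \frac{(\varepsilon^{(k)})^2}{2!} \phi''(\alpha).$$

Similarly, if $\phi'(\alpha) = \phi''(\alpha) = 0$ and $\phi'''(\alpha) \neq 0$ then the method is said to be 3rd order, etc.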

Example 4.1

Show that, if $\alpha$ is a root of $f(x)$ of multiplicity 1, then the Newton-Raphson method, $x^{(k+1)} = x^{(k)} - f(x^{(k)})/f'(x^{(k)})$, is at least a 2nd order method.

It is clear that each such root of $f(x)$ is a fixed point of $x^{(k+1)} = x^{(k)} - f(x^{(k)})/f'(x^{(k)}) = \phi(x^{(k)})$ since $\alpha - f(\alpha)/f'(\alpha) = \alpha$.

Note that

$$\phi'(x) = \frac{d}{dx}\left[ x - \frac{f(x)}{f'(x)} \right] = 1 - \left[ \frac{f'(x)}{f'(x)} - \frac{f(x) f''(x)}{\left( f'(x) \right)^2} \right] = \frac{f(x) f''(x)}{\left( f'(x) \right)^2}$$

and $\phi'(\alpha) = 0$ because $f(\alpha) = 0$ and $f'(\alpha) \neq 0$.
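The 2nd order behaviour can be observed numerically by tracking the ratio $\varepsilon^{(k+1)} / (\varepsilon^{(k)})^2$, which should approach $\phi''(\alpha)/2!$. The sketch below reuses $f(x) = e^x - 2$ from Example 3.2 as a test case; that choice, and the variable names, are assumptions made for illustration. For this $f$, differentiating $\phi'(x) = f f''/(f')^2$ once more and setting $f(\alpha) = 0$ gives $\phi''(\alpha) = f''(\alpha)/f'(\alpha) = 1$, so the ratio is expected to tend to $1/2$.

```python
import math

alpha = math.log(2.0)          # exact root of f(x) = e^x - 2
x = 0.0                        # initial guess x(0)
errors = [x - alpha]           # errors eps(k) = x(k) - alpha

for _ in range(4):
    x = x - (math.exp(x) - 2.0) / math.exp(x)   # Newton-Raphson step
    errors.append(x - alpha)

# Ratio eps(k+1) / eps(k)^2 for successive iterates
ratios = [errors[k + 1] / errors[k] ** 2 for k in range(len(errors) - 1)]
for k, r in enumerate(ratios):
    print(f"eps({k + 1}) / eps({k})^2 = {r:.4f}")
```

Only four steps are taken because the error soon approaches machine precision, after which the ratio is dominated by rounding.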


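5. Fixed Points for Functions of Several Variables

A system of non-linear equations can generally be written as

$$\begin{aligned}
f_1(x_1, x_2, \ldots, x_n) &= 0 \\
f_2(x_1, x_2, \ldots, x_n) &= 0 \\
f_3(x_1, x_2, \ldots, x_n) &= 0 \\
&\;\;\vdots \\
f_n(x_1, x_2, \ldots, x_n) &= 0,
\end{aligned} \tag{5}$$

where the objective is the determination of the vector roots of the form $x = (x_1, x_2, \ldots, x_n)^T = (\alpha_1, \alpha_2, \ldots, \alpha_n)^T = \alpha$, which satisfy this system.

It is convenient to use a vectorial representation for this system, i.e.

$$F(x) = F(x_1, x_2, \ldots, x_n) = \left( f_1(x), f_2(x), \ldots, f_n(x) \right)^T = \left( f_1(x_1, \ldots, x_n), f_2(x_1, \ldots, x_n), \ldots, f_n(x_1, \ldots, x_n) \right)^T = (0, 0, \ldots, 0)^T = 0,$$

where the functions $f_1, f_2, \ldots, f_n$ are called the co-ordinate functions of $F$. A fixed-point iterative scheme for this system takes the vectorial form

$$x^{(k+1)} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}^{(k+1)} = \begin{bmatrix} \phi_1(x_1^{(k)}, x_2^{(k)}, \ldots, x_n^{(k)}) \\ \phi_2(x_1^{(k)}, x_2^{(k)}, \ldots, x_n^{(k)}) \\ \vdots \\ \phi_n(x_1^{(k)}, x_2^{(k)}, \ldots, x_n^{(k)}) \end{bmatrix} = \begin{bmatrix} \phi_1(x^{(k)}) \\ \phi_2(x^{(k)}) \\ \vdots \\ \phi_n(x^{(k)}) \end{bmatrix} = \phi(x_1^{(k)}, x_2^{(k)}, \ldots, x_n^{(k)}) = \phi(x^{(k)})$$

and a fixed point satisfies

$$\alpha = \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_n \end{bmatrix} = \begin{bmatrix} \phi_1(\alpha) \\ \phi_2(\alpha) \\ \vdots \\ \phi_n(\alpha) \end{bmatrix} = \phi(\alpha_1, \alpha_2, \ldots, \alpha_n) = \phi(\alpha)$$

and, of course, we require the iterative scheme to be constructed so that its fixed points are the roots of the above non-linear system.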


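Example 5.1

Confirm that $\alpha = (0, 1)^T$ is a root of the 2-variable system

$$\begin{aligned}
f_1(x_1, x_2) &= x_1^2 x_2 - x_1^2 - 4 x_1 x_2 + 4 x_1 + 4 x_2 - 4 = 0 \\
f_2(x_1, x_2) &= x_1 x_2 - 2 x_1 = 0
\end{aligned}$$

This is true because

$$\begin{aligned}
f_1(\alpha_1, \alpha_2) &= f_1(0, 1) = 0^2 \times 1 - 0^2 - 4 \times 0 \times 1 + 4 \times 0 + 4 \times 1 - 4 = 0 \\
f_2(\alpha_1, \alpha_2) &= f_2(0, 1) = 0 \times 1 - 2 \times 0 = 0
\end{aligned}$$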

5.1 Newton's Method

Newton's method is just an extension of the single variable Newton-Raphson method. Its derivation can be obtained by considering a Taylor series expansion of the co-ordinate functions. In the single variable case we have

$$f_1(x^{(k+1)}) = f_1\left( x^{(k)} + (x^{(k+1)} - x^{(k)}) \right) = f_1(x^{(k)}) + (x^{(k+1)} - x^{(k)}) f_1'(x^{(k)}) + O\left( (x^{(k+1)} - x^{(k)})^2 \right)$$

and if $x^{(k+1)}$ is close to a root then $f_1(x^{(k+1)}) \approx 0$ and $(x^{(k+1)} - x^{(k)})^2 \approx 0$.

Setting these to zero in the above equation gives

$$0 = f_1(x^{(k)}) + (x^{(k+1)} - x^{(k)}) f_1'(x^{(k)}) \;\Rightarrow\; f_1'(x^{(k)})\, x^{(k+1)} = f_1'(x^{(k)})\, x^{(k)} - f_1(x^{(k)})$$

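which is, of course, the Newton-Raphson method.

Consider, then, a two-variable system, where the corresponding Taylor expansions are

$$f_1(x_1^{(k+1)}, x_2^{(k+1)}) \approx f_1(x_1^{(k)}, x_2^{(k)}) + (x_1^{(k+1)} - x_1^{(k)}) \frac{\partial f_1}{\partial x_1}(x_1^{(k)}, x_2^{(k)}) + (x_2^{(k+1)} - x_2^{(k)}) \frac{\partial f_1}{\partial x_2}(x_1^{(k)}, x_2^{(k)})$$

$$f_2(x_1^{(k+1)}, x_2^{(k+1)}) \approx f_2(x_1^{(k)}, x_2^{(k)}) + (x_1^{(k+1)} - x_1^{(k)}) \frac{\partial f_2}{\partial x_1}(x_1^{(k)}, x_2^{(k)}) + (x_2^{(k+1)} - x_2^{(k)}) \frac{\partial f_2}{\partial x_2}(x_1^{(k)}, x_2^{(k)})$$

Setting $f_1(x_1^{(k+1)}, x_2^{(k+1)}) = f_2(x_1^{(k+1)}, x_2^{(k+1)}) = 0$ provides, in matrix form,

$$\begin{bmatrix} \partial f_1/\partial x_1 & \partial f_1/\partial x_2 \\ \partial f_2/\partial x_1 & \partial f_2/\partial x_2 \end{bmatrix}^{(k)} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}^{(k+1)} = \begin{bmatrix} \partial f_1/\partial x_1 & \partial f_1/\partial x_2 \\ \partial f_2/\partial x_1 & \partial f_2/\partial x_2 \end{bmatrix}^{(k)} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}^{(k)} - \begin{bmatrix} f_1 \\ f_2 \end{bmatrix}^{(k)}$$

or $J^{(k)} x^{(k+1)} = J^{(k)} x^{(k)} - F^{(k)}$, where $J$ is the partial derivative matrix, known as the Jacobian matrix, and $F^{(k)} = F(x^{(k)}) = \left( f_1(x^{(k)}), f_2(x^{(k)}) \right)^T$.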


These ideas can be generalised to $n$ variables, where Newton's method takes the form $J^{(k)} x^{(k+1)} = J^{(k)} x^{(k)} - F^{(k)}$ and $J$ is an $n \times n$ matrix with coefficients $J_{ij} = \partial f_i / \partial x_j$.

It is clear that Newton's method is a fixed-point scheme since $x^{(k+1)} = x^{(k)} - \left( J^{(k)} \right)^{-1} F^{(k)} = \phi(x^{(k)})$ and $J(\alpha)\alpha = J(\alpha)\alpha - F(\alpha) \Rightarrow F(\alpha) = 0$, i.e. the fixed points of the method are roots of $F$.

Note that it is essential that the Jacobian is non-singular in the vicinity of the root.

Example 5.1.1

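Apply Newton's Method to the system in Example 5.1 starting with $x^{(0)} = (1, 1)^T$.

Taking partial derivatives of the co-ordinate functions provides

$$\begin{aligned}
\partial f_1/\partial x_1\,(x_1, x_2) &= 2 x_1 x_2 - 2 x_1 - 4 x_2 + 4 \\
\partial f_1/\partial x_2\,(x_1, x_2) &= x_1^2 - 4 x_1 + 4 \\
\partial f_2/\partial x_1\,(x_1, x_2) &= x_2 - 2 \\
\partial f_2/\partial x_2\,(x_1, x_2) &= x_1
\end{aligned}$$

and so $J^{(0)} = \begin{bmatrix} 0 & 1 \\ -1 & 1 \end{bmatrix}$, giving

$$\begin{bmatrix} 0 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}^{(1)} = \begin{bmatrix} 0 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}^{(0)} - \begin{bmatrix} f_1 \\ f_2 \end{bmatrix}^{(0)} = \begin{bmatrix} 0 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} - \begin{bmatrix} 0 \\ -1 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$$

and this implies that

$$\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}^{(1)} = \begin{bmatrix} 0 & 1 \\ -1 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & -1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}.$$

Proceeding in a similar fashion gives $J^{(1)} = \begin{bmatrix} 0 & 4 \\ -1 & 0 \end{bmatrix}$ and

$$\begin{bmatrix} 0 & 4 \\ -1 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}^{(2)} = \begin{bmatrix} 0 & 4 \\ -1 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}^{(1)} - \begin{bmatrix} f_1 \\ f_2 \end{bmatrix}^{(1)} = \begin{bmatrix} 0 & 4 \\ -1 & 0 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix}.$$

Thus,

$$\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}^{(2)} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}^{(1)} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$

and the method has in fact converged. In general convergence will take more than two iterations. However, Newton's Method can be shown to have 2nd order convergence and this is the prime reason for its use in modern numerical analysis.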
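The two-variable Newton iteration of Example 5.1.1 can be sketched in plain Python. This is an illustration under stated assumptions rather than a definitive implementation: the system is taken to be $f_1 = x_1^2 x_2 - x_1^2 - 4x_1x_2 + 4x_1 + 4x_2 - 4$ and $f_2 = x_1 x_2 - 2x_1$ (as read from the example), the function names `F`, `J` and `newton_2d` are hypothetical, and the linear system $J\,\delta = -F$ is solved with Cramer's rule purely to avoid external libraries.

```python
def F(x1, x2):
    """Co-ordinate functions f1 and f2 of the (assumed) system from Example 5.1."""
    f1 = x1**2 * x2 - x1**2 - 4.0 * x1 * x2 + 4.0 * x1 + 4.0 * x2 - 4.0
    f2 = x1 * x2 - 2.0 * x1
    return f1, f2


def J(x1, x2):
    """Jacobian matrix [[df1/dx1, df1/dx2], [df2/dx1, df2/dx2]]."""
    return ((2.0 * x1 * x2 - 2.0 * x1 - 4.0 * x2 + 4.0, x1**2 - 4.0 * x1 + 4.0),
            (x2 - 2.0, x1))


def newton_2d(x1, x2, tol=1e-12, max_iter=50):
    """Newton's method for a 2x2 system; returns the list of iterates."""
    history = [(x1, x2)]
    for _ in range(max_iter):
        f1, f2 = F(x1, x2)
        (a, b), (c, d) = J(x1, x2)
        det = a * d - b * c
        if det == 0.0:
            raise ZeroDivisionError("singular Jacobian")
        # Solve J * delta = -F by Cramer's rule, then update x = x + delta
        dx1 = (-f1 * d + b * f2) / det
        dx2 = (-a * f2 + c * f1) / det
        x1, x2 = x1 + dx1, x2 + dx2
        history.append((x1, x2))
        if abs(dx1) + abs(dx2) < tol:
            break
    return history


history = newton_2d(1.0, 1.0)
for k, (x1, x2) in enumerate(history):
    print(f"x({k}) = ({x1:.6f}, {x2:.6f})")
```

Solving for the update $\delta = x^{(k+1)} - x^{(k)}$ is algebraically the same as the form $J^{(k)} x^{(k+1)} = J^{(k)} x^{(k)} - F^{(k)}$ used in the notes; for larger systems a general linear solver would replace Cramer's rule.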


SUPPLEMENTARY MATERIAL: PROOFS NON-EXAMINABLE

3.1 Mean Value Theorem

If $h(x)$ is a continuous function over the interval $a \le x \le b$, then there exists a $\xi$ in this interval such that

$$\int_a^b h(x)\,dx = h(\xi)(b - a).$$

Although it is relatively easy to prove this formally, the figure illustrates the concept equally well. The basic idea is that $\xi$ can be found so that $A_2 + A_3 = A_2 + A_1$, which is clearly achievable for continuous functions.

3.2 Uniqueness and convergence proofs

This theorem can be used with $h(x) = \phi'(x)$ provided, of course, $\phi'(x)$ is a continuous function.

In this case, the Mean Value Theorem gives

$$\int_a^b \phi'(x)\,dx = \phi(b) - \phi(a) = \phi'(\xi)(b - a) \tag{3}$$

where $a \le \xi \le b$.

If we assume that $\phi'(x)$ is a continuous function then:

(i) There is at most one fixed point for the iterative scheme $x^{(k+1)} = \phi(x^{(k)})$ in an interval where $|\phi'(x)| < 1$.

Proof: Assume that both $\alpha$ and $\beta$ are fixed points of the scheme $x^{(k+1)} = \phi(x^{(k)})$, i.e. $\alpha = \phi(\alpha)$ and $\beta = \phi(\beta)$, with $\alpha \neq \beta$. The Mean Value Theorem can be applied between the points $\alpha$ and $\beta$ to give

$$|\beta - \alpha| = |\phi(\beta) - \phi(\alpha)| = |\phi'(\xi)(\beta - \alpha)| = |\phi'(\xi)|\,|\beta - \alpha| < |\beta - \alpha|$$

where the inequality was generated by making use of the fact that $\alpha \le \xi \le \beta$ and so $|\phi'(\xi)| < 1$.

Note that it is not possible for the inequality $|\beta - \alpha| < |\beta - \alpha|$ to be true, and so we must have $\alpha = \beta$, i.e. there is only one fixed point in the interval.

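[Figure: graph of $h(x)$ over $a \le x \le b$, showing the level $h(\xi)$, the point $\xi$ and the areas $A_1$, $A_2$ and $A_3$.]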


(ii) If $\alpha$ is a fixed point and $|\phi'(\alpha)| < 1$, then provided $x^{(0)}$ is sufficiently close to $\alpha$, the iterative scheme $x^{(k+1)} = \phi(x^{(k)})$ will converge.

Proof: Although a proof for (ii) is not conceptually difficult it is somewhat involved. The main problem is the need for a rigorous definition of the statement '$x^{(0)}$ is sufficiently close to $\alpha$'.

Continuity of $\phi'(x)$ at $x = \alpha$ ensures the existence of an interval $a \le x \le b$, containing $\alpha$, in which $|\phi'(x)| < 1$. Let us assume, for the time being, that each $x^{(k+1)}$ belongs to this interval so that $|\phi'(x^{(k+1)})| < 1$.

It is useful to introduce the positive number $u$ with the property that $|\phi'(x^{(k+1)})| \le u < 1$; the same bound is taken to hold at the intermediate points $\xi^{(k-1)}$ used below, which lie between $x^{(k-1)}$ and $\alpha$ and hence in the same interval.

Moreover, let us define the $k$th error to be $\varepsilon^{(k)} = x^{(k)} - \alpha = \phi(x^{(k-1)}) - \alpha$ and thus, with application of the Mean Value Theorem and $\phi(\alpha) = \alpha$, we have

$$|\varepsilon^{(k)}| = |x^{(k)} - \alpha| = |\phi(x^{(k-1)}) - \phi(\alpha)| = |\phi'(\xi^{(k-1)})(x^{(k-1)} - \alpha)| = |\phi'(\xi^{(k-1)})|\,|x^{(k-1)} - \alpha| \le u\,|x^{(k-1)} - \alpha| = u\,|\varepsilon^{(k-1)}|$$

from which we deduce that $|\varepsilon^{(k)}| \le u^k |\varepsilon^{(0)}|$, and hence

$$\lim_{k \to \infty} |\varepsilon^{(k)}| \le \lim_{k \to \infty} u^k |\varepsilon^{(0)}| = 0.$$

It is clear, therefore, that convergence is guaranteed with the assumption $|\phi'(x^{(k+1)})| < 1$ for all $k$.

To complete the proof we need to define an interval that ensures $|\phi'(x^{(k+1)})| < 1$ for all $k$. One rather conservative possibility is simply $a_1 \le x \le b_1$, where $a_1 = \alpha - d$, $b_1 = \alpha + d$ and $d = \min\{\alpha - a,\; b - \alpha\}$.
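The error bound $|\varepsilon^{(k)}| \le u^k |\varepsilon^{(0)}|$ can be checked numerically for a concrete scheme. The sketch below uses scheme (i) of Example 3.1.1, $\phi(x) = (x^2 + 2)/3$ with $\alpha = 1$; the starting value $x^{(0)} = 1.4$ and the symmetric interval $0.6 \le x \le 1.4$ are assumptions chosen here so that $u = \max|\phi'(x)| = 2(1.4)/3 < 1$ on the interval.

```python
def phi(x):
    """Scheme (i) of Example 3.1.1: phi(x) = (x^2 + 2) / 3, fixed point alpha = 1."""
    return (x**2 + 2.0) / 3.0


alpha = 1.0
x = 1.4                       # x(0), assumed starting value
u = 2.0 * 1.4 / 3.0           # max of |phi'(x)| = |2x/3| on 0.6 <= x <= 1.4
eps0 = abs(x - alpha)         # |eps(0)|

bound_holds = []
for k in range(1, 21):
    x = phi(x)
    # Compare the actual error |eps(k)| with the bound u^k * |eps(0)|
    bound_holds.append(abs(x - alpha) <= u**k * eps0)

print(all(bound_holds), abs(x - alpha))
```

The bound is conservative: the actual error shrinks faster than $u^k$ because $|\phi'|$ is smaller than $u$ near the fixed point.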

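[Figure: cobweb construction for $y = x$ and $y = \phi(x)$ showing the iterates $x^{(0)}, x^{(1)}, x^{(2)}$ approaching $\alpha$, together with the interval $a \le x \le b$ and the sub-interval $a_1 \le x \le b_1$ of half-width $d$ (drawn with $b_1 = b$).]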
