A Concise Introduction to Ordinary Differential Equations
David Protas
hcmth018/odebook.pdf · 2018-10-17



A Concise Introduction to Ordinary Differential Equations

David Protas
California State University, Northridge

Please send any comments or corrections to [email protected].

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.


Contents

Preface

Chapter 0. Introduction
1. Solutions of Ordinary Differential Equations

Chapter 1. First Order Equations
1. Preliminary Remarks
2. First Order Linear Equations
3. Some Other First Order Equations
4. Existence and Uniqueness
5. Numerical Methods
6. Direction Fields

Chapter 2. Linear Equations of Order Two and Higher
1. Preliminary Remarks
2. Definition of Linear Equations
3. A First Look at Constant Coefficients
4. Solution Spaces
5. Linear Homogeneous Equations of Order Two
6. Higher Order Linear Homogeneous Equations
7. Nonhomogeneous Linear Equations and the Annihilator Method
8. The Method of Variation of Parameters

Chapter 3. Series Solutions of Linear Equations
1. Preliminary Remarks
2. Power Series Solutions
3. Regular Singular Points and the Euler Equation
4. More on Equations with Regular Singular Points

Chapter 4. Systems of Differential Equations
1. Preliminary Remarks
2. Linear Systems and Vector Notation
3. Constant Coefficients
4. Complex Eigenvalues and Nonhomogeneous Systems
5. Nonlinear Systems

Chapter 5. The Geometry of Systems
1. Preliminary Remarks
2. The Phase Plane
3. Critical Points
4. Perturbations and the Almost Linear Property

Answers to Selected Exercises

Index


Preface

This book covers standard topics of a first (theoretically oriented) course in ordinary differential equations. It is assumed that the reader has had a class in linear algebra. Applications are mainly seen in the preliminary remarks to chapters and in the exercise sets. I have tried to create a work that allows most, if not all, of the material in it to be covered thoroughly in a one semester course.


CHAPTER 0

Introduction

1. Solutions of Ordinary Differential Equations

Many problems in science and engineering are represented in mathematical terms by equations involving a function of one variable along with one or more of its derivatives. Such an equation is called an ordinary differential equation. The word ‘ordinary’ is used because the function of one variable has ordinary derivatives as opposed to partial derivatives. One simple example of an ordinary differential equation is

x^2 y′′ − 6y = 0.

Here, y is an unknown function of the independent variable x. We say that φ₁(x) = x^3 is a solution of this equation on the interval (−∞,∞) since

x^2 φ₁′′(x) − 6φ₁(x) = 0

for all x ∈ (−∞,∞). Specifically, x^2 · 6x − 6 · x^3 = 0 for all x. (How we find the solution φ₁(x) = x^3 will be shown later.) Similarly, φ₂(x) = x^(−2) is a solution of the equation on the interval (−∞, 0) and on the interval (0,∞). In general, φ is a solution of an ordinary differential equation

F(x, y, y′, . . . , y^(n)) = 0

on an interval I if F(x, φ(x), φ′(x), . . . , φ^(n)(x)) = 0 for all x ∈ I. In the example above, F(x, y, y′, y′′) = x^2 y′′ − 6y. For a second example, consider the ordinary differential equation y′ = −2xy^2. A solution of this equation on (−∞,∞) is φ(x) = 1/(x^2 + 1). We have let F(x, y, y′) = y′ + 2xy^2.
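Verifications like these can also be spot-checked numerically. The short Python sketch below approximates the needed derivatives by central differences and confirms that the residual of each equation is numerically zero at a few sample points; the points, step sizes, and tolerances are arbitrary choices, not part of the text's argument.

```python
# Numerical spot check of the two verifications above.

def d1(f, x, h=1e-5):
    """Central-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def d2(f, x, h=1e-4):
    """Central-difference approximation to f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)

phi1 = lambda x: x**3               # claimed solution of x^2 y'' - 6y = 0
phi = lambda x: 1.0 / (x**2 + 1.0)  # claimed solution of y' = -2xy^2

for x in [-2.0, -0.5, 1.0, 3.0]:
    assert abs(x**2 * d2(phi1, x) - 6 * phi1(x)) < 1e-4
    assert abs(d1(phi, x) + 2 * x * phi(x)**2) < 1e-6
print("both residuals vanish to within rounding")
```

Such a check does not replace the algebraic verification, but it is a quick way to catch an error in a candidate solution.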

Typically, an ordinary differential equation will have many solutions on an interval. We often need to find a solution of the equation that also satisfies certain extra conditions. For example, the function φ(x) = x^3 + 5x^(−2) is a solution of the equation x^2 y′′ − 6y = 0 that also satisfies the conditions φ(1) = 6 and φ′(1) = −7. For this reason, we say that φ(x) = x^3 + 5x^(−2) is a solution of the initial value problem

x^2 y′′ − 6y = 0

y(1) = 6, y′(1) = −7.

Definition 0.1. Let x₀ be a point in an interval I, and let β₁, . . . , βₙ be numbers. Then

(1) F(x, y, y′, . . . , y^(n)) = 0

y(x₀) = β₁, y′(x₀) = β₂, . . . , y^(n−1)(x₀) = βₙ

is an initial value problem. A function φ is a solution of (1) on I if

F(x, φ(x), φ′(x), . . . , φ^(n)(x)) = 0


for all x in I and φ(x₀) = β₁, φ′(x₀) = β₂, . . . , φ^(n−1)(x₀) = βₙ.

The word ‘initial’ is used because the independent variable often represents time and we want to use information about a given time x₀ to predict what will happen later on.

Eventually, we will also be considering systems of ordinary differential equations. For now, however, we will deal with just one ordinary differential equation at a time.

We are not going to be able to find one method of solution that can be applied to all ordinary differential equations. Instead, we will define certain types of equations and find methods of solution appropriate to these specific types. We start with the idea of order. The order of an ordinary differential equation is the order of the highest derivative that appears in the equation. For example, the order of

y′′ + 5x^3 y′ − 6y = cos x

is 2, while the order of

y y′′′ + (y′′)^4 + 6x^5 = 0

is 3. In the next chapter, we will study equations of order 1.

Let us agree to say that two differential equations of the same order are equivalent if they have the same solution set. Often, when trying to solve a given differential equation, we will replace it by an equivalent equation that is in some sort of standard form or is easier to deal with for whatever reason. Most of the time, we will treat equivalent equations as though they were the same equation written in two different ways. Occasionally, we will replace an equation by a non-equivalent one when solving the second equation also gives us solutions of the first. It should be clear that y′ = −2xy^2 and y′ + 2xy^2 = 0 are equivalent. However, the ordinary differential equations y′ = −2xy^2 and y′/y^2 = −2x are not equivalent, since the constant function φ(x) ≡ 0 is a solution of the first but not of the second. Still, solutions of the second equation are also solutions of the first, and so solving the second equation is a way of finding solutions of the first.

Exercises

1. Verify that φ(x) = x^(−2) is a solution of x^2 y′′ − 6y = 0 on (−∞, 0) and (0,∞).

2. Verify that φ(x) = e^(3x) is a solution of y′′ − 4y′ + 3y = 0 on (−∞,∞).

3. Verify that φ(x) = (6 − 3x)^(1/3) is a solution of y′ = −1/y^2 on (−∞, 2) and (2,∞).

4. Verify that for each constant a, φ(x) = e^(ax) is a solution of y′ = ay on (−∞,∞).

5. Verify that for each constant a, φ₁(x) = cos ax and φ₂(x) = sin ax are solutions of y′′ + a^2 y = 0 on (−∞,∞).

6. Verify that φ(x) = x^3 + 5x^(−2) is a solution of the initial value problem x^2 y′′ − 6y = 0, y(1) = 6, y′(1) = −7 on (0,∞).


7. Verify that φ(x) = xe^(2x) is a solution of the initial value problem y′′ − 4y′ + 4y = 0, y(0) = 0, y′(0) = 1 on (−∞,∞).

8. Verify that φ(x) = √(4 − x^2) is a solution of the initial value problem y′ = −x/y, y(0) = 2 on (−2, 2).

9. State the order of y^(4) = x^5 y′′ − 12.

10. State the order of y′ = e^y y′′′ + x^4 (y′)^6.


CHAPTER 1

First Order Equations

1. Preliminary Remarks

Let A(t) be the number of ounces of a radioactive material left t months after a certain starting time. It is known that the rate of decay of the material is proportional to its amount. That is, A′ = kA, where k is a negative constant determined by the particular radioactive material under consideration. Let A₀ stand for the initial amount. We want to find the amount of material left at any time t. To do this we will solve an initial value problem involving a first order differential equation. If we replace A by y (and t by x) to conform to our usual choice of variables, we will be faced with the initial value problem

y′ = ky, y(0) = y₀,

which we will learn how to solve in the next two sections.

There is no general method that enables us to solve all first order ordinary differential equations. We will look at a number of special types of first order equations and develop methods for solving them; after that, we will see what we can do about the others.
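As the next sections will show, the solution of this decay problem is the familiar exponential A(t) = A₀e^(kt). Readers who want an early sanity check can test that candidate numerically; in the Python sketch below, the values of k and A₀ are made-up sample data, not taken from the text.

```python
# Check that A(t) = A0 * e^(k t) satisfies A' = kA and A(0) = A0.
# The decay constant and initial amount below are assumed sample values.

import math

k, A0 = -0.35, 16.0  # per month, ounces (hypothetical)

def A(t):
    return A0 * math.exp(k * t)

def d1(f, t, h=1e-6):
    """Central-difference approximation to f'(t)."""
    return (f(t + h) - f(t - h)) / (2 * h)

assert abs(A(0.0) - A0) < 1e-12          # initial condition
for t in [0.5, 2.0, 10.0]:
    assert abs(d1(A, t) - k * A(t)) < 1e-6   # A' = kA at sample times
print("A(t) = A0 * exp(k t) passes both checks")
```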

2. First Order Linear Equations

Definition 1.1. A first order ordinary differential equation is said to be linear if it is of the form

y′ + a(x)y = g(x),

where a, g are functions defined on some interval I. If g(x) = 0 for all x ∈ I, then the linear equation is called homogeneous. Otherwise, it is called nonhomogeneous.

Example. (a) The equation y′ − 4x^2 y = 0 is a first order linear homogeneous equation. (We will assume that the interval I is (−∞,∞) since we see no reason to avoid any values of x.)
(b) The equation 3y′ = (ln x)y is equivalent to the first order linear homogeneous equation y′ − ((1/3) ln x)y = 0 (on the interval (0,∞)).
(c) The linear equation y′ + e^(3x) y = sin x is nonhomogeneous.
(d) The equation y′ + e^(3x) y^2 = sin x is not linear.

Our goal now is to develop a method for solving first order linear equations. We will start by working out a couple of specific examples with the idea of generalizing what we learn from them.

Example. Consider the equation y′ + 8x^3 y = 0 (on the interval (−∞,∞)). Since the equation involves a derivative, it is tempting to try to integrate to solve for y. Integrating the right side of the equation presents no problem; we simply get a constant. However, we do not know how to integrate the left side since it involves an unknown function y and is not in any obvious way the derivative of something we can find. We will change the left side into a manageable expression by multiplying through the equation by an appropriately chosen function.

Multiply both sides of the equation by e^(2x^4). (You should be wondering where this expression came from. The answer will be supplied shortly.) We get

e^(2x^4) y′ + 8x^3 e^(2x^4) y = 0.

Notice that (e^(2x^4) y)′ = e^(2x^4) y′ + 8x^3 e^(2x^4) y. Thus, our equation can be written

(e^(2x^4) y)′ = 0.

Now integration is easy. We get e^(2x^4) y = C, where C is an arbitrary constant. We conclude that if y is any solution of our equation, then it must be of the form y = Ce^(−2x^4). Furthermore, since all of the steps we went through are reversible, every function given by y = Ce^(−2x^4) is a solution.

Now let’s start to examine how we came up with e^(2x^4) as the function to multiply through the equation in the last example. Notice that ∫ 8x^3 dx = 2x^4 + K, and so e^(2x^4) was e raised to an antiderivative of the coefficient of y. We will see shortly that, in general, when solving y′ + a(x)y = b(x), multiplying through by e raised to an antiderivative of a(x) gives us an expression we can integrate.

First, however, let’s examine what would have happened in the last example if we had chosen a different antiderivative. Let’s multiply through y′ + 8x^3 y = 0 by e^(2x^4 + K), for any (non-zero) value of K. Since e^(2x^4 + K) = e^(2x^4) e^K, we get

e^K e^(2x^4) y′ + e^K 8x^3 e^(2x^4) y = 0.

Now dividing by e^K gets us back exactly to where we were in the example when we did not have the constant K. This shows that the choice of the particular antiderivative, in other words the choice of K, doesn’t matter. In practice, most people most of the time use K = 0 in this type of problem.
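The reversibility claim can be tested numerically: every choice of the constant C should produce a solution of y′ + 8x^3 y = 0. A small Python sketch (the sample values of C and x are arbitrary choices):

```python
# Verify numerically that y = C * e^(-2 x^4) solves y' + 8 x^3 y = 0
# for several values of the arbitrary constant C.

import math

def solution(C):
    """The candidate solution y(x) = C * exp(-2 x^4)."""
    return lambda x: C * math.exp(-2 * x**4)

def d1(f, x, h=1e-6):
    """Central-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

for C in [1.0, -3.5, 0.0]:
    y = solution(C)
    for x in [-1.0, 0.0, 0.7]:
        assert abs(d1(y, x) + 8 * x**3 * y(x)) < 1e-5
print("y = C * exp(-2 x^4) satisfies the equation for every C tried")
```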

To emphasize the idea that only one particular antiderivative is needed in the process we are developing, we will use the following somewhat unconventional notation.

Definition 1.2. For any continuous function a, let the symbol ∫* a(x) dx stand for any one particular antiderivative of a.

Example. Consider the equation y′ = 12x + 6xy (on the interval (−∞,∞)). This equation is equivalent to the linear equation y′ − 6xy = 12x. Compute ∫* −6x dx = −3x^2, and let µ = e^(−3x^2). Multiplying our differential equation by µ gives

e^(−3x^2) y′ − 6x e^(−3x^2) y = 12x e^(−3x^2).

However, (e^(−3x^2) y)′ = e^(−3x^2) y′ − 6x e^(−3x^2) y, and so the equation becomes

(e^(−3x^2) y)′ = 12x e^(−3x^2).

Integrating, we get e^(−3x^2) y = −2e^(−3x^2) + C. Thus, if y is any solution of y′ − 6xy = 12x, then it is of the form y = −2 + Ce^(3x^2). As in the last example, we can retrace our steps to see that every function of this form is a solution.

Before leaving this example, let’s see whether we can find a solution φ that satisfies the additional condition that φ(0) = 2. If φ is a solution, there must be a constant C such that φ(x) = −2 + Ce^(3x^2) for all x ∈ (−∞,∞). Then at x = 0, the additional condition says that −2 + Ce^0 = 2, and so φ satisfies this additional condition if and only if C = 4. In other words, the initial value problem

y′ − 6xy = 12x, y(0) = 2

has the unique solution φ(x) = −2 + 4e^(3x^2) on (−∞,∞).

Now we are ready to solve first order linear equations in general.

Theorem 1.1. Consider the first order linear differential equation

y′ + a(x)y = g(x),

where a and g are continuous functions on some interval I. Put A(x) = ∫* a(x) dx. Then the equation has infinitely many solutions on I, all of which are given by

(1) φ(x) = e^(−A(x)) ∫ e^(A(x)) g(x) dx.

Furthermore, for any x₀ ∈ I and any number α, there is exactly one solution on I of the initial value problem

y′ + a(x)y = g(x), y(x₀) = α.

Proof. φ is a solution of y′ + a(x)y = g(x)

⇔ φ′(x) + a(x)φ(x) = g(x)

⇔ e^(A(x)) φ′(x) + e^(A(x)) a(x)φ(x) = e^(A(x)) g(x)

⇔ (e^(A(x)) φ)′ = e^(A(x)) g(x)

⇔ e^(A(x)) φ = ∫ e^(A(x)) g(x) dx

⇔ φ(x) = e^(−A(x)) ∫ e^(A(x)) g(x) dx.

Let’s note that the indefinite integral ∫ e^(A(x)) g(x) dx represents a family of functions, one of which is ∫_{x₀}^{x} e^(A(t)) g(t) dt. Then we can write

(2) φ(x) = e^(−A(x)) [ ∫_{x₀}^{x} e^(A(t)) g(t) dt + C ].

From this it easily follows that there is exactly one solution of the differential equation that also satisfies the condition y(x₀) = α, and it is the one corresponding to C = α e^(A(x₀)).

We should note that the function

µ = µ(x) = e^(∫* a(x) dx)

is often referred to as an integrating factor. Also, (1) is referred to as the general solution of the first order equation. It is the collection of all solutions of the equation and involves an arbitrary constant C, which is shown explicitly in (2).
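The formula with the definite integral also suggests a direct numerical recipe: approximate the integral by a quadrature rule and fix the constant by the initial condition. The Python sketch below does this with the composite trapezoid rule for the earlier initial value problem y′ − 6xy = 12x, y(0) = 2; the quadrature rule and the sample equation are choices made here for illustration, not part of the theorem.

```python
# Evaluate the solution of y' + a(x) y = g(x), y(x0) = alpha, at a point x
# via phi(x) = e^{-A(x)} [ integral_{x0}^{x} e^{A(t)} g(t) dt + C ],
# C = alpha * e^{A(x0)}, using the composite trapezoid rule.

import math

def solve_linear_ivp(A, g, x0, alpha, x, n=4000):
    """A is an antiderivative of a(x); n is the number of trapezoid panels."""
    C = alpha * math.exp(A(x0))
    h = (x - x0) / n
    f = lambda t: math.exp(A(t)) * g(t)
    total = (f(x0) + f(x)) / 2.0
    total += sum(f(x0 + i * h) for i in range(1, n))
    return math.exp(-A(x)) * (total * h + C)

# Sample problem: y' - 6xy = 12x, y(0) = 2, with solution -2 + 4 e^{3x^2}.
A = lambda x: -3 * x**2        # an antiderivative of a(x) = -6x
g = lambda x: 12 * x
for x in [0.3, 0.8, 1.0]:
    exact = -2 + 4 * math.exp(3 * x**2)
    assert abs(solve_linear_ivp(A, g, 0.0, 2.0, x) - exact) < 1e-3 * abs(exact)
print("quadrature version of the formula matches the closed form")
```

This kind of numerical evaluation becomes genuinely useful when the integral in (1) has no elementary antiderivative.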

Example. Solve the initial value problem

xy′ + 2y = 3x^4, y(1) = 8.


Dividing through by x, we get

y′ + (2/x) y = 3x^3,

which we will think of as the equation in standard form. The largest interval containing x = 1 for which this equation is well defined is (0,∞), and we will look for a solution in this interval; however, it could be that we might only want to consider x ≥ 1 so that x = 1 is an initial point.

Rather than just plugging into the formula given in the last theorem, we will find it more instructive to recreate the process that is described in its proof. Since x > 0, ∫* (2/x) dx = 2 ln x, and so the integrating factor is

µ = e^(2 ln x) = e^(ln(x^2)) = x^2.

Multiplication through the equation in standard form by the integrating factor gives us

x^2 y′ + 2xy = 3x^5,

and so,

(x^2 y)′ = 3x^5.

Integrating this, we get x^2 y = (1/2)x^6 + C. Thus, the differential equation has as its general solution

y = (1/2)x^4 + Cx^(−2).

Now we use the initial condition y(1) = 8. Plugging x = 1 and y = 8 into the general solution, we get 8 = 1/2 + C, and so C = 15/2. The solution to the initial value problem is therefore

y = (1/2)x^4 + (15/2)x^(−2)

for all x > 0.
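As a final sanity check, the answer can be verified numerically against the original (undivided) equation and the initial condition; the sample points below are arbitrary.

```python
# Confirm that phi(x) = x^4 / 2 + (15/2) x^{-2} solves x y' + 2y = 3 x^4
# with phi(1) = 8, approximating y' by a central difference.

def phi(x):
    return 0.5 * x**4 + 7.5 / x**2

def d1(f, x, h=1e-6):
    """Central-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

assert abs(phi(1.0) - 8.0) < 1e-12                    # initial condition
for x in [0.5, 1.0, 2.0, 4.0]:
    assert abs(x * d1(phi, x) + 2 * phi(x) - 3 * x**4) < 1e-4
print("initial value problem checks out on (0, infinity)")
```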

We should note that in the next chapter, the notion of a first order linear differential equation will be extended slightly to include equations of the form

a₀(x)y′ + a(x)y = g(x).

This will have no effect on what we have done so far; wherever a₀(x) is not zero, we can, as we did in the last example, divide through by it to get an equation in the standard form we have been dealing with in this section.

Exercises

Find the general solution for each of the following equations.

1. y′ + 5y = e^(2x)

2. y′ − 4y = 2e^(3x)

3. 2y′ − 4xy = 3e^(x^2)

4. 3y′ + 12xy = e^(−2x^2)

5. y′ = −(2/x)y + 5x^7

6. y′ = (3/x)y + x^2


Find the solution of each of the following initial value problems.

7. y′ = 2xy + e^(x^2), y(0) = 4

8. y′ = 8xe^(−x^3) − 3x^2 y, y(1) = 0

9. y′ + 12xy = x, y(0) = 5

10. 2y′ − 6xy = 5x, y(0) = 3

11. Let A(t) be the number of ounces of a radioactive material left t months after a certain starting time, and let A₀ be the initial amount. Find A(t) by solving the initial value problem A′ = kA, A(0) = A₀.

12. Consider an electrical circuit containing only a resistor and an inductor. Let i(t) be the current. If R is the resistance and L is the inductance (both constant) and there is a constant impressed voltage E, then i satisfies the differential equation

L di/dt + Ri = E.

If i(0) = i₀, solve for i.

13. Suppose that φ₁ is a solution of y′ + a(x)y = g₁(x) and φ₂ is a solution of y′ + a(x)y = g₂(x) on an interval I.
(a) If φ = φ₁ + φ₂, prove that φ is a solution of y′ + a(x)y = g₁(x) + g₂(x).
(b) Use the method indicated by part (a) to solve y′ + 3y = x + e^(−x).

14. If φ is a solution of y′ + a(x)y = g(x), find a solution of y′ + a(x)y = 2g(x). More generally, if k is any number, find a solution of y′ + a(x)y = kg(x).

15. Prove that the solutions of y′ + a(x)y = 0 form a one dimensional subspace of the vector space of all differentiable functions on I.

3. Some Other First Order Equations

We will now look at some equations that are not linear. Of the many possible types, we will study a few of the most standard ones. The first type will involve separating variables.

Example. Consider the equation y′ = 6x/y^2 for x ∈ (−∞,∞). Let’s assume that φ is a solution of this equation. Then

φ′(x) = 6x/[φ(x)]^2.

We would like to integrate both sides of the equation, but we are not ready to do so, yet. First multiply through by [φ(x)]^2. We get

[φ(x)]^2 φ′(x) = 6x.

Now we can say

∫ [φ(x)]^2 φ′(x) dx = ∫ 6x dx.

Evaluating these integrals (and using the substitution u = φ(x), du = φ′(x) dx for the left-hand one), we get

[φ(x)]^3 / 3 = 3x^2 + C,

and so if φ is a solution, then

φ(x) = (9x^2 + 3C)^(1/3).

Furthermore, it is easy to check that every function of this form is, in fact, a solution. Note that since 3C can stand for any number just as C can, most people would write the answer as φ(x) = (9x^2 + C)^(1/3).

The process we just went through can be given more succinctly as follows. Starting with y′ = 6x/y^2, we first write the derivative in Leibniz notation so that we have dy/dx = 6x/y^2. Then we separate the variables (including the differentials), getting y^2 dy = 6x dx, and so,

∫ y^2 dy = ∫ 6x dx.

From this, we get y^3/3 = 3x^2 + C. Therefore, y = (9x^2 + C)^(1/3), as before.

If integrating with respect to y on one side of the equation while integrating with respect to x on the other side seems suspicious to you, just note how it corresponds to integration by substitution when we went through the problem originally.
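The family of answers can be checked numerically just as in the linear case; the sample constant and sample points below are arbitrary choices.

```python
# Check that phi(x) = (9 x^2 + C)^{1/3} solves y' = 6x / y^2
# for a sample constant C > 0 (so the cube root stays positive and smooth).

C = 5.0
phi = lambda x: (9 * x**2 + C) ** (1.0 / 3.0)

def d1(f, x, h=1e-6):
    """Central-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

for x in [-2.0, 0.0, 1.5]:
    assert abs(d1(phi, x) - 6 * x / phi(x)**2) < 1e-6
print("separable solution verified at the sample points")
```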

Definition 1.3. A first order differential equation is said to be a separable equation if it is of the form y′ = −M(x)/N(y).

Note that a separable equation can be rewritten as N(y) dy = −M(x) dx, which helps explain why the word ‘separable’ is used. The only reason for the minus sign is to make the notation consistent with something else coming shortly.

As in the case of the first example in this section, our method of solution will be to separate the variables, integrate, and then solve for the dependent variable. Instead of formalizing this as a theorem, we will just work through a few more examples. There are some pitfalls in this process, as the examples will show.

Example. Consider the equation y′ + 2xy = 0. This equation is linear and can be solved as such. However, it is possible to separate the variables, and it will be instructive to attack the problem in this way. We have dy/dx = −2xy, and so, y^(−1) dy = −2x dx. Then,

∫ y^(−1) dy = ∫ −2x dx.

From this we get ln |y| = −x^2 + C and then

e^(ln |y|) = e^(−x^2 + C),

|y| = e^C e^(−x^2),

and finally

y = ±e^C e^(−x^2).

Put K = ±e^C. Since C can be any number, it follows that K can be any non-zero number. Thus, we have y = Ke^(−x^2) for any K ≠ 0.

Notice that we have missed finding the constant solution y ≡ 0 (which corresponds to K = 0). This happened because when we were separating the variables, we divided by y, which is only allowed if y ≠ 0. If we had tried to write the equation in standard form for a separable equation, we would have had early warning of a possible problem involving y = 0 because we would have had N(y) = 1/y.

In the next two examples, we will need to be concerned about the domain a solution can have.


Example. Consider the equation xy′ = y^2. Separating the variables leads to y^(−2) dy = x^(−1) dx, which in turn gives −1/y = ln |x| + C after integration. Thus, y = −1/(ln |x| + C). Of course, we have to exclude x = 0 from the domain of any such solution because of the logarithm. Furthermore, we must avoid any x such that ln |x| + C = 0. In other words, both x = 0 and x = ±e^(−C) must be excluded from the domain of the solution y = −1/(ln |x| + C). Also, our method has missed finding y ≡ 0, which is obviously a solution.

Example. Let’s solve the equation y′ = a^2 + y^2, where a is any constant. Separating the variables gives us

dy/(a^2 + y^2) = dx.

If a ≠ 0, we are led to (1/a) tan^(−1)(y/a) = x + C after integration. Then y/a = tan(ax + aC), and so y = a tan(ax + aC). We can simplify the form of the answer, writing y = a tan(ax + C) since aC represents an arbitrary constant. Note that because of the tangent function, we should restrict x so that ax + C does not equal odd integer multiples of π/2. On the other hand, if a = 0, the reader can check that y = −1/(x + C) for x ≠ −C and y ≡ 0 are solutions.

Our next two examples show that initial value problems involving separable equations do not necessarily have unique solutions. In the first example, we have no solutions; in the second, we have more than one.

Example. Consider the initial value problem

y^2 y′ = x^2, y(1) = 0.

This problem obviously has no solution, as can be seen by trying to plug x = 1 and y = 0 into the equation simultaneously. Let’s apply the method of separation of variables to this equation anyway to see what happens. We get y^2 dy = x^2 dx, and then y^3/3 = x^3/3 + C. The initial condition implies 0 = 1/3 + C. Thus, C = −1/3 and we wind up with the function y = (x^3 − 1)^(1/3), which is not differentiable at x = 1.

Example. Consider the initial value problem

y′ = 3xy^(1/3), y(0) = 0.

Separating the variables of the differential equation leads to y^(−1/3) dy = 3x dx, which in turn gives (3/2)y^(2/3) = (3/2)x^2 + C after integration. Even though N(y) = y^(−1/3) does not exist at y = 0, we will find solutions for this example by using the initial condition y(0) = 0 along with what we have so far. We get 0 = 0 + C and so, C = 0. Then y^(2/3) = x^2. Solving for y carefully gives us y = x^3 and y = −x^3. As can be checked directly, both y = x^3 and y = −x^3 are solutions of the initial value problem on the interval (−∞,∞). Note that y ≡ 0 is yet another solution of the initial value problem on (−∞,∞).
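The non-uniqueness is easy to exhibit numerically: all three candidates satisfy the same equation and the same initial condition. A short Python sketch (sample points arbitrary):

```python
# Three distinct solutions of y' = 3 x y^{1/3}, y(0) = 0: x^3, -x^3, and 0.

import math

def cbrt(v):
    """Real cube root, valid for negative arguments as well."""
    return math.copysign(abs(v) ** (1.0 / 3.0), v)

def d1(f, x, h=1e-6):
    """Central-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

candidates = [lambda x: x**3, lambda x: -x**3, lambda x: 0.0]
for y in candidates:
    assert y(0.0) == 0.0                                  # initial condition
    for x in [-1.5, 0.5, 2.0]:
        assert abs(d1(y, x) - 3 * x * cbrt(y(x))) < 1e-4  # the equation
print("three distinct solutions of the same initial value problem")
```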

The difficulties we have encountered in solving separable equations should, perhaps, make us appreciate the situation for linear equations, where there are no surprises involving the domain of a solution, and where every initial value problem has a unique solution.

We will now move on to another type of first order equation that has been studied extensively.


Definition 1.4. Consider a first order differential equation in the form

y′ = −M(x, y)/N(x, y),

or,

M(x, y) + N(x, y)y′ = 0.

The equation is said to be exact in a rectangle

R = {(x, y) : |x − x₀| ≤ a, |y − y₀| ≤ b}

if M and N are continuous for all (x, y) ∈ R and there exists a function F such that

∂F/∂x = M and ∂F/∂y = N

throughout R.

Let’s start by showing that a particular example is exact. Our method is identical to the process of finding a function F of two variables whose gradient ∇F is a given vector field.

Example. Consider the differential equation y′ = −2xy/(x^2 + 2y). If you like, this can be rewritten as 2xy + (x^2 + 2y)y′ = 0. In any event, we can set M(x, y) = 2xy and N(x, y) = x^2 + 2y. To show that the equation is exact, we must find a function F such that

#1 ∂F/∂x = 2xy

#2 ∂F/∂y = x^2 + 2y.

Starting with the first of these two conditions, we can integrate with respect to x while holding y constant. This gives

F(x, y) = x^2 y + K(y),

where K(y) is constant with respect to the variable of integration x but might depend on y. This, in turn, leads to

∂F/∂y = x^2 + K′(y).

Comparing this to #2 gives us x^2 + K′(y) = x^2 + 2y, and so, K′(y) = 2y. Since it is enough to find just one function F, we can simply choose K(y) = ∫* 2y dy = y^2. We wind up with

F(x, y) = x^2 y + y^2,

which certainly has the desired first partial derivatives. Furthermore, it has the desired derivatives at all points (x, y). Therefore, the equation is considered to be exact in the entire plane R^2, not just on a bounded rectangle.

Notice that we still haven’t solved the differential equation. However, having found a function F that shows exactness, that will be easy. The next theorem when applied to our example says that if y is any function of x implicitly defined by the equation x^2 y + y^2 = C, then y is a solution of the exact equation. Here, C is an arbitrary constant. Since it is possible in this example, let’s solve for y explicitly.


Using the quadratic formula on y^2 + x^2 y − C = 0 to solve for the unknown y, we get

y = (−x^2 ± √(x^4 + 4C)) / 2.

We should note that solutions corresponding to negative values of C will have restricted domains.
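The implicit description can be tested numerically: along the explicit branch, F(x, y(x)) = x^2 y + y^2 should stay equal to the same constant C. A Python sketch (the value of C and the sample points are arbitrary):

```python
# For the exact equation 2xy + (x^2 + 2y) y' = 0 with F(x, y) = x^2 y + y^2,
# check that F is constant along y(x) = (-x^2 + sqrt(x^4 + 4C)) / 2.

import math

C = 3.0
y = lambda x: (-x**2 + math.sqrt(x**4 + 4 * C)) / 2
F = lambda x, yv: x**2 * yv + yv**2

for x in [0.0, 1.0, 2.5]:
    assert abs(F(x, y(x)) - C) < 1e-9
print("F(x, y(x)) = C along the solution curve")
```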

Theorem 1.2. Suppose the equation

M(x, y) + N(x, y)y′ = 0

is exact in a rectangle R, with F being a function such that

∂F/∂x = M and ∂F/∂y = N

in R. Then, every differentiable function φ defined implicitly by F(x, y) = C, where C is a constant, is a solution of the exact equation. Also, every solution of the exact equation whose graph lies in R can be found this way.

Proof. Suppose φ is a differentiable function defined implicitly by F(x, y) = C. In other words, F(x, φ(x)) = C for all x in an interval I (where R = I × I′). Put W(x) = F(x, φ(x)). Then,

0 = W′(x) (since W(x) = C)
= Fx(x, φ(x)) · 1 + Fy(x, φ(x)) · φ′(x) (by the chain rule)
= M(x, φ(x)) + N(x, φ(x)) · φ′(x).

Therefore, φ is a solution of M(x, y) + N(x, y)y′ = 0.

Conversely, suppose φ is a solution of M(x, y) + N(x, y)y′ = 0. In other words, M(x, φ(x)) + N(x, φ(x)) · φ′(x) = 0. Then Fx(x, φ(x)) + Fy(x, φ(x)) · φ′(x) = 0. With W defined as above, this says that W′(x) = 0, and so, W is constant. In other words, F(x, φ(x)) = C.

Any equation

M(x, y) + N(x, y)y′ = 0

can be written using differentials as

M(x, y) dx + N(x, y) dy = 0.

In particular, any exact equation can be written as

∂F/∂x dx + ∂F/∂y dy = 0.

The above proof can then be summarized by dF = 0 if and only if F = C.

Example. Solve the equation −3x/y + y/x + 2y′ = 0. This can be rewritten as −3x^2 + y^2 + 2xyy′ = 0. To try to solve this, let’s look for a function F as above, in the hope that the equation is exact. We want to find F(x, y) such that

#1 ∂F/∂x = −3x^2 + y^2

#2 ∂F/∂y = 2xy.

Starting with #1, we integrate with respect to x while holding y constant to get

F(x, y) = −x^3 + xy^2 + K(y).

When we then take the partial derivative of this with respect to y and compare the result to #2, we get 2xy + K′(y) = 2xy. Thus K is any constant, and we might as well choose K = 0 since we only need to find one function F. Therefore, we have F(x, y) = −x^3 + xy^2, which satisfies both condition #1 and condition #2. We are not finished, however. We were supposed to solve a differential equation, and all we have so far is a function of two variables, which does not tell us what y is. The solutions of 2y′ = 3x/y − y/x are given implicitly by −x^3 + xy^2 = C. Therefore,

y = ±√((x^3 + C)/x).

We should note that for each C, the corresponding solution has a restricted domain.

Example. Solve the equation x² + y + y²y′ = 0. Proceeding as above, we want to find F(x, y) such that

#1  ∂F/∂x = x² + y

#2  ∂F/∂y = y².

Starting with #1, we integrate with respect to x while holding y constant to get

F(x, y) = x³/3 + xy + K(y).

When we then take the partial derivative of this with respect to y and compare the result to #2, we get x + K′(y) = y². This implies that K′(y) = −x + y², which is clearly impossible since K′(y) cannot vary with x. We conclude that no function F satisfies both conditions #1 and #2. In other words, the differential equation is not exact. Since it also is neither linear nor separable, we will have to leave it unsolved, for now.

The last example shows that it would be desirable to have a quick method for determining whether or not a first order differential equation is exact. The following result, which gives us such a method, parallels the standard condition for determining whether or not a 2-dimensional vector field is conservative.

Theorem 1.3. Let M and N be real valued functions of two real variables that have continuous first partial derivatives on a rectangle

R = {(x, y) : |x − x₀| ≤ a, |y − y₀| ≤ b}.

Then the equation M(x, y) + N(x, y)y′ = 0 is exact if and only if

(3)  ∂M/∂y = ∂N/∂x

in R.

Proof. First, suppose that M(x, y) + N(x, y)y′ = 0 is exact. There is a function F such that

∂F/∂x = M  and  ∂F/∂y = N

in R. Differentiate to get

∂²F/∂y∂x = ∂M/∂y  and  ∂²F/∂x∂y = ∂N/∂x.


But F has continuous second partial derivatives since M and N have continuous first partial derivatives, and so by a standard result of calculus,

∂²F/∂y∂x = ∂²F/∂x∂y.

Therefore, (3) holds throughout R.

Conversely, assume that (3) holds in R. We need to find a function F such that

∂F/∂x = M  and  ∂F/∂y = N

in R. If such a function F exists, then using the Fundamental Theorem of Calculus, we will have

F(x, y) − F(x₀, y₀) = F(x, y) − F(x₀, y) + F(x₀, y) − F(x₀, y₀)

    = ∫_{x₀}^{x} ∂F/∂x(s, y) ds + ∫_{y₀}^{y} ∂F/∂y(x₀, t) dt

    = ∫_{x₀}^{x} M(s, y) ds + ∫_{y₀}^{y} N(x₀, t) dt.

Using this as a guide, define F on R by

F(x, y) = ∫_{x₀}^{x} M(s, y) ds + ∫_{y₀}^{y} N(x₀, t) dt.

It then follows immediately that

∂F/∂x(x, y) = M(x, y)

for all (x, y) ∈ R. Also, for each (x, y) ∈ R,

∂F/∂y(x, y) = ∫_{x₀}^{x} ∂M/∂y(s, y) ds + N(x₀, y)

    = ∫_{x₀}^{x} ∂N/∂x(s, y) ds + N(x₀, y)

    = N(x, y) − N(x₀, y) + N(x₀, y)

    = N(x, y).

(In the preceding, bringing the partial derivative with respect to y inside the integral with respect to s is allowed since M has continuous first partial derivatives.) Therefore, the equation is exact in R.

The hypothesis that the domain under consideration was a rectangle R was used to ensure that the integrations defining F were over lines in the domain. However, using some more advanced techniques, it could be shown that the theorem is true for any simply connected domain. Roughly speaking, this means that the domain has no holes.

The reader should test the last theorem on all of the examples that preceded it. In particular, for x² + y + y²y′ = 0, we have M(x, y) = x² + y and N(x, y) = y². This implies that My(x, y) = 1 while Nx(x, y) = 0, which confirms the fact that the differential equation is not exact. On the other hand, the equation −3x/y + y/x + 2y′ = 0 involves some subtleties that will be discussed immediately after the next example.


Example. Solve the initial value problem

y′ = −(2xy + 1)/(x² + cos y),  y(0) = 0.

The differential equation can be rewritten as

2xy + 1 + (x² + cos y)y′ = 0,

or

(2xy + 1) dx + (x² + cos y) dy = 0.

In any event, M(x, y) = 2xy + 1 and N(x, y) = x² + cos y. Before going any further, we can check whether the equation is exact by computing My(x, y) = 2x and Nx(x, y) = 2x. Now that we know that the equation is exact, let's look for a function F such that

#1  ∂F/∂x = 2xy + 1

#2  ∂F/∂y = x² + cos y.

Just for the sake of variety, we will start with condition #2 this time. Integrating with respect to y while holding x constant gives us

F(x, y) = x²y + sin y + K(x).

When this, in turn, is differentiated with respect to the other variable, x, we get

∂F/∂x = 2xy + K′(x).

This, combined with #1, implies that K′(x) = 1, and so we can choose K(x) = x. Thus, F(x, y) = x²y + sin y + x. The solutions of the differential equation are given by x²y + sin y + x = C. Plugging in the initial condition leads to C = 0. Therefore, the solution of the initial value problem is given implicitly by x²y + sin y + x = 0. In this example, as is often the case, there is no way of exhibiting the solution y explicitly.

At times, an equation that is not exact can be replaced by an equivalent one that is exact. We have seen this happen already. Consider again the equation

−3x/y + y/x + 2y′ = 0.

This equation is not exact since M(x, y) = −3x/y + y/x and N(x, y) = 2 imply My(x, y) ≠ Nx(x, y). However, when we multiplied through by xy, we got

−3x² + y² + 2xyy′ = 0,

which we showed to be exact. From this we see that changing the form of a differential equation in an apparently inconsequential way can, in fact, have major consequences in terms of exactness.

The method of solving first order linear equations that was presented in the last section parallels the process of replacing a differential equation with an equivalent one. Let's look back at linear equations with this in mind. We start with the equation y′ + a(x)y = g(x). Rewrite it as −g(x) + a(x)y + y′ = 0, which is exact only in the very special case of a being identically zero. As before, multiply through the equation by the integrating factor

μ = μ(x) = e^{∫* a(x) dx}.


We get

[−μ(x)g(x) + μ(x)a(x)y] + μ(x)y′ = 0,

with M(x, y) = −μ(x)g(x) + μ(x)a(x)y and N(x, y) = μ(x). Since μ′(x) = μ(x)a(x), My(x, y) = μ(x)a(x) = Nx(x, y), and so we now have an exact equation. Following our usual procedure, we can find

F(x, y) = μ(x)y − ∫* μ(x)g(x) dx,

and setting this equal to a constant gives us the general solution of the first order linear equation, in agreement with what we got before.

There are many other first order differential equations that when multiplied by an appropriate integrating factor μ give an equivalent equation that is exact (and then can be solved). Unfortunately, it is usually quite difficult to find μ. This is especially true when μ is a function of both x and y. For more about this, see the exercises.

We have seen how to solve three types of first order differential equations: linear, separable, and exact. Of course, there are many first order equations of importance that do not fit any of these patterns. In the exercises, we will briefly investigate a few other standard types of first order equations. There are still other equations for which there are no known methods for finding solutions expressible either explicitly or implicitly, as we have been able to do so far. In the next two sections we will show that many of these equations do have solutions, and we will use numerical methods for approximating the solutions.

Exercises

1. For each of the following equations, state which of the methods of this section and the last section apply. (For some equations, more than one method is possible; for others, none may be possible.) Do not solve the equations.
   (a) y′ + xy² = 0
   (b) y′ + xy² = 1
   (c) y′ + x²y = 1
   (d) y′ + x²y = 3x²
   (e) x²y′ + 2xy = 0
   (f) x²y′ + 3xy = 1
   (g) x²y′ + 2xy = x
   (h) x²y′ + 2xy = x²

Using the methods of this section and the last section, solve the following equations or initial value problems.

2. y′ = 3x²e^(−y) − e^(−y), y(0) = 0

3. (2x + y²) + (2xy + 1)y′ = 0

4. y′ = 4x³y + 2xe^(x⁴)

5. y′ = −2xe^y/(x²e^y + 1)


6. xy² + (x² + 1)y′ = 0

7. (2xy³ + 2) + 3x²y²y′ = 0, y(1) = 3

8. y′ + 2x⁻¹y = 2x⁴

9. y′ = (2x − sin y)/(1 + x cos y), y(0) = 1

10. y′ = a/y³

11. (a + y³) + 3xy²y′ = 0

12. Try to solve the initial value problem

(2xy³ + 2) + 3x²y²y′ = 0, y(0) = 3.

What is wrong with your answer?

13. Prove that every separable equation can be made to be exact.

14. An equation of the form y′ = (a − by)y is called a logistic equation. If the number N (in thousands) at time t (in years) of a species that has just been introduced to a certain island is assumed to satisfy the logistic equation N′ = (5 − 2N)N, and if now at time t = 0 the population is 1000, what will the population of this species be six months from now? Also, what will the population tend toward in the long run?

15. Consider an object of mass m at the end of a horizontal spring with stiffness coefficient k. Let x be the (variable) distance the object is away from its resting position. (Stretching the spring corresponds to x > 0, while compressing it corresponds to x < 0.) If friction is ignored and the spring is assumed massless, it can be shown that the velocity v of the object obeys the differential equation

dv/dx = −(k/m)(x/v).

Show that ½mv² + ½kx² is constant, which says that the sum of the kinetic energy and the potential energy of the system remains constant over time.

16. Given an equation M(x, y) + N(x, y)y′ = 0 (which is assumed not exact), show that if the expression (1/N)(∂M/∂y − ∂N/∂x) is a continuous function of x alone (or is constant), then multiplying the equation by

μ = μ(x) = e^{∫* (1/N)(∂M/∂y − ∂N/∂x) dx}

leads to an equivalent exact equation.

17. Given an equation M(x, y) + N(x, y)y′ = 0 (which is assumed not exact), show that if the expression (1/M)(∂N/∂x − ∂M/∂y) is a continuous function of y alone (or is constant), then multiplying the equation by

μ = μ(y) = e^{∫* (1/M)(∂N/∂x − ∂M/∂y) dy}

leads to an equivalent exact equation.

18. Use a one variable integrating factor to solve y² + (3xy + 1)y′ = 0.

19. Use a one variable integrating factor to solve 2y² + 2x + 1 + 2yy′ = 0.


20. Verify that μ(x, y) = 1/(xy²) is an integrating factor for 3y² − xy + x²y′ = 0 and then use it to solve the equation.

A function of two variables f is said to be homogeneous of degree zero if

f(tx, ty) = f(x, y)

for all real numbers x, y, t. In this case, the differential equation y′ = f(x, y) is said to be homogeneous. (This use of the word homogeneous has no relation to its use in connection with linear equations.) To solve a homogeneous equation, make the substitution v = y/x. In other words, let y = xv, and so

dy/dx = x(dv/dx) + v.
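Why this substitution works can be seen in one line. Taking t = 1/x in f(tx, ty) = f(x, y) (say for x > 0) gives f(x, y) = f(1, y/x), so with v = y/x the equation becomes (a sketch of the computation, not from the text):

```latex
x\frac{dv}{dx} + v = f(1, v)
\quad\Longrightarrow\quad
\frac{dv}{f(1,v) - v} = \frac{dx}{x}
\qquad (\text{wherever } f(1,v) \neq v),
```

which is separable in x and v.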

This will lead to a separable equation in the variables x, v, which you then solve. Use this technique on the following equations and initial value problems.

21. y′ = (x³ + y³)/(xy²)

22. y′ = (x²y − y³)/x³

23. √(xy) + y − xy′ = 0, y(1) = 1 (x > 0)

24. y′ = (xy + y²)/x², y(1) = 1/2 (x > 0)

An equation of the form y′ + a(x)y = b(x)yⁿ is called a Bernoulli equation. To solve it, make the substitution v = y^(1−n). In other words, let y = v^(1/(1−n)) and then

dy/dx = (1/(1 − n)) v^(n/(1−n)) dv/dx.

This will lead to a linear equation in the variables x, v, which you then solve. Use this technique on the following equations.

25. y′ + 2y/x = y³/x⁴

26. y′ − y/x = 1/y²

4. Existence and Uniqueness

We are going to show that first order differential equations under rather general conditions have solutions. This will require a number of steps, including replacing an initial value problem by an equivalent integral equation. We start by carefully stating what it means for a function to be a solution of an initial value problem and to be a solution of an integral equation.

Let f be a continuous real valued function on a rectangle

R = {(x, y) : |x − x₀| ≤ a, |y − y₀| ≤ b}.

A function φ is said to be a solution of the initial value problem

(4)  y′ = f(x, y),  y(x₀) = y₀

on an interval I containing x₀ if φ is differentiable on I, (x, φ(x)) ∈ R for all x ∈ I, φ′(x) = f(x, φ(x)) for all x ∈ I, and φ(x₀) = y₀.

Correspondingly, φ is said to be a solution of the integral equation

(5)  y = y₀ + ∫_{x₀}^{x} f(t, y) dt

on an interval I containing x₀ if φ is continuous on I, (x, φ(x)) ∈ R for all x ∈ I, and

φ(x) = y₀ + ∫_{x₀}^{x} f(t, φ(t)) dt

for all x ∈ I.

We have already seen numerous examples of initial value problems and their solutions. Here is an example of an integral equation and its solution.

Example. Put f(x, y) = 3y/x, x₀ = 1, y₀ = 2. A solution of

y = 2 + ∫_{1}^{x} (3y/t) dt

is y = φ(x) = 2x³ since

2 + ∫_{1}^{x} (3φ(t)/t) dt = 2 + ∫_{1}^{x} 6t² dt = 2x³ = φ(x).

Theorem 1.4. A function φ is a solution of the initial value problem (4) on an interval I if and only if it is a solution of the integral equation (5) on I.

Proof. Suppose φ satisfies (5) on I. Since f(t, φ(t)) is a continuous function of t, we immediately get that φ is a differentiable function of x and φ′(x) = f(x, φ(x)) for all x ∈ I. Clearly, φ(x₀) = y₀. Thus, φ is a solution of (4) on I.

Conversely, suppose φ is a solution of (4) on I. Then, φ′(x) = f(x, φ(x)) and φ(x₀) = y₀. Putting G(x) = f(x, φ(x)), we have φ′ = G, and so,

φ(x) − φ(x₀) = ∫_{x₀}^{x} G(t) dt.

In other words,

φ(x) = y₀ + ∫_{x₀}^{x} f(t, φ(t)) dt

for all x ∈ I, and φ satisfies (5).

We now set out to try to solve (5). We will use the method of successive approximations, also known as the method of Picard iterations. An outline of the method is as follows. Our first attempt at a solution is the function φ₀ given by

φ₀(x) = y₀

for all x ∈ I. Next try φ₁ defined by

φ₁(x) = y₀ + ∫_{x₀}^{x} f(t, φ₀(t)) dt.

Then try φ₂ defined by

φ₂(x) = y₀ + ∫_{x₀}^{x} f(t, φ₁(t)) dt.

In general, put

(6)  φ_k(x) = y₀ + ∫_{x₀}^{x} f(t, φ_{k−1}(t)) dt.


Our hope is that by letting φ(x) = lim_{k→∞} φ_k(x) and then taking the limit of both sides of (6), we will wind up with

φ(x) = y₀ + ∫_{x₀}^{x} f(t, φ(t)) dt,

and so φ will be a solution of (5). There are, however, a couple of non-trivial concerns with this. It is not certain that lim_{k→∞} φ_k(x) will exist, and it is not certain that when taking the limit of the right side of (6), we can bring the limit inside the integral.

Putting aside these concerns for the moment, however, let's try out the method of successive approximations on a simple example.

Example. Consider the initial value problem y′ = y, y(0) = 1. We have f(x, y) = y, x₀ = 0, and y₀ = 1. Thus, the equivalent integral equation is

y = 1 + ∫_{0}^{x} y dt.

Furthermore, φ₀(x) ≡ 1 and

φ_k(x) = 1 + ∫_{0}^{x} φ_{k−1}(t) dt

for k = 1, 2, . . . . Applying this formula repeatedly, we get

φ₁(x) = 1 + ∫_{0}^{x} 1 dt = 1 + x

φ₂(x) = 1 + ∫_{0}^{x} (1 + t) dt = 1 + x + x²/2

φ₃(x) = 1 + ∫_{0}^{x} (1 + t + t²/2) dt = 1 + x + x²/2 + x³/(3 · 2)

and so on. An easy proof by induction shows that, in general,

φ_k(x) = 1 + x + x²/2! + · · · + x^k/k!

for all x. We recognize that we are getting partial sums for the Maclaurin series of e^x, and so the sequence of successive approximations φ_k(x) converges to φ(x) = e^x for all x. We can immediately check that e^x is, in fact, the solution of the initial value problem.
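The iteration (6) can also be carried out numerically. The sketch below (our own helper, not from the text) discretizes the integral with the trapezoidal rule and applies the Picard map repeatedly to y′ = y, y(0) = 1; the final iterate should be close to e^x:

```python
import math

def picard(f, x0, y0, x_max, steps, iterations):
    """Iterate phi_k(x) = y0 + integral of f(t, phi_{k-1}(t)) on a grid."""
    xs = [x0 + i * (x_max - x0) / steps for i in range(steps + 1)]
    phi = [y0] * len(xs)                      # phi_0 is identically y0
    for _ in range(iterations):
        vals = [f(x, p) for x, p in zip(xs, phi)]
        new, acc = [y0], 0.0
        for i in range(1, len(xs)):           # running trapezoidal integral
            acc += 0.5 * (vals[i - 1] + vals[i]) * (xs[i] - xs[i - 1])
            new.append(y0 + acc)
        phi = new
    return xs, phi

xs, phi = picard(lambda x, y: y, 0.0, 1.0, 1.0, 200, 25)
```

After 25 iterations phi[-1] agrees with e to several decimal places; the remaining discrepancy is the trapezoidal-rule error, not the Picard iteration.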

In the last example, everything worked out as we would hope. It is now time to face the task of determining conditions that will ensure this sort of nice behavior. First, let us note the following standard result, which is commonly stated in calculus courses and proved in a course in (multivariable) advanced calculus. Since R is a closed and bounded set and f is continuous on R, f is bounded on R, i.e., there is a constant M such that |f(x, y)| ≤ M for all (x, y) ∈ R.

Using the notation that has been introduced in this section, we are ready for the following lemma, which among other things tells us that all the successive approximations φ_k exist, at least near the point x₀.

Lemma 1.1. Let α = min{a, b/M}, and put I = {x : |x − x₀| ≤ α}. Then for each k = 0, 1, . . ., φ_k is continuous on I and (x, φ_k(x)) ∈ R for all x ∈ I. Moreover,

(7)  |φ_k(x) − y₀| ≤ M|x − x₀|

for all x ∈ I.

Proof. Our proof will be by mathematical induction. First note that all the results are obviously true for φ₀. Next, assume inductively that the lemma is true for k = m − 1, where m is an arbitrary, fixed positive integer. Then f(t, φ_{m−1}(t)) is a continuous function of t ∈ I. Thus, it follows from (6) that φ_m is continuous on I (and, in fact, is differentiable). Also,

|φ_m(x) − y₀| = |∫_{x₀}^{x} f(t, φ_{m−1}(t)) dt| ≤ M|x − x₀|

since (t, φ_{m−1}(t)) ∈ R for all t ∈ I. It remains only to show that (x, φ_m(x)) ∈ R for all x ∈ I. For this, first note that for x ∈ I, |x − x₀| ≤ α ≤ a. Also, if x ∈ I,

|φ_m(x) − y₀| ≤ M|x − x₀| ≤ Mα ≤ M · b/M = b.

Thus, (x, φ_m(x)) ∈ R for all x ∈ I. The lemma is, therefore, true for k = m.

To be sure that the successive approximations converge to a solution, we will need a further condition on f.

Definition 1.5. Let f be a function of two variables defined on a set S. We say that f satisfies a Lipschitz condition on S if there exists a constant K such that

|f(x, y₁) − f(x, y₂)| ≤ K|y₁ − y₂|

for all (x, y₁) and (x, y₂) in S.

Example. The function f(x, y) = x³y² satisfies a Lipschitz condition on the rectangle

R = [−1, 3] × [−5, 5] = {(x, y) : |x − 1| ≤ 2, |y| ≤ 5}

since for any (x, y₁) and (x, y₂) in R,

|x³y₁² − x³y₂²| = |x³| · |y₁ + y₂| · |y₁ − y₂| ≤ 27 · 10 |y₁ − y₂| = 270|y₁ − y₂|.

You should notice that the Lipschitz condition, as defined here, says much more about f(x, y) as a function of y than as a function of x. In particular, for each fixed x, f(x, y) will be continuous with respect to y, but for a fixed value of y, f(x, y) is not necessarily continuous with respect to x. You are asked to show this in the exercises. Also, it should be mentioned that there exists an analogous notion of a Lipschitz condition for functions of one variable, which is of interest in other settings.

Next comes a result that, for many examples, gives us an easy way of determining that a function satisfies a Lipschitz condition.

Theorem 1.5. Let S be either a rectangle

{(x, y) : |x − x₀| ≤ a, |y − y₀| ≤ b}

or a strip

{(x, y) : |x − x₀| ≤ a, −∞ < y < ∞}.

Suppose there is a constant K such that |fy(x, y)| ≤ K for all (x, y) ∈ S. Then, f satisfies a Lipschitz condition on S with Lipschitz constant K.


Proof. For any fixed x ∈ [x₀ − a, x₀ + a] choose (x, y₁) and (x, y₂) in S. The mean value theorem, applied to f thought of as a function of the single variable y on the interval between y₁ and y₂, states that there exists a number c (which can depend on x) between y₁ and y₂ such that

f(x, y₁) − f(x, y₂) = fy(x, c)(y₁ − y₂).

Thus, |f(x, y₁) − f(x, y₂)| ≤ K|y₁ − y₂| for all (x, y₁) and (x, y₂) in S.

Example. Let's look at the previous example f(x, y) = x³y² on the rectangle R = [−1, 3] × [−5, 5] again. On R, |fy(x, y)| = |2x³y| ≤ 270. Thus, f satisfies a Lipschitz condition on R (with Lipschitz constant 270). The fact that we got the same Lipschitz constant as before is of no importance; in other examples, this might not happen.

Theorem 1.6. (Existence Theorem) Let f be a continuous function on a rectangle

R = {(x, y) : |x − x₀| ≤ a, |y − y₀| ≤ b}.

Suppose |f(x, y)| ≤ M for all (x, y) ∈ R and f satisfies a Lipschitz condition on R. Put α = min{a, b/M} and

φ₀(x) = y₀,
φ_k(x) = y₀ + ∫_{x₀}^{x} f(t, φ_{k−1}(t)) dt,  k = 1, 2, . . . .

Then, on I = {x : |x − x₀| ≤ α}, the sequence {φ_k} converges to a solution of the initial value problem

y′ = f(x, y),  y(x₀) = y₀.

Before proving the theorem, we will see how it applies to a specific example.

Example. Consider the initial value problem

y′ = x³y² + 1,  y(0) = 1.

We cannot find a solution to this problem since the equation is not of a type we have studied, but we will be able to show that a solution exists. Let R = [−2, 2] × [0, 2]. (The choice of R is arbitrary, except that the 'center' must be at (x₀, y₀) = (0, 1).) We have f(x, y) = x³y² + 1 and a = 2, b = 1. Then, M = 33 and α = min{2, 1/33} = 1/33. Furthermore, f satisfies a Lipschitz condition on R since |fy(x, y)| = |2x³y| ≤ 32 for all (x, y) ∈ R. Therefore, the initial value problem has a solution on I = [−1/33, 1/33], and this solution is the limit of the sequence of successive approximations. Note that we could have gotten a solution on a larger interval I if we had chosen R with more care.

Proof of the Existence Theorem. We will start by getting an idea of how much successive approximations differ from each other. Specifically, we want to prove

(8)  |φ_k(x) − φ_{k−1}(x)| ≤ M K^{k−1} |x − x₀|^k / k!,

where K is a Lipschitz constant, for all x ∈ I, k = 1, 2, . . . . We will use mathematical induction. When k = 1, (8) is true since if x ∈ I,

|φ₁(x) − φ₀(x)| = |φ₁(x) − y₀| ≤ M|x − x₀|


by the last lemma. Now assume inductively that (8) is true for k = m, where m is an arbitrary positive integer. If x ∈ I and x > x₀,

|φ_{m+1}(x) − φ_m(x)| = |∫_{x₀}^{x} [f(t, φ_m(t)) − f(t, φ_{m−1}(t))] dt|

    ≤ ∫_{x₀}^{x} |f(t, φ_m(t)) − f(t, φ_{m−1}(t))| dt

    ≤ ∫_{x₀}^{x} K|φ_m(t) − φ_{m−1}(t)| dt   (Lipschitz condition)

    ≤ ∫_{x₀}^{x} K · M K^{m−1} |t − x₀|^m / m! dt   (inductive assumption)

    = [M K^m (t − x₀)^{m+1} / ((m + 1) m!)]_{x₀}^{x}

    = M K^m |x − x₀|^{m+1} / (m + 1)!,

and so (8) holds for k = m + 1. If x ∈ I and x < x₀, a similar computation shows that (8) holds for k = m + 1. Thus, (8) is true for all k = 1, 2, . . . .

Due to the tools we have available to us, it turns out that it will be convenient to deal with a series rather than the given sequence of successive approximations. To this end, let's note that

φ_k = φ₀ + (φ₁ − φ₀) + (φ₂ − φ₁) + · · · + (φ_k − φ_{k−1}).

In other words, φ_k(x) is a partial sum of the series

(9)  φ₀(x) + Σ_{k=1}^{∞} [φ_k(x) − φ_{k−1}(x)].

This series converges on I by the comparison test and (8) since

Σ_{k=0}^{∞} M K^{k−1} |x − x₀|^k / k!

converges for all x (and has sum (M/K)e^{K|x−x₀|}). Call the sum of (9) φ(x). Then, the sequence {φ_k} converges to φ on I.

Next, we will show that φ is continuous on I. Since

φ_k(x₁) − φ_k(x₂) = ∫_{x₀}^{x₁} f(t, φ_{k−1}(t)) dt − ∫_{x₀}^{x₂} f(t, φ_{k−1}(t)) dt

    = ∫_{x₂}^{x₁} f(t, φ_{k−1}(t)) dt,

we get

|φ_k(x₁) − φ_k(x₂)| ≤ M|x₁ − x₂|

for all k. It then follows from letting k → ∞ that

(10)  |φ(x₁) − φ(x₂)| ≤ M|x₁ − x₂|

for all x₁ and x₂ in I. From this, we easily get that φ is continuous on I (just as we noted earlier that the Lipschitz condition for a function of two variables implies continuity in y).


It is clear that φ(x₀) = y₀. Also, letting x₁ = x and x₂ = x₀ in (10), we have

|φ(x) − y₀| ≤ M|x − x₀| ≤ Mα ≤ b,

and so (x, φ(x)) ∈ R for all x ∈ I.

Now all that remains is to show that φ satisfies the integral equation (5). As mentioned before, it is not elementary that taking the limit as k → ∞ of both sides of (6) proves that φ is a solution of (5). We will need an inequality that describes in detail the convergence of φ_k(x) to φ(x). For any x ∈ I,

|φ(x) − φ_k(x)| = |Σ_{i=k+1}^{∞} [φ_i(x) − φ_{i−1}(x)]|

    ≤ Σ_{i=k+1}^{∞} |φ_i(x) − φ_{i−1}(x)|

    ≤ Σ_{i=k+1}^{∞} M K^{i−1} |x − x₀|^i / i!   (by (8))

    ≤ Σ_{i=k+1}^{∞} M K^{i−1} α^i / i!

    ≤ (M K^k α^{k+1} / (k + 1)!) Σ_{j=0}^{∞} K^j α^j / j!

    = (M/K) ((Kα)^{k+1} / (k + 1)!) e^{Kα}.

Setting ε_k = (Kα)^{k+1}/(k + 1)!, we have

(11)  |φ(x) − φ_k(x)| ≤ (M/K) ε_k e^{Kα}

for all x ∈ I. Note that ε_k → 0 as k → ∞, as can be seen from the convergence of the series Σ_k ε_k. Now, we have

φ_k(x) = y₀ + ∫_{x₀}^{x} f(t, φ_{k−1}(t)) dt,

and from this we want to show that

φ(x) = y₀ + ∫_{x₀}^{x} f(t, φ(t)) dt

for all x ∈ I. Let k → ∞. Clearly φ_k(x) → φ(x). Thus, we will be done if

∫_{x₀}^{x} f(t, φ_{k−1}(t)) dt → ∫_{x₀}^{x} f(t, φ(t)) dt,


and this does occur since

|∫_{x₀}^{x} f(t, φ_{k−1}(t)) dt − ∫_{x₀}^{x} f(t, φ(t)) dt| ≤ |∫_{x₀}^{x} |f(t, φ_{k−1}(t)) − f(t, φ(t))| dt|

    ≤ K |∫_{x₀}^{x} |φ_{k−1}(t) − φ(t)| dt|

    ≤ M ε_{k−1} e^{Kα} |∫_{x₀}^{x} dt|

    = M e^{Kα} |x − x₀| ε_{k−1} → 0.

It should be noted that the last theorem is a local existence theorem. Since α might be strictly less than a, the function φ that we have found is not necessarily a solution to the initial value problem (4) on the entire interval Ĩ = {x : |x − x₀| ≤ a}. To guarantee a solution that holds on all of Ĩ, stronger conditions on f are needed. For information about this, the reader should consult an advanced book.

The Existence Theorem says that under certain conditions the successive approximations converge to a solution φ. This being the case, it then is at times of interest to be able to estimate how well each φ_k approximates φ. Condition (11) gives us such an estimate. We highlight this by restating (11) as a theorem in its own right.

Theorem 1.7. Under the hypotheses of the Existence Theorem,

|φ(x) − φ_k(x)| ≤ (M/K) ((Kα)^{k+1}/(k + 1)!) e^{Kα}

for k = 0, 1, 2, . . . .
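For the earlier example y′ = y, y(0) = 1 this bound can be checked directly, since there φ_k is the k-th Maclaurin partial sum of e^x. The constants below are our own choices for the rectangle R = {|x| ≤ 1, |y − 1| ≤ 1}: M = 2, K = 1, α = min{1, 1/2} = 1/2.

```python
import math

M, K, ALPHA = 2.0, 1.0, 0.5   # |y| <= 2 on R, |f_y| = 1, alpha = min{1, 1/2}

def bound(k):
    """Right-hand side of the estimate in Theorem 1.7."""
    return (M / K) * (K * ALPHA) ** (k + 1) / math.factorial(k + 1) * math.exp(K * ALPHA)

def actual_error(k, x=ALPHA):
    """|e^x - phi_k(x)| with phi_k the k-th Maclaurin partial sum."""
    phi_k = sum(x**i / math.factorial(i) for i in range(k + 1))
    return abs(math.exp(x) - phi_k)
```

For each k the actual error sits comfortably below the bound, and both decay factorially in k.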

We are now going to show that under the hypotheses of our Existence Theorem the solution to the initial value problem (4) that we found is the only possible solution on I and that, moreover, there can be at most one solution on Ĩ. The proof of this will be quick once we have the following lemma.

Lemma 1.2. Let v be a function, differentiable on an interval I containing a point x₀, such that v(x₀) = 0. Put I⁺ = {x ∈ I : x ≥ x₀} and I⁻ = {x ∈ I : x ≤ x₀}.

(a) If for all x ∈ I⁺, v(x) ≥ 0 and v′(x) + cv(x) ≤ 0, where c is a constant, then v = 0 on I⁺.

(b) If for all x ∈ I⁻, v(x) ≤ 0 and v′(x) + cv(x) ≤ 0, where c is a constant, then v = 0 on I⁻.

Proof. The differential inequality satisfied by v is reminiscent of a first order linear differential equation. With this in mind, put μ(x) = e^{∫* c dx} = e^{cx}. Multiplying through the inequality by the positive function μ(x) gives us

μ(x)v′(x) + cμ(x)v(x) ≤ 0,

which implies (μ(x)v(x))′ ≤ 0. Thus, μv is non-increasing. If x ∈ I⁺, then

μ(x)v(x) ≤ μ(x₀)v(x₀) = μ(x₀) · 0 = 0,

and so v(x) ≤ 0. But, by hypothesis, v(x) ≥ 0 for all x ∈ I⁺. Thus, v(x) ≡ 0 on I⁺. Part (b) is handled in a similar manner.


Theorem 1.8. (Uniqueness Theorem) Let f be continuous and satisfy a Lipschitz condition on

R = {(x, y) : |x − x₀| ≤ a, |y − y₀| ≤ b}.

Set Ĩ = {x : |x − x₀| ≤ a}. If φ and ψ are both solutions of the initial value problem

y′ = f(x, y),  y(x₀) = y₀,

on any interval I such that x₀ ∈ I ⊆ Ĩ, then φ = ψ on I.

Proof. We have

φ(x) = y₀ + ∫_{x₀}^{x} f(t, φ(t)) dt,  ψ(x) = y₀ + ∫_{x₀}^{x} f(t, ψ(t)) dt.

For all x ∈ I⁺,

|φ(x) − ψ(x)| ≤ ∫_{x₀}^{x} |f(t, φ(t)) − f(t, ψ(t))| dt ≤ K ∫_{x₀}^{x} |φ(t) − ψ(t)| dt.

Put v(x) = ∫_{x₀}^{x} |φ(t) − ψ(t)| dt. We have v′(x) ≤ Kv(x). If we set c = −K, then v′(x) + cv(x) ≤ 0 for all x ∈ I⁺. Clearly, v(x₀) = 0 and v ≥ 0 on I⁺. Then by part (a) of the lemma, v = 0 on I⁺. Therefore, φ = ψ on I⁺. The argument for I⁻ is similar, but requires a bit more care. Defining v(x) as above, we have v ≤ 0 on I⁻. Using the relation ∫_{x}^{x₀} = −∫_{x₀}^{x}, one can then show that v′(x) ≤ −Kv(x) for all x ∈ I⁻. The result then follows from part (b) of the lemma with c = K.

Example. Consider the initial value problem y′ = 1 + y²/x, y(4) = 0 on the rectangle R = [1, 7] × [−2, 2]. This is equivalent to the integral equation

y = 0 + ∫_{4}^{x} (1 + y²/t) dt.

We have φ₀(x) ≡ 0 and φ_k(x) = ∫_{4}^{x} (1 + [φ_{k−1}(t)]²/t) dt for k = 1, 2, . . . . In particular,

φ₁(x) = ∫_{4}^{x} 1 dt = x − 4

and

φ₂(x) = ∫_{4}^{x} (1 + (t − 4)²/t) dt = ∫_{4}^{x} (t − 7 + 16t⁻¹) dt = ½x² − 7x + 16 ln x + 20 − 16 ln 4.

Clearly, f(x, y) = 1 + y²/x is continuous on R. Since x₀ = 4 and y₀ = 0, we have a = 3 and b = 2. Then, |f(x, y)| ≤ 1 + 4/1 = 5 ≡ M and α = min{3, 2/5} = 2/5. Also, f satisfies a Lipschitz condition on R since |fy(x, y)| = |2y/x| ≤ 4/1 = 4 for all (x, y) ∈ R. Thus, the sequence of successive approximations {φ_k} converges to a solution φ on the interval I = {x : |x − 4| ≤ 2/5}. By the Uniqueness Theorem, this is the only solution on I.
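As a spot-check of the computation of φ₂ above, the closed form can be compared with direct numerical integration (Simpson's rule; the helper names are ours):

```python
import math

def phi2_closed(x):
    """The closed form: (1/2)x^2 - 7x + 16 ln x + 20 - 16 ln 4."""
    return 0.5 * x**2 - 7 * x + 16 * math.log(x) + 20 - 16 * math.log(4)

def phi2_numeric(x, n=1000):
    """Simpson's rule for the integral from 4 to x of 1 + (t - 4)^2 / t."""
    f = lambda t: 1 + (t - 4) ** 2 / t
    h = (x - 4) / n
    s = f(4.0) + f(x) + sum((4 if i % 2 else 2) * f(4 + i * h) for i in range(1, n))
    return s * h / 3
```

phi2_closed(4) is 0, matching φ₂(4) = 0, and the two versions agree to many digits across |x − 4| ≤ 2/5.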


We should note that since a continuous function on a closed, bounded set is bounded, Theorem 1.5 implies that the conclusions of both the Existence Theorem and the Uniqueness Theorem hold if we have that f and fy are continuous on R.

It can be shown, using techniques beyond the scope of these notes, that just having f continuous on R is enough to ensure the existence of a local solution to the initial value problem (4). However, continuity of f on R is not enough to ensure uniqueness, as can be seen from one of the following exercises.

Exercises

1. Consider the initial value problem y′ = x²y, y(0) = 3 on the rectangle R = {(x, y) : |x| ≤ 2, |y − 3| ≤ 6}.
   (a) Give the equivalent integral equation.
   (b) Compute the successive approximations φ₀, φ₁, φ₂.
   (c) Show that f(x, y) = x²y satisfies a Lipschitz condition on R.
   (d) Use R to compute a value α such that the successive approximations φ₀, φ₁, φ₂, . . . converge to a solution φ on I = {x : |x| ≤ α}.
   (e) Use a method studied earlier to solve the initial value problem. Writing down the Maclaurin series for this solution, show how φ₀, φ₁, φ₂ relate to the solution.

2. Consider the initial value problem y′ = 2x + y, y(1) = 2 on the rectangle R = [0, 2] × [−1, 5].
   (a) Give the equivalent integral equation.
   (b) Compute the successive approximations φ₀, φ₁, φ₂.
   (c) Show that f(x, y) = 2x + y satisfies a Lipschitz condition on R.
   (d) Use R to compute a value α such that the successive approximations φ₀, φ₁, φ₂, . . . converge to a solution φ on I = {x : |x − 1| ≤ α}.

3. Consider the initial value problem y′ = 4x − y², y(0) = 0 on the rectangle R = [−10, 10] × [−20, 20].
   (a) Give the equivalent integral equation.
   (b) Compute the successive approximations φ₀, φ₁, φ₂.
   (c) Show that f(x, y) = 4x − y² satisfies a Lipschitz condition on R.
   (d) Use R to compute a value α such that the successive approximations φ₀, φ₁, φ₂, . . . converge to a solution φ on I = {x : |x| ≤ α}.

4. Consider the initial value problem y′ = 2 + x²/y, y(0) = 5 on the rectangle R = {(x, y) : |x| ≤ 1, |y − 5| ≤ 3}.
   (a) Give the equivalent integral equation.
   (b) Compute the successive approximations φ₀ and φ₁.
   (c) Show that f(x, y) = 2 + x²/y satisfies a Lipschitz condition on R.
   (d) Use R to compute a value α such that the successive approximations φ₀, φ₁, φ₂, . . . converge to a solution φ on I = {x : |x| ≤ α}.

5. As shown in the section covering separable equations, the initial value problem y²y′ = x², y(1) = 0 has no solution. Does this mean that the Existence Theorem is false? Explain.


6. As shown in the section covering separable equations, the initial value problem y′ = 3xy^(1/3), y(0) = 0 has solutions x³, −x³, and 0. Does this contradict the Uniqueness Theorem? Explain.

7. (a) Suppose |f(x)| ≤ 5 and |g(x)| ≤ 2 for all x ∈ [4, 7]. Find a constant M1 such that |∫₄⁷ (f(x) − g(x)) dx| ≤ M1.
(b) Suppose 2 ≤ f(x) ≤ 10 and 2 ≤ g(x) ≤ 10 for all x ∈ [4, 7]. Find a constant M2 such that |∫₄⁷ f(x)/g(x) dx| ≤ M2.

8. Prove part (b) of Lemma 1.2.

9. Fill in the details for I− in the proof of the Uniqueness Theorem.

10. Find a function f and a rectangle R such that f satisfies a Lipschitz condition on R, but fy does not exist everywhere on R.

11. Consider the initial value problem y′ = 2xy² + 2x, y(0) = 0 on the rectangle R = [−a, a] × [−b, b].
(a) Apply the Existence and Uniqueness Theorems to show that there is a unique solution on the interval I = {x : |x| ≤ α}, where

α = min{a, b/(2a(b² + 1))}.

(b) Show that for each fixed a > 0, the largest choice for α as b ranges over all possible values is α = min{a, 1/(4a)}.
(c) Show that the largest α we can choose as a and b range over all possible values is α = 1/2.
(d) Solve the initial value problem and state what is the largest interval containing x0 = 0 on which the solution holds.

12. For each k = 2, 3, . . ., let gk be the function on [0, 1] whose graph consists of the line segment from the point (0, 0) to the point (1/k, k), followed by the line segment from (1/k, k) to (2/k, 0), followed by the line segment from (2/k, 0) to (1, 0). Show that

lim_{k→∞} ∫₀¹ gk(x) dx ≠ ∫₀¹ lim_{k→∞} gk(x) dx.

How does the possibility of this sort of behavior relate to the difficulty of the proof of the Existence Theorem?

5. Numerical Methods

Not only do the successive approximations of the last section give us a proof that a wide variety of initial value problems have solutions, but also they give us a way of finding the solutions: compute φk and then take the limit as k → ∞. Unfortunately, this process is usually extremely difficult if not impossible from a practical standpoint. Even finding a ‘good’ approximation φk corresponding to a large value of k is often not feasible because of the difficult integrations involved. What is normally done, instead, is to use a numerical method.
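The successive approximations can at least be carried out approximately by machine, replacing each integral in φ_{k+1}(x) = y0 + ∫ from x0 to x of f(t, φk(t)) dt with a trapezoidal sum on a grid. The sketch below is our own illustration (the name `picard` is not from the text); it applies this idea to the problem y′ = x²y, y(0) = 3 from the exercises of the last section, whose exact solution is y = 3e^(x³/3).

```python
import math

def picard(f, x0, y0, b, n=1000, iterations=20):
    """Carry out the successive (Picard) approximations on a grid,
    evaluating each integral with the trapezoidal rule."""
    h = (b - x0) / n
    xs = [x0 + i * h for i in range(n + 1)]
    phi = [y0] * (n + 1)               # phi_0 is the constant function y0
    for _ in range(iterations):
        slopes = [f(x, y) for x, y in zip(xs, phi)]
        new_phi = [y0]
        total = 0.0
        for i in range(1, n + 1):      # cumulative trapezoidal sums
            total += 0.5 * h * (slopes[i - 1] + slopes[i])
            new_phi.append(y0 + total)
        phi = new_phi
    return xs, phi

# y' = x^2 y, y(0) = 3; the exact solution is y = 3 e^(x^3/3).
xs, phi = picard(lambda x, y: x * x * y, 0.0, 3.0, 1.0)
print(phi[-1], 3 * math.exp(1.0 / 3.0))   # the two values nearly agree at x = 1
```

Even here the "integrations" have quietly been replaced by numerical sums, which is exactly the concession the text describes.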

We are going to concentrate on Euler’s method. It is not a particularly powerful method for approximating the solution of an initial value problem, but it has the advantage of being easy to understand, and it gives us insight into how some of the more sophisticated methods work, as well. We will introduce Euler’s method by describing, in great detail, how it is applied in the following example.

Before beginning the example, however, let’s recall an elementary fact from Cartesian geometry. If (xk−1, Yk−1) and (xk, Yk) are two points in the plane with xk−1 ≠ xk and if m is the slope of the line joining them, then Yk − Yk−1 = m(xk − xk−1). Putting h = xk − xk−1, we write

(12) Yk = Yk−1 + hm.

Example. Consider the initial value problem

y′ = (4x + 2y + 6)/(y² + 1),    y(0) = −1.

As is typical in a problem involving a numerical method, we will restrict our attention to x on a bounded interval; in this case, let’s choose I = [0, 2]. We can see by the Existence and Uniqueness Theorems that this problem has a unique solution on [0, 2]. (Choose R = [−2, 2] × [−31, 29]. Using standard techniques from calculus, we see that M can be taken to be 15, and so α = min{2, 30/15} = 2.)

The solution to the initial value problem has, of course, a graph. We will approximate the graph by a polygonal line, which in turn will correspond to our approximation of the solution. In drawing the polygonal line, we will start at x = 0 and move to the right in several steps until we reach x = 2. Let’s (arbitrarily) decide to do this in 4 steps. This corresponds to moving to the right by 0.5 each time. We say that the step size is h = 0.5. Let the points of the polygonal line be denoted by (x, Y); in other words, the polygonal line will be traced out by Y(x). The first step involves drawing a line segment between x = 0 and x = 0.5 that approximates the graph of the solution for this range of x-values. We will choose the line segment that is tangent to the graph at (x0, y0) = (0, −1). We view this as a reasonable choice for an approximation since the graph and the line segment will have the same values and the same ‘slopes’ at x0, and so hopefully will have values that are close to each other for x close to x0 = 0. From the differential equation, the slope will be

m(0) = y′(0) = f(0,−1) = 2,

and then

Y (0.5) = −1 + (0.5)(2) = 0

by (12). Note that the approximation using a tangent line is exactly the approximation corresponding to differentials. See Figure 1(a). For the next piece of the polygonal line approximation, we would like to choose a line segment starting at (0.5, 0), extending h = 0.5 to the right, and having slope equal to y′(0.5) = f(0.5, y(0.5)). Unfortunately, this is not possible for us since we don’t know y(0.5). Instead, we do the next best thing, which is to use Y(0.5) in place of y(0.5) to get

m(0.5) = f(0.5, Y (0.5)) = f(0.5, 0) = 8,

and then

Y (1) = 0 + (0.5)(8) = 4

by (12). See Figure 1(b). Repeating this process, we get

m(1) = f(1, 4) = 1.0588

and then

Y (1.5) = 4 + (0.5)(1.0588) = 4.5294,


Figure 1. y′ = (4x + 2y + 6)/(y² + 1), y(0) = −1 with h = 0.5; panels (a)–(d) show the successive steps.

followed by

m(1.5) = f(1.5, 4.5294) = 0.9788

and then

Y (2) = 4.5294 + (0.5)(0.9788) = 5.0188,

rounding off to 4 places past the decimal. See Figure 1(c) and Figure 1(d). The approximate solution of the initial value problem is given by Y(x) for x ∈ [0, 2].

The process we just went through is called Euler’s method. In general, it goes as follows. Suppose we have an initial value problem

y′ = f(x, y), y(x0) = y0

and we want to construct an approximate solution Y(x) on an interval [x0, b]. Choose a number of steps n and the corresponding step size h = (b − x0)/n. Let x1 = x0 + h, x2 = x1 + h = x0 + 2h, and so on. We have xk = x0 + kh for k = 0, 1, 2, . . . , n. We start by setting Y(x0) = y0, which we also denote by Y0. The line segment from the point (x0, Y0) with slope m0 = f(x0, Y0) that corresponds to x increasing by h ends at the point (x1, Y1), where Y1 = Y0 + hf(x0, Y0). Put Y(x1) = Y1. Next, the line segment from (x1, Y1) with slope m1 = f(x1, Y1) that corresponds to x again increasing by h ends at the point (x2, Y2), where Y2 = Y1 + hf(x1, Y1). Put Y(x2) = Y2. Continuing in this manner, Y3 = Y2 + hf(x2, Y2),


Figure 2. y′ = (4x + 2y + 6)/(y² + 1), y(0) = −1 with h = 0.02

and in general,

(13) Yk = Yk−1 + hf(xk−1, Yk−1).

This holds for k = 1, 2, . . . , n since when k = n we have reached xn = x0 + nh = b. We set Y(xk) = Yk for k = 0, 1, 2, . . . , n. To make the polygonal line constructed above be the graph of Y, we can define Y(x) for x ∈ [x0, b] but x ≠ xk by linear interpolation.
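Formula (13) translates directly into a few lines of code. The following sketch is our own illustration, not part of the text; rerunning the example above with h = 0.5 reproduces the hand-computed values.

```python
def euler(f, x0, y0, b, n):
    """Euler's method: Y_k = Y_{k-1} + h f(x_{k-1}, Y_{k-1}), formula (13)."""
    h = (b - x0) / n
    xs, ys = [x0], [y0]
    for k in range(1, n + 1):
        ys.append(ys[-1] + h * f(xs[-1], ys[-1]))
        xs.append(x0 + k * h)
    return xs, ys

# y' = (4x + 2y + 6)/(y^2 + 1), y(0) = -1 on [0, 2] with n = 4 (h = 0.5)
f = lambda x, y: (4 * x + 2 * y + 6) / (y * y + 1)
xs, ys = euler(f, 0.0, -1.0, 2.0, 4)
print([round(y, 4) for y in ys])   # → [-1.0, 0.0, 4.0, 4.5294, 5.0188]
```

Changing n to 100 (h = 0.02) in the same call carries out the computer run discussed next.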

Let’s look back at the last example. If we change n to 100, then h = 0.02.Applying Euler’s method with n = 100 calls out for using a computer. We get thegraph shown in Figure 2. This graph looks like a smooth curve, but one shouldremember that it is actually a polygonal line; it looks smooth simply because itsline segments are so short. With h = 0.02 we get Y (2) = 4.1430, as compared withY (2) = 4.8221 when h = 0.5.

When using Euler’s method, we should of course be concerned with how accu-rate our results are. To no one’s surprise, smaller value’s of h usually give moreaccurate approximations. However, there are exceptions. In particular, there is theproblem of round-off error. In a computation involving many steps, the round-offerrors from the steps, although individually small, can combine together to givean answer that significantly inaccurate. This is especially a problem when using acomputer, where the ease of use can mask the complexity of a computation.

We are left with the question of how small to choose h. Unfortunately, there is no completely satisfactory answer to this. One tactic that is often used is as follows. Try a value of h that you guess is, on the one hand, small enough to give an acceptably accurate result if there were no round-off error while, on the other hand, corresponds to a value of n small enough to make a significant round-off error unlikely. Then try the same problem with h/2. If your results don’t vary by much, then your results are probably accurate enough for most purposes.

In the next example, the Existence and Uniqueness Theorems will imply that there is a unique solution to the initial value problem near x0, but they will not ensure a unique solution throughout the entire interval of interest. We will assume,


xk      Yk
1.00     0
1.25    −0.5
1.50    −0.7656
1.75    −0.6062
2.00    −0.6239

Figure 3. y′ = 3xy² − 2, y(1) = 0 with h = 0.25

Figure 4. y′ = 3xy² − 2, y(1) = 0 with h = 0.01

however, that a solution does exist globally and use Euler’s method to try to approximate it. When we are done, we should look at our results critically to see whether there is any peculiar behavior that might indicate that our assumption is false. This is a strategy that is quite common.

Example. Consider the initial value problem y′ = 3xy² − 2, y(1) = 0 on the interval [1, 2]. At first, let’s choose n = 4 or, equivalently, h = 0.25. We have Y0 = 0. Move to the right along a line segment with slope m0 = f(1, 0) = −2 until we reach x1 = 1.25. This gives us Y1 = 0 + (0.25)(−2) = −0.5. Next, move to the right along a line segment with slope m1 = f(1.25, −0.5) = −1.0625 until we reach x2 = 1.50. This gives us Y2 = −0.5 + (0.25)(−1.0625) = −0.7656. Continuing on in this way, Y3 = −0.7656 + (0.25)f(1.50, −0.7656) = −0.6062 and Y4 = −0.6062 + (0.25)f(1.75, −0.6062) = −0.6239. The results are summarized in a table and graph in Figure 3.

Certainly, we do not expect the approximation corresponding to n = 4 to be in any practical sense accurate enough. Let’s try Euler’s method with n = 50 and so h = 0.02. We get Y(2) = −0.5990 (to four places past the decimal). To get an indication of how accurate our result is, we can try again with h = 0.01. We get Y(2) = −0.5987. The graph of Y corresponding to h = 0.01 appears in Figure 4.

There are numerous other numerical methods for approximating solutions of initial value problems that are generally more accurate (but more complicated) than Euler’s method. We will describe just two of them. First we will look at the improved Euler’s method. In Euler’s method, the slope of each line segment is determined, roughly speaking, by the approximate ‘slope’ of the solution curve at the starting point of the segment. In the improved Euler’s method, we will try (roughly) to make the slope of the segment equal the average of the ‘slope’ of the solution at the start of the segment and the ‘slope’ of the solution at the end of the segment. Let’s see how this applies to the first example of this section.

Example. Consider again the initial value problem

y′ = (4x + 2y + 6)/(y² + 1),    y(0) = −1

on the interval [0, 2]. As before, we will construct a polygonal line that approximates the graph of the solution. Let’s choose the step size to be h = 0.5. Putting Y(x0) = Y0 = y0, we start at the point (x0, Y0) = (0, −1) and move 0.5 units to the right along a line with slope equal to f(0, −1), just as we did for Euler’s method. We get a tentative next point, corresponding to slope m0 = f(0, −1) = 2, of (x1, Ỹ1), where x1 = 0.5 and Ỹ1 = Y0 + hf(x0, Y0) = −1 + (0.5)(2) = 0. Now, the ‘slope’ of the solution curve when x = 0.5 is given by f(0.5, y(0.5)), which we of course don’t know since y is unknown. In this case, the best we can do is to use f(x1, Ỹ1) = f(0.5, 0) = 8 instead. Our final choice for the slope of the line segment starting at (0, −1) is the average m0 = (2 + 8)/2 = 5. This gives us Y(0.5) = Y1 = −1 + (0.5)(5) = 1.5. For x going between 0.5 and 1, we get

m1 = f(0.5, Y(0.5)) = f(0.5, 1.5) = 3.3846

⇒ Ỹ2 = 1.5 + (0.5)(3.3846) = 3.1923

⇒ f(x2, Ỹ2) = f(1, 3.1923) = 1.4641

⇒ m1 = (3.3846 + 1.4641)/2 = 2.4244

⇒ Y(1) = Y2 = 1.5 + (0.5)(2.4244) = 2.7122

Then for x going between 1 and 1.5,

m2 = f(1, Y(1)) = f(1, 2.7122) = 1.8459

⇒ Ỹ3 = 2.7122 + (0.5)(1.8459) = 3.6351

⇒ f(x3, Ỹ3) = f(1.5, 3.6351) = 1.3557

⇒ m2 = (1.8459 + 1.3557)/2 = 1.6008

⇒ Y(1.5) = Y3 = 2.7122 + (0.5)(1.6008) = 3.5126

Finally, for x going from 1.5 to 2,

m3 = f(1.5, Y(1.5)) = f(1.5, 3.5126) = 1.4264

⇒ Ỹ4 = 3.5126 + (0.5)(1.4264) = 4.2258

⇒ f(x4, Ỹ4) = f(2, 4.2258) = 1.1906

⇒ m3 = (1.4264 + 1.1906)/2 = 1.3085

⇒ Y(2) = Y4 = 3.5126 + (0.5)(1.3085) = 4.1668

The graph of the approximate solution Y (x) is given in Figure 5.

The improved Euler’s method is also known as the modified Euler’s method,Heun’s method , and the second order Runge-Kutta method . For an arbitrary first


Figure 5. y′ = (4x + 2y + 6)/(y² + 1), y(0) = −1, improved, h = 0.5

order initial value problem

y′ = f(x, y), y(x0) = y0

it gives the approximate solution Y (x) determined by

Ỹk = Yk−1 + hf(xk−1, Yk−1)

Yk = Yk−1 + h[f(xk−1, Yk−1) + f(xk, Ỹk)]/2

for k = 1, 2, . . . , n. This is where Yk = Y(xk) = Y(x0 + kh) for k = 1, 2, . . . , n, and Y(x) is given by linear interpolation between the xk’s.
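The predictor–corrector pair above can be sketched in code as follows (our own illustration, not the text’s); with h = 0.5 it reproduces the Y(2) = 4.1668 computed by hand in the example.

```python
def improved_euler(f, x0, y0, b, n):
    """Improved Euler (Heun): Euler predictor, then average of the two slopes."""
    h = (b - x0) / n
    xs, ys = [x0], [y0]
    for k in range(1, n + 1):
        x, y = xs[-1], ys[-1]
        m_start = f(x, y)
        y_tilde = y + h * m_start       # tentative Euler value (Y-tilde)
        m_end = f(x + h, y_tilde)       # slope at the tentative endpoint
        ys.append(y + h * (m_start + m_end) / 2)
        xs.append(x0 + k * h)
    return xs, ys

f = lambda x, y: (4 * x + 2 * y + 6) / (y * y + 1)
xs, ys = improved_euler(f, 0.0, -1.0, 2.0, 4)   # h = 0.5
print(round(ys[-1], 4))                         # → 4.1668, as computed by hand
```

Note that each step calls f twice, which is the doubled computational effort mentioned later when the methods are compared.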

Let’s look back at the last example. If we change n to 100, we get the graphshown in Figure 6 and we get Y (2) = 4.4356.

Example. Consider the initial value problem y′ = 3xy² − 2, y(1) = 0 on the interval [1, 2]. We applied Euler’s method to this problem earlier. Now, using the improved Euler’s method with n = 100, we get the graph shown in Figure 7 and we get Y(2) = −0.5983.

We will look at just one more numerical method for approximating the solution of an initial value problem

y′ = f(x, y), y(x0) = y0

on an interval [x0, b]. It is called the fourth order Runge-Kutta method. As before, divide the interval [x0, b] into n subintervals of width h, and put xk = x0 + kh for k = 1, 2, . . . , n. The approximate solution given by the fourth order Runge-Kutta method is a piecewise linear function Y(x), where Y(x0) = Y0 = y0 and where Y(xk) = Yk, k = 1, 2, . . . , n, is defined as follows:

Yk = Yk−1 + h(mk1 + 2mk2 + 2mk3 + mk4)/6,


Figure 6. y′ = (4x + 2y + 6)/(y² + 1), y(0) = −1, improved, h = 0.02

where

mk1 = f(xk−1, Yk−1)
mk2 = f(xk−1 + h/2, Yk−1 + (h/2)mk1)
mk3 = f(xk−1 + h/2, Yk−1 + (h/2)mk2)
mk4 = f(xk−1 + h, Yk−1 + hmk3).

There is no simple way of visualizing why the ‘slope’ (mk1 + 2mk2 + 2mk3 + mk4)/6 is what we want. However, this method is easy to use (with a computer) and it often turns out to be accurate enough to be of practical value. For the sake of brevity, we will refer to this method simply as the Runge-Kutta method.
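A code sketch of the method (ours, not the text’s) follows. As a sanity check it is applied to the simple problem y′ = y, y(0) = 1, whose exact solution is eˣ, so the error at x = 1 can be measured directly.

```python
import math

def runge_kutta_4(f, x0, y0, b, n):
    """Fourth order Runge-Kutta: Y_k = Y_{k-1} + h(m1 + 2m2 + 2m3 + m4)/6."""
    h = (b - x0) / n
    x, y = x0, y0
    for _ in range(n):
        m1 = f(x, y)
        m2 = f(x + h / 2, y + (h / 2) * m1)
        m3 = f(x + h / 2, y + (h / 2) * m2)
        m4 = f(x + h, y + h * m3)
        y += h * (m1 + 2 * m2 + 2 * m3 + m4) / 6
        x += h
    return y

# Sanity check on y' = y, y(0) = 1, whose exact solution is e^x:
error = abs(runge_kutta_4(lambda x, y: y, 0.0, 1.0, 1.0, 10) - math.e)
print(error)   # already tiny with h = 0.1
```

Each step calls f four times, about twice the work of the improved Euler’s method per step.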

We will now look at an example whose actual solution can be found, so that we can compare the accuracy of the approximations our numerical methods give us.

Example. Consider the initial value problem

y′ + (2x/(x² + 1))y = 3x²/(x² + 1),    y(0) = 2.

Figure 7. y′ = 3xy² − 2, y(1) = 0, improved, h = 0.01


h = 0.04

x   solution   Euler        improved Euler   Runge-Kutta
0   2          2            2                2
1   1.5        1.49579222   1.50059984       1.50000001
2   2          1.98870097   2.00032259       2.00000001
3   2.9        2.89224207   2.90016957       2.90000001

h = 0.02

x   solution   Euler        improved Euler   Runge-Kutta
0   2          2            2                2
1   1.5        1.49782651   1.50014999       1.50000000
2   2          1.99435657   2.00008032       2.00000000
3   2.9        2.89611387   2.90004219       2.90000000

Figure 8. Comparison of methods

We recognize that the differential equation is linear. Setting

µ = e^(∫ 2x/(x²+1) dx) = e^(ln(x²+1)) = x² + 1,

we get (x² + 1)y′ + 2xy = 3x², and then ((x² + 1)y)′ = 3x². After integrating and solving for y, we wind up with y = (x³ + C)/(x² + 1). The initial condition then implies that C = 2. Therefore,

y = (x³ + 2)/(x² + 1).

Figure 8 summarizes our results when we apply the three numerical methods under discussion to this initial value problem restricted to the interval [0, 3], first for n = 75 and then for n = 150. The results of the numerical methods have been rounded off to eight places past the decimal point.

We see that in this example the improved Euler’s method was more accurate than Euler’s method and that the Runge-Kutta method was more accurate than the improved Euler’s method. Also, the larger number of steps (the smaller step size) gave more accurate results for all three methods.

When comparing the accuracy of the different methods, it should be remembered that they involve differing amounts of computational effort. In particular, it is clear that for the same choice of n the improved Euler’s method involves about twice as many computations as Euler’s method. So to be fair, we should, for example, compare the improved Euler’s method for n = 75 with Euler’s method for n = 150. Even then, the improved Euler’s method outperforms Euler’s method in this example. Similar comments can be made about the Runge-Kutta method versus the improved Euler’s method.

We should make two more remarks about this example. First, round-off error was not visible anywhere; all the calculations were done to at least two more significant figures than what was printed out. Second, even the Runge-Kutta method for n = 150 did not give entirely accurate results; the errors were just so small that we did not see them with a printout of eight places past the decimal point.

It is now time to make some additional general remarks about errors that are encountered when using numerical techniques of the type we have been studying. Clearly, both the round-off errors and the errors inherent in the methods themselves (that would occur even if the arithmetic were done with absolute precision) are cumulative. Thus, as we construct our approximate solution from left to right, the possible error tends to increase as we move to the right. To be more detailed about this we will need the following terminology.

Definition 1.6. Let r be a positive number. We define a function G to be O(h^r) (say “big oh of h to the r”) if there is a constant C such that |G(h)| ≤ Ch^r for all positive h close to 0, that is, for all h such that 0 < h < δ for some δ > 0.

The concept O(h^r) measures how fast the values of a function must approach zero as h → 0. It is easily checked that the bigger r is, the faster this rate must be. Thus, it is a stronger statement to say that a function G is O(h³) than it is to say that it is O(h²). For example, sin h is O(h). (Remember that lim_{h→0} (sin h)/h = 1.) Then sin h is O(h^r) for every r < 1, but it can be shown that sin h is not O(h^r) for any r > 1. A simpler example is that h³ is O(h²) as well as O(h³). Also, 100h³ is O(h²) as well as O(h³).

Let us say that the cumulative discretization error of an approximate solution Y at x is |Y(x) − y(x)|, where y is the actual solution and we assume there is no round-off error in computing Y. (As we have noted above, the cumulative discretization error tends to increase as x moves to the right.) By means of advanced methods we will not go into, the following has been shown to be true when f is a sufficiently ‘nice’ function. For an approximate solution gotten from Euler’s method, the cumulative discretization error throughout the interval [x0, b] is O(h). For the improved Euler’s method, it is O(h²); for the Runge-Kutta method it is O(h⁴). Note that for small values of the step size h, the differences in these rates can be extremely significant. For example, 6h⁴ is 0.002 times the size of 3h when h = 0.1, because 6h⁴/(3h) = 2h³ = 0.002; when h = 0.01, 6h⁴ is just 0.000002 times the size of 3h.
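These rates can be checked empirically on the example above, whose exact solution y = (x³ + 2)/(x² + 1) is known: when h is halved, the Euler error at x = 3 should drop by about a factor of 2 and the improved Euler error by about a factor of 4. A sketch of such a check (the helper names below are ours, not the text’s):

```python
# y' = (3x^2 - 2xy)/(x^2 + 1), y(0) = 2; exact solution y = (x^3 + 2)/(x^2 + 1)
f = lambda x, y: (3 * x * x - 2 * x * y) / (x * x + 1)

def euler_err(n):
    """Cumulative discretization error |Y(3) - y(3)| for Euler's method."""
    h = 3.0 / n
    x, y = 0.0, 2.0
    for _ in range(n):
        y += h * f(x, y)
        x += h
    return abs(y - 2.9)                 # exact value y(3) = 29/10

def improved_err(n):
    """The same error for the improved Euler's method."""
    h = 3.0 / n
    x, y = 0.0, 2.0
    for _ in range(n):
        m1 = f(x, y)
        m2 = f(x + h, y + h * m1)
        y += h * (m1 + m2) / 2
        x += h
    return abs(y - 2.9)

print(euler_err(75) / euler_err(150))         # ratio near 2:  O(h)
print(improved_err(75) / improved_err(150))   # ratio near 4:  O(h^2)
```

The ratios match the errors one can read off the Figure 8 table; the Runge-Kutta errors there are too close to the printout’s rounding to exhibit the factor of 16 cleanly.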

Exercises

For each of the following, use Euler’s method without the aid of a computer. Make a table of values for xn, Yn (rounded off to 4 places past the decimal point) and draw a graph of the approximate solution Y(x).

1. y′ = 2x− 3y + 1, y(1) = 5 on [1, 1.4] with h = 0.1

2. y′ = x² + y², y(0) = 1 on [0, 0.4] with h = 0.1

3. y′ = 4xy² − 2, y(0) = −1 on [0, 1.5] with h = 0.5

For each of the following, use Euler’s method with a computer to find Y(b). Then do the same using the improved Euler’s method and the Runge-Kutta method.

4. y′ = 2x− 3y + 1, y(1) = 5 with h = 0.05 and b = 1.4

5. y′ = 2x− 3y + 1, y(1) = 5 with h = 0.01 and b = 1.4

6. y′ = x² + y², y(0) = 1 with h = 0.01 and b = 0.4

7. y′ = 4xy² − 2, y(0) = −1 with h = 0.02 and b = 1.5

8. Find the actual solution of y′ = 2x − 3y + 1, y(1) = 5. Compare the accuracy of all approximate solutions you found in earlier exercises for this initial value problem on [1, 1.4].


9. Consider the initial value problem y′ = 2xy² + 2x, y(0) = 0 on the interval I = [0, 1.3].
(a) With a computer, use Euler’s method to find Y(1.3), first with h = 0.05 and then with h = 0.01.
(b) Repeat part (a) using the improved Euler’s method.
(c) Repeat part (a) using the Runge-Kutta method.
(d) Find the actual solution y(x) and analyze what goes wrong in attempting to use any of the Y(1.3) from above to approximate y(1.3). (You can refer back to exercise 11 in the last section.)

6. Direction Fields

We now turn to the problem of representing geometrically the set of solutions of a first order differential equation. This will be helpful in determining general properties of the solutions, such as whether they are increasing, whether they are periodic, whether they have a limit as the independent variable goes to infinity, and so on. Finding out about these general properties is of special importance when we are not able to find actual solutions analytically and have to be content with numerical approximations.

The graph of a solution of an initial value problem gives us a geometric representation of the solution. One way of visualizing the solutions of a first order differential equation is then to plot the solutions corresponding to a number of different initial conditions, all on the same graph. We will refer to such a geometric representation as an integral curve portrait. The plot corresponding to a specific initial value is called a trajectory or integral curve.

Example. Consider the differential equation

y′ = (4x + 2y + 6)/(y² + 1).

By choosing, in turn, the initial conditions y(0) = −2, −1, 0, 1, 2, 3, 4, we get the trajectories shown in Figure 9. Here, we have used Euler’s method with step size h = 0.02. This gives us some idea of what the solutions of the differential equation are, at least in the rectangle R = [0, 2] × [−2, 6].

Another method for geometrically representing a differential equation y′ = f(x, y) is a direction field, which is constructed as follows. Consider the set of points (xk, yk), where xk = xk−1 + h and yk = yk−1 + h∗ for all k, and where h, h∗ are fixed positive numbers. Through each point (xk, yk), draw a short line segment with slope equal to f(xk, yk). The resulting collection of line segments is called a direction field. It is also called a slope field. Since y′ = f(x, y) and the derivative gives the slope of the tangent line, a line segment is tangent at (xk, yk) to the solution passing through (xk, yk). Then by using the line segments as a guide, we can trace out the graphs of solutions of the differential equation. We can also, at times, see some general properties of solutions and locate regions of the plane where solutions behave unusually.

We should note that the slopes of the line segments are exactly the derivatives of solutions. However, when we try to ‘join up’ line segments to form trajectories, we get approximate solutions.
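Computing the data behind a direction field is straightforward; only the drawing of the short segments requires a graphics tool. A sketch (ours, not the text’s) that tabulates the slopes f(xk, yk) on a grid for the equation of the previous example:

```python
def slope_grid(f, x_range, y_range, h, h_star):
    """Slopes f(x_k, y_k) at the grid points used for a direction field."""
    grid = []
    x = x_range[0]
    while x <= x_range[1] + 1e-9:
        y = y_range[0]
        while y <= y_range[1] + 1e-9:
            grid.append((round(x, 6), round(y, 6), f(x, y)))
            y += h_star
        x += h
    return grid

f = lambda x, y: (4 * x + 2 * y + 6) / (y * y + 1)
grid = slope_grid(f, (0.0, 2.0), (-2.0, 6.0), 0.25, 0.5)
# Each triple (x_k, y_k, slope) is the data behind one short line segment.
print(len(grid), grid[0])
```

A plotting library would then draw, through each (xk, yk), a short segment with the tabulated slope.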


Figure 9. y′ = (4x + 2y + 6)/(y² + 1)

Figure 10. Direction field for y′ = (4x + 2y + 6)/(y² + 1)

Example. Consider again

y′ = (4x + 2y + 6)/(y² + 1).

Figure 10 shows a direction field generated by computer for this equation. The reader should compare this with the trajectories shown in Figure 9.


Figure 11. Direction field

A word of caution is appropriate at this point. Theoretically, a direction field should give the slope f(x, y) at each point (x, y) in the plane. In the pictures we (or computers) draw, we are restricted to a rectangle R and we only ‘see’ f(x, y) at the points (xk, yk). We will assume, perhaps naively, that the direction fields shown here in these notes give a fair representation of the solutions, that no vital information is missed by only seeing line segments corresponding to points (xk, yk) in a rectangle R.

Example. Consider the direction field given in Figure 11. If φ is a solution with φ(0) = 0, estimate φ(4). You should get an answer between 2.5 and 3. Of course, different people will draw the trajectory in their minds in slightly different ways. There is no one correct answer. When you are constructing the trajectory, you are probably following a procedure similar to one of the numerical methods studied in the last section. If, as it appears, all solutions in the rectangle under consideration are concave downward, then notice how a person thinking along the lines of Euler’s method should get an approximation for φ(4) that is larger than the actual value of φ(4), and also larger than the approximation gotten by someone thinking along the lines of the improved Euler’s method.

Exercises

1. If φ is a solution of the first order differential equation whose direction field is shown in Figure 12(a) and if φ(0) = 0, estimate φ(4).

2. If φ is a solution of the first order differential equation whose direction field is shown in Figure 12(b) and if φ(0) = 0, estimate φ(4).

3. For the direction field shown in Figure 13(a), find Y(4) for the approximate solution Y starting with Y(0) = 1 that is produced by Euler’s method with h = 1.


Figure 12. Direction fields (a) and (b)

4. For the direction field shown in Figure 13(b), find Y(4) for the approximate solution Y starting with Y(0) = 2 that is produced by Euler’s method with h = 1.

5. Let f be continuous and satisfy a Lipschitz condition on a rectangle R. Explain how the Uniqueness Theorem implies that trajectories of y′ = f(x, y) lying in R cannot intersect.

6. Consider y′ = (x + y) sin y. Note that y ≡ 0 and y ≡ π are solutions. Use the preceding problem to prove that if φ is a solution and 0 < φ(x0) < π for some x0, then 0 < φ(x) < π for all x.

Figure 13. Direction fields (a) and (b)

7. A first order ordinary differential equation of the form y′ = f(y) is said to be autonomous. Note that for any fixed value of y, the value of y′ will be the same for all values of x. Thus, the line segments in the direction field of y′ = f(y) for points with the same y coordinate will have the same slope. In Figure 14, which direction field corresponds to an autonomous equation?

Figure 14. Direction fields (a) and (b)


CHAPTER 2

Linear Equations of Order Two and Higher

1. Preliminary Remarks

An object of mass m is at the end of a spring. According to Hooke’s law, the force on the object exerted by the spring is proportional to the distance y the spring is stretched from its natural length. (Stretching corresponds to y > 0, compressing corresponds to y < 0.) In other words, the force is ky, where the constant of proportionality k is a positive number that measures the stiffness of the spring. By Newton’s second law of motion, the force on the spring exerted by the object is given by m times the acceleration y′′, where differentiation is with respect to time. According to Newton’s third law, these two forces are equal in magnitude and opposite in direction. Thus, my′′ = −ky. To see how this mass-spring system behaves (assuming no other forces are involved), we will need to solve the second order equation my′′ + ky = 0. If at some initial time t = 0, the spring is at rest but is stretched β units from its natural length, its motion thereafter will be given by the solution of the initial value problem

my′′ + ky = 0, y(0) = β, y′(0) = 0.

2. Definition of Linear Equations

In contrast to the situation for first order equations, there is only one broad class of ordinary differential equations of order two or higher that is well understood. These are the linear equations, which we define now.

Definition 2.1. For any positive integer n, let a0, . . . , an, g be continuous functions on an interval I, with a0 not identically zero. A linear differential equation of order n is an equation of the form

a0(x)y(n) + a1(x)y(n−1) + · · ·+ an(x)y = g(x).

If g is identically zero on I, the equation is said to be homogeneous; if not, the equation is nonhomogeneous.

For simplicity, we will assume that a0, . . . , an, g are real valued even though most of what we do remains true for complex valued functions. Also, we should note that the use of the word “homogeneous” here has no relation to the first order homogeneous equations introduced in the exercises for Section 2.3.

Example. (a) xy′′′ + 3x5y′′ + (x + 1)y′ − exy = lnx is a third order linear nonhomogeneous equation on the interval (0,∞).
(b) y′′ + 7y′ + 6x3y = 0 is a second order linear homogeneous equation on the interval (−∞,∞).
(c) y(4) − 6y = 2x − 1 is a fourth order linear nonhomogeneous equation on the interval (−∞,∞).



3. A First Look at Constant Coefficients

Before looking at the general properties of linear equations, we first will find solutions for an especially simple subclass: second order linear homogeneous equations with constant coefficients. Then we will have a ready supply of examples we can use to illustrate the ideas we develop later on.

Consider

ay′′ + by′ + cy = 0,

where a, b, c are constants and a ≠ 0. We will use the method of educated guessing to try to find solutions of this equation. Since a, b, and c are constants, we can suppose that y′′, y′, and y are roughly alike since the equation says that the three terms on the left side cancel each other out by adding up to zero. By going through the list of standard functions we know how to differentiate, we see that erx, where r is a constant, is a function that might work.

Example. Consider y′′ − y′ − 6y = 0. We will look for solutions of the form φ(x) = erx. Of course, φ′(x) = rerx and φ′′(x) = r2erx. Then, φ is a solution of the equation if and only if

r2erx − rerx − 6erx = 0.

Dividing by erx, we get that φ(x) = erx is a solution of the differential equation if and only if r is a solution of the algebraic equation

r2 − r − 6 = 0,

which we can easily solve. We get r = −2, 3, and so e−2x and e3x are solutions of the differential equation. Furthermore, we can check by directly plugging it in that C1e−2x is also a solution for any constant C1. In the same way we see that C2e3x is a solution for any constant C2. Moreover, we can then check that every function of the form C1e−2x + C2e3x is a solution. Note that there are no restrictions on x. In other words, each of these functions is a solution on the entire real line (−∞,∞).
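The claim that every C1e−2x + C2e3x satisfies y′′ − y′ − 6y = 0 is easy to spot-check numerically; the following Python sketch (ours, with the derivatives computed by hand) evaluates the residual y′′ − y′ − 6y at several points:

```python
import math

# Residual of y'' - y' - 6y = 0 for phi(x) = C1*e^{-2x} + C2*e^{3x}.
def residual(C1, C2, x):
    y   =      C1 * math.exp(-2 * x) +     C2 * math.exp(3 * x)
    yp  = -2 * C1 * math.exp(-2 * x) + 3 * C2 * math.exp(3 * x)
    ypp =  4 * C1 * math.exp(-2 * x) + 9 * C2 * math.exp(3 * x)
    return ypp - yp - 6 * y

# Try several constants and several points x; all residuals should vanish.
worst = max(abs(residual(C1, C2, x))
            for C1 in (1.0, -3.5) for C2 in (0.0, 2.0) for x in (-1.0, 0.0, 0.7))
```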

Example. Find solutions of 2y′′ − 6y = 0 on (−∞,∞). By plugging in φ(x) = erx we are led to 2r2 − 6 = 0 and so r = ±√3. As in the last example, we can conclude that every function of the form φ(x) = C1e−√3x + C2e√3x, where C1 and C2 are arbitrary constants, is a solution.

Given the differential equation

ay′′ + by′ + cy = 0,

we will refer to

p(r) = ar2 + br + c

as its characteristic polynomial (also known as its auxiliary polynomial). Since a, b, and c are real numbers, the characteristic polynomial can have two real roots, one real double root, or two complex conjugate roots, as can be seen from the quadratic formula. This, in turn, will lead to three different forms of solutions of the differential equation. The first two of these are covered by the next theorem.
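The three cases can be read off from the discriminant b2 − 4ac of the quadratic formula. A small Python helper (hypothetical, not from the text) that computes and classifies the roots:

```python
import cmath

# Classify the roots of the characteristic polynomial a*r^2 + b*r + c.
# (Helper name and return format are our own invention.)
def char_roots(a, b, c):
    disc = b * b - 4 * a * c          # discriminant decides the case
    r1 = (-b + cmath.sqrt(disc)) / (2 * a)
    r2 = (-b - cmath.sqrt(disc)) / (2 * a)
    if disc > 0:
        kind = "two distinct real roots"
    elif disc == 0:
        kind = "one real double root"
    else:
        kind = "complex conjugate roots"
    return r1, r2, kind
```

For example, `char_roots(1, -1, -6)` recovers the roots −2 and 3 from the first example above.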

Theorem 2.1. If the characteristic polynomial p(r) = ar2 + br + c has two distinct real roots r1, r2, then the functions Φ1 and Φ2 given by

Φ1(x) = er1x, Φ2(x) = er2x

are solutions of ay′′ + by′ + cy = 0 on the interval (−∞,∞).


If p(r) = ar2 + br + c has a real root r1 of multiplicity two, then the functions Φ1 and Φ2 given by

Φ1(x) = er1x, Φ2(x) = xer1x

are solutions of ay′′ + by′ + cy = 0 on the interval (−∞,∞).

Proof. Suppose r is a root of p(r) = ar2 + br + c. If Φ(x) = erx, then

aΦ′′(x) + bΦ′(x) + cΦ(x) = ar2erx + brerx + cerx

= (ar2 + br + c)erx

= p(r)erx

= 0

for all x.

All that remains now is to show that if r1 is a double root of p(r), then Φ2(x) = xer1x is a solution of the differential equation. We have

aΦ′′2(x) + bΦ′2(x) + cΦ2(x) = a(xer1x)′′ + b(xer1x)′ + cxer1x

= a(xr12er1x + 2r1er1x) + b(xr1er1x + er1x) + cxer1x

= (2ar1 + b)er1x + (ar12 + br1 + c)xer1x

= (2ar1 + b)er1x + p(r1)xer1x

= (2ar1 + b)er1x.

Note, however, that p(r) can be expressed as p(r) = a(r − r1)2 since r1 is a double root and p(r) has leading coefficient a. Differentiating p(r) in its two forms gives 2ar + b = 2a(r − r1), and so 2ar1 + b = 0. Thus,

aΦ′′2(x) + bΦ′2(x) + cΦ2(x) = 0er1x = 0,

showing that Φ2 is a solution of the differential equation in this case.

Notice that in each of the above cases, neither Φ1 nor Φ2 is simply a constant times the other.

Example. Find two solutions of 9y′′ − 6y′ + y = 0. The characteristic equation is 9r2 − 6r + 1 = 0. Solving this, we get r = 1/3, 1/3, counting multiplicity. Thus, Φ1(x) = ex/3 and Φ2(x) = xex/3.

In the last example, ex/3 and 2ex/3 are two solutions of the equation. However, what we really wanted to find were two solutions such that neither was a constant times the other. Said more generally, we want linearly independent solutions, a concept we will review in the next section.

Example. Find two solutions of y′′ + 5y′ + 6y = 0. The characteristic equation is r2 + 5r + 6 = 0. Solving this, we get r = −2,−3. Thus Φ1(x) = e−2x and Φ2(x) = e−3x. You can easily check that xe−2x is not a solution. The same goes for xe−3x.

Before dealing with the case of imaginary roots r, we will review a few facts about complex numbers and functions. First, recall that if z = s + it, then

ez = es(cos t+ i sin t).


In particular, eit = cos t + i sin t, and then e−it = cos t − i sin t. From this it follows that

cos t = (eit + e−it)/2 and sin t = (eit − e−it)/(2i).

It can be easily checked that (d/dx)erx = rerx for any complex number r.

Theorem 2.2. If the characteristic polynomial p(r) = ar2 + br + c has two distinct complex conjugate roots r1 = λ + µi and r2 = λ − µi, then the functions Φ1 and Φ2 given by

Φ1(x) = er1x, Φ2(x) = er2x

are solutions of ay′′ + by′ + cy = 0 on the interval (−∞,∞). Furthermore, the pair of real valued functions

Φ1(x) = eλx cosµx, Φ2(x) = eλx sinµx

are solutions of ay′′ + by′ + cy = 0 on (−∞,∞).

Proof. Since the rule for differentiation of erx is the same for both real and imaginary r, we can prove that Φ1 and Φ2 are solutions just as was done in the last theorem. Then, it is easy to see that any linear combination C1Φ1 + C2Φ2 is a solution. This proves that Φ1 and Φ2 are solutions since

eλx cosµx = (1/2)eλx+iµx + (1/2)eλx−iµx and eλx sinµx = (1/2i)eλx+iµx − (1/2i)eλx−iµx.

Example. Find two real valued solutions of y′′ + 4y′ + 20y = 0. The characteristic equation is r2 + 4r + 20 = 0. Solving this by means of the quadratic formula, we get

r = (−4 ± √(16 − 80))/2 = (−4 ± √−64)/2 = (−4 ± 8i)/2 = −2 ± 4i.

Thus, Φ1(x) = e−2x cos 4x and Φ2(x) = e−2x sin 4x.
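Since the whole computation reduces to showing p(r) = 0 for r = −2 + 4i, that can be verified in complex arithmetic; a Python sketch (ours):

```python
import cmath

# For r = -2 + 4i, p(r) = r^2 + 4r + 20 should be exactly zero, and then
# every derivative of e^{rx} is r^k e^{rx}, so the residual of the ODE
# for e^{rx} is p(r) * e^{rx}.
r = complex(-2, 4)
p_of_r = r * r + 4 * r + 20

def residual(x):
    # residual of y'' + 4y' + 20y = 0 for y = e^{rx}, real part
    return ((r * r + 4 * r + 20) * cmath.exp(r * x)).real
```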

Example. Find two real valued solutions of y′′ + y = 0. The characteristic equation is r2 + 1 = 0. Solving this, we get r = ±i. Thus, Φ1(x) = cosx and Φ2(x) = sinx.

Notice that in the examples given above, we were not asked to find all solutions of a differential equation. The reason is that even if we are able to find infinitely many solutions, we do not at this point have the theoretical tools needed to decide whether they constitute all the solutions of the equation. We will start to develop the necessary tools in the next section.

Exercises

For each of the following equations, find two real valued solutions (with neither one being simply a constant times the other).

1. y′′ − 7y′ + 12y = 0

2. y′′ + 3y′ − 10y = 0

3. y′′ − 2y′ − y = 0


4. 4y′′ + 4y′ − y = 0

5. 4y′′ + 4y′ + y = 0

6. y′′ − 2y′ + y = 0

7. y′′ − 6y′ + 10y = 0

8. 3y′′ + y = 0

9. 2y′′ + 5y = 0

10. 2y′′ − y′ + 2y = 0

11. If r = λ + µi, prove that (d/dx)erx = rerx.

4. Solution Spaces

It is now time to see how solutions on an interval I of the linear homogeneous equation

(1) a0(x)y(n) + a1(x)y(n−1) + · · ·+ an(x)y = 0

relate to each other, how knowledge of some solutions enables us to find other solutions, and how to be sure that all solutions are found. A number of standard results from linear algebra will be of great help to us, and the reader is encouraged to refer back to his or her linear algebra textbook and/or notes as the need arises.

As we saw in the last section, complex numbers can be involved in solving linear differential equations, even if the equations have real coefficients and we are trying to find real valued solutions. For this reason, we will be dealing with vector spaces over the field of complex numbers. In other words, the scalars will be allowed to be imaginary as well as real numbers. All the standard results from a first course in linear algebra that we will use are the same for complex vector spaces and real vector spaces, and we will consider them as known. (Their proofs are generally the same, as well.)

Recall that a set U of vectors in a vector space V forms a subspace of V (and so is a vector space in its own right) if

• U ≠ ∅
• v1 ∈ U and v2 ∈ U ⇒ v1 + v2 ∈ U (closure under addition)
• v ∈ U and α is a scalar ⇒ αv ∈ U (closure under scalar multiplication)

Theorem 2.3. The set of all solutions of the nth-order linear homogeneous equation (1) forms a vector space under the usual definitions of addition of functions and multiplication by scalars.

Proof. We will show that the set of solutions is a subspace of the vector space of all functions on I. First of all, this set is not empty since the identically zero function is obviously a solution. Next, we will show that the set of solutions is


closed under addition. If φ and ψ are solutions of (1), then

a0(x)[φ(x) + ψ(x)](n) + a1(x)[φ(x) + ψ(x)](n−1) + · · ·+ an(x)[φ(x) + ψ(x)]

= a0(x)[φ(n)(x) + ψ(n)(x)] + a1(x)[φ(n−1)(x) + ψ(n−1)(x)]

+ · · ·+ an(x)[φ(x) + ψ(x)]

= a0(x)φ(n)(x) + a1(x)φ(n−1)(x) + · · ·+ an(x)φ(x)

+ a0(x)ψ(n)(x) + a1(x)ψ(n−1)(x) + · · ·+ an(x)ψ(x)

= 0 + 0 = 0,

which shows that φ + ψ is a solution. To show closure under multiplication by scalars, we assume φ is a solution and α is a scalar. Then,

a0(x)[αφ(x)](n)+a1(x)[αφ(x)](n−1) + · · ·+ an(x)[αφ(x)]

= a0(x)[αφ(n)(x)] + a1(x)[αφ(n−1)(x)] + · · ·+ an(x)[αφ(x)]

= α[a0(x)φ(n)(x) + a1(x)φ(n−1)(x) + · · ·+ an(x)φ(x)]

= α · 0 = 0,

which shows that αφ is a solution.

One immediate benefit of the above is that we are now assured that whenever we have solutions of (1), we can get many more just by taking linear combinations.

Example. We have seen that two solutions of y′′ + 4y′ + 20y = 0 are Φ1(x) = e−2x cos 4x and Φ2(x) = e−2x sin 4x. It then follows that any function of the form

φ(x) = C1e−2x cos 4x + C2e−2x sin 4x,

where C1, C2 are constants, is a solution of the differential equation.

Definition 2.2. For any n-times differentiable function φ, we let

L(φ) = a0φ(n) + a1φ(n−1) + · · ·+ anφ,

so that the linear homogeneous equation (1) can be written as

L(y) = 0.

L is called a differential operator.

Recall that a linear transformation T : V → W from a vector space V to a vector space W is a function from V to W satisfying the conditions:

T (v1 + v2) = T (v1) + T (v2) for all v1, v2 ∈ V
T (αv) = αT (v) for all v ∈ V and for all scalars α.

Theorem 2.4. The differential operator L is a linear transformation from the vector space of all n-times differentiable functions on the interval I to the vector space of all functions on I.

The proof of this theorem is almost identical to the proof of the last theorem. For this reason, it will be left as an exercise for the reader.

Note that the set of solutions to the linear homogeneous equation (1) can now be thought of as the kernel or null space of L. Also we should note that if φ is a function of x, then so is L(φ), and we could write L(φ)(x).


Recall that one of the most important concepts in linear algebra is linear independence of vectors. We will now write down the definition in the specific setting of the vector space being a space of functions.

Definition 2.3. The functions f1, f2, . . . , fn defined on a set S are said to be linearly independent if the condition

α1f1 + α2f2 + · · ·+ αnfn = 0

implies that all the constants α1, α2, . . . , αn must be zero. In other words, if

α1f1(x) + α2f2(x) + · · ·+ αnfn(x) = 0

for all x ∈ S, then α1 = α2 = · · · = αn = 0.

Example. Prove that the two solutions Φ1(x) = e−2x cos 4x and Φ2(x) = e−2x sin 4x are linearly independent on (−∞,∞). We remind ourselves that two vectors are linearly independent if neither is a constant times the other. That is clearly true in this example, and so we are done.

Example. Prove that the functions x, x2, cosx are linearly independent on (−∞,∞). We suppose

(2) α1x+ α2x2 + α3 cosx = 0

for all x. This will certainly be true if the constants α1, α2, α3 are all zero. The question we have to deal with is whether there is any other choice of α1, α2, α3 for which (2) will hold for all x. If (2) holds for all x, then in particular it must be true for x = 0, x = 1, and x = 2. Plugging in these three values of x in succession gives three equations

α3 = 0

α1 + α2 + (cos 1)α3 = 0

2α1 + 4α2 + (cos 2)α3 = 0

which we can easily solve. We get α1 = α2 = α3 = 0. This is the only choice of constants that will make (2) true for the three values of x we selected, and so it is the only choice of constants that will make (2) true for all x. Thus, x, x2, cosx are linearly independent on (−∞,∞). Of course, we could have selected other values of x to get the desired result.

Here is a second approach that will show linear independence in this example. Differentiate (2) three times to get α3 sinx = 0, which implies that α3 = 0. Then the result of differentiating (2) twice is 2α2 = 0 and so α2 = 0. Then (2) reduces to α1x = 0 (for all x), which implies that α1 = 0.
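The first approach hinges on the 3×3 coefficient determinant of the system from x = 0, 1, 2 being nonzero, so that the homogeneous system has only the trivial solution. A Python check (the helper is ours, not the text’s):

```python
import math

# Determinant of a 3x3 matrix by cofactor expansion along the first row.
def det3(M):
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Coefficient matrix of alpha1*x + alpha2*x^2 + alpha3*cos(x) = 0
# evaluated at x = 0, 1, 2; a nonzero determinant forces alpha = 0.
A = [[x, x * x, math.cos(x)] for x in (0.0, 1.0, 2.0)]
d = det3(A)   # works out to exactly 2
```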

There is a tendency to not always pay attention to the interval on which we are trying to solve a differential equation. The next example shows that at times the interval is of definite importance when the issue is linear independence.

Example. Let f1(x) = x4 and f2(x) = x3|x|. On the interval (−∞,∞), f1 and f2 are linearly independent, as can be seen by setting α1x4 + α2x3|x| = 0 for all x, plugging in x = 1 and x = −1, and seeing that the resulting pair of equations implies that α1 = α2 = 0. On the other hand, f1 and f2 are linearly dependent on (0,∞) since on this interval they are equal.


When we find the general solution of linear homogeneous equations, it will turn out that the linear independence of certain solutions will be of crucial importance. The next theorem establishes what we will need in the case of linear, homogeneous second order equations with constant coefficients.

Theorem 2.5. The solutions Φ1 and Φ2 are linearly independent (on any interval) in each of the three cases covered in Theorem 2.1 and Theorem 2.2. Also, the real valued solutions Φ1 and Φ2 given in Theorem 2.2 are linearly independent on any interval.

Proof. As has been noted before, the easiest way to decide whether two functions are linearly independent or not is to ask whether either is a constant times the other. In all the cases under consideration, the answer is obviously no, and so the two solutions are linearly independent.

Thus, in the last section we were always trying to find two linearly independent solutions, not just any two solutions.

Exercises

1. If L(y) = y′′ + 3xy′ − 5y, compute L(x4).

2. If L(y) = 2xy′′ + 3y′ − 6y, compute L(e3x).

3. Find two linearly independent vectors (functions) in the kernel of L(y) = y′′ − 8y′ + 15y.

4. Find two linearly independent vectors (functions) in the kernel of L(y) = 9y′′ − 12y′ + 4y.

5. Prove Theorem 2.4.

6. Prove that sinx, sin 2x are linearly independent on (−∞,∞).

7. Prove that x, 2x− 1, lnx are linearly independent on (0,∞).

8. Prove that if f1, . . . , fn are linearly independent on a set S and S ⊂ T , then f1, . . . , fn are linearly independent on T .

5. Linear Homogeneous Equations of Order Two

We will now turn to initial value problems for second order linear homogeneous equations. The generalization to equations of order n > 2 will follow in the next section. As in the case of first order linear equations, we will have to restrict our attention to equations with leading coefficient a0(x) being nonzero, or after division by a0(x), equations with leading coefficient one. Specifically, we will consider the initial value problem

(3) y′′ + a1(x)y′ + a2(x)y = 0

y(x0) = β, y′(x0) = γ.

(Notice the change in notation. We now write y(x0) = β, while in the first order case we wrote y(x0) = y0.)

Theorem 2.6. (Existence Theorem) If a1 and a2 are continuous on an interval I containing x0, then the initial value problem (3) has a solution on I.


Proof. We will only prove this theorem in the special case of a1 and a2 being constant. For the general case, the reader should consult an advanced book.

If a1 and a2 are constant, then by Theorem 2.1 and Theorem 2.2 the differential equation y′′ + a1y′ + a2y = 0 always has a pair of solutions Φ1 and Φ2. Then

φ(x) = C1Φ1(x) + C2Φ2(x)

is a solution on the interval (−∞,∞) for any constants C1 and C2. We want to choose C1 and C2 so that φ also satisfies the initial conditions. This will happen if

C1Φ1(x0) + C2Φ2(x0) = β

and

C1Φ′1(x0) + C2Φ′2(x0) = γ.

In other words, φ will satisfy the initial conditions if we can solve the simultaneous linear (algebraic) equations

Φ1(x0)C1 + Φ2(x0)C2 = β

Φ′1(x0)C1 + Φ′2(x0)C2 = γ

for C1 and C2. Recall that such a system will have a (unique) solution for any β and γ if and only if the determinant of coefficients is nonzero. If Φ1(x) = er1x and Φ2(x) = er2x, then

det [ Φ1(x0) Φ2(x0) ; Φ′1(x0) Φ′2(x0) ] = det [ er1x0 er2x0 ; r1er1x0 r2er2x0 ] = (r2 − r1)er1x0er2x0 ≠ 0

for any distinct (real or imaginary) r1, r2. The reader can check that the determinant of coefficients is also nonzero when Φ1(x) = er1x and Φ2(x) = xer1x.

To prove uniqueness for the initial value problem (3), we will need the following technical lemma.

Lemma 2.1. Let a1 and a2 be continuous on a closed, bounded interval I containing a point x0, and put M1 = supI |a1(x)|, M2 = supI |a2(x)|. Let φ be any solution of y′′ + a1(x)y′ + a2(x)y = 0 on I. Then for each x ∈ I,

‖φ(x0)‖e−K|x−x0| ≤ ‖φ(x)‖ ≤ ‖φ(x0)‖eK|x−x0|,

where ‖φ(x)‖ = √(|φ(x)|2 + |φ′(x)|2) and K = 1 + M1 + M2.

Proof. We first note that M1 and M2 are finite since a continuous function on a closed, bounded interval is bounded. Put u = ‖φ‖2. Remembering that φ might be complex valued, we write

u = φφ̄ + φ′φ̄′.

Then, using (φ̄)′ = φ̄′, we get

u′ = φ′φ̄ + φφ̄′ + φ′′φ̄′ + φ′φ̄′′

and so,

|u′| ≤ 2|φ||φ′| + 2|φ′′||φ′|.

However, φ′′ = −a1φ′ − a2φ since φ is a solution of the differential equation, and so |φ′′| ≤ M1|φ′| + M2|φ| on I. Then

|u′| ≤ 2|φ||φ′| + 2M1|φ′|2 + 2M2|φ||φ′| = 2(1 + M2)|φ||φ′| + 2M1|φ′|2.


We will now use the inequality 2ab ≤ a2 + b2, which is true for any real numbers a and b (and can be proven by expanding (a − b)2 ≥ 0). This gives us

|u′| ≤ (1 +M2)(|φ|2 + |φ′|2) + 2M1|φ′|2 ≤ 2(1 +M1 +M2)(|φ|2 + |φ′|2).

Thus, |u′| ≤ 2Ku throughout I.

We have

(4) −2Ku(x) ≤ u′(x) ≤ 2Ku(x)

for all x ∈ I. The right hand inequality in (4) gives us u′(x) − 2Ku(x) ≤ 0, which can be thought of as a first order linear inequality. Letting µ = e−2Kx, multiply by µ to get, after a couple of steps, (e−2Kxu(x))′ ≤ 0 for all x ∈ I. Then,

x ≥ x0 ⇒ e−2Kxu(x) ≤ e−2Kx0u(x0)

⇒ u(x) ≤ e2K(x−x0)u(x0) = e2K|x−x0|u(x0)

⇒ ||φ(x)|| ≤ eK|x−x0| ||φ(x0)||.

Also,

x ≤ x0 ⇒ e−2Kxu(x) ≥ e−2Kx0u(x0)

⇒ u(x) ≥ e2K(x−x0)u(x0) = e−2K|x−x0|u(x0)

⇒ e−K|x−x0| ||φ(x0)|| ≤ ||φ(x)||.

Similarly, the left hand inequality in (4) leads to

x ≤ x0 ⇒ ||φ(x)|| ≤ eK|x−x0| ||φ(x0)||

and

x ≥ x0 ⇒ e−K|x−x0| ||φ(x0)|| ≤ ||φ(x)||.

These four implications combine to give us the desired result.

Theorem 2.7. (Uniqueness Theorem) If a1 and a2 are continuous on an interval I containing x0, then the initial value problem (3) has at most one solution on I.

Proof. Suppose that φ and ψ are solutions of (3) on I. We will show that φ(x) = ψ(x) for all x ∈ I. Pick any x in I, and let Ī be a closed, bounded subinterval of I that contains both x0 and x. We will apply the lemma to the function φ − ψ on the interval Ī. First, note that φ − ψ is a solution of y′′ + a1y′ + a2y = 0 on Ī. Next, ||(φ − ψ)(x0)|| = 0 since both φ and ψ satisfy the initial conditions in (3). The lemma then says that ||(φ − ψ)(x)|| = 0 since x ∈ Ī. Then, φ(x) = ψ(x). Since x was an arbitrary point in I, we are done.

We now turn to a handy way of showing that two functions are linearly independent.

Definition 2.4. The Wronskian W (f1, f2) of two differentiable functions f1 and f2 is the function given by

W (f1, f2)(x) = det [ f1(x) f2(x) ; f′1(x) f′2(x) ] = f1(x)f′2(x) − f2(x)f′1(x).

Lemma 2.2. Let f1 and f2 be differentiable functions on an interval I. If there exists a point x0 ∈ I for which W (f1, f2)(x0) ≠ 0, then f1 and f2 are linearly independent on I.


Proof. Suppose that α1f1(x) + α2f2(x) = 0 for all x ∈ I. Then differentiation gives us α1f′1(x) + α2f′2(x) = 0 for all x ∈ I. In particular, we have for x = x0,

f1(x0)α1 + f2(x0)α2 = 0

f′1(x0)α1 + f′2(x0)α2 = 0.

This system of two equations in two unknowns has a unique solution for α1, α2 since the determinant of coefficients is the Wronskian at x0, which is given to be nonzero. That unique solution is obviously α1 = α2 = 0. Therefore, f1 and f2 are linearly independent on I.

Example. Consider the functions x3 and x5 on (−∞,∞).

W (x3, x5) = det [ x3 x5 ; 3x2 5x4 ] = 5x7 − 3x7 = 2x7 ≠ 0

whenever x ≠ 0. Therefore, x3 and x5 are linearly independent on (−∞,∞).
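The Wronskian in this example is simple enough to script; a small Python check (ours, with the derivatives written out by hand):

```python
# Wronskian W(f1, f2)(x) = f1*f2' - f2*f1' for f1 = x^3, f2 = x^5,
# which simplifies to x^3*5x^4 - x^5*3x^2 = 2x^7.
def wronskian_x3_x5(x):
    f1, f1p = x**3, 3 * x**2
    f2, f2p = x**5, 5 * x**4
    return f1 * f2p - f2 * f1p
```

As the text notes, the Wronskian vanishes at x = 0 but nowhere else, which is enough for linear independence.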

If the two functions under consideration are solutions of the same second order linear homogeneous equation, then it turns out that much more can be said about the relationship between the Wronskian and linear independence.

Theorem 2.8. Let φ1 and φ2 be solutions of

(5) y′′ + a1(x)y′ + a2(x)y = 0

on an interval I. Then the following conditions are equivalent.
(a) W (φ1, φ2)(x) is never zero for x ∈ I.
(b) There exists a point x0 ∈ I such that W (φ1, φ2)(x0) ≠ 0.
(c) φ1, φ2 are linearly independent on I.

Proof. The implication (a) ⇒ (b) is trivial, while the implication (b) ⇒ (c) was proved in the preceding lemma. Thus, we will be done if we can show that (c) implies (a). We assume that (c) holds, i.e., that φ1, φ2 are linearly independent on I. Now let’s suppose that there is some point x∗ ∈ I such that W (φ1, φ2)(x∗) = 0 and show how this leads to a contradiction. Since we have W (φ1, φ2)(x∗) = 0, the system of equations

φ1(x∗)C1 + φ2(x∗)C2 = 0

φ′1(x∗)C1 + φ′2(x∗)C2 = 0

has a solution for C1, C2 such that either C1 ≠ 0 or C2 ≠ 0 (or both). For this choice of C1, C2, put φ = C1φ1 + C2φ2. Then, φ satisfies (5). Also, φ(x∗) = 0 by the first of the two equations in the above system, while the second says that φ′(x∗) = 0. In other words, φ satisfies the initial value problem

y′′ + a1(x)y′ + a2(x)y = 0

y(x∗) = 0, y′(x∗) = 0

on I. However, the identically zero function trivially satisfies this initial value problem. Thus, φ = 0 by the Uniqueness Theorem, and so, C1φ1 + C2φ2 = 0 for some C1, C2 that are not both zero. This says that φ1, φ2 are linearly dependent on I, which is a contradiction. Since supposing that (a) was false led to a contradiction, we can conclude that (a) must be true whenever (c) is true.


Here, as well as everywhere else in this section, we assume that a1 and a2 are continuous on I. Notice that the last theorem implies that if φ1, φ2 are solutions of (5) on an interval I, then either W (φ1, φ2)(x) is always zero or never zero on I.

Example. For any fixed number r1, we already know that Φ1(x) = er1x and Φ2(x) = xer1x are linearly independent solutions on (−∞,∞) of a second order linear homogeneous equation with constant coefficients whose characteristic polynomial has r1 as a double root. This is confirmed by the computation

W (er1x, xer1x) = det [ er1x xer1x ; r1er1x xr1er1x + er1x ] = er1x(xr1er1x + er1x) − r1er1x · xer1x = e2r1x ≠ 0.

Example. Let f1(x) = x − 1 and f2(x) = ex for all x. We can easily compute that W (f1, f2)(x) = (x − 2)ex. Then W (f1, f2)(0) = −2 ≠ 0 implies (by the last lemma) that f1 and f2 are linearly independent on (−∞,∞), while W (f1, f2)(2) = 0 implies nothing. Note that f1 and f2 cannot be solutions of the same second order linear homogeneous equation on (−∞,∞) since their Wronskian is sometimes zero and sometimes nonzero there.

In trying to solve a differential equation, the ultimate goal is to find all solutions of the equation. With the help of the Wronskian and the Uniqueness Theorem, we are now in the position of being able to see how this is done for second order linear homogeneous equations.

Theorem 2.9. Let φ1 and φ2 be any two linearly independent solutions of (5) on an interval I. Then every solution on I of (5) is of the form φ = C1φ1 + C2φ2, where C1 and C2 are constants.

Proof. Let φ be any solution of (5) on I. Let x0 be any point in I, and define β and γ by β = φ(x0) and γ = φ′(x0). Then, obviously, φ satisfies the initial value problem (3) on I for this choice of β and γ. On the other hand, W (φ1, φ2)(x0) ≠ 0 since φ1 and φ2 are linearly independent solutions of (5). This implies that there exist constants C1 and C2 such that

φ1(x0)C1 + φ2(x0)C2 = β

φ′1(x0)C1 + φ′2(x0)C2 = γ,

since the Wronskian is the determinant of the coefficient matrix of the system. Put ψ = C1φ1 + C2φ2 for this choice of C1 and C2. As a linear combination of φ1 and φ2, ψ must be a solution of the differential equation in (5); that it satisfies the initial conditions follows directly from the system of equations given above. The Uniqueness Theorem then says that φ = ψ. In other words, φ = C1φ1 + C2φ2.

We will refer to C1φ1 + C2φ2 as the general solution of (5). Notice that we have proven that φ1 and φ2 span the vector space of all solutions of (5). Since they were given to be linearly independent, they form a basis for the solution space, and so the solution space has dimension two. Then a standard result from linear algebra states that for any solution φ, the constants C1 and C2 found above must be unique.

Example. Find all solutions of y′′ + 6y′ + 10y = 0. The characteristic polynomial has roots r = −3 ± i. A basis for the solution space is e(−3+i)x, e(−3−i)x.


Another basis is e−3x cosx, e−3x sinx. If we choose the latter, for example, we can write the general solution as

φ(x) = C1e−3x cosx + C2e−3x sinx.

(Note that the general solution can also be written as

φ(x) = K1e(−3+i)x + K2e(−3−i)x.

This is the same collection of functions; it just appears in a different form.)

Example. Solve the initial value problem

y′′ − 7y′ + 12y = 0, y(0) = 3, y′(0) = 7.

The characteristic polynomial has roots r = 3, 4. Thus, every solution of the differential equation is of the form

φ(x) = C1e3x + C2e4x.

Then, φ′(x) = 3C1e3x + 4C2e4x. The initial conditions imply that C1 + C2 = 3 and 3C1 + 4C2 = 7. Solving these two equations simultaneously gives C1 = 5 and C2 = −2. Therefore, the solution of the initial value problem is φ(x) = 5e3x − 2e4x.

Let’s look again at the constant coefficient equation

(6) ay′′ + by′ + cy = 0.

In the case of the characteristic polynomial having a conjugate pair of roots λ ± µi, the general solution of (6) is

φ(x) = C1eλx cosµx + C2eλx sinµx.

If we can find numbers A > 0 and δ such that A cos δ = C1 and A sin δ = C2, then

φ(x) = C1eλx cosµx + C2eλx sinµx

= eλx[(cosµx)(A cos δ) + (sinµx)(A sin δ)]

= Aeλx cos(µx − δ)

is another way of writing the general solution of (6), which can be helpful in visualizing how the solutions behave. The numbers A and δ can, in fact, always be found; we easily see that A = √(C12 + C22) and that δ can be determined from cos δ = C1/A, sin δ = C2/A.

Example. Solve the initial value problem

y′′ + 2y′ + 5y = 0, y(0) = 1, y′(0) = 1.

The characteristic polynomial has roots −1 ± 2i. Thus, every solution of the differential equation is of the form

φ(x) = C1e−x cos 2x + C2e−x sin 2x.

We can use the initial conditions to find that C1 = 1 and C2 = 1. Then, the solution of the differential equation is φ(x) = e−x cos 2x + e−x sin 2x. But also, A = √(12 + 12) = √2 and δ = π/4 since sin δ = 1/√2, cos δ = 1/√2. Therefore, the solution can be written as φ(x) = √2 e−x cos(2x − π/4). The graph of the solution is shown in Figure 1. Also included are the plots of y = √2 e−x and y = −√2 e−x.



Figure 1. y′′ + 2y′ + 5y = 0, y(0) = 1, y′(0) = 1

Exercises

1. Find the general solution of 9y′′ − 6y′ + y = 0.

2. Find the general solution of y′′ − 2y′ + 4y = 0.

3. Solve y′′ + 3y′ − 10y = 0, y(0) = 1, y′(0) = 9.

4. Solve y′′ − 4y′ + 4y = 0, y(0) = 3, y′(0) = −2.

5. Verify that x and 1/x are solutions of y′′ + (1/x)y′ − (1/x2)y = 0 on (0,∞), and then prove that the general solution of the equation on (0,∞) is φ(x) = C1x + C2x−1.

6. Solve my′′ + ky = 0, y(0) = β, y′(0) = 0, where m and k are positive constants. What about your solution shows that this initial value problem does not model with complete accuracy the mass-spring system given at the beginning of this chapter?

7. A more realistic model of a mass-spring system is given by the equation my′′ + by′ + ky = 0, where b is a positive constant related to friction. The damping, as reflected in the by′ term, can be thought of as corresponding to the spring being in a viscous fluid. The more viscous the fluid, the larger the value of b is.
(a) Based on physical considerations, do you expect solutions that eventually approach zero monotonically if b is close to zero or if b is large? What about solutions oscillating toward zero?
(b) Solve my′′ + by′ + ky = 0 and find the value of b (in terms of m and k) that separates the two types of behavior.

8. If φ1, φ2, and φ3 are solutions of equation (5) on an interval I, prove φ1, φ2, φ3 are linearly dependent on I.

9. Let φ1 and φ2 be solutions of (5) on an interval I containing a point x0. If φ1(x0) = 5φ2(x0) and φ1′(x0) = 5φ2′(x0), prove that φ1(x) = 5φ2(x) for all x ∈ I.


10. Let φ1 and φ2 be solutions of (5) on an interval I containing a point x0. If φ1(x0) = 0 and φ2(x0) = 0, prove that φ1 and φ2 are linearly dependent on I.

11. Let φ1 and φ2 be solutions of (5) on an interval I containing a point x0.
(a) Prove that the Wronskian W(φ1, φ2)(x) satisfies the differential equation W′ + a1(x)W = 0.
(b) By solving the first order linear equation, show that the Wronskian satisfies the formula

W(φ1, φ2)(x) = exp[−∫ from x0 to x of a1(t) dt] · W(φ1, φ2)(x0)

for all x ∈ I. (By definition, exp u = e^u.)

6. Higher Order Linear Homogeneous Equations

We will now generalize the material of the last sections to linear homogeneous equations of order n > 2. There is very little that is really new, and so we will mainly just state results without going through their proofs. Let's start with the constant coefficient case.

Theorem 2.10. Consider the linear homogeneous equation

(7) a0 y^(n) + a1 y^(n−1) + · · · + an y = 0,

where a0, a1, . . . , an are constants and a0 ≠ 0. Let r1, . . . , rk be the distinct roots of the characteristic polynomial

p(r) = a0 r^n + a1 r^(n−1) + · · · + an,

and suppose rj has multiplicity mj, j = 1, . . . , k. (Note that by the Fundamental Theorem of Algebra, m1 + · · · + mk = n.) Then the n functions

e^(r1 x), x e^(r1 x), . . . , x^(m1−1) e^(r1 x),
...
e^(rk x), x e^(rk x), . . . , x^(mk−1) e^(rk x)

are linearly independent solutions of (7). Furthermore, since the coefficients are real, any imaginary roots must come in complex conjugate pairs λ ± µi, and the solutions x^s e^((λ±µi)x) can be replaced by x^s e^(λx) cos µx, x^s e^(λx) sin µx, resulting in, again, a collection of n linearly independent solutions.

Example. Suppose p(r) = (r − 6)³(r + 2) is the characteristic polynomial for a fourth order linear homogeneous equation with constant coefficients. Then e^(6x), x e^(6x), x² e^(6x), e^(−2x) are four linearly independent solutions of the differential equation.

Example. Find five linearly independent solutions of y^(5) − y^(4) = 0. The characteristic equation is r^5 − r^4 = 0, which in factored form is r^4(r − 1) = 0. This leads to r = 0, 0, 0, 0, 1, counting multiplicity. Therefore, 1, x, x², x³, e^x are five linearly independent solutions of the differential equation.

Example. Find four linearly independent solutions of y^(4) + 6y′′ + 9y = 0. The characteristic equation is r^4 + 6r² + 9 = 0, which we factor as (r² + 3)² = 0 to get r = ±√3 i, ±√3 i. Therefore, e^(±i√3 x), x e^(±i√3 x) are four linearly independent solutions of the differential equation. On the other hand, so are cos √3 x, sin √3 x, x cos √3 x, x sin √3 x.
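The passage from the complex solutions e^(±i√3 x) to the real solutions cos √3 x and sin √3 x rests on Euler's formula; here is a one-line numerical sanity check in Python (the sample point x = 0.8 is arbitrary):

```python
import cmath
import math

# Euler's formula: e^{i*mu*x} = cos(mu*x) + i*sin(mu*x),
# so the real and imaginary parts of a complex exponential
# solution are themselves (real) solutions.
mu = math.sqrt(3)
x = 0.8  # arbitrary sample point
z = cmath.exp(1j * mu * x)
real_part, imag_part = z.real, z.imag
```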

As in the second order case, an appropriately defined initial value problem will have a unique solution, as the following two theorems assert. Assume a0(x) ≡ 1.

Theorem 2.11. (Existence Theorem) If a1, a2, . . . , an are continuous on an interval I containing a point x0, then the initial value problem

(8) y^(n) + a1(x) y^(n−1) + · · · + an(x) y = 0,
    y(x0) = β1, y′(x0) = β2, . . . , y^(n−1)(x0) = βn

has a solution on I.

Theorem 2.12. (Uniqueness Theorem) If a1, a2, . . . , an are continuous on an interval I containing a point x0, then the initial value problem (8) has only one solution on I.

In the present setting, the Wronskian will again be useful in dealing with linear independence and in verifying that a collection of solutions gives a general solution.

Definition 2.5. The Wronskian W(f1, f2, . . . , fn) of any (n − 1)-times differentiable functions f1, f2, . . . , fn is the function given by

W(f1, f2, . . . , fn)(x) =
    | f1(x)         f2(x)         · · ·   fn(x)         |
    | f1′(x)        f2′(x)        · · ·   fn′(x)        |
    | ⋮             ⋮                     ⋮             |
    | f1^(n−1)(x)   f2^(n−1)(x)   · · ·   fn^(n−1)(x)   |

Lemma 2.3. Let f1, f2, . . . , fn be (n − 1)-times differentiable functions on an interval I. If there exists a point x0 ∈ I for which W(f1, f2, . . . , fn)(x0) ≠ 0, then f1, f2, . . . , fn are linearly independent on I.

Example. Prove that the functions f1(x) = sin x, f2(x) = cos x, f3(x) = x³ are linearly independent on (−∞, ∞). We have

W(f1, f2, f3)(x) =
    | sin x     cos x     x³   |
    | cos x    −sin x    3x²   |
    | −sin x   −cos x    6x    |

In particular,

W(f1, f2, f3)(π) =
    | 0    −1    π³   |
    | −1    0    3π²  |
    | 0     1    6π   |
= −6π − π³ ≠ 0.

Therefore, sin x, cos x, x³ are linearly independent on (−∞, ∞). Note that, in this particular example, W(f1, f2, f3)(0) = 0, which tells us nothing about linear independence or dependence.
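The two Wronskian evaluations in this example can be checked numerically; below is a minimal Python sketch (the helper names det3 and wronskian3 are ours) that expands the 3 × 3 determinant by cofactors along the first row:

```python
import math

def det3(m):
    """3x3 determinant by cofactor expansion along the first row."""
    a, b, c = m[0]
    d, e, f = m[1]
    g, h, i = m[2]
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def wronskian3(x):
    # rows are (f, f', f'') for f1 = sin, f2 = cos, f3 = x^3
    return det3([
        [math.sin(x),   math.cos(x),  x ** 3],
        [math.cos(x),  -math.sin(x),  3 * x ** 2],
        [-math.sin(x), -math.cos(x),  6 * x],
    ])

W_pi = wronskian3(math.pi)  # should equal -6*pi - pi**3, which is nonzero
W_0 = wronskian3(0.0)       # equals 0: inconclusive about independence
```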

In contrast to the last example, the following theorem implies that the Wronskian being zero does imply linear dependence if we are dealing with the right number of solutions of the right kind of differential equation.

Theorem 2.13. Let φ1, φ2, . . . , φn be solutions of

(9) y^(n) + a1(x) y^(n−1) + · · · + an(x) y = 0


on an interval I. Then the following conditions are equivalent.
(a) W(φ1, φ2, . . . , φn)(x) is never zero for x ∈ I.
(b) There exists a point x0 ∈ I such that W(φ1, φ2, . . . , φn)(x0) ≠ 0.
(c) φ1, φ2, . . . , φn are linearly independent on I.

Example. The functions 1, x, x², e^(6x) are four linearly independent solutions, as seen in Theorem 2.10, of a fourth order linear homogeneous differential equation (with constant coefficients) corresponding to a characteristic polynomial with roots 0, 0, 0, 6. We get, for all x ∈ (−∞, ∞),

W(1, x, x², e^(6x)) =
    | 1   x   x²   e^(6x)     |
    | 0   1   2x   6e^(6x)    |
    | 0   0   2    36e^(6x)   |
    | 0   0   0    216e^(6x)  |
= 432e^(6x) ≠ 0,

which is in agreement with what the last theorem predicts. Notice, however, that x, e^(6x) are two solutions of a fourth order equation while W(x, e^(6x)) = (6x − 1)e^(6x) is sometimes zero and sometimes nonzero, a circumstance that, of course, does not contradict the last theorem.

Just as in the second order case, the Wronskian and the Uniqueness Theorem give us the following result, which tells us how to find the general solution of a linear homogeneous equation.

Theorem 2.14. Let φ1, φ2, . . . , φn be any n linearly independent solutions of (9) on an interval I. Then every solution of (9) is of the form

φ = C1 φ1 + C2 φ2 + · · · + Cn φn,

where C1, C2, . . . , Cn are constants.

Example. Find the general solution of y′′′ − 8y = 0. The characteristic equation is r³ − 8 = 0, which can be factored as (r − 2)(r² + 2r + 4) = 0. We get r = 2, −1 ± √3 i. An alternate way of solving this characteristic equation is to note that we are trying to find the cube roots of 8 = 8e^(0i). They are r1 = 2e^((0/3)i), r2 = 2e^((2π/3)i), and r3 = 2e^((4π/3)i), which simplify to r1 = 2, r2 = −1 + √3 i, and r3 = −1 − √3 i. The general solution can then be written as

φ(x) = C1 e^(2x) + C2 e^(−x) cos √3 x + C3 e^(−x) sin √3 x.

(If we do not wish to highlight which solutions are real valued, we can also write the general solution as φ(x) = K1 e^(2x) + K2 e^((−1+√3 i)x) + K3 e^((−1−√3 i)x).)
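The polar-form computation of the cube roots of 8 is easy to reproduce numerically; a short Python sketch using complex exponentials:

```python
import cmath
import math

# The cube roots of 8 = 8*e^{0i} are r_k = 2*e^{(2*pi*k/3)i}, k = 0, 1, 2.
roots = [2 * cmath.exp(2j * math.pi * k / 3) for k in range(3)]

# Each root satisfies the characteristic equation r^3 - 8 = 0,
# so every residual below should be (numerically) zero.
residuals = [r ** 3 - 8 for r in roots]
```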

Example. Solve the initial value problem y′′′ − 2y′′ − 3y′ + 6y = 0, y(0) = 4, y′(0) = 4, and y′′(0) = 14. The characteristic equation is r³ − 2r² − 3r + 6 = 0. If we solve this by grouping, we get r²(r − 2) − 3(r − 2) = 0 and then (r − 2)(r² − 3) = 0. Thus, r = 2, ±√3. The general solution of the differential equation is φ(x) = C1 e^(2x) + C2 e^(√3 x) + C3 e^(−√3 x). The initial conditions lead to the system

C1 + C2 + C3 = 4
2C1 + √3 C2 − √3 C3 = 4
4C1 + 3C2 + 3C3 = 14

We get C1 = 2, C2 = 1, C3 = 1. Therefore the solution to the initial value problem is φ(x) = 2e^(2x) + e^(√3 x) + e^(−√3 x).
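The 3 × 3 linear system for C1, C2, C3 can be solved by hand or by elimination; the sketch below (a generic Gaussian-elimination helper of our own, not from the text) recovers C1 = 2, C2 = 1, C3 = 1:

```python
import math

def solve3(A, b):
    """Gaussian elimination with partial pivoting for a 3x3 system."""
    n = 3
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

s3 = math.sqrt(3)
A = [[1, 1, 1], [2, s3, -s3], [4, 3, 3]]
b = [4, 4, 14]
C1, C2, C3 = solve3(A, b)
```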


Exercises

1. Find all real valued solutions of y′′′ − 6y′′ + 13y′ − 10y = 0.

2. Find all real valued solutions of y′′′ − 3y′′ + 9y′ − 27y = 0.

3. Find all real valued solutions of y^(4) − y′′′ + y′ − y = 0.

4. Find all real valued solutions of y′′′ + y = 0.

5. Solve y′′′ + y′′ − y′ − y = 0, y(0) = 2, y′(0) = 0, y′′(0) = 6.

6. Solve y′′′ − 2y′′ − y′ + 2y = 0, y(0) = 4, y′(0) = 3, y′′(0) = 7.

7. Let f1(x) = x, f2(x) = x + 1, f3(x) = x³.
(a) Compute W(f1, f2, f3)(x).
(b) What does the Wronskian say about the linear independence or dependence of f1, f2, f3 on (−∞, ∞)? Why?
(c) Can f1, f2, f3 be three solutions of a third order linear homogeneous differential equation on (−∞, ∞)? Why?
(d) Can f1, f2, f3 be three solutions of a second order linear homogeneous differential equation on (−∞, ∞)? Why?
(e) Find a fourth order linear homogeneous differential equation with constant coefficients that has f1, f2, f3 as solutions.

8. Let f1(x) = x, f2(x) = e^x, f3(x) = e^(−x).
(a) Compute W(f1, f2, f3)(x).
(b) What does the Wronskian say about the linear independence or dependence of f1, f2, f3 on (−∞, ∞)? Why?
(c) Can f1, f2, f3 be three solutions of a third order linear homogeneous differential equation on (−∞, ∞)? Why?
(d) Can f1, f2, f3 be three solutions of a second order linear homogeneous differential equation on (−∞, ∞)? Why?
(e) Find a fourth order linear homogeneous differential equation with constant coefficients that has f1, f2, f3 as solutions.

9. Prove Lemma 2.3.

10. Prove Theorem 2.13.

7. Nonhomogeneous Linear Equations and the Annihilator Method

To find the general solution of the nonhomogeneous linear equation

(10) a0(x) y^(n) + a1(x) y^(n−1) + · · · + an(x) y = g(x),

where a0, a1, . . . , an, g are continuous on an interval I, it will be necessary to consider the corresponding homogeneous equation

(11) a0(x) y^(n) + a1(x) y^(n−1) + · · · + an(x) y = 0.

As before, it will be convenient to represent the left side of these equations by the linear operator

(12) L(y) = a0 y^(n) + a1 y^(n−1) + · · · + an y.


Theorem 2.15. Let ψp be a solution of the nonhomogeneous equation (10). Then every solution ψ of (10) is of the form ψ = ψp + φ, where φ is a solution of the corresponding homogeneous equation (11).

Proof. We have that ψp and ψ are solutions of L(y) = g. In other words, L(ψp) = g and L(ψ) = g. Then since L is linear, L(ψ − ψp) = L(ψ) − L(ψp) = g − g = 0, and so ψ − ψp is a solution φ of (11). Then, ψ − ψp = φ implies ψ = ψp + φ, as claimed.

Let us note that ψp is usually referred to as a particular solution of (10). It will be shown in the next section that equation (10) does, in fact, have solutions when a0(x) is non-zero. In other words, we will prove that ψp exists if a0(x) is non-zero.

Suppose that φ1, φ2, . . . , φn are linearly independent solutions of (11) and ψp is a particular solution of (10). We know that the general solution of (11) is

φ = C1 φ1 + C2 φ2 + · · · + Cn φn.

Then the last theorem says that the general solution of the nonhomogeneous equation (10) is

ψ = ψp + C1 φ1 + C2 φ2 + · · · + Cn φn.

Example. Given that ψp(x) = x² − 2 is a solution of y′′ + y = x², find the general solution of this equation. The homogeneous equation y′′ + y = 0 leads to r = ±i, from which we get its general solution φ(x) = C1 cos x + C2 sin x. Therefore, the general solution of the nonhomogeneous equation is

ψ(x) = (x² − 2) + C1 cos x + C2 sin x.

The natural question to ask at this point is how we can find a particular solution ψp. There are two methods we will be using. The first is specifically for constant coefficient equations, and is called the annihilator method. To prepare for it, we will need to make some observations about linear differential operators.

Let D stand for differentiation. Then D is a linear differential operator. For any positive integer j, put D^j equal to the composition of j copies of D; that is, D^j is the jth derivative linear differential operator. Note that (12) can now be written as

L = a0 D^n + a1 D^(n−1) + · · · + an I,

where I is the identity operator. As is done for linear transformations in general, we will let ML stand for the composition of L followed by M whenever L and M are linear differential operators. It follows that ML is then a linear differential operator.

Lemma 2.4. If L, M, N, P are linear differential operators, then

(N + P)(L + M) = NL + NM + PL + PM.

The proof is routine and will be omitted.
There are many examples of linear differential operators L, M for which ML and LM are different. However, if L and M involve constant coefficients, the following lemma holds.

Lemma 2.5. If L and M are constant coefficient, linear differential operators, then ML = LM.


Proof. Say L = a0 D^n + a1 D^(n−1) + · · · + an I and M = b0 D^m + b1 D^(m−1) + · · · + bm I. By the last lemma,

ML = (b0 D^m + b1 D^(m−1) + · · · + bm I)(a0 D^n + a1 D^(n−1) + · · · + an I)
   = b0 a0 D^(m+n) + (b0 a1 + b1 a0) D^(m+n−1) + · · · + bm an I
   = a0 b0 D^(n+m) + (a0 b1 + a1 b0) D^(n+m−1) + · · · + an bm I
   = (a0 D^n + a1 D^(n−1) + · · · + an I)(b0 D^m + b1 D^(m−1) + · · · + bm I)
   = LM.

Example. Let L = 2D + 4I and M = 3D. We will check directly that ML = LM. For any y,

ML(y) = M(L(y))
      = M(2y′ + 4y)
      = 3(2y′ + 4y)′
      = 6y′′ + 12y′

and

LM(y) = L(M(y))
      = L(3y′)
      = 2(3y′)′ + 4(3y′)
      = 6y′′ + 12y′.

As noted above, the last lemma does not hold for variable coefficient, linear differential operators. If in the last example we had M = xD rather than M = 3D, then part of the computation would involve 2(xy′)′, which when expanded requires the product rule. The reader can check that we would not get ML = LM. We will assume from here on out that all our linear differential operators involve only constant coefficients.

The next theorem will provide the key step in finding a particular solution ψp using the annihilator method. It enables us to transfer a problem involving a nonhomogeneous equation to a problem involving a homogeneous equation, which we already know how to solve.

Theorem 2.16. Consider the nonhomogeneous, constant coefficient, linear differential equation L(y) = g, where g is a solution of a homogeneous, constant coefficient, linear differential equation M(y) = 0. Then every solution of L(y) = g is also a solution of ML(y) = 0.

Proof. Let ψ be a solution of L(y) = g. We have L(ψ) = g. Then,

ML(ψ) = M(L(ψ)) = M(g) = 0,

which says that ψ is a solution of ML(y) = 0.

We need just one more simple result before we are ready to solve some nonhomogeneous equations.


Lemma 2.6. Let L and M be constant coefficient, linear differential operators. If p is the characteristic polynomial for L and q is the characteristic polynomial for M, then qp is the characteristic polynomial for ML.

Proof. Say L = a0 D^n + a1 D^(n−1) + · · · + an I, M = b0 D^m + b1 D^(m−1) + · · · + bm I. Then p(r) = a0 r^n + a1 r^(n−1) + · · · + an and q(r) = b0 r^m + b1 r^(m−1) + · · · + bm. As in the proof of the last lemma,

ML = b0 a0 D^(m+n) + (b0 a1 + b1 a0) D^(m+n−1) + · · · + bm an I,

whose characteristic polynomial is clearly qp.

We are now ready to apply the annihilator method.

Example. Solve y′′ − 3y′ + 2y = x². In other words, solve L(y) = x², where L = D² − 3D + 2I. We will first solve the corresponding homogeneous equation L(y) = 0. Its characteristic polynomial is p(r) = r² − 3r + 2, whose roots are easily seen to be r = 1, 2. The general solution of the homogeneous equation is then φ(x) = C1 e^x + C2 e^(2x).

To find a particular solution ψp, we need a constant coefficient, linear homogeneous equation that has x² as a solution. The simplest choice is y′′′ = 0, which we are able to find by educated guessing. Put another way, we have found a linear, constant coefficient operator M = D³ that annihilates x². By the last theorem, every solution ψ of y′′ − 3y′ + 2y = x² will also be a solution of ML(y) = 0. In particular, ψp will be among the solutions of D³[D² − 3D + 2I](y) = 0. To solve this fifth order homogeneous equation, we note that its characteristic polynomial is q(r)p(r) = r³[r² − 3r + 2], which has roots r = 0, 0, 0, 1, 2. Then ψp(x) is among the solutions K1 + K2 x + K3 x² + K4 e^x + K5 e^(2x). In other words,

ψp(x) = K1 + K2 x + K3 x² + K4 e^x + K5 e^(2x)

if the constants K1, K2, K3, K4, K5 are chosen properly. We are trying to find constants K1, K2, K3, K4, K5 such that L(K1 + K2 x + K3 x² + K4 e^x + K5 e^(2x)) = x². However, L(e^x) = 0 and L(e^(2x)) = 0 since e^x and e^(2x) are solutions of L(y) = 0. This, combined with the fact that L is linear, implies that

L(K1 + K2 x + K3 x² + K4 e^x + K5 e^(2x)) = L(K1 + K2 x + K3 x²).

In other words, if we plug ψp into the left side of y′′ − 3y′ + 2y = x², the e^x and e^(2x) terms will not contribute to getting x². For this reason, we can simplify the form of ψp(x) to

ψp(x) = K1 + K2 x + K3 x².

We will now plug this into the equation y′′ − 3y′ + 2y = x² to see what K1, K2, K3 must be. We get

2K3 − 3(K2 + 2K3 x) + 2(K1 + K2 x + K3 x²) = x²

and so,

2K3 x² + (−6K3 + 2K2) x + (2K3 − 3K2 + 2K1) = x².

This will hold for all x if (and, since x², x, 1 are linearly independent, only if) K1, K2, K3 satisfy

2K3 = 1
2K2 − 6K3 = 0
2K1 − 3K2 + 2K3 = 0


We get K3 = 1/2, K2 = 3/2, and then K1 = 7/4. Thus, ψp(x) = 7/4 + (3/2)x + (1/2)x². Therefore, the general solution of y′′ − 3y′ + 2y = x² is

ψ(x) = 7/4 + (3/2)x + (1/2)x² + C1 e^x + C2 e^(2x).
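It is always worth verifying a particular solution by substituting it back into the equation. A minimal Python sketch for this example, using the exact derivatives of ψp (the helper names are ours):

```python
def psi_p(x):
    # particular solution found by the annihilator method
    return 7 / 4 + (3 / 2) * x + (1 / 2) * x ** 2

def psi_p_d1(x):
    # first derivative of psi_p
    return 3 / 2 + x

def psi_p_d2(x):
    # second derivative of psi_p (a constant)
    return 1.0

def residual(x):
    # L(psi_p) - x^2 should vanish identically
    return psi_p_d2(x) - 3 * psi_p_d1(x) + 2 * psi_p(x) - x ** 2

checks = [residual(x) for x in (-2.0, 0.0, 1.5, 10.0)]
```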

Notice that if we had kept K4 and K5 in our computation, we would have found the general solution ψ(x) directly since there would turn out to be no restrictions on K4 and K5, thus making them arbitrary. In the next example, we will find the annihilator M in a more methodical manner.

Example. Solve y′′′ − 8y = 7x e^x. We should first solve the corresponding homogeneous equation y′′′ − 8y = 0. This, however, was already done in the last section. We found there that the general solution of y′′′ − 8y = 0 is

φ(x) = C1 e^(2x) + C2 e^(−x) cos √3 x + C3 e^(−x) sin √3 x.

In order to find ψp, we will need a linear differential operator M that annihilates 7x e^x. To this end, let us recall that when we solve a homogeneous equation, we start with a linear differential operator L, find its characteristic polynomial p(r), find the roots r of the polynomial, and then get solutions e^(rx) of the differential equation. Finding M will involve going through this process, but in reverse order. We start by noting that 7x e^x comes from a double root r = 1, 1 since the exponent is 1x and since the x in front of the exponential corresponds to a double root. (The 7 is a constant Cj that is not relevant in finding M.) The simplest polynomial that gives us a double root r = 1, 1 is q(r) = (r − 1)². This, in turn, corresponds to the linear differential operator M = (D − I)². We have found an operator M such that M(7x e^x) = 0.

Now that we have an annihilator, we can proceed as in the last example. The particular solution ψp will be a solution of the fifth order equation ML(y) = 0, whose characteristic polynomial has roots r = 1, 1, 2, −1 ± √3 i. Then

ψp(x) = K1 e^x + K2 x e^x + K3 e^(2x) + K4 e^(−x) cos √3 x + K5 e^(−x) sin √3 x

if K1, K2, K3, K4, K5 are chosen properly. However, since e^(2x), e^(−x) cos √3 x, and e^(−x) sin √3 x are solutions of L(y) = 0, we can simplify the form of ψp(x) to get

ψp(x) = K1 e^x + K2 x e^x.

Plugging ψp(x) into the original equation gives us

(K1 + 3K2) e^x + K2 x e^x − 8(K1 e^x + K2 x e^x) = 7x e^x.

Equating the coefficients of x e^x and then of e^x, we get the system of equations

−7K2 = 7
−7K1 + 3K2 = 0

whose solution is K1 = −3/7, K2 = −1. Thus, ψp(x) = −(3/7)e^x − x e^x. Therefore,

ψ(x) = −(3/7)e^x − x e^x + C1 e^(2x) + C2 e^(−x) cos √3 x + C3 e^(−x) sin √3 x

is the general solution of y′′′ − 8y = 7x e^x.

Example. Solve y′′ − 3y′ + 2y = 5 cos x, y(0) = 0, y′(0) = −4. The equation y′′ − 3y′ + 2y = 0 has characteristic polynomial p(r) = r² − 3r + 2 with roots r = 1, 2. The general solution of the homogeneous equation is, therefore, φ(x) = C1 e^x + C2 e^(2x).


The function 5 cos x comes from the conjugate pair of roots r = 0 ± 1i, or simply r = ±i. (The roots, in turn, lead to a polynomial q(r) and a linear differential operator M, which we need not find explicitly.) As in the earlier examples, the form of ψp is determined by the gathering together of the roots (counting multiplicity) of p(r) and q(r), which in this case is r = ±i, 1, 2. We get

ψp(x) = K1 cos x + K2 sin x + K3 e^x + K4 e^(2x),

and after simplifying, ψp(x) = K1 cos x + K2 sin x for the proper choice of K1, K2. Plugging this into the nonhomogeneous equation, we are led, after a couple of steps, to the system

K1 − 3K2 = 5
3K1 + K2 = 0

whose solution is K1 = 1/2, K2 = −3/2. Therefore, ψp(x) = (1/2) cos x − (3/2) sin x and the general solution of the nonhomogeneous equation is

ψ(x) = (1/2) cos x − (3/2) sin x + C1 e^x + C2 e^(2x).

Now, the initial conditions give us the system

1/2 + C1 + C2 = 0
−3/2 + C1 + 2C2 = −4

which leads to C1 = 3/2, C2 = −2. Therefore the solution of the initial value problem is ψ(x) = (1/2) cos x − (3/2) sin x + (3/2)e^x − 2e^(2x).

So far, the form of ψp has mirrored that of g quite closely, but when p(r) and q(r) have roots in common the pattern is not quite so simple, as we will see in the next example.

Example. Solve y′′ − 5y′ + 6y = 4e^(3x). The equation y′′ − 5y′ + 6y = 0 has characteristic polynomial p(r) = r² − 5r + 6 with roots r = 2, 3. The general solution of the homogeneous equation is, therefore, φ(x) = C1 e^(2x) + C2 e^(3x).

The function g(x) = 4e^(3x) comes from the root r = 3. Gathering the roots of p(r) and q(r) together, we have r = 2, 3, 3. Then the particular solution is of the form

ψp(x) = K1 e^(2x) + K2 e^(3x) + K3 x e^(3x).

Since e^(2x) and e^(3x) are solutions of the homogeneous equation, we can simplify so that ψp(x) = K3 x e^(3x) for the proper choice of K3. Plugging in, we easily find that K3 = 4. Therefore, ψp(x) = 4x e^(3x) and ψ(x) = 4x e^(3x) + C1 e^(2x) + C2 e^(3x).

Let's summarize the annihilator method. We find a particular solution ψp(x) for the nonhomogeneous, constant coefficient, linear equation L(y) = g(x) as follows:

• Find the roots of p(r), where p(r) is the characteristic polynomial for L.
• Find the roots of q(r), where q(r) is the characteristic polynomial for a constant coefficient, linear differential operator M such that g(x) is a solution of M(y) = 0.
• Since ψp(x) is a solution of ML(y) = 0, write down ψp(x) as a sum of terms determined by combining the roots (counting multiplicity) of q(r) and p(r).
• Simplify the form of ψp(x) by eliminating the terms that are solutions of the homogeneous equation L(y) = 0.
• Evaluate the coefficients by plugging your expression for ψp(x) into L(y) = g(x).

The general solution of L(y) = g(x) is then ψp(x) plus the general solution of L(y) = 0.

We should take a moment to discuss what to do when g(x) is a sum of terms, say g(x) = g1(x) + g2(x). If g1(x) and g2(x) correspond to different roots, then finding a particular solution for L(y) = g1(x) and a particular solution for L(y) = g2(x), and then adding them together, is a reasonable alternative to finding a particular solution for L(y) = g(x) directly. However, if g1(x) and g2(x) correspond to the same root, breaking the problem into two just makes for extra work. Thus, if we are finding a particular solution of y′′ + y = 2e^x + 3e^(5x), we can find a particular solution of y′′ + y = 2e^x and a particular solution of y′′ + y = 3e^(5x), and then add; however, for finding a particular solution of y′′ + y = 2e^x + 5x²e^x, it is best not to deal with 2e^x and 5x²e^x separately.

Looking at the examples we have gone through and thinking about how the operators L and M interact, we see that some definite patterns appear. If the root corresponding to g(x) is not a root of p(r), then the form of ψp(x) is similar to g(x). The following abbreviated table shows some of the many possibilities.

g(x)           Form of ψp(x)
e^(rx)         K1 e^(rx)
cos µx         K1 cos µx + K2 sin µx
sin µx         K1 cos µx + K2 sin µx
x^j e^(rx)     K1 x^j e^(rx) + K2 x^(j−1) e^(rx) + · · · + K(j+1) e^(rx)

However, if the root corresponding to g(x) is a root of p(r) of multiplicity s, then the initial choice of ψp(x) must be multiplied by x^s. When thought of in terms of patterns such as those shown in the table, the annihilator method is usually called the method of undetermined coefficients.

Example. Find the form of a particular solution ψp of

y^(5) − y′′′ = 5e^(3x) + x cos x + 4e^x − e^x sin 3x + 6x.

Even though we are not being asked to find a general solution, we will still need to know the roots involved in the corresponding homogeneous equation. They are r = 0, 0, 0, 1, −1 since p(r) = r^5 − r^3. Now, we note that the first term of the right hand side arises from r = 3, which is not in the list of “homogeneous roots”; the second term arises from r = ±i, ±i, which are not in the list; the third term arises from r = 1, which appears once in the list; the fourth term arises from r = 1 ± 3i, which are not on the list; the fifth term arises from r = 0, 0 (since 6x = 6x e^(0x)), which appears three times in the list. Thus,

ψp(x) = K1 e^(3x) + K2 x cos x + K3 x sin x + K4 cos x + K5 sin x + K6 x e^x
        + K7 e^x cos 3x + K8 e^x sin 3x + K9 x^4 + K10 x^3.

We will not attempt to evaluate the constants.


Exercises

1. Find the general solution of y′′ + 9y = 4e^(2x).

2. Find the general solution of 4y′′ + y = 3e^(−2x).

3. Find the general solution of y′′ − 2y′ + y = cos x + e^x.

4. Find the general solution of y′′ − 9y = sin 2x + 4e^(3x).

5. Solve y′′ − 4y = 10e^(2x), y(0) = 6, y′(0) = 10.

6. Solve y′′ − 3y′ − 4y = −4x − 11, y(0) = −1, y′(0) = 9.

7. Write the form (but do not evaluate the coefficients) of a particular solution of y′′ − 4y′ + 4y = e^(2x) + 3e^(−2x) − 2e^(2x) cos x + 4 sin 2x + 3x² + 5.

8. Write the form (but do not evaluate the coefficients) of a particular solution of y′′′ − 2y′′ = 2e^x + 3x cos 2x + e^(2x) sin x + 5x e^(2x) + 8.

9. Find the general solution of y′ + 3y = e^(4x)
(a) using the annihilator method.
(b) using an integrating factor.

10. An undamped mass-spring system satisfies my′′ + ky = 0, where y is the distance the spring is stretched from its natural length and is a function of time t. (Here, m > 0 is the mass of the object and k > 0 is the spring constant.) If an external force A cos(√(k/m) t) is applied to the mass, the system will satisfy my′′ + ky = A cos(√(k/m) t). Solve this nonhomogeneous equation, and state what will happen to the spring in the long run. The phenomenon you are seeing here is called resonance.

11. Show that the set of solutions of a nonhomogeneous linear differential equation a0(x)y^(n) + a1(x)y^(n−1) + · · · + an(x)y = g(x) on an interval I does not form a vector space.

12. Let L and M be differential operators. Suppose that ψ is a solution of the differential equation L(y) = g and g is a solution of the differential equation M(y) = g∗. Prove that ψ is a solution of the differential equation ML(y) = g∗.

8. The Method of Variation of Parameters

In order to use the annihilator method to find a particular solution of a linear nonhomogeneous equation L(y) = g, it is necessary that the equation have constant coefficients and that g be the solution of a constant coefficient, linear homogeneous equation. We now want a method for finding ψp that applies to virtually all nonhomogeneous linear equations. It is necessary, however, to state our result for equations of order n whose coefficient a0(x) of y^(n) is identically one. Of course, dividing by a0(x) (wherever a0(x) ≠ 0) converts a more general equation to the desired form.

Theorem 2.17. Let a1, . . . , an, and g be continuous functions on an interval I. Then there is a particular solution of

(13) y^(n) + a1(x) y^(n−1) + · · · + an(x) y = g(x)


on I of the form

ψp(x) = u1(x)φ1(x) + · · · + un(x)φn(x),

where φ1, . . . , φn are n linearly independent solutions of the corresponding homogeneous linear differential equation

y^(n) + a1(x) y^(n−1) + · · · + an(x) y = 0,

and where the first derivatives of u1, . . . , un satisfy the system

(14)
φ1(x) u1′(x) + · · · + φn(x) un′(x) = 0
φ1′(x) u1′(x) + · · · + φn′(x) un′(x) = 0
...
φ1^(n−2)(x) u1′(x) + · · · + φn^(n−2)(x) un′(x) = 0
φ1^(n−1)(x) u1′(x) + · · · + φn^(n−1)(x) un′(x) = g(x)

for all x ∈ I.

Proof. We will just give the proof for second order equations. The proof for equations of order n is essentially the same, but involves rather messy notation. We are assuming that φ1, φ2 are linearly independent solutions of y′′ + ay′ + by = 0 on an interval I. In other words, L(φ1) = 0 and L(φ2) = 0, where L = D² + aD + bI. The goal is to find functions u1, u2 such that L(u1φ1 + u2φ2) = g. Compute

L(u1φ1 + u2φ2) = (u1φ1 + u2φ2)′′ + a(u1φ1 + u2φ2)′ + b(u1φ1 + u2φ2)
  = u1′′φ1 + 2u1′φ1′ + u1φ1′′ + u2′′φ2 + 2u2′φ2′ + u2φ2′′
    + au1′φ1 + au1φ1′ + au2′φ2 + au2φ2′ + bu1φ1 + bu2φ2
  = (u1′′φ1 + u2′′φ2) + 2(u1′φ1′ + u2′φ2′) + a(u1′φ1 + u2′φ2)
    + u1(φ1′′ + aφ1′ + bφ1) + u2(φ2′′ + aφ2′ + bφ2).

But φ1′′ + aφ1′ + bφ1 = L(φ1) = 0 and φ2′′ + aφ2′ + bφ2 = L(φ2) = 0. Thus, we have

L(u1φ1 + u2φ2) = (u1′′φ1 + u2′′φ2) + 2(u1′φ1′ + u2′φ2′) + a(u1′φ1 + u2′φ2).

Note that if u1′φ1 + u2′φ2 = 0, then differentiation gives

(u1′′φ1 + u2′′φ2) + (u1′φ1′ + u2′φ2′) = 0,

and so

L(u1φ1 + u2φ2) = u1′φ1′ + u2′φ2′.

Then if, additionally, u1′φ1′ + u2′φ2′ = g, we have L(u1φ1 + u2φ2) = g. Therefore, if u1′(x), u2′(x) satisfy

φ1(x)u1′(x) + φ2(x)u2′(x) = 0
φ1′(x)u1′(x) + φ2′(x)u2′(x) = g(x)

for all x ∈ I, then u1φ1 + u2φ2 is a solution of L(y) = g on I.

We should note that the system (14) always has a solution for u1′(x), . . . , un′(x) for all x ∈ I since the determinant of coefficients is W(φ1, . . . , φn)(x), which is never zero due to the linear independence of the solutions φ1, . . . , φn. Thus, the above shows that the nonhomogeneous equation (13) must always have a solution ψp.

Let's summarize the method of variation of parameters. We find a particular solution ψp(x) for the nonhomogeneous, linear equation of order n as follows:


• Find n linearly independent solutions φ1(x), . . . , φn(x) of the corresponding homogeneous equation.

• Solve the system (14) for u′1(x), . . . , u′n(x).
• Integrate to get u1(x), . . . , un(x).
• Form ψp(x) = u1(x)φ1(x) + · · · + un(x)φn(x).

The general solution of the nonhomogeneous equation is then

ψ(x) = ψp(x) + C1φ1(x) + · · ·+ Cnφn(x).

There are a number of ways to solve the system (14). One possible method is Cramer's rule. An advantage of Cramer's rule is that it gives a formula for u′1, . . . , u′n; a disadvantage is that it is a very slow, computationally inefficient method for all but the smallest values of n. In the case of n = 2, we note that Cramer's rule says that the (unique) solution of

φ1(x)u′1(x) + φ2(x)u′2(x) = 0

φ′1(x)u′1(x) + φ′2(x)u′2(x) = g(x)

is

u′1(x) = −φ2(x)g(x) / W(φ1, φ2)(x),   u′2(x) = φ1(x)g(x) / W(φ1, φ2)(x).

(In Cramer's rule form, the common denominator is the Wronskian determinant W(φ1, φ2)(x) = φ1(x)φ′2(x) − φ2(x)φ′1(x), and each numerator is the 2 × 2 determinant obtained by replacing the corresponding column of the coefficient matrix with the right-hand column (0, g(x)).)

Again, we should emphasize that Cramer’s rule is just one way of solving (14).
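To make the n = 2 recipe concrete, here is a small numerical sketch in Python (my own addition, not part of the text; the function name is invented). It evaluates u′1 and u′2 from the Cramer's-rule formulas; the data used below come from the example that follows, where the closed forms are u′1 = −xe^(5x) and u′2 = e^(5x):

```python
import math

def u_primes(phi1, dphi1, phi2, dphi2, g, x):
    """Cramer's-rule solution of system (14) when n = 2:
    u1' = -phi2*g/W, u2' = phi1*g/W, where W is the Wronskian."""
    W = phi1(x) * dphi2(x) - phi2(x) * dphi1(x)
    return -phi2(x) * g(x) / W, phi1(x) * g(x) / W

# Data for y'' + 4y' + 4y = e^{3x}: phi1 = e^{-2x}, phi2 = x e^{-2x}
phi1  = lambda x: math.exp(-2 * x)
dphi1 = lambda x: -2 * math.exp(-2 * x)
phi2  = lambda x: x * math.exp(-2 * x)
dphi2 = lambda x: (1 - 2 * x) * math.exp(-2 * x)
g     = lambda x: math.exp(3 * x)

x = 0.7
u1p, u2p = u_primes(phi1, dphi1, phi2, dphi2, g, x)
assert abs(u1p - (-x * math.exp(5 * x))) < 1e-9   # closed form u1' = -x e^{5x}
assert abs(u2p - math.exp(5 * x)) < 1e-9          # closed form u2' = e^{5x}
```

Any quadrature routine can then integrate u′1, u′2 to produce u1, u2 when no antiderivative is available in closed form.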

Example. Solve y′′ + 4y′ + 4y = e^(3x). We first solve the corresponding homogeneous equation y′′ + 4y′ + 4y = 0. Its characteristic polynomial is p(r) = r² + 4r + 4, whose roots are easily seen to be r = −2, −2. Two linearly independent solutions of the homogeneous equation are then φ1(x) = e^(−2x), φ2(x) = xe^(−2x).

Let's use the method of variation of parameters to find a particular solution ψp. For this, we need to solve the system

e^(−2x)u′1(x) + xe^(−2x)u′2(x) = 0
−2e^(−2x)u′1(x) + (e^(−2x) − 2xe^(−2x))u′2(x) = e^(3x).

By Cramer's rule,

u′1(x) = −xe^(−2x) · e^(3x) / e^(−4x) = −xe^(5x),

and

u′2(x) = e^(−2x) · e^(3x) / e^(−4x) = e^(5x),

where the common denominator e^(−4x) is the Wronskian W(φ1, φ2)(x) = e^(−2x)(e^(−2x) − 2xe^(−2x)) − xe^(−2x)(−2e^(−2x)).


Then, u1(x) = ∫*(−xe^(5x)) dx = −(1/5)xe^(5x) + (1/25)e^(5x) (using integration by parts), and u2(x) = ∫* e^(5x) dx = (1/5)e^(5x). Thus,

ψp(x) = (−(1/5)xe^(5x) + (1/25)e^(5x))e^(−2x) + (1/5)e^(5x) · xe^(−2x) = (1/25)e^(3x).

The general solution is, therefore, ψ(x) = (1/25)e^(3x) + C1e^(−2x) + C2xe^(−2x).
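A quick numerical sanity check (my addition, not from the text): ψp(x) = (1/25)e^(3x) should make the residual y′′ + 4y′ + 4y − e^(3x) vanish, and central-difference approximations to the derivatives confirm this up to truncation error:

```python
import math

def residual(f, x, h=1e-5):
    """Central-difference approximation to f'' + 4f' + 4f - e^{3x}."""
    fp  = (f(x + h) - f(x - h)) / (2 * h)
    fpp = (f(x + h) - 2 * f(x) + f(x - h)) / h**2
    return fpp + 4 * fp + 4 * f(x) - math.exp(3 * x)

psi_p = lambda x: math.exp(3 * x) / 25

for x in (-1.0, 0.0, 0.8):
    assert abs(residual(psi_p, x)) < 1e-4   # zero up to discretization error
```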

We should note that in the last example, ψp could have been found (much more easily) using the annihilator method. In the next example, however, there is no such choice; the annihilator method won't be possible since the function g(x) is not the solution of a constant coefficient, linear homogeneous equation.

Example. Solve 2y′′ + 2y = tan x for −π/2 < x < π/2. We first solve the corresponding homogeneous equation 2y′′ + 2y = 0. Its characteristic polynomial is p(r) = 2r² + 2, whose roots are easily seen to be r = ±i. Two linearly independent solutions of the homogeneous equation are then φ1(x) = cos x, φ2(x) = sin x.

As we prepare to apply the method of variation of parameters to find ψp, notice that we will need to use g(x) = (1/2) tan x, since that is the right side of the differential equation once it is put into standard form by dividing by the leading coefficient, 2. Thus we will solve the system

(cos x)u′1(x) + (sin x)u′2(x) = 0
(− sin x)u′1(x) + (cos x)u′2(x) = (1/2) tan x.

One way of solving this is to multiply through the first equation by sin x and the second equation by cos x and then add the resulting expressions. This gives us (sin²x + cos²x)u′2(x) = (1/2) cos x tan x, and then u′2(x) = (1/2) sin x. Plugging this into the first equation of the system gives (cos x)u′1(x) + (1/2) sin²x = 0 and so,

u′1(x) = −sin²x / (2 cos x).

Then,

u1(x) = −(1/2) ∫* (1 − cos²x)/cos x dx
      = (1/2) ∫* (−sec x + cos x) dx
      = −(1/2) ln |sec x + tan x| + (1/2) sin x.

Also, u′2(x) = (1/2) sin x implies that u2(x) = −(1/2) cos x. Thus,

ψp(x) = (−(1/2) ln |sec x + tan x| + (1/2) sin x) cos x + (−(1/2) cos x) sin x
       = −(1/2) cos x ln |sec x + tan x|.

Therefore, ψ(x) = −(1/2) cos x ln |sec x + tan x| + C1 cos x + C2 sin x.
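The same kind of numerical check (again my addition) works here: the residual 2ψp′′ + 2ψp − tan x should vanish throughout (−π/2, π/2):

```python
import math

# particular solution found above: -(1/2) cos x ln|sec x + tan x|
psi_p = lambda x: -0.5 * math.cos(x) * math.log(abs(1 / math.cos(x) + math.tan(x)))

def residual(f, x, h=1e-5):
    """Central-difference approximation to 2f'' + 2f - tan x."""
    fpp = (f(x + h) - 2 * f(x) + f(x - h)) / h**2
    return 2 * fpp + 2 * f(x) - math.tan(x)

for x in (-1.0, 0.3, 1.2):
    assert abs(residual(psi_p, x)) < 1e-4
```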

Notice that, in the above examples, if we had included constants of integration when finding u1, u2, we would have been led directly to the general solution.


Exercises

1. Do exercise #1 from the last section, using the method of variation of parameters to find a particular solution. (Warning: non-trivial integrations are involved.)

2. Do exercise #2 from the last section, using the method of variation of parameters to find a particular solution. (Warning: non-trivial integrations are involved.)

3. Find the general solution of y′′ − 6y′ + 9y = x^(−1)e^(3x) (x > 0).

4. Find the general solution of 2y′′ − 8y′ + 8y = x^(−2)e^(2x) (x > 0).

5. Find the general solution of 3y′′ + 3y = sec x (−π/2 < x < π/2).

6. Find the general solution of y′′ + y = sec²x (−π/2 < x < π/2).

7. Do exercise #5 from the last section, using the method of variation of parameters to find a particular solution.

8. Do exercise #6 from the last section, using the method of variation of parameters to find a particular solution.

9. Find the general solution of y′′′ + y′′ + y′ + y = 2, using the method of variation of parameters to find a particular solution.

10. Find the general solution of y′′′ − y′′ = 3e^x, using the method of variation of parameters to find a particular solution.

11. Given that φ1(x) = x and φ2(x) = 1/x are solutions of y′′ + (1/x)y′ − (1/x²)y = 0, find the general solution of y′′ + (1/x)y′ − (1/x²)y = 4x² (x > 0).

12. Given that φ1(x) = x + 1 and φ2(x) = e^x are solutions of y′′ − (1/x + 1)y′ + (1/x)y = 0, find the general solution of y′′ − (1/x + 1)y′ + (1/x)y = xe^x (x > 0).


CHAPTER 3

Series Solutions of Linear Equations

1. Preliminary Remarks

Suppose a drum is made with a membrane stretched across a circular shell. The vibrations of the membrane will then satisfy a partial differential equation in the polar variables r and θ and in the time variable t, called the wave equation. The standard procedure for solving this equation involves looking for solutions of the form u(r, θ, t) = R(r)Θ(θ)T(t). It is shown that the function R must satisfy an ordinary differential equation

r²R′′ + rR′ + (λr² − γ)R = 0,

where λ and γ are appropriately chosen constants. Note that this is a linear ordinary differential equation, but unlike the linear equations we have been able to solve so far, it has variable coefficients. In this chapter, we will find out how to solve certain linear equations with variable coefficients.

2. Power Series Solutions

Consider the nth-order linear homogeneous differential equation

(1) y^(n) + a1(x)y^(n−1) + · · · + an(x)y = 0,

where a1, . . . , an are continuous on some interval I. Note that, once again, we have restricted our attention to the special case of a0(x), the coefficient of y^(n), being identically 1. So far, we have a method for solving the equation only if all the coefficients are constant. Our goal now is to see how to solve (1) when all the coefficients are expressible as convergent power series about some point x0 ∈ I. It will turn out that in this situation, all the solutions of (1) will also be expressible as convergent power series about x0. To prepare for this, let's remind ourselves of a few of the basic facts about power series.

Definition 3.1. Let x0, c0, c1, c2, . . . be numbers. The series

Σ_{k=0}^∞ ck(x − x0)^k = c0 + c1(x − x0) + c2(x − x0)² + · · ·

is said to be a power series about x0.

Although the natural setting for power series is the field of complex numbers, we will restrict our attention to real numbers. Power series will be of use to us only at points x where the series converge. Some of the main facts about convergence of power series are contained in the following two theorems, which we state but do not prove.


Theorem 3.1. For any power series

Σ_{k=0}^∞ ck(x − x0)^k,

one and only one of the following statements is true:

(a) The series converges absolutely for all x.
(b) There exists a positive number R such that the series converges absolutely for all x with |x − x0| < R and diverges for all x with |x − x0| > R.
(c) The series converges only for x = x0.

Condition (a) is abbreviated by saying R = ∞, while (c) is abbreviated by saying R = 0. In all cases, R is called the radius of convergence.

To find out about convergence of a series, there are a number of standard convergence tests from which to choose. Of these, the absolute ratio test is often the most convenient when working with a power series. Recall that the absolute ratio test says that for a series Σ_k ak of nonzero numbers such that

ρ = lim_{k→∞} |a_{k+1}| / |ak|

exists, the series converges absolutely if ρ < 1, while the series diverges if ρ > 1. (If ρ = 1, there is no conclusion.)

Example. Find the radius of convergence of Σ_k k·3^k(x − 1)^k. Obviously, the series converges at x = 1. For each fixed x ≠ 1,

|a_{k+1}| / |ak| = ((k + 1)3^(k+1)|x − 1|^(k+1)) / (k·3^k|x − 1|^k) = (3(k + 1)/k)|x − 1| → 3|x − 1| = ρ

as k → ∞. Thus, the series converges absolutely for each x with 3|x − 1| < 1 and diverges for each x with 3|x − 1| > 1. This says the series converges absolutely for |x − 1| < 1/3, i.e. 2/3 < x < 4/3, and diverges for |x − 1| > 1/3. (We will not worry about whether there is convergence or divergence at the endpoints x = 2/3, 4/3.) Therefore, R = 1/3.
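As a numerical illustration (my addition, not from the text): the ratio of successive terms really does approach ρ = 3|x − 1|, and inside the interval of convergence the partial sums settle to a finite value (for this series, Σ_{k≥1} k·r^k = r/(1 − r)² with r = 3(x − 1)):

```python
x = 1.3                          # |x - 1| = 0.3 < 1/3, so the series converges
rho = 3 * abs(x - 1)             # limiting ratio, here 0.9 < 1

# exact ratio of consecutive terms: |a_{k+1}/a_k| = 3(k+1)/k * |x-1|
ratio = lambda k: 3 * (k + 1) / k * abs(x - 1)
assert abs(ratio(10_000) - rho) < 1e-3

# partial sum agrees with the closed form r/(1-r)^2 = 0.9/0.01 = 90
partial = sum(k * (3 * (x - 1))**k for k in range(1, 600))
assert abs(partial - 90) < 1e-6
```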

Power series behave very naturally under differentiation. The next theorem says that we can differentiate a power series termwise.

Theorem 3.2. Suppose that

f(x) = Σ_{k=0}^∞ ck(x − x0)^k = c0 + c1(x − x0) + c2(x − x0)² + · · · ,

with the power series having radius of convergence R (0 < R ≤ ∞). Then

f′(x) = Σ_{k=1}^∞ k·ck(x − x0)^(k−1) = c1 + 2c2(x − x0) + 3c3(x − x0)² + · · ·

for all x with |x − x0| < R.

This theorem can be applied repeatedly to get

f′′(x) = Σ_{k=2}^∞ k(k − 1)ck(x − x0)^(k−2) = 2c2 + 3·2c3(x − x0) + 4·3c4(x − x0)² + · · · ,

f′′′(x) = Σ_{k=3}^∞ k(k − 1)(k − 2)ck(x − x0)^(k−3) = 3·2c3 + 4·3·2c4(x − x0) + · · · ,

and so on.

Now, let's introduce some terminology related to power series.

Definition 3.2. Let f be a function defined on an open interval I containing a point x0. If f can be expressed as a convergent (0 < R ≤ ∞) power series about x0, then f is said to be analytic at x0.

Said another way, f is analytic at x0 if its Taylor series based at x0 is convergent and has f as its sum on an interval about x0.

Example. f(x) = e^x is analytic at 0 since

e^x = Σ_{k=0}^∞ x^k/k!

for all x. Moreover, e^x is analytic at any x0 since

e^x = e^(x0)·e^(x−x0) = Σ_{k=0}^∞ (e^(x0)/k!)(x − x0)^k.

Example. f(x) = x² + 3x − 5 is analytic at 0 since we can write

f(x) = −5 + 3x + x² + 0x³ + 0x⁴ + · · · ,

which shows f(x) as a convergent series (because of having only finitely many nonzero terms). Similarly, we can see that f(x) is analytic at 1 since f(x) = (x − 1)² + 5(x − 1) − 1.

It is clear that the process used in the last example can be generalized to show that every polynomial is analytic at every x0. Furthermore, it can be shown that every rational function f(x) = P(x)/Q(x) is analytic at every x0 for which Q(x0) ≠ 0. Recall that a rational function is, by definition, the quotient of two polynomials.

Example. The last claim asserts that f(x) = 1/(x² + 1) is analytic everywhere. Let's check directly that this rational function is analytic at 0. Using the formula for the sum of a geometric series, we get

f(x) = 1/(1 − (−x²)) = Σ_{k=0}^∞ (−x²)^k = Σ_{k=0}^∞ (−1)^k x^(2k),

if and only if |−x²| < 1, i.e. if and only if |x| < 1.

We will soon see that the solutions of a linear homogeneous equation with coefficients analytic at a point x0 must also be analytic at x0. Before dealing with this in general, we will first see how it applies to a specific example.

Example. Solve y′′ + x²y = 0. We first note that the coefficients a1(x) = 0 and a2(x) = x² are analytic everywhere. It will be most convenient, however, to work with x0 = 0 since the coefficient x² is of the form (x − x0)^k for x0 = 0. Assume there is a solution of the form

(2) φ(x) = Σ_{k=0}^∞ ck x^k = c0 + c1x + c2x² + · · · .


Then,

φ′(x) = Σ_{k=1}^∞ k·ck x^(k−1) and φ′′(x) = Σ_{k=2}^∞ k(k − 1)ck x^(k−2).

Plugging into the differential equation gives us

Σ_{k=2}^∞ k(k − 1)ck x^(k−2) + x² Σ_{k=0}^∞ ck x^k = 0,

and so,

Σ_{k=2}^∞ k(k − 1)ck x^(k−2) + Σ_{k=0}^∞ ck x^(k+2) = 0.

We would like to combine the two series together into a single power series. To do this, we will need the power of x to be of the same form throughout. Let's choose to write all powers in the form x^(k+2). (We could just as easily choose x^(k−2); also, x^k is a reasonable choice, but would involve a little more work.) We will accomplish our goal in two steps. First, set j + 2 = k − 2 and so j = k − 4. We get

Σ_{j=−2}^∞ (j + 4)(j + 3)c_{j+4} x^(j+2) + Σ_{k=0}^∞ ck x^(k+2) = 0.

Next, set k = j. This gives us

Σ_{k=−2}^∞ (k + 4)(k + 3)c_{k+4} x^(k+2) + Σ_{k=0}^∞ ck x^(k+2) = 0.

We are now ready to combine the series together, but we note that the first series has terms (corresponding to k = −2, −1) that have no equivalents in the second series and which will be handled separately. We get

2c2 + 6c3x + Σ_{k=0}^∞ [(k + 4)(k + 3)c_{k+4} + ck] x^(k+2) = 0.

This last condition will hold throughout an interval if we can choose the ck's so that the coefficient of each power of x is zero. So, set

2c2 = 0, 6c3 = 0, and (k + 4)(k + 3)c_{k+4} + ck = 0,

k = 0, 1, . . . . The first two equations give c2 = 0 and c3 = 0. We solve the third for the coefficient with the highest subscript to get

(3) c_{k+4} = −ck / ((k + 4)(k + 3)), k = 0, 1, 2, . . . .


Then,

k = 0 ⇒ c4 = −c0/(4·3)
k = 1 ⇒ c5 = −c1/(5·4)
k = 2 ⇒ c6 = −c2/(6·5) = 0
k = 3 ⇒ c7 = −c3/(7·6) = 0
k = 4 ⇒ c8 = −c4/(8·7) = c0/(8·7·4·3)
k = 5 ⇒ c9 = −c5/(9·8) = c1/(9·8·5·4)

and so on. We think of the above formulas as giving ck, k > 1, in terms of c0 and c1, while not telling us anything about the values of c0 and c1. With c0 and c1 being arbitrary, we are free to choose them any way we like. Let's choose c0 = 1 and c1 = 0. Then c2 = 0, c3 = 0, c4 = −1/(4·3), c5 = 0, c6 = 0, c7 = 0, c8 = 1/(8·7·4·3), c9 = 0, and so on. This gives us the function

φ1(x) = 1 − (1/(4·3))x⁴ + (1/(8·7·4·3))x⁸ − + · · · .

On the other hand, if we choose c0 = 0 and c1 = 1, we get

φ2(x) = x − (1/(5·4))x⁵ + (1/(9·8·5·4))x⁹ − + · · · .

Actually, we have just given the first few terms of φ1(x) and φ2(x). To write out φ1(x) and φ2(x) in full, we will use sigma notation. For this, we need to see a pattern for the coefficients. Although writing the pattern for each function will not be particularly simple, we have computed enough coefficients to see what the patterns are. We can write

φ1(x) = 1 + Σ_{m=1}^∞ [(−1)^m / (4m·(4m − 1)·(4m − 4)(4m − 5) · · · 4·3)] x^(4m),

φ2(x) = x + Σ_{m=1}^∞ [(−1)^m / ((4m + 1)·4m·(4m − 3)(4m − 4) · · · 5·4)] x^(4m+1).

(In each case, the first term was written separately simply because it did not fit the general pattern for this example.)

We still have a few things to do before we can say we have solved the differential equation. First, we need to show that the φ1 and φ2 we have found actually exist. In other words, we need to show that the series converge on some interval. Next, we need to check that φ1 and φ2 are, in fact, solutions. Finally, we need to check that φ1 and φ2 form a basis for the solution space. To show convergence, we will use the absolute ratio test by looking at

lim_{k→∞} |c_{k+4} x^(k+4)| / |ck x^k|,


where k = 4m for φ1 and k = 4m + 1 for φ2. The most convenient way of doing this is by using (3). We get for each x ≠ 0,

lim_{k→∞} |c_{k+4} x^(k+4)| / |ck x^k| = lim_{k→∞} (1/((k + 4)(k + 3)))|x|⁴ = 0 < 1.

Thus, the series for both φ1 and φ2 converge for all x. Next, since power series can be differentiated termwise, all the above steps can be reversed, showing that φ1 and φ2 do satisfy the differential equation on (−∞, ∞). Finally, we will show that the general solution on (−∞, ∞) is φ = C1φ1 + C2φ2 by checking that φ1 and φ2 are linearly independent. This is surprisingly easy to do using the Wronskian. We have

W(φ1, φ2)(0) = φ1(0)φ′2(0) − φ2(0)φ′1(0) = 1·1 − 0·0 = 1 ≠ 0,

which shows that φ1 and φ2 are linearly independent on (−∞, ∞). Notice how it was our choices for c0 and c1 when defining φ1 and φ2 that gave us linearly independent solutions.
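The whole computation is easy to mechanize. The following sketch (my own Python, not from the text) builds coefficients from the recurrence relation (3), evaluates the truncated series and its derivatives termwise, and confirms numerically that φ1 and φ2 satisfy y′′ + x²y = 0 and that W(φ1, φ2)(0) = 1:

```python
def coeffs(c0, c1, N):
    """Coefficients c_0..c_N from (3): c_{k+4} = -c_k/((k+4)(k+3)); c2 = c3 = 0."""
    c = [0.0] * (N + 1)
    c[0], c[1] = c0, c1
    for k in range(N - 3):
        c[k + 4] = -c[k] / ((k + 4) * (k + 3))
    return c

def series(c, x, deriv=0):
    """Evaluate the deriv-th derivative of sum c_k x^k, term by term."""
    total = 0.0
    for k in range(deriv, len(c)):
        fac = 1.0
        for j in range(deriv):
            fac *= k - j          # k(k-1)...(k-deriv+1)
        total += fac * c[k] * x ** (k - deriv)
    return total

c_phi1 = coeffs(1.0, 0.0, 40)     # phi1: c0 = 1, c1 = 0
c_phi2 = coeffs(0.0, 1.0, 40)     # phi2: c0 = 0, c1 = 1

x = 0.8
assert abs(series(c_phi1, x, 2) + x**2 * series(c_phi1, x)) < 1e-12
assert abs(series(c_phi2, x, 2) + x**2 * series(c_phi2, x)) < 1e-12
# Wronskian at 0: phi1(0)phi2'(0) - phi2(0)phi1'(0) = 1
W0 = series(c_phi1, 0) * series(c_phi2, 0, 1) - series(c_phi2, 0) * series(c_phi1, 0, 1)
assert W0 == 1.0
```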

In the above, the process of changing from the index k to the index j and then back to the index k could be compressed into the single step of replacing k − 2 by k + 2. When this is done, care must be taken to see that since the “old” k started at the value 2, the “new” k must start at the value −2.

Equation (3) is referred to as a recurrence relation. To show that analytic coefficients lead in general to analytic solutions, we will analyze the corresponding initial value problem.

Theorem 3.3. Let x0 be a real number, and suppose that a1, . . . , an are analytic at x0. Then for any constants β1, β2, . . . , βn, the initial value problem

y^(n) + a1(x)y^(n−1) + · · · + an(x)y = 0
y(x0) = β1, y′(x0) = β2, . . . , y^(n−1)(x0) = βn

has a solution φ, analytic at x0. Furthermore, if the power series expansions for a1, . . . , an all converge on I = {x : |x − x0| < R0}, then the power series expansion for φ converges on I.

Proof. To try to keep the notation from becoming overwhelming, we will just consider the special case of n = 2 and x0 = 0. Even this case is quite daunting, and so we will make the further simplifying assumption that the coefficient of y′ is identically zero. Still, the proof we give will contain all the basic ideas of the general proof. Before we begin, let us note that we multiply power series in much the same way as we multiply polynomials. For any two power series, we have

(a0 + a1x + a2x² + · · · )(b0 + b1x + b2x² + · · · ) = a0b0 + (a1b0 + a0b1)x + (a2b0 + a1b1 + a0b2)x² + · · · .

In sigma notation, this says

(Σ_{i=0}^∞ ai x^i)(Σ_{j=0}^∞ bj x^j) = Σ_{k=0}^∞ (Σ_{j=0}^k a_{k−j} bj) x^k.


Put L(y) = y′′ + by, where

b(x) = Σ_{k=0}^∞ bk x^k

converges on the interval I = {x : |x| < R0}. We will consider the initial value problem L(y) = 0, y(0) = β, y′(0) = γ. We want

φ(x) = Σ_{k=0}^∞ ck x^k

to be a solution, convergent on I. In order that φ satisfy the initial conditions, it is necessary that c0 = β and c1 = γ. We will plug φ into L(y) = 0 to see how we must choose the other ck's. We have

Σ_{k=2}^∞ k(k − 1)ck x^(k−2) + (Σ_{i=0}^∞ bi x^i)(Σ_{j=0}^∞ cj x^j) = 0.

Then,

Σ_{k=2}^∞ k(k − 1)ck x^(k−2) + Σ_{k=0}^∞ (Σ_{j=0}^k b_{k−j} cj) x^k = 0,

and after replacing k − 2 by k in the first series,

Σ_{k=0}^∞ (k + 2)(k + 1)c_{k+2} x^k + Σ_{k=0}^∞ (Σ_{j=0}^k b_{k−j} cj) x^k = 0.

Collecting terms, we have

Σ_{k=0}^∞ [(k + 2)(k + 1)c_{k+2} + Σ_{j=0}^k b_{k−j} cj] x^k = 0.

Thus, we want

(4) (k + 2)(k + 1)c_{k+2} = −Σ_{j=0}^k b_{k−j} cj

for k = 0, 1, . . . . Since c0 and c1 were given values earlier, we can now use the recurrence relation (4) to define c2, c3, . . . .

We need to show that the series

Σ_{k=0}^∞ ck x^k,

with the ck's defined as above, converges on I. Once this is accomplished, we will be done since termwise differentiation will imply that φ is the solution. Pick any number r with 0 < r < R0. Since Σ bk r^k converges, bk r^k → 0 as k → ∞. It follows


that there is a constant M such that |bk| r^k < M for all k. By (4),

(k + 2)(k + 1)|c_{k+2}| ≤ Σ_{j=0}^k |b_{k−j} cj| ≤ Σ_{j=0}^k (M/r^(k−j))|cj| ≤ Σ_{j=0}^{k+1} (M/r^(k−j))|cj|.

Now put d0 = |c0|, d1 = |c1|, and define d_{k+2} for k = 0, 1, . . . by

(k + 2)(k + 1)d_{k+2} = Σ_{j=0}^{k+1} (M/r^(k−j)) dj.

Then replacing k + 1 by k gives

(k + 1)k d_{k+1} = Σ_{j=0}^k (M/r^(k−1−j)) dj,

and so,

r(k + 2)(k + 1)d_{k+2} − (k + 1)k d_{k+1} = M r² d_{k+1}.

This implies that

r(k + 2)(k + 1)d_{k+2} = ((k + 1)k + M r²) d_{k+1}.

Thus,

lim_{k→∞} |d_{k+2} x^(k+2)| / |d_{k+1} x^(k+1)| = lim_{k→∞} (d_{k+2}/d_{k+1})|x| = lim_{k→∞} (((k + 1)k + M r²) / (r(k + 2)(k + 1)))|x| = |x|/r < 1,

if |x| < r. We conclude that Σ dk x^k converges absolutely at each x with |x| < r by the absolute ratio test, and then that Σ ck x^k converges absolutely at each x with |x| < r by the comparison test, since |ck| ≤ dk for all k. The convergence holds for all x ∈ I since r was an arbitrary positive number less than R0.

Since every solution of a differential equation is a solution of an appropriately defined initial value problem, we immediately get the following.

Corollary 3.1. Let x0 be a real number, and suppose that a1, . . . , an are analytic at x0. Then every solution of

y^(n) + a1(x)y^(n−1) + · · · + an(x)y = 0

is analytic at x0. Furthermore, if the power series expansions for a1, . . . , an all converge on I = {x : |x − x0| < R0}, then the power series expansion for each solution converges on I.

Our next example will be the Legendre equation, actually a collection of equations. It arises in the study of, among other things, certain cases of steady state heat distribution and electrostatic potential.


Example. For any constant α, consider the Legendre equation

(5) (1 − x²)y′′ − 2xy′ + α(α + 1)y = 0.

If we rewrite the equation as

(6) y′′ − (2x/(1 − x²))y′ + (α(α + 1)/(1 − x²))y = 0,

we see that the coefficients are analytic at 0 (as well as at any number ≠ ±1). Thus, we will look for solutions of the form

φ(x) = Σ_{k=0}^∞ ck x^k.

Since the power series expansion about 0 of 1/(1 − x²) has radius of convergence 1, the preceding results assure us that each solution φ will have a radius of convergence that is 1 or bigger.

We have

φ′(x) = Σ_{k=1}^∞ k·ck x^(k−1) and φ′′(x) = Σ_{k=2}^∞ k(k − 1)ck x^(k−2).

Notice how plugging into the Legendre equation in its original form (5) will be far easier than using (6). We get

Σ_{k=2}^∞ k(k − 1)ck x^(k−2) − Σ_{k=2}^∞ k(k − 1)ck x^k − Σ_{k=1}^∞ 2k·ck x^k + Σ_{k=0}^∞ α(α + 1)ck x^k = 0.

Replacing k − 2 by k in the first series gives

Σ_{k=0}^∞ (k + 2)(k + 1)c_{k+2} x^k − Σ_{k=2}^∞ k(k − 1)ck x^k − Σ_{k=1}^∞ 2k·ck x^k + Σ_{k=0}^∞ α(α + 1)ck x^k = 0.

We want to combine the series together, but are faced with the problem that they do not all start with the same value of k. In contrast to the last example, we will simply start each series in this example at k = 0 since all of the extra terms added on in this way will equal 0. We get

Σ_{k=0}^∞ (k + 2)(k + 1)c_{k+2} x^k − Σ_{k=0}^∞ k(k − 1)ck x^k − Σ_{k=0}^∞ 2k·ck x^k + Σ_{k=0}^∞ α(α + 1)ck x^k = 0,

and so,

Σ_{k=0}^∞ [(k + 2)(k + 1)c_{k+2} − k(k − 1)ck − 2k·ck + α(α + 1)ck] x^k = 0.

Setting the coefficient of each power of x equal to zero gives us

(k + 2)(k + 1)c_{k+2} − k(k − 1)ck − 2k·ck + α(α + 1)ck = 0

for k = 0, 1, . . . . This particular expression can be simplified since

(α + k + 1)(α − k) = −k(k − 1) − 2k + α(α + 1),

as the reader can easily check. We get the recurrence relation

c_{k+2} = −((α + k + 1)(α − k) / ((k + 2)(k + 1))) ck


for k = 0, 1, . . . . Then,

k = 0 ⇒ c2 = −((α + 1)α/2) c0
k = 1 ⇒ c3 = −((α + 2)(α − 1)/(3·2)) c1
k = 2 ⇒ c4 = −((α + 3)(α − 2)/(4·3)) c2 = ((α + 3)(α + 1)α(α − 2)/(4·3·2)) c0
k = 3 ⇒ c5 = −((α + 4)(α − 3)/(5·4)) c3 = ((α + 4)(α + 2)(α − 1)(α − 3)/(5·4·3·2)) c1,

and so on. We have written out enough terms to detect a pattern. It is

c_{2m} = (−1)^m ((α + 2m − 1)(α + 2m − 3) · · · (α + 1)(α)(α − 2) · · · (α − 2m + 2) / (2m)!) c0,

c_{2m+1} = (−1)^m ((α + 2m)(α + 2m − 2) · · · (α + 2)(α − 1)(α − 3) · · · (α − 2m + 1) / (2m + 1)!) c1.

Taking c0 = 1 and c1 = 0, we get a solution φ1; taking c0 = 0 and c1 = 1, we get a solution φ2. By the last corollary, both φ1 and φ2 converge on (at least) I = {x : |x| < 1}. Also, φ1 and φ2 are linearly independent since

W(φ1, φ2)(0) = φ1(0)φ′2(0) − φ2(0)φ′1(0) = 1·1 − 0·0 = 1 ≠ 0.

Therefore, the general solution is φ = C1φ1 + C2φ2.

Let's look more closely at some special cases of the Legendre equation. We claim that if α is a nonnegative even integer, then φ1 is a polynomial since α = 2m implies c_{2m+2} = 0, which in turn implies that all the following coefficients are zero. In particular, an easy computation shows that

α = 0 ⇒ φ1(x) = 1
α = 2 ⇒ φ1(x) = 1 − 3x²
α = 4 ⇒ φ1(x) = 1 − 10x² + (35/3)x⁴

and so on. However, in these cases, φ2 is not a polynomial, which implies that φ1 and its multiples are the only polynomial solutions. In a similar fashion, it can be seen that if α is a positive odd integer, then φ2 is a polynomial solution and that

α = 1 ⇒ φ2(x) = x
α = 3 ⇒ φ2(x) = x − (5/3)x³
α = 5 ⇒ φ2(x) = x − (14/3)x³ + (21/5)x⁵

and so on. This time, φ1 is not a polynomial, and so φ2 and its multiples are the only polynomial solutions. Notice that we have found solutions that converge on an interval (the entire real line) larger than the one guaranteed by the general theory. On the other hand, the absolute ratio test can be used to show that any non-polynomial solution has radius of convergence R = 1. We conclude that for each n = 0, 1, . . . , the Legendre equation

(1 − x²)y′′ − 2xy′ + n(n + 1)y = 0


has a unique polynomial solution Pn such that Pn(1) = 1 (which is a constant multiple of φ1 or φ2, depending on whether n is even or odd). Pn is called the nth Legendre polynomial. We have

P0(x) = 1
P1(x) = x
P2(x) = (3/2)x² − 1/2
P3(x) = (5/2)x³ − (3/2)x

and so on. The Legendre polynomials have been studied extensively.
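The recurrence makes the polynomial cases easy to generate. The sketch below (my code, using exact rational arithmetic; the function name is my own) builds the polynomial solution for α = n and normalizes it so that Pn(1) = 1, recovering the P2 and P3 listed above:

```python
from fractions import Fraction

def legendre_coeffs(n):
    """Coefficients of P_n, built from the recurrence
    c_{k+2} = -((n+k+1)(n-k)/((k+2)(k+1))) c_k and normalized so P_n(1) = 1."""
    c = [Fraction(0)] * (n + 1)
    c[n % 2] = Fraction(1)               # start from c0 (n even) or c1 (n odd)
    for k in range(n % 2, n - 1, 2):
        c[k + 2] = -Fraction((n + k + 1) * (n - k), (k + 2) * (k + 1)) * c[k]
    scale = sum(c)                       # the polynomial's value at x = 1
    return [ck / scale for ck in c]      # coefficient list, lowest degree first

assert legendre_coeffs(2) == [Fraction(-1, 2), 0, Fraction(3, 2)]     # (3/2)x^2 - 1/2
assert legendre_coeffs(3) == [0, Fraction(-3, 2), 0, Fraction(5, 2)]  # (5/2)x^3 - (3/2)x
```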

The examples we have looked at are, believe it or not, relatively simple. A measure of their simplicity is that we have been able to find a pattern for the coefficients that has enabled us to write down full solutions in sigma notation. Often, however, we will not be able to detect a pattern and will have to be content with just writing down the first few terms of a solution, which really just gives us an approximation of the solution. In particular, a second degree equation can easily lead to a recurrence relation involving three different coefficients, in which case a pattern for the coefficients is likely to be too complicated to find.

Using results from complex analysis, we can greatly simplify the process of estimating the radius of convergence of a solution. If f is a rational function, then the radius of convergence of the power series expansion of f based at x0 equals the distance from x0 to the closest point in the complex plane at which f does not exist. Then, for example, if we are trying to solve

y′′ + ((x + 8)/((x − 1)(x − 5)))y′ + (2/(x − 2))y = 0

using power series about x0 = 0, the series for (x + 8)/[(x − 1)(x − 5)] will have radius of convergence 1, the series for 2/(x − 2) will have radius of convergence 2, and so the radius of convergence for any solution will be at least min{1, 2} = 1. If, for example, we are trying to solve

(9 + x²)y′′ + y = 0

using power series about x0 = 0, the power series for 1/(9 + x²) will have radius of convergence 3 since the closest (and only) points at which 9 + x² is zero are ±3i. Also, the coefficient of y′ is 0, which has radius of convergence ∞. Thus, the radius of convergence for any solution will be at least min{∞, 3} = 3.
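This singularity-distance rule is easy to automate (my sketch, not from the text): locate the complex points where each rational coefficient fails to exist and take the minimum distance from x0:

```python
import cmath

def quadratic_roots(a, b, c):
    """Both complex roots of a*x^2 + b*x + c = 0."""
    d = cmath.sqrt(b * b - 4 * a * c)
    return (-b + d) / (2 * a), (-b - d) / (2 * a)

x0 = 0

# y'' + (x+8)/((x-1)(x-5)) y' + 2/(x-2) y = 0: coefficients fail at 1, 5, 2
R_first = min(abs(s - x0) for s in (1, 5, 2))
print(R_first)    # 1

# (9 + x^2) y'' + y = 0: standard form has 1/(9 + x^2), zeros at +-3i
R_second = min(abs(s - x0) for s in quadratic_roots(1, 0, 9))
print(R_second)   # 3.0
```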

Exercises

Unless you are specifically told otherwise, assume all power series solutions in this exercise set are based at x0 = 0.

1. Use the techniques of this section to solve y′ − 2xy = 0. Check your answer by also solving the equation by means of first order techniques.

2. Use the techniques of this section to find two linearly independent solutions of y′′ + y = 0. Check your answer by also solving the equation by means of constant coefficient techniques.


3. Find two linearly independent solutions of y′′ + 3xy′ − 6y = 0. For any non-polynomial solution, it will be sufficient to write the answer in the form: first 4 non-zero terms + · · · .

4. Find two linearly independent solutions of y′′ − x²y′ + 4xy = 0. For any non-polynomial solution, it will be sufficient to write the answer in the form: first 4 non-zero terms + · · · .

5. Find two linearly independent solutions of y′′ + 8xy′ − 4y = 0. Also, use the absolute ratio test to find where your solutions converge.

6. Using power series based at x0 = 6, find two linearly independent solutions of y′′ − (x − 6)³y = 0. Also, use the absolute ratio test to find where your solutions converge.

Note that to solve an initial value problem using power series, it is not necessary to first find the general solution. Instead, you can find the solution to the initial value problem directly with an appropriate choice of c0 and c1.

7. Solve y′′ + 2x²y′ − 8xy = 0, y(0) = 0, y′(0) = 1.

8. Solve y′′ + xy′ + y = 0, y(0) = 1, y′(0) = 0. (To put the answer in nice form, you might consider the formula (2m)(2m − 2) · · · 2 = m!·2^m.)

9. Solve y′′ + x²y′ + y = 0, y(0) = 1, y′(0) = 0. It will be sufficient to write the answer in the form: first 4 non-zero terms + · · · .

10. For each constant α, find two linearly independent solutions of the Hermite equation y′′ − 2xy′ + 2αy = 0. Also, show that if α is a nonnegative integer n, then there is a polynomial solution.

11. For each constant α, find two linearly independent solutions of the Chebyshev equation (1 − x²)y′′ − xy′ + α²y = 0. Also, show that if α is a nonnegative integer n, then there is a polynomial solution.

12. What is the largest value of R0 for which we can be sure, without evaluating the solutions, that every solution φ(x) = Σ ck x^k of (1 + 5x²)y′′ + 8xy′ − 4y = 0 converges on {x : |x| < R0}?

3. Regular Singular Points and the Euler Equation

Now we will consider the nth-order linear homogeneous differential equation

(7) a0(x)y^(n) + a1(x)y^(n−1) + · · · + an(x)y = 0,

where a0, . . . , an are continuous on some interval I.

Definition 3.3. Let x0 be a point in the interval I. If a0, . . . , an are analyticat x0 and if a0(x0) 6= 0, then x0 is said to be an ordinary point of the equation (7).If a0, . . . , an are analytic at x0 and if a0(x0) = 0, then x0 is said to be a singularpoint of (7).

If x_0 is an ordinary point of (7), we can divide by a_0 and will have an equation of the sort studied in the last section. (It can be shown that the quotient of analytic functions is analytic, as long as we do not divide by zero.) We will be able, therefore, to find power series solutions. On the other hand, if x_0 is a singular point of (7), we


have nothing that assures the existence of a power series solution about x_0. (The importance of whether or not a_0(x_0) = 0 comes, perhaps, as a surprise.) There is, however, a certain kind of singular point for which something positive can be said.

Definition 3.4. A singular point x_0 is called a regular singular point for the differential equation (7) if (7) can be put into the form

(8) (x − x_0)^n y^(n) + b_1(x)(x − x_0)^(n−1) y^(n−1) + · · · + b_n(x)y = 0

near x_0, where b_1, . . . , b_n are analytic at x_0.

Example. (a) For x^2 y′′ + 3xy′ − 5y = 0, the only singular point is 0. It is a regular singular point, with b_1(x) = 3 and b_2(x) = −5.
(b) For x^2 y′′ + xe^x y′ + (sin x)y = 0, the only singular point is 0. It is a regular singular point, with b_1(x) = e^x and b_2(x) = sin x.
(c) For y′′ + 4y′ + 2(x − 3)^(−1) y = 0, we should first multiply through by (x − 3)^2 to get (x − 3)^2 y′′ + 4(x − 3)^2 y′ + 2(x − 3)y = 0. The only singular point is 3. It is a regular singular point, with b_1(x) = 4(x − 3) and b_2(x) = 2(x − 3).
(d) For x^2 y′′ + 5y′ + 2y = 0, the only singular point is 0. It is not a regular singular point since b_1(x) = 5/x, which is not analytic at 0. (We could view the equation as x^2 y′′ + (5/x)xy′ + 2y = 0.)
(e) For x^3(x + 1)y′′ + xy′ + x^2 y = 0, the only singular points are 0 and −1. For x_0 = 0, we divide by x and by x + 1 to get x^2 y′′ + (x + 1)^(−1) y′ + x(x + 1)^(−1) y = 0. Then 0 is not a regular singular point since b_1(x) = [x(x + 1)]^(−1) is not analytic at 0. For x_0 = −1, we multiply the original equation by x + 1 and divide by x^3 to get (x + 1)^2 y′′ + x^(−2)(x + 1)y′ + x^(−1)(x + 1)y = 0. Then −1 is a regular singular point, with b_1(x) = x^(−2) and b_2(x) = x^(−1)(x + 1).

A simple type of equation with a regular singular point is an Euler equation, which we define now.

Definition 3.5. The differential equation ax^2 y′′ + bxy′ + cy = 0, where a, b, and c are real numbers and a ≠ 0, is an Euler equation (of order 2).

We immediately see that an Euler equation has a regular singular point at 0. Our goal is to solve Euler equations. Let's start with a specific example.

Example. Solve x^2 y′′ + (3/2)xy′ − (1/2)y = 0. Since the coefficients are each a single power of x, let's (naively) look for solutions that are powers of x. Put φ(x) = x^r, where r is some constant whose value is unknown for now. Also, assume for now that x > 0. Plugging φ(x) = x^r into the equation, we get

r(r − 1)x^(r−2) x^2 + (3/2)r x^(r−1) x − (1/2)x^r = 0.

This simplifies to 2r(r − 1)x^r + 3rx^r − x^r = 0, which gives us the polynomial equation

2r(r − 1) + 3r − 1 = 0.

Solving this, we get 2r^2 + r − 1 = 0 and then r = −1, 1/2. Since the steps are reversible, we have that φ_1(x) = x^(−1) and φ_2(x) = x^(1/2) are solutions on the interval (0, ∞). Recall that the solution space of a second order linear homogeneous equation with continuous coefficients has dimension 2 if its leading coefficient never is zero (so that we can divide through by it). Thus, the general solution of our equation on (0, ∞) is φ(x) = C_1 x^(−1) + C_2 x^(1/2) since x^(−1) and x^(1/2) are clearly linearly independent.


Note that using x^r for negative values of x would have led to undesirable complications. For x < 0, let's plug in φ(x) = (−x)^r. We claim that proceeding as above leads again to the polynomial 2r(r − 1) + 3r − 1 with roots r = −1, 1/2, so that then φ(x) = C_1(−x)^(−1) + C_2(−x)^(1/2) is the general solution on the interval (−∞, 0). We can combine our results by saying that the general solution on both (0, ∞) and (−∞, 0) is

φ(x) = C_1|x|^(−1) + C_2|x|^(1/2).

Notice that the general solution on (0, ∞) and on (−∞, 0) can also be expressed as φ(x) = C_1 x^(−1) + C_2|x|^(1/2), with a different C_1.

Given the Euler equation ax^2 y′′ + bxy′ + cy = 0, we will refer to

p(r) = ar(r − 1) + br + c

as its indicial polynomial. Since a, b, and c are real numbers, the indicial polynomial, just as the characteristic polynomial in the constant coefficient setting, can have two real roots, one real double root, or two complex conjugate roots. This, in turn, will lead to three different forms of solutions of the differential equation.

Theorem 3.4. If the indicial polynomial

(9) p(r) = ar(r − 1) + br + c

has two distinct real roots r_1, r_2, then the general solution of ax^2 y′′ + bxy′ + cy = 0 is

φ(x) = C_1|x|^(r_1) + C_2|x|^(r_2)

on the intervals (−∞, 0) and (0, ∞).

If (9) has a real root r_1 of multiplicity two, then the general solution of ax^2 y′′ + bxy′ + cy = 0 is

φ(x) = C_1|x|^(r_1) + C_2|x|^(r_1) ln |x|

on (−∞, 0) and on (0, ∞).

If (9) has distinct complex conjugate roots r_1 = λ + µi and r_2 = λ − µi, then the general solution of ax^2 y′′ + bxy′ + cy = 0 is

φ(x) = C_1|x|^(r_1) + C_2|x|^(r_2)

on (−∞, 0) and on (0, ∞). Also, the general solution can be written as

φ(x) = C_1|x|^λ cos(µ ln |x|) + C_2|x|^λ sin(µ ln |x|).

Proof. Put L(y) = ax^2 y′′ + bxy′ + cy. We will restrict our attention to the interval (0, ∞). The proof for (−∞, 0) is similar. If φ(x) = x^r, then

L(φ)(x) = ax^2 r(r − 1)x^(r−2) + bx r x^(r−1) + cx^r = [ar(r − 1) + br + c]x^r = p(r)x^r.

Thus, x^r is a solution of the Euler equation if and only if r is a root of its indicial polynomial. If r_1 and r_2 are distinct roots (real or complex), the general solution on (0, ∞) can then be given as

φ(x) = C_1 x^(r_1) + C_2 x^(r_2)

since the solution space has dimension 2 and x^(r_1), x^(r_2) are clearly linearly independent.


If r_1 is a double root of p(r), we need to show that x^(r_1) ln x is a solution of L(y) = 0 on (0, ∞). Note that we have p(r) = a(r − r_1)^2, which implies p′(r) = 2a(r − r_1). A simple computation gives

L(x^(r_1) ln x) = ax^2 (x^(r_1) ln x)′′ + bx(x^(r_1) ln x)′ + c(x^(r_1) ln x)
               = x^(r_1)[p(r_1) ln x + p′(r_1)].

Thus, L(x^(r_1) ln x) = 0, and the result follows since the linear independence is clear.

If r = λ + µi, then for any x > 0,

x^r = x^λ x^(µi) = x^λ e^((µi) ln x) = x^λ[cos(µ ln x) + i sin(µ ln x)].

So, if r_1 = λ + µi and r_2 = λ − µi are distinct complex conjugate roots of p(r), then x^λ cos(µ ln x) and x^λ sin(µ ln x) are linear combinations of x^(r_1), x^(r_2), which implies that they are solutions of the linear homogeneous equation. Again, the result follows since the linear independence is clear.

Example. Solve x^2 y′′ + 3xy′ + 5y = 0 on x > 0. The indicial polynomial for this Euler equation is p(r) = r(r − 1) + 3r + 5 = r^2 + 2r + 5, which has roots −1 ± 2i. The general solution on (0, ∞) is, therefore,

φ(x) = C_1 x^(−1) cos(2 ln x) + C_2 x^(−1) sin(2 ln x).

Example. Solve (x − 4)^2 y′′ − 5(x − 4)y′ + 9y = 0 on x ≠ 4. We get the indicial polynomial p(r) = r(r − 1) − 5r + 9, which corresponds to plugging in φ(x) = (x − 4)^r for x > 4. The roots of p(r) are r = 3, 3. Therefore, the general solution of the differential equation is

φ(x) = C_1|x − 4|^3 + C_2|x − 4|^3 ln |x − 4|

on both (4, ∞) and (−∞, 4).
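As a spot check on the double-root case (Python, not part of the book), the function below verifies by direct differentiation that φ(x) = (x − 4)^3 ln(x − 4) solves the equation on x > 4; the derivatives of u^3 ln u are computed by hand in the code.

```python
import math

# Check that phi(x) = (x-4)^3 * ln(x-4) solves
# (x-4)^2 y'' - 5(x-4) y' + 9 y = 0 on x > 4 (double root r = 3, 3).

def residual(x):
    u = x - 4.0
    phi = u ** 3 * math.log(u)
    dphi = 3 * u ** 2 * math.log(u) + u ** 2   # d/dx of u^3 ln u
    d2phi = 6 * u * math.log(u) + 5 * u        # second derivative
    return u ** 2 * d2phi - 5 * u * dphi + 9 * phi

for x in (4.5, 5.0, 7.0, 10.0):
    assert abs(residual(x)) < 1e-9
print("double-root solution verified")
```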

Example. Solve 2x^2 y′′ − 5xy′ + 3y = 8x^6 on x > 0. Since this is a nonhomogeneous equation, we start by solving the corresponding homogeneous equation 2x^2 y′′ − 5xy′ + 3y = 0, which is an Euler equation. The indicial polynomial is p(r) = 2r(r − 1) − 5r + 3 = 2r^2 − 7r + 3, which has roots r = 1/2, 3. The general solution of the homogeneous equation is then

φ(x) = C_1 x^(1/2) + C_2 x^3.

We have developed two methods for finding a particular solution ψ_p. In this example, however, we only can use the method of variation of parameters, since the equation does not have constant coefficients. Taking care to put our equation into standard form for this method by dividing through by 2x^2, we see that g(x) = 4x^4. Thus, we want to solve the system

x^(1/2) u′_1(x) + x^3 u′_2(x) = 0
(1/2)x^(−1/2) u′_1(x) + 3x^2 u′_2(x) = 4x^4.

We can find that u′_1(x) = (−8/5)x^(9/2) by subtracting x times the second equation from 3 times the first. Then substituting into the first equation gives u′_2(x) = (8/5)x^2. Integrating gives u_1(x) = (−16/55)x^(11/2) and u_2(x) = (8/15)x^3. Then,

ψ_p(x) = [(−16/55)x^(11/2)][x^(1/2)] + [(8/15)x^3][x^3] = (8/33)x^6.


Therefore, the general solution of the original equation is

ψ(x) = (8/33)x^6 + C_1 x^(1/2) + C_2 x^3.
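The particular solution can be confirmed with exact rational arithmetic. The short check below (not from the text) uses the fact that for y = C x^n the left side of 2x^2 y′′ − 5xy′ + 3y collapses to C[2n(n − 1) − 5n + 3]x^n.

```python
from fractions import Fraction

# Verify with exact arithmetic that psi_p(x) = (8/33) x^6 satisfies
# 2 x^2 y'' - 5 x y' + 3 y = 8 x^6.
# For y = C x^n:  2 x^2 y'' - 5 x y' + 3 y = C [2n(n-1) - 5n + 3] x^n.

C = Fraction(8, 33)
n = 6
coeff = C * (2 * n * (n - 1) - 5 * n + 3)   # coefficient of x^6 on the left
assert coeff == 8
print("particular solution confirmed:", coeff, "x^6")
```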

The techniques we have developed above can be extended to Euler equations of any order. Here, in brief, is how the generalization goes. A linear homogeneous differential equation of order n is an Euler equation if each of its terms is of the form: a constant (possibly 1 or 0) times a power of x times a derivative of y with order equal to the power of x. The indicial polynomial is the polynomial that results from plugging x^r into the equation, at least for x > 0. From the roots of the indicial polynomial, we get solutions to the differential equation as we did in the case of n = 2, but with the more general condition that if r is a root of the indicial polynomial of multiplicity k, then |x|^r, |x|^r ln |x|, . . . , |x|^r (ln |x|)^(k−1) are k linearly independent solutions of the differential equation.

You have probably noticed that there is much about Euler equations that resembles what is known about equations with constant coefficients. This is not a coincidence. For some insight concerning the relationship between Euler equations and equations with constant coefficients, see the exercises.

Example. Solve x^3 y′′′ + 6x^2 y′′ + 7xy′ + y = 0 on x > 0. Plugging in x^r, we get

x^3 r(r − 1)(r − 2)x^(r−3) + 6x^2 r(r − 1)x^(r−2) + 7x r x^(r−1) + x^r = 0,

which leads to r^3 + 3r^2 + 3r + 1 = 0. Thus, the roots of the indicial polynomial are r = −1, −1, −1 and the general solution of the differential equation on (0, ∞) is

φ(x) = C_1 x^(−1) + C_2 x^(−1) ln x + C_3 x^(−1)(ln x)^2.
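A quick check of the cubic (not part of the text): two cubic polynomials that agree at four points are identical, so comparing r(r − 1)(r − 2) + 6r(r − 1) + 7r + 1 with (r + 1)^3 at five sample values confirms the triple root at r = −1.

```python
# Check that plugging x^r into x^3 y''' + 6 x^2 y'' + 7 x y' + y = 0 yields
# r(r-1)(r-2) + 6 r(r-1) + 7r + 1 = (r+1)^3, a triple root at r = -1.

def indicial(r):
    return r * (r - 1) * (r - 2) + 6 * r * (r - 1) + 7 * r + 1

# Two cubics agreeing at 4 points are equal; we sample 5 to be generous.
for r in (-2, -1, 0, 1, 2):
    assert indicial(r) == (r + 1) ** 3
print("indicial polynomial is (r+1)^3; roots r = -1, -1, -1")
```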

Exercises

1. Solve x^2 y′′ − 7xy′ + 16y = 0 on x ≠ 0.

2. Solve 4(x − 3)^2 y′′ + 8(x − 3)y′ + y = 0 on x > 3.

3. Solve x^2 y′′ + 3xy′ + 5y = 0 on x > 0.

4. Solve x^2 y′′ − 3xy′ + 8y = 0 on x > 0.

5. Solve the initial value problem 3x^2 y′′ − 2xy′ + 2y = 0, y(1) = 0, y′(1) = 2 (on x > 0).

6. Solve the initial value problem (x + 1)^2 y′′ − 6y = 0, y(0) = 1, y′(0) = 4 (on x > −1).

7. Solve x^2 y′′ + 2xy′ − 6y = 5x^2 on x > 0.

8. Solve x^2 y′′ − 4xy′ + 6y = 5x^3 on x > 0.

9. Solve the initial value problem 2y′′ − (1/x)y′ + (1/x^2)y = x, y(1) = 1, y′(1) = 0 (on x > 0).

10. Solve x^3 y′′′ + 2x^2 y′′ − xy′ + y = 0 on x > 0.


11. Use techniques of this section to solve xy′ + 3y = x^5 on x > 0. Check your answer by also solving the equation by means of an integrating factor.

12. Put x = e^t and suppose φ and η are two functions related by φ(x) = η(t).
(a) Show that φ is a solution of the Euler equation ax^2 y′′ + bxy′ + cy = 0 when x > 0 if and only if η is a solution of the constant coefficient equation ay′′ + (b − a)y′ + cy = 0.
(b) Compute the indicial polynomial of the Euler equation and the characteristic polynomial of the constant coefficient equation, and compare the two.

13. Use the ideas of the preceding problem to solve x^2 y′′ − 4xy′ + 6y = 0 on x > 0 by converting to a constant coefficient equation. Check your answer by also solving the equation directly as an Euler equation.

14. Find all singular points of x^2 y′′ + (x + 1)y′ + y = 0 and determine whether each is a regular singular point.

15. Find all singular points of x(x − 1)y′′ + (x − 1)y′ + 5y = 0 and determine whether each is a regular singular point.

4. More on Equations with Regular Singular Points

We now want to find solutions about a regular singular point when our equation is not an Euler equation. The procedure we are about to describe is called the method of Frobenius. To keep the notation from becoming too complicated, let's consider only the case of a second order equation with regular singular point x_0 = 0. Based on the types of solutions found so far for variable coefficient equations, it seems reasonable to try to find solutions of the form

(10) φ(x) = x^r ∑_{k=0}^∞ c_k x^k,

at least for x > 0. In order to avoid the product rule when finding φ′(x), we will immediately rewrite φ(x) as

(11) φ(x) = ∑_{k=0}^∞ c_k x^(k+r),

x > 0. We should note that this is not necessarily a power series since r is not necessarily a nonnegative integer. Still, termwise differentiation is valid. We get

φ′(x) = ∑_{k=0}^∞ (k + r)c_k x^(k+r−1) and φ′′(x) = ∑_{k=0}^∞ (k + r)(k + r − 1)c_k x^(k+r−2).

In contrast to earlier problems, all the summations need to start at zero; we do not know whether the original series starts with a constant term that would drop out after differentiation. Let's turn to a specific example.

Example. Solve 2x^2 y′′ + 3xy′ + (2x − 1)y = 0 on x > 0. It is clear that 0 is a regular singular point with b_1(x) = 3/2 and b_2(x) = (2x − 1)/2. We will, therefore, look for a solution of the form (11). Much, but not all, of what we do


will parallel how we found power series solutions about ordinary points. Plugging into the equation gives

∑_{k=0}^∞ 2(k + r)(k + r − 1)c_k x^(k+r) + ∑_{k=0}^∞ 3(k + r)c_k x^(k+r) + ∑_{k=0}^∞ 2c_k x^(k+r+1) − ∑_{k=0}^∞ c_k x^(k+r) = 0.

To put all the powers of x into the same form, we replace k + 1 by k in the next to last series, giving us

∑_{k=0}^∞ 2(k + r)(k + r − 1)c_k x^(k+r) + ∑_{k=0}^∞ 3(k + r)c_k x^(k+r) + ∑_{k=1}^∞ 2c_{k−1} x^(k+r) − ∑_{k=0}^∞ c_k x^(k+r) = 0.

Then combining the series together, we get

[2r(r − 1) + 3r − 1]c_0 x^r + ∑_{k=1}^∞ [2(k + r)(k + r − 1)c_k + 3(k + r)c_k + 2c_{k−1} − c_k] x^(k+r) = 0.

This will hold for all x > 0 if the coefficient of each power of x is zero. Starting with the coefficient of x^r, we want either c_0 = 0 or 2r(r − 1) + 3r − 1 = 0. We will see shortly that c_0 = 0 leads to the trivial solution φ ≡ 0, and so we will concentrate on the other condition, which simplifies to 2r^2 + r − 1 = 0. This implies that r = 1/2, −1. Now turning our attention to the coefficients of the other powers of x, we want

2(k + r)(k + r − 1)c_k + 3(k + r)c_k + 2c_{k−1} − c_k = 0

for k = 1, 2, . . . . This recurrence relation can be rewritten (unless we are dividing by 0) as

(12) c_k = 2c_{k−1} / [1 − 3(k + r) − 2(k + r)(k + r − 1)].

Let r = 1/2. Then (12) becomes

c_k = 2c_{k−1} / [1 − 3(k + 1/2) − 2(k + 1/2)(k − 1/2)] = −2c_{k−1} / [k(2k + 3)]

for k = 1, 2, . . . . Then,

k = 1 ⇒ c_1 = −2c_0 / (1 · 5)
k = 2 ⇒ c_2 = −2c_1 / (2 · 7) = 4c_0 / (2 · 1 · 7 · 5)
k = 3 ⇒ c_3 = −2c_2 / (3 · 9) = −8c_0 / (3 · 2 · 1 · 9 · 7 · 5)


and so on. Setting c_0 = 1, we get the solution

φ_1(x) = x^(1/2) (1 − (2/(1 · 5))x + (4/(2 · 1 · 7 · 5))x^2 − (8/(3 · 2 · 1 · 9 · 7 · 5))x^3 + − · · · ),

which can be written in full as

φ_1(x) = x^(1/2) (1 + ∑_{k=1}^∞ ((−1)^k 2^k / [k!(2k + 3)(2k + 1) · · · 5]) x^k).

If r = −1, then (12) becomes

c_k = 2c_{k−1} / [1 − 3(k − 1) − 2(k − 1)(k − 2)] = −2c_{k−1} / [k(2k − 3)]

for k = 1, 2, . . . . Then,

k = 1 ⇒ c_1 = 2c_0 / 1
k = 2 ⇒ c_2 = −2c_1 / (2 · 1) = −4c_0 / (2 · 1 · 1)
k = 3 ⇒ c_3 = −2c_2 / (3 · 3) = 8c_0 / (3 · 2 · 1 · 3 · 1)

and so on. Setting c_0 = 1, we get the solution

φ_2(x) = x^(−1) (1 + (2/1)x − (4/(2 · 1 · 1))x^2 + (8/(3 · 2 · 1 · 3 · 1))x^3 − + · · · ),

which can be written in full as

φ_2(x) = x^(−1) (1 + ∑_{k=1}^∞ ((−1)^(k+1) 2^k / [k!(2k − 3)(2k − 5) · · · 1]) x^k).

It is easily checked, using the absolute ratio test in conjunction with the recurrence relation (12), that each of the above series converges for all x, and so φ_1 and φ_2 are solutions on all of (0, ∞). Also, φ_1, φ_2 are linearly independent, so that the general solution of the differential equation is φ = C_1φ_1 + C_2φ_2.

Let us note in passing that (2k)(2k − 2) · · · 2 = k!2^k, and then

(2k − 1)(2k − 3) · · · 1 = (2k)!/(k!2^k).

It follows that φ_2(x) can be rewritten as

φ_2(x) = x^(−1) (1 + ∑_{k=1}^∞ ((−1)^(k+1) 2^(2k−1) / [k(2k − 2)!]) x^k).
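The closed form for the r = −1 coefficients can be validated against the recurrence with exact rational arithmetic; the sketch below (not from the text) checks the first eight coefficients.

```python
from fractions import Fraction
from math import factorial

# Check that the closed form for the r = -1 Frobenius solution of
# 2x^2 y'' + 3x y' + (2x - 1)y = 0 matches the recurrence
# c_k = -2 c_{k-1} / (k (2k - 3)), with c_0 = 1.

c = Fraction(1)
for k in range(1, 9):
    c = Fraction(-2) * c / (k * (2 * k - 3))           # the recurrence
    closed = Fraction((-1) ** (k + 1) * 2 ** (2 * k - 1),
                      k * factorial(2 * k - 2))        # closed form
    assert c == closed
print("recurrence and closed form agree through k = 8")
```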

Similarly, φ_1(x) can be written in a more sophisticated form.

We now want to describe more generally the solutions of linear homogeneous differential equations near regular singular points. We will consider a second order equation with a regular singular point at 0. Even in this case, we will limit ourselves to stating the facts; the proofs are quite demanding and will not be dealt with here.

Theorem 3.5. Consider the equation

(13) x^2 y′′ + b_1(x)xy′ + b_2(x)y = 0,


where b_1 and b_2 have power series expansions about 0 that converge for all |x| < R_0. Let r_1, r_2 be the roots of the indicial polynomial

(14) p(r) = r(r − 1) + b_1(0)r + b_2(0).

If r_1 − r_2 is not an integer, then there exist two linearly independent solutions of (13) on (−R_0, 0) and on (0, R_0) of the form

φ_1(x) = |x|^(r_1) ∑_{k=0}^∞ c_k x^k,   φ_2(x) = |x|^(r_2) ∑_{k=0}^∞ d_k x^k,

where each series converges for all |x| < R_0.

Note that the indicial polynomial (14) corresponds to the indicial polynomial of an Euler equation when b_1 and b_2 are constants. If r_1 − r_2 = 0, it should come as no surprise that there might not be two linearly independent solutions of the form given above. However, the case of r_1 − r_2 being a nonzero integer is not so clear. It turns out that the following type of example can occur: r_1 = 11/2, r_2 = 7/2 and then we get

φ_1(x) = x^(11/2) (1 + x/2 + x^2/3 + · · · ),   φ_2(x) = x^(7/2) (0 + 0x + x^2 + x^3/2 + x^4/3 + · · · ),

and so this does not give two linearly independent solutions. There will still be two linearly independent solutions in all cases, as the next theorem details.

Theorem 3.6. If r_1 = r_2, then there are two linearly independent solutions of (13) on (−R_0, 0) and on (0, R_0) of the form

φ_1(x) = |x|^(r_1) ∑_{k=0}^∞ c_k x^k,   φ_2(x) = |x|^(r_1+1) ∑_{k=0}^∞ d_k x^k + (ln |x|)φ_1(x),

where each series converges for all |x| < R_0.

If r_1 − r_2 is a positive integer, then there are two linearly independent solutions of (13) on (−R_0, 0) and on (0, R_0) of the form

φ_1(x) = |x|^(r_1) ∑_{k=0}^∞ c_k x^k,   φ_2(x) = |x|^(r_2) ∑_{k=0}^∞ d_k x^k + c(ln |x|)φ_1(x),

where each series converges for all |x| < R_0.

All cases are now accounted for since if r_1 − r_2 is a negative integer, we can simply relabel the roots so that r_1 − r_2 is positive. We should note that it is possible when r_1 − r_2 is a positive integer that the constant c will be zero, in which case there will be two 'simple' linearly independent solutions.

Our next example will be the Bessel equation, a collection of equations that arise in the study of, among other things, circular vibrating membranes and temperature distributions on circular plates.

Example. For any nonnegative constant α, consider the Bessel equation

(15) x^2 y′′ + xy′ + (x^2 − α^2)y = 0.

Clearly, 0 is a regular singular point with b_1(x) = 1 and b_2(x) = x^2 − α^2. The indicial polynomial is

(16) p(r) = r(r − 1) + b_1(0)r + b_2(0) = r(r − 1) + r − α^2 = r^2 − α^2,


which has roots r_1 = α, r_2 = −α.

Let's look for a solution of the form (11) on x > 0. Plugging into the equation gives

∑_{k=0}^∞ (k + r)(k + r − 1)c_k x^(k+r) + ∑_{k=0}^∞ (k + r)c_k x^(k+r) + ∑_{k=0}^∞ c_k x^(k+r+2) − ∑_{k=0}^∞ α^2 c_k x^(k+r) = 0.

To put all the powers of x into the same form, we replace k + 2 by k in the next to last series, giving us

∑_{k=0}^∞ (k + r)(k + r − 1)c_k x^(k+r) + ∑_{k=0}^∞ (k + r)c_k x^(k+r) + ∑_{k=2}^∞ c_{k−2} x^(k+r) − ∑_{k=0}^∞ α^2 c_k x^(k+r) = 0.

Then combining the series together, we get

[r(r − 1) + r − α^2]c_0 x^r + [(r + 1)r + (r + 1) − α^2]c_1 x^(r+1) + ∑_{k=2}^∞ [(k + r)(k + r − 1)c_k + (k + r)c_k + c_{k−2} − α^2 c_k] x^(k+r) = 0.

Notice that the expression in brackets multiplying c_0 is p(r). In the exercises, you are asked to show that this is always the case. Now, set the coefficient of each power of x equal to zero. Starting with the coefficient of x^r, we get (under the assumption that c_0 ≠ 0) the roots of the indicial polynomial. That is, we get r_1 = α, r_2 = −α. Next, the coefficient of x^(r+1) being 0 implies that

(17) [(r + 1)^2 − α^2]c_1 = 0.

Turning our attention to the coefficients of the other powers of x, we want

(k + r)(k + r − 1)c_k + (k + r)c_k + c_{k−2} − α^2 c_k = 0

for k = 2, 3, . . . . This recurrence relation can be rewritten as

[(k + r)^2 − α^2]c_k + c_{k−2} = 0,

and then (if we are not dividing by 0) as

(18) c_k = −c_{k−2} / [(k + r)^2 − α^2],

for k = 2, 3, . . . . Let's now choose r = r_1 = α. In this case, (17) says that [2α + 1]c_1 = 0 and so c_1 = 0 since α ≥ 0. Also, (18) becomes

c_k = −c_{k−2} / [(k + α)^2 − α^2] = −c_{k−2} / [k(k + 2α)]


for k = 2, 3, . . . . Then,

k = 2 ⇒ c_2 = −c_0 / [2(2 + 2α)]
k = 3 ⇒ c_3 = −c_1 / [3(3 + 2α)] = 0
k = 4 ⇒ c_4 = −c_2 / [4(4 + 2α)] = c_0 / [4 · 2(4 + 2α)(2 + 2α)]
k = 5 ⇒ c_5 = −c_3 / [5(5 + 2α)] = 0
k = 6 ⇒ c_6 = −c_4 / [6(6 + 2α)] = −c_0 / [6 · 4 · 2(6 + 2α)(4 + 2α)(2 + 2α)]

and so on. In general, we will have c_{2m+1} = 0 and

(19) c_{2m} = (−1)^m c_0 / [2^m m! 2^m (m + α)(m − 1 + α) · · · (1 + α)]

for m = 1, 2, . . . .

Let's now look at some special cases. If α = 0, and we let c_0 = 1, then we have the solution

φ_1(x) = ∑_{m=0}^∞ ((−1)^m / [2^m m! 2^m m!]) x^(2m),

which is usually denoted by J_0(x). In other words,

J_0(x) = ∑_{m=0}^∞ ((−1)^m / (m!)^2) (x/2)^(2m)

is a solution on (0, ∞) of x^2 y′′ + xy′ + x^2 y = 0, the Bessel equation of order zero. J_0 is referred to as the Bessel function of the first kind of order zero. (The use of the word 'order' has no relation to our usual meaning for this word.) Since r_1 = r_2, we look to Theorem 3.6 to see what to expect in terms of a second solution. It turns out that

φ_2(x) = x ∑_{m=0}^∞ ((−1)^m / [2^(2m+2) ((m + 1)!)^2]) (1 + 1/2 + · · · + 1/(m + 1)) x^(2m+1) + (ln x)φ_1(x)

is a solution on (0, ∞) of x^2 y′′ + xy′ + x^2 y = 0 such that φ_1, φ_2 are linearly independent.
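As a numerical illustration (not in the text), the partial sums of the series for J_0 nearly satisfy the Bessel equation of order zero; with enough terms kept, the residual is dominated by the first omitted term and is vanishingly small.

```python
import math

# Partial sums of J0(x) = sum (-1)^m / (m!)^2 (x/2)^(2m) nearly satisfy
# x^2 y'' + x y' + x^2 y = 0 once enough terms are kept.

def j0_terms(x, N=30):
    """Return truncated J0 and its first two derivatives at x."""
    y = dy = d2y = 0.0
    for m in range(N):
        a = (-1) ** m / math.factorial(m) ** 2 / 4 ** m  # coeff of x^(2m)
        y += a * x ** (2 * m)
        if m >= 1:
            dy += a * 2 * m * x ** (2 * m - 1)
            d2y += a * 2 * m * (2 * m - 1) * x ** (2 * m - 2)
    return y, dy, d2y

for x in (0.5, 1.0, 2.0):
    y, dy, d2y = j0_terms(x)
    assert abs(x ** 2 * d2y + x * dy + x ** 2 * y) < 1e-10
print("truncated J0 satisfies the Bessel equation of order zero")
```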

If α is a positive integer n (and if we put c_0 = 1/(2^n n!) to get a nicer looking formula), then the solution φ_1(x) on (0, ∞) of x^2 y′′ + xy′ + (x^2 − n^2)y = 0 becomes

J_n(x) = (x/2)^n ∑_{m=0}^∞ ((−1)^m / [m!(m + n)!]) (x/2)^(2m),

the Bessel function of the first kind of order n. Since r_1 − r_2 = 2n is a positive integer, there is no assurance that a second solution φ_2 of the form (11) exists with φ_1, φ_2 being linearly independent. It turns out that φ_2 must be quite complicated.

If α is n/2, where n is a positive odd integer, then r_1 − r_2 = n is a positive integer, but there will be two linearly independent solutions of the form (11) anyway; this corresponds to the constant c in Theorem 3.6 being 0. If α is any number for which r_1 − r_2 is not a positive integer, then certainly there will be two linearly independent solutions on (0, ∞) of the form (11). In both cases, the first can be found using r_1 = α, as was done above; the second can be found using r_2 = −α.
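Formula (19) can be checked against the recurrence c_k = −c_{k−2}/[k(k + 2α)] with exact rational arithmetic; the sketch below (not from the text) uses the sample value α = 2.

```python
from fractions import Fraction
from math import factorial

# Check formula (19) for the Bessel coefficients against the recurrence
# c_k = -c_{k-2} / (k (k + 2*alpha)), using alpha = 2 and c_0 = 1.

alpha = 2
c = {0: Fraction(1), 1: Fraction(0)}
for k in range(2, 13):
    c[k] = -c[k - 2] / (k * (k + 2 * alpha))

for m in range(1, 7):
    prod = Fraction(1)
    for j in range(1, m + 1):
        prod *= (j + alpha)                       # (1+a)(2+a)...(m+a)
    closed = Fraction((-1) ** m, 2 ** m * factorial(m) * 2 ** m) / prod
    assert c[2 * m] == closed                     # formula (19)
    assert c[2 * m - 1] == 0                      # odd coefficients vanish
print("formula (19) matches the recurrence for alpha = 2")
```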

Exercises

1. Find one non-trivial solution of 4x^2 y′′ + (1 − 4x)y = 0 on x > 0.

2. Find one non-trivial solution of x^2 y′′ + 9xy′ + (x + 16)y = 0 on x > 0.

3. Find two linearly independent solutions of 3xy′′ + y′ − xy = 0 on x > 0. For any series that is in a solution, it will be sufficient to write the series in the form: first 3 non-zero terms + · · · .

4. Find two linearly independent solutions of 2x^2 y′′ + xy′ + (x^2 − 1)y = 0 on x > 0. For any series that is in a solution, it will be sufficient to write the series in the form: first 3 non-zero terms + · · · .

5. Find two linearly independent solutions on x > 0 of x^2 y′′ + xy′ + (x^2 − 1/4)y = 0, the Bessel equation of order 1/2.

6. For each constant α, find one non-trivial solution on x > 0 of

xy′′ + (1 − x)y′ + αy = 0,

the Laguerre equation of order α. Also, show that whenever α is a positive integer, this solution is a polynomial.

7. Find all solutions of xy′ + (x + 1)y = 0 on x > 0 using the techniques of this section. Also, check your answer by solving the equation using an integrating factor.

8. Find one non-trivial solution of x^2 y′′ + (x^2 + x)y′ + x^2 y = 0 on x > 0. It will be sufficient to write the series involved in your answer in the form: first 4 non-zero terms + · · · .

9. Let L(y) = x^2 y′′ + b_1(x)xy′ + b_2(x)y, where b_1 and b_2 are analytic at 0. If φ(x) = ∑_{k=0}^∞ c_k x^(k+r), show that the coefficient of x^r in the series expansion of L(φ)(x) is p(r)c_0, where p(r) is the indicial polynomial of L(y) = 0.


CHAPTER 4

Systems of Differential Equations

1. Preliminary Remarks

Two tanks are connected to each other by two pipes. Tank I contains W_1 gallons of water, and Tank II contains W_2 gallons of water. Initially, the water in Tank I is pure, while the water in Tank II has S pounds of salt dissolved in it. Water flows through one pipe from Tank I to Tank II, while water flows through the other pipe from Tank II to Tank I, each at a constant rate of R gallons per minute. We will assume that the water in each tank is being stirred so that the salt in each tank is evenly distributed at all times. Let y_1(t) be the number of pounds of salt in Tank I and y_2(t) be the number of pounds of salt in Tank II at any time t. Using the principle that the rate of change in the amount of salt in a tank equals the rate at which salt enters the tank minus the rate at which salt leaves the tank, we get the system of differential equations

(1) y′_1 = −(R/W_1)y_1 + (R/W_2)y_2
    y′_2 = (R/W_1)y_1 − (R/W_2)y_2

along with the initial conditions

(2) y_1(0) = 0, y_2(0) = S.

In this chapter, we will find out how to solve such problems.
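Although the solution methods come later in the chapter, the model itself can be illustrated numerically. The sketch below (not from the text) runs a crude forward Euler simulation of (1)-(2) with made-up values W_1 = 100, W_2 = 50, R = 5, S = 20, and checks two things the model predicts: the total amount of salt is conserved, and mixing drives the concentrations in the two tanks together.

```python
# Forward-Euler simulation of the two-tank system (1)-(2).
# The tank sizes, flow rate, and initial salt are illustrative values only.

W1, W2, R, S = 100.0, 50.0, 5.0, 20.0
y1, y2 = 0.0, S                       # initial conditions (2)
dt, steps = 0.01, 10_000              # simulate 100 minutes

for _ in range(steps):
    d1 = -R / W1 * y1 + R / W2 * y2   # system (1)
    d2 = R / W1 * y1 - R / W2 * y2
    y1 += dt * d1
    y2 += dt * d2

assert abs((y1 + y2) - S) < 1e-6      # salt is conserved
# mixing drives the concentrations toward y1/W1 == y2/W2
assert abs(y1 / W1 - y2 / W2) < 1e-3
print(f"after 100 min: y1 = {y1:.4f} lb, y2 = {y2:.4f} lb")
```

Note that adding the two equations of (1) gives y′_1 + y′_2 = 0, which is exactly the conservation the first assertion checks.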

2. Linear Systems and Vector Notation

It will be convenient to express many of our results by means of matrices and column vectors. Before getting into the general theory, however, we will introduce some of the notation in a specific example.

Example. Consider the system of equations

y′_1 = 2y_1 − 2y_2
y′_2 = y_1 + 5y_2

where y_1, y_2 are functions of an independent variable x. This system can be rewritten as the matrix equation

[y′_1]   [2  −2] [y_1]
[y′_2] = [1   5] [y_2].

We will abbreviate this as

y′ = Ay.


Suppose that φ_1, φ_2 is a solution of the system. By this we mean that

φ′_1(x) = 2φ_1(x) − 2φ_2(x)

and

φ′_2(x) = φ_1(x) + 5φ_2(x)

for all x in some interval I. Putting

φ = [φ_1]
    [φ_2],

we have

φ′(x) = Aφ(x)

for all x ∈ I and we say that φ is a solution of y′ = Ay on I. It can easily be checked that

φ^[1](x) = [2e^(3x)]
           [−e^(3x)]

is a solution of the system. (Do it!) Likewise, we can check that

φ^[2](x) = [e^(4x)]
           [−e^(4x)]

is another solution. (We will soon see how to find these solutions.) We will use the notation

φ^[1](x) = [φ_11(x)] = [2e^(3x)]    and    φ^[2](x) = [φ_12(x)] = [e^(4x)]
           [φ_21(x)]   [−e^(3x)]                      [φ_22(x)]   [−e^(4x)].

In φ_ij, the first subscript i refers to the variable y_i being solved for, while the second subscript j tells us the solution φ^[j] that φ_ij is a part of.

If φ^[1] and φ^[2] are solutions, then it is easily seen that any linear combination C_1φ^[1] + C_2φ^[2] is also a solution. In other words,

φ(x) = C_1 e^(3x) [ 2] + C_2 e^(4x) [ 1]
                  [−1]              [−1]

is a solution for every choice of constants C_1, C_2. In elementary notation, this says

φ_1(x) = 2C_1 e^(3x) + C_2 e^(4x)
φ_2(x) = −C_1 e^(3x) − C_2 e^(4x)

is a solution for every choice of C_1, C_2. We will, of course, need to determine whether the process we have just gone through gives us all solutions of the system.
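The "Do it!" check can also be carried out numerically; the sketch below (not part of the book) verifies both vector solutions at a few sample points, with the derivatives of e^(3x) and e^(4x) entered by hand.

```python
import math

# Check that phi[1](x) = (2e^{3x}, -e^{3x}) and phi[2](x) = (e^{4x}, -e^{4x})
# solve the system y1' = 2y1 - 2y2, y2' = y1 + 5y2.

def check(y1, y2, dy1, dy2):
    assert abs(dy1 - (2 * y1 - 2 * y2)) < 1e-9
    assert abs(dy2 - (y1 + 5 * y2)) < 1e-9

for x in (-1.0, 0.0, 0.7):
    e3, e4 = math.exp(3 * x), math.exp(4 * x)
    check(2 * e3, -e3, 6 * e3, -3 * e3)   # phi[1] and its derivative
    check(e4, -e4, 4 * e4, -4 * e4)       # phi[2] and its derivative
print("both vector solutions check out")
```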

We will now generalize what we have seen so far.

Definition 4.1. A system of first order differential equations is said to be a system of linear equations if it is of the form

(3) y′_1 = a_11(x)y_1 + a_12(x)y_2 + · · · + a_1n(x)y_n + g_1(x)
    y′_2 = a_21(x)y_1 + a_22(x)y_2 + · · · + a_2n(x)y_n + g_2(x)
    ...
    y′_n = a_n1(x)y_1 + a_n2(x)y_2 + · · · + a_nn(x)y_n + g_n(x)


for all x in some interval I. This can be written in matrix notation as

(4) y′ = A(x)y + g(x),

where

A = [a_11(x) · · · a_1n(x)]        [y_1]           [g_1(x)]
    [  ...          ...   ],   y = [...],   g(x) = [ ...  ],
    [a_n1(x) · · · a_nn(x)]        [y_n]           [g_n(x)]

and where y′, the derivative of a matrix of functions, is gotten by differentiating each entry of the matrix. If

g(x) = 0 = [0]
           [...]
           [0]

for all x ∈ I, the system is called homogeneous. Otherwise, it is called nonhomogeneous.

Note that the system given in the preceding example was linear homogeneous on (−∞, ∞). Since multiplication by a matrix represents a linear transformation, we easily get the following results.

Theorem 4.1. The solutions of a linear homogeneous system

(5) y′ = A(x)y

form a vector space (which is a subspace of the space of all n-tuples of differentiable functions on I).

Corollary 4.1. Every linear combination of solutions of (5) is a solution of (5).

We now turn to initial value problems for systems of linear homogeneous equations. The behavior mirrors quite closely what we have seen for a higher order equation, and we will just state the results without proof.

Theorem 4.2. If a_ij is continuous for each i, j = 1, . . . , n on an interval I containing a point x_0, then the initial value problem

(6) y′_1 = a_11(x)y_1 + a_12(x)y_2 + · · · + a_1n(x)y_n
    y′_2 = a_21(x)y_1 + a_22(x)y_2 + · · · + a_2n(x)y_n
    ...
    y′_n = a_n1(x)y_1 + a_n2(x)y_2 + · · · + a_nn(x)y_n

    y_1(x_0) = β_1, y_2(x_0) = β_2, . . . , y_n(x_0) = β_n

has one and only one solution on I.

When solving a system of linear homogeneous differential equations (5), we are dealing with vector valued functions

f = [f_1]
    [...]
    [f_n].


As before, the idea of linear independence will be important for finding the general solution of the system. The vector valued functions f^[1], . . . , f^[n] are linearly independent on a set S if the condition

α_1 f^[1] + · · · + α_n f^[n] = 0

on S implies that α_1 = · · · = α_n = 0.

Example. Show that the functions

$$\boldsymbol{\varphi}^{[1]}(x) = \begin{pmatrix} 2e^{3x} \\ -e^{3x} \end{pmatrix} \quad \text{and} \quad \boldsymbol{\varphi}^{[2]}(x) = \begin{pmatrix} e^{4x} \\ -e^{4x} \end{pmatrix},$$

which appeared in the first example of this section, are linearly independent on any set S. We suppose that

α1 φ[1](x) + α2 φ[2](x) = 0

for all x ∈ S. Then if x0 is any one particular point in S, we have

α1 φ[1](x0) + α2 φ[2](x0) = 0.

In other words,

2e^{3x0} α1 + e^{4x0} α2 = 0
−e^{3x0} α1 − e^{4x0} α2 = 0.

This system of algebraic equations has the unique solution α1 = α2 = 0 since the determinant of the matrix of coefficients,

$$\begin{vmatrix} 2e^{3x_0} & e^{4x_0} \\ -e^{3x_0} & -e^{4x_0} \end{vmatrix} = -e^{7x_0},$$

is non-zero. Thus, φ[1], φ[2] are linearly independent on S.

The determinant we encountered in this last example is called a Wronskian of vector valued functions.

Definition 4.2. If

$$\mathbf{f}^{[1]} = \begin{pmatrix} f_{11} \\ \vdots \\ f_{n1} \end{pmatrix}, \; \ldots, \; \mathbf{f}^{[n]} = \begin{pmatrix} f_{1n} \\ \vdots \\ f_{nn} \end{pmatrix}$$

are differentiable vector valued functions, then their Wronskian is defined by

$$W_v(\mathbf{f}^{[1]}, \ldots, \mathbf{f}^{[n]})(x) = \begin{vmatrix} f_{11}(x) & \cdots & f_{1n}(x) \\ \vdots & & \vdots \\ f_{n1}(x) & \cdots & f_{nn}(x) \end{vmatrix}.$$

Lemma 4.1. Let f[1], . . . , f[n] be differentiable vector valued functions on a set S. If there exists a point x0 ∈ S for which Wv(f[1], . . . , f[n])(x0) ≠ 0, then f[1], . . . , f[n] are linearly independent on S.

Proof. Suppose that α1 f[1](x) + · · · + αn f[n](x) = 0 for all x ∈ S. In particular, for x = x0 we have α1 f[1](x0) + · · · + αn f[n](x0) = 0. In other words,

f11(x0)α1 + · · · + f1n(x0)αn = 0
...
fn1(x0)α1 + · · · + fnn(x0)αn = 0.

Page 111: A Concise Introduction to Ordinary Di erential Equationshcmth018/odebook.pdf · 2018-10-17 · 1 A Concise Introduction to Ordinary Di erential Equations David Protas California State

2. LINEAR SYSTEMS AND VECTOR NOTATION 103

This system of n equations in n unknowns has a unique solution for α1, . . . , αn since the determinant of coefficients is the Wronskian at x0, which is given to be nonzero. That unique solution is obviously α1 = · · · = αn = 0. Therefore, f[1], . . . , f[n] are linearly independent on S.

The proofs of the following two results will be left to the reader. Just as in the case of the preceding lemma, their proofs mirror the proofs of the corresponding results for a single higher order linear homogeneous equation.

Theorem 4.3. Let φ[1], . . . , φ[n] be n solutions of the linear homogeneous system (5) of n equations in n variables on an interval I. Then the following conditions are equivalent.

(a) Wv(φ[1], . . . , φ[n])(x) is never zero for x ∈ I.
(b) There exists a point x0 ∈ I such that Wv(φ[1], . . . , φ[n])(x0) ≠ 0.
(c) φ[1], . . . , φ[n] are linearly independent on I.

Theorem 4.4. Let φ[1], . . . , φ[n] be any n linearly independent solutions of the linear homogeneous system (5) of n equations in n variables on an interval I. Then every solution of (5) is of the form φ = C1φ[1] + · · · + Cnφ[n], where C1, . . . , Cn are constants.

We now know the form of the general solution of the linear homogeneous system (5) of n equations in n variables. Also, we see that the solution space of (5) has dimension n.

It is not a coincidence that the results we have so far about linear homogeneous systems closely resemble results about single higher order equations. Consider the equation

(7) y^{(n)} + a1(x)y^{(n−1)} + · · · + an(x)y = 0.

Put y1 = y, y2 = y′, . . . , yn = y^{(n−1)}. Then, clearly, φ is a solution of (7) if and only if

$$\boldsymbol{\varphi} = \begin{pmatrix} \varphi \\ \varphi' \\ \vdots \\ \varphi^{(n-1)} \end{pmatrix}$$

is a solution of the system

(8) y′1 = y2
    y′2 = y3
    ...
    y′_{n−1} = yn
    y′n = −an(x)y1 − · · · − a1(x)yn.

Thus, we can think of higher order equations as being equivalent to certain systems of first order equations. We will take advantage of this relationship later on.
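This rewriting is mechanical enough to automate. Below is a small sketch (the function name is my own) that builds the coefficient matrix of the system (8) for a constant coefficient equation:

```python
def companion_matrix(a):
    """Coefficient matrix of system (8) for
    y^(n) + a[0]*y^(n-1) + ... + a[n-1]*y = 0 (constant coefficients)."""
    n = len(a)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        A[i][i + 1] = 1.0             # y_i' = y_{i+1}
    for j in range(n):
        A[n - 1][j] = -a[n - 1 - j]   # y_n' = -a_n*y_1 - ... - a_1*y_n
    return A

# y'' + 5y' - 14y = 0 becomes y1' = y2, y2' = 14*y1 - 5*y2:
print(companion_matrix([5.0, -14.0]))  # [[0.0, 1.0], [14.0, -5.0]]
```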


Exercises

1. Rewrite the system

y′1 = 2y1 − 3xy2

y′2 = 5y1 + x2y2

in the form y′ = A(x)y.

2. Rewrite y′ = A(x)y, where

$$A(x) = \begin{pmatrix} 3 & 2 + x \\ x & 5 \end{pmatrix},$$

as a system of two equations in y1 and y2.

3. Verify that

$$\boldsymbol{\varphi}^{[1]}(x) = e^{2x} \begin{pmatrix} 6 \\ -1 \end{pmatrix} \quad \text{and} \quad \boldsymbol{\varphi}^{[2]}(x) = e^{-3x} \begin{pmatrix} -1 \\ 1 \end{pmatrix}$$

are solutions of the system

y′1 = 3y1 + 6y2
y′2 = −y1 − 4y2.

Also, compute Wv(φ[1], φ[2])(x), and use this to show that φ[1], φ[2] are linearly independent on any interval I.

4. Verify that

$$\boldsymbol{\varphi}(x) = e^{3x} \begin{pmatrix} 3 \\ 2 \\ 2 \end{pmatrix}$$

is a solution of the initial value problem

y′1 = 3y1 + y2 − y3
y′2 = 2y2 + y3
y′3 = −y2 + 4y3

y1(0) = 3, y2(0) = 2, y3(0) = 2.

5. Prove Theorem 4.1.

6. Prove Theorem 4.3.

7. Prove Theorem 4.4.

8. Contrary to the situation for scalar valued functions, it makes sense to talk about vector valued functions being linearly independent or linearly dependent at a point. Put

$$\mathbf{f}^{[1]}(x) = e^{\lambda_1 x} \begin{pmatrix} a \\ b \end{pmatrix} \quad \text{and} \quad \mathbf{f}^{[2]}(x) = e^{\lambda_2 x} \begin{pmatrix} c \\ d \end{pmatrix}.$$

For any point x0, prove that f[1](x0), f[2](x0) are linearly independent if and only if ad − bc ≠ 0.


3. Constant Coefficients

We now want to solve systems of first order linear homogeneous equations with constant coefficients

(9)

y′1 = a11y1 + a12y2 + · · ·+ a1nyn

y′2 = a21y1 + a22y2 + · · ·+ a2nyn

...

y′n = an1y1 + an2y2 + · · ·+ annyn

for all x ∈ (−∞,∞). This can be written in matrix notation as

(10) y′ = Ay.

As a first example, consider the system

y′1 = 3y1

y′2 = 7y2.

This, of course, can be solved very easily by earlier methods since the dependent variables y1 and y2 are not interrelated. The two equations are said to be uncoupled. We get φ1(x) = C1 e^{3x} and φ2(x) = C2 e^{7x}. Note that in matrix form, the system becomes

$$\begin{pmatrix} y_1' \\ y_2' \end{pmatrix} = \begin{pmatrix} 3 & 0 \\ 0 & 7 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}.$$

On the other hand, a system like

y′1 = 2y1 + y2
y′2 = 3y1 + 4y2

does not have an obvious solution because the dependent variables y1 and y2 cannot be treated one at a time. In matrix form, this system becomes

$$\begin{pmatrix} y_1' \\ y_2' \end{pmatrix} = \begin{pmatrix} 2 & 1 \\ 3 & 4 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}.$$

Thinking in terms of the matrix forms, we see that the reason the first system is so easy to solve is that its coefficient matrix is a diagonal matrix. This suggests that with a little extra effort we might also be able to solve systems with coefficient matrices that are diagonalizable.

We recall that a square matrix A is said to be diagonalizable if it is similar to a diagonal matrix. In other words, there is a diagonal matrix Λ such that Λ = P⁻¹AP for some invertible matrix P. The matrix P is commonly referred to as a transition matrix. In solving systems with diagonalizable coefficient matrices, we will use the following standard result from linear algebra.

Theorem 4.5. Let A be an n × n matrix with n linearly independent eigenvectors. Then A is similar to a diagonal matrix, i.e. Λ = P⁻¹AP, where Λ is diagonal. Furthermore, the columns of P are the linearly independent eigenvectors and the diagonal entries of Λ are the corresponding eigenvalues of A.

Before going any further, let’s diagonalize a specific matrix.


Example. Diagonalize

$$A = \begin{pmatrix} 2 & 1 \\ 3 & 4 \end{pmatrix}.$$

Recall that the eigenvalues of A can be found by solving

det (λI − A) = 0.

(Of course, this is equivalent to solving det (A − λI) = 0, which many people prefer. Also, recall that this is called the characteristic equation of A.) We have

$$\begin{vmatrix} \lambda - 2 & -1 \\ -3 & \lambda - 4 \end{vmatrix} = 0,$$

which simplifies to λ² − 6λ + 5 = 0. Thus, λ = 1, 5 are eigenvalues. Eigenvectors can now be found by solving

(λI − A)v = 0.

Starting with λ = 1, we have

$$\begin{pmatrix} -1 & -1 \\ -3 & -3 \end{pmatrix} \begin{pmatrix} k_1 \\ k_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix},$$

which reduces to the single equation k1 + k2 = 0. Roughly speaking, the one equation enables us to solve for one unknown, leaving the other arbitrary. If we, for example, let k2 be the arbitrary unknown and relabel it as k2 = α, then k1 = −α and the eigenspace corresponding to λ = 1 consists of the vectors

$$\alpha \begin{pmatrix} -1 \\ 1 \end{pmatrix}.$$

For λ = 5, we need to solve

$$\begin{pmatrix} 3 & -1 \\ -3 & 1 \end{pmatrix} \begin{pmatrix} k_1 \\ k_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix},$$

which reduces to the single equation 3k1 − k2 = 0. Again, one of the unknowns must be arbitrary. If, for example, we set k1 = β, then k2 = 3β, and the eigenspace corresponding to λ = 5 consists of the vectors

$$\beta \begin{pmatrix} 1 \\ 3 \end{pmatrix}.$$

Let's choose

$$\begin{pmatrix} -1 \\ 1 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 1 \\ 3 \end{pmatrix}$$

as our two linearly independent eigenvectors. Then, Λ = P⁻¹AP, where

$$\Lambda = \begin{pmatrix} 1 & 0 \\ 0 & 5 \end{pmatrix} \quad \text{and} \quad P = \begin{pmatrix} -1 & 1 \\ 1 & 3 \end{pmatrix},$$

and A has been diagonalized.
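A diagonalization claim like this can be confirmed by direct multiplication. Here is a minimal sketch in plain Python (2 × 2 matrices only; the helper names are mine), computing P⁻¹AP for the matrices just found:

```python
def matmul(X, Y):
    """Product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[2, 1], [3, 4]]
P = [[-1, 1], [1, 3]]
detP = P[0][0] * P[1][1] - P[0][1] * P[1][0]   # det P = -4, so P is invertible
Pinv = [[P[1][1] / detP, -P[0][1] / detP],
        [-P[1][0] / detP, P[0][0] / detP]]

Lam = matmul(Pinv, matmul(A, P))
print(Lam)  # [[1.0, 0.0], [0.0, 5.0]], i.e. diag(1, 5)
```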

Now, let's solve y′ = Ay for any diagonalizable matrix A. Say Λ = P⁻¹AP. We then have A = PΛP⁻¹ and so,

y′ = Ay ⇔ y′ = PΛP⁻¹y
        ⇔ P⁻¹y′ = ΛP⁻¹y
        ⇔ z′ = Λz,


where z = P⁻¹y. Since Λ is diagonal, we can easily solve z′ = Λz for z. Then we can find y from y = Pz.

Example. Solve

y′1 = 2y1 + y2

y′2 = 3y1 + 4y2

This system is equivalent to y′ = Ay, where

$$A = \begin{pmatrix} 2 & 1 \\ 3 & 4 \end{pmatrix}.$$

In the last example, we showed that this matrix A is diagonalizable; specifically, Λ = P⁻¹AP, for

$$\Lambda = \begin{pmatrix} 1 & 0 \\ 0 & 5 \end{pmatrix} \quad \text{and} \quad P = \begin{pmatrix} -1 & 1 \\ 1 & 3 \end{pmatrix}.$$

Following the outline given above, we want to solve z′ = Λz, which is equivalent to the system

z′1 = z1
z′2 = 5z2.

Clearly, z1 = C1 e^x and z2 = C2 e^{5x}. In other words,

$$\mathbf{z} = \begin{pmatrix} C_1 e^x \\ C_2 e^{5x} \end{pmatrix}.$$

Using y = Pz, we conclude that

$$\begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} -1 & 1 \\ 1 & 3 \end{pmatrix} \begin{pmatrix} C_1 e^x \\ C_2 e^{5x} \end{pmatrix} = \begin{pmatrix} -C_1 e^x + C_2 e^{5x} \\ C_1 e^x + 3C_2 e^{5x} \end{pmatrix}.$$

Note that the answer can be rewritten as

$$\mathbf{y} = C_1 e^x \begin{pmatrix} -1 \\ 1 \end{pmatrix} + C_2 e^{5x} \begin{pmatrix} 1 \\ 3 \end{pmatrix},$$

a form that makes it easy to see a basis for the two dimensional solution space.

Example. Solve

y′1 = y1 + 3y2 + 6y3

y′2 = −3y1 − 5y2 − 6y3

y′3 = 3y1 + 3y2 + 4y3

This system is equivalent to y′ = Ay, where

$$A = \begin{pmatrix} 1 & 3 & 6 \\ -3 & -5 & -6 \\ 3 & 3 & 4 \end{pmatrix}.$$

To find the eigenvalues of A, we need to solve

$$\begin{vmatrix} \lambda - 1 & -3 & -6 \\ 3 & \lambda + 5 & 6 \\ -3 & -3 & \lambda - 4 \end{vmatrix} = 0.$$

Expansion of the determinant leads to λ³ − 12λ − 16 = 0. Solving this, we find that the eigenvalues of A are λ = −2, −2, 4 (counting algebraic multiplicity). The eigenspace corresponding to λ = −2 is gotten by solving (−2I − A)v = 0. In other words, we must solve

$$\begin{pmatrix} -3 & -3 & -6 \\ 3 & 3 & 6 \\ -3 & -3 & -6 \end{pmatrix} \begin{pmatrix} k_1 \\ k_2 \\ k_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.$$

This reduces to the single equation k1 + k2 + 2k3 = 0, which can be used to solve for one of the unknowns, leaving the other two arbitrary. Let, for example, k2 and k3 be arbitrary, writing k2 = α and k3 = β. Then k1 = −α − 2β, and so the eigenspace corresponding to λ = −2 consists of all vectors of the form

$$\begin{pmatrix} -\alpha - 2\beta \\ \alpha \\ \beta \end{pmatrix} = \alpha \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix} + \beta \begin{pmatrix} -2 \\ 0 \\ 1 \end{pmatrix}.$$

For λ = 4 we have (4I − A)v = 0. In other words, we must solve

$$\begin{pmatrix} 3 & -3 & -6 \\ 3 & 9 & 6 \\ -3 & -3 & 0 \end{pmatrix} \begin{pmatrix} k_1 \\ k_2 \\ k_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.$$

Using row reduction, we can see that this is equivalent to the system

k1 − k3 = 0
k2 + k3 = 0.

The two equations let us solve for two of the variables, leaving one variable arbitrary. Put k3 = γ. Then k1 = γ and k2 = −γ. Thus, the eigenspace corresponding to λ = 4 consists of all vectors of the form

$$\gamma \begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix}.$$

Choosing

$$\begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix}, \quad \begin{pmatrix} -2 \\ 0 \\ 1 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix}$$

as our three linearly independent eigenvectors, we get Λ = P⁻¹AP, where

$$\Lambda = \begin{pmatrix} -2 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & 4 \end{pmatrix} \quad \text{and} \quad P = \begin{pmatrix} -1 & -2 & 1 \\ 1 & 0 & -1 \\ 0 & 1 & 1 \end{pmatrix},$$

and A has been diagonalized.

Now, we want to solve z′ = Λz, which is equivalent to the system

z′1 = −2z1
z′2 = −2z2
z′3 = 4z3.

Clearly, z1 = C1 e^{−2x}, z2 = C2 e^{−2x} and z3 = C3 e^{4x}. In other words,

$$\mathbf{z} = \begin{pmatrix} C_1 e^{-2x} \\ C_2 e^{-2x} \\ C_3 e^{4x} \end{pmatrix}.$$


Using y = Pz, we conclude that

$$\begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} -1 & -2 & 1 \\ 1 & 0 & -1 \\ 0 & 1 & 1 \end{pmatrix} \begin{pmatrix} C_1 e^{-2x} \\ C_2 e^{-2x} \\ C_3 e^{4x} \end{pmatrix} = \begin{pmatrix} -C_1 e^{-2x} - 2C_2 e^{-2x} + C_3 e^{4x} \\ C_1 e^{-2x} - C_3 e^{4x} \\ C_2 e^{-2x} + C_3 e^{4x} \end{pmatrix}.$$

Note that the answer can be rewritten as

$$\mathbf{y} = C_1 e^{-2x} \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix} + C_2 e^{-2x} \begin{pmatrix} -2 \\ 0 \\ 1 \end{pmatrix} + C_3 e^{4x} \begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix}.$$

In the last two examples, we saw that the general solution consisted of linear combinations of solutions of the form e^{λx}v, where λ was an eigenvalue of the coefficient matrix and v was an eigenvector corresponding to λ. This was not a coincidence; notice that whenever a matrix P is multiplied times a column vector z, the product is as given in the following diagram:

$$\begin{pmatrix} | & & | \\ \mathbf{v}^{[1]} & \cdots & \mathbf{v}^{[n]} \\ | & & | \end{pmatrix} \begin{pmatrix} z_1 \\ \vdots \\ z_n \end{pmatrix} = z_1 \mathbf{v}^{[1]} + \cdots + z_n \mathbf{v}^{[n]},$$

which is what we are claiming for zi = Ci e^{λi x}, i = 1, . . . , n. The results we have established are summarized in the following theorem.

Theorem 4.6. Consider the system (10) of n first order linear homogeneous equations with constant coefficients in n unknowns. Suppose A has n linearly independent eigenvectors v[1], v[2], . . . , v[n]. For each i = 1, 2, . . . , n, let λi be the eigenvalue corresponding to v[i]. Then, the general solution of (10) is given by

φ(x) = C1 e^{λ1 x} v[1] + C2 e^{λ2 x} v[2] + · · · + Cn e^{λn x} v[n].

Theorem 4.6 can be verified directly as follows. If v ≠ 0, then y = e^{λx}v satisfies y′ = Ay if and only if λe^{λx}v = Ae^{λx}v, which holds if and only if

λv = Av.

This last line, however, simply says that λ is an eigenvalue and v is an eigenvector of A. Furthermore, we get the general solution by taking linear combinations since we have n linearly independent solutions and the solution space has dimension n.

Notice that we have shown, whether A is diagonalizable or not, that e^{λx}v, where λ is an eigenvalue and v is an eigenvector of A belonging to λ, will always be a solution of y′ = Ay.

Example. Solve the initial value problem

y′1 = 3y1 + y2

y′2 = 2y1 + 4y2

y1(0) = 2, y2(0) = 7.

The reader can check that the matrix

$$A = \begin{pmatrix} 3 & 1 \\ 2 & 4 \end{pmatrix}$$


has eigenvalues λ = 2, 5, and that the eigenvectors corresponding to λ = 2 are of the form

$$\alpha \begin{pmatrix} -1 \\ 1 \end{pmatrix},$$

while the eigenvectors corresponding to λ = 5 are of the form

$$\beta \begin{pmatrix} 1/2 \\ 1 \end{pmatrix}.$$

The general solution of the system is then

$$\mathbf{y} = C_1 e^{2x} \begin{pmatrix} -1 \\ 1 \end{pmatrix} + C_2 e^{5x} \begin{pmatrix} 1 \\ 2 \end{pmatrix}$$

(using β = 2 to make the answer look simpler). In other words,

y1 = −C1 e^{2x} + C2 e^{5x}
y2 = C1 e^{2x} + 2C2 e^{5x}.

The initial conditions imply that

−C1 + C2 = 2
C1 + 2C2 = 7.

Solving this, we get C1 = 1, C2 = 3. Therefore, the solution of the initial value problem is

y1 = −e^{2x} + 3e^{5x}
y2 = e^{2x} + 6e^{5x}.
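An answer like this can be checked by substituting it back into the system at a few sample points. The sketch below (the derivatives are computed by hand from the closed forms; the function names are mine) does exactly that:

```python
import math

def y1(x): return -math.exp(2 * x) + 3 * math.exp(5 * x)
def y2(x): return math.exp(2 * x) + 6 * math.exp(5 * x)
def y1p(x): return -2 * math.exp(2 * x) + 15 * math.exp(5 * x)  # y1'
def y2p(x): return 2 * math.exp(2 * x) + 30 * math.exp(5 * x)   # y2'

assert y1(0) == 2 and y2(0) == 7  # initial conditions
for x in (0.0, 0.5, 1.0):
    assert abs(y1p(x) - (3 * y1(x) + y2(x))) < 1e-9 * abs(y1p(x))
    assert abs(y2p(x) - (2 * y1(x) + 4 * y2(x))) < 1e-9 * abs(y2p(x))
print("solution checks out")
```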

Exercises

1. Solve the system

y′1 = y1 − 2y2

y′2 = 3y1 + 6y2

2. Solve the system

y′1 = y1 + 2y2

y′2 = −y1 + 4y2

3. Solve the system

y′1 = 2y1 + 3y2

y′2 = 4y1 + y2

y′3 = 4y1 − 4y2 + 5y3

4. Solve the system

y′1 = 3y1 + y2 − 2y3

y′2 = 2y1 + 4y2 − 4y3

y′3 = 2y1 + y2 − y3


5. Solve the initial value problem

y′1 = 4y1 + y2

y′2 = 4y1 + 4y2

y1(0) = 7, y2(0) = 2

6. Solve the initial value problem

y′1 = y1 − 3y2

y′2 = 2y1 + 6y2

y1(0) = −9, y2(0) = 7

7. Consider the equation y′′ + 5y′ − 14y = 0.
(a) Putting y1 = y and y2 = y′, convert the equation to a system and then solve the system.
(b) Verify your answer by solving the equation directly.

8. Consider the equation y′′′ − 2y′′ − y′ + 2y = 0.
(a) Putting y1 = y, y2 = y′ and y3 = y′′, convert the equation to a system and then solve the system.
(b) Verify your answer by solving the equation directly.

9. Consider the mixture problem described in the introduction to this chapter. Suppose that W1 = W2 = 50, R = 3, and S = 5.
(a) Solve the initial value problem given by (1) and (2) in this case.
(b) What limiting values do y1 and y2 approach in the long run?

10. Consider the mixture problem described in the introduction to this chapter.
(a) Solve the initial value problem given by (1) and (2) for arbitrary values of W1, W2, R, and S.
(b) What limiting values do y1 and y2 approach in the long run?

4. Complex Eigenvalues and Nonhomogeneous Systems

In the last section, all the examples involved real eigenvalues and eigenvectors. However, the diagonalization process and Theorem 4.6 remain true for complex eigenvalues and eigenvectors. Specifically, e^{λx}v will be a solution of y′ = Ay whenever λ is an eigenvalue and v is a corresponding eigenvector of the matrix A, even if λ and the entries of v are imaginary. In this situation, it is often desirable, as it was when solving a single equation, to find a basis of the solution space consisting of real valued functions. To this end, we note that if φ is a complex valued solution of y′ = Ay, where A has real entries, then the complex conjugate φ̄ is also a solution of y′ = Ay: taking complex conjugates of both sides of φ′ = Aφ, and using the fact that A is real, gives φ̄′ = Aφ̄. Then, by adding and subtracting φ and φ̄, we see that the real and imaginary parts of φ are also solutions of y′ = Ay.

Example. Solve

y′1 = 3y1 − y2
y′2 = 2y1 + y2


This system is equivalent to y′ = Ay, where

$$A = \begin{pmatrix} 3 & -1 \\ 2 & 1 \end{pmatrix}.$$

To find the eigenvalues of A, we need to solve

$$\begin{vmatrix} \lambda - 3 & 1 \\ -2 & \lambda - 1 \end{vmatrix} = 0.$$

This reduces to λ² − 4λ + 5 = 0. Thus, the eigenvalues of A are λ = 2 ± i. The eigenspace corresponding to λ = 2 + i is gotten by solving ((2 + i)I − A)v = 0. In other words, we must solve

$$\begin{pmatrix} -1 + i & 1 \\ -2 & 1 + i \end{pmatrix} \begin{pmatrix} k_1 \\ k_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$

This is equivalent to the system

(−1 + i)k1 + k2 = 0
−2k1 + (1 + i)k2 = 0.

Since λ = 2 + i is an eigenvalue, this system must have infinitely many solutions, and so one of the equations must be a constant multiple of the other. The reader can check that the second equation is, in fact, 1 + i times the first. Thus, the system reduces to the single equation (−1 + i)k1 + k2 = 0, which can be used to solve for one of the unknowns, leaving the other arbitrary. Let k1 = α. Then k2 = (1 − i)α, and so the eigenspace corresponding to λ = 2 + i consists of all vectors of the form

$$\alpha \begin{pmatrix} 1 \\ 1 - i \end{pmatrix}.$$

We could now find the eigenspace corresponding to λ = 2 − i, but that will not be necessary, as we will see. Note that

$$e^{(2+i)x} \begin{pmatrix} 1 \\ 1 - i \end{pmatrix}$$

is a solution of our system of differential equations and that

$$e^{(2+i)x} \begin{pmatrix} 1 \\ 1 - i \end{pmatrix} = e^{2x}(\cos x + i \sin x) \begin{pmatrix} 1 \\ 1 - i \end{pmatrix} = e^{2x} \begin{pmatrix} \cos x + i \sin x \\ \cos x - i \cos x + i \sin x + \sin x \end{pmatrix} = e^{2x} \begin{pmatrix} \cos x \\ \cos x + \sin x \end{pmatrix} + i e^{2x} \begin{pmatrix} \sin x \\ \sin x - \cos x \end{pmatrix}.$$

By the remarks just before this example, we have that

$$e^{2x} \begin{pmatrix} \cos x \\ \cos x + \sin x \end{pmatrix} \quad \text{and} \quad e^{2x} \begin{pmatrix} \sin x \\ \sin x - \cos x \end{pmatrix}$$

are solutions of the system of differential equations. Furthermore, we can easily see that they are linearly independent. Since the solution space has dimension 2, we conclude that the general solution is

$$\boldsymbol{\varphi}(x) = C_1 e^{2x} \begin{pmatrix} \cos x \\ \cos x + \sin x \end{pmatrix} + C_2 e^{2x} \begin{pmatrix} \sin x \\ \sin x - \cos x \end{pmatrix}.$$


Written in more elementary notation, this becomes

φ1(x) = C1 e^{2x} cos x + C2 e^{2x} sin x
φ2(x) = (C1 − C2) e^{2x} cos x + (C1 + C2) e^{2x} sin x.
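One can verify numerically that the real part of the complex solution really does solve the system, using a centered difference in place of the exact derivative (a sketch with my own helper names):

```python
import math

A = [[3, -1], [2, 1]]

def phi(x):
    """The first real solution found above: e^(2x) [cos x, cos x + sin x]."""
    return [math.exp(2 * x) * math.cos(x),
            math.exp(2 * x) * (math.cos(x) + math.sin(x))]

def phi_prime(x, h=1e-6):
    """Centered-difference approximation of phi'."""
    f0, f1 = phi(x - h), phi(x + h)
    return [(f1[i] - f0[i]) / (2 * h) for i in range(2)]

for x in (0.0, 0.7, 1.3):
    lhs = phi_prime(x)
    rhs = [A[i][0] * phi(x)[0] + A[i][1] * phi(x)[1] for i in range(2)]
    assert all(abs(lhs[i] - rhs[i]) < 1e-4 for i in range(2))
print("real part of the complex solution solves the system")
```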

At first glance, it may seem strange that we did not need to work with both eigenvalues in the last example in order to find the general solution of the system of differential equations. To see, however, how both eigenvalues are at least indirectly used, we can proceed as follows. Consider y′ = Ay, where A is a matrix, all of whose entries are real numbers. Suppose that λ is an imaginary (i.e. complex with nonzero imaginary part) eigenvalue of A and v is an eigenvector in the eigenspace of λ. Since det (λI − A) = 0, we get det (λ̄I − A) = 0 by taking the complex conjugate of the determinant. Thus, λ̄ is an eigenvalue of A. Then (λI − A)v = 0 implies that (λ̄I − A)v̄ = 0, where by v̄ we mean the column vector whose entries are the complex conjugates of the entries of v. Thus, v̄ is an eigenvector in the eigenspace of λ̄. We then have that e^{λx}v and e^{λ̄x}v̄ are solutions of y′ = Ay. But it can be easily checked that e^{λ̄x}v̄ is the complex conjugate of e^{λx}v. By taking linear combinations of these two solutions, we get the two solutions, the real and imaginary parts of e^{λx}v, which we used when writing out the general solution of the last system.

The reader will note that we have not seen how to find the general solution of y′ = Ay when A is not diagonalizable. To get a basis for the solution space, it is necessary to find generalized eigenvectors. These are related to the Jordan canonical form of A, which is a matrix that is similar to A and is 'almost' diagonal. In the case of an eigenvalue λ that is a double root of the characteristic equation and has an eigenspace of dimension one, it can be shown that two linearly independent solutions of the system are e^{λx}v and e^{λx}(xv + u), where v is an eigenvector and u is a generalized eigenvector satisfying (A − λI)u = v. We will not go further into these matters here.

Let's turn now to nonhomogeneous systems. Specifically, consider

y′ = Ay + g(x).

If A is diagonalizable, we can proceed much in the same way as we did in the homogeneous case. Say Λ = P⁻¹AP. We then have A = PΛP⁻¹ and so,

y′ = Ay + g(x) ⇔ y′ = PΛP⁻¹y + g(x)
               ⇔ P⁻¹y′ = ΛP⁻¹y + P⁻¹g(x)
               ⇔ z′ = Λz + P⁻¹g(x),

where z = P⁻¹y. Since Λ is diagonal, we can easily solve z′ = Λz + P⁻¹g(x) for z. Then we can find y from y = Pz.

Example. Solve

y′1 = 2y1 + y2 + 3

y′2 = 2y1 + 3y2 + 6e^x

This system is equivalent to y′ = Ay + g(x), where

$$A = \begin{pmatrix} 2 & 1 \\ 2 & 3 \end{pmatrix} \quad \text{and} \quad \mathbf{g}(x) = \begin{pmatrix} 3 \\ 6e^x \end{pmatrix}.$$


Solving det (λI − A) = 0, we get that the eigenvalues of A are λ = 1, 4. Proceeding as usual, we can see that the eigenspace corresponding to λ = 1 consists of vectors that can be put into the form

$$\alpha \begin{pmatrix} 1 \\ -1 \end{pmatrix},$$

while the eigenspace corresponding to λ = 4 consists of vectors that can be put into the form

$$\beta \begin{pmatrix} 1 \\ 2 \end{pmatrix}.$$

Then we have Λ = P⁻¹AP, where

$$\Lambda = \begin{pmatrix} 1 & 0 \\ 0 & 4 \end{pmatrix} \quad \text{and} \quad P = \begin{pmatrix} 1 & 1 \\ -1 & 2 \end{pmatrix}.$$

For the system of nonhomogeneous differential equations, we will also need P⁻¹. It is easily computed that

$$P^{-1} = \begin{pmatrix} 2/3 & -1/3 \\ 1/3 & 1/3 \end{pmatrix},$$

as the reader can confirm.

Now, we want to solve z′ = Λz + P⁻¹g(x). First, note that

$$P^{-1}\mathbf{g}(x) = \begin{pmatrix} 2/3 & -1/3 \\ 1/3 & 1/3 \end{pmatrix} \begin{pmatrix} 3 \\ 6e^x \end{pmatrix} = \begin{pmatrix} 2 - 2e^x \\ 1 + 2e^x \end{pmatrix}.$$

Thus, we want to solve the system

z′1 = z1 + 2 − 2e^x
z′2 = 4z2 + 1 + 2e^x,

which we rewrite as

z′1 − z1 = 2 − 2e^x
z′2 − 4z2 = 1 + 2e^x.

Let's solve these uncoupled equations by means of integrating factors for first order linear equations. For the first of the two,

µ = e^{∫∗ (−1) dx} = e^{−x}.

Then, multiplying through by µ, we get

e^{−x} z′1 − e^{−x} z1 = 2e^{−x} − 2,

which can be rewritten as (e^{−x} z1)′ = 2e^{−x} − 2. Integration gives e^{−x} z1 = −2e^{−x} − 2x + C1, and so

z1 = −2 − 2xe^x + C1 e^x.

For the second equation, µ = e^{−4x}. Then

z′2 − 4z2 = 1 + 2e^x ⇔ e^{−4x} z′2 − 4e^{−4x} z2 = e^{−4x} + 2e^{−3x}
                     ⇔ (e^{−4x} z2)′ = e^{−4x} + 2e^{−3x}
                     ⇔ e^{−4x} z2 = −(1/4)e^{−4x} − (2/3)e^{−3x} + C2
                     ⇔ z2 = −1/4 − (2/3)e^x + C2 e^{4x}.


Using y = Pz, we conclude that

$$\begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ -1 & 2 \end{pmatrix} \begin{pmatrix} -2 - 2xe^x + C_1 e^x \\ -\frac{1}{4} - \frac{2}{3}e^x + C_2 e^{4x} \end{pmatrix} = \begin{pmatrix} -\frac{9}{4} - 2xe^x - \frac{2}{3}e^x + C_1 e^x + C_2 e^{4x} \\ \frac{3}{2} + 2xe^x - \frac{4}{3}e^x - C_1 e^x + 2C_2 e^{4x} \end{pmatrix}.$$

It should be noted that the solution of the last example can be rewritten as

$$\begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} -\frac{9}{4} - 2xe^x - \frac{2}{3}e^x \\ \frac{3}{2} + 2xe^x - \frac{4}{3}e^x \end{pmatrix} + C_1 e^x \begin{pmatrix} 1 \\ -1 \end{pmatrix} + C_2 e^{4x} \begin{pmatrix} 1 \\ 2 \end{pmatrix},$$

which can be interpreted as a particular solution of the nonhomogeneous equation plus the general solution of the corresponding homogeneous equation. In fact, it is easy to show that the general solution of any system y′ = Ay + g(x) equals a particular solution of y′ = Ay + g(x) plus the general solution of y′ = Ay. Both the annihilator method and the method of variation of parameters can be modified so that they enable one, under appropriate conditions, to find a particular solution of a nonhomogeneous system.
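Setting C1 = C2 = 0 in the solution of the last example gives a particular solution, which can be checked against the original two equations numerically. A sketch (centered-difference derivative; the helper names are mine):

```python
import math

def yp(x):
    """Particular solution of the example, i.e. C1 = C2 = 0."""
    e = math.exp(x)
    return [-9 / 4 - 2 * x * e - 2 / 3 * e,
            3 / 2 + 2 * x * e - 4 / 3 * e]

def deriv(f, x, h=1e-6):
    a, b = f(x - h), f(x + h)
    return [(b[i] - a[i]) / (2 * h) for i in range(2)]

for x in (0.0, 0.5, 1.0):
    y1, y2 = yp(x)
    lhs = deriv(yp, x)
    rhs = [2 * y1 + y2 + 3, 2 * y1 + 3 * y2 + 6 * math.exp(x)]
    assert all(abs(lhs[i] - rhs[i]) < 1e-4 for i in range(2))
print("particular solution satisfies the nonhomogeneous system")
```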

Exercises

1. Solve the system

y′1 = y1 − y2
y′2 = y1 + y2

2. Solve the system

y′1 = 6y1 − 8y2

y′2 = 4y1 − 2y2

3. Solve the system

y′1 = y2

y′2 = −y1
y′3 = y1

4. Solve the initial value problem

y′1 = 4y1 − 5y2

y′2 = 5y1 − 4y2

y1(0) = 1, y2(0) = 5

5. Solve the system

y′1 = 7y1 − 4y2 + 3e^{2x}

y′2 = 5y1 − 2y2

6. Solve the system

y′1 = 3y1 + 7y2 + 2e^x

y′2 = −y1 − 5y2 + 3e^x


7. Solve the system

y′1 = 2y1 + y2 + 5e^x

y′2 = y1 + 2y2 + e^{4x}

8. Solve the system

y′1 = y1 + 2y3 + 4e^x

y′2 = −y2 + y3

y′3 = y1 + 1

9. Consider the equation y′′ − 3y′ + 2y = 4x.
(a) Putting y1 = y and y2 = y′, convert the equation to a system and then solve the system.
(b) Verify your answer by solving the equation directly.

10. Prove that if ψ and ψp are solutions of the system y′ = A(x)y + g(x), then ψ = ψp + φ, where φ is a solution of y′ = A(x)y.

11. (a) If α is not an eigenvalue of A, prove that φ(x) = e^{αx}u is a solution of y′ = Ay + e^{αx}v if and only if u = (αI − A)⁻¹v. (Be sure to point out how you are using α not being an eigenvalue.) This is a special case of the method of undetermined coefficients (the annihilator method) for systems.
(b) Use the method outlined here to solve Number 6 of this exercise set.

12. (a) Suppose that φ[1], . . . , φ[n] form a basis for the solution space of the system y′ = A(x)y. Note that φ(x) = C1φ[1](x) + · · · + Cnφ[n](x) can be rewritten as φ(x) = Φ(x)C, where Φ is the square matrix whose columns are φ[1], . . . , φ[n] and C is the column vector with entries C1, . . . , Cn. Also, note that the matrix equation Φ′ = AΦ holds. Prove that ψ(x) = Φ(x)u(x) is a solution of y′ = A(x)y + g(x) if and only if u(x) = ∫∗ Φ⁻¹(x)g(x) dx. This is the method of variation of parameters for systems.
(b) Use the method outlined here to solve Number 5 of this exercise set.

13. Consider a mass-spring system situated along a straight line that consists of a spring with spring constant k1 which is stationary at one end and has a mass m1 at the other end and a second spring, with spring constant k2, extending from the mass m1 to a second mass m2. Let s1 and s2 measure the positions of m1 and m2 at time t, with s1 = 0, s2 = 0 corresponding to the springs being at their natural lengths, neither stretched nor compressed. With the aid of Hooke's law, it can be shown that s1 and s2 satisfy the equations

m1 s′′1 = −k1 s1 + k2(s2 − s1)
m2 s′′2 = −k2(s2 − s1).

Suppose that m1 = m2 = 1, k1 = 3, and k2 = 2.
(a) Show that s1 and s2 satisfy the second order equations

s′′1 = −5s1 + 2s2
s′′2 = 2s1 − 2s2.


(b) By letting y1 = s1, y2 = s′1, y3 = s2, and y4 = s′2, convert these second order equations into a system of four first order equations.
(c) Solve the system of four first order equations.
(d) Use your answer to part (c) to give s1 and s2 as functions of time.

Here is a quick way of solving a simple linear system. Suppose, for example, we have

y′1 = 4y1 + 3y2
y′2 = 2y1 + 5y2.

Let D be the differentiation operator. The system becomes

Dy1 = 4y1 + 3y2
Dy2 = 2y1 + 5y2,

and so,

(D − 4)y1 − 3y2 = 0
−2y1 + (D − 5)y2 = 0.

(D − 4, for example, could be written as D − 4I, where I is the identity operator.) To eliminate y2 in this system, take D − 5 of the first equation and 3 times the second, and then add. We get (D − 5)(D − 4)y1 − 3(D − 5)y2 = 0 and −6y1 + 3(D − 5)y2 = 0, which when added produce (D − 5)(D − 4)y1 − 6y1 = 0. This simplifies to (D² − 9D + 14)y1 = 0. In other words, y′′1 − 9y′1 + 14y1 = 0. Using the characteristic polynomial, we get y1 = C1 e^{2x} + C2 e^{7x}. Substituting this into the equation (D − 4)y1 − 3y2 = 0, we get y2 = −(2/3)C1 e^{2x} + C2 e^{7x}.

14. Use the method of elimination to solve the system

y′1 = 2y1 + y2 + x²

y′2 = 2y1 + 3y2

15. Use the method of elimination to solve the system

y′1 = y1 − y2

y′2 = y1 + 3y2

16. Use the method of elimination to solve the system

y′1 = −y1 − y2

y′2 = 2y1 − 3y2

17. Use the method of elimination to solve the system

y′1 = y1 + 5y2 + 12

y′2 = −y1 − 3y2

5. Nonlinear Systems

We would like to solve the system of differential equations

y′1 = f1(x, y1, . . . , yn)

y′2 = f2(x, y1, . . . , yn)

...

y′n = fn(x, y1, . . . , yn)


where f1, f2, . . . , fn are continuous functions on some set. This can be written using vector notation as y′ = f(x, y), and the corresponding initial value problem can be written as

(11) y′ = f(x, y), y(x0) = β0,

where β0 is a column vector of constants. Just as in the case of a single equation, the existence and uniqueness of a solution can be shown under fairly weak hypotheses by converting to an equivalent system of integral equations. Both the statements of the results and their proofs are almost identical to the single equation case, and we will not go through the details here. We will simply note that if f1, . . . , fn are continuous and have continuous first partial derivatives with respect to y1, . . . , yn in a neighborhood of (x0, β0), then (11) has a unique solution near x0.

When attempting to solve initial value problems for systems of nonlinear equations, numerical methods are normally used. The numerical methods that we encountered earlier can rather easily be modified to handle systems. We will restrict ourselves here to Euler's method. In order to try to avoid dealing with conflicting uses of subscripts and superscripts, we will change our notation. Let's assume that x and y are functions of the independent variable t.

Example. Consider the initial value problem

x′ = (1/2)x² + y
y′ = x − 2y

x(0) = 0, y(0) = 4

on the interval 0 ≤ t ≤ 1. We will assume, without proof, that a solution exists; therefore, when we view our results, we should look for any peculiar behavior that might indicate that our assumption is false. Let's choose step size h = 0.25 and put f(t, x, y) = (1/2)x² + y, g(t, x, y) = x − 2y. The solution can be thought of as being given by two graphs, one showing x as a function of t and the other showing y as a function of t. We will approximate each of these graphs by a polygonal line, one given by X(t) and the other by Y(t).

The first step involves drawing a line segment between t = 0 and t = 0.25 that is tangent to the t, x graph at (t0, x0) = (0, 0). From the first differential equation, the slope will be x′(0) = f(0, 0, 4) = 4, and then

X(0.25) = 0 + (0.25)(4) = 1.

Also, we draw a line segment between t = 0 and t = 0.25 that is tangent to the t, y graph at (t0, y0) = (0, 4). From the second differential equation, the slope will be y′(0) = g(0, 0, 4) = −8, and then

Y(0.25) = 4 + (0.25)(−8) = 2.

See the top row in Figure 1. Going from t = 0.25 to t = 0.5, we would like the slopes of our line segments to be

x′(0.25) = f(0.25, x(0.25), y(0.25)) and y′(0.25) = g(0.25, x(0.25), y(0.25)),

but since x and y are unknown we choose

f(0.25, X(0.25), Y(0.25)) and g(0.25, X(0.25), Y(0.25))

Page 127: A Concise Introduction to Ordinary Di erential Equationshcmth018/odebook.pdf · 2018-10-17 · 1 A Concise Introduction to Ordinary Di erential Equations David Protas California State

5. NONLINEAR SYSTEMS

Figure 1. x′ = (1/2)x² + y, y′ = x − 2y, x(0) = 0, y(0) = 4. (Four rows of plots, one row per Euler step: the polygonal approximations X(t) and Y(t) against t on [0, 1].)


4. SYSTEMS OF DIFFERENTIAL EQUATIONS

Figure 2. x′ = (1/2)x² + y, y′ = x − 2y, x(0) = 0, y(0) = 4. (The approximation X plotted against Y.)

instead. We get

X(0.5) = X(0.25) + (0.25)f(0.25, 1, 2) = 1 + (0.25)(2.5) = 1.625

Y (0.5) = Y (0.25) + (0.25)g(0.25, 1, 2) = 2 + (0.25)(−3) = 1.25

See the second row in Figure 1. Continuing in this way, we get

X(0.75) = X(0.5) + (0.25)f(0.5, 1.625, 1.25) = 1.625 + (0.25)(2.57) = 2.268

Y (0.75) = Y (0.5) + (0.25)g(0.5, 1.625, 1.25) = 1.25 + (0.25)(−0.875) = 1.031

(see the third row in Figure 1) and then

X(1) = X(0.75) + (0.25)f(0.75, 2.268, 1.031) = 2.268 + (0.25)(3.603) = 3.168

Y (1) = Y (0.75) + (0.25)g(0.75, 2.268, 1.031) = 1.031 + (0.25)(0.206) = 1.083

(see the bottom row in Figure 1), rounded off to three places past the decimal point. Our results can be summarized in the following table.

tk     Xk     Yk
0.00   0      4
0.25   1      2
0.50   1.625  1.25
0.75   2.268  1.031
1.00   3.168  1.083

Also, it can be of interest to see directly how the variables x and y relate to each other. (This will be covered more extensively in the next chapter.) The graph of X versus Y shown in Figure 2 approximates that relationship.


More generally, to solve an initial value problem

x′ = f(t, x, y)

y′ = g(t, x, y)

x(t0) = x0, y(t0) = y0

on an interval [t0, b] by means of Euler's method, we proceed as follows. Choose a number of steps n and the corresponding step size h = (b − t0)/n. Set X0 = x(t0) and Y0 = y(t0). For k = 1, 2, . . . , n, put tk = t0 + kh. Then we define inductively

Xk = Xk−1 + hf(tk−1, Xk−1, Yk−1)

Yk = Yk−1 + h g(tk−1, Xk−1, Yk−1)

for k = 1, 2, . . . , n. Equivalently,

[Xk]   [Xk−1]       [f(tk−1, Xk−1, Yk−1)]
[Yk] = [Yk−1] + h · [g(tk−1, Xk−1, Yk−1)].

Putting X(tk) = Xk and Y(tk) = Yk and then interpolating linearly, we get the approximate solution

[X(t)]
[Y(t)]

for t0 ≤ t ≤ b.

The reader can easily see how to generalize this method to systems of more than two first order equations.
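The procedure just described can be sketched in a few lines of code; the following is a minimal illustration (function and variable names are ours, not the text's):

```python
def euler_system(f, g, t0, x0, y0, b, n):
    """Euler's method for x' = f(t, x, y), y' = g(t, x, y),
    x(t0) = x0, y(t0) = y0, on the interval [t0, b] with n steps."""
    h = (b - t0) / n
    ts, Xs, Ys = [t0], [x0], [y0]
    for k in range(1, n + 1):
        t, X, Y = ts[-1], Xs[-1], Ys[-1]
        Xs.append(X + h * f(t, X, Y))   # X_k = X_{k-1} + h f(t_{k-1}, X_{k-1}, Y_{k-1})
        Ys.append(Y + h * g(t, X, Y))   # Y_k = Y_{k-1} + h g(t_{k-1}, X_{k-1}, Y_{k-1})
        ts.append(t0 + k * h)
    return ts, Xs, Ys

# The first example of this section: x' = (1/2)x^2 + y, y' = x - 2y,
# x(0) = 0, y(0) = 4, with h = 0.25 on [0, 1].
ts, Xs, Ys = euler_system(lambda t, x, y: 0.5 * x**2 + y,
                          lambda t, x, y: x - 2 * y,
                          0.0, 0.0, 4.0, 1.0, 4)
```

Linear interpolation between the computed points (tk, Xk) and (tk, Yk) then gives the polygonal approximations X(t) and Y(t).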

Example. Consider the initial value problem

x′ = 2x+ y + 3

y′ = 2x+ 3y + 6et

x(0) = −3, y(0) = 1/4

on the interval 0 ≤ t ≤ 2. This nonhomogeneous linear system was solved in the last section. We will apply Euler's method to the initial value problem and check how accurate our answer is by comparing it with the solution found in the last section. Let's choose step size h = 0.5 and put f(t, x, y) = 2x + y + 3, g(t, x, y) = 2x + 3y + 6e^t. We compute

X(0.5) = x(0) + (0.5)f(0,−3, 1/4) = −3 + (0.5)(−2.75) = −4.375

Y (0.5) = y(0) + (0.5)g(0,−3, 1/4) = 0.25 + (0.5)(0.75) = 0.625

and then

X(1.0) = X(0.5) + (0.5)f(0.5,−4.375, 0.625) = −4.375 + (0.5)(−5.125) = −6.938

Y (1.0) = Y (0.5) + (0.5)g(0.5,−4.375, 0.625) = 0.625 + (0.5)(3.0173) = 2.134

and then

X(1.5) = X(1.0) + (0.5)f(1.0,−6.938, 2.134) = −6.938 + (0.5)(−8.742) = −11.308

Y (1.5) = Y (1.0) + (0.5)g(1.0,−6.938, 2.134) = 2.134 + (0.5)(8.8357) = 6.552


and then

X(2.0) = X(1.5) + (0.5)f(1.5,−11.308, 6.552)

= −11.308+(0.5)(−13.064) = −17.841

Y (2.0) = Y (1.5) + (0.5)g(1.5,−11.308, 6.552) = 6.552 + (0.5)(23.9301) = 18.516

rounded off to three places past the decimal point. To see how accurate this is, let's recall that the general solution of this system is

x = −9/4 − 2te^t − (2/3)e^t + C1e^t + C2e^{4t}
y = 3/2 + 2te^t − (4/3)e^t − C1e^t + 2C2e^{4t}.

It is an elementary exercise to see that the initial conditions imply that C1 = −1/12 and C2 = 0, so that then

x = −9/4 − 2te^t − (3/4)e^t
y = 3/2 + 2te^t − (5/4)e^t.

This gives x(2) = −37.348, y(2) = 21.820 (again rounded off to three places past the decimal point). Not surprisingly, Euler's method with the large step size of h = 0.5 has given us results that are quite inaccurate. Euler's method with h = 0.01 gives X(2) = −34.064, Y(2) = 27.056, which is still not particularly close to the actual values x(2) and y(2). Let us note that the fourth order Runge-Kutta method for systems with h = 0.01 gives X(2) = −37.348, Y(2) = 21.820.
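The h = 0.5 run and the exact values can be reproduced in a few lines; this sketch (helper names ours) runs Euler's method and evaluates the exact solution with C1 = −1/12, C2 = 0:

```python
import math

def euler_system(f, g, t0, x0, y0, b, n):
    """Euler's method for the pair x' = f(t, x, y), y' = g(t, x, y)."""
    h = (b - t0) / n
    t, X, Y = t0, x0, y0
    for _ in range(n):
        X, Y = X + h * f(t, X, Y), Y + h * g(t, X, Y)
        t += h
    return X, Y

f = lambda t, x, y: 2 * x + y + 3
g = lambda t, x, y: 2 * x + 3 * y + 6 * math.exp(t)
X2, Y2 = euler_system(f, g, 0.0, -3.0, 0.25, 2.0, 4)      # h = 0.5

# Exact solution with C1 = -1/12, C2 = 0:
x_exact = lambda t: -9/4 - 2 * t * math.exp(t) - (3/4) * math.exp(t)
y_exact = lambda t: 3/2 + 2 * t * math.exp(t) - (5/4) * math.exp(t)
```

Rounded to three places this reproduces X(2) ≈ −17.841, Y(2) ≈ 18.516 against the exact x(2) ≈ −37.348, y(2) ≈ 21.820.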

We now return to higher order equations. For any integer n ≥ 2, the equation

(12)  x^(n) = g(t, x, x′, . . . , x^(n−1))

can be converted to a system of first order equations as follows. Put x1 = x, x2 = x′, . . . , xn = x^(n−1). Then, (12) is equivalent to the system

(13)

x′1 = x2

x′2 = x3

...

x′n−1 = xn

x′n = g(t, x1, x2, . . . , xn).

If (12) is a linear equation, then (13) is a linear system; a couple of examples of this appeared in the last two exercise sets. If (12) is not linear, then the standard way of finding a solution (to a corresponding initial value problem) is by means of applying a numerical method to the equivalent initial value problem for the system (13).

Example. Consider the initial value problem

x′′ = tx′ − x2

x(0) = 0, x′(0) = 1.5


on the interval [0, 2]. By setting x = x and y = x′, we get the problem

x′ = y

y′ = ty − x2

x(0) = 0, y(0) = 1.5

which can be solved using numerical methods. If we apply Euler's method to this, we get an approximate solution X(t) (and also Y(t)). The reader can check with the aid of appropriate software that if we choose step size h = 0.01, then the following table gives values of X(t) (corresponding to k = 0, 50, 100, 150, 200).

tk    Xk
0.0   0
0.5   0.769
1.0   1.564
1.5   2.091
2.0   1.662
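The conversion-plus-Euler procedure for a second order equation can be sketched as follows (helper names ours), applied to the initial value problem above:

```python
def euler_second_order(g2, t0, x0, xp0, b, n):
    """Apply Euler's method to x'' = g2(t, x, x') via the equivalent
    system x' = y, y' = g2(t, x, y); returns a table {t_k: X_k}."""
    h = (b - t0) / n
    t, x, y = t0, x0, xp0
    table = {round(t, 10): x}
    for _ in range(n):
        x, y = x + h * y, y + h * g2(t, x, y)   # both updates use the old x, y
        t += h
        table[round(t, 10)] = x
    return table

# x'' = t x' - x^2, x(0) = 0, x'(0) = 1.5, with h = 0.01 on [0, 2]
X = euler_second_order(lambda t, x, xp: t * xp - x**2, 0.0, 0.0, 1.5, 2.0, 200)
```

The values X[0.5], X[1.0], X[1.5], X[2.0] should match the table above to within rounding.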

Exercises

1. Use Euler’s method with h = 0.1 (no computer) to approximate x(0.3)and y(0.3) if x and y are solutions of the initial value problem x′ = xy,y′ = x+ y, x(0) = 1, y(0) = 0.

2. Use Euler’s method with h = 0.5 (no computer) to approximate x(1.5)and y(1.5) if x and y are solutions of the initial value problem x′ = y2 +1,y′ = x+ y, x(0) = 0, y(0) = 0.

3. Using a computer, repeat exercise 1 with h = 0.005.

4. Using a computer, repeat exercise 2 with h = 0.1.

5. Use Euler’s method with h = 0.5 (no computer) to approximate x and ywith a table of values tk, Xk, Yk for 0 ≤ t ≤ 2 if x and y are solutions ofthe initial value problem x′ = ty, y′ = x+ y, x(0) = 1, y(0) = 3.

6. Use Euler’s method with h = 0.25 (no computer) to approximate x and ywith a table of values tk, Xk, Yk for 1 ≤ t ≤ 2 if x and y are solutions ofthe initial value problem x′ = xy + t, y′ = x− 2y, x(1) = 0, y(1) = 1.

7. Using a computer, repeat exercise 5 with h = 0.01.

8. Using a computer, repeat exercise 6 with h = 0.01.

9. Use Euler’s method with h = 0.25 (no computer) to approximate x witha table of values tk, Xk for 0 ≤ t ≤ 0.75 if x is the solution of the initialvalue problem x′′ = 2x3x′ + t2, x(0) = 0, x′(0) = 1.

10. Use Euler’s method with h = 0.5 (no computer) to approximate x with atable of values tk, Xk for 0 ≤ t ≤ 2 if x is the solution of the initial valueproblem x′′ + tx2 = x′, x(0) = 2, x′(0) = 1.

11. Using a computer, repeat exercise 9 with h = 0.025.


12. Using a computer, repeat exercise 10 with h = 0.04.

13. Consider an idealized damped pendulum consisting of a unit mass at the end of a rigid rod of unit length and zero mass. Its motion is given by θ′′ + cθ′ + g sin θ = 0, where g is the (positive) acceleration due to gravity, c is a damping constant, and θ is the angle the pendulum makes with the vertical, and where differentiation is with respect to time t. Use Euler's method (h = 0.025) on a computer to approximate θ with a table of values tk, θk for 0 ≤ t ≤ 3, where θ is the solution of the initial value problem θ′′ + 4θ′ + 32 sin θ = 0, θ(0) = 0.4, θ′(0) = 0.

14. The motion of the idealized damped pendulum given in the last problem can be reasonably modeled by the linear equation θ′′ + cθ′ + gθ = 0 if θ is small. Solve directly (no numerical approximation) the initial value problem θ′′ + 4θ′ + 32θ = 0, θ(0) = 0.4, θ′(0) = 0. In particular, what do you get for θ(3)? Compare this to the approximation obtained in the previous exercise.


CHAPTER 5

The Geometry of Systems

1. Preliminary Remarks

Suppose we are asked to see how the populations of two species in a closed environment are related to each other. For example, let x be the number of rabbits (in thousands) and let y be the number of foxes (in tens) in a certain woods at time t. Assume that this is modeled by the system

(1)  x′ = 5x − x² − xy
     y′ = xy − 2y.

This is an example of a predator-prey problem.

Before proceeding any further, let's note some ways this system fits the physical situation. If there are no rabbits (x = 0), then the second equation becomes y′ = −2y, which implies y = Ce^{−2t}, and so the foxes die out, presumably from starvation. On the other hand, if there are no foxes (y = 0), then the first equation becomes x′ = 5x − x², a first order separable equation whose solution is x = 5Ce^{5t}/(1 + Ce^{5t}); the number of rabbits would then approach 5000 in the long run, presumably because 5000 rabbits is the most the vegetation can support. More generally, the second equation indicates that a lot of rabbits ⇒ x is large ⇒ y′ is positive ⇒ the number of foxes grows; the first equation indicates that a lot of foxes ⇒ y is large ⇒ x′ is negative ⇒ the number of rabbits decreases.

Given an initial number of rabbits and number of foxes, we would like to see how these two populations vary with respect to each other over time. Although we cannot solve (1), we will be able to get insight into the problem by solving an associated linear system, and we will use this insight in interpreting a graphical representation of the relationship between x and y.
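Both one-species observations can be spot-checked numerically; the sketch below (with an arbitrarily chosen constant C) verifies the claimed solutions by central difference quotients:

```python
import math

C = 0.8        # an arbitrary sample value of the constant of integration
h = 1e-6       # step for the central difference quotients

# No rabbits: y' = -2y has solution y = C e^{-2t} (the foxes die out).
y = lambda t: C * math.exp(-2 * t)
# No foxes: x' = 5x - x^2 has logistic solution x = 5C e^{5t}/(1 + C e^{5t}).
x = lambda t: 5 * C * math.exp(5 * t) / (1 + C * math.exp(5 * t))

for t in (0.0, 0.4, 1.0):
    dy = (y(t + h) - y(t - h)) / (2 * h)
    dx = (x(t + h) - x(t - h)) / (2 * h)
    assert abs(dy + 2 * y(t)) < 1e-4                 # y' = -2y
    assert abs(dx - (5 * x(t) - x(t) ** 2)) < 1e-4   # x' = 5x - x^2

assert abs(x(3.0) - 5.0) < 0.01   # x approaches 5 (i.e., 5000 rabbits)
```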

2. The Phase Plane

Consider the system

(2)  x′ = f(t, x, y)
     y′ = g(t, x, y)

of differential equations, where x and y are unknown functions of t. We will assume that f and g are continuous and have continuous first partial derivatives with respect to x and y in some region of t, x, y-space. As noted in the last chapter, this is sufficient to ensure that for any (t0, x0, y0) in the interior of the region, the corresponding initial value problem will have a solution close to t0. Moreover, if the region is of the form

R = {(t, x, y) : |t − t0| ≤ a, |x − x0| ≤ b, |y − y0| ≤ c}


and if I is any subinterval of {t : |t − t0| ≤ a} containing t0, then any two solutions on I of the initial value problem must coincide throughout I. It is, however, often impossible to solve explicitly for x and y. Still, we will attempt to find out some general information about the functions x and y and how they relate to each other. In particular, it can be of interest to determine whether x and y are bounded, whether they have a limit as t → ∞, and whether small changes in f or g or in initial conditions will result in only small changes in x and y.

We will restrict our attention to systems for which f and g are independent of t. These are called autonomous systems. That is,

(3)  x′ = f(x, y)
     y′ = g(x, y)

is said to be an autonomous system. To get an idea of how restrictive this condition is, it is helpful to see which linear systems are autonomous. The reader can easily check that a linear system is autonomous if and only if it has constant coefficients and is either homogeneous or nonhomogeneous with a constant forcing function.

For any value of t, (x(t), y(t)) can be thought of as a point in a plane, called the phase plane. As t varies, (x(t), y(t)) traces out a curve in the phase plane, called a trajectory. A representative collection of trajectories of a system in the phase plane is called a phase portrait for the system. Recall that trajectories for a single first order differential equation were defined much earlier. The reader should compare the two uses of the word 'trajectory.'

Example. Consider the linear system

(4)  x′ = 4y
     y′ = x.

This system is easily seen to have eigenvalues λ = 2, −2, which lead to the solution

[x]            [2]             [−2]
[y] = C1e^{2t} [1] + C2e^{−2t} [ 1].

In other words,

x = 2C1e^{2t} − 2C2e^{−2t}
y = C1e^{2t} + C2e^{−2t}.

Various initial conditions will lead to various values of C1 and C2. If C2 = 0, then x = 2C1e^{2t}, y = C1e^{2t}, and we get the trajectory 2y = x, x > 0 for C1 > 0, the trajectory 2y = x, x < 0 for C1 < 0, and the trajectory y = x = 0 for C1 = 0. Likewise, if C1 = 0, then x = −2C2e^{−2t}, y = C2e^{−2t}, and we get the trajectory 2y = −x, x < 0 for C2 > 0, the trajectory 2y = −x, x > 0 for C2 < 0, and the trajectory y = x = 0 for C2 = 0. More generally, x² − 4y² = −16C1C2, which is a hyperbola whenever C1C2 ≠ 0; for each choice of nonzero C1 and C2, we can check that the trajectory is one branch of the hyperbola. A phase portrait for this system is shown in Figure 1. The arrowheads indicate the direction of increasing t.

Now, here is an alternate way of finding trajectories, one that does not require us to solve the system. Assuming x′ ≠ 0, the chain rule implies that

dy/dx = (dy/dt)/(dx/dt) = y′/x′.



Figure 1. x′ = 4y, y′ = x

Then our system gives

dy/dx = x/(4y).

This first order separable equation has solutions 4y² = x² + C, which is equivalent to what we got above. To determine the direction of increasing t, we examine the signs of x′ and y′ throughout the phase plane. In the first quadrant, 4y and x are both positive, and so (4) implies x′ and y′ are both positive. Thus, the direction corresponding to increasing t is to the right and upward, in conformance to what we found in our first approach, where we could simply look at the solutions to see what happens as t increases. The other three quadrants are handled similarly.
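The two approaches agree, and this can be checked numerically; the sketch below (helper names ours) verifies that the explicit solutions satisfy the system and stay on the hyperbola x² − 4y² = −16C1C2:

```python
import math

def sol(t, C1, C2):
    """x = 2 C1 e^{2t} - 2 C2 e^{-2t}, y = C1 e^{2t} + C2 e^{-2t}."""
    x = 2 * C1 * math.exp(2 * t) - 2 * C2 * math.exp(-2 * t)
    y = C1 * math.exp(2 * t) + C2 * math.exp(-2 * t)
    return x, y

C1, C2, h = 1.5, -0.7, 1e-6    # arbitrary sample constants
for t in (0.0, 0.3, 1.0):
    x, y = sol(t, C1, C2)
    dx = (sol(t + h, C1, C2)[0] - sol(t - h, C1, C2)[0]) / (2 * h)
    dy = (sol(t + h, C1, C2)[1] - sol(t - h, C1, C2)[1]) / (2 * h)
    assert abs(dx - 4 * y) < 1e-3 and abs(dy - x) < 1e-3   # x' = 4y, y' = x
    assert abs(x * x - 4 * y * y + 16 * C1 * C2) < 1e-7    # one fixed hyperbola
```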

We should note that we were fortunate in our last example to be led to a separable equation. More generally, a system

x′ = ax+ by

y′ = cx+ dy,

where a, b, c, and d are constants, leads to a first order homogeneous equation (see the problem set for Section 1.3). For an arbitrary autonomous system (3), there is no guarantee that we can solve the resulting first order equation in x and y.

Another way of geometrically representing the solutions of an autonomous system (3) is by means of a direction field. In a manner similar to what was done for the direction field of a single first order equation, which we encountered much earlier, we proceed as follows. Consider the set of points (xk, yk) in the phase plane, where xk = xk−1 + ∆x and yk = yk−1 + ∆y for fixed positive numbers ∆x, ∆y. Through each point (xk, yk), draw a short line segment with slope equal to g(xk, yk)/f(xk, yk). Since the derivative

dy/dx = g(xk, yk)/f(xk, yk)

gives the slope of the tangent line, the line segment is tangent at (xk, yk) to the trajectory passing through (xk, yk). Thus, by using the line segments as a guide, we can trace out trajectories of solutions, at least approximately. We augment the



Figure 2. x′ = 4y, y′ = x

line segments with arrowheads to indicate the direction corresponding to increasing t.

Example. The system

x′ = 4y

y′ = x.

has the direction field shown in Figure 2. The reader should compare the direction field with the phase portrait for this system, given in Figure 1.
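Tabulating such a direction field is mechanical; the sketch below (helper name ours) records the slope g/f at each grid point, with None marking a vertical segment:

```python
def slope_grid(f, g, xs, ys):
    """Slope dy/dx = g(x, y)/f(x, y) of the segment drawn at each grid
    point of the phase plane; None where f vanishes (vertical segment)."""
    return {(x, y): (g(x, y) / f(x, y) if f(x, y) != 0 else None)
            for x in xs for y in ys}

# Direction field of x' = 4y, y' = x on a small grid, as in Figure 2.
field = slope_grid(lambda x, y: 4 * y, lambda x, y: x, (-1, 0, 1), (-1, 0, 1))
```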

Example. Consider the system

x′ = −4xy

y′ = 2x² − x.

Although this system is not linear, we can still draw a phase portrait of it using the second method that was applied to system (4). The chain rule gives us

dy/dx = (2x² − x)/(−4xy) = −(2x − 1)/(4y),

for x ≠ 0, y ≠ 0. Then 4y dy = −(2x − 1) dx and so 2y² = −x² + x + C. The trajectories are therefore parts of the ellipses

(x − 1/2)² + 2y² = C + 1/4.

A phase portrait for this system is given in Figure 3. Again, the direction of increasing t, as indicated by the arrowheads, can be found by examining the signs of x′ and y′ throughout the phase plane. For example, if x > 1/2 and y > 0, then x′ = −4xy < 0 and y′ = 2x² − x > 0, and so the direction of increasing t is to the left and upward; if 0 < x < 1/2 and y > 0, then x′ = −4xy < 0 and y′ = 2x² − x < 0, and so the direction of increasing t is to the left and downward. There are four other regions, which are handled similarly. Also, note that the point (1/2, 0), which corresponds to C = −1/4, is the trajectory of the constant solution x ≡ 1/2, y ≡ 0. Likewise, for each b, the point (0, b) is the trajectory of



Figure 3. x′ = −4xy, y′ = 2x² − x

the constant solution x ≡ 0, y ≡ b. For each C with −1/4 < C < 0, the ellipse (x − 1/2)² + 2y² = C + 1/4 (traced out in the counterclockwise direction and repeatedly wrapping around on itself) is a trajectory. For each C with C > 0, the ellipse (x − 1/2)² + 2y² = C + 1/4 consists of four trajectories, one being the part of the ellipse to the right of the y-axis, a second being the part of the ellipse to the left of the y-axis, and two others that are single points corresponding to the constant solutions x ≡ 0, y ≡ b = √(C/2) and x ≡ 0, y ≡ b = −√(C/2). The case C = 0 is left to the reader.
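That the ellipses really are traced out by solutions can be seen without solving the system: along any solution, the derivative of (x − 1/2)² + 2y² is 2(x − 1/2)x′ + 4yy′ = −8xy(x − 1/2) + 4xy(2x − 1) = 0. A quick numerical spot check of this identity (sample points chosen arbitrarily):

```python
# x' = -4xy, y' = 2x^2 - x; E = (x - 1/2)^2 + 2y^2 is constant on solutions.
f = lambda x, y: -4 * x * y
g = lambda x, y: 2 * x ** 2 - x

for x in (-0.5, 0.0, 0.25, 1.0, 2.0):
    for y in (-1.0, 0.0, 0.5, 1.5):
        dE = 2 * (x - 0.5) * f(x, y) + 4 * y * g(x, y)   # dE/dt by the chain rule
        assert dE == 0.0
```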

We now want to see how a system being autonomous endows any phase portrait of it with certain desirable properties. Let

[φ(t)]
[ψ(t)]

be a solution of the autonomous system (3) for −∞ < t < ∞. For any real number a, put φ̃(t) = φ(t − a) and ψ̃(t) = ψ(t − a). If s = t − a, the chain rule says φ̃′(t) = φ′(s). Thus, φ̃′(t) = f(φ(s), ψ(s)) = f(φ̃(t), ψ̃(t)). Similarly, ψ̃′(t) = g(φ̃(t), ψ̃(t)). Thus,

[φ̃(t)]
[ψ̃(t)]

is also a solution of (3). Furthermore, it is clear that (φ(t), ψ(t)) and (φ̃(t), ψ̃(t)) trace out the same curve (different parameterizations) in the phase plane. Note that if t0 is any real number and if we put t̃0 = t0 − a, then φ̃(t0) = φ(t̃0) and ψ̃(t0) = ψ(t̃0). Therefore, if a certain curve is a trajectory in the phase plane for (3) and (x0, y0) is any point on the curve, then there is a solution of (3) corresponding to the trajectory that passes through (x0, y0) at any t (t0 or t̃0) that we choose. We summarize this by saying that trajectories of an autonomous system are independent of time t.

Example. Consider the system

x′ = 4y

y′ = x.


again. From its phase portrait we see that, for example, if there is ever a time t0 when x = 2 and y = −1, then x → 0 and y → 0 as t → ∞, and this is true regardless of what that time t0 is.

Theorem 5.1. Let f and g be continuous functions with continuous first partial derivatives in some region Ω of the x, y-plane. Then different trajectories in Ω of the autonomous system (3) cannot intersect at any interior point of Ω.

Proof. Suppose that two different trajectories in the phase plane for the system (3) intersect at an interior point of Ω. Then we can find a point (x0, y0) and a rectangle

R0 = {(x, y) : |x − x0| ≤ b, |y − y0| ≤ c}

such that the trajectories intersect at (x0, y0) and are not identical on R0. Say

[φ1(t)]       [φ2(t)]
[ψ1(t)]  and  [ψ2(t)]

are solutions of (3) tracing out these trajectories. As noted above, we can choose the solutions so that they both pass through the point (x0, y0) at the same time t0. Then they are both solutions of the initial value problem

(5)  x′ = f(x, y)
     y′ = g(x, y)
     x(t0) = x0, y(t0) = y0.

However, our conditions on f and g are enough to guarantee that (5) has only one solution. (See the first paragraph of Chapter 4, Section 5.) Therefore, if the trajectories intersected, they would coincide.

Notice how the proof of the last theorem breaks down if we try to apply it to a non-autonomous system. There are, in fact, many non-autonomous systems with trajectories that intersect. For a specific example, see the following problem set. As we will soon see, there are autonomous systems (3) with solutions that are periodic. The corresponding trajectories must then wrap over themselves. However, trajectories arising from solutions of (3) cannot intersect at isolated points; this also is dealt with in the following problem set.

Exercises

1. Consider the system

x′ = 2y

y′ = 8x.

(a) Using eigenvalues and eigenvectors, solve this linear system.
(b) Describe the trajectories of this system by solving a single equation in x and y.
(c) Show that the solutions from part (a) satisfy the trajectories from part (b).
(d) Draw a phase portrait of the system (as always, indicating the direction of increasing t by arrowheads).



Figure 4. Direction field

2. Consider the system

x′ = 2y

y′ = −8x.

(a) Using eigenvalues and eigenvectors, solve this linear system.
(b) Show that the trajectories of this system are ellipses by solving a single equation in x and y.
(c) Show that the solutions from part (a) satisfy the trajectories from part (b).
(d) Draw a phase portrait of the system.

3. Draw a phase portrait of the system

x′ = y

y′ = 2xy.

4. Draw a phase portrait of the system

x′ = xy + y

y′ = xy2 + y2.

5. Figure 4 shows the direction field for an autonomous system of two equations. Use it to sketch a phase portrait of the system.

6. Verify that for each C1 and C2,

[φ(t)]   [t³ + C1]
[ψ(t)] = [C2e^t]

is a solution of

x′ = 3t²
y′ = y,


and show that the solution with C1 = 0, C2 = 1 and the solution with C1 = −8, C2 = e^{−2} both equal

[0]
[1]

for some (different) t. What does this say about the corresponding trajectories that would be impossible for an autonomous system?

7. Let

[φ(t)]
[ψ(t)]

be a solution of the autonomous system (3), where f and g are continuous functions with continuous first partial derivatives in some rectangle R0 in x, y-space centered at (x0, y0). Suppose x0 = φ(a) = φ(b) and y0 = ψ(a) = ψ(b) for some a ≠ b. Put φ̃(t) = φ(t − t∗) and ψ̃(t) = ψ(t − t∗), where t∗ = b − a. Prove that φ̃ = φ and ψ̃ = ψ close to b, thus showing that φ(t − t∗) = φ(t) and ψ(t − t∗) = ψ(t) for all t close to b and the corresponding trajectory wraps over itself near (x0, y0).

8. Consider the non-autonomous system

x′ = cos³t − 2 cos t sin²t
y′ = −sin³t + 2 cos²t sin t

on −π/4 ≤ t ≤ 3π/4.

(a) Verify that

[φ(t)]   [cos²t sin t]
[ψ(t)] = [cos t sin²t]

is a solution.

(b) Check that

[φ(0)]   [φ(π/2)]
[ψ(0)] = [ψ(π/2)].

(c) Verify that the parametric equations

x = cos²t sin t
y = cos t sin²t

correspond to the four-leaved rose r = (1/2) sin 2t, given in polar coordinates.

(d) Trace out the trajectory corresponding to the solution

[φ(t)]
[ψ(t)]

for −π/4 ≤ t ≤ 3π/4, showing that it intersects itself only at the origin.


3. Critical Points

Definition 5.1. A critical point of the autonomous system

(6)  x′ = f(x, y)
     y′ = g(x, y)

is a point (x0, y0) for which f(x0, y0) = 0 and g(x0, y0) = 0.

Notice that if (x0, y0) is a critical point of (6), then

[φ(t)]   [x0]
[ψ(t)] ≡ [y0]

is trivially a (constant) solution of (6). Then by uniqueness of the solution to the initial value problem, no other solution can ever take on the value

[x0]
[y0].

In other words, no non-constant trajectory can pass through the critical point (x0, y0). Because they correspond to constant solutions, critical points are also referred to as stationary points. Here, and throughout the rest of the chapter, we assume that f and g are continuous with continuous first partial derivatives.

Before going any further, let us make a comment about notation. We write

[x]
[y]

when we want to view x and y as solutions of a system and write (x, y) when we view x and y as coordinates in the phase plane.

Example. The system

x′ = 4y

y′ = x.

has exactly one critical point, which is (0, 0). It corresponds to the constant solution

[x]   [0]
[y] ≡ [0].

Also, note that the system can be written in matrix form as

[x′]   [0  4] [x]
[y′] = [1  0] [y],

which leads us to our next example.

Example. Let A be a 2 × 2 matrix of constants such that det A ≠ 0. Then the system of linear homogeneous equations with constant coefficients

[x′]     [x]
[y′] = A [y]

has exactly one critical point, which is (0, 0), since

  [x]   [0]
A [y] = [0]

has only the trivial solution.


Example. The critical points of

x′ = −4xy
y′ = 2x² − x

are found by solving

−4xy = 0
2x² − x = 0.

The second equation gives us x = 1/2, 0. Then by the first equation, x = 1/2 implies y = 0, while y can take on any value if x = 0. Thus, the critical points are (1/2, 0) and (0, b) for all b, which correspond to the constant solutions found earlier for this system of differential equations.
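The same computation is easy to spot-check in code (helper name ours):

```python
f = lambda x, y: -4 * x * y        # right-hand sides of the system
g = lambda x, y: 2 * x ** 2 - x

def is_critical(x, y):
    """(x, y) is a critical point exactly when f and g both vanish there."""
    return f(x, y) == 0 and g(x, y) == 0

assert is_critical(0.5, 0.0)                                 # the point (1/2, 0)
assert all(is_critical(0.0, b) for b in (-2.0, 0.0, 1.0))    # the line x = 0
assert not is_critical(1.0, 0.0) and not is_critical(0.5, 1.0)
```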

We will describe the phase plane of a system of two equations by analyzing the behavior of trajectories near critical points. This is done by categorizing critical points into several types. We will restrict ourselves to isolated critical points. The qualitative analysis we are going through is most important for nonlinear systems since they are the ones that generally cannot be explicitly solved. However, our method for analyzing nonlinear systems will involve approximating them by linear systems, and so we will start by categorizing the isolated critical points of linear systems.

Consider the system

[x′]     [x]
[y′] = A [y],

where A is a 2 × 2 matrix with constant coefficients such that det A ≠ 0. We have just seen that this system has one and only one critical point, which is (0, 0). (If det A = 0, then the system has at least an entire line of non-isolated critical points.) To determine what kind of critical point (0, 0) is, we will examine the eigenvalues of A. Note that λ = 0 cannot be an eigenvalue of A since if it were, we would have Av = 0v = 0 for some v ≠ 0, which would contradict A being invertible.
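For a 2 × 2 matrix the eigenvalues come straight from the characteristic polynomial λ² − (a + d)λ + (ad − bc); a small sketch (helper name ours):

```python
import cmath

def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] from the characteristic polynomial
    lambda^2 - (a + d) lambda + (ad - bc) = 0."""
    tr, det = a + d, a * d - b * c
    s = cmath.sqrt(tr * tr - 4 * det)    # complex sqrt handles every case
    return (tr + s) / 2, (tr - s) / 2

# The system x' = 4y, y' = x has matrix [[0, 4], [1, 0]]:
l1, l2 = eigenvalues_2x2(0, 4, 1, 0)
# eigenvalues 2 and -2: distinct real eigenvalues of opposite signs,
# the case illustrated by Figure 6 below.
```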

Let's start by supposing A has two distinct real eigenvalues λ1, λ2. The general solution of the system can be written

       [φ(t)]   [x]             [ξ1]             [ξ2]
φ(t) = [ψ(t)] = [y] = C1e^{λ1t} [η1] + C2e^{λ2t} [η2],

where

       [ξ1]          [ξ2]
v[1] = [η1],  v[2] = [η2]

are linearly independent eigenvectors.

Assume now that λ1 and λ2 are both positive. Clearly for every solution except the trivial one corresponding to C1 = C2 = 0,

lim_{t→−∞} (x² + y²) = 0  and  lim_{t→∞} (x² + y²) = ∞

since v[1] ≠ 0 and v[2] ≠ 0. In other words, each trajectory other than the stationary one corresponding to the critical point goes from the origin (in the limit) to infinity. If C2 = 0, then φ(t) = C1e^{λ1t}v[1], which is all positive scalar multiples of the vector C1v[1]. For each nonzero C1, the trajectory is a ray heading out from the origin in the direction of C1v[1]. (Note that we have infinitely many



Figure 5. Real eigenvalues with same sign


Figure 6. Real eigenvalues with opposite signs

solutions because of the infinitely many choices for C1, but they give us only two different trajectories, one for positive values of C1 and one for negative values of C1.) Likewise, if C1 = 0, we get φ(t) = C2e^{λ2t}v[2], which for positive and negative values of C2 determines a ray in the direction of C2v[2] going from the origin to infinity. If we arrange our notation for the eigenvalues so that λ1 > λ2 > 0, then a more detailed analysis shows that every trajectory corresponding to both C1 and C2 being nonzero is a curve heading out from the origin initially in the direction of C2v[2] (since e^{λ2t} dominates e^{λ1t} as t → −∞) that goes toward infinity in a way that the direction in the limit becomes that of C1v[1] (since e^{λ1t} dominates e^{λ2t} as t → ∞). Part (a) of Figure 5 shows an example of this.

Next suppose that λ1 and λ2 are both negative and that λ1 < λ2 < 0. The description of the phase plane is analogous to the positive eigenvalue case but with the direction of the trajectories reversed so that they all head toward the origin. An example of this is shown in part (b) of Figure 5.

Now suppose that λ1 and λ2 have opposite signs. Without loss of generality, we can assume that λ1 > 0 and λ2 < 0. If C2 = 0, we get for each C1 ≠ 0 a ray



Figure 7. Double eigenvalue with eigenspace of dimension two

in the direction of C1v[1] heading out from the origin; if C1 = 0 and C2 ≠ 0, the trajectory is a ray determined by C2v[2] but headed in toward the origin. To see what happens when neither C1 nor C2 is zero, note that e^{λ1t} → ∞ and e^{λ2t} → 0 as t → ∞. Thus the trajectory of such a solution approaches asymptotically the ray corresponding to C2 = 0 as t → ∞. Likewise, the trajectory approaches the ray corresponding to C1 = 0 as t → −∞. See Figure 6.

trajectory is a ray determined by C2v[2] but headed in toward the origin. To see

what happens when neither C1 nor C2 is zero, note that eλ1t →∞ and eλ2t → 0 ast → ∞. Thus the trajectory of such a solution approaches asymptotically the raycorresponding to C2 = 0 as t → ∞. Likewise, the trajectory approaches the raycorresponding to C1 = 0 as t→ −∞. See Figure 6.

We next consider the case of A having a repeated eigenvalue λ, which must be real since the entries of A are real. If there are two linearly independent eigenvectors

       [ξ1]          [ξ2]
v[1] = [η1]  and  v[2] = [η2],

then the solutions of the system can be written as

φ(t) = C1e^{λt}v[1] + C2e^{λt}v[2] = e^{λt}(C1v[1] + C2v[2]),

which gives us all positive scalar multiples of the vector C1v[1] + C2v[2]. In other words, for each C1, C2 not both zero, the vector-valued function traces out a ray with direction vector C1v[1] + C2v[2]. When λ > 0, the trajectory heads out from the origin; when λ < 0, the trajectory heads in toward the origin as t increases. See Figure 7 for an example of the λ > 0 case.

the origin; when λ < 0, the trajectory heads in toward the origin as t increases.See Figure 7 for an example of the λ > 0 case.

Now suppose the eigenspace corresponding to λ has dimension one. As was claimed in the last chapter, it can be shown that the solutions of the system are of the form

φ(t) = C1 e^{λt} v + C2 e^{λt}(tv + u),

where v is an eigenvector and u is a generalized eigenvector. If C2 = 0, φ(t) is a positive scalar multiple of C1v for each t. The corresponding trajectory then is a ray, heading away from the origin if λ > 0 and heading toward the origin if λ < 0. If C2 ≠ 0, then for each C1, the trajectory is again a curve between the origin and infinity. If also λ > 0, then the trajectory heads out from the origin as t increases; if λ < 0, the direction is reversed. See Figure 8 for an example of the λ > 0 case.
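That this formula really solves x′ = Ax can be spot-checked numerically for a concrete matrix with a repeated eigenvalue and a one-dimensional eigenspace. The matrix, vectors, and constants below are an illustrative choice of ours, assuming NumPy is available.

```python
import numpy as np

# Illustrative matrix with repeated eigenvalue lambda = 3 and a
# one-dimensional eigenspace.
A = np.array([[3.0, 1.0],
              [0.0, 3.0]])
lam = 3.0
v = np.array([1.0, 0.0])   # eigenvector: A v = lam v
u = np.array([0.0, 1.0])   # generalized eigenvector: (A - lam I) u = v

C1, C2 = 0.7, -1.3         # arbitrary constants

def phi(t):
    # phi(t) = C1 e^{lam t} v + C2 e^{lam t} (t v + u)
    return np.exp(lam * t) * (C1 * v + C2 * (t * v + u))

# Check phi' = A phi at t = 0.4 with a centered finite difference.
t, h = 0.4, 1e-6
dphi = (phi(t + h) - phi(t - h)) / (2 * h)
print(np.max(np.abs(dphi - A @ phi(t))))   # ~0 (within rounding)
```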

Figure 8. Double eigenvalue with eigenspace of dimension one

Figure 9. Imaginary eigenvalues with positive real part

The final possibility for the eigenvalues of A is that they are a complex conjugate pair α ± iβ, β ≠ 0. Let v be an eigenvector corresponding to α + iβ. The solutions of the system can be written as

φ(t) = C1 e^{αt} Re(e^{iβt} v) + C2 e^{αt} Im(e^{iβt} v).

If α = 0, we have φ(t) = C1 Re(e^{iβt} v) + C2 Im(e^{iβt} v), and clearly φ(t + 2π/β) = φ(t) for all t. As a consequence, each trajectory wraps over itself. In fact, it can be shown that the trajectories are ellipses (or circles). Then if α ≠ 0, the trajectories are spirals. If α > 0, the trajectories spiral out to infinity; if α < 0, the trajectories spiral toward the origin as t increases. See Figure 9 for an example of the α > 0 case.
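The periodicity in the α = 0 case can be checked directly. A small sketch assuming NumPy; the matrix and eigenvector are an illustrative choice of ours with eigenvalues ±4i, so β = 4 and the period is 2π/β = π/2.

```python
import numpy as np

# Illustrative matrix with eigenvalues ±4i (alpha = 0, beta = 4).
A = np.array([[0.0, -2.0],
              [8.0,  0.0]])
v = np.array([1.0, -2.0j])     # eigenvector for lambda = 4i: A v = 4i v

def phi(t, C1=1.0, C2=0.5):
    w = np.exp(4j * t) * v     # e^{i beta t} v
    return C1 * w.real + C2 * w.imag

# phi has period 2*pi/beta = pi/2.
print(np.allclose(phi(0.3 + np.pi / 2), phi(0.3)))   # True
```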

For both linear and nonlinear systems, critical points can be categorized in terms of the behavior of nearby trajectories. By a neighborhood of a point in the plane, we will mean an open disc centered at that point.

Definition 5.2. Let (x0, y0) be a critical point of the system

x′ = f(x, y)

y′ = g(x, y).



Figure 10. Stable critical point

Then (x0, y0) is said to be stable if for each neighborhood N′ of (x0, y0), there is a neighborhood N of (x0, y0) such that whenever

φ = [φ, ψ]^T

is a solution of the system that has (φ(t1), ψ(t1)) in N for some t1, then (φ(t), ψ(t)) is in N′ for all t ≥ t1; if, in addition, lim_{t→∞} (φ(t), ψ(t)) = (x0, y0) whenever (φ(t1), ψ(t1)) ∈ N, we say that (x0, y0) is asymptotically stable. A critical point (x0, y0) is said to be unstable if it is not stable.

Put in more geometric terms, the definition says the following. The critical point (x0, y0) is stable if any trajectory that comes close enough to (x0, y0) (so that it has a point in N) will then stay close to (x0, y0) (remaining in N′). Note that N can be smaller than N′ as shown in Figure 10, and that N′ can be arbitrarily small. The point (x0, y0) is asymptotically stable if, in addition, each such trajectory approaches (x0, y0) as t → ∞. On the other hand, the critical point is unstable if there is some neighborhood N′ such that, no matter how close to (x0, y0) you require a solution to start, there will be a solution that is at some time t1 that close to (x0, y0) but is outside N′ sometime later.

Let’s take a moment to examine the terminology we are using. The only solution passing through a critical point (x0, y0) is a constant solution. If (x0, y0) is stable and a solution passes through a point near (x0, y0) at some time t1, then it isn’t a constant solution but at least will be ‘stable’ in the sense of remaining close by for all t ≥ t1. If (x0, y0) is asymptotically stable and if (x, y) gives a trajectory that approaches (x0, y0) as t → ∞, the plot of x versus t has the line x = x0 as a horizontal asymptote and the plot of y versus t has the line y = y0 as a horizontal asymptote.

Now, let’s look back at a linear system whose coefficient matrix A is nonsingular. Let λ1, λ2 denote the eigenvalues of A. The behavior of the trajectories, which we recently analyzed, implies the following. If λ1 and λ2 are distinct and are real and positive or are imaginary with positive real part, then (0, 0) is an unstable critical point. If λ1 and λ2 are distinct and are real and negative or are imaginary with negative real part, then (0, 0) is asymptotically stable. If λ1 and λ2 are real numbers with opposite signs, then (0, 0) is unstable. If λ1 and λ2 are purely imaginary, then (0, 0) is stable but not asymptotically stable. Finally, if λ is a repeated eigenvalue, then (0, 0) is unstable if λ > 0 and is asymptotically stable if λ < 0.
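The case analysis above can be packaged as a small routine. A minimal sketch assuming NumPy; `classify_origin` is our own helper name, and the three matrices tested are the coefficient matrices of exercises 1, 3, and 5 below.

```python
import numpy as np

def classify_origin(A, tol=1e-9):
    """Classify (0, 0) for x' = Ax, where A is a real nonsingular
    2x2 matrix, following the case analysis above."""
    re = np.linalg.eigvals(np.asarray(A, dtype=float)).real
    if np.all(re < -tol):
        return "asymptotically stable"   # all real parts negative
    if np.all(np.abs(re) <= tol):
        return "stable"                  # purely imaginary pair
    return "unstable"                    # some eigenvalue has positive real part

# Coefficient matrices of exercises 1, 3, and 5 below:
print(classify_origin([[2, 5], [3, -7]]))    # unstable
print(classify_origin([[1, -5], [3, 2]]))    # unstable
print(classify_origin([[-2, 3], [-3, -8]]))  # asymptotically stable
```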

Exercises

For problems 1-5, determine whether (0, 0) is stable, asymptotically stable, or unstable.

1. x′ = 2x + 5y
   y′ = 3x − 7y

2. x′ = −4x + 2y
   y′ = 3x − 5y

3. x′ = x − 5y
   y′ = 3x + 2y

4. x′ = 3x − 4y
   y′ = 6x − 3y

5. x′ = −2x + 3y
   y′ = −3x − 8y

6. Suppose the populations of rabbits and foxes in a closed environment are represented over time by a linear system whose coefficient matrix is nonsingular. If (0, 0) is asymptotically stable, what happens to the two populations in the long run? If the system is not known to be linear, can we draw the same conclusion? Explain.

7. Let (0, 0) be a critical point of a system. If

   φ = [φ, ψ]^T

   is a nonconstant solution, show that φ(t) and ψ(t) can never both equal zero for the same t. Give an example to show that φ(t1) and ψ(t2) can be zero if t1 ≠ t2.

4. Perturbations and the Almost Linear Property

Consider the autonomous linear system

(7)  x′ = ax + by
     y′ = cx + dy,

where a, b, c, d are constants such that ad − bc ≠ 0. In applications, the parameters a, b, c, d are often found by measurement, which is not exact. For this reason and others, it is of interest to see whether the nature of the solutions of (7) remains unaltered under small perturbations (changes) of a, b, c, d. As we have seen, the type of solutions of (7) is determined by the eigenvalues λ1, λ2 of

A = [ a  b
      c  d ].



The eigenvalues are solutions of the second degree polynomial equation det(λI − A) = 0. It follows from the quadratic formula that λ1 and λ2 (real, or imaginary with the same real part) are continuous functions of a, b, c, d. Suppose that for a certain choice of a, b, c, d, the eigenvalues of A are distinct. Then for small enough perturbations of a, b, c, d, the resulting eigenvalues will be distinct and will be of the same sort (both positive, both negative, real with opposite signs, both imaginary with positive real part, both imaginary with negative real part) as the original eigenvalues. Thus, the type of solutions of the system (7) and the stability of the critical point (0, 0) remain unchanged under small perturbations. On the other hand, if A has two distinct purely imaginary eigenvalues, then under small perturbations the resulting system could have two distinct purely imaginary eigenvalues, but could also have eigenvalues with nonzero real part. Thus, the trajectories could change from ellipses to spirals, and the stability of (0, 0) would then change. Also, if A has two identical eigenvalues, then under a small perturbation of a, b, c, d, the resulting eigenvalues could be identical, distinct and real with the same sign, or imaginary. Furthermore, even if the eigenvalues remain identical, the dimension of the eigenspace could change; here again, the type of solutions of (7) can change under small perturbations of a, b, c, d (although in this case the stability of (0, 0) remains unaltered). A change in the type of solutions brought about by arbitrarily small perturbations of a, b, c, d is commonly referred to as bifurcation.

Example. The system

x′ = 2x− 5y

y′ = 4x− 7y

has eigenvalues λ = −2, −3, and so (0, 0) is asymptotically stable. Thus, any system resulting from small enough perturbations of the coefficients will have (0, 0) as an asymptotically stable critical point. For example, the reader can check that for

x′ = 2.1x− 4.95y

y′ = 4.2x− 7.12y,

(0, 0) is asymptotically stable.
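The check suggested above is quick to carry out. A small sketch assuming NumPy (our own illustration, not part of the text):

```python
import numpy as np

# Eigenvalues of the original and the perturbed coefficient matrices.
A  = np.array([[2.0, -5.0], [4.0, -7.0]])
Ap = np.array([[2.1, -4.95], [4.2, -7.12]])

print(np.allclose(sorted(np.linalg.eigvals(A).real), [-3, -2]))  # True
print(np.all(np.linalg.eigvals(Ap).real < 0))   # True: still asymptotically stable
```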

Example. The system

x′ = −2y

y′ = 8x

has purely imaginary eigenvalues λ = ±4i, and so (0, 0) is stable but not asymptotically stable. For any ε > 0, the system

x′ = εx− 2y

y′ = 8x+ εy

will have eigenvalues λ = ε± 4i, making (0, 0) unstable, while the system

x′ = −εx− 2y

y′ = 8x− εy



will have eigenvalues λ = −ε ± 4i, making (0, 0) asymptotically stable. On the other hand, the system

x′ = εx − 2y
y′ = 8x − εy

will have eigenvalues λ = ±√(16 − ε²) i, making (0, 0) stable but not asymptotically stable (as long as ε is less than 4).
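The three perturbed systems can be compared at once by inspecting the real parts of their eigenvalues. A sketch assuming NumPy, with ε = 0.1 as an illustrative value:

```python
import numpy as np

eps = 0.1  # any small positive value

center       = np.array([[0.0,  -2.0], [8.0,  0.0]])   # lambda = ±4i
spiral_out   = np.array([[ eps, -2.0], [8.0,  eps]])   # lambda = eps ± 4i
spiral_in    = np.array([[-eps, -2.0], [8.0, -eps]])   # lambda = -eps ± 4i
still_center = np.array([[ eps, -2.0], [8.0, -eps]])   # lambda = ±sqrt(16 - eps^2) i

for A in (center, spiral_out, spiral_in, still_center):
    # Real parts decide the stability of (0, 0).
    print(np.round(np.linalg.eigvals(A).real, 12))
```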

Now let’s return to the nonlinear autonomous system

(8)  x′ = f(x, y)
     y′ = g(x, y),

where as before we assume that f and g are continuous with continuous first partial derivatives. Recall from multivariable calculus that the assumptions on f and g imply that they are differentiable. In particular for f at (x0, y0), this means that

f(x, y) − f(x0, y0) = ∇f(x0, y0) · ⟨x − x0, y − y0⟩ + f̃(x − x0, y − y0)
                    = fx(x0, y0)(x − x0) + fy(x0, y0)(y − y0) + f̃(x − x0, y − y0),

where ∇f(x0, y0) = fx(x0, y0)i + fy(x0, y0)j is the gradient of f, and f̃ is a function such that

lim_{(h,k)→(0,0)} f̃(h, k)/√(h² + k²) = 0.

If (x0, y0) is a critical point, this becomes

f(x, y) = fx(x0, y0)(x − x0) + fy(x0, y0)(y − y0) + f̃(x − x0, y − y0),

and we have a similar formula for g.

We will, for the moment, assume that the critical point (x0, y0) is (0, 0). This special case is especially convenient to work with, and we will see in the examples that follow that the general situation can be reduced to this case simply by a translation. Putting a = fx(0, 0), b = fy(0, 0), c = gx(0, 0), and d = gy(0, 0), we have f(x, y) = ax + by + f̃(x, y) and g(x, y) = cx + dy + g̃(x, y). Thus, (8) can be rewritten as

(9)  x′ = ax + by + f̃(x, y)
     y′ = cx + dy + g̃(x, y).

Since

lim_{(x,y)→(0,0)} f̃(x, y)/√(x² + y²) = 0 and lim_{(x,y)→(0,0)} g̃(x, y)/√(x² + y²) = 0,

the system can be thought of as being almost linear near the critical point (0, 0). We want to use our ability to solve the associated linear system

(10)  x′ = ax + by
      y′ = cx + dy

to tell us something about the solutions of (9), at least close to the critical point. Before doing that, however, let’s first look at a couple of examples of finding the associated linear system.



Example. Consider the system

x′ = x + x² − 3y
y′ = 4x + 4xy.

First we find the critical points. Setting 4x + 4xy = 0, we get x = 0 or y = −1. If x = 0, the equation x + x² − 3y = 0 implies y = 0, and so we have the critical point (0, 0). If y = −1, then x + x² − 3y = 0 becomes x² + x + 3 = 0, which has no real solutions. Thus, (0, 0) is the only critical point. Clearly, f(x, y) = x + x² − 3y and g(x, y) = 4x + 4xy are continuous with continuous first partial derivatives. We compute a = fx(0, 0) = 1, b = fy(0, 0) = −3, c = gx(0, 0) = 4, and d = gy(0, 0) = 0. The system can then be written in the form of (9) as

x′ = x − 3y + x²
y′ = 4x + 4xy,

with f̃(x, y) = x² and g̃(x, y) = 4xy. Note that, as is implied by differentiability,

lim_{(x,y)→(0,0)} x²/√(x² + y²) = lim_{(x,y)→(0,0)} 4xy/√(x² + y²) = 0.

Example. Consider the system

x′ = 2x + x² + sin y

y′ = −4y.

We get two critical points, (0, 0) and (−2, 0). Starting with (0, 0), we compute a = fx(0, 0) = 2, b = fy(0, 0) = 1, c = gx(0, 0) = 0, and d = gy(0, 0) = −4. Thus, the system will be rewritten as

x′ = 2x+ y + (x2 + sin y − y)

y′ = −4y.

Near (0, 0), we will approximate the given system with the linear system

x′ = 2x+ y

y′ = −4y.

Now, let’s move to the critical point (−2, 0). Introduce the variables u = x − (−2) = x + 2 and v = y − 0 = y. Under this translation, we get

u′ = −2u + u² + sin v
v′ = −4v.

This system has a critical point (u, v) = (0, 0), which corresponds to (x, y) = (−2, 0). (The critical point (u, v) = (2, 0) is not needed to analyze the original system.) As above, we can rewrite this as

u′ = −2u + v + (u² + sin v − v)
v′ = −4v.

Our plan then is to approximate the given system near (x, y) = (−2, 0) with the linear system

u′ = −2u + v
v′ = −4v



near (u, v) = (0, 0).
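The coefficients a, b, c, d of a linearization can also be estimated numerically with finite differences, which gives a useful check on hand computations like those above. A sketch assuming NumPy; the helper `jacobian` is our own, not from the text.

```python
import numpy as np

# Right-hand side of the second example: f = 2x + x^2 + sin y, g = -4y.
def F(p):
    x, y = p
    return np.array([2 * x + x**2 + np.sin(y), -4 * y])

def jacobian(F, p, h=1e-6):
    # Central finite differences, one column per variable.
    p = np.asarray(p, dtype=float)
    J = np.zeros((2, 2))
    for j in range(2):
        e = np.zeros(2)
        e[j] = h
        J[:, j] = (F(p + e) - F(p - e)) / (2 * h)
    return J

print(np.round(jacobian(F, (0.0, 0.0)), 6))    # a = 2, b = 1, c = 0, d = -4
print(np.round(jacobian(F, (-2.0, 0.0)), 6))   # a = -2, b = 1, c = 0, d = -4
```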

The effect on stability of a critical point when moving to an almost linear system from an associated linear system mirrors what we have seen when the coefficients of a linear system are perturbed. We state without proof:

Theorem 5.2. Let (x0, y0) be an isolated critical point of (8), and let

(11)  u′ = au + bv
      v′ = cu + dv

be the associated linear system, where u = x − x0, v = y − y0. Let λ1, λ2 be the eigenvalues corresponding to (11).

(i) If λ1 and λ2 are real (distinct or equal) or are imaginary with nonzero real part, then (x0, y0) has the same stability for (8) as (0, 0) has for (11). That is, (x0, y0) is asymptotically stable for (8) if λ1, λ2 are negative or have negative real part; (x0, y0) is unstable for (8) if λ1, λ2 are positive or have positive real part, or if λ1, λ2 are real with opposite signs.

(ii) If λ1 and λ2 are purely imaginary (so that (0, 0) is stable but not asymptotically stable for (11)), then (x0, y0) can be asymptotically stable, stable, or unstable for (8).

Let’s look back at the last two examples.

Example. Consider again

x′ = x + x² − 3y
y′ = 4x + 4xy.

The associated linear system is

x′ = x− 3y

y′ = 4x,

whose eigenvalues λ = 1/2 ± (√47/2) i have positive real part. Therefore, (0, 0) is an unstable critical point for both the original nonlinear system and its linear approximation. Figure 11(a) is a phase portrait of the original system, while Figure 11(b) shows the corresponding phase portrait of the linear approximation.

Example. Consider again

x′ = 2x + x² + sin y

y′ = −4y.

The linear system associated with the critical point (0, 0) is

x′ = 2x+ y

y′ = −4y.

The corresponding eigenvalues are λ = 2, −4, making (0, 0) an unstable critical point for both the linearization and the original system. The linear system associated with the critical point (−2, 0) is

u′ = −2u + v
v′ = −4v,



Figure 11. x′ = x + x² − 3y, y′ = 4x + 4xy

where u = x + 2 and v = y. The corresponding eigenvalues are λ = −2, −4, making (u, v) = (0, 0) an asymptotically stable critical point for the linearization and (x, y) = (−2, 0) an asymptotically stable critical point for the original system. See Figure 12 for phase portraits of the system and the two linearizations.

Two of the exercises in the exercise set for this section illustrate part (ii) of the last theorem. Although the phase portrait of a system with a linearization that involves purely imaginary eigenvalues can give us an indication of the stability of a critical point, to actually prove what we think we see requires advanced techniques (not pursued here) such as the Liapunov method.

We now return to the example involving rabbits and foxes that introduced this chapter.

Example. Consider the system

(12)  x′ = 5x − x² − xy
      y′ = xy − 2y,

where x is the number (in thousands) of rabbits and y is the number (in tens) of foxes at time t in a certain woods. It is easily seen that the critical points are (0, 0), (5, 0), and (2, 3). The linear system associated with (0, 0) is

x′ = 5x

y′ = −2y,

which has eigenvalues λ = 5, −2. Thus, (0, 0) is an unstable critical point for (12). For (5, 0), put u = x − 5, v = y. Substituting into (12) and then linearizing, we get the linear system

u′ = −5u− 5v

v′ = 3v,

which has eigenvalues λ = −5, 3. Thus, (x, y) = (5, 0) is an unstable critical point for (12). For (2, 3), put u = x − 2, v = y − 3. Substituting this into (12) and then



Figure 12. x′ = 2x + x² + sin y, y′ = −4y

linearizing, we get the linear system

u′ = −2u − 2v
v′ = 3u,

which has eigenvalues λ = −1 ± √5 i. Thus, (x, y) = (2, 3) is an asymptotically stable critical point for (12).

Let’s see what our findings imply about the populations of rabbits and foxes. (Of course, negative values of x and y will not have any physical relevance.) First and most importantly, if at any time there are approximately 2000 rabbits and 30 foxes, the number of rabbits will approach 2000 and the number of foxes will approach 30 in the long run. On the other hand, there are population distributions as close as we like to the other two critical points that do not lead to distributions in the future close to these unstable critical points. See Figure 13 for a phase portrait of (12).
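The linearizations at the three critical points can be double-checked numerically. A minimal sketch assuming NumPy; the Jacobian below was computed by hand from (12).

```python
import numpy as np

# Jacobian of (12): f = 5x - x^2 - xy, g = xy - 2y.
def J(x, y):
    return np.array([[5 - 2 * x - y, -x],
                     [y,              x - 2]])

for p in [(0, 0), (5, 0), (2, 3)]:
    lam = np.linalg.eigvals(J(*p))
    print(p, np.round(lam, 4))
```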

The myriad of different ways the trajectories of an autonomous system can be distributed in a phase plane is by no means determined just by critical points. For example, some phase planes have limit cycles, a phenomenon that we have not encountered before but which we will see now.



Figure 13. Rabbits and foxes

Example. Consider the system

(13)  x′ = −(3/2)y
      y′ = (2/3)x − ((1/9)x² + (1/4)y² − 9/25)y.

Clearly, the one and only critical point is (0, 0). The corresponding linearization is

x′ = −(3/2)y
y′ = (2/3)x + (9/25)y,

which has the eigenvalues

λ = 9/50 ± (√2419/50) i.

Thus (0, 0) is an unstable critical point for both the linearization and the original system. Also, it is easy to check that

x = (9/5) cos t
y = (6/5) sin t

is a solution of (13). This solution produces the closed trajectory

(1/9)x² + (1/4)y² = 9/25.

This ellipse is shown in Figure 14, the phase plane of (13). Furthermore, every nonconstant trajectory either inside or outside the closed trajectory spirals toward it as t → ∞. The closed trajectory is called an asymptotically stable limit cycle. Incidentally, the reader should note how the definition of a stable critical point does not say that (0, 0) is stable in this example even though all trajectories are eventually close (within 2 units) to (0, 0).
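That the stated periodic solution really satisfies (13) is easy to verify numerically by substituting it into the right-hand side. A small sketch, assuming NumPy is available:

```python
import numpy as np

# Right-hand side of (13).
def F(x, y):
    dx = -1.5 * y
    dy = (2 / 3) * x - (x**2 / 9 + y**2 / 4 - 9 / 25) * y
    return dx, dy

# The claimed solution x = (9/5) cos t, y = (6/5) sin t.
t = np.linspace(0.0, 2 * np.pi, 200)
x, y = 1.8 * np.cos(t), 1.2 * np.sin(t)
fx, fy = F(x, y)

# Compare F(x, y) against the analytic derivatives of the solution.
residual = np.hypot(-1.8 * np.sin(t) - fx, 1.2 * np.cos(t) - fy)
print(residual.max())   # ~0: the ellipse is traced exactly
```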



Figure 14. Phase portrait with limit cycle

Geometric analysis of the sort we have been going through for systems of two first order ordinary differential equations in two dependent variables becomes vastly more difficult when applied to systems of n equations in n dependent variables, n = 3, 4, . . . . Even in the case of n = 3, where we still can draw (3-dimensional) pictures, the analysis of behavior near critical points is not understood nearly as well as in the case of n = 2. There is much that is unknown; there is much that is waiting to be discovered.

Exercises

1. Consider the system

x′ = x− 2y

y′ = 5x− y.

(a) Find perturbations for which (0, 0) is an unstable critical point.
(b) Find perturbations for which (0, 0) is asymptotically stable.
(c) Find perturbations for which (0, 0) is stable but not asymptotically stable.

2. Consider the system

x′ = −2x− 3y

y′ = 3x+ 2y.

(a) Find perturbations for which (0, 0) is an unstable critical point.
(b) Find perturbations for which (0, 0) is asymptotically stable.
(c) Find perturbations for which (0, 0) is stable but not asymptotically stable.

3. Consider the system

x′ = 8x+ 4xy

y′ = y + 2x².



(a) Find all critical points.
(b) Show that the system is almost linear near each of its critical points and find the associated linear systems.
(c) Find the stability of each critical point of the original system by finding the stability of (0, 0) for the associated linear system.

4. Consider the system

x′ = x − y²
y′ = 2x − 8.

(a) Find all critical points.
(b) Show that the system is almost linear near each of its critical points and find the associated linear systems.
(c) Find the stability of each critical point of the original system by finding the stability of (0, 0) for the associated linear system.

5. Consider the system

x′ = y − x³
y′ = −x − 2y³.

(a) Show that the system is almost linear near (0, 0) and find the associated linear system.
(b) Find the stability of (0, 0) for the linear system.
(c) With the aid of a computer, draw a phase portrait of the original system and comment on the stability of (0, 0) based on what the phase portrait seems to indicate.

6. Consider the system

x′ = 4y + 2x³
y′ = −x + 3y³.

(a) Show that the system is almost linear near (0, 0) and find the associated linear system.
(b) Find the stability of (0, 0) for the linear system.
(c) With the aid of a computer, draw a phase portrait of the original system and comment on the stability of (0, 0) based on what the phase portrait seems to indicate.


Answers to Selected Exercises

Section 0.1

1. x² · 6x⁻⁴ − 6x⁻² = 0
3. (1/3)(6 − 3x)^(−2/3)(−3) = −1/(6 − 3x)^(2/3)
5. −a² cos ax + a² cos ax = 0 and −a² sin ax + a² sin ax = 0
7. (4xe^(2x) + 4e^(2x)) − 4(2xe^(2x) + e^(2x)) + 4xe^(2x) = 0 and φ(0) = 0e⁰ = 0, φ′(0) = 0e⁰ + e⁰ = 1
9. 4

Section 1.2

1. y = (1/7)e^(2x) + Ce^(−5x)
3. y = (3/2)xe^(x²) + Ce^(x²)
5. y = (1/2)x⁸ + Cx⁻²
7. y = xe^(x²) + 4e^(x²)
9. y = 2 + 3e^(−(1/4)x²)
11. A(t) = A0e^(kt)
13. (b) y = (1/3)x − 1/9 + (1/2)e^(−x) + Ce^(−3x)

Section 1.3

1. (a) separable
   (e) linear, separable, and exact
3. y = (−1 ± √(1 − 4x(x² − C)))/(2x)
5. x²e^y + y = C
7. y = ((29 − 2x)/x²)^(1/3)
9. −x² + x sin y + y = 1
11. y = (−a + C/x)^(1/3)
19. y = ±√(Ce^(−2x) − x)
21. y = x(3 ln|x| + C)^(1/3)
23. y = x((1/2) ln x + 1)² and y = x((1/2) ln x − 1)²
25. y = ((2/7)x⁻³ + Cx⁴)^(−1/2)

Section 1.4

1. (a) y = 3 + ∫₀ˣ t²y dt
   (b) φ0(x) ≡ 3, φ1(x) = 3 + x³, φ2(x) = 3 + x³ + (1/6)x⁶
   (c) |fy(x, y)| = x² ≤ 4 on R
   (d) 1/6
   (e) φ(x) = 3e^(x³/3), and φ0, φ1, φ2 are partial sums
3. (a) y = ∫₀ˣ (4t − y²) dt
   (b) φ0(x) ≡ 0, φ1(x) = 2x², φ2(x) = 2x² − (4/5)x⁵
   (c) |fy(x, y)| = |2y| ≤ 40 on R
   (d) 1/22

Section 1.5

1. Partial answer: Y(1) = 5, Y(1.1) = 3.8, Y(1.2) = 2.98, Y(1.3) = 2.426, Y(1.4) = 2.0582
3. Partial answer: Y(0) = −1, Y(.5) = −2, Y(1) = 6, Y(1.5) = 2
5. Y(1.4) = 2.29300744, Y(1.4) = 2.31638748, Y(1.4) = 2.31615335
7. Y(1.5) = −0.61157942, Y(1.5) = −0.61227082, Y(1.5) = −0.61222533

Section 1.6

1. any number close to 2.75
3. 2

Section 2.3

1. y = e^(3x), y = e^(4x)
3. y = e^((1+√2)x), y = e^((1−√2)x)
5. y = e^((−1/2)x), y = xe^((−1/2)x)
7. y = e^(3x) cos x, y = e^(3x) sin x
9. y = cos(√(5/2) x), y = sin(√(5/2) x)

Section 2.4

1. 12x² + 7x⁴
3. e^(3x), e^(5x)

Section 2.5

1. y = C1e^((1/3)x) + C2xe^((1/3)x)
3. y = 2e^(2x) − e^(−5x)
7. (b) b = 2√(mk)

Section 2.6

1. y = C1e^(2x) + C2e^(2x) cos x + C3e^(2x) sin x, with C1, C2, C3 real
3. y = C1e^x + C2e^(−x) + C3e^((1/2)x) cos((√3/2)x) + C4e^((1/2)x) sin((√3/2)x), with C1, C2, C3, C4 real
5. y = 2e^x − 2xe^(−x)
7. (a) −6x
   (b) linearly independent since W(f1, f2, f3)(x) is not identically 0

Section 2.7

1. y = (4/13)e^(2x) + C1 cos 3x + C2 sin 3x
3. y = −(1/2) sin x + (1/2)x²e^x + C1e^x + C2xe^x
5. y = (5/2)xe^(2x) + (39/8)e^(2x) + (9/8)e^(−2x)
9. y = (1/7)e^(4x) + Ce^(−3x)

Section 2.8

1. y = (4/13)e^(2x) + C1 cos 3x + C2 sin 3x
3. y = −xe^(3x) + xe^(3x) ln x + C1e^(3x) + C2xe^(3x), with the first term unnecessary since it can be absorbed into the last term
5. y = −(1/3) cos x ln(sec x) + (1/3)x sin x + C1 cos x + C2 sin x
7. y = (5/2)xe^(2x) + (39/8)e^(2x) + (9/8)e^(−2x)
9. y = 2 + C1e^(−x) + C2 cos x + C3 sin x
11. y = (4/15)x⁴ + C1x + C2/x

Section 3.2

1. φ(x) = c0 Σ_{m=0}^∞ x^(2m)/m! = c0 e^(x²)
3. φ1(x) = 1 + 3x², φ2(x) = x + (1/2)x³ − (3/40)x⁵ + (9/560)x⁷ − + · · ·
5. φ1(x) = 1 + Σ_{m=1}^∞ [(−1)^m 4^m · (−1) · 3 · · · (4m − 5)/(2m)!] x^(2m)
   φ2(x) = x + Σ_{m=1}^∞ [(−1)^m 4^m · 1 · 5 · · · (4m − 3)/(2m + 1)!] x^(2m+1)
7. φ(x) = x + (1/2)x⁴
9. φ(x) = 1 − (1/2)x² + (1/24)x⁴ + (1/20)x⁵ + · · ·
11. φ1(x) = 1 + Σ_{m=1}^∞ [(−α²)(2² − α²) · · · ((2m − 2)² − α²)/(2m)!] x^(2m)
    φ2(x) = x + Σ_{m=1}^∞ [(1² − α²)(3² − α²) · · · ((2m − 1)² − α²)/(2m + 1)!] x^(2m+1)
    If α is a nonnegative even integer, the first series terminates when m = (α + 2)/2. If α is a positive odd integer, the second series terminates when m = (α + 1)/2.

Section 3.3

1. y = C1x⁴ + C2x⁴ ln|x|
3. y = C1x⁻¹ cos(2 ln x) + C2x⁻¹ sin(2 ln x)
5. y = −6x^(2/3) + 6x
7. y = x² ln x − (1/5)x² + C1x⁻³ + C2x², with the second term unnecessary since it can be absorbed into the last term
9. y = (1/10)x³ − (3/2)x + (12/5)x^(1/2)
11. y = (1/8)x⁵ + Cx⁻³
13. y = C1x² + C2x³

Section 3.4

1. φ(x) = x^(1/2) Σ_{k=0}^∞ x^k/(k!)²
3. φ1(x) = 1 + (1/8)x² + (1/320)x⁴ + · · ·
   φ2(x) = x^(2/3)(1 + (1/16)x² + (1/896)x⁴ + · · · )
5. φ1(x) = x^(1/2)(1 − (1/3!)x² + (1/5!)x⁴ − + · · · ) = x^(−1/2)(x − (1/3!)x³ + (1/5!)x⁵ − + · · · ) = x^(−1/2) sin x
   φ2(x) = x^(−1/2)(1 − (1/2!)x² + (1/4!)x⁴ − + · · · ) = x^(−1/2) cos x
7. φ(x) = Cx⁻¹e^(−x)



Section 4.2

1. [y1′; y2′] = [2  −3x; 5  x²][y1; y2]
3. Partial answer: Wronskian equals 7e^(−x)

Section 4.3

1. y1 = −C1e^(3x) − 2C2e^(4x)
   y2 = C1e^(3x) + 3C2e^(4x)
3. y1 = −3C1e^(−2x) + C3e^(5x)
   y2 = 4C1e^(−2x) + C3e^(5x)
   y3 = 4C1e^(−2x) + C2e^(5x)
5. y1 = 3e^(2x) + 4e^(6x)
   y2 = −6e^(2x) + 8e^(6x)
7. y = C1e^(−7x) + C2e^(2x)
9. (a) y1 = 2.5 − 2.5e^(−0.12t)
       y2 = 2.5 + 2.5e^(−0.12t)
   (b) y1 = y2 = 2.5

Section 4.4

1. y1 = C1e^x cos x + C2e^x sin x
   y2 = −C2e^x cos x + C1e^x sin x
3. y1 = C3 cos x − C2 sin x
   y2 = −C2 cos x − C3 sin x
   y3 = C1 + C2 cos x + C3 sin x
5. y1 = −12xe^(2x) − 15e^(2x) + 4C1e^(2x) + C2e^(3x)
   y2 = −15xe^(2x) − 15e^(2x) + 5C1e^(2x) + C2e^(3x)
7. y1 = (5/2)xe^x − (5/4)e^x − (1/3)e^(4x) − C1e^x + C2e^(3x)
   y2 = −(5/2)xe^x − (5/4)e^x + (2/3)e^(4x) + C1e^x + C2e^(3x)
9. y = 2x + 3 + C1e^x + C2e^(2x)
13. (d) s1 = C1 cos t + C2 sin t + 2C3 cos(√6 t) + 2C4 sin(√6 t)
        s2 = 2C1 cos t + 2C2 sin t − C3 cos(√6 t) − C4 sin(√6 t)
15. y1 = C1e^(2x) + C2xe^(2x)
    y2 = (−C1 − C2)e^(2x) − C2xe^(2x)
17. y1 = −3 − (C1 + 2C2) cos 2x + (2C1 − C2) sin 2x
    y2 = 3 + C1 cos 2x + C2 sin 2x

Section 4.5

1. X(0.3) = 1.0312, Y(0.3) = 0.332
3. X(0.3) = 1.05039635, Y(0.3) = 0.35387961
5. Partial answer: X(2) = 16.09375, Y(0.3) = 22.8125
7. Partial answer: X(2) = 47.61581572, Y(0.3) = 55.75423838
9. Partial answer: X(0.75) = 0.7559
11. Partial answer: X(0.75) = 0.79569082

Section 5.2

1. (a) x = C1e^(4t) + C2e^(−4t)
       y = 2C1e^(4t) − 2C2e^(−4t)
   (b) Parts of hyperbolas −4x² + y² = C (lines when C = 0)
3. Partial answer: trajectories are points (a, 0) and parts of parabolas y = x² + C

Section 5.3

1. unstable
3. unstable
5. asymptotically stable

Section 5.4

1. (a) x′ = (1 + ε)x − 2y for ε > 0 (among many possibilities)
       y′ = 5x − y
   (b) x′ = (1 − ε)x − 2y for ε > 0 (among many possibilities)
       y′ = 5x − y
   (c) x′ = x − (2 − ε)y for ε < 9/5 (among many possibilities)
       y′ = 5x − y
3. (a) (0, 0), (1, −2), (−1, −2)
   (c) all critical points unstable

5. (b) stable but not asymptotically stable


Index

almost linear system, 139
annihilator method, 62, 116
autonomous, 42, 126
auxiliary polynomial, 46

basis, 56, 107
Bernoulli equation, 19
Bessel equation, 94

characteristic polynomial, 46, 65, 106
Chebyshev equation, 86
critical point, 133

diagonalizable matrix, 105
differential operator, 50, 63
dimension, 56, 103, 113
direction field, 39, 127

eigenvalue, 105
eigenvector, 106
Euler equation, 87
Euler's method, 29
exact equation, 12
Existence Theorem, 23, 52, 60

Frobenius (method of), 91

general solution, 7, 56, 63, 103

Hermite equation, 86
Heun's method, 34
homogeneous equation (1st order), 19
homogeneous equation (nth order), 45

improved Euler's method, 34
indicial polynomial, 88, 94
integral curve, 39

integral equation, 19
integrating factor, 7, 18

Laguerre equation, 97
Legendre equation, 82
linear independence, 51, 54, 60, 102

ordinary point, 86

particular solution, 63
phase plane, 126
Picard iterations, 20

regular singular point, 87
Runge-Kutta method, 34

separable equation, 10
singular point, 86
slope field, 39
span, 56
step size, 30
successive approximations, 20

trajectory, 39, 126
transition matrix, 105

undetermined coefficients, 68, 116
Uniqueness Theorem, 27, 54, 60

variation of parameters, 69, 115

Wronskian, 54, 60, 102
