math2011 notes

Chapter 1

Vectors

(Reference: SHE, Chapter 12; FYA, Chapter 5.) We begin with a little revision of first-year vector algebra and geometry.

1.1 Norms and Dot products

The basic idea of a vector is something that has both size and direction; a scalar, on the other hand, just has size. In order to keep track of what is a vector and what is a scalar, we use a standard convention:

Notation: A vector is always typed in bold, like $\mathbf{x}$, or bold and underlined with a tilde, $\underset{\sim}{\mathbf{x}}$. When writing a vector by hand, get into the habit of underlining it, $\underline{x}$, or putting an arrow over it, $\overrightarrow{AB}$.

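In this course we will be looking at 2 or 3 dimensional real vectors, members of the vector spaces $\mathbb{R}^2$ or $\mathbb{R}^3$. Such a vector ought always to be written as a column:

\[ \mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \quad\text{or}\quad \mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}. \]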

Convention: I will always use the convention that a general vector called, say, $\mathbf{x}$ will have components called $x_1, x_2, \ldots$ etc (or $x^1, x^2, \ldots$; the index position will not matter in this course). If I want $\mathbf{x}$ to have different components I will always say what they are.

Definition 1.1 The scalar or dot product of two vectors $\mathbf{a}, \mathbf{b} \in \mathbb{R}^n$ is given by

\[ \mathbf{a} \cdot \mathbf{b} = a_1 b_1 + \cdots + a_n b_n. \]

Definition 1.2 The length or norm of a vector $\mathbf{a} \in \mathbb{R}^n$ is given by

\[ \|\mathbf{a}\| = \sqrt{\mathbf{a} \cdot \mathbf{a}} = \sqrt{a_1^2 + a_2^2 + \cdots + a_n^2}. \]


2 math2011, 2010

We sometimes use a single line for the norm, $|\mathbf{a}|$, which is of course the same symbol as the absolute value used for scalars. If you are careful to underline vectors, this shouldn't cause confusion.

Definition 1.3 A unit vector is any vector of length 1.

Given any non-zero vector $\mathbf{a}$, a unit vector in the same direction as $\mathbf{a}$ is $\dfrac{\mathbf{a}}{\|\mathbf{a}\|}$, which we often write as $\hat{\mathbf{a}}$.

Definition 1.4 For two non-zero vectors $\mathbf{a}, \mathbf{b} \in \mathbb{R}^n$ the angle $\theta$ between them is given by

\[ \cos\theta = \frac{\mathbf{a} \cdot \mathbf{b}}{\|\mathbf{a}\|\,\|\mathbf{b}\|}. \]

I’ve called this a definition, but in the case of R2 it is of course a theorem (or part of

the definition of the dot product), see FYA chapter 5.

Definition 1.5 Two non-zero vectors $\mathbf{a}, \mathbf{b} \in \mathbb{R}^n$ are orthogonal if $\mathbf{a} \cdot \mathbf{b} = 0$, or equivalently the angle between them is $\pi/2$.

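Definition 1.6 The standard basis of $\mathbb{R}^3$ consists of the vectors

\[ \mathbf{i} = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \quad \mathbf{j} = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \quad\text{and}\quad \mathbf{k} = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}, \]

which point along the coordinate axes.

These vectors are unit vectors and pairwise orthogonal: they are an orthonormal basis of $\mathbb{R}^3$.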

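Example 1.1 Let A, B and C be the points with coordinates (1, 2, 3), (2, 0, 1) and (3, 3, −2) respectively.

a) Find the vectors $\overrightarrow{OA}$, $\overrightarrow{AB}$, $\overrightarrow{AC}$ and $\overrightarrow{BC}$.

b) Find the distance of A from the origin.

c) Find a unit vector in the direction of $\overrightarrow{AB}$.

d) Show that $\overrightarrow{AB}$, $\overrightarrow{AC}$ and $\overrightarrow{BC}$ are linearly dependent.

e) Find the angle between $\overrightarrow{AB}$ and $\overrightarrow{AC}$.

f) Find the component of $\overrightarrow{AB}$ in the direction of $\overrightarrow{AC}$, and the component of $\overrightarrow{AB}$ perpendicular to $\overrightarrow{AC}$.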
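As a quick numerical cross-check of Example 1.1 (not a substitute for working it by hand), the parts can be sketched in plain Python. The helper names `sub`, `dot`, `norm` and `scale` are illustrative choices made here, not notation from these notes.

```python
import math

def sub(p, q):
    """Componentwise difference p - q."""
    return tuple(pi - qi for pi, qi in zip(p, q))

def dot(u, v):
    """Dot product a1*b1 + ... + an*bn (Definition 1.1)."""
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(u):
    """Length of u (Definition 1.2)."""
    return math.sqrt(dot(u, u))

def scale(k, u):
    """Scalar multiple k*u."""
    return tuple(k * ui for ui in u)

O, A, B, C = (0, 0, 0), (1, 2, 3), (2, 0, 1), (3, 3, -2)

# a) displacement vectors
OA, AB, AC, BC = sub(A, O), sub(B, A), sub(C, A), sub(C, B)

dist_OA = norm(OA)                                # b) distance of A from O
unit_AB = scale(1 / norm(AB), AB)                 # c) unit vector along AB
dependent = sub(AC, AB) == BC                     # d) BC = AC - AB
theta = math.acos(dot(AB, AC) / (norm(AB) * norm(AC)))  # e) angle in radians
parallel = scale(dot(AB, AC) / dot(AC, AC), AC)   # f) component of AB along AC
perpendicular = sub(AB, parallel)                 # f) component of AB perpendicular to AC
```

Part d) works because $\overrightarrow{BC} = \overrightarrow{AC} - \overrightarrow{AB}$ for any three points, so the three vectors are always linearly dependent.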


1.2 Cross and Triple Products

The cross (and triple) products are only defined for R3. They do not work in any other

dimensions.

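Definition 1.7 The cross product (or vector product) of two vectors $\mathbf{a}, \mathbf{b} \in \mathbb{R}^3$ is given by

\[ \mathbf{a} \times \mathbf{b} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix}; \quad\text{i.e.}\quad \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix} \times \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix} = \begin{pmatrix} a_2 b_3 - a_3 b_2 \\ a_3 b_1 - a_1 b_3 \\ a_1 b_2 - a_2 b_1 \end{pmatrix}. \]

The magnitude of the cross product is given by

\[ \|\mathbf{a} \times \mathbf{b}\| = \|\mathbf{a}\|\,\|\mathbf{b}\|\,|\sin\theta| \]

where $\theta$ is the angle between the two vectors.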

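Proposition 1.1 (Properties of the cross product) Let $\mathbf{a}, \mathbf{b} \in \mathbb{R}^3$. Then

a) $\mathbf{a} \times \mathbf{b} = \mathbf{0}$ if and only if $\mathbf{a}$ and $\mathbf{b}$ are parallel.

b) $\mathbf{a} \times \mathbf{b}$ is orthogonal to both $\mathbf{a}$ and $\mathbf{b}$.

c) $\mathbf{a} \times \mathbf{b} = -\mathbf{b} \times \mathbf{a}$.

d) $\|\mathbf{a} \times \mathbf{b}\|$ is the area of the parallelogram with sides $\mathbf{a}$ and $\mathbf{b}$.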

For proofs see SHE or FYA.

The cross product $\mathbf{a} \times \mathbf{b}$ is a vector, so if we had a third vector, $\mathbf{c}$, we could take either the dot or cross product of $\mathbf{c}$ with $\mathbf{a} \times \mathbf{b}$.

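Definition 1.8 The scalar triple product of three vectors $\mathbf{a}, \mathbf{b}, \mathbf{c} \in \mathbb{R}^3$ is

\[ [\mathbf{a}, \mathbf{b}, \mathbf{c}] = (\mathbf{a} \times \mathbf{b}) \cdot \mathbf{c} = \mathbf{a} \cdot (\mathbf{b} \times \mathbf{c}) = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}. \]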

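The absolute value of the scalar triple product $[\mathbf{a}, \mathbf{b}, \mathbf{c}]$ is the volume of the parallelepiped with $\mathbf{a}$, $\mathbf{b}$ and $\mathbf{c}$ as edges (FYA). It follows that $[\mathbf{a}, \mathbf{b}, \mathbf{c}] = 0$ if and only if $\mathbf{a}$, $\mathbf{b}$ and $\mathbf{c}$ are coplanar.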

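Definition 1.9 The vector triple product of three vectors $\mathbf{a}, \mathbf{b}, \mathbf{c} \in \mathbb{R}^3$ is $(\mathbf{a} \times \mathbf{b}) \times \mathbf{c}$.

The scalar triple product of three vectors is a scalar and the vector triple product is a vector.

The scalar triple product is alternating, that is, if you swap any two vectors you change the sign. The vector triple product $(\mathbf{a} \times \mathbf{b}) \times \mathbf{c}$ changes sign when $\mathbf{a}$ and $\mathbf{b}$ are swapped, but in general not when $\mathbf{c}$ is swapped with one of the others.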


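Proposition 1.2 For $\mathbf{a}, \mathbf{b}, \mathbf{c} \in \mathbb{R}^3$,

\[ (\mathbf{a} \times \mathbf{b}) \times \mathbf{c} = (\mathbf{a} \cdot \mathbf{c})\,\mathbf{b} - (\mathbf{b} \cdot \mathbf{c})\,\mathbf{a}. \]

The most straightforward way to prove this is just to write out both sides, but it is rather tedious.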

Example 1.2 Let A, B and C be the points with coordinates (1, 0, 1), (2, −1, 3) and (0, 1, 2) respectively. Find

a) The area of the triangle OAB.

b) The volume of the parallelepiped with sides $\overrightarrow{OA}$, $\overrightarrow{OB}$ and $\overrightarrow{OC}$.
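The two quantities in Example 1.2 follow directly from Proposition 1.1 d) and the scalar triple product. The sketch below is one way to check a hand calculation; the function names `cross` and `dot` are chosen here for illustration.

```python
import math

def cross(u, v):
    """Cross product in R^3, component formula of Definition 1.7."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(ui * vi for ui, vi in zip(u, v))

OA, OB, OC = (1, 0, 1), (2, -1, 3), (0, 1, 2)

n = cross(OA, OB)

# a) the triangle OAB is half of the parallelogram with sides OA and OB
triangle_area = 0.5 * math.sqrt(dot(n, n))

# b) the volume is the absolute value of the scalar triple product [OA, OB, OC]
volume = abs(dot(n, OC))
```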

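Example 1.3 Prove that $\mathbf{a} \times (\mathbf{b} \times \mathbf{c}) = (\mathbf{c} \cdot \mathbf{a})\,\mathbf{b} - (\mathbf{b} \cdot \mathbf{a})\,\mathbf{c}$.

We have

\[ \mathbf{a} \times (\mathbf{b} \times \mathbf{c}) = -(\mathbf{b} \times \mathbf{c}) \times \mathbf{a} = -(\mathbf{b} \cdot \mathbf{a})\,\mathbf{c} + (\mathbf{c} \cdot \mathbf{a})\,\mathbf{b} \]

by Proposition 1.2 with the symbols changed.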

Note how different this is to the original!
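A numerical spot check is no proof, but it is a cheap way to catch a sign error in either identity. The sketch below evaluates both sides of Proposition 1.2 and of Example 1.3 for one arbitrarily chosen triple of integer vectors; the helper `lin` is a name introduced here.

```python
def cross(u, v):
    """Cross product in R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Dot product."""
    return sum(ui * vi for ui, vi in zip(u, v))

def lin(alpha, u, beta, v):
    """Linear combination alpha*u + beta*v."""
    return tuple(alpha * ui + beta * vi for ui, vi in zip(u, v))

# Arbitrary test vectors (an assumption of this sketch, not from the notes).
a, b, c = (1, 2, 3), (-2, 0, 5), (4, -1, 2)

# Proposition 1.2: (a x b) x c = (a.c) b - (b.c) a
lhs1 = cross(cross(a, b), c)
rhs1 = lin(dot(a, c), b, -dot(b, c), a)

# Example 1.3: a x (b x c) = (c.a) b - (b.a) c
lhs2 = cross(a, cross(b, c))
rhs2 = lin(dot(c, a), b, -dot(b, a), c)

# The two bracketings give different vectors: the cross product is not associative.
not_associative = lhs1 != lhs2
```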

1.3 Lines

We shall now recall how to obtain equations for lines in $\mathbb{R}^2$ and $\mathbb{R}^3$. The commonest problem you will face here is to parameterise a line between two points. Recall that parameterising is a way of labelling the points on the line by giving the two (or three) coordinates of all the points on the line in terms of functions of one variable only.

Suppose we have two points A and B with position vectors $\mathbf{a}$ and $\mathbf{b}$. Then the line through A and B has parametric vector equation

\[ \mathbf{x}(t) = \mathbf{a} + t(\mathbf{b} - \mathbf{a}), \quad -\infty < t < \infty. \qquad (1.1) \]

See figure 1.1.

[Figure 1.1: Line through two points A and B, with position vectors $\mathbf{a}$ and $\mathbf{b}$, drawn in the xy-plane]

What this means is that as the parameter t varies from −∞ to ∞, the point with position vector $\mathbf{x}(t)$ traces out the line through A and B. If our coordinates are x, y and z, then we have the three parametric equations

x = a1 + t(b1 − a1), y = a2 + t(b2 − a2), z = a3 + t(b3 − a3).

We see that the three coordinates are functions of the one variable (t).

If we only want the line segment from A to B, then we cut down the range of the parameter: the line segment from A to B has parametric vector equation

\[ \mathbf{x}(t) = \mathbf{a} + t(\mathbf{b} - \mathbf{a}), \quad 0 \le t \le 1. \qquad (1.2) \]

So as t varies from 0 to 1, the points with position vector $\mathbf{x}(t)$ go along the line starting at A (t = 0) and ending at B (t = 1).

Example 1.4 Let A and B be the points with coordinates (1,2,3) and (2,0,1) respectively.

a) Find a parametric vector equation for the line through A and B.

b) Find a parametric vector equation for the straight line path from A to B.

c) Find points on the line AB that are 5 units from A.

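Example 1.5 Do the following two lines intersect?

\[ \mathbf{r}(t) = \begin{pmatrix} 0 \\ 8 \\ 3 \end{pmatrix} + t \begin{pmatrix} -1 \\ 3 \\ 1 \end{pmatrix}, \qquad \mathbf{s}(t) = \begin{pmatrix} -2 \\ 8 \\ -7 \end{pmatrix} + t \begin{pmatrix} 1 \\ -1 \\ 3 \end{pmatrix}. \]

If they do, find the point(s) of intersection and the angle at which they intersect.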
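One point worth making explicit for Example 1.5: the two lines need not reach a common point at the same parameter value, so the parameter of the second line should be renamed (here to u) before equating coordinates. The following is a minimal sketch of that approach, solving the first two coordinate equations by Cramer's rule and then testing the third; the variable names are choices made here.

```python
import math

r0, d1 = (0, 8, 3), (-1, 3, 1)    # r(t) = r0 + t*d1
s0, d2 = (-2, 8, -7), (1, -1, 3)  # s(u) = s0 + u*d2, with its own parameter u

def dot(p, q):
    """Dot product."""
    return sum(pi * qi for pi, qi in zip(p, q))

# r(t) = s(u) is equivalent to t*d1 - u*d2 = s0 - r0.
b = tuple(si - ri for si, ri in zip(s0, r0))

# Solve the first two coordinate equations for (t, u) by Cramer's rule.
det = d1[0] * (-d2[1]) - (-d2[0]) * d1[1]
t = (b[0] * (-d2[1]) - (-d2[0]) * b[1]) / det
u = (d1[0] * b[1] - d1[1] * b[0]) / det

# The lines meet only if the third coordinate equation holds as well.
intersects = math.isclose(t * d1[2] - u * d2[2], b[2])
point = tuple(ri + t * di for ri, di in zip(r0, d1))

# Cosine of the angle between the direction vectors (Definition 1.4).
cos_angle = dot(d1, d2) / math.sqrt(dot(d1, d1) * dot(d2, d2))
```

If the cosine comes out negative, the angle between the direction vectors is obtuse; the acute angle between the lines is then π minus that angle.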

1.4 Planes

Lines are one dimensional objects: we parameterise them with one parameter. Planes are two dimensional: we need two parameters.

A typical problem in parameterising planes is to find the plane through three points A, B and C that are not collinear. A parametric vector equation of this plane is given by

\[ \mathbf{x}(\lambda, \mu) = \mathbf{a} + \lambda(\mathbf{b} - \mathbf{a}) + \mu(\mathbf{c} - \mathbf{a}), \quad \lambda, \mu \in \mathbb{R}. \]

See figure 1.2. There are other options: you can swap $\mathbf{b}$ and $\mathbf{a}$ for example.

If we restrict the parameters to [0, 1], then this gives us a parameterisation of the parallelogram with corners at A, B, C and the point D with position vector $\mathbf{b} + \mathbf{c} - \mathbf{a}$.

From the parametric equation of the plane we get the Cartesian equation using the cross product. Suppose the plane is $\mathbf{x}(\lambda, \mu) = \mathbf{a} + \lambda\mathbf{b} + \mu\mathbf{c}$. Then the vector $\mathbf{n} = \mathbf{b} \times \mathbf{c}$ is normal to the plane. The point-normal form of the plane is then

\[ \mathbf{n} \cdot (\mathbf{x} - \mathbf{a}) = 0 \qquad (1.3) \]

where $\mathbf{x}$ is the general point on the plane. Multiplying equation (1.3) out will give the Cartesian equation.

[Figure 1.2: Plane through 3 points A, B and C, showing the vectors $\mathbf{b} - \mathbf{a}$ and $\mathbf{c} - \mathbf{a}$ and the fourth corner D of the parallelogram]

Example 1.6 Let A, B and C be the points with coordinates (1, 2, 3), (2, 0, 1) and (3, 3, −2) respectively. For the plane through A, B and C:

a) give a parametric equation;

b) find a normal direction;

c) find the Cartesian equation.

Now parameterise the triangle ABC.
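For Example 1.6 b) and c), the recipe of equation (1.3) can be sketched numerically: take the two edge vectors from A, cross them to get a normal, and expand the point-normal form. The names `n` and `d` below are illustrative, with the Cartesian equation written as n1 x + n2 y + n3 z = d.

```python
def sub(p, q):
    """Componentwise difference p - q."""
    return tuple(pi - qi for pi, qi in zip(p, q))

def dot(u, v):
    """Dot product."""
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(u, v):
    """Cross product in R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

A, B, C = (1, 2, 3), (2, 0, 1), (3, 3, -2)

# Direction vectors of the plane: x = a + lambda*(b - a) + mu*(c - a).
AB, AC = sub(B, A), sub(C, A)

n = cross(AB, AC)   # a normal direction
d = dot(n, A)       # n.(x - a) = 0 expands to n.x = n.a

def on_plane(p):
    """True if the point p satisfies the Cartesian equation n.p = d."""
    return dot(n, p) == d
```

Any non-zero multiple of `n` (and the same multiple of `d`) describes the same plane, which is the sense in which the Cartesian equation is unique up to scaling.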

Example 1.7 Let Π be the plane with Cartesian equation 2x − 3y + z = 2. Find a parametric equation for Π and a normal to Π.

Note that although there are infinitely many different parametric and point-normal equations for a plane, there is really only one Cartesian equation: any others you might write down are multiples of any particular one.

Chapter 2

Curves

2.1 Vector valued functions

In general, a function is a rule that assigns to each element in the domain an element in the codomain.

Definition 2.1 A vector valued function or vector function, is simply a function f : A → B whose codomain, B, is a set of vectors.

We have already seen examples of these: the parametric vector forms of a line or a plane are really vector valued functions.

This course is all about vector functions of real variables, and typically the domain and codomain are subsets of $\mathbb{R}$, $\mathbb{R}^2$ or $\mathbb{R}^3$. The mathematics generalises of course, but by the time you've mastered functions from or into $\mathbb{R}^2$, you've done the hard part: higher dimensions just involve more notation (and unfortunately few pictures).

A typical vector function, say $\mathbf{f} : \mathbb{R}^2 \to \mathbb{R}^2$, can be written as

\[ \mathbf{f}(u, v) = \begin{pmatrix} f_1(u, v) \\ f_2(u, v) \end{pmatrix} = f_1(u, v)\,\mathbf{i} + f_2(u, v)\,\mathbf{j}. \]

We call the (real valued) functions $f_1$ and $f_2$ the component functions of $\mathbf{f}$.

2.2 Basic definitions for curves

Intuitively, we think of a curve as a one dimensional configuration, like the path of a moving particle, or as something obtained by bending and twisting a straight line. Just like a line, we parameterise a curve using one parameter.

Definition 2.2 A curve in $\mathbb{R}^n$ is a vector valued function of one variable, $\mathbf{c} : I \to \mathbb{R}^n$, where I is some (possibly infinite) interval in $\mathbb{R}$.

We often think of the range of this function as the curve, although this is not strictly correct. When parameterising a curve it is not enough to just give the formulas: we need to know the domain too.

Definition 2.3 For a curve $\mathbf{c} : I \to \mathbb{R}^n$ the range $\mathbf{c}(I)$ is called the trace of $\mathbf{c}$.

I’ll often give future definitions for plane curves (curves in R2) for convenience, but

everything I say and define will work in all other dimensions with simple modifications.

Example 2.1 Draw the trace of the plane curve $\mathbf{r} : I \to \mathbb{R}^2$ defined by

\[ \mathbf{r}(t) = (2t^2, 1 + t), \quad t \in [-1, 2]. \]

The trace of $\mathbf{r}$ is a set of points in the plane

\[ \{(x, y) : x = 2t^2,\ y = 1 + t,\ -1 \le t \le 2\} \]

which is a parabola, see figure 2.1.

[Figure 2.1: trace of a curve in the xy-plane, with the points for t = −1, t = 0, t = 1 and t = 2 marked]

If f is a function of x with domain [a, b], then the graph of f is the set of points

\[ \{(x, y) : y = f(x);\ x \in [a, b]\}. \]

This is also the trace of a plane curve, which we can define as $\mathbf{r} : [a, b] \to \mathbb{R}^2$ given by $\mathbf{r}(x) = (x, f(x))$, $a \le x \le b$, using x as a parameter.

Note that although all graphs give you curves, there are curves that are not graphs. The example in figure 2.1 is one such. You need a unique y value for each x value before you can call (the trace of) a curve a graph.

Example 2.2 Sketch the curve in $\mathbb{R}^2$ given by $\mathbf{C} : [0, 2\pi] \to \mathbb{R}^2$ defined by

\[ \mathbf{C}(t) = 2\cos t\,\mathbf{i} + \sin t\,\mathbf{j} \]

and describe its trace.

Example 2.3 Sketch the curve in $\mathbb{R}^3$ given by $\mathbf{r} : [-\pi, 3\pi] \to \mathbb{R}^3$ defined by

\[ \mathbf{r}(t) = \cos t\,\mathbf{i} + \sin t\,\mathbf{j} + t\,\mathbf{k} \]

and describe its trace.

Most of the curves we will see and use in this course will be lines or circles. We expect you to be able to parameterise lines, line segments, circles and arcs of circles. We looked at lines and line segments earlier.

For parameterising circles and arcs of circles we usually use trig functions. In general, if a circle has centre $\mathbf{c}$ and radius R, then it can be parameterised as

\[ \mathbf{r}(t) = \begin{pmatrix} c_1 + R\cos t \\ c_2 + R\sin t \end{pmatrix} = \mathbf{c} + R\cos t\,\mathbf{i} + R\sin t\,\mathbf{j}, \quad 0 \le t \le 2\pi. \]

(Sometimes it is more convenient to use $-\pi \le t \le \pi$; this gives the same trace, but is technically a different curve.)

For arcs of circles, you merely restrict the range.

Example 2.4 Sketch and parameterise the semi-circle in the upper half plane with ends at (3, 0) and (−1, 0).
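One possible parameterisation for Example 2.4, offered as a sketch rather than the only answer: the ends (3, 0) and (−1, 0) are opposite ends of a diameter, so the centre is (1, 0) and the radius is 2, and restricting the circle formula to 0 ≤ t ≤ π keeps y ≥ 0. The function name `semicircle` is chosen here.

```python
import math

def semicircle(t):
    """Point at parameter t, 0 <= t <= pi, on the circle of centre (1, 0) and radius 2."""
    return (1 + 2 * math.cos(t), 2 * math.sin(t))

# Sample the arc; it runs anticlockwise from (3, 0) at t = 0 to (-1, 0) at t = pi.
samples = [semicircle(k * math.pi / 10) for k in range(11)]
```

Running t from π down to 0 gives the same trace with the opposite orientation.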

2.3 Properties of curves

Definition 2.4 Let $\mathbf{r} : [a, b] \to \mathbb{R}^2$ be a curve defined by $\mathbf{r}(t) = (x(t), y(t))$. If both component functions, x and y, are continuous functions from [a, b] to $\mathbb{R}$ then we call $\mathbf{r}$ continuous.

All the examples we've seen so far are continuous.

Definition 2.5 Let $\mathbf{r} : [a, b] \to \mathbb{R}^2$ be a curve. The points $\mathbf{r}(a)$ and $\mathbf{r}(b)$ are called the end points of $\mathbf{r}$. We say $\mathbf{r}$ is closed if $\mathbf{r}(a) = \mathbf{r}(b)$.

Example 2.5 The parabola in example 2.1 has endpoints (2, 0) and (8, 3) but is not closed. The curve in example 2.2 is closed, as $\mathbf{C}(0) = \mathbf{C}(2\pi) = (2, 0)$.

Definition 2.6 A point P on the trace of a curve $\mathbf{r} : I \to \mathbb{R}^2$ is a multiple point if the trace passes through P more than once, i.e. there are at least two distinct values $t_1, t_2 \in I$ with $\mathbf{r}(t_1) = \mathbf{r}(t_2) = P$. A curve with no multiple points (excluding any end points) is called simple.

A continuous plane curve that is both simple and closed is called a Jordan curve: example 2.2 is an obvious example. There is a famous, obvious but very hard theorem called the Jordan curve theorem that says that the trace of a Jordan curve always splits the plane into exactly two pieces, one bounded (the inside) and one unbounded (the outside).

A parameterisation of a plane curve has the added bonus of giving the curve an orientation, that is a direction to the trace. Consider a curve $\mathbf{r} : [a, b] \to \mathbb{R}^2$. We say $\mathbf{r}$ (or more accurately, its trace) is oriented in one way as t varies from a to b and the other way as t varies from b to a. If we are denoting the curve by $\mathbf{r}$ when oriented in one way, when oriented in the other way we denote it as $-\mathbf{r}$. However, in the case of a Jordan curve, we always assume the curve is oriented in such a way that the inside is always on the left as one moves along the curve in the direction given by increasing t.

2.4 Calculus with curves

2.4.1 Limits

The limit of a curve $\mathbf{r} : I \to \mathbb{R}^2$ is defined by taking the limits of its component functions as follows.

Suppose $\mathbf{r}(t) = (x(t), y(t))$ and $\lim_{t \to a} x(t) = \alpha$ and $\lim_{t \to a} y(t) = \beta$ exist, then

\[ \lim_{t \to a} \mathbf{r}(t) = (\alpha, \beta). \]

Limits of vector functions obey the same rules as limits of real-valued functions. The formal definition is with the usual ε and δ, with a minor change of notation: $\lim_{t \to a} \mathbf{r}(t) = (\alpha, \beta)$ means that for any ε > 0 there is a δ > 0 such that if $0 < |t - a| < \delta$ then $\|\mathbf{r}(t) - (\alpha, \beta)\| < \varepsilon$.

If a curve is continuous at a (i.e. the component functions are continuous at a), then we have $\lim_{t \to a} \mathbf{r}(t) = \mathbf{r}(a)$.

2.4.2 Differentiation

The derivative of a curve is defined in the same way as for real valued functions:

Definition 2.7 If $\mathbf{r} : (a, b) \to \mathbb{R}^2$ then its derivative $\mathbf{r}' : (a, b) \to \mathbb{R}^2$ is defined by

\[ \mathbf{r}'(t) = \lim_{h \to 0} \frac{\mathbf{r}(t + h) - \mathbf{r}(t)}{h} \]

if the limit exists. In this case we say $\mathbf{r}$ is differentiable or smooth at t.

We also have to consider curves that are piecewise smooth: such a curve is made up of a finite sequence of smooth pieces.

Suppose points P and Q have position vectors $\mathbf{r}(t)$ and $\mathbf{r}(t + h)$ respectively, then $\overrightarrow{PQ}$ represents the vector $\mathbf{r}(t + h) - \mathbf{r}(t)$, which can therefore be regarded as a secant vector. If h > 0, the scalar multiple $(1/h)(\mathbf{r}(t + h) - \mathbf{r}(t))$ has the same direction as $\mathbf{r}(t + h) - \mathbf{r}(t)$. As h → 0, it appears that this vector approaches a vector that lies on the tangent line. This is why we have the following:

[Figure 2.2: tangent, showing the points $\mathbf{r}(t)$ and $\mathbf{r}(t + h)$ on a curve]

Definition 2.8 The vector $\mathbf{r}'(t)$ is called the tangent vector or the velocity of the curve defined by $\mathbf{r}$ at the point P with position vector $\mathbf{r}(t)$, provided that $\mathbf{r}'(t)$ exists and is not equal to zero. The length of the velocity, $\|\mathbf{r}'(t)\|$, is called the speed. The tangent line to the curve at the point P is defined to be the line through P parallel to the tangent vector $\mathbf{r}'(t)$.

If we have $\mathbf{r}(t) = (x(t), y(t))$, then it is easy to see that $\mathbf{r}'(t) = (x'(t), y'(t))$. So on a practical level the derivative of a curve is obtained by simply differentiating its component functions.

We can also define the unit tangent vector to a curve, which is

\[ \mathbf{T}(\mathbf{r}(t)) \equiv \hat{\mathbf{r}}'(t) = \frac{\mathbf{r}'(t)}{\|\mathbf{r}'(t)\|}. \]

With the unit tangent vector we can say what it means for a curve to be a line: a line is a curve whose unit tangent vector is constant. In other words, a line is a curve whose direction is constant.

Example 2.6 Find the points of intersection of the two curves given by

\[ \mathbf{r}_1(t) = (t^2 - 1,\ t,\ 1 + t), \qquad \mathbf{r}_2(t) = (-t,\ t^2 - 1,\ t). \]

Also, find the angles at which they intersect.
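As with intersecting lines, the two curves in Example 2.6 may pass through a common point at different parameter values, so the second parameter is renamed s below. Equating third and second coordinates gives 1 + t = s and t = s² − 1, hence t² + t = 0; the sketch checks the resulting candidate pairs and measures the angle between the tangent vectors there. All function names are choices made for this sketch.

```python
import math

def r1(t):
    return (t * t - 1, t, 1 + t)

def r2(s):
    return (-s, s * s - 1, s)

def dr1(t):
    """Velocity of r1, obtained by differentiating each component."""
    return (2 * t, 1, 1)

def dr2(s):
    """Velocity of r2."""
    return (-1, 2 * s, 1)

def angle(u, v):
    """Angle between two non-zero vectors (Definition 1.4)."""
    d = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return math.acos(d / (nu * nv))

# Candidate parameter pairs (t, s) from t^2 + t = 0 and s = 1 + t.
pairs = [(0, 1), (-1, 0)]

points = [r1(t) for t, s in pairs if r1(t) == r2(s)]
angles = [angle(dr1(t), dr2(s)) for t, s in pairs]
```

The angle at which two curves intersect is taken here to be the angle between their tangent vectors at the common point.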

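Suppose $\mathbf{r}_1$ and $\mathbf{r}_2$ are differentiable vector functions, c is a scalar, and f is a differentiable real-valued function. Then

a) $\dfrac{d}{dt}[\mathbf{r}_1(t) + \mathbf{r}_2(t)] = \mathbf{r}_1'(t) + \mathbf{r}_2'(t)$

b) $\dfrac{d}{dt}[c\,\mathbf{r}_1(t)] = c\,\mathbf{r}_1'(t)$

c) $\dfrac{d}{dt}[f(t)\,\mathbf{r}_1(t)] = f'(t)\,\mathbf{r}_1(t) + f(t)\,\mathbf{r}_1'(t)$

d) $\dfrac{d}{dt}[\mathbf{r}_1(t) \cdot \mathbf{r}_2(t)] = \mathbf{r}_1'(t) \cdot \mathbf{r}_2(t) + \mathbf{r}_1(t) \cdot \mathbf{r}_2'(t)$ ('·' is the dot product of vectors.)

e) $\dfrac{d}{dt}[\mathbf{r}_1(t) \times \mathbf{r}_2(t)] = \mathbf{r}_1'(t) \times \mathbf{r}_2(t) + \mathbf{r}_1(t) \times \mathbf{r}_2'(t)$ ('×' is the cross product of vectors.)

f) $\dfrac{d}{dt}[\mathbf{r}_1(f(t))] = f'(t)\,\mathbf{r}_1'(f(t))$ (chain rule)

Note that the rules for the derivative of a dot or cross product are exactly the same as the usual product rule you learnt in High School, but applied to vector functions. For the cross product the order of the factors must be kept, since swapping them changes the sign.

The tangent vector to a curve is of course another vector valued function, and so is another curve. So we can differentiate the derivative: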

Definition 2.9 The second derivative of a curve, $\mathbf{r}''$, is called the acceleration of the curve $\mathbf{r}$.

Example 2.7 Define the plane curve $\mathbf{r}$ by $\mathbf{r}(t) = 2\,\mathbf{i} + \cos t\,\mathbf{j} + 2\sin t\,\mathbf{k}$. Is $\mathbf{r}$ continuous at t = 0? Find the velocity and acceleration at $t = \pi/2$.

Example 2.8 A particle is travelling on a circle with constant speed. Show that the velocity and acceleration are orthogonal.

Since a circle is in a plane, we might as well assume that the particle is in the xy-plane, and the circle centered at the origin. Thus the path of the particle is given by

\[ \mathbf{r}(t) = a\,(\cos\theta(t)\,\mathbf{i} + \sin\theta(t)\,\mathbf{j}) \]

for some function θ(t), where a is the radius. The velocity will be

\[ \mathbf{r}'(t) = a\theta'\,(-\sin\theta(t)\,\mathbf{i} + \cos\theta(t)\,\mathbf{j}), \]

and the speed is hence $\|\mathbf{r}'\| = a|\theta'|$. As this is constant, there must be constants ω, θ0 with θ = ωt + θ0 (we could assume θ0 = 0 by changing the starting point of t). The acceleration is

\[ \mathbf{r}''(t) = a\omega^2\,(-\cos\theta(t)\,\mathbf{i} - \sin\theta(t)\,\mathbf{j}) = -\omega^2\,\mathbf{r}, \]

and it is easy to see that $\mathbf{r}' \cdot \mathbf{r}''$ is always zero, proving the result.

2.4.3 Integration

We will be looking more closely at using curves in integration in the second half of the session. But for the moment I will just point out that in the same way that we differentiated the components of a curve, we can integrate them to get another vector valued function. So if $\mathbf{r} : [a, b] \to \mathbb{R}^2$ is parameterised by $\mathbf{r}(t) = (x(t), y(t))$ then we can define

\[ \int \mathbf{r}\,dt = \left( \int x\,dt,\ \int y\,dt \right), \qquad \int_a^b \mathbf{r}\,dt = \left( \int_a^b x\,dt,\ \int_a^b y\,dt \right) \]

as indefinite and definite integrals of the curve.

We do not usually do this sort of integration-by-component for vector functions. But one integral that is quite common is arc length: the integral of the speed.

Definition 2.10 For a curve $\mathbf{r} : [a, b] \to \mathbb{R}^n$, the arc length s from $\mathbf{r}(a)$ to $\mathbf{r}(t)$ is given by

\[ s = \int_a^t \|\mathbf{r}'(u)\|\,du. \]

Except for lines and circles, and some special curves, arc length is very difficult to calculate explicitly. Even for a curve as simple as an ellipse we need to invent new functions for the arc length.
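When no closed form is available, the integral in Definition 2.10 can still be approximated numerically. The sketch below uses composite Simpson's rule, which is a choice made here and not a method from these notes; `arc_length`, `circle_velocity` and `ellipse_velocity` are illustrative names, and the velocity functions are differentiated by hand.

```python
import math

def arc_length(velocity, a, b, n=1000):
    """Approximate the integral of the speed over [a, b] by Simpson's rule (n even)."""
    def speed(u):
        return math.sqrt(sum(c * c for c in velocity(u)))

    h = (b - a) / n
    total = speed(a) + speed(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * speed(a + k * h)
    return total * h / 3

def circle_velocity(u):
    """r'(u) for the circle r(u) = (2 cos u, 2 sin u)."""
    return (-2 * math.sin(u), 2 * math.cos(u))

def ellipse_velocity(u):
    """r'(u) for the ellipse of Example 2.2, C(u) = (2 cos u, sin u)."""
    return (-2 * math.sin(u), math.cos(u))

circle_length = arc_length(circle_velocity, 0, 2 * math.pi)
ellipse_length = arc_length(ellipse_velocity, 0, 2 * math.pi)
```

For the circle the speed is constant, so the result should agree with 2πR = 4π up to rounding; for the ellipse the value can only be obtained numerically or through elliptic integrals.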

Example 2.9 For the curve in $\mathbb{R}^3$ given by

\[ \mathbf{r}(t) = 2\cos t\,\mathbf{i} + 3\sin t\,\mathbf{j} + \frac{\sqrt{15}}{4}\cos 2t\,\mathbf{k}, \quad t \in \mathbb{R}: \]

a) Find the velocity and acceleration.

b) Show that the velocity and acceleration are orthogonal at $t = \dfrac{n\pi}{2}$, $n \in \mathbb{Z}$.

c) Find the length of the curve from t = 0 to t = 2π.

d) Write down the unit tangent vector at $t = \dfrac{\pi}{6}$.

e) Sketch the curve and indicate the unit tangent vector found in part d).

2.5 Plotting curves: Maple and/or matlab

Although we will seldom consider any curve more complicated than a conic section (parabola, ellipse, hyperbola) in this course, you will occasionally come across other curves. With an unfamiliar curve the best way to get a picture of it is to get the computer to draw it.

You will all be familiar with Maple or matlab, or possibly both, and they can both plot curves for you very easily. On the Blackboard page I will put some Maple and matlab files showing or reminding you how to plot curves.

As an example of a trickier curve, the tractrix, parameterised by

\[ \mathbf{r} = (t - \tanh t,\ \operatorname{sech} t), \quad t \in \mathbb{R}, \]

is in figure 2.3.

(The tractrix has the interesting property that the distance from any point P on the curve to the point where the tangent at P intersects the x-axis is always 1. So if you had a reluctant dog on a lead of length one placed one unit up the y-axis and walked off down the x-axis dragging it behind you so that the lead stays taut, the dog's trajectory would be the tractrix.)
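The unit-distance property of the tractrix can be checked numerically with a few lines of Python, as a sketch. The velocity is computed by hand from the parameterisation, and the function names are chosen here; t = 0 is excluded because the velocity vanishes there.

```python
import math

def tractrix(t):
    """Point on the tractrix r(t) = (t - tanh t, sech t)."""
    return (t - math.tanh(t), 1 / math.cosh(t))

def tractrix_velocity(t):
    """r'(t) = (1 - sech^2 t, -sech t tanh t), differentiating each component."""
    sech = 1 / math.cosh(t)
    return (1 - sech * sech, -sech * math.tanh(t))

def tangent_x_intercept(t):
    """x-coordinate where the tangent line at r(t) meets the x-axis (t != 0)."""
    x, y = tractrix(t)
    vx, vy = tractrix_velocity(t)
    # Follow P + lam*v until the y-coordinate is zero: lam = -y / vy.
    return x - y * vx / vy

def lead_length(t):
    """Distance from the point r(t) to the x-intercept of its tangent line."""
    x, y = tractrix(t)
    return math.hypot(x - tangent_x_intercept(t), y)
```

Algebraically the intercept is at (t, 0), so the parameter t is exactly the position of the walker on the x-axis, and the distance is $\sqrt{\tanh^2 t + \operatorname{sech}^2 t} = 1$.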

[Figure 2.3: Tractrix]

Chapter 3

Surfaces

3.1 Basic definitions for surfaces

When I say "surface", you should picture something like a ball or a piece of paper: something essentially two dimensional living in a three dimensional space.

Definition 3.1 A surface in R3 can be described

(i) explicitly as the graph of a function of two variables, say z = f(x, y) or

(ii) implicitly as the set of all points satisfying ϕ(x, y, z) = 0 or

(iii) parametrically by equations of the form

x = x(u, v), y = y(u, v), z = z(u, v)

The first two methods are the most intuitive; the third one is the most useful mathematically, as it builds in a way of labelling the points of the surface. Any surface defined in one way can always be defined by one of the others, but usually only in patches, as we will see. Most surfaces cannot be completely described in all three ways.

Example 3.1 Describe the plane through the points (1, 0, 0), (0, 1, 0) and (0, 0, 1) in the three ways listed in definition 3.1.

Example 3.2 Let S be the unit sphere, the set of points satisfying $x^2 + y^2 + z^2 - 1 = 0$. Explain why this cannot be described as a graph and give a parametric description of it.

There are many ways of parameterising a sphere, but the one we will see most often is

x = sin v cos u, y = sin v sin u, z = cos v (0 ≤ u ≤ 2π, 0 ≤ v ≤ π)
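A quick way to gain confidence in a parameterisation like this one is to substitute it into the implicit equation. The sketch below does this numerically over a grid of parameter values; the function name `sphere` is an illustrative choice.

```python
import math

def sphere(u, v):
    """Point on the unit sphere; 0 <= u <= 2*pi (longitude), 0 <= v <= pi (angle from the z-axis)."""
    return (math.sin(v) * math.cos(u),
            math.sin(v) * math.sin(u),
            math.cos(v))

def implicit(p):
    """Left-hand side of the implicit equation x^2 + y^2 + z^2 - 1 = 0."""
    x, y, z = p
    return x * x + y * y + z * z - 1

# Largest residual over an 11 x 11 grid of parameter values.
grid = [(i * 2 * math.pi / 10, j * math.pi / 10) for i in range(11) for j in range(11)]
max_residual = max(abs(implicit(sphere(u, v))) for u, v in grid)
```

Note that v = 0 gives the north pole (0, 0, 1) for every value of u, so the labelling of points is not one-to-one at the poles.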


3.2 Sketching Surfaces

In general it is difficult to sketch surfaces well. Mostly it is enough to be able to see what the underlying solid looks like when we sight along the axes, and/or look at the intersections of the surface and the various coordinate planes. This is where the computer can really help, but we expect you to be able to sketch surfaces without it.

The other problem you face is that drawing in three dimensions needs some awareness of perspective: how an object actually looks. I will discuss this in more detail in the lectures.

3.2.1 Graphs of functions of two variables

For functions of one variable, $f : \mathbb{R} \to \mathbb{R}$, the graph is a subset of $\mathbb{R}^2$, namely the set of points $\{(x, y) : y = f(x)\}$.

[Figure 3.1: graph of a function of one variable, with a and b marked on the x-axis and f(a) and f(b) on the y-axis]

For functions of two variables, $f : \mathbb{R}^2 \to \mathbb{R}$, the graph is a subset of $\mathbb{R}^3$, namely the set of points $\{(x, y, z) : z = f(x, y)\}$, see figure 3.2.

[Figure 3.2: 2 dimensional graph, showing the point (a, b) in the xy-plane and the value f(a, b)]

A graph automatically gives us a parameterisation: we use x and y as the two parameters and simply set z = f(x, y).

The surfaces we will see most in this course are planes, spheres and cones, and we expect you to recognise these when you see them. But this leaves the question of what you do if you do not recognise a surface.


One approach is to look at the curves formed from the intersections of the surface with the coordinate planes (planes x = 0, y = 0 and z = 0), or other planes, for example planes parallel to the coordinate planes.

Example 3.3 What sort of shape is the surface given by $x^2 + y^2 - z^2 = 1$?

We note that the intersection of the surface with z = 0 is the curve $x^2 + y^2 = 1$, which is of course the unit circle. The intersection with y = 0 is $x^2 - z^2 = 1$, which you ought to recognise as a hyperbola. In fact, the intersection of the surface and any plane of the form y = ax (which contains the z-axis) is a congruent hyperbola.

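[Figure: two panels. Left: the section z = 0, the circle $x^2 + y^2 = 1$ in the xy-plane. Right: the hyperbola $x^2 - z^2 = 1$ in the xz-plane.]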

The intersection of the surface and any plane with z constant is also a circle, but of radius $\sqrt{1 + z^2}$. This tells us the surface is formed from revolving the hyperbola $x^2 - z^2 = 1$ around the z-axis to form a surface known as a hyperboloid of one sheet, see figure 3.3.

Example 3.4 Sketch the surface given by the graph of f(x, y) = x2 + y2.

3.2.2 Contour Lines

A second method of graphing a surface is to use contour lines, or level curves. The values of the function f are indicated by drawing in a plane the curve with equation f(x, y) = c for indicative values of c. We actually see this representation every night on the news when we look at isobars on the weather map, which give lines of equal pressure.

For example, with the hyperboloid of example 3.3 we get circles centered at the origin:

[Figure: contours for z = 0, 1, 2, 3 in the xy-plane]

Note that the circles get progressively further apart as z increases, and the lowest level curve is in the centre.

[Figure 3.3: Hyperboloid of 1 sheet]

Example 3.5 Let $f : \mathbb{R}^2 \to \mathbb{R}$ be given by $f(x, y) = x^2 - y^2$. Plot some level curves of the graph of f and sketch the surface.

3.2.3 Projection

(see SHE, Section 14.2)

Another useful tool for getting a picture of a surface is the projection. Those of you who have done technical or engineering drawing will be used to this idea:

Definition 3.2 If S is a surface in $\mathbb{R}^3$ then its projection on the xy-plane is the set

\[ \{(x, y) \in \mathbb{R}^2 \mid \text{for some } z,\ (x, y, z) \in S\} \]

i.e., the projection is what we see if we sight the surface along the z-axis.

Example 3.6 Consider the surface given by $\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} + \dfrac{z^2}{c^2} = 1$, which is an ellipsoid. Its projection on the xy-plane is the ellipse together with its interior, $\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} \le 1$.

When we come to integrating over two- and three-dimensional regions, we will often want to know the intersections of curves or surfaces and the projections of these intersections onto the coordinate planes.

Example 3.7 Find the projection of the intersection of the surfaces $z = x^2 + 2y^2$ and $z = x + y + 1$ onto the xy-plane.

Example 3.8 Sketch the solid bounded above by the sphere centered at the origin and with radius $\sqrt{2}$ and below by the surface $z = x^2 + y^2$. Find the projection of the intersection of the surfaces onto the xy-plane.

3.3 Plotting surfaces with Maple and/or matlab

There are several functions for plotting in three dimensions in Maple. The basic one is plot3d, but there is also the useful implicitplot3d. They both have options that allow for various colouring effects and display of contours. I will try to demonstrate some of these in lectures. See also the Blackboard page for examples of these two functions, and others.

For matlab, there are the functions surf, surfc and others.

You should make an effort to play around with the 3-d plotting routines of one (or both) of these packages to help strengthen your three-dimensional visualisation powers.


Drawing in three dimensions

This is adapted from Thomas & Finney's Calculus and Analytic Geometry, 7th Ed, Addison Wesley, 1988.

1 Use a Pencil

An obvious comment, but you will need to erase lines either to move them or make them look hidden, so a pencil is a must.

2 Breaking or Hiding Lines

When one line passes behind another, break it to show it does not touch and that part is hidden.

[Figure: three sketches of lines AB and CD, labelled "crossing", "AB behind CD" and "CD behind AB"]

A line or curve that is hidden should be drawn dashed (see examples below).

3 Perspective

In general you should draw an object as if it is below you and to your left. Also, note that right angles will not necessarily look like right angles (especially important with axes, see section 4), circles will look like ellipses, but a sphere will still look circular, see section 7.

You should also be aware that parallel lines appear to converge.

4 Axes

If you actually look at three mutually orthogonal lines in space, then none of the angles they make actually appear to be right angles.

[Figure: two sketches of x, y, z axes, labelled "Impossible" and "Correct"]

You should also make sure the (apparent) angles between the axes are reasonably large.

5 Planes parallel to axes

Draw planes parallel to the coordinate planes as if they were rectangles with sides parallel to the coordinate axes. To do this, the rectangles will actually be drawn as parallelograms. Do not forget to break and dash lines where appropriate.

[Figure: two sketches of planes parallel to coordinate planes, each drawn as a parallelogram against x, y, z axes]

A "contact dot" sometimes helps show where lines (axes in these cases) meet the plane. Note also that I have drawn the axes so they do not touch the parallelogram representing the plane.

6 Planes intersecting all axes

Drawing a plane that intersects all three axes is done with the following steps.

(a) Sketch the axes, mark the intercepts and connect the intercepts to form a triangle.

(b) Complete the triangle into a parallelogram. There are several ways of doing this; you want to pick one so that none of its sides are parallel to your axes. If necessary, shift an intercept point.

(c) Draw a second parallelogram with sides parallel to the small one: this will be your sketch of the plane.

(d) Darken exposed lines, dash hidden ones and rub out the small parallelogram and triangle, but keep the intercept points.

[Figure: four panels (a) to (d), each showing x, y, z axes with the three intercept points, illustrating the steps above]


7 Spheres

When drawing a sphere start with the (circular) outline of the sphere and add in the (elliptical) equator (back half dashed). If you need axes, add them next, breaking lines where needed. Note that the "north pole" will not be at the highest point of the outline, but just below it.

[Figure: two sketches, labelled "sphere first" and "Axes later", the second with x, y, z axes added]

8 Other surfaces

With other surfaces, for example cylinders, paraboloids, hyperboloids, it is also useful to draw the surface first and then add in the axes.

John Steele

Chapter 4

Partial Derivatives and Continuity

4.1 Limits and Continuity

We have already seen the idea of limits applied to curves, and the idea was exactly the same as for functions from $\mathbb{R}$ to $\mathbb{R}$: we can make distances in the codomain arbitrarily small (< ε) by making distances in the domain small (< δ). All that changed in the formal definition was that we used the length of vectors for the codomain (ε) distance ($\ldots \|\mathbf{r}(t) - (\alpha, \beta)\| < \varepsilon$).

For functions from $\mathbb{R}^2$ to $\mathbb{R}$, say, the same basic idea is used, but now we need lengths of vectors for the domain (δ) distance. So

Definition 4.1 We write

\[ \lim_{(x,y) \to (a,b)} f(x, y) = \ell \]

and we say that the limit of f(x, y) as (x, y) approaches (a, b) is ℓ, if we can make the values of f(x, y) as close to ℓ as we like by taking the point (x, y) sufficiently close to the point (a, b) but not equal to (a, b).

The formal definition is: $\lim_{(x,y) \to (a,b)} f(x, y) = \ell$ means that for any ε > 0 there is a δ > 0 such that if $0 < \|(x, y) - (a, b)\| < \delta$ then $|f(x, y) - \ell| < \varepsilon$.

All the limit rules we met before (uniqueness, sums, products etc) work in this case; the same proofs work with minor changes of notation.

NOTE: the very important difference to the case of a one-dimensional domain is that in two dimensions we can approach the point (a, b) from infinitely many directions compared to only two (left or right) for a one variable function. This makes limits of functions of several variables more subtle than limits of functions of one variable.

In particular, if f has a limit at (a, b), then however we make (x, y) approach the point (a, b), f(x, y) approaches ℓ. If different ways of approaching (a, b) give different answers, then there is no limit.

Example 4.1 Investigate the function $f : \mathbb{R}^2 \setminus \{(0, 0)\} \to \mathbb{R}$ given by

$$f(x, y) = \frac{xy^3}{x^4 + y^4}.$$

Look at how this function behaves along lines through the origin and explain what this tells us about the behaviour of $f$ as $(x, y)$ tends to $(0, 0)$.
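As a rough numerical illustration of approaching a point along different lines, the sketch below evaluates $f(x, y) = xy^3/(x^4 + y^4)$ along $y = mx$ for a few slopes. The helper name `along_line` and the chosen slopes are illustrative assumptions, not part of the notes; on the line $y = mx$ the formula reduces to $m^3/(1 + m^4)$, which is what the code compares against.

```python
def f(x, y):
    # The function of Example 4.1, defined away from the origin.
    return x * y**3 / (x**4 + y**4)

def along_line(m, x):
    # Value of f at the point (x, mx) on the line y = m x (hypothetical helper).
    return f(x, m * x)

slopes = [0.0, 0.5, 1.0, 2.0, -1.0]
for m in slopes:
    # Move towards the origin along the line and compare with m^3 / (1 + m^4).
    values = [along_line(m, x) for x in (1e-1, 1e-3, 1e-6)]
    predicted = m**3 / (1 + m**4)
    print(f"m = {m:5.1f}  values = {values}  predicted = {predicted}")
```

The values are constant along each line but differ from line to line, which is the situation described in the note above about different directions of approach.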

Example 4.2 Find the limit $\lim_{(x,y)\to(0,0)} f(x, y)$, if it exists, for the function

$$f(x, y) = \frac{\sin(x^2 + y^2)}{x^2 + y^2}.$$

Definition 4.2 A function $f : D \subseteq \mathbb{R}^2 \to \mathbb{R}$ of two variables is called continuous at $(a, b) \in D$ if

$$\lim_{(x,y)\to(a,b)} f(x, y) = f(a, b).$$

We say $f$ is continuous on $D$ if $f$ is continuous at every point $(a, b) \in D$.

4.2 Partial Derivatives

(See SHE, Section 14.4)

Now we wish to investigate the rate of change in each direction, $x$ or $y$, of a function $f$ depending on $x$ and $y$.

Definition 4.3 The (first) partial derivative, $f_x$, with respect to $x$, of a function $f$ of two variables is the function of two variables obtained by holding $y$ constant and differentiating with respect to $x$:

$$f_x(x, y) = \lim_{h\to 0} \frac{f(x + h, y) - f(x, y)}{h} \quad\text{or}\quad f_x(a, b) = \lim_{x\to a} \frac{f(x, b) - f(a, b)}{x - a}.$$

Similarly for $f_y$ we have

$$f_y(x, y) = \lim_{h\to 0} \frac{f(x, y + h) - f(x, y)}{h} \quad\text{or}\quad f_y(a, b) = \lim_{y\to b} \frac{f(a, y) - f(a, b)}{y - b}.$$

There are several different notations for partial derivatives. For example, if $z = f(x, y)$ we can write

$$f_x(x, y) = f_x = \frac{\partial f}{\partial x} = \frac{\partial}{\partial x} f(x, y) = \frac{\partial z}{\partial x} = D_x f = f_{,1} = D_1(f).$$

My personal preference is for $f_x$, $f_{,1}$ or $D_1(f)$, but you will meet all of these in different texts. I will discuss the problems with these notations in lectures, as well as looking at what the partial derivatives mean geometrically.

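Definition 4.4 If $z = f(x, y)$ we can write the second partial derivatives of $f$ as

$$(f_x)_x = f_{xx} = f_{,11} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial x^2} = \frac{\partial^2 z}{\partial x^2}$$

$$(f_x)_y = f_{xy} = f_{,12} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial^2 z}{\partial y\,\partial x}$$

$$(f_y)_x = f_{yx} = f_{,21} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial^2 z}{\partial x\,\partial y}$$

$$(f_y)_y = f_{yy} = f_{,22} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial y^2} = \frac{\partial^2 z}{\partial y^2}$$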

Example 4.3 Calculate all first and second partial derivatives for the function $f : \mathbb{R}^2 \to \mathbb{R}$ given by

$$f(x, y) = x^4 + 2x^2y^3 + \sin(xy).$$

I’ve been very careful with the order in the mixed second derivatives $f_{xy}$ and $f_{yx}$, but the following very useful result means we do not need to be (usually):

Theorem 4.1 (Clairaut’s Theorem) Suppose $f$ is defined on a disc $D$ that contains the point $(a, b)$. If the functions $f_{xy}$ and $f_{yx}$ are both continuous on $D$, then

$$f_{xy}(a, b) = f_{yx}(a, b).$$

In particular, for a function built up from polynomials, sine, cosine, the exponential, etc., all derivatives of all orders are continuous and so we get equality of the mixed derivatives.

Example 4.4 Calculate $f_{xy}(0, 0)$ and $f_{yx}(0, 0)$ if

$$f(x, y) = \begin{cases} \dfrac{xy(x^2 - y^2)}{x^2 + y^2} & (x, y) \neq (0, 0) \\[2mm] 0 & x = y = 0. \end{cases}$$
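The following is a minimal numerical sketch, not a substitute for the limit calculation, of how the two mixed derivatives of this function can be estimated at the origin with nested central differences. The step sizes `h` and `k` and the helper names `fx`, `fy` are assumptions chosen for illustration; the ordering follows the convention above, $f_{xy} = (f_x)_y$.

```python
def f(x, y):
    # Piecewise function of Example 4.4.
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y * (x**2 - y**2) / (x**2 + y**2)

def fx(x, y, h=1e-7):
    # Central-difference estimate of the first partial derivative f_x.
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def fy(x, y, h=1e-7):
    # Central-difference estimate of the first partial derivative f_y.
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

k = 1e-3  # outer step, much larger than the inner step h
fxy_00 = (fx(0.0, k) - fx(0.0, -k)) / (2 * k)  # (f_x)_y at the origin
fyx_00 = (fy(k, 0.0) - fy(-k, 0.0)) / (2 * k)  # (f_y)_x at the origin

print("f_xy(0,0) is approximately", fxy_00)
print("f_yx(0,0) is approximately", fyx_00)
```

The two estimates come out with opposite signs, so the hypotheses of Theorem 4.1 must fail for this function at the origin.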

Chapter 5

Chain Rule and Applications

5.1 Chain Rule

Reference: SHE, Section 15.3

Suppose you are in a large room with a heater in one corner, so that the temperature in the room varies with position. How would you calculate the rate of change of temperature as you walked around the room? Well, we have temperature $T$ a function of coordinates in the room, $x$ and $y$, and $x$ and $y$ will both be functions of time $t$. So the temperature you feel is a function of time, $T(x(t), y(t))$, and we want $\dfrac{dT}{dt}$. This is a case of “function of a function”, and so we need a chain rule.

If we were to keep $y$ constant at $y_0$, say, then we’d have a one-variable problem. As $y$ is being kept fixed, we have

$$\frac{d}{dt}\, T(x(t), y_0) = \frac{\partial T}{\partial x}(x, y_0)\,\frac{dx}{dt},$$

and similarly if $x$ is kept fixed we would have

$$\frac{d}{dt}\, T(x_0, y(t)) = \frac{\partial T}{\partial y}(x_0, y)\,\frac{dy}{dt}.$$

What if both x and y are varying? Well, we add these two results together:

Theorem 5.1 Suppose that

a) $f$ is a differentiable function of two variables $x$ and $y$,

b) $x$ and $y$ are differentiable functions of a single variable $t$.

Then if $F(t) = f(x(t), y(t))$ we have

$$F'(t) = f_x(x(t), y(t))\,x'(t) + f_y(x(t), y(t))\,y'(t).$$

This is sometimes written as

$$\frac{dz}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt}$$

where $z$ denotes $F(t)$.

Example 5.1 Suppose $z = e^{x^2 + y}$, $x = \cos t$ and $y = \sin t$. Find $\dfrac{dz}{dt}$ at $t = 0$.

On the other hand, it could be that your position in the room depended on two variables, $s$ and $t$. Now it makes sense to ask about the rate of change of $T$ with respect to $s$ or $t$, which are of course partial derivatives.

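Theorem 5.2 Suppose that

a) $f$ is a differentiable function of two variables $x$ and $y$,

b) $x$ and $y$ are differentiable functions of two variables $s$ and $t$.

Then if

$$F(s, t) = f(x(s, t), y(s, t)),$$

we have

$$\frac{\partial F}{\partial s} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial s}$$

$$\frac{\partial F}{\partial t} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial t}$$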

Example 5.2 Suppose $f(x, y) = x^2y + y^3$, $x = r\cos\theta$ and $y = r\sin\theta$. Find $\dfrac{\partial f}{\partial r}$ and $\dfrac{\partial f}{\partial \theta}$ using the chain rule.
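A quick way to sanity-check a chain rule calculation like this one is to compare it with finite differences. The sketch below does that for $f(x, y) = x^2y + y^3$ in polar coordinates; the function names `chain_rule` and `finite_difference`, the sample point and the step size are assumptions made for illustration.

```python
import math

def f(x, y):
    return x**2 * y + y**3

def f_x(x, y):
    return 2 * x * y

def f_y(x, y):
    return x**2 + 3 * y**2

def chain_rule(r, theta):
    # Theorem 5.2 with (s, t) = (r, theta), x = r cos(theta), y = r sin(theta).
    x, y = r * math.cos(theta), r * math.sin(theta)
    df_dr = f_x(x, y) * math.cos(theta) + f_y(x, y) * math.sin(theta)
    df_dtheta = f_x(x, y) * (-r * math.sin(theta)) + f_y(x, y) * (r * math.cos(theta))
    return df_dr, df_dtheta

def finite_difference(r, theta, h=1e-6):
    # Differentiate the composite function directly, for comparison.
    g = lambda rr, tt: f(rr * math.cos(tt), rr * math.sin(tt))
    df_dr = (g(r + h, theta) - g(r - h, theta)) / (2 * h)
    df_dtheta = (g(r, theta + h) - g(r, theta - h)) / (2 * h)
    return df_dr, df_dtheta

exact = chain_rule(2.0, 0.7)
approx = finite_difference(2.0, 0.7)
print(exact, approx)
```

The two pairs agree to within the accuracy of the finite differences, which is all such a check can show.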

Example 5.3 Suppose that $f : \mathbb{R} \to \mathbb{R}$ is differentiable. Show that $u(x, y) = f(x/y)$ is a solution to the partial differential equation

$$x\frac{\partial u}{\partial x} + y\frac{\partial u}{\partial y} = 0.$$

One way of keeping track of chain rules is to use a diagram like figure 5.1, which corresponds to the case of theorem 5.2. We link every function to all those variables that directly determine it. Each path starting from the topmost function ($F$ in figure 5.1) gives you a product of partial derivatives, and you add up all the paths from the top variable to the appropriate bottom (independent) variable.

These chain rules (and associated diagrams) can be extended to functions of more variables. Suppose we have $f(w, x, y, z)$ with $w$, $x$, $y$ and $z$ dependent on $r$, $s$ and $t$. Then, for example,

$$\frac{\partial f}{\partial r} = \frac{\partial f}{\partial w}\frac{\partial w}{\partial r} + \frac{\partial f}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial r} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial r},$$

with diagram in figure 5.2.

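[Figure 5.1: chain rule. Tree diagram with $F$ at the top, linked to $x$ and $y$ (edges labelled $\partial f/\partial x$ and $\partial f/\partial y$), each of which is linked to $s$ and $t$ (edges labelled $\partial x/\partial s$, $\partial x/\partial t$, $\partial y/\partial s$, $\partial y/\partial t$).]

[Figure 5.2: chain rule. Tree diagram with $f$ at the top, linked to $w$, $x$, $y$, $z$, each of which is linked to $r$, $s$, $t$.]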

5.2 Differentiation and Integration

As you should all know, the Fundamental Theorem of Calculus tells us that differentiating an integral with respect to its upper limit gives you the integrand:

$$\frac{d}{dx}\int_0^x \frac{\sin t}{t}\,dt = \frac{\sin x}{x}.$$

The (one variable) chain rule can be applied to tell us that, for example,

$$\frac{d}{dx}\int_0^{x^2} \frac{\sin t}{t}\,dt = \frac{\sin(x^2)}{x^2}\times 2x = \frac{2\sin(x^2)}{x}.$$

With the next result and the (several variable) chain rule we can find

$$\frac{d}{dx}\int_0^{x^2} \frac{\sin xt}{t}\,dt,$$

where now $x$ occurs in the integrand too.

Theorem 5.3 Let $g$ be a real valued function of two variables $x$ and $y$. Then if both $g$ and $\dfrac{\partial g}{\partial x}$ are continuous,

$$\frac{d}{dx}\int_a^b g(x, y)\,dy = \int_a^b \frac{\partial g}{\partial x}(x, y)\,dy.$$

We are “passing the derivative through the integral sign”, swapping the order of two infinite processes.

Example 5.4 Find

$$\frac{d}{dx}\int_0^1 \frac{\sin xt}{t}\,dt \quad\text{and}\quad \frac{d}{dx}\int_0^{x^2} \frac{\sin xt}{t}\,dt.$$
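As a hedged numerical check of differentiating under the integral sign, the sketch below compares a finite-difference derivative of each integral with the right-hand side given by Theorem 5.3 (and, for the variable upper limit, by the several variable chain rule as well). The composite Simpson rule `simpson`, the helper names and the sample point `x = 1.3` are assumptions introduced only for this illustration.

```python
import math

def simpson(g, a, b, n=2000):
    # Composite Simpson rule on [a, b]; n must be even.
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def integrand(x, t):
    # sin(x t) / t, extended by its limit x at t = 0.
    return x if t == 0.0 else math.sin(x * t) / t

def I_fixed(x):
    # Integral over the fixed interval [0, 1].
    return simpson(lambda t: integrand(x, t), 0.0, 1.0)

def I_moving(x):
    # Integral over [0, x^2], where x also appears in the integrand.
    return simpson(lambda t: integrand(x, t), 0.0, x * x)

def derivative(F, x, h=1e-5):
    # Central-difference derivative of a one-variable function.
    return (F(x + h) - F(x - h)) / (2 * h)

x = 1.3
lhs_fixed = derivative(I_fixed, x)
rhs_fixed = simpson(lambda t: math.cos(x * t), 0.0, 1.0)

lhs_moving = derivative(I_moving, x)
# Integrand at the upper limit times d(x^2)/dx, plus the integral of dg/dx.
rhs_moving = (math.sin(x**3) / x**2) * 2 * x + simpson(lambda t: math.cos(x * t), 0.0, x * x)

print(lhs_fixed, rhs_fixed)
print(lhs_moving, rhs_moving)
```

Here the upper limit $x^2$ is treated as a second variable $u$, so that the total derivative is the partial derivative with respect to $x$ plus the partial derivative with respect to $u$ times $du/dx$.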

5.2.1 Integrals with a parameter

We can use differentiation under the integral sign to find some interesting integrals. For example, it is easy to show that

$$\int_{-\infty}^{\infty} \frac{dx}{x^2 + a^2} = \frac{\pi}{a}$$

(for $a > 0$) using a standard (inverse tan) integral. If we differentiate both sides with respect to $a$ then we get

$$\int_{-\infty}^{\infty} \frac{-2a\,dx}{(x^2 + a^2)^2} = -\frac{\pi}{a^2}.$$

From this we get

$$\int_{-\infty}^{\infty} \frac{dx}{(x^2 + a^2)^2} = \frac{\pi}{2a^3},$$

which is much harder to do with standard integrals.

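For a rather more interesting example, consider the following (a typical exam question):

Example 5.5 Given that

$$\int_{-\infty}^{\infty} \frac{\cos(2x)}{x^2 + a^2}\,dx = \frac{\pi e^{-2a}}{a},$$

calculate

$$\int_{-\infty}^{\infty} \frac{\cos(2x)}{(x^2 + 4)^2}\,dx.$$

As an exercise, find

$$\int_{-\infty}^{\infty} \frac{\cos(2x)}{(x^2 + 9)^3}\,dx.$$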

5.3 Implicit Functions

Theorem 5.4 Suppose that an equation of the form $F(x, y) = 0$ defines $y$ implicitly as a differentiable function of $x$, that is, $y = f(x)$, where $F(x, f(x)) = 0$ for all $x$ in the domain of $f$. If $F$ is differentiable then

$$\frac{\partial F}{\partial x}\frac{dx}{dx} + \frac{\partial F}{\partial y}\frac{dy}{dx} = 0 \quad\text{so}\quad \frac{dy}{dx} = -\frac{F_x}{F_y},$$

provided $F_y \neq 0$.

Example 5.6 A curve is given implicitly by $F(x, y) = 3x^2 + y^2 + 2xy - 4x = 0$. When is it vertical?

Chapter 6

Gradient and Directional Derivatives

Reference: SHE, Sections 15.1, 15.2

We claimed previously that we could interpret the first partial derivatives of $z = f(x, y)$, namely $f_x(\mathbf{x}_0)$ and $f_y(\mathbf{x}_0)$, as the rate of change of $f$ at $\mathbf{x}_0 = x_0\mathbf{i} + y_0\mathbf{j}$ in the $\mathbf{i}$ and $\mathbf{j}$ directions respectively. Consider a unit vector $\mathbf{u}$ in $\mathbb{R}^2$. Then the equation $\mathbf{x} = \mathbf{x}_0 + t\mathbf{u}$ represents a straight line in $\mathbb{R}^2$ through the point $\mathbf{x}_0$ in direction $\mathbf{u}$. The vertical plane through this line meets the surface $z = f(x, y)$ in a curve $z = F(t)$ where $F(t) = f(\mathbf{x}_0 + t\mathbf{u})$.

[Figure 6.1: Directional Derivative. Axes $x$, $y$, $z$, with the point $\mathbf{x}_0$ marked.]

Definition 6.1 We define the directional derivative of $f$ at $\mathbf{x}_0 = x_0\mathbf{i} + y_0\mathbf{j}$ with respect to the unit vector $\mathbf{u}$ to be the gradient (slope) of the curve in figure 6.1 at $t = 0$:

$$\frac{\partial f}{\partial \mathbf{u}}(\mathbf{x}_0) = f'_{\mathbf{u}}(\mathbf{x}_0) = \lim_{t\to 0} \frac{1}{t}\bigl(f(\mathbf{x}_0 + t\mathbf{u}) - f(\mathbf{x}_0)\bigr) = \lim_{t\to 0} \frac{F(t) - F(0)}{t} = F'(0).$$

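If we let $\mathbf{u} = (u_1, u_2)$, then with

$$F(t) = f(x(t), y(t)) = f(x_0 + t u_1,\ y_0 + t u_2)$$

we have

$$\frac{dF}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt}$$

by the chain rule. Since $dx/dt = u_1$ and $dy/dt = u_2$, we get

$$f'_{\mathbf{u}}(\mathbf{x}_0) = \frac{\partial f}{\partial x}(\mathbf{x}_0)\,u_1 + \frac{\partial f}{\partial y}(\mathbf{x}_0)\,u_2 = \nabla f \cdot \mathbf{u}$$

where

$$\nabla f = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j}$$

is called the gradient of $f$.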

Definition 6.2 Suppose $f$ is a function of two variables and $f_x$, $f_y$ are continuous at $(a, b)$. Then the gradient of $f$ at $(a, b)$ is defined as the vector

$$f_x(a, b)\,\mathbf{i} + f_y(a, b)\,\mathbf{j} \equiv (f_x(a, b), f_y(a, b))$$

in $\mathbb{R}^2$ and is denoted by $\nabla f(a, b)$ or $\operatorname{grad} f(a, b)$.

Do not confuse the gradient vector $\nabla f$ (which lives in the $xy$-plane) with a tangent vector to the surface (more on this later).

Example 6.1 Find the gradient of $f$, where $f(x, y) = \cos x\,\exp(xy^2)$, at the point $(x, y) = (0, 2)$.

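Notes:

a) Since $\|\mathbf{u}\| = 1$ we can write $\mathbf{u} = \cos\theta\,\mathbf{i} + \sin\theta\,\mathbf{j}$ for some $\theta$.

b) If $\mathbf{u} = \mathbf{i}$ (which corresponds to $\theta = 0$) we have

$$f'_{\mathbf{i}}(\mathbf{x}_0) = \frac{\partial f}{\partial x}(\mathbf{x}_0)\cos(0) + \frac{\partial f}{\partial y}(\mathbf{x}_0)\sin(0) = \frac{\partial f}{\partial x}(\mathbf{x}_0).$$

Similarly

$$f'_{\mathbf{j}}(\mathbf{x}_0) = \frac{\partial f}{\partial y}(\mathbf{x}_0).$$

c) For a general unit vector $\mathbf{u}$

$$f'_{\mathbf{u}}(\mathbf{x}_0) = \nabla f \cdot \mathbf{u} = \|\nabla f\|\,\|\mathbf{u}\| \cos\phi = \|\nabla f\| \cos\phi$$

where $\phi$ is the angle between $\nabla f$ and $\mathbf{u}$.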

Definition 6.3 Suppose that $\mathbf{u} = \cos\theta\,\mathbf{i} + \sin\theta\,\mathbf{j}$ is a unit vector in $\mathbb{R}^2$. The directional derivative of $f$ at $\mathbf{x}_0$ in the direction of $\mathbf{u}$ (or $\theta$) is

$$\nabla f(\mathbf{x}_0) \cdot \mathbf{u} = f_x(\mathbf{x}_0)\cos\theta + f_y(\mathbf{x}_0)\sin\theta.$$

It is denoted by $f'_{\mathbf{u}}(\mathbf{x}_0)$. The directional derivative may be described geometrically as the rate of change of the function in the specified direction $\mathbf{u}$.

Note that we always use a unit vector when finding the directional derivative.

Example 6.2 A mountain has height above sea level (in 1000s of metres) given by

$$z = f(x, y) = 4 - \left(\tfrac{1}{4}x^4 + \sin(\pi x)\sin(\pi y) + y^2\right)$$

for $-2 \le x, y \le 2$, where the $y$-axis points due north. What is the slope of the mountain in the South-East direction at $x = 0$, $y = 1/2$?
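The recipe $f'_{\mathbf{u}} = \nabla f \cdot \mathbf{u}$ is easy to mechanise. The sketch below applies it to the mountain above, assuming (this is not stated in the example) that the $x$-axis points due east, so that South-East is the direction of $(1, -1)$. The function names are illustrative; the direction is normalised inside `directional_derivative` because a unit vector is required.

```python
import math

def height(x, y):
    # Height of the mountain from Example 6.2 (in 1000s of metres).
    return 4 - (0.25 * x**4 + math.sin(math.pi * x) * math.sin(math.pi * y) + y**2)

def gradient(x, y):
    # Partial derivatives of height, computed by hand.
    fx = -(x**3 + math.pi * math.cos(math.pi * x) * math.sin(math.pi * y))
    fy = -(math.pi * math.sin(math.pi * x) * math.cos(math.pi * y) + 2 * y)
    return fx, fy

def directional_derivative(x, y, direction):
    # Normalise the direction first: the formula needs a unit vector.
    dx, dy = direction
    norm = math.hypot(dx, dy)
    ux, uy = dx / norm, dy / norm
    fx, fy = gradient(x, y)
    return fx * ux + fy * uy

south_east = (1.0, -1.0)  # assumes x points east and y points north
slope = directional_derivative(0.0, 0.5, south_east)

# Independent check: difference quotient along the unit direction.
t = 1e-6
u = (1 / math.sqrt(2), -1 / math.sqrt(2))
numeric = (height(t * u[0], 0.5 + t * u[1]) - height(-t * u[0], 0.5 - t * u[1])) / (2 * t)

print(slope, numeric)
```

A negative value means the ground falls away as you walk in that direction.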

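Theorem 6.1 For a function $f$ of two variables with $\nabla f(\mathbf{x}_0) \neq \mathbf{0}$,

a) the maximum and minimum values of $f'_{\mathbf{u}}(\mathbf{x}_0)$ are $\pm\|\nabla f(\mathbf{x}_0)\|$ and occur in the directions given by $\pm\nabla f(\mathbf{x}_0)$, and

b) $\nabla f(\mathbf{x}_0)$ is normal to the level curve passing through $\mathbf{x}_0$, provided that $f_x$ and $f_y$ are continuous at $\mathbf{x}_0$.

Proof: Part (a) follows from note (c) above:

$$f'_{\mathbf{u}}(\mathbf{x}_0) = \|\nabla f\| \cos\phi.$$

The max and min values for $f'_{\mathbf{u}}(\mathbf{x}_0)$ occur at the max and min of $\cos\phi$, namely $\phi = 0$ for the max and $\phi = \pi$ for the min, the max value being $\|\nabla f(\mathbf{x}_0)\|$ and the min value $-\|\nabla f(\mathbf{x}_0)\|$.

For part (b), suppose the level curve is $\mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j}$. Then $f(x(t), y(t))$ is constant. The chain rule now gives

$$0 = \frac{d}{dt} f(x(t), y(t)) = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} = \nabla f \cdot \mathbf{r}',$$

so $\nabla f$ is orthogonal to the tangent vector $\mathbf{r}'$ of the level curve, proving the result. □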

Example 6.3 In what direction is the slope of the mountain in example 6.2 greatest at $(0, 0.5)$ and what is this slope?

Example 6.4 Consider the function of two variables given by $f(x, y) = x^4 - 3xy + 2y^2$. Calculate the gradient of $f$ at a general point. Then draw the contour through $(0.5, 0.5)$ and add the gradient at $(0.5, 0.5)$.

The gradient is given by

$$\nabla f = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} = (4x^3 - 3y)\,\mathbf{i} + (-3x + 4y)\,\mathbf{j}.$$

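Consider the contour that goes through the point $(0.5, 0.5)$:

[Figure: the contour of $f$ through $(0.5, 0.5)$, axes $x$ and $y$.]

Now we calculate the gradient vector at $(0.5, 0.5)$ to be $-\mathbf{i} + \tfrac{1}{2}\mathbf{j}$ and include it in the following figure.

[Figure: the same contour with the gradient vector drawn at $(0.5, 0.5)$, axes $x$ and $y$.]

For a variety of contours we can show that the gradient is normal to the contour, i.e.,

[Figure: several contours of $f$ with gradient vectors normal to them, axes $x$ and $y$.]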

Example 6.5 For $f(x, y) = x^2 + y^2$

a) Find ∇f

b) Sketch some level curves of f

c) Indicate ∇f at some points on these curves.

Example 6.6 The temperature of a BBQ plate covering the region $-10 \le x, y \le 10$ is given by

$$T(x, y) = \frac{400}{2 + \tfrac{1}{4}x^2 + \tfrac{1}{9}y^2}.$$

An ant lands on the plate at the point $(1, 2)$. In which direction should she run to move most quickly away from the heat?

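Finally, note that the ideas of gradient and directional derivative extend to $\mathbb{R}^3$, where

$$\nabla f(x, y, z) = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} + \frac{\partial f}{\partial z}\,\mathbf{k}$$

and the gradient is normal to the surface $f(x, y, z) = \text{constant}$.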

Chapter 7

Normal and Tangent Lines; Tangent Planes

Reference: Section 15.4

7.1 Tangent Planes

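Suppose we have a surface in $\mathbb{R}^3$ defined by the equation $\varphi(x, y, z) = C$, $C$ a constant, and we also have a curve $\mathbf{r}$, with $\mathbf{r}(t) = f(t)\,\mathbf{i} + g(t)\,\mathbf{j} + h(t)\,\mathbf{k}$, that lies in the surface. Now the tangent vector to $\mathbf{r}$ is

$$\mathbf{r}'(t) = f'(t)\,\mathbf{i} + g'(t)\,\mathbf{j} + h'(t)\,\mathbf{k}.$$

As the trace of $\mathbf{r}$ lies on the surface, we have $\varphi(f(t), g(t), h(t)) = C$. Differentiating and using the chain rule we get

$$\frac{d}{dt}\varphi(f(t), g(t), h(t)) = 0,$$

i.e.

$$\frac{\partial\varphi}{\partial x}\frac{df}{dt} + \frac{\partial\varphi}{\partial y}\frac{dg}{dt} + \frac{\partial\varphi}{\partial z}\frac{dh}{dt} = 0,$$

i.e.

$$\begin{pmatrix} \varphi_x \\ \varphi_y \\ \varphi_z \end{pmatrix} \cdot \begin{pmatrix} f' \\ g' \\ h' \end{pmatrix} = 0,$$

i.e.

$$\nabla\varphi \cdot \mathbf{r}' = 0.$$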

So, at any particular point, $P$, on the surface, all curves that lie in the surface and go through $P$ have a tangent vector orthogonal to $\nabla\varphi(P)$. The set of all vectors at $P$ orthogonal to $\nabla\varphi(P)$ of course forms a plane: the tangent plane.

Definition 7.1 For a surface $S$ defined by $\varphi = \text{constant}$, the tangent plane at $P \in S$ is the plane through $P$ with normal $\nabla\varphi(P)$.

From this definition, we can immediately write down the point-normal form of the tangent plane at $P$:

$$\nabla\varphi(P) \cdot (\mathbf{x} - \mathbf{p}) = 0 \qquad (7.1)$$

and the Cartesian (and parametric vector) forms follow easily, see earlier.

Example 7.1 Find the tangent plane to the surface $x^2 + y^2 + z^2 = 6$ at the point $P = (1, 2, -1)$.

The formula of equation (7.1) is easily modified to the case of a surface given as a graph: such a surface is $z = f(x, y)$, and so $f(x, y) - z = 0$ is constant, so use $f(x, y) - z$ as $\varphi(x, y, z)$. Then we have

$$\nabla\varphi = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} - \mathbf{k}.$$

Example 7.2 Find the tangent plane to the graph of $x^4 + 3x^2y - y^2$ at $(x, y) = (1, 2)$.

We can find a parametric form of the tangent plane at a point $P$ directly by relying on its dimension, which is obviously 2. All we need to do is find two independent tangent vectors at $P$, and they must then span the plane. We do this by finding two curves in the surface going through $P$ whose tangent vectors at $P$ are independent.

For a graph this is easy. Suppose our point $P$ in the graph of $z = f(x, y)$ is $(x_0, y_0, f(x_0, y_0))$. Two obvious curves in the surface both going through $P$ (at $t = 0$) are

$$\mathbf{r}_1(t) = \bigl(t + x_0,\ y_0,\ f(t + x_0, y_0)\bigr)$$

$$\mathbf{r}_2(t) = \bigl(x_0,\ t + y_0,\ f(x_0, t + y_0)\bigr)$$

At $t = 0$ these have tangents, respectively,

$$\mathbf{u} = \mathbf{r}_1' = \mathbf{i} + \frac{\partial f}{\partial x}\,\mathbf{k}$$

$$\mathbf{v} = \mathbf{r}_2' = \mathbf{j} + \frac{\partial f}{\partial y}\,\mathbf{k}$$

The tangent plane through $P$ thus has parametric vector form $\mathbf{p} + \lambda\mathbf{u} + \nu\mathbf{v}$. A normal to this plane is given by

$$\mathbf{u} \times \mathbf{v} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 1 & 0 & f_x \\ 0 & 1 & f_y \end{vmatrix} = -f_x\,\mathbf{i} - f_y\,\mathbf{j} + \mathbf{k}$$

which is $-\nabla\varphi$, for $\varphi = f(x, y) - z$.

Example 7.3 Find a parametric form of the tangent plane to the graph of $f$, where $f(x, y) = x^3 - 3xy^2 + 3y$, at $(x, y) = (2, 1)$.

The “point-normal” approach is the best one, because it generalises immediately to other dimensions. In particular, we can use it for tangent lines to curves.

Example 7.4 Find the equation of the tangent line to the ellipse $3x^2 + 4y^2 = 7$ at the point $(1, 1)$.

7.2 Normal lines

For curves in $\mathbb{R}^2$ the normal line at a point $P$ is the line orthogonal to the tangent line at $P$. For surfaces in $\mathbb{R}^3$ the normal line at $P$ is the line normal to the tangent plane at $P$.

Both of these are easily found using the gradient.

Example 7.5 Find the normal line to the ellipse $2x^2 + 5y^2 = 7$ at $(1, 1)$.

Example 7.6 Find the normal line to the ellipsoid $2x^2 + 2y^2 + z^2 = 8$ at the point $(1, 1, 2)$.

We say two curves in $\mathbb{R}^2$ through the one point $P$ touch tangentially at $P$ if they have the same tangent line at $P$. Similarly, two surfaces through the one point $P$ touch tangentially at $P$ if they have the same tangent plane at $P$. Of course, checking two planes are the same is not quite as straightforward as checking two lines are the same, as there are many ways of writing down a given plane. For example, it’s not immediately obvious that

$$\mathbf{x} = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} + s\begin{pmatrix} 1 \\ 2 \\ -2 \end{pmatrix} + t\begin{pmatrix} 3 \\ -2 \\ 4 \end{pmatrix} \quad\text{and}\quad \mathbf{x} = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} + \lambda\begin{pmatrix} 2 \\ 0 \\ 1 \end{pmatrix} + \mu\begin{pmatrix} 1 \\ -2 \\ 3 \end{pmatrix}$$

are the same plane. But a plane only has one normal direction, so two planes through one point are the same if and only if their normals are parallel.
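The normal test is easy to carry out by machine for the two parametric planes above. In this sketch the helpers `cross` and `parallel` are hypothetical names: a normal to each plane is taken as the cross product of its two direction vectors, and two vectors are parallel exactly when their cross product is zero. Integer arithmetic keeps the comparison exact.

```python
def cross(a, b):
    # Cross product of two vectors in R^3, given as 3-tuples.
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def parallel(a, b):
    # Two vectors in R^3 are parallel iff their cross product vanishes.
    return cross(a, b) == (0, 0, 0)

# Direction vectors of the two parametric planes through (1, 1, 1).
n1 = cross((1, 2, -2), (3, -2, 4))
n2 = cross((2, 0, 1), (1, -2, 3))

same_plane = parallel(n1, n2)  # both planes pass through (1, 1, 1)
print(n1, n2, same_plane)
```

Since both planes pass through $(1, 1, 1)$ and their normals are parallel, they are the same plane.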

Example 7.7 Show that the ellipsoid $2x^2 + 2y^2 + z^2 = 8$ and the sphere $2x^2 + 2y^2 + 2z^2 + x + y - 3z = 8$ touch tangentially at $(1, 1, 2)$.

7.3 Linear Approximation

One of the chief benefits of the tangent line to the graph of a function $f : \mathbb{R} \to \mathbb{R}$ is that it gives us a good approximation to $f$; in fact, it is the best approximation you can do with a polynomial of degree 1. In terms of the graph of $f$, being differentiable at a point really means the graph is approximately a straight line, or looks so if you magnify the graph enough.

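For example, if $f(x) = x^2$,

[Figure: graph of $x^2$ on $[-2, 2]$ and magnified on $[-\tfrac{1}{10}, \tfrac{1}{10}]$.]

But for the absolute value, no amount of magnification will make the graph look like a line at the origin:

[Figure: graph of $|x|$ near the origin.]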

When it comes to two variables, everything we’ve looked at so far would suggest that the tangent plane will give us our best linear approximation, and this is usually the case.

Example 7.8 Determine the equation for the tangent plane to the surface

$$z = f(x, y) = \sin(x + y)$$

at $(1, -1, 0)$ and hence approximate the value of the function at $x = 1.05$ and $y = -0.85$.
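To see how good the tangent plane approximation is in a case like this, one can simply evaluate both sides. The sketch below is an illustration only: `tangent_plane` is a hypothetical helper that hard-codes the partial derivatives $f_x = f_y = \cos(x + y)$ of $\sin(x + y)$ at the base point.

```python
import math

def f(x, y):
    return math.sin(x + y)

def tangent_plane(x, y, a=1.0, b=-1.0):
    # z = f(a,b) + f_x(a,b)(x - a) + f_y(a,b)(y - b), with f_x = f_y = cos(a + b).
    fa = f(a, b)
    fx = math.cos(a + b)
    fy = math.cos(a + b)
    return fa + fx * (x - a) + fy * (y - b)

approx = tangent_plane(1.05, -0.85)
exact = f(1.05, -0.85)
print(approx, exact, abs(approx - exact))
```

The difference between the two numbers is the error of the linear approximation at that point; it shrinks as the point moves closer to $(1, -1)$.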

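However, consider the following function:

$$f(x, y) = \begin{cases} \dfrac{xy}{x^2 + y^2} & (x, y) \neq (0, 0) \\[2mm] 0 & (x, y) = (0, 0). \end{cases}$$

Now $f$ is not continuous at $(0, 0)$ since $\lim_{(x,y)\to(0,0)} f(x, y)$ does not exist (why?), yet it has first partial derivatives:

$$\frac{\partial f}{\partial x}(0, 0) = \lim_{\Delta x\to 0} \frac{f(\Delta x, 0) - f(0, 0)}{\Delta x} = 0$$

$$\frac{\partial f}{\partial y}(0, 0) = \lim_{\Delta y\to 0} \frac{f(0, \Delta y) - f(0, 0)}{\Delta y} = 0.$$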

This means that the equation of our “tangent plane” is $z = 0$. But the plane $z = 0$ is a useless approximation to the function $z = f(x, y)$ close to $(0, 0)$. If $y = x$ ($x \neq 0$) then $f(x, x) = \tfrac{1}{2}$ no matter how small $x$ is, and so $f$ is not approximated by $z = 0$. So the surface does not have a tangent plane at the origin.

We see that differentiability for several variables (whatever that might mean) is rather more complicated than the one variable case. This example is really telling us that if we want differentiability to mean “locally linear”, then just having partial derivatives is not good enough.

With this in mind, we give our first try at a definition of differentiability for functions of two variables:

Definition 7.2 Let $f : \mathbb{R}^2 \to \mathbb{R}$ be defined at $(a, b)$ and assume $f_x$ and $f_y$ are also defined at $(a, b)$. Then we say $f$ is differentiable at $(a, b)$ if the plane

$$f_x(a, b)\,(x - a) + f_y(a, b)\,(y - b) - (z - f(a, b)) = 0$$

gives a “good approximation” to the surface $z = f(x, y)$ in all directions.

In this case it makes sense to call the above plane the tangent plane. I’m not going to say precisely what I mean by “good approximation”, but we can prove that if the partial derivatives $f_x$ and $f_y$ are continuous at $(a, b)$ then the function is differentiable. We will return to this problem later.

7.4 Differentials

Reference: SHE, Section 15.8

Recall from one variable calculus we defined the differential of $y = f(x)$ as

$$dy = f'(x)\,dx.$$

For a function of two variables, $z = f(x, y)$, we define differentials $dx$ and $dy$ to be independent (since both $x$ and $y$ are). Then the total differential $dz$ is defined by

$$dz = f_x(x, y)\,dx + f_y(x, y)\,dy = \frac{\partial z}{\partial x}\,dx + \frac{\partial z}{\partial y}\,dy.$$

If we return to our linear approximation formula we have

$$f(x, y) \approx f(a, b) + f_x(a, b)\,(x - a) + f_y(a, b)\,(y - b) = f(a, b) + dz$$

where $dx = \Delta x = x - a$ and $dy = \Delta y = y - b$. This approach is very useful in estimating the change in a function if we know the change (error, increment) in the independent variables $x$ and $y$.

Example 7.9 The length and the width of a rectangle are measured as 30 cm and 24 cm, respectively, with an error in measurement of at most 0.1 cm in each. Use differentials to estimate the maximum error in the calculated area of the rectangle.
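A small sketch of the differential estimate for an area $A = \ell w$, where $dA = w\,d\ell + \ell\,dw$, compared with the exact worst-case change. The variable names are assumptions for illustration, and taking both errors with the same sign is what gives the largest value.

```python
length, width = 30.0, 24.0      # measured values in cm
d_length = d_width = 0.1        # maximum measurement errors in cm

# Total differential of A = length * width:
# dA = (dA/d length) d_length + (dA/d width) d_width.
dA = width * d_length + length * d_width

# Exact change in area when both measurements are too small by 0.1 cm.
worst_actual = (length + d_length) * (width + d_width) - length * width

print("differential estimate:", dA, "cm^2")
print("exact worst case:     ", worst_actual, "cm^2")
```

The two numbers differ only by the second order term $d\ell\,dw$, which the differential ignores.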

Chapter 8

Taylor Series

8.1 Revision

Recall that we can often express functions of one variable by a power series expansion, e.g.,

$$\frac{1}{1 - t} = 1 + t + t^2 + \cdots = \sum_{n=0}^{\infty} t^n, \qquad |t| < 1$$

$$e^t = 1 + t + \frac{t^2}{2!} + \frac{t^3}{3!} + \cdots = \sum_{n=0}^{\infty} \frac{t^n}{n!}, \qquad \text{for all } t$$

$$\sin t = t - \frac{t^3}{3!} + \frac{t^5}{5!} - \cdots = \sum_{n=0}^{\infty} \frac{(-1)^n t^{2n+1}}{(2n + 1)!}, \qquad \text{for all } t$$

$$\ln(1 + t) = t - \frac{t^2}{2} + \frac{t^3}{3} - \cdots, \qquad |t| < 1$$

Such expansions are examples of Taylor series expansions about zero (Maclaurin series). Around a general point $a$, the Taylor series is

$$\phi(t) = \phi(a) + \frac{\phi'(a)}{1!}(t - a) + \frac{\phi''(a)}{2!}(t - a)^2 + \cdots \qquad (8.1)$$

We recall that the linear terms (up to degree 1) give us the best linear approximation to $\phi$, that is, the tangent line to the graph of $\phi$, figure 8.1 (a). If we take the terms up to second order, we get the best quadratic approximation to $\phi$, which gives the osculating parabola to the graph, figure 8.1 (b), which is an even better approximation.

If the function we are dealing with is a polynomial of degree $n$, say, then the Taylor series will terminate after $n + 1$ terms, since all higher derivatives are zero. But generally the equality in equation (8.1) doesn’t hold for all $t$ but only for “small $t$”. If we truncate the expansion after $n$ terms we can estimate the “remainder” or “error” term:

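$$\phi(t) = \phi(a) + \frac{\phi'(a)}{1!}(t - a) + \cdots + \frac{\phi^{(n-1)}(a)}{(n - 1)!}(t - a)^{n-1} + R_n(t - a)$$

where

$$R_n(t - a) = \frac{\phi^{(n)}(t^*)}{n!}(t - a)^n$$

and

$$a < t^* < t \ \text{ if } t > a, \qquad t < t^* < a \ \text{ if } t < a.$$

If $R_n(t - a) \to 0$ as $n \to \infty$ then we say the Taylor series expansion converges to $\phi(t)$.

[Figure 8.1: Taylor approximations. Panel (a): tangent line; panel (b): osculating parabola; axes $x$ and $y$.]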

The hard part of using Taylor series for approximations is estimating the error, and this is where the remainder term comes in. Note that the remainder term follows the same pattern as the other ones except that it is evaluated at a different (and unknown) point. But if we knew that $|\phi^{(n)}(t)| < M$ between $t$ and $a$, then the error $E$ involved in truncating after $n$ terms can be estimated using $R_n$ to be

$$E \le \frac{M}{n!}\,|t - a|^n.$$

Example 8.1 Find the Taylor series of $f(x) = \sqrt{x}$ about $x = 4$. Use the terms up to second order to approximate $\sqrt{4.2}$ and estimate the error in this approximation using the remainder.
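The remainder estimate can be checked numerically in a case like this. The sketch below builds the second order Taylor polynomial of $\sqrt{x}$ about $a = 4$ and the bound $\frac{M}{3!}|x - a|^3$; the function names are assumptions, and the choice $M = |f'''(a)|$ relies on $|f'''(t)| = \tfrac{3}{8}t^{-5/2}$ being decreasing, so it is only valid for $x > a$.

```python
import math

def taylor2_sqrt(x, a=4.0):
    # Taylor polynomial of sqrt about a, up to and including second order.
    h = x - a
    f0 = math.sqrt(a)
    f1 = 1 / (2 * math.sqrt(a))     # f'(a)
    f2 = -1 / (4 * a**1.5)          # f''(a)
    return f0 + f1 * h + f2 * h**2 / 2

def remainder_bound(x, a=4.0):
    # |f'''(t)| = (3/8) t^(-5/2) is decreasing, so for x > a its maximum
    # on [a, x] is at t = a; this gives M in the bound M |x - a|^3 / 3!.
    M = 3 / (8 * a**2.5)
    return M * abs(x - a)**3 / 6

approx = taylor2_sqrt(4.2)
error = abs(math.sqrt(4.2) - approx)
bound = remainder_bound(4.2)
print(approx, error, bound)
```

The actual error sits just below the bound, which is typical when the third derivative changes little over the interval.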

8.2 Two variables

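Now we wish to extend the Taylor series notion from one variable to two variables. Suppose $z = f(x, y)$ and let $\mathbf{x}_0 = (x_0, y_0)$ and $\mathbf{h} = (h, k)$ be fixed. Now define $\phi$ by

$$\phi(t) = f(\mathbf{x}_0 + t\mathbf{h}) \quad\text{for real } t.$$

Then by the chain rule

$$\frac{d\phi}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} = \frac{\partial f}{\partial x}\,h + \frac{\partial f}{\partial y}\,k$$

and

$$\begin{aligned}
\frac{d^2\phi}{dt^2} &= \frac{d}{dt}\left(\frac{d\phi}{dt}\right) = \frac{\partial}{\partial x}\left(\frac{d\phi}{dt}\right)\frac{dx}{dt} + \frac{\partial}{\partial y}\left(\frac{d\phi}{dt}\right)\frac{dy}{dt} \\
&= \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\,h + \frac{\partial f}{\partial y}\,k\right)\frac{dx}{dt} + \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\,h + \frac{\partial f}{\partial y}\,k\right)\frac{dy}{dt} \\
&= \left(\frac{\partial^2 f}{\partial x^2}\,h + \frac{\partial^2 f}{\partial x\,\partial y}\,k\right)h + \left(\frac{\partial^2 f}{\partial y\,\partial x}\,h + \frac{\partial^2 f}{\partial y^2}\,k\right)k \\
&= h^2\frac{\partial^2 f}{\partial x^2} + 2hk\frac{\partial^2 f}{\partial x\,\partial y} + k^2\frac{\partial^2 f}{\partial y^2} = \left(h\frac{\partial}{\partial x} + k\frac{\partial}{\partial y}\right)^2 f.
\end{aligned}$$

Similarly

$$\begin{aligned}
\frac{d^3\phi}{dt^3} &= \frac{d}{dt}\left(\frac{d^2\phi}{dt^2}\right) = \frac{\partial}{\partial x}\left(\frac{d^2\phi}{dt^2}\right)\frac{dx}{dt} + \frac{\partial}{\partial y}\left(\frac{d^2\phi}{dt^2}\right)\frac{dy}{dt} \\
&= \left(h\frac{\partial}{\partial x} + k\frac{\partial}{\partial y}\right)^3 f \\
&= h^3\frac{\partial^3 f}{\partial x^3} + 3h^2k\frac{\partial^3 f}{\partial x^2\,\partial y} + 3hk^2\frac{\partial^3 f}{\partial x\,\partial y^2} + k^3\frac{\partial^3 f}{\partial y^3}.
\end{aligned}$$

We should note that, at $t = 0$, the above partial derivatives are evaluated at the point $\mathbf{x}_0 = (x_0, y_0)$.

We substitute this into

$$\phi(1) = \phi(0) + \frac{\phi'(0)}{1!}\,1 + \frac{\phi''(0)}{2!}\,1^2 + \cdots$$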

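to get

$$\begin{aligned}
f(x_0 + h, y_0 + k) = f(x_0, y_0) &+ \frac{1}{1!}\left(\frac{\partial f}{\partial x}(\mathbf{x}_0)\,h + \frac{\partial f}{\partial y}(\mathbf{x}_0)\,k\right) \\
&+ \frac{1}{2!}\left(\frac{\partial^2 f}{\partial x^2}(\mathbf{x}_0)\,h^2 + 2\frac{\partial^2 f}{\partial x\,\partial y}(\mathbf{x}_0)\,hk + \frac{\partial^2 f}{\partial y^2}(\mathbf{x}_0)\,k^2\right) \\
&+ \frac{1}{3!}\left(\frac{\partial^3 f}{\partial x^3}(\mathbf{x}_0)\,h^3 + \frac{\partial^3 f}{\partial x^2\,\partial y}(\mathbf{x}_0)\,3h^2k + \frac{\partial^3 f}{\partial x\,\partial y^2}(\mathbf{x}_0)\,3hk^2 + \frac{\partial^3 f}{\partial y^3}(\mathbf{x}_0)\,k^3\right) + \cdots \qquad (8.2)
\end{aligned}$$

which is called the Taylor expansion of $f$ about the point $\mathbf{x}_0$. Notice that if we set $x = x_0 + h$, $y = y_0 + k$ so that $h = x - x_0$ and $k = y - y_0$ then we can write:

$$\begin{aligned}
f(x, y) = f(x_0, y_0) &+ \frac{1}{1!}\left(\frac{\partial f}{\partial x}(\mathbf{x}_0)\,(x - x_0) + \frac{\partial f}{\partial y}(\mathbf{x}_0)\,(y - y_0)\right) \\
&+ \frac{1}{2!}\left(\frac{\partial^2 f}{\partial x^2}(\mathbf{x}_0)\,(x - x_0)^2 + 2\frac{\partial^2 f}{\partial x\,\partial y}(\mathbf{x}_0)\,(x - x_0)(y - y_0) + \frac{\partial^2 f}{\partial y^2}(\mathbf{x}_0)\,(y - y_0)^2\right) + \cdots
\end{aligned}$$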

We expect you to know the Maclaurin series for $e^t$, $\sin t$, $\cos t$ and $(1 \pm t)^{-1}$ from first year. They can often be used to get other series, even two variable ones, by standard manipulations.

Example 8.2 Neglecting all products of $x$ and $y$ of order higher than 3, obtain the Taylor expansion of $z = \sin(x + y)$ about the origin, i.e., $\mathbf{x}_0 = (0, 0)$.

Example 8.3 Find the Taylor series for $f(x, y) = \sqrt{x}\,\sqrt[3]{y}$ about the point $(x, y) = (4, 8)$ up to and including terms of second order. Use the terms up to second order to approximate $\sqrt{4.2}\,\sqrt[3]{7.9}$.
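As an illustration of the two variable formula, the sketch below evaluates the second order Taylor polynomial of $\sqrt{x}\,\sqrt[3]{y}$ about $(4, 8)$ and compares it with the function itself. The partial derivatives are written out by hand inside the hypothetical helper `taylor2`; treat it as a check on a hand calculation, not as a replacement for one.

```python
def f(x, y):
    return x**0.5 * y**(1 / 3)

def taylor2(x, y, a=4.0, b=8.0):
    # Second order Taylor polynomial of f about (a, b), following (8.2).
    h, k = x - a, y - b
    f0 = a**0.5 * b**(1 / 3)
    fx = 0.5 * a**-0.5 * b**(1 / 3)
    fy = a**0.5 * (1 / 3) * b**(-2 / 3)
    fxx = -0.25 * a**-1.5 * b**(1 / 3)
    fxy = 0.5 * a**-0.5 * (1 / 3) * b**(-2 / 3)
    fyy = a**0.5 * (-2 / 9) * b**(-5 / 3)
    linear = fx * h + fy * k
    quadratic = 0.5 * (fxx * h * h + 2 * fxy * h * k + fyy * k * k)
    return f0 + linear + quadratic

approx = taylor2(4.2, 7.9)
exact = f(4.2, 7.9)
print(approx, exact, abs(approx - exact))
```

Here $h = 0.2$ and $k = -0.1$, so the neglected third order terms are small and the two values agree to several decimal places.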

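Example 8.4 Given that for $f : \mathbb{R}^2 \to \mathbb{R}$,

$$f(-1, 0) = 3, \quad \frac{\partial f}{\partial x}(-1, 0) = -3, \quad \frac{\partial f}{\partial y}(-1, 0) = 2,$$

$$\frac{\partial^2 f}{\partial x^2}(-1, 0) = 4, \quad \frac{\partial^2 f}{\partial x\,\partial y}(-1, 0) = -4, \quad\text{and}\quad \frac{\partial^2 f}{\partial y^2}(-1, 0) = 0.$$

(a) Write down the Taylor series for $f(x, y)$ about the point $(-1, 0)$ up to and including terms of second order.

(b) Use the constant and linear terms from the Taylor series in (a) to approximate $f(-1.02, 0.01)$.

(c) Find the rate of change of $f$ at the point $(-1, 0)$ in the direction of the vector $\mathbf{i} + 3\mathbf{j}$.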

Just as in the one variable case, for a polynomial the Taylor series terminates after a finite number of terms. For non-polynomials, we can truncate a Taylor series after all the terms involving an $(n - 1)$th derivative, and what is left is an error term. Again, just as in the one variable case, it looks like a standard term, but is evaluated at a different (unknown) point.

Definition 8.1 If the 2-variable Taylor expansion of equation (8.2) is truncated after the $(n - 1)$th derivative terms, the remainder is given by

$$R_n(h, k) = \frac{1}{n!}\left.\left(h\frac{\partial}{\partial x} + k\frac{\partial}{\partial y}\right)^n f\,\right|_{\mathbf{x}_0 + c(h,k)}$$

where the derivatives are all evaluated at some point $\mathbf{x}_0 + c(h, k)$ with $|c| < 1$.

Example 8.5 Find the quadratic polynomial that best approximates $\sin(x)\sin(y)$ near $(0, 0)$. Estimate the accuracy if $|x| \le 0.1$ and $|y| \le 0.1$.

Maple can calculate Taylor series and multivariable Taylor series for you. Use taylor for one variable and mtaylor for several variables. See the Maple help pages, my eLearning Vista and lectures for examples.

Chapter 9

Critical Points and Lagrange Multipliers

(Reference: SHE, Section 15.5, 15.6)

9.1 Critical Points for One Variable

Let us begin with a look at the following problem:

How do the critical points of the function $f : \mathbb{R} \to \mathbb{R}$ given by $f(x) = Ax^2 + B$ behave as $A$ increases from negative to positive?

[Figure: three graphs of $y = Ax^2 + B$ with $y$-intercept $B$, for $A > 0$, $A = 0$ and $A < 0$.]

Of course, this is a High School problem. We note that the behaviour changes markedly as $A$ varies from negative to positive, though.

Now consider a function whose Maclaurin series is

$$f(x) = f(0) + \frac{1}{2} f''(0)\,x^2 + \frac{1}{3!} f'''(0)\,x^3 + \cdots$$

As $f'(0) = 0$ we have a critical point at $x = 0$. For small $x$, $x^2$ dominates all the higher powers: $x^2 > |x^3| > x^4 > \ldots$, so close to zero we have

$$f(x) \approx f(0) + \frac{1}{2} f''(0)\,x^2$$

and so the sign of $f''(0)$ will govern the type of critical point we have:

$f''(0) > 0$: local minimum.

$f''(0) < 0$: local maximum.

$f''(0) = 0$: more work needed.

For a critical point other than the origin, we need only change $x$ to $x - a$ and calculate derivatives at $x = a$ instead, and we’ve rediscovered the second derivative test. However, the advantage of our method is that we can apply it to functions of two variables.

9.2 Critical Points and Two Variables

Firstly, some definitions:

Definition 9.1 Let $f : \mathbb{R}^2 \to \mathbb{R}$ be a function. A critical point of $f$ is a point $(a, b)$ where $\nabla f$ is either zero or does not exist.

Definition 9.2 Let $f : \mathbb{R}^2 \to \mathbb{R}$ be a function. A stationary point of $f$ is a point $(a, b)$ where $\nabla f$ is zero.

An equivalent way of saying this is $f_x(a, b) = f_y(a, b) = 0$, or that the tangent plane to the graph $z = f(x, y)$ is horizontal.

Definition 9.3 A function $f : \mathbb{R}^2 \to \mathbb{R}$ has a local maximum (respectively minimum) at $(a, b)$ if $f(a, b)$ is the largest (resp. smallest) function value on some small disc centred at $(a, b)$.

Example 9.1 Consider the function $z = f(x, y) = x^2 + y^2$ with graph in figure 9.1. We have $f_x = 2x$, $f_y = 2y$, so there is a stationary point at $(0, 0)$ and $f(0, 0) = 0$. For any point $(x, y) \neq (0, 0)$, $f(x, y) > 0$ and so $(0, 0)$ is a minimum.

We use the word extremum to save us having to write “maximum or minimum” all the time.

Theorem 9.1 Suppose $f$ has a local extremum at $(a, b)$. Then if $f_x(a, b)$ and $f_y(a, b)$ both exist, $f$ has a stationary point at $(a, b)$.

The converse of this result is false:

Example 9.2 Consider the function $z = f(x, y) = x^2 - y^2$ with graph in figure 9.2. We have $f_x = 2x$, $f_y = -2y$, so we have a stationary point at $(0, 0)$, and $f(0, 0) = 0$. But as $f(x, 0) > f(0, 0)$ if $x \neq 0$, however small $x$ is, $(0, 0)$ is not a local max. Similarly, $f(0, y) < 0$ if $y \neq 0$, however small $y$ is, so $(0, 0)$ is not a local min.

The shape of the graph in figure 9.2 prompts us to the next definition:

[Figure 9.1: Graph of $x^2 + y^2$, with axes $x$, $y$, $z$.]

[Figure 9.2: Graph of $x^2 - y^2$, with axes $x$, $y$, $z$.]

Definition 9.4 A stationary point that is not an extremum is called a saddle.

Example 9.3 Do the following have a local min or local max at the origin?

a) $f(x, y) = x^2 - 2xy + 10y^2$

b) $f(x, y) = 2x^2 + 12xy + 8y^2$

c) $f(x, y) = x^2 - 2xy + y^2$

All the examples so far are quadratics, so let’s look at the general quadratic, $f(x, y) = Ax^2 + 2Bxy + Cy^2$, to find the pattern. We have $f(0, 0) = 0$ of course, and assuming $A \neq 0$, we have

$$\begin{aligned}
Ax^2 + 2Bxy + Cy^2 &= A\left(x + \frac{B}{A}\,y\right)^2 - \frac{B^2}{A}\,y^2 + Cy^2 \\
&= A\left(x + \frac{B}{A}\,y\right)^2 - A^{-1}\left(B^2 - AC\right)y^2 \qquad (9.1)
\end{aligned}$$

Firstly, if $B^2 - AC = 0$, then the quadratic is a perfect square, and the sign of $A$ tells us what we want.

Secondly, if $B^2 - AC > 0$, then the two terms in equation (9.1) have opposite signs, and so we have a saddle.

If $B^2 - AC < 0$, the two terms in equation (9.1) have the same sign and we have a max or min.

This leaves the special cases with $A = 0$, when $f(x, y) = y(2Bx + Cy)$. If $B \neq 0$, $f(x, y)$ can have either sign, so we have a saddle; note that in this case $B^2 - AC > 0$.

If $B = 0$ as well, we have $f(x, y) = Cy^2$, once again a perfect square, so the sign of $C$ tells us what we have: $C > 0$ gives a min, $C < 0$ a max.

So we have the following results, if $D = B^2 - AC \neq 0$:

$B^2 - AC$ positive: saddle

$B^2 - AC$ negative, $A$ positive: minimum

$B^2 - AC$ negative, $A$ negative: maximum

If $B^2 - AC = 0$, then we have a perfect square and so we get a max or min, but it is not “isolated”.
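The case analysis above translates directly into a small classifier for $Ax^2 + 2Bxy + Cy^2$ at the origin. The function name `classify_quadratic` and the wording of the returned labels are assumptions for illustration; note that the coefficient of $xy$ is $2B$, so for $2x^2 + 12xy + 8y^2$ we have $B = 6$.

```python
def classify_quadratic(A, B, C):
    # Classify the origin for f(x, y) = A x^2 + 2 B x y + C y^2.
    D = B * B - A * C
    if D > 0:
        return "saddle"
    if D < 0:
        # D < 0 forces A != 0, and the sign of A decides.
        return "minimum" if A > 0 else "maximum"
    # D == 0: perfect square; use A, or C when A = 0 (then B = 0 too).
    lead = A if A != 0 else C
    if lead > 0:
        return "minimum (not isolated)"
    if lead < 0:
        return "maximum (not isolated)"
    return "constant"  # A = B = C = 0

# The three quadratics of Example 9.3, written as (A, B, C).
for coeffs in [(1, -1, 10), (2, 6, 8), (1, -1, 1)]:
    print(coeffs, classify_quadratic(*coeffs))
```

In the perfect square case the extreme value is attained along a whole line through the origin, which is why the label says it is not isolated.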

Now let us apply the same reasoning we used for one variable, and suppose we have a function $f : \mathbb{R}^2 \to \mathbb{R}$ whose Taylor series about $(0, 0)$ is

$$f(x, y) = f(0, 0) + \frac{1}{2!}\left(\frac{\partial^2 f}{\partial x^2}(0, 0)\,x^2 + 2\frac{\partial^2 f}{\partial x\,\partial y}(0, 0)\,xy + \frac{\partial^2 f}{\partial y^2}(0, 0)\,y^2\right) + \cdots$$

Just as in the one variable case, if we are close to the origin, then we can ignore all the higher order derivatives, and the analysis of the general quadratic tells us what sort of critical point we will have.

Theorem 9.2 (Second Derivative Test) Suppose the second order partial derivatives of $f$ are continuous on a disk with centre $(a, b)$ and suppose that $f_x(a, b) = 0 = f_y(a, b)$, i.e., $(a, b)$ is a critical point. Let

$$D = \bigl(f_{xy}(a, b)\bigr)^2 - f_{xx}(a, b)\,f_{yy}(a, b).$$

Then

a) If $D < 0$ and $f_{xx}(a, b) > 0$, then $f(a, b)$ is a local minimum.

b) If $D < 0$ and $f_{xx}(a, b) < 0$, then $f(a, b)$ is a local maximum.

c) If $D > 0$ then $(a, b)$ is a saddle point.

We call $D$ the discriminant of the critical point. If $D = 0$ we need to do more work, just like in the one variable case.

Example 9.4 Analyse the critical points of $f : \mathbb{R}^2 \to \mathbb{R}$ where $f(x, y) = x^3 - y^3 - 3xy$.

Example 9.5 Analyse the critical points of $f : \mathbb{R}^2 \to \mathbb{R}$ where $f(x, y) = \sin(x)\cos(y)$.
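The second derivative test is mechanical once the stationary points are known. The sketch below applies it to $f(x, y) = x^3 - y^3 - 3xy$ from Example 9.4, using the sign convention $D = f_{xy}^2 - f_{xx}f_{yy}$ of Theorem 9.2. The stationary points $(0, 0)$ and $(-1, 1)$ are supplied by hand (from $y = x^2$ and $x = -y^2$) and the code only verifies and classifies them; all names are illustrative.

```python
def gradient(x, y):
    # First partial derivatives of f(x, y) = x^3 - y^3 - 3xy.
    return 3 * x**2 - 3 * y, -3 * y**2 - 3 * x

def second_partials(x, y):
    # Returns (f_xx, f_xy, f_yy).
    return 6 * x, -3, -6 * y

def second_derivative_test(fxx, fxy, fyy):
    # Theorem 9.2 with D = f_xy^2 - f_xx f_yy.
    D = fxy**2 - fxx * fyy
    if D > 0:
        return "saddle"
    if D < 0:
        return "local minimum" if fxx > 0 else "local maximum"
    return "inconclusive"

stationary_points = [(0, 0), (-1, 1)]  # found by hand
results = {}
for point in stationary_points:
    assert gradient(*point) == (0, 0)  # confirm the point is stationary
    results[point] = second_derivative_test(*second_partials(*point))
    print(point, results[point])
```

When the test returns "inconclusive" ($D = 0$) the higher order terms of the Taylor series have to be examined instead.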

9.3 Extrema on bounded sets

So far, we have allowed the variables $x$ and $y$ to be unbounded. It is more usual in applications to have bounds on $x$ and $y$. For example we might be looking at temperature in a room, and then $x$ and $y$ would be limited by the dimensions of the room. So we might typically have $a \le x \le b$ and $c \le y \le d$ for some constants $a, b, c, d$. We would write the set of possible values of $(x, y)$ as $[a, b] \times [c, d]$. This is an example of a closed, bounded region. It is not too difficult to properly define bounded:

Definition 9.5 A subset $D$ of $\mathbb{R}^2$ is bounded if it is contained in some set of the form $[a, b] \times [c, d]$, where $a, b, c, d$ are all finite.

A proper definition of “closed” is a little harder, although some of you may have met it. For this course it will be enough to say a set is closed if it contains all of its edge. For example the set $\{(x, y) \mid x^2 + y^2 < 1\}$ is not closed as its “edge” (the set of points with $x^2 + y^2 = 1$) is not included. But the set $\{(x, y) \mid x \ge 1\}$ is closed.

For closed, bounded sets we have a very useful result:

Theorem 9.3 Let f : D → R be continuous on the closed bounded region $D \subset \mathbb{R}^2$. Then f attains a maximum and a minimum value on D.

We met the one variable version of this (the max-min theorem) in first year. To find the extrema on a closed bounded set D we do the following steps:

(a) Find the values of f at critical points of f in D.

(b) Find the extreme values of f on the boundary of D.

(c) The largest of the values from the above steps is the absolute maximum and the smallest of these values is the absolute minimum.

Example 9.6 Find the extrema of the function given by $f(x, y) = x^3 - y^3 - 3xy$ on the region of $\mathbb{R}^2$ bounded by x = 0, y = 0 and y + x = 3.

Example 9.7 Find the extreme values of the function given by f(x, y) = 2xy on the closed disc $x^2 + y^2 \le 4$.
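The three-step recipe for closed bounded sets can be sketched numerically for Example 9.7. This is a rough check rather than a solution: the boundary is sampled at an assumed resolution (the name `samples` is hypothetical), and the interior critical point was found by hand.

```python
import math


def f(x, y):
    return 2 * x * y


# Step (a): f_x = 2y and f_y = 2x vanish only at (0, 0), which lies inside the disc.
candidates = [f(0.0, 0.0)]

# Step (b): sample f on the boundary circle x^2 + y^2 = 4,
# parametrised as x = 2 cos t, y = 2 sin t.
samples = 3600  # assumed sampling resolution
for k in range(samples):
    t = 2 * math.pi * k / samples
    candidates.append(f(2 * math.cos(t), 2 * math.sin(t)))

# Step (c): compare all the candidate values.
abs_max = max(candidates)
abs_min = min(candidates)
print(abs_max, abs_min)
```

On the boundary f reduces to 4 sin 2t, so the sampled extremes should sit very close to 4 and −4.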


Example 9.8 What are the dimensions of the largest box in the first octant (x ≥ 0, y ≥ 0, z ≥ 0) with one vertex at the origin and the opposite one on the plane 3x + 2y + z = 3?

This last problem is an example of a constrained extremum: these are usually dealt with by the very powerful method of Lagrange Multipliers, which we turn to next.

9.4 Lagrange Multipliers

We now wish to consider the problem of maximising or minimising a function that is subject to a constraint, e.g., find the extreme values of f(x, y) subject to g(x, y) = k.

One possibility is that sometimes we can solve g(x, y) = k to obtain y = y(x) say. Then we set about finding the critical points of z = f(x, y(x)).

Example 9.9 Maximise xy subject to x + y = 6. This is equivalent to maximising x(6 − x), which requires x = 3 obviously, and f(3, 3) = 9.

Example 9.8 is a three variable version of a similar problem.

However, we cannot always do this, and even if we can, it may make the problem very much more complicated than it needs to be. But even if we cannot explicitly solve g(x, y) = k to obtain y = y(x), such a function is defined implicitly and we still have to find the critical points of z = f(x, y(x)).

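Differentiating the constraint g(x, y) = k with respect to x gives us

\[
\frac{\partial g}{\partial x}\frac{dx}{dx} + \frac{\partial g}{\partial y}\frac{dy}{dx} = 0
\quad\text{so}\quad
\frac{dy}{dx} = -\frac{g_x}{g_y}.
\]

Now for a critical point

\[
0 = \frac{dz}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\frac{dy}{dx}
= \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\left(-\frac{g_x}{g_y}\right).
\]

This will be the case if

\[
\frac{f_x}{g_x} = \frac{f_y}{g_y} \quad (= \lambda \text{ say}).
\]

This means

\[
(f_x, f_y) = \lambda\,(g_x, g_y) \quad\text{or}\quad \nabla f = \lambda \nabla g
\]

where λ is the Lagrange multiplier.

We recall that ∇f is normal to the curve f = constant, and ∇f = λ∇g says at the critical point(s) the two normals are proportional. So at the critical point, the curve where f is constant is tangent to the curve where g is constant. We can illustrate this in the following example.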

Example 9.10 Minimise $z = f(x, y) = x^2 + y^2$ subject to g(x, y) = xy = 16.

We draw in the circle $x^2 + y^2 = r^2$ and increase r until the corresponding circle touches the curve g(x, y) = 16.

Figure 9.3 shows the curve g(x, y) = 16 with some level curves of f(x, y).

[Figure 9.3: Lagrange Multipliers. Both axes run from −10 to 10.]
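As a sanity check on Example 9.10, the sketch below verifies the parallel-gradient condition at hand-derived candidate points and compares against a brute-force scan along the constraint. The candidates x = y = ±4 with λ = 2 come from solving 2x = λy, 2y = λx, xy = 16 by hand; they are an assumption of this illustration, not something stated in the notes.

```python
def f(x, y):
    return x * x + y * y


def grad_f(x, y):
    return (2 * x, 2 * y)


def grad_g(x, y):
    # g(x, y) = x * y
    return (y, x)


# Hand-derived candidates and multiplier (assumed, see the lead-in).
candidates = [(4.0, 4.0), (-4.0, -4.0)]
lam = 2.0

# Check grad f = lam * grad g and the constraint xy = 16 at each candidate.
conditions_hold = all(
    grad_f(x, y) == tuple(lam * c for c in grad_g(x, y)) and x * y == 16.0
    for x, y in candidates
)

# Brute force: walk along the branch y = 16/x for x in [0.5, 20] and record the smallest f.
brute_min = min(f(k / 100, 16 / (k / 100)) for k in range(50, 2001))

print(conditions_hold, f(4.0, 4.0), brute_min)
```

Both routes should agree on a minimum value of 32, attained where the circle of radius $\sqrt{32}$ touches the hyperbola.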

The “parallel gradients” idea generalises to more than two dimensions, and even to more than one constraint (where that makes sense):

Theorem 9.4 (Method of Lagrange Multipliers) To determine the maximum and minimum values of a function $f(\mathbf{x})$, where $\mathbf{x} \in \mathbb{R}^n$, subject to constraints $g_i(\mathbf{x}) = k_i$ for all $i = 1, \dots, k$ where $k \le n - 1$ (assuming that these extreme values exist):

(a) Find the values of $\mathbf{x}$ and $\lambda_i$ such that

\[
\nabla f(\mathbf{x}) = \sum_{i=1}^{k} \lambda_i \nabla g_i(\mathbf{x})
\]

and

\[
g_i(\mathbf{x}) = k_i \quad \forall\, i = 1, \dots, k \ (\le n - 1).
\]

(b) Evaluate f at all points $\mathbf{x}$ that result from the previous step. The largest of these values is the maximum value of f; the smallest is the minimum of f.

Note that we do not need any second derivative test: if there are extrema we have found them.

Example 9.11 Find the point closest to the origin on the graph of xy + 1.

Unfortunately, while the method of Lagrange multipliers is very powerful, there is no algorithm for solving the equations you get. The equations are usually non-linear, so you won’t be able to use Gaussian elimination for example. Sometimes it is easier to find out the multipliers (the $\lambda_i$) first, and sometimes it is best not to find them, but to eliminate them from your equations. There are also often special cases that need to be checked out: what if one variable (or a multiplier) is zero, for example, which means you cannot divide by or cancel it.

In the end, with Lagrange multiplier problems, you must use your wits and not be afraid to try something. If one approach does not work, try another.

Example 9.12 Find (if they exist) the maximum and minimum values of $f(x, y, z) = \dfrac{1}{xyz}$ on the ellipsoid $9x^2 + y^2 + z^2 = 3$ in the first octant.

Example 9.13 Find the extrema of the function f given by f(x, y, z) = x + y + z subject to the conditions $x^2 + y^2 = 2$ and x + z = 1.

Chapter 10

Jacobian Matrix and Inverse Functions

So far we have looked at functions from $\mathbb{R}$ to $\mathbb{R}^n$ (curves) and from $\mathbb{R}^2$ or $\mathbb{R}^3$ to $\mathbb{R}$; in other words, either the domain or the codomain, but not both, has consisted of vectors. We now want to look a little at more general vector valued functions on a vector domain.

10.1 Vector Valued Functions

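A function $\mathbf{f} : \mathbb{R}^p \to \mathbb{R}^q$ (q > 1) is a vector valued function of p variables. (It is usual to use $\mathbb{R}^n$ and $\mathbb{R}^m$ here, but p and q have the advantage of sounding less alike.)

Example 10.1

\[
\mathbf{f}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} x + y + z \\ xyz \end{pmatrix}
\]

defines a function from $\mathbb{R}^3$ to $\mathbb{R}^2$.

When it comes to these vector valued functions, we now must write vectors as column vectors (essentially because matrices act on column vectors). In example 10.1, the real-valued functions

\[
f^1\begin{pmatrix} x \\ y \\ z \end{pmatrix} = x + y + z
\quad\text{and}\quad
f^2\begin{pmatrix} x \\ y \\ z \end{pmatrix} = xyz
\]

are called the co-ordinate or component functions of $\mathbf{f}$, and we may write

\[
\mathbf{f} = \begin{pmatrix} f^1 \\ f^2 \end{pmatrix}.
\]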


Generally, any $\mathbf{f} : \mathbb{R}^p \to \mathbb{R}^q$ is determined by q co-ordinate functions $f^1, \dots, f^q$ and we write

\[
\mathbf{f} = \begin{pmatrix} f^1(x_1, \dots, x_p) \\ f^2(x_1, \dots, x_p) \\ \vdots \\ f^q(x_1, \dots, x_p) \end{pmatrix}
\tag{10.1}
\]

10.2 Jacobian Matrix

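All this is very obvious, but our first problem is how we define the derivative of a vector valued function. Recall that if $f : \mathbb{R}^2 \to \mathbb{R}$ then we can form the directional derivative, i.e.,

\[
f'_{\mathbf{u}} = u_1 \frac{\partial f}{\partial x} + u_2 \frac{\partial f}{\partial y} = \nabla f \cdot \mathbf{u}
\]

where $\mathbf{u} = (u_1, u_2)$. So knowledge of the gradient of f gives information about all directional derivatives. Therefore it is reasonable to assume

\[
\nabla_{\mathbf{p}} f = \left( \frac{\partial f}{\partial x}(\mathbf{p}),\ \frac{\partial f}{\partial y}(\mathbf{p}) \right)
\]

is the derivative of f at $\mathbf{p}$. (The story is more complicated than this but if f is “differentiable” then ∇f represents the derivative, see section 10.3.)

More generally if $f : \mathbb{R}^p \to \mathbb{R}$ we take the derivative at $\mathbf{a}$ to be the row vector

\[
\left( \frac{\partial f}{\partial x_1}(\mathbf{a}),\ \frac{\partial f}{\partial x_2}(\mathbf{a}),\ \dots,\ \frac{\partial f}{\partial x_p}(\mathbf{a}) \right) = \nabla_{\mathbf{a}} f.
\]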

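Now take $\mathbf{f} : \mathbb{R}^p \to \mathbb{R}^q$ where $\mathbf{f}$ is as in equation (10.1), then the natural candidate for the derivative of $\mathbf{f}$ at $\mathbf{a}$ is

\[
J_{\mathbf{a}}\mathbf{f} =
\begin{pmatrix}
\dfrac{\partial f^1}{\partial x_1} & \dfrac{\partial f^1}{\partial x_2} & \cdots & \dfrac{\partial f^1}{\partial x_p} \\
\dfrac{\partial f^2}{\partial x_1} & \dfrac{\partial f^2}{\partial x_2} & \cdots & \dfrac{\partial f^2}{\partial x_p} \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial f^q}{\partial x_1} & \dfrac{\partial f^q}{\partial x_2} & \cdots & \dfrac{\partial f^q}{\partial x_p}
\end{pmatrix}
\]

where the partial derivatives are evaluated at $\mathbf{a}$. This q × p matrix is called the Jacobian matrix of $\mathbf{f}$. Writing the function $\mathbf{f}$ as a column helps us to get the rows and columns of the Jacobian matrix the right way round. Note the “Jacobian” is usually the determinant of this matrix when the matrix is square, i.e., p = q.

Example 10.2 Find the Jacobian matrix of $\mathbf{f}$ from example 10.1 and evaluate it at the point A = (1, 2, 3).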
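A quick numerical cross-check for Example 10.2 is sketched below. It approximates each partial derivative by a central difference, so it is only a check on a hand calculation, not a replacement for one; the helper name `jacobian` and the step size `h` are assumptions of this illustration.

```python
def f(x, y, z):
    """The function of example 10.1, returned as a list of its two components."""
    return [x + y + z, x * y * z]


def jacobian(func, point, h=1e-6):
    """Central-difference approximation to the q x p Jacobian matrix of func at point."""
    q, p = len(func(*point)), len(point)
    J = [[0.0] * p for _ in range(q)]
    for j in range(p):
        forward, backward = list(point), list(point)
        forward[j] += h
        backward[j] -= h
        f_forward, f_backward = func(*forward), func(*backward)
        for i in range(q):
            # Row i is the component function, column j is the variable.
            J[i][j] = (f_forward[i] - f_backward[i]) / (2 * h)
    return J


J = jacobian(f, (1.0, 2.0, 3.0))
for row in J:
    print([round(entry, 6) for entry in row])
```

The rows should be close to (1, 1, 1) and (yz, xz, xy) = (6, 3, 2), which is what differentiating the two component functions by hand gives at (1, 2, 3).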


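Most of the cases we will be looking at have p = q = either 2 or 3. Suppose u = u(x, y) and v = v(x, y). If we define $\mathbf{f} : \mathbb{R}^2 \to \mathbb{R}^2$ by

\[
\mathbf{f}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} u(x, y) \\ v(x, y) \end{pmatrix} \equiv \begin{pmatrix} f^1 \\ f^2 \end{pmatrix}
\]

then the Jacobian matrix is

\[
J\mathbf{f} = \begin{pmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} \\ \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y} \end{pmatrix}
\]

and the Jacobian (determinant)

\[
\det(J\mathbf{f}) = \begin{vmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} \\ \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y} \end{vmatrix}
= \frac{\partial u}{\partial x}\frac{\partial v}{\partial y} - \frac{\partial v}{\partial x}\frac{\partial u}{\partial y}.
\]

We often denote $\det(J\mathbf{f})$ by $\dfrac{\partial(u, v)}{\partial(x, y)}$.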

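Example 10.3 Polar to Cartesian co-ordinates, where

x = r cos θ and y = r sin θ.

Then

\[
\frac{\partial(x, y)}{\partial(r, \theta)} =
\begin{vmatrix} \dfrac{\partial x}{\partial r} & \dfrac{\partial x}{\partial \theta} \\ \dfrac{\partial y}{\partial r} & \dfrac{\partial y}{\partial \theta} \end{vmatrix}
= \begin{vmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{vmatrix} = r.
\]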

10.3 Derivatives

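We have already noted that if $\mathbf{f} : \mathbb{R}^p \to \mathbb{R}^q$ then the Jacobian matrix at each point $\mathbf{a} \in \mathbb{R}^p$ is a q × p matrix. Such a matrix $J_{\mathbf{a}}\mathbf{f}$ gives us a linear map $D_{\mathbf{a}}\mathbf{f} : \mathbb{R}^p \to \mathbb{R}^q$ by

\[
(D_{\mathbf{a}}\mathbf{f})(\mathbf{x}) = J_{\mathbf{a}}\mathbf{f} \cdot \mathbf{x} \quad \text{for all } \mathbf{x} \in \mathbb{R}^p.
\]

Note that $\mathbf{x}$ is a column vector.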

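Definition 10.1 Formally $\mathbf{f} : \mathbb{R}^p \to \mathbb{R}^q$ is differentiable at $\mathbf{a}$ if, for small $\mathbf{h}$, $D_{\mathbf{a}}\mathbf{f}(\mathbf{h})$ is a “good” approximation to $\mathbf{f}(\mathbf{a} + \mathbf{h}) - \mathbf{f}(\mathbf{a})$ in the sense that

\[
\lim_{\mathbf{h} \to \mathbf{0}} \frac{\|\mathbf{f}(\mathbf{a} + \mathbf{h}) - \mathbf{f}(\mathbf{a}) - D_{\mathbf{a}}\mathbf{f}(\mathbf{h})\|}{\|\mathbf{h}\|} = 0
\]

where

\[
\|\mathbf{h}\| = \sqrt{h_1^2 + h_2^2 + \dots + h_p^2}.
\]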

You should compare this to the one variable case: a function $f : \mathbb{R} \to \mathbb{R}$ is differentiable at a if $\displaystyle\lim_{h \to 0} \frac{f(a + h) - f(a)}{h}$ exists, and we call this limit f′(a). But we could equally well say this as: $f : \mathbb{R} \to \mathbb{R}$ is differentiable at a if there is a number, written f′(a), for which

\[
\lim_{h \to 0} \frac{|f(a + h) - f(a) - f'(a) \cdot h|}{|h|} = 0,
\]

because a linear map $T : \mathbb{R} \to \mathbb{R}$ can only be multiplying by a number.

Example 10.4 Write the derivative of the function in example 10.1 at (1, 2, 3) as a linear map.

Suppose $\mathbf{f}$ and $\mathbf{g}$ are two differentiable functions from $\mathbb{R}^p$ to $\mathbb{R}^q$. It is easy to see that the derivative of $\mathbf{f} + \mathbf{g}$ is the sum of the derivatives of $\mathbf{f}$ and $\mathbf{g}$. We can take the dot product of $\mathbf{f}$ and $\mathbf{g}$ and get a function from $\mathbb{R}^p$ to $\mathbb{R}$, and then differentiate that. The result is a sort of product rule, but I’ll leave you to work out what happens. Since we cannot divide vectors, there cannot be a quotient rule, so of the standard differentiation rules, that leaves the chain rule.

10.4 The Chain Rule

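Now suppose that $\mathbf{g} : \mathbb{R}^p \to \mathbb{R}^s$ and $\mathbf{f} : \mathbb{R}^s \to \mathbb{R}^q$. We can now form the composition $\mathbf{f} \circ \mathbf{g}$ by mapping with $\mathbf{g}$ first and then following with $\mathbf{f}$:

\[
\begin{array}{ccccc}
\mathbb{R}^p & \xrightarrow{\ \mathbf{g}\ } & \mathbb{R}^s & \xrightarrow{\ \mathbf{f}\ } & \mathbb{R}^q \\
\mathbf{x} & \longrightarrow & \mathbf{g}(\mathbf{x}) & \longrightarrow & \mathbf{f}(\mathbf{g}(\mathbf{x}))
\end{array}
\tag{10.2}
\]

\[
(\mathbf{f} \circ \mathbf{g})(\mathbf{x}) = \mathbf{f}(\mathbf{g}(\mathbf{x})) \quad \forall\, \mathbf{x} \in \mathbb{R}^p
\]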

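Example 10.5 Let $\mathbf{g} : \mathbb{R}^2 \to \mathbb{R}^2$ and $\mathbf{f} : \mathbb{R}^2 \to \mathbb{R}^3$ be defined, respectively, by

\[
\mathbf{g}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x + y \\ xy \end{pmatrix}
\quad\text{and}\quad
\mathbf{f}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \sin x \\ x - y \\ xy \end{pmatrix}.
\]

Then $\mathbf{f} \circ \mathbf{g}$ is defined by

\[
(\mathbf{f} \circ \mathbf{g})\begin{pmatrix} x \\ y \end{pmatrix}
= \mathbf{f}\left(\mathbf{g}\begin{pmatrix} x \\ y \end{pmatrix}\right)
= \mathbf{f}\begin{pmatrix} x + y \\ xy \end{pmatrix}
= \begin{pmatrix} \sin(x + y) \\ x + y - xy \\ (x + y)(xy) \end{pmatrix}.
\]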


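Now if $\mathbf{f}$ and $\mathbf{g}$ in equation (10.2) above are differentiable, then if $\mathbf{b} = \mathbf{g}(\mathbf{a}) \in \mathbb{R}^s$, the maps $J_{\mathbf{a}}\mathbf{g} : \mathbb{R}^p \to \mathbb{R}^s$ and $J_{\mathbf{b}}\mathbf{f} : \mathbb{R}^s \to \mathbb{R}^q$ are defined, and we have

Theorem 10.1 (The Chain Rule) Suppose that $\mathbf{g} : \mathbb{R}^p \to \mathbb{R}^s$ and $\mathbf{f} : \mathbb{R}^s \to \mathbb{R}^q$ are differentiable. Then

\[
J_{\mathbf{a}}(\mathbf{f} \circ \mathbf{g}) = J_{\mathbf{g}(\mathbf{a})}\mathbf{f} \cdot J_{\mathbf{a}}\mathbf{g}.
\]

This is again just like the one variable case, except now we are multiplying matrices (see below).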

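Example 10.6 Consider example 10.5:

\[
\mathbf{g}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x + y \\ xy \end{pmatrix}
\quad\text{and}\quad
\mathbf{f}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \sin x \\ x - y \\ xy \end{pmatrix}.
\]

Find $J_{\mathbf{a}}(\mathbf{f} \circ \mathbf{g})$ where $\mathbf{a} = \begin{pmatrix} a_1 \\ a_2 \end{pmatrix}$. We have

\[
J_{\mathbf{a}}\mathbf{g} = \begin{pmatrix} 1 & 1 \\ y & x \end{pmatrix}_{\mathbf{a}} = \begin{pmatrix} 1 & 1 \\ a_2 & a_1 \end{pmatrix},
\]

also

\[
J_{\mathbf{g}(\mathbf{a})}\mathbf{f} = \begin{pmatrix} \cos x & 0 \\ 1 & -1 \\ y & x \end{pmatrix}_{x = a_1 + a_2,\ y = a_1 a_2}
= \begin{pmatrix} \cos(a_1 + a_2) & 0 \\ 1 & -1 \\ a_1 a_2 & a_1 + a_2 \end{pmatrix}
\]

and

\[
J_{\mathbf{a}}(\mathbf{f} \circ \mathbf{g}) = \begin{pmatrix} \cos(x + y) & \cos(x + y) \\ 1 - y & 1 - x \\ 2xy + y^2 & x^2 + 2xy \end{pmatrix}_{\mathbf{a}}.
\]

We find that

\[
\begin{pmatrix} \cos(a_1 + a_2) & \cos(a_1 + a_2) \\ 1 - a_2 & 1 - a_1 \\ 2a_1 a_2 + a_2^2 & a_1^2 + 2a_1 a_2 \end{pmatrix}
= \begin{pmatrix} \cos(a_1 + a_2) & 0 \\ 1 & -1 \\ a_1 a_2 & a_1 + a_2 \end{pmatrix}
\cdot \begin{pmatrix} 1 & 1 \\ a_2 & a_1 \end{pmatrix}.
\]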
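The matrix identity in Example 10.6 can be checked numerically at a sample point. The sketch below multiplies the two hand-computed Jacobians and compares the product with a finite-difference Jacobian of the composition. The sample point `a`, the step `h`, and the helper names are arbitrary choices made for this illustration.

```python
import math


def g(x, y):
    return (x + y, x * y)


def f(x, y):
    return (math.sin(x), x - y, x * y)


def compose(x, y):
    return f(*g(x, y))


def J_g(x, y):
    """Hand-computed 2 x 2 Jacobian of g."""
    return [[1.0, 1.0], [y, x]]


def J_f(x, y):
    """Hand-computed 3 x 2 Jacobian of f."""
    return [[math.cos(x), 0.0], [1.0, -1.0], [y, x]]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


def numeric_jacobian(func, point, h=1e-6):
    """Central-difference Jacobian, rows = components, columns = variables."""
    q, p = len(func(*point)), len(point)
    J = [[0.0] * p for _ in range(q)]
    for j in range(p):
        fwd, bwd = list(point), list(point)
        fwd[j] += h
        bwd[j] -= h
        hi, lo = func(*fwd), func(*bwd)
        for i in range(q):
            J[i][j] = (hi[i] - lo[i]) / (2 * h)
    return J


a = (0.5, -1.5)  # arbitrary sample point
chain = matmul(J_f(*g(*a)), J_g(*a))      # right-hand side of the chain rule
direct = numeric_jacobian(compose, a)     # left-hand side, approximated
max_gap = max(abs(chain[i][j] - direct[i][j]) for i in range(3) for j in range(2))
print(chain, max_gap)
```

Note the order of the factors: the Jacobian of $\mathbf{f}$ is evaluated at $\mathbf{g}(\mathbf{a})$, not at $\mathbf{a}$, and it multiplies on the left.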

I mentioned earlier that the chain rule here is very like the chain rule you met for functions of one variable. It is rather more accurate to say that the one variable chain rule is a special case of the one we’ve just met; the same can be said for the chain rules we saw in chapter 5.

Let $x : \mathbb{R} \to \mathbb{R}$ be a differentiable function of t and $u : \mathbb{R} \to \mathbb{R}$ a differentiable function of x. Then $(u \circ x) : \mathbb{R} \to \mathbb{R}$ is given by (u ∘ x)(t) = u(x(t)). In the notation of this chapter

\[
J_t(u \circ x) = J_{x(t)}u \cdot J_t x
\]

i.e.

\[
\left[\frac{d}{dt}(u \circ x)\right]_t = \left[\frac{du}{dx}\right]_{x(t)} \left[\frac{dx}{dt}\right]_t.
\]

We usually write this as

\[
\frac{du}{dt} = \frac{du}{dx}\frac{dx}{dt}
\]

keeping in mind that when we write $\dfrac{du}{dt}$ we are thinking of u as a function of t, i.e., u(x(t)), and when we write $\dfrac{du}{dx}$ we are thinking of u as a function of x.

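Now suppose we have x = x(t), y = y(t) and z = f(x, y). Then

\[
\mathbb{R} \xrightarrow{\ (x, y)\ } \mathbb{R}^2 \xrightarrow{\ f\ } \mathbb{R}
\]

and

\[
J_t(f \circ \mathbf{x}) = J_{\mathbf{x}(t)}f \cdot J_t\mathbf{x}.
\]

Therefore

\[
\left(\frac{d}{dt} f(x(t), y(t))\right) = \begin{pmatrix} \dfrac{\partial f}{\partial x} & \dfrac{\partial f}{\partial y} \end{pmatrix} \cdot \begin{pmatrix} \dfrac{dx}{dt} \\ \dfrac{dy}{dt} \end{pmatrix}
\]

so that

\[
\frac{df}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt},
\]

which is just what we saw in chapter 5.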

10.5 Inverse Functions

In first year (or earlier) you will have met the inverse function theorem, which says essentially that if f′(a) is not zero, there is a differentiable inverse function $f^{-1}$ defined near f(a) with

\[
\left[\frac{d}{dt}(f^{-1})\right]_{f(a)} = \frac{1}{f'(a)}.
\]


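What happens in the multi-variable case? Well, let us consider a case where we can write down the inverse. For polar coordinates we have

\[
x = r\cos\theta, \quad y = r\sin\theta
\]

\[
r = \sqrt{x^2 + y^2}, \quad \theta = \arctan\left(\frac{y}{x}\right).
\]

Now differentiating we get

\[
\frac{\partial r}{\partial x} = \frac{x}{\sqrt{x^2 + y^2}} = \frac{r\cos\theta}{r} = \cos\theta
\quad\text{and}\quad
\frac{\partial x}{\partial r} = \cos\theta,
\]

i.e.,

\[
\frac{\partial r}{\partial x} \neq \frac{1}{\ \dfrac{\partial x}{\partial r}\ }.
\]

We see that the one variable inverse function theorem does not apply to partial derivatives. However, there is a simple generalisation if we use the multivariable derivative, that is, the Jacobian matrix.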

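To continue with the polar coordinate example, define

\[
\mathbf{f}\begin{pmatrix} r \\ \theta \end{pmatrix} = \begin{pmatrix} x(r, \theta) \\ y(r, \theta) \end{pmatrix} = \begin{pmatrix} r\cos\theta \\ r\sin\theta \end{pmatrix}
\tag{10.3}
\]

and

\[
\mathbf{g}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} r(x, y) \\ \theta(x, y) \end{pmatrix} = \begin{pmatrix} \sqrt{x^2 + y^2} \\ \arctan\left(\frac{y}{x}\right) \end{pmatrix}
\tag{10.4}
\]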

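Consider

\[
(\mathbf{f} \circ \mathbf{g})\begin{pmatrix} x \\ y \end{pmatrix}
= \mathbf{f}\left(\mathbf{g}\begin{pmatrix} x \\ y \end{pmatrix}\right)
= \mathbf{f}\begin{pmatrix} r \\ \theta \end{pmatrix}
= \begin{pmatrix} x \\ y \end{pmatrix}
= \mathrm{Id}\begin{pmatrix} x \\ y \end{pmatrix}.
\]

Therefore $\mathbf{f} \circ \mathbf{g} = \mathrm{Id}$, the identity operator on $\mathbb{R}^2$. Similarly $\mathbf{g} \circ \mathbf{f} = \mathrm{Id}$.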

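Recall

\[
\mathrm{Id}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x \\ y \end{pmatrix}
\quad\text{so that}\quad
J(\mathrm{Id}) = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \equiv 2 \times 2 \text{ identity matrix}.
\]

Thus by the chain rule

\[
J\mathbf{f} \cdot J\mathbf{g} = J(\mathrm{Id}) = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = J\mathbf{g} \cdot J\mathbf{f}
\]

so that $(J\mathbf{f})^{-1} = J\mathbf{g}$. Note for simplicity the points of evaluation have been left out.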

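Therefore

\[
\begin{pmatrix} \dfrac{\partial r}{\partial x} & \dfrac{\partial r}{\partial y} \\ \dfrac{\partial \theta}{\partial x} & \dfrac{\partial \theta}{\partial y} \end{pmatrix}^{-1}
= \begin{pmatrix} \dfrac{\partial x}{\partial r} & \dfrac{\partial x}{\partial \theta} \\ \dfrac{\partial y}{\partial r} & \dfrac{\partial y}{\partial \theta} \end{pmatrix}.
\]

We can check this directly by substituting $\dfrac{\partial r}{\partial x} = \dfrac{x}{\sqrt{x^2 + y^2}} = \cos\theta$ etc.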
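The direct check can also be done numerically at a sample point. In the sketch below the entries of both Jacobians were differentiated by hand (in particular $\partial\theta/\partial x = -y/(x^2 + y^2)$ and $\partial\theta/\partial y = x/(x^2 + y^2)$), and the sample point r = 2, θ = 0.7 is an arbitrary assumption.

```python
import math


def J_f(r, theta):
    """Jacobian of (r, theta) -> (x, y) = (r cos theta, r sin theta)."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta), r * math.cos(theta)]]


def J_g(x, y):
    """Hand-computed Jacobian of (x, y) -> (r, theta) = (sqrt(x^2 + y^2), arctan(y/x))."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r],
            [-y / r2, x / r2]]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def det(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]


r, theta = 2.0, 0.7  # arbitrary sample point with r != 0
x, y = r * math.cos(theta), r * math.sin(theta)

product = matmul(J_g(x, y), J_f(r, theta))  # should be close to the identity
det_f, det_g = det(J_f(r, theta)), det(J_g(x, y))
print(product, det_f, det_g)
```

The product should come out as the 2 × 2 identity up to rounding, with determinants r and 1/r respectively.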

The same idea works in general:

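Theorem 10.2 (The Inverse Function Theorem) Let $\mathbf{f} : \mathbb{R}^p \to \mathbb{R}^p$ be differentiable at $\mathbf{a}$. Then if $J_{\mathbf{a}}\mathbf{f}$ is an invertible matrix, there is an inverse function $\mathbf{f}^{-1} : \mathbb{R}^p \to \mathbb{R}^p$ defined in some neighbourhood of $\mathbf{b} = \mathbf{f}(\mathbf{a})$ and

\[
(J_{\mathbf{b}}\mathbf{f}^{-1}) = (J_{\mathbf{a}}\mathbf{f})^{-1}.
\]

Note that the inverse function may only exist in a small region around $\mathbf{b} = \mathbf{f}(\mathbf{a})$.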

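Example 10.7 We earlier saw that for polar coordinates, with the notation of equation (10.3),

\[
J\mathbf{f} = \begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix},
\]

with determinant r. So it follows from the inverse function theorem that the inverse function $\mathbf{g}$ is differentiable if r ≠ 0.

Example 10.8 The function $\mathbf{g} : \mathbb{R}^2 \to \mathbb{R}^2$ is given by

\[
\mathbf{g}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} x^2 - y^2 \\ x^2 + y^2 \end{pmatrix}.
\]

Where is $\mathbf{g}$ invertible? Find the Jacobian matrix of $\mathbf{g}^{-1}$ where $\mathbf{g}$ is invertible.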

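Now let us apply the inverse function theorem to the Jacobian determinants. We recall that

\[
\frac{\partial(r, \theta)}{\partial(x, y)} = \det J\mathbf{g} =
\begin{vmatrix} \dfrac{\partial r}{\partial x} & \dfrac{\partial r}{\partial y} \\ \dfrac{\partial \theta}{\partial x} & \dfrac{\partial \theta}{\partial y} \end{vmatrix}
\quad\text{and}\quad
\frac{\partial(x, y)}{\partial(r, \theta)} = \det J\mathbf{f} =
\begin{vmatrix} \dfrac{\partial x}{\partial r} & \dfrac{\partial x}{\partial \theta} \\ \dfrac{\partial y}{\partial r} & \dfrac{\partial y}{\partial \theta} \end{vmatrix}.
\]

Since $J\mathbf{g}$ and $J\mathbf{f}$ are inverse matrices, their determinants are inverses:

\[
\frac{\partial(r, \theta)}{\partial(x, y)} = \frac{1}{\ \dfrac{\partial(x, y)}{\partial(r, \theta)}\ }.
\]

This sort of result is true for any change of variable, in any number of dimensions, and will prove very useful in chapter 13.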

Chapter 11

Double Integrals

We now wish to turn to integrating functions of several variables. We will only look at the case of real valued functions, not vector valued functions. We will also only be looking at the equivalent of definite integrals.

You should recall from one variable calculus the Riemann sum definition of the integral. Let f(x) be a function with domain [a, b]. We partition the domain into n subintervals $[x_{i-1}, x_i]$, not necessarily of equal length. In each $[x_{i-1}, x_i]$ we choose an arbitrary point $x_i^*$: the set of these points is called a tag of the partition. The Riemann sum for this tagged partition is

\[
\sum_{i=1}^{n} f(x_i^*)(x_i - x_{i-1}).
\]

If the limit as the length of the longest subinterval tends to zero (so n → ∞) exists and is independent of the tag we say f is integrable on [a, b] and define the integral

\[
\int_a^b f(x)\,dx = \lim \sum_{i=1}^{n} f(x_i^*)(x_i - x_{i-1}).
\]

We then proved theorems that tell us that if f is continuous then it is integrable, so we can use equal length subintervals and take the max (or min) on each subinterval as the tag.

We now want to extend this idea to functions of several variables, two to begin with. Important: you must get into the habit of drawing the region you are integrating over when doing multiple integrals.

11.1 Double Integrals over Rectangles

(Reference: SHE, Sections 16.1, 16.2)

Our first problem is that two dimensional regions come in many more shapes than one dimensional ones: the latter can only be intervals. So we begin by defining integration over the simplest type of two-dimensional region: rectangles.


Suppose $f : R \to \mathbb{R}$ is a function of two variables (x, y) defined on the rectangular region R:

\[
R = [a, b] \times [c, d] = \{(x, y) \in \mathbb{R}^2 \mid a \le x \le b,\ c \le y \le d\}.
\]

We consider the region R covered by a network of grid lines parallel to the x and y axes. These lines divide the region R into rectangles with areas ∆A = ∆x ∆y.

[Figure: the rectangle R in the xy-plane, with a and b marked on the x-axis and c and d on the y-axis.]

We next order these rectangles in some way and, as before, choose a sample point $(x_i^*, y_j^*)$ in each of the rectangles; the set of sample points is again called a tag. We again form the Riemann sum for the tagged partition

\[
\sum_{i=1}^{n} \sum_{j=1}^{m} f(x_i^*, y_j^*)\,\Delta A,
\]

and look at the limit of this sum as the size of the largest rectangle tends to zero. If this limit exists independently of the tag, then we would say f is integrable on R.

In the one variable case the integral is the area under the curve. In the two variable case it is the volume under the graph, assuming f(x, y) ≥ 0 on R, see figure 11.1.

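Definition 11.1 The double integral of z = f(x, y) over the rectangular region R is given by

\[
\begin{aligned}
\iint_R f(x, y)\,dA = \iint_R f(x, y)\,dx\,dy
&= \lim_{n \to \infty} \lim_{m \to \infty} \sum_{i=1}^{n} \sum_{j=1}^{m} f(x_i^*, y_j^*)\,\Delta A \\
&= \lim_{n, m \to \infty} \sum_{i=1}^{n} \sum_{j=1}^{m} f(x_i^*, y_j^*)\,\Delta A
\end{aligned}
\]

if this limit exists independently of the tag $\{(x_i^*, y_j^*)\}$.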

[Figure 11.1: Double Integral. Two panels, each plotted over x and y from −2 to 2 with z from 0 to 3.]

Just as in the one variable case, we can show that if f is continuous, then it is integrable, so we can take any partition and any tag to calculate $\iint f\,dA$.

Example 11.1 Compute $\displaystyle\iint_{[0,1] \times [0,2]} x y^2\,dA$ assuming the integral exists.

Let the partitions in the x and y directions be

\[
P_x = \left\{ \frac{i}{n} : i = 0, 1, \dots, n \right\}
\quad\text{and}\quad
P_y = \left\{ \frac{2j}{m} : j = 0, 1, \dots, m \right\}.
\]

In this example we will take $(x_i^*, y_j^*)$ as the lower right hand corner (vertex) for each of the rectangles $R_{ij}$, i.e., $(x_i^*, y_j^*) = (x_i, y_{j-1})$. Thus

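\[
\begin{aligned}
\sum_{i=1}^{n} \sum_{j=1}^{m} f(x_i^*, y_j^*)\,\Delta x\,\Delta y
&= \sum_{i=1}^{n} \sum_{j=1}^{m} f(x_i, y_{j-1})\,(x_i - x_{i-1})\,(y_j - y_{j-1}) \\
&= \sum_{i=1}^{n} \sum_{j=1}^{m} \frac{i}{n} \left( \frac{2(j - 1)}{m} \right)^2 \left( \frac{1}{n} \right) \left( \frac{2}{m} \right) \\
&= \frac{8}{n^2 m^3} \sum_{i=1}^{n} \sum_{j=1}^{m} i\,(j - 1)^2 \\
&= \frac{8}{n^2 m^3} \left( \sum_{i=1}^{n} i \right) \left( \sum_{j=1}^{m} (j - 1)^2 \right)
\end{aligned}
\]

We recall from first year

\[
\sum_{i=1}^{n} i = 1 + 2 + 3 + \dots + n = \frac{1}{2}\,n\,(n + 1)
\]

\[
\sum_{j=1}^{m} (j - 1)^2 = 1^2 + 2^2 + \dots + (m - 1)^2 = \frac{1}{6}\,(m - 1)\,m\,(2m - 1)
\]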


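Hence

\[
\begin{aligned}
\sum_{i=1}^{n} \sum_{j=1}^{m} f(x_i^*, y_j^*)\,\Delta x\,\Delta y
&= \frac{8}{n^2 m^3} \left( \sum_{i=1}^{n} i \right) \left( \sum_{j=1}^{m} (j - 1)^2 \right) \\
&= \frac{8}{n^2 m^3} \cdot \frac{1}{2}\,n\,(n + 1) \cdot \frac{1}{6}\,(m - 1)\,m\,(2m - 1) \\
&= \frac{2}{3} \left( 1 + \frac{1}{n} \right) \left( 1 - \frac{1}{m} \right) \left( 2 - \frac{1}{m} \right) \\
&\to \frac{4}{3} \quad \text{as } n, m \to \infty.
\end{aligned}
\]

Thus

\[
\iint_{[0,1] \times [0,2]} x y^2\,dA = \frac{4}{3}.
\]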
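To see the Riemann sum of Example 11.1 converge, here is a small sketch that evaluates the tagged sum directly, using the same lower right hand corner tag, and compares it with the closed form derived above. The function names are hypothetical and the grid sizes are arbitrary.

```python
def riemann_sum(n, m):
    """Riemann sum of f(x, y) = x*y^2 on [0,1] x [0,2] with tag (x_i, y_{j-1})."""
    dx, dy = 1.0 / n, 2.0 / m
    total = 0.0
    for i in range(1, n + 1):
        x_tag = i * dx                # right end of the i-th x-subinterval
        for j in range(1, m + 1):
            y_tag = (j - 1) * dy      # lower end of the j-th y-subinterval
            total += x_tag * y_tag ** 2 * dx * dy
    return total


def closed_form(n, m):
    """The expression (2/3)(1 + 1/n)(1 - 1/m)(2 - 1/m) obtained in the example."""
    return (2.0 / 3.0) * (1 + 1 / n) * (1 - 1 / m) * (2 - 1 / m)


for n, m in [(5, 5), (20, 30), (400, 400)]:
    print(n, m, riemann_sum(n, m), closed_form(n, m))
```

As n and m grow, both columns should approach 4/3, in line with the limit computed by hand.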

In practice we do not subdivide the region R and calculate the volume of rectangular columns but rather use the Fundamental Theorem of Calculus. This again is similar to the way we do one variable integration.

Let us recap the idea of finding volumes by slicing.

We begin with a positive function $f : R \to \mathbb{R}$, and look at its graph, left picture of figure 11.2. Specifically let’s slice parallel to the xz-plane, i.e., y is a constant. The cross section in the $y = y_j$ plane looks like the graph on the right in figure 11.2.

[Figure 11.2: Slicing the graph. Left: the surface plotted over x and y with z from 0 to 4; right: a cross section in the xz-plane.]

So if we assume that for a fixed y, f(x, y) is integrable with respect to x,

\[
A(y) = \int_a^b f(x, y)\,dx \quad \text{for } c \le y \le d
\]


so the area of the “slice” is $A(y_j)$. Thus the volume of the slice of thickness $\Delta y = y_j - y_{j-1}$ is $A(y_j)\,\Delta y$. Hence the volume under the surface is approximated by

\[
\sum_{j=1}^{m} A(y_j)\,\Delta y.
\]

But the limit of the sum as m → ∞ is

\[
\int_c^d A(y)\,dy
\]

(here we have assumed A is integrable with respect to y). Thus we would expect that

\[
\iint_R f(x, y)\,dA = \int_c^d \left[ \int_a^b f(x, y)\,dx \right] dy.
\]

In this case we integrate with respect to x first and then y.

Similarly by slicing parallel to the yz-plane we would expect

\[
\iint_R f(x, y)\,dA = \int_a^b \left[ \int_c^d f(x, y)\,dy \right] dx,
\]

where we integrate with respect to y first, then x. Note that we think of these integrals from the inside outwards: y first means dy is on the inside etc.

Example 11.2 Let R = [0, 2] × [0, 1]. Calculate

\[
\iint_R (4 - x - y)\,dA
\]

firstly by slicing parallel to the xz-plane (x integral first) and secondly slicing parallel to the yz-plane (y integral first).
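Both slicing orders in Example 11.2 can be imitated numerically by nesting a one variable rule. The sketch below uses a simple midpoint rule (an assumption of this illustration, with a hypothetical helper `midpoint_integral`); because the integrand is linear in each variable, the midpoint rule is essentially exact here, which makes the comparison clean.

```python
def midpoint_integral(func, lo, hi, steps=200):
    """Midpoint-rule approximation to the integral of func over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (k + 0.5) * h) for k in range(steps))


def f(x, y):
    return 4 - x - y


# x integral on the inside (slices with y held constant), then y from 0 to 1.
x_first = midpoint_integral(lambda y: midpoint_integral(lambda x: f(x, y), 0.0, 2.0), 0.0, 1.0)

# y integral on the inside (slices with x held constant), then x from 0 to 2.
y_first = midpoint_integral(lambda x: midpoint_integral(lambda y: f(x, y), 0.0, 1.0), 0.0, 2.0)

print(x_first, y_first)
```

Both orders should give the same value, 5, which is what the two iterated integrals produce when done by hand.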

The obvious question is: for what functions are these two integrals equal?

Theorem 11.1 (Fubini’s Theorem) If f(x, y) is continuous on the rectangular region R = [a, b] × [c, d] then

\[
\iint_R f(x, y)\,dA = \int_a^b \int_c^d f(x, y)\,dy\,dx = \int_c^d \int_a^b f(x, y)\,dx\,dy.
\]

We won’t prove this as we would need a formal definition of the integral.

In practice, it often doesn’t matter which way you choose to do the integrals, but sometimes it is much easier to do one way rather than the other:

Example 11.3 Use Fubini’s theorem to calculate

\[
\iint_R y \sin(xy)\,dA
\]

where R = [1, 2] × [0, π].

Fubini’s theorem has the following simple corollary for special types of functions:

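Corollary 11.2 If f(x, y) = g(x) h(y) then

\[
\iint_R f(x, y)\,dA = \left( \int_c^d h(y)\,dy \right) \left( \int_a^b g(x)\,dx \right).
\]

Proof: We have

\[
\begin{aligned}
\iint_R f(x, y)\,dA &= \int_c^d \left[ \int_a^b g(x)\,h(y)\,dx \right] dy \\
&= \int_c^d h(y) \left[ \int_a^b g(x)\,dx \right] dy \\
&= \left( \int_c^d h(y)\,dy \right) \left( \int_a^b g(x)\,dx \right).
\end{aligned}
\]

□

Example 11.4 Calculate $\displaystyle\iint_R x^2 y^3\,dx\,dy$ where R = [0, 1] × [0, 1].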

We are still left with the question of what to do if f becomes negative. In a sense, the mathematics does not care: just integrate. The interpretation of the integral will change: rather than having a volume you will have a signed volume, where the volume under the xy plane is counted as negative. This is the same idea as signed area for one variable integrals.

There are many standard properties of double integrals, most of which are very close to properties of one variable integrals:

Theorem 11.3 (Properties of Double Integrals)

a) If f(x, y) = 1 then $\displaystyle\iint_R f(x, y)\,dA = \iint_R dA$ is the area of R.

b) If the area of R is zero then $\displaystyle\iint_R f(x, y)\,dA = 0$ (even if f is unbounded on R).

c) $\displaystyle\iint_R [f(x, y) \pm g(x, y)]\,dA = \iint_R f(x, y)\,dA \pm \iint_R g(x, y)\,dA$.

d) $\displaystyle\iint_R k\,f(x, y)\,dA = k \iint_R f(x, y)\,dA$ if k is constant.

e) If f(x, y) ≥ g(x, y) on R then $\displaystyle\iint_R f(x, y)\,dA \ge \iint_R g(x, y)\,dA$.

f) Suppose $R = R_1 \cup R_2$ and the area of $R_1 \cap R_2$ is zero. Then

\[
\iint_R f(x, y)\,dA = \iint_{R_1} f(x, y)\,dA + \iint_{R_2} f(x, y)\,dA.
\]

Part f) of this result tells us how to calculate integrals over a region that is a union of rectangles: break the region up into disjoint rectangles and integrate over each one separately. Since the area of the edge of a rectangle is zero, we do not need to worry about the rectangles having common edges.

11.2 Double Integrals over general bounded regions

When we come to looking at double integrals over bounded non-rectangular regions R, such as figure 11.3, then we use a standard trick. We extend the definition of the integrand to any rectangle enclosing R and decree that the integrand is zero outside R. Then we integrate our extended integrand over the rectangle.

[Figure 11.3: General Region]

This will probably mean that the integrand has some discontinuities, but these will be over the boundary of R, which has zero area, so don’t cause any problems.

To calculate the integral over a general region, we do the same thing as for a rectangle, and slice with either x or y held constant. There is a generalisation of Fubini’s theorem that says that as long as f is continuous (except possibly for a region of zero area), then the order of the integration does not matter.

Suppose we are integrating f over R in figure 11.4 and we start with x constant, as shown. When we find the area of this slice, the limits of the integral will depend on x, rather than be a constant. We see that R can be described as

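\[
R = \{(x, y) : a \le x \le b,\ \varphi_1(x) \le y \le \varphi_2(x)\}
\]

for two functions $\varphi_1$ and $\varphi_2$.

[Figure 11.4: Slicing with x constant. The region R in the xy-plane with a, $x_0$ and b marked on the x-axis.]

The slice shown at $x = x_0$ would then have area

\[
\int_{\varphi_1(x_0)}^{\varphi_2(x_0)} f(x_0, y)\,dy
\]

and the double integral over R would be

\[
\iint_R f\,dx\,dy = \int_a^b \left[ \int_{\varphi_1(x)}^{\varphi_2(x)} f(x, y)\,dy \right] dx.
\]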

Example 11.5 Find the volume of the prism whose base is given by the triangle in the xy-plane bounded by the x-axis and the lines y = x and x = 1 and whose top surface lies in the plane x + y + z = 4.

The solid is represented on the left in figure 11.5.

[Figure 11.5: Prism. Left: the solid, plotted for x and y from 0 to 1 and z from 0 to 4; right: the base triangle in the xy-plane.]

Next consider the region of integration R in the xy-plane, right of figure 11.5.


Then for any x between 0 and 1, y varies from 0 to y = x. Hence the volume is given by

\[
\begin{aligned}
V = \iint_R f\,dA &= \int_0^1 \int_0^x (4 - x - y)\,dy\,dx \\
&= \int_0^1 \left( 4x - \frac{3}{2}x^2 \right) dx \\
&= \left. 2x^2 - \frac{1}{2}x^3 \right|_0^1 = \frac{3}{2}.
\end{aligned}
\]

On the other hand, if we tried to find the volume of example 11.5 by integrating with respect to x first we see that R is described as {(x, y) : 0 ≤ y ≤ 1, y ≤ x ≤ 1} and so we get the integral

\[
\int_0^1 \int_y^1 (4 - x - y)\,dx\,dy.
\]

This is no harder than the previous case, but there is a very important point to be made with this example. Unlike the case of a rectangular region, we cannot just swap the two integrals around: we have to rewrite the integral completely, as the limits can change drastically. This is why you must draw a diagram.

Furthermore, if we tried to slice the region in figure 11.4 holding y constant, we come to a problem.

[Figure 11.6: Slicing with y constant. The region R in the xy-plane.]

For the part of R below the dashed line in figure 11.6, we cannot describe the x coordinate of the boundary using two functions of the y value of the slice, as such a slice cuts the boundary in four points, not two, and hence consists of two intervals, not one. So in this case we would have to either integrate with respect to y first (that is, keep x constant first), or split R into three parts: a part above the dashed line and two parts below.

For some regions, the situation will be reversed (integrating with respect to x first involves only one integral) and we have already seen that in yet others we could do the integration in either order. But we could come across a situation like that in figure 11.6 where we would nevertheless do x first, because that makes the actual integration easier (or perhaps possible): three easy integrals are better than one very hard one.

Example 11.6 Let I be the integral

\[
I = \int_{-5}^{0} \int_{-y}^{5} (x + 2y)\,dx\,dy + \int_{0}^{25} \int_{\sqrt{y}}^{5} (x + 2y)\,dx\,dy.
\]

Sketch the region of integration, change the order of the integration and hence calculate the value of I.
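A numerical comparison can confirm that a proposed change of order in Example 11.6 describes the same region. The sketch below assumes the combined region is 0 ≤ x ≤ 5, −x ≤ y ≤ x² (read off from the limits above by hand, so treat it as an assumption to verify with your own sketch) and uses a nested midpoint rule with a hypothetical helper `midpoint`.

```python
import math


def midpoint(func, lo, hi, steps=400):
    """Midpoint-rule approximation to the integral of func over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (k + 0.5) * h) for k in range(steps))


def f(x, y):
    return x + 2 * y


# Original order: x on the inside, two separate pieces in y.
piece1 = midpoint(lambda y: midpoint(lambda x: f(x, y), -y, 5.0), -5.0, 0.0)
piece2 = midpoint(lambda y: midpoint(lambda x: f(x, y), math.sqrt(y), 5.0), 0.0, 25.0)
original_order = piece1 + piece2

# Swapped order (assumed description of the region): y on the inside, one piece.
swapped_order = midpoint(lambda x: midpoint(lambda y: f(x, y), -x, x * x), 0.0, 5.0)

print(original_order, swapped_order)
```

If the region has been read correctly, both numbers should be close to 781.25, the value of $\int_0^5 (x^3 + x^4)\,dx$ that the swapped order reduces to.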

We can call regions like figure 11.6 y-simple, but not x-simple. The region in example 11.5 is both x- and y-simple.

In general, we find that any region we are integrating over is either x-simple or y-simple (see figure 11.7), or both, or at least split into regions each of which is one of the two types.

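[Figure 11.7: Two types of region. Left panel: y-simple, the region R lying between the curves $\varphi_1$ and $\varphi_2$ with a and b marked on the x-axis. Right panel: x-simple, the region R lying between $\varphi_1$ and $\varphi_2$ with a and b marked on the y-axis.]

In figure 11.7, both $\varphi_1$ and $\varphi_2$ are continuous.

It is important to note that the region of integration R determines the limits of integration, NOT the integrand of the double integral.

Example 11.7 Let R be the region in figure 11.8. Find $\displaystyle\iint_R xy\,dA$.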

11.3 Double Integrals in Polar Coordinates

Suppose we wish to evaluate $\displaystyle\iint_R f(x, y)\,dA$ where R is, say, a region defined by circles, or some of the other curves you met in first year that are best described by polar coordinates, such as cardioids.

[Figure 11.8: Region R, bounded by the curves $y = x^2$, $y = \frac{1}{4}$ and y = 2 − x.]

[Figure 11.9: Polar rectangle, bounded by $r = r_1$, $r = r_2$, $\theta = \theta_1$ and $\theta = \theta_2$.]

The same basic idea we used before will work: we split the region up into “rectangles” and add up the volumes of prisms. The problem is that we need the area of a polar rectangle, which looks like the shape in figure 11.9.

So what is the area of this shape, in terms of $\Delta r = r_2 - r_1$ and $\Delta\theta = \theta_2 - \theta_1$?

Well, it is the difference between the sectors of two circles, and a sector with angle ∆θ of a circle of radius r has area $\frac{1}{2} r^2 \Delta\theta$. So the area of the polar rectangle, ∆A, is given by

\[
\Delta A = \frac{1}{2}(r_2^2 - r_1^2)\,\Delta\theta = \frac{1}{2}(r_1 + r_2)\,\Delta r\,\Delta\theta = \bar{r}\,\Delta r\,\Delta\theta,
\]

where $\bar{r} = \frac{1}{2}(r_1 + r_2)$ is the “average radius”.

So for double integration in polar coordinates, we would partition the region up into polar rectangles, as in figure 11.10.

Then pick a point $(r_i^*, \theta_j^*)$ in each polar rectangle (tag the partition) and form the Riemann sum

\[
\sum_{i=1}^{n} \sum_{j=1}^{m} f(r_i^*, \theta_j^*)\,\Delta A.
\]

[Figure 11.10: Polar partition]

As the partition gets finer we expect this to approach the integral in polar coordinates

\[
\iint_R f(x, y)\,dA = \iint_R f(r\cos\theta, r\sin\theta)\,r\,dr\,d\theta.
\]

Note: the extra r in the integrand.

Example 11.8 Find the volume of the solid under the surface $z = e^{x^2 + y^2}$ above the polar rectangle given by

\[
1 \le r \le 3, \quad 0 \le \theta \le \frac{1}{4}\pi.
\]
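The polar Riemann sum, including the extra factor of r, can be tried out on Example 11.8. The sketch below tags each polar rectangle at its midpoint and compares the sum with the value $\frac{\pi}{8}(e^9 - e)$ obtained by integrating $r e^{r^2}$ by hand; that closed form, the function name, and the grid sizes are assumptions of this illustration.

```python
import math


def polar_volume(n_r=2000, n_theta=50):
    """Midpoint polar Riemann sum for z = exp(x^2 + y^2) over 1 <= r <= 3, 0 <= theta <= pi/4."""
    r_lo, r_hi = 1.0, 3.0
    th_lo, th_hi = 0.0, math.pi / 4
    dr = (r_hi - r_lo) / n_r
    dth = (th_hi - th_lo) / n_theta
    total = 0.0
    for i in range(n_r):
        r = r_lo + (i + 0.5) * dr
        for j in range(n_theta):
            theta = th_lo + (j + 0.5) * dth
            x, y = r * math.cos(theta), r * math.sin(theta)
            # Area of the polar rectangle is (average radius) * dr * dtheta.
            total += math.exp(x * x + y * y) * r * dr * dth
    return total


approx = polar_volume()
by_hand = (math.pi / 8) * (math.exp(9) - math.exp(1))
print(approx, by_hand)
```

Dropping the factor `r` from the sum gives a visibly different number, which is a quick way to convince yourself that the extra r matters.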

Just as in the case of Cartesian coordinates, we can integrate over more complicated regions defined by polar coordinates by using variable integration limits.

Example 11.9 Sketch the four-leaved rose, given in polars by r = 2 sin 2θ, and calculate the area of one of its leaves.

Example 11.10 Evaluate $\displaystyle\iint_R y\,dA$ where R is the region in the first quadrant outside the circle r = 1 and inside the cardioid r = 1 + cos θ.

Most of the polar integrals you meet will be over regions where r is a function of θ and so we integrate with respect to r first, but we can integrate with respect to θ first if that is easier, or necessary, and there is a version of Fubini’s theorem that tells us the order does not matter.

Example 11.11 The centres of two spheres of radius a are 2b apart, where b ≤ a. Find the volume of their intersection.

11.4 Centroid and Centre of Mass

(Reference: SHE, Section 16.5)

The most obvious application of double integrals is to calculating areas and volumes. The next simplest is the mass and centre of mass of a lamina (thin sheet).

Suppose a lamina occupying a plane region R has density at co-ordinate (x, y) given by δ(x, y), δ having the dimension of mass per unit area. Hence the total mass M of the lamina is given by

\[
M = \iint_R \delta(x, y)\,dA.
\]

Definition 11.2 The co-ordinates of the centre of mass of a lamina occupying the region R and having density δ(x, y) are

\[
\bar{x} = \frac{1}{M} \iint_R x\,\delta(x, y)\,dA, \qquad \bar{y} = \frac{1}{M} \iint_R y\,\delta(x, y)\,dA,
\]

where M is the mass of the lamina.

The centre of mass is the point at which you could balance the lamina.

The integrals $\iint_R x\,\delta(x, y)\,dA$ and $\iint_R y\,\delta(x, y)\,dA$ are called the x-moment and y-moment respectively.

Definition 11.3 The co-ordinates of the centroid of a region R are

\[
x_c = \frac{1}{A} \iint_R x\,dA, \qquad y_c = \frac{1}{A} \iint_R y\,dA,
\]

where A is the area of R.

The centroid is the geometric centre of the region R, and is the centre of mass of a lamina in the shape of R with unit density.

Example 11.12 Find the centroid of the region enclosed by the three semi-circles below.

[Figure: three semi-circles in the xy-plane, with 1 and 2 marked on the x-axis.]

Example 11.13 Consider a triangular lamina R in the plane, bounded by x = 0, y = 0, x + y = 1, with density δ(x, y) = 2xy. Find its centre of mass.
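The mass and moment integrals for Example 11.13 can be approximated with the same nested midpoint idea used for non-rectangular regions: the inner limits depend on the outer variable. The helper name and step count below are assumptions; the hand values quoted afterwards come from integrating the polynomials directly.

```python
def midpoint(func, lo, hi, steps=400):
    """Midpoint-rule approximation to the integral of func over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (k + 0.5) * h) for k in range(steps))


def density(x, y):
    return 2 * x * y


def integrate_over_triangle(integrand):
    """Integrate over 0 <= x <= 1, 0 <= y <= 1 - x, with y on the inside."""
    return midpoint(lambda x: midpoint(lambda y: integrand(x, y), 0.0, 1.0 - x), 0.0, 1.0)


mass = integrate_over_triangle(density)
x_moment = integrate_over_triangle(lambda x, y: x * density(x, y))
y_moment = integrate_over_triangle(lambda x, y: y * density(x, y))

x_bar, y_bar = x_moment / mass, y_moment / mass
print(mass, x_bar, y_bar)
```

Integrating by hand gives M = 1/12 and both moments equal to 1/30, so the centre of mass should come out near (2/5, 2/5); the symmetry of the triangle and of the density in x and y makes the two co-ordinates equal.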

MATH 2011

Several Variable Calculus

Autumn 2010

Lecturer: Dr Bill Ellis

Email: [email protected]

Office Hour: Wed 2 pm & Fri 1 pm in RC-4071

April 7, 2010

Contents

12 Triple Integrals [SHE 10 §17.6 - 17.9]
    12.1 The Definite Integral of a Function over a Region in Space
    12.2 Computing $\iiint_T f \, dV$ using Cylindrical Polar Co-ordinates
    12.3 Computing $\iiint_T f \, dV$ using Spherical Polar Co-ordinates

13 Change of Variables [SHE 10 §17.10]

14 Line Integrals and Green’s Theorem [SHE 10 §18.1 - 18.5, 18.8]
    14.1 Line Integrals
        14.1.1 Hyperbolic Functions and their Inverses [SHE 10 §7.8 - 7.9]
    14.2 Curl and Divergence
    14.3 Green’s Theorem in the Plane
    14.4 Conservative Vector Fields (revisited)
        14.4.1 Independence of Path in Line Integrals
        14.4.2 Physical Interpretation of Divergence and Curl in $\mathbb{R}^3$

15 Parameterisation of Surfaces, Surface Integrals [SHE 10 §18.7]

17 Stokes’ Theorem [SHE 10 §18.10]

18 Divergence Theorem [SHE 10 §18.9]

19 Fourier Series
    19.1 Orthogonal functions
    19.2 Fourier series of trigonometric functions
        19.2.1 Definition of a trigonometric Fourier series
        19.2.2 Convergence of a trigonometric Fourier series
        19.2.3 Fourier sine and cosine series
        19.2.4 Periodic odd and even extensions

Chapter 12

Triple Integrals [SHE 10 §17.6 - 17.9]

LECTURE 1

12.1 The Definite Integral of a Function over a Region in Space

The approach to defining and then evaluating a triple integral is similar to that for the double integral. Consider the parallelepiped (box) defined by

$$B = [a_1, a_2] \times [b_1, b_2] \times [c_1, c_2] .$$

We then partition B into small boxes

[Figure: the box B in xyz-space partitioned into small boxes.]

and form the Riemann sum

$$\sum_{i=1}^{n} \sum_{j=1}^{m} \sum_{k=1}^{p} f(x_i^*, y_j^*, z_k^*) \, \Delta V$$

where ∆V = ∆x ∆y ∆z. If the Riemann sum tends to a limit (independent of the partition and of the choice of $(x_i^*, y_j^*, z_k^*)$) we call the limit the triple integral of f over B.

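Definition 12.1 The triple integral of f(x, y, z) over the parallelepiped region B is given by

$$\begin{aligned}
\iiint_B f(x, y, z) \, dV &= \iiint_B f(x, y, z) \, dx \, dy \, dz \\
&= \lim_{n \to \infty} \lim_{m \to \infty} \lim_{p \to \infty} \sum_{i=1}^{n} \sum_{j=1}^{m} \sum_{k=1}^{p} f(x_i^*, y_j^*, z_k^*) \, \Delta V \\
&= \lim_{n, m, p \to \infty} \sum_{i=1}^{n} \sum_{j=1}^{m} \sum_{k=1}^{p} f(x_i^*, y_j^*, z_k^*) \, \Delta V
\end{aligned}$$

if this limit exists.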

Example 12.1 Evaluate the triple integral

$$\int_0^1 \int_0^1 \int_{-2}^{2} x y z^2 \, dz \, dy \, dx .$$


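Solution:

$$\begin{aligned}
\int_0^1 \int_0^1 \int_{-2}^{2} x y z^2 \, dz \, dy \, dx
&= \int_0^1 \int_0^1 \left( \tfrac{1}{3} x y z^3 \Big|_{-2}^{2} \right) dy \, dx = \frac{16}{3} \int_0^1 \int_0^1 x y \, dy \, dx \\
&= \frac{16}{3} \int_0^1 \left( \tfrac{1}{2} x y^2 \Big|_0^1 \right) dx \\
&= \frac{8}{3} \int_0^1 x \, dx \\
&= \frac{8}{3} \left( \tfrac{1}{2} x^2 \Big|_0^1 \right) = \frac{4}{3} .
\end{aligned}$$

□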
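As a sanity check on Example 12.1, the Riemann-sum definition can be evaluated numerically. The sketch below is not part of the original notes: it assumes a uniform partition with the midpoint of each small box as the sample point $(x_i^*, y_j^*, z_k^*)$, and the name `riemann_triple` and the grid size `n` are illustrative choices.

```python
def riemann_triple(f, xlim, ylim, zlim, n=40):
    """Midpoint Riemann sum of f over the box xlim x ylim x zlim.

    Assumption: a uniform partition into n**3 small boxes, sampling f at
    the centre of each one.
    """
    (x0, x1), (y0, y1), (z0, z1) = xlim, ylim, zlim
    dx, dy, dz = (x1 - x0) / n, (y1 - y0) / n, (z1 - z0) / n
    total = 0.0
    for i in range(n):
        x = x0 + (i + 0.5) * dx
        for j in range(n):
            y = y0 + (j + 0.5) * dy
            for k in range(n):
                z = z0 + (k + 0.5) * dz
                total += f(x, y, z)
    return total * dx * dy * dz  # Delta V = dx * dy * dz


def example_12_1():
    # Integrand x*y*z^2 over [0, 1] x [0, 1] x [-2, 2]
    return riemann_triple(lambda x, y, z: x * y * z**2, (0, 1), (0, 1), (-2, 2))
```

With n = 40 the sum differs from the exact value 4/3 by roughly 0.001, and increasing n tightens the agreement.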

Things are not too bad when our region is a rectangular box B. What happens in a general region in $\mathbb{R}^3$?

Example 12.2 Determine the limits of integration for evaluating the triple integral of a function f(x, y, z) over the tetrahedron T with vertices (0, 0, 0), (0, 0, 1), (1, 0, 0) and (0, 1, 0).

Solution:

[Figure: the tetrahedron T plotted on x, y, z axes, each running from 0 to 1.]

We note that z runs from z = 0 to z = 1 − x − y.

Thus

$$\iiint_T f(x, y, z) \, dV = \iint_R \left( \int_0^{1-x-y} f(x, y, z) \, dz \right) dA .$$

The region R is represented in the following diagram.

Hence

$$\iiint_T f(x, y, z) \, dV = \int_0^1 \int_0^{1-x} \int_0^{1-x-y} f(x, y, z) \, dz \, dy \, dx .$$

□


Example 12.3 Evaluate $\iiint_T z \, dV$ over the volume T bounded by the planes x = 0, y = 0, z = 0, y = 1 − x and the surface $z = \sqrt{x + y}$.

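Solution:

[Figure: the solid T plotted on x, y, z axes, each running from 0 to 1.]

Thus

$$\begin{aligned}
\iiint_T z \, dV &= \int_0^1 \int_0^{1-x} \int_0^{\sqrt{x+y}} z \, dz \, dy \, dx = \int_0^1 \int_0^{1-x} \left( \tfrac{1}{2} z^2 \Big|_0^{\sqrt{x+y}} \right) dy \, dx \\
&= \frac{1}{2} \int_0^1 \int_0^{1-x} (x + y) \, dy \, dx \\
&= \frac{1}{2} \int_0^1 \left( x y + \tfrac{1}{2} y^2 \Big|_0^{1-x} \right) dx \\
&= \frac{1}{2} \int_0^1 \left( x (1 - x) + \tfrac{1}{2} (1 - x)^2 \right) dx \\
&= \frac{1}{2} \left( \tfrac{1}{2} x^2 - \tfrac{1}{3} x^3 - \tfrac{1}{6} (1 - x)^3 \Big|_0^1 \right) \\
&= \frac{1}{2} \left( \frac{1}{2} - \frac{1}{3} + \frac{1}{6} \right) = \frac{1}{6} .
\end{aligned}$$

□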

Example 12.4 Determine the volume of the region cut from the cylinder $x^2 + y^2 = 4$ by the planes z = 0 and x + z = 3.

Solution:

□

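Example 12.5 Determine the volume of the hemisphere $z = \sqrt{a^2 - x^2 - y^2}$, z ≥ 0.

Solution:

[Figure: the hemisphere plotted on x, y, z axes.]

$$\iiint_T dV = \int_{-a}^{a} \int_{-\sqrt{a^2 - x^2}}^{\sqrt{a^2 - x^2}} \int_0^{\sqrt{a^2 - x^2 - y^2}} dz \, dy \, dx = \int_{-a}^{a} \int_{-\sqrt{a^2 - x^2}}^{\sqrt{a^2 - x^2}} \sqrt{a^2 - x^2 - y^2} \, dy \, dx$$

Substitute $y = \sqrt{a^2 - x^2} \, \sin\theta$:

$$\begin{aligned}
\iiint_T dV &= \int_{-a}^{a} \int_{-\pi/2}^{\pi/2} (a^2 - x^2) \cos^2\theta \, d\theta \, dx \\
&= \frac{\pi}{2} \int_{-a}^{a} (a^2 - x^2) \, dx \\
&= \frac{\pi}{2} \left( a^2 x - \tfrac{1}{3} x^3 \Big|_{-a}^{a} \right) = \frac{2 \pi a^3}{3} .
\end{aligned}$$

□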

LECTURE 2

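12.2 Computing $\iiint_T f \, dV$ using Cylindrical Polar Co-ordinates

[Figure: a point with cylindrical polar co-ordinates (r, θ, z), shown relative to the x, y and z axes.]

$$x = r \cos\theta, \qquad y = r \sin\theta, \qquad z = z$$

Consider the volume element dV in cylindrical polar co-ordinates

$$dV = r \, dr \, d\theta \, dz$$

$$\iiint_T f(x, y, z) \, dx \, dy \, dz \equiv \iiint_T f(r \cos\theta, r \sin\theta, z) \, r \, dr \, d\theta \, dz$$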

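Example 12.6 Determine the volume of the region cut from the cylinder $x^2 + y^2 = 4$ by the planes z = 0 and x + z = 3.

Solution:

[Figure: the region T plotted on axes with x and y running from −2 to 2 and z from 0 to 5.]

$$\begin{aligned}
T &= \left\{ (x, y, z) \in \mathbb{R}^3 \mid -2 \le x \le 2, \ -\sqrt{4 - x^2} \le y \le \sqrt{4 - x^2}, \ 0 \le z \le 3 - x \right\} \\
&\equiv \left\{ (r, \theta, z) \in \mathbb{R}^3 \mid 0 \le \theta \le 2\pi, \ 0 \le r \le 2, \ 0 \le z \le 3 - r \cos\theta \right\}
\end{aligned}$$

Thus

$$\begin{aligned}
V = \iiint_T dV &= \int_{-2}^{2} \int_{-\sqrt{4 - x^2}}^{\sqrt{4 - x^2}} \int_0^{3 - x} dz \, dy \, dx = \int_0^2 \int_0^{2\pi} \int_0^{3 - r \cos\theta} r \, dz \, d\theta \, dr \\
&= \int_0^2 \int_0^{2\pi} r (3 - r \cos\theta) \, d\theta \, dr \\
&= 6\pi \int_0^2 r \, dr = 12\pi .
\end{aligned}$$

□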
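The cylindrical form of the integral in Example 12.6 is easy to check numerically. The following is a minimal sketch rather than anything from the original notes: it assumes a midpoint sum over the rectangle 0 ≤ r ≤ 2, 0 ≤ θ ≤ 2π in the rθ-plane, with the z-integration already carried out, and the function name and grid sizes are hypothetical.

```python
import math


def cylinder_wedge_volume(n_r=200, n_theta=200):
    """Midpoint sum of r * (3 - r cos(theta)) over 0 <= r <= 2, 0 <= theta <= 2 pi."""
    dr = 2.0 / n_r
    dtheta = 2.0 * math.pi / n_theta
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_theta):
            theta = (j + 0.5) * dtheta
            height = 3.0 - r * math.cos(theta)  # z runs from 0 to 3 - x
            total += height * r * dr * dtheta   # the factor r comes from dV = r dr dtheta dz
    return total
```

Dropping the factor `r` in the last line of the loop would instead integrate the height over a rectangle in the rθ-plane, which is the usual mistake the volume element guards against.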

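Example 12.7 Determine the volume of the solid that the cylinder r = a cos θ cuts out of the sphere of radius a centred at the origin.

Solution:

[Figure: the solid cut from the sphere by the cylinder, plotted on x, y, z axes, together with the cross-section in the xy-plane.]

$$T = \left\{ (r, \theta, z) \in \mathbb{R}^3 \mid -\tfrac{\pi}{2} \le \theta \le \tfrac{\pi}{2}, \ 0 \le r \le a \cos\theta, \ -\sqrt{a^2 - r^2} \le z \le \sqrt{a^2 - r^2} \right\}$$

$$\begin{aligned}
V = \iiint_T dV &= \int_{-\pi/2}^{\pi/2} \int_0^{a \cos\theta} \int_{-\sqrt{a^2 - r^2}}^{\sqrt{a^2 - r^2}} r \, dz \, dr \, d\theta \\
&= 4 \int_0^{\pi/2} \int_0^{a \cos\theta} r \sqrt{a^2 - r^2} \, dr \, d\theta \\
&= -\frac{4}{3} \int_0^{\pi/2} \left( (a^2 - r^2)^{3/2} \Big|_0^{a \cos\theta} \right) d\theta \\
&= -\frac{4}{3} \int_0^{\pi/2} \left( (a^2 - a^2 \cos^2\theta)^{3/2} - a^3 \right) d\theta \\
&= -\frac{4}{3} \int_0^{\pi/2} \left( a^3 \sin^3\theta - a^3 \right) d\theta \\
&= -\frac{4 a^3}{3} \int_0^{\pi/2} \left( \sin\theta \, (1 - \cos^2\theta) - 1 \right) d\theta \\
&= -\frac{4 a^3}{3} \left( -\cos\theta + \tfrac{1}{3} \cos^3\theta - \theta \, \Big|_0^{\pi/2} \right) \\
&= \frac{2}{9} a^3 (3\pi - 4) .
\end{aligned}$$

□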

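Example 12.8 Convert the triple integral

$$\int_{-1}^{1} \int_0^{\sqrt{1 - y^2}} \int_0^{x} \sqrt{x^2 + y^2} \, dz \, dx \, dy$$

to an equivalent triple integral in cylindrical polar co-ordinates.

Solution:

[Figure: the region T plotted on axes with x from 0 to 1, y from −1 to 1 and z from 0 to 1, together with its projection onto the xy-plane.]

$$\begin{aligned}
T &= \left\{ (x, y, z) \in \mathbb{R}^3 \mid -1 \le y \le 1, \ 0 \le x \le \sqrt{1 - y^2}, \ 0 \le z \le x \right\} \\
&\equiv \left\{ (r, \theta, z) \in \mathbb{R}^3 \mid -\tfrac{\pi}{2} \le \theta \le \tfrac{\pi}{2}, \ 0 \le r \le 1, \ 0 \le z \le r \cos\theta \right\}
\end{aligned}$$

Hence we could evaluate the integral

$$\begin{aligned}
\int_{-1}^{1} \int_0^{\sqrt{1 - y^2}} \int_0^{x} \sqrt{x^2 + y^2} \, dz \, dx \, dy
&= \int_0^1 \int_{-\pi/2}^{\pi/2} \int_0^{r \cos\theta} r \cdot r \, dz \, d\theta \, dr = \int_0^1 \int_{-\pi/2}^{\pi/2} \int_0^{r \cos\theta} r^2 \, dz \, d\theta \, dr \\
&= \int_0^1 \int_{-\pi/2}^{\pi/2} r^3 \cos\theta \, d\theta \, dr \\
&= \frac{1}{4} \int_{-\pi/2}^{\pi/2} \cos\theta \, d\theta \\
&= \frac{1}{4} \left( \sin\theta \, \Big|_{-\pi/2}^{\pi/2} \right) = \frac{1}{2} .
\end{aligned}$$

□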


Example 12.9 Evaluate $\iiint_T x^2 \, dV$ where T is the solid that lies within the cylinder $x^2 + y^2 = 1$, below the cone $z^2 = 4x^2 + 4y^2$ and above the plane z = 0.

Solution:

□

Definition 12.2 The co-ordinates of the centre of mass of an object occupying the region T and having density ρ(x, y, z) are

$$x_m = \frac{1}{m} \iiint_T x \, \rho(x, y, z) \, dV, \qquad y_m = \frac{1}{m} \iiint_T y \, \rho(x, y, z) \, dV, \qquad z_m = \frac{1}{m} \iiint_T z \, \rho(x, y, z) \, dV$$

where

$$m = \iiint_T \rho(x, y, z) \, dV .$$

Definition 12.3 The co-ordinates of the centroid of an object occupying the region T are

$$\bar{x} = \frac{1}{V} \iiint_T x \, dV, \qquad \bar{y} = \frac{1}{V} \iiint_T y \, dV, \qquad \bar{z} = \frac{1}{V} \iiint_T z \, dV$$

where

$$V = \iiint_T dV .$$

This yields the geometric centre of the region T with volume V.

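Example 12.10 Determine the volume and centroid of the solid that is bounded by the paraboloids $z = x^2 + y^2$ and $z = 36 - 3x^2 - 3y^2$.

Solution:

[Figure: the solid bounded by the two paraboloids, plotted on x, y, z axes, together with a cross-section in the xz-plane.]

The two paraboloids intersect when $x^2 + y^2 = 36 - 3x^2 - 3y^2$. This implies that the projection onto the xy-plane is

$$R = \left\{ (x, y) \in \mathbb{R}^2 \mid x^2 + y^2 \le 9 \right\} .$$

Thus in cylindrical polar co-ordinates

$$T = \left\{ (r, \theta, z) \in \mathbb{R}^3 \mid 0 \le \theta \le 2\pi, \ 0 \le r \le 3, \ r^2 \le z \le 36 - 3r^2 \right\} .$$

Before we can determine the centroid of the region we need to determine the volume V of the object. Thus

$$\begin{aligned}
V = \iiint_T dV &= \int_0^{2\pi} \int_0^3 \int_{r^2}^{36 - 3r^2} r \, dz \, dr \, d\theta \\
&= 2\pi \int_0^3 r \left( 36 - 4r^2 \right) dr \\
&= 2\pi \left( 18 r^2 - r^4 \, \Big|_0^3 \right) \\
&= 2\pi \, (162 - 81) \\
&= 162\pi .
\end{aligned}$$

Hence

$$\begin{aligned}
\bar{x} = \frac{1}{V} \iiint_T x \, dV &= \frac{1}{V} \int_0^{2\pi} \int_0^3 \int_{r^2}^{36 - 3r^2} r \cos\theta \; r \, dz \, dr \, d\theta \\
&= \frac{1}{V} \int_0^{2\pi} \cos\theta \, d\theta \int_0^3 \int_{r^2}^{36 - 3r^2} r^2 \, dz \, dr \\
&= 0 ,
\end{aligned}$$

$$\begin{aligned}
\bar{y} = \frac{1}{V} \iiint_T y \, dV &= \frac{1}{V} \int_0^{2\pi} \int_0^3 \int_{r^2}^{36 - 3r^2} r \sin\theta \; r \, dz \, dr \, d\theta \\
&= \frac{1}{V} \int_0^{2\pi} \sin\theta \, d\theta \int_0^3 \int_{r^2}^{36 - 3r^2} r^2 \, dz \, dr \\
&= 0 ,
\end{aligned}$$

$$\begin{aligned}
\bar{z} = \frac{1}{V} \iiint_T z \, dV &= \frac{1}{V} \int_0^{2\pi} \int_0^3 \int_{r^2}^{36 - 3r^2} z \, r \, dz \, dr \, d\theta \\
&= \frac{2\pi}{V} \int_0^3 r \left( \tfrac{1}{2} z^2 \Big|_{r^2}^{36 - 3r^2} \right) dr \\
&= \frac{\pi}{V} \int_0^3 r \left( (36 - 3r^2)^2 - r^4 \right) dr \\
&= \frac{\pi}{V} \int_0^3 r \left( 6^4 - 6^3 r^2 + 8 r^4 \right) dr \\
&= \frac{\pi}{V} \left( \frac{6^4}{2} r^2 - \frac{6^3}{4} r^4 + \frac{8}{6} r^6 \, \Big|_0^3 \right) \\
&= \frac{\pi}{V} \left( \frac{6^4 \cdot 9}{2} - \frac{6^3 \cdot 81}{4} + \frac{8}{6} \cdot 3^6 \right) = 15 .
\end{aligned}$$

□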
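The volume and the z co-ordinate of the centroid in Example 12.10 can be confirmed with a one-dimensional quadrature in r, since the θ integral only contributes a factor 2π. This is a sketch under stated assumptions, not the notes' own method: the helper `simpson` is a hand-written composite Simpson rule, and all names are illustrative.

```python
import math


def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3


def paraboloid_volume_and_zbar():
    lower = lambda r: r**2            # z = x^2 + y^2
    upper = lambda r: 36 - 3 * r**2   # z = 36 - 3x^2 - 3y^2
    # Inner z-integrals are done in closed form; the factor r is the Jacobian.
    volume = 2 * math.pi * simpson(lambda r: r * (upper(r) - lower(r)), 0.0, 3.0)
    z_moment = 2 * math.pi * simpson(
        lambda r: r * 0.5 * (upper(r) ** 2 - lower(r) ** 2), 0.0, 3.0
    )
    return volume, z_moment / volume
```

The first component of the returned pair should be close to 162π and the second close to 15, matching the hand calculation.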

LECTURE 3

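12.3 Computing $\iiint_T f \, dV$ using Spherical Polar Co-ordinates

[Figure: a point with spherical polar co-ordinates (r, θ, φ), shown relative to the x, y and z axes.]

$$x = r \cos\theta \sin\varphi, \qquad y = r \sin\theta \sin\varphi, \qquad z = r \cos\varphi$$

Consider the volume element dV in spherical polar co-ordinates

$$dV = r^2 \sin\varphi \, dr \, d\varphi \, d\theta$$

$$\iiint_T f(x, y, z) \, dx \, dy \, dz = \iiint_T f(r \cos\theta \sin\varphi, \ r \sin\theta \sin\varphi, \ r \cos\varphi) \, r^2 \sin\varphi \, dr \, d\theta \, d\varphi$$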

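Example 12.11 Determine the volume of the hemisphere $z = \sqrt{a^2 - x^2 - y^2}$, z ≥ 0.

Solution:

[Figure: the hemisphere plotted on x, y, z axes.]

$$\begin{aligned}
T &= \left\{ (x, y, z) \in \mathbb{R}^3 \mid -a \le x \le a, \ -\sqrt{a^2 - x^2} \le y \le \sqrt{a^2 - x^2}, \ 0 \le z \le \sqrt{a^2 - x^2 - y^2} \right\} \\
&\equiv \left\{ (r, \theta, \varphi) \in \mathbb{R}^3 \mid 0 \le r \le a, \ 0 \le \theta \le 2\pi, \ 0 \le \varphi \le \tfrac{\pi}{2} \right\}
\end{aligned}$$

Thus

$$\begin{aligned}
V = \iiint_T dV &= \int_{-a}^{a} \int_{-\sqrt{a^2 - x^2}}^{\sqrt{a^2 - x^2}} \int_0^{\sqrt{a^2 - x^2 - y^2}} dz \, dy \, dx = \int_0^{2\pi} \int_0^{\pi/2} \int_0^a r^2 \sin\varphi \, dr \, d\varphi \, d\theta \\
&= \int_0^{2\pi} d\theta \int_0^{\pi/2} \sin\varphi \, d\varphi \int_0^a r^2 \, dr \\
&= (2\pi) \, (1) \left( \frac{a^3}{3} \right) = \frac{2 \pi a^3}{3} .
\end{aligned}$$

□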

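Example 12.12 Determine the volume enclosed by the torus r = sin φ.

Solution:

[Figure: the torus plotted on axes with x and y running from −1 to 1 and z from −0.5 to 0.5.]

$$T = \left\{ (r, \theta, \varphi) \in \mathbb{R}^3 \mid 0 \le \theta \le 2\pi, \ 0 \le \varphi \le \pi, \ 0 \le r \le \sin\varphi \right\}$$

Thus

$$\begin{aligned}
V = \iiint_T dV &= \int_0^{2\pi} \int_0^{\pi} \int_0^{\sin\varphi} r^2 \sin\varphi \, dr \, d\varphi \, d\theta \\
&= 2\pi \int_0^{\pi} \tfrac{1}{3} \sin^4\varphi \, d\varphi \\
&= \frac{2\pi}{3} \left( \tfrac{3}{8} \varphi - \tfrac{1}{4} \sin 2\varphi + \tfrac{1}{32} \sin 4\varphi \, \Big|_0^{\pi} \right) \\
&= \frac{1}{4} \pi^2 .
\end{aligned}$$

□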
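The torus volume in Example 12.12 has a variable upper limit in r, so it makes a useful test of the spherical volume element. The sketch below is an illustration only: it assumes a hand-written composite Simpson rule (`simpson`) applied as an iterated integral, first in r and then in φ, with the θ integral replaced by the factor 2π.

```python
import math


def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3


def torus_volume():
    def inner(phi):
        # Integral of r^2 sin(phi) dr for 0 <= r <= sin(phi)
        return simpson(lambda r: r * r * math.sin(phi), 0.0, math.sin(phi), n=20)

    # Outer integral over 0 <= phi <= pi, times 2 pi from the theta integral
    return 2 * math.pi * simpson(inner, 0.0, math.pi)
```

The result should agree with π²/4 ≈ 2.4674 to several decimal places.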


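Example 12.13 Evaluate $\iiint_T y^2 \, dV$ where T is the solid bounded by the unit sphere centred at the origin and in the first octant.

Solution:

[Figure: the part of the unit ball in the first octant, plotted on x, y, z axes each running from 0 to 1.]

$$T = \left\{ (r, \theta, \varphi) \in \mathbb{R}^3 \mid 0 \le \theta \le \tfrac{\pi}{2}, \ 0 \le \varphi \le \tfrac{\pi}{2}, \ 0 \le r \le 1 \right\}$$

Thus

$$\begin{aligned}
\iiint_T y^2 \, dV &= \int_0^{\pi/2} \int_0^{\pi/2} \int_0^1 (r \sin\theta \sin\varphi)^2 \, r^2 \sin\varphi \, dr \, d\varphi \, d\theta \\
&= \int_0^{\pi/2} \int_0^{\pi/2} \int_0^1 r^4 \sin^2\theta \, \sin^3\varphi \, dr \, d\varphi \, d\theta \\
&= \left( \frac{\pi}{4} \right) \left( \frac{1}{5} \right) \int_0^{\pi/2} \sin^3\varphi \, d\varphi \\
&= \frac{\pi}{20} \int_0^{\pi/2} \left( 1 - \cos^2\varphi \right) \sin\varphi \, d\varphi \\
&= \frac{\pi}{20} \left( -\cos\varphi + \tfrac{1}{3} \cos^3\varphi \, \Big|_0^{\pi/2} \right) = \frac{\pi}{30} .
\end{aligned}$$

□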

Example 12.14 A spherical planet of radius R has an atmosphere whose density varies as $\rho = \rho_0 e^{-ch}$, where h is the altitude above the surface of the planet, $\rho_0$ is the density at the planet’s surface and c is a positive constant. Determine the mass of the planet’s atmosphere.

Solution:

□

Example 12.15 Determine the mass and centre of mass of the solid that lies beneath the sphere $x^2 + y^2 + z^2 = z$ and above the cone $z^2 = x^2 + y^2$ if its density varies in proportion to the distance from the vertex of the cone.

Solution:

□

Chapter 13

Change of Variables [SHE 10 §17.10]

LECTURE 4

The previously studied polar, cylindrical polar and spherical polar co-ordinate substitutions are specific examples of the general substitution method for multiple integrals. Our attention will be focused on developing this general method for double integrals, but it is easily extended to triple and higher multiple integrals.

Suppose that the region R∗ in the uv-plane is transformed one-to-one into the region R in the xy-plane by equations of the form

x = x(u, v), y = y(u, v)

We call the region R the image of R∗ under the transformation, and the region R∗ is the preimage of R. Hence any function f(x, y) defined on R can be thought of as a function f(x(u, v), y(u, v)) defined on R∗. The next question is how the integral of f(x, y) over the region R is related to the integral of f(x(u, v), y(u, v)) over R∗. The relationship is

$$\iint_R f(x, y) \, dx \, dy = \iint_{R^*} f(x(u, v), y(u, v)) \left| \frac{\partial(x, y)}{\partial(u, v)} \right| du \, dv$$

where $\frac{\partial(x, y)}{\partial(u, v)}$ is the Jacobian determinant

$$\frac{\partial(x, y)}{\partial(u, v)} = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2mm] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{vmatrix} .$$

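Example 13.1 For polar co-ordinates (x = r cos θ, y = r sin θ in $\mathbb{R}^2$) we have

$$\frac{\partial(x, y)}{\partial(r, \theta)} = \begin{vmatrix} \dfrac{\partial x}{\partial r} & \dfrac{\partial x}{\partial \theta} \\[2mm] \dfrac{\partial y}{\partial r} & \dfrac{\partial y}{\partial \theta} \end{vmatrix} = \begin{vmatrix} \cos\theta & -r \sin\theta \\ \sin\theta & r \cos\theta \end{vmatrix} = r .$$

Thus

$$\begin{aligned}
\iint_R f(x, y) \, dx \, dy &= \iint_{R^*} f(r \cos\theta, r \sin\theta) \, |r| \, dr \, d\theta \\
&= \iint_{R^*} f(r \cos\theta, r \sin\theta) \, r \, dr \, d\theta \qquad (\text{if } r \ge 0) .
\end{aligned}$$

Note that the integral on the right hand side can be considered as the integral of f(r cos θ, r sin θ) r over a region R∗ in the cartesian rθ-plane.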
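A Jacobian determinant can also be estimated numerically, which is a handy check when a transformation is messier than the polar one. The sketch below is not from the notes: it assumes central finite differences with a hypothetical step size `h`, and the function names are illustrative.

```python
import math


def jacobian_det(transform, u, v, h=1e-6):
    """Central-difference estimate of d(x, y)/d(u, v) at the point (u, v).

    `transform(u, v)` must return the pair (x, y).
    """
    x_up, y_up = transform(u + h, v)
    x_um, y_um = transform(u - h, v)
    x_vp, y_vp = transform(u, v + h)
    x_vm, y_vm = transform(u, v - h)
    x_u, y_u = (x_up - x_um) / (2 * h), (y_up - y_um) / (2 * h)
    x_v, y_v = (x_vp - x_vm) / (2 * h), (y_vp - y_vm) / (2 * h)
    return x_u * y_v - x_v * y_u  # determinant of the 2 x 2 matrix of partials


def polar(r, theta):
    return r * math.cos(theta), r * math.sin(theta)
```

For example, `jacobian_det(polar, 2.0, 0.7)` should be very close to 2.0, in line with the result $\partial(x, y)/\partial(r, \theta) = r$.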

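Example 13.2 Evaluate $\iint_R (3x + 4y) \, dx \, dy$ where the region R is bounded by

y = x, y = x − 2, y = −2x, y = 3 − 2x .

Solution: Consider the region R, which is represented in the following diagram

[Figure: the parallelogram R in the xy-plane.]

The equations of the boundary suggest the following transformation

u = x − y, v = 2x + y .

Hence the region R∗ is represented in the following diagram

[Figure: the rectangle R∗ in the uv-plane.]

$$R^* = \left\{ (u, v) \in \mathbb{R}^2 \mid 0 \le u \le 2, \ 0 \le v \le 3 \right\} .$$

The Jacobian determinant is given by

$$\frac{\partial(u, v)}{\partial(x, y)} = \begin{vmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} \\[2mm] \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y} \end{vmatrix} = \begin{vmatrix} 1 & -1 \\ 2 & 1 \end{vmatrix} = 3 \quad \Rightarrow \quad \frac{\partial(x, y)}{\partial(u, v)} = \frac{1}{3} .$$

We next wish to calculate the integral. Before doing so we need to rewrite the integrand 3x + 4y in terms of u and v. Thus

$$u = x - y, \quad v = 2x + y \quad \Rightarrow \quad x = \tfrac{1}{3} (u + v), \quad y = \tfrac{1}{3} (v - 2u) .$$

Hence

$$\begin{aligned}
\iint_R (3x + 4y) \, dx \, dy &= \int_0^3 \int_0^2 \tfrac{1}{3} (7v - 5u) \, \tfrac{1}{3} \, du \, dv \\
&= \frac{1}{9} \int_0^3 \left( 7vu - \tfrac{5}{2} u^2 \, \Big|_0^2 \right) dv \\
&= \frac{1}{9} \int_0^3 (14v - 10) \, dv \\
&= \frac{1}{9} \left( 7v^2 - 10v \, \Big|_0^3 \right) \\
&= \frac{1}{9} (63 - 30) \\
&= \frac{11}{3} .
\end{aligned}$$

□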
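To see that the change of variables really leaves the value of the integral unchanged, the integral over the parallelogram can be computed both ways. This is a sketch under assumptions that are not in the notes: in the xy-plane the limits are taken as max(x − 2, −2x) ≤ y ≤ min(x, 3 − 2x) for 0 ≤ x ≤ 5/3 (read off from the four boundary lines), and `simpson` is a hand-written composite Simpson rule.

```python
def simpson(f, a, b, n=100):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3


def integral_in_xy():
    """Iterated integral of 3x + 4y over R, integrating in y first."""
    def inner(x):
        lo = max(x - 2, -2 * x)   # lower boundary: y = x - 2 or y = -2x
        hi = min(x, 3 - 2 * x)    # upper boundary: y = x or y = 3 - 2x
        return simpson(lambda y: 3 * x + 4 * y, lo, hi, n=2)

    # n = 100 places grid points at x = 2/3 and x = 1, where the limits switch
    return simpson(inner, 0.0, 5.0 / 3.0, n=100)


def integral_in_uv():
    """The same integral after u = x - y, v = 2x + y, with |d(x,y)/d(u,v)| = 1/3."""
    inner = lambda v: simpson(lambda u: (7 * v - 5 * u) / 3 * (1 / 3), 0.0, 2.0, n=2)
    return simpson(inner, 0.0, 3.0, n=2)
```

Both functions should return values that agree with 11/3; the uv version needs only a rectangle, which is the whole point of the substitution.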


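Example 13.3 Evaluate $\iint_R \sin(x - y) \cos(x + 2y) \, dx \, dy$ where the region R is the parallelogram bounded by

$$x - y = 0, \quad x - y = \pi, \quad x + 2y = 0, \quad x + 2y = \frac{\pi}{2} .$$

Solution: Let u = x − y and v = x + 2y. Thus the region R in the xy-plane

[Figure: the parallelogram R in the xy-plane.]

is the image of the region R∗ in the uv-plane

[Figure: the rectangle R∗ in the uv-plane, with u from 0 to π and v from 0 to π/2.]

$$R^* = \left\{ (u, v) \in \mathbb{R}^2 \mid 0 \le u \le \pi, \ 0 \le v \le \tfrac{\pi}{2} \right\} .$$

$$\frac{\partial(u, v)}{\partial(x, y)} = \begin{vmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} \\[2mm] \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y} \end{vmatrix} = \begin{vmatrix} 1 & -1 \\ 1 & 2 \end{vmatrix} = 3 \quad \Rightarrow \quad \frac{\partial(x, y)}{\partial(u, v)} = \frac{1}{3} .$$

Thus

$$\begin{aligned}
\iint_R \sin(x - y) \cos(x + 2y) \, dx \, dy &= \int_0^{\pi} \int_0^{\pi/2} \sin u \, \cos v \, \tfrac{1}{3} \, dv \, du \\
&= \frac{1}{3} \int_0^{\pi} \sin u \, du \int_0^{\pi/2} \cos v \, dv \\
&= \frac{2}{3} .
\end{aligned}$$

□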


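Example 13.4 Evaluate

$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{e^{-(x - y)^2}}{1 + (x + y)^2} \, dx \, dy$$

by integrating over the square region $R = \left\{ (x, y) \in \mathbb{R}^2 \mid -a \le x \le a, \ -a \le y \le a \right\}$ and then taking the limit as a → ∞. HINT: Set u = x − y and v = x + y.

Solution: If u = x − y and v = x + y then the region R in the xy-plane

[Figure: the square −a ≤ x ≤ a, −a ≤ y ≤ a in the xy-plane.]

is the image of the region R∗ in the uv-plane

[Figure: the region R∗ in the uv-plane, with u and v each ranging between −2a and 2a.]

$$\frac{\partial(u, v)}{\partial(x, y)} = \begin{vmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} \\[2mm] \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y} \end{vmatrix} = \begin{vmatrix} 1 & -1 \\ 1 & 1 \end{vmatrix} = 2 \quad \Rightarrow \quad \frac{\partial(x, y)}{\partial(u, v)} = \frac{1}{2} .$$

Thus

$$\begin{aligned}
\lim_{a \to \infty} \int_{-a}^{a} \int_{-a}^{a} \frac{e^{-(x - y)^2}}{1 + (x + y)^2} \, dx \, dy &= 2 \lim_{a \to \infty} \int_0^{2a} \int_{u - 2a}^{-u + 2a} \frac{e^{-u^2}}{1 + v^2} \, \frac{1}{2} \, dv \, du \\
&= \int_0^{\infty} \int_{-\infty}^{\infty} \frac{e^{-u^2}}{1 + v^2} \, dv \, du \\
&= \int_0^{\infty} e^{-u^2} \, du \int_{-\infty}^{\infty} \frac{1}{1 + v^2} \, dv \\
&= \frac{\sqrt{\pi}}{2} \left( \arctan v \, \Big|_{-\infty}^{\infty} \right) \\
&= \frac{1}{2} \pi^{3/2} .
\end{aligned}$$

□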

Example 13.5 Calculate the area of the region R bounded by the curves $x^2 - 2xy + y^2 + x + y = 0$ and x + y + 4 = 0.

Solution: The first curve can be rewritten as

$$x^2 - 2xy + y^2 + x + y = 0 \quad \Rightarrow \quad (x - y)^2 + x + y = 0 .$$

This suggests the transformation

□

The generalisation to triple integrals is as follows:

$$\iiint_T f(x, y, z) \, dx \, dy \, dz = \iiint_{T^*} f\left( x(u, v, w), y(u, v, w), z(u, v, w) \right) \left| \frac{\partial(x, y, z)}{\partial(u, v, w)} \right| du \, dv \, dw .$$


Chapter 14

Line Integrals and Green’s Theorem [SHE 10 §18.1 - 18.5, 18.8]

LECTURE 5

14.1 Line Integrals

We begin by considering a wire of variable density (mass/unit length). The problem of calculating the mass is easy if the wire has the form of a straight line along, say, the x-axis from a to b. Let the linear mass density be f(x). We partition the wire along [a, b] into n sub-intervals $[x_{i-1}, x_i]$, 1 ≤ i ≤ n, of length ∆x such that $x_0 = a$, $x_n = b$ and $x_i = x_{i-1} + \Delta x$. Then

$$\text{Mass } (m) \simeq \sum_{i=1}^{n} f(x_i) \, \Delta x .$$

As the number of sub-intervals approaches ∞, this Riemann sum approaches the integral of f, thus

$$m = \int_a^b f(x) \, dx .$$

Suppose now that the wire is bent into a 3-dimensional curve. To obtain the mass we must first define the curve. This is usually done parametrically by giving the components x, y, z as functions of a parameter t. The curve is then given in vector form as

$$\mathbf{r}(t) = x(t) \, \mathbf{i} + y(t) \, \mathbf{j} + z(t) \, \mathbf{k} .$$

The linear mass density (mass per unit length) is a function of position in space, $f(\mathbf{r}(t)) = f(x(t), y(t), z(t))$. Suppose we calculate the mass of the wire from t = a to t = b. To do this we partition the parameter interval [a, b] into n sub-intervals $[t_{i-1}, t_i]$, 1 ≤ i ≤ n, of length ∆t such that $t_0 = a$, $t_n = b$ and $t_i = t_{i-1} + \Delta t$. This divides the curve into n sub-curves. Consider one of these sub-curves. Then

$$\Delta \text{Mass} \simeq f(\mathbf{r}(t_i)) \times (\text{length of sub-curve})$$

$$\begin{aligned}
\text{length of sub-curve} &\simeq \sqrt{[x(t_i) - x(t_{i-1})]^2 + [y(t_i) - y(t_{i-1})]^2 + [z(t_i) - z(t_{i-1})]^2} \\
&\simeq \sqrt{[x'(t_i) \Delta t]^2 + [y'(t_i) \Delta t]^2 + [z'(t_i) \Delta t]^2} \\
&= \sqrt{(x'(t_i))^2 + (y'(t_i))^2 + (z'(t_i))^2} \; \Delta t \\
&= \| \mathbf{r}'(t_i) \| \, \Delta t .
\end{aligned}$$


Hence

$$\Delta \text{Mass} \simeq f(\mathbf{r}(t_i)) \, \| \mathbf{r}'(t_i) \| \, \Delta t$$

and so

$$\text{Mass of wire} \simeq \sum_i f(\mathbf{r}(t_i)) \, \| \mathbf{r}'(t_i) \| \, \Delta t .$$

Finally, taking the limit as the number of sub-intervals approaches ∞ (or as ∆t → 0), the Riemann sum approaches the integral and so

$$\text{Mass of the wire} = \int_a^b f(\mathbf{r}(t)) \, \| \mathbf{r}'(t) \| \, dt .$$

This integral is called the line integral of the scalar field or scalar function f along the curve C and is denoted by

$$\int_C f \, ds .$$

Thus by definition

$$\int_C f \, ds = \int_a^b f(\mathbf{r}(t)) \, \| \mathbf{r}'(t) \| \, dt .$$

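Example 14.1 A wire joins the points (0, 0, 0) to (1, 2, 2) in a straight line and has linear mass density

$$f(x, y, z) = x + y^2 + z^2 .$$

Calculate the mass.

Solution: Our first task is to determine the parametric equation for the curve. Since the curve is a straight line we can use the vector parameterisation of a line,

$$\mathbf{r}(t) = \mathbf{r}(0) + t \, \mathbf{v}$$

where $\mathbf{r}(0) = \langle 0, 0, 0 \rangle$ and $\mathbf{v} = \langle 1 - 0, 2 - 0, 2 - 0 \rangle = \langle 1, 2, 2 \rangle$ with t : 0 → 1. Thus

$$\mathbf{r}(t) = t \, \langle 1, 2, 2 \rangle = t \, \mathbf{i} + 2t \, \mathbf{j} + 2t \, \mathbf{k} .$$

Therefore $\mathbf{r}'(t) = \mathbf{i} + 2 \mathbf{j} + 2 \mathbf{k}$ and $\| \mathbf{r}'(t) \| = 3$. Hence

$$\begin{aligned}
\text{Mass} = \int_C f \, ds &= \int_0^1 f(\mathbf{r}(t)) \, \| \mathbf{r}'(t) \| \, dt \\
&= \int_0^1 \left( t + (2t)^2 + (2t)^2 \right) 3 \, dt \\
&= 3 \left( \frac{t^2}{2} + \frac{8 t^3}{3} \, \Big|_0^1 \right) = 9 \tfrac{1}{2} .
\end{aligned}$$

□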
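The Riemann-sum construction of the line integral translates directly into a few lines of code. The sketch below is illustrative rather than part of the notes: it assumes the density is sampled at the midpoint of each parameter sub-interval (the notes sample at $t_i$), approximates each sub-curve by its chord, and uses hypothetical function names.

```python
import math


def scalar_line_integral(f, r, a, b, n=2000):
    """Approximate the line integral of f ds along the curve r(t), a <= t <= b."""
    dt = (b - a) / n
    total = 0.0
    for i in range(n):
        t0, t1 = a + i * dt, a + (i + 1) * dt
        chord = math.dist(r(t0), r(t1))           # length of the i-th sub-curve
        total += f(*r(0.5 * (t0 + t1))) * chord   # density times length
    return total


def wire_mass():
    density = lambda x, y, z: x + y**2 + z**2
    line = lambda t: (t, 2 * t, 2 * t)  # straight line from (0,0,0) to (1,2,2)
    return scalar_line_integral(density, line, 0.0, 1.0)
```

Calling `wire_mass()` should give a value very close to 9.5, the mass found in Example 14.1. Passing the constant function 1 as `f` gives an estimate of the length of the curve instead.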

Example 14.2 Calculate the line integral

$$\int_C \frac{xy}{z} \, ds$$

where C is the curve $\mathbf{r}(t) = t \, \mathbf{i} + t^2 \, \mathbf{j} + t^2 \, \mathbf{k}$ from t = 0 to t = 1.

Solution:

□


Notes:

• If we let f be the constant function 1 then the line integral gives the length of the curve, i.e.,

$$\text{length } \ell = \int_C ds .$$

For a curve in $\mathbb{R}^2$ that is the graph of y = f(x) we can parametrise by x and write

x = t, y = f(t).

Then

$$\begin{aligned}
\text{length } \ell = \int_C ds &= \int_a^b \sqrt{\left( \frac{dx}{dt} \right)^2 + \left( \frac{dy}{dt} \right)^2} \, dt \\
&= \int_a^b \sqrt{1 + (f'(t))^2} \, dt .
\end{aligned}$$

• The distance along a curve from some given point on it is usually denoted by s, and then we say that for the curve $\mathbf{r}(t)$ the line segment (or element of length) is given by

$$ds = \| \mathbf{r}'(t) \| \, dt .$$

• If t represents time as a particle moves along the curve $\mathbf{r}(t)$ then it can be shown that $\mathbf{r}'(t)$ is the velocity of the particle at $\mathbf{r}(t)$. We have

$$\mathbf{r}'(t) = \lim_{h \to 0} \frac{\mathbf{r}(t + h) - \mathbf{r}(t)}{h} ,$$

i.e., a limit of (change in position)/(change in time). Then $\mathbf{r}'(t)$ is tangent to the curve and the magnitude of the velocity is $\| \mathbf{r}'(t) \|$. The line element $ds = \| \mathbf{r}'(t) \| \, dt$ then says in “rough” intuitive terms

distance travelled = speed × time.

• The centre of mass of a wire shaped like a curve C, with linear density function σ(x, y, z) and mass m, is located at the point $(x_m, y_m, z_m)$, where

$$x_m = \frac{1}{m} \int_C x \, \sigma(x, y, z) \, ds, \qquad y_m = \frac{1}{m} \int_C y \, \sigma(x, y, z) \, ds, \qquad z_m = \frac{1}{m} \int_C z \, \sigma(x, y, z) \, ds$$

where

$$m = \int_C \sigma(x, y, z) \, ds .$$

Example 14.3 Determine the area of that portion of the surface $x^2 + y^2 = 9$ that lies above the first quadrant of the xy-plane and below the plane z = 3 − x.

Solution:

[Figure: the portion of the cylinder plotted on x, y, z axes, each running from 0 to 3.]

□

14.1.1 Hyperbolic Functions and their Inverses [SHE 10 §7.8 - 7.9]

Definition 14.1 Let x be a real number. The hyperbolic functions are given by

$$\cosh x \equiv \frac{e^x + e^{-x}}{2} \quad \text{(hyperbolic cosine)},$$

$$\sinh x \equiv \frac{e^x - e^{-x}}{2} \quad \text{(hyperbolic sine)},$$

$$\tanh x \equiv \frac{\sinh x}{\cosh x} \quad \text{(hyperbolic tangent)}.$$

Example 14.4 Show that $\cosh^2 x - \sinh^2 x = 1$.

Solution:

□
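The solution to Example 14.4 is left blank in these notes. One possible way to fill it in, working directly from Definition 14.1, is sketched here; it is offered as a suggestion and is not the lecturer's own working.

```latex
\begin{aligned}
\cosh^2 x - \sinh^2 x
  &= \left( \frac{e^{x} + e^{-x}}{2} \right)^{2} - \left( \frac{e^{x} - e^{-x}}{2} \right)^{2} \\
  &= \frac{\left( e^{2x} + 2 + e^{-2x} \right) - \left( e^{2x} - 2 + e^{-2x} \right)}{4} \\
  &= \frac{4}{4} = 1 .
\end{aligned}
```

The cross terms $\pm 2 e^{x} e^{-x} = \pm 2$ are the only ones that survive the subtraction, which is why the identity holds for every real x.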

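Example 14.5 Determine the derivatives of cosh x and sinh x.

Solution:

$$\frac{d}{dx} \cosh x = \frac{d}{dx} \left( \frac{e^x + e^{-x}}{2} \right) = \frac{e^x - e^{-x}}{2} = \sinh x ,$$

$$\frac{d}{dx} \sinh x = \frac{d}{dx} \left( \frac{e^x - e^{-x}}{2} \right) = \frac{e^x + e^{-x}}{2} = \cosh x .$$

□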


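Definition 14.2 Euler’s Formula is given by

$$e^{i\theta} = \cos\theta + i \sin\theta$$

where $i = \sqrt{-1}$.

Example 14.6 Using the definitions of the hyperbolic cosine and hyperbolic sine respectively we have

$$\cosh i\theta = \frac{e^{i\theta} + e^{-i\theta}}{2} = \frac{\cos\theta + i \sin\theta + \cos\theta - i \sin\theta}{2} = \cos\theta ,$$

$$\sinh i\theta = \frac{e^{i\theta} - e^{-i\theta}}{2} = \frac{\cos\theta + i \sin\theta - \cos\theta + i \sin\theta}{2} = i \sin\theta .$$

Example 14.7 Find the length of the catenary $\mathbf{r}(t) = t \, \mathbf{i} + \cosh t \, \mathbf{j}$ from t = 0 to t = 1.

[Figure: graph of the catenary for 0 ≤ t ≤ 1.]

Solution: Here

$$\mathbf{r}'(t) = \mathbf{i} + \sinh t \, \mathbf{j} .$$

Hence

$$\begin{aligned}
\ell = \int_0^1 \| \mathbf{r}'(t) \| \, dt &= \int_0^1 \sqrt{1 + \sinh^2 t} \, dt = \int_0^1 \sqrt{\cosh^2 t} \, dt \\
&= \int_0^1 \cosh t \, dt \\
&= \sinh 1 .
\end{aligned}$$

□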

LECTURE 6

There is a second type of line integral which is important. Consider the problem of computing the work done by a position-dependent force $\mathbf{F}(\mathbf{r})$ as it acts on a particle that moves along a curve. Again we have the curve specified parametrically by a vector function $\mathbf{r}(t) = x(t) \, \mathbf{i} + y(t) \, \mathbf{j} + z(t) \, \mathbf{k}$. The force, however, is now a vector function of position and so $\mathbf{F}(\mathbf{r})$ is a function of x, y, z which we write as

$$\mathbf{F}(x, y, z) \quad \text{or} \quad \mathbf{F}(\mathbf{r}) .$$

Such a function is called a vector field or vector function. For example, if

$$\mathbf{F}(x, y, z) = yz \, \mathbf{i} + xz \, \mathbf{j} + xy \, \mathbf{k}$$

then

$$\mathbf{F}(1, 2, 2) = 4 \, \mathbf{i} + 2 \, \mathbf{j} + 2 \, \mathbf{k} \quad \text{and} \quad \mathbf{F}(-1, 0, 2) = -2 \, \mathbf{j} .$$

In effect each point in space has a vector representing force assigned to it. Other examples of vector fields are the electric field, the gravitational field and the velocity field in a fluid.

Our problem now is as follows. A particle moves along the curve $\mathbf{r}(t)$ in a force field $\mathbf{F}(\mathbf{r})$. What is the work done by the force on the particle?

Suppose we calculate the work done from t = a to t = b. To do this we divide the interval [a, b] into n sub-intervals $[t_{i-1}, t_i]$, 1 ≤ i ≤ n, of length ∆t such that $t_0 = a$, $t_n = b$ and $t_i = t_{i-1} + \Delta t$.


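This divides the curve into n sub-curves.

Consider the i-th sub-curve. The work done, $W_i$, as the particle moves along the i-th sub-curve is approximately

$$\begin{aligned}
W_i &\simeq \mathbf{F}(\mathbf{r}(t_i)) \cdot \left( \mathbf{r}(t_i) - \mathbf{r}(t_{i-1}) \right) \\
&\simeq \mathbf{F}(\mathbf{r}(t_i)) \cdot \mathbf{r}'(t_i) \, \Delta t .
\end{aligned}$$

Summing over all sub-curves yields

$$W \simeq \sum_i \mathbf{F}(\mathbf{r}(t_i)) \cdot \mathbf{r}'(t_i) \, \Delta t .$$

This is a Riemann sum of the function $\mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t)$ and so, as the number of sub-intervals → ∞, this approaches the integral

$$W = \int_a^b \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t) \, dt .$$

This integral is called the line integral of the vector field $\mathbf{F}$ along the curve C and is denoted by

$$\int_C \mathbf{F} \cdot d\mathbf{r} .$$

Thus by definition

$$\int_C \mathbf{F} \cdot d\mathbf{r} = \int_a^b \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t) \, dt .$$

Example 14.8 Find the work done when the particle is moved along the curve $\mathbf{r}(t) = t \, \mathbf{i} + t^2 \, \mathbf{j} + t^2 \, \mathbf{k}$ from (0, 0, 0) to (2, 4, 4) under the force

$$\mathbf{F}(x, y, z) = yz \, \mathbf{i} + xz \, \mathbf{j} + xy \, \mathbf{k} .$$

Solution: Now

$$\begin{aligned}
\mathbf{r}'(t) &= \mathbf{i} + 2t \, \mathbf{j} + 2t \, \mathbf{k} , \\
\mathbf{F}(\mathbf{r}(t)) &= t^4 \, \mathbf{i} + t^3 \, \mathbf{j} + t^3 \, \mathbf{k} , \\
\mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t) &= t^4 + 2t^4 + 2t^4 = 5t^4 .
\end{aligned}$$

Also (0, 0, 0) is t = 0 and (2, 4, 4) is t = 2. Thus

$$\begin{aligned}
\text{Work} = \int_C \mathbf{F} \cdot d\mathbf{r} &= \int_0^2 \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t) \, dt \\
&= \int_0^2 5t^4 \, dt \\
&= t^5 \, \Big|_0^2 \\
&= 32 .
\end{aligned}$$

□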

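There are several alternative ways of writing the line integral of a vector field. First, it can be written as the line integral of a scalar function, that is, a line integral of the first type. This uses the fact that $\mathbf{r}'(t)$ is a tangent to the curve, and so if we denote the unit tangent vector by $\mathbf{T}(\mathbf{r}(t))$ then

$$\mathbf{T}(\mathbf{r}(t)) \equiv \hat{\mathbf{r}}'(t) = \frac{\mathbf{r}'(t)}{\| \mathbf{r}'(t) \|} .$$

Thus

$$\begin{aligned}
\int_C \mathbf{F} \cdot d\mathbf{r} &= \int_a^b \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t) \, dt \\
&= \int_a^b \mathbf{F}(\mathbf{r}(t)) \cdot \frac{\mathbf{r}'(t)}{\| \mathbf{r}'(t) \|} \, \| \mathbf{r}'(t) \| \, dt \\
&= \int_a^b \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{T}(\mathbf{r}(t)) \, \| \mathbf{r}'(t) \| \, dt \\
&= \int_C \mathbf{F} \cdot \mathbf{T} \, ds \\
&= \int_C F_t \, ds ,
\end{aligned}$$

where $F_t \equiv \mathbf{F} \cdot \mathbf{T}$ is the tangential component of $\mathbf{F}$ along the curve C. This notation, however, makes no contribution to calculating the integral.

Let $\mathbf{F}(\mathbf{r}(t)) = F_x \, \mathbf{i} + F_y \, \mathbf{j} + F_z \, \mathbf{k}$. Then the second form of the line integral is obtained by writing formally

$$d\mathbf{r} = dx \, \mathbf{i} + dy \, \mathbf{j} + dz \, \mathbf{k}$$

so that

$$\int_C \mathbf{F} \cdot d\mathbf{r} = \int_C \left( F_x \, dx + F_y \, dy + F_z \, dz \right) .$$

These integrals can be calculated separately using the parametric form. For example

$$\int_C F_x \, dx = \int_a^b F_x(\mathbf{r}(t)) \, x'(t) \, dt .$$

In the previous example this gives

$$\int_C F_x \, dx = \int_0^2 t^4 \cdot 1 \, dt = \frac{32}{5} .$$

Example 14.9 Evaluate $\int_C \mathbf{F} \cdot d\mathbf{r}$ where $\mathbf{F} = xy \, \mathbf{i} + (y - x^2) \, \mathbf{j}$ and C is the curve xy = 1, 1 ≤ x ≤ 3.

Solution:

□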


Example 14.10 Let $\mathbf{F}$ be the vector field shown in the figure below. If C is the quarter circle from (1, 0) to (0, 1), determine whether $\int_C \mathbf{F} \cdot d\mathbf{r}$ is positive, negative or zero.

[Figure: plot of the vector field F on the unit square 0 ≤ x ≤ 1, 0 ≤ y ≤ 1.]

Solution:

□

There is an important relationship between line integrals and gradients. Let f be a scalar valued function of three variables. Then the gradient of f, ∇f ≡ grad f, is a vector valued function of three variables, i.e., a vector field. For example, if

$$f(x, y, z) = xyz$$

then

$$\nabla f(x, y, z) = \frac{\partial f}{\partial x} \, \mathbf{i} + \frac{\partial f}{\partial y} \, \mathbf{j} + \frac{\partial f}{\partial z} \, \mathbf{k} = yz \, \mathbf{i} + xz \, \mathbf{j} + xy \, \mathbf{k} .$$

Now consider the line integral $\int_C \nabla f \cdot d\mathbf{r}$ where C is the curve $\mathbf{r}(t)$ from t = a to t = b. Let g(t) = f(x(t), y(t), z(t)). Then by the chain rule


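Thus,

$$\begin{aligned}
\int_C \nabla f \cdot d\mathbf{r} &= \int_a^b \nabla f(\mathbf{r}(t)) \cdot \mathbf{r}'(t) \, dt \\
&= \int_a^b \frac{d}{dt} \left( f(\mathbf{r}(t)) \right) dt \\
&= f(\mathbf{r}(t)) \, \Big|_a^b \\
&= f(\mathbf{r}(b)) - f(\mathbf{r}(a)) .
\end{aligned}$$

This result is the vector analogue of the fundamental theorem of calculus

$$\int_a^b f'(x) \, dx = f(b) - f(a) .$$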
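The result above can be tested numerically by integrating a gradient field along two different curves with the same end-points. This sketch is an illustration, not part of the notes: it reuses f = xyz and the curve of Example 14.8, adds a straight-line path from (0, 0, 0) to (2, 4, 4) as a second, hypothetical choice, and approximates the work by the sum of F(r(t_mid)) · (r(t_i) − r(t_{i−1})).

```python
def work(F, r, a, b, n=2000):
    """Approximate the line integral of F . dr along r(t), a <= t <= b."""
    dt = (b - a) / n
    total = 0.0
    for i in range(n):
        t0, t1 = a + i * dt, a + (i + 1) * dt
        p0, p1 = r(t0), r(t1)
        force = F(*r(0.5 * (t0 + t1)))  # field sampled at the parameter midpoint
        total += sum(fc * (q1 - q0) for fc, q0, q1 in zip(force, p0, p1))
    return total


f = lambda x, y, z: x * y * z
grad_f = lambda x, y, z: (y * z, x * z, x * y)

curved = lambda t: (t, t**2, t**2)          # 0 <= t <= 2, as in Example 14.8
straight = lambda t: (2 * t, 4 * t, 4 * t)  # 0 <= t <= 1, assumed second path

w_curved = work(grad_f, curved, 0.0, 2.0)
w_straight = work(grad_f, straight, 0.0, 1.0)
potential_difference = f(2, 4, 4) - f(0, 0, 0)
```

Both `w_curved` and `w_straight` should be close to `potential_difference`, which is 32, even though the two paths are different.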

LECTURE 7

There is a very important consequence of this result. The value of the integral depends only on the value of f at the end-points of the curve. It is independent of the actual shape of the curve between the two end-points! From a physical viewpoint this means that if a force is given by the gradient of a scalar function, then the work done in going from one point to another is independent of the path. A force field which is the gradient of a scalar function is called a conservative field or gradient field, and the scalar function is called a potential for the field. Many force fields are conservative. For example, the electric field of a charge q is given by

$$\mathbf{E}(\mathbf{r}) = \frac{q}{\| \mathbf{r} \|^2} \, \hat{\mathbf{r}} \qquad (\text{where } \hat{\mathbf{r}} \text{ is the unit vector in the direction of } \mathbf{r}) .$$

Let

$$\varphi(\mathbf{r}) = \frac{q}{\| \mathbf{r} \|} = \frac{q}{\sqrt{x^2 + y^2 + z^2}}$$

then

$$\begin{aligned}
\nabla \varphi &= -q \left[ \frac{x}{(x^2 + y^2 + z^2)^{3/2}} \, \mathbf{i} + \frac{y}{(x^2 + y^2 + z^2)^{3/2}} \, \mathbf{j} + \frac{z}{(x^2 + y^2 + z^2)^{3/2}} \, \mathbf{k} \right] \\
&= -q \, \frac{\mathbf{r}}{\| \mathbf{r} \|^3} \\
&= -\frac{q}{\| \mathbf{r} \|^2} \, \hat{\mathbf{r}} .
\end{aligned}$$

Thus $\mathbf{E} = -\nabla \varphi$. Hence the electric field is conservative and the work done in moving a charge around in an electric field is independent of path. In fact it is simply the difference in potential evaluated at the final and initial points. Thus if we move a charge e from A to B in the electric field of charge q then

$$\begin{aligned}
\text{Work} = \int_C \mathbf{F} \cdot d\mathbf{r} &= \int_C e \, \mathbf{E} \cdot d\mathbf{r} \\
&= -e \int_C \nabla \varphi \cdot d\mathbf{r} \\
&= -e \left( \varphi(\mathbf{r}_B) - \varphi(\mathbf{r}_A) \right) \\
&= e \left( \varphi(\mathbf{r}_A) - \varphi(\mathbf{r}_B) \right) .
\end{aligned}$$

Notes:

• The “−” sign in $\mathbf{E} = -\nabla \varphi$ means that the potential is larger the nearer we are to a positive charge. Thus if $\| \mathbf{r}_A \| < \| \mathbf{r}_B \|$ then the field does work on the charge e if the charge e is positive.

• Another way of stating the fact that work done is independent of path for a conservative force field is that for such a vector field the integral around a closed path is zero. If the curve is closed then $\mathbf{r}(a) = \mathbf{r}(b)$ and so

$$\int_C \nabla f \cdot d\mathbf{r} = f(\mathbf{r}(b)) - f(\mathbf{r}(a)) = 0 .$$

• For simple closed curves we use the notation $\oint_C$. This is an especially common notation in complex variables.


• Given a vector field $\mathbf{F}(\mathbf{r})$ there are various criteria which tell us whether it is conservative, i.e., whether it can be written as ∇f, and various methods for finding the function f, the potential, if it is conservative. We defer the most expedient method till later, when we have discussed the concept of curl.

• We have proved that if $\mathbf{F}$ is conservative, i.e., $\mathbf{F} = \nabla f$, then $\int_C \mathbf{F} \cdot d\mathbf{r}$ is independent of path and depends only on the end-points. It can be shown that the converse is also true: if $\int_C \mathbf{F} \cdot d\mathbf{r}$ is independent of path, then $\mathbf{F}$ is conservative.

Example 14.11 Find a potential function f for the conservative vector field

$$\mathbf{F}(x, y) = (6xy - y^3) \, \mathbf{i} + (4y + 3x^2 - 3xy^2) \, \mathbf{j} .$$

Solution: Because we are given that $\mathbf{F}$ is a conservative field, the line integral $\int_C \mathbf{F} \cdot \mathbf{T} \, ds$ is independent of path. Let C be the straight-line path from A(0, 0) to $B(x_1, y_1)$ parametrised by $x = x_1 t$, $y = y_1 t$ with 0 ≤ t ≤ 1.

□


Example 14.12 Calculate the work done by the force $\mathbf{F} = 3x \, \mathbf{i} + y \, \mathbf{j}$ along the straight line segment from (2, 2) to (−2, 2).

Solution: We need to parametrise the curve. Let us parametrise the curve in two ways

$$\begin{aligned}
\mathbf{r}_1(t) &= t \, \mathbf{i} + 2 \, \mathbf{j} \qquad t : 2 \to -2 , \\
\mathbf{r}_2(t) &= -t \, \mathbf{i} + 2 \, \mathbf{j} \qquad t : -2 \to 2 .
\end{aligned}$$

Thus the work done along the curve using the first parameterisation is

$$\begin{aligned}
\int_C \mathbf{F} \cdot d\mathbf{r}_1 &= \int_2^{-2} (3t \, \mathbf{i} + 2 \, \mathbf{j}) \cdot (\mathbf{i} + 0 \, \mathbf{j}) \, dt \\
&= -\int_{-2}^{2} 3t \, dt .
\end{aligned}$$

Similarly the work done along the curve for the second parameterisation is

$$\begin{aligned}
\int_C \mathbf{F} \cdot d\mathbf{r}_2 &= \int_{-2}^{2} (-3t \, \mathbf{i} + 2 \, \mathbf{j}) \cdot (-\mathbf{i} + 0 \, \mathbf{j}) \, dt \\
&= \int_{-2}^{2} 3t \, dt .
\end{aligned}$$

Note that the two integrals have opposite signs, yet the work done should be independent of the parameterisation of the curve. This is not a paradox, since the value of both integrals is zero. What we should note is that different parameterisations can lead to different integrals, but the value of the line integral is the same whichever parameterisation is used.

□

We now wish to investigate further the importance of the concept that the parametrised curve is an oriented curve. The curve is a succession of points traversed in a certain order. Consider the following curves

$$\begin{aligned}
\mathbf{r} &= \mathbf{r}(t) \qquad t : a \to b , \\
\mathbf{R}(s) &= \mathbf{r}(a + b - s) \qquad s : a \to b .
\end{aligned}$$

Both vector functions trace out the same set of points, but the first curve starts at $\mathbf{r}(a)$ and finishes at $\mathbf{r}(b)$ whereas the second curve starts at $\mathbf{r}(b)$ and ends at $\mathbf{r}(a)$:

$$\mathbf{R}(a) = \mathbf{r}(a + b - a) = \mathbf{r}(b) , \qquad \mathbf{R}(b) = \mathbf{r}(a + b - b) = \mathbf{r}(a) .$$

Example 14.13 The vector function

$$\mathbf{r}(t) = \cos t \, \mathbf{i} + \sin t \, \mathbf{j} \qquad t : 0 \to 2\pi$$

is a parameterisation of the unit circle traversed counter-clockwise when viewed from above, whereas

$$\begin{aligned}
\mathbf{R}(s) &= \cos(2\pi - s) \, \mathbf{i} + \sin(2\pi - s) \, \mathbf{j} \qquad s : 0 \to 2\pi \\
&\equiv \cos s \, \mathbf{i} - \sin s \, \mathbf{j} \qquad s : 0 \to 2\pi
\end{aligned}$$

is a parameterisation of the unit circle traversed clockwise when viewed from above.

14.2 Curl and Divergence

In physical applications such as particle mechanics, fluid mechanics and electromagnetic theory, other notions of the derivative of a vector field besides the total derivative are useful. In particular, for a given vector field $\mathbf{F}$ on a subset of $\mathbb{R}^2$ (or later $\mathbb{R}^3$), we can calculate the divergence and curl of $\mathbf{F}$.

Imagine a bath tub full of water and the plug has just been pulled. Water is now draining through the sink which has a certain cross sectional area. We would expect that there must be some relationship between the rate at which water crosses the area of the sink and the change in volume of the water in the bath. Also after a while the fluid starts to rotate around the sink and eventually forms a “twister”. Is there a relationship between the rotation of the fluid and the flow across the sink?

To simplify our discussion we first consider vector fields in $\mathbb{R}^2$. Later we will extend the ideas developed in $\mathbb{R}^2$ to $\mathbb{R}^3$.

Consider the motion of fluid in a planar region. This can be represented by a velocity field of the fluid motion. We wish to investigate the relationship of the movement of fluid around or across the boundary of the region with the motion inside the region. We first define two quantities.

Definition 14.3 If $C$ is a smooth curve in $\mathbb{R}^2$ (parametrised by $\mathbf{r}(t)$) in the domain of the vector field $\mathbf{F}$ then the flow along the curve $C$ is the line integral of $\mathbf{F}\cdot\mathbf{T}$:
\[
\text{flow} = \int_C \mathbf{F}\cdot\mathbf{T}\,ds .
\]
The integral in this case is called a flow integral. If the curve is closed then the flow is called the circulation around the closed curve.

Definition 14.4 If $C$ is a smooth curve in $\mathbb{R}^2$ (parametrised by $\mathbf{r}(t)$) in the domain of the vector field $\mathbf{F}$ and if $\mathbf{n}$ is the outward pointing unit normal vector to $C$ then the flux of $\mathbf{F}$ across the curve $C$ is the line integral of $\mathbf{F}\cdot\mathbf{n}$:
\[
\text{flux of } \mathbf{F} \text{ across } C = \int_C \mathbf{F}\cdot\mathbf{n}\,ds .
\]

Notes:

• The outward pointing unit normal vector to the curve $C$ can be derived by taking the cross product of $\mathbf{T}$ and the unit vector in the $z$ direction $\mathbf{k}$. The choice of $\mathbf{T}\times\mathbf{k}$ or $\mathbf{k}\times\mathbf{T}$ depends on the orientation of the curve $C$ as viewed from above: $\mathbf{T}\times\mathbf{k}$ for counter-clockwise, $\mathbf{k}\times\mathbf{T}$ for clockwise.

• The word flux is Latin for flow. We should note however that many flux calculations do not involve motion, e.g., electric or magnetic fields. Nevertheless, to visualise these and upcoming concepts it is useful to consider the velocity field of a fluid flow.

Let $\mathbf{F}(x, y) = P(x, y)\,\mathbf{i} + Q(x, y)\,\mathbf{j}$ be the velocity field of a fluid flow in the plane and let the first order partial derivatives of $P$ and $Q$ be continuous at each point in the region $R$. Let $(x, y)$ be a point in the region $R$ and $A$ be a small rectangular region with the bottom left hand corner at $(x, y)$. The rectangular region has sides of length $\Delta x$ and $\Delta y$.

[Figure: the rectangle $A$ with corners $(x, y)$, $(x + \Delta x, y)$, $(x + \Delta x, y + \Delta y)$ and $(x, y + \Delta y)$.]

We wish to consider the net flux of fluid across (through) the boundary $C$ of this rectangular region $A$, i.e., $\oint_C \mathbf{F}\cdot\mathbf{n}\,ds$. The rate at which fluid leaves the rectangular region across the bottom edge is approximately given by
\[
\mathbf{F}\!\left(x + \tfrac{\Delta x}{2}, y\right)\cdot(-\mathbf{j})\,\Delta x = -Q\!\left(x + \tfrac{\Delta x}{2}, y\right)\Delta x .
\]
This is the component of the fluid velocity in the direction of the outward normal times the length of the side (compare with $\oint_C \mathbf{F}\cdot\mathbf{n}\,ds$). Note that over the side from $(x, y)$ to $(x + \Delta x, y)$ the vector field $\mathbf{F}$ would in general vary. We have taken the value of $\mathbf{F}$ at the midpoint $(x + \tfrac{\Delta x}{2}, y)$ as the value of $\mathbf{F}$ for all positions between $(x, y)$ and $(x + \Delta x, y)$.

LECTURE 8

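Thus for all four sides we have
\[
\begin{aligned}
\text{Top:}\quad & \mathbf{F}\!\left(x + \tfrac{\Delta x}{2}, y + \Delta y\right)\cdot\mathbf{j}\,\Delta x = Q\!\left(x + \tfrac{\Delta x}{2}, y + \Delta y\right)\Delta x \\
\text{Bottom:}\quad & \mathbf{F}\!\left(x + \tfrac{\Delta x}{2}, y\right)\cdot(-\mathbf{j})\,\Delta x = -Q\!\left(x + \tfrac{\Delta x}{2}, y\right)\Delta x \\
\text{Left:}\quad & \mathbf{F}\!\left(x, y + \tfrac{\Delta y}{2}\right)\cdot(-\mathbf{i})\,\Delta y = -P\!\left(x, y + \tfrac{\Delta y}{2}\right)\Delta y \\
\text{Right:}\quad & \mathbf{F}\!\left(x + \Delta x, y + \tfrac{\Delta y}{2}\right)\cdot\mathbf{i}\,\Delta y = P\!\left(x + \Delta x, y + \tfrac{\Delta y}{2}\right)\Delta y
\end{aligned}
\]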


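Considering opposite sides we have
\[
\begin{aligned}
\text{Top \& Bottom:}\quad & \left(Q\!\left(x + \tfrac{\Delta x}{2}, y + \Delta y\right) - Q\!\left(x + \tfrac{\Delta x}{2}, y\right)\right)\Delta x \approx \left(\frac{\partial Q}{\partial y}\,\Delta y\right)\Delta x \\
\text{Right \& Left:}\quad & \left(P\!\left(x + \Delta x, y + \tfrac{\Delta y}{2}\right) - P\!\left(x, y + \tfrac{\Delta y}{2}\right)\right)\Delta y \approx \left(\frac{\partial P}{\partial x}\,\Delta x\right)\Delta y
\end{aligned}
\]
Adding yields the total flux across the boundary of the rectangular region $A$, i.e.,
\[
\text{flux} \left(\oint_C \mathbf{F}\cdot\mathbf{n}\,ds\right) \approx \left(\frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y}\right)\Delta x\,\Delta y .
\]
If we divide by the area of the rectangular region $A$, $\Delta x\,\Delta y$, we obtain the flux per unit area, i.e.,
\[
\frac{\text{flux}}{\text{area}} \approx \frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y} .
\]
If we let both $\Delta x \to 0$ and $\Delta y \to 0$ this yields the flux density or divergence of $\mathbf{F}$ at $(x, y)$.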

Definition 14.5 The flux density or divergence of a vector field $\mathbf{F} = F_x(x, y)\,\mathbf{i} + F_y(x, y)\,\mathbf{j}$ at the point $(x, y)$ is given by
\[
\operatorname{div}\mathbf{F} \equiv \nabla\cdot\mathbf{F} = \frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} .
\]
In the context of fluid flow the divergence represents the instantaneous rate of fluid flow outward at each point in the fluid.
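The four-sides argument for the flux can be tried out numerically. The sketch below is illustrative only: the field $P = x^2 y$, $Q = \sin x + y^3$ and the function names are assumptions chosen for the demonstration, not taken from the notes.

```python
import math

def P(x, y):
    return x * x * y              # hypothetical first component

def Q(x, y):
    return math.sin(x) + y ** 3   # hypothetical second component

def flux_per_area(x, y, dx, dy):
    """Net outward flux across the small rectangle, using midpoint values on each side."""
    top = Q(x + dx / 2, y + dy) * dx
    bottom = -Q(x + dx / 2, y) * dx
    left = -P(x, y + dy / 2) * dy
    right = P(x + dx, y + dy / 2) * dy
    return (top + bottom + left + right) / (dx * dy)

def divergence(x, y):
    # dP/dx + dQ/dy computed by hand for the field above
    return 2 * x * y + 3 * y * y

approx = flux_per_area(0.5, 0.8, 1e-4, 1e-4)
exact = divergence(0.5, 0.8)
print(approx, exact)
```

Shrinking `dx` and `dy` brings the flux per unit area closer to the divergence at the corner point, which is the limit taken in the derivation.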

Example 14.14 Determine if the following vector fields are flux free by considering $\oint_C \mathbf{F}\cdot\mathbf{n}\,ds$ for an arbitrary closed path $C$.

[Figure: four vector field plots, each on $-1 \le x \le 1$, $-1 \le y \le 1$.]

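Next we wish to consider the net circulation of fluid counter-clockwise around the boundary of the rectangular region $A$, i.e., $\oint_C \mathbf{F}\cdot\mathbf{T}\,ds$.

[Figure: the rectangle $A$ with corners $(x, y)$, $(x + \Delta x, y)$, $(x + \Delta x, y + \Delta y)$ and $(x, y + \Delta y)$.]

The rate at which fluid flows along the bottom edge (left to right) of the rectangular region is approximately given by
\[
\mathbf{F}\!\left(x + \tfrac{\Delta x}{2}, y\right)\cdot\mathbf{i}\,\Delta x = P\!\left(x + \tfrac{\Delta x}{2}, y\right)\Delta x .
\]
This is the component of the fluid velocity in the direction of the tangent to the boundary times the length of the side (compare with $\oint_C \mathbf{F}\cdot\mathbf{T}\,ds$). Thus for all four sides we have
\[
\begin{aligned}
\text{Top:}\quad & \mathbf{F}\!\left(x + \tfrac{\Delta x}{2}, y + \Delta y\right)\cdot(-\mathbf{i})\,\Delta x = -P\!\left(x + \tfrac{\Delta x}{2}, y + \Delta y\right)\Delta x \\
\text{Bottom:}\quad & \mathbf{F}\!\left(x + \tfrac{\Delta x}{2}, y\right)\cdot\mathbf{i}\,\Delta x = P\!\left(x + \tfrac{\Delta x}{2}, y\right)\Delta x \\
\text{Left:}\quad & \mathbf{F}\!\left(x, y + \tfrac{\Delta y}{2}\right)\cdot(-\mathbf{j})\,\Delta y = -Q\!\left(x, y + \tfrac{\Delta y}{2}\right)\Delta y \\
\text{Right:}\quad & \mathbf{F}\!\left(x + \Delta x, y + \tfrac{\Delta y}{2}\right)\cdot\mathbf{j}\,\Delta y = Q\!\left(x + \Delta x, y + \tfrac{\Delta y}{2}\right)\Delta y
\end{aligned}
\]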


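Considering opposite sides we have
\[
\begin{aligned}
\text{Top \& Bottom:}\quad & -\left(P\!\left(x + \tfrac{\Delta x}{2}, y + \Delta y\right) - P\!\left(x + \tfrac{\Delta x}{2}, y\right)\right)\Delta x \approx -\left(\frac{\partial P}{\partial y}\,\Delta y\right)\Delta x \\
\text{Right \& Left:}\quad & \left(Q\!\left(x + \Delta x, y + \tfrac{\Delta y}{2}\right) - Q\!\left(x, y + \tfrac{\Delta y}{2}\right)\right)\Delta y \approx \left(\frac{\partial Q}{\partial x}\,\Delta x\right)\Delta y
\end{aligned}
\]
Adding yields the total circulation around the boundary of the rectangular region $A$:
\[
\text{circulation} \left(\oint_C \mathbf{F}\cdot\mathbf{T}\,ds\right) \approx \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right)\Delta x\,\Delta y .
\]
If we divide by the area of the rectangular region $A$, $\Delta x\,\Delta y$, we obtain the circulation per unit area, i.e.,
\[
\frac{\text{circulation}}{\text{area}} \approx \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} .
\]
If we let both $\Delta x \to 0$ and $\Delta y \to 0$ this yields the circulation density or curl of $\mathbf{F}$ at $(x, y)$.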

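Definition 14.6 The circulation density or curl of a vector field $\mathbf{F} = F_x(x, y)\,\mathbf{i} + F_y(x, y)\,\mathbf{j}$ at the point $(x, y)$ is given by
\[
\operatorname{curl}\mathbf{F} \equiv \nabla\times\mathbf{F} = \left(\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y}\right)\mathbf{k} .
\]
In the context of fluid flow the curl measures the amount of rotation undergone by microscopic parts of the fluid.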
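The circulation argument can be checked in the same spirit. This is a sketch under stated assumptions: the field $P = x^2 y$, $Q = \sin x + y^3$ and the function names are hypothetical choices for illustration.

```python
import math

def P(x, y):
    return x * x * y              # hypothetical first component

def Q(x, y):
    return math.sin(x) + y ** 3   # hypothetical second component

def circulation_per_area(x, y, dx, dy):
    """Counter-clockwise circulation around the small rectangle, midpoint value per side."""
    bottom = P(x + dx / 2, y) * dx           # left to right
    right = Q(x + dx, y + dy / 2) * dy       # upwards
    top = -P(x + dx / 2, y + dy) * dx        # right to left
    left = -Q(x, y + dy / 2) * dy            # downwards
    return (bottom + right + top + left) / (dx * dy)

def curl_k(x, y):
    # dQ/dx - dP/dy computed by hand for the field above
    return math.cos(x) - x * x

approx = circulation_per_area(0.5, 0.8, 1e-4, 1e-4)
exact = curl_k(0.5, 0.8)
print(approx, exact)
```

The ratio approaches the $\mathbf{k}$ component of the curl as the rectangle shrinks.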

LECTURE 9

Example 14.15 Determine if the following vector fields are circulation free by considering $\oint_C \mathbf{F}\cdot\mathbf{T}\,ds$ for an arbitrary closed path $C$.

[Figure: four vector field plots, each on $-1 \le x \le 1$, $-1 \le y \le 1$.]

14.3 Green’s Theorem in the Plane

Theorem 14.1 (Green’s Theorem in the Plane) Let $R$ be a closed bounded region in the $x, y$ plane whose boundary $C$ consists of a finite number of smooth curves. Let $P(x, y)$ and $Q(x, y)$ be functions that are continuous and have continuous first order partial derivatives $\frac{\partial P}{\partial y}$ and $\frac{\partial Q}{\partial x}$ everywhere in some domain containing $R$. Then
\[
\iint_R \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dA \equiv \iint_R \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dx\,dy \equiv \oint_C (P\,dx + Q\,dy)
\]
where $C$ is the entire boundary of $R$ taken in the sense that $R$ is on the left of $C$.

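Notes:

• Writing $\mathbf{F} = \langle P, Q\rangle$ then
\[
\iint_R \operatorname{curl}\mathbf{F}\cdot\mathbf{k}\,dx\,dy \equiv \iint_R (\nabla\times\mathbf{F})\cdot\mathbf{k}\,dx\,dy = \oint_C \mathbf{F}\cdot\mathbf{T}\,ds \equiv \oint_C \mathbf{F}\cdot d\mathbf{r} .
\]
This is known as the circulation-curl or tangential form of Green’s Theorem in the Plane.

• Writing $\mathbf{F} = \langle Q, -P\rangle$ then
\[
\iint_R \operatorname{div}\mathbf{F}\,dx\,dy \equiv \iint_R \nabla\cdot\mathbf{F}\,dx\,dy = \oint_C \mathbf{F}\cdot\mathbf{n}\,ds
\]
where $\mathbf{n}$ is the outward pointing unit normal to $C$. This is known as the flux-divergence or normal form of Green’s Theorem in the Plane.

• For a formal proof of Green’s Theorem in the Plane see SHE.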


How can two different vector fields give the same result for the line integral?

Heuristic argument for circulation-curl form

Heuristic argument for flux-divergence form

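Example 14.16 Verify the circulation-curl and flux-divergence forms of Green’s Theorem in the Plane for the vector field
\[
\mathbf{F}(x, y) = (x - y)\,\mathbf{i} + x\,\mathbf{j}
\]
and the region $R$ bounded by the unit circle $C$
\[
\mathbf{r}(t) = \cos t\,\mathbf{i} + \sin t\,\mathbf{j}, \quad 0 \le t \le 2\pi .
\]

Solution: We have
\[
\begin{aligned}
\operatorname{div}\mathbf{F} &\equiv \nabla\cdot\mathbf{F} = \frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} = 1 + 0 = 1 , \\
\operatorname{curl}\mathbf{F} &\equiv \nabla\times\mathbf{F} = \left(\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y}\right)\mathbf{k} = (1 - (-1))\,\mathbf{k} = 2\,\mathbf{k} .
\end{aligned}
\]
Also
\[
\begin{aligned}
\mathbf{F}(\mathbf{r}(t)) &= (\cos t - \sin t)\,\mathbf{i} + \cos t\,\mathbf{j} , \\
\mathbf{r}'(t) &= -\sin t\,\mathbf{i} + \cos t\,\mathbf{j} , \\
\|\mathbf{r}'(t)\| &= 1 , \\
\mathbf{T} &= \frac{\mathbf{r}'(t)}{\|\mathbf{r}'(t)\|} = -\sin t\,\mathbf{i} + \cos t\,\mathbf{j} , \\
\mathbf{n} &= \mathbf{T}\times\mathbf{k} =
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
-\sin t & \cos t & 0 \\
0 & 0 & 1
\end{vmatrix}
= \cos t\,\mathbf{i} + \sin t\,\mathbf{j} \equiv -\frac{\mathbf{T}'(t)}{\|\mathbf{T}'(t)\|} .
\end{aligned}
\]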

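Thus
\[
\begin{aligned}
\oint_C \mathbf{F}\cdot\mathbf{T}\,ds &= \int_0^{2\pi} \big((\cos t - \sin t)\,\mathbf{i} + \cos t\,\mathbf{j}\big)\cdot(-\sin t\,\mathbf{i} + \cos t\,\mathbf{j})\,\|\mathbf{r}'(t)\|\,dt \\
&= \int_0^{2\pi} (1 - \sin t\cos t)\,dt \\
&= 2\pi
\end{aligned}
\]
and
\[
\begin{aligned}
\iint_R (\nabla\times\mathbf{F})\cdot\mathbf{k}\,dx\,dy &= \iint_R 2\,\mathbf{k}\cdot\mathbf{k}\,dx\,dy \\
&= 2\iint_R dx\,dy \\
&= 2 \times (\text{area of unit circle}) \\
&= 2\pi .
\end{aligned}
\]
Hence we have verified
\[
\iint_R (\nabla\times\mathbf{F})\cdot\mathbf{k}\,dx\,dy = \oint_C \mathbf{F}\cdot\mathbf{T}\,ds .
\]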

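Also
\[
\begin{aligned}
\oint_C \mathbf{F}\cdot\mathbf{n}\,ds &= \int_0^{2\pi} \big((\cos t - \sin t)\,\mathbf{i} + \cos t\,\mathbf{j}\big)\cdot(\cos t\,\mathbf{i} + \sin t\,\mathbf{j})\,\|\mathbf{r}'(t)\|\,dt \\
&= \int_0^{2\pi} \cos^2 t\,dt \\
&= \pi
\end{aligned}
\]
and
\[
\iint_R \nabla\cdot\mathbf{F}\,dx\,dy = \iint_R dx\,dy = \text{area of unit circle} = \pi .
\]
Hence we have also verified
\[
\iint_R \nabla\cdot\mathbf{F}\,dx\,dy = \oint_C \mathbf{F}\cdot\mathbf{n}\,ds .
\]

□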
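The boundary integrals of Example 14.16 can also be estimated numerically. This is a minimal sketch rather than part of the notes: the function name `boundary_integrals` and the midpoint rule are illustrative assumptions.

```python
import math

def F(x, y):
    # The field of Example 14.16
    return (x - y, x)

def boundary_integrals(n=1000):
    """Midpoint-rule estimates of the circulation and outward flux around the unit circle."""
    h = 2 * math.pi / n
    circ = flux = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        x, y = math.cos(t), math.sin(t)
        Fx, Fy = F(x, y)
        Tx, Ty = -math.sin(t), math.cos(t)   # unit tangent
        nx, ny = math.cos(t), math.sin(t)    # outward unit normal T x k
        circ += (Fx * Tx + Fy * Ty) * h      # ds = ||r'(t)|| dt = dt
        flux += (Fx * nx + Fy * ny) * h
    return circ, flux

circ, flux = boundary_integrals()
curl_side = 2 * math.pi   # (curl F . k) = 2 times the area pi of the unit disc
div_side = 1 * math.pi    # div F = 1 times the area pi
print(circ, curl_side, flux, div_side)
```

The circulation should match the curl integral and the flux should match the divergence integral, as the two forms of Green’s Theorem state.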

LECTURE 10

Example 14.17 Verify Green’s Theorem in the Plane for $\oint_C (xy + y^2)\,dx + x^2\,dy$ where $C$ is the positively oriented boundary of the region bounded by the curves $y = x$ and $y = x^2$.

Solution:

(cont.)

□

Example 14.18 Evaluate $\oint_C (x^2 - 2xy)\,dx + (x^2 y + 3)\,dy$ where $C$ is the positively oriented boundary of the region contained by $y^2 = 8x$ and $x = 2$ using Green’s Theorem in the Plane.

Solution:

□


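Example 14.19 Show that Green’s Theorem in the Plane leads to the following formula for the plane area
\[
A = \oint_C x\,dy \equiv \oint_C -y\,dx \equiv \frac{1}{2}\oint_C (x\,dy - y\,dx) .
\]

Solution: Green’s Theorem in the Plane states
\[
\iint_R \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dA \equiv \iint_R \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dx\,dy = \oint_C (P\,dx + Q\,dy) .
\]
Set $Q = x$ and $P \equiv 0$ then
\[
\iint_R \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dx\,dy = \iint_R 1\,dx\,dy = A = \oint_C x\,dy .
\]
Similarly for the second version set $Q \equiv 0$ and $P = -y$. By adding the two results and dividing by two we obtain the third version.

□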

Example 14.20 Use $A = \oint_C x\,dy$ to find the area of an ellipse $\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} = 1$.

Solution: We parametrise the positively oriented curve $C$, which is the boundary of the ellipse, such that
\[
x = a\cos t , \qquad y = b\sin t
\]
where $0 \le t \le 2\pi$. Thus the area $A$ of an ellipse is given by
\[
A = \oint_C x\,dy = \int_0^{2\pi} (a\cos t)\,(b\cos t\,dt) = ab\int_0^{2\pi} \cos^2 t\,dt = \pi a b .
\]

□
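As a quick numerical illustration of the area formula $A = \oint_C x\,dy$, the sketch below approximates the integral for an ellipse. The function name `ellipse_area` and the sample semi-axes are assumptions made for the example.

```python
import math

def ellipse_area(a, b, n=1000):
    """Midpoint-rule estimate of the closed integral of x dy with x = a cos t, y = b sin t."""
    h = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        x = a * math.cos(t)
        dy_dt = b * math.cos(t)
        total += x * dy_dt * h
    return total

area = ellipse_area(3.0, 2.0)
print(area, math.pi * 3.0 * 2.0)
```

With $a = b$ the same function gives the area of a circle of radius $a$.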

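Example 14.21 Show that the area formula $A = \frac{1}{2}\oint_C (x\,dy - y\,dx)$ leads to $A = \oint_C \frac{1}{2} r^2\,d\theta$ in polar co-ordinates.

[Hint: $x(r, \theta) = r\cos\theta$, $y(r, \theta) = r\sin\theta$]

Solution: Convert the Cartesian expression to one in plane polar co-ordinates:
\[
\begin{aligned}
x &= r\cos\theta , & dx &= dr\cos\theta - r\sin\theta\,d\theta , \\
y &= r\sin\theta , & dy &= dr\sin\theta + r\cos\theta\,d\theta .
\end{aligned}
\]
Substituting into the Cartesian form for the area yields
\[
\begin{aligned}
A &= \frac{1}{2}\oint_C (x\,dy - y\,dx) \\
&= \frac{1}{2}\oint_C \big[r\cos\theta\,(dr\sin\theta + r\cos\theta\,d\theta) - r\sin\theta\,(dr\cos\theta - r\sin\theta\,d\theta)\big] \\
&= \oint_C \frac{1}{2} r^2\,d\theta .
\end{aligned}
\]

□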

Example 14.22 Evaluate the line integral
\[
\oint_C y^2\,dx + 3xy\,dy
\]
where $C$ is the positively oriented boundary of the semi-annular region $R$ in the upper half plane between the circles $x^2 + y^2 = 1$ and $x^2 + y^2 = 9$.

Solution:

(cont.)

□

LECTURE 11

Theorem 14.2 (Green’s Theorem in the Plane (2 curve case)) Let $R$ be a closed bounded region in the $x, y$ plane which is bounded by the curves $C_1$ and $C_2$ which consist of a finite number of smooth curves. Let $P(x, y)$ and $Q(x, y)$ be functions that are continuous and have continuous first order partial derivatives $\frac{\partial P}{\partial y}$ and $\frac{\partial Q}{\partial x}$ everywhere in some domain containing $R$. Then
\[
\iint_R \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dA = \oint_{C_1} (P\,dx + Q\,dy) + \oint_{C_2} (P\,dx + Q\,dy)
\]
where $C_1$ and $C_2$ are positively oriented, i.e., taken in the sense that $R$ is on the left of $C_1$ and $C_2$.

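Notes:

• Writing $\mathbf{F} = \langle P, Q\rangle$ then
\[
\iint_R \operatorname{curl}\mathbf{F}\cdot\mathbf{k}\,dA = \oint_{C_1} \mathbf{F}\cdot\mathbf{T}\,ds + \oint_{C_2} \mathbf{F}\cdot\mathbf{T}\,ds .
\]
This is the circulation-curl or tangential form of Green’s Theorem in the Plane (2 curve case).

• Writing $\mathbf{F} = \langle Q, -P\rangle$ then
\[
\iint_R \operatorname{div}\mathbf{F}\,dA = \oint_{C_1} \mathbf{F}\cdot\mathbf{n}\,ds + \oint_{C_2} \mathbf{F}\cdot\mathbf{n}\,ds
\]
where $\mathbf{n}$ is the outward pointing unit normal to $R$. This is the flux-divergence or normal form of Green’s Theorem in the Plane (2 curve case).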

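Example 14.23 If $\mathbf{F} = -\dfrac{y}{x^2 + y^2}\,\mathbf{i} + \dfrac{x}{x^2 + y^2}\,\mathbf{j}$, show that $\oint_C \mathbf{F}\cdot d\mathbf{r} = 2\pi$ for every simple positively oriented closed path $C$ that encloses the origin.

Solution:

(cont.)

[Figure: plot on $-2 \le x \le 2$, $-2 \le y \le 2$.]

□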
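The claim of Example 14.23 can be explored numerically before it is proved. The following sketch is an illustration only: the off-centre ellipse, the circle away from the origin and the function name `circulation` are assumptions chosen for the demonstration.

```python
import math

def circulation(centre, a, b, n=4000):
    """Midpoint-rule estimate of the closed integral of F . dr around a
    counter-clockwise ellipse with the given centre and semi-axes a, b."""
    cx, cy = centre
    h = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        x, y = cx + a * math.cos(t), cy + b * math.sin(t)
        dx, dy = -a * math.sin(t), b * math.cos(t)
        # F = (-y i + x j) / (x^2 + y^2)
        total += (-y * dx + x * dy) / (x * x + y * y) * h
    return total

around_origin = circulation((0.5, -0.3), 2.0, 1.0)    # this ellipse encloses (0, 0)
away_from_origin = circulation((3.0, 0.0), 1.0, 1.0)  # this circle does not
print(around_origin, away_from_origin)
```

The first value is close to $2\pi$ even though the path is not a circle centred at the origin, while the second is close to zero, consistent with the curl vanishing away from the origin.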

14.4 Conservative Vector Fields (revisited)

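Definition 14.7 The flux density or divergence of a vector field $\mathbf{F} = F_x(x, y, z)\,\mathbf{i} + F_y(x, y, z)\,\mathbf{j} + F_z(x, y, z)\,\mathbf{k}$ at the point $(x, y, z)$ is given by
\[
\operatorname{div}\mathbf{F} \equiv \nabla\cdot\mathbf{F} = \left\langle \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \right\rangle \cdot \langle F_x, F_y, F_z\rangle = \frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z} .
\]

Definition 14.8 The circulation density or curl of a vector field $\mathbf{F} = F_x(x, y, z)\,\mathbf{i} + F_y(x, y, z)\,\mathbf{j} + F_z(x, y, z)\,\mathbf{k}$ at the point $(x, y, z)$ is given by
\[
\operatorname{curl}\mathbf{F} \equiv \nabla\times\mathbf{F} =
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
\dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\
F_x & F_y & F_z
\end{vmatrix}
= \left(\frac{\partial F_z}{\partial y} - \frac{\partial F_y}{\partial z}\right)\mathbf{i} + \left(\frac{\partial F_x}{\partial z} - \frac{\partial F_z}{\partial x}\right)\mathbf{j} + \left(\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y}\right)\mathbf{k} .
\]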

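Example 14.24 Determine the divergence of $\mathbf{F}(\mathbf{r}) = \dfrac{\mathbf{r}}{\|\mathbf{r}\|}$.

Solution:
\[
\begin{aligned}
\nabla\cdot\mathbf{F}(\mathbf{r}) &= \nabla\cdot\left\langle \frac{x}{\sqrt{x^2 + y^2 + z^2}}, \frac{y}{\sqrt{x^2 + y^2 + z^2}}, \frac{z}{\sqrt{x^2 + y^2 + z^2}} \right\rangle \\
&= \frac{y^2 + z^2}{(x^2 + y^2 + z^2)^{3/2}} + \frac{z^2 + x^2}{(x^2 + y^2 + z^2)^{3/2}} + \frac{x^2 + y^2}{(x^2 + y^2 + z^2)^{3/2}} \\
&= \frac{2}{\sqrt{x^2 + y^2 + z^2}} = \frac{2}{\|\mathbf{r}\|} .
\end{aligned}
\]

□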
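The result $\nabla\cdot(\mathbf{r}/\|\mathbf{r}\|) = 2/\|\mathbf{r}\|$ can be sanity-checked with central differences. This is a rough sketch: the helper `divergence`, the step size and the sample point are assumptions, not part of the notes.

```python
import math

def F(x, y, z):
    # F(r) = r / ||r||
    rho = math.sqrt(x * x + y * y + z * z)
    return (x / rho, y / rho, z / rho)

def divergence(field, p, h=1e-5):
    """Central-difference estimate of dFx/dx + dFy/dy + dFz/dz at the point p."""
    total = 0.0
    for axis in range(3):
        plus, minus = list(p), list(p)
        plus[axis] += h
        minus[axis] -= h
        total += (field(*plus)[axis] - field(*minus)[axis]) / (2 * h)
    return total

p = (1.0, 2.0, 2.0)                       # sample point with ||r|| = 3
approx = divergence(F, p)
exact = 2 / math.sqrt(sum(c * c for c in p))
print(approx, exact)
```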



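Example 14.25 Find a formula for $\operatorname{div}(f(\mathbf{r})\,\mathbf{F}(\mathbf{r}))$.

Solution: We expect a formula that is similar to the product formula. A reasonable guess is
\[
\operatorname{div}(f\mathbf{F}) \equiv \nabla\cdot(f\mathbf{F}) = \nabla f\cdot\mathbf{F} + f\,\nabla\cdot\mathbf{F} .
\]
To check this we set $\mathbf{F} = \langle F_x, F_y, F_z\rangle$. Hence
\[
\begin{aligned}
\nabla\cdot(f\mathbf{F}) &= \frac{\partial (fF_x)}{\partial x} + \frac{\partial (fF_y)}{\partial y} + \frac{\partial (fF_z)}{\partial z} \\
&= \frac{\partial f}{\partial x}F_x + f\frac{\partial F_x}{\partial x} + \frac{\partial f}{\partial y}F_y + f\frac{\partial F_y}{\partial y} + \frac{\partial f}{\partial z}F_z + f\frac{\partial F_z}{\partial z} \\
&= \left\langle \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \right\rangle\cdot\langle F_x, F_y, F_z\rangle + f\left(\frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z}\right) \\
&= \nabla f\cdot\mathbf{F} + f\,\nabla\cdot\mathbf{F} .
\end{aligned}
\]

□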

LECTURE 12

14.4.1 Independence of Path in Line Integrals

Definition 14.9 If $\int_C \mathbf{F}\cdot d\mathbf{r}$ is independent of path in a domain $D$ then for any two points $A$ and $B$ in $D$, $\int_{C_1} \mathbf{F}\cdot d\mathbf{r} = \int_{C_2} \mathbf{F}\cdot d\mathbf{r}$ where $C_1$, $C_2$ are any two paths in $D$ from $A$ to $B$.

Theorem 14.3 The line integral $\int_C \mathbf{F}\cdot d\mathbf{r}$, where $\mathbf{F}$ has continuous first order partial derivatives in the domain $D$, is independent of path in $D$ if and only if $\mathbf{F} = \nabla f$ for some function $f$ in $D$. That is, a potential $f$ exists for $\mathbf{F}$.

Theorem 14.4 $\int_C \mathbf{F}\cdot d\mathbf{r}$ is independent of path in $D$ if and only if it is zero on every simple closed path in $D$.

Definition 14.10 The first order differential form $\mathbf{F}\cdot d\mathbf{r} = F_x\,dx + F_y\,dy + F_z\,dz$ is exact, or is an exact differential in a domain $D$, if it is the differential of a differentiable function $f(x, y, z)$ in $D$, i.e., if $\mathbf{F} = \nabla f$, i.e., if $\mathbf{F}$ is conservative.

Definition 14.11 A domain $D$ is simply connected if every closed curve in $D$ can be continuously shrunk to a point in $D$ without leaving $D$.

Theorem 14.5 (Test for Independence of Path) If $\mathbf{F}$ has continuous first order partial derivatives in a simply connected domain $D$ then $\int_C \mathbf{F}\cdot d\mathbf{r}$ is independent of path if and only if $\operatorname{curl}\mathbf{F} = \mathbf{0}$ in $D$, i.e., $\mathbf{F} = \nabla f$ in $D$ if and only if $\operatorname{curl}\mathbf{F} = \mathbf{0}$ in $D$.


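Example 14.26 Show that the vector field $\mathbf{F} = (e^x\cos y)\,\mathbf{i} - (e^x\sin y)\,\mathbf{j} + z\,\mathbf{k}$ is conservative in $\mathbb{R}^3$, i.e., $\operatorname{curl}\mathbf{F} = \mathbf{0}$. Hence determine a scalar potential for $\mathbf{F}$, i.e., $\mathbf{F} = \nabla f$.

Solution:
\[
\begin{aligned}
\operatorname{curl}\mathbf{F} &=
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
\dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\
e^x\cos y & -e^x\sin y & z
\end{vmatrix} \\
&= (0 - 0)\,\mathbf{i} + (0 - 0)\,\mathbf{j} + (-e^x\sin y + e^x\sin y)\,\mathbf{k} \\
&= \mathbf{0} .
\end{aligned}
\]
Thus the vector field is conservative in $\mathbb{R}^3$. A potential $f$ for a conservative vector field $\mathbf{F}$ satisfies $\mathbf{F} = \nabla f$ which can be written in component form:
\[
\begin{aligned}
F_x &= e^x\cos y = \frac{\partial f}{\partial x} \quad\Rightarrow\quad f(x, y, z) = e^x\cos y + g(y, z) \\
F_y &= -e^x\sin y = \frac{\partial f}{\partial y} \\
F_z &= z = \frac{\partial f}{\partial z}
\end{aligned}
\]

□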

Example 14.27 Find a potential function for the conservative vector field
\[
\mathbf{F}(x, y) = (6xy - y^3)\,\mathbf{i} + (4y + 3x^2 - 3xy^2)\,\mathbf{j}
\]
in a simply connected domain of $\mathbb{R}^2$.

Solution:

□


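Example 14.28 Show that the vector field $\mathbf{F} = -\dfrac{y}{x^2 + y^2}\,\mathbf{i} + \dfrac{x}{x^2 + y^2}\,\mathbf{j}$ is conservative in a simply connected domain excluding the origin in $\mathbb{R}^2$. Hence determine a scalar potential for $\mathbf{F}$, i.e., $\mathbf{F} = \nabla f$. [Hint: $\displaystyle\int \frac{dx}{x^2 + a^2} = \frac{1}{a}\arctan\left(\frac{x}{a}\right)$]

Solution:
\[
\begin{aligned}
\operatorname{curl}\mathbf{F} &=
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
\dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\
-\dfrac{y}{x^2 + y^2} & \dfrac{x}{x^2 + y^2} & 0
\end{vmatrix} \\
&= (0 - 0)\,\mathbf{i} + (0 - 0)\,\mathbf{j} + \left(\frac{x^2 + y^2 - 2x^2}{(x^2 + y^2)^2} + \frac{x^2 + y^2 - 2y^2}{(x^2 + y^2)^2}\right)\mathbf{k} \\
&= \mathbf{0} \qquad ((x, y) \neq (0, 0)) .
\end{aligned}
\]
Thus $\mathbf{F}$ is conservative in a simply connected domain excluding the origin in $\mathbb{R}^2$. A potential $f$ for a conservative vector field $\mathbf{F}$ satisfies $\mathbf{F} = \nabla f$ which can be written in component form:
\[
\begin{aligned}
F_x &= -\frac{y}{x^2 + y^2} = \frac{\partial f}{\partial x} \\
F_y &= \frac{x}{x^2 + y^2} = \frac{\partial f}{\partial y}
\end{aligned}
\]

(cont.)

□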


14.4.2 Physical Interpretation of Divergence and Curl in R3

Example 14.29 Consider a cloud of dust particles moving so that the velocity of a particle passing through the point $(x, y, z)$ is given by
\[
\mathbf{v}(x, y, z) \equiv \left\langle \frac{dx}{dt}, \frac{dy}{dt}, \frac{dz}{dt} \right\rangle = \langle ax, by, cz\rangle
\]
where $a, b, c \in \mathbb{R}$ are constants. We wish to measure the time rate of change in the volume of a region about the origin occupied by a portion of the dust cloud. For convenience we will take the portion to be a small cube centred at the origin. A particle initially at $(x, y, z)$ will be at position
\[
\mathbf{P}(t) = \left\langle x e^{at}, y e^{bt}, z e^{ct} \right\rangle
\]
after $t$ time units. In particular consider all the dust particles that initially occupied a cube of side length $s$ that is centred at the origin. Since the transformation $\mathbf{P}$ is linear, after $t$ units of time, the dust will occupy a rectangular solid with side lengths $s e^{at}$, $s e^{bt}$, $s e^{ct}$. The volume of this solid is thus $s^3 e^{(a+b+c)t}$. Hence the average rate of change in the volume occupied by the dust over time $0$ to $t$ is
\[
\frac{s^3 e^{(a+b+c)t} - s^3}{t} .
\]
The instantaneous rate of change of this volume will be the limit as $t \to 0$ of the above expression:
\[
\lim_{t\to 0} \frac{s^3 e^{(a+b+c)t} - s^3}{t} = s^3 (a + b + c) .
\]
Hence the instantaneous rate of change in the volume, per unit volume, is $a + b + c = \operatorname{div}\mathbf{v}$. When $a + b + c > 0$ the volume is expanding and so, on average, the dust particles are moving outward or diverging from the origin.

In general, if $\mathbf{F}$ on $\mathbb{R}^3$ is the velocity field of a fluid, then $\operatorname{div}\mathbf{F}$ evaluated at the point $\mathbf{x}$ measures the instantaneous rate of expansion (or compression) per unit volume in a microscopic portion of the fluid centred at $\mathbf{x}$. For this reason, if $\mathbf{F}$ satisfies $\operatorname{div}\mathbf{F} = 0$, then $\mathbf{F}$ is called incompressible.

Example 14.30 Consider again a cloud of dust particles each of which follows a circular path centred on the $z$-axis and lying in a plane perpendicular to the $z$-axis. Suppose the angular velocity is a constant positive value of $\omega$ so that the rotation is counter-clockwise as viewed from the positive $z$-axis. The position of a dust particle at time $t$ initially at $(x, y, z)$ is given by
\[
\mathbf{P}(t) = \langle x\cos\omega t - y\sin\omega t,\; x\sin\omega t + y\cos\omega t,\; z\rangle .
\]
Hence the velocity field at $(x, y, z)$ is
\[
\mathbf{v} = \mathbf{P}'(0) = \langle -\omega y, \omega x, 0\rangle .
\]
Now notice that $\operatorname{curl}\mathbf{v} = (0, 0, 2\omega)$ which is a vector parallel to the $z$-axis with magnitude twice the angular speed $\omega$. Since the $z$-axis is also along the axis of rotation we see that the curl in this case is twice the angular velocity.

In general, if $\mathbf{F}$ on $\mathbb{R}^3$ is the velocity field of a fluid, then $\operatorname{curl}\mathbf{F}$ evaluated at the point $\mathbf{x}$ provides a measure of the angular velocity of a microscopic portion of the fluid centred at $\mathbf{x}$; its direction is along the axis of this rotation. For this reason, a vector field whose curl is zero is called irrotational.

Given a twice differentiable scalar function $\phi$ on a subset $U \subset \mathbb{R}^3$ we can calculate $\operatorname{grad}\phi \equiv \nabla\phi$ to obtain a vector field and then we can calculate $\operatorname{div}(\nabla\phi)$ to obtain a scalar function again. This last function is the Laplacian of $\phi$. We note
\[
\text{Laplacian}\,\phi \equiv \nabla\cdot\nabla\phi \equiv \nabla^2\phi = \frac{\partial^2\phi}{\partial x^2} + \frac{\partial^2\phi}{\partial y^2} + \frac{\partial^2\phi}{\partial z^2} .
\]
A scalar function $\phi$ whose Laplacian is zero is called harmonic. For example, the temperature distribution through a solid that is in thermal equilibrium is a harmonic function.

Chapter 15

Parameterisation of Surfaces, Surface Integrals [SHE 10 §18.7]

LECTURE 13

Previously we have defined curves in the plane in three possible ways:

Explicit form: $y = f(x)$, e.g., $y = x^2$

Implicit form: $F(x, y) = 0$, e.g., $x^2 + y^2 - a^2 = 0$

Parametric vector form: $\mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j}$, $a \le t \le b$

In an analogous way we have for surfaces in space

Explicit form: $z = f(x, y)$, e.g., $z = x + y$

Implicit form: $F(x, y, z) = 0$, e.g., $x^2 + y^2 + z^2 - a^2 = 0$

There is also a vector parametric form that locates the position on a surface as a vector function of two variables.

Given a function $\mathbf{r} : R \subset \mathbb{R}^2 \to \mathbb{R}^3$, we define the surface parametrised by $\mathbf{r}$ to be the set of points $S = \{\mathbf{r}(u, v) \mid (u, v) \in R\}$. The equation
\[
\mathbf{r}(u, v) = x(u, v)\,\mathbf{i} + y(u, v)\,\mathbf{j} + z(u, v)\,\mathbf{k}
\]
where $(u, v) \in R$ is a parameterisation of $S$. This means given a point $(u, v)$ you obtain a vector $\mathbf{r}(u, v)$ or, if we write $\mathbf{r}(u, v)$ as $\overrightarrow{OP}$, given parameters $(u, v)$, we obtain a point $P$ in 3D space. As $(u, v)$ runs over $R$, $P$ moves about to yield a surface $S$. A surface is simple if it has a parameterisation given by a one-to-one function. It is possible for a surface to be both smooth and simple. Note that a given surface does not necessarily have a unique parameterisation.

Example 15.1 Determine a vector parametric representation of the plane $x + y + z = 1$.

Solution: The form of the plane $x + y + z = 1$ means that we can independently pick $x$ and $y$ and the equation of the plane determines $z$. Hence
\[
x = u , \qquad y = v , \qquad z = 1 - u - v .
\]
Thus
\[
\mathbf{r}(u, v) = u\,\mathbf{i} + v\,\mathbf{j} + (1 - u - v)\,\mathbf{k}, \quad u, v \in \mathbb{R} .
\]

□

For surfaces given explicitly by $z = f(x, y)$, i.e., as the graph of a function $f$ with continuous partial derivatives in $R$ of the $xy$-plane, a parameterisation of such surfaces is given by
\[
\mathbf{r}(x, y) = x\,\mathbf{i} + y\,\mathbf{j} + f(x, y)\,\mathbf{k}
\]
such that $(x, y) \in R \subset \mathbb{R}^2$.

Example 15.2 Determine two distinct parameterisations of the upper half unit sphere centred at the origin.

Solution: By taking $x, y$ as parameters, the surface can be parametrised by
\[
\mathbf{r}(x, y) = x\,\mathbf{i} + y\,\mathbf{j} + \sqrt{1 - x^2 - y^2}\,\mathbf{k}
\]
such that $-\sqrt{1 - x^2} \le y \le \sqrt{1 - x^2}$, $-1 \le x \le 1$.

Another possible way to parametrise the surface is to use spherical polar co-ordinates:

[Figure: spherical polar co-ordinates $r$, $\theta$, $\phi$ relative to the $x$, $y$, $z$ axes.]

\[
\begin{aligned}
x(\theta, \phi) &= 1\cos\theta\sin\phi \\
y(\theta, \phi) &= 1\sin\theta\sin\phi \\
z(\theta, \phi) &= 1\cos\phi
\end{aligned}
\]
Thus
\[
\mathbf{r}(\theta, \phi) = \cos\theta\sin\phi\,\mathbf{i} + \sin\theta\sin\phi\,\mathbf{j} + \cos\phi\,\mathbf{k} , \quad 0 \le \theta \le 2\pi, \; 0 \le \phi \le \frac{\pi}{2} .
\]

□

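Definition 15.1 For a surface $S$ given by $\mathbf{r}(u, v)$ where $u = u(t)$ and $v = v(t)$ are continuous functions of $t$, we can define a curve $C$ with $\mathbf{r}(u, v) = \mathbf{r}(u(t), v(t)) = \bar{\mathbf{r}}(t)$ say, with tangent
\[
\frac{d}{dt}\bar{\mathbf{r}}(t) = \frac{\partial\mathbf{r}}{\partial u}\frac{du}{dt} + \frac{\partial\mathbf{r}}{\partial v}\frac{dv}{dt} = \mathbf{r}_u\,u' + \mathbf{r}_v\,v'
\]
and a local normal to the surface $S$ is $\mathbf{N} = \mathbf{r}_u\times\mathbf{r}_v$, provided $\mathbf{N} \neq \mathbf{0}$.

[Figure: a surface $S$ with curves $\bar{\mathbf{r}}_1(t)$, $\bar{\mathbf{r}}_2(t)$, $\bar{\mathbf{r}}_3(t)$ drawn on it.]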

Theorem 15.1 (Tangent Plane and Surface Normal) If a surface $S$ has the parametric specification $\mathbf{r}(u, v) = x(u, v)\,\mathbf{i} + y(u, v)\,\mathbf{j} + z(u, v)\,\mathbf{k}$ with continuous $\mathbf{r}_u$, $\mathbf{r}_v$ which satisfy $\mathbf{r}_u\times\mathbf{r}_v \neq \mathbf{0}$ at every point of $S$, then $S$ has at every point $P$:

(a) a unique tangent plane passing through $P$ and spanned by $\mathbf{r}_u$ and $\mathbf{r}_v$, and

(b) a normal direction $\mathbf{N} = \pm(\mathbf{r}_u\times\mathbf{r}_v)$ whose direction depends continuously on the points of $S$.

Such a surface $S$ is called a smooth surface. $S$ is called piecewise smooth if it consists of a finite number of smooth sections.

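Example 15.3 Determine a normal to the plane in Ex 15.1.

Solution: From Ex 15.1 we have
\[
\mathbf{r}(u, v) = u\,\mathbf{i} + v\,\mathbf{j} + (1 - u - v)\,\mathbf{k}, \quad u, v \in \mathbb{R} .
\]
Thus
\[
\mathbf{r}_u = \mathbf{i} - \mathbf{k} , \qquad \mathbf{r}_v = \mathbf{j} - \mathbf{k} .
\]
Therefore
\[
\mathbf{N} = \mathbf{r}_u\times\mathbf{r}_v =
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
1 & 0 & -1 \\
0 & 1 & -1
\end{vmatrix}
= \mathbf{i} + \mathbf{j} + \mathbf{k} .
\]

□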


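Example 15.4 Determine a normal to the sphere in Ex 15.2.

Solution: From Ex 15.2 we have
\[
\mathbf{r}(\theta, \phi) = \cos\theta\sin\phi\,\mathbf{i} + \sin\theta\sin\phi\,\mathbf{j} + \cos\phi\,\mathbf{k} , \quad 0 \le \theta \le 2\pi, \; 0 \le \phi \le \frac{\pi}{2} .
\]
To determine $\mathbf{N}$ we must first determine $\mathbf{r}_\theta$ and $\mathbf{r}_\phi$:
\[
\begin{aligned}
\mathbf{r}_\theta &= -\sin\theta\sin\phi\,\mathbf{i} + \cos\theta\sin\phi\,\mathbf{j} + 0\,\mathbf{k} , \\
\mathbf{r}_\phi &= \cos\theta\cos\phi\,\mathbf{i} + \sin\theta\cos\phi\,\mathbf{j} - \sin\phi\,\mathbf{k} .
\end{aligned}
\]
Hence
\[
\begin{aligned}
\mathbf{N} = \mathbf{r}_\phi\times\mathbf{r}_\theta &=
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
\cos\theta\cos\phi & \sin\theta\cos\phi & -\sin\phi \\
-\sin\theta\sin\phi & \cos\theta\sin\phi & 0
\end{vmatrix} \\
&= \cos\theta\sin^2\phi\,\mathbf{i} + \sin\theta\sin^2\phi\,\mathbf{j} + \cos\phi\sin\phi\,\mathbf{k} .
\end{aligned}
\]

□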

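Consider a small rectangle $\Delta A_{uv}$ in $R$ with corners $(u, v)$, $(u + \Delta u, v)$, $(u + \Delta u, v + \Delta v)$ and $(u, v + \Delta v)$. It is mapped under $\mathbf{r}(u, v)$ to the small patch $\Delta S_{uv} = PP'P^*P''$ on the surface $S$ such that
\[
\begin{aligned}
\overrightarrow{OP} &= \mathbf{r}(u, v) ; & \overrightarrow{OP'} &= \mathbf{r}(u + \Delta u, v) \\
\overrightarrow{OP''} &= \mathbf{r}(u, v + \Delta v) ; & \overrightarrow{OP^*} &= \mathbf{r}(u + \Delta u, v + \Delta v)
\end{aligned}
\]

[Figure: the patch $PP'P^*P''$ on $S$ with edge vectors $\mathbf{r}_u\,\Delta u$ and $\mathbf{r}_v\,\Delta v$.]

This patch on the surface (grey patch) is approximately the parallelogram with sides given by
\[
\overrightarrow{OP'} - \overrightarrow{OP} = \mathbf{r}(u + \Delta u, v) - \mathbf{r}(u, v) \approx \frac{\partial\mathbf{r}}{\partial u}\,\Delta u = \mathbf{r}_u\,\Delta u
\]
and
\[
\overrightarrow{OP''} - \overrightarrow{OP} = \mathbf{r}(u, v + \Delta v) - \mathbf{r}(u, v) \approx \frac{\partial\mathbf{r}}{\partial v}\,\Delta v = \mathbf{r}_v\,\Delta v
\]
since for small $\Delta u$ and $\Delta v$ the grey patch $\Delta S_{uv}$ is approximately flat and well approximated by the dashed parallelogram. We recall that the area of a parallelogram is the norm of the cross product of its sides, so
\[
\Delta S_{uv} \approx \|\mathbf{r}_u\,\Delta u\times\mathbf{r}_v\,\Delta v\| = \|\mathbf{r}_u\times\mathbf{r}_v\|\,\Delta u\,\Delta v .
\]
Thus if we partition the region $R$ in the $uv$-plane into rectangular regions $\Delta A_{uv}$ this creates a partition of the surface $S$ into surface area elements $\Delta S_{uv}$.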

Definition 15.2 The area of the piecewise smooth surface $S$ with parameterisation $\mathbf{r}(u, v) = x(u, v)\,\mathbf{i} + y(u, v)\,\mathbf{j} + z(u, v)\,\mathbf{k}$ is given by
\[
\iint_S dS = \iint_R \|\mathbf{r}_u\times\mathbf{r}_v\|\,du\,dv \equiv \iint_R \|\mathbf{N}(u, v)\|\,du\,dv .
\]


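Example 15.5 Find the surface area of a sphere centred at the origin and radius $a$.

Solution: From Ex 15.4 we found the normal to a unit sphere. The normal to a sphere of radius $a$ is given by
\[
\mathbf{N} = \pm a^2\left[\cos\theta\sin^2\phi\,\mathbf{i} + \sin\theta\sin^2\phi\,\mathbf{j} + \cos\phi\sin\phi\,\mathbf{k}\right] .
\]
To determine the surface area by using the formula $\iint_R \|\mathbf{N}(\theta, \phi)\|\,d\theta\,d\phi$ we must determine $\|\mathbf{N}(\theta, \phi)\|$. Now
\[
\begin{aligned}
\|\mathbf{N}(\theta, \phi)\| &= a^2\sqrt{(\cos\theta\sin^2\phi)^2 + (\sin\theta\sin^2\phi)^2 + (\cos\phi\sin\phi)^2} \\
&= a^2\sqrt{\sin^4\phi + \cos^2\phi\sin^2\phi} = a^2\sin\phi .
\end{aligned}
\]
Hence
\[
\begin{aligned}
\iint_R \|\mathbf{N}(\theta, \phi)\|\,d\theta\,d\phi &= \int_{\theta=0}^{2\pi}\int_{\phi=0}^{\pi} a^2\sin\phi\,d\phi\,d\theta \\
&= 2\pi a^2\int_{\phi=0}^{\pi} \sin\phi\,d\phi = 4\pi a^2 .
\end{aligned}
\]

□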
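The area formula $\iint_R \|\mathbf{r}_\theta\times\mathbf{r}_\phi\|\,d\theta\,d\phi$ can be approximated directly as a Riemann sum. The following is a rough sketch, with hypothetical helper names `cross` and `sphere_area` and an arbitrary grid size.

```python
import math

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def sphere_area(a, n_theta=60, n_phi=400):
    """Midpoint Riemann sum of ||r_theta x r_phi|| over 0 <= theta <= 2 pi, 0 <= phi <= pi."""
    h_t = 2 * math.pi / n_theta
    h_p = math.pi / n_phi
    area = 0.0
    for i in range(n_theta):
        th = (i + 0.5) * h_t
        for j in range(n_phi):
            ph = (j + 0.5) * h_p
            r_th = (-a * math.sin(th) * math.sin(ph), a * math.cos(th) * math.sin(ph), 0.0)
            r_ph = (a * math.cos(th) * math.cos(ph), a * math.sin(th) * math.cos(ph), -a * math.sin(ph))
            N = cross(r_th, r_ph)
            area += math.sqrt(N[0] ** 2 + N[1] ** 2 + N[2] ** 2) * h_t * h_p
    return area

numeric = sphere_area(2.0)
exact = 4 * math.pi * 2.0 ** 2
print(numeric, exact)
```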

If $z = f(x, y)$ explicitly defines the surface $S$ and thus
\[
\mathbf{r}(x, y) = x\,\mathbf{i} + y\,\mathbf{j} + f(x, y)\,\mathbf{k}
\]
such that $(x, y) \in R \subset \mathbb{R}^2$ then it is easy to verify that
\[
\|\mathbf{N}(x, y)\| = \sqrt{f_x^2 + f_y^2 + 1} .
\]
In fact
\[
\mathbf{N}(x, y) = \pm(f_x\,\mathbf{i} + f_y\,\mathbf{j} - \mathbf{k}) .
\]
We should also note that if $z = f(x, y)$ then we can rewrite this as the implicit function $\phi(x, y, z) = \pm(f(x, y) - z) = 0$. This is also the equation for a level surface and hence the normal to the level surface is given by $\nabla\phi$, i.e.,
\[
\mathbf{N}(x, y) = \nabla\phi = \pm(f_x\,\mathbf{i} + f_y\,\mathbf{j} - \mathbf{k}) .
\]

Definition 15.3 The area of the piecewise smooth surface $S$ defined by the graph $z = f(x, y)$ is given by
\[
\iint_S dS = \iint_R \sqrt{f_x^2 + f_y^2 + 1}\,dx\,dy
\]
where $R$ is the projection of $S$ onto the $xy$-plane.

Example 15.6 Determine the area of that part of the plane $2x + 7y + z = 10$ lying over the region $R = \{(x, y) \in \mathbb{R}^2 \mid 0 \le x \le 1, \; 0 \le y \le 1\}$.

Solution:

[Figure: two plots of the plane $2x + 7y + z = 10$ over $0 \le x \le 1$, $0 \le y \le 1$, with $0 \le z \le 10$.]

The equation of the plane can be rewritten as $z = f(x, y) = 10 - 2x - 7y$. Thus $f_x = -2$ and $f_y = -7$ which yields
\[
dS = \sqrt{f_x^2 + f_y^2 + 1}\,dx\,dy = \sqrt{4 + 49 + 1}\,dx\,dy = 3\sqrt{6}\,dx\,dy
\]
and hence
\[
\begin{aligned}
\iint_S dS &= \iint_R \sqrt{f_x^2 + f_y^2 + 1}\,dx\,dy = \iint_R 3\sqrt{6}\,dx\,dy = 3\sqrt{6}\iint_R dx\,dy \\
&= 3\sqrt{6} \times \text{area of } R = 3\sqrt{6} .
\end{aligned}
\]

□


LECTURE 14

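Example 15.7 Determine the surface area for that part of the sphere $x^2 + y^2 + z^2 = 4$ contained within the cylinder $x^2 + y^2 = 1$.

Solution:

[Figure: the sphere $x^2 + y^2 + z^2 = 4$ and the cylinder $x^2 + y^2 = 1$, plotted on $-2 \le x, y, z \le 2$.]

Let $S$ be the top half of the surface required, which is defined by the graph $z = \sqrt{4 - x^2 - y^2}$. Thus the required area is 2 times the area of $S$, or
\[
2\iint_R \sqrt{f_x^2 + f_y^2 + 1}\,dx\,dy
\]
where $R$ is the projection of $S$ onto the $xy$-plane, i.e., $x^2 + y^2 \le 1$. Now
\[
f_x = -\frac{x}{\sqrt{4 - x^2 - y^2}} ; \qquad f_y = -\frac{y}{\sqrt{4 - x^2 - y^2}}
\]
and hence
\[
dS = \sqrt{f_x^2 + f_y^2 + 1}\,dx\,dy = \frac{2}{\sqrt{4 - x^2 - y^2}}\,dx\,dy .
\]
Thus the required surface area is given by
\[
2\iint_R \frac{2}{\sqrt{4 - x^2 - y^2}}\,dx\,dy .
\]
Since the projection $R$ is a circular region it is wise to convert to polar co-ordinates:
\[
\begin{aligned}
2\iint_R \frac{2}{\sqrt{4 - x^2 - y^2}}\,dx\,dy &= 2\int_{\theta=0}^{2\pi}\int_{r=0}^{1} \frac{2}{\sqrt{4 - r^2}}\,r\,dr\,d\theta \\
&= 4\pi\int_{r=0}^{1} \frac{2}{\sqrt{4 - r^2}}\,r\,dr \qquad (\text{let } u = 4 - r^2) \\
&= -4\pi\int_{u=4}^{3} \frac{1}{\sqrt{u}}\,du \\
&= 4\pi\int_{u=3}^{4} \frac{1}{\sqrt{u}}\,du \\
&= 4\pi\left(2\sqrt{u}\,\Big|_3^4\right) = 8\pi\left(2 - \sqrt{3}\right) .
\end{aligned}
\]

□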
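The polar integral in Example 15.7 can be checked against the closed form $8\pi(2 - \sqrt{3})$. This is a minimal sketch; the function name `cap_area` and the midpoint rule are assumptions made for illustration.

```python
import math

def cap_area(n=2000):
    """Midpoint-rule estimate of 2 * (integral over the unit disc of 2 / sqrt(4 - r^2) dA)."""
    h = 1.0 / n
    radial = 0.0
    for k in range(n):
        r = (k + 0.5) * h
        radial += 2.0 / math.sqrt(4.0 - r * r) * r * h
    # The integrand does not depend on theta, so the theta integral contributes 2 pi;
    # the leading factor 2 accounts for the top and bottom caps.
    return 2 * (2 * math.pi) * radial

numeric = cap_area()
closed_form = 8 * math.pi * (2 - math.sqrt(3))
print(numeric, closed_form)
```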

In the previous two examples we wrote down the relationship
\[
dS = \sqrt{f_x^2 + f_y^2 + 1}\,dA
\]
which we determined from the parametric form for a surface. We can derive this relationship from a geometric point of view. For small $dA$ we may assume that $dS$ is flat, i.e., without curvature. Note that flat does not necessarily mean horizontal.

In general $dS$ will be greater than $dA$. The relationship between $dS$ and $dA$ depends on the inclination of $dS$. Since $dA$ is the projection of $dS$ onto the $xy$-plane the relationship is given by
\[
dA = \cos\gamma\,dS
\]
where $\gamma$ is the acute angle between the plane determined by $dS$ and the $xy$-plane. Also if we choose a unit normal $\mathbf{n}$ to the plane of $dS$ then $\gamma$ is the angle between $\mathbf{k}$ and $\mathbf{n}$ also. Note that if we had chosen $\mathbf{n}$ in the opposite direction then the angle between $\mathbf{n}$ and $\mathbf{k}$ would be $\pi - \gamma$ and then $\mathbf{n}\cdot\mathbf{k} = -\cos\gamma$. Hence
\[
\cos\gamma = |\mathbf{n}\cdot\mathbf{k}|
\]
and therefore
\[
dA = \cos\gamma\,dS \quad\Rightarrow\quad dS = \frac{1}{|\mathbf{n}\cdot\mathbf{k}|}\,dA .
\]
For the implicit surface defined by $\phi(x, y, z) = \pm(f(x, y) - z) = 0$ we have
\[
\mathbf{n} = \frac{\nabla\phi}{\|\nabla\phi\|} = \pm\frac{f_x\,\mathbf{i} + f_y\,\mathbf{j} - \mathbf{k}}{\sqrt{f_x^2 + f_y^2 + 1}}
\]
and thus
\[
|\mathbf{n}\cdot\mathbf{k}| = \frac{1}{\sqrt{f_x^2 + f_y^2 + 1}} .
\]
Finally we have the desired result
\[
dS = \frac{1}{|\mathbf{n}\cdot\mathbf{k}|}\,dA = \sqrt{f_x^2 + f_y^2 + 1}\,dA .
\]

Definition 15.4 The surface integral of a scalar quantity $G$ defined at points $\mathbf{r}(u, v)$ on a piecewise smooth surface $S$ is given by
\[
\iint_S G(\mathbf{r})\,dS = \iint_R G(\mathbf{r}(u, v))\,\|\mathbf{N}(u, v)\|\,du\,dv .
\]
Note if $G(\mathbf{r}) = 1$ then the integral gives the surface area of $S$.

A special case occurs when the surface $S$ is given by $z = f(x, y)$, whence
\[
\iint_S G(\mathbf{r})\,dS = \iint_R G(x, y, f(x, y))\,\sqrt{f_x^2 + f_y^2 + 1}\,dx\,dy
\]
where $R$ is the projection of $S$ onto the $xy$-plane. In the case $G(\mathbf{r}) = 1$, the area is given by
\[
\iint_S dS = \iint_R \sqrt{f_x^2 + f_y^2 + 1}\,dx\,dy .
\]

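Example 15.8 A thin sheet of metal is in the shape of a half cylinder $x^2 + z^2 = 1$, $z \ge 0$, $0 \le y \le 4$. Determine the mass of this metal sheet if its density $\sigma$ (mass per unit area) at any point $(x, y, z)$ is $2y + z + 3$.

Solution:

The surface $S$ is parametrised by
\[
\mathbf{r}(\theta, y) = \cos\theta\,\mathbf{i} + y\,\mathbf{j} + \sin\theta\,\mathbf{k}, \quad 0 \le \theta \le \pi, \; 0 \le y \le 4 .
\]
Thus
\[
\begin{aligned}
\mathbf{r}_\theta &= -\sin\theta\,\mathbf{i} + 0\,\mathbf{j} + \cos\theta\,\mathbf{k} , \\
\mathbf{r}_y &= 0\,\mathbf{i} + \mathbf{j} + 0\,\mathbf{k} , \\
\mathbf{N}(\theta, y) &= \mathbf{r}_\theta\times\mathbf{r}_y =
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
-\sin\theta & 0 & \cos\theta \\
0 & 1 & 0
\end{vmatrix}
= -\cos\theta\,\mathbf{i} + 0\,\mathbf{j} - \sin\theta\,\mathbf{k} , \\
\|\mathbf{N}(\theta, y)\| &= 1 .
\end{aligned}
\]
Hence the mass of the metal sheet is given by
\[
\begin{aligned}
\text{mass} = \iint_S \sigma(x, y, z)\,dS &= \iint_R \sigma(\mathbf{r}(\theta, y))\,\|\mathbf{N}(\theta, y)\|\,d\theta\,dy \\
&= \int_{\theta=0}^{\pi}\int_{y=0}^{4} (2y + \sin\theta + 3)\,dy\,d\theta \\
&= \int_{\theta=0}^{\pi} \left(y^2 + y\sin\theta + 3y\,\Big|_0^4\right) d\theta \\
&= \int_{\theta=0}^{\pi} (28 + 4\sin\theta)\,d\theta \\
&= 28\pi + 4\int_{\theta=0}^{\pi} \sin\theta\,d\theta \\
&= 28\pi + 8 .
\end{aligned}
\]

□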
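The mass integral of Example 15.8 can be approximated by a double Riemann sum over the parameter rectangle. This is a sketch under stated assumptions: the function name `sheet_mass` and the grid sizes are illustrative.

```python
import math

def sheet_mass(n_theta=400, n_y=50):
    """Midpoint Riemann sum of sigma(r(theta, y)) * ||N|| over 0 <= theta <= pi, 0 <= y <= 4."""
    h_t = math.pi / n_theta
    h_y = 4.0 / n_y
    total = 0.0
    for i in range(n_theta):
        th = (i + 0.5) * h_t
        for j in range(n_y):
            y = (j + 0.5) * h_y
            z = math.sin(th)              # z = sin(theta) on the sheet
            sigma = 2 * y + z + 3         # density 2y + z + 3
            norm_N = 1.0                  # ||r_theta x r_y|| = 1 for this parameterisation
            total += sigma * norm_N * h_t * h_y
    return total

numeric = sheet_mass()
exact = 28 * math.pi + 8
print(numeric, exact)
```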

Example 15.9 The dome of a cathedral, in the shape of the paraboloid z = 1−x2−y2

is to be retiled. The cost of tiling is $(z + 15) per square metre (the higher you go upthe more expensive it is). What is the total cost?

Solution:

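The projection of the paraboloid S onto the xy-plane is $R = \{(x, y) \in \mathbb{R}^2 \mid x^2 + y^2 \le 1\}$.

Since the surface has equation $z = f(x, y) = 1 - x^2 - y^2$ we have

$$dS = \sqrt{f_x^2 + f_y^2 + 1}\;dx\,dy = \sqrt{4x^2 + 4y^2 + 1}\;dx\,dy .$$

Therefore the total cost is given by the integral

$$\text{cost} = \iint_R (1 - x^2 - y^2 + 15)\sqrt{4x^2 + 4y^2 + 1}\;dx\,dy .$$

Since the projection R is circular it is wise to convert to polar co-ordinates:

$$\begin{aligned}
\text{cost} &= \iint_R (1 - x^2 - y^2 + 15)\sqrt{4x^2 + 4y^2 + 1}\;dx\,dy\\
&= \int_{\theta=0}^{2\pi}\int_{r=0}^{1} (16 - r^2)\sqrt{4r^2 + 1}\;r\,dr\,d\theta\\
&= 2\pi\int_{r=0}^{1} (16 - r^2)\sqrt{4r^2 + 1}\;r\,dr \qquad (\text{let } u = 4r^2 + 1)\\
&= \frac{2\pi}{8}\int_{u=1}^{5} \left(16 - \frac{u - 1}{4}\right)\sqrt{u}\,du\\
&= \frac{\pi}{16}\int_{u=1}^{5} (65 - u)\sqrt{u}\,du\\
&= \frac{\pi}{16}\left[\frac{130}{3}u^{3/2} - \frac{2}{5}u^{5/2}\right]_{1}^{5}\\
&= \frac{\pi}{16}\left(\frac{130}{3}\big(5\sqrt{5} - 1\big) - \frac{2}{5}\big(25\sqrt{5} - 1\big)\right).
\end{aligned}$$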

�
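As a sanity check on the substitution in Example 15.9, the radial integral can be evaluated numerically and compared with the closed form. This is only a sketch under stated assumptions: the helper names `cost_integrand` and `midpoint` are hypothetical, and the midpoint rule with 2000 subintervals is an arbitrary choice.

```python
import math

def cost_integrand(r):
    # Radial integrand (16 - r^2) * sqrt(4 r^2 + 1) * r, before multiplying by 2*pi
    return (16 - r * r) * math.sqrt(4 * r * r + 1) * r

def midpoint(f, a, b, n=2000):
    """Composite midpoint rule on [a, b] with n subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

numeric = 2 * math.pi * midpoint(cost_integrand, 0.0, 1.0)

s5 = math.sqrt(5)
closed_form = math.pi / 16 * (130 / 3 * (5 * s5 - 1) - 2 / 5 * (25 * s5 - 1))
print(numeric, closed_form)
```

Agreement between the two printed values supports both the change to polar co-ordinates and the substitution $u = 4r^2 + 1$.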

LECTURE 15

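Definition 15.5 A flux integral of $\mathbf{F}$ over the piecewise smooth surface S defined by the parameterisation $\mathbf{r}(u, v)$ is given by

$$\begin{aligned}
\iint_S \mathbf{F}\cdot\mathbf{n}\,dS &= \iint_R \mathbf{F}(\mathbf{r}(u, v))\cdot\hat{\mathbf{N}}(u, v)\,\|\mathbf{N}(u, v)\|\,du\,dv\\
&= \iint_R \mathbf{F}(\mathbf{r}(u, v))\cdot\mathbf{N}(u, v)\,du\,dv
\end{aligned}$$

where $\mathbf{n} = \dfrac{\nabla\varphi}{\|\nabla\varphi\|}$ is the unit normal to S such that $\varphi = \pm(f(x, y) - z) = 0$, and $\mathbf{N} = \pm(\mathbf{r}_u \times \mathbf{r}_v)$.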

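Example 15.10 Let S be the half hemisphere $x^2 + y^2 + z^2 = 1$, $y \ge 0$, $z \ge 0$. Determine the flux of $\mathbf{F} = -\mathbf{i} + 2\mathbf{j} + \mathbf{k}$ across S in the outwards direction.

Solution: The vector parametric form for the surface is given by

$$\mathbf{r}(\theta, \varphi) = \cos\theta\sin\varphi\,\mathbf{i} + \sin\theta\sin\varphi\,\mathbf{j} + \cos\varphi\,\mathbf{k}, \qquad 0 \le \theta \le \pi,\ 0 \le \varphi \le \frac{\pi}{2} .$$

Thus

$$\begin{aligned}
\mathbf{r}_\theta &= -\sin\theta\sin\varphi\,\mathbf{i} + \cos\theta\sin\varphi\,\mathbf{j} + 0\,\mathbf{k},\\
\mathbf{r}_\varphi &= \cos\theta\cos\varphi\,\mathbf{i} + \sin\theta\cos\varphi\,\mathbf{j} - \sin\varphi\,\mathbf{k},\\
\mathbf{N}(\theta, \varphi) &= \mathbf{r}_\theta \times \mathbf{r}_\varphi
= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k}\\ -\sin\theta\sin\varphi & \cos\theta\sin\varphi & 0\\ \cos\theta\cos\varphi & \sin\theta\cos\varphi & -\sin\varphi \end{vmatrix}\\
&= -\cos\theta\sin^2\varphi\,\mathbf{i} - \sin\theta\sin^2\varphi\,\mathbf{j} - \cos\varphi\sin\varphi\,\mathbf{k} .
\end{aligned}$$

We note from the problem that we want the outward flux, i.e., with a non-negative component in the z direction. Thus we should take the negative of the above result for the normal. Hence

$$\begin{aligned}
\mathbf{F}(\mathbf{r}(\theta, \varphi))\cdot\mathbf{N}(\theta, \varphi) &= (-\mathbf{i} + 2\mathbf{j} + \mathbf{k})\cdot(\cos\theta\sin^2\varphi\,\mathbf{i} + \sin\theta\sin^2\varphi\,\mathbf{j} + \cos\varphi\sin\varphi\,\mathbf{k})\\
&= 2\sin\theta\sin^2\varphi + \cos\varphi\sin\varphi - \cos\theta\sin^2\varphi .
\end{aligned}$$

Thus the flux integral is given by

$$\begin{aligned}
\iint_S \mathbf{F}\cdot\mathbf{n}\,dS &= \iint_R \mathbf{F}(\mathbf{r}(\theta, \varphi))\cdot\mathbf{N}(\theta, \varphi)\,d\theta\,d\varphi\\
&= \int_{\varphi=0}^{\pi/2}\int_{\theta=0}^{\pi} \left(2\sin\theta\sin^2\varphi + \cos\varphi\sin\varphi - \cos\theta\sin^2\varphi\right)d\theta\,d\varphi\\
&= \int_{\varphi=0}^{\pi/2} \left(4\sin^2\varphi + \pi\cos\varphi\sin\varphi\right)d\varphi\\
&= 4\cdot\frac{\pi}{4} + \pi\int_{\varphi=0}^{\pi/2} \cos\varphi\sin\varphi\,d\varphi \qquad (\text{let } t = \sin\varphi)\\
&= \pi + \pi\int_0^1 t\,dt = \pi + \frac{\pi}{2} = \frac{3\pi}{2} .
\end{aligned}$$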

�
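The orientation step in Example 15.10 is easy to get wrong, so a numerical cross-check can help. The sketch below is an illustration only: it forms the outward normal as $\mathbf{r}_\varphi \times \mathbf{r}_\theta$ (the negative of $\mathbf{r}_\theta \times \mathbf{r}_\varphi$) and sums $\mathbf{F}\cdot\mathbf{N}$ with a midpoint rule; the helper names and grid size are assumptions.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def r_theta(t, p):
    return (-math.sin(t) * math.sin(p), math.cos(t) * math.sin(p), 0.0)

def r_phi(t, p):
    return (math.cos(t) * math.cos(p), math.sin(t) * math.cos(p), -math.sin(p))

F = (-1.0, 2.0, 1.0)  # constant field -i + 2j + k

def flux(n=400):
    """Midpoint-rule flux over 0 <= theta <= pi, 0 <= phi <= pi/2."""
    dt, dp = math.pi / n, (math.pi / 2) / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        for j in range(n):
            p = (j + 0.5) * dp
            normal = cross(r_phi(t, p), r_theta(t, p))  # outward-pointing choice
            total += dot(F, normal)
    return total * dt * dp

flux_value = flux()
print(flux_value, 3 * math.pi / 2)
```

Swapping the order of the cross product flips the sign of the result, which is exactly the effect of choosing the inward rather than the outward normal.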

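Example 15.11 Let S be the half cylinder $x^2 + y^2 = 1$, $y \ge 0$ and $0 \le z \le 1$. Determine the flux of $\mathbf{F} = -\mathbf{i} + 2\mathbf{j} + \mathbf{k}$ across S in the outwards direction.

Solution:

The surface S is implicitly defined by the function $\varphi(x, y, z) = x^2 + y^2 - 1 = 0$. A normal to the surface S is thus $\nabla\varphi = 2x\,\mathbf{i} + 2y\,\mathbf{j} + 0\,\mathbf{k}$ and hence a unit normal is

$$\mathbf{n} = \frac{\nabla\varphi}{\|\nabla\varphi\|} = \frac{2x\,\mathbf{i} + 2y\,\mathbf{j} + 0\,\mathbf{k}}{\sqrt{4x^2 + 4y^2}} = x\,\mathbf{i} + y\,\mathbf{j} + 0\,\mathbf{k} .$$

We should check that we have found the outward normal by evaluating $\mathbf{n}$ at some point on S. For example at (1, 0, 0), $\mathbf{n} = \mathbf{i} + 0\,\mathbf{j} + 0\,\mathbf{k} = \mathbf{i}$, which is an outward pointing normal to the surface.

So $\mathbf{F}\cdot\mathbf{n} = (-\mathbf{i} + 2\mathbf{j} + \mathbf{k})\cdot(x\,\mathbf{i} + y\,\mathbf{j} + 0\,\mathbf{k}) = -x + 2y$. Hence the flux integral is given by

$$\iint_S \mathbf{F}\cdot\mathbf{n}\,dS = \iint_S (-x + 2y)\,dS .$$

Using cylindrical polar co-ordinates we have $x = \cos\theta$, $y = \sin\theta$, $z = t$, $0 \le \theta \le \pi$, $0 \le t \le 1$ and $dS = 1\,d\theta\,dt$. Hence

$$\begin{aligned}
\iint_S (-x + 2y)\,dS &= \int_{t=0}^{1}\int_{\theta=0}^{\pi} (-\cos\theta + 2\sin\theta)\,d\theta\,dt\\
&= \Big[-\sin\theta - 2\cos\theta\Big]_{0}^{\pi}\\
&= 4 .
\end{aligned}$$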

�


Chapter 17

Stokes’ Theorem [SHE 10 §18.10]

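Definition 17.1 The circulation density or curl of a vector field $\mathbf{F} = F_x(x, y, z)\,\mathbf{i} + F_y(x, y, z)\,\mathbf{j} + F_z(x, y, z)\,\mathbf{k}$ is given by

$$\begin{aligned}
\operatorname{curl}\mathbf{F} \equiv \nabla\times\mathbf{F}
&= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k}\\[2pt] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z}\\[6pt] F_x & F_y & F_z \end{vmatrix}\\
&= \left(\frac{\partial F_z}{\partial y} - \frac{\partial F_y}{\partial z}\right)\mathbf{i} + \left(\frac{\partial F_x}{\partial z} - \frac{\partial F_z}{\partial x}\right)\mathbf{j} + \left(\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y}\right)\mathbf{k} .
\end{aligned}$$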

Before we state the theorem we discuss one bit of terminology. Let S be an orientable surface bounded by a smooth curve C (i.e., a curve that forms the “edge” of the surface). Then the orientation of S induces an orientation or direction along C as follows. If we imagine a person walking along C on the positive side of S, then the person is walking in the positive direction of C if the surface S is on his or her left and in the negative direction if S is on his or her right.

Theorem 17.1 (Stokes’ Theorem) Let S be a piecewise smooth orientable surface in space whose boundary is a piecewise smooth simple closed curve C. If the first order partial derivatives of $\mathbf{F}(x, y, z)$ exist and are continuous in a domain in space containing S then

$$\iint_S \operatorname{curl}\mathbf{F}\cdot\mathbf{n}\,dS = \oint_C \mathbf{F}\cdot d\mathbf{r}$$

where $\mathbf{n}$ is the outward pointing unit normal to the surface S and C is positively orientated.


Notes:

• If the surface S lies entirely in a plane then Stokes’ Theorem reduces to Green’s Theorem in the Plane.

• We can write $\oint_C \mathbf{F}\cdot d\mathbf{r}$ as $\oint_C \mathbf{F}\cdot\mathbf{T}\,ds$, where s is arc length on C.

• Interpretation of Stokes’ Theorem: the flux integral of $\operatorname{curl}\mathbf{F}$ across any (suitable) surface S spanning the given simple closed curve C has the same value as $\oint_C \mathbf{F}\cdot d\mathbf{r}$.



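Example 17.1 Verify Stokes’ Theorem for the hemisphere $S: x^2 + y^2 + z^2 = 9$, $z \ge 0$, its bounding circle $C: x^2 + y^2 = 9$, $z = 0$ and the vector field $\mathbf{F} = y\,\mathbf{i} - x\,\mathbf{j}$.

Solution:

We use the parameterisation

$$\mathbf{r}(t) = 3\cos t\,\mathbf{i} + 3\sin t\,\mathbf{j}, \qquad 0 \le t \le 2\pi$$

for the curve C. Hence

$$\begin{aligned}
\mathbf{F} &= y\,\mathbf{i} - x\,\mathbf{j} = 3\sin t\,\mathbf{i} - 3\cos t\,\mathbf{j},\\
d\mathbf{r} &= -3\sin t\,dt\,\mathbf{i} + 3\cos t\,dt\,\mathbf{j} .
\end{aligned}$$

Thus

$$\oint_C \mathbf{F}\cdot d\mathbf{r} = \int_0^{2\pi} \left(-9\sin^2 t - 9\cos^2 t\right)dt = -9\int_0^{2\pi} dt = -18\pi .$$

Next we calculate the curl of $\mathbf{F}$:

$$\operatorname{curl}\mathbf{F} \equiv \nabla\times\mathbf{F}
= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k}\\[2pt] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z}\\[6pt] y & -x & 0 \end{vmatrix}
= (0 - 0)\,\mathbf{i} + (0 - 0)\,\mathbf{j} + (-1 - 1)\,\mathbf{k} = -2\,\mathbf{k} .$$

We parametrise the surface S by

$$\mathbf{r}(\theta, \varphi) = 3\cos\theta\sin\varphi\,\mathbf{i} + 3\sin\theta\sin\varphi\,\mathbf{j} + 3\cos\varphi\,\mathbf{k}, \qquad 0 \le \theta \le 2\pi,\ 0 \le \varphi \le \frac{\pi}{2} .$$

Thus

$$\begin{aligned}
\mathbf{r}_\theta &= -3\sin\theta\sin\varphi\,\mathbf{i} + 3\cos\theta\sin\varphi\,\mathbf{j} + 0\,\mathbf{k},\\
\mathbf{r}_\varphi &= 3\cos\theta\cos\varphi\,\mathbf{i} + 3\sin\theta\cos\varphi\,\mathbf{j} - 3\sin\varphi\,\mathbf{k},\\
\mathbf{N}(\theta, \varphi) &= \mathbf{r}_\theta \times \mathbf{r}_\varphi = -9\left[\cos\theta\sin^2\varphi\,\mathbf{i} + \sin\theta\sin^2\varphi\,\mathbf{j} + \sin\varphi\cos\varphi\,\mathbf{k}\right],\\
\|\mathbf{N}(\theta, \varphi)\| &= 9\sin\varphi,\\
\operatorname{curl}\mathbf{F}\cdot\mathbf{N}(\theta, \varphi) &= -2\,\mathbf{k}\cdot\left(-9\left[\cos\theta\sin^2\varphi\,\mathbf{i} + \sin\theta\sin^2\varphi\,\mathbf{j} + \sin\varphi\cos\varphi\,\mathbf{k}\right]\right) = 18\sin\varphi\cos\varphi .
\end{aligned}$$

Hence

$$\begin{aligned}
\iint_S \operatorname{curl}\mathbf{F}\cdot\mathbf{n}\,dS &\equiv \iint_R \operatorname{curl}\mathbf{F}(\mathbf{r}(\theta, \varphi))\cdot\mathbf{N}(\theta, \varphi)\,d\theta\,d\varphi\\
&= 18\int_{\theta=0}^{2\pi}\int_{\varphi=0}^{\pi/2} \sin\varphi\cos\varphi\,d\varphi\,d\theta\\
&= 36\pi\int_{\varphi=0}^{\pi/2} \sin\varphi\cos\varphi\,d\varphi \qquad (\text{let } t = \sin\varphi)\\
&= 36\pi\int_0^1 t\,dt = 18\pi .
\end{aligned}$$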



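What went wrong? When applying Stokes’ Theorem we need to remember that the direction of the normal $\mathbf{N}(\theta, \varphi)$ (which could be either $\mathbf{r}_\theta \times \mathbf{r}_\varphi$ or $\mathbf{r}_\varphi \times \mathbf{r}_\theta = -(\mathbf{r}_\theta \times \mathbf{r}_\varphi)$) is determined by the direction we take around C. Since we are going around the curve counterclockwise as viewed from above we should take the outward (upward in this case, i.e., with a $+\mathbf{k}$ component) pointing normal to the surface.

Consider

$$\frac{\mathbf{N}(\theta, \varphi)}{\|\mathbf{N}(\theta, \varphi)\|} = -\left[\cos\theta\sin\varphi\,\mathbf{i} + \sin\theta\sin\varphi\,\mathbf{j} + \cos\varphi\,\mathbf{k}\right] = -\frac{\mathbf{r}(\theta, \varphi)}{\|\mathbf{r}(\theta, \varphi)\|} = -\frac{\mathbf{r}(\theta, \varphi)}{3} .$$

This is the inward (downward) pointing normal. You can convince yourself by setting $\varphi = 0$ in $\mathbf{N}(\theta, \varphi)/\|\mathbf{N}(\theta, \varphi)\|$ to obtain $-\mathbf{k}$. Thus we should take

$$\mathbf{N}(\theta, \varphi) = \mathbf{r}_\varphi \times \mathbf{r}_\theta = 9\left[\cos\theta\sin^2\varphi\,\mathbf{i} + \sin\theta\sin^2\varphi\,\mathbf{j} + \sin\varphi\cos\varphi\,\mathbf{k}\right]$$

and therefore $\operatorname{curl}\mathbf{F}\cdot\mathbf{N}(\theta, \varphi) = -18\sin\varphi\cos\varphi$.

Hence

$$\begin{aligned}
\iint_S \operatorname{curl}\mathbf{F}\cdot\mathbf{n}\,dS &\equiv \iint_R \operatorname{curl}\mathbf{F}(\mathbf{r}(\theta, \varphi))\cdot\mathbf{N}(\theta, \varphi)\,d\theta\,d\varphi\\
&= -18\int_{\theta=0}^{2\pi}\int_{\varphi=0}^{\pi/2} \sin\varphi\cos\varphi\,d\varphi\,d\theta\\
&= -36\pi\int_{\varphi=0}^{\pi/2} \sin\varphi\cos\varphi\,d\varphi\\
&= -36\pi\int_0^1 t\,dt\\
&= -18\pi .
\end{aligned}$$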

�
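Both sides of Stokes’ Theorem in Example 17.1 can also be approximated numerically. The sketch below is illustrative only; the function names and the midpoint-rule resolution are assumptions. It uses the upward normal $\mathbf{r}_\varphi \times \mathbf{r}_\theta$, whose $\mathbf{k}$ component is $9\sin\varphi\cos\varphi$.

```python
import math

def line_integral(n=2000):
    """Circulation of F = (y, -x, 0) around r(t) = (3 cos t, 3 sin t, 0)."""
    dt = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        x, y = 3 * math.cos(t), 3 * math.sin(t)
        dx, dy = -3 * math.sin(t), 3 * math.cos(t)  # components of r'(t)
        total += (y * dx - x * dy) * dt             # F . r'(t) dt
    return total

def surface_integral(n=400):
    """Flux of curl F = (0, 0, -2) through the hemisphere, upward normal."""
    dphi = (math.pi / 2) / n
    total = 0.0
    for j in range(n):
        phi = (j + 0.5) * dphi
        n_z = 9 * math.sin(phi) * math.cos(phi)     # k-component of r_phi x r_theta
        total += -2 * n_z * dphi
    return 2 * math.pi * total                      # integrand is independent of theta

circulation = line_integral()
curl_flux = surface_integral()
print(circulation, curl_flux, -18 * math.pi)
```

With the upward normal the two numbers agree; replacing `n_z` by its negative reproduces the sign discrepancy discussed above.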



LECTURE 16

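Example 17.2 Determine the circulation of the vector field $\mathbf{F} = (x^2 - y)\,\mathbf{i} + 4z\,\mathbf{j} + x^2\,\mathbf{k}$ around the curve C in which the plane $z = 2$ meets the cone $z = \sqrt{x^2 + y^2}$, counterclockwise as viewed from above.

Solution:

We parametrise the cone by

$$\mathbf{r}(r, \theta) = r\cos\theta\,\mathbf{i} + r\sin\theta\,\mathbf{j} + r\,\mathbf{k}, \qquad 0 \le r \le 2,\ 0 \le \theta \le 2\pi .$$

Thus

$$\begin{aligned}
\mathbf{r}_r &= \cos\theta\,\mathbf{i} + \sin\theta\,\mathbf{j} + \mathbf{k},\\
\mathbf{r}_\theta &= -r\sin\theta\,\mathbf{i} + r\cos\theta\,\mathbf{j} .
\end{aligned}$$

Looking at the surface of the cone we should take the normal $\mathbf{N}(r, \theta)$ that has a positive z component. Thus

$$\mathbf{N}(r, \theta) = \mathbf{r}_r \times \mathbf{r}_\theta = -r\cos\theta\,\mathbf{i} - r\sin\theta\,\mathbf{j} + r\,\mathbf{k} .$$

Next we calculate $\operatorname{curl}\mathbf{F}$:

$$\operatorname{curl}\mathbf{F} \equiv \nabla\times\mathbf{F}
= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k}\\[2pt] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z}\\[6pt] x^2 - y & 4z & x^2 \end{vmatrix}
= -4\,\mathbf{i} - 2x\,\mathbf{j} + \mathbf{k} = -4\,\mathbf{i} - 2r\cos\theta\,\mathbf{j} + \mathbf{k} .$$

Then

$$\begin{aligned}
\operatorname{curl}\mathbf{F}\cdot\mathbf{N}(r, \theta) &= (-4\,\mathbf{i} - 2r\cos\theta\,\mathbf{j} + \mathbf{k})\cdot(-r\cos\theta\,\mathbf{i} - r\sin\theta\,\mathbf{j} + r\,\mathbf{k})\\
&= 4r\cos\theta + 2r^2\sin\theta\cos\theta + r .
\end{aligned}$$

Hence using Stokes’ Theorem the circulation is given by

$$\begin{aligned}
\oint_C \mathbf{F}\cdot d\mathbf{r} &= \iint_S \operatorname{curl}\mathbf{F}\cdot\mathbf{n}\,dS \equiv \iint_R \operatorname{curl}\mathbf{F}(\mathbf{r}(r, \theta))\cdot\mathbf{N}(r, \theta)\,dr\,d\theta\\
&= \int_{\theta=0}^{2\pi}\int_{r=0}^{2} \left(4r\cos\theta + 2r^2\sin\theta\cos\theta + r\right)dr\,d\theta\\
&= \int_{\theta=0}^{2\pi} \left(8\cos\theta + \frac{16}{3}\sin\theta\cos\theta + 2\right)d\theta\\
&= 4\pi .
\end{aligned}$$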

�



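Example 17.3 Let C be the rectangle in the plane z = y orientated as indicated

[Figure: the rectangle C in the plane z = y, drawn with the axes x, y, z, the normal n, and the lengths 1 and 3 marked.]

and $\mathbf{F} = x^2\,\mathbf{i} + 4xy^3\,\mathbf{j} + xy^2\,\mathbf{k}$. Determine $\oint_C \mathbf{F}\cdot d\mathbf{r}$.

Solution: To evaluate the integral directly would require four separate line integral calculations, one for each side of the rectangle. Instead we shall apply Stokes’ Theorem so we need only evaluate a single surface integral over any surface with C as its boundary. One such surface is the rectangular surface S bounded by C. For the direction shown for C the surface S must be orientated by a downward normal.

The surface S has implicit representation given by $\varphi(x, y, z) = y - z = 0$. Hence the downward unit normal is given by

$$\mathbf{n} = \frac{\nabla\varphi}{\|\nabla\varphi\|} = \frac{1}{\sqrt{2}}\left(0\,\mathbf{i} + \mathbf{j} - \mathbf{k}\right) .$$

Note the $-\mathbf{k}$ component in the normal. Next we calculate $\operatorname{curl}\mathbf{F}$:

$$\operatorname{curl}\mathbf{F}
= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k}\\[2pt] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z}\\[6pt] x^2 & 4xy^3 & xy^2 \end{vmatrix}
= 2xy\,\mathbf{i} - y^2\,\mathbf{j} + 4y^3\,\mathbf{k} .$$

Thus

$$\begin{aligned}
\oint_C \mathbf{F}\cdot d\mathbf{r} &= \iint_S \operatorname{curl}\mathbf{F}\cdot\mathbf{n}\,dS\\
&= \iint_R \left(2xy\,\mathbf{i} - y^2\,\mathbf{j} + 4y^3\,\mathbf{k}\right)\cdot\frac{1}{\sqrt{2}}\left(0\,\mathbf{i} + \mathbf{j} - \mathbf{k}\right)\sqrt{2}\;dx\,dy
\end{aligned}$$

where R is the projection of S onto the xy-plane, i.e., $R = \{(x, y) \in \mathbb{R}^2 \mid 0 \le x \le 1,\ 0 \le y \le 3\}$. Thus

$$\begin{aligned}
\oint_C \mathbf{F}\cdot d\mathbf{r} &= \iint_R \left(2xy\,\mathbf{i} - y^2\,\mathbf{j} + 4y^3\,\mathbf{k}\right)\cdot\frac{1}{\sqrt{2}}\left(0\,\mathbf{i} + \mathbf{j} - \mathbf{k}\right)\sqrt{2}\;dx\,dy\\
&= -\int_{x=0}^{1}\int_{y=0}^{3} \left(4y^3 + y^2\right)dy\,dx\\
&= -90 .
\end{aligned}$$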

�


Evaluate $\oint_C \mathbf{F}\cdot d\mathbf{r}$ where $\mathbf{F} = (e^{-x^2} - yz)\,\mathbf{i} + (\sin y + xz + 4x)\,\mathbf{j} + (\arctan z + 1)\,\mathbf{k}$ and C is the positively orientated curve of intersection of the surfaces $x^2 + y^2 + z^2 = 25$ and $z = 4$.

Solution:


�




Evaluate $\oint_C \mathbf{F}\cdot d\mathbf{r}$ where $\mathbf{F} = (4z + x^2)\,\mathbf{i} + (-2x + 3y^5)\,\mathbf{j} + (2x^2 + 5\sin z)\,\mathbf{k}$ and C is the positively orientated curve of intersection of the surfaces $x^2 + y^2 = 1$ and $z = y + 1$.

Solution:


�


Chapter 18

Divergence Theorem [SHE 10 §18.9]

LECTURE 17

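Definition 18.1 The flux density or divergence of a vector field $\mathbf{F} = F_x(x, y, z)\,\mathbf{i} + F_y(x, y, z)\,\mathbf{j} + F_z(x, y, z)\,\mathbf{k}$ is given by

$$\operatorname{div}\mathbf{F} \equiv \nabla\cdot\mathbf{F} = \frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z} .$$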

Theorem 18.1 (Divergence Theorem) If T is a closed bounded region whose boundary is a piecewise smooth orientable surface S and if $\mathbf{F}(x, y, z)$ is continuous and has continuous first order partial derivatives in some domain containing T then

$$\iiint_T \operatorname{div}\mathbf{F}\,dV = \iint_S \mathbf{F}\cdot\mathbf{n}\,dS$$

where $\mathbf{n}$ is the outward unit normal to S.

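Proof: We wish to consider the net flux of the vector field

$$\mathbf{F} = F_x\,\mathbf{i} + F_y\,\mathbf{j} + F_z\,\mathbf{k} \equiv \langle F_x, F_y, F_z\rangle$$

across the surface of a box of volume $\Delta V = \Delta x\,\Delta y\,\Delta z$ and surface $\Delta S$, i.e., $\mathbf{F}\cdot\mathbf{n}\,\Delta S$.

The flux across the side at x is approximately given by

$$\mathbf{F}\!\left(x,\ y + \tfrac{\Delta y}{2},\ z + \tfrac{\Delta z}{2}\right)\cdot(-\mathbf{i})\,\Delta y\,\Delta z = -F_x\!\left(x,\ y + \tfrac{\Delta y}{2},\ z + \tfrac{\Delta z}{2}\right)\Delta y\,\Delta z .$$

Similarly the flux across the side at $x + \Delta x$ is approximately given by

$$\mathbf{F}\!\left(x + \Delta x,\ y + \tfrac{\Delta y}{2},\ z + \tfrac{\Delta z}{2}\right)\cdot\mathbf{i}\,\Delta y\,\Delta z = F_x\!\left(x + \Delta x,\ y + \tfrac{\Delta y}{2},\ z + \tfrac{\Delta z}{2}\right)\Delta y\,\Delta z .$$

Hence the total flux across the two sides at x and $x + \Delta x$ is given by

$$\left(F_x\!\left(x + \Delta x,\ y + \tfrac{\Delta y}{2},\ z + \tfrac{\Delta z}{2}\right) - F_x\!\left(x,\ y + \tfrac{\Delta y}{2},\ z + \tfrac{\Delta z}{2}\right)\right)\Delta y\,\Delta z \approx \left(\frac{\partial F_x}{\partial x}\Delta x\right)\Delta y\,\Delta z .$$

Similarly the total flux across the two sides at y and $y + \Delta y$ is given by

$$\left(F_y\!\left(x + \tfrac{\Delta x}{2},\ y + \Delta y,\ z + \tfrac{\Delta z}{2}\right) - F_y\!\left(x + \tfrac{\Delta x}{2},\ y,\ z + \tfrac{\Delta z}{2}\right)\right)\Delta x\,\Delta z \approx \left(\frac{\partial F_y}{\partial y}\Delta y\right)\Delta x\,\Delta z$$

and the total flux across the two sides at z and $z + \Delta z$ is given by

$$\left(F_z\!\left(x + \tfrac{\Delta x}{2},\ y + \tfrac{\Delta y}{2},\ z + \Delta z\right) - F_z\!\left(x + \tfrac{\Delta x}{2},\ y + \tfrac{\Delta y}{2},\ z\right)\right)\Delta x\,\Delta y \approx \left(\frac{\partial F_z}{\partial z}\Delta z\right)\Delta x\,\Delta y .$$

Hence the total flux across all six sides is given by

$$\begin{aligned}
\mathbf{F}\cdot\mathbf{n}\,\Delta S &= \left(\frac{\partial F_x}{\partial x}\Delta x\right)\Delta y\,\Delta z + \left(\frac{\partial F_y}{\partial y}\Delta y\right)\Delta x\,\Delta z + \left(\frac{\partial F_z}{\partial z}\Delta z\right)\Delta x\,\Delta y\\
&= \left(\frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z}\right)\Delta x\,\Delta y\,\Delta z\\
&= \nabla\cdot\mathbf{F}\,\Delta V .
\end{aligned}$$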

�
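The box argument in the proof can be tried out numerically on a small cube. The following sketch is an illustration under stated assumptions: the field `F` is a hypothetical example (not from the notes), the cube has equal side lengths `h`, and each face flux is approximated by the face-centre value, as in the proof.

```python
def F(x, y, z):
    # Hypothetical smooth field chosen only for illustration
    return (x * x * y, y * z, x + z ** 3)

def div_F(x, y, z):
    # Divergence of the field above, computed by hand: 2xy + z + 3z^2
    return 2 * x * y + z + 3 * z * z

def box_flux(x, y, z, h):
    """Net outward flux through the cube [x, x+h] x [y, y+h] x [z, z+h],
    using the value of F at the centre of each face."""
    xm, ym, zm = x + h / 2, y + h / 2, z + h / 2
    fx = F(x + h, ym, zm)[0] - F(x, ym, zm)[0]   # faces at x and x + h
    fy = F(xm, y + h, zm)[1] - F(xm, y, zm)[1]   # faces at y and y + h
    fz = F(xm, ym, z + h)[2] - F(xm, ym, z)[2]   # faces at z and z + h
    return (fx + fy + fz) * h * h

h = 1e-3
ratio = box_flux(0.5, -0.2, 0.7, h) / h ** 3     # flux per unit volume
exact = div_F(0.5 + h / 2, -0.2 + h / 2, 0.7 + h / 2)
print(ratio, exact)
```

As `h` shrinks, the flux per unit volume approaches the divergence at the centre of the cube, which is the local statement that the proof sums over the region T.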

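Example 18.1 Verify the Divergence Theorem for the sphere $S: x^2 + y^2 + z^2 = a^2$ and the vector field $\mathbf{F} = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$.

Solution: The divergence of $\mathbf{F}$ is given by

$$\operatorname{div}\mathbf{F} \equiv \nabla\cdot\mathbf{F} = \frac{\partial x}{\partial x} + \frac{\partial y}{\partial y} + \frac{\partial z}{\partial z} = 3 .$$

Hence

$$\iiint_T \operatorname{div}\mathbf{F}\,dV = 3\iiint_T dV = 3 \times (\text{volume of sphere of radius } a) = 3\cdot\frac{4\pi}{3}a^3 = 4\pi a^3 .$$

We parametrise the surface S by spherical polar co-ordinates

$$\mathbf{r}(\theta, \varphi) = a\cos\theta\sin\varphi\,\mathbf{i} + a\sin\theta\sin\varphi\,\mathbf{j} + a\cos\varphi\,\mathbf{k}, \qquad 0 \le \theta \le 2\pi,\ 0 \le \varphi \le \pi .$$

Thus

$$\begin{aligned}
\mathbf{F} &= \mathbf{r}(\theta, \varphi),\\
\mathbf{r}_\theta &= -a\sin\theta\sin\varphi\,\mathbf{i} + a\cos\theta\sin\varphi\,\mathbf{j} + 0\,\mathbf{k},\\
\mathbf{r}_\varphi &= a\cos\theta\cos\varphi\,\mathbf{i} + a\sin\theta\cos\varphi\,\mathbf{j} - a\sin\varphi\,\mathbf{k},\\
\mathbf{N}(\theta, \varphi) &= \mathbf{r}_\varphi \times \mathbf{r}_\theta = a^2\left[\cos\theta\sin^2\varphi\,\mathbf{i} + \sin\theta\sin^2\varphi\,\mathbf{j} + \cos\varphi\sin\varphi\,\mathbf{k}\right],
\end{aligned}$$

where the order $\mathbf{r}_\varphi \times \mathbf{r}_\theta$ is chosen so that $\mathbf{N}$ points outward.

Hence

$$\begin{aligned}
\iint_S \mathbf{F}\cdot\mathbf{n}\,dS &\equiv \iint_R \mathbf{F}(\mathbf{r}(\theta, \varphi))\cdot\mathbf{N}(\theta, \varphi)\,d\varphi\,d\theta\\
&= a^3\int_{\theta=0}^{2\pi}\int_{\varphi=0}^{\pi} \left[\cos^2\theta\sin^3\varphi + \sin^2\theta\sin^3\varphi + \sin\varphi\cos^2\varphi\right]d\varphi\,d\theta\\
&= a^3\int_{\theta=0}^{2\pi}\int_{\varphi=0}^{\pi} \left[\sin^3\varphi + \sin\varphi\,(1 - \sin^2\varphi)\right]d\varphi\,d\theta\\
&= 2\pi a^3\int_{\varphi=0}^{\pi} \sin\varphi\,d\varphi = 4\pi a^3 .
\end{aligned}$$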

�

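Example 18.2 Let T be the region in $\mathbb{R}^3$ bounded by the paraboloid $z = 1 - (x^2 + y^2)$ and the xy-plane, and let its bounding surface be denoted by S. Let $\mathbf{F} = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$.

(a) Evaluate $\iiint_T \nabla\cdot\mathbf{F}\,dx\,dy\,dz$ directly.

(b) Give a parameterisation for each of the two separate surfaces of S, the top and bottom.

(c) Use (b) to compute the flux of $\mathbf{F}$ out of S, directly.

Solution:

(a) Sketch of the region

[Figure: sketch of the region T under the paraboloid, with x and y running from −1 to 1 and z from 0 to 1.]

$\mathbf{F} = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k} \Rightarrow \nabla\cdot\mathbf{F} = 3$. Thus

$$\begin{aligned}
\iiint_T \nabla\cdot\mathbf{F}\,dx\,dy\,dz &= \iiint_T 3\,dx\,dy\,dz\\
&= 3\int_{\theta=0}^{2\pi}\int_{r=0}^{1}\int_{z=0}^{1 - r^2} r\,dz\,dr\,d\theta\\
&= 6\pi\int_{r=0}^{1}\int_{z=0}^{1 - r^2} r\,dz\,dr\\
&= 6\pi\int_{r=0}^{1} r(1 - r^2)\,dr\\
&= 6\pi\left[-\frac{1}{4}(1 - r^2)^2\right]_{0}^{1}\\
&= \frac{3\pi}{2} .
\end{aligned}$$

(b)

$$\begin{aligned}
S_{\text{Top}}:\quad & \mathbf{r}(r, \theta) = r\cos\theta\,\mathbf{i} + r\sin\theta\,\mathbf{j} + (1 - r^2)\,\mathbf{k}, && r: 0 \to 1,\ \theta: 0 \to 2\pi\\
S_{\text{Bottom}}:\quad & \mathbf{r}(r, \theta) = r\cos\theta\,\mathbf{i} + r\sin\theta\,\mathbf{j} + 0\,\mathbf{k}, && r: 0 \to 1,\ \theta: 0 \to 2\pi
\end{aligned}$$

(c) To calculate the flux directly we need to calculate

$$\iint_S \mathbf{F}\cdot\mathbf{n}\,dS = \iint_{S_{\text{Bottom}}} \mathbf{F}\cdot\mathbf{n}\,dS + \iint_{S_{\text{Top}}} \mathbf{F}\cdot\mathbf{n}\,dS .$$

We consider the two surfaces separately.

Bottom: Since the bottom surface is flat the outward unit normal is $-\mathbf{k}$. Hence $\mathbf{F}\cdot\mathbf{n} = -z$. But at the bottom z = 0. Thus

$$\iint_{S_{\text{Bottom}}} \mathbf{F}\cdot\mathbf{n}\,dS = 0 .$$

Top: We need to determine the normal to the curved top surface. Hence

$$\begin{aligned}
\mathbf{r}_r &= \cos\theta\,\mathbf{i} + \sin\theta\,\mathbf{j} - 2r\,\mathbf{k},\\
\mathbf{r}_\theta &= -r\sin\theta\,\mathbf{i} + r\cos\theta\,\mathbf{j} + 0\,\mathbf{k},\\
\mathbf{N}(r, \theta) &= \mathbf{r}_r \times \mathbf{r}_\theta
= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k}\\ \cos\theta & \sin\theta & -2r\\ -r\sin\theta & r\cos\theta & 0 \end{vmatrix}
= 2r^2\cos\theta\,\mathbf{i} + 2r^2\sin\theta\,\mathbf{j} + r\,\mathbf{k} .
\end{aligned}$$

Hence

$$\begin{aligned}
\iint_{S_{\text{Top}}} \mathbf{F}\cdot\mathbf{n}\,dS &= \iint_{S_{\text{Top}}} \left(r\cos\theta\,\mathbf{i} + r\sin\theta\,\mathbf{j} + (1 - r^2)\,\mathbf{k}\right)\cdot\left(2r^2\cos\theta\,\mathbf{i} + 2r^2\sin\theta\,\mathbf{j} + r\,\mathbf{k}\right)dr\,d\theta\\
&= \int_{\theta=0}^{2\pi}\int_{r=0}^{1} \left(r + r^3\right)dr\,d\theta\\
&= 2\pi\left[\frac{1}{2}r^2 + \frac{1}{4}r^4\right]_{0}^{1}\\
&= \frac{3\pi}{2} .
\end{aligned}$$

Thus overall

$$\iint_S \mathbf{F}\cdot\mathbf{n}\,dS = \iint_{S_{\text{Bottom}}} \mathbf{F}\cdot\mathbf{n}\,dS + \iint_{S_{\text{Top}}} \mathbf{F}\cdot\mathbf{n}\,dS = 0 + \frac{3\pi}{2} = \frac{3\pi}{2} .$$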

�

LECTURE 18

Example 18.3 Consider the surface integral $\iint_S (\nabla\times\mathbf{F})\cdot\mathbf{n}\,dS$, where S is an open cylinder, consisting of a base and side such that $x^2 + y^2 = 1$, $0 \le z \le 2$, $\mathbf{F} = xz\,\mathbf{i} + x\,\mathbf{j} + \tfrac{1}{2}y^2\,\mathbf{k}$ and $\mathbf{n}$ is the outward unit normal. Determine at least 3 different ways in which to evaluate the integral. Try all of these ways of evaluating the integral.

Solution:


�



Example 18.4 Prove $\iiint_T \nabla f\,dV = \iint_S f\,\mathbf{n}\,dS$ where f is a scalar function.

Solution:

�


Chapter 19

Fourier Series

LECTURE 19

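In MATH 1231, we discovered how a basis set of vectors (linearly independent and spanning) for a vector space in $\mathbb{R}^n$ could be used to uniquely create any vector in $\mathbb{R}^n$.

Example 19.1 Every vector $\mathbf{x} = (x_1, \ldots, x_n)^T \in \mathbb{R}^n$ can be written as $\mathbf{x} = x_1\mathbf{e}_1 + \cdots + x_n\mathbf{e}_n$. This expresses $\mathbf{x}$ as a linear combination of the orthonormal basis set $\{\mathbf{e}_1, \ldots, \mathbf{e}_n\}$, where

$$\mathbf{e}_1 = \begin{pmatrix} 1\\ 0\\ \vdots\\ 0 \end{pmatrix}, \quad
\mathbf{e}_2 = \begin{pmatrix} 0\\ 1\\ \vdots\\ 0 \end{pmatrix}, \quad \ldots, \quad
\mathbf{e}_n = \begin{pmatrix} 0\\ \vdots\\ 0\\ 1 \end{pmatrix} .$$

NOTE. The components of $\mathbf{x}$, i.e., $x_i$, i = 1, 2, . . ., are unique and in this example can be determined using $x_i = \mathbf{x}\cdot\mathbf{e}_i$, i = 1, 2, . . .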

We also saw in MATH 1231 that the infinite set $\{1, x, x^2, x^3, \ldots\}$ could be used to create a Maclaurin series for an appropriate function f, i.e.,

$$f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!}\,x^n$$

where $f^{(n)}(0)$ is the nth derivative of f evaluated at x = 0.

19.1 Orthogonal functions

The real-valued functions on the interval [a, b] form a vector space in the usual sense: addition of two functions and multiplication of a function by a scalar are defined pointwise:

$$(f + g)(x) = f(x) + g(x), \qquad (\alpha f)(x) = \alpha f(x),$$

where f, g are real-valued functions on [a, b] and α ∈ R is a scalar.

Definition 19.1 Let w be a positive function on [a, b]. For functions f, g defined on [a, b] we define the inner product (associated) with the weight function w by

$$\langle f, g\rangle_w = \int_a^b f(x)\,g(x)\,w(x)\,dx .$$

Definition 19.2 For the weight function w(x) = 1 for all x ∈ [a, b] we call the associated inner product just the inner product and write

$$\langle f, g\rangle = \int_a^b f(x)\,g(x)\,dx .$$


Definition 19.3 The inner product with the weight function w induces a norm via

$$\|f\|_w = \sqrt{\langle f, f\rangle_w} = \left(\int_a^b |f(x)|^2\,w(x)\,dx\right)^{1/2} .$$

The norm associated with the inner product (with the weight function w ≡ 1) is denoted simply by $\|f\| = \|f\|_{w\equiv 1}$.

The inner product associated with the weight function w and the corresponding norm have the usual properties of any other inner product and norm (think of the Euclidean scalar product and length on vectors in $\mathbb{R}^n$):

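Theorem 19.1 For any real-valued functions f, g, h on [a, b] and any constant λ:

(a) $\langle f, g\rangle_w = \langle g, f\rangle_w$.

(b) $\langle f + g, h\rangle_w = \langle f, h\rangle_w + \langle g, h\rangle_w$.

(c) $\langle h, f + g\rangle_w = \langle h, f\rangle_w + \langle h, g\rangle_w$.

(d) $\langle \lambda f, g\rangle_w = \lambda\langle f, g\rangle_w = \langle f, \lambda g\rangle_w$.

(e) $\langle f, f\rangle_w \ge 0$ for all f, and $\langle f, f\rangle_w = \|f\|_w^2 = 0$ only if f ≡ 0 (positive definiteness).

(f) $|\langle f, g\rangle_w| \le \|f\|_w\,\|g\|_w$ (Cauchy-Schwarz inequality).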

NOTE. For purposes of integration we are always allowed to change a function at a finite number of points; this will not change the values of any integrals involving the function. Thus a function being zero except at a finite number of points is for integration purposes ‘identical’ to the zero function. This observation implies that a weight function w can be zero at a finite number of points (instead of being positive everywhere on [a, b]).

Definition 19.4 We say that f and g are orthogonal on the interval [a, b] with respect to the weight function w if

$$\langle f, g\rangle_w = \int_a^b f(x)\,g(x)\,w(x)\,dx = 0 .$$

Again we say that they are just orthogonal if the weight function is w ≡ 1.

Definition 19.5 Let $\{\Phi_1, \Phi_2, \Phi_3, \ldots\}$ denote a (finite or infinite) set of functions defined on [a, b]. The set of functions is called an orthogonal set (with respect to the weight function w) on the interval [a, b] if and only if

$$\langle \Phi_n, \Phi_m\rangle_w = \begin{cases} 0 & \text{if } m \ne n,\\ \lambda_n > 0 & \text{if } m = n. \end{cases}$$

Definition 19.6 An orthogonal set $\{\Phi_1, \Phi_2, \Phi_3, \ldots\}$ (with respect to a weight function w) on [a, b] is called an orthonormal set if

$$\|\Phi_m\|_w = 1 \quad \text{for all } m = 1, 2, 3, \ldots .$$

Definition 19.7 An orthogonal set $\{\Phi_1, \Phi_2, \Phi_3, \ldots\}$ (with respect to a weight function w) on [a, b] is called a complete orthogonal set in a space V of functions on [a, b] if for any f ∈ V the conditions

$$\langle f, \Phi_m\rangle_w = 0 \quad \text{for all } m = 1, 2, 3, \ldots$$

imply that f ≡ 0.

Example 19.2 Show that the functions $\Phi_m = \sin(mx)$, m = 1, 2, 3, . . . form an orthogonal set (with respect to the unit weight function, i.e., w(x) = 1) on the interval [−π, π].

Solution:


�

Example 19.3 Show that the trigonometric functions $\{\Phi_0, \Phi_1, \Psi_1, \Phi_2, \Psi_2, \ldots\}$, given by

$$\Phi_n(x) = \cos(nx),\ n = 0, 1, 2, \ldots, \qquad \Psi_m(x) = \sin(mx),\ m = 1, 2, 3, \ldots,$$

i.e., $\{1, \cos x, \sin x, \cos 2x, \sin 2x, \cos 3x, \sin 3x, \ldots\}$, form an orthogonal set (with respect to the unit weight function, i.e., w(x) = 1) on the interval [−π, π].

Solution: This means that

$$\begin{aligned}
\langle \Phi_n, \Psi_m\rangle &= 0 \quad \text{for all } n = 0, 1, 2, \ldots,\ m = 1, 2, 3, \ldots,\\
\langle \Phi_n, \Phi_m\rangle &= 0 \quad \text{whenever } n \ne m,\\
\langle \Psi_n, \Psi_m\rangle &= 0 \quad \text{whenever } n \ne m.
\end{aligned}$$

Figure 19.1: The trigonometric functions $\Phi_n(x) = \cos(nx)$, n = 0, 1, 2 in the left picture and the trigonometric functions $\Psi_m(x) = \sin(mx)$, m = 1, 2, 3 in the right picture.



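We now indicate how to show the orthogonality relations. The functions cos and sin satisfy the following sum-to-product formulas:

$$\begin{aligned}
\sin\alpha + \sin\beta &= 2\sin\!\left(\tfrac{1}{2}(\alpha + \beta)\right)\cos\!\left(\tfrac{1}{2}(\alpha - \beta)\right), && (19.1)\\
\sin\alpha - \sin\beta &= 2\sin\!\left(\tfrac{1}{2}(\alpha - \beta)\right)\cos\!\left(\tfrac{1}{2}(\alpha + \beta)\right), && (19.2)\\
\cos\alpha + \cos\beta &= 2\cos\!\left(\tfrac{1}{2}(\alpha + \beta)\right)\cos\!\left(\tfrac{1}{2}(\alpha - \beta)\right), && (19.3)\\
\cos\alpha - \cos\beta &= 2\sin\!\left(\tfrac{1}{2}(\alpha + \beta)\right)\sin\!\left(\tfrac{1}{2}(\beta - \alpha)\right). && (19.4)
\end{aligned}$$

To show $\langle \Phi_n, \Psi_m\rangle = 0$ for all n = 0, 1, 2, . . . and all m = 1, 2, 3, . . ., we use (19.2): set in (19.2)

$$\tfrac{1}{2}(\alpha - \beta) = mx,\quad \tfrac{1}{2}(\alpha + \beta) = nx \qquad\Longleftrightarrow\qquad \alpha = (m + n)x,\quad \beta = (n - m)x .$$

Then

$$\begin{aligned}
\langle \Phi_n, \Psi_m\rangle &= \int_{-\pi}^{\pi} \cos(nx)\sin(mx)\,dx\\
&= \int_{-\pi}^{\pi} \frac{1}{2}\Big(\sin\big((n + m)x\big) - \sin\big((n - m)x\big)\Big)\,dx\\
&= \frac{1}{2}\int_{-\pi}^{\pi} \sin\big((n + m)x\big)\,dx - \frac{1}{2}\int_{-\pi}^{\pi} \sin\big((n - m)x\big)\,dx\\
&= 0 .
\end{aligned}$$

The last statement follows from the fact that for all ℓ ∈ Z

$$\int_{-\pi}^{\pi} \sin(\ell x)\,dx = \begin{cases} 0 & \text{if } \ell = 0,\\[4pt] \left[-\dfrac{1}{\ell}\cos(\ell x)\right]_{-\pi}^{\pi} = 0 & \text{if } \ell \ne 0. \end{cases}$$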

The other orthogonality relations are verified in a similar way. �
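The orthogonality relations can also be spot-checked numerically. The sketch below is only an illustration: `inner` approximates the inner product with unit weight on [−π, π] by a midpoint rule, and the helper names `phi` and `psi` are assumptions mirroring $\Phi_n$ and $\Psi_m$.

```python
import math

def inner(f, g, a=-math.pi, b=math.pi, n=2000):
    """Midpoint-rule approximation of <f, g> = integral of f(x) g(x) over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) * g(a + (i + 0.5) * h) for i in range(n))

def phi(n):
    return lambda x: math.cos(n * x)   # Phi_n(x) = cos(nx)

def psi(m):
    return lambda x: math.sin(m * x)   # Psi_m(x) = sin(mx)

mixed = inner(phi(2), psi(3))          # expected to vanish
cos_cos = inner(phi(1), phi(4))        # expected to vanish (n != m)
sin_sin = inner(psi(2), psi(5))        # expected to vanish (n != m)
norm_sq_phi0 = inner(phi(0), phi(0))   # expected to be 2*pi
norm_sq_psi3 = inner(psi(3), psi(3))   # expected to be pi
print(mixed, cos_cos, sin_sin, norm_sq_phi0, norm_sq_psi3)
```

The squared norms printed at the end are not 1, which is the numerical counterpart of the observation that the set is orthogonal but not orthonormal.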

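Are the trigonometric functions also orthonormal?

$$\|\Phi_0\|^2 = \int_{-\pi}^{\pi} \cos^2(0\,x)\,dx = \int_{-\pi}^{\pi} 1\,dx = 2\pi .$$

For $\Phi_n$, n = 1, 2, 3, . . ., we use (19.3) with $nx = \tfrac{1}{2}(\alpha + \beta)$ and $nx = \tfrac{1}{2}(\alpha - \beta)$, that is α = 2nx and β = 0.

$$\begin{aligned}
\|\Phi_n\|^2 &= \int_{-\pi}^{\pi} \cos^2(nx)\,dx = \int_{-\pi}^{\pi} \cos(nx)\cos(nx)\,dx\\
&= \frac{1}{2}\int_{-\pi}^{\pi} \big(\cos(2nx) + \cos(0)\big)\,dx\\
&= \frac{1}{2}\int_{-\pi}^{\pi} \cos(2nx)\,dx + \frac{1}{2}\int_{-\pi}^{\pi} 1\,dx\\
&= \frac{1}{2}\left[\frac{1}{2n}\sin(2nx)\right]_{-\pi}^{\pi} + \pi\\
&= \frac{1}{4n}\big(\sin(2n\pi) - \sin(-2n\pi)\big) + \pi = 0 + \pi = \pi .
\end{aligned}$$

For $\Psi_n$, n ∈ N, we use (19.4) with $nx = \tfrac{1}{2}(\alpha + \beta)$ and $nx = \tfrac{1}{2}(\beta - \alpha)$, that is α = 0 and β = 2nx, and obtain

$$\|\Psi_n\|^2 = \pi .$$

Clearly, the set $\{\Phi_0, \Phi_1, \Psi_1, \Phi_2, \Psi_2, \ldots\}$ is not orthonormal, but by dividing each of the functions through by its norm we obtain an orthonormal set: the set $\{\hat\Phi_0, \hat\Phi_1, \hat\Psi_1, \hat\Phi_2, \hat\Psi_2, \ldots\}$, given by

$$\begin{aligned}
\hat\Phi_0(x) &= \frac{\Phi_0(x)}{\|\Phi_0\|} = \frac{1}{\sqrt{2\pi}}\cos(0x) = \frac{1}{\sqrt{2\pi}},\\
\hat\Phi_n(x) &= \frac{\Phi_n(x)}{\|\Phi_n\|} = \frac{1}{\sqrt{\pi}}\cos(nx), \quad n = 1, 2, 3, \ldots\\
\hat\Psi_n(x) &= \frac{\Psi_n(x)}{\|\Psi_n\|} = \frac{1}{\sqrt{\pi}}\sin(nx), \quad n = 1, 2, 3, \ldots
\end{aligned}$$

is an orthonormal set.

The orthogonal set $\{\Phi_0, \Phi_1, \Psi_1, \Phi_2, \Psi_2, \ldots\}$ of trigonometric functions is complete in the space of continuous functions on [−π, π] and in the space of those functions on [−π, π] for which the integral $\|f\|^2 = \int_{-\pi}^{\pi} |f(x)|^2\,dx$ is finite, i.e., < ∞.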



LECTURE 20

19.2 Fourier series of trigonometric functions

19.2.1 Definition of a trigonometric Fourier series

Definition 19.8 A function f, defined on R, is said to be periodic if there is a number T > 0 such that

$$f(x + T) = f(x) \quad \text{for all } x \in \mathbb{R}. \qquad (19.5)$$

The smallest positive number T for which (19.5) is satisfied is called the period of the function f, and we also say that f is T-periodic.

Geometrically, periodicity of a function f (with period T) means that the function is completely described through its values on any interval [a, a + T], where a ∈ R is arbitrary, and that f(a) = f(a + T).

NOTE. We choose the smallest number T for which (19.5) is satisfied. It is clear that (19.5) is then also satisfied for any nT, n ∈ Z, that is, if T is the period of f then also for all n ∈ Z

$$f(x + nT) = f(x) \quad \text{for all } x \in \mathbb{R}.$$

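Example 19.4 We know that cos(x) and sin(x) have the period 2π. Therefore the trigonometric functions

$$\Phi_n(x) = \cos(nx),\ n = 1, 2, 3, \ldots, \qquad \Psi_n(x) = \sin(nx),\ n = 1, 2, 3, \ldots,$$

have the period $T_n = 2\pi/n$. To see this, write for example

$$\Phi_n(x) = \cos(nx) = \cos\!\left(2\pi\left(\frac{n}{2\pi}x\right)\right) .$$

The trigonometric functions $\Phi_0, \Phi_1, \Psi_1, \Phi_2, \Psi_2, \ldots$ on [−L, L], given by

$$\begin{aligned}
\Phi_n(x) &= \cos\!\left(\frac{n\pi}{L}x\right), \quad n = 0, 1, 2, \ldots,\\
\Psi_n(x) &= \sin\!\left(\frac{n\pi}{L}x\right), \quad n = 1, 2, 3, \ldots,
\end{aligned} \qquad (19.6)$$

are an orthogonal set on [−L, L]. This follows easily from the orthogonality in the special case L = π: for example, for m, n ∈ N

$$\begin{aligned}
\langle \Phi_n, \Phi_m\rangle &= \int_{-L}^{L} \cos\!\left(\frac{n\pi}{L}x\right)\cos\!\left(\frac{m\pi}{L}x\right)dx\\
&= \frac{L}{\pi}\int_{-\pi}^{\pi} \cos(ny)\cos(my)\,dy\\
&= \begin{cases} 0 & \text{if } n \ne m,\\[4pt] \dfrac{L}{\pi}\,\pi = L & \text{if } n = m, \end{cases}
\end{aligned}$$

where we have made the substitution y = (π/L)x.

Analogously, we can prove the other orthogonality relations and also compute the norms; we obtain

$$\begin{aligned}
\|\Phi_0\|^2 &= \int_{-L}^{L} \cos^2\!\left(\frac{0\pi}{L}x\right)dx = \int_{-L}^{L} 1\,dx = 2L,\\
\|\Phi_n\|^2 &= \int_{-L}^{L} \cos^2\!\left(\frac{n\pi}{L}x\right)dx = L, \quad n = 1, 2, 3, \ldots\\
\|\Psi_n\|^2 &= \int_{-L}^{L} \sin^2\!\left(\frac{n\pi}{L}x\right)dx = L, \quad n = 1, 2, 3, \ldots .
\end{aligned} \qquad (19.7)$$

The set of trigonometric functions on the interval [−L, L] is also complete in the sense that $\langle f, \Phi_n\rangle = 0$ for all n = 0, 1, 2, . . . and $\langle f, \Psi_n\rangle = 0$ for all n = 1, 2, 3, . . . implies that f ≡ 0. (This is true for continuous f and more generally if f only satisfies $\int_{-L}^{L} |f(x)|^2\,dx < \infty$.)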



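Definition 19.9 The Fourier series F of a function f on [−L, L] (with ‖f‖ < ∞) with respect to the complete orthogonal set of trigonometric functions on [−L, L] is given by

$$F(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} \left\{a_n\cos\!\left(\frac{n\pi}{L}x\right) + b_n\sin\!\left(\frac{n\pi}{L}x\right)\right\}, \qquad (19.8)$$

with the Fourier coefficients $a_n$ and $b_n$ given by

$$a_n = \frac{1}{L}\int_{-L}^{L} f(x)\cos\!\left(\frac{n\pi}{L}x\right)dx, \quad n = 0, 1, 2, \ldots, \qquad (19.9)$$

$$b_n = \frac{1}{L}\int_{-L}^{L} f(x)\sin\!\left(\frac{n\pi}{L}x\right)dx, \quad n = 1, 2, 3, \ldots . \qquad (19.10)$$

NOTE.

• The first term $a_0/2$ is the coefficient of the constant function $\Phi_0(x) = \cos(0\pi x/L) = 1$. Therefore the function $\Phi_0$ does not appear explicitly because its value is 1. Note too that we have to scale the coefficient $a_0$ by the factor 1/2.

• The formulas (19.9) and (19.10) are obtained by demanding that for all m

$$\langle f, \Phi_m\rangle = \left\langle \frac{a_0}{2} + \sum_{n=1}^{\infty} \left\{a_n\cos\!\left(\frac{n\pi}{L}x\right) + b_n\sin\!\left(\frac{n\pi}{L}x\right)\right\},\ \Phi_m\right\rangle, \qquad (19.11)$$

$$\langle f, \Psi_m\rangle = \left\langle \frac{a_0}{2} + \sum_{n=1}^{\infty} \left\{a_n\cos\!\left(\frac{n\pi}{L}x\right) + b_n\sin\!\left(\frac{n\pi}{L}x\right)\right\},\ \Psi_m\right\rangle. \qquad (19.12)$$

Computing (19.11) explicitly yields

$$\begin{aligned}
\langle f, \Phi_m\rangle &= \left\langle \frac{a_0}{2}\Phi_0 + \sum_{n=1}^{\infty} \left\{a_n\Phi_n + b_n\Psi_n\right\},\ \Phi_m\right\rangle\\
&= \frac{a_0}{2}\langle \Phi_0, \Phi_m\rangle + \sum_{n=1}^{\infty} \left\{a_n\langle \Phi_n, \Phi_m\rangle + b_n\langle \Psi_n, \Phi_m\rangle\right\}\\
&= \left.\begin{cases} \dfrac{a_0}{2}\|\Phi_0\|^2 = \dfrac{a_0}{2}\,2L & \text{if } m = 0\\[6pt] a_m\|\Phi_m\|^2 = a_m L & \text{if } m \ne 0 \end{cases}\right\} = a_m L,
\end{aligned}$$

where in the last step we have used the orthogonality and (19.7). Thus we obtain (19.9), and a similar computation for (19.12) yields (19.10).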

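Example 19.5 (Square Wave) Determine the Fourier Series for the function f, defined by

$$f(x) = \begin{cases} -K & \text{if } -\pi < x < 0,\\ K & \text{if } 0 \le x \le \pi, \end{cases}$$

where K > 0, and f(x + 2π) = f(x) for all x ∈ R, i.e., f is 2π-periodic.

Figure 19.2: The square wave with K = 1.

Solution: We compute the Fourier coefficients of f,

$$\begin{aligned}
a_0 &= \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos(0x)\,dx = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\,dx\\
&= \frac{1}{\pi}\left(\int_{-\pi}^{0} (-K)\,dx + \int_{0}^{\pi} K\,dx\right)\\
&= \frac{1}{\pi}(-\pi K + \pi K) = 0 .
\end{aligned}$$

For n ≥ 1 we have

$$\begin{aligned}
a_n &= \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos(nx)\,dx\\
&= \frac{1}{\pi}\left(\int_{-\pi}^{0} (-K)\cos(nx)\,dx + \int_{0}^{\pi} K\cos(nx)\,dx\right)\\
&= \frac{1}{\pi}\left(-\frac{K}{n}\sin(nx)\Big|_{x=-\pi}^{x=0} + \frac{K}{n}\sin(nx)\Big|_{x=0}^{x=\pi}\right) = 0
\end{aligned}$$

and

$$\begin{aligned}
b_n &= \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin(nx)\,dx\\
&= \frac{1}{\pi}\left(\int_{-\pi}^{0} (-K)\sin(nx)\,dx + \int_{0}^{\pi} K\sin(nx)\,dx\right)\\
&= \frac{1}{\pi}\left(\frac{K}{n}\cos(nx)\Big|_{x=-\pi}^{x=0} - \frac{K}{n}\cos(nx)\Big|_{x=0}^{x=\pi}\right)\\
&= \frac{1}{\pi}\frac{K}{n}\big(\cos(0) - \cos(-n\pi) - \cos(n\pi) + \cos(0)\big)\\
&= \frac{2K}{n\pi}\big(1 - \cos(n\pi)\big)\\
&= \frac{2K}{n\pi}\big(1 - (-1)^n\big) = \begin{cases} 0 & \text{if } n \text{ is even},\\[4pt] \dfrac{4K}{n\pi} & \text{if } n \text{ is odd}. \end{cases}
\end{aligned}$$

Thus we obtain the Fourier series of f, given by

$$\begin{aligned}
F(x) &= \frac{4K}{\pi}\left(\sin(x) + \frac{1}{3}\sin(3x) + \frac{1}{5}\sin(5x) + \cdots\right)\\
&= \frac{4K}{\pi}\sum_{k=0}^{\infty} \frac{1}{2k + 1}\sin\big((2k + 1)x\big).
\end{aligned} \qquad (19.13)$$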

�
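The coefficients found in Example 19.5 can be reproduced numerically from the defining integrals (19.9) and (19.10) with L = π. This is a rough sketch rather than a definitive implementation: the helper `coefficient`, the choice K = 1, and the number of sample points are all assumptions made for illustration.

```python
import math

K = 1.0

def f(x):
    # One period of the square wave on (-pi, pi]
    return -K if x < 0 else K

def coefficient(trig, n, samples=20000):
    """Midpoint estimate of (1/pi) * integral of f(x) * trig(n x) over [-pi, pi].
    An even number of samples keeps the jump at x = 0 on a cell boundary."""
    h = 2 * math.pi / samples
    total = 0.0
    for i in range(samples):
        x = -math.pi + (i + 0.5) * h
        total += f(x) * trig(n * x)
    return total * h / math.pi

a = [coefficient(math.cos, n) for n in range(0, 6)]   # a_0, ..., a_5
b = [coefficient(math.sin, n) for n in range(1, 6)]   # b_1, ..., b_5
print(a)
print(b)
print([4 * K / (n * math.pi) if n % 2 else 0.0 for n in range(1, 6)])
```

The cosine coefficients come out numerically zero, and the sine coefficients match $4K/(n\pi)$ for odd n and vanish for even n, in line with the computation above.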

NOTE. The square wave at the point x = 0 has the value f(0) = K. However, for the Fourier series we obtain the value zero, as all the sine functions in the sum vanish at x = 0. Clearly the function and corresponding Fourier series need not have the same value at all points x.

19.2.2 Convergence of a trigonometric Fourier series

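As the trigonometric functions form a complete orthogonal set, for any f with $\|f\| = \left(\int_{-L}^{L} |f(x)|^2\,dx\right)^{1/2} < \infty$ we have

$$\lim_{N\to\infty} \left\| f - \left(\frac{a_0}{2} + \sum_{n=1}^{N} \left\{a_n\cos\!\left(\frac{n\pi}{L}x\right) + b_n\sin\!\left(\frac{n\pi}{L}x\right)\right\}\right)\right\|^2 = 0, \qquad (19.14)$$

or equivalently

$$\begin{aligned}
&\left\| f - \left(\frac{a_0}{2} + \sum_{n=1}^{\infty} a_n\cos\!\left(\frac{n\pi}{L}x\right) + b_n\sin\!\left(\frac{n\pi}{L}x\right)\right)\right\|^2\\
&\qquad = \int_{-L}^{L} \left| f(x) - \left(\frac{a_0}{2} + \sum_{n=1}^{\infty} a_n\cos\!\left(\frac{n\pi}{L}x\right) + b_n\sin\!\left(\frac{n\pi}{L}x\right)\right)\right|^2 dx = 0.
\end{aligned} \qquad (19.15)$$

In words we say the Fourier series of f converges in the mean to f if (19.14) and (19.15) are satisfied. However, (19.14) and (19.15) do not tell us anything about pointwise convergence of the series toward f.

The identity (19.15) implies that (use the definition $\|g\|^2 = \langle g, g\rangle$, take the inner product, and make use of the orthogonality and (19.7))

$$\begin{aligned}
\|f\|^2 &= \left(\frac{a_0}{2}\right)^2 \|\Phi_0\|^2 + \sum_{n=1}^{\infty} \left\{a_n^2\,\|\Phi_n\|^2 + b_n^2\,\|\Psi_n\|^2\right\}\\
&= \frac{a_0^2}{4}\,2L + L\sum_{n=1}^{\infty} \left(a_n^2 + b_n^2\right),
\end{aligned}$$

or equivalently

$$\frac{1}{L}\|f\|^2 = \frac{1}{L}\int_{-L}^{L} |f(x)|^2\,dx = \frac{a_0^2}{2} + \sum_{n=1}^{\infty} \left(a_n^2 + b_n^2\right). \qquad (19.16)$$

The identity (19.16) is known as Parseval’s identity.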



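Example 19.6 From Example 19.5 for the square wave we compute the norm of f and express it as a sum of the Fourier coefficients of f with the help of Parseval’s identity

$$\|f\| = \left(\int_{-\pi}^{\pi} |f(x)|^2\,dx\right)^{1/2} = \left(K^2\int_{-\pi}^{\pi} 1\,dx\right)^{1/2} = \sqrt{2\pi}\,K . \qquad (19.17)$$

From Parseval’s identity we obtain

$$\|f\|^2 = \pi\left(\frac{4K}{\pi}\right)^2\left(1 + \frac{1}{3^2} + \frac{1}{5^2} + \ldots\right) = \frac{16K^2}{\pi}\sum_{k=0}^{\infty} \frac{1}{(2k + 1)^2} . \qquad (19.18)$$

From (19.17) and (19.18) together we learn that

$$\frac{16K^2}{\pi}\sum_{k=0}^{\infty} \frac{1}{(2k + 1)^2} = 2\pi K^2 \quad\Longleftrightarrow\quad \sum_{k=0}^{\infty} \frac{1}{(2k + 1)^2} = \frac{\pi^2}{8} .$$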
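The value of the series obtained from Parseval’s identity can be compared with a direct partial sum. A minimal sketch, assuming only the Python standard library; the function name and the number of terms are arbitrary choices for illustration.

```python
import math

def odd_reciprocal_squares(terms):
    """Partial sum of 1/(2k+1)^2 for k = 0, ..., terms - 1."""
    return sum(1.0 / (2 * k + 1) ** 2 for k in range(terms))

partial = odd_reciprocal_squares(100000)
target = math.pi ** 2 / 8
print(partial, target)
```

The partial sum approaches $\pi^2/8$ from below, with a remainder that shrinks roughly like one over four times the number of terms.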

Theorem 19.2 (Uniqueness of the Fourier series) If f with ‖f‖ < ∞ has a series expansion in the trigonometric functions that converges in the mean toward the function, then this series expansion is the trigonometric Fourier series.

Among all functions of the form

$$P_N(x) = \frac{a_0}{2} + \sum_{n=1}^{N} \left\{a_n\Phi_n(x) + b_n\Psi_n(x)\right\}, \qquad (19.19)$$

with arbitrary real coefficients $a_0, \ldots, a_N$ and $b_1, \ldots, b_N$, the one that minimises

$$\|f - P_N\| = \left(\int_{-L}^{L} |f(x) - P_N(x)|^2\,dx\right)^{1/2}$$

(for a given function f with ‖f‖ < ∞) is the one where the coefficients $a_n$ and $b_n$ are the Fourier coefficients of f. In that case $P_N$ is the N-th partial sum of the Fourier series of f.

We call any function of the form (19.19) a trigonometric polynomial. The sequence $\{S_N\}$ of partial sums $S_N$ of the Fourier series of f,

$$S_N(x) = \frac{a_0}{2} + \sum_{n=1}^{N} \left\{a_n\Phi_n(x) + b_n\Psi_n(x)\right\},$$

is thus a sequence of trigonometric polynomials of increasing degree.

That $S_N$ minimises the norm $\|f - P_N\|$ among all trigonometric polynomials of the form (19.19) means that $P_N = S_N$ is the best approximation among all trigonometric polynomials $P_N$ of degree at most N.

LECTURE 21

Definition 19.10 A function f defined on [a, b] is piecewise continuous if it is continuous on [a, b] except at a finite number of points

$$x_i, \quad i = 1, 2, \ldots, k,$$

and if both the right-hand and left-hand limits

$$f(x_i^+) = \lim_{\epsilon\to 0,\ \epsilon>0} f(x_i + \epsilon), \qquad f(x_i^-) = \lim_{\epsilon\to 0,\ \epsilon>0} f(x_i - \epsilon)$$

exist (and are finite) at each point of discontinuity $x_i$, i = 1, . . . , k.

Figure 19.3: The left picture shows the signum function (see below) on [−5, 5]. The middle picture shows a piecewise continuous function, whereas the function in the right picture is not piecewise continuous as f(0+) = ∞.



Example 19.7 The signum (or sign) function, defined by

$$f(x) = \operatorname{sgn}(x) = \begin{cases} +1 & \text{if } x > 0,\\ 0 & \text{if } x = 0,\\ -1 & \text{if } x < 0, \end{cases}$$

is continuous everywhere except at the point x = 0, but

$$f(0^+) = \lim_{\epsilon\to 0,\ \epsilon>0} f(0 + \epsilon) = 1, \qquad f(0^-) = \lim_{\epsilon\to 0,\ \epsilon>0} f(0 - \epsilon) = -1$$

exist. Therefore, the signum function is piecewise continuous.

Definition 19.11 A sequence of functions $\{f_n\}_{n\in\mathbb{N}}$ is said to converge pointwise toward a function f, if for all x

$$\lim_{n\to\infty} f_n(x) = f(x).$$

Theorem 19.3 Let f be a function on [−L, L], which is piecewise continuous and whose first derivative f′ is also piecewise continuous. Then the Fourier series of f converges pointwise to

$$\begin{cases} f(x) & \text{if } f \text{ is continuous at } x \in (-L, L),\\[4pt] \tfrac{1}{2}\big(f(x^+) + f(x^-)\big) & \text{if } f \text{ is discontinuous at } x \in (-L, L),\\[4pt] \tfrac{1}{2}\big(f(-L^+) + f(+L^-)\big) & \text{if } x = \pm L. \end{cases}$$

NOTE. The Fourier series of any function f fulfilling the assumptions always converges to a function periodic on [−L, L], as at the end points x = ±L the Fourier series converges toward the value

$$\frac{1}{2}\big(f(-L^+) + f(+L^-)\big).$$

Example 19.8 (Square Wave) The square wave

$$f(x) = \begin{cases} -K & \text{if } -\pi < x < 0,\\ K & \text{if } 0 \le x \le \pi, \end{cases}$$

is obviously piecewise continuous, as at the point of discontinuity x = 0 we have f(0+) = K and f(0−) = −K. Its derivative is given by f′(x) = 0 for all x ∈ [−π, π] \ {−π, 0, π}. Clearly f′(−π+) = 0, f′(0−) = 0, f′(0+) = 0, and f′(+π−) = 0. Thus the square wave has a derivative that is piecewise continuous.

Figure 19.4: The square wave and the first three partial sums of the square wave, with K = 1: the square wave f (solid), $S_0$ (dashed), $S_1$ (dotted), and $S_2$ (bold).

Due to the Theorem, the Fourier series (19.13) of the square wave converges pointwise to

$$\begin{cases} f(x) & \text{if } x \in (-\pi, 0) \text{ or if } x \in (0, \pi),\\ 0 & \text{if } x \in \{-\pi, 0, \pi\}. \end{cases}$$

In Figure 19.4 we show the first partial sums $S_0$, $S_1$, and $S_2$ of the square wave, defined by

$$S_N(x) = \frac{4K}{\pi}\sum_{k=0}^{N} \frac{1}{2k + 1}\sin\big((2k + 1)x\big).$$
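The pointwise behaviour described by Theorem 19.3 can be observed by evaluating the partial sums directly. The sketch below is illustrative only: `partial_sum` implements the formula for $S_N$ with an assumed default K = 1, and the evaluation points x = 0 (the jump) and x = π/2 (a point of continuity) are chosen for convenience.

```python
import math

def partial_sum(x, N, K=1.0):
    """S_N(x) = (4K/pi) * sum over k = 0..N of sin((2k+1)x) / (2k+1)."""
    return 4 * K / math.pi * sum(
        math.sin((2 * k + 1) * x) / (2 * k + 1) for k in range(N + 1)
    )

at_jump = partial_sum(0.0, 50)                                  # every term vanishes at x = 0
inside = [partial_sum(math.pi / 2, N) for N in (10, 100, 1000)]  # should approach K = 1
print(at_jump, inside)
```

At the jump the partial sums equal the average of the one-sided limits K and −K, while at x = π/2 they approach f(π/2) = K as N grows.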

19.2.3 Fourier sine and cosine series

Definition 19.12 A function f defined on [−L, L] (on R) is called an even function if

f(−x) = f(x) for all x ∈ [−L, L] (for all x ∈ R).

Definition 19.13 A function f defined on [−L, L] (on R) is called an odd function if

f(−x) = −f(x) for all x ∈ [−L, L] (for all x ∈ R).



Example 19.9 (odd and even functions)

(a) cos(nx) is an even function on R, because cos(−nx) = cos(nx).

(b) sin(nx) is an odd function on R, because sin(−nx) = − sin(nx).

(c) The signum function is an odd function on $\mathbb{R}$ because $\operatorname{sgn}(-x) = -\operatorname{sgn}(x)$ for all $x \in \mathbb{R}$.

(d) The polynomial $x^n$ satisfies $(-x)^n = (-1)^n x^n$ for all $x \in \mathbb{R}$ and thus is an even function on $\mathbb{R}$ if $n$ is even and an odd function if $n$ is odd.

NOTE. Geometric interpretation of even and odd. For an even function $f$, reflecting the graph of $f(x)$, $x \ge 0$, in the $y$-axis yields the graph of $f(x)$, $x \le 0$. For an odd function $f$, rotating the graph of $f(x)$, $x \ge 0$, by $180^\circ$ about the origin yields the graph of $f(x)$, $x \le 0$.

Definition 19.14 For a function $f$ on $[-L, L]$, with $\|f\| < \infty$, the Fourier cosine series is defined by

$$\frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos\Big(\frac{n\pi}{L}x\Big),$$

with the Fourier coefficients given by

$$a_n = \frac{1}{L} \int_{-L}^{L} f(x) \cos\Big(\frac{n\pi}{L}x\Big)\,dx, \quad n = 0, 1, 2, \ldots.$$

Definition 19.15 For a function $f$ on $[-L, L]$, with $\|f\| < \infty$, the Fourier sine series is defined by

$$\sum_{n=1}^{\infty} b_n \sin\Big(\frac{n\pi}{L}x\Big),$$

with the Fourier coefficients given by

$$b_n = \frac{1}{L} \int_{-L}^{L} f(x) \sin\Big(\frac{n\pi}{L}x\Big)\,dx, \quad n = 1, 2, 3, \ldots.$$

The Fourier cosine series and the Fourier sine series are the Fourier series with respect to the orthogonal sets $\{\Phi_0, \Phi_1, \ldots\}$ and $\{\Psi_1, \Psi_2, \ldots\}$, respectively. The sum of the Fourier cosine series and the Fourier sine series of a function $f$ is the trigonometric Fourier series of the function.

In general we cannot expect that for a function $f$ (satisfying the assumptions in Theorem 19.3) the Fourier sine series or the Fourier cosine series converges to the function, but the following two statements hold:

Theorem 19.4 Suppose $f$, with $\|f\| < \infty$, is an even function. Then the trigonometric Fourier series of $f$ is the Fourier cosine series of $f$.

Theorem 19.5 Suppose $f$, with $\|f\| < \infty$, is an odd function. Then the trigonometric Fourier series of $f$ is the Fourier sine series of $f$.

Proof of the theorems: We know that $\cos(n\pi x/L)$ is even and $\sin(n\pi x/L)$ is odd. The Fourier coefficients are defined by

$$a_n = \frac{1}{L} \int_{-L}^{L} f(x) \cos\Big(\frac{n\pi}{L}x\Big)\,dx, \quad n = 0, 1, 2, \ldots, \tag{19.20}$$

$$b_n = \frac{1}{L} \int_{-L}^{L} f(x) \sin\Big(\frac{n\pi}{L}x\Big)\,dx, \quad n = 1, 2, 3, \ldots, \tag{19.21}$$

and the integrands in (19.20) and (19.21) are the product of two functions, each of which is either even or odd. Now let, for example, $f$ be an even and $g$ be an odd function. Then the product $f g$ of the two functions is odd, because

$$f(-x)\,g(-x) = f(x)\,\big(-g(x)\big) = -f(x)\,g(x).$$

With analogous arguments we can prove all the following relations:

$$
\begin{aligned}
\text{EVEN} \times \text{EVEN} &= \text{EVEN},\\
\text{ODD} \times \text{ODD} &= \text{EVEN},\\
\text{EVEN} \times \text{ODD} &= \text{ODD}.
\end{aligned}
\tag{19.22}
$$

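Now we consider the integral of an even and of an odd function over $[-L, L]$:

$$
\begin{aligned}
\text{for even } f:\quad & \int_{-L}^{L} f(x)\,dx = 2 \int_{0}^{L} f(x)\,dx,\\
\text{for odd } f:\quad & \int_{-L}^{L} f(x)\,dx = \int_{-L}^{0} f(x)\,dx + \int_{0}^{L} f(x)\,dx = 0.
\end{aligned}
\tag{19.23}
$$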



Applying (19.22) and (19.23) to the integrals in the Fourier coefficients (19.20) and (19.21), we see that in the case of even $f$ all the coefficients $b_n$ vanish, and that in the case of odd $f$ all the coefficients $a_n$ vanish. □
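The conclusion of the proof can be checked numerically. The sketch below approximates (19.20) and (19.21) with a composite midpoint rule; the helper names `integrate` and `fourier_coefficients`, the sample functions $x^2$ and $x^3$, and the choice $L = 1$ are assumptions made only for illustration.

```python
import math

def integrate(g, a, b, n=2000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def fourier_coefficients(f, L, n):
    """Return (a_n, b_n) as defined in (19.20) and (19.21)."""
    a_n = integrate(lambda x: f(x) * math.cos(n * math.pi * x / L), -L, L) / L
    b_n = integrate(lambda x: f(x) * math.sin(n * math.pi * x / L), -L, L) / L
    return a_n, b_n

L = 1.0
even_f = lambda x: x * x     # even function: all b_n should vanish
odd_f = lambda x: x ** 3     # odd function: all a_n should vanish

b_of_even = [fourier_coefficients(even_f, L, n)[1] for n in range(1, 5)]
a_of_odd = [fourier_coefficients(odd_f, L, n)[0] for n in range(0, 5)]
```

Both lists contain only values that are zero up to rounding error, while the remaining coefficients (for example $a_0 = 2/3$ for $x^2$) are not zero.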

NOTE. For an even function (an odd function) satisfying the assumptions in Theorem 19.3, Theorem 19.4 (Theorem 19.5) guarantees that the statement in Theorem 19.3 about pointwise convergence is true for the Fourier cosine series (Fourier sine series).

19.2.4 Periodic odd and even extensions

In many applications a function $f$ given on an interval $[0, L]$ needs to be expanded into a Fourier series. One way to do that is to take $[0, L]$ as the interval of periodicity and $L$ as the period. In this case we would generally get a trigonometric Fourier series with both sine and cosine contributions. However, we can do better by first extending $f$ to an even or odd function on $[-L, L]$ and then representing it by a Fourier cosine or Fourier sine series, respectively. This is advantageous because we only have to determine half as many coefficients.

Definition 19.16 Let $f$ be a function on $[0, L]$. The even extension of $f$ to the interval $[-L, L]$ is given by

$$
f_e(x) =
\begin{cases}
f(x) & \text{if } 0 \le x \le L,\\
f(-x) & \text{if } -L \le x < 0.
\end{cases}
$$

Definition 19.17 Let $f$ be a function on $[0, L]$. The odd extension of $f$ to the interval $[-L, L]$ is given by

$$
f_o(x) =
\begin{cases}
f(x) & \text{if } 0 < x < L,\\
0 & \text{if } x \in \{-L, 0, +L\},\\
-f(-x) & \text{if } -L < x < 0.
\end{cases}
$$

NOTE. In the definition of the odd extension we have to redefine $f$ at $x = 0$, because the oddness condition at $x = 0$, namely $f(0) = -f(0)$, is violated if $f(0) \ne 0$. The values at $x = \pm L$ are defined to be zero in the odd extension, because we want to have a periodic odd (and even) extension. Having extended $f$ to $[-L, L]$ with $f(-L) = f(L)$, we can now extend it periodically to $\mathbb{R}$.

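Definition 19.18 For a function $f$ on $[-L, L)$, or on $[-L, L]$ satisfying $f(-L) = f(L)$, we can define the periodic extension $\tilde f$ of $f$ to $\mathbb{R}$ by

$$
\tilde f(x) =
\begin{cases}
f(x) & \text{if } -L \le x < L,\\
f(x - 2jL) & \text{if } (2j-1)L \le x < (2j+1)L, \ j \in \mathbb{Z}.
\end{cases}
$$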
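The three extensions in Definitions 19.16 to 19.18 can be written as small higher-order functions. This is only a sketch: the names `even_extension`, `odd_extension` and `periodic_extension`, the sample function $f(x) = x + 1$ and the value $L = 2$ are hypothetical choices for illustration.

```python
import math

def even_extension(f):
    """f_e on [-L, L]: f(x) for x >= 0 and f(-x) for x < 0."""
    return lambda x: f(x) if x >= 0 else f(-x)

def odd_extension(f, L):
    """f_o on [-L, L]: 0 at x in {-L, 0, L}, f(x) on (0, L), -f(-x) on (-L, 0)."""
    def f_o(x):
        if x == 0 or abs(x) == L:
            return 0.0
        return f(x) if x > 0 else -f(-x)
    return f_o

def periodic_extension(g, L):
    """Extend g from [-L, L) to all of R with period 2L."""
    def g_periodic(x):
        j = math.floor((x + L) / (2 * L))   # the j with (2j-1)L <= x < (2j+1)L
        return g(x - 2 * j * L)
    return g_periodic

L = 2.0
f = lambda x: x + 1.0                        # sample function on [0, L]
f_e = periodic_extension(even_extension(f), L)
f_o = periodic_extension(odd_extension(f, L), L)
```

For example, `f_e(-0.5)` and `f_e(4.5)` both return `f(0.5)`, while `f_o(-0.5)` returns `-f(0.5)` and `f_o` vanishes at every multiple of $L$.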

Definition 19.19 Let $f$ be a function defined on $[0, L]$ and let $f_e$ denote its even (periodic) extension. Then the Fourier series of $f_e$ is a Fourier cosine series, and if we restrict it to $[0, L]$, we obtain the half-range cosine series for $f$,

$$f(x) \sim \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos\Big(\frac{n\pi}{L}x\Big), \quad 0 \le x \le L,$$

where

$$a_n = \frac{1}{L} \int_{-L}^{L} f_e(x) \cos\Big(\frac{n\pi}{L}x\Big)\,dx = \frac{2}{L} \int_{0}^{L} f(x) \cos\Big(\frac{n\pi}{L}x\Big)\,dx, \quad n = 0, 1, 2, \ldots \tag{19.24}$$

Definition 19.20 Let $f$ be a function defined on $[0, L]$ and let $f_o$ denote its odd (periodic) extension. Then the Fourier series of $f_o$ is a Fourier sine series, and if we restrict it to $[0, L]$, we obtain the half-range sine series for $f$,

$$f(x) \sim \sum_{n=1}^{\infty} b_n \sin\Big(\frac{n\pi}{L}x\Big), \quad 0 \le x \le L,$$

where

$$b_n = \frac{1}{L} \int_{-L}^{L} f_o(x) \sin\Big(\frac{n\pi}{L}x\Big)\,dx = \frac{2}{L} \int_{0}^{L} f(x) \sin\Big(\frac{n\pi}{L}x\Big)\,dx, \quad n = 1, 2, 3, \ldots \tag{19.25}$$

NOTE. The second formula for the coefficients in (19.24) and (19.25) makes use of the fact that our integrand is even (see (19.23)). Of course this second formula is easier to compute, and it also does not make explicit use of the even or odd (periodic) extension of $f$.
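As a quick illustration of the second formulas in (19.24) and (19.25), the sketch below computes half-range coefficients by quadrature over $[0, L]$ only. The test function $f(x) = x$ on $[0, \pi]$ and the helper names are assumptions chosen for illustration; for this $f$ one finds by hand $a_0 = \pi$, $a_1 = -4/\pi$ and $b_n = 2(-1)^{n+1}/n$.

```python
import math

def integrate(g, a, b, n=4000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def half_range_cosine_coeff(f, L, n):
    """a_n = (2/L) * integral_0^L f(x) cos(n pi x / L) dx, as in (19.24)."""
    return 2.0 / L * integrate(lambda x: f(x) * math.cos(n * math.pi * x / L), 0.0, L)

def half_range_sine_coeff(f, L, n):
    """b_n = (2/L) * integral_0^L f(x) sin(n pi x / L) dx, as in (19.25)."""
    return 2.0 / L * integrate(lambda x: f(x) * math.sin(n * math.pi * x / L), 0.0, L)

L = math.pi
f = lambda x: x
a = [half_range_cosine_coeff(f, L, n) for n in range(4)]      # a_0 .. a_3
b = [half_range_sine_coeff(f, L, n) for n in range(1, 4)]     # b_1 .. b_3
```

Only values of $f$ on $[0, L]$ are used; neither extension has to be constructed explicitly.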

LECTURE 22

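Example 19.10 Find the half-range cosine and sine expansions of the function

$$
f(x) =
\begin{cases}
\dfrac{2k}{L}\,x & \text{if } 0 \le x < \dfrac{L}{2},\\[8pt]
\dfrac{2k}{L}\,(L - x) & \text{if } \dfrac{L}{2} \le x < L.
\end{cases}
$$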



Solution:

Half-range cosine expansion:

First we determine the (periodic) even extension of $f$:

$$
f_e(x) =
\begin{cases}
f(x) & \text{if } x \in [0, L],\\
f(-x) & \text{if } x \in [-L, 0),
\end{cases}
$$

and for all $x \in \mathbb{R}$ we demand $f_e(x) = f_e(x + 2L)$.

Figure 19.5: The 'triangular-shaped' function $f$ in the example, with $k = 1$ and $L = \pi$ (left picture), and its even extension $f_e$ to $[-L, L]$ (right picture), both plotted against $x$.

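Next we compute the coefficients of the Fourier cosine series of $f_e$: from (19.24),

$$
\begin{aligned}
a_0 &= \frac{2}{L} \int_{0}^{L} f(x)\,dx = \frac{2}{L} \left[ \frac{2k}{L} \int_{0}^{L/2} x\,dx + \frac{2k}{L} \int_{L/2}^{L} (L - x)\,dx \right]\\
&= \frac{4k}{L^2} \left( \frac{1}{2}x^2 \Big|_{0}^{L/2} - \frac{1}{2}(L - x)^2 \Big|_{L/2}^{L} \right) = \frac{4k}{L^2} \left( \frac{L^2}{8} + \frac{L^2}{8} \right) = k,
\end{aligned}
$$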

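and for $n \ge 1$

$$
\begin{aligned}
a_n &= \frac{2}{L} \int_{0}^{L} f(x) \cos\Big(\frac{n\pi}{L}x\Big)\,dx\\
&= \frac{2}{L} \left[ \frac{2k}{L} \int_{0}^{L/2} x \cos\Big(\frac{n\pi}{L}x\Big)\,dx + \frac{2k}{L} \int_{L/2}^{L} (L - x) \cos\Big(\frac{n\pi}{L}x\Big)\,dx \right].
\end{aligned}
$$

Integration by parts is needed here, $\int u\,dv = uv - \int v\,du$:

$$
\begin{aligned}
a_n &= \frac{4k}{L^2} \Bigg[ x\,\frac{L}{n\pi} \sin\Big(\frac{n\pi}{L}x\Big) \Big|_{0}^{L/2} - \frac{L}{n\pi} \int_{0}^{L/2} \sin\Big(\frac{n\pi}{L}x\Big)\,dx\\
&\qquad\quad + (L - x)\,\frac{L}{n\pi} \sin\Big(\frac{n\pi}{L}x\Big) \Big|_{L/2}^{L} + \frac{L}{n\pi} \int_{L/2}^{L} \sin\Big(\frac{n\pi}{L}x\Big)\,dx \Bigg]\\
&= \frac{4k}{L^2} \Bigg( \frac{L^2}{2n\pi} \sin\Big(\frac{n\pi}{2}\Big) + \frac{L^2}{n^2\pi^2} \cos\Big(\frac{n\pi}{L}x\Big) \Big|_{0}^{L/2} - \frac{L^2}{2n\pi} \sin\Big(\frac{n\pi}{2}\Big) - \frac{L^2}{n^2\pi^2} \cos\Big(\frac{n\pi}{L}x\Big) \Big|_{L/2}^{L} \Bigg)\\
&= \frac{4k}{n^2\pi^2} \Big( 2\cos\Big(\frac{n\pi}{2}\Big) - \cos(n\pi) - 1 \Big).
\end{aligned}
$$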

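We compute the first few coefficients $a_n$,

$$a_1 = 0, \quad a_2 = -\frac{16k}{2^2\pi^2}, \quad a_3 = 0, \quad a_4 = 0, \quad a_5 = 0, \quad a_6 = -\frac{16k}{6^2\pi^2}, \quad a_7 = 0, \quad a_8 = 0, \ \ldots,$$

and see that $a_n = 0$ if $n \notin \{2, 6, 10, 14, 18, \ldots\}$. For $n = 2(2\ell - 1) = 4\ell - 2$, $\ell = 1, 2, 3, \ldots$, we can see that

$$a_{4\ell-2} = -\frac{16k}{(4\ell - 2)^2\pi^2}, \quad \ell = 1, 2, 3, \ldots.$$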
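The pattern of vanishing coefficients is easy to confirm numerically. The following sketch compares a quadrature of the second formula in (19.24) with the closed form $\frac{4k}{n^2\pi^2}\big(2\cos(n\pi/2) - \cos(n\pi) - 1\big)$ for the triangular function of this example; the helper names and the parameter values $k = 1$, $L = \pi$ are illustrative assumptions.

```python
import math

def integrate(g, a, b, n=4000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def triangle(x, k, L):
    """The function f of Example 19.10 on [0, L]."""
    return 2 * k / L * x if x < L / 2 else 2 * k / L * (L - x)

def a_numeric(n, k, L):
    """a_n from (19.24), evaluated by quadrature over [0, L]."""
    g = lambda x: triangle(x, k, L) * math.cos(n * math.pi * x / L)
    return 2.0 / L * integrate(g, 0.0, L)

def a_closed(n, k):
    """Closed form 4k/(n^2 pi^2) * (2 cos(n pi/2) - cos(n pi) - 1), for n >= 1."""
    bracket = 2 * math.cos(n * math.pi / 2) - math.cos(n * math.pi) - 1
    return 4 * k / (n * n * math.pi ** 2) * bracket

k, L = 1.0, math.pi
a0 = a_numeric(0, k, L)                                   # should be close to k
table = [(n, a_numeric(n, k, L), a_closed(n, k)) for n in range(1, 9)]
```

In `table`, only the rows with $n = 2$ and $n = 6$ are noticeably different from zero, in line with $a_n = 0$ for $n \notin \{2, 6, 10, \ldots\}$.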



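Thus, the half-range cosine series of $f$ reads

$$
\begin{aligned}
F_e(x) &\sim \frac{k}{2} - \frac{16k}{\pi^2} \left( \frac{1}{2^2} \cos\Big(\frac{2\pi}{L}x\Big) + \frac{1}{6^2} \cos\Big(\frac{6\pi}{L}x\Big) + \ldots \right)\\
&= \frac{k}{2} - \frac{16k}{\pi^2} \sum_{\ell=1}^{\infty} \frac{1}{(4\ell - 2)^2} \cos\Big(\frac{(4\ell - 2)\pi}{L}x\Big), \quad x \in [0, L].
\end{aligned}
$$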

Half-range sine expansion:

First we determine the (periodic) odd extension of $f$:

$$
f_o(x) =
\begin{cases}
f(x) & \text{if } x \in [0, L],\\
-f(-x) & \text{if } x \in [-L, 0],
\end{cases}
$$

and for all $x \in \mathbb{R}$ we demand $f_o(x) = f_o(x + 2L)$.

Figure 19.6: The 'triangular-shaped' function $f$ in the example, with $k = 1$ and $L = \pi$ (left picture), and its odd extension $f_o$ to $[-L, L]$ (right picture), both plotted against $x$.

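Next we compute the coefficients of the Fourier sine series of $f_o$: from (19.25), for $n \ge 1$,

$$b_n = \frac{2}{L} \int_{0}^{L} f(x) \sin\Big(\frac{n\pi}{L}x\Big)\,dx = \frac{8k}{n^2\pi^2} \sin\Big(\frac{n\pi}{2}\Big),$$

where the last step uses integration by parts on $[0, L/2]$ and on $[L/2, L]$, exactly as for $a_n$. Hence $b_n = 0$ for even $n$, and for odd $n$ we can represent the coefficients $b_n$ with the formula

$$b_{2\ell-1} = \frac{8k}{\pi^2}\,\frac{(-1)^{\ell+1}}{(2\ell - 1)^2}, \quad \ell = 1, 2, 3, \ldots$$

Thus, the half-range sine series of $f$ reads

$$
\begin{aligned}
F_o(x) &\sim \frac{8k}{\pi^2} \left( \frac{1}{1^2} \sin\Big(\frac{\pi}{L}x\Big) - \frac{1}{3^2} \sin\Big(\frac{3\pi}{L}x\Big) + \frac{1}{5^2} \sin\Big(\frac{5\pi}{L}x\Big) - \ldots \right)\\
&= \frac{8k}{\pi^2} \sum_{\ell=1}^{\infty} \frac{(-1)^{\ell+1}}{(2\ell - 1)^2} \sin\Big(\frac{(2\ell - 1)\pi}{L}x\Big), \quad x \in [0, L].
\end{aligned}
$$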

From Theorem 19.3 and Theorems 19.4 and 19.5 we know that both half-range series expansions converge at every point $x \in [0, L]$ to $f(x)$. □
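The convergence claim for the sine expansion can be tested without relying on any closed form, by computing the $b_n$ directly from the second formula in (19.25). In the sketch below the helper names, the values $k = 1$, $L = \pi$, the truncation at 61 terms and the test point $x = L/4$ (where $f = k/2$) are illustrative assumptions. The quadrature gives $b_1 > 0$, $b_2 \approx 0$ and $b_3 < 0$, consistent with $b_n = \frac{8k}{n^2\pi^2}\sin(n\pi/2)$.

```python
import math

def integrate(g, a, b, n=2000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def triangle(x, k, L):
    """The function f of Example 19.10 on [0, L]."""
    return 2 * k / L * x if x < L / 2 else 2 * k / L * (L - x)

def b_numeric(n, k, L):
    """b_n from (19.25), evaluated by quadrature over [0, L]."""
    g = lambda x: triangle(x, k, L) * math.sin(n * math.pi * x / L)
    return 2.0 / L * integrate(g, 0.0, L)

def sine_partial_sum(x, N, k, L):
    """Sum of the first N terms of the half-range sine series at x."""
    return sum(b_numeric(n, k, L) * math.sin(n * math.pi * x / L)
               for n in range(1, N + 1))

k, L = 1.0, math.pi
b = {n: b_numeric(n, k, L) for n in range(1, 6)}
approx = sine_partial_sum(L / 4, 61, k, L)    # compare with f(L/4) = k/2
```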

Example 19.11 Consider the periodic function $f$ of period 4 given by

$$
f(x) =
\begin{cases}
-1 & -3 \le x \le -1,\\
0 & -1 < x < 1.
\end{cases}
$$

(a) Sketch the function f

(b) Determine the Fourier series F of f .

(c) Plot the Fourier series $F$ on your sketch. Are there any points of discrepancy between $F$ and $f$? Why? If so, mark these on the graph.

Solution:

$$F(x) = -\frac{1}{2} + \frac{2}{\pi} \sum_{\ell=1}^{\infty} \frac{(-1)^{\ell+1}}{2\ell - 1} \cos\Big(\frac{(2\ell - 1)\pi x}{2}\Big)$$

□
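The coefficients in this example can be checked by quadrature of (19.20) and (19.21) with $L = 2$. The sketch below is only a sanity check under that reading of the example; the helper names are hypothetical. It finds $a_0 \approx -1$, $a_1 \approx 2/\pi$, $a_2 \approx 0$, $a_3 \approx -2/(3\pi)$, and all $b_n \approx 0$ because $f$ is even.

```python
import math

def integrate(g, a, b, n=2000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def f(x):
    """Period-4 function of Example 19.11, evaluated via its values on [-3, 1)."""
    y = (x + 3.0) % 4.0 - 3.0          # shift x into the interval [-3, 1)
    return -1.0 if y <= -1.0 else 0.0

L = 2.0                                 # half the period

def a_coeff(n):
    """a_n from (19.20) with L = 2."""
    return integrate(lambda x: f(x) * math.cos(n * math.pi * x / L), -L, L) / L

def b_coeff(n):
    """b_n from (19.21) with L = 2."""
    return integrate(lambda x: f(x) * math.sin(n * math.pi * x / L), -L, L) / L

a = [a_coeff(n) for n in range(4)]      # a_0 .. a_3
b = [b_coeff(n) for n in range(1, 4)]   # b_1 .. b_3
```

The constant term of the series is $a_0/2 \approx -1/2$, the mean value of $f$ over one period.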



Example 19.12 Find the Fourier series for

$$
f(x) =
\begin{cases}
5, & -\pi < x < 0,\\
3, & 0 \le x < \pi,
\end{cases}
$$

assuming that $f$ has period $2\pi$. [Hint: Make a vertical shift to simplify the calculations.]

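Solution: We first shift the function down by 4 units to make it an odd function, i.e., we create a new function $\tilde f(x) = f(x) - 4$ and compute the Fourier series $\tilde F(x)$ for $\tilde f(x)$. Once we have computed $\tilde F(x)$, we then transform back to obtain our desired result $F(x) = \tilde F(x) + 4$. Thus

$$
\tilde f(x) = f(x) - 4 =
\begin{cases}
1, & -\pi < x < 0,\\
-1, & 0 \le x < \pi.
\end{cases}
$$

Since $\tilde f(x)$ is an odd function, both $a_0$ and $a_n$ are equal to zero. Our work is now reduced to calculating $b_n$:

$$
\begin{aligned}
b_n &= \frac{1}{L} \int_{-L}^{L} \tilde f(x) \sin\Big(\frac{n\pi x}{L}\Big)\,dx = \frac{1}{\pi} \int_{-\pi}^{\pi} \tilde f(x) \sin\Big(\frac{n\pi x}{\pi}\Big)\,dx\\
&= \frac{2}{\pi} \int_{0}^{\pi} \tilde f(x) \sin(nx)\,dx\\
&= -\frac{2}{\pi} \int_{0}^{\pi} \sin(nx)\,dx\\
&= -\frac{2}{\pi} \left[ -\frac{1}{n} \cos(nx) \right]_{0}^{\pi}\\
&= \frac{2}{n\pi} \big[ \cos(n\pi) - 1 \big].
\end{aligned}
$$

Only for odd values of $n$ do we obtain a non-zero result for $b_n$, i.e.,

$$b_{2\ell-1} = -\frac{4}{(2\ell - 1)\pi}, \quad \ell = 1, 2, 3, \ldots$$

Thus

$$\tilde F(x) = -\frac{4}{\pi} \sum_{\ell=1}^{\infty} \frac{1}{2\ell - 1} \sin\big((2\ell - 1)x\big).$$

Hence overall we have

$$F(x) = \tilde F(x) + 4 = 4 - \frac{4}{\pi} \sum_{\ell=1}^{\infty} \frac{1}{2\ell - 1} \sin\big((2\ell - 1)x\big).$$

□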
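A short numerical check of the final series: truncating the sum at a finite number of terms should give values near 3 on $(0, \pi)$, near 5 on $(-\pi, 0)$, and exactly the average 4 at the jump $x = 0$. The function name `F_partial` and the truncation at 500 terms are illustrative assumptions.

```python
import math

def F_partial(x, N):
    """4 - (4/pi) * sum_{l=1}^{N} sin((2l-1)x) / (2l-1)."""
    s = sum(math.sin((2 * l - 1) * x) / (2 * l - 1) for l in range(1, N + 1))
    return 4.0 - 4.0 / math.pi * s

right = F_partial(math.pi / 2, 500)     # f = 3 on [0, pi)
left = F_partial(-math.pi / 2, 500)     # f = 5 on (-pi, 0)
at_jump = F_partial(0.0, 500)           # average of 5 and 3
```

The value at the jump differs from $f(0) = 3$, which is what Theorem 19.3 predicts at a point of discontinuity.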

Example 19.13 Consider the function

$$f(x) = -x^2 + \pi x, \quad x \in [0, \pi].$$

(a) Sketch the function $f$.

(b) Write down the odd extension $f_o$ of $f$ to the interval $[-\pi, \pi]$. Sketch the odd extension $f_o$ on $[-\pi, \pi]$.

(c) Why are all the Fourier coefficients $a_n$ of $f_o$ on $[-\pi, \pi]$ zero?

(d) Compute all the Fourier coefficients $b_n$ as a function of $n$. Evaluate the first three non-zero coefficients $b_n$ explicitly.

(e) What are the values of the coefficients $b_n$ for even $n$?

(f) Write down the Fourier series of $f_o$.

(g) Does the Fourier series of $f_o$ coincide with the original function $f$ for $x \in [0, \pi]$? Why?

(h) Hence show that

$$\sum_{\ell=0}^{\infty} \frac{(-1)^\ell}{(2\ell + 1)^3} = \frac{\pi^3}{32}.$$

Solution:
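As a numerical cross-check for parts (d), (e) and (h), and not as a substitute for the written solution, the sketch below compares a quadrature of (19.25) with the candidate closed form $b_n = \frac{4\,(1 - (-1)^n)}{\pi n^3}$. That closed form is an assumption here: it is what two integrations by parts appear to give, and the code only tests it numerically. Evaluating the resulting sine series at $x = \pi/2$, where $f(\pi/2) = \pi^2/4$ and $\sin\big((2\ell+1)\pi/2\big) = (-1)^\ell$, leads to the sum in part (h).

```python
import math

def integrate(g, a, b, n=4000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

f = lambda x: -x * x + math.pi * x

def b_numeric(n):
    """b_n = (2/pi) * integral_0^pi f(x) sin(n x) dx, by quadrature."""
    return 2.0 / math.pi * integrate(lambda x: f(x) * math.sin(n * x), 0.0, math.pi)

def b_closed(n):
    """Candidate closed form 4 (1 - (-1)^n) / (pi n^3); zero for even n."""
    return 4.0 * (1 - (-1) ** n) / (math.pi * n ** 3)

b = [(n, b_numeric(n), b_closed(n)) for n in range(1, 7)]

# Part (h): partial sum of the alternating series against pi^3 / 32.
partial = sum((-1) ** l / (2 * l + 1) ** 3 for l in range(50))
target = math.pi ** 3 / 32
```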

□

The story so far . . .

We introduced the inner product for functions $f$ and $g$ on the interval $[a, b]$:

$$\langle f, g \rangle = \int_{a}^{b} f(x)\,g(x)\,dx.$$

(This is analogous to the scalar product for vectors.)

We also said that $f$ and $g$ are orthogonal on the interval $[a, b]$ if

$$\langle f, g \rangle = \int_{a}^{b} f(x)\,g(x)\,dx = 0.$$

We also defined the norm $\|f\|$ of a function $f$ on the interval $[a, b]$:

$$\|f\| = \sqrt{\langle f, f \rangle} = \left( \int_{a}^{b} |f(x)|^2\,dx \right)^{1/2}.$$

We then showed that the set of trigonometric functions $\{\Phi_0, \Phi_1, \Psi_1, \Phi_2, \Psi_2, \ldots\}$ on $[-L, L]$, given by

$$\Phi_n(x) = \cos\Big(\frac{n\pi}{L}x\Big), \quad n = 0, 1, 2, \ldots,$$

$$\Psi_n(x) = \sin\Big(\frac{n\pi}{L}x\Big), \quad n = 1, 2, 3, \ldots,$$

is an orthogonal set on $[-L, L]$.


We then said we could write a function $f$ in terms of this orthogonal set.

The Fourier series $F$ of a function $f$ on $[-L, L]$ (with $\|f\| < \infty$) with respect to the complete orthogonal set of trigonometric functions on $[-L, L]$ is given by

$$F(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} \left\{ a_n \cos\Big(\frac{n\pi}{L}x\Big) + b_n \sin\Big(\frac{n\pi}{L}x\Big) \right\},$$

with the Fourier coefficients $a_n$ and $b_n$ given by

$$a_n = \frac{1}{L} \int_{-L}^{L} f(x) \cos\Big(\frac{n\pi}{L}x\Big)\,dx, \quad n = 0, 1, 2, \ldots,$$

$$b_n = \frac{1}{L} \int_{-L}^{L} f(x) \sin\Big(\frac{n\pi}{L}x\Big)\,dx, \quad n = 1, 2, 3, \ldots.$$

NOTE. The function $f$ and the corresponding Fourier series $F$ are the same except possibly at a finite number of $x$ values.

In Section 19.4 we discussed that if our function $f$ is odd on the interval $[-L, L]$, then $F$ is a Fourier sine series, i.e., $a_n = 0$ for $n = 0, 1, 2, \ldots$. Similarly, if our function $f$ is even on the interval $[-L, L]$, then $F$ is a Fourier cosine series, i.e., $b_n = 0$ for $n = 1, 2, 3, \ldots$.

Let us continue by starting Section 19.5.
