
Interphase EDGE Calculus 3 Lecture/Recitation Notes

Jack-William Barotta

August 23, 2019


Contents

1 Introduction
2 Lecture I on July 1, 2019
2.1 How to Think About Math
2.2 n-dimensional space
2.3 Graph of an Equation in Rn
2.4 Functions
2.5 Level Curves of Functions
3 Recitation I on July 2, 2019
4 Lecture II on July 3, 2019
4.1 Review
4.2 Linear Transformation
4.3 Determinants
4.4 2D Determinants
4.5 3D Determinants
4.6 Matrix Operations (ASE)
4.6.1 Solving a Linear System
5 Lecture III on July 5, 2019
5.1 Introduction to Vectors
5.1.1 Addition
5.1.2 Scalar Multiplication
5.2 More Vector Properties
5.3 Applying these concepts
5.4 The Dot Product
5.4.1 Small Examples
5.5 The Cross Product
5.5.1 Example Time
5.6 Big Picture
5.7 3D Geometry with Lines
5.8 3D Geometry and Planes
5.8.1 TLDR Finding Equation of Plane
6 Lecture IV on July 8, 2019
6.1 Review
6.2 Planes in Space
6.3 Vector-Valued Functions
6.4 Quadric Surfaces
6.5 All other Path in Space Stuff (ASE)
6.5.1 A Proof of Orthogonality
7 Recitation II on July 9, 2019
7.1 Point to Point
7.2 Point to Line
7.3 Point to Plane
7.4 Line to Line
7.5 Line to Plane and Plane to Plane
8 Lecture V on July 10, 2019
8.1 Polar, Cylindrical, and Spherical Coordinates
8.1.1 Polar Coordinates
8.1.2 Cylindrical Coordinates
8.1.3 Spherical Coordinates
9 Lecture VI on July 11, 2019
9.1 Limits
9.2 Other tools for limits
9.2.1 Alternate Paths
9.2.2 Continuity
9.2.3 Examples of Using the Further Techniques
10 Recitation III on July 12, 2019
11 Lecture VII on July 15, 2019
11.1 Partial Derivatives
11.2 A difficult Example
11.3 Linear Approximation
12 Recitation IV on July 16, 2019
12.1 Partial Derivative Notation
12.2 Clarifying an Example in Class on Clairaut's Theorem
12.3 Linear Approximation
12.4 A Rigorous Proof of Clairaut's Theorem
13 Lecture VIII on July 17, 2019
13.1 Review on Linear Approximations
13.2 Multivariable Optimization
13.3 The Second Derivative Test (ASE)
13.3.1 An Example in Second Derivatives
13.4 Directional Derivative
14 Recitation V on July 18, 2019
14.1 A Small Note on Multivariable Optimization
14.2 Gradients and Directional Derivatives
14.3 Following a Path of Max Increase
15 Lecture IX on July 19, 2019
15.1 Review on Directional Derivatives
15.2 Multivariable Chain Rule
15.2.1 A Proof of the Multivariable Chain Rule
16 Lecture X on July 23, 2019
16.1 Review on Partial Derivatives and Mixed Partials
16.2 Lagrange Multipliers
16.3 Integration
17 Recitation VI on July 24, 2019
18 Lecture XI on July 25, 2019
18.1 Review on Ideas Behind Integration
18.2 Triple Integrals
18.3 Integration in Other Coordinate Systems
18.3.1 Polar Coordinates
18.3.2 Cylindrical Coordinates
18.3.3 Spherical Coordinates
19 Recitation VII on July 26, 2019
20 Lecture XII on July 29, 2019
20.1 Integration in Spherical and Cylindrical Coordinates
20.2 Custom Coordinate Systems
20.3 Applications of Double and Triple Integrals (ASE)
20.3.1 Average Value of a Function
20.3.2 Center of Mass
21 Recitation VIII on July 30, 2019
22 Lecture XIII on July 31, 2019
22.1 Vector Fields
22.2 Work in Vector Fields
22.3 Fundamental Theorem of Vector Calculus
22.3.1 Conservative Vector Fields
22.3.2 Checking Conservative Fields
22.4 Green's Theorem
23 Recitation IX on August 1, 2019
24 Lecture XIV on August 2, 2019
24.1 Green's Theorem
24.2 Surface Integrals
24.3 Parametrizing Surfaces (ASE)
24.3.1 A Better Treatment
24.4 Flux
24.5 Divergence
25 Lecture XV on August 5, 2019
25.1 Divergence
25.2 Curl
25.3 Divergence Theorem
25.4 Stokes' Theorem
25.4.1 Same Border, Different Surface
26 Recitation X on August 6, 2019
27 Lecture XV on August 7, 2019
28 Thank You

These lecture notes are based off of the Interphase EDGE 2019 Iteration of Multivariable

Calculus instructed by Sam Watson. The introduction contains useful information for those

in my recitation section, and the rest of the sections will be labeled in accordance with the

lecture/recitation that they are associated with. I would greatly appreciate if you alerted

me of any typos that you may find. The more you help me, the more I can help you. I hope

this is of help to you!

1 Introduction

Hello and welcome to my recitation section of Calculus 3! I took this class two years

ago with Professor Watson, and I really thought it was a great help and aid in 18.02.

In addition, I was a recitation instructor last year for Professor Watson, which was even

more fun! I am a Mathematics Major (Course 18) and Economics (Course 14) here at

MIT. I feel like in lecture sometimes it is hard to write down all the little details, so I

will be providing these for you as an additional resource that is supposed to be utilized to

reinforce your very own notes you take in lecture. Please get comfortable with my website

for resources because I will be updating it daily! All of the items that are outlined in blue


throughout this document contain hyperlinks (to my email in this case), and others will link to useful math resources, additional problems, photos, or other things I decide to include. Please bookmark my website jack.mit.edu, as I will be uploading all my resources there :). I, along with the rest of the Calc3 TAs, am currently writing solutions to the in-book

exercises of Professor Watson’s book. That being said, I would be more than happy to go

over those problems with you as well for the additional practice as I am working through

them myself right now!

My "official" office hours for the course will be on:

• Thursdays 8-9

• Sundays 8-9

However, realistically, I will be having office hours from:

• Wednesday 8-10

• Thursdays 8-10

• Sundays 8-10¹

I would also like to offer time to work individually with students who may find the office hour setting a tad too overwhelming, chaotic, and loud (which it can definitely be at times). That being said, please email me, and I would be happy to meet for an hour or

so to go over material related to the class. In my opinion, the most important thing you can

learn this summer is how to use your resources at MIT. They are just about everywhere you

look, and they are just waiting to be used by you. This is your education, and you should

be taking full advantage of the amazing opportunity you have in front of you here. I hope that I can be one of those resources for you over the summer, and potentially into the future through other organizations and things I am a part of. I also love talking about things related or unrelated to the course, so always feel free to talk about MIT life, math, or anything else you would like to know about.

¹ Honestly, we all know that I basically have office hours all the time.

I am very excited, and I hope that you can share my excitement throughout our next

seven weeks together. I probably will make a lot of mistakes along the way, so please yell

at me and tell me to fix my mistakes!! As a recitation section, I can promise that every

single one of us will make a handful of mistakes at the very least, so let us learn from our

mistakes and try for better the next time. Do not get bummed if things do not come super

quickly to you. MIT is a lot different than high school, and it is always better to ask for

help! I want to keep you as engaged as possible, so we will probably do a lot of activities

such as board work, extensions to applications, and maybe even some friendly competition

and games. I will try to incorporate all of your majors if I can into problem solving, so that

you can see how wonderful math is and its ability to weave its way into almost everything.

My high school teacher had fun exam review games, and I hope we can make some of our

own. We'll see. Seriously though, overall I want you to love what you're learning and have

fun while doing it. Without further ado, let us begin the actual material of the course!

2 Lecture I on July 1, 2019

Please use sswatson.com/interphase if you want to find all of your Psets, Syllabus, and

other material such as the course textbook provided by Professor Watson. Homework is due

Monday, quizzes are at the start of recitation on Tuesdays, and Sam’s Office hours are 7-9

on Sunday.


2.1 How to Think About Math

There was a discussion on some meta math stuff. I will say from experience that whether

or not it is just for you or the grader, writing down your thoughts in the solution definitely

helps. It allows you to organize your thoughts, and it also gives insight to the grader on just

how much you do know about the material being tested/questioned about.

2.2 n-dimensional space

We can express our understanding of Euclidean space ($n$-dimensional space) with a really fancy looking R, namely $\mathbb{R}^n$. One-dimensional space is simply $\mathbb{R}$, the real number line, so a point is just $(x)$. If we move to $\mathbb{R}^2$, we are now representing the real plane. This is how we start defining distance! For example, in $\mathbb{R}$, the signed distance from the origin to $x$ is $x$. We can apply this to higher dimensions! In $\mathbb{R}^2$, we now have $(x, y)$, and we see that the $x$-coordinate of a point is the signed distance from the $y$-axis, and the $y$-coordinate of a point is the signed distance from the $x$-axis. Finally, let's move to $\mathbb{R}^3$. This is where, for me, stuff got visually and geometrically difficult to follow. Now we have $(x, y, z)$. The $x$-coordinate is the signed distance from the $yz$-plane, the $y$-coordinate is the signed distance from the $xz$-plane, and the $z$-coordinate is the signed distance from the $xy$-plane.

Fun fact for those who care: distance does not have to be Euclidean. Euclidean is just the most common formulation of distance. However, there are three axioms of distance, and if your crazy, wack distance system abides by the three axioms of distance, then it is considered a distance! In Real Analysis, they like to demonstrate this and give crazy problems that scared me and scarred my view of distance. Lol, rereading these notes I wrote last year, I am so dramatic. Real Analysis is a great course; many students in my section last year took it and, well... tbh, had mixed reviews, so take it at your own risk!


2.3 Graph of an Equation in Rn

The graph of an equation in $\mathbb{R}^n$ is the set of points that satisfy the equation. A graph is simply a visual representation of the solution set. The biggest takeaway from this section is that your domain, range, and the space you are graphing in matter greatly. An equation by itself needs the number of dimensions specified in order to truly answer the question correctly. For example, asking to graph $x = 1$ leads to 3 different representations depending on whether you are in $\mathbb{R}$, $\mathbb{R}^2$, or $\mathbb{R}^3$. In $\mathbb{R}$, $x = 1$ is a single point. In $\mathbb{R}^2$, $x = 1$ is a vertical line representing all $(x, y)$ pairs of the form $(1, y)$ for all possible $y$. In $\mathbb{R}^3$, $x = 1$ is a plane parallel to the $yz$-plane, namely the plane of all $(x, y, z)$ triples of the form $(1, y, z)$ for all possible $y$ and $z$.

We can take this to more interesting cases. For example, let's look at the equation $x^2 + y^2 = 1$. Some of you may recognize this as the unit circle. However, we need to be careful and more precise with ourselves. In $\mathbb{R}^2$, we are correct: the set of points that satisfy this equation is exactly the unit circle, namely the points whose squared distance from the origin is 1. However, let's move to $\mathbb{R}^3$. In this case, we now have the unit cylinder. Since the equation puts no restriction on $z$, we do not just have the unit circle. Instead, we have all points whose squared distance from the $z$-axis is 1. Thus, we get a cylinder instead of a circle.
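To make the circle-versus-cylinder point concrete, here is a minimal sketch in Python. The helper name `satisfies_circle_equation` and the sample points are my own choices for illustration, not something from lecture. It tests whether a point satisfies $x^2 + y^2 = 1$ and simply ignores any $z$-coordinate, which is exactly why the solution set in $\mathbb{R}^3$ is a cylinder.

```python
import math


def satisfies_circle_equation(point, tol=1e-9):
    """Return True if the point's x and y coordinates satisfy x^2 + y^2 = 1.

    Any further coordinates (such as z) are ignored, because the equation
    places no restriction on them.
    """
    x, y = point[0], point[1]
    return math.isclose(x * x + y * y, 1.0, abs_tol=tol)


# In R^2 the solution set is the unit circle.
print(satisfies_circle_equation((1.0, 0.0)))
print(satisfies_circle_equation((0.6, 0.8)))

# In R^3 the same (x, y) works for every z, which sweeps out a cylinder.
print(satisfies_circle_equation((0.6, 0.8, 5.0)))
print(satisfies_circle_equation((0.6, 0.8, -100.0)))

# A point that is not at distance 1 from the z-axis fails the test.
print(satisfies_circle_equation((1.0, 1.0, 0.0)))
```

The first four checks succeed and the last one fails, since $(1, 1, 0)$ has squared distance 2 from the $z$-axis.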

2.4 Functions

A function is really cool, in my math-nerd opinion. I think of a function as a mutator, or a changer of something. Basically, you put some stuff into the function, the function does what it needs to do, and it outputs the result! A function can be thought of as a mapping: you're mapping your input to an output. In the most common example, $f(x) = x^2$, every inputted $x$ value is squared and then outputted. Similar to graphing an equation, we can also graph functions. We can say this as:

$$y = f(x) \quad \forall x \in \mathbb{R} \tag{1}$$

In words, this means $y = f(x)$, which is our function, for all $x$ on the real line. The fancy symbols are math notation that is very important to know in higher-level math courses, so it is good to get a little familiar :) However, as many of you won't necessarily go into mathematics, you may meet functions that do more than what we generally think of as a function. You can have functions that change colors, or functions that count the number of times something occurs (sometimes referred to as indicator variables). Moving forward, we can graph functions from $\mathbb{R}^n$ to $\mathbb{R}^m$. That being said, the graph of a function lives in a space whose dimension is the sum of the dimensions of the input and the output! As an equation you can use, the number of dimensions your graph needs is:

$$\dim(\text{graph}) = \dim(\text{input}) + \dim(\text{output}) \tag{2}$$

In words, the dimension of the space you graph in must equal the dimension of the input space plus the dimension of the output space. This is why we can't really graph a function when that sum is greater than three (that is, four or more)! We can also write what our functions are doing in equations. For example, say your function $f$ is a function from $\mathbb{R}^3$ to $\mathbb{R}$. We can write this conveniently with an arrow as follows:

$$f : \mathbb{R}^3 \longrightarrow \mathbb{R} \tag{3}$$


What this means is that the function $f$ takes in three inputs, say $(x, y, z)$, and outputs a single number, say $w$.

Info unrelated to lecture, but for those interested: functions can also take in matrices and output matrices. For example, you might see something of the form $f : \mathbb{R}^{n \times p} \longrightarrow \mathbb{R}^{m \times q}$. That is, you don't have to input $n$ numbers and output $m$ numbers; you can instead input $n$ by $p$ matrices and output $m$ by $q$ matrices. We can see this as more motivation for how extensive the field of functions really is.

2.5 Level Curves of Functions

A level set is the set of all points in the domain of a function that map to the same output! When I first learned about level sets, I did not really see the point, but in higher dimensions they are definitely helpful. Say you have the function $f(x, y, z) = x^2 + y^2 + z^2$. Using our equation about the dimensions necessary (Equation 2), we see we cannot physically graph this function since four dimensions are necessary ($3 + 1 = 4$). However, we definitely can graph the level sets. To do this, we set the function equal to a number; let's set it equal to 1 just for ease. Then we have $1 = x^2 + y^2 + z^2$, and this is the unit sphere! We can graph that since it only needs three dimensions :)

3 Recitation I on July 2, 2019

Great work on the quiz! Also, I am really impressed by all of the problem solving abilities and different techniques I saw being deployed on the worksheet. There is not much I have to add other than a small discussion on level curves. As many of you may have seen, the level curve of a function $f$ is the graph of the equation generated by picking a specific value for $f$. In addition, it is important to make sure that the level curves you are picking make sense and are "reasonable" for the function being analyzed. Let me use an example for clarity. Suppose we have the function:

$$f(x, y) = e^{-(x^2 + y^2)} \tag{4}$$

Let's first be very careful about organizing our thoughts. We note that $f : \mathbb{R}^2 \longrightarrow \mathbb{R}$. Next, let's figure out which dimension the level curves live in. The dimensions necessary for the level curves are exactly the same as the dimensions of the input space. Thus, for this case we need $\mathbb{R}^2$ in order to graph the level curves. Now let's go on to what the graphs look like. In order to construct a level curve, we set our function equal to some constant:

$$c = f(x, y) = e^{-(x^2 + y^2)} \tag{5}$$

So, if we want to find the level curve, let us isolate the $x$ and $y$ terms:

$$-\ln c = x^2 + y^2 \tag{6}$$

Okay, great. But now we must ask: what values of $c$ can we use? Notice that our function has a domain of all of $\mathbb{R}^2$ and a range of $(0, 1]$, so we need $c \in (0, 1]$. Let's make sure this makes sense. For any $c$ in the range, we get $-\ln c \geq 0$, so the right-hand side is a valid squared radius. Thus, the level curves of the function are simply circles centered at the origin with radius $\sqrt{-\ln c}$ (for $c = 1$ the circle shrinks to the single point at the origin). This function is a Gaussian (bell-shaped) surface, for those who have seen something similar before. It will be helpful later in the course to realize that the direction of steepest ascent on a function's surface is perpendicular to the level curve. Imagine the side of a mountain as the function, and the level curve as the set of points at a specific altitude of the mountain.
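As a quick numerical sanity check of the level-curve computation, here is a minimal sketch in Python. The function name `level_curve_radius`, the chosen values of $c$, and the sample angles are my own illustrative choices. It computes the radius $\sqrt{-\ln c}$ from Equation 6 and confirms that points on that circle really do give $f(x, y) = c$.

```python
import math


def f(x, y):
    """The function from Equation 4: f(x, y) = exp(-(x^2 + y^2))."""
    return math.exp(-(x ** 2 + y ** 2))


def level_curve_radius(c):
    """Radius of the level curve f(x, y) = c, valid only for c in (0, 1]."""
    if not 0.0 < c <= 1.0:
        raise ValueError("c must lie in the range (0, 1] of f")
    return math.sqrt(-math.log(c))


for c in (0.9, 0.5, 0.1):
    r = level_curve_radius(c)
    # Sample a few points on the circle of radius r centered at the origin.
    samples = [(r * math.cos(t), r * math.sin(t)) for t in (0.0, 1.0, 2.0, 4.0)]
    on_level_curve = all(math.isclose(f(x, y), c) for x, y in samples)
    print(f"c = {c}: radius = {r:.4f}, samples lie on level curve: {on_level_curve}")
```

Notice that smaller values of $c$ give larger circles, which matches the picture of a bell-shaped hill that gets lower as you walk away from the origin.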


4 Lecture II on July 3, 2019

4.1 Review

When we write

$$f : \mathbb{R}^2 \longrightarrow \mathbb{R} \tag{7}$$

we mean that the domain of $f$ is $\mathbb{R}^2$ and the co-domain of the function is $\mathbb{R}$. Also, for functions whose input plus output dimension is greater than 3, we cannot graph the function itself, but we can instead graph its level sets. In the case of $f(x, y, z) = x^2 + y^2 + z^2$, we cannot visualize the function, but we can visualize its level sets as spheres.

4.2 Linear Transformation

Definition: A function $f : \mathbb{R}^2 \longrightarrow \mathbb{R}^2$ is linear if and only if:

$$f(x, y) = (ax + by,\; cx + dy) \quad \text{where } a, b, c, d \in \mathbb{R} \tag{8}$$

This is the definition of linear we will be utilizing throughout this course, so please get it down! We can also conveniently express this in matrix notation, which will be used when discussing this concept both here and in linear algebra courses you may take in the future:

$$f(x, y) = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \tag{9}$$


We can also introduce the 3-dimensional cousins that may come up sometime in lecture and recitation, although they are not of the utmost importance:

$$f(x, y, z) = (ax + by + cz,\; dx + ey + fz,\; gx + hy + iz) \quad \text{where } a, b, c, d, e, f, g, h, i \in \mathbb{R} \tag{10}$$

With corresponding matrix notation:

$$f(x, y, z) = \begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} \tag{11}$$

Geometrically, linear transformations are built from scalings, shears, rotations, and projections. One of the things that people have trouble getting their head around is the fact that $y = mx + b$ is not linear. It is not linear because the constant $b$ means the origin is not mapped to the origin for any non-zero $b$. Some examples of linear transformations are:

$$f(x, y) = (x + y,\; x + y) \tag{12}$$

$$f(x, y) = (0, 0) \tag{13}$$

$$f(x, y) = (x, y) \tag{14}$$

where we choose $a$, $b$, $c$, and $d$ as the constants that produce each expression. If we want to scale our input by some factor $c$, we can represent this as the linear transformation:

$$f(x, y) = (cx, cy) \tag{15}$$


for some scalar $c$. We can also rotate the plane by an angle $\theta$ with the linear transformation below:

$$f(x, y) = (x\cos\theta - y\sin\theta,\; x\sin\theta + y\cos\theta) \tag{16}$$

So, for the case where we want to rotate the plane by $\pi/2$, we can plug this into the aforementioned equation to obtain:

$$f(x, y) = (-y, x) \tag{17}$$

We can project the plane, say onto the $x$-axis, with the following transformation:

$$f(x, y) = (x, 0) \tag{18}$$

Here we effectively compress all of the different $y$-values for a given $x$. We can also have a shear. A classic example of a shear that you may see in a complex variables and/or engineering course is:

$$f(x, y) = (x + y,\; y) \tag{19}$$

The interesting idea here is that area is preserved. Although the unit square is transformed into a parallelogram, the base and the height remain constant, and as such the area remains constant. Also, if you think about the determinant as a means of calculating the area, then we see that the area scale factor is $ad - bc = 1 \cdot 1 - 1 \cdot 0 = 1$. A convenient way to characterize linear transformations is with the following theorem.

Theorem: A function from $\mathbb{R}^2 \longrightarrow \mathbb{R}^2$ is linear if and only if it maps the origin to the origin and equally spaced lines to equally spaced lines or points. We can check this against the different families of linear transformations (shear, project, scale, and rotate), which all satisfy the theorem above.
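To see these families in action, here is a minimal sketch in Python that applies a few $2 \times 2$ matrices to the corners of the unit square. The helper names `apply` and `rotation` and the particular matrices are my own choices for illustration; the shear used here is the one with rows $(1, 1)$ and $(0, 1)$, which sends $(x, y)$ to $(x + y, y)$.

```python
import math


def apply(matrix, point):
    """Apply the 2x2 matrix ((a, b), (c, d)) to the point (x, y)."""
    (a, b), (c, d) = matrix
    x, y = point
    return (a * x + b * y, c * x + d * y)


def rotation(theta):
    """Matrix of the counterclockwise rotation by the angle theta."""
    return ((math.cos(theta), -math.sin(theta)),
            (math.sin(theta), math.cos(theta)))


transformations = {
    "scale by 2": ((2, 0), (0, 2)),
    "rotate by pi/2": rotation(math.pi / 2),
    "project onto x-axis": ((1, 0), (0, 0)),
    "shear": ((1, 1), (0, 1)),
}

unit_square = [(0, 0), (1, 0), (0, 1), (1, 1)]

for name, matrix in transformations.items():
    # Round only for display, to hide tiny floating point errors from cos/sin.
    image = [tuple(round(v, 10) for v in apply(matrix, p)) for p in unit_square]
    print(f"{name}: {image}")
```

In every case the origin stays at the origin, which is the first condition of the theorem. A map like $y = mx + b$ with $b \neq 0$ would fail that check immediately.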


4.3 Determinants

Determinants are really neat. I definitely did not know there was a geometric meaning to what a determinant was. I just thought it was an annoying computation, so hopefully this section proves interesting! Determinants are all about how linear transformations distort areas. Let's consider a number line first. Consider the linear transformation $f(x) = 3x$. We want to ask ourselves: how does this distort lengths/areas/volumes? In this case, we see that the length between two numbers triples. The 3 in front of the $x$ acts as a distorter of the original lengths. Thus, we will call 3 the determinant of the function, since this is the distortion factor of length. More generally, given a function $f : \mathbb{R} \longrightarrow \mathbb{R}$,

$$f(x) = mx \quad \text{for } m \in \mathbb{R} \tag{20}$$

we say that the determinant is $m$. We can think of $m$ as the signed factor by which $f$ transforms lengths. Again, this would not work if we had $f(x) = mx + b$, since the origin would not be mapped to the origin!

4.4 2D Determinants

Now let's look at how area can be distorted. For the 2D case, consider $f : \mathbb{R}^2 \longrightarrow \mathbb{R}^2$,

$$f(x, y) = (ax + by,\; cx + dy) \tag{21}$$

Let's try to figure out the area of the image of the unit square under this linear transformation. Let's start by seeing where each of the 4 vertices gets mapped. We see that $(0, 0), (1, 0), (0, 1), (1, 1)$ get mapped to $(0, 0), (a, c), (b, d), (a + b, c + d)$ respectively. We can calculate the area through some interesting geometry. Let's start by calculating the area of the massive rectangle that has base $(a + b)$ and height $(c + d)$. Then, we can subtract off the excess that is not part of the parallelogram! After some algebraic manipulation, we get that the signed area is simply $ad - bc$. Thus, we can define the 2D determinant as:

$$\text{area} = \det \begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc \tag{22}$$
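For those who want to see the "algebraic manipulation" spelled out, here is one way it can go. This is a sketch under an assumption I am adding: $a, b, c, d > 0$ with $ad > bc$, so that the parallelogram sits inside the big rectangle in the usual picture. The excess then consists of two corner rectangles of size $b \times c$, two right triangles with legs $a$ and $c$, and two right triangles with legs $b$ and $d$.

```latex
% Assumption (not from lecture): a, b, c, d > 0 and ad > bc,
% so the excess splits into two rectangles and four right triangles.
\begin{align*}
\text{area}
  &= \underbrace{(a+b)(c+d)}_{\text{big rectangle}}
     - \underbrace{2\,bc}_{\text{two corner rectangles}}
     - \underbrace{2 \cdot \tfrac{1}{2}\,ac}_{\text{two triangles with legs } a,\, c}
     - \underbrace{2 \cdot \tfrac{1}{2}\,bd}_{\text{two triangles with legs } b,\, d} \\
  &= (ac + ad + bc + bd) - 2bc - ac - bd \\
  &= ad - bc
\end{align*}
```

Other sign patterns for $a, b, c, d$ change the picture, but the same formula $ad - bc$ comes out, with a negative value when the orientation is reversed.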

Again, for ease, I will introduce the matrix notation of the linear transformation that you will see all over the place in classes at MIT.

$$f(x, y) = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = (ax + by,\; cx + dy) \tag{23}$$

Let's think about some interesting cases. In the case that the determinant is $-1$, the orientation of the area is reversed! This is the 2D analog of reversing the length. In addition, let us look at the case when the linear transformation turns area into a line. Well, a line has zero area, and as such, the determinant of such a linear transformation is exactly zero. You may notice while working that this occurs when one row of the linear transformation matrix is a scalar multiple of the other row.

Let's try some examples:

$$f(x, y) = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \tag{24}$$

Then, the determinant of our linear transformation is $ad - bc = 1 - 1 = 0$. What does this mean? This linear transformation smashes everything down into a line. Thus, the area is zero. Let's look at the shear case:

$$f(x, y) = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \tag{25}$$

We see that the determinant is $ad - bc = 1 - 0 = 1$. This confirms that area is unchanged by the shear transformation. Let's finally do an arbitrary rotation matrix. We would not expect that simply rotating would change the area. Let's confirm this:

$$f(x, y) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \tag{26}$$

Let's calculate the determinant of this linear transformation, namely $\det = ad - bc = \cos^2\theta + \sin^2\theta = 1$, which confirms our suspicion! Let's look at the linear transformation that flips the unit square over the $x$-axis, namely:

$$f(x, y) = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \tag{27}$$

Then we can calculate the determinant as $ad - bc = -1$, which again checks out: the size of the area is unchanged by flipping over the axis, but the orientation flips, leading to the negative sign.
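The examples above are easy to check by machine. Here is a minimal sketch in Python; the function name `det2` and the dictionary of examples are my own for illustration. It evaluates $ad - bc$ for the four matrices in Equations 24 through 27, using $\theta = 0.7$ as an arbitrary angle for the rotation.

```python
import math


def det2(matrix):
    """Determinant ad - bc of the 2x2 matrix ((a, b), (c, d))."""
    (a, b), (c, d) = matrix
    return a * d - b * c


theta = 0.7  # an arbitrary angle chosen for the rotation example

examples = {
    "collapse onto a line": ((1, 1), (1, 1)),
    "shear": ((1, 1), (0, 1)),
    "rotation": ((math.cos(theta), -math.sin(theta)),
                 (math.sin(theta), math.cos(theta))),
    "flip over x-axis": ((1, 0), (0, -1)),
}

for name, matrix in examples.items():
    print(f"{name}: det = {det2(matrix):.4f}")
```

The collapse has determinant 0, the shear and rotation have determinant 1 (up to floating point rounding for the rotation), and the flip has determinant $-1$.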

4.5 3D Determinants

In this course, the largest matrices we will work with are $3 \times 3$. To be honest, I don't think any course makes you actually compute determinants any larger than this. Anyways, in 3D the determinant represents the signed factor by which $f$ transforms volumes. Consider a given matrix,

$$A = \begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix} \tag{28}$$

The easiest way to compute the determinant is by decomposing the $3 \times 3$ matrix into smaller matrices. In order to do this, we pick a row. For convenience, I will choose the first row of my matrix. Then the determinant can be expressed as the following equation:

$$\det(A) = \det \begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix} = a \det \begin{pmatrix} e & f \\ h & i \end{pmatrix} - b \det \begin{pmatrix} d & f \\ g & i \end{pmatrix} + c \det \begin{pmatrix} d & e \\ g & h \end{pmatrix} \tag{29}$$

You may now just use the rule we know for a $2 \times 2$ matrix, and then multiply by the scalar out front! This makes the three-dimensional case not as daunting. Let's fully carry through the multiplication:

$$\det(A) = a(ei - fh) - b(di - fg) + c(dh - eg) \tag{30}$$

$$\det(A) = aei - afh + bfg - bdi + cdh - ceg \tag{31}$$

You may notice that for three dimensions there is a plus-minus pattern as I went across. This is because a checkerboard pattern is in effect that alternates between plus and minus. The checkerboard pattern for a $3 \times 3$ matrix looks like this:

$$\begin{pmatrix} + & - & + \\ - & + & - \\ + & - & + \end{pmatrix} \tag{32}$$

The general strategy should be that you assign a plus to the first item in your matrix, in the upper left hand corner, and then you follow the checkerboard pattern! The checkerboard pattern is very important so you don't end up adding something you should subtract or vice-versa.
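The first-row expansion in Equation 29 translates almost word for word into code. Here is a minimal sketch in Python; the helper names `det2`, `minor`, and `det3` and the test matrices are my own choices for illustration. The sign $(-1)^{\text{col}}$ reproduces the top row of the checkerboard.

```python
def det2(matrix):
    """Determinant ad - bc of a 2x2 matrix given as two rows."""
    (a, b), (c, d) = matrix
    return a * d - b * c


def minor(matrix, row, col):
    """The 2x2 matrix left after crossing out the given row and column."""
    return [
        [value for j, value in enumerate(r) if j != col]
        for i, r in enumerate(matrix)
        if i != row
    ]


def det3(matrix):
    """Determinant of a 3x3 matrix by expanding along the first row."""
    total = 0
    for col in range(3):
        sign = (-1) ** col  # +, -, + across the first row
        total += sign * matrix[0][col] * det2(minor(matrix, 0, col))
    return total


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
stretch = [[2, 0, 0], [0, 3, 0], [0, 0, 4]]
flat = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

print(det3(identity))  # volumes unchanged
print(det3(stretch))   # volumes scaled by 2 * 3 * 4
print(det3(flat))      # everything squashed to zero volume
```

The three results are 1, 24, and 0, matching the interpretation of the determinant as a signed volume scale factor.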

4.6 Matrix Operations (ASE)

One of the most important things that you will probably be asked, with high probability, is how to compute the inverse of a matrix and use it to help solve a linear system! Let me first walk you through a problem that I put in the additional problems for Chapter 1 that will help us compute the inverse of a matrix. The problem statement is lengthy, so try to stay awake reading it!

Problem: Solving for a matrix's inverse is common practice for an 18.02 ASE. I will now walk you through solving such a problem given your current knowledge of matrices, as we have all the tools that are required. We will just need to throw in some new jargon. First and foremost, a matrix $A$ has an inverse if and only if $\det(A) \neq 0$. This must hold true in order for us to proceed. In linear algebra speak, the columns of $A$ must be linearly independent in order for there to exist an inverse. Now suppose we have the $3 \times 3$ matrix provided below.

$$A = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 0 & 2 \\ 0 & 0 & 1 \end{pmatrix} \tag{33}$$

You can quickly check that indeed $\det(A) \neq 0$. We will now compute the inverse by following a recipe. First and foremost, we will expand along cofactors. Do not mind the word, but you may see it in other courses. What this means is the following: say we are looking at the $A_{ij}$ entry, which sits in the $i$th row and $j$th column. We cross out this row and this column, take the determinant of what is left (which should be a 2 × 2 matrix in our case), and put that in the $ij$ spot of some newly created 3 × 3 matrix, let's call it $B$. I will do the top one for you. For $A_{11}$ I delete the first row and first column. I am now left with a smaller matrix that has determinant equal to zero, so I plug 0 into the $B_{11}$ spot. You now complete the rest. Okay, that's the hard part. Now, we follow a checkerboard pattern of changing the sign on our entries. I will display the pattern below:

\[
\begin{pmatrix} + & - & + \\ - & + & - \\ + & - & + \end{pmatrix} \tag{34}
\]

Okay, so simply look at the matrix, $B$, that you created and negate the entries that have negative signs in the above checkerboard pattern. Alright! We are getting closer. Let's call this new matrix, in which we switched the sign of every other entry, $C$. Okay, we will finally now take the transpose of $C$. All this means is that we will swap the rows and the columns. Thus, column one is now row one and so on. We commonly write this as $C^T$. Boom! And that is it! We will then just divide everything by $\det(A)$. We will call our final product $A^{-1}$.

Solution: Let's first compute the determinant of A as we will have to use it later. det(A) = −1.

Okay now let's hop in. Let's do this expand by cofactor thing. I will do this now:

B11 = 0 (35)

B12 = 1 (36)

B13 = 0 (37)

B21 = 1 (38)

B22 = 1 (39)

B23 = 0 (40)

B31 = 2 (41)

B32 = 2 (42)

B33 = −1 (43)

Okay great. Now we will implement the checkerboard pattern displayed in the problem

statement and as such flip the signs of every other entry.

C11 = 0 (44)

C12 = −1 (45)

C13 = 0 (46)

C21 = −1 (47)

C22 = 1 (48)

C23 = 0 (49)

C31 = 2 (50)

C32 = −2 (51)

C33 = −1 (52)

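Let's now put this together and make the matrix $C$:

\[
C = \begin{pmatrix} 0 & -1 & 0 \\ -1 & 1 & 0 \\ 2 & -2 & -1 \end{pmatrix} \tag{53}
\]

Let's now take the transpose of this matrix, as described, by swapping the rows and the columns:

\[
C^T = \begin{pmatrix} 0 & -1 & 2 \\ -1 & 1 & -2 \\ 0 & 0 & -1 \end{pmatrix} \tag{54}
\]

We will finally divide everything by the determinant, $\det(A) = -1$, to get $A^{-1}$, which is displayed below:

\[
A^{-1} = \begin{pmatrix} 0 & 1 & -2 \\ 1 & -1 & 2 \\ 0 & 0 & 1 \end{pmatrix} \tag{55}
\]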

You can also check and confirm that $AA^{-1} = A^{-1}A = I$, where $I$ is the identity matrix, denoted as:

\[
I = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{56}
\]

Well, that took some time, but this shows how to compute a matrix inverse. I think that computing such a thing is best shown just through an example. So hopefully that was helpful. The good thing here is that you can always check whether you made a mistake or not. How? Well, since $AA^{-1} = A^{-1}A = I$, we can always multiply our two matrices together and check that we obtain the identity.
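The recipe above (minors, checkerboard signs, transpose, divide by the determinant) can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the names `minor_det` and `inverse3` are made up, the entries are assumed to be integers, and `Fraction` is used so that the division stays exact.

```python
from fractions import Fraction


def minor_det(m, i, j):
    """Determinant of the 2x2 matrix left after deleting row i and column j."""
    rows = [r for k, r in enumerate(m) if k != i]
    (a, b), (c, d) = [[x for k, x in enumerate(r) if k != j] for r in rows]
    return a * d - b * c


def inverse3(m):
    """Inverse of a 3x3 integer matrix: minors, signs, transpose, divide."""
    # C[i][j] is the minor at (i, j) with the checkerboard sign applied.
    C = [[(-1) ** (i + j) * minor_det(m, i, j) for j in range(3)]
         for i in range(3)]
    det = sum(m[0][j] * C[0][j] for j in range(3))  # first-row expansion
    if det == 0:
        raise ValueError("matrix is not invertible (det = 0)")
    # Transpose C (swap the indices) and divide every entry by det.
    return [[Fraction(C[j][i], det) for j in range(3)] for i in range(3)]


A = [[1, 1, 0], [1, 0, 2], [0, 0, 1]]
for row in inverse3(A):
    print([int(x) for x in row])
```

For the matrix in equation 33 this prints the rows of equation 55, which is a nice way to confirm that you did not drop a sign somewhere by hand.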

4.6.1 Solving a Linear System

Consider you have the following equation:

Ax = b (57)

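where $A$ is some 3 × 3 matrix, $x$ is some 3 × 1 matrix thought of as a vector, and $b$ is some 3 × 1 matrix thought of as a vector. Let's just show this in its full form so that we know what we are referring to:

\[
\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix}
= \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix} \tag{58}
\]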

In these types of problems, the question will generally give you A and b and ask you to

solve for x. They write questions like this so that they first see if you can do some matrix

operations and then solve a system of equations. Here is the important punchline of this section. If $A$ is invertible, then:

\[
Ax = b \tag{59}
\]

\[
A^{-1}Ax = A^{-1}b \tag{60}
\]

However, we have already discussed that for an invertible matrix, $A^{-1}A = I$. As such,

\[
Ix = A^{-1}b \tag{61}
\]

\[
x = A^{-1}b \tag{62}
\]

Of course, this only works when the matrix is invertible. However, now we have a quick

way to solve for the x vector that makes this true. This is equivalent to solving a 3 equation

system of equations. So, what I would expect from this section is the ASE potentially asking

you to first solve for the inverse of some matrix and then use that to solve for some x vector

in part b that solves some system of equations.
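Here is a small sketch of that two-part workflow in Python, reusing the matrix from equation 33 and its inverse from equation 55. The right-hand side `b = [1, 2, 3]` is a hypothetical choice of mine, just to have numbers to plug in, and `matvec` is an illustrative helper name.

```python
def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


A = [[1, 1, 0], [1, 0, 2], [0, 0, 1]]          # equation 33
A_inv = [[0, 1, -2], [1, -1, 2], [0, 0, 1]]    # equation 55
b = [1, 2, 3]                                  # hypothetical right-hand side

x = matvec(A_inv, b)       # x = A^{-1} b, as in equation 62
print(x)                   # [-4, 5, 3]
print(matvec(A, x) == b)   # True: plugging x back in recovers b
```

The last line is the same sanity check as before: always plug your answer back into the original system.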

5 Lecture III on July 5, 2019

If your name wasn’t learned, then press F for you. Your name is not learnt.

5.1 Introduction to Vectors

Vectors will be one of the main objects that we will confront in this course, whether

we are calculating distances in space or fluxes through surfaces. Let's get the basics down today, so we can concern ourselves with all the applications later. A vector is an arrow from one point to another in $\mathbb{R}^n$. We won't need to concern ourselves with all $n$ dimensions. Instead, we should make sure we are proficient in both $\mathbb{R}^2$ and $\mathbb{R}^3$, with vectors like (2, 1) and (1, 2, 1) respectively as examples. A vector has a magnitude and a direction. The length (magnitude) of the vector is the distance from its tail to its head. We can also break down the vector into its components ($x$ and $y$ in two dimensions). Two vectors are equivalent if their components are equal. For

example (2, 1) and (4, 2) are not equivalent. While both of these vectors are in the same

direction, notice that the first vector has a smaller magnitude in comparison to the second

vector.

5.1.1 Addition

Suppose we have vectors $\vec{v} = (v_1, v_2)$ and $\vec{u} = (u_1, u_2)$. We can add them component by component:

\[
\vec{v} + \vec{u} = (v_1, v_2) + (u_1, u_2) = (v_1 + u_1, v_2 + u_2) \tag{63}
\]

If we want to represent this on the coordinate grid, we would first draw ~v placing the tail of

the vector on the origin. Then, we will place the tail of ~u at the head of ~v. We then draw a

new vector, that represents the addition of the two vectors from the tail of ~v to the head of

~u. We sometimes call this the resultant vector. Do not get lost in the jargon though. The conceptual understanding is the most important aspect. The jargon only adds value once we are fluent in the concept.

5.1.2 Scalar Multiplication

Suppose we have $\vec{v} = (v_1, v_2)$. If we want to multiply our vector by some scalar (constant), $c$, we get that:

\[
c\vec{v} = c(v_1, v_2) = (cv_1, cv_2) \tag{64}
\]

What does this do? We see that this scales the original vector whilst keeping the result parallel (or anti-parallel if the constant is negative) to the original vector, $\vec{v}$.

5.2 More Vector Properties

We say that two nonzero vectors, ~v and ~u, are parallel if and only if,

~u = c~v for c ∈ R (65)

Again, given two vectors, ~v and ~u,

c(~u+ ~v) = c~u+ c~v (66)

This is sometimes considered the distributive property of scalar multiplication. We can make

a quick proof of this in two dimensions,

c(~u+ ~v) = c~u+ c~v (67)

c(~u+ ~v) = c(u1 + v1, u2 + v2) (68)

c(~u+ ~v) = (cu1 + cv1, cu2 + cv2) (69)

c(~u+ ~v) = (cu1, cu2) + (cv1, cv2) (70)

c(~u+ ~v) = c(u1, u2) + c(v1, v2) (71)

c(~u+ ~v) = c~u+ c~v (72)

5.3 Applying these concepts

Problem Use vectors to show that the line segment joining two midpoints of the sides

of a triangle is parallel to the third side and half its length.

Solution Let's start with some arbitrary triangle. Let's first label the three vertices of the triangle as $A$, $B$, and $C$, labeling in a counterclockwise orientation. Let's now construct a few vectors, naming them by the two points involved. In addition to these points, let's label the point $D$ as the midpoint of $\overrightarrow{AB}$ and $E$ as the midpoint of $\overrightarrow{AC}$. Our goal is to show that:

\[
\frac{1}{2}\overrightarrow{BC} = \overrightarrow{DE} \tag{73}
\]

At this point, make sure your drawing is showing this so that we are on the same page. In addition, by the way we placed points $D$ and $E$, we get that:

\[
\overrightarrow{AE} = \overrightarrow{EC} = \frac{1}{2}\overrightarrow{AC} \tag{74}
\]

as well as:

\[
\overrightarrow{AD} = \overrightarrow{DB} = \frac{1}{2}\overrightarrow{AB} \tag{75}
\]

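Now let's combine some steps. Using vector addition we can go around the smaller triangle, $ADE$, which in vector notation reads:

\[
\overrightarrow{AD} + \overrightarrow{DE} = \overrightarrow{AE} \tag{76}
\]

which I can rearrange as a vector subtraction:

\[
\overrightarrow{DE} = \overrightarrow{AE} - \overrightarrow{AD} \tag{77}
\]

In addition, we can now look at the larger triangle, $ABC$, and get an analogous expression from vector subtraction:

\[
\overrightarrow{BC} = \overrightarrow{AC} - \overrightarrow{AB} \tag{78}
\]

Let's now plug in the half-length relations from equations 74 and 75 to get that:

\[
\overrightarrow{DE} = \overrightarrow{AE} - \overrightarrow{AD} = \frac{1}{2}\overrightarrow{AC} - \frac{1}{2}\overrightarrow{AB} \tag{79}
\]

\[
\overrightarrow{DE} = \overrightarrow{AE} - \overrightarrow{AD} = \frac{1}{2}\left(\overrightarrow{AC} - \overrightarrow{AB}\right) = \frac{1}{2}\overrightarrow{BC} \tag{80}
\]

Therefore, we have shown that $\overrightarrow{DE} = \frac{1}{2}\overrightarrow{BC}$. By equation 65, since we can express $\overrightarrow{DE}$ as a multiple of $\overrightarrow{BC}$, these two sides are parallel to one another. Taking magnitudes of both sides gives $|\overrightarrow{DE}| = \frac{1}{2}|\overrightarrow{BC}|$, so the segment is also half the length of the third side.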
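If you would like to see the midpoint result with actual numbers, here is a quick Python check. The triangle vertices are hypothetical values I picked, and `sub` and `midpoint` are just illustrative helper names. One example is of course not a proof, the vector argument above is, but it is a good sanity check.

```python
def sub(p, q):
    """Vector from q to p, computed component by component."""
    return tuple(a - b for a, b in zip(p, q))


def midpoint(p, q):
    """Midpoint of the segment from p to q."""
    return tuple((a + b) / 2 for a, b in zip(p, q))


# Hypothetical triangle, chosen only for illustration.
A, B, C = (0, 0), (4, 0), (1, 3)

D = midpoint(A, B)   # midpoint of AB
E = midpoint(A, C)   # midpoint of AC

DE = sub(E, D)
half_BC = tuple(x / 2 for x in sub(C, B))

print(DE, half_BC)   # the two vectors agree: (-1.5, 1.5) (-1.5, 1.5)
```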

5.4 The Dot Product

Now we will move to the more interesting and useful application of vectors that will be

more extensively used throughout the course. Consider we have two vectors, ~u and ~v. Then

we can express the dot product as:

~u · ~v = (u1, u2) · (v1, v2) = u1v1 + u2v2 (81)

Therefore, we essentially are multiplying together the respective components, and then we

are adding all of them up together to give a scalar (number) value. We note that the dot

product is a measure of how parallel the two vectors are to one another, and you can think

of it as projecting one of the vectors along the direction of the other. A fact that will help

on your Homework (ooo hints in the lecture notes, another bonus of reading) is that:

~u · ~u = |~u|2 (82)

This is the case because:

\[
\vec{u} \cdot \vec{u} = (u_1, u_2) \cdot (u_1, u_2) = u_1^2 + u_2^2 = \left(\sqrt{u_1^2 + u_2^2}\right)^2 = |\vec{u}|^2 \tag{83}
\]

Another really big key equation that we will have LARGE amounts of time with is:

~u · ~v = |~u||~v| cos θ for θ ∈ [0, π] (84)

5.4.1 Small Examples

Lets now apply this to a small example. Consider you see that:

~u · ~v = 0 (85)

What can we say about $\vec{u}$ and $\vec{v}$? Well, since we can use our alternative expression for the dot product involving the angle, we see that:

\[
\vec{u} \cdot \vec{v} = |\vec{u}||\vec{v}| \cos\theta = 0 \tag{86}
\]

If this equals zero, and neither of the two vectors is the zero vector, then $\cos\theta = 0$. Thus, the angle must be $\theta = \pi/2$. Therefore, the two vectors are perpendicular, or orthogonal.

Problem Find the angle between a face diagonal and a space diagonal of a cube.

Solution Let's start off by making vector expressions from the cube. The face diagonal is basically moving from the point (0, 0) to the point (1, 1). Since we are using a cube and going into 3D, let's express our first vector as going from (0, 0, 0) to (1, 1, 0). Thus,

\[
\vec{v} = (1, 1, 0) \tag{87}
\]

Now, let's go across the cube. First we need to cross to the other side like the first vector we created, but then we also need to head up to the top corner. Namely, we need to make it from the origin all the way up to the point (1, 1, 1). Therefore, we get that:

\[
\vec{u} = (1, 1, 1) \tag{88}
\]

Thus, let us now deploy our new equation for the dot product to solve for the angle:

\[
|\vec{u}||\vec{v}| \cos\theta = \vec{u} \cdot \vec{v} \tag{89}
\]

\[
|(1, 1, 1)|\,|(1, 1, 0)| \cos\theta = (1, 1, 1) \cdot (1, 1, 0) \tag{90}
\]

\[
\sqrt{3}\sqrt{2} \cos\theta = 2 \tag{91}
\]

Therefore we can express $\cos\theta$ as:

\[
\cos\theta = \frac{2}{\sqrt{6}} \tag{92}
\]

which is enough to calculate what our angle is!
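To actually get a number out of equation 92, you can let Python do the arccosine. This is a minimal sketch; `dot`, `norm`, and `angle_between` are helper names I made up for illustration.

```python
import math


def dot(u, v):
    """Dot product: multiply matching components and add them up."""
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    """Magnitude of a vector, using u . u = |u|^2."""
    return math.sqrt(dot(u, u))


def angle_between(u, v):
    """Angle in radians between two nonzero vectors, from u . v = |u||v| cos(theta)."""
    cos_theta = dot(u, v) / (norm(u) * norm(v))
    # Clamp to [-1, 1] to guard against tiny floating point overshoot.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.acos(cos_theta)


theta = angle_between((1, 1, 1), (1, 1, 0))   # space diagonal vs face diagonal
print(math.degrees(theta))                    # roughly 35.26 degrees
```

The same function also confirms the earlier example: two perpendicular vectors, such as (1, 0, 0) and (0, 1, 0), give an angle of π/2.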

Why is $\vec{i} \cdot \vec{j} = 0$ for the unit vectors $\vec{i}$ and $\vec{j}$? Well, we generally use $\vec{i}$ to represent the direction of the x-axis and $\vec{j}$ to represent the direction of the y-axis. Therefore, the two vectors are perpendicular to one another, which ensures that the dot product is zero, given our first example in this subsection.

5.5 The Cross Product

Cross Product is the more annoying brother of the dot product, so brace yourselves. We

will pretty much exclusively calculate the cross product in R3. The formula for the cross

product is the following:

~u× ~v = (u2v3 − u3v2, u3v1 − u1v3, u1v2 − u2v1) (93)

Well,[2] this looks like a mess. A much more convenient way to represent this is:

\[
\vec{u} \times \vec{v} = \det\begin{pmatrix} \vec{i} & \vec{j} & \vec{k} \\ u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \end{pmatrix} \tag{94}
\]

If you expand out this determinant you will see that equations 93 and 94 are equivalent, but you can trust me on it.

Now you may ask, what does the cross product do geometrically? The cross product takes

in two vectors, say ~u and ~v and it produces a third vector, say ~w that is perpendicular to

both ~u and ~v. This is extremely powerful, and we will use it about the same as how much

we use the dot product throughout the course. In addition, we can also represent the cross

product in the following fashion that is useful for angle calculation sometimes:

|~u× ~v| = |~u||~v| sin θ (95)

Therefore, if two vectors are parallel, then the cross product is exactly zero. Why? Well, if

two vectors are parallel, or even anti parallel, then θ = 0 or θ = π. In either case, sin θ will

always be zero. Therefore, the cross product must be zero. Since I see what I am about to say come up a lot, let me mention it briefly: in the above formula (equation 95), note that the LHS is the magnitude of the cross product! Now look at the RHS. Maybe you remember that for a parallelogram with side lengths $A$ and $B$ and angle $\theta$ between them, the area of the parallelogram is $AB \sin\theta$. This tells us that the RHS represents the area of a parallelogram. Thus, since LHS = RHS, the magnitude of the cross product is also the area of the parallelogram made by the two vectors. Keep this in mind for sometime in the future :) The facts above are extremely useful to know, and I recommend you do the following problem, which I have made for you, without any computation:

[2] I had a typo before, I am so sorry for this. It has been cleared up now! Always confirm, though, using the determinant formula.

5.5.1 Example Time

Problem Is the following statement True or False. Please provide a sound argument

as to why you think that it is either true or false: (Take this as a good practice of your

understanding before looking at the solution!)

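\[
\vec{u} \cdot (\vec{u} \times \vec{v}) = |\vec{u}|^2 + \vec{u} \cdot \vec{v} \tag{96}
\]

Solution FALSE! Note that the cross product in parentheses creates a vector that is perpendicular to both $\vec{v}$ and $\vec{u}$. Thus, the dot product between this vector and $\vec{u}$ must be equal to zero, so the left-hand side is always 0, while the right-hand side is generally not zero (for example, $\vec{u} = \vec{v} = (1, 0, 0)$ gives 0 on the left and 2 on the right).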
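Even though the point of the problem is to avoid computation, it can be reassuring to see it numerically afterwards. Here is a sketch using the component formula from equation 93; the vectors are hypothetical values I picked and `cross` and `dot` are illustrative helper names.

```python
def cross(u, v):
    """Cross product in R^3, following the component formula in equation 93."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Dot product: multiply matching components and add them up."""
    return sum(a * b for a, b in zip(u, v))


u, v = (1, 2, 3), (4, 5, 6)   # hypothetical vectors for illustration
w = cross(u, v)

print(w)                      # (-3, 6, -3)
print(dot(u, w), dot(v, w))   # 0 0: w is perpendicular to both u and v
print(dot(u, u) + dot(u, v))  # 46: the right-hand side of equation 96 is not 0
```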

5.6 Big Picture

Lets compare our results for the dot product and cross product. Note that the dot

product has a result that is a scalar quantity, namely just a number. However, the cross

product produces a vector, in our case a vector in $\mathbb{R}^3$, which just means a vector with 3 components. As a check when you answer a problem, make sure that this always remains true. Of course, if it is useful to work with a number once you calculate a cross product, then take its magnitude like we see in equation 95. Also, Sam said the cross product is cooler, I disagree #DotProductIsBetter.

5.7 3D Geometry with Lines

We are now edging closer to the calculus portion of the course! This chapter is generally

kind of difficult so please feel free to always reach out to me. I think computing distances between objects in space is very hard, and I will be uploading a set of notes about them with the recitation notes on Tuesday, July 9. Okay, back to the course. Suppose we want to represent a line in the coordinate grid, $\mathbb{R}^2$. We can write this as:

y = mx+ b (97)

We are going to stay in the spirit of this while making it more encompassing. Sorry that the notation is super wack, let me break it down for you. Suppose we have a point (a, b) on a line that points in direction $\vec{u}$, and let's say that the slope of the line happens to be $m$. We can represent the line as:

(a, b) + t~u t ∈ R (98)

What does this mean? It means that we can start at the point (a, b), and we can move along the line by adding $\vec{u}$. In addition, we can add any multiple of $\vec{u}$, and we will still remain on the line. For example, let's say the direction of the line is $\vec{u} = (1, m)$. Then, for each unit we move in the $x$ direction, we move in the $y$ direction by $m$, which looks a lot like a slope, right? Perfect! So let's pick (0, 0) as a base point on our line, and let's even choose $m = 1$ for simplicity. Then:

(0, 0) + t(1,m) = (0, 0) + t(1, 1) for t ∈ R (99)

As we plug in values of $t$, we can see that this gives us exactly the same thing as $y = x$. Hopefully this helped clarify things. Let's bump it up to $\mathbb{R}^3$. Suppose we want to write a line from (3, −4, 1) to (2, −1, 4). Let's pick the first point as our base point, and let's say that at $t = 1$, we make it to the second point. Then basically we need to solve for $\vec{u}$. This probably sounds a bit weird, so let's start working it out:

(3,−4, 1) + t~u = (3,−4, 1) + 1~u = (2,−1, 4) (100)

Alright, so we can rearrange this to solve for $\vec{u}$, which is just the vector from the first point to the second point, $\vec{u} = (2, -1, 4) - (3, -4, 1)$, and this gives the direction. Just note that I picked $t = 1$ for convenience, but we didn't need to. There are infinitely many ways to represent this line. I just think always picking $t = 1$ helps a lot. Okay, so we get that:

~u = (−1, 3, 3) (101)

Okay, so we can express this line as:

(3,−4, 1) + t(−1, 3, 3) = l = (x, y, z) (102)

for the line. Always check that the second point also lies on the line afterwards. If I plug

in t = 1, note that I indeed get (2,−1, 4), which is exactly what we wanted. In addition, I

also get all the other points that are on this line by plugging in different values of t. We can

equally represent this as:

x = 3− t, y = −4 + 3t, z = 1 + 3t for t ∈ R (103)
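The "base point plus t times direction" idea is easy to play with in code. Below is a small Python sketch; `line_through` is a name I invented, and it simply follows the convention used above that t = 0 gives the first point and t = 1 gives the second.

```python
def line_through(p, q):
    """Return the direction u = q - p and a function giving the point p + t*u."""
    u = tuple(b - a for a, b in zip(p, q))

    def point_at(t):
        return tuple(a + t * c for a, c in zip(p, u))

    return u, point_at


u, point_at = line_through((3, -4, 1), (2, -1, 4))

print(u)             # (-1, 3, 3), matching equation 101
print(point_at(0))   # (3, -4, 1), the base point
print(point_at(1))   # (2, -1, 4), the second point
print(point_at(2))   # (1, 2, 7), a point beyond the second one
```

Plugging in other values of t, including negative ones and fractions, walks you along the rest of the line.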

Hopefully this is starting to make sense. I struggled with this section a lot as a student, so please always reach out with questions. All we are saying here is that we want to represent all points on the line. So what we do is take two points and write an equation that takes us from one point to the other. When we include the $t$ factor, we are allowing not just for the segment from one point to the other, but for all points between and beyond the two points that lie along the line through them. Let's move on to planes.

5.8 3D Geometry and Planes

Lets start off with a problem to figure this out. Lets try to write an equation for a plane

that passes through (1, 0, 0), (0, 1, 1), (0, 0, 2). Lets really take our time with this one. I will

write up how to solve this, Saturday, and add it. I am sorry that the first week is really

overwhelming, I promise that I, and the rest of the TAs, will try our best to demystify it for

you. Let's proceed to do this now. So we have three points, and you may wonder: intuitively, is this enough to specify a plane? Do we need more, perhaps 4? Let's test this with a real-world case. Suppose we have some plane, let's say the floor of your dorm room. You are asked by the MIT facilities office whether you want a desk with three legs or four legs. You say you don't really mind so long as the desk is not wobbly. The facilities department immediately hands you a three-legged desk. Let's explore why. Consider placing one leg down at a time. With only the first leg, the desk will crash. The second leg will make it somewhat more stable, maybe only allowing it to tip in one or two directions. However, placing the third leg makes it stable. In fact, placing the third leg is analogous to placing the third point in $\mathbb{R}^3$ when defining a plane's equation. Now let's add a fourth leg. If the ground is perfectly even, then we're set! However, what if it is a little off? Well, I'm sure you have experienced this in real life before: the table will wobble. Why? The reason is that three of the legs are stably making contact with the ground (the plane), and the fourth leg does not necessarily lie on that plane anymore. The three points, or legs, define some flat space, the plane. A fourth leg, or point, is then in no way guaranteed to lie on that plane! Let's tackle the actual mathematics of this.

When we are handed three points, and we want to find the plane that passes through all of them, we are going to have to make two vectors. The reason is not obvious, and I will discuss it when it becomes more apparent later in the recipe. Okay, let's start at the point (1, 0, 0) and write the vectors to (0, 1, 1) and (0, 0, 2) as $\vec{v}$ and $\vec{u}$ respectively.

~v = (−1, 1, 1) (104)

~u = (−1, 0, 2) (105)

Okay great. Now let's bring up why we did this. Well, we want a nifty way to say: here is an equation that is satisfied by all points that lie on the plane, these three included, and is not satisfied by any point that does not. So, we want to say that all vectors that lie in the plane are perpendicular to some vector that is normal (perpendicular) to the plane. Thus, we can deploy the cross product now. We utilize the cross product because it takes two vectors, in this case two vectors in the plane, and generates a vector that is perpendicular to both. Thus, let's take the cross product of $\vec{v}$ and $\vec{u}$:

\[
\vec{n} = \vec{v} \times \vec{u} = \big(1 \cdot 2 - 1 \cdot 0,\; 1 \cdot (-1) - (-1) \cdot 2,\; (-1) \cdot 0 - 1 \cdot (-1)\big) = (2, 1, 1) \tag{106}
\]

The components of this vector will be the coefficients in the equation of our plane. Thus, given the previous discussion, we need all vectors that lie in the plane to be perpendicular to this vector, $\vec{n}$, namely $\vec{n} \cdot \vec{w} = 0$ for every $\vec{w}$ in the plane. We can construct $\vec{w}$ by using a similar strategy to the one we used to get $\vec{v}$ and $\vec{u}$. Let's take some arbitrary point (x, y, z) on the plane, and the point (1, 0, 0), which we already know is on the plane. We then can write $\vec{w}$ as:

~w = (x− 1, y − 0, z − 0) = (x− 1, y, z) (107)

Let's now take the dot product between this 'arbitrary' vector, $\vec{w}$, and $\vec{n}$ to get an equation for the plane:

(2, 1, 1) · (x− 1, y, z) = 2x− 2 + y + z = 0 (108)

2x+ y + z = 2 (109)

Great! This is the equation for this plane. Let's just test our three points very quickly to make sure that this does work.

2(1) + 0 + 0 = 2 (110)

2(0) + 1 + 1 = 2 (111)

0 + 0 + 2 = 2 (112)

5.8.1 TLDR Finding Equation of Plane

Looks good. In the future, you can take an alternative approach that I myself find more useful. I take the TLDR version, which is:

1. Write two vectors, $\vec{v}$ and $\vec{u}$, from the three points in the plane.

2. Take the cross product of the two vectors, making the vector $\vec{v} \times \vec{u} = \vec{n} = (a, b, c)$.

3. Write the equation ax + by + cz = d, where d is some unknown value I will calculate

in the next step of the formula.

4. Plug in a point on the plane to the equation to solve for the value of d

5. Write the equation ax+ by + cz = d, where a, b, c are found from step 2 and d is from

step 4. Plug in an extra point to make sure I didn’t mess up along the way!

Let me do a quick example right now of the TLDR version in action: calculate the equation of the plane passing through the points (1, 0, 0), (0, 1, 0), and (0, 0, 1).

1.

~v = (0, 1, 0)− (1, 0, 0) = (−1, 1, 0) (113)

~u = (0, 0, 1)− (1, 0, 0) = (−1, 0, 1) (114)

2.

~v × ~u = (1, 1, 1) (115)

3. The equation of this plane can be expressed, letting a, b, c = 1, 1, 1 as:

x+ y + z = d (116)

4. Plugging in the point, (1, 0, 0)

1 + 0 + 0 = d (117)

We get that d is just 1. Therefore we get that the equation of the plane is:

x+ y + z = 1 (118)
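The TLDR recipe translates almost line by line into Python. This is a sketch under my own naming (`plane_through`, `sub`, `cross`, `dot` are illustrative helpers), and it assumes the three points do not all lie on one line, since otherwise the cross product is the zero vector and no single plane is picked out.

```python
def sub(p, q):
    """Vector from q to p."""
    return tuple(a - b for a, b in zip(p, q))


def cross(u, v):
    """Cross product in R^3 (equation 93)."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def plane_through(p, q, r):
    """Return (n, d) so that the plane is n[0]*x + n[1]*y + n[2]*z = d."""
    v = sub(q, p)        # step 1: two vectors in the plane
    u = sub(r, p)
    n = cross(v, u)      # step 2: normal vector (a, b, c)
    d = dot(n, p)        # steps 3 and 4: plug in a known point to get d
    return n, d


print(plane_through((1, 0, 0), (0, 1, 0), (0, 0, 1)))  # ((1, 1, 1), 1)
print(plane_through((1, 0, 0), (0, 1, 1), (0, 0, 2)))  # ((2, 1, 1), 2)
```

The two printed results match equations 118 and 109 respectively. Step 5, checking an extra point, amounts to confirming that `dot(n, q) == d` and `dot(n, r) == d`.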

6 Lecture IV on July 8, 2019

6.1 Review

Press F for Sam’s Microphone. Lets have a quick recap/review for things from last time.

Reminder that the dot product can be expressed as:

~v · ~u = |~v||~u| cos θ (119)

We also can use this to directly show that:

~u · ~u = |~u|2 (120)

In addition, we also mentioned the cross product. A cross product takes in two vectors,

and it creates a third vector that is perpendicular to both of these vectors. In addition, the

magnitude of the cross product is the area of the parallelogram spanned by the two vectors.

The cross product equation is:

|~v × ~u| = |~v||~u| sin θ (121)

We also mentioned lines in space. We can write an equation for a line in space from two points on it. We define the line by a base point, namely one of the two points, along with a vector that is in the direction of the line. The magnitude of the vector does not matter since we can scale it up and down to reach all points on the line. If we have a base point $P$, and a direction of the line $\vec{u}$, then we can express the line as:

P + t~u for t ∈ R (122)

Which is just the mathematical representation of the idea in the previous paragraph. Suppose

we want to write a line passing through the points (0, 0, 0) and (1, 1, 1). We can then choose

(0, 0, 0) as our base point, P , and we can choose ~u to be the vector from the base point to

the other points, namely ~u = (1, 1, 1). Therefore, we can express the equation of the line as:

l = (x, y, z) = (0, 0, 0) + t(1, 1, 1) for t ∈ R (123)

6.2 Planes in Space

We mentioned this briefly last time, and I wrote up some notes over the weekend found

in section 5 of my lecture notes. They are detailed in solving the equation of a plane. Lets

carefully make our way through the problem mentioned at the end of class.

Problem: Find the equation of the plane passing through A = (1, 0, 0),B = (0, 1, 1),

and C = (0, 0, 2).

Solution: We want to start by first generating a vector normal to the plane that has all

three points contained in it. We do this because all vectors that lie within the plane will be

perpendicular to the normal vector. Thus lets first come up with a normal vector. We can

do this by making two vectors from the three points in the plane. We can make a vector

from ~AB and ~AC. Thus, we can write the normal vector as:

~n = ~AB × ~AC (124)

Now let's pick an arbitrary point $D = (x, y, z)$ in the plane. Then let's make the vector $\overrightarrow{AD}$ from the point $A$ to the point $D$. Notice that this vector must be perpendicular to the normal vector, since $\overrightarrow{AD}$ is in the plane and the normal vector is perpendicular to all vectors in the plane. The normal vector was not explicitly calculated here, but for this specific case it happens to be $\vec{n} = (2, 1, 1)$. Therefore, it must be true that:

\[
\overrightarrow{AD} \cdot \vec{n} = 0 \tag{125}
\]

\[
(x - 1, y, z) \cdot (2, 1, 1) = 0 \tag{126}
\]

\[
2x + y + z = 2 \tag{127}
\]

This is our equation of the plane! Now, all points that are on the plane will satisfy this

equation, and all points that are not on the plane will not satisfy this equation. A question

that came up in lecture was what happens if we have a parallel plane to the plane we just

calculated. Well, let's think about it. The normal vector's direction cannot change since the planes are parallel. Therefore, only the number on the right-hand side, 2 in our specific case, will change. Looking back, we can now read off the normal vector to a plane just by looking at its final equation. Namely, assume you have some plane with constants a, b, c, d expressed below:

ax+ by + cz = d (128)

Then, the normal vector for this equation is:

~n = (a, b, c) (129)

Lets try out an example problem now, by the way, space is big, lines are small - Sam :

Problem: Find the point where the line l = (x, y, z) = (3 + t, −2t, 3) and the plane x + y + z = 7 intersect.

Solution Let's see what we can do here. First, let's check whether the line does intersect the plane, since there is always the chance that it doesn't. Maybe the line runs parallel to the plane and never intersects it. In that case, there would not be a point that is shared between the line and the plane. We can plug the parametric form of the line into the equation for the plane. This lets us isolate the "time" (the value of $t$) at which the line intersects the plane, and then we can substitute that time back into the line equation to find the specific point where this occurs.

\[
(3 + t) + (-2t) + 3 = 6 - t = 7 \tag{130}
\]

\[
t = -1 \tag{131}
\]

We can now plug in this value into the parametric representation of the line to find the point

that they intersect. Namely we get that:

(x, y, z) = (3− 1,−2(−1), 3) = (2, 2, 3) (132)
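The same substitution can be written once and reused. In the sketch below, the line is given by a base point `p` and direction `u`, and the plane by its normal `n` and right-hand side `d`; the function name `line_plane_intersection` is my own, and integer inputs are assumed so that `Fraction` keeps the arithmetic exact.

```python
from fractions import Fraction


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def line_plane_intersection(p, u, n, d):
    """Intersect the line p + t*u with the plane n . (x, y, z) = d.

    Returns (t, point), or None when the line is parallel to the plane
    (so there is no single intersection point).
    """
    denom = dot(n, u)
    if denom == 0:
        return None
    # Substituting p + t*u into the plane gives n.p + t*(n.u) = d.
    t = Fraction(d - dot(n, p), denom)
    point = tuple(a + t * c for a, c in zip(p, u))
    return t, point


# Line (3 + t, -2t, 3): base point (3, 0, 3), direction (1, -2, 0).
t, point = line_plane_intersection((3, 0, 3), (1, -2, 0), (1, 1, 1), 7)
print(t, [int(c) for c in point])   # -1 [2, 2, 3]
```

Notice that the "does it intersect at all?" question from the solution shows up as the check that n · u is not zero.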

Lets continue on and try another example that will introduce a massive portion of course

content that is calculating distances between objects in space.

Problem: Find the distance from the point (9, 4, 1) to the line l = (x, y, z) = (1−2t, 3, t).

Solution: Let's start by extracting two pieces of information from our line: a base point and the direction vector. We can read off the base point by setting $t = 0$. Doing this, we get that $P = (1, 3, 0)$ is a point on the line. In addition, we can read off the direction vector by looking at the coefficients in front of $t$, by equation 122. So, we get that $\vec{u} = (-2, 0, 1)$. Great, this is a lot of good information that we will need. A naive solution would be: okay, I have a point on the line and another point in space, so I can just use the distance formula between them. Well, this happens to not work. The reason is that we are interested in the shortest distance between the line and the point. While the naive idea gives a distance, it is not the distance that we are looking for. How about we try something else. Let's make a vector, $\vec{v}$, which will go from our base point on the line to the point (9, 4, 1). If we do this we get that:

~v = (8, 1, 1) (133)

Great! Now look at what we have thus far. We have a vector that goes from the line to the point in space. In addition, we have a vector that points in the direction of the line. From a purely trigonometric standpoint, if $\theta$ is the angle between $\vec{v}$ and $\vec{u}$, we can represent the distance between the line and the point as:

\[
d = |\vec{v}| \sin\theta \tag{134}
\]

This is a good start. However, we don't know all that much about the angle $\theta$. Instead, allow me to introduce a small trick. That is, let me multiply the top and bottom of the expression for distance by $|\vec{u}|$. Doing so we get that:

\[
d = \frac{|\vec{v}||\vec{u}| \sin\theta}{|\vec{u}|} \tag{135}
\]

Now take a moment and look at what we have here. The top of this expression is something we are familiar with, namely the magnitude of the cross product. We can sub this in (this is equation 121) to get:

\[
d = \frac{|\vec{v} \times \vec{u}|}{|\vec{u}|} \tag{136}
\]

Now we have an expression for distance in terms of the two vectors that we began the problem with! This is great considering the fact that we can solve this just by computing a cross product and some magnitudes. For the specific numbers in this problem, $\vec{v} \times \vec{u} = (1, -10, 2)$, so $|\vec{v} \times \vec{u}| = \sqrt{105}$ and $|\vec{u}| = \sqrt{5}$, and we obtain that:

\[
d = \frac{\sqrt{105}}{\sqrt{5}} = \sqrt{21} \tag{137}
\]
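Since this distance formula comes up so often, here is a hedged Python sketch of equation 136. The names `distance_point_to_line`, `cross`, and `norm` are my own illustrative choices; the line is described by a base point `p` and a nonzero direction `u`.

```python
import math


def cross(a, b):
    """Cross product in R^3 (equation 93)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(a):
    """Magnitude of a vector."""
    return math.sqrt(sum(x * x for x in a))


def distance_point_to_line(q, p, u):
    """Shortest distance from the point q to the line p + t*u."""
    v = tuple(a - b for a, b in zip(q, p))   # from the base point to q
    return norm(cross(v, u)) / norm(u)       # |v x u| / |u|


d = distance_point_to_line((9, 4, 1), (1, 3, 0), (-2, 0, 1))
print(d, math.sqrt(21))   # both are about 4.583
```

For comparison, the naive distance from (9, 4, 1) to the base point (1, 3, 0) is √66, about 8.12, which is noticeably larger than the true shortest distance.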

6.3 Vector-Valued Functions

I hope you had a good three-minute break. Let's now compare two classes of functions, the second of which is called vector-valued:

\[
f : \mathbb{R}^n \longrightarrow \mathbb{R} \tag{138}
\]

\[
\vec{r} : \mathbb{R} \longrightarrow \mathbb{R}^n \tag{139}
\]

We have previously been dealing with the first of the two aforementioned functions, which takes in a point and returns a number. We will now deal with the second type, which takes in a number and returns a vector. Let's try to comprehend what's going on here with a "real-world" example.

Problem: Suppose you have a bug that is crawling along the outside of a can. The

bug is going to follow a path that wraps around the can exactly once as it crawls from the

bottom to the top. Describe the path.

Solution: We may want to know about the velocity, path, and even acceleration of the

bug as it travels along the surface of the can. Lets first try to describe the path. We want

to use a position vector for this that we will denote as ~r(t). This is something we see all

the time in a physics classroom. We write the vector as a function of time, namely the

components of the vector change as a function of time. Let's make some assumptions so that we can come up with some path. Let's say that the height of the cylindrical can is 1, the time it takes for the bug to reach the top is 1, and finally the radius of the cylindrical can is also 1. Let's also say that the bug starts at the position (1, 0, 0), and it makes its way up to the point (1, 0, 1). Intuitively, the bug has one unit of time to climb a height of 1 to the top of the cylinder. Therefore the $z$ component of the path should simply be $t$. This is a steady rise up the can. We can get the $x$ and $y$ components by going around a circle. Since the bug must go around the unit circle exactly once as $t$ runs from 0 to 1, the parametrization is $(x(t), y(t)) = (\cos 2\pi t, \sin 2\pi t)$. (Generally when you parametrize a circle you will get simply $(\cos t, \sin t)$ where $t \in [0, 2\pi]$.) Thus we can get the position of the bug as:

\[
\vec{r}(t) = (\cos 2\pi t, \sin 2\pi t, t) \tag{140}
\]

If we want to then go on to calculate the velocity of the bug at some time along its journey, we can take the derivative of the position vector component by component. Namely,

\[
\vec{v}(t) = \frac{d\vec{r}(t)}{dt} \tag{141}
\]

For our case, we can take the derivative of our bug's position to obtain:

\vec{v}(t) = (-2\pi \sin 2\pi t, 2\pi \cos 2\pi t, 1) \qquad (142)

We can further differentiate velocity to obtain the acceleration:

\vec{a}(t) = \frac{d\vec{v}(t)}{dt} \qquad (143)
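If you want to sanity-check these derivatives on a computer, here is a minimal Python sketch. It hard-codes the bug's path (140), its velocity (142), and the acceleration you get by differentiating (142) once more, then compares each against a numerical derivative. The function names, the sample time `t0`, and the step size `h` are my own choices for illustration, not anything from lecture.

```python
import math

W = 2 * math.pi  # the bug makes one full turn per unit of time

def position(t):
    """Bug's position on the can, equation (140)."""
    return (math.cos(W * t), math.sin(W * t), t)

def velocity(t):
    """Component-by-component derivative of position, equation (142)."""
    return (-W * math.sin(W * t), W * math.cos(W * t), 1.0)

def acceleration(t):
    """Component-by-component derivative of velocity."""
    return (-W**2 * math.cos(W * t), -W**2 * math.sin(W * t), 0.0)

def central_difference(f, t, h=1e-6):
    """Numerical derivative of a vector-valued function f at time t."""
    ahead, behind = f(t + h), f(t - h)
    return tuple((a - b) / (2 * h) for a, b in zip(ahead, behind))

t0 = 0.3  # arbitrary sample time, chosen only for the demo
print("v exact    :", velocity(t0))
print("v numerical:", central_difference(position, t0))
print("a exact    :", acceleration(t0))
print("a numerical:", central_difference(velocity, t0))
```

The exact and numerical rows should agree to several decimal places. Notice that the acceleration has no z component: the bug climbs at a steady rate, so all of its acceleration points back toward the axis of the can.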

Let's do another example problem for a path to get more familiar with the idea of time-varying vectors.

Problem: Find the path traced out by a point on a rolling bike wheel with unit radius and unit speed.

Solution: Let's have the point start at the bottom of the bike wheel, and let's place that starting point at the origin. First off, we don't need all three dimensions; x and y components are enough. The first thing to notice is that while the center of the wheel moves down the block at a constant speed (unit speed in this case), the specific point on the wheel also circles around the center, rising and falling as the wheel turns. The center of the wheel can be described for all times t as:

Center = (t, 1) (144)

With the center accounted for, we have essentially taken out the translational motion. The only motion left is rotational: relative to the center, the point moves in a circle of radius 1. Because the wheel rolls without slipping at unit speed and has unit radius, it turns through an angle of t radians in time t. The point therefore inherits the center's translational motion and adds its own circular motion on top. With the way we set up the picture, the point must start at (0, 0), directly below the center, and it rotates clockwise as the wheel rolls to the right. Therefore, we obtain that the path is:

\vec{r}(t) = (t, 1) + (-\sin t, -\cos t) = (t - \sin t, 1 - \cos t) \qquad (145)
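Here is a small Python sketch of the path in (145), in case you want to see some numbers. It evaluates the position and the speed of the point at the bottom of the wheel, at the top, and after one full turn. The names `wheel_point` and `wheel_point_velocity` are just labels I made up for this sketch.

```python
import math

def wheel_point(t):
    """Point on the rim: center (t, 1) plus the rotating offset (-sin t, -cos t)."""
    return (t - math.sin(t), 1 - math.cos(t))

def wheel_point_velocity(t):
    """Derivative of wheel_point, taken component by component."""
    return (1 - math.cos(t), math.sin(t))

for t in (0.0, math.pi, 2 * math.pi):
    x, y = wheel_point(t)
    vx, vy = wheel_point_velocity(t)
    print(f"t={t:.3f}  position=({x:.3f}, {y:.3f})  speed={math.hypot(vx, vy):.3f}")
```

The point is momentarily at rest whenever it touches the ground and moves at twice the speed of the center when it is at the top of the wheel, which matches the picture of the translational and rotational motions either cancelling or adding.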

6.4 Quadric Surfaces

Quadric surfaces are the 3D analogs of the quadratic curves (conic sections) you know from 2D. You actually came in contact with some of them in our first recitation when we were graphing things such as $1 = x^2 + y^2 + z^2$. We will be graphing these in $\mathbb{R}^3$. What does this look like? It looks like a sphere! This one comes off as simple since we are probably somewhat familiar with this type. Of course, there is a more formulaic way of dealing with a general surface that uses the level curves we covered. For example, what happens when we have $x^2 + y^2 - z^2 = 1$? I sure do not know what this looks like off the top of my head. However, let's start taking slices of constant z. At z = 0 we have a circle of radius 1 centered at the origin. At z = 1 we have $x^2 + y^2 = 2$, a circle of radius $\sqrt{2}$. We get the exact same picture at z = −1. If we patch these slices together, we have what looks like an hourglass. See you tomorrow in recitation!
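The slicing idea is easy to tabulate. Here is a tiny Python sketch for the surface $x^2 + y^2 - z^2 = 1$: setting z = c leaves $x^2 + y^2 = 1 + c^2$, so each slice is a circle. The helper name `slice_radius` and the particular heights are my own choices.

```python
import math

def slice_radius(c):
    """Radius of the circular slice of x^2 + y^2 - z^2 = 1 at height z = c."""
    return math.sqrt(1 + c * c)

for c in (-4, -1, 0, 1, 4):
    print(f"z = {c:>2}: circle of radius {slice_radius(c):.4f}")
```

The radius is smallest at z = 0 and grows symmetrically above and below, which is the hourglass shape described above.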

6.5 All other Path in Space Stuff (ASE)

While I do not anticipate this section really coming up on the exam, I will include it just so that we have some record that it was taught, at least the semester that I took it! When we discuss paths in space, we sometimes refer to concepts such as tangent vectors, normal vectors, binormal vectors, and curvature. Each one is honestly just a formula, and it doesn't necessarily offer much other than helping you solve problems that ask you to compute it. The unit tangent vector is defined as:

\vec{T}(t) = \frac{\vec{r}'(t)}{|\vec{r}'(t)|} \qquad (146)

At first glance, this is just the velocity vector divided by its magnitude. It is called the tangent vector because it keeps the direction of the velocity vector while discarding its magnitude. In addition to the tangent vector, we can also discuss the (unit) normal vector, which is defined as:

\vec{N}(t) = \frac{\vec{T}'(t)}{|\vec{T}'(t)|} \qquad (147)

6.5.1 A Proof of Orthogonality

The normal vector as defined above is always perpendicular to the tangent vector! We can quickly write up a proof of this. Consider the tangent vector as defined above. By definition, it has constant magnitude 1 for all time. I will now make use of this fact:

\vec{T}(t) \cdot \vec{T}(t) = |\vec{T}(t)|^2 = 1 \qquad (148)

since $\vec{T}(t)$ is a unit vector. Let's now take the derivative of this dot product using the product rule:

\frac{d}{dt}\left(\vec{T}(t) \cdot \vec{T}(t)\right) = \vec{T}'(t) \cdot \vec{T}(t) + \vec{T}(t) \cdot \vec{T}'(t) = 2\,\vec{T}(t) \cdot \vec{T}'(t) \qquad (149)

However, remember that we have already shown that $\vec{T}(t) \cdot \vec{T}(t)$ is a constant. Therefore, this derivative must be equal to zero:

\frac{d}{dt}\left(\vec{T}(t) \cdot \vec{T}(t)\right) = 2\,\vec{T}(t) \cdot \vec{T}'(t) = 0 \qquad (150)

2\,\vec{T}(t) \cdot \vec{T}'(t) = 0 \implies \vec{T}(t) \cdot \vec{T}'(t) = 0 \qquad (151)

As long as $\vec{T}'(t) \neq \vec{0}$, I can divide this expression by $|\vec{T}'(t)|$ without changing the fact that it equals zero:

\frac{1}{|\vec{T}'(t)|}\,\vec{T}(t) \cdot \vec{T}'(t) = 0 \qquad (152)

\vec{T}(t) \cdot \frac{\vec{T}'(t)}{|\vec{T}'(t)|} = \vec{T}(t) \cdot \vec{N}(t) = 0 \qquad (153)

This shows that, given the way we have defined the normal and tangent vectors, they must be orthogonal for all t. We can finally define our last vector, which is rarely asked about. In case it is, know that we define the binormal vector as:

\vec{B}(t) = \vec{T}(t) \times \vec{N}(t) \qquad (154)
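To see the orthogonality with actual numbers, here is a minimal Python sketch that builds T, N, and B for the bug's path $\vec{r}(t) = (\cos 2\pi t, \sin 2\pi t, t)$ from earlier in the notes. Choosing that path is my own assumption for the demo, and so are the helper names. For this particular path $|\vec{r}'(t)|$ is constant, so $\vec{T}'(t)$ points in the same direction as $\vec{r}''(t)$, which keeps the code short.

```python
import math

W = 2 * math.pi  # angular rate of the path r(t) = (cos Wt, sin Wt, t)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(a):
    length = math.sqrt(dot(a, a))
    return tuple(x / length for x in a)

def tangent(t):
    """T(t) = r'(t) / |r'(t)|, equation (146)."""
    return unit((-W * math.sin(W * t), W * math.cos(W * t), 1.0))

def normal(t):
    """N(t) = T'(t) / |T'(t)|, equation (147).
    Here T'(t) is parallel to r''(t) because |r'(t)| is constant."""
    return unit((-W**2 * math.cos(W * t), -W**2 * math.sin(W * t), 0.0))

def binormal(t):
    """B(t) = T(t) x N(t), equation (154)."""
    return cross(tangent(t), normal(t))

for t in (0.0, 0.25, 0.7):
    T, N, B = tangent(t), normal(t), binormal(t)
    print(f"t={t}: T.N={dot(T, N):+.1e}  |B|={math.sqrt(dot(B, B)):.6f}")
```

The dot product T.N should print as zero up to rounding, and B should come out with length 1 since it is the cross product of two perpendicular unit vectors.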

If I were you, I would just be comfortable with tangent and normal vectors. I think there is pretty much zero chance you are asked about a binormal vector. If you are, I believe they would give you the formula and have you compute it to see whether you can complete the cross product! The last application of the concepts that we have just learned is curvature. Curvature measures how sharply a curve bends. It requires that $\vec{r}'(t)$ is continuous and that $|\vec{r}'(t)| \neq 0$. The way I think of curvature is as a measurement of how fast we are changing direction, like in circular motion. There are two formulas that we will come in contact with; use whichever one is easier for the given problem. Here they are, where we denote curvature by the Greek letter κ:

\kappa = \frac{|\vec{T}'(t)|}{|\vec{r}'(t)|} = \frac{|\vec{r}'(t) \times \vec{r}''(t)|}{|\vec{r}'(t)|^3} \qquad (155)

We use these two formulas when we are given some $\vec{r}(t)$. In the rare case that we are instead given $y = f(x)$, so that we can write $\vec{r}(x) = (x, f(x))$, we get the following condensed form of the curvature expression:

\kappa = \frac{|f''(x)|}{\left(1 + [f'(x)]^2\right)^{3/2}} \qquad (156)

I wouldn't think of this section as any more than a collection of new formulas that use the tools for paths in space that we learned throughout the course. I do not think that these formulas are commonplace even semester by semester at MIT, so I wouldn't bank on them being on the ASE. However, in case they are, these are the formulas you would need.

Finally, I just wanted to add that if you are asked to find the arc length of a curve, the following formula can be used:

S = \int_a^b |\vec{r}'(t)| \, dt \qquad (157)

where the path starts at time a and ends at time b.
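As a worked instance of the curvature and arc length formulas, here is a short Python sketch. It assumes the bug's path $\vec{r}(t) = (\cos 2\pi t, \sin 2\pi t, t)$ from earlier (my choice of example), uses the cross-product form of (155), and approximates the integral (157) with a midpoint rule. The function names and the number of steps are arbitrary.

```python
import math

W = 2 * math.pi

def r_prime(t):
    return (-W * math.sin(W * t), W * math.cos(W * t), 1.0)

def r_double_prime(t):
    return (-W**2 * math.cos(W * t), -W**2 * math.sin(W * t), 0.0)

def norm(a):
    return math.sqrt(sum(x * x for x in a))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def curvature(t):
    """Second form of (155): |r' x r''| / |r'|^3."""
    return norm(cross(r_prime(t), r_double_prime(t))) / norm(r_prime(t)) ** 3

def arc_length(a, b, steps=1000):
    """Midpoint-rule approximation of the integral in (157)."""
    h = (b - a) / steps
    return sum(norm(r_prime(a + (i + 0.5) * h)) for i in range(steps)) * h

print("curvature at t=0.4:", curvature(0.4))
print("arc length on [0, 1]:", arc_length(0.0, 1.0))
```

Working the same thing by hand gives $\kappa = \frac{4\pi^2}{1 + 4\pi^2}$ at every t and $S = \sqrt{1 + 4\pi^2}$ for one trip up the can, so the printed values should match those.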

7 Recitation II on July 9, 2019

First off, great job on the quiz! We covered a lot, so I want to take some of this space to summarize how to tackle a couple of the most common points, lines, and planes in space questions, which require a copious amount of vector mathematics to solve. That being said, let's go through them case by case in a systematic way. I will try to teach each with an example. Here we go:

7.1 Point to Point

This is the simplest case and will most likely seem familiar. Given a point (a, b, c) and another point (g, e, f), the distance between these two points is

d = \sqrt{(a - g)^2 + (b - e)^2 + (c - f)^2} \qquad (158)

This is just the Pythagorean theorem in three dimensions!

7.2 Point to Line

Let's teach this one through a direct example. Suppose that we have a point (1, 3, 4) out in space and the line defined by the equation $l = (x, y, z) = (2 - 2t, 3 + t, 4t)$. We are asked to find the shortest distance between this line and the point. This refers to the perpendicular distance, which is what we want in all of these types of problems! Okay, let's break this down into a recipe. I will say, as with all of the distance formulas, that you can get creative and this is not the only way to do them.

1. Find a point on the line and the vector that gives the line's direction. We can find a point on the line by plugging in t = 0; with that, we get that the point (2, 3, 0) is on the line. In addition, the direction of the line is given by the coefficients of t, so the direction is $\vec{u} = (-2, 1, 4)$. We completed step one!

2. Write a vector from a point on the line to the point out in space. We already have both points in question: the point on the line is (2, 3, 0) and the point in space is (1, 3, 4). Therefore, the vector from the first point to the second is $\vec{v} = (-1, 0, 4)$. Perfect! Notice that the magnitude of this vector represents some distance from the line to the point, but not the perpendicular distance between them.

3. Trigs and Tricks. Going off the riff at the end of the second step, we can use some right-triangle trigonometry to calculate the perpendicular distance. If θ is the angle between $\vec{v}$ and the line's direction $\vec{u}$, then

d = |\vec{v}| \sin\theta \qquad (159)

So that was the trig. A diagram would be helpful to draw out yourself, but I unfortunately don't know how to add one to the LaTeX file. Here comes the trick. Because we don't know exactly what the angle is, we want to get rid of it. We can do this by multiplying and dividing our expression for the distance by $|\vec{u}|$. Why? Because now the numerator is exactly the expression for the magnitude of the cross product. In equation form:

d = \frac{|\vec{v}||\vec{u}| \sin\theta}{|\vec{u}|} \qquad (160)

d = \frac{|\vec{v} \times \vec{u}|}{|\vec{u}|} \qquad (161)
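The three steps above can be sketched in a few lines of Python using the numbers from this example. The helper functions `cross` and `norm` are written out by hand so the sketch has no dependencies; their names are my own.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(sum(x * x for x in a))

point = (1, 3, 4)        # the point out in space
line_point = (2, 3, 0)   # step 1: the line at t = 0
u = (-2, 1, 4)           # step 1: direction of the line

v = tuple(p - q for p, q in zip(point, line_point))  # step 2: (-1, 0, 4)
v_cross_u = cross(v, u)                              # step 3: (-4, -4, -1)
distance = norm(v_cross_u) / norm(u)                 # equation (161)

print("v =", v)
print("v x u =", v_cross_u)
print("distance =", distance)
```

By hand, $|\vec{v} \times \vec{u}| = \sqrt{33}$ and $|\vec{u}| = \sqrt{21}$, so the distance is $\sqrt{33/21} = \sqrt{11/7}$, roughly 1.25.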

7.3 Point to Plane

Let's lead by example again. Planes are described in terms of their normal vector. Say we have a plane given by the equation $2x + 3y - z = 6$. We can pluck off the components of the normal vector by looking at the coefficients in front of x, y, and z. Therefore, for this case we have that $\vec{n} = (2, 3, -1)$. Now let's say I want to find the distance between this plane and the point (2, 2, 2). Let's do this systematically again.

1. Write a unit normal vector from the normal vector. We change our normal vector $\vec{n}$ into the unit normal vector $\hat{n}$ with the following equation:

\hat{n} = \frac{\vec{n}}{|\vec{n}|} \qquad (162)

In our particular example the magnitude of our normal vector is $\sqrt{14}$. Thus, our unit normal vector is:

\hat{n} = \frac{1}{\sqrt{14}} (2, 3, -1) \qquad (163)

2. Write a vector from a point on the plane to the point out in space. A point on the plane must satisfy the plane's equation, so there are a ton of choices that are fine. I'll just pick (1, 1, −1), which lies on the plane because 2(1) + 3(1) − (−1) = 6. So now we need to write a vector from (1, 1, −1) to (2, 2, 2). The vector is:

\vec{v} = (1, 1, 3) \qquad (164)

3. Dot the vector $\vec{v}$ with the unit normal vector to get the distance. Why are we doing this? Well, let's think about it. We only want the component of $\vec{v}$ that lies along the normal direction. The dot product of $\vec{v}$ with the unit normal extracts exactly that component and discards everything parallel to the plane. That is, we get:

d = \vec{v} \cdot \hat{n} = (1, 1, 3) \cdot \frac{1}{\sqrt{14}} (2, 3, -1) = \frac{2 + 3 - 3}{\sqrt{14}} = \frac{2}{\sqrt{14}} \qquad (165)

Of course, there are alternative ways to do this, for example stating that $|\vec{v}| \cos\theta = d$ and then multiplying the top and bottom by $|\vec{n}|$. Same exact result and, in all honesty, the same exact steps. Just a different approach.
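Here is the same recipe as a minimal Python sketch with the numbers from this example. It also includes a shortcut that skips choosing a point on the plane: for a plane $\vec{n} \cdot \vec{x} = c$, the distance from a point P is $|\vec{n} \cdot P - c| / |\vec{n}|$. The shortcut is a standard formula but it is my addition here, and the variable names are made up for the sketch.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

normal = (2, 3, -1)        # coefficients of x, y, z in 2x + 3y - z = 6
plane_point = (1, 1, -1)   # satisfies the plane equation
point = (2, 2, 2)          # the point out in space

# Step 1: unit normal.
length = math.sqrt(dot(normal, normal))
unit_normal = tuple(c / length for c in normal)

# Step 2: vector from the plane to the point.
v = tuple(p - q for p, q in zip(point, plane_point))

# Step 3: project v onto the unit normal; abs() handles a flipped normal.
distance = abs(dot(v, unit_normal))

# Shortcut: plug the point into the plane equation directly.
shortcut = abs(dot(normal, point) - 6) / length

print("v =", v)
print("distance =", distance, " shortcut =", shortcut)
```

Both numbers should agree with $2/\sqrt{14}$ from (165), which is about 0.53.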

7.4 Line to Line

Here we have, in my opinion, one of the hardest cases to visualize. Unlike in $\mathbb{R}^2$, we now have lines that can be skew. Suppose we have two lines, $L_1 = (2 - t, t, 4 + 3t)$ and $L_2 = (1 - 2t, -1 + t, 2t)$, and we want to know the perpendicular distance between them. Let's start off by gaining some insight into the lines. Namely, let's find one point on each line and each line's direction. $L_1$ contains the point (2, 0, 4) and has direction $\vec{v} = (-1, 1, 3)$. If getting this information is difficult, please see section 6 of the notes. In addition, $L_2$ contains the point (1, −1, 0) and has direction $\vec{u} = (-2, 1, 2)$. With all that information close enough to sniff, let's start the process of getting the answer to this type of question.

1. Write a unit normal vector generated by the cross product of the two lines' directions. What does this mean? A vector perpendicular to both lines is obtained by taking the cross product of line 1's direction vector with line 2's direction vector. For our specific example we have that:

\vec{n} = (-1, 1, 3) \times (-2, 1, 2) = (-1, -4, 1) \qquad (166)

Since it will come into play later, let's transform this into a unit vector by computing the magnitude of this cross product and dividing through by it, namely:

\hat{n} = \frac{\vec{n}}{|\vec{n}|} \qquad (167)

With the magnitude of $\sqrt{18}$, we can express the unit normal vector as:

\hat{n} = \frac{\vec{n}}{|\vec{n}|} = \frac{1}{\sqrt{18}} (-1, -4, 1) \qquad (168)

Great, step one done.

2. Write a vector from a point on one line to a point on the other line. If you remember back to the beginning of this subsection, we found a point on each line. We have that (2, 0, 4) is on line 1⁵, and (1, −1, 0) is on line 2. Therefore the vector that connects the two is:

\vec{w} = (1, -1, 0) - (2, 0, 4) = (-1, -1, -4) \qquad (169)

3. Dot the unit normal vector with the vector between the two points. This is really similar to the point and the plane. Why might this be the case? Well, let's take a moment to try to internalize it. In step one we took the cross product of the two direction vectors, which creates a normal vector, just as for a plane, out of the two lines. This vector is orthogonal to both lines' directions, and we are dotting it with some vector from one line to the other. What is this doing? The dot product filters out any part of the displacement that is not strictly perpendicular to both lines, and its result is the distance between the two lines, namely:

d = \hat{n} \cdot \vec{w} = \frac{1}{\sqrt{18}} (-1, -4, 1) \cdot (-1, -1, -4) = \frac{1 + 4 - 4}{\sqrt{18}} = \frac{1}{\sqrt{18}} \qquad (170)

As always, if the distance turns out negative, just take its absolute value. A negative sign only means, since we are working with vectors, that a direction we chose happened to point the opposite way; there is no real meaning besides this.
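For the skew-line recipe, here is a minimal Python sketch with the two lines from this example. It takes the cross product of the directions, builds the connecting vector, and projects. The helpers `dot` and `cross` and the variable names are my own.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

p1, v = (2, 0, 4), (-1, 1, 3)     # point and direction of L1
p2, u = (1, -1, 0), (-2, 1, 2)    # point and direction of L2

n = cross(v, u)                                    # step 1: (-1, -4, 1)
w = tuple(b - a for a, b in zip(p1, p2))           # step 2: (-1, -1, -4)
distance = abs(dot(n, w)) / math.sqrt(dot(n, n))   # step 3: |n_hat . w|

print("n =", n)
print("w =", w)
print("distance =", distance)
```

The result should match $1/\sqrt{18}$ from (170), which is about 0.24. The `abs` takes care of the sign issue mentioned above.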

⁵ Call center, amiright

7.5 Line to Plane and Plane to Plane

Here we arrive at the seemingly easier cases. A line either intersects a plane or is parallel to it, and likewise two planes either intersect or are parallel. If they intersect then, intuitively, the distance between the two is zero. If they are parallel, the distance is the same no matter where you measure it, so simply pick a point on the line (or on one of the planes) and then treat the problem like a point-to-plane problem! Do it the exact same way and you should be perfect :) That was a lot, but I hope it all makes sense! Also, sorry for the delay in the posting of this! Let me do one with parallel planes just to have one in the notes. Find the distance between the planes $x + y + z = 4$ and $x + y + z = 5$. First off, how do we know that they are parallel? Their normal vectors are parallel (here both are (1, 1, 1)), so the planes themselves must be parallel.

1. Compute the unit normal vector of the planes. Both planes have the normal vector $\vec{n} = (1, 1, 1)$. Thus the unit normal is:

\hat{n} = \frac{\vec{n}}{|\vec{n}|} = \frac{1}{\sqrt{3}} (1, 1, 1) \qquad (171)

2. Calculate a vector from a point on the first plane to a point on the second plane. For convenience, since there is an enormous number of options to choose from, I'll choose the point (4, 0, 0) from the first plane and (5, 0, 0) from the second plane. The vector from the first point to the second is:

\vec{v} = (1, 0, 0) \qquad (172)

3. Take the dot product between $\vec{v}$ and the unit normal. We do this to compute the distance by projecting $\vec{v}$ along the direction of the unit normal vector. Namely, we are extracting the perpendicular part of $\vec{v}$ by dotting it with the unit vector. For our case we have that:

d = \vec{v} \cdot \hat{n} = \frac{1}{\sqrt{3}} (1, 1, 1) \cdot (1, 0, 0) = \frac{1}{\sqrt{3}} \qquad (173)

Jeez, that was a lot. Hopefully it is helpful throughout the course :)
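For parallel planes written with the same normal, $\vec{n} \cdot \vec{x} = d_1$ and $\vec{n} \cdot \vec{x} = d_2$, the recipe collapses to $|d_2 - d_1| / |\vec{n}|$. Here is a short Python sketch that checks this against the projection recipe for the example above. The function name `plane_gap` is hypothetical, and the closed form is a standard consequence of the recipe rather than something stated in recitation.

```python
import math

def plane_gap(normal, d1, d2):
    """Distance between the parallel planes n.x = d1 and n.x = d2."""
    return abs(d2 - d1) / math.sqrt(sum(c * c for c in normal))

# Recipe from the notes: project a connecting vector onto the unit normal.
n = (1, 1, 1)
v = (1, 0, 0)  # from (4, 0, 0) on the first plane to (5, 0, 0) on the second
recipe = abs(sum(a * b for a, b in zip(v, n))) / math.sqrt(sum(c * c for c in n))

print("recipe  :", recipe)
print("shortcut:", plane_gap(n, 4, 5))
```

Both lines should print the same number, $1/\sqrt{3}$, which is about 0.58.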

8 Lecture V on July 10, 2019

Let's kick off lecture 5 with a little bit of review from last time. We ended lecture talking about quadric surfaces. A quadric surface is the 3D analog of the quadratic curves (parabolas, ellipses, hyperbolas) from 2D. To draw a quadric surface, we make it simpler for ourselves by setting z = c for some constant c and looking at the resulting equation in two variables, which describes a flat shape. Each of these slices is a conic section. We place these flat shapes at their specific levels of z and then construct the surface by stacking all of the level curves together. Consider the example of:

x^2 + y^2 - z^2 = 1 \qquad (174)

Let's move the z term over, set z = c, and pick some constants such as c = 0, 1, 4.

x^2 + y^2 = 1 + z^2 \qquad (175)

x^2 + y^2 = 1 + c^2 \qquad (176)

We then plug in our specific values of c, note that each slice is a circle of radius $\sqrt{1 + c^2}$ at height c, and graph these circles. Imagine now you have:

x^2 + y^2 = z^2 \qquad (177)

Notice that the slice at z = 0 is not a circle but exactly a single point. Let's think about the shape of this equation. The slice at z = c is a circle of radius |c|, so stacking the slices gives a cone. Moreover, we do not have a cone just above the xy-plane; we also have a cone below it. The tips of the two cones meet at the origin, and each cone expands away from the xy-plane, one above and one below. Let's consider more interesting cases. Consider the case of:

x^2 + y^2 = -1 + z^2 \qquad (178)

Now we need to do a bit more thinking. If we set z = 0, do we have a legitimate solution? No. Why? Because the smallest that $x^2 + y^2$ can be is zero, so setting it equal to a negative value gives no points at all. The first values where we start to see the surface are z = ±1, where the slice is a single point. So instead of a single connected surface, we now have one piece whose lowest point is at z = 1 and another piece whose highest point is at z = −1. There is a gap between them: the surface has no points with z ∈ (−1, 1). We'll finally end with the parabolic analog. Consider the function:

z = x^2 + y^2 \qquad (179)

You get a circle for each slice z = c with c ∈ [0, ∞), with radius $\sqrt{c}$ increasing as c grows. This looks just like a parabola, but in 3-dimensional space. In fact, for those that have seen polar coordinates before, note that this is the function $z = r^2$, so it is a parabola in that coordinate system. But if you don't know this, do not worry; we'll get a ton of practice with this very shortly. We can apply all types of transformations to the parabolic equation above, like

z = -x^2 - y^2 \qquad (180)

where we just take the graph and flip it below the xy-plane. The final one we can look at is:

z = -x^2 + y^2 \qquad (181)

Let's try the technique we have been learning. At the z = 0 slice we obtain y = ±x, so we get a pair of criss-crossing lines. As we move to nonzero z slices, we start to get hyperbolas. This is very hard to see, but we are basically graphing a Pringle! I attach the following image for clarity.

Figure 1: Pringle⁶

8.1 Polar, Cylindrical, and Spherical Coordinates

Now we are getting to a very important section. We will be switching between all three coordinate systems (Cartesian, cylindrical, and spherical) all the time. Before we introduce our new coordinate systems, let's take a step back and take a more formal approach to the Cartesian system. In 2D, we can define x as the signed distance from the y-axis. In 3D, x is the signed distance from the yz-plane. A coordinate system in general gives you enough information to locate the point you are trying to describe. We can make analogous statements for the y and z coordinates by describing them as the signed distances from the xz-plane and xy-plane respectively. Together, the three coordinates give enough information to specify a point. More generally, a coordinate system on $\mathbb{R}^n$ is a set of functions $f : \mathbb{R}^n \to \mathbb{R}$ that together can be used to uniquely identify points in $\mathbb{R}^n$. For example, polar coordinates are the pair (r, θ), where r(P) is the distance from the origin to the point P, and θ(P) is the signed angle between the positive x-axis and $\vec{OP}$, the vector from the origin to P. Let's start the table of "conversions" between Cartesian and other coordinate systems.

8.1.1 Polar Coordinates

We can summarize the relationships between polar and Cartesian coordinates as:

r^2 = x^2 + y^2 \qquad (182)

⁶ Do not copy-strike me.

\theta = \arctan\frac{y}{x} \qquad (183) ⁷

We can also head in the opposite direction:

x = r \cos\theta \qquad (184)

y = r \sin\theta \qquad (185)
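The quadrant issue with arctan(y/x) is easy to see on a computer. Here is a minimal Python sketch of the conversions in both directions. It uses `math.atan2`, which looks at the signs of x and y separately and therefore picks the correct quadrant. The function names `to_polar` and `to_cartesian` are my own.

```python
import math

def to_polar(x, y):
    """Cartesian -> polar. atan2 returns an angle in (-pi, pi]
    that lies in the same quadrant as the point (x, y)."""
    return math.hypot(x, y), math.atan2(y, x)

def to_cartesian(r, theta):
    """Polar -> Cartesian, equations (184) and (185)."""
    return r * math.cos(theta), r * math.sin(theta)

x, y = -1.0, 1.0                                       # a point in the second quadrant
print("arctan(y/x):", math.atan(y / x))                # -pi/4, which is the wrong quadrant
print("to_polar   :", to_polar(x, y))                  # r = sqrt(2), theta = 3*pi/4
print("round trip :", to_cartesian(*to_polar(x, y)))   # back to roughly (-1, 1)
```

Plain arctan cannot tell (−1, 1) apart from (1, −1) because both have y/x = −1, which is why you have to check the quadrant yourself when you use (183) by hand.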

8.1.2 Cylindrical Coordinates

Cylindrical coordinates are the bigger brother to polar coordinates. They adopt the same idea as polar coordinates and add the z-direction, and the z coordinate is the same in both the Cartesian and cylindrical systems. In words, z is the signed distance to the xy-plane, r is the distance to the z-axis, and θ(P) is the signed angle from the half-plane containing the positive x-axis to the half-plane containing P, where both half-planes have the z-axis as their boundary.

r^2 = x^2 + y^2 \qquad (186)

\theta = \arctan\frac{y}{x} \qquad (187)

z = z \qquad (188)

x = r \cos\theta \qquad (189)

y = r \sin\theta \qquad (190)

z = z \qquad (191)

⁷ This formula is pretty good. However, please make note of which quadrant the angle actually is in, since this function will not necessarily produce the correct one.

Let's add an example here so that we can see how to graph a system of inequalities, a very powerful tool that you will see come up all the time. Suppose we have the following three inequalities that we are supposed to graph together as a system:

r \le 4 \qquad (192)

0 \le \theta \le \frac{\pi}{3} \qquad (193)

0 \le z \le 2 \qquad (194)

Here is the resulting image. The strategy is that the graph consists of every point in three-dimensional space that satisfies all three of the above inequalities at once. I borrowed the illustration from Sam's book; I do not own it, nor did I make this graph. We start by graphing the first inequality, which is a solid cylinder of radius 4. As we move to the second inequality, we have to get rid of all the points in the cylinder whose θ coordinate is not in $[0, \frac{\pi}{3}]$, which leaves a wedge. Even here, we are not done! We now apply the third inequality, which limits the values of z to z ∈ [0, 2]. As such, we cut off the points in our wedge that do not have a z coordinate lying in the specified range.

Figure 2: Graph of Inequalities
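One way to get a feel for a system of inequalities like this is to write a membership test: a point is in the region exactly when it passes all three checks. Here is a minimal Python sketch. The function name `in_wedge` and the test points are made up for illustration.

```python
import math

def in_wedge(x, y, z):
    """True when (x, y, z) satisfies r <= 4, 0 <= theta <= pi/3, and 0 <= z <= 2."""
    r = math.hypot(x, y)
    theta = math.atan2(y, x)  # in (-pi, pi]; negative angles fail the theta check
    return r <= 4 and 0 <= theta <= math.pi / 3 and 0 <= z <= 2

print(in_wedge(2, 1, 1))    # small positive angle, mid height
print(in_wedge(2, -1, 1))   # theta is negative
print(in_wedge(2, 1, 3))    # z is too large
print(in_wedge(5, 1, 1))    # r is too large
```

Only the first point passes all three checks. Each of the others fails exactly one inequality, which mirrors the way the picture is carved out one inequality at a time.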

A proper word explanation is that the points satisfying r ≤ 4 are in a cylinder of radius 4 centered along the z-axis, the points satisfying $0 \le \theta \le \frac{\pi}{3}$ are between the two θ half-planes, and the points satisfying 0 ≤ z ≤ 2 are between the z = 0 and z = 2 planes. Now it's time for the biggest and baddest of them all, spherical coordinates.

8.1.3 Spherical Coordinates

Let's first get the conversions up on the board so that we have a good starting place:

\rho^2 = x^2 + y^2 + z^2 \qquad (195)

\phi = \arccos\frac{z}{\sqrt{x^2 + y^2 + z^2}} \qquad (196)

\theta = \arctan\frac{y}{x} \qquad (197)

And now, in the other direction:

x = \rho \sin\phi \cos\theta \qquad (198)

y = \rho \sin\phi \sin\theta \qquad (199)

z = \rho \cos\phi \qquad (200)

We can think, for some point P, that ρ(P) is the distance from P to the origin. The way we define θ is the same as we do for cylindrical coordinates. Finally, φ(P) is the angle between $\vec{OP}$, the vector from the origin to P, and the positive z-axis. An important note is that θ is bounded, namely 0 ≤ θ ≤ 2π. φ is also bounded, 0 ≤ φ ≤ π.
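Here is a minimal Python sketch of the spherical conversions in both directions, with a round trip to check that they undo each other. As with polar coordinates, I use `math.atan2` instead of a bare arctan(y/x) so that θ lands in the right quadrant, and I shift it into [0, 2π) to match the bounds above. The function names and the sample point are my own.

```python
import math

def to_spherical(x, y, z):
    """Cartesian -> (rho, phi, theta), following (195) through (197).
    Assumes the point is not the origin, where phi and theta are undefined."""
    rho = math.sqrt(x * x + y * y + z * z)
    phi = math.acos(z / rho)
    theta = math.atan2(y, x) % (2 * math.pi)
    return rho, phi, theta

def to_cartesian(rho, phi, theta):
    """(rho, phi, theta) -> Cartesian, following (198) through (200)."""
    return (rho * math.sin(phi) * math.cos(theta),
            rho * math.sin(phi) * math.sin(theta),
            rho * math.cos(phi))

p = (1.0, -2.0, 2.0)
rho, phi, theta = to_spherical(*p)
print("rho, phi, theta =", rho, phi, theta)
print("round trip      =", to_cartesian(rho, phi, theta))
```

For this sample point ρ = 3, since $1 + 4 + 4 = 9$, and the round trip should return the original point up to rounding.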

9 Lecture VI on July 11, 2019

Today we are starting chapter 4, which is the first chapter of multivariate calculus type

stuff! Get hype! It does start off with limits, which tend to be the most out there of subjects.

We are going to try today to get familiar with the concept of multivariate limits.


9.1 Limits

Let's start off by talking about the single-variable idea of limits. We can think of a limit as what a function is doing as its input approaches a specific value, getting arbitrarily close to, but not reaching, that point. Let's get some vocabulary down before we dive into the core idea. We say that a function is bounded below if its range is a subset of [a, ∞) for some a ∈ ℝ. What this is saying is that the function f does not have an output value that is smaller than a. The greatest lower bound of a function f is the largest a such that the range of f is a subset of [a, ∞); the sideways U, ⊂, is just the sign for "subset of". Imagine now that you have a function defined on the open unit interval that is increasing. By increasing, I mean that f(a) ≤ f(b) for a < b. We can write our function as $f : (0, 1) \to \mathbb{R}$. Remember that the function is always increasing. Therefore, if we want to compute the limit at 0, even though the function is not defined there, we look at its greatest lower bound: if the function as drawn has greatest lower bound 2, then the limit as we approach zero is 2.

Definition: If f is an increasing function on (0, 1), then we say that $\lim_{r \to 0} f(r)$ is the greatest lower bound of f. We can have the same idea for a decreasing function: if f is a decreasing function on (0, 1), then we say that $\lim_{r \to 0} f(r)$ is the least upper bound of f. We aren't encompassing everything with this idea, though. We are only looking at decreasing and increasing functions, which leaves out a massive number of functions. We are also only dealing with single-variable functions, which is obviously a problem in a multivariate class! Let's now drop these assumptions on f as we trudge forth.

Definition: If $\vec{a} \in \mathbb{R}^n$ and r > 0, the punctured ball $B^*(\vec{a}, r)$ is the set of points $\{\vec{x} \in \mathbb{R}^n : 0 < |\vec{x} - \vec{a}| \le r\}$. Let me break down this notation to show why we call this a punctured ball. We take all the points within a distance r of the point $\vec{a}$, but we omit the point right at the center, $\vec{a}$ itself. In the case of functions $f : \mathbb{R}^2 \to \mathbb{R}$, the punctured ball is really just a punctured disk. Let's continue with this:

Definition: Suppose $D \subset \mathbb{R}^n$, $\vec{a} \in \mathbb{R}^n$, and $f : D \to \mathbb{R}$. We define $[m(r), M(r)]$ as the smallest closed interval containing the image of $B^*(\vec{a}, r) \cap D$ under f. Here $B^*(\vec{a}, r) \cap D$ means the points that are in both the punctured ball and the domain. Both m(r) and M(r) are functions $\mathbb{R} \to \mathbb{R}$, meaning that they take in a radius and output a single value. We say that the limit of $f(\vec{x})$ as $\vec{x} \to \vec{a}$ exists if m(r) and M(r) converge to a common value L as r → 0, and we write:

\lim_{\vec{x} \to \vec{a}} f(\vec{x}) = L \qquad (201)

Wherever you see $\vec{x}$, note that this is a vector of values, so think (x, y) or (x, y, z) instead of the single variable x that you came in contact with previously. Let's consider the function

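f(x, y) = x^2 - y^2 + 3 \qquad (202)

Well, let's first convert this using polar coordinates, writing t for the distance from the origin:

x = t \cos\theta \qquad (203)

y = t \sin\theta \qquad (204)

f(x, y) = 3 + t^2 \cos^2\theta - t^2 \sin^2\theta \qquad (205)

f(t, \theta) = 3 + t^2(\cos^2\theta - \sin^2\theta) \qquad (206)

f(t, \theta) = 3 + t^2 \cos(2\theta) \qquad (207)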

Since cosine is bounded above by 1 and below by −1, we can construct our m(r) and M(r) by hitting those bounds, since nothing else is limiting it. Just a reminder that points in the punctured ball of radius r have 0 < t ≤ r, so as a last step we substitute r for t, since we are after the biggest and smallest values over the whole ball. I will say that this is the method we will most likely be using throughout the rest of the limit section, because we cannot always 'guess' what m and M are going to be just by looking at the function. Instead, we convert to polar coordinates and then make sure that we pick m(r) and M(r) that are functions of r only and not of θ. It is important to know some of those trig identities! Thus:

m(r) = 3 - r^2 \qquad (208)

M(r) = 3 + r^2 \qquad (209)
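If you would like to see m(r) and M(r) come out of raw numbers instead of trig identities, here is a brute-force Python sketch. It samples $f(x, y) = x^2 - y^2 + 3$ on a grid of points in the punctured disk of radius r and records the smallest and largest values. This is only a numerical illustration, not a proof, and the grid sizes and the name `sample_bounds` are arbitrary choices of mine.

```python
import math

def f(x, y):
    return x * x - y * y + 3

def sample_bounds(r, radial_steps=200, angle_steps=720):
    """Estimate (m(r), M(r)) as the min and max of f over sample points
    with 0 < t <= r, where t is the distance from the origin."""
    values = []
    for i in range(1, radial_steps + 1):      # starts at 1, so t is never 0
        t = r * i / radial_steps
        for j in range(angle_steps):
            theta = 2 * math.pi * j / angle_steps
            values.append(f(t * math.cos(theta), t * math.sin(theta)))
    return min(values), max(values)

for r in (1.0, 0.5, 0.1, 0.01):
    m, M = sample_bounds(r)
    print(f"r={r}: m~{m:.6f} (3-r^2={3 - r*r:.6f})  M~{M:.6f} (3+r^2={3 + r*r:.6f})")
```

As r shrinks, both columns squeeze in on 3 from either side, which is the numerical picture of the limit existing.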

Note that if we take the limit as r → 0, the two functions do converge to the same value, namely 3. Here is a plot of M(r) in purple, m(r) in pink, and f(x, y) in blue.

Figure 3: Limit Functions

You can see that we approach the value of 3 at the point where they all meet! The argument does not need to be hand-wavy: I used the bounded nature of trig functions to get the same values as Sam. Let's do another example in order to try to get this down:

Problem: Determine whether the following limit exists:

\lim_{(x,y) \to (0,0)} \left( \frac{-xy}{x^2 + y^2} \right) \qquad (210)

Solution: We are interested in whether we reach the same value as we move towards the origin from multiple directions. We cannot simply evaluate the function at the origin, because it is not even defined there. Let's write:

x = t \cos\theta \qquad (211)

y = t \sin\theta \qquad (212)

We can now substitute this into our function to obtain⁸:

f(x, y) = \frac{-t^2 \sin\theta \cos\theta}{t^2} = -\frac{1}{2} \sin 2\theta \qquad (213)

Now let's look at the biggest and smallest our function f can be. The result does not depend on t at all, so on every punctured disk f achieves a largest value of $\frac{1}{2}$ and a smallest value of $-\frac{1}{2}$. As such, we have found our M(r) and m(r) respectively. Therefore, as r → 0, we see that M(r) and m(r) converge to different values, $\frac{1}{2}$ and $-\frac{1}{2}$. Therefore, the limit does not exist. Let's add a couple of other tools to the back of our toolkit:

⁸ Reminder that 2 sin θ cos θ = sin 2θ.

9.2 Other tools for limits

9.2.1 Alternate Paths

Suppose that you have two paths, $\vec{r}_1$ and $\vec{r}_2$, in $\mathbb{R}^n$ with the property that:

\lim_{t \to 0} f(\vec{r}_1(t)) \neq \lim_{t \to 0} f(\vec{r}_2(t)) \qquad (214)

with $\vec{r}_1(0) = \vec{r}_2(0) = \vec{a}$. Then we conclude that $\lim_{\vec{x} \to \vec{a}} f(\vec{x})$ does not exist. This does not mean that if two paths happen to have the same limit, then the limit exists. Why? There are infinitely many paths, so just having two approach the same value does not allow us to say the limit exists. For that we would need to turn to the M(r) and m(r) approach used before. So this is a useful trick in the case that the limit does not exist. For example, you may try plugging in paths like y = x or y = 0 to show that the limit does not exist at the origin, if they lead to different limiting values.
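Here is the two-path trick as a minimal Python sketch, applied to the function $f(x, y) = \frac{-xy}{x^2 + y^2}$ from the previous example. It walks toward the origin along the paths y = 0 and y = x and prints what f does on each. The path function names are my own.

```python
def f(x, y):
    return -x * y / (x * x + y * y)

def along_x_axis(t):
    """Path r1(t) = (t, 0), which reaches the origin as t -> 0."""
    return f(t, 0.0)

def along_diagonal(t):
    """Path r2(t) = (t, t), which also reaches the origin as t -> 0."""
    return f(t, t)

for t in (0.1, 0.01, 0.001):
    print(f"t={t}: along y=0 -> {along_x_axis(t)}, along y=x -> {along_diagonal(t)}")
```

Along y = 0 the function is identically zero, while along y = x it sits at −1/2 no matter how close we get. Two paths with two different limiting values are enough to conclude that the limit at the origin does not exist.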

9.2.2 Continuity

Definition: f is continuous at a point if its value there equals its limit there.

Theorem

1. The coordinate functions x, y, z are continuous.

2. Sums and products of continuous functions are continuous, like $x + y + z$, $xyz$, $x^2 + y^2$, etc.

3. Compositions of continuous functions are continuous, $e^{xy + z}$ for example. Most functions that we are dealing with should be continuous.

Let's now turn to an example for the alternate paths technique:

9.2.3 Examples of Using the Further Techniques

Suppose we want to consider all directions at the exact same time. We can let the angle of approach be θ and make the substitution:

x = t \cos\theta \qquad (215)

y = t \sin\theta \qquad (216)

When we do this we have to be very careful. Why? Because this substitution, with θ held fixed, only takes into account straight paths into the origin; it does not take into account any curvy path. Possible candidates are $y = x^2$ or even $y = x^3$. These are curvy paths that are not tested by just making a polar coordinate conversion and sending t to zero for each fixed θ. We will cover this problem in lecture tomorrow, so be on the lookout for that! Also just a reminder, Sam is still teaching at the moment, and his plane is leaving in 56 minutes lol. I have attached an image for clarity on the subject. We see that the limit appears to exist along all straight lines, but if you take the curvy path $y = x^2$ to the origin, you will reach a different limit value. As Sam said, you have to surf your way to the origin along the curvy paths.

71

Figure 4: Different Curves to the Origin

10 Recitation III on July 12, 2019

Great work today in recitation, and thank you to Klajdi's half section for joining us. I just thought it would be useful to put some examples I made from the worksheet straight into the lecture notes. So there is not necessarily anything new in this section that isn't from the worksheets, but it hopefully helps in organizing your studying. That being said, let's just go through a few of the problems to get more comfortable with limits.

Problem: Show that $\lim_{(x,y)\to(0,0)} (x^2+y^2)^{3/2}\left(1-\sin^2(x^2+y^3)\right) = 0$ by showing that $M(r)$ and $m(r)$ converge to 0.

Solution: Let's start as we have in the past by plugging in polar coordinates, $x = r\cos\theta$ and $y = r\sin\theta$:

$$f(x,y) = f(r,\theta) = (r^2)^{3/2}\left(1-\sin^2(r^2\cos^2\theta + r^3\sin^3\theta)\right) = r^3\cos^2(r^2\cos^2\theta + r^3\sin^3\theta) \tag{217}$$

Now we are close. Notice that $\cos^2$ is bounded above by 1 and below by 0, irrespective of the argument. Therefore, we can bound $f(x,y)$:

$$0 \le f(r,\theta) \le r^3 \tag{218}$$

As such we arrive at the fact that $M(r) = r^3$ and $m(r) = 0$. Both go to 0 as $r \to 0$, so the limit exists and is equal to zero.
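As a numerical sanity check of the bound $0 \le f \le r^3$, here is a short Python sketch. It samples $f$ on circles of shrinking radius; the sampling resolution and the helper name `extremes_on_circle` are my own choices, and sampling only estimates $m(r)$ and $M(r)$, it does not prove anything.

```python
import math

def f(x, y):
    # Function from the problem: (x^2 + y^2)^(3/2) * (1 - sin^2(x^2 + y^3))
    return (x * x + y * y) ** 1.5 * (1.0 - math.sin(x * x + y ** 3) ** 2)

def extremes_on_circle(r, samples=360):
    """Sampled min and max of f on the circle of radius r."""
    values = [
        f(r * math.cos(2 * math.pi * k / samples),
          r * math.sin(2 * math.pi * k / samples))
        for k in range(samples)
    ]
    return min(values), max(values)

radii = [1.0, 0.1, 0.01, 0.001]
results = {r: extremes_on_circle(r) for r in radii}

for r, (lo, hi) in results.items():
    # Each row: radius, sampled floor, sampled ceiling, and the bound r^3
    print(r, lo, hi, r ** 3)
```

The sampled floor stays at or above 0 and the sampled ceiling stays at or below $r^3$, and both shrink as $r$ shrinks, which is the squeeze in action.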

Problem: Show that $\lim_{(x,y)\to(0,0)} (x^2+y^2)\sin\frac{1}{x^2+y^2} = 0$ by showing that $M(r)$ and $m(r)$ converge to 0.

Solution: Let's start as we have in the past by plugging in polar coordinates:

$$f(x,y) = f(r,\theta) = r^2(\sin^2\theta + \cos^2\theta)\sin\frac{1}{r^2} = r^2\sin\frac{1}{r^2} \tag{219}$$

Now we are close. Notice that the sine curve is bounded above by 1 and below by $-1$. Therefore, we can bound $f(x,y)$:

$$-r^2 \le f(r,\theta) \le r^2 \tag{220}$$

And as such, we get that $M(r) = r^2$ and $m(r) = -r^2$. Therefore, as we take the limit as $r \to 0$, we see that both bounds approach zero, and as such the limit exists and is zero.

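Problem: Show that $\lim_{(x,y)\to(0,0)} \frac{-x^2y}{x^4+y^2}$ does not exist even though the limits along every line through the origin exist and are equal.

Solution: First, let us prove that the limit converges to a value if we approach the origin along any line $y = mx$ by doing a substitution:

$$\lim_{\substack{(x,y)\to(0,0)\\ y = mx}} \frac{-x^2y}{x^4+y^2} = \lim_{x\to 0}\frac{-x^2(mx)}{x^4+(mx)^2} \tag{221}$$

$$= \lim_{x\to 0}\frac{-mx^3}{x^2(x^2+m^2)} \tag{222}$$

$$= \lim_{x\to 0}\frac{-mx}{x^2+m^2} \tag{223}$$

$$= \frac{0}{0+m^2} = 0 \tag{224}$$

(The last step needs $m \neq 0$. For $m = 0$, and along the vertical line $x = 0$, the function is identically zero away from the origin, so the limit is 0 there too.)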

We've proved that the limit is 0 for any linear approach to the origin. However, that doesn't guarantee that the limit will converge to the same value for every type of approach. For instance, we could approach the origin along a parabolic track of the form $y = ax^2$, in which case the limit becomes:

$$\lim_{\substack{(x,y)\to(0,0)\\ y = ax^2}} \frac{-x^2y}{x^4+y^2} = \lim_{x\to 0}\frac{-x^2(ax^2)}{x^4+(ax^2)^2} \tag{225}$$

$$= \lim_{x\to 0}\frac{-ax^4}{x^4(1+a^2)} \tag{226}$$

$$= \lim_{x\to 0}\frac{-a}{1+a^2} = \frac{-a}{1+a^2} \tag{227}$$

which depends on the specific parabola we use to approach the origin (in this case determined by the value of $a$). Different parabolas give different values, so we conclude that the limit does not exist.
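The contrast between lines and parabolas can be checked numerically. The following Python sketch evaluates the function from this problem along a few lines $y = mx$ and a few parabolas $y = ax^2$; the particular slopes, coefficients, and helper names are arbitrary choices of mine.

```python
def f(x, y):
    # Function from the problem: -x^2 y / (x^4 + y^2)
    return -x ** 2 * y / (x ** 4 + y ** 2)

def along_line(m, x):
    return f(x, m * x)          # path y = m x

def along_parabola(a, x):
    return f(x, a * x ** 2)     # path y = a x^2

xs = [10.0 ** (-k) for k in range(1, 6)]   # x shrinking toward 0
coefficients = (1.0, 2.0, -3.0)

line_values = {m: [along_line(m, x) for x in xs] for m in coefficients}
parabola_values = {a: [along_parabola(a, x) for x in xs] for a in coefficients}

for m in coefficients:
    print("line m =", m, line_values[m])
for a in coefficients:
    print("parabola a =", a, parabola_values[a])
```

Along each line the values shrink toward 0, while along each parabola they sit at $-a/(1+a^2)$ no matter how small $x$ gets, matching equation (227).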

Hopefully the limit stuff is all down. The most important thing to get out of it, in my opinion, is $M(r)$ and $m(r)$, which really comes down to picking a floor and a ceiling for your function. Basically, we are saying that our function is never larger than $M(r)$ and never smaller than $m(r)$. Then, if the floor and the ceiling are converging towards the same value, namely closing in on the center of the room, we get that the limit exists at that point and is equal to that value. See you on Monday! One week until the exam :)

11 Lecture VII on July 15, 2019

11.1 Partial Derivatives

Today we are getting to derivatives finally! Let's take a step back and generalize the derivative from single-variable calculus. Derivatives are really just seeing how much the function output changes as we change the input slightly. For a continuous function we see that:

$$f(a+h) - f(a) \approx 0 \tag{228}$$

for small $h$. This captures the idea that the function can increase or decrease as you move a small amount away from $a$, but for really small $h$ the change is not very large. If we want to gain more information, we can instead look at:

$$\frac{f(a+h) - f(a)}{h} \tag{229}$$

which, as we take the limit as $h$ goes to zero, becomes the formula for the derivative of the function $f$, namely:

$$f'(a) = \lim_{h\to 0}\frac{f(a+h) - f(a)}{h} \tag{230}$$

If we zoom in at this point, we will see a straight line. While the function itself might be curvy all over the place, if we zoom in enough, the function appears to be linear, and as such we can think of the derivative of $f$ at $a$ as just the slope at that point. It tells us how sensitive $f$ is to small changes in the input. Let's take a step up into 2 variables so that we can handle multivariable differentiation.

If $f : \mathbb{R}^2 \to \mathbb{R}$, then we define the partial derivatives of $f$ with respect to $x$ and $y$ respectively as:

$$\partial_x f(a,b) = \frac{\partial f}{\partial x}(a,b) = \lim_{h\to 0}\frac{f(a+h,b) - f(a,b)}{h} \tag{231}$$

$$\partial_y f(a,b) = \frac{\partial f}{\partial y}(a,b) = \lim_{h\to 0}\frac{f(a,b+h) - f(a,b)}{h} \tag{232}$$

Effectively what we are doing is saying, let me hold one of my variables constant and only look at changes in the other. We see that for $\partial_x f(a,b)$, we are just holding the $y$-variable constant and looking at a small change in $x$ to remark on how sensitive $f$ is with respect to changes in $x$. Let's try an example of taking partial derivatives:

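Problem: Differentiate $e^x\sin(xy)$ with respect to (w.r.t.) $x$ and $y$.

Solution:

$$\partial_x f(x,y) = \partial_x(e^x\sin xy) = \partial_x(e^x)\sin xy + e^x\,\partial_x(\sin xy)\,{}^{9} \tag{233}$$

$$\partial_x f(x,y) = e^x\sin xy + y\,e^x\cos xy \tag{234}$$

$$\partial_y f(x,y) = \partial_y(e^x\sin xy) \tag{235}$$

$$\partial_y f(x,y) = e^x\,\partial_y(\sin xy) = x\,e^x\cos xy \tag{236}$$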
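One way to double check partial derivatives like these is to compare them against the difference quotient from the definition with a small $h$. Here is a minimal Python sketch of that idea; the step size `h`, the sample point, and the helper names are my own assumptions, and a small-$h$ quotient only approximates the limit.

```python
import math

def f(x, y):
    return math.exp(x) * math.sin(x * y)

def partial_x(f, a, b, h=1e-6):
    """Difference quotient in x with y held fixed, as in equation (231)."""
    return (f(a + h, b) - f(a, b)) / h

def partial_y(f, a, b, h=1e-6):
    """Difference quotient in y with x held fixed, as in equation (232)."""
    return (f(a, b + h) - f(a, b)) / h

# Closed forms from equations (234) and (236)
def fx_exact(x, y):
    return math.exp(x) * math.sin(x * y) + y * math.exp(x) * math.cos(x * y)

def fy_exact(x, y):
    return x * math.exp(x) * math.cos(x * y)

a, b = 0.7, 1.3   # arbitrary sample point

print(partial_x(f, a, b), fx_exact(a, b))
print(partial_y(f, a, b), fy_exact(a, b))
```

Each printed pair should agree to several decimal places. If a hand-computed partial derivative disagrees badly with the quotient, that usually points to a missed product rule or chain rule factor.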

There's not much new here. We are just holding one variable constant and taking the derivative with respect to the other variable. The actual application of the partial derivatives is what is going to be fun. So we can put our tools of partial derivatives to work looking at graphs. Suppose we want to determine the sign of the partial derivative at a specific point. Suppose we look at the point $(1, 1)$ on the graph below.

9 I am using the product rule here since $x$ comes up in both of the factors!

Figure 5: f(x, y)

At this point, we want to determine if the partial derivatives with respect to $x$ and $y$ are positive or negative. Let's first look at the partial derivative with respect to $x$. Graphically, what this means is: if we scoot a little bit away from $(1, 1)$ in the positive $x$ direction, what direction are we heading? We can see that we would be heading downwards, like rolling down the hill, therefore we expect the partial derivative at this point to be negative. Now let's look at the $y$ direction. If we were to take a small step forward in the positive $y$ direction, we would have to walk a little bit uphill since the function is increasing. Therefore, since we would be walking upwards, the partial derivative with respect to $y$ at this point is positive. Let's continue with even more examples:

Problem: Given these 3 graphs, decide which of these graphs is f , ∂xf , and ∂yf . Here

is the picture of the three graphs:


Figure 6: Graphs of f , ∂xf , and ∂yf in no particular order.

Solution: Maybe we want to start by guessing that the first graph is $f$. This is nothing more than a guess. Suppose we choose to look along the $x$-axis. If the first graph is indeed $f$, then since $f$ remains flat along the $x$-axis, we would expect the derivative with respect to $x$ along it to be zero. This happens to not be the case for either of the two other graphs. As such, there is no way that the first graph can be the function itself. Now let's choose the second graph to be the function $f$. Let's look at the rightmost edge towards us. Note that as we move along the edge from back to front, along the $+x$-direction, the shape initially increases, and then it decreases. As such, we would expect a graph of the derivative with respect to $x$ to first start off positive, to match the initial increase, and turn negative about halfway through, to match the decrease. The third graph captures this, so we say that the third graph is the graph of $\partial_x f$. Finally, we can label the first graph as $\partial_y f$. We can look at the $y$-axis edge to match the behavior of the two.

Here is a neat little theorem that will come in handy throughout the course. It is not all too powerful in the grander scheme of things, but it is something to mention nonetheless.

Theorem: Clairaut's Theorem states that if $f_{xy}$ and $f_{yx}$ exist and are continuous, then:

$$f_{xy} = f_{yx} \tag{237}$$

I have a cool proof for this that I will include in the recitation notes tomorrow for those that are interested in completing a mathematics major, so be on the lookout for that :). Anyways, back to the course. Let's do the following example:

11.2 A difficult Example

Problem: Given the values of $f$ shown, approximate $f_{xy}(P)$. Let $P$ be the bottom left corner, which has value 2. 0.1 to the right of this point is another point with value 3. 0.1 above the point $P$, let's have a point $Q$ that has value 4. The top right corner, which is 0.1 to the right of the point $Q$ and 0.1 above the bottom right corner (the four points form a square), has value 6. Sorry I didn't get a picture of the drawing. If any of you have it, email it to me.

Solution: Let's think about what we have to do here. A reminder that $g_y$ measures the change in the $y$-direction of $g$. Here, we are doing this for $g = f_x$. For those that haven't seen it, I know I hadn't, $f_{xy}$ means to first take the derivative with respect to $x$ and then take the derivative with respect to $y$. Therefore, let's first concern ourselves with the inside derivative, the partial derivative of $f$ with respect to $x$. If we have just the picture available to us, then if we scoot over 0.1 to the right from the bottom left point (let's call this point $P$), our function changes value by 1. Therefore, $f_x$ can be viewed as the change in the function value over the distance moved. We get that:

$$\partial_x f(P) \approx \frac{1}{0.1} = 10 \tag{238}$$

Now let's do the same thing for the upper left point. We note that the function, when scooted over from the upper left point to the right, has a function value change of 2 in the space of 0.1. Therefore, we can get a similar expression, calling this point $Q$:

$$\partial_x f(Q) \approx \frac{2}{0.1} = 20 \tag{239}$$

Okay, so that takes care of the first derivative. Now let's take $\partial_y$ of $\partial_x f$. We look at the points $P$ and $Q$ that we have been using throughout the problem. We see that the value of $\partial_x f$ goes from 10 to 20 as we move up from point $P$ to $Q$, which changes our $y$ value by 0.1. So our function $\partial_x f$ changes value by 10 in the space of 0.1 scooting up in the $y$-direction. Therefore, we can calculate $\partial_y(\partial_x f) = f_{xy}$ as:

$$\partial_y(\partial_x f) \approx \frac{20 - 10}{0.1} = 100 \tag{240}$$
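The arithmetic above can be written out as a tiny Python sketch. The variable names are mine; the four values and the spacing 0.1 come from the problem statement.

```python
# Values of f at the four corners of the square, as described in the problem.
h = 0.1
f_P = 2.0             # bottom left corner, the point P
f_right_of_P = 3.0    # bottom right corner
f_Q = 4.0             # top left corner, the point Q
f_right_of_Q = 6.0    # top right corner

# Step 1: scoot right from P and from Q to estimate f_x at each point.
fx_P = (f_right_of_P - f_P) / h
fx_Q = (f_right_of_Q - f_Q) / h

# Step 2: scoot up from P to Q and see how f_x itself changed.
fxy_P = (fx_Q - fx_P) / h

print(fx_P, fx_Q, fxy_P)
```

The point of laying it out this way is that $f_{xy}$ is just a difference quotient applied to the outputs of an earlier difference quotient.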

Resulting in the answer of 100. Let's just recap what we did, since I probably made some spelling errors and weird sentences trying to catch up. I first started by looking at the point $P$ in the bottom left and the point $Q$ in the top left. I then said, lemme scoot over from each point a little bit to the right, seeing how much the function changed each time over the amount of space I scooted over. This represented my $\partial_x f$ at each of the points. Now I want to calculate $\partial_y g$ of my function $g = \partial_x f$. Therefore, I start at the point $P$ and scoot up along the $y$-direction to the point $Q$. I see that my function $g$ changes by 10 whilst making a scoot of only 0.1. Thus, I get that $\partial_y g = \partial_y(\partial_x f) = f_{xy} \approx 100$. Please email me with any questions you may have on this section because I know that this problem caused some confusion. As an exercise, let me add, try instead computing $f_{yx}$, which would be $\partial_x(\partial_y f)$.

Similar to single-variable, let's see how we can linearly approximate a function at a specific point. This was added under the recitation notes.

11.3 Linear Approximation

If we have a well-behaved function, i.e., a function that doesn't blow up, have asymptotes, or have a slope of 4000000000, we can make a linear approximation to the function at a specific point. This is similar to solving for the tangent line at a point in single-variable calculus. However, now that we are in higher dimensions, we will approximate our surfaces (functions) with a tangent plane. Let's start with a definition involving differentiability:

Definition: $f$ is differentiable at a point $a$ if there exists a linear function $L$ such that:

$$\lim_{x\to a}\frac{f(x) - L(x)}{|x - a|} = 0 \tag{241}$$

The differentiability clause here is very much necessary. If the function is not differentiable at the point in question, then we cannot say that there exists such a linear function $L(x)$. Look at the point $x = 0$ of the absolute value function $f(x) = |x|$. The function is not differentiable at this point, and as such, we cannot come up with a linear function that approximates the function there. When such an $L$ does exist, its slope equals the derivative at that point. Now for two variables, we can generalize the above definition so that we can make linear approximations for functions $f : \mathbb{R}^2 \to \mathbb{R}$. Let's restate the definition for the multivariable case:

Definition: A function $f : \mathbb{R}^2 \to \mathbb{R}$ is differentiable at a point $(a, b)$ if there exists a linear function $L$ such that:

$$\lim_{(x,y)\to(a,b)}\frac{f(x,y) - L(x,y)}{|(x,y) - (a,b)|} = 0 \tag{242}$$

that is,

$$\lim_{(x,y)\to(a,b)}\frac{f(x,y) - L(x,y)}{\sqrt{(x-a)^2 + (y-b)^2}} = 0 \tag{243}$$

In words, this is saying that if we zoom really far in around the point $(a, b)$, the function $f(x, y)$ strongly resembles its linear approximation $L(x, y)$. Within $L(x, y)$, we make an analog to the single-variable case: the coefficients of $x$ and $y$$^{10}$ in $L(x, y)$ are simply the partial derivatives with respect to $x$ and $y$ respectively, evaluated at $(a, b)$. If we want a closed-form expression (all this means is having an equation to represent this idea), we can express $L(x, y)$ around the point $(a, b)$ as:

$$L(x,y) = f(a,b) + \frac{\partial f}{\partial x}(a,b)\,(x-a) + \frac{\partial f}{\partial y}(a,b)\,(y-b) \tag{244}$$

In order to make sure our functions in question are differentiable, let's throw a theorem into the notes that we can cite to ensure that our function is differentiable at a point.

Theorem: If both $\partial_x f$ and $\partial_y f$ exist and are continuous throughout a disk around the point in question (think a small neighborhood around the point), then $f$ is differentiable at each point in the disk. To put this theorem into action, let's illuminate it with an example:

Problem: Show that $e^{xy}\sin(x^2 + y^2)$ is differentiable everywhere.

Solution: The partial derivatives with respect to $x$ and $y$ are just combinations of continuous functions like exponential, trigonometric, and polynomial functions, so the theorem above says that $f$ is differentiable. In order to really show this, we would have to take the partial derivatives. I will say that since the original function is made up of trigonometric, polynomial, and exponential functions, the partial derivatives will also be made of these. As such, since each partial derivative is just a combination (multiplication, addition, composition, etc.) of continuous functions, it is continuous everywhere, and the theorem applies. Let's now move to writing the equation of the tangent plane:

10 In the single-variable case, the coefficient of $x$ was the derivative with respect to $x$.

Definition: The linear approximation of $f : \mathbb{R}^2 \to \mathbb{R}$ at $(a, b)$ is the function:

$$L(x,y) = f(a,b) + \frac{\partial f}{\partial x}(a,b)\,(x-a) + \frac{\partial f}{\partial y}(a,b)\,(y-b) \tag{245}$$
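For the differentiability example $f(x, y) = e^{xy}\sin(x^2 + y^2)$ above, the partial derivatives that the solution alludes to can be written out explicitly. This is my own working using the product and chain rules, so treat it as a sketch to check against rather than part of the lecture:

```latex
\begin{align*}
\partial_x f &= y\,e^{xy}\sin(x^2+y^2) + 2x\,e^{xy}\cos(x^2+y^2),\\
\partial_y f &= x\,e^{xy}\sin(x^2+y^2) + 2y\,e^{xy}\cos(x^2+y^2).
\end{align*}
```

Both expressions are sums and products of polynomials, exponentials, sines, and cosines, so they are continuous on all of $\mathbb{R}^2$, which is exactly what the theorem needs.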

12 Recitation IV on July 16, 2019

12.1 Partial Derivative Notation

So, there was some mystery about what went on in lecture today. I want to clarify a few things ahead of time so that we are familiar with what is going on in the course. I heard from a few of you that some notation is quite funky, so let me show all of the partial derivative stuff briefly through an example. Let

$$f(x,y) = x^2y + x^3 \tag{246}$$

Suppose we first want to take the partial derivatives with respect to $x$ and $y$. Let me now do this below:

$$\frac{\partial f}{\partial x} = \partial_x f = f_x = 2xy + 3x^2 \tag{247}$$

$$\frac{\partial f}{\partial y} = \partial_y f = f_y = x^2 \tag{248}$$

Okay great. These are just the first partial derivatives. Now we can introduce the second partial derivatives. We have a few more options here, 4 in total. We can, for example, compute all of the following combinations:

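$$f_{xy} = \partial_y(\partial_x f) = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial y\,\partial x} \tag{249}$$

$$f_{xx} = \partial_x(\partial_x f) = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial x^2} \tag{250}$$

$$f_{yx} = \partial_x(\partial_y f) = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial x\,\partial y} \tag{251}$$

$$f_{yy} = \partial_y(\partial_y f) = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial y^2} \tag{252}$$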

Unfortunately, there are just so many ways to write these things, so we are forced to move between all of these notations. I like the last one in each row the best, but that is just me. Let's move on to compute each of these for our example problem above.

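$$f_{xy} = \partial_y(\partial_x f) = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial}{\partial y}(2xy + 3x^2) = 2x \tag{253}$$

$$f_{xx} = \partial_x(\partial_x f) = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial}{\partial x}(2xy + 3x^2) = 2y + 6x \tag{254}$$

$$f_{yx} = \partial_x(\partial_y f) = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial}{\partial x}(x^2) = 2x \tag{255}$$

$$f_{yy} = \partial_y(\partial_y f) = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial}{\partial y}(x^2) = 0 \tag{256}$$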
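The order of differentiation can also be explored numerically. The sketch below builds approximate derivative operators and composes them in both orders for $f(x, y) = x^2y + x^3$. Note that it uses central differences rather than the one-sided quotient from the definition, purely because they are more accurate for a fixed step; the step size, sample point, and helper names are my own assumptions.

```python
def f(x, y):
    return x ** 2 * y + x ** 3

def d_dx(g, h=1e-4):
    """Return a function approximating the x-partial of g (central difference)."""
    return lambda x, y: (g(x + h, y) - g(x - h, y)) / (2 * h)

def d_dy(g, h=1e-4):
    """Return a function approximating the y-partial of g (central difference)."""
    return lambda x, y: (g(x, y + h) - g(x, y - h)) / (2 * h)

f_xy = d_dy(d_dx(f))   # differentiate in x first, then in y
f_yx = d_dx(d_dy(f))   # differentiate in y first, then in x

x0, y0 = 1.5, -0.5     # arbitrary sample point

print(f_xy(x0, y0), f_yx(x0, y0), 2 * x0)
```

All three printed numbers should be close to each other, in line with the hand computation that both mixed partials equal $2x$.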

It appeared that the order of the differentiation wasn't all that clear today during lecture, so I wanted to clear that up.$^{11}$ In addition, I want to show the exercise that was left for home from class today.

11 Notice that equation (255) gives the same result as equation (253). This illustrates Clairaut's Theorem!

12.2 Clarifying an Example in Class on Clairaut's Theorem

Problem: Given the values of $f$ shown, approximate $f_{xy}(P)$. Let $P$ be the bottom left corner, which has value 2. 0.1 to the right of this point is another point with value 3. 0.1 above the point $P$, let's have a point $Q$ that has value 4. The top right corner, which is 0.1 to the right of the point $Q$ and 0.1 above the bottom right corner (the four points form a square), has value 6. Sorry I didn't get a picture of the drawing. If any of you have it, email it to me. In class, we did $f_{xy}$. Now let's do $f_{yx}$, and show that it is actually equal to $f_{xy}$.

Solution: First off, let's clear up the notation. If we are trying to find $f_{yx}$, we are first going to compute $f_y$, and then we are going to compute the partial derivative of $f_y$ with respect to $x$. Let's get on with this now. Okay, so first we want to compute $f_y$. Let's start at the point $P$ at the bottom left and scoot up to the point in the top left. If we do this, note that we are going to scoot up 0.1 units while having a function value change from 2 to 4, for a net change of 2. Therefore, we can approximate the partial derivative here as the change in the function's value over the change in $y$. Namely,

$$f_y(P) \approx \frac{4 - 2}{0.1} = 20 \tag{257}$$

We can do the same on the right hand side of our little box. Let's compute the partial derivative of $f$ with respect to $y$ on the right side of the box. We see that the bottom right corner has a function value of 3 and the top right corner has a function value of 6. Therefore, if we scoot up by 0.1 units, we bring about a change of 3 in the function value. Therefore, we can again compute the partial derivative at the bottom right corner point, let's call it $G$, as:

$$f_y(G) \approx \frac{6 - 3}{0.1} = 30 \tag{258}$$

Okay great. So now we need to apply the next partial derivative. Remember, we are trying to compute $f_{yx}$, namely we first scoot up from point $P$ and then we scoot to the right. Thus, note that our new function is not just $f$, but is instead $f_y$. Therefore, we are scooting to the right on the $f_y$ function. Let's look at our function values of $f_y$. In the bottom left corner at point $P$, the value is 20. In the bottom right corner at point $G$, the value is 30. Therefore, if we scoot over 0.1 to the right, we bring about a net change of 10 in the value of $f_y$, and we can approximate:

$$(f_y)_x \approx \frac{30 - 20}{0.1} = 100 \tag{259}$$

As such, we have shown that whether we take $f_{yx}$ or $f_{xy}$, we end up with a value of 100, consistent with Clairaut's theorem, which states that the two quantities are equal when the second-order partial derivatives are continuous. Hopefully this clarifies things.
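It is worth seeing why the two grid estimates had to agree. Writing $h = 0.1$ for the spacing and $R$ for the top right corner (a label I am introducing here; the problem does not name that point), both orders of differencing collapse to the same four-corner expression:

```latex
\begin{align*}
f_{xy}(P) &\approx \frac{1}{h}\left[\frac{f(R) - f(Q)}{h} - \frac{f(G) - f(P)}{h}\right]
           = \frac{f(R) - f(Q) - f(G) + f(P)}{h^2},\\
f_{yx}(P) &\approx \frac{1}{h}\left[\frac{f(R) - f(G)}{h} - \frac{f(Q) - f(P)}{h}\right]
           = \frac{f(R) - f(G) - f(Q) + f(P)}{h^2},\\
\text{so both equal}\quad &\frac{6 - 4 - 3 + 2}{(0.1)^2} = \frac{1}{0.01} = 100.
\end{align*}
```

So on a grid the two approximations always match exactly. Clairaut's theorem is the statement that this agreement survives the limit $h \to 0$ when the second partials are continuous.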

12.3 Linear Approximation

Linear approximations are really just the multivariable analog to tangent lines in single-variable calculus. What is going on here is that we are saying, okay, I have a function that is defined and has derivatives at some point. The function might be a bit peculiar and difficult, so let me approximate the function with a tangent plane. The formula for the tangent plane at the point $(a, b)$ is as follows:

$$z = L(x,y) = f(a,b) + \frac{\partial f(a,b)}{\partial x}(x-a) + \frac{\partial f(a,b)}{\partial y}(y-b) \tag{260}$$

So let's explain this. What are we saying? We're saying, let me pick a point that is quite close to the point $(a, b)$. Then the first term tells me, well, the value at that point is probably pretty close to the value of the function at $(a, b)$. However, maybe it is not quite that. Thus, we add in the partial derivative terms. What these are saying is that the value may vary a bit at points close to the base point $(a, b)$. Namely, the slopes in both directions multiplied by how much you move in each direction tell you how much to add to or subtract from the base value found at $(a, b)$. Think of the terms involving partial derivatives from a dimensional analysis standpoint. We are effectively taking $\frac{\partial f(a,b)}{\partial x}\Delta x$, which has units, air quotes, of $\Delta f$. This is not rigorous but it captures the essence of what is going on. So perhaps your function is increasing in both the $x$ and $y$ variables around the point $(a, b)$. Then we would expect that both $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ are positive. Thus, we start at the value $f(a, b)$ and add in the small changes $(x - a)$ and $(y - b)$, each multiplied by the slope in that variable at that point. Let's illustrate this with an example since I am probably rambling.

Problem: Consider the function

$$f(x,y) = x\ln y \tag{261}$$

Compute the linear approximation of the function $f(x, y)$ around the point $(1, e)$.

Solution: We can construct a linear approximation with the following equation:

$$f(x,y) \approx f(a,b) + \frac{\partial f(a,b)}{\partial x}(x-a) + \frac{\partial f(a,b)}{\partial y}(y-b) \tag{262}$$

Therefore, we can directly compute this as:

$$f(x,y) \approx f(1,e) + \frac{\partial f(1,e)}{\partial x}(x-1) + \frac{\partial f(1,e)}{\partial y}(y-e) \tag{263}$$

$$f(x,y) \approx 1 + \ln(e)(x-1) + \frac{1}{e}(y-e) \tag{264}$$

$$f(x,y) \approx 1 + (x-1) + \frac{1}{e}(y-e) \tag{265}$$
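To get a feel for how good this tangent plane is, here is a small Python sketch comparing $f(x, y) = x\ln y$ with the approximation from equation (265) at a few points near $(1, e)$. The offsets in `steps` are arbitrary choices of mine.

```python
import math

def f(x, y):
    return x * math.log(y)

a, b = 1.0, math.e   # base point (1, e)

def L(x, y):
    # Equation (265): f(1, e) = 1, f_x(1, e) = ln(e) = 1, f_y(1, e) = 1/e
    return 1.0 + (x - a) + (y - b) / math.e

steps = (0.1, 0.01, 0.001)
errors = []
for step in steps:
    x, y = a + step, b + step            # move the same small amount in x and y
    errors.append(abs(f(x, y) - L(x, y)))
    print(step, f(x, y), L(x, y), errors[-1])
```

Shrinking the step by a factor of 10 shrinks the error by roughly a factor of 100, which is the behavior you expect from a first-order (tangent plane) approximation.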

Please email me with any questions and if there are any errors. I typed this up very quickly

so that you could all look over it if necessary, so please alert me ASAP. You will be rewarded

with candy!

12.4 A Rigorous Proof of Clairaut's Theorem

Totally unnecessary for the course, but cool nonetheless. So, I know there are a few people that are actually quite interested in getting a degree in mathematics. In doing so, many of you will take analysis courses that seek to prove many of the things we use every day in calculus. In class this week, we learned about Clairaut's theorem, which states that, for a function $f : \mathbb{R}^2 \to \mathbb{R}$ with continuous second-order partial derivatives,

$$\frac{\partial}{\partial x}\frac{\partial f}{\partial y} = \frac{\partial}{\partial y}\frac{\partial f}{\partial x} \tag{266}$$

Let's now go on to prove this with rigor. Let's start off by stating the theorem we seek to prove:

Theorem: Suppose $f : [a, b] \times [c, d] \to \mathbb{R}$ has continuous second-order partial derivatives. Then $f_{xy} = f_{yx}$ on $(a, b) \times (c, d)$.

In order to prove the theorem, I want to cite a theorem from Arthur Mattuck's Real Analysis textbook. I will now state it here:

Theorem 12.6: Let $g \in C([a, b] \times [c, d])$. Then there exists a sequence $p_n(x, y)$ of two-variable polynomials such that $p_n \to g$ uniformly.

We will now utilize Theorem 12.6 for our continuous function $f_{xy}$, generating a sequence of polynomials $p_n$ such that

$$|p_n(x,y) - f_{xy}(x,y)| < \varepsilon(n) \quad \forall (x,y) \in [a,b]\times[c,d] \tag{267}$$

under the condition that:

$$\lim_{n\to\infty}\varepsilon(n) = 0 \tag{268}$$

Then, for any rectangle $D = [x_1, x_2] \times [y_1, y_2] \subset [a, b] \times [c, d]$,

$$\left|\iint_D p_n\,dx\,dy - \iint_D f_{xy}\,dx\,dy\right| < \varepsilon(n)A(D) \tag{269}$$

where $A(D) = (x_2 - x_1)(y_2 - y_1)$ is the area of our prescribed rectangle $D$. Note:

$$\iint_D f_{xy}\,dx\,dy = \iint_D f_{yx}\,dy\,dx \tag{270}$$

since both of these double integrals are equal to

$$f(x_2,y_2) - f(x_2,y_1) - f(x_1,y_2) + f(x_1,y_1) \tag{271}$$

Since $p_n$ is a polynomial, we can also use the fact that:

$$\iint_D p_n\,dx\,dy = \iint_D p_n\,dy\,dx \tag{272}$$

which holds for each $n \in \mathbb{N}$. Thus, we generate the inequality:

$$\left|\iint_D p_n\,dy\,dx - \iint_D f_{yx}\,dy\,dx\right| < \varepsilon(n)A(D) \tag{273}$$

Finally, we take the limit as $n \to \infty$ to achieve the following equation:

$$\iint_D (f_{xy} - f_{yx})\,dy\,dx = 0 \tag{274}$$

Since this holds for every rectangle $D$ and the integrand is continuous, it implies that for a function with continuous second-order partial derivatives:

$$f_{xy} = f_{yx} \tag{275}$$

13 Lecture VIII on July 17, 2019

13.1 Review on Linear Approximations

Let's start off with a bit of review. We covered linear approximations in a jiffy, so let's go back and clarify. If $f$ is differentiable, then $f$ can be linearly approximated as:

$$L(x,y) = f(a,b) + \partial_x f(a,b)(x-a) + \partial_y f(a,b)(y-b) \tag{276}$$

So now, close to the point $(a, b)$, the linear approximation has values that are similar to the function's values. So, sometimes we will use the linear approximation instead of the actual function to approximate the function's value around $(a, b)$. There are no approximations beyond second order on the ASE, so please do not spend time on those if you were planning to.

13.2 Multivariable Optimization

Now we are going to move forward to optimization. Back in single-variable calculus, we set our first derivative equal to zero. The only difference as we move up in dimensions is that we set our partial derivatives, namely with respect to $x$ and $y$, both equal to zero. Let's consider an exercise from single-variable calculus:

Exercise: Find the maximum and minimum value of f(x) = |(1 − x)(x − 3)| over the

interval [0, 3].

Solution: Here we go. So, we will want to find the critical points of the function along the interval. In addition, we are going to want to check the ends of the interval! Since we are only looking at a smaller interval rather than the entire line, we are going to check not just where the derivative equals zero, but also the edges of the interval, where $x = 0$ and where $x = 3$. The Extreme Value Theorem tells us that $f$ has a maximum and a minimum (since $f$ is continuous and defined on a closed interval). Also, such a maximum and minimum must occur at a critical point or an endpoint. So, we find all critical points and endpoints and check.$^{12}$ Since we have to deal with the absolute value bars, we actually get a graph that is a tad funkier than it would have been without. Here is a graph of the function in question:

Figure 7: Graph of $|(1-x)(x-3)|$

12 This is a very well-worded answer. However, on exams you would not have to state all of these statements unless otherwise asked.

It appears right from the graph that the function has its largest value, 3, at $x = 0$. In addition, we see that $x = 1$ and $x = 3$ both achieve the function's smallest value on the interval, 0. Does this make sense though? Well, let's think about it. Since we are using some form of an absolute value function, we should never get a function value that is less than zero. Therefore, we would expect both the maximum and minimum to be greater than or equal to zero. Also, you can see why it is important to check the endpoints when we are working on a closed interval, since our maximum was at one of the endpoints. Let's now move forward into two dimensions to see if we can take the ideas of single-variable and move them into multivariable.
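The "check the critical points and the endpoints" recipe is easy to mirror in a few lines of Python. The candidate list below is my own working: the endpoints 0 and 3, the corners of the absolute value at $x = 1$ and $x = 3$, and $x = 2$, where the derivative of $(1 - x)(x - 3)$ vanishes.

```python
def f(x):
    return abs((1 - x) * (x - 3))

# Candidates on [0, 3]: endpoints, points where f is not differentiable,
# and the point where the derivative of (1 - x)(x - 3) is zero.
candidates = [0, 1, 2, 3]
values = {x: f(x) for x in candidates}

maximum = max(values.values())
minimum = min(values.values())

print(values)
print("max:", maximum, "min:", minimum)
```

Running through the candidates this way reproduces what the graph suggests: the maximum sits at the endpoint $x = 0$, and the minimum is attained at both $x = 1$ and $x = 3$.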

Exercise: Find the maximum and minimum of $f(x,y) = -x^2 - y^2 + x + \frac{2}{3}y + \frac{23}{36}$ on $[0, 1]^2$. Just a note: $[0, 1]^2$ is just the unit square, and it means we are letting both $x$ and $y$ be in $[0, 1]$.

Solution: Again, we can start with the Extreme Value Theorem. We can state that if $f : D \to \mathbb{R}$ is continuous and $D$ is closed (includes all of its boundary points) and bounded (contained in some large box), then $f$ attains a maximum and a minimum on $D$. In our case, we have a closed square, since we are including the boundary in our domain, and it is bounded. Bounded simply means that our domain doesn't run off to infinity in any direction. Together with continuity, this guarantees that, like in the limits, we can put a floor and a ceiling around the function, boxing it in. Now, we can describe our critical points in $\mathbb{R}^2$ as the points where:

$$\frac{\partial f}{\partial x} = \frac{\partial f}{\partial y} = 0 \tag{277}$$

And, like the other case, we will have to check the endpoints, which in this case are the borders, to see if the maximum or minimum lies along them. What are our edges in the case of the unit square? Well, we have 4 edges with the following equations describing them:

$$y = 0 \tag{278}$$

$$y = 1 \tag{279}$$

$$x = 0 \tag{280}$$

$$x = 1 \tag{281}$$

We call these our boundary critical point checkers. And we call the places where both $\partial_x f$ and $\partial_y f$ equal zero, or where the function is not differentiable, the interior critical points. The thing about boundary critical points is that they suck. Why? Well, let's see. Suppose I plug in the border on the bottom of the unit square, where $y = 0$. What happens to our function? We now just have a function of one variable on $[0, 1]$. Thus, we basically have a smaller single-variable sub-problem, where we find the absolute maximum and minimum along each of the four boundaries. As such, since we have four boundaries, we expect to have 4 smaller single-variable absolute maximum and minimum problems as we check the boundary. Let's actually try one of the borders out. Along $f(t, 0)$:

$$f(t) = -t^2 + t + \frac{23}{36} \tag{282}$$

Now we have a function of 1 variable. Note that I used $t$ to parametrize $x$ along this edge, since $x$ can vary between 0 and 1. Don't forget to check the corners! However, note that you only need to check each corner once, since it will pop up as an end of the interval for two of your parametrizations. I will write a detailed solution to this problem in the recitation notes for tomorrow. So, if you are reading along, go on to that to see all the work. For now, let's move on to the interior critical points.

$$\frac{\partial f}{\partial x} = 0 = -2x + 1 \tag{283}$$

$$\frac{\partial f}{\partial y} = 0 = -2y + \frac{2}{3} \tag{284}$$

This results in $(x, y) = \left(\frac{1}{2}, \frac{1}{3}\right)$ being the critical point, yielding the function value of 1. At this point, we would check all of the boundary points, which I will add in later. Assuming we do that, we get that the maximum of the function is 1 and the minimum of the function is $\frac{11}{36}$. So, knowing how to solve the equations is extremely important, and I have to help people with this throughout the year as a tutor. So, it was a great question and deserves a full answer. Check the recitation notes for tomorrow for a follow up.
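Until the full boundary work is written up, here is a Python sketch of the candidate check, using exact fractions. The edge critical points listed below are my own working (setting the derivative of each one-variable edge function to zero), not something stated in lecture, so double check them against the recitation notes.

```python
from fractions import Fraction as F

def f(x, y):
    return -x ** 2 - y ** 2 + x + F(2, 3) * y + F(23, 36)

candidates = {
    "interior critical point": (F(1, 2), F(1, 3)),
    # Edge critical points: derivative of the one-variable edge function is zero.
    "edge y = 0": (F(1, 2), F(0)),
    "edge y = 1": (F(1, 2), F(1)),
    "edge x = 0": (F(0), F(1, 3)),
    "edge x = 1": (F(1), F(1, 3)),
    # The four corners of the unit square.
    "corner (0, 0)": (F(0), F(0)),
    "corner (1, 0)": (F(1), F(0)),
    "corner (0, 1)": (F(0), F(1)),
    "corner (1, 1)": (F(1), F(1)),
}

values = {name: f(x, y) for name, (x, y) in candidates.items()}
maximum = max(values.values())
minimum = min(values.values())

for name, value in values.items():
    print(name, value)
print("max:", maximum, "min:", minimum)
```

With these candidates the largest value is 1 at the interior critical point and the smallest is $\frac{11}{36}$ at the two top corners, which agrees with the answer quoted above.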

Problem: Find the critical points of

$$f(x,y) = (2x^2 + 3y^2)e^{-x^2-y^2} \tag{285}$$

Solution: Taking the partial derivatives with respect to $x$ and $y$ respectively, so that we can set each equal to zero:

$$\partial_x f = 2x(-2x^2 - 3y^2 + 2)e^{-x^2-y^2} \tag{286}$$

$$\partial_y f = 2y(-2x^2 - 3y^2 + 3)e^{-x^2-y^2} \tag{287}$$

Thus, if we look at the factor in front of the partial derivative with respect to $x$, one way to make $\partial_x f = 0$ is $x = 0$. Plugging this into the second equation, we see that $y$ can be 0, 1, or $-1$, and then $f_x = f_y = 0$. By the same process, if $y = 0$, then $x = 0, 1, -1$. Finally, if $x \neq 0$ and $y \neq 0$, then we must have both of the following equations being true:

$$-2x^2 - 3y^2 + 2 = 0 \tag{288}$$

$$-2x^2 - 3y^2 + 3 = 0 \tag{289}$$

which yields no solutions, since subtracting the first from the second gives $1 = 0$. Therefore, we only get the points that were discussed prior: $(0, 0)$, $(0, 1)$, $(0, -1)$, $(1, 0)$, and finally $(-1, 0)$.
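A quick way to double check a list of critical points is to estimate the gradient numerically at each one. The Python sketch below does this with central differences (my choice of method and step size, not something from lecture) and also looks at one non-critical point for contrast.

```python
import math

def f(x, y):
    return (2 * x ** 2 + 3 * y ** 2) * math.exp(-x ** 2 - y ** 2)

def numeric_gradient(x, y, h=1e-5):
    """Central-difference estimate of (f_x, f_y) at (x, y)."""
    fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return fx, fy

critical_points = [(0, 0), (0, 1), (0, -1), (1, 0), (-1, 0)]
gradients = {p: numeric_gradient(*p) for p in critical_points}

for p, g in gradients.items():
    print(p, g)

# For contrast, a point that is not on the list:
print((1, 1), numeric_gradient(1, 1))
```

At the five listed points both components come out at essentially zero, while at $(1, 1)$ they do not, which is consistent with equations (286) and (287).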

13.3 The Second Derivative Test (ASE)

The second derivative test is a way that we can classify the critical points of a function, similar to that in single-variable calculus. Since we have 4 different second-order partial derivatives, the general formula and conditions are a little bit more extensive than previously. Let me call the quantity we are going to use to organize all the second-order partials $D$ and define $D$ as:

$$D = f_{xx}f_{yy} - f_{xy}f_{yx} = f_{xx}f_{yy} - f_{xy}^2 \tag{290}$$

We will always skip to the second equality, since we know by Clairaut's Theorem that $f_{xy} = f_{yx}$ for functions with continuous second-order partial derivatives. Now, in multivariable, we have three potential classifications: a max, a min, and a saddle. The maximum and minimum are similar to those in single-variable, but the saddle is the new type of classification that we have here. At a saddle, moving around the point in question doesn't exhibit the uniform behavior of moving either up or down that you would get at a min or max; a typical case is $f_{xx}$ and $f_{yy}$ being of opposite sign. Let's look at the conditions for each type. Consider $D(a, b)$, namely $D$ at the critical point $(a, b)$:

$$D(a,b) = f_{xx}(a,b)f_{yy}(a,b) - f_{xy}(a,b)^2 \tag{291}$$

95

We obtain a:

1. Relative Minimum: If D(a, b) > 0 and $f_{xx}(a, b) > 0$

2. Relative Maximum: If D(a, b) > 0 and $f_{xx}(a, b) < 0$

3. Saddle: If D(a, b) < 0

4. Unknown: If D(a, b) = 0. We basically do not have enough information to determine the nature of this critical point.

Just a side note, because I am asked this question a lot throughout the year. If D > 0, then it is always the case that $f_{xx}$ and $f_{yy}$ have the same sign. As such, everywhere you see an $f_{xx}$ condition in the relative min and max conditions above, you can replace it with an $f_{yy}$ condition if that is what suits your fancy. Here is why: the $-f_{xy}^2$ term always contributes something non-positive, so if D > 0, the product $f_{xx}f_{yy}$ must be positive, which forces $f_{xx}$ and $f_{yy}$ to be of the same sign! Let's just do a quick example to reinforce all that was covered.

13.3.1 An Example in Second Derivatives

Problem: Find and classify the critical points of the function:

$$f(x, y) = 3x^2y + y^3 - 3x^2 - 3y^2 + 7 \tag{292}$$

Solution: Let's start off by taking all of the partial derivatives and second-order partial derivatives, as they will all come into play throughout the problem:

$$f_x = 6xy - 6x \tag{293}$$

$$f_y = 3x^2 + 3y^2 - 6y \tag{294}$$

$$f_{xx} = 6y - 6 \tag{295}$$

$$f_{yy} = 6y - 6 \tag{296}$$

$$f_{xy} = 6x \tag{297}$$

Okay, now let's find the critical points of this function so that we can classify each of them. We can find the critical points by setting both the partial derivative with respect to x and the partial derivative with respect to y equal to zero. Therefore:

$$f_x = 6xy - 6x = 6x(y - 1) = 0 \tag{298}$$

So we obtain that either x = 0 or y = 1 from the partial derivative with respect to x. Let's now plug these, one at a time, into our partial derivative with respect to y set equal to zero. For the case of x = 0,

$$f_y = 3x^2 + 3y^2 - 6y = 3y^2 - 6y = 3y(y - 2) = 0 \tag{299}$$

yielding the result that when x = 0, y = 0 or y = 2. Now let's plug the y = 1 case into our partial derivative with respect to y.

$$f_y = 3x^2 + 3y^2 - 6y = 3x^2 - 3 = 0 \tag{300}$$

This yields that when y = 1, x = 1 or x = −1. As such, we have a total of four critical points, located at (0, 0), (0, 2), (1, 1), and finally (−1, 1). Let's now plug each of these points into our second derivative test. For this function, the quantity D is:

$$D(x, y) = f_{xx}f_{yy} - f_{xy}^2 = (6y - 6)(6y - 6) - (6x)^2 = (6y - 6)^2 - (6x)^2 \tag{301}$$

Let's now evaluate D and classify each point.

$$D(0, 0) = (-6)^2 - 0 = 36 > 0 \tag{302}$$

So, immediately we know that (0, 0) is either a relative minimum or maximum. Since $f_{xx}(0, 0) = -6 < 0$, (0, 0) must be a relative max.

$$D(0, 2) = (6)^2 - 0 = 36 > 0 \tag{303}$$

So, immediately we know that (0, 2) is either a relative minimum or maximum. Since $f_{xx}(0, 2) = 6 > 0$, (0, 2) must be a relative min.

$$D(1, 1) = 0^2 - 36 = -36 < 0 \tag{304}$$

Therefore, (1, 1) must be a saddle point.

$$D(-1, 1) = 0^2 - 36 = -36 < 0 \tag{305}$$

Therefore, (−1, 1) must be a saddle point. Hopefully this all makes sense, because this will most definitely be on the ASE!
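If you want to double-check a classification like this one, here is a small sketch (my own illustration, with made-up function names) that hard-codes the second-order partials (295) through (297) and applies the four rules of the second derivative test to each critical point.

```python
def second_partials(x, y):
    """Second-order partials of f(x, y) = 3x^2 y + y^3 - 3x^2 - 3y^2 + 7."""
    fxx = 6 * y - 6
    fyy = 6 * y - 6
    fxy = 6 * x
    return fxx, fyy, fxy

def classify(x, y):
    """Apply the second derivative test at a critical point (x, y)."""
    fxx, fyy, fxy = second_partials(x, y)
    d = fxx * fyy - fxy**2
    if d > 0:
        return "relative min" if fxx > 0 else "relative max"
    if d < 0:
        return "saddle"
    return "inconclusive"

critical_points = [(0, 0), (0, 2), (1, 1), (-1, 1)]
results = {point: classify(*point) for point in critical_points}
for point, label in results.items():
    print(point, label)
```

The function only makes sense at points that are already known to be critical; it does not check that the first partials vanish.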


13.4 Directional Derivative

This is a pretty neat section. Suppose we want to calculate a derivative of f in some direction that is not strictly the x or the y direction. (If it were, we could just use the partial derivatives with respect to x and y.) So let's suppose we want to calculate the derivative in some arbitrary direction $\vec{u}$, where $\vec{u}$ is a Unit Vector by definition. Then we can express the directional derivative in the direction of $\vec{u}$ at the point (a, b) as:

$$(D_{\vec{u}}f)(a, b) = \lim_{h \to 0} \frac{f((a, b) + h\vec{u}) - f(a, b)}{h} \tag{306}$$

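If we are at the point (a, b), maybe we want to deploy what we learned last time with regard to linear approximations. Let L be the linear approximation of f at (a, b), and add and subtract it in the numerator:

$$(D_{\vec{u}}f)(a, b) = \lim_{h \to 0} \frac{L(a + hu_1, b + hu_2) - L(a + hu_1, b + hu_2) + f(a + hu_1, b + hu_2) - f(a, b)}{h} \tag{307}$$

$$= \lim_{h \to 0} \frac{f((a, b) + h\vec{u}) - L((a, b) + h\vec{u})}{h} + \lim_{h \to 0} \frac{f_x(a, b)hu_1 + f_y(a, b)hu_2}{h} \tag{308}$$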

Now look at the first limit. This goes to zero by our definition of the linear approximation from last class. In the second limit, the h divides out, leaving an expression without any h's present. A much more convenient form, which is the one we will use when actually calculating such a thing, is:

$$(D_{\vec{u}}f)(a, b) = \left(\frac{\partial f(a, b)}{\partial x}, \frac{\partial f(a, b)}{\partial y}\right) \cdot \vec{u} = \left(\frac{\partial f(a, b)}{\partial x}, \frac{\partial f(a, b)}{\partial y}\right) \cdot (u_1, u_2) \tag{309}$$

where $|\vec{u}| = 1$. Again, I repeat, $\vec{u}$ is a Unit Vector. Forgetting this is one of the most common mistakes I see as a TA when people are working through such problems. This is a great formula that we will be in contact with. The vector of the partial derivatives has a name. It is called the gradient, and it is defined as below:

$$\vec{\nabla}f(a, b) = \left(\frac{\partial f(a, b)}{\partial x}, \frac{\partial f(a, b)}{\partial y}\right) \tag{310}$$

Which allows us to write our directional derivative as:

$$(D_{\vec{u}}f)(a, b) = \vec{\nabla}f(a, b) \cdot \vec{u} = |\vec{\nabla}f(a, b)| \cos\theta \tag{311}$$

Here θ is the angle between the gradient and $\vec{u}$. Therefore, our directional derivative is the largest when the unit vector, $\vec{u}$, points in the same direction as the gradient. This is the case of walking in the direction of max increase, i.e. walking the steepest path up a hill. In the case that the gradient and the unit vector are orthogonal, the directional derivative is zero. This is the equivalent of walking along a certain level. The directional derivative is smallest when the unit vector is anti-parallel to the gradient. This is the equivalent of taking the steepest path down the hill.
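To see formula (309) and the limit definition (306) agree, here is a minimal numerical sketch. The function $f(x, y) = x^2y + y^3$, the point (1, 2), and the direction (3, 4) are all assumptions picked purely for illustration; none of them come from the lecture.

```python
import math

def f(x, y):
    # Illustrative function (an assumption, not from the notes).
    return x**2 * y + y**3

def grad_f(x, y):
    # Gradient of the illustrative function, computed by hand.
    return (2 * x * y, x**2 + 3 * y**2)

def unit(direction):
    """Normalize a 2D direction so that it is a unit vector."""
    length = math.hypot(direction[0], direction[1])
    return (direction[0] / length, direction[1] / length)

def directional_derivative(point, direction):
    """Formula (309): gradient dotted with the UNIT vector."""
    u = unit(direction)
    gx, gy = grad_f(*point)
    return gx * u[0] + gy * u[1]

def difference_quotient(point, direction, h=1e-6):
    """Definition (306) with a small but finite h."""
    u = unit(direction)
    a, b = point
    return (f(a + h * u[0], b + h * u[1]) - f(a, b)) / h

print(directional_derivative((1, 2), (3, 4)))
print(difference_quotient((1, 2), (3, 4)))
```

Notice that `unit` is applied before the dot product. Skipping that step is exactly the common mistake mentioned above: dotting the gradient with (3, 4) directly would overstate the answer by a factor of 5.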

14 Recitation V on July 18, 2019

Great work today in recitation. The problems were quite difficult, and we seemed to have a pretty good understanding of what was going on. Let me give a quick recap of some of the highlights from my talking, your questions, and things I think would be relevant.

14.1 A Small Note on Multivariable Optimization

One thing that I noticed while working through the annoying problems on the worksheet is that it is both beneficial and important to check that the critical points in question are within the region. For example, if you are working in the unit square and you calculate that there exists a critical point at (2, 3), then we must immediately omit it. Even if this is a critical point of the function, it is not within the region that we are optimizing over. As such, we will not include it in trying to find our absolute maximum and minimum.

14.2 Gradients and Directional Derivatives

The gradient at a point (a, b) points in the direction of maximum increase. So, picture yourself on a mountain.

$$\vec{\nabla}f(a, b) = \left(\frac{\partial f(a, b)}{\partial x}, \frac{\partial f(a, b)}{\partial y}\right) \tag{312}$$

Here is the equation for reference. Suppose you calculate that, where you are standing, the gradient of the function points in the direction of the unit vector $\frac{1}{5}(3, 4)$. As such, if you wanted to get to the top of the mountain as fast as possible, you would take a step forward in this direction. It is not necessarily true that once you reach the new point, the direction of maximal increase is the same as at the previous point. This was a great question in class! The gradient is evaluated at each point. It tells you: given you are at this point, this is the direction you should head in order to ascend in the quickest way possible. The directional derivative comes into play in the following equation:

$$(D_{\vec{u}}f)(a, b) = \vec{\nabla}f(a, b) \cdot \vec{u} \tag{313}$$

where $\vec{u}$ is a unit vector in some arbitrary direction. We can see that the direction of maximal increase should be the same direction as the gradient. As a matter of fact, we can define it as:

$$\vec{u}_{\text{GreatestInc}} = \frac{\vec{\nabla}f}{|\vec{\nabla}f|} \tag{314}$$

As such, if we want to head in the direction of maximal decrease, we can express this as:

$$\vec{u}_{\text{GreatestDec}} = \frac{-\vec{\nabla}f}{|\vec{\nabla}f|} \tag{315}$$


14.3 Following a Path of Max Increase

So now, thanks to a combination of questions and conversation with Jordan, Hector, Ivan, Grace, and Raima, I wanted to include this section. Suppose we want to follow a path along the gradient. How can we compose this path? We saw before that calculating the gradient at a specific point tells you what direction to head given you're at that point. However, suppose now we want to calculate the whole path of travel. How could we do this? Well, we could calculate the gradient at an arbitrary point (x, y)[13]. Then we could think: if I am following the path of greatest increase, then my velocity had better always point in the same direction as the gradient. As such, the velocity of the particle should be a positive scalar multiple of the gradient of the function. Therefore we get that:

$$\vec{\nabla}f(x, y) = c\,\vec{v}(t) \tag{316}$$

where c is some positive constant. The value of c only changes how fast we travel along the path, not the path itself, so we can just take c = 1 for convenience:

$$\vec{\nabla}f(x, y) = \vec{v}(t) = (x'(t), y'(t)) \tag{317}$$

As such, we can match component by component in order to try to craft back some $\vec{r}(t)$ function. Let me illuminate this with an example so that we have something to follow along with :)

[13] This could simply be written as (x, y).

Problem: Suppose that the temperature in a room $[0, 5]^3$ is given as a function of position by $T(x, y, z) = 50 + x^2 + (y - 3)^2 + 2z$. You are a bug starting at position (3, 2, 2), and you are cold. You decide to move in the direction of greatest temperature increase at all times. First find the direction that the bug initially wants to fly in. Then calculate the path of the bug, $\vec{r}(t)$.

Solution: We have that

$$\vec{\nabla}T(x, y, z) = \langle 2x, 2y - 6, 2 \rangle \tag{318}$$

and thus that we will initially move in the direction

$$\vec{\nabla}T(3, 2, 2) = \langle 6, -2, 2 \rangle. \tag{319}$$

We wish to find our position in space as a function of time, at an arbitrary speed, if we follow the direction of greatest increasing temperature; we will call this $\vec{r}(t)$. Because our velocity will always point in the direction of the gradient, we know that

$$\vec{r}\,'(t) = \langle x'(t), y'(t), z'(t) \rangle = \lambda \langle 2x(t), 2y(t) - 6, 2 \rangle. \tag{320}$$

If we let λ = 1 (its exact value does not matter), we can then solve each component equation to get $\vec{r}(t)$. We have that

$$\vec{r}(t) = \langle c_1e^{2t}, c_2e^{2t} + 3, 2t + c_3 \rangle. \tag{321}$$

If we let $\vec{r}(0) = (3, 2, 2)$, then we have that $c_1 = 3$, $c_2 = -1$, and $c_3 = 2$, giving us that

$$\vec{r}(t) = \langle 3e^{2t}, -e^{2t} + 3, 2t + 2 \rangle. \tag{322}$$

I thought that this was a really cool example problem that is a great application of the gradient and ideas from the paths in space chapter.
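Here is a hedged sketch for checking the bug's path numerically: it compares a finite-difference velocity of $\vec{r}(t)$ from (322) against the temperature gradient (318) evaluated along the path. The helper names and the step size h are my own choices for illustration.

```python
import math

def grad_T(x, y, z):
    """Gradient (318) of T(x, y, z) = 50 + x^2 + (y - 3)^2 + 2z."""
    return (2 * x, 2 * y - 6, 2.0)

def r(t):
    """The path (322) of the bug."""
    return (3 * math.exp(2 * t), -math.exp(2 * t) + 3, 2 * t + 2)

def velocity(t, h=1e-6):
    """Central-difference approximation of r'(t)."""
    ahead, behind = r(t + h), r(t - h)
    return tuple((a - b) / (2 * h) for a, b in zip(ahead, behind))

def max_mismatch(t):
    """Largest componentwise gap between r'(t) and the gradient at r(t)."""
    return max(abs(v - g) for v, g in zip(velocity(t), grad_T(*r(t))))

print(r(0))                # starting position of the bug
print(max_mismatch(0.0))   # should be tiny
print(max_mismatch(0.5))   # should be tiny as well
```

This only confirms that the velocity matches the gradient with λ = 1; it does not check whether the bug is still inside the room $[0, 5]^3$ at a given time.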


15 Lecture IX on July 19, 2019

We are officially halfway through the summer! I hope that you have had a great experience thus far! Please let me know how my notes are so that I can make them better for those that use them. Let's kick off lecture with a review:

15.1 Review on Directional Derivatives

We ended class with:

$$(D_{\vec{u}}f)(a, b) = \vec{\nabla}f(a, b) \cdot \vec{u} \tag{323}$$

where $\vec{u}$ is a unit vector, $|\vec{u}| = 1$. This represents the sensitivity of f to small changes in the $\vec{u}$ direction from (a, b). $\vec{\nabla}f(a, b)$ represents the gradient of f at the point (a, b). Remember that the gradient points in the direction of maximal increase. Thus, if we want to find the direction of maximal increase, then the unit vector:

$$\vec{u}_{\text{GreatestInc}} = \frac{\vec{\nabla}f}{|\vec{\nabla}f|} \tag{324}$$

represents this direction. In addition, the direction of maximal decrease would be:

$$\vec{u}_{\text{GreatestDec}} = \frac{-\vec{\nabla}f}{|\vec{\nabla}f|} \tag{325}$$

The directions orthogonal to the gradient have a directional derivative equal to zero. This is the equivalent of walking around a level curve instead of walking up or down a function. Given that $\vec{\nabla}f(a, b) = (\alpha, \beta)$, two unit vectors that are orthogonal to the gradient, and as such have a directional derivative equal to zero, are:

$$\vec{u}_{\text{orthog1}} = \frac{1}{\sqrt{\alpha^2 + \beta^2}}(-\beta, \alpha) \tag{326}$$

$$\vec{u}_{\text{orthog2}} = \frac{1}{\sqrt{\alpha^2 + \beta^2}}(\beta, -\alpha) \tag{327}$$

Let's now consider a level set of a function f, which we will assume is differentiable. For example, maybe we have $f : \mathbb{R}^2 \to \mathbb{R}$. Then the gradient will always be perpendicular to the level curves of f. Since moving in a direction along the level curve produces a directional derivative equal to zero, the direction perpendicular to the level curve must be either the gradient's direction, of max increase, or the direction opposite the gradient, of max decrease. Cheers to Victor for answering this question in class :0.

It is important to remember that the gradient is orthogonal to the level curves of a function, f. Let's do a quick example to reinforce our learning:

Problem: Find an equation of the plane tangent to $x^2 + y^2 + 2z^2 = 4$ at (1, 1, 1).

Solution: We have solved problems similar to this using a linear approximation method that ends up creating a tangent plane. This problem is a bit different. Note that this is an equation. Before, we were using tangent planes to approximate functions: we were looking at f(x, y), at the graph of f. Now we have an equation, though, and there is no function clearly seen here. If we wanted, we could solve this equation for z, but this is problematic. Why? Well, if we solve this equation for z, the square root gives us two different surfaces, and we have to choose which surface to use. So, let's try a different way to solve this problem. Instead, we can think of our equation as a level set of a function $f : \mathbb{R}^3 \to \mathbb{R}$. In fact, consider the following function:

$$f(x, y, z) = x^2 + y^2 + 2z^2 \tag{328}$$

Now, if we look at the level set f(x, y, z) = c where c = 4, we can see that the equation in the problem statement is simply a level set of the function above. As a quick reminder of the conversations in chapter one, we cannot graph the actual function f(x, y, z), since it would require four dimensions. However, we definitely can graph its level sets, which happen to be ellipsoids, like the equation given. We can now take the gradient of this function at the point (1, 1, 1), because we know that the gradient of this function will be perpendicular to the level set of the function, which is the original equation we were given. Namely,

$$\vec{\nabla}(x^2 + y^2 + 2z^2) = (2x, 2y, 4z) \tag{329}$$

which, at the point (1, 1, 1), results in (2, 2, 4) being the gradient. Since the gradient is by nature perpendicular to the level surface, the vector $\vec{n} = (2, 2, 4)$ is indeed perpendicular to our surface. However, where have we seen normal vectors coming into play before? Planes! We note that the tangent plane at the point (1, 1, 1) is defined by its normal vector in the form ax + by + cz = d, where $\vec{n} = (a, b, c)$. Thus,

$$2x + 2y + 4z = d \tag{330}$$

plugging in the information available to us. We can solve for d by plugging in a point on our plane, (1, 1, 1). As such, we obtain that d = 8, and the equation for the plane tangent to the surface at (1, 1, 1) is:

$$2x + 2y + 4z = 8 \tag{331}$$
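The recipe above (gradient as normal vector, then plug in the point to get d) is mechanical enough to sketch in a few lines. This is just an illustration with names of my own choosing; the gradient is hard-coded from (329).

```python
def grad_f(x, y, z):
    """Gradient (329) of f(x, y, z) = x^2 + y^2 + 2z^2."""
    return (2 * x, 2 * y, 4 * z)

def tangent_plane(point):
    """Return (normal, d) for the plane a*x + b*y + c*z = d tangent to the level set at `point`."""
    normal = grad_f(*point)
    d = sum(n * p for n, p in zip(normal, point))
    return normal, d

normal, d = tangent_plane((1, 1, 1))
print(normal, d)
```

The same two steps work at any other point on the ellipsoid, as long as the point really does satisfy the original equation.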


It is quite confusing to get all of this. The biggest point of confusion for me was this: planes are defined by their normal vector. Therefore, although this vector itself is normal to the level surface that we were given, that same normal vector defines the tangent plane! To recap, what we did was say: okay, I am given an equation. I am going to think of my equation as a level set of a function f. I am then going to take the gradient of f at the point, noting that the gradient is orthogonal to the level sets of the function f. So, for the tangent plane, the gradient is the normal vector that defines the plane. We then wanted a full equation for our tangent plane, so we plugged in the point to solve for d, resulting in the equation for the tangent plane. I know there is a lot of flipping between tangent and normal, so read this over a few times to make sure you have it all down. Let's have a rapid flip over to the chain rule.

15.2 Multivariable Chain Rule

The chain rule was first taught to us in single-variable calculus. We used it when we had compositions of functions such as f(g(t)). Let's start off with an example. Let's compute the derivative of f(g(t)):

$$\lim_{h \to 0} \frac{f(g(t + h)) - f(g(t))}{h} = f'(g(t)) \lim_{h \to 0} \frac{g(t + h) - g(t)}{h} = f'(g(t))g'(t) \tag{332}$$

Now let's see how we can take the idea of the chain rule and apply it in the multivariable setting. I will introduce the subject by stating the formulas just for reference. I will then go on to explain it afterwards. The multivariable chain rule is expressed succinctly as:

$$\frac{df}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} \tag{333}$$


However, let's look a little bit more at this expression. At first glance, we are matching the partial derivative with respect to a variable with the derivative of that variable with respect to time, and multiplying them. We are then summing over all of the variables, essentially getting the contribution from each of the variables to the overall change in f with respect to time. In fact, this component-by-component sum is actually a hidden dot product. We can write the multivariable chain rule alternatively as:

$$\frac{df}{dt} = \vec{\nabla}f \cdot \vec{r}\,'(t) = \left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right) \cdot (x'(t), y'(t)) = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} \tag{334}$$

At a very hand-wavy level, each term has a 'cancellation' of the ∂x (or ∂y) with the dx (or dy), leaving each term as a change in f over a change in t. In addition, we want our final answer to be in terms of t, since we are taking the derivative with respect to time. Let me show you a quick example of this.

Problem: Compute $\frac{df}{dt}$ for $f(x, y) = x^2 + y$ along $\vec{r}(t) = (t, t^2)$.

Solution: We can solve this in two ways. I will highlight both of them now.

1. We can plug in our expressions for x and y into the function so that we have f only in terms of t. Let's do that now. Note that this does not really use anything new in terms of mixing partial derivatives and full derivatives.

$$f(t) = (t)^2 + t^2 = 2t^2 \tag{335}$$

$$\frac{df}{dt} = 4t \tag{336}$$

Now let's try it our new way and see if we can get the exact same expression for $\frac{df}{dt}$.

2. We will now make use of the multivariable chain rule.

$$\frac{df}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} \tag{337}$$

$$\frac{df}{dt} = (2x, 1) \cdot (1, 2t) = 2x + 2t \tag{338}$$

However, now we need to plug in our expression for x, since our final answer should only be in terms of t. As such, we obtain that:

$$\frac{df}{dt} = (2x, 1) \cdot (1, 2t) = 2t + 2t = 4t \tag{339}$$

This leads to the exact same answer, validating our claim. Let's move to the proof.
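For anyone who likes to see the two methods agree numerically, here is a small sketch of the example above. The chain-rule version follows (338), and the finite-difference version differentiates f along the path directly; the helper names and the step size h are illustrative assumptions.

```python
def f(x, y):
    return x**2 + y

def path(t):
    """r(t) = (t, t^2)."""
    return (t, t**2)

def chain_rule(t):
    """df/dt via the gradient dotted with the velocity, as in (338)."""
    x, y = path(t)
    grad = (2 * x, 1)
    velocity = (1, 2 * t)
    return grad[0] * velocity[0] + grad[1] * velocity[1]

def central_difference(t, h=1e-5):
    """df/dt approximated directly from f(r(t))."""
    return (f(*path(t + h)) - f(*path(t - h))) / (2 * h)

for t in (0.0, 1.0, 3.0):
    print(t, chain_rule(t), central_difference(t))
```

Both columns should track 4t, matching (336) and (339).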

15.2.1 A Proof of the Multivariable Chain Rule

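Let's try to come up with a proof for this:

$$\frac{d}{dt}(f(\vec{r}(t))) = \lim_{h \to 0} \frac{f(\vec{r}(t + h)) - f(\vec{r}(t))}{h} \tag{340}$$

Let's now introduce a linear approximation L of f at the point $\vec{r}(t)$, adding and subtracting $L(\vec{r}(t + h))$ in the numerator:

$$\frac{d}{dt}(f(\vec{r}(t))) = \lim_{h \to 0} \frac{L(\vec{r}(t + h)) - L(\vec{r}(t + h)) + f(\vec{r}(t + h)) - f(\vec{r}(t))}{h} \tag{341}$$

Since L is a linear approximation of f at $\vec{r}(t)$, we can group the two middle terms in our numerator, $f(\vec{r}(t + h)) - L(\vec{r}(t + h))$; divided by h, this goes to zero as h → 0. Therefore, we are left with:

$$\frac{d}{dt}(f(\vec{r}(t))) = \lim_{h \to 0} \frac{L(\vec{r}(t + h)) - f(\vec{r}(t))}{h} \tag{342}$$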


Now, since at the point $\vec{r}(t)$ both L and f have the exact same value, we can substitute our second term in the numerator with $L(\vec{r}(t))$.

$$\frac{d}{dt}(f(\vec{r}(t))) = \lim_{h \to 0} \frac{L(\vec{r}(t + h)) - L(\vec{r}(t))}{h} \tag{343}$$

$$\frac{d}{dt}(f(\vec{r}(t))) = \lim_{h \to 0} \frac{f_x\,(r_1(t + h) - r_1(t)) + f_y\,(r_2(t + h) - r_2(t))}{h} \tag{344}$$

$$\frac{d}{dt}(f(\vec{r}(t))) = f_x\,r_1' + f_y\,r_2' = \vec{\nabla}f(\vec{r}(t)) \cdot \vec{r}\,'(t) \tag{345}$$

where $\vec{r} = (r_1(t), r_2(t)) = (x(t), y(t))$, and $f_x$ and $f_y$ are evaluated at $\vec{r}(t)$. This proof is not necessary to ever use, but it is the proof behind the pudding of the formula that we are going to be working with. It is much more important to know how to compute with the multivariable chain rule, like in the example that I did prior to the proof. Unlike directional derivatives, our dot product does not need to include a unit vector. The $\vec{r}\,'(t)$ can have any magnitude. It makes sense to incorporate its magnitude, since an object moving faster, namely one whose velocity has a higher magnitude, should see a larger rate of change of f.

16 Lecture X on July 23, 2019

Great job on the midterm yesterday. You all did really well. Let's start off today with a review of a few topics.

16.1 Review on Partial Derivatives and Mixed Partials

What is the interpretation of each of our partial derivatives? Consider $f_x$ and $f_y$. They represent scooting a bit to the right and up, respectively, and noting how the function value changes. In addition, we can discuss $f_{xx}$. This represents how much the function $f_x$ changes if we scoot a bit to the right. Namely, we are looking at the rate of change of $f_x$ in the x direction. We can discuss $f_{yy}$. This represents how much the function $f_y$ changes if we scoot a bit up. Namely, we are looking at the rate of change of $f_y$ in the y direction. Finally, there are also mixed partial derivatives. Consider $f_{xy}$. This means first finding the function $f_x$. Then, we can look at how $f_x$ changes with respect to y. Namely, we look at the rate of change of $f_x$ in the y direction. The third problem on the homework best illustrates this. If you draw functions in $\mathbb{R}^n$, are you an R-tist? In addition, let's just state the multivariable chain rule again. We can express this as:

$$\frac{df}{dt} = \vec{\nabla}f \cdot \vec{r}\,'(t) = \left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right) \cdot (x'(t), y'(t)) = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} \tag{346}$$

16.2 Lagrange Multipliers

Lagrange multipliers are a new way to solve optimization problems along the boundary. Consider the following function:

$$f(x, y) = -x^2 - y^2 + x + \frac{2}{3}y + \frac{23}{36} \tag{347}$$

that we wish to optimize on the following region:

$$D = \left\{(x, y) : \left(x - \tfrac{1}{2}\right)^2 + \left(y - \tfrac{1}{2}\right)^2 \le \tfrac{1}{4}\right\} \tag{348}$$

Last week, we learned how to solve this type of problem with a really lengthy process of first checking the interior critical points, then points where f isn't differentiable, then writing equations for the borders, completing a slew of single-variable problems, and finally evaluating f at all of these points to see which value was the absolute smallest and largest. We are now going to take a quicker, more "sophisticated" approach. We note that along the border of the region, the gradient of f, $\vec{\nabla}f$, must be perpendicular to the boundary of the region D, ∂D, at a point if f restricted to the boundary is to have a local optimum there. As such, for an optimum along the boundary of the region D, it must be true that:

$$\vec{\nabla}f = \lambda\vec{\nabla}g \tag{349}$$

Thinking of the boundary of the region D as a level set of a function g, we recall that the gradient of g, $\vec{\nabla}g$, points orthogonal to the boundary. As such, we would expect $\vec{\nabla}f$ and $\vec{\nabla}g$ to be parallel to one another (pointing in the same or in opposite directions). Since both $\vec{\nabla}f$ and $\vec{\nabla}g$ are vectors, we can capture this idea by saying that $\vec{\nabla}f$ is a scalar multiple of $\vec{\nabla}g$. So, for some constant λ ∈ R, it must be true that:

$$\vec{\nabla}f = \lambda\vec{\nabla}g \tag{350}$$

This is the punchline. Seeing why this equation is true is what we completed with the discussion of the level sets of g, but in practice we will just use this equation to find local optima along the boundary of the region that we are optimizing over. Let's now try to solve this. We were given an inequality expressing D. We now take g to be:

$$g(x, y) = \left(x - \tfrac{1}{2}\right)^2 + \left(y - \tfrac{1}{2}\right)^2 \tag{351}$$

Here we took the border equation, whose left-hand side equals $\frac{1}{4}$, noted that it describes a level set, and called g the function that it is a level set of. Now we have some partial derivatives to take:

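$$\vec{\nabla}f = \lambda\vec{\nabla}g \tag{352}$$

$$\left(-2x + 1, -2y + \tfrac{2}{3}\right) = \lambda\left(2\left(x - \tfrac{1}{2}\right), 2\left(y - \tfrac{1}{2}\right)\right) \tag{353}$$

Nice! So, if we equate each of the components, we obtain that:

$$-2x + 1 = 2\lambda\left(x - \tfrac{1}{2}\right) \tag{354}$$

$$-2y + \tfrac{2}{3} = 2\lambda\left(y - \tfrac{1}{2}\right) \tag{355}$$

This is looking good. However, we have one problem. We have two equations with three unknowns. How can we introduce a third equation that must be satisfied? Well, we already noted that all of this mess is only true on the boundary. Thus, we can obtain a third equation by writing the equation for the boundary. We have that:

$$\left(x - \tfrac{1}{2}\right)^2 + \left(y - \tfrac{1}{2}\right)^2 = \tfrac{1}{4} \tag{356}$$

Solving the first equation, we obtain that either λ = −1 or $x = \frac{1}{2}$. However, in the case of λ = −1, the second equation in our set cannot hold (it would reduce to $\frac{2}{3} = 1$). As such, the only candidates along the boundary have $x = \frac{1}{2}$, and the boundary equation then yields a y-coordinate of 1 or 0. Inside, we also obtain a critical point of $(\frac{1}{2}, \frac{1}{3})$. We can now evaluate our function at all three of these points:

$$f\left(\tfrac{1}{2}, 0\right) = \tfrac{8}{9} \tag{357}$$

$$f\left(\tfrac{1}{2}, 1\right) = \tfrac{5}{9} \tag{358}$$

$$f\left(\tfrac{1}{2}, \tfrac{1}{3}\right) = 1 \tag{359}$$

So on this region, the maximum of f is 1 and the minimum is $\frac{5}{9}$.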
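The three function values are easy to get wrong by hand, so here is a short sketch that evaluates f from (347) with exact fractions. The choice to use Python's `fractions` module, and the variable names, are mine.

```python
from fractions import Fraction as F

def f(x, y):
    """f(x, y) = -x^2 - y^2 + x + (2/3) y + 23/36, as in (347)."""
    return -x * x - y * y + x + F(2, 3) * y + F(23, 36)

# Two boundary candidates from the Lagrange equations plus the interior critical point.
candidates = [(F(1, 2), F(0)), (F(1, 2), F(1)), (F(1, 2), F(1, 3))]
values = {point: f(*point) for point in candidates}

for point, value in values.items():
    print(point, value)
print("max:", max(values.values()), "min:", min(values.values()))
```

Working in fractions avoids any rounding, so the printed values can be compared directly against (357) through (359).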

Let's move on to a more practical example.

Problem: Find the maximum volume of a lidless box with a surface area of 72.

Solution: So, let's turn this sentence into a Lagrange multiplier problem. We want to maximize our function f = V, the volume, subject to the constraint on the surface area of the lidless box, g = S. Since the box is going to be rectangular, we can express the volume function as f(x, y, z) = xyz. In addition, we can express the surface area function as S = g(x, y, z) = xy + 2xz + 2yz = 72. It actually doesn't matter which side the lid is taken off of, so it does not matter that we took an xy side off; we could have equivalently taken off a yz side. We can now deploy the Lagrange multiplier equation:

$$\vec{\nabla}f = \lambda\vec{\nabla}g \tag{360}$$

$$(yz, xz, xy) = \lambda(y + 2z, x + 2z, 2x + 2y) \tag{361}$$

Now we have three equations above with four unknowns. So we must introduce the constraint as well:

$$xy + 2xz + 2yz = 72 \tag{362}$$

so that we have the set of equations:

$$yz = \lambda(y + 2z) \tag{363}$$

$$xz = \lambda(x + 2z) \tag{364}$$

$$xy = \lambda(2x + 2y) \tag{365}$$

$$xy + 2xz + 2yz = 72 \tag{366}$$

Which I will solve later and put into the notes right here. We can now finally move on to integration.
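Returning to the box system (363) through (366) for a moment: the notes leave it unsolved, so what follows is only a hedged sketch, not the worked solution promised above. One way to proceed (my own derivation, an assumption about the intended route) is to subtract (364) from (363), which suggests x = y for a box with nonzero volume, and then use (365) and (363) to get x = 2z. The constraint then gives $3x^2 = 72$. The code below simply checks that this candidate satisfies all four equations.

```python
import math

# Candidate derived by hand (an assumption, not taken from the notes):
# x = y = 2z with 3x^2 = 72.
x = y = 2 * math.sqrt(6)
z = math.sqrt(6)
lam = x / 4  # from equation (365) with x = y

# Left side minus right side for equations (363) through (366).
residuals = [
    y * z - lam * (y + 2 * z),
    x * z - lam * (x + 2 * z),
    x * y - lam * (2 * x + 2 * y),
    x * y + 2 * x * z + 2 * y * z - 72,
]
volume = x * y * z

print(residuals)  # each entry should be essentially zero
print(volume)     # volume of the candidate box
```

This confirms that the candidate is a solution of the Lagrange system with volume $24\sqrt{6}$; it does not by itself prove that the volume is the maximum.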


16.3 Integration

First let's think of a single-variable case to realign ourselves with integration after so much differentiation. We can think of integration back in single-variable calculus as the signed area that lies under the graph of f. We can think of an integral as splitting the interval over which we are integrating into a bunch of tiny pieces. We then see how much each piece contributes: its width times the value of the function on that tiny piece. We are effectively, in the single-variable case, taking the signed area of each one of these really skinny rectangles. We then add up each one of these contributions. Then, we take the number of pieces to infinity, making them infinitely thin! Let's hop right into an example of an integral.

Example: Integrate

$$f(x, y) = y\sin(\pi yx) \tag{367}$$

over the unit square, $[0, 1]^2$.

Solution: Instead of some interval, note that we are now integrating f over a 2D region, namely a square. Now we want to divide our region into a bunch of little squares! Then, for each of these little squares that we cut the unit square into, we are going to take the function's value at that square. Then, we can express each contribution to the volume as the function value at the really tiny square multiplied by the area of the super tiny square. We are then going to sum up each of these contributions from each tiny square. Formally, we first sum up the contributions along each row of squares and then sum up the row totals. The actual integral expression for this idea is:

$$\int_0^1 \int_0^1 f(x, y)\,dx\,dy = \int_0^1 \int_0^1 y\sin(\pi yx)\,dx\,dy \tag{368}$$

where

$$\int_0^1 f(x, y)\,dx \tag{369}$$

represents summing up along each row, and then:

$$\int_0^1 \left(\int_0^1 f(x, y)\,dx\right) dy \tag{370}$$

represents taking each of these total row contributions and summing those up. When first taking the integral with respect to x, similar to partial differentiation, we hold y constant and treat it as such.
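The "sum each row, then sum the rows" picture translates almost word for word into a nested loop. The sketch below uses a midpoint rule with n = 200 squares per side (an arbitrary choice of mine). For comparison, carrying out the two integrals in (370) by hand gives $\frac{1}{\pi}$: the inner integral is $\frac{1 - \cos(\pi y)}{\pi}$, and integrating that over [0, 1] leaves $\frac{1}{\pi}$. That closed form is my own computation, not something stated in the lecture.

```python
import math

def f(x, y):
    return y * math.sin(math.pi * x * y)

def midpoint_double_sum(n=200):
    """Approximate the double integral over [0, 1]^2 with n*n tiny squares."""
    side = 1.0 / n
    total = 0.0
    for row in range(n):            # outer sum: over rows (the dy integral)
        y = (row + 0.5) * side
        row_total = 0.0
        for col in range(n):        # inner sum: along one row (the dx integral)
            x = (col + 0.5) * side
            row_total += f(x, y) * side * side
        total += row_total
    return total

approx = midpoint_double_sum()
exact = 1.0 / math.pi
print(approx, exact)
```

Increasing n makes the squares smaller and pushes the sum closer to the exact value, which is the limiting process described above.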

17 Recitation VI on July 24, 2019

I will take this space to reach out with some further discussion points on Lagrange multipliers. Firstly, Lagrange multipliers only work along the boundary of your region. For example, suppose you have some region D that is the unit disk. Then the Lagrange multiplier method will only work along $x^2 + y^2 = 1$, otherwise known as the unit circle, which happens to be the border of the disk. That being said, the Lagrange multiplier method relies heavily on the direction of the gradient. Namely, we should think of a Lagrange multiplier problem as a process. We first decide: I want to maximize and/or minimize some function f. In addition, I want to carry out this optimization of f over some boundary, which we can describe as a level set of a function g. Then, what the Lagrange multiplier formula tells us is that:

$$\vec{\nabla}f = \lambda\vec{\nabla}g \tag{371}$$

Let's walk through some problems in order to really get down the overarching idea. And we can even see some really neat problems along the way!


Problem: Find the maximum and minimum of the function f(x, y) = 2x − 3y subject to the constraint $x^2 + y^2 = 64$.

Solution: We can use the method of Lagrange multipliers to solve this problem, where we set $g(x, y) = x^2 + y^2$:

$$\vec{\nabla}f = \lambda\vec{\nabla}g \tag{372}$$

We have that:

$$(2, -3) = \lambda(2x, 2y) \tag{373}$$

with the third equation of:

$$x^2 + y^2 = 64 \tag{374}$$

We can write x and y in terms of λ by solving the first two equations, namely:

$$x = \frac{1}{\lambda} \tag{375}$$

$$y = \frac{-3}{2\lambda} \tag{376}$$

Plugging into our third equation, we obtain that:

$$\frac{1}{\lambda^2} + \frac{9}{4\lambda^2} = 64 \tag{377}$$

Leading to:

$$\lambda = \pm\frac{\sqrt{13}}{16} \tag{378}$$


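Which, if we plug back into our equations, gives two points:

$$\left(\frac{16}{\sqrt{13}}, \frac{-24}{\sqrt{13}}\right) \quad \& \quad \left(\frac{-16}{\sqrt{13}}, \frac{24}{\sqrt{13}}\right) \tag{379}$$

We can evaluate our function at these two points, leading to:

$$f\left(\frac{16}{\sqrt{13}}, \frac{-24}{\sqrt{13}}\right) = \frac{32}{\sqrt{13}} + \frac{72}{\sqrt{13}} = \frac{104}{\sqrt{13}} \tag{380}$$

$$f\left(\frac{-16}{\sqrt{13}}, \frac{24}{\sqrt{13}}\right) = \frac{-32}{\sqrt{13}} + \frac{-72}{\sqrt{13}} = \frac{-104}{\sqrt{13}} \tag{381}$$

resulting in the max and min, respectively, along the constraint.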
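As a rough cross-check (a sketch of my own, not part of the recitation), we can rebuild the two candidate points from λ and compare them against a brute-force scan around the circle of radius 8. The number of scan steps is an arbitrary choice.

```python
import math

def f(x, y):
    return 2 * x - 3 * y

# Candidates from (375), (376), and (378): x = 1/lambda, y = -3/(2 lambda).
lam = math.sqrt(13) / 16
candidates = [(1 / l, -3 / (2 * l)) for l in (lam, -lam)]

def scan_circle(radius=8.0, steps=3600):
    """Brute-force max and min of f on the circle x^2 + y^2 = radius^2."""
    values = [
        f(radius * math.cos(2 * math.pi * k / steps),
          radius * math.sin(2 * math.pi * k / steps))
        for k in range(steps)
    ]
    return max(values), min(values)

scan_max, scan_min = scan_circle()
print(candidates)
print(f(*candidates[0]), scan_max)
print(f(*candidates[1]), scan_min)
```

The scan can only come close to the true extremes from the inside, so the Lagrange values should be at least as extreme as anything the scan finds.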

Problem: A right cylindrical can is to have a volume of 0.25 cubic feet (approximately 2 gallons). Find the height h and radius r that will minimize the surface area of the can. What is the relationship between the resulting r and h?

Solution: Let's first get down equations for both the surface area and the volume.

$$f(r, h) = S = 2\pi r^2 + 2\pi rh \tag{382}$$

$$g(r, h) = V = \pi r^2h = 0.25\ \text{ft}^3 \tag{383}$$

We can now deploy the Lagrange multiplier equation:

$$\nabla f = \lambda\nabla g \tag{384}$$

$$(4\pi r + 2\pi h, 2\pi r) = \lambda(2\pi rh, \pi r^2) \tag{385}$$

Let's now divide the first component equation by the second, which eliminates λ:

$$\frac{2r + h}{r} = \frac{2h}{r} \tag{386}$$

$$rh = 2r^2 \tag{387}$$

Since r must be positive, we can divide by r and express h in terms of r.

$$h = 2r \tag{388}$$

We can then plug this in to our constraint equation:

$$2\pi r^3 = 0.25 \tag{389}$$

$$r = \left(\frac{0.25}{2\pi}\right)^{1/3} \tag{390}$$

$$h = \left(\frac{1}{\pi}\right)^{1/3} \tag{391}$$
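Here is a small numerical sketch of the can problem, under the assumption that we only compare cans that hold exactly 0.25 cubic feet. It checks that (390) and (391) satisfy h = 2r and the volume constraint, and that nudging the radius in either direction (with the height adjusted to keep the volume) makes the surface area larger. The 10% nudges are arbitrary.

```python
import math

VOLUME = 0.25  # cubic feet

def surface_area(r, h):
    """Surface area (382) of a closed cylindrical can."""
    return 2 * math.pi * r**2 + 2 * math.pi * r * h

def height_for_volume(r):
    """Height that keeps the volume (383) fixed for a given radius."""
    return VOLUME / (math.pi * r**2)

r_opt = (VOLUME / (2 * math.pi)) ** (1 / 3)   # equation (390)
h_opt = (1 / math.pi) ** (1 / 3)              # equation (391)

best = surface_area(r_opt, h_opt)
for r in (0.9 * r_opt, 1.1 * r_opt):
    print(r, surface_area(r, height_for_volume(r)), ">", best)
```

So the answer to the "relationship" question is that the optimal can is as tall as it is wide: the height equals the diameter.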

Problem: Assume there are two commodities with amounts x and y and respective prices $p_x$ and $p_y$. In addition, suppose that you have a utility function that you wish to maximize, of the form $U(x, y) = x^{\alpha}y^{1 - \alpha}$ for some constant α ∈ (0, 1). You maximize your utility function[14] subject to some budget constraint. Namely, suppose that in total you have m dollars to spend on both your products. Maximize the utility function to boom up the 'merican economy by:

1. Writing a constraint equation relating the amounts and prices of the two goods along with your total amount of money, m.

2. Setting up functions f and g and using the Lagrange multiplier equation to find the amounts of x and y you should purchase.

[14] This is called a Cobb-Douglas utility function.

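Solution:

1. Let's start off by writing a budget constraint. We buy amounts x and y at prices $p_x$ and $p_y$, and the total cost has to be less than or equal to the amount of money we have, m:

$$p_xx + p_yy \le m \tag{392}$$

2. We can now set up the function that we wish to maximize, U(x, y), subject to our constraint $p_xx + p_yy \le m$. Since U increases with both x and y, the maximum will spend the entire budget, so we work on the boundary $p_xx + p_yy = m$. Therefore, we take the functions f and g as:

$$f(x, y) = x^{\alpha}y^{1 - \alpha} \tag{393}$$

$$g(x, y) = p_xx + p_yy \tag{394}$$

We can now use the method of Lagrange multipliers to solve this problem:

$$\vec{\nabla}f = \lambda\vec{\nabla}g \tag{395}$$

$$\left(\alpha x^{\alpha - 1}y^{1 - \alpha}, (1 - \alpha)x^{\alpha}y^{-\alpha}\right) = \lambda(p_x, p_y) \tag{396}$$

We can add in the constraint as well:

$$p_xx + p_yy = m \tag{397}$$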

120

If we divide the first equation by the second equation we obtain that:

αxα−1y1−α

(1− α)xαy−α=pxpy

(398)

αy

(1− α)x=pxpy

(399)

Lets now solve this equation for y

y =1− αα

pxx

py(400)

We can now plug this into our constraint equation:

pxx+ pyy = pxx+ py1− αα

pxx

py= x(px +

1− αα

px) = m (401)

As such, we obtain that:

x =m

(px + 1−ααpx)

=mα

px(402)

Which we can now plug into our expression for y to obtain:

y =1− αα

pxpy

m

(px + 1−ααpx)

=(1− α)m

αpy(1 + 1−αα

)=m(1− α)

py(403)
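If you want to convince yourself of (402) and (403) with actual numbers, the following Python sketch compares the closed-form answer with a brute-force search along the budget line. The values of α, the prices, and m are hypothetical numbers I picked for illustration, and the function names are my own.

```python
def cobb_douglas_demand(alpha, px, py, m):
    """Optimal bundle from equations (402) and (403)."""
    x = alpha * m / px
    y = (1 - alpha) * m / py
    return x, y

def utility(x, y, alpha):
    """Cobb-Douglas utility U(x, y) = x^alpha * y^(1 - alpha)."""
    return x**alpha * y**(1 - alpha)

# Hypothetical numbers chosen only for illustration.
alpha, px, py, m = 0.3, 2.0, 5.0, 100.0
x_star, y_star = cobb_douglas_demand(alpha, px, py, m)

# Brute-force search along the budget line px*x + py*y = m.
best = max(
    (utility(x, (m - px * x) / py, alpha), x)
    for x in (m / px * k / 10000 for k in range(1, 10000))
)
print(x_star, y_star)   # closed-form answer
print(best[1])          # grid maximizer, should be close to x_star
```

Notice that you end up spending the fraction α of your money on x and the fraction 1 − α on y, no matter what the prices are.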

Problem: Deriving Snell's Law. Snell's Law is known as $n_1 \sin(\alpha) = n_2 \sin(\beta)$. Please reference the picture on the next piece of paper for the drawing setup of the problem. We are given a beam of light starting at the point A, a distance a above the horizontal, passing through the interface at the middle, and reaching point B, a distance b below the horizontal, moving from a region with refractive index $n_1$ to a region with refractive index $n_2$. Now, light normally moves at the speed c. However, in a medium with refractive index n, light takes on the velocity $v = \frac{c}{n}$. With this knowledge, derive Snell's law.

Solution: Light always follows the path that takes the shortest time. Therefore we can make our f function the time over the journey, since we are seeking to minimize this. In addition, we can make the length in the x direction our constraint, since we cannot change the actual position of the two points in question. Since distance over time is velocity, we can rearrange and get a formula for the time over the entire journey, namely:

$$f(\alpha, \beta) = t = \sum \frac{d}{v} = \frac{a}{v_1 \cos\alpha} + \frac{b}{v_2 \cos\beta} \tag{404}$$

And our accompanying constraint:

$$g(\alpha, \beta) = L = a \tan\alpha + b \tan\beta \tag{405}$$

We now have an f and g with two unknown variables that happen to be our two angles, so it is time to have some fun!

$$\nabla f = \lambda \nabla g \tag{406}$$

$$\left(\frac{a}{v_1} \sec\alpha \tan\alpha,\ \frac{b}{v_2} \sec\beta \tan\beta\right) = \lambda\,(a \sec^2\alpha,\ b \sec^2\beta) \tag{407}$$

Let's now solve each of our component equations for λ:

$$\lambda = \frac{\tan\alpha}{v_1 \sec\alpha} = \frac{\tan\beta}{v_2 \sec\beta} \tag{408}$$

Since $\frac{\tan}{\sec} = \sin$, we can plug in the velocities $v_1 = \frac{c}{n_1}$ and $v_2 = \frac{c}{n_2}$ to get:

$$\frac{n_1}{c} \sin\alpha = \frac{n_2}{c} \sin\beta \tag{409}$$

$$n_1 \sin\alpha = n_2 \sin\beta \tag{410}$$

Which is the equation that is known as Snell's law!
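Here is a small numerical experiment, as a sketch under my own assumptions, that finds the fastest path directly and then checks Snell's law afterwards. Instead of the angles, it uses the horizontal position x where the beam crosses the interface as the single unknown, which builds the constraint (405) in automatically. The geometry and refractive indices are hypothetical, and ternary search is just one simple way to minimize a convex function of one variable.

```python
import math

def travel_time(x, a, b, L, n1, n2, c=1.0):
    """Time for light to go from (0, a) to (x, 0) to (L, -b); the speed is c/n in each medium."""
    return n1 * math.hypot(a, x) / c + n2 * math.hypot(b, L - x) / c

def fastest_crossing(a, b, L, n1, n2, iters=200):
    """Ternary search for the crossing point x in [0, L] that minimizes the travel time."""
    lo, hi = 0.0, L
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if travel_time(m1, a, b, L, n1, n2) < travel_time(m2, a, b, L, n1, n2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

# Hypothetical geometry and indices (roughly air into water).
a, b, L, n1, n2 = 1.0, 2.0, 3.0, 1.0, 1.33
x = fastest_crossing(a, b, L, n1, n2)
sin_alpha = x / math.hypot(a, x)            # angles are measured from the vertical
sin_beta = (L - x) / math.hypot(b, L - x)
print(n1 * sin_alpha, n2 * sin_beta)        # the two sides of Snell's law
```

The two printed numbers should agree to several decimal places, which is exactly (410).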

18 Lecture XI on July 25, 2019

18.1 Review on Ideas Behind Integration

Last week, we ended with an example of an integral over a unit square. We can try to generalize our idea of the integral by thinking about what an integral is at its core. If $f : D \to \mathbb{R}$ where D is some shape, then:

$$\int_D f \tag{411}$$

is the result of a process that is:

1. Split D into many small pieces

2. Total the products, "piece volume" × "value of function at piece"

3. Take the number of pieces to ∞

4. Add up all of the contributions

Last time, we ended by considering:

$$\iint_D f(x, y)\,dA \tag{412}$$

Where D is the unit square. The idea here was that we would look at the contributions along a row at height y. As we first sum along each row and then sum all rows together, we obtain:

$$\sum_y \left( \sum_x f(x, y)\,\Delta x \right) \Delta y \tag{413}$$

Then, if we take the limit as ∆x and ∆y → 0, we can back out the double integral, namely:

$$\lim_{\Delta y \to 0} \sum_y \left( \lim_{\Delta x \to 0} \sum_x f(x, y)\,\Delta x \right) \Delta y = \iint f(x, y)\,dx\,dy \tag{414}$$

Great, now we can utilize the ideas of single-variable calculus to actually compute the double integral. Let's compute the integral in our case here:

$$\int_0^1 \int_0^1 f(x, y)\,dx\,dy = \int_0^1 \int_0^1 y \sin(\pi y x)\,dx\,dy \tag{415}$$

$$\int_0^1 \left[ -\frac{\cos(\pi y x)}{\pi} \right]_{x=0}^{x=1} dy = \int_0^1 \frac{1 - \cos(\pi y)}{\pi}\,dy \tag{416}$$

$$\int_0^1 \frac{1 - \cos(\pi y)}{\pi}\,dy = \left[ \frac{y - \frac{1}{\pi}\sin(\pi y)}{\pi} \right]_0^1 = \frac{1}{\pi} \tag{417}$$
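The "sum each row, then sum the rows" picture in (413) translates almost word for word into code. The following Python sketch (my own illustration, with a hypothetical helper name and a midpoint rule chosen for simplicity) builds that double sum for our integrand and compares it with the exact answer 1/π.

```python
import math

def double_riemann_sum(f, n):
    """Midpoint Riemann sum of f over the unit square, summing each row and then all rows."""
    dx = dy = 1.0 / n
    total = 0.0
    for j in range(n):              # loop over rows at height y
        y = (j + 0.5) * dy
        row = sum(f((i + 0.5) * dx, y) * dx for i in range(n))  # contribution of one row
        total += row * dy
    return total

approx = double_riemann_sum(lambda x, y: y * math.sin(math.pi * x * y), 400)
print(approx, 1 / math.pi)
```

As n grows, the sum settles onto the value of the double integral, which is the limit in (414) in action.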

Great! However, let's think about more complicated regions. The main difference, in my opinion, between single-variable and multivariable integration is that your regions can be a whole slew of funky shapes. As we move away from squares, we can now try to integrate functions over other shapes, like a triangle in the next example.

Problem: Suppose $f : D \to \mathbb{R}$ is defined by:

$$f(x, y) = x^2 y \tag{418}$$

Where D is the triangle with vertices at (0, 0), (2, 0) and (0, 3). Calculate $\iint_D f\,dA$.

Solution: We have here another integration problem. Now, however, we are going to integrate over a triangle instead of a square. The main difference here is that our region is a little more convoluted since our variables are not necessarily bounded by just constants. We can express this first as sums, by getting the contribution from one row for some fixed y and then summing over all rows. Namely, we have that:

$$\sum_y \left( \int x^2 y\,dx \right) \Delta y \tag{419}$$

From here, we can note that x is bounded between x = 0 and the slanted side of the triangle. This is the main difference here. Why? Because the upper bound is no longer a number: we need an "x =" equation that describes the side of the triangle. The side through (2, 0) and (0, 3) is the line $y = 3 - \frac{3}{2}x$. Solving for x, we obtain that:

$$x = 2 - \frac{2}{3}y \tag{420}$$

We can now put our inner bounds for x into our sum:

$$\sum_y \left( \int_0^{2 - \frac{2}{3}y} x^2 y\,dx \right) \Delta y \tag{421}$$

We can now take the limit as ∆y → 0, which turns the outer sum into a second integral. Given our region, we note that y can vary between y = 0 and y = 3. As such, our bounds for y are just two numbers, and there are no functions of x involved. Let's put that into the problem:

$$\int_0^3 \left( \int_0^{2 - \frac{2}{3}y} x^2 y\,dx \right) dy \tag{422}$$

You could also do the integrals in the opposite order if you would like. Then the inner variable is y, which runs from 0 up to the line $y = 3 - \frac{3}{2}x$, and you would obtain:

$$\int_0^2 \left( \int_0^{3 - \frac{3}{2}x} x^2 y\,dy \right) dx \tag{423}$$
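Since the notes stop at the setup, here is a hedged Python sketch that evaluates both orders of integration numerically so you can see that they agree. The helper `iterated_integral` is a hypothetical name of mine; it uses a midpoint rule and assumes the inner variable always starts at 0, which is true for this triangle. Working the integral by hand gives 6/5, so both numbers should land near 1.2.

```python
def iterated_integral(f, outer, inner_upper, n=400):
    """Midpoint approximation of  int_{outer} int_0^{inner_upper(s)} f(s, t) dt ds."""
    lo, hi = outer
    ds = (hi - lo) / n
    total = 0.0
    for i in range(n):
        s = lo + (i + 0.5) * ds          # fixed value of the outer variable
        top = inner_upper(s)             # inner bound depends on the outer variable
        dt = top / n
        total += sum(f(s, (j + 0.5) * dt) * dt for j in range(n)) * ds
    return total

# dx dy order: outer variable y in [0, 3], inner x in [0, 2 - 2y/3]; integrand x^2 y.
dxdy = iterated_integral(lambda y, x: x**2 * y, (0.0, 3.0), lambda y: 2 - 2 * y / 3)
# dy dx order: outer variable x in [0, 2], inner y in [0, 3 - 3x/2].
dydx = iterated_integral(lambda x, y: x**2 * y, (0.0, 2.0), lambda x: 3 - 3 * x / 2)
print(dxdy, dydx)
```

Notice how the code mirrors the rule about bounds: the outer range is a pair of numbers, while the inner upper bound is a function of the outer variable.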

This is honestly the hardest part of double and even triple integrals... the bounds! I highly encourage you to take the time to draw the region. Then, try to come up with your bounds by utilizing the picture. One thing to note as you are making your way through double and triple integrals is that the outermost integral must only have bounds that are numbers. In addition, the bounds of each inner integral can only be functions of the variables of the integrals outside of it. For example, in our triangle problem x was the variable of the inner integral and y was the variable of the outer integral. As such, it was totally valid that the upper bound on x was a function of the outer variable y. In addition, we are in the clear with our outer integral since its bounds were only numbers, and not functions of y or x.

18.2 Triple Integrals

We are stepping it up a notch now! Suppose that now, $f : D \to \mathbb{R}$, where $D \subset \mathbb{R}^3$. This notation means that D is a subset of $\mathbb{R}^3$. Mathematically, we are now calculating a 4D volume. We get this by letting the region represent the first three dimensions and f represent the fourth dimension. However, this doesn't need to be the interpretation. Instead of trying to wrap our head around 4D volumes in the 4D chess version of life, we can instead interpret f as a density value. Then:

$$\int_D f = M \tag{424}$$

where M can be thought of as a mass, since f is the density and D supplies the volume, so we obtain mass! However, you can also think of this with respect to any densities that you may have come in contact with throughout your physics class. For example, you could interpret f as a volumetric charge density, and as such, you would be calculating the total charge by taking the integral of f. Since f can be a function, this can represent all types of densities! In particular, you don't need to have uniform densities as many of you are probably used to seeing. Let's illuminate this topic with an example.

Problem: Integrate $f(x, y, z) = x^2 + y^2 + z^2$ over $[1, 2]^3$.

Solution: Let's think of the function f(x, y, z) as the density. Therefore, we would expect the point of the cube furthest from the origin to be the densest, given our function f and its interpretation. As such, if we wanted to compute the total mass of the cube that we have defined, we can do so as:

$$M = \iiint f(x, y, z)\,dV = \int_1^2 \int_1^2 \int_1^2 \left(x^2 + y^2 + z^2\right) dx\,dy\,dz \tag{425}$$

Before we hop into this integral, let's do a sanity check and find a number for the mass that is definitely larger than the actual mass. Well, the density at the top back corner (2, 2, 2) is 12. This is the largest the density can ever be. Therefore, since we are working with a cube of side length 1 and we have a max density of 12, the mass of the cube is no more than 12. This is a mental upper bound we can put on this. In 3D, we can carry over the same ideas that guided us in double integrals and apply them here. We can take a tiny slice of our cube. Given a particular constant-z slice, we now have a double integral. We can now use the methodology from our double integrals: first look at a particular constant-y row and compute the contribution along this row, then sum over all rows, and finally sum over all z slices to get the final result. As such, it is like we are just adding a dimension and calculating another subproblem! We obtain:

$$M = \iiint f(x, y, z)\,dV = \int_1^2 \int_1^2 \int_1^2 \left(x^2 + y^2 + z^2\right) dx\,dy\,dz \tag{426}$$

$$M = \int_1^2 \int_1^2 \left[ \frac{x^3}{3} + y^2 x + z^2 x \right]_{x=1}^{x=2} dy\,dz \tag{427}$$

$$M = \int_1^2 \int_1^2 \left( \frac{7}{3} + y^2 + z^2 \right) dy\,dz \tag{428}$$

$$M = \int_1^2 \left[ \frac{7}{3}y + \frac{y^3}{3} + z^2 y \right]_{y=1}^{y=2} dz \tag{429}$$

$$M = \int_1^2 \left( \frac{14}{3} + z^2 \right) dz \tag{430}$$

$$M = \left[ \frac{14}{3}z + \frac{z^3}{3} \right]_1^2 = \frac{14}{3} + \frac{7}{3} = 7 \tag{431}$$

This is comfortably below our upper bound of 12. Since the actual integral was not evaluated in class, I have provided it here in case you wanted to see it. Let's continue with a last example related to finding the volume of a tetrahedron.

Problem: Find the volume utilizing a triple integral. Here is an image of the shape

provided.

Figure 8: Tetrahedron Volume Problem

Solution: Let's define a function f(x, y, z) = 1. Then, the volume equals the mass. Let's start off by looking at slices. We can first note that 0 ≤ z ≤ 4. Next, we make some constant-y slices. Now our bounds on the integral get a bit more challenging since these slices will depend on z. As such we can say that $0 \le y \le 3 - \frac{3}{4}z$. Finally, we have the x slices. For the x bounds, we note that the upper bound on x is a function of both y and z since we already have them fixed in space. We can derive the bounds for x as $0 \le x \le 2 - \frac{2}{3}y - \frac{1}{2}z$, which is the slanted face $\frac{x}{2} + \frac{y}{3} + \frac{z}{4} = 1$ solved for x. As such, we can set up our triple integral as:

$$M = \int_0^4 \int_0^{3 - \frac{3}{4}z} \int_0^{2 - \frac{2}{3}y - \frac{1}{2}z} f(x, y, z)\,dx\,dy\,dz \tag{432}$$

$$M = V = \int_0^4 \int_0^{3 - \frac{3}{4}z} \int_0^{2 - \frac{2}{3}y - \frac{1}{2}z} 1\,dx\,dy\,dz = 4 \tag{433}$$

Where we took advantage of the fact that when f(x, y, z) = 1 the mass and the volume are equivalent.
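To see the nested bounds at work, here is a short Python sketch (my own check, using a midpoint rule and a hypothetical function name) that adds up the volume slice by slice with exactly the bounds from (432). The innermost integral of 1 with respect to x is just the length of the x interval, so the code uses that length directly.

```python
def tetrahedron_volume(n=200):
    """Midpoint sum of 1 dx dy dz using the bounds from (432)."""
    total = 0.0
    dz = 4.0 / n
    for k in range(n):
        z = (k + 0.5) * dz
        y_top = 3 - 0.75 * z                      # y bound depends on z
        dy = y_top / n
        for j in range(n):
            y = (j + 0.5) * dy
            x_top = 2 - (2 / 3) * y - 0.5 * z     # x bound depends on y and z
            total += x_top * dy * dz              # inner integral of 1 dx is x_top
    return total

print(tetrahedron_volume())
```

The printed value should be very close to 4, which also matches the formula one-sixth times the product of the three edge lengths 2, 3 and 4 along the axes.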

18.3 Integration in Other Coordinate Systems

Earlier in the course, we did quite briefly go over alternative coordinate systems like cylindrical and spherical coordinates. Luckily, we introduced them because they will come in handy when integrating over regions that are circular or even spherical in nature. We can think back to polar coordinates at the back end of a single-variable calculus course. For example, imagine integrating f(x, y) = x + y over the region enclosed by a cardioid.15 To quickly put them in my notes in case you want to know, we can express the following double and triple integrals in the different coordinate systems, over some region D, where $D \subset \mathbb{R}^2$ for polar and $D \subset \mathbb{R}^3$ for cylindrical and spherical coordinates.

15 Cardioid is the official name of the heart-shaped graph that we see in polar graphs.

$$\iint_D f(x, y)\,dA = \iint_D f(x, y)\,dx\,dy = \iint_D f(r, \theta)\,r\,dr\,d\theta \tag{434}$$

which represents the polar coordinate double integral, where we pick up this factor of r in addition to our infinitesimal pieces. In addition, in cylindrical coordinates, we have that:

$$\iiint_D f(x, y, z)\,dV = \iiint_D f(x, y, z)\,dx\,dy\,dz = \iiint_D f(r, \theta, z)\,r\,dr\,d\theta\,dz \tag{435}$$

And finally in spherical coordinates we obtain that:

$$\iiint_D f(x, y, z)\,dV = \iiint_D f(x, y, z)\,dx\,dy\,dz = \iiint_D f(\rho, \theta, \phi)\,\rho^2 \sin\phi\,d\rho\,d\theta\,d\phi \tag{436}$$

Where, just as a reminder, the conversions between Cartesian and polar, cylindrical, and spherical coordinates respectively are:

18.3.1 Polar Coordinates

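$$r^2 = x^2 + y^2 \tag{437}$$

$$\theta = \arctan\frac{y}{x} \tag{438}$$

$$x = r \cos\theta \tag{439}$$

$$y = r \sin\theta \tag{440}$$

18.3.2 Cylindrical Coordinates

$$r^2 = x^2 + y^2 \tag{441}$$

$$\theta = \arctan\frac{y}{x} \tag{442}$$

$$z = z \tag{443}$$

$$x = r \cos\theta \tag{444}$$

$$y = r \sin\theta \tag{445}$$

$$z = z \tag{446}$$

18.3.3 Spherical Coordinates

$$\rho^2 = x^2 + y^2 + z^2 \tag{447}$$

$$\phi = \arccos\frac{z}{\sqrt{x^2 + y^2 + z^2}} \tag{448}$$

$$\theta = \arctan\frac{y}{x} \tag{449}$$

And now, in the other direction:

$$x = \rho \sin\phi \cos\theta \tag{450}$$

$$y = \rho \sin\phi \sin\theta \tag{451}$$

$$z = \rho \cos\phi \tag{452}$$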

19 Recitation VII on July 26, 2019

I think that for integration, the best thing that you can do is more and more practice. That being said, let me add to this set of recitation notes, as well as the next, examples that have to do with integration. Note that these will be familiar from the worksheets.

Problem: The density of a unit cube of mass in the first octant is given by:

$$\rho(x, y, z) = xyz \tag{453}$$

Find the total mass of the cube.

Solution: The mass of the cube is given by:

$$M = \int_0^1 \int_0^1 \int_0^1 xyz\,dx\,dy\,dz = \frac{1}{8} \tag{454}$$

What we have here is a triple integral with all numerical bounds. However, we may not always be as lucky, and we may have to compute integrals that have more complex bounds. Namely, we will come in contact with integrals whose bounds are functions of other variables. Let's explore an example that illuminates this.

Problem: An application of the average value of a function is center of mass. We can define the center of mass of an object by the following equation:

$$(x_{COM}, y_{COM}) = \left( \frac{\iint_R x\,dA}{\iint_R 1\,dA},\ \frac{\iint_R y\,dA}{\iint_R 1\,dA} \right) \tag{455}$$

Where we are essentially calculating the average value of the x and y coordinates over the region, applying the average-value formula. As such, calculate the center of mass of a right isosceles triangle with vertices at (0, 0), (1, 0), and (1, 1).

Solution: We can start by calculating the area of the triangle that we are working with, which is the denominator of both terms. We could compute the area utilizing a double integral, which I will show, but we could also just get the area by drawing the region and using that the area of a triangle is $A = \frac{1}{2}Bh = \frac{1}{2}$ in our case. Let's show this using the double integral method.

$$\iint_R dA = \int_0^1 \int_0^x dy\,dx = \int_0^1 [y]_0^x\,dx = \int_0^1 x\,dx = \left[ \frac{x^2}{2} \right]_0^1 = \frac{1}{2} \tag{456}$$

Now we can compute, utilizing the same bounds for the integrals, both $\iint_R x\,dA$ and $\iint_R y\,dA$:

$$\iint_R x\,dA = \int_0^1 \int_0^x x\,dy\,dx = \int_0^1 [xy]_0^x\,dx = \int_0^1 x^2\,dx = \frac{1}{3} \tag{457}$$

$$\iint_R y\,dA = \int_0^1 \int_0^x y\,dy\,dx = \int_0^1 \left[ \frac{y^2}{2} \right]_0^x dx = \int_0^1 \frac{x^2}{2}\,dx = \frac{1}{6} \tag{458}$$

Then, we can use our center of mass formula to obtain that:

$$(x_{COM}, y_{COM}) = \left( \frac{\iint_R x\,dA}{\iint_R 1\,dA},\ \frac{\iint_R y\,dA}{\iint_R 1\,dA} \right) = \left( \frac{1/3}{1/2},\ \frac{1/6}{1/2} \right) = \left( \frac{2}{3},\ \frac{1}{3} \right) \tag{459}$$
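The three double integrals above all share the same bounds, so a single helper can handle them. This Python sketch is my own illustration (the function name and the midpoint rule are assumptions, not something from the worksheet); it recomputes the area and the center of mass numerically.

```python
def integrate_over_triangle(f, n=400):
    """Midpoint sum of f over 0 <= x <= 1, 0 <= y <= x (the bounds used in (456))."""
    dx = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        dy = x / n                      # the y range [0, x] depends on x
        total += sum(f(x, (j + 0.5) * dy) * dy for j in range(n)) * dx
    return total

area = integrate_over_triangle(lambda x, y: 1.0)
x_com = integrate_over_triangle(lambda x, y: x) / area
y_com = integrate_over_triangle(lambda x, y: y) / area
print(area, x_com, y_com)
```

The output should be close to 1/2, 2/3 and 1/3. As a geometric cross-check, the centroid of a triangle is the average of its vertices, and averaging (0, 0), (1, 0) and (1, 1) gives the same point.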

This is a great application of utilizing double integrals to gain insight on other meaningful quantities found throughout the course. In this case, we have bounds that are actually functions of x. Namely, given our triangular region, we note that if we take an x = c slice for some constant c, the height that we move up from y = 0 depends on x. As such, our upper bound on the y-variable also depends on x! Let's trudge forth with some more examples. We do not need to limit ourselves to just working with Cartesian coordinates, and we can take our first steps into polar coordinates through the following two examples, which I will first work out and then explain the methodology behind.

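Problem: Compute:

$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \left( \frac{1}{\pi} e^{-(x^2 + y^2)} \right) dx\,dy \tag{460}$$

by converting to polar coordinates and integrating with respect to r and θ.

Solution: Since we're integrating over the entire xy-plane, we integrate over every possible r and θ, which means that our integral becomes:

$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \left( \frac{1}{\pi} e^{-(x^2 + y^2)} \right) dx\,dy = \int_0^{2\pi} \int_0^{\infty} \frac{e^{-r^2}}{\pi}\,r\,dr\,d\theta \tag{461}$$

Substituting $u = r^2$, so that $r\,dr = \frac{1}{2}du$, we get:

$$= 2\pi \int_0^{\infty} e^{-u}\,\frac{1}{2\pi}\,du = \left[ -e^{-u} \right]_0^{\infty} = 1 \tag{462}$$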

Problem: The surface area of a function, f(x, y), over some region R, is defined as:

$$S = \iint_R \sqrt{f_x^2 + f_y^2 + 1}\,dA \tag{463}$$

Compute the surface area of the function $f(x, y) = \frac{1}{2}x^2 + \frac{1}{2}y^2$ over the unit disk.

Solution: Let's first take the partial derivatives that we will have to utilize in our surface area formula:

$$f_x = x \tag{464}$$

$$f_y = y \tag{465}$$

Plugging this into our formula, we obtain that:

$$S = \iint_R \sqrt{x^2 + y^2 + 1}\,dA \tag{466}$$

Where R is the unit disk. At this point, the integral smells a lot like a conversion to polar coordinates. As such, let us make the conversion now, noting that the unit disk has bounds of 0 ≤ r ≤ 1 and 0 ≤ θ ≤ 2π. We obtain that:

$$S = \int_0^{2\pi} \int_0^1 \sqrt{r^2 + 1}\,r\,dr\,d\theta \tag{467}$$

We can now solve this integral using u-substitution. Letting $u = r^2 + 1$, we obtain that:

$$\frac{du}{2} = r\,dr \tag{468}$$

Plugging this into our expression, noting that the bounds we have on u are 1 ≤ u ≤ 2, we have that:

$$S = \frac{1}{2} \int_0^{2\pi} \int_1^2 \sqrt{u}\,du\,d\theta = \frac{1}{2} \int_0^{2\pi} \left[ \frac{2}{3} u^{3/2} \right]_1^2 d\theta = \frac{1}{3} \int_0^{2\pi} \left( 2^{3/2} - 1 \right) d\theta = \frac{2\pi}{3} \left( 2^{3/2} - 1 \right) \tag{469}$$

The major piece of advice to get out of this small collection of exercises is that it can be very beneficial to spend a decent amount of time thinking about which coordinate system works best for solving the problem and then what exactly the bounds are. The tricky part with integrals is knowing what to put on your bounds. If we can master knowing what, in each coordinate system, bounding θ for example looks like graphically, we will be able to cruise through this section. The next lecture will focus on moving into even more complex, and even custom, coordinate systems that take the same ideas and apply them forward. The motivation behind this is that in many cases, simply sticking to Cartesian and polar won't work, either because we need one more dimension than polar gives us, or because we have sphere-like objects that are quite difficult to work with in Cartesian-like systems.

20 Lecture XII on July 29, 2019

20.1 Integration in Spherical and Cylindrical Coordinates

Last time, we ended off at polar coordinates! Today we are going to move to 3D space in order to integrate functions over regions that are volumes. This is really exciting, and you will need to master this. It will show up countless times throughout the rest of the course as well as the ASE if you plan on taking that. As such, I would encourage you to complete the homework problems that were unassigned as they are great practice for both. Suppose we have the following region:

Figure 9: Spherical Region

And we want to integrate some function f(x, y, z) over this region. This entails that we are going to need to figure out our bounds for this spherical-like region. The keyword here is spherical. Regions that appear spherical are going to be easier to integrate over in spherical coordinates! The same goes for cylindrical shapes and cylindrical coordinates. That being said, let us introduce a general form of the integral that we want to look at, calling this spherical-like region R:

$$\iiint_R f(x, y, z)\,dV \tag{470}$$

Which, given our conversions found in section 18.3, we can write as:

$$\iiint_R f(\rho, \theta, \phi)\,\rho^2 \sin\phi\,d\rho\,d\phi\,d\theta \tag{471}$$

Where for our specific shape in figure 9, it appears that the following bounds successfully describe the picture:

$$\int_{\pi/3}^{\pi/2} \int_0^{\pi/2} \int_0^1 f(\rho, \theta, \phi)\,\rho^2 \sin\phi\,d\rho\,d\phi\,d\theta \tag{472}$$

Notice that we get a $\rho^2 \sin\phi$ when we use spherical coordinates. We can justify this in multiple ways. What we should think is that this factor is a correction factor since we are now using a funkier coordinate system! A quick justification is the following. Consider just the $d\rho\,d\phi\,d\theta$. If this were to represent dV we would have a problem! Why? Well, volume should have units of length cubed from a purely dimensional analysis point of view. However, looking at this term, only dρ carries a length (angles are dimensionless), so we only have length instead of length cubed. As such, we need to incorporate a length squared somewhere, which is part of the reason why we include the $\rho^2 \sin\phi$ in the expression. The exact form of this comes from a Jacobian that we will discuss later. You can think of it as measuring the distortion of a tiny unit square as we move towards a curvy coordinate system such as spherical coordinates. I think the Jacobian shows this nicely, and I promise we will do this during recitation once we cover this particular topic, which I will then add back into the notes right here! Remind me if I forget please. Let's move on to an example of this now:

Problem: Consider a solid with density $\delta(x, y, z) = (x^2 + y^2 + z^2)^{-1}$ and which occupies the cone:

Figure 10: Cone

Find its mass.

Solution: We have some flexibility for solving this. For example, in cylindrical coordinates, we have z = r to describe the equation of the cone. The caveat here, however, is not the region but the density. If we look at our density function, we do not have the nicest looking thing! In this case the density is barking spherical coordinates and the shape is barking cylindrical coordinates. Let's trudge forth on this one with spherical coordinates. We are going to use spherical coordinates due to δ. As such, we can represent our mass, M, as:

$$M = \iiint \delta\,dV \tag{473}$$

$$M = \int_0^{2\pi} \int_0^{\pi/4} \int_0^{\sec\phi} \delta(\rho, \theta, \phi)\,\rho^2 \sin\phi\,d\rho\,d\phi\,d\theta \tag{474}$$

To put this in words, we sweep out all of the possible θ values since our shape is circular in nature; some people may call it axisymmetric (not important, but you may see the word flashed around). Next, we have that φ can at most be $\frac{\pi}{4}$. We have this because the lateral sides of the cone are where z = r, and since the two are exactly equal to each other there, $\tan\phi = \frac{r}{z} = 1$ and $\frac{\pi}{4}$ is the largest φ can be. In addition, we sweep out all the other φ values that are less than this since the shape is filled. We finally get the ρ slices. We can start off by saying that ρ must be greater than or equal to zero to get the lower bound. Then, for the upper bound, we use the flat top of the cone at z = 1 together with the definition of φ to obtain that:

$$\cos\phi = \frac{z}{\rho} = \frac{1}{\rho} \tag{475}$$

Leading to the fact that:

$$\rho = \frac{1}{\cos\phi} = \sec\phi \tag{476}$$

As such, converting our density to spherical coordinates to obtain $\delta = \frac{1}{\rho^2}$, we can represent our mass as:

$$M = \int_0^{2\pi} \int_0^{\pi/4} \int_0^{\sec\phi} \frac{1}{\rho^2}\,\rho^2 \sin\phi\,d\rho\,d\phi\,d\theta = \int_0^{2\pi} \int_0^{\pi/4} \int_0^{\sec\phi} \sin\phi\,d\rho\,d\phi\,d\theta = \pi \ln 2 \tag{477}$$
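The last step in (477) was skipped, so here is a hedged numerical check in Python. After the ρ² from the Jacobian cancels the density, the ρ integral of sin φ from 0 to sec φ is just sin φ · sec φ = tan φ, and the θ integral contributes a factor of 2π. The function name and the midpoint rule are my own choices for this sketch.

```python
import math

def cone_mass(n=2000):
    """Mass of the cone via (477): integrate sin(phi) * sec(phi) over phi, then multiply by 2*pi."""
    dphi = (math.pi / 4) / n
    phi_sum = 0.0
    for i in range(n):
        phi = (i + 0.5) * dphi
        rho_max = 1 / math.cos(phi)            # upper rho bound from (476)
        phi_sum += math.sin(phi) * rho_max * dphi
    return 2 * math.pi * phi_sum               # theta integral contributes 2*pi

print(cone_mass(), math.pi * math.log(2))
```

By hand, the integral of tan φ from 0 to π/4 is ln √2 = ½ ln 2, and multiplying by 2π gives π ln 2, matching the printed numbers.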

We just focused on spherical coordinates! Now, let's have a quick discussion on cylindrical coordinates. Since cylindrical coordinates are really just polar coordinates with the addition of the z-coordinate from Cartesian coordinates, they are not a large step away from polar coordinates. In fact, we can express the conversion to cylindrical coordinates as:

$$\iiint_D f(r, \theta, z)\,r\,dr\,d\theta\,dz \tag{478}$$

over some region D. Following our discussion of dimensional analysis, let's do a quick check that the units of our volume element are length cubed. We pick up a length squared from dr dz, noting that dθ does not have any length elements. Then, the additional r sitting out front of the infinitesimals gives us a volume element with units of length cubed.

20.2 Custom Coordinate Systems

This section is really cool! Consider a region bounded by xy = 1, xy = 3, $y = \frac{1}{2}x$ and y = 2x. This region is really funky! There is not really a nice way for us to use either cylindrical, spherical, or even Cartesian coordinates. That being said, we can introduce a custom coordinate system. If you notice, it appears that we are taking y = cx cuts for $c \in [\frac{1}{2}, 2]$. In addition, we are also taking slices along xy = d for d ∈ [1, 3]. This is looking and smelling quite familiar at this point! It seems that we are taking some new variables, say $u = \frac{y}{x}$ and v = xy, and giving them bounds! Namely, given the way I have defined these new variables, u is bounded between $\frac{1}{2}$ and 2, while v is bounded between 1 and 3. Therefore, we can integrate a function, say $y^2$, over this region as:

$$\iint_D y^2\,dA = \int_1^3 \int_{1/2}^2 \frac{y}{x}\,yx\,J\,du\,dv = \int_1^3 \int_{1/2}^2 uv\,J\,du\,dv \tag{479}$$

Great, this looks really nice! All we have here is some function with bounds that are numbers, not even functions! The tricky part is that J sitting right out front of du dv. That J is called the Jacobian. For polar and cylindrical coordinates, we say that J = r, and for spherical coordinates we say that $J = \rho^2 \sin\phi$. Now, let's get a formula for the general Jacobian that we pick up by transforming to some custom coordinates. The theorem capturing all the formality of this section is:

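Theorem: Suppose that $T : \mathbb{R}^2 \to \mathbb{R}^2$ is a differentiable transformation, (x, y) = T(u, v), that maps a region R one-to-one onto a region D. Then, for any continuous function f, we have that:

$$\iint_D f(x, y)\,dx\,dy = \iint_R f(T(u, v)) \left| \frac{\partial(x, y)}{\partial(u, v)} \right| du\,dv \tag{480}$$

Where,

$$J = \left| \frac{\partial(x, y)}{\partial(u, v)} \right| \tag{481}$$

It should be noted that it is often more useful to work with $J^{-1}$, since we generally write our variables u and v as functions of x and y. As such, we can define $J^{-1}$ as:

$$J^{-1} = \frac{1}{J} = \left| \frac{\partial(u, v)}{\partial(x, y)} \right| \tag{482}$$

For all of these, note that we are taking the absolute value of the Jacobian determinant. We do this so that changing the orientation of our area doesn't have any effect, since we only care about the area distortion, not the orientation distortion. We define:

$$J = \left| \frac{\partial(x, y)}{\partial(u, v)} \right| = \left| \det \begin{pmatrix} \frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} \\ \frac{\partial y}{\partial u} & \frac{\partial y}{\partial v} \end{pmatrix} \right| \tag{483}$$

Similarly, we can define $J^{-1}$ as:

$$J^{-1} = \frac{1}{J} = \left| \frac{\partial(u, v)}{\partial(x, y)} \right| = \left| \det \begin{pmatrix} \frac{\partial u}{\partial x} & \frac{\partial u}{\partial y} \\ \frac{\partial v}{\partial x} & \frac{\partial v}{\partial y} \end{pmatrix} \right| \tag{484}$$

As such, for our case, since we have written u and v as functions of x and y, we should solve for $J^{-1}$ and then take the reciprocal in order to calculate J. Let's do this now for our example where $u = \frac{y}{x}$ and v = xy:

$$J^{-1} = \frac{1}{J} = \left| \frac{\partial(u, v)}{\partial(x, y)} \right| = \left| \det \begin{pmatrix} -\frac{y}{x^2} & \frac{1}{x} \\ y & x \end{pmatrix} \right| = \left| -\frac{2y}{x} \right| = 2u \tag{485}$$

Therefore, we can say that:

$$J^{-1} = \frac{1}{J} = 2u \tag{486}$$

$$J = \frac{1}{2u} \tag{487}$$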

Let's now actually plug this into our example problem that we encountered what seems like a page and a half ago!

$$\iint_D y^2\,dA = \int_1^3 \int_{1/2}^2 \frac{y}{x}\,yx\,J\,du\,dv = \int_1^3 \int_{1/2}^2 uv\,J\,du\,dv \tag{488}$$

$$\iint_D y^2\,dA = \int_1^3 \int_{1/2}^2 uv\,\frac{1}{2u}\,du\,dv = \int_1^3 \int_{1/2}^2 \frac{v}{2}\,du\,dv = 3 \tag{489}$$
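One way to build trust in the Jacobian is to estimate it numerically straight from the definition. The sketch below is my own illustration and rests on one assumption I am adding: in the first quadrant, u = y/x and v = xy can be inverted as x = √(v/u) and y = √(uv). It then approximates the partial derivatives with central differences and integrates y²·J over the rectangle in u and v.

```python
import math

def xy_from_uv(u, v):
    """Invert u = y/x, v = x*y in the first quadrant: x = sqrt(v/u), y = sqrt(u*v)."""
    return math.sqrt(v / u), math.sqrt(u * v)

def jacobian_numeric(u, v, h=1e-6):
    """|d(x, y)/d(u, v)| estimated with central differences."""
    xu = (xy_from_uv(u + h, v)[0] - xy_from_uv(u - h, v)[0]) / (2 * h)
    xv = (xy_from_uv(u, v + h)[0] - xy_from_uv(u, v - h)[0]) / (2 * h)
    yu = (xy_from_uv(u + h, v)[1] - xy_from_uv(u - h, v)[1]) / (2 * h)
    yv = (xy_from_uv(u, v + h)[1] - xy_from_uv(u, v - h)[1]) / (2 * h)
    return abs(xu * yv - xv * yu)

def integral_y_squared(n=200):
    """Midpoint sum of y^2 * J over u in [1/2, 2], v in [1, 3]."""
    du, dv = 1.5 / n, 2.0 / n
    total = 0.0
    for i in range(n):
        u = 0.5 + (i + 0.5) * du
        for j in range(n):
            v = 1.0 + (j + 0.5) * dv
            _, y = xy_from_uv(u, v)
            total += y**2 * jacobian_numeric(u, v) * du * dv
    return total

print(jacobian_numeric(1.3, 2.1), 1 / (2 * 1.3))   # numeric J versus 1/(2u)
print(integral_y_squared())                        # should be close to 3
```

The numeric Jacobian agrees with 1/(2u) from (487), and the integral comes out at 3, without ever using the shortcut through the inverse Jacobian.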

The point here, as is the moral of this entire chapter, is that the trouble is not in the actual integral. The problem is in setting the bounds for double and triple integrals. Here, the challenge is finding a u and v that give a nice custom coordinate system to transfer over to in order to best integrate over some region! Once we choose these u and v variables, which will be functions of both x and y, we can compute the Jacobian, which accounts for the area distortion.

20.3 Applications of Double and Triple Integrals (ASE)

There are two quantities that will most likely be referenced by name on the ASE relating to double and triple integrals. I will put them here for reference, and we have done problems in workshop that relate to them, so be sure to check them out for additional practice.

20.3.1 Average Value of a Function

The average value of a function over some region D is:

$$\text{Avg}(f) = \frac{\iint_D f(x, y)\,dA}{\iint_D 1\,dA} \tag{490}$$

In addition, we can also compute the average value over some 3D region to get the analog:

$$\text{Avg}(f) = \frac{\iiint_D f(x, y, z)\,dV}{\iiint_D 1\,dV} \tag{491}$$

Some common things that pop up throughout the 18.02 course are things such as the average value of x or y over the unit disk. By symmetry, both of these are exactly zero. However, if it helps, I would definitely go on to compute both of these utilizing the formula given above. Very closely tied to this, we can compute the center of mass of an object.

20.3.2 Center of Mass

Given a density function δ(x, y), the center of mass of an object that occupies some region D is:

$$(x_{COM}, y_{COM}) = \left( \frac{\iint_D x\,\delta(x, y)\,dA}{\iint_D \delta(x, y)\,dA},\ \frac{\iint_D y\,\delta(x, y)\,dA}{\iint_D \delta(x, y)\,dA} \right) \tag{492}$$

Interpret this as finding the average value of x and y respectively, weighted by the mass of the object. In many cases, δ(x, y) = 1, and we do not even need to worry about this. In this case, we are literally just taking the average value of x and y over the region D. We have the 3D analog of this as well:

$$(x_{COM}, y_{COM}, z_{COM}) = \left( \frac{\iiint_D x\,\delta(x, y, z)\,dV}{\iiint_D \delta(x, y, z)\,dV},\ \frac{\iiint_D y\,\delta(x, y, z)\,dV}{\iiint_D \delta(x, y, z)\,dV},\ \frac{\iiint_D z\,\delta(x, y, z)\,dV}{\iiint_D \delta(x, y, z)\,dV} \right) \tag{493}$$

21 Recitation VIII on July 30, 2019

In the spirit of the last two recitations that I typed up notes for, let me add some problems from the worksheets that I think are useful to see as part of the lecture notes! I will say that the overarching idea of what we went over in class is that we can use more creative coordinate systems in order to compute double or triple integrals. Fortunately, we are not just limited to the coordinate grid, but we have the freedom to branch out and use spherical and cylindrical coordinates, and even custom coordinates with a Jacobian, in order to compute. I will put some examples here that I liked from the worksheet :)

Problem: Back in the day, Archimedes (without any knowledge of calculus) calculated both the surface area and volume of two cylinders whose axes intersect at right angles. This is known as a "Groin Vault". Given the infinite cylinder $x^2 + y^2 = 1$ and the infinite cylinder $x^2 + z^2 = 1$, calculate the volume that is encompassed in the intersection of the two cylinders.

Solution: We can try to find bounds for all three of our variables. We can first note that −1 ≤ x ≤ 1. Since this is the case, we have that $-\sqrt{1 - x^2} \le y \le \sqrt{1 - x^2}$ as well as $-\sqrt{1 - x^2} \le z \le \sqrt{1 - x^2}$. As such, we can set up our triple integral in order to compute the volume of the groin vault.

$$\iiint dV = \int_{-1}^{1} \int_{-\sqrt{1 - x^2}}^{\sqrt{1 - x^2}} \int_{-\sqrt{1 - x^2}}^{\sqrt{1 - x^2}} 1\,dz\,dy\,dx \tag{494}$$

$$= \int_{-1}^{1} \int_{-\sqrt{1 - x^2}}^{\sqrt{1 - x^2}} [z]_{-\sqrt{1 - x^2}}^{\sqrt{1 - x^2}}\,dy\,dx = \int_{-1}^{1} \int_{-\sqrt{1 - x^2}}^{\sqrt{1 - x^2}} 2\sqrt{1 - x^2}\,dy\,dx \tag{495}$$

$$= \int_{-1}^{1} 2\sqrt{1 - x^2}\,[y]_{-\sqrt{1 - x^2}}^{\sqrt{1 - x^2}}\,dx = \int_{-1}^{1} \left( 2\sqrt{1 - x^2} \right)^2 dx \tag{496}$$

$$\iiint dV = \int_{-1}^{1} 4(1 - x^2)\,dx = \left[ 4x - \frac{4x^3}{3} \right]_{-1}^{1} = 2\left( 4 - \frac{4}{3} \right) = \frac{16}{3} \tag{497}$$
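Equation (496) has a nice geometric reading: at each fixed x, the cross-section of the solid is a square with side 2√(1 − x²). The Python sketch below (my own illustration, with a hypothetical function name and a midpoint rule) simply stacks up those square slices.

```python
import math

def bicylinder_volume(n=1000):
    """Midpoint sum of the cross-sectional area (2*sqrt(1 - x^2))^2 for x in [-1, 1]; see (496)."""
    dx = 2.0 / n
    total = 0.0
    for i in range(n):
        x = -1 + (i + 0.5) * dx
        side = 2 * math.sqrt(1 - x * x)   # the slice at fixed x is a square of this side
        total += side * side * dx
    return total

print(bicylinder_volume(), 16 / 3)
```

Because every slice is a square rather than a disk, no factor of π ever enters the sum.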

Surprisingly, there is no π in the answer, which comes as a shock to many since the integral is essentially barking to utilize some form of cylindrical coordinates.

Problem: Evaluate $\iiint 16z\,dV$ over the region E, where E is the upper half of the unit ball $x^2 + y^2 + z^2 \le 1$.

Solution: Converting everything to spherical coordinates, our integral becomes:

$$\iiint 16z\,dV = \int_0^{2\pi} \int_0^{\pi/2} \int_0^1 16\rho \cos\phi\,(\rho^2 \sin\phi)\,d\rho\,d\phi\,d\theta$$

$$= 2\pi \int_0^{\pi/2} 2 \sin 2\phi\,d\phi$$

$$= 2\pi \left[ -\cos 2\phi \right]_0^{\pi/2} = 4\pi$$

Problem: Evaluate $\iiint \sqrt{3x^2 + 3y^2}\,dV$ for the solid bounded by $z = 2x^2 + 2y^2$ and the plane z = 8.

Solution: We note that the intersection of $z = 2x^2 + 2y^2$ and z = 8 is a circle satisfying the equation

$$4 = x^2 + y^2. \tag{498}$$

This means that we will be integrating over a circular region. For that reason, switching over to cylindrical coordinates is a good idea. In cylindrical coordinates, our two functions bounding z below and above become

$$z = 2r^2 \quad \text{and} \quad z = 8. \tag{499}$$

We are interested in a circular intersection of radius 2, so θ ranges from 0 to 2π and r ranges from 0 to 2. Thus, the triple integral over this 3D domain D is given by

$$\iiint_D \sqrt{3x^2 + 3y^2}\,dV = \iiint_D \sqrt{3r^2}\,r\,dz\,dr\,d\theta \tag{500}$$

$$= \int_0^{2\pi} \int_0^2 \int_{2r^2}^{8} r^2 \sqrt{3}\,dz\,dr\,d\theta \tag{501}$$

At this point, we evaluate

$$= \sqrt{3} \int_0^{2\pi} \int_0^2 r^2 (8 - 2r^2)\,dr\,d\theta \tag{502}$$

$$= \sqrt{3} \int_0^{2\pi} \left( \frac{8(2)^3}{3} - \frac{2(2)^5}{5} \right) d\theta \tag{503}$$

$$= 2\pi\sqrt{3} \left( \frac{64}{3} - \frac{64}{5} \right) \tag{504}$$

$$= 128\pi\sqrt{3} \left( \frac{1}{3} - \frac{1}{5} \right). \tag{505}$$

Problem: Evaluate $\iint_R \left( x^2 + 2xy + y^2 \right) dA$ where R is the region bounded by the lines x + y = 2, x + y = 4, y − x = 1 and y − x = −1.

Solution: We can start by defining custom coordinates, u and v, as follows:

$$u = x + y \tag{506}$$

$$v = y - x \tag{507}$$

Where u ∈ [2, 4] and v ∈ [−1, 1]. We can compute the inverse Jacobian:

$$J^{-1} = \frac{1}{J} = \left| \frac{\partial(u, v)}{\partial(x, y)} \right| = \left| \det \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix} \right| = 2 \tag{508}$$

Therefore, we have that the Jacobian of this transformation is $J = \frac{1}{2}$. We can implement the transformation and compute the integral over u and v:

$$\int_2^4 \int_{-1}^{1} (x^2 + 2xy + y^2)\,\frac{1}{2}\,dv\,du = \int_2^4 \int_{-1}^{1} (x + y)^2\,\frac{1}{2}\,dv\,du = \frac{1}{2} \int_2^4 \int_{-1}^{1} u^2\,dv\,du = \int_2^4 u^2\,du = \frac{56}{3} \tag{509}$$

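Problem: Verify that $dV = \rho^2\sin\phi\,d\rho\,d\phi\,d\theta$.

Solution: We can compute a 3-dimensional Jacobian as follows:

\[ J = \left|\frac{\partial(x, y, z)}{\partial(\rho, \phi, \theta)}\right| \tag{510} \]

Let's now write the expressions that convert between Cartesian and spherical coordinates:

\begin{align}
x &= \rho\sin\phi\cos\theta \tag{511} \\
y &= \rho\sin\phi\sin\theta \tag{512} \\
z &= \rho\cos\phi \tag{513}
\end{align}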

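The Jacobian can be expressed as the absolute value of a determinant:

\[
J = \left|\det\begin{pmatrix}
\frac{\partial x}{\partial\rho} & \frac{\partial x}{\partial\phi} & \frac{\partial x}{\partial\theta} \\
\frac{\partial y}{\partial\rho} & \frac{\partial y}{\partial\phi} & \frac{\partial y}{\partial\theta} \\
\frac{\partial z}{\partial\rho} & \frac{\partial z}{\partial\phi} & \frac{\partial z}{\partial\theta}
\end{pmatrix}\right|
= \left|\det\begin{pmatrix}
\sin\phi\cos\theta & \rho\cos\phi\cos\theta & -\rho\sin\phi\sin\theta \\
\sin\phi\sin\theta & \rho\cos\phi\sin\theta & \rho\sin\phi\cos\theta \\
\cos\phi & -\rho\sin\phi & 0
\end{pmatrix}\right| \tag{514}
\]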

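Let's now compute the determinant of this matrix by expanding along the first row:

\begin{align}
J = \Big|\,&\sin\phi\cos\theta\,\det\begin{pmatrix} \rho\cos\phi\sin\theta & \rho\sin\phi\cos\theta \\ -\rho\sin\phi & 0 \end{pmatrix} \tag{515} \\
&- \rho\cos\phi\cos\theta\,\det\begin{pmatrix} \sin\phi\sin\theta & \rho\sin\phi\cos\theta \\ \cos\phi & 0 \end{pmatrix} \tag{516} \\
&- \rho\sin\phi\sin\theta\,\det\begin{pmatrix} \sin\phi\sin\theta & \rho\cos\phi\sin\theta \\ \cos\phi & -\rho\sin\phi \end{pmatrix}\Big| \tag{517}
\end{align}

\begin{align}
J &= \left|\rho^2\sin^3\phi\cos^2\theta + \rho^2\cos^2\phi\sin\phi\cos^2\theta - \rho\sin\phi\sin\theta\left(-\rho\sin^2\phi\sin\theta - \rho\cos^2\phi\sin\theta\right)\right| \tag{518} \\
J &= \left|\rho^2\sin^3\phi\cos^2\theta + \rho^2\cos^2\phi\sin\phi\cos^2\theta + \rho\sin\phi\sin\theta\left(\rho\sin^2\phi\sin\theta + \rho\cos^2\phi\sin\theta\right)\right| \tag{519} \\
J &= \left|\rho^2\sin^3\phi\cos^2\theta + \rho^2\cos^2\phi\sin\phi\cos^2\theta + \rho\sin\phi\sin\theta\,(\rho\sin\theta)\right| \tag{520} \\
J &= \rho^2\sin\phi\left|\sin^2\phi\cos^2\theta + \cos^2\phi\cos^2\theta + \sin^2\theta\right| \tag{521} \\
J &= \rho^2\sin\phi\left|(\sin^2\phi + \cos^2\phi)\cos^2\theta + \sin^2\theta\right| \tag{522} \\
J &= \rho^2\sin\phi\left|\cos^2\theta + \sin^2\theta\right| \tag{523} \\
J &= \rho^2\sin\phi \tag{524}
\end{align}

Therefore, since our Jacobian is equal to $J = \rho^2\sin\phi$, we can express the volume element in spherical coordinates as:

\[ dV = \rho^2\sin\phi\,d\rho\,d\phi\,d\theta \tag{525} \]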
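The determinant above is easy to get wrong by a sign, so a quick numerical spot check can be reassuring. The sketch below assumes the standard conversion $x = \rho\sin\phi\cos\theta$, $y = \rho\sin\phi\sin\theta$, $z = \rho\cos\phi$, builds the Jacobian matrix with central differences, and compares its determinant with $\rho^2\sin\phi$ at a sample point. The helper names are hypothetical.

```python
import math

def to_cartesian(rho, phi, theta):
    """Spherical (rho, phi, theta) to Cartesian (x, y, z)."""
    return (rho * math.sin(phi) * math.cos(theta),
            rho * math.sin(phi) * math.sin(theta),
            rho * math.cos(phi))

def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def jacobian_det(rho, phi, theta, step=1e-6):
    """det d(x, y, z)/d(rho, phi, theta) via central differences."""
    point = (rho, phi, theta)
    columns = []
    for k in range(3):
        plus, minus = list(point), list(point)
        plus[k] += step
        minus[k] -= step
        f_plus, f_minus = to_cartesian(*plus), to_cartesian(*minus)
        columns.append([(p - m) / (2 * step) for p, m in zip(f_plus, f_minus)])
    # 'columns' is the transpose of the Jacobian matrix; the determinant is the same.
    return det3(columns)

rho, phi, theta = 2.0, 0.7, 1.3
print(jacobian_det(rho, phi, theta), rho ** 2 * math.sin(phi))
```

Trying a few other sample points with $\phi \in (0, \pi)$ should give the same agreement.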

22 Lecture XIII on July 31, 2019

22.1 Vector Fields

Welcome back! Today we are going to be diving into the vector calculus portion of the course. This will probably seem a bit new to all of you, so I recommend reading through these portions a bit more slowly. This is what we have been building up to, and it comes with plenty of applications to look at. We start by defining the vector field.

Definition: A vector field in $\mathbb{R}^3$ is a function $f : \mathbb{R}^3 \to \mathbb{R}^3$. A vector field in $\mathbb{R}^2$ is a function $f : \mathbb{R}^2 \to \mathbb{R}^2$.

We have probably all come in contact with a few vector fields before. Gravitational fields, electric fields, and even flow fields of fluids are all great examples of vector fields. We draw a vector field by drawing a vector representing the output of the function, and we place it at the point that is the input to the function. As such, the location of each arrow can be thought of as a point in the domain, and the arrow itself as the corresponding value in the codomain. Given that we can write our vector field as $\vec{F}(x, y, z)$, we can try to construct a formula that represents a given picture. Let's see, suppose we have the following image,

Figure 11: Vector Field 1, Gravitational Attraction.

and we are trying to write a vector field that matches it. All the arrows are pointing towards the source point. We want a vector field whose arrows point towards $(1, 1, 1)$ and increase in size as we approach $(1, 1, 1)$. As such, the vector field can be expressed as:

\[
\vec{F}(x, y, z) = -\frac{GMm}{\left((x - 1)^2 + (y - 1)^2 + (z - 1)^2\right)^{3/2}}\,(x - 1,\ y - 1,\ z - 1) \tag{526}
\]

The constants $GMm$ are just for the physical interpretation. We could omit those and still capture the shape of the vector field. Let's move on to an easier example.

Problem: Find a vector field $\vec{F} : [0, 2] \times [0, 1] \to \mathbb{R}^2$ whose plot looks like this:

Figure 12: Vector Field 2, Shear Flow.

Solution: Let's see the patterns we have here. All the vectors in our little rectangle point in the positive $x$ direction, and the magnitude does not change as we move along any horizontal line. As $y$ increases, the vectors decrease in magnitude. Since there is no $y$ component, and the magnitude of the arrow pointing in the $x$ direction shrinks as we move up, the vector field can be expressed as:

\[ \vec{F}(x, y) = (1 - y,\ 0) \tag{527} \]

Sometimes in classes that you may take in the future, you will see this type of vector field referred to as a shear flow! For those interested, this field is not curl-free, which generally catches students by surprise. Let's move on and start to talk about work.

22.2 Work in Vector Fields

We recall from physics that:

\[ W = \vec{F} \cdot \vec{d} \tag{528} \]

This tells us that work can be expressed as the dot product between the force and the displacement of the object. Now we want to generalize to paths that are not necessarily straight. Specifically, the work it takes to move a particle along a path $C$ in a vector field $\vec{F}$ is approximately:

\[ \sum_{k=1}^{n} \vec{F}(\vec{r}(t_k)) \cdot \left(\vec{r}(t_k) - \vec{r}(t_{k-1})\right) \tag{529} \]

As we take the limit $n \to \infty$, the sum becomes an integral. Skipping the formal notation and proof details that distract from the point, we simply boil this expression down to the real expression for work:

\[ W = \int_{r_1}^{r_2} \vec{F}(\vec{r}(t)) \cdot d\vec{r} \tag{530} \]

A much more workable way to utilize this expression is through a parametrization. Suppose that you parametrize the path as $\vec{r}(t)$ for $t \in [t_1, t_2]$. Then, you can express the above expression for work as:

\[ W = \int_{t_1}^{t_2} \left(\vec{F}(\vec{r}(t)) \cdot \frac{d\vec{r}}{dt}\right) dt \tag{531} \]

where the integrand is just a function of $t$, so this is a single-variable integral that we can handle with our knowledge of single-variable calculus. The above expression is what we will be working with extensively throughout the course, so I encourage you to get this down pat. Let's state a theorem to justify why it does not matter which parametrization we take of the same path $C$.

Theorem: If $\vec{r}_1$ and $\vec{r}_2$ are parametrizations of the path $C$ (traversed in the same direction), and $\vec{F}$ is a vector field, then

\[
W = \int_{t_1}^{t_2} \left(\vec{F}(\vec{r}_1(t)) \cdot \frac{d\vec{r}_1}{dt}\right) dt = \int_{t_1}^{t_2} \left(\vec{F}(\vec{r}_2(t)) \cdot \frac{d\vec{r}_2}{dt}\right) dt \tag{532}
\]

where $t_1$ and $t_2$ stand for the start and end times of whichever parametrization is being used.

All this is saying in English is that if I can write an $\vec{r}(t)$ that describes the curve $C$, then this is a totally valid parametrization. There really is not any special parametrization that you must utilize. Let's do an example. Suppose we have the curve $y^2 = x$ between $(0, 0)$ and $(1, 1)$. Then a totally valid parametrization is:

\[ \vec{r}(t) = (t^2,\ t) \quad\text{for } t \in [0, 1] \tag{533} \]

Another totally valid parametrization is:

\[ \vec{r}(t) = (4t^2,\ 2t) \quad\text{for } t \in \left[0, \tfrac{1}{2}\right] \tag{534} \]

You can see that replacing $t$ by $2t$ does not matter since we just adjust the time interval. Another parametrization is:

\[ \vec{r}(t) = (t,\ \sqrt{t}) \quad\text{for } t \in [0, 1] \tag{535} \]

Also valid! The difference between this and the first parametrization is that this one will start off much faster and finish slower than the first.
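To see parametrization independence in action, here is a small numerical sketch. The test field $\vec{F} = (y, -x)$ is an assumption chosen only for illustration, and the second parametrization is written as $((2t)^2, 2t) = (4t^2, 2t)$ so that it stays on the curve $y^2 = x$. The helper `work` uses a midpoint sum with hand-coded derivatives.

```python
import math

def work(F, r, dr, t_start, t_end, n=20000):
    """Midpoint-rule estimate of the integral of F(r(t)) . r'(t) dt."""
    h = (t_end - t_start) / n
    total = 0.0
    for k in range(n):
        t = t_start + (k + 0.5) * h
        fx, fy = F(*r(t))
        dx, dy = dr(t)
        total += (fx * dx + fy * dy) * h
    return total

def F(x, y):
    """Hypothetical test field, not taken from the notes at this point."""
    return (y, -x)

# Three parametrizations of y^2 = x from (0, 0) to (1, 1).
w1 = work(F, lambda t: (t * t, t), lambda t: (2 * t, 1.0), 0.0, 1.0)
w2 = work(F, lambda t: (4 * t * t, 2 * t), lambda t: (8 * t, 2.0), 0.0, 0.5)
w3 = work(F, lambda t: (t, math.sqrt(t)), lambda t: (1.0, 0.5 / math.sqrt(t)), 0.0, 1.0)

print(w1, w2, w3)
```

For this field, each integrand can also be worked out by hand (for the first parametrization it is $t^2$), and all three values come out to $\frac{1}{3}$.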

22.3 Fundamental Theorem of Vector Calculus

In general, two paths $C_1$ and $C_2$ with the same start and end points do not have the same work being done along them. Namely, in general:

\[ \int_{C_1} \vec{F}(\vec{r}(t)) \cdot d\vec{r} \neq \int_{C_2} \vec{F}(\vec{r}(t)) \cdot d\vec{r} \tag{536} \]

22.3.1 Conservative Vector Fields

However, the two work integrals are equal for conservative vector fields. A vector field $\vec{F}$ is conservative if it can be written as

\[ \vec{F} = \vec{\nabla} f \tag{537} \]

where $f$ is just some function, $f : \mathbb{R}^n \to \mathbb{R}$, that we will refer to as a potential function. For a conservative vector field, two paths $C_1$ and $C_2$ with the same start and end points do have the same work being done. Namely:

\[
\int_{C_1} \vec{\nabla} f(\vec{r}(t)) \cdot d\vec{r} = \int_{C_1} \vec{F}(\vec{r}(t)) \cdot d\vec{r} = \int_{C_2} \vec{F}(\vec{r}(t)) \cdot d\vec{r} = \int_{C_2} \vec{\nabla} f(\vec{r}(t)) \cdot d\vec{r} \tag{538}
\]

In general, work integrals in a conservative vector field do not depend on the path taken between $\vec{r}(a)$, the start point, and $\vec{r}(b)$, the end point. As such, the climactic piece of information is that for a conservative vector field, $\vec{F} = \vec{\nabla} f$, the following equation holds, which is known as the fundamental theorem of vector calculus. Namely,

\[
W = \int_C \vec{F}(\vec{r}(t)) \cdot d\vec{r} = \int_C \vec{\nabla} f(\vec{r}(t)) \cdot d\vec{r} = f(\vec{r}(b)) - f(\vec{r}(a)) \tag{539}
\]

Why is this true, you may ask? Well, let me provide you with a back-of-the-envelope proof. Given that $\vec{F} = \vec{\nabla} f$, we can express a work integral as:

\[
W = \int_{C_1} \vec{F}(\vec{r}(t)) \cdot d\vec{r} = \int_{C_1} \vec{\nabla} f(\vec{r}(t)) \cdot d\vec{r} = \int_a^b \left(\vec{\nabla} f(\vec{r}(t)) \cdot \frac{d\vec{r}}{dt}\right) dt \tag{540}
\]

Notice that in the last equality, the portion in parentheses is just the multivariable chain rule expression for $\frac{df}{dt}$! As such, we obtain that:

\[
W = \int_a^b \left(\vec{\nabla} f(\vec{r}(t)) \cdot \frac{d\vec{r}}{dt}\right) dt = \int_a^b \frac{df}{dt}\,dt = \int_{C_1} df = f(\vec{r}(b)) - f(\vec{r}(a)) \tag{541}
\]

where $\vec{r}(a)$ is the starting point and $\vec{r}(b)$ is the ending point.

Just to have it stated in all its glory, the Fundamental Theorem of Vector Calculus states that if $C$ is a path from $\vec{a}$ to $\vec{b}$ and $f$ is a differentiable function, then:

\[ \int_C \vec{\nabla} f \cdot d\vec{r} = f(\vec{b}) - f(\vec{a}) \tag{542} \]

22.3.2 Checking Conservative Fields

Note that if $\vec{F} = (M, N) = \vec{\nabla} f$, where $M$ is the first component of the vector field and $N$ is the second component, then the relations between $M$, $N$, and $\vec{\nabla} f$ are:

\begin{align}
M &= \frac{\partial f}{\partial x} \tag{543} \\
N &= \frac{\partial f}{\partial y} \tag{544}
\end{align}

Then, since the mixed partial derivatives of $f$ are equal, a conservative field must satisfy:

\[ \frac{\partial M}{\partial y} = \frac{\partial N}{\partial x} \tag{545} \]

This is another neat application of Clairaut's Theorem! Under the right assumption, this condition is also sufficient to decide whether $\vec{F}$ is conservative or not. The theorem simply states that:

Theorem: If $M_y = N_x$ for a vector field $\vec{F} = (M, N)$, then $\vec{F}$ is a conservative vector field, under the assumption that $\vec{F}$ is differentiable everywhere on $\mathbb{R}^2$.
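The $M_y = N_x$ test is mechanical enough to sketch in code. The snippet below assumes central differences at a handful of sample points, so it is only a numerical spot check and not a proof. The two example fields and all names are hypothetical choices for illustration.

```python
def partial(g, x, y, wrt, h=1e-5):
    """Central-difference partial derivative of g(x, y) with respect to 'x' or 'y'."""
    if wrt == "x":
        return (g(x + h, y) - g(x - h, y)) / (2 * h)
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

def passes_mixed_partial_test(M, N, points, tol=1e-6):
    """True if M_y and N_x agree (within tol) at every sample point."""
    return all(
        abs(partial(M, x, y, "y") - partial(N, x, y, "x")) < tol
        for x, y in points
    )

samples = [(0.3, -1.2), (1.0, 2.0), (-0.7, 0.4)]

# Hypothetical example 1: F = (2xy, x^2), the gradient of f = x^2 y.
gradient_field = passes_mixed_partial_test(
    lambda x, y: 2 * x * y, lambda x, y: x * x, samples)

# Hypothetical example 2: F = (y, -x), where M_y = 1 but N_x = -1.
rotation_field = passes_mixed_partial_test(
    lambda x, y: y, lambda x, y: -x, samples)

print(gradient_field, rotation_field)
```

The first field passes the test and the second fails it, matching what you get by differentiating by hand.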

22.4 Green’s Theorem

This section is super neat and is actually just a specific case of the general Stokes' Theorem that we will encounter later. For a vector field $\vec{F} = (M, N)$ and a closed curve $C$ enclosing a region $D$:

\[ \oint_C \vec{F} \cdot d\vec{r} = \iint_D (N_x - M_y)\,dA \tag{546} \]

where $\oint$ simply means an integral around a closed curve. You will see this a ton for the rest of the class. Consider first a conservative vector field. If a conservative vector field has $\vec{F} = \vec{\nabla} f$, then the work done by this vector field is simply equal to $f(b) - f(a)$, where $b$ is the endpoint and $a$ is the starting point. However, for a closed loop, the starting and end points are exactly the same! As such, the work around any closed loop for a conservative vector field is exactly zero. We can also see this by looking at the right-hand side of the above equation. Since $N_x = M_y$ for a conservative vector field, the RHS will always evaluate to zero for a conservative vector field over a closed loop! We covered a lot today, so hopefully it was not all too overwhelming.

23 Recitation IX on August 1, 2019

Let me add in a few examples from workshop that will prove helpful on exams and the ASE.

Problem: Evaluate the work being done by the vector field $\mathbf{F} = \langle 2x, 3y, 4z \rangle$ along a helical path that starts at $(0, 0, 0)$ and stops at $(0, 0, 1)$.

Solution: We can begin by finding a potential function $f$ for $\mathbf{F}$, that is, a scalar function $f$ whose gradient satisfies $\vec{\nabla} f = \mathbf{F}$. By inspection (taking into account that each component of $\mathbf{F}$ should be the corresponding partial derivative of $f$), we can come to the conclusion that a possible $f$ is:

\[ f(x, y, z) = x^2 + \frac{3}{2} y^2 + 2z^2 \]

Once we've accomplished this, we can employ the FTOVC (the Fundamental Theorem of Vector Calculus), which states that:

\begin{align}
\int_{(0,0,0)}^{(0,0,1)} \mathbf{F} \cdot d\mathbf{r} &= \int_{(0,0,0)}^{(0,0,1)} \vec{\nabla} f \cdot d\mathbf{r} \tag{547} \\
&= f(0, 0, 1) - f(0, 0, 0) \tag{548} \\
&= 2 - 0 \tag{549} \\
&= 2 \tag{550}
\end{align}

Problem: Suppose an object is moving in a vector field $\mathbf{F}$ such that:

\[
\mathbf{F} = \left\langle \frac{-x}{(x^2 + y^2 + z^2)^{3/2}},\ \frac{-y}{(x^2 + y^2 + z^2)^{3/2}},\ \frac{-z}{(x^2 + y^2 + z^2)^{3/2}} \right\rangle \tag{551}
\]

along the path $\mathbf{r}(t) = \langle 1 + t,\ t^3,\ t\cos(\pi t) \rangle$ from $t = 0$ to $t = 1$. Find the work done by this vector field on the object.

Solution: We begin in the same manner as our previous problem, where we can come to the conclusion that a possible $f$ is:

\[ f(x, y, z) = \frac{1}{(x^2 + y^2 + z^2)^{1/2}} \]

which means that the question now is what is the difference in the value of $f$ at $t = 1$ versus at $t = 0$, that is, from $(1, 0, 0)$ to $(2, 1, -1)$. So we have

\begin{align}
W &= f(2, 1, -1) - f(1, 0, 0) \tag{552} \\
&= \frac{1}{\sqrt{6}} - 1 \tag{553}
\end{align}
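As a cross-check of the potential-function shortcut, the work can also be estimated directly from the path. The sketch below assumes a midpoint sum and a derivative of the path worked out by hand, $\mathbf{r}'(t) = \langle 1,\ 3t^2,\ \cos(\pi t) - \pi t\sin(\pi t) \rangle$; the function names are illustrative.

```python
import math

def F(x, y, z):
    """The inverse-square field from the problem statement."""
    r_cubed = (x * x + y * y + z * z) ** 1.5
    return (-x / r_cubed, -y / r_cubed, -z / r_cubed)

def r(t):
    """The given path."""
    return (1 + t, t ** 3, t * math.cos(math.pi * t))

def dr(t):
    """Hand-computed derivative of the path."""
    return (1.0, 3 * t * t, math.cos(math.pi * t) - math.pi * t * math.sin(math.pi * t))

def work_along_path(n=20000):
    """Midpoint-rule estimate of the work integral from t = 0 to t = 1."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        fx, fy, fz = F(*r(t))
        dx, dy, dz = dr(t)
        total += (fx * dx + fy * dy + fz * dz) * h
    return total

print(work_along_path(), 1 / math.sqrt(6) - 1)
```

The path never passes through the origin (its $x$ coordinate is at least 1), so the field is well defined along the whole path.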

Problem: Find the work done on a particle that moves through a force field $\mathbf{F} = \langle y, -x \rangle$ along a triangular path starting and ending at the origin and going through the points $(1, 0)$ and $(1, 1)$, with a) a line integral and b) Green's Theorem.

Solution: We begin by noting that the vector field that we're given is not a conservative vector field and thus, we cannot use the FTOVC. However, we can think of our path as a superposition of 3 line segments, evaluate the work done on our particle along each of those line segments, and add the results up to get the total work done. The paths can be parameterized as follows:

1. $\mathbf{r}_1(t) = \langle t, 0 \rangle$ for $t \in [0, 1]$.

2. $\mathbf{r}_2(t) = \langle 1, t \rangle$ for $t \in [0, 1]$.

3. $\mathbf{r}_3(t) = \langle t, t \rangle$ for $t \in [0, 1]$ (but traversed from $t = 1$ to $t = 0$).

So then our work calculation reduces to:

\begin{align}
W &= W_1 + W_2 + W_3 \tag{554} \\
&= \int \mathbf{F} \cdot d\mathbf{r}_1 + \int \mathbf{F} \cdot d\mathbf{r}_2 + \int \mathbf{F} \cdot d\mathbf{r}_3 \tag{555} \\
&= \int \mathbf{F}(\mathbf{r}_1(t)) \cdot \mathbf{r}_1'\,dt + \int \mathbf{F}(\mathbf{r}_2(t)) \cdot \mathbf{r}_2'\,dt + \int \mathbf{F}(\mathbf{r}_3(t)) \cdot \mathbf{r}_3'\,dt \tag{556} \\
&= \int_0^1 \langle 0, -t \rangle \cdot \langle 1, 0 \rangle\,dt + \int_0^1 \langle t, -1 \rangle \cdot \langle 0, 1 \rangle\,dt + \int_1^0 \langle t, -t \rangle \cdot \langle 1, 1 \rangle\,dt \tag{557} \\
&= 0 - 1 + 0 \tag{558} \\
&= -1 \tag{559}
\end{align}

Now, using Green's Theorem, the first thing to do is take the corresponding partial derivatives and take their difference to find the integrand:

\[ N_x - M_y = -1 - 1 = -2 \]

Now that we have our integrand, we just need to integrate that function over the triangle to get our work, which turns into $-2$ times the area of our triangle (which is $\frac{1}{2}$), and thus our work is $-1$.
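Both sides of Green's Theorem for this triangle can be sketched in a few lines. The code below assumes straight-line parametrizations of each edge with a midpoint sum for the line integral, and uses the shoelace formula for the signed area; the constant $N_x - M_y = -2$ is passed in by hand. Function names are made up for illustration.

```python
def line_integral_polygon(F, vertices, n=1000):
    """Work of F around a closed polygon, visiting vertices in the given order."""
    total = 0.0
    m = len(vertices)
    for i in range(m):
        (x0, y0), (x1, y1) = vertices[i], vertices[(i + 1) % m]
        dx, dy = x1 - x0, y1 - y0  # r'(t) for r(t) = P + t (Q - P)
        for k in range(n):
            t = (k + 0.5) / n
            fx, fy = F(x0 + t * dx, y0 + t * dy)
            total += (fx * dx + fy * dy) / n
    return total

def signed_area(vertices):
    """Shoelace formula; positive for counterclockwise orientation."""
    m = len(vertices)
    s = 0.0
    for i in range(m):
        (x0, y0), (x1, y1) = vertices[i], vertices[(i + 1) % m]
        s += x0 * y1 - x1 * y0
    return 0.5 * s

triangle = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]  # counterclockwise
lhs = line_integral_polygon(lambda x, y: (y, -x), triangle)
rhs = -2.0 * signed_area(triangle)  # (N_x - M_y) times the area

print(lhs, rhs)
```

Because $N_x - M_y$ is constant here, the double integral reduces to that constant times the area, which is why no 2D quadrature is needed.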

Problem: Suppose you have a force field, $\mathbf{F} = \langle x^3, -y^4 \rangle$. In addition, assume an object is moving through the force field along a circular path described by $\mathbf{r}(t) = \langle \cos(2\pi t), \sin(2\pi t) \rangle$ from $t = 0$ to $t = 1$. Show that the work done is zero by means of a line integral (math), the fundamental theorem of vector calculus (math), and through a direct statement (all words).

Solution: We first show this through a direct computation:

\begin{align}
\oint_C \mathbf{F} \cdot d\mathbf{r} &= \int_0^1 \langle x^3, -y^4 \rangle \cdot \langle -2\pi\sin(2\pi t),\ 2\pi\cos(2\pi t) \rangle\,dt \tag{560} \\
&= \int_0^1 \langle \cos^3(2\pi t),\ -\sin^4(2\pi t) \rangle \cdot \langle -2\pi\sin(2\pi t),\ 2\pi\cos(2\pi t) \rangle\,dt \tag{561} \\
&= -2\pi \int_0^1 \left[\cos^3(2\pi t)\sin(2\pi t) + \sin^4(2\pi t)\cos(2\pi t)\right] dt \tag{562} \\
&= -2\pi \int_0^1 \cos^3(2\pi t)\sin(2\pi t)\,dt - 2\pi \int_0^1 \sin^4(2\pi t)\cos(2\pi t)\,dt \tag{563}
\end{align}

Each of these integrals can be evaluated through a $u$-substitution. Let

\[ u_1 = \cos(2\pi t) \quad\text{and}\quad u_2 = \sin(2\pi t). \tag{565} \]

Then, we have

\[ du_1 = -2\pi\sin(2\pi t)\,dt \quad\text{and}\quad du_2 = 2\pi\cos(2\pi t)\,dt. \tag{566} \]

Making this change of variables, we realize that we would be integrating from 1 to 1 and from 0 to 0 respectively, so the value of our integral is $0$.

Using the FTOVC, we quickly realize that the vector field is conservative (a potential is $f = \frac{x^4}{4} - \frac{y^5}{5}$) and that the starting and ending points are the same, and thus we obtain $0$ as a result.

24 Lecture XIV on August 2, 2019

Last class, we ended with Green's Theorem. Let's pick up right where we left off:

24.1 Green’s Theorem

\[ \oint_C \vec{F} \cdot d\vec{r} = \iint_D (N_x - M_y)\,dA \tag{567} \]

where $\vec{F} = (M, N)$, $C$ is a closed loop, and $D$ is the region enclosed by the curve $C$. On the left-hand side, we have an expression that tells us to calculate the work as we go around some path that starts and ends at the same point. The right-hand side can be thought of as the curl of the vector field integrated over the area. The reason why we must have a closed loop is that we must trap some area. Well, apparently an intuition for Green's Theorem is doing an approximation of a work integral over an infinitesimal square's border... who knew! Anyways, we are finally done with that proof, so let's move on to Surface Integrals.

24.2 Surface Integrals

If $S$ is a surface and $f : S \to \mathbb{R}$ is a function, then we can define:

\[
\int_S f\,dA = \lim_{\text{number of patches} \to \infty} \sum_{\text{patch}} f(\text{patch})\,\text{area}(\text{patch}) \tag{568}
\]

A way we can try to conceptualize what is going on is to think about finding the average global temperature. Namely, we take some function, temperature, and integrate it over the surface of the earth. Then, we divide through by the total surface area to extract the average temperature. This is an application of average value that we went over in workshop with some example problems!

\[ \text{Avg}(f) = \frac{\int_S f\,dA}{\int_S dA} \tag{569} \]

where $S$ is the surface, and $f : S \to \mathbb{R}$. Let's try to polish off this idea with an example:

Problem: Find $\int_S f\,dA$, where $f(x, y, z) = 2x^2 + 2y^2 + 2z^2$, over the unit sphere.

Solution: Well, let's think about this a bit before we dive in. Since we are integrating over the unit sphere, by definition, $x^2 + y^2 + z^2 = 1$ always! This is the surface of the unit sphere. Therefore, we simply have that:

\[ 2\int_S (x^2 + y^2 + z^2)\,dA = 2\int_S 1\,dA = 2(4\pi R^2) = 8\pi \tag{570} \]

since $R = 1$ for the unit sphere. If we wanted to calculate the average value, we could simply divide our answer by the total surface area, $4\pi$, leaving us with an average value of $2$. We kind of expected this to happen, considering that the function $f$ is always equal to $2$ over the surface of the unit sphere.

24.3 Parametrizing Surfaces (ASE)

This section is useful for the ASE, but it is not necessarily tested in this course. Consider the points $(x, y, z)$ with $y = \sin 7z$ for $x \in [0, 1]$ and $z \in [0, \frac{\pi}{2}]$. Although there does exist a formulaic way to parametrize a surface, in this case, we can parametrize this surface by inspection as:

\[ \vec{r}(u, v) = (v,\ \sin 7u,\ u) \tag{571} \]

The idea here is that we are trying to describe a surface living in 3 variables using only two variables, $u$ and $v$. Let's move forward with another example:

Example: Parametrize the part of the unit sphere in the first octant.

Solution: We can bring in spherical coordinates if we are talking about spherical-like shapes! We can write our parametrization of the unit sphere in terms of $\theta$ and $\phi$, where for the part of the unit sphere in the first octant, we obtain that $\theta \in [0, \frac{\pi}{2}]$ and $\phi \in [0, \frac{\pi}{2}]$. As such, we can express our $x$, $y$, and $z$ coordinates as:

\[ \vec{r}(\theta, \phi) = (\sin\phi\cos\theta,\ \sin\phi\sin\theta,\ \cos\phi) \tag{572} \]

where we basically have expressed spherical coordinates by letting $\rho = 1$ in the case of the unit sphere. We can express the formula for a surface integral as:

\[ \iint_S f\,dA = \iint_D f(\vec{r}(u, v))\,|\vec{r}_u \times \vec{r}_v|\,du\,dv \tag{573} \]

where $S$ is the surface that we are integrating over, and $D$ is the region of parameter values $(u, v)$. The thing that we are taking the magnitude of can be thought of as a Jacobian that is transforming our funky surface region into a nice flat region $D$ that is just a rectangle with the bounds on $\theta$ and $\phi$. Let's break this down piece by piece. We are first saying, okay, $f$ was given in terms of $x$, $y$, and $z$. I am going to say to hell with these coordinates and plug my parametrization $\vec{r}$ into $f$. In the case that we are simply trying to find the surface area itself, we set $f$ equal to one. However, in the case it was equal to a function, we would plug in our parametrization in terms of $\theta$ and $\phi$ for our $x$, $y$, and $z$. Then we are going to calculate this Jacobian-looking object sitting out front of our $du\,dv$. What this notation means is we are going to take the partial derivatives of our parametrization with respect to $u$ and $v$, cross those, and then take the magnitude of this cross product to represent the area distortion. Then, our $du\,dv$ is going to be the two variables that we are parametrizing with respect to. So, for the case of this problem specifically, we are parametrizing with respect to $\theta$ and $\phi$, where each of these two variables is bounded in $[0, \frac{\pi}{2}]$, giving us our bounds for the integral. Let me compute this problem thoroughly so that you can see this after class. Let's move on to another example in the meantime.

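Problem: Find the average value of $z$ over the upper half of the unit sphere.

Solution: We can make use of the parametrization of the unit sphere, and simply limit $\phi \in [0, \frac{\pi}{2}]$ and have $\theta \in [0, 2\pi]$, with the parametrization:

\[ \vec{r}(\theta, \phi) = (\sin\phi\cos\theta,\ \sin\phi\sin\theta,\ \cos\phi) \tag{574} \]

yielding the integral (using $z = \cos\phi$ and $|\vec{r}_\theta \times \vec{r}_\phi| = \sin\phi$):

\[
\int_S f\,dA = \int_0^{2\pi}\int_0^{\pi/2} z\,|\vec{r}_\theta \times \vec{r}_\phi|\,d\phi\,d\theta = \int_0^{2\pi}\int_0^{\pi/2} \cos\phi\sin\phi\,d\phi\,d\theta = \pi \tag{575}
\]

Then, from here, we can divide through by the surface area, $2\pi$, in order to obtain the average value of $\frac{1}{2}$.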
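The recipe "plug in the parametrization, multiply by $|\vec{r}_\theta \times \vec{r}_\phi|$, integrate over the parameter rectangle" can be sketched numerically for this hemisphere. The code below assumes central differences for the partial derivatives and a midpoint sum over $\theta \in [0, 2\pi]$, $\phi \in [0, \frac{\pi}{2}]$; the helper names are hypothetical.

```python
import math

def r(theta, phi):
    """Unit-sphere parametrization (rho = 1)."""
    return (math.sin(phi) * math.cos(theta),
            math.sin(phi) * math.sin(theta),
            math.cos(phi))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def area_factor(theta, phi, h=1e-6):
    """|r_theta x r_phi| with the partials estimated by central differences."""
    r_theta = tuple((p - m) / (2 * h)
                    for p, m in zip(r(theta + h, phi), r(theta - h, phi)))
    r_phi = tuple((p - m) / (2 * h)
                  for p, m in zip(r(theta, phi + h), r(theta, phi - h)))
    return math.sqrt(sum(c * c for c in cross(r_theta, r_phi)))

def surface_integral(f, n=100):
    """Midpoint sum of f(r) |r_theta x r_phi| over the upper hemisphere."""
    d_theta = 2 * math.pi / n
    d_phi = (math.pi / 2) / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * d_theta
        for j in range(n):
            phi = (j + 0.5) * d_phi
            total += f(*r(theta, phi)) * area_factor(theta, phi) * d_theta * d_phi
    return total

integral_of_z = surface_integral(lambda x, y, z: z)
area = surface_integral(lambda x, y, z: 1.0)
print(integral_of_z, area, integral_of_z / area)
```

The three printed values should be close to $\pi$, $2\pi$, and $\frac{1}{2}$ respectively.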

24.3.1 A Better Treatment

So, we touched on surface integrals in class, but let's get down a more formulaic way of attack. Suppose I have a surface that exists in 3D space, and I want to describe it with only two variables; this is how we will be approaching surfaces. Let's start off with how to parametrize a surface and then move forward to how to set up a surface integral. So perhaps you want to parametrize the part of the surface $z = 1 - x^2 - y^2$ that lies above the $xy$ plane. Well, our task is to be able to express this surface in terms of only two variables, which we shall call $u$ and $v$ to stay in line with the notation you will see. Similar to how we parametrize a curve whilst doing work integrals, let's now try to parametrize this surface. Luckily, all three of our coordinates are written in terms of only $x$ and $y$! Therefore, we can express the surface with the following parametrization:

\[ \vec{r}(u, v) = (u,\ v,\ 1 - u^2 - v^2) \tag{576} \]

where the bounds of our $u$ and $v$ will trace out the unit disk, given that this is the 'shadow', if you will, of our surface on the $xy$ plane. Great! Okay, now that we have parametrized this surface, let's compute the surface area!

The formula for surface area is:

\[ S = \iint_A |\vec{r}_u \times \vec{r}_v|\,du\,dv \tag{577} \]

The thing that we need to compute first is $|\vec{r}_u \times \vec{r}_v|$. We can do so with the following:

\begin{align}
\vec{r}_u &= (1,\ 0,\ -2u) \tag{578} \\
\vec{r}_v &= (0,\ 1,\ -2v) \tag{579}
\end{align}

Taking the cross product, $\vec{r}_u \times \vec{r}_v = (2u,\ 2v,\ 1)$, and then the magnitude of this cross product, we obtain that:

\[ |\vec{r}_u \times \vec{r}_v| = \sqrt{4(u^2 + v^2) + 1} \tag{580} \]

which I can then plug into my formula for the surface area to obtain that:

\[ S = \iint_A \sqrt{4(u^2 + v^2) + 1}\,du\,dv \tag{581} \]

Now it will be advantageous to switch over to polar coordinates, given our 'shadow' region of the unit disk and the integrand. Namely, we obtain that:

\[ S = \int_0^{2\pi}\int_0^1 \sqrt{4r^2 + 1}\,r\,dr\,d\theta \tag{582} \]

which we can solve to compute the surface area. I will say that the worksheets throughout the course alluded to this formula in a way. That is, given we have $z = f(x, y)$, we can alternately express the surface area of a surface over a region as:

\[ S = \iint_A \sqrt{f_x^2 + f_y^2 + 1}\,dA \tag{583} \]

which you can see exactly fits the bill of what we ended up obtaining by doing the more formal parametrization! Both work, so just choose the one that you like! The most common thing you will probably be asked is to parametrize some surface of a sphere of radius $a$. Well, you might think, okay, I have spherical coordinates to hopefully parametrize this, but my spherical coordinates are written in terms of 3 variables, and I can only write my parametrization in terms of 2 coordinates! However, since the radius $\rho$ is fixed, because we are concerned with the surface of the sphere, we actually only have 2 variables, $\theta$ and $\phi$, and as such, we can express the parametrization of a sphere of radius $a$ as:

\[ \vec{r}(u, v) = (a\sin u\cos v,\ a\sin u\sin v,\ a\cos u) \tag{584} \]

Nice. Now we can do the exact same thing as previously done, that is, compute $\vec{r}_u$ and $\vec{r}_v$ and then take the magnitude of their cross product. The last shape that I see parametrized a lot is a cone! A cone has the surface $z = \sqrt{x^2 + y^2}$, which we can conveniently parametrize as:

\[ \vec{r}(u, v) = (u,\ v,\ \sqrt{u^2 + v^2}) \tag{585} \]

In all honesty, there is a bit of ugly and tedious labor that comes with the cross product being involved, but the overarching idea is not all too bad. Namely, we are integrating over the region that lies underneath our surface in 3-dimensional space, some shadow region. We are then parametrizing our surface to find some correctional factor, $|\vec{r}_u \times \vec{r}_v|$, to then integrate over the shadow region! With that, I think that is all that is necessary for ASE prep.
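To round off the paraboloid example $z = 1 - x^2 - y^2$ over the unit disk, here is a hedged numerical sketch of the surface area. It assumes the hand-computed partials $\vec{r}_u = (1, 0, -2u)$ and $\vec{r}_v = (0, 1, -2v)$, sums $|\vec{r}_u \times \vec{r}_v|$ over the disk in polar coordinates, and compares with the closed form $\frac{\pi}{6}(5\sqrt{5} - 1)$ that comes from the substitution $w = 4r^2 + 1$ in the polar integral. The function names are illustrative.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def area_factor(u, v):
    """|r_u x r_v| for r(u, v) = (u, v, 1 - u^2 - v^2)."""
    r_u = (1.0, 0.0, -2.0 * u)
    r_v = (0.0, 1.0, -2.0 * v)
    return math.sqrt(sum(c * c for c in cross(r_u, r_v)))

def paraboloid_area(n=200):
    """Midpoint sum over the unit disk, using polar coordinates (extra factor r)."""
    d_rad = 1.0 / n
    d_theta = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        rad = (i + 0.5) * d_rad
        for j in range(n):
            theta = (j + 0.5) * d_theta
            u, v = rad * math.cos(theta), rad * math.sin(theta)
            total += area_factor(u, v) * rad * d_rad * d_theta
    return total

closed_form = math.pi / 6 * (5 * math.sqrt(5) - 1)
print(paraboloid_area(), closed_form)
```

The same `area_factor` idea carries over to the sphere and cone parametrizations: swap in their partial derivatives and their parameter region.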

24.4 Flux

A natural application of surface integrals is flux. Suppose we have some vector field that is flowing out in space, perhaps think of water flowing through a net. That being said, suppose we want to see how much water actually does flow through that net. Well, we can ask how much stuff goes through our surface $S$, namely we can say that:

\[ \int_S \vec{F} \cdot d\vec{A} = \iint_D \vec{F}(\vec{r}(u, v)) \cdot (\vec{r}_u \times \vec{r}_v)\,du\,dv \tag{586} \]

where:

\[ d\vec{A} = (\vec{r}_u \times \vec{r}_v)\,du\,dv \tag{587} \]

We utilize the dot product here because we want to get only the stuff that penetrates the surface $S$. Think of that as motivation for taking the dot product, since we are essentially filtering out all the stuff (vector field) that is not going through the surface. I find it helps if you think of a wall with holes punctured out. Now, consider that you have a hose. First, you decide to point the hose right at some of these small holes. Well then, we would expect a ton of water to make it through the holes and to the other side. Then suppose I start to veer off to an angle from the wall, and aim the hose. Well now, water is still going through the holes, but not as much, since we are at an angle and some is not making it through like it normally would. Namely, the components of the water from my hose that are not in the direction of the hole are not making it through. Now finally, consider the last case where I am shooting the water in line with the face of the wall, namely perpendicular to the normal vector of the wall. Well now, none of the water is going through! All my water is running along the face of the wall and as such, none can really come through! Hopefully this helps internalize this a bit!

24.5 Divergence

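Divergence is a measure of the flow density! We compute the divergence of a vector field $\vec{F}$ as:

\[
\vec{\nabla} \cdot \vec{F} = \left(\frac{\partial}{\partial x},\ \frac{\partial}{\partial y},\ \frac{\partial}{\partial z}\right) \cdot (M, N, P) = \frac{\partial M}{\partial x} + \frac{\partial N}{\partial y} + \frac{\partial P}{\partial z} \tag{588}
\]

where $\vec{F} = (M, N, P)$.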

25 Lecture XV on August 5, 2019

Hey all! Sorry I couldn't be in attendance today, I had to do something for the Office of Minority Education. Anyways, I still want to ensure that you have all the tools that you need in order to do great on the final exam this Thursday. I want to start off with reviewing divergence, go over curl, and then introduce the final two big equations of the course: the Divergence Theorem and Stokes' Theorem.

25.1 Divergence

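We define the divergence of a vector field $\vec{F} = (M, N, P)$ as:

\[
\text{div}(\vec{F}) = \vec{\nabla} \cdot \vec{F} = \left(\frac{\partial}{\partial x},\ \frac{\partial}{\partial y},\ \frac{\partial}{\partial z}\right) \cdot (M, N, P) \tag{589}
\]

The equation by itself is not all too bad. Let's do an example and then complete a discussion on what this quantity represents.

Problem: Compute the divergence of the vector field $\vec{F} = (x^2 y,\ yx,\ z)$ at the point $(1, 1, 0)$.

Solution: We can utilize the aforementioned equation to express the divergence of the vector field $\vec{F}$ as:

\[
\text{div}(\vec{F}) = \vec{\nabla} \cdot \vec{F} = \left(\frac{\partial}{\partial x},\ \frac{\partial}{\partial y},\ \frac{\partial}{\partial z}\right) \cdot (x^2 y,\ yx,\ z) = \frac{\partial}{\partial x}(x^2 y) + \frac{\partial}{\partial y}(yx) + \frac{\partial}{\partial z}(z) \tag{590}
\]

\[ \text{div}(\vec{F}) = 2xy + x + 1 \tag{591} \]

where we can calculate the divergence at the point $(1, 1, 0)$ as:

\[ \text{div}(\vec{F})(1, 1, 0) = 2 + 1 + 1 = 4 \tag{592} \]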
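A quick way to check a divergence computation is to approximate the three partial derivatives numerically. The sketch below assumes central differences with a small step; `divergence` is a made-up helper name, and the field is the one from the problem above.

```python
def F(x, y, z):
    """The example field (x^2 y, y x, z)."""
    return (x * x * y, y * x, z)

def divergence(F, x, y, z, h=1e-5):
    """Central-difference estimate of dM/dx + dN/dy + dP/dz at (x, y, z)."""
    dM_dx = (F(x + h, y, z)[0] - F(x - h, y, z)[0]) / (2 * h)
    dN_dy = (F(x, y + h, z)[1] - F(x, y - h, z)[1]) / (2 * h)
    dP_dz = (F(x, y, z + h)[2] - F(x, y, z - h)[2]) / (2 * h)
    return dM_dx + dN_dy + dP_dz

# Compare with the formula 2xy + x + 1 at the point (1, 1, 0).
print(divergence(F, 1.0, 1.0, 0.0), 2 * 1 * 1 + 1 + 1)
```

The result is a single number at each point, which matches the fact that divergence is a scalar.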

Great! So that is the computation necessary to compute the divergence of a vector field $\vec{F}$ at a point. Now, let's see what it represents. Conceptually, the divergence tells us how much "stuff" (vector field) is flowing into and out of a specific point. Namely, if we look at an infinitesimal volume surrounding a point, so in our previous case, $(1, 1, 0)$, we would see, given the answer is positive, that there is more stuff (vector field) flowing out from the point than there is flowing in! Let's look at a pictorial example now. Consider the following two vector fields that we have seen before, which I will now display below:

Figure 13: Divergence of Vector Field 1

In this case, suppose I pick a point that is not $(1, 1, 1)$. Then, I can conclude that in this case the divergence at this random point should be negative! Why? Well, let's see. If we look at a point, let's say $(1, \frac{1}{2}, \frac{1}{2})$, then we notice that the size of the arrows representing the vector field flowing in is larger than the size of the vectors flowing out, so we get a net negative flow. Let's turn to another example now.

Figure 14: Divergence of Vector Field 1

Here we have an interesting case. Let's look at perhaps the point $(\frac{1}{2}, \frac{1}{2})$. At this point, we see that the flow coming in from the left is exactly the same size as the flow going out to the right. As such, this is what we refer to as zero divergence! The overarching point here is that we want to isolate one point in our mind, and then pictorially ask ourselves whether the flow in is greater than, less than, or equal to the flow out by looking at the vector field that the point is exposed to. If the flow in is greater than the flow out, we get a negative divergence. If the flow out is greater than the flow in, we get a positive divergence. Finally, if the flow in equals the flow out, we get zero divergence. Of course, it is much easier to simply compute the divergence by utilizing the equation, but it is also important that we conceptually master how to look at a graph of flow and be able to say what sign the divergence will have. Finally, note that the divergence is a scalar. It is simply some function that is not a vector! We can think of divergence as a function $f : \mathbb{R}^n \to \mathbb{R}$. Let's move on to curl now.

25.2 Curl

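Next we will discuss the curl of a vector field. Let's first present the formula, similar to divergence, and then follow through with the conceptual understanding. The curl of a vector field $\vec{F} = (M, N, P)$ is computed as:

\[
\text{Curl}(\vec{F}) = \vec{\nabla} \times \vec{F} = \det\begin{pmatrix} \hat{i} & \hat{j} & \hat{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ M & N & P \end{pmatrix} \tag{593}
\]

\[ \text{Curl}(\vec{F}) = (P_y - N_z,\ M_z - P_x,\ N_x - M_y) \tag{594} \]

where $M_x$, for example, represents the partial derivative of $M$ with respect to $x$. Let's first note that taking the curl of a vector field produces another vector. Namely, the curl of a vector field can be thought of as a function $f : \mathbb{R}^n \to \mathbb{R}^n$, which in this case is $f : \mathbb{R}^3 \to \mathbb{R}^3$. Let's just go through an example real quickly of an actual computation of the curl of a vector field.

Problem: Compute the curl of the vector field $\vec{F} = (x^2 y,\ yx,\ xz)$.

Solution: We can turn to our formula for the curl, which states that:

\[
\text{Curl}(\vec{F}) = \vec{\nabla} \times \vec{F} = \det\begin{pmatrix} \hat{i} & \hat{j} & \hat{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ M & N & P \end{pmatrix} \tag{595}
\]

\[
\text{Curl}(\vec{F}) = \vec{\nabla} \times \vec{F} = \det\begin{pmatrix} \hat{i} & \hat{j} & \hat{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ x^2 y & yx & xz \end{pmatrix} \tag{596}
\]

\[ \text{Curl}(\vec{F}) = (P_y - N_z,\ M_z - P_x,\ N_x - M_y) \tag{597} \]

\[ \text{Curl}(\vec{F}) = \left(0 - 0,\ 0 - z,\ y - x^2\right) \tag{598} \]

So perhaps, if we wanted to compute the curl at the point $(2, 1, 0)$, we would obtain that:

\[ \text{Curl}(\vec{F})(2, 1, 0) = (0,\ 0,\ -3) \tag{599} \]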
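The component formula $(P_y - N_z,\ M_z - P_x,\ N_x - M_y)$ translates almost word for word into code. The sketch below assumes central differences for each partial derivative; `partial` and `curl` are hypothetical helper names, and the field is the one from the problem above.

```python
def F(x, y, z):
    """The example field (x^2 y, y x, x z)."""
    return (x * x * y, y * x, x * z)

def partial(F, component, variable, point, h=1e-5):
    """Central-difference derivative of one component of F with respect to
    one variable (0 = x, 1 = y, 2 = z) at the given point."""
    plus, minus = list(point), list(point)
    plus[variable] += h
    minus[variable] -= h
    return (F(*plus)[component] - F(*minus)[component]) / (2 * h)

def curl(F, point):
    """(P_y - N_z, M_z - P_x, N_x - M_y) at the given point."""
    M, N, P = 0, 1, 2
    X, Y, Z = 0, 1, 2
    return (partial(F, P, Y, point) - partial(F, N, Z, point),
            partial(F, M, Z, point) - partial(F, P, X, point),
            partial(F, N, X, point) - partial(F, M, Y, point))

print(curl(F, (2.0, 1.0, 0.0)))
```

Up to floating-point rounding, the printed vector should match $(0, 0, -3)$, and unlike divergence the output is a vector.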

Okay great! Computing it is quite annoying due to the cross product, but we can trudge on

through that without too much worry. Lets try to get down the conceptual understanding

now. Curl can be described as a circulation density. Namely, we want to think, If I place a

little paddle-wheel, something that will rotate in my vector field, at a specific point in my

vector field, how much and it what direction will it rotate. We define that if the little paddle

wheel we place at a point rotates counterclockwise due to the vector field, we have a positive

curl. In addition, if the little paddle wheel would rotate clockwise at the point, we would

state that we have a negative curl. Lets try this out with a picture we have seen a few times

by now. Consider the vector field below:

171

Figure 15: Curl of Shear Vector Field

Imagine we want to compute the curl at the point (12, 1

2). Well, lets try this conceptual

method that we laid out. Consider we place a little paddle wheel, baby box even if you

prefer, at the point in question. Then consider, how the vector field at this point will make

the box move, given it is locked at the point. We see that the vector on the lower end of the

box will be larger than the vectors at the upper end of the box. As such, we would expect

that the arrows would cause the box to rotate counterclockwise. As such, we would expect

the curl at this point to be positive. Lets test this out given the exact form of the vector

field. It turns out that we can express the vector field of the aforementioned picture as:

~F = (1− y, 0, 0) (600)

Therefore, if we compute the curl of ~F , we obtain that:

curl(~F ) = (0, 0, 1) (601)

Namely, point your right thumb in the direction of the curl vector, straight up, and then wrap your fingers around; the way they wrap indicates the direction of rotation. Since your fingers wrap around counterclockwise, we obtain that the curl is positive, and even more so, since (601) does not depend on the point, the curl is positive everywhere. So, for any point in this little rectangle, we get a positive circulation density that will result in a positive curl. So, we have a simple way of deciding whether the curl is positive, leading to a counterclockwise rotation, or negative, leading to a clockwise rotation: imagine placing a little paddle wheel at a specific point, and note how the vector field would spin that little paddle wheel. Let’s move on to the applications of both of these ideas.
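As a check on the curl quoted in (601), the determinant form of the curl can be expanded for the shear field $\vec{F} = (1 - y, 0, 0)$. This is only a worked sketch of the standard formula; nothing beyond the field from (600) is assumed.

```latex
\begin{align*}
\vec\nabla \times \vec F
&= \begin{vmatrix}
\hat\imath & \hat\jmath & \hat k \\
\partial_x & \partial_y & \partial_z \\
1 - y & 0 & 0
\end{vmatrix} \\
&= \hat\imath\,(\partial_y 0 - \partial_z 0)
 - \hat\jmath\,(\partial_x 0 - \partial_z (1 - y))
 + \hat k\,(\partial_x 0 - \partial_y (1 - y)) \\
&= (0,\ 0,\ 1)
\end{align*}
```

The only surviving term is $-\partial_y(1 - y) = 1$, which is constant, matching the observation that the paddle wheel spins counterclockwise at every point of the picture.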

25.3 Divergence Theorem

The Divergence Theorem is extremely powerful and comes up all over the place in physics and applied maths. The Divergence Theorem states the following. Let E be a simple solid region and let S be the boundary surface of E with positive (outward) orientation. Let $\vec{F}$ be a vector field whose components have continuous first-order partial derivatives. Then,

$$\oiint_S \vec{F}\cdot\vec{n}\,dA = \iiint_E \operatorname{div}(\vec{F})\,dV \tag{602}$$

Let’s break this down because it looks quite hefty. On the left-hand side, we are saying that given some closed surface, namely a shell that traps some three-dimensional volume, we can compute the flux of our vector field $\vec{F}$ through this shell. Basically, we are just saying: compute the flux of $\vec{F}$ through some closed surface. The loop on the double integral indicates that the surface is closed, just as the loop on a single integral indicates that the curve is closed. Now, the right-hand side says that this flux is actually equal to the integral of the divergence of $\vec{F}$ over the volume that is trapped by the surface. In general, we are going to be asked to compute the flux through, say, the unit cube from inside to outside. Instead of computing some nasty surface integrals like the left-hand side would lead us to, we are going to use the trick on the RHS to make our lives easier. Let’s do an example now to see what I mean.

Problem: Compute the flux through the unit cube of the vector field $\vec{F} = (x, x, x)$.

Solution: Well, if we saw this problem last week, perhaps we would try to calculate the flux, flow, whichever word you like, through each of the six faces. Doing this would give us the LHS of the Divergence Theorem. However, since the unit cube is a closed surface enclosing some volume, we can instead compute the divergence of $\vec{F}$ and then integrate it over the volume of the unit cube. Let’s do that now:

$$\oiint_S \vec{F}\cdot\vec{n}\,dA = \iiint_E \operatorname{div}(\vec{F})\,dV = \int_0^1\int_0^1\int_0^1 (1 + 0 + 0)\,dx\,dy\,dz = 1 \tag{603}$$
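To see what the Divergence Theorem saved us here, the six face integrals can be approximated numerically. The sketch below is a plain-Python midpoint rule; the helper names `midpoint_2d` and `cube_flux` are made up for this illustration, and the outward normals are the usual ones for the unit cube $[0, 1]^3$.

```python
def midpoint_2d(f, n=50):
    """Midpoint-rule approximation of the integral of f(u, v) over [0,1] x [0,1]."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        for j in range(n):
            v = (j + 0.5) * h
            total += f(u, v)
    return total * h * h


def F(x, y, z):
    """The vector field from the problem, F = (x, x, x)."""
    return (x, x, x)


def cube_flux():
    """Outward flux of F through each of the six faces of the unit cube."""
    faces = [
        # (outward normal, map from face coordinates (u, v) to a point on the face)
        ((1, 0, 0), lambda u, v: (1.0, u, v)),   # face x = 1
        ((-1, 0, 0), lambda u, v: (0.0, u, v)),  # face x = 0
        ((0, 1, 0), lambda u, v: (u, 1.0, v)),   # face y = 1
        ((0, -1, 0), lambda u, v: (u, 0.0, v)),  # face y = 0
        ((0, 0, 1), lambda u, v: (u, v, 1.0)),   # face z = 1
        ((0, 0, -1), lambda u, v: (u, v, 0.0)),  # face z = 0
    ]
    per_face = []
    for normal, point in faces:
        def integrand(u, v, normal=normal, point=point):
            field = F(*point(u, v))
            return sum(fc * nc for fc, nc in zip(field, normal))
        per_face.append(midpoint_2d(integrand))
    return per_face


per_face = cube_flux()
total_flux = sum(per_face)
print(per_face, total_flux)
```

The faces $y = 1$ and $y = 0$ cancel each other, as do $z = 1$ and $z = 0$, so the whole flux comes from the face $x = 1$, and the total agrees with (603).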

Not too bad, right? Let’s do a slightly more difficult example now.

Problem: Compute the flux of the vector field $\vec{F} = (x^2, 2, 3)$ through the surface bounded by $z = 0$ and $z = 4 - x^2 - y^2$.

Solution: Now we can really see the power of the Divergence Theorem. Namely, it would really suck to compute the surface integral over this shape. Instead we can turn to the RHS of the Divergence Theorem again:

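$$\oiint_S \vec{F}\cdot\vec{n}\,dA = \iiint_E \operatorname{div}(\vec{F})\,dV = \iiint_E (2x + 0 + 0)\,dV \tag{604}$$

Now, we can make use of our knowledge of triple integrals and convert this to cylindrical coordinates to find that:

$$\oiint_S \vec{F}\cdot\vec{n}\,dA = 2\int_0^{2\pi}\int_0^2\int_0^{4 - r^2} r\cos\theta\; r\,dz\,dr\,d\theta = 0 \tag{605}$$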
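The zero in (605) can be seen without much work, since the integrand separates into a $\theta$-part and an $(r, z)$-part. A sketch of the evaluation, using only the bounds already written in (605):

```latex
\begin{align*}
2\int_0^{2\pi}\!\!\int_0^2\!\!\int_0^{4-r^2} r^2\cos\theta \,dz\,dr\,d\theta
&= 2\left(\int_0^{2\pi}\cos\theta\,d\theta\right)
    \left(\int_0^2 r^2\,(4 - r^2)\,dr\right) \\
&= 2\,\bigl[\sin\theta\bigr]_0^{2\pi}\cdot\frac{64}{15} \\
&= 2\cdot 0\cdot\frac{64}{15} = 0
\end{align*}
```

Another way to see the cancellation: the divergence $2x$ is odd in $x$, while the region is symmetric under $x \to -x$.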

The result of zero means that the net flux through the surface is exactly zero. We interpret that as the amount of crap that flows into our cereal-bowl-looking thing being exactly the same as the amount of crap[16] that flows out of it. I will add more and more examples underneath the recitation section from tomorrow. For now, let’s move on to our last theorem of the class, Stokes’ Theorem.

25.4 Stokes’ Theorem

Let’s just start off with the statement and then unpack everything that is going on within this daunting statement. Let S be an oriented smooth surface that is bounded by a simple, closed, smooth boundary curve C with positive orientation, namely we move around it counterclockwise. Also let $\vec{F}$ be a vector field. Then,

$$\oint_C \vec{F}\cdot d\vec{r} = \iint_S (\vec{\nabla}\times\vec{F})\cdot d\vec{A} \tag{606}$$

Okay, so what does all of this mean? Let’s give this a similar analysis to the Divergence Theorem. We first look at the left-hand side. The left-hand side is a line integral. More specifically, it is a closed line integral, meaning that the boundary could be a circle, a square, or any loop that is closed. We are then saying that the work done by the vector field $\vec{F}$ as I walk around my closed loop is equal to the flux of the curl of $\vec{F}$ through the surface whose boundary I walked along. Let’s think about a trash bag for visualization. We have an opening, where we throw in the trash, we have a border that surrounds the opening, and then we have the bag where all the trash goes. We are saying that the work it takes to move around the border of the trash bag’s opening is equal to the amount of $\vec{\nabla}\times\vec{F}$ that goes through the surface of the trash bag.

[16] Crap is just the vector field.

As a matter of fact, we have already seen an example of Stokes’ Theorem whilst doing Green’s Theorem. If you recall, Green’s Theorem tells us that:

$$\oint_C \vec{F}\cdot d\vec{r} = \iint (N_x - M_y)\,dA \tag{607}$$

This looks pretty similar to Stokes’ Theorem. In fact, it is just a special case of Stokes’ Theorem. Consider the following. Suppose that you have a vector field $\vec{F} = (M, N, P)$. In addition, suppose your surface is just some region in the xy-plane, which by definition has a normal vector equal to $\vec{n} = (0, 0, 1)$. Think of the area vector as being similar to the normal vector of a plane. Namely, the area vector is the vector that points normal to the actual surface. Therefore, if we compute $\vec{\nabla}\times\vec{F}$, and then dot this with our area vector, we obtain that:

$$\oint_C \vec{F}\cdot d\vec{r} = \iint_S (\vec{\nabla}\times\vec{F})\cdot(0, 0, 1)\,dA = \iint (N_x - M_y)\,dA \tag{608}$$
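To make the step in (608) explicit, it may help to write out every component of the curl of $\vec{F} = (M, N, P)$ before dotting with $\vec{n} = (0, 0, 1)$. This is the standard component formula, written with the same subscript notation for partial derivatives as in (607):

```latex
\begin{align*}
\vec\nabla\times\vec F &= \bigl(P_y - N_z,\ M_z - P_x,\ N_x - M_y\bigr) \\
(\vec\nabla\times\vec F)\cdot(0, 0, 1) &= N_x - M_y
\end{align*}
```

Only the third component survives the dot product, which is why $P$ never shows up in Green’s Theorem.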

Thus, we basically use Stokes’ Theorem so that we never have to compute an actual surface integral. Namely, the RHS can be pretty rough if we have some surface that has a really hard area vector, and as such, we can instead just compute the work done around the border of the surface in order to compute the RHS of the equation.

25.4.1 Same Border, Different Surface

Another interesting trick that I find extremely helpful is the following. Suppose you have two surfaces that both have the unit circle as their border. The first surface, $S_1$, is the part of $z = 1 - x^2 - y^2$ that lies above the xy-plane, and the second surface, $S_2$, is just the unit disk. Now suppose I want to calculate

$$\iint_{S_1} \vec{\nabla}\times\vec{F}\cdot d\vec{A} \tag{609}$$

as well as:

$$\iint_{S_2} \vec{\nabla}\times\vec{F}\cdot d\vec{A} \tag{610}$$

Each of these alone is a surface integral over the respective surface, $S_1$ or $S_2$. However, we just learned that by Stokes’ Theorem, each of these is equal to the work it takes to move around the unit circle, as the unit circle acts as the border of each of these surfaces. Therefore, we can conclude by transitivity that:

$$\oint_C \vec{F}\cdot d\vec{r} = \iint_{S_1} \vec{\nabla}\times\vec{F}\cdot d\vec{A} = \iint_{S_2} \vec{\nabla}\times\vec{F}\cdot d\vec{A} \tag{611}$$

Which more importantly means that:

$$\iint_{S_1} \vec{\nabla}\times\vec{F}\cdot d\vec{A} = \iint_{S_2} \vec{\nabla}\times\vec{F}\cdot d\vec{A} \tag{612}$$

meaning that if we have two surfaces that share the same border (and are oriented the same way relative to it), then their values of $\iint_S \vec{\nabla}\times\vec{F}\cdot d\vec{A}$ must be equal. Let’s see how we can use this to our advantage with the example mentioned.

Problem: Compute $\iint_S \vec{\nabla}\times\vec{F}\cdot d\vec{A}$ for the surface $z = 1 - x^2 - y^2$ that lies above the xy-plane, where $\vec{F} = (3, x, 4)$.

Solution: Well, parametrizing this surface would kind of stink. So instead, we can use this idea of surface independence to compute this integral. We can replace our current surface with the unit disk, which has a much nicer normal vector of $\vec{n} = (0, 0, 1)$, and compute the integral there. We first compute the curl of $\vec{F}$, which happens to be:

$$\vec{\nabla}\times\vec{F} = (0, 0, 1) \tag{613}$$

I can then compute the integral by dotting this with my normal vector $\vec{n}$ and integrating over the unit disk. Doing so, we obtain that:

$$\iint_S \vec{\nabla}\times\vec{F}\cdot d\vec{A} = \iint (0, 0, 1)\cdot(0, 0, 1)\,dA = \iint dA = \pi \tag{614}$$
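For comparison, the same integral can be set up directly on the paraboloid, which shows what surface independence is buying us. The sketch below assumes the standard upward area element for a graph $z = f(x, y)$, namely $d\vec{A} = (-f_x, -f_y, 1)\,dx\,dy$, with $f = 1 - x^2 - y^2$ over the unit disk $D$:

```latex
\begin{align*}
d\vec A &= (2x,\ 2y,\ 1)\,dx\,dy \\
\iint_{S_1} (\vec\nabla\times\vec F)\cdot d\vec A
&= \iint_{D} (0, 0, 1)\cdot(2x, 2y, 1)\,dx\,dy
 = \iint_{D} dx\,dy = \pi
\end{align*}
```

Here the curl happens to be constant and vertical, so the direct computation is also painless; for a curl with nonzero x- or y-components, the flat disk would be noticeably simpler.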

In addition, we can confirm this by taking the line integral as well for good luck. We can start by parametrizing our path,

$$\vec{r}(t) = (\cos t, \sin t, 0) \quad \text{for } t \in [0, 2\pi] \tag{615}$$

Then we can set up the work integral as:

$$\oint \vec{F}\cdot d\vec{r} = \int_0^{2\pi} (3, \cos t, 4)\cdot(-\sin t, \cos t, 0)\,dt = \int_0^{2\pi} (-3\sin t + \cos^2 t)\,dt = \int_0^{2\pi} \cos^2 t\,dt = \pi \tag{616}$$
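The line integral in (616) is also easy to approximate numerically, which makes a handy sanity check when the algebra gets messier. The sketch below uses a plain midpoint rule in $t$; the function names are made up for this illustration.

```python
import math


def F(x, y, z):
    """The vector field from the problem, F = (3, x, 4)."""
    return (3.0, x, 4.0)


def r(t):
    """Counterclockwise parametrization of the unit circle in the xy-plane."""
    return (math.cos(t), math.sin(t), 0.0)


def r_prime(t):
    """Derivative of r(t) with respect to t."""
    return (-math.sin(t), math.cos(t), 0.0)


def work_around_circle(n=2000):
    """Midpoint-rule approximation of the closed line integral of F . dr."""
    h = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        field = F(*r(t))
        velocity = r_prime(t)
        total += sum(fc * vc for fc, vc in zip(field, velocity))
    return total * h


work = work_around_circle()
print(work, math.pi)
```

The two printed numbers should agree to many decimal places, matching both (614) and (616).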

Yay! We get the same answer! Hopefully this clears up Stokes’ Theorem a bit. I know this one tends to be tricky, which is why I recommend honestly never calculating some nasty surface integral. Instead, either compute a line integral over a closed path, or use the idea of surface independence to find a flat surface that shares its border with the crazy surface and can easily be used to calculate the integral. Last lecture of material, we made it!

26 Recitation X on August 6, 2019

In recitation, we worked through a few great problems on Stokes’ Theorem and the Divergence Theorem. I’ll attach a few here that we computed so that they can be used later for reference.

Problem: Divergence Theorem: Use the divergence theorem to evaluate $\iint_S \mathbf{F}\cdot d\mathbf{S}$, where $\mathbf{F} = \langle xy, -\tfrac{1}{2}y^2, z\rangle$ and the surface consists of the paraboloid $z = 4 - (x^2 + y^2)$ and the disk in the xy-plane that it encloses.

Solution: Note that this setup is symmetric in cylindrical coordinates, so we will use them for this problem. The bounds for the region are then:

$$0 \le z \le 4 - r^2, \qquad 0 \le r \le 2, \qquad 0 \le \theta \le 2\pi$$

Next, we calculate the divergence of the vector field, which is given by:

$$\nabla\cdot\mathbf{F} = \frac{\partial M}{\partial x} + \frac{\partial N}{\partial y} + \frac{\partial P}{\partial z} = y - y + 1 = 1$$

The integral is then,

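\begin{align}
\iint_S \mathbf{F}\cdot d\mathbf{S} &= \iiint_R (\nabla\cdot\mathbf{F})\,dV \tag{617} \\
&= \int_0^{2\pi}\int_0^2\int_0^{4 - r^2} r\,dz\,dr\,d\theta \tag{618} \\
&= 2\pi\int_0^2 (4r - r^3)\,dr \tag{619} \\
&= (2\pi)(8 - 4) \tag{620} \\
&= 8\pi \tag{621}
\end{align}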
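As a sanity check, the left-hand side can also be computed directly for this problem, taking the second component of the field to be $-\tfrac{1}{2}y^2$ (which is what makes the divergence $y - y + 1$). The sketch assumes outward orientation, so the paraboloid uses the upward area element $d\mathbf{S} = (2x, 2y, 1)\,dx\,dy$ and the bottom disk uses $\mathbf{n} = (0, 0, -1)$; $D$ denotes the disk $x^2 + y^2 \le 4$:

```latex
\begin{align*}
\iint_{\text{top}} \mathbf{F}\cdot d\mathbf{S}
&= \iint_D \left(xy,\ -\tfrac12 y^2,\ 4 - x^2 - y^2\right)\cdot(2x,\ 2y,\ 1)\,dx\,dy \\
&= \iint_D \left(2x^2 y - y^3 + 4 - x^2 - y^2\right)dx\,dy \\
&= 0 - 0 + \int_0^{2\pi}\!\!\int_0^2 (4 - r^2)\,r\,dr\,d\theta = 8\pi \\
\iint_{\text{bottom}} \mathbf{F}\cdot d\mathbf{S}
&= \iint_D \left(xy,\ -\tfrac12 y^2,\ 0\right)\cdot(0, 0, -1)\,dx\,dy = 0
\end{align*}
```

The terms $2x^2 y$ and $y^3$ are odd in $y$ and integrate to zero over the disk, so the total is $8\pi + 0 = 8\pi$, in agreement with (621).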


Problem: Given that for the groin vault, encountered last week, we computed the volume to be $\iiint dV = \frac{16}{3}$, compute the flux through the groin vault given that $\vec{F} = \langle x, y, z\rangle$.

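Solution: We want to compute the flux through a closed surface. Therefore, we can use the divergence theorem, which states

$$\oiint_{\partial D} \mathbf{F}\cdot d\mathbf{A} = \iiint_D \nabla\cdot\mathbf{F}\,dV, \tag{623}$$

where $\partial D$ is the surface that bounds a three-dimensional domain $D$. But note that $\nabla\cdot\mathbf{F} = 1 + 1 + 1 = 3$, so

\begin{align}
\iiint \nabla\cdot\mathbf{F}\,dV &= \iiint 3\,dV \tag{624} \\
&= 3\cdot\frac{16}{3} \tag{625} \\
&= 16. \tag{626}
\end{align}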

Problem: Stokes’ Theorem: Use Stokes’ Theorem to evaluate $\int_C \mathbf{F}\cdot d\mathbf{r}$, where C is the triangle with vertices (1, 0, 1), (0, 1, 1), and (0, 0, 1), oriented counterclockwise when viewed from above, and $\mathbf{F} = (x + y^2, y + z^2, z + x^2)$.

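Solution: With our surface being this triangle lying in the plane $z = 1$, we simply get a normal vector that points straight upwards. Therefore, we can utilize Stokes’ Theorem over this region to obtain that:

$$\iint_S \nabla\times\mathbf{F}\cdot d\mathbf{A} = \oint_C \mathbf{F}\cdot d\mathbf{r}.$$

Thus, we need only find $\iint_S \nabla\times\mathbf{F}\cdot d\mathbf{A}$.

\begin{align*}
\iint_S \nabla\times\mathbf{F}\cdot d\mathbf{A} &= \int_0^1\int_0^{1 - x} \langle -2z, -2x, -2y\rangle\cdot\langle 0, 0, 1\rangle\,dy\,dx \\
&= \int_0^1\int_0^{1 - x} -2y\,dy\,dx \\
&= -\int_0^1 (1 - x)^2\,dx \\
&= \left.\frac{(1 - x)^3}{3}\right|_0^1 \\
&= -\frac{1}{3}
\end{align*}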
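Since Stokes’ Theorem says the two sides agree, the value $-\tfrac{1}{3}$ can be double-checked by walking around the triangle and adding up the work edge by edge. The sketch below uses a plain midpoint rule on each straight edge; `segment_work` is a helper name made up for this illustration.

```python
def F(x, y, z):
    """The vector field from the problem, F = (x + y^2, y + z^2, z + x^2)."""
    return (x + y**2, y + z**2, z + x**2)


def segment_work(a, b, n=2000):
    """Midpoint-rule approximation of the integral of F . dr along the segment a -> b."""
    # For r(t) = a + t (b - a) with t in [0, 1], the velocity r'(t) is the constant b - a.
    direction = tuple(bc - ac for ac, bc in zip(a, b))
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        point = tuple(ac + t * dc for ac, dc in zip(a, direction))
        field = F(*point)
        total += sum(fc * dc for fc, dc in zip(field, direction))
    return total * h


# Vertices listed counterclockwise when viewed from above.
vertices = [(1.0, 0.0, 1.0), (0.0, 1.0, 1.0), (0.0, 0.0, 1.0)]
edges = list(zip(vertices, vertices[1:] + vertices[:1]))
circulation = sum(segment_work(a, b) for a, b in edges)
print(circulation)
```

By hand, the three edges contribute $\tfrac{2}{3}$, $-\tfrac{3}{2}$, and $\tfrac{1}{2}$, which sum to $-\tfrac{1}{3}$, the same value as the surface integral.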

Problem: Use Stokes’ Theorem to compute $\iint (\nabla\times\vec{F})\cdot d\vec{A}$, where $\vec{F} = (3, x^2, 4)$, through the surface $z = 4 - x^2 - y^2$ that sits above the xy-plane.

Solution: We can do this by computing both the left- and the right-hand sides of Stokes’ Theorem. We first should note that we are not limited to the surface $z = 4 - x^2 - y^2$. Specifically, we could choose any surface that shares the border that $z = 4 - x^2 - y^2$ has on the xy-plane. To make matters simple, I will choose the flat surface with area vector in the $\hat{k}$ direction, and have my surface be the disk of radius 2, since this shares the border $x^2 + y^2 = 4$ on the xy-plane. Let’s go ahead now and compute the surface integral.

$$\iint (\nabla\times\vec{F})\cdot d\vec{A} = \iint \bigl((0, 0, 2x)\cdot(0, 0, 1)\bigr)\,dA = \int_0^{2\pi}\int_0^2 2r\cos\theta\; r\,dr\,d\theta = 0 \tag{627}$$

Okay, so as of now, we expect that if we take the closed loop integral around the boundary of our surface, the LHS of Stokes’ Theorem, we also obtain that zero work has been done. Let’s try that now, utilizing the parametrization:

$$\vec{r}(t) = (2\cos t, 2\sin t, 0) \quad \text{for } t \in [0, 2\pi] \tag{628}$$

Doing so, we can compute the line integral as:

$$\oint \vec{F}\cdot d\vec{r} = \int_0^{2\pi} (3, 4\cos^2 t, 4)\cdot(-2\sin t, 2\cos t, 0)\,dt = \int_0^{2\pi} (-6\sin t + 8\cos^3 t)\,dt = 0 \tag{629}$$
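Both pieces of the integrand vanish over a full period, which can be checked with explicit antiderivatives. The sketch below starts from $\vec{r}\,'(t) = (-2\sin t, 2\cos t, 0)$ and uses the identity $\cos^3 t = \cos t\,(1 - \sin^2 t)$:

```latex
\begin{align*}
\vec F(\vec r(t))\cdot\vec r\,'(t)
&= (3,\ 4\cos^2 t,\ 4)\cdot(-2\sin t,\ 2\cos t,\ 0)
 = -6\sin t + 8\cos^3 t \\
\int_0^{2\pi} -6\sin t\,dt &= \bigl[\,6\cos t\,\bigr]_0^{2\pi} = 0 \\
\int_0^{2\pi} 8\cos^3 t\,dt
&= 8\Bigl[\sin t - \tfrac13\sin^3 t\Bigr]_0^{2\pi} = 0
\end{align*}
```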

This shows that we got the same value for each side of Stokes’ Theorem. Either one by itself would have been valid to fully answer the question. I showed both just so that one could see that they are equivalent.

Problem: Gauss’ Law in Electricity and Magnetism is known as:

$$\oiint_S \vec{E}\cdot\vec{n}\,dA = \frac{1}{\varepsilon_0}\iiint \rho\,dV \tag{630}$$

where $\vec{E}$ is the electric field, $\varepsilon_0$ is the permittivity of free space, and $\rho$ is the charge density of your object. Using your knowledge of the Divergence Theorem, show that the above expression is equivalent to:

$$\nabla\cdot\vec{E} = \frac{\rho}{\varepsilon_0} \tag{631}$$

Solution: We can start by using the Divergence Theorem on the LHS of Gauss’ Law. Doing so, we have the equation:

$$\oiint_S \vec{E}\cdot\vec{n}\,dA = \iiint (\nabla\cdot\vec{E})\,dV \tag{632}$$

Which we can now plug into our original equation:

$$\oiint_S \vec{E}\cdot\vec{n}\,dA = \iiint (\nabla\cdot\vec{E})\,dV = \frac{1}{\varepsilon_0}\iiint \rho\,dV \tag{633}$$

Therefore, since this equality holds for every volume region we could choose to integrate over, the two integrands must be equal to one another, and we obtain that:

$$\nabla\cdot\vec{E} = \frac{\rho}{\varepsilon_0} \tag{634}$$

This is referred to as the differential form of Gauss’ Law.

27 Lecture XV on August 7, 2019

Today we are going to be having a review day! Let’s try to get through as many problems as possible.

28 Thank You

Thank you all for such a great summer! I hope you all learned a lot from the course, and

you feel prepared to apply the concepts and principles learnt in the last few weeks in all of

your majors :)
