Advanced Multivariable Differential Calculus, Exercises of Calculus

These notes are based on lectures from Math 32AH, an honors multivariable differential calculus course at UCLA I taught in the fall of 2020.

Advanced Multivariable Differential Calculus
Joseph Breen
Last updated: December 24, 2020
Department of Mathematics
University of California, Los Angeles



Preface

These notes are based on lectures from Math 32AH, an honors multivariable differential calculus course at UCLA I taught in the fall of 2020. Briefly, the goal of these notes is to develop the theory of differentiation in arbitrary dimensions with more mathematical maturity than a typical calculus class, with an eye towards more advanced math. I wouldn’t go so far as to call this a multivariable analysis text, but the level of rigor is fairly high.

These notes borrow a fair amount in terms of overlying structure from Calculus and Analysis in Euclidean Space by Jerry Shurman, which was the official recommended text for the course. There are, however, a number of differences, ranging from notation to omission, inclusion, and presentation of most topics. The heart of the notes is Chapter 6, which discusses the theory of differentiation and all of its applications; the first five chapters essentially lay the necessary mathematical foundation of analysis and linear algebra.

As far as prerequisites are concerned, I only assume that you are comfortable with all of the usual topics in single variable calculus (limits, continuity, derivatives, optimization, integrals, sequences, series, Taylor polynomials, etc.). In particular, I do not assume any prior knowledge of linear algebra. Linear algebra naturally permeates the entire set of notes, but all of the necessary theory is introduced and explained. In general, I cover only the minimal amount of linear algebra needed, so it would be a bad idea to use these notes as a linear algebra reference.

Exercises at the end of each section correspond to homework and exam problems I assigned during the course. There are many topics that are standard in multivariable calculus courses (like the notion of projecting one vector onto another) that are introduced and studied in the exercises, so keep this in mind. Challenging (optional) exercises are marked with a (∗).

I’ve also included some appendices that delve into more advanced (optional) topics. Any comments, corrections, or suggestions are welcome!

Chapter 1

Preliminaries

teehee

more mathematicians. Learning to quickly recognize what one notation means in a certain context is extremely valuable!^1

In any case, here are some down to earth examples of vectors.

Example 2.2.

(1, 2) ∈ R^2

π 1000

 (^) ∈ R^3 〈 500 , 0 , 0 , −e〉 ∈ R^4

One thing I will consistently do in these notes is use boldface letters to indicate elements of R^n. For example, I might say: “Let x ∈ R^n be a vector.” Letters that are not in boldface will usually denote scalars. Although I said that you shouldn’t think of vectors as arrows, you actually can (and should, sometimes) think of them that way. In particular, a vector x = (x_1, ..., x_n) ∈ R^n has a geometric interpretation as the arrow emanating from the origin (0, ..., 0) and terminating at the coordinate (x_1, ..., x_n). For example, in R^2 we could draw the vector (1, 2) like this:

Sometimes it can be helpful to think of the vector (x_1, ..., x_n) as an arrow with terminal point given by its coordinates, and other times it is better to think of it as simply its terminal point. However, I’ll reiterate what I said above: a vector is simply an element of R^n. An arrow is just a tool for visualization.

2.2 The algebra of Euclidean space

The set of real numbers R comes equipped with a number of algebraic operations like addition and multiplication, together with a host of rules like the distributive law that govern their interactions. Much of this algebraic structure extends naturally to R^n, though some of it is more subtle (like multiplication). Our first task is to establish some basic algebraic definitions and rules in Euclidean space. I’ll begin by defining the notion of vector addition and scalar multiplication.

Definition 2.3. Let x = (x_1, ..., x_n), y = (y_1, ..., y_n) ∈ R^n be vectors, and let λ ∈ R be a scalar.

(i) Vector addition is defined as follows:

x + y := (x_1 + y_1, ..., x_n + y_n) ∈ R^n.

(ii) Scalar multiplication is defined as follows:

λx := (λx_1, ..., λx_n) ∈ R^n.

^1 One day, you may even have to interact with a physicist. If this ever happens, you should be prepared to encounter some highly unusual notation.

In words, to add two vectors (of the same dimension) you just add the corresponding components, and to multiply a vector by a scalar you just multiply each component by that scalar. Note that we have not yet defined how to multiply two vectors together; we’ll talk about this later.
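The componentwise rules in Definition 2.3 translate directly into code. The following is a minimal Python sketch, not part of the original notes; the helper names `add` and `scale` and the choice to represent vectors as tuples are assumptions made purely for illustration.

```python
def add(x, y):
    """Componentwise sum of two vectors of the same dimension."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same dimension")
    return tuple(xi + yi for xi, yi in zip(x, y))


def scale(lam, x):
    """Multiply every component of the vector x by the scalar lam."""
    return tuple(lam * xi for xi in x)


x = (1, 2)
y = (3, -1)
print(add(x, y))              # (4, 1)
print(scale(2, x))            # (2, 4)
print(add(x, scale(-1, y)))   # x - y = (-2, 3)
```

Note that subtraction is not a separate operation here: x − y is just x + (−1)y, built from the two operations in the definition.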

Example 2.4.

2

If we interpret vectors as arrows emanating from the origin, vector addition and scalar multiplication have nice geometric interpretations. In particular,

  • the vector x + y is the main diagonal of the parallelogram generated by x and y, and
  • the vector λx is the arrow x, stretched by a factor of λ.

See the figure below. One important consequence of the above two statements is that the vector x − y is the off diagonal of the parallelogram generated by x and y, travelling from y to x, shifted appropriately.

[Figure: the parallelogram generated by x and y with main diagonal x + y, and the vectors x, y, −y, and x − y.]

The following definition should be clear.

Definition 2.5. Two vectors x, y ∈ R^n are parallel if there is a scalar λ ∈ R such that x = λy or λx = y.

Remark 2.6. The above discussion should make one thing clear: sometimes, it can be helpful to visualize vectors as arrows emanating from points other than the origin. For example, it is natural to think of x − y as the off diagonal arrow, beginning at the terminal point of y. But I’ll reiterate once more what I said above: a vector is simply an element of R^n. Drawing arrows is just a way to visualize such elements. In particular, you should really think of all vectors as emanating from the origin.

We conclude this section with a summary of all of the algebraic rules governing vector addition and scalar multiplication. All of these should feel like natural extensions of corresponding rules in R.

It may seem silly to write all of this out when this statement seems obvious, but this is a nontrivial fact that requires proof based on our definitions. In particular, in the second equality I used the definition of vector addition. In the third equality, I used the definition of scalar multiplication. In the fourth equality, I used the distributive law of the real numbers, then in the fifth and sixth equalities I used the definitions of vector addition and scalar multiplication again. The proofs of (i)-(vii) are left as an exercise.

Remark 2.8. The reason I named this proposition the vector space axioms is because there is a more general notion of something called a vector space. Briefly and imprecisely, a vector space is any abstract set that satisfies (i)-(viii). For example, the set of functions f : R → R satisfies all of the above properties, and thus is a “vector space.” I won’t discuss abstract vector spaces in these notes; the only one that we care about is R^n. If you’re interested, you can look at Appendix A which discusses some more advanced linear algebra. Broadly speaking, linear algebra is the study of abstract vector spaces.

2.3 The geometry of Euclidean space

Now that we have mastered the basic algebra of Euclidean space, it’s time to start discussing its geometry. By this, I mean things like length and angle.

2.3.1 Inner products

The object that gives rise to geometry in Euclidean space is something called an inner product. Roughly, an inner product is a way to “multiply” two vectors x and y to produce a scalar.

Definition 2.9. An inner product is a function 〈·, ·〉 : R^n × R^n → R satisfying:

(i) For all x, y ∈ R^n, 〈x, y〉 = 〈y, x〉.

(ii) For all x ∈ R^n, 〈x, x〉 ≥ 0 and 〈x, x〉 = 0 if and only if x = 0.

(iii) For all x, y, z ∈ R^n and λ ∈ R,

〈x + y, z〉 = 〈x, z〉 + 〈y, z〉
〈x, y + z〉 = 〈x, y〉 + 〈x, z〉
〈λx, y〉 = λ〈x, y〉
〈x, λy〉 = λ〈x, y〉.

Remark 2.10. Property (i) is referred to as symmetry, property (ii) is positive definiteness, and property (iii) is bilinearity. Note that, by symmetry, some of the conditions in (iii) are superfluous: bilinearity in the second component (equalities 2 and 4) follows from bilinearity in the first component (equalities 1 and 3).

The point of this definition is that an inner product is an operation that should obey the same rules as numerical multiplication. For example, numerical multiplication is symmetric: for any x, y ∈ R, xy = yx. It is important to note that in the definition above, I used the word “an.” That suggests there are multiple different inner products, and this is indeed the case. However, there is a standard one defined as follows.

Definition 2.11. The standard inner product, also known as the dot product, is defined as follows: for x = (x_1, ..., x_n) and y = (y_1, ..., y_n),

〈x, y〉 = x · y := x_1 y_1 + ··· + x_n y_n.

Typically I will write 〈x, y〉 to refer to the standard inner product. I will give you some examples of other inner products in the exercises, but throughout the notes we will only use the standard one. The above definition also introduces the alternative notation x · y, which may feel more natural.
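To make the definition of the dot product concrete, here is a small Python sketch that computes it and spot-checks symmetry and linearity in the first slot on one set of sample vectors. This is an illustration only, not part of the original notes: the function name `dot` and the sample vectors are assumptions, and a check on a few numbers is of course not a proof of the properties in Definition 2.9.

```python
def dot(x, y):
    """Standard inner product: the sum of products of matching components."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same dimension")
    return sum(xi * yi for xi, yi in zip(x, y))


x = (1, 2, 3)
y = (4, -5, 6)
z = (0, 1, -1)
lam = 3

print(dot(x, y))  # 1*4 + 2*(-5) + 3*6 = 12

# Symmetry: <x, y> = <y, x>
print(dot(x, y) == dot(y, x))

# Linearity in the first slot: <x + y, z> = <x, z> + <y, z> and <lam x, y> = lam <x, y>
x_plus_y = tuple(a + b for a, b in zip(x, y))
lam_x = tuple(lam * a for a in x)
print(dot(x_plus_y, z) == dot(x, z) + dot(y, z))
print(dot(lam_x, y) == lam * dot(x, y))
```

Integer inputs are used so that the equalities hold exactly; with floating point inputs one would compare up to a small tolerance instead.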

Example 2.12. (^) 〈( 1 2

Even though I named the object above the standard inner product, we have to actually verify that it is an inner product. In other words, we have to show that it satisfies (i)-(iii) in Definition 2.9.

Proposition 2.13. The standard inner product 〈·, ·〉, i.e., the dot product, is an inner product.

Proof. I’ll verify one of the properties here (property (ii)) and I will leave the others as an exercise. Let x = (x_1, ..., x_n). Then

〈x, x〉 = x_1^2 + ··· + x_n^2.

Since x_j^2 ≥ 0 for all j, it follows that 〈x, x〉 ≥ 0. Next, we prove the “if and only if” statement. Note 〈0, 0〉 = 0^2 + ··· + 0^2 = 0. Now, suppose that 〈x, x〉 = 0. Then x_1^2 + ··· + x_n^2 = 0.

Since x_j^2 ≥ 0, it necessarily follows that each x_j = 0 and thus x = 0 as desired. Properties (i) and (iii) are left as exercises.

The reason that an inner product gives rise to geometric structure is that it allows us to define the notions of length and angle.

Definition 2.14. The length or magnitude or norm of a vector x ∈ R^n is

‖x‖ := √〈x, x〉.

This definition makes mathematical sense by (ii) in Definition 2.9, and it should make intuitive sense because it coincides with the notion of the absolute value of a number: if x ∈ R, then |x| = √(x^2). In other words, for vectors in R, the length/magnitude/norm is the same as the absolute value. Using the standard inner product, the magnitude of (x_1, ..., x_n) is just the distance from the origin to the terminal point of the arrow using the usual distance formula.

Example 2.15. ‖(1, 2)‖ = √(1^2 + 2^2) = √5.

This is a quadratic polynomial in the variable t! It looks funny because the constants are given by norms and inner products of vectors, but it’s a quadratic polynomial nonetheless. Since f (t) ≥ 0 , this quadratic polynomial cannot have two distinct real roots. It follows^2 that the discriminant of this polynomial is nonpositive: b^2 − 4 ac ≤ 0. This means that

(2 〈x, y〉)^2 − 4 ‖x‖^2 ‖y‖^2 ≤ 0.

After some simple algebraic manipulation, this becomes

(〈x, y〉)^2 ≤ ‖x‖^2 ‖y‖^2.

Taking square roots gives the desired inequality:

| 〈x, y〉 | ≤ ‖x‖ ‖y‖.

In words, the Cauchy-Schwarz inequality says that the inner product of two vectors is at most (in magnitude) the product of their norms. Typically, the statement of Cauchy-Schwarz comes with the following addendum: equality occurs if and only if the vectors are parallel. I encourage you to think about why this is true, but for the purpose of these notes I’m not too concerned about that part of the statement.
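A numerical experiment can make both the inequality and the equality case feel less abstract. The Python sketch below is an illustration under stated assumptions, not part of the original notes: the helper `cauchy_schwarz_gap` is a hypothetical name, the sample vectors are arbitrary, and floating point rounding means the parallel case only gives a gap of zero up to a small tolerance.

```python
import math


def dot(x, y):
    """Standard inner product of two vectors of the same dimension."""
    return sum(xi * yi for xi, yi in zip(x, y))


def norm(x):
    """Length of x with respect to the standard inner product."""
    return math.sqrt(dot(x, x))


def cauchy_schwarz_gap(x, y):
    """Return ||x|| ||y|| - |<x, y>|, which Cauchy-Schwarz says is never negative."""
    return norm(x) * norm(y) - abs(dot(x, y))


x = (1.0, 2.0, 3.0)
y = (-4.0, 0.5, 2.0)
print(cauchy_schwarz_gap(x, y) >= 0)  # True: strict inequality for non-parallel vectors

# For parallel vectors the gap should vanish (up to rounding).
parallel = tuple(-2.0 * xi for xi in x)
print(math.isclose(cauchy_schwarz_gap(x, parallel), 0.0, abs_tol=1e-12))
```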

Angles

The innocent looking Cauchy-Schwarz inequality has a number of important applications, the first of which is the ability to define the angle between two vectors. Before giving the definition, I want to convince you that the standard inner product tells us something about our usual notion of angle. Let’s compute the dot product of the vector (1, 0) ∈ R^2 (a vector pointing in the positive x direction) with various other vectors:

…

〈(1, 0), (cos θ, sin θ)〉 = cos θ.

[Figure: the vector (cos θ, sin θ).]

The first computation shows that when we take the dot product of two vectors going in the same direction, we get a positive number. The second computation shows that the dot product of two perpendicular vectors is 0. The third computation shows that the dot product of two vectors going in opposite directions gives a negative number. Finally, the fourth computation generalizes all of these and suggests that the dot product somehow detects the cosine of the angle between the vectors. This should motivate the following definition.

^2 You may have to dig deep into the depths of your memory to remember the quadratic formula: if at^2 + bt + c = 0, then t = (−b ± √(b^2 − 4ac)) / (2a).

Definition 2.19. Fix x, y ∈ R^n. The angle between x and y is the number

θ = arccos( 〈x, y〉 / (‖x‖ ‖y‖) ).

Remark 2.20. Recall that the domain for arccos(t) is |t| ≤ 1. By Cauchy-Schwarz,

| 〈x, y〉 / (‖x‖ ‖y‖) | ≤ 1,

and so this definition makes sense. In other words, we are allowed to define the angle between two vectors this way because of Cauchy-Schwarz.
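The definition of angle can be turned into a short computation. The sketch below is a hedged illustration in Python, not part of the original notes. Two choices in it are assumptions of this sketch rather than anything the text states: it rejects the zero vector (the quotient is undefined when a norm is 0), and it clamps the quotient to [−1, 1], since floating point rounding can push it slightly outside the range that Cauchy-Schwarz guarantees in exact arithmetic.

```python
import math


def dot(x, y):
    """Standard inner product of two vectors of the same dimension."""
    return sum(xi * yi for xi, yi in zip(x, y))


def norm(x):
    """Length of x with respect to the standard inner product."""
    return math.sqrt(dot(x, x))


def angle(x, y):
    """Angle between x and y in radians: arccos(<x, y> / (||x|| ||y||))."""
    nx, ny = norm(x), norm(y)
    if nx == 0 or ny == 0:
        raise ValueError("the angle is undefined when either vector is zero")
    c = dot(x, y) / (nx * ny)
    c = max(-1.0, min(1.0, c))  # guard against rounding just outside [-1, 1]
    return math.acos(c)


print(angle((1, 0), (0, 1)))   # approximately pi/2
print(angle((1, 0), (1, 1)))   # approximately pi/4
print(angle((1, 0), (-1, 0)))  # approximately pi
```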

This definition gives an alternative geometric way of computing the standard inner product of two vectors.

Proposition 2.21. Fix x, y ∈ R^n and let θ be the angle between x and y. Then

〈x, y〉 = ‖x‖ ‖y‖ cos θ.

Proof. Rearrange the expression in the above definition.

The next definition should also be clearly motivated by the above discussion.

Definition 2.22. Two vectors x, y ∈ R^n are orthogonal or perpendicular if 〈x, y〉 = 0.

Remark 2.23. Rant incoming. Pay close attention to the fact that the above statement about orthogonality is a definition, and not a proposition. In other words, I have defined what it means to be orthogonal, using the inner product. I did not conclude that if two vectors are orthogonal, then their inner product is 0. This might seem weird, or it might seem like a trivial thing for me to start ranting about, but this is actually very important. You likely have a vast amount of geometric prejudice built up in two and three dimensions from your mathematical career so far. In particular, you have already studied things like distance and angles in a geometry class. But in this chapter, you should rid all of that from your mind. At the beginning of the chapter I defined R^n as a set, just by describing its elements. A priori, it has no other structure, algebraic or geometric. A priori, there is no notion of angle or length, even in the seemingly familiar R^2. The point of this section is that a choice of inner product gives rise to geometry, and not the other way around. We define the notion of length and angle using an inner product, and if we picked a different inner product then we would get a different notion of length and angle. Anyway, this isn’t really a big deal but it’s an important mindset to have as you delve further into math. Carry on.

The triangle inequality

We conclude this section with an inequality of comparable importance to the Cauchy-Schwarz inequality.

Theorem 2.24 (Triangle inequality). For all x, y ∈ R^n,

‖x + y‖ ≤ ‖x‖ + ‖y‖.

Proof. The moral of the following proof is that it is often easier to deal with squared norms as opposed to norms, the reason being that squared norms are naturally translated into inner products. In particular, note that

‖x + y‖^2 = 〈x + y, x + y〉 = 〈x, x〉 + 〈y, x〉 + 〈x, y〉 + 〈y, y〉 = ‖x‖^2 + 2 〈x, y〉 + ‖y‖^2.

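(c) Let a, b, c denote the side lengths of an arbitrary triangle. Let d be the length of the line segment from the midpoint of the c-side to the opposite vertex.

[Figure: a triangle with side lengths a, b, c and the segment of length d from the midpoint of the c-side to the opposite vertex.]

Show that a^2 + b^2 = (1/2)c^2 + 2d^2.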

  1. Prove that

| ‖a‖ − ‖b‖ | ≤ ‖a + b‖

for all a, b ∈ R^n. This is the reverse triangle inequality. [Hint: The normal triangle inequality may be helpful.]

  1. Given two vectors a, b ∈ R^n, the projection of a onto b is the vector

proj_b a := (〈a, b〉 / ‖b‖^2) b.

(a) Show that a − proj_b a is orthogonal to b.

(b) Draw a (generic) picture of a, b, proj_b a, and a − proj_b a.

(c) Show that ‖a‖^2 = ‖proj_b a‖^2 + ‖a − proj_b a‖^2. What famous theorem have you just proven?

(d) Show that ‖proj_b a‖ ≤ ‖a‖ and then use this to give a proof of the Cauchy-Schwarz inequality: |〈a, b〉| ≤ ‖a‖ ‖b‖.

(e) Compute the distance from the point (2, 3) to the line y = (1/2)x.
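Before attempting the proofs in this exercise, it can help to compute a projection by hand or by machine. The Python sketch below is an illustration only and not part of the original notes: the function name `proj`, the sample vectors, and the decision to reject b = 0 (where the formula divides by zero) are all assumptions. It checks parts (a) and (c) on a single example, which is evidence rather than a proof.

```python
def dot(x, y):
    """Standard inner product of two vectors of the same dimension."""
    return sum(xi * yi for xi, yi in zip(x, y))


def proj(a, b):
    """Projection of a onto b: (<a, b> / ||b||^2) b. Assumes b is nonzero."""
    bb = dot(b, b)
    if bb == 0:
        raise ValueError("cannot project onto the zero vector")
    coeff = dot(a, b) / bb
    return tuple(coeff * bi for bi in b)


a = (3.0, 4.0)
b = (2.0, 0.0)
p = proj(a, b)                                   # (3.0, 0.0)
r = tuple(ai - pj for ai, pj in zip(a, p))       # a - proj_b a = (0.0, 4.0)

print(p, r)
print(dot(r, b) == 0.0)                          # r is orthogonal to b
print(dot(a, a) == dot(p, p) + dot(r, r))        # 25 = 9 + 16
```

The sample values were chosen so that every intermediate number is exactly representable; for general inputs the two equality checks should be replaced by comparisons up to a tolerance.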

  1. Fix θ ∈ [0, 2π) and define a map T_θ : R^2 → R^2 by

T_θ(x, y) = (x cos θ + y sin θ, −x sin θ + y cos θ).

(a) Compute T_{π/2}(1, 0) and T_{π/2}(0, 1).

(b) Show that ‖T_θ(x, y)‖ = ‖(x, y)‖.

(c) Show that 〈T_θ(x_1, y_1), T_θ(x_2, y_2)〉 = 〈(x_1, y_1), (x_2, y_2)〉.

(d) Give a geometric interpretation of (b) and (c), and then give a geometric description of T_θ as a whole.
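If you want to experiment before proving parts (b) and (c), the map can be coded in a few lines. This Python sketch is illustrative only and not part of the original notes: the name `rotate`, the angle 0.7, and the two sample vectors are arbitrary assumptions, and comparisons are made up to floating point tolerance. A numerical spot-check like this does not replace the proofs the exercise asks for.

```python
import math


def rotate(theta, v):
    """Apply the map T_theta from the exercise to v = (x, y) in R^2."""
    x, y = v
    c, s = math.cos(theta), math.sin(theta)
    return (x * c + y * s, -x * s + y * c)


def dot(u, v):
    """Standard inner product on R^2."""
    return u[0] * v[0] + u[1] * v[1]


def norm(v):
    """Length of v with respect to the standard inner product."""
    return math.sqrt(dot(v, v))


theta = 0.7
u, v = (3.0, 4.0), (-1.0, 2.0)

# Part (b): the norm is unchanged.
print(math.isclose(norm(rotate(theta, u)), norm(u)))

# Part (c): the inner product is unchanged.
print(math.isclose(dot(rotate(theta, u), rotate(theta, v)), dot(u, v)))
```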

  1. A set of vectors {v_1, ..., v_k} ⊂ R^n is called linearly independent if

λ_1 v_1 + ··· + λ_k v_k = 0

implies λ_1 = λ_2 = ··· = λ_k = 0. In words, the set of vectors is linearly independent if each vector “points in a new direction.” The goal of this exercise is to get you thinking about linear independence and to convince you that this verbal slogan is true.

(a) Show that a set of two vectors {v_1, v_2} ⊂ R^n is linearly independent if and only if v_1 and v_2 are not parallel.

(b) Show that if v_j = 0 for some j, then the set {v_1, ..., v_k} ⊂ R^n is not linearly independent.

(c) Give an example of three vectors a, b, c ∈ R^3 such that {a, b, c} is linearly independent.

(d) Give an example of three vectors a, b, c ∈ R^3 such that {a, b} is linearly independent, but {a, b, c} is not linearly independent.

(e) Show that any set of 3 vectors in R^2 is not linearly independent.

  1. (∗) Let a, b, c ∈ R^n be linearly independent vectors (here n ≥ 3). Write down an expression involving a, b, and c which gives a nonzero vector orthogonal to both a and b. [Hint: It may seem odd that I gave you a third vector c, seemingly unrelated to a and b, and I only want a vector orthogonal to a and b. Treat the vector c as a starting point from which to produce the desired vector.]
  2. (∗) Show that for any positive numbers x, y, z, w,

(x + y + z + w)(1/x + 1/y + 1/z + 1/w) ≥ 16.

  1. (∗) In this problem, I’ll walk you through the proof of a generalization of the Cauchy-Schwarz inequality called Hölder’s inequality. It says that if p, q > 1 satisfy 1/p + 1/q = 1, then

|〈a, b〉| ≤ (∑_{j=1}^n |a_j|^p)^{1/p} (∑_{j=1}^n |b_j|^q)^{1/q}.

Cauchy-Schwarz is the special case p = q = 2.

(a) Show that the above inequality is true if either a = 0 or b = 0.

(b) Show that

|〈a, b〉| ≤ ∑_{j=1}^n |a_j b_j|.

(c) Use concavity of ln x to show that if x, y > 0, then

ln(xy) ≤ ln( (1/p) x^p + (1/q) y^q ).

[Hint: A function f is concave if for all t ∈ [0, 1], t f(x) + (1 − t) f(y) ≤ f(tx + (1 − t)y).]

(d) Prove that for x, y > 0,

xy ≤ x^p/p + y^q/q.

(e) Let

x = |a_j| / (∑_{j=1}^n |a_j|^p)^{1/p} and y = |b_j| / (∑_{j=1}^n |b_j|^q)^{1/q}

and apply the inequality in the previous part to prove Hölder’s inequality.
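To get a feel for Hölder's inequality before proving it, one can evaluate both sides for a few exponents. The Python sketch below is an illustration under stated assumptions, not part of the original notes: `p_norm` and `holder_gap` are hypothetical helper names, the vectors are arbitrary, and q is computed from p via q = p/(p − 1), which is just the relation 1/p + 1/q = 1 solved for q.

```python
def p_norm(v, p):
    """(sum_j |v_j|^p)^(1/p), the quantity appearing on the right side of Hölder."""
    return sum(abs(vj) ** p for vj in v) ** (1.0 / p)


def holder_gap(a, b, p):
    """Right side minus left side of Hölder's inequality for the exponent p > 1."""
    if p <= 1:
        raise ValueError("p must be greater than 1")
    q = p / (p - 1.0)  # conjugate exponent: 1/p + 1/q = 1
    lhs = abs(sum(aj * bj for aj, bj in zip(a, b)))
    return p_norm(a, p) * p_norm(b, q) - lhs


a = (1.0, -2.0, 3.0)
b = (0.5, 4.0, -1.0)
for p in (1.5, 2.0, 3.0, 10.0):
    # Each gap should be nonnegative; p = 2 is the Cauchy-Schwarz case.
    print(p, holder_gap(a, b, p) >= 0)
```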

A definition like this takes some time to parse if it is your first time seeing it. In words, this definition is a game: I give you a small number ε, and I tell you that I want your sequence to be ε-close to the vector L. If you can always find a point in the sequence where this is permanently achieved, then you win. Indeed, note that the set

{ x ∈ R^m : ‖x − L‖ < ε }

is a ball of radius ε centered around L.

[Figure: points a_1, a_2, a_3 of a sequence and the ball of radius ε around L.]

Example 3.2. Let’s prove using the formal definition above that

lim_{n→∞} 1/(n^3 + 1) = 0.

Here a_n = 1/(n^3 + 1) and L = 0. These are numbers instead of vectors, but numbers are just vectors in R^1 after all! Let ε > 0 be a small, fixed number. According to the above definition, we want to show that there is some choice of N such that

| 1/(n^3 + 1) − 0 | < ε

if n > N. Here the absolute value bars have replaced the norm, since we’re dealing with numbers. Note that | 1/(n^3 + 1) − 0 | < ε is equivalent to

1/(n^3 + 1) < ε

since we assume that n > 0. This is further equivalent to 1/ε < n^3 + 1 and thus

n > (1/ε − 1)^{1/3}.

This is precisely the kind of condition we seek! Let N be any integer larger than (1/ε − 1)^{1/3}. Then if n > N, the above work implies that

| 1/(n^3 + 1) − 0 | < ε.

Since ε > 0 was arbitrary, we have shown that lim_{n→∞} 1/(n^3 + 1) = 0.
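The ε-N game in this example can be played on a computer: given ε, produce an N and check that the terms beyond it really are ε-close to 0. The Python sketch below is illustrative and not part of the original notes. The function name `threshold` is an assumption, as is the separate treatment of ε ≥ 1: there 1/ε − 1 is not positive, and since every term with n ≥ 1 is at most 1/2, the sketch simply returns N = 0.

```python
import math


def threshold(eps):
    """Return an integer N such that 1/(n^3 + 1) < eps for every integer n > N."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    if eps >= 1:
        return 0  # for n >= 1 the term is at most 1/2, which is already below eps
    # An integer strictly larger than (1/eps - 1)^(1/3), as in the example.
    return math.floor((1.0 / eps - 1.0) ** (1.0 / 3.0)) + 1


N = threshold(0.001)
print(N)  # 10, since 999^(1/3) is a little less than 10

# Spot-check the next few terms of the sequence beyond N.
print(all(1.0 / (n ** 3 + 1) < 0.001 for n in range(N + 1, N + 100)))
```

Only finitely many terms can be checked this way; the argument in the example is what guarantees the bound for every n > N.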

To get some more practice with this definition, let’s prove the following well-known properties of limits.

Proposition 3.3. If a_n → L ∈ R^m and b_n → K ∈ R^m, then a_n + b_n → L + K ∈ R^m. In other words, the limit of a sum is the sum of the limits, provided both limits exist.

Proof. This proposition is a nice application of the triangle inequality from the previous chapter. Let ε > 0. We want to show that for some choice of N,

‖(a_n + b_n) − (L + K)‖ < ε

when n > N. Note that

‖(a_n + b_n) − (L + K)‖ = ‖(a_n − L) + (b_n − K)‖ ≤ ‖a_n − L‖ + ‖b_n − K‖.

Here, we used the triangle inequality in the last step. The point of this rearrangement is that we know a_n → L and b_n → K, so heuristically we can make each of the above terms small. Indeed, pick N_1 so that n > N_1 implies ‖a_n − L‖ < ε/2 and pick N_2 such that n > N_2 implies ‖b_n − K‖ < ε/2. Each of these choices is possible because a_n → L and b_n → K. Then if n > max(N_1, N_2), we have

‖(a_n + b_n) − (L + K)‖ ≤ ‖a_n − L‖ + ‖b_n − K‖ < ε/2 + ε/2 = ε.

Since ε > 0 was arbitrary, we have shown a_n + b_n → L + K.

Other well known limit properties are left as exercises to the reader.

3.2 Limits of functions

The next limit generalization we are interested in is that of a limit of a function. The following definition is similar in spirit to that of the limit of a sequence.

Definition 3.4. Let f : R^n → R^m and let x_0 ∈ R^n and L ∈ R^m. We say that

lim_{x→x_0} f(x) = L

if for every ε > 0, there is a δ > 0 such that

0 < ‖x − x_0‖ < δ

implies ‖f(x) − L‖ < ε.

The intuition here is essentially the same as before. In words, this definition is a game: I give you a small number ε, and you want the values of f to be ε-close to L to win the game. If you can find a δ > 0 such that f is ε-close in a δ-neighborhood of x_0, then you win.

Fortunately, in these notes we typically will not use this definition directly. Instead, there is a convenient characterization of function limits in terms of sequences. In words, the following proposition says: to check if a limit of a function exists, approach that point along all possible sequences.

Proposition 3.5. Let f : R^n → R^m, x_0 ∈ R^n, and L ∈ R^m. Then

lim_{x→x_0} f(x) = L

if and only if for all sequences a_n → x_0,

lim_{n→∞} f(a_n) = L.
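One common use of this sequence characterization is to show that a limit does not exist: find two sequences approaching x_0 along which f tends to different values. The Python sketch below illustrates this with f(x, y) = xy / (x^2 + y^2) near the origin. This function is a standard textbook example chosen here as an assumption; it does not appear in the text above, and both sequences are chosen so that they never hit the origin, where f is undefined.

```python
def f(x, y):
    """f(x, y) = xy / (x^2 + y^2), defined away from the origin."""
    return x * y / (x * x + y * y)


# Two sequences converging to (0, 0): one along the x-axis, one along the diagonal.
along_axis = [f(1.0 / n, 0.0) for n in range(1, 6)]
along_diag = [f(1.0 / n, 1.0 / n) for n in range(1, 6)]

print(along_axis)  # every value is 0.0
print(along_diag)  # every value is 0.5
```

Since the values along the two sequences settle on 0 and 1/2 respectively, no single L can work for all sequences, so by the proposition the limit of f at the origin does not exist.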