



















Math S21a: Multivariable calculus Oliver Knill, Summer 2018
If f(x, y) is a function of the two variables x and y, the partial derivative $\frac{\partial}{\partial x} f(x, y)$ is defined as the derivative of the function g(x) = f(x, y) with respect to x, where y is considered a constant. The partial derivative with respect to y is the derivative with respect to y where x is fixed.
The shorthand notation $f_x(x, y) = \frac{\partial}{\partial x} f(x, y)$ is convenient. When iterating derivatives, the notation is similar: we write for example $f_{xy} = \frac{\partial}{\partial x} \frac{\partial}{\partial y} f$. The number $f_x(x_0, y_0)$ gives the slope of the graph sliced at $(x_0, y_0)$ in the x direction. The second derivative fxx is a measure of concavity in that direction. The meaning of fxy is the rate of change of the x-slope if you move the cut along the y-axis.
The notation $\partial_x f$, $\partial_y f$ was introduced by Carl Gustav Jacobi. Before that, Joseph-Louis Lagrange used the term "partial differences". For functions of three or more variables, the partial derivatives are defined in the same way. We write for example fx(x, y, z) or fxxz(x, y, z).
Example: for f(x, y) = x^4 − 6x^2 y^2 + y^4, we have fx = 4x^3 − 12xy^2, fxx = 12x^2 − 12y^2, fy = −12x^2 y + 4y^3, fyy = −12x^2 + 12y^2 and see that ∆f = fxx + fyy = 0. A function which satisfies ∆f = 0 is also called harmonic. The equation fxx + fyy = 0 is an example of a partial differential equation: it is an equation for an unknown function f(x, y) which involves partial derivatives with respect to more than one variable.
Clairaut’s theorem: If fxy and fyx are both continuous, then fxy = fyx.
Proof: Following Euler, we first look at the difference quotients and say that if the “Planck constant” h is positive, then fx(x, y) = [f (x + h, y) − f (x, y)]/h. For h = 0, we mean the usual partial derivative fx. Comparing the two sides of the equation for fixed h > 0 shows
h fx(x, y) = f(x + h, y) − f(x, y),
h fy(x, y) = f(x, y + h) − f(x, y),
h^2 fxy(x, y) = f(x + h, y + h) − f(x, y + h) − (f(x + h, y) − f(x, y)),
h^2 fyx(x, y) = f(x + h, y + h) − f(x + h, y) − (f(x, y + h) − f(x, y)).
Without having taken any limits we established an identity which holds for all h > 0: the right-hand sides of the last two equations contain the same four terms, so the discrete derivatives fx, fy satisfy the relation fxy = fyx for any h > 0. We could call it a "quantum Clairaut" formula. If the classical derivatives fxy, fyx are both continuous, it is possible to take the limit h → 0. The classical Clairaut theorem can be seen as a "classical limit". The quantum Clairaut formula holds however for all functions f(x, y) of two variables. Not even continuity is needed.
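The discrete identity can be checked exactly on a computer. The following Python sketch uses exact rational arithmetic and a deliberately discontinuous test function. The names `discrete_dx`, `discrete_dy` and `jumpy`, as well as the chosen point and step size, are illustrative assumptions and not part of the notes.

```python
from fractions import Fraction

def discrete_dx(f, h):
    """Discrete partial derivative in x with step h (no limit taken)."""
    return lambda x, y: (f(x + h, y) - f(x, y)) / h

def discrete_dy(f, h):
    """Discrete partial derivative in y with step h (no limit taken)."""
    return lambda x, y: (f(x, y + h) - f(x, y)) / h

def jumpy(x, y):
    # Hypothetical test function: a polynomial plus a jump across the line x = y.
    return x * x * y + (1 if x > y else 0)

h = Fraction(1, 10)
x0, y0 = Fraction(1, 3), Fraction(2, 5)

# Take the x-difference first and then the y-difference, and the other way round.
f_xy = discrete_dy(discrete_dx(jumpy, h), h)(x0, y0)
f_yx = discrete_dx(discrete_dy(jumpy, h), h)(x0, y0)

print(f_xy == f_yx)
```

Because the arithmetic is done with `Fraction`, the comparison is exact and not affected by rounding. The jump in `jumpy` lies inside the stencil of four points, so the test really involves a discontinuous function.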
compute. Once you see it, the answer becomes obvious.
The function f(x, y) = (x^3 y − xy^3)/(x^2 + y^2), extended by f(0, 0) = 0, contradicts the conclusion of the Clairaut theorem at the origin, where fxy and fyx are not continuous:
fx(x, y) = (3x^2 y − y^3)/(x^2 + y^2) − 2x(x^3 y − xy^3)/(x^2 + y^2)^2, fx(0, y) = −y, fxy(0, 0) = −1,
fy(x, y) = (x^3 − 3xy^2)/(x^2 + y^2) − 2y(x^3 y − xy^3)/(x^2 + y^2)^2, fy(x, 0) = x, fyx(0, 0) = 1.
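The two different mixed derivatives at the origin can also be seen numerically. The sketch below approximates them with central difference quotients. It assumes the value f(0, 0) = 0 at the origin, and the step sizes are ad hoc choices.

```python
def f(x, y):
    # The quotient is not defined at the origin; the value 0 there is an assumption.
    if x == 0.0 and y == 0.0:
        return 0.0
    return (x**3 * y - x * y**3) / (x**2 + y**2)

def fx(x, y, h=1e-6):
    """Central difference approximation of the partial derivative in x."""
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def fy(x, y, h=1e-6):
    """Central difference approximation of the partial derivative in y."""
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

k = 1e-3
# Differentiate y -> fx(0, y) at y = 0, and x -> fy(x, 0) at x = 0.
fxy_00 = (fx(0.0, k) - fx(0.0, -k)) / (2 * k)
fyx_00 = (fy(k, 0.0) - fy(-k, 0.0)) / (2 * k)

print(fxy_00, fyx_00)
```

The two printed numbers should be close to −1 and 1 respectively, in agreement with the formulas fx(0, y) = −y and fy(x, 0) = x.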
An equation for an unknown function f (x, y) which involves partial derivatives with respect to at least two different variables is called a partial differential equation. We abbreviate PDE. If only the derivative with respect to one variable appears, it is an ordinary differential equation, abbreviated ODE.
Here are examples of partial differential equations. You have to know the first four in the same way that a chemist has to know some basic molecules like H2O, CO2, CH4 and NaCl. It is amazing how often they appear also in areas different from physics, like finance (market prediction) or biology (diffusion). These equations can also be looked at on discrete structures like networks.
f (t, x) = sin(x − t) + sin(x + t) satisfies the wave equation.
demic. The function $f(t, x) = \frac{1}{\sqrt{t}} e^{-x^2/(4t)}$ satisfies the heat equation.
f(x, y) = x^3 − 3xy^2 is an example satisfying the Laplace equation.
$f(t, x) = e^{-(x+t)^2}$ satisfies the advection equation.
The function f (x, y) = cos(x) + sin(y) satisfies the eiconal equation.
The function $f(t, x) = \frac{x}{t} \cdot \frac{\sqrt{1/t}\, e^{-x^2/(4t)}}{1 + \sqrt{1/t}\, e^{-x^2/(4t)}}$ satisfies the Burgers equation.
The function $f(t, x) = \frac{a^2}{2} \cosh^{-2}\left(\frac{a}{2}(x - a^2 t)\right)$ satisfies the KdV equation.
m. The function $f(t, x) = e^{i(kx - \frac{\hbar}{2m} k^2 t)}$ solves the Schrödinger equation. [Here i^2 = −1 is the imaginary i and ℏ is the Planck constant ℏ ∼ 10^{−34} Js.] Here are the graphs of the solutions of the equations. Can you match them with the PDEs?
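Some of the example solutions above can be tested numerically with central differences. The sketch assumes the standard forms f_tt = f_xx (wave), f_t = f_xx (heat), f_xx + f_yy = 0 (Laplace) and f_t = f_x (advection). Apart from the Laplace and heat equations, these forms are assumptions, since they are not spelled out in this part of the notes. The sample point and the step size `H` are arbitrary choices.

```python
import math

H = 1e-4  # step size for the central differences (ad hoc choice)

def d1(g, s):
    """First derivative of a one-variable function g at s."""
    return (g(s + H) - g(s - H)) / (2 * H)

def d2(g, s):
    """Second derivative of a one-variable function g at s."""
    return (g(s + H) - 2 * g(s) + g(s - H)) / (H * H)

def wave(t, x):
    return math.sin(x - t) + math.sin(x + t)

def heat(t, x):
    return math.exp(-x * x / (4 * t)) / math.sqrt(t)

def laplace(x, y):
    return x**3 - 3 * x * y**2

def advection(t, x):
    return math.exp(-(x + t) ** 2)

t0, x0, y0 = 0.7, 0.3, 0.5  # arbitrary sample point

# Each residual is "left side minus right side" of the assumed equation.
residuals = {
    "wave": d2(lambda t: wave(t, x0), t0) - d2(lambda x: wave(t0, x), x0),
    "heat": d1(lambda t: heat(t, x0), t0) - d2(lambda x: heat(t0, x), x0),
    "laplace": d2(lambda x: laplace(x, y0), x0) + d2(lambda y: laplace(x0, y), y0),
    "advection": d1(lambda t: advection(t, x0), t0) - d1(lambda x: advection(t0, x), x0),
}

for name, value in residuals.items():
    print(name, value)
```

A residual close to zero at one point is only numerical evidence, not a proof. A symbolic check, as in the Mathematica remark in the homework, settles the question for all points.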
3 Verify that for any constant b, the function $f(x, t) = e^{-bt} \cos(x + t)$
satisfies the driven transport equation ft(x, t) = fx(x, t) − b f(x, t).
This PDE is sometimes called the advection equation with
damping b.
4 The differential equation
$f_t = f - x f_x - x^2 f_{xx}$
is a version of the infamous Black-Scholes equation. Here
f(x, t) is the price of a call option, x is the stock price and t is
time. Find a function f(x, t) solving it which depends both on x
and t. Hint: look first for functions which involve only one variable.
5 The partial differential equation ft + f fx = fxx is called the Burgers
equation and describes waves at the beach. In higher dimensions,
it leads to the Navier-Stokes equations, which are used
to describe the weather. Verify that
$f(t, x) = \frac{(1/t)^{3/2}\, x\, e^{-x^2/(4t)}}{\sqrt{1/t}\, e^{-x^2/(4t)} + 1}$
solves the Burgers equation.
Remark. You had better use technology here. Here is an example
on how to check that a function is a solution of the heat equation
in Mathematica:
f[t_,x_]:=(1/Sqrt[t])*Exp[-x^2/(4t)];
Simplify[D[f[t,x],t] == D[f[t,x],{x,2}]]
And here is the function
(1/t)^(3/2)*x*Exp[-x^2/(4t)]/((1/t)^(1/2)*Exp[-x^2/(4t)]+1);
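If no computer algebra system is at hand, a numerical test gives at least some evidence. The following Python sketch evaluates the residual f_t + f f_x − f_xx of the Burgers equation for the function above with central differences. The helper name `burgers_residual`, the step size and the sample points are assumptions made for this illustration.

```python
import math

def f(t, x):
    """The candidate solution of the Burgers equation from the exercise."""
    e = math.exp(-x * x / (4 * t))
    return (1 / t) ** 1.5 * x * e / ((1 / t) ** 0.5 * e + 1)

def burgers_residual(t, x, h=1e-4):
    """Approximate f_t + f*f_x - f_xx at (t, x) with central differences."""
    f_t = (f(t + h, x) - f(t - h, x)) / (2 * h)
    f_x = (f(t, x + h) - f(t, x - h)) / (2 * h)
    f_xx = (f(t, x + h) - 2 * f(t, x) + f(t, x - h)) / (h * h)
    return f_t + f(t, x) * f_x - f_xx

for point in [(1.0, 0.5), (2.0, -1.0)]:
    print(point, burgers_residual(*point))
```

The printed residuals should be tiny compared to the size of the individual terms. This does not replace the symbolic verification asked for in the exercise.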
In single variable calculus, you have seen the following notion. As usual, we always assume the functions to be differentiable:
The linear approximation of f (x) at a point a is the affine function
L(x) = f (a) + f ′(a)(x − a).
If you have seen Taylor series, this is the part of the series $f(x) = \sum_{k=0}^{\infty} f^{(k)}(a)(x - a)^k/k!$ where only the k = 0 and k = 1 terms are considered. We think about the linear approximation L as a function and not as a graph because we will also look at linear approximations for functions of three variables, where we cannot draw graphs.
[Figure: the graph y = f(x) together with its linear approximation y = L(x).]
The graph of the function L is close to the graph of f at a. What about higher dimensions?
The linear approximation of f (x, y) at (a, b) is the affine function
L(x, y) = f (a, b) + fx(a, b)(x − a) + fy(a, b)(y − b).
The linear approximation of a function f (x, y, z) at (a, b, c) is
L(x, y, z) = f (a, b, c) + fx(a, b, c)(x − a) + fy(a, b, c)(y − b) + fz (a, b, c)(z − c).
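As a small illustration of the two-variable formula, here is a Python sketch which builds L for a given function and compares it with the function near the base point. The example function f(x, y) = x^2 y, the base point (1, 2) and the helper name `linearize` are hypothetical choices.

```python
def f(x, y):
    return x * x * y      # hypothetical example function

def fx(x, y):
    return 2 * x * y      # partial derivative with respect to x

def fy(x, y):
    return x * x          # partial derivative with respect to y

def linearize(f, fx, fy, a, b):
    """Return L(x, y) = f(a,b) + fx(a,b)(x-a) + fy(a,b)(y-b)."""
    f0, p, q = f(a, b), fx(a, b), fy(a, b)
    return lambda x, y: f0 + p * (x - a) + q * (y - b)

L = linearize(f, fx, fy, 1.0, 2.0)

approx = L(1.01, 1.98)   # value of the linear approximation
exact = f(1.01, 1.98)    # value of the function itself
print(approx, exact, abs(exact - approx))
```

Here L(x, y) = 2 + 4(x − 1) + (y − 2), so the estimate at (1.01, 1.98) is 2 + 0.04 − 0.02 = 2.02, while the function value is 2.019798.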
Using the gradient
∇f(x, y) = [fx, fy]^T, ∇f(x, y, z) = [fx, fy, fz]^T,
the linearization can be written more compactly as
$L(\vec{x}) = f(\vec{a}) + \nabla f(\vec{a}) \cdot (\vec{x} - \vec{a})$.
How do we justify the linearization? If the second variable y = b is fixed, we have a one-dimensional situation, where the only variable is x. Now f(a, b) + fx(a, b)(x − a) is the linear approximation of f(x, b). Similarly, if x = a is fixed and y is the single variable, then f(a, b) + fy(a, b)(y − b) is the linear approximation of f(a, y). Knowing the linear approximations in both the x and y variables, we get the general linear approximation L(x, y) = f(a, b) + fx(a, b)(x − a) + fy(a, b)(y − b).
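The surface f(x, y, z) = x^4 − x^3 + y^2 + z^2 = 0 is called the piriform. What is the equation for the tangent plane at the point P = (2, 2, 2) of this pear-shaped surface? The gradient ∇f = [4x^3 − 3x^2, 2y, 2z]^T gives [a, b, c]^T = [20, 4, 4]^T at P and so the equation of the plane 20x + 4y + 4z = 56, where we have obtained the constant on the right by plugging in the point (x, y, z) = (2, 2, 2). Note that f(2, 2, 2) = 16, so P lies on the level surface f = 16 rather than on the piriform f = 0 itself; the plane found is tangent to that level surface.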
Remark: In the past, linear approximations were described with differentials. We try to avoid them or at least use them only for intuition. Like Newton's "fluxions", the Leibniz "differentials" are a bit outdated. The differential is a good intuitive notion, but it is also easy to make mistakes with it. The linearization of a function f at a point is just a linear function L in the same number of variables. There is a modern notion of differential forms, which however needs some multilinear algebra in its definition. The notion of infinitesimally small quantities has also been clarified within non-standard analysis, but that theory needs some logic background. The notion of "differentials" comes from a time when calculus was in its early development. To see why the notion "differential" is a bit murky, try to find out what the definition of "differential" is: you find notions like "change in the linearization of a function" or "infinitesimals". Both are "foggy terminology". To add to the confusion, expressions like dx are also called "differentials". They appear in total differentials df = fx dx + fy dy, which can make sense as an intuitive shortcut for the chain rule d/dt f(x(t), y(t)) = fx(x(t), y(t)) dx(t)/dt + fy(x(t), y(t)) dy(t)/dt, as the latter can be multiplied by dt to get the differential expression. (We cover the chain rule later.) The expression dx also appears in integrals like $\int_0^\pi \sin(x)\, dx$, but there it is notation to indicate the variable we integrate with. Mathematica for example writes Integrate[Sin[x], {x, 0, Pi}]. Leibniz used $\int_a^b f(x)\, dx$ because it is close to the Riemann sum notation $\sum_i f(x_i)\, dx_i$, in which $dx_i = x_{i+1} - x_i$ represent differences. The Leibniz notation $\int_a^b f(x)\, dx$ is a shortcut for a limit of Riemann sums. But expressions like dx, dt alone are not defined without considerably more theory. As a notation, differentials can be useful in the separation of variables technique: when solving f′ = f we write df/dx = f, leading to df/f = dx. Integration produces log(f) = x + c, leading to $f = e^{x+c}$.
From f(x, y) = 0, one can express y as a function of x, at least near a point where fy is not zero. From d/dx f(x, y(x)) = ∇f · (1, y′(x)) = fx + fy y′ = 0, we obtain y′ = −fx/fy. Even though we do not know y(x), we can compute its derivative! Implicit differentiation works also in three variables. The equation f(x, y, z) = c defines a surface. Near a point where fz is not zero, the surface can be described as a graph z = z(x, y). We can compute the derivative zx without actually knowing the function z(x, y). To do so, we consider y a fixed parameter and compute, using the chain rule
fx(x, y, z(x, y)) · 1 + fz(x, y, z(x, y)) zx(x, y) = 0
so that zx(x, y) = −fx(x, y, z)/fz (x, y, z). This works at points where fz is not zero.
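The formula zx = −fx/fz can be tested on a surface for which z(x, y) happens to be known explicitly. The Python sketch below uses the sphere x^2 + y^2 + z^2 = 14 and the point (1, 2, 3). Both are hypothetical choices made only for this check and do not appear in the notes.

```python
import math

def z_x_implicit(x, y, z):
    """z_x = -f_x/f_z for f(x, y, z) = x^2 + y^2 + z^2 (hypothetical example)."""
    f_x = 2 * x
    f_z = 2 * z
    return -f_x / f_z

def z_explicit(x, y):
    """Upper half of the sphere x^2 + y^2 + z^2 = 14, solved for z."""
    return math.sqrt(14 - x * x - y * y)

x0, y0 = 1.0, 2.0
z0 = z_explicit(x0, y0)          # the point (1, 2, 3) lies on the sphere

h = 1e-6
numeric = (z_explicit(x0 + h, y0) - z_explicit(x0 - h, y0)) / (2 * h)
implicit = z_x_implicit(x0, y0, z0)

print(implicit, numeric)
```

Both numbers should agree with −1/3 up to the discretization error. The point of implicit differentiation is of course that the first computation also works when no explicit formula for z(x, y) is available.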
(x, y, z) = (2, 1 , 1). Solution: zx(x, y) = −fx(2, 1 , 1)/fz (2, 1 , 1) = − 4 /(2/9) = −18.
The chain rule is powerful because it implies other differentiation rules like the addition, product and quotient rules in one dimension:
f(x, y) = x + y, x = u(t), y = v(t), d/dt(x + y) = fx u′ + fy v′ = u′ + v′.
f(x, y) = xy, x = u(t), y = v(t), d/dt(xy) = fx u′ + fy v′ = v u′ + u v′.
f(x, y) = x/y, x = u(t), y = v(t), d/dt(x/y) = fx u′ + fy v′ = u′/v − u v′/v^2.
As in one dimension, the chain rule follows from linearization. If f is a linear function f(x, y) = ax + by − c and the curve $\vec{r}(t) = [x_0 + tu, y_0 + tv]^T$ parametrizes a line, then $\frac{d}{dt} f(\vec{r}(t)) = \frac{d}{dt}(a(x_0 + tu) + b(y_0 + tv)) = au + bv$, and this is the dot product of ∇f = (a, b) with $\vec{r}'(t) = (u, v)$.
Since the chain rule only refers to the derivatives of the functions which agree at the point, the chain rule is also true for general functions.
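For a nonlinear function and a curved path, the chain rule can at least be tested numerically. The Python sketch below compares the dot product of the gradient with the velocity to a difference quotient of t → f(r(t)). The function f(x, y) = x^2 y and the circle r(t) = (cos t, sin t) are hypothetical examples.

```python
import math

def f(x, y):
    return x * x * y                     # hypothetical example function

def grad_f(x, y):
    return (2 * x * y, x * x)            # [f_x, f_y]

def r(t):
    return (math.cos(t), math.sin(t))    # hypothetical curve: the unit circle

def r_prime(t):
    return (-math.sin(t), math.cos(t))   # velocity of the curve

def chain_rule(t):
    """Right side of the chain rule: grad f(r(t)) . r'(t)."""
    gx, gy = grad_f(*r(t))
    vx, vy = r_prime(t)
    return gx * vx + gy * vy

def direct(t, h=1e-6):
    """Left side: central difference quotient of t -> f(r(t))."""
    return (f(*r(t + h)) - f(*r(t - h))) / (2 * h)

t0 = 0.4
print(chain_rule(t0), direct(t0))
```

The two printed values should agree up to the error of the difference quotient.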
The gradient of a function f(x, y) is defined as
∇f(x, y) = [fx(x, y), fy(x, y)]^T.
For functions of three variables, define
∇f(x, y, z) = [fx(x, y, z), fy(x, y, z), fz(x, y, z)]^T.
The symbol ∇ is spelled "Nabla" and named after an Egyptian or Assyrian harp. Early on, the name "Atled" was suggested. But the 1901 textbook of Gibbs, which used Nabla, was too persuasive. Here is a very important fact, which is true in any dimension. We only formulate it in dimension 2:
Gradient Theorem: $\nabla f(x_0, y_0)$ is perpendicular to the level curve {(x, y) | f(x, y) = c} containing $(x_0, y_0)$.
Proof. Every curve $\vec{r}(t)$ on the level curve or level surface satisfies $\frac{d}{dt} f(\vec{r}(t)) = 0$. By the chain rule, this derivative equals $\nabla f(\vec{r}(t)) \cdot \vec{r}'(t)$, so $\nabla f(\vec{r}(t))$ is perpendicular to the tangent vector $\vec{r}'(t)$. QED.
Because $\vec{n} = \nabla f(p, q) = [a, b]^T$ is perpendicular to the level curve f(x, y) = c through (p, q), the equation for the tangent line is ax + by = d, a = fx(p, q), b = fy(p, q), d = ap + bq. Compactly written, this is $\nabla f(\vec{x}_0) \cdot (\vec{x} - \vec{x}_0) = 0$
and means that the gradient of f is perpendicular to any vector $(\vec{x} - \vec{x}_0)$ in the tangent line (or, in three dimensions, in the tangent plane). The gradient theorem is one of the most important statements in multivariable calculus as it gives a crucial link between calculus and geometry. It is also useful in practice: we can immediately compute tangent planes and tangent lines, without linearization!
∇f(x, y, z) = [6xy, 3x^2, 2z]^T. And ∇f(1, 1, 1) = [6, 3, 2]^T. The plane is 6x + 3y + 2z = d where d is a constant. We can find the constant d by plugging in the point (1, 1, 1) and get 6x + 3y + 2z = 11.
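The recipe "normal vector from the gradient, constant from the point" is short enough to write down as a program. The Python sketch below uses the gradient given in the text. It belongs for example to f(x, y, z) = 3x^2 y + z^2, which is an assumption, since the statement of the problem is not part of this excerpt. The helper name `tangent_plane` is also made up for the illustration.

```python
def grad_f(x, y, z):
    # Gradient from the text; it fits f(x, y, z) = 3x^2 y + z^2 (assumed).
    return (6 * x * y, 3 * x * x, 2 * z)

def tangent_plane(point):
    """Return (n, d) such that the tangent plane at point is n . x = d."""
    n = grad_f(*point)
    d = sum(ni * pi for ni, pi in zip(n, point))
    return n, d

normal, d = tangent_plane((1, 1, 1))
print(normal, d)
```

With integer input the computation is exact: the normal vector is (6, 3, 2) and the constant is 6 + 3 + 2 = 11, as in the text.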
x^4 + y^2 + z^6 = 6.
Solution: $\vec{r}(t)$ hits the surface at the time t = 2 in the point (−1, −2, 1). The velocity vector of that ray is $\vec{v} = [-1, -1, 0]^T$. The normal vector at this point is $\nabla f(-1, -2, 1) = [-4, -4, 6]^T = \vec{n}$. The reflected vector is $R(\vec{v}) = 2\,\mathrm{Proj}_{\vec{n}}(\vec{v}) - \vec{v}$. We have $\mathrm{Proj}_{\vec{n}}(\vec{v}) = (8/68)[-4, -4, 6]^T$. Therefore, the reflected ray is $\vec{w} = (4/17)[-4, -4, 6]^T - [-1, -1, 0]^T$.
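The arithmetic of this solution can be reproduced with exact fractions. The Python sketch below applies the reflection formula 2 Proj_n(v) − v exactly as stated in the solution to the vectors v and n given there. The function names `dot` and `reflect` are illustrative.

```python
from fractions import Fraction

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def reflect(v, n):
    """Return 2*Proj_n(v) - v, the formula used in the solution above."""
    c = Fraction(2 * dot(v, n), dot(n, n))   # 2 (v.n)/(n.n), here 16/68 = 4/17
    return tuple(c * ni - vi for ni, vi in zip(n, v))

v = (-1, -1, 0)     # velocity vector from the solution
n = (-4, -4, 6)     # gradient at the point (-1, -2, 1)

w = reflect(v, n)
print(w)
```

The result is w = [1/17, 1/17, 24/17]^T. Its length agrees with the length of v, as it should for a reflection.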
If f is a function of several variables and $\vec{v}$ is a unit vector then $D_{\vec{v}} f = \nabla f \cdot \vec{v}$ is called the directional derivative of f in the direction $\vec{v}$.
The name "directional derivative" is related to the fact that every unit vector gives a direction. If $\vec{v}$ is a unit vector, then the chain rule tells us $D_{\vec{v}} f(\vec{x}) = \frac{d}{dt} f(\vec{x} + t\vec{v})$ at t = 0.
The directional derivative tells us how the function changes when we move in a given direction. Assume for example that T(x, y, z) is the temperature at position (x, y, z). If we move with velocity $\vec{v}$ through space, then $D_{\vec{v}} T$ tells us at which rate the temperature changes for us. If we move with velocity $\vec{v}$ on a hilly surface of height h(x, y), then $D_{\vec{v}} h(x, y)$ gives us the slope we drive on.
temperature change, one measures at $\vec{r}(t)$. The chain rule told us that this is $\frac{d}{dt} f(\vec{r}(t))$.
the partial derivatives. It measures the rate of change of f , if we walk with unit speed into that direction. But as with partial derivatives, it is a scalar.
The directional derivative satisfies $|D_{\vec{v}} f| \le |\nabla f||\vec{v}|$ because $|\nabla f \cdot \vec{v}| = |\nabla f||\vec{v}||\cos(\phi)| \le |\nabla f||\vec{v}|$.
The direction $\vec{v} = \nabla f/|\nabla f|$ is the direction in which f increases most. It is the direction of steepest ascent.
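The inequality and the steepest ascent property can be explored by sampling unit vectors. The Python sketch below evaluates the directional derivative in 360 directions and compares the values with the length of the gradient. The function f(x, y) = x^2 y and the point (1, 2) are hypothetical examples, and the sampling resolution is an arbitrary choice.

```python
import math

def grad_f(x, y):
    # Gradient of the hypothetical example f(x, y) = x^2 y.
    return (2 * x * y, x * x)

def directional_derivative(point, angle):
    """D_v f = grad f . v for the unit vector v = [cos(angle), sin(angle)]."""
    gx, gy = grad_f(*point)
    return gx * math.cos(angle) + gy * math.sin(angle)

point = (1.0, 2.0)
gx, gy = grad_f(*point)
grad_norm = math.hypot(gx, gy)      # |grad f|
best_angle = math.atan2(gy, gx)     # direction of grad f / |grad f|

# Sample the directional derivative in steps of one degree.
samples = [directional_derivative(point, 2 * math.pi * k / 360) for k in range(360)]

print(max(samples), grad_norm, directional_derivative(point, best_angle))
```

No sampled value exceeds |∇f|, and the value in the direction of the gradient equals |∇f| up to rounding.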
3 Assume f(x, y) = 1 − x^2 + y^2. Compute the directional derivative
$D_{\vec{v}} f(x, y)$ at (0, 0), where $\vec{v} = [\cos(t), \sin(t)]^T$ is a unit vector. Now
compute
$D_{\vec{v}} D_{\vec{v}} f(x, y)$
at (0, 0), for any unit vector. For which directions is this second
directional derivative positive?
4 The Kitchen-Rosenberg formula gives the curvature of a
level curve f(x, y) = c as
$\kappa = \frac{f_{xx} f_y^2 - 2 f_{xy} f_x f_y + f_{yy} f_x^2}{(f_x^2 + f_y^2)^{3/2}}$
Use this formula to find the curvature of the ellipse f(x, y) =
x^2 + 2y^2 = 1 at the point (1, 0).
This formula is useful in computer vision. If you want to derive
the formula, you can check that the angle
g(x, y) = arctan(fy/fx)
of the gradient vector has κ as the directional derivative in the
direction $\vec{v} = [-f_y, f_x]^T / \sqrt{f_x^2 + f_y^2}$ tangent to the curve.
5 One can find the maximum of a function numerically by moving in
the direction of the gradient. This is called the steepest ascent
method. You start at a point (x_0, y_0), then move in the direction
of the gradient for some time c to be at (x_1, y_1) = (x_0, y_0) +
c∇f(x_0, y_0). Repeat to (x_2, y_2) = (x_1, y_1) + c∇f(x_1, y_1) etc. It
can be a bit difficult if the function has a flat ridge like in the
Rosenbrock function
f(x, y) = 1 − (1 − x)^2 − 100(y − x^2)^2.
Plot the contour map of this function on −0.6 ≤ x ≤ 1, −0.1 ≤
y ≤ 1.1, then find the directional derivative at (1/5, 0) in the
direction (1, 1)/
√