Math S21a: Multivariable calculus Oliver Knill, Summer 2018
Lecture 9: Partial derivatives

If $f(x, y)$ is a function of the two variables $x$ and $y$, the partial derivative $\frac{\partial}{\partial x} f(x, y)$ is defined as the derivative of the function $g(x) = f(x, y)$ with respect to $x$, where $y$ is considered a constant. The partial derivative with respect to $y$ is the derivative with respect to $y$ where $x$ is fixed.

The shorthand notation $f_x(x, y) = \frac{\partial}{\partial x} f(x, y)$ is convenient. When iterating derivatives, the notation is similar: we write for example $f_{xy} = \frac{\partial}{\partial x} \frac{\partial}{\partial y} f$. The number $f_x(x_0, y_0)$ gives the slope of the graph sliced at $(x_0, y_0)$ in the $x$ direction. The second derivative $f_{xx}$ is a measure of concavity in that direction. The meaning of $f_{xy}$ is the rate of change of the $x$-slope if you move the cut along the $y$-axis.

The notation $\partial_x f$, $\partial_y f$ was introduced by Carl Gustav Jacobi. Before that, Joseph-Louis Lagrange used the term "partial differences". For functions of three or more variables, the partial derivatives are defined in the same way. We write for example $f_x(x, y, z)$ or $f_{xxz}(x, y, z)$.

1 For $f(x, y) = x^4 - 6x^2y^2 + y^4$, we have $f_x(x, y) = 4x^3 - 12xy^2$, $f_{xx} = 12x^2 - 12y^2$, $f_y(x, y) = -12x^2y + 4y^3$, $f_{yy} = -12x^2 + 12y^2$ and see that $\Delta f = f_{xx} + f_{yy} = 0$. A function which satisfies $\Delta f = 0$ is also called harmonic. The equation $f_{xx} + f_{yy} = 0$ is an example of a partial differential equation: it is an equation for an unknown function $f(x, y)$ which involves partial derivatives with respect to more than one variable.
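The partial derivatives in this example can be sanity-checked numerically. The following Python sketch approximates them with central differences; the helper names `partial_x`, `partial_y`, `laplacian`, the sample point and the step sizes are choices made here for illustration and are not part of the lecture.

```python
def f(x, y):
    # The harmonic polynomial from the example above.
    return x**4 - 6 * x**2 * y**2 + y**4

def partial_x(g, x, y, h=1e-5):
    # Central difference in x while y is held constant.
    return (g(x + h, y) - g(x - h, y)) / (2 * h)

def partial_y(g, x, y, h=1e-5):
    # Central difference in y while x is held constant.
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

def laplacian(g, x, y, h=1e-3):
    # Second central differences approximate f_xx + f_yy.
    fxx = (g(x + h, y) - 2 * g(x, y) + g(x - h, y)) / h**2
    fyy = (g(x, y + h) - 2 * g(x, y) + g(x, y - h)) / h**2
    return fxx + fyy

x0, y0 = 1.5, -0.5
print(partial_x(f, x0, y0), 4 * x0**3 - 12 * x0 * y0**2)   # numeric vs. exact f_x
print(partial_y(f, x0, y0), -12 * x0**2 * y0 + 4 * y0**3)  # numeric vs. exact f_y
print(laplacian(f, x0, y0))                                # close to 0
```

The difference quotients only approximate the derivatives, so the printed Laplacian is small but not exactly zero.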

Clairaut's theorem: If $f_{xy}$ and $f_{yx}$ are both continuous, then $f_{xy} = f_{yx}$.

Proof: Following Euler, we first look at the difference quotients and say that if the "Planck constant" $h$ is positive, then $f_x(x, y) = [f(x + h, y) - f(x, y)]/h$. For $h = 0$, we mean the usual partial derivative $f_x$. Comparing the two sides of the equation for fixed $h > 0$ shows

$h f_x(x, y) = f(x + h, y) - f(x, y)$

$h f_y(x, y) = f(x, y + h) - f(x, y)$

$h^2 f_{xy}(x, y) = f(x + h, y + h) - f(x, y + h) - (f(x + h, y) - f(x, y))$

$h^2 f_{yx}(x, y) = f(x + h, y + h) - f(x + h, y) - (f(x, y + h) - f(x, y))$

The last two right-hand sides are the same sum $f(x + h, y + h) - f(x + h, y) - f(x, y + h) + f(x, y)$. Without having taken any limits we established an identity which holds for all $h > 0$: the discrete derivatives $f_x$, $f_y$ satisfy the relation $f_{xy} = f_{yx}$ for any $h > 0$. We could call it a "quantum Clairaut" formula. If the classical derivatives $f_{xy}$, $f_{yx}$ are both continuous, it is possible to take the limit $h \to 0$. The classical Clairaut theorem can be seen as a "classical limit". The quantum Clairaut formula holds however for all functions $f(x, y)$ of two variables. Not even continuity is needed.
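The discrete identity can be tested directly. The following Python sketch uses exact rational arithmetic so that no rounding interferes; the operator names `dx`, `dy` and the discontinuous test function are hypothetical choices made for this illustration.

```python
from fractions import Fraction

def dx(g, h):
    # Discrete x-derivative with step h: (g(x+h, y) - g(x, y)) / h.
    return lambda x, y: (g(x + h, y) - g(x, y)) / h

def dy(g, h):
    # Discrete y-derivative with step h: (g(x, y+h) - g(x, y)) / h.
    return lambda x, y: (g(x, y + h) - g(x, y)) / h

def f(x, y):
    # A deliberately discontinuous test function (hypothetical example).
    return x * x * y if x > y else Fraction(7) - x * y * y

h = Fraction(1, 10)
x0, y0 = Fraction(1, 3), Fraction(1, 4)
fxy = dy(dx(f, h), h)(x0, y0)   # first the x-difference, then the y-difference
fyx = dx(dy(f, h), h)(x0, y0)   # the other order
print(fxy == fyx)               # exact equality in rational arithmetic
```

With floating point numbers the two orders could differ by rounding errors, which is why the sketch works with `Fraction` values.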

2 Problem: Find $f_{xxxxxyxxxxx}$ for $f(x, y) = \sin(x) + x^6 y^{10} \cos(y)$. Hint: you do not need to compute. Once you see it, the answer becomes obvious.

3 Some regularity assumption for $f_{xy}$ is necessary. The example

$f(x, y) = \frac{x^3 y - x y^3}{x^2 + y^2}$

(with $f(0, 0) = 0$) shows that the conclusion of Clairaut's theorem can fail without it:

$f_x(x, y) = \frac{3x^2 y - y^3}{x^2 + y^2} - \frac{2x(x^3 y - x y^3)}{(x^2 + y^2)^2}$, $f_x(0, y) = -y$, $f_{xy}(0, 0) = -1$,

$f_y(x, y) = \frac{x^3 - 3x y^2}{x^2 + y^2} - \frac{2y(x^3 y - x y^3)}{(x^2 + y^2)^2}$, $f_y(x, 0) = x$, $f_{yx}(0, 0) = 1$.
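The two different mixed derivatives at the origin can also be seen numerically. The sketch below is a rough check under the assumption that $f(0, 0) = 0$; the step sizes `h` and `k` are ad hoc choices, with the inner step much smaller than the outer one.

```python
def f(x, y):
    # The example from the text, extended by f(0, 0) = 0.
    if x == 0 and y == 0:
        return 0.0
    return (x**3 * y - x * y**3) / (x**2 + y**2)

def f_x(x, y, h=1e-7):
    # Central difference approximation of the partial derivative in x.
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def f_y(x, y, h=1e-7):
    # Central difference approximation of the partial derivative in y.
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

k = 1e-2  # outer step, much larger than the inner step h
fxy_00 = (f_x(0.0, k) - f_x(0.0, -k)) / (2 * k)   # y-derivative of f_x along x = 0
fyx_00 = (f_y(k, 0.0) - f_y(-k, 0.0)) / (2 * k)   # x-derivative of f_y along y = 0
print(fxy_00, fyx_00)   # approximately -1 and 1
```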

An equation for an unknown function $f(x, y)$ which involves partial derivatives with respect to at least two different variables is called a partial differential equation. We abbreviate PDE. If only the derivative with respect to one variable appears, it is an ordinary differential equation, abbreviated ODE.

Here are examples of partial differential equations. You have to know the first four in the same way as a chemist has to know some basic molecules like H2O, CO2, CH4 and NaCl. It is amazing how often they appear also in areas different from physics, like finance (market prediction) or biology (diffusion). These equations can also be looked at on discrete structures like networks.

4 The wave equation $f_{tt}(t, x) = f_{xx}(t, x)$ governs the motion of light or sound. The function $f(t, x) = \sin(x - t) + \sin(x + t)$ satisfies the wave equation.

5 The heat equation $f_t(t, x) = f_{xx}(t, x)$ describes diffusion of heat or spread of an epidemic. The function $f(t, x) = \frac{1}{\sqrt{t}} e^{-x^2/(4t)}$ satisfies the heat equation.

6 The Laplace equation $f_{xx} + f_{yy} = 0$ determines the shape of a membrane. The function $f(x, y) = x^3 - 3xy^2$ is an example satisfying the Laplace equation.

7 The advection equation $f_t = f_x$ is used to model transport in a wire. The function $f(t, x) = e^{-(x+t)^2}$ satisfies the advection equation.

8 The eiconal equation $f_x^2 + f_y^2 = 1$ is used to see the evolution of wave fronts in optics. The function $f(x, y) = \cos(a)\,x + \sin(a)\,y$ satisfies the eiconal equation for any constant $a$.

9 The Burgers equation $f_t + f f_x = f_{xx}$ describes waves at the beach which break. The function $f(t, x) = \frac{x}{t} \cdot \frac{\sqrt{1/t}\, e^{-x^2/(4t)}}{1 + \sqrt{1/t}\, e^{-x^2/(4t)}}$ satisfies the Burgers equation.

10 The KdV equation $f_t + 6 f f_x + f_{xxx} = 0$ models water waves in a narrow channel. The function $f(t, x) = \frac{a^2}{2} \cosh^{-2}\left(\frac{a}{2}(x - a^2 t)\right)$ satisfies the KdV equation.

11 The Schrödinger equation $f_t = \frac{i\hbar}{2m} f_{xx}$ is used to describe a quantum particle of mass $m$. The function $f(t, x) = e^{i(kx - \frac{\hbar}{2m} k^2 t)}$ solves the Schrödinger equation. [Here $i$ is the imaginary unit with $i^2 = -1$ and $\hbar$ is the Planck constant $\hbar \sim 10^{-34}$ Js.]

Here are the graphs of the solutions of the equations. Can you match them with the PDE's?
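Several of the claimed solutions can be checked numerically by replacing the derivatives with difference quotients and evaluating the residual of each PDE at a sample point. The Python sketch below does this for the wave, heat, Laplace and advection examples; the helper `partial`, the sample point and the step size are assumptions of this illustration, and a small residual is numerical evidence rather than a proof.

```python
import math

def partial(f, i, order, p, h=1e-4):
    # Central-difference approximation of the first or second partial
    # derivative of f with respect to argument number i at the point p.
    def g(s):
        q = list(p)
        q[i] = s
        return f(*q)
    s0 = p[i]
    if order == 1:
        return (g(s0 + h) - g(s0 - h)) / (2 * h)
    return (g(s0 + h) - 2 * g(s0) + g(s0 - h)) / h**2

wave = lambda t, x: math.sin(x - t) + math.sin(x + t)
heat = lambda t, x: math.exp(-x**2 / (4 * t)) / math.sqrt(t)
laplace = lambda x, y: x**3 - 3 * x * y**2
advection = lambda t, x: math.exp(-(x + t)**2)

p = (1.0, 0.5)  # sample point (t, x) or (x, y); an arbitrary choice
residuals = {
    "wave: f_tt - f_xx": partial(wave, 0, 2, p) - partial(wave, 1, 2, p),
    "heat: f_t - f_xx": partial(heat, 0, 1, p) - partial(heat, 1, 2, p),
    "Laplace: f_xx + f_yy": partial(laplace, 0, 2, p) + partial(laplace, 1, 2, p),
    "advection: f_t - f_x": partial(advection, 0, 1, p) - partial(advection, 1, 1, p),
}
for name, r in residuals.items():
    print(f"{name}: {r:.2e}")   # each residual should be close to 0
```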

3 Verify that for any constant $b$, the function $f(x, t) = e^{-bt} \cos(x + t)$ satisfies the driven transport equation $f_t(x, t) = f_x(x, t) - b f(x, t)$. This PDE is sometimes called the advection equation with damping $b$.

4 The differential equation

$f_t = f - x f_x - x^2 f_{xx}$

is a version of the infamous Black-Scholes equation. Here $f(x, t)$ is the price of a call option, $x$ is the stock price and $t$ is time. Find a function $f(x, t)$ solving it which depends both on $x$ and $t$. Hint: look first for functions which only involve one variable.

5 The partial differential equation $f_t + f f_x = f_{xx}$ is called the Burgers equation and describes waves at the beach. In higher dimensions, it leads to the Navier-Stokes equations which are used to describe the weather. Verify that

$f(t, x) = \frac{(1/t)^{3/2}\, x\, e^{-x^2/(4t)}}{\sqrt{1/t}\, e^{-x^2/(4t)} + 1}$

solves the Burgers equation.

Remark. You better use technology here. Here is an example of how to check that a function is a solution of the heat equation in Mathematica:

f[t_,x_]:=(1/Sqrt[t])*Exp[-x^2/(4t)];

Simplify[D[f[t,x],t] == D[f[t,x],{x,2}]]

And here is the function from the Burgers problem:

(1/t)^(3/2)*x*Exp[-x^2/(4t)]/((1/t)^(1/2)*Exp[-x^2/(4t)]+1);
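For readers without Mathematica, a rough numerical cross-check is possible in plain Python. The sketch below evaluates the residual $f_t + f f_x - f_{xx}$ of the Burgers equation with central differences; the function names, sample points and step size are choices made here, and a small residual only suggests, but does not prove, that the function is a solution.

```python
import math

def burgers_solution(t, x):
    # Same expression as the Mathematica line above.
    g = math.exp(-x**2 / (4 * t))
    return (1 / t)**1.5 * x * g / ((1 / t)**0.5 * g + 1)

def burgers_residual(f, t, x, h=1e-4):
    # f_t + f f_x - f_xx, with all derivatives replaced by central differences.
    f_t = (f(t + h, x) - f(t - h, x)) / (2 * h)
    f_x = (f(t, x + h) - f(t, x - h)) / (2 * h)
    f_xx = (f(t, x + h) - 2 * f(t, x) + f(t, x - h)) / h**2
    return f_t + f(t, x) * f_x - f_xx

for t, x in [(1.0, 0.5), (2.0, -1.0), (0.5, 0.3)]:
    print(t, x, burgers_residual(burgers_solution, t, x))  # each close to 0
```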


Lecture 10: Linearization

In single variable calculus, you have seen the following notion. As usual, we always assume the functions to be differentiable:

The linear approximation of $f(x)$ at a point $a$ is the affine function

$L(x) = f(a) + f'(a)(x - a)$.

If you have seen Taylor series, this is the part of the series $f(x) = \sum_{k=0}^{\infty} f^{(k)}(a)(x - a)^k/k!$ where only the $k = 0$ and $k = 1$ terms are considered. We think about the linear approximation $L$ as a function and not as a graph because we will also look at linear approximations for functions of three variables, where we can not draw graphs.

[Figure: the graphs of y = f(x) and y = L(x)]

The graph of the function L is close to the graph of f at a. What about higher dimensions?

The linear approximation of $f(x, y)$ at $(a, b)$ is the affine function

$L(x, y) = f(a, b) + f_x(a, b)(x - a) + f_y(a, b)(y - b)$.

The linear approximation of a function $f(x, y, z)$ at $(a, b, c)$ is

$L(x, y, z) = f(a, b, c) + f_x(a, b, c)(x - a) + f_y(a, b, c)(y - b) + f_z(a, b, c)(z - c)$.

Using the gradient

$\nabla f(x, y) = [f_x, f_y]^T$, $\nabla f(x, y, z) = [f_x, f_y, f_z]^T$,

the linearization can be written more compactly as

$L(\vec{x}) = f(\vec{a}) + \nabla f(\vec{a}) \cdot (\vec{x} - \vec{a})$.

How do we justify the linearization? If the second variable $y = b$ is fixed, we have a one-dimensional situation, where the only variable is $x$. Now $L(x, b) = f(a, b) + f_x(a, b)(x - a)$ is the linear approximation of $f(x, b)$. Similarly, if $x = a$ is fixed and $y$ is the single variable, then $L(a, y) = f(a, b) + f_y(a, b)(y - b)$ is the linear approximation of $f(a, y)$. Knowing the linear approximations in both the $x$ and $y$ variables, we can get the general linear approximation $L(x, y) = f(a, b) + f_x(a, b)(x - a) + f_y(a, b)(y - b)$.
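The two-variable formula translates directly into code. The following Python sketch builds $L$ from a function and its gradient; the helper name `linearize` and the example function $f(x, y) = x^2 + 3xy$ are hypothetical and not taken from the lecture.

```python
def linearize(f, grad, a, b):
    # Returns L(x, y) = f(a, b) + f_x(a, b)(x - a) + f_y(a, b)(y - b).
    fa = f(a, b)
    fx, fy = grad(a, b)
    return lambda x, y: fa + fx * (x - a) + fy * (y - b)

# Hypothetical example, not from the lecture: f(x, y) = x^2 + 3xy.
f = lambda x, y: x**2 + 3 * x * y
grad = lambda x, y: (2 * x + 3 * y, 3 * x)   # (f_x, f_y), computed by hand

L = linearize(f, grad, 1.0, 2.0)
print(L(1.1, 1.9), f(1.1, 1.9))   # linear estimate vs. true value
```

Near $(1, 2)$ the two printed values are close, and the gap grows as the evaluation point moves away from the base point.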

7 The quartic surface

$f(x, y, z) = x^4 - x^3 + y^2 + z^2 = 0$

is called the piriform. What is the equation for the tangent plane at the point $P = (2, 2, 2)$ of this pear-shaped surface? We get $[a, b, c]^T = \nabla f(2, 2, 2) = [20, 4, 4]^T$ and so the equation of the plane $20x + 4y + 4z = 56$, where we have obtained the constant on the right by plugging in the point $(x, y, z) = (2, 2, 2)$. (Since $f(2, 2, 2) = 16$, the point $P$ lies on the level surface $f = 16$, and the plane found is tangent to that level surface.)

Remark: In the past, linear approximations were described with differentials. We try to avoid them or at least use them only for intuition. Like Newton's "fluxions", the Leibniz "differentials" are a bit outdated. The differential is a good intuitive notion, but it is also easy to make mistakes with it. The linearization of a function $f$ at a point is just a linear function $L$ in the same number of variables. There is a modern notion of differential forms which however needs some multi-linear algebra in its definition. The notion of infinitesimally small quantities has also been clarified within non-standard analysis, but that theory needs some logic background. The notion of "differentials" comes from a time when calculus was in its early development. To see why the notion "differential" is a bit murky, try to find out what the definition of "differential" is: you find notions like "change in the linearization of a function" or "infinitesimals". Both are "foggy terminology". To add to the confusion, expressions like $dx$ are also called "differentials". They appear in total differentials $df = f_x\,dx + f_y\,dy$, which can make sense as an intuitive shortcut for the chain rule $\frac{d}{dt} f(x(t), y(t)) = f_x(x(t), y(t)) \frac{dx(t)}{dt} + f_y(x(t), y(t)) \frac{dy(t)}{dt}$, as the latter can be multiplied by $dt$ to get the differential expression. (We cover the chain rule later.) The expression $dx$ also appears in integrals $\int_0^\pi \sin(x)\,dx$, but there it is notation to indicate the variable we integrate with respect to. Mathematica for example writes Integrate[Sin[x], {x, 0, Pi}]. Leibniz used $\int_a^b f(x)\,dx$ because it is close to the Riemann sum notation $\sum_i f(x_i)\,dx_i$, in which the $dx_i = x_{i+1} - x_i$ represent differences. The Leibniz notation $\int_a^b f(x)\,dx$ is a shortcut for a limit of Riemann sums. But expressions like $dx$, $dt$ alone are not defined without considerably more theory. As a notation, differentials can be useful in the separation of variables technique: when solving $f' = f$ we write $df/dx = f$, leading to $df/f = dx$. Integration produces $\log(f) = x + c$, leading to $f = e^{x+c}$.

Homework

1 Estimate $1\,000\,000\,000^{1/10}$ using linear approximation of $f(x) = x^{1/10}$ near $x_0 = 8^{10}$.

Source: http://www.cdnpapermoney.com

2 Given $f(x, y) = 3yx/\pi - \cos(x)$. Estimate $f(\pi + 0.01, \pi - 0.03)$ using linearization.

3 Estimate $f(0.001, 0.9999)$ for $f(x, y) = \cos(\pi y) + \sin(x + \pi y)$ using linearization.

4 Find the linear approximation $L(x, y)$ of the function

$f(x, y) = 10 - x^2 - 5y^2$

at $(2, 1)$ and use it to estimate $f(1.95, 1.04)$.

5 Estimate $99^3 \cdot 101^2$ by linearizing the function $f(x, y) = x^3 y^2$ at $(100, 100)$. What is the difference between $L(100, 100)$ and $f(100, 100)$?

From $f(x, y) = 0$, one can express $y$ as a function of $x$, at least near a point where $f_y$ is not zero. From $\frac{d}{dx} f(x, y(x)) = \nabla f \cdot (1, y'(x)) = f_x + f_y y' = 0$, we obtain $y' = -f_x/f_y$. Even though we do not know $y(x)$, we can compute its derivative! Implicit differentiation works also in three variables. The equation $f(x, y, z) = c$ defines a surface. Near a point where $f_z$ is not zero, the surface can be described as a graph $z = z(x, y)$. We can compute the derivative $z_x$ without actually knowing the function $z(x, y)$. To do so, we consider $y$ a fixed parameter and compute, using the chain rule

$f_x(x, y, z(x, y)) \cdot 1 + f_z(x, y, z(x, y))\, z_x(x, y) = 0$

so that $z_x(x, y) = -f_x(x, y, z)/f_z(x, y, z)$. This works at points where $f_z$ is not zero.

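2 The surface $f(x, y, z) = x^2 + y^2/4 + z^2/9 = 6$ is an ellipsoid. Compute $z_x(x, y)$ at the point $(x, y, z) = (2, 1, 1)$. Solution: $z_x(x, y) = -f_x(2, 1, 1)/f_z(2, 1, 1) = -4/(2/9) = -18$. (The point $(2, 1, 1)$ satisfies $f = 157/36$ rather than $f = 6$, so this is the slope on the level surface of $f$ through that point. The formula $z_x = -f_x/f_z$ does not depend on the level.)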
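The formula $z_x = -f_x/f_z$ can be compared with a direct difference quotient of the explicitly solved graph. In the Python sketch below the level value `c` is taken to be $f(2, 1, 1)$ so that the point lies on the surface; this choice and the helper names are assumptions of the illustration.

```python
import math

def f(x, y, z):
    return x**2 + y**2 / 4 + z**2 / 9

def z_x_implicit(x, y, z):
    # z_x = -f_x / f_z with f_x = 2x and f_z = 2z/9.
    return -(2 * x) / (2 * z / 9)

x0, y0, z0 = 2.0, 1.0, 1.0
c = f(x0, y0, z0)  # assumption: level value chosen so that the point is on the surface

def z_explicit(x, y):
    # Upper half of the level surface f = c solved for z.
    return 3 * math.sqrt(c - x**2 - y**2 / 4)

h = 1e-6
numeric = (z_explicit(x0 + h, y0) - z_explicit(x0 - h, y0)) / (2 * h)
print(z_x_implicit(x0, y0, z0), numeric)   # both close to -18
```

In this example the surface can be solved for $z$ by hand, which is what makes the comparison possible; implicit differentiation is most useful when no such explicit formula is available.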

The chain rule is powerful because it implies other differentiation rules like the addition, product and quotient rule in one dimension. With $x = u(t)$, $y = v(t)$:

$f(x, y) = x + y$: $\frac{d}{dt}(x + y) = f_x u' + f_y v' = u' + v'$.

$f(x, y) = xy$: $\frac{d}{dt}(xy) = f_x u' + f_y v' = v u' + u v'$.

$f(x, y) = x/y$: $\frac{d}{dt}(x/y) = f_x u' + f_y v' = u'/v - v' u/v^2$.

As in one dimension, the chain rule follows from linearization. If $f$ is a linear function $f(x, y) = ax + by - c$ and the curve $\vec{r}(t) = [x_0 + tu, y_0 + tv]^T$ parametrizes a line, then $\frac{d}{dt} f(\vec{r}(t)) = \frac{d}{dt}(a(x_0 + tu) + b(y_0 + tv) - c) = au + bv$, and this is the dot product of $\nabla f = [a, b]^T$ with $\vec{r}\,'(t) = [u, v]^T$.

Since the chain rule only refers to the derivatives at the point, and these agree for the functions and their linearizations, the chain rule is also true for general functions.
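The statement $\frac{d}{dt} f(\vec{r}(t)) = \nabla f(\vec{r}(t)) \cdot \vec{r}\,'(t)$ can be checked numerically for a nonlinear function. The Python sketch below uses a hypothetical example, $f(x, y) = x^2 y$ along the unit circle, which is not from the lecture; the gradient and velocity are entered by hand.

```python
import math

# Hypothetical example: f(x, y) = x^2 y along the unit circle r(t) = (cos t, sin t).
f = lambda x, y: x**2 * y
grad_f = lambda x, y: (2 * x * y, x**2)
r = lambda t: (math.cos(t), math.sin(t))
r_prime = lambda t: (-math.sin(t), math.cos(t))

def chain_rule(t):
    # Gradient of f at r(t) dotted with the velocity r'(t).
    gx, gy = grad_f(*r(t))
    vx, vy = r_prime(t)
    return gx * vx + gy * vy

def difference_quotient(t, h=1e-6):
    # Direct central difference of the composite function t -> f(r(t)).
    return (f(*r(t + h)) - f(*r(t - h))) / (2 * h)

t0 = 0.7
print(chain_rule(t0), difference_quotient(t0))   # the two values should agree closely
```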

Homework

1 You know that $\frac{d}{dt} f(\vec{r}(t)) = 10$ at $t = 0$ if $\vec{r}(t) = [t, t]^T$, and $\frac{d}{dt} f(\vec{r}(t)) = 18$ at $t = 0$ if $\vec{r}(t) = [t, -t]^T$. Find the gradient of $f$ at $(0, 0)$.

2 The pressure in space at the position $(x, y, z)$ is $p(x, y, z) = x^2 + y^2 - z^3$ and the trajectory of an observer is the curve $\vec{r}(t) = [t, t, 1/t]^T$. Using the chain rule, compute the rate of change of the pressure the observer measures at time $t = 2$.

3 The chain rule is closely related to linearization. Let's get back to linearization a bit:

A farm costs $f(x, y)$, where $x$ is the number of cows and $y$ is the number of ducks. There are 10 cows and 20 ducks and $f(10, 20) = 1000000$. We know that $f_x(x, y) = 2x$ and $f_y(x, y) = y^2$ for all $x, y$. Estimate $f(12, 19)$.

P.S. In the fall of 2013, Oliver made a song out of this:

”Old MacDonald had a million dollar farm, E-I-E-I-O,

and on that farm he had x = 10 cows, E-I-E-I-O,

and on that farm he had y = 20 ducks, E-I-E-I-O,

with fx = 2x here and fy = y^2 there,

and here two cows more, and there a duck less,

how much does the farm cost now, E-I-E-I-O?”

4 Find, using implicit differentiation, the derivative $\frac{d}{dx} \mathrm{arctanh}(x)$, where $\tanh(x) = \sinh(x)/\cosh(x)$.

The hyperbolic sine and hyperbolic cosine are defined as


Lecture 12: Gradient

The gradient of a function $f(x, y)$ is defined as

$\nabla f(x, y) = [f_x(x, y), f_y(x, y)]^T$.

For functions of three variables, define

$\nabla f(x, y, z) = [f_x(x, y, z), f_y(x, y, z), f_z(x, y, z)]^T$.

The symbol $\nabla$ is spelled "Nabla" and named after an Egyptian or Assyrian harp. Early on, the name "Atled" was suggested. But the 1901 textbook of Gibbs, which used Nabla, was too persuasive. Here is a very important fact, which is true in any dimension. We only formulate it in dimension 2:

Gradient Theorem: $\nabla f(x_0, y_0)$ is perpendicular to the level curve $\{(x, y) \mid f(x, y) = c\}$ containing $(x_0, y_0)$.

Proof. Every curve $\vec{r}(t)$ on the level curve or level surface satisfies $\frac{d}{dt} f(\vec{r}(t)) = 0$. By the chain rule, this derivative is $\nabla f(\vec{r}(t)) \cdot \vec{r}\,'(t)$, so $\nabla f(\vec{r}(t))$ is perpendicular to the tangent vector $\vec{r}\,'(t)$. QED.

Because $\vec{n} = \nabla f(p, q) = [a, b]^T$ is perpendicular to the level curve $f(x, y) = c$ through $(p, q)$, the equation for the tangent line is $ax + by = d$ with $a = f_x(p, q)$, $b = f_y(p, q)$, $d = ap + bq$. Compactly written, this is

$\nabla f(\vec{x}_0) \cdot (\vec{x} - \vec{x}_0) = 0$

and means that the gradient of $f$ is perpendicular to any vector $(\vec{x} - \vec{x}_0)$ in the tangent line or tangent plane. It is one of the most important statements in multivariable calculus as it gives a crucial link between calculus and geometry. The gradient theorem is also useful in practice: we can immediately compute tangent planes and tangent lines, without linearization!

1 Compute the tangent plane to the surface $3x^2 y + z^2 - 4 = 0$ at the point $(1, 1, 1)$. Solution: $\nabla f(x, y, z) = [6xy, 3x^2, 2z]^T$ and $\nabla f(1, 1, 1) = [6, 3, 2]^T$. The plane is $6x + 3y + 2z = d$ where $d$ is a constant. We can find the constant $d$ by plugging in a point and get $6x + 3y + 2z = 11$.
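The recipe of this example (take the gradient as normal vector, then plug in the point to get the constant) can be written as a small function. In the Python sketch below the name `tangent_plane` is a hypothetical helper and the gradient is entered by hand.

```python
def tangent_plane(grad, point):
    # Returns (a, b, c, d) for the plane a x + b y + c z = d through `point`
    # with normal vector grad(point).
    a, b, c = grad(*point)
    x0, y0, z0 = point
    return a, b, c, a * x0 + b * y0 + c * z0

# Gradient of f(x, y, z) = 3 x^2 y + z^2 - 4, entered by hand.
grad_f = lambda x, y, z: (6 * x * y, 3 * x**2, 2 * z)

print(tangent_plane(grad_f, (1, 1, 1)))   # (6, 3, 2, 11)
```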

2 Problem: reflect the ray $\vec{r}(t) = [1 - t, -t, 1]^T$ at the surface

$x^4 + y^2 + z^6 = 6$.

Solution: $\vec{r}(t)$ hits the surface at the time $t = 2$ in the point $(-1, -2, 1)$. The velocity vector of that ray is $\vec{v} = [-1, -1, 0]^T$. The normal vector at this point is $\nabla f(-1, -2, 1) = [-4, -4, 6]^T = \vec{n}$. The reflected vector is $R(\vec{v}) = \vec{v} - 2\,\mathrm{Proj}_{\vec{n}}(\vec{v})$. We have $\mathrm{Proj}_{\vec{n}}(\vec{v}) = (8/68)[-4, -4, 6]^T$. Therefore, the reflected ray has velocity $\vec{w} = [-1, -1, 0]^T - (4/17)[-4, -4, 6]^T$.

If $f$ is a function of several variables and $\vec{v}$ is a unit vector, then $D_{\vec{v}} f = \nabla f \cdot \vec{v}$ is called the directional derivative of $f$ in the direction $\vec{v}$.

The name "directional derivative" is related to the fact that every unit vector gives a direction. If $\vec{v}$ is a unit vector, then the chain rule tells us $D_{\vec{v}} f(\vec{x}) = \frac{d}{dt} f(\vec{x} + t\vec{v})$ at $t = 0$.

The directional derivative tells us how the function changes when we move in a given direction. Assume for example that $T(x, y, z)$ is the temperature at position $(x, y, z)$. If we move with velocity $\vec{v}$ through space, then $D_{\vec{v}} T$ tells us at which rate the temperature changes for us. If we move with velocity $\vec{v}$ on a hilly surface of height $h(x, y)$, then $D_{\vec{v}} h(x, y)$ gives us the slope we drive on.
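The two descriptions of the directional derivative, as the dot product $\nabla f \cdot \vec{v}$ and as the rate of change of $f$ along the line through the point in direction $\vec{v}$, can be compared numerically. The temperature field in the Python sketch below is a hypothetical example chosen for illustration, as are the helper names.

```python
import math

# Hypothetical temperature field and its gradient, chosen for illustration.
T = lambda x, y, z: x * y + z**2
grad_T = lambda x, y, z: (y, x, 2 * z)

def directional_derivative(grad, p, v):
    # D_v f = grad f(p) . v, where v is assumed to be a unit vector.
    return sum(g * vi for g, vi in zip(grad(*p), v))

def slope_along(f, p, v, h=1e-6):
    # Central difference of t -> f(p + t v) at t = 0.
    forward = f(*(pi + h * vi for pi, vi in zip(p, v)))
    backward = f(*(pi - h * vi for pi, vi in zip(p, v)))
    return (forward - backward) / (2 * h)

p = (1.0, 2.0, 3.0)
v = (1 / math.sqrt(3),) * 3          # unit vector in the direction (1, 1, 1)
print(directional_derivative(grad_T, p, v), slope_along(T, p, v))
```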

3 If $\vec{r}(t)$ is a curve with velocity $\vec{r}\,'(t)$ and the speed is 1, then $D_{\vec{r}\,'(t)} f = \nabla f(\vec{r}(t)) \cdot \vec{r}\,'(t)$ is the temperature change one measures at $\vec{r}(t)$. The chain rule told us that this is $\frac{d}{dt} f(\vec{r}(t))$.

4 For $\vec{v} = [1, 0, 0]^T$, we have $D_{\vec{v}} f = \nabla f \cdot \vec{v} = f_x$. The directional derivative is therefore a generalization of the partial derivatives. It measures the rate of change of $f$ if we walk with unit speed in that direction. But as with partial derivatives, it is a scalar.

The directional derivative satisfies $|D_{\vec{v}} f| \le |\nabla f||\vec{v}|$ because $|\nabla f \cdot \vec{v}| = |\nabla f||\vec{v}||\cos(\phi)| \le |\nabla f||\vec{v}|$, where $\phi$ is the angle between $\nabla f$ and $\vec{v}$.

The direction $\vec{v} = \nabla f/|\nabla f|$ is the direction where $f$ increases most. It is the direction of steepest ascent.
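A brute-force scan over unit directions illustrates this: the largest directional derivative is attained in the direction of the gradient and equals $|\nabla f|$. The gradient values in the Python sketch below come from a hypothetical example, $f(x, y) = x^2 + 3xy$ at the point $(1, 2)$, and the number of sampled directions is an arbitrary choice.

```python
import math

# Hypothetical example: gradient of f(x, y) = x^2 + 3xy at the point (1, 2).
grad = (2 * 1 + 3 * 2, 3 * 1)          # (f_x, f_y) = (8, 3)
norm = math.hypot(*grad)

def D(angle):
    # Directional derivative in the unit direction (cos(angle), sin(angle)).
    return grad[0] * math.cos(angle) + grad[1] * math.sin(angle)

# Sample 3600 unit directions and pick the one with the largest D.
angles = [2 * math.pi * k / 3600 for k in range(3600)]
best = max(angles, key=D)

print(D(best), norm)                               # maximal slope vs. |grad f|
print(best, math.atan2(grad[1], grad[0]))          # best angle vs. angle of grad f
```

Because only finitely many directions are sampled, the best angle matches the angle of the gradient only up to the grid spacing.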

3 Assume $f(x, y) = 1 - x^2 + y^2$. Compute the directional derivative $D_{\vec{v}} f(x, y)$ at $(0, 0)$, where $\vec{v} = [\cos(t), \sin(t)]^T$ is a unit vector. Now compute

$D_{\vec{v}} D_{\vec{v}} f(x, y)$

at $(0, 0)$, for any unit vector. For which directions is this second directional derivative positive?

4 The Kitchen-Rosenberg formula gives the curvature of a level curve $f(x, y) = c$ as

$\kappa = \frac{f_{xx} f_y^2 - 2 f_{xy} f_x f_y + f_{yy} f_x^2}{(f_x^2 + f_y^2)^{3/2}}$

Use this formula to find the curvature of the ellipse $f(x, y) = x^2 + 2y^2 = 1$ at the point $(1, 0)$.

This formula is useful in computer vision. If you want to derive the formula, you can check that the angle

$g(x, y) = \arctan(f_y/f_x)$

of the gradient vector has $\kappa$ as the directional derivative in the direction $\vec{v} = [-f_y, f_x]^T / \sqrt{f_x^2 + f_y^2}$ tangent to the curve.

5 One can find the maximum of a function numerically by moving in the direction of the gradient. This is called the steepest ascent method. You start at a point $(x_0, y_0)$, then move in the direction of the gradient for some time $c$ to be at $(x_1, y_1) = (x_0, y_0) + c\nabla f(x_0, y_0)$. Repeat to get $(x_2, y_2) = (x_1, y_1) + c\nabla f(x_1, y_1)$, etc. It can be a bit difficult if the function has a flat ridge like in the Rosenbrock function

$f(x, y) = 1 - (1 - x)^2 - 100(y - x^2)^2$.

Plot the contour map of this function on $-0.6 \le x \le 1$, $-0.1 \le y \le 1.1$, and then find the directional derivative at $(1/5, 0)$ in the direction (1, 1)/