Math Notes on differentials, the Chain Rule, gradients, directional derivative, and normal vectors


Math 18.02 Notes on differentials, the Chain Rule, gradients, directional derivative, and normal vectors

Tangent plane and linear approximation

We define the partial derivatives of $f(x, y)$ as follows:

$$\frac{\partial f}{\partial x} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x, y) - f(x, y)}{\Delta x} = f_x(x, y)$$ is the partial derivative of $f$ with respect to $x$

$$\frac{\partial f}{\partial y} = \lim_{\Delta y \to 0} \frac{f(x, y + \Delta y) - f(x, y)}{\Delta y} = f_y(x, y)$$ is the partial derivative of $f$ with respect to $y$

We also showed that if $f(x, y)$ is differentiable at a point $(x_0, y_0)$, then its approximating tangent plane will have equation

$$z = f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0).$$

That is, for $(x, y)$ sufficiently near $(x_0, y_0)$, the linear approximation

$$f(x, y) \approx f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0)$$

will be valid. This can be expressed in several alternative forms. For example, if we write this as

$$f(x, y) - f(x_0, y_0) \approx f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0)$$

and let $\Delta f = f(x, y) - f(x_0, y_0)$, $\Delta x = x - x_0$ and $\Delta y = y - y_0$, then using the alternate notation $\frac{\partial f}{\partial x} = f_x(x_0, y_0)$ and $\frac{\partial f}{\partial y} = f_y(x_0, y_0)$ we can write:

$$\Delta f \approx \frac{\partial f}{\partial x}\Delta x + \frac{\partial f}{\partial y}\Delta y$$ (increment form of linear approximation)

For functions of $n$ variables $f(x_1, \ldots, x_n)$, this construction extends to the more general statement:

$$\Delta f \approx \frac{\partial f}{\partial x_1}\Delta x_1 + \cdots + \frac{\partial f}{\partial x_n}\Delta x_n$$

where the partial derivative $\frac{\partial f}{\partial x_i}$ now presumes that all other variables are treated as though constant when taking this derivative. In particular, for a function of three variables $f(x, y, z)$ we would have that:

$$\Delta f \approx \frac{\partial f}{\partial x}\Delta x + \frac{\partial f}{\partial y}\Delta y + \frac{\partial f}{\partial z}\Delta z$$

This increment notation can be idealized by considering only the difference in the values of the function as we remain on the approximating tangent plane (or higher order analogue for functions of more than two variables). In this case we identify $dx = \Delta x$ and $dy = \Delta y$ (and $dz = \Delta z$ for a function of three variables) and we express the differential as:

$$df = \frac{\partial f}{\partial x}dx + \frac{\partial f}{\partial y}dy$$ for a function $f(x, y)$, or $$df = \frac{\partial f}{\partial x}dx + \frac{\partial f}{\partial y}dy + \frac{\partial f}{\partial z}dz$$ for a function $f(x, y, z)$.

The differential notation and the increment notation are basically capturing the same idea, with the differential giving the linear approximation for the actual incremental change in the values of the given function as the independent variables in the domain are tweaked.
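As a small numerical illustration of the paragraph above, the following sketch compares the actual increment $\Delta f$ with the differential $df$. The sample function `f(x, y) = x**2 * y`, the base point, and the step sizes are assumptions chosen only for illustration; they do not come from the notes.

```python
def f(x, y):
    # Sample function chosen for illustration (an assumption, not from the notes).
    return x**2 * y

def differential(x0, y0, dx, dy):
    # df = f_x dx + f_y dy, with the partials of this particular f worked out by hand.
    fx = 2 * x0 * y0   # partial derivative with respect to x
    fy = x0**2         # partial derivative with respect to y
    return fx * dx + fy * dy

x0, y0 = 2.0, 1.0        # base point (hypothetical)
dx, dy = 0.01, -0.02     # small changes in the inputs (hypothetical)

actual = f(x0 + dx, y0 + dy) - f(x0, y0)   # the true increment, Delta f
approx = differential(x0, y0, dx, dy)      # the differential, df

print("actual increment:", actual)
print("differential    :", approx)
```

The two printed values should agree closely, and the gap shrinks faster than the step sizes as `dx` and `dy` are made smaller, which is what it means for the tangent plane to be a good linear approximation.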

Increments (or differentials) and error estimation

If we have some derived quantity with values expressed in terms of other quantities that are directly measured within some known degree of error, we can use differentials to estimate the error in the derived quantity. The simplest example is in the calculation of the area of a rectangle from the lengths of its sides. In this case, we write $A = xy$, but it must be noted that $A(x, y) = xy$ is, in fact, a function of two independent variables. Therefore if the measured value of $x$ is only known within an error $\Delta x$ and the measured value of $y$ is only known within an error of $\Delta y$, then the incremental change in area associated with these changes will be

$$\Delta A \approx \frac{\partial A}{\partial x}\Delta x + \frac{\partial A}{\partial y}\Delta y = y\,\Delta x + x\,\Delta y.$$

If the measured error is expressed in terms of relative error (also called percent error when expressed as a percentage), we can divide to get

$$\frac{\Delta A}{A} \approx \frac{y\,\Delta x + x\,\Delta y}{xy} = \frac{\Delta x}{x} + \frac{\Delta y}{y}.$$

That is, for a quantity derived from two directly measured quantities by multiplication, we add the relative errors to estimate the relative error in the product. This kind of analysis can be applied to any derived quantity expressed as a differentiable function of directly measured quantities.

Rate of change of a function along a parameterized curve

Suppose a function $f(x, y)$ is given and we think about how its values are distributed by drawing level curves of this function in $\mathbb{R}^2$ (see diagram). Let's also consider a parameterized curve described by $\mathbf{r}(t) = \langle x(t), y(t) \rangle$. We can then ask the question: What is the rate of change of the function $f(x, y)$ as we travel along this parameterized curve? If we think of this as a composition, we have:

$$t \mapsto (x(t), y(t)) \mapsto f(x(t), y(t))$$

This is a function from $\mathbb{R}$ to $\mathbb{R}^2$ to $\mathbb{R}$ and we wish to calculate $\frac{d}{dt}\left[f(x(t), y(t))\right]$. For brevity we express this more simply as $\frac{df}{dt}$, but it must be understood that this rate is being calculated along a particular parameterized curve.

We can relate incremental changes in the values of this function as $\Delta f \approx \frac{\partial f}{\partial x}\Delta x + \frac{\partial f}{\partial y}\Delta y$, so if we simply divide through by $\Delta t$ we see that

$$\frac{\Delta f}{\Delta t} \approx \frac{\partial f}{\partial x}\frac{\Delta x}{\Delta t} + \frac{\partial f}{\partial y}\frac{\Delta y}{\Delta t},$$

and if we let $\Delta t \to 0$ we get that

$$\frac{df}{dt} = \lim_{\Delta t \to 0}\frac{\Delta f}{\Delta t} = \frac{\partial f}{\partial x}\lim_{\Delta t \to 0}\frac{\Delta x}{\Delta t} + \frac{\partial f}{\partial y}\lim_{\Delta t \to 0}\frac{\Delta y}{\Delta t} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt}.$$

That is,

$$\frac{d}{dt}\left[f(x(t), y(t))\right] = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt}.$$

This is the Basic Chain Rule.
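The Basic Chain Rule can be checked numerically. The sketch below is only an illustration under assumed choices: the function `f(x, y) = x**2 * y`, the unit-circle curve `(cos t, sin t)`, and the sample time `t0` are all hypothetical and not taken from the notes. It compares the chain-rule expression with a central-difference estimate of the derivative of the composite function.

```python
import math

def f(x, y):
    # Sample function (an assumption for illustration).
    return x**2 * y

def curve(t):
    # Sample parameterized curve r(t) = <cos t, sin t> (an assumption).
    return math.cos(t), math.sin(t)

def chain_rule_rate(t):
    # df/dt = f_x * dx/dt + f_y * dy/dt, with all pieces computed by hand.
    x, y = curve(t)
    fx, fy = 2 * x * y, x**2
    dxdt, dydt = -math.sin(t), math.cos(t)
    return fx * dxdt + fy * dydt

def numeric_rate(t, h=1e-6):
    # Central difference applied directly to the composite t -> f(x(t), y(t)).
    return (f(*curve(t + h)) - f(*curve(t - h))) / (2 * h)

t0 = 0.7  # arbitrary sample time
print("chain rule :", chain_rule_rate(t0))
print("numerical  :", numeric_rate(t0))
```

The two values should match to many decimal places; any remaining difference comes from the finite step `h` and floating-point rounding.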

Note that this expression for this rate of change is given in the form of a dot product. Specifically, we can write

$$\frac{df}{dt} = \left\langle \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right\rangle \cdot \left\langle \frac{dx}{dt}, \frac{dy}{dt} \right\rangle.$$

The 2nd factor should be familiar as the velocity vector $\mathbf{v}$ to this parameterized curve (at whatever point we are passing through at time $t$). The 1st factor is something new known as the gradient vector of the function $f(x, y)$. It gives a vector at every point $(x, y)$. [This is known as a vector field.] We denote the gradient by $\nabla f = \left\langle \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right\rangle$. With this we can say that $\frac{df}{dt} = \nabla f \cdot \mathbf{v}$.

The same construction can be done with a differentiable function $f(x, y, z)$ and a parameterized curve $\mathbf{r}(t) = \langle x(t), y(t), z(t) \rangle$ to give the rate of change

$$\frac{d}{dt}\left[f(x(t), y(t), z(t))\right] = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} + \frac{\partial f}{\partial z}\frac{dz}{dt}.$$

If we define $\nabla f = \left\langle \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \right\rangle$, we can then again write that $\frac{df}{dt} = \nabla f \cdot \mathbf{v}$ as the Basic Chain Rule in this context.

Geometry of the gradient vector

We can best understand the direction of the gradient vector as well as its magnitude by considering particular paths. With this we'll see that its direction is always in the direction of steepest increase in the values of the function. Furthermore, it must therefore always be perpendicular to the level sets of the given function. This observation will give us a simple way of producing normal vectors to curves and surfaces. Here's how we determine these facts:

(1) Choose any parametrized path that lies entirely within a level set. By the Basic Chain Rule (in vector form) we see that $\frac{df}{dt} = \nabla f \cdot \mathbf{v} = 0$ along any such path because the values of the function are constant. But $\nabla f \cdot \mathbf{v} = 0$ then means that either $\nabla f \perp \mathbf{v}$ (and therefore $\nabla f$ must be perpendicular to the level set) or possibly that $\nabla f$ vanishes completely (which we will soon identify as indicative of a critical point). So as long as the gradient is nonzero it will be perpendicular to the level sets of the function.

(2) Choose any parametrized path along which the values of the given function are increasing. Then the rate of change will be positive, i.e. $\frac{df}{dt} = \nabla f \cdot \mathbf{v} > 0$. This means that the angle between $\nabla f$ and $\mathbf{v}$ must be an acute angle.
Referring to the previous diagram we see that $\nabla f$ must, in fact, be pointing in the direction of (steepest) increasing values.

(3) What about the magnitude $|\nabla f|$? For this, let's introduce one more definition. Note that the rate $\frac{df}{dt} = \nabla f \cdot \mathbf{v}$ depends very much on the speed with which the parameterized curve is traversed. We can eliminate this bias by instead tracking the rate of change per distance traveled. To do this, we need only assume that we travel at unit speed, but this also means that time elapsed and distance traveled will be the same. If we express the (unit) velocity vector as $\mathbf{u}$ and note that the rate can now be expressed as $\frac{df}{ds}$, we get the following definition:

Definition: The directional derivative of a function $f$ at a point (position) $\mathbf{x}_0$ in the direction of the unit vector $\mathbf{u}$ is the scalar quantity $(D_{\mathbf{u}} f)(\mathbf{x}_0) = \frac{df}{ds} = \nabla f(\mathbf{x}_0) \cdot \mathbf{u}$.

Note that the directional derivative depends not only on where you are but also the direction in which you go. Indeed, it's just the scalar projection of the gradient in any given direction. In fact, it's easy to see that for any function the partial derivative $\frac{\partial f}{\partial x}$ coincides with the directional derivative in the $x$-direction, and the partial derivative $\frac{\partial f}{\partial y}$ coincides with the directional derivative in the $y$-direction, etc. Indeed, the directional derivative can be viewed as a generalization of the idea of the partial derivative in any direction you please.

Getting back to the meaning of $|\nabla f|$, note that if we choose a unit vector in the gradient direction (the direction of steepest increase), the directional derivative in that direction will then be $|\nabla f|$. That is, $|\nabla f|$ can be interpreted as the maximum rate of change of the given function per distance traveled. In the real world, this is what we might generally refer to as the grade. A steep hill, for example, would be indicated by a large gradient vector.

Gradients and normal vectors

We can exploit the fact that at any point where the gradient of a function $f$ is nonzero it will necessarily be perpendicular to the level set of $f$ passing through that point. Given any curve in $\mathbb{R}^2$ expressed in the form $F(x, y) = \text{constant}$, or any surface in $\mathbb{R}^3$ expressed in the form $F(x, y, z) = \text{constant}$, this gives us a remarkably simple way to produce a normal vector to the given curve or surface that can then be used to produce equations for tangent lines or tangent planes.

Example 1: Any circle in $\mathbb{R}^2$ with equation $x^2 + y^2 = R^2$ can be viewed as a level set (contour) for the function $F(x, y) = x^2 + y^2$. Its gradient vector is $\nabla F = \langle 2x, 2y \rangle = 2\langle x, y \rangle$ which at any point $(x, y)$ is just twice the radial vector and is clearly perpendicular to the circle at that point.

Example 2: Any plane with equation of the form $Ax + By + Cz = D$ can be viewed as a level surface of the linear function $F(x, y, z) = Ax + By + Cz$. The gradient of this function is $\nabla F = \langle A, B, C \rangle = \mathbf{n}$, so we see that the gradient gives us back a normal vector to this plane.

Example 3: Find the distance between the two parallel planes $3x + 4y + 12z = 100$ and $3x + 4y + 12z = 200$.
Solution: We immediately see that with $F(x, y, z) = 3x + 4y + 12z$, its gradient vector is $\nabla F = \langle 3, 4, 12 \rangle$ which is in the direction of steepest increase of this function, i.e. in a direction perpendicular to these parallel planes. If we interpret $|\nabla F| = \sqrt{9 + 16 + 144} = \sqrt{169} = 13$ as the directional derivative in this normal direction and understand this to mean the (constant) rate of change of this function per distance travelled in this direction, we can say that $|\nabla F| \cdot (\text{distance}) = \Delta F$. That is, the rate of change per distance times the distance will give the net change in the values of this function. But we know that $\Delta F = 200 - 100 = 100$, so $13 \cdot (\text{distance}) = 100$ and the distance is therefore $\frac{100}{13}$. We could also have found this distance using scalar projection.

Example 4: Find an equation for the tangent line to the curve $x^2 y + 2xy^3 = 8$ at the point $(x_0, y_0) = (2, 1)$.

Solution: You may first want to verify that this point actually lies on this curve by checking that $(x_0, y_0) = (2, 1)$ satisfies the given equation. [It does.] In a single-variable Calculus course you would probably go about finding an expression for $\frac{dy}{dx}$ using implicit differentiation. That calculation might have gone something like this (where we think of $y = y(x)$ without explicitly solving for it):

$$x^2 y + 2xy^3 = 8 \;\Rightarrow\; \frac{d}{dx}\left[x^2 y + 2xy^3\right] = \frac{d}{dx}[8] \;\Rightarrow\; x^2\frac{dy}{dx} + 2xy + 6xy^2\frac{dy}{dx} + 2y^3 = 0$$

$$\Rightarrow\; (x^2 + 6xy^2)\frac{dy}{dx} + (2xy + 2y^3) = 0 \;\Rightarrow\; \frac{dy}{dx} = -\frac{2xy + 2y^3}{x^2 + 6xy^2} \;\Rightarrow\; \left.\frac{dy}{dx}\right|_{(2,1)} = -\frac{6}{16} = -\frac{3}{8}$$

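The tangent line will therefore have equation $y - 1 = -\frac{3}{8}(x - 2)$.

We can instead note that this curve is a level set for the function $F(x, y) = x^2 y + 2xy^3$. Its gradient is easily calculated to be $\nabla F(x, y) = \langle 2xy + 2y^3, \; x^2 + 6xy^2 \rangle$ and $\nabla F(2, 1) = \langle 6, 16 \rangle = 2\langle 3, 8 \rangle$. So a normal vector to this curve is the vector $\langle 3, 8 \rangle$. The normal line will therefore have slope $\frac{8}{3}$, and the tangent line will have slope given by the negative reciprocal, i.e. $-\frac{3}{8}$, which agrees with the previous method. We could also have obtained an equation for the tangent line using $\mathbf{n} \cdot (\mathbf{x} - \mathbf{x}_0) = 0$. This gives us $\langle 3, 8 \rangle \cdot \langle x - 2, y - 1 \rangle = 0$ or $3(x - 2) + 8(y - 1) = 0$ which is equivalent to the previous result.

Example 5: Find an equation for the tangent plane to the surface defined by the equation $xy + x^2 y z^2 = 2 - 2xz$ at the point $(x_0, y_0, z_0) = (1, 2, -1)$.

Solution: Should we solve for $z = f(x, y)$ and use our previous form for the equation of the tangent plane? Even if we could, the calculation is unnecessary as long as we can express the given surface in the form of a level set. This is always easy to do by simply transposing any variables to one side of the equation. In this example, we rewrite the equation as $xy + x^2 y z^2 + 2xz = 2$ and let $F(x, y, z) = xy + x^2 y z^2 + 2xz$. The surface is then just the $F = 2$ level set. Its gradient at any point is $\nabla F = \langle y + 2xyz^2 + 2z, \; x + x^2 z^2, \; 2x^2 yz + 2x \rangle$, but we only need the fact that $\nabla F(1, 2, -1) = \langle 4, 2, -2 \rangle = 2\langle 2, 1, -1 \rangle$. Therefore $\mathbf{n} = \langle 2, 1, -1 \rangle$ provides a normal vector to the surface at this point and hence a normal vector to the tangent plane at this point. We can then use $\mathbf{n} \cdot (\mathbf{x} - \mathbf{x}_0) = 0$ to get the equation of the tangent plane, i.e. $\langle 2, 1, -1 \rangle \cdot \langle x - 1, y - 2, z + 1 \rangle = 0$ or $2(x - 1) + (y - 2) - (z + 1) = 0$. You can also express this as $2x + y - z = 5$.

The General Chain Rule

The chain rule is an algebraic rule that describes how to calculate rates of change of functions built from other functions through composition. For example, in a first semester calculus course we learn that if $y = y(u)$ and $u = u(x)$, then we can calculate $\frac{dy}{dx}$ by the chain rule: $\frac{dy}{dx} = \frac{dy}{du}\frac{du}{dx}$. In a multivariable setting, we might have $z = z(x, y)$ and $x = x(t)$, $y = y(t)$. We then have $\frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt}$, the basic chain rule.

The chain rule gets more interesting when you apply it to situations where there are more input variables and output variables. For example, let us suppose we have a situation where there are two parameters, $\varphi$ and $\theta$, and that for any $\varphi$ and $\theta$ we have equations giving $x = x(\varphi, \theta)$, $y = y(\varphi, \theta)$, $z = z(\varphi, \theta)$. Let us further suppose that for any choices of the variables $x$, $y$, and $z$ we have two other variables, $u$ and $v$, defined by equations $u = u(x, y, z)$, $v = v(x, y, z)$. In this case we can think of this functionally as

$$(\varphi, \theta) \xrightarrow{\;G\;} (x, y, z) \xrightarrow{\;F\;} (u, v).$$

In this context, the general chain rule gives that

$$\frac{\partial u}{\partial \varphi} = \frac{\partial u}{\partial x}\frac{\partial x}{\partial \varphi} + \frac{\partial u}{\partial y}\frac{\partial y}{\partial \varphi} + \frac{\partial u}{\partial z}\frac{\partial z}{\partial \varphi}, \qquad \frac{\partial u}{\partial \theta} = \frac{\partial u}{\partial x}\frac{\partial x}{\partial \theta} + \frac{\partial u}{\partial y}\frac{\partial y}{\partial \theta} + \frac{\partial u}{\partial z}\frac{\partial z}{\partial \theta}$$

$$\frac{\partial v}{\partial \varphi} = \frac{\partial v}{\partial x}\frac{\partial x}{\partial \varphi} + \frac{\partial v}{\partial y}\frac{\partial y}{\partial \varphi} + \frac{\partial v}{\partial z}\frac{\partial z}{\partial \varphi}, \qquad \frac{\partial v}{\partial \theta} = \frac{\partial v}{\partial x}\frac{\partial x}{\partial \theta} + \frac{\partial v}{\partial y}\frac{\partial y}{\partial \theta} + \frac{\partial v}{\partial z}\frac{\partial z}{\partial \theta}$$

These can be organized into a statement about the Jacobian matrices of the two functions and of their composition. A Jacobian matrix may be thought of simply as an array of (partial) derivatives of the various output variables with respect to the various input variables, where the outputs are listed from top to bottom and the inputs are listed from left to right. If you know about matrix multiplication, we have

$$\begin{pmatrix} \frac{\partial u}{\partial \varphi} & \frac{\partial u}{\partial \theta} \\ \frac{\partial v}{\partial \varphi} & \frac{\partial v}{\partial \theta} \end{pmatrix} = \begin{pmatrix} \frac{\partial u}{\partial x} & \frac{\partial u}{\partial y} & \frac{\partial u}{\partial z} \\ \frac{\partial v}{\partial x} & \frac{\partial v}{\partial y} & \frac{\partial v}{\partial z} \end{pmatrix} \begin{pmatrix} \frac{\partial x}{\partial \varphi} & \frac{\partial x}{\partial \theta} \\ \frac{\partial y}{\partial \varphi} & \frac{\partial y}{\partial \theta} \\ \frac{\partial z}{\partial \varphi} & \frac{\partial z}{\partial \theta} \end{pmatrix}$$

or, more succinctly, $J_{F \circ G} = J_F J_G$.

To picture what this is telling us, let's specifically look at the situation where $\varphi$ and $\theta$ represent latitude and longitude with the minor change that latitude will be measured from the north pole as 0°, the equator as 90°, and the south pole as 180°. We can then describe a sphere of radius $R$ by the parametric equations

$$x = R\cos\theta\sin\varphi, \qquad y = R\sin\theta\sin\varphi, \qquad z = R\cos\varphi.$$

[We'll derive these later when we look at spherical coordinates in detail.] Let us further suppose that the variables $u$ and $v$ measure, for example, temperature and barometric pressure at any point $(x, y, z)$ in $\mathbb{R}^3$ and, in particular, at points on this parametrized sphere in $\mathbb{R}^3$.

[Diagram with axes $\varphi$, $\theta$; $x$, $y$, $z$; and $u$, $v$.]

We might ask questions about how temperature would vary as we change latitude or longitude, or how barometric pressure would vary as we change latitude or longitude. These are the quantities in the Jacobian matrix

$$J_{F \circ G} = \begin{pmatrix} \frac{\partial u}{\partial \varphi} & \frac{\partial u}{\partial \theta} \\ \frac{\partial v}{\partial \varphi} & \frac{\partial v}{\partial \theta} \end{pmatrix}.$$

The rows of the Jacobian matrix

$$J_F = \begin{pmatrix} \frac{\partial u}{\partial x} & \frac{\partial u}{\partial y} & \frac{\partial u}{\partial z} \\ \frac{\partial v}{\partial x} & \frac{\partial v}{\partial y} & \frac{\partial v}{\partial z} \end{pmatrix}$$

are just the gradient vectors (in $\mathbb{R}^3$) of the temperature and barometric pressure functions. (Note that these are functions defined on $\mathbb{R}^3$ and not just on the spherical surface.)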
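The matrix form of the general chain rule can be checked numerically on the sphere parametrization above. The sketch below is only an illustration: the radius `R`, the sample point, and in particular the formulas chosen for the "temperature" `u` and "pressure" `v` are hypothetical, since the notes do not specify them. It multiplies hand-computed Jacobians $J_F$ and $J_G$ and compares the product with finite-difference derivatives of the composite map.

```python
import math

R = 2.0  # sphere radius (arbitrary choice for illustration)

def G(phi, theta):
    # Sphere parametrization: phi measured from the north pole, theta is longitude.
    return (R * math.cos(theta) * math.sin(phi),
            R * math.sin(theta) * math.sin(phi),
            R * math.cos(phi))

def F(x, y, z):
    # Hypothetical "temperature" u and "pressure" v (assumptions, not from the notes).
    return (x * y + z, x**2 + y * z)

def jac_F(x, y, z):
    # Rows are the gradients of u and v with respect to (x, y, z).
    return [[y, x, 1.0],
            [2 * x, z, y]]

def jac_G(phi, theta):
    # Columns are the derivatives of (x, y, z) with respect to phi and theta.
    return [[R * math.cos(theta) * math.cos(phi), -R * math.sin(theta) * math.sin(phi)],
            [R * math.sin(theta) * math.cos(phi),  R * math.cos(theta) * math.sin(phi)],
            [-R * math.sin(phi), 0.0]]

def matmul(A, B):
    # Plain-Python matrix product, to keep the sketch dependency-free.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def jac_composite_numeric(phi, theta, h=1e-6):
    # Central differences applied to the composite (phi, theta) -> (u, v).
    def H(p, t):
        return F(*G(p, t))
    d_phi = [(a - b) / (2 * h) for a, b in zip(H(phi + h, theta), H(phi - h, theta))]
    d_theta = [(a - b) / (2 * h) for a, b in zip(H(phi, theta + h), H(phi, theta - h))]
    return [[d_phi[0], d_theta[0]],
            [d_phi[1], d_theta[1]]]

phi, theta = 1.0, 0.5  # sample point on the sphere, in radians
product = matmul(jac_F(*G(phi, theta)), jac_G(phi, theta))
numeric = jac_composite_numeric(phi, theta)
print("J_F J_G        :", product)
print("finite diff J  :", numeric)
```

The two 2×2 matrices should agree entry by entry up to the finite-difference error, which is the content of $J_{F \circ G} = J_F J_G$ at that point.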

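The two columns of the Jacobian matrix

$$J_G = \begin{pmatrix} \frac{\partial x}{\partial \varphi} & \frac{\partial x}{\partial \theta} \\ \frac{\partial y}{\partial \varphi} & \frac{\partial y}{\partial \theta} \\ \frac{\partial z}{\partial \varphi} & \frac{\partial z}{\partial \theta} \end{pmatrix}$$

represent velocity vectors tangent to the longitudes ($\varphi$ varying) and latitudes ($\theta$ varying). These two column vectors are tangent to curves lying in the sphere and are therefore tangent to the sphere. They are, essentially, the south vector and the east vector at any point of the sphere (except at the poles). You might further observe that their cross product will be normal to this spherical surface at any given point, a fact which will be useful later in this course when we look at surface integrals.

The two columns of the Jacobian matrix

$$J_{F \circ G} = \begin{pmatrix} \frac{\partial u}{\partial \varphi} & \frac{\partial u}{\partial \theta} \\ \frac{\partial v}{\partial \varphi} & \frac{\partial v}{\partial \theta} \end{pmatrix}$$

represent vectors in the $(u, v)$ plane and indicate the directions of change if we slightly vary the latitude or the longitude.

Notes by Robert Winters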