Dr. Allen Back Sep. 10, 2014
The chain rule in multivariable calculus is in some ways very simple. But it can lead to extremely intricate sorts of relationships (try thermodynamics in physical chemistry...) as well as counter-intuitive looking formulas like
$$\frac{\partial y}{\partial x} = -\frac{\partial z/\partial x}{\partial z/\partial y}.$$
(The above in a context where $f(x, y, z) = C$.)
First let's try the conceptually simple point of view, using the fact that derivatives of functions are linear transformations. (Matrices.)
Think about differentiable functions $g : U \subset \mathbb{R}^n \to \mathbb{R}^m$ and $f : V \subset \mathbb{R}^m \to \mathbb{R}^p$, where the image of $g$, namely $g(U)$, is a subset of the domain $V$ of $f$. The chain rule is about the derivative of the composition $f \circ g$.
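Here's a picture: [figure: $U \subset \mathbb{R}^n \xrightarrow{\;g\;} V \subset \mathbb{R}^m \xrightarrow{\;f\;} \mathbb{R}^p$]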
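For $g : U \subset \mathbb{R}^n \to \mathbb{R}^m$ and $f : V \subset \mathbb{R}^m \to \mathbb{R}^p$, let's use $p$ to denote a point of $\mathbb{R}^n$, $q$ to denote a point of $\mathbb{R}^m$, and $r$ to denote a point of $\mathbb{R}^p$. So more colloquially, we might write $q = g(p)$ and $r = f(q)$, and so of course $f \circ g$ gives the relationship $r = f(g(p))$. (The latter is $(f \circ g)(p)$.)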
Fix a point $p_0$ with $g(p_0) = q_0$ and $f(q_0) = r_0$. Let the derivatives of $g$ and $f$ at the relevant points be $T = Dg(p_0)$ and $S = Df(q_0)$. How are the changes in $p$, $q$, and $r$ related? By the linear approximation properties of the derivative,
$$\Delta q \approx T\,\Delta p, \qquad \Delta r \approx S\,\Delta q.$$
And so plugging the first approximate equality into the second gives the approximation
$$\Delta r \approx S(T\,\Delta p) = (ST)\,\Delta p.$$
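$\Delta r \approx (ST)\,\Delta p$. What is this saying? For $g : U \subset \mathbb{R}^n \to \mathbb{R}^m$ and $f : V \subset \mathbb{R}^m \to \mathbb{R}^p$, $T = Dg(p_0)$ is an $m \times n$ matrix and $S = Df(q_0)$ is a $p \times m$ matrix. So the product $ST$ is a $p \times n$ matrix representing the derivative at $p_0$ of $f \circ g$.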
$$\Delta q = T\,\Delta p, \qquad \Delta r = S\,\Delta q, \qquad \Delta r = ST\,\Delta p$$
So the chain rule theorem says that if $f$ is differentiable at $p_0$ with $f(p_0) = q_0$ and $g$ is differentiable at $q_0$, then $g \circ f$ is also differentiable at $p_0$, with derivative the matrix product $Dg(q_0)\,Df(p_0)$. (In this statement $f$ is the map applied first.)
Problem: Suppose we have the polar coordinate map $g(r, \theta) = (r\cos\theta, r\sin\theta)$ and $(r, \theta) = f(u, v)$ is given by $f(u, v) = (uv, v)$. Find the derivative of $g \circ f$.
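One way to check an answer to this problem is to multiply the two Jacobian matrices and compare with a finite-difference estimate of the composition. The sketch below is only an illustration: the helper names (`Df`, `Dg`, `matmul`, `numeric_jacobian`) and the sample point $(u, v) = (2, 0.5)$ are choices made here, not part of the lecture.

```python
import math

def f(u, v):
    # inner map from the problem: (r, theta) = f(u, v) = (u*v, v)
    return (u * v, v)

def g(r, theta):
    # polar coordinate map
    return (r * math.cos(theta), r * math.sin(theta))

def Df(u, v):
    # Jacobian of f: rows are the gradients of r = u*v and theta = v
    return [[v, u], [0.0, 1.0]]

def Dg(r, theta):
    # Jacobian of the polar coordinate map
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta),  r * math.cos(theta)]]

def matmul(A, B):
    # plain matrix product of nested lists
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def chain_rule_jacobian(u, v):
    # D(g o f)(u, v) = Dg(f(u, v)) * Df(u, v)
    r, theta = f(u, v)
    return matmul(Dg(r, theta), Df(u, v))

def numeric_jacobian(u, v, h=1e-6):
    # central differences applied directly to g(f(u, v))
    def comp(a, b):
        return g(*f(a, b))
    cols = []
    for du, dv in ((h, 0.0), (0.0, h)):
        plus = comp(u + du, v + dv)
        minus = comp(u - du, v - dv)
        cols.append([(p - m) / (2 * h) for p, m in zip(plus, minus)])
    # cols[j][i] is d(output i)/d(input j); transpose into rows
    return [[cols[j][i] for j in range(2)] for i in range(2)]

J_chain = chain_rule_jacobian(2.0, 0.5)
J_num = numeric_jacobian(2.0, 0.5)
print(J_chain)
print(J_num)
```

The two printed matrices should agree to several decimal places; the top-left entry is $v\cos v$, the $u$-derivative of $uv\cos v$.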
$f : U \subset \mathbb{R}^2 \to \mathbb{R}$ and $g : \mathbb{R} \to \mathbb{R}^2$. Derivatives/partial derivatives of $f \circ g$ and $g \circ f$?
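e.g. $z = z(x, y)$, $x = x(t)$, $y = y(t)$, or explicitly
$$z = x^2 + y^2, \qquad x = \cos t, \qquad y = 2\sin t.$$
Here $f : \mathbb{R}^2 \to \mathbb{R}$ and $c : \mathbb{R} \to \mathbb{R}^2$ with $c(t) = (x(t), y(t))$, so $f \circ c : \mathbb{R} \to \mathbb{R}$ and $c \circ f : \mathbb{R}^2 \to \mathbb{R}^2$.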
$c(t) = (x(t), y(t))$ with $x = \cos t$, $y = 2\sin t$. (A vector valued function with 1 dimensional domain is sometimes interpreted as a path $c$. Its image is a curve; the above $c(t)$ could parametrize the ellipse $4x^2 + y^2 = 4$.)
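$$Df = \begin{pmatrix} \frac{\partial f}{\partial x} & \frac{\partial f}{\partial y} \end{pmatrix}, \qquad Dc = \begin{pmatrix} \frac{dx}{dt} \\ \frac{dy}{dt} \end{pmatrix}$$
$$D(f \circ c) = \begin{pmatrix} \frac{\partial f}{\partial x} & \frac{\partial f}{\partial y} \end{pmatrix} \begin{pmatrix} \frac{dx}{dt} \\ \frac{dy}{dt} \end{pmatrix} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt}$$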
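For the explicit example $z = x^2 + y^2$, $x = \cos t$, $y = 2\sin t$, the formula for $D(f \circ c)$ can be checked numerically. This is a minimal sketch; the function names and the step size `h` are illustrative choices.

```python
import math

def z_of_t(t):
    # z = f(c(t)) with f(x, y) = x^2 + y^2 and c(t) = (cos t, 2 sin t)
    x, y = math.cos(t), 2 * math.sin(t)
    return x * x + y * y

def dz_dt_chain(t):
    # chain rule: dz/dt = f_x * dx/dt + f_y * dy/dt
    x, y = math.cos(t), 2 * math.sin(t)
    f_x, f_y = 2 * x, 2 * y
    dx_dt, dy_dt = -math.sin(t), 2 * math.cos(t)
    return f_x * dx_dt + f_y * dy_dt

def dz_dt_numeric(t, h=1e-6):
    # central difference on the composed function
    return (z_of_t(t + h) - z_of_t(t - h)) / (2 * h)

t0 = 0.7
print(dz_dt_chain(t0), dz_dt_numeric(t0), 3 * math.sin(2 * t0))
```

All three printed values should agree closely, since here $z = 1 + 3\sin^2 t$ and so $dz/dt = 6\sin t\cos t = 3\sin 2t$.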
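$$Df = \begin{pmatrix} \frac{\partial f}{\partial x} & \frac{\partial f}{\partial y} \end{pmatrix}, \qquad Dc = \begin{pmatrix} \frac{dx}{dt} \\ \frac{dy}{dt} \end{pmatrix}$$
$$D(c \circ f) = \begin{pmatrix} \frac{dx}{dt} \\ \frac{dy}{dt} \end{pmatrix} \begin{pmatrix} \frac{\partial f}{\partial x} & \frac{\partial f}{\partial y} \end{pmatrix} = \begin{pmatrix} \frac{dx}{dt}\frac{\partial f}{\partial x} & \frac{dx}{dt}\frac{\partial f}{\partial y} \\ \frac{dy}{dt}\frac{\partial f}{\partial x} & \frac{dy}{dt}\frac{\partial f}{\partial y} \end{pmatrix}$$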
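If we use $t$ to denote both scalars in the domain of $c$ and the range of $f$ (instead of $z$ for the latter), the above might more intuitively be written as
$$D(c \circ f) = \begin{pmatrix} \frac{dx}{dt}\frac{\partial t}{\partial x} & \frac{dx}{dt}\frac{\partial t}{\partial y} \\ \frac{dy}{dt}\frac{\partial t}{\partial x} & \frac{dy}{dt}\frac{\partial t}{\partial y} \end{pmatrix}$$
where more confusingly, using $t = t(x, y)$ instead of $z = f(x, y)$, we have $c(f(x, y)) = (x(t(x, y)), y(t(x, y)))$.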
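The $2 \times 2$ matrix for $D(c \circ f)$ is a column times a row, with $Dc$ evaluated at $t = f(x, y)$. A small sketch, assuming the same example $f(x, y) = x^2 + y^2$ and $c(t) = (\cos t, 2\sin t)$ as above; the function names are illustrative.

```python
import math

def f(x, y):
    return x * x + y * y

def c(t):
    return (math.cos(t), 2 * math.sin(t))

def D_c_after_f(x, y):
    # column Dc (evaluated at t = f(x, y)) times row Df (evaluated at (x, y))
    t = f(x, y)
    dc = (-math.sin(t), 2 * math.cos(t))
    df = (2 * x, 2 * y)
    return [[dc[i] * df[j] for j in range(2)] for i in range(2)]

def numeric_D(x, y, h=1e-6):
    # central differences on (x, y) -> c(f(x, y))
    def comp(a, b):
        return c(f(a, b))
    px, mx = comp(x + h, y), comp(x - h, y)
    py, my = comp(x, y + h), comp(x, y - h)
    return [[(px[i] - mx[i]) / (2 * h), (py[i] - my[i]) / (2 * h)]
            for i in range(2)]

J = D_c_after_f(0.3, -0.4)
J_num = numeric_D(0.3, -0.4)
print(J)
print(J_num)
```

Because the matrix is a product of a $2 \times 1$ and a $1 \times 2$ matrix, its determinant is zero up to rounding, which reflects that $c \circ f$ squeezes the plane through a one dimensional stage.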
Alternatively, $z = f(x_1, x_2)$, $t = (c_1(z), c_2(z))$, $t = (c_1(f(x_1, x_2)), c_2(f(x_1, x_2)))$ looks quite sensible. Tradeoffs among naturality, intuitiveness, and precision are why we have so many notations for derivatives.
Tree diagrams can be helpful in showing the dependencies for chain rule applications: $z = z(x, y)$, $x = x(t)$, $y = y(t)$, so $z = z(x(t), y(t))$.
For $z = z(x(t), y(t))$:
$$\frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt}$$
Intuitively, for $z = z(x(t), y(t))$ one might think:
1. A change $\Delta t$ in $t$ causes a change $\Delta x$ in $x$ with multiplier $\frac{dx}{dt}$.
2. The change $\Delta x$ in $x$ contributes to a further change $\Delta z$ in $z$ with multiplier $\frac{\partial z}{\partial x}$. So the overall contribution to the change in $z$ from the $x$ part is $\frac{\partial z}{\partial x}\frac{dx}{dt}$ times $\Delta t$.
3. Similarly for the $y$ part.
$z = z(x, y)$, $x = x(u, v)$, $y = y(u, v)$, so $z = z(x(u, v), y(u, v))$.
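For $z = z(x(u, v), y(u, v))$:
$$\frac{\partial z}{\partial u} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial u}.$$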
Problem: $z = e^{x^2 y}$, $w = \cos(x + y)$, $x = u^2 - v^2$, $y = 2uv$. $\frac{\partial z}{\partial u} = ?$
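A numerical check of $\partial z/\partial u$ for this problem can be sketched as follows. It assumes the problem reads $z = e^{x^2 y}$, $x = u^2 - v^2$, $y = 2uv$ (the exponent and the minus sign are a reading of the slide), and the sample point $(u, v) = (1, 0.5)$ is arbitrary.

```python
import math

def x_of(u, v):
    return u * u - v * v      # assumed reading: x = u^2 - v^2

def y_of(u, v):
    return 2 * u * v

def z_of(u, v):
    x, y = x_of(u, v), y_of(u, v)
    return math.exp(x * x * y)   # assumed reading: z = e^(x^2 y)

def dz_du_chain(u, v):
    # chain rule: z_u = z_x * x_u + z_y * y_u
    x, y = x_of(u, v), y_of(u, v)
    z = math.exp(x * x * y)
    z_x = 2 * x * y * z
    z_y = x * x * z
    x_u = 2 * u
    y_u = 2 * v
    return z_x * x_u + z_y * y_u

def dz_du_numeric(u, v, h=1e-6):
    # central difference in u with v held fixed
    return (z_of(u + h, v) - z_of(u - h, v)) / (2 * h)

print(dz_du_chain(1.0, 0.5), dz_du_numeric(1.0, 0.5))
```

The two printed numbers should match to several digits, which confirms the tree-diagram bookkeeping for the $u$ branch.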
Cases like $z = f(x, u(x, y), v(y))$.
More formal approach: $f : \mathbb{R}^3 \to \mathbb{R}$, $u : \mathbb{R}^2 \to \mathbb{R}$, $v : \mathbb{R} \to \mathbb{R}$, $h : \mathbb{R}^2 \to \mathbb{R}$,
$$h(x, y) = f(x, u(x, y), v(y)), \qquad Dh = ?$$
Write $h$ as a composition $h = f \circ k$ for $k : \mathbb{R}^2 \to \mathbb{R}^3$. What should $k$ be? (Recall $h(x, y) = f(x, u(x, y), v(y))$.) So $k$ should be defined as
$$k(x, y) = (x, u(x, y), v(y)).$$
And $Dh = Df\,Dk = \ldots$
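To see $Dh = Df\,Dk$ at work, the sketch below picks concrete functions. The choices $f(a, b, c) = ab + c^2$, $u(x, y) = xy$, $v(y) = \sin y$ are hypothetical examples invented for illustration; they do not come from the lecture.

```python
import math

# Hypothetical example functions, chosen only to illustrate Dh = Df Dk.
def u(x, y):
    return x * y

def v(y):
    return math.sin(y)

def f(a, b, c):
    return a * b + c * c

def k(x, y):
    # k : R^2 -> R^3 with h = f o k
    return (x, u(x, y), v(y))

def h(x, y):
    return f(*k(x, y))

def Df(a, b, c):
    # 1 x 3 row of partials of f with respect to its three slots
    return [b, a, 2 * c]

def Dk(x, y):
    # 3 x 2 Jacobian of k; the zeros record that x does not depend on y
    # and that v does not depend on x
    return [[1.0, 0.0],
            [y, x],
            [0.0, math.cos(y)]]

def Dh_chain(x, y):
    row = Df(*k(x, y))
    J = Dk(x, y)
    return [sum(row[i] * J[i][j] for i in range(3)) for j in range(2)]

def Dh_numeric(x, y, step=1e-6):
    # central differences on h itself
    return [(h(x + step, y) - h(x - step, y)) / (2 * step),
            (h(x, y + step) - h(x, y - step)) / (2 * step)]

print(Dh_chain(1.2, 0.7))
print(Dh_numeric(1.2, 0.7))
```

For these choices $h(x, y) = x^2 y + \sin^2 y$, so the first entry is $2xy$: one $xy$ from the direct $x$ slot of $f$ and one from the $u$ slot.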
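More informally, e.g. thinking about a tree diagram for $z = f(x, u(x, y), v(y))$ and thinking of the underlying $f$ as $f(x, u, v)$, we'd have
$$\frac{\partial z}{\partial x} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial u}\frac{\partial u}{\partial x}.$$
Notice expressions like $\frac{\partial f}{\partial x}$ or $\frac{\partial z}{\partial x}$ have some ambiguity here that $D_1 f$ does not.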
The tangent plane to the graph of $z = f(x, y)$ at $(x, y) = (x_0, y_0)$ is defined to be the plane given by
$$z - z_0 = f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0).$$
(The approximation $\Delta z \approx f_x\,\Delta x + f_y\,\Delta y$ is replaced by an exact equality on the tangent plane.)
Problem: Find the tangent plane to $z = x^2 - y^2$ at $(-1, 0, 1)$. Note the tangent plane needn't meet the surface in just one point.
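The remark about meeting the surface in more than one point can be checked directly. The sketch assumes the surface is $z = x^2 - y^2$ and the point is $(-1, 0, 1)$ (the signs are a reading of the slide); `tangent_plane` is an illustrative helper name.

```python
def f(x, y):
    return x * x - y * y      # assumed reading of the surface

def tangent_plane(x0, y0):
    # z = z0 + f_x(x0, y0) (x - x0) + f_y(x0, y0) (y - y0)
    z0 = f(x0, y0)
    f_x = 2 * x0
    f_y = -2 * y0
    def plane(x, y):
        return z0 + f_x * (x - x0) + f_y * (y - y0)
    return plane

plane = tangent_plane(-1.0, 0.0)

# The plane is z = -2x - 1. It meets the saddle wherever (x + 1)^2 = y^2,
# i.e. along the two lines y = x + 1 and y = -(x + 1).
for x in (-3.0, 0.5, 2.0):
    print(x, f(x, x + 1), plane(x, x + 1), f(x, -(x + 1)), plane(x, -(x + 1)))
```

Each printed row shows the surface height and the plane height agreeing on both lines, so the intersection is a pair of lines through the point of tangency rather than a single point.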
The tangent vector to the path $c(t) = (x(t), y(t), z(t))$ at $t = t_0$ is defined to be the vector
$$Dc(t_0) = c'(t_0) = \begin{pmatrix} \left.\frac{dx}{dt}\right|_{t=t_0} \\ \left.\frac{dy}{dt}\right|_{t=t_0} \\ \left.\frac{dz}{dt}\right|_{t=t_0} \end{pmatrix}$$
which we will sometimes write more informally as $c'(t_0) = (x'(t_0), y'(t_0), z'(t_0))$.
The line given by $\vec r = c(t_0) + t\,c'(t_0)$ is called the tangent line to the path $c$ at $t = t_0$. (The approximation $\Delta c \approx c'\,\Delta t$ is replaced by an exact equality on the tangent line.) Here $\vec r = \begin{pmatrix} x \\ y \\ z \end{pmatrix}$ is the position vector of a general point on the line and $t$ is any real number.
Problem: Find the tangent line to the helix $c(t) = (4\cos t, 4\sin t, 3t)$ at $t = \pi$.
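The tangent line to this helix can be assembled from $c(t_0)$ and $c'(t_0)$. A minimal sketch, with `tangent_line` and the line parameter `s` as illustrative names:

```python
import math

def c(t):
    # the helix from the problem
    return (4 * math.cos(t), 4 * math.sin(t), 3 * t)

def c_prime(t):
    # componentwise derivative of the helix
    return (-4 * math.sin(t), 4 * math.cos(t), 3.0)

def tangent_line(t0):
    # r(s) = c(t0) + s * c'(t0), with s any real number
    point = c(t0)
    direction = c_prime(t0)
    def r(s):
        return tuple(p + s * d for p, d in zip(point, direction))
    return r

line = tangent_line(math.pi)
print(c(math.pi))        # approximately (-4, 0, 3*pi)
print(c_prime(math.pi))  # approximately (0, -4, 3)
print(line(1.0))         # approximately (-4, -4, 3*pi + 3)
```

Up to floating point rounding in $\sin\pi$, the line is $\vec r = \langle -4, -4s, 3\pi + 3s \rangle$.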
If the normal is $\vec n = \langle a, b, c \rangle$, $\vec r = \langle x, y, z \rangle$ is a general point, and $P_0 = (x_0, y_0, z_0)$, then $\vec n \cdot (\vec r - P_0) = 0$ becomes
$$a(x - x_0) + b(y - y_0) + c(z - z_0) = 0.$$
Find the equation of the plane through the three points $P_0 = (1, 0, 1)$, $P_1 = (-1, 1, 2)$ and $P_2 = (1, 2, 3)$.
Solution: First find the normal $\vec n = \overrightarrow{P_0P_1} \times \overrightarrow{P_0P_2}$. The cross product is
$$\begin{vmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ -2 & 1 & 1 \\ 0 & 2 & 2 \end{vmatrix}$$
which is
$$\hat{\imath} \begin{vmatrix} 1 & 1 \\ 2 & 2 \end{vmatrix} - \hat{\jmath} \begin{vmatrix} -2 & 1 \\ 0 & 2 \end{vmatrix} + \hat{k} \begin{vmatrix} -2 & 1 \\ 0 & 2 \end{vmatrix} = \langle 0, 4, -4 \rangle.$$
So our plane is
$$\langle 0, 4, -4 \rangle \cdot (\vec r - \langle 1, 0, 1 \rangle) = 0$$
or $0(x - 1) + 4(y - 0) - 4(z - 1) = 0$, or $4y - 4z + 4 = 0$.
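The plane computation above can be replayed in a few lines. The sketch assumes $P_1 = (-1, 1, 2)$ (the minus sign is a reading of the slide); the helper names `sub`, `cross`, `dot` are illustrative.

```python
def sub(a, b):
    return tuple(ai - bi for ai, bi in zip(a, b))

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def cross(a, b):
    # componentwise formula equivalent to the 3 x 3 determinant expansion
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

P0 = (1, 0, 1)
P1 = (-1, 1, 2)   # assumed reading of the second point
P2 = (1, 2, 3)

n = cross(sub(P1, P0), sub(P2, P0))
print(n)  # (0, 4, -4)

def plane_value(P):
    # n . (P - P0); zero exactly when P lies on the plane
    return dot(n, sub(P, P0))

print([plane_value(P) for P in (P0, P1, P2)])  # [0, 0, 0]
```

All three points give zero, so each satisfies $4y - 4z + 4 = 0$.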
Find the equation of the line through the points $P_0 = (1, 1, 0)$ and $P_1 = (2, 2, 2)$.
Solution: $\overrightarrow{P_0P_1} = (2, 2, 2) - (1, 1, 0) = \langle 1, 1, 2 \rangle$. So our line is
$$\vec r = \langle 1, 1, 0 \rangle + t\,\langle 1, 1, 2 \rangle = \langle 1 + t, 1 + t, 2t \rangle$$
where $t$ is any real number. This is called the vector form of the equation of a line.
Thinking of our general position vector as $\vec r = \langle x, y, z \rangle$, we can express this as the parametric form:
$$x = 1 + t, \qquad y = 1 + t, \qquad z = 2t.$$
Solving for $t$ shows
$$t = x - 1 = y - 1 = \frac{z}{2}$$
which realizes this line as the intersection of the planes $x = y$ and $z = 2(y - 1)$, but there are many other pairs of planes containing this line.
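The parametric form and the two planes can be checked together. A minimal sketch; `point_on_line` is an illustrative name.

```python
P0 = (1, 1, 0)
P1 = (2, 2, 2)

# direction vector P0P1
d = tuple(b - a for a, b in zip(P0, P1))
print(d)  # (1, 1, 2)

def point_on_line(t):
    # vector form r = P0 + t * d
    return tuple(p + t * di for p, di in zip(P0, d))

for t in (-2, 0, 0.5, 1, 3):
    x, y, z = point_on_line(t)
    # every point of the line lies on both planes x = y and z = 2(y - 1)
    print(t, (x, y, z), x == y, z == 2 * (y - 1))
```

Taking $t = 0$ and $t = 1$ recovers $P_0$ and $P_1$, and each sampled point satisfies both plane equations.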
The cross product of vectors in $\mathbb{R}^3$ is another vector. It is good because it is geometrically meaningful, it is straightforward to calculate, and it is useful (e.g. torque, angular momentum).
$\vec u = \vec v \times \vec w$ is geometrically determined by the following properties: $\vec u$ is perpendicular to both $\vec v$ and $\vec w$, and the length $\|\vec u\|$ is $\|\vec v\|\,\|\vec w\| \sin\theta$, the area of the parallelogram spanned by $\vec v$ and $\vec w$. The choice between the remaining two possibilities is then made using the right hand rule.
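The algebraic definition of the cross product is based on the determinant
$$\vec v \times \vec w = \begin{vmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix}$$
which means
$$\vec v \times \vec w = \hat{\imath} \begin{vmatrix} v_2 & v_3 \\ w_2 & w_3 \end{vmatrix} - \hat{\jmath} \begin{vmatrix} v_1 & v_3 \\ w_1 & w_3 \end{vmatrix} + \hat{k} \begin{vmatrix} v_1 & v_2 \\ w_1 & w_2 \end{vmatrix}$$
where $\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc$ and $\hat{\imath} = \langle 1, 0, 0 \rangle$, $\hat{\jmath} = \langle 0, 1, 0 \rangle$, and $\hat{k} = \langle 0, 0, 1 \rangle$.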
$$\vec v \times \vec w = -\,\vec w \times \vec v$$
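Since $\hat{\imath} \times \hat{\jmath} = \hat{k}$ and cyclic (so $\hat{\jmath} \times \hat{k} = \hat{\imath}$ and $\hat{k} \times \hat{\imath} = \hat{\jmath}$), it is sometimes easiest to use that algebra or comparison with the picture below to determine cross products, or use the right hand rule.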
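For example
$$\langle 1, 1, 0 \rangle \times \langle 0, 0, 1 \rangle = (\hat{\imath} + \hat{\jmath}) \times \hat{k} = -\hat{\jmath} + \hat{\imath} = \langle 1, -1, 0 \rangle$$
is easier than writing out the $3 \times 3$ determinant.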
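The determinant definition, the sign change under swapping, the cyclic rule for the unit vectors, and the example above can all be confirmed with a short sketch. The helper names `det2` and `cross` are illustrative.

```python
def det2(a, b, c, d):
    # 2 x 2 determinant | a b ; c d | = ad - bc
    return a * d - b * c

def cross(v, w):
    # cofactor expansion along the first row of the 3 x 3 determinant
    return (det2(v[1], v[2], w[1], w[2]),
            -det2(v[0], v[2], w[0], w[2]),
            det2(v[0], v[1], w[0], w[1]))

i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)

print(cross(i, j), cross(j, k), cross(k, i))   # k, i, j in turn
print(cross((1, 1, 0), (0, 0, 1)))             # (1, -1, 0)

v, w = (1, 2, 3), (4, 5, 6)
neg = tuple(-x for x in cross(w, v))
print(cross(v, w) == neg)                      # True
```

With integer inputs every comparison is exact, so the sketch doubles as a quick check of hand computations.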
Cross products can be used to find the area of a parallelogram or triangle spanned by two vectors in $\mathbb{R}^3$, and to find the volume of a parallelepiped using the scalar triple product $\vec u \cdot (\vec v \times \vec w) = (\vec u \times \vec v) \cdot \vec w$.
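Both uses can be sketched in a few lines. The function names are illustrative, and the volume is taken as the absolute value of the scalar triple product, since the triple product itself can be negative.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def parallelogram_area(v, w):
    # length of v x w
    n = cross(v, w)
    return math.sqrt(dot(n, n))

def triangle_area(v, w):
    return 0.5 * parallelogram_area(v, w)

def parallelepiped_volume(u, v, w):
    # absolute value of the scalar triple product u . (v x w)
    return abs(dot(u, cross(v, w)))

print(parallelogram_area((1, 0, 0), (0, 2, 0)))                # 2.0
print(triangle_area((1, 0, 0), (0, 2, 0)))                     # 1.0
print(parallelepiped_volume((1, 0, 0), (0, 2, 0), (0, 0, 3)))  # 6

u, v, w = (1, 2, 3), (4, 5, 6), (7, 8, 10)
print(dot(u, cross(v, w)), dot(cross(u, v), w))                # -3 -3
```

The last line illustrates the identity $\vec u \cdot (\vec v \times \vec w) = (\vec u \times \vec v) \cdot \vec w$ on one sample triple.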