MATH The Chain Rule Fall 2016 A vector function of a vector variable is a function F: R n R m. In practice, if x 1, x n is the input,

Similar documents
Chain Rule. MATH 311, Calculus III. J. Robert Buchanan. Spring Department of Mathematics

Math 212-Lecture 8. The chain rule with one independent variable

25. Chain Rule. Now, f is a function of t only. Expand by multiplication:

21-256: Partial differentiation

How to Use Calculus Like a Physicist

Section 3.5 The Implicit Function Theorem

Dr. Allen Back. Sep. 10, 2014

Vector Calculus, Maths II

(1) Recap of Differential Calculus and Integral Calculus (2) Preview of Calculus in three dimensional space (3) Tools for Calculus 3

Math 361: Homework 1 Solutions

Contents. 2 Partial Derivatives. 2.1 Limits and Continuity. Calculus III (part 2): Partial Derivatives (by Evan Dummit, 2017, v. 2.

VANDERBILT UNIVERSITY. MATH 2300 MULTIVARIABLE CALCULUS Practice Test 1 Solutions

Review for the First Midterm Exam

FINAL EXAM STUDY GUIDE

1 Functions of Several Variables Some Examples Level Curves / Contours Functions of More Variables... 6

Chapter 4 Notes, Calculus I with Precalculus 3e Larson/Edwards

Lecture 13 - Wednesday April 29th

Lecture 7 - Separable Equations

MATH 32A: MIDTERM 2 REVIEW. sin 2 u du z(t) = sin 2 t + cos 2 2

Lecture 10. (2) Functions of two variables. Partial derivatives. Dan Nichols February 27, 2018

Calculus I Review Solutions

Section 14.1 Vector Functions and Space Curves

Vector Calculus handout

Week 4: Differentiation for Functions of Several Variables

MA102: Multivariable Calculus

Sec. 1.1: Basics of Vectors

The first order quasi-linear PDEs

g(t) = f(x 1 (t),..., x n (t)).

Chapter 0 of Calculus ++, Differential calculus with several variables

CHAPTER 4: HIGHER ORDER DERIVATIVES. Likewise, we may define the higher order derivatives. f(x, y, z) = xy 2 + e zx. y = 2xy.

Maxima and Minima. (a, b) of R if

Multi Variable Calculus

Review for the Final Exam

Name: Instructor: Lecture time: TA: Section time:

x 1. x n i + x 2 j (x 1, x 2, x 3 ) = x 1 j + x 3

f(x 0 + h) f(x 0 ) h slope of secant line = m sec

Sometimes the domains X and Z will be the same, so this might be written:

z x = f x (x, y, a, b), z y = f y (x, y, a, b). F(x, y, z, z x, z y ) = 0. This is a PDE for the unknown function of two independent variables.

3 Applications of partial differentiation

MATH 19520/51 Class 5

Parametric Equations, Function Composition and the Chain Rule: A Worksheet

CALCULUS III THE CHAIN RULE, DIRECTIONAL DERIVATIVES, AND GRADIENT

2015 Math Camp Calculus Exam Solution

REVIEW OF DIFFERENTIAL CALCULUS

Midterm 1 Review. Distance = (x 1 x 0 ) 2 + (y 1 y 0 ) 2.

g(2, 1) = cos(2π) + 1 = = 9

Directional Derivatives in the Plane

Math 207 Honors Calculus III Final Exam Solutions

Solutions to Math 41 Second Exam November 5, 2013

Notes on multivariable calculus

LB 220 Homework 4 Solutions

MAT 211 Final Exam. Spring Jennings. Show your work!

Vector Functions & Space Curves MATH 2110Q

Summary for Vector Calculus and Complex Calculus (Math 321) By Lei Li

Derivatives of Functions from R n to R m

Math Practice Exam 3 - solutions

n=0 ( 1)n /(n + 1) converges, but not

13. LECTURE 13. Objectives

Econ Slides from Lecture 10

Solutions to old Exam 3 problems

and lim lim 6. The Squeeze Theorem

Exercises for Multivariable Differential Calculus XM521

SOME PROBLEMS YOU SHOULD BE ABLE TO DO

4 Partial Differentiation

Tangent Plane. Linear Approximation. The Gradient

MATH 31BH Homework 5 Solutions

MATH 280 Multivariate Calculus Fall Integrating a vector field over a curve

PARTIAL DERIVATIVES AND THE MULTIVARIABLE CHAIN RULE

CALCULUS III. Paul Dawkins

Vector calculus background

14 Increasing and decreasing functions

MATH 200 WEEK 5 - WEDNESDAY DIRECTIONAL DERIVATIVE

problem score possible Total 60

II. An Application of Derivatives: Optimization

14.5 The Chain Rule. dx dt

Note: Every graph is a level set (why?). But not every level set is a graph. Graphs must pass the vertical line test. (Level sets may or may not.

Math 131, Lecture 20: The Chain Rule, continued

Math 53: Worksheet 9 Solutions

February 27, 2019 LECTURE 10: VECTOR FIELDS.

Lagrange Multipliers

Increments, differentials, and local approximations

1 Lecture 24: Linearization

MATH 12 CLASS 5 NOTES, SEP

Written assignment 4. Due Wednesday November 25. Review of gradients, Jacobians, Chain Rule, and the Implicit Function Theorem.

MITOCW MITRES18_006F10_26_0602_300k-mp4

Math 321 Final Exam. 1. Do NOT write your answers on these sheets. Nothing written on the test papers will be graded.

2.2 Graphs of Functions

1 Vectors and 3-Dimensional Geometry

Module Two: Differential Calculus(continued) synopsis of results and problems (student copy)

Functions of Several Variables: Chain Rules

MATH 353 LECTURE NOTES: WEEK 1 FIRST ORDER ODES

LECTURE 11 - PARTIAL DIFFERENTIATION

(b) Find the range of h(x, y) (5) Use the definition of continuity to explain whether or not the function f(x, y) is continuous at (0, 0)

HW3 - Due 02/06. Each answer must be mathematically justified. Don t forget your name. 1 2, A = 2 2

MA 441 Advanced Engineering Mathematics I Assignments - Spring 2014

Mathematical Economics: Lecture 9

How to solve quasi linear first order PDE. v(u, x) Du(x) = f(u, x) on U IR n, (1)

Math 2a Prac Lectures on Differential Equations

Derivatives and Integrals

February 4, 2019 LECTURE 4: THE DERIVATIVE.

Transcription:

MATH 20550 The Chain Rule Fall 2016 A vector function of a vector variable is a function F: R n R m. In practice, if x 1, x n is the input, F(x 1,, x n ) F 1 (x 1,, x n ),, F m (x 1,, x n ) where each F i (x 1,, x n ) is a multi-variable function of the sort in which we are currently interested. Considering vector functions makes compositions easy to describe makes the Chain Rule both easy to write down easy to use. Here is an example of a composition with vector functions without vector functions. If F: R n R m G: R m R p are given, then (G F)(x 1,, x n ) G ( F 1 (x 1,, x n ),, F m (x 1,, x n ) ) (G 1 ( F1 (x 1,, x n ),, F m (x 1,, x n ) ),, G p ( F1 (x 1,, x n ),, F m (x 1,, x n ) )) If I write x x 1,, x n then I can also write (G F)(x) G ( F(x) ) So rather than describing (u 2 +v 2 uw) 3 +(u 3 +v 3 vw) 2 as the composition of g(x, y) x 3 +y 2 with x(u, v) u 2 + v 2 uw y(u, v) u 3 + v 3 vw I can describe it as g composed with the vector function F(u, v, w) u 2 + v 2 uw, u 3 + v 3 vw. Even if I only care about real-valued functions of a vector variable, as soon as I want to do composites, vector functions give me the same advantages that vectors have over lists of numbers. The Chain Rule is a formula for G F where G: R m R is a real-valued function of m variables F: R n R m is a vector function. Let us write H(x 1,, x n ) for G F. In this notation, we are looking for a formula for H. 1

The Chain Rule If H(x) G ( F(x) ) if x i is one of the input variables for F, then H (CR) G ( F(x) ) (x) provided G F are differentiable. Let s start by talking about the (x) piece. If we fix all the x j s except for x i, we get a curve in R m. Examples. Let F(x, y, z) x 2 y 2, xyz. Fix x z, say x 2, z 3. Then the curve in R 2 is 4 y 2, 6y. Let F(x, y) x 2 y 2, xy, x 3 3xy 2. Fix y 2. Then the curve in R 3 is x 2 4, 2x, x 3 12x. From our work on curves we know how to write down a tangent vector to a curve: 1,, m In the case we already studied in Chapter 13 we wrote F as r(t) t r (t) As far as calculations go the general case is the same since only the variable x i is allowed to move. Example. Let F(x, y, z) x 2 y 2, xyz. Then 2x, yz, 2y, xz x 0, xy. Let F(x, y) x 2 y 2, xy, x 3 3xy 2. Then x 2x, y, 3x2 3y 2 2y, x, 6xy.

Turn now to G(y 1,, y m ) a multi-variable function. We will study the gradient intensely in the next few lectures, but for now we only need the definition. G G(y 1,, y m ),, G 1 m Notice that G is a vector function R m R m since each G is a function R m R 1. i Example. Notice that we write down the partial derivatives in the same order as we write down the variables. Let G(x, y) x 2 xy + sin y. Then G(x, y) 2x y, x + cos y. Let G(x, y, z) x 2 xy + z sin y. Then G(x, y, z) 2x y, x + z cos y, sin y. Let us work through some examples. One variable calculus. The function G is a function of one variable, as is the function F. The gradient G(x) is a 1-vector G (x). The tangent vector x (x) is the 1-vector F (x). The dot product in this case is just the product so H (x) G ( F(x) ) F (x) In English, to differentiate a composition, take the derivative of the outside function, plug in the inside function, then multiply by the derivative of the inside function. The multi-variable chain rule has a similar description. G ( F(x) ) G ( F(x) ) (x) To find a partial derivative of a composition, take the gradient of the outside function, plug in the inside function then take the dot product with the partial derivative of the inside function.

Stewart, Case 1. z f(x, y), x g(t), y h(t) z(t) is the compoosition. Then the vector function is r(t) g(t), h(t) so z(t) f ( r(t) ). Then r dg t r (t) dt, dh, dt f f x, f dz f dt x, f dg dt, dh f dg dt x dt + f dh dt Writing the formula this way hides the fact that you must plug x g(t), y h(t) into the formulas for f x f. So if f(x, y) x 2 + y 3, x(t) t 2 1, y(t) t 3 + t, the correct calculation is r (t) 2t, 3t 2 + 1, f 2x, 3y 2 2(t 2 1), 3(t 3 + t) 2 dz dt 2(t2 1), 3(t 3 + t) 3 2t, 3t 2 + 1 4t(t 2 1) + 3(3t 2 + 1)(t 3 + t) 3 The correct answer is NOT dz dt 2x, 3y2 2t, 3t 2 + 1 4xt + 3y 2 (3t 2 + 1). A helpful clue is to remember that after the composition, z is a function of t therefore its derivative must also be a function of t, NOT t, x y. As we will see below, in certain cases it is possible to work with this mixed form to reduce the work required. Stewart, Case 2. z f(x, y), x g(s, t), y h(s, t) z(s, t) is the composition. We can compute s or f. First compute f t x, f ( remember to plug in to write x y s functions of s t). The vector function is P(s, t) g(s, t), h(s, t). There g are two curve parts, one where t is fixed one where s is fixed: s, h g s t, h. t Unraveling the dot product s f x x s + f s t f x x t + f t So this time, if f(x, y) x 2 + y 3, x(s, t) t 2 s 2, y(s, t) st, the correct calculation is P P 2s, t 2t, s s t f 2x, 3y 2 2(t 2 s 2 ), 3s 2 t 2

s t 2(t 2 s 2 ), 3s 2 t 2 2s, t 4s(t 2 s 2 ) + 3s 2 t 3 2(t 2 s 2 ), 3s 2 t 2 2t, s 4t(t 2 s 2 ) + 3s 3 t 2 Note the partials of the composition should be functions of s t only they are.

Notice that the gradient is the same in both cases so you only have to compute it once. Looking at the general formula (CR) you see that this is always the case. No matter how many different partials of the composition you need to compute, the first vector in the dot product is always the same, the gradient with the vector function plugged in. Stewart, Example 3. z(x, y) e x sin y, x st 2, y s 2 t. Then P(s, t) st 2, s 2 t. z e x sin y, e x cos y e st2 sin(s 2 t), e st2 cos(s 2 t). P s t2, 2st P t 2st, s2 s t e st2 sin(s 2 t), e st2 cos(s 2 t) t 2, 2st t 2 e st2 sin(s 2 t) + 2ste st2 cos(s 2 t) e st2 sin(s 2 t), e st2 cos(s 2 t) 2st, s 2 2ste st2 sin(s 2 t) + s 2 e st2 cos(s 2 t) Notice that no tree diagrams are need to keep everything straight. The dot product does that for you. Stewart, Example 4. w f(x, y, z, t), x x(u, v), y y(u, v), z z(u, v) t t(u, v) so the composition is w(u, v). To compute w u w first write down the vector function, v say P(u, v) x(u, v), y(u, v), z(u, v), t(u, v). Then w(u, v) f P. f f x, f, f, f t ( of course you should think of f as a function of u v). P x u u, u, u, t P x u v v, v, v, t. v w u w v f x, f, f, f t f x, f, f, f t which you can easily verify is the answer in Stewart. x u, u, u, t u x v, v, v, t v

Stewart, Example 5. u(x, y, z) x 4 y + y 2 z 3, x rse t, y rs 2 e t z r 2 s sin t. Find the value of u when r 2, s 1 t 0. The vector function is P(r, s, t) s rse t, rs 2 e t, r 2 s sin t. The point in xyz space is (2, 2, 0). The fact that we only want the partial at one point allows a simplification in the calculation as follows. u 4x 3 y, x 4 + 2yz 3, 3y 2 z 2. Normally I would next substitute for r, s t but since I only want the value of u when r 2, s 1 t 0 I can just plug in x 2,y 2 z 0 right now. u 64, 16, 0. P s ret, 2rse t, r 2 sin t, which at r 2, s 1 t 0 is P 2, 4, 0 s Hence u s 64, 16, 0 2, 4, 0 128 + 64 192 (2,1,0) To get the other two partials at the point, compute P t rse t, rs 2 e t, r 2 s cos t (2,1,0) (2,1,0) 2, 2, 4 so u t 64, 16, 0 2, 2, 4 128 32 96 (2,1,0) P r se t, s 2 e t, 2rs sin t 1, 1, 0 so (2,1,0) (2,1,0) u r 64, 16, 0 1, 1, 0 64 + 16 90 (2,1,0) Stewart at one point in his calculation writes (4x 3 y)(re t ) + (x 4 + 2yz 3 )(2rse t ) + (3y 2 z 2 )(r 2 sin t) mixing the two sets of variables, which is dangerous. But having written this down he still needs to substitute for x, y z in terms of r, s t, but since Stewart is at the point (2, 1, 0) he substitutes into the original formulas for x y z gets x y 2, z 0 then plugs in which is OK.

Higher order partials. Of course the next issue is computing a second order partial of a composition. And then a third order partial so on. If H G F then we want to compute something like ( ) H 2 H x j x i x j We definitely want both the case where x i x j where x i x j. The next part if grayed out because, while the formulas are correct the calculations can be simplified in terms of what you need to remember. In particular, as promised, to compute a higher derivative you simply differentiate with respect to the first variable, then differentiate that answer with respect to the second variable, so on. Keep things written in terms of dot products things remain as simple as possible. An example is provided after the grayed out material. By the Chain Rule H(x) G ( F(x) ) (x) x j x j ( ) H x j We now need to find. By the product rule, ( ) H (C 2 x j R) (x) G ( ) F(x) (x) + G ( F(x) ) 2 F (x) x j x i x j There are four terms in this formula. The gradient G ( F(x) ) x j (x) are just the same as you computed in for the Chain Rule. The term 2 F (x) is computed by taking the x i x j vector (x) differentiating it with respect to x i. In practice this is straightforward. x j The G( F(x) ) term is also the partial derivative of a vector function. The k th component in G is G composed with F so the k th term in G ( ) F(x) is the vector x k Hence 2 H x i x j 2 G x 1 x k,, 2 G x m x k G ( F(x) ) ( ) G x 1 ( ) G x 1 ( ) G,, x m ( ) G x k ( ) G,, x m (x) + G ( F(x) ) 2 F (x) x j x i x j

Especially if you are willing to use both sets of variables in your answer, the calculations are not too bad.

Stewart Example 7. Given an unknown function z f(x, y) x r 2 + s 2, y 2rs find 2 f f r. The vector function is P 2 r2 + s 2, 2rs. The vector function f x, f is basically unknown, but P 2r, 2s. In the formula for the second partial we will need ( ) r ( ) 2 P f 2 f 2, 0 r 2 x x, 2 f f 2 f, 2 y x x y, 2 f y 2 Let us assume as Stewart does that mixed partials are equal. Then ( ) f P ( ) f x r, P r 2 f r 2r 2 f 2 x + 2s 2 f 2 x y, 2r 2 f x y + f 2s 2 y 2 which is the formula in Stewart.. 2r 2 f x + 2s 2 f 2 x y, 2r 2 f x y + f 2s 2 y 2 2r, 2s + f x, f 2, 0 The other two second order partials can be calculated as well. We will need P 2s, 2r, s 2 P r s 0, 2 2 P 2, 0. Then s ( ) 2 f P ( ) f x s, P 2s 2 f s x + 2r 2 f 2 x y, 2s 2 f x y + 2r 2 f y 2 2 f s 2s 2 f 2 x + 2r 2 f 2 x y, 2s 2 f x y + 2r 2 f y 2 2 f r s 2s 2 f x + 2r 2 f 2 x y, 2s 2 f x y + 2r 2 f y 2 2s, 2r + 2r, 2s + f x, f f x, f 2, 0 0, 2 2 f can also be computed using r s ( ) f P ( ) f x r, P P r s 2r 2 f x + 2s 2 f 2 x y, 2r 2 f x y + f 2s 2 y 2 2s, 2r

Let us redo this last example without trying to plug into formula (C 2 R). without having to memorize the formula (C 2 R)! In particular, Given an unknown function z f(x, y) x r 2 + s 2, y 2rs find 2 f. The vector r2 f function is P r 2 + s 2, 2rs. The vector function f x, f is basically unknown, but P 2r, 2s. r The Chain Rule says f f r x, f 2r, 2s A Product Rule says ( ) 2 f 2 r f r Apply the Chain Rule again ( ) f f r x x ( ) f f r so x, f f r x, r P 2 r f P r f 2r, 2s + x, f 2, 0 f f 2r, 2s + x, f 2, 0 2 x, 2 f y x 2 f x y, 2 f 2 y 2r, 2s 2r 2 f 2 x + 2s 2 f y x 2r, 2s 2r 2 f x y + f 2s 2 2 y f r x, f 2r 2 f r 2 x + 2s 2 f y x, 2r 2 f x y + f 2s 2 2 y plugging into ( ) gives the same result as on the last page. 2 f 2 r 2r 2 f 2 x + 2s 2 f y x, 2r 2 f x y + f f 2s 2 2 2r, 2s + y x, f 2, 0 If you need a third order partial, just differentiate this last equation. This time you will use the Sum Rule three times, the Dot Product Rule twice, the Scalar Product Rule four times the Chain Rule six times. Nobody said it was going to be short! But, no matter how complicated it gets, you are always just computing the derivative of sums, dot products, vector functions applying the Chain Rule with the same vector function F when computing a higher order partial of G ( F(x) ).

1. Implicit differentiation The Chain Rule in more than one variable is not so important for doing calculations since you can always do the substitutions then compute the partial(s) you want directly. In one variable of course this is not the case. Just because you know how to differentiate sin x, without the Chain Rule, you have no idea what to do with sin(x 2 ). In more than one variable however, the Chain Rule is behind many of the deepest applications of multi-variable calculus. One example is implicit differentiation. Let F (x 1,, x n, w) be a function of n+1 variables suppose that there is a function of n variables, w w(x 1,, x n ) such that F ( x 1,, x n, w(x 1,, x n ) ) c for some constant c. Then we say F defines w implicitly as a function of x 1,..., x n. Suppose (a 1,, a n, w 0 ) is some point with F of that point equal to c. Then the Chain Rule allows us to compute the partials of w with respect to x i at the point (a 1,, a n ) without knowing the function w explicitly. To derive the formula note let P(x 1,, x n ) x1,, x n, w(x 1, x n ) so F P c. Compute P two ways. Since it is a constant function P Chain Rule, it is also F ( P(x) ) P. But P i th which is 1 the (n + 1) st which is w. Hence + w 0. Applying the has all its components 0 except for the w 0 or (ID) w w which is defined as long as w is not zero at the point. The Implicit Function Theorem says that if F has continuous partials in a neighborhood of a a 1,, a n, w 0 if w 0, then there is a unique differentiable function a w w(x 1,, x n ) defined in a neighborhood of a whose partials are given by (ID) such that w 0 w(a 1,, a n ). Example. Let F (x, y, z) x 2 3xy xz 3 y 2 z 2 let a (2, 3, 2). Then F (a) 34. Then the equation F (x, y, z) 34 defines z implicitly as a function of x y in a neighborhood of a. Compute x 2x 3y z3, 3x 2y2 z 3xz2 2y 2 z at a, x 3, 30 12. Since 0, there really is some function z(x, y) such that z(2, 3) 2 F ( x, y, z(x, y) ) 34 z is guaranteed to be defined

differential in some (undetermined) neighborhood of (2, 3). Moreover x 2x 3y z3 3xz 2 2y 2 z 2x 3y z3 3xz 2 + 2y 2 z in that neighborhood. 3x 2yz2 3xz 2 2y 2 z 3x 2yz2 3xz 2 + 2y 2 z

Example. Here is a piece of the curve y 2 + 2xy + 1 0 which is part of the level curve of F (x, y) y 2 + 2xy + 1 with constant 0. At either green point on the curve, F clearly defines y as a function of x near the point. In other words, the curve passes the vertical line test if we restrict to a small interval around either green dot. Equivalently, a small piece of the curve near one of these points is the graph of a function. In this case, using the quadratic formula one can even write down a formula. Near the top green point y x + x 2 1. Compute x 2y 2y + 2x. Hence d y d x x 2y 2y + 2x y y + x At the top green point y + x x 2 1 > 0 the curve is increasing. At the bottom green point y + x x 2 1 < 0 the curve is decreasing. The point (1, 1) is on the curve, but there 0 the Implicit Function Theorem says nothing. Looking at the graph we see there is no implicitly defined function around the point (1, 1). It is also possible to compute higher derivatives. Recall d2 y d x 2 tells about concavity. Well,

so ( y y + x d y d x x 2y 2y + 2x ( y) (y + x) (y + x) ( y) x x (y + x) 2 y y + x d 2 y d x 2 y 2 x 2 ) ( ) ( y y + y + x (y + x) + y y + x + 1 y + y y + x (y + x) 2 (y + x) 2 d 2 y y(y + 2x) d x2 (y + x) 3 ( ) (y + x) + y x x + 1 (y + x) 2 ) ( ) ( ) y + x x y + y y + x y + x (y + x) 2 For concavity you want to know if d2 y is positive or negative or zero for potential inflection d x 2 points. Hence we need to determine the signs of y, y + x y + 2x. At the two green points x 1.3. For the top point y 0.5 for the lower point y 2.16. Hence y is negative for both points. The quantity y +x is positive for the top point negative for the bottom. The quantity y + 2x is positive for both points. Hence the curve is concave down at the top point concave up at the bottom point. Likewise for the three-variable example above. Recall Then x 2x 3y z3 3xz 2 + 2y 2 z 3x 2yz2 3xz 2 + 2y 2 z ( ) 2 z x x 2x 3y z 3 2x 3y z3 3xz 2 + 2y 2 z (3xz 2 + 2y 2 z) (2x 3y z 3 ) 3xz2 + 2y 2 z (3xz 2 + 2y 2 z) 2 ( 3 3z 2 ) (3xz 2 + 2y 2 z) (2x 3y z 3 ) ( 3 3z 2 ) (3xz 2 + 2y 2 z) (2x 3y z 3 ) ( ) 3xz 2 + 2y 2 z (3xz 2 + 2y 2 z) ( 2 0z 2 + 6xz ) + 4yz + 4y2 (3xz 2 + 2y 2 z) 2

The matrix version of the Chain Rule. If you picked up some matrix theory elsewhere in your life the ultimate version of the Chain Rule is the easiest of all. Start with a function F: R n R m. If F(x 1,, x n ) F 1 (x 1,, x n ),, F m (x 1,, x n ) write down the m n matrix D(F)(x) 1 1 x 1 x 2... m x 1 If G: R m R p then G F: R n R p is defined D ( G F)(x). m x 2 1 x n. m x n ( D(G) ( F(x) ))( D(F)(x) where the right h side is the product of the two matrices. The formulas in the first part of this note follow from expressing the matrix multiplications required in terms of the dot product. )