Change Of Variable Theorem: Multiple Dimensions


Moulinath Banerjee
University of Michigan
August 30, 2012

Let (X, Y) be a two-dimensional continuous random vector. Thus P(X = x, Y = y) = 0 for all (x, y). Also assume that (X, Y) has a density function f(x, y). What this means is the following: for any nice (measurable) subset A of R², the probability that (X, Y) assumes values in A can be represented as

    P((X, Y) ∈ A) = ∫∫_A f(x, y) dx dy.

This is an extension of the requirement in the univariate case. Also, f(x, y) ≥ 0 for all (x, y). Thus, the volume enclosed by the surface {(x, y, f(x, y))} in x-y-z space over the region A gives the chance that (X, Y) takes values in A.

We will discuss the change of variable theorem, which enables us to find the density of a random vector (U, V) that is a nice transformation of (X, Y). "Nice" will be made precise in what follows.

Change of variable theorem

THE HAIRY TECHNICAL VERSION: Let (X_1, X_2) be jointly distributed continuous random variables with density function f_X(x_1, x_2). Let S be an open subset of R² such that P((X_1, X_2) ∈ S) = 1 (so the density f can be assumed to be concentrated on S). Let g be a transformation from S to R². Thus we can write

    (Y_1, Y_2) ≡ g(X_1, X_2) = (g_1(X_1, X_2), g_2(X_1, X_2)),

where g_1 and g_2 are both real-valued. Now assume that:

(1) g has continuous first partial derivatives on S.

(2) g is a 1-1 function.

(3) Let A(x_1, x_2) be the matrix whose first row is

    ( ∂g_1/∂x_1 (x_1, x_2), ∂g_2/∂x_1 (x_1, x_2) ) ≡ ( ∂y_1/∂x_1 (x_1, x_2), ∂y_2/∂x_1 (x_1, x_2) )

and whose second row is

    ( ∂g_1/∂x_2 (x_1, x_2), ∂g_2/∂x_2 (x_1, x_2) ) ≡ ( ∂y_1/∂x_2 (x_1, x_2), ∂y_2/∂x_2 (x_1, x_2) ).

Let

    J_g(x_1, x_2) = abs(det A(x_1, x_2)) = | ∂y_1/∂x_1 · ∂y_2/∂x_2 − ∂y_2/∂x_1 · ∂y_1/∂x_2 |

(all partial derivatives evaluated at (x_1, x_2)) be the Jacobian of g. Then J_g(x_1, x_2) does not vanish for any (x_1, x_2) ∈ S.

Let h denote the inverse transformation of g. Thus h is defined on g(S), and h(y_1, y_2) ≡ (h_1(y_1, y_2), h_2(y_1, y_2)), for (y_1, y_2) in g(S), is the unique (x_1, x_2) in S such that (g_1(x_1, x_2), g_2(x_1, x_2)) = (y_1, y_2). Then h itself has continuous first partial derivatives on g(S) and is clearly 1-1. Also, if B(y_1, y_2) denotes the matrix of first partial derivatives of h, then the Jacobian of h,

    J_h(y_1, y_2) = | ∂x_1/∂y_1 · ∂x_2/∂y_2 − ∂x_2/∂y_1 · ∂x_1/∂y_2 |,

where x_1 = h_1(y_1, y_2), x_2 = h_2(y_1, y_2) and the partial derivatives are evaluated at (y_1, y_2), does not vanish on g(S); in fact

    J_h(y_1, y_2) = J_g(h_1(y_1, y_2), h_2(y_1, y_2))^{−1}.

Also, the density of the random vector (Y_1, Y_2) is given by

    f_Y(y_1, y_2) = f(h_1(y_1, y_2), h_2(y_1, y_2)) J_h(y_1, y_2),  (y_1, y_2) ∈ g(S),
    f_Y(y_1, y_2) = 0  otherwise.

Thus, for any nice subset I of S, we have

    ∫∫_I f_X(x_1, x_2) dx_1 dx_2 = P((X_1, X_2) ∈ I)
                                 = P((Y_1, Y_2) ∈ g(I))
                                 = ∫∫_{g(I)} f(h_1(y_1, y_2), h_2(y_1, y_2)) J_h(y_1, y_2) dy_1 dy_2.

What it boils down to in SIMPLE language, but with caveats: given (X_1, X_2) with joint density f_X(x_1, x_2), consider (Y_1, Y_2) which can be expressed as a nice (appropriately smooth and one-to-one) function of (X_1, X_2). To find the density of (Y_1, Y_2) we go through the following steps:

• Express (X_1, X_2) as a function of (Y_1, Y_2), i.e. solve for (X_1, X_2) in terms of (Y_1, Y_2). Thus X_1 = h_1(Y_1, Y_2) for some function h_1 and X_2 = h_2(Y_1, Y_2) for some function h_2.

• Calculate

    J_h(y_1, y_2) = | ∂x_1/∂y_1 · ∂x_2/∂y_2 − ∂x_2/∂y_1 · ∂x_1/∂y_2 |.

• The density of (Y_1, Y_2) at any point (y_1, y_2) in the domain D_Y of (Y_1, Y_2) (i.e. the region in which (Y_1, Y_2) lives with probability 1) is

    f_Y(y_1, y_2) = f(h_1(y_1, y_2), h_2(y_1, y_2)) J_h(y_1, y_2).

We now do an application of the change of variable theorem that will clearly illustrate what is going on. The theorem looks big and messy at first shot, but it really has a nice pattern once you keep staring at it. Those of you who remember your advanced calculus well will probably spot resemblances to the change of variable theorem in calculus (for two variables). In fact, that is precisely what the above theorem, which we will subsequently refer to as the Jacobian theorem, is, but in a different garb. The theorem extends readily to the case of more than two variables, but we shall not discuss that extension.
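Purely as an illustration (this sketch is not part of the original notes), the three steps can be carried out symbolically in Python with sympy; the linear map (y_1, y_2) = (x_1 + x_2, x_1 − x_2) and the exponential density plugged in at the end are just convenient hypothetical choices.

    # A minimal symbolic sketch of the three-step recipe above (illustrative only).
    import sympy as sp

    x1, x2, y1, y2 = sp.symbols('x1 x2 y1 y2', real=True)

    # The forward transformation g: (y1, y2) = (x1 + x2, x1 - x2).
    g1 = x1 + x2
    g2 = x1 - x2

    # Step 1: solve for (x1, x2) in terms of (y1, y2), i.e. find h = g^{-1}.
    sol = sp.solve([sp.Eq(y1, g1), sp.Eq(y2, g2)], [x1, x2], dict=True)[0]
    h1, h2 = sol[x1], sol[x2]            # h1 = (y1 + y2)/2, h2 = (y1 - y2)/2

    # Step 2: the Jacobian J_h = |det of the matrix of partial derivatives of h|.
    B = sp.Matrix([[sp.diff(h1, y1), sp.diff(h1, y2)],
                   [sp.diff(h2, y1), sp.diff(h2, y2)]])
    J_h = sp.Abs(B.det())                # equals 1/2 for this map

    # Step 3: f_Y(y1, y2) = f_X(h1, h2) * J_h, for a chosen density f_X
    # (the support of (Y1, Y2) still has to be tracked separately).
    lam = sp.symbols('lambda', positive=True)
    f_X = lam**2 * sp.exp(-lam*(x1 + x2))      # e.g. i.i.d. Exponential(lam)
    f_Y = sp.simplify(f_X.subs({x1: h1, x2: h2}) * J_h)

    print(h1, h2, J_h, f_Y)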

Suppose that X_1 and X_2 are i.i.d. Exponential(λ) random variables. Thus

    f_X(x_1, x_2) = λ e^{−λ x_1} · λ e^{−λ x_2} = λ² e^{−λ (x_1 + x_2)},  (x_1, x_2) ∈ S,

where S is the open set {x_1 > 0, x_2 > 0}. Consider the following transformation g of (X_1, X_2):

    (Y_1, Y_2) = g(X_1, X_2) = (g_1(X_1, X_2), g_2(X_1, X_2)) = (X_1 + X_2, X_1/(X_1 + X_2)).

Then g(S), the open set in which the random vector (Y_1, Y_2) assumes values, is

    g(S) = {(y_1, y_2) : 0 < y_1, 0 < y_2 < 1}.

Computing the partial derivatives of g we have

    ∂g_1/∂x_1 = 1,  ∂g_1/∂x_2 = 1,

and

    ∂g_2/∂x_1 = x_2/(x_1 + x_2)²,  ∂g_2/∂x_2 = −x_1/(x_1 + x_2)².

Clearly, the partial derivatives are continuous functions of (x_1, x_2); also, g is clearly a 1-1 function on S, and furthermore

    J_g(x_1, x_2) = (x_1 + x_2)/(x_1 + x_2)² = 1/(x_1 + x_2) > 0

for every (x_1, x_2) in S. Thus all conditions of the Jacobian theorem are satisfied. To obtain the density function of (Y_1, Y_2) we need to find the inverse transformation. This amounts to expressing (X_1, X_2) in terms of (Y_1, Y_2). Note that Y_2 (X_1 + X_2) = X_1; but Y_1 = X_1 + X_2, so Y_1 Y_2 = X_1. Consequently, X_2 = Y_1 − X_1 = Y_1 − Y_1 Y_2 = Y_1 (1 − Y_2). Thus we obtain the function h from g(S) to S as

    h_1(y_1, y_2) = y_1 y_2,  h_2(y_1, y_2) = y_1 − y_1 y_2.

The density of (Y_1, Y_2) at the point (y_1, y_2) in g(S) is then computed, on noting that J_g(x_1, x_2)^{−1} = x_1 + x_2, as

    f_Y(y_1, y_2) = f_X(h_1(y_1, y_2), h_2(y_1, y_2)) J_h(y_1, y_2)
                  = λ² e^{−λ (h_1(y_1, y_2) + h_2(y_1, y_2))} J_g(h_1(y_1, y_2), h_2(y_1, y_2))^{−1}
                  = λ² e^{−λ (y_1 y_2 + y_1 − y_1 y_2)} y_1.

Thus we can rewrite the density of (Y_1, Y_2) as

    f_Y(y_1, y_2) = (λ² e^{−λ y_1} y_1) 1{y_1 > 0} · 1{0 < y_2 < 1}.

The above shows immediately that Y_1 and Y_2 are independent, and that Y_1 follows Γ(2, λ) while Y_2 follows U(0, 1). Here I am tacitly using the proposition that factorization of a joint density as a product of marginal densities is a necessary and sufficient condition for independence of random variables, a fact you would have learnt in Stat/Math 425.
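The factorization above can also be checked by a quick simulation (again, not part of the original notes); the sketch below uses numpy, with λ = 2 and n = 10⁶ as arbitrary illustrative choices.

    # Monte Carlo check: with X1, X2 i.i.d. Exponential(lam), Y1 = X1 + X2 should
    # behave like Gamma(2, lam) and Y2 = X1/(X1 + X2) like Uniform(0, 1).
    import numpy as np

    rng = np.random.default_rng(0)
    lam, n = 2.0, 10**6

    x1 = rng.exponential(scale=1/lam, size=n)    # numpy's scale is the mean 1/lam
    x2 = rng.exponential(scale=1/lam, size=n)

    y1 = x1 + x2
    y2 = x1 / (x1 + x2)

    # Gamma(2, lam) has mean 2/lam and variance 2/lam**2.
    print(y1.mean(), 2/lam)              # both close to 1.0
    print(y1.var(), 2/lam**2)            # both close to 0.5
    # Uniform(0, 1) has mean 1/2 and variance 1/12.
    print(y2.mean(), y2.var())           # close to 0.5 and 0.0833
    # Independence implies (in particular) zero correlation between Y1 and Y2.
    print(np.corrcoef(y1, y2)[0, 1])     # close to 0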

Here is another application of the Change of Variable Theorem, and one that gives a way of generating observations from a normal distribution. Let X and Y be i.i.d. N(0, 1) random variables. Let R be the length of the radius vector corresponding to the point (X, Y), and let Θ be the angle that this radius vector subtends with the positive direction of the x-axis. Thus (R, Θ) represents the vector (X, Y) in polar co-ordinates, and we have the equations X = R cos Θ and Y = R sin Θ. (Recall the picture that I drew in class.) We want to find the joint density of (R, Θ). Note that (R, Θ) lives, with probability 1, in the open set (0, ∞) × (0, 2π).

When we express X and Y in terms of R and Θ we are looking at the inverse transformation h; the transformation g that maps (X, Y) to (R, Θ) is a nice transformation in the sense that it satisfies assumptions (1), (2) and (3) of the Change of Variable Theorem. We first write down the joint density of (X, Y):

    f_{X,Y}(x, y) = f_X(x) f_Y(y) = (1/√(2π)) exp(−x²/2) · (1/√(2π)) exp(−y²/2) = (1/(2π)) exp(−(x² + y²)/2).

Now,

    (x, y) = (h_1(r, θ), h_2(r, θ)) ≡ (r cos θ, r sin θ).

We next compute the Jacobian of h at the point (r, θ). This is

    J_h(r, θ) = | ∂x/∂r · ∂y/∂θ − ∂y/∂r · ∂x/∂θ | = | cos θ · r cos θ − sin θ · (−r sin θ) | = r cos²θ + r sin²θ = r.

Thus the joint density of (R, Θ) is

    f_{R,Θ}(r, θ) = (1/(2π)) exp(−(h_1(r, θ)² + h_2(r, θ)²)/2) J_h(r, θ) 1{r > 0} 1{0 < θ < 2π}
                  = (1/(2π)) exp(−(r² cos²θ + r² sin²θ)/2) r 1{r > 0} 1{0 < θ < 2π}
                  = (1/(2π)) 1{0 < θ < 2π} · r exp(−r²/2) 1{r > 0}.

This immediately shows that R and Θ are independent, and that Θ has the uniform distribution on (0, 2π) with marginal density

    f_Θ(θ) = (1/(2π)) 1{0 < θ < 2π}.

The density of R is

    f_R(r) = r exp(−r²/2) 1{r > 0}.

Thus, if we generate R and Θ independently, with marginal distributions given as above, then X = R cos Θ and Y = R sin Θ are i.i.d. N(0, 1) random variables.

To generate R and Θ we proceed as follows. Recall that if F is the distribution function of a random variable X, then F^{−1}(U) has the same distribution as X, where U is a random variable distributed uniformly on (0, 1). Now, it is easy to show (by using the change of variable theorem in one dimension, discussed in the previous section) that R² follows the Exponential(1/2) distribution (this is left as an exercise). If F denotes the distribution function of the Exponential(1/2) distribution, we have

    F(w) = 1 − exp(−w/2),

so that

    F^{−1}(p) = −2 log(1 − p).

Thus, if U_1 and U_2 are i.i.d. U(0, 1) random variables, then −2 log(1 − U_1) follows the Exponential(1/2) distribution and 2π U_2 has a uniform distribution on (0, 2π). Consequently, we can take

    R = √(−2 log(1 − U_1))  and  Θ = 2π U_2.
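This is the classical Box-Muller construction. Purely as an illustration (numpy has its own normal generator, and this sketch is not part of the original notes), here is a minimal implementation with a quick empirical sanity check; the sample size n = 10⁶ is arbitrary.

    # Generate i.i.d. N(0, 1) pairs from uniforms via (R, Theta), as described above.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 10**6

    u1 = rng.uniform(size=n)
    u2 = rng.uniform(size=n)

    r = np.sqrt(-2.0 * np.log(1.0 - u1))     # R^2 ~ Exponential(rate 1/2)
    theta = 2.0 * np.pi * u2                 # Theta ~ Uniform(0, 2*pi)

    x = r * np.cos(theta)
    y = r * np.sin(theta)

    # Both coordinates should look standard normal and be uncorrelated.
    print(x.mean(), x.std())                 # close to 0 and 1
    print(y.mean(), y.std())                 # close to 0 and 1
    print(np.corrcoef(x, y)[0, 1])           # close to 0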

Relevant reading from Rice's book: Chapter 3, with emphasis on Sections 3.1 through 3.6. Potential problems for discussion: Problems 19, 4, 48, 65.