STAT 450: Statistical Theory. Distribution Theory. Reading in Casella and Berger: Ch 2 Sec 1, Ch 4 Sec 1, Ch 4 Sec 6.

Similar documents
STAT 801: Mathematical Statistics. Distribution Theory

STAT 450: Statistical Theory. Distribution Theory. Reading in Casella and Berger: Ch 2 Sec 1, Ch 4 Sec 1, Ch 4 Sec 6.

2 Functions of random variables

Change Of Variable Theorem: Multiple Dimensions

3 Algebraic Methods. we can differentiate both sides implicitly to obtain a differential equation involving x and y:

STAT 450. Moment Generating Functions

Math 3215 Intro. Probability & Statistics Summer 14. Homework 5: Due 7/3/14

MTH739U/P: Topics in Scientific Computing Autumn 2016 Week 6

Chapter 2: Differentiation

Chapter 2: Differentiation

Multiple Random Variables

Definition: If y = f(x), then. f(x + x) f(x) y = f (x) = lim. Rules and formulas: 1. If f(x) = C (a constant function), then f (x) = 0.

Transformation of Probability Densities

Higher Mathematics Course Notes

Distributions of Functions of Random Variables. 5.1 Functions of One Random Variable

Chapter 4 Multiple Random Variables

Random Variables and Their Distributions

Math 5a Reading Assignments for Sections

Calculus I Review Solutions

Tangent Planes, Linear Approximations and Differentiability

MAS223 Statistical Inference and Modelling Exercises

This exam is closed book and closed notes. (You will have access to a copy of the Table of Common Distributions given in the back of the text.

conditional cdf, conditional pdf, total probability theorem?

6.6 Inverse Trigonometric Functions

Functions of Random Variables

Joint Probability Distributions and Random Samples (Devore Chapter Five)

JUST THE MATHS UNIT NUMBER INTEGRATION APPLICATIONS 13 (Second moments of a volume (A)) A.J.Hobson

MATH 250 TOPIC 13 INTEGRATION. 13B. Constant, Sum, and Difference Rules

Multivariable Calculus and Matrix Algebra-Summer 2017

10550 PRACTICE FINAL EXAM SOLUTIONS. x 2 4. x 2 x 2 5x +6 = lim x +2. x 2 x 3 = 4 1 = 4.

13 Implicit Differentiation

B553 Lecture 1: Calculus Review

Partial Derivatives Formulas. KristaKingMath.com

Lecture 13 - Wednesday April 29th

The Multivariate Normal Distribution 1

This does not cover everything on the final. Look at the posted practice problems for other topics.

Math/Stats 425, Sec. 1, Fall 04: Introduction to Probability. Final Exam: Solutions

Review for the First Midterm Exam

1 Solution to Problem 2.1

MSM120 1M1 First year mathematics for civil engineers Revision notes 4

Order Statistics and Distributions

p. 6-1 Continuous Random Variables p. 6-2

Stat 5101 Lecture Slides: Deck 8 Dirichlet Distribution. Charles J. Geyer School of Statistics University of Minnesota

Lecture 3: Random variables, distributions, and transformations

Math 101 Fall 2006 Exam 1 Solutions Instructor: S. Cautis/M. Simpson/R. Stong Thursday, October 5, 2006

4. CONTINUOUS RANDOM VARIABLES

Partial Fractions. June 27, In this section, we will learn to integrate another class of functions: the rational functions.

Example. Evaluate. 3x 2 4 x dx.

The Chain Rule. The Chain Rule. dy dy du dx du dx. For y = f (u) and u = g (x)

f(x, y) = 1 sin(2x) sin(2y) and determine whether each is a maximum, minimum, or saddle. Solution: Critical points occur when f(x, y) = 0.

Continuous Random Variables

Complex Differentials and the Stokes, Goursat and Cauchy Theorems

Transformations and Expectations

CHAPTER 2. The Derivative

JUST THE MATHS UNIT NUMBER NUMERICAL MATHEMATICS 6 (Numerical solution) of (ordinary differential equations (A)) A.J.Hobson

MA2AA1 (ODE s): The inverse and implicit function theorem

4. Distributions of Functions of Random Variables

Math 127C, Spring 2006 Final Exam Solutions. x 2 ), g(y 1, y 2 ) = ( y 1 y 2, y1 2 + y2) 2. (g f) (0) = g (f(0))f (0).

There are some trigonometric identities given on the last page.

Calculus I Practice Exam 2

STA 2201/442 Assignment 2

Sec. 14.3: Partial Derivatives. All of the following are ways of representing the derivative. y dx

Essential Maths 1. Macquarie University MAFC_Essential_Maths Page 1 of These notes were prepared by Anne Cooper and Catriona March.

MATH 307: Problem Set #3 Solutions

Core Mathematics 3 Differentiation

School of the Art Institute of Chicago. Calculus. Frank Timmes. flash.uchicago.edu/~fxt/class_pages/class_calc.

Chapter 2 Continuous Distributions

P1 Calculus II. Partial Differentiation & Multiple Integration. Prof David Murray. dwm/courses/1pd

0.0.1 Moment Generating Functions

Calculus II Practice Test 1 Problems: , 6.5, Page 1 of 10

Rational Functions. Elementary Functions. Algebra with mixed fractions. Algebra with mixed fractions

Handout 5, Summer 2014 Math May Consider the following table of values: x f(x) g(x) f (x) g (x)

Multivariate distributions

Analysis/Calculus Review Day 2

Numerical differentiation

Lecture 1: August 28

1 Lecture 39: The substitution rule.

(c) The first thing to do for this problem is to create a parametric curve for C. One choice would be. (cos(t), sin(t)) with 0 t 2π

1. Use the properties of exponents to simplify the following expression, writing your answer with only positive exponents.

Appendix 5. Derivatives of Vectors and Vector-Valued Functions. Draft Version 24 November 2008

Second Order ODEs. Second Order ODEs. In general second order ODEs contain terms involving y, dy But here only consider equations of the form

The Multivariate Normal Distribution. In this case according to our theorem

In this chapter we study elliptical PDEs. That is, PDEs of the form. 2 u = lots,

Systems of Linear ODEs

MATH 411 NOTES (UNDER CONSTRUCTION)

Solution. This is a routine application of the chain rule.

STA 256: Statistics and Probability I

Ch3. Generating Random Variates with Non-Uniform Distributions

Review for the Final Exam

Calculus II Practice Test Problems for Chapter 7 Page 1 of 6

Math Refresher Course

f(x) g(x) = [f (x)g(x) dx + f(x)g (x)dx

Chain Rule. MATH 311, Calculus III. J. Robert Buchanan. Spring Department of Mathematics

Introduction to Probability Theory

4 Partial Differentiation

Probability Models. 4. What is the definition of the expectation of a discrete random variable?

Lecture 25: Review. Statistics 104. April 23, Colin Rundel

4. We accept without proofs that the following functions are differentiable: (e x ) = e x, sin x = cos x, cos x = sin x, log (x) = 1 sin x

Math 115 Second Midterm March 25, 2010

Stat751 / CSI771 Midterm October 15, 2015 Solutions, Comments. f(x) = 0 otherwise

Transcription:

STAT 45: Statistical Theory Distribution Theory Reading in Casella and Berger: Ch 2 Sec 1, Ch 4 Sec 1, Ch 4 Sec 6. Basic Problem: Start with assumptions about f or CDF of random vector X (X 1,..., X p ). Define Y g(x 1,..., X p ) to be some function of X (usually some statistic of interest like a sample mean or correlation coefficient or a test statistic). How can we compute the distribution or CDF or density of Y? Univariate Techniques In a univariate problem we have X a real-valued random variable and Y, a real valued transformation of X. Often we start with X and are told either the density or CDF of X and then asked to find the distribution of something like X 2 or sin(x) a function of X which we call a transformation. I am going to show you the steps you should follow in an example problem. Method 1: compute the CDF by integration and differentiate to find f Y. Example: Suppose U Uniform[, 1]. What is the distribution of log(x)?. Step 1: Make sure you give the transformed variable a name. Let Y log U. Step 2: Write down the definition of the CDF of Y : F Y (y) P (Y y). Step 3: Substitute in the definition of Y in terms of the original variable in this case, U: F Y (y) P (Y y) P ( log(u) y). Step 4: Solve the inequality to get, hopefully, one or more intervals for U. In this case log(u) y is the same as log(u) y 1

which is the same as So U e y. P ( log(u) Y ) P ( U e y). Step 5: now use the assumptions to write the resulting probability in terms of F U. P ( U e y) 1 P ( U < e y) 1 F U ( e y ). Step 6: Differentiate the resulting formula with respect to y. In this case the formula for F U (u) needs to be written carefully: 1 u > 1 F U (u) u < u < 1 u The quantity e y is never negative but if y < e y > 1 so { 1 F U (e y 1 y < ) 1 e y y Now we can differentiate. For negative y the derivative is. For positive y the derivative is e y. At y the derivative does not exist and the density f Y (y) does not have a unique definition So our density is y < f Y (y) undefined y e y y > This is the standard exponential density so we would finish the problem by saying that log(u) has a standard exponential distribution. I have done the individual steps in a bit of detail but here are the crucial pieces again in a chain of equalities: F Y (y) P (Y y) P ( log U y) P (log U y) P (U e y ) { 1 e y y > y. 2

Differentiating we find f Y (y) d dy F Y (y) so Y has standard exponential distribution. Example: Z N(, 1), i.e. and Y Z 2. Then Now differentiate y < undefined y e y y > f Z (z) 1 2π e z2 /2 F Y (y) P (Z 2 y) { y < P ( y Z y) y. P ( y Z y) F Z ( y) F Z ( y) to get Then f Y (y) [ y < d FZ ( y) F dy Z ( y) ] y > undefined y. d dy F Z( y) f Z ( y) d dy 1 2π exp y ( ( y) 2 /2 ) 1 2 y 1/2 1 2 2πy e y/2. (Similar formula for other derivative.) Thus 1 2πy e y/2 y > f Y (y) y < undefined y. 3

This is the density of a χ 2 1 random variable because the chi-squared distributing with ν degrees of freedom is defined to be the distribution of the sum of ν squared independent standard normal variables. I am now going to rewrite this density using indicator notation which can be very useful: { 1 y > 1(y > ) y which we use to write f Y (y) 1 2πy e y/2 1(y > ) (changing definition unimportantly at y ). Notice: I never evaluated F Y before differentiating it. In fact F Y and F Z are integrals I can t do but I can differentiate them anyway. Remember the fundamental theorem of calculus: d dx x a f(y) dy f(x) at any x where f is continuous. Summary: for Y g(x) with X and Y each real valued P (Y y) P (g(x) y) Take d/dy to compute the density f Y (y) d dy P (X g 1 (, y]). {x:g(x) y} Often can differentiate without doing integral. f X (x) dx. Method 2: Change of variables. Assume g is one to one. I do: g is increasing and differentiable. Interpretation of density (based on density F ): f Y (y) P (y Y y + δy) lim δy δy F Y (y + δy) F Y (y) lim δy δy 4

and P (x X x + δx) f X (x) lim. δx δx Now assume y g(x). Define δy by y + δy g(x + δx). Then Get Take limit to get or P (y Y g(x + δx)) P (x X x + δx). P (y Y y + δy)) δy P (x X x + δx)/δx {g(x + δx) y}/δx. f Y (y) f X (x)/g (x) f Y (g(x))g (x) f X (x). Alternative view: Each probability is integral of a density. The first is the integral of the density of Y over the small interval from y g(x) to y g(x + δx). The interval is narrow so f Y is nearly constant and P (y Y g(x + δx)) f Y (y)(g(x + δx) g(x)). Since g has a derivative the difference and we get g(x + δx) g(x) δxg (x) P (y Y g(x + δx)) f Y (y)g (x)δx. Same idea applied to P (x X x + δx) gives so that or, cancelling the δx in the limit P (x X x + δx) f X (x)δx f Y (y)g (x)δx f X (x)δx f Y (y)g (x) f X (x). If you remember y g(x) then you get f X (x) f Y (g(x))g (x). 5

Or solve y g(x) to get x in terms of y, that is, x g 1 (y) and then f Y (y) f X (g 1 (y))/g (g 1 (y)). This is just the change of variables formula for doing integrals. Remark: For g decreasing g < but Then the interval (g(x), g(x + δx)) is really (g(x + δx), g(x)) so that g(x) g(x + δx) g (x)δx. In both cases this amounts to the formula f X (x) f Y (g(x)) g (x). Mnemonic: f Y (y)dy f X (x)dx. Example: : X Weibull(shape α, scale β) or f X (x) α β ( ) α 1 x exp { (x/β) α } 1(x > ). β Let Y log X or g(x) log(x). Solve y log x: x exp(y) or g 1 (y) e y. Then g (x) 1/x and 1/g (g 1 (y)) 1/(1/e y ) e y. Hence ( ) e y α 1 exp { (e y /β) α } 1(e y > )e y. f Y (y) α β β For any y, e y > so indicator 1. So Define φ log β and θ 1/α; then, f Y (y) α β α exp {αy eαy /β α }. f Y (y) 1 θ exp { y φ θ { }} y φ exp. θ Extreme Value density with location parameter φ and scale parameter θ. (Note: several distributions are called Extreme Value.) Marginalization 6

Simplest multivariate problem: (or in general Y is any X j ). X (X 1,..., X p ), Y X 1 Theorem 1 If X has density f(x 1,..., x p ) and q < p then Y (X 1,..., X q ) has density f Y (x 1,..., x q ) f(x 1,..., x p ) dx q+1... dx p. f X1,...,X q is the marginal density of X 1,..., X q and f X the joint density of X but they are both just densities. Marginal just to distinguish from the joint density of X. Example: : The function is a density provided The integral is f(x 1, x 2 ) Kx 1 x 2 1(x 1 >, x 2 >, x 1 + x 2 < 1) P (X R 2 ) K 1 1 x1 f(x 1, x 2 ) dx 1 dx 2 1. x 1 x 2 dx 1 dx 2 K so K 24. The marginal density of x 1 is f X1 (x 1 ) 24 24x 1 x 2 1 x 1 (1 x 1 ) 2 dx 1 /2 K(1/2 2/3 + 1/4)/2 K/24 1(x 1 >, x 2 >, x 1 + x 2 < 1) dx 2 1 x1 x 1 x 2 1( < x 1 < 1)dx 2 12x 1 (1 x 1 ) 2 1( < x 1 < 1). 7

This is a Beta(2, 3) density. General problem has Y (Y 1,..., Y q ) with Y i g i (X 1,..., X p ). Case 1: q > p. Y won t have density for smooth g. Y will have a singular or discrete distribution. Problem rarely of real interest. (But, e.g., residuals have singular distribution.) Case 2: q p. We use a change of variables formula which generalizes the one derived above for the case p q 1. (See below.) Case 3: q < p. Pad out Y add on p q more variables (carefully chosen) say Y q+1,..., Y p. Find functions g q+1,..., g p. Define for q < i p, Y i g i (X 1,..., X p ) and Z (Y 1,..., Y p ). Choose g i so that we can use change of variables on g (g 1,..., g p ) to compute f Z. Find f Y by integration: f Y (y 1,..., y q ) f Z (y 1,..., y q, z q+1,..., z p )dz q+1... dz p Change of Variables Suppose Y g(x) R p with X R p having density f X. Assume g is a one to one ( injective ) map, i.e., g(x 1 ) g(x 2 ) if and only if x 1 x 2. Find f Y : Step 1: Solve for x in terms of y: x g 1 (y). Step 2: Use basic equation: and rewrite it in the form Interpretation of derivative dx dy f Y (y)dy f X (x)dx f Y (y) f X (g 1 (y)) dx dy. when p > 1: dx dy which is the so called Jacobian. ( ) det xi y j 8

Equivalent formula inverts the matrix: This notation means dy dx det f Y (y) f X(g 1 (y)) dy dx y 1 x 1 y 1 x 2 y p x 1. y p x 2 but with x replaced by the corresponding value of y, that is, replace x by g 1 (y). y 1 x p y p x p Example: : The density f X (x 1, x 2 ) 1 { } 2π exp x2 1 + x 2 2 2 is the standard bivariate normal density. Let Y (Y 1, Y 2 ) where Y 1 X 2 1 + X 2 2 and Y 2 < 2π is angle from the positive x axis to the ray from the origin to the point (X 1, X 2 ). I.e., Y is X in polar co-ordinates. Solve for x in terms of y: so that X 1 Y 1 cos(y 2 ) X 2 Y 1 sin(y 2 ) g(x 1, x 2 ) (g 1 (x 1, x 2 ), g 2 (x 1, x 2 )) ( x 2 1 + x 2 2, argument(x 1, x 2 )) g 1 (y 1, y 2 ) (g1 1 (y 1, y 2 ), g2 1 (y 1, y 2 )) (y 1 cos(y 2 ), y 1 sin(y 2 )) ( dx dy det cos(y2 ) y 1 sin(y 2 ) sin(y 2 ) y 1 cos(y 2 ) y 1. It follows that f Y (y 1, y 2 ) 1 { } 2π exp y2 1 y 1 1( y 1 < )1( y 2 < 2π). 2 9 )

Next: marginal densities of Y 1, Y 2? Factor f Y as f Y (y 1, y 2 ) h 1 (y 1 )h 2 (y 2 ) where h 1 (y 1 ) y 1 e y2 1 /2 1( y 1 < ) and Then h 2 (y 2 ) 1( y 2 < 2π)/(2π). f Y1 (y 1 ) h 1 (y 1 ) h 1 (y 1 )h 2 (y 2 ) dy 2 h 2 (y 2 ) dy 2 so marginal density of Y 1 is a multiple of h 1. Multiplier makes f Y1 1 but in this case so that h 2 (y 2 ) dy 2 2π (2π) 1 dy 2 1 f Y1 (y 1 ) y 1 e y2 1 /2 1( y 1 < ). (Special case of Weibull or Rayleigh distribution.) Similarly f Y2 (y 2 ) 1( y 2 < 2π)/(2π) which is the Uniform((, 2π) density. Exercise: W Y1 2 /2 has standard exponential distribution. Recall: by definition U Y1 2 has a χ 2 distribution on 2 degrees of freedom. Exercise: find χ 2 2 density. Note: We show below factorization of density is equivalent to independence. 1