Joint distribution. Marginal distributions.


Joint distribution

To specify the joint distribution of $n$ random variables $X_1,\ldots,X_n$ that take values in the sample spaces $E_1,\ldots,E_n$ we need a probability measure, $P$, on
$$E_1 \times \cdots \times E_n = \{(x_1,\ldots,x_n) \mid x_i \in E_i,\ i = 1,\ldots,n\}.$$
For $A \subseteq E_1 \times \cdots \times E_n$,
$$P((X_1,\ldots,X_n) \in A) = P(A),$$
and for $A_1 \subseteq E_1,\ldots,A_n \subseteq E_n$,
$$P(X_1 \in A_1,\ldots,X_n \in A_n) = P(A_1 \times \cdots \times A_n).$$
We say that $P$ is the joint distribution of the bundled variable $X = (X_1,\ldots,X_n)$.

Marginal distributions

If $P$ is the joint distribution of $X = (X_1,\ldots,X_n)$ we can get the marginal distribution of $X_i$ as
$$P_i(A) = P(X_i \in A) = P(E_1 \times \cdots \times E_{i-1} \times A \times E_{i+1} \times \cdots \times E_n)$$
for $A \subseteq E_i$.
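For finite sample spaces the defining formula for $P_i$ is just a sum over all the other coordinates. A minimal sketch in Python; the dict representation and the toy numbers are assumptions for illustration, not from the slides:

```python
# A joint distribution on a finite product space E1 x E2, represented as a
# dict mapping outcomes (x1, x2) to point probabilities. The function sums
# out every coordinate except the one requested.

def marginal(joint, coord):
    """Marginal distribution of coordinate `coord` from a joint distribution
    given as {(x_1, ..., x_n): probability}."""
    p = {}
    for outcome, prob in joint.items():
        x_i = outcome[coord]
        p[x_i] = p.get(x_i, 0.0) + prob
    return p

# A toy joint distribution of (X1, X2) on {0, 1} x {"a", "b"}.
joint = {(0, "a"): 0.1, (0, "b"): 0.3, (1, "a"): 0.2, (1, "b"): 0.4}

print(marginal(joint, 0))  # {0: 0.4, 1: 0.6}
print(marginal(joint, 1))  # {'a': 0.3..., 'b': 0.7}
```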

Independence

Specification of the marginal distributions alone is not enough to specify the joint distribution; we also need to specify how the variables we consider are related.

Definition: We say that $X_1,\ldots,X_n$ are independent if
$$P(X_1 \in A_1,\ldots,X_n \in A_n) = P(X_1 \in A_1) \cdots P(X_n \in A_n). \qquad (1)$$
If we specify the marginal distributions of $X_1,\ldots,X_n$ and say that the variables are independent, then we have specified the joint distribution by equation (1).

Transformations

Theorem: If $X_1$ and $X_2$ are independent random variables taking values in $E_1$ and $E_2$ respectively, and if $h_1 : E_1 \to E_1'$ and $h_2 : E_2 \to E_2'$ are two transformations, then the random variables $h_1(X_1)$ and $h_2(X_2)$ are independent.

Marginal transformations preserve independence.
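Equation (1) can be read constructively: given the marginals and the independence assumption, the joint distribution is the product measure. A small sketch with assumed toy marginals:

```python
# Under independence, the joint point probabilities are the products of the
# marginal point probabilities; events of product form factorize as in (1).
from itertools import product

pX = {0: 0.4, 1: 0.6}        # assumed marginal of X
pY = {"a": 0.3, "b": 0.7}    # assumed marginal of Y

# Independence specifies the joint distribution completely:
joint = {(x, y): pX[x] * pY[y] for x, y in product(pX, pY)}

# Check equation (1) on an event A1 x A2.
A1, A2 = {1}, {"a", "b"}
lhs = sum(joint[(x, y)] for x in A1 for y in A2)
rhs = sum(pX[x] for x in A1) * sum(pY[y] for y in A2)
assert abs(lhs - rhs) < 1e-12
```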

Discrete sample spaces

If $E_1,\ldots,E_n$ are discrete sample spaces, so is $E = E_1 \times \cdots \times E_n$, and the joint distribution is given in terms of point probabilities $p(x_1,\ldots,x_n)$, $x_i \in E_i$, $i = 1,\ldots,n$.

The marginal distribution of $X_i$ has point probabilities
$$p_i(x_i) = P(X_i = x_i) = \sum_{x_1,\ldots,x_{i-1},x_{i+1},\ldots,x_n} p(x_1,\ldots,x_{i-1},x_i,x_{i+1},\ldots,x_n)$$
for $x_i \in E_i$.

Example

Consider $E = E_0 \times E_0 = \{A, C, G, T\} \times \{A, C, G, T\}$, and let $X$ and $Y$ denote random variables representing two evolutionary related nucleic acids in a DNA sequence. Let the joint distribution of $X$ and $Y$ have point probabilities

X \ Y     A       C       G       T
A      0.1272  0.0063  0.0464  0.0051
C      0.0196  0.2008  0.0082  0.0726
G      0.0556  0.0145  0.2151  0.0071
T      0.0146  0.0685  0.0069  0.1315

Let $A = \{(x, y) \in E \mid x = y\}$ denote the event that the two related nucleic acids are identical; then
$$P(X = Y) = P(A) = 0.1272 + 0.2008 + 0.2151 + 0.1315 = 0.6746.$$
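The marginals and $P(X = Y)$ are straightforward to recompute from the table; a short sketch (the variable names are ours):

```python
# P(X = Y) and the marginals from the joint point probabilities of the
# DNA example above.

bases = ["A", "C", "G", "T"]
p = {  # p[(x, y)] = P(X = x, Y = y)
    ("A", "A"): 0.1272, ("A", "C"): 0.0063, ("A", "G"): 0.0464, ("A", "T"): 0.0051,
    ("C", "A"): 0.0196, ("C", "C"): 0.2008, ("C", "G"): 0.0082, ("C", "T"): 0.0726,
    ("G", "A"): 0.0556, ("G", "C"): 0.0145, ("G", "G"): 0.2151, ("G", "T"): 0.0071,
    ("T", "A"): 0.0146, ("T", "C"): 0.0685, ("T", "G"): 0.0069, ("T", "T"): 0.1315,
}

p_equal = sum(p[(x, x)] for x in bases)
print(f"P(X = Y) = {p_equal:.4f}")  # 0.6746

p_X = {x: sum(p[(x, y)] for y in bases) for x in bases}  # row sums
p_Y = {y: sum(p[(x, y)] for x in bases) for y in bases}  # column sums
print(p_X)  # approximately {'A': 0.1850, 'C': 0.3012, 'G': 0.2923, 'T': 0.2215}
```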

Independence and point probabilities

Theorem: If $X_1,\ldots,X_n$ are random variables with values in discrete sample spaces, then they are independent if and only if
$$P(X_1 = x_1,\ldots,X_n = x_n) = P(X_1 = x_1) \cdots P(X_n = x_n).$$
In words, the random variables are independent if and only if the point probabilities for their joint distribution factorize as a product of the point probabilities for their marginal distributions.

Example

The same example as above, but with the point probabilities for the marginal distributions added (row sums give the marginal of $X$, column sums the marginal of $Y$):

X \ Y     A       C       G       T
A      0.1272  0.0063  0.0464  0.0051  0.1850
C      0.0196  0.2008  0.0082  0.0726  0.3012
G      0.0556  0.0145  0.2151  0.0071  0.2923
T      0.0146  0.0685  0.0069  0.1315  0.2215
       0.2170  0.2901  0.2766  0.2163

Note that $X$ and $Y$ are not independent! For instance,
$$0.1272 = P((X, Y) = (A, A)) \neq P(X = A)\,P(Y = A) = 0.1850 \cdot 0.2170 = 0.0401.$$

Example

X \ Y     A       C       G       T
A      0.0401  0.0537  0.0512  0.0400  0.1850
C      0.0654  0.0874  0.0833  0.0651  0.3012
G      0.0634  0.0848  0.0809  0.0632  0.2923
T      0.0481  0.0643  0.0613  0.0479  0.2215
       0.2170  0.2901  0.2766  0.2163

Same marginals as above, but here $X$ and $Y$ are independent: each entry is the product of its row and column marginals.
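Continuing the previous sketch (reusing `p`, `p_X` and `p_Y`), the factorization criterion can be checked entry by entry, and the second table can be rebuilt as the product of the marginals:

```python
# Independence check by factorization: compare each joint point probability
# with the product of the marginals.

def is_independent(p, p_X, p_Y, tol=1e-6):
    return all(abs(p[(x, y)] - p_X[x] * p_Y[y]) < tol
               for x in p_X for y in p_Y)

print(is_independent(p, p_X, p_Y))  # False: 0.1272 != 0.1850 * 0.2170

# The second table has the same marginals but is independent by construction:
p_indep = {(x, y): p_X[x] * p_Y[y] for x in p_X for y in p_Y}
print(is_independent(p_indep, p_X, p_Y))  # True
```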

The bivariate normal distribution

The function
$$f(x, y) = \frac{\sqrt{1 - \rho^2}}{2\pi} \exp\left(-\frac{x^2 - 2\rho xy + y^2}{2}\right)$$
for $\rho \in (-1, 1)$ on $\mathbb{R}^2$ is an example of a bivariate density on $\mathbb{R}^2$. It satisfies that $f(x, y) \geq 0$ and
$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1.$$

[Figure: surface plots of the bivariate normal density, with ρ = 0 (left) and ρ = 0.75 (right).]

The random variables $X$ and $Y$ have joint distribution with density $f$ if
$$P(a_1 \leq X \leq b_1,\ a_2 \leq Y \leq b_2) = \int_{a_1}^{b_1} \int_{a_2}^{b_2} f(x, y)\,dy\,dx.$$
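As a numerical sanity check (an assumed sketch, not part of the slides), a plain midpoint Riemann sum over a large box confirms that $f$ integrates to 1:

```python
import math

def f(x, y, rho):
    """The bivariate normal density from the slide."""
    return (math.sqrt(1.0 - rho * rho) / (2.0 * math.pi)
            * math.exp(-(x * x - 2.0 * rho * x * y + y * y) / 2.0))

# Midpoint Riemann sum over [-8, 8]^2; the tails beyond are negligible.
rho, h, L = 0.75, 0.02, 8.0
n = int(2 * L / h)
total = sum(f(-L + (i + 0.5) * h, -L + (j + 0.5) * h, rho)
            for i in range(n) for j in range(n)) * h * h
print(total)  # approximately 1.0
```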

We find that the marginal distribution of $X$ is given by
$$P(a \leq X \leq b) = \int_a^b \int_{-\infty}^{\infty} f(x, y)\,dy\,dx = \int_a^b \frac{\sqrt{1 - \rho^2}}{\sqrt{2\pi}}\, e^{-(1 - \rho^2)x^2/2}\,dx,$$
which shows that the marginal distribution of $X$ is $N(0, (1 - \rho^2)^{-1})$.

The marginal distribution of $Y$ is likewise $N(0, (1 - \rho^2)^{-1})$, but $\rho$ also determines the dependence between $X$ and $Y$. The variables $X$ and $Y$ are independent if and only if $\rho = 0$.
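Continuing the sketch above (reusing `f` and `math`), integrating out $y$ numerically reproduces the claimed $N(0, (1 - \rho^2)^{-1})$ marginal density:

```python
def marginal_x(x, rho, h=0.01, L=10.0):
    """Integrate y out of f numerically (midpoint rule)."""
    n = int(2 * L / h)
    return sum(f(x, -L + (j + 0.5) * h, rho) for j in range(n)) * h

def claimed_density(x, rho):
    tau = 1.0 - rho * rho  # the marginal is N(0, 1/tau)
    return math.sqrt(tau / (2.0 * math.pi)) * math.exp(-tau * x * x / 2.0)

for x in (0.0, 0.5, 1.0, 2.0):
    print(x, marginal_x(x, 0.75), claimed_density(x, 0.75))  # columns agree
```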

General results

If $X_1,\ldots,X_n$ are real valued random variables, we can specify their joint distribution by a density $f : \mathbb{R}^n \to [0, \infty)$ such that
$$P(a_1 \leq X_1 \leq b_1,\ldots,a_n \leq X_n \leq b_n) = \int_{a_1}^{b_1} \cdots \int_{a_n}^{b_n} f(x_1,\ldots,x_n)\,dx_n \cdots dx_1,$$
which can be computed as $n$ successive ordinary integrals (the order does not matter).

The marginal distribution of $X_i$ has density
$$f_i(x_i) = \underbrace{\int \cdots \int}_{n-1} f(x_1,\ldots,x_n)\,dx_n \cdots dx_{i+1}\,dx_{i-1} \cdots dx_1.$$

The $X_i$'s are independent if
$$f(x_1,\ldots,x_n) = f_1(x_1) \cdots f_n(x_n).$$

The model builder's approach

We want to construct a probabilistic model for the random variables $X_1,\ldots,X_n$.

Can we assume that the variables are independent? If yes, continue.

Can we assume that the variables all have the same marginal distribution? If yes, continue.

Then we say: let $X_1,\ldots,X_n$ be iid = independent and identically distributed. We then need to specify either the point probabilities or the density for their common distribution, and the joint distribution is given by products.

Conditional distributions

Definition: The conditional distribution of $Y$ given that $X \in A$ is defined as
$$P(Y \in B \mid X \in A) = \frac{P(Y \in B, X \in A)}{P(X \in A)},$$
provided that $P(X \in A) > 0$.

If $X$ and $Y$ are discrete, we can condition on events $X = x$ and get conditional distributions in terms of point probabilities:
$$p(y \mid x) = P(Y = y \mid X = x) = \frac{P(Y = y, X = x)}{P(X = x)} = \frac{p(x, y)}{\sum_y p(x, y)},$$
where $p(x, y)$ are the joint point probabilities.
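Continuing the DNA sketch (reusing `p`, `p_X` and `bases`), dividing each row of the joint table by its row sum gives the conditional distributions; up to rounding this reproduces the matrix in the next example:

```python
# Conditional point probabilities p(y | x) = p(x, y) / p_X(x): each row of
# the joint table divided by its row sum.
cond = {(x, y): p[(x, y)] / p_X[x] for x in bases for y in bases}

for x in bases:
    print(x, "  ".join(f"{cond[(x, y)]:.4f}" for y in bases))  # rows sum to 1
# A 0.6876  0.0341  0.2508  0.0276   <- matches the matrix below up to rounding
```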

Example

Using
$$P(Y = y \mid X = x) = \frac{P(X = x, Y = y)}{P(X = x)} = \frac{p(x, y)}{\sum_{y \in E} p(x, y)},$$
we have to divide by precisely the row sums to get the matrix of conditional distributions:

X \ Y     A       C       G       T
A      0.6874  0.0343  0.2507  0.0276
C      0.0649  0.6667  0.0273  0.2411
G      0.1904  0.0495  0.7359  0.0242
T      0.0658  0.3093  0.0311  0.5938

The row sums above equal 1, and this is an example of a matrix of transition probabilities.

Systematic specification

Dependence among discrete variables indexed by a time parameter can be treated systematically. We can define a collection of conditional probabilities, $P_t(x, y)$ for $t \geq 0$, of $Y = y$ given $X = x$ as the solution to a system of differential equations:
$$\frac{dP_t(x, y)}{dt} = \sum_z P_t(x, z)\,\lambda(z, y),$$
where $\lambda(z, y) \geq 0$ for $z \neq y$ and
$$\lambda(z, z) = -\sum_{y \neq z} \lambda(z, y).$$
The $\lambda(y, z)$'s are called intensities.

Solution

On a finite sample space and with the initial condition $P_0(x, x) = 1$, the above system of differential equations has a unique solution such that $P_t(x, \cdot)$ is a (conditional) probability measure for all $x$. In general there is no closed form expression for the solution.

The Jukes-Cantor model

Intensities:

        A      C      G      T
A     -3α      α      α      α
C       α    -3α      α      α
G       α      α    -3α      α
T       α      α      α    -3α

The parameter $\alpha > 0$ tells how many mutations occur per time unit. The solution is
$$P_t(x, x) = 0.25 + 0.75\exp(-4\alpha t),$$
$$P_t(x, y) = 0.25 - 0.25\exp(-4\alpha t), \quad \text{if } x \neq y.$$

The Kimura model

Intensities:

        A         C         G         T
A   -(α+2β)      β         α         β
C       β     -(α+2β)      β         α
G       α         β     -(α+2β)      β
T       β         α         β     -(α+2β)

for $\alpha, \beta > 0$, and the solution is
$$P_t(x, x) = 0.25 + 0.25\exp(-4\beta t) + 0.5\exp(-2(\alpha + \beta)t),$$
$$P_t(x, y) = 0.25 + 0.25\exp(-4\beta t) - 0.5\exp(-2(\alpha + \beta)t), \quad \text{if } \lambda(x, y) = \alpha,$$
$$P_t(x, y) = 0.25 - 0.25\exp(-4\beta t), \quad \text{if } \lambda(x, y) = \beta.$$

Linear regression

We often specify the conditional distribution of a real valued random variable $Y$ given $X = x$ for another real valued random variable $X$ by writing
$$Y = \alpha + \beta x + \varepsilon,$$
where $\varepsilon$ is another mean 0 random variable (the noise), which is independent of $X$. This is a location transformation of the distribution of $\varepsilon$. The conditional mean of $Y$ given $X = x$ is $\alpha + \beta x$.
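The closed-form Jukes-Cantor solution can be checked against a direct numerical solution of $dP_t/dt = P_t\Lambda$. The following assumed sketch uses explicit Euler steps from $P_0 = I$:

```python
# Numerical check of the Jukes-Cantor solution: step dP/dt = P * Lambda
# forward with explicit Euler and compare with the closed form.
import math

alpha, t, steps = 0.3, 1.0, 10_000
lam = [[-3 * alpha if i == j else alpha for j in range(4)] for i in range(4)]

P = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]  # P_0 = I
h = t / steps
for _ in range(steps):
    P = [[P[i][j] + h * sum(P[i][k] * lam[k][j] for k in range(4))
          for j in range(4)] for i in range(4)]

same = 0.25 + 0.75 * math.exp(-4 * alpha * t)  # closed form P_t(x, x)
diff = 0.25 - 0.25 * math.exp(-4 * alpha * t)  # closed form P_t(x, y), x != y
print(P[0][0], same)  # approximately equal
print(P[0][1], diff)  # approximately equal
```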

Conditional densities

Definition: If $f$ is the density for the joint distribution of two random variables $X$ and $Y$ taking values in $\mathbb{R}^n$ and $\mathbb{R}^m$, respectively, then with
$$f_1(x) = \int_{\mathbb{R}^m} f(x, y)\,dy$$
we define the conditional distribution of $Y$ given $X = x$ to be the distribution with density
$$f(y \mid x) = \frac{f(x, y)}{f_1(x)}$$
for $y \in \mathbb{R}^m$ and $x \in \mathbb{R}^n$ with $f_1(x) > 0$.

In the linear regression above, if $\varepsilon \sim N(0, \sigma^2)$ then $Y \mid X = x \sim N(\alpha + \beta x, \sigma^2)$.

Note the formula
$$f(x, y) = f(y \mid x)\,f_1(x),$$
which allows us to specify the joint distribution by specifying the marginal distribution of $X$ and the conditional distribution of $Y$ given $X$.
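For the bivariate normal density from before, the conditional distribution of $Y$ given $X = x$ works out to $N(\rho x, 1)$, which is a linear regression with $\alpha = 0$, $\beta = \rho$ and $\sigma^2 = 1$. A numerical check, continuing the earlier sketches (reusing `f`, `marginal_x` and `math`):

```python
# Conditional density f(y | x) = f(x, y) / f_1(x) for the bivariate normal,
# compared against the N(rho * x, 1) density it should equal.
rho, x = 0.75, 1.0
f1x = marginal_x(x, rho)                   # marginal density of X at x
for y in (-1.0, 0.0, 0.75, 2.0):
    cond = f(x, y, rho) / f1x              # conditional density at y
    normal = (math.exp(-(y - rho * x) ** 2 / 2.0)
              / math.sqrt(2.0 * math.pi))  # N(rho * x, 1) density at y
    print(y, cond, normal)                 # the two columns agree closely
```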

Generalized linear models

We consider the setup with the probability measures $P_\theta$ on a discrete $E_1 \subseteq \mathbb{R}$ given by the point probabilities
$$p_\theta(x) = \exp(\theta x - b(\theta) + c(x))$$
for $\theta \in \Theta \subseteq \mathbb{R}$.

If $Y$ is a real valued random variable, we can define the conditional distribution of $X$ given $Y = y$ to be $P_{\beta_0 + \beta_1 y}$. That is, the conditional point probabilities for the distribution of $X$ given $Y = y$ are
$$p(x \mid y) = p_{\beta_0 + \beta_1 y}(x) = \exp((\beta_0 + \beta_1 y)x - b(\beta_0 + \beta_1 y) + c(x)).$$
The conditional mean of $X$ given $Y = y$ is $b'(\beta_0 + \beta_1 y)$.
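A concrete instance (our example, not from the slides): the Poisson family has this exponential-family form with $b(\theta) = e^\theta$ and $c(x) = -\log x!$, so the conditional mean is $b'(\beta_0 + \beta_1 y) = e^{\beta_0 + \beta_1 y}$, i.e. Poisson regression with a log link:

```python
# Poisson point probabilities in exponential-family form:
#   p_theta(x) = exp(theta*x - exp(theta) - log(x!))  =  Poisson(exp(theta)),
# so b(theta) = exp(theta) and the conditional mean of X given Y = y is
# b'(b0 + b1*y) = exp(b0 + b1*y).
import math

def p_theta(x, theta):
    return math.exp(theta * x - math.exp(theta) - math.lgamma(x + 1))

b0, b1, y = 0.2, 0.5, 1.0
theta = b0 + b1 * y
mean = sum(x * p_theta(x, theta) for x in range(200))  # E(X | Y = y)
print(mean, math.exp(theta))  # both approximately exp(0.7) = 2.0138
```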