.0. TWO-DIMENSIONAL RANDOM VARIABLES

.0.7 Bivariate Normal Distribution

[Figure .: Bivariate Normal pdf]

Here we use matrix notation. A bivariate rv is treated as a random vector
$$ X = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix}. $$
The expectation of a bivariate random vector is written as
$$ \mu = E X = E\begin{pmatrix} X_1 \\ X_2 \end{pmatrix} = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix} $$
and its variance-covariance matrix is
$$ V = \begin{pmatrix} \operatorname{var} X_1 & \operatorname{cov}(X_1,X_2) \\ \operatorname{cov}(X_2,X_1) & \operatorname{var} X_2 \end{pmatrix} = \begin{pmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{pmatrix}. $$
Then the joint pdf of a normal bivariate rv $X$ is given by
$$ f_X(x) = \frac{1}{2\pi\sqrt{\det V}} \exp\left\{ -\frac{1}{2}(x-\mu)^T V^{-1} (x-\mu) \right\}, \qquad (.8) $$
where $x = (x_1, x_2)^T$.

The determinant of $V$ is
$$ \det V = \det\begin{pmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{pmatrix} = (1-\rho^2)\sigma_1^2\sigma_2^2. $$
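As a quick numerical sketch (not part of the text), the matrix form of the pdf and the determinant identity above can be checked against SciPy's multivariate normal. The parameter values below are arbitrary, chosen only for illustration.

```python
import numpy as np
from scipy.stats import multivariate_normal

# Illustrative (made-up) parameters
mu = np.array([1.0, 2.0])
s1, s2, rho = 1.5, 0.8, 0.6
V = np.array([[s1**2,     rho*s1*s2],
              [rho*s1*s2, s2**2    ]])

# det V = (1 - rho^2) sigma_1^2 sigma_2^2, as derived above
assert np.isclose(np.linalg.det(V), (1 - rho**2) * s1**2 * s2**2)

def bvn_pdf(x):
    """Matrix form of the bivariate normal pdf."""
    d = x - mu
    q = d @ np.linalg.inv(V) @ d          # (x-mu)^T V^{-1} (x-mu)
    return np.exp(-q / 2) / (2 * np.pi * np.sqrt(np.linalg.det(V)))

x = np.array([0.5, 2.5])
print(np.isclose(bvn_pdf(x), multivariate_normal(mean=mu, cov=V).pdf(x)))  # prints True
```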
CHAPTER . ELEMENTS OF PROBABILITY DISTRIBUTION THEORY

Hence, the inverse of $V$ is
$$ V^{-1} = \frac{1}{\det V}\begin{pmatrix} \sigma_2^2 & -\rho\sigma_1\sigma_2 \\ -\rho\sigma_1\sigma_2 & \sigma_1^2 \end{pmatrix} = \frac{1}{1-\rho^2}\begin{pmatrix} \frac{1}{\sigma_1^2} & \frac{-\rho}{\sigma_1\sigma_2} \\ \frac{-\rho}{\sigma_1\sigma_2} & \frac{1}{\sigma_2^2} \end{pmatrix}. $$
Then the exponent in formula (.8) can be written as
$$ (x-\mu)^T V^{-1}(x-\mu) = \frac{1}{1-\rho^2}\left[ \frac{(x_1-\mu_1)^2}{\sigma_1^2} - \frac{2\rho(x_1-\mu_1)(x_2-\mu_2)}{\sigma_1\sigma_2} + \frac{(x_2-\mu_2)^2}{\sigma_2^2} \right]. $$
So, the joint pdf of the two-dimensional normal rv $X$ is
$$ f_X(x) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}} \exp\left\{ -\frac{1}{2(1-\rho^2)}\left[ \frac{(x_1-\mu_1)^2}{\sigma_1^2} - \frac{2\rho(x_1-\mu_1)(x_2-\mu_2)}{\sigma_1\sigma_2} + \frac{(x_2-\mu_2)^2}{\sigma_2^2} \right] \right\}. $$
Note that when $\rho = 0$ it simplifies to
$$ f_X(x) = \frac{1}{2\pi\sigma_1\sigma_2} \exp\left\{ -\frac{(x_1-\mu_1)^2}{2\sigma_1^2} - \frac{(x_2-\mu_2)^2}{2\sigma_2^2} \right\}, $$
which can be written as a product of the marginal distributions of $X_1$ and $X_2$. Hence, if $X = (X_1, X_2)^T$ has a bivariate normal distribution and $\rho = 0$, then the variables $X_1$ and $X_2$ are independent.

.0.8 Bivariate Transformations

Theorem .7. Let $X$ and $Y$ be jointly continuous random variables with joint pdf $f_{X,Y}(x,y)$ which has support on $S \subseteq \mathbb{R}^2$. Consider random variables $U = g(X,Y)$ and $V = h(X,Y)$, where $g(\cdot,\cdot)$ and $h(\cdot,\cdot)$ form a one-to-one mapping from $S$ to $D$ with inverses $x = g^{-1}(u,v)$ and $y = h^{-1}(u,v)$ which have continuous partial derivatives. Then the joint pdf of $(U,V)$ is
$$ f_{U,V}(u,v) = f_{X,Y}\big(g^{-1}(u,v),\, h^{-1}(u,v)\big)\,|J|, $$
where the Jacobian of the transformation, $J$, is
$$ J = \det\begin{pmatrix} \frac{\partial g^{-1}(u,v)}{\partial u} & \frac{\partial g^{-1}(u,v)}{\partial v} \\[4pt] \frac{\partial h^{-1}(u,v)}{\partial u} & \frac{\partial h^{-1}(u,v)}{\partial v} \end{pmatrix} \quad \text{for all } (u,v) \in D. $$

Example .3. Let $X$ and $Y$ be independent rvs with $X \sim \mathrm{Exp}(\lambda)$ and $Y \sim \mathrm{Exp}(\lambda)$. Then the joint pdf of $(X,Y)$ is
$$ f_{X,Y}(x,y) = \lambda e^{-\lambda x}\,\lambda e^{-\lambda y} = \lambda^2 e^{-\lambda(x+y)} $$
on support $S = \{(x,y) : x > 0,\, y > 0\}$.

We will find the joint pdf of $(U,V)$, where $U = g(X,Y) = X + Y$ and $V = h(X,Y) = X/Y$. This transformation and the support of $(X,Y)$ give the support of $(U,V)$: it is $\{(u,v) : u > 0,\, v > 0\}$. The inverse functions are
$$ x = g^{-1}(u,v) = \frac{uv}{1+v} \quad\text{and}\quad y = h^{-1}(u,v) = \frac{u}{1+v}. $$
The Jacobian of the transformation is equal to
$$ J = \det\begin{pmatrix} \frac{\partial g^{-1}}{\partial u} & \frac{\partial g^{-1}}{\partial v} \\[4pt] \frac{\partial h^{-1}}{\partial u} & \frac{\partial h^{-1}}{\partial v} \end{pmatrix} = \det\begin{pmatrix} \frac{v}{1+v} & \frac{u}{(1+v)^2} \\[4pt] \frac{1}{1+v} & \frac{-u}{(1+v)^2} \end{pmatrix} = -\frac{u}{(1+v)^2}. $$
Hence, by Theorem .7 we can write
$$ f_{U,V}(u,v) = f_{X,Y}\big(g^{-1}(u,v), h^{-1}(u,v)\big)\,|J| = \lambda^2 \exp\left\{ -\lambda\left(\frac{uv}{1+v} + \frac{u}{1+v}\right) \right\} \frac{u}{(1+v)^2} = \lambda^2 u e^{-\lambda u}\,\frac{1}{(1+v)^2}, $$
for $u, v > 0$. Since this joint pdf factorizes into a function of $u$ alone times a function of $v$ alone, the transformed variables are independent.

In a simpler situation, where $g(x)$ is a function of $x$ only and $h(y)$ is a function of $y$ only, it is easy to see the following very useful result.
Theorem .8. Let $X$ and $Y$ be independent rvs and let $g(x)$ be a function of $x$ only and $h(y)$ be a function of $y$ only. Then the functions $U = g(X)$ and $V = h(Y)$ are independent.

Proof. (Continuous case.) For any $u \in \mathbb{R}$ and $v \in \mathbb{R}$, define $A_u = \{x : g(x) \le u\}$ and $A_v = \{y : h(y) \le v\}$. Then we can obtain the joint cdf of $(U,V)$ as follows:
$$ F_{U,V}(u,v) = P(U \le u,\, V \le v) = P(X \in A_u,\, Y \in A_v) = P(X \in A_u)\,P(Y \in A_v), $$
as $X$ and $Y$ are independent. The mixed partial derivative with respect to $u$ and $v$ gives us the joint pdf of $(U,V)$. That is,
$$ f_{U,V}(u,v) = \frac{\partial^2}{\partial u\,\partial v} F_{U,V}(u,v) = \left[\frac{d}{du} P(X \in A_u)\right]\left[\frac{d}{dv} P(Y \in A_v)\right], $$
as the first factor depends on $u$ only and the second factor on $v$ only. Hence, the rvs $U = g(X)$ and $V = h(Y)$ are independent. $\square$

Exercise .9. Let $(X,Y)$ be a two-dimensional random variable with joint pdf
$$ f_{X,Y}(x,y) = \begin{cases} 8xy, & \text{for } 0 \le x < y \le 1; \\ 0, & \text{otherwise}. \end{cases} $$
Let $U = X/Y$ and $V = Y$.
(a) Are the variables $X$ and $Y$ independent? Explain.
(b) Calculate the covariance of $X$ and $Y$.
(c) Obtain the joint pdf of $(U,V)$.
(d) Are the variables $U$ and $V$ independent? Explain.
(e) What is the covariance of $U$ and $V$?

Exercise .0. Let $X$ and $Y$ be independent random variables such that $X \sim \mathrm{Exp}(\lambda)$ and $Y \sim \mathrm{Exp}(\lambda)$.
(a) Find the joint probability density function of $(U,V)$, where
$$ U = \frac{X}{X+Y} \quad\text{and}\quad V = X+Y. $$
(b) Are the variables $U$ and $V$ independent? Explain.
(c) Show that $U$ is uniformly distributed on $(0,1)$.
(d) What is the distribution of $V$?
. Random Sample and Sampling Distributions

Example .3. In a study on the relation between the level of cholesterol in blood and incidents of heart attack, 28 heart attack patients had their cholesterol level measured two days and four days after the attack. Also, cholesterol levels were recorded for a control group of 30 people who did not have a heart attack. The data are available on the course web-page. Various questions may be asked here. For example:
1. What is the population's mean cholesterol level on the second day after a heart attack?
2. Is the difference in the mean cholesterol level on day 2 and on day 4 after the attack statistically significant?
3. Is a high cholesterol level a significant risk factor for a heart attack?

Each numerical value in this example can be treated as a realization of a random variable. For example, the value $x_1 = 270$ for patient one, measured two days after the heart attack, is a realization of a rv $X_1$ representing all possible values of the cholesterol level of patient one. Here we have 28 patients, hence we have 28 random variables $X_1, \ldots, X_{28}$. These patients are only a small part (a sample) of all people (a population) having a heart attack at this time and area of living. Here we come to a definition of a random sample.

Definition .. The random variables $X_1, \ldots, X_n$ are called a random sample of size $n$ from a population if $X_1, \ldots, X_n$ are mutually independent, each having the same probability distribution. We say that such random variables are iid, that is, independently, identically distributed. The joint pdf (pmf in the discrete case) can be written as a product of the marginal pdfs, i.e.,
$$ f_X(x_1, \ldots, x_n) = \prod_{i=1}^n f_{X_i}(x_i), $$
where $X = (X_1, \ldots, X_n)$ is the jointly continuous $n$-dimensional random variable.
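The product formula above can be sketched numerically (this is not from the text; the population parameters and the sample values below are made up for illustration):

```python
import numpy as np
from scipy.stats import norm

# Hypothetical N(mu, sigma^2) population and a made-up realization of a sample
mu, sigma = 200.0, 40.0
x = np.array([270.0, 185.0, 210.0, 240.0])

# Joint pdf of an iid sample = product of the marginal pdfs
joint = np.prod(norm.pdf(x, loc=mu, scale=sigma))

# Equivalent check on the log scale: log of the product = sum of the log-pdfs
print(np.isclose(np.log(joint), norm.logpdf(x, loc=mu, scale=sigma).sum()))  # prints True
```

In practice the log form is preferred, since products of many small densities underflow quickly.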
Note: In Example .3 it can be assumed that these rvs are mutually independent (the cholesterol level of one patient does not depend on the cholesterol level of another patient). If we can also assume that the distribution of the cholesterol level of each patient is the same, then the variables $X_1, \ldots, X_{28}$ constitute a random sample.

To make sensible inference from the observed values (a realization of a random sample) we often calculate some summaries of the data, such as the average or an estimate of the sample variance. In general, such summaries are functions of the random sample and we write $Y = T(X_1, \ldots, X_n)$. Note that $Y$ itself is then a random variable. The distribution of $Y$ carries some information about the population and it allows us to make inference regarding some parameters of the population — for example, about the expected cholesterol level and its variability in the population of people who suffer a heart attack.

The distributions of many functions $T(X_1, \ldots, X_n)$ can be derived from the distributions of the variables $X_i$. Such distributions are called sampling distributions, and the function $T$ is called a statistic. The functions
$$ \bar X = \frac{1}{n}\sum_{i=1}^n X_i \quad\text{and}\quad S^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar X)^2 $$
are the most common statistics used in data analysis.

.. χ², t and F Distributions

These three distributions can be derived from distributions of iid random variables. They are commonly used in statistical hypothesis testing and in interval estimation of unknown population parameters.

χ² distribution

We have introduced the χ² distribution as a special case of the gamma distribution, that is, $\mathrm{Gamma}(\frac{\nu}{2}, \frac{1}{2})$. We write $Y \sim \chi^2_\nu$,
if the rv $Y$ has the chi-squared distribution with $\nu$ degrees of freedom; $\nu$ is the parameter of the distribution function. The pdf of $Y \sim \chi^2_\nu$ is
$$ f_Y(y) = \frac{y^{\frac{\nu}{2}-1} e^{-\frac{y}{2}}}{\Gamma(\frac{\nu}{2})\, 2^{\nu/2}}, \quad y > 0,\ \nu = 1, 2, \ldots, $$
and the mgf of $Y$ is
$$ M_Y(t) = \left(\frac{1}{1-2t}\right)^{\nu/2}, \quad t < \frac{1}{2}. $$
It is easy to show, using the derivatives of the mgf evaluated at $t = 0$, that $EY = \nu$ and $\operatorname{var} Y = 2\nu$.

In the following example we will see that χ² is the distribution of a function of iid standard normal rvs.

Example .33. In Example .7 we have seen that the square of a standard normal rv has the $\chi^2_1$ distribution. Now, assume that $Z_1 \sim N(0,1)$ and $Z_2 \sim N(0,1)$ independently. What is the distribution of $Y = Z_1^2 + Z_2^2$? Denote $Y_1 = Z_1^2$ and $Y_2 = Z_2^2$. To answer this question we can use the properties of the mgf as follows:
$$ M_Y(t) = E e^{tY} = E\big[e^{t(Y_1+Y_2)}\big] = E\big[e^{tY_1} e^{tY_2}\big] = M_{Y_1}(t)\, M_{Y_2}(t), $$
as, by Theorem .8, $Y_1$ and $Y_2$ are independent. Also, $Y_1 \sim \chi^2_1$ and $Y_2 \sim \chi^2_1$, each with mgf equal to
$$ M_{Y_i}(t) = (1-2t)^{-1/2}, \quad i = 1, 2. $$
Hence,
$$ M_Y(t) = M_{Y_1}(t)\, M_{Y_2}(t) = (1-2t)^{-1/2}(1-2t)^{-1/2} = \frac{1}{1-2t}. $$
This is the mgf of $\chi^2_2$. Hence, by the uniqueness of the mgf we can conclude that $Y = Z_1^2 + Z_2^2 \sim \chi^2_2$.

Note: This result can be easily extended to $n$ independent standard normal rvs. That is, if $Z_i \sim N(0,1)$ for $i = 1, \ldots, n$ independently, then
$$ Y = \sum_{i=1}^n Z_i^2 \sim \chi^2_n. $$
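The extension in the note can be illustrated by simulation (not part of the text; the degrees of freedom, replication count, and seed are arbitrary): the sample mean and variance of many replications of $\sum Z_i^2$ should be close to the χ²ₙ values $n$ and $2n$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 5, 100_000

# Y = Z_1^2 + ... + Z_n^2 with Z_i iid N(0,1), replicated many times
z = rng.standard_normal((reps, n))
y = (z**2).sum(axis=1)

# chi^2_n has mean n and variance 2n
print(abs(y.mean() - n) < 0.1, abs(y.var() - 2 * n) < 0.5)
```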
From this result we can draw a useful conclusion.

Corollary .. A sum of independent random variables $T = \sum_{i=1}^k Y_i$, where each component has a chi-squared distribution, i.e., $Y_i \sim \chi^2_{\nu_i}$, is a random variable having a chi-squared distribution with $\nu = \sum_{i=1}^k \nu_i$ degrees of freedom.

Note that $T$ in the above corollary can be written as a sum of squared iid standard normal rvs, hence it must have a chi-squared distribution.

Example .34. Let $X_1, \ldots, X_n$ be iid random variables, such that $X_i \sim N(\mu, \sigma^2)$, $i = 1, \ldots, n$. Then
$$ Z_i = \frac{X_i - \mu}{\sigma} \sim N(0,1) $$
and so
$$ \sum_{i=1}^n Z_i^2 = \sum_{i=1}^n \frac{(X_i-\mu)^2}{\sigma^2} \sim \chi^2_n. $$
This is a useful result, but it depends on two parameters, whose values are usually unknown when we analyze data. Here we will see what happens when we replace $\mu$ with $\bar X = \frac{1}{n}\sum_{i=1}^n X_i$. We can write
$$ \sum_{i=1}^n (X_i-\mu)^2 = \sum_{i=1}^n \big[(X_i - \bar X) + (\bar X - \mu)\big]^2 = \sum_{i=1}^n (X_i - \bar X)^2 + 2(\bar X - \mu)\underbrace{\sum_{i=1}^n (X_i - \bar X)}_{=0} + n(\bar X - \mu)^2 = \sum_{i=1}^n (X_i - \bar X)^2 + n(\bar X - \mu)^2. $$
Hence, dividing this by $\sigma^2$ we get
$$ \sum_{i=1}^n \frac{(X_i-\mu)^2}{\sigma^2} = \frac{(n-1)S^2}{\sigma^2} + \frac{(\bar X - \mu)^2}{\sigma^2/n}, \qquad (.9) $$
where $S^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar X)^2$. We know (Intro to Stats) that
$$ \bar X \sim N\left(\mu, \frac{\sigma^2}{n}\right), \quad\text{so}\quad \frac{\bar X - \mu}{\sigma/\sqrt{n}} \sim N(0,1). $$
Hence,
$$ \left(\frac{\bar X - \mu}{\sigma/\sqrt{n}}\right)^2 \sim \chi^2_1. $$
Also
$$ \sum_{i=1}^n \frac{(X_i-\mu)^2}{\sigma^2} \sim \chi^2_n. $$
Furthermore, it can be shown that $\bar X$ and $S^2$ are independent. Now, equation (.9) is a relation of the form $W = U + V$, where $W \sim \chi^2_n$ and $V \sim \chi^2_1$. Since here $U$ and $V$ are independent, we have $M_W(t) = M_U(t)\, M_V(t)$. That is,
$$ M_U(t) = \frac{M_W(t)}{M_V(t)} = \frac{(1-2t)^{-n/2}}{(1-2t)^{-1/2}} = (1-2t)^{-(n-1)/2}. $$
The last expression is the mgf of a random variable with a $\chi^2_{n-1}$ distribution. That is,
$$ \frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1}. $$
Equivalently,
$$ \sum_{i=1}^n \frac{(X_i - \bar X)^2}{\sigma^2} \sim \chi^2_{n-1}. $$

Student t distribution

A derivation of the t-distribution was published in 1908 by William Sealy Gosset while he worked at the Guinness Brewery in Dublin. Due to proprietary issues, the paper was written under the pseudonym "Student". The distribution is used in hypothesis testing, and the test functions having a t distribution are often called t-tests. We write $Y \sim t_\nu$,
if the rv $Y$ has the t distribution with $\nu$ degrees of freedom; $\nu$ is the parameter of the distribution function. The pdf is given by
$$ f_Y(y) = \frac{\Gamma\!\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma\!\left(\frac{\nu}{2}\right)} \left(1 + \frac{y^2}{\nu}\right)^{-\frac{\nu+1}{2}}, \quad y \in \mathbb{R},\ \nu = 1, 2, 3, \ldots $$

The following theorem is widely applied in statistics to build hypothesis tests.

Theorem .9. Let $Z$ and $X$ be independent random variables such that $Z \sim N(0,1)$ and $X \sim \chi^2_\nu$. The random variable
$$ Y = \frac{Z}{\sqrt{X/\nu}} $$
has the Student t distribution with $\nu$ degrees of freedom.

Proof. Here we will apply Theorem .7. We will find the joint distribution of $(Y,W)$, where $Y = Z/\sqrt{X/\nu}$ and $W = X$, and then we will find the marginal density function of $Y$, as required. The densities of standard normal and chi-squared rvs, respectively, are
$$ f_Z(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}, \quad z \in \mathbb{R}, $$
and
$$ f_X(x) = \frac{x^{\frac{\nu}{2}-1} e^{-x/2}}{\Gamma(\frac{\nu}{2})\, 2^{\nu/2}}, \quad x \in \mathbb{R}^+. $$
$Z$ and $X$ are independent, so the joint pdf of $(Z,X)$ is equal to the product of the marginal pdfs $f_Z(z)$ and $f_X(x)$. That is,
$$ f_{Z,X}(z,x) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}\, \frac{x^{\frac{\nu}{2}-1} e^{-x/2}}{\Gamma(\frac{\nu}{2})\, 2^{\nu/2}}. $$
Here we have the transformation
$$ y = \frac{z}{\sqrt{x/\nu}} \quad\text{and}\quad w = x, $$
which gives the inverses
$$ z = y\sqrt{w/\nu} \quad\text{and}\quad x = w. $$
Then, the Jacobian of the transformation is
$$ J = \det\begin{pmatrix} \sqrt{w/\nu} & \frac{y}{2\sqrt{w\nu}} \\ 0 & 1 \end{pmatrix} = \sqrt{\frac{w}{\nu}}. $$
Hence, the joint pdf of $(Y,W)$ is
$$ f_{Y,W}(y,w) = \frac{1}{\sqrt{2\pi}} e^{-\frac{y^2 w}{2\nu}}\, \frac{w^{\frac{\nu}{2}-1} e^{-\frac{w}{2}}}{\Gamma(\frac{\nu}{2})\, 2^{\nu/2}} \sqrt{\frac{w}{\nu}} = \frac{w^{\frac{\nu+1}{2}-1} \exp\left\{-\frac{w}{2}\left(1 + \frac{y^2}{\nu}\right)\right\}}{\sqrt{2\pi\nu}\,\Gamma(\frac{\nu}{2})\, 2^{\nu/2}}. $$
The range of $(Z,X)$ is $\{(z,x) : -\infty < z < \infty,\ 0 < x < \infty\}$ and so the range of $(Y,W)$ is $\{(y,w) : -\infty < y < \infty,\ 0 < w < \infty\}$. To obtain the marginal pdf of $Y$ we need to integrate the joint pdf of $(Y,W)$ over the range of $W$. This is an easy task when we notice the similarity of this function to the pdf of $\mathrm{Gamma}(\alpha, \lambda)$ and use the fact that a pdf integrates to 1 over its whole range. The pdf of a gamma random variable $V$ is
$$ f_V(v) = \frac{v^{\alpha-1}\lambda^\alpha e^{-\lambda v}}{\Gamma(\alpha)}. $$
Denote
$$ \alpha = \frac{\nu+1}{2}, \qquad \lambda = \frac{1}{2}\left(1 + \frac{y^2}{\nu}\right). $$
Then, multiplying and dividing the joint pdf of $(Y,W)$ by $\lambda^\alpha$ and by $\Gamma(\alpha)$, we get
$$ f_{Y,W}(y,w) = \frac{w^{\alpha-1}\lambda^\alpha e^{-\lambda w}}{\Gamma(\alpha)} \cdot \frac{\Gamma(\alpha)}{\sqrt{2\pi\nu}\,\Gamma(\frac{\nu}{2})\, 2^{\nu/2}\, \lambda^\alpha}. $$
Then, the marginal pdf of $Y$ is
$$ f_Y(y) = \int_0^\infty \frac{w^{\alpha-1}\lambda^\alpha e^{-\lambda w}}{\Gamma(\alpha)} \cdot \frac{\Gamma(\alpha)}{\sqrt{2\pi\nu}\,\Gamma(\frac{\nu}{2})\, 2^{\nu/2}\, \lambda^\alpha}\, dw = \frac{\Gamma(\alpha)}{\sqrt{2\pi\nu}\,\Gamma(\frac{\nu}{2})\, 2^{\nu/2}\, \lambda^\alpha} \int_0^\infty \frac{w^{\alpha-1}\lambda^\alpha e^{-\lambda w}}{\Gamma(\alpha)}\, dw, $$
as the second factor is treated as a constant when we integrate with respect to $w$. The first factor has the form of a gamma pdf, hence it integrates to 1. This gives the pdf of $Y$ equal to
$$ f_Y(y) = \frac{\Gamma(\alpha)}{\sqrt{2\pi\nu}\,\Gamma(\frac{\nu}{2})\, 2^{\nu/2}\, \lambda^\alpha} = \frac{\Gamma\!\left(\frac{\nu+1}{2}\right)}{\sqrt{2\pi\nu}\,\Gamma(\frac{\nu}{2})\, 2^{\nu/2}} \left[\frac{1}{2}\left(1 + \frac{y^2}{\nu}\right)\right]^{-\frac{\nu+1}{2}} = \frac{\Gamma\!\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma(\frac{\nu}{2})} \left(1 + \frac{y^2}{\nu}\right)^{-\frac{\nu+1}{2}}, $$
which is the pdf of a random variable having the Student t distribution with $\nu$ degrees of freedom. $\square$

Example .35. Let $X_i$, $i = 1, \ldots, n$, be iid normal random variables with mean $\mu$ and variance $\sigma^2$. We often write this fact in the following way:
$$ X_i \overset{\text{iid}}{\sim} N(\mu, \sigma^2), \quad i = 1, \ldots, n. $$
Then
$$ \bar X \sim N\left(\mu, \frac{\sigma^2}{n}\right), \quad\text{so}\quad U = \frac{\bar X - \mu}{\sigma/\sqrt{n}} \sim N(0,1). $$
Also, as shown in Example .34, we have
$$ V = \frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1}. $$
Furthermore, $\bar X$ and $S^2$ are independent, hence $U$ and $V$ are also independent by Theorem .8. Hence, by Theorem .9, we have
$$ T = \frac{U}{\sqrt{V/(n-1)}} \sim t_{n-1}. $$
That is,
$$ T = \frac{\dfrac{\bar X - \mu}{\sigma/\sqrt{n}}}{\sqrt{\dfrac{(n-1)S^2}{\sigma^2}\Big/(n-1)}} = \frac{\bar X - \mu}{S/\sqrt{n}} \sim t_{n-1}. $$

Note: The function $T$ in Example .35 is used for testing hypotheses about the population mean $\mu$ when the variance of the population is unknown. We will be using this function later on in this course.
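The distribution of $T$ in Example .35 can be sketched by simulation (not part of the text; the population parameters, sample size, and seed below are arbitrary). Note that replacing $\sigma$ with $S$ inflates the variance: $t_\nu$ has variance $\nu/(\nu-2)$, not 1.

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma, n, reps = 5.0, 2.0, 8, 200_000   # illustrative values

x = rng.normal(mu, sigma, size=(reps, n))
xbar = x.mean(axis=1)
s = x.std(axis=1, ddof=1)                   # sample standard deviation S
t = (xbar - mu) / (s / np.sqrt(n))

# t_{n-1} with nu = 7 df has mean 0 and variance nu/(nu-2) = 7/5 = 1.4
print(abs(t.mean()) < 0.02, abs(t.var() - 1.4) < 0.05)
```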
F distribution

Another sampling distribution very often used in statistics is the Fisher–Snedecor F distribution. We write $Y \sim F_{\nu_1, \nu_2}$ if the rv $Y$ has the F distribution with $\nu_1$ and $\nu_2$ degrees of freedom. The degrees of freedom are the parameters of the distribution function. The pdf is quite complicated. It is given by
$$ f_Y(y) = \frac{\Gamma\!\left(\frac{\nu_1+\nu_2}{2}\right)}{\Gamma\!\left(\frac{\nu_1}{2}\right)\Gamma\!\left(\frac{\nu_2}{2}\right)} \left(\frac{\nu_1}{\nu_2}\right)^{\frac{\nu_1}{2}} y^{\frac{\nu_1}{2}-1} \left(1 + \frac{\nu_1}{\nu_2} y\right)^{-\frac{\nu_1+\nu_2}{2}}, \quad y \in \mathbb{R}^+. $$
This distribution is related to the chi-squared distribution in the following way.

Theorem .0. Let $X$ and $Y$ be independent random variables such that $X \sim \chi^2_{\nu_1}$ and $Y \sim \chi^2_{\nu_2}$. Then the random variable
$$ F = \frac{X/\nu_1}{Y/\nu_2} $$
has Fisher's F distribution with $\nu_1$ and $\nu_2$ degrees of freedom.

Proof. This result can be shown in a similar way to the result of Theorem .9. Take the transformation $F = \frac{X/\nu_1}{Y/\nu_2}$ and $W = Y$, and use Theorem .7. $\square$

Example .36. Let $X_1, \ldots, X_n$ and $Y_1, \ldots, Y_m$ be two independent random samples such that
$$ X_i \sim N(\mu_1, \sigma_1^2) \quad\text{and}\quad Y_j \sim N(\mu_2, \sigma_2^2), \quad i = 1, \ldots, n;\ j = 1, \ldots, m, $$
and let $S_1^2$ and $S_2^2$ be the sample variances of the random samples $X_1, \ldots, X_n$ and $Y_1, \ldots, Y_m$, respectively. Then (see Example .34)
$$ \frac{(n-1)S_1^2}{\sigma_1^2} \sim \chi^2_{n-1}. $$
Similarly,
$$ \frac{(m-1)S_2^2}{\sigma_2^2} \sim \chi^2_{m-1}. $$
Hence, by Theorem .0, the ratio
$$ F = \frac{\dfrac{(n-1)S_1^2}{\sigma_1^2}\Big/(n-1)}{\dfrac{(m-1)S_2^2}{\sigma_2^2}\Big/(m-1)} = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} \sim F_{n-1,\, m-1}. $$
This statistic is used in testing hypotheses regarding the variances of two independent normal populations.

Exercise .. Let $Y \sim \mathrm{Gamma}(\alpha, \lambda)$, that is, $Y$ has the pdf given by
$$ f_Y(y) = \begin{cases} \dfrac{y^{\alpha-1}\lambda^\alpha e^{-\lambda y}}{\Gamma(\alpha)}, & \text{for } y > 0; \\ 0, & \text{otherwise}, \end{cases} $$
for $\alpha > 0$ and $\lambda > 0$.
(a) Show that for any $a \in \mathbb{R}$ such that $a + \alpha > 0$, the expectation of $Y^a$ is
$$ E Y^a = \frac{\Gamma(a+\alpha)}{\lambda^a\, \Gamma(\alpha)}. $$
(b) Obtain $EY$ and $\operatorname{var} Y$. Hint: use the result given in point (a) and the iterative property of the gamma function, i.e., $\Gamma(z+1) = z\,\Gamma(z)$.
(c) Let $X \sim \chi^2_\nu$. Use (a) to obtain $EX$ and $\operatorname{var} X$.

Exercise .. Let $Y_1, \ldots, Y_5$ be a random sample of size 5 from a standard normal population, that is $Y_i \overset{\text{iid}}{\sim} N(0,1)$, $i = 1, \ldots, 5$, and denote by $\bar Y$ the sample mean, that is $\bar Y = \frac{1}{5}\sum_{i=1}^5 Y_i$. Let $Y_6$ be another independent standard normal random variable. What is the distribution of
(a) $W = \sum_{i=1}^5 Y_i^2$?
(b) $U = \sum_{i=1}^5 (Y_i - \bar Y)^2$?
(c) $U + Y_6^2$?
(d) $\sqrt{5}\, Y_6 / \sqrt{W}$?
Explain each of your answers.
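The F ratio from Example .36 can also be sketched by simulation (not part of the text; the sample sizes, population standard deviations, and seed below are arbitrary): the Monte Carlo mean of $\frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2}$ should be close to the $F_{n-1,\,m-1}$ mean $\frac{m-1}{m-3}$.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, reps = 10, 15, 100_000          # illustrative sample sizes
sigma1, sigma2 = 3.0, 1.5             # made-up population standard deviations

s1sq = rng.normal(0.0, sigma1, size=(reps, n)).var(axis=1, ddof=1)
s2sq = rng.normal(0.0, sigma2, size=(reps, m)).var(axis=1, ddof=1)
f = (s1sq / sigma1**2) / (s2sq / sigma2**2)

# F_{n-1, m-1} = F_{9,14} has mean d2/(d2-2) = 14/12
print(abs(f.mean() - 14 / 12) < 0.05)
```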