Chapter 5 sections

Sections:

Discrete univariate distributions:
- 5.2 Bernoulli and Binomial distributions
- 5.3 Hypergeometric distributions (just skim)
- 5.4 Poisson distributions
- 5.5 Negative Binomial distributions (just skim)

Continuous univariate distributions:
- 5.6 Normal distributions
- 5.7 Gamma distributions
- 5.8 Beta distributions (just skim)

Multivariate distributions:
- 5.9 Multinomial distributions (just skim)
- 5.10 Bivariate normal distributions

5.1 Introduction

Families of distributions:

How: parameter and parameter space; pf/pdf and cdf, with the new notation $f(x \mid \text{parameters})$; mean, variance and the m.g.f. $\psi(t)$; features, connections to other distributions, approximations.

Why: the reasoning behind a distribution; a natural justification for certain experiments; a model for the uncertainty in an experiment.

"All models are wrong, but some are useful." (George Box)

Bernoulli distributions (5.2)

Def: Bernoulli distributions, Bernoulli(p). A r.v. $X$ has the Bernoulli distribution with parameter $p$ if $P(X = 1) = p$ and $P(X = 0) = 1 - p$. The pf of $X$ is

$$f(x \mid p) = \begin{cases} p^x (1-p)^{1-x} & \text{for } x = 0, 1 \\ 0 & \text{otherwise} \end{cases}$$

Parameter space: $p \in [0, 1]$.

In an experiment with only two possible outcomes, success and failure, let $X$ = the number of successes. Then $X \sim \text{Bernoulli}(p)$ where $p$ is the probability of success.

$E(X) = p$, $\operatorname{Var}(X) = p(1-p)$ and $\psi(t) = E(e^{tX}) = p e^t + (1 - p)$.

The cdf is

$$F(x \mid p) = \begin{cases} 0 & \text{for } x < 0 \\ 1 - p & \text{for } 0 \le x < 1 \\ 1 & \text{for } x \ge 1 \end{cases}$$

Binomial distributions (5.2)

Def: Binomial distributions, Binomial(n, p). A r.v. $X$ has the Binomial distribution with parameters $n$ and $p$ if $X$ has the pf

$$f(x \mid n, p) = \begin{cases} \binom{n}{x} p^x (1-p)^{n-x} & \text{for } x = 0, 1, \ldots, n \\ 0 & \text{otherwise} \end{cases}$$

Parameter space: $n$ is a positive integer and $p \in [0, 1]$.

If $X$ is the number of successes in $n$ independent tries where the probability of success is $p$ each time, then $X \sim \text{Binomial}(n, p)$.

Theorem 5.2.1: If $X_1, X_2, \ldots, X_n$ form $n$ Bernoulli trials with parameter $p$ (i.e. are i.i.d. Bernoulli(p)), then $X = X_1 + \cdots + X_n \sim \text{Binomial}(n, p)$.
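Theorem 5.2.1 is easy to check by simulation. A minimal Python sketch (assuming numpy and scipy are installed; the values of n, p and the sample size are illustrative choices, not from the slides):

```python
import numpy as np
from scipy import stats

n, p = 10, 0.3
rng = np.random.default_rng(0)

# Simulate X = X_1 + ... + X_n where the X_i are i.i.d. Bernoulli(p).
samples = rng.binomial(1, p, size=(100_000, n)).sum(axis=1)

# Compare the empirical frequency of each value with the Binomial(n, p) pf.
for x in range(n + 1):
    print(x, (samples == x).mean(), stats.binom.pmf(x, n, p))
```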

Binomial distributions, continued (5.2)

Let $X \sim \text{Binomial}(n, p)$. Then $E(X) = np$ and $\operatorname{Var}(X) = np(1-p)$.

To find the m.g.f. of $X$, write $X = X_1 + \cdots + X_n$ where the $X_i$'s are i.i.d. Bernoulli(p). Then $\psi_i(t) = p e^t + 1 - p$ and we get

$$\psi(t) = \prod_{i=1}^{n} \psi_i(t) = \prod_{i=1}^{n} \left( p e^t + 1 - p \right) = (p e^t + 1 - p)^n$$

cdf: $F(x \mid n, p) = \sum_{t=0}^{x} \binom{n}{t} p^t (1-p)^{n-t}$ = yikes! (no closed form)

Theorem 5.2.2: If $X_i \sim \text{Binomial}(n_i, p)$, $i = 1, \ldots, k$, and the $X_i$'s are independent, then $X = X_1 + \cdots + X_k \sim \text{Binomial}\left(\sum_{i=1}^{k} n_i, p\right)$.

Example: Blood testing (Example 5.2.7) (5.2)

The setup: 1000 people need to be tested for a disease that affects 0.2% of all people. The test is guaranteed to detect the disease if it is present in a blood sample.

Task: find all the people that have the disease.

Strategy: test all 1000 samples. What is the expected number of people that have the disease? Any assumptions you need to make?

Strategy (611): divide the people into 10 groups of 100. For each group, take a portion of each of the 100 blood samples and combine them into one sample. Then test the 10 combined blood samples (10 tests).

Example: Blood testing (Example 5.2.7), continued (5.2)

Strategy (611): if all 10 of these tests are negative, then none of the 1000 people have the disease. Total number of tests needed: 10.

If one of these tests is positive, then we test each of the 100 people in that group. Total number of tests needed: 110. ...

If all of the 10 tests are positive, we end up having to do 1010 tests.

Is this strategy better? What is the expected number of tests needed? When does this strategy lose?

Example: Blood testing (Example 5.2.7), continued (5.2)

Let $Y_i = 1$ if the test for group $i$ is positive and $Y_i = 0$ otherwise. Let $Y = Y_1 + \cdots + Y_{10}$ = the number of groups where every individual has to be tested. The total number of tests needed is $T = 10 + 100Y$.

Let $Z_i$ = the number of people in group $i$ that have the disease, $i = 1, \ldots, 10$. Then $Z_i \sim \text{Binomial}(100, 0.002)$, and $Y_i$ is a Bernoulli(p) r.v. where

$$p = P(Y_i = 1) = P(Z_i > 0) = 1 - P(Z_i = 0) = 1 - \binom{100}{0} 0.002^0 (1 - 0.002)^{100} = 1 - 0.998^{100} \approx 0.181$$

Then $Y \sim \text{Binomial}(10, 0.181)$ and

$$E(T) = E(10 + 100Y) = 10 + 100\,E(Y) = 10 + 100 (10 \times 0.181) = 191$$
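The arithmetic above can be reproduced in a few lines. A Python sketch (assuming scipy is available):

```python
from scipy import stats

# P(a pooled group tests positive) = P(Z_i > 0), Z_i ~ Binomial(100, 0.002)
p_group = 1 - stats.binom.pmf(0, 100, 0.002)   # = 1 - 0.998**100, about 0.181

# Y ~ Binomial(10, p_group) groups need individual testing; T = 10 + 100 Y,
# so E(T) = 10 + 100 * E(Y) = 10 + 100 * (10 * p_group)
ET = 10 + 100 * 10 * p_group
print(p_group, ET)   # about 0.181 and about 191
```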

Example: Blood testing (Example 5.2.7), continued (5.2)

When does this strategy (611) lose? The worst-case scenario:

$$P(T \ge 1000) = P(Y \ge 9.9) = P(Y = 10) = \binom{10}{10} 0.181^{10}\, 0.819^{0} \approx 3.8 \times 10^{-8}$$

Question: can we go further, to a 611-A strategy? Any further improvement?

Hypergeometric distributions (5.3)

Def: Hypergeometric distributions. A random variable $X$ has the Hypergeometric distribution with parameters $N$, $M$ and $n$ if it has the pf

$$f(x \mid N, M, n) = \frac{\binom{N}{x}\binom{M}{n-x}}{\binom{N+M}{n}}$$

Parameter space: $N$, $M$ and $n$ are nonnegative integers with $n \le N + M$.

Reasoning: say we have a finite population with $N$ items of type I and $M$ items of type II. Let $X$ be the number of items of type I when we take $n$ samples without replacement from that population. Then $X$ has the hypergeometric distribution.

Hypergeometric distributions, continued (5.3)

Binomial: sampling with replacement (effectively an infinite population). Hypergeometric: sampling without replacement from a finite population. You can also think of the Hypergeometric distribution as a sum of dependent Bernoulli trials.

Limiting situation, Theorem 5.3.4: if the sample size $n$ is much smaller than the total population $N + M$, then the Hypergeometric distribution with parameters $N$, $M$ and $n$ will be nearly the same as the Binomial distribution with parameters $n$ and

$$p = \frac{N}{N + M}$$
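Theorem 5.3.4 can be seen numerically. A Python sketch (assuming scipy; note that scipy's hypergeom takes the total population size, the number of type I items, and the number of draws, in that order, so the mapping from the (N, M, n) notation here is spelled out below; the population sizes are illustrative):

```python
from scipy import stats

# Population: N = 50 items of type I, M = 5000 of type II; draw n = 10.
N, M, n = 50, 5000, 10
hg = stats.hypergeom(N + M, N, n)     # (total, #type I, #draws)
bn = stats.binom(n, N / (N + M))      # p = N / (N + M)

for x in range(4):
    print(x, hg.pmf(x), bn.pmf(x))    # nearly identical since n << N + M
```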

Poisson distributions (5.4)

Def: Poisson distributions, Poisson(λ). A random variable $X$ has the Poisson distribution with mean $\lambda$ if it has the pf

$$f(x \mid \lambda) = \begin{cases} \frac{e^{-\lambda} \lambda^x}{x!} & \text{for } x = 0, 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases}$$

Parameter space: $\lambda > 0$.

Exercise: show that $f(x \mid \lambda)$ is a pf.

$E(X) = \lambda$, $\operatorname{Var}(X) = \lambda$ and $\psi(t) = e^{\lambda(e^t - 1)}$.

The cdf: $F(x \mid \lambda) = \sum_{k=0}^{x} \frac{e^{-\lambda} \lambda^k}{k!}$ = yikes. (no closed form)
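A quick numerical sanity check of these facts (a Python sketch assuming numpy and scipy; the value of λ, the evaluation point t and the truncation at 200 are illustrative choices):

```python
import numpy as np
from scipy import stats

lam = 3.5
X = stats.poisson(lam)
xs = np.arange(200)               # the tail beyond 200 is negligible here

print(X.pmf(xs).sum())            # about 1: f(x | lambda) is a valid pf
print(X.mean(), X.var())          # both equal lambda

t = 0.3                           # m.g.f.: sum e^{tx} f(x) vs e^{lam(e^t - 1)}
print((np.exp(t * xs) * X.pmf(xs)).sum(), np.exp(lam * (np.exp(t) - 1)))
```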

Why Poisson? (5.4)

The Poisson distribution is useful for modeling uncertainty in counts / arrivals. Examples:
- How many calls arrive at a switchboard in one hour?
- How many buses pass while you wait at the bus stop for 10 min?
- How many bird nests are there in a certain area?

Under certain conditions (the Poisson postulates) the Poisson distribution can be shown to be the distribution of the number of arrivals (a Poisson process). However, the Poisson distribution is often used as a model for uncertainty of counts in other types of experiments. The Poisson distribution can also be used as an approximation to the Binomial(n, p) distribution when n is large and p is small.

Poisson Postulates (5.4)

For $t \ge 0$, let $X_t$ be a random variable with possible values in $\mathbb{N}_0$. (Think: $X_t$ = the number of arrivals from time 0 to time $t$.)

(i) Start with no arrivals: $X_0 = 0$.
(ii) Arrivals in disjoint time periods are independent: $X_s$ and $X_t - X_s$ are independent if $s < t$.
(iii) The number of arrivals depends only on the period length: $X_s$ and $X_{t+s} - X_t$ are identically distributed.
(iv) The arrival probability is proportional to the period length, if the length is small:
$$\lim_{t \to 0} \frac{P(X_t = 1)}{t} = \lambda$$
(v) No simultaneous arrivals: $\lim_{t \to 0} \frac{P(X_t > 1)}{t} = 0$.

If (i)-(v) hold, then for any integer $n$

$$P(X_t = n) = e^{-\lambda t} \frac{(\lambda t)^n}{n!}$$

that is, $X_t \sim \text{Poisson}(\lambda t)$. The postulates can be defined in terms of spatial areas too.

Properties of the Poisson Distributions (5.4)

Useful recursive property: $P(X = x) = \frac{\lambda}{x} P(X = x - 1)$ for $x \ge 1$.

Theorem 5.4.4: a sum of independent Poissons is a Poisson. If $X_1, \ldots, X_k$ are independent r.v. and $X_i \sim \text{Poisson}(\lambda_i)$ for all $i$, then

$$X_1 + \cdots + X_k \sim \text{Poisson}\left(\sum_{i=1}^{k} \lambda_i\right)$$

Theorem 5.4.5: approximation to the Binomial. Let $X_n \sim \text{Binomial}(n, p_n)$, where $0 < p_n < 1$ for all $n$ and $\{p_n\}_{n=1}^{\infty}$ is a sequence such that $\lim_{n \to \infty} n p_n = \lambda$. Then for all $x = 0, 1, 2, \ldots$

$$\lim_{n \to \infty} f_{X_n}(x \mid n, p_n) = \frac{e^{-\lambda} \lambda^x}{x!} = f_{\text{Poisson}}(x \mid \lambda)$$

Example: Poisson as an approximation to the Binomial (5.4)

Recall the disease testing example. We had

$$X = \sum_{i=1}^{1000} X_i \sim \text{Binomial}(1000, 0.002) \quad \text{and} \quad Y \sim \text{Binomial}(10, 0.181)$$

Since $n = 1000$ is large and $p = 0.002$ is small, $X$ is approximately Poisson($np$) = Poisson(2). The approximation is not appropriate for $Y$, where $p = 0.181$ is not small.
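A short Python sketch (assuming scipy) makes the quality of the approximation visible for this example:

```python
from scipy import stats

# X ~ Binomial(1000, 0.002) is well approximated by Poisson(np) = Poisson(2)
for x in range(5):
    print(x, stats.binom.pmf(x, 1000, 0.002), stats.poisson.pmf(x, 2.0))

# The group-level count Z_i ~ Binomial(100, 0.002) is close to Poisson(0.2):
print(stats.binom.pmf(0, 100, 0.002), stats.poisson.pmf(0, 0.2))
```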

Geometric distributions (5.5)

Def: Geometric distributions, Geometric(p). A random variable $X$ has the Geometric distribution with parameter $p$ if it has the pf

$$f(x \mid p) = \begin{cases} p(1-p)^x & \text{for } x = 0, 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases}$$

Parameter space: $0 < p < 1$.

Say we have an infinite sequence of Bernoulli trials with parameter $p$, and let $X$ = the number of failures before the first success. Then $X \sim \text{Geometric}(p)$.

Negative Binomial distributions (5.5)

Def: Negative Binomial distributions, NegBinomial(r, p). A random variable $X$ has the Negative Binomial distribution with parameters $r$ and $p$ if it has the pf

$$f(x \mid r, p) = \begin{cases} \binom{r+x-1}{x} p^r (1-p)^x & \text{for } x = 0, 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases}$$

Parameter space: $0 < p < 1$ and $r$ a positive integer.

Say we have an infinite sequence of Bernoulli trials with parameter $p$, and let $X$ = the number of failures before the $r$-th success. Then $X \sim \text{NegBinomial}(r, p)$. In particular, Geometric(p) = NegBinomial(1, p).

Theorem 5.5.2: If $X_1, \ldots, X_r$ are i.i.d. Geometric(p), then $X = X_1 + \cdots + X_r \sim \text{NegBinomial}(r, p)$.
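Theorem 5.5.2 can be checked by simulation. A Python sketch (assuming numpy and scipy; note that numpy's geometric sampler counts trials up to and including the first success, so we subtract 1 to count failures, matching the definition above; r, p and the seed are illustrative):

```python
import numpy as np
from scipy import stats

r, p = 3, 0.4
rng = np.random.default_rng(1)

# rng.geometric returns the trial number of the first success (support >= 1);
# subtracting 1 gives the number of failures before the first success.
fails = rng.geometric(p, size=(100_000, r)) - 1
X = fails.sum(axis=1)   # sum of r i.i.d. Geometric(p) r.v.s (Theorem 5.5.2)

# scipy's nbinom also counts failures before the r-th success.
for x in range(5):
    print(x, (X == x).mean(), stats.nbinom.pmf(x, r, p))
```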


Gamma distributions (5.7)

The Gamma function:

$$\Gamma(\alpha) = \int_0^{\infty} x^{\alpha - 1} e^{-x}\, dx$$

$\Gamma(1) = 1$ and $\Gamma(0.5) = \sqrt{\pi}$; $\Gamma(\alpha) = (\alpha - 1)\Gamma(\alpha - 1)$ if $\alpha > 1$.

Def: Gamma distributions, Gamma(α, β). A continuous r.v. $X$ has the gamma distribution with parameters $\alpha$ and $\beta$ if it has the pdf

$$f(x \mid \alpha, \beta) = \begin{cases} \frac{\beta^{\alpha}}{\Gamma(\alpha)} x^{\alpha - 1} e^{-\beta x} & \text{for } x > 0 \\ 0 & \text{otherwise} \end{cases}$$

Parameter space: $\alpha > 0$ and $\beta > 0$.

Gamma(1, β) is the same as the exponential distribution with parameter $\beta$, Expo(β).

Properties of the gamma distributions (5.7)

$$\psi(t) = \left(\frac{\beta}{\beta - t}\right)^{\alpha} \text{ for } t < \beta, \qquad E(X) = \frac{\alpha}{\beta}, \qquad \operatorname{Var}(X) = \frac{\alpha}{\beta^2}$$

If $X_1, \ldots, X_k$ are independent Gamma($\alpha_i$, $\beta$) r.v. (with common $\beta$), then

$$X_1 + \cdots + X_k \sim \text{Gamma}\left(\sum_{i=1}^{k} \alpha_i, \beta\right)$$
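A numerical check of these properties (a Python sketch assuming numpy and scipy; note that scipy parametrizes the gamma by shape α and scale = 1/β rather than the rate β used here, so the conversion is part of the sketch):

```python
import numpy as np
from scipy import stats

alpha, beta_ = 2.0, 3.0
X = stats.gamma(a=alpha, scale=1 / beta_)   # scipy uses scale = 1/beta
print(X.mean(), alpha / beta_)              # E(X) = alpha / beta
print(X.var(), alpha / beta_**2)            # Var(X) = alpha / beta^2

# Sum of independent Gamma(alpha_i, beta) with common beta is Gamma(sum, beta)
rng = np.random.default_rng(2)
s = rng.gamma(1.0, 1 / beta_, 100_000) + rng.gamma(2.5, 1 / beta_, 100_000)
print(s.mean(), stats.gamma(a=3.5, scale=1 / beta_).mean())
```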

Properties of the gamma distributions, continued (5.7)

Theorem 5.7.9: the exponential distribution is memoryless. Let $X \sim \text{Expo}(\beta)$ and let $t > 0$. Then for any $h > 0$

$$P(X \ge t + h \mid X \ge t) = P(X \ge h)$$

Theorem 5.7.12: times between arrivals in a Poisson process. Let $Z_k$ be the time until the $k$-th arrival in a Poisson process with rate $\beta$. Let $Y_1 = Z_1$ and $Y_k = Z_k - Z_{k-1}$ for $k \ge 2$. Then $Y_1, Y_2, Y_3, \ldots$ are i.i.d. with the exponential distribution with parameter $\beta$.
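Memorylessness is easy to verify numerically (a Python sketch assuming scipy; the values of β, t and h are arbitrary illustrative choices):

```python
from scipy import stats

beta_ = 0.5
X = stats.expon(scale=1 / beta_)   # Expo(beta) with rate beta = 0.5
t, h = 2.0, 1.5

# P(X >= t + h | X >= t) should equal P(X >= h); sf(x) = P(X > x)
lhs = X.sf(t + h) / X.sf(t)
print(lhs, X.sf(h))                # both equal e^{-beta h}
```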

Beta distributions (5.8)

Def: Beta distributions, Beta(α, β). A continuous r.v. $X$ has the beta distribution with parameters $\alpha$ and $\beta$ if it has the pdf

$$f(x \mid \alpha, \beta) = \begin{cases} \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)} x^{\alpha - 1}(1 - x)^{\beta - 1} & \text{for } 0 < x < 1 \\ 0 & \text{otherwise} \end{cases}$$

Parameter space: $\alpha > 0$ and $\beta > 0$. Note that Beta(1, 1) = Uniform(0, 1).

Used to model a r.v. that takes values between 0 and 1. The Beta distributions are often used as prior distributions for probability parameters, e.g. the $p$ in the Binomial distribution.
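A tiny Python check (assuming scipy) that Beta(1, 1) really is Uniform(0, 1), plus one asymmetric example for contrast (the (2, 5) choice is illustrative):

```python
import numpy as np
from scipy import stats

xs = np.linspace(0.05, 0.95, 5)
print(stats.beta(1, 1).pdf(xs))   # all ones: Beta(1,1) is Uniform(0,1)
print(stats.beta(2, 5).pdf(xs))   # density concentrated toward small x
```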

[Figure slide: Beta distribution pdfs (5.8); graphs only, no text content.]


Why Normal? (5.6)

- Works well in practice: many physical experiments have distributions that are approximately normal.
- Central Limit Theorem: a sum of many i.i.d. random variables is approximately normally distributed.
- Mathematically convenient, especially the multivariate normal distribution. We can explicitly obtain the distribution of many functions of a normally distributed random variable. Marginal and conditional distributions of a multivariate normal are also normal (multivariate or univariate).

Developed by Gauss and then Laplace in the early 1800s; also known as the Gaussian distributions. [Portraits: Gauss and Laplace.]

Normal distributions (5.6)

Def: Normal distributions, N(µ, σ²). A continuous r.v. $X$ has the normal distribution with mean $\mu$ and variance $\sigma^2$ if it has the pdf

$$f(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{(x - \mu)^2}{2\sigma^2} \right), \quad -\infty < x < \infty$$

Parameter space: $\mu \in \mathbb{R}$ and $\sigma^2 > 0$.

Show: $\psi(t) = \exp\left( \mu t + \tfrac{1}{2}\sigma^2 t^2 \right)$, $E(X) = \mu$ and $\operatorname{Var}(X) = \sigma^2$.

[Figure slide: the bell curve, a normal pdf (5.6); graph only, no text content.]

Standard normal (5.6)

Standard normal distribution: N(0, 1). The normal distribution with $\mu = 0$ and $\sigma^2 = 1$ is called the standard normal distribution, and its pdf and cdf are denoted $\phi(x)$ and $\Phi(x)$.

The cdf of a normal distribution cannot be expressed in closed form and is evaluated using numerical approximations. $\Phi(x)$ is tabulated in the back of the book, and many calculators and programs such as R, Matlab, Excel etc. can calculate $\Phi(x)$.

Useful symmetries:

$$\Phi(-x) = 1 - \Phi(x) \qquad \Phi^{-1}(p) = -\Phi^{-1}(1 - p)$$
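Both symmetries can be confirmed with any numerical library; a Python sketch assuming scipy (the evaluation points are arbitrary):

```python
from scipy import stats

Phi = stats.norm.cdf
Phi_inv = stats.norm.ppf   # the quantile function, i.e. the inverse cdf

print(Phi(-1.3), 1 - Phi(1.3))          # Phi(-x) = 1 - Phi(x)
print(Phi_inv(0.05), -Phi_inv(0.95))    # Phi^{-1}(p) = -Phi^{-1}(1 - p)
```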

Properties of the normal distributions (5.6)

Theorem 5.6.4: a linear transformation of a normal is still normal. If $X \sim N(\mu, \sigma^2)$ and $Y = aX + b$ where $a$ and $b$ are constants and $a \ne 0$, then $Y \sim N(a\mu + b, a^2\sigma^2)$.

Let $F$ be the cdf of $X$, where $X \sim N(\mu, \sigma^2)$. Then

$$F(x) = \Phi\left(\frac{x - \mu}{\sigma}\right) \quad \text{and} \quad F^{-1}(p) = \mu + \sigma\,\Phi^{-1}(p)$$

Example: Measured voltage (5.6)

Suppose the measured voltage $X$ in a certain electric circuit has the normal distribution with mean 120 and standard deviation 2.

1. What is the probability that the measured voltage is between 118 and 122?
2. Below what value will 95% of the measurements be?
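Worked answers (a Python sketch assuming scipy): question 1 is $\Phi(1) - \Phi(-1)$ after standardizing, and question 2 is the 0.95 quantile.

```python
from scipy import stats

X = stats.norm(loc=120, scale=2)   # voltage ~ N(120, 2^2)

# 1. P(118 < X < 122) = Phi(1) - Phi(-1)
print(X.cdf(122) - X.cdf(118))     # about 0.6827

# 2. 95% of measurements fall below the 0.95 quantile
print(X.ppf(0.95))                 # 120 + 2 * 1.645, about 123.29
```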

Properties of the normal distributions, continued (5.6)

Theorem 5.6.7: a linear combination of independent normals is normal. Let $X_1, \ldots, X_k$ be independent r.v. with $X_i \sim N(\mu_i, \sigma_i^2)$ for $i = 1, \ldots, k$. Then

$$X_1 + \cdots + X_k \sim N\left(\mu_1 + \cdots + \mu_k,\; \sigma_1^2 + \cdots + \sigma_k^2\right)$$

Also, if $a_1, \ldots, a_k$ and $b$ are constants where at least one $a_i$ is not zero:

$$a_1 X_1 + \cdots + a_k X_k + b \sim N\left(b + \sum_{i=1}^{k} a_i \mu_i,\; \sum_{i=1}^{k} a_i^2 \sigma_i^2\right)$$

In particular, consider the sample mean $\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$: if $X_1, \ldots, X_n$ are a random sample from a $N(\mu, \sigma^2)$, what is the distribution of the sample mean?

Example: Measured voltage, continued (5.6)

Suppose the measured voltage $X$ in a certain electric circuit has the normal distribution with mean 120 and standard deviation 2. If three independent measurements of the voltage are made, what is the probability that the sample mean $\bar{X}_3$ will lie between 118 and 120? Find $x$ that satisfies $P(|\bar{X}_3 - 120| \le x) = 0.95$.
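By Theorem 5.6.7 (applied with $a_i = 1/3$), $\bar{X}_3 \sim N(120, 4/3)$. A Python sketch (assuming numpy and scipy) that finishes the computation:

```python
import numpy as np
from scipy import stats

# Xbar_3 ~ N(120, 4/3): mean 120, variance sigma^2 / n = 4/3
Xbar = stats.norm(loc=120, scale=np.sqrt(4 / 3))

print(Xbar.cdf(120) - Xbar.cdf(118))            # P(118 <= Xbar <= 120), about 0.458

# P(|Xbar - 120| <= x) = 0.95 gives x = 1.96 * sqrt(4/3)
print(stats.norm.ppf(0.975) * np.sqrt(4 / 3))   # about 2.26
```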

[Figure slide: area under the normal curve (5.6); graph only, no text content.]

Lognormal distributions (5.6)

Def: Lognormal distributions. If $\log(X) \sim N(\mu, \sigma^2)$, then we say that $X$ has the Lognormal distribution with parameters $\mu$ and $\sigma^2$. The support of the lognormal distribution is $(0, \infty)$. It is often used to model time before failure.

Example: let $X$ and $Y$ be independent random variables such that $\log(X) \sim N(1.6, 4.5)$ and $\log(Y) \sim N(3, 6)$. What is the distribution of the product $XY$?
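Since $\log(XY) = \log X + \log Y$ is a sum of independent normals, Theorem 5.6.7 gives $\log(XY) \sim N(1.6 + 3,\, 4.5 + 6) = N(4.6, 10.5)$, so $XY$ is lognormal with parameters 4.6 and 10.5. A simulation check (a Python sketch assuming numpy; the seed is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
logX = rng.normal(1.6, np.sqrt(4.5), 100_000)
logY = rng.normal(3.0, np.sqrt(6.0), 100_000)

# log(XY) = log X + log Y ~ N(4.6, 10.5), so XY is Lognormal(4.6, 10.5)
log_prod = logX + logY
print(log_prod.mean(), log_prod.var())   # about 4.6 and about 10.5
```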

Bivariate normal distributions (5.10)

Def: Bivariate normal. Two continuous r.v. $X_1$ and $X_2$ have the bivariate normal distribution with means $\mu_1$ and $\mu_2$, variances $\sigma_1^2$ and $\sigma_2^2$ and correlation $\rho$ if they have the joint pdf

$$f(x_1, x_2) = \frac{1}{2\pi(1 - \rho^2)^{1/2}\sigma_1\sigma_2} \exp\left( -\frac{1}{2(1 - \rho^2)}\left[ \frac{(x_1 - \mu_1)^2}{\sigma_1^2} - 2\rho\left(\frac{x_1 - \mu_1}{\sigma_1}\right)\left(\frac{x_2 - \mu_2}{\sigma_2}\right) + \frac{(x_2 - \mu_2)^2}{\sigma_2^2} \right] \right) \quad (1)$$

Parameter space: $\mu_i \in \mathbb{R}$, $\sigma_i^2 > 0$ for $i = 1, 2$ and $-1 < \rho < 1$.

[Figure slide: bivariate normal pdfs and contour plots for different values of ρ (5.10); graphs only, no text content.]

Bivariate normal as a linear combination (5.10)

Theorem 5.10.1: a bivariate normal from two independent standard normals. Let $Z_1 \sim N(0, 1)$ and $Z_2 \sim N(0, 1)$ be independent. Let $\mu_i \in \mathbb{R}$, $\sigma_i^2 > 0$ for $i = 1, 2$ and $-1 < \rho < 1$, and let

$$X_1 = \sigma_1 Z_1 + \mu_1, \qquad X_2 = \sigma_2\left( \rho Z_1 + \sqrt{1 - \rho^2}\, Z_2 \right) + \mu_2 \quad (2)$$

Then the joint distribution of $X_1$ and $X_2$ is bivariate normal with parameters $\mu_1$, $\mu_2$, $\sigma_1^2$, $\sigma_2^2$ and $\rho$.

Theorem 5.10.2 (part 1), the other way: let $X_1$ and $X_2$ have the pdf in (1). Then there exist independent standard normal r.v. $Z_1$ and $Z_2$ so that (2) holds.
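The construction in (2) is also a practical recipe for generating bivariate normal samples. A Python sketch (assuming numpy; the parameter values and seed are illustrative):

```python
import numpy as np

mu1, mu2 = 1.0, -2.0
s1, s2, rho = 2.0, 0.5, 0.7
rng = np.random.default_rng(4)

Z1 = rng.standard_normal(200_000)
Z2 = rng.standard_normal(200_000)

# The construction in (2):
X1 = s1 * Z1 + mu1
X2 = s2 * (rho * Z1 + np.sqrt(1 - rho**2) * Z2) + mu2

print(X1.mean(), X1.std(), X2.mean(), X2.std())  # about mu1, s1, mu2, s2
print(np.corrcoef(X1, X2)[0, 1])                 # about rho
```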

Properties of a bivariate normal (5.10)

Theorem 5.10.2 (part 2): let $X_1$ and $X_2$ have the pdf in (1). Then the marginal distributions are $X_1 \sim N(\mu_1, \sigma_1^2)$ and $X_2 \sim N(\mu_2, \sigma_2^2)$, and the correlation between $X_1$ and $X_2$ is $\rho$.

Theorem 5.10.4: the conditional is normal. Let $X_1$ and $X_2$ have the pdf in (1). Then the conditional distribution of $X_2$ given that $X_1 = x_1$ is (univariate) normal with

$$E(X_2 \mid X_1 = x_1) = \mu_2 + \rho\sigma_2 \frac{x_1 - \mu_1}{\sigma_1} \quad \text{and} \quad \operatorname{Var}(X_2 \mid X_1 = x_1) = (1 - \rho^2)\sigma_2^2$$

Properties of a bivariate normal, continued (5.10)

Theorem 5.10.3: uncorrelated ⇔ independent. Let $X_1$ and $X_2$ have the bivariate normal distribution. Then $X_1$ and $X_2$ are independent if and only if they are uncorrelated.

This equivalence only holds for the (multivariate) normal distribution; in general, uncorrelated random variables need not be independent. It is one of the very convenient properties of the normal distribution.

Theorem 5.10.5: linear combinations are normal. Let $X_1$ and $X_2$ have the pdf in (1) and let $a_1$, $a_2$ and $b$ be constants. Then $Y = a_1 X_1 + a_2 X_2 + b$ is normally distributed with

$$E(Y) = a_1\mu_1 + a_2\mu_2 + b \quad \text{and} \quad \operatorname{Var}(Y) = a_1^2\sigma_1^2 + a_2^2\sigma_2^2 + 2a_1 a_2 \rho\sigma_1\sigma_2$$

This extends what we already had for independent normals.

Example (5.10)

Let $X_1$ and $X_2$ have the bivariate normal distribution with means $\mu_1 = 3$, $\mu_2 = 5$, variances $\sigma_1^2 = 4$, $\sigma_2^2 = 9$ and correlation $\rho = 0.6$.

a) Find the distribution of $X_2 - 2X_1$.
b) What is the expected value of $X_2$, given that we observed $X_1 = 2$?
c) What is the probability that $X_1 > X_2$?
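Worked answers, using Theorems 5.10.4 and 5.10.5 (a Python sketch assuming numpy and scipy):

```python
import numpy as np
from scipy import stats

mu1, mu2, s1, s2, rho = 3, 5, 2, 3, 0.6   # sigma1^2 = 4, sigma2^2 = 9

# a) Y = X2 - 2*X1 is normal (Theorem 5.10.5) with a1 = -2, a2 = 1
a1, a2 = -2, 1
m = a1 * mu1 + a2 * mu2                                  # mean: -1
v = a1**2 * s1**2 + a2**2 * s2**2 + 2 * a1 * a2 * rho * s1 * s2  # var: 10.6
print(m, v)

# b) E(X2 | X1 = 2) = mu2 + rho * s2 * (x1 - mu1) / s1 (Theorem 5.10.4)
print(mu2 + rho * s2 * (2 - mu1) / s1)                   # = 4.1

# c) P(X1 > X2) = P(X2 - X1 < 0), where X2 - X1 ~ N(2, 4 + 9 - 2*rho*s1*s2)
print(stats.norm.cdf(0, loc=2, scale=np.sqrt(5.8)))      # about 0.203
```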

Multivariate normal, matrix notation (5.10)

The pdf of an $n$-dimensional normal distribution, $X \sim N(\mu, \Sigma)$:

$$f(x) = \frac{1}{(2\pi)^{n/2} |\Sigma|^{1/2}} \exp\left\{ -\frac{1}{2}(x - \mu)^{\top} \Sigma^{-1} (x - \mu) \right\}$$

where

$$\mu = \begin{pmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_n \end{pmatrix}, \quad x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} \quad \text{and} \quad \Sigma = \begin{pmatrix} \sigma_1^2 & \sigma_{1,2} & \sigma_{1,3} & \cdots & \sigma_{1,n} \\ \sigma_{2,1} & \sigma_2^2 & \sigma_{2,3} & \cdots & \sigma_{2,n} \\ \sigma_{3,1} & \sigma_{3,2} & \sigma_3^2 & \cdots & \sigma_{3,n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \sigma_{n,1} & \sigma_{n,2} & \sigma_{n,3} & \cdots & \sigma_n^2 \end{pmatrix}$$

$\mu$ is the mean vector and $\Sigma$ is called the variance-covariance matrix.
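In code, the multivariate normal is available directly. A Python sketch (assuming numpy and scipy; the mean vector and covariance matrix below are illustrative):

```python
import numpy as np
from scipy import stats

mu = np.array([0.0, 1.0, -1.0])
Sigma = np.array([[2.0, 0.5, 0.3],
                  [0.5, 1.0, 0.2],
                  [0.3, 0.2, 1.5]])

X = stats.multivariate_normal(mean=mu, cov=Sigma)
print(X.pdf([0.0, 0.0, 0.0]))          # evaluates the density above at a point

# Sample and check that the empirical mean and covariance recover mu and Sigma
samples = X.rvs(size=100_000, random_state=5)
print(samples.mean(axis=0))            # about mu
print(np.cov(samples.T))               # about Sigma
```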

Multivariate normal, matrix notation, continued (5.10)

The same things hold for the multivariate normal distribution as for the bivariate. Let $X \sim N(\mu, \Sigma)$. Then:
- Linear combinations of $X$ are normal: $AX + b$ is (multivariate) normal for a fixed matrix $A$ and vector $b$.
- The marginal distribution of $X_i$ is normal with mean $\mu_i$ and variance $\sigma_i^2$.
- The off-diagonal elements of $\Sigma$ are the covariances between individual elements of $X$, i.e. $\operatorname{Cov}(X_i, X_j) = \sigma_{i,j}$.
- The joint marginal distributions are also normal, where the mean vector and covariance matrix are found by picking the corresponding elements from $\mu$ and the corresponding rows and columns from $\Sigma$.
- The conditional distributions are also normal (multivariate or univariate).