Stat 5101 Notes: Brand Name Distributions

Charles J. Geyer

September 5, 2012

Contents

1 Discrete Uniform Distribution
2 General Discrete Uniform Distribution
3 Uniform Distribution
4 General Uniform Distribution
5 Bernoulli Distribution
6 Binomial Distribution
7 Hypergeometric Distribution
8 Poisson Distribution
9 Geometric Distribution
10 Negative Binomial Distribution
11 Normal Distribution
12 Exponential Distribution
13 Gamma Distribution
14 Beta Distribution
15 Multinomial Distribution

16 Bivariate Normal Distribution
17 Multivariate Normal Distribution
18 Chi-Square Distribution
19 Student's t Distribution
20 Snedecor's F Distribution
21 Cauchy Distribution
22 Laplace Distribution

1 Discrete Uniform Distribution

Abbreviation DiscUnif(n).

Type Discrete.

Rationale Equally likely outcomes.

Sample Space The interval 1, 2, ..., n of the integers.

Probability Mass Function
    f(x) = \frac{1}{n}, \qquad x = 1, 2, \ldots, n

Moments
    E(X) = \frac{n + 1}{2}
    \mathrm{var}(X) = \frac{n^2 - 1}{12}

2 General Discrete Uniform Distribution

Type Discrete.

Sample Space Any finite set S.

Probability Mass Function
    f(x) = \frac{1}{n}, \qquad x \in S,
where n is the number of elements of S.

3 Uniform Distribution

Abbreviation Unif(a, b).

Rationale Continuous analog of the discrete uniform distribution.

Parameters Real numbers a and b with a < b.

Sample Space The interval (a, b) of the real numbers.

Probability Density Function
    f(x) = \frac{1}{b - a}, \qquad a < x < b

Moments
    E(X) = \frac{a + b}{2}
    \mathrm{var}(X) = \frac{(b - a)^2}{12}

Relation to Other Distributions Beta(1, 1) = Unif(0, 1).

4 General Uniform Distribution

Sample Space Any open set S in R^n.

Probability Density Function
    f(x) = \frac{1}{c}, \qquad x \in S,
where c is the measure (length in one dimension, area in two, volume in three, etc.) of the set S.

5 Bernoulli Distribution

Abbreviation Ber(p).

Type Discrete.

Rationale Any zero-or-one-valued random variable.

Parameter Real number 0 \le p \le 1.

Sample Space The two-element set {0, 1}.

Probability Mass Function
    f(x) = \begin{cases} p, & x = 1 \\ 1 - p, & x = 0 \end{cases}

Moments
    E(X) = p
    \mathrm{var}(X) = p(1 - p)

Addition Rule If X_1, ..., X_k are IID Ber(p) random variables, then X_1 + \cdots + X_k is a Bin(k, p) random variable.

Degeneracy If p = 0 the distribution is concentrated at 0. If p = 1 the distribution is concentrated at 1.

Relation to Other Distributions Ber(p) = Bin(1, p).
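The Bernoulli addition rule can be checked numerically. The following Python sketch (an illustrative addition, not part of the original notes) convolves k copies of the Ber(p) mass function and compares the result with the Bin(k, p) mass function:

```python
# Check the addition rule: convolving k copies of the Ber(p) mass function
# gives the Bin(k, p) mass function.  Illustrative check only.
import math

p, k = 0.3, 5
ber = [1 - p, p]          # mass at 0 and at 1

dist = [1.0]              # distribution of an empty sum (point mass at 0)
for _ in range(k):
    new = [0.0] * (len(dist) + 1)
    for x, fx in enumerate(dist):
        new[x] += fx * ber[0]       # add a Bernoulli that came up 0
        new[x + 1] += fx * ber[1]   # add a Bernoulli that came up 1
    dist = new

binom = [math.comb(k, x) * p**x * (1 - p)**(k - x) for x in range(k + 1)]
assert all(abs(a - b) < 1e-12 for a, b in zip(dist, binom))
```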

6 Binomial Distribution

Abbreviation Bin(n, p).

Type Discrete.

Rationale Sum of n IID Bernoulli random variables.

Parameters Real number 0 \le p \le 1. Integer n \ge 1.

Sample Space The interval 0, 1, ..., n of the integers.

Probability Mass Function
    f(x) = \binom{n}{x} p^x (1 - p)^{n - x}, \qquad x = 0, 1, \ldots, n

Moments
    E(X) = np
    \mathrm{var}(X) = np(1 - p)

Addition Rule If X_1, ..., X_k are independent random variables, X_i being Bin(n_i, p) distributed, then X_1 + \cdots + X_k is a Bin(n_1 + \cdots + n_k, p) random variable.

Normal Approximation If np and n(1 - p) are both large, then
    Bin(n, p) \approx N(np, np(1 - p))

Poisson Approximation If n is large but np is small, then
    Bin(n, p) \approx Poi(np)

Theorem The fact that the probability mass function sums to one is equivalent to the binomial theorem: for any real numbers a and b
    \sum_{k=0}^{n} \binom{n}{k} a^k b^{n - k} = (a + b)^n.
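As a quick sanity check on the binomial mass function and moments (an illustrative addition, not part of the original notes), one can sum the mass function directly with the standard library:

```python
# Confirm that the Bin(n, p) mass function sums to one and has mean np and
# variance np(1 - p).  Illustrative check only.
import math

def binom_pmf(n, p):
    return [math.comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]

n, p = 20, 0.3
pmf = binom_pmf(n, p)
mean = sum(x * f for x, f in enumerate(pmf))
var = sum((x - mean)**2 * f for x, f in enumerate(pmf))
assert abs(sum(pmf) - 1) < 1e-12
assert abs(mean - n * p) < 1e-12
assert abs(var - n * p * (1 - p)) < 1e-12
```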

Degeneracy If p = 0 the distribution is concentrated at 0. If p = 1 the distribution is concentrated at n.

Relation to Other Distributions Ber(p) = Bin(1, p).

7 Hypergeometric Distribution

Abbreviation Hypergeometric(A, B, n).

Type Discrete.

Rationale Sample of size n without replacement from a finite population of B zeros and A ones.

Sample Space The interval max(0, n − B), ..., min(n, A) of the integers.

Probability Mass Function
    f(x) = \frac{\binom{A}{x} \binom{B}{n - x}}{\binom{A + B}{n}}, \qquad x = \max(0, n - B), \ldots, \min(n, A)

Moments
    E(X) = np
    \mathrm{var}(X) = np(1 - p) \cdot \frac{N - n}{N - 1}
where
    p = \frac{A}{A + B} \quad \text{and} \quad N = A + B.    (7.1)

Binomial Approximation If n is small compared to either A or B, then
    Hypergeometric(A, B, n) \approx Bin(n, p)
where p is given by (7.1).
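The hypergeometric mass function can be checked numerically. This sketch (an illustrative addition, not part of the original notes) verifies that the mass function sums to one, which is the Vandermonde identity, and that the mean is np with p = A/(A + B):

```python
# Check that the Hypergeometric(A, B, n) mass function sums to one and has
# mean np with p = A/(A + B).  Illustrative check only.
import math

A, B, n = 7, 5, 6
p = A / (A + B)
support = range(max(0, n - B), min(n, A) + 1)
pmf = {x: math.comb(A, x) * math.comb(B, n - x) / math.comb(A + B, n)
       for x in support}
assert abs(sum(pmf.values()) - 1) < 1e-12    # Vandermonde identity
mean = sum(x * f for x, f in pmf.items())
assert abs(mean - n * p) < 1e-12
```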

Normal Approximation If n is large, but small compared to either A or B, then
    Hypergeometric(A, B, n) \approx N(np, np(1 - p))
where p is given by (7.1).

Theorem The fact that the probability mass function sums to one is equivalent to
    \sum_{x = \max(0, n - B)}^{\min(A, n)} \binom{A}{x} \binom{B}{n - x} = \binom{A + B}{n}

8 Poisson Distribution

Abbreviation Poi(µ).

Type Discrete.

Rationale Counts in a Poisson process.

Parameter Real number µ > 0.

Sample Space The non-negative integers 0, 1, ....

Probability Mass Function
    f(x) = \frac{\mu^x}{x!} e^{-\mu}, \qquad x = 0, 1, \ldots

Moments
    E(X) = \mu
    \mathrm{var}(X) = \mu

Addition Rule If X_1, ..., X_k are independent random variables, X_i being Poi(µ_i) distributed, then X_1 + \cdots + X_k is a Poi(µ_1 + \cdots + µ_k) random variable.

Normal Approximation If µ is large, then
    Poi(µ) \approx N(µ, µ)

Theorem The fact that the probability mass function sums to one is equivalent to the Maclaurin series for the exponential function: for any real number x
    \sum_{k=0}^{\infty} \frac{x^k}{k!} = e^x.

9 Geometric Distribution

Abbreviation Geo(p).

Type Discrete.

Rationales
- Discrete lifetime of an object that does not age.
- Waiting time or interarrival time in a sequence of IID Bernoulli trials.
- Inverse sampling.
- Discrete analog of the exponential distribution.

Parameter Real number 0 < p \le 1.

Sample Space The non-negative integers 0, 1, ....

Probability Mass Function
    f(x) = p (1 - p)^x, \qquad x = 0, 1, \ldots

Moments
    E(X) = \frac{1 - p}{p}
    \mathrm{var}(X) = \frac{1 - p}{p^2}

Addition Rule If X_1, ..., X_k are IID Geo(p) random variables, then X_1 + \cdots + X_k is a NegBin(k, p) random variable.
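The geometric mean and variance formulas can be spot-checked by summing the mass function far enough out that the truncation error is negligible. This sketch is an illustrative addition, not part of the original notes:

```python
# Check the Geo(p) mean (1 - p)/p and variance (1 - p)/p**2 by summing the
# mass function over a long truncated support.  Illustrative check only.
p = 0.4
N = 2000   # truncation point; (1 - p)**N is negligibly small here
pmf = [p * (1 - p)**x for x in range(N)]
mean = sum(x * f for x, f in enumerate(pmf))
var = sum((x - mean)**2 * f for x, f in enumerate(pmf))
assert abs(sum(pmf) - 1) < 1e-9
assert abs(mean - (1 - p) / p) < 1e-9
assert abs(var - (1 - p) / p**2) < 1e-9
```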

Theorem The fact that the probability mass function sums to one is equivalent to the geometric series: for any real number s such that |s| < 1
    \sum_{k=0}^{\infty} s^k = \frac{1}{1 - s}.

Degeneracy If p = 1 the distribution is concentrated at 0.

10 Negative Binomial Distribution

Abbreviation NegBin(r, p).

Type Discrete.

Rationales
- Sum of IID geometric random variables.
- Inverse sampling.
- Gamma mixture of Poisson distributions.

Parameters Real number 0 < p \le 1. Integer r \ge 1.

Sample Space The non-negative integers 0, 1, ....

Probability Mass Function
    f(x) = \binom{r + x - 1}{x} p^r (1 - p)^x, \qquad x = 0, 1, \ldots

Moments
    E(X) = \frac{r(1 - p)}{p}
    \mathrm{var}(X) = \frac{r(1 - p)}{p^2}

Addition Rule If X_1, ..., X_k are independent random variables, X_i being NegBin(r_i, p) distributed, then X_1 + \cdots + X_k is a NegBin(r_1 + \cdots + r_k, p) random variable.
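The "sum of IID geometrics" rationale can be verified directly: convolving r copies of the Geo(p) mass function reproduces the NegBin(r, p) mass function on any finite initial segment of the support. This sketch is an illustrative addition, not part of the original notes:

```python
# Check that the r-fold convolution of the Geo(p) mass function equals the
# NegBin(r, p) mass function (exactly, on the grid 0..N-1).  Illustrative only.
import math

p, r, N = 0.5, 3, 60
geo = [p * (1 - p)**x for x in range(N)]

def convolve(f, g):
    # mass function of a sum, truncated to the grid 0..N-1; values for x < N
    # need only indices <= x, so no truncation error is introduced
    return [sum(f[j] * g[x - j] for j in range(x + 1)) for x in range(N)]

dist = geo
for _ in range(r - 1):
    dist = convolve(dist, geo)

negbin = [math.comb(r + x - 1, x) * p**r * (1 - p)**x for x in range(N)]
assert all(abs(a - b) < 1e-12 for a, b in zip(dist, negbin))
```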

Normal Approximation If r(1 − p) is large, then
    NegBin(r, p) \approx N\left( \frac{r(1 - p)}{p}, \frac{r(1 - p)}{p^2} \right)

Degeneracy If p = 1 the distribution is concentrated at 0.

Extended Definition The definition makes sense for noninteger r if binomial coefficients are defined by
    \binom{r}{k} = \frac{r (r - 1) \cdots (r - k + 1)}{k!}
which for integer r agrees with the standard definition. Also
    \binom{r + x - 1}{x} = (-1)^x \binom{-r}{x}    (10.1)
which explains the name "negative binomial."

Theorem The fact that the probability mass function sums to one is equivalent to the generalized binomial theorem: for any real number s such that −1 < s < 1 and any real number m
    \sum_{k=0}^{\infty} \binom{m}{k} s^k = (1 + s)^m.    (10.2)
If m is a nonnegative integer, then \binom{m}{k} is zero for k > m, and we get the ordinary binomial theorem. Changing variables from m to −m and from s to −s and using (10.1) turns (10.2) into
    \sum_{k=0}^{\infty} \binom{m + k - 1}{k} s^k = \sum_{k=0}^{\infty} \binom{-m}{k} (-s)^k = (1 - s)^{-m}
which has a more obvious relationship to the negative binomial density summing to one.

11 Normal Distribution

Abbreviation N(µ, σ²).

Rationales
- Limiting distribution in the central limit theorem.
- Error distribution that turns the method of least squares into maximum likelihood estimation.

Parameters Real numbers µ and σ² > 0.

Sample Space The real numbers.

Probability Density Function
    f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-(x - \mu)^2 / 2\sigma^2}, \qquad -\infty < x < \infty

Moments
    E(X) = \mu
    \mathrm{var}(X) = \sigma^2
    E\{(X - \mu)^3\} = 0
    E\{(X - \mu)^4\} = 3\sigma^4

Linear Transformations If X is N(µ, σ²) distributed, then aX + b is N(aµ + b, a²σ²) distributed.

Addition Rule If X_1, ..., X_k are independent random variables, X_i being N(µ_i, σ_i²) distributed, then X_1 + \cdots + X_k is a N(µ_1 + \cdots + µ_k, σ_1² + \cdots + σ_k²) random variable.

Theorem The fact that the probability density function integrates to one is equivalent to the integral
    \int_{-\infty}^{\infty} e^{-z^2/2} \, dz = \sqrt{2\pi}

Relation to Other Distributions If Z is N(0, 1) distributed, then Z² is Gam(1/2, 1/2) = chi²(1) distributed. Also related to the Student t, Snedecor F, and Cauchy distributions (for which see below).
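The normal density's normalization and fourth central moment can be spot-checked by numerical integration; the trapezoidal rule is extremely accurate for smooth, rapidly decaying integrands like this one. The sketch below is an illustrative addition, not part of the original notes:

```python
# Check that the N(mu, sigma^2) density integrates to one and that
# E{(X - mu)^4} = 3 sigma^4, by trapezoidal integration.  Illustrative only.
import math

mu, sigma = 1.5, 2.0

def pdf(x):
    return math.exp(-(x - mu)**2 / (2 * sigma**2)) / (math.sqrt(2 * math.pi) * sigma)

m = 40000
a, b = mu - 12 * sigma, mu + 12 * sigma   # tails beyond 12 sigma are negligible
h = (b - a) / m
xs = [a + i * h for i in range(m + 1)]

def integral(g):
    vals = [g(x) for x in xs]
    return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

assert abs(integral(pdf) - 1) < 1e-9
assert abs(integral(lambda x: (x - mu)**4 * pdf(x)) - 3 * sigma**4) < 1e-6
```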

12 Exponential Distribution

Abbreviation Exp(λ).

Rationales
- Lifetime of an object that does not age.
- Waiting time or interarrival time in a Poisson process.
- Continuous analog of the geometric distribution.

Parameter Real number λ > 0.

Sample Space The interval (0, ∞) of the real numbers.

Probability Density Function
    f(x) = \lambda e^{-\lambda x}, \qquad 0 < x < \infty

Cumulative Distribution Function
    F(x) = 1 - e^{-\lambda x}, \qquad 0 < x < \infty

Moments
    E(X) = \frac{1}{\lambda}
    \mathrm{var}(X) = \frac{1}{\lambda^2}

Addition Rule If X_1, ..., X_k are IID Exp(λ) random variables, then X_1 + \cdots + X_k is a Gam(k, λ) random variable.

Relation to Other Distributions Exp(λ) = Gam(1, λ).

13 Gamma Distribution

Abbreviation Gam(α, λ).

Rationales
- Sum of IID exponential random variables.
- Conjugate prior for the exponential, Poisson, or normal precision family.

Parameters Real numbers α > 0 and λ > 0.

Sample Space The interval (0, ∞) of the real numbers.

Probability Density Function
    f(x) = \frac{\lambda^\alpha}{\Gamma(\alpha)} x^{\alpha - 1} e^{-\lambda x}, \qquad 0 < x < \infty
where Γ(α) is defined by (13.1) below.

Moments
    E(X) = \frac{\alpha}{\lambda}
    \mathrm{var}(X) = \frac{\alpha}{\lambda^2}

Addition Rule If X_1, ..., X_k are independent random variables, X_i being Gam(α_i, λ) distributed, then X_1 + \cdots + X_k is a Gam(α_1 + \cdots + α_k, λ) random variable.

Normal Approximation If α is large, then
    Gam(α, λ) \approx N\left( \frac{\alpha}{\lambda}, \frac{\alpha}{\lambda^2} \right)

Theorem The fact that the probability density function integrates to one is equivalent to the integral
    \int_0^{\infty} x^{\alpha - 1} e^{-\lambda x} \, dx = \frac{\Gamma(\alpha)}{\lambda^\alpha};
the case λ = 1 is the definition of the gamma function
    \Gamma(\alpha) = \int_0^{\infty} x^{\alpha - 1} e^{-x} \, dx    (13.1)
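Python's standard-library `math.gamma` can be used to spot-check the integral in the theorem above. This sketch (an illustrative addition, not part of the original notes) compares a trapezoidal approximation of the integral with Γ(α)/λ^α:

```python
# Check the integral  ∫_0^∞ x^(α-1) e^(-λx) dx = Γ(α)/λ^α  numerically.
# Illustrative check only.
import math

alpha, lam = 2.5, 1.5

def integrand(x):
    return x**(alpha - 1) * math.exp(-lam * x)

m = 40000
a, b = 0.0, 40.0   # the tail beyond 40 is negligible for these parameters
h = (b - a) / m
vals = [integrand(a + i * h) for i in range(m + 1)]
numeric = h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

assert abs(numeric - math.gamma(alpha) / lam**alpha) < 1e-5
# sanity check of a known gamma-function value
assert abs(math.gamma(0.5) - math.sqrt(math.pi)) < 1e-12
```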

Relation to Other Distributions
- Exp(λ) = Gam(1, λ).
- chi²(ν) = Gam(ν/2, 1/2).
- If X and Y are independent, X is Gam(α_1, λ) distributed and Y is Gam(α_2, λ) distributed, then X/(X + Y) is Beta(α_1, α_2) distributed.
- If Z is N(0, 1) distributed, then Z² is Gam(1/2, 1/2) distributed.

Facts About Gamma Functions Integration by parts in (13.1) establishes the gamma function recursion formula
    \Gamma(\alpha + 1) = \alpha \Gamma(\alpha), \qquad \alpha > 0    (13.2)
The relationship between the Exp(λ) and Gam(1, λ) distributions gives
    \Gamma(1) = 1
and the relationship between the N(0, 1) and Gam(1/2, 1/2) distributions gives
    \Gamma(\tfrac{1}{2}) = \sqrt{\pi}
Together with the recursion (13.2) these give, for any positive integer n,
    \Gamma(n + 1) = n!
and
    \Gamma(n + \tfrac{1}{2}) = \left(n - \tfrac{1}{2}\right)\left(n - \tfrac{3}{2}\right) \cdots \tfrac{3}{2} \cdot \tfrac{1}{2} \sqrt{\pi}

14 Beta Distribution

Abbreviation Beta(α_1, α_2).

Rationales
- Ratio of gamma random variables.
- Conjugate prior for the binomial or negative binomial family.

Parameters Real numbers α_1 > 0 and α_2 > 0.

Sample Space The interval (0, 1) of the real numbers.

Probability Density Function
    f(x) = \frac{\Gamma(\alpha_1 + \alpha_2)}{\Gamma(\alpha_1)\Gamma(\alpha_2)} x^{\alpha_1 - 1} (1 - x)^{\alpha_2 - 1}, \qquad 0 < x < 1
where Γ(α) is defined by (13.1) above.

Moments
    E(X) = \frac{\alpha_1}{\alpha_1 + \alpha_2}
    \mathrm{var}(X) = \frac{\alpha_1 \alpha_2}{(\alpha_1 + \alpha_2)^2 (\alpha_1 + \alpha_2 + 1)}

Theorem The fact that the probability density function integrates to one is equivalent to the integral
    \int_0^1 x^{\alpha_1 - 1} (1 - x)^{\alpha_2 - 1} \, dx = \frac{\Gamma(\alpha_1)\Gamma(\alpha_2)}{\Gamma(\alpha_1 + \alpha_2)}

Relation to Other Distributions
- If X and Y are independent, X is Gam(α_1, λ) distributed and Y is Gam(α_2, λ) distributed, then X/(X + Y) is Beta(α_1, α_2) distributed.
- Beta(1, 1) = Unif(0, 1).

15 Multinomial Distribution

Abbreviation Multi(n, p).

Type Discrete.

Rationale Multivariate analog of the binomial distribution.

Parameters Real vector p in the parameter space
    \left\{ p \in R^k : 0 \le p_i, \ i = 1, \ldots, k, \text{ and } \sum_{i=1}^k p_i = 1 \right\}    (15.1)
(real vectors whose components are nonnegative and sum to one).

Sample Space The set of vectors
    S = \left\{ x \in Z^k : 0 \le x_i, \ i = 1, \ldots, k, \text{ and } \sum_{i=1}^k x_i = n \right\}    (15.2)
(integer vectors whose components are nonnegative and sum to n).

Probability Mass Function
    f(x) = \binom{n}{x} \prod_{i=1}^k p_i^{x_i}, \qquad x \in S
where
    \binom{n}{x} = \frac{n!}{\prod_{i=1}^k x_i!}
is called a multinomial coefficient.

Moments
    E(X_i) = n p_i
    \mathrm{var}(X_i) = n p_i (1 - p_i)
    \mathrm{cov}(X_i, X_j) = -n p_i p_j, \qquad i \ne j

Moments (Vector Form)
    E(X) = np
    \mathrm{var}(X) = nM
where
    M = P - p p^T
and P is the diagonal matrix whose vector of diagonal elements is p.

Addition Rule If X_1, ..., X_k are independent random vectors, X_i being Multi(n_i, p) distributed, then X_1 + \cdots + X_k is a Multi(n_1 + \cdots + n_k, p) random vector.

Normal Approximation If n is large and p is not near the boundary of the parameter space (15.1), then
    Multi(n, p) \approx N(np, nM)

Theorem The fact that the probability mass function sums to one is equivalent to the multinomial theorem: for any vector a of real numbers
    \sum_{x \in S} \binom{n}{x} \prod_{i=1}^k a_i^{x_i} = (a_1 + \cdots + a_k)^n

Degeneracy If a vector a exists such that Ma = 0, then var(a^T X) = 0. In particular, the vector u = (1, 1, ..., 1) always satisfies Mu = 0, so var(u^T X) = 0. This is obvious, since u^T X = \sum_{i=1}^k X_i = n by definition of the multinomial distribution, and the variance of a constant is zero. This means a multinomial random vector of dimension k is really of dimension no more than k − 1, because it is concentrated on a hyperplane containing the sample space (15.2).

Marginal Distributions Every univariate marginal is binomial:
    X_i \sim Bin(n, p_i)
Random vectors formed by collapsing categories are not, strictly speaking, marginals, but they are multinomial. If A_1, ..., A_m is a partition of the set {1, ..., k} and
    Y_j = \sum_{i \in A_j} X_i, \qquad j = 1, \ldots, m
    q_j = \sum_{i \in A_j} p_i, \qquad j = 1, \ldots, m
then the random vector Y has a Multi(n, q) distribution.
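The multinomial theorem stated above can be checked by brute-force enumeration of the sample space (15.2) for small n and k. This sketch is an illustrative addition, not part of the original notes:

```python
# Check the multinomial theorem: sum over all x in S of the multinomial
# coefficient times prod(a_i^x_i) equals (a_1 + ... + a_k)^n.  Illustrative only.
import math

def compositions(n, k):
    """Yield all k-tuples of nonnegative integers summing to n."""
    if k == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, k - 1):
            yield (first,) + rest

a = (1.5, -0.5, 2.0)
n = 4
total = 0.0
for x in compositions(n, len(a)):
    coef = math.factorial(n)
    for xi in x:
        coef //= math.factorial(xi)    # multinomial coefficient n!/prod(x_i!)
    total += coef * math.prod(ai**xi for ai, xi in zip(a, x))

assert abs(total - sum(a)**n) < 1e-9
```

Taking a = p, a probability vector, the same enumeration shows the mass function sums to one.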

Conditional Distributions If {i_1, ..., i_m} and {i_{m+1}, ..., i_k} partition the set {1, ..., k}, then the conditional distribution of X_{i_1}, ..., X_{i_m} given X_{i_{m+1}}, ..., X_{i_k} is Multi(n − X_{i_{m+1}} − \cdots − X_{i_k}, q), where the parameter vector q has components
    q_j = \frac{p_{i_j}}{p_{i_1} + \cdots + p_{i_m}}, \qquad j = 1, \ldots, m

Relation to Other Distributions
- Each marginal of a multinomial is binomial.
- If X is Bin(n, p), then the vector (X, n − X) is Multi(n, (p, 1 − p)).

16 Bivariate Normal Distribution

Abbreviation See multivariate normal below.

Rationales See multivariate normal below.

Parameters Real vector µ of dimension 2, real symmetric positive semidefinite matrix M of dimension 2 × 2 having the form
    M = \begin{pmatrix} \sigma_1^2 & \rho \sigma_1 \sigma_2 \\ \rho \sigma_1 \sigma_2 & \sigma_2^2 \end{pmatrix}
where σ_1 > 0, σ_2 > 0 and −1 < ρ < +1.

Sample Space The Euclidean space R².

Probability Density Function
    f(x) = \frac{1}{2\pi} \det(M)^{-1/2} \exp\left( -\tfrac{1}{2} (x - \mu)^T M^{-1} (x - \mu) \right)
         = \frac{1}{2\pi \sqrt{1 - \rho^2}\, \sigma_1 \sigma_2} \exp\left( -\frac{1}{2(1 - \rho^2)} \left[ \left(\frac{x_1 - \mu_1}{\sigma_1}\right)^2 - 2\rho \left(\frac{x_1 - \mu_1}{\sigma_1}\right)\left(\frac{x_2 - \mu_2}{\sigma_2}\right) + \left(\frac{x_2 - \mu_2}{\sigma_2}\right)^2 \right] \right), \qquad x \in R^2
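The two expressions for the bivariate normal density above (the matrix form and the expanded form) can be checked against each other at a few points, inverting the 2 × 2 matrix M explicitly. This sketch is an illustrative addition, not part of the original notes:

```python
# Check that the matrix form and the expanded form of the bivariate normal
# density agree pointwise.  Illustrative check only.
import math

mu1, mu2, s1, s2, rho = 1.0, -2.0, 1.5, 0.5, 0.7

def pdf_matrix(x1, x2):
    # M = [[s1^2, rho s1 s2], [rho s1 s2, s2^2]]; invert it explicitly
    det = s1**2 * s2**2 * (1 - rho**2)
    a, b, c = s2**2 / det, -rho * s1 * s2 / det, s1**2 / det  # entries of M^-1
    d1, d2 = x1 - mu1, x2 - mu2
    quad = a * d1**2 + 2 * b * d1 * d2 + c * d2**2
    return math.exp(-quad / 2) / (2 * math.pi * math.sqrt(det))

def pdf_expanded(x1, x2):
    z1, z2 = (x1 - mu1) / s1, (x2 - mu2) / s2
    quad = (z1**2 - 2 * rho * z1 * z2 + z2**2) / (1 - rho**2)
    return math.exp(-quad / 2) / (2 * math.pi * s1 * s2 * math.sqrt(1 - rho**2))

for x1, x2 in [(0.0, 0.0), (1.0, -2.0), (2.5, -1.0)]:
    assert abs(pdf_matrix(x1, x2) - pdf_expanded(x1, x2)) < 1e-12
```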

Moments
    E(X_i) = \mu_i, \qquad i = 1, 2
    \mathrm{var}(X_i) = \sigma_i^2, \qquad i = 1, 2
    \mathrm{cov}(X_1, X_2) = \rho \sigma_1 \sigma_2
    \mathrm{cor}(X_1, X_2) = \rho

Moments (Vector Form)
    E(X) = \mu
    \mathrm{var}(X) = M

Linear Transformations See multivariate normal below.

Addition Rule See multivariate normal below.

Marginal Distributions X_i is N(µ_i, σ_i²) distributed, i = 1, 2.

Conditional Distributions The conditional distribution of X_2 given X_1 is
    N\left( \mu_2 + \rho \frac{\sigma_2}{\sigma_1} (x_1 - \mu_1), \ (1 - \rho^2) \sigma_2^2 \right)

17 Multivariate Normal Distribution

Abbreviation N(µ, M).

Rationales
- Multivariate analog of the univariate normal distribution.
- Limiting distribution in the multivariate central limit theorem.

Parameters Real vector µ of dimension k, real symmetric positive semidefinite matrix M of dimension k × k.

Sample Space The Euclidean space R^k.

Probability Density Function If M is (strictly) positive definite,
    f(x) = (2\pi)^{-k/2} \det(M)^{-1/2} \exp\left( -\tfrac{1}{2} (x - \mu)^T M^{-1} (x - \mu) \right), \qquad x \in R^k
Otherwise there is no density (X is concentrated on a hyperplane).

Moments (Vector Form)
    E(X) = \mu
    \mathrm{var}(X) = M

Linear Transformations If X is N(µ, M) distributed, then a + BX, where a is a constant vector and B is a constant matrix of dimensions such that the vector addition and matrix multiplication make sense, has the N(a + Bµ, BMB^T) distribution.

Addition Rule If X_1, ..., X_k are independent random vectors, X_i being N(µ_i, M_i) distributed, then X_1 + \cdots + X_k is a N(µ_1 + \cdots + µ_k, M_1 + \cdots + M_k) random vector.

Degeneracy If a vector a exists such that Ma = 0, then var(a^T X) = 0.

Partitioned Vectors and Matrices The random vector and parameters are written in partitioned form
    X = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix}    (17.1a)
    \mu = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}    (17.1b)
    M = \begin{pmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{pmatrix}    (17.1c)
when X_1 consists of the first r elements of X and X_2 of the other k − r elements, and similarly for µ_1 and µ_2.

Marginal Distributions Every marginal of a multivariate normal is normal (univariate or multivariate as the case may be). In partitioned form, the (marginal) distribution of X_1 is N(µ_1, M_{11}).

Conditional Distributions Every conditional of a multivariate normal is normal (univariate or multivariate as the case may be). In partitioned form, the conditional distribution of X_1 given X_2 is
    N\left( \mu_1 + M_{12} M_{22}^{-} [X_2 - \mu_2], \ M_{11} - M_{12} M_{22}^{-} M_{21} \right)
where the notation M_{22}^{-} denotes the inverse of the matrix M_{22} if the matrix is invertible and otherwise any generalized inverse.

18 Chi-Square Distribution

Abbreviation chi²(ν) or χ²(ν).

Rationales
- Sum of squares of IID standard normal random variables.
- Sampling distribution of the sample variance when the data are IID normal.
- Asymptotic distribution in Pearson's chi-square test.
- Asymptotic distribution of the log likelihood ratio.

Parameter Real number ν > 0 called "degrees of freedom."

Sample Space The interval (0, ∞) of the real numbers.

Probability Density Function
    f(x) = \frac{(\tfrac{1}{2})^{\nu/2}}{\Gamma(\tfrac{\nu}{2})} x^{\nu/2 - 1} e^{-x/2}, \qquad 0 < x < \infty

Moments
    E(X) = \nu
    \mathrm{var}(X) = 2\nu

Addition Rule If X_1, ..., X_k are independent random variables, X_i being chi²(ν_i) distributed, then X_1 + \cdots + X_k is a chi²(ν_1 + \cdots + ν_k) random variable.
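The chi-square mean and variance can be spot-checked by numerically integrating the density above. This sketch is an illustrative addition, not part of the original notes:

```python
# Numerically check E(X) = nu and var(X) = 2 nu for the chi-square density,
# using trapezoidal integration.  Illustrative check only.
import math

nu = 5.0

def pdf(x):
    return 0.5**(nu / 2) / math.gamma(nu / 2) * x**(nu / 2 - 1) * math.exp(-x / 2)

m = 120000
a, b = 1e-9, 120.0   # tail beyond 120 is negligible for nu = 5
h = (b - a) / m
xs = [a + i * h for i in range(m + 1)]

def integral(g):
    vals = [g(x) for x in xs]
    return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

assert abs(integral(pdf) - 1) < 1e-5
mean = integral(lambda x: x * pdf(x))
var = integral(lambda x: (x - nu)**2 * pdf(x))
assert abs(mean - nu) < 1e-5
assert abs(var - 2 * nu) < 1e-4
```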

Normal Approximation If ν is large, then
    chi²(ν) \approx N(ν, 2ν)

Relation to Other Distributions
- chi²(ν) = Gam(ν/2, 1/2).
- If X is N(0, 1) distributed, then X² is chi²(1) distributed.
- If X and Y are independent, X is N(0, 1) distributed and Y is chi²(ν) distributed, then X / \sqrt{Y/\nu} is t(ν) distributed.
- If X and Y are independent and are chi²(µ) and chi²(ν) distributed, respectively, then (X/µ)/(Y/ν) is F(µ, ν) distributed.

19 Student's t Distribution

Abbreviation t(ν).

Rationales
- Sampling distribution of the pivotal quantity \sqrt{n}(\bar{X}_n - \mu)/S_n when the data are IID normal.
- Marginal for µ in the conjugate prior family for two-parameter normal data.

Parameter Real number ν > 0 called "degrees of freedom."

Sample Space The real numbers.

Probability Density Function
    f(x) = \frac{\Gamma\left(\frac{\nu + 1}{2}\right)}{\sqrt{\nu\pi}\, \Gamma\left(\frac{\nu}{2}\right)} \cdot \frac{1}{\left(1 + \frac{x^2}{\nu}\right)^{(\nu + 1)/2}}, \qquad -\infty < x < +\infty

Moments If ν > 1, then
    E(X) = 0.
Otherwise the mean does not exist. If ν > 2, then
    \mathrm{var}(X) = \frac{\nu}{\nu - 2}.
Otherwise the variance does not exist.

Normal Approximation If ν is large, then
    t(ν) \approx N(0, 1)

Relation to Other Distributions
- If X and Y are independent, X is N(0, 1) distributed and Y is chi²(ν) distributed, then X / \sqrt{Y/\nu} is t(ν) distributed.
- If X is t(ν) distributed, then X² is F(1, ν) distributed.
- t(1) = Cauchy(0, 1).

20 Snedecor's F Distribution

Abbreviation F(µ, ν).

Rationale Ratio of sums of squares for normal data (test statistics in regression and analysis of variance).

Parameters Real numbers µ > 0 and ν > 0 called "numerator degrees of freedom" and "denominator degrees of freedom," respectively.

Sample Space The interval (0, ∞) of the real numbers.

Probability Density Function
    f(x) = \frac{\Gamma\left(\frac{\mu + \nu}{2}\right) \mu^{\mu/2} \nu^{\nu/2}}{\Gamma\left(\frac{\mu}{2}\right) \Gamma\left(\frac{\nu}{2}\right)} \cdot \frac{x^{\mu/2 - 1}}{(\mu x + \nu)^{(\mu + \nu)/2}}, \qquad 0 < x < +\infty
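The relation "if T is t(ν) distributed then T² is F(1, ν) distributed" can be checked pointwise: by the univariate change-of-variables formula with the two branches ±√x, the density of T² is f_T(√x)/√x, which should match the F(1, ν) density. This sketch is an illustrative addition, not part of the original notes:

```python
# Pointwise check that the density of T^2, with T ~ t(nu), equals the
# F(1, nu) density.  Illustrative check only.
import math

nu = 7.0

def t_pdf(x):
    return math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2)) \
        * (1 + x**2 / nu) ** (-(nu + 1) / 2)

def f_pdf(x, mu=1.0):
    c = math.gamma((mu + nu) / 2) * mu**(mu / 2) * nu**(nu / 2) \
        / (math.gamma(mu / 2) * math.gamma(nu / 2))
    return c * x**(mu / 2 - 1) / (mu * x + nu) ** ((mu + nu) / 2)

for x in (0.1, 1.0, 2.5, 10.0):
    # density of T^2: two branches +-sqrt(x), each contributing
    # t_pdf(sqrt(x)) / (2 sqrt(x)), and symmetry doubles it
    t2_pdf = t_pdf(math.sqrt(x)) / math.sqrt(x)
    assert abs(t2_pdf - f_pdf(x)) < 1e-12
```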

Moments If ν > 2, then
    E(X) = \frac{\nu}{\nu - 2}.
Otherwise the mean does not exist.

Relation to Other Distributions
- If X and Y are independent and are chi²(µ) and chi²(ν) distributed, respectively, then (X/µ)/(Y/ν) is F(µ, ν) distributed.
- If X is t(ν) distributed, then X² is F(1, ν) distributed.

21 Cauchy Distribution

Abbreviation Cauchy(µ, σ).

Rationales
- Very heavy-tailed distribution.
- Counterexample to the law of large numbers.

Parameters Real numbers µ and σ > 0.

Sample Space The real numbers.

Probability Density Function
    f(x) = \frac{1}{\pi\sigma} \cdot \frac{1}{1 + \left(\frac{x - \mu}{\sigma}\right)^2}, \qquad -\infty < x < +\infty

Moments No moments exist.

Addition Rule If X_1, ..., X_n are IID Cauchy(µ, σ) random variables, then \bar{X}_n = (X_1 + \cdots + X_n)/n is also Cauchy(µ, σ).

Relation to Other Distributions t(1) = Cauchy(0, 1).

22 Laplace Distribution

Abbreviation Laplace(µ, σ).

Rationale The sample median is the maximum likelihood estimate of the location parameter.

Parameters Real numbers µ and σ > 0, called the mean and standard deviation, respectively.

Sample Space The real numbers.

Probability Density Function
    f(x) = \frac{\sqrt{2}}{2\sigma} \exp\left( -\frac{\sqrt{2}\, |x - \mu|}{\sigma} \right), \qquad -\infty < x < \infty

Moments
    E(X) = \mu
    \mathrm{var}(X) = \sigma^2
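The parameterization above makes σ the actual standard deviation, which can be checked by numerical integration (placing the grid symmetrically about µ so the kink of |x − µ| falls on a node). This sketch is an illustrative addition, not part of the original notes:

```python
# Check that the Laplace(mu, sigma) density integrates to one and has
# variance sigma^2, by trapezoidal integration.  Illustrative check only.
import math

mu, sigma = 0.5, 2.0
b = sigma / math.sqrt(2)   # conventional exponential-decay scale

def pdf(x):
    return math.sqrt(2) / (2 * sigma) * math.exp(-math.sqrt(2) * abs(x - mu) / sigma)

half = 30 * b              # tails beyond 30 scale lengths are negligible
m = 120000                 # even, so mu is exactly a grid node
h = 2 * half / m
xs = [mu - half + i * h for i in range(m + 1)]

def integral(g):
    vals = [g(x) for x in xs]
    return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

assert abs(integral(pdf) - 1) < 1e-6
assert abs(integral(lambda x: (x - mu)**2 * pdf(x)) - sigma**2) < 1e-4
```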