Formulas for probability theory and linear models SF2941


These pages (+ Appendix 2 of Gut) are permitted as assistance at the exam. 11 May 2008

Contents: Selected formulae of probability, Bivariate probability, Transforms, Multivariate normal distribution, Convergence, Poisson process, Series Expansions and Integrals.

1 Probability

1.1 Two inequalities

$$A \subseteq B \Rightarrow P(A) \le P(B).$$

$$P(A \cap B) \ge P(A) + P(B) - 1.$$

1.2 Change of variable in a probability density

If $X$ has the probability density $f_X(x)$, $Y = AX + b$, and $A$ is invertible, then $Y$ has the probability density

$$f_Y(y) = \frac{1}{|\det A|} f_X\big(A^{-1}(y - b)\big).$$
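As a quick illustration (not part of the original sheet), the change-of-variable formula can be checked numerically. A minimal Python sketch, assuming numpy and scipy are available; the invertible matrix $A$, the shift $b$ and the test point are arbitrary choices:

```python
import numpy as np
from scipy.stats import multivariate_normal

# X is standard bivariate normal; Y = A X + b with invertible A.
A = np.array([[2.0, 1.0], [0.0, 3.0]])
b = np.array([1.0, -2.0])

def f_Y(y):
    # Change-of-variable formula: f_Y(y) = f_X(A^{-1}(y - b)) / |det A|.
    x = np.linalg.solve(A, y - b)
    return multivariate_normal.pdf(x, mean=np.zeros(2), cov=np.eye(2)) / abs(np.linalg.det(A))

# Sanity check against the known answer Y ~ N(b, A A^T).
y = np.array([0.5, 1.5])
print(f_Y(y), multivariate_normal.pdf(y, mean=b, cov=A @ A.T))  # the two should agree
```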

2 Continuous bivariate distributions

2.1 Bivariate densities

2.1.1 Definitions

The bivariate vector $(X, Y)^T$ has a continuous joint distribution with density $f_{X,Y}(x, y)$ if

$$P(X \le x, Y \le y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f_{X,Y}(u, v)\, du\, dv,$$

where $f_{X,Y}(x, y) \ge 0$ and $\int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} f_{X,Y}(x, y)\, dx\, dy = 1$.

Marginal distributions:

$$f_X(x) = \int_{-\infty}^{+\infty} f_{X,Y}(x, y)\, dy, \qquad f_Y(y) = \int_{-\infty}^{+\infty} f_{X,Y}(x, y)\, dx.$$

Distribution function:

$$F_X(x) = P(X \le x) = \int_{-\infty}^{x} f_X(u)\, du, \qquad P(a < X \le b) = F_X(b) - F_X(a).$$

Conditional densities: $X \mid Y = y$ is defined if $f_Y(y) > 0$, and $Y \mid X = x$ if $f_X(x) > 0$:

$$f_{X \mid Y=y}(x) := \frac{f_{X,Y}(x, y)}{f_Y(y)}, \qquad f_{Y \mid X=x}(y) := \frac{f_{X,Y}(x, y)}{f_X(x)}.$$

Bayes' formula:

$$f_{X \mid Y=y}(x) = \frac{f_{Y \mid X=x}(y)\, f_X(x)}{f_Y(y)}, \qquad f_Y(y) = \int_{-\infty}^{+\infty} f_{Y \mid X=x}(y)\, f_X(x)\, dx.$$

Here $f_{X \mid Y=y}(x)$ is the a posteriori density for $X$ and $f_X(x)$ is the a priori density for $X$.

2.1.2 Independence

$X$ and $Y$ are independent iff $f_{X,Y}(x, y) = f_X(x)\, f_Y(y)$ for every $(x, y)$.

2.1.3 Conditional density of X given an event B

$$f_{X \mid B}(x) = \begin{cases} \dfrac{f_X(x)}{P(B)} & x \in B \\ 0 & \text{elsewhere.} \end{cases}$$

2.1.4 Normal distribution

If $X$ has the density $f(x; \mu, \sigma)$ defined by

$$f_X(x; \mu, \sigma) := \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}},$$

then $X \in N(\mu, \sigma^2)$. $X \in N(0, 1)$ is standard normal with density $\varphi(x) = f_X(x; 0, 1)$. The cumulative distribution function of $X \in N(0, 1)$ is, for $x > 0$,

$$\Phi(x) = \int_{-\infty}^{x} \varphi(t)\, dt = \frac{1}{2} + \int_{0}^{x} \varphi(t)\, dt,$$

and $\Phi(-x) = 1 - \Phi(x)$.

2.1.5 Numerical computation of Φ(x)

Approximate values of the cumulative distribution function of $X \in N(0, 1)$, $\Phi(x)$, can be calculated for $x > 0$ by $\Phi(x) = 1 - Q(x)$, where

$$Q(x) = \int_{x}^{+\infty} \varphi(t)\, dt,$$

and we use the following approximation [1]:

$$Q(x) \approx \frac{1}{\left(1 - \frac{1}{\pi}\right) x + \frac{1}{\pi}\sqrt{x^2 + 2\pi}} \cdot \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}.$$

2.2 Mean and variance

The expectations (or means) $E(X)$, $E(Y)$ are defined (if they exist) by

$$E(X) = \int_{-\infty}^{+\infty} x f_X(x)\, dx, \qquad E(Y) = \int_{-\infty}^{+\infty} y f_Y(y)\, dy,$$

respectively. The variances $\mathrm{Var}(X)$, $\mathrm{Var}(Y)$ are defined as

$$\mathrm{Var}(X) = \int_{-\infty}^{+\infty} (x - E(X))^2 f_X(x)\, dx, \qquad \mathrm{Var}(Y) = \int_{-\infty}^{+\infty} (y - E(Y))^2 f_Y(y)\, dy,$$

respectively. We have $\mathrm{Var}(X) = E(X^2) - (E(X))^2$. A function $g(X)$ of a random variable has expectation

$$E(g(X)) = \int_{-\infty}^{+\infty} g(x) f_X(x)\, dx.$$

[1] P.O. Börjesson and C.E.W. Sundberg: Simple Approximations of the Error Function Q(x) for Communication Applications. IEEE Transactions on Communications, March 1979, pp. 639–643.
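The approximation above is easy to compare with the exact $Q(x)$ expressed through the complementary error function. A minimal Python sketch (the test points are arbitrary):

```python
import math

def Q_exact(x: float) -> float:
    # Q(x) = 1 - Phi(x), via the complementary error function.
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def Q_approx(x: float) -> float:
    # Borjesson-Sundberg approximation from Section 2.1.5.
    phi = math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)
    denom = (1.0 - 1.0 / math.pi) * x + (1.0 / math.pi) * math.sqrt(x * x + 2.0 * math.pi)
    return phi / denom

for x in (0.5, 1.0, 2.0, 3.0, 4.0):
    print(x, Q_exact(x), Q_approx(x))
```

Note that the approximation is exact at $x = 0$, where both sides equal $1/2$.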

2.3 Chebyshev's inequality

$$P(|X - E(X)| > \varepsilon) \le \frac{\mathrm{Var}(X)}{\varepsilon^2}.$$

2.4 Conditional expectations

The conditional expectation of $X$ given $Y = y$ is

$$E(X \mid Y = y) := \int_{-\infty}^{+\infty} x f_{X \mid Y=y}(x)\, dx.$$

The map $y \mapsto E(X \mid Y = y)$ defines $E(X \mid Y)$ as a function of $Y$. We have

$$E(X) = E(E(X \mid Y)),$$

$$\mathrm{Var}(X) = \mathrm{Var}(E(X \mid Y)) + E(\mathrm{Var}(X \mid Y)),$$

$$E\left[(Y - g(X))^2\right] = E[\mathrm{Var}[Y \mid X]] + E\left[(E[Y \mid X] - g(X))^2\right].$$

2.5 Covariance

$$\mathrm{Cov}(X, Y) := E(XY) - E(X) E(Y) = E([X - E(X)][Y - E(Y)]) = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} (x - E(X))(y - E(Y)) f_{X,Y}(x, y)\, dx\, dy.$$

We have

$$\mathrm{Var}\left(\sum_{i=1}^{n} a_i X_i\right) = \sum_{i=1}^{n} a_i^2\, \mathrm{Var}(X_i) + 2 \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} a_i a_j\, \mathrm{Cov}(X_i, X_j),$$

$$\mathrm{Cov}\left(\sum_{i=1}^{n} a_i X_i, \sum_{j=1}^{m} b_j Y_j\right) = \sum_{i=1}^{n} \sum_{j=1}^{m} a_i b_j\, \mathrm{Cov}(X_i, Y_j).$$

2.6 Coefficient of correlation

The coefficient of correlation between $X$ and $Y$ is defined as

$$\rho := \rho_{X,Y} := \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)}\, \sqrt{\mathrm{Var}(Y)}}.$$
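The law of total variance above can be sanity-checked by simulation. A small sketch, assuming numpy is available; the two-component mixture for $(X, Y)$ is an arbitrary choice:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hierarchical example: Y ~ Bernoulli(0.3),
# X | Y=0 ~ N(0, 1), X | Y=1 ~ N(2, 4).
n = 1_000_000
y = rng.random(n) < 0.3
x = np.where(y, rng.normal(2.0, 2.0, n), rng.normal(0.0, 1.0, n))

# Left-hand side: Var(X) from the sample.
lhs = x.var()

# Right-hand side: Var(E[X|Y]) + E[Var(X|Y)], computed exactly.
p = 0.3
mean_given_y = np.array([0.0, 2.0])
var_given_y = np.array([1.0, 4.0])
probs = np.array([1 - p, p])
var_of_cond_mean = probs @ mean_given_y**2 - (probs @ mean_given_y)**2
mean_of_cond_var = probs @ var_given_y
rhs = var_of_cond_mean + mean_of_cond_var
print(lhs, rhs)  # both close to 0.84 + 1.9 = 2.74
```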

3 Best linear prediction

The $\alpha$ and $\beta$ that minimize $E\left[(Y - (\alpha + \beta X))^2\right]$ are given by

$$\alpha = \mu_Y - \frac{\sigma_{XY}}{\sigma_X^2}\, \mu_X = \mu_Y - \rho \frac{\sigma_Y}{\sigma_X}\, \mu_X, \qquad \beta = \frac{\sigma_{XY}}{\sigma_X^2} = \rho \frac{\sigma_Y}{\sigma_X},$$

where $\mu_Y = E[Y]$, $\mu_X = E[X]$, $\sigma_Y^2 = \mathrm{Var}[Y]$, $\sigma_X^2 = \mathrm{Var}[X]$, $\sigma_{XY} = \mathrm{Cov}(X, Y)$ and $\rho = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}$.

4 Covariance matrix

4.1 Definition

The covariance matrix is

$$C_X := E\left[(X - \mu_X)(X - \mu_X)^T\right],$$

where the element $(i, j)$,

$$C_X(i, j) = E\left[(X_i - \mu_i)(X_j - \mu_j)\right],$$

is the covariance between $X_i$ and $X_j$. The covariance matrix is nonnegative definite, i.e., for all $x$ we have $x^T C_X x \ge 0$; hence $\det C_X \ge 0$. The covariance matrix is symmetric: $C_X = C_X^T$.
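With population moments replaced by sample moments, these coefficients coincide with ordinary least squares. A hedged Python sketch (sample size and true coefficients are arbitrary) comparing the moment formulas with numpy's polyfit:

```python
import numpy as np

rng = np.random.default_rng(2)

# Correlated sample: Y depends linearly on X plus noise.
n = 100_000
x = rng.normal(1.0, 2.0, n)
y = 0.5 * x + rng.normal(0.0, 1.0, n)

# Coefficients from the formulas in Section 3 (sample moments).
beta = np.cov(x, y, ddof=0)[0, 1] / x.var()
alpha = y.mean() - beta * x.mean()

# Compare with ordinary least squares on the same data.
beta_ls, alpha_ls = np.polyfit(x, y, 1)
print(alpha, beta)        # ~0.0 and ~0.5
print(alpha_ls, beta_ls)  # should agree
```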

4.2 2 × 2 Covariance Matrix

The covariance matrix of a bivariate random variable $X = (X_1, X_2)^T$ is

$$C_X = \begin{pmatrix} \sigma_1^2 & \rho \sigma_1 \sigma_2 \\ \rho \sigma_1 \sigma_2 & \sigma_2^2 \end{pmatrix},$$

where $\rho$ is the coefficient of correlation of $X_1$ and $X_2$, and $\sigma_1^2 = \mathrm{Var}(X_1)$, $\sigma_2^2 = \mathrm{Var}(X_2)$. $C_X$ is invertible iff $\rho^2 \ne 1$, and then the inverse is

$$C_X^{-1} = \frac{1}{\sigma_1^2 \sigma_2^2 (1 - \rho^2)} \begin{pmatrix} \sigma_2^2 & -\rho \sigma_1 \sigma_2 \\ -\rho \sigma_1 \sigma_2 & \sigma_1^2 \end{pmatrix}.$$

5 Discrete Random Variables

$X$ is a (discrete) random variable that assumes values in $\mathcal{X}$ and $Y$ is a (discrete) random variable that assumes values in $\mathcal{Y}$.

Remark 5.1 These are measurable maps $X(\omega)$, $\omega \in \Omega$, from a basic probability space $(\Omega, \mathcal{F}, P)$ (= outcomes, a sigma field of subsets of $\Omega$ and a probability measure $P$ on $\mathcal{F}$) to $\mathcal{X}$.

$\mathcal{X}$ and $\mathcal{Y}$ are two discrete state spaces, whose generic elements are called values (or instantiations) and denoted by $x_i$ and $y_j$, respectively:

$$\mathcal{X} = \{x_1, \ldots, x_L\}, \qquad \mathcal{Y} = \{y_1, \ldots, y_J\},$$

so that $|\mathcal{X}|$ (= the number of elements in $\mathcal{X}$) $= L$ and $|\mathcal{Y}| = J$. Unless otherwise stated the alphabets considered here are finite.

5.1 Joint Probability Distributions

A two-dimensional joint (simultaneous) probability distribution is a probability defined on $\mathcal{X} \times \mathcal{Y}$:

$$p(x_i, y_j) := P(X = x_i, Y = y_j). \tag{5.1}$$

Hence $0 \le p(x_i, y_j)$ and $\sum_{i=1}^{L} \sum_{j=1}^{J} p(x_i, y_j) = 1$.

Marginal distribution for $X$:

$$p(x_i) = \sum_{j=1}^{J} p(x_i, y_j). \tag{5.2}$$
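The closed-form inverse in 4.2 can be checked in one line against numpy (the parameter values below are arbitrary):

```python
import numpy as np

sigma1, sigma2, rho = 1.5, 0.8, -0.4

C = np.array([[sigma1**2, rho * sigma1 * sigma2],
              [rho * sigma1 * sigma2, sigma2**2]])

# Closed-form inverse from Section 4.2.
C_inv = np.array([[sigma2**2, -rho * sigma1 * sigma2],
                  [-rho * sigma1 * sigma2, sigma1**2]]) / (sigma1**2 * sigma2**2 * (1 - rho**2))

print(np.allclose(C_inv, np.linalg.inv(C)))  # True
```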

Marginal distribution for $Y$:

$$p(y_j) = \sum_{i=1}^{L} p(x_i, y_j). \tag{5.3}$$

These notions can be extended to define the joint (simultaneous) probability distribution and the marginal distributions of $n$ random variables.

5.2 Conditional Probability Distributions

The conditional probability for $X = x_i$ given $Y = y_j$ is

$$p(x_i \mid y_j) := \frac{p(x_i, y_j)}{p(y_j)}. \tag{5.4}$$

The conditional probability for $Y = y_j$ given $X = x_i$ is

$$p(y_j \mid x_i) := \frac{p(x_i, y_j)}{p(x_i)}. \tag{5.5}$$

Here we assume $p(y_j) > 0$ and $p(x_i) > 0$. If for example $p(x_i) = 0$, we can define $p(y_j \mid x_i)$ arbitrarily through $p(x_i)\, p(y_j \mid x_i) = p(x_i, y_j)$. In other words,

$$p(y_j \mid x_i) = \frac{\text{prob. of the event } \{X = x_i, Y = y_j\}}{\text{prob. of the event } \{X = x_i\}}.$$

Hence $\sum_{i=1}^{L} p(x_i \mid y_j) = 1$. Next,

$$P_X(A) := \sum_{x_i \in A} p(x_i) \tag{5.6}$$

is the probability of the event that $X$ assumes a value in $A$, a subset of $\mathcal{X}$. From (5.6) one easily finds the complement rule

$$P_X(A^c) = 1 - P_X(A), \tag{5.7}$$

where $A^c$ is the complement of $A$, i.e., those outcomes which do not lie in $A$. Also

$$P_X(A \cup B) = P_X(A) + P_X(B) - P_X(A \cap B) \tag{5.8}$$

is immediate.

5.3 Conditional Probability Given an Event

The conditional probability for $X = x_i$ given $X \in A$ is denoted by $P_X(x_i \mid A)$ and given by

$$P_X(x_i \mid A) = \begin{cases} \dfrac{P_X(x_i)}{P_X(A)} & \text{if } x_i \in A \\ 0 & \text{otherwise.} \end{cases} \tag{5.9}$$

5.4 Independence

$X$ and $Y$ are independent random variables if and only if

$$p(x_i, y_j) = p(x_i)\, p(y_j) \tag{5.10}$$

for all pairs $(x_i, y_j)$ in $\mathcal{X} \times \mathcal{Y}$. In other words, all events $\{X = x_i\}$ and $\{Y = y_j\}$ are to be independent.

We say that $X_1, X_2, \ldots, X_n$ are independent random variables if and only if the joint distribution

$$p_{X_1, X_2, \ldots, X_n}(x_{i_1}, x_{i_2}, \ldots, x_{i_n}) = P(X_1 = x_{i_1}, X_2 = x_{i_2}, \ldots, X_n = x_{i_n}) \tag{5.11}$$

equals

$$p_{X_1, X_2, \ldots, X_n}(x_{i_1}, x_{i_2}, \ldots, x_{i_n}) = p_{X_1}(x_{i_1})\, p_{X_2}(x_{i_2}) \cdots p_{X_n}(x_{i_n}) \tag{5.12}$$

for every $(x_{i_1}, x_{i_2}, \ldots, x_{i_n}) \in \mathcal{X}^n$. We are here assuming for simplicity that $X_1, X_2, \ldots, X_n$ take values in the same alphabet.

5.5 A Chain Rule

Let $Z$ be a (discrete) random variable that assumes values in $\mathcal{Z} = \{z_k\}_{k=1}^{K}$. If $p(z_k) > 0$,

$$p(x_i, y_j \mid z_k) = \frac{p(x_i, y_j, z_k)}{p(z_k)}.$$

Then we obtain as an identity

$$p(x_i, y_j \mid z_k) = \frac{p(x_i, y_j, z_k)}{p(y_j, z_k)} \cdot \frac{p(y_j, z_k)}{p(z_k)},$$

and again by the definition of conditional probability the right-hand side equals $p(x_i \mid y_j, z_k)\, p(y_j \mid z_k)$. In other words,

$$p_{X,Y \mid Z}(x_i, y_j \mid z_k) = p(x_i \mid y_j, z_k)\, p(y_j \mid z_k). \tag{5.13}$$

5.6 Conditional Independence

The random variables $X$ and $Y$ are called conditionally independent given $Z$ if

$$p(x_i, y_j \mid z_k) = p(x_i \mid z_k)\, p(y_j \mid z_k) \tag{5.14}$$

for all triples $(z_k, x_i, y_j) \in \mathcal{Z} \times \mathcal{X} \times \mathcal{Y}$ (cf. (5.13)).

6 Miscellaneous

6.1 A Marginalization Formula

Let $Y$ be discrete and $X$ be continuous, and let their joint distribution be

$$P(Y = k, X \le x) = \int_{-\infty}^{x} P(Y = k \mid X = u)\, f_X(u)\, du.$$

Then

$$P(Y = k) = \int_{-\infty}^{+\infty} P(Y = k \mid X = x)\, f_X(x)\, dx.$$

6.2 Factorial Moments

For an integer-valued discrete random variable $X$,

$$\mu_{[r]} \stackrel{\text{def}}{=} E\left[X(X-1) \cdots (X-r+1)\right] = \sum_{x \,\text{integer}} x(x-1) \cdots (x-r+1)\, f_X(x)$$

is called the $r$:th factorial moment.

6.3 Binomial Moments

For an integer-valued discrete random variable $X$,

$$E\binom{X}{r} = E\left[X(X-1) \cdots (X-r+1)\right] / r!$$

is called the binomial moment.
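For example, for $X \in \mathrm{Po}(\lambda)$ the $r$:th factorial moment is $\lambda^r$. A small numerical sketch, assuming scipy is available (the truncation of the infinite sum is a convenience, not part of the formula):

```python
import numpy as np
from scipy.stats import poisson

lam = 3.0
k = np.arange(0, 200)        # truncation; the Poisson tail is negligible here
pmf = poisson.pmf(k, lam)

# Second factorial moment E[X(X-1)]; for Po(lam) it equals lam**2.
mu_2 = np.sum(k * (k - 1) * pmf)

# Binomial moment E[C(X,2)] = E[X(X-1)] / 2!.
b_2 = mu_2 / 2.0
print(mu_2, b_2)  # ~9.0 and ~4.5
```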

7 Transforms

7.1 Probability Generating Function

7.1.1 Definition

Let $X$ have values $k = 0, 1, 2, \ldots$. Then

$$g_X(t) = E\left(t^X\right) = \sum_{k=0}^{\infty} t^k f_X(k)$$

is called the probability generating function.

7.1.2 Prob. Gen. Fnct: Properties

$$\frac{d}{dt} g_X(1) = \sum_{k=1}^{\infty} k t^{k-1} f_X(k) \Big|_{t=1} = E[X],$$

$$\frac{1}{n!} \frac{d^n}{dt^n} g_X(0) = P(X = n),$$

$$\mu_{[r]} = E\left[X(X-1) \cdots (X-r+1)\right] = \frac{d^r}{dt^r} g_X(1).$$

7.1.3 Prob. Gen. Fnct: Properties

If $Z = X + Y$, with $X$ and $Y$ non-negative integer-valued and independent, then

$$g_Z(t) = E\left(t^Z\right) = E\left(t^{X+Y}\right) = E\left(t^X\right) E\left(t^Y\right) = g_X(t)\, g_Y(t).$$

7.1.4 Prob. Gen. Fnct: Examples

$$X \in \mathrm{Be}(p): \quad g_X(t) = 1 - p + pt.$$

$$Y \in \mathrm{Bin}(n, p): \quad g_Y(t) = (1 - p + pt)^n.$$

$$Z \in \mathrm{Po}(\lambda): \quad g_Z(t) = e^{\lambda(t - 1)}.$$

7.2 Moment Generating Functions

7.2.1 Definition

The moment generating function is, for some $h > 0$,

$$\psi_X(t) \stackrel{\text{def}}{=} E\left[e^{tX}\right], \quad |t| < h,$$

$$\psi_X(t) = \begin{cases} \sum_{x_i} e^{t x_i} f_X(x_i) & X \text{ discrete} \\ \int_{-\infty}^{+\infty} e^{tx} f_X(x)\, dx & X \text{ continuous.} \end{cases}$$

7.2.2 Moment Gen. Fnctn: Properties

$$\psi_X(0) = 1, \qquad \frac{d}{dt} \psi_X(0) = E[X], \qquad \frac{d^k}{dt^k} \psi_X(0) = E\left[X^k\right].$$

Let $S_n = X_1 + X_2 + \ldots + X_n$ with the $X_i$ independent. Then

$$\psi_{S_n}(t) = E\left(e^{t S_n}\right) = E\left(e^{t(X_1 + X_2 + \ldots + X_n)}\right) = E\left(e^{t X_1} e^{t X_2} \cdots e^{t X_n}\right) = E\left(e^{t X_1}\right) E\left(e^{t X_2}\right) \cdots E\left(e^{t X_n}\right) = \psi_{X_1}(t)\, \psi_{X_2}(t) \cdots \psi_{X_n}(t).$$

If the $X_i$ are i.i.d., then $\psi_{S_n}(t) = (\psi_X(t))^n$.
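As a sanity check of the properties in 7.1.2, the Poisson generating function from 7.1.4 can be differentiated symbolically. A minimal sketch, assuming sympy is available:

```python
import sympy as sp

t, lam = sp.symbols("t lambda", positive=True)

# PGF of Po(lambda), from Section 7.1.4: g(t) = exp(lambda*(t - 1)).
g = sp.exp(lam * (t - 1))

# E[X] = g'(1) and the r:th factorial moment = g^(r)(1).
print(sp.diff(g, t).subs(t, 1))     # lambda
print(sp.diff(g, t, 3).subs(t, 1))  # lambda**3 = E[X(X-1)(X-2)]

# P(X = n) = g^(n)(0)/n!, here for n = 2:
print(sp.simplify(sp.diff(g, t, 2).subs(t, 0) / sp.factorial(2)))  # lambda**2*exp(-lambda)/2
```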

7.2.3 Moment Gen. Fnctn: Examples

$$X \in N(\mu, \sigma^2): \quad \psi_X(t) = e^{\mu t + \frac{1}{2} \sigma^2 t^2}.$$

$$Y \in \mathrm{Exp}(a): \quad \psi_Y(t) = \frac{1}{1 - at}, \quad at < 1.$$

7.2.4 Characteristic function

$$\varphi_X(t) = E\left[e^{itX}\right].$$

The characteristic function exists for all $t$.

7.3 Moment generating function, characteristic function of a vector random variable

The moment generating function of $X$ ($n \times 1$ vector) is defined as

$$\psi_X(t) \stackrel{\text{def}}{=} E\, e^{t^T X} = E\, e^{t_1 X_1 + t_2 X_2 + \cdots + t_n X_n}.$$

The characteristic function of $X$ is defined as

$$\varphi_X(t) \stackrel{\text{def}}{=} E\, e^{i t^T X} = E\, e^{i(t_1 X_1 + t_2 X_2 + \cdots + t_n X_n)}.$$

8 Multivariate normal distribution

An $n \times 1$ random vector $X$ has a normal distribution iff for every $n \times 1$ vector $a$ the one-dimensional random vector $a^T X$ has a normal distribution. When the vector $X = (X_1, X_2, \ldots, X_n)^T$ has a multivariate normal distribution we write

$$X \in N(\mu, C). \tag{8.1}$$

The moment generating function is

$$\psi_X(s) = e^{s^T \mu + \frac{1}{2} s^T C s}. \tag{8.2}$$

The characteristic function of $X \in N(\mu, C)$ is

$$\varphi_X(t) = E\, e^{i t^T X} = e^{i t^T \mu - \frac{1}{2} t^T C t}. \tag{8.3}$$

Let $X \in N(\mu, C)$ and

$$Y = a + BX$$

for an arbitrary $m \times n$ matrix $B$ and an arbitrary $m \times 1$ vector $a$. Then

$$Y \in N(a + B\mu,\; B C B^T). \tag{8.4}$$

8.1 χ²-distribution for quadratic form of normal r.v.

Let $X \in N(\mu, C)$ with $C$ invertible. Then

$$(X - \mu)^T C^{-1} (X - \mu) \in \chi^2(n),$$

where $n$ is the dimension of $X$.

8.2 Four product rule

Let $(X_1, X_2, X_3, X_4)^T \in N(0, C)$. Then

$$E[X_1 X_2 X_3 X_4] = E[X_1 X_2] E[X_3 X_4] + E[X_1 X_3] E[X_2 X_4] + E[X_1 X_4] E[X_2 X_3].$$

8.3 Conditional distributions for bivariate normal random variables

Let $(X, Y)^T \in N(\mu, C)$. The conditional distribution of $Y$ given $X = x$ is normal,

$$N\left(\mu_Y + \rho \frac{\sigma_Y}{\sigma_X}(x - \mu_X),\; \sigma_Y^2 (1 - \rho^2)\right),$$

where $\mu_Y = E(Y)$, $\mu_X = E(X)$, $\sigma_Y = \sqrt{\mathrm{Var}(Y)}$, $\sigma_X = \sqrt{\mathrm{Var}(X)}$ and

$$\rho = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\, \mathrm{Var}(Y)}}.$$

Let $Z_1$ and $Z_2$ be independent $N(0, 1)$, and set

$$\mu = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}, \qquad B = \begin{pmatrix} \sigma_1 & 0 \\ \rho \sigma_2 & \sigma_2 \sqrt{1 - \rho^2} \end{pmatrix}.$$

If

$$\begin{pmatrix} X \\ Y \end{pmatrix} = \mu + B \begin{pmatrix} Z_1 \\ Z_2 \end{pmatrix},$$

then $(X, Y)^T \in N(\mu, C)$ with

$$C = \begin{pmatrix} \sigma_1^2 & \rho \sigma_1 \sigma_2 \\ \rho \sigma_1 \sigma_2 & \sigma_2^2 \end{pmatrix}.$$
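The linear-transformation rule (8.4) can be verified by simulation. A minimal sketch with numpy; the choices of $\mu$, $C$, $B$ and $a$ below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(3)

# X ~ N(mu, C); by (8.4), Y = a + B X should be N(a + B mu, B C B^T).
mu = np.array([1.0, -1.0, 0.5])
C = np.array([[2.0, 0.5, 0.0],
              [0.5, 1.0, 0.3],
              [0.0, 0.3, 1.5]])
B = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, -1.0]])
a = np.array([3.0, -2.0])

X = rng.multivariate_normal(mu, C, size=500_000)
Y = a + X @ B.T

print(Y.mean(axis=0), a + B @ mu)           # should agree
print(np.cov(Y.T), B @ C @ B.T, sep="\n")   # should agree
```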

9 Poisson process

$X(t)$ = the number of occurrences of some event in $(0, t]$.

Definition 9.1 $\{X(t);\ t \ge 0\}$ is a Poisson process with parameter $\lambda > 0$ if

1) $X(0) = 0$.

2) The increments $X(t_k) - X(t_{k-1})$ are independent stochastic variables, $1 \le k \le n$, $0 \le t_0 \le t_1 \le t_2 \le \ldots \le t_{n-1} \le t_n$, for all $n$.

3) $X(t) - X(s) \in \mathrm{Po}(\lambda(t - s))$, $0 \le s < t$.

Let $T_k$ be the time of occurrence of the $k$th event, with $T_0 = 0$. We have

$$\{T_k \le t\} = \{X(t) \ge k\}.$$

$\tau_k = T_k - T_{k-1}$ is the $k$th interoccurrence time. The times $\tau_1, \tau_2, \ldots, \tau_k, \ldots$ are independent and identically distributed, $\tau_i \in \mathrm{Exp}\left(\tfrac{1}{\lambda}\right)$.
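The interoccurrence-time characterization gives a direct way to simulate the process. A small sketch, assuming numpy is available (the parameters are arbitrary), checking that $X(t)$ has mean and variance approximately $\lambda t$ as required by property 3):

```python
import numpy as np

rng = np.random.default_rng(5)

# Simulate a Poisson process on (0, t] via i.i.d. Exp(1/lam) interoccurrence times.
lam, t, reps = 2.0, 5.0, 20_000
counts = np.empty(reps, dtype=int)
for r in range(reps):
    # Draw comfortably more gaps than needed, then count arrivals <= t.
    arrivals = np.cumsum(rng.exponential(1.0 / lam, size=int(lam * t * 4) + 50))
    counts[r] = np.searchsorted(arrivals, t, side="right")

# X(t) should be Po(lam * t): mean and variance both ~ lam * t = 10.
print(counts.mean(), counts.var())
```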

10 Convergence

10.1 Definitions

We say that $X_n \xrightarrow{P} X$ (convergence in probability), as $n \to \infty$, if for all $\varepsilon > 0$

$$P(|X_n - X| > \varepsilon) \to 0, \text{ as } n \to \infty.$$

We say that $X_n \xrightarrow{q} X$ (convergence in quadratic mean) if

$$E|X_n - X|^2 \to 0, \text{ as } n \to \infty.$$

We say that $X_n \xrightarrow{d} X$ (convergence in distribution), as $n \to \infty$, if

$$F_{X_n}(x) \to F_X(x), \text{ as } n \to \infty,$$

for all $x$ where $F_X(x)$ is continuous.

10.2 Relations between convergences

As $n \to \infty$,

$$X_n \xrightarrow{q} X \;\Rightarrow\; X_n \xrightarrow{P} X, \qquad X_n \xrightarrow{P} X \;\Rightarrow\; X_n \xrightarrow{d} X,$$

and, if $c$ is a constant,

$$X_n \xrightarrow{P} c \;\Leftrightarrow\; X_n \xrightarrow{d} c.$$

If $\varphi_{X_n}(t)$ are the characteristic functions of $X_n$, then

$$X_n \xrightarrow{d} X \;\Rightarrow\; \varphi_{X_n}(t) \to \varphi_X(t).$$

If $\varphi_X(t)$ is a characteristic function continuous at $t = 0$, then

$$\varphi_{X_n}(t) \to \varphi_X(t) \;\Rightarrow\; X_n \xrightarrow{d} X.$$

10.3 Law of Large Numbers

Let $X_1, X_2, \ldots$ be independent, identically distributed (i.i.d.) random variables with finite expectation $\mu$, and set $S_n = X_1 + X_2 + \ldots + X_n$, $n \ge 1$. Then

$$\frac{S_n}{n} \xrightarrow{P} \mu, \text{ as } n \to \infty.$$

10.4 Central Limit Theorem

Let $X_1, X_2, \ldots$ be independent, identically distributed (i.i.d.) random variables with finite expectation $\mu$ and finite variance $\sigma^2$, and set $S_n = X_1 + X_2 + \ldots + X_n$, $n \ge 1$. Then

$$\frac{S_n - n\mu}{\sigma \sqrt{n}} \xrightarrow{d} N(0, 1), \text{ as } n \to \infty.$$
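The central limit theorem can be illustrated by simulation. A minimal sketch, assuming numpy and scipy are available; the summands are Exp(1) variables, so $\mu = \sigma = 1$:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(4)

# Standardized sums of n i.i.d. Exp(1) variables.
n, reps = 200, 100_000
s = rng.exponential(1.0, size=(reps, n)).sum(axis=1)
z = (s - n * 1.0) / (1.0 * np.sqrt(n))

# Compare empirical tail probabilities with the N(0,1) limit.
for x in (0.0, 1.0, 2.0):
    print(x, (z > x).mean(), norm.sf(x))
```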

11 Series Expansions and Integrals

11.1 Exponential Function

$$e^x = \sum_{k=0}^{\infty} \frac{x^k}{k!}, \quad -\infty < x < \infty.$$

$$c_n \to c \;\Rightarrow\; \left(1 + \frac{c_n}{n}\right)^n \to e^c.$$

11.2 Geometric Series

$$\frac{1}{1 - x} = \sum_{k=0}^{\infty} x^k, \quad |x| < 1.$$

$$\frac{1}{(1 - x)^2} = \sum_{k=0}^{\infty} k x^{k-1}, \quad |x| < 1.$$

$$\sum_{k=0}^{n} x^k = \frac{1 - x^{n+1}}{1 - x}, \quad x \ne 1.$$

11.3 Logarithm function

$$\ln(1 - x) = -\sum_{k=1}^{\infty} \frac{x^k}{k}, \quad -1 \le x < 1.$$

11.4 Euler Gamma Function

$$\Gamma(t) = \int_{0}^{\infty} x^{t-1} e^{-x}\, dx, \quad t > 0,$$

$$\Gamma\left(\frac{1}{2}\right) = \sqrt{\pi}, \qquad \Gamma(n) = (n - 1)! \text{ for } n \text{ a positive integer},$$

$$\int_{0}^{\infty} x^t e^{-\lambda x}\, dx = \frac{\Gamma(t + 1)}{\lambda^{t+1}}, \quad \lambda > 0,\; t > -1.$$

11.5 A formula (with a probabilistic proof)

$$\int_{t}^{\infty} \frac{1}{\Gamma(k)} \lambda^k x^{k-1} e^{-\lambda x}\, dx = \sum_{j=0}^{k-1} e^{-\lambda t} \frac{(\lambda t)^j}{j!}.$$
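Formula 11.5 says that the tail probability of a Gamma$(k, \lambda)$ variable equals a Poisson cdf (cf. $\{T_k > t\} = \{X(t) \le k - 1\}$ for the Poisson process). A short numerical check, assuming scipy is available; the parameter values are arbitrary:

```python
from scipy.stats import gamma, poisson

# Identity 11.5: P(Gamma(k, rate lam) > t) = P(Po(lam*t) <= k-1).
lam, k, t = 2.5, 4, 1.7
lhs = gamma.sf(t, a=k, scale=1.0 / lam)    # integral of the Gamma(k, lam) density over (t, inf)
rhs = poisson.cdf(k - 1, lam * t)          # sum_{j=0}^{k-1} e^{-lam t} (lam t)^j / j!
print(lhs, rhs)  # should agree
```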