MIT Spring 2016
|
|
- Eleanore Turner
- 6 years ago
- Views:
Transcription
1 Exponential Families II MIT Dr. Kempthorne Spring
2 Outline Exponential Families II 1 Exponential Families II 2
3 : Expectation and Variance U (k 1) and V (l 1) are random vectors If A (m k), B (m l) are nonrandom, and then E (AU + BV) = AE (U) + BE (V) If U = c with probability 1 E (U) = c. For a random vector U, if E ( U 2 ) = ) k E (U 2 ) < define the variance of U by Var(U) For A (m k) as above: Var(AU) = AVar(U)A T For c (k 1) a constant vector Var(U + c) = Var(U) For a (k 1) a constant vector, ) k i=1 i = E (U E (U))(U E (U)) T = Cov(U i, U j ) (k k) (m m) 3 Var(a T U) = Var( j=1 a j U j ) = a T Var(U)a = i,j a i a j Cov(U i, U j )
4 : Expectation and Variance Proposition B.5.1 If E [ U 2 ] < then Var(U) is positive definite if and only if P[a T U + b = 0] < 1, for every a = 0, and b R. Proof. Var(U) is not positive definite iff A T Var(Y)a = 0 for some a = 0 which is equivalent to Var(aT U) = 0. 4
5 : Covariance Definition: For random vectors U (k 1) and V (l 1) define the Covariance of U (k 1) [ and V (l 1) by Cov(U, V) = E (U E (U))(V E (V)) T ] (k l) (must assume: E U 2 < and E V 2 < ) If U and V are independent Cov(U, V) = 0. For nonrandom A, a, B, b, Cov(AU + a, BV + b) = ACov(U, V)B T If U and W are random (k 1) vectors, then Var(U + W) = Var(U) + Cov(U, W)+ Cov(W, U) + Var(W) and if U and W are independent Var(U + W) = Var(U) + Var(W) 5
6 : Moment Generating Functions Moment-Generating Function of a Random Vector Let T = (T 1, T 2,..., T k ) T be a (k 1) random vector. For s = (s 1, s 2,..., s k ) T R k, define M(s) E [e st T ] M(s) is the moment-generating function (mgf) of T The mgf may not exist for a given T. If it does exist, it is defined for s in some ball centered at s = 0. Define the characteristic function (cf) of T: is φ(s) = E [e T T ] = E [cos(s T T)] + ie [sin(s T T)] The cf always exists. 6
7 : Moment Generating Functions Theorem B.5.1 Let S = {s : M(s) < }. Then S is convex. If S has a nonempty interior S 0, (contains a sphere S(0, c), c > 0), then M is analytic on S 0. If S 0 =, and E [ T p ] < for all p, then if i 1 + i i k = p, p M(s) T i i s=0 = E [T i 1 i 1 i k 1 k ] s 1 s k M (s = 0) = E(T j ) = E [T] s j 2 M (s = 0) = E(T i T j ) = E[TT T ] s i s j If S 0 is nonempty, then M(s) determines the distribution of U uniquely. 7
8 : Moment Generating Functions Definition: The Cumulant Generating Function of the random vector T with mgf M T (s) is K (s) = K T (s) = log M T (s). p i 1 i i K (s) s=0 1 k c i1,,i k = c i1,,i k (T) = s s In the bivariate case (k = 2) where µ = E [T], and τ i,j = E [(T 1 µ 1 ) i (T 2 µ 2 ) j ] c 10 = µ 1 c 01 = µ 2 c 2,0 = τ 2,0 = var(t 1 ) c 0,2 = τ 0,2 = var(t 2 ) c 1,1 = τ 1,1 = cov(t 1, T 2 ) c 3,0 = τ 3,0 = E [(T 1 µ 1 ) 3 ] c 0,3 = τ 0,3 = E [(T 2 µ 2 ) 3 ] c 4,0 = τ 4,0 3τ 2 2,0 c 0,4 = τ 0,4 3τ 2 0,2 8
9 Sums of Independent If U and V are independent (k 1) random vectors, then M U+V (s) = M U (s) M V (s) K U+V (s) = K U (s) + K V (s) 9
10 Multivariate Normal Distributions Definition B.6.1: A random vector U (k 1) has a k-variate normal distribution iff U can be written as U = µ + AZ where µ, A are constant and Z = (Z 1,..., Z k ) T ; Z i iid N(0, 1). Definition B.6.2: A random vector U (k 1) has a k-variate normal distribution iff for every (k 1) nonrandom a: k a T U = a i U i has a univariate normal distribution i=1 The moment generating n function of U is M T 1 U (s) = exp s µ + 2 s T Σs where µ = E [U], and Σ = Cov(U) = AA T. 10
11 Outline Exponential Families II 1 Exponential Families II 11
12 12
13 Theorem Let P be a canonical k-parameter exponential family generated by (T, h), with corresponding natural parameter space E and function A(η). Then E is convex A : E R is convex If E has nonempty interior E 0 R k, and η 0 E 0, then T(X ) has under η 0 a mgf given by M(s) = exp {A(η 0 + s) A(η 0 )} valid for all s such that η 0 + s E. (this set of s includes a ball about η 0 ) Corollary Under the conditions of the theorem E η0 [T(X )] = A(η 0 ) Var η0 [T(X )] = A(η 0 ) A 2 A where A(η 0 ) = (η 0 ) η and A(η 0 ) = (η 0 ) j η i η j 13
14 Example: Multinomial Distribution Multinomial Distribution X = (X 1, X 2,..., X q ) Multinomial(n, θ = (θ 1, θ 2,..., θ q )) n θ x 1 θ x 2 x p(x θ) = q θq where x 1! x q! 1 2 q is a given positive integer, ) q θ = (θ 1,..., θ q ) : 1 θ j = 1. n is a given positive integer ) q 1 X i = n. 14
15 Example: Multinomial Distribution n θ x1 θ x2 x p(x θ) = q x θq 1! x q! 1 2 n = x 1! x q! exp{log(θ 1 )x log(θ q 1 )x q 1 ) q 1 ) q 1 +log(1 1 θ j )[n 1 x j ]} ) q 1 = h(x)exp{ j=1 η j (θ)t j (x) B(θ)} ) q 1 = h(x)exp{ j=1 η j T j (x) A(η)} where: n h(x) = x1! x q! η(θ) = (η 1 (θ), η 2 (θ),..., η q 1 (θ)) ) q 1 η j (θ) = log(θ j /(1 1 θ j )), j = 1,..., q 1 T (x) = (X 1, X 2,..., X q 1 ) = (T 1 (x), T 2 (x),..., T q 1 (x)). ) q 1 ) q 1 B(θ) = nlog(1 θ j ) and A(η) = +nlog(1 + e η j ) j=1 j=1 q 1 e η j θ j /(1 θ k ) 1+ j=1 e 1+ 1 θ k /(1 1 θ k ) A(η) j = n q 1 ηj = n q 1 1 q 1 = nθ j A(η) i,j = nθ i θ j, (i = j) and A(η) i,i = nθ i (1 θ i ), 15
16 Rank of Exponential Family Defining the Rank of an Exponential Family Every k-parameter exponential family is also a k -parameter exponential family for any k > k. The minimal value of k defines the rank of the exponential family. Define minimal k as the rank when the generating statistic T(X ) is k-dimentional, and the collection {1, T 1 (X ), T 2 (X ),..., T k (X )} are linearly ) independent with positive probability, i.e., P[ k j=1 a j T j (X ) = a k+1 η] < 1, unless all a j = 0. Note: the set of positive support on X does not depend on η. 16
17 Rank of Exponential Family Theorem Let P = {q(x η), η E} be a canonical exponential family generated by (T(X ), h(x )) with natural parameter space E such that E is open. Then the following statements are equivalent P is of rank k. η is an identifiable parameter. Var(T η) is positive defnite η A(η) is 1-to-1 on E. A(η) is strictly convex on E. Note: E open = A defined on all E. Corollary If P is of rank k under Theorem 1.6.4, then P may be uniquely parametrized by µ(η) E [T(X ) η]. log[q(x, η)] is strictly concave in η on E. 17
18 p-variate Gaussian Family Let Y be a (p 1) random vector with a p-variate Gaussian distribution Y N k (µ, Σ) where µ = E [Y] and Σ = Var(Y) is positive definite, rank p. The density of Y is p(y, µ, Σ) = det(σ) 1 2 (2π) p 2 exp{ 1 2 (Y µ) T Σ 1 (Y µ)} Taking logs: log[p(y, µ, Σ)] = 1 Y T 2 Σ 1 Y + [Σ 1 µ] T Y 1 µ T Σ 1 µ 1 log[det(σ)] p log[2π]] Defining ) Σ 1 = σ i,j ), we can write ) the ) first 2 terms as 1 σ i,i Y 2 p p [ i<j σ i,j Y σ i,j i Y j + 2 i i ] + i=1[ j=1 µ j ]Y i The parameter space dimension is k = p + p(p + 1)/2 = p(p + 3)/2 The sufficient statistics are [(Y 1,..., Y p ), {Y i Y j, 1 i j p}], 18
19 h(y ) 1 θ = (µ, Σ) T Σ 1 B(θ) = 1 log[ det(σ) ] + µ µ 2 For a sample Y 1,..., Y n of iid N p (µσ) r.vectors, the data X = (Y 1, Y 2,..., Y n ) follows the k = ) p(p + 3)/2 parameter ) exponential family with T = ( i Y i, LowerTriangle( i Y i Y T i )) (LowerTriangle( ) refers to matrix elements along and below the diagonal) 19
20 Conjugate Families of Prior Distributions 20 Let X 1,..., X n be a sample from the k-parameter exponential family n ) k ) n p(x θ) = [ i=1 h(x i )]exp{ j=1 η j (θ) i=1 T j (x i ) nb(θ)} where θ is k-dimensional. Treat θ as the variable of interest in p(x θ) Treat n and T j as parameters in p(x θ) Find a normalizing function: f )k ω(t) = exp j=1 t j η j (θ) t k+1 B(θ) dθ 1 dθ k and set Ω = {(t 1,..., t k+1 ) : 0 < ω(t 1,..., t k+1 ) < } Proposition The (k + 1)-parameter exponential family given by f )k π t (θ) = exp j=1 t j η j (θ) t k+1 B(θ) log[ω(t)] where t = (t 1,..., t k+1 ) Ω, is a conjugate prior to p(x θ). MIT Exponential Families
21 MIT OpenCourseWare Mathematical Statistics Spring 2016 For information about citing these materials or our Terms of Use, visit:
MIT Spring 2016
Dr. Kempthorne Spring 2016 1 Outline Building 1 Building 2 Definition Building Let X be a random variable/vector with sample space X R q and probability model P θ. The class of probability models P = {P
More informationProbability Lecture III (August, 2006)
robability Lecture III (August, 2006) 1 Some roperties of Random Vectors and Matrices We generalize univariate notions in this section. Definition 1 Let U = U ij k l, a matrix of random variables. Suppose
More information1. Fisher Information
1. Fisher Information Let f(x θ) be a density function with the property that log f(x θ) is differentiable in θ throughout the open p-dimensional parameter set Θ R p ; then the score statistic (or score
More informationChapter 2: Fundamentals of Statistics Lecture 15: Models and statistics
Chapter 2: Fundamentals of Statistics Lecture 15: Models and statistics Data from one or a series of random experiments are collected. Planning experiments and collecting data (not discussed here). Analysis:
More informationMIT Spring 2016
Generalized Linear Models MIT 18.655 Dr. Kempthorne Spring 2016 1 Outline Generalized Linear Models 1 Generalized Linear Models 2 Generalized Linear Model Data: (y i, x i ), i = 1,..., n where y i : response
More informationThe Gauss-Markov Model. Copyright c 2012 Dan Nettleton (Iowa State University) Statistics / 61
The Gauss-Markov Model Copyright c 2012 Dan Nettleton (Iowa State University) Statistics 611 1 / 61 Recall that Cov(u, v) = E((u E(u))(v E(v))) = E(uv) E(u)E(v) Var(u) = Cov(u, u) = E(u E(u)) 2 = E(u 2
More informationMIT Spring 2016
MIT 18.655 Dr. Kempthorne Spring 2016 1 MIT 18.655 Outline 1 2 MIT 18.655 3 Decision Problem: Basic Components P = {P θ : θ Θ} : parametric model. Θ = {θ}: Parameter space. A{a} : Action space. L(θ, a)
More informationMIT Spring 2016
MIT 18.655 Dr. Kempthorne Spring 2016 1 MIT 18.655 Outline 1 2 MIT 18.655 Decision Problem: Basic Components P = {P θ : θ Θ} : parametric model. Θ = {θ}: Parameter space. A{a} : Action space. L(θ, a) :
More informationExponential Families
Exponential Families David M. Blei 1 Introduction We discuss the exponential family, a very flexible family of distributions. Most distributions that you have heard of are in the exponential family. Bernoulli,
More informationLecture 1: August 28
36-705: Intermediate Statistics Fall 2017 Lecturer: Siva Balakrishnan Lecture 1: August 28 Our broad goal for the first few lectures is to try to understand the behaviour of sums of independent random
More informationIntroduction: exponential family, conjugacy, and sufficiency (9/2/13)
STA56: Probabilistic machine learning Introduction: exponential family, conjugacy, and sufficiency 9/2/3 Lecturer: Barbara Engelhardt Scribes: Melissa Dalis, Abhinandan Nath, Abhishek Dubey, Xin Zhou Review
More informationLecture 17: Minimal sufficiency
Lecture 17: Minimal sufficiency Maximal reduction without loss of information There are many sufficient statistics for a given family P. In fact, X (the whole data set) is sufficient. If T is a sufficient
More informationST5215: Advanced Statistical Theory
Department of Statistics & Applied Probability Monday, September 26, 2011 Lecture 10: Exponential families and Sufficient statistics Exponential Families Exponential families are important parametric families
More informationMaximum likelihood estimation
Maximum likelihood estimation Guillaume Obozinski Ecole des Ponts - ParisTech Master MVA Maximum likelihood estimation 1/26 Outline 1 Statistical concepts 2 A short review of convex analysis and optimization
More informationLECTURE 11: EXPONENTIAL FAMILY AND GENERALIZED LINEAR MODELS
LECTURE : EXPONENTIAL FAMILY AND GENERALIZED LINEAR MODELS HANI GOODARZI AND SINA JAFARPOUR. EXPONENTIAL FAMILY. Exponential family comprises a set of flexible distribution ranging both continuous and
More informationStat 710: Mathematical Statistics Lecture 12
Stat 710: Mathematical Statistics Lecture 12 Jun Shao Department of Statistics University of Wisconsin Madison, WI 53706, USA Jun Shao (UW-Madison) Stat 710, Lecture 12 Feb 18, 2009 1 / 11 Lecture 12:
More information4. CONTINUOUS RANDOM VARIABLES
IA Probability Lent Term 4 CONTINUOUS RANDOM VARIABLES 4 Introduction Up to now we have restricted consideration to sample spaces Ω which are finite, or countable; we will now relax that assumption We
More informationCS Lecture 19. Exponential Families & Expectation Propagation
CS 6347 Lecture 19 Exponential Families & Expectation Propagation Discrete State Spaces We have been focusing on the case of MRFs over discrete state spaces Probability distributions over discrete spaces
More informationPatterns of Scalable Bayesian Inference Background (Session 1)
Patterns of Scalable Bayesian Inference Background (Session 1) Jerónimo Arenas-García Universidad Carlos III de Madrid jeronimo.arenas@gmail.com June 14, 2017 1 / 15 Motivation. Bayesian Learning principles
More informationLecture 11. Multivariate Normal theory
10. Lecture 11. Multivariate Normal theory Lecture 11. Multivariate Normal theory 1 (1 1) 11. Multivariate Normal theory 11.1. Properties of means and covariances of vectors Properties of means and covariances
More informationFundamentals of Statistics
Chapter 2 Fundamentals of Statistics This chapter discusses some fundamental concepts of mathematical statistics. These concepts are essential for the material in later chapters. 2.1 Populations, Samples,
More informationMIT Spring 2015
Regression Analysis MIT 18.472 Dr. Kempthorne Spring 2015 1 Outline Regression Analysis 1 Regression Analysis 2 Multiple Linear Regression: Setup Data Set n cases i = 1, 2,..., n 1 Response (dependent)
More informationMachine learning - HT Maximum Likelihood
Machine learning - HT 2016 3. Maximum Likelihood Varun Kanade University of Oxford January 27, 2016 Outline Probabilistic Framework Formulate linear regression in the language of probability Introduce
More informationMultiple Random Variables
Multiple Random Variables This Version: July 30, 2015 Multiple Random Variables 2 Now we consider models with more than one r.v. These are called multivariate models For instance: height and weight An
More informationECE 275B Homework # 1 Solutions Winter 2018
ECE 275B Homework # 1 Solutions Winter 2018 1. (a) Because x i are assumed to be independent realizations of a continuous random variable, it is almost surely (a.s.) 1 the case that x 1 < x 2 < < x n Thus,
More informationECE 275B Homework # 1 Solutions Version Winter 2015
ECE 275B Homework # 1 Solutions Version Winter 2015 1. (a) Because x i are assumed to be independent realizations of a continuous random variable, it is almost surely (a.s.) 1 the case that x 1 < x 2
More informationA Probability Review
A Probability Review Outline: A probability review Shorthand notation: RV stands for random variable EE 527, Detection and Estimation Theory, # 0b 1 A Probability Review Reading: Go over handouts 2 5 in
More informationPart IA Probability. Theorems. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015
Part IA Probability Theorems Based on lectures by R. Weber Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly) after lectures.
More informationFundamentals. CS 281A: Statistical Learning Theory. Yangqing Jia. August, Based on tutorial slides by Lester Mackey and Ariel Kleiner
Fundamentals CS 281A: Statistical Learning Theory Yangqing Jia Based on tutorial slides by Lester Mackey and Ariel Kleiner August, 2011 Outline 1 Probability 2 Statistics 3 Linear Algebra 4 Optimization
More informationNonparametric Estimation of the Dependence Function for a Multivariate Extreme Value Distribution
Nonparametric Estimation of the Dependence Function for a Multivariate Extreme Value Distribution p. /2 Nonparametric Estimation of the Dependence Function for a Multivariate Extreme Value Distribution
More informationGeneralized Linear Models and Exponential Families
Generalized Linear Models and Exponential Families David M. Blei COS424 Princeton University April 12, 2012 Generalized Linear Models x n y n β Linear regression and logistic regression are both linear
More informationSpring 2012 Math 541A Exam 1. X i, S 2 = 1 n. n 1. X i I(X i < c), T n =
Spring 2012 Math 541A Exam 1 1. (a) Let Z i be independent N(0, 1), i = 1, 2,, n. Are Z = 1 n n Z i and S 2 Z = 1 n 1 n (Z i Z) 2 independent? Prove your claim. (b) Let X 1, X 2,, X n be independent identically
More informationHierarchical Modeling for Univariate Spatial Data
Hierarchical Modeling for Univariate Spatial Data Geography 890, Hierarchical Bayesian Models for Environmental Spatial Data Analysis February 15, 2011 1 Spatial Domain 2 Geography 890 Spatial Domain This
More informationMultivariate Distributions
IEOR E4602: Quantitative Risk Management Spring 2016 c 2016 by Martin Haugh Multivariate Distributions We will study multivariate distributions in these notes, focusing 1 in particular on multivariate
More informationChapter 3. Exponential Families. 3.1 Regular Exponential Families
Chapter 3 Exponential Families 3.1 Regular Exponential Families The theory of exponential families will be used in the following chapters to study some of the most important topics in statistical inference
More informationGaussian Models (9/9/13)
STA561: Probabilistic machine learning Gaussian Models (9/9/13) Lecturer: Barbara Engelhardt Scribes: Xi He, Jiangwei Pan, Ali Razeen, Animesh Srivastava 1 Multivariate Normal Distribution The multivariate
More informationMarch 10, 2017 THE EXPONENTIAL CLASS OF DISTRIBUTIONS
March 10, 2017 THE EXPONENTIAL CLASS OF DISTRIBUTIONS Abstract. We will introduce a class of distributions that will contain many of the discrete and continuous we are familiar with. This class will help
More informationLECTURE 2 NOTES. 1. Minimal sufficient statistics.
LECTURE 2 NOTES 1. Minimal sufficient statistics. Definition 1.1 (minimal sufficient statistic). A statistic t := φ(x) is a minimial sufficient statistic for the parametric model F if 1. it is sufficient.
More information40.530: Statistics. Professor Chen Zehua. Singapore University of Design and Technology
Singapore University of Design and Technology Lecture 9: Hypothesis testing, uniformly most powerful tests. The Neyman-Pearson framework Let P be the family of distributions of concern. The Neyman-Pearson
More informationSTAT215: Solutions for Homework 1
STAT25: Solutions for Homework Due: Wednesday, Jan 30. (0 pt) For X Be(α, β), Evaluate E[X a ( X) b ] for all real numbers a and b. For which a, b is it finite? (b) What is the MGF M log X (t) for the
More informationSTAT 830 The Multivariate Normal Distribution
STAT 830 The Multivariate Normal Distribution Richard Lockhart Simon Fraser University STAT 830 Fall 2013 Richard Lockhart (Simon Fraser University)STAT 830 The Multivariate Normal Distribution STAT 830
More informationMaximum Likelihood Large Sample Theory
Maximum Likelihood Large Sample Theory MIT 18.443 Dr. Kempthorne Spring 2015 1 Outline 1 Large Sample Theory of Maximum Likelihood Estimates 2 Asymptotic Results: Overview Asymptotic Framework Data Model
More informationNotes, March 4, 2013, R. Dudley Maximum likelihood estimation: actual or supposed
18.466 Notes, March 4, 2013, R. Dudley Maximum likelihood estimation: actual or supposed 1. MLEs in exponential families Let f(x,θ) for x X and θ Θ be a likelihood function, that is, for present purposes,
More informationStochastic Processes: I. consider bowl of worms model for oscilloscope experiment:
Stochastic Processes: I consider bowl of worms model for oscilloscope experiment: SAPAscope 2.0 / 0 1 RESET SAPA2e 22, 23 II 1 stochastic process is: Stochastic Processes: II informally: bowl + drawing
More informationJoint p.d.f. and Independent Random Variables
1 Joint p.d.f. and Independent Random Variables Let X and Y be two discrete r.v. s and let R be the corresponding space of X and Y. The joint p.d.f. of X = x and Y = y, denoted by f(x, y) = P(X = x, Y
More informationdiscrete random variable: probability mass function continuous random variable: probability density function
CHAPTER 1 DISTRIBUTION THEORY 1 Basic Concepts Random Variables discrete random variable: probability mass function continuous random variable: probability density function CHAPTER 1 DISTRIBUTION THEORY
More informationLinear model A linear model assumes Y X N(µ(X),σ 2 I), And IE(Y X) = µ(x) = X β, 2/52
Statistics for Applications Chapter 10: Generalized Linear Models (GLMs) 1/52 Linear model A linear model assumes Y X N(µ(X),σ 2 I), And IE(Y X) = µ(x) = X β, 2/52 Components of a linear model The two
More informationCHAPTER 1 DISTRIBUTION THEORY 1 CHAPTER 1: DISTRIBUTION THEORY
CHAPTER 1 DISTRIBUTION THEORY 1 CHAPTER 1: DISTRIBUTION THEORY CHAPTER 1 DISTRIBUTION THEORY 2 Basic Concepts CHAPTER 1 DISTRIBUTION THEORY 3 Random Variables (R.V.) discrete random variable: probability
More informationStat260: Bayesian Modeling and Inference Lecture Date: February 10th, Jeffreys priors. exp 1 ) p 2
Stat260: Bayesian Modeling and Inference Lecture Date: February 10th, 2010 Jeffreys priors Lecturer: Michael I. Jordan Scribe: Timothy Hunter 1 Priors for the multivariate Gaussian Consider a multivariate
More informationCS839: Probabilistic Graphical Models. Lecture 7: Learning Fully Observed BNs. Theo Rekatsinas
CS839: Probabilistic Graphical Models Lecture 7: Learning Fully Observed BNs Theo Rekatsinas 1 Exponential family: a basic building block For a numeric random variable X p(x ) =h(x)exp T T (x) A( ) = 1
More informationMethods of Estimation
Methods of Estimation MIT 18.655 Dr. Kempthorne Spring 2016 1 Outline Methods of Estimation I 1 Methods of Estimation I 2 X X, X P P = {P θ, θ Θ}. Problem: Finding a function θˆ(x ) which is close to θ.
More informationPart IB Statistics. Theorems with proof. Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua. Lent 2015
Part IB Statistics Theorems with proof Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly)
More informationLecture 2. (See Exercise 7.22, 7.23, 7.24 in Casella & Berger)
8 HENRIK HULT Lecture 2 3. Some common distributions in classical and Bayesian statistics 3.1. Conjugate prior distributions. In the Bayesian setting it is important to compute posterior distributions.
More informationContents 1. Contents
Contents 1 Contents 6 Distributions of Functions of Random Variables 2 6.1 Transformation of Discrete r.v.s............. 3 6.2 Method of Distribution Functions............. 6 6.3 Method of Transformations................
More information18.440: Lecture 28 Lectures Review
18.440: Lecture 28 Lectures 18-27 Review Scott Sheffield MIT Outline Outline It s the coins, stupid Much of what we have done in this course can be motivated by the i.i.d. sequence X i where each X i is
More informationReview (Probability & Linear Algebra)
Review (Probability & Linear Algebra) CE-725 : Statistical Pattern Recognition Sharif University of Technology Spring 2013 M. Soleymani Outline Axioms of probability theory Conditional probability, Joint
More informationClassical Estimation Topics
Classical Estimation Topics Namrata Vaswani, Iowa State University February 25, 2014 This note fills in the gaps in the notes already provided (l0.pdf, l1.pdf, l2.pdf, l3.pdf, LeastSquares.pdf). 1 Min
More informationStable Process. 2. Multivariate Stable Distributions. July, 2006
Stable Process 2. Multivariate Stable Distributions July, 2006 1. Stable random vectors. 2. Characteristic functions. 3. Strictly stable and symmetric stable random vectors. 4. Sub-Gaussian random vectors.
More informationLecture Quantitative Finance Spring Term 2015
on bivariate Lecture Quantitative Finance Spring Term 2015 Prof. Dr. Erich Walter Farkas Lecture 07: April 2, 2015 1 / 54 Outline on bivariate 1 2 bivariate 3 Distribution 4 5 6 7 8 Comments and conclusions
More informationThe Expectation-Maximization Algorithm
1/29 EM & Latent Variable Models Gaussian Mixture Models EM Theory The Expectation-Maximization Algorithm Mihaela van der Schaar Department of Engineering Science University of Oxford MLE for Latent Variable
More informationModelling Dependence with Copulas and Applications to Risk Management. Filip Lindskog, RiskLab, ETH Zürich
Modelling Dependence with Copulas and Applications to Risk Management Filip Lindskog, RiskLab, ETH Zürich 02-07-2000 Home page: http://www.math.ethz.ch/ lindskog E-mail: lindskog@math.ethz.ch RiskLab:
More informationEXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY
EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA, 2016 MODULE 1 : Probability distributions Time allowed: Three hours Candidates should answer FIVE questions. All questions carry equal marks.
More information6.041/6.431 Fall 2010 Quiz 2 Solutions
6.04/6.43: Probabilistic Systems Analysis (Fall 200) 6.04/6.43 Fall 200 Quiz 2 Solutions Problem. (80 points) In this problem: (i) X is a (continuous) uniform random variable on [0, 4]. (ii) Y is an exponential
More informationCS340 Machine learning Gaussian classifiers
CS340 Machine learning Gaussian classifiers 1 Correlated features Height and weight are not independent 2 Multivariate Gaussian Multivariate Normal (MVN) N(x µ,σ) def 1 (2π) p/2 Σ 1/2exp[ 1 2 (x µ)t Σ
More informationPrediction. Prediction MIT Dr. Kempthorne. Spring MIT Prediction
MIT 18.655 Dr. Kempthorne Spring 2016 1 Problems Targets of Change in value of portfolio over fixed holding period. Long-term interest rate in 3 months Survival time of patients being treated for cancer
More informationPart IA Probability. Definitions. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015
Part IA Probability Definitions Based on lectures by R. Weber Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly) after lectures.
More informationModelling Dependent Credit Risks
Modelling Dependent Credit Risks Filip Lindskog, RiskLab, ETH Zürich 30 November 2000 Home page:http://www.math.ethz.ch/ lindskog E-mail:lindskog@math.ethz.ch RiskLab:http://www.risklab.ch Modelling Dependent
More informationSTAT 460/ /561 STATISTICAL INFERENCE I & II 2017/2018, TERMs I & II
STAT 460/560 + 461/561 STATISTICAL INFERENCE I & II 2017/2018, TERMs I & II Jiahua Chen and Ruben Zamar Department of Statistics University of British Columbia Contents 1 Some basics 1 1.1 Discipline of
More informationRandom Variables and Their Distributions
Chapter 3 Random Variables and Their Distributions A random variable (r.v.) is a function that assigns one and only one numerical value to each simple event in an experiment. We will denote r.vs by capital
More information1 Appendix A: Matrix Algebra
Appendix A: Matrix Algebra. Definitions Matrix A =[ ]=[A] Symmetric matrix: = for all and Diagonal matrix: 6=0if = but =0if 6= Scalar matrix: the diagonal matrix of = Identity matrix: the scalar matrix
More informationNonlinear Programming Models
Nonlinear Programming Models Fabio Schoen 2008 http://gol.dsi.unifi.it/users/schoen Nonlinear Programming Models p. Introduction Nonlinear Programming Models p. NLP problems minf(x) x S R n Standard form:
More informationMiscellaneous Errors in the Chapter 6 Solutions
Miscellaneous Errors in the Chapter 6 Solutions 3.30(b In this problem, early printings of the second edition use the beta(a, b distribution, but later versions use the Poisson(λ distribution. If your
More informationDefinition 1.1 (Parametric family of distributions) A parametric distribution is a set of distribution functions, each of which is determined by speci
Definition 1.1 (Parametric family of distributions) A parametric distribution is a set of distribution functions, each of which is determined by specifying one or more values called parameters. The number
More informationThe Canonical Gaussian Measure on R
The Canonical Gaussian Measure on R 1. Introduction The main goal of this course is to study Gaussian measures. The simplest example of a Gaussian measure is the canonical Gaussian measure P on R where
More informationMA/ST 810 Mathematical-Statistical Modeling and Analysis of Complex Systems
MA/ST 810 Mathematical-Statistical Modeling and Analysis of Complex Systems Review of Basic Probability The fundamentals, random variables, probability distributions Probability mass/density functions
More informationReview (probability, linear algebra) CE-717 : Machine Learning Sharif University of Technology
Review (probability, linear algebra) CE-717 : Machine Learning Sharif University of Technology M. Soleymani Fall 2012 Some slides have been adopted from Prof. H.R. Rabiee s and also Prof. R. Gutierrez-Osuna
More informationStatistical Inference
Statistical Inference Robert L. Wolpert Institute of Statistics and Decision Sciences Duke University, Durham, NC, USA. Asymptotic Inference in Exponential Families Let X j be a sequence of independent,
More informationLecture 4: Exponential family of distributions and generalized linear model (GLM) (Draft: version 0.9.2)
Lectures on Machine Learning (Fall 2017) Hyeong In Choi Seoul National University Lecture 4: Exponential family of distributions and generalized linear model (GLM) (Draft: version 0.9.2) Topics to be covered:
More informationMA 575 Linear Models: Cedric E. Ginestet, Boston University Revision: Probability and Linear Algebra Week 1, Lecture 2
MA 575 Linear Models: Cedric E Ginestet, Boston University Revision: Probability and Linear Algebra Week 1, Lecture 2 1 Revision: Probability Theory 11 Random Variables A real-valued random variable is
More informationST5215: Advanced Statistical Theory
Department of Statistics & Applied Probability Wednesday, October 19, 2011 Lecture 17: UMVUE and the first method of derivation Estimable parameters Let ϑ be a parameter in the family P. If there exists
More information1 of 7 7/16/2009 6:12 AM Virtual Laboratories > 7. Point Estimation > 1 2 3 4 5 6 1. Estimators The Basic Statistical Model As usual, our starting point is a random experiment with an underlying sample
More informationQuick Tour of Basic Probability Theory and Linear Algebra
Quick Tour of and Linear Algebra Quick Tour of and Linear Algebra CS224w: Social and Information Network Analysis Fall 2011 Quick Tour of and Linear Algebra Quick Tour of and Linear Algebra Outline Definitions
More informationFormulas for probability theory and linear models SF2941
Formulas for probability theory and linear models SF2941 These pages + Appendix 2 of Gut) are permitted as assistance at the exam. 11 maj 2008 Selected formulae of probability Bivariate probability Transforms
More informationChapter 3: Unbiased Estimation Lecture 22: UMVUE and the method of using a sufficient and complete statistic
Chapter 3: Unbiased Estimation Lecture 22: UMVUE and the method of using a sufficient and complete statistic Unbiased estimation Unbiased or asymptotically unbiased estimation plays an important role in
More informationPreliminaries. Copyright c 2018 Dan Nettleton (Iowa State University) Statistics / 38
Preliminaries Copyright c 2018 Dan Nettleton (Iowa State University) Statistics 510 1 / 38 Notation for Scalars, Vectors, and Matrices Lowercase letters = scalars: x, c, σ. Boldface, lowercase letters
More information6.241 Dynamic Systems and Control
6.241 Dynamic Systems and Control Lecture 2: Least Square Estimation Readings: DDV, Chapter 2 Emilio Frazzoli Aeronautics and Astronautics Massachusetts Institute of Technology February 7, 2011 E. Frazzoli
More informationSystem Identification, Lecture 4
System Identification, Lecture 4 Kristiaan Pelckmans (IT/UU, 2338) Course code: 1RT880, Report code: 61800 - Spring 2016 F, FRI Uppsala University, Information Technology 13 April 2016 SI-2016 K. Pelckmans
More informationWrite your Registration Number, Test Centre, Test Code and the Number of this booklet in the appropriate places on the answersheet.
2016 Booklet No. Test Code : PSA Forenoon Questions : 30 Time : 2 hours Write your Registration Number, Test Centre, Test Code and the Number of this booklet in the appropriate places on the answersheet.
More informationMATH c UNIVERSITY OF LEEDS Examination for the Module MATH2715 (January 2015) STATISTICAL METHODS. Time allowed: 2 hours
MATH2750 This question paper consists of 8 printed pages, each of which is identified by the reference MATH275. All calculators must carry an approval sticker issued by the School of Mathematics. c UNIVERSITY
More informationSTAT 512 sp 2018 Summary Sheet
STAT 5 sp 08 Summary Sheet Karl B. Gregory Spring 08. Transformations of a random variable Let X be a rv with support X and let g be a function mapping X to Y with inverse mapping g (A = {x X : g(x A}
More informationFirst Year Examination Department of Statistics, University of Florida
First Year Examination Department of Statistics, University of Florida August 19, 010, 8:00 am - 1:00 noon Instructions: 1. You have four hours to answer questions in this examination.. You must show your
More informationNonparametric hypothesis tests and permutation tests
Nonparametric hypothesis tests and permutation tests 1.7 & 2.3. Probability Generating Functions 3.8.3. Wilcoxon Signed Rank Test 3.8.2. Mann-Whitney Test Prof. Tesler Math 283 Fall 2018 Prof. Tesler Wilcoxon
More informationMachine Learning. Lecture 02.2: Basics of Information Theory. Nevin L. Zhang
Machine Learning Lecture 02.2: Basics of Information Theory Nevin L. Zhang lzhang@cse.ust.hk Department of Computer Science and Engineering The Hong Kong University of Science and Technology Nevin L. Zhang
More information18.175: Lecture 8 Weak laws and moment-generating/characteristic functions
18.175: Lecture 8 Weak laws and moment-generating/characteristic functions Scott Sheffield MIT 18.175 Lecture 8 1 Outline Moment generating functions Weak law of large numbers: Markov/Chebyshev approach
More informationLecture 1 October 9, 2013
Probabilistic Graphical Models Fall 2013 Lecture 1 October 9, 2013 Lecturer: Guillaume Obozinski Scribe: Huu Dien Khue Le, Robin Bénesse The web page of the course: http://www.di.ens.fr/~fbach/courses/fall2013/
More informationx log x, which is strictly convex, and use Jensen s Inequality:
2. Information measures: mutual information 2.1 Divergence: main inequality Theorem 2.1 (Information Inequality). D(P Q) 0 ; D(P Q) = 0 iff P = Q Proof. Let ϕ(x) x log x, which is strictly convex, and
More informationf X (y, z; θ, σ 2 ) = 1 2 (2πσ2 ) 1 2 exp( (y θz) 2 /2σ 2 ) l c,n (θ, σ 2 ) = i log f(y i, Z i ; θ, σ 2 ) (Y i θz i ) 2 /2σ 2
Chapter 7: EM algorithm in exponential families: JAW 4.30-32 7.1 (i) The EM Algorithm finds MLE s in problems with latent variables (sometimes called missing data ): things you wish you could observe,
More informationLesson 4: Stationary stochastic processes
Dipartimento di Ingegneria e Scienze dell Informazione e Matematica Università dell Aquila, umberto.triacca@univaq.it Stationary stochastic processes Stationarity is a rather intuitive concept, it means
More informationPattern Recognition. Parameter Estimation of Probability Density Functions
Pattern Recognition Parameter Estimation of Probability Density Functions Classification Problem (Review) The classification problem is to assign an arbitrary feature vector x F to one of c classes. The
More informationLecture 17: Likelihood ratio and asymptotic tests
Lecture 17: Likelihood ratio and asymptotic tests Likelihood ratio When both H 0 and H 1 are simple (i.e., Θ 0 = {θ 0 } and Θ 1 = {θ 1 }), Theorem 6.1 applies and a UMP test rejects H 0 when f θ1 (X) f
More information