Exponential Families and Bayesian Inference


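Computer Vision: Exponential Families and Bayesian Inference (lecture notes)

Exponential Families

An exponential family of distributions is a $d$-parameter family $f(x;\theta)$ having the following form:

$$f(x;\theta) = h(x)\, e^{g(\theta)^T T(x) - B(\theta)}, \tag{3.1}$$

where $\theta = (\theta_1, \dots, \theta_d) \in \mathbb{R}^d$, $g(\theta) = [g_1(\theta), \dots, g_d(\theta)]$ for $d$ functions $g_i : \mathbb{R}^d \to \mathbb{R}$, and $T(x) = [T_1(x), \dots, T_d(x)]$.

Some examples with $d = 1$

Many well-known distributions belong to this family. Let us look at some examples:

1. Bernoulli distribution. The Bernoulli distribution characterizes coin tosses:

$$P(X; p) = p^X (1-p)^{1-X} = e^{X \log p + (1-X)\log(1-p)} = e^{X \log\frac{p}{1-p} + \log(1-p)}.$$

Comparing with equation (3.1): $\theta = p$, $T(x) = x$, $g(p) = \log\frac{p}{1-p}$, $B(p) = -\log(1-p)$, $h(x) = 1$.

2. Binomial distribution. The binomial distribution characterizes the number of successes (e.g. heads) in $n$ trials (coin tosses), i.e. $X \in \{0, 1, \dots, n\}$:

$$P(X = x; p) = \binom{n}{x} p^x (1-p)^{n-x} = \binom{n}{x}\, e^{x \log\frac{p}{1-p} + n\log(1-p)}.$$

Comparing with Eq. (3.1): $\theta = p$, $T(x) = x$, $g(p) = \log\frac{p}{1-p}$, $B(p) = -n\log(1-p)$, $h(x) = \binom{n}{x}$.

3. Poisson distribution. The Poisson distribution is given by:

$$f(x;\lambda) = \frac{\lambda^x e^{-\lambda}}{x!} = \frac{1}{x!}\, e^{x\log\lambda - \lambda}.$$

Comparing with Eq. (3.1): $\theta = \lambda$, $T(x) = x$, $g(\lambda) = \log\lambda$, $B(\lambda) = \lambda$, $h(x) = \frac{1}{x!}$.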
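The three decompositions above can be checked numerically. The following is a minimal sketch, not part of the original notes: the helper names (`exp_family`, `logit`, `bernoulli_pmf`, `binomial_pmf`, `poisson_pmf`) are hypothetical, and each function simply plugs the stated $h$, $g$, $T$ and $B$ into the one-parameter form $h(x)\,e^{g(\theta)T(x) - B(\theta)}$.

```python
import math


def exp_family(x, theta, h, g, T, B):
    """Evaluate h(x) * exp(g(theta) * T(x) - B(theta)) for a one-parameter family."""
    return h(x) * math.exp(g(theta) * T(x) - B(theta))


def logit(p):
    """Natural parameter shared by the Bernoulli and binomial families."""
    return math.log(p / (1.0 - p))


def bernoulli_pmf(x, p):
    # h(x) = 1, g(p) = log(p / (1 - p)), T(x) = x, B(p) = -log(1 - p)
    return exp_family(x, p, h=lambda x: 1.0, g=logit,
                      T=lambda x: x, B=lambda p: -math.log(1.0 - p))


def binomial_pmf(x, p, n):
    # h(x) = C(n, x), same g and T, B(p) = -n * log(1 - p)
    return exp_family(x, p, h=lambda x: math.comb(n, x), g=logit,
                      T=lambda x: x, B=lambda p: -n * math.log(1.0 - p))


def poisson_pmf(x, lam):
    # h(x) = 1 / x!, g(lam) = log(lam), T(x) = x, B(lam) = lam
    return exp_family(x, lam, h=lambda x: 1.0 / math.factorial(x), g=math.log,
                      T=lambda x: x, B=lambda lam: lam)


if __name__ == "__main__":
    print(bernoulli_pmf(1, 0.3))
    print(binomial_pmf(2, 0.3, 5))
    print(poisson_pmf(3, 2.0))
```

Each value should agree with the textbook probability mass function of the corresponding distribution, up to floating-point error.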

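An example with $d > 1$

Normal distribution. The general univariate normal density is given by:

$$f(x;\mu,\sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}} = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{x^2}{2\sigma^2} + \frac{x\mu}{\sigma^2} - \frac{\mu^2}{2\sigma^2} - \log\sigma},$$

which is of the form above, setting $\theta = [\mu \;\; \sigma]^T$, $T(x) = [x^2 \;\; x]^T$, $g(\theta) = \left[-\frac{1}{2\sigma^2} \;\; \frac{\mu}{\sigma^2}\right]^T$, $B(\theta) = \frac{\mu^2}{2\sigma^2} + \log\sigma$ and $h(x) = \frac{1}{\sqrt{2\pi}}$.

Exponential Families are Closed under Sampling

If $X_1, \dots, X_n$ are sampled i.i.d. from an exponential family, the joint density has the form:

$$f(X_1, \dots, X_n; \theta) = \prod_{i=1}^n h(X_i)\; e^{g(\theta)\sum_{i=1}^n T(X_i) - nB(\theta)}, \tag{3.2}$$

which also has an exponential form $h_n(X_1, \dots, X_n)\, e^{g(\theta)\, T_n(X_1, \dots, X_n) - B_n(\theta)}$, with $T_n(X_1, \dots, X_n) = \sum_i T(X_i)$, $h_n(X_1, \dots, X_n) = \prod_i h(X_i)$ and $B_n(\theta) = nB(\theta)$.

$B(\theta)$ and Normalization

Consider the form $h(x)\, e^{g(\theta)T(x)}$. To turn this exponential form into a density, we need to divide by the normalizing constant $\int h(x)\, e^{g(\theta)T(x)}\, dx$. Define:

$$B(\theta) = \log \int h(x)\, e^{g(\theta)T(x)}\, dx,$$

so that $\int h(x)\, e^{g(\theta)T(x)}\, dx = e^{B(\theta)}$. Now the exponential form becomes a density that integrates to 1:

$$f(x;\theta) = \frac{h(x)\, e^{g(\theta)T(x)}}{e^{B(\theta)}} = h(x)\, e^{g(\theta)T(x) - B(\theta)}.$$

So $B(\theta)$ is the log of the normalizing constant.

Derivatives of $B(\theta)$ and Moments of $T$

Define:

$$A(g) = \log \int h(x)\, e^{g\,T(x)}\, dx,$$

so that $B(\theta) = A(g(\theta))$.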
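As a sanity check on the statement that $B(\theta)$ is the log of the normalizing constant, the sketch below integrates $h(x)\,e^{g(\theta)^T T(x)}$ numerically for the normal family and compares the result with the closed form $\frac{\mu^2}{2\sigma^2} + \log\sigma$. The function names, the trapezoidal rule, and the integration range of twelve standard deviations are illustrative assumptions, not part of the original notes.

```python
import math


def log_normalizer_closed_form(mu, sigma):
    """B(theta) = mu^2 / (2 sigma^2) + log(sigma) for the normal family."""
    return mu * mu / (2.0 * sigma * sigma) + math.log(sigma)


def log_normalizer_numeric(mu, sigma, num=20001, width=12.0):
    """Approximate log of the integral of h(x) * exp(g1 * x^2 + g2 * x) dx.

    Uses the trapezoidal rule on [mu - width*sigma, mu + width*sigma];
    the tails outside this range are assumed to be negligible.
    """
    g1 = -1.0 / (2.0 * sigma * sigma)   # coefficient of T_1(x) = x^2
    g2 = mu / (sigma * sigma)           # coefficient of T_2(x) = x
    h = 1.0 / math.sqrt(2.0 * math.pi)
    lo = mu - width * sigma
    hi = mu + width * sigma
    dx = (hi - lo) / (num - 1)
    total = 0.0
    for k in range(num):
        x = lo + k * dx
        weight = 0.5 if k in (0, num - 1) else 1.0  # trapezoid end weights
        total += weight * h * math.exp(g1 * x * x + g2 * x)
    return math.log(total * dx)


if __name__ == "__main__":
    print(log_normalizer_numeric(1.0, 2.0))
    print(log_normalizer_closed_form(1.0, 2.0))
```

The two printed values should agree to several decimal places; the same recipe applies to any family whose integral can be approximated on a finite range.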

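Taking the derivative of $A$ with respect to $g$, we have:

$$A'(g) = \frac{\int T(x)\, h(x)\, e^{g\,T(x)}\, dx}{\int h(x)\, e^{g\,T(x)}\, dx}.$$

This shows that the derivative of the log of the normalizing constant gives the expectation of $T$. One can also verify:

$$A'(g(\theta)) = E_\theta\, T(X), \tag{3.3}$$

$$A''(g(\theta)) = \mathrm{Var}_\theta\, T(X).$$

More generally, a connection between the $m$-th derivative and the $m$-th moment of $T(X)$ can be established. This is a very useful result, since the problem of estimating moments, which involves computing integrals, has been turned into a problem of differentiating a function.

Maximum Likelihood Estimation

We now use the above properties for maximum likelihood estimation based on i.i.d. samples $X_1, \dots, X_n$. The joint density is given in Eq. (3.2). The log-likelihood function is given by taking logs in Eq. (3.2):

$$l(X_1, \dots, X_n; \theta) = g(\theta)\sum_{i=1}^n T(X_i) - nB(\theta) + \sum_{i=1}^n \log h(X_i).$$

The maximum likelihood estimate (MLE) is obtained by maximizing the function above:

$$\hat\theta = \arg\max_\theta\; l(X_1, \dots, X_n; \theta) = \arg\max_\theta \Big[ g(\theta)\sum_i T(X_i) - nB(\theta) + \sum_i \log h(X_i) \Big] = \arg\max_\theta \Big[ g(\theta)\sum_i T(X_i) - nB(\theta) \Big] = \arg\max_\theta \Big[ g(\theta)\sum_i T(X_i) - nA(g(\theta)) \Big].$$

Now

$$\frac{dl(X_1, \dots, X_n; \theta)}{d\theta} = \frac{dl(X_1, \dots, X_n; g)}{dg}\bigg|_{g = g(\theta)}\, g'(\theta).$$

Setting the derivative equal to 0 (and assuming $g'(\theta) \neq 0$) we get

$$\sum_{i=1}^n T(x_i) - nA'(g(\theta)) = 0,$$

which we rewrite using (3.3) as

$$\frac{1}{n}\sum_{i=1}^n T(x_i) = E_\theta\, T(X).$$

Thus, the $\theta$ that maximizes the likelihood is the $\theta$ for which the true expectation of $T(X)$ equals the sample expectation. The only way in which the data is involved in the estimation of $\theta$ is via the sample mean $\frac{1}{n}\sum_i T(X_i)$, which is referred to as a sufficient statistic for inference about $\theta$.

Multivariate Exponential Family

The observations above also hold for a multivariate $d$-parameter exponential family,

$$f(x;\theta) = h(x)\, e^{g(\theta)^T T(x) - B(\theta)},$$

with $\theta = [\theta_1 \dots \theta_d]^T$, $T(X) = [T_1(X) \dots T_d(X)]$ and $g(\theta) = [g_1(\theta) \dots g_d(\theta)]$. Again defining $A(g) = \log \int h(x)\, e^{g^T T(x)}\, dx$, the following results corresponding to the one-parameter case can be established:

$$\nabla_g A(g(\theta)) = \big[ E_\theta\, T_1(X), \dots, E_\theta\, T_d(X) \big]^T,$$

$$\frac{\partial^2 A}{\partial g_i\, \partial g_j}(g(\theta)) = \mathrm{cov}_\theta\big(T_i(X), T_j(X)\big).$$

The maximum likelihood estimate of $\theta$ is found by solving the following set of equations:

$$\frac{1}{n}\sum_{i=1}^n T_j(x_i) = E_\theta\, T_j, \qquad j = 1, \dots, d.$$

Defining the discrete empirical distribution which is uniform over the values $X_1, \dots, X_n$,

$$R_X = \frac{1}{n}\sum_{i=1}^n \delta_{X_i},$$

we can express the above equalities as:

$$E_{R_X} T_j = E_\theta\, T_j.$$

At the ML estimate of $\theta$, the expectations of $T$ under the empirical distribution equal the true expectations of $T$.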
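The two facts above (derivatives of $A$ give the mean and variance of $T$, and the MLE matches the sample mean of $T$ to $A'$) can be illustrated on the Bernoulli family, whose log-normalizer in the natural parameter $g = \log\frac{p}{1-p}$ is $A(g) = \log(1 + e^g)$. This is a sketch under assumptions: the function names are hypothetical, derivatives are taken by finite differences, and Newton's method is one convenient way (not the notes' prescribed method) to solve the moment-matching equation.

```python
import math


def A(g):
    """Log-normalizer of the Bernoulli family in its natural parameter g."""
    return math.log1p(math.exp(g))


def sigmoid(g):
    """Maps the natural parameter back to the success probability p."""
    return 1.0 / (1.0 + math.exp(-g))


def derivative(f, x, eps=1e-4):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)


def second_derivative(f, x, eps=1e-4):
    """Central finite-difference approximation of f''(x)."""
    return (f(x + eps) - 2.0 * f(x) + f(x - eps)) / (eps * eps)


def mle_natural_parameter(samples, iters=50):
    """Solve mean(T(x_i)) = A'(g) by Newton's method, with T(x) = x.

    A'(g) = sigmoid(g) and A''(g) = p * (1 - p) are used in closed form.
    Assumes the sample contains both zeros and ones, so a finite solution exists.
    """
    t_bar = sum(samples) / len(samples)
    g = 0.0
    for _ in range(iters):
        p = sigmoid(g)
        g -= (p - t_bar) / (p * (1.0 - p))
    return g


if __name__ == "__main__":
    g = 0.4
    p = sigmoid(g)
    print(derivative(A, g), p)                      # A'(g) vs. E[T]
    print(second_derivative(A, g), p * (1.0 - p))   # A''(g) vs. Var[T]
    data = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
    print(sigmoid(mle_natural_parameter(data)))     # should match the sample mean
```

For this family the moment-matching equation has the closed-form solution $\hat p = \frac{1}{n}\sum_i x_i$; the iterative solver is shown only because the same equation generally has no closed form in other families.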

The Bayesian Approach

So far we have used the maximum likelihood method for defining estimators for $\theta$, which is thought of as a parameter. The Bayesian approach treats parameters as random variables that can be described by probabilistic statements. Bayesian inference is carried out in the following way:

1. We choose a probability density $P(\theta)$, called the prior distribution, that expresses our prior beliefs about $\theta$ before we see any data.
2. We define the family of conditional distributions $P(X \mid \theta)$. Note that since $\theta$ is now a random variable we write $P(X \mid \theta)$ as opposed to $P(X; \theta)$.
3. After observing data $X_1, \dots, X_n$, we compute the posterior distribution $P(\theta \mid X_1, \dots, X_n)$.

For the third step, we employ Bayes' rule:

$$P(\theta \mid X_1, \dots, X_n) = \frac{P(X_1, \dots, X_n \mid \theta)\, P(\theta)}{P(X_1, \dots, X_n)} = \frac{P(X_1 \mid \theta)\, P(X_2 \mid \theta) \cdots P(X_n \mid \theta)\, P(\theta)}{P(X_1, \dots, X_n)}.$$

What can we do with the posterior? Two options are to estimate $\theta$ via the mode or the mean of the posterior distribution. From Bayesian decision theory, these options correspond to optimizing with respect to a zero-one cost or a squared cost, respectively. To maximize the posterior:

$$\hat\theta = \arg\max_\theta P(\theta \mid X_1, \dots, X_n) = \arg\max_\theta \log P(\theta \mid X_1, \dots, X_n) = \arg\max_\theta \Big[ \sum_{i=1}^n \log P(X_i \mid \theta) + \log P(\theta) \Big].$$

Note that the normalizing term $P(X_1, \dots, X_n)$ can be ignored. To estimate $\theta$ via the mean of the posterior:

$$\hat\theta = \int \theta\, P(\theta \mid X_1, \dots, X_n)\, d\theta.$$

In this case, the normalizing term $P(X_1, \dots, X_n)$ cannot be ignored.

Conjugate Priors

In Bayesian statistics a prior distribution is multiplied by the likelihood function and then normalized to produce a posterior distribution. A conjugate prior is one which, when combined with the likelihood and normalized, produces a posterior distribution which is of the same family as the prior. In most cases, once the unnormalized posterior is known, the normalization follows directly from the form of the distribution.

Example. If one is estimating the parameter (the success probability) of a Bernoulli distribution, and if one chooses to use a beta distribution as one's prior, then the posterior is always another beta distribution. This allows us to figure out the normalizing constants while bypassing their actual computation.

The Bernoulli distribution is given by:

$$P(X \mid p) = p^X (1-p)^{1-X}.$$

We put a Beta distribution $B(\alpha, \beta)$ on $p$:

$$P(p) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\, p^{\alpha-1} (1-p)^{\beta-1},$$

where the $\Gamma$ function is a generalization of the factorial to complex and real-valued arguments:

$$\Gamma(\alpha) = \int_0^\infty y^{\alpha-1} e^{-y}\, dy,$$

which for integers $\alpha = n$ gives the factorial $\Gamma(n) = (n-1)!$. We know that for a Beta distribution, the expectation is given by:

$$E\, B(\alpha, \beta) = \frac{\alpha}{\alpha+\beta}.$$

Now consider the posterior distribution of $p$ given the i.i.d. sampled data:

$$P(p \mid X_1, \dots, X_n) = \frac{P(p) \prod_{i=1}^n P(X_i \mid p)}{P(X_1, \dots, X_n)} = \frac{C_{\alpha,\beta}\, p^{\alpha-1} (1-p)^{\beta-1} \prod_{i=1}^n p^{X_i} (1-p)^{1-X_i}}{P(X_1, \dots, X_n)},$$

where $C_{\alpha,\beta} = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}$ is the normalizing constant for the $B(\alpha, \beta)$ distribution. The above expression can be written as:

$$P(p \mid X_1, \dots, X_n) = \frac{C_{\alpha,\beta}\, p^{\alpha-1} (1-p)^{\beta-1}\, p^{s} (1-p)^{n-s}}{P(X_1, \dots, X_n)} = C\, p^{s+\alpha-1} (1-p)^{n-s+\beta-1},$$

where $s = \sum_i X_i$ is the number of successes and $C$ is the normalizing constant for the posterior. Note that from the form of the posterior we already know it is a Beta distribution $B(s+\alpha,\, n-s+\beta)$, and the normalizing constant $C$ is given by

$$C = \frac{\Gamma(n+\alpha+\beta)}{\Gamma(s+\alpha)\,\Gamma(n-s+\beta)}.$$
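The Beta-Bernoulli update can be written in a few lines. The sketch below is illustrative only: the function names are hypothetical, and `math.lgamma` is used instead of `math.gamma` purely to avoid overflow for large counts. It returns the posterior parameters $(s+\alpha,\, n-s+\beta)$ and evaluates the posterior density with the normalizing constant $C$ read off from the Beta form rather than computed by integration.

```python
import math


def beta_posterior(samples, alpha, beta):
    """Posterior Beta parameters after observing 0/1 samples under a Beta(alpha, beta) prior."""
    n = len(samples)
    s = sum(samples)  # number of successes
    return s + alpha, n - s + beta


def beta_log_norm_const(a, b):
    """log of Gamma(a + b) / (Gamma(a) * Gamma(b))."""
    return math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)


def beta_pdf(p, a, b):
    """Density of Beta(a, b) at p, for 0 < p < 1."""
    log_density = (beta_log_norm_const(a, b)
                   + (a - 1.0) * math.log(p)
                   + (b - 1.0) * math.log(1.0 - p))
    return math.exp(log_density)


if __name__ == "__main__":
    a, b = beta_posterior([1, 0, 1, 1], alpha=2.0, beta=2.0)
    print(a, b)                               # posterior parameters
    print(math.exp(beta_log_norm_const(a, b)))  # normalizing constant C
    print(beta_pdf(0.6, a, b))
```

Because the posterior is recognized as a Beta distribution, the marginal $P(X_1, \dots, X_n)$ never has to be evaluated explicitly.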

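The posterior mean estimate $\hat p$, therefore, is:

$$\hat p = E\, B(s+\alpha,\, n-s+\beta) = \frac{s+\alpha}{n+\alpha+\beta}. \tag{3.4}$$

Recall that the ML estimate $\hat p_{ML}$ was:

$$\hat p_{ML} = \frac{s}{n}.$$

The posterior estimate in (3.4) and the maximum likelihood estimate are the same asymptotically. However, for small sample sizes (3.4) has a smoothing effect. It disallows zero-probability inferences when the success count is zero, and enforces the influence of a prior estimate. For $\alpha = \beta = 2$, the posterior mean is $\hat p = \frac{s+2}{n+4}$, which is the so-called Wilson's estimate of $p$.

Conjugate Priors and the Normal Density

Consider observations $X_1, \dots, X_n$ i.i.d. $N(\mu, \sigma^2)$, where we assume $\sigma$ to be known and $\mu$ to be the only unknown parameter:

$$P(X_1, \dots, X_n \mid \mu) = \prod_{i=1}^n \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(X_i-\mu)^2}{2\sigma^2}} = \frac{1}{(2\pi\sigma^2)^{n/2}} \exp\Big\{ -\sum_{i=1}^n \frac{(X_i-\mu)^2}{2\sigma^2} \Big\}.$$

Assume a Gaussian prior $N(\mu_0, \sigma_0^2)$ on the mean, i.e. our prior belief is that the mean $\mu$ is normally distributed around some value $\mu_0$ with variance $\sigma_0^2$:

$$P(\mu) = \frac{1}{\sqrt{2\pi}\,\sigma_0} \exp\Big\{ -\frac{(\mu-\mu_0)^2}{2\sigma_0^2} \Big\}.$$

The posterior has the following form:

$$P(\mu \mid X_1, \dots, X_n) = C_1\, P(\mu)\, P(X_1, \dots, X_n \mid \mu) = C_1\, \frac{1}{\sqrt{2\pi}\,\sigma_0}\, \frac{1}{(2\pi\sigma^2)^{n/2}} \exp\Big\{ -\sum_{i=1}^n \frac{(X_i-\mu)^2}{2\sigma^2} - \frac{(\mu-\mu_0)^2}{2\sigma_0^2} \Big\}$$

$$= C_2 \exp\Bigg\{ -\frac{1}{2}\Big( \frac{n}{\sigma^2} + \frac{1}{\sigma_0^2} \Big) \Bigg( \mu - \frac{\frac{\sum_i X_i}{\sigma^2} + \frac{\mu_0}{\sigma_0^2}}{\frac{n}{\sigma^2} + \frac{1}{\sigma_0^2}} \Bigg)^2 \Bigg\},$$

where $C_1$, $C_2$ are appropriate normalization constants. From the last expression it follows that the posterior is also normal, with mean $\mu_{post}$ and variance $\sigma_{post}^2$ given by:

$$\mu_{post} = \frac{\frac{\sum_i X_i}{\sigma^2} + \frac{\mu_0}{\sigma_0^2}}{\frac{n}{\sigma^2} + \frac{1}{\sigma_0^2}} = \frac{\sigma_0^2 \sum_i X_i + \sigma^2 \mu_0}{n\sigma_0^2 + \sigma^2}, \qquad \sigma_{post}^2 = \frac{1}{\frac{n}{\sigma^2} + \frac{1}{\sigma_0^2}} = \frac{\sigma^2 \sigma_0^2}{n\sigma_0^2 + \sigma^2}.$$

Recall that the maximum likelihood estimate of the mean was $\mu_{ML} = \frac{1}{n}\sum_i X_i$. The expression for $\mu_{post}$ above can be written as:

$$\mu_{post} = \frac{n\sigma_0^2}{n\sigma_0^2 + \sigma^2}\, \mu_{ML} + \frac{\sigma^2}{n\sigma_0^2 + \sigma^2}\, \mu_0.$$

Thus in both examples the posterior mean is a weighted average of the sample mean (the maximum likelihood estimate) and the prior mean. Asymptotically, the posterior mean and the sample mean are identical. In the small-sample case, the prior belief can strongly influence the choice of $\mu$ in the manner expressed above.

The posterior mean in the multivariate case has the same form. For estimation of the covariance of a multivariate normal it is also possible to define a conjugate prior: the inverse Wishart distribution on positive definite matrices. We omit the precise form of the distribution. For our purposes it suffices to note that the distribution depends on a central covariance $C_0$ and a concentration parameter $a$, and prefers covariances close to $C_0$. The final posterior mean again has the form of a weighted average of the empirical covariance matrix and $C_0$:

$$C_{post} = \frac{n\hat C + aC_0}{n + a},$$

where $\hat C$ is the empirical covariance, which is the maximum likelihood estimate (see lecture).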
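The weighted-average reading of both posterior means can be verified directly. The sketch below uses hypothetical function names and small made-up inputs chosen only for illustration; it computes the Beta-Bernoulli posterior mean $\frac{s+\alpha}{n+\alpha+\beta}$ and the normal posterior mean and variance for known $\sigma$, each alongside its weighted-average form.

```python
def beta_posterior_mean(s, n, alpha, beta):
    """Posterior mean of p under a Beta(alpha, beta) prior after s successes in n trials."""
    return (s + alpha) / (n + alpha + beta)


def beta_weighted_average(s, n, alpha, beta):
    """Same quantity, written as a mix of the ML estimate s/n and the prior mean."""
    w = n / (n + alpha + beta)            # weight on the data
    prior_mean = alpha / (alpha + beta)
    return w * (s / n) + (1.0 - w) * prior_mean


def normal_posterior(samples, sigma, mu0, sigma0):
    """Posterior mean and variance of mu for N(mu, sigma^2) data and a N(mu0, sigma0^2) prior."""
    n = len(samples)
    precision = n / sigma**2 + 1.0 / sigma0**2
    mu_post = (sum(samples) / sigma**2 + mu0 / sigma0**2) / precision
    return mu_post, 1.0 / precision


def normal_weighted_average(samples, sigma, mu0, sigma0):
    """Posterior mean written as a mix of the sample mean and the prior mean."""
    n = len(samples)
    w = n * sigma0**2 / (n * sigma0**2 + sigma**2)  # weight on the sample mean
    mu_ml = sum(samples) / n
    return w * mu_ml + (1.0 - w) * mu0


if __name__ == "__main__":
    # Zero successes: the ML estimate is 0, the posterior mean is not.
    print(beta_posterior_mean(0, 5, 2.0, 2.0), beta_weighted_average(0, 5, 2.0, 2.0))
    data = [1.0, 2.0, 3.0]
    print(normal_posterior(data, sigma=1.0, mu0=0.0, sigma0=1.0))
    print(normal_weighted_average(data, sigma=1.0, mu0=0.0, sigma0=1.0))
```

In both cases the weight on the data tends to 1 as $n$ grows, which is the asymptotic agreement with the maximum likelihood estimate described above.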

EECS564 Estimation, Filtering, and Detection Hwk 2 Solns. Winter p θ (z) = (2θz + 1 θ), 0 z 1

EECS564 Estimation, Filtering, and Detection Hwk 2 Solns. Winter p θ (z) = (2θz + 1 θ), 0 z 1 EECS564 Estimatio, Filterig, ad Detectio Hwk 2 Sols. Witer 25 4. Let Z be a sigle observatio havig desity fuctio where. p (z) = (2z + ), z (a) Assumig that is a oradom parameter, fid ad plot the maximum

More information

Lecture 7: Properties of Random Samples

Lecture 7: Properties of Random Samples Lecture 7: Properties of Radom Samples 1 Cotiued From Last Class Theorem 1.1. Let X 1, X,...X be a radom sample from a populatio with mea µ ad variace σ

More information

4. Partial Sums and the Central Limit Theorem

4. Partial Sums and the Central Limit Theorem 1 of 10 7/16/2009 6:05 AM Virtual Laboratories > 6. Radom Samples > 1 2 3 4 5 6 7 4. Partial Sums ad the Cetral Limit Theorem The cetral limit theorem ad the law of large umbers are the two fudametal theorems

More information

Lecture 9: September 19

Lecture 9: September 19 36-700: Probability ad Mathematical Statistics I Fall 206 Lecturer: Siva Balakrisha Lecture 9: September 9 9. Review ad Outlie Last class we discussed: Statistical estimatio broadly Pot estimatio Bias-Variace

More information

EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY

EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA, 016 MODULE : Statistical Iferece Time allowed: Three hours Cadidates should aswer FIVE questios. All questios carry equal marks. The umber

More information

Lecture 12: September 27

Lecture 12: September 27 36-705: Itermediate Statistics Fall 207 Lecturer: Siva Balakrisha Lecture 2: September 27 Today we will discuss sufficiecy i more detail ad the begi to discuss some geeral strategies for costructig estimators.

More information

CS434a/541a: Pattern Recognition Prof. Olga Veksler. Lecture 5

CS434a/541a: Pattern Recognition Prof. Olga Veksler. Lecture 5 CS434a/54a: Patter Recogitio Prof. Olga Veksler Lecture 5 Today Itroductio to parameter estimatio Two methods for parameter estimatio Maimum Likelihood Estimatio Bayesia Estimatio Itroducto Bayesia Decisio

More information

Chapter 6 Principles of Data Reduction

Chapter 6 Principles of Data Reduction Chapter 6 for BST 695: Special Topics i Statistical Theory. Kui Zhag, 0 Chapter 6 Priciples of Data Reductio Sectio 6. Itroductio Goal: To summarize or reduce the data X, X,, X to get iformatio about a

More information

Let us give one more example of MLE. Example 3. The uniform distribution U[0, θ] on the interval [0, θ] has p.d.f.

Let us give one more example of MLE. Example 3. The uniform distribution U[0, θ] on the interval [0, θ] has p.d.f. Lecture 5 Let us give oe more example of MLE. Example 3. The uiform distributio U[0, ] o the iterval [0, ] has p.d.f. { 1 f(x =, 0 x, 0, otherwise The likelihood fuctio ϕ( = f(x i = 1 I(X 1,..., X [0,

More information

Lecture 11 and 12: Basic estimation theory

Lecture 11 and 12: Basic estimation theory Lecture ad 2: Basic estimatio theory Sprig 202 - EE 94 Networked estimatio ad cotrol Prof. Kha March 2 202 I. MAXIMUM-LIKELIHOOD ESTIMATORS The maximum likelihood priciple is deceptively simple. Louis

More information

Probability and MLE.

Probability and MLE. 10-701 Probability ad MLE http://www.cs.cmu.edu/~pradeepr/701 (brief) itro to probability Basic otatios Radom variable - referrig to a elemet / evet whose status is ukow: A = it will rai tomorrow Domai

More information

Unbiased Estimation. February 7-12, 2008

Unbiased Estimation. February 7-12, 2008 Ubiased Estimatio February 7-2, 2008 We begi with a sample X = (X,..., X ) of radom variables chose accordig to oe of a family of probabilities P θ where θ is elemet from the parameter space Θ. For radom

More information

MATH 320: Probability and Statistics 9. Estimation and Testing of Parameters. Readings: Pruim, Chapter 4

MATH 320: Probability and Statistics 9. Estimation and Testing of Parameters. Readings: Pruim, Chapter 4 MATH 30: Probability ad Statistics 9. Estimatio ad Testig of Parameters Estimatio ad Testig of Parameters We have bee dealig situatios i which we have full kowledge of the distributio of a radom variable.

More information

Regression and generalization

Regression and generalization Regressio ad geeralizatio CE-717: Machie Learig Sharif Uiversity of Techology M. Soleymai Fall 2016 Curve fittig: probabilistic perspective Describig ucertaity over value of target variable as a probability

More information

Bayesian Methods: Introduction to Multi-parameter Models

Bayesian Methods: Introduction to Multi-parameter Models Bayesia Methods: Itroductio to Multi-parameter Models Parameter: θ = ( θ, θ) Give Likelihood p(y θ) ad prior p(θ ), the posterior p proportioal to p(y θ) x p(θ ) Margial posterior ( θ, θ y) is Iterested

More information

This exam contains 19 pages (including this cover page) and 10 questions. A Formulae sheet is provided with the exam.

This exam contains 19 pages (including this cover page) and 10 questions. A Formulae sheet is provided with the exam. Probability ad Statistics FS 07 Secod Sessio Exam 09.0.08 Time Limit: 80 Miutes Name: Studet ID: This exam cotais 9 pages (icludig this cover page) ad 0 questios. A Formulae sheet is provided with the

More information

Lecture 33: Bootstrap

Lecture 33: Bootstrap Lecture 33: ootstrap Motivatio To evaluate ad compare differet estimators, we eed cosistet estimators of variaces or asymptotic variaces of estimators. This is also importat for hypothesis testig ad cofidece

More information

Approximations and more PMFs and PDFs

Approximations and more PMFs and PDFs Approximatios ad more PMFs ad PDFs Saad Meimeh 1 Approximatio of biomial with Poisso Cosider the biomial distributio ( b(k,,p = p k (1 p k, k λ: k Assume that is large, ad p is small, but p λ at the limit.

More information

This section is optional.

This section is optional. 4 Momet Geeratig Fuctios* This sectio is optioal. The momet geeratig fuctio g : R R of a radom variable X is defied as g(t) = E[e tx ]. Propositio 1. We have g () (0) = E[X ] for = 1, 2,... Proof. Therefore

More information

ECE 8527: Introduction to Machine Learning and Pattern Recognition Midterm # 1. Vaishali Amin Fall, 2015

ECE 8527: Introduction to Machine Learning and Pattern Recognition Midterm # 1. Vaishali Amin Fall, 2015 ECE 8527: Itroductio to Machie Learig ad Patter Recogitio Midterm # 1 Vaishali Ami Fall, 2015 tue39624@temple.edu Problem No. 1: Cosider a two-class discrete distributio problem: ω 1 :{[0,0], [2,0], [2,2],

More information

The Bayesian Learning Framework. Back to Maximum Likelihood. Naïve Bayes. Simple Example: Coin Tosses. Given a generative model

The Bayesian Learning Framework. Back to Maximum Likelihood. Naïve Bayes. Simple Example: Coin Tosses. Given a generative model Back to Maximum Likelihood Give a geerative model f (x, y = k) =π k f k (x) Usig a geerative modellig approach, we assume a parametric form for f k (x) =f (x; k ) ad compute the MLE θ of θ =(π k, k ) k=

More information

Lecture 2: Monte Carlo Simulation

Lecture 2: Monte Carlo Simulation STAT/Q SCI 43: Itroductio to Resamplig ethods Sprig 27 Istructor: Ye-Chi Che Lecture 2: ote Carlo Simulatio 2 ote Carlo Itegratio Assume we wat to evaluate the followig itegratio: e x3 dx What ca we do?

More information

Estimation for Complete Data

Estimation for Complete Data Estimatio for Complete Data complete data: there is o loss of iformatio durig study. complete idividual complete data= grouped data A complete idividual data is the oe i which the complete iformatio of

More information

Solutions: Homework 3

Solutions: Homework 3 Solutios: Homework 3 Suppose that the radom variables Y,...,Y satisfy Y i = x i + " i : i =,..., IID where x,...,x R are fixed values ad ",...," Normal(0, )with R + kow. Fid ˆ = MLE( ). IND Solutio: Observe

More information

1.010 Uncertainty in Engineering Fall 2008

1.010 Uncertainty in Engineering Fall 2008 MIT OpeCourseWare http://ocw.mit.edu.00 Ucertaity i Egieerig Fall 2008 For iformatio about citig these materials or our Terms of Use, visit: http://ocw.mit.edu.terms. .00 - Brief Notes # 9 Poit ad Iterval

More information

Parameter, Statistic and Random Samples

Parameter, Statistic and Random Samples Parameter, Statistic ad Radom Samples A parameter is a umber that describes the populatio. It is a fixed umber, but i practice we do ot kow its value. A statistic is a fuctio of the sample data, i.e.,

More information

Binomial Distribution

Binomial Distribution 0.0 0.5 1.0 1.5 2.0 2.5 3.0 0 1 2 3 4 5 6 7 0.0 0.5 1.0 1.5 2.0 2.5 3.0 Overview Example: coi tossed three times Defiitio Formula Recall that a r.v. is discrete if there are either a fiite umber of possible

More information

Statistical Theory MT 2009 Problems 1: Solution sketches

Statistical Theory MT 2009 Problems 1: Solution sketches Statistical Theory MT 009 Problems : Solutio sketches. Which of the followig desities are withi a expoetial family? Explai your reasoig. (a) Let 0 < θ < ad put f(x, θ) = ( θ)θ x ; x = 0,,,... (b) (c) where

More information

5. Likelihood Ratio Tests

5. Likelihood Ratio Tests 1 of 5 7/29/2009 3:16 PM Virtual Laboratories > 9. Hy pothesis Testig > 1 2 3 4 5 6 7 5. Likelihood Ratio Tests Prelimiaries As usual, our startig poit is a radom experimet with a uderlyig sample space,

More information

Problem Set 4 Due Oct, 12

Problem Set 4 Due Oct, 12 EE226: Radom Processes i Systems Lecturer: Jea C. Walrad Problem Set 4 Due Oct, 12 Fall 06 GSI: Assae Gueye This problem set essetially reviews detectio theory ad hypothesis testig ad some basic otios

More information

n n i=1 Often we also need to estimate the variance. Below are three estimators each of which is optimal in some sense: n 1 i=1 k=1 i=1 k=1 i=1 k=1

n n i=1 Often we also need to estimate the variance. Below are three estimators each of which is optimal in some sense: n 1 i=1 k=1 i=1 k=1 i=1 k=1 MATH88T Maria Camero Cotets Basic cocepts of statistics Estimators, estimates ad samplig distributios 2 Ordiary least squares estimate 3 3 Maximum lielihood estimator 3 4 Bayesia estimatio Refereces 9

More information

Lecture 6 Simple alternatives and the Neyman-Pearson lemma

Lecture 6 Simple alternatives and the Neyman-Pearson lemma STATS 00: Itroductio to Statistical Iferece Autum 06 Lecture 6 Simple alteratives ad the Neyma-Pearso lemma Last lecture, we discussed a umber of ways to costruct test statistics for testig a simple ull

More information

Statistical Theory MT 2008 Problems 1: Solution sketches

Statistical Theory MT 2008 Problems 1: Solution sketches Statistical Theory MT 008 Problems : Solutio sketches. Which of the followig desities are withi a expoetial family? Explai your reasoig. a) Let 0 < θ < ad put fx, θ) = θ)θ x ; x = 0,,,... b) c) where α

More information

Lecture 3. Properties of Summary Statistics: Sampling Distribution

Lecture 3. Properties of Summary Statistics: Sampling Distribution Lecture 3 Properties of Summary Statistics: Samplig Distributio Mai Theme How ca we use math to justify that our umerical summaries from the sample are good summaries of the populatio? Lecture Summary

More information

7.1 Convergence of sequences of random variables

7.1 Convergence of sequences of random variables Chapter 7 Limit theorems Throughout this sectio we will assume a probability space (Ω, F, P), i which is defied a ifiite sequece of radom variables (X ) ad a radom variable X. The fact that for every ifiite

More information

Random Variables, Sampling and Estimation

Random Variables, Sampling and Estimation Chapter 1 Radom Variables, Samplig ad Estimatio 1.1 Itroductio This chapter will cover the most importat basic statistical theory you eed i order to uderstad the ecoometric material that will be comig

More information

Direction: This test is worth 150 points. You are required to complete this test within 55 minutes.

Direction: This test is worth 150 points. You are required to complete this test within 55 minutes. Term Test 3 (Part A) November 1, 004 Name Math 6 Studet Number Directio: This test is worth 10 poits. You are required to complete this test withi miutes. I order to receive full credit, aswer each problem

More information

Direction: This test is worth 250 points. You are required to complete this test within 50 minutes.

Direction: This test is worth 250 points. You are required to complete this test within 50 minutes. Term Test October 3, 003 Name Math 56 Studet Number Directio: This test is worth 50 poits. You are required to complete this test withi 50 miutes. I order to receive full credit, aswer each problem completely

More information

The variance of a sum of independent variables is the sum of their variances, since covariances are zero. Therefore. V (xi )= n n 2 σ2 = σ2.

The variance of a sum of independent variables is the sum of their variances, since covariances are zero. Therefore. V (xi )= n n 2 σ2 = σ2. SAMPLE STATISTICS A radom sample x 1,x,,x from a distributio f(x) is a set of idepedetly ad idetically variables with x i f(x) for all i Their joit pdf is f(x 1,x,,x )=f(x 1 )f(x ) f(x )= f(x i ) The sample

More information

f(x i ; ) L(x; p) = i=1 To estimate the value of that maximizes L or equivalently ln L we will set =0, for i =1, 2,...,m p x i (1 p) 1 x i i=1

f(x i ; ) L(x; p) = i=1 To estimate the value of that maximizes L or equivalently ln L we will set =0, for i =1, 2,...,m p x i (1 p) 1 x i i=1 Parameter Estimatio Samples from a probability distributio F () are: [,,..., ] T.Theprobabilitydistributio has a parameter vector [,,..., m ] T. Estimator: Statistic used to estimate ukow. Estimate: Observed

More information

Lecture Note 8 Point Estimators and Point Estimation Methods. MIT Spring 2006 Herman Bennett

Lecture Note 8 Point Estimators and Point Estimation Methods. MIT Spring 2006 Herman Bennett Lecture Note 8 Poit Estimators ad Poit Estimatio Methods MIT 14.30 Sprig 2006 Herma Beett Give a parameter with ukow value, the goal of poit estimatio is to use a sample to compute a umber that represets

More information

LECTURE NOTES 9. 1 Point Estimation. 1.1 The Method of Moments

LECTURE NOTES 9. 1 Point Estimation. 1.1 The Method of Moments LECTURE NOTES 9 Poit Estimatio Uder the hypothesis that the sample was geerated from some parametric statistical model, a atural way to uderstad the uderlyig populatio is by estimatig the parameters of

More information

Distribution of Random Samples & Limit theorems

Distribution of Random Samples & Limit theorems STAT/MATH 395 A - PROBABILITY II UW Witer Quarter 2017 Néhémy Lim Distributio of Radom Samples & Limit theorems 1 Distributio of i.i.d. Samples Motivatig example. Assume that the goal of a study is to

More information

7.1 Convergence of sequences of random variables

7.1 Convergence of sequences of random variables Chapter 7 Limit Theorems Throughout this sectio we will assume a probability space (, F, P), i which is defied a ifiite sequece of radom variables (X ) ad a radom variable X. The fact that for every ifiite

More information

January 25, 2017 INTRODUCTION TO MATHEMATICAL STATISTICS

January 25, 2017 INTRODUCTION TO MATHEMATICAL STATISTICS Jauary 25, 207 INTRODUCTION TO MATHEMATICAL STATISTICS Abstract. A basic itroductio to statistics assumig kowledge of probability theory.. Probability I a typical udergraduate problem i probability, we

More information

Discrete Mathematics for CS Spring 2008 David Wagner Note 22

Discrete Mathematics for CS Spring 2008 David Wagner Note 22 CS 70 Discrete Mathematics for CS Sprig 2008 David Wager Note 22 I.I.D. Radom Variables Estimatig the bias of a coi Questio: We wat to estimate the proportio p of Democrats i the US populatio, by takig

More information

Maximum Likelihood Estimation and Complexity Regularization

Maximum Likelihood Estimation and Complexity Regularization ECE90 Sprig 004 Statistical Regularizatio ad Learig Theory Lecture: 4 Maximum Likelihood Estimatio ad Complexity Regularizatio Lecturer: Rob Nowak Scribe: Pam Limpiti Review : Maximum Likelihood Estimatio

More information

Questions and Answers on Maximum Likelihood

Questions and Answers on Maximum Likelihood Questios ad Aswers o Maximum Likelihood L. Magee Fall, 2008 1. Give: a observatio-specific log likelihood fuctio l i (θ) = l f(y i x i, θ) the log likelihood fuctio l(θ y, X) = l i(θ) a data set (x i,

More information

Lecture 10 October Minimaxity and least favorable prior sequences

Lecture 10 October Minimaxity and least favorable prior sequences STATS 300A: Theory of Statistics Fall 205 Lecture 0 October 22 Lecturer: Lester Mackey Scribe: Brya He, Rahul Makhijai Warig: These otes may cotai factual ad/or typographic errors. 0. Miimaxity ad least

More information

The standard deviation of the mean

The standard deviation of the mean Physics 6C Fall 20 The stadard deviatio of the mea These otes provide some clarificatio o the distictio betwee the stadard deviatio ad the stadard deviatio of the mea.. The sample mea ad variace Cosider

More information

Stat410 Probability and Statistics II (F16)

Stat410 Probability and Statistics II (F16) Some Basic Cocepts of Statistical Iferece (Sec 5.) Suppose we have a rv X that has a pdf/pmf deoted by f(x; θ) or p(x; θ), where θ is called the parameter. I previous lectures, we focus o probability problems

More information

6. Sufficient, Complete, and Ancillary Statistics

6. Sufficient, Complete, and Ancillary Statistics Sufficiet, Complete ad Acillary Statistics http://www.math.uah.edu/stat/poit/sufficiet.xhtml 1 of 7 7/16/2009 6:13 AM Virtual Laboratories > 7. Poit Estimatio > 1 2 3 4 5 6 6. Sufficiet, Complete, ad Acillary

More information

5 : Exponential Family and Generalized Linear Models

5 : Exponential Family and Generalized Linear Models 0-708: Probabilistic Graphical Models 0-708, Sprig 206 5 : Expoetial Family ad Geeralized Liear Models Lecturer: Matthew Gormley Scribes: Yua Li, Yichog Xu, Silu Wag Expoetial Family Probability desity

More information

ECE 901 Lecture 14: Maximum Likelihood Estimation and Complexity Regularization

ECE 901 Lecture 14: Maximum Likelihood Estimation and Complexity Regularization ECE 90 Lecture 4: Maximum Likelihood Estimatio ad Complexity Regularizatio R Nowak 5/7/009 Review : Maximum Likelihood Estimatio We have iid observatios draw from a ukow distributio Y i iid p θ, i,, where

More information

Machine Learning Brett Bernstein

Machine Learning Brett Bernstein Machie Learig Brett Berstei Week Lecture: Cocept Check Exercises Starred problems are optioal. Statistical Learig Theory. Suppose A = Y = R ad X is some other set. Furthermore, assume P X Y is a discrete

More information

ECONOMETRIC THEORY. MODULE XIII Lecture - 34 Asymptotic Theory and Stochastic Regressors

ECONOMETRIC THEORY. MODULE XIII Lecture - 34 Asymptotic Theory and Stochastic Regressors ECONOMETRIC THEORY MODULE XIII Lecture - 34 Asymptotic Theory ad Stochastic Regressors Dr. Shalabh Departmet of Mathematics ad Statistics Idia Istitute of Techology Kapur Asymptotic theory The asymptotic

More information

EE 4TM4: Digital Communications II Probability Theory

EE 4TM4: Digital Communications II Probability Theory 1 EE 4TM4: Digital Commuicatios II Probability Theory I. RANDOM VARIABLES A radom variable is a real-valued fuctio defied o the sample space. Example: Suppose that our experimet cosists of tossig two fair

More information

Last Lecture. Unbiased Test

Last Lecture. Unbiased Test Last Lecture Biostatistics 6 - Statistical Iferece Lecture Uiformly Most Powerful Test Hyu Mi Kag March 8th, 3 What are the typical steps for costructig a likelihood ratio test? Is LRT statistic based

More information

1 Introduction to reducing variance in Monte Carlo simulations

1 Introduction to reducing variance in Monte Carlo simulations Copyright c 010 by Karl Sigma 1 Itroductio to reducig variace i Mote Carlo simulatios 11 Review of cofidece itervals for estimatig a mea I statistics, we estimate a ukow mea µ = E(X) of a distributio by

More information

Simulation. Two Rule For Inverting A Distribution Function

Simulation. Two Rule For Inverting A Distribution Function Simulatio Two Rule For Ivertig A Distributio Fuctio Rule 1. If F(x) = u is costat o a iterval [x 1, x 2 ), the the uiform value u is mapped oto x 2 through the iversio process. Rule 2. If there is a jump

More information

Topic 9: Sampling Distributions of Estimators

Topic 9: Sampling Distributions of Estimators Topic 9: Samplig Distributios of Estimators Course 003, 2016 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be

More information

Some Basic Probability Concepts. 2.1 Experiments, Outcomes and Random Variables

Some Basic Probability Concepts. 2.1 Experiments, Outcomes and Random Variables Some Basic Probability Cocepts 2. Experimets, Outcomes ad Radom Variables A radom variable is a variable whose value is ukow util it is observed. The value of a radom variable results from a experimet;

More information

Statistical Inference (Chapter 10) Statistical inference = learn about a population based on the information provided by a sample.

Statistical Inference (Chapter 10) Statistical inference = learn about a population based on the information provided by a sample. Statistical Iferece (Chapter 10) Statistical iferece = lear about a populatio based o the iformatio provided by a sample. Populatio: The set of all values of a radom variable X of iterest. Characterized

More information

Lecture 18: Sampling distributions

Lecture 18: Sampling distributions Lecture 18: Samplig distributios I may applicatios, the populatio is oe or several ormal distributios (or approximately). We ow study properties of some importat statistics based o a radom sample from

More information

17. Joint distributions of extreme order statistics Lehmann 5.1; Ferguson 15

17. Joint distributions of extreme order statistics Lehmann 5.1; Ferguson 15 17. Joit distributios of extreme order statistics Lehma 5.1; Ferguso 15 I Example 10., we derived the asymptotic distributio of the maximum from a radom sample from a uiform distributio. We did this usig

More information
