Statistics for Applications. Chapter 3: Maximum Likelihood Estimation
Total variation distance (1)

Let (E, (P_θ)_{θ∈Θ}) be a statistical model associated with a sample of i.i.d. random variables X_1, ..., X_n. Assume that there exists θ* ∈ Θ such that X_1 ~ P_{θ*}: θ* is the true parameter.

Statistician's goal: given X_1, ..., X_n, find an estimator θ̂ = θ̂(X_1, ..., X_n) such that P_θ̂ is close to P_{θ*} for the true parameter θ*. This means: |P_θ̂(A) − P_{θ*}(A)| is small for all A ⊆ E.

Definition. The total variation distance between two probability measures P_θ and P_{θ'} is defined by

    TV(P_θ, P_{θ'}) = max_{A ⊆ E} |P_θ(A) − P_{θ'}(A)|.
Total variation distance (2)

Assume that E is discrete (i.e., finite or countable). This includes Bernoulli, Binomial, Poisson, ... Therefore X has a PMF (probability mass function): P_θ(X = x) = p_θ(x) for all x ∈ E, with p_θ(x) ≥ 0 and Σ_{x∈E} p_θ(x) = 1.

The total variation distance between P_θ and P_{θ'} is a simple function of the PMFs p_θ and p_{θ'}:

    TV(P_θ, P_{θ'}) = (1/2) Σ_{x∈E} |p_θ(x) − p_θ'(x)|.
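As a quick numeric illustration (not part of the original slides), the discrete formula above can be checked in a few lines of Python; the helper names are made up for this sketch. For two Bernoulli distributions, TV reduces to |p − p'|:

```python
# Total variation between two discrete distributions via the PMF formula
# TV(P, Q) = (1/2) * sum_x |p(x) - q(x)|, illustrated on Bernoulli PMFs.

def tv_discrete(p, q):
    """p, q: dicts mapping outcomes to probabilities over the same sample space."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)

# Bernoulli(t) as a two-point PMF
bern = lambda t: {0: 1 - t, 1: t}

print(tv_discrete(bern(0.3), bern(0.5)))  # 0.2 = |0.3 - 0.5|
```

Note that the maximum over events A in the definition is attained here at A = {1}, which is why the answer equals |0.3 − 0.5|.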
Total variation distance (3)

Assume that E is continuous. This includes Gaussian, Exponential, ... Assume that X has a density: P_θ(X ∈ A) = ∫_A f_θ(x) dx for all A ⊆ E, with f_θ(x) ≥ 0 and ∫_E f_θ(x) dx = 1.

The total variation distance between P_θ and P_{θ'} is a simple function of the densities f_θ and f_{θ'}:

    TV(P_θ, P_{θ'}) = (1/2) ∫_E |f_θ(x) − f_θ'(x)| dx.
Total variation distance (4)

Properties of total variation:
- TV(P_θ, P_{θ'}) = TV(P_{θ'}, P_θ) (symmetric)
- TV(P_θ, P_{θ'}) ≥ 0 (nonnegative)
- If TV(P_θ, P_{θ'}) = 0, then P_θ = P_{θ'} (definite)
- TV(P_θ, P_{θ''}) ≤ TV(P_θ, P_{θ'}) + TV(P_{θ'}, P_{θ''}) (triangle inequality)

These properties imply that the total variation is a distance between probability distributions.
Total variation distance (5)

An estimation strategy: build an estimator TV-hat(P_θ, P_{θ*}) for all θ ∈ Θ, then find θ̂ that minimizes the function θ ↦ TV-hat(P_θ, P_{θ*}).

Problem: it is unclear how to build TV-hat(P_θ, P_{θ*})!
Kullback-Leibler (KL) divergence (1)

There are many distances between probability measures that could replace total variation. Let us choose one that is more convenient.

Definition. The Kullback-Leibler (KL) divergence between two probability measures P_θ and P_{θ'} is defined by

    KL(P_θ, P_{θ'}) = Σ_{x∈E} p_θ(x) log( p_θ(x) / p_θ'(x) )        if E is discrete,

    KL(P_θ, P_{θ'}) = ∫_E f_θ(x) log( f_θ(x) / f_θ'(x) ) dx         if E is continuous.
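A minimal sketch (illustrative names, not from the slides) of the discrete KL formula, which also exhibits the asymmetry discussed on the next slide:

```python
import math

def kl_discrete(p, q):
    """KL(P, Q) = sum_x p(x) * log(p(x)/q(x)).
    Assumes q(x) > 0 wherever p(x) > 0; terms with p(x) = 0 contribute 0."""
    return sum(px * math.log(px / q[x]) for x, px in p.items() if px > 0)

p = {0: 0.7, 1: 0.3}
q = {0: 0.5, 1: 0.5}

# The two orderings generally give different values: KL is not symmetric.
print(kl_discrete(p, q), kl_discrete(q, p))
```

Both values are strictly positive and differ, while KL of a distribution with itself is 0, consistent with the "divergence, not distance" properties listed below.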
Kullback-Leibler (KL) divergence (2)

Properties of the KL divergence:
- KL(P_θ, P_{θ'}) ≠ KL(P_{θ'}, P_θ) in general (not symmetric)
- KL(P_θ, P_{θ'}) ≥ 0
- If KL(P_θ, P_{θ'}) = 0, then P_θ = P_{θ'} (definite)
- KL(P_θ, P_{θ''}) ≤ KL(P_θ, P_{θ'}) + KL(P_{θ'}, P_{θ''}) fails in general (no triangle inequality)

So the KL divergence is not a distance; it is called a divergence. The asymmetry is the key to our ability to estimate it!
Kullback-Leibler (KL) divergence (3)

    KL(P_{θ*}, P_θ) = E_{θ*}[ log( p_{θ*}(X) / p_θ(X) ) ]
                    = E_{θ*}[ log p_{θ*}(X) ] − E_{θ*}[ log p_θ(X) ]

So the function θ ↦ KL(P_{θ*}, P_θ) is of the form: constant − E_{θ*}[ log p_θ(X) ].

It can be estimated: E_{θ*}[h(X)] ≈ (1/n) Σ_{i=1}^n h(X_i) (by the law of large numbers). Hence

    KL-hat(P_{θ*}, P_θ) = constant − (1/n) Σ_{i=1}^n log p_θ(X_i).
Kullback-Leibler (KL) divergence (4)

    KL-hat(P_{θ*}, P_θ) = constant − (1/n) Σ_{i=1}^n log p_θ(X_i)

    min_{θ∈Θ} KL-hat(P_{θ*}, P_θ)  ⟺  min_{θ∈Θ} −(1/n) Σ_{i=1}^n log p_θ(X_i)
                                   ⟺  max_{θ∈Θ} (1/n) Σ_{i=1}^n log p_θ(X_i)
                                   ⟺  max_{θ∈Θ} log ∏_{i=1}^n p_θ(X_i)
                                   ⟺  max_{θ∈Θ} ∏_{i=1}^n p_θ(X_i)

This is the maximum likelihood principle.
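The chain of equivalences can be illustrated numerically (a sketch with made-up seed and sample size, not part of the slides): maximizing the average log-likelihood (1/n) Σ log p_θ(X_i) over a grid recovers the true parameter of a Bernoulli model, which is exactly minimizing the estimated KL up to the unknown constant:

```python
import math
import random

random.seed(0)
p_true = 0.3
xs = [1 if random.random() < p_true else 0 for _ in range(10_000)]

def avg_loglik(p):
    # (1/n) * sum_i log p_theta(X_i) for the Bernoulli(p) model
    return sum(x * math.log(p) + (1 - x) * math.log(1 - p) for x in xs) / len(xs)

# Brute-force maximization over a grid of candidate parameters
grid = [i / 100 for i in range(1, 100)]
p_hat = max(grid, key=avg_loglik)
print(p_hat)  # close to the true value 0.3
```

Since the Bernoulli average log-likelihood is concave in p, the grid maximizer is the grid point nearest the exact maximizer (the sample mean).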
Interlude: maximizing/minimizing functions (1)

Note that min_{θ∈Θ} h(θ) = −max_{θ∈Θ} (−h(θ)). In this class, we focus on maximization.

Maximization of arbitrary functions can be difficult. Example: θ ↦ ∏_{i=1}^n (θ − X_i).
Interlude: maximizing/minimizing functions (2)

Definition. A twice differentiable function h : Θ ⊆ ℝ → ℝ is said to be concave if its second derivative satisfies

    h''(θ) ≤ 0 for all θ ∈ Θ.

It is said to be strictly concave if the inequality is strict: h''(θ) < 0. Moreover, h is said to be (strictly) convex if −h is (strictly) concave, i.e., h''(θ) ≥ 0 (h''(θ) > 0).

Examples: Θ = ℝ, h(θ) = −θ²; Θ = (0, ∞), h(θ) = √θ; Θ = (0, ∞), h(θ) = log θ; Θ = [0, π], h(θ) = sin(θ); Θ = ℝ, h(θ) = 2θ³ (neither concave nor convex, since h''(θ) = 12θ changes sign).
Interlude: maximizing/minimizing functions (3)

More generally, for a multivariate function h : Θ ⊆ ℝ^d → ℝ, d ≥ 2, define the

- gradient vector: ∇h(θ) = ( ∂h/∂θ_1 (θ), ..., ∂h/∂θ_d (θ) )ᵀ ∈ ℝ^d;
- Hessian matrix: ∇²h(θ) = ( ∂²h/∂θ_i ∂θ_j (θ) )_{1≤i,j≤d} ∈ ℝ^{d×d}.

h is concave ⟺ xᵀ ∇²h(θ) x ≤ 0 for all x ∈ ℝ^d and all θ ∈ Θ.
h is strictly concave ⟺ xᵀ ∇²h(θ) x < 0 for all x ∈ ℝ^d with x ≠ 0 and all θ ∈ Θ.

Examples: Θ = ℝ², h(θ) = −θ₁² − 2θ₂² or h(θ) = −(θ₁ − θ₂)²; Θ = (0, ∞)², h(θ) = log(θ₁ + θ₂).
Interlude: maximizing/minimizing functions (4)

Strictly concave functions are easy to maximize: if they have a maximum, then it is unique. It is the unique solution of

    h'(θ) = 0,

or, in the multivariate case,

    ∇h(θ) = 0 ∈ ℝ^d.

There are many algorithms to find it numerically: this is the theory of convex optimization. In this class, there is often a closed-form formula for the maximum.
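One such numerical algorithm is Newton's method for solving the stationarity condition h'(θ) = 0. The sketch below (illustrative function and starting point, not from the slides) applies it to the strictly concave h(θ) = log θ − θ on (0, ∞), whose unique maximizer is θ = 1:

```python
def newton_max(h_prime, h_double_prime, theta0, tol=1e-10, max_iter=100):
    """Newton iterations theta <- theta - h'(theta)/h''(theta) to solve h'(theta) = 0.
    For a strictly concave h, the stationary point is the unique maximizer."""
    theta = theta0
    for _ in range(max_iter):
        step = h_prime(theta) / h_double_prime(theta)
        theta -= step
        if abs(step) < tol:
            break
    return theta

# h(theta) = log(theta) - theta: h'(theta) = 1/theta - 1, h''(theta) = -1/theta**2 < 0
theta_star = newton_max(lambda t: 1 / t - 1, lambda t: -1 / t**2, theta0=0.5)
print(theta_star)  # converges to 1.0
```

Starting from 0.5, the iterates 0.75, 0.9375, 0.996..., ... converge quadratically to 1, which is why Newton's method is a standard choice when the Hessian is available.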
Likelihood, discrete case (1)

Let (E, (P_θ)_{θ∈Θ}) be a statistical model associated with a sample of i.i.d. random variables X_1, ..., X_n. Assume that E is discrete (i.e., finite or countable).

Definition. The likelihood of the model is the map L_n (or just L) defined as:

    L_n : Eⁿ × Θ → ℝ
          (x_1, ..., x_n, θ) ↦ P_θ[X_1 = x_1, ..., X_n = x_n].
Likelihood, discrete case (2)

Example 1 (Bernoulli trials): If X_1, ..., X_n ~ iid Ber(p) for some p ∈ (0, 1):

- E = {0, 1}; Θ = (0, 1);
- for all (x_1, ..., x_n) ∈ {0, 1}ⁿ and p ∈ (0, 1),

    L(x_1, ..., x_n, p) = ∏_{i=1}^n P_p[X_i = x_i]
                        = ∏_{i=1}^n p^{x_i} (1 − p)^{1 − x_i}
                        = p^{Σ x_i} (1 − p)^{n − Σ x_i}.
Likelihood, discrete case (3)

Example 2 (Poisson model): If X_1, ..., X_n ~ iid Poiss(λ) for some λ > 0:

- E = ℕ; Θ = (0, ∞);
- for all (x_1, ..., x_n) ∈ ℕⁿ and λ > 0,

    L(x_1, ..., x_n, λ) = ∏_{i=1}^n P_λ[X_i = x_i]
                        = ∏_{i=1}^n e^{−λ} λ^{x_i} / x_i!
                        = e^{−nλ} λ^{Σ x_i} / (x_1! ··· x_n!).
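For the Poisson model, maximizing the log-likelihood gives λ̂ = X̄_n, and this can be checked by brute force on simulated data. The sketch below (stdlib Python only; the sampler, seed, and grid are made up for the illustration) drops the constant Σ log(x_i!) term, which does not affect the argmax:

```python
import math
import random

random.seed(1)
lam_true = 4.0

def poisson_sample(lam):
    """Draw one Poisson(lam) value by inverting the CDF (stdlib only)."""
    x = 0
    p = math.exp(-lam)  # P(X = 0)
    cdf = p
    u = random.random()
    while u > cdf:
        x += 1
        p *= lam / x    # P(X = x) from P(X = x-1)
        cdf += p
    return x

xs = [poisson_sample(lam_true) for _ in range(20_000)]

def loglik(lam):
    # log L = -n*lam + (sum x_i)*log(lam) - sum log(x_i!); the last term is
    # constant in lam, so it is omitted when maximizing.
    return -len(xs) * lam + sum(xs) * math.log(lam)

grid = [i / 100 for i in range(100, 800)]   # candidates 1.00, 1.01, ..., 7.99
lam_hat = max(grid, key=loglik)
print(lam_hat, sum(xs) / len(xs))  # grid maximizer vs sample mean
```

Because the Poisson log-likelihood is concave in λ, the grid maximizer sits at the grid point nearest the sample mean, matching the closed-form MLE.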
Likelihood, continuous case (1)

Let (E, (P_θ)_{θ∈Θ}) be a statistical model associated with a sample of i.i.d. random variables X_1, ..., X_n. Assume that all the P_θ have density f_θ.

Definition. The likelihood of the model is the map L defined as:

    L : Eⁿ × Θ → ℝ
        (x_1, ..., x_n, θ) ↦ ∏_{i=1}^n f_θ(x_i).
Likelihood, continuous case (2)

Example 1 (Gaussian model): If X_1, ..., X_n ~ iid N(μ, σ²) for some μ ∈ ℝ, σ² > 0:

- E = ℝ; Θ = ℝ × (0, ∞);
- for all (x_1, ..., x_n) ∈ ℝⁿ and (μ, σ²) ∈ ℝ × (0, ∞),

    L(x_1, ..., x_n, μ, σ²) = 1 / (σ √(2π))ⁿ · exp( −(1/(2σ²)) Σ_{i=1}^n (x_i − μ)² ).
Maximum likelihood estimator (1)

Let X_1, ..., X_n be an i.i.d. sample associated with a statistical model (E, (P_θ)_{θ∈Θ}), and let L be the corresponding likelihood.

Definition. The maximum likelihood estimator of θ is defined as

    θ̂_n^MLE = argmax_{θ∈Θ} L(X_1, ..., X_n, θ),

provided it exists.

Remark (log-likelihood): In practice, we use the fact that

    θ̂_n^MLE = argmax_{θ∈Θ} log L(X_1, ..., X_n, θ),

since log is increasing, so maximizing the log-likelihood yields the same estimator.
Maximum likelihood estimator (2)

Examples:
- Bernoulli trials: p̂^MLE = X̄_n.
- Poisson model: λ̂^MLE = X̄_n.
- Gaussian model: (μ̂^MLE, σ̂²^MLE) = (X̄_n, Ŝ_n), where Ŝ_n = (1/n) Σ_{i=1}^n (X_i − X̄_n)² is the sample variance.
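The Gaussian closed forms can be sanity-checked on synthetic data; this is a sketch with illustrative parameters and seed, not part of the slides. Note the MLE of the variance divides by n, not n − 1:

```python
import random

random.seed(2)
mu_true, sigma_true = 1.5, 2.0
xs = [random.gauss(mu_true, sigma_true) for _ in range(50_000)]

n = len(xs)
mu_hat = sum(xs) / n                                   # MLE of mu: the sample mean
sigma2_hat = sum((x - mu_hat) ** 2 for x in xs) / n    # MLE of sigma^2: divide by n

print(mu_hat, sigma2_hat)  # near 1.5 and sigma_true**2 = 4.0
```

With 50,000 draws, both estimates land close to the true values (1.5 and 4.0), consistent with the consistency result stated two slides below.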
Maximum likelihood estimator (3)

Definition: Fisher information. Define the log-likelihood for one observation as

    ℓ(θ) = log L_1(X, θ),   θ ∈ Θ ⊆ ℝ^d.

Assume that ℓ is a.s. twice differentiable. Under some regularity conditions, the Fisher information of the statistical model is defined as

    I(θ) = E[ ∇ℓ(θ) ∇ℓ(θ)ᵀ ] − E[ ∇ℓ(θ) ] E[ ∇ℓ(θ) ]ᵀ = −E[ ∇²ℓ(θ) ].

If Θ ⊆ ℝ, we get:

    I(θ) = var[ ℓ'(θ) ] = −E[ ℓ''(θ) ].
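For the Bernoulli model, the two one-dimensional expressions for I(θ) can be verified exactly, since the support has only two points; the known closed form is I(p) = 1/(p(1 − p)). This check is a sketch added for illustration:

```python
# Bernoulli log-likelihood for one observation: l(p) = X log p + (1-X) log(1-p)
# Score:     l'(p)  =  X/p - (1-X)/(1-p)
# Curvature: l''(p) = -X/p**2 - (1-X)/(1-p)**2
p = 0.3
score = {1: 1 / p, 0: -1 / (1 - p)}
curv = {1: -1 / p**2, 0: -1 / (1 - p) ** 2}
prob = {1: p, 0: 1 - p}

mean_score = sum(prob[x] * score[x] for x in (0, 1))   # = 0 at the true parameter
var_score = sum(prob[x] * (score[x] - mean_score) ** 2 for x in (0, 1))
neg_mean_curv = -sum(prob[x] * curv[x] for x in (0, 1))

print(var_score, neg_mean_curv, 1 / (p * (1 - p)))  # all three agree
```

All three quantities equal 1/(0.3 · 0.7) ≈ 4.76, confirming var[ℓ'(θ)] = −E[ℓ''(θ)] for this model.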
Maximum likelihood estimator (4)

Theorem. Let θ* ∈ Θ be the true parameter. Assume the following:
1. The model is identified.
2. For all θ ∈ Θ, the support of P_θ does not depend on θ.
3. θ* is not on the boundary of Θ.
4. I(θ) is invertible in a neighborhood of θ*.
5. A few more technical conditions.

Then θ̂_n^MLE satisfies:
- θ̂_n^MLE → θ* in probability w.r.t. P_{θ*} (consistency);
- √n (θ̂_n^MLE − θ*) → N(0, I(θ*)^{-1}) in distribution w.r.t. P_{θ*} (asymptotic normality).
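The asymptotic normality statement can be illustrated by simulation for the Bernoulli model, where p̂^MLE = X̄_n and I(p*)^{-1} = p*(1 − p*). The sketch below (illustrative seed, n, and repetition count) checks that √n (p̂ − p*) has mean near 0 and variance near p*(1 − p*) = 0.21:

```python
import random
import statistics

random.seed(3)
p_star, n, reps = 0.3, 500, 2000

zs = []
for _ in range(reps):
    xs = [1 if random.random() < p_star else 0 for _ in range(n)]
    p_hat = sum(xs) / n                     # Bernoulli MLE: the sample mean
    zs.append(n ** 0.5 * (p_hat - p_star))  # rescaled estimation error

# Empirical mean and variance of sqrt(n)*(p_hat - p_star) across replications
print(statistics.mean(zs), statistics.variance(zs))  # near 0 and 0.21
```

A histogram of `zs` would look approximately Gaussian, matching the N(0, I(θ*)^{-1}) limit in the theorem.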
MIT OpenCourseWare / Statistics for Applications, Fall 2016. For information about citing these materials or our Terms of Use, visit: