Lecture 3: MLE and Regression


STAT/Q SCI 403: Introduction to Resampling Methods, Spring 2017
Instructor: Yen-Chi Chen

3.1 Parameters and Distributions

Some distributions are indexed by their underlying parameters. Thus, as long as we know the parameter, we know the entire distribution. For instance, for Normal distributions $N(\mu, \sigma^2)$, if we know $\mu$ and $\sigma^2$, the entire distribution is determined. For another example, for Exponential distributions $\mathrm{Exp}(\lambda)$, as long as we know the value of $\lambda$, we know the entire distribution. Because these distributions are determined by their parameters, they are sometimes called parametric distributions.

Because the parameters in a parametric distribution determine the entire distribution, finding these parameters is very important in practice. There are many approaches to finding parameters; here we introduce the most famous and perhaps most important one: the maximum likelihood estimator (MLE).

3.2 MLE: Maximum Likelihood Estimator

Assume that our random sample $X_1, \dots, X_n \sim F_\theta$, where $F_\theta$ is a distribution depending on a parameter $\theta$. For instance, if $F_\theta$ is a Normal distribution, then $\theta = (\mu, \sigma^2)$, the mean and the variance; if $F_\theta$ is an Exponential distribution, then $\theta = \lambda$, the rate; if $F_\theta$ is a Bernoulli distribution, then $\theta = p$, the probability of generating $1$.

The idea of the MLE is to use the PDF or PMF to find the most likely parameter. For simplicity, here we use the PDF as an illustration. Because the CDF $F_\theta$ depends on $\theta$, the PDF (or PMF) $p_\theta$ will also be determined by the parameter. By the independence property, the joint PDF of the random sample $X_1, \dots, X_n$ is
$$p_{X_1,\dots,X_n}(x_1, \dots, x_n) = \prod_{i=1}^n p_\theta(x_i).$$
Because $p_\theta(x)$ also changes when $\theta$ changes, we rewrite it as $p(x; \theta) = p_\theta(x)$. Thus, the joint PDF can be rewritten as
$$p_{X_1,\dots,X_n}(x_1, \dots, x_n) = \prod_{i=1}^n p(x_i; \theta).$$
Having observed $x_1 = X_1, \dots, x_n = X_n$, how can we tell which parameter $\theta$ is most likely? Here is the simple proposal that the MLE uses. Based on the joint PDF and $x_1 = X_1, \dots, x_n = X_n$, we can rewrite the joint PDF as a function of the parameter $\theta$:
$$L(\theta \mid X_1, \dots, X_n) = \prod_{i=1}^n p(X_i; \theta).$$
The function $L$ is called the likelihood function, and the MLE finds the maximizer of the likelihood function.
Namely,
$$\hat\theta_{\mathrm{MLE}} = \operatorname*{argmax}_\theta L(\theta \mid X_1, \dots, X_n).$$
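To make the definition concrete, here is a minimal numerical sketch in Python (the helper names such as `mle_grid_search` are made up for illustration): we evaluate the log of the likelihood function of a $N(\mu, 1)$ sample over a grid of candidate values of $\mu$ and pick the maximizer. This brute-force search is only illustrative; closed forms are derived below.

```python
import math
import random

def log_likelihood(mu, xs):
    """Log-likelihood of N(mu, 1) for the sample xs."""
    return sum(-0.5 * (x - mu) ** 2 - 0.5 * math.log(2 * math.pi) for x in xs)

def mle_grid_search(xs, grid):
    """Return the grid point maximizing the log-likelihood -- a crude numerical MLE."""
    return max(grid, key=lambda mu: log_likelihood(mu, xs))

random.seed(0)
sample = [random.gauss(3.0, 1.0) for _ in range(500)]   # true mu = 3 (arbitrary choice)
grid = [i / 100 for i in range(0, 601)]                  # candidates 0.00, 0.01, ..., 6.00
mu_hat = mle_grid_search(sample, grid)
```

The grid maximizer lands (up to the grid spacing) at the sample mean, matching the closed-form answer of the Normal example below.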

In many cases, maximizing the likelihood function directly might not be easy, so people consider maximizing the log-likelihood function:
$$\hat\theta_{\mathrm{MLE}} = \operatorname*{argmax}_\theta \ell(\theta \mid X_1,\dots,X_n) = \operatorname*{argmax}_\theta \sum_{i=1}^n \log p(X_i;\theta) = \operatorname*{argmax}_\theta \sum_{i=1}^n \ell(\theta \mid X_i),$$
where $\ell(\theta \mid X_i) = \log p(X_i; \theta)$, but now we fix each $X_i$ and view it as a function of $\theta$. It is easy to see that maximizing the likelihood function is the same as maximizing the log-likelihood function.

When the log-likelihood function is differentiable with respect to $\theta$, we can use our knowledge from calculus to find $\hat\theta$. Let
$$s(\theta \mid X_i) = \frac{\partial}{\partial\theta}\ell(\theta \mid X_i)$$
be the derivative of $\ell(\theta \mid X_i)$ with respect to $\theta$ (here for simplicity we assume $\theta$ is one-dimensional). The function $s(\theta \mid X_i)$ is called the score function. Then the MLE is obtained by solving the following likelihood equation:
$$s(\theta \mid X_1,\dots,X_n) = \sum_{i=1}^n s(\theta \mid X_i) = 0. \tag{3.1}$$
Note that in general we need to verify that the solution is a maximum by checking the second derivative, but here we ignore this for simplicity.

Remark: Equation (3.1) is related to the generalized estimating equations, a common approach to obtaining estimators.

Remark: The idea of obtaining an estimator by maximizing a certain criterion (or minimizing some criterion) is called an M-estimator. In linear regression, we have learned that the estimators of the slope/intercept come from minimizing the sum of squared errors (the least squares estimator). Thus, the least squares method is another M-estimator.

Example (Normal distribution). Here is an example of finding the MLE for the Normal distribution. We assume $X_1,\dots,X_n \sim N(\mu, 1)$. The goal is to find an estimator of the mean parameter $\mu$. Because the density of such a Normal is
$$p_\mu(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2}},$$
the log-likelihood function is
$$\ell(\mu \mid X_i) = \log p_\mu(X_i) = -\frac{(X_i - \mu)^2}{2} - \frac{1}{2}\log(2\pi),$$
which further implies that the score function is
$$s(\mu \mid X_i) = \frac{\partial}{\partial\mu}\ell(\mu \mid X_i) = X_i - \mu.$$
Thus, the MLE $\hat\mu$ satisfies
$$0 = \sum_{i=1}^n s(\hat\mu \mid X_i) = \sum_{i=1}^n (X_i - \hat\mu) \;\Longrightarrow\; \hat\mu = \frac{1}{n}\sum_{i=1}^n X_i.$$
Thus, the MLE of the mean parameter is just the sample mean.

Example (Exponential distribution). Assume $X_1,\dots,X_n \sim \mathrm{Exp}(\lambda)$. Now we find an estimator of $\lambda$ using the MLE. By the definition of the exponential distribution, the density is
$$p_\lambda(x) = \lambda e^{-\lambda x}.$$
Thus, the log-likelihood function and the score function are
$$\ell(\lambda \mid X_i) = \log p_\lambda(X_i) = \log\lambda - \lambda X_i, \qquad s(\lambda \mid X_i) = \frac{1}{\lambda} - X_i.$$

As a result, the MLE comes from solving
$$0 = \sum_{i=1}^n s(\hat\lambda \mid X_i) = \sum_{i=1}^n\left(\frac{1}{\hat\lambda} - X_i\right) = \frac{n}{\hat\lambda} - \sum_{i=1}^n X_i \;\Longrightarrow\; \hat\lambda = \frac{n}{\sum_{i=1}^n X_i} = \frac{1}{\bar X_n}.$$
Namely, the MLE is the inverse of the sample average.

Example (Uniform distribution). Here is a case where we cannot use the score function to obtain the MLE, but we can still find the MLE directly. Assume $X_1,\dots,X_n \sim \mathrm{Uni}[0,\theta]$. Namely, the random sample is from a uniform distribution over the interval $[0,\theta]$, where the upper limit $\theta$ is the parameter of interest. Then the density function is
$$p_\theta(x) = \frac{1}{\theta}\, I(0 \le x \le \theta).$$
Here we cannot use the log-likelihood function (think about why), so we use the original likelihood function:
$$L(\theta \mid X_1,\dots,X_n) = \prod_{i=1}^n p_\theta(X_i) = \frac{1}{\theta^n}\, I(0 \le X_{(1)} \le X_{(n)} \le \theta),$$
where $X_{(1)} = \min\{X_1,\dots,X_n\}$ and $X_{(n)} = \max\{X_1,\dots,X_n\}$ are the minimum and maximum values of the sample (here we use the notation of order statistics). Observing from $L(\theta \mid X_1,\dots,X_n) = \theta^{-n}\, I(0 \le X_{(1)} \le X_{(n)} \le \theta)$, we see that a smaller $\theta$ gives a higher value of the likelihood function. However, there is a restriction: the value of $\theta$ cannot be below $X_{(n)}$, otherwise the indicator function outputs $0$. Thus, the maximum value of $L(\theta \mid X_1,\dots,X_n)$ occurs when $\theta = X_{(n)}$, so the MLE is $\hat\theta = X_{(n)}$, the maximum value of the sample.

Example (Bernoulli distribution). Finally, we provide an example of finding the MLE for a discrete random variable. Suppose $X_1,\dots,X_n \sim \mathrm{Ber}(p)$. Then the PMF is $X=1$ with probability $p$ and $X=0$ with probability $1-p$. We can rewrite the PMF in the following succinct form:
$$P(x) = p^x (1-p)^{1-x}.$$
You can verify that $P(1) = P(X=1) = p$ and $P(0) = P(X=0) = 1-p$. The likelihood function will be
$$L(p \mid X_1,\dots,X_n) = \prod_{i=1}^n P(X_i) = p^{\sum_i X_i} (1-p)^{n - \sum_i X_i}.$$
We can then compute the log-likelihood function and the score function:
$$\ell(p \mid X_1,\dots,X_n) = \sum_{i=1}^n \bigl(X_i \log p + (1 - X_i)\log(1-p)\bigr), \qquad s(p \mid X_1,\dots,X_n) = \sum_{i=1}^n \left(\frac{X_i}{p} - \frac{1 - X_i}{1-p}\right).$$
Therefore, the MLE can be obtained by solving
$$0 = s(\hat p \mid X_1,\dots,X_n) = \sum_{i=1}^n \left(\frac{X_i}{\hat p} - \frac{1 - X_i}{1 - \hat p}\right).$$
Multiplying both sides by $\hat p\,(1 - \hat p)$,
$$0 = \sum_{i=1}^n \bigl((1-\hat p)X_i - \hat p\,(1 - X_i)\bigr) = \sum_{i=1}^n (X_i - \hat p) \;\Longrightarrow\; \hat p = \frac{1}{n}\sum_{i=1}^n X_i.$$
Again, the MLE is the sample mean.

Remark: In many problems (such as mixture models), we do not have a closed form for the MLE.
The only way to compute the MLE is via computational methods such as the EM (Expectation-Maximization) algorithm,

which is an iterative ascent approach. However, the EM algorithm may get stuck at a local maximum, so we have to rerun the algorithm many times to find the real MLE (the MLE corresponds to the parameters at the global maximum). In machine learning/data science, how to numerically find the MLE (or approximate it) is an important topic. A common solution is to propose other computationally feasible estimators that are similar to the MLE and switch our target to these new estimators.

3.3 Theory of MLE

The MLE has many appealing properties. Here we focus on one of its most desirable properties: asymptotic normality and the associated asymptotic variance. What is asymptotic normality? It means that the estimator and its target parameter have the following elegant relation:
$$\sqrt{n}\,(\hat\theta - \theta) \overset{D}{\to} N\bigl(0, I^{-1}(\theta)\bigr), \tag{3.2}$$
where $I^{-1}(\theta)$ is called the asymptotic variance; it is a quantity depending only on $\theta$ (and the form of the density function). Simply put, asymptotic normality refers to the case where we have convergence in distribution to a Normal limit centered at the target parameter. Moreover, this asymptotic variance has an elegant form:
$$I(\theta) = E\left(\left(\frac{\partial}{\partial\theta}\log p(X;\theta)\right)^2\right) = E\bigl(s^2(\theta \mid X)\bigr). \tag{3.3}$$
The quantity $I(\theta)$ is called the Fisher information. It plays a key role in both statistical theory and information theory.

Here is a simplified derivation of equations (3.2) and (3.3). Let $X_1,\dots,X_n \sim p_{\theta_0}$, where $\theta_0$ is the parameter generating the random sample. For simplicity, we assume $\theta_0 \in \mathbb{R}$ and that $\theta_0$ satisfies $\theta_0 = \operatorname*{argmax}_\theta E(\ell(\theta \mid X))$. Let $\hat\theta$ be the MLE. Recall that the MLE solves the equation $\sum_{i=1}^n s(\hat\theta \mid X_i) = 0$. Because $\theta_0$ is the maximizer of $E(\ell(\theta \mid X))$, it also satisfies $E(s(\theta_0 \mid X)) = 0$. Now write $f_n(\theta) = \frac{1}{n}\sum_{i=1}^n s(\theta \mid X_i)$, so that $f_n(\hat\theta) = 0$, and consider the following expansion:
$$\frac{1}{n}\sum_{i=1}^n s(\theta_0 \mid X_i) - \underbrace{E(s(\theta_0 \mid X))}_{=0} = f_n(\theta_0) - \underbrace{f_n(\hat\theta)}_{=0} = (\theta_0 - \hat\theta)\, f_n'(\theta^*), \qquad \theta^* \in [\theta_0, \hat\theta] \text{ by the mean value theorem,}$$
$$\approx (\theta_0 - \hat\theta)\, f_n'(\theta_0) = (\theta_0 - \hat\theta)\,\frac{1}{n}\sum_{i=1}^n s'(\theta_0 \mid X_i) \approx (\theta_0 - \hat\theta)\, E\bigl(s'(\theta_0 \mid X_i)\bigr). \tag{3.4}$$
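The expansion above relies on the fact that $E(s(\theta_0 \mid X)) = 0$ at the true parameter. This is easy to sanity-check by Monte Carlo; here is a Python sketch using the Exponential score $s(\lambda \mid x) = 1/\lambda - x$ worked out earlier (the parameter values and sample size are arbitrary):

```python
import random
import statistics

def score_exp(lam, x):
    """Score of Exp(lam) at observation x: d/dlam (log lam - lam*x) = 1/lam - x."""
    return 1.0 / lam - x

random.seed(2)
lam0 = 2.0
xs = [random.expovariate(lam0) for _ in range(100_000)]

# At the true parameter, the average score should be near its expectation 0 ...
mean_score_true = statistics.fmean(score_exp(lam0, x) for x in xs)

# ... while at a wrong parameter it stays bounded away from 0
# (here E[1/1 - X] = 1 - 1/lam0 = 0.5).
mean_score_wrong = statistics.fmean(score_exp(1.0, x) for x in xs)
```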

By the Central Limit Theorem, the score average appearing on the left-hand side of equation (3.4) satisfies
$$\sqrt{n}\left(\frac{1}{n}\sum_{i=1}^n s(\theta_0 \mid X_i) - E(s(\theta_0 \mid X))\right) \overset{D}{\to} N\bigl(0, \sigma^2(\theta_0)\bigr), \tag{3.5}$$
where $\sigma^2(\theta_0) = \mathrm{Var}(s(\theta_0 \mid X_i))$. Therefore, rearranging the quantities in equations (3.4) and (3.5),
$$\sqrt{n}\,(\hat\theta - \theta_0) \approx -\frac{1}{E(s'(\theta_0 \mid X_i))}\,\sqrt{n}\left(\frac{1}{n}\sum_{i=1}^n s(\theta_0 \mid X_i) - E(s(\theta_0 \mid X))\right) \overset{D}{\to} N\left(0, \frac{\sigma^2(\theta_0)}{\bigl(E(s'(\theta_0 \mid X_i))\bigr)^2}\right). \tag{3.6}$$
Now we have already shown asymptotic normality. The next step is to show that the asymptotic variance satisfies
$$\frac{\sigma^2(\theta_0)}{\bigl(E(s'(\theta_0 \mid X_i))\bigr)^2} = I^{-1}(\theta_0).$$
First, we expand $\sigma^2(\theta_0)$:
$$\sigma^2(\theta_0) = \mathrm{Var}(s(\theta_0 \mid X_i)) = \underbrace{E\bigl(s^2(\theta_0 \mid X_i)\bigr)}_{=I(\theta_0)} - \underbrace{E^2\bigl(s(\theta_0 \mid X_i)\bigr)}_{=0}. \tag{3.7}$$
Second, we expand $E(s'(\theta_0 \mid X_i))$. But first we focus on $s'(\theta_0 \mid X_i)$:
$$s'(\theta_0 \mid X_i) = \frac{\partial^2}{\partial\theta^2}\log p(X_i;\theta_0) = \frac{\frac{\partial^2}{\partial\theta^2}p(X_i;\theta_0)}{p(X_i;\theta_0)} - \left(\frac{\frac{\partial}{\partial\theta}p(X_i;\theta_0)}{p(X_i;\theta_0)}\right)^2 = \frac{\frac{\partial^2}{\partial\theta^2}p(X_i;\theta_0)}{p(X_i;\theta_0)} - s^2(\theta_0 \mid X_i).$$
For the first quantity,
$$E\left(\frac{\frac{\partial^2}{\partial\theta^2}p(X_i;\theta_0)}{p(X_i;\theta_0)}\right) = \int \frac{\frac{\partial^2}{\partial\theta^2}p(x;\theta_0)}{p(x;\theta_0)}\, p(x;\theta_0)\,dx = \frac{\partial^2}{\partial\theta^2}\underbrace{\int p(x;\theta_0)\,dx}_{=1} = 0.$$
Note that because we exchange the positions of the derivative and the integration, we implicitly assume that the support of the density function does not depend on the parameter. Thus,
$$E\bigl(s'(\theta_0 \mid X_i)\bigr) = 0 - E\bigl(s^2(\theta_0 \mid X_i)\bigr) = -I(\theta_0). \tag{3.8}$$
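As a numerical check on the limit in equation (3.6), the following Python sketch simulates many Exponential samples, computes the MLE $\hat\lambda = 1/\bar X_n$ for each, and compares the empirical mean and variance of $\sqrt{n}(\hat\lambda - \lambda_0)$ with the theoretical limit $N(0, \lambda_0^2)$ (for $\mathrm{Exp}(\lambda)$ one has $I(\lambda) = 1/\lambda^2$, so the asymptotic variance is $\lambda_0^2$). The simulation sizes are arbitrary:

```python
import math
import random
import statistics

random.seed(3)
lam0, n, reps = 2.0, 400, 2000

# MLE for Exp(lam) is 1 / sample mean; repeat over many simulated samples.
errors = []
for _ in range(reps):
    xs = [random.expovariate(lam0) for _ in range(n)]
    lam_hat = 1.0 / statistics.fmean(xs)
    errors.append(math.sqrt(n) * (lam_hat - lam0))

# Theory: sqrt(n)(lam_hat - lam0) ~ N(0, lam0^2), so the empirical mean
# should be near 0 and the empirical variance near lam0^2 = 4.
emp_mean = statistics.fmean(errors)
emp_var = statistics.pvariance(errors)
```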

Plugging equations (3.7) and (3.8) into equation (3.6), we conclude
$$\sqrt{n}\,(\hat\theta - \theta_0) \overset{D}{\to} N\left(0, \frac{I(\theta_0)}{I^2(\theta_0)}\right) = N\bigl(0, I^{-1}(\theta_0)\bigr), \tag{3.9}$$
which is the desired result. As a byproduct, we also showed that
$$I(\theta_0) = E\bigl(s^2(\theta_0 \mid X)\bigr) = -E\bigl(s'(\theta_0 \mid X)\bigr).$$
Equation (3.9) provides several useful properties of the MLE:

- The MLE is an asymptotically unbiased estimator of $\theta_0$.
- The mean squared error (MSE) of $\hat\theta$ is
$$\mathrm{MSE}(\hat\theta, \theta_0) = \underbrace{\mathrm{bias}^2(\hat\theta)}_{\approx 0} + \mathrm{Var}(\hat\theta) \approx \frac{1}{n\, I(\theta_0)}.$$
- The estimation error of $\hat\theta$ is asymptotically Normal.
- If we know $I(\theta_0)$, or have an estimate $\hat I(\theta_0)$ of it, we can construct a $1-\alpha$ confidence interval using
$$\left[\hat\theta - \frac{z_{1-\alpha/2}}{\sqrt{n\,\hat I(\theta_0)}},\; \hat\theta + \frac{z_{1-\alpha/2}}{\sqrt{n\,\hat I(\theta_0)}}\right].$$
- If we want to test $H_0: \theta = \theta_0$ versus $H_a: \theta \ne \theta_0$ under significance level $\alpha$, we can first compute $I(\theta_0)$ and then reject the null hypothesis if $\sqrt{n\, I(\theta_0)}\,\bigl|\hat\theta - \theta_0\bigr| \ge z_{1-\alpha/2}$.

Example (Normal distribution). In the example where $X_1,\dots,X_n \sim N(\mu, 1)$, we have seen that $s(\mu \mid X_i) = X_i - \mu$. Thus,
$$I(\mu) = E\bigl(s^2(\mu \mid X_i)\bigr) = E\bigl((X_i - \mu)^2\bigr) = 1,$$
because the variance is $1$. Moreover, $E(s'(\mu \mid X_i)) = E(-1) = -1 = -I(\mu)$, which agrees with the fact that $E(s^2(\mu \mid X_i)) = -E(s'(\mu \mid X_i))$. Another way to check this is to directly compute the variance of the MLE $\hat\mu = \frac{1}{n}\sum_{i=1}^n X_i$, which equals $\frac{1}{n} = \frac{1}{n\, I(\mu)}$. Again, we obtain $I(\mu) = 1$.

Example (Exponential distribution). For the example where $X_1,\dots,X_n \sim \mathrm{Exp}(\lambda)$, the Fisher information is more involved. Because the MLE is $\hat\lambda = 1/\bar X_n$, we cannot directly compute its variance. However, using the Fisher information, we can still obtain its asymptotic variance. Recall that $s(\lambda \mid X_i) = \frac{1}{\lambda} - X_i$. Thus,
$$E\bigl(s^2(\lambda \mid X_i)\bigr) = E\left(\frac{1}{\lambda^2} - \frac{2X_i}{\lambda} + X_i^2\right).$$
For an exponential distribution $\mathrm{Exp}(\lambda)$, the mean is $\frac{1}{\lambda}$ and the variance is $\frac{1}{\lambda^2}$, which implies the second moment $E(X_i^2) = \mathrm{Var}(X_i) + E^2(X_i) = \frac{2}{\lambda^2}$. Putting it all together,
$$I(\lambda) = E\bigl(s^2(\lambda \mid X_i)\bigr) = \frac{1}{\lambda^2} - \frac{2}{\lambda}\cdot\frac{1}{\lambda} + \frac{2}{\lambda^2} = \frac{1}{\lambda^2}.$$
If we instead use $-E(s'(\lambda \mid X_i))$, we obtain $E(s'(\lambda \mid X_i)) = -\frac{1}{\lambda^2} = -I(\lambda),$

which agrees with the previous result.

Remark: We may obtain an approximation of the variance of the MLE $\hat\lambda$ by the delta method.

Remark: Not all MLEs have the above elegant properties. There are MLEs that do not satisfy the assumptions of the theory. For instance, in the example of $X_1,\dots,X_n \sim \mathrm{Uni}[0,\theta]$, the MLE $\hat\theta = X_{(n)}$ does not satisfy the assumptions. One assumption is that the support of the density does not depend on the parameter, but here the support is $[0,\theta]$, which depends on the parameter.

3.4 Simple Linear Regression

Under certain conditions, the least squares estimator (LSE) in linear regression can be framed as an MLE of the regression slope and intercept. Let us recall the least squares estimator of linear regression first. We observe IID bivariate random variables $(X_1,Y_1),\dots,(X_n,Y_n)$ such that
$$Y_i = \beta_0 + \beta_1 X_i + \epsilon_i,$$
where $\epsilon_i$ is a mean-$0$ noise independent of $X_i$. We also assume that the marginal distribution of the covariates is $X_1,\dots,X_n \sim p_X$. The LSE finds $\hat\beta_{0,\mathrm{LSE}}, \hat\beta_{1,\mathrm{LSE}}$ by
$$(\hat\beta_{0,\mathrm{LSE}}, \hat\beta_{1,\mathrm{LSE}}) = \operatorname*{argmin}_{\beta_0,\beta_1} \sum_{i=1}^n (Y_i - \beta_0 - \beta_1 X_i)^2. \tag{3.10}$$
In other words, we choose $\hat\beta_{0,\mathrm{LSE}}, \hat\beta_{1,\mathrm{LSE}}$ such that the sum of squared errors is minimized. Now, here is the key assumption that links the MLE and the LSE: we assume that
$$\epsilon_1,\dots,\epsilon_n \sim N(0,\sigma^2). \tag{3.11}$$
Namely, the noises are IID from a mean-$0$ Normal distribution.

Under equation (3.11), what is the MLE of $\beta_0$ and $\beta_1$? Here is a derivation. Let $p_\epsilon(e)$ be the density of the $\epsilon_i$. The joint density of $(X_i, Y_i)$ is
$$p_{XY}(x,y) = p(y \mid x)\, p_X(x) = p_\epsilon(y - \beta_0 - \beta_1 x)\, p_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(y - \beta_0 - \beta_1 x)^2}{2\sigma^2}}\, p_X(x).$$
Thus, the log-likelihood function of $\beta_0, \beta_1$ is
$$\ell(\beta_0,\beta_1 \mid X_1,Y_1,\dots,X_n,Y_n) = \sum_{i=1}^n \log p_{XY}(X_i,Y_i) = \underbrace{-\frac{n}{2}\log(2\pi\sigma^2)}_{\text{free of }\beta_0,\beta_1} - \frac{1}{2\sigma^2}\sum_{i=1}^n (Y_i - \beta_0 - \beta_1 X_i)^2 + \underbrace{\sum_{i=1}^n \log p_X(X_i)}_{\text{free of }\beta_0,\beta_1}.$$
Therefore, only the center term depends on the parameters of interest $\beta_0, \beta_1$. This means that the MLE is also the maximizer of
$$\ell^*(\beta_0,\beta_1 \mid X_1,Y_1,\dots,X_n,Y_n) = -\frac{1}{2\sigma^2}\sum_{i=1}^n (Y_i - \beta_0 - \beta_1 X_i)^2.$$

Because multiplying $\ell^*(\beta_0,\beta_1 \mid X_1,Y_1,\dots,X_n,Y_n)$ by any positive number will not affect the maximizer, we multiply it by $2\sigma^2$, which leads to the following criterion:
$$\ell^{**}(\beta_0,\beta_1 \mid X_1,Y_1,\dots,X_n,Y_n) = -\sum_{i=1}^n (Y_i - \beta_0 - \beta_1 X_i)^2.$$
Accordingly, the MLE finds $\hat\beta_{0,\mathrm{MLE}}, \hat\beta_{1,\mathrm{MLE}}$ by maximizing $-\sum_{i=1}^n (Y_i - \beta_0 - \beta_1 X_i)^2$, which is equivalent to minimizing $\sum_{i=1}^n (Y_i - \beta_0 - \beta_1 X_i)^2$, the same criterion as equation (3.10). So the LSE and the MLE coincide in this case, and the theory of the MLE applies to the LSE (asymptotic normality, Fisher information, etc.).

Remark: Think about the closed form of $\hat\beta_{0,\mathrm{LSE}}, \hat\beta_{1,\mathrm{LSE}}$ given IID bivariate random variables $(X_1,Y_1),\dots,(X_n,Y_n)$. Also think about whether they converge to the true parameters $\beta_0, \beta_1$ and whether we can derive their asymptotic normality (you may need to use Slutsky's theorem).

3.5 Logistic Regression

Now we consider a special case of the regression problem: the binary response case. We observe IID bivariate random variables $(X_1,Y_1),\dots,(X_n,Y_n)$ such that $Y_i$ takes the values $0$ and $1$. Here the $Y_i$'s are called the response variable and the $X_i$'s are called the covariate. Because $Y_i$ takes the values $0$ and $1$, we can view it as a Bernoulli random variable with parameter $q$, but this parameter $q$ depends on the value of the corresponding covariate $X_i$. Namely, if we observe $X_i = x$, then the probability $P(Y_i = 1 \mid X_i = x) = q(x)$ for some function $q(x)$. Here are some examples of why this model is reasonable.

Example. In graduate school admissions, we may wonder how a student's GPA affects the chance that the applicant receives admission. In this case, each observation is a student and the response variable $Y$ represents whether the student received admission ($Y=1$) or not ($Y=0$). GPA is the covariate $X$. Thus, we can model the probability
$$P(\text{admitted} \mid \text{GPA} = x) = P(Y = 1 \mid X = x) = q(x).$$

Example. In medical research, people often wonder whether the heritability of type-2 diabetes is related to some mutation of a gene. Researchers record whether the subject has type-2 diabetes (response) and measure the mutation signature of genes (covariate $X$).
Thus, the response variable $Y = 1$ if the subject has type-2 diabetes. A statistical model to associate the covariate $X$ and the response $Y$ is through
$$P(\text{subject has type-2 diabetes} \mid \text{mutation signature} = x) = P(Y = 1 \mid X = x) = q(x).$$
Thus, the function $q(x)$ now plays a key role in determining how the response $Y$ and the covariate $X$ are associated. Logistic regression provides a simple and elegant way to characterize the function $q(x)$ in a linear way. Because $q(x)$ represents a probability, it ranges within $[0,1]$, so naively using linear regression will not work. However, consider the following quantity:
$$O(x) = \frac{q(x)}{1 - q(x)} = \frac{P(Y = 1 \mid X = x)}{P(Y = 0 \mid X = x)} \in [0, \infty).$$

The quantity $O(x)$ is called the odds; it measures the contrast between the events $Y = 1$ and $Y = 0$. When the odds is greater than $1$, we have a higher chance of getting $Y = 1$ than $Y = 0$. The odds has an interesting asymmetric form: if $P(Y=1 \mid X=x) = 2P(Y=0 \mid X=x)$, then $O(x) = 2$, but if $P(Y=0 \mid X=x) = 2P(Y=1 \mid X=x)$, then $O(x) = \frac{1}{2}$. To symmetrize the odds, a straightforward approach is to take the (natural) logarithm of it:
$$\log O(x) = \log\frac{q(x)}{1 - q(x)}.$$
This quantity is called the log odds. The log odds has several beautiful properties. For instance, when the two probabilities are the same ($P(Y=1 \mid X=x) = P(Y=0 \mid X=x)$), $\log O(x) = 0$, and
$$P(Y=1 \mid X=x) = 2P(Y=0 \mid X=x) \;\Longrightarrow\; \log O(x) = \log 2,$$
$$P(Y=0 \mid X=x) = 2P(Y=1 \mid X=x) \;\Longrightarrow\; \log O(x) = -\log 2.$$
Logistic regression imposes a linear model on the log odds. Namely, logistic regression models
$$\log O(x) = \log\frac{q(x)}{1 - q(x)} = \beta_0 + \beta_1 x,$$
leading to
$$P(Y = 1 \mid X = x) = q(x) = \frac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}}.$$
Thus, the quantity $q(x) = q(x; \beta_0, \beta_1)$ depends on the two parameters $\beta_0, \beta_1$. Here $\beta_0$ behaves like the intercept and $\beta_1$ behaves like the slope (they are the intercept and slope in terms of the log odds).

When we observe data, how can we estimate these two parameters? In general, people use the MLE to estimate them, and here is the likelihood function of logistic regression. Recall that we observe an IID bivariate random sample $(X_1,Y_1),\dots,(X_n,Y_n)$. Let $p_X(x)$ denote the probability density of $X$; note that we will not use it in estimating $\beta_0, \beta_1$. For a given pair $(X_i, Y_i)$, recall that the random variable $Y_i$ given $X_i$ is just a Bernoulli random variable with parameter $q(X_i)$. Thus, the PMF of $Y_i$ given $X_i$ is
$$L(\beta_0,\beta_1 \mid X_i, Y_i) = P(Y = Y_i \mid X_i) = q(X_i)^{Y_i}\bigl(1 - q(X_i)\bigr)^{1 - Y_i} = \left(\frac{e^{\beta_0 + \beta_1 X_i}}{1 + e^{\beta_0 + \beta_1 X_i}}\right)^{Y_i}\left(\frac{1}{1 + e^{\beta_0 + \beta_1 X_i}}\right)^{1 - Y_i} = \frac{e^{\beta_0 Y_i + \beta_1 X_i Y_i}}{1 + e^{\beta_0 + \beta_1 X_i}}.$$
Note that here we construct the likelihood function using only the conditional PMF because, similarly to linear regression, the distribution of the covariate $X$ does not depend on the parameters $\beta_0, \beta_1$.
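The logistic link and the log odds invert each other: $\log O(x) = \beta_0 + \beta_1 x$ exactly when $q(x) = e^{\beta_0 + \beta_1 x}/(1 + e^{\beta_0 + \beta_1 x})$. Here is a tiny Python sketch of this round trip (the function names are made up for illustration):

```python
import math

def log_odds(q):
    """Log odds of a probability q in (0, 1)."""
    return math.log(q / (1.0 - q))

def q_of_x(x, b0, b1):
    """Logistic model: P(Y=1 | X=x) = e^(b0+b1*x) / (1 + e^(b0+b1*x))."""
    z = b0 + b1 * x
    return math.exp(z) / (1.0 + math.exp(z))
```

For example, `log_odds(2/3)` recovers $\log 2$, `log_odds(1/3)` recovers $-\log 2$, and `log_odds(q_of_x(x, b0, b1))` recovers $\beta_0 + \beta_1 x$.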
Thus, the log-likelihood function is
$$\ell(\beta_0,\beta_1 \mid X_1,Y_1,\dots,X_n,Y_n) = \sum_{i=1}^n \log L(\beta_0,\beta_1 \mid X_i,Y_i) = \sum_{i=1}^n \log\frac{e^{\beta_0 Y_i + \beta_1 X_i Y_i}}{1 + e^{\beta_0 + \beta_1 X_i}} = \sum_{i=1}^n \Bigl(\beta_0 Y_i + \beta_1 X_i Y_i - \log\bigl(1 + e^{\beta_0 + \beta_1 X_i}\bigr)\Bigr).$$

We can then take derivatives to find the maximizer. However, setting the derivative of the log-likelihood function $\ell(\beta_0,\beta_1 \mid X_1,Y_1,\dots,X_n,Y_n)$ to zero does not yield a closed-form solution, so we cannot write down a simple expression for the estimator. Despite this disadvantage, such a log-likelihood function can be optimized by iterative ascent methods such as Newton-Raphson.
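Although the likelihood equation has no closed-form solution, the Newton-Raphson iteration mentioned above takes only a few lines for a single covariate. Differentiating the log-likelihood gives gradient $\sum_i (Y_i - q(X_i))\,(1, X_i)^\top$ and negative Hessian $\sum_i q(X_i)(1 - q(X_i))\,(1, X_i)(1, X_i)^\top$. The following self-contained Python sketch implements this (the true parameter values and sample size are arbitrary choices for the demonstration; a real analysis would use an established package):

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def logistic_newton(xs, ys, n_iter=25):
    """Fit (b0, b1) in the logistic model by Newton-Raphson on the log-likelihood."""
    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = 0.0            # gradient entries
        h00 = h01 = h11 = 0.0    # negative Hessian entries (2x2, symmetric)
        for x, y in zip(xs, ys):
            q = sigmoid(b0 + b1 * x)
            g0 += y - q
            g1 += (y - q) * x
            w = q * (1.0 - q)
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: add (negative Hessian)^{-1} times the gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Simulate from the model with arbitrary true parameters b0 = -1, b1 = 2.
random.seed(4)
xs = [random.gauss(0.0, 1.0) for _ in range(5000)]
ys = [1 if random.random() < sigmoid(-1.0 + 2.0 * x) else 0 for x in xs]
b0_hat, b1_hat = logistic_newton(xs, ys)
```

With a few thousand observations, the fitted $(\hat\beta_0, \hat\beta_1)$ lands close to the generating values, consistent with the MLE theory of Section 3.3.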


1 Models for Matched Pairs 1 Models for Matched Pairs Matched pairs occur whe we aalyse samples such that for each measuremet i oe of the samples there is a measuremet i the other sample that directly relates to the measuremet i

More information

Problem Set 4 Due Oct, 12

Problem Set 4 Due Oct, 12 EE226: Radom Processes i Systems Lecturer: Jea C. Walrad Problem Set 4 Due Oct, 12 Fall 06 GSI: Assae Gueye This problem set essetially reviews detectio theory ad hypothesis testig ad some basic otios

More information

Goodness-of-Fit Tests and Categorical Data Analysis (Devore Chapter Fourteen)

Goodness-of-Fit Tests and Categorical Data Analysis (Devore Chapter Fourteen) Goodess-of-Fit Tests ad Categorical Data Aalysis (Devore Chapter Fourtee) MATH-252-01: Probability ad Statistics II Sprig 2019 Cotets 1 Chi-Squared Tests with Kow Probabilities 1 1.1 Chi-Squared Testig................

More information

Lecture 19: Convergence

Lecture 19: Convergence Lecture 19: Covergece Asymptotic approach I statistical aalysis or iferece, a key to the success of fidig a good procedure is beig able to fid some momets ad/or distributios of various statistics. I may

More information

Linear Regression Demystified

Linear Regression Demystified Liear Regressio Demystified Liear regressio is a importat subject i statistics. I elemetary statistics courses, formulae related to liear regressio are ofte stated without derivatio. This ote iteds to

More information

Expectation and Variance of a random variable

Expectation and Variance of a random variable Chapter 11 Expectatio ad Variace of a radom variable The aim of this lecture is to defie ad itroduce mathematical Expectatio ad variace of a fuctio of discrete & cotiuous radom variables ad the distributio

More information

Estimation for Complete Data

Estimation for Complete Data Estimatio for Complete Data complete data: there is o loss of iformatio durig study. complete idividual complete data= grouped data A complete idividual data is the oe i which the complete iformatio of

More information

Since X n /n P p, we know that X n (n. Xn (n X n ) Using the asymptotic result above to obtain an approximation for fixed n, we obtain

Since X n /n P p, we know that X n (n. Xn (n X n ) Using the asymptotic result above to obtain an approximation for fixed n, we obtain Assigmet 9 Exercise 5.5 Let X biomial, p, where p 0, 1 is ukow. Obtai cofidece itervals for p i two differet ways: a Sice X / p d N0, p1 p], the variace of the limitig distributio depeds oly o p. Use the

More information

7.1 Convergence of sequences of random variables

7.1 Convergence of sequences of random variables Chapter 7 Limit Theorems Throughout this sectio we will assume a probability space (, F, P), i which is defied a ifiite sequece of radom variables (X ) ad a radom variable X. The fact that for every ifiite

More information

Tests of Hypotheses Based on a Single Sample (Devore Chapter Eight)

Tests of Hypotheses Based on a Single Sample (Devore Chapter Eight) Tests of Hypotheses Based o a Sigle Sample Devore Chapter Eight MATH-252-01: Probability ad Statistics II Sprig 2018 Cotets 1 Hypothesis Tests illustrated with z-tests 1 1.1 Overview of Hypothesis Testig..........

More information

Statistical Theory MT 2008 Problems 1: Solution sketches

Statistical Theory MT 2008 Problems 1: Solution sketches Statistical Theory MT 008 Problems : Solutio sketches. Which of the followig desities are withi a expoetial family? Explai your reasoig. a) Let 0 < θ < ad put fx, θ) = θ)θ x ; x = 0,,,... b) c) where α

More information

1 Inferential Methods for Correlation and Regression Analysis

1 Inferential Methods for Correlation and Regression Analysis 1 Iferetial Methods for Correlatio ad Regressio Aalysis I the chapter o Correlatio ad Regressio Aalysis tools for describig bivariate cotiuous data were itroduced. The sample Pearso Correlatio Coefficiet

More information

Section 14. Simple linear regression.

Section 14. Simple linear regression. Sectio 14 Simple liear regressio. Let us look at the cigarette dataset from [1] (available to dowload from joural s website) ad []. The cigarette dataset cotais measuremets of tar, icotie, weight ad carbo

More information

1 Approximating Integrals using Taylor Polynomials

1 Approximating Integrals using Taylor Polynomials Seughee Ye Ma 8: Week 7 Nov Week 7 Summary This week, we will lear how we ca approximate itegrals usig Taylor series ad umerical methods. Topics Page Approximatig Itegrals usig Taylor Polyomials. Defiitios................................................

More information

This section is optional.

This section is optional. 4 Momet Geeratig Fuctios* This sectio is optioal. The momet geeratig fuctio g : R R of a radom variable X is defied as g(t) = E[e tx ]. Propositio 1. We have g () (0) = E[X ] for = 1, 2,... Proof. Therefore

More information

Linear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d

Linear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d Liear regressio Daiel Hsu (COMS 477) Maximum likelihood estimatio Oe of the simplest liear regressio models is the followig: (X, Y ),..., (X, Y ), (X, Y ) are iid radom pairs takig values i R d R, ad Y

More information

Lecture 3. Properties of Summary Statistics: Sampling Distribution

Lecture 3. Properties of Summary Statistics: Sampling Distribution Lecture 3 Properties of Summary Statistics: Samplig Distributio Mai Theme How ca we use math to justify that our umerical summaries from the sample are good summaries of the populatio? Lecture Summary

More information

Probability and MLE.

Probability and MLE. 10-701 Probability ad MLE http://www.cs.cmu.edu/~pradeepr/701 (brief) itro to probability Basic otatios Radom variable - referrig to a elemet / evet whose status is ukow: A = it will rai tomorrow Domai

More information

Binomial Distribution

Binomial Distribution 0.0 0.5 1.0 1.5 2.0 2.5 3.0 0 1 2 3 4 5 6 7 0.0 0.5 1.0 1.5 2.0 2.5 3.0 Overview Example: coi tossed three times Defiitio Formula Recall that a r.v. is discrete if there are either a fiite umber of possible

More information

Stat 421-SP2012 Interval Estimation Section

Stat 421-SP2012 Interval Estimation Section Stat 41-SP01 Iterval Estimatio Sectio 11.1-11. We ow uderstad (Chapter 10) how to fid poit estimators of a ukow parameter. o However, a poit estimate does ot provide ay iformatio about the ucertaity (possible

More information

The Riemann Zeta Function

The Riemann Zeta Function Physics 6A Witer 6 The Riema Zeta Fuctio I this ote, I will sketch some of the mai properties of the Riema zeta fuctio, ζ(x). For x >, we defie ζ(x) =, x >. () x = For x, this sum diverges. However, we

More information

Hypothesis Testing. Evaluation of Performance of Learned h. Issues. Trade-off Between Bias and Variance

Hypothesis Testing. Evaluation of Performance of Learned h. Issues. Trade-off Between Bias and Variance Hypothesis Testig Empirically evaluatig accuracy of hypotheses: importat activity i ML. Three questios: Give observed accuracy over a sample set, how well does this estimate apply over additioal samples?

More information

Econ 325/327 Notes on Sample Mean, Sample Proportion, Central Limit Theorem, Chi-square Distribution, Student s t distribution 1.

Econ 325/327 Notes on Sample Mean, Sample Proportion, Central Limit Theorem, Chi-square Distribution, Student s t distribution 1. Eco 325/327 Notes o Sample Mea, Sample Proportio, Cetral Limit Theorem, Chi-square Distributio, Studet s t distributio 1 Sample Mea By Hiro Kasahara We cosider a radom sample from a populatio. Defiitio

More information

The standard deviation of the mean

The standard deviation of the mean Physics 6C Fall 20 The stadard deviatio of the mea These otes provide some clarificatio o the distictio betwee the stadard deviatio ad the stadard deviatio of the mea.. The sample mea ad variace Cosider

More information

1.010 Uncertainty in Engineering Fall 2008

1.010 Uncertainty in Engineering Fall 2008 MIT OpeCourseWare http://ocw.mit.edu.00 Ucertaity i Egieerig Fall 2008 For iformatio about citig these materials or our Terms of Use, visit: http://ocw.mit.edu.terms. .00 - Brief Notes # 9 Poit ad Iterval

More information

6.3 Testing Series With Positive Terms

6.3 Testing Series With Positive Terms 6.3. TESTING SERIES WITH POSITIVE TERMS 307 6.3 Testig Series With Positive Terms 6.3. Review of what is kow up to ow I theory, testig a series a i for covergece amouts to fidig the i= sequece of partial

More information

Parameter, Statistic and Random Samples

Parameter, Statistic and Random Samples Parameter, Statistic ad Radom Samples A parameter is a umber that describes the populatio. It is a fixed umber, but i practice we do ot kow its value. A statistic is a fuctio of the sample data, i.e.,

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 2 9/9/2013. Large Deviations for i.i.d. Random Variables

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 2 9/9/2013. Large Deviations for i.i.d. Random Variables MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 2 9/9/2013 Large Deviatios for i.i.d. Radom Variables Cotet. Cheroff boud usig expoetial momet geeratig fuctios. Properties of a momet

More information

Understanding Samples

Understanding Samples 1 Will Moroe CS 109 Samplig ad Bootstrappig Lecture Notes #17 August 2, 2017 Based o a hadout by Chris Piech I this chapter we are goig to talk about statistics calculated o samples from a populatio. We

More information

10. Comparative Tests among Spatial Regression Models. Here we revisit the example in Section 8.1 of estimating the mean of a normal random

10. Comparative Tests among Spatial Regression Models. Here we revisit the example in Section 8.1 of estimating the mean of a normal random Part III. Areal Data Aalysis 0. Comparative Tests amog Spatial Regressio Models While the otio of relative likelihood values for differet models is somewhat difficult to iterpret directly (as metioed above),

More information

Lecture 12: September 27

Lecture 12: September 27 36-705: Itermediate Statistics Fall 207 Lecturer: Siva Balakrisha Lecture 2: September 27 Today we will discuss sufficiecy i more detail ad the begi to discuss some geeral strategies for costructig estimators.

More information

Discrete Mathematics for CS Spring 2008 David Wagner Note 22

Discrete Mathematics for CS Spring 2008 David Wagner Note 22 CS 70 Discrete Mathematics for CS Sprig 2008 David Wager Note 22 I.I.D. Radom Variables Estimatig the bias of a coi Questio: We wat to estimate the proportio p of Democrats i the US populatio, by takig

More information

Properties and Hypothesis Testing

Properties and Hypothesis Testing Chapter 3 Properties ad Hypothesis Testig 3.1 Types of data The regressio techiques developed i previous chapters ca be applied to three differet kids of data. 1. Cross-sectioal data. 2. Time series data.

More information

Lecture 18: Sampling distributions

Lecture 18: Sampling distributions Lecture 18: Samplig distributios I may applicatios, the populatio is oe or several ormal distributios (or approximately). We ow study properties of some importat statistics based o a radom sample from

More information

HOMEWORK I: PREREQUISITES FROM MATH 727

HOMEWORK I: PREREQUISITES FROM MATH 727 HOMEWORK I: PREREQUISITES FROM MATH 727 Questio. Let X, X 2,... be idepedet expoetial radom variables with mea µ. (a) Show that for Z +, we have EX µ!. (b) Show that almost surely, X + + X (c) Fid the

More information

First Year Quantitative Comp Exam Spring, Part I - 203A. f X (x) = 0 otherwise

First Year Quantitative Comp Exam Spring, Part I - 203A. f X (x) = 0 otherwise First Year Quatitative Comp Exam Sprig, 2012 Istructio: There are three parts. Aswer every questio i every part. Questio I-1 Part I - 203A A radom variable X is distributed with the margial desity: >

More information

Advanced Stochastic Processes.

Advanced Stochastic Processes. Advaced Stochastic Processes. David Gamarik LECTURE 2 Radom variables ad measurable fuctios. Strog Law of Large Numbers (SLLN). Scary stuff cotiued... Outlie of Lecture Radom variables ad measurable fuctios.

More information

Maximum Likelihood Estimation

Maximum Likelihood Estimation ECE 645: Estimatio Theory Sprig 2015 Istructor: Prof. Staley H. Cha Maximum Likelihood Estimatio (LaTeX prepared by Shaobo Fag) April 14, 2015 This lecture ote is based o ECE 645(Sprig 2015) by Prof. Staley

More information

AMS570 Lecture Notes #2

AMS570 Lecture Notes #2 AMS570 Lecture Notes # Review of Probability (cotiued) Probability distributios. () Biomial distributio Biomial Experimet: ) It cosists of trials ) Each trial results i of possible outcomes, S or F 3)

More information

The variance of a sum of independent variables is the sum of their variances, since covariances are zero. Therefore. V (xi )= n n 2 σ2 = σ2.

The variance of a sum of independent variables is the sum of their variances, since covariances are zero. Therefore. V (xi )= n n 2 σ2 = σ2. SAMPLE STATISTICS A radom sample x 1,x,,x from a distributio f(x) is a set of idepedetly ad idetically variables with x i f(x) for all i Their joit pdf is f(x 1,x,,x )=f(x 1 )f(x ) f(x )= f(x i ) The sample

More information

CSE 527, Additional notes on MLE & EM

CSE 527, Additional notes on MLE & EM CSE 57 Lecture Notes: MLE & EM CSE 57, Additioal otes o MLE & EM Based o earlier otes by C. Grat & M. Narasimha Itroductio Last lecture we bega a examiatio of model based clusterig. This lecture will be

More information

Convergence of random variables. (telegram style notes) P.J.C. Spreij

Convergence of random variables. (telegram style notes) P.J.C. Spreij Covergece of radom variables (telegram style otes).j.c. Spreij this versio: September 6, 2005 Itroductio As we kow, radom variables are by defiitio measurable fuctios o some uderlyig measurable space

More information

EE 4TM4: Digital Communications II Probability Theory

EE 4TM4: Digital Communications II Probability Theory 1 EE 4TM4: Digital Commuicatios II Probability Theory I. RANDOM VARIABLES A radom variable is a real-valued fuctio defied o the sample space. Example: Suppose that our experimet cosists of tossig two fair

More information

Monte Carlo Integration

Monte Carlo Integration Mote Carlo Itegratio I these otes we first review basic umerical itegratio methods (usig Riema approximatio ad the trapezoidal rule) ad their limitatios for evaluatig multidimesioal itegrals. Next we itroduce

More information

1 of 7 7/16/2009 6:06 AM Virtual Laboratories > 6. Radom Samples > 1 2 3 4 5 6 7 6. Order Statistics Defiitios Suppose agai that we have a basic radom experimet, ad that X is a real-valued radom variable

More information

Math 525: Lecture 5. January 18, 2018

Math 525: Lecture 5. January 18, 2018 Math 525: Lecture 5 Jauary 18, 2018 1 Series (review) Defiitio 1.1. A sequece (a ) R coverges to a poit L R (writte a L or lim a = L) if for each ǫ > 0, we ca fid N such that a L < ǫ for all N. If the

More information

September 2012 C1 Note. C1 Notes (Edexcel) Copyright - For AS, A2 notes and IGCSE / GCSE worksheets 1

September 2012 C1 Note. C1 Notes (Edexcel) Copyright   - For AS, A2 notes and IGCSE / GCSE worksheets 1 September 0 s (Edecel) Copyright www.pgmaths.co.uk - For AS, A otes ad IGCSE / GCSE worksheets September 0 Copyright www.pgmaths.co.uk - For AS, A otes ad IGCSE / GCSE worksheets September 0 Copyright

More information

Lecture 5: Parametric Hypothesis Testing: Comparing Means. GENOME 560, Spring 2016 Doug Fowler, GS

Lecture 5: Parametric Hypothesis Testing: Comparing Means. GENOME 560, Spring 2016 Doug Fowler, GS Lecture 5: Parametric Hypothesis Testig: Comparig Meas GENOME 560, Sprig 2016 Doug Fowler, GS (dfowler@uw.edu) 1 Review from last week What is a cofidece iterval? 2 Review from last week What is a cofidece

More information

Lecture 9: September 19

Lecture 9: September 19 36-700: Probability ad Mathematical Statistics I Fall 206 Lecturer: Siva Balakrisha Lecture 9: September 9 9. Review ad Outlie Last class we discussed: Statistical estimatio broadly Pot estimatio Bias-Variace

More information

Lecture 15: Learning Theory: Concentration Inequalities

Lecture 15: Learning Theory: Concentration Inequalities STAT 425: Itroductio to Noparametric Statistics Witer 208 Lecture 5: Learig Theory: Cocetratio Iequalities Istructor: Ye-Chi Che 5. Itroductio Recall that i the lecture o classificatio, we have see that

More information

The picture in figure 1.1 helps us to see that the area represents the distance traveled. Figure 1: Area represents distance travelled

The picture in figure 1.1 helps us to see that the area represents the distance traveled. Figure 1: Area represents distance travelled 1 Lecture : Area Area ad distace traveled Approximatig area by rectagles Summatio The area uder a parabola 1.1 Area ad distace Suppose we have the followig iformatio about the velocity of a particle, how

More information

Stat 319 Theory of Statistics (2) Exercises

Stat 319 Theory of Statistics (2) Exercises Kig Saud Uiversity College of Sciece Statistics ad Operatios Research Departmet Stat 39 Theory of Statistics () Exercises Refereces:. Itroductio to Mathematical Statistics, Sixth Editio, by R. Hogg, J.

More information

IIT JAM Mathematical Statistics (MS) 2006 SECTION A

IIT JAM Mathematical Statistics (MS) 2006 SECTION A IIT JAM Mathematical Statistics (MS) 6 SECTION A. If a > for ad lim a / L >, the which of the followig series is ot coverget? (a) (b) (c) (d) (d) = = a = a = a a + / a lim a a / + = lim a / a / + = lim

More information

Output Analysis and Run-Length Control

Output Analysis and Run-Length Control IEOR E4703: Mote Carlo Simulatio Columbia Uiversity c 2017 by Marti Haugh Output Aalysis ad Ru-Legth Cotrol I these otes we describe how the Cetral Limit Theorem ca be used to costruct approximate (1 α%

More information

Naïve Bayes. Naïve Bayes

Naïve Bayes. Naïve Bayes Statistical Data Miig ad Machie Learig Hilary Term 206 Dio Sejdiovic Departmet of Statistics Oxford Slides ad other materials available at: http://www.stats.ox.ac.uk/~sejdiov/sdmml : aother plug-i classifier

More information