Maximum Likelihood Methods (Hogg Chapter Six)


Maximum Likelihood Methods (Hogg Chapter Six)
STAT 406-01: Mathematical Statistics II, Spring Semester 2016

Contents

0 Administrata
1 Maximum Likelihood Estimation
  1.1 Maximum Likelihood Estimates
    1.1.1 Motivation
    1.1.2 Example: Signal Amplitude
    1.1.3 Example: Bradley-Terry Model
    1.1.4 Reparameterization
  1.2 Fisher Information and the Cramér-Rao Bound
    1.2.1 Fisher Information
    1.2.2 The Cramér-Rao Bound
2 Maximum-Likelihood Tests
  2.1 Likelihood Ratio Test
    2.1.1 Example: Gaussian Distribution
  2.2 Wald Test
    2.2.1 Example: Gaussian Distribution
  2.3 Rao Scores Test
    2.3.1 Example: Gaussian Distribution
3 Multi-Parameter Methods
  3.1 Maximum Likelihood Estimation
  3.2 Fisher Information Matrix
    3.2.1 Example: Normal distribution
    3.2.2 Fisher Information Matrix for a Random Sample
    3.2.3 Error Estimation
  3.3 Maximum Likelihood Tests
    3.3.1 Example: Mean of Normal Distribution with Unknown Variance
    3.3.2 Example: Mean of Multivariate Normal Distribution with Unit Variance
    3.3.3 Large-Sample Limit

Copyright 2016, John T. Whelan, and all that

Thursday 18 February 2016. Read Section 6.1 of Hogg.

0 Administrata

Effects of snow day: homeworks due on Thursdays through Spring Break. First prelim exam now Thursday, March 3. Shift

back to normal after Spring Break.

1 Maximum Likelihood Estimation

1.1 Maximum Likelihood Estimates

We now return to methods of classical, or frequentist, statistics, which are phrased not in terms of probabilities which quantify our confidence in general propositions with unknown truth values, but exclusively in terms of the frequency of outcomes in hypothetical repetitions of experiments. One of the fundamental tools is the sampling distribution, which we'll write as a joint pdf for a random vector X: f_X(x). If the model has a parameter θ, we'll write this as f_X(x; θ), and when we consider this as a function of θ, we'll write it L(θ; x) or sometimes L(θ). And we will often find it convenient to work with the logarithm ℓ(θ) = ln L(θ).

Recall from last semester that one possible choice of an estimate for θ given an observed data vector x is the maximum likelihood estimate

    θ̂(x) = argmax_θ L(θ; x) = argmax_θ ℓ(θ; x)    (1.1)

If we take the functional form of θ̂(x) and insert the random vector X into the function, we get a statistic θ̂ = θ̂(X) which is known as the maximum likelihood estimator. Note that this is a random variable, not in the formal sense which we invoked to assign a Bayesian probability distribution to an unknown parameter Θ, but rather a function of the random vector X whose value will be different in different random trials, even with a fixed value for the parameter θ. Of course, if we're using θ̂ as an estimator, the true value of the parameter θ will be unknown.

A useful special case is where the random vector X is a sample of size n drawn from some distribution f(x; θ). Then the likelihood function will be

    L(θ; x) = ∏_{i=1}^n f(x_i; θ)    (1.2)

and the log-likelihood will be

    ℓ(θ; x) = ∑_{i=1}^n ln f(x_i; θ)    (1.3)

1.1.1 Motivation

Since we know that in the Bayesian framework, the posterior pdf for the parameter θ given the observed data x is

    f_{Θ|X}(θ|x) = f_Θ(θ) f_{X|Θ}(x|θ) / f_X(x)    (1.4)

we see that if the prior pdf f_Θ(θ) is a constant, the posterior will be proportional to f_{X|Θ}(x|θ) = L(θ; x), and therefore the θ̂(x) which maximizes the likelihood will also maximize the posterior pdf f_{Θ|X}(θ|x).
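As a concrete illustration (not from Hogg), the argmax definition of the maximum likelihood estimate can be sketched numerically; here a crude grid search over θ recovers the known closed-form MLE for the mean of a N(θ, 1) sample, namely the sample mean. The data values and grid are made up for the sketch:

```python
def log_likelihood(theta, x, sigma=1.0):
    # log-likelihood of a N(theta, sigma^2) sample, up to an additive constant
    return -sum((xi - theta) ** 2 for xi in x) / (2 * sigma ** 2)

x = [0.5, 1.5, 1.0, 2.0]  # hypothetical data; sample mean is 1.25

# crude grid search for the maximizing theta
grid = [i / 1000 for i in range(-2000, 4001)]
theta_hat = max(grid, key=lambda t: log_likelihood(t, x))
```

Since the log-likelihood is strictly concave here, the grid maximizer sits at the grid point closest to the sample mean.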
Hogg also invokes two theorems which indicate that the MLE matches up with the true unknown value of the parameter in the limit that the size of a sample goes to infinity:

Theorem 6.1.1: Suppose θ_0 is the true value of a parameter θ, and X is a sample of size n from a distribution f(x; θ). Given certain regularity conditions, most importantly that θ_0 is not at the boundary of the parameter space for θ, the true parameter maximizes the likelihood in the n → ∞ limit, i.e.,

    lim_{n→∞} P[L(θ_0; X) > L(θ; X)] = 1  for θ ≠ θ_0    (1.5)

We also need different θ values to give different pdfs for X, and the support space for X to be the same for all θ.

Theorem 6.1.3: Under the same regularity conditions, the maximum likelihood estimator, if it exists, converges in probability to θ_0:

    θ̂ → θ_0 in probability    (1.6)

The first theorem says that the true value becomes the maximum likelihood value in the limit of an infinitely large sample, while the second says that the MLE becomes the true value in that limit.

1.1.2 Example: Signal Amplitude

See Hogg for several examples of maximum-likelihood solutions associated with random samples. To complement those, we're going to consider a couple of cases where the random vector is not quite a random sample, but is still drawn from a distribution with a parameter θ. First, consider a simplified example from signal processing. Suppose you have a set of data points {x_i} which represent a signal with a known shape, described by {h_i}, and an unknown amplitude θ, on top of which there is some Gaussian noise with variances {σ_i²}. I.e., the distribution associated with each data value is X_i ~ N(θh_i, σ_i²). This means the likelihood function is

    L(θ; x) = ∏_{i=1}^n 1/(σ_i√(2π)) exp(−(x_i − θh_i)²/(2σ_i²))    (1.7)

and the log-likelihood is

    ℓ(θ) = −∑_{i=1}^n (x_i − θh_i)²/(2σ_i²) + const    (1.8)

The derivative is

    ℓ′(θ) = ∑_{i=1}^n h_i(x_i − θh_i)/σ_i² = ∑_{i=1}^n h_i x_i/σ_i² − θ ∑_{i=1}^n h_i²/σ_i²    (1.9)

so the maximum likelihood estimate is

    θ̂ = (∑_{i=1}^n h_i x_i/σ_i²) / (∑_{j=1}^n h_j²/σ_j²)    (1.10)

We can also write this in matrix form, which would allow generalization to covariances between data points via a non-diagonal variance-covariance matrix Σ:

    θ̂ = (hᵀΣ⁻¹h)⁻¹ hᵀΣ⁻¹x    (1.11)

This makes the signal model, with maximum-likelihood amplitude,

    hθ̂ = h(hᵀΣ⁻¹h)⁻¹hᵀΣ⁻¹x = Σ^{1/2} P Σ^{−1/2} x    (1.12)

where

    P = Σ^{−1/2} h (hᵀΣ⁻¹h)⁻¹ hᵀ Σ^{−1/2}    (1.13)

is a projection matrix. Note that in a more complicated problem, we might have a superposition of k different templates, each with its own amplitude,² so the signal would be X ~ N(∑_{i=1}^k θ_i h_i, Σ).

1.1.3 Example: Bradley-Terry Model

The Bradley-Terry model³ is designed to model paired comparisons between objects: teams or players in games, foods in taste tests, etc. Each object has a strength π_i, 0 < π_i < ∞, and in a comparison between two objects, object i will be chosen over object j with probability π_i/(π_i + π_j).
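Before continuing with the Bradley-Terry model, here is a minimal numerical sketch of the signal-amplitude estimate just derived (the template, noise variances, and true amplitude below are made up). With noiseless data the weighted-sum formula recovers the amplitude essentially exactly; with noise, it scatters around the true value:

```python
import random

def amplitude_mle(x, h, var):
    # theta_hat = [sum_i h_i x_i / sigma_i^2] / [sum_j h_j^2 / sigma_j^2]
    num = sum(hi * xi / vi for hi, xi, vi in zip(h, x, var))
    den = sum(hi * hi / vi for hi, vi in zip(h, var))
    return num / den

h = [1.0, 2.0, -1.0, 0.5]   # hypothetical known signal shape
var = [0.1, 0.2, 0.1, 0.4]  # hypothetical noise variances
theta_true = 3.0

# noiseless data: the estimate is exact (up to rounding)
theta_clean = amplitude_mle([theta_true * hi for hi in h], h, var)

# noisy data: the estimate scatters around theta_true
random.seed(0)
x = [theta_true * hi + random.gauss(0.0, vi ** 0.5) for hi, vi in zip(h, var)]
theta_noisy = amplitude_mle(x, h, var)
```

The standard deviation of the noisy estimate is 1/√(∑ h_j²/σ_j²), which for these made-up numbers is about 0.16.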
Consider a restricted

² E.g., for continuous gravitational waves, you have two polarizations, each with amplitude and phase, so four parameters.
³ Zermelo, Mathematische Zeitschrift 29; Bradley and Terry, Biometrika 39.

version of the model where an object with unknown strength θ is compared to a series of n objects⁴ with known strengths π_1, π_2, ..., π_n. Let X_i be a Bernoulli random variable which is 1 if the object wins the ith comparison and 0 if it loses, so that

    X_i ~ b(1, θ/(θ + π_i))    (1.14)

Then the likelihood is

    L(θ; x) = f_{X|Θ}(x|θ) = ∏_{i=1}^n θ^{x_i} π_i^{1−x_i} / (θ + π_i)    (1.15)

and the log-likelihood is

    ℓ(θ) = ∑_{i=1}^n [x_i ln θ + (1 − x_i) ln π_i − ln(θ + π_i)]    (1.16)

and

    ℓ′(θ) = ∑_{i=1}^n [x_i/θ − 1/(θ + π_i)]    (1.17)

If we define y = ∑_{i=1}^n x_i to be the total number of comparisons won, the MLE for θ satisfies

    y = ∑_{i=1}^n x_i = ∑_{i=1}^n θ̂/(θ̂ + π_i)    (1.18)

The right-hand side is the average number of comparison wins you'd predict, and the MLE is just the strength which makes this equal to the actual number observed. Although (1.18) can't be solved in closed form, one can find the MLE numerically by iterating⁵

    θ̂ ← y / ∑_{i=1}^n 1/(θ̂ + π_i)    (1.19)

⁴ Note that, since the strengths are assumed to be known, some of these comparisons may actually be repeated comparisons with the same object.
⁵ Ford, American Mathematical Monthly 64.

1.1.4 Reparameterization

A key property of maximum-likelihood estimates is captured in Hogg's Theorem 6.1.2: if you change variables in the likelihood from θ to some other η = g(θ), and work out the MLE for η, it is η̂ = g(θ̂). The proof is almost trivial, so rather than repeat it, we'll demonstrate the property in our Bradley-Terry example. Let λ = ln θ, so that −∞ < λ < ∞. The calculation of the log-likelihood proceeds as before, and we can substitute θ = e^λ into (1.16) to get

    ℓ(λ) = ∑_{i=1}^n [x_i λ + (1 − x_i) ln π_i − ln(e^λ + π_i)]    (1.20)

The derivative is

    ℓ′(λ) = ∑_{i=1}^n [x_i − e^λ/(e^λ + π_i)]    (1.21)

so the maximum-likelihood equation is

    y = ∑_{i=1}^n x_i = ∑_{i=1}^n e^λ̂/(e^λ̂ + π_i)    (1.22)

which is just the same as (1.18) with θ̂ replaced by e^λ̂.

Tuesday 23 February 2016. Read Section 6.2 of Hogg.

1.2 Fisher Information and the Cramér-Rao Bound

We now turn to a consideration of the properties of estimators and other statistics. Suppose we have a random vector X which

is a random sample from a distribution f(x; θ) which includes a parameter θ. We know the joint sampling distribution is

    f_{X|θ}(x_1, ..., x_n|θ) = ∏_{i=1}^n f(x_i; θ)    (1.23)

If we have a statistic Y = u(X), this is a random variable. The mean and variance of Y are not random variables, but they depend on θ because the sampling distribution does:

    µ_Y(θ) = E(Y|θ) = ∫⋯∫ u(x_1, ..., x_n) ∏_{i=1}^n [f(x_i; θ) dx_i]    (1.24)

and

    Var(Y|θ) = E([Y − µ_Y(θ)]²|θ) = ∫ [u(x) − µ_Y(θ)]² f_{X|θ}(x|θ) dⁿx    (1.25)

1.2.1 Fisher Information

The first such function we want to consider is the derivative with respect to the parameter θ of the log-likelihood ℓ(θ; x) = ln f(x; θ). We will focus first on the case of a single random variable before considering a random sample. We will also assume an even broader set of regularity conditions than last time, specifically that we're allowed to interchange derivatives and integrals. The quantity of interest is

    u(x) = ℓ′(θ, x) = ∂ ln f(x; θ)/∂θ = [∂f(x; θ)/∂θ] / f(x; θ)    (1.26)

and it is one measure of how much the probability distribution changes when we change the parameter. Note that in this case the function u(x) depends explicitly on the parameter, in addition to the θ dependence of the distribution for the random variable X. u(X) = ℓ′(θ, X) is a statistic whose expectation value is

    E[ℓ′(θ, X)] = ∫_S {[∂f(x; θ)/∂θ]/f(x; θ)} f(x; θ) dx = ∫_S [∂f(x; θ)/∂θ] dx    (1.27)

where we explicitly restrict the integral to the support space S, i.e., the set of x for which f(x; θ) > 0. Now the regularity conditions (especially the fact that S is the same for any allowed value of θ) allow us to pull the derivative outside the integral, so we have

    E[ℓ′(θ, X)] = (d/dθ) ∫_S f(x; θ) dx = (d/dθ) 1 = 0    (1.28)

where we have used the fact that f(x; θ) is a normalized pdf in x. So note that while ℓ′(θ̂(x), x) = 0, i.e., the derivative of the log-likelihood is zero at the maximum-likelihood value of θ for a given x, the expectation value E[ℓ′(θ, X)] vanishes for any θ. This allows us to write the variance as

    Var[ℓ′(θ, X)] = E([ℓ′(θ, X)]²) = ∫_S [∂ ln f(x; θ)/∂θ]² f(x; θ) dx ≡ I(θ)    (1.29)

I(θ) is a property of the distribution known as the Fisher information. You've encountered it on the homework in the context of the Jeffreys prior f_Θ(θ) ∝ √I(θ).
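Looking back at the Bradley-Terry example: the maximum-likelihood equation there has no closed-form solution, but the fixed-point iteration quoted from Ford converges quickly in practice. A small sketch with made-up opponent strengths and outcomes:

```python
def bradley_terry_mle(wins, pi, iters=200):
    # iterate theta <- y / sum_i 1/(theta + pi_i), with y the total number of wins
    y = sum(wins)
    theta = 1.0  # any positive starting value
    for _ in range(iters):
        theta = y / sum(1.0 / (theta + p) for p in pi)
    return theta

pi = [0.5, 1.0, 2.0, 4.0]  # hypothetical known strengths
wins = [1, 1, 0, 1]        # hypothetical outcomes: 3 wins out of 4
theta_hat = bradley_terry_mle(wins, pi)

# at the fixed point, the predicted number of wins equals the observed total
predicted = sum(theta_hat / (theta_hat + p) for p in pi)
```

The check at the end is exactly the maximum-likelihood equation: the fitted strength makes the expected win count match the observed one. (The iteration requires 0 < y < n; with all wins or all losses the MLE runs off to a boundary.)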
Note that if we go back and differentiate (1.28) we get

    0 = (d/dθ) E[ℓ′(θ, X)] = (d/dθ) ∫_S [∂ ln f(x; θ)/∂θ] f(x; θ) dx
      = ∫_S [∂² ln f(x; θ)/∂θ²] f(x; θ) dx + ∫_S [∂ ln f(x; θ)/∂θ] [∂f(x; θ)/∂θ] dx
      = ∫_S [∂² ln f(x; θ)/∂θ²] f(x; θ) dx + ∫_S [∂ ln f(x; θ)/∂θ]² f(x; θ) dx    (1.30)

This means there's an equivalent, and sometimes easier to calculate, way to write the Fisher information:

    I(θ) = −∫_S [∂² ln f(x; θ)/∂θ²] f(x; θ) dx = −E[ℓ″(θ; X)]    (1.31)

Example: Fisher information for the mean of a normal distribution. To give a very simple example, consider the normal distribution N(θ, σ²). The likelihood is

    f(x; θ) = 1/(σ√(2π)) exp(−(x − θ)²/(2σ²))    (1.32)

so the log-likelihood is

    ln f(x; θ) = −(x − θ)²/(2σ²) − ½ ln(2πσ²)    (1.33)

whose derivatives are

    ∂ ln f(x; θ)/∂θ = (x − θ)/σ²    (1.34a)
    ∂² ln f(x; θ)/∂θ² = −1/σ²    (1.34b)

So the Fisher information is I(θ) = 1/σ². Note in passing that the first derivative is (X − θ)/σ², whose expectation value is indeed zero.

Fisher information for a sample of size n. We can now consider the Fisher information associated with a random sample of size n, given that

    I(θ) = −E[∂² ln f(X; θ)/∂θ²]    (1.35)

In fact, since the joint sampling distribution, and therefore the likelihood for the sample, is

    f_{X|Θ}(x|θ) = ∏_{i=1}^n f(x_i; θ)    (1.36)

the log-likelihood will be

    ln f_{X|Θ}(x|θ) = ∑_{i=1}^n ln f(x_i; θ)    (1.37)

and therefore the Fisher information for the sample is

    −E[∂² ln f_{X|Θ}(X|θ)/∂θ²] = −∑_{i=1}^n E[∂² ln f(X_i; θ)/∂θ²] = n I(θ)    (1.38)

where we've used the fact that both the derivative and the expectation value are linear operations, and that the random variables {X_i} are iid. Note that the Fisher information for a sample of size n from a normal distribution is n/σ², which is one over the variance of the sample mean.

1.2.2 The Cramér-Rao Bound

Now we return to consideration of a general statistic Y = u(X), with a special eye towards one being used as an estimator of the parameter θ. Recalling the definition of the expectation value

    µ_Y(θ) = E(Y) = ∫ u(x_1, ..., x_n) f_{X|Θ}(x; θ) dⁿx    (1.39)

and variance

    Var(Y|θ) = E([Y − µ_Y(θ)]²|θ) = ∫ [u(x) − µ_Y(θ)]² f_{X|θ}(x|θ) dⁿx    (1.40)

we notice a similarity to the Fisher information for the sample

    n I(θ) = E([∂ ln f_{X|Θ}(X|θ)/∂θ]²)    (1.41)

Both are of the form E([a(X)]²|θ) = ⟨a|a⟩, where we have defined the inner product

    ⟨a|b⟩ = E[a(X) b(X)|θ] = ∫ a(x) b(x) f(x; θ) dⁿx    (1.42)

This behaves in many ways like a dot product a·b, and in particular satisfies the Cauchy-Schwarz inequality⁶

    ⟨a|a⟩⟨b|b⟩ ≥ ⟨a|b⟩²    (1.43)

This means that

    Var(Y|θ) n I(θ) ≥ (E[(Y − µ_Y(θ)) ∂ ln f_{X|Θ}(X|θ)/∂θ])²

and, since E[∂ ln f_{X|Θ}(X|θ)/∂θ] = 0, the right-hand side can be evaluated as

    E[u(X) ∂ ln f_{X|Θ}(X|θ)/∂θ] = ∫ u(x) [∂f_{X|Θ}(x; θ)/∂θ] dⁿx = (d/dθ) ∫ u(x) f_{X|Θ}(x; θ) dⁿx = µ_Y′(θ)    (1.44)

where the last step holds as long as u(x) doesn't depend explicitly on θ. The inequality can be solved for the variance of the statistic Y to give the Cramér-Rao bound:

    Var(Y|θ) ≥ [µ_Y′(θ)]² / (n I(θ))    (1.45)

⁶ In vector algebra, this is a consequence of the law of cosines: (a·b)² = |a|²|b|² cos²φ_{a,b} ≤ |a|²|b|².

In the special case where Y is an unbiased estimator of θ, so that µ_Y(θ) = θ, it simplifies to

    Var(Y|θ) ≥ 1/(n I(θ))  if E(Y|θ) = θ    (1.46)

I.e., the variance of an unbiased estimator can be no less than the reciprocal of the Fisher information of the sample. To return to our simple example of a Gaussian sampling distribution N(θ, σ²), let Y = X̄, the sample mean, which we know is an unbiased estimator of the distribution mean θ. Then the Cramér-Rao bound tells us

    Var(X̄|θ) ≥ 1/(n I(θ)) = σ²/n    (1.47)

Of course, we know that the variance of the sample mean is σ²/n, which means that in this case the bound is saturated, and the sample mean is the lowest-variance unbiased estimator of the distribution mean for a normal distribution with known variance. When the Cramér-Rao bound is saturated, i.e., Var(Y|θ) = 1/(n I(θ)) for an unbiased estimator, we say that Y is an efficient estimator of θ. Otherwise, we can define the ratio of the Cramér-Rao bound to the actual variance as the efficiency of the estimator. Another selling point of maximum likelihood estimates is their relationship to efficiency. An MLE is not always an efficient estimator, but one can prove using Taylor series that it becomes so in the limit that the sample size goes to infinity.
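The saturation of the bound by the sample mean can be seen in a small simulation (a sketch; the sample size, σ, and trial count below are arbitrary): the empirical variance of X̄ over many trials comes out close to σ²/n = 1/(n I(θ)).

```python
import random

random.seed(1)
n, sigma, theta, trials = 25, 2.0, 0.0, 4000

means = []
for _ in range(trials):
    sample = [random.gauss(theta, sigma) for _ in range(n)]
    means.append(sum(sample) / n)

mbar = sum(means) / trials
emp_var = sum((m - mbar) ** 2 for m in means) / trials  # empirical Var(Xbar)
cramer_rao = sigma ** 2 / n                             # 1/(n I(theta)) = sigma^2/n
```

With 4000 trials the Monte Carlo error on the empirical variance is a few percent, well inside the tolerance checked below.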
Another important theorem, the proof of which we omit for now, is that, subject to the usual regularity conditions, the distribution for the MLE converges to a Gaussian:

    √n [θ̂(X) − θ] → N(0, 1/I(θ)) in distribution    (1.48)
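As a sanity check on the two equivalent expressions for the Fisher information, one can integrate the squared score against the pdf numerically for the N(θ, σ²) example and compare with the known answer I(θ) = 1/σ². This is a midpoint-rule sketch; the grid limits are arbitrary but wide enough that the truncated tails are negligible:

```python
import math

def normal_pdf(x, theta, sigma):
    return math.exp(-(x - theta) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def fisher_info_numeric(theta, sigma, lo=-30.0, hi=30.0, steps=200000):
    # I(theta) = integral of [d ln f / d theta]^2 f(x; theta) dx,
    # where the score for N(theta, sigma^2) is (x - theta)/sigma^2
    dx = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * dx
        score = (x - theta) / sigma ** 2
        total += score ** 2 * normal_pdf(x, theta, sigma) * dx
    return total

I_num = fisher_info_numeric(theta=1.0, sigma=2.0)  # exact value is 1/sigma^2 = 0.25
```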

Tuesday 1 March 2016. Read Section 6.3 of Hogg.

2 Maximum-Likelihood Tests

So far we've used maximum likelihood methods to consider ways to estimate the parameter θ of a distribution f(x; θ). We now consider tests designed to compare a null hypothesis H_0, which specifies θ = θ_0, and H_1, which allows for arbitrary θ within the allowed parameter space. Note that since these are classical hypothesis tests, we don't require H_1 to specify a prior probability distribution for θ. In each case, we will construct a test statistic from a sample of size n, drawn from f(x; θ), which is asymptotically chi-square distributed in the n → ∞ limit.

2.1 Likelihood Ratio Test

Define as usual the likelihood and its logarithm

    L(θ; x) = f_{X|Θ}(x|θ) = ∏_{i=1}^n f(x_i; θ)    (2.1)
    ℓ(θ; x) = ln f_{X|Θ}(x|θ) = ∑_{i=1}^n ln f(x_i; θ)    (2.2)

Recall that for Bayesian hypothesis testing, the natural quantity to construct was the Bayes factor

    B_{01} = f_X(x|H_0)/f_X(x|H_1) = L(θ_0; x) / ∫_Ω L(θ; x) f_Θ(θ|H_1) dθ    (2.3)

In the frequentist case we don't define a prior f_Θ(θ|H_1), so instead, of all the θ values at which to evaluate the likelihood, we choose the one which maximizes it,⁷ and look at the likelihood ratio

    Λ(x) = L(θ_0; x) / L(θ̂(x); x)    (2.4)

Note that in the frequentist picture L(θ; X) and Λ(X) are statistics, i.e., random variables constructed from the sample X.

2.1.1 Example: Gaussian Distribution

Consider a sample of size n drawn from a N(θ, σ²) distribution, for which we know the MLE is x̄ and the likelihood is

    L(θ; x) = (σ√(2π))^{−n} exp(−∑_{i=1}^n (x_i − x̄)²/(2σ²)) exp(−n(θ − x̄)²/(2σ²))    (2.5)

and the likelihood ratio is

    Λ(x) = exp(−n(θ_0 − x̄)²/(2σ²))    (2.6)

Since X̄ ~ N(θ, σ²/n), we have

    −2 ln Λ(X) = (X̄ − θ_0)²/(σ²/n) ~ χ²(1), assuming H_0    (2.7)

Now, this won't be exactly true in general, but the various asymptotic theorems show that, for a regular underlying distribution, it's true in the asymptotic sense:

    χ²_L ≡ −2 ln Λ(X) → χ²(1) in distribution, assuming H_0    (2.8)

⁷ There are various arguments which make this a reasonable choice, most revolving around the fact that for large samples, the likelihood function is approximately Gaussian with a peak value of L(θ̂(x); x).
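For the Gaussian example above, the statistic −2 ln Λ can be computed either from the two log-likelihoods directly or from the closed form n(x̄ − θ_0)²/σ². A quick sketch, on made-up data, confirming the two agree:

```python
import math

def log_lik(theta, x, sigma):
    # log of the N(theta, sigma^2) sample likelihood
    n = len(x)
    return (-sum((xi - theta) ** 2 for xi in x) / (2 * sigma ** 2)
            - n * math.log(sigma * math.sqrt(2 * math.pi)))

x = [1.2, 0.8, 1.5, 0.9, 1.1]  # hypothetical data
sigma, theta0 = 1.0, 0.0
xbar = sum(x) / len(x)          # the MLE for the mean

minus2lnLambda = -2 * (log_lik(theta0, x, sigma) - log_lik(xbar, x, sigma))
closed_form = len(x) * (xbar - theta0) ** 2 / sigma ** 2
```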

For finite n, we assume this is approximately true, and compare −2 ln Λ(X) to the 100(1 − α)th percentile of the χ²(1) distribution to get a test with false alarm probability α.

2.2 Wald Test

The statistic −2 ln Λ(X) constructed from the likelihood ratio is only one of several statistics which are asymptotically χ²(1). Another possibility is to consider the result that

    θ̂(X) − θ_0 → N(0, 1/(n I(θ_0))) in distribution    (2.9)

and construct the statistic

    χ²_W = n I(θ̂(X)) [θ̂(X) − θ_0]² = [θ̂(X) − θ_0]² / [1/(n I(θ̂(X)))]    (2.10)

Comparing this to the percentiles of a χ²(1) is known as the Wald test.

2.2.1 Example: Gaussian Distribution

The distribution of the statistic χ²_W should converge to χ²(1) as n → ∞. We can see what the exact distribution is for a given choice of the underlying sampling distribution. Assuming again a N(θ, σ²) distribution for f(x; θ), we have Fisher information I(θ) = 1/σ², maximum likelihood estimator θ̂(X) = X̄, and Wald test statistic

    χ²_W = n(X̄ − θ_0)²/σ²    (2.11)

which is of the same form as (2.7), which we know to be exactly χ²(1) for any n.

2.3 Rao Scores Test

The final likelihood-based test involves the score

    ∂ ln f(x_i; θ)/∂θ    (2.12)

associated with each random variable in the sample, whose sum is the derivative of the log-likelihood

    ℓ′(θ; X) = ∑_{i=1}^n ∂ ln f(X_i; θ)/∂θ    (2.13)

We know the expectation value is

    E[ℓ′(θ; X)] = ∑_{i=1}^n E[∂ ln f(X_i; θ)/∂θ] = 0    (2.14)

and the variance is

    Var[ℓ′(θ; X)] = ∑_{i=1}^n Var[∂ ln f(X_i; θ)/∂θ] = n I(θ)    (2.15)

so we construct a test statistic which, in the limit that the sum of the scores is normally distributed, becomes a χ²(1):

    χ²_R = [ℓ′(θ_0; X)]² / (n I(θ_0))    (2.16)

Comparing this to the percentiles of a χ²(1) is known as the Rao scores test. It has the advantage over the other two tests that we don't need to work out the maximum likelihood estimator to apply it.

2.3.1 Example: Gaussian Distribution

If the underlying distribution is N(θ, σ²), so that

    ln f(x; θ) = −½ ln(2πσ²) − (x − θ)²/(2σ²)    (2.17)

and

    ∂ ln f(x; θ)/∂θ = (x − θ)/σ²    (2.18)

the derivative of the log-likelihood is

    ℓ′(θ; X) = ∑_{i=1}^n (X_i − θ)/σ² = n(X̄ − θ)/σ²    (2.19)

which makes the Rao scores statistic

    χ²_R = [n(X̄ − θ_0)/σ²]² / (n/σ²) = (X̄ − θ_0)²/(σ²/n)    (2.20)

which is, again, exactly χ²(1) distributed in this case.

Wednesday 2 March 2016. Review for Prelim Exam One.

The exam covers materials from the first four class-weeks of the term, i.e., the Hogg sections and associated topics covered in class through February 23, and problem sets 1-4.

Thursday 3 March 2016. First Prelim Exam.

Tuesday 8 March 2016. Read Section 6.4 of Hogg.

3 Multi-Parameter Methods

We now consider the case where the probability distribution f(x; θ) depends not just on a single parameter, but on p parameters {θ_1, θ_2, ..., θ_p} = {θ_α} = θ. The likelihood for a sample of size n is then

    L(θ; x) = ∏_{i=1}^n f(x_i; θ)    (3.1)

and the log-likelihood is

    ℓ(θ; x) = ∑_{i=1}^n ln f(x_i; θ)    (3.2)

3.1 Maximum Likelihood Estimation

We're still interested in the set of parameter values {θ̂_1, ..., θ̂_p} which maximize the likelihood or, equivalently, its logarithm. But to maximize a function of p parameters, we need to set all of the partial derivatives to zero, which means there are now p maximum likelihood equations

    ∂ℓ(θ; x)/∂θ_α = 0,  α = 1, ..., p    (3.3)

We can in principle solve these p equations for the p unknowns {θ̂_α}. For example, consider a sample of size n drawn from a N(θ_1, θ_2) distribution with pdf

    f(x; θ_1, θ_2) = 1/√(2πθ_2) exp(−(x − θ_1)²/(2θ_2))    (3.4)

so that

    ln f(x; θ_1, θ_2) = −½ ln(2π) − ½ ln θ_2 − (x − θ_1)²/(2θ_2)    (3.5)

The partial derivatives are

    ∂ ln f(x; θ_1, θ_2)/∂θ_1 = (x − θ_1)/θ_2    (3.6a)
    ∂ ln f(x; θ_1, θ_2)/∂θ_2 = −1/(2θ_2) + (x − θ_1)²/(2θ_2²)    (3.6b)

Taking the partial derivatives of (3.2) and using linearity then gives us

    ∂ℓ(x; θ_1, θ_2)/∂θ_1 = ∑_{i=1}^n (x_i − θ_1)/θ_2    (3.7a)
    ∂ℓ(x; θ_1, θ_2)/∂θ_2 = −n/(2θ_2) + ∑_{i=1}^n (x_i − θ_1)²/(2θ_2²)    (3.7b)

We need to set both of these to zero to find θ̂_1 and θ̂_2, i.e., the coupled maximum-likelihood equations are

    ∑_{i=1}^n (x_i − θ̂_1)/θ̂_2 = 0    (3.8a)
    −n/(2θ̂_2) + ∑_{i=1}^n (x_i − θ̂_1)²/(2θ̂_2²) = 0    (3.8b)

We can solve the first equation for

    θ̂_1 = (1/n) ∑_{i=1}^n x_i = x̄    (3.9)

and then substitute that into the second to get

    θ̂_2 = (1/n) ∑_{i=1}^n (x_i − x̄)²    (3.10)

i.e.,

    θ̂_2 = [(n − 1)/n] s²    (3.11)

where s² = [1/(n − 1)] ∑_{i=1}^n (x_i − x̄)² is the usual unbiased sample variance.

3.2 Fisher Information Matrix

If we consider the score statistic for a single observation

    ∂ ln f(x; θ)/∂θ_α    (3.12)

we see that it's now actually a random vector with one component per parameter. It is still true that

    E[∂ ln f(X; θ)/∂θ_α] = ∫_S [∂f(x; θ)/∂θ_α] dx = 0    (3.13)

You can show this by taking ∂/∂θ_α of the normalization integral for f(x; θ). Since this is a random vector, we can construct its variance-covariance matrix I(θ) with elements

    I_{αβ}(θ) = E([∂ ln f(X; θ)/∂θ_α][∂ ln f(X; θ)/∂θ_β])    (3.14)

As before, we can differentiate (3.13) with respect to θ_β to get the identity

    0 = ∫_S (∂/∂θ_β){[∂ ln f(x; θ)/∂θ_α] f(x; θ)} dx
      = ∫_S [∂² ln f(x; θ)/∂θ_β ∂θ_α] f(x; θ) dx + ∫_S [∂ ln f(x; θ)/∂θ_α][∂ ln f(x; θ)/∂θ_β] f(x; θ) dx    (3.15)

i.e., an equivalent way of writing the Fisher information matrix is

    I_{αβ}(θ) = −E[∂² ln f(X; θ)/∂θ_α ∂θ_β]    (3.16)

Note that by definition and construction, the Fisher matrix is symmetric.
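The joint MLE just derived, θ̂_1 = x̄ and θ̂_2 = (1/n)∑(x_i − x̄)², is a one-liner to check on a small made-up data set, along with its relation to the unbiased sample variance s²:

```python
def normal_joint_mle(x):
    # theta1_hat = sample mean; theta2_hat = mean squared deviation (divide by n, not n-1)
    n = len(x)
    t1 = sum(x) / n
    t2 = sum((xi - t1) ** 2 for xi in x) / n
    return t1, t2

x = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical data
theta1_hat, theta2_hat = normal_joint_mle(x)

# relation to the unbiased sample variance: theta2_hat = (n-1)/n * s^2
n = len(x)
s2 = sum((xi - theta1_hat) ** 2 for xi in x) / (n - 1)
```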

3.2.1 Example: Normal distribution

If we return to the case of a N(θ_1, θ_2) distribution, taking derivatives of (3.6) gives us

    ∂² ln f(x; θ_1, θ_2)/∂θ_1² = −1/θ_2    (3.17a)
    ∂² ln f(x; θ_1, θ_2)/∂θ_1 ∂θ_2 = −(x − θ_1)/θ_2²    (3.17b)
    ∂² ln f(x; θ_1, θ_2)/∂θ_2² = 1/(2θ_2²) − (x − θ_1)²/θ_2³    (3.17c)

If we recall E(X) = θ_1 and Var(X) = E[(X − θ_1)²] = θ_2, we get the Fisher information matrix elements I_{αβ}(θ) = −E[∂² ln f(x; θ_1, θ_2)/∂θ_α ∂θ_β]:

    I_{11} = 1/θ_2    (3.18a)
    I_{22} = −1/(2θ_2²) + θ_2/θ_2³ = 1/(2θ_2²)    (3.18b)
    I_{12} = I_{21} = 0    (3.18c)

i.e.,

    I(θ) = ( 1/θ_2    0         )
           ( 0        1/(2θ_2²) )    (3.19)

3.2.2 Fisher Information Matrix for a Random Sample

As in the single-parameter case, linearity means that the Fisher information matrix for a sample of size n is just n times the FIM for the distribution:

    −E[∂²ℓ(θ; X)/∂θ_α ∂θ_β] = −n E[∂² ln f(X; θ)/∂θ_α ∂θ_β] = n I_{αβ}(θ)    (3.20)

3.2.3 Error Estimation

Recall that for a model with a single parameter θ, the Cramér-Rao bound stated that an unbiased estimator of θ had minimum variance 1/(n I(θ)). Also, the maximum-likelihood estimator θ̂ from a sample of size n saturated that bound in the n → ∞ limit, and in fact its distribution converged to a Gaussian with the minimum variance specified by the bound:

    √n (θ̂ − θ) → N(0, 1/I(θ)) in distribution    (3.21)

In the multi-parameter case, we have a maximum-likelihood estimator θ̂ which is a random vector with p components {θ̂_α | α = 1, ..., p}. The corresponding limiting distribution is a multivariate normal:

    √n (θ̂ − θ) → N_p(0, [I(θ)]⁻¹) in distribution    (3.22)

The matrix inverse [I(θ)]⁻¹ of the Fisher information matrix is the variance-covariance matrix of the limiting distribution. It is also this inverse matrix which provides the multi-parameter version of the Cramér-Rao bound. If T_α(X) is an unbiased estimator of one parameter θ_α, the lower bound on its variance is

    Var[T_α(X)] ≥ (1/n) [I(θ)⁻¹]_{αα}    (3.23)

I.e., the diagonal elements of the inverse Fisher matrix provide lower bounds on the variance of unbiased estimators of the parameters. In cases like (3.19), where the Fisher matrix is diagonal, its inverse is also diagonal, with [I(θ)⁻¹]_{αα} being the same

as 1/[I(θ)]_{αα}. But in general,⁸

    [I(θ)⁻¹]_{αα} ≥ 1/[I(θ)]_{αα}    (3.24)

This is thus a stronger bound than one would get by assuming all of the nuisance parameters to be known and applying the usual Cramér-Rao bound to the parameter of interest.

⁸ This is easy to see in the case p = 2, where

    I = ( I_{11}  I_{12} )    and    I⁻¹ = 1/(I_{11}I_{22} − I_{12}²) (  I_{22}  −I_{12} )
        ( I_{12}  I_{22} )                                           ( −I_{12}   I_{11} )

so that [I⁻¹]_{11} = I_{22}/(I_{11}I_{22} − I_{12}²) = 1/(I_{11} − I_{12}²/I_{22}) ≥ 1/I_{11}.

Thursday 10 March 2016. Read Section 6.5 of Hogg.

3.3 Maximum Likelihood Tests

We return now to the question of hypothesis testing when the parameter space associated with the distribution is multi-dimensional. Hogg considers this in a somewhat limited context where the null hypothesis H_0 involves picking values for one or more parameters, or combinations of parameters, and leaving the rest unspecified. Formally this is written as H_0: θ ∈ ω, where ω is some lower-dimensional subspace of the parameter space Ω. The alternative hypothesis is H_1: [θ ∈ Ω] and [θ ∉ ω]. The number of parameters in the full parameter space is p, while ω is defined by specifying values for q independent functions of the parameters, where 0 < q ≤ p. This means that ω is a (p − q)-dimensional space. The case considered before was p = q = 1, so that ω was a zero-dimensional point in the one-dimensional parameter space Ω.

Since H_0 is also a composite hypothesis with different possible parameter values, if we were constructing a Bayes factor, we'd also need to specify a prior distribution for θ ∈ ω associated with H_0:

    B_{01} = f_X(x|H_0)/f_X(x|H_1) = [∫_ω L(θ; x) f_Θ(θ|H_0) d^{p−q}θ] / [∫_Ω L(θ; x) f_Θ(θ|H_1) d^pθ]    (3.25)

In classical statistics we don't have a prior distribution associated with either hypothesis, so we let them each put their best foot forward by picking the parameter values which maximize the likelihood, which we refer to as θ̂(x) = argmax_{θ∈Ω} L(θ; x) and θ̂_0(x) = argmax_{θ∈ω} L(θ; x). The likelihood ratio for use in tests is thus

    Λ(x) = L(θ̂_0(x); x) / L(θ̂(x); x) = max_{θ∈ω} L(θ; x) / max_{θ∈Ω} L(θ; x)    (3.26)

3.3.1 Example: Mean of Normal Distribution with Unknown Variance

As an example of the method, consider a sample of size n drawn from a N(θ_1, θ_2), i.e., a normal distribution where the mean and the variance are both unknown parameters.
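Before working through this example, here is a direct numerical check of the footnote's p = 2 claim that the diagonal element of the inverse information matrix is at least the naive single-parameter bound. The Fisher matrix entries below are made up, with a nonzero off-diagonal element:

```python
def inverse_2x2(m):
    # inverse of a 2x2 matrix via the adjugate formula
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

I = [[2.0, 0.6], [0.6, 0.5]]  # hypothetical Fisher matrix with correlation
I_inv = inverse_2x2(I)

bound_unknown_nuisance = I_inv[0][0]  # [I^{-1}]_11 = 1/(I_11 - I_12^2/I_22)
bound_known_nuisance = 1.0 / I[0][0]  # 1/I_11
```

For these numbers the determinant is 0.64, so [I⁻¹]_{11} = 0.5/0.64 = 0.78125, larger than 1/I_{11} = 0.5, as the footnote's algebra predicts.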
The likelihood is

    L(θ; x) = (2πθ_2)^{−n/2} exp(−∑_{i=1}^n (x_i − θ_1)²/(2θ_2))    (3.27)

H_0 is θ_1 = µ_0, 0 < θ_2 < ∞, while H_1 is −∞ < θ_1 < ∞, 0 < θ_2 < ∞. H_0 is effectively a one-parameter model which has MLEs

    [θ̂_0]_1 = µ_0    (3.28a)
    [θ̂_0]_2 = (1/n) ∑_{i=1}^n (x_i − µ_0)² ≡ σ̂_0²    (3.28b)

while the MLE for H_1 was found last time to be

    θ̂_1 = (1/n) ∑_{i=1}^n x_i = x̄    (3.29a)
    θ̂_2 = (1/n) ∑_{i=1}^n (x_i − x̄)² ≡ σ̂²    (3.29b)

We see that the maximized likelihoods are

    L(θ̂(x); x) = (2πσ̂²)^{−n/2} e^{−n/2}    (3.30a)
    L(θ̂_0(x); x) = (2πσ̂_0²)^{−n/2} e^{−n/2}    (3.30b)

So the likelihood ratio is

    Λ(X) = (σ̂²/σ̂_0²)^{n/2} = [∑_{i=1}^n (X_i − X̄)² / ∑_{i=1}^n (X_i − µ_0)²]^{n/2}    (3.31)

If we write

    ∑_{i=1}^n (X_i − µ_0)² = ∑_{i=1}^n [(X_i − X̄) + (X̄ − µ_0)]²    (3.32)

we can see that

    ∑_{i=1}^n (X_i − µ_0)² = ∑_{i=1}^n (X_i − X̄)² + n(X̄ − µ_0)² + 2(X̄ − µ_0) ∑_{i=1}^n (X_i − X̄)
                           = ∑_{i=1}^n (X_i − X̄)² + n(X̄ − µ_0)²    (3.33)

since ∑_{i=1}^n (X_i − X̄) = 0, so

    Λ(X) = [1 + n(X̄ − µ_0)² / ∑_{i=1}^n (X_i − X̄)²]^{−n/2}    (3.34)

This means that thresholding on the value of Λ(X) is the same as thresholding on the second term in brackets, or equivalently on

    T = (X̄ − µ_0) / √(∑_{i=1}^n (X_i − X̄)² / [n(n − 1)])    (3.35)

But we've constructed this last statistic so that, if H_0 holds and the {X_i} are independent N(µ_0, θ_2) random variables, Student's theorem tells us this is t-distributed with n − 1 degrees of freedom. So for a test at significance level α we reject H_0 if T > t_{α/2, n−1} or T < −t_{α/2, n−1}; since

    Λ(X) = [1 + T²/(n − 1)]^{−n/2}    (3.36)

we reject H_0 if

    Λ(X) < [1 + t²_{α/2, n−1}/(n − 1)]^{−n/2}    (3.37)

3.3.2 Example: Mean of Multivariate Normal Distribution with Unit Variance

We know that the behavior of maximum likelihood tests is simplest when the sample is drawn from a normal distribution with known variance whose mean is the parameter in question. Now we have multiple parameters, so the straightforward extension is to have the sample drawn from a multivariate normal distribution whose means are the parameters. So the sample now consists of n independent random vectors X_1, X_2, ..., X_n, each with p elements and drawn from a multivariate normal distribution. For simplicity, we'll use the identity matrix as the variance-covariance matrix, so X_i ~ N_p(θ, 1_{p×p}). We'll define the null hypothesis H_0 to be that the first q of the {θ_α} are zero. Then the

likelihood function is

    L(θ; x) = (2π)^{−np/2} exp(−½ ∑_{i=1}^n (x_i − θ)ᵀ(x_i − θ)) = (2π)^{−np/2} exp(−½ ∑_{i=1}^n ∑_{α=1}^p ([x_i]_α − θ_α)²)    (3.38)

which, completing the square in each θ_α, is maximized over θ_α at the component sample means

    x̄_α = (1/n) ∑_{i=1}^n [x_i]_α    (3.39)

Thus the maximum likelihood solution under H_1 is θ̂_α = x̄_α, while under H_0 it is

    [θ̂_0]_α = 0 for α = 1, ..., q;  [θ̂_0]_α = x̄_α for α = q + 1, ..., p    (3.40)

and the likelihood ratio is

    Λ({x_i}) = L(θ̂_0; x)/L(θ̂; x) = exp(−(n/2) ∑_{α=1}^q x̄_α²)    (3.41)

Under H_0, the {X̄_α} are independent and X̄_α ~ N(0, 1/n) for α = 1, ..., q. Thus, if the null hypothesis is true,

    −2 ln Λ({X_i}) = ∑_{α=1}^q X̄_α²/(1/n) ~ χ²(q)    (3.42)

3.3.3 Large-Sample Limit

This last result also holds generically as a limiting distribution for large samples:

    −2 ln Λ({X_i}) → χ²(q) in distribution    (3.43)
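Circling back to the unknown-variance example above: the monotone relation between the likelihood ratio and the t statistic can be verified numerically on made-up data, since Λ computed from the two sums of squares should equal [1 + T²/(n − 1)]^(−n/2) computed from T:

```python
import math

def lrt_and_t(x, mu0):
    n = len(x)
    xbar = sum(x) / n
    ss = sum((xi - xbar) ** 2 for xi in x)  # sum of squares about the mean
    ss0 = sum((xi - mu0) ** 2 for xi in x)  # sum of squares about mu0
    lam = (ss / ss0) ** (n / 2)             # likelihood ratio Lambda
    t = (xbar - mu0) / math.sqrt(ss / (n * (n - 1)))  # Student t statistic
    return lam, t

x = [5.1, 4.8, 5.6, 5.0, 4.9, 5.3]  # hypothetical data
lam, t = lrt_and_t(x, mu0=4.5)
n = len(x)
lam_from_t = (1 + t ** 2 / (n - 1)) ** (-n / 2)
```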


More information

Statistical Theory MT 2009 Problems 1: Solution sketches

Statistical Theory MT 2009 Problems 1: Solution sketches Statistical Theory MT 009 Problems : Solutio sketches. Which of the followig desities are withi a expoetial family? Explai your reasoig. (a) Let 0 < θ < ad put f(x, θ) = ( θ)θ x ; x = 0,,,... (b) (c) where

More information

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n.

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n. Jauary 1, 2019 Resamplig Methods Motivatio We have so may estimators with the property θ θ d N 0, σ 2 We ca also write θ a N θ, σ 2 /, where a meas approximately distributed as Oce we have a cosistet estimator

More information

Economics 241B Relation to Method of Moments and Maximum Likelihood OLSE as a Maximum Likelihood Estimator

Economics 241B Relation to Method of Moments and Maximum Likelihood OLSE as a Maximum Likelihood Estimator Ecoomics 24B Relatio to Method of Momets ad Maximum Likelihood OLSE as a Maximum Likelihood Estimator Uder Assumptio 5 we have speci ed the distributio of the error, so we ca estimate the model parameters

More information

Lecture 12: September 27

Lecture 12: September 27 36-705: Itermediate Statistics Fall 207 Lecturer: Siva Balakrisha Lecture 2: September 27 Today we will discuss sufficiecy i more detail ad the begi to discuss some geeral strategies for costructig estimators.

More information

Discrete Mathematics for CS Spring 2008 David Wagner Note 22

Discrete Mathematics for CS Spring 2008 David Wagner Note 22 CS 70 Discrete Mathematics for CS Sprig 2008 David Wager Note 22 I.I.D. Radom Variables Estimatig the bias of a coi Questio: We wat to estimate the proportio p of Democrats i the US populatio, by takig

More information

Efficient GMM LECTURE 12 GMM II

Efficient GMM LECTURE 12 GMM II DECEMBER 1 010 LECTURE 1 II Efficiet The estimator depeds o the choice of the weight matrix A. The efficiet estimator is the oe that has the smallest asymptotic variace amog all estimators defied by differet

More information

Distribution of Random Samples & Limit theorems

Distribution of Random Samples & Limit theorems STAT/MATH 395 A - PROBABILITY II UW Witer Quarter 2017 Néhémy Lim Distributio of Radom Samples & Limit theorems 1 Distributio of i.i.d. Samples Motivatig example. Assume that the goal of a study is to

More information

ECE 8527: Introduction to Machine Learning and Pattern Recognition Midterm # 1. Vaishali Amin Fall, 2015

ECE 8527: Introduction to Machine Learning and Pattern Recognition Midterm # 1. Vaishali Amin Fall, 2015 ECE 8527: Itroductio to Machie Learig ad Patter Recogitio Midterm # 1 Vaishali Ami Fall, 2015 tue39624@temple.edu Problem No. 1: Cosider a two-class discrete distributio problem: ω 1 :{[0,0], [2,0], [2,2],

More information

Estimation for Complete Data

Estimation for Complete Data Estimatio for Complete Data complete data: there is o loss of iformatio durig study. complete idividual complete data= grouped data A complete idividual data is the oe i which the complete iformatio of

More information

5. Likelihood Ratio Tests

5. Likelihood Ratio Tests 1 of 5 7/29/2009 3:16 PM Virtual Laboratories > 9. Hy pothesis Testig > 1 2 3 4 5 6 7 5. Likelihood Ratio Tests Prelimiaries As usual, our startig poit is a radom experimet with a uderlyig sample space,

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 19 11/17/2008 LAWS OF LARGE NUMBERS II THE STRONG LAW OF LARGE NUMBERS

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 19 11/17/2008 LAWS OF LARGE NUMBERS II THE STRONG LAW OF LARGE NUMBERS MASSACHUSTTS INSTITUT OF TCHNOLOGY 6.436J/5.085J Fall 2008 Lecture 9 /7/2008 LAWS OF LARG NUMBRS II Cotets. The strog law of large umbers 2. The Cheroff boud TH STRONG LAW OF LARG NUMBRS While the weak

More information

Expectation and Variance of a random variable

Expectation and Variance of a random variable Chapter 11 Expectatio ad Variace of a radom variable The aim of this lecture is to defie ad itroduce mathematical Expectatio ad variace of a fuctio of discrete & cotiuous radom variables ad the distributio

More information

Direction: This test is worth 150 points. You are required to complete this test within 55 minutes.

Direction: This test is worth 150 points. You are required to complete this test within 55 minutes. Term Test 3 (Part A) November 1, 004 Name Math 6 Studet Number Directio: This test is worth 10 poits. You are required to complete this test withi miutes. I order to receive full credit, aswer each problem

More information

This exam contains 19 pages (including this cover page) and 10 questions. A Formulae sheet is provided with the exam.

This exam contains 19 pages (including this cover page) and 10 questions. A Formulae sheet is provided with the exam. Probability ad Statistics FS 07 Secod Sessio Exam 09.0.08 Time Limit: 80 Miutes Name: Studet ID: This exam cotais 9 pages (icludig this cover page) ad 0 questios. A Formulae sheet is provided with the

More information

Topic 9: Sampling Distributions of Estimators

Topic 9: Sampling Distributions of Estimators Topic 9: Samplig Distributios of Estimators Course 003, 2018 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be

More information

Statistical and Mathematical Methods DS-GA 1002 December 8, Sample Final Problems Solutions

Statistical and Mathematical Methods DS-GA 1002 December 8, Sample Final Problems Solutions Statistical ad Mathematical Methods DS-GA 00 December 8, 05. Short questios Sample Fial Problems Solutios a. Ax b has a solutio if b is i the rage of A. The dimesio of the rage of A is because A has liearly-idepedet

More information

Statistics 511 Additional Materials

Statistics 511 Additional Materials Cofidece Itervals o mu Statistics 511 Additioal Materials This topic officially moves us from probability to statistics. We begi to discuss makig ifereces about the populatio. Oe way to differetiate probability

More information

4. Partial Sums and the Central Limit Theorem

4. Partial Sums and the Central Limit Theorem 1 of 10 7/16/2009 6:05 AM Virtual Laboratories > 6. Radom Samples > 1 2 3 4 5 6 7 4. Partial Sums ad the Cetral Limit Theorem The cetral limit theorem ad the law of large umbers are the two fudametal theorems

More information

STAT Homework 1 - Solutions

STAT Homework 1 - Solutions STAT-36700 Homework 1 - Solutios Fall 018 September 11, 018 This cotais solutios for Homework 1. Please ote that we have icluded several additioal commets ad approaches to the problems to give you better

More information

Econ 325/327 Notes on Sample Mean, Sample Proportion, Central Limit Theorem, Chi-square Distribution, Student s t distribution 1.

Econ 325/327 Notes on Sample Mean, Sample Proportion, Central Limit Theorem, Chi-square Distribution, Student s t distribution 1. Eco 325/327 Notes o Sample Mea, Sample Proportio, Cetral Limit Theorem, Chi-square Distributio, Studet s t distributio 1 Sample Mea By Hiro Kasahara We cosider a radom sample from a populatio. Defiitio

More information

Problem Set 4 Due Oct, 12

Problem Set 4 Due Oct, 12 EE226: Radom Processes i Systems Lecturer: Jea C. Walrad Problem Set 4 Due Oct, 12 Fall 06 GSI: Assae Gueye This problem set essetially reviews detectio theory ad hypothesis testig ad some basic otios

More information

In this section we derive some finite-sample properties of the OLS estimator. b is an estimator of β. It is a function of the random sample data.

In this section we derive some finite-sample properties of the OLS estimator. b is an estimator of β. It is a function of the random sample data. 17 3. OLS Part III I this sectio we derive some fiite-sample properties of the OLS estimator. 3.1 The Samplig Distributio of the OLS Estimator y = Xβ + ε ; ε ~ N[0, σ 2 I ] b = (X X) 1 X y = f(y) ε is

More information

Lecture Notes 15 Hypothesis Testing (Chapter 10)

Lecture Notes 15 Hypothesis Testing (Chapter 10) 1 Itroductio Lecture Notes 15 Hypothesis Testig Chapter 10) Let X 1,..., X p θ x). Suppose we we wat to kow if θ = θ 0 or ot, where θ 0 is a specific value of θ. For example, if we are flippig a coi, we

More information

Topic 9: Sampling Distributions of Estimators

Topic 9: Sampling Distributions of Estimators Topic 9: Samplig Distributios of Estimators Course 003, 2018 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be

More information

1.010 Uncertainty in Engineering Fall 2008

1.010 Uncertainty in Engineering Fall 2008 MIT OpeCourseWare http://ocw.mit.edu.00 Ucertaity i Egieerig Fall 2008 For iformatio about citig these materials or our Terms of Use, visit: http://ocw.mit.edu.terms. .00 - Brief Notes # 9 Poit ad Iterval

More information

Lecture 19: Convergence

Lecture 19: Convergence Lecture 19: Covergece Asymptotic approach I statistical aalysis or iferece, a key to the success of fidig a good procedure is beig able to fid some momets ad/or distributios of various statistics. I may

More information

LECTURE 14 NOTES. A sequence of α-level tests {ϕ n (x)} is consistent if

LECTURE 14 NOTES. A sequence of α-level tests {ϕ n (x)} is consistent if LECTURE 14 NOTES 1. Asymptotic power of tests. Defiitio 1.1. A sequece of -level tests {ϕ x)} is cosistet if β θ) := E θ [ ϕ x) ] 1 as, for ay θ Θ 1. Just like cosistecy of a sequece of estimators, Defiitio

More information

January 25, 2017 INTRODUCTION TO MATHEMATICAL STATISTICS

January 25, 2017 INTRODUCTION TO MATHEMATICAL STATISTICS Jauary 25, 207 INTRODUCTION TO MATHEMATICAL STATISTICS Abstract. A basic itroductio to statistics assumig kowledge of probability theory.. Probability I a typical udergraduate problem i probability, we

More information

Stat 319 Theory of Statistics (2) Exercises

Stat 319 Theory of Statistics (2) Exercises Kig Saud Uiversity College of Sciece Statistics ad Operatios Research Departmet Stat 39 Theory of Statistics () Exercises Refereces:. Itroductio to Mathematical Statistics, Sixth Editio, by R. Hogg, J.

More information

Chapter 6 Infinite Series

Chapter 6 Infinite Series Chapter 6 Ifiite Series I the previous chapter we cosidered itegrals which were improper i the sese that the iterval of itegratio was ubouded. I this chapter we are goig to discuss a topic which is somewhat

More information

Stat410 Probability and Statistics II (F16)

Stat410 Probability and Statistics II (F16) Some Basic Cocepts of Statistical Iferece (Sec 5.) Suppose we have a rv X that has a pdf/pmf deoted by f(x; θ) or p(x; θ), where θ is called the parameter. I previous lectures, we focus o probability problems

More information

Lecture 33: Bootstrap

Lecture 33: Bootstrap Lecture 33: ootstrap Motivatio To evaluate ad compare differet estimators, we eed cosistet estimators of variaces or asymptotic variaces of estimators. This is also importat for hypothesis testig ad cofidece

More information

Summary. Recap ... Last Lecture. Summary. Theorem

Summary. Recap ... Last Lecture. Summary. Theorem Last Lecture Biostatistics 602 - Statistical Iferece Lecture 23 Hyu Mi Kag April 11th, 2013 What is p-value? What is the advatage of p-value compared to hypothesis testig procedure with size α? How ca

More information

Frequentist Inference

Frequentist Inference Frequetist Iferece The topics of the ext three sectios are useful applicatios of the Cetral Limit Theorem. Without kowig aythig about the uderlyig distributio of a sequece of radom variables {X i }, for

More information

Axis Aligned Ellipsoid

Axis Aligned Ellipsoid Machie Learig for Data Sciece CS 4786) Lecture 6,7 & 8: Ellipsoidal Clusterig, Gaussia Mixture Models ad Geeral Mixture Models The text i black outlies high level ideas. The text i blue provides simple

More information

Lecture 22: Review for Exam 2. 1 Basic Model Assumptions (without Gaussian Noise)

Lecture 22: Review for Exam 2. 1 Basic Model Assumptions (without Gaussian Noise) Lecture 22: Review for Exam 2 Basic Model Assumptios (without Gaussia Noise) We model oe cotiuous respose variable Y, as a liear fuctio of p umerical predictors, plus oise: Y = β 0 + β X +... β p X p +

More information

MATH 320: Probability and Statistics 9. Estimation and Testing of Parameters. Readings: Pruim, Chapter 4

MATH 320: Probability and Statistics 9. Estimation and Testing of Parameters. Readings: Pruim, Chapter 4 MATH 30: Probability ad Statistics 9. Estimatio ad Testig of Parameters Estimatio ad Testig of Parameters We have bee dealig situatios i which we have full kowledge of the distributio of a radom variable.

More information

Solutions: Homework 3

Solutions: Homework 3 Solutios: Homework 3 Suppose that the radom variables Y,...,Y satisfy Y i = x i + " i : i =,..., IID where x,...,x R are fixed values ad ",...," Normal(0, )with R + kow. Fid ˆ = MLE( ). IND Solutio: Observe

More information

Properties and Hypothesis Testing

Properties and Hypothesis Testing Chapter 3 Properties ad Hypothesis Testig 3.1 Types of data The regressio techiques developed i previous chapters ca be applied to three differet kids of data. 1. Cross-sectioal data. 2. Time series data.

More information

Lecture 7: Density Estimation: k-nearest Neighbor and Basis Approach

Lecture 7: Density Estimation: k-nearest Neighbor and Basis Approach STAT 425: Itroductio to Noparametric Statistics Witer 28 Lecture 7: Desity Estimatio: k-nearest Neighbor ad Basis Approach Istructor: Ye-Chi Che Referece: Sectio 8.4 of All of Noparametric Statistics.

More information

Linear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d

Linear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d Liear regressio Daiel Hsu (COMS 477) Maximum likelihood estimatio Oe of the simplest liear regressio models is the followig: (X, Y ),..., (X, Y ), (X, Y ) are iid radom pairs takig values i R d R, ad Y

More information

6 Sample Size Calculations

6 Sample Size Calculations 6 Sample Size Calculatios Oe of the major resposibilities of a cliical trial statisticia is to aid the ivestigators i determiig the sample size required to coduct a study The most commo procedure for determiig

More information

7-1. Chapter 4. Part I. Sampling Distributions and Confidence Intervals

7-1. Chapter 4. Part I. Sampling Distributions and Confidence Intervals 7-1 Chapter 4 Part I. Samplig Distributios ad Cofidece Itervals 1 7- Sectio 1. Samplig Distributio 7-3 Usig Statistics Statistical Iferece: Predict ad forecast values of populatio parameters... Test hypotheses

More information

Convergence of random variables. (telegram style notes) P.J.C. Spreij

Convergence of random variables. (telegram style notes) P.J.C. Spreij Covergece of radom variables (telegram style otes).j.c. Spreij this versio: September 6, 2005 Itroductio As we kow, radom variables are by defiitio measurable fuctios o some uderlyig measurable space

More information

Element sampling: Part 2

Element sampling: Part 2 Chapter 4 Elemet samplig: Part 2 4.1 Itroductio We ow cosider uequal probability samplig desigs which is very popular i practice. I the uequal probability samplig, we ca improve the efficiecy of the resultig

More information

Statistical inference: example 1. Inferential Statistics

Statistical inference: example 1. Inferential Statistics Statistical iferece: example 1 Iferetial Statistics POPULATION SAMPLE A clothig store chai regularly buys from a supplier large quatities of a certai piece of clothig. Each item ca be classified either

More information

The standard deviation of the mean

The standard deviation of the mean Physics 6C Fall 20 The stadard deviatio of the mea These otes provide some clarificatio o the distictio betwee the stadard deviatio ad the stadard deviatio of the mea.. The sample mea ad variace Cosider

More information

Lecture Note 8 Point Estimators and Point Estimation Methods. MIT Spring 2006 Herman Bennett

Lecture Note 8 Point Estimators and Point Estimation Methods. MIT Spring 2006 Herman Bennett Lecture Note 8 Poit Estimators ad Poit Estimatio Methods MIT 14.30 Sprig 2006 Herma Beett Give a parameter with ukow value, the goal of poit estimatio is to use a sample to compute a umber that represets

More information

MA Advanced Econometrics: Properties of Least Squares Estimators

MA Advanced Econometrics: Properties of Least Squares Estimators MA Advaced Ecoometrics: Properties of Least Squares Estimators Karl Whela School of Ecoomics, UCD February 5, 20 Karl Whela UCD Least Squares Estimators February 5, 20 / 5 Part I Least Squares: Some Fiite-Sample

More information

Lecture 7: Properties of Random Samples

Lecture 7: Properties of Random Samples Lecture 7: Properties of Radom Samples 1 Cotiued From Last Class Theorem 1.1. Let X 1, X,...X be a radom sample from a populatio with mea µ ad variace σ

More information

FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING. Lectures

FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING. Lectures FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING Lectures MODULE 5 STATISTICS II. Mea ad stadard error of sample data. Biomial distributio. Normal distributio 4. Samplig 5. Cofidece itervals

More information

Rank tests and regression rank scores tests in measurement error models

Rank tests and regression rank scores tests in measurement error models Rak tests ad regressio rak scores tests i measuremet error models J. Jurečková ad A.K.Md.E. Saleh Charles Uiversity i Prague ad Carleto Uiversity i Ottawa Abstract The rak ad regressio rak score tests

More information

DS 100: Principles and Techniques of Data Science Date: April 13, Discussion #10

DS 100: Principles and Techniques of Data Science Date: April 13, Discussion #10 DS 00: Priciples ad Techiques of Data Sciece Date: April 3, 208 Name: Hypothesis Testig Discussio #0. Defie these terms below as they relate to hypothesis testig. a) Data Geeratio Model: Solutio: A set

More information

Lesson 10: Limits and Continuity

Lesson 10: Limits and Continuity www.scimsacademy.com Lesso 10: Limits ad Cotiuity SCIMS Academy 1 Limit of a fuctio The cocept of limit of a fuctio is cetral to all other cocepts i calculus (like cotiuity, derivative, defiite itegrals

More information

Chapter 6 Sampling Distributions

Chapter 6 Sampling Distributions Chapter 6 Samplig Distributios 1 I most experimets, we have more tha oe measuremet for ay give variable, each measuremet beig associated with oe radomly selected a member of a populatio. Hece we eed to

More information

1 Introduction to reducing variance in Monte Carlo simulations

1 Introduction to reducing variance in Monte Carlo simulations Copyright c 010 by Karl Sigma 1 Itroductio to reducig variace i Mote Carlo simulatios 11 Review of cofidece itervals for estimatig a mea I statistics, we estimate a ukow mea µ = E(X) of a distributio by

More information

32 estimating the cumulative distribution function

32 estimating the cumulative distribution function 32 estimatig the cumulative distributio fuctio 4.6 types of cofidece itervals/bads Let F be a class of distributio fuctios F ad let θ be some quatity of iterest, such as the mea of F or the whole fuctio

More information

17. Joint distributions of extreme order statistics Lehmann 5.1; Ferguson 15

17. Joint distributions of extreme order statistics Lehmann 5.1; Ferguson 15 17. Joit distributios of extreme order statistics Lehma 5.1; Ferguso 15 I Example 10., we derived the asymptotic distributio of the maximum from a radom sample from a uiform distributio. We did this usig

More information

Lecture 2: Poisson Sta*s*cs Probability Density Func*ons Expecta*on and Variance Es*mators

Lecture 2: Poisson Sta*s*cs Probability Density Func*ons Expecta*on and Variance Es*mators Lecture 2: Poisso Sta*s*cs Probability Desity Fuc*os Expecta*o ad Variace Es*mators Biomial Distribu*o: P (k successes i attempts) =! k!( k)! p k s( p s ) k prob of each success Poisso Distributio Note

More information

6.3 Testing Series With Positive Terms

6.3 Testing Series With Positive Terms 6.3. TESTING SERIES WITH POSITIVE TERMS 307 6.3 Testig Series With Positive Terms 6.3. Review of what is kow up to ow I theory, testig a series a i for covergece amouts to fidig the i= sequece of partial

More information

Review Questions, Chapters 8, 9. f(y) = 0, elsewhere. F (y) = f Y(1) = n ( e y/θ) n 1 1 θ e y/θ = n θ e yn

Review Questions, Chapters 8, 9. f(y) = 0, elsewhere. F (y) = f Y(1) = n ( e y/θ) n 1 1 θ e y/θ = n θ e yn Stat 366 Lab 2 Solutios (September 2, 2006) page TA: Yury Petracheko, CAB 484, yuryp@ualberta.ca, http://www.ualberta.ca/ yuryp/ Review Questios, Chapters 8, 9 8.5 Suppose that Y, Y 2,..., Y deote a radom

More information

1 Approximating Integrals using Taylor Polynomials

1 Approximating Integrals using Taylor Polynomials Seughee Ye Ma 8: Week 7 Nov Week 7 Summary This week, we will lear how we ca approximate itegrals usig Taylor series ad umerical methods. Topics Page Approximatig Itegrals usig Taylor Polyomials. Defiitios................................................

More information

Lecture 3: August 31

Lecture 3: August 31 36-705: Itermediate Statistics Fall 018 Lecturer: Siva Balakrisha Lecture 3: August 31 This lecture will be mostly a summary of other useful expoetial tail bouds We will ot prove ay of these i lecture,

More information

Bayesian Methods: Introduction to Multi-parameter Models

Bayesian Methods: Introduction to Multi-parameter Models Bayesia Methods: Itroductio to Multi-parameter Models Parameter: θ = ( θ, θ) Give Likelihood p(y θ) ad prior p(θ ), the posterior p proportioal to p(y θ) x p(θ ) Margial posterior ( θ, θ y) is Iterested

More information

The Random Walk For Dummies

The Random Walk For Dummies The Radom Walk For Dummies Richard A Mote Abstract We look at the priciples goverig the oe-dimesioal discrete radom walk First we review five basic cocepts of probability theory The we cosider the Beroulli

More information

Simulation. Two Rule For Inverting A Distribution Function

Simulation. Two Rule For Inverting A Distribution Function Simulatio Two Rule For Ivertig A Distributio Fuctio Rule 1. If F(x) = u is costat o a iterval [x 1, x 2 ), the the uiform value u is mapped oto x 2 through the iversio process. Rule 2. If there is a jump

More information

Linear Regression Demystified

Linear Regression Demystified Liear Regressio Demystified Liear regressio is a importat subject i statistics. I elemetary statistics courses, formulae related to liear regressio are ofte stated without derivatio. This ote iteds to

More information

ECE 901 Lecture 12: Complexity Regularization and the Squared Loss

ECE 901 Lecture 12: Complexity Regularization and the Squared Loss ECE 90 Lecture : Complexity Regularizatio ad the Squared Loss R. Nowak 5/7/009 I the previous lectures we made use of the Cheroff/Hoeffdig bouds for our aalysis of classifier errors. Hoeffdig s iequality

More information

A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence

A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence Sequeces A sequece of umbers is a fuctio whose domai is the positive itegers. We ca see that the sequece,, 2, 2, 3, 3,... is a fuctio from the positive itegers whe we write the first sequece elemet as

More information

Estimation of the Mean and the ACVF

Estimation of the Mean and the ACVF Chapter 5 Estimatio of the Mea ad the ACVF A statioary process {X t } is characterized by its mea ad its autocovariace fuctio γ ), ad so by the autocorrelatio fuctio ρ ) I this chapter we preset the estimators

More information

NANYANG TECHNOLOGICAL UNIVERSITY SYLLABUS FOR ENTRANCE EXAMINATION FOR INTERNATIONAL STUDENTS AO-LEVEL MATHEMATICS

NANYANG TECHNOLOGICAL UNIVERSITY SYLLABUS FOR ENTRANCE EXAMINATION FOR INTERNATIONAL STUDENTS AO-LEVEL MATHEMATICS NANYANG TECHNOLOGICAL UNIVERSITY SYLLABUS FOR ENTRANCE EXAMINATION FOR INTERNATIONAL STUDENTS AO-LEVEL MATHEMATICS STRUCTURE OF EXAMINATION PAPER. There will be oe 2-hour paper cosistig of 4 questios.

More information

6. Sufficient, Complete, and Ancillary Statistics

6. Sufficient, Complete, and Ancillary Statistics Sufficiet, Complete ad Acillary Statistics http://www.math.uah.edu/stat/poit/sufficiet.xhtml 1 of 7 7/16/2009 6:13 AM Virtual Laboratories > 7. Poit Estimatio > 1 2 3 4 5 6 6. Sufficiet, Complete, ad Acillary

More information

Last Lecture. Wald Test

Last Lecture. Wald Test Last Lecture Biostatistics 602 - Statistical Iferece Lecture 22 Hyu Mi Kag April 9th, 2013 Is the exact distributio of LRT statistic typically easy to obtai? How about its asymptotic distributio? For testig

More information

Lecture 9: September 19

Lecture 9: September 19 36-700: Probability ad Mathematical Statistics I Fall 206 Lecturer: Siva Balakrisha Lecture 9: September 9 9. Review ad Outlie Last class we discussed: Statistical estimatio broadly Pot estimatio Bias-Variace

More information

Stat 421-SP2012 Interval Estimation Section

Stat 421-SP2012 Interval Estimation Section Stat 41-SP01 Iterval Estimatio Sectio 11.1-11. We ow uderstad (Chapter 10) how to fid poit estimators of a ukow parameter. o However, a poit estimate does ot provide ay iformatio about the ucertaity (possible

More information