EECS564 Estimation, Filtering, and Detection: Hwk 2 Solns., Winter


4. Let Z be a single observation having density function

$$p_\theta(z) = 2\theta z + 1 - \theta, \qquad 0 \le z \le 1,$$

where $-1 \le \theta \le 1$.

(a) Assuming that $\theta$ is a nonrandom parameter, find and plot the maximum likelihood estimator of $\theta$ as a function of $Z$.

Note that $p_\theta(z)$ is a straight line in $\theta$ over $[-1, 1]$. This line has slope $2z - 1$, which is positive when $z$ is greater than $1/2$ and negative when $z$ is less than $1/2$. Hence the maximum over $\theta$ occurs at $+1$ for $z > 1/2$ and at $-1$ for $z < 1/2$, and therefore

$$\hat\theta_{\rm MLE} = {\rm sgn}(z - 1/2).$$

Important note: for this example you cannot use the stationary point condition $\partial p_\theta(z)/\partial\theta = 0$, since the maximum occurs at the boundary of the parameter space.

(b) Is the ML estimator unbiased? If so, does it achieve the CR bound?

Three possible ways of solving:

1. Check to see if we can verify the condition for equality in the CRB. First compute

$$\frac{\partial \ln p_\theta(z)}{\partial\theta} = \frac{2(z - 1/2)}{2\theta(z - 1/2) + 1}.$$

This function is continuous in $z$ and therefore cannot possibly be equal to the discontinuous function $k_\theta\,(\hat\theta_{\rm MLE}(z) - \theta)$ for any $k_\theta$. Hence the CRB is not attained.

2. Observe that the density $p_\theta(z)$ is not in the exponential family, so that there exists no efficient estimator of $\theta$. Hence the CRB is not attained.

3. Compute the bias and variance of the ML estimator. If biased, we know the CRB is not attained. If unbiased, compute the Fisher information and compare it to your ML variance expression. Note: this is a very bad way to solve this problem since it involves a lot more work!

(c) Now assume that $\theta$ is a random variable with uniform prior density $p(\theta) = 1/2$, $\theta \in [-1, 1]$. Find and plot the minimum mean square error estimator of $\theta$ as a function of $Z$.

For this case the optimal estimator is the CME, for which we need the posterior. Find the marginal

$$p(z) = \int_{-1}^{1} (2\theta z + 1 - \theta)\,\tfrac{1}{2}\, d\theta = 1,$$

and therefore

$$p(\theta \mid z) = \frac{2\theta(z - 1/2) + 1}{2}.$$

Thus

$$\hat\theta_{\rm CME} = \int_{-1}^{1} \theta\, p(\theta \mid z)\, d\theta = (z - 1/2)\,\frac{2}{3}.$$

This is a line with slope $2/3$, ranging from $-1/3$ to $1/3$ and passing through $0$ at the point $z = 1/2$.

(d) Compute the conditional bias $E[\hat\theta - \theta \mid \theta]$ and the conditional MSE $E[(\hat\theta - \theta)^2 \mid \theta]$ given $\theta$ for each of the estimators of parts (a) and (c). Plot the two conditional MSE functions obtained and compare the MSEs of the two estimators. Does one estimator perform uniformly better than the other?

Since $P_\theta(z > 1/2) = 1/2 + \theta/4$ and $P_\theta(z < 1/2) = 1/2 - \theta/4$,

$$E[\hat\theta_{\rm MLE} \mid \theta] = P_\theta(z > 1/2) - P_\theta(z < 1/2) = (1/2 + \theta/4) - (1/2 - \theta/4) = \theta/2.$$

Hence the MLE is biased.

$$E[(\hat\theta_{\rm MLE} - \theta)^2 \mid \theta] = (1-\theta)^2 P_\theta(z > 1/2) + (1+\theta)^2 P_\theta(z < 1/2) = (1-\theta)^2 (1/2 + \theta/4) + (1+\theta)^2 (1/2 - \theta/4) = 1.$$

Hence the MLE has MSE equal to 1, independent of $\theta$.

$$E[\hat\theta_{\rm CME} \mid \theta] = \frac{2}{3}\int_0^1 (z - 1/2)\,\big(2\theta(z - 1/2) + 1\big)\, dz = \frac{4\theta}{3}\int_0^1 (z - 1/2)^2\, dz = \theta/9.$$

Hence the CME is biased. The bias is $\theta/9 - \theta = -8\theta/9$.

$$E[(\hat\theta_{\rm CME} - \theta)^2 \mid \theta] = \int_0^1 \Big[\frac{2}{3}(z - 1/2) - \theta\Big]^2 \big(2\theta(z - 1/2) + 1\big)\, dz = \frac{1}{27} + \frac{7\theta^2}{9}.$$

Note that since $\max_{\theta \in [-1,1]} E[(\hat\theta_{\rm CME} - \theta)^2 \mid \theta] = 22/27 < 1$, the CME has uniformly better MSE than the MLE. Interestingly, however, the MLE has lower magnitude bias than does the CME.
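As a numerical sanity check on part (d) (a sketch, not part of the original solution), the Python snippet below draws $Z$ by inverting the CDF $F_\theta(z) = \theta z^2 + (1-\theta)z$ and estimates both conditional MSEs by Monte Carlo; the trial count and the grid of $\theta$ values are arbitrary choices. The two estimated columns should reproduce $1$ and $1/27 + 7\theta^2/9$ to within simulation error.

```python
import numpy as np

rng = np.random.default_rng(0)
n_mc = 200_000

def sample_z(theta, size):
    # Inverse-CDF sampling from p_theta(z) = 2*theta*z + 1 - theta on [0, 1];
    # the CDF is F(z) = theta*z**2 + (1 - theta)*z.
    u = rng.uniform(size=size)
    if theta == 0.0:
        return u  # the density is uniform when theta = 0
    return (-(1 - theta) + np.sqrt((1 - theta) ** 2 + 4 * theta * u)) / (2 * theta)

for theta in [-0.9, -0.5, 0.0, 0.5, 0.9]:
    z = sample_z(theta, n_mc)
    mse_mle = np.mean((np.sign(z - 0.5) - theta) ** 2)       # predicted: 1 for all theta
    mse_cme = np.mean(((2 / 3) * (z - 0.5) - theta) ** 2)    # predicted: 1/27 + 7*theta**2/9
    print(f"theta={theta:+.1f}  MSE_MLE={mse_mle:.3f}  MSE_CME={mse_cme:.3f}"
          f"  predicted_CME={1 / 27 + 7 * theta ** 2 / 9:.3f}")
```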

4.4 The observation consists of $x_1, \ldots, x_n$ i.i.d. samples, where $x_i \sim f(x; \theta)$ and

$$f(x; \theta) = \frac{1}{\theta}\, x^{(1-\theta)/\theta}, \qquad 0 \le x \le 1,$$

and $f(x; \theta) = 0$ otherwise, where $\theta$, $0 < \theta < \infty$, is an unknown parameter.

(a) Compute the CR bound on unbiased estimators of $\theta$. Is there an estimator that achieves the bound?

Let's first see if we can identify an efficient estimator. Recall the condition for equality in the CRB: $\partial \ln f(\underline{x}; \theta)/\partial\theta = k_\theta\,(\hat\theta - \theta)$. For this example

$$\frac{\partial \ln f(\underline{x}; \theta)}{\partial\theta} = -\frac{n}{\theta} - \frac{1}{\theta^2}\sum_{i=1}^n \ln x_i = \frac{n}{\theta^2}\Big[\frac{1}{n}\sum_{i=1}^n \ln(1/x_i) - \theta\Big].$$

Therefore we have identified an efficient estimator, $\hat\theta = \frac{1}{n}\sum_{i=1}^n \ln(1/x_i)$, which therefore attains the CRB. Note that since $E_\theta[\partial \ln f(\underline{x}; \theta)/\partial\theta] = 0$, this estimator is indeed unbiased. Now we turn to the CRB. Take an additional derivative of the expression above w.r.t. $\theta$, use the fact that $E_\theta\big[\frac{1}{n}\sum_i \ln(1/x_i)\big] = \theta$, and we obtain

$$F = -E_\theta\big[\partial^2 \ln f(\underline{x}; \theta)/\partial\theta^2\big] = \frac{n}{\theta^2},$$

so that the CRB is

$$\mathrm{var}_\theta(\hat\theta) \ge \theta^2/n.$$

(b) Find the maximum likelihood estimator of $\theta$.

The MLE is identical to the efficient estimator found above, $\hat\theta_{\rm MLE} = \frac{1}{n}\sum_{i=1}^n \ln(1/x_i)$, since the log likelihood has derivative zero at that point.

(c) Compute the mean and variance of the maximum likelihood estimator. Specify a function $\varphi = g(\theta)$ for which the maximum likelihood estimator of $\varphi$ is efficient.

We know from part (a) that the MLE is unbiased and efficient for estimating $\varphi(\theta) = \theta$. Hence its mean is $\theta$ and its variance is equal to the inverse Fisher information $\theta^2/n$. By the invariance property, the MLE of any affine function $\varphi = g(\theta) = a\theta + b$, with $a, b$ known scalars, is $\hat\varphi = a\hat\theta + b$, where $\hat\theta$ is the MLE found above. Of course, a special case is the identity function $\varphi = \theta$. As $\hat\theta$ is unbiased and efficient, so is $\hat\varphi$.

(d) From one of your answers to parts (a)-(c) you should be able to derive the following formula:

$$\int_0^1 u^\beta \ln(1/u)\, du = \frac{1}{(1+\beta)^2}, \qquad \beta > 0.$$

Since the MLE is unbiased we have (for $n = 1$)

$$E_\theta[\ln(1/x_1)] = \int_0^1 \frac{1}{\theta}\, x^{(1-\theta)/\theta}\, \ln(1/x)\, dx = \theta,$$

or equivalently

$$\int_0^1 x^{(1-\theta)/\theta}\, \ln(1/x)\, dx = \theta^2.$$

Define $\beta = (1-\theta)/\theta$, or $\theta = 1/(1+\beta)$, to obtain the desired formula.
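A hedged simulation sketch of parts (a) and (d) (all settings, $\theta = 2$, $n = 50$, the trial count, and $\beta = 1.5$, are arbitrary): it uses the fact that $X = U^\theta$ with $U$ uniform on $[0,1]$ has the density above, checks that $\hat\theta$ is unbiased with variance near the CRB $\theta^2/n$, and spot-checks the part (d) integral by quadrature.

```python
import numpy as np

rng = np.random.default_rng(0)
theta, n, n_trials = 2.0, 50, 100_000

# If U ~ Uniform(0,1) then X = U**theta has CDF x**(1/theta) on [0, 1], i.e. the
# density f(x; theta) = (1/theta) * x**((1 - theta)/theta) of this problem.
x = rng.uniform(size=(n_trials, n)) ** theta
theta_hat = np.log(1.0 / x).mean(axis=1)   # the efficient estimator of part (a)

print("mean of theta_hat:", theta_hat.mean(), " (theta =", theta, ")")
print("var  of theta_hat:", theta_hat.var(), " (CRB theta^2/n =", theta**2 / n, ")")

# Spot-check of the part (d) identity: int_0^1 u**beta * ln(1/u) du = 1/(1+beta)**2.
beta = 1.5
u = np.linspace(1e-9, 1.0, 200_001)
print(np.trapz(u**beta * np.log(1.0 / u), u), "vs", 1.0 / (1.0 + beta) ** 2)
```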

4.8 Available are $n$ i.i.d. samples $\{X_i\}$ of a discrete random variable $X$ with probability mass function $P(X = x) = p(x; \theta)$, given by

$$p(k; \theta) = \Big(\frac{\theta}{1+\theta}\Big)^{k - k_o}\,\frac{1}{1+\theta}, \qquad k = k_o, k_o + 1, \ldots,$$

and $p(k; \theta) = 0$ otherwise, where $k_o$ is a known non-negative integer and $\theta$ is unknown with $0 < \theta < \infty$. (A potentially useful identity: $\sum_{k=0}^\infty k a^k = a/(1-a)^2$.)

(a) Is this density in the exponential family with mean value parameterization? Find a one dimensional sufficient statistic for $\theta$.

This is in the exponential family since we can express

$$p(x; \theta) = \Big(\frac{\theta}{1+\theta}\Big)^{x - k_o}\,\frac{1}{1+\theta} = a(\theta)\, b(x)\, \exp\big(c(\theta)\, T(x)\big),$$

where $a(\theta) = 1/(1+\theta)$, $b(x) = 1$, $c(\theta) = \ln\big(\theta/(1+\theta)\big)$, and $T(x) = x - k_o$. The density is in its mean value parameterization since (using the "useful identity" and the change of variable $l = k - k_o$ in the summation below)

$$E_\theta[T(X)] = \sum_{k = k_o}^\infty (k - k_o)\,\frac{1}{1+\theta}\Big(\frac{\theta}{1+\theta}\Big)^{k - k_o} = \theta.$$

Since the $X_i$'s are i.i.d., $T(\underline{X}) = \sum_{i=1}^n X_i$ is a one dimensional SS.

(b) Find a MOM estimator of $\theta$.

From part (a), $E_\theta[X] = \theta + k_o$. Therefore a MOM estimator is $\hat\theta = \bar{X} - k_o = \frac{1}{n}\sum_{i=1}^n (X_i - k_o)$, where $\bar{X}$ is the sample mean.

(c) Find the ML estimator of $\theta$.

The log likelihood function is simply

$$l(\theta) = -n\ln(1+\theta) + \Big(\sum_{i=1}^n T(x_i)\Big)\ln\Big(\frac{\theta}{1+\theta}\Big),$$

which is concave (verify that the second derivative is negative over $(0, \infty)$). Taking the derivative we obtain

$$l'(\theta) = -\frac{n}{1+\theta} + \Big(\sum_{i=1}^n T(x_i)\Big)\frac{1}{\theta(1+\theta)},$$

and setting this to zero yields an MLE that is the same estimator as the MOM estimator $\hat\theta$ of part (b). Observe also that $l'(\theta) = k_\theta\,(\hat\theta - \theta)$, where $k_\theta = n/(\theta(1+\theta))$. Thus the MLE is efficient. This simply confirms what we already knew: when a density is in the exponential family with the mean value parameterization, the estimator $\hat\theta = \frac{1}{n}\sum_{i=1}^n T(x_i)$ is efficient.

(d) Find the Fisher information and the resulting CR bound on estimator variance for any unbiased estimator of $\theta$. Are either of the estimators of part (b) or part (c) efficient?

This can be computed by finding the negative of the expected second derivative of the log-likelihood function, or by using the fact that, as $\hat\theta$ in part (b) is efficient, the Fisher information is $F(\theta) = k_\theta$:

$$F(\theta) = \frac{n}{\theta(1+\theta)}.$$

The CR bound $\theta(1+\theta)/n$ is attained: the (identical) estimators of parts (b) and (c) are efficient, as shown in part (c).
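A minimal simulation sketch for this problem (the values of $\theta$, $k_o$, $n$, and the trial count are arbitrary assumptions): it draws the shifted geometric samples, forms $\hat\theta = \bar{X} - k_o$, and checks unbiasedness together with the variance predicted by the CR bound of part (d).

```python
import numpy as np

rng = np.random.default_rng(0)
theta, k_o, n, n_trials = 3.0, 2, 40, 100_000

# X - k_o counts failures before the first success at rate p = 1/(1+theta):
# P(X - k_o = j) = (theta/(1+theta))**j / (1+theta). numpy's geometric counts
# trials (support {1, 2, ...}), hence the "- 1".
x = k_o + rng.geometric(1.0 / (1.0 + theta), size=(n_trials, n)) - 1
theta_hat = x.mean(axis=1) - k_o   # the MOM = ML estimator of parts (b)-(c)

print("mean:", theta_hat.mean(), " (theta =", theta, ")")
print("var :", theta_hat.var(), " (CRB theta*(1+theta)/n =", theta * (1 + theta) / n, ")")
```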

4.2 Let $X_1, X_2, \ldots, X_n$ be i.i.d. variables with the standard Pareto density:

$$f(x; \theta) = \theta\, c^\theta\, x^{-(\theta+1)}, \qquad x \ge c,$$

and $f(x; \theta) = 0$ otherwise, where $c > 0$ is known and $\theta > 0$ is unknown.

(a) Is $f(x; \theta)$ a member of the exponential family? Why or why not?

Represent the density as

$$f(x; \theta) = a(\theta)\, b(x)\, e^{c(\theta) d(x)}, \qquad a(\theta) = \theta c^\theta, \quad b(x) = I(x \ge c), \quad c(\theta)\,d(x) = -(\theta+1)\ln x,$$

to see that this density is an exponential family member.

(b) Find a one dimensional sufficient statistic for $\theta$ given $X_1, X_2, \ldots, X_n$.

Since the density is in the exponential family we identify a (minimal) sufficient statistic as $\sum_{i=1}^n d(X_i) = \sum_{i=1}^n \ln X_i$.

(c) Find the Fisher information and state the CR bound for unbiased estimators of $\theta$.

Find the single sample Fisher info as $F_1(\theta) = -E_\theta[\partial^2_\theta \ln f(X; \theta)] = 1/\theta^2$. The CR bound is therefore

$$\mathrm{var}_\theta(\hat\theta) \ge \frac{1}{n F_1(\theta)} = \frac{\theta^2}{n}.$$

(d) Derive the maximum likelihood estimator $\hat\theta$ of $\theta$.

Set the derivative w.r.t. $\theta$ of $\sum_i \ln f(X_i; \theta)$ to zero, and then check the second derivative to establish that this stationary point condition corresponds to a global maximum:

$$\frac{\partial}{\partial\theta}\sum_{i=1}^n \ln f(X_i; \theta) = \frac{n}{\theta} + n\ln c - \sum_{i=1}^n \ln X_i = 0,$$

and from part (c) we know that the second derivative is $-n/\theta^2$, which is always negative. Hence

$$\hat\theta = \Big[\frac{1}{n}\sum_{i=1}^n \ln(X_i/c)\Big]^{-1}$$

is the ML estimator.

(e) Is your estimator efficient?

We need to check the condition for equality in the CR bound, $\frac{\partial}{\partial\theta}\sum_i \ln f(X_i; \theta) = K_\theta\,(\hat\theta - \theta)$, where $K_\theta$ is not a function of $\underline{x}$. From part (d),

$$\frac{\partial}{\partial\theta}\sum_{i=1}^n \ln f(X_i; \theta) = \frac{n}{\theta} - \frac{n}{\hat\theta} = \underbrace{\frac{n}{\theta\hat\theta}}_{K}\,(\hat\theta - \theta),$$

which means that the ML estimator is not efficient, since the multiplicative term $K$ is a function of $\underline{x}$ through $\hat\theta$.
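The sketch below (assumed values for $\theta$, $c$, $n$, and the trial count) illustrates part (e) numerically: the variance of $\hat\theta$ stays strictly above the CRB $\theta^2/n$ at finite $n$. Since $\sum_i \ln(X_i/c)$ is Gamma distributed, the simulation also makes visible a small multiplicative bias, $E_\theta[\hat\theta] = n\theta/(n-1)$, which the solution above does not need.

```python
import numpy as np

rng = np.random.default_rng(0)
theta, c, n, n_trials = 2.0, 1.5, 25, 200_000

# Inverse-CDF sampling: F(x) = 1 - (c/x)**theta for x >= c, so X = c * U**(-1/theta).
x = c * rng.uniform(size=(n_trials, n)) ** (-1.0 / theta)
theta_hat = n / np.log(x / c).sum(axis=1)   # the MLE of part (d)

print("mean:", theta_hat.mean(), " (n*theta/(n-1) =", n * theta / (n - 1), ")")
print("var :", theta_hat.var(), " (CRB theta^2/n =", theta**2 / n, ")")
```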

4.2 Let $X_1, X_2, \ldots, X_n$ be i.i.d. variables with the generalized Pareto density:

$$f(x; \theta) = c\,\theta^c\, x^{-(c+1)}, \qquad x \ge \theta,$$

and $f(x; \theta) = 0$ otherwise, where $c > 0$ is known and $\theta > 0$ is unknown.

(a) Is $f(x; \theta)$ a member of the exponential family? Why or why not?

The density is not a member of the exponential family since its support set depends on $\theta$.

(b) Find a one dimensional sufficient statistic for $\theta$ given $X_1, X_2, \ldots, X_n$.

Represent the j.p.d.f. as

$$f(\underline{x}; \theta) = \prod_{i=1}^n c\,\theta^c\, x_i^{-(c+1)} I(x_i \ge \theta) = \underbrace{c^n \prod_{i=1}^n x_i^{-(c+1)}}_{h(\underline{x})}\;\underbrace{\theta^{nc}\, I\big(\min_i X_i \ge \theta\big)}_{g(T,\theta)},$$

where $T = \min_i X_i$ is recognizable as a scalar S.S.

(c) Derive the maximum likelihood estimator $\hat\theta$ of $\theta$.

A plot of the likelihood function of part (b) is a monotone increasing function of $\theta$ for $\theta \le \min_i X_i$ and equal to zero for $\theta > \min_i X_i$. Thus it is obvious that the maximum occurs at $\hat\theta = \min_i X_i$, which is the MLE. Note that you cannot take a derivative and set it to zero here!
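A minimal simulation (with assumed values of $\theta$ and $c$) illustrating part (c): because the likelihood is maximized at the boundary of the support, the MLE is the sample minimum, which approaches $\theta$ from above as $n$ grows.

```python
import numpy as np

rng = np.random.default_rng(0)
theta, c = 2.0, 3.0

# Inverse-CDF sampling: F(x) = 1 - (theta/x)**c for x >= theta, so X = theta * U**(-1/c).
for n in [10, 100, 1_000, 10_000]:
    x = theta * rng.uniform(size=n) ** (-1.0 / c)
    print(f"n = {n:6d}   theta_hat = min(x) = {x.min():.5f}   (theta = {theta})")
```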

4.29 A sequence of $n$ measurements $X_1, \ldots, X_n$ are i.i.d. with marginal density

$$f_{X_i}(x; \theta) = \theta\, x^{-2}\, e^{-\theta/x}, \qquad x > 0,$$

where $\theta > 0$ is an unknown parameter.

(a) For parts (a) and (b) assume that $\theta$ is non-random. Is this density a member of the exponential family? Find a one dimensional sufficient statistic for $\theta$.

Since the measurements are i.i.d., it suffices to verify that the univariate distribution is from the exponential family:

$$f_{X_i}(x; \theta) = a(\theta)\, b(x)\, e^{c(\theta) t(x)}, \qquad a(\theta) = \theta, \quad b(x) = x^{-2} I(x > 0), \quad c(\theta) = -\theta, \quad t(x) = 1/x. \tag{1}$$

Since we could show $f_{X_i}(x; \theta) = a(\theta) b(x) e^{c(\theta) t(x)}$, the density is from the exponential family. We know that the sufficient statistic for the multivariate (exponential family) density is $T(\underline{x}) = \sum_i t(x_i)$; therefore

$$T(\underline{x}) = \sum_{i=1}^n \frac{1}{x_i},$$

where $\underline{x} = [x_1, x_2, \ldots, x_n]^T$.

(b) Find the maximum likelihood estimator of $\theta$.

To obtain the maximum likelihood estimator we can maximize the likelihood or its logarithm:

$$f_{\underline{X}}(\underline{x}; \theta) = \theta^n\, e^{-\theta \sum_i 1/x_i} \prod_{i=1}^n x_i^{-2}\, I(x_i > 0),$$

$$\hat\theta_{\rm ML} = \arg\max_{\theta > 0}\Big\{K(\underline{x}) + n\log\theta - \theta\sum_{i=1}^n \frac{1}{x_i}\Big\},$$

where $K(\underline{x})$ is the $\underline{x}$-dependent term independent of $\theta$. Since the argument is smooth w.r.t. $\theta$ and has a negative second derivative, $-n/\theta^2 < 0$ (concave), its global maximizer can be found by differentiating and setting to zero:

$$\frac{d}{d\theta}\Big(K(\underline{x}) + n\log\theta - \theta\sum_i \frac{1}{x_i}\Big) = \frac{n}{\theta} - \sum_i \frac{1}{x_i} = 0.$$

Therefore, the MLE of $\theta$ is

$$\hat\theta_{\rm ML} = \frac{n}{\sum_{i=1}^n 1/x_i}.$$

(c) For parts (c) and (d) assume that $\theta$ is a random variable having density $f(\theta) = e^{-\theta}$, $\theta > 0$. Find the MAP estimator of $\theta$.

The MAP estimator can be obtained by maximizing the posterior density $f(\theta \mid \underline{x})$ or by maximizing the joint density $f(\underline{x}, \theta)$. For simplicity, we proceed with the latter and maximize the joint density by maximizing its logarithm:

$$\hat\theta_{\rm MAP} = \arg\max_\theta \log f(\underline{x}, \theta) = \arg\max_\theta\big\{\log f(\underline{x} \mid \theta) + \log f(\theta)\big\} = \arg\max_\theta\Big\{K(\underline{x}) + n\log\theta - \theta\Big(\sum_i \frac{1}{x_i} + 1\Big)\Big\}.$$

Since the argument is smooth w.r.t. $\theta$ and has a negative second derivative (concave), its global maximizer can be found by differentiating and setting to zero, yielding

$$\hat\theta_{\rm MAP} = \frac{n}{1 + \sum_{i=1}^n 1/x_i}.$$

(d) Find the minimum mean squared error estimator of $\theta$ and compare to your result in part (c). Hint: $\int_0^\infty \alpha^n e^{-\alpha}\, d\alpha = n!$.

The MMSE estimator is given by $\hat\theta_{\rm MMSE} = E[\theta \mid \underline{x}]$, with the integral form

$$\hat\theta_{\rm MMSE} = \int_0^\infty \theta\, f(\theta \mid \underline{x})\, d\theta = \frac{\int_0^\infty \theta\, f(\theta, \underline{x})\, d\theta}{\int_0^\infty f(\theta, \underline{x})\, d\theta}.$$

Substituting $\theta^n e^{-\theta(1 + \sum_i 1/x_i)} \prod_i x_i^{-2} I(x_i > 0)$ for the joint pdf $f(\theta, \underline{x})$ (the $\prod_i x_i^{-2}$ factors cancel in the ratio), we obtain

$$\hat\theta_{\rm MMSE} = \frac{\int_0^\infty \theta^{n+1} e^{-\theta(1 + \sum_i 1/x_i)}\, d\theta}{\int_0^\infty \theta^{n} e^{-\theta(1 + \sum_i 1/x_i)}\, d\theta} = \frac{\big(1 + \sum_i 1/x_i\big)^{-(n+2)} \int_0^\infty \alpha^{n+1} e^{-\alpha}\, d\alpha}{\big(1 + \sum_i 1/x_i\big)^{-(n+1)} \int_0^\infty \alpha^{n} e^{-\alpha}\, d\alpha} = \frac{(n+1)!}{n!}\cdot\frac{1}{1 + \sum_i 1/x_i} = \frac{n+1}{1 + \sum_{i=1}^n 1/x_i}.$$

There is a factor of $(n+1)/n$ between the MAP and the MMSE estimators.
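A hedged numerical check of parts (c) and (d) (one synthetic data set with an assumed $\theta$ and a small $n$): the closed-form MAP and MMSE estimates are compared against brute-force grid maximization and integration of the posterior, which is proportional to $\theta^n e^{-\theta(1 + \sum_i 1/x_i)}$.

```python
import numpy as np

rng = np.random.default_rng(0)
theta_true, n = 1.5, 8

# Under f(x; theta) = theta * x**(-2) * exp(-theta/x), 1/X is Exponential(theta).
x = 1.0 / rng.exponential(scale=1.0 / theta_true, size=n)
s = np.sum(1.0 / x)

theta = np.linspace(1e-6, 20.0, 400_001)
log_post = n * np.log(theta) - theta * (1.0 + s)   # log posterior up to a constant
post = np.exp(log_post - log_post.max())           # unnormalized, overflow-safe

print("MAP :", theta[np.argmax(post)], " closed form:", n / (1.0 + s))
print("MMSE:", np.trapz(theta * post, theta) / np.trapz(post, theta),
      " closed form:", (n + 1) / (1.0 + s))
```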

4.33 In this problem you will investigate estimation of the transition probability of an observed binary valued Markov chain. Available for measurement is a sequence $X_0, X_1, \ldots, X_n$ whose joint probability mass function satisfies

$$p_\theta(x_0, x_1, \ldots, x_n) = p(x_0)\prod_{i=1}^n p_\theta(x_i \mid x_{i-1}),$$

where $p(x_0) = P(X_0 = x_0) = 1/2$, $x_0 \in \{0, 1\}$, and the conditional probability $p_\theta(x_i \mid x_{i-1}) = P(X_i = x_i \mid X_{i-1} = x_{i-1})$, $x_i \in \{0, 1\}$, is given by

$$p_\theta(x_i \mid x_{i-1}) = \begin{cases} 1 - \theta, & (x_i, x_{i-1}) \in \{(0,0), (1,1)\}, \\ \theta, & \text{o.w.} \end{cases}$$

This is a binary Markov process that has transition probability equal to $\theta$ (note that it is only an i.i.d. process when $\theta = 1/2$). The problem of estimating $\theta$ from a realization $x_0, x_1, \ldots, x_n$ arises in binary symmetric channel (BSC) identification and sequence dependency estimation.

(a) Find a sufficient statistic for $\theta$ and show that the likelihood function is in the exponential family. (Hint: express $p_\theta(x_i \mid x_{i-1})$ as an exponential function of $\theta$ with exponents dependent on products of the $x_k$'s.)

Sol: Using the hint, express

$$p_\theta(x_i \mid x_{i-1}) = (1-\theta)^{x_i x_{i-1} + (1-x_i)(1-x_{i-1})}\;\theta^{x_i(1-x_{i-1}) + (1-x_i)x_{i-1}}.$$

Define $T = \sum_{i=1}^n \big[x_i x_{i-1} + (1-x_i)(1-x_{i-1})\big]$, the number of successive pairs in the sequence whose values are identical ($(0,0)$ or $(1,1)$). Then $n - T = \sum_{i=1}^n \big[x_i(1-x_{i-1}) + (1-x_i)x_{i-1}\big]$ is the number of successive pairs taking on different values. With this we have the form of the likelihood function

$$p_\theta(x_0, x_1, \ldots, x_n) = \frac{1}{2}\,(1-\theta)^T\,\theta^{n-T} = \frac{1}{2}\,\theta^n\Big(\frac{1-\theta}{\theta}\Big)^T,$$

which is in the exponential family with sufficient statistic $T$ (identify $a(\theta) = \theta^n$, $b(x) = 1/2$, $c(\theta) = \ln\big((1-\theta)/\theta\big)$ in the form $p_\theta(x_0, \ldots, x_n) = a(\theta)\, b(x)\, e^{T c(\theta)}$).

(b) Find a method of moments estimator of $\theta$. Is your estimator unbiased?

Sol: If you try to use the standard approach $m_1 = E_\theta[X_1]$ you will find that $m_1 = E[E_\theta[X_1 \mid X_0]] = 1/2$, which does not depend on $\theta$. Thus a trivial MOM is $\hat\theta = 1/2$, which is biased. An alternative is to look at the first moment of $T$, $E_\theta[T]$, since we know that $T$ is a sufficient statistic for $\theta$. From (a), the pmf $p_\theta(T)$ of $T$ must be proportional to $(1-\theta)^T\theta^{n-T}$ (recall Ex 3.7). Thus, as $T$ takes values in the range $\{0, \ldots, n\}$, $T$ must in fact be binomial with parameter $1-\theta$, and thus $E[T] = n(1-\theta)$. Hence a MOM estimator is $\hat\theta = 1 - T/n$. This is unbiased.

(c) Find a maximum likelihood estimator of $\theta$. Is your estimator unbiased?

Sol: From (a), the MLE $\hat\theta$ maximizes $(1-\theta)^T\theta^{n-T}$ over $\theta$. This yields $\hat\theta = 1 - T/n$, as in part (b). This is unbiased.

(d) Compute the Cramér-Rao lower bound on the variance of unbiased estimators of $\theta$. Is the CR bound achievable by the ML estimator?

Sol: The Fisher information is

$$F(\theta) = -E_\theta\big[d^2 \ln p_\theta(\underline{x})/d\theta^2\big],$$

which is easily found from (a) to be

$$F(\theta) = \frac{n}{\theta(1-\theta)}.$$

The CRB is $1/F(\theta)$. To investigate attainability of the CRB, consider the first derivative

$$\frac{d \ln p_\theta(\underline{x})}{d\theta} = -\frac{T}{1-\theta} + \frac{n-T}{\theta} = \frac{n}{\theta(1-\theta)}\,\big(1 - T/n - \theta\big).$$

Therefore the bound is achievable, and it is in fact achieved by the MLE, since the MLE is unbiased and has the form $1 - T/n$.
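A simulation sketch for this problem (assumed $\theta$, $n$, and trial count): each transition of the chain flips the state with probability $\theta$ independently of the past, so $T$ is Binomial$(n, 1-\theta)$ regardless of $X_0$. The code verifies $E[T] = n(1-\theta)$, the unbiasedness of $\hat\theta = 1 - T/n$, and attainment of the CRB $\theta(1-\theta)/n$.

```python
import numpy as np

rng = np.random.default_rng(0)
theta, n, n_trials = 0.3, 200, 50_000

# A "flip" event at step i means X_i != X_{i-1}; flips are i.i.d. Bernoulli(theta),
# so T, the number of identical successive pairs, is the number of non-flips.
flips = rng.uniform(size=(n_trials, n)) < theta
t = (~flips).sum(axis=1)
theta_hat = 1.0 - t / n

print("E[T]    :", t.mean(), " (n*(1 - theta) =", n * (1 - theta), ")")
print("mean est:", theta_hat.mean(), " (theta =", theta, ")")
print("var  est:", theta_hat.var(), " (CRB theta*(1-theta)/n =", theta * (1 - theta) / n, ")")
```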