Suggested solutions to written exam Jan 17, 2012

LINKÖPINGS UNIVERSITET
Institutionen för datavetenskap
Statistik, ANd

73A36 THEORY OF STATISTICS, 6 CDTS
Master's program in Statistics and Data Mining
Fall semester. Written exam.

Suggested solutions to written exam Jan 17, 2012

Task 1

a) The mean of the Gamma distribution is

    E(X) = ∫₀^∞ x · x³/(λ⁴ 3!) e^(−x/λ) dx = 4λ ∫₀^∞ x⁴/(λ⁵ 4!) e^(−x/λ) dx = 4λ,

since the latter integral integrates a new Gamma density from zero to infinity. The moment estimator of λ satisfies the equation E(X) = x̄, which gives the moment estimate λ̂_MM = x̄/4.

b) Since the Gamma distribution satisfies the usual regularity conditions, the Cramér-Rao inequality applies, and hence the answer is yes. The Fisher information is

    I(λ) = −E[d²l(λ; x)/dλ²],

where the log-likelihood is

    l(λ; x) = ln Π x_i³/(λ⁴ 3!) e^(−x_i/λ) = Σ 3 ln x_i − 4n ln λ − n ln 3! − (1/λ) Σ x_i.

Thus

    dl(λ; x)/dλ = −4n/λ + (1/λ²) Σ x_i,
    d²l(λ; x)/dλ² = 4n/λ² − (2/λ³) Σ x_i,

giving

    I(λ) = −4n/λ² + (2/λ³) Σ E(X_i) = −4n/λ² + (2/λ³) · n · 4λ = 4n/λ²,

and the lower bound is I⁻¹(λ) = λ²/(4n). An estimate of this lower bound is found by replacing λ in I⁻¹(λ) by its moment estimate from a), giving I⁻¹(λ̂) = (x̄/4)²/(4n).

Task 2

    f(x; θ) = e^(x ln θ − θ − ln x!) = (θ^x / x!) e^(−θ)

Hence, it is the Poisson distribution.
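Both results in b) can be checked numerically: for Gamma(4, λ) the moment estimator x̄/4 is unbiased and its variance equals the bound λ²/(4n) exactly. A minimal pure-Python simulation sketch (λ = 2 and n = 200 are arbitrary illustrative choices; `random.gammavariate` uses the shape/scale parameterization):

```python
import random
import statistics

random.seed(1)

lam, n, reps = 2.0, 200, 2000

# Draw `reps` samples of size n from Gamma(shape=4, scale=lam)
# and compute the moment estimate lam_hat = xbar / 4 each time.
estimates = []
for _ in range(reps):
    xs = [random.gammavariate(4, lam) for _ in range(n)]
    estimates.append(statistics.fmean(xs) / 4)

crlb = lam ** 2 / (4 * n)        # Cramér-Rao lower bound λ²/(4n)
emp_var = statistics.variance(estimates)

print(round(statistics.fmean(estimates), 2))  # ≈ 2.0 (unbiased)
print(round(emp_var / crlb, 1))               # ≈ 1.0 (bound attained)
```

The ratio near 1 reflects that the moment estimator here is also efficient: its variance attains the Cramér-Rao bound for every λ.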

a) Use the result that the MLE of θ has an asymptotic normal distribution with mean θ and variance I⁻¹(θ). The log-likelihood is

    l(θ; x) = Σ x_i ln θ − nθ − Σ ln x_i!,

which can be easily derived from the first expression of the density above. Using the exponential family representation with the natural parameterization φ = ln θ, the MLE of φ is found by solving

    Σ x_i = E(Σ X_i) = nθ = n e^φ,

giving θ̂_MLE = x̄ (due to the invariance property of MLEs). Now

    dl(θ; x)/dθ = (Σ x_i)/θ − n  and  d²l(θ; x)/dθ² = −(Σ x_i)/θ²,

and that gives

    I(θ) = (Σ E(X_i))/θ² = nθ/θ² = n/θ.

Thus, approximately,

    θ̂_MLE ~ N(θ, θ/n).

A 95% approximate confidence interval for θ can be found by solving

    −1.96 < (θ̂_MLE − θ)/√(θ/n) < 1.96

for θ. However, it is simpler, and still good enough, to use the interval

    θ̂_MLE ± 1.96 √(θ̂_MLE/n),

which for the observed data (θ̂_MLE = 3) gives approximately 3 ± 0.5 = (2.5, 3.5).

b) The likelihood can be written

    L(θ; x) = [Π_{i=1}^{3} (θ^{x_i}/x_i!) e^(−θ)] · Pr(X < 2)
            = θ^(4+3+3) e^(−3θ) (e^(−θ) + θ e^(−θ))
            ∝ θ^10 e^(−4θ) (1 + θ),

where the fourth observation is only recorded as X < 2, and the log-likelihood is

    l(θ; x) = constant + 10 ln θ − 4θ + ln(1 + θ).
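The Wald-type interval in a) is mechanical to compute. In the sketch below, the sample size n = 25 is purely illustrative (the sample size used in the exam is not reproduced here); the formula is the one derived above:

```python
from math import sqrt

def poisson_wald_ci(xbar, n, z=1.96):
    """Approximate 95% CI for a Poisson mean, based on the
    asymptotic result theta_hat = xbar ~ N(theta, theta / n)."""
    half = z * sqrt(xbar / n)
    return (xbar - half, xbar + half)

# Illustrative call: xbar = 3 as in the solution, n = 25 assumed.
lo, hi = poisson_wald_ci(3.0, 25)
print(round(lo, 2), round(hi, 2))  # 2.32 3.68
```

For larger n the half-width shrinks like 1/√n, which is why the exam's interval is tighter than this illustrative one.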

Hence,

    dl(θ; x)/dθ = 10/θ − 4 + 1/(1 + θ).

The score equation dl(θ; x)/dθ = 0 gives upon simplification

    θ² − (7/4)θ − 10/4 = 0,

with the solution

    θ = (7 + √209)/8,

since θ is known to be > 0. The second derivative,

    d²l(θ; x)/dθ² = −10/θ² − 1/(1 + θ)²,

is negative for all θ > 0. Hence,

    θ̂_MLE = (7 + √209)/8 ≈ 2.68.

Task 3

a) The likelihood function is

    L(a; x) = Π_{i=1}^{n} x_i^(a−1) (1 − x_i)/B(a, 2) = (Π x_i)^(a−1) (Π (1 − x_i)) / (B(a, 2))^n.

The best test satisfies

    L(a₀; x)/L(a₁; x) ≤ A.

Taking natural logarithms gives

    l(a₀; x) − l(a₁; x) ≤ ln A = B*,

i.e.

    (a₀ − 1) Σ ln x_i + Σ ln(1 − x_i) − n ln B(a₀, 2) − (a₁ − 1) Σ ln x_i − Σ ln(1 − x_i) + n ln B(a₁, 2) ≤ B*,

which simplifies to

    (a₀ − a₁) Σ ln x_i ≤ B* + n (ln B(a₀, 2) − ln B(a₁, 2)).

Since a₀ − a₁ < 0, dividing by a₀ − a₁ reverses the inequality, and we get

    Σ ln x_i ≥ C

as the best test.
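The closed-form root above can be verified against the score function: plugging θ̂ = (7 + √209)/8 into dl/dθ should give zero. A short check:

```python
from math import sqrt

# Score function of the censored-sample log-likelihood
# l(θ) = 10 ln θ − 4θ + ln(1 + θ) derived above.
def score(t):
    return 10 / t - 4 + 1 / (1 + t)

theta_hat = (7 + sqrt(209)) / 8   # positive root of 4θ² − 7θ − 10 = 0
print(round(theta_hat, 4))        # 2.6821
print(abs(score(theta_hat)) < 1e-9)  # True: the score vanishes at θ̂
```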

b) No, it is not UMP, since the inequality above changes direction when a₁ < a₀; no single rejection region is therefore best against alternatives on both sides of a₀.

c) The Central Limit Theorem can be used on Σ ln X_i, provided we know E(ln X_i | H₀) and Var(ln X_i | H₀). Integration by parts (or the standard integrals ∫₀¹ x^k ln x dx = −1/(k+1)² and ∫₀¹ x^k (ln x)² dx = 2/(k+1)³) gives

    E(ln X) = (1/B(a, 2)) ∫₀¹ (ln x)(x^(a−1) − x^a) dx = (1/B(a, 2)) (−1/a² + 1/(a + 1)²).

Under H₀ we have a = a₀ = 3, which is an integer, and

    B(3, 2) = Γ(3)Γ(2)/Γ(5) = 2! 1!/4! = 1/12.

Hence

    E(ln X_i | H₀) = 12 (−1/9 + 1/16) = −7/12 ≈ −0.583.

Further,

    E((ln X_i)² | H₀) = 12 ∫₀¹ (ln x)² (x² − x³) dx = 12 (2/27 − 2/64) = 37/72,

and this gives

    Var(ln X_i | H₀) = 37/72 − (7/12)² = 25/144 ≈ 0.174.

The Central Limit Theorem now gives that, if H₀ is true,

    Σ_{i=1}^{n} ln X_i ~ N(−(7/12)n, (25/144)n)

approximately. Hence, with α = 5%, the critical value C of the test Σ ln X_i ≥ C solves

    Pr(Σ ln X_i ≥ C) = 1 − Φ((C + (7/12)n)/((5/12)√n)) = 0.05,

i.e.

    C = −(7/12)n + z_0.05 (5/12)√n,  with z_0.05 = 1.6449.
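The two moments E(ln X | H₀) = −7/12 and Var(ln X | H₀) = 25/144 can be confirmed by simulation from Beta(3, 2); a sketch using the stdlib sampler `random.betavariate` (the sample size 200 000 is arbitrary):

```python
import math
import random

random.seed(0)

# Draw from Beta(3, 2), the null distribution in part c), and check
# the derived moments of ln X: mean −7/12 ≈ −0.583, var 25/144 ≈ 0.174.
logs = [math.log(random.betavariate(3, 2)) for _ in range(200_000)]

mean = sum(logs) / len(logs)
var = sum((v - mean) ** 2 for v in logs) / (len(logs) - 1)

print(round(mean, 2))  # close to -7/12
print(round(var, 2))   # close to 25/144
```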

Task 4

Repeated coin tossing is a binomial experiment. Thus the likelihood function for the experimental data in this case is

    L(π; x) = (n choose x) π^x (1 − π)^(n−x).

a) The minimax estimator coincides with the Bayes estimator when the risk function is constant. The conjugate prior to the binomial likelihood is the Beta distribution with parameters α and β. The Bayes estimator with quadratic loss is the posterior mean, i.e.

    π̂_B = (α + x)/(α + β + n).

For this estimator the risk function is

    R(π; π̂) = E_{X|π}[(π̂ − π)²] = Var(π̂) + (Bias(π̂))²
             = Var(X)/(α + β + n)² + (E((α + X)/(α + β + n)) − π)²
             = [n π(1 − π) + (α − π(α + β))²]/(α + β + n)²,

using Var(X) = nπ(1 − π) and E(X) = nπ. For this risk function to be independent of π we require the coefficients of π and π² in the numerator to be zero. This gives

    n − 2α(α + β) = 0  and  (α + β)² − n = 0,

which is satisfied by α = β = √n/2. (The whole derivation for a general n is made in the textbook, unfortunately with a small error in the stated common value; the correct value is √n/2.)

Thus, with these values of α and β, the Bayes estimator, and also the minimax estimator, becomes

    π̂ = (√n/2 + x)/(√n + n).

b) This loss function is a zero-one loss function: L_S(π, π̂) = 0 when π̂ is within a small tolerance of π, and 1 otherwise. With zero-one loss the Bayes estimator is the mode of the posterior distribution, which with a Beta(α, β) prior is

    π̂_B = (α + x − 1)/(α + β + n − 2).

The prior to be used here is the one that was derived in a), i.e. a Beta with α = β = √n/2, which gives

    π̂_B = (√n/2 + x − 1)/(√n + n − 2).
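That the choice α = β = √n/2 really makes the risk constant in π can be verified by computing the exact risk (a finite sum over x = 0, …, n) on a grid of π values; n = 10 below is an arbitrary example:

```python
from math import comb, sqrt

def risk(n, a, b, p):
    """Exact risk E[(pi_hat - p)^2] of the posterior-mean estimator
    pi_hat = (a + x) / (a + b + n) under Binomial(n, p) sampling."""
    return sum(comb(n, x) * p ** x * (1 - p) ** (n - x)
               * ((a + x) / (a + b + n) - p) ** 2
               for x in range(n + 1))

n = 10
a = b = sqrt(n) / 2               # the minimax prior parameters
risks = [risk(n, a, b, p) for p in (0.1, 0.3, 0.5, 0.7, 0.9)]
print(all(abs(r - risks[0]) < 1e-12 for r in risks))  # True
```

The common value of the risk works out to n/(4(√n + n)²), matching the derivation's constant numerator n/4.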

Task 5

a) We would like to test

    H₀: The suspect is the writer of the signature

against

    H₁: The owner of the signature is the writer of the signature.

The study of the samples of handwriting gives us (approximate) likelihoods of the two hypotheses, Pr(Characteristics | H₀) and Pr(Characteristics | H₁). Since the two hypotheses are both simple, the Bayes factor is the likelihood ratio

    B = Pr(Characteristics | H₀)/Pr(Characteristics | H₁) = 5.

Now, since the prior odds for H₀ are nine to one on (Q₀ = 9), the posterior odds are

    Q₁ = B · Q₀ = 5 · 9 = 45,

or 45 to 1 on.
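The odds form of Bayes' theorem used above is one line of arithmetic; a sketch with the Bayes factor B = 5 and prior odds 9 from the solution:

```python
# Posterior odds = Bayes factor × prior odds (odds form of Bayes' theorem).
bayes_factor = 5.0     # Pr(Characteristics | H0) / Pr(Characteristics | H1)
prior_odds = 9.0       # nine to one on H0
posterior_odds = bayes_factor * prior_odds
print(posterior_odds)  # 45.0
```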

b) In this case we test H₀ above against the composite hypothesis

    H₁: One of the owner and the third person is the writer of the signature.

The likelihood for the third person being the writer, Pr(Characteristics | Third person), is obtained analogously to the previous likelihoods. Further, we also have that

    Pr(Owner | H₁) = 2/3  and  Pr(Third person | H₁) = 1/3.

The Bayes factor then becomes

    B = Pr(Characteristics | H₀) / [Pr(Characteristics | Owner) · (2/3) + Pr(Characteristics | Third person) · (1/3)] = 2.5,

and the posterior odds become

    Q₁ = 2.5 · 9 = 22.5.

Task 6

a) Compute the difference in response time between engine A and engine B for each of the ten test runs. Discard the pair with equal response times. Among the remaining nine pairs, seven differences are negative. Under the null hypothesis of generally equally long response times (median difference = 0), the number of negative differences, X, would follow a Bin(9, 0.5) distribution. Hence, the P-value is

    Pr(X ≥ 7) = [C(9, 7) + C(9, 8) + C(9, 9)] · 0.5⁹ = 46/512 ≈ 0.09.

b) Rank the absolute differences (discarding the zero difference) and compute the rank sum of the absolute differences originating in negative differences; the seven negative differences receive the ranks 9, 5, 4, 3, 7, 8 and 6, and the two positive differences the ranks 1 and 2.

The rank sum of the originally negative differences becomes

    W = 9 + 5 + 4 + 3 + 7 + 8 + 6 = 42.

Under the assumption that the response times are generally equally long for the two engines,

    W ~ N(n(n + 1)/4, n(n + 1)(2n + 1)/24) = N(22.5, 71.25)

approximately, with n = 9. Hence, the P-value is approximately

    Pr(W ≥ 42) ≈ 1 − Φ((42 − 22.5)/√71.25) = 1 − Φ(2.31) ≈ 0.01.
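Both P-values in this task depend only on counts and ranks, so they can be reproduced exactly; the sketch below recomputes the binomial tail in a) and the normal approximation in b), writing Φ via `math.erf`:

```python
from math import comb, erf, sqrt

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# a) Sign test: 7 negative differences out of 9 non-tied pairs,
#    X ~ Bin(9, 0.5) under H0.
p_sign = sum(comb(9, k) for k in (7, 8, 9)) * 0.5 ** 9
print(round(p_sign, 3))       # 0.09

# b) Wilcoxon signed-rank: W = 42, approximated by
#    N(n(n+1)/4, n(n+1)(2n+1)/24) with n = 9.
n = 9
mu = n * (n + 1) / 4                  # 22.5
var = n * (n + 1) * (2 * n + 1) / 24  # 71.25
p_wilcoxon = 1 - phi((42 - mu) / sqrt(var))
print(round(p_wilcoxon, 3))   # 0.01
```

The signed-rank test gives the smaller P-value because it uses the magnitudes of the differences, not just their signs.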