ORF 245. Rigollet    Date: 11/21/2008

Problem 1 (20)

[Figure: six probability density functions, each plotted together with the pdf of a standard normal distribution (dashed line). Panels: Normal (with mean -1), Negative-exponential, Log-normal, Cauchy, Uniform, Mixture of two normals.]

The figure above displays six probability density functions together with the pdf of a standard normal distribution, shown as a dashed line for comparison (note that the scale of the axes can change from panel to panel). They correspond to the following well-known distributions: Normal (with mean -1), Log-normal, Uniform, Negative-exponential, Cauchy, and a mixture of two normals. The figure below displays normal quantile-quantile (Q-Q) plots of six samples, each simulated from one of these distributions. For each normal Q-Q plot (numbered from 1 to 6), write which distribution you think the sample was simulated from and briefly explain your choice (short indications such as "negatively or positively skewed" or "light or heavy tails" are enough).

[Figure: six normal Q-Q plots of the simulated samples, numbered 1 to 6.]

1. Negative exponential (skewed to the left)
2. Mixture of two normals (two sets of aligned points)
3. Normal with mean -1 (aligned points)
4. Log-normal (skewed to the right)
5. Cauchy (symmetric, heavy tails)
6. Uniform (symmetric, light tails)
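Not part of the original exam: a minimal R sketch that simulates one sample from each of the six candidate distributions and draws its normal Q-Q plot, so that the patterns cited above (skewness, heavy or light tails, two aligned clouds for the mixture) can be reproduced. The sample size, the mixture components and the other distribution parameters are illustrative assumptions.

    # Simulate one sample per distribution and draw the six normal Q-Q plots.
    set.seed(1)
    n <- 200
    samples <- list(
      "Normal, mean -1"        = rnorm(n, mean = -1),
      "Log-normal"             = rlnorm(n),
      "Uniform"                = runif(n, -1, 1),
      "Negative exponential"   = -rexp(n),
      "Cauchy"                 = rcauchy(n),
      "Mixture of two normals" = ifelse(runif(n) < 0.5, rnorm(n, -3), rnorm(n, 3))
    )
    par(mfrow = c(2, 3))
    for (nm in names(samples)) {
      qqnorm(samples[[nm]], main = nm)   # skewness bends one end, heavy tails bend both
      qqline(samples[[nm]])
    }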

Problem 2 (25)

A certain pen has been designed so that the true average writing lifetime $\mu$ under controlled conditions (involving the use of a writing machine) is at least 10 hours. A random sample of 14 pens is selected, and the writing lifetime (in hours) of each is denoted by $X_1, \ldots, X_{14}$. The observations gave $\bar{x} = 9.62$ and $s^2 = 0.16$. The normal Q-Q plot of the $X_i$ is given below.

[Figure: normal Q-Q plot of the 14 observed lifetimes; the points lie close to a straight line.]

1. From the normal Q-Q plot, can you conclude that the observations are normally distributed? Why?

Yes, the points are almost aligned.

2. State the appropriate hypothesis testing problem for the true average writing lifetime $\mu$.

$H_0: \mu = 10$ (equivalently $H_0: \mu \ge 10$) versus $H_1: \mu < 10$.

The a priori belief is that $\mu$ is (at least) 10, hence $H_0$. The interesting alternative is $\mu < 10$, because if we reject $H_0$ we want to be able to conclude that the pens have too short a lifetime; it is not interesting to conclude that the pens have too long a lifetime.

3. Find a testing procedure at level $\alpha$.

Normal observations and unknown variance, so we use a Student t test. The test statistic is
$$ T = \frac{\bar{X} - 10}{s/\sqrt{n}} $$

and has a t distribution with $n - 1$ degrees of freedom under the null hypothesis. Here $n = 14$ and, in view of the alternative, we reject if $T < -t_{13,\alpha}$.

4. Perform the above test at level 5%.

From the table, $t_{13,\,5\%} = 1.771$, and the observed statistic is
$$ T = \frac{9.62 - 10}{0.4/\sqrt{14}} \approx -3.555 < -1.771, $$
therefore we reject $H_0$ at level 5%.

5. Find a two-sided confidence interval for $\mu$ with confidence level 95%.

The interval is of the form
$$ \Big[\, \bar{X} - t_{n-1,\alpha/2}\,\frac{s}{\sqrt{n}}\,,\ \bar{X} + t_{n-1,\alpha/2}\,\frac{s}{\sqrt{n}} \,\Big] $$
with $\alpha = 5\%$ and $n = 14$. Numerical application yields $[9.389,\ 9.851]$.

6. What is the smallest number of pens to be selected for this confidence interval to be of width 6 minutes?

Since 6 min = 0.1 hr, we have to solve
$$ 2\, t_{n-1,\alpha/2}\, \frac{s}{\sqrt{n}} = 0.1 $$
with respect to $n$. Let us check what happens if we take $n = 61$, the largest value for which we can read $t_{n-1,\alpha/2}$ in the table before it becomes equal to $t_{\infty,\alpha/2} = z_{\alpha/2}$. This value yields
$$ 2 \times 2.000 \times \frac{0.4}{\sqrt{61}} \approx 0.205, $$
which is larger than 0.1, so we should take $n$ even larger than 61, and for such values $t_{n-1,\alpha/2} = z_{\alpha/2} = 1.960$. Therefore we have to solve
$$ 2 \times 1.960 \times \frac{0.4}{\sqrt{n}} = 0.1, $$
which yields $n = 246$ (which is indeed $> 61$).
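Not part of the original solution: a short R sketch that reproduces the numbers of Problem 2 (test statistic, critical value, confidence interval and required sample size) from the given summaries $\bar{x} = 9.62$, $s^2 = 0.16$, $n = 14$.

    n <- 14; xbar <- 9.62; s <- sqrt(0.16); mu0 <- 10; alpha <- 0.05
    tstat <- (xbar - mu0) / (s / sqrt(n))     # observed t statistic, about -3.555
    crit  <- qt(alpha, df = n - 1)            # lower-tail critical value, about -1.771
    pval  <- pt(tstat, df = n - 1)            # one-sided p-value
    c(tstat = tstat, crit = crit, pval = pval)
    # Two-sided 95% confidence interval for mu
    xbar + c(-1, 1) * qt(1 - alpha / 2, df = n - 1) * s / sqrt(n)   # about [9.389, 9.851]
    # Smallest n giving a CI of total width 0.1 hours, using the normal quantile for large n
    ceiling((2 * qnorm(1 - alpha / 2) * s / 0.1)^2)                 # 246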

Problem 3 (25)

Let $X_1, \ldots, X_n$ be i.i.d. random variables with uniform distribution on $[0, \theta]$. In particular, each of them has pdf
$$ f(x;\theta) = \begin{cases} 1/\theta & \text{if } 0 \le x \le \theta \\ 0 & \text{otherwise.} \end{cases} $$

1. What is the joint pdf $f(x_1, \ldots, x_n; \theta)$ of the sample $(X_1, \ldots, X_n)$?

Since the $X_i$ are i.i.d., their joint pdf is given by
$$ f(x_1, \ldots, x_n; \theta) = f(x_1;\theta) \cdots f(x_n;\theta) = \begin{cases} 1/\theta^n & \text{if } 0 \le \min_i x_i \le \max_i x_i \le \theta \\ 0 & \text{otherwise.} \end{cases} $$

2. Show that the maximum likelihood estimator of $\theta$ is given by $\hat\theta_{ML} = \max_i X_i$.

The joint pdf of the previous question, seen as a function of $\theta$, is equal to 0 if $\theta < \max_i x_i$ and equal to $1/\theta^n$ if $\theta \ge \max_i x_i$. The likelihood is given by $f(X_1, \ldots, X_n; \theta)$, and its maximum is therefore attained at the smallest value of $\theta$ for which the likelihood is non-zero; this value is $\hat\theta_{ML} = \max_i X_i$.

3. Find first the cdf and then the pdf of $\hat\theta_{ML}$.

The cdf is the function
$$ F(t) = P\big(\max_i X_i \le t\big) = [P(X_1 \le t)]^n = \begin{cases} 0 & \text{if } t \le 0 \\ (t/\theta)^n & \text{if } 0 \le t \le \theta \\ 1 & \text{if } t \ge \theta. \end{cases} $$
The pdf is the derivative of the cdf and is given by
$$ f(t) = F'(t) = \begin{cases} n t^{n-1}/\theta^n & \text{if } 0 \le t \le \theta \\ 0 & \text{otherwise.} \end{cases} $$

4. Using the previous question, compute $E(\hat\theta_{ML})$.

$$ E(\hat\theta_{ML}) = \int_0^\theta t\, f(t)\, dt = \int_0^\theta \frac{n t^n}{\theta^n}\, dt = \frac{n}{n+1}\,\theta. $$
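Not part of the original solution: a quick R simulation, under the assumed values $\theta = 2$ and $n = 5$, checking the derived cdf $F(t) = (t/\theta)^n$ of $\hat\theta_{ML} = \max_i X_i$ against the empirical frequency.

    set.seed(5)
    theta <- 2; n <- 5; B <- 10000
    m  <- replicate(B, max(runif(n, 0, theta)))   # B simulated values of the maximum
    t0 <- 1.5
    c(empirical = mean(m <= t0), theoretical = (t0 / theta)^n)   # both about 0.237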

5. Is $\hat\theta_{ML}$ an unbiased estimator of $\theta$? Why? If not, give a simple modification of $\hat\theta_{ML}$ that is unbiased.

$\hat\theta_{ML}$ is not an unbiased estimator of $\theta$ because $E(\hat\theta_{ML}) \ne \theta$. However, the estimator $\frac{n+1}{n}\hat\theta_{ML}$ is unbiased. Indeed,
$$ E\Big[\frac{n+1}{n}\,\hat\theta_{ML}\Big] = \frac{n+1}{n}\, E(\hat\theta_{ML}) = \theta. $$
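Not part of the original solution: a Monte Carlo check, with assumed values $\theta = 2$ and $n = 10$, that $E(\hat\theta_{ML})$ is close to $\frac{n}{n+1}\theta$ and that the corrected estimator $\frac{n+1}{n}\hat\theta_{ML}$ is unbiased.

    set.seed(2)
    theta <- 2; n <- 10; B <- 100000
    mle <- replicate(B, max(runif(n, 0, theta)))
    mean(mle)                  # about n/(n+1) * theta = 1.818: the MLE underestimates theta
    mean((n + 1) / n * mle)    # about theta = 2: the corrected estimator is unbiased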

Problem 4 (30)

Paul has decided to travel across Europe for a year. He is very interested in food and will be trying restaurants in each country he visits. However, he is afraid of putting on some weight and has decided to monitor his weight regularly, every month, during the 12 months he will spend there. His weight (in lbs) at month $i$, $i = 1, \ldots, 12$, is modelled by a random variable $Y_i$ of the form
$$ Y_i = \mu_i + \varepsilon_i, $$
where $\mu_i$ is Paul's true weight in month $i$ and $\varepsilon_i$, $i = 1, \ldots, 12$, are i.i.d. standard normal random variables which account for the errors of measurement from one month to another. Indeed, he will use different scales, measure his weight at different times of the day, etc. We assume that his true weight $\mu_i$ will change over the months as follows:
$$ \mu_i = 150 + \beta\, i, $$
where 150 lbs corresponds to Paul's true weight before he leaves for Europe and $\beta$ is an unknown parameter. We are mainly interested in the parameter $\beta$. You will need to use the following identities:
$$ \sum_{i=1}^n i = \frac{n(n+1)}{2} \qquad \text{and} \qquad \sum_{i=1}^n i^2 = \frac{n(n+1)(2n+1)}{6}. $$
We will also need the pdf of a random variable $X \sim N(\mu, \sigma^2)$:
$$ f(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}. $$

Estimation problem: The parameter $\beta$ controls the rate at which Paul will gain weight over time.

1. Denote by $\bar{Y} = \frac{1}{12}\sum_{i=1}^{12} Y_i$ the average observed weight over the whole year Paul will spend in Europe. Find $E(\bar{Y})$, $V(\bar{Y})$ and the distribution of $\bar{Y}$.

$$ E(\bar{Y}) = \frac{1}{12}\sum_{i=1}^{12} E(Y_i) = 150 + \frac{\beta}{12}\sum_{i=1}^{12} i = 150 + \beta\,\frac{12 \cdot 13}{2 \cdot 12} = 150 + 6.5\,\beta. $$
For the variance, remark that each $Y_i$ has variance 1 and the $Y_i$ are independent. Therefore
$$ V(\bar{Y}) = \frac{1}{12}\, V(Y_1) = \frac{1}{12}. $$
The $Y_i$ being independent random variables with normal distribution, $\bar{Y}$ also has a normal distribution with the above parameters:
$$ \bar{Y} \sim N\big(150 + 6.5\,\beta,\ 1/12\big). $$
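Not part of the original solution: a small R simulation, with an assumed value $\beta = 0.5$, checking that the mean and variance of $\bar{Y}$ derived above match $150 + 6.5\beta$ and $1/12$.

    set.seed(3)
    beta <- 0.5; B <- 100000
    ybar <- replicate(B, mean(150 + beta * (1:12) + rnorm(12)))
    c(mean(ybar), 150 + 6.5 * beta)   # both about 153.25
    c(var(ybar), 1 / 12)              # both about 0.083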

2. Using the previous question, find an estimator $\hat\beta_{MOM}$ for $\beta$ using the method of moments.

Since $E(\bar{Y}) = 150 + 6.5\,\beta$, we take
$$ \hat\beta_{MOM} = \frac{\bar{Y} - 150}{6.5}. $$

3. Is $\hat\beta_{MOM}$ an unbiased estimator of $\beta$? Why?

$$ E(\hat\beta_{MOM}) = \frac{E(\bar{Y}) - 150}{6.5} = \beta, $$
therefore $\hat\beta_{MOM}$ is an unbiased estimator of $\beta$.

4. What is the joint pdf $f(y_1, \ldots, y_{12}; \beta)$ of the random variables $Y_1, \ldots, Y_{12}$?

The $Y_i$ being independent, we find the joint pdf by taking the product of the marginal pdfs. The latter are given by
$$ f(y_i) = f(y_i; \mu_i, 1) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{(y_i - \mu_i)^2}{2}}. $$
Taking the product gives the joint pdf:
$$ f(y_1, \ldots, y_{12}; \beta) = \frac{1}{(\sqrt{2\pi})^{12}}\, e^{-\frac{1}{2}\sum_{i=1}^{12} (y_i - \mu_i)^2} = \frac{1}{(\sqrt{2\pi})^{12}}\, e^{-\frac{1}{2}\sum_{i=1}^{12} (y_i - 150 - \beta i)^2}. $$

5. Using the previous question, find explicitly the maximum likelihood estimator of $\beta$.

Consider the log-likelihood, given by
$$ \ln f(Y_1, \ldots, Y_{12}; \beta) = -12 \ln(\sqrt{2\pi}) - \frac{1}{2}\sum_{i=1}^{12} (Y_i - 150 - \beta i)^2. $$
Therefore, the maximum likelihood estimator is the value of $\beta$ that minimizes the term
$$ \sum_{i=1}^{12} (Y_i - 150 - \beta i)^2. $$
Setting the derivative with respect to $\beta$ to zero gives
$$ -2 \sum_{i=1}^{12} \big(Y_i - 150 - \hat\beta_{ML}\, i\big)\, i = 0, $$

which gives
$$ \hat\beta_{ML} = \frac{\sum_{i=1}^{12} i\,(Y_i - 150)}{\sum_{i=1}^{12} i^2} = \frac{\sum_{i=1}^{12} i\,(Y_i - 150)}{650}, $$
where we used the fact that $\sum_{i=1}^{12} i^2 = 650$.

6. Is $\hat\beta_{ML}$ an unbiased estimator of $\beta$? Why?

$$ E(\hat\beta_{ML}) = \frac{\sum_{i=1}^{12} i\,\big(E(Y_i) - 150\big)}{650} = \frac{\beta \sum_{i=1}^{12} i^2}{650} = \beta. $$
Therefore, $\hat\beta_{ML}$ is an unbiased estimator of $\beta$.

Testing problem: Paul is confident that by monitoring his weight regularly he will stay at the same weight (150 lbs) on average. He wants to test this hypothesis.

7. Given that Paul will either stay at the same weight or gain some weight (losing weight in Europe is not an option!), state the appropriate hypothesis testing problem for $\beta$.

$$ H_0: \beta = 0 \quad \text{versus} \quad H_1: \beta > 0. $$

8. Based on 12 observations $x_1, \ldots, x_{12}$, we used R to perform this test and the software outputs a p-value equal to 0.035 together with a lower confidence bound at level 95% for $\beta$. What is the sign ($> 0$ or $< 0$) of this lower confidence bound? Why? [Hint: a plot can be helpful]

The p-value is smaller than 5%, which means that the null hypothesis is rejected. On the other hand, we can use the lower confidence bound (LCB) to construct a test at level 5% by rejecting when the LCB is positive. Since we reject, the LCB must be positive.
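Not part of the original solution: an R sketch, with an assumed true value $\beta = 0.5$, that computes $\hat\beta_{MOM}$ and $\hat\beta_{ML}$ on one simulated year of weights and checks the closed-form MLE against a regression through the origin fitted with lm().

    set.seed(4)
    beta <- 0.5; i <- 1:12
    y <- 150 + beta * i + rnorm(12)       # one simulated year of monthly weights
    beta.mom <- (mean(y) - 150) / 6.5
    z <- y - 150
    beta.ml  <- sum(i * z) / sum(i^2)     # closed form, with sum(i^2) = 650
    fit <- lm(z ~ 0 + i)                  # least squares through the origin
    c(beta.mom, beta.ml, coef(fit))       # the last two coincide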

Student's t critical values (the last row, ν = ∞, gives the standard normal quantiles)

  ν     0.60   0.667  0.75   0.80   0.875  0.90   0.95    0.975   0.99    0.995   0.999
  1     0.325  0.577  1.000  1.376  2.414  3.078  6.314  12.706  31.821  63.657  318.31
  2     0.289  0.500  0.816  1.061  1.604  1.886  2.920   4.303   6.965   9.925  22.327
  3     0.277  0.476  0.765  0.978  1.423  1.638  2.353   3.182   4.541   5.841  10.215
  4     0.271  0.464  0.741  0.941  1.344  1.533  2.132   2.776   3.747   4.604   7.173
  5     0.267  0.457  0.727  0.920  1.301  1.476  2.015   2.571   3.365   4.032   5.893
  6     0.265  0.453  0.718  0.906  1.273  1.440  1.943   2.447   3.143   3.707   5.208
  7     0.263  0.449  0.711  0.896  1.254  1.415  1.895   2.365   2.998   3.499   4.785
  8     0.262  0.447  0.706  0.889  1.240  1.397  1.860   2.306   2.896   3.355   4.501
  9     0.261  0.445  0.703  0.883  1.230  1.383  1.833   2.262   2.821   3.250   4.297
  10    0.260  0.444  0.700  0.879  1.221  1.372  1.812   2.228   2.764   3.169   4.144
  11    0.260  0.443  0.697  0.876  1.214  1.363  1.796   2.201   2.718   3.106   4.025
  12    0.259  0.442  0.695  0.873  1.209  1.356  1.782   2.179   2.681   3.055   3.930
  13    0.259  0.441  0.694  0.870  1.204  1.350  1.771   2.160   2.650   3.012   3.852
  14    0.258  0.440  0.692  0.868  1.200  1.345  1.761   2.145   2.624   2.977   3.787
  15    0.258  0.439  0.691  0.866  1.197  1.341  1.753   2.131   2.602   2.947   3.733
  16    0.258  0.439  0.690  0.865  1.194  1.337  1.746   2.120   2.583   2.921   3.686
  17    0.257  0.438  0.689  0.863  1.191  1.333  1.740   2.110   2.567   2.898   3.646
  18    0.257  0.438  0.688  0.862  1.189  1.330  1.734   2.101   2.552   2.878   3.610
  19    0.257  0.438  0.688  0.861  1.187  1.328  1.729   2.093   2.539   2.861   3.579
  20    0.257  0.437  0.687  0.860  1.185  1.325  1.725   2.086   2.528   2.845   3.552
  21    0.257  0.437  0.686  0.859  1.183  1.323  1.721   2.080   2.518   2.831   3.527
  22    0.256  0.437  0.686  0.858  1.182  1.321  1.717   2.074   2.508   2.819   3.505
  23    0.256  0.436  0.685  0.858  1.180  1.319  1.714   2.069   2.500   2.807   3.485
  24    0.256  0.436  0.685  0.857  1.179  1.318  1.711   2.064   2.492   2.797   3.467
  25    0.256  0.436  0.684  0.856  1.178  1.316  1.708   2.060   2.485   2.787   3.450
  26    0.256  0.436  0.684  0.856  1.177  1.315  1.706   2.056   2.479   2.779   3.435
  27    0.256  0.435  0.684  0.855  1.176  1.314  1.703   2.052   2.473   2.771   3.421
  28    0.256  0.435  0.683  0.855  1.175  1.313  1.701   2.048   2.467   2.763   3.408
  29    0.256  0.435  0.683  0.854  1.174  1.311  1.699   2.045   2.462   2.756   3.396
  30    0.256  0.435  0.683  0.854  1.173  1.310  1.697   2.042   2.457   2.750   3.385
  35    0.255  0.434  0.682  0.852  1.170  1.306  1.690   2.030   2.438   2.724   3.340
  40    0.255  0.434  0.681  0.851  1.167  1.303  1.684   2.021   2.423   2.704   3.307
  45    0.255  0.434  0.680  0.850  1.165  1.301  1.679   2.014   2.412   2.690   3.281
  50    0.255  0.433  0.679  0.849  1.164  1.299  1.676   2.009   2.403   2.678   3.261
  55    0.255  0.433  0.679  0.848  1.163  1.297  1.673   2.004   2.396   2.668   3.245
  60    0.254  0.433  0.679  0.848  1.162  1.296  1.671   2.000   2.390   2.660   3.232
  ∞     0.253  0.431  0.674  0.842  1.150  1.282  1.645   1.960   2.326   2.576   3.090
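Not part of the original exam: any entry of the table above can be reproduced in R with qt(); for instance, the values used in Problem 2.

    qt(0.95,  df = 13)    # 1.771
    qt(0.975, df = 13)    # 2.160
    qt(0.975, df = Inf)   # 1.960, the standard normal quantile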
