Stat 5102 Final Exam May 14, 2015

Similar documents
Statistics & Data Sciences: First Year Prelim Exam May 2018

MATH 644: Regression Analysis Methods

Stat 5101 Lecture Notes

MS&E 226: Small Data

ST430 Exam 2 Solutions

UNIVERSITY OF MASSACHUSETTS. Department of Mathematics and Statistics. Basic Exam - Applied Statistics. Tuesday, January 17, 2017

Swarthmore Honors Exam 2012: Statistics

Tests of Linear Restrictions

MODELS WITHOUT AN INTERCEPT

SCHOOL OF MATHEMATICS AND STATISTICS. Linear and Generalised Linear Models

CAS MA575 Linear Models

First Year Examination Department of Statistics, University of Florida

Inference for Regression


Subject CS1 Actuarial Statistics 1 Core Principles

SCHOOL OF MATHEMATICS AND STATISTICS

(a) (3 points) Construct a 95% confidence interval for β 2 in Equation 1.

ST430 Exam 1 with Answers

Biostatistics 380 Multiple Regression 1. Multiple Regression

STAT 461/561- Assignments, Year 2015

(ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box.

Stat 401B Final Exam Fall 2015

Master s Written Examination - Solution

AMS-207: Bayesian Statistics

Mathematical statistics

First Year Examination Department of Statistics, University of Florida

STAT 350: Summer Semester Midterm 1: Solutions

Parameter Estimation

Problem #1 #2 #3 #4 #5 #6 Total Points /6 /8 /14 /10 /8 /10 /56

Recall that a measure of fit is the sum of squared residuals: where. The F-test statistic may be written as:

Mathematical statistics

Central Limit Theorem ( 5.3)

Analysis of variance. Gilles Guillot. September 30, Gilles Guillot September 30, / 29

Statistics - Lecture Three. Linear Models. Charlotte Wickham 1.

Exam Applied Statistical Regression. Good Luck!

STAT 730 Chapter 4: Estimation

36-707: Regression Analysis Homework Solutions. Homework 3

Chapter 8 Conclusion

Stat 412/512 TWO WAY ANOVA. Charlotte Wickham. stat512.cwick.co.nz. Feb

Regression and the 2-Sample t

1 Use of indicator random variables. (Chapter 8)

Stat 401B Exam 2 Fall 2016

Practice Final Examination

Master s Written Examination

Stat 401B Exam 2 Fall 2015

Linear Model Specification in R

STAT 512 MidTerm I (2/21/2013) Spring 2013 INSTRUCTIONS

Lecture 15. Hypothesis testing in the linear model

This exam contains 6 questions. The questions are of equal weight. Print your name at the top of this page in the upper right hand corner.

22s:152 Applied Linear Regression. Take random samples from each of m populations.

SCHOOL OF MATHEMATICS AND STATISTICS

UNIVERSITY OF TORONTO Faculty of Arts and Science

Regression. Bret Hanlon and Bret Larget. December 8 15, Department of Statistics University of Wisconsin Madison.

Swarthmore Honors Exam 2015: Statistics

Problem Selected Scores

STAT 510 Final Exam Spring 2015

STAT 215 Confidence and Prediction Intervals in Regression

Multiple Predictor Variables: ANOVA

22s:152 Applied Linear Regression. There are a couple commonly used models for a one-way ANOVA with m groups. Chapter 8: ANOVA

McGill University. Faculty of Science. Department of Mathematics and Statistics. Part A Examination. Statistics: Theory Paper

Biostatistics for physicists fall Correlation Linear regression Analysis of variance

Final Examination. STA 215: Statistical Inference. Saturday, 2001 May 5, 9:00am 12:00 noon

Linear Regression Model. Badr Missaoui

Linear Regression Models P8111

Stat 401B Final Exam Fall 2016

Midterm Examination. STA 215: Statistical Inference. Due Wednesday, 2006 Mar 8, 1:15 pm

TUTORIAL 8 SOLUTIONS #

STAT 572 Assignment 5 - Answers Due: March 2, 2007

IES 612/STA 4-573/STA Winter 2008 Week 1--IES 612-STA STA doc

Stats Review Chapter 14. Mary Stangler Center for Academic Success Revised 8/16

Introduction to Statistical Data Analysis Lecture 8: Correlation and Simple Regression

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

STAT420 Midterm Exam. University of Illinois Urbana-Champaign October 19 (Friday), :00 4:15p. SOLUTIONS (Yellow)

Correlation and Regression

HANDBOOK OF APPLICABLE MATHEMATICS

STA 101 Final Review

Psychology 405: Psychometric Theory

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review

Review. December 4 th, Review

MATH c UNIVERSITY OF LEEDS Examination for the Module MATH2715 (January 2015) STATISTICAL METHODS. Time allowed: 2 hours

The University of Hong Kong Department of Statistics and Actuarial Science STAT2802 Statistical Models Tutorial Solutions Solutions to Problems 71-80

Econometrics. 4) Statistical inference

(a) The percentage of variation in the response is given by the Multiple R-squared, which is 52.67%.

STK4900/ Lecture 3. Program

Problem 1 (20) Log-normal. f(x) Cauchy

NATIONAL UNIVERSITY OF SINGAPORE EXAMINATION. ST3241 Categorical Data Analysis. (Semester II: ) April/May, 2011 Time Allowed : 2 Hours

Multiple Linear Regression. Chapter 12

NATIONAL UNIVERSITY OF SINGAPORE EXAMINATION (SOLUTIONS) ST3241 Categorical Data Analysis. (Semester II: )

A Very Brief Summary of Statistical Inference, and Examples

Chapter 4 HOMEWORK ASSIGNMENTS. 4.1 Homework #1

" M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2

Lecture 18: Simple Linear Regression

Chapter 12: Linear regression II

Booklet of Code and Output for STAC32 Final Exam

UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Basic Exam - Applied Statistics January, 2018

1 Multiple Regression

General Linear Statistical Models - Part III

Two hours. To be supplied by the Examinations Office: Mathematical Formula Tables THE UNIVERSITY OF MANCHESTER. 21 June :45 11:45

This does not cover everything on the final. Look at the posted practice problems for other topics.

Dealing with Heteroskedasticity

Transcription:

Stat 5102 Final Exam May 14, 2015 Name Student ID The exam is closed book and closed notes. You may use three 8 1 11 2 sheets of paper with formulas, etc. You may also use the handouts on brand name distributions and Greek letters. You may use a calculator. Put all of your work on this test form (use the back if necessary). Show your work or give an explanation of your answer. No credit for numbers or formulas with no indication of where they came from. Leave no undone integrals or derivatives in your answers, but other than that requirement there is no unique correct simplification. Any correct answer gets full credit, except as explicitly stated in questions. Abbreviations used: distribution function (DF), independent and identically distributed (IID), maximum likelihood estimate (MLE), probability density function (PDF), and probability mass function (PMF). The points for the questions total to 200. There are 12 pages and 8 problems. Questions begin on the following page. 1

1. [25 pts.] Suppose (X i, Y i ), i = 1,..., n, are IID from the distribution with PDF f θ (x, y) = e xθ y/θ, 0 < x <, 0 < y <. where θ > 0 is an unknown parameter. (Hint: What are the marginal distributions of X and Y?) (a) Find the Jeffreys prior for this distribution. (b) Say whether it is proper or improper. 2

2. [25 pts.] The following Rweb output fits a linear model. The covariate x is numeric and the covariate color is categorical (what R calls a factor). Rweb:> Rweb:> out1 <- lm(y ~ x + color) summary(out1) Call: lm(formula = y ~ x + color) Residuals: Min 1Q Median 3Q Max -9.6101-3.7974 0.5263 2.8854 10.4138 Coefficients: Estimate Std. Error t value Pr(> t ) (Intercept) 4.35847 1.94831 2.237 0.030398 * x 0.20438 0.04896 4.174 0.000139 *** color2 3.35899 2.22408 1.510 0.138120 color3 3.99087 2.22570 1.793 0.079835. color4 2.97893 2.22839 1.337 0.188157 color5 9.66366 2.23215 4.329 8.51e-05 *** --- Signif. codes: 0 *** 0.001 ** 0.01 * 0.05. 0.1 1 Residual standard error: 4.972 on 44 degrees of freedom Multiple R-squared: 0.4809, Adjusted R-squared: 0.4219 F-statistic: 8.153 on 5 and 44 DF, p-value: 1.639e-05 (a) Find a 95% confidence interval for the true unknown regression coefficient for the predictor x. (Hint: The 0.95 quantile of the standard normal distribution is 1.645, the 0.975 quantile of the standard normal distribution is 1.960, the 0.95 quantile of the t distribution on 44 degrees of freedom is 1.680, and the 0.975 quantile of the t distribution 3

on 44 degrees of freedom is 2.015.) (b) Perform a hypothesis test of whether the regression coefficient for the predictor x is zero (the null hypothesis) versus nonzero (the alternative hypothesis). State the value of the test statistic, the distribution of the test statistic under the null hypothesis, whether this distribution is exact or approximate (asymptotic, large n), and the P -value. Interpret the P -value. What does it say about what predictor variable or variables should or should not be in the model? 4

The Rweb output continues. Rweb:> out0 <- lm(y ~ x) Rweb:> anova(out0, out1) Analysis of Variance Table Model 1: y ~ x Model 2: y ~ x + color Res.Df RSS Df Sum of Sq F Pr(>F) 1 48 1579.6 2 44 1087.7 4 491.86 4.9742 0.00214 ** --- Signif. codes: 0 *** 0.001 ** 0.01 * 0.05. 0.1 1 (c) What is this test about? State the null and alternative hypotheses, the value of the test statistic, the distribution of the test statistic under the null hypothesis, whether this distribution is exact or approximate (asymptotic, large n), and the P -value. Interpret the P -value. What does it say about what predictor variable or variables 5

3. [25 pts.] Suppose X 1,..., X n are IID from the NegBin(r, p) distribution, where r is considered known so p is the only unknown parameter. Show that there is a one-dimensional sufficient statistic, and identify this statistic. 6

4. [25 pts.] Suppose (X i, Y i ), i = 1,..., n are IID (two-dimensional) random vectors, the marginal distribution of X i is Poi(µ), and the conditional distribution of Y i given X i is Bin(X i, p), where µ and p are unknown parameters satisfying 0 < µ < and 0 < p < 1. Show that this statistical model is a two-dimensional exponential family (two natural statistics and two natural parameters). Identify the natural statistics and natural parameters. Clarification: Since X i can be zero, we need to know what the binomial distribution with sample size zero means. Setting sample size to zero in the usual definition of the binomial distribution from this brand name distributions handout, the sample space is all of the integers from zero to zero. Thus the distribution is concentrated at zero and Pr(Y i = 0 X i = 0) = 1. The usual formula for the PMF ( ) xi f(y i x i ) = p y i (1 p) x i y i still works when x i = y i = 0, giving ( ) 0 p 0 (1 p) 0 = 0! 0 0! 0! p0 (1 p) 0 = 1 y i so we do not have to fuss about this special case (the general formula for the PMF works in this case too). 7

5. [25 pts.] Suppose X 1,..., X n are IID from the distribution with PDF { 3θ/[(1 + θ)(1 + x ) 4 ], < x < 0 f θ (x) = 3θ/[(1 + θ)(1 + θ x ) 4 ], 0 < x < where θ is an unknown parameter satisfying 0 < θ <. This distribution has mean and variance (you do not have to prove these). E θ (X) = 1 θ 2θ 3 2θ + 3θ2 var θ (X) = 4θ 2 (a) Find a method of moments estimator of θ. (b) Find the asymptotic normal distribution of your estimator ˆθ n. 8

6. [25 pts.] The function F θ (x) = { x 2 /(1 + θx + x 2 ), x > 0 0, x 0 is a DF, where the parameter θ satisfies 0 < θ <. Hint: The roots of the quadratic polynomial ax 2 + bx + c are b ± b 2 4ac 2a (a) Find the median of this distribution. (b) Find the PDF corresponding to this DF. 9

(c) Find the asymptotic variance of the sample median for an IID sample X 1,..., X n from this distribution. 10

7. [25 pts.] Suppose X 1,..., X n are IID from the distribution with PDF f θ (x) = θ 2 xe θx, 0 < x <, where θ is an unknown parameter satisfying 0 < θ <. (a) Find the maximum likelihood estimator of θ. (b) Show that your solution to part (a) is the unique global maximizer of the likelihood if it is. If you cannot show that it is the unique global maximizer, show that it is a local maximizer. (c) Calculate expected Fisher information for sample size n. 11

(d) Give an approximate, large sample 95% confidence interval for θ. (Hint: The 0.95 quantile of the standard normal distribution is 1.645, and the 0.975 quantile of the standard normal distribution is 1.96.) 8. [25 pts.] The Rayleigh distribution (not a distribution we have met before) has PDF f σ (x) = x σ 2 e x2 /(2σ 2), 0 < x <, where σ > 0 is an unknown parameter. Suppose we have IID data X 1,..., X n having this distribution and want to do a Bayesian analysis with a Gam(α, λ) prior distribution on the parameter θ = 1/σ 2. Find the posterior distribution for θ. (Hint: the posterior is a brand name distribution.) 12