Homework for 4/9 Due 4/16


1. [13-6] It is conventional wisdom in military squadrons that pilots tend to father more girls than boys. Snyder (1961) gathered data for military fighter pilots. The sex of the pilots' offspring was tabulated for three kinds of flight duty during the month of conception, as shown in the following table.

Father's Activity     Female Offspring   Male Offspring
Flying Fighters              51                38
Flying Transports            14                16
Not Flying                   38                46

a. Is there any significant difference between the three groups? Use $\alpha = 0.05$.

b. In the United States in 1950, 105.37 males were born for every 100 females. Are the data consistent with this sex ratio? Use $\alpha = 0.05$.

Hint: this is similar to the authorship example. We are comparing pilots with general males. Thus we need to combine all pilots in order to make a comparison.

a. First, we can find the totals:

Father's Activity     Female Offspring   Male Offspring   Total
Flying Fighters              51                38           89
Flying Transports            14                16           30
Not Flying                   38                46           84
Total                       103               100          203

Let the null hypothesis be that there is no difference between these groups. Then, if $H_0$ is true, the expected counts for these groups would be

Father's Activity     Female Offspring          Male Offspring
Flying Fighters       89(103)/203 = 45.2        89(100)/203 = 43.8
Flying Transports     30(103)/203 = 15.2        30(100)/203 = 14.8
Not Flying            84(103)/203 = 42.6        84(100)/203 = 41.4

Thus the Pearson's $\chi^2$ test statistic is

$$X^2 = \frac{(51-45.2)^2}{45.2} + \frac{(38-43.8)^2}{43.8} + \frac{(14-15.2)^2}{15.2} + \frac{(16-14.8)^2}{14.8} + \frac{(38-42.6)^2}{42.6} + \frac{(46-41.4)^2}{41.4} = 2.712.$$

Moreover, the distribution of $X^2$ is approximately $\chi^2_2$, with $df = (2-1)(3-1) = 2$. Since $\alpha = 0.05$, the rejection region is $R = \{X^2 > 5.99\}$, where $5.99 = \chi^2_{0.95,2}$. Since $2.712 < 5.99$, we do not reject $H_0$. In other words, there is no significant difference between the groups.

b. In this part, we compare pilots with general males. Thus we combine the data for pilots.

Father           Female Offspring   Male Offspring   Total
Pilot                  103               100          203
General male           100               105.37       205.37
Total                  203               205.37       408.37

Let the null hypothesis be that there is no difference between pilots and general males. Then, if $H_0$ is true, the expected counts for pilots and general males would be

Father           Female Offspring   Male Offspring
Pilot                100.911           102.089
General male         102.089           103.281

The Pearson's $\chi^2$ test statistic is

$$X^2 = 0.1709.$$

The distribution of the test statistic is approximately $\chi^2_1$, with $df = (2-1)(2-1) = 1$. Since $\alpha = 0.05$, the rejection region is $R = \{X^2 > 3.84\}$, where $3.84 = \chi^2_{0.95,1}$. Since $0.1709 < 3.84$, we do not reject $H_0$. In other words, there is no significant difference between pilots and general males; that is, the data are consistent with this sex ratio.
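As a rough check of part a, the expected counts and the test statistic can be recomputed from the observed table. This is a minimal sketch in plain Python; the function name `pearson_chi2` is a hypothetical helper, not something from the textbook. Because it keeps the expected counts unrounded, it gives a value of roughly 2.75 rather than 2.712, which leads to the same conclusion.

```python
def pearson_chi2(table):
    """Return (statistic, expected counts, df) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / n
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, expected, df


# Rows: Flying Fighters, Flying Transports, Not Flying; columns: female, male
pilots = [[51, 38], [14, 16], [38, 46]]
stat, expected, df = pearson_chi2(pilots)
print(round(stat, 3), df)
print(stat > 5.99)  # reject H0 only if this is True
```

The same helper applies to any table of counts, so it can also be used on the 2 x 2 table in part b.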

2. [13-16] A market research team conducted a survey to investigate the relationship of personality to attitude toward small cars. A sample of adults in a metropolitan area were asked to fill out a 16-item self-perception questionnaire, on the basis of which they were classified into three types: cautious conservative, middle-of-the-roader, and confident explorer. They were then asked to give their overall opinion of small cars: favorable, neutral, or unfavorable. Is there a relationship between personality type and attitude toward small cars? Use $\alpha = 0.05$.

                     Personality Type
Attitude        Cautious   Midroad   Explorer
Favorable          79         58        49
Neutral            10          8         9
Unfavorable        10         34        42

We first find the totals.

                     Personality Type
Attitude        Cautious   Midroad   Explorer   Total
Favorable          79         58        49       186
Neutral            10          8         9        27
Unfavorable        10         34        42        86
Total              99        100       100       299

Let the null hypothesis be that there is no relationship, that is, Personality and Attitude are independent. Then, if $H_0$ is true, the expected counts for each cell would be

                     Personality Type
Attitude        Cautious              Midroad                Explorer
Favorable       186(99)/299 = 61.6    186(100)/299 = 62.2    186(100)/299 = 62.2
Neutral          27(99)/299 =  8.9     27(100)/299 =  9.0     27(100)/299 =  9.0
Unfavorable      86(99)/299 = 28.5     86(100)/299 = 28.8     86(100)/299 = 28.8

The Pearson's $\chi^2$ test statistic is

$$X^2 = 27.24.$$

The distribution of the test statistic is approximately $\chi^2_4$, with $df = (3-1)(3-1) = 4$. Since $\alpha = 0.05$, the rejection region is $R = \{X^2 > 9.49\}$, where $9.49 = \chi^2_{0.95,4}$. Since $27.24 > 9.49$, we reject $H_0$. In other words, there is some relationship between personality type and attitude toward small cars.
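The comparison against a tabulated critical value can also be phrased as a p-value. For an even number of degrees of freedom the upper tail of the chi-square distribution has a closed form, $P(\chi^2_{2m} > x) = e^{-x/2}\sum_{k=0}^{m-1}(x/2)^k/k!$, so no statistics library is needed. The sketch below assumes that formula; the name `chi2_sf_even_df` is a hypothetical helper.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    # Finite sum of (x/2)^k / k! for k = 0, ..., df/2 - 1
    terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * terms


p_value = chi2_sf_even_df(27.24, 4)        # statistic from this problem, df = 4
tail_at_cutoff = chi2_sf_even_df(9.49, 4)  # should be close to alpha = 0.05
print(p_value, tail_at_cutoff)
```

A p-value far below 0.05 corresponds to the statistic falling well inside the rejection region.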

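3. [14-2] For the following data:

x    .34   1.38   -.65   .68   1.40   -.88   -.30   -1.18   .50   -1.75
y    .27   1.34   -.53   .35   1.28   -.98   -.72    -.81   .64   -1.59

a. Fit a line $y = a + bx$ by the method of least squares.

b. Fit a line $x = c + dy$ by the method of least squares.

a. We have $\sum x_i = -0.46$, $\sum y_i^2 = 8.983$, $\sum x_i^2 = 10.434$, $\sum x_i y_i = 9.452$, $\sum y_i = -0.75$ and $n = 10$. Thus, by the method of least squares,

$$a = \frac{\sum x_i^2 \sum y_i - \sum x_i \sum x_i y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2} = \frac{10.434(-0.75) - (-0.46)(9.452)}{10(10.434) - (-0.46)^2} = -0.0334,$$

$$b = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2} = \frac{10(9.452) - (-0.46)(-0.75)}{10(10.434) - (-0.46)^2} = 0.904.$$

Thus the line is $y = 0.904x - 0.0334$.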

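b. We interchange the roles of $x$ and $y$:

$$c = \frac{\sum y_i^2 \sum x_i - \sum y_i \sum x_i y_i}{n \sum y_i^2 - \left(\sum y_i\right)^2} = \frac{8.983(-0.46) - (-0.75)(9.452)}{10(8.983) - (-0.75)^2} = 0.0331,$$

$$d = \frac{n \sum x_i y_i - \sum y_i \sum x_i}{n \sum y_i^2 - \left(\sum y_i\right)^2} = \frac{10(9.452) - (-0.75)(-0.46)}{10(8.983) - (-0.75)^2} = 1.055.$$

Thus the line is $x = 1.055y + 0.0331$.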
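Both fits can be reproduced numerically with the same closed-form expressions. The sketch below is plain Python; `least_squares` is a hypothetical helper name, and the data are entered with the signs that reproduce the sums used in the solution, which is an assumption wherever the printed table is ambiguous.

```python
# Data for problem 3; signs are assumed so that the sums match the solution.
x = [0.34, 1.38, -0.65, 0.68, 1.40, -0.88, -0.30, -1.18, 0.50, -1.75]
y = [0.27, 1.34, -0.53, 0.35, 1.28, -0.98, -0.72, -0.81, 0.64, -1.59]


def least_squares(u, v):
    """Fit v = intercept + slope * u and return (intercept, slope)."""
    n = len(u)
    su, sv = sum(u), sum(v)
    suu = sum(ui * ui for ui in u)
    suv = sum(ui * vi for ui, vi in zip(u, v))
    denom = n * suu - su ** 2
    intercept = (suu * sv - su * suv) / denom
    slope = (n * suv - su * sv) / denom
    return intercept, slope


a, b = least_squares(x, y)  # regression of y on x
c, d = least_squares(y, x)  # regression of x on y
print(round(a, 4), round(b, 3), round(c, 4), round(d, 3))
```

Note that the two fitted lines are not inverses of each other: $1/b$ is not equal to $d$ unless the points lie exactly on a line.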

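4. [14-10] Show that the least squares estimates of the slope and intercept of a line may be expressed as

$$\hat\beta_0 = \bar y - \hat\beta_1 \bar x$$

and

$$\hat\beta_1 = \frac{\sum_{i=1}^n (x_i - \bar x)(y_i - \bar y)}{\sum_{i=1}^n (x_i - \bar x)^2}.$$

Hint: begin with $\hat\beta_1$ and expand $\sum_{i=1}^n (x_i - \bar x)(y_i - \bar y)$ and $\sum_{i=1}^n (x_i - \bar x)^2$.

We will use the following identities several times: $\sum_{i=1}^n x_i = n\bar x$ and $\sum_{i=1}^n y_i = n\bar y$. First we have

$$\begin{aligned}
\sum_{i=1}^n (x_i - \bar x)(y_i - \bar y)
&= \sum_{i=1}^n \left(x_i y_i - x_i \bar y - \bar x y_i + \bar x \bar y\right) \\
&= \sum_{i=1}^n x_i y_i - \bar y \sum_{i=1}^n x_i - \bar x \sum_{i=1}^n y_i + n \bar x \bar y \\
&= \sum_{i=1}^n x_i y_i - n \bar x \bar y - n \bar x \bar y + n \bar x \bar y \\
&= \sum_{i=1}^n x_i y_i - n \bar x \bar y \\
&= \sum_{i=1}^n x_i y_i - \frac{1}{n} \sum_{i=1}^n x_i \sum_{i=1}^n y_i \\
&= \frac{1}{n} \left[ n \sum_{i=1}^n x_i y_i - \sum_{i=1}^n x_i \sum_{i=1}^n y_i \right].
\end{aligned}$$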

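Similarly,

$$\begin{aligned}
\sum_{i=1}^n (x_i - \bar x)^2
&= \sum_{i=1}^n \left(x_i^2 - 2 x_i \bar x + \bar x^2\right) \\
&= \sum_{i=1}^n x_i^2 - 2 \bar x \sum_{i=1}^n x_i + n \bar x^2 \\
&= \sum_{i=1}^n x_i^2 - 2 n \bar x^2 + n \bar x^2 \\
&= \sum_{i=1}^n x_i^2 - n \bar x^2 \\
&= \frac{1}{n} \left[ n \sum_{i=1}^n x_i^2 - \left( \sum_{i=1}^n x_i \right)^2 \right].
\end{aligned}$$

In fact, we can save the calculation by using the first result and replacing $y$ with $x$. Therefore,

$$\frac{\sum_{i=1}^n (x_i - \bar x)(y_i - \bar y)}{\sum_{i=1}^n (x_i - \bar x)^2}
= \frac{\frac{1}{n} \left[ n \sum_{i=1}^n x_i y_i - \sum_{i=1}^n x_i \sum_{i=1}^n y_i \right]}{\frac{1}{n} \left[ n \sum_{i=1}^n x_i^2 - \left( \sum_{i=1}^n x_i \right)^2 \right]}
= \frac{n \sum_{i=1}^n x_i y_i - \sum_{i=1}^n x_i \sum_{i=1}^n y_i}{n \sum_{i=1}^n x_i^2 - \left( \sum_{i=1}^n x_i \right)^2}
= \hat\beta_1.$$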

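Finally, writing $D = n \sum_{i=1}^n x_i^2 - \left( \sum_{i=1}^n x_i \right)^2$ for the common denominator, we have

$$\begin{aligned}
\hat\beta_0
&= \frac{\sum_{i=1}^n x_i^2 \sum_{i=1}^n y_i - \sum_{i=1}^n x_i \sum_{i=1}^n x_i y_i}{D} \\
&= \frac{\sum_{i=1}^n x_i^2 \sum_{i=1}^n y_i - \frac{1}{n} \left( \sum_{i=1}^n x_i \right)^2 \sum_{i=1}^n y_i + \frac{1}{n} \left( \sum_{i=1}^n x_i \right)^2 \sum_{i=1}^n y_i - \sum_{i=1}^n x_i \sum_{i=1}^n x_i y_i}{D} \\
&= \frac{1}{n} \sum_{i=1}^n y_i \cdot \frac{n \sum_{i=1}^n x_i^2 - \left( \sum_{i=1}^n x_i \right)^2}{D}
 - \frac{1}{n} \sum_{i=1}^n x_i \cdot \frac{n \sum_{i=1}^n x_i y_i - \sum_{i=1}^n x_i \sum_{i=1}^n y_i}{D} \\
&= \bar y - \hat\beta_1 \bar x.
\end{aligned}$$

One can also go backwards.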
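The identities in this problem can be spot-checked numerically: the centered formulas and the raw-sum formulas should agree up to floating-point rounding. The sketch below uses a small made-up data set (the values in `xs` and `ys` are hypothetical and chosen only for illustration).

```python
# Hypothetical data, used only to compare the two sets of formulas.
xs = [1.0, 2.0, 4.0, 7.0]
ys = [2.0, 3.0, 3.5, 8.0]
n = len(xs)

# Centered form: slope from deviations about the means
x_bar = sum(xs) / n
y_bar = sum(ys) / n
beta1_centered = (
    sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    / sum((x - x_bar) ** 2 for x in xs)
)
beta0_centered = y_bar - beta1_centered * x_bar

# Raw-sum form: the usual least squares expressions
sum_x, sum_y = sum(xs), sum(ys)
sum_xx = sum(x * x for x in xs)
sum_xy = sum(x * y for x, y in zip(xs, ys))
denom = n * sum_xx - sum_x ** 2
beta1_raw = (n * sum_xy - sum_x * sum_y) / denom
beta0_raw = (sum_xx * sum_y - sum_x * sum_xy) / denom

print(beta1_centered, beta1_raw)
print(beta0_centered, beta0_raw)
```

In practice the centered form tends to be the numerically safer of the two, since the raw-sum form subtracts two large, nearly equal quantities when the data are far from zero.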

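Homework for 4/11 Due 4/16

1. [13-17] Let $X$ and $Y$ be random variables with

$$E[X] = \mu_x, \quad \mathrm{Var}[X] = \sigma_x^2, \quad E[Y] = \mu_y, \quad \mathrm{Var}[Y] = \sigma_y^2, \quad \mathrm{Cov}[X, Y] = \sigma_{xy}.$$

Consider predicting $Y$ from $X$ as $\hat Y = \alpha + \beta X$, where $\alpha$ and $\beta$ are chosen to minimize $E[(Y - \hat Y)^2]$, the expected squared prediction error.

a. Show that the minimizing values of $\alpha$ and $\beta$ are

$$\beta = \frac{\sigma_{xy}}{\sigma_x^2}, \qquad \alpha = \mu_y - \beta \mu_x.$$

Hint: $E[(Y - \hat Y)^2] = (E[Y] - E[\hat Y])^2 + \mathrm{Var}[Y - \hat Y]$. Section 4.3 may be helpful, especially Theorem A, Corollary A, and Corollary B.

b. Show that for this choice of $\alpha$ and $\beta$

$$\frac{\mathrm{Var}[Y] - \mathrm{Var}[Y - \hat Y]}{\mathrm{Var}[Y]} = r_{xy}^2,$$

where $r_{xy}$ is the correlation between $X$ and $Y$:

$$r_{xy} = \frac{\mathrm{Cov}[X, Y]}{\sqrt{\mathrm{Var}[X]\,\mathrm{Var}[Y]}}.$$

a. First we have

$$E[\hat Y] = E[\alpha + \beta X] = \alpha + \beta E[X] = \alpha + \beta \mu_x,$$

$$\mathrm{Var}[\hat Y] = \mathrm{Var}[\alpha + \beta X] = \beta^2 \mathrm{Var}[X] = \beta^2 \sigma_x^2,$$

and

$$\mathrm{Cov}[Y, \hat Y] = \mathrm{Cov}[Y, \alpha + \beta X] = \beta\, \mathrm{Cov}[Y, X] = \beta \sigma_{xy}.$$

Thus

$$\mathrm{Var}[Y - \hat Y] = \mathrm{Var}[Y] + \mathrm{Var}[\hat Y] - 2\,\mathrm{Cov}[Y, \hat Y] = \sigma_y^2 + \beta^2 \sigma_x^2 - 2 \beta \sigma_{xy}.$$

In order to minimize $E[(Y - \hat Y)^2]$, we notice that

$$E[(Y - \hat Y)^2] = (E[Y] - E[\hat Y])^2 + \mathrm{Var}[Y - \hat Y] = (\mu_y - \alpha - \beta \mu_x)^2 + \sigma_y^2 + \beta^2 \sigma_x^2 - 2 \beta \sigma_{xy} \equiv f(\alpha, \beta).$$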

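Furthermore,

$$\frac{\partial f}{\partial \alpha} = -2(\mu_y - \alpha - \beta \mu_x), \qquad
\frac{\partial f}{\partial \beta} = -2\mu_x(\mu_y - \alpha - \beta \mu_x) + 2\sigma_x^2 \beta - 2\sigma_{xy},$$

$$\frac{\partial^2 f}{\partial \alpha^2} = 2, \qquad
\frac{\partial^2 f}{\partial \alpha\, \partial \beta} = \frac{\partial^2 f}{\partial \beta\, \partial \alpha} = 2\mu_x, \qquad
\frac{\partial^2 f}{\partial \beta^2} = 2\mu_x^2 + 2\sigma_x^2.$$

The solution to

$$\begin{cases}
-2(\mu_y - \alpha - \beta \mu_x) = 0 \\
-2\mu_x(\mu_y - \alpha - \beta \mu_x) + 2\sigma_x^2 \beta - 2\sigma_{xy} = 0
\end{cases}$$

is

$$\alpha = \mu_y - \beta \mu_x, \qquad \beta = \frac{\sigma_{xy}}{\sigma_x^2}.$$

Since $\partial^2 f / \partial \alpha^2 = 2 > 0$ and

$$\begin{vmatrix}
\dfrac{\partial^2 f}{\partial \alpha^2} & \dfrac{\partial^2 f}{\partial \alpha\, \partial \beta} \\
\dfrac{\partial^2 f}{\partial \beta\, \partial \alpha} & \dfrac{\partial^2 f}{\partial \beta^2}
\end{vmatrix}
= \begin{vmatrix} 2 & 2\mu_x \\ 2\mu_x & 2\mu_x^2 + 2\sigma_x^2 \end{vmatrix}
= 4\sigma_x^2 > 0,$$

we see that $f(\alpha, \beta)$ achieves its minimum at

$$\alpha = \mu_y - \beta \mu_x, \qquad \beta = \frac{\sigma_{xy}}{\sigma_x^2}.$$

b. From a, we immediately have

$$\begin{aligned}
\frac{\mathrm{Var}[Y] - \mathrm{Var}[Y - \hat Y]}{\mathrm{Var}[Y]}
&= \frac{\sigma_y^2 - \left(\sigma_y^2 + \beta^2 \sigma_x^2 - 2\beta \sigma_{xy}\right)}{\sigma_y^2} \\
&= -\beta^2 \frac{\sigma_x^2}{\sigma_y^2} + 2\beta \frac{\sigma_{xy}}{\sigma_y^2} \\
&= -\left(\frac{\sigma_{xy}}{\sigma_x^2}\right)^2 \frac{\sigma_x^2}{\sigma_y^2} + 2\, \frac{\sigma_{xy}}{\sigma_x^2} \cdot \frac{\sigma_{xy}}{\sigma_y^2} \\
&= \frac{\sigma_{xy}^2}{\sigma_x^2 \sigma_y^2} = r_{xy}^2.
\end{aligned}$$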
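Both parts of this problem can be checked numerically for one concrete set of moments. The sketch below uses made-up population moments (all numbers are hypothetical) and evaluates the expected squared error $f(\alpha, \beta)$ directly from its closed form, then compares the closed-form minimizer with nearby points and with $r_{xy}^2$.

```python
# Hypothetical population moments, chosen only for illustration.
mu_x, mu_y = 1.0, 2.0
var_x, var_y, cov_xy = 4.0, 9.0, 3.0


def mse(alpha, beta):
    """E[(Y - alpha - beta*X)^2] written as squared bias plus Var[Y - Yhat]."""
    bias = mu_y - alpha - beta * mu_x
    return bias ** 2 + var_y + beta ** 2 * var_x - 2 * beta * cov_xy


# Closed-form minimizer from part a
beta_star = cov_xy / var_x
alpha_star = mu_y - beta_star * mu_x
best = mse(alpha_star, beta_star)

# Nearby (alpha, beta) pairs should do no better than the closed-form choice.
steps = [-0.1, 0.0, 0.1]
neighbours = [
    mse(alpha_star + da, beta_star + db) for da in steps for db in steps
]

# Part b: proportion of variance explained versus squared correlation
r_squared = cov_xy ** 2 / (var_x * var_y)
explained = (var_y - best) / var_y
print(best, min(neighbours), explained, r_squared)
```

At the minimizer the bias term vanishes, so `best` equals $\mathrm{Var}[Y - \hat Y]$ and the ratio in part b can be read off directly.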

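2. [13-18] Suppose that

$$Y_i = \beta_0 + \beta_1 x_i + e_i, \qquad i = 1, \ldots, n,$$

where the $e_i$ are independent and normally distributed with mean zero and variance $\sigma^2$. Find the mle's of $\beta_0$ and $\beta_1$ and verify that they are the least squares estimates.

Hint: Under these assumptions, the $Y_i$ are independent and normally distributed with means $\beta_0 + \beta_1 x_i$ and variance $\sigma^2$. Write the joint density function of the $Y_i$ and thus the likelihood.

Since the $e_i$ are i.i.d. $N(0, \sigma^2)$ random variables, we have $Y_i \sim N(\beta_0 + \beta_1 x_i, \sigma^2)$ and the $Y_i$ are independent. Let $f_i(y_i)$ be the pdf of $Y_i$. We have

$$f_i(y_i \mid \beta_0, \beta_1) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{ -\frac{1}{2\sigma^2} \left[ y_i - (\beta_0 + \beta_1 x_i) \right]^2 \right\}.$$

Since the $Y_i$ are independent, the joint pdf of $Y_1, \ldots, Y_n$ is

$$f(y_1, \ldots, y_n \mid \beta_0, \beta_1) = \prod_{i=1}^n f_i(y_i \mid \beta_0, \beta_1).$$

Correspondingly, the likelihood function and log-likelihood function are

$$\mathrm{lik}(\beta_0, \beta_1) = \prod_{i=1}^n f_i(\beta_0, \beta_1 \mid Y_i)$$

and

$$\begin{aligned}
l(\beta_0, \beta_1) = \log \mathrm{lik}(\beta_0, \beta_1)
&= \sum_{i=1}^n \log f_i(\beta_0, \beta_1 \mid Y_i) \\
&= -\frac{n}{2} \log(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^n \left[ Y_i - (\beta_0 + \beta_1 x_i) \right]^2 \\
&= -\frac{n}{2} \log(2\pi\sigma^2) - \frac{1}{2\sigma^2} S(\beta_0, \beta_1),
\end{aligned}$$

where

$$S(\beta_0, \beta_1) = \sum_{i=1}^n \left[ Y_i - (\beta_0 + \beta_1 x_i) \right]^2.$$

It follows that the minimizer of $S(\beta_0, \beta_1)$ is the maximizer of $l(\beta_0, \beta_1)$. Therefore, the mle's of $\beta_0$ and $\beta_1$ are the least squares estimates:

$$\hat\beta_0 = \frac{\sum_{i=1}^n x_i^2 \sum_{i=1}^n Y_i - \sum_{i=1}^n x_i \sum_{i=1}^n x_i Y_i}{n \sum_{i=1}^n x_i^2 - \left( \sum_{i=1}^n x_i \right)^2}, \qquad
\hat\beta_1 = \frac{n \sum_{i=1}^n x_i Y_i - \sum_{i=1}^n x_i \sum_{i=1}^n Y_i}{n \sum_{i=1}^n x_i^2 - \left( \sum_{i=1}^n x_i \right)^2}.$$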
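The claim that the least squares estimates maximize the normal log-likelihood can be illustrated numerically. The sketch below uses a small made-up data set and treats $\sigma^2$ as known (the data and the value `sigma2 = 1.0` are hypothetical); it evaluates the log-likelihood at the least squares estimates and at a few perturbed points.

```python
import math

# Hypothetical data and a known error variance, for illustration only.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]
sigma2 = 1.0
n = len(xs)


def log_likelihood(b0, b1):
    """Normal log-likelihood of (b0, b1) with the variance held at sigma2."""
    s = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    return -0.5 * n * math.log(2 * math.pi * sigma2) - s / (2 * sigma2)


# Least squares estimates from the closed-form expressions
sx, sy = sum(xs), sum(ys)
sxx = sum(x * x for x in xs)
sxy = sum(x * y for x, y in zip(xs, ys))
denom = n * sxx - sx ** 2
b0_hat = (sxx * sy - sx * sxy) / denom
b1_hat = (n * sxy - sx * sy) / denom

at_ls = log_likelihood(b0_hat, b1_hat)
perturbed = [
    log_likelihood(b0_hat + d0, b1_hat + d1)
    for d0 in (-0.05, 0.05)
    for d1 in (-0.05, 0.05)
]
print(at_ls, max(perturbed))
```

Because the log-likelihood depends on $(\beta_0, \beta_1)$ only through $S(\beta_0, \beta_1)$, the choice of `sigma2` changes the values printed but not the location of the maximum.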