Stat 139 Homework 7 Solutions, Fall 2015
Problem 1. In class we learned that the classical simple linear regression model assumes the following distribution of responses:

\[ Y_i = \beta_0 + \beta_1 X_i + \epsilon_i, \quad i = 1, \dots, n, \tag{1} \]

where $\epsilon_i \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)$. Two estimators we'd like to calculate are:

(i) $\hat\mu_{Y|X} = \hat\beta_0 + \hat\beta_1 X$, the predicted mean value of the response, $\mu_Y$, for any individual with a specific value of the predictor, $X$. This can be used to build a confidence interval for where the true $\mu_Y$ will be located given $X$.

(ii) $\hat Y_j | X_j = \hat\beta_0 + \hat\beta_1 X_j + \epsilon_j$, the predicted value for a new individual response, $Y_j$, given that individual's value of the predictor, $X_j$. This can be used to build a prediction interval for where a new $Y_j$ will be located given its $X_j$.

In this problem we will determine the sampling distributions of these two estimators. [Note: the second is not technically an estimator since $\epsilon_j$ is not observable, but we can still determine some characteristics of this entity since we can assume the sampling distribution of $\epsilon_j$ provided above.]

(a) Calculate the expected value of $\hat\mu_{Y|X}$ and $\hat Y_j | X_j$. The sampling distribution results of $\hat\beta_0$ and $\hat\beta_1$ provided in class can be used directly for this problem.

\[ E(\hat\mu_{Y|X}) = E(\hat\beta_0 + \hat\beta_1 X) = E(\hat\beta_0) + X E(\hat\beta_1) = \beta_0 + X\beta_1 \]
\[ E(\hat Y_j | X_j) = E(\hat\beta_0 + \hat\beta_1 X_j + \epsilon_j) = E(\hat\beta_0) + X_j E(\hat\beta_1) + E(\epsilon_j) = \beta_0 + X_j\beta_1 \]

It turns out that $\mathrm{Cov}(\bar Y, \hat\beta_1) = 0$. In other words, the estimator of the slope of a regression line is not correlated with the average response. Intuitively, this is true because the regression line has to pass through the point $(\bar X, \bar Y)$ regardless of the slope value.

(b) Show that
\[ \mathrm{Cov}(\hat\beta_0, \hat\beta_1) = \frac{-\sigma^2 \bar X}{\sum_{i=1}^n (X_i - \bar X)^2} \]
using the fact above that $\mathrm{Cov}(\bar Y, \hat\beta_1) = 0$ and the properties of covariance: $\mathrm{Cov}(aX, Y) = a\,\mathrm{Cov}(X, Y)$ and $\mathrm{Cov}(X + W, Y) = \mathrm{Cov}(X, Y) + \mathrm{Cov}(W, Y)$.

\[ \mathrm{Cov}(\hat\beta_0, \hat\beta_1) = \mathrm{Cov}(\bar Y - \hat\beta_1 \bar X, \hat\beta_1) = \mathrm{Cov}(\bar Y, \hat\beta_1) - \mathrm{Cov}(\hat\beta_1 \bar X, \hat\beta_1) = 0 - \bar X\,\mathrm{Var}(\hat\beta_1) = \frac{-\sigma^2 \bar X}{\sum_{i=1}^n (X_i - \bar X)^2} \]

(c) Determine $\mathrm{Var}(\hat\mu_{Y|X})$. Hint: $\sum_{i=1}^n (X_i - \bar X)^2 = \sum_{i=1}^n X_i^2 - n\bar X^2$ may be useful (though you may not need to use this property).
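Note: we decided to use $X_j$ as the point where we are doing the calculation so as not to confuse it with the observed $X$'s in the data set:

\[
\begin{aligned}
\mathrm{Var}(\hat\mu_{Y|X_j}) &= \mathrm{Var}(\hat\beta_0 + \hat\beta_1 X_j) = \mathrm{Var}(\hat\beta_0) + \mathrm{Var}(\hat\beta_1 X_j) + 2X_j\,\mathrm{Cov}(\hat\beta_0, \hat\beta_1) \\
&= \left( \frac{\sigma^2}{n} + \frac{\bar X^2 \sigma^2}{\sum_{i=1}^n (X_i - \bar X)^2} \right) + \frac{X_j^2 \sigma^2}{\sum_{i=1}^n (X_i - \bar X)^2} - \frac{2X_j \sigma^2 \bar X}{\sum_{i=1}^n (X_i - \bar X)^2} \\
&= \sigma^2 \left( \frac{1}{n} + \frac{\bar X^2 + X_j^2 - 2X_j \bar X}{\sum_{i=1}^n (X_i - \bar X)^2} \right) \\
&= \sigma^2 \left( \frac{1}{n} + \frac{(X_j - \bar X)^2}{\sum_{i=1}^n (X_i - \bar X)^2} \right)
\end{aligned}
\]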
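The closed forms in parts (b) and (c) can be checked numerically. The sketch below compares them with the general matrix result $\mathrm{Var}(\hat\beta) = \sigma^2 (X^T X)^{-1}$, which is a standard fact not derived in this solution set. The design points `x`, the error variance `sigma2`, and the evaluation point `xj` are hypothetical values chosen only for illustration.

```r
# Hypothetical design points and error variance (not from the homework data)
x <- c(18, 25, 33, 41, 50, 58, 66, 74, 82)
sigma2 <- 4
n <- length(x)
Sxx <- sum((x - mean(x))^2)

# Covariance matrix of (beta0_hat, beta1_hat) from sigma^2 * (X'X)^{-1}
X <- cbind(1, x)
V <- sigma2 * solve(crossprod(X))

# Closed form from part (b)
cov_b0_b1 <- -sigma2 * mean(x) / Sxx

# Closed form from part (c), evaluated at a hypothetical point xj
xj <- 75
var_mu_hat <- sigma2 * (1 / n + (xj - mean(x))^2 / Sxx)

# Var(beta0_hat + beta1_hat * xj) = a' V a with a = (1, xj)
a <- c(1, xj)
var_mu_matrix <- as.numeric(t(a) %*% V %*% a)

# Both comparisons should return TRUE
all.equal(V[1, 2], cov_b0_b1)
all.equal(var_mu_matrix, var_mu_hat)
```

Changing `xj` shows how the variance grows as the point moves away from `mean(x)`, which is the reason intervals widen toward the edges of the data.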
It also turns out that $\mathrm{Cov}(\hat\mu_{Y|X}, \epsilon_j) = 0$. In other words, the residuals around the line are not correlated with where the prediction is being made. Intuitively, this is true because one of our assumptions in the regression model is that the variance is the same no matter what value of $X_j$ is observed. (Note: $\mathrm{Cov}(Y_j, \epsilon_j) \neq 0$.)

(d) Determine $\mathrm{Var}(\hat Y_j | X_j)$.

\[
\begin{aligned}
\mathrm{Var}(\hat Y_j | X_j) &= \mathrm{Var}(\hat\mu_{Y_j|X_j} + \epsilon_j) = \mathrm{Var}(\hat\mu_{Y_j|X_j}) + \mathrm{Var}(\epsilon_j) + 2\,\mathrm{Cov}(\hat\mu_{Y|X}, \epsilon_j) \\
&= \sigma^2 \left( \frac{1}{n} + \frac{(X_j - \bar X)^2}{\sum_{i=1}^n (X_i - \bar X)^2} \right) + \sigma^2 + 0 \\
&= \sigma^2 \left( 1 + \frac{1}{n} + \frac{(X_j - \bar X)^2}{\sum_{i=1}^n (X_i - \bar X)^2} \right)
\end{aligned}
\]

(e) Make an argument for why $\hat\mu_{Y|X}$ and $\hat Y_j | X_j$ are both Normally distributed.

Both of these estimators should be Normally distributed since they are linear combinations of Normally distributed random variables ($\hat\beta_0$, $\hat\beta_1 X_j$, and $\epsilon_j$).

(f) Where will $\hat\mu_{Y|X}$ lie 95% of the time? Where will $\hat Y_j | X_j$ lie 95% of the time? Note: these intervals can be used to build confidence intervals and prediction intervals at a particular value of $X$ by centering them at the estimates rather than the true values, and by using the usual regression estimate for $\sigma^2$.

These will lie within plus or minus $\Phi^{-1}(0.975) = 1.96$ times their respective standard deviations around the true mean response at $X_j$. That is, $\hat\mu_{Y|X}$ will lie within

\[ (\beta_0 + \beta_1 X_j) \pm 1.96 \sqrt{\sigma^2 \left( \frac{1}{n} + \frac{(X_j - \bar X)^2}{\sum_{i=1}^n (X_i - \bar X)^2} \right)} \]

95% of the time, and $\hat Y_j | X_j$ will lie within the following bounds 95% of the time:

\[ (\beta_0 + \beta_1 X_j) \pm 1.96 \sqrt{\sigma^2 \left( 1 + \frac{1}{n} + \frac{(X_j - \bar X)^2}{\sum_{i=1}^n (X_i - \bar X)^2} \right)} \]

Problem 2. A study was conducted to determine the association between the maximum distance at which a highway sign can be read (in feet) and the age of the driver (in years). Thirty drivers of various ages were studied. Sample means and variances for distance and age and the correlation between these variables are given in the accompanying table.

            sample mean    sample variance
Distance
Age
Correlation r =

(a) Find $\hat\beta_0$, $\hat\beta_1$, the standard error of $\hat\beta_1$, and the least-squares regression equation that would predict
the distance at which a highway sign can be read given the age of the driver.

\[ \hat\beta_1 = r\,\frac{s_Y}{s_X} = -3.007 \]
\[ \hat\beta_0 = \bar Y - \bar X \hat\beta_1 = \bar Y - (51)(-3.007) \]
\[ s_{\hat\beta_1} = \sqrt{\frac{s^2}{\sum_{i=1}^n (X_i - \bar X)^2}} = \sqrt{\frac{(1 - r^2)\,s_Y^2}{(n - 2)\,s_X^2}} = 0.4244, \quad n - 2 = 28 \]
\[ \hat Y = \hat\beta_0 + \hat\beta_1 X = \hat\beta_0 - 3.007X \]

which uses the fact that the variance estimate of the residuals is $s^2 = MSE = SSE/(n-2) = (1 - r^2)\,SSY/(n-2) = (1 - r^2)\,s_Y^2\,(n-1)/(n-2)$.

(b) Is Age a significant predictor of Distance in this linear model? Conduct a formal hypothesis test at the $\alpha = 0.05$ level (include the usual elements of a test of hypothesis).

$H_0: \beta_1 = 0$ vs. $H_A: \beta_1 \neq 0$

\[ t = \frac{\hat\beta_1}{s_{\hat\beta_1}}, \qquad \text{p-value} = P\left( |t_{df=28}| > |t| \right) \]

Since our p-value is less than 0.05, we can reject the null hypothesis. There is evidence that distance is related to age; in fact, younger age is associated with being able to read a sign from a farther distance.

(c) Using only the correlation coefficient ($r$) and the sample size, conduct a test to determine if there is a significant association (i.e. $H_0: \rho = 0$) between these two variables using $\alpha = 0.05$.

\[ t = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}} \]

We get the exact same t-statistic with the same d.f. as part (b), so we'll have the same p-value and come to the same conclusion.

(d) Comparing the results of parts (b) and (c) above, what do you conclude about these two tests?

These two tests are mathematically equivalent (t-statistic, degrees of freedom, and p-value) and will always come to the same conclusion. This can be shown algebraically from the formulas in part (a):

\[ t = \frac{\hat\beta_1}{s_{\hat\beta_1}} = \frac{r\,s_Y/s_X}{\sqrt{(1 - r^2)\,s_Y^2 / \left((n-2)\,s_X^2\right)}} = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}} \]

(e) Using your results from part (a) above, calculate the 95% confidence interval for the slope of this regression line.

\[ \hat\beta_1 \pm t^*_{df=28}\, s_{\hat\beta_1} = -3.007 \pm 2.048(0.4244) = (-3.876, -2.138) \]

> qt(0.975,df=28)
[1] 2.048407

(f) Considering the lower and upper 95% confidence limits of the slope you calculated in part (e), how are these consistent with your results for parts (b) and (c)?

These results are consistent: we rejected the null hypothesis ($H_0: \beta_1 = 0$) that the slope is truly zero, and the confidence interval does not include the value zero, so either way it appears that a slope of zero is not a plausible assumption.
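The arithmetic in parts (b) through (e) can be reproduced from the summary values alone. The following is a minimal sketch that uses $\hat\beta_1 = -3.007$, $s_{\hat\beta_1} = 0.4244$ and $n = 30$ from this problem, and takes $r$ as the negative square root of the Multiple R-squared value 0.642 reported in the R output of Problem 3. That value of $r$ is an assumption based on rounded output, so the two t-statistics agree only up to rounding.

```r
# Summary values taken from Problem 2 (rounded as printed in the solution)
b1 <- -3.007
se_b1 <- 0.4244
n <- 30

# Part (b): t-statistic for the slope and its two-sided p-value
t_slope <- b1 / se_b1
p_value <- 2 * pt(-abs(t_slope), df = n - 2)

# Part (e): 95% confidence interval for the slope
t_crit <- qt(0.975, df = n - 2)
ci_slope <- b1 + c(-1, 1) * t_crit * se_b1

# Part (c): the same test from r and n only
# (r assumed to be -sqrt(0.642), the rounded R-squared from Problem 3)
r <- -sqrt(0.642)
t_corr <- r * sqrt(n - 2) / sqrt(1 - r^2)

t_slope
t_corr
p_value
round(ci_slope, 3)  # agrees with the interval (-3.876, -2.138) in part (e)
```

The small gap between `t_slope` and `t_corr` comes only from rounding in the inputs; with unrounded summary statistics the two are identical, as the algebra in part (d) requires.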
Problem 3. The data for the above problem are available in an Excel file on the class website under the filename HighwaySigns.csv.

(a) Make a scatterplot of these data (with fitted regression line) in R (do not include it here...you'll print it out for part (g)), run a regression model and confirm the results you calculated in problem 2(a) for $\hat\beta_0$, $\hat\beta_1$, and $s_{\hat\beta_1}$.

> fit=lm(distance~age,data=highwaysigns)
> summary(fit)

Call:
lm(formula = distance ~ age, data = highwaysigns)

Residuals:
    Min      1Q  Median      3Q     Max

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)                              < 2e-16 ***
age                                     1.04e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: on 28 degrees of freedom
Multiple R-squared: 0.642, Adjusted R-squared:
F-statistic: on 1 and 28 DF, p-value: 1.041e-07

> plot(distance~age,data=highwaysigns,pch=16,cex=3)
> abline(fit,lwd=3,col="red")

[Figure: scatterplot of distance versus age with the fitted regression line]

Based on the above output, we see that the estimates of $\hat\beta_1$, $\hat\beta_0$, and $s_{\hat\beta_1}$ match our hand-calculated ones (ignoring rounding errors).

(b) Using the results of the regression model from R, locate the calculated value of the t-test statistic and the associated p-value to determine if age is a significant predictor of distance, and compare these results to the results you obtained by hand in problem 2(b) above.

Based on the R output, the calculated t-statistic for the slope and its p-value (1.04e-07) agree with the work done by hand.
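The agreement between the hand formulas and the R output holds for any data set, not only this one. Since HighwaySigns.csv is not reproduced here, the sketch below uses simulated stand-in data (the data frame `sim`, the seed, and the generating values 577, -3 and 50 are all arbitrary assumptions) and checks the summary-statistic formulas of Problem 2(a) against `lm()`, and the interval formulas used in parts (c) through (f) of this problem against `predict()`.

```r
# Stand-in data: simulated, NOT the real highway sign data
set.seed(139)
n <- 30
sim <- data.frame(age = round(runif(n, min = 18, max = 82)))
sim$distance <- 577 - 3 * sim$age + rnorm(n, sd = 50)

fit_sim <- lm(distance ~ age, data = sim)
tab <- coef(summary(fit_sim))

# Hand formulas from Problem 2(a), using only summary statistics
r <- cor(sim$age, sim$distance)
b1_hand <- r * sd(sim$distance) / sd(sim$age)
b0_hand <- mean(sim$distance) - mean(sim$age) * b1_hand
se_b1_hand <- sqrt((1 - r^2) * var(sim$distance) / ((n - 2) * var(sim$age)))

# Hand formulas for the 95% confidence and prediction intervals at age 75
x0 <- 75
s <- summary(fit_sim)$sigma
t_crit <- qt(0.975, df = n - 2)
lev <- 1 / n + (x0 - mean(sim$age))^2 / ((n - 1) * var(sim$age))
y0 <- b0_hand + b1_hand * x0
ci_hand <- y0 + c(-1, 1) * t_crit * s * sqrt(lev)
pi_hand <- y0 + c(-1, 1) * t_crit * s * sqrt(1 + lev)

# The same intervals from predict()
new_sim <- data.frame(age = x0)
ci_r <- predict(fit_sim, new_sim, interval = "confidence")
pi_r <- predict(fit_sim, new_sim, interval = "prediction")

# Each comparison should return TRUE
all.equal(unname(tab["age", "Estimate"]), b1_hand)
all.equal(unname(tab["(Intercept)", "Estimate"]), b0_hand)
all.equal(unname(tab["age", "Std. Error"]), se_b1_hand)
all.equal(unname(ci_r[1, c("lwr", "upr")]), ci_hand)
all.equal(unname(pi_r[1, c("lwr", "upr")]), pi_hand)
```

The only difference between the two interval formulas is the extra 1 under the square root, which is why the prediction interval is always the wider of the two.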
(c) Using only the regression analysis results table and the sample means and variances given in problem 2, calculate a 95% confidence interval for the average distance at which a highway sign can be read by individuals 75 years of age.

\[ \hat Y = \hat\beta_0 + \hat\beta_1 X = \hat\beta_0 - 3.007(75) \]

95% Confidence Interval for $\mu_Y$ at $x = 75$:

\[ \hat Y \pm t^*_{df=n-2}\; s \sqrt{\frac{1}{n} + \frac{(x - \bar x)^2}{(n-1)\,s_x^2}} = \hat Y \pm t^*_{df=28}\; s \sqrt{\frac{1}{30} + \frac{(75 - 51)^2}{(29)\,s_x^2}} = (323.2, 379.2) \]

(d) Use R to confirm the 95% confidence interval in part (c). You'll have to create a new dataframe with the predictor variable age set to 75, new=data.frame(age = 75), and then use the command predict using the linear model from part (a).

> new=data.frame(age = 75)
> predict(fit,new,interval="confidence")
  fit lwr upr

(e) Using only the regression analysis results table and the sample means and variances given in problem 2, calculate a 95% prediction interval for the distance at which a highway sign can be read by an individual 75 years of age.

\[ \hat Y \pm t^*_{df=n-2}\; s \sqrt{1 + \frac{1}{n} + \frac{(x - \bar x)^2}{(n-1)\,s_x^2}} = \hat Y \pm t^*_{df=28}\; s \sqrt{1 + \frac{1}{30} + \frac{(75 - 51)^2}{(29)\,s_x^2}}, \]

which gives a lower limit of 245.51.

(f) Use R to confirm the 95% prediction interval in part (e).

> predict(fit,new,interval="prediction")
  fit lwr upr

(g) Print out the scatterplot with least-squares line, enter your two intervals from parts (c) and (e) by hand onto the scatterplot, and explain the difference between the 95% confidence interval and the 95% prediction interval.

[Figure: scatterplot of distance versus age with the least-squares line and the two intervals marked at age 75]

In the above graph the prediction interval is in blue and the confidence interval is in orange. The prediction interval is a reasonable interval estimate for where a single 75-year-old person would be
predicted to be able to read a sign at the 95% confidence level, while the confidence interval is a range of plausible values for the average distance at which all 75-year-olds in the population are able to read a sign (i.e., where the underlying population mean lies in the Y-direction at X = 75) at the 95% confidence level.

Problem 4. The data set malebirths.csv contains data for the proportion of male births in 4 different countries (Denmark, the Netherlands, Canada, and the United States) for a number of years. Use this data set to answer the following questions:

(a) Run four different simple linear regression models in R, one for each country separately. Create a table with four rows (one for each country) and four columns: one column each for the calculations $\hat\beta_1$, the standard error of $\hat\beta_1$, the related t-statistic, and the p-value related to this test. For which countries is the association significant?

Country        $\hat\beta_1$    $s_{\hat\beta_1}$    t-stat    p-value
Denmark
Netherlands                                                    <
Canada
USA

Based on the results of the linear regression models, we see that the proportion of births that are male has decreased significantly over time in all four countries. From most to least significant: Netherlands, USA, Canada, and Denmark.

(b) Present four different scatterplots (one for each country) with the observed points and the related fitted regression line as well. It would be helpful for interpretation if each plot had the same bounds on both axes. Be sure the plots are clearly labeled.

[Figure: four scatterplots with fitted regression lines, one panel each for Denmark, Netherlands, Canada, and USA]
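One way to assemble the table in part (a) is to loop over the country columns and keep the slope row of each fit. The sketch below runs on simulated stand-in data because malebirths.csv is not reproduced here. The column names (`year`, `denmark`, `netherlands`, `canada`, `usa`), the year range, and every generated value are assumptions made for illustration, so the numbers it prints are not the homework's answers.

```r
# Stand-in for malebirths.csv: names and values are hypothetical
set.seed(4)
births <- data.frame(year = 1970:1990)
countries <- c("denmark", "netherlands", "canada", "usa")
for (cn in countries) {
  births[[cn]] <- 0.513 - 0.00003 * (births$year - 1970) +
    rnorm(nrow(births), sd = 0.001)
}

# One simple linear regression per country; keep only the slope row
slope_rows <- lapply(countries, function(cn) {
  fit_cn <- lm(reformulate("year", response = cn), data = births)
  coef(summary(fit_cn))["year", ]
})
slope_table <- do.call(rbind, slope_rows)
rownames(slope_table) <- countries
colnames(slope_table) <- c("slope", "se", "t", "p")
slope_table

# Shared y-axis limits for the four scatterplots in part (b)
ylim_all <- range(births[countries])
ylim_all
```

Passing `ylim = ylim_all` to each `plot()` call gives all four panels the same vertical bounds, which makes differences in scatter around the lines directly comparable.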
(c) Explain why the U.S. can have the largest of the 4 t-statistics (in magnitude) even though its slope is not the highest.

This can be explained by the fact that the standard error of the slope is smaller for the U.S. (because the spread of the points around the fitted regression line is much smaller in the vertical direction).

(d) Explain why the standard error of the estimated slope is smaller for the U.S. than for Canada, even though the sample size is the same.

This can be explained by the fact that the estimated standard deviation of the residuals is much smaller (less spread around the line in the y-direction), so the precision of the slope as an estimate of the unknown true value generating these data is much better.

(e) Provide a reason why the standard deviations around the regression line might be different for the four countries (hint: the proportion of males can be thought of as an average within each country respectively).

This is because the points actually represent averages of zeros and ones: an indicator, recorded for each baby born, of whether it is male or not. Since there are so many more observations (births) in the U.S., we'd expect the average of these zeros and ones to vary less (based on the Law of Large Numbers).
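The argument in part (e) can be made concrete under a simple binomial model, which is an assumption added here: if each of $N$ births is independently male with probability $p$, the yearly proportion has standard deviation $\sqrt{p(1-p)/N}$. The value of `p` and the two yearly birth counts below are hypothetical round numbers, not figures from the data set.

```r
# Assumed probability of a male birth and hypothetical yearly birth counts
p <- 0.51
births_per_year <- c(small_country = 6e4, large_country = 4e6)

# Theoretical standard deviation of the yearly proportion of male births
sd_prop <- sqrt(p * (1 - p) / births_per_year)

# Simulation check: 2000 simulated years for each country size
set.seed(2015)
sim_sd <- sapply(births_per_year, function(N) {
  sd(rbinom(2000, size = N, prob = p) / N)
})

rbind(theory = sd_prop, simulated = sim_sd)
```

With these counts the larger country's proportion varies far less from year to year, so its points sit closer to the regression line, its residual standard deviation is smaller, and the standard error of its slope shrinks accordingly.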
1 Inferential Methods for Correlation and Regression Analysis
1 Iferetial Methods for Correlatio ad Regressio Aalysis I the chapter o Correlatio ad Regressio Aalysis tools for describig bivariate cotiuous data were itroduced. The sample Pearso Correlatio Coefficiet
More informationUniversity of California, Los Angeles Department of Statistics. Simple regression analysis
Uiversity of Califoria, Los Ageles Departmet of Statistics Statistics 100C Istructor: Nicolas Christou Simple regressio aalysis Itroductio: Regressio aalysis is a statistical method aimig at discoverig
More informationTABLES AND FORMULAS FOR MOORE Basic Practice of Statistics
TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics Explorig Data: Distributios Look for overall patter (shape, ceter, spread) ad deviatios (outliers). Mea (use a calculator): x = x 1 + x 2 + +
More information3/3/2014. CDS M Phil Econometrics. Types of Relationships. Types of Relationships. Types of Relationships. Vijayamohanan Pillai N.
3/3/04 CDS M Phil Old Least Squares (OLS) Vijayamohaa Pillai N CDS M Phil Vijayamoha CDS M Phil Vijayamoha Types of Relatioships Oly oe idepedet variable, Relatioship betwee ad is Liear relatioships Curviliear
More informationST 305: Exam 3 ( ) = P(A)P(B A) ( ) = P(A) + P(B) ( ) = 1 P( A) ( ) = P(A) P(B) σ X 2 = σ a+bx. σ ˆp. σ X +Y. σ X Y. σ X. σ Y. σ n.
ST 305: Exam 3 By hadig i this completed exam, I state that I have either give or received assistace from aother perso durig the exam period. I have used o resources other tha the exam itself ad the basic
More informationUniversity of California, Los Angeles Department of Statistics. Practice problems - simple regression 2 - solutions
Uiversity of Califoria, Los Ageles Departmet of Statistics Statistics 00C Istructor: Nicolas Christou EXERCISE Aswer the followig questios: Practice problems - simple regressio - solutios a Suppose y,
More informationTABLES AND FORMULAS FOR MOORE Basic Practice of Statistics
TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics Explorig Data: Distributios Look for overall patter (shape, ceter, spread) ad deviatios (outliers). Mea (use a calculator): x = x 1 + x 2 + +
More informationOverview. p 2. Chapter 9. Pooled Estimate of. q = 1 p. Notation for Two Proportions. Inferences about Two Proportions. Assumptions
Chapter 9 Slide Ifereces from Two Samples 9- Overview 9- Ifereces about Two Proportios 9- Ifereces about Two Meas: Idepedet Samples 9-4 Ifereces about Matched Pairs 9-5 Comparig Variatio i Two Samples
More informationProperties and Hypothesis Testing
Chapter 3 Properties ad Hypothesis Testig 3.1 Types of data The regressio techiques developed i previous chapters ca be applied to three differet kids of data. 1. Cross-sectioal data. 2. Time series data.
More informationStatistics 203 Introduction to Regression and Analysis of Variance Assignment #1 Solutions January 20, 2005
Statistics 203 Itroductio to Regressio ad Aalysis of Variace Assigmet #1 Solutios Jauary 20, 2005 Q. 1) (MP 2.7) (a) Let x deote the hydrocarbo percetage, ad let y deote the oxyge purity. The simple liear
More informationRegression, Inference, and Model Building
Regressio, Iferece, ad Model Buildig Scatter Plots ad Correlatio Correlatio coefficiet, r -1 r 1 If r is positive, the the scatter plot has a positive slope ad variables are said to have a positive relatioship
More informationLinear Regression Models
Liear Regressio Models Dr. Joh Mellor-Crummey Departmet of Computer Sciece Rice Uiversity johmc@cs.rice.edu COMP 528 Lecture 9 15 February 2005 Goals for Today Uderstad how to Use scatter diagrams to ispect
More informationSimple Linear Regression
Simple Liear Regressio 1. Model ad Parameter Estimatio (a) Suppose our data cosist of a collectio of pairs (x i, y i ), where x i is a observed value of variable X ad y i is the correspodig observatio
More informationFinal Examination Solutions 17/6/2010
The Islamic Uiversity of Gaza Faculty of Commerce epartmet of Ecoomics ad Political Scieces A Itroductio to Statistics Course (ECOE 30) Sprig Semester 009-00 Fial Eamiatio Solutios 7/6/00 Name: I: Istructor:
More informationContinuous Data that can take on any real number (time/length) based on sample data. Categorical data can only be named or categorised
Questio 1. (Topics 1-3) A populatio cosists of all the members of a group about which you wat to draw a coclusio (Greek letters (μ, σ, Ν) are used) A sample is the portio of the populatio selected for
More informationUNIVERSITY OF TORONTO Faculty of Arts and Science APRIL/MAY 2009 EXAMINATIONS ECO220Y1Y PART 1 OF 2 SOLUTIONS
PART of UNIVERSITY OF TORONTO Faculty of Arts ad Sciece APRIL/MAY 009 EAMINATIONS ECO0YY PART OF () The sample media is greater tha the sample mea whe there is. (B) () A radom variable is ormally distributed
More informationStatistics 20: Final Exam Solutions Summer Session 2007
1. 20 poits Testig for Diabetes. Statistics 20: Fial Exam Solutios Summer Sessio 2007 (a) 3 poits Give estimates for the sesitivity of Test I ad of Test II. Solutio: 156 patiets out of total 223 patiets
More informationFinal Review. Fall 2013 Prof. Yao Xie, H. Milton Stewart School of Industrial Systems & Engineering Georgia Tech
Fial Review Fall 2013 Prof. Yao Xie, yao.xie@isye.gatech.edu H. Milto Stewart School of Idustrial Systems & Egieerig Georgia Tech 1 Radom samplig model radom samples populatio radom samples: x 1,..., x
More informationWorksheet 23 ( ) Introduction to Simple Linear Regression (continued)
Worksheet 3 ( 11.5-11.8) Itroductio to Simple Liear Regressio (cotiued) This worksheet is a cotiuatio of Discussio Sheet 3; please complete that discussio sheet first if you have ot already doe so. This
More informationMathematical Notation Math Introduction to Applied Statistics
Mathematical Notatio Math 113 - Itroductio to Applied Statistics Name : Use Word or WordPerfect to recreate the followig documets. Each article is worth 10 poits ad ca be prited ad give to the istructor
More informationLecture 22: Review for Exam 2. 1 Basic Model Assumptions (without Gaussian Noise)
Lecture 22: Review for Exam 2 Basic Model Assumptios (without Gaussia Noise) We model oe cotiuous respose variable Y, as a liear fuctio of p umerical predictors, plus oise: Y = β 0 + β X +... β p X p +
More informationEXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY
EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY HIGHER CERTIFICATE IN STATISTICS, 017 MODULE 4 : Liear models Time allowed: Oe ad a half hours Cadidates should aswer THREE questios. Each questio carries
More informationStatistical and Mathematical Methods DS-GA 1002 December 8, Sample Final Problems Solutions
Statistical ad Mathematical Methods DS-GA 00 December 8, 05. Short questios Sample Fial Problems Solutios a. Ax b has a solutio if b is i the rage of A. The dimesio of the rage of A is because A has liearly-idepedet
More informationBecause it tests for differences between multiple pairs of means in one test, it is called an omnibus test.
Math 308 Sprig 018 Classes 19 ad 0: Aalysis of Variace (ANOVA) Page 1 of 6 Itroductio ANOVA is a statistical procedure for determiig whether three or more sample meas were draw from populatios with equal
More informationAssessment and Modeling of Forests. FR 4218 Spring Assignment 1 Solutions
Assessmet ad Modelig of Forests FR 48 Sprig Assigmet Solutios. The first part of the questio asked that you calculate the average, stadard deviatio, coefficiet of variatio, ad 9% cofidece iterval of the
More informationCircle the single best answer for each multiple choice question. Your choice should be made clearly.
TEST #1 STA 4853 March 6, 2017 Name: Please read the followig directios. DO NOT TURN THE PAGE UNTIL INSTRUCTED TO DO SO Directios This exam is closed book ad closed otes. There are 32 multiple choice questios.
More informationMA 575, Linear Models : Homework 3
MA 575, Liear Models : Homework 3 Questio 1 RSS( ˆβ 0, ˆβ 1 ) (ŷ i y i ) Problem.7 Questio.7.1 ( ˆβ 0 + ˆβ 1 x i y i ) (ȳ SXY SXY x + SXX SXX x i y i ) ((ȳ y i ) + SXY SXX (x i x)) (ȳ y i ) SXY SXX SY
More informationDescribing the Relation between Two Variables
Copyright 010 Pearso Educatio, Ic. Tables ad Formulas for Sulliva, Statistics: Iformed Decisios Usig Data 010 Pearso Educatio, Ic Chapter Orgaizig ad Summarizig Data Relative frequecy = frequecy sum of
More informationInterval Estimation (Confidence Interval = C.I.): An interval estimate of some population parameter is an interval of the form (, ),
Cofidece Iterval Estimatio Problems Suppose we have a populatio with some ukow parameter(s). Example: Normal(,) ad are parameters. We eed to draw coclusios (make ifereces) about the ukow parameters. We
More informationSection 9.2. Tests About a Population Proportion 12/17/2014. Carrying Out a Significance Test H A N T. Parameters & Hypothesis
Sectio 9.2 Tests About a Populatio Proportio P H A N T O M S Parameters Hypothesis Assess Coditios Name the Test Test Statistic (Calculate) Obtai P value Make a decisio State coclusio Sectio 9.2 Tests
More informationComparing your lab results with the others by one-way ANOVA
Comparig your lab results with the others by oe-way ANOVA You may have developed a ew test method ad i your method validatio process you would like to check the method s ruggedess by coductig a simple
More informationOpen book and notes. 120 minutes. Cover page and six pages of exam. No calculators.
IE 330 Seat # Ope book ad otes 120 miutes Cover page ad six pages of exam No calculators Score Fial Exam (example) Schmeiser Ope book ad otes No calculator 120 miutes 1 True or false (for each, 2 poits
More informationChapter 13, Part A Analysis of Variance and Experimental Design
Slides Prepared by JOHN S. LOUCKS St. Edward s Uiversity Slide 1 Chapter 13, Part A Aalysis of Variace ad Eperimetal Desig Itroductio to Aalysis of Variace Aalysis of Variace: Testig for the Equality of
More information11 Correlation and Regression
11 Correlatio Regressio 11.1 Multivariate Data Ofte we look at data where several variables are recorded for the same idividuals or samplig uits. For example, at a coastal weather statio, we might record
More informationIsmor Fischer, 1/11/
Ismor Fischer, //04 7.4-7.4 Problems. I Problem 4.4/9, it was show that importat relatios exist betwee populatio meas, variaces, ad covariace. Specifically, we have the formulas that appear below left.
More informationSTA Learning Objectives. Population Proportions. Module 10 Comparing Two Proportions. Upon completing this module, you should be able to:
STA 2023 Module 10 Comparig Two Proportios Learig Objectives Upo completig this module, you should be able to: 1. Perform large-sample ifereces (hypothesis test ad cofidece itervals) to compare two populatio
More informationCommon Large/Small Sample Tests 1/55
Commo Large/Small Sample Tests 1/55 Test of Hypothesis for the Mea (σ Kow) Covert sample result ( x) to a z value Hypothesis Tests for µ Cosider the test H :μ = μ H 1 :μ > μ σ Kow (Assume the populatio
More informationThis is an introductory course in Analysis of Variance and Design of Experiments.
1 Notes for M 384E, Wedesday, Jauary 21, 2009 (Please ote: I will ot pass out hard-copy class otes i future classes. If there are writte class otes, they will be posted o the web by the ight before class
More informationChapter 23: Inferences About Means
Chapter 23: Ifereces About Meas Eough Proportios! We ve spet the last two uits workig with proportios (or qualitative variables, at least) ow it s time to tur our attetios to quatitative variables. For
More informationCorrelation Regression
Correlatio Regressio While correlatio methods measure the stregth of a liear relatioship betwee two variables, we might wish to go a little further: How much does oe variable chage for a give chage i aother
More informationChapter 22. Comparing Two Proportions. Copyright 2010, 2007, 2004 Pearson Education, Inc.
Chapter 22 Comparig Two Proportios Copyright 2010, 2007, 2004 Pearso Educatio, Ic. Comparig Two Proportios Read the first two paragraphs of pg 504. Comparisos betwee two percetages are much more commo
More informationHYPOTHESIS TESTS FOR ONE POPULATION MEAN WORKSHEET MTH 1210, FALL 2018
HYPOTHESIS TESTS FOR ONE POPULATION MEAN WORKSHEET MTH 1210, FALL 2018 We are resposible for 2 types of hypothesis tests that produce ifereces about the ukow populatio mea, µ, each of which has 3 possible
More informationLinear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d
Liear regressio Daiel Hsu (COMS 477) Maximum likelihood estimatio Oe of the simplest liear regressio models is the followig: (X, Y ),..., (X, Y ), (X, Y ) are iid radom pairs takig values i R d R, ad Y
More informationAlgebra of Least Squares
October 19, 2018 Algebra of Least Squares Geometry of Least Squares Recall that out data is like a table [Y X] where Y collects observatios o the depedet variable Y ad X collects observatios o the k-dimesioal
More information(all terms are scalars).the minimization is clearer in sum notation:
7 Multiple liear regressio: with predictors) Depedet data set: y i i = 1, oe predictad, predictors x i,k i = 1,, k = 1, ' The forecast equatio is ŷ i = b + Use matrix otatio: k =1 b k x ik Y = y 1 y 1
More informationOutput Analysis (2, Chapters 10 &11 Law)
B. Maddah ENMG 6 Simulatio Output Aalysis (, Chapters 10 &11 Law) Comparig alterative system cofiguratio Sice the output of a simulatio is radom, the comparig differet systems via simulatio should be doe
More informationLecture 11 Simple Linear Regression
Lecture 11 Simple Liear Regressio Fall 2013 Prof. Yao Xie, yao.xie@isye.gatech.edu H. Milto Stewart School of Idustrial Systems & Egieerig Georgia Tech Midterm 2 mea: 91.2 media: 93.75 std: 6.5 2 Meddicorp
More informationy ij = µ + α i + ɛ ij,
STAT 4 ANOVA -Cotrasts ad Multiple Comparisos /3/04 Plaed comparisos vs uplaed comparisos Cotrasts Cofidece Itervals Multiple Comparisos: HSD Remark Alterate form of Model I y ij = µ + α i + ɛ ij, a i
More informationSample Size Determination (Two or More Samples)
Sample Sie Determiatio (Two or More Samples) STATGRAPHICS Rev. 963 Summary... Data Iput... Aalysis Summary... 5 Power Curve... 5 Calculatios... 6 Summary This procedure determies a suitable sample sie
More informationn but for a small sample of the population, the mean is defined as: n 2. For a lognormal distribution, the median equals the mean.
Sectio. True or False Questios (2 pts each). For a populatio the meas is defied as i= μ = i but for a small sample of the populatio, the mea is defied as: = i= i 2. For a logormal distributio, the media
More informationFrequentist Inference
Frequetist Iferece The topics of the ext three sectios are useful applicatios of the Cetral Limit Theorem. Without kowig aythig about the uderlyig distributio of a sequece of radom variables {X i }, for
More informationRecall the study where we estimated the difference between mean systolic blood pressure levels of users of oral contraceptives and non-users, x - y.
Testig Statistical Hypotheses Recall the study where we estimated the differece betwee mea systolic blood pressure levels of users of oral cotraceptives ad o-users, x - y. Such studies are sometimes viewed
More informationGeometry of LS. LECTURE 3 GEOMETRY OF LS, PROPERTIES OF σ 2, PARTITIONED REGRESSION, GOODNESS OF FIT
OCTOBER 7, 2016 LECTURE 3 GEOMETRY OF LS, PROPERTIES OF σ 2, PARTITIONED REGRESSION, GOODNESS OF FIT Geometry of LS We ca thik of y ad the colums of X as members of the -dimesioal Euclidea space R Oe ca
More informationComparing Two Populations. Topic 15 - Two Sample Inference I. Comparing Two Means. Comparing Two Pop Means. Background Reading
Topic 15 - Two Sample Iferece I STAT 511 Professor Bruce Craig Comparig Two Populatios Research ofte ivolves the compariso of two or more samples from differet populatios Graphical summaries provide visual
More informationmultiplies all measures of center and the standard deviation and range by k, while the variance is multiplied by k 2.
Lesso 3- Lesso 3- Scale Chages of Data Vocabulary scale chage of a data set scale factor scale image BIG IDEA Multiplyig every umber i a data set by k multiplies all measures of ceter ad the stadard deviatio
More informationSince X n /n P p, we know that X n (n. Xn (n X n ) Using the asymptotic result above to obtain an approximation for fixed n, we obtain
Assigmet 9 Exercise 5.5 Let X biomial, p, where p 0, 1 is ukow. Obtai cofidece itervals for p i two differet ways: a Sice X / p d N0, p1 p], the variace of the limitig distributio depeds oly o p. Use the
More informationII. Descriptive Statistics D. Linear Correlation and Regression. 1. Linear Correlation
II. Descriptive Statistics D. Liear Correlatio ad Regressio I this sectio Liear Correlatio Cause ad Effect Liear Regressio 1. Liear Correlatio Quatifyig Liear Correlatio The Pearso product-momet correlatio
More informationENGI 4421 Confidence Intervals (Two Samples) Page 12-01
ENGI 44 Cofidece Itervals (Two Samples) Page -0 Two Sample Cofidece Iterval for a Differece i Populatio Meas [Navidi sectios 5.4-5.7; Devore chapter 9] From the cetral limit theorem, we kow that, for sufficietly
More informationMOST PEOPLE WOULD RATHER LIVE WITH A PROBLEM THEY CAN'T SOLVE, THAN ACCEPT A SOLUTION THEY CAN'T UNDERSTAND.
XI-1 (1074) MOST PEOPLE WOULD RATHER LIVE WITH A PROBLEM THEY CAN'T SOLVE, THAN ACCEPT A SOLUTION THEY CAN'T UNDERSTAND. R. E. D. WOOLSEY AND H. S. SWANSON XI-2 (1075) STATISTICAL DECISION MAKING Advaced
More informationStatistics Lecture 27. Final review. Administrative Notes. Outline. Experiments. Sampling and Surveys. Administrative Notes
Admiistrative Notes s - Lecture 7 Fial review Fial Exam is Tuesday, May 0th (3-5pm Covers Chapters -8 ad 0 i textbook Brig ID cards to fial! Allowed: Calculators, double-sided 8.5 x cheat sheet Exam Rooms:
More informationMA238 Assignment 4 Solutions (part a)
(i) Sigle sample tests. Questio. MA38 Assigmet 4 Solutios (part a) (a) (b) (c) H 0 : = 50 sq. ft H A : < 50 sq. ft H 0 : = 3 mpg H A : > 3 mpg H 0 : = 5 mm H A : 5mm Questio. (i) What are the ull ad alterative
More informationRandom Variables, Sampling and Estimation