Simple Linear Regression
- Sharyl Owens
- 6 years ago
1. Model and Parameter Estimation

(a) Suppose our data consist of a collection of $n$ pairs $(x_i, y_i)$, where $x_i$ is an observed value of variable $X$ and $y_i$ is the corresponding observation of random variable $Y$. The simple linear regression model

$$y_i = \beta_0 + \beta_1 x_i + \epsilon_i$$

expresses the relationship between variables $X$ and $Y$. Here $\beta_0$ denotes the intercept and $\beta_1$ the slope of the regression line.

(b) Values for $\beta_0$ and $\beta_1$ are estimated from the data by the method of least squares.

(c) From the many straight lines that could be drawn through our data, we find the line that minimizes the sum of squared residuals, where a residual is the vertical distance between a point $(x_i, y_i)$ and the regression line.

(d) Values $\hat\beta_0$ and $\hat\beta_1$ denote the estimates for $\beta_0$ and $\beta_1$ that minimize the sum of squared residuals, or error sum of squares (SSE). The estimates are called least squares estimates.

$$SSE = \sum_{i=1}^{n} \epsilon_i^2 = \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2$$

(e) SSE is minimized when the partial derivatives of the SSE with respect to the unknowns ($\beta_0$ and $\beta_1$) are set to zero: $\partial SSE/\partial \beta_0 = 0$ and $\partial SSE/\partial \beta_1 = 0$. (You need multivariable calculus [eg Math 2001] to understand the theoretical details, so we will just take this as a given.) These two conditions result in the two so-called normal equations.

$$n\beta_0 + \beta_1 \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i$$

$$\beta_0 \sum_{i=1}^{n} x_i + \beta_1 \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i$$

(f) The two normal equations are solved simultaneously to obtain estimates of $\beta_0$ and $\beta_1$. These estimates are:

$$\hat\beta_1 = \frac{\sum_{i=1}^{n} (y_i - \bar y)(x_i - \bar x)}{\sum_{i=1}^{n} (x_i - \bar x)^2} = \frac{\sum_{i=1}^{n} x_i y_i - \left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)/n}{\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2/n}$$

$$\hat\beta_0 = \bar y - \hat\beta_1 \bar x$$

Looking at the formula for $\hat\beta_1$, and recalling the formula for the correlation coefficient $r$, it is easy to see that $\hat\beta_1 = r s_y / s_x$.

(g) The error variance, $\sigma^2$, is estimated as

$$\hat\sigma^2 = \frac{SSE}{n-2} = \frac{\sum_{i=1}^{n} (y_i - \hat y_i)^2}{n-2}$$
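The link between the normal equations in (e) and the closed-form estimates in (f) can be checked numerically. The following is a minimal sketch, not part of the original notes: it uses the ozone and soybean data from the worked example in these notes, writes the two normal equations as a 2 x 2 linear system, and solves it with R's `solve`. The names `A`, `rhs`, `beta_normal`, `beta_closed` and `b1_from_r` are illustrative choices.

```r
# Ozone dose (ppm) and soybean yield (gm/plant) from the worked example
x <- c(.02, .07, .11, .15)
y <- c(242, 237, 231, 201)
n <- length(y)

# Normal equations written as A %*% c(beta0, beta1) = rhs
A   <- matrix(c(n,      sum(x),
                sum(x), sum(x^2)), nrow = 2, byrow = TRUE)
rhs <- c(sum(y), sum(x * y))
beta_normal <- solve(A, rhs)

# Closed-form least squares estimates from item (f)
b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
b0 <- mean(y) - b1 * mean(x)
beta_closed <- c(b0, b1)

# Slope expressed through the correlation coefficient: r * s_y / s_x
b1_from_r <- cor(x, y) * sd(y) / sd(x)

print(beta_normal)
print(beta_closed)
print(b1_from_r)
```

All three routes should agree up to floating-point error, which is the point of item (f): the closed-form expressions are simply the solution of the two normal equations.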
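The following example shows the calculations as they would be carried out by hand, in gruesome detail.

eg: To study the effect of ozone pollution on soybean yield, data were collected at four ozone dose levels and the resulting soybean seed yield monitored. Ozone dose levels (in ppm) were reported as the average ozone concentration during the growing season. Soybean yield was reported in grams per plant.

    X: Ozone (ppm)    Y: Yield (gm/plant)
    .02               242
    .07               237
    .11               231
    .15               201

Estimated values for $\beta_0$ and $\beta_1$ are now computed from the data.

    X       Y       X^2      Y^2      XY
    .02     242     .0004    58564     4.84
    .07     237     .0049    56169    16.59
    .11     231     .0121    53361    25.41
    .15     201     .0225    40401    30.15

Column sums: $\sum x_i = .35$, $\sum y_i = 911$, $\sum x_i^2 = .0399$, $\sum y_i^2 = 208{,}495$, and $\sum x_i y_i = 76.99$.

Means: $\bar x = .0875$ and $\bar y = 227.75$.

Intermediate terms:

$$SS_{xx} = \sum_i (x_i - \bar x)^2 = \sum_i x_i^2 - \frac{(\sum x_i)^2}{n} = .0399 - \frac{(.35)^2}{4} = .009275$$

$$SS_{xy} = \sum_i (x_i - \bar x)(y_i - \bar y) = \sum_i x_i y_i - \frac{(\sum x_i)(\sum y_i)}{n} = 76.99 - \frac{(.35)(911)}{4} = -2.7225$$

$$\hat\beta_1 = \frac{SS_{xy}}{SS_{xx}} = \frac{-2.7225}{.009275} = -293.531, \qquad \hat\beta_0 = \bar y - \hat\beta_1 \bar x = 227.75 - (-293.531)(.0875) = 253.434$$

(h) The least squares regression equation which characterizes the linear relationship between soybean yield and ozone dose is

$$\hat y_i = 253.434 - 293.531\, x_i$$

(i) The error variance, $\sigma^2$, is estimated as MSE.

(j) Residuals: $\hat\epsilon_i = y_i - \hat y_i = y_i - (\hat\beta_0 + \hat\beta_1 x_i)$

    x_i     y_i     yhat_i     resid_i = y_i - yhat_i
    .02     242     247.563    -5.563
    .07     237     232.887     4.113
    .11     231     221.146     9.854
    .15     201     209.404    -8.404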
(k) Residual Sum of Squares (in regression problems, the error sum of squares is also known as the residual sum of squares):

$$SSE = \sum_i \hat\epsilon_i^2 = (-5.563)^2 + (4.113)^2 + (9.854)^2 + (-8.404)^2 \approx 215.6$$

(l) Mean Squared Error:

$$MSE = \frac{SSE}{n-2} \approx \frac{215.6}{2} = 107.8$$
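Calculations by hand in R

    x=c(.02,.07,.11,.15)              # ozone dose (ppm)
    y=c(242,237,231,201)              # soybean yield (gm/plant)
    SXX=sum((x-mean(x))^2)
    SXY=sum((x-mean(x))*(y-mean(y)))
    SYY=sum((y-mean(y))^2)
    b1=SXY/SXX                        # slope estimate
    b0=mean(y)-b1*mean(x)             # intercept estimate
    yp=b0+b1*x                        # fitted values
    resids=y-yp
    SSE=sum(resids^2)
    SST=SYY
    SSR=SST-SSE
    SS=c(SSR,SSE,SST)
    n=length(y)
    df=c(1,n-2,n-1)
    MS=SS/df
    cbind(SS,df,MS)

The output is a three-row table with columns SS, df and MS; rows [1,], [2,] and [3,] correspond to regression, error and total.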
Check calculations using the built-in lm, summary and anova commands in R

[R output for lm(formula = y ~ x): the fitted coefficients for (Intercept) and x; the summary table with columns Estimate, Std. Error, t value and Pr(>|t|), in which the intercept is flagged **; the residual standard error on 2 degrees of freedom; Multiple R-squared and Adjusted R-squared; the F-statistic on 1 and 2 DF with its p-value; and the Analysis of Variance Table for response y with rows x and Residuals and columns Df, Sum Sq, Mean Sq, F value and Pr(>F).]
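The notes show the output of this check but not the commands that produce it. A minimal sketch, assuming the standard `lm`, `summary` and `anova` calls on the same four data points, might look as follows; the object name `fit` is an illustrative choice, and the hand-computed `b0` and `b1` are repeated only so that the two sets of estimates can be compared side by side.

```r
# Ozone dose (ppm) and soybean yield (gm/plant) from the worked example
x <- c(.02, .07, .11, .15)
y <- c(242, 237, 231, 201)

fit <- lm(y ~ x)
print(fit)            # call and estimated coefficients
print(summary(fit))   # coefficient table, residual standard error, R-squared, F-statistic
print(anova(fit))     # analysis of variance table

# Hand-computed least squares estimates, for comparison with coef(fit)
b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
b0 <- mean(y) - b1 * mean(x)
print(c(b0, b1))
```

The `Residuals` row of the ANOVA table carries SSE and MSE, and the `x` row carries SSR, so this output can be compared line by line with the SS, df, MS table from the hand calculation.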
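Statistical inferences - CI's and tests for the $\beta$'s

2. Standard Errors for Regression Coefficients

(a) Regression coefficient values, $\hat\beta_0$ and $\hat\beta_1$, are point estimates of the true intercept and slope, $\beta_0$ and $\beta_1$ respectively.

(b) To develop interval estimates (confidence intervals) for $\beta_0$ and $\beta_1$, we need to make assumptions about the errors in the regression model. In particular, we assume $\epsilon_1, \epsilon_2, \ldots, \epsilon_n$ are i.i.d. $N(0, \sigma^2)$, in which case:

$$\hat\beta_1 \sim N\left(\beta_1, \frac{\sigma^2}{SS_{xx}}\right)$$

(c) The standard deviation of $\hat\beta_1$ is $\sqrt{\sigma^2 / SS_{xx}}$.

(d) The value of $\sigma^2$ is unknown, so the estimator MSE is used in its place to produce the standard error of the estimate $\hat\beta_1$, as

$$SE_{\hat\beta_1} = \sqrt{MSE / SS_{xx}}$$

(e) The standard error for estimate $\hat\beta_0$ is given as:

$$SE_{\hat\beta_0} = \sqrt{MSE\left(\frac{1}{n} + \frac{\bar x^2}{SS_{xx}}\right)}$$

(f) Standard errors for regression coefficients in the above example are estimated below.

$SS_{xx} = .009275$ and $MSE = 107.8$

$$SE_{\hat\beta_1} = \sqrt{MSE / SS_{xx}} = \sqrt{107.8 / .009275} = 107.81$$

$$SE_{\hat\beta_0} = \sqrt{MSE\left(\frac{1}{n} + \frac{\bar x^2}{SS_{xx}}\right)} = \sqrt{107.8\left(\frac{1}{4} + \frac{(.0875)^2}{.009275}\right)} = 10.77$$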
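As a cross-check on the two standard error formulas, the sketch below (an illustration, not taken from the notes) computes them directly and compares them with the `Std. Error` column that `summary` reports for an `lm` fit. The names `SEb0`, `SEb1` and `se_lm` are illustrative.

```r
# Ozone dose (ppm) and soybean yield (gm/plant) from the worked example
x <- c(.02, .07, .11, .15)
y <- c(242, 237, 231, 201)
n <- length(y)

SXX <- sum((x - mean(x))^2)
fit <- lm(y ~ x)
MSE <- sum(resid(fit)^2) / (n - 2)   # estimate of the error variance

# Standard errors from the formulas in items (d) and (e)
SEb1 <- sqrt(MSE / SXX)
SEb0 <- sqrt(MSE * (1 / n + mean(x)^2 / SXX))

# Standard errors as reported by summary(), in the order (Intercept), x
se_lm <- summary(fit)$coefficients[, "Std. Error"]

print(c(SEb0, SEb1))
print(se_lm)
```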
3. Confidence Intervals for Regression Coefficients

(a) Confidence intervals are constructed using the standard errors as follows:

$$\hat\beta_i \pm t_{\alpha/2,\, n-2}\, SE_{\hat\beta_i}$$

(b) In the example, 95% confidence intervals for $\beta_1$ and $\beta_0$ are computed as follows.

$t_{\alpha/2,\, n-2} = t_{.025,\,2} = 4.303$

For the slope, $\beta_1$: $-293.53 \pm 4.303(107.81)$, giving $(-757.4, 170.3)$

For the intercept, $\beta_0$: $253.43 \pm 4.303(10.77)$, giving $(207.1, 299.8)$

95% confidence intervals in R

    MSE=SSE/(n-2)
    t=qt(.975,n-2)             # upper .025th percentile of t with n-2 df
    t
    # 95% confidence interval for beta_1
    SEb1=sqrt(MSE/SXX)         # standard error of beta_1
    c(b1-t*SEb1,b1+t*SEb1)
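The same intervals can be obtained from a fitted model object. The following sketch, which is an addition and not part of the notes, assumes R's built-in `confint` function for `lm` fits and compares its result with the interval built by hand from the formula in (a); `tcrit`, `ci` and `ci_hand` are illustrative names.

```r
# Ozone dose (ppm) and soybean yield (gm/plant) from the worked example
x <- c(.02, .07, .11, .15)
y <- c(242, 237, 231, 201)
n <- length(y)

fit <- lm(y ~ x)
ci  <- confint(fit, level = 0.95)   # rows: (Intercept), x; columns: lower, upper
print(ci)

# Interval for the slope built by hand: estimate +/- t * standard error
SXX   <- sum((x - mean(x))^2)
b1    <- sum((x - mean(x)) * (y - mean(y))) / SXX
MSE   <- sum(resid(fit)^2) / (n - 2)
tcrit <- qt(0.975, n - 2)
SEb1  <- sqrt(MSE / SXX)
ci_hand <- c(b1 - tcrit * SEb1, b1 + tcrit * SEb1)
print(ci_hand)
```

The slope interval contains zero, which is consistent with only the intercept being flagged as significant in the summary output.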
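Why does the confidence interval have the correct coverage probability?

Consider the example of the interval for $\beta_1$. We need the following facts:

(a) $\hat\beta_1$ has a normal distribution with mean $\beta_1$ and unknown variance $\sigma^2 / SXX$. A consequence is that

$$Z = \frac{\hat\beta_1 - \beta_1}{\sigma / \sqrt{SXX}} \sim N(0, 1)$$

(Easy results to prove.)

(b) $W = \dfrac{(n-2)MSE}{\sigma^2} \sim \chi^2_{n-2}$, a chi-squared distribution with $n-2$ degrees of freedom. (A bit harder to prove.)

(c) $\hat\beta_1$ and SSE are independent, which implies that $Z = \dfrac{\hat\beta_1 - \beta_1}{\sigma / \sqrt{SXX}}$ and $W = \dfrac{(n-2)MSE}{\sigma^2}$ are independent. (Hard to prove. Details involve considerable matrix algebra, and are contained in appendix C3 of Montgomery et al.)

(d) Definition: If $Z$ is standard normal, independent of $W$ which is $\chi^2_\nu$, then $t = \dfrac{Z}{\sqrt{W/\nu}}$ is defined to have a $t$ distribution with $\nu$ degrees of freedom.

(e) Applying (d) with $\nu = n-2$, the unknown $\sigma$ cancels, so that $\dfrac{Z}{\sqrt{W/(n-2)}} = \dfrac{\hat\beta_1 - \beta_1}{\sqrt{MSE/SXX}}$ has a $t$ distribution with $n-2$ degrees of freedom. Then see general notes on constructing confidence intervals.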
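The coverage claim can also be illustrated by simulation. The sketch below is an addition, not part of the notes: it keeps the four ozone dose levels fixed, generates responses from a model with assumed "true" values `beta0 = 250`, `beta1 = -300` and `sigma = 10` (all chosen purely for illustration), and records how often the 95% interval for the slope contains `beta1`.

```r
set.seed(1)

x <- c(.02, .07, .11, .15)   # dose levels from the worked example
n <- length(x)

# Assumed "true" parameter values, chosen only for this illustration
beta0 <- 250
beta1 <- -300
sigma <- 10

SXX   <- sum((x - mean(x))^2)
tcrit <- qt(0.975, n - 2)
nsim  <- 5000
covered <- logical(nsim)

for (s in seq_len(nsim)) {
  # Simulate responses with i.i.d. N(0, sigma^2) errors
  y  <- beta0 + beta1 * x + rnorm(n, mean = 0, sd = sigma)
  b1 <- sum((x - mean(x)) * (y - mean(y))) / SXX
  b0 <- mean(y) - b1 * mean(x)
  MSE  <- sum((y - b0 - b1 * x)^2) / (n - 2)
  half <- tcrit * sqrt(MSE / SXX)          # half-width of the interval
  covered[s] <- (b1 - half <= beta1) && (beta1 <= b1 + half)
}

coverage <- mean(covered)   # proportion of intervals containing beta1
print(coverage)
```

Under the normal-error assumption the proportion should be close to 0.95, up to simulation noise, even with only two degrees of freedom; that is what facts (a) through (d) guarantee.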
4. The correlation between $X$ and $Y$ is estimated by:

$$r = \frac{\sum_{i=1}^{n} (y_i - \bar y)(x_i - \bar x)}{\sqrt{\sum_{i=1}^{n} (x_i - \bar x)^2 \sum_{i=1}^{n} (y_i - \bar y)^2}}$$

An alternative expression is given by

$$r = \hat\beta_1 \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar x)^2}{\sum_{i=1}^{n} (y_i - \bar y)^2}}$$

or

$$r = \hat\beta_1 \sqrt{\frac{SS_{xx}}{SS_{yy}}}$$

where $SS_{xx} = \sum_{i=1}^{n} (x_i - \bar x)^2$ and $SS_{yy} = \sum_{i=1}^{n} (y_i - \bar y)^2$ are the sums of squares of the X's and Y's, respectively. Note that $SS_{yy} = SST$, the total sum of squares. Note also that $\sqrt{SS_{xx}/SS_{yy}} = s_x / s_y$, the ratio of the standard deviations of the X's and the Y's.

The correlation coefficient lies in the interval $[-1, +1]$. If the relationship between $Y$ and $X$ is perfectly linear and increasing, the correlation will be $+1$. If the relationship is perfectly linear and decreasing, the correlation will be $-1$. If there is no linear relationship between $X$ and $Y$, the correlation is 0.

In the example,

$$r = \hat\beta_1 \sqrt{\frac{SS_{xx}}{SS_{yy}}} = -293.531 \sqrt{\frac{.009275}{1014.75}} = -.887$$
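The equivalence of the three expressions for $r$ is easy to confirm numerically. The following sketch is an illustration rather than part of the notes; it evaluates each form on the example data and compares the result with R's built-in `cor`. The names `r_def`, `r_slope` and `r_sd` are illustrative.

```r
# Ozone dose (ppm) and soybean yield (gm/plant) from the worked example
x <- c(.02, .07, .11, .15)
y <- c(242, 237, 231, 201)

SXX <- sum((x - mean(x))^2)
SYY <- sum((y - mean(y))^2)
SXY <- sum((x - mean(x)) * (y - mean(y)))
b1  <- SXY / SXX

r_def   <- SXY / sqrt(SXX * SYY)   # definition of r
r_slope <- b1 * sqrt(SXX / SYY)    # slope times sqrt(SSxx / SSyy)
r_sd    <- b1 * sd(x) / sd(y)      # slope times ratio of standard deviations

print(c(r_def, r_slope, r_sd, cor(x, y)))
```

The sign of $r$ always matches the sign of the slope, since the square-root factor is positive.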
5. Goodness of fit of the regression line is measured by the coefficient of determination, $R^2$. For simple linear regression $R^2 = r^2$.

$$R^2 = \frac{SSR}{SST}$$

The Regression Sum of Squares (SSR) is similar to the Treatment Sum of Squares in an ANOVA problem. It is given by

$$SSR = \frac{SS_{xy}^2}{SS_{xx}}$$

Alternative ways of calculating the residual sum of squares are to use the additivity relationship (SSR + SSE = SST), or to use one of the following formulas.

$$R^2 = SSR/SST$$

$$1 - R^2 = (SST - SSR)/SST = SSE/SST$$

$$SSE = (1 - R^2)\,SST$$

$R^2$ is the fraction of the total variability in $y$ accounted for by the linear regression line, and ranges between 0 and 1. $R^2 = 1.00$ indicates a perfect linear fit, while $R^2 = 0.00$ is a complete linear non-fit.

In the example:

$$SSR = \frac{SS_{xy}^2}{SS_{xx}} = \frac{(-2.7225)^2}{.009275} = 799.14$$

$$SST = SSR + SSE = 799.14 + 215.6 \approx 1014.7$$

$$R^2 = SSR/SST = 799.14/1014.7 \approx 0.788$$

Note that $R^2 = r^2$, the square of the correlation coefficient. 78.8% of the variability in $Y$ is accounted for by the regression model.
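A short sketch of these identities on the example data follows. It is an addition to the notes: it checks the additivity relationship, computes $R^2$ from the sums of squares, and compares it with both $r^2$ and the `r.squared` component that `summary` returns for an `lm` fit. The name `R2` is illustrative.

```r
# Ozone dose (ppm) and soybean yield (gm/plant) from the worked example
x <- c(.02, .07, .11, .15)
y <- c(242, 237, 231, 201)

SXX <- sum((x - mean(x))^2)
SXY <- sum((x - mean(x)) * (y - mean(y)))
SST <- sum((y - mean(y))^2)

b1 <- SXY / SXX
b0 <- mean(y) - b1 * mean(x)

SSR <- SXY^2 / SXX                 # regression sum of squares
SSE <- sum((y - b0 - b1 * x)^2)    # residual sum of squares
R2  <- SSR / SST

print(c(SSR = SSR, SSE = SSE, SST = SST))
print(c(R2 = R2, r_squared = cor(x, y)^2,
        lm_R2 = summary(lm(y ~ x))$r.squared))
```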
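6. Estimating the mean of Y

(a) The estimated mean of $Y$ when $x = x^*$ is $\hat\mu_{x^*} = \hat\beta_0 + \hat\beta_1 x^*$.

(b) Its distribution is

$$\hat\mu_{x^*} = \hat\beta_0 + \hat\beta_1 x^* \sim N\left(\beta_0 + \beta_1 x^*,\; \sigma^2\left(\frac{1}{n} + \frac{(x^* - \bar x)^2}{SS_{xx}}\right)\right)$$

(c) The standard error of $\hat\mu_{x^*}$ is

$$SE_{\hat\mu_{x^*}} = \sqrt{MSE\left(\frac{1}{n} + \frac{(x^* - \bar x)^2}{SS_{xx}}\right)}$$

(d) A confidence interval for the mean $\mu_{x^*} = \beta_0 + \beta_1 x^*$ when $x = x^*$ is given by

$$\hat\mu_{x^*} \pm t_{\alpha/2,\, n-2}\, SE_{\hat\mu_{x^*}}$$

(e) eg. A 95% confidence interval for the mean at $x = 0.10$ is computed as follows. When $x^* = 0.10$, the estimated mean is

$$\hat\mu_{.1} = 253.434 - 293.531(0.1) = 224.08$$

$$SE_{\hat\mu_{.1}} = \sqrt{107.8\left(\frac{1}{4} + \frac{(.1 - .0875)^2}{.009275}\right)} = 5.36$$

$t_{\alpha/2,\, n-2} = t_{.025,\,2} = 4.303$

margin of error $= 4.303(5.36) = 23.06$

$224.08 \pm 23.06$, giving approximately $(201, 247)$

95% confidence interval for mu at x0=.10

    x0=.10
    muhat=b0+b1*x0                                # estimate of mean at x=x0
    muhat
    SEmu=sqrt(MSE)*sqrt(1/n+(x0-mean(x))^2/SXX)   # SE of muhat
    SEmu
    c(muhat-t*SEmu, muhat+t*SEmu)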
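For comparison with the hand calculation, the sketch below (an addition, not from the notes) uses R's `predict` method for `lm` fits with `interval = "confidence"`, which returns the fitted mean together with lower and upper limits. The names `new`, `ci_mean` and `ci_hand` are illustrative.

```r
# Ozone dose (ppm) and soybean yield (gm/plant) from the worked example
x <- c(.02, .07, .11, .15)
y <- c(242, 237, 231, 201)
n <- length(y)

fit <- lm(y ~ x)
new <- data.frame(x = 0.10)

# Confidence interval for the mean response at x = 0.10
ci_mean <- predict(fit, newdata = new, interval = "confidence", level = 0.95)
print(ci_mean)   # columns: fit, lwr, upr

# The same interval from the formulas in (c) and (d)
SXX   <- sum((x - mean(x))^2)
MSE   <- sum(resid(fit)^2) / (n - 2)
tcrit <- qt(0.975, n - 2)
muhat <- unname(coef(fit)[1] + coef(fit)[2] * 0.10)
SEmu  <- sqrt(MSE * (1 / n + (0.10 - mean(x))^2 / SXX))
ci_hand <- c(muhat, muhat - tcrit * SEmu, muhat + tcrit * SEmu)
print(ci_hand)
```

The interval is narrowest at $x^* = \bar x$, where the second term under the square root vanishes, and widens as $x^*$ moves away from the mean of the observed doses.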
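7. Predicting a New Response Value

We are now interested in predicting the value of $y^*$ at a future value $x = x^*$. In making a prediction interval for a future observation on $y$ when $x = x^*$, we need to incorporate two sources of variation, which account for the fact that we are replacing the unknown mean by the estimate $\hat\beta_0 + \hat\beta_1 x^*$, and we are replacing the unknown standard deviation $\sigma$ by the estimate $\sqrt{MSE}$.

$$y^* - (\hat\beta_0 + \hat\beta_1 x^*) = \left(y^* - (\beta_0 + \beta_1 x^*)\right) - \left(\hat\beta_0 + \hat\beta_1 x^* - (\beta_0 + \beta_1 x^*)\right)$$

The first term in brackets on the right hand side of this expression has a $N(0, \sigma^2)$ distribution. From 6(b) above, the distribution of the second term is

$$N\left(0,\; \sigma^2\left(\frac{1}{n} + \frac{(x^* - \bar x)^2}{SS_{xx}}\right)\right)$$

As $y^*$ represents a future observation, the two terms are independent, and it follows that the distribution of $y^* - (\hat\beta_0 + \hat\beta_1 x^*)$ is

$$N\left(0,\; \sigma^2\left(1 + \frac{1}{n} + \frac{(x^* - \bar x)^2}{SS_{xx}}\right)\right)$$

(a) The predicted value of $y^*$ is given by $\hat y^* = \hat\beta_0 + \hat\beta_1 x^*$.

(b) The variance of the above distribution is estimated by:

$$MSE\left(1 + \frac{1}{n} + \frac{(x^* - \bar x)^2}{SS_{xx}}\right)$$

(c) and the prediction interval for $y^*$ is given by

$$\hat\beta_0 + \hat\beta_1 x^* \pm t_{\alpha/2,\, n-2} \sqrt{MSE\left(1 + \frac{1}{n} + \frac{(x^* - \bar x)^2}{SS_{xx}}\right)}$$

(d) eg. A 95% prediction interval for $y^*$ when $x^* = 0.10$ is computed as follows. For $x^* = 0.10$,

$$\hat y^* = 253.434 - 293.531(0.1) = 224.08$$

$$SE_{\hat y^*} = \sqrt{107.8\left(1 + \frac{1}{4} + \frac{(.1 - .0875)^2}{.009275}\right)} = 11.69$$

$t_{\alpha/2,\, n-2} = t_{.025,\,2} = 4.303$

margin of error $= 4.303(11.69) \approx 50.29$

$224.08 \pm 50.29$, giving $(173.79, 274.37)$

95% prediction interval for a new observation at x0=.10

    SEmu=sqrt(MSE)*sqrt(1+1/n+(x0-mean(x))^2/SXX)   # SE for predicting a new observation
    c(muhat-t*SEmu, muhat+t*SEmu)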
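The prediction interval can be obtained in the same way as the interval for the mean. The sketch below is an addition, not from the notes: it assumes `predict` with `interval = "prediction"` and compares the result with the formula in (c). The names `pi_new`, `SEpred` and `pi_hand` are illustrative.

```r
# Ozone dose (ppm) and soybean yield (gm/plant) from the worked example
x <- c(.02, .07, .11, .15)
y <- c(242, 237, 231, 201)
n <- length(y)

fit <- lm(y ~ x)
new <- data.frame(x = 0.10)

# Prediction interval for a single new observation at x = 0.10
pi_new <- predict(fit, newdata = new, interval = "prediction", level = 0.95)
print(pi_new)   # columns: fit, lwr, upr

# The same interval from the formula: note the extra "1 +" under the square root
SXX    <- sum((x - mean(x))^2)
MSE    <- sum(resid(fit)^2) / (n - 2)
tcrit  <- qt(0.975, n - 2)
yhat   <- unname(coef(fit)[1] + coef(fit)[2] * 0.10)
SEpred <- sqrt(MSE * (1 + 1 / n + (0.10 - mean(x))^2 / SXX))
pi_hand <- c(yhat, yhat - tcrit * SEpred, yhat + tcrit * SEpred)
print(pi_hand)
```

The only difference from the confidence interval for the mean is the additional $\sigma^2$ contributed by the new observation's own error, which is why the prediction interval is always the wider of the two.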
Parameter, Statistic ad Radom Samples A parameter is a umber that describes the populatio. It is a fixed umber, but i practice we do ot kow its value. A statistic is a fuctio of the sample data, i.e.,
More informationImportant Formulas. Expectation: E (X) = Σ [X P(X)] = n p q σ = n p q. P(X) = n! X1! X 2! X 3! X k! p X. Chapter 6 The Normal Distribution.
Importat Formulas Chapter 3 Data Descriptio Mea for idividual data: X = _ ΣX Mea for grouped data: X= _ Σf X m Stadard deviatio for a sample: _ s = Σ(X _ X ) or s = 1 (Σ X ) (Σ X ) ( 1) Stadard deviatio
More informationFormulas and Tables for Gerstman
Formulas ad Tables for Gerstma Measuremet ad Study Desig Biostatistics is more tha a compilatio of computatioal techiques! Measuremet scales: quatitative, ordial, categorical Iformatio quality is primary
More informationLinear Regression Models, OLS, Assumptions and Properties
Chapter 2 Liear Regressio Models, OLS, Assumptios ad Properties 2.1 The Liear Regressio Model The liear regressio model is the sigle most useful tool i the ecoometricia s kit. The multiple regressio model
More informationn but for a small sample of the population, the mean is defined as: n 2. For a lognormal distribution, the median equals the mean.
Sectio. True or False Questios (2 pts each). For a populatio the meas is defied as i= μ = i but for a small sample of the populatio, the mea is defied as: = i= i 2. For a logormal distributio, the media
More informationTMA4245 Statistics. Corrected 30 May and 4 June Norwegian University of Science and Technology Department of Mathematical Sciences.
Norwegia Uiversity of Sciece ad Techology Departmet of Mathematical Scieces Corrected 3 May ad 4 Jue Solutios TMA445 Statistics Saturday 6 May 9: 3: Problem Sow desity a The probability is.9.5 6x x dx
More informationMidtermII Review. Sta Fall Office Hours Wednesday 12:30-2:30pm Watch linear regression videos before lab on Thursday
Aoucemets MidtermII Review Sta 101 - Fall 2016 Duke Uiversity, Departmet of Statistical Sciece Office Hours Wedesday 12:30-2:30pm Watch liear regressio videos before lab o Thursday Dr. Abrahamse Slides
More informationLinear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d
Liear regressio Daiel Hsu (COMS 477) Maximum likelihood estimatio Oe of the simplest liear regressio models is the followig: (X, Y ),..., (X, Y ), (X, Y ) are iid radom pairs takig values i R d R, ad Y
More informationBayesian Methods: Introduction to Multi-parameter Models
Bayesia Methods: Itroductio to Multi-parameter Models Parameter: θ = ( θ, θ) Give Likelihood p(y θ) ad prior p(θ ), the posterior p proportioal to p(y θ) x p(θ ) Margial posterior ( θ, θ y) is Iterested
More informationTests of Hypotheses Based on a Single Sample (Devore Chapter Eight)
Tests of Hypotheses Based o a Sigle Sample Devore Chapter Eight MATH-252-01: Probability ad Statistics II Sprig 2018 Cotets 1 Hypothesis Tests illustrated with z-tests 1 1.1 Overview of Hypothesis Testig..........
More informationMA Advanced Econometrics: Properties of Least Squares Estimators
MA Advaced Ecoometrics: Properties of Least Squares Estimators Karl Whela School of Ecoomics, UCD February 5, 20 Karl Whela UCD Least Squares Estimators February 5, 20 / 5 Part I Least Squares: Some Fiite-Sample
More informationDirection: This test is worth 250 points. You are required to complete this test within 50 minutes.
Term Test October 3, 003 Name Math 56 Studet Number Directio: This test is worth 50 poits. You are required to complete this test withi 50 miutes. I order to receive full credit, aswer each problem completely
More informationMATH/STAT 352: Lecture 15
MATH/STAT 352: Lecture 15 Sectios 5.2 ad 5.3. Large sample CI for a proportio ad small sample CI for a mea. 1 5.2: Cofidece Iterval for a Proportio Estimatig proportio of successes i a biomial experimet
More informationGoodness-of-Fit Tests and Categorical Data Analysis (Devore Chapter Fourteen)
Goodess-of-Fit Tests ad Categorical Data Aalysis (Devore Chapter Fourtee) MATH-252-01: Probability ad Statistics II Sprig 2019 Cotets 1 Chi-Squared Tests with Kow Probabilities 1 1.1 Chi-Squared Testig................
More informationREGRESSION MODELS ANOVA
REGRESSION MODELS ANOVA 141 Cotiuous Outcome? NO RECAP: Logistic regressio ad other methods YES Liear Regressio Examie mai effects cosiderig predictors of iterest, ad cofouders Test effect modificatio
More informationExam II Review. CEE 3710 November 15, /16/2017. EXAM II Friday, November 17, in class. Open book and open notes.
Exam II Review CEE 3710 November 15, 017 EXAM II Friday, November 17, i class. Ope book ad ope otes. Focus o material covered i Homeworks #5 #8, Note Packets #10 19 1 Exam II Topics **Will emphasize material
More informationII. Descriptive Statistics D. Linear Correlation and Regression. 1. Linear Correlation
II. Descriptive Statistics D. Liear Correlatio ad Regressio I this sectio Liear Correlatio Cause ad Effect Liear Regressio 1. Liear Correlatio Quatifyig Liear Correlatio The Pearso product-momet correlatio
More information