Lecture notes on Regression & SAS example demonstration
Regression & Correlation (p. 215)

When two variables are measured on a single experimental unit, the resulting data are called bivariate data. You can describe each variable individually, and you can also explore the relationship between the two variables.

Simple Linear Regression & Correlation (p. 214)

For quantitative variables one can employ methods of regression analysis. Regression analysis is an area of statistics concerned with finding a model that describes the relationship that may exist between variables, and with determining the validity of such a relationship.

Examples:
- Do housing prices vary according to distance from a major freeway?
- Does respiration rate vary with altitude?
- Is snowfall related to elevation, and if so, what kind of relationship is there between these two variables?

Speaking of snow, let's consider wind chill.

Example 10.1 (p. 214)

Suppose we are interested in determining the wind chill temperature. For those of us from regions where the winters are extremely cold (like North Dakota), we know that this temperature depends on variables such as the wind velocity (speed and direction), the absolute temperature, relative humidity, etc.

Dependent (response) variable: wind chill temperature
Independent (regressor/predictor) variables: temperature, wind velocity, relative humidity

Is the wind chill temperature important? California: ___  Minneapolis: ___  Wind chill temp: ___  What do you say to that? Pretty cold if you ask me!
Regression analysis allows us to represent the relationship between the variables: to examine how the variable of interest (wind chill), often called the dependent or response variable, is affected by one or more control or independent variables (wind speed, actual temperature, relative humidity). It provides us with a simplified view of the relationship between variables, a way of fitting a model to our data, and a means of evaluating the importance of the variables included in the model and the correctness of the model. Correlation analysis will be used as a measure of the strength of the given relationship.

Note the following concepts. Quantitative variables may be classified according to type:
- Response variable: a variable whose changes are of interest to an experimenter.
- Explanatory variable: a variable that explains or causes changes in a response variable.

NOTE: We will generally denote the explanatory variable by x and the response variable by y.

To study the relationship between variables, one can use the following as a guide:
- Start by preparing a graph (scatterplot).
- Examine the graph for an overall pattern and for deviations from that pattern (check for outliers, etc.).
- Add numerical descriptive measures for additional information and support.

Scatterplots
- Plot the explanatory (independent) variable on the horizontal axis and the response variable on the vertical axis.
- Look for pattern: the form, direction, and strength of the relationship.

Note the following:
Association
- Positive association: large values of one variable correspond to large values of the other.
- Negative association: large values of one variable correspond to small values of the other.

EXAMPLE 10.3 (p. 215): Physicians have used the so-called diving reflex to reduce abnormally rapid heartbeats in humans by submerging the patient's face in cold water. (The reflex, triggered by cold water temperatures, is an involuntary neural response that shuts off circulation to the skin, muscles, and internal organs, and diverts extra oxygen-carrying blood to the heart, lungs, and brain.) A research physician conducted an experiment to investigate the effects of various cold water temperatures on the pulse rates of 10 children, with the following results: (See Lecture Notes)

Scatterplot of Diving Reflex: the data look reasonably linear, with redpr (reduction in pulse rate) decreasing as temp increases.

Correlation (p. 220)

If two variables are related in such a way that the value of one is indicative of the value of the other, we say the variables are correlated. The correlation coefficient ρ (estimated from the sample by r) is a measure of the strength of the linear relationship between two variables. See the formulas on this page.

SOME NOTES (p. 221)
- The closer r is to ±1, the stronger the linear relationship; the closer r is to 0, the weaker the linear relationship.
- If r = ±1, the relationship is perfectly linear (all the points lie exactly on the line).
- r > 0: as x increases, y increases (positive association).
- r < 0: as x increases, y decreases (negative association).
- r = 0: no linear association.
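The notes reference the textbook formulas for r without reproducing them. As an illustration only, here is a minimal Python sketch of the sample correlation coefficient r = Sxy / sqrt(Sxx * Syy). The data below are made-up numbers in the spirit of the diving reflex example; they are NOT the course data.

```python
import math

def pearson_r(x, y):
    """Sample correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up data (NOT the course data): redpr tends to fall as temp rises.
temp = [68, 65, 70, 62, 60, 55, 58, 65, 69, 63]
redpr = [2, 5, 1, 10, 9, 13, 10, 3, 4, 6]

r = pearson_r(temp, redpr)
print(round(r, 4))  # strongly negative, consistent with the scatterplot
```

The function is symmetric in its arguments and returns 1 when a variable is correlated with itself, matching the correlation-matrix properties the notes describe.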
Your Task: read the general guidelines.

PROC CORR (p. 223)

PROC CORR produces a correlation matrix which lists the Pearson correlation coefficients between all pairs of included variables. It also produces descriptive statistics and the p-value for testing the population correlation coefficient ρ = 0 for each pair of variables.

GENERAL FORM:

proc corr data = datasetname options;
  by variables;
  var variables;
  with variables;
  partial variables;

See Lecture Notes for options.

SAS (p. 223):

proc corr;
  var list-of-variables;

NOTE: If you do not specify a list of variables, SAS will report the correlation between all pairs of variables.

EXAMPLE (p. 224): Refer to the previous example on the diving reflex. Use SAS to find the correlation between reduction in pulse rate and cold water temperature. We write the following SAS code:

options nocenter nodate ps=55 ls=70 nonumber;

/* Set up temporary SAS dataset named diving */
data diving;
  input temp redpr;
datalines;
(data lines omitted in this transcription)
;
/* Use proc corr to obtain the correlation
   noprob   suppress printing of the p-value for testing rho = 0
   nosimple suppress printing of descriptive statistics */
proc corr noprob nosimple;
  var temp redpr;
run;
quit;

EXAMPLE (p. 224) output: a 2 x 2 correlation matrix for temp and redpr (numeric values omitted in this transcription).

NOTE: The correlation matrix is symmetric, with 1's along the main diagonal and the correlation in the off-diagonal positions.

NOTE: Corr(X,X) = 1 (e.g., Corr(temp,temp) = 1), and Corr(X,Y) = Corr(Y,X).

Value & interpretation: r = ___ indicates a strong inverse linear relationship between reduction in pulse rate and cold water temperature.

SIMPLE LINEAR REGRESSION

GOAL: Find the equation of the line that best describes the linear relationship between the dependent variable and a single independent variable.
- Simple: a single independent variable.
- Linear: the equation of a line, linear in the parameters.

Deterministic model: y = β0 + β1x
- Requires that all points lie exactly on the line.
- A perfect linear relationship.
Probabilistic model: y = β0 + β1x + ε
- Does NOT require that all points lie exactly on the line.
- Allows for some error/deviation from the line.

For a particular value of x:
vertical distance = (observed value of y) − (predicted value of y obtained from the estimated regression equation)

Method of Least Squares

β0 and β1 are unknown parameters and need to be estimated. We want to estimate them so that the errors are minimized. The random errors ε are assumed independent with ε ~ N(0, σ²).

We want to estimate the slope and y-intercept in such a way that

  min SSE = min Σ_{i=1}^{n} ε_i² = min Σ_{i=1}^{n} (y_i − ŷ_i)²

The least squares estimates are

  b = β̂1 = Sxy / Sxx        (estimate of the slope)
  a = β̂0 = ȳ − β̂1 x̄        (estimate of the y-intercept)

and the estimated (least squares) regression equation is

  ŷ = β̂0 + β̂1 x = a + bx
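The least squares formulas above translate directly into code. This is a minimal Python sketch using the same made-up numbers as before, not the course data:

```python
def least_squares(x, y):
    """Least squares estimates: b1 = Sxy/Sxx, b0 = ybar - b1*xbar."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sxy / sxx          # estimated slope
    b0 = ybar - b1 * xbar   # estimated y-intercept
    return b0, b1

# Made-up data (NOT the course data):
temp = [68, 65, 70, 62, 60, 55, 58, 65, 69, 63]
redpr = [2, 5, 1, 10, 9, 13, 10, 3, 4, 6]

b0, b1 = least_squares(temp, redpr)
# Estimated regression equation: yhat = b0 + b1 * x
yhat_61 = b0 + b1 * 61      # prediction at x = 61, inside the data range
```

A defining property of the least squares fit is that its residuals sum to zero and that perturbing either coefficient can only increase the SSE.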
PROC REG in SAS (p. 230)

GENERAL FORMAT:

proc reg data = dataset options;
  by variables;
  model dependent-variable = independent-variables / options;
  plot yvariable*xvariable symbol / options;
  output out = newdataset keywords = names;

**See Lecture Notes for options.

EXAMPLE (p. 232): Refer to the previous example on the diving reflex. Use SAS to find the estimated regression equation relating reduction in pulse rate to cold water temperature. We add the following SAS code to our existing code, just before the run statement:

proc reg;
  model redpr = temp;

REMEMBER: model dependent = independent;

The REG Procedure output (Model: MODEL1, Dependent Variable: redpr) shows the Analysis of Variance table (Model, Error, and Corrected Total rows, each with DF, Sum of Squares, Mean Square, F Value, and Pr > F), followed by Root MSE, R-Square, Dependent Mean, Adj R-Sq, and Coeff Var, and the Parameter Estimates table (Intercept and temp, each with Estimate, Standard Error, t Value, and Pr > |t|). Both p-values are < .0001. (Numeric values omitted in this transcription.)

  ŷ = β̂0 + β̂1 x = ___ + ___ x

Suppose x = 61; then ŷ = ___ + ___(61).
Suppose x = 150. Would you use this equation? NO.
Suppose x = 34. Would you use this equation? NO.

THE LESSON: BE CAREFUL! This equation is NOT universally valid; do not use it outside the range of the x values in the data.

Evaluating the Regression Equation (p. 236)

Once we have the regression equation, we need to evaluate its effectiveness using:
- the correlation,
- the coefficient of determination,
- a test of the slope, and
- validation of the assumptions.

Coefficient of Determination, R² (p. 236)

R² represents the proportion of variability in the dependent variable, y, that can be accounted for by the variability in the independent variable, x. Equivalently, it is the reduction in SSE obtained by using the regression equation to predict y as opposed to just using the sample mean.

0 ≤ R² ≤ 1; the closer R² gets to 1, the better the fit.

In SLR, R² = (correlation coefficient)²:

  R² = (regression sum of squares) / (total sum of squares)
     = (model sum of squares) / (total sum of squares)
     = SSR / TSS = SSM / TSS

(The REG Procedure output for the diving reflex example is repeated here; numeric values omitted in this transcription.)
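The identity R² = r² in simple linear regression is easy to check numerically. A Python sketch, again with made-up data rather than the course data:

```python
import math

def slr_r2(x, y):
    """R^2 = 1 - SSE/TSS for the least squares line."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    tss = sum((yi - ybar) ** 2 for yi in y)
    return 1 - sse / tss

def pearson_r(x, y):
    """Sample correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up data (NOT the course data):
temp = [68, 65, 70, 62, 60, 55, 58, 65, 69, 63]
redpr = [2, 5, 1, 10, 9, 13, 10, 3, 4, 6]

r2 = slr_r2(temp, redpr)  # equals pearson_r(temp, redpr) ** 2
```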
R² = 0.8861: 88.61% of the variability in reduction in pulse rate can be accounted for by the variability in cold water temperature. OR: one gets an 88.61% reduction in SSE by using the model to predict the dependent variable instead of just using the sample mean to predict the dependent variable.

NOTE: This means that approximately 11.39% of the sample variability in reduction in pulse rate cannot be accounted for by the current model.

CI & Tests of Hypothesis

What if the slope were 0? You would have a horizontal line; thus knowing x would not help predict y, so our regression equation would not be useful! We can perform a test of hypothesis to determine whether the slope is 0.

EXAMPLE (p. 237): Refer to the diving reflex example. Test whether the slope is significantly different from 0. This is the usual t-test.

EXAMPLE Solution:
1. H0: β1 = 0 vs. Ha: β1 ≠ 0

(The REG Procedure output is repeated here; the relevant entry is the t-test for temp, with p-value < .0001. Numeric values omitted in this transcription.)
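The t statistic that SAS reports for the slope is β̂1 divided by its standard error, s/√Sxx, where s² = SSE/(n − 2). A minimal Python sketch of just the test statistic, on the same made-up data (the p-value itself needs the t distribution, which the standard library does not provide):

```python
import math

def slope_t_stat(x, y):
    """t = b1 / (s / sqrt(Sxx)), where s^2 = SSE / (n - 2)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))          # root MSE
    return b1 / (s / math.sqrt(sxx))      # compare |t| to t_{alpha/2, n-2}

# Made-up data (NOT the course data):
temp = [68, 65, 70, 62, 60, 55, 58, 65, 69, 63]
redpr = [2, 5, 1, 10, 9, 13, 10, 3, 4, 6]

t = slope_t_stat(temp, redpr)
```

For this strongly negative sample correlation, |t| is far beyond any usual critical value, so H0: β1 = 0 would be rejected.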
EXAMPLE Solution (continued):
3. TS: p-value < .0001.
4. RR: Reject H0 if p-value < α = 0.05.
5. Since the p-value < .0001 < 0.05, reject H0 and conclude that the slope is significantly different from 0.

Confidence Intervals

A confidence interval for the slope is

  β̂1 ± t_{α/2, n−2} √(s² / Sxx)

(point estimate ± t-distribution percentile × standard deviation of the point estimate).

Soft Drink Example (Handout)

A soft drink vendor, set up near a beach for the summer (clearly summer has not yet arrived in Riverside), was interested in examining the relationship between sales of soft drinks, y (in gallons per day), and the maximum temperature of the day, x. See the Handout for the data.

Write a SAS program to read in and print out the data:

options ls=78 nocenter nodate ps=55 nonumber;

/* Create temporary SAS dataset and enter data */
data e1q1;
  input x y;

/* Add titles */
title1 'Statistics 157 Extra SLR Example';
title2 'Winter 2008';
title3 'Linda M. Penas';
title4 'Question 1';

datalines;
(data lines omitted in this transcription)
;

/* Print the data as a check */
proc print;
run;
Correlation Coefficient for the Example

Find and interpret the correlation between sales of soft drinks and the maximum temperature of the day. Add the following lines of code:

/* Use proc corr to generate correlation information
   nosimple suppress printing of descriptive statistics
   noprob   suppress printing of the p-value for testing rho = 0 */
proc corr nosimple noprob;
  var x y;
run;

Correlation Output: The CORR Procedure, 2 Variables: x y, Pearson Correlation Coefficients, N = 12 (matrix values omitted in this transcription).

r = ___: a moderate positive linear relationship between max temperature and soft drink sales.

Regression

Find the estimated regression equation ŷ = β̂0 + β̂1x.

/* Use proc reg to generate regression information
   model dependent = independent */
proc reg;
  model y = x;
run;

Regression Output: the Parameter Estimates table for Intercept and x (DF, Estimate, Standard Error, t Value, Pr > |t|; numeric values omitted), giving

  ŷ = ___ + ___ x

Coefficient of Determination

Find and interpret the coefficient of determination (Root MSE, R-Square, Dependent Mean, and Adj R-Sq values omitted in this transcription).

R² = ___: only a small percentage of the variability in sales can be accounted for by the variability in maximum temperature. A bad model!

Intro to Residual Analysis (p. 243)

For each x_i: residual = e_i = observed error = y_i − ŷ_i, i = 1, 2, ..., n, where
  y_i = observed value (in the data)
  ŷ_i = corresponding predicted or fitted value (calculated from the equation).
For a given value of x:
residual = difference between what we observe in the data and what is predicted by the regression equation
         = amount the regression equation has not been able to explain
         = observed error, if the model is correct

We can examine the residuals through the use of various plots. Abnormalities would be indicated if:
- the plot shows a fan shape (indicates violation of the common variance assumption);
- the plot shows a definite linear trend (indicates the need for a linear term in the model);
- the plot shows a quadratic shape (indicates the need for quadratic or cross-product terms in the model).

NOTE: It is often easier to examine the standardized or studentized residuals. We can interpret them similarly to z-scores:
- 2 < |std residual| < 3: suspect outlier
- |std residual| > 3: extreme outlier

(An outlier doesn't seem to fit with the rest of the data; it seems out of place.)

(Two residual plots appear in the notes: one with a quadratic pattern, indicating a quadratic term is needed, and one that looks random. They are followed by a table of observations with columns Obs, x, y, Fit, SE Fit, Residual, and St Resid, to be examined for suspect or extreme outliers; numeric values omitted in this transcription.)
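The flagging rule above can be sketched in Python. This uses the internally studentized residual e_i / (s√(1 − h_i)) with leverage h_i = 1/n + (x_i − x̄)²/Sxx, which is, as far as these notes describe it, what SAS's Student residual computes. The data are made up so that one point deviates clearly from an otherwise linear pattern; they are NOT the course data.

```python
import math

def studentized_residuals(x, y):
    """Internally studentized residuals: e_i / (s * sqrt(1 - h_i))."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    s = math.sqrt(sum(e * e for e in resid) / (n - 2))   # root MSE
    h = [1 / n + (xi - xbar) ** 2 / sxx for xi in x]     # leverages
    return [e / (s * math.sqrt(1 - hi)) for e, hi in zip(resid, h)]

def flag(sr):
    """Apply the rule of thumb from the notes."""
    if abs(sr) > 3:
        return "extreme outlier"
    if abs(sr) > 2:
        return "suspect outlier"
    return "ok"

# Made-up near-linear data with one clear deviation at x = 20:
x = [10, 12, 14, 16, 18, 20, 22, 24, 26, 28]
y = [7.1, 8.0, 8.9, 10.1, 11.0, 15.5, 13.0, 14.1, 15.0, 16.0]

flags = [flag(sr) for sr in studentized_residuals(x, y)]
```

Here the point (20, 15.5) is flagged as a suspect outlier while the rest pass, mirroring how one scans the St Resid column of the SAS output.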
(The residual table continues, with columns Obs, x, y, Fit, SE Fit, Residual, and St Resid; numeric values omitted in this transcription.)

CONCLUSION: The plot shows no apparent pattern. Since every |std res| is between 0 and 2, there are no suspect or extreme outliers either. (A fanning-out pattern, by contrast, would indicate non-constant variance.)

To get residuals and residual plots in SAS (EXAMPLE 10.26: Diving Reflex Example; the same code is used for the Soft Drink Example):

proc reg;
  /* P = predicted values
     R = residuals
     Student = studentized residuals (act like z-scores)
     output out = datasetname */
  model y = x / P R;
  output out = a P = pred R = resid Student = stdres;
run;

Residual Plot: generate a residual plot of the studentized residuals versus the predicted values.

proc plot vpercent = 70 hpercent = 70;
  plot stdres*pred;
Residual Info: The REG Procedure (Model: MODEL1, Dependent Variable: y) Output Statistics, listing for each observation the Dep Var y, Predicted Value, Std Error Mean Predict, Residual, Std Error Residual, and Student Residual (numeric values omitted in this transcription).

Residual Plot: generate a residual plot of the studentized residuals versus the predicted values.

proc plot vpercent = 70 hpercent = 70;
  plot stdres*pred;

PART 2: Generate new information with the outlier (83, 10.2) removed.

data e1q2;
  input x y;
  title4 'Question 2';
datalines;
(data lines omitted in this transcription)
;
proc print;

proc corr nosimple noprob;
  var x y;

/* Make sure you use different names for your residuals so you
   do not overwrite the old ones */
proc reg;
  model y = x / P R;
  output out = b P = pred1 R = resid1 Student = stdres1;

proc plot vpercent = 70 hpercent = 70;
  plot stdres1*pred1;
run;

New Output: the Output Statistics table (Obs, Dep Var y, Predicted Value, Std Error Mean Predict, Residual, Std Error Residual, Student Residual; numeric values omitted in this transcription).
One should continue to remove the potential outliers and generate new models, residuals, etc., until reaching the final information on pages 6-7.

Normality of Residuals (add-on)

Normality test:

proc univariate normal;
  ods select TestsForNormality;
  var stdres;

Example output: The UNIVARIATE Procedure, Variable: stdres (Studentized Residual), Tests for Normality, listing the Shapiro-Wilk (W), Kolmogorov-Smirnov (D), Cramer-von Mises (W-Sq), and Anderson-Darling (A-Sq) statistics with their p-values (numeric values omitted in this transcription).

Normality Test:
1. H0: the errors are normally distributed.
2. Ha: the errors are not normally distributed.
3. TS: p-value = ___.
4. RR: Reject H0 if p-value < α = 0.05.
5. Since the p-value is not < α = 0.05, do not reject H0; it is OK to assume the errors are normally distributed.

Some Relationships

Since b = β̂1 = Sxy/Sxx with Sxx > 0, the slope and the correlation take their sign from Sxy:

  Sxy > 0 implies β̂1 > 0 and r > 0
  Sxy < 0 implies β̂1 < 0 and r < 0
  Sxy = 0 implies β̂1 = 0 and r = 0

SOME MORE INFO

Total sum of squares = TSS = Syy = SSE (sum of squares of the error) + SSR (sum of squares due to the regression model).
- TSS is constant for a given set of data.
- SSE and SSR vary depending on the model: change the model and SSE and SSR may/will change (but their sum is always constant = TSS).
  TSS = Syy = Σ_{i=1}^{n} (y_i − ȳ)²
  SSE = Σ_{i=1}^{n} (y_i − ŷ_i)² = Syy − β̂1 Sxy
  SSR = TSS − SSE
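The decomposition TSS = SSE + SSR is easy to verify numerically. A minimal Python sketch with made-up data (not the course data):

```python
def anova_decomposition(x, y):
    """Return (tss, sse, ssr) for the least squares line; tss = sse + ssr."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    tss = syy                                              # TSS = Syy
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    ssr = sum(((b0 + b1 * xi) - ybar) ** 2 for xi in x)    # model sum of squares
    return tss, sse, ssr

# Made-up data (NOT the course data):
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

tss, sse, ssr = anova_decomposition(x, y)
```

The identity holds exactly for the least squares fit (up to floating-point error), since the residuals are orthogonal to the fitted values.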
More informationSTAT 512 MidTerm I (2/21/2013) Spring 2013 INSTRUCTIONS
STAT 512 MidTerm I (2/21/2013) Spring 2013 Name: Key INSTRUCTIONS 1. This exam is open book/open notes. All papers (but no electronic devices except for calculators) are allowed. 2. There are 5 pages in
More informationKeller: Stats for Mgmt & Econ, 7th Ed July 17, 2006
Chapter 17 Simple Linear Regression and Correlation 17.1 Regression Analysis Our problem objective is to analyze the relationship between interval variables; regression analysis is the first tool we will
More informationy n 1 ( x i x )( y y i n 1 i y 2
STP3 Brief Class Notes Instructor: Ela Jackiewicz Chapter Regression and Correlation In this chapter we will explore the relationship between two quantitative variables, X an Y. We will consider n ordered
More informationStat 302 Statistical Software and Its Applications SAS: Simple Linear Regression
1 Stat 302 Statistical Software and Its Applications SAS: Simple Linear Regression Fritz Scholz Department of Statistics, University of Washington Winter Quarter 2015 February 16, 2015 2 The Spirit of
More informationChapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression
BSTT523: Kutner et al., Chapter 1 1 Chapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression Introduction: Functional relation between
More informationInference for Regression Inference about the Regression Model and Using the Regression Line
Inference for Regression Inference about the Regression Model and Using the Regression Line PBS Chapter 10.1 and 10.2 2009 W.H. Freeman and Company Objectives (PBS Chapter 10.1 and 10.2) Inference about
More informationy response variable x 1, x 2,, x k -- a set of explanatory variables
11. Multiple Regression and Correlation y response variable x 1, x 2,, x k -- a set of explanatory variables In this chapter, all variables are assumed to be quantitative. Chapters 12-14 show how to incorporate
More informationTopic 18: Model Selection and Diagnostics
Topic 18: Model Selection and Diagnostics Variable Selection We want to choose a best model that is a subset of the available explanatory variables Two separate problems 1. How many explanatory variables
More informationRegression Models. Chapter 4
Chapter 4 Regression Models To accompany Quantitative Analysis for Management, Eleventh Edition, by Render, Stair, and Hanna Power Point slides created by Brian Peterson Introduction Regression analysis
More informationOdor attraction CRD Page 1
Odor attraction CRD Page 1 dm'log;clear;output;clear'; options ps=512 ls=99 nocenter nodate nonumber nolabel FORMCHAR=" ---- + ---+= -/\*"; ODS LISTING; *** Table 23.2 ********************************************;
More informationChapter 9. Correlation and Regression
Chapter 9 Correlation and Regression Lesson 9-1/9-2, Part 1 Correlation Registered Florida Pleasure Crafts and Watercraft Related Manatee Deaths 100 80 60 40 20 0 1991 1993 1995 1997 1999 Year Boats in
More informationPaper: ST-161. Techniques for Evidence-Based Decision Making Using SAS Ian Stockwell, The Hilltop UMBC, Baltimore, MD
Paper: ST-161 Techniques for Evidence-Based Decision Making Using SAS Ian Stockwell, The Hilltop Institute @ UMBC, Baltimore, MD ABSTRACT SAS has many tools that can be used for data analysis. From Freqs
More informationCh 2: Simple Linear Regression
Ch 2: Simple Linear Regression 1. Simple Linear Regression Model A simple regression model with a single regressor x is y = β 0 + β 1 x + ɛ, where we assume that the error ɛ is independent random component
More informationAnnouncements: You can turn in homework until 6pm, slot on wall across from 2202 Bren. Make sure you use the correct slot! (Stats 8, closest to wall)
Announcements: You can turn in homework until 6pm, slot on wall across from 2202 Bren. Make sure you use the correct slot! (Stats 8, closest to wall) We will cover Chs. 5 and 6 first, then 3 and 4. Mon,
More informationAMS 315/576 Lecture Notes. Chapter 11. Simple Linear Regression
AMS 315/576 Lecture Notes Chapter 11. Simple Linear Regression 11.1 Motivation A restaurant opening on a reservations-only basis would like to use the number of advance reservations x to predict the number
More informationChapter 6: Exploring Data: Relationships Lesson Plan
Chapter 6: Exploring Data: Relationships Lesson Plan For All Practical Purposes Displaying Relationships: Scatterplots Mathematical Literacy in Today s World, 9th ed. Making Predictions: Regression Line
More information: The model hypothesizes a relationship between the variables. The simplest probabilistic model: or.
Chapter Simple Linear Regression : comparing means across groups : presenting relationships among numeric variables. Probabilistic Model : The model hypothesizes an relationship between the variables.
More informationTopic 10 - Linear Regression
Topic 10 - Linear Regression Least squares principle Hypothesis tests/confidence intervals/prediction intervals for regression 1 Linear Regression How much should you pay for a house? Would you consider
More informationLecture 3. Experiments with a Single Factor: ANOVA Montgomery 3.1 through 3.3
Lecture 3. Experiments with a Single Factor: ANOVA Montgomery 3.1 through 3.3 Fall, 2013 Page 1 Tensile Strength Experiment Investigate the tensile strength of a new synthetic fiber. The factor is the
More information5.3 Three-Stage Nested Design Example
5.3 Three-Stage Nested Design Example A researcher designs an experiment to study the of a metal alloy. A three-stage nested design was conducted that included Two alloy chemistry compositions. Three ovens
More information7.3 Ridge Analysis of the Response Surface
7.3 Ridge Analysis of the Response Surface When analyzing a fitted response surface, the researcher may find that the stationary point is outside of the experimental design region, but the researcher wants
More informationChapter 4. Regression Models. Learning Objectives
Chapter 4 Regression Models To accompany Quantitative Analysis for Management, Eleventh Edition, by Render, Stair, and Hanna Power Point slides created by Brian Peterson Learning Objectives After completing
More information1 A Review of Correlation and Regression
1 A Review of Correlation and Regression SW, Chapter 12 Suppose we select n = 10 persons from the population of college seniors who plan to take the MCAT exam. Each takes the test, is coached, and then
More informationThe program for the following sections follows.
Homework 6 nswer sheet Page 31 The program for the following sections follows. dm'log;clear;output;clear'; *************************************************************; *** EXST734 Homework Example 1
More informationApplied Regression Modeling: A Business Approach Chapter 2: Simple Linear Regression Sections
Applied Regression Modeling: A Business Approach Chapter 2: Simple Linear Regression Sections 2.1 2.3 by Iain Pardoe 2.1 Probability model for and 2 Simple linear regression model for and....................................
More informationdm'log;clear;output;clear'; options ps=512 ls=99 nocenter nodate nonumber nolabel FORMCHAR=" = -/\<>*"; ODS LISTING;
dm'log;clear;output;clear'; options ps=512 ls=99 nocenter nodate nonumber nolabel FORMCHAR=" ---- + ---+= -/\*"; ODS LISTING; *** Table 23.2 ********************************************; *** Moore, David
More informationMultiple Regression. Inference for Multiple Regression and A Case Study. IPS Chapters 11.1 and W.H. Freeman and Company
Multiple Regression Inference for Multiple Regression and A Case Study IPS Chapters 11.1 and 11.2 2009 W.H. Freeman and Company Objectives (IPS Chapters 11.1 and 11.2) Multiple regression Data for multiple
More informationInteractions. Interactions. Lectures 1 & 2. Linear Relationships. y = a + bx. Slope. Intercept
Interactions Lectures 1 & Regression Sometimes two variables appear related: > smoking and lung cancers > height and weight > years of education and income > engine size and gas mileage > GMAT scores and
More informationSimple Linear Regression: One Quantitative IV
Simple Linear Regression: One Quantitative IV Linear regression is frequently used to explain variation observed in a dependent variable (DV) with theoretically linked independent variables (IV). For example,
More informationInference for the Regression Coefficient
Inference for the Regression Coefficient Recall, b 0 and b 1 are the estimates of the slope β 1 and intercept β 0 of population regression line. We can shows that b 0 and b 1 are the unbiased estimates
More informationMulticollinearity Exercise
Multicollinearity Exercise Use the attached SAS output to answer the questions. [OPTIONAL: Copy the SAS program below into the SAS editor window and run it.] You do not need to submit any output, so there
More informationCan you tell the relationship between students SAT scores and their college grades?
Correlation One Challenge Can you tell the relationship between students SAT scores and their college grades? A: The higher SAT scores are, the better GPA may be. B: The higher SAT scores are, the lower
More informationBusiness Statistics. Chapter 14 Introduction to Linear Regression and Correlation Analysis QMIS 220. Dr. Mohammad Zainal
Department of Quantitative Methods & Information Systems Business Statistics Chapter 14 Introduction to Linear Regression and Correlation Analysis QMIS 220 Dr. Mohammad Zainal Chapter Goals After completing
More informationSimple Linear Regression
9-1 l Chapter 9 l Simple Linear Regression 9.1 Simple Linear Regression 9.2 Scatter Diagram 9.3 Graphical Method for Determining Regression 9.4 Least Square Method 9.5 Correlation Coefficient and Coefficient
More informationAny of 27 linear and nonlinear models may be fit. The output parallels that of the Simple Regression procedure.
STATGRAPHICS Rev. 9/13/213 Calibration Models Summary... 1 Data Input... 3 Analysis Summary... 5 Analysis Options... 7 Plot of Fitted Model... 9 Predicted Values... 1 Confidence Intervals... 11 Observed
More informationChapter 11 : State SAT scores for 1982 Data Listing
EXST3201 Chapter 12a Geaghan Fall 2005: Page 1 Chapter 12 : Variable selection An example: State SAT scores In 1982 there was concern for scores of the Scholastic Aptitude Test (SAT) scores that varied
More informationApplied Statistics and Econometrics
Applied Statistics and Econometrics Lecture 6 Saul Lach September 2017 Saul Lach () Applied Statistics and Econometrics September 2017 1 / 53 Outline of Lecture 6 1 Omitted variable bias (SW 6.1) 2 Multiple
More information