Introduction to Regression
- Regina Willis
- 5 years ago
Transcription
1 Introduction to Regression. Using multiple linear regression: derived variables; many alternative models; which model to choose? Model criticism. Modelling: objective; model details; data and residuals; assumptions.
2 Data Like This. Values of coefficients; sampling distributions; standard errors; 95% confidence intervals; 95% prediction intervals; ANOVA etc.
3 Derived variables. General: logs; proportions and ratios. Too many (derived) variables: redundancy; many versions of the same model. Indicator variables: categorical data. Time series applications: indicator variables (e.g. seasonal effects); lagged variables; differences; logs and rate of return.
4 Gas Consumption vs Temp. Weekly gas consumption (in 1000 cubic feet) and the average outside temperature (in degrees Celsius) at one house in south-east England for two "heating seasons", one of 26 weeks before and one of 30 weeks after cavity-wall insulation was installed. The object of the exercise was to assess the effect of the insulation on gas consumption. The house thermostat was set at 20 C throughout. [Figures: fitted line plots of Gas vs Temperature. Period 1: R-Sq 94.4%, R-Sq(adj) 94.1%. Period 2: R-Sq 81.3%, R-Sq(adj) 80.6%. Comparative plot of both periods against Temperature.]
5 Objective. Nominal focus on prediction: predict future gas consumption for this house, knowing the temperature and whether or not the house is insulated. Actual interest: does insulation make a difference? At all temperatures? How much? Slope? Intercept? SEs? (See Data Like This.)
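6 Using an Indicator Variable. Two layouts for the same data: two parallel data sets (one per heating season, each with Week, Temperature, Gas), or one stacked data set with columns Insulated, Week, Temperature, Gas, where Insulated is a 0/1 indicator. [Data tables not reproduced.]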
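The stacking step can be sketched in a few lines of Python. This is only an illustration: the readings below are made-up values, not the house data, and the function name `stack` is hypothetical.

```python
# Hypothetical readings as (week, temperature, gas); NOT the real house data.
before = [(1, -0.8, 7.2), (2, 0.4, 6.4), (3, 2.5, 6.0)]
after = [(1, -0.7, 4.8), (2, 1.0, 4.6), (3, 3.9, 3.9)]


def stack(before_rows, after_rows):
    """Combine two parallel data sets into one, adding a 0/1 Insulated column."""
    stacked = [(0, *row) for row in before_rows]   # Insulated = 0 (before)
    stacked += [(1, *row) for row in after_rows]   # Insulated = 1 (after)
    return stacked


stacked = stack(before, after)
for insulated, week, temperature, gas in stacked:
    print(insulated, week, temperature, gas)
```

Each row of the stacked set now carries its own insulation status, which is what lets a single regression use Insulated as a predictor.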
7 Simple Regression & Indicator Variable. [Figures: fitted line plots of Gas vs Insulated (R-Sq 29.8%, R-Sq(adj) 28.5%) and Temperature vs Insulated (R-Sq 2.6%, R-Sq(adj) 0.8%).] Gas vs Insulated: the fitted line joins the average Gas at Insulated = 0 to the average Gas at Insulated = 1, so the coefficient (the effect of a unit increase in Insulated) is the difference between the two averages. Temp vs Insulated: coefficient per unit increase compared with random error; design implications.
8 SLR with Indicator Variable & T-test. [Figure: fitted line plot of Gas vs Insulated; R-Sq 29.8%, R-Sq(adj) 28.5%.] Two-sample T for Gas by Insulated: Difference = μ(0) - μ(1), T-Value = 4.79, DF = 54, using pooled StDev. Regression Analysis: Gas versus Insulated: model summary (S, R-sq, R-sq(adj) 28.49%, R-sq(pred)) and coefficients table (Term, Coef, SE Coef, T-Value, P-Value for Constant and Insulated). [Remaining numerical output not reproduced.]
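The link between the two analyses can be checked numerically. The sketch below uses made-up gas values (not the house data) and a hand-written `simple_ols` helper, both hypothetical. It shows that the slope on a 0/1 indicator is the difference in group means and that its t-ratio matches the pooled two-sample t. Note the sign: the slope is mean(1) - mean(0), whereas the slide's difference is μ(0) - μ(1).

```python
import math
from statistics import mean

# Hypothetical gas readings; NOT the real house data.
gas0 = [7.2, 6.4, 6.0, 5.8, 6.9]   # Insulated = 0
gas1 = [4.8, 4.6, 3.9, 4.2, 4.0]   # Insulated = 1

x = [0] * len(gas0) + [1] * len(gas1)
y = gas0 + gas1


def simple_ols(x, y):
    """Simple linear regression: returns intercept, slope and SE of the slope."""
    xbar, ybar = mean(x), mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(r * r for r in residuals) / (len(x) - 2)   # residual variance
    return intercept, slope, math.sqrt(s2 / sxx)


intercept, slope, se_slope = simple_ols(x, y)
t_reg = slope / se_slope

# Pooled two-sample t statistic for mean(gas1) - mean(gas0).
n0, n1 = len(gas0), len(gas1)
ss0 = sum((g - mean(gas0)) ** 2 for g in gas0)
ss1 = sum((g - mean(gas1)) ** 2 for g in gas1)
sp2 = (ss0 + ss1) / (n0 + n1 - 2)                       # pooled variance
t_pooled = (mean(gas1) - mean(gas0)) / math.sqrt(sp2 * (1 / n0 + 1 / n1))

print(intercept, slope, t_reg, t_pooled)
```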
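9 Indicator Variables in Regression. Response variable: $Y$ = Gas. Predictors: $x_1$ = Temp, $x_2$ = Insulated (0/1), a binary indicator variable. Statistical model: $Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$, with $\varepsilon \sim N(0, \sigma^2)$. When $x_2 = 0$: $Y = \beta_0 + \beta_1 x_1 + \varepsilon$. When $x_2 = 1$: $Y = (\beta_0 + \beta_2) + \beta_1 x_1 + \varepsilon$. Common slope $\beta_1$; difference between intercepts $\beta_2$; no interaction.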
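10 Multiple Regression Output. Regression Analysis: Gas versus Temperature, Insulated. The regression equation has the form Gas = constant + Temperature term + Insulated term, with a table of Coef and SE Coef for Constant, Temperature and Insulated. For the Insulated coefficient, SE = 0.097, so a rough 95% CI (estimate ± 2 SE) is (-1.76, -1.37). Compare the previous mean difference, with SE 0.274. Parallel lines. [Coefficient values not reproduced.]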
11 Implementation: Categorical Variable.
12 Regression Output: Categorical Var. Regression Analysis: Gas versus Temperature, Insulated, with categorical predictor coding (1, 0). The output gives a model summary (S, R-sq), a coefficients table (Term, Coef, SE Coef, T-Value, P-Value for Constant, Temperature, Insulated) and one regression equation per level of Insulated (0 and 1), each of the form Gas = constant + slope × Temperature. [Coefficient values not reproduced.]
13 Aside: Omitted predictors. Hidden/lurking variables. Subset of data (used in exam). Uninformed by insulation status: slope positive; on average, gas consumption increases with temp! Knowing insulation status: slopes negative; on average, gas consumption decreases with temp.
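A small made-up example shows how this sign reversal can arise. The numbers below are hypothetical and chosen only so that the insulated weeks happen to be the colder ones. They are not the exam subset.

```python
from statistics import mean


def slope(x, y):
    """Least-squares slope of y on x."""
    xbar, ybar = mean(x), mean(y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical subset: insulated weeks are cold, uninsulated weeks are mild.
temp_ins, gas_ins = [0, 1, 2, 3], [4.0, 3.8, 3.6, 3.4]
temp_unins, gas_unins = [6, 7, 8, 9], [5.5, 5.2, 4.9, 4.6]

slope_ins = slope(temp_ins, gas_ins)         # negative within group
slope_unins = slope(temp_unins, gas_unins)   # negative within group
slope_pooled = slope(temp_ins + temp_unins, gas_ins + gas_unins)  # positive

print(slope_ins, slope_unins, slope_pooled)
```

Ignoring insulation status, the pooled line slopes upwards because the warmer weeks are also the uninsulated, high-consumption weeks. Within each group the slope is negative.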
14 Interaction? Refine the question: different slopes as well?
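15 Indicator Variables in Regression. Response variable: $Y$ = Gas. Predictors: $x_1$ = Temp, $x_2$ = Insulated (0/1), $x_3$ = Temp × $x_2$. Combined statistical model: $Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \varepsilon$, with $\varepsilon \sim N(0, \sigma^2)$. When $x_2 = 0$: $Y = \beta_0 + \beta_1 x_1 + \varepsilon$. When $x_2 = 1$: $Y = (\beta_0 + \beta_2) + (\beta_1 + \beta_3) x_1 + \varepsilon$. $\beta_2$: difference in intercepts; $\beta_3$: difference in slopes.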
16 New Derived Variable.
17 Modelling two regression lines. Regression Analysis: Gas versus Temperature, Insulated, Ins X Temp. The regression equation has terms for Constant, Temperature, Insulated and Ins X Temp, with a table of Coef and SE Coef. R-Sq = 92.8%, R-Sq(adj) = 92.4%. [Coefficient values and S not reproduced.] Which coefficient is most fundamental to the theory of heat loss?
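18 Alt Models of two regression lines. Nearly equivalent: two separate linear regressions of Gas vs Temp. Response variable: $Y$ = Gas. Predictors: $x_1$ = Temp, $x_2$ = Insulated (0/1). Two statistical models: when $x_2 = 0$, $Y = \beta_{0;NoIns} + \beta_{1;NoIns} x_1 + \varepsilon$, with $\varepsilon \sim N(0, \sigma^2_{NoIns})$; when $x_2 = 1$, $Y = \beta_{0;Ins} + \beta_{1;Ins} x_1 + \varepsilon$, with $\varepsilon \sim N(0, \sigma^2_{Ins})$. Exercise: compare coefficient estimates and 95% intervals for (a) one model with interaction and (b) two separate models.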
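The coefficient part of this comparison can be sketched in Python. The data are made up, and the `ols` helper (normal equations solved by Gauss-Jordan elimination) is a hypothetical stand-in for whatever regression routine is actually used. The coefficient estimates agree exactly; standard errors and intervals can differ because the two separate models each estimate their own error variance.

```python
def ols(columns, y):
    """Least squares via the normal equations X'X b = X'y (Gauss-Jordan)."""
    p, n = len(columns), len(y)
    # Augmented matrix [X'X | X'y].
    a = [[sum(columns[i][k] * columns[j][k] for k in range(n)) for j in range(p)]
         + [sum(columns[i][k] * y[k] for k in range(n))] for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(p):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * w for v, w in zip(a[r], a[col])]
    return [a[i][p] / a[i][i] for i in range(p)]


# Hypothetical data; NOT the real house data.
temp0, gas0 = [0.0, 2.0, 4.0, 6.0, 8.0], [6.9, 6.0, 5.3, 4.5, 3.6]  # not insulated
temp1, gas1 = [1.0, 3.0, 5.0, 7.0, 9.0], [4.6, 4.1, 3.4, 3.0, 2.3]  # insulated

# (a) One model with interaction: Y = b0 + b1*x1 + b2*x2 + b3*(x1*x2).
x1 = temp0 + temp1
x2 = [0.0] * len(temp0) + [1.0] * len(temp1)
x3 = [t * ins for t, ins in zip(x1, x2)]
ones = [1.0] * len(x1)
b0, b1, b2, b3 = ols([ones, x1, x2, x3], gas0 + gas1)

# (b) Two separate simple regressions.
a0, a1 = ols([[1.0] * len(temp0), temp0], gas0)
c0, c1 = ols([[1.0] * len(temp1), temp1], gas1)

print("not insulated:", (b0, b1), (a0, a1))
print("insulated:    ", (b0 + b2, b1 + b3), (c0, c1))
```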
19 Multiple indicator variables. Will also meet: redundancy; multiple formulations of the same model.
20 Housing Completions, quarterly, 1978 to 2000. [Data table by year and quarter not reproduced.]
21 Figure 1.0 Housing Completions, quarterly, 1978 to 2000. [Figure: time series plot of Completions by year and quarter.] Objective: forecast one quarter ahead.
22 Aside: Cubic/Quadratic Regression. Fitted line plot options: log, quadratic, cubic. [Figure: fitted line plot of Comps vs time with polynomial terms in time, regression line and 95% PI; R-Sq about 88%, R-Sq(adj) 87.9%.]
23 Modelling Options. Option 1: focus on the stable linear structure post 1993; assume this structure will continue; exploit the structure with an extension of indicator variables; disadvantage: smaller data set. Option 2: one model for the entire data set; note that the structure has changed and might change again; exploit weaker structure; use lagged variables; advantage: uses all the data.
24 Comps, quarterly, 1993 to 2000. Option 1: work since 1993. Target is 2001 Q1. Use Q1 data only? Or use all data? 4 parallel lines are more efficient: why, and in what sense? [Figure: time series plot of Completions by year and quarter.]
25 Q1 only. [Figure: fitted line plot of Completions vs year, Q1 data only; R-Sq 98.5%, R-Sq(adj) about 98%.] Other quarters: 4 separate lines. Later, use Time since 1978 in place of year; this changes the intercept only. Prediction: 10428 ± 2(316.5) = (9795, 11061).
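The rough interval on this slide is the point prediction plus or minus twice S. A minimal sketch of that arithmetic follows. The centre 10428 and S = 316.5 are back-calculated here from the quoted endpoints (9795, 11061), so treat them as assumptions. The function name is hypothetical.

```python
def rough_prediction_interval(prediction, s, multiplier=2.0):
    """Rough 95% prediction interval: prediction +/- multiplier * S."""
    half_width = multiplier * s
    return prediction - half_width, prediction + half_width


# Assumed values, back-calculated from the endpoints (9795, 11061).
lo, hi = rough_prediction_interval(10428, 316.5)
print(lo, hi)
```

This ignores the extra uncertainty in the estimated coefficients, which is why the slide calls for "± 2 S" only as a rough guide.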
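26 Linear in Time plus Quarterly Ind Vars. Create a set of binary variables Q1, Q2, Q3, Q4. Model: $\text{Comps} = \beta_{Q1} Q1 + \beta_{Q2} Q2 + \beta_{Q3} Q3 + \beta_{Q4} Q4 + \beta_t\,\text{Time} + \varepsilon$. [Data table with columns Year.Quarter, Time since 1978, Comps, Q1, Q2, Q3, Q4 not reproduced.]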
27 Multiple Indicator Vars: Tech Issue. Regression Analysis: Comps versus Time since 1978, Q1, Q2, Q3, Q4. Minitab reports: * Q4 is highly correlated with other X variables; * Q4 has been removed from the equation. The regression equation then has a constant and terms in Time since 1978, Q1, Q2 and Q3. [Coefficient values not reproduced.] Interpretation of the time coefficient and of all the quarter coefficients. Redundancy. Alternatives: no constant, using the indicator variables only; or, equivalently, enter "Quarter" as a categorical variable.
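28 Multiple Indicator Vars: Tech Issue. Regression Analysis: Comps versus Time since 1978, Q1, Q2, Q3, Q4. Minitab reports: * Q4 is highly correlated with other X variables; * Q4 has been removed from the equation. The fitted equation has a constant and terms in Time since 1978, Q1, Q2 and Q3. OR: Regression Analysis: Comps versus Time since 1978, Q1, Q2, Q3, Q4 with the No constant option, which gives an equation in Time since 1978, Q1, Q2, Q3 and Q4. Note: each quarter coefficient in the no-constant fit equals the constant plus the corresponding quarter coefficient in the first fit, etc. [Coefficient values and S not reproduced.]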
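The redundancy, and the mapping between the two formulations, can be illustrated as follows. The coefficient values are made up for illustration (they are not the Minitab estimates), and the function names are hypothetical.

```python
QUARTERS = (1, 2, 3, 4)


def indicators(quarter_labels):
    """Build 0/1 indicator columns Q1..Q4 from a list of quarter numbers."""
    return {q: [1 if label == q else 0 for label in quarter_labels]
            for q in QUARTERS}


labels = [1, 2, 3, 4, 1, 2, 3, 4]
ind = indicators(labels)

# Redundancy: Q1 + Q2 + Q3 + Q4 = 1 in every row, i.e. the constant column,
# so a constant and all four indicators cannot be estimated together.
row_sums = [sum(ind[q][k] for q in QUARTERS) for k in range(len(labels))]


def to_no_constant(constant, effects_q1_to_q3):
    """Convert (constant, Q1..Q3 effects with Q4 dropped) into the four
    per-quarter intercepts of the no-constant formulation."""
    return [constant + e for e in effects_q1_to_q3] + [constant]


# Hypothetical coefficients; NOT the fitted values from the slides.
intercepts = to_no_constant(100.0, [-30.0, -12.0, -8.0])
print(row_sums, intercepts)
```

Both formulations give identical fitted values and the same S; only the labelling of the coefficients changes.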
30 Categorical Variable approach. Entering Quarter as a categorical variable gives a model summary (S, R-sq), a coefficients table (Term, Coef, SE Coef for Constant, time since 1978 and Quarter) and one regression equation per quarter (Q1, Q2, Q3, Q4), each of the form Comps = constant + slope × t. Consider Q2 - Q1 at t = 0. [Values not reproduced.]
31 Derived variables and Transforms in Time Series. Lags; differences; rates of return; log scale.
32 All Comps, quarterly, 1978 to 2000. Option 2: use all data, but a different model. [Figure: time series plot of Completions by year and quarter.]
33 Auto-Regression for Time Series. Basic idea: next value like last value (Lag1). [Figure: fitted line plot of Comps vs Lag1Comp; R-Sq 76.1%, R-Sq(adj) 75.8%.]
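34 Auto-Regression for Time Series. Basic idea: next value like last value (Lag1). Auto-regression: $Y_t = \beta_0 + \beta_{lag1} Y_{t-1} + \varepsilon_t$, extended to $Y_t = \beta_0 + \beta_{lag1} Y_{t-1} + \beta_{lag4} Y_{t-4} + \varepsilon_t$. [Data table with columns Year.Quarter, Comps, Lag1Comp, Lag4Comp not reproduced.]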
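Building the lagged columns is a simple shift of the series. A minimal sketch, using a made-up quarterly series rather than the completions data; the helper name `lag` is hypothetical.

```python
def lag(series, k):
    """Shift a series down by k places (k >= 1); the first k values are missing."""
    return [None] * k + series[:-k]


# Hypothetical quarterly values; NOT the housing completions data.
comps = [10, 12, 11, 15, 11, 13, 12, 17]
lag1comp = lag(comps, 1)   # previous quarter
lag4comp = lag(comps, 4)   # same quarter last year

# Only rows with both lags available can be used in the regression.
rows = [(y, l1, l4) for y, l1, l4 in zip(comps, lag1comp, lag4comp)
        if l1 is not None and l4 is not None]
for row in rows:
    print(row)
```

Note the cost of lagging: with a lag of 4, the first four observations have no complete row and drop out of the fit.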
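35 Using two lagged variables. Regression Analysis: Comps versus Lag1Comp, Lag4Comp. The regression equation has a constant and terms in Lag1Comp and Lag4Comp. The forecast for the next quarter uses the two relevant past values of Comp (one of them 1287), and the rough 95% prediction interval is the forecast ± 2(780.7). [Coefficient values, S, the forecast and the interval endpoints are garbled in the transcript.]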
36 Using Lagged Variables. Basic idea: the current quarter is like the previous quarter and like the same quarter last year. [Figure: matrix plot of Completions, Lag1Comp, Lag4Comp.]
37 Comparison of forecasting models for Comps. Modelling option 1, parallel linear regressions: $Y_t$ on the quarter indicators Q1 to Q4 and time $t$. Modelling option 2, seasonal autoregression: $Y_t$ on $Y_{t-1}$ and $Y_{t-4}$. Points of comparison: more efficient for prediction; fewer modelling assumptions; different modelling strategy. [Table comparing, quarter by quarter, Comps with the forecasts from "Lin in time + Q inds" and from "Lag 1 and Lag 4"; entries shown as "?" to be filled in.]
38 Model Criticism. Criticism: does it make sense? Are there outliers? Choice amongst alternatives: $R^2$, SE.
39 Extra: Logs, lags and differences. Financial data: IBM share price. Natural language: %age change. MINITAB language: logs.
40 Financial Series: IBM Prices, daily. Simple regression on time.
41 Log IBM Prices. [Figures: fitted line plots with 95% PI. Log(Y_t) vs t (Logprice vs t): R-Sq 94.4%, R-Sq(adj) 94.4%. Log(Y_t) vs Log(Y_{t-1}) (Logprice vs lag1logprice): R-Sq 99.8%, R-Sq(adj) 99.8%.]
42 Modelled in log scale, presented in original units. [Figures: the same two fits, log10(price) vs t (R-Sq 94.4%) and log10(price) vs log10(lag1price) (R-Sq 99.8%), with 95% PI, plotted on the price scale.]
43 Differences / Ratios. First differences: today - yesterday. Seasonal differences: this Q - same Q last year. Ratio: Y(t) / Y(t-1). Rate of return: 100 × (Y(t) - Y(t-1)) / Y(t-1) = 100 × (Ratio - 1). Log(Ratio) = Log(Y(t)) - Log(Y(t-1)).
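These derived series are easy to compute directly. The sketch below uses a made-up three-day price series and hypothetical function names; base-10 logs are used to match the log10 fits shown for the IBM prices.

```python
import math


def first_differences(y):
    """Y(t) - Y(t-1)."""
    return [b - a for a, b in zip(y, y[1:])]


def seasonal_differences(y, period=4):
    """Y(t) - Y(t-period), e.g. this quarter minus the same quarter last year."""
    return [y[t] - y[t - period] for t in range(period, len(y))]


def ratios(y):
    """Y(t) / Y(t-1)."""
    return [b / a for a, b in zip(y, y[1:])]


def rates_of_return(y):
    """100 * (Y(t) - Y(t-1)) / Y(t-1), i.e. 100 * (ratio - 1)."""
    return [100 * (b - a) / a for a, b in zip(y, y[1:])]


def log_differences(y):
    """log10 Y(t) - log10 Y(t-1), i.e. log10 of the ratio."""
    return [math.log10(b) - math.log10(a) for a, b in zip(y, y[1:])]


prices = [100.0, 102.0, 99.96]   # hypothetical prices
print(first_differences(prices))
print(ratios(prices))
print(rates_of_return(prices))
print(log_differences(prices))
```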
44 Financial Series: IBM Prices, daily. Simple regression of daily differences vs time. [Figure: fitted line plot of Lag1diff vs t with 95% PI; R-Sq 0.1%, R-Sq(adj) 0.0%.]
45 Financial Series: IBM Prices, daily. Simple regression of first differences of LogPrice vs time. [Figure: fitted line plot of Lag1difflog vs t with 95% PI; R-Sq 0.0%, R-Sq(adj) 0.0%.]
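46 Financial Series: IBM Prices, daily. [Figure: fitted line plot of Lag1difflog vs t with 95% PI; R-Sq 0.0%, R-Sq(adj) 0.0%.] Interpretation: the coefficient of time is effectively 0, so $\log P_t - \log P_{t-1} = \beta_0 + \varepsilon_t$, i.e. $\log(P_t / P_{t-1}) = \beta_0 + \varepsilon_t$, which lies in an interval around $\beta_0$; equivalently $P_t / P_{t-1}$ lies in the interval obtained by raising 10 to those limits (lower limit about 0.96). In summary: rate of return about 0.1% per day, give or take about 4%. [Numerical limits garbled in the transcript.]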
47 Financial Series. Day-to-day changes are most naturally expressed as % change: price tomorrow = price today × small change, so Log(price t+1) = Log(price t) + Log(small change). Average drift per day (for logs) is [value lost in transcription], i.e. about 0.1% growth per day = 61% pa.
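The step from a daily rate to an annual rate is compounding. The sketch below assumes 365 compounding days per year, which the slide does not state, and the function names are hypothetical. Under that assumption a daily rate of exactly 0.1% compounds to roughly 44% a year, while a daily rate near 0.13% gives roughly 61%, so the quoted "about 0.1%" appears to be a rounded figure; the exact drift is not recoverable from the transcript.

```python
import math


def annual_growth_pct(daily_pct, days=365):
    """Compound a daily percentage growth rate over `days` days (assumed 365)."""
    return 100 * ((1 + daily_pct / 100) ** days - 1)


def daily_pct_from_log10_drift(drift):
    """Convert an average daily drift in log10(price) to a daily % growth rate."""
    return 100 * (10 ** drift - 1)


print(annual_growth_pct(0.1))    # roughly 44% a year
print(annual_growth_pct(0.13))   # roughly 61% a year
print(daily_pct_from_log10_drift(math.log10(1.0013)))  # about 0.13% per day
```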
48 Financial Series. Confidence in future prediction. [Table by Day of the growth Factor and of the estimated price with hi and lo limits, e.g. for initial capital 1000; entries include infinity; values not reproduced.] 61% per annum??
49 Derived Variables. Why use derived variables? Adding extra variables gives more options. Challenge: is there a cost? Which is best? Scientific insight can lead to powerful and simple analysis.
Topic 14: Inference in Multiple Regression Outline Review multiple linear regression Inference of regression coefficients Application to book example Inference of mean Application to book example Inference
More informationExamination paper for TMA4255 Applied statistics
Department of Mathematical Sciences Examination paper for TMA4255 Applied statistics Academic contact during examination: Anna Marie Holand Phone: 951 38 038 Examination date: 16 May 2015 Examination time
More informationStart with review, some new definitions, and pictures on the white board. Assumptions in the Normal Linear Regression Model
Start with review, some new definitions, and pictures on the white board. Assumptions in the Normal Linear Regression Model A1: There is a linear relationship between X and Y. A2: The error terms (and
More informationLEARNING WITH MINITAB Chapter 12 SESSION FIVE: DESIGNING AN EXPERIMENT
LEARNING WITH MINITAB Chapter 12 SESSION FIVE: DESIGNING AN EXPERIMENT Laura M Williams, RN, CLNC, MSN MOREHEAD STATE UNIVERSITY IET603: STATISTICAL QUALITY ASSURANCE IN SCIENCE AND TECHNOLOGY DR. AHMAD
More informationO2. The following printout concerns a best subsets regression. Questions follow.
STAT-UB.0103 Exam 01.APIL.11 OVAL Version Solutions O1. Frank Tanner is the lab manager at BioVigor, a firm that runs studies for agricultural food supplements. He has been asked to design a protocol for
More informationSAS Procedures Inference about the Line ffl model statement in proc reg has many options ffl To construct confidence intervals use alpha=, clm, cli, c
Inference About the Slope ffl As with all estimates, ^fi1 subject to sampling var ffl Because Y jx _ Normal, the estimate ^fi1 _ Normal A linear combination of indep Normals is Normal Simple Linear Regression
More information[4+3+3] Q 1. (a) Describe the normal regression model through origin. Show that the least square estimator of the regression parameter is given by
Concordia University Department of Mathematics and Statistics Course Number Section Statistics 360/1 40 Examination Date Time Pages Final June 2004 3 hours 7 Instructors Course Examiner Marks Y.P. Chaubey
More information" M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2
Notation and Equations for Final Exam Symbol Definition X The variable we measure in a scientific study n The size of the sample N The size of the population M The mean of the sample µ The mean of the
More informationInterpreting the coefficients
Lecture Week 5 Multiple Linear Regression Interpreting the coefficients Uses of Multiple Regression Predict for specified new x-vars Predict in time. Focus on one parameter Use regression to adjust variation
More informationMBA Statistics COURSE #4
MBA Statistics 51-651-00 COURSE #4 Simple and multiple linear regression What should be the sales of ice cream? Example: Before beginning building a movie theater, one must estimate the daily number of
More informationIII. Inferential Tools
III. Inferential Tools A. Introduction to Bat Echolocation Data (10.1.1) 1. Q: Do echolocating bats expend more enery than non-echolocating bats and birds, after accounting for mass? 2. Strategy: (i) Explore
More informationChapter 7 Student Lecture Notes 7-1
Chapter 7 Student Lecture Notes 7- Chapter Goals QM353: Business Statistics Chapter 7 Multiple Regression Analysis and Model Building After completing this chapter, you should be able to: Explain model
More informationOutline. Linear OLS Models vs: Linear Marginal Models Linear Conditional Models. Random Intercepts Random Intercepts & Slopes
Lecture 2.1 Basic Linear LDA 1 Outline Linear OLS Models vs: Linear Marginal Models Linear Conditional Models Random Intercepts Random Intercepts & Slopes Cond l & Marginal Connections Empirical Bayes
More informationSTAT 360-Linear Models
STAT 360-Linear Models Instructor: Yogendra P. Chaubey Sample Test Questions Fall 004 Note: The following questions are from previous tests and exams. The final exam will be for three hours and will contain
More informationDisadvantages of using many pooled t procedures. The sampling distribution of the sample means. The variability between the sample means
Stat 529 (Winter 2011) Analysis of Variance (ANOVA) Reading: Sections 5.1 5.3. Introduction and notation Birthweight example Disadvantages of using many pooled t procedures The analysis of variance procedure
More informationA Second Course in Statistics: Regression Analysis
FIFTH E D I T I 0 N A Second Course in Statistics: Regression Analysis WILLIAM MENDENHALL University of Florida TERRY SINCICH University of South Florida PRENTICE HALL Upper Saddle River, New Jersey 07458
More information10. Alternative case influence statistics
10. Alternative case influence statistics a. Alternative to D i : dffits i (and others) b. Alternative to studres i : externally-studentized residual c. Suggestion: use whatever is convenient with the
More information