Stat 328 Regression Summary
Simple Linear Regression Model

The basic (normal) "simple linear regression" model says that a response/output variable $y$ depends on an explanatory/input/system variable $x$ in a "noisy but linear" way. That is, one supposes that there is a linear relationship between $x$ and mean $y$,

$$\mu_{y|x} = \beta_0 + \beta_1 x$$

and that (for fixed $x$) there is around that mean a distribution of $y$ that is normal. Further, the model assumption is that the standard deviation of the response distribution is constant in $x$. In symbols it is standard to write

$$y = \beta_0 + \beta_1 x + \epsilon$$

where $\epsilon$ is normal with mean $0$ and standard deviation $\sigma$. This describes one $y$. Where several observations $y_i$ with corresponding values $x_i$ are under consideration, the assumption is that the $y_i$ (the $\epsilon_i$) are independent. (The $\epsilon_i$ are conceptually equivalent to unrelated random draws from the same fixed normal continuous distribution.) The model statement in its full glory is then

$$y_i = \beta_0 + \beta_1 x_i + \epsilon_i \quad \text{for } i = 1, \ldots, n$$
$$\epsilon_i \text{ for } i = 1, \ldots, n \text{ are independent normal } (0, \sigma) \text{ random variables}$$

The model statement above is a perfectly theoretical matter. One can begin with it, and for specific choices of $\beta_0$, $\beta_1$, and $\sigma$ find probabilities for $y$ at given values of $x$. In applications, the real mode of operation is instead to take data pairs $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ and use them to make inferences about the parameters $\beta_0$, $\beta_1$, and $\sigma$ and to make predictions based on the estimates (based on the empirically fitted model).

Descriptive Analysis of Approximately Linear $(x, y)$ Data

After plotting $(x, y)$ data to determine that the "linear in $x$ mean of $y$" model makes some sense, it is reasonable to try to quantify "how linear" the data look and to find a line of "best fit" to the scatterplot of the data. The sample correlation between $y$ and $x$,

$$r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2 \sum_{i=1}^{n}(y_i - \bar{y})^2}}$$

is a measure of strength of linear relationship between $x$ and $y$.
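The sample correlation formula above is easy to compute directly. A minimal standard-library sketch (the data values here are made up for illustration):

```python
import math

# Made-up (x, y) data that are nearly linear
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.2, 1.9, 3.1, 3.9, 5.1]

def correlation(x, y):
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

r = correlation(x, y)  # close to 1, since these data are nearly linear
```

A strong positive linear relationship gives $r$ near $+1$, a strong negative one gives $r$ near $-1$, and $r$ near $0$ indicates little linear relationship.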
Calculus can be invoked to find a slope $\beta_1$ and intercept $\beta_0$ minimizing the sum of squared vertical distances from data points to a fitted line, $\sum_{i=1}^{n}(y_i - (\beta_0 + \beta_1 x_i))^2$. These "least squares" values are

$$b_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}$$

and

$$b_0 = \bar{y} - b_1 \bar{x}$$

It is further common to refer to the value of $y$ on the "least squares line" corresponding to $x_i$ as a fitted or predicted value

$$\hat{y}_i = b_0 + b_1 x_i$$

One might take the difference between what is observed ($y_i$) and what is "predicted" or "explained" ($\hat{y}_i$) as a kind of leftover part or "residual" corresponding to a data value

$$e_i = y_i - \hat{y}_i$$

The sum $\sum_{i=1}^{n}(y_i - \bar{y})^2$ is most of the sample variance of the values $y_i$. It is a measure of raw variation in the response variable. People often call it the "total sum of squares" and write

$$SSTot = \sum_{i=1}^{n}(y_i - \bar{y})^2$$

The sum of squared residuals

$$SSE = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n}(y_i - \hat{y}_i)^2$$

is a measure of variation in response remaining unaccounted for after fitting a line to the data. People often call it the "error sum of squares." One is guaranteed that $SSTot \geq SSE$. So the difference $SSTot - SSE$ is a non-negative measure of variation accounted for in fitting a line to $(x, y)$ data. People often call it the "regression sum of squares" and write

$$SSR = SSTot - SSE$$
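The least squares formulas and the $SSTot = SSR + SSE$ decomposition can be sketched in a few lines of plain Python (same made-up data as before, no statistics library assumed):

```python
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.2, 1.9, 3.1, 3.9, 5.1]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

# Least squares slope and intercept
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
     sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar

yhat = [b0 + b1 * xi for xi in x]          # fitted values
e = [yi - yh for yi, yh in zip(y, yhat)]   # residuals

sstot = sum((yi - ybar) ** 2 for yi in y)  # total sum of squares
sse = sum(ei ** 2 for ei in e)             # error sum of squares
ssr = sstot - sse                          # regression sum of squares
```

For these data $b_1 = 0.98$ and $b_0 = 0.1$, and one can check the algebraic identity $SSR = b_1^2 \sum(x_i - \bar{x})^2$.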
The coefficient of determination expresses $SSR$ as a fraction of $SSTot$ and is

$$R^2 = \frac{SSR}{SSTot}$$

which is interpreted as "the fraction of raw variation in $y$ accounted for in the model fitting process."

Parameter Estimates for SLR

The descriptive statistics for $(x, y)$ data can be used to provide "single number estimates" of the (typically unknown) parameters of the simple linear regression model. That is, the slope of the least squares line can serve as an estimate of $\beta_1$,

$$\hat{\beta}_1 = b_1$$

and the intercept of the least squares line can serve as an estimate of $\beta_0$,

$$\hat{\beta}_0 = b_0$$

The variance of $y$ for a given $x$ can be estimated by a kind of average of squared residuals

$$s_e^2 = \frac{SSE}{n - 2} = \frac{1}{n - 2} \sum_{i=1}^{n} e_i^2$$

Of course, the square root of this "regression sample variance" is $s_e = \sqrt{s_e^2}$ and serves as a single number estimate of $\sigma$.

Interval-Based Inference Methods for SLR

The normal simple linear regression model provides inference formulas for model parameters. Confidence limits for $\beta_1$ (the slope of the line relating mean $y$ to $x$ ... the rate of change of average $y$ with respect to $x$) are

$$b_1 \pm t \frac{s_e}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}}$$

where $t$ is a quantile of the $t$ distribution with $\nu = n - 2$ degrees of freedom. Dielman calls the ratio $s_e / \sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}$ the standard error of $b_1$ and uses the symbol $s_{b_1}$ for it. In these terms, the confidence limits for $\beta_1$ are

$$b_1 \pm t\, s_{b_1}$$
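The quantities above ($R^2$, $s_e$, $s_{b_1}$, and the slope interval) can be sketched as follows. The data are made up, and the $t$ quantile is taken from a table rather than computed (in the course it would come from software such as JMP):

```python
import math

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.2, 1.9, 3.1, 3.9, 5.1]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)

b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b0 = ybar - b1 * xbar
e = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

sse = sum(ei ** 2 for ei in e)
sstot = sum((yi - ybar) ** 2 for yi in y)
r2 = (sstot - sse) / sstot        # coefficient of determination
s_e = math.sqrt(sse / (n - 2))    # single number estimate of sigma
se_b1 = s_e / math.sqrt(sxx)      # standard error of b1

# 97.5th percentile of t with n - 2 = 3 df, read from a table (an assumption)
t = 3.182
ci = (b1 - t * se_b1, b1 + t * se_b1)  # 95% confidence limits for beta_1
```

Note the degrees of freedom $n - 2$: two parameters ($\beta_0$ and $\beta_1$) were estimated in producing the residuals.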
Confidence limits for $\mu_{y|x} = \beta_0 + \beta_1 x$ (the mean value of $y$ at a given value $x$) are

$$(b_0 + b_1 x) \pm t\, s_e \sqrt{\frac{1}{n} + \frac{(x - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}}$$

One can abbreviate $(b_0 + b_1 x)$ as $\hat{y}$, and Dielman calls $s_e \sqrt{\frac{1}{n} + \frac{(x - \bar{x})^2}{\sum(x_i - \bar{x})^2}}$ the standard error of the mean and uses the symbol $s_{\hat{y}}$ for it. In these terms, the confidence limits for $\mu_{y|x}$ are

$$\hat{y} \pm t\, s_{\hat{y}}$$

(Note that by choosing $x = 0$, this formula provides confidence limits for $\beta_0$, though this parameter is rarely of independent practical interest.)

Prediction limits for an additional observation $y$ at a given value $x$ are

$$(b_0 + b_1 x) \pm t\, s_e \sqrt{1 + \frac{1}{n} + \frac{(x - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}}$$

Dielman calls $s_e \sqrt{1 + \frac{1}{n} + \frac{(x - \bar{x})^2}{\sum(x_i - \bar{x})^2}}$ the standard error of prediction and uses the symbol $s_{pred}$ for it. It is on occasion useful to notice that

$$s_{pred} = \sqrt{s_e^2 + s_{\hat{y}}^2}$$

Using the $s_{pred}$ notation, prediction limits for an additional $y$ at $x$ are

$$\hat{y} \pm t\, s_{pred}$$
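The two standard errors and the relation $s_{pred} = \sqrt{s_e^2 + s_{\hat{y}}^2}$ can be checked numerically. A sketch with made-up data and a made-up new value of $x$ (quantiles would again come from a $t$ table or software):

```python
import math

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.2, 1.9, 3.1, 3.9, 5.1]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b0 = ybar - b1 * xbar
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s_e = math.sqrt(sse / (n - 2))

x_new = 3.5                       # a value of x of interest (made up)
yhat = b0 + b1 * x_new
leverage = 1 / n + (x_new - xbar) ** 2 / sxx
se_mean = s_e * math.sqrt(leverage)       # for estimating mu_{y|x}
se_pred = s_e * math.sqrt(1 + leverage)   # for predicting one new y
```

The prediction standard error is always the larger of the two: predicting a single new $y$ must absorb both the uncertainty in the fitted line and the scatter $\sigma$ of an individual observation about the line.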
Hypothesis Tests and SLR

The normal simple linear regression model supports hypothesis testing. $H_0\colon \beta_1 = \#$ can be tested using the test statistic

$$t = \frac{b_1 - \#}{s_{b_1}} = \frac{b_1 - \#}{s_e / \sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}}$$

and a $t_{n-2}$ reference distribution. $H_0\colon \mu_{y|x} = \#$ can be tested using the test statistic

$$t = \frac{\hat{y} - \#}{s_{\hat{y}}} = \frac{(b_0 + b_1 x) - \#}{s_e \sqrt{\frac{1}{n} + \frac{(x - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}}}$$

and a $t_{n-2}$ reference distribution.

ANOVA and SLR

The breaking down of $SSTot$ into $SSR$ and $SSE$ can be thought of as a kind of "analysis of variance" in $y$. That enterprise is often summarized in a special kind of table. The general form is as below.

ANOVA Table (for SLR)

  Source       SS      df     MS                 F
  Regression   SSR     1      MSR = SSR/1        F = MSR/MSE
  Error        SSE     n-2    MSE = SSE/(n-2)
  Total        SSTot   n-1

In this table the ratios of sums of squares to degrees of freedom are called "mean squares." The mean square for error is, in fact, the estimate of $\sigma^2$ (i.e. $MSE = s_e^2$). As it turns out, the ratio in the "F" column can be used as a test statistic for the hypothesis $H_0\colon \beta_1 = 0$. The appropriate reference distribution is the $F_{1, n-2}$ distribution. As it turns out, the value $F = MSR/MSE$ is the square of the $t$ statistic for testing this hypothesis, and the $F$ test produces exactly the same $p$-values as a two-sided $t$ test.

Standardized Residuals and SLR

The theoretical variances of the residuals turn out to depend upon their corresponding $x$ values. As a means of putting these residuals all on the same footing, it is common to "standardize" them by dividing by an estimated standard deviation for each. This produces standardized residuals

$$e_i^* = \frac{e_i}{s_e \sqrt{1 - \frac{1}{n} - \frac{(x_i - \bar{x})^2}{\sum_{j=1}^{n}(x_j - \bar{x})^2}}}$$

These (if the normal simple linear regression model is a good one) "ought" to look as if they are approximately normal with mean $0$ and standard deviation $1$. Various kinds of plotting with these standardized residuals (or with the raw residuals) are used as means of "model checking" or "model diagnostics."
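The claim that the overall $F$ statistic is the square of the $t$ statistic for $H_0\colon \beta_1 = 0$ can be verified numerically on the same made-up data:

```python
import math

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.2, 1.9, 3.1, 3.9, 5.1]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b0 = ybar - b1 * xbar
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
sstot = sum((yi - ybar) ** 2 for yi in y)
ssr = sstot - sse

msr = ssr / 1          # regression mean square (1 df in SLR)
mse = sse / (n - 2)    # error mean square = s_e^2
F = msr / mse          # overall F statistic

# t statistic for H0: beta_1 = 0
t_stat = b1 / (math.sqrt(mse) / math.sqrt(sxx))
```

Because $F = t^2$ and the $F_{1, n-2}$ distribution is the distribution of the square of a $t_{n-2}$ variable, the two tests give identical two-sided $p$-values.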
Multiple Linear Regression Model

The basic (normal) "multiple linear regression" model says that a response/output variable $y$ depends on explanatory/input/system variables $x_1, \ldots, x_k$ in a "noisy but linear" way. That is, one supposes that there is a linear relationship between $x_1, \ldots, x_k$ and mean $y$,

$$\mu_{y|x_1, \ldots, x_k} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k$$

and that (for fixed $x_1, \ldots, x_k$) there is around that mean a distribution of $y$ that is normal. Further, the standard assumption is that the standard deviation of the response distribution is constant in $x_1, \ldots, x_k$. In symbols it is standard to write

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \epsilon$$

where $\epsilon$ is normal with mean $0$ and standard deviation $\sigma$. This describes one $y$. Where several observations $y_i$ with corresponding values $x_{1i}, \ldots, x_{ki}$ are under consideration, the assumption is that the $y_i$ (the $\epsilon_i$) are independent. (The $\epsilon_i$ are conceptually equivalent to unrelated random draws from the same fixed normal continuous distribution.) The model statement in its full glory is then

$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_k x_{ki} + \epsilon_i \quad \text{for } i = 1, \ldots, n$$
$$\epsilon_i \text{ for } i = 1, \ldots, n \text{ are independent normal } (0, \sigma) \text{ random variables}$$

The model statement above is a perfectly theoretical matter. One can begin with it, and for specific choices of $\beta_0, \beta_1, \ldots, \beta_k$ and $\sigma$ find probabilities for $y$ at given values of $x_1, \ldots, x_k$. In applications, the real mode of operation is instead to take data vectors $(x_{11}, \ldots, x_{k1}, y_1), (x_{12}, \ldots, x_{k2}, y_2), \ldots, (x_{1n}, \ldots, x_{kn}, y_n)$ and use them to make inferences about the parameters $\beta_0, \beta_1, \ldots, \beta_k$ and $\sigma$ and to make predictions based on the estimates (based on the empirically fitted model).

Descriptive Analysis of Approximately Linear $(x_1, \ldots, x_k, y)$ Data

Calculus can be invoked to find coefficients $\beta_0, \beta_1, \ldots, \beta_k$ minimizing the sum of squared vertical distances from data points in $(k+1)$-dimensional space to a fitted surface, $\sum_{i=1}^{n}(y_i - (\beta_0 + \beta_1 x_{1i} + \cdots + \beta_k x_{ki}))^2$. These "least squares" values DO NOT have simple formulas (unless one is willing to use matrix notation). And in particular, one can NOT simply somehow use the formulas from simple linear regression in this more complicated context.
We will call these minimizing coefficients $b_0, b_1, \ldots, b_k$ and need to rely upon JMP to produce them for us.
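JMP (or matrix software generally) produces these coefficients by solving the "normal equations" $X'X\,b = X'y$, where $X$ is the design matrix with a leading column of ones. Purely for intuition, here is a minimal standard-library sketch with two made-up predictors; this is an illustration of the idea, not the course's computing method:

```python
def solve(A, v):
    # Gaussian elimination with partial pivoting for a small linear system.
    n = len(v)
    M = [row[:] + [v[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    b = [0.0] * n
    for i in reversed(range(n)):
        b[i] = (M[i][n] - sum(M[i][j] * b[j] for j in range(i + 1, n))) / M[i][i]
    return b

# Made-up data: y is roughly 2 + 0.5*x1 + 1*x2 plus small noise
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
y  = [2.5, 4.1, 3.4, 5.0, 4.6, 6.0]

X = [[1.0, a, c] for a, c in zip(x1, x2)]  # design matrix with intercept column
XtX = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(3)]
b = solve(XtX, Xty)                        # [b0, b1, b2]

yhat = [b[0] + b[1] * a + b[2] * c for a, c in zip(x1, x2)]
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))
```

A hallmark of a least squares fit that includes an intercept is that the residuals sum to zero and are orthogonal to each predictor column.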
It is further common to refer to the value of $y$ on the "least squares surface" corresponding to $x_{1i}, \ldots, x_{ki}$ as a fitted or predicted value

$$\hat{y}_i = b_0 + b_1 x_{1i} + b_2 x_{2i} + \cdots + b_k x_{ki}$$

Exactly as in SLR, one takes the difference between what is observed ($y_i$) and what is "predicted" or "explained" ($\hat{y}_i$) as a kind of leftover part or "residual" corresponding to a data value

$$e_i = y_i - \hat{y}_i$$

The total sum of squares, $SSTot = \sum_{i=1}^{n}(y_i - \bar{y})^2$, is (still) most of the sample variance of the values $y_i$ and measures raw variation in the response variable. Just as in SLR, the sum of squared residuals $\sum_{i=1}^{n} e_i^2$ is a measure of variation in response remaining unaccounted for after fitting the equation to the data. As in SLR, people call it the error sum of squares and write

$$SSE = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n}(y_i - \hat{y}_i)^2$$

(The formula looks exactly like the one for SLR. It is simply the case that now $\hat{y}_i$ is computed using all $k$ inputs, not just a single $x$.) One is still guaranteed that $SSTot \geq SSE$. So the difference $SSTot - SSE$ is a non-negative measure of variation accounted for in fitting the linear equation to the data. As in SLR, people call it the regression sum of squares and write

$$SSR = SSTot - SSE$$

The coefficient of (multiple) determination expresses $SSR$ as a fraction of $SSTot$ and is

$$R^2 = \frac{SSR}{SSTot}$$

which is interpreted as "the fraction of raw variation in $y$ accounted for in the model fitting process." This quantity can also be interpreted in terms of a correlation, as it turns out to be the square of the sample linear correlation between the observations $y_i$ and the fitted or predicted values $\hat{y}_i$.

Parameter Estimates for MLR

The descriptive statistics for $(x_1, \ldots, x_k, y)$ data can be used to provide "single number estimates" of the (typically unknown) parameters of the multiple linear regression model. That is, the least squares coefficients $b_0, b_1, \ldots, b_k$ serve as estimates of the parameters $\beta_0, \beta_1, \ldots, \beta_k$. The first of these is a kind of high-dimensional "intercept"
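The "$R^2$ is the squared correlation between $y_i$ and $\hat{y}_i$" interpretation can be checked numerically; the sketch below uses the $k = 1$ (SLR) fit from earlier as the simplest special case, with the same made-up data:

```python
import math

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.2, 1.9, 3.1, 3.9, 5.1]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
     sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar
yhat = [b0 + b1 * xi for xi in x]

sstot = sum((yi - ybar) ** 2 for yi in y)
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))
r2 = (sstot - sse) / sstot   # coefficient of determination

def corr(u, v):
    ub, vb = sum(u) / len(u), sum(v) / len(v)
    suv = sum((a - ub) * (c - vb) for a, c in zip(u, v))
    suu = sum((a - ub) ** 2 for a in u)
    svv = sum((c - vb) ** 2 for c in v)
    return suv / math.sqrt(suu * svv)
```

The identity holds for any number of predictors, which is why $R^2$ in MLR is called the coefficient of *multiple* determination.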
and in the case where the predictors are not functionally related, the others serve as rates of change of average $y$ with respect to a single $x$, provided the other $x$'s are held fixed. The variance of $y$ for fixed values $x_1, \ldots, x_k$ can be estimated by a kind of average of squared residuals

$$s_e^2 = \frac{SSE}{n - k - 1} = \frac{1}{n - k - 1} \sum_{i=1}^{n} e_i^2$$

The square root of this "regression sample variance" is $s_e = \sqrt{s_e^2}$ and serves as a single number estimate of $\sigma$.

Interval-Based Inference Methods for MLR

The normal multiple linear regression model provides inference formulas for model parameters. Confidence limits for $\beta_j$ (the rate of change of average $y$ with respect to $x_j$) are

$$b_j \pm t\, s_{b_j}$$

where $t$ is a quantile of the $t$ distribution with $\nu = n - k - 1$ degrees of freedom and $s_{b_j}$ is a standard error of $b_j$. There is no simple formula for $s_{b_j}$, and in particular, one can NOT simply somehow use the formula from simple linear regression in this more complicated context. It IS the case that this standard error is a multiple of $s_e$, but we will have to rely upon JMP to provide it for us.

Confidence limits for $\mu_{y|x_1, \ldots, x_k} = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k$ (the mean value of $y$ at a particular choice of the $x$'s) are

$$\hat{y} \pm t\, s_{\hat{y}}$$

where $s_{\hat{y}}$ is a standard error for $\hat{y}$ and depends upon which values of $x_1, \ldots, x_k$ are under consideration. There is no simple formula for $s_{\hat{y}}$, and in particular one can NOT simply somehow use the formula from simple linear regression in this more complicated context. It IS the case that this standard error is a multiple of $s_e$, but we will have to rely upon JMP to provide it for us.

Prediction limits for an additional observation $y$ at a given vector $(x_1, \ldots, x_k)$ are

$$\hat{y} \pm t\, s_{pred}$$

where $s_{pred}$ is a standard error of prediction. It remains true (as in SLR) that

$$s_{pred} = \sqrt{s_e^2 + s_{\hat{y}}^2}$$

but otherwise there are no simple formulas for it, and in particular one can NOT simply somehow use the formula from simple linear regression in this more complicated context.
Hypothesis Tests and MLR

The normal multiple linear regression model supports hypothesis testing. $H_0\colon \beta_j = \#$ can be tested using the test statistic

$$t = \frac{b_j - \#}{s_{b_j}}$$

and a $t_{n-k-1}$ reference distribution. $H_0\colon \mu_{y|x_1, \ldots, x_k} = \#$ can be tested using the test statistic

$$t = \frac{\hat{y} - \#}{s_{\hat{y}}}$$

and a $t_{n-k-1}$ reference distribution.

ANOVA and MLR

As in SLR, the breaking down of $SSTot$ into $SSR$ and $SSE$ can be thought of as a kind of "analysis of variance" in $y$, and summarized in a special kind of table. The general form for MLR is as below.

ANOVA Table (for MLR Overall F Test)

  Source       SS      df       MS                     F
  Regression   SSR     k        MSR = SSR/k            F = MSR/MSE
  Error        SSE     n-k-1    MSE = SSE/(n-k-1)
  Total        SSTot   n-1

(Note that as in SLR, the mean square for error is, in fact, the estimate of $\sigma^2$, i.e. $MSE = s_e^2$.) As it turns out, the ratio in the "F" column can be used as a test statistic for the hypothesis $H_0\colon \beta_1 = \beta_2 = \cdots = \beta_k = 0$. The appropriate reference distribution is the $F_{k, n-k-1}$ distribution.

"Partial F Tests" in MLR

It is possible to use ANOVA ideas to invent F tests for investigating whether some whole group of $\beta$'s (short of the entire set) are all $0$. For example, one might want to test the hypothesis

$$H_0\colon \beta_{p+1} = \beta_{p+2} = \cdots = \beta_k = 0$$
(This is the hypothesis that only the first $p$ of the $k$ input variables $x$ have any impact on the mean system response ... the hypothesis that the first $p$ of the $x$'s are adequate to predict $y$ ... the hypothesis that after accounting for the first $p$ of the $x$'s, the others do not contribute "significantly" to one's ability to explain or predict $y$.)

If we call the model for $y$ in terms of all $k$ of the predictors the "full model" and the model for $y$ involving only $x_1$ through $x_p$ the "reduced model," then an $F$ test of the above hypothesis can be made using the statistic

$$F = \frac{(SSR_{full} - SSR_{reduced}) / (k - p)}{MSE_{full}}$$

and an $F_{k-p, n-k-1}$ reference distribution. $SSR_{full} \geq SSR_{reduced}$, so the numerator here is non-negative. Finding a $p$-value for this kind of test is a means of judging whether $R^2$ for the full model is "significantly"/detectably larger than $R^2$ for the reduced model. (Caution here: statistical significance is not the same as practical importance. With a big enough data set, essentially any increase in $R^2$ will produce a small $p$-value.)

It is reasonably common to expand the basic MLR ANOVA table to organize calculations for this test statistic. This is

(Expanded) ANOVA Table (for MLR)

  Source                            SS                   df       MS                             F
  Regression                        SSR_full             k        MSR = SSR_full/k               MSR/MSE_full
    x_1,...,x_p                     SSR_red              p
    x_{p+1},...,x_k | x_1,...,x_p   SSR_full - SSR_red   k-p      (SSR_full - SSR_red)/(k-p)     [(SSR_full - SSR_red)/(k-p)]/MSE_full
  Error                             SSE_full             n-k-1    MSE_full = SSE_full/(n-k-1)
  Total                             SSTot                n-1

Standardized Residuals in MLR

As in SLR, people sometimes wish to standardize residuals before using them to do model checking/diagnostics. While it is not possible to give a simple formula for the "standard error of $e_i$" without using matrix notation, most MLR programs will compute these values. The standardized residual for data point $i$ is then (as in SLR)

$$e_i^* = \frac{e_i}{\text{standard error of } e_i}$$

If the normal multiple linear regression model is a good one, these "ought" to look as if they are approximately normal with mean $0$ and standard deviation $1$.
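The partial $F$ arithmetic itself is simple once the full- and reduced-model sums of squares are in hand. A sketch in which the sums of squares are made-up numbers standing in for output of two fits (e.g. from JMP), with $k = 4$ predictors in the full model and $p = 2$ in the reduced model:

```python
# Made-up fitted sums of squares for a data set of n = 30 cases
n, k, p = 30, 4, 2
sstot = 400.0
sse_full, sse_red = 120.0, 150.0   # SSE from full and reduced fits

ssr_full = sstot - sse_full        # regression SS, full model
ssr_red = sstot - sse_red          # regression SS, reduced model

mse_full = sse_full / (n - k - 1)  # full-model error mean square
F = ((ssr_full - ssr_red) / (k - p)) / mse_full
# Compare F to the F distribution with k - p = 2 and n - k - 1 = 25 df
```

Here $F = (30/2)/4.8 = 3.125$; a table or software would convert that to a $p$-value for judging whether $x_3$ and $x_4$ add detectably to the first two predictors.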
Intervals and Tests for Linear Combinations of $\beta$'s in MLR

It is sometimes important to do inference for a linear combination of MLR model coefficients

$$L = c_0 \beta_0 + c_1 \beta_1 + c_2 \beta_2 + \cdots + c_k \beta_k$$

(where $c_0, c_1, \ldots, c_k$ are known constants). Note, for example, that $\mu_{y|x_1, \ldots, x_k}$ is of this form for $c_0 = 1, c_1 = x_1, c_2 = x_2, \ldots, c_k = x_k$. Note too that a difference in mean responses at two sets of predictors, say $(x_1, \ldots, x_k)$ and $(x_1', \ldots, x_k')$, is of this form for $c_0 = 0, c_1 = x_1 - x_1', c_2 = x_2 - x_2', \ldots, c_k = x_k - x_k'$.

An obvious estimate of $L$ is

$$\hat{L} = c_0 b_0 + c_1 b_1 + c_2 b_2 + \cdots + c_k b_k$$

Confidence limits for $L$ are

$$\hat{L} \pm t\, (\text{standard error of } \hat{L})$$

There is no simple formula for the "standard error of $\hat{L}$." This standard error is a multiple of $s_e$, but we will have to rely upon JMP to provide it for us. (Computation of $\hat{L}$ and its standard error is under the "Custom Test" option in JMP.)

$H_0\colon L = \#$ can be tested using the test statistic

$$t = \frac{\hat{L} - \#}{\text{standard error of } \hat{L}}$$

and a $t_{n-k-1}$ reference distribution. Or, if one thinks about it for a while, it is possible to find a reduced model that corresponds to the restriction that the null hypothesis places on the MLR model and to use a "$p = k - 1$" partial $F$ test (with $1$ and $n - k - 1$ degrees of freedom) equivalent to the $t$ test for this purpose.
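The point estimate $\hat{L}$ is just arithmetic on the fitted coefficients. A sketch for the "difference in mean responses at two predictor settings" example, with made-up coefficients standing in for a fitted model (the standard error would come from JMP's Custom Test option, not from this calculation):

```python
# Made-up fitted coefficients b0, b1, b2 from some two-predictor model
b = [2.0, 0.5, 1.0]

# Two predictor settings, each with a leading 1 for the intercept
x_a = [1.0, 6.0, 1.0]   # setting A: x1 = 6, x2 = 1
x_b = [1.0, 4.0, 0.0]   # setting B: x1 = 4, x2 = 0

# Coefficients c for the difference in mean responses: c0 = 0 as in the text
c = [ai - bi for ai, bi in zip(x_a, x_b)]

# Point estimate L_hat = sum of c_j * b_j
L_hat = sum(ci * bi for ci, bi in zip(c, b))
```

With these numbers $c = (0, 2, 1)$ and $\hat{L} = 2(0.5) + 1(1.0) = 2.0$, the estimated amount by which mean response at setting A exceeds that at setting B.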
More information9 Correlation and Regression
9 Correlation and Regression SW, Chapter 12. Suppose we select n = 10 persons from the population of college seniors who plan to take the MCAT exam. Each takes the test, is coached, and then retakes the
More informationSection 9.4. Notation. Requirements. Definition. Inferences About Two Means (Matched Pairs) Examples
Objective Section 9.4 Inferences About Two Means (Matched Pairs) Compare of two matched-paired means using two samples from each population. Hypothesis Tests and Confidence Intervals of two dependent means
More informationData Set 1A: Algal Photosynthesis vs. Salinity and Temperature
Data Set A: Algal Photosynthesis vs. Salinity and Temperature Statistical setting These data are from a controlled experiment in which two quantitative variables were manipulated, to determine their effects
More informationInference for the Regression Coefficient
Inference for the Regression Coefficient Recall, b 0 and b 1 are the estimates of the slope β 1 and intercept β 0 of population regression line. We can shows that b 0 and b 1 are the unbiased estimates
More informationAssumptions, Diagnostics, and Inferences for the Simple Linear Regression Model with Normal Residuals
Assumptions, Diagnostics, and Inferences for the Simple Linear Regression Model with Normal Residuals 4 December 2018 1 The Simple Linear Regression Model with Normal Residuals In previous class sessions,
More informationSTAT420 Midterm Exam. University of Illinois Urbana-Champaign October 19 (Friday), :00 4:15p. SOLUTIONS (Yellow)
STAT40 Midterm Exam University of Illinois Urbana-Champaign October 19 (Friday), 018 3:00 4:15p SOLUTIONS (Yellow) Question 1 (15 points) (10 points) 3 (50 points) extra ( points) Total (77 points) Points
More informationStatistics for Engineers Lecture 9 Linear Regression
Statistics for Engineers Lecture 9 Linear Regression Chong Ma Department of Statistics University of South Carolina chongm@email.sc.edu April 17, 2017 Chong Ma (Statistics, USC) STAT 509 Spring 2017 April
More informationIntro to Linear Regression
Intro to Linear Regression Introduction to Regression Regression is a statistical procedure for modeling the relationship among variables to predict the value of a dependent variable from one or more predictor
More informationInference in Regression Analysis
Inference in Regression Analysis Dr. Frank Wood Frank Wood, fwood@stat.columbia.edu Linear Regression Models Lecture 4, Slide 1 Today: Normal Error Regression Model Y i = β 0 + β 1 X i + ǫ i Y i value
More informationCHAPTER EIGHT Linear Regression
7 CHAPTER EIGHT Linear Regression 8. Scatter Diagram Example 8. A chemical engineer is investigating the effect of process operating temperature ( x ) on product yield ( y ). The study results in the following
More informationSimple Linear Regression
Simple Linear Regression OI CHAPTER 7 Important Concepts Correlation (r or R) and Coefficient of determination (R 2 ) Interpreting y-intercept and slope coefficients Inference (hypothesis testing and confidence
More informationChapter 27 Summary Inferences for Regression
Chapter 7 Summary Inferences for Regression What have we learned? We have now applied inference to regression models. Like in all inference situations, there are conditions that we must check. We can test
More informationStatistics II Exercises Chapter 5
Statistics II Exercises Chapter 5 1. Consider the four datasets provided in the transparencies for Chapter 5 (section 5.1) (a) Check that all four datasets generate exactly the same LS linear regression
More informationRegression: Main Ideas Setting: Quantitative outcome with a quantitative explanatory variable. Example, cont.
TCELL 9/4/205 36-309/749 Experimental Design for Behavioral and Social Sciences Simple Regression Example Male black wheatear birds carry stones to the nest as a form of sexual display. Soler et al. wanted
More information28. SIMPLE LINEAR REGRESSION III
28. SIMPLE LINEAR REGRESSION III Fitted Values and Residuals To each observed x i, there corresponds a y-value on the fitted line, y = βˆ + βˆ x. The are called fitted values. ŷ i They are the values of
More informationStat 135, Fall 2006 A. Adhikari HOMEWORK 10 SOLUTIONS
Stat 135, Fall 2006 A. Adhikari HOMEWORK 10 SOLUTIONS 1a) The model is cw i = β 0 + β 1 el i + ɛ i, where cw i is the weight of the ith chick, el i the length of the egg from which it hatched, and ɛ i
More informationStat 401B Exam 2 Fall 2015
Stat 401B Exam Fall 015 I have neither given nor received unauthorized assistance on this exam. Name Signed Date Name Printed ATTENTION! Incorrect numerical answers unaccompanied by supporting reasoning
More informationHypothesis Testing. We normally talk about two types of hypothesis: the null hypothesis and the research or alternative hypothesis.
Hypothesis Testing Today, we are going to begin talking about the idea of hypothesis testing how we can use statistics to show that our causal models are valid or invalid. We normally talk about two types
More informationAcknowledgements. Outline. Marie Diener-West. ICTR Leadership / Team INTRODUCTION TO CLINICAL RESEARCH. Introduction to Linear Regression
INTRODUCTION TO CLINICAL RESEARCH Introduction to Linear Regression Karen Bandeen-Roche, Ph.D. July 17, 2012 Acknowledgements Marie Diener-West Rick Thompson ICTR Leadership / Team JHU Intro to Clinical
More informationCorrelation and Linear Regression
Correlation and Linear Regression Correlation: Relationships between Variables So far, nearly all of our discussion of inferential statistics has focused on testing for differences between group means
More informationCorrelation & Simple Regression
Chapter 11 Correlation & Simple Regression The previous chapter dealt with inference for two categorical variables. In this chapter, we would like to examine the relationship between two quantitative variables.
More informationReview of Statistics 101
Review of Statistics 101 We review some important themes from the course 1. Introduction Statistics- Set of methods for collecting/analyzing data (the art and science of learning from data). Provides methods
More informationBusiness Statistics. Lecture 10: Course Review
Business Statistics Lecture 10: Course Review 1 Descriptive Statistics for Continuous Data Numerical Summaries Location: mean, median Spread or variability: variance, standard deviation, range, percentiles,
More informationCorrelation and Regression
Correlation and Regression October 25, 2017 STAT 151 Class 9 Slide 1 Outline of Topics 1 Associations 2 Scatter plot 3 Correlation 4 Regression 5 Testing and estimation 6 Goodness-of-fit STAT 151 Class
More information36-309/749 Experimental Design for Behavioral and Social Sciences. Sep. 22, 2015 Lecture 4: Linear Regression
36-309/749 Experimental Design for Behavioral and Social Sciences Sep. 22, 2015 Lecture 4: Linear Regression TCELL Simple Regression Example Male black wheatear birds carry stones to the nest as a form
More information1.) Fit the full model, i.e., allow for separate regression lines (different slopes and intercepts) for each species
Lecture notes 2/22/2000 Dummy variables and extra SS F-test Page 1 Crab claw size and closing force. Problem 7.25, 10.9, and 10.10 Regression for all species at once, i.e., include dummy variables for
More informationCorrelation and Regression
Correlation and Regression Dr. Bob Gee Dean Scott Bonney Professor William G. Journigan American Meridian University 1 Learning Objectives Upon successful completion of this module, the student should
More informationdf=degrees of freedom = n - 1
One sample t-test test of the mean Assumptions: Independent, random samples Approximately normal distribution (from intro class: σ is unknown, need to calculate and use s (sample standard deviation)) Hypotheses:
More informationIII. Inferential Tools
III. Inferential Tools A. Introduction to Bat Echolocation Data (10.1.1) 1. Q: Do echolocating bats expend more enery than non-echolocating bats and birds, after accounting for mass? 2. Strategy: (i) Explore
More informationThe scatterplot is the basic tool for graphically displaying bivariate quantitative data.
Bivariate Data: Graphical Display The scatterplot is the basic tool for graphically displaying bivariate quantitative data. Example: Some investors think that the performance of the stock market in January
More informationStat 411/511 ESTIMATING THE SLOPE AND INTERCEPT. Charlotte Wickham. stat511.cwick.co.nz. Nov
Stat 411/511 ESTIMATING THE SLOPE AND INTERCEPT Nov 20 2015 Charlotte Wickham stat511.cwick.co.nz Quiz #4 This weekend, don t forget. Usual format Assumptions Display 7.5 p. 180 The ideal normal, simple
More information1 Introduction to Minitab
1 Introduction to Minitab Minitab is a statistical analysis software package. The software is freely available to all students and is downloadable through the Technology Tab at my.calpoly.edu. When you
More informationANOVA Situation The F Statistic Multiple Comparisons. 1-Way ANOVA MATH 143. Department of Mathematics and Statistics Calvin College
1-Way ANOVA MATH 143 Department of Mathematics and Statistics Calvin College An example ANOVA situation Example (Treating Blisters) Subjects: 25 patients with blisters Treatments: Treatment A, Treatment
More informationPsychology Seminar Psych 406 Dr. Jeffrey Leitzel
Psychology Seminar Psych 406 Dr. Jeffrey Leitzel Structural Equation Modeling Topic 1: Correlation / Linear Regression Outline/Overview Correlations (r, pr, sr) Linear regression Multiple regression interpreting
More informationOrdinary Least Squares Regression Explained: Vartanian
Ordinary Least Squares Regression Explained: Vartanian When to Use Ordinary Least Squares Regression Analysis A. Variable types. When you have an interval/ratio scale dependent variable.. When your independent
More informationDensity Temp vs Ratio. temp
Temp Ratio Density 0.00 0.02 0.04 0.06 0.08 0.10 0.12 Density 0.0 0.2 0.4 0.6 0.8 1.0 1. (a) 170 175 180 185 temp 1.0 1.5 2.0 2.5 3.0 ratio The histogram shows that the temperature measures have two peaks,
More informationMultiple Linear Regression
Multiple Linear Regression Simple linear regression tries to fit a simple line between two variables Y and X. If X is linearly related to Y this explains some of the variability in Y. In most cases, there
More informationMFin Econometrics I Session 4: t-distribution, Simple Linear Regression, OLS assumptions and properties of OLS estimators
MFin Econometrics I Session 4: t-distribution, Simple Linear Regression, OLS assumptions and properties of OLS estimators Thilo Klein University of Cambridge Judge Business School Session 4: Linear regression,
More informationBivariate Data: Graphical Display The scatterplot is the basic tool for graphically displaying bivariate quantitative data.
Bivariate Data: Graphical Display The scatterplot is the basic tool for graphically displaying bivariate quantitative data. Example: Some investors think that the performance of the stock market in January
More information