Multiple linear regression
Course MF 930: Introduction to statistics, June 0
Tron Anders Moger, Department of biostatistics, IMB, University of Oslo

Aims for this lecture:
- Continue where we left off: repeat the most important things from the last lecture
- Learn tests for checking whether the slope of the regression line is different from zero
- Look at what happens if more variables are included in the model
- Learn how to handle binary independent variables and categorical independent variables
Repetition: Simple linear regression

Example: [Figure: scatter plot of birthweight against mother's weight in pounds]

We define a model
$$Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$$
where $Y_i$ is the dependent variable, $x_i$ is the independent variable, and the errors $\varepsilon_i$ are independent and normally distributed with equal variance $\sigma^2$.

- We wish to fit a line as close to the observed data (two normally distributed variables) as possible
- Example: Birth weight $= \beta_0 + \beta_1 \cdot$ mother's weight
- The estimate for $\beta_0$ is called $a$, and the estimate for $\beta_1$ is called $b$
Least squares regression

[Figure: scatter plot of birthweight against weight in pounds with the fitted line; R Sq Linear = 0,035]

- Find the best fitting line by minimizing the squared vertical distance from each data point to the line, summed over all data
- Let $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ denote the points in the plane
- Find $a$ and $b$ so that $y = a + bx$ fits the points by minimizing
$$SSE = (a + bx_1 - y_1)^2 + (a + bx_2 - y_2)^2 + \cdots + (a + bx_n - y_n)^2 = \sum_{i=1}^{n} (a + bx_i - y_i)^2$$

Solution:
$$b = \frac{n\sum x_i y_i - (\sum x_i)(\sum y_i)}{n\sum x_i^2 - (\sum x_i)^2} = \frac{\sum x_i y_i - n\bar{x}\bar{y}}{\sum x_i^2 - n\bar{x}^2}$$
$$a = \frac{\sum y_i - b\sum x_i}{n} = \bar{y} - b\bar{x}$$
where $\bar{x} = \sum x_i / n$, $\bar{y} = \sum y_i / n$, and all sums are over $i = 1, \ldots, n$.
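The closed-form solution for a and b can be sketched in a few lines of Python. This is only an illustration: the function name `fit_line` and the four data points are made up for the example and are not taken from the lecture.

```python
def fit_line(xs, ys):
    """Return (a, b) minimizing sum((a + b*x - y)**2) over the data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Numerator and denominator of b, written as in the slide formula
    sxy = sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar
    sxx = sum(x * x for x in xs) - n * x_bar ** 2
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def sse(xs, ys, a, b):
    """Sum of squared vertical distances from the points to y = a + b*x."""
    return sum((a + b * x - y) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, for illustration only
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 4.9, 7.2, 8.8]
a, b = fit_line(xs, ys)
print(a, b, sse(xs, ys, a, b))
```

Any other choice of a or b gives a larger SSE on the same data, which is what "least squares" means.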
How close are the data to the fitted line? $R^2$

[Figure: sketch of a data point $(x_i, y_i)$, the regression line and the mean $\bar{y}$, showing the residual $\varepsilon_i = y_i - \hat{y}_i$]

$$SST = \sum (y_i - \bar{y})^2, \quad SSE = \sum (y_i - \hat{y}_i)^2, \quad SSR = \sum (\hat{y}_i - \bar{y})^2$$

- Predicted value = any point on the regression line: $\hat{y}_i = a + bx_i$
- $R^2$, the proportion of the total variance in the $y_i$'s in the data explained by the regression line, is given by $SSR/SST$

Also remember: Residuals (distance from data points to the regression line) have to be normally distributed!
Plots for checking this are easily obtained from SPSS:
- Histograms
- Q-Q plots (which SPSS calls P-P plots in regression)
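As a rough sketch, the three sums of squares and R² can be computed directly from their definitions. The function name `r_squared_parts` and the data in the usage line are illustrative assumptions, not part of the lecture.

```python
def r_squared_parts(xs, ys):
    """Fit y = a + b*x by least squares and return (SST, SSE, SSR, R^2)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    b = (sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar) / (
        sum(x * x for x in xs) - n * x_bar ** 2
    )
    a = y_bar - b * x_bar
    fitted = [a + b * x for x in xs]                      # predicted values
    sst = sum((y - y_bar) ** 2 for y in ys)               # total variation
    sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))   # residual variation
    ssr = sum((f - y_bar) ** 2 for f in fitted)           # explained variation
    return sst, sse, ssr, ssr / sst


# Hypothetical data, for illustration only
sst, sse, ssr, r2 = r_squared_parts([1.0, 2.0, 3.0, 4.0], [3.1, 4.9, 7.2, 8.8])
print(sst, sse, ssr, r2)
```

For a least squares fit with a constant term, SST = SSR + SSE, so SSR/SST and 1 - SSE/SST give the same R².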
Example: Regression of birth weight with mother's weight as independent variable

SPSS output:

Model Summary. Columns: R, R Square, Adjusted R Square, Std. Error of the Estimate. Values: ,86 ,035 ,09 78,470. Predictors: (Constant), weight in pounds. Dependent Variable: birthweight.
- R is Pearson's r, and R Square is $R^2$
- Std. Error of the Estimate is the estimate for $\sigma$

ANOVA. Rows: Regression (SSR), Residual (SSE), Total (SST). Columns: Sum of Squares, df, Mean Square, F, Sig. Values: ,30 6,686,00 ,
- Sig. is the P-value for the test on whether there is a significant relationship between the variables in the model. The null hypothesis is no relationship.

Coefficients. Rows: (Constant), which is the estimate for $\beta_0$, and weight in pounds, which is the estimate for $\beta_1$. Columns: B, Std. Error, Beta, t, Sig., 95% Confidence Interval for B (Lower Bound, Upper Bound). Values: 369,67 8,43 0,374,000 99,040 80,304 4,49,73,86,586,00,050 7,809
- The t, Sig. and confidence interval columns give P-values, confidence intervals etc. for the $\beta$'s

But how do we answer questions like these, given that a positive slope (b) has been estimated?
- Does it give a reproducible indication that there is a positive trend, or is it a result of random variation?
- What is a confidence interval for the estimated slope?
Confidence intervals for simple regression

In a simple regression model,
- $a$ estimates $\beta_0$
- $b$ estimates $\beta_1$
- $\hat{\sigma}^2 = SSE/(n-2)$ estimates the variance $\sigma^2$

Also,
$$(b - \beta_1)/S_b \sim t_{n-2}$$
where
$$S_b^2 = \frac{\hat{\sigma}^2}{(n-1)s_x^2}$$
estimates the variance of $b$. So a confidence interval for $\beta_1$ is given by
$$b \pm t_{n-2,\alpha/2} S_b$$

Hypothesis testing for simple regression

- Choose hypotheses: $H_0: \beta_1 = 0$, $H_1: \beta_1 \neq 0$
- Test statistic: $b/S_b \sim t_{n-2}$
- Reject $H_0$ if $b/S_b < -t_{n-2,\alpha/2}$ or $b/S_b > t_{n-2,\alpha/2}$

For the example: Test $H_0: \beta_{\text{mother's weight}} = 0$ at the 5% significance level. Get 4.49/.73=.586. Look up the 2.5 and 97.5 percentiles in the t-distribution with 87 degrees of freedom (use normal dist.). Find p-value<0.05, reject $H_0$.
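A minimal sketch of the slope test and confidence interval follows. As the example suggests for large samples, it assumes the normal distribution can stand in for the t-distribution, which keeps it within the Python standard library; with few observations, proper t quantiles should be used instead. The function name `slope_inference` and the data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def slope_inference(xs, ys, alpha=0.05):
    """Return (b, S_b, test statistic, confidence interval, two-sided p-value)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)        # equals (n-1) * s_x^2
    b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    a = y_bar - b * x_bar
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    sigma2_hat = sse / (n - 2)                     # estimate of sigma^2
    s_b = sqrt(sigma2_hat / sxx)                   # estimated std. dev. of b
    t_stat = b / s_b
    # Assumption: normal quantile used in place of t_{n-2, alpha/2}
    z = NormalDist().inv_cdf(1 - alpha / 2)
    ci = (b - z * s_b, b + z * s_b)
    p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    return b, s_b, t_stat, ci, p_value


# Hypothetical data, for illustration only (far too few points in practice)
print(slope_inference([1.0, 2.0, 3.0, 4.0], [3.1, 4.9, 7.2, 8.8]))
```

The null hypothesis of zero slope is rejected when the p-value is below alpha, or equivalently when the confidence interval does not contain zero.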
More than one independent variable: Multiple regression

- Assume we have data of the type $(x_{11}, x_{21}, x_{31}, y_1), (x_{12}, x_{22}, x_{32}, y_2), \ldots$
- We want to explain $y$ from the $x$-values by fitting the following model:
$$y = a + bx_1 + cx_2 + dx_3$$
- Just like before, one can produce formulas for $a, b, c, d$ minimizing the sum of the squares of the errors

Multiple regression model

$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_n x_{ni} + \varepsilon_i$$
- The errors $\varepsilon_i$ are independent random (normal) variables with expected value zero and variance $\sigma^2$
- The explanatory variables $x_{1i}, x_{2i}, \ldots, x_{ni}$ must not be (nearly) linearly related to each other, that is, they must not measure almost the same thing
Indicator variables

- Binary variables (yes/no, male/female, ...) can be represented as 1/0, and used as independent variables
- Also called dummy variables in the book
- When used directly, they influence only the constant term of the regression
- It is also possible to use a binary variable so that it changes both the constant term and the slope of the regression line (interaction)

Example: Regression of birth weight with mother's weight and smoking status as independent variables

SPSS output:

Model Summary. Columns: R, R Square, Adjusted R Square, Std. Error of the Estimate. Values: ,59 ,067, ,83567. Predictors: (Constant), smoking status, weight in pounds. Dependent Variable: birthweight.

ANOVA. Rows: Regression, Residual, Total. Columns: Sum of Squares, df, Mean Square, F, Sig. Values: ,65 6,7,00 ,

Coefficients. Rows: (Constant), weight in pounds, smoking status. Columns: B, Std. Error, Beta, t, Sig., 95% Confidence Interval for B (Lower Bound, Upper Bound). Values: 500,74 30,833 0,83, , ,56 4,38,690,78,508,03,905 7,57-70,03 05,590 -,8 -,557,0-478,3-6,705
Interpretation:

- Have fitted the model Birth weight = a + b*mother's weight + c*smoking status, with the estimates taken from the coefficient table (c = -70.03)
- If the mother starts to smoke (and her weight remains constant), what is the predicted influence on the infant's birth weight? *= -70 grams
- What is the predicted weight of the child of a 50 pound, smoking woman? * *=866 grams

Confounding

- See that the estimated effect of mother's weight has changed a little compared to the univariate analysis (where it was 4.49)
- Mother's weight is slightly confounded by smoking

[Diagram: mother's weight (Mwt), smoking (Smk) and birth weight (Bwt)]

Confounder: An independent variable that causes a great change (at least 0%) in the effect of other independent variables (the $\beta$) when it is included in the model
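The "great change" rule for spotting a confounder can be sketched as a relative-change check between the univariate (crude) and the adjusted estimate. The function names, the example estimates and the threshold value below are all hypothetical; the threshold is left as an explicit argument because the cut-off is a rule of thumb that must be chosen by the analyst.

```python
def relative_change(crude, adjusted):
    """Relative change in an estimated effect after adding another variable."""
    return abs(adjusted - crude) / abs(crude)


def is_confounded(crude, adjusted, threshold):
    """True if the estimate changes by at least `threshold` (a fraction)."""
    return relative_change(crude, adjusted) >= threshold


# Hypothetical estimates and threshold, for illustration only
print(relative_change(5.0, 4.0))         # effect shrinks from 5.0 to 4.0
print(is_confounded(5.0, 4.0, 0.1))      # judged against a 10% threshold
```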
Confounding cont'd.

- A confounder is differently distributed for different values of the variable it confounds
- E.g. if lean mothers smoked more than obese mothers, a univariate effect of mother's weight on birth weight would partly be due to smoking!
- Including smoking in the model removes this effect, so you get a more correct estimate of the effect of mother's weight

What if a categorical variable has more than two values?

- Example: Ethnicity; black, white, other
- For categorical variables with m possible values, use m-1 indicators
- It is common to choose a large group as baseline, otherwise the estimation becomes unstable
- A model with two indicator variables will assume that the effect of one indicator adds to the effect of the other
- If this may be unsuitable, use an additional interaction variable (product of indicators)
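As a small sketch of the m-1 indicator coding, the helper below turns a categorical variable into 0/1 columns, leaving out the baseline category. The function name `indicator_columns` and the sample values are assumptions made for the example.

```python
def indicator_columns(values, baseline):
    """Code a categorical variable with m values as m-1 indicator columns.

    Returns the non-baseline levels and, for each observation, a row of
    0/1 indicators in the same order. The baseline gets all zeros.
    """
    levels = sorted(set(values) - {baseline})
    rows = [[1 if value == level else 0 for level in levels] for value in values]
    return levels, rows


# Hypothetical observations, with "white" chosen as the baseline group
levels, rows = indicator_columns(["white", "black", "other", "white"], "white")
print(levels)
print(rows)
```

Each coefficient for an indicator is then interpreted as the difference from the baseline group.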
Birth weight as a function of ethnicity

Have constructed variables black=0 or 1 and other=0 or 1:
Birth weight=a+b*black+c*others

Get the following SPSS output:

Coefficients. Rows: (Constant), black, other. Columns: B, Std. Error, Beta, t, Sig., 95% Confidence Interval for B (Lower Bound, Upper Bound). Dependent Variable: birthweight. Values: 303,740 7,88 4,586, , ,5-384,047 57,874 -,8 -,433,06-695,50-7,593-99,75 3,678 -,97 -,637,009-53,988-75,46

- Hence, predicted birth weight decreases by 384 grams for blacks and 99 grams for others
- Predicted birth weight for whites is 304 grams

Multiple regression: Traffic deaths in 1976

Want to find if there is any relationship between highway death rate (deaths per 1000 per state) in the U.S. and the following variables:
- Average car age (in months)
- Average car weight (in 1000 pounds)
- Percentage light trucks
- Percentage imported cars

All data are per state
First: Scatter plots

[Figure: four scatter plots of deaths against carage, vehwt, lghttrks and impcars]

Univariate effects (including one independent variable at a time!):

Deaths per 1000=a+b*car age (in months)

Model Summary. Columns: R, R Square, Adjusted R Square, Std. Error of the Estimate. Values: ,49 ,4 ,6 ,0506. Predictors: (Constant), carage. Dependent Variable: deaths.

Coefficients. Rows: (Constant), carage. Columns: B, Std. Error, Beta, t, Sig., 95% Confidence Interval for B (Lower Bound, Upper Bound). Values: 4,56,34 3,98,000,33 6,800 -,06,06 -,49-3,834,000 -,094 -,09

Hence: If all else is equal, if average car age increases by one month, you get 0.062 fewer deaths per 1000 inhabitants; increase age by 12 months, and you get 12*0.062=0.74 fewer deaths per 1000 inhabitants

Deaths per 1000=a+b*car weight (in 1000 pounds)

Model Summary. Values: ,8 ,079 ,059 ,05740. Predictors: (Constant), vehwt. Dependent Variable: deaths.

Coefficients. Rows: (Constant), vehwt. Values: -,7, -,7,6 -,76,74,4,06,8,983,053 -,00,49
Univariate effects cont'd (one independent variable at a time!):

Model Summary. Columns: R, R Square, Adjusted R Square, Std. Error of the Estimate. Values: ,76 ,5 ,50 ,0478. Predictors: (Constant), lghttrks. Dependent Variable: deaths.

Coefficients. Rows: (Constant), lghttrks. Columns: B, Std. Error, Beta, t, Sig., 95% Confidence Interval for B (Lower Bound, Upper Bound). Values: ,046,08,478,07,009,083,007,00,76 6,947,000,005,00

Hence: Increasing the prop. light trucks by 20 means 20*0.007=0.14 more deaths per 1000 inhabitants

Model Summary. Values: ,308 ,095 ,075 ,05690. Predictors: (Constant), impcars. Dependent Variable: deaths.

Coefficients. Rows: (Constant), impcars. Values: ,06,00 0,46,000,66,46 -,004,00 -,308 -,93,033 -,007,000

Predicted number of deaths per 1000 if prop. imported cars is 10%: 0.206-0.004*10=0.17

Building a multiple regression model, exploratory analysis:

Forward regression:
1. Try all independent variables, one at a time, and keep the variable with the lowest p-value
2. Repeat step 1, with the independent variable from the first round now included in the model
3. Repeat until no more variables can be added to the model (no more significant variables)

Backward regression:
1. Include all independent variables in the model, and remove the variable with the highest p-value
2. Continue until only significant variables are left

However: In health sciences you would often keep age, gender etc. in the model even though they are not significant
Two better methods of model building:

1. All independent variables chosen for the study have strong medical reasons for being interesting and you have a large enough study. Then, all might be included in the final model regardless of significance.
2. Middle road: use a cut-off saying that all variables with p-value<e.g. 0. in simple analyses can be included in the final model

For the traffic deaths, end up with:
Deaths per 1000=2.668-0.037*car age+0.006*perc. light trucks

Model Summary. Columns: R, R Square, Adjusted R Square, Std. Error of the Estimate. Values: ,768 ,590 ,57 ,0387. Predictors: (Constant), lghttrks, carage. Dependent Variable: deaths.

Coefficients. Rows: (Constant), carage, lghttrks. Columns: B, Std. Error, Beta, t, Sig., 95% Confidence Interval for B (Lower Bound, Upper Bound). Values: ,668,895,98,005,865 4,470 -,037,03 -,95 -,930,005 -,063 -,0,006,00,6 6,8,000,004,009

Conclusion: Did a multiple linear regression on traffic deaths, with car age, car weight, prop. light trucks and prop. imported cars as independent variables. Car age (in months, $\beta$=-0.037, 95% CI=(-0.063, -0.01)) and prop. light trucks ($\beta$=0.006, 95% CI=(0.004, 0.009)) were significant at the 5% level
Check of assumptions: Are residuals normally distributed?

[Figures: histogram and normal P-P plot of the regression standardized residuals; dependent variable: deaths; Mean = ,3E-7, Std. Dev. = 0,978, N = 48]

Least squares estimation in multiple regression

$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_K x_{Ki} + \varepsilon_i$$

The least squares estimates of $\beta_0, \beta_1, \ldots, \beta_K$ are the values $b_0, b_1, \ldots, b_K$ minimizing
$$SSE = \sum_{i=1}^{n} (b_0 + b_1 x_{1i} + b_2 x_{2i} + \cdots + b_K x_{Ki} - y_i)^2$$

They can be computed with formulas similar to, but more complex than, those for simple regression
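One common way to compute these estimates is to solve the normal equations (X'X)b = X'y, where X holds a column of ones plus one column per independent variable. The sketch below does this in plain Python with Gaussian elimination; it is an illustration under that assumption, not the method used by SPSS, and the names `solve` and `fit_multiple` and the data are made up for the example.

```python
def solve(A, rhs):
    """Solve A z = rhs by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [r] for row, r in zip(A, rhs)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):                 # back substitution
        s = sum(M[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (M[r][n] - s) / M[r][r]
    return z


def fit_multiple(rows, ys):
    """rows: one [x1, ..., xK] list per observation. Returns [b0, b1, ..., bK]."""
    X = [[1.0] + list(r) for r in rows]            # add the constant term
    p = len(X[0])
    XtX = [[sum(x[i] * x[j] for x in X) for j in range(p)] for i in range(p)]
    Xty = [sum(x[i] * y for x, y in zip(X, ys)) for i in range(p)]
    return solve(XtX, Xty)


# Hypothetical data generated exactly from y = 1 + 2*x1 - 3*x2
rows = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
ys = [1, 3, -2, 0, 2]
print(fit_multiple(rows, ys))
```

The elimination fails (division by zero or wildly unstable results) when the independent variables are linearly related, which is the numerical side of the requirement that they must not measure the same thing.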
$R^2$ is defined just as before:

Defining
$$\hat{y}_i = b_0 + b_1 x_{1i} + b_2 x_{2i} + \cdots + b_K x_{Ki}$$
we define
$$SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2, \quad SSR = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2, \quad SST = \sum_{i=1}^{n} (y_i - \bar{y})^2$$
We get as before
$$SST = SSR + SSE$$
$$R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$$

Adjusted coefficient of determination

- Adding more independent variables will generally increase SSR and decrease SSE
- Thus the coefficient of determination will tend to indicate that models with many variables always fit better
- To avoid this effect, the adjusted coefficient of determination may be used:
$$\bar{R}^2 = 1 - \frac{SSE/(n-K-1)}{SST/(n-1)}$$
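The two coefficients of determination are simple to compute once SSE and SST are known. A minimal sketch, with hypothetical function names and input values:

```python
def r_squared(sse, sst):
    """Coefficient of determination: 1 - SSE/SST."""
    return 1 - sse / sst


def adjusted_r_squared(sse, sst, n, k):
    """Adjusted coefficient of determination for n observations and
    k independent variables: 1 - (SSE/(n-k-1)) / (SST/(n-1))."""
    return 1 - (sse / (n - k - 1)) / (sst / (n - 1))


# Hypothetical sums of squares, for illustration only
print(r_squared(20.0, 100.0))
print(adjusted_r_squared(20.0, 100.0, n=21, k=2))
```

The adjusted value is never larger than the unadjusted one when k is at least 1, and the penalty grows as more variables are added for a fixed number of observations.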
Drawing inference about the model parameters in multiple regression

Similar to simple regression, we get that the following statistic has a t distribution with n-K-1 degrees of freedom:
$$t_{b_j} = \frac{b_j - \beta_j}{s_{b_j}}$$
where $b_j$ is the least squares estimate for $\beta_j$ and $s_{b_j}$ is its estimated standard deviation
- K is the number of independent variables
- $s_{b_j}$ is computed from SSE and the correlation between independent variables

Confidence intervals and hypothesis tests

A confidence interval for $\beta_j$ becomes
$$b_j \pm t_{n-K-1,\alpha/2} s_{b_j}$$
Testing the hypothesis $H_0: \beta_j = 0$ vs $H_1: \beta_j \neq 0$:
Reject if
$$\frac{b_j}{s_{b_j}} < -t_{n-K-1,\alpha/2} \quad \text{or} \quad \frac{b_j}{s_{b_j}} > t_{n-K-1,\alpha/2}$$
Testing sets of parameters

- We can also test the null hypothesis that a specific set of the betas are simultaneously zero. The alternative hypothesis is that at least one beta in the set is nonzero.
- But we will not go into details here

What if the relationship between x and y is non-linear?

- The most common thing to do is to categorize the independent variable
- E.g. categorize age into 0-20 yrs, 21-40 yrs, 41-60 yrs and so on
- Choose a baseline category, and estimate a coefficient b for each of the other categories
- Then no particular functional form is assumed for the relationship between the outcome and the independent variable
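Categorizing a continuous variable can be sketched as below. The cut points 20, 40 and 60 are assumptions chosen for illustration, and the function name `age_category` is hypothetical; each resulting category other than the baseline would then get its own indicator variable in the regression.

```python
def age_category(age, upper_bounds=(20, 40, 60)):
    """Return the index of the age category.

    Category 0 holds ages up to the first bound, category 1 the ages above
    it up to the second bound, and so on. Ages above the last bound fall
    in a final open-ended category.
    """
    for index, upper in enumerate(upper_bounds):
        if age <= upper:
            return index
    return len(upper_bounds)


# Hypothetical ages, for illustration only
print([age_category(age) for age in (15, 20, 21, 45, 61)])
```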
Other options if the relationship is non-linear: Transformed variables

- The relationship between variables may not be linear
- Example: The natural model may be $y = ae^{bx}$
- We want to find $a$ and $b$ so that the curve $y = ae^{bx}$ approximates the points as well as possible

Example (cont.)

- When $y = ae^{bx}$ then $\log(y) = \log(a) + bx$
- Use the standard formulas on the pairs $(x_1, \log(y_1)), (x_2, \log(y_2)), \ldots, (x_n, \log(y_n))$
- We get estimates for $\log(a)$ and $b$, and thus for $a$ and $b$
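The log transformation can be sketched as follows: fit a straight line to the pairs (x, log(y)) with the simple regression formulas and transform the intercept back. The function name `fit_exponential` and the data are assumptions for the example, and all y values must be positive for the logarithm to exist.

```python
from math import exp, log


def fit_exponential(xs, ys):
    """Estimate (a, b) in y = a*exp(b*x) by least squares on log(y)."""
    logs = [log(y) for y in ys]                    # requires every y > 0
    n = len(xs)
    x_bar = sum(xs) / n
    l_bar = sum(logs) / n
    b = sum((x - x_bar) * (l - l_bar) for x, l in zip(xs, logs)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    log_a = l_bar - b * x_bar                      # intercept on the log scale
    return exp(log_a), b


# Hypothetical data generated exactly from y = 2*exp(0.5*x)
xs = [0.0, 1.0, 2.0, 3.0]
ys = [2.0 * exp(0.5 * x) for x in xs]
print(fit_exponential(xs, ys))
```

Note that this minimizes the squared errors on the log scale, which is not the same as minimizing them on the original scale of y.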
Doing a regression analysis

- Plot the data first, to investigate whether there is a natural relationship
- Linear or transformed model?
- Are there outliers which will unduly affect the result?
- Fit a model. Different models with the same number of parameters may be compared with $R^2$
- Check the assumptions!
- Make tests / confidence intervals for parameters
- A lot of practice is needed!
More informationUnit 11: Multiple Linear Regression
Unit 11: Multiple Linear Regression Statistics 571: Statistical Methods Ramón V. León 7/13/2004 Unit 11 - Stat 571 - Ramón V. León 1 Main Application of Multiple Regression Isolating the effect of a variable
More informationFormal Statement of Simple Linear Regression Model
Formal Statement of Simple Linear Regression Model Y i = β 0 + β 1 X i + ɛ i Y i value of the response variable in the i th trial β 0 and β 1 are parameters X i is a known constant, the value of the predictor
More informationAnalysis of variance
Analysis of variance Tron Anders Moger 3.0.007 Comparing more than two groups Up to now we have studied situations with One observation per subject One group Two groups Two or more observations per subject
More informationEconomics 113. Simple Regression Assumptions. Simple Regression Derivation. Changing Units of Measurement. Nonlinear effects
Economics 113 Simple Regression Models Simple Regression Assumptions Simple Regression Derivation Changing Units of Measurement Nonlinear effects OLS and unbiased estimates Variance of the OLS estimates
More informationBusiness Statistics. Lecture 9: Simple Regression
Business Statistics Lecture 9: Simple Regression 1 On to Model Building! Up to now, class was about descriptive and inferential statistics Numerical and graphical summaries of data Confidence intervals
More informationFinding Relationships Among Variables
Finding Relationships Among Variables BUS 230: Business and Economic Research and Communication 1 Goals Specific goals: Re-familiarize ourselves with basic statistics ideas: sampling distributions, hypothesis
More informationCHAPTER EIGHT Linear Regression
7 CHAPTER EIGHT Linear Regression 8. Scatter Diagram Example 8. A chemical engineer is investigating the effect of process operating temperature ( x ) on product yield ( y ). The study results in the following
More information1 Correlation and Inference from Regression
1 Correlation and Inference from Regression Reading: Kennedy (1998) A Guide to Econometrics, Chapters 4 and 6 Maddala, G.S. (1992) Introduction to Econometrics p. 170-177 Moore and McCabe, chapter 12 is
More informationLinear Regression. Simple linear regression model determines the relationship between one dependent variable (y) and one independent variable (x).
Linear Regression Simple linear regression model determines the relationship between one dependent variable (y) and one independent variable (x). A dependent variable is a random variable whose variation
More informationMultiple Linear Regression
Multiple Linear Regression Simple linear regression tries to fit a simple line between two variables Y and X. If X is linearly related to Y this explains some of the variability in Y. In most cases, there
More informationSTA6938-Logistic Regression Model
Dr. Ying Zhang STA6938-Logistic Regression Model Topic 2-Multiple Logistic Regression Model Outlines:. Model Fitting 2. Statistical Inference for Multiple Logistic Regression Model 3. Interpretation of
More informationLecture 6 Multiple Linear Regression, cont.
Lecture 6 Multiple Linear Regression, cont. BIOST 515 January 22, 2004 BIOST 515, Lecture 6 Testing general linear hypotheses Suppose we are interested in testing linear combinations of the regression
More informationThe Multiple Regression Model
Multiple Regression The Multiple Regression Model Idea: Examine the linear relationship between 1 dependent (Y) & or more independent variables (X i ) Multiple Regression Model with k Independent Variables:
More informationChapter 16. Simple Linear Regression and dcorrelation
Chapter 16 Simple Linear Regression and dcorrelation 16.1 Regression Analysis Our problem objective is to analyze the relationship between interval variables; regression analysis is the first tool we will
More informationInference in Regression Analysis
Inference in Regression Analysis Dr. Frank Wood Frank Wood, fwood@stat.columbia.edu Linear Regression Models Lecture 4, Slide 1 Today: Normal Error Regression Model Y i = β 0 + β 1 X i + ǫ i Y i value
More informationESP 178 Applied Research Methods. 2/23: Quantitative Analysis
ESP 178 Applied Research Methods 2/23: Quantitative Analysis Data Preparation Data coding create codebook that defines each variable, its response scale, how it was coded Data entry for mail surveys and
More informationQUANTITATIVE STATISTICAL METHODS: REGRESSION AND FORECASTING JOHANNES LEDOLTER VIENNA UNIVERSITY OF ECONOMICS AND BUSINESS ADMINISTRATION SPRING 2013
QUANTITATIVE STATISTICAL METHODS: REGRESSION AND FORECASTING JOHANNES LEDOLTER VIENNA UNIVERSITY OF ECONOMICS AND BUSINESS ADMINISTRATION SPRING 3 Introduction Objectives of course: Regression and Forecasting
More informationChapter 14. Linear least squares
Serik Sagitov, Chalmers and GU, March 5, 2018 Chapter 14 Linear least squares 1 Simple linear regression model A linear model for the random response Y = Y (x) to an independent variable X = x For a given
More informationLecture 19: Inference for SLR & Transformations
Lecture 19: Inference for SLR & Transformations Statistics 101 Mine Çetinkaya-Rundel April 3, 2012 Announcements Announcements HW 7 due Thursday. Correlation guessing game - ends on April 12 at noon. Winner
More informationBiostatistics for physicists fall Correlation Linear regression Analysis of variance
Biostatistics for physicists fall 2015 Correlation Linear regression Analysis of variance Correlation Example: Antibody level on 38 newborns and their mothers There is a positive correlation in antibody
More informationLinear Modelling in Stata Session 6: Further Topics in Linear Modelling
Linear Modelling in Stata Session 6: Further Topics in Linear Modelling Mark Lunt Arthritis Research UK Epidemiology Unit University of Manchester 14/11/2017 This Week Categorical Variables Categorical
More informationKeller: Stats for Mgmt & Econ, 7th Ed July 17, 2006
Chapter 17 Simple Linear Regression and Correlation 17.1 Regression Analysis Our problem objective is to analyze the relationship between interval variables; regression analysis is the first tool we will
More informationChapter 14 Simple Linear Regression (A)
Chapter 14 Simple Linear Regression (A) 1. Characteristics Managerial decisions often are based on the relationship between two or more variables. can be used to develop an equation showing how the variables
More informationDisadvantages of using many pooled t procedures. The sampling distribution of the sample means. The variability between the sample means
Stat 529 (Winter 2011) Analysis of Variance (ANOVA) Reading: Sections 5.1 5.3. Introduction and notation Birthweight example Disadvantages of using many pooled t procedures The analysis of variance procedure
More informationAcknowledgements. Outline. Marie Diener-West. ICTR Leadership / Team INTRODUCTION TO CLINICAL RESEARCH. Introduction to Linear Regression
INTRODUCTION TO CLINICAL RESEARCH Introduction to Linear Regression Karen Bandeen-Roche, Ph.D. July 17, 2012 Acknowledgements Marie Diener-West Rick Thompson ICTR Leadership / Team JHU Intro to Clinical
More informationSimple Linear Regression: One Qualitative IV
Simple Linear Regression: One Qualitative IV 1. Purpose As noted before regression is used both to explain and predict variation in DVs, and adding to the equation categorical variables extends regression
More informationChapter Goals. To understand the methods for displaying and describing relationship among variables. Formulate Theories.
Chapter Goals To understand the methods for displaying and describing relationship among variables. Formulate Theories Interpret Results/Make Decisions Collect Data Summarize Results Chapter 7: Is There
More information( ), which of the coefficients would end
Discussion Sheet 29.7.9 Qualitative Variables We have devoted most of our attention in multiple regression to quantitative or numerical variables. MR models can become more useful and complex when we consider
More informationExample: Multiple linear regression. Least squares regression. Repetition: Simple linear regression. Tron Anders Moger
Example: Multple lear regresso 5000,00 4000,00 Tro Aders Moger 0.0.007 brthweght 3000,00 000,00 000,00 0,00 50,00 00,00 50,00 00,00 50,00 weght pouds Repetto: Smple lear regresso We defe a model Y = β0
More informationCorrelation and Simple Linear Regression
Correlation and Simple Linear Regression Sasivimol Rattanasiri, Ph.D Section for Clinical Epidemiology and Biostatistics Ramathibodi Hospital, Mahidol University E-mail: sasivimol.rat@mahidol.ac.th 1 Outline
More informationDr. Junchao Xia Center of Biophysics and Computational Biology. Fall /1/2016 1/46
BIO5312 Biostatistics Lecture 10:Regression and Correlation Methods Dr. Junchao Xia Center of Biophysics and Computational Biology Fall 2016 11/1/2016 1/46 Outline In this lecture, we will discuss topics
More informationCorrelation and Regression
Correlation and Regression October 25, 2017 STAT 151 Class 9 Slide 1 Outline of Topics 1 Associations 2 Scatter plot 3 Correlation 4 Regression 5 Testing and estimation 6 Goodness-of-fit STAT 151 Class
More informationStatistics and Quantitative Analysis U4320. Segment 10 Prof. Sharyn O Halloran
Statistics and Quantitative Analysis U4320 Segment 10 Prof. Sharyn O Halloran Key Points 1. Review Univariate Regression Model 2. Introduce Multivariate Regression Model Assumptions Estimation Hypothesis
More information