Linear regression and correlation
1 Faculty of Health Sciences. Linear regression and correlation. Statistics for experimental medical researchers, 2018. Julie Forman, Christian Pipper & Claus Ekstrøm, Department of Biostatistics, University of Copenhagen.
2 Associations between quantitative variables. How do we find evidence of association? How can we describe/quantify the association? Method 1: Linear regression. How do we determine the best fitting line so that we can predict one variable from the other? How do we quantify the statistical uncertainty? We assume a linear association to simplify; is that fair? Method 2: Correlation. Simple symmetrical measures of association.
3 Outline: Linear regression. Model assumptions and prediction. About linear models in R. Correlations.
4 Case study: a dose-response experiment. Does cell concentration affect cell size of tetrahymena in cell cultivation? Is there an association? How can we describe it? How well can we predict diameter when we know concentration?
5 Same picture with fitted regression line. The regression line doesn't fit all measurements spot on, but overall the association looks reasonably linear.
6 The linear model: y = α + βx + ε. y is the response (or outcome), in this case log(diameter). x is the explanatory variable (or predictor), in this case log(concentration). β is the regression coefficient (or slope): it tells us how much y increases when x is increased by one unit. α is the intercept: where the line intersects the y-axis. Note that formally α is "the expected value of y when x = 0", but this interpretation is often extrapolated and not really meaningful in practice. ε is an individual error term, assumed normally distributed with zero mean and standard deviation σ_ε.
7 Graphical interpretation of the regression parameters. [Figure: the line y = α + βx; the intercept α is where the line crosses the y-axis, and the slope β is the increase in y per one-unit increase in x.]
8 How do we find the "best fitting line"? Answer: by least squares (explicit formulae or R). Find α̂, β̂ which minimize Σ_{i=1}^n (y_i − (α + βx_i))². The deviations r_i = y_i − (α̂ + β̂x_i) are called the residuals. Finding the best fitting line is the same as minimizing the residual variance s² = (1/(n − 2)) Σ_{i=1}^n r_i². We estimate the residual standard deviation by s = √s².
9 The residuals. Definition: r_i = y_i − α̂ − β̂x_i. [Figure: scatterplot with fitted line; the residuals are the vertical distances from the observations to the line.]
10 Quantification and test of association. If no association exists between x and y, then we expect the regression line to be horizontal, that is β = 0, so that y = α + ε regardless of x. We can test the null hypothesis H: β = 0 by using t = β̂ / s.e.(β̂), which has a t-distribution with n − 2 degrees of freedom in case the null hypothesis is true (formula for the standard error later). And we can obtain a confidence interval for β from β̂ ± t(n − 2) · s.e.(β̂). Inference for α is rarely of interest, but otherwise similar.
11 Case study: estimates and inference. Estimates etc. are easily obtained from R:

Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)  <2e-16 ***
log2conc     <2e-16 ***
Residual standard error: on 49 degrees of freedom

The rows in the table describe first α, the intercept, and second β, the slope corresponding to the effect of the log2-concentration on the log2-diameter. What can we conclude?
12 Interpretation. There is a significant association between concentration and cell diameter (P < 0.0001). We estimate that log2(diameter) decreases by a fixed amount every time log2(concentration) increases by one unit, that is, each time the concentration is doubled. A 95% confidence interval is given alongside. Another way of expressing this is that the diameter decreases exponentially with log2(concentration): specifically by an estimated −3.71% every time the concentration is doubled, with 95% CI up to −3.15%. The intercept 5.41 might be interpreted as the expected log2(diameter) when log2(concentration) = 0 (that is, when concentration = 1), but this is highly extrapolated.
13 Regression coefficients by formulae. The "best fitting line" can be solved for explicitly. The estimated slope is given by β̂ = Σ_{i=1}^n (x_i − x̄)(y_i − ȳ) / Σ_{i=1}^n (x_i − x̄)², and the intercept can be computed as α̂ = ȳ − β̂x̄, where β̂ from the previous formula is inserted. Note that the fitted line always passes through the point (x̄, ȳ).
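The closed-form estimates above can be sketched in a few lines of Python (stdlib only; the toy data are invented for illustration, not the tetrahymena measurements):

```python
import math

def fit_line(x, y):
    """Least-squares slope, intercept, and residual SD for y = a + b*x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    # Slope: sum of cross-products over sum of squared x-deviations.
    b = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
         / sum((xi - xbar) ** 2 for xi in x))
    a = ybar - b * xbar  # the fitted line passes through (xbar, ybar)
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    # Residual SD uses n - 2 degrees of freedom (two estimated parameters).
    s = math.sqrt(sum(r ** 2 for r in resid) / (n - 2))
    return a, b, s

# Toy data scattered around the line y = 1 + 2x:
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.1, 2.9, 5.2, 6.8, 9.0]
a, b, s = fit_line(x, y)
```

For these data the fit lands close to the generating line, and the residual SD is small, as expected for nearly linear data.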
14 Standard error formulae. The standard errors for α̂ and β̂ are given by s.e.(α̂) = s·√(1/n + x̄² / Σ(x − x̄)²) and s.e.(β̂) = s / √(Σ(x − x̄)²), where s is the residual standard deviation. A bigger sample size n will of course give rise to smaller standard errors, but the specific values of the x's also have an impact: s.e.(β̂) is larger if x doesn't vary much; s.e.(α̂) is larger if x doesn't vary much and/or if x̄ is far away from 0. Both are larger if the residual variance is large.
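A direct transcription of the two formulas, as a sketch with made-up x-values (s would come from a previous fit):

```python
import math

def standard_errors(x, s):
    """s.e. of the intercept and slope estimates, given residual SD s."""
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)  # spread of x drives the precision
    se_alpha = s * math.sqrt(1 / n + xbar ** 2 / sxx)
    se_beta = s / math.sqrt(sxx)
    return se_alpha, se_beta

# Same residual SD, but x spread out vs. squeezed together:
wide = standard_errors([1.0, 2.0, 3.0, 4.0, 5.0], s=1.0)
narrow = standard_errors([2.8, 2.9, 3.0, 3.1, 3.2], s=1.0)
```

Comparing `wide` and `narrow` illustrates the bullet points: with less variation in x, both standard errors blow up.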
15 Outline: Linear regression. Model assumptions and prediction. About linear models in R. Correlations.
16 Model assumptions. The statistical model assumed by the linear regression analysis is y_i = α + βx_i + ε_i, where the error terms ε_i describe the individual deviations from the regression line, assumed to be random, normally distributed with mean 0 and standard deviation σ_ε. There are four model assumptions we need to consider: 1. Observations are mutually independent (no clustering). 2. The true association is linear. 3. The error terms ε are normally distributed. 4. The error terms ε have the same standard deviation, regardless of the value of x.
17 What do we need to check? 1. Independence should be ensured by the study design. 2. Linearity is checked in a residual plot, i.e. a scatterplot of the residuals against the fitted (predicted) values in the data. 3. Normality is checked by making a QQ plot of the standardized residuals. 4. Homogeneity of variance is assessed from the residual plot. Note: it is not strictly necessary that the error terms are normally distributed if the sample size is large; confidence intervals and tests are valid as long as the other model assumptions hold. Prediction intervals, however, can only be trusted if the error terms are truly normal.
18 Case study: residual and QQ plot. Note: the points in the residual plot should be randomly and symmetrically scattered around the zero line, with the same variability at any point in the interval. Any systematic deviation from this pattern suggests a violation of the model assumptions.
19 Prediction. To predict the expected y for a new value of x, we plug into the equation of the estimated regression line: ŷ(x₀) = α̂ + β̂x₀. Example: when the concentration is 250000, then x₀ = log2(250000) ≈ 17.93; plugging this into the fitted line gives the expected log2-diameter, and back-transforming with 2^x gives the expected diameter itself. Note that this involves interpolating between the doses tested in the experiment. Extrapolating should be avoided.
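The plug-in prediction and back-transformation can be sketched as follows. The slope value below is a placeholder, not the case-study estimate; only the intercept 5.41 is quoted from the slides:

```python
import math

def predict(alpha, beta, x0):
    """Expected response at x0 from the fitted line: y-hat = alpha + beta*x0."""
    return alpha + beta * x0

# Hypothetical estimates on the log2-log2 scale:
alpha_hat, beta_hat = 5.41, -0.05
x0 = math.log2(250000)                 # concentration 250000, log2 scale
log2_diam = predict(alpha_hat, beta_hat, x0)
diam = 2 ** log2_diam                  # back-transform to the diameter scale
```

The same two lines (predict on the modeling scale, then back-transform) apply to any model fitted on log-transformed data.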
20 Uncertainty in prediction I. Not all responses lie at the average. There are two sources of uncertainty we need to consider when making predictions: 1. Natural variation in responses (estimated by the residual standard deviation s). 2. The statistical uncertainty in our estimates (standard errors). The standard error of the expected value at x₀ is s.e.(ŷ(x₀)) = s·√(1/n + (x₀ − x̄)² / Σ(x − x̄)²). This is the uncertainty related to estimating the average response at x₀.
21 Uncertainty in prediction II. If we want to predict individual responses at x = x₀ with 95% certainty, then we need s.d.(y_new(x₀) − ŷ(x₀)) = s·√(1 + 1/n + (x₀ − x̄)² / Σ(x − x̄)²), where the residual standard deviation has been "added" to the estimation uncertainty.
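Both uncertainty formulas can be put side by side in a short sketch: the prediction SD simply adds the residual variance s² on top of the squared standard error of the mean.

```python
import math

def se_mean(x, s, x0):
    """S.e. of the estimated *average* response at x0 (confidence band)."""
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return s * math.sqrt(1 / n + (x0 - xbar) ** 2 / sxx)

def sd_new(x, s, x0):
    """SD for predicting a *single new* response at x0 (prediction band):
    the residual variance is added to the estimation uncertainty."""
    return math.sqrt(s ** 2 + se_mean(x, s, x0) ** 2)

x = [1.0, 2.0, 3.0, 4.0, 5.0]
at_center = sd_new(x, s=1.0, x0=3.0)   # narrowest at x0 = xbar
at_edge = sd_new(x, s=1.0, x0=5.0)     # wider away from xbar
```

This also shows why the prediction band is always wider than the confidence band, and why both flare out away from x̄.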
22 Confidence vs. prediction interval. Which is the prediction interval and which is the confidence interval? What happens when the sample size increases?
23 A nicer picture. Obtained by back-transforming with 2^x before plotting.
24 Outline: Linear regression. Model assumptions and prediction. About linear models in R. Correlations.
25 Linear models in R. We use the lm function to do linear regression (and a lot more: ANOVA, multiple regression, ...). The model must be specified by a model formula, e.g. fit <- lm(log2diam ~ log2conc, data=dr), where ~ should be read as "potentially depends on" or "is potentially predicted by". The response goes on the left and the predictor on the right. lm returns a so-called model object of class "lm"; you don't have to understand all of its contents to use it.
26 Extractor functions. R has a bunch of functions that can be used to extract information from model objects, e.g.: summary(fit): table of estimates, tests, and more. confint(fit): confidence intervals. abline(fit): add the fitted line to an existing plot. residuals(fit): vector containing the residuals. predict(fit, frame): predict y's for supplied x values. plot(fit): diagnostic plots (e.g. for checking model assumptions).
27 R-demo: doseresponse.r

load(file.choose())  # choose doseresponse.rda
dr <- transform(dr, log2conc=log2(concentration), log2diam=log2(diameter))
# fit the linear model
fit <- lm(log2diam~log2conc, data=dr)
summary(fit)
cbind(coef(fit), confint(fit))
plot(dr$log2conc, dr$log2diam)
abline(fit, col="blue")
# NOTE: predictions and more in the program file

Exercise: Run the demo!
28 Outline: Linear regression. Model assumptions and prediction. About linear models in R. Correlations.
29 Regression vs correlation. In linear regression we model a directed relationship, either a causal relation (we assume that x has an effect on y, not the other way around) or a prediction problem (we know x and want to predict y). It matters in which order the two variables are supplied to the formula in R. But sometimes we just want to know: are two different outcomes associated? In this case a correlation coefficient gives a crude measure of the strength of the association.
30 Pearson's correlation. r = Σ(x − x̄)(y − ȳ) / √(Σ(x − x̄)² · Σ(y − ȳ)²) measures the degree of linear association between two outcomes. r is symmetrical in x and y. r is always between −1 and +1. r has the same sign as the regression coefficient β (no matter whether you regress y on x or the other way around). Crude rule of thumb: |r| < 0.3: weak correlation; 0.3 < |r| < 0.5: moderate correlation; |r| > 0.5: strong correlation.
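The formula translates directly into code; a minimal sketch with invented data:

```python
import math

def pearson_r(x, y):
    """Pearson correlation: symmetric in x and y, always in [-1, +1]."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

# Perfect positive and perfect negative linear association:
up = pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
down = pearson_r([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

Swapping the roles of x and y leaves r unchanged, unlike the regression slope.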
31 Interpretation of Pearson's correlation coefficient. r = 0, no correlation: no (linear) association; occurs when x and y are mutually independent. r > 0, positive correlation: larger/smaller values of x and y tend to coincide. r < 0, negative correlation: larger values of x tend to coincide with smaller values of y and vice versa. r = ±1: perfect linear association.
32 Be careful! The correlation assumes that both x and y are random. It doesn't make sense to report a correlation coefficient if the values of x were dictated by the study protocol. The strength of the correlation depends on the study population: e.g. height and weight are more strongly correlated in pups than in adults. Interpretation should depend on the study aims: a 90% correlation may be poor if we are comparing two laboratory weights supposed to measure the same thing! Association is not the same as agreement: a device that hasn't been properly calibrated may correlate almost perfectly with one that has, but the measurements may still show a large systematic deviation.
33 Analyzing correlation in R. CKD data from course day 2:

> cor.test(ckd$pwv0, ckd$aix0)
Pearson's product-moment correlation
data: ckd$pwv0 and ckd$aix0
df = 48
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
sample estimates: cor

BUT: are the outcomes normally distributed?
34 Model assumptions for Pearson correlation. Pearson's correlation is valid under the assumption that the two variables have a two-dimensional joint normal distribution. Source: Wikipedia.
35 The 2D normal distribution as we see it. Source: Wikipedia.
36 Anscombe's quartet. Four datasets sharing a Pearson correlation of 0.816. BUT: in which case is the Pearson correlation appropriate?
37 Non-normally distributed data. Use Spearman's rank correlation instead. It makes no assumptions save that observations are independent. The formula is the same as for Pearson's correlation, only with the original data replaced by their ranks: r_S = Σ(rank(x) − mean rank)(rank(y) − mean rank) / √(Σ(rank(x) − mean rank)² · Σ(rank(y) − mean rank)²). The rank of an observation is its number on the list when all data have been ordered from the smallest value to the largest.
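Replacing data by ranks can be sketched in Python (stdlib only). Ties get the average rank, a common convention not spelled out on the slide:

```python
import math

def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    syy = sum((b - ybar) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks after sorting; tied observations share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):       # positions i..j are tied
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r

def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(ranks(x), ranks(y))

# A monotone but nonlinear relationship still gives rho = 1:
rho = spearman_rho([1.0, 2.0, 3.0, 4.0], [1.0, 8.0, 27.0, 64.0])
```

This illustrates the key difference from Pearson's r: Spearman measures monotone association, so a perfect cubic relationship scores ρ = 1.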
38 Spearman's correlation in R. Test the hypothesis H: ρ_S = 0.

> cor.test(ckd$pwv0, ckd$aix0, method="spearman")
Spearman's rank correlation rho
data: ckd$pwv0 and ckd$aix0
S = 13982
alternative hypothesis: true rho is not equal to 0
sample estimates: rho

Note: you don't get a confidence interval!
SLR output RLS. Refer to slr (code) on the Lecture Page of the class website.
SLR output RLS Refer to slr (code) on the Lecture Page of the class website. Old Faithful at Yellowstone National Park, WY: Simple Linear Regression (SLR) Analysis SLR analysis explores the linear association
More informationStatistics for exp. medical researchers Regression and Correlation
Faculty of Health Sciences Regression analysis Statistics for exp. medical researchers Regression and Correlation Lene Theil Skovgaard Sept. 28, 2015 Linear regression, Estimation and Testing Confidence
More informationScatter plot of data from the study. Linear Regression
1 2 Linear Regression Scatter plot of data from the study. Consider a study to relate birthweight to the estriol level of pregnant women. The data is below. i Weight (g / 100) i Weight (g / 100) 1 7 25
More informationScatter plot of data from the study. Linear Regression
1 2 Linear Regression Scatter plot of data from the study. Consider a study to relate birthweight to the estriol level of pregnant women. The data is below. i Weight (g / 100) i Weight (g / 100) 1 7 25
More informationREVIEW 8/2/2017 陈芳华东师大英语系
REVIEW Hypothesis testing starts with a null hypothesis and a null distribution. We compare what we have to the null distribution, if the result is too extreme to belong to the null distribution (p
More informationLab 3 A Quick Introduction to Multiple Linear Regression Psychology The Multiple Linear Regression Model
Lab 3 A Quick Introduction to Multiple Linear Regression Psychology 310 Instructions.Work through the lab, saving the output as you go. You will be submitting your assignment as an R Markdown document.
More information9. Linear Regression and Correlation
9. Linear Regression and Correlation Data: y a quantitative response variable x a quantitative explanatory variable (Chap. 8: Recall that both variables were categorical) For example, y = annual income,
More informationMarcel Dettling. Applied Statistical Regression AS 2012 Week 05. ETH Zürich, October 22, Institute for Data Analysis and Process Design
Marcel Dettling Institute for Data Analysis and Process Design Zurich University of Applied Sciences marcel.dettling@zhaw.ch http://stat.ethz.ch/~dettling ETH Zürich, October 22, 2012 1 What is Regression?
More informationBIOL 458 BIOMETRY Lab 9 - Correlation and Bivariate Regression
BIOL 458 BIOMETRY Lab 9 - Correlation and Bivariate Regression Introduction to Correlation and Regression The procedures discussed in the previous ANOVA labs are most useful in cases where we are interested
More informationUnit 6 - Introduction to linear regression
Unit 6 - Introduction to linear regression Suggested reading: OpenIntro Statistics, Chapter 7 Suggested exercises: Part 1 - Relationship between two numerical variables: 7.7, 7.9, 7.11, 7.13, 7.15, 7.25,
More informationAnalysing data: regression and correlation S6 and S7
Basic medical statistics for clinical and experimental research Analysing data: regression and correlation S6 and S7 K. Jozwiak k.jozwiak@nki.nl 2 / 49 Correlation So far we have looked at the association
More informationStatistics in medicine
Statistics in medicine Lecture 4: and multivariable regression Fatma Shebl, MD, MS, MPH, PhD Assistant Professor Chronic Disease Epidemiology Department Yale School of Public Health Fatma.shebl@yale.edu
More informationappstats27.notebook April 06, 2017
Chapter 27 Objective Students will conduct inference on regression and analyze data to write a conclusion. Inferences for Regression An Example: Body Fat and Waist Size pg 634 Our chapter example revolves
More informationVariance. Standard deviation VAR = = value. Unbiased SD = SD = 10/23/2011. Functional Connectivity Correlation and Regression.
10/3/011 Functional Connectivity Correlation and Regression Variance VAR = Standard deviation Standard deviation SD = Unbiased SD = 1 10/3/011 Standard error Confidence interval SE = CI = = t value for
More informationLecture 11: Simple Linear Regression
Lecture 11: Simple Linear Regression Readings: Sections 3.1-3.3, 11.1-11.3 Apr 17, 2009 In linear regression, we examine the association between two quantitative variables. Number of beers that you drink
More informationInferences for Regression
Inferences for Regression An Example: Body Fat and Waist Size Looking at the relationship between % body fat and waist size (in inches). Here is a scatterplot of our data set: Remembering Regression In
More informationCorrelation Analysis
Simple Regression Correlation Analysis Correlation analysis is used to measure strength of the association (linear relationship) between two variables Correlation is only concerned with strength of the
More informationDensity Temp vs Ratio. temp
Temp Ratio Density 0.00 0.02 0.04 0.06 0.08 0.10 0.12 Density 0.0 0.2 0.4 0.6 0.8 1.0 1. (a) 170 175 180 185 temp 1.0 1.5 2.0 2.5 3.0 ratio The histogram shows that the temperature measures have two peaks,
More informationLecture 14 Simple Linear Regression
Lecture 4 Simple Linear Regression Ordinary Least Squares (OLS) Consider the following simple linear regression model where, for each unit i, Y i is the dependent variable (response). X i is the independent
More informationL21: Chapter 12: Linear regression
L21: Chapter 12: Linear regression Department of Statistics, University of South Carolina Stat 205: Elementary Statistics for the Biological and Life Sciences 1 / 37 So far... 12.1 Introduction One sample
More informationCorrelation & Simple Regression
Chapter 11 Correlation & Simple Regression The previous chapter dealt with inference for two categorical variables. In this chapter, we would like to examine the relationship between two quantitative variables.
More informationUnit 6 - Simple linear regression
Sta 101: Data Analysis and Statistical Inference Dr. Çetinkaya-Rundel Unit 6 - Simple linear regression LO 1. Define the explanatory variable as the independent variable (predictor), and the response variable
More informationEstimating σ 2. We can do simple prediction of Y and estimation of the mean of Y at any value of X.
Estimating σ 2 We can do simple prediction of Y and estimation of the mean of Y at any value of X. To perform inferences about our regression line, we must estimate σ 2, the variance of the error term.
More informationy response variable x 1, x 2,, x k -- a set of explanatory variables
11. Multiple Regression and Correlation y response variable x 1, x 2,, x k -- a set of explanatory variables In this chapter, all variables are assumed to be quantitative. Chapters 12-14 show how to incorporate
More informationData Analysis and Statistical Methods Statistics 651
y 1 2 3 4 5 6 7 x Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Lecture 32 Suhasini Subba Rao Previous lecture We are interested in whether a dependent
More informationImportant note: Transcripts are not substitutes for textbook assignments. 1
In this lesson we will cover correlation and regression, two really common statistical analyses for quantitative (or continuous) data. Specially we will review how to organize the data, the importance
More informationCorrelation. Bivariate normal densities with ρ 0. Two-dimensional / bivariate normal density with correlation 0
Correlation Bivariate normal densities with ρ 0 Example: Obesity index and blood pressure of n people randomly chosen from a population Two-dimensional / bivariate normal density with correlation 0 Correlation?
More informationLinear Regression. Chapter 3
Chapter 3 Linear Regression Once we ve acquired data with multiple variables, one very important question is how the variables are related. For example, we could ask for the relationship between people
More informationA discussion on multiple regression models
A discussion on multiple regression models In our previous discussion of simple linear regression, we focused on a model in which one independent or explanatory variable X was used to predict the value
More informationCorrelation and the Analysis of Variance Approach to Simple Linear Regression
Correlation and the Analysis of Variance Approach to Simple Linear Regression Biometry 755 Spring 2009 Correlation and the Analysis of Variance Approach to Simple Linear Regression p. 1/35 Correlation
More information9 Correlation and Regression
9 Correlation and Regression SW, Chapter 12. Suppose we select n = 10 persons from the population of college seniors who plan to take the MCAT exam. Each takes the test, is coached, and then retakes the
More informationInference for Regression Inference about the Regression Model and Using the Regression Line, with Details. Section 10.1, 2, 3
Inference for Regression Inference about the Regression Model and Using the Regression Line, with Details Section 10.1, 2, 3 Basic components of regression setup Target of inference: linear dependency
More informationCorrelation and Linear Regression
Correlation and Linear Regression Correlation: Relationships between Variables So far, nearly all of our discussion of inferential statistics has focused on testing for differences between group means
More informationCorrelation and simple linear regression S5
Basic medical statistics for clinical and eperimental research Correlation and simple linear regression S5 Katarzyna Jóźwiak k.jozwiak@nki.nl November 15, 2017 1/41 Introduction Eample: Brain size and
More informationStatistics for Engineers Lecture 9 Linear Regression
Statistics for Engineers Lecture 9 Linear Regression Chong Ma Department of Statistics University of South Carolina chongm@email.sc.edu April 17, 2017 Chong Ma (Statistics, USC) STAT 509 Spring 2017 April
More informationChapter 8: Correlation & Regression
Chapter 8: Correlation & Regression We can think of ANOVA and the two-sample t-test as applicable to situations where there is a response variable which is quantitative, and another variable that indicates
More informationBiostatistics. Correlation and linear regression. Burkhardt Seifert & Alois Tschopp. Biostatistics Unit University of Zurich
Biostatistics Correlation and linear regression Burkhardt Seifert & Alois Tschopp Biostatistics Unit University of Zurich Master of Science in Medical Biology 1 Correlation and linear regression Analysis
More informationLinear Regression. In this lecture we will study a particular type of regression model: the linear regression model
1 Linear Regression 2 Linear Regression In this lecture we will study a particular type of regression model: the linear regression model We will first consider the case of the model with one predictor
More informationINTRODUCING LINEAR REGRESSION MODELS Response or Dependent variable y
INTRODUCING LINEAR REGRESSION MODELS Response or Dependent variable y Predictor or Independent variable x Model with error: for i = 1,..., n, y i = α + βx i + ε i ε i : independent errors (sampling, measurement,
More informationChapter 27 Summary Inferences for Regression
Chapter 7 Summary Inferences for Regression What have we learned? We have now applied inference to regression models. Like in all inference situations, there are conditions that we must check. We can test
More informationBiostatistics for physicists fall Correlation Linear regression Analysis of variance
Biostatistics for physicists fall 2015 Correlation Linear regression Analysis of variance Correlation Example: Antibody level on 38 newborns and their mothers There is a positive correlation in antibody
More informationRegression and correlation
6 Regression and correlation The main object of this chapter is to show how to perform basic regression analyses, including plots for model checking and display of confidence and prediction intervals.
More information14 Multiple Linear Regression
B.Sc./Cert./M.Sc. Qualif. - Statistics: Theory and Practice 14 Multiple Linear Regression 14.1 The multiple linear regression model In simple linear regression, the response variable y is expressed in
More informationBasic Business Statistics 6 th Edition
Basic Business Statistics 6 th Edition Chapter 12 Simple Linear Regression Learning Objectives In this chapter, you learn: How to use regression analysis to predict the value of a dependent variable based
More informationWeek 8: Correlation and Regression
Health Sciences M.Sc. Programme Applied Biostatistics Week 8: Correlation and Regression The correlation coefficient Correlation coefficients are used to measure the strength of the relationship or association
More informationSection 3: Simple Linear Regression
Section 3: Simple Linear Regression Carlos M. Carvalho The University of Texas at Austin McCombs School of Business http://faculty.mccombs.utexas.edu/carlos.carvalho/teaching/ 1 Regression: General Introduction
More informationOct Simple linear regression. Minimum mean square error prediction. Univariate. regression. Calculating intercept and slope
Oct 2017 1 / 28 Minimum MSE Y is the response variable, X the predictor variable, E(X) = E(Y) = 0. BLUP of Y minimizes average discrepancy var (Y ux) = C YY 2u C XY + u 2 C XX This is minimized when u
More informationCorrelation and Simple Linear Regression
Correlation and Simple Linear Regression Sasivimol Rattanasiri, Ph.D Section for Clinical Epidemiology and Biostatistics Ramathibodi Hospital, Mahidol University E-mail: sasivimol.rat@mahidol.ac.th 1 Outline
More informationLecture 18: Simple Linear Regression
Lecture 18: Simple Linear Regression BIOS 553 Department of Biostatistics University of Michigan Fall 2004 The Correlation Coefficient: r The correlation coefficient (r) is a number that measures the strength
More informationVariance Decomposition and Goodness of Fit
Variance Decomposition and Goodness of Fit 1. Example: Monthly Earnings and Years of Education In this tutorial, we will focus on an example that explores the relationship between total monthly earnings
More informationBusiness Statistics. Lecture 10: Course Review
Business Statistics Lecture 10: Course Review 1 Descriptive Statistics for Continuous Data Numerical Summaries Location: mean, median Spread or variability: variance, standard deviation, range, percentiles,
More informationDe-mystifying random effects models
De-mystifying random effects models Peter J Diggle Lecture 4, Leahurst, October 2012 Linear regression input variable x factor, covariate, explanatory variable,... output variable y response, end-point,
More informationBiostatistics 4: Trends and Differences
Biostatistics 4: Trends and Differences Dr. Jessica Ketchum, PhD. email: McKinneyJL@vcu.edu Objectives 1) Know how to see the strength, direction, and linearity of relationships in a scatter plot 2) Interpret
More informationStat 135, Fall 2006 A. Adhikari HOMEWORK 10 SOLUTIONS
Stat 135, Fall 2006 A. Adhikari HOMEWORK 10 SOLUTIONS 1a) The model is cw i = β 0 + β 1 el i + ɛ i, where cw i is the weight of the ith chick, el i the length of the egg from which it hatched, and ɛ i
More informationCh 2: Simple Linear Regression
Ch 2: Simple Linear Regression 1. Simple Linear Regression Model A simple regression model with a single regressor x is y = β 0 + β 1 x + ɛ, where we assume that the error ɛ is independent random component
More informationChapter 10. Regression. Understandable Statistics Ninth Edition By Brase and Brase Prepared by Yixun Shi Bloomsburg University of Pennsylvania
Chapter 10 Regression Understandable Statistics Ninth Edition By Brase and Brase Prepared by Yixun Shi Bloomsburg University of Pennsylvania Scatter Diagrams A graph in which pairs of points, (x, y), are
More informationChapter 14. Statistical versus Deterministic Relationships. Distance versus Speed. Describing Relationships: Scatterplots and Correlation
Chapter 14 Describing Relationships: Scatterplots and Correlation Chapter 14 1 Statistical versus Deterministic Relationships Distance versus Speed (when travel time is constant). Income (in millions of
More informationInference for Regression
Inference for Regression Section 9.4 Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c Department of Mathematics University of Houston Lecture 13b - 3339 Cathy Poliak, Ph.D. cathy@math.uh.edu
More informationRegression. Marc H. Mehlman University of New Haven
Regression Marc H. Mehlman marcmehlman@yahoo.com University of New Haven the statistician knows that in nature there never was a normal distribution, there never was a straight line, yet with normal and
More informationBusiness Statistics. Lecture 10: Correlation and Linear Regression
Business Statistics Lecture 10: Correlation and Linear Regression Scatterplot A scatterplot shows the relationship between two quantitative variables measured on the same individuals. It displays the Form
More informationIntroduction and Single Predictor Regression. Correlation
Introduction and Single Predictor Regression Dr. J. Kyle Roberts Southern Methodist University Simmons School of Education and Human Development Department of Teaching and Learning Correlation A correlation
More informationSimple Linear Regression