Correlation and simple linear regression S5
1 Basic medical statistics for clinical and experimental research
Correlation and simple linear regression S5
Katarzyna Jóźwiak
November 15, 2017
2 Introduction
Example: Brain size and body weight
[Table of subject id, brain size (MRI total pixel count per 10,000), and body weight (pounds); the data values were lost in transcription.]
3 Introduction
Example: Brain size and body weight
[Scatter plot of brain size versus weight, with subjects 4 and 24 highlighted.]
4 Relationship between two numerical variables
If a linear relationship between x and y appears to be reasonable from the scatter plot, we can take the next step and:
1. Calculate Pearson's product moment correlation coefficient between x and y
- Measures how closely the data points on the scatter plot resemble a straight line
2. Perform a simple linear regression analysis
- Finds the equation of the line that best describes the relationship between the variables seen in a scatter plot
5 Correlation
The sample Pearson's product moment correlation coefficient (or correlation coefficient) between variables x and y is calculated as

r(x, y) = (1 / (n − 1)) Σ_{i=1}^{n} ((x_i − x̄) / s_x) ((y_i − ȳ) / s_y),

where:
- {(x_i, y_i) : i = 1, ..., n} is a random sample of n observations on x and y,
- x̄ and ȳ are the sample means of x and y respectively,
- s_x and s_y are the sample standard deviations of x and y respectively.
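As a small numerical illustration (not part of the original slides), the formula above can be computed directly; the function name `pearson_r` and the toy data are my own:

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation:
    r = (1/(n-1)) * sum(((x_i - xbar)/s_x) * ((y_i - ybar)/s_y))."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    s_x = math.sqrt(sum((xi - xbar) ** 2 for xi in x) / (n - 1))
    s_y = math.sqrt(sum((yi - ybar) ** 2 for yi in y) / (n - 1))
    return sum((xi - xbar) * (yi - ybar)
               for xi, yi in zip(x, y)) / ((n - 1) * s_x * s_y)

# Toy data: y is exactly 2x, so the linear association is perfect
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]
print(pearson_r(x, y))  # 1.0 (up to floating-point rounding)
```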
6 Correlation
Properties of r:
- r estimates the true population correlation coefficient ρ
- r takes on any value between −1 and 1
- The magnitude of r indicates the strength of a linear relationship between x and y:
  - r = −1 or 1 means perfect linear association
  - The closer r is to −1 or 1, the stronger the linear association (e.g. r = −0.1 (weak association) vs r = 0.85 (strong association))
  - r = 0 indicates no linear association (but the relationship can be e.g. non-linear)
- The sign of r indicates the direction of association:
  - r > 0 implies a positive relationship, i.e. the two variables tend to move in the same direction
  - r < 0 implies a negative relationship, i.e. the two variables tend to move in opposite directions
7 Correlation
Properties of r (cont'd):
- r(a·x + b, c·y + d) = r(x, y), where a > 0, c > 0, and b and d are constants
- r(x, y) = r(y, x)
- r ≠ 0 does not imply causation! Just because two variables are correlated does not necessarily mean that one causes the other!
- r² is called the coefficient of determination
  - 0 ≤ r² ≤ 1
  - Represents the proportion of total variation in one variable that is explained by the other
  - For example: a coefficient of determination between body weight and age of 0.60 means that 60% of total variation in body weight is explained by age alone and the remaining 40% is explained by other factors
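The invariance and symmetry properties listed above are easy to verify numerically. A minimal sketch (the function name `corr` and the toy data are my own; the formula is the algebraically equivalent r = Sxy / √(Sxx·Syy)):

```python
import math

def corr(x, y):
    """Pearson correlation via r = Sxy / sqrt(Sxx * Syy)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [2.0, 4.0, 5.0, 7.0, 9.0]
y = [1.0, 3.0, 4.0, 6.0, 5.0]
r = corr(x, y)

# r(a*x + b, c*y + d) = r(x, y) for a > 0, c > 0
assert abs(corr([3.0 * v + 10.0 for v in x], [0.5 * v - 2.0 for v in y]) - r) < 1e-12
# r(x, y) = r(y, x)
assert abs(corr(y, x) - r) < 1e-12
# r^2 is the coefficient of determination, between 0 and 1
assert 0.0 <= r ** 2 <= 1.0
```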
8 Correlation
[Panel of scatter plots illustrating r = −1, r = 1, r = 0.8, r = −0.8, r = 0, 0 < r < 1, and −1 < r < 0.]
Don't interpret r without looking at the scatter plot!
9 Correlation
Hypothesis test for the population correlation coefficient ρ:
H0: ρ = 0 (there is no linear relationship between y and x)
H1: ρ ≠ 0 (there is a linear relationship between y and x)
Under H0, the test statistic

T = r √((n − 2) / (1 − r²))

follows a Student-t distribution with n − 2 degrees of freedom.
This test assumes that the variables x and y are normally distributed.
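The test statistic is a one-liner to compute. A sketch with made-up numbers (r = 0.5, n = 20 are mine, chosen only for illustration):

```python
import math

def corr_t_stat(r, n):
    """T = r * sqrt((n - 2) / (1 - r^2)); under H0 it follows t with n - 2 df."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Suppose r = 0.5 was observed in n = 20 pairs
T = corr_t_stat(0.5, 20)
print(round(T, 3))  # 2.449
```

Comparing |T| ≈ 2.449 with the two-sided 5% critical value of the t distribution with 18 degrees of freedom (about 2.101) would lead to rejecting H0 in this hypothetical case.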
10 Correlation
Example: Brain size and body weight
What is the magnitude and sign of the correlation coefficient between brain size and weight?
11 Correlation
Example: Brain size and body weight
[SPSS output: table of Pearson correlations between Weight and Brain with two-tailed significance and N; the numerical values were lost in transcription. Footnote: **. Correlation is significant at the 0.01 level (2-tailed).]
12 Pearson's product moment correlation coefficient measures the strength and direction of a linear association between x and y.
Simple linear regression finds an equation (mathematical model) that describes the relationship between the two variables, so we can predict values of one variable using values of the other variable.
Unlike correlation, regression requires:
- a dependent variable y (outcome/response variable): the variable being predicted (always on the vertical or y-axis)
- an independent variable x (explanatory/predictor variable): the variable used for prediction (always on the horizontal or x-axis)
13 Simple linear regression postulates that in the population

y = (α + βx) + ε,

where:
- y is the dependent variable
- x is the independent variable
- α and β are parameters called the population regression coefficients
  - α is called the intercept or constant term
  - β is called the slope
- ε is the random error term
14 [Scatter plot of y versus x.]
15 [Plot of the population regression function E(y|x) = α + βx, with E(y_i) marked at x = x_i.]
E(y_i) is the mean value of y when x = x_i
E(y|x) = α + βx is the population regression function
16 [Plot of E(y|x) = α + βx, illustrating the intercept α and slope β.]
- α is the y-intercept of the population regression function, i.e. the mean value of y when x equals 0
- β is the slope of the population regression function, i.e. the mean (or expected) change in y associated with a 1-unit increase in the value of x
- cβ is the mean change in y for a c-unit increase in the value of x
- α and β are estimated from the sample data using the least squares method (usually)
17 [Plot of the fitted line ŷ = a + bx, showing the residual e_i = y_i − ŷ_i for observation i.]
The least squares method chooses a and b (estimates for α and β) to minimize the sum of the squares of the residuals:

Σ_{i=1}^{n} e_i² = Σ_{i=1}^{n} (y_i − ŷ_i)² = Σ_{i=1}^{n} [y_i − (a + b x_i)]²
18 The least squares estimates for β and α are:

b = Σ_{i=1}^{n} (x_i − x̄)(y_i − ȳ) / Σ_{i=1}^{n} (x_i − x̄)²  and  a = ȳ − b x̄,

where x̄ and ȳ are the respective sample means of x and y.
Note that:

b = r(x, y) · (s_y / s_x),

where r(x, y) is the sample product moment correlation between x and y, and s_x and s_y are the sample standard deviations of x and y.
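The estimates b and a can be computed directly from these formulas. A sketch (function name and toy data are mine; the data are scattered around y = 1 + 2x):

```python
def least_squares(x, y):
    """b = sum((x_i - xbar)(y_i - ybar)) / sum((x_i - xbar)^2), a = ybar - b * xbar."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) \
        / sum((xi - xbar) ** 2 for xi in x)
    a = ybar - b * xbar
    return a, b

# Made-up data scattered around y = 1 + 2x
x = [1.0, 2.0, 3.0, 4.0]
y = [3.1, 4.9, 7.1, 8.9]
a, b = least_squares(x, y)
print(round(a, 2), round(b, 2))  # 1.1 1.96
```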
19 Test of H0: β = 0 versus H1: β ≠ 0
1. t-test:
- Test statistic: T = b / SE(b), where SE(b) is the standard error of b calculated from the data
- Under H0, T follows a Student-t distribution with n − 2 degrees of freedom
2. F-test:
- Test statistic: F = (b / SE(b))² = T², where SE(b) and T are as above
- Under H0, F follows an F distribution with 1 and n − 2 degrees of freedom
The t-test and the F-test lead to the same outcome.
The test of zero intercept α is of less interest, unless x = 0 is meaningful.
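The relationship F = T² can be checked numerically. A sketch (names and data are mine; it uses the standard formula SE(b) = √(MSE / Sxx), which the slide does not spell out):

```python
import math

def slope_test(x, y):
    """Return (T, F) for H0: beta = 0, with SE(b) = sqrt(MSE / Sxx) and F = T^2."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    a = ybar - b * xbar
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))  # residual sum of squares
    se_b = math.sqrt(sse / (n - 2) / sxx)                        # standard error of b
    t = b / se_b
    return t, t ** 2

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
t, f = slope_test(x, y)
assert abs(f - t ** 2) < 1e-9   # the F statistic is the square of the t statistic
print(round(t, 1), round(f, 1))
```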
20 Example: Brain size (MRI total pixel count per 10,000) and body weight (pounds)
[SPSS output: Coefficients table (constant and Weight; unstandardized and standardized coefficients, t, Sig.) and ANOVA table (regression, residual, total); dependent variable Brain, predictor Weight. The numerical estimates were lost in transcription.]
The fitted equation has the form Brain = a + b · Weight.
21 Example: Brain size (MRI total pixel count per 10,000) and body weight (pounds)
[Scatter plot of Brain versus Weight with the fitted line ŷ = a + b · Weight; the coefficient values were lost in transcription.]
22 Example: Blood pressure (mmHg) and body weight (kg) in 20 patients with hypertension¹
[Table of BP and Weight values for the 20 patients.]
1 Daniel, W.W. and Cross, C.L. (2013). Biostatistics: A Foundation for Analysis in the Health Sciences, 10th edition.
23 Example: Blood pressure (mmHg) and body weight (kg) in 20 patients with hypertension
[SPSS output: Coefficients table (constant and Weight) and ANOVA table; dependent variable BP, predictor Weight. The numerical estimates were lost in transcription.]
The fitted equation has the form BP = a + b · Weight.
24 Example: Blood pressure (mmHg) and body weight (kg) in 20 patients with hypertension
[Scatter plot of BP versus Weight with the fitted line; the coefficient values were lost in transcription.]
25 Standardized coefficients
- Obtained by standardizing both y and x (i.e. converting them into z-scores) and re-running the regression
- The standardized intercept equals zero and the standardized slope for x equals the sample correlation coefficient
[SPSS output: correlation table and coefficients table for Weight and Brain; the numerical values were lost in transcription.]
26 Standardized coefficients (cont'd)
- Obtained by standardizing both y and x (i.e. converting them into z-scores) and re-running the regression
- The standardized intercept equals zero and the standardized slope for x equals the sample correlation coefficient
- Of greater concern in multiple linear regression, where the predictors are expressed in different units
- Standardization removes the dependence of the regression coefficients on the units of measurement of y and the x's, so they can be meaningfully compared
- The larger the standardized coefficient (in absolute value), the greater the contribution of the respective variable to the prediction of y
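The claim that the standardized intercept is zero and the standardized slope equals r can be verified numerically. A sketch (function names and toy data are mine):

```python
import math

def zscores(v):
    """Convert to z-scores using the sample standard deviation (n - 1 denominator)."""
    n = len(v)
    m = sum(v) / n
    s = math.sqrt(sum((x - m) ** 2 for x in v) / (n - 1))
    return [(x - m) / s for x in v]

def fit(x, y):
    """Least-squares (intercept, slope)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    b = sum((a - mx) * (c - my) for a, c in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - b * mx, b

def corr(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (c - my) for a, c in zip(x, y)) / math.sqrt(
        sum((a - mx) ** 2 for a in x) * sum((c - my) ** 2 for c in y))

x = [2.0, 4.0, 5.0, 7.0, 9.0]
y = [1.0, 3.0, 4.0, 6.0, 5.0]
a_std, b_std = fit(zscores(x), zscores(y))
assert abs(a_std) < 1e-12                 # standardized intercept is zero
assert abs(b_std - corr(x, y)) < 1e-12    # standardized slope equals r
```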
27 Linear regression is only appropriate when the following assumptions are satisfied:
1. Independence: the observations are independent, i.e. there is only one pair of observations per subject
2. Linearity: the relationship between x and y is linear
3. Constant variance: the variance of y is constant for all values of x
4. Normality: y has a normal distribution
28 Simple linear regression
Checking the linearity assumption:
1. Make a scatter plot of y versus x: the points should generally form a straight line
2. Plot the residuals against the explanatory variable x: the points should present a random scatter around zero; there should be no systematic pattern
[Residual plots illustrating linearity and lack of linearity.]
29 Simple linear regression
Checking the constant variance assumption:
Make a residual plot, i.e. plot the residuals against the fitted values of y (ŷ_i = a + b x_i): the points should present a random scatter
[Residual plots illustrating constant and non-constant variance.]
30 Example: Blood pressure and body weight
[Residual plot for the fitted model.]
31 Checking the normality assumption:
1. Draw a histogram of y or the residuals and eyeball the result
2. Make a normal probability plot (P-P plot) of the residuals, i.e. plot the expected cumulative probability of a normal distribution versus the observed cumulative probability at each value of the residual: the points should form a straight diagonal line
32 Example: Blood pressure and body weight
[Histogram and P-P plot of the residuals.]
33 Simple linear regression
Assessing goodness of fit
- The estimated regression line is the best one available (in the least-squares sense)
- Yet, it can still be a very poor fit to the observed data
[Scatter plots illustrating a good fit and a bad fit.]
34 To assess goodness of fit of a regression line (i.e. how well the line fits the data) we can:
1. Calculate the correlation coefficient R between the predicted and observed values of y
- A higher absolute value of R indicates a better fit (the predicted and observed values of y are closer to each other)
2. Calculate R² (R Square in SPSS)
- 0 ≤ R² ≤ 1
- A higher value of R² indicates a better fit
- R² = 1 indicates a perfect fit (i.e. ŷ_i = y_i for each i)
- R² = 0 indicates a very poor fit
35 Alternatively, R² can be calculated as

R² = Σ_{i=1}^{n} (ŷ_i − ȳ)² / Σ_{i=1}^{n} (y_i − ȳ)² = (variation in y explained by x) / (total variation in y)

- R² is interpreted as the proportion of total variability in y explained by the explanatory variable x
- R² = 1: x explains all variability in y
- R² = 0: x does not explain any variability in y
- R² is usually expressed as a percentage; e.g., R² = 0.93 indicates that 93% of total variation in y can be explained by x
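This sums-of-squares definition of R² is straightforward to compute for a fitted least-squares line. A sketch (function name and data are mine):

```python
def r_squared(x, y):
    """R^2 = sum((yhat_i - ybar)^2) / sum((y_i - ybar)^2) for the least-squares line."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) \
        / sum((xi - xbar) ** 2 for xi in x)
    a = ybar - b * xbar
    yhat = [a + b * xi for xi in x]
    return sum((yh - ybar) ** 2 for yh in yhat) / sum((yi - ybar) ** 2 for yi in y)

# Perfect linear data: x explains all variability in y
print(r_squared([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]))  # 1.0
# Noisy data gives 0 < R^2 < 1
print(round(r_squared([1.0, 2.0, 3.0, 4.0], [2.2, 3.6, 6.5, 7.7]), 3))
```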
36 Example: Blood pressure and body weight
Model Summary
Model 1: R = 0.950, R Square = 0.903, Adjusted R Square = 0.897, Std. Error of the Estimate = 1.74050
a. Predictors: (Constant), Weight
37 Prediction: interpolation versus extrapolation
[Plot showing the range of actual data, interpolation within that range, and extrapolation beyond it, with possible patterns of additional data.]
Extrapolation beyond the range of the data is risky!
38 Categorical explanatory variable
- So far we assumed that the predictor variable x is numerical
- We can also study an association between y and a categorical x, e.g. between blood pressure and gender or between brain size and ethnicity
- Categorical variables can be incorporated through dummy variables that take on the values 0 and 1; to include a categorical variable with p categories, p − 1 dummy variables are required
39 Categorical explanatory variable
Example: variable blood group with 4 categories: A, B, AB, 0
1. Dummy variables for all categories:
- x_A = 1 if blood group is A, 0 otherwise
- x_B = 1 if blood group is B, 0 otherwise
- x_AB = 1 if blood group is AB, 0 otherwise
- x_0 = 1 if blood group is 0, 0 otherwise
2. One category is chosen as the reference category: a category that results in useful comparisons (e.g. exposed versus non-exposed, experimental versus standard treatment) or a category with a large number of subjects
3. In the model we include all dummies except the one corresponding to the reference category
40 Categorical explanatory variable
The model with blood group 0 as reference category is

y = α + β_A x_A + β_B x_B + β_AB x_AB + ε

and its estimated counterpart is

ŷ = a + b_A x_A + b_B x_B + b_AB x_AB

Estimation of the model parameters requires running a multiple linear regression, unless the explanatory variable has only two categories (e.g. gender).
Given that y represents IQ score, the estimated coefficients are interpreted as follows:
- a is the mean IQ for subjects with blood group 0, i.e. the reference category
- Each b represents the mean difference in IQ between subjects with the blood group represented by the respective dummy variable and subjects with blood group 0 (the reference category)
41 Categorical explanatory variable
Specifically:
- b_A is the mean difference in IQ between subjects with blood group A and subjects with blood group 0
- b_B is the mean difference in IQ between subjects with blood group B and subjects with blood group 0
- b_AB is the mean difference in IQ between subjects with blood group AB and subjects with blood group 0
A test for the significance of a categorical explanatory variable with p levels involves the hypothesis that the coefficients of all p − 1 dummy variables are zero. For that purpose, we need to use an overall F-test (next lecture) and not a t-test. The t-test can be used only when the variable is binary.
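When the only predictors are the dummy variables of a single categorical variable, the least-squares fit reproduces the group means, so these interpretations can be illustrated without matrix algebra. A sketch with entirely made-up IQ data (group sizes and scores are hypothetical):

```python
def mean(v):
    return sum(v) / len(v)

# Hypothetical IQ scores by blood group (illustration only)
iq = {
    "0":  [100.0, 102.0, 98.0],   # reference category
    "A":  [105.0, 107.0],
    "B":  [95.0, 99.0],
    "AB": [110.0, 108.0],
}

# With only dummy predictors, the least-squares fitted value for each group
# is that group's mean, so:
a = mean(iq["0"])                                   # intercept = mean of reference group
b = {g: mean(iq[g]) - a for g in ("A", "B", "AB")}  # each slope = mean difference vs "0"

print(a)  # 100.0
print(b)  # {'A': 6.0, 'B': -3.0, 'AB': 9.0}
```

Here b_A = 6.0 says that, in this toy data, subjects with blood group A score on average 6 IQ points higher than subjects with blood group 0.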
More informationCorrelation: Relationships between Variables
Correlation Correlation: Relationships between Variables So far, nearly all of our discussion of inferential statistics has focused on testing for differences between group means However, researchers are
More informationRegression. Marc H. Mehlman University of New Haven
Regression Marc H. Mehlman marcmehlman@yahoo.com University of New Haven the statistician knows that in nature there never was a normal distribution, there never was a straight line, yet with normal and
More informationBivariate Relationships Between Variables
Bivariate Relationships Between Variables BUS 735: Business Decision Making and Research 1 Goals Specific goals: Detect relationships between variables. Be able to prescribe appropriate statistical methods
More informationSTK4900/ Lecture 3. Program
STK4900/9900 - Lecture 3 Program 1. Multiple regression: Data structure and basic questions 2. The multiple linear regression model 3. Categorical predictors 4. Planned experiments and observational studies
More informationESP 178 Applied Research Methods. 2/23: Quantitative Analysis
ESP 178 Applied Research Methods 2/23: Quantitative Analysis Data Preparation Data coding create codebook that defines each variable, its response scale, how it was coded Data entry for mail surveys and
More informationThe scatterplot is the basic tool for graphically displaying bivariate quantitative data.
Bivariate Data: Graphical Display The scatterplot is the basic tool for graphically displaying bivariate quantitative data. Example: Some investors think that the performance of the stock market in January
More informationLecture 11: Simple Linear Regression
Lecture 11: Simple Linear Regression Readings: Sections 3.1-3.3, 11.1-11.3 Apr 17, 2009 In linear regression, we examine the association between two quantitative variables. Number of beers that you drink
More information16.400/453J Human Factors Engineering. Design of Experiments II
J Human Factors Engineering Design of Experiments II Review Experiment Design and Descriptive Statistics Research question, independent and dependent variables, histograms, box plots, etc. Inferential
More informationSimple Linear Regression Using Ordinary Least Squares
Simple Linear Regression Using Ordinary Least Squares Purpose: To approximate a linear relationship with a line. Reason: We want to be able to predict Y using X. Definition: The Least Squares Regression
More informationCorrelation and Regression
Correlation and Regression October 25, 2017 STAT 151 Class 9 Slide 1 Outline of Topics 1 Associations 2 Scatter plot 3 Correlation 4 Regression 5 Testing and estimation 6 Goodness-of-fit STAT 151 Class
More informationChapter 11. Correlation and Regression
Chapter 11. Correlation and Regression The word correlation is used in everyday life to denote some form of association. We might say that we have noticed a correlation between foggy days and attacks of
More informationBivariate Data: Graphical Display The scatterplot is the basic tool for graphically displaying bivariate quantitative data.
Bivariate Data: Graphical Display The scatterplot is the basic tool for graphically displaying bivariate quantitative data. Example: Some investors think that the performance of the stock market in January
More informationSingle and multiple linear regression analysis
Single and multiple linear regression analysis Marike Cockeran 2017 Introduction Outline of the session Simple linear regression analysis SPSS example of simple linear regression analysis Additional topics
More informationSelf-Assessment Weeks 8: Multiple Regression with Qualitative Predictors; Multiple Comparisons
Self-Assessment Weeks 8: Multiple Regression with Qualitative Predictors; Multiple Comparisons 1. Suppose we wish to assess the impact of five treatments while blocking for study participant race (Black,
More informationSimple Linear Regression
9-1 l Chapter 9 l Simple Linear Regression 9.1 Simple Linear Regression 9.2 Scatter Diagram 9.3 Graphical Method for Determining Regression 9.4 Least Square Method 9.5 Correlation Coefficient and Coefficient
More informationChapter 8. Linear Regression. Copyright 2010 Pearson Education, Inc.
Chapter 8 Linear Regression Copyright 2010 Pearson Education, Inc. Fat Versus Protein: An Example The following is a scatterplot of total fat versus protein for 30 items on the Burger King menu: Copyright
More informationSTATISTICAL DATA ANALYSIS IN EXCEL
Microarra Center STATISTICAL DATA ANALYSIS IN EXCEL Lecture 5 Linear Regression dr. Petr Nazarov 14-1-213 petr.nazarov@crp-sante.lu Statistical data analsis in Ecel. 5. Linear regression OUTLINE Lecture
More informationChapter 11. Correlation and Regression
Chapter 11 Correlation and Regression Correlation A relationship between two variables. The data can be represented b ordered pairs (, ) is the independent (or eplanator) variable is the dependent (or
More informationLAB 3 INSTRUCTIONS SIMPLE LINEAR REGRESSION
LAB 3 INSTRUCTIONS SIMPLE LINEAR REGRESSION In this lab you will first learn how to display the relationship between two quantitative variables with a scatterplot and also how to measure the strength of
More informationSimple Linear Regression
Simple Linear Regression ST 370 Regression models are used to study the relationship of a response variable and one or more predictors. The response is also called the dependent variable, and the predictors
More informationLINEAR REGRESSION ANALYSIS. MODULE XVI Lecture Exercises
LINEAR REGRESSION ANALYSIS MODULE XVI Lecture - 44 Exercises Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur Exercise 1 The following data has been obtained on
More information22s:152 Applied Linear Regression
22s:152 Applied Linear Regression Chapter 7: Dummy Variable Regression So far, we ve only considered quantitative variables in our models. We can integrate categorical predictors by constructing artificial
More informationComparing Nested Models
Comparing Nested Models ST 370 Two regression models are called nested if one contains all the predictors of the other, and some additional predictors. For example, the first-order model in two independent
More informationCHAPTER 5 LINEAR REGRESSION AND CORRELATION
CHAPTER 5 LINEAR REGRESSION AND CORRELATION Expected Outcomes Able to use simple and multiple linear regression analysis, and correlation. Able to conduct hypothesis testing for simple and multiple linear
More informationBinary Logistic Regression
The coefficients of the multiple regression model are estimated using sample data with k independent variables Estimated (or predicted) value of Y Estimated intercept Estimated slope coefficients Ŷ = b
More informationAbout Bivariate Correlations and Linear Regression
About Bivariate Correlations and Linear Regression TABLE OF CONTENTS About Bivariate Correlations and Linear Regression... 1 What is BIVARIATE CORRELATION?... 1 What is LINEAR REGRESSION... 1 Bivariate
More informationAP Statistics Cumulative AP Exam Study Guide
AP Statistics Cumulative AP Eam Study Guide Chapters & 3 - Graphs Statistics the science of collecting, analyzing, and drawing conclusions from data. Descriptive methods of organizing and summarizing statistics
More informationInference for Regression Inference about the Regression Model and Using the Regression Line
Inference for Regression Inference about the Regression Model and Using the Regression Line PBS Chapter 10.1 and 10.2 2009 W.H. Freeman and Company Objectives (PBS Chapter 10.1 and 10.2) Inference about
More information