Multiple regression. Partial regression coefficients
- Abigail Howard
Multiple regression

We now generalise the results of simple linear regression to the case where there is one response variable Y and two predictor variables, X and Z. Data consist of n triplets of values (X_1, Z_1, Y_1), ..., (X_n, Z_n, Y_n). We want to predict the value of Y associated with a particular combination of values of X and Z, to describe the relationship between Y, X and Z, or to estimate the effect of changes in X and Z on Y. Once again there are two situations in which this type of problem arises.

a) Predictors X, Z and response Y are all random.
b) Predictors X and Z are fixed, e.g. by experimental design.

In either case, there is a prediction equation

    Y = b_0 + b_1 X + b_2 Z + e    (1)

The prediction error e is assumed N(0, σ²). Geometrically, we can imagine a 3-d picture with perpendicular horizontal axes X and Z and a vertical axis Y. The equation Y = b_0 + b_1 X + b_2 Z represents a plane surface whose position is determined by b_0 and its orientation by b_1 and b_2.

Partial regression coefficients

The coefficients b_1 and b_2 in equation (1) are partial regression coefficients. For example, b_1 represents the effect on Y of changing X while holding Z constant. The parameters b_1 and b_2 can be estimated by least squares, i.e. chosen so that the sum of squares

    Σ_{i=1}^n (Y_i − b_0 − b_1 X_i − b_2 Z_i)²

is minimized. The least squares estimates b_1 and b_2 are solutions of the two equations

    [ S_xx  S_xz ] [ b_1 ]   [ S_xy ]
    [ S_zx  S_zz ] [ b_2 ] = [ S_zy ]

For the case where all variables X, Z and Y are random, the corrected sums of squares and products would be replaced by variances and covariances. Solving the equations gives

    b_1 = (S_xy − S_xz S_zy / S_zz) / (S_xx − S²_xz / S_zz)

The formula for b_2 is obtained by switching x and z in the formula for b_1. It sometimes helps to give regression coefficients more informative labels.
For example, if we denote the regression coefficient of Y on X (ignoring Z) by b_yx and the partial regression coefficient of Y on X (accounting for Z) by b_yx,z, then

    b_yx = b_yx,z + b_yz,x b_zx
If X changes by a small amount δx, there is a small concomitant change in Z of amount δz = b_zx δx, which is invisible when we regress Y on X (ignoring Z). The total change in Y is δy = b_yx,z δx + b_yz,x δz, and b_yx = δy/δx.

Residuals and fitted values

The fitted value for the ith observation is Ŷ_i = b_0 + b_1 X_i + b_2 Z_i, and the equation which we derived for simple linear regression still holds:

    Σ_{i=1}^n (Y_i − Ȳ)² = Σ_{i=1}^n (Ŷ_i − Ȳ)² + Σ_{i=1}^n (Y_i − Ŷ_i)²

The regression sum of squares (first term on r.h.s.) simplifies to b_1 S_xy + b_2 S_zy, with 2 d.f. The total sum of squares (on l.h.s.) has n − 1 d.f., as before. The residual sum of squares (calculated as the difference between total and regression sums of squares) has n − 3 d.f. The ANOVA table now has the form

    Source of variation   DF
    Regression            2
    Residual              n − 3
    Total                 n − 1

Mean squares for regression and residual are calculated in the usual way, and F is the ratio of these two mean squares. The F statistic is used to test H_0: b_1 = b_2 = 0, i.e. that E(Y) = b_0 (a constant).

Extra sums of squares

The regression sum of squares with 2 d.f. is usually split into two components: the SSQ associated with fitting X alone (1 d.f.), and the extra SSQ obtained when Z is added to the equation (this also has 1 d.f.). Alternatively, it is the SSQ associated with fitting Z alone, plus the extra SSQ obtained by adding X to the equation. Note that, for example, the SSQ associated with fitting X alone is not generally the same as the extra SSQ obtained by fitting X after Z. However, whichever way we choose to do the split, the two components add up to the regression SSQ with 2 d.f.

Hypothesis testing

As well as testing the hypothesis that b_1 = b_2 = 0, we can test each partial regression coefficient separately for significance. The test of H_0: b_1 = 0 is based on the extra sum of squares obtained by fitting X after Z: if H_0 is true, (extra sum of squares)/s² ~ F with 1 and n − 3 d.f.
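The closed-form solution and the relation between total and partial coefficients can both be checked numerically. The sketch below uses simulated data; the variable names and coefficient values are illustrative, not taken from the notes.

```r
set.seed(1)
n <- 50
x <- rnorm(n)
z <- 0.6 * x + rnorm(n)                 # predictors deliberately correlated
y <- 2 + 1.5 * x - 0.8 * z + rnorm(n)

# Corrected sums of squares and products
Sxx <- sum((x - mean(x))^2)
Szz <- sum((z - mean(z))^2)
Sxz <- sum((x - mean(x)) * (z - mean(z)))
Sxy <- sum((x - mean(x)) * (y - mean(y)))
Szy <- sum((z - mean(z)) * (y - mean(y)))

# Closed-form partial regression coefficients (b2 obtained by switching x and z)
b1 <- (Sxy - Sxz * Szy / Szz) / (Sxx - Sxz^2 / Szz)
b2 <- (Szy - Sxz * Sxy / Sxx) / (Szz - Sxz^2 / Sxx)

fit <- lm(y ~ x + z)
all.equal(unname(coef(fit)[2:3]), c(b1, b2))       # TRUE

# The relation b_yx = b_yx,z + b_yz,x * b_zx, checked numerically
byx <- unname(coef(lm(y ~ x))[2])                  # regression on x, ignoring z
bzx <- unname(coef(lm(z ~ x))[2])
all.equal(byx, b1 + b2 * bzx)                      # TRUE
```

The identity holds exactly (not just approximately) because both sides are algebraic functions of the same sums of squares and products.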
Two examples

1) In animal breeding, Y might be the breeding value of an animal for a particular trait, and X and Z values of the trait measured on two of its relatives. For example, if Y is the milk yield breeding value of a heifer, and X and Z the milk yields of its mother and a paternal half-sister, we might predict Y from X and Z. The covariance matrix for X, Z and Y is

    [ V_P        0          (1/2)V_A ]
    [ 0          V_P        (1/4)V_A ]
    [ (1/2)V_A   (1/4)V_A   V_A      ]

The prediction equation is Ŷ = b_1 X + b_2 Z, where b_1 V_P = (1/2)V_A and b_2 V_P = (1/4)V_A, so

    Ŷ = h²((1/2)X + (1/4)Z),

where h² = V_A/V_P is the heritability of the trait.

2) The difficulty of a hill race is measured by (i) the total distance covered, and (ii) the total climb required. The file hills.csv gives the distance (miles), climb (thousands of feet), and record time (minutes) for 35 Scottish hill races. Multiple regression can be used to find a relationship between record time and the two measures of difficulty. Fitting distance alone produces a reasonable fit, but there are large residuals for two difficult races, Bens of Jura (+75.7) and Lairig Ghru (−39.4). With both distance and climb fitted, there is a significant improvement in overall fit. For example, the residuals for both Bens of Jura (+27.9) and Lairig Ghru (+3.3) are considerably reduced. Using lm() to fit this model to the hill race data gives the following estimates for the partial regression coefficients:

              Estimate   Std. Error   t value
    Distance
    Climb

and the ANOVA table is:

              Df   Sum Sq   Mean Sq   F value
    Distance
    Climb
    Residuals

The t statistic for the Climb partial regression coefficient is the square root of the F statistic for the effect of fitting Climb after Distance. These two tests are equivalent.
Miscellaneous

The results given above describe how to predict Y using two r.v.s X and Z. These results generalise in a straightforward way to the problem of predicting Y from any number of predictors X, Z, W, etc. Comments already made about residuals, outliers, cause and effect, etc., for simple linear regression remain relevant for multiple regression.

It often happens that one of the predictor variables (X, say) is of primary interest, and the other (Z) is included as a potential confounder. In other words, Z is included not because we are interested in its effect, but to ensure that the effect of X is adjusted for the effect of Z. For example, in a study to determine whether drinking coffee might have a beneficial effect on health, we might also include annual earnings. Unless we do this, an association between good health and high earnings might appear misleadingly as an association with level of coffee consumption (if, for example, high earners consume large amounts of coffee and low earners do not, because they cannot afford to).

If there is a strong correlation between X and Z, the partial regression coefficients will have large standard errors. In extreme cases, when the correlation is close to ±1 ("collinearity"), the fitting procedure breaks down and one or other variable must be dropped from the regression equation.

Multiple regression in R

The lm() function which we have already used for simple linear regression also deals with multiple regression. Diagnostic plots (residuals against fitted values, etc.) and analysis of variance tables are produced as for simple linear regression. R code for the hill-race example is given below. If fit is a fitted lm object, summary(fit) produces estimates and standard errors for partial regression coefficients. These are adjusted for all other effects in the model, and do not depend on the order of terms in the model equation. anova(fit) produces sequential sums of squares, which depend on the order of terms.
In the example below, the first row of the anova table will have a sum of squares (1 d.f.) for the effect of distance (unadjusted). The second row will have a sum of squares (1 d.f.) for the effect of climb, adjusted for the effect of distance. If the model equation had been given as time ~ climb + dist, a different ANOVA would be produced, with sums of squares for climb (unadjusted) and distance (adjusted for climb). The sum of the two sums of squares (unadjusted and adjusted) would be the same in both cases.

    hills <- read.csv("hills.csv")
    fit <- lm(time ~ dist + climb, data = hills)
    summary(fit)
    anova(fit)
    plot(fit)
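The order-dependence of sequential sums of squares can be demonstrated without the data file. The sketch below uses simulated data standing in for the hill races; the coefficient values are illustrative only.

```r
set.seed(2)
n <- 40
dist  <- rnorm(n)
climb <- 0.5 * dist + rnorm(n)          # correlated predictors, as in the races
time  <- 10 + 6 * dist + 11 * climb + rnorm(n)

a1 <- anova(lm(time ~ dist + climb))    # dist unadjusted, climb adjusted
a2 <- anova(lm(time ~ climb + dist))    # climb unadjusted, dist adjusted

# Individual rows differ between the two orders...
a1[["Sum Sq"]][1] == a2[["Sum Sq"]][2]                          # FALSE
# ...but the regression SS (sum of the first two rows) is the same
all.equal(sum(a1[["Sum Sq"]][1:2]), sum(a2[["Sum Sq"]][1:2]))   # TRUE
```

If dist and climb were uncorrelated, the adjusted and unadjusted sums of squares would coincide and the order of terms would not matter.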
Factors and dummy variables

The following is a typical example of an analysis of variance model. We will consider such models next week. Here we show that the anova model can be considered as a special case of multiple regression.

A flock of sheep consists of three breeds: Scottish Blackface, Welsh Mountain and the Blackface Welsh cross. A blood sample is taken from a random sample of each breed, and the Cu content measured. Do the breeds differ in Cu concentrations?

    Blackface    6.5   7.9   7.4   6.8   8.1
    Welsh       10.4   9.8  11.1  10.6   9.2
    Cross        6.9   9.2   8.4   7.6   9.7

Assume the data are normally distributed with constant variance and a mean value that depends on breed. This can be treated as a multiple regression E(Y) = b_0 + b_1 X + b_2 Z, where the values assigned to X and Z depend on breed as follows:

    Breed      X   Z   b_0 + b_1 X + b_2 Z
    Blackface  0   0   b_0
    Welsh      1   0   b_0 + b_1
    Cross      0   1   b_0 + b_2

The parameters b_0, b_1, b_2 represent the mean value for the Blackface breed, the difference between Welsh and Blackface, and the difference between Cross and Blackface. The multiple regression F test (with 2 d.f. in the numerator) tests H_0: b_1 = b_2 = 0 (no difference among breeds). This is easily generalized to compare any number of groups.

Dummy variables usually arise through use of factors in model formulas. For example, in the R code given below, the formula ~ breed is equivalent to ~ X + Z, where X and Z are the dummy variables described above. However, in this case, the usual splitting of the regression sums of squares does not take place.

    breed <- factor(rep(c("Blackface", "Welsh", "Cross"), c(5, 5, 5)))
    Cu <- c(6.5, 7.9, 7.4, 6.8, 8.1, 10.4, 9.8, 11.1, 10.6, 9.2,
            6.9, 9.2, 8.4, 7.6, 9.7)
    fit <- lm(Cu ~ breed)
    anova(fit)
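The equivalence between the factor model and the explicit dummy-variable regression can be verified directly: fitting ~ breed and fitting ~ X + Z give the same fitted values. The Cu values below follow the notes' example.

```r
breed <- factor(rep(c("Blackface", "Welsh", "Cross"), c(5, 5, 5)))
Cu <- c(6.5, 7.9, 7.4, 6.8, 8.1, 10.4, 9.8, 11.1, 10.6, 9.2,
        6.9, 9.2, 8.4, 7.6, 9.7)

X <- as.numeric(breed == "Welsh")       # dummy variable for Welsh
Z <- as.numeric(breed == "Cross")       # dummy variable for Cross

fit1 <- lm(Cu ~ breed)
fit2 <- lm(Cu ~ X + Z)
all.equal(unname(fitted(fit1)), unname(fitted(fit2)))   # TRUE
```

The two parameterisations span the same column space (the three group means), so the fits agree even though R's default dummy coding orders the factor levels alphabetically.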
Example model formulas

Suppose x and z are numeric, and that A is a factor. Some possibilities for the right-hand side of an lm() formula are:

    Formula     Interpretation
    x           simple linear regression
    x + z       multiple regression
    poly(x,2)   polynomial regression
    A           one-way analysis of variance
    A + x       parallel lines (analysis of covariance)
    A * x       separate lines (intercept and slope) for each level of A

For example, if factor A has two levels, the model formula for parallel lines gives E(Y) as b_0 + b_2 x for the first level of A and b_0 + b_1 + b_2 x for the second level. The common slope of the two lines is b_2, the intercept for the first level of A is b_0, and b_1 is the difference between the intercepts (the constant vertical separation of the two lines). This model ("analysis of covariance") is often used when primary interest is in the factor A, and x is a potential confounder.
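A minimal sketch of the parallel-lines model A + x, on simulated data; the factor levels and coefficient values here are illustrative only.

```r
set.seed(3)
A <- factor(rep(c("a", "b"), each = 25))
x <- rnorm(50)
y <- ifelse(A == "a", 1, 3) + 2 * x + rnorm(50, sd = 0.3)

fit <- lm(y ~ A + x)
coef(fit)
# "(Intercept)" is b_0, the intercept for the first level of A;
# "Ab" is b_1, the vertical separation between the parallel lines;
# "x" is b_2, the common slope.
```

Replacing A + x by A * x would add an interaction term A:x, giving each level of A its own slope as well as its own intercept.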
More informationUNLV University of Nevada, Las Vegas
UNLV University of Nevada, Las Vegas The Department of Mathematical Sciences Information Regarding Math 14 Final Exam Revised 8.8.016 While all material covered in the syllabus is essential for success
More informationBIOSTATISTICS NURS 3324
Simple Linear Regression and Correlation Introduction Previously, our attention has been focused on one variable which we designated by x. Frequently, it is desirable to learn something about the relationship
More informationPART I. (a) Describe all the assumptions for a normal error regression model with one predictor variable,
Concordia University Department of Mathematics and Statistics Course Number Section Statistics 360/2 01 Examination Date Time Pages Final December 2002 3 hours 6 Instructors Course Examiner Marks Y.P.
More informationOutline. Remedial Measures) Extra Sums of Squares Standardized Version of the Multiple Regression Model
Outline 1 Multiple Linear Regression (Estimation, Inference, Diagnostics and Remedial Measures) 2 Special Topics for Multiple Regression Extra Sums of Squares Standardized Version of the Multiple Regression
More informationLINEAR REGRESSION ANALYSIS. MODULE XVI Lecture Exercises
LINEAR REGRESSION ANALYSIS MODULE XVI Lecture - 44 Exercises Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur Exercise 1 The following data has been obtained on
More informationSPSS Output. ANOVA a b Residual Coefficients a Standardized Coefficients
SPSS Output Homework 1-1e ANOVA a Sum of Squares df Mean Square F Sig. 1 Regression 351.056 1 351.056 11.295.002 b Residual 932.412 30 31.080 Total 1283.469 31 a. Dependent Variable: Sexual Harassment
More informationStatistical Thinking in Biomedical Research Session #3 Statistical Modeling
Statistical Thinking in Biomedical Research Session #3 Statistical Modeling Lily Wang, PhD Department of Biostatistics (modified from notes by J.Patrie, R.Abbott, U of Virginia and WD Dupont, Vanderbilt
More informationApplied Regression Modeling: A Business Approach Chapter 3: Multiple Linear Regression Sections
Applied Regression Modeling: A Business Approach Chapter 3: Multiple Linear Regression Sections 3.4 3.6 by Iain Pardoe 3.4 Model assumptions 2 Regression model assumptions.............................................
More informationPsychology Seminar Psych 406 Dr. Jeffrey Leitzel
Psychology Seminar Psych 406 Dr. Jeffrey Leitzel Structural Equation Modeling Topic 1: Correlation / Linear Regression Outline/Overview Correlations (r, pr, sr) Linear regression Multiple regression interpreting
More informationDr. Junchao Xia Center of Biophysics and Computational Biology. Fall /1/2016 1/46
BIO5312 Biostatistics Lecture 10:Regression and Correlation Methods Dr. Junchao Xia Center of Biophysics and Computational Biology Fall 2016 11/1/2016 1/46 Outline In this lecture, we will discuss topics
More informationAlternatives to Difference Scores: Polynomial Regression and Response Surface Methodology. Jeffrey R. Edwards University of North Carolina
Alternatives to Difference Scores: Polynomial Regression and Response Surface Methodology Jeffrey R. Edwards University of North Carolina 1 Outline I. Types of Difference Scores II. Questions Difference
More informationUNIVERSITY OF TORONTO SCARBOROUGH Department of Computer and Mathematical Sciences Midterm Test, October 2013
UNIVERSITY OF TORONTO SCARBOROUGH Department of Computer and Mathematical Sciences Midterm Test, October 2013 STAC67H3 Regression Analysis Duration: One hour and fifty minutes Last Name: First Name: Student
More informationChapter 4: Regression Models
Sales volume of company 1 Textbook: pp. 129-164 Chapter 4: Regression Models Money spent on advertising 2 Learning Objectives After completing this chapter, students will be able to: Identify variables,
More informationMANY BILLS OF CONCERN TO PUBLIC
- 6 8 9-6 8 9 6 9 XXX 4 > -? - 8 9 x 4 z ) - -! x - x - - X - - - - - x 00 - - - - - x z - - - x x - x - - - - - ) x - - - - - - 0 > - 000-90 - - 4 0 x 00 - -? z 8 & x - - 8? > 9 - - - - 64 49 9 x - -
More informationSimple Linear Regression
Simple Linear Regression EdPsych 580 C.J. Anderson Fall 2005 Simple Linear Regression p. 1/80 Outline 1. What it is and why it s useful 2. How 3. Statistical Inference 4. Examining assumptions (diagnostics)
More informationMultiple random effects. Often there are several vectors of random effects. Covariance structure
Models with multiple random effects: Repeated Measures and Maternal effects Bruce Walsh lecture notes SISG -Mixed Model Course version 8 June 01 Multiple random effects y = X! + Za + Wu + e y is a n x
More informationA discussion on multiple regression models
A discussion on multiple regression models In our previous discussion of simple linear regression, we focused on a model in which one independent or explanatory variable X was used to predict the value
More informationThe General Linear Model. April 22, 2008
The General Linear Model. April 22, 2008 Multiple regression Data: The Faroese Mercury Study Simple linear regression Confounding The multiple linear regression model Interpretation of parameters Model
More information
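The closed-form solution of the normal equations above, and the identity relating the simple and partial regression coefficients, can be checked numerically. The sketch below uses hypothetical synthetic data (the coefficients 1.0, 2.0, -1.5 and the helper `S` are illustrative, not from the text) and compares the closed-form estimates against a general least-squares solver.

```python
import numpy as np

# Hypothetical synthetic data: Y depends on two correlated predictors X and Z.
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=n)
Z = 0.5 * X + rng.normal(size=n)
Y = 1.0 + 2.0 * X - 1.5 * Z + rng.normal(size=n)

def S(a, b):
    """Corrected sum of products, e.g. S(X, Y) = sum (X_i - Xbar)(Y_i - Ybar)."""
    return np.sum((a - a.mean()) * (b - b.mean()))

Sxx, Szz, Sxz = S(X, X), S(Z, Z), S(X, Z)
Sxy, Szy = S(X, Y), S(Z, Y)

# Partial regression coefficients from the closed-form solution in the text;
# b2 is obtained by switching x and z in the formula for b1.
b1 = (Sxy - Sxz * Szy / Szz) / (Sxx - Sxz**2 / Szz)
b2 = (Szy - Sxz * Sxy / Sxx) / (Szz - Sxz**2 / Sxx)
b0 = Y.mean() - b1 * X.mean() - b2 * Z.mean()

# Cross-check against a general least-squares fit of Y on (1, X, Z).
A = np.column_stack([np.ones(n), X, Z])
coef, *_ = np.linalg.lstsq(A, Y, rcond=None)
print(np.allclose([b0, b1, b2], coef))  # True

# The identity b_yx = b_yx.z + b_yz.x * b_zx from the text:
b_yx = Sxy / Sxx  # simple regression of Y on X, ignoring Z
b_zx = Sxz / Sxx  # simple regression of Z on X
print(np.allclose(b_yx, b1 + b2 * b_zx))  # True
```

The second check follows directly from the first normal equation: dividing S_xx·b1 + S_xz·b2 = S_xy through by S_xx gives b_yx = b1 + b2·b_zx exactly, not just approximately.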