General Linear Models with General Linear Hypothesis Tests and Likelihood Ratio Tests
- Emory Oswin Warren
1 General Linear Models with General Linear Hypothesis Tests and Likelihood Ratio Tests
2 Background
Linear combinations of Normals are Normal:
    $X_n \sim N(\mu, \Sigma) \implies AX \sim N(A\mu, A\Sigma A^\top)$
A sum of squared, standardized Normals is Chi-square:
    $(X - \mu)^\top \Sigma^{-1} (X - \mu) \sim \chi^2_n$
Regression estimates have a Normal distribution:
    $b \sim N(\beta, \sigma^2 (X^\top X)^{-1})$
3 Background
A General Linear Hypothesis Test (GLHT) can be written as:
    $L\beta = h$
Often, with $h = 0$
NB: $L_{r \times p}$ has row rank $r$
Recall:
    $\dfrac{(n - p)\,MSE}{\sigma^2} = \dfrac{SSE}{\sigma^2} = \dfrac{e^\top e}{\sigma^2} \sim \chi^2_{n-p}$
4 Background
    $\mathrm{Cov}(e, b) = E(e b^\top) - E(e)\,E(b)^\top = E(e b^\top) = 0$
Zero covariance implies independence, for MVN only:
    $b \perp e \implies b \perp e^\top e \implies b \perp SSE$
Just like $\bar{X} \perp s^2$
5 GLHT
Contrast distribution is also Normal:
    $Lb \sim N(h, \sigma^2 L (X^\top X)^{-1} L^\top)$
Consider the test statistic:
    $(Lb - h)^\top [\sigma^2 L (X^\top X)^{-1} L^\top]^{-1} (Lb - h) \sim \chi^2_r$
6 GLHT
    $\dfrac{(Lb - h)^\top [\sigma^2 L (X^\top X)^{-1} L^\top]^{-1} (Lb - h) / r}{\dfrac{(n - p)\,MSE}{\sigma^2} / (n - p)} = \dfrac{(Lb - h)^\top [L (X^\top X)^{-1} L^\top]^{-1} (Lb - h)}{r\,MSE} \sim \dfrac{\chi^2_r / r}{\chi^2_{n-p} / (n - p)} = F_{r,\,n-p}$
Provided those two $\chi^2$s are independent
They are, since one is a function of SSE and the other is a function of the regression estimates
7 tl;dr
You can test a set of contrasts with the following statistic:
    $F_{obs} = \dfrac{(Lb - h)^\top [L (X^\top X)^{-1} L^\top]^{-1} (Lb - h)}{r\,MSE}$
Under the null hypothesis, it has an $F_{r,\,n-p}$ distribution
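The statistic above can be computed directly from a fitted `lm` object. The following is a minimal sketch, not the lecture's own code: the helper name `glht_F`, the simulated data, and the variable names `age` and `arm` are all assumptions made for illustration. It compares the hand-computed F with the nested-model F from `anova()`, which should agree when the contrast corresponds to dropping terms.

```r
# Simulated three-arm trial with an age covariate (illustrative data only)
set.seed(1)
n   <- 60
age <- runif(n, 20, 70)
arm <- factor(rep(c("Placebo", "A", "B"), each = n / 3),
              levels = c("Placebo", "A", "B"))
y   <- 10 + 0.2 * age + 3 * (arm == "A") + 1 * (arm == "B") + rnorm(n, sd = 2)
fit <- lm(y ~ age + arm)   # coefficients: (Intercept), age, armA, armB

# Hypothetical helper: F statistic for H0: L beta = h
glht_F <- function(fit, L, h = rep(0, nrow(L))) {
  b   <- coef(fit)
  X   <- model.matrix(fit)
  r   <- nrow(L)                        # number of contrasts (row rank of L)
  df2 <- df.residual(fit)               # n - p
  mse <- sum(residuals(fit)^2) / df2
  d   <- L %*% b - h
  Fobs <- as.numeric(t(d) %*% solve(L %*% solve(crossprod(X)) %*% t(L)) %*% d) /
    (r * mse)
  c(F = Fobs, df1 = r, df2 = df2,
    p = pf(Fobs, r, df2, lower.tail = FALSE))
}

# H0: beta2 = beta3 = 0 (no treatment effect, controlling for age)
L   <- rbind(c(0, 0, 1, 0),
             c(0, 0, 0, 1))
res <- glht_F(fit, L)
ref <- anova(lm(y ~ age), fit)   # same hypothesis as a nested-model comparison
print(res)
print(ref)
```

Writing the test in terms of `L` and `h` means any other contrast only requires a different matrix, with no refitting of reduced models.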
8 To R Studio
9 Applied Stats Algorithm
[Flowchart; node labels: Scientific question?, Classify Study, Response Variable (Multi-Var, Univariate), Numerical, Categorical, Censored, Complete, Predictor(s) (Numerical, Both, Categorical), Today]
10 ANCOVA Model
Often, we'd like to conduct an ANOVA, but control for some numerical predictor as well
This could be a blocking variable like age
If the numerical predictor $x$ is correlated with $Y$, then $x$ will likely explain some of the variation in $Y$, and $MS_E$ will be reduced
These are efficient designs
Also a good technique for non-experimental studies
11 ANCOVA Model
When we're interested in Factor A but controlling for covariate $x$, we call this Analysis of Covariance
Assumptions:
    Linear relationship between $Y$ and $x$
    All the usual other ANOVA assumptions
    Homogeneity of regression slopes
We can use interactions to relax the last one and allow different slopes for different groups
12 Drug Testing Example
Suppose we have $N$ patients, and we randomize into 3 treatment arms: Drug A, Drug B, Placebo
    $I_A = 1$ if Drug A, 0 otherwise
    $I_B = 1$ if Drug B, 0 otherwise
Reference coding scheme
We also want to control for age, a known covariate that affects the response
    $x$ is age
13 Make a Table
    $E(Y \mid X = x) = \beta_0 + \beta_1 x + \beta_2 I_A + \beta_3 I_B$

    Drug     $I_A$  $I_B$  $\beta_0 + \beta_1 x + \beta_2 I_A + \beta_3 I_B$
    A        1      0      $(\beta_0 + \beta_2) + \beta_1 x$
    B        0      1      $(\beta_0 + \beta_3) + \beta_1 x$
    Placebo  0      0      $\beta_0 + \beta_1 x$

The effect of Drug A (or B) is a change to the intercept of $\beta_2$ (or $\beta_3$)
We saw this already, but the same holds true when controlling for age
14 ANCOVA
Can test contrasts controlling for covariates
    Valuable
    Sometimes very easy, sometimes can require a bit of algebra
An easy example: Are responses to Drug A and B different, controlling for age?
15 ANCOVA

    Drug     $I_A$  $I_B$  $\beta_0 + \beta_1 x + \beta_2 I_A + \beta_3 I_B$
    A        1      0      $(\beta_0 + \beta_2) + \beta_1 x$
    B        0      1      $(\beta_0 + \beta_3) + \beta_1 x$
    Placebo  0      0      $\beta_0 + \beta_1 x$

    $H_0: \beta_2 = \beta_3$
We saw this already, but the same holds true when controlling for age
Q: Why is this (probably) better than the ANOVA model without age?
16 ANCOVA
Is the average response to Drugs A and B different from the response to the placebo, controlling for age?

    Drug     $I_A$  $I_B$  $\beta_0 + \beta_1 x + \beta_2 I_A + \beta_3 I_B$
    A        1      0      $(\beta_0 + \beta_2) + \beta_1 x$
    B        0      1      $(\beta_0 + \beta_3) + \beta_1 x$
    Placebo  0      0      $\beta_0 + \beta_1 x$

    $H_0: \beta_2 + \beta_3 = 0$
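Both hypotheses on the last two slides fit the $L\beta = h$ template of the GLHT. As a worked sketch, assuming the coefficient ordering $\beta = (\beta_0, \beta_1, \beta_2, \beta_3)^\top$ from the reference-coded model (the ordering is an assumption about how the design matrix is laid out):

```latex
\begin{align*}
H_0:\ \beta_2 = \beta_3
  &\iff L\beta = h \text{ with }
  L = \begin{pmatrix} 0 & 0 & 1 & -1 \end{pmatrix},\quad h = 0,\quad r = 1 \\[4pt]
H_0:\ \beta_2 + \beta_3 = 0
  &\iff L\beta = h \text{ with }
  L = \begin{pmatrix} 0 & 0 & 1 & 1 \end{pmatrix},\quad h = 0,\quad r = 1
\end{align*}
```

The second form follows because the average of the Drug A and Drug B means is $\beta_0 + \tfrac{1}{2}(\beta_2 + \beta_3) + \beta_1 x$, which equals the placebo mean $\beta_0 + \beta_1 x$ exactly when $\beta_2 + \beta_3 = 0$. With $r = 1$, each test has an $F_{1,\,n-p}$ reference distribution.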
17 Show your work We want to avoid this kind of thing
18 Research Questions

    Drug     $I_A$  $I_B$  $\beta_0 + \beta_1 x + \beta_2 I_A + \beta_3 I_B$
    A        1      0      $(\beta_0 + \beta_2) + \beta_1 x$
    B        0      1      $(\beta_0 + \beta_3) + \beta_1 x$
    Placebo  0      0      $\beta_0 + \beta_1 x$

What null hypothesis would you test to answer:
    Is Drug A equivalent to Placebo, controlling for age?
    Are all three treatment arms equivalent, controlling for age?
19 Drug Testing Example
With reference coding, regression coefficients are differences between group means and the reference group mean
    Conditional on the covariate
With cell means coding, regression coefficients are the (conditional) group means
With effect coding, regression coefficients are deviations from the average (conditional on age) population mean
Use whichever scheme is most convenient
20 Cell means coding
Covariate: Age = $x$

    Drug     $I_A$  $I_B$  $I_P$  $\beta_1 I_A + \beta_2 I_B + \beta_3 I_P + \beta_4 x$
    A        1      0      0      $\beta_1 + \beta_4 x$
    B        0      1      0      $\beta_2 + \beta_4 x$
    Placebo  0      0      1      $\beta_3 + \beta_4 x$

Parallel regression lines
Equivalent to the model with intercept
Regression coefficients for the dummy variables are the intercepts
Easy to specify contrasts
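The claim that cell means coding is equivalent to the model with an intercept can be checked numerically. The sketch below uses simulated data (the names `age`, `drug`, and the effect sizes are assumptions made for illustration) and fits both parametrizations with `lm()`; removing the intercept with `0 +` makes R give every factor level its own dummy variable.

```r
# Illustrative data: three arms, one covariate
set.seed(2)
n    <- 45
age  <- round(runif(n, 25, 65))
drug <- factor(rep(c("Placebo", "A", "B"), each = n / 3),
               levels = c("Placebo", "A", "B"))
shift <- unname(c(Placebo = 0, A = 4, B = 2)[as.character(drug)])
y    <- 50 + 0.3 * age + shift + rnorm(n, sd = 3)

fit_ref  <- lm(y ~ age + drug)       # reference coding, Placebo as baseline
fit_cell <- lm(y ~ 0 + drug + age)   # cell means coding: one intercept per arm

b_ref  <- coef(fit_ref)
b_cell <- coef(fit_cell)

# Same fitted values, so the two models are the same model
print(all.equal(fitted(fit_ref), fitted(fit_cell)))

# Reference-coded effect of A = difference of the two cell-means intercepts
print(c(reference = unname(b_ref["drugA"]),
        cell_diff = unname(b_cell["drugA"] - b_cell["drugPlacebo"])))
```

Under cell means coding a contrast such as "A versus Placebo" becomes a row like `c(1, 0, -1, 0)` on the arm intercepts (in the order the coefficients are printed), which is often easier to read than the reference-coded version.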
21 Effect coding
Covariate: Age = $x$

    Drug     $x_2$  $x_3$  $\beta_0 + \beta_1 x + \beta_2 x_2 + \beta_3 x_3$
    A        1      0      $\beta_0 + \beta_1 x + \beta_2$
    B        0      1      $\beta_0 + \beta_1 x + \beta_3$
    Placebo  -1     -1     $\beta_0 + \beta_1 x - \beta_2 - \beta_3$

Regression coefficients are deviations from the average conditional population mean (conditional on $x$)
If the regression coefficients for all the dummy variables are zero, the categorical IV is unrelated to the DV, controlling for the covariates
22 ANCOVA
We just fit a parallel lines model
Or more generally, a parallel planes model
What does that look like?
Back to the soil
23 Farming Example (with a covariate)
Suppose we also collect information on the amount of rainfall for each field
    Field is the experimental unit
We can't control this, and we won't know the value when the experiment is set up
    So no possibility for RCB
But it's easy to measure after setup
Can we correct or control for the effect of rain?
What effect will this have on our power to detect a fertilizer effect?
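24 Farming Example (with a covariate)
    # y (yield) and a (fertilizer factor) are assumed to exist already; they are not defined on this slide
    x <- c(110, 98, 75, 90,  85, 80, 110, 130,  68, 80, 100, 125,  72, 90, 85, 115)  # rainfall per field
    farm <- data.frame(yield = y, fertil = a, rain = x)
    plot(yield ~ rain, data = farm)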
25 Farming Example (with a covariate)
    require(ggplot2)
    # one least-squares line per fertilizer, without confidence bands
    qplot(rain, yield, data = farm, colour = fertil) +
      geom_smooth(method = "lm", se = FALSE)
26 Farming Example (Parallel planes model)
It looks like the fertilizers have a different effect on crop yield, and have roughly the same relationship with rain
Let's fit a parallel planes model and compare to vanilla ANOVA without the covariate
27 Farming Example (Parallel planes model)
    > anova(lm(yield ~ fertil, data=farm))
    Analysis of Variance Table
    Response: yield
              Df Sum Sq Mean Sq F value Pr(>F)
    fertil    *
    Residuals

    > anova(lm(yield ~ rain + fertil, data=farm))
    Analysis of Variance Table
    Response: yield
              Df Sum Sq Mean Sq F value Pr(>F)
    rain      e-05 ***
    fertil    e-06 ***
    Residuals
28 Building to a GLM
What if we want to allow non-parallel lines?
Just fit an interaction between the covariate and the treatment groups
    $E(Y \mid X = x) = \beta_0 + \beta_1 x + \beta_2 I_A + \beta_3 I_B + \beta_4 x I_A + \beta_5 x I_B$
Make a table:

    Drug     $I_A$  $I_B$  $\beta_0 + \beta_1 x + \beta_2 I_A + \beta_3 I_B + \beta_4 x I_A + \beta_5 x I_B$
    A        1      0      $(\beta_0 + \beta_2) + (\beta_1 + \beta_4) x$
    B        0      1      $(\beta_0 + \beta_3) + (\beta_1 + \beta_5) x$
    Placebo  0      0      $\beta_0 + \beta_1 x$
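As a hedged illustration of the table above, the sketch below simulates data in which Drug A has a steeper age slope, fits the parallel-lines and interaction models, and recovers the three group-specific slopes from the interaction coefficients. The data, variable names, and effect sizes are assumptions made for illustration only.

```r
# Illustrative data: Drug A has a different slope on age
set.seed(3)
n     <- 60
age   <- runif(n, 20, 70)
drug  <- factor(rep(c("Placebo", "A", "B"), each = n / 3),
                levels = c("Placebo", "A", "B"))
slope <- unname(c(Placebo = 0.2, A = 0.5, B = 0.2)[as.character(drug)])
y     <- 10 + slope * age + rnorm(n, sd = 2)

fit_par <- lm(y ~ age + drug)   # parallel lines (ANCOVA)
fit_int <- lm(y ~ age * drug)   # adds age:drugA and age:drugB

# H0: beta4 = beta5 = 0 (equal slopes), as a nested-model F test
slopes_test <- anova(fit_par, fit_int)
print(slopes_test)

# Group-specific slopes read off the table: beta1, beta1 + beta4, beta1 + beta5
b <- coef(fit_int)
slopes <- c(Placebo = b[["age"]],
            A       = b[["age"]] + b[["age:drugA"]],
            B       = b[["age"]] + b[["age:drugB"]])
print(slopes)
```

Because the interaction model gives every group its own intercept and slope, these slopes match what separate simple regressions within each group would give; the pooled fit differs only in using a common error variance estimate.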
29 Interactions
Interaction between independent variables means "It depends."
Relationship between one IV and the DV depends on the value of another IV.
Can have:
    Quantitative by quantitative
    Quantitative by categorical
    Categorical by categorical (Done)
30 General principle
Interaction between A and B means:
    Relationship of A to Y depends on value of B
    Relationship of B to Y depends on value of A
The two statements are formally equivalent
31 Quantitative by Quantitative
For fixed $x_2$:
    Both slope and intercept depend on $x_2$
And for fixed $x_1$, slope and intercept relating $x_2$ to $E(Y)$ depend on the value of $x_1$
32 First Order Model
[Figure: first-order model; axes Y, X1, X2]
33 Interaction Effects
    $E\{Y\} = 3 + 2X_1 + 2X_2 - 3X_1X_2$
    $\partial E\{Y\} / \partial X_1 = 2 - 3X_2$
    $\partial E\{Y\} / \partial X_2 = 2 - 3X_1$
[Figure: response surface; axes Y, X1, X2]
34 Quantitative by Categorical Separate regression line for each value of the categorical independent variable. Interaction means slopes of regression lines are not equal.
35 One regression Model
Form a product of the quantitative variable with each dummy variable for the categorical variable
For example, three treatments and one covariate: $x_1$ is the covariate and $x_2$, $x_3$ are dummy variables
36 What to do if $H_0: \beta_4 = \beta_5 = 0$ is rejected
How do you test Group controlling for $x_1$?
A popular choice is to set $x_1$ to its sample mean, and compare treatments at that point
Or, test equal regressions, in which mean response is the same for all values of the covariate(s)
Why not set $x_1$ to the sample mean of each group?
    3 different values
37 Test for differences at $\bar{x}_1$
38 Hypothesis Tests

    Drug     $I_A$  $I_B$  $\beta_0 + \beta_1 x + \beta_2 I_A + \beta_3 I_B + \beta_4 x I_A + \beta_5 x I_B$
    A        1      0      $(\beta_0 + \beta_2) + (\beta_1 + \beta_4) x$
    B        0      1      $(\beta_0 + \beta_3) + (\beta_1 + \beta_5) x$
    Placebo  0      0      $\beta_0 + \beta_1 x$

What null hypothesis would you test for:
    Equal slopes for all groups?
    Equal slopes for Drugs A and B?
    Equal slopes for Drug A and Placebo?
    Interaction between group and covariate?
    Equal regression surfaces between groups?
39 Farming Example (Interaction model)
    > anova(lm(yield ~ rain + fertil, data=farm))
    Analysis of Variance Table
                Df Sum Sq Mean Sq F value Pr(>F)
    rain        e-05 ***
    fertil      e-06 ***
    Residuals

    > anova(lm(yield ~ rain*fertil, data=farm))
    Analysis of Variance Table
                Df Sum Sq Mean Sq F value Pr(>F)
    rain        ***
    fertil      e-05 ***
    rain:fertil
    Residuals
40 Farming Example (Conclusions)
Given amount of rainfall, we have very strong evidence that the 4 types of fertilizers do not have the same effect on crop yield
    Follow-up tests
Given fertilizer type, we have very strong evidence that additional rainfall leads to an increased yield
    Is this test important?
The effect of fertilizer type does not seem to depend on rainfall
Can we visualize these effects?
41 Partial Boxplots
Regular boxplots plot the distribution of $Y$ for each level of a factor, irrespective of any other variables you may have measured
If the variation in $Y$ due to these variables is large, these differences will be masked
If I want to partial out the effect of a covariate $x$, I can fit a regression model
    $E[Y] \sim \beta_0 + \beta_1 x$
A plot of the residuals of this model against the factor A is called a partial boxplot
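A partial boxplot takes only a couple of lines of base R. Since the lecture's yield and fertilizer vectors are not shown in this transcript, the sketch below uses a simulated stand-in data frame `sim`; its values and the level names `F1` to `F4` are assumptions made for illustration.

```r
# Simulated stand-in for the farming data (not the lecture's values)
set.seed(4)
n      <- 40
fertil <- factor(rep(c("F1", "F2", "F3", "F4"), each = n / 4))
rain   <- runif(n, 60, 130)
yield  <- 20 + 0.3 * rain + c(0, 3, 6, 9)[as.integer(fertil)] + rnorm(n, sd = 2)
sim    <- data.frame(yield, fertil, rain)

# Partial out the covariate: residuals of yield regressed on rain alone
sim$partial <- residuals(lm(yield ~ rain, data = sim))

op <- par(mfrow = c(1, 2))
boxplot(yield ~ fertil, data = sim,
        main = "Regular boxplots", ylab = "yield")
boxplot(partial ~ fertil, data = sim,
        main = "Partial boxplots", ylab = "residual of yield ~ rain")
par(op)
```

In the right-hand panel the rain-driven spread has been removed from each box, so differences between fertilizer levels are easier to see when rain explains a large share of the variation.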
42 Farming Example (Boxplots)
A set of side-by-side boxplots for fertilizer doesn't show that much difference
43 Farming Example (Partial boxplots)
The partial boxplots highlight the differences between fertilizers in a more obvious way
44 Added-Variable Plots
We can do the same thing to determine the relationship between Yield and Rain, conditional on the fertilizer type
    Fit model Rain ~ Fertil and obtain the residuals
    Fit model Yield ~ Fertil and obtain the residuals
    Plot these residuals against each other
Such a plot is called an Added-variable plot
45 Farming Example (Added-variable plot)
Scatterplot
    plot(yield ~ rain, data=farm, xlab="rain", ylab="crop Yield")
Added-Variable Plot
    fitx <- lm(rain ~ fertil, data=farm)
    fity <- lm(yield ~ fertil, data=farm)
    plot(fitx$residuals, fity$residuals)
46
    > anova(lm(fity$residuals ~ fitx$residuals))
    Analysis of Variance Table
    Response: fity$residuals
                   Df Sum Sq Mean Sq F value Pr(>F)
    fitx$residuals e-07 ***
    Residuals

    > anova(lm(yield ~ fertil + rain, data=farm))
    Analysis of Variance Table
    Response: yield
              Df Sum Sq Mean Sq F value Pr(>F)
    fertil    e-05 ***
    rain      e-06 ***
    Residuals
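The two ANOVA tables above are closely linked: the slope in the residual-on-residual regression equals the `rain` coefficient in the full model, and the residual sums of squares agree, but the residual degrees of freedom differ, which would explain why the two p-values for rain are not identical. The sketch below checks this on simulated stand-in data (`sim` and its values are assumptions; the lecture's data are not reproduced here).

```r
# Simulated stand-in for the farming data (not the lecture's values)
set.seed(5)
n      <- 40
fertil <- factor(rep(c("F1", "F2", "F3", "F4"), each = n / 4))
rain   <- runif(n, 60, 130)
yield  <- 20 + 0.3 * rain + c(0, 3, 6, 9)[as.integer(fertil)] + rnorm(n, sd = 2)
sim    <- data.frame(yield, fertil, rain)

fitx <- lm(rain  ~ fertil, data = sim)
fity <- lm(yield ~ fertil, data = sim)
ex   <- residuals(fitx)
ey   <- residuals(fity)

av   <- lm(ey ~ ex)                            # added-variable regression
full <- lm(yield ~ fertil + rain, data = sim)  # full parallel-planes model

# Same slope for rain, same residual sum of squares
print(c(av = unname(coef(av)["ex"]), full = unname(coef(full)["rain"])))
print(c(av = sum(residuals(av)^2), full = sum(residuals(full)^2)))

# Different residual df: the added-variable fit does not "know" that
# the fertilizer parameters were already estimated
print(c(av = df.residual(av), full = df.residual(full)))

plot(ex, ey, xlab = "rain | fertil", ylab = "yield | fertil")
abline(av)
```

The test reported by the full model is the one with the correct degrees of freedom; the added-variable plot is best treated as a visualization of that partial relationship.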
47 It's a Linear Model
There is really no difference between the model we just developed and the one you learned in your regression course
    $y = X\beta + \varepsilon$
Both are considered linear models, and the design matrix $X$ can contain numerical regressors or indicator variables, in whatever parametrization is convenient
48 General Linear Models
We can generalize even further (though we won't) to a multivariate response case
    $Y = XB + U$
This is the General Linear Model of statistics
    GLM, not to be confused with the other GLM
Stick to univariate (general) linear models
49 Centering IVs Subtract mean (for entire sample) from each quantitative independent variable.
50 Properties of Centering
When independent variables are centered, estimates and tests for intercepts are affected
Relationships between independent variables and dependent variables are unaffected
    Estimates and tests for slopes are unaffected
    $R^2$ is unaffected
    Predictions and prediction intervals are unaffected
51 More Properties
Suppose the regression model has an intercept
    Residuals add up to zero
    Then if each independent variable is set to its sample mean value, Y-hat equals Y-bar, the sample mean of all the Y values
In this case, if all independent variables are centered by subtracting off their means, then the intercept equals Y-bar, exactly
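These centering properties are easy to confirm numerically. The following sketch fits the same regression with raw and centered predictors; the simulated data and the names `x1`, `x2`, `x1c`, `x2c` are assumptions made for illustration.

```r
# Illustrative data with two quantitative predictors far from zero
set.seed(6)
n  <- 50
x1 <- runif(n, 100, 200)
x2 <- rnorm(n, mean = 50, sd = 10)
y  <- 5 + 0.1 * x1 - 0.3 * x2 + rnorm(n)

fit_raw <- lm(y ~ x1 + x2)

# Center each predictor at its whole-sample mean
x1c <- x1 - mean(x1)
x2c <- x2 - mean(x2)
fit_cen <- lm(y ~ x1c + x2c)

# Slopes, R-squared and fitted values are unchanged; only the intercept moves
print(rbind(raw = coef(fit_raw), centered = coef(fit_cen)))
print(c(raw = summary(fit_raw)$r.squared, centered = summary(fit_cen)$r.squared))

# With an intercept, residuals sum to zero and the centered intercept is Y-bar
print(sum(residuals(fit_cen)))
print(c(intercept = unname(coef(fit_cen)[1]), ybar = mean(y)))
```

The residual sum prints as a value that is zero up to floating-point rounding rather than exactly zero.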
52 Comments Often, X=0 is outside the range of independent variable values, and it is hard to say what the intercept means in terms of the data. When independent variables are centered, the intercept is the average Y value for average value(s) of X. If there are both quantitative variables and categorical variables (represented by dummy variables), it can help to center just the quantitative variables.
53 Centering just the quantitative IVs
Subtract mean (for entire sample) from each quantitative independent variable.
Then, comparing intercepts is the same as comparing expected values for average X values.
It's more convenient than testing linear combinations.
54 For Example
Suppose you want to test for differences among population mean Y values when $x_1$ equals its sample mean value
You could test $H_0$:
Or, center $x_1$ and test $H_0: \beta_2 = \beta_3 = 0$
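The uncentered hypothesis did not survive in this transcript. One way to see why centering helps, assuming the model in question is the interaction model with covariate $x_1$ and dummy variables $x_2$, $x_3$ (an assumption based on the earlier slides, not something this slide states), is the following sketch:

```latex
\begin{align*}
E(Y) &= \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3
        + \beta_4 x_1 x_2 + \beta_5 x_1 x_3 \\[4pt]
\text{At } x_1 = \bar{x}_1:\quad
  \text{A} - \text{Placebo} &= \beta_2 + \beta_4 \bar{x}_1, \qquad
  \text{B} - \text{Placebo} = \beta_3 + \beta_5 \bar{x}_1 \\[4pt]
\text{Uncentered test:}\quad
  H_0 &:\ \beta_2 + \beta_4 \bar{x}_1 = \beta_3 + \beta_5 \bar{x}_1 = 0 \\[4pt]
\text{With } x_1^c = x_1 - \bar{x}_1:\quad
  E(Y) &= \beta_0^* + \beta_1 x_1^c + \beta_2^* x_2 + \beta_3^* x_3
          + \beta_4 x_1^c x_2 + \beta_5 x_1^c x_3, \\
  \beta_2^* &= \beta_2 + \beta_4 \bar{x}_1, \qquad
  \beta_3^* = \beta_3 + \beta_5 \bar{x}_1
\end{align*}
```

Under that assumption, testing $\beta_2^* = \beta_3^* = 0$ in the centered model is the same test as the linear-combination hypothesis in the uncentered model, without having to build the contrast by hand.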
55 ANCOVA
    $x_1$ = Age
    $x_2$ = 1 if Drug A, zero otherwise
    $x_3$ = 1 if Drug B, zero otherwise
Parallel regression lines (equal slopes): ANCOVA
56 What do you report?
    $x_1$ = Age
    $x_2$ = 1 if Drug A, zero otherwise
    $x_3$ = 1 if Drug B, zero otherwise
57 Set all covariates to their sample mean values
And compute Y-hat for each group
Call it an adjusted mean, or something like "average university GPA adjusted for High School GPA"
SAS calls it a least squares mean (lsmeans)
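In R, adjusted means of this kind can be obtained with `predict()` by holding the covariate at its overall sample mean. The sketch below uses simulated data in which the Drug A group happens to be older, so raw and adjusted means differ; the data frame `dat`, its variable names, and the effect sizes are assumptions made for illustration.

```r
# Illustrative data: ages differ by group, so raw means are confounded with age
set.seed(7)
n    <- 60
drug <- factor(rep(c("Placebo", "A", "B"), each = n / 3),
               levels = c("Placebo", "A", "B"))
age  <- runif(n, 30, 60) + 5 * (drug == "A")
y    <- 40 + 0.5 * age + 3 * (drug == "A") + 1 * (drug == "B") + rnorm(n, sd = 2)
dat  <- data.frame(y, age, drug)

fit <- lm(y ~ age + drug, data = dat)

# One row per group, with the covariate fixed at its overall sample mean
grid <- data.frame(age  = mean(dat$age),
                   drug = factor(levels(dat$drug), levels = levels(dat$drug)))
adj  <- setNames(predict(fit, newdata = grid), levels(dat$drug))
raw  <- tapply(dat$y, dat$drug, mean)

print(rbind(raw = raw, adjusted = adj))
```

In the parallel-lines model, differences between adjusted means equal the corresponding dummy-variable coefficients, so the choice of covariate value shifts all adjusted means by the same amount.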
58 Significance Testing
Overall F test for all the IVs at once
T-tests for each regression coefficient: Controlling for all the others, does that IV matter?
Test a collection of IVs controlling for another collection
Most general: Testing whether sets of linear combinations of regression coefficients differ from specified constants
59 Significance Testing
Controlling for mother's education and father's education, are (any of) total family income, assessed value of home and total market value of all vehicles owned by the family related to High School GPA?
60 Statistical Models
Starting with the basics and building towards the GLM
61 t-test
One-sample
    Used to test if a mean equals some value
Two-sample
    Used to test if the difference between groups equals some value
    Most commonly used to test if two populations have the same mean
Paired
    Really a one-sample test, and a special case of a Repeated Measures design
62 One-way ANOVA
Generalized two-sample t-test
Used to test if k groups have the same mean
Q: What can you do with a t-test that you cannot do with ANOVA?
A: Account for different variances within groups
[Diagram labels: 1-way ANOVA, t-test]
63 Factorial ANOVA
Generalized one-way ANOVA
Used to test if k groups have the same mean across multiple (crossed) factors
Can also test for interactions between factors
[Diagram labels: ANOVA, 1-way ANOVA]
64 Simple Linear Regression (SLR)
Fits a straight line through a scatterplot
Used to test if two numerical variables have a linear correlation
It is linear in the parameters, not the variables
    Can account for transformations this way (log, etc.)
[Diagram labels: ANOVA, SLR]
65 Multiple Linear Regression (MLR)
Generalization of regression with several predictors
Fit a hyperplane through a multi-dimensional scatterplot
Used to test if any numerical variables have a linear relationship with the response
Model building and interactions
[Diagram labels: MLR, ANOVA, SLR]
66 Analysis of Covariance (ANCOVA)
Generalized ANOVA
Used to test if k groups have the same mean across multiple factors, controlling for a numerical covariate
Typically used when the covariate itself is not of interest
[Diagram labels: ANCOVA, ANOVA, MLR]
67 Multivariate ANOVA (MANOVA)
Generalized ANOVA
Used to test if k groups have the same mean across multiple factors AND multiple responses
Not studied in this course
[Diagram labels: MANOVA, ANOVA]
68 General Linear Model (GLM)
Generalization of ANOVA, ANCOVA, and regression (and MANOVA, MANCOVA)
Used to test whether factors or numerical predictors explain the variation in a response (responses), and interactions between either
Many statistical problems can be solved with GLMs
[Diagram labels: GLM, ANCOVA, MLR]
69 General Linear Model (Univariate)
    $y = X\beta + \varepsilon$
$y$ is a vector of responses
$X$ is the design matrix
$\beta$ is a vector of unknown parameters
    To be estimated
$\varepsilon$ is a vector of unknown random errors
    Usually considered Normal
70 General Linear Model (Multivariate)
    $Y = XB + U$
$Y$ is a matrix of responses
$X$ is the design matrix
    Both $X$ and $Y$ could involve transformations
$B$ is a matrix of unknown parameters
    To be estimated
$U$ is a matrix of unknown random errors
    Usually considered multivariate Normal
71 Generalized Linear Models (GLM)
Suppose we want to model our random vector with some distribution other than Normal
Can use a GLM to do this
    Logistic regression
    Poisson regression
Can keep going with this hierarchy
    GLMM, GAM, GEE, …
We're getting ahead of ourselves here
More informationTopic 20: Single Factor Analysis of Variance
Topic 20: Single Factor Analysis of Variance Outline Single factor Analysis of Variance One set of treatments Cell means model Factor effects model Link to linear regression using indicator explanatory
More informationChapter 3: Multiple Regression. August 14, 2018
Chapter 3: Multiple Regression August 14, 2018 1 The multiple linear regression model The model y = β 0 +β 1 x 1 + +β k x k +ǫ (1) is called a multiple linear regression model with k regressors. The parametersβ
More informationANCOVA. ANCOVA allows the inclusion of a 3rd source of variation into the F-formula (called the covariate) and changes the F-formula
ANCOVA Workings of ANOVA & ANCOVA ANCOVA, Semi-Partial correlations, statistical control Using model plotting to think about ANCOVA & Statistical control You know how ANOVA works the total variation among
More informationBIOMETRICS INFORMATION
BIOMETRICS INFORMATION Index of Pamphlet Topics (for pamphlets #1 to #60) as of December, 2000 Adjusted R-square ANCOVA: Analysis of Covariance 13: ANCOVA: Analysis of Covariance ANOVA: Analysis of Variance
More informationLecture 13 Extra Sums of Squares
Lecture 13 Extra Sums of Squares STAT 512 Spring 2011 Background Reading KNNL: 7.1-7.4 13-1 Topic Overview Extra Sums of Squares (Defined) Using and Interpreting R 2 and Partial-R 2 Getting ESS and Partial-R
More informationCS145: INTRODUCTION TO DATA MINING
CS145: INTRODUCTION TO DATA MINING 2: Vector Data: Prediction Instructor: Yizhou Sun yzsun@cs.ucla.edu October 8, 2018 TA Office Hour Time Change Junheng Hao: Tuesday 1-3pm Yunsheng Bai: Thursday 1-3pm
More informationSTAT 3900/4950 MIDTERM TWO Name: Spring, 2015 (print: first last ) Covered topics: Two-way ANOVA, ANCOVA, SLR, MLR and correlation analysis
STAT 3900/4950 MIDTERM TWO Name: Spring, 205 (print: first last ) Covered topics: Two-way ANOVA, ANCOVA, SLR, MLR and correlation analysis Instructions: You may use your books, notes, and SPSS/SAS. NO
More informationStatistics and Quantitative Analysis U4320. Segment 10 Prof. Sharyn O Halloran
Statistics and Quantitative Analysis U4320 Segment 10 Prof. Sharyn O Halloran Key Points 1. Review Univariate Regression Model 2. Introduce Multivariate Regression Model Assumptions Estimation Hypothesis
More informationAppendix A: Review of the General Linear Model
Appendix A: Review of the General Linear Model The generallinear modelis an important toolin many fmri data analyses. As the name general suggests, this model can be used for many different types of analyses,
More informationHypothesis testing, part 2. With some material from Howard Seltman, Blase Ur, Bilge Mutlu, Vibha Sazawal
Hypothesis testing, part 2 With some material from Howard Seltman, Blase Ur, Bilge Mutlu, Vibha Sazawal 1 CATEGORICAL IV, NUMERIC DV 2 Independent samples, one IV # Conditions Normal/Parametric Non-parametric
More informationPassing-Bablok Regression for Method Comparison
Chapter 313 Passing-Bablok Regression for Method Comparison Introduction Passing-Bablok regression for method comparison is a robust, nonparametric method for fitting a straight line to two-dimensional
More informationMultiple Regression Analysis: Estimation ECONOMETRICS (ECON 360) BEN VAN KAMMEN, PHD
Multiple Regression Analysis: Estimation ECONOMETRICS (ECON 360) BEN VAN KAMMEN, PHD Outline Motivation. Mechanics and Interpretation. Expected Values and Variances of the Estimators. Motivation for multiple
More informationSimple Linear Regression: One Qualitative IV
Simple Linear Regression: One Qualitative IV 1. Purpose As noted before regression is used both to explain and predict variation in DVs, and adding to the equation categorical variables extends regression
More informationReview of Statistics 101
Review of Statistics 101 We review some important themes from the course 1. Introduction Statistics- Set of methods for collecting/analyzing data (the art and science of learning from data). Provides methods
More informationBIOL 933!! Lab 10!! Fall Topic 13: Covariance Analysis
BIOL 933!! Lab 10!! Fall 2017 Topic 13: Covariance Analysis Covariable as a tool for increasing precision Carrying out a full ANCOVA Testing ANOVA assumptions Happiness Covariables as Tools for Increasing
More informationChapter 12 - Lecture 2 Inferences about regression coefficient
Chapter 12 - Lecture 2 Inferences about regression coefficient April 19th, 2010 Facts about slope Test Statistic Confidence interval Hypothesis testing Test using ANOVA Table Facts about slope In previous
More informationdf=degrees of freedom = n - 1
One sample t-test test of the mean Assumptions: Independent, random samples Approximately normal distribution (from intro class: σ is unknown, need to calculate and use s (sample standard deviation)) Hypotheses:
More informationAnalysis of Survival Data Using Cox Model (Continuous Type)
Australian Journal of Basic and Alied Sciences, 7(0): 60-607, 03 ISSN 99-878 Analysis of Survival Data Using Cox Model (Continuous Type) Khawla Mustafa Sadiq Department of Mathematics, Education College,
More informationBinary Logistic Regression
The coefficients of the multiple regression model are estimated using sample data with k independent variables Estimated (or predicted) value of Y Estimated intercept Estimated slope coefficients Ŷ = b
More informationInverse of a Square Matrix. For an N N square matrix A, the inverse of A, 1
Inverse of a Square Matrix For an N N square matrix A, the inverse of A, 1 A, exists if and only if A is of full rank, i.e., if and only if no column of A is a linear combination 1 of the others. A is
More informationInstructions: Closed book, notes, and no electronic devices. Points (out of 200) in parentheses
ISQS 5349 Final Spring 2011 Instructions: Closed book, notes, and no electronic devices. Points (out of 200) in parentheses 1. (10) What is the definition of a regression model that we have used throughout
More informationGeneralized Linear Models (GLZ)
Generalized Linear Models (GLZ) Generalized Linear Models (GLZ) are an extension of the linear modeling process that allows models to be fit to data that follow probability distributions other than the
More informationLecture 9 SLR in Matrix Form
Lecture 9 SLR in Matrix Form STAT 51 Spring 011 Background Reading KNNL: Chapter 5 9-1 Topic Overview Matrix Equations for SLR Don t focus so much on the matrix arithmetic as on the form of the equations.
More information36-707: Regression Analysis Homework Solutions. Homework 3
36-707: Regression Analysis Homework Solutions Homework 3 Fall 2012 Problem 1 Y i = βx i + ɛ i, i {1, 2,..., n}. (a) Find the LS estimator of β: RSS = Σ n i=1(y i βx i ) 2 RSS β = Σ n i=1( 2X i )(Y i βx
More informationMA 575 Linear Models: Cedric E. Ginestet, Boston University Midterm Review Week 7
MA 575 Linear Models: Cedric E. Ginestet, Boston University Midterm Review Week 7 1 Random Vectors Let a 0 and y be n 1 vectors, and let A be an n n matrix. Here, a 0 and A are non-random, whereas y is
More informationGeneral Principles Within-Cases Factors Only Within and Between. Within Cases ANOVA. Part One
Within Cases ANOVA Part One 1 / 25 Within Cases A case contributes a DV value for every value of a categorical IV It is natural to expect data from the same case to be correlated - NOT independent For
More informationAnalysis of Covariance
Analysis of Covariance Using categorical and continuous predictor variables Example An experiment is set up to look at the effects of watering on Oak Seedling establishment Three levels of watering: (no
More informationLecture 3. STAT161/261 Introduction to Pattern Recognition and Machine Learning Spring 2018 Prof. Allie Fletcher
Lecture 3 STAT161/261 Introduction to Pattern Recognition and Machine Learning Spring 2018 Prof. Allie Fletcher Previous lectures What is machine learning? Objectives of machine learning Supervised and
More informationSIMPLE REGRESSION ANALYSIS. Business Statistics
SIMPLE REGRESSION ANALYSIS Business Statistics CONTENTS Ordinary least squares (recap for some) Statistical formulation of the regression model Assessing the regression model Testing the regression coefficients
More informationANOVA Situation The F Statistic Multiple Comparisons. 1-Way ANOVA MATH 143. Department of Mathematics and Statistics Calvin College
1-Way ANOVA MATH 143 Department of Mathematics and Statistics Calvin College An example ANOVA situation Example (Treating Blisters) Subjects: 25 patients with blisters Treatments: Treatment A, Treatment
More informationTopic 23: Diagnostics and Remedies
Topic 23: Diagnostics and Remedies Outline Diagnostics residual checks ANOVA remedial measures Diagnostics Overview We will take the diagnostics and remedial measures that we learned for regression and
More information2. TRUE or FALSE: Converting the units of one measured variable alters the correlation of between it and a second variable.
1. The diagnostic plots shown below are from a linear regression that models a patient s score from the SUG-HIGH diabetes risk model as function of their normalized LDL level. a. Based on these plots,
More informationCourse in Data Science
Course in Data Science About the Course: In this course you will get an introduction to the main tools and ideas which are required for Data Scientist/Business Analyst/Data Analyst. The course gives an
More informationMAT2377. Rafa l Kulik. Version 2015/November/26. Rafa l Kulik
MAT2377 Rafa l Kulik Version 2015/November/26 Rafa l Kulik Bivariate data and scatterplot Data: Hydrocarbon level (x) and Oxygen level (y): x: 0.99, 1.02, 1.15, 1.29, 1.46, 1.36, 0.87, 1.23, 1.55, 1.40,
More information14 Multiple Linear Regression
B.Sc./Cert./M.Sc. Qualif. - Statistics: Theory and Practice 14 Multiple Linear Regression 14.1 The multiple linear regression model In simple linear regression, the response variable y is expressed in
More informationMath 30-1 Trigonometry Prac ce Exam 4. There are two op ons for PP( 5, mm), it can be drawn in SOLUTIONS
SOLUTIONS Math 0- Trigonometry Prac ce Exam Visit for more Math 0- Study Materials.. First determine quadrant terminates in. Since ssssss is nega ve in Quad III and IV, and tttttt is neg. in II and IV,
More informationChapter 4: Regression Models
Sales volume of company 1 Textbook: pp. 129-164 Chapter 4: Regression Models Money spent on advertising 2 Learning Objectives After completing this chapter, students will be able to: Identify variables,
More informationMultiple Linear Regression
Multiple Linear Regression University of California, San Diego Instructor: Ery Arias-Castro http://math.ucsd.edu/~eariasca/teaching.html 1 / 42 Passenger car mileage Consider the carmpg dataset taken from
More informationSimple Linear Regression
Simple Linear Regression MATH 282A Introduction to Computational Statistics University of California, San Diego Instructor: Ery Arias-Castro http://math.ucsd.edu/ eariasca/math282a.html MATH 282A University
More informationParametric versus Nonparametric Statistics-when to use them and which is more powerful? Dr Mahmoud Alhussami
Parametric versus Nonparametric Statistics-when to use them and which is more powerful? Dr Mahmoud Alhussami Parametric Assumptions The observations must be independent. Dependent variable should be continuous
More informationAnalysis of Covariance. The following example illustrates a case where the covariate is affected by the treatments.
Analysis of Covariance In some experiments, the experimental units (subjects) are nonhomogeneous or there is variation in the experimental conditions that are not due to the treatments. For example, a
More informationSTAT5044: Regression and Anova. Inyoung Kim
STAT5044: Regression and Anova Inyoung Kim 2 / 47 Outline 1 Regression 2 Simple Linear regression 3 Basic concepts in regression 4 How to estimate unknown parameters 5 Properties of Least Squares Estimators:
More information