Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model


Course Introduction and Overview; Descriptive Statistics; Conceptualizations of Variance; Review of the General Linear Model
EPSY 905: Multivariate Analysis, Lecture 1, 20 January 2016
EPSY 905: Lecture 1 - Introduction

Today's Class
- Course Introduction and Overview
- Descriptive Statistics
- Conceptualizations of Variance and Covariance
- Review of the General Linear Model

COURSE OVERVIEW

Guiding Principles for PRE 905, #1 of 3: Blocks
If you understand the building blocks of a model, you can build anything!

4 (or 5*) Model Building Blocks
1. Linear models (for effects of predictors)
2. Link functions (for anything not normal)
3a*. Random effects (for describing dependency)
3b*. Latent variables (for measurement models)
4. Estimation (e.g., Maximum Likelihood, Bayesian)
* These are really the same thing.

Guiding Principles, #2 of 3: The Journey is Part of the Destination
Not just blocks, and not just a journey; in 905 you will learn:
- Missing data (imputation)
- Path models
- Mediation and moderation
- Testing complex hypotheses involving observed variables
- Bayesian methods
- Likelihood-based methods

Guiding Principles for PRE 905, #3 of 3: The Bridge
A bridge between what you know now and advanced statistical methods

Motivation for Course Content
The goal of this course is to provide you with a fundamental understanding of the underpinnings of the most commonly used contemporary statistical models. The course is a combination of topics, picked to make your experience more extendable beyond coursework. Some topics are math/statistics heavy:
- Mathematical statistics for the social sciences
Upon completion of the course, you will be able to understand the commonalities that link these methods.

Course Structure (from the syllabus)
Course format is all lecture based
- Office hours held in labs
Homework assignments
- About one week to complete (Thursday to Thursday, usually)
- Questions: data analysis, interpretation, some question-and-answer, some from the readings
- Late penalty: 1 percent per calendar day
- Revisions are allowed (minus any late penalties)
Course Project
- Three rounds of submissions
- Iterative process

Lecture Format
Mix of theory and examples with data and syntax
- Software: R

REVIEW: BASIC STATISTICAL TOPICS

Data for Today's Lecture
To help demonstrate the concepts of today's lecture, we will be using a data set with three variables:
- Female (gender): male (= 0) or female (= 1)
- Height in inches
- Weight in pounds
The end point of our lecture will be to build a linear model that predicts a person's weight
- Linear model: a statistical model for an outcome that uses a linear combination (a weighted sum, weighted by slopes) of one or more predictor variables

Visualizing the Data

Histograms of Height and Weight
The weight variable seems to be bimodal. Should that bother you? (Hint: it shouldn't... yet.)

Descriptive Statistics
We can summarize each variable marginally through a set of descriptive statistics
- Marginal: one variable by itself
Common marginal descriptive statistics:
- Central tendency: mean, median, mode
- Variability: standard deviation (variance), range
We can also summarize the joint (bivariate) distribution of two variables through a set of descriptive statistics:
- Joint distribution: more than one variable simultaneously
Common bivariate descriptive statistics:
- Correlation and covariance

Descriptive Statistics for Height/Weight Data

Variable   Mean     SD       Variance
Height     67.9     7.44     55.358
Weight     183.4    56.383   3,179.095
Female     0.5      0.513    0.263

Correlation/covariance matrix (diagonal: variances; above diagonal: covariances; below diagonal: correlations):

          Height    Weight      Female
Height    55.358    334.832     -2.263
Weight    .798      3,179.095   -27.632
Female    -.593     -.955       .263

Re-examining the Concept of Variance
Variability is a central concept in advanced statistics
- In multivariate statistics, covariance is also central
Two formulas for the variance (about the same when N is large):

Unbiased or "sample": $S^2_{Y_1} = \frac{1}{N-1}\sum_{p=1}^{N}\left(Y_{p1}-\bar{Y}_1\right)^2$

Biased/ML or "population": $S^2_{Y_1} = \frac{1}{N}\sum_{p=1}^{N}\left(Y_{p1}-\bar{Y}_1\right)^2$

Here: p = person; 1 = variable number one
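The two estimators differ only in their divisor, which is easy to check numerically. A minimal sketch (the `heights` values below are made up for illustration):

```python
import numpy as np

heights = np.array([61.0, 64.5, 67.9, 70.2, 75.5])  # hypothetical data, in inches
n = len(heights)
dev_sq = (heights - heights.mean()) ** 2

var_unbiased = dev_sq.sum() / (n - 1)  # "sample" estimator: divides by N-1
var_biased = dev_sq.sum() / n          # "population"/ML estimator: divides by N

# numpy's ddof argument selects the divisor N - ddof
assert np.isclose(var_unbiased, heights.var(ddof=1))
assert np.isclose(var_biased, heights.var(ddof=0))
```

As N grows, the ratio (N-1)/N approaches 1, so the two estimates converge, as the slide notes.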

Interpretation of Variance
The variance describes the spread of a variable in squared units (which come from the $\left(Y_{p1}-\bar{Y}_1\right)^2$ term in the equation)
Variance: the average squared distance of an observation from the mean
- Variance of Height: 55.358 inches squared
- Variance of Weight: 3,179.095 pounds squared
- Variance of Female: not applicable in the same way!
Because squared units are difficult to work with, we typically use the standard deviation, which is reported in the original units
Standard deviation: the average distance of an observation from the mean
- SD of Height: 7.44 inches
- SD of Weight: 56.383 pounds

Variance/SD as a More General Statistical Concept
Variance (and the standard deviation) is a concept that is applied across statistics, not just for data
- Statistical parameters have variance
  - e.g., the sample mean $\bar{Y}_1$ has a standard error (SE) of $S_{\bar{Y}} = \frac{S_Y}{\sqrt{N}}$
- The standard error is another name for standard deviation
  - So "standard error of the mean" is equivalent to "standard deviation of the mean"
  - Usually "error" refers to parameters; "deviation" refers to data
- "Variance of the mean" would be $S^2_{\bar{Y}} = \frac{S^2_Y}{N}$
More generally, variance = error
- You can think about the SE of the mean as telling you how far off the mean is for describing the data
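The SE-of-the-mean formula can be verified directly (again with hypothetical data):

```python
import numpy as np

y = np.array([150.0, 132.5, 210.0, 187.0, 241.5, 179.0])  # hypothetical weights
n = len(y)

se_mean = y.std(ddof=1) / np.sqrt(n)   # standard error of the mean
var_mean = y.var(ddof=1) / n           # "variance of the mean"

# the SE is just the square root of the variance of the mean
assert np.isclose(se_mean ** 2, var_mean)
```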

Correlation of Variables
Moving from marginal summaries of each variable to joint (bivariate) summaries, the Pearson correlation is often used to describe the association between a pair of variables:

$r_{Y_1,Y_2} = \frac{1}{N-1}\sum_{p=1}^{N}\frac{\left(Y_{p1}-\bar{Y}_1\right)\left(Y_{p2}-\bar{Y}_2\right)}{S_{Y_1}S_{Y_2}}$

The correlation is unitless, as it ranges from -1 to 1 for continuous variables, regardless of their variances
- The Pearson correlation of a binary/categorical variable with a continuous variable is called a point-biserial correlation (same formula)
- The Pearson correlation of binary/categorical variables with other binary/categorical variables has bounds within -1 and 1
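The summation formula above matches what library routines compute. A sketch with hypothetical height/weight pairs:

```python
import numpy as np

h = np.array([60.0, 63.5, 66.0, 70.5, 74.0])       # hypothetical heights
w = np.array([120.0, 135.0, 160.0, 210.0, 250.0])  # hypothetical weights
n = len(h)

# Pearson correlation from the definition: averaged standardized cross-products
r = np.sum((h - h.mean()) * (w - w.mean())) / ((n - 1) * h.std(ddof=1) * w.std(ddof=1))

assert np.isclose(r, np.corrcoef(h, w)[0, 1])
```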

More on the Correlation Coefficient
The Pearson correlation is a biased estimator
- Biased estimator: the expected value differs from the true value of the statistic
  - Other biased estimators: the variance/SD when $\frac{1}{N}$ is used
The unbiased correlation estimate would be:

$\hat{r}_{Y_1,Y_2} = r_{Y_1,Y_2}\left(1 + \frac{1-r_{Y_1,Y_2}^2}{2N}\right)$

- As N gets large, the bias goes away; bias is largest when $r_{Y_1,Y_2} = 0$
- The Pearson correlation is an underestimate of the true correlation
If it is biased, then why does everyone use it anyway?
- Answer: forthcoming when we talk about (ML) estimation
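The slide's adjustment can be written as a small helper (`adjusted_r` is a hypothetical name, not a library function):

```python
def adjusted_r(r: float, n: int) -> float:
    """Approximately unbiased correlation estimate, per the slide's formula."""
    return r * (1 + (1 - r ** 2) / (2 * n))

# The adjustment vanishes at the boundaries and shrinks as N grows
print(adjusted_r(0.0, 50))   # no adjustment when r = 0
print(adjusted_r(1.0, 50))   # no adjustment at a perfect correlation
print(adjusted_r(0.5, 10))   # pulls the estimate slightly away from zero
```

Note that |adjusted_r| >= |r|, consistent with the point that Pearson's r underestimates the true correlation.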

Covariance of Variables: Association with Units
The numerator of the correlation coefficient is the covariance of a pair of variables:

Unbiased or "sample": $S_{Y_1,Y_2} = \frac{1}{N-1}\sum_{p=1}^{N}\left(Y_{p1}-\bar{Y}_1\right)\left(Y_{p2}-\bar{Y}_2\right)$

Biased/ML or "population": $S_{Y_1,Y_2} = \frac{1}{N}\sum_{p=1}^{N}\left(Y_{p1}-\bar{Y}_1\right)\left(Y_{p2}-\bar{Y}_2\right)$

The covariance uses the units of the original variables (but now they are multiplied together):
- Covariance of height and weight: 334.832 inch-pounds
The covariance of a variable with itself is the variance
The covariance is often used in multivariate analyses because it ties directly into multivariate distributions
- But covariance and correlation are easy to switch between

Going from Covariance to Correlation
If you have the covariance matrix (variances and covariances):

$r_{Y_1,Y_2} = \frac{S_{Y_1,Y_2}}{S_{Y_1}S_{Y_2}}$

If you have the correlation matrix and the standard deviations:

$S_{Y_1,Y_2} = r_{Y_1,Y_2}S_{Y_1}S_{Y_2}$
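Plugging in the values from the descriptive-statistics table recovers the tabled correlation of height and weight:

```python
cov_hw = 334.832           # covariance of height and weight (inch-pounds)
sd_h, sd_w = 7.44, 56.383  # standard deviations from the table

r_hw = cov_hw / (sd_h * sd_w)    # covariance -> correlation
cov_back = r_hw * sd_h * sd_w    # correlation -> covariance

assert round(r_hw, 3) == 0.798   # matches the below-diagonal table entry
assert abs(cov_back - cov_hw) < 1e-9
```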

THE GENERAL LINEAR MODEL

The General Linear Model
The general linear model incorporates many different labels of analyses under one unifying umbrella:

                 Categorical Xs   Continuous Xs             Both Types of Xs
Univariate Y     ANOVA            Regression                ANCOVA
Multivariate Ys  MANOVA           Multivariate Regression   MANCOVA

The typical assumption is that error is normally distributed, meaning that the data are conditionally normally distributed
Models for non-normal outcomes (e.g., dichotomous, categorical, count) fall under the Generalized Linear Model, of which the GLM is a special case (i.e., for when model residuals can be assumed to be normally distributed)

General Linear Models: Conditional Normality

$Y_p = \beta_0 + \beta_1 X_p + \beta_2 Z_p + \beta_3 X_p Z_p + e_p$

Model for the Means (Predicted Values):
- Each person's expected (predicted) outcome is a function of his/her values on X and Z (and their interaction)
- Y, X, and Z are each measured only once per person (hence the p subscript)
Model for the Variance:
- $e_p \sim N(0, \sigma_e^2)$: ONE residual (unexplained) deviation
- $e_p$ has a mean of 0 with some estimated constant variance $\sigma_e^2$, is normally distributed, is unrelated to X and Z, and is unrelated across people (across all observations; just people here)
We will return to the normal distribution in a few weeks, but for now know that it is described by two terms: a mean and a variance
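A quick simulation makes the "model for the means plus one residual term" structure concrete (every coefficient value below is made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(905)
n = 5000

x = rng.normal(0.0, 1.0, n)   # continuous predictor
z = rng.binomial(1, 0.5, n)   # binary predictor
e = rng.normal(0.0, 1.5, n)   # ONE residual: mean 0, constant variance

b0, b1, b2, b3 = 10.0, 2.0, -3.0, 0.5        # hypothetical coefficients
y = b0 + b1 * x + b2 * z + b3 * x * z + e    # model for the means + residual

# y itself need not look normal, but y given x and z is normal,
# because the only random part that remains is e
```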

Building a Linear Model for Predicting a Person's Weight
We will now build a linear model for predicting a person's weight, using height and gender as predictors
Several of the models we will build are done for didactic reasons, to show how regression and ANOVA work under the GLM
- You wouldn't necessarily run these models in this sequence
Our beginning model is an empty model: no predictors for weight (an unconditional model)
Our ending model is one with both predictors and their interaction (a conditional model)

Model 1: The Empty Model
Linear model: $\text{Weight}_p = \beta_0 + e_p$, where $e_p \sim N(0, \sigma_e^2)$
Estimated parameters [estimate (standard error)]:
- $\beta_0 = 183.4$ (12.61)
  - Overall intercept: the grand mean of weight across all people (just the mean of weight)
  - The SE for $\beta_0$ is the standard error of the mean for weight: $\frac{S_{\text{Weight}}}{\sqrt{N}}$
- $\sigma_e^2 = 3{,}179.1$ (SE not given)
  - The (unbiased) variance of weight: $e_p = \text{Weight}_p - \hat\beta_0 = \text{Weight}_p - \overline{\text{Weight}}$, so $S_e^2 = \frac{1}{N-1}\sum_{p=1}^{N}\left(\text{Weight}_p - \overline{\text{Weight}}\right)^2$
  - From the Mean Square Error of the F-table

Model 2: Predicting Weight from Height ("Regression")
Linear model: $\text{Weight}_p = \beta_0 + \beta_1\text{Height}_p + e_p$, where $e_p \sim N(0, \sigma_e^2)$
Estimated parameters [estimate (standard error)]:
- $\beta_0 = -227.292$ (73.483)
  - Predicted value of weight for a person with height = 0
  - Nonsensical, but we could have centered height
- $\beta_1 = 6.048$ (1.076)
  - Change in predicted weight for every one-unit increase in height (weight goes up 6.048 pounds per inch)
- $\sigma_e^2 = 1{,}218$ (SE not given)
  - The residual variance of weight
  - Height explains $\frac{3{,}179.1 - 1{,}218}{3{,}179.1} = 61.7\%$ of the variance of weight
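The variance-explained figure is just a proportional reduction in residual variance, computed from the two models' estimates:

```python
var_empty = 3179.1    # residual variance, empty model (Model 1)
var_height = 1218.0   # residual variance once height is added (Model 2)

r_squared = (var_empty - var_height) / var_empty
assert round(r_squared, 3) == 0.617   # the 61.7% on the slide
```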

Model 2a: Predicting Weight from Mean-Centered Height
Linear model: $W_p = \beta_0 + \beta_1\left(H_p - \bar{H}\right) + e_p$, where $e_p \sim N(0, \sigma_e^2)$
Estimated parameters [estimate (standard error)]:
- $\beta_0 = 183.4$ (7.804)
  - Predicted weight for a person with height = mean height
  - This is the mean weight (the regression line goes through the means)
- $\beta_1 = 6.048$ (1.076)
  - Change in predicted weight for every one-unit increase in height (weight goes up 6.048 pounds per inch); same as previous
- $\sigma_e^2 = 1{,}218$ (SE not given)
  - The residual variance of weight
  - Height explains $\frac{3{,}179.1 - 1{,}218}{3{,}179.1} = 61.7\%$ of the variance of weight; same as previous
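That centering changes only the intercept can be demonstrated with ordinary least squares on simulated data (all values below are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
h = rng.normal(67.9, 7.4, 200)                   # hypothetical heights
w = -227.0 + 6.0 * h + rng.normal(0, 35, 200)    # hypothetical weights

b1_raw, b0_raw = np.polyfit(h, w, 1)             # uncentered predictor
b1_cen, b0_cen = np.polyfit(h - h.mean(), w, 1)  # mean-centered predictor

assert np.isclose(b1_raw, b1_cen)    # the slope is unchanged by centering
assert np.isclose(b0_cen, w.mean())  # centered intercept = mean of the outcome
```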

Plotting Model 2a

Hypothesis Tests for Parameters
To determine whether the regression slope is significantly different from zero, we must use a hypothesis test:
$H_0: \beta_1 = 0$ versus $H_1: \beta_1 \neq 0$
We have two options for this test (both are the same in this case):
- Use the ANOVA table: sums-of-squares F-test
- Use the Wald test for the parameter: $t = \frac{\hat\beta_1}{SE(\hat\beta_1)}$
  - Here $t^2 = F$
Wald test: $t = \frac{\hat\beta_1}{SE(\hat\beta_1)} = \frac{6.048}{1.076} = 5.62$; $p < .001$
Conclusion: reject the null $H_0$; the slope is significant
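The Wald statistic is simple arithmetic on the reported estimate and its standard error:

```python
b1, se_b1 = 6.048, 1.076   # slope estimate and SE from Model 2/2a

t = b1 / se_b1             # Wald t statistic
f = t ** 2                 # equals the F statistic for this single-df test

assert round(t, 2) == 5.62
```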

Model 3: Predicting Weight from Gender ("ANOVA")
Linear model: $\text{Weight}_p = \beta_0 + \beta_2\text{Female}_p + e_p$, where $e_p \sim N(0, \sigma_e^2)$
Note: because gender is a categorical predictor, we must first code it into a number before entering it into the model (typically done automatically in software)
- Here we use Female = 1 for females and Female = 0 for males
Estimated parameters [estimate (standard error)]:
- $\beta_0 = 235.9$ (5.415)
  - Predicted weight for a person with Female = 0 (males): the mean weight of males
- $\beta_2 = -105.0$ (7.658)
  - $t = \frac{-105.0}{7.658} = -13.71$; $p < .001$
  - Change in predicted weight for every one-unit increase in Female
  - In this case, the difference between the mean for males and the mean for females
- $\sigma_e^2 = 293$ (SE not given)
  - The residual variance of weight
  - Gender explains $\frac{3{,}179.1 - 293}{3{,}179.1} = 90.8\%$ of the variance of weight

Model 3: More on Categorical Predictors
Gender was coded using what is called reference (dummy) coding:
- The intercept becomes the mean of the reference group (the 0 group)
- Slopes become the difference in means between the reference and non-reference groups
- For C categories, C - 1 predictors are created
All coding choices can be recovered from the model:
- Predicted weight for females (the mean weight of females): $\hat{W}_p = \beta_0 + \beta_2 = 235.9 - 105.0 = 130.9$
- Predicted weight for males: $\hat{W}_p = \beta_0 = 235.9$
What would $\beta_0$ and $\beta_2$ be if we had coded Male = 1?
- Super cool idea: what if you could do this in software all at once?
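Recovering the group means from reference-coded coefficients, and seeing what recoding would do, takes only a few lines (estimates from Model 3):

```python
b0, b2 = 235.9, -105.0   # intercept (male mean) and female effect

mean_male = b0           # reference-group mean
mean_female = b0 + b2    # non-reference mean

# If Male = 1 were the dummy instead, the roles reverse:
b0_male_coded = mean_female               # intercept becomes the female mean
b2_male_coded = mean_male - mean_female   # the slope flips sign

assert abs(b2_male_coded + b2) < 1e-9
```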

Model 3: Predictions and Plots

Model 4: Predicting Weight from Height and Gender, without Interaction ("ANCOVA")
Linear model: $W_p = \beta_0 + \beta_1\left(H_p - \bar{H}\right) + \beta_2 F_p + e_p$, where $e_p \sim N(0, \sigma_e^2)$
Estimated parameters [estimate (standard error)]:
- $\beta_0 = 224.256$ (1.439)
  - Predicted weight for a person with Female = 0 (male) and height = mean height ($H_p - \bar{H} = 0$)
- $\beta_1 = 2.708$ (0.155)
  - $t = \frac{2.708}{0.155} = 17.52$; $p < .001$
  - Change in predicted weight for every one-unit increase in height (holding gender constant)
- $\beta_2 = -81.712$ (2.241)
  - $t = \frac{-81.712}{2.241} = -36.46$; $p < .001$
  - Change in predicted weight for every one-unit increase in Female (holding height constant)
  - In this case, the difference between the mean for males and the mean for females, holding height constant
- $\sigma_e^2 = 16$ (SE not given)
  - The residual variance of weight

Model 4: By-Gender Regression Lines
Model 4 assumes identical regression slopes for both genders but different intercepts
- This assumption is tested statistically by Model 5
Predicted weight for females:
$\hat{W}_p = 224.256 + 2.708\left(H_p - \bar{H}\right) - 81.712(1) = 142.544 + 2.708\left(H_p - \bar{H}\right)$
Predicted weight for males:
$\hat{W}_p = 224.256 + 2.708\left(H_p - \bar{H}\right) - 81.712(0) = 224.256 + 2.708\left(H_p - \bar{H}\right)$

Model 4: Predicted Value Regression Lines

Model 5: Predicting Weight from Height and Gender, with Interaction ("ANCOVA-ish")
Linear model: $W_p = \beta_0 + \beta_1\left(H_p - \bar{H}\right) + \beta_2 F_p + \beta_3\left(H_p - \bar{H}\right)F_p + e_p$, where $e_p \sim N(0, \sigma_e^2)$
Estimated parameters [estimate (standard error)]:
- $\beta_0 = 222.184$ (0.838)
  - Predicted weight for a person with Female = 0 (male) and height = mean height ($H_p - \bar{H} = 0$)
- $\beta_1 = 3.190$ (0.111)
  - $t = \frac{3.190}{0.111} = 28.65$; $p < .001$
  - Simple main effect of height: change in predicted weight for every one-unit increase in height (for males only)
  - A conditional main effect: applies when the interacting variable (gender) = 0

Model 5: Estimated Parameters (continued)
- $\beta_2 = -82.272$ (1.211)
  - $t = \frac{-82.272}{1.211} = -67.93$; $p < .001$
  - Simple main effect of gender: change in predicted weight for every one-unit increase in Female, for height = mean height
  - The gender difference at 67.9 inches
- $\beta_3 = -1.094$ (0.168)
  - $t = \frac{-1.094}{0.168} = -6.52$; $p < .001$
  - Gender-by-height interaction: additional change in the predicted value of weight per unit change in either gender or height
  - The difference in the slope for height for females versus males
  - Because Female = 1, it modifies the slope for height for females (here the height slope is less positive for females than for males)
- $\sigma_e^2 = 4.731$ (SE not given)

Model 5: By-Gender Regression Lines
Model 5 does not assume identical regression slopes for both genders
- Because $\beta_3$ was significantly different from zero, the data support different slopes for the genders
Predicted weight for females:
$\hat{W}_p = 222.184 + 3.190\left(H_p - \bar{H}\right) - 82.272(1) - 1.094\left(H_p - \bar{H}\right)(1) = 139.912 + 2.096\left(H_p - \bar{H}\right)$
Predicted weight for males:
$\hat{W}_p = 222.184 + 3.190\left(H_p - \bar{H}\right) - 82.272(0) - 1.094\left(H_p - \bar{H}\right)(0) = 222.184 + 3.190\left(H_p - \bar{H}\right)$
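The by-gender intercepts and simple slopes follow directly from the Model 5 estimates:

```python
b0, b1, b2, b3 = 222.184, 3.190, -82.272, -1.094  # Model 5 estimates

intercept_female = b0 + b2   # intercept for females
slope_female = b1 + b3       # flatter than the male slope of 3.190

assert abs(intercept_female - 139.912) < 1e-9
assert abs(slope_female - 2.096) < 1e-9
```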

Model 5: Predicted Value Regression Lines

Comparing Across Models
Typically, the empty model and Model 5 would be the only models run
- The trick is to describe the impact of all of the predictors together and of each predictor, typically using variance accounted for (explained)
All predictors:
- Baseline: empty model (#1); $\sigma_e^2 = 3{,}179.095$
- Comparison: Model 5; $\sigma_e^2 = 4.731$
- All predictors (gender, height, interaction) explained $\frac{3{,}179.095 - 4.731}{3{,}179.095} = 99.9\%$ of the variance in weight
  - An $R^2$ that is hall-of-fame worthy

Comparing Across Models (continued)
The total effect of height (main effect and interaction):
- Baseline: Model 3 (gender only); $\sigma_e^2 = 293.211$
- Comparison: Model 5 (all predictors); $\sigma_e^2 = 4.731$
- Height explained $\frac{293.211 - 4.731}{293.211} = 98.4\%$ of the variance in weight remaining after gender
  - 98.4% of the 100% - 90.8% = 9.2% left after gender
  - True variance accounted for is 98.4% x 9.2% = 9.1%
The total effect of gender (main effect and interaction):
- Baseline: Model 2a (height only); $\sigma_e^2 = 1{,}217.973$
- Comparison: Model 5 (all predictors); $\sigma_e^2 = 4.731$
- Gender explained $\frac{1{,}217.973 - 4.731}{1{,}217.973} = 99.6\%$ of the variance in weight remaining after height
  - 99.6% of the 100% - 61.7% = 38.3% left after height
  - True variance accounted for is 99.6% x 38.3% = 38.1%
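These proportional-reduction calculations are easy to script (`prop_reduction` is a hypothetical helper name):

```python
def prop_reduction(var_base: float, var_comp: float) -> float:
    """Proportion of baseline residual variance explained by the richer model."""
    return (var_base - var_comp) / var_base

total = prop_reduction(3179.095, 4.731)    # all predictors vs the empty model
height = prop_reduction(293.211, 4.731)    # height's total effect, after gender
gender = prop_reduction(1217.973, 4.731)   # gender's total effect, after height

assert round(total, 3) == 0.999
assert round(height, 3) == 0.984
assert round(gender, 3) == 0.996
```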

About Weight
The distribution of weight was bimodal (shown at the beginning of class)
- However, the analysis only requires the residuals to be normally distributed, not the actual data
- This is the same as saying the conditional distribution of the data, given the predictors, must be normal
Residual: $e_p = \text{Weight}_p - \widehat{\text{Weight}}_p = \text{Weight}_p - \left[\hat\beta_0 + \hat\beta_1\left(H_p - \bar{H}\right) + \hat\beta_2 F_p + \hat\beta_3\left(H_p - \bar{H}\right)F_p\right]$

CONCLUDING REMARKS

Wrapping Up
The general linear model forms the basis for many multivariate statistical techniques
- Certain features of the model change, but many of the same interpretations remain
We will continue to use these terms in more advanced models throughout the rest of the semester
- Extra practice for linear model terms
The trick of linear models is to construct one model that answers all of your research questions