Postgraduate course: Anova and Repeated measurements Day 2 (part 2) Mogens Erlandsen, Department of Biostatistics, Aarhus University, November 2010


[Figure: CVP (mean and sd)]

The within-subject variation is the relevant variation when analyzing changes over time. So how can we estimate the within-subject variation and the between-subject variation?

Univariate Repeated Measurements ANOVA using the anova command

In order to use ANOVA we need stronger assumptions:

3) The standard deviation $\sigma_T$ is the same for all measurements, and the correlations between any two (different) measurements on the same subject are equal, i.e.
   $\sigma_T^2 = \sigma_B^2 + \sigma_W^2$
   The correlation $= \sigma_B^2 / \sigma_T^2$

Note: The default behaviour of Stata's anova command is to test effects against the within-subject variation. This might be wrong if the effect is a between-subjects effect. In that case Stata should be told. See the next slide.

Example EVF continued (data in long format)

Test 1: Hypothesis H2: Parallel curves

This test can be performed by a 3-way ANOVA with id (subject identification), time, and group#time (interaction) in the model. group is a between-subjects effect.

Stata 11:
   anova evf group / id|group time time#group, repeated(time)

The command wsanova (which has to be downloaded) might be easier:
   wsanova evf time, id(id) between(group) epsilon
(almost the same output!)

Run
   set matsize 800, permanently
before using the anova commands.
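Assumption 3 can be written out with a random subject effect. The following is a sketch under an assumed notation that is not in the slides: $Y_{ij}$ is measurement $j$ on subject $i$, $B_i$ is the subject effect with standard deviation $\sigma_B$, and $E_{ij}$ is the independent within-subject error with standard deviation $\sigma_W$, so that $Y_{ij} = \mu_j + B_i + E_{ij}$.

```latex
\begin{aligned}
\operatorname{Var}(Y_{ij}) &= \sigma_B^2 + \sigma_W^2 = \sigma_T^2, \\
\operatorname{Cov}(Y_{ij}, Y_{ik}) &= \sigma_B^2 \qquad (j \neq k), \\
\operatorname{Corr}(Y_{ij}, Y_{ik}) &= \frac{\sigma_B^2}{\sigma_B^2 + \sigma_W^2}
                                     = \frac{\sigma_B^2}{\sigma_T^2}.
\end{aligned}
```

Under this model every measurement has the same variance and every pair of measurements on the same subject has the same correlation, which is exactly what assumption 3 asks for.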

The Univariate Repeated Measurements ANOVA:
   anova evf group / id|group time time#group, repeated(time)

[Output: Number of obs = 180; ANOVA table (Source, Partial SS, df, MS, F, Prob > F) with rows Model, group, id|group (between subjects), time, time#group (within subjects), Residual, Total. The time#group row is Test 1. Numeric values are missing.]

Output (continued):
   Between-subjects error term:  id|group
                        Levels:  30 (28 df)
        Lowest b.s.e. variable:  id
        Covariance pooled over:  group (for repeated variable)

   Repeated variable: time
   [Huynh-Feldt epsilon, Greenhouse-Geisser epsilon, Box's conservative epsilon; table of Prob > F (Regular, H-F, G-G, Box) for time and time#group, with Residual df = 140. The regular F and p-value are the same as on the previous slide. Numeric values are missing.]

Some corrections of the p-value have been proposed for when the assumptions (mainly assumption 3) are violated. The corrected p-values will normally be larger than the regular p-value.

How can we use the four (or three) p-values? The following has been proposed:
   If the regular (uncorrected) p-value is not significant (>0.05), then stop and accept (fail to reject) the hypothesis;
   else if the G-G p-value is significant (<0.05), then stop and reject the hypothesis;
   else if the Box p-value is significant (<0.05), then stop and reject the hypothesis;
   else stop and accept (fail to reject) the hypothesis.

   wsanova evf time, id(id) between(group) epsilon

[Output: Number of obs = 180; ANOVA table with Between subjects: group, id*group; Within subjects: time, time*group; Residual; Total. Numeric values are missing.]

Note: Within subjects F-test(s) above assume sphericity of residuals; p-values corrected for lack of sphericity appear below.

[Greenhouse-Geisser (G-G) epsilon and Huynh-Feldt (H-F) epsilon; table of Prob > F (Sphericity, G-G, H-F) for time and time*group. Numeric values are missing.]

Kirk (1982)
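The stepwise use of the p-values can be sketched as a small function. This is only an illustration of the rule as stated in the text, in the same order; the function name `sequential_decision` and the p-values in the usage lines are hypothetical and are not taken from the Stata output.

```python
def sequential_decision(p_regular, p_gg, p_box, alpha=0.05):
    """Apply the stepwise rule from the text to three p-values.

    p_regular: uncorrected p-value
    p_gg:      Greenhouse-Geisser corrected p-value
    p_box:     p-value using Box's conservative epsilon
    Returns "reject" or "fail to reject".
    """
    # Step 1: a non-significant regular p-value ends the procedure.
    if p_regular > alpha:
        return "fail to reject"
    # Step 2: a significant G-G p-value leads to rejection.
    if p_gg < alpha:
        return "reject"
    # Step 3: otherwise look at the Box p-value.
    if p_box < alpha:
        return "reject"
    return "fail to reject"


# Hypothetical p-values, for illustration only.
print(sequential_decision(0.20, 0.30, 0.40))
print(sequential_decision(0.01, 0.02, 0.08))
print(sequential_decision(0.04, 0.06, 0.09))
```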

Test 3 (for each group): H4: no changes over time.

   wsanova evf time if group==1, id(id) epsilon

[Output: Number of obs = 90; ANOVA table with rows id, time, Residual, Total; G-G and H-F epsilon; Prob > F (Sphericity, G-G, H-F) for time. Numeric values are missing.]

Note: Within subjects F-test(s) above assume sphericity of residuals; p-values corrected for lack of sphericity appear below.

   wsanova evf time if group==2, id(id) epsilon

[Output: Number of obs = 90; same layout as for group 1. Numeric values are missing. Annotation on the output: lower bound for the p-value.]

If we want to estimate the within- and between-subject standard deviations ($\sigma_T^2 = \sigma_B^2 + \sigma_W^2$), we can use the xtmixed command. We have four variables: evf, group, id, and time:

   xi: xtmixed evf i.time*i.group || id: , nofetable noheader nogroup nostderr nolrtest

Part of the output:
   Random-effects Parameters     Estimate
   id: Identity
      sd(_cons)                  (between-subject sd)
      sd(residual)               (within-subject sd)

From xtmixed we have $sd_W$ and $sd_B$, and then we can calculate $s_W^2$, $s_B^2$ and
   $sd_T^2 = sd_B^2 + sd_W^2$
and hence $s_T$. The (estimated) correlation between two measurements on the same subject is $s_B^2 / s_T^2$. (The numeric values are missing.)
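The arithmetic from the two xtmixed estimates to the total standard deviation and the correlation can be sketched as follows. The input values 3.0 and 4.0 are hypothetical placeholders, not the estimates from the EVF data, and the function name is an assumption made for this illustration.

```python
import math


def variance_components(sd_between, sd_within):
    """Combine between- and within-subject sd's into total sd and correlation."""
    var_between = sd_between ** 2
    var_within = sd_within ** 2
    # Variances add; standard deviations do not.
    var_total = var_between + var_within
    return {
        "sd_total": math.sqrt(var_total),
        # Correlation between two measurements on the same subject.
        "correlation": var_between / var_total,
    }


# Hypothetical sd(_cons) and sd(residual), for illustration only.
result = variance_components(sd_between=3.0, sd_within=4.0)
print(result)
```

The sketch makes the point that the decomposition holds on the variance scale: the total sd is the square root of the sum of the squared sd's.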

We can look at each group separately:

   xi: xtmixed evf i.time || id: if group==1, nofetable noheader nogroup nostderr nolrtest

[Output: Random-effects Parameters (Estimate, Std. Err., 95% Conf. Interval) for id: Identity, sd(_cons) and sd(residual). Numeric values are missing.]

   xi: xtmixed evf i.time || id: if group==2, nofetable noheader nogroup nostderr nolrtest

[Output: same layout as for group 1. Numeric values are missing.]

Estimates in each group:
[Table: grp, $s_W$, $s_B$, $s_T$, Correlation. Numeric values are missing.]

We can see that the estimates of the between-subject variation ($s_B$) are almost equal, but the within-subject variation ($s_W$) differs between the groups, and hence so do the total variation and the correlation.

Remarks:
The correlations are expected to be positive (why?), but in special cases one might get negative correlations (weight of mice with a limited amount of food and ...).

We can compare the estimates above with the standard deviations and correlations calculated from the 6 variables evf1, evf2, ..., evf6:
[Table: grp, $s_W$, $s_B$, $s_T$, Correlation. Numeric values are missing.]

Conclusion:
We found a significant difference between the groups with respect to changes over time (p<0.004).
We found statistically significant changes over time in the CPB-group (p<0.006) but not in the Sham-group (p>0.19).

Checking the model:
An important, but often suppressed, part of the analysis is to check whether the assumptions for the analysis are sufficiently fulfilled (a weak statement ...), or whether a transformation (ln-transformation?) of the data is better, or whether we need to look for an analysis with weaker assumptions.

Checking the model:
The tests are normally F-tests, and
   the result of the F-test is not affected by moderate departures from normality, especially for large numbers of observations in each group;
   the F-test is more sensitive to the assumption of equal variances (standard deviations), unless the sample sizes in the groups are almost equal. (One can reduce the degrees of freedom as in the t-test with unequal variances.)

Assumptions:

Test 1: Parallel curves
1) All the differences between two time points are multivariate normally distributed within groups.
2) The sd's and the correlations between differences should be the same in the two groups (mvtest).

Test 3: No change over time (within a group)
1) All the differences between two time points are multivariate normally distributed.

Example (evf):
[Figure: Probability plots for group 1 and for group 2; each panel shows one difference (d1_2, d2_3, ...) against the inverse normal.]

[Figure: Scatter plots for (some of) the differences, d2_3 against d1_2.]

The variation within the groups should be equal for each set of differences (and all equal if we use the ANOVA).

We can also use the figures from the paired analysis (see Basic Biostatistics): difference (or change) versus average (or sum), i.e. the Bland-Altman plot. Look for increasing (or decreasing) changes when the average increases and/or increasing variation when the average increases. If so, then the ln-transformation of the data may be appropriate.

[Figure: dif-ave plots for d1_2 and d2_3.]

mvtest can also test for normality:

   mvtest norm d1_2 d2_3 if group==1, stats(all)

[Output: Test for multivariate normality: Mardia mSkewness, chi2(35); Mardia mKurtosis, chi2(1) = 0.61; Henze-Zirkler, chi2(1); Doornik-Hansen, chi2(10). The remaining test statistics and the p-values are missing.]
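The quantities behind a difference-versus-average (Bland-Altman) plot can be computed as in the sketch below. The data are hypothetical, the function name is an assumption, and the sign convention (second measurement minus first) is also an assumption, since the slides do not state how the difference variables were defined.

```python
def diff_and_average(first, second):
    """Return per-subject differences and averages of two paired measurements."""
    if len(first) != len(second):
        raise ValueError("both measurements need one value per subject")
    # Assumed convention: difference = second minus first.
    diffs = [b - a for a, b in zip(first, second)]
    aves = [(a + b) / 2 for a, b in zip(first, second)]
    return diffs, aves


# Hypothetical measurements at two time points for four subjects.
evf1 = [10.0, 12.0, 14.0, 16.0]
evf2 = [11.0, 12.0, 16.0, 19.0]
diffs, aves = diff_and_average(evf1, evf2)

# Each (average, difference) pair is one point in the plot.
for ave, dif in zip(aves, diffs):
    print(ave, dif)
```

In these made-up numbers the differences grow with the average, which is the pattern that would point towards an ln-transformation.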

mvtest can also test for normality (bivariate):

   mvtest norm d1_2 d2_3 if group==1, biv

[Output: Doornik-Hansen test for bivariate normality: pair of variables, chi2, df, Prob>chi2. Numeric values are missing.]

mvtest can also test for normality (univariate):

   mvtest norm d1_2 d2_3 if group==1, uni

[Output: Test for univariate normality: Variable, Pr(Skewness), Pr(Kurtosis), joint adj chi2(2), Prob>chi2. Numeric values are missing.]

Conclusion: The assumptions (normality) seem to be OK; similar result for group 2.
Remark: be careful; a lot of tests.

Assumption (the univariate (ANOVA) approach):
3) The standard deviation $\sigma_T$ is the same for all measurements, and the correlations between any two (different) measurements on the same subject are equal.

   mvtest cov evf1 evf2 evf3 evf4 evf5 evf6 if group==1, compound
   Test that covariance matrix is compound symmetric
   Adjusted LR chi2(19) = 7.88 (p-value missing)

   mvtest cov evf1 evf2 evf3 evf4 evf5 evf6 if group==2, compound
   Test that covariance matrix is compound symmetric
   Adjusted LR chi2(19) = 7.7 (p-value missing)

Conclusion: We accept the hypothesis for each group.

Checking the assumptions for the ANOVA approach:
[Figure: for Group 1 and Group 2, residuals against linear prediction and residual probability plots.]
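To make the compound symmetry hypothesis concrete, the sketch below builds the covariance matrix that assumption 3 implies for six measurements. The standard deviations 3.0 and 4.0 are hypothetical, and the function name is an assumption for this illustration; it does not reproduce the mvtest calculation itself.

```python
def compound_symmetric_cov(sd_between, sd_within, n_times):
    """Covariance matrix with equal variances and equal covariances."""
    var_between = sd_between ** 2
    var_total = var_between + sd_within ** 2
    # Diagonal: total variance; off-diagonal: between-subject variance.
    return [
        [var_total if i == j else var_between for j in range(n_times)]
        for i in range(n_times)
    ]


# Hypothetical values, six time points as for evf1 to evf6.
cov = compound_symmetric_cov(sd_between=3.0, sd_within=4.0, n_times=6)
for row in cov:
    print(row)
```

The matrix has only two distinct entries, so a test of compound symmetry compares the observed covariance matrix of the six variables with this two-parameter structure.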

Conclusion:
The evf measurements seem to fulfil the assumptions about the normal distribution but have problems with the standard deviations/correlations between the groups. The Univariate Repeated Measurements ANOVA may be appropriate (for each group separately), and one can state the two/three standard deviations for the two groups (i.e. in a figure showing the mean curves). (An analysis of ln-transformed data gives almost the same result.)

Example (distance):
[Figure: Distance against time for Boys and Girls.]

   anova dist sex / id|sex time sex#time, repeated(time)

[Output: Number of obs = 108; ANOVA table (Source, Partial SS, df, MS, F, Prob > F) with rows Model, sex, id|sex, time, sex*time, Residual, Total. Numeric values are missing.]

   Between-subjects error term:  id|sex
                        Levels:  27 (25 df)
        Lowest b.s.e. variable:  id
        Covariance pooled over:  sex (for repeated variable)

   Repeated variable: time
   [Huynh-Feldt epsilon (with the note "*Huynh-Feldt epsilon reset to ..."), Greenhouse-Geisser epsilon, Box's conservative epsilon; table of Prob > F (Regular, H-F, G-G, Box) for time and sex*time, with Residual df = 75. Numeric values are missing.]

Conclusion: We found no significant difference between the groups (sex) with respect to changes over time (p>.078).

   anova dist sex / id|sex time sex#time, repeated(time)

[Output: Number of obs = 108; ANOVA table with rows Model, sex, id|sex, time, sex*time, Residual, Total. Numeric values are missing.]

If we accept H2 (parallel curves), we can test whether the two mean curves are equal. It is exactly the same test as on Day 2, part 1, i.e. equal to a t-test on the average of the 4 distance measurements. All three assumptions should be fulfilled.

Example (distance):
[Output repeated: between-subjects error term id|sex, Levels: 27 (25 df); repeated variable time; epsilons and Prob > F (Regular, H-F, G-G, Box) for time and sex*time, Residual df = 75. Numeric values are missing.]

If we accept H2 (parallel curves), we can test H4 (no changes over time) for both groups in one test. If we perform a test for each of the groups, we can get two different answers, or we can accept H4 for both groups separately due to low power.

If there are problems with the assumptions, we can use a permutation test:

   permute sex r(f), reps(10000): mvtest mean d8_10 d10_12 d12_14, by(sex) het

   Monte Carlo permutation results          Number of obs = 27
      command:      mvtest mean d8_10 d10_12 d12_14, by(sex) het
      _pm_1:        r(f)
      permute var:  sex
   [Table: T, T(obs), c, n, p=c/n, SE(p), 95% Conf. Interval for _pm_1. Numeric values are missing.]
   Note: confidence interval is with respect to p=c/n.
   Note: c = #{|T| >= |T(obs)|}

Conclusion: We reject H2 (p=0.030); the difference between the two groups in changes over time is statistically significant.

Remarks:
We now have more than one way to analyze the data. Which one (if any) shall we choose? How can we describe the analysis? How can we describe the results? Depending on what we can assume, we can try to answer the questions (Day 4).
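The idea behind the permutation p-value p = c/n can be sketched in a few lines. This is a simplified illustration and not what permute with mvtest does: it uses one outcome per subject and the absolute difference in group means as the statistic, instead of the multivariate F statistic. The data, the function name and the seed are hypothetical.

```python
import random
from statistics import mean


def permutation_p_value(values, labels, reps=10000, seed=1):
    """Monte Carlo permutation p-value for a difference in two group means.

    values: one outcome per subject
    labels: group membership (0 or 1) per subject
    Returns c / reps, where c counts permutations with |T| >= |T(obs)|.
    """
    rng = random.Random(seed)

    def statistic(lbls):
        group0 = [v for v, g in zip(values, lbls) if g == 0]
        group1 = [v for v, g in zip(values, lbls) if g == 1]
        return abs(mean(group1) - mean(group0))

    t_obs = statistic(labels)
    shuffled = list(labels)
    count = 0
    for _ in range(reps):
        # Reassign the group labels at random, keeping the group sizes.
        rng.shuffle(shuffled)
        if statistic(shuffled) >= t_obs:
            count += 1
    return count / reps


# Hypothetical changes between two ages for six children.
changes = [1.0, 2.0, 3.0, 101.0, 102.0, 103.0]
sex = [0, 0, 0, 1, 1, 1]
print(permutation_p_value(changes, sex, reps=2000))
```

Because only the group labels are shuffled, the p-value does not rely on normality or on equal covariance matrices, which is why the approach is attractive when the assumptions are in doubt.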


More information

Psy 420 Final Exam Fall 06 Ainsworth. Key Name

Psy 420 Final Exam Fall 06 Ainsworth. Key Name Psy 40 Final Exam Fall 06 Ainsworth Key Name Psy 40 Final A researcher is studying the effect of Yoga, Meditation, Anti-Anxiety Drugs and taking Psy 40 and the anxiety levels of the participants. Twenty

More information

Hotelling s One- Sample T2

Hotelling s One- Sample T2 Chapter 405 Hotelling s One- Sample T2 Introduction The one-sample Hotelling s T2 is the multivariate extension of the common one-sample or paired Student s t-test. In a one-sample t-test, the mean response

More information

Warwick Economics Summer School Topics in Microeconometrics Instrumental Variables Estimation

Warwick Economics Summer School Topics in Microeconometrics Instrumental Variables Estimation Warwick Economics Summer School Topics in Microeconometrics Instrumental Variables Estimation Michele Aquaro University of Warwick This version: July 21, 2016 1 / 31 Reading material Textbook: Introductory

More information

Autocorrelation. Think of autocorrelation as signifying a systematic relationship between the residuals measured at different points in time

Autocorrelation. Think of autocorrelation as signifying a systematic relationship between the residuals measured at different points in time Autocorrelation Given the model Y t = b 0 + b 1 X t + u t Think of autocorrelation as signifying a systematic relationship between the residuals measured at different points in time This could be caused

More information

Lecture 4: Multivariate Regression, Part 2

Lecture 4: Multivariate Regression, Part 2 Lecture 4: Multivariate Regression, Part 2 Gauss-Markov Assumptions 1) Linear in Parameters: Y X X X i 0 1 1 2 2 k k 2) Random Sampling: we have a random sample from the population that follows the above

More information

df=degrees of freedom = n - 1

df=degrees of freedom = n - 1 One sample t-test test of the mean Assumptions: Independent, random samples Approximately normal distribution (from intro class: σ is unknown, need to calculate and use s (sample standard deviation)) Hypotheses:

More information

Lecture 3: Multivariate Regression

Lecture 3: Multivariate Regression Lecture 3: Multivariate Regression Rates, cont. Two weeks ago, we modeled state homicide rates as being dependent on one variable: poverty. In reality, we know that state homicide rates depend on numerous

More information

Confidence Intervals, Testing and ANOVA Summary

Confidence Intervals, Testing and ANOVA Summary Confidence Intervals, Testing and ANOVA Summary 1 One Sample Tests 1.1 One Sample z test: Mean (σ known) Let X 1,, X n a r.s. from N(µ, σ) or n > 30. Let The test statistic is H 0 : µ = µ 0. z = x µ 0

More information

Econometrics Midterm Examination Answers

Econometrics Midterm Examination Answers Econometrics Midterm Examination Answers March 4, 204. Question (35 points) Answer the following short questions. (i) De ne what is an unbiased estimator. Show that X is an unbiased estimator for E(X i

More information

Stats fest Analysis of variance. Single factor ANOVA. Aims. Single factor ANOVA. Data

Stats fest Analysis of variance. Single factor ANOVA. Aims. Single factor ANOVA. Data 1 Stats fest 2007 Analysis of variance murray.logan@sci.monash.edu.au Single factor ANOVA 2 Aims Description Investigate differences between population means Explanation How much of the variation in response

More information

Chapter 9. Multivariate and Within-cases Analysis. 9.1 Multivariate Analysis of Variance

Chapter 9. Multivariate and Within-cases Analysis. 9.1 Multivariate Analysis of Variance Chapter 9 Multivariate and Within-cases Analysis 9.1 Multivariate Analysis of Variance Multivariate means more than one response variable at once. Why do it? Primarily because if you do parallel analyses

More information

1: a b c d e 2: a b c d e 3: a b c d e 4: a b c d e 5: a b c d e. 6: a b c d e 7: a b c d e 8: a b c d e 9: a b c d e 10: a b c d e

1: a b c d e 2: a b c d e 3: a b c d e 4: a b c d e 5: a b c d e. 6: a b c d e 7: a b c d e 8: a b c d e 9: a b c d e 10: a b c d e Economics 102: Analysis of Economic Data Cameron Spring 2016 Department of Economics, U.C.-Davis Final Exam (A) Tuesday June 7 Compulsory. Closed book. Total of 58 points and worth 45% of course grade.

More information

2.1. Consider the following production function, known in the literature as the transcendental production function (TPF).

2.1. Consider the following production function, known in the literature as the transcendental production function (TPF). CHAPTER Functional Forms of Regression Models.1. Consider the following production function, known in the literature as the transcendental production function (TPF). Q i B 1 L B i K i B 3 e B L B K 4 i

More information

Lecture 3: Inference in SLR

Lecture 3: Inference in SLR Lecture 3: Inference in SLR STAT 51 Spring 011 Background Reading KNNL:.1.6 3-1 Topic Overview This topic will cover: Review of hypothesis testing Inference about 1 Inference about 0 Confidence Intervals

More information

Correlation and regression. Correlation and regression analysis. Measures of association. Why bother? Positive linear relationship

Correlation and regression. Correlation and regression analysis. Measures of association. Why bother? Positive linear relationship 1 Correlation and regression analsis 12 Januar 2009 Monda, 14.00-16.00 (C1058) Frank Haege Department of Politics and Public Administration Universit of Limerick frank.haege@ul.ie www.frankhaege.eu Correlation

More information

Outline. Linear OLS Models vs: Linear Marginal Models Linear Conditional Models. Random Intercepts Random Intercepts & Slopes

Outline. Linear OLS Models vs: Linear Marginal Models Linear Conditional Models. Random Intercepts Random Intercepts & Slopes Lecture 2.1 Basic Linear LDA 1 Outline Linear OLS Models vs: Linear Marginal Models Linear Conditional Models Random Intercepts Random Intercepts & Slopes Cond l & Marginal Connections Empirical Bayes

More information

Example: Four levels of herbicide strength in an experiment on dry weight of treated plants.

Example: Four levels of herbicide strength in an experiment on dry weight of treated plants. The idea of ANOVA Reminders: A factor is a variable that can take one of several levels used to differentiate one group from another. An experiment has a one-way, or completely randomized, design if several

More information

sociology 362 regression

sociology 362 regression sociology 36 regression Regression is a means of modeling how the conditional distribution of a response variable (say, Y) varies for different values of one or more independent explanatory variables (say,

More information

Linear Modelling in Stata Session 6: Further Topics in Linear Modelling

Linear Modelling in Stata Session 6: Further Topics in Linear Modelling Linear Modelling in Stata Session 6: Further Topics in Linear Modelling Mark Lunt Arthritis Research UK Epidemiology Unit University of Manchester 14/11/2017 This Week Categorical Variables Categorical

More information

STT 843 Key to Homework 1 Spring 2018

STT 843 Key to Homework 1 Spring 2018 STT 843 Key to Homework Spring 208 Due date: Feb 4, 208 42 (a Because σ = 2, σ 22 = and ρ 2 = 05, we have σ 2 = ρ 2 σ σ22 = 2/2 Then, the mean and covariance of the bivariate normal is µ = ( 0 2 and Σ

More information

Please discuss each of the 3 problems on a separate sheet of paper, not just on a separate page!

Please discuss each of the 3 problems on a separate sheet of paper, not just on a separate page! Econometrics - Exam May 11, 2011 1 Exam Please discuss each of the 3 problems on a separate sheet of paper, not just on a separate page! Problem 1: (15 points) A researcher has data for the year 2000 from

More information

Problem Set #5-Key Sonoma State University Dr. Cuellar Economics 317- Introduction to Econometrics

Problem Set #5-Key Sonoma State University Dr. Cuellar Economics 317- Introduction to Econometrics Problem Set #5-Key Sonoma State University Dr. Cuellar Economics 317- Introduction to Econometrics C1.1 Use the data set Wage1.dta to answer the following questions. Estimate regression equation wage =

More information

Economics 326 Methods of Empirical Research in Economics. Lecture 14: Hypothesis testing in the multiple regression model, Part 2

Economics 326 Methods of Empirical Research in Economics. Lecture 14: Hypothesis testing in the multiple regression model, Part 2 Economics 326 Methods of Empirical Research in Economics Lecture 14: Hypothesis testing in the multiple regression model, Part 2 Vadim Marmer University of British Columbia May 5, 2010 Multiple restrictions

More information

Linear models and their mathematical foundations: Simple linear regression

Linear models and their mathematical foundations: Simple linear regression Linear models and their mathematical foundations: Simple linear regression Steffen Unkel Department of Medical Statistics University Medical Center Göttingen, Germany Winter term 2018/19 1/21 Introduction

More information

THE MULTIVARIATE LINEAR REGRESSION MODEL

THE MULTIVARIATE LINEAR REGRESSION MODEL THE MULTIVARIATE LINEAR REGRESSION MODEL Why multiple regression analysis? Model with more than 1 independent variable: y 0 1x1 2x2 u It allows : -Controlling for other factors, and get a ceteris paribus

More information

options description set confidence level; default is level(95) maximum number of iterations post estimation results

options description set confidence level; default is level(95) maximum number of iterations post estimation results Title nlcom Nonlinear combinations of estimators Syntax Nonlinear combination of estimators one expression nlcom [ name: ] exp [, options ] Nonlinear combinations of estimators more than one expression

More information

Lecture 3: Multiple Regression. Prof. Sharyn O Halloran Sustainable Development U9611 Econometrics II

Lecture 3: Multiple Regression. Prof. Sharyn O Halloran Sustainable Development U9611 Econometrics II Lecture 3: Multiple Regression Prof. Sharyn O Halloran Sustainable Development Econometrics II Outline Basics of Multiple Regression Dummy Variables Interactive terms Curvilinear models Review Strategies

More information

Fixed and Random Effects Models: Vartanian, SW 683

Fixed and Random Effects Models: Vartanian, SW 683 : Vartanian, SW 683 Fixed and random effects models See: http://teaching.sociology.ul.ie/dcw/confront/node45.html When you have repeated observations per individual this is a problem and an advantage:

More information

Homework Solutions Applied Logistic Regression

Homework Solutions Applied Logistic Regression Homework Solutions Applied Logistic Regression WEEK 6 Exercise 1 From the ICU data, use as the outcome variable vital status (STA) and CPR prior to ICU admission (CPR) as a covariate. (a) Demonstrate that

More information

Institute of Actuaries of India

Institute of Actuaries of India Institute of Actuaries of India Subject CT3 Probability & Mathematical Statistics May 2011 Examinations INDICATIVE SOLUTION Introduction The indicative solution has been written by the Examiners with the

More information

Introduction. Chapter 8

Introduction. Chapter 8 Chapter 8 Introduction In general, a researcher wants to compare one treatment against another. The analysis of variance (ANOVA) is a general test for comparing treatment means. When the null hypothesis

More information

SOCY5601 Handout 8, Fall DETECTING CURVILINEARITY (continued) CONDITIONAL EFFECTS PLOTS

SOCY5601 Handout 8, Fall DETECTING CURVILINEARITY (continued) CONDITIONAL EFFECTS PLOTS SOCY5601 DETECTING CURVILINEARITY (continued) CONDITIONAL EFFECTS PLOTS More on use of X 2 terms to detect curvilinearity: As we have said, a quick way to detect curvilinearity in the relationship between

More information

Research Design - - Topic 12 MRC Analysis and Two Factor Designs: Completely Randomized and Repeated Measures 2010 R.C. Gardner, Ph.D.

Research Design - - Topic 12 MRC Analysis and Two Factor Designs: Completely Randomized and Repeated Measures 2010 R.C. Gardner, Ph.D. esearch Design - - Topic MC nalysis and Two Factor Designs: Completely andomized and epeated Measures C Gardner, PhD General overview Completely andomized Two Factor Designs Model I Effect Coding egression

More information

Six Sigma Black Belt Study Guides

Six Sigma Black Belt Study Guides Six Sigma Black Belt Study Guides 1 www.pmtutor.org Powered by POeT Solvers Limited. Analyze Correlation and Regression Analysis 2 www.pmtutor.org Powered by POeT Solvers Limited. Variables and relationships

More information

ANOVA continued. Chapter 10

ANOVA continued. Chapter 10 ANOVA continued Chapter 10 Zettergren (003) School adjustment in adolescence for previously rejected, average, and popular children. Effect of peer reputation on academic performance and school adjustment

More information

Lecture 4: Multivariate Regression, Part 2

Lecture 4: Multivariate Regression, Part 2 Lecture 4: Multivariate Regression, Part 2 Gauss-Markov Assumptions 1) Linear in Parameters: Y X X X i 0 1 1 2 2 k k 2) Random Sampling: we have a random sample from the population that follows the above

More information

Name: Biostatistics 1 st year Comprehensive Examination: Applied in-class exam. June 8 th, 2016: 9am to 1pm

Name: Biostatistics 1 st year Comprehensive Examination: Applied in-class exam. June 8 th, 2016: 9am to 1pm Name: Biostatistics 1 st year Comprehensive Examination: Applied in-class exam June 8 th, 2016: 9am to 1pm Instructions: 1. This is exam is to be completed independently. Do not discuss your work with

More information

Computer Exercise 3 Answers Hypothesis Testing

Computer Exercise 3 Answers Hypothesis Testing Computer Exercise 3 Answers Hypothesis Testing. reg lnhpay xper yearsed tenure ---------+------------------------------ F( 3, 6221) = 512.58 Model 457.732594 3 152.577531 Residual 1851.79026 6221.297667619

More information

Thursday Morning. Growth Modelling in Mplus. Using a set of repeated continuous measures of bodyweight

Thursday Morning. Growth Modelling in Mplus. Using a set of repeated continuous measures of bodyweight Thursday Morning Growth Modelling in Mplus Using a set of repeated continuous measures of bodyweight 1 Growth modelling Continuous Data Mplus model syntax refresher ALSPAC Confirmatory Factor Analysis

More information