Estimation of relative protein abundance and statistical analysis of proteomic data from multiple iTRAQ experiments


1 Estimation of relative protein abundance and statistical analysis of proteomic data from multiple iTRAQ experiments
Ingo Ruczinski, Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health. April 17, 2012.
Acknowledgments: Shelley Herbrich, Keith West, Bob Cole, Kerry Schulze, Parul Christian, Peter Scholl, Alain Labrique, Jim Yager, John Groopman. Funding from the Bill and Melinda Gates Foundation (#521).

2 What is Hidden Hunger?
Hidden hunger is unlike the hunger that comes from a lack of food. It is a chronic lack of vitamins and minerals that often has no visible warning signs, so that people who suffer from it may not even be aware of it. Its consequences are nevertheless disastrous: hidden hunger can lead to mental impairment, poor health and productivity, or even death. One in three people in the world suffers from hidden hunger. Women and children from the lower income groups in developing countries are often the most affected.
Figure: Prevalence of low micronutrient status. Jiang et al., J Nutr (2005).

3 Strategies for preventing micronutrient deficiencies
However, what is lacking is an ability to routinely and quantitatively assess population status with respect to multiple essential and potentially deficient micronutrients, in order to: identify highest-risk populations; estimate prevalence and severity; monitor population trends over time; inform, target and guide program development; evaluate and improve programs.

4 Tandem mass spectrometry. iTRAQ: Isobaric Tag for Relative and Absolute Quantitation

5 iTRAQ: Isobaric Tag for Relative and Absolute Quantitation. Masterpools

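6 Spectra files. Variability in ratios
\[
\mathrm{SD}\!\left(\frac{\hat{Y}_S}{\hat{Y}_R}\right)
\approx \frac{Y_S}{Y_R}
\sqrt{\left(\frac{\sigma_{\hat{Y}_S}}{Y_S}\right)^{2}
 - 2\rho_{\{\hat{Y}_S,\hat{Y}_R\}}\left(\frac{\sigma_{\hat{Y}_S}}{Y_S}\right)\left(\frac{\sigma_{\hat{Y}_R}}{Y_R}\right)
 + \left(\frac{\sigma_{\hat{Y}_R}}{Y_R}\right)^{2}}
= \frac{1}{Y_R}
\sqrt{\sigma^{2}_{\hat{Y}_S}
 - 2\rho_{\{\hat{Y}_S,\hat{Y}_R\}}\,\sigma_{\hat{Y}_S}\sigma_{\hat{Y}_R}\,\frac{Y_S}{Y_R}
 + \sigma^{2}_{\hat{Y}_R}\left(\frac{Y_S}{Y_R}\right)^{2}}
\]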
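The ratio variability above can be checked numerically. The following is a minimal sketch, assuming the formula is the usual first-order (delta method) approximation and that both intensities are positive; the function names and the example numbers are hypothetical, chosen only for illustration.

```python
import math


def ratio_sd_relative_form(y_s, y_r, sd_s, sd_r, rho):
    """Approximate SD of Y_S_hat / Y_R_hat, written via coefficients of variation."""
    cv_s = sd_s / y_s
    cv_r = sd_r / y_r
    inner = cv_s ** 2 - 2.0 * rho * cv_s * cv_r + cv_r ** 2
    # Guard against tiny negative values caused by floating-point rounding.
    return (y_s / y_r) * math.sqrt(max(inner, 0.0))


def ratio_sd_absolute_form(y_s, y_r, sd_s, sd_r, rho):
    """The same quantity, written with 1 / Y_R factored out."""
    ratio = y_s / y_r
    inner = sd_s ** 2 - 2.0 * rho * sd_s * sd_r * ratio + (sd_r * ratio) ** 2
    return math.sqrt(max(inner, 0.0)) / y_r


if __name__ == "__main__":
    # Hypothetical sample (S) and reference (R) channel intensities.
    for rho in (0.0, 0.5, 0.9):
        a = ratio_sd_relative_form(200.0, 100.0, 20.0, 10.0, rho)
        b = ratio_sd_absolute_form(200.0, 100.0, 20.0, 10.0, rho)
        print(rho, a, b)
```

Both forms agree, and the approximate SD shrinks as the correlation between the two channel estimates grows, which is one way to read the slide: positively correlated numerator and denominator give more stable ratios.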

7 Spectra files. Figure: reporter ion intensities for a run with 8 technical replicates of a master pool.

8 Figures: log reporter ion intensities (median polish), and log reporter ion intensities (median polish, by peptide), for a run with 8 technical replicates of a master pool.
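For readers unfamiliar with the median polish named in these figure titles, here is a minimal sketch of Tukey's median polish on a table whose rows could be spectra and whose columns could be the 8 reporter ion channels. That layout, the function name `median_polish`, and the toy table are assumptions for illustration; the slides do not say exactly how the polish was applied.

```python
from statistics import median


def median_polish(table, max_iter=10, tol=1e-9):
    """Tukey's median polish of a rows-by-columns table (list of lists).

    Returns (overall, row_effects, col_effects, residuals) such that
    table[i][j] == overall + row_effects[i] + col_effects[j] + residuals[i][j].
    """
    n_rows, n_cols = len(table), len(table[0])
    resid = [list(row) for row in table]
    overall = 0.0
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols
    for _ in range(max_iter):
        # Sweep out row medians.
        row_med = [median(r) for r in resid]
        for i in range(n_rows):
            row_eff[i] += row_med[i]
            for j in range(n_cols):
                resid[i][j] -= row_med[i]
        m = median(col_eff)
        overall += m
        col_eff = [c - m for c in col_eff]
        # Sweep out column medians.
        col_med = [median(resid[i][j] for i in range(n_rows)) for j in range(n_cols)]
        for j in range(n_cols):
            col_eff[j] += col_med[j]
            for i in range(n_rows):
                resid[i][j] -= col_med[j]
        m = median(row_eff)
        overall += m
        row_eff = [r - m for r in row_eff]
        # Stop once a full sweep no longer changes anything.
        if max(abs(v) for v in row_med + col_med) < tol:
            break
    return overall, row_eff, col_eff, resid


if __name__ == "__main__":
    # Toy additive table: 3 spectra (rows) by 3 channels (columns).
    toy = [[10 + r + c for c in (-2.0, 0.0, 2.0)] for r in (-1.0, 0.0, 1.0)]
    print(median_polish(toy))
```

In this reading the column effects play the role of channel (subject) effects after spectrum-level differences have been swept into the row effects; medians are used rather than means so that a few outlying spectra have little influence.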

9 Figure: estimated relative protein abundance for a run with 8 technical replicates of a master pool. Missing data

10 Missing data Statistical models [KERR ET AL J COMP BIO 7: 2000 ]

11 Statistical models [HILL ET AL J PROT RES 7: 2008 ]
Estimable parameters
The true protein abundance for subject \(k \in \{1,\dots,8\}\) in a particular experiment is
\[ a_k = \mu + \Delta + \delta_k, \]
where \(\Delta\) is the mean shift of that experiment. However, for each spectrum \(s \in \{1,\dots,S\}\) of \(\log_2\) reporter ion intensities we only observe
\[ Y_{sk} = \nabla_s + \delta_k + \varepsilon_{sk}, \]
with a spectrum-specific level \(\nabla_s\). Note that (since \(\sum_k \delta_k = 0\))
\[ E[\bar{Y}_s] = \nabla_s. \]
Thus, for the de-meaned \(\log_2\) reporter ion intensities we have
\[ E[Z_{sk}] = E[Y_{sk} - \bar{Y}_s] = E[Y_{sk}] - E[\bar{Y}_s] = \delta_k. \]
Also note that the errors for the de-meaned reporter ion intensities are not independent.
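A small simulation can illustrate both claims on this slide: de-meaning each spectrum recovers the subject effects regardless of the spectrum-specific level, and the de-meaned errors are correlated. This is a sketch under stated assumptions: independent Gaussian errors with a common standard deviation, an arbitrary range for the spectrum levels, and hypothetical function names. Under those assumptions the covariance between two de-meaned channels of the same spectrum is -sigma^2 / K.

```python
import random
from statistics import mean


def simulate_demeaned(delta, n_spectra=5000, sigma=0.3, seed=1):
    """Simulate log2 intensities Y_sk = level_s + delta_k + error and de-mean each spectrum."""
    rng = random.Random(seed)
    z_rows = []
    for _ in range(n_spectra):
        level = rng.uniform(8.0, 16.0)  # spectrum-specific level (hypothetical range)
        y = [level + d + rng.gauss(0.0, sigma) for d in delta]
        y_bar = mean(y)
        z_rows.append([v - y_bar for v in y])  # Z_sk = Y_sk - mean over k
    return z_rows


def column_means(rows):
    """Average each channel over spectra."""
    return [mean(col) for col in zip(*rows)]


def column_covariance(rows, a, b):
    """Sample covariance between channels a and b across spectra."""
    xa = [r[a] for r in rows]
    xb = [r[b] for r in rows]
    ma, mb = mean(xa), mean(xb)
    return sum((u - ma) * (v - mb) for u, v in zip(xa, xb)) / (len(rows) - 1)


if __name__ == "__main__":
    delta = [-0.7, -0.5, -0.3, -0.1, 0.1, 0.3, 0.5, 0.7]  # sums to zero
    z = simulate_demeaned(delta)
    print("estimated delta:", [round(m, 3) for m in column_means(z)])
    print("cov(Z_s1, Z_s2):", column_covariance(z, 0, 1))
    print("theory -sigma^2/K:", -(0.3 ** 2) / len(delta))
```

The negative covariance arises because the same spectrum mean is subtracted from all eight channels, so the de-meaned values within a spectrum always sum to zero.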

12 Estimable parameters
The question arises how information from multiple iTRAQ runs can be combined. Housekeeping proteins cannot be leveraged for the data normalization across iTRAQ experiments. Assume that the true protein abundance for subject \(k\) in experiment \(r\) is
\[ a_{rk} = \mu + \Delta_r + \delta_{rk}. \]
\(\Delta_r\) is not estimable from the proteomic data, so augmenting estimates of \(\delta_{rk}\) across experiments as a surrogate for absolute abundance fails to take the variance component \(\Delta_r\) into account. Randomization of subjects to experiments helps to avoid systematic biases, but does not eliminate the random mean shift \(\Delta_r\) expected across experiments.
Estimable parameters
The \(\Delta_r\) cannot be estimated and eliminated using the proteomic data alone. This is not an issue in (well designed) case-control studies. \(\Delta_r\) can also be accounted for in other types of association studies. For example, assume that in truth we have
\[ E[N_{rk}] = \beta_0 + \beta_1 a_{rk}. \]
Substituting \(\mu + \Delta_r + \delta_{rk}\) for \(a_{rk}\), we get
\[ E[N_{rk}] = \beta_0 + \beta_1(\mu + \Delta_r + \delta_{rk}) = \{\beta_0 + \beta_1\mu\} + \beta_1\Delta_r + \beta_1\delta_{rk} = \gamma_0 + B_r + \beta_1\delta_{rk}. \]
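The substitution above implies that the slope on the estimable within-experiment effects can be recovered even when the experiment shift is large and unobserved. The sketch below simulates that situation and estimates the slope after centering the outcome within each experiment. Note the swap: it treats the per-experiment intercept as a fixed effect removed by centering, a stand-in for the random experiment effect of the linear mixed model on the slides, chosen here only to stay within the Python standard library. All parameter values and names are hypothetical.

```python
import random
from statistics import mean


def simulate_and_fit(n_experiments=50, n_subjects=8, beta0=1.0, beta1=2.0,
                     mu=10.0, shift_sd=3.0, noise_sd=0.5, seed=7):
    """Simulate N_rk = beta0 + beta1 * a_rk + noise and estimate beta1 from delta_rk only."""
    rng = random.Random(seed)
    num = 0.0
    den = 0.0
    for _ in range(n_experiments):
        shift = rng.gauss(0.0, shift_sd)  # experiment mean shift, never observed
        raw = [rng.gauss(0.0, 1.0) for _ in range(n_subjects)]
        raw_bar = mean(raw)
        delta = [v - raw_bar for v in raw]  # estimable part, sums to zero per experiment
        outcome = [beta0 + beta1 * (mu + shift + d) + rng.gauss(0.0, noise_sd)
                   for d in delta]
        out_bar = mean(outcome)  # absorbs gamma_0 + B_r for this experiment
        num += sum(d * (n - out_bar) for d, n in zip(delta, outcome))
        den += sum(d * d for d in delta)
    return num / den


if __name__ == "__main__":
    print("estimated beta1:", simulate_and_fit())
```

Because the within-experiment effects sum to zero in every experiment, the shift term drops out of the numerator entirely, which is why the estimate stays close to the simulated slope no matter how large `shift_sd` is made.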

13 Linear mixed model fit
Linear mixed model fit by REML
Formula: y ~ x + (1 | id)
Data: dat
AIC BIC logLik deviance REMLdev
Random effects:
Groups Name Variance Std.Dev.
id (Intercept)
Residual
Number of obs: 27, groups: id, 9
Fixed effects:
Estimate Std. Error t value
(Intercept)
x
Association

14 ROC ELISA comparison

15 More results Vitamin A

16 Vitamin E
Estimable parameters
In case there is more than one protein with \(\log_2\) abundances linearly related to the nutrient concentration:
\[
E[N_{rk}] = \beta_0 + \sum_j \beta_j(\mu_j + \Delta_{jr} + \delta_{jrk})
= \beta_0 + \sum_j \beta_j\mu_j + \sum_j \beta_j\Delta_{jr} + \sum_j \beta_j\delta_{jrk}
= \gamma_0 + B_r + \sum_j \beta_j\delta_{jrk}.
\]
Thus, even though we have multiple proteins, and therefore multiple random effects for between-experiment differences, the resulting linear mixed effects model still only has one random effect that jointly summarizes the between-experiment differences.

17 Multivariate models
Columns: joint β, joint p, marginal β, marginal p.
retinol-binding protein 0.8 5e e-19
complement factor H isoform a e e-0
insulin-like growth factor-binding protein e e-0
complement C1r subcomponent -1 2e e-0

18 Estimation comparison
Concordance correlation coefficient to measure the agreement in relative abundance estimates between two methods:
\[ \rho(X, Y) = \frac{2\,\mathrm{cov}(X, Y)}{\sigma_X^2 + \sigma_Y^2 + (\mu_X - \mu_Y)^2}. \]
Figure: pairwise method comparisons (LM vs LME, LM vs LM S, LM vs LME S, LME vs LM S, LME vs LME S, LM S vs LME S), on a 0% to 100% scale.
Figure: MSE, relative protein abundance estimate, and root median squared fold change for the linear mixed model (joint estimation), the linear mixed model (separate estimation), and the linear model (separate estimation).
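As a concrete illustration of the concordance correlation coefficient, the sketch below computes it for two vectors of estimates. It uses population (divide by n) moments, which is one common convention; the slides do not state which convention was used, and the function name and example vectors are hypothetical.

```python
from statistics import mean


def concordance_correlation(x, y):
    """Concordance correlation coefficient between paired estimates x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mx, my = mean(x), mean(y)
    var_x = sum((a - mx) ** 2 for a in x) / n
    var_y = sum((b - my) ** 2 for b in y) / n
    cov_xy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2.0 * cov_xy / (var_x + var_y + (mx - my) ** 2)


if __name__ == "__main__":
    method_a = [1.0, 2.0, 3.0]
    method_b = [2.0, 3.0, 4.0]  # perfectly correlated with method_a, but shifted by 1
    print(concordance_correlation(method_a, method_a))
    print(concordance_correlation(method_a, method_b))
```

Unlike the Pearson correlation, the coefficient is penalized by the squared difference in means, so two methods that rank proteins identically but differ by a constant offset score below 1.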

19 Estimation method comparison
1. Linear mixed effects models
2. Linear model
3. Masterpool normalization (mean)
4. Masterpool normalization (median)
5. Mean sweep (ignoring peptides)
6. Median sweep (ignoring peptides)
7. Mean sweep (using peptides)
8. Median sweep (using peptides)
Figure: root mean squared fold change and root median squared fold change for each estimation method, by masterpool (MP).

20 Estimation method comparison. Figure: log reporter ion intensities (median polish) for a run with 8 technical replicates of a master pool.

21 http://biostat.jhsph.edu/~iruczins/

20. REML Estimation of Variance Components. Copyright c 2018 (Iowa State University) 20. Statistics / 36

20. REML Estimation of Variance Components. Copyright c 2018 (Iowa State University) 20. Statistics / 36 20. REML Estimation of Variance Components Copyright c 2018 (Iowa State University) 20. Statistics 510 1 / 36 Consider the General Linear Model y = Xβ + ɛ, where ɛ N(0, Σ) and Σ is an n n positive definite

More information

Outline. Mixed models in R using the lme4 package Part 3: Longitudinal data. Sleep deprivation data. Simple longitudinal data

Outline. Mixed models in R using the lme4 package Part 3: Longitudinal data. Sleep deprivation data. Simple longitudinal data Outline Mixed models in R using the lme4 package Part 3: Longitudinal data Douglas Bates Longitudinal data: sleepstudy A model with random effects for intercept and slope University of Wisconsin - Madison

More information

R Output for Linear Models using functions lm(), gls() & glm()

R Output for Linear Models using functions lm(), gls() & glm() LM 04 lm(), gls() &glm() 1 R Output for Linear Models using functions lm(), gls() & glm() Different kinds of output related to linear models can be obtained in R using function lm() {stats} in the base

More information

Correlated Data: Linear Mixed Models with Random Intercepts

Correlated Data: Linear Mixed Models with Random Intercepts 1 Correlated Data: Linear Mixed Models with Random Intercepts Mixed Effects Models This lecture introduces linear mixed effects models. Linear mixed models are a type of regression model, which generalise

More information

Outline. Statistical inference for linear mixed models. One-way ANOVA in matrix-vector form

Outline. Statistical inference for linear mixed models. One-way ANOVA in matrix-vector form Outline Statistical inference for linear mixed models Rasmus Waagepetersen Department of Mathematics Aalborg University Denmark general form of linear mixed models examples of analyses using linear mixed

More information

Covariance and Correlation

Covariance and Correlation Covariance and Correlation ST 370 The probability distribution of a random variable gives complete information about its behavior, but its mean and variance are useful summaries. Similarly, the joint probability

More information

Hierarchical Linear Models (HLM) Using R Package nlme. Interpretation. 2 = ( x 2) u 0j. e ij

Hierarchical Linear Models (HLM) Using R Package nlme. Interpretation. 2 = ( x 2) u 0j. e ij Hierarchical Linear Models (HLM) Using R Package nlme Interpretation I. The Null Model Level 1 (student level) model is mathach ij = β 0j + e ij Level 2 (school level) model is β 0j = γ 00 + u 0j Combined

More information

Bayesian Model Diagnostics and Checking

Bayesian Model Diagnostics and Checking Earvin Balderama Quantitative Ecology Lab Department of Forestry and Environmental Resources North Carolina State University April 12, 2013 1 / 34 Introduction MCMCMC 2 / 34 Introduction MCMCMC Steps in

More information

Introduction to Linear Mixed Models: Modeling continuous longitudinal outcomes

Introduction to Linear Mixed Models: Modeling continuous longitudinal outcomes 1/64 to : Modeling continuous longitudinal outcomes Dr Cameron Hurst cphurst@gmail.com CEU, ACRO and DAMASAC, Khon Kaen University 4 th Febuary, 2557 2/64 Some motivational datasets Before we start, I

More information

Introduction and Background to Multilevel Analysis

Introduction and Background to Multilevel Analysis Introduction and Background to Multilevel Analysis Dr. J. Kyle Roberts Southern Methodist University Simmons School of Education and Human Development Department of Teaching and Learning Background and

More information

Workshop 9.3a: Randomized block designs

Workshop 9.3a: Randomized block designs -1- Workshop 93a: Randomized block designs Murray Logan November 23, 16 Table of contents 1 Randomized Block (RCB) designs 1 2 Worked Examples 12 1 Randomized Block (RCB) designs 11 RCB design Simple Randomized

More information

Multivariate Statistics in Ecology and Quantitative Genetics Summary

Multivariate Statistics in Ecology and Quantitative Genetics Summary Multivariate Statistics in Ecology and Quantitative Genetics Summary Dirk Metzler & Martin Hutzenthaler http://evol.bio.lmu.de/_statgen 5. August 2011 Contents Linear Models Generalized Linear Models Mixed-effects

More information

Value Added Modeling

Value Added Modeling Value Added Modeling Dr. J. Kyle Roberts Southern Methodist University Simmons School of Education and Human Development Department of Teaching and Learning Background for VAMs Recall from previous lectures

More information

Intruction to General and Generalized Linear Models

Intruction to General and Generalized Linear Models Intruction to General and Generalized Linear Models Mixed Effects Models IV Henrik Madsen Anna Helga Jónsdóttir hm@imm.dtu.dk April 30, 2012 Henrik Madsen Anna Helga Jónsdóttir (hm@imm.dtu.dk) Intruction

More information

lme4 Luke Chang Last Revised July 16, Fitting Linear Mixed Models with a Varying Intercept

lme4 Luke Chang Last Revised July 16, Fitting Linear Mixed Models with a Varying Intercept lme4 Luke Chang Last Revised July 16, 2010 1 Using lme4 1.1 Fitting Linear Mixed Models with a Varying Intercept We will now work through the same Ultimatum Game example from the regression section and

More information

Outline for today. Two-way analysis of variance with random effects

Outline for today. Two-way analysis of variance with random effects Outline for today Two-way analysis of variance with random effects Rasmus Waagepetersen Department of Mathematics Aalborg University Denmark Two-way ANOVA using orthogonal projections March 4, 2018 1 /

More information

Figure 36: Respiratory infection versus time for the first 49 children.

Figure 36: Respiratory infection versus time for the first 49 children. y BINARY DATA MODELS We devote an entire chapter to binary data since such data are challenging, both in terms of modeling the dependence, and parameter interpretation. We again consider mixed effects

More information

Stat/F&W Ecol/Hort 572 Review Points Ané, Spring 2010

Stat/F&W Ecol/Hort 572 Review Points Ané, Spring 2010 1 Linear models Y = Xβ + ɛ with ɛ N (0, σ 2 e) or Y N (Xβ, σ 2 e) where the model matrix X contains the information on predictors and β includes all coefficients (intercept, slope(s) etc.). 1. Number of

More information

Lecture 4 Multiple linear regression

Lecture 4 Multiple linear regression Lecture 4 Multiple linear regression BIOST 515 January 15, 2004 Outline 1 Motivation for the multiple regression model Multiple regression in matrix notation Least squares estimation of model parameters

More information

A brief introduction to mixed models

A brief introduction to mixed models A brief introduction to mixed models University of Gothenburg Gothenburg April 6, 2017 Outline An introduction to mixed models based on a few examples: Definition of standard mixed models. Parameter estimation.

More information

Lecture 17. Ingo Ruczinski. October 26, Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University

Lecture 17. Ingo Ruczinski. October 26, Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University Lecture 17 Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University October 26, 2015 1 2 3 4 5 1 Paired difference hypothesis tests 2 Independent group differences

More information

One-stage dose-response meta-analysis

One-stage dose-response meta-analysis One-stage dose-response meta-analysis Nicola Orsini, Alessio Crippa Biostatistics Team Department of Public Health Sciences Karolinska Institutet http://ki.se/en/phs/biostatistics-team 2017 Nordic and

More information

Mixed effects models

Mixed effects models Mixed effects models The basic theory and application in R Mitchel van Loon Research Paper Business Analytics Mixed effects models The basic theory and application in R Author: Mitchel van Loon Research

More information

Power analysis examples using R

Power analysis examples using R Power analysis examples using R Code The pwr package can be used to analytically compute power for various designs. The pwr examples below are adapted from the pwr package vignette, which is available

More information

UNIVERSITY OF MASSACHUSETTS. Department of Mathematics and Statistics. Basic Exam - Applied Statistics. Tuesday, January 17, 2017

UNIVERSITY OF MASSACHUSETTS. Department of Mathematics and Statistics. Basic Exam - Applied Statistics. Tuesday, January 17, 2017 UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Basic Exam - Applied Statistics Tuesday, January 17, 2017 Work all problems 60 points are needed to pass at the Masters Level and 75

More information

Variance component models part I

Variance component models part I Faculty of Health Sciences Variance component models part I Analysis of repeated measurements, 30th November 2012 Julie Lyng Forman & Lene Theil Skovgaard Department of Biostatistics, University of Copenhagen

More information

Lecture 6 Multiple Linear Regression, cont.

Lecture 6 Multiple Linear Regression, cont. Lecture 6 Multiple Linear Regression, cont. BIOST 515 January 22, 2004 BIOST 515, Lecture 6 Testing general linear hypotheses Suppose we are interested in testing linear combinations of the regression

More information

UNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator

UNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator UNIVERSITY OF TORONTO Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS Duration - 3 hours Aids Allowed: Calculator LAST NAME: FIRST NAME: STUDENT NUMBER: There are 27 pages

More information

Lecture 5: Omitted Variables, Dummy Variables and Multicollinearity

Lecture 5: Omitted Variables, Dummy Variables and Multicollinearity Lecture 5: Omitted Variables, Dummy Variables and Multicollinearity R.G. Pierse 1 Omitted Variables Suppose that the true model is Y i β 1 + β X i + β 3 X 3i + u i, i 1,, n (1.1) where β 3 0 but that the

More information

Lab 3: Two levels Poisson models (taken from Multilevel and Longitudinal Modeling Using Stata, p )

Lab 3: Two levels Poisson models (taken from Multilevel and Longitudinal Modeling Using Stata, p ) Lab 3: Two levels Poisson models (taken from Multilevel and Longitudinal Modeling Using Stata, p. 376-390) BIO656 2009 Goal: To see if a major health-care reform which took place in 1997 in Germany was

More information

Section IX. Introduction to Logistic Regression for binary outcomes. Poisson regression

Section IX. Introduction to Logistic Regression for binary outcomes. Poisson regression Section IX Introduction to Logistic Regression for binary outcomes Poisson regression 0 Sec 9 - Logistic regression In linear regression, we studied models where Y is a continuous variable. What about

More information

Lecture 3.1 Basic Logistic LDA

Lecture 3.1 Basic Logistic LDA y Lecture.1 Basic Logistic LDA 0.2.4.6.8 1 Outline Quick Refresher on Ordinary Logistic Regression and Stata Women s employment example Cross-Over Trial LDA Example -100-50 0 50 100 -- Longitudinal Data

More information

Correlation and simple linear regression S5

Correlation and simple linear regression S5 Basic medical statistics for clinical and eperimental research Correlation and simple linear regression S5 Katarzyna Jóźwiak k.jozwiak@nki.nl November 15, 2017 1/41 Introduction Eample: Brain size and

More information

Chapter 15 Sampling Distribution Models

Chapter 15 Sampling Distribution Models Chapter 15 Sampling Distribution Models 1 15.1 Sampling Distribution of a Proportion 2 Sampling About Evolution According to a Gallup poll, 43% believe in evolution. Assume this is true of all Americans.

More information

Lecture 4. August 24, Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University.

Lecture 4. August 24, Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University. random Lecture 4 Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University August 24, 2007 random 1 2 3 4 random 5 6 7 8 9 random 1 Define random 2 and 3 4 Co

More information

Analysis of Longitudinal Data: Comparison between PROC GLM and PROC MIXED.

Analysis of Longitudinal Data: Comparison between PROC GLM and PROC MIXED. Analysis of Longitudinal Data: Comparison between PROC GLM and PROC MIXED. Maribeth Johnson, Medical College of Georgia, Augusta, GA ABSTRACT Longitudinal data refers to datasets with multiple measurements

More information

Stat 579: Generalized Linear Models and Extensions

Stat 579: Generalized Linear Models and Extensions Stat 579: Generalized Linear Models and Extensions Linear Mixed Models for Longitudinal Data Yan Lu April, 2018, week 14 1 / 64 Data structure and Model t1 t2 tn i 1st subject y 11 y 12 y 1n1 2nd subject

More information

A Handbook of Statistical Analyses Using R 2nd Edition. Brian S. Everitt and Torsten Hothorn

A Handbook of Statistical Analyses Using R 2nd Edition. Brian S. Everitt and Torsten Hothorn A Handbook of Statistical Analyses Using R 2nd Edition Brian S. Everitt and Torsten Hothorn CHAPTER 12 Analysing Longitudinal Data I: Computerised Delivery of Cognitive Behavioural Therapy Beat the Blues

More information

PAPER 218 STATISTICAL LEARNING IN PRACTICE

PAPER 218 STATISTICAL LEARNING IN PRACTICE MATHEMATICAL TRIPOS Part III Thursday, 7 June, 2018 9:00 am to 12:00 pm PAPER 218 STATISTICAL LEARNING IN PRACTICE Attempt no more than FOUR questions. There are SIX questions in total. The questions carry

More information

Homework 3 - Solution

Homework 3 - Solution STAT 526 - Spring 2011 Homework 3 - Solution Olga Vitek Each part of the problems 5 points 1. KNNL 25.17 (Note: you can choose either the restricted or the unrestricted version of the model. Please state

More information

PubH 7405: REGRESSION ANALYSIS MLR: BIOMEDICAL APPLICATIONS

PubH 7405: REGRESSION ANALYSIS MLR: BIOMEDICAL APPLICATIONS PubH 7405: REGRESSION ANALYSIS MLR: BIOMEDICAL APPLICATIONS Multiple Regression allows us to get into two new areas that were not possible with Simple Linear Regression: (i) Interaction or Effect Modification,

More information

SAS Syntax and Output for Data Manipulation: CLDP 944 Example 3a page 1

SAS Syntax and Output for Data Manipulation: CLDP 944 Example 3a page 1 CLDP 944 Example 3a page 1 From Between-Person to Within-Person Models for Longitudinal Data The models for this example come from Hoffman (2015) chapter 3 example 3a. We will be examining the extent to

More information

Model comparison. Patrick Breheny. March 28. Introduction Measures of predictive power Model selection

Model comparison. Patrick Breheny. March 28. Introduction Measures of predictive power Model selection Model comparison Patrick Breheny March 28 Patrick Breheny BST 760: Advanced Regression 1/25 Wells in Bangladesh In this lecture and the next, we will consider a data set involving modeling the decisions

More information

Measurement Error in Covariates

Measurement Error in Covariates Measurement Error in Covariates Raymond J. Carroll Department of Statistics Faculty of Nutrition Institute for Applied Mathematics and Computational Science Texas A&M University My Goal Today Introduce

More information

Covariance Models (*) X i : (n i p) design matrix for fixed effects β : (p 1) regression coefficient for fixed effects

Covariance Models (*) X i : (n i p) design matrix for fixed effects β : (p 1) regression coefficient for fixed effects Covariance Models (*) Mixed Models Laird & Ware (1982) Y i = X i β + Z i b i + e i Y i : (n i 1) response vector X i : (n i p) design matrix for fixed effects β : (p 1) regression coefficient for fixed

More information

Lecture 9 STK3100/4100

Lecture 9 STK3100/4100 Lecture 9 STK3100/4100 27. October 2014 Plan for lecture: 1. Linear mixed models cont. Models accounting for time dependencies (Ch. 6.1) 2. Generalized linear mixed models (GLMM, Ch. 13.1-13.3) Examples

More information

Introduction to Statistical modeling: handout for Math 489/583

Introduction to Statistical modeling: handout for Math 489/583 Introduction to Statistical modeling: handout for Math 489/583 Statistical modeling occurs when we are trying to model some data using statistical tools. From the start, we recognize that no model is perfect

More information

Analysis of binary repeated measures data with R

Analysis of binary repeated measures data with R Analysis of binary repeated measures data with R Right-handed basketball players take right and left-handed shots from 3 locations in a different random order for each player. Hit or miss is recorded.

More information

Latent random variables

Latent random variables Latent random variables Imagine that you have collected egg size data on a fish called Austrolebias elongatus, and the graph of egg size on body size of the mother looks as follows: Egg size (Area) 4.6

More information

13. October p. 1

13. October p. 1 Lecture 8 STK3100/4100 Linear mixed models 13. October 2014 Plan for lecture: 1. The lme function in the nlme library 2. Induced correlation structure 3. Marginal models 4. Estimation - ML and REML 5.

More information

Random and Mixed Effects Models - Part II

Random and Mixed Effects Models - Part II Random and Mixed Effects Models - Part II Statistics 149 Spring 2006 Copyright 2006 by Mark E. Irwin Two-Factor Random Effects Model Example: Miles per Gallon (Neter, Kutner, Nachtsheim, & Wasserman, problem

More information

Multiple linear regression S6

Multiple linear regression S6 Basic medical statistics for clinical and experimental research Multiple linear regression S6 Katarzyna Jóźwiak k.jozwiak@nki.nl November 15, 2017 1/42 Introduction Two main motivations for doing multiple

More information

R 2 and F -Tests and ANOVA

R 2 and F -Tests and ANOVA R 2 and F -Tests and ANOVA December 6, 2018 1 Partition of Sums of Squares The distance from any point y i in a collection of data, to the mean of the data ȳ, is the deviation, written as y i ȳ. Definition.

More information

One Economist s Perspective on Some Important Estimation Issues

One Economist s Perspective on Some Important Estimation Issues One Economist s Perspective on Some Important Estimation Issues Jere R. Behrman W.R. Kenan Jr. Professor of Economics & Sociology University of Pennsylvania SRCD Seattle Preconference on Interventions

More information

An Introduction to Path Analysis

An Introduction to Path Analysis An Introduction to Path Analysis PRE 905: Multivariate Analysis Lecture 10: April 15, 2014 PRE 905: Lecture 10 Path Analysis Today s Lecture Path analysis starting with multivariate regression then arriving

More information

Statistical analysis of isobaric-labeled mass spectrometry data

Statistical analysis of isobaric-labeled mass spectrometry data Statistical analysis of isobaric-labeled mass spectrometry data Farhad Shakeri July 3, 2018 Core Unit for Bioinformatics Analyses Institute for Genomic Statistics and Bioinformatics University Hospital

More information

A Handbook of Statistical Analyses Using R 2nd Edition. Brian S. Everitt and Torsten Hothorn

A Handbook of Statistical Analyses Using R 2nd Edition. Brian S. Everitt and Torsten Hothorn A Handbook of Statistical Analyses Using R 2nd Edition Brian S. Everitt and Torsten Hothorn CHAPTER 12 Analysing Longitudinal Data I: Computerised Delivery of Cognitive Behavioural Therapy Beat the Blues

More information

BIOSTATISTICS METHODS FOR TRANSLATIONAL & CLINICAL RESEARCH DAY #4: REGRESSION APPLICATIONS, PART C MULTILE REGRESSION APPLICATIONS

BIOSTATISTICS METHODS FOR TRANSLATIONAL & CLINICAL RESEARCH DAY #4: REGRESSION APPLICATIONS, PART C MULTILE REGRESSION APPLICATIONS BIOSTATISTICS METHODS FOR TRANSLATIONAL & CLINICAL RESEARCH DAY #4: REGRESSION APPLICATIONS, PART C MULTILE REGRESSION APPLICATIONS This last part is devoted to Multiple Regression applications; it covers

More information

Spatio-temporal modeling of weekly malaria incidence in children under 5 for early epidemic detection in Mozambique

Spatio-temporal modeling of weekly malaria incidence in children under 5 for early epidemic detection in Mozambique Spatio-temporal modeling of weekly malaria incidence in children under 5 for early epidemic detection in Mozambique Katie Colborn, PhD Department of Biostatistics and Informatics University of Colorado

More information

Description Syntax for predict Menu for predict Options for predict Remarks and examples Methods and formulas References Also see

Description Syntax for predict Menu for predict Options for predict Remarks and examples Methods and formulas References Also see Title stata.com logistic postestimation Postestimation tools for logistic Description Syntax for predict Menu for predict Options for predict Remarks and examples Methods and formulas References Also see

More information

Mixed models in R using the lme4 package Part 5: Generalized linear mixed models

Mixed models in R using the lme4 package Part 5: Generalized linear mixed models Mixed models in R using the lme4 package Part 5: Generalized linear mixed models Douglas Bates Madison January 11, 2011 Contents 1 Definition 1 2 Links 2 3 Example 7 4 Model building 9 5 Conclusions 14

More information

Please discuss each of the 3 problems on a separate sheet of paper, not just on a separate page!

Please discuss each of the 3 problems on a separate sheet of paper, not just on a separate page! Econometrics - Exam May 11, 2011 1 Exam Please discuss each of the 3 problems on a separate sheet of paper, not just on a separate page! Problem 1: (15 points) A researcher has data for the year 2000 from

More information

36-463/663: Hierarchical Linear Models

36-463/663: Hierarchical Linear Models 36-463/663: Hierarchical Linear Models Lmer model selection and residuals Brian Junker 132E Baker Hall brian@stat.cmu.edu 1 Outline The London Schools Data (again!) A nice random-intercepts, random-slopes

More information

Introduction to Within-Person Analysis and RM ANOVA

Introduction to Within-Person Analysis and RM ANOVA Introduction to Within-Person Analysis and RM ANOVA Today s Class: From between-person to within-person ANOVAs for longitudinal data Variance model comparisons using 2 LL CLP 944: Lecture 3 1 The Two Sides

More information

Final Overview. Introduction to ML. Marek Petrik 4/25/2017

Final Overview. Introduction to ML. Marek Petrik 4/25/2017 Final Overview Introduction to ML Marek Petrik 4/25/2017 This Course: Introduction to Machine Learning Build a foundation for practice and research in ML Basic machine learning concepts: max likelihood,

More information

sociology sociology Scatterplots Quantitative Research Methods: Introduction to correlation and regression Age vs Income

sociology sociology Scatterplots Quantitative Research Methods: Introduction to correlation and regression Age vs Income Scatterplots Quantitative Research Methods: Introduction to correlation and regression Scatterplots can be considered as interval/ratio analogue of cross-tabs: arbitrarily many values mapped out in -dimensions

More information
