G-ESTIMATION OF STRUCTURAL NESTED MODELS (CHAPTER 14) BIOS G-Estimation


G-Estimation of Structural Nested Models (Chapter 14)
Outline
14.1 The causal question revisited
14.2 Exchangeability revisited
14.3 Structural nested mean models
14.4 Rank preservation
14.5 G-estimation
14.6 Structural nested models with two or more parameters

14.2 Exchangeability revisited
Recall that conditional exchangeability is defined as
$Y^a \perp\!\!\!\perp A \mid L$ for $a = 0, 1$
For a binary treatment $A$ this is equivalent to
$\Pr[A = 1 \mid Y^a, L] = \Pr[A = 1 \mid L]$
Consider the following parametric logistic regression model:
$\operatorname{logit}\Pr[A = 1 \mid Y^{a=0}, L] = \alpha_0 + \alpha_1 Y^{a=0} + \alpha_2 L$
Fitting such a model to a real data set is not possible because $Y^{a=0}$ is not observed for all individuals.
Thought experiment: suppose $Y^{a=0}$ were observed for all individuals, so that we could fit this model. If conditional exchangeability holds and the model is correctly specified, what would you expect $\hat\alpha_1$ to equal?

14.3 Structural nested mean models
Consider the model
$E[Y^a - Y^{a=0} \mid A = a, L] = \beta_1 a + \beta_2 a L$
such that $\beta_1 + \beta_2 l$ equals the average causal effect (RD) within stratum $L = l$.
Below we discuss using g-estimation to draw inference about $\beta_1$ and $\beta_2$.
Note that this model is semiparametric in the sense that we do not specify a model for $E[Y^{a=0} \mid L]$, i.e., there is no intercept $\beta_0$ or term $\beta_3 L$ in the model.
This is in contrast to the parametric g-formula from Chapter 13. Thus we expect g-estimation to be more robust to model misspecification than the parametric g-formula.

14.4 Rank Preservation
Suppose, contrary to fact, that for the NHEFS data we knew $Y^{a=1}$ and $Y^{a=0}$ for all participants, i.e., each individual's potential weight gain if they quit smoking and if they did not quit smoking.
Imagine we sorted individuals according to $Y^{a=1}$ from largest value to smallest value.
Imagine we sorted individuals according to $Y^{a=0}$ from largest value to smallest value.
Suppose that in either case individuals end up in the same order: this is rank preservation.

14.4 Rank Preservation
When the effect of treatment $A$ on the outcome $Y$ is exactly the same, on the additive scale, for all individuals in the study population, we say that additive rank preservation holds.
For example, if smoking cessation increases each individual's body weight by exactly 3 kg, then the ranking of individuals according to $Y^{a=0}$ would be equal to the ranking according to $Y^{a=1}$.
A particular case of additive rank preservation occurs when the sharp null hypothesis is true (Chapter 1), i.e., treatment has no effect on the outcome of any individual.
For the purposes of structural nested mean models, we care about additive rank preservation within levels of $L$. This conditional additive rank preservation holds if the effect of treatment $A$ on the outcome $Y$ is exactly the same for all individuals with the same values of $L$.

14.4 Rank Preservation
An example of an (additive conditional) rank-preserving structural model is
$Y_i^a - Y_i^{a=0} = \psi_1 a + \psi_2 a L_i$ for all subjects $i$
where $\psi_1 + \psi_2 l$ is the constant causal effect for all individuals with covariate values $L = l$.
For every individual $i$ with $L_i = l$,
$Y_i^{a=1} = Y_i^{a=0} + \psi_1 + \psi_2 l$
The potential outcome under no treatment, $Y_i^{a=0}$, is shifted by $\psi_1 + \psi_2 l$ to obtain the potential outcome under treatment, $Y_i^{a=1}$.
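The rank-preserving model above can be illustrated with a short R sketch. It uses simulated values rather than the NHEFS data, and the names `psi1`, `psi2`, `Y0`, `Y1` and all numbers are assumptions chosen only for illustration. It checks that a constant shift within each stratum of L leaves the within-stratum ranking unchanged.

```r
# Simulated illustration only; all names and values are hypothetical.
set.seed(2)
n  <- 10
L  <- rep(c(0, 1), each = n / 2)        # two strata of L
Y0 <- rnorm(n, mean = 2, sd = 5)        # potential outcome under a = 0
psi1 <- 3
psi2 <- 2
Y1 <- Y0 + psi1 + psi2 * L              # Y^{a=1} = Y^{a=0} + psi1 + psi2 * l

# Compare the ranking of Y0 and Y1 separately within each stratum of L
same_rank_within <- sapply(split(seq_len(n), L), function(idx) {
  all(rank(Y0[idx]) == rank(Y1[idx]))
})
print(same_rank_within)
```

Because the shift differs between the two strata, the ranking in the pooled sample need not be preserved; the model only guarantees preservation within levels of L.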

14.4 Rank Preservation
Figures 14.1 and 14.2 show examples of additive rank preservation within two strata, $L = l$ and $L = l'$.
[Figure 14.1]
[Figure 14.2]
For most treatments and outcomes, the individual causal effect is not expected to be constant across individuals with the same covariate values, and thus (additive conditional) rank preservation is scientifically implausible.
E.g., we do not expect that smoking cessation affects equally the body weight of all individuals with the same values of $L$.

14.4 Rank Preservation
Reality is probably closer to Figure 14.3.
[Figure 14.3]
Here not only do the shifts from $Y^{a=0}$ to $Y^{a=1}$ differ between individuals, but the ranks are also not preserved.
A structural nested mean model is well defined in the absence of rank preservation. For example, one could propose a structural nested mean model for the setting depicted in Figure 14.3 to estimate the average causal effect within strata of $L$. Such average causal effects will generally differ from the individual-level causal effects.
Because rank preservation is implausible, causal methods that rely on it are not recommended.
It is used in 14.5 to introduce g-estimation because g-estimation is easier to understand for rank-preserving models, and because the g-estimation procedure is actually the same for rank-preserving and non-rank-preserving models.

14.5 G-Estimation
Suppose the goal is to estimate the parameters of the structural nested mean model
$E[Y^a - Y^{a=0} \mid A = a, L] = \beta_1 a$
For simplicity we consider only the one-parameter model, effectively assuming that the average causal effect is constant across strata of $L$.
Assume the additive rank-preserving model
$Y_i^a - Y_i^{a=0} = \psi_1 a$
such that $\psi_1 = \beta_1$. Equivalently,
$Y_i^{a=0} = Y_i^a - \psi_1 a$
or, by causal consistency,
$Y_i^{a=0} = Y_i - \psi_1 A_i$

14.5 G-Estimation
If the model is correct and we knew $\psi_1$, then we could calculate $Y_i^{a=0}$ for all individuals.
We don't know $\psi_1$; moreover, drawing inference about $\psi_1$ is our goal.
Thought experiment: your friend (an oracle) knows the value of $\psi_1$. She tells you it equals one of the following three values: $\psi^\dagger = -20$, $\psi^\dagger = 0$ or $\psi^\dagger = 10$. She then challenges you to determine the true value based on the observed data. You accept the challenge.
For each individual compute
$H(\psi^\dagger) = Y - \psi^\dagger A$
for each of the three possible values of $\psi^\dagger$.
The three newly created random variables $H(-20)$, $H(0)$ and $H(10)$ are candidate potential outcomes. Only one of the three is the correct potential outcome $Y^{a=0}$. How do you choose which one?

14.5 G-Estimation
Remember from 14.2 that the assumption of conditional exchangeability can be expressed as a logistic model for treatment given the counterfactual outcome and the covariates $L$. When conditional exchangeability holds, the coefficient for the counterfactual outcome should be zero.
This suggests we fit three separate logistic regression models
$\operatorname{logit}\Pr[A = 1 \mid H(\psi^\dagger), L] = \alpha_0 + \alpha_1 H(\psi^\dagger) + \alpha_2 L$
The candidate $H(\psi^\dagger)$ with $\hat\alpha_1 \approx 0$ is the counterfactual $Y^{a=0}$, and the corresponding $\psi^\dagger$ is the estimate of the true $\psi_1$.
E.g., suppose that for $H(\psi^\dagger = 10)$ we find $\hat\alpha_1 \approx 0$. Then $\hat\psi_1 = 10$.
This is g-estimation.
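The oracle thought experiment can be sketched in R on simulated data. This is a minimal illustration under stated assumptions, not the NHEFS analysis: the data-generating values, the single confounder `L`, and the helper `alpha1_hat()` are all hypothetical, and the true effect is set to 10 so that one of the three candidates is correct.

```r
# Simulated illustration (not NHEFS); variable names and values are hypothetical.
set.seed(1)
n  <- 5000
L  <- rnorm(n)                             # one measured confounder
A  <- rbinom(n, 1, plogis(0.5 * L))        # treatment depends on L only
Y0 <- 2 * L + rnorm(n, sd = 8)             # Y^{a=0}; unobserved in practice
psi_true <- 10
Y  <- Y0 + psi_true * A                    # observed outcome, rank-preserving model

# Coefficient of H(psi) in logit Pr[A = 1 | H(psi), L]
alpha1_hat <- function(psi) {
  H <- Y - psi * A                         # candidate counterfactual H(psi)
  fit <- glm(A ~ H + L, family = binomial())
  unname(coef(fit)["H"])
}

candidates <- c(-20, 0, 10)
alpha1 <- sapply(candidates, alpha1_hat)
print(data.frame(psi = candidates, alpha1 = alpha1))

# The candidate whose coefficient is closest to zero is the g-estimate
best <- candidates[which.min(abs(alpha1))]
print(best)
```

In this sketch conditional exchangeability holds by construction, because treatment is generated from L alone; the procedure relies on that assumption rather than testing it.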

14.5 G-Estimation
Important note: g-estimation does not test whether conditional exchangeability holds; it assumes that it holds in order to draw inference about the causal effect of interest.
In reality we do not have an oracle friend supplying a short list of possible values of $\psi_1$.
Therefore we need to search over all possible values of $\psi^\dagger$ until we find one where the corresponding $\hat\alpha_1 = 0$.
Operationally this is done by a search over a fine grid (e.g., -20 to 20 by 0.01).
NHEFS example: consider 31 possible candidates $H(2.0), H(2.1), H(2.2), \ldots, H(4.9), H(5.0)$.
Fit 31 separate logistic regression models of the probability of smoking cessation $A = 1$, just as in Chapter 12 (with the same $L$), but include $H(\psi^\dagger)$ as an additional covariate.

14.5 G-Estimation
The coefficient estimate $\hat\alpha_1$ for $H(\psi^\dagger)$ was closest to zero for $H(3.4)$ and $H(3.5)$.
A finer search locates the value of $\psi^\dagger$ at which $\hat\alpha_1$ is essentially zero; thus the g-estimate of the average causal effect of smoking cessation on weight gain is 3.4 kg.
The Wald test of $H_0: \alpha_1 = 0$ at that value of $\psi^\dagger$ yields a p-value of approximately 1.
To find a 95% confidence interval for $\psi_1$, find the subset of values $\psi^\dagger$ for which $p > 0.05$ (this is the standard approach of constructing a CI by inverting a hypothesis test).
For the NHEFS data and the 31 logistic models, this yields the 95% CI [2.5, 4.5] (essentially the same as IP weighting and the parametric g-formula).
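The grid search and the test-inversion confidence interval can be sketched together in R. The example below is an assumption-laden illustration on simulated data (true effect 10, one confounder, no censoring weights, model-based Wald p-values from `glm`); the helper `test_psi()` and the grid limits are hypothetical choices, not part of the course code.

```r
# Simulated illustration (not NHEFS); variable names and values are hypothetical.
set.seed(1)
n  <- 5000
L  <- rnorm(n)
A  <- rbinom(n, 1, plogis(0.5 * L))
Y  <- 2 * L + rnorm(n, sd = 8) + 10 * A    # true psi_1 = 10 by construction

# For one candidate psi: estimate and Wald p-value for the coefficient of H(psi)
test_psi <- function(psi) {
  H <- Y - psi * A
  fit <- glm(A ~ H + L, family = binomial())
  summary(fit)$coefficients["H", c("Estimate", "Pr(>|z|)")]
}

grid <- seq(8, 12, by = 0.05)
res  <- t(sapply(grid, test_psi))           # one row per candidate psi
colnames(res) <- c("alpha1", "p_value")

psi_hat <- grid[which.min(abs(res[, "alpha1"]))]   # alpha1-hat closest to zero
ci      <- range(grid[res[, "p_value"] > 0.05])    # candidates not rejected at 5%
print(c(estimate = psi_hat, lower = ci[1], upper = ci[2]))
```

The interval endpoints are only as precise as the grid spacing, so a finer grid near the boundaries would be a natural refinement.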

G-Estimation: chapter14.r

    ##################################################################
    # G-estimation: Checking multiple possible values of psi
    ##################################################################
    library(geepack)

    data <- nhefs.g.est
    grid <- seq(from = 2, to = 5, by = 0.01)  # use a smaller `by` for a finer estimate
    store.hpsi.coefs <- double(length(grid))
    j <- 0
    for (psi in grid) {
      j <- j + 1
      # candidate counterfactual H(psi) = Y - psi * A
      data$Hpsi <- data$wt82_71 - psi * data$qsmk
      # treatment model with H(psi) as an additional covariate
      gee.obj <- geeglm(qsmk ~ as.factor(sex) + as.factor(race) + age + I(age^2) +
                          as.factor(education) + smokeintensity + I(smokeintensity^2) +
                          smokeyrs + I(smokeyrs^2) + as.factor(exercise) +
                          as.factor(active) + wt71 + I(wt71^2) + Hpsi,
                        data = data, weights = w.cens, id = id,
                        corstr = "independence", family = binomial(logit))
      store.hpsi.coefs[j] <- coef(gee.obj)["Hpsi"]
      cat("iteration", j, "completed\n")
    }
    # grid value(s) at which the coefficient of H(psi) is closest to zero
    store.results <- data.frame(grid = grid, Hpsi.est = abs(store.hpsi.coefs))
    store.results[store.results$Hpsi.est == min(store.results$Hpsi.est), ]

G-Estimation: chapter14.r
[Figure: estimated coefficient $\hat\alpha_1$ plotted against $\psi$]

G-Estimation: chapter14.r
[Figure: P value plotted against $\psi$]

14.5 G-Estimation: Comments
Tests of $H_0: \alpha_1 = 0$ other than the Wald test, such as the score test or likelihood ratio test, could be used instead.
If we assume $Y^a \perp\!\!\!\perp \{A, C\} \mid L$, there is no need to adjust for censoring.
Otherwise, if we make the weaker assumption $Y \perp\!\!\!\perp C \mid \{A, L\}$, we need to construct inverse probability of censoring weights $W^C = 1/\Pr[C = 0 \mid A, L]$ as in Chapter 12.
With IP censoring weights and standard software, we can (conservatively) use the robust variance estimate to construct Wald tests of $H_0: \alpha_1 = 0$; expect the 95% CIs to be wider than if a non-conservative variance estimate or the bootstrap were used instead.

14.5 G-Estimation: Comments
Back to non-rank-preserving models.
The g-estimator $\hat\psi_1$ is consistent for the parameter $\beta_1$ of the structural nested mean model, assuming the mean model is correctly specified (i.e., if the average treatment effect is equal in all levels of $L$).
This is true regardless of whether the individual treatment effect is constant.
I.e., it is not necessary that $H(\beta_1) = Y^{a=0}$ for all subjects. Rather, it is sufficient for $H(\beta_1)$ and $Y^{a=0}$ to have the same conditional mean given $L$.

14.6 SNM with 2 or more parameters
The one-parameter structural nested model
$E[Y^a - Y^{a=0} \mid L] = \beta_1 a$
assumes the same average treatment effect in all levels of $L$.
If this model is misspecified, i.e., there is effect modification by some components $V$ of $L$, inferences will be wrong.
We expect effect modification to be the case in general.
[Note, in contrast, that effect modification does not invalidate the MSM methods described in Chapter 12.]
Relax this assumption by considering instead the two-parameter SNM
$E[Y^a - Y^{a=0} \mid L] = \beta_1 a + \beta_2 a V$

14.6 SNM with 2 or more parameters
For g-estimation, the corresponding rank-preserving model is
$Y_i^a - Y_i^{a=0} = \psi_1 a + \psi_2 a V_i$
and now let
$H(\psi^\dagger) = Y - \psi_1^\dagger A - \psi_2^\dagger A V$
To estimate $\psi_1$ and $\psi_2$, fit the logistic model
$\operatorname{logit}\Pr[A = 1 \mid H(\psi^\dagger), L] = \alpha_0 + \alpha_1 H(\psi^\dagger) + \alpha_2 H(\psi^\dagger) V + \alpha_3 L$
Find the combination of $\psi_1^\dagger$ and $\psi_2^\dagger$ where $H(\psi^\dagger) \perp\!\!\!\perp A \mid L$.
I.e., search for the combination $(\psi_1^\dagger, \psi_2^\dagger)$ that yields $\hat\alpha_1 = \hat\alpha_2 = 0$.
In general, the solution does not have a closed form, and therefore numerical search algorithms (e.g., Nelder-Mead simplex) must be used.
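A numerical search of this kind might be sketched in R with `optim()` and its Nelder-Mead method. The example uses simulated data in which the effect modifier V is also the only confounder and the true values are (3, 1); the objective function (the sum of the two squared coefficients), the starting point and the `abstol` setting are assumptions made for this illustration rather than a prescribed procedure.

```r
# Simulated illustration (not NHEFS); variable names and values are hypothetical.
set.seed(3)
n  <- 5000
V  <- runif(n, 0, 2)                        # effect modifier, here also the only confounder
A  <- rbinom(n, 1, plogis(-0.5 + 0.5 * V))
Y0 <- 2 * V + rnorm(n, sd = 4)
Y  <- Y0 + (3 + 1 * V) * A                  # true (psi_1, psi_2) = (3, 1)

# Sum of squared coefficients of H(psi) and H(psi) * V in the treatment model;
# it is zero when alpha1-hat = alpha2-hat = 0
objective <- function(psi) {
  H  <- Y - psi[1] * A - psi[2] * A * V
  HV <- H * V
  fit <- glm(A ~ H + HV + V, family = binomial())
  sum(coef(fit)[c("H", "HV")]^2)
}

opt <- optim(par = c(0, 0), fn = objective, method = "Nelder-Mead",
             control = list(abstol = 1e-10))
psi_hat <- opt$par
print(psi_hat)
```

Minimizing a sum of squares is one convenient way to turn the two conditions into a single target for a general-purpose optimizer; a two-dimensional grid search is an alternative.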

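14.6 NHEFS Data Revisited
Consider the two-parameter SNM
$E[Y^a - Y^{a=0} \mid L] = \beta_1 a + \beta_2 a V$
where $Y$ is the change in weight between follow-up and baseline, and $V$ is baseline smoking intensity.
Numerical 2-d grid search: fit the logistic model
$\operatorname{logit}\Pr[A = 1 \mid H(\psi^\dagger), L] = \alpha_0 + \alpha_1 H(\psi^\dagger) + \alpha_2 H(\psi^\dagger) V + \alpha_3 L$
for $\psi_1^\dagger \in \{2, 2.05, \ldots, 5\}$ and $\psi_2^\dagger \in \{-1, -0.95, \ldots, 1\}$.
Find the values of $\psi_1^\dagger$ and $\psi_2^\dagger$ where $\hat\alpha_1 \approx \hat\alpha_2 \approx 0$.
This yields $\hat\beta_1 = \hat\psi_1$ and $\hat\beta_2 = \hat\psi_2$.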

[Figure: contour plot of $\alpha_1 + \alpha_2$ over $\psi_1$ and $\psi_2$]

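Tech Pt 14.2
In certain settings, the g-estimator has a closed form.
E.g., consider the one-parameter SNM
$E[Y^a - Y^{a=0} \mid L] = \beta_1 a$
Suppose g-estimation is based on the score test of $H_0: \alpha_1 = 0$ in
$\operatorname{logit}\Pr[A = 1 \mid H(\psi^\dagger), L] = \alpha_0 + \alpha_1 H(\psi^\dagger) + \alpha_2 L$
Then g-estimation is equivalent to finding the parameter value $\psi^\dagger$ that solves the estimating equation (EE)
$\sum_i H_i(\psi^\dagger)\,(A_i - \hat{E}[A_i \mid L_i]) = 0$
Using the fact that $H_i(\psi^\dagger) = Y_i - \psi^\dagger A_i$, the closed-form solution is
$\hat\psi_1 = \dfrac{\sum_i Y_i\,(A_i - \hat{E}[A_i \mid L_i])}{\sum_i A_i\,(A_i - \hat{E}[A_i \mid L_i])}$
What if there is censoring, or if we fit a two-parameter SNM? See Tech Pt 14.2.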
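The closed-form expression can be checked numerically with a short R sketch. The data are simulated (true effect 10, one confounder, no censoring), and the object names `ps`, `psi_hat` and `ee_at_root` are hypothetical; the sketch also plugs the estimate back into the estimating equation to confirm that it is a root.

```r
# Simulated illustration (not NHEFS); variable names and values are hypothetical.
set.seed(1)
n  <- 5000
L  <- rnorm(n)
A  <- rbinom(n, 1, plogis(0.5 * L))
Y  <- 2 * L + rnorm(n, sd = 8) + 10 * A         # true psi_1 = 10 by construction

# E-hat[A | L] from a logistic treatment model
ps <- fitted(glm(A ~ L, family = binomial()))

# Closed-form g-estimate for the one-parameter model
psi_hat <- sum(Y * (A - ps)) / sum(A * (A - ps))

# The estimating equation evaluated at psi_hat (zero up to rounding error)
ee_at_root <- sum((Y - psi_hat * A) * (A - ps))
print(psi_hat)
print(ee_at_root)
```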

chapter14.r

    ##################################################################
    # G-estimation: Closed form estimator linear mean models
    ##################################################################
    # treatment model Pr[qsmk = 1 | L], weighted by the censoring weights
    logit.est <- glm(as.factor(qsmk) ~ as.factor(sex) + as.factor(race) + age + I(age^2) +
                       as.factor(education) + smokeintensity + I(smokeintensity^2) +
                       smokeyrs + I(smokeyrs^2) + as.factor(exercise) +
                       as.factor(active) + wt71 + I(wt71^2),
                     data = nhefs0, weights = w.cens, family = binomial("logit"))
    nhefs0$qsmk.pred <- predict(logit.est, nhefs0, type = "response")

    # solve sum(w_c * H(psi) * (qsmk - E[qsmk | L])) = 0
    # for a single psi and H(psi) = wt82_71 - psi * qsmk
    # this can be solved as
    # psi = sum(w_c * wt82_71 * (qsmk - qsmk.pred)) / sum(w_c * qsmk * (qsmk - qsmk.pred))
    with(nhefs0, sum(w.cens * wt82_71 * (qsmk - qsmk.pred)) /
                 sum(w.cens * qsmk * (qsmk - qsmk.pred)))
    # [1]

Recap
Chapter 12: Fitting an MSM via IPW requires a correct model for $\Pr[A = a \mid L]$.
Chapter 13: The parametric g-formula requires a correct model for $E[Y \mid A, L]$.
Doubly robust (DR) estimators require (i) a correct model for $\Pr[A = a \mid L]$ or (ii) a correct model for $E[Y \mid A, L]$, but not necessarily both.
Chapter 14: G-estimation requires a correct model for $\Pr[A = a \mid L]$ and a correct (semiparametric) structural mean model $E[Y^a - Y^{a=0} \mid V] = \beta_1 a + \beta_2 a V$.
See Tech Pt 14.2 regarding DR g-estimators.
G-estimation is less popular because it is computationally demanding and lacks off-the-shelf software, but it has advantages over other approaches (Vansteelandt and Joffe, Stat Sci 2014).


More information

An Introduction to Causal Mediation Analysis. Xu Qin University of Chicago Presented at the Central Iowa R User Group Meetup Aug 10, 2016

An Introduction to Causal Mediation Analysis. Xu Qin University of Chicago Presented at the Central Iowa R User Group Meetup Aug 10, 2016 An Introduction to Causal Mediation Analysis Xu Qin University of Chicago Presented at the Central Iowa R User Group Meetup Aug 10, 2016 1 Causality In the applications of statistics, many central questions

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 y 1 2 3 4 5 6 7 x Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Lecture 32 Suhasini Subba Rao Previous lecture We are interested in whether a dependent

More information

Likelihood-Based Methods

Likelihood-Based Methods Likelihood-Based Methods Handbook of Spatial Statistics, Chapter 4 Susheela Singh September 22, 2016 OVERVIEW INTRODUCTION MAXIMUM LIKELIHOOD ESTIMATION (ML) RESTRICTED MAXIMUM LIKELIHOOD ESTIMATION (REML)

More information
