Matched-Pair Case-Control Studies when Risk Factors are Correlated within the Pairs


International Journal of Epidemiology 1996; Vol. 25, No. 2. © International Epidemiological Association 1996. Printed in Great Britain.

BETH C GLADEN

Gladen B C (Statistics and Biomathematics Branch, Mail Drop A3-03, National Institute of Environmental Health Sciences, PO Box 12233, Research Triangle Park, NC 27709, USA). Matched-pair case-control studies when risk factors are correlated within the pairs. International Journal of Epidemiology 1996; 25:

Background. If pair members are independent, simple matched-pair case-control studies are known to yield consistent estimates of the population odds ratio. If pair members are not independent, this is not necessarily true. It has been shown previously that the usual matched-pair estimate remains consistent if the exposure of interest is correlated within the pairs. However, the effect of correlation of unmeasured risk factors within the pairs has not been studied.

Methods. We examine the effect of within-pair correlation of unmeasured risk factors independent of the measured exposure. This is done within the context of a simple matched-pair case-control study. We compare the large-sample expectation of the usual matched-pair estimate to the population odds ratio.

Results. We show that the usual estimate may be inconsistent in the presence of this correlation. However, if the disease is rare, the magnitude of the bias will be negligible.

Conclusions. Correlation of unmeasured risk factors independent of the measured exposure is not a practical problem in this setting.

Keywords: bias (epidemiology), odds ratio, selection bias, epidemiological methods

Matched-pair case-control studies can be used to study the relationship between a disease and an exposure of interest. In the simple version of such a study, we choose a random sample of cases and a matched control for each case. We determine whether each pair member is exposed.
We calculate the ratio of the number of pairs with an exposed case and an unexposed control to the number of pairs with an exposed control and an unexposed case. Under the usual assumptions, this ratio will be a consistent (that is, unbiased in large samples) estimate of the population odds ratio.

One of the usual assumptions is that everyone in the study is independent. If controls are chosen as, for example, random people from the same city and of the same age and sex as the case, this may be a reasonable assumption. If, however, the controls are siblings or spouses of the cases, the assumption of independence within pairs becomes less tenable. Such controls may be used because they are considered more appropriate; they may also be used for the practical reason that they are readily identified and likely to be willing to participate in a study. Since using these types of controls violates the usual assumptions, we need to check the behaviour of the estimate under these conditions.

Goldstein, Hodge, and Haile looked at simple matched-pair case-control studies where the exposure of interest is correlated within the pairs [1]. In retrospective studies, what we are examining is the distribution of exposure. When exposure of a control is related to exposure of a case, it is reasonable to think this correlation may distort our inferences. However, Goldstein et al. demonstrated that the usual matched-pair estimate remains a consistent estimate of the population odds ratio despite the correlation of exposure (assuming no other assumptions are violated). A similar result appears in Pike and Robins [2] in a modification of the results of Flanders and Austin [3]. However, this is not completely reassuring since the correlation within pairs may well extend further.
Although we are only interested in and only measure a single exposure, there are always other risk factors for the disease. The pair members may well be correlated on these other risk factors as well. These other risk factors may not be recognized, let alone measured. For example, suppose we are studying the relationship between a disease and some exposure; if the disease is thought to have a genetic component, but the genes responsible are unknown, sibling controls may be used. The genetic risk factor cannot be measured, since the gene is unknown. The siblings may have correlated values of a variety of other unmeasured risk factors as well; these might include diet or socioeconomic status. Similarly, if a disease is known to vary by socioeconomic status or geographical location, neighbourhood controls may be used. The underlying risk factors may be unknown and thus unmeasured, and may be correlated within neighbourhoods. Neighbourhood is not a risk factor in itself, but a surrogate for these other risk factors.

In this paper, we examine whether the usual estimate in a simple matched-pair case-control study remains consistent if correlation of risk factors (both the single measured one and the unmeasured ones) is present within pairs. Throughout, we will ignore precision; we are only concerned with bias. We will also assume that the unmeasured risk factors are independent of the measured exposure. Dependence would create a standard confounding situation where bias would be expected; under independence, one might expect to avoid problems. We explore whether this expectation is accurate.

ASSUMPTIONS AND NOTATION

Validity of a matched study depends on the rules that specify which non-cases are potential matched controls for each case. Certain schemes, such as use of friend controls, can cause bias [2-5]. This bias is avoided if the population from which cases arise can be divided into non-overlapping groups, and controls are chosen from the same group as the case; this has been called 'reciprocal design' [2-5]. We assume throughout that controls are chosen in this fashion. These non-overlapping groups might consist, for example, of sibship members or of residents of the same city block. For concreteness, we will assume that the groups in question are pairs, and we will call the pair members the wife and the husband.

Assume a single dichotomous exposure of interest, denoted by E.
This exposure will be the focus of the matched-pair case-control study. Let $p$ and $q$ denote the prevalences of the exposure for wives and husbands, respectively; we need not assume that they are equal. Assume that exposures of wife and husband are correlated, and let $r$ denote the covariance. Writing $\bar{E}$ for the absence of exposure, the joint probabilities of E are:

$$
\begin{aligned}
\Pr(\text{wife is } E,\ \text{husband is } E) &= pq + r \\
\Pr(\text{wife is } E,\ \text{husband is } \bar{E}) &= p(1-q) - r \\
\Pr(\text{wife is } \bar{E},\ \text{husband is } E) &= (1-p)q - r \\
\Pr(\text{wife is } \bar{E},\ \text{husband is } \bar{E}) &= (1-p)(1-q) + r
\end{aligned}
$$

If $r = 0$, the exposures of wife and husband are independent.

Assume one other discrete risk factor (denoted F) with $f$ categories. F can be thought of as subsuming all other risk factors, since it could actually be a composite of multiple, possibly dependent, risk factors; for example, level 1 is young white professionals, level 2 is old white professionals, level 3 is young white labourers, and so on. F will not be measured in the matched-pair study; it nevertheless plays a role in determining the distribution of disease in the pairs. Assume that F is correlated within pairs, but that F and E are independent. Assume that prevalences of F are the same for wives and husbands. Denote the marginal and joint probabilities for F by:

$$
\begin{aligned}
\Pr(\text{wife is } F_i) &= \Pr(\text{husband is } F_i) = x_i \\
\Pr(\text{wife is } F_i \text{ and husband is } F_j) &= \Pr(\text{wife is } F_j \text{ and husband is } F_i) = x_i x_j + z_{ij}
\end{aligned}
$$

If $z_{ij} = 0$ for all values of $i$ and $j$, then the risk factors of husband and wife are independent.

Finally, denote disease by D. Assume that disease risk depends only on E and F. In particular, assume that variations in disease risk from one pair to another are attributable solely to variations in E and F. Assume that, conditional on the risk factors, occurrence of disease in one individual is independent of occurrence in all others.
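As a concrete illustration of the notation, the following Python sketch tabulates the two within-pair joint distributions and checks that they are valid probability tables. The function names and the parameter values are illustrative assumptions, not part of the paper; the checks on $z_{ij}$ (symmetry and zero row sums) are the constraints noted in the Appendix.

```python
from math import isclose

def exposure_joint(p, q, r):
    """Joint distribution of exposure E within a pair.

    Keys are (wife_exposed, husband_exposed); r is the within-pair covariance.
    """
    table = {
        (True, True): p * q + r,
        (True, False): p * (1 - q) - r,
        (False, True): (1 - p) * q - r,
        (False, False): (1 - p) * (1 - q) + r,
    }
    if min(table.values()) < 0:
        raise ValueError("r is outside the range permitted by p and q")
    return table

def risk_factor_joint(x, z):
    """Joint distribution of F within a pair: entry [i][j] is x_i * x_j + z_ij."""
    f = len(x)
    if not isclose(sum(x), 1.0):
        raise ValueError("prevalences x must sum to 1")
    for i in range(f):
        if not isclose(sum(z[i]), 0.0, abs_tol=1e-12):
            raise ValueError("each row of z must sum to zero")
        for j in range(f):
            if not isclose(z[i][j], z[j][i], abs_tol=1e-12):
                raise ValueError("z must be symmetric")
    table = [[x[i] * x[j] + z[i][j] for j in range(f)] for i in range(f)]
    if min(min(row) for row in table) < 0:
        raise ValueError("z is outside the range permitted by x")
    return table

# Hypothetical values chosen only for illustration.
e_table = exposure_joint(p=0.3, q=0.4, r=0.05)
f_table = risk_factor_joint(x=[0.7, 0.3], z=[[0.05, -0.05], [-0.05, 0.05]])
```

Both tables sum to 1, and their margins recover $p$, $q$, and the $x_i$, which is what allows $r$ and $z_{ij}$ to be read as pure within-pair correlation parameters.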
Denote the disease probabilities by:

$$
\Pr(D \mid \bar{E}, F_i) = a_i, \qquad \Pr(D \mid E, F_i) = b_i
$$

Note that we do not assume that relative risks for E (that is, $b_i/a_i$) are constant across the levels of F; this means effect modification is permitted. Thus, for example, we allow for the possibility that one factor is environmental, the other is genetic, and no elevation in risk occurs unless both are present.

RESULTS

Population Parameters

We may derive the population values for relative risks and odds ratios for exposure through straightforward algebra; details are in the Appendix. First, we may show that the risk of disease conditional on exposure is $\Pr(D \mid E) = \sum_{i=1}^{f} x_i b_i$. This is, of course, just a weighted average of the risks ($b_i$) in the various levels of F, weighted by the frequencies ($x_i$). Similarly, we may show that $\Pr(D \mid \bar{E}) = \sum_{i=1}^{f} x_i a_i$. Then the relative risk is $\sum_{i=1}^{f} x_i b_i \big/ \sum_{i=1}^{f} x_i a_i$. A similar expression in the case where $f = 2$ is given by Khoury and James [6]. Similarly, the odds ratio is:

$$
\frac{\left[\sum_{i=1}^{f} x_i b_i\right]\left[\sum_{i=1}^{f} x_i (1-a_i)\right]}{\left[\sum_{i=1}^{f} x_i a_i\right]\left[\sum_{i=1}^{f} x_i (1-b_i)\right]} \tag{1}
$$

Note that the correlation parameters, $r$ and $z_{ij}$, do not enter into these expressions; the relative risk and odds ratios are the same whether or not exposures are correlated within the pairs.

Matched Pair Estimate

Suppose we do a matched-pair case-control study looking at the effect of E on D. F is not measured in such a study, but it affects the distribution of D nonetheless. By design, only those pairs discordant for D (that is, pairs with one case and one control) appear in the study. Of those, only those pairs discordant for E contribute to the usual estimate of the odds ratio. The expected number of pairs with an exposed case and an unexposed control will be proportional to:

$$
\Pr(\text{wife is } E, D \text{ and husband is } \bar{E}, \bar{D}) + \Pr(\text{wife is } \bar{E}, \bar{D} \text{ and husband is } E, D)
$$

which can be shown to be:

$$
[p + q - 2pq - 2r]\left\{\left[\sum_{i=1}^{f} x_i (1-a_i)\right]\left[\sum_{i=1}^{f} x_i b_i\right] - \sum_{i=1}^{f}\sum_{j=1}^{f} z_{ij} b_i a_j\right\}
$$

Similarly, the expected number of pairs with an unexposed case and an exposed control will be proportional to:

$$
\begin{aligned}
&\Pr(\text{wife is } \bar{E}, D \text{ and husband is } E, \bar{D}) + \Pr(\text{wife is } E, \bar{D} \text{ and husband is } \bar{E}, D) \\
&\quad = [p + q - 2pq - 2r]\left\{\left[\sum_{i=1}^{f} x_i a_i\right]\left[\sum_{i=1}^{f} x_i (1-b_i)\right] - \sum_{i=1}^{f}\sum_{j=1}^{f} z_{ij} a_j b_i\right\}
\end{aligned}
$$

The ratio of these two terms gives the expression for the large-sample expectation of the estimated odds ratio:

$$
\frac{\left[\sum_{i=1}^{f} x_i b_i\right]\left[\sum_{i=1}^{f} x_i (1-a_i)\right] - \sum_{i=1}^{f}\sum_{j=1}^{f} z_{ij} b_i a_j}{\left[\sum_{i=1}^{f} x_i a_i\right]\left[\sum_{i=1}^{f} x_i (1-b_i)\right] - \sum_{i=1}^{f}\sum_{j=1}^{f} z_{ij} b_i a_j} \tag{2}
$$

Clearly, this expression differs from the population odds ratio (1); specifically, it has an extra term subtracted from both numerator and denominator. Unlike the population odds ratio, the estimate is affected by the correlation parameters $z_{ij}$. Thus, the usual matched-pairs estimate will be biased. We now examine the nature of the bias.

Behaviour of Estimate Under Various Conditions

First note that the distribution of the exposure E is irrelevant to the behaviour of the estimate; $p$, $q$, and $r$ do not appear in expression (2).

Expression (2) will be equal to the population odds ratio (1) in several circumstances. First, if the exposure of interest E is not actually a risk factor, there is no bias. This condition is equivalent to $a_i = b_i$ for all $i$. In this situation, both the population odds ratio and the large-sample expectation of the estimate will be 1.

Second, if there is no correlation on F within pairs, there is no bias. This condition is equivalent to $z_{ij} = 0$ for all $i$ and $j$.

Third, if F is not a risk factor within both exposure groups (that is, if it has no effect in at least one of them), there is no bias. If F does not affect disease risk among the unexposed, then $a_i = a$ for all $i$; it can be shown that there is no bias. In similar fashion, if F does not affect disease risk among the exposed, then $b_i = b$ for all $i$, and there will be no bias. Note that the case studied by Goldstein et al. [1] had no risk factor F, which is equivalent to having both $a_i = a$ and $b_i = b$; thus their results are a special case of the results obtained here.

The behaviour of expression (2) in the rare disease case can be seen by letting disease rates go to zero with relative rates fixed. Simple calculus shows that the limit is the relative risk. Thus as the disease becomes rarer, both the large-sample expectation (2) and the population odds ratio (1) approach the population relative risk and the bias disappears.

Example

Suppose that F is dichotomous. The distribution of F can then be described by only two parameters, due to constraints mentioned in the Appendix. Thus, we have:

$$
\begin{aligned}
\Pr(\text{wife is } F_1) &= \Pr(\text{husband is } F_1) = x_1 \\
\Pr(\text{wife is } F_2) &= \Pr(\text{husband is } F_2) = 1 - x_1 \\
\Pr(\text{wife is } F_1 \text{ and husband is } F_1) &= x_1^2 + z_{11} \\
\Pr(\text{wife is } F_1 \text{ and husband is } F_2) &= \Pr(\text{wife is } F_2 \text{ and husband is } F_1) = x_1(1 - x_1) - z_{11} \\
\Pr(\text{wife is } F_2 \text{ and husband is } F_2) &= (1 - x_1)^2 + z_{11}
\end{aligned}
$$

There will be four disease parameters ($a_1$, $a_2$, $b_1$, $b_2$). Assume that F = 2 is the higher risk category for both exposed and unexposed, so that $a_2 \ge a_1$ and $b_2 \ge b_1$.
Assume also that exposure is detrimental in both categories of F, so that $b_1 \ge a_1$ and $b_2 \ge a_2$. Assuming all this, we conducted a numerical search through the region where disease risks ($a_1$, $a_2$, $b_1$, $b_2$) are small ($10^{-6}$ to 0.1) and relative risks ($b_1/a_1$, $b_2/a_2$, $a_2/a_1$, $b_2/b_1$) are moderate (1-5). The parameters $x_1$ and $z_{11}$ were allowed to range through all possible values. The search yielded no example where expression (2) differed by more than 3% from the population odds ratio. In this particular case, expression (2) is an increasing function of $z_{11}$. Thus positive correlation within the pair ($z_{11} > 0$) will produce a value for expression (2) greater than the population odds ratio. Conversely, negative correlation will produce a value which is smaller.

DISCUSSION

We have shown that correlation within matched case-control pairs on unmeasured risk factors independent of the measured exposure can cause the usual estimate to be inconsistent for the population odds ratio. The bias vanishes as disease becomes rare; thus the bias is unlikely to be of practical importance. There is no bias if the exposure of interest is not a risk factor. There is also no bias if the unmeasured risk factor is not truly a risk factor or if it is not correlated within pairs.

We assume throughout that the quantity of interest is the population odds ratio. This will not always be the situation. For example, if the unmeasured risk factor is genotype, only the risk among the susceptibles may be of interest [7,8]. We assume that disease is independent within pairs, conditional on the risk factors; for non-infectious diseases, this is likely to be true since any correlation of disease is probably induced by correlation of risk factors. We assumed that the marginal distribution of unmeasured risk factors was the same for the two pair members; situations where this is not true (for example, spouses of breast cancer cases) are likely to represent problematic choices of controls.

Related but different problems have been discussed by other authors. Khoury and James [6] assume a measured environmental factor and an unmeasured genetic factor, but examine a different study design. They identify affected individuals and determine the disease status of the pair member.
They calculate risk of disease in one pair member conditional on the other pair member being diseased and conditional on exposure status. In contrast, the matched-pair case-control study examined here looks at risk of exposure in a pair conditional on the disease status of the pair. They show that the relative risks they obtain will equal the population relative risk if risks are multiplicative.

Robins and Pike [5] discuss the situation of two risk factors in matched-pair case-control studies. However, they assume that both risk factors E and F are measured and the effects of both are estimated simultaneously. This is a different estimator from the one discussed here. They assume that E and F are correlated with each other. They show that if risks are multiplicative, the estimates for both risk factors will be unbiased.

ACKNOWLEDGEMENTS

I thank Dale Sandler for bringing this problem to my attention and Glinda Cooper, Dale Sandler, David Umbach, and Clarice Weinberg for helpful comments.

REFERENCES

1. Goldstein A M, Hodge S E, Haile R W C. Selection bias in case-control studies using relatives as the controls. Int J Epidemiol 1989; 18:
2. Pike M C, Robins J. Re: 'Possibility of selection bias in matched case-control studies using friend controls'. Am J Epidemiol 1989; 130:
3. Flanders W D, Austin H. Possibility of selection bias in matched case-control studies using friend controls. Am J Epidemiol 1986; 124:
4. Austin H, Flanders W D, Rothman K J. Bias arising in case-control studies from selection of controls from overlapping groups. Int J Epidemiol 1989; 18:
5. Robins J, Pike M. The validity of case-control studies with nonrandom selection of controls. Epidemiology 1990; 1:
6. Khoury M J, James L M. Population and familial relative risks of disease associated with environmental factors in the presence of gene-environment interaction. Am J Epidemiol 1993; 137:
7. Khoury M J, Stewart W, Beaty T H. The effect of genetic susceptibility on causal inference in epidemiologic studies. Am J Epidemiol 1987; 126:
8. Breitner J C S, Murphy E A, Woodbury M A. Case-control studies of environmental influences in diseases with genetic determinants, with an application to Alzheimer's disease. Am J Epidemiol 1991; 133:

(Revised version received August 1995)

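APPENDIX

We give here details of some of the calculations. Note first for future reference that symmetry in the definitions of the probabilities of F implies that $z_{ij} = z_{ji}$. The fact that probabilities add to 1 implies that $\sum_{i=1}^{f} x_i = 1$. The definition of $x_i$ implies that $\sum_{i=1}^{f} z_{ij} = \sum_{j=1}^{f} z_{ij} = 0$.

First, derive the risk of disease conditional on exposure:

$$
\Pr(D \mid E) = \sum_{i=1}^{f} \Pr(E)\,\Pr(F_i)\,\Pr(D \mid E, F_i) \big/ \Pr(E) = \sum_{i=1}^{f} x_i b_i
$$

The derivation of $\Pr(D \mid \bar{E})$ is exactly analogous; population values of relative risks and odds ratios follow immediately.

The expected number of pairs with an exposed case and an unexposed control will be proportional to:

$$
\begin{aligned}
&\Pr(\text{wife is } E, D \text{ and husband is } \bar{E}, \bar{D}) + \Pr(\text{wife is } \bar{E}, \bar{D} \text{ and husband is } E, D) \\
&= \sum_{i=1}^{f}\sum_{j=1}^{f} \big[\Pr(\text{wife is } E, D, F_i \text{ and husband is } \bar{E}, \bar{D}, F_j) + \Pr(\text{wife is } \bar{E}, \bar{D}, F_i \text{ and husband is } E, D, F_j)\big] \\
&= \sum_{i=1}^{f}\sum_{j=1}^{f} \big[\Pr(\text{wife is } E \text{ and husband is } \bar{E})\,\Pr(\text{wife is } F_i \text{ and husband is } F_j)\,\Pr(\text{wife is } D \mid \text{wife is } E, F_i)\,\Pr(\text{husband is } \bar{D} \mid \text{husband is } \bar{E}, F_j) \\
&\qquad + \Pr(\text{wife is } \bar{E} \text{ and husband is } E)\,\Pr(\text{wife is } F_i \text{ and husband is } F_j)\,\Pr(\text{wife is } \bar{D} \mid \text{wife is } \bar{E}, F_i)\,\Pr(\text{husband is } D \mid \text{husband is } E, F_j)\big] \\
&= \sum_{i=1}^{f}\sum_{j=1}^{f} \big\{[p(1-q)-r](x_i x_j + z_{ij})\, b_i (1-a_j) + [(1-p)q-r](x_i x_j + z_{ij})(1-a_i)\, b_j\big\} \\
&= [p(1-q)-r]\sum_{i=1}^{f} x_i b_i \sum_{j=1}^{f} x_j (1-a_j) + [(1-p)q-r]\sum_{i=1}^{f} x_i (1-a_i) \sum_{j=1}^{f} x_j b_j \\
&\qquad + [p(1-q)-r]\sum_{i=1}^{f}\sum_{j=1}^{f} z_{ij} b_i (1-a_j) + [(1-p)q-r]\sum_{i=1}^{f}\sum_{j=1}^{f} z_{ij} (1-a_i) b_j \\
&= [p+q-2pq-2r]\left[\sum_{i=1}^{f} x_i (1-a_i)\right]\left[\sum_{i=1}^{f} x_i b_i\right] + [p(1-q)-r]\sum_{i=1}^{f}\sum_{j=1}^{f} z_{ij} b_i (1-a_j) + [(1-p)q-r]\sum_{i=1}^{f}\sum_{j=1}^{f} z_{ji} b_j (1-a_i) \\
&= [p+q-2pq-2r]\left\{\left[\sum_{i=1}^{f} x_i (1-a_i)\right]\left[\sum_{i=1}^{f} x_i b_i\right] + \sum_{i=1}^{f}\sum_{j=1}^{f} z_{ij} b_i (1-a_j)\right\} \\
&= [p+q-2pq-2r]\left\{\left[\sum_{i=1}^{f} x_i (1-a_i)\right]\left[\sum_{i=1}^{f} x_i b_i\right] + \sum_{i=1}^{f} b_i \sum_{j=1}^{f} z_{ij} - \sum_{i=1}^{f}\sum_{j=1}^{f} z_{ij} b_i a_j\right\} \\
&= [p+q-2pq-2r]\left\{\left[\sum_{i=1}^{f} x_i (1-a_i)\right]\left[\sum_{i=1}^{f} x_i b_i\right] - \sum_{i=1}^{f}\sum_{j=1}^{f} z_{ij} b_i a_j\right\}
\end{aligned}
$$

The expected number of pairs with an unexposed case and an exposed control can be derived similarly, and expression (2) follows immediately.

Expression (2) will be equal to the population odds ratio (1) in several circumstances. First, there is no bias if $a_i = b_i$ for all $i$.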
Under this assumption, the numerator of (2) equals the denominator of (2), so expression (2) equals 1. Since the population odds ratio is also 1, there is no bias.

Second, there is no bias if $z_{ij} = 0$ for all $i$ and $j$. Under this condition, the extra term in the numerator and denominator of (2) is zero; this makes expressions (1) and (2) exactly equal.

Third, there is no bias if $a_i = a$ for all $i$. Under these circumstances, the extra term is again zero:

$$
\sum_{i=1}^{f}\sum_{j=1}^{f} z_{ij} b_i a_j = a \sum_{i=1}^{f} b_i \sum_{j=1}^{f} z_{ij} = 0.
$$

Thus there is no bias. In similar fashion, if $b_i = b$ for all $i$, the extra term is again zero.
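The no-bias conditions above can be checked numerically. The sketch below writes expressions (1) and (2) out from the definitions in the text and evaluates them for a dichotomous F; the function names and all parameter values are hypothetical and chosen only for illustration.

```python
from math import isclose

def dot(u, v):
    """Sum of elementwise products, e.g. sum_i x_i * b_i."""
    return sum(ui * vi for ui, vi in zip(u, v))

def extra_term(z, a, b):
    """Double sum of z_ij * b_i * a_j over all levels i and j of F."""
    n = len(a)
    return sum(z[i][j] * b[i] * a[j] for i in range(n) for j in range(n))

def population_odds_ratio(x, a, b):
    """Expression (1): population odds ratio for E, averaged over F."""
    not_a = [1 - ai for ai in a]
    not_b = [1 - bi for bi in b]
    return dot(x, b) * dot(x, not_a) / (dot(x, a) * dot(x, not_b))

def matched_pair_limit(x, z, a, b):
    """Expression (2): large-sample expectation of the matched-pair estimate."""
    not_a = [1 - ai for ai in a]
    not_b = [1 - bi for bi in b]
    extra = extra_term(z, a, b)
    return (dot(x, b) * dot(x, not_a) - extra) / (dot(x, a) * dot(x, not_b) - extra)

# Hypothetical parameter values for a two-level F.
x = [0.7, 0.3]
z = [[0.05, -0.05], [-0.05, 0.05]]   # symmetric, each row sums to zero
zero = [[0.0, 0.0], [0.0, 0.0]]
a = [0.01, 0.04]                     # disease risks among the unexposed
b = [0.02, 0.10]                     # disease risks among the exposed
flat_a = [0.02, 0.02]                # F has no effect among the unexposed

no_effect = matched_pair_limit(x, z, a, a)        # a_i = b_i for all i
independent = matched_pair_limit(x, zero, a, b)   # z_ij = 0 for all i, j
flat_unexposed = matched_pair_limit(x, z, flat_a, b)
correlated = matched_pair_limit(x, z, a, b)       # none of the conditions hold
```

With these values the first three quantities agree with the corresponding population odds ratio, while `correlated` comes out slightly larger than `population_odds_ratio(x, a, b)`, as expected for positive within-pair correlation.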

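The behaviour of expression (2) in the rare disease case can be seen by letting disease rates go to zero with relative rates fixed. Let $b_i = t_i a_i$ and $a_i = s_i a_0$. Expression (2) becomes

$$
\frac{\left[\sum_{i=1}^{f} x_i t_i s_i\right]\left[\sum_{i=1}^{f} x_i\right] - a_0\left[\sum_{i=1}^{f} x_i t_i s_i\right]\left[\sum_{i=1}^{f} x_i s_i\right] - a_0 \sum_{i=1}^{f}\sum_{j=1}^{f} z_{ij} t_i s_i s_j}{\left[\sum_{i=1}^{f} x_i s_i\right]\left[\sum_{i=1}^{f} x_i\right] - a_0\left[\sum_{i=1}^{f} x_i s_i\right]\left[\sum_{i=1}^{f} x_i t_i s_i\right] - a_0 \sum_{i=1}^{f}\sum_{j=1}^{f} z_{ij} t_i s_i s_j}
$$

We need the limit of this expression as $a_0$ goes to zero with all other terms fixed. Simple calculus shows that the limit is:

$$
\left[\sum_{i=1}^{f} x_i t_i s_i\right] \Big/ \left[\sum_{i=1}^{f} x_i s_i\right] = \left[\sum_{i=1}^{f} x_i b_i\right] \Big/ \left[\sum_{i=1}^{f} x_i a_i\right] = \text{relative risk}
$$

Thus as the disease becomes rare, the large-sample expectation approaches the population relative risk.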
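The rare-disease limit can also be seen numerically. In the sketch below the baseline risk `a0` is shrunk while the relative rates are held fixed; the names `s` (for $a_i/a_0$) and `t` (for $b_i/a_i$), the function name, and every parameter value are assumptions made for this illustration rather than values from the paper.

```python
from math import isclose

def dot(u, v):
    """Sum of elementwise products."""
    return sum(ui * vi for ui, vi in zip(u, v))

def rare_disease_quantities(x, z, s, t, a0):
    """Return (odds ratio (1), expectation (2), relative risk)
    when a_i = s_i * a0 and b_i = t_i * a_i."""
    a = [si * a0 for si in s]
    b = [ti * ai for ti, ai in zip(t, a)]
    n = len(x)
    extra = sum(z[i][j] * b[i] * a[j] for i in range(n) for j in range(n))
    not_a = [1 - ai for ai in a]
    not_b = [1 - bi for bi in b]
    odds_ratio = dot(x, b) * dot(x, not_a) / (dot(x, a) * dot(x, not_b))
    expectation = ((dot(x, b) * dot(x, not_a) - extra)
                   / (dot(x, a) * dot(x, not_b) - extra))
    relative_risk = dot(x, b) / dot(x, a)
    return odds_ratio, expectation, relative_risk

# Hypothetical two-level F with positive within-pair correlation.
x = [0.7, 0.3]
z = [[0.05, -0.05], [-0.05, 0.05]]
s = [1.0, 4.0]   # a_i relative to the baseline risk a0
t = [2.0, 2.5]   # relative risks b_i / a_i, held fixed

gaps = []        # |expectation (2) - relative risk| for each a0
for a0 in (5e-2, 5e-3, 5e-4, 5e-5):
    odds_ratio, expectation, relative_risk = rare_disease_quantities(x, z, s, t, a0)
    gaps.append(abs(expectation - relative_risk))
    print(f"a0={a0:g}  OR={odds_ratio:.5f}  "
          f"E[estimate]={expectation:.5f}  RR={relative_risk:.5f}")
```

For these inputs the gap between expression (2) and the relative risk shrinks at each step as `a0` decreases, and the population odds ratio converges to the same value, so the two quantities being compared become indistinguishable.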


Causality II: How does causal inference fit into public health and what it is the role of statistics? Causality II: How does causal inference fit into public health and what it is the role of statistics? Statistics for Psychosocial Research II November 13, 2006 1 Outline Potential Outcomes / Counterfactual

More information

Eco517 Fall 2014 C. Sims FINAL EXAM

Eco517 Fall 2014 C. Sims FINAL EXAM Eco517 Fall 2014 C. Sims FINAL EXAM This is a three hour exam. You may refer to books, notes, or computer equipment during the exam. You may not communicate, either electronically or in any other way,

More information

Confounding and effect modification: Mantel-Haenszel estimation, testing effect homogeneity. Dankmar Böhning

Confounding and effect modification: Mantel-Haenszel estimation, testing effect homogeneity. Dankmar Böhning Confounding and effect modification: Mantel-Haenszel estimation, testing effect homogeneity Dankmar Böhning Southampton Statistical Sciences Research Institute University of Southampton, UK Advanced Statistical

More information

1 Motivation for Instrumental Variable (IV) Regression

1 Motivation for Instrumental Variable (IV) Regression ECON 370: IV & 2SLS 1 Instrumental Variables Estimation and Two Stage Least Squares Econometric Methods, ECON 370 Let s get back to the thiking in terms of cross sectional (or pooled cross sectional) data

More information

University of California, Berkeley

University of California, Berkeley University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 2008 Paper 241 A Note on Risk Prediction for Case-Control Studies Sherri Rose Mark J. van der Laan Division

More information

Causal Inference. Prediction and causation are very different. Typical questions are:

Causal Inference. Prediction and causation are very different. Typical questions are: Causal Inference Prediction and causation are very different. Typical questions are: Prediction: Predict Y after observing X = x Causation: Predict Y after setting X = x. Causation involves predicting

More information

Estimating the long-term health impact of air pollution using spatial ecological studies. Duncan Lee

Estimating the long-term health impact of air pollution using spatial ecological studies. Duncan Lee Estimating the long-term health impact of air pollution using spatial ecological studies Duncan Lee EPSRC and RSS workshop 12th September 2014 Acknowledgements This is joint work with Alastair Rushworth

More information

Dennis Cosrnatos. Department of Biostatistics University of North Carolina at Chapel Hill. September 1988

Dennis Cosrnatos. Department of Biostatistics University of North Carolina at Chapel Hill. September 1988 METHODS FOR MODELING DISEASE RISK USING PROBABILITY-QF-EXPOSURE MEASURES by Dennis Cosrnatos Department of Biostatistics University of North Carolina at Chapel Hill Institute of Mimeo Series No. 1858T

More information

Comparison of Three Approaches to Causal Mediation Analysis. Donna L. Coffman David P. MacKinnon Yeying Zhu Debashis Ghosh

Comparison of Three Approaches to Causal Mediation Analysis. Donna L. Coffman David P. MacKinnon Yeying Zhu Debashis Ghosh Comparison of Three Approaches to Causal Mediation Analysis Donna L. Coffman David P. MacKinnon Yeying Zhu Debashis Ghosh Introduction Mediation defined using the potential outcomes framework natural effects

More information

Interpolation and Approximation

Interpolation and Approximation Interpolation and Approximation The Basic Problem: Approximate a continuous function f(x), by a polynomial p(x), over [a, b]. f(x) may only be known in tabular form. f(x) may be expensive to compute. Definition:

More information

Probability: Why do we care? Lecture 2: Probability and Distributions. Classical Definition. What is Probability?

Probability: Why do we care? Lecture 2: Probability and Distributions. Classical Definition. What is Probability? Probability: Why do we care? Lecture 2: Probability and Distributions Sandy Eckel seckel@jhsph.edu 22 April 2008 Probability helps us by: Allowing us to translate scientific questions into mathematical

More information

Survival Analysis I (CHL5209H)

Survival Analysis I (CHL5209H) Survival Analysis Dalla Lana School of Public Health University of Toronto olli.saarela@utoronto.ca January 7, 2015 31-1 Literature Clayton D & Hills M (1993): Statistical Models in Epidemiology. Not really

More information

Causal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification. Todd MacKenzie, PhD

Causal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification. Todd MacKenzie, PhD Causal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification Todd MacKenzie, PhD Collaborators A. James O Malley Tor Tosteson Therese Stukel 2 Overview 1. Instrumental variable

More information

Marginal versus conditional effects: does it make a difference? Mireille Schnitzer, PhD Université de Montréal

Marginal versus conditional effects: does it make a difference? Mireille Schnitzer, PhD Université de Montréal Marginal versus conditional effects: does it make a difference? Mireille Schnitzer, PhD Université de Montréal Overview In observational and experimental studies, the goal may be to estimate the effect

More information

Asymptotic equivalence of paired Hotelling test and conditional logistic regression

Asymptotic equivalence of paired Hotelling test and conditional logistic regression Asymptotic equivalence of paired Hotelling test and conditional logistic regression Félix Balazard 1,2 arxiv:1610.06774v1 [math.st] 21 Oct 2016 Abstract 1 Sorbonne Universités, UPMC Univ Paris 06, CNRS

More information

Estimating the Marginal Odds Ratio in Observational Studies

Estimating the Marginal Odds Ratio in Observational Studies Estimating the Marginal Odds Ratio in Observational Studies Travis Loux Christiana Drake Department of Statistics University of California, Davis June 20, 2011 Outline The Counterfactual Model Odds Ratios

More information

Lecture 2: Probability and Distributions

Lecture 2: Probability and Distributions Lecture 2: Probability and Distributions Ani Manichaikul amanicha@jhsph.edu 17 April 2007 1 / 65 Probability: Why do we care? Probability helps us by: Allowing us to translate scientific questions info

More information

Harvard University. Harvard University Biostatistics Working Paper Series

Harvard University. Harvard University Biostatistics Working Paper Series Harvard University Harvard University Biostatistics Working Paper Series Year 2015 Paper 192 Negative Outcome Control for Unobserved Confounding Under a Cox Proportional Hazards Model Eric J. Tchetgen

More information

Describing Contingency tables

Describing Contingency tables Today s topics: Describing Contingency tables 1. Probability structure for contingency tables (distributions, sensitivity/specificity, sampling schemes). 2. Comparing two proportions (relative risk, odds

More information

BIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY

BIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY BIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY Ingo Langner 1, Ralf Bender 2, Rebecca Lenz-Tönjes 1, Helmut Küchenhoff 2, Maria Blettner 2 1

More information

Causal Modeling in Environmental Epidemiology. Joel Schwartz Harvard University

Causal Modeling in Environmental Epidemiology. Joel Schwartz Harvard University Causal Modeling in Environmental Epidemiology Joel Schwartz Harvard University When I was Young What do I mean by Causal Modeling? What would have happened if the population had been exposed to a instead

More information

Correlation and regression

Correlation and regression 1 Correlation and regression Yongjua Laosiritaworn Introductory on Field Epidemiology 6 July 2015, Thailand Data 2 Illustrative data (Doll, 1955) 3 Scatter plot 4 Doll, 1955 5 6 Correlation coefficient,

More information

DATA-ADAPTIVE VARIABLE SELECTION FOR

DATA-ADAPTIVE VARIABLE SELECTION FOR DATA-ADAPTIVE VARIABLE SELECTION FOR CAUSAL INFERENCE Group Health Research Institute Department of Biostatistics, University of Washington shortreed.s@ghc.org joint work with Ashkan Ertefaie Department

More information

On the Use of the Bross Formula for Prioritizing Covariates in the High-Dimensional Propensity Score Algorithm

On the Use of the Bross Formula for Prioritizing Covariates in the High-Dimensional Propensity Score Algorithm On the Use of the Bross Formula for Prioritizing Covariates in the High-Dimensional Propensity Score Algorithm Richard Wyss 1, Bruce Fireman 2, Jeremy A. Rassen 3, Sebastian Schneeweiss 1 Author Affiliations:

More information

Sampling. Module II Chapter 3

Sampling. Module II Chapter 3 Sampling Module II Chapter 3 Topics Introduction Terms in Sampling Techniques of Sampling Essentials of Good Sampling Introduction In research terms a sample is a group of people, objects, or items that

More information

statistical sense, from the distributions of the xs. The model may now be generalized to the case of k regressors:

statistical sense, from the distributions of the xs. The model may now be generalized to the case of k regressors: Wooldridge, Introductory Econometrics, d ed. Chapter 3: Multiple regression analysis: Estimation In multiple regression analysis, we extend the simple (two-variable) regression model to consider the possibility

More information

A unified framework for studying parameter identifiability and estimation in biased sampling designs

A unified framework for studying parameter identifiability and estimation in biased sampling designs Biometrika Advance Access published January 31, 2011 Biometrika (2011), pp. 1 13 C 2011 Biometrika Trust Printed in Great Britain doi: 10.1093/biomet/asq059 A unified framework for studying parameter identifiability

More information

Using Geographic Information Systems for Exposure Assessment

Using Geographic Information Systems for Exposure Assessment Using Geographic Information Systems for Exposure Assessment Ravi K. Sharma, PhD Department of Behavioral & Community Health Sciences, Graduate School of Public Health, University of Pittsburgh, Pittsburgh,

More information

This paper revisits certain issues concerning differences

This paper revisits certain issues concerning differences ORIGINAL ARTICLE On the Distinction Between Interaction and Effect Modification Tyler J. VanderWeele Abstract: This paper contrasts the concepts of interaction and effect modification using a series of

More information

Lecture 7: Interaction Analysis. Summer Institute in Statistical Genetics 2017

Lecture 7: Interaction Analysis. Summer Institute in Statistical Genetics 2017 Lecture 7: Interaction Analysis Timothy Thornton and Michael Wu Summer Institute in Statistical Genetics 2017 1 / 39 Lecture Outline Beyond main SNP effects Introduction to Concept of Statistical Interaction

More information

Problems for 3505 (2011)

Problems for 3505 (2011) Problems for 505 (2011) 1. In the simplex of genotype distributions x + y + z = 1, for two alleles, the Hardy- Weinberg distributions x = p 2, y = 2pq, z = q 2 (p + q = 1) are characterized by y 2 = 4xz.

More information

Casual Mediation Analysis

Casual Mediation Analysis Casual Mediation Analysis Tyler J. VanderWeele, Ph.D. Upcoming Seminar: April 21-22, 2017, Philadelphia, Pennsylvania OXFORD UNIVERSITY PRESS Explanation in Causal Inference Methods for Mediation and Interaction

More information

TESTS FOR EQUIVALENCE BASED ON ODDS RATIO FOR MATCHED-PAIR DESIGN

TESTS FOR EQUIVALENCE BASED ON ODDS RATIO FOR MATCHED-PAIR DESIGN Journal of Biopharmaceutical Statistics, 15: 889 901, 2005 Copyright Taylor & Francis, Inc. ISSN: 1054-3406 print/1520-5711 online DOI: 10.1080/10543400500265561 TESTS FOR EQUIVALENCE BASED ON ODDS RATIO

More information

Exact McNemar s Test and Matching Confidence Intervals Michael P. Fay April 25,

Exact McNemar s Test and Matching Confidence Intervals Michael P. Fay April 25, Exact McNemar s Test and Matching Confidence Intervals Michael P. Fay April 25, 2016 1 McNemar s Original Test Consider paired binary response data. For example, suppose you have twins randomized to two

More information

Dependent Nondifferential Misclassification of Exposure

Dependent Nondifferential Misclassification of Exposure Dependent Nondifferential Misclassification of Exposure DISCLAIMER: I am REALLY not an expert in data simulations or misclassification Outline Relevant definitions Review of implications of dependent nondifferential

More information

Probability of Detecting Disease-Associated SNPs in Case-Control Genome-Wide Association Studies

Probability of Detecting Disease-Associated SNPs in Case-Control Genome-Wide Association Studies Probability of Detecting Disease-Associated SNPs in Case-Control Genome-Wide Association Studies Ruth Pfeiffer, Ph.D. Mitchell Gail Biostatistics Branch Division of Cancer Epidemiology&Genetics National

More information

Bayesian Hierarchical Models

Bayesian Hierarchical Models Bayesian Hierarchical Models Gavin Shaddick, Millie Green, Matthew Thomas University of Bath 6 th - 9 th December 2016 1/ 34 APPLICATIONS OF BAYESIAN HIERARCHICAL MODELS 2/ 34 OUTLINE Spatial epidemiology

More information

Data, Design, and Background Knowledge in Etiologic Inference

Data, Design, and Background Knowledge in Etiologic Inference Data, Design, and Background Knowledge in Etiologic Inference James M. Robins I use two examples to demonstrate that an appropriate etiologic analysis of an epidemiologic study depends as much on study

More information

Causal Inference for Case-Control Studies. Sherri Rose. A dissertation submitted in partial satisfaction of the. requirements for the degree of

Causal Inference for Case-Control Studies. Sherri Rose. A dissertation submitted in partial satisfaction of the. requirements for the degree of Causal Inference for Case-Control Studies By Sherri Rose A dissertation submitted in partial satisfaction of the requirements for the degree of Doctor of Philosophy in Biostatistics in the Graduate Division

More information

Semiparametric maximum likelihood estimation exploiting gene-environment independence in case-control studies

Semiparametric maximum likelihood estimation exploiting gene-environment independence in case-control studies Biometrika (2005), 92, 2, pp. 399 418 2005 Biometrika Trust Printed in Great Britain Semiparametric maximum likelihood estimation exploiting gene-environment independence in case-control studies BY NILANJAN

More information

Statistics 3858 : Contingency Tables

Statistics 3858 : Contingency Tables Statistics 3858 : Contingency Tables 1 Introduction Before proceeding with this topic the student should review generalized likelihood ratios ΛX) for multinomial distributions, its relation to Pearson

More information

8 Nominal and Ordinal Logistic Regression

8 Nominal and Ordinal Logistic Regression 8 Nominal and Ordinal Logistic Regression 8.1 Introduction If the response variable is categorical, with more then two categories, then there are two options for generalized linear models. One relies on

More information

Assess Assumptions and Sensitivity Analysis. Fan Li March 26, 2014

Assess Assumptions and Sensitivity Analysis. Fan Li March 26, 2014 Assess Assumptions and Sensitivity Analysis Fan Li March 26, 2014 Two Key Assumptions 1. Overlap: 0

More information

STAT331. Cox s Proportional Hazards Model

STAT331. Cox s Proportional Hazards Model STAT331 Cox s Proportional Hazards Model In this unit we introduce Cox s proportional hazards (Cox s PH) model, give a heuristic development of the partial likelihood function, and discuss adaptations

More information

Appendix: Modeling Approach

Appendix: Modeling Approach AFFECTIVE PRIMACY IN INTRAORGANIZATIONAL TASK NETWORKS Appendix: Modeling Approach There is now a significant and developing literature on Bayesian methods in social network analysis. See, for instance,

More information

Social Epidemiology and Spatial Epidemiology: An Empirical Comparison of Perspectives

Social Epidemiology and Spatial Epidemiology: An Empirical Comparison of Perspectives Social Epidemiology and Spatial Epidemiology: An Empirical Comparison of Perspectives A DISSERTATION SUBMITTED TO THE FACULTY OF THE GRADUATE SCHOOL OF THE UNIVERSITY OF MINNESOTA BY Kelsey Nathel McDonald

More information

HERITABILITY ESTIMATION USING A REGULARIZED REGRESSION APPROACH (HERRA)

HERITABILITY ESTIMATION USING A REGULARIZED REGRESSION APPROACH (HERRA) BIRS 016 1 HERITABILITY ESTIMATION USING A REGULARIZED REGRESSION APPROACH (HERRA) Malka Gorfine, Tel Aviv University, Israel Joint work with Li Hsu, FHCRC, Seattle, USA BIRS 016 The concept of heritability

More information

CAUSAL INFERENCE IN THE EMPIRICAL SCIENCES. Judea Pearl University of California Los Angeles (www.cs.ucla.edu/~judea)

CAUSAL INFERENCE IN THE EMPIRICAL SCIENCES. Judea Pearl University of California Los Angeles (www.cs.ucla.edu/~judea) CAUSAL INFERENCE IN THE EMPIRICAL SCIENCES Judea Pearl University of California Los Angeles (www.cs.ucla.edu/~judea) OUTLINE Inference: Statistical vs. Causal distinctions and mental barriers Formal semantics

More information

Joint, Conditional, & Marginal Probabilities

Joint, Conditional, & Marginal Probabilities Joint, Conditional, & Marginal Probabilities The three axioms for probability don t discuss how to create probabilities for combined events such as P [A B] or for the likelihood of an event A given that

More information

Investigating mediation when counterfactuals are not metaphysical: Does sunlight exposure mediate the effect of eye-glasses on cataracts?

Investigating mediation when counterfactuals are not metaphysical: Does sunlight exposure mediate the effect of eye-glasses on cataracts? Investigating mediation when counterfactuals are not metaphysical: Does sunlight exposure mediate the effect of eye-glasses on cataracts? Brian Egleston Fox Chase Cancer Center Collaborators: Daniel Scharfstein,

More information

Contingency Tables Part One 1

Contingency Tables Part One 1 Contingency Tables Part One 1 STA 312: Fall 2012 1 See last slide for copyright information. 1 / 32 Suggested Reading: Chapter 2 Read Sections 2.1-2.4 You are not responsible for Section 2.5 2 / 32 Overview

More information

where Female = 0 for males, = 1 for females Age is measured in years (22, 23, ) GPA is measured in units on a four-point scale (0, 1.22, 3.45, etc.

where Female = 0 for males, = 1 for females Age is measured in years (22, 23, ) GPA is measured in units on a four-point scale (0, 1.22, 3.45, etc. Notes on regression analysis 1. Basics in regression analysis key concepts (actual implementation is more complicated) A. Collect data B. Plot data on graph, draw a line through the middle of the scatter

More information

11 November 2011 Department of Biostatistics, University of Copengen. 9:15 10:00 Recap of case-control studies. Frequency-matched studies.

11 November 2011 Department of Biostatistics, University of Copengen. 9:15 10:00 Recap of case-control studies. Frequency-matched studies. Matched and nested case-control studies Bendix Carstensen Steno Diabetes Center, Gentofte, Denmark http://staff.pubhealth.ku.dk/~bxc/ Department of Biostatistics, University of Copengen 11 November 2011

More information

Expression QTLs and Mapping of Complex Trait Loci. Paul Schliekelman Statistics Department University of Georgia

Expression QTLs and Mapping of Complex Trait Loci. Paul Schliekelman Statistics Department University of Georgia Expression QTLs and Mapping of Complex Trait Loci Paul Schliekelman Statistics Department University of Georgia Definitions: Genes, Loci and Alleles A gene codes for a protein. Proteins due everything.

More information

Probability and Probability Distributions. Dr. Mohammed Alahmed

Probability and Probability Distributions. Dr. Mohammed Alahmed Probability and Probability Distributions 1 Probability and Probability Distributions Usually we want to do more with data than just describing them! We might want to test certain specific inferences about

More information

Applications of GIS in Health Research. West Nile virus

Applications of GIS in Health Research. West Nile virus Applications of GIS in Health Research West Nile virus Outline Part 1. Applications of GIS in Health research or spatial epidemiology Disease Mapping Cluster Detection Spatial Exposure Assessment Assessment

More information

Analysis of Longitudinal Data. Patrick J. Heagerty PhD Department of Biostatistics University of Washington

Analysis of Longitudinal Data. Patrick J. Heagerty PhD Department of Biostatistics University of Washington Analysis of Longitudinal Data Patrick J Heagerty PhD Department of Biostatistics University of Washington Auckland 8 Session One Outline Examples of longitudinal data Scientific motivation Opportunities

More information

What Causality Is (stats for mathematicians)

What Causality Is (stats for mathematicians) What Causality Is (stats for mathematicians) Andrew Critch UC Berkeley August 31, 2011 Introduction Foreword: The value of examples With any hard question, it helps to start with simple, concrete versions

More information

Effects of Exposure Measurement Error When an Exposure Variable Is Constrained by a Lower Limit

Effects of Exposure Measurement Error When an Exposure Variable Is Constrained by a Lower Limit American Journal of Epidemiology Copyright 003 by the Johns Hopkins Bloomberg School of Public Health All rights reserved Vol. 157, No. 4 Printed in U.S.A. DOI: 10.1093/aje/kwf17 Effects of Exposure Measurement

More information

Statistics in medicine

Statistics in medicine Statistics in medicine Lecture 4: and multivariable regression Fatma Shebl, MD, MS, MPH, PhD Assistant Professor Chronic Disease Epidemiology Department Yale School of Public Health Fatma.shebl@yale.edu

More information

Joint, Conditional, & Marginal Probabilities

Joint, Conditional, & Marginal Probabilities Joint, Conditional, & Marginal Probabilities Statistics 110 Summer 2006 Copyright c 2006 by Mark E. Irwin Joint, Conditional, & Marginal Probabilities The three axioms for probability don t discuss how

More information

IV Estimation and its Limitations: Weak Instruments and Weakly Endogeneous Regressors

IV Estimation and its Limitations: Weak Instruments and Weakly Endogeneous Regressors IV Estimation and its Limitations: Weak Instruments and Weakly Endogeneous Regressors Laura Mayoral IAE, Barcelona GSE and University of Gothenburg Gothenburg, May 2015 Roadmap Deviations from the standard

More information

Mark Scheme (Results) January 2009

Mark Scheme (Results) January 2009 Mark (Results) January 009 GCE GCE Mathematics (666/0) Edexcel Limited. Registered in England and Wales No. 4496750 Registered Office: One90 High Holborn, London WCV 7BH January 009 666 Core Mathematics

More information

Unbiased estimation of exposure odds ratios in complete records logistic regression

Unbiased estimation of exposure odds ratios in complete records logistic regression Unbiased estimation of exposure odds ratios in complete records logistic regression Jonathan Bartlett London School of Hygiene and Tropical Medicine www.missingdata.org.uk Centre for Statistical Methodology

More information

Causal inference in epidemiological practice

Causal inference in epidemiological practice Causal inference in epidemiological practice Willem van der Wal Biostatistics, Julius Center UMC Utrecht June 5, 2 Overview Introduction to causal inference Marginal causal effects Estimating marginal

More information