Effects of Exposure Measurement Error When an Exposure Variable Is Constrained by a Lower Limit


American Journal of Epidemiology
Copyright 2003 by the Johns Hopkins Bloomberg School of Public Health. All rights reserved.
Vol. 157, No. 4. Printed in U.S.A.
DOI: /aje/kwf17

David B. Richardson (1) and Antonio Ciampi (2)

(1) Department of Epidemiology, School of Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC.
(2) Department of Epidemiology and Biostatistics, McGill University, Montreal, Quebec, Canada.

Received for publication March 14, 2001; accepted for publication October 14, 2002.

Epidemiologic studies routinely suffer from bias due to exposure measurement error. In this paper, the authors examine the effect of measurement error when the exposure variable of interest is constrained by a lower limit. This is an important consideration, since often in epidemiologic studies an exposure variable is constrained by a lower limit such as zero or a nonzero detection limit. In this paper, attenuation of exposure-disease associations is defined within the framework of a classical model of uncorrelated additive error. Then, the special case of nonlinearity due to the effect of a lower threshold is examined. A general model is developed to characterize the effect of random measurement error when there is a lower threshold for recorded values. Findings are illustrated under the assumption that the true exposure follows the lognormal and gamma distributions. The authors show that the direction and magnitude of bias in estimated exposure-response associations depend on the population distribution of the exposure, the magnitude of the recording threshold, the value assigned to below-threshold measurement results, and the variance in the measured exposure due to random measurement error.

bias (epidemiology); epidemiologic methods; measurement error; regression analysis

Epidemiologists are routinely concerned about the consequences of exposure measurement error. Investigators may rely on inaccurate proxy measures of exposure or on information derived from imprecise exposure measurement tools. In environmental and occupational epidemiologic studies, information on exposure is often collected under poorly controlled conditions, and it may be derived from historical records that were originally compiled for purposes other than epidemiologic research. Almost any measurement process combines human error with the limitations of a measurement tool, leading to measurement error.

Errors in exposure measurement may lead to biased estimates of exposure-disease associations. A number of authors have discussed the direction and magnitude of bias resulting from specified patterns of exposure measurement error and have provided models describing these associations (1–3). These models can help in assessing the bias and uncertainty that result from commonly encountered problems of exposure measurement error. Such models have been used in a range of epidemiologic investigations, including studies of nutritional factors, environmental contaminants, and occupational hazards (4–6).

Taking a similar approach in this paper, we begin with a model in which exposure measurement error is assumed to be nondifferential (that is, independent of disease status) and randomly distributed. In this paper, however, we focus on how the effects of measurement error change when an exposure variable is constrained by a lower limit. It is common in epidemiologic studies for recorded exposures to be constrained by a lower limit, such as zero or a minimal detection threshold for a measurement process (7–9). For example, in studies of workers in the nuclear industry, a lower boundary for recorded radiation doses often reflects the inability of a measurement tool to accurately obtain values below a specified minimal threshold of detection (10). In this case, measurement error conforms to a nonlinear rather than an additive model, and the assumption that these errors are randomly distributed is no longer accurate. Consequently, a constraint on the minimal recorded exposure will influence the distribution of exposure measurement error and, importantly for epidemiologists, influence the effect of measurement error on estimates of exposure-disease associations.

In this paper, we investigate the direction and magnitude of bias in exposure-response associations when there is random measurement error and a lower threshold for recorded exposures. We develop a general equation for the coefficient of bias in exposure-response associations resulting from random measurement error in the presence of a recording threshold limit, and we illustrate our findings under a range of specified model conditions.

Correspondence to Professor David B. Richardson, Department of Epidemiology, School of Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC (david_richardson@unc.edu).

Am J Epidemiol 2003;157:

METHODS

Regression models of exposure-disease association

Risk estimates for epidemiologic associations are often derived by ordinary linear regression with the following model:

$$ y = \alpha + \beta x + \varepsilon, \tag{1} $$

where y is a health response measured on a continuous scale, x is a continuous exposure (assumed to be nonnegative for our purposes), and ε is the outcome error term, which is assumed to be uncorrelated with x and to have a mean equal to zero and a constant variance. The parameters α and β denote the intercept and the average change in y with x, respectively.

If the health outcome under study is described by a binary variable, a logistic regression model may be preferable. In this case, equation 1 is replaced by a model in which y is a binary random variable with parameter p = E(y | x) = Pr(y = 1 | x) and

$$ \operatorname{logit}(p) = \alpha + \beta x. \tag{1'} $$

In the discussion that follows, we will assume that associations conform to equation 1. However, the theory remains valid even when the correct model is the logistic one (equation 1′), provided that the following assumptions are satisfied: 1) the outcome is rare; 2) β is not too large in absolute terms; and 3) the measurement error is not too severe (11). In many cases of epidemiologic interest, the above assumptions will be satisfied.

Bias due to measurement error

Typically, in an epidemiologic study, the true exposure, x, is not measurable without error. Consequently, one can consider that the study uses a surrogate exposure variable, z, which provides an imperfect measure of the true exposure, x.
A simple case considered in epidemiology is that of the classical error model:

$$ z = x + \eta, \tag{2} $$

where η is a random variable with variance σ² that is uncorrelated with x. It is usually assumed that cov(ε,η) = 0. Several generalizations of this model are also often used in epidemiology (5); they usually imply a linear relation between z and x and/or a relaxation of the assumption of zero correlation between η and x, while maintaining the assumption that cov(ε,η) = 0.

Suppose we fitted a linear regression model between a surrogate measure of exposure, z, and a continuous measure of disease, y:

$$ y = \alpha' + \beta' z + \varepsilon', \tag{3} $$

obtaining the usual least squares estimator, where the subscript s denotes the usual sample estimates of variance and covariance:

$$ \hat{\beta}' = \frac{\operatorname{cov}_s(Y, Z)}{\operatorname{var}_s(Z)}. \tag{4} $$

Let us write z = x + η as in equation 2, but without the assumptions that the distribution of η is independent of x and that there is no correlation between x and η. However, we will continue to assume that cov(ε,η) = 0. Then, as the sample size of the study population tends towards infinity, it can be shown simply from cov(y,z) = cov(βx + ε,z) = βcov(x,z) that the estimated association between the surrogate measure of exposure and disease, β′, is equal to the estimated association between the true exposure and disease, β, multiplied by a coefficient of bias, λ, as follows:

$$ E(\hat{\beta}') \xrightarrow[n \to \infty]{} \beta \, \frac{\operatorname{cov}(x, z)}{\operatorname{var}(z)} = \lambda \beta, \tag{5} $$

where

$$ \lambda = \frac{\operatorname{cov}(x, z)}{\operatorname{var}(z)}. \tag{6} $$

Therefore, provided that ε and η are uncorrelated, equation 6 is valid in general, not only in the familiar case of additive error.

We will call "standard" the well-studied case described by equation 2 and its linear generalizations, characterized by the assumption that the measurement error is distributed with mean equal to zero and constant variance, σ². In this standard case, measurement error always leads to attenuation of the exposure-disease association in addition to diminishing the goodness-of-fit of the regression model (5).
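The attenuation in the standard case can be illustrated numerically. The following Python sketch is an illustration only and is not part of the original analysis. It assumes a gamma (1,1) true exposure (variance 1), normally distributed additive error, and hypothetical helper names (sample_cov, bias_coefficient, simulate_classical). It estimates λ from equation 6 and can be compared with var(x)/[var(x) + σ²], the usual result for additive error uncorrelated with x.

```python
import random


def sample_cov(u, v):
    """Sample covariance of two equal-length sequences."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (n - 1)


def bias_coefficient(x, z):
    """Equation 6: lambda = cov(x, z) / var(z)."""
    return sample_cov(x, z) / sample_cov(z, z)


def simulate_classical(n=100_000, sigma=1.0, seed=1):
    """Estimate lambda under the classical model z = x + eta (equation 2).

    Assumptions made for this sketch only: x ~ gamma(1, 1), eta ~ N(0, sigma^2).
    """
    rng = random.Random(seed)
    x = [rng.gammavariate(1.0, 1.0) for _ in range(n)]
    z = [xi + rng.gauss(0.0, sigma) for xi in x]
    return bias_coefficient(x, z)


if __name__ == "__main__":
    sigma = 1.0
    expected = 1.0 / (1.0 + sigma ** 2)  # var(x) / (var(x) + sigma^2) with var(x) = 1
    print("simulated lambda:", round(simulate_classical(sigma=sigma), 3))
    print("var(x) / (var(x) + sigma^2):", expected)
```

With var(x) = 1 and σ = 1, the ratio is 0.5, so the simulated value should be close to 0.5 up to Monte Carlo error.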
In the standard case, the coefficient of bias, λ, described by equation 6 will always take values less than 1; for the additive error model, λ is equal to the ratio of the variance of x to the sum of the variance of x and σ², that is, λ = var(x)/[var(x) + σ²] (1). However, we are interested in the range of possible values for the coefficient of bias, λ, in the nonstandard case in which recorded values for z are constrained by a minimal threshold limit, d, and exposures below this threshold limit are set equal to a value, a. In this nonstandard case, the model of measurement error in equation 2 can be replaced by the following model:

$$ z = \begin{cases} x + \eta & \text{for } x + \eta > d, \\ a & \text{for } x + \eta \le d. \end{cases} \tag{7} $$

We first examine the case, which we call the pure threshold model, in which there is no random measurement error (the variance of η, denoted by σ², is equal to zero). In this case, the only source of exposure measurement error is the inability of the measuring instrument to detect x values that are below the threshold limit, d. Then we examine the case, which we call the threshold model with error, in which there is random measurement error (a nonzero variance for η). To explore these cases, we developed a general formula for bias due to exposure measurement error, as described in the Appendix.

Under the threshold model with error, the relation between the coefficient of bias, λ, and the threshold limit, d, depends on the distributions of x and η and on the value assigned to below-threshold measurements, a. Thus, rather than attempting a general analytical study of λ as a function of d, a, and the parameters of the distributions involved, we give some specific examples using distributions that mimic reasonably well what can be expected in real-life situations.

To explore these examples, we generated simulation data for 1,000,000 study subjects. A true exposure value, x, was assigned to each study subject by sampling from the lognormal (0,1) or gamma (1,1) distribution (figure 1). Using equation 7, we calculated values for z for the following cases: 1) a pure threshold model (σ = 0) with x distributed according to the lognormal (0,1) distribution; 2) a pure threshold model (σ = 0) with x distributed according to the gamma (1,1) distribution; 3) a threshold model with error with x distributed according to the lognormal (0,1) distribution; and 4) a threshold model with error with x distributed according to the gamma (1,1) distribution.

FIGURE 1. Assumed population distribution of the true exposure, x. The solid line shows the gamma (1,1) distribution; the dashed line shows the lognormal (0,1) distribution. Values for x > 5.0 are not shown. E(x) and var(x) for the lognormal (0,1) distribution are 1.65 and 4.65, respectively. E(x) and var(x) for the gamma (1,1) distribution are 1.00 and 1.00, respectively.

Using SAS PROC CORR to calculate variances and covariances of x and z, we derived Monte Carlo estimates of λ via equation 6. Results were derived for specified values of a and d, in the case of the pure threshold model, and for specified values of a, d, and σ, in the case of the threshold model with error (13).
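The simulation design described above can be sketched in a few lines of Python. This is a hedged illustration rather than the original SAS program: the function names (record, estimate_lambda), the default sample size of 100,000, the seed, and the use of a normal distribution for η are assumptions made for the sketch.

```python
import random


def sample_cov(u, v):
    """Sample covariance of two equal-length sequences."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (n - 1)


def record(x, eta, d, a):
    """Equation 7: record x + eta when it exceeds d, otherwise record a."""
    value = x + eta
    return value if value > d else a


def estimate_lambda(dist, d, a, sigma, n=100_000, seed=0):
    """Monte Carlo estimate of lambda = cov(x, z) / var(z) (equation 6).

    dist is "lognormal" for lognormal (0,1) or "gamma" for gamma (1,1).
    sigma = 0 gives the pure threshold model; sigma > 0 gives the
    threshold model with error (eta assumed N(0, sigma^2) in this sketch).
    """
    rng = random.Random(seed)
    if dist == "lognormal":
        x = [rng.lognormvariate(0.0, 1.0) for _ in range(n)]
    elif dist == "gamma":
        x = [rng.gammavariate(1.0, 1.0) for _ in range(n)]
    else:
        raise ValueError("dist must be 'lognormal' or 'gamma'")
    z = [record(xi, rng.gauss(0.0, sigma), d, a) for xi in x]
    return sample_cov(x, z) / sample_cov(z, z)


if __name__ == "__main__":
    for sigma in (0.0, 0.5, 1.0):
        lam = estimate_lambda("gamma", d=1.0, a=0.0, sigma=sigma)
        print("gamma (1,1), d = 1, a = 0, sigma =", sigma, "lambda =", round(lam, 3))
```

Averaging estimate_lambda over several seeds would mimic repeating the simulation and averaging the resulting estimates.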
Each simulation was repeated 100 times, and the resulting estimates of the coefficient of bias were averaged.

RESULTS

Pure threshold model

We begin by considering the pure threshold model in which the only source of exposure measurement error is the inability of the measuring instrument to detect x values below the threshold limit, d. In the Appendix, we present a generalized formula for the coefficient of bias, λ, under the pure threshold model as a function of the threshold limit and the value assigned to below-threshold measurements, a.

Under the pure threshold model, the coefficient of bias, λ, can be either less than 1 or greater than 1, depending on the distribution of x and on d and a. By properly choosing a, it is possible to have no bias despite the presence of a threshold limit. The coefficient, λ, is equal to 1 when a is equal to the expected value of x conditional on x being below the threshold: a = E[x | x ≤ d]. If the value assigned to below-threshold measurements is less than the expected value of x conditional on x being below the threshold, attenuation will occur in estimates of association; in contrast, if the value, a, assigned to below-threshold measurements is larger than this conditional expectation, inflation will occur in estimates of association.

Let us consider some specific examples. If the value assigned to below-threshold measurements, a, is equal to 0, the coefficient of bias will always be less than or equal to 1. In contrast, if the value assigned to below-threshold measurements is equal to the threshold limit, d, the coefficient of bias will always be greater than or equal to 1. The common practice of setting a equal to d/2 may result in upward or downward bias, depending on the magnitude of d/2 with respect to E[x | x ≤ d].

Figures 2 and 3 illustrate the relation between the coefficient of bias and a, the value assigned to exposure measurements below a threshold limit, d = 1 or d = 2. Figure 2 pertains to the case in which the population distribution of exposure is lognormal.
Figure 3 pertains to the case in which the population distribution of exposure conforms to the gamma distribution.

FIGURE 2. Estimated coefficients of bias for a linear exposure-response association under the pure threshold model, in which there is no random measurement error (σ = 0). Exposures below the threshold limit, d = 1 (triangles) and d = 2 (circles), are set equal to a, as in equation 7. The population distribution of exposure follows a lognormal (0,1) distribution. The two stars indicate the mean value for x ≤ d, which is 0.52d when d = 1 and 0.41d when d = 2.

FIGURE 3. Estimated coefficients of bias for a linear exposure-response association under the pure threshold model, in which there is no random measurement error (σ = 0). Exposures below the threshold limit, d = 1 (triangles) and d = 2 (circles), are set equal to a, as in equation 7. The population distribution of exposure follows a gamma (1,1) distribution. The two stars indicate the mean value for x ≤ d, which is 0.42d when d = 1 and 0.34d when d = 2.

Threshold model with error

Next, we examine the threshold model with error. We examine three examples of recording practices commonly encountered in epidemiologic studies. We first consider the situation in which below-threshold exposures are set equal to zero (a = 0). Next we consider the situation in which below-threshold exposures are set equal to one half of the threshold limit (a = d/2). Lastly, we consider the situation in which below-threshold exposures are set equal to the threshold limit (a = d).

We begin with the situation in which the population distribution of exposure conforms to the lognormal distribution. In the absence of random measurement error, assigning a value of zero to exposure measurements below a threshold limit attenuates exposure-response associations (see above). Figure 4 illustrates the situation in which random measurement error occurs and a value of zero is assigned to exposure measurements below a threshold limit, d = 1 or d = 2. In this situation, the degree of attenuation increases as the standard deviation of the random error, σ, increases; and the degree of attenuation is larger when the threshold limit is equal to two units than when the threshold limit is equal to one unit (figure 4). Notably, one can see that in comparison with the classical model of exposure measurement error (depicted by the solid line), the effect of random measurement error differs in the case where there is a threshold limit (depicted by the dashed lines). At small values of σ, the coefficient of bias is closer to 1 under the classical model of measurement error (in which there is not a threshold limit) than under the threshold model with error (figure 4). In contrast, at large values of σ, the coefficient of bias is closer to 1 under the threshold model with error than under the classical model of measurement error (figure 4). This is because the decline in the magnitude of the coefficient, λ, with increasing values of σ is consistently greater in the case where there is no threshold limit (the classical model of exposure measurement error) than in the nonstandard case where there is a threshold limit.

FIGURE 4. Values of the coefficient of bias, λ, according to the standard deviation of the measurement error term and specified assumptions about the magnitude of the lower threshold limit, d. Exposures below the threshold limit, d = 1 (triangles) and d = 2 (circles), are set equal to zero (see equation 7). The solid line shows values for the case in which there is no lower threshold limit. The population distribution of exposure follows a lognormal (0,1) distribution. Values greater than 1 indicate that the estimated association between the disease and the surrogate variable was greater than the association between the disease and true exposure.

Next we examine the situation in which below-threshold values are assigned a value of one half of the threshold limit. Figure 5 illustrates the coefficient of bias for exposure-response associations in relation to the standard deviation of the random error, σ, and threshold limits, d = 1 and d = 2 (assuming that the population distribution of exposure is lognormal). When the standard deviation of the random error, σ, is close to zero, assigning a value of d/2 to below-threshold measurement results leads to minimal bias in estimates of exposure-disease associations, while at larger values of σ, attenuation is observed (figure 5). However, one can see that in comparison with the classical model of exposure measurement error (depicted by the solid line), there tends to be less bias in estimates of association if there is a threshold limit (and below-threshold measurements are assigned a value equal to d/2) than in the classical model of measurement error; this is particularly true at large values of σ (figure 5).

FIGURE 5. Values of the coefficient of bias, λ, according to the standard deviation of the measurement error term and specified assumptions about the magnitude of the lower threshold limit, d. Exposures below the threshold limit, d = 1 (triangles) and d = 2 (circles), are set equal to d/2 (see equation 7). The solid line shows values for the case in which there is no lower threshold limit. The population distribution of exposure follows a lognormal (0,1) distribution. Values greater than 1 indicate that the estimated association between the disease and the surrogate variable was greater than the association between the disease and true exposure.

FIGURE 6. Values of the coefficient of bias, λ, according to the standard deviation of the measurement error term and specified assumptions about the magnitude of the lower threshold limit, d. Exposures below the threshold limit, d = 1 (triangles) and d = 2 (circles), are set equal to the threshold limit (see equation 7). The solid line shows values for the case in which there is no lower threshold limit. The population distribution of exposure follows a lognormal (0,1) distribution. Values greater than 1 indicate that the estimated association between the disease and the surrogate variable was greater than the association between the disease and true exposure.

We also examined the situation in which the threshold limit, d, is assigned to below-threshold measurements.
When the standard deviation of the random error, σ, is close to zero, assigning a value of d to below-threshold measurement results leads to inflation of estimates of association. When there is greater random measurement error, the coefficient of bias declines in magnitude (figure 6). However, the decline in the magnitude of the attenuation coefficient with increasing values of σ is greater for the classical model of measurement error (depicted by the solid line) than in the case where there is a lower limit for recorded exposures. Again, the degree of attenuation that occurs because of random measurement error is conditional upon whether or not there is a recording threshold limit.

Figures 7–9 examine the situation in which the population distribution of exposure conforms to the gamma distribution, as opposed to the lognormal distribution assumed in figures 4–6. When a value of zero is assigned to exposure measurements below a threshold limit, exposure-response associations are attenuated, and the degree of attenuation increases as the standard deviation of the random error, σ, increases (figure 7). Compared with the situation in which the population distribution conforms to the lognormal distribution, the degree of attenuation is greater when the population distribution conforms to the gamma distribution (figure 7). Assigning a value of one half of the threshold leads to inflation of exposure-response associations for low values of σ and to attenuation at higher values of σ (figure 8). When the threshold limit value, d, is assigned to below-threshold measurements, inflation is observed at low values of σ. However, the decline in the coefficient of bias with increasing values of σ is relatively large; consequently, while estimates of association were inflated at low values of σ, estimates were attenuated at larger values of σ (figure 9).

FIGURE 7. Values of the coefficient of bias, λ, according to the standard deviation of the measurement error term and specified assumptions about the lower threshold limit, d. Exposures below the threshold limit, d = 1 (triangles) and d = 2 (circles), are set equal to zero (see equation 7). The solid line shows values for the case in which there is no lower threshold limit. The population distribution of exposure follows a gamma (1,1) distribution. Values greater than 1 indicate that the estimated association between the disease and the surrogate variable was greater than the association between the disease and true exposure.

FIGURE 8. Values of the coefficient of bias, λ, according to the standard deviation of the measurement error term and specified assumptions about the lower threshold limit, d. Exposures below the threshold limit, d = 1 (triangles) and d = 2 (circles), are set equal to d/2 (see equation 7). The solid line shows values for the case in which there is no lower threshold limit. The population distribution of exposure follows a gamma (1,1) distribution. Values greater than 1 indicate that the estimated association between the disease and the surrogate variable was greater than the association between the disease and true exposure.

FIGURE 9. Values of the coefficient of bias, λ, according to the standard deviation of the measurement error term and specified assumptions about the lower threshold limit, d. Exposures below the threshold limit, d = 1 (triangles) and d = 2 (circles), are set equal to the threshold limit (see equation 7). The solid line shows values for the case in which there is no lower threshold limit. The population distribution of exposure follows a gamma (1,1) distribution. Values greater than 1 indicate that the estimated association between the disease and the surrogate variable was greater than the association between the disease and true exposure.

DISCUSSION

A number of papers in the epidemiologic literature discuss the consequences of nondifferential randomly distributed measurement error (2, 12, 14). These discussions are useful for understanding the potential effects of measurement error on a study's reported results. However, in many settings of concern to epidemiologists, exposure measurement data are recorded with a minimal threshold value. Under these circumstances, measurement error may lead to different patterns of bias in risk estimates than those observed in situations where exposure data are unconstrained. In this paper, we investigated the effect of measurement error when the surrogate exposure was constrained by a lower threshold value. We have shown that the direction and magnitude of bias in estimated associations may vary depending on recording practices, the variance in the surrogate exposure variable due to measurement error, and the population distribution of the exposure.
While the assumptions underlying the simulation analyses in this paper were informed by empirical studies on the health effects of ionizing radiation, we have attempted to examine situations that are comparable to those commonly encountered by epidemiologists. In some of the examples, a large proportion of the study data had exposure values lower than the minimal threshold limit; this was particularly true when we assumed that the population distribution of the true exposure conformed to the gamma distribution. However, in studies of environmental and occupational exposures, highly skewed exposure distributions are commonly encountered; consequently, the conclusions drawn from these examples should be useful for considerations about bias in such settings (7, 15).

Measurement error was assumed to conform to a symmetrical normal distribution. In studies of ionizing radiation in which radiation doses are determined from film badge dosimetry, one source of exposure measurement error is the uncertainty that arises from laboratory processes (including film badge calibration, chemical processing of films, measurements of film optical densities, and comparison of the optical densities of badges to calibration films). In these settings, measurement error due to laboratory uncertainties may be assumed to conform to the normal distribution (16).

We focused on several situations that are illustrative of recording practices for below-threshold exposure measurements: situations in which below-threshold measurements are set equal to either zero, one half of the threshold limit, or the threshold limit. Other recording practices might be considered, such as assigning a value equal to the threshold limit divided by the square root of 2 to below-threshold measurements (17). The choice of an appropriate recording practice is often informed by knowledge or assumptions about the underlying distribution of true exposures (8). However, in regulatory settings, an upper value, such as the threshold limit, is sometimes recorded in order to ensure that exposures have not been underestimated. As figures 2 and 3 show, in the case of the pure threshold model, the closer the assigned value is to the expected value of x in the below-threshold range, the smaller is the degree of bias in the exposure-mortality association.
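The recording practices mentioned above can be compared side by side under the pure threshold model. The following Python sketch is illustrative only and is not taken from the original analysis: it assumes a lognormal (0,1) true exposure, a threshold of d = 1, no random error, and hypothetical function names. It estimates λ = cov(x, z)/var(z) when below-threshold values are replaced by 0, d/2, d/√2, d, or the sample mean of the below-threshold exposures.

```python
import math
import random


def sample_cov(u, v):
    """Sample covariance of two equal-length sequences."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (n - 1)


def pure_threshold_lambda(x, d, a):
    """lambda = cov(x, z) / var(z) with z = x if x > d, else a (no random error)."""
    z = [xi if xi > d else a for xi in x]
    return sample_cov(x, z) / sample_cov(z, z)


def compare_recording_practices(d=1.0, n=100_000, seed=0):
    """Return lambda for several substitution values a (lognormal (0,1) exposure assumed)."""
    rng = random.Random(seed)
    x = [rng.lognormvariate(0.0, 1.0) for _ in range(n)]
    below = [xi for xi in x if xi <= d]
    conditional_mean = sum(below) / len(below)  # sample analogue of E[x | x <= d]
    practices = {
        "zero": 0.0,
        "d/2": d / 2.0,
        "d/sqrt(2)": d / math.sqrt(2.0),
        "d": d,
        "E[x | x <= d]": conditional_mean,
    }
    return {name: pure_threshold_lambda(x, d, a) for name, a in practices.items()}


if __name__ == "__main__":
    for name, lam in compare_recording_practices().items():
        print(name, round(lam, 3))
```

In this sketch, substituting the below-threshold sample mean gives λ = 1 up to rounding, because the replacement leaves both the sample mean and the cross-product with x unchanged; the other substitutions move λ below or above 1 according to whether a falls below or above that mean.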
The special situation in which the assigned value, a, equals the expected value of x conditional on x being below the threshold limit, is an instance of Berkson error; the true exposure is distributed around the surrogate exposure with an average error equal to zero, producing no bias in estimates of exposure-response associations (14). Our primary interest in this paper, however, was in the more general case of the threshold model with error. We developed a formula for the coefficient of bias due to exposure measurement error for the case of the threshold model with error. When compared with the classical model of measurement error, the slope of the decline in the magnitude of the coefficient of bias with increasing measurement error is lower in the case where there is a threshold than in the classical case where it is assumed that there is no recording threshold. We have suggested that the threshold model with error provides a better description of the exposure measurement error encountered in many occupational and environmental studies than does the classical model of exposure measurement error. However, the threshold model with error is still a relatively simple model, and the results presented in this paper are best viewed as examples for understanding the potential effects of measurement error under simplified conditions. We emphasize that this paper explored simple patterns of measurement error, not the effects of the complex patterns of exposure measurement error that often appear in research settings. We focused on linear estimates of exposure-response associations (which are often examined in environmental and occupational epidemiology); however, patterns of measurement error that arise when there is a recording threshold may in some cases lead to departures from linearity in estimated exposure-response associations despite the presence of a true linear association. 
In addition, these analyses focused on the situation where there is a single exposure measurement. In settings of chronic exposure, in which a cumulative measure of exposure is derived from a series of measurements made on each individual, the problems of measurement error may be more complex. Furthermore, the assumption of nondifferential random variation in exposure estimates is often an inadequate description of the true extent of problems that measurement error entails. In epidemiologic studies, researchers may encounter complex patterns of measurement error, biased estimates of exposure, varying distributions of measurement error at different values of the true exposure, and even patterns of measurement error that are differential with regard to disease status. Patterns of measurement error that are differential with respect to disease status may occur in occupational settings, for example, if there is health-related selection of workers into jobs or areas where the exposure conditions lead to greater problems of measurement error. In addition, exposure information available for epidemiologic studies is often incomplete and may not reflect all relevant sources of exposure or periods of exposure. These are equally important considerations as sources of bias in estimates of exposure-disease associations. Investigative analyses, in which subcohorts are examined or assumptions about etiologically relevant exposures are varied, can often improve our understanding of the problems involved in measurement error. Despite these limitations, the observations in this paper contribute to a growing body of epidemiologic literature on the effects of exposure misclassification. Bias resulting from the inaccurate assignment of study subjects to exposure groups has been the subject of a great deal of discussion. Much of the early literature on this topic focused on studies in which exposure variables were categorized (18, 19). 
More recently, discussions have expanded to cover the effects of measurement error in continuous exposure variables. As this paper illustrates, conclusions about the effects of measurement error should be sensitive to data collection and recording processes that influence the distribution of recorded exposures. In epidemiologic studies, it is common to have exposure data that are constrained by a lower threshold limit. The direction and magnitude of bias resulting from measurement error will depend on the population distribution of the exposure, the variance due to measurement error, and the recording practices used for below-threshold measurements. REFERENCES 1. Armstrong BG. Effect of measurement error on epidemiological studies of environmental and occupational exposures. Occup Environ Med 1998;55: Thomas D, Stram D, Dwyer J. Exposure measurement error: influence on exposure-disease. Relationships and methods of correction. Annu Rev Public Health 1993;14: Am J Epidemiol 003;157: on 6 December 017

3. Carroll RJ, Ruppert D, Stefanski LA. Measurement error in nonlinear models. London, United Kingdom: Chapman and Hall Ltd,
4. Willett W. An overview of issues related to the correction of non-differential exposure measurement error in epidemiologic studies. Stat Med 1989;8:
5. Armstrong BG. The effects of measurement errors on relative risk regressions. Am J Epidemiol 1990;13:
6. Zeger SL, Thomas D, Dominici F, et al. Exposure measurement error in time-series studies of air pollution: concepts and consequences. Environ Health Perspect 2000;108:
7. Strom DJ. Estimating individual and collective doses to groups with less than detectable doses: a method for use in epidemiologic studies. Health Phys 1986;51:
8. Helsel DR. Less than obvious: statistical treatment of data below the detection limit. Environ Sci Technol 1990;4:
9. Nielson KK, Rogers VC. Statistical estimation of analytical data distributions and censored measurements. Anal Chem 1989;61:
10. Watkins J, Cragle D, Frome E, et al. Collection, validation, and treatment of data for a mortality study of nuclear industry workers. Appl Occup Environ Hyg 1997;1:
11. Rosner B, Willett W, Spiegelman D. Correction of logistic regression relative risk estimates and confidence intervals for systematic within-person measurement error. Stat Med 1989;8:
12. Armstrong BK, White E, Saracci R. Principles of exposure measurement in epidemiology. Oxford, United Kingdom: Oxford University Press,
13. SAS Institute, Inc. SAS, version Cary, NC: SAS Institute, Inc,
14. Cox DR, Darby SC, Reeves GK, et al. The effects of measurement errors with particular reference to a study of exposure to residential radon. In: Ron E, Hoffman FO, eds. Uncertainties in radiation dosimetry and their impact on dose-response analyses. Proceedings of a workshop at the National Cancer Institute, September 3-5, Bethesda, MD: National Cancer Institute, 1999: (NIH publication no ).
15. Waters MA, Selvin S, Rappaport SM. A measure of goodness-of-fit for the lognormal model applied to occupational exposures. Am Ind Hyg Assoc J 1991;5:
16. National Research Council, Committee on Film Badge Dosimetry in Atmospheric Nuclear Tests. Film badge dosimetry in atmospheric nuclear tests. Washington, DC: National Academy Press,
17. Hornung RW, Reed LD. Estimation of average concentration in the presence of nondetectable values. Appl Occup Environ Hyg 1990;5:
18. Dosemeci M, Wacholder S, Lubin JH. Does nondifferential misclassification of exposure always bias a true effect toward the null value? Am J Epidemiol 1990;13:
19. Weinberg CR, Umbach DM, Greenland S. When will nondifferential misclassification of an exposure preserve the direction of a trend? Am J Epidemiol 1994;140:

APPENDIX

The effect of random measurement error in the presence of a recording threshold can be examined directly by calculating values for the coefficient, λ, under specified model conditions. Equation 7 can be written

$$z = (x + \eta)\,I[x + \eta \ge d] + a\,I[x + \eta < d],$$

where I[logical expression] equals 1 if logical expression is true and 0 if it is false. In what follows, f(x) (F(x)) and g(η) (G(η)) shall denote the densities (cumulative distribution functions) of x and η, respectively.

Let us first consider the pure threshold model, var(η) = 0. Define, for k = 0, 1, 2,

$$M_{f,k}(d) = \int_0^d x^k f(x)\,dx, \qquad \Delta_k(a,d) = a\,M_{f,k-1}(d) - M_{f,k}(d).$$

Notice in particular that $M_{f,0}(d) = F(d)$. Then a direct calculation yields

$$E(z) = \mu_x + \Delta_1,$$
$$E(z^2) = \mu_x^2 + \sigma_x^2 + a\Delta_1 + \Delta_2,$$
$$E(xz) = \mu_x^2 + \sigma_x^2 + \Delta_2,$$
$$\mathrm{cov}(x,z) = \sigma_x^2 + \Delta_2 - \mu_x\Delta_1,$$
$$\mathrm{var}(z) = \sigma_x^2 + \Delta_2 + (a - 2\mu_x - \Delta_1)\Delta_1.$$

Therefore,

$$\lambda = \frac{\sigma_x^2 + \Delta_2(a,d) - \mu_x\Delta_1(a,d)}{\sigma_x^2 + \Delta_2(a,d) + \bigl(a - 2\mu_x - \Delta_1(a,d)\bigr)\Delta_1(a,d)}, \qquad \text{(A1)}$$

from which it follows that λ ≥ 1 if and only if

$$\Delta_1(a,d)\bigl(\mu_x + \Delta_1(a,d) - a\bigr) \ge 0. \qquad \text{(A2)}$$

In addition, the following proposition holds.

Proposition. For a = 0, λ < 1 and, for a = d, λ > 1. Furthermore, λ = 1 for a = E(x | x ≤ d).

We now give the formulas for the case var(η) ≠ 0. Define, for k = 0, 1, 2,

$$H_k(d) = E\bigl(x^k\,G(d - x)\bigr),$$
$$L_k(d) = E\bigl(x^{2-k}\,M_{g,k}(d - x)\bigr),$$
$$\Gamma_k(a,d) = a\,H_{k-1}(d) - H_k(d).$$

Then, proceeding as above, one obtains

$$E(z) = \mu_x + \Gamma_1,$$
$$E(z^2) = \mu_x^2 + \sigma_x^2 + \sigma_\eta^2 + a\Gamma_1 + \Gamma_2 - L_2(d) - 2L_1(d),$$
$$E(xz) = \mu_x^2 + \sigma_x^2 - L_1(d) + a\Gamma_1 + \Gamma_2,$$
$$\mathrm{cov}(x,z) = \sigma_x^2 - L_1(d) + \Gamma_2 + (a - \mu_x)\Gamma_1,$$
$$\mathrm{var}(z) = \sigma_x^2 + \sigma_\eta^2 - L_2(d) - 2L_1(d) + \Gamma_2 + (a - 2\mu_x - \Gamma_1)\Gamma_1.$$

We finally obtain the expression for λ,

$$\lambda = \frac{\sigma_x^2 - L_1(d) + \Gamma_2(a,d) + (a - \mu_x)\Gamma_1(a,d)}{\sigma_x^2 + \sigma_\eta^2 - L_2(d) - 2L_1(d) + \Gamma_2(a,d) + \bigl(a - 2\mu_x - \Gamma_1(a,d)\bigr)\Gamma_1(a,d)}, \qquad \text{(A3)}$$

which can be seen to be a generalization of equation A1, since, for η = 0, the L's become zero and the Γ's reduce to the Δ's. Notice also that the threshold model with error (equation A3) reduces to the formula for the familiar case of the classical model of measurement error when d = 0, since the Γ's and L's become zero.
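The pure threshold case of the appendix (var(η) = 0) can be checked numerically. The following Python sketch is illustrative only and is not taken from the paper: it assumes that the true exposure is exponential with mean 1 (a gamma distribution with shape 1), for which the partial moments M_{f,k}(d) have simple closed forms, and it compares the resulting λ with a Monte Carlo estimate of cov(x, z)/var(z). The function names, the threshold d = 0.5, the normal error distribution, and the sample size are all assumptions chosen for illustration.

```python
import math
import numpy as np


def partial_moments_exponential(d, mean=1.0):
    """Return M_{f,k}(d) = integral_0^d x^k f(x) dx for k = 0, 1, 2,
    where f is an exponential density (an assumption of this sketch)."""
    e = math.exp(-d / mean)
    m0 = 1.0 - e
    m1 = mean - (d + mean) * e
    m2 = 2.0 * mean**2 - (d**2 + 2.0 * d * mean + 2.0 * mean**2) * e
    return m0, m1, m2


def lambda_pure_threshold(a, d, mean=1.0):
    """Closed-form lambda = cov(x, z) / var(z) for the pure threshold model,
    where z = x if x >= d and z = a otherwise."""
    m0, m1, m2 = partial_moments_exponential(d, mean)
    mu, var_x = mean, mean**2          # exponential: variance = mean^2
    delta1 = a * m0 - m1
    delta2 = a * m1 - m2
    num = var_x + delta2 - mu * delta1
    den = var_x + delta2 + (a - 2.0 * mu - delta1) * delta1
    return num / den


def lambda_simulated(a, d, mean=1.0, sd_eta=0.0, n=200_000, seed=0):
    """Monte Carlo estimate of cov(x, z) / var(z); sd_eta > 0 adds
    independent normal measurement error before thresholding."""
    rng = np.random.default_rng(seed)
    x = rng.exponential(mean, size=n)
    eta = rng.normal(0.0, sd_eta, size=n) if sd_eta > 0 else 0.0
    w = x + eta
    z = np.where(w >= d, w, a)         # below-threshold results recorded as a
    c = np.cov(x, z)
    return c[0, 1] / c[1, 1]


if __name__ == "__main__":
    d = 0.5
    m0, m1, _ = partial_moments_exponential(d)
    for label, a in [("a = 0", 0.0), ("a = d", d), ("a = E(x | x < d)", m1 / m0)]:
        print(label, lambda_pure_threshold(a, d), lambda_simulated(a, d))
```

Under these assumptions, the closed-form values fall below 1 for a = 0, above 1 for a = d, and equal 1 when a is the conditional mean of the below-threshold exposures, in line with the proposition. Setting sd_eta above zero gives a simulation-only look at the threshold model with error, and a very low threshold should recover the classical attenuation factor σ_x²/(σ_x² + σ_η²).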


More information
