Measurement error modeling. Department of Statistical Sciences Università degli Studi Padova
1 Measurement error modeling
Statistisches Beratungslabor, Institut für Statistik, Ludwig-Maximilians-Universität München
Department of Statistical Sciences, Università degli Studi Padova
2 Overview
1 Measurement error and misclassification
3 Measurement error and misclassification
4 References
Carroll, R. J., Ruppert, D., Stefanski, L. and Crainiceanu, C. (2006): Measurement Error in Nonlinear Models. A Modern Perspective. Chapman & Hall, London.
Buonaccorsi, J. P.: Measurement Error. Models, Methods and Applications. Chapman & Hall, Boca Raton.
Kuha, J., Skinner, C. and Palmgren, J.: Misclassification error. In: Encyclopedia of Biostatistics, ed. by Armitage, P. and Colton, T., Wiley, Chichester.
5 Statistical Modeling
Applied statistics: learning from data by sophisticated models.
Several kinds of complexity:
* Complex relationships between variables.
* Often, the relationship between variables and data is complex, too:
* Variables of interest cannot be observed directly.
* Only proxy variables (surrogates) are available instead.
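6 General setting
[Diagram: the main model (association, effects) links the predictor $X_i$ to the outcome variable $Y_i$. A measurement model links $X_i$ to its measurement $X_i^*$ and another links $Y_i$ to its measurement $Y_i^*$. The measurements are the data on which inference is based.]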
7 Examples
* Framingham heart study, Munich blood pressure study: long-term average blood pressure vs. a single measurement
* Munich bronchitis study: average occupational dust exposure vs. single measurements and expert ratings
* Studies on effects of air pollution
* Munich radon study
* Study on caries (misclassification)
8 Consequences of measurement error and misclassification
Potential consequence of neglecting the difference between latent variables and their surrogates: severely biased estimation of all parameters.
Note: Measurement error and misclassification are not just a matter of sloppiness. Latent variables are, by their very nature, not exactly measurable.
9 Notation
We have to distinguish between the true (correctly measured) variable and its (possibly incorrect) measurement, i.e. between the latent variable and the corresponding surrogate.
* Star notation (used here): $X, Z$ are the correctly measured (unobservable) variables; $X^*, Z^*$ are the corresponding possibly incorrect measurements; analogously $Y, Y^*$ and $T, T^*$.
* $X, W, Z$ notation (Carroll et al., 2006).
* $\xi, X$ notation (Schneeweiß, Fuller).
10 Models for measurement error
* Systematic vs. random
* Classical vs. Berkson
* Additive vs. multiplicative
* Homoscedastic vs. heteroscedastic
* Differential vs. non-differential
11 Classical additive random measurement error
$X_i$: true value; $X_i^*$: measurement of $X_i$
$X_i^* = X_i + U_i$, $(U_i, X_i)$ independent
$E(U_i) = 0$, $V(U_i) = \sigma_u^2$, $U_i \sim N(0, \sigma_u^2)$
This model is suitable for:
* Instrument measurement error
* One measurement is used in place of a mean
12 Additive Berkson error
$X_i = X_i^* + U_i$, $(U_i, X_i^*)$ independent
$E(U_i) = 0$, $V(U_i) = \sigma_u^2$, $U_i \sim N(0, \sigma_u^2)$
The model is suitable for:
* Mean exposure of a region, $X^*$, used instead of individual exposure $X$
* Workplace measurement
* Dose in a controlled experiment
13 Classical and Berkson
Note that in the Berkson case
$E(X \mid X^*) = X^*$, $Var(X) = Var(X^*) + Var(U)$, so $Var(X) > Var(X^*)$.
Note that in the classical additive case
$E(X^* \mid X) = X$, $Var(X^*) = Var(X) + Var(U)$, so $Var(X^*) > Var(X)$.
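The two variance relations can be checked by simulation. The following is a minimal sketch and not part of the original slides: the seed, the sample size and the values of `sigma_x` and `sigma_u` are arbitrary assumptions, and `x_star` stands for the surrogate.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
sigma_x, sigma_u = 1.0, 0.5  # assumed values, for illustration only

# Classical error: the surrogate scatters around the true value.
x = rng.normal(0.0, sigma_x, n)
x_star_classical = x + rng.normal(0.0, sigma_u, n)

# Berkson error: the true value scatters around the surrogate.
x_star_berkson = rng.normal(0.0, sigma_x, n)
x_berkson = x_star_berkson + rng.normal(0.0, sigma_u, n)

# (Var of true value, Var of surrogate) in each setting
var_classical = (x.var(), x_star_classical.var())
var_berkson = (x_berkson.var(), x_star_berkson.var())
print("classical:", var_classical)
print("Berkson:  ", var_berkson)
```

In the classical case the surrogate is the more variable quantity; in the Berkson case it is the true value.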
14 Multiplicative measurement error
Classical: $X_i^* = X_i \cdot U_i$, $(U_i, X_i)$ independent
Berkson: $X_i = X_i^* \cdot U_i$, $(U_i, X_i^*)$ independent
$E(U_i) = 1$, $U_i$ lognormal
* Additive on the logarithmic scale
* Used for exposure to chemicals or radiation
15 Model for misclassification (MC)
Binary case: sensitivity and specificity. $Y^*$: observed, $Y$: true.
$P(Y^* = 1 \mid Y = 1) = \pi_{11}$ (sensitivity)
$P(Y^* = 0 \mid Y = 0) = \pi_{00}$ (specificity)
$P(Y^* = 0 \mid Y = 1) = 1 - \pi_{11} = \pi_{01}$
$P(Y^* = 1 \mid Y = 0) = 1 - \pi_{00} = \pi_{10}$
Classification matrix: $\Pi = \begin{pmatrix} \pi_{00} & \pi_{01} \\ \pi_{10} & \pi_{11} \end{pmatrix}$
16 Complex measurement error
The relationship between measurements and true value can be modeled, e.g.:
* Scores
* Regression (estimation by validation data)
* Mixture of Berkson and classical error
* Dependence of the measurement error on further covariates
17 Additive measurement error in the response
Simple linear regression: $Y = \beta_0 + \beta_1 X + \epsilon$
Additive measurement error: $Y^* = Y + U$
$Y^* = \beta_0 + \beta_1 X + \epsilon + U$
New equation error: $\epsilon + U$
Assumption: $U$ and $X$ independent, $U$ and $\epsilon$ independent
* Higher error variance
* Inference still correct
* Error in the equation and measurement error are not distinguishable.
18 Measurement error in covariates
We focus on covariate measurement error in regression.
Main model: $E(Y) = f(\beta, X, Z)$
We are interested in inference on $\beta_1$. $Z$ is a further covariate measured without error.
Error model: relates $X^*$ to $X$
Observed model: $E(Y) = f^*(X^*, Z, \beta^*)$
Naive estimation: treat the observed model as the main model. But in most cases $f^* \neq f$ and $\beta^* \neq \beta$.
19 Differential and non-differential measurement error
The assumption of non-differential measurement error relates to the response:
$[Y \mid X, X^*] = [Y \mid X]$
For $Y$ there is no further information in $U$ or $X^*$ when $X$ is known. Then the error model and the main model can be split:
$[Y, X^*, X] = [Y \mid X][X^* \mid X][X]$
From the substantive point of view: the measurement process and $Y$ are independent.
* Blood pressure on a particular day is irrelevant for CHD if the long-term average is known.
* Mean exposure is irrelevant if individual exposure is known.
* But people with CHD can have a different view of their nutrition behavior, which makes the error differential.
20 Simple linear regression
We assume a classical, non-differential, additive, normal measurement error:
$Y = \beta_0 + \beta_1 X + \epsilon$
$X^* = X + U$, $(U, X, \epsilon)$ independent
$U \sim N(0, \sigma_u^2)$, $\epsilon \sim N(0, \sigma_\epsilon^2)$
21 Classical measurement error in linear regression
[Figure not reproduced in this transcription.]
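22 The observed model in linear regression
$E(Y \mid X^*) = \beta_0 + \beta_1 E(X \mid X^*)$
Assuming $X \sim N(\mu_x, \sigma_x^2)$, the observed model is:
$E(Y \mid X^*) = \beta_0^* + \beta_1^* X^*$
$\beta_1^* = \beta_1 \frac{\sigma_x^2}{\sigma_x^2 + \sigma_u^2}$
$\beta_0^* = \beta_0 + \beta_1 \left(1 - \frac{\sigma_x^2}{\sigma_x^2 + \sigma_u^2}\right) \mu_x$
$Y - \beta_0^* - \beta_1^* X^* \sim N\left(0, \sigma_\epsilon^2 + \beta_1^2 \frac{\sigma_u^2 \sigma_x^2}{\sigma_x^2 + \sigma_u^2}\right)$
* The observed model is still a linear regression!
* Attenuation of $\beta_1$ by the factor $\frac{\sigma_x^2}{\sigma_x^2 + \sigma_u^2}$ ("reliability ratio")
* Loss of precision (larger error term)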
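23 Identification
$(\beta_0, \beta_1, \mu_x, \sigma_x^2, \sigma_u^2, \sigma_\epsilon^2) \longrightarrow [Y, X^*] \longrightarrow (\mu_y, \mu_{x^*}, \sigma_y^2, \sigma_{x^*}^2, \sigma_{x^* y})$
Six parameters are mapped to the five moments that determine the bivariate normal distribution of $(Y, X^*)$.
$(\beta_0, \beta_1, \mu_x, \sigma_x^2, \sigma_u^2, \sigma_\epsilon^2)$ and $(\beta_0^*, \beta_1^*, \mu_x, \sigma_x^2 + \sigma_u^2, 0, \sigma_\epsilon^{*2})$ yield identical distributions of $(Y, X^*)$, where the starred quantities are the parameters of the observed model.
$\Longrightarrow$ The model parameters are not identifiable.
We need extra information, e.g.:
* $\sigma_u$ is known or can be estimated
* $\sigma_u / \sigma_\epsilon$ is known (orthogonal regression)
The model with another (non-normal) distribution for $X$ is identifiable by higher moments.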
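24 Naive LS estimation
For the slope:
$\hat\beta_{1n} = \frac{S_{yx^*}}{S_{x^*}^2}$
$\mathrm{plim}(\hat\beta_{1n}) = \frac{\sigma_{yx^*}}{\sigma_{x^*}^2} = \frac{\sigma_{yx}}{\sigma_x^2 + \sigma_u^2} = \beta_1 \frac{\sigma_x^2}{\sigma_x^2 + \sigma_u^2}$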
25 Correction for attenuation
We have a first method: solve the bias equation.
$\hat\beta_1 = \hat\beta_{1n} \frac{\sigma_x^2 + \sigma_u^2}{\sigma_x^2}$
$\hat\beta_1 = \hat\beta_{1n} \frac{S_{x^*}^2}{S_{x^*}^2 - \sigma_u^2}$
* Correction by the reliability ratio
* $V(\hat\beta_1) > V(\hat\beta_{1n})$
* Bias-variance trade-off
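Attenuation and its correction can be illustrated with simulated data. This is a minimal sketch under the classical additive normal error model with known error variance. All numerical values, the seed and the helper name `ls_slope` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
beta0, beta1 = 1.0, 2.0                      # assumed true coefficients
sigma_x, sigma_u, sigma_eps = 1.0, 0.8, 0.5  # assumed standard deviations

x = rng.normal(0.0, sigma_x, n)                # latent predictor
y = beta0 + beta1 * x + rng.normal(0.0, sigma_eps, n)
x_star = x + rng.normal(0.0, sigma_u, n)       # classical surrogate

def ls_slope(pred, resp):
    """Least-squares slope S_yx / S_x^2 (hypothetical helper)."""
    return np.cov(pred, resp)[0, 1] / np.var(pred, ddof=1)

# Naive slope: regress Y on the surrogate and ignore the error.
slope_naive = ls_slope(x_star, y)

# Corrected slope: divide by the estimated reliability ratio,
# using the known error variance sigma_u^2.
s2_xstar = np.var(x_star, ddof=1)
slope_corrected = slope_naive * s2_xstar / (s2_xstar - sigma_u**2)

reliability = sigma_x**2 / (sigma_x**2 + sigma_u**2)
print(slope_naive, beta1 * reliability, slope_corrected)
```

The naive slope settles near `beta1 * reliability`, and the corrected slope near `beta1`, at the cost of a larger sampling variance.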
26 Multiple regression
Formulae can be extended to the multiple case.
Note: parameter estimates of error-free variables may also be affected.
27 Berkson error in simple linear regression
$Y = \beta_0 + \beta_1 X + \epsilon$
$X = X^* + U$, $U$ independent of $(X^*, \epsilon)$, $E(U) = 0$
28 Observed model
$E(Y \mid X^*) = \beta_0 + \beta_1 X^*$
$V(Y \mid X^*) = \sigma_\epsilon^2 + \beta_1^2 \sigma_u^2$
* Regression model with identical $\beta$
* Measurement error ignorable
* Loss of precision
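A short simulation makes the contrast with the classical case concrete. This is a sketch under assumed parameter values, not taken from the slides: with Berkson error, the regression of Y on the surrogate recovers the slope, while the residual variance grows to roughly the error variance of the main model plus the squared slope times the Berkson error variance.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
beta0, beta1 = 1.0, 2.0        # assumed true coefficients
sigma_u, sigma_eps = 0.8, 0.5  # assumed standard deviations

x_star = rng.normal(0.0, 1.0, n)             # assigned or observed value
x = x_star + rng.normal(0.0, sigma_u, n)     # true value (Berkson error)
y = beta0 + beta1 * x + rng.normal(0.0, sigma_eps, n)

# np.polyfit returns coefficients from highest degree down: [slope, intercept]
slope, intercept = np.polyfit(x_star, y, 1)
resid_var = np.var(y - intercept - slope * x_star, ddof=2)
expected_resid_var = sigma_eps**2 + beta1**2 * sigma_u**2

print(slope, resid_var, expected_resid_var)
```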
29 Misclassification
Model for misclassification
$X$: true binary variable, gold standard
$X^*$: observed value of $X$, surrogate
$P(X^* = 1 \mid X = 1) = \pi_{11}$ (sensitivity)
$P(X^* = 0 \mid X = 0) = \pi_{00}$ (specificity)
$P(X^* = 0 \mid X = 1) = 1 - \pi_{11} = \pi_{01}$
$P(X^* = 1 \mid X = 0) = 1 - \pi_{00} = \pi_{10}$
Classification matrix: $\Pi = \begin{pmatrix} \pi_{00} & \pi_{01} \\ \pi_{10} & \pi_{11} \end{pmatrix}$
30 Misclassification
Naive analysis: simply ignore misclassification.
We want to estimate $P(X = 1)$. We use $\frac{1}{n} \sum_{i=1}^n X_i^*$.
$P(X^* = 1) = \pi_{11} P(X = 1) + \pi_{10} P(X = 0)$
$P(X^* = 1) - P(X = 1) = \pi_{10} P(X = 0) - \pi_{01} P(X = 1)$
31 Correction
Idea: solve the bias equation.
Note that $X^*$ is still binomial and $P(X^* = 1)$ can be consistently estimated from the observed data.
$P(X^* = 1) = \pi_{11} P(X = 1) + \pi_{10} (1 - P(X = 1))$
$P(X = 1) = \frac{P(X^* = 1) - \pi_{10}}{\pi_{11} + \pi_{00} - 1}$
Assumptions:
* $\pi_{11}$ and $\pi_{00}$ known
* $\pi_{11} + \pi_{00} > 1$
Variance factor: $(\pi_{11} + \pi_{00} - 1)^{-2}$
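The bias equation and its inversion amount to a few lines of arithmetic. The sketch below uses made-up values for prevalence, sensitivity and specificity. The function name `corrected_prevalence` is hypothetical.

```python
def corrected_prevalence(p_star, sensitivity, specificity):
    """Invert P(X*=1) = pi11 * p + pi10 * (1 - p) for p = P(X=1).

    pi11 is the sensitivity and pi10 = 1 - specificity.
    """
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("requires sensitivity + specificity > 1")
    return (p_star - (1.0 - specificity)) / denom

# Assumed illustration values (not from the slides)
p_true = 0.30
sens, spec = 0.90, 0.95

# Probability that the surrogate equals 1 (what the naive estimator targets)
p_star = sens * p_true + (1.0 - spec) * (1.0 - p_true)

# Solving the bias equation recovers the true prevalence
p_back = corrected_prevalence(p_star, sens, spec)
print(p_star, p_back)
```

In practice `p_star` is replaced by the observed proportion of surrogate values equal to 1, and the variance of the corrected estimate is inflated by the factor given on the slide.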
32 Multinomial case
$X$ is multinomial with categories $1, \ldots, k$. $X^*$ is observed.
The error model is given by the misclassification matrix $\Pi = \{\pi_{ij}\}$ with $\pi_{ij} = P(X^* = i \mid X = j)$.
Then we get for the probability vectors $P_X = (p_{x1}, \ldots, p_{xk})$ and $P_{X^*} = (p^*_{x1}, \ldots, p^*_{xk})$:
$P_{X^*} = \Pi P_X$
33 The matrix method
The correction method is given by
$\hat{P}_X := \Pi^{-1} \hat{P}_{X^*}$   (1)
Properties:
* The misclassification matrix has to be known or estimated
* Sometimes gives probabilities > 1 or < 0
* Variance calculation is straightforward
* Use the delta method in the case of estimated $\Pi$, Greenland (1988)
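A small numerical sketch of the matrix method follows. The 3x3 misclassification matrix and the probability vectors are invented for illustration. Columns index the true category and rows the observed one, so each column sums to 1.

```python
import numpy as np

# Assumed misclassification matrix, entry [i, j] = P(X* = i | X = j)
Pi = np.array([[0.90, 0.05, 0.00],
               [0.08, 0.90, 0.10],
               [0.02, 0.05, 0.90]])

p_true = np.array([0.5, 0.3, 0.2])

# Distribution of the surrogate implied by the error model
p_obs = Pi @ p_true

# Matrix method: solve Pi p = p_obs instead of forming the inverse explicitly
p_corrected = np.linalg.solve(Pi, p_obs)

# With sampling noise the observed vector need not lie in the range of Pi;
# this assumed vector leads to a negative "probability" for category 1.
p_bad = np.linalg.solve(Pi, np.array([0.01, 0.59, 0.40]))

print(p_corrected)
print(p_bad)
```

The second call shows the property noted on the slide: the corrected vector still sums to 1, but individual entries can leave the interval [0, 1].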
34 Example: Apolipoprotein E and Alzheimer's disease
Case-control setting. Established relationship between Apolipoprotein E ε4 and Alzheimer's disease. Data from a meta-analysis.
[Table, counts not preserved in this transcription. Columns: ApoE genotype ε2ε2, ε2ε3, ε2ε4, ε3ε3, ε3ε4, ε4ε4. Rows: clinical controls, clinical Alzheimer, PM controls, PM Alzheimer.]
PM: post mortem
35 MC correction
Assumption: $P(Y = 0 \mid Y^* = 1) = 10\%$ (literature), $P(Y = 1 \mid Y^* = 0) = 1\%$ (literature)
In our sample: sensitivity $= P(Y^* = 1 \mid Y = 1) = 0.97$, specificity $= P(Y^* = 0 \mid Y = 0) = 0.96$
Correction (matrix method): odds ratios with ε3ε3 as reference, naive and corrected.
[Table values not preserved in this transcription.]
36 Overview
* Correction and method of moments
* Orthogonal regression
* Regression calibration
* Simulation and extrapolation (SIMEX)
* Likelihood
* Quasi-likelihood
* Corrected score
* Bayes
37 Direct correction
Find $\mathrm{plim}\, \hat\beta_n = \beta^* = f(\beta, \gamma)$
Then $\hat\beta := \hat{f}^{-1}(\hat\beta_n)$
* Correction for attenuation in the linear model
* General case: Kü (1997)
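38 Method of moments
Moments of the observed data can be estimated. Solve the moment equations.
Simple linear regression:
$\mu_{X^*} = \mu_X$
$\mu_Y = \beta_0 + \beta_1 \mu_X$
$\sigma_{X^*}^2 = \sigma_X^2 + \sigma_u^2$
$\sigma_Y^2 = \beta_1^2 \sigma_X^2 + \sigma_\epsilon^2$
$\sigma_{YX^*} = \beta_1 \sigma_X^2$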
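As a worked illustration, suppose the error variance $\sigma_u^2$ is known. That is one possible choice of extra information, assumed here and not stated on this slide. Replacing the population moments by their sample counterparts ($\bar{X}^*$, $\bar{Y}$, $S_{X^*}^2$, $S_Y^2$, $S_{YX^*}$) and solving the five moment equations gives:

```latex
\begin{aligned}
\hat\mu_X &= \bar{X}^*, \\
\hat\sigma_X^2 &= S_{X^*}^2 - \sigma_u^2, \\
\hat\beta_1 &= \frac{S_{YX^*}}{S_{X^*}^2 - \sigma_u^2}, \\
\hat\beta_0 &= \bar{Y} - \hat\beta_1 \bar{X}^*, \\
\hat\sigma_\epsilon^2 &= S_Y^2 - \hat\beta_1^2 \, \hat\sigma_X^2 .
\end{aligned}
```

The slope estimate is the naive least-squares slope $S_{YX^*} / S_{X^*}^2$ multiplied by $S_{X^*}^2 / (S_{X^*}^2 - \sigma_u^2)$, so in this case the method of moments coincides with the correction for attenuation.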
39 Orthogonal regression, total least squares
In linear regression with classical additive measurement error:
Assume $\sigma_\epsilon^2 / \sigma_u^2 = \eta$ is known, e.g. there is no equation error and $\sigma_\epsilon$ is measurement error in $Y$.
Minimize
$\sum_{i=1}^n \left\{ (Y_i - \beta_0 - \beta_1 X_i)^2 + \eta (X_i^* - X_i)^2 \right\}$
in $(\beta_0, \beta_1, X_1, X_2, \ldots, X_n)$.
* Total least squares, Van Huffel (1997)
* Technical applications with symmetric roles of the variables
* A problem in epidemiology: the assumption of no equation error is not realistic
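For a single predictor, the minimisation has a closed-form solution, often called Deming regression. Profiling out the unknown true values leaves a quadratic equation in the slope. The sketch below implements that closed form on simulated data. The function name `orthogonal_regression`, the seed and all numerical values are assumptions, and the simulation has no equation error, as the method requires.

```python
import numpy as np

def orthogonal_regression(x_star, y, eta):
    """Closed-form fit for known eta = sigma_eps^2 / sigma_u^2.

    Solves s_xy * b1^2 - (s_yy - eta * s_xx) * b1 - eta * s_xy = 0,
    taking the root whose sign matches s_xy (assumes s_xy != 0).
    """
    s_xx = np.var(x_star, ddof=1)
    s_yy = np.var(y, ddof=1)
    s_xy = np.cov(x_star, y)[0, 1]
    d = s_yy - eta * s_xx
    b1 = (d + np.sqrt(d**2 + 4.0 * eta * s_xy**2)) / (2.0 * s_xy)
    b0 = np.mean(y) - b1 * np.mean(x_star)
    return b0, b1

rng = np.random.default_rng(3)
n = 50_000
beta0, beta1 = 1.0, 2.0        # assumed true coefficients
sigma_u, sigma_eps = 0.8, 0.5  # assumed error standard deviations

x = rng.normal(0.0, 1.0, n)
y = beta0 + beta1 * x + rng.normal(0.0, sigma_eps, n)  # error in Y only
x_star = x + rng.normal(0.0, sigma_u, n)               # error in X

b0_hat, b1_hat = orthogonal_regression(x_star, y, sigma_eps**2 / sigma_u**2)
print(b0_hat, b1_hat)
```

Setting `eta = 1` gives orthogonal regression in the narrow sense, where perpendicular distances to the line are minimised.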
40 Regression calibration
This simple method has been widely applied. It was suggested by different authors: Rosner et al. (1989), Carroll and Stefanski (1990).
1 Find a model for $E(X \mid X^*, Z)$ using validation data or replication
2 Replace the unobserved $X$ by the estimate of $E(X \mid X^*, Z)$ in the main model
3 Adjust variance estimates by bootstrap or asymptotic methods
* Good method in many practical situations
* Calibration data can be incorporated
* Problems in highly nonlinear models
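41 Regression calibration
Berkson case: $E(X \mid X^*) = X^*$, so naive estimation = regression calibration.
Classical measurement error: linear regression of $X$ on $X^*$:
$E(X \mid X^*) = \frac{\sigma_X^2}{\sigma_{X^*}^2} X^* + \mu_X \left(1 - \frac{\sigma_X^2}{\sigma_{X^*}^2}\right)$
This gives the correction for attenuation in the linear model.
Classical measurement error with the assumption of a mixed normal distribution:
$E(X \mid X^*) = P(I = 1 \mid X^*) \, E(X \mid X^*, I = 1) + P(I = 2 \mid X^*) \, E(X \mid X^*, I = 2)$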
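Steps 1 and 2 of regression calibration can be sketched for the linear model with replicate measurements. Everything below is an illustrative assumption, not the authors' implementation: two replicates per subject, classical normal error, no further covariate Z, and invented parameter values. Step 3 (variance adjustment, e.g. by bootstrap) is omitted.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 50_000
beta0, beta1 = 1.0, 2.0  # assumed true coefficients
sigma_u = 0.8            # assumed error s.d., treated as unknown below

x = rng.normal(0.0, 1.0, n)
y = beta0 + beta1 * x + rng.normal(0.0, 0.5, n)
w1 = x + rng.normal(0.0, sigma_u, n)  # first replicate of the surrogate
w2 = x + rng.normal(0.0, sigma_u, n)  # second replicate

# Step 1: calibration model E(X | mean of replicates), from the replicates.
sigma_u2_hat = np.var(w1 - w2, ddof=1) / 2.0  # Var(w1 - w2) = 2 sigma_u^2
w_bar = (w1 + w2) / 2.0                       # error variance sigma_u^2 / 2
s2_wbar = np.var(w_bar, ddof=1)
lam_hat = (s2_wbar - sigma_u2_hat / 2.0) / s2_wbar  # estimated reliability
x_hat = w_bar.mean() + lam_hat * (w_bar - w_bar.mean())

# Step 2: plug the calibrated predictor into the main model.
slope_rc, intercept_rc = np.polyfit(x_hat, y, 1)
slope_naive, _ = np.polyfit(w_bar, y, 1)
print(slope_naive, slope_rc)
```

In this linear setting the calibrated slope equals the naive slope divided by the estimated reliability, which is the correction for attenuation.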
42 Likelihood methods
* Standard inference can be done with standard errors and likelihood ratio tests
* Efficiency
* Combination of different data types is possible
* Sometimes more accurate than approximations
* Difficult to calculate
* Software not available
* Parametric model for the unobserved predictor necessary
* Robustness to strong parametric assumptions
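43 The classical error likelihood
The model has three components:
* Main model $[Y \mid X, Z, \zeta]$
* Error model $[X^* \mid X, \eta]$
* Exposure model $[X \mid Z, \lambda]$
$[Y, X^* \mid Z, \theta] = \prod_{i=1}^n \int [y_i \mid x, z_i, \zeta] \, [x_i^* \mid x, \eta] \, [x \mid z_i, \lambda] \, d\mu(x)$,
where $\theta = (\zeta, \eta, \lambda)$.
* Evaluation by numerical integration
* Misclassification is a special case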
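44 Berkson likelihood
We have the three components, but the third component contains no information:
* Main model $[Y \mid X, Z, \zeta]$
* Error model $[X \mid X^*, \eta]$
* Exposure model $[X^* \mid Z, \lambda]$
$[Y, X^* \mid Z, \theta] = \prod_{i=1}^n \int [y_i \mid x, z_i, \zeta] \, [x \mid x_i^*, \eta] \, [x_i^* \mid z_i, \lambda] \, d\mu(x)$
$[Y, X^* \mid Z, \theta] = \prod_{i=1}^n \int [y_i \mid x, z_i, \zeta] \, [x \mid x_i^*, \eta] \, d\mu(x) \cdot \mathrm{const}$
The Berkson likelihood does not depend on the exposure model.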
45 Quasi-likelihood
If calculation of the likelihood is too complicated, use
$E(Y \mid X^*, Z) = \int g(x, Z) \, f_{x \mid x^*} \, dx$
$V(Y \mid X^*, Z) = \int v(x, Z) \, f_{x \mid x^*} \, dx + Var[g(X, Z) \mid X^*]$
This can be done e.g. for exponential $g$:
* Poisson regression model
* Parametric survival
46 Bayes
Richardson and Green (2002)
* Evaluation by MCMC techniques
* Conditional independence assumptions on the three parts, as seen in the likelihood approach
* The latent variable $X$ is treated as an unknown parameter
* Different data types can be combined
* Prior distributions for the error model
* Flexible handling of the exposure model
* WinBUGS software
47 SIMEX: Basic idea
Cook and Stefanski (JASA, 1994). The effect of measurement error on the naive estimation is analyzed by a simulation experiment.
Linear regression with additive measurement error:
$Y = \beta_0 + \beta_X X + \epsilon$
$X^* = X + U$, $(U, X, \epsilon)$ independent
$U \sim N(0, \sigma_u^2)$, $\epsilon \sim N(0, \sigma_\epsilon^2)$
For the naive estimator (regression of $Y$ on $X^*$):
$\beta_X^* := \mathrm{plim}\, \hat\beta_{naive} = \frac{\sigma_X^2}{\sigma_X^2 + \sigma_u^2} \beta_X$
48 The SIMEX algorithm
Additive measurement error with known or estimated variance $\sigma_u^2$
1 Calculate the naive estimator $\hat\beta_{naive} := \hat\beta(0)$
2 Simulation: for a fixed grid of $\lambda$, e.g. $\lambda = 0.5, 1, 1.5, 2$:
* For $b = 1, \ldots, B$: simulate new error-prone regressors $X_{i,b}^* = X_i^* + \sqrt{\lambda} \, U_{i,b}$, $U_{i,b} \sim N(0, \sigma_u^2)$
* Calculate the (naive) estimates $\hat\beta_{b,\lambda}$ based on $[Y_i, X_{i,b}^*]$
* Calculate some average over $b$, giving $\hat\beta(\lambda)$
3 Extrapolation: fit a regression $\hat\beta(\lambda) = g(\lambda, \gamma)$ and set $\hat\beta_{simex} := g(-1, \hat\gamma)$
49 Extrapolation functions
Linear: $g(\lambda) = \gamma_0 + \gamma_1 \lambda$
Quadratic: $g(\lambda) = \gamma_0 + \gamma_1 \lambda + \gamma_2 \lambda^2$
Nonlinear: $g(\lambda) = \gamma_1 + \frac{\gamma_2}{\gamma_3 + \lambda}$
* Nonlinear is motivated by linear regression
* Quadratic works fine in many examples
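The simulation and extrapolation steps can be sketched for the slope of a simple linear regression with the quadratic extrapolation function. This is a hand-written illustration under assumed values (seed, sample size, grid, B, and parameters). It does not use the R package simex, and the helper name `naive_slope` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 5_000
beta0, beta_x = 1.0, 2.0  # assumed true coefficients
sigma_u = 0.5             # error s.d., assumed known

x = rng.normal(0.0, 1.0, n)
y = beta0 + beta_x * x + rng.normal(0.0, 0.5, n)
x_star = x + rng.normal(0.0, sigma_u, n)  # observed surrogate

def naive_slope(x_obs, resp):
    """Least-squares slope of resp on x_obs (hypothetical helper)."""
    return np.polyfit(x_obs, resp, 1)[0]

lambdas = np.array([0.0, 0.5, 1.0, 1.5, 2.0])  # lambda = 0 is the naive fit
B = 100

beta_lambda = []
for lam in lambdas:
    if lam == 0.0:
        beta_lambda.append(naive_slope(x_star, y))
        continue
    # Simulation step: add extra error with variance lam * sigma_u^2
    sims = [
        naive_slope(x_star + np.sqrt(lam) * rng.normal(0.0, sigma_u, n), y)
        for _ in range(B)
    ]
    beta_lambda.append(np.mean(sims))
beta_lambda = np.array(beta_lambda)

# Extrapolation step: quadratic in lambda, evaluated at lambda = -1
gamma = np.polyfit(lambdas, beta_lambda, 2)
beta_simex = np.polyval(gamma, -1.0)
beta_naive = beta_lambda[0]
print(beta_naive, beta_simex)
```

With the quadratic extrapolant the estimate moves from the attenuated naive value towards the true slope but generally does not reach it, because the exact dependence on lambda in this model has the nonlinear (rational) form.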
50 Misclassification SIMEX
General regression model with discrete variable $X$ and classification matrix $\Pi$ describing the relationship between $X$ and the observed $X^*$.
$\beta^*(\Pi) := \mathrm{plim}\, \hat\beta_{naive}$, $\beta^*(I_{k \times k}) = \beta$
Problem: $\beta^*(\Pi)$ is a function of a matrix.
We define: $\lambda \mapsto \beta^*(\Pi^\lambda)$
$\Pi^\lambda$ is given by $\Pi^{n+1} = \Pi^n \Pi$ for natural numbers $n$ and $\Pi^0 = I_{k \times k}$.
In general: $\Pi^\lambda := E \Lambda^\lambda E^{-1}$
$E$ := matrix of eigenvectors
$\Lambda$ := diagonal matrix of eigenvalues
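The matrix power via the eigendecomposition can be computed directly. The sketch below assumes a diagonalisable matrix with real, positive eigenvalues and uses an invented 2x2 classification matrix. The function name `misclassification_power` is hypothetical.

```python
import numpy as np

def misclassification_power(Pi, lam):
    """Compute Pi^lam = E diag(eigenvalues^lam) E^{-1}.

    Sketch only: requires real, positive eigenvalues so that
    non-integer powers are real-valued.
    """
    eigvals, E = np.linalg.eig(Pi)
    if np.iscomplexobj(eigvals) or np.any(eigvals <= 0):
        raise ValueError("sketch assumes real, positive eigenvalues")
    return E @ np.diag(eigvals ** lam) @ np.linalg.inv(E)

# Assumed matrix, entry [i, j] = P(X* = i | X = j); columns sum to 1
Pi = np.array([[0.9, 0.2],
               [0.1, 0.8]])

Pi_half = misclassification_power(Pi, 0.5)
Pi_two = misclassification_power(Pi, 2.0)
Pi_zero = misclassification_power(Pi, 0.0)
print(Pi_half)
```

For natural numbers the result agrees with repeated matrix multiplication, and applying the half power twice reproduces the original matrix.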
51 Summary SIMEX
* Simple measurement error model
* Suitable for complex main models
* R package simex available
* Extrapolation problematic, especially for high measurement error
52 Example: Occupational dust and chronic bronchitis
Kü/Carroll (1997) and Gössl/Kü (2001)
Research question: relationship between occupational dust and chronic bronchitis
Data from N = 1246 workers:
$X$: log(1 + average occupational dust exposure)
$Y$: chronic bronchitis (CBR)
$X^*$: measurements and expert ratings
$Z_1$: smoking
$Z_2$: duration of exposure
Model with threshold:
$P(Y = 1) = G(\beta_0 + \beta_1 (X - TLV)_+ + \beta_2 Z_1 + \beta_3 Z_2)$
53 Results for the TLV
[Table, values not preserved in this transcription. Columns: method, TLV $\tau_0$, nominal s.e., bootstrap s.e. Rows: naive, pseudo-MLE, regression calibration, SIMEX linear, SIMEX quadratic, SIMEX nonlinear.]
54 Findings
* MLE can be much more efficient (see also Guolo and Brazzale (2008))
* SIMEX is flexible
* The measurement error model is critical
Biometrics xx, 1 25 Pubmonth 20xx DOI: 10.1111/j.1541-0420.2005.00454.x A Regularization Corrected Score Method for Nonlinear Regression Models with Covariate Error David M. Zucker 1,, Malka Gorfine 2,
More informationDistribution-free ROC Analysis Using Binary Regression Techniques
Distribution-free Analysis Using Binary Techniques Todd A. Alonzo and Margaret S. Pepe As interpreted by: Andrew J. Spieker University of Washington Dept. of Biostatistics Introductory Talk No, not that!
More informationSOME ASPECTS OF MEASUREMENT ERROR IN EXPLANATORY VARIABLES FOR CONTINUOUS AND BINARY REGRESSION MODELS
STATISTICS IN MEDICINE Statist. Med. 17, 2157 2177 (1998) SOME ASPECTS OF MEASUREMENT ERROR IN EXPLANATORY VARIABLES FOR CONTINUOUS AND BINARY REGRESSION MODELS G. K. REEVES*, D.R.COX, S. C. DARBY AND
More informationMachine Learning 2017
Machine Learning 2017 Volker Roth Department of Mathematics & Computer Science University of Basel 21st March 2017 Volker Roth (University of Basel) Machine Learning 2017 21st March 2017 1 / 41 Section
More informationTest Code: STA/STB (Short Answer Type) 2013 Junior Research Fellowship for Research Course in Statistics
Test Code: STA/STB (Short Answer Type) 2013 Junior Research Fellowship for Research Course in Statistics The candidates for the research course in Statistics will have to take two shortanswer type tests
More informationBootstrap, Jackknife and other resampling methods
Bootstrap, Jackknife and other resampling methods Part VI: Cross-validation Rozenn Dahyot Room 128, Department of Statistics Trinity College Dublin, Ireland dahyot@mee.tcd.ie 2005 R. Dahyot (TCD) 453 Modern
More informationChapter 5: Models used in conjunction with sampling. J. Kim, W. Fuller (ISU) Chapter 5: Models used in conjunction with sampling 1 / 70
Chapter 5: Models used in conjunction with sampling J. Kim, W. Fuller (ISU) Chapter 5: Models used in conjunction with sampling 1 / 70 Nonresponse Unit Nonresponse: weight adjustment Item Nonresponse:
More informationProbability and Estimation. Alan Moses
Probability and Estimation Alan Moses Random variables and probability A random variable is like a variable in algebra (e.g., y=e x ), but where at least part of the variability is taken to be stochastic.
More informationMachine Learning. B. Unsupervised Learning B.2 Dimensionality Reduction. Lars Schmidt-Thieme, Nicolas Schilling
Machine Learning B. Unsupervised Learning B.2 Dimensionality Reduction Lars Schmidt-Thieme, Nicolas Schilling Information Systems and Machine Learning Lab (ISMLL) Institute for Computer Science University
More informationCausal Inference with General Treatment Regimes: Generalizing the Propensity Score
Causal Inference with General Treatment Regimes: Generalizing the Propensity Score David van Dyk Department of Statistics, University of California, Irvine vandyk@stat.harvard.edu Joint work with Kosuke
More informationA Significance Test for the Lasso
A Significance Test for the Lasso Lockhart R, Taylor J, Tibshirani R, and Tibshirani R Ashley Petersen May 14, 2013 1 Last time Problem: Many clinical covariates which are important to a certain medical
More informationCausal Mechanisms Short Course Part II:
Causal Mechanisms Short Course Part II: Analyzing Mechanisms with Experimental and Observational Data Teppei Yamamoto Massachusetts Institute of Technology March 24, 2012 Frontiers in the Analysis of Causal
More informationEcon 2148, fall 2017 Gaussian process priors, reproducing kernel Hilbert spaces, and Splines
Econ 2148, fall 2017 Gaussian process priors, reproducing kernel Hilbert spaces, and Splines Maximilian Kasy Department of Economics, Harvard University 1 / 37 Agenda 6 equivalent representations of the
More informationConsequences of measurement error. Psychology 588: Covariance structure and factor models
Consequences of measurement error Psychology 588: Covariance structure and factor models Scaling indeterminacy of latent variables Scale of a latent variable is arbitrary and determined by a convention
More informationA Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness
A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness A. Linero and M. Daniels UF, UT-Austin SRC 2014, Galveston, TX 1 Background 2 Working model
More informationLikelihood ratio testing for zero variance components in linear mixed models
Likelihood ratio testing for zero variance components in linear mixed models Sonja Greven 1,3, Ciprian Crainiceanu 2, Annette Peters 3 and Helmut Küchenhoff 1 1 Department of Statistics, LMU Munich University,
More informationNow consider the case where E(Y) = µ = Xβ and V (Y) = σ 2 G, where G is diagonal, but unknown.
Weighting We have seen that if E(Y) = Xβ and V (Y) = σ 2 G, where G is known, the model can be rewritten as a linear model. This is known as generalized least squares or, if G is diagonal, with trace(g)
More informationGeneralized Linear Models. Last time: Background & motivation for moving beyond linear
Generalized Linear Models Last time: Background & motivation for moving beyond linear regression - non-normal/non-linear cases, binary, categorical data Today s class: 1. Examples of count and ordered
More informationRonald Christensen. University of New Mexico. Albuquerque, New Mexico. Wesley Johnson. University of California, Irvine. Irvine, California
Texts in Statistical Science Bayesian Ideas and Data Analysis An Introduction for Scientists and Statisticians Ronald Christensen University of New Mexico Albuquerque, New Mexico Wesley Johnson University
More informationComments. x > w = w > x. Clarification: this course is about getting you to be able to think as a machine learning expert
Logistic regression Comments Mini-review and feedback These are equivalent: x > w = w > x Clarification: this course is about getting you to be able to think as a machine learning expert There has to be
More informationLecture 7 Introduction to Statistical Decision Theory
Lecture 7 Introduction to Statistical Decision Theory I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw December 20, 2016 1 / 55 I-Hsiang Wang IT Lecture 7
More informationSemiparametric Generalized Linear Models
Semiparametric Generalized Linear Models North American Stata Users Group Meeting Chicago, Illinois Paul Rathouz Department of Health Studies University of Chicago prathouz@uchicago.edu Liping Gao MS Student
More informationFractional Imputation in Survey Sampling: A Comparative Review
Fractional Imputation in Survey Sampling: A Comparative Review Shu Yang Jae-Kwang Kim Iowa State University Joint Statistical Meetings, August 2015 Outline Introduction Fractional imputation Features Numerical
More informationProbabilistic modeling. The slides are closely adapted from Subhransu Maji s slides
Probabilistic modeling The slides are closely adapted from Subhransu Maji s slides Overview So far the models and algorithms you have learned about are relatively disconnected Probabilistic modeling framework
More informationFREQUENTIST BEHAVIOR OF FORMAL BAYESIAN INFERENCE
FREQUENTIST BEHAVIOR OF FORMAL BAYESIAN INFERENCE Donald A. Pierce Oregon State Univ (Emeritus), RERF Hiroshima (Retired), Oregon Health Sciences Univ (Adjunct) Ruggero Bellio Univ of Udine For Perugia
More informationEquivalence of random-effects and conditional likelihoods for matched case-control studies
Equivalence of random-effects and conditional likelihoods for matched case-control studies Ken Rice MRC Biostatistics Unit, Cambridge, UK January 8 th 4 Motivation Study of genetic c-erbb- exposure and
More informationEconometrics Master in Business and Quantitative Methods
Econometrics Master in Business and Quantitative Methods Helena Veiga Universidad Carlos III de Madrid Models with discrete dependent variables and applications of panel data methods in all fields of economics
More informationBayesian Methods for Machine Learning
Bayesian Methods for Machine Learning CS 584: Big Data Analytics Material adapted from Radford Neal s tutorial (http://ftp.cs.utoronto.ca/pub/radford/bayes-tut.pdf), Zoubin Ghahramni (http://hunch.net/~coms-4771/zoubin_ghahramani_bayesian_learning.pdf),
More informationMidterm Examination. STA 215: Statistical Inference. Due Wednesday, 2006 Mar 8, 1:15 pm
Midterm Examination STA 215: Statistical Inference Due Wednesday, 2006 Mar 8, 1:15 pm This is an open-book take-home examination. You may work on it during any consecutive 24-hour period you like; please
More informationPoisson regression 1/15
Poisson regression 1/15 2/15 Counts data Examples of counts data: Number of hospitalizations over a period of time Number of passengers in a bus station Blood cells number in a blood sample Number of typos
More informationBias Study of the Naive Estimator in a Longitudinal Binary Mixed-effects Model with Measurement Error and Misclassification in Covariates
Bias Study of the Naive Estimator in a Longitudinal Binary Mixed-effects Model with Measurement Error and Misclassification in Covariates by c Ernest Dankwa A thesis submitted to the School of Graduate
More informationA Note on Bootstraps and Robustness. Tony Lancaster, Brown University, December 2003.
A Note on Bootstraps and Robustness Tony Lancaster, Brown University, December 2003. In this note we consider several versions of the bootstrap and argue that it is helpful in explaining and thinking about
More information