Estimation of Optimally-Combined-Biomarker Accuracy in the Absence of a Gold-Standard Reference Test
|
|
- Scott McGee
- 5 years ago
- Views:
Transcription
1 Estimation of Optimally-Combined-Biomarker Accuracy in the Absence of a Gold-Standard Reference Test L. García Barrado 1 E. Coart 2 T. Burzykowski 1,2 1 Interuniversity Institute for Biostatistics and Statistical Bioinformatics (I-Biostat) 2 International Drug Development Institute (IDDI) 1 / 26
2 Outline Problem setting Accuracy definition Optimal combination of biomarkers Absence of gold-standard reference Bayesian latent-class mixture model Naive prior definition Controlled prior definition Simulation study Data and model Results Conclusions 2 / 26
3 Problem setting Outline Problem setting Accuracy definition Optimal combination of biomarkers Absence of gold-standard reference Bayesian latent-class mixture model Naive prior definition Controlled prior definition Simulation study Data and model Results Conclusions 3 / 26
4 Problem setting Problem setting Establish accuracy of a combination of biomarkers in the absence of a gold-standard reference test Area under the Receiver Operating Characteristics (ROC) curve (AUC) as measure of accuracy Choose combination of biomarkers that maximizes AUC Imperfect reference test leads to biased estimates of accuracy => To this end a Bayesian latent-class mixture model will be proposed 4 / 26
5 Problem setting Accuracy definition Area under the Receiver Operating Characteristics curve Classification example Classification example ROC curve Control Disease Control Disease Se AUC=0.98 AUC= Biomarker value Biomarker value 1 Sp 5 / 26
6 Problem setting Optimal combination of biomarkers Data assumptions and notation Underlying true biomarker distribution Mixture of two K -variate normal distributions by true disease status (D) Y D=0 N K (µ 0, Σ 0 ) Y D=1 N K (µ 1, Σ 1 ) Se: Unknown sensitivity of the reference test (T) Sp: Unknown specificity of the reference test (T) θ: Unknown true prevalence of disease in the data set Reference test is imperfect Conditionally on true disease status, misclassification independent of biomarker value Ignoring will UNDERESTIMATE performance of biomarker 6 / 26
7 Problem setting Optimal combination of biomarkers AUC of optimal combination of biomarkers According to Siu and Liu (1993) the linear combination maximizing AUC is of the form: a Y D=0 N(a µ 0, a Σ 0a) a Y D=1 N(a µ 1, a Σ 1a) For which: a (Σ 0 + Σ 1) 1 (µ 1 µ 0 ) Area Under the ROC Curve: AUC OptComb = Φ { } ((µ 1 µ 0 ) (Σ 0 + Σ 1) 1 (µ 1 µ 0 )) 1 2 This is all under the assumption of a gold standard reference test. We propose to extend this to the imperfect reference test case. 7 / 26
8 Problem setting Optimal combination of biomarkers AUC of optimal combination of biomarkers According to Siu and Liu (1993) the linear combination maximizing AUC is of the form: a Y D=0 N(a µ 0, a Σ 0a) a Y D=1 N(a µ 1, a Σ 1a) For which: a (Σ 0 + Σ 1) 1 (µ 1 µ 0 ) Area Under the ROC Curve: AUC OptComb = Φ { } ((µ 1 µ 0 ) (Σ 0 + Σ 1) 1 (µ 1 µ 0 )) 1 2 This is all under the assumption of a gold standard reference test. We propose to extend this to the imperfect reference test case. 8 / 26
9 Problem setting Absence of gold-standard reference Underlying versus observed data Ignoring misclassification in imperfect reference test will lead to bias of estimated accuracy: True distributions VS observed data Observed Control (T=0) Observed Disease (T=1) True Control (D=0) True Disease (D=1) In example: conditionally independent misclassification Misclassification in reference test causes skewed observed distributions Goal: retrieve accuracy of true underlying biomarker by observed data Biomarker value 9 / 26
10 Bayesian latent-class mixture model Outline Problem setting Accuracy definition Optimal combination of biomarkers Absence of gold-standard reference Bayesian latent-class mixture model Naive prior definition Controlled prior definition Simulation study Data and model Results Conclusions 10 / 26
11 Bayesian latent-class mixture model Full data likelihood L(µ 0, µ 1, Σ 0, Σ 1, θ, Se, Sp Y, T, D) ( N { = θse t i (1 Se) (1 t i ) 1 (2π)K Σ EXP 1 } ) d i 1 2 (Yi µ 1 ) Σ 1 1 (Y i µ 1 ) i=1 ( { (1 θ)(1 Sp) t i Sp (1 t i ) 1 (2π)K Σ EXP 1 } ) (1 d i ) 0 2 (Yi µ 0) Σ 1 0 (Y i µ 0 ) 11 / 26
12 Bayesian latent-class mixture model Naive prior definition Naive prior definition Hyperprior θ Uniform(0.1,0.9) Priors D i Bernoulli(θ) (Observation i: 1,...,N) µ kj N(0,10 6 ) (Disease indicator j: 0, 1; Biomarker k: 1,...,K) Σ 1 j Wish(S,K) (Disease indicator j: 0, 1) with S = VarCov-matrix of observed control group Se = Sp Beta(1,1)T(0.51, ) [Non-informative] OR Se = Sp Beta(10, )T(0.51, ) [Informative] 12 / 26
13 Bayesian latent-class mixture model Naive prior definition Se/Sp Beta(10, ) Prior Mean = 0.85 Var = Equal-tail 95%-probability interval: Informative Se/Sp prior Density Se or Sp 13 / 26
14 Bayesian latent-class mixture model Naive prior definition Implied prior AUC Density Implied AUC prior AUC Prior specification is used commonly (e.g. O Malley and Zou (2006)) Uninformative mixture component priors lead to prior point mass distribution centred at 1 for AUC Extremely informative prior for component of interest! 14 / 26
15 Bayesian latent-class mixture model Controlled prior definition Controlled prior definition (AUC) AUC OptComb { } = Φ ((µ 1 µ 0 ) (Σ 0 + Σ 1 ) 1 (µ 1 µ 0 )) 1 2 { } = Φ ((µ 1 µ 0 ) L L(µ 1 µ 0 )) 1 2 With L = the Cholesky factor of (Σ 0 + Σ 1 ) 1 Set = L(µ 1 µ 0 ) N K (κ, Ψ) µ 0k N(0, 10 6 ) (k: 1,...,K) µ 1 = L 1 + µ 0 15 / 26
16 Bayesian latent-class mixture model Controlled prior definition Implied priors AUC For κ = 0 0, σi = 0.7 and ρ ij = [i,j: 1,...,K] AUC Kappa=(0.0,0.0,0.0),Sd=(0.7,0.7,0.7),Cor=(0.6,0.6,0.6) Simulated AUC fy Expansion appr. Less informative prior distribution for AUC Prior on gives control over informativeness AUC prior Similar issue for Σ 0 and Σ 1 (Wei, Y. & Higgins, P.T., 2013) Y 16 / 26
17 Simulation study Outline Problem setting Accuracy definition Optimal combination of biomarkers Absence of gold-standard reference Bayesian latent-class mixture model Naive prior definition Controlled prior definition Simulation study Data and model Results Conclusions 17 / 26
18 Simulation study Data and model 100 data sets for 3 biomarkers 4 data generating settings R = I K vs R I K Σ 0 = Σ 1 vs Σ 0 Σ 1 Parameter settings N = 100, 400 or 600 θ = 0.5 Se = Sp = 0.85 Mixture component parameters set such that: AUC 1 = AUC 2 = AUC 3 = 0.75 AUC OptimalCombination R=IK = 0.88 AUC OptimalCombination R IK = 0.78 Analysis model: General model fitting 2 unstructured covariance matrices 18 / 26
19 Simulation study Results AUC Results (Average of median posterior AUC) In cell: Average of median posterior AUC (Standard Error) [Number of converged fits] Data Gen. Sample Size Model Se/Sp True VarCov Corr N100 N400 N600 prior AUC Σ 0 = Σ 1 ρ 0 GS (0.039) [100] (0.022) [100] (0.018) [100] Σ 0 Σ 1 ρ = 0 GS (0.037) [100] (0.019) [100] (0.015) [100] Σ 0 Σ 1 ρ 0 GS (0.040) [100] (0.022) [100] (0.018) [100] Gold Standard model fit leads to significant underestimation 19 / 26
20 Simulation study Results AUC Results (Average of median posterior AUC) Data Gen. Model Sample Size VarCov Corr Se/Sp True prior AUC N100 N400 N600 Σ 0 = Σ 1 ρ 0 GS (0.039) [100] (0.022) [100] (0.018) [100] Σ 0 = Σ 1 ρ 0 NINF (0.056) [18] (0.034) [61] (0.030) [60] Σ 0 Σ 1 ρ = 0 GS (0.037) [100] (0.019) [100] (0.015) [100] Σ 0 Σ 1 ρ = 0 NINF (0.041) [89] (0.022) [100] (0.019) [100] Σ 0 Σ 1 ρ 0 GS (0.040) [100] (0.022) [100] (0.018) [100] Σ 0 Σ 1 ρ 0 NINF (0.047) [72] (0.030) [100] (0.025) [100] Proposed model with controlled prior leads to unbiased estimates Non-convergence issues for low information data Small sample size Correlation Over-specified VarCov model Increased sample size increases precision 20 / 26
21 Simulation study Results AUC Results (Average of median posterior AUC) Data Gen. Model Sample Size VarCov Corr Se/Sp True prior AUC N100 N400 N600 Σ 0 = Σ 1 ρ 0 GS (0.039) [100] (0.022) [100] (0.018) [100] Σ 0 = Σ 1 ρ 0 NINF (0.056) [18] (0.034) [61] (0.030) [60] Σ 0 = Σ 1 ρ 0 INF (0.053) [33] (0.033) [83] (0.027) [86] Σ 0 Σ 1 ρ = 0 GS (0.037) [100] (0.019) [100] (0.015) [100] Σ 0 Σ 1 ρ = 0 NINF (0.041) [89] (0.022) [100] (0.019) [100] Σ 0 Σ 1 ρ = 0 INF (0.045) [96] (0.022) [100] (0.019) [100] Σ 0 Σ 1 ρ 0 GS (0.040) [100] (0.022) [100] (0.018) [100] Σ 0 Σ 1 ρ 0 NINF (0.047) [72] (0.030) [100] (0.025) [100] Σ 0 Σ 1 ρ 0 INF (0.049) [83] (0.029) [100] (0.025) [100] Inclusion of prior Se and Sp information No effect on point estimate or precision Positive effect on number of converged data sets 21 / 26
22 Conclusions Outline Problem setting Accuracy definition Optimal combination of biomarkers Absence of gold-standard reference Bayesian latent-class mixture model Naive prior definition Controlled prior definition Simulation study Data and model Results Conclusions 22 / 26
23 Conclusions Conclusions Bayesian latent-class mixture model: Takes unknown true disease status into account Incorporates information from reference test while acknowledges imperfectness Provides estimates of accuracy of the reference test Simulation study Model is able to retrieve true AUC when: sample size is adequate sensible non-informative prior is considered Careful prior specification Complex function of uninformative prior distributions => informative prior => biased estimates Controlled prior specification is proposed 23 / 26
24 Conclusions Further considerations Sensitivity to misspecified Se/Sp prior distribution Extend to incorporate non-normally distributed biomarkers Evaluate impact of conditional independence assumption 24 / 26
25 Conclusions References O Malley, A.J., Zou, K.H.: Bayesian multivariate hierarchical transformation models for ROC analysis. Statistical Medicine. 25, (2006) Su, J.Q., Liu, J.S.: Linear combinations of multiple diagnostic markers. Journal of the American Statistical Association. 88, (1993) Wei, Y, Higgins, P.T.: Bayesian multivariate meta-analysis with multiple outcomes. Statistics in Medicine (2013) doi: /sim / 26
26 Conclusions Thank you for your attention! 26 / 26
Nonparametric Estimation of ROC Curves in the Absence of a Gold Standard (Biometrics 2005)
Nonparametric Estimation of ROC Curves in the Absence of a Gold Standard (Biometrics 2005) Xiao-Hua Zhou, Pete Castelluccio and Chuan Zhou As interpreted by: JooYoon Han Department of Biostatistics University
More informationLecture 16: Mixtures of Generalized Linear Models
Lecture 16: Mixtures of Generalized Linear Models October 26, 2006 Setting Outline Often, a single GLM may be insufficiently flexible to characterize the data Setting Often, a single GLM may be insufficiently
More informationIntroduction to Machine Learning
Introduction to Machine Learning Bayesian Classification Varun Chandola Computer Science & Engineering State University of New York at Buffalo Buffalo, NY, USA chandola@buffalo.edu Chandola@UB CSE 474/574
More informationBayesian Learning. HT2015: SC4 Statistical Data Mining and Machine Learning. Maximum Likelihood Principle. The Bayesian Learning Framework
HT5: SC4 Statistical Data Mining and Machine Learning Dino Sejdinovic Department of Statistics Oxford http://www.stats.ox.ac.uk/~sejdinov/sdmml.html Maximum Likelihood Principle A generative model for
More informationComparison of multiple imputation methods for systematically and sporadically missing multilevel data
Comparison of multiple imputation methods for systematically and sporadically missing multilevel data V. Audigier, I. White, S. Jolani, T. Debray, M. Quartagno, J. Carpenter, S. van Buuren, M. Resche-Rigon
More informationReconstruction of individual patient data for meta analysis via Bayesian approach
Reconstruction of individual patient data for meta analysis via Bayesian approach Yusuke Yamaguchi, Wataru Sakamoto and Shingo Shirahata Graduate School of Engineering Science, Osaka University Masashi
More informationNonparametric predictive inference with parametric copulas for combining bivariate diagnostic tests
Nonparametric predictive inference with parametric copulas for combining bivariate diagnostic tests Noryanti Muhammad, Universiti Malaysia Pahang, Malaysia, noryanti@ump.edu.my Tahani Coolen-Maturi, Durham
More informationIntroduction to Machine Learning
Outline Introduction to Machine Learning Bayesian Classification Varun Chandola March 8, 017 1. {circular,large,light,smooth,thick}, malignant. {circular,large,light,irregular,thick}, malignant 3. {oval,large,dark,smooth,thin},
More informationMinimum Message Length Inference and Mixture Modelling of Inverse Gaussian Distributions
Minimum Message Length Inference and Mixture Modelling of Inverse Gaussian Distributions Daniel F. Schmidt Enes Makalic Centre for Molecular, Environmental, Genetic & Analytic (MEGA) Epidemiology School
More informationSample Size Calculations for ROC Studies: Parametric Robustness and Bayesian Nonparametrics
Baylor Health Care System From the SelectedWorks of unlei Cheng Spring January 30, 01 Sample Size Calculations for ROC Studies: Parametric Robustness and Bayesian Nonparametrics unlei Cheng, Baylor Health
More informationBayesian SAE using Complex Survey Data Lecture 4A: Hierarchical Spatial Bayes Modeling
Bayesian SAE using Complex Survey Data Lecture 4A: Hierarchical Spatial Bayes Modeling Jon Wakefield Departments of Statistics and Biostatistics University of Washington 1 / 37 Lecture Content Motivation
More information[Part 2] Model Development for the Prediction of Survival Times using Longitudinal Measurements
[Part 2] Model Development for the Prediction of Survival Times using Longitudinal Measurements Aasthaa Bansal PhD Pharmaceutical Outcomes Research & Policy Program University of Washington 69 Biomarkers
More informationNaïve Bayes classification
Naïve Bayes classification 1 Probability theory Random variable: a variable whose possible values are numerical outcomes of a random phenomenon. Examples: A person s height, the outcome of a coin toss
More informationROC Curve Analysis for Randomly Selected Patients
INIAN INSIUE OF MANAGEMEN AHMEABA INIA ROC Curve Analysis for Randomly Selected Patients athagata Bandyopadhyay Sumanta Adhya Apratim Guha W.P. No. 015-07-0 July 015 he main objective of the working paper
More informationHigh-dimensional regression modeling
High-dimensional regression modeling David Causeur Department of Statistics and Computer Science Agrocampus Ouest IRMAR CNRS UMR 6625 http://www.agrocampus-ouest.fr/math/causeur/ Course objectives Making
More informationA Bayesian Nonparametric Model for Predicting Disease Status Using Longitudinal Profiles
A Bayesian Nonparametric Model for Predicting Disease Status Using Longitudinal Profiles Jeremy Gaskins Department of Bioinformatics & Biostatistics University of Louisville Joint work with Claudio Fuentes
More informationBAYESIAN METHODS FOR VARIABLE SELECTION WITH APPLICATIONS TO HIGH-DIMENSIONAL DATA
BAYESIAN METHODS FOR VARIABLE SELECTION WITH APPLICATIONS TO HIGH-DIMENSIONAL DATA Intro: Course Outline and Brief Intro to Marina Vannucci Rice University, USA PASI-CIMAT 04/28-30/2010 Marina Vannucci
More informationMore on nuisance parameters
BS2 Statistical Inference, Lecture 3, Hilary Term 2009 January 30, 2009 Suppose that there is a minimal sufficient statistic T = t(x ) partitioned as T = (S, C) = (s(x ), c(x )) where: C1: the distribution
More informationBayesian Hierarchical Models
Bayesian Hierarchical Models Gavin Shaddick, Millie Green, Matthew Thomas University of Bath 6 th - 9 th December 2016 1/ 34 APPLICATIONS OF BAYESIAN HIERARCHICAL MODELS 2/ 34 OUTLINE Spatial epidemiology
More informationBayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features. Yangxin Huang
Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features Yangxin Huang Department of Epidemiology and Biostatistics, COPH, USF, Tampa, FL yhuang@health.usf.edu January
More informationChapter 3: Maximum-Likelihood & Bayesian Parameter Estimation (part 1)
HW 1 due today Parameter Estimation Biometrics CSE 190 Lecture 7 Today s lecture was on the blackboard. These slides are an alternative presentation of the material. CSE190, Winter10 CSE190, Winter10 Chapter
More informationEstimating Optimum Linear Combination of Multiple Correlated Diagnostic Tests at a Fixed Specificity with Receiver Operating Characteristic Curves
Journal of Data Science 6(2008), 1-13 Estimating Optimum Linear Combination of Multiple Correlated Diagnostic Tests at a Fixed Specificity with Receiver Operating Characteristic Curves Feng Gao 1, Chengjie
More informationBayesian Methods: Naïve Bayes
Bayesian Methods: aïve Bayes icholas Ruozzi University of Texas at Dallas based on the slides of Vibhav Gogate Last Time Parameter learning Learning the parameter of a simple coin flipping model Prior
More informationStatistics & Data Sciences: First Year Prelim Exam May 2018
Statistics & Data Sciences: First Year Prelim Exam May 2018 Instructions: 1. Do not turn this page until instructed to do so. 2. Start each new question on a new sheet of paper. 3. This is a closed book
More informationLongitudinal breast density as a marker of breast cancer risk
Longitudinal breast density as a marker of breast cancer risk C. Armero (1), M. Rué (2), A. Forte (1), C. Forné (2), H. Perpiñán (1), M. Baré (3), and G. Gómez (4) (1) BIOstatnet and Universitat de València,
More informationNaïve Bayes classification. p ij 11/15/16. Probability theory. Probability theory. Probability theory. X P (X = x i )=1 i. Marginal Probability
Probability theory Naïve Bayes classification Random variable: a variable whose possible values are numerical outcomes of a random phenomenon. s: A person s height, the outcome of a coin toss Distinguish
More informationStat260: Bayesian Modeling and Inference Lecture Date: February 10th, Jeffreys priors. exp 1 ) p 2
Stat260: Bayesian Modeling and Inference Lecture Date: February 10th, 2010 Jeffreys priors Lecturer: Michael I. Jordan Scribe: Timothy Hunter 1 Priors for the multivariate Gaussian Consider a multivariate
More informationStatistical Methods for Multivariate Meta-Analysis of Diagnostic Tests
Statistical Methods for Multivariate Meta-Analysis of Diagnostic Tests A DISSERTATION SUBMITTED TO THE FACULTY OF THE GRADUATE SCHOOL OF THE UNIVERSITY OF MINNESOTA BY Xiaoye Ma IN PARTIAL FULFILLMENT
More informationMultinomial Data. f(y θ) θ y i. where θ i is the probability that a given trial results in category i, i = 1,..., k. The parameter space is
Multinomial Data The multinomial distribution is a generalization of the binomial for the situation in which each trial results in one and only one of several categories, as opposed to just two, as in
More informationSTAT 518 Intro Student Presentation
STAT 518 Intro Student Presentation Wen Wei Loh April 11, 2013 Title of paper Radford M. Neal [1999] Bayesian Statistics, 6: 475-501, 1999 What the paper is about Regression and Classification Flexible
More informationDavid Hughes. Flexible Discriminant Analysis Using. Multivariate Mixed Models. D. Hughes. Motivation MGLMM. Discriminant. Analysis.
Using Using David Hughes 2015 Outline Using 1. 2. Multivariate Generalized Linear Mixed () 3. Longitudinal 4. 5. Using Complex data. Using Complex data. Longitudinal Using Complex data. Longitudinal Multivariate
More informationThe consequences of misspecifying the random effects distribution when fitting generalized linear mixed models
The consequences of misspecifying the random effects distribution when fitting generalized linear mixed models John M. Neuhaus Charles E. McCulloch Division of Biostatistics University of California, San
More informationAnalysis of longitudinal imaging data. & Sandwich Estimator standard errors
with OLS & Sandwich Estimator standard errors GSK / University of Liège / University of Warwick FMRIB Centre - 07 Mar 2012 Supervisors: Thomas Nichols (Warwick University) and Christophe Phillips (Liège
More informationEquivalence of random-effects and conditional likelihoods for matched case-control studies
Equivalence of random-effects and conditional likelihoods for matched case-control studies Ken Rice MRC Biostatistics Unit, Cambridge, UK January 8 th 4 Motivation Study of genetic c-erbb- exposure and
More informationDefault Priors and Effcient Posterior Computation in Bayesian
Default Priors and Effcient Posterior Computation in Bayesian Factor Analysis January 16, 2010 Presented by Eric Wang, Duke University Background and Motivation A Brief Review of Parameter Expansion Literature
More informationAMS-207: Bayesian Statistics
Linear Regression How does a quantity y, vary as a function of another quantity, or vector of quantities x? We are interested in p(y θ, x) under a model in which n observations (x i, y i ) are exchangeable.
More informationHierarchical Modeling for non-gaussian Spatial Data
Hierarchical Modeling for non-gaussian Spatial Data Sudipto Banerjee 1 and Andrew O. Finley 2 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. 2 Department
More informationRepresent processes and observations that span multiple levels (aka multi level models) R 2
Hierarchical models Hierarchical models Represent processes and observations that span multiple levels (aka multi level models) R 1 R 2 R 3 N 1 N 2 N 3 N 4 N 5 N 6 N 7 N 8 N 9 N i = true abundance on a
More informationHierarchical Modelling for non-gaussian Spatial Data
Hierarchical Modelling for non-gaussian Spatial Data Sudipto Banerjee 1 and Andrew O. Finley 2 1 Department of Forestry & Department of Geography, Michigan State University, Lansing Michigan, U.S.A. 2
More information1 EM algorithm: updating the mixing proportions {π k } ik are the posterior probabilities at the qth iteration of EM.
Université du Sud Toulon - Var Master Informatique Probabilistic Learning and Data Analysis TD: Model-based clustering by Faicel CHAMROUKHI Solution The aim of this practical wor is to show how the Classification
More information10. Exchangeability and hierarchical models Objective. Recommended reading
10. Exchangeability and hierarchical models Objective Introduce exchangeability and its relation to Bayesian hierarchical models. Show how to fit such models using fully and empirical Bayesian methods.
More informationPlausible Values for Latent Variables Using Mplus
Plausible Values for Latent Variables Using Mplus Tihomir Asparouhov and Bengt Muthén August 21, 2010 1 1 Introduction Plausible values are imputed values for latent variables. All latent variables can
More informationNew Bayesian methods for model comparison
Back to the future New Bayesian methods for model comparison Murray Aitkin murray.aitkin@unimelb.edu.au Department of Mathematics and Statistics The University of Melbourne Australia Bayesian Model Comparison
More informationIntroduction to Machine Learning CMU-10701
Introduction to Machine Learning CMU-10701 2. MLE, MAP, Bayes classification Barnabás Póczos & Aarti Singh 2014 Spring Administration http://www.cs.cmu.edu/~aarti/class/10701_spring14/index.html Blackboard
More informationModeling conditional distributions with mixture models: Theory and Inference
Modeling conditional distributions with mixture models: Theory and Inference John Geweke University of Iowa, USA Journal of Applied Econometrics Invited Lecture Università di Venezia Italia June 2, 2005
More informationA semiparametric Bayesian model for examiner agreement in periodontal research
A semiparametric Bayesian model for examiner agreement in periodontal research Elizabeth G. Hill 1 and Elizabeth H. Slate 2 1 Division of Biostatistics and Epidemiology, MUSC 2 Department of Statistics,
More informationUniversity of California, Berkeley
University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 2008 Paper 241 A Note on Risk Prediction for Case-Control Studies Sherri Rose Mark J. van der Laan Division
More informationModeling conditional dependence among multiple diagnostic tests
Received: 11 June 2017 Revised: 1 August 2017 Accepted: 6 August 2017 DOI: 10.1002/sim.7449 RESEARCH ARTICLE Modeling conditional dependence among multiple diagnostic tests Zhuoyu Wang 1 Nandini Dendukuri
More informationVCMC: Variational Consensus Monte Carlo
VCMC: Variational Consensus Monte Carlo Maxim Rabinovich, Elaine Angelino, Michael I. Jordan Berkeley Vision and Learning Center September 22, 2015 probabilistic models! sky fog bridge water grass object
More informationBayesian Nonparametric Meta-Analysis Model George Karabatsos University of Illinois-Chicago (UIC)
Bayesian Nonparametric Meta-Analysis Model George Karabatsos University of Illinois-Chicago (UIC) Collaborators: Elizabeth Talbott, UIC. Stephen Walker, UT-Austin. August 9, 5, 4:5-4:45pm JSM 5 Meeting,
More informationHierarchical Modelling for non-gaussian Spatial Data
Hierarchical Modelling for non-gaussian Spatial Data Geography 890, Hierarchical Bayesian Models for Environmental Spatial Data Analysis February 15, 2011 1 Spatial Generalized Linear Models Often data
More informationStat 216 Final Solutions
Stat 16 Final Solutions Name: 5/3/05 Problem 1. (5 pts) In a study of size and shape relationships for painted turtles, Jolicoeur and Mosimann measured carapace length, width, and height. Their data suggest
More informationIntroduction to Machine Learning. Lecture 2
Introduction to Machine Learning Lecturer: Eran Halperin Lecture 2 Fall Semester Scribe: Yishay Mansour Some of the material was not presented in class (and is marked with a side line) and is given for
More informationNonparametric Bayes tensor factorizations for big data
Nonparametric Bayes tensor factorizations for big data David Dunson Department of Statistical Science, Duke University Funded from NIH R01-ES017240, R01-ES017436 & DARPA N66001-09-C-2082 Motivation Conditional
More informationW-BASED VS LATENT VARIABLES SPATIAL AUTOREGRESSIVE MODELS: EVIDENCE FROM MONTE CARLO SIMULATIONS
1 W-BASED VS LATENT VARIABLES SPATIAL AUTOREGRESSIVE MODELS: EVIDENCE FROM MONTE CARLO SIMULATIONS An Liu University of Groningen Henk Folmer University of Groningen Wageningen University Han Oud Radboud
More informationBayesian Statistics for Personalized Medicine. David Yang
Bayesian Statistics for Personalized Medicine David Yang Outline Why Bayesian Statistics for Personalized Medicine? A Network-based Bayesian Strategy for Genomic Biomarker Discovery Part One Why Bayesian
More informationCausal Inference with a Continuous Treatment and Outcome: Alternative Estimators for Parametric Dose-Response Functions
Causal Inference with a Continuous Treatment and Outcome: Alternative Estimators for Parametric Dose-Response Functions Joe Schafer Office of the Associate Director for Research and Methodology U.S. Census
More informationApplication of Latent Class with Random Effects Models to Longitudinal Data. Ken Beath Macquarie University
Application of Latent Class with Random Effects Models to Longitudinal Data Ken Beath Macquarie University Introduction Latent trajectory is a method of classifying subjects based on longitudinal data
More informationModeling continuous diagnostic test data using approximate Dirichlet process distributions
Research Article Received 04 June 2010 Accepted 17 May 2011 Published online 22 July 2011 in Wiley Online Library (wileyonlinelibrary.com) DOI: 10.1002/sim.4320 Modeling continuous diagnostic test data
More informationREPRODUCIBLE ANALYSIS OF HIGH-THROUGHPUT EXPERIMENTS
REPRODUCIBLE ANALYSIS OF HIGH-THROUGHPUT EXPERIMENTS Ying Liu Department of Biostatistics, Columbia University Summer Intern at Research and CMC Biostats, Sanofi, Boston August 26, 2015 OUTLINE 1 Introduction
More informationLecture 1: Bayesian Framework Basics
Lecture 1: Bayesian Framework Basics Melih Kandemir melih.kandemir@iwr.uni-heidelberg.de April 21, 2014 What is this course about? Building Bayesian machine learning models Performing the inference of
More informationIntroduction to Machine Learning. Maximum Likelihood and Bayesian Inference. Lecturers: Eran Halperin, Yishay Mansour, Lior Wolf
1 Introduction to Machine Learning Maximum Likelihood and Bayesian Inference Lecturers: Eran Halperin, Yishay Mansour, Lior Wolf 2013-14 We know that X ~ B(n,p), but we do not know p. We get a random sample
More informationUniversity of Cambridge Engineering Part IIB Module 3F3: Signal and Pattern Processing Handout 2:. The Multivariate Gaussian & Decision Boundaries
University of Cambridge Engineering Part IIB Module 3F3: Signal and Pattern Processing Handout :. The Multivariate Gaussian & Decision Boundaries..15.1.5 1 8 6 6 8 1 Mark Gales mjfg@eng.cam.ac.uk Lent
More informationEstimating Diagnostic Error without a Gold Standard: A Mixed Membership Approach
7 Estimating Diagnostic Error without a Gold Standard: A Mixed Membership Approach Elena A. Erosheva Department of Statistics, University of Washington, Seattle, WA 98195-4320, USA Cyrille Joutard Institut
More informationBIOS 312: Precision of Statistical Inference
and Power/Sample Size and Standard Errors BIOS 312: of Statistical Inference Chris Slaughter Department of Biostatistics, Vanderbilt University School of Medicine January 3, 2013 Outline Overview and Power/Sample
More informationIntroduction: MLE, MAP, Bayesian reasoning (28/8/13)
STA561: Probabilistic machine learning Introduction: MLE, MAP, Bayesian reasoning (28/8/13) Lecturer: Barbara Engelhardt Scribes: K. Ulrich, J. Subramanian, N. Raval, J. O Hollaren 1 Classifiers In this
More informationMeta-Analysis for Diagnostic Test Data: a Bayesian Approach
Meta-Analysis for Diagnostic Test Data: a Bayesian Approach Pablo E. Verde Coordination Centre for Clinical Trials Heinrich Heine Universität Düsseldorf Preliminaries: motivations for systematic reviews
More informationConstructing Confidence Intervals of the Summary Statistics in the Least-Squares SROC Model
UW Biostatistics Working Paper Series 3-28-2005 Constructing Confidence Intervals of the Summary Statistics in the Least-Squares SROC Model Ming-Yu Fan University of Washington, myfan@u.washington.edu
More informationA Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness
A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness A. Linero and M. Daniels UF, UT-Austin SRC 2014, Galveston, TX 1 Background 2 Working model
More informationLocal Likelihood Bayesian Cluster Modeling for small area health data. Andrew Lawson Arnold School of Public Health University of South Carolina
Local Likelihood Bayesian Cluster Modeling for small area health data Andrew Lawson Arnold School of Public Health University of South Carolina Local Likelihood Bayesian Cluster Modelling for Small Area
More informationAssessment of uncertainty in computer experiments: from Universal Kriging to Bayesian Kriging. Céline Helbert, Delphine Dupuy and Laurent Carraro
Assessment of uncertainty in computer experiments: from Universal Kriging to Bayesian Kriging., Delphine Dupuy and Laurent Carraro Historical context First introduced in the field of geostatistics (Matheron,
More informationZhiguang Huo 1, Chi Song 2, George Tseng 3. July 30, 2018
Bayesian latent hierarchical model for transcriptomic meta-analysis to detect biomarkers with clustered meta-patterns of differential expression signals BayesMP Zhiguang Huo 1, Chi Song 2, George Tseng
More informationspbayes: An R Package for Univariate and Multivariate Hierarchical Point-referenced Spatial Models
spbayes: An R Package for Univariate and Multivariate Hierarchical Point-referenced Spatial Models Andrew O. Finley 1, Sudipto Banerjee 2, and Bradley P. Carlin 2 1 Michigan State University, Departments
More informationBased on slides by Richard Zemel
CSC 412/2506 Winter 2018 Probabilistic Learning and Reasoning Lecture 3: Directed Graphical Models and Latent Variables Based on slides by Richard Zemel Learning outcomes What aspects of a model can we
More informationIntroduction (Alex Dmitrienko, Lilly) Web-based training program
Web-based training Introduction (Alex Dmitrienko, Lilly) Web-based training program http://www.amstat.org/sections/sbiop/webinarseries.html Four-part web-based training series Geert Verbeke (Katholieke
More informationECE521 week 3: 23/26 January 2017
ECE521 week 3: 23/26 January 2017 Outline Probabilistic interpretation of linear regression - Maximum likelihood estimation (MLE) - Maximum a posteriori (MAP) estimation Bias-variance trade-off Linear
More informationABTEKNILLINEN KORKEAKOULU
Two-way analysis of high-dimensional collinear data 1 Tommi Suvitaival 1 Janne Nikkilä 1,2 Matej Orešič 3 Samuel Kaski 1 1 Department of Information and Computer Science, Helsinki University of Technology,
More informationCombining diverse information sources with the II-CC-FF paradigm
Combining diverse information sources with the II-CC-FF paradigm Céline Cunen (joint work with Nils Lid Hjort) University of Oslo 18/07/2017 1/25 The Problem - Combination of information We have independent
More informationQuasi-likelihood Scan Statistics for Detection of
for Quasi-likelihood for Division of Biostatistics and Bioinformatics, National Health Research Institutes & Department of Mathematics, National Chung Cheng University 17 December 2011 1 / 25 Outline for
More information10708 Graphical Models: Homework 2
10708 Graphical Models: Homework 2 Due Monday, March 18, beginning of class Feburary 27, 2013 Instructions: There are five questions (one for extra credit) on this assignment. There is a problem involves
More informationLDA, QDA, Naive Bayes
LDA, QDA, Naive Bayes Generative Classification Models Marek Petrik 2/16/2017 Last Class Logistic Regression Maximum Likelihood Principle Logistic Regression Predict probability of a class: p(x) Example:
More informationThe impact of covariance misspecification in multivariate Gaussian mixtures on estimation and inference
The impact of covariance misspecification in multivariate Gaussian mixtures on estimation and inference An application to longitudinal modeling Brianna Heggeseth with Nicholas Jewell Department of Statistics
More informationSTA 216, GLM, Lecture 16. October 29, 2007
STA 216, GLM, Lecture 16 October 29, 2007 Efficient Posterior Computation in Factor Models Underlying Normal Models Generalized Latent Trait Models Formulation Genetic Epidemiology Illustration Structural
More informationCLASS NOTES Models, Algorithms and Data: Introduction to computing 2018
CLASS NOTES Models, Algorithms and Data: Introduction to computing 208 Petros Koumoutsakos, Jens Honore Walther (Last update: June, 208) IMPORTANT DISCLAIMERS. REFERENCES: Much of the material (ideas,
More informationStat 542: Item Response Theory Modeling Using The Extended Rank Likelihood
Stat 542: Item Response Theory Modeling Using The Extended Rank Likelihood Jonathan Gruhl March 18, 2010 1 Introduction Researchers commonly apply item response theory (IRT) models to binary and ordinal
More informationPart 8: GLMs and Hierarchical LMs and GLMs
Part 8: GLMs and Hierarchical LMs and GLMs 1 Example: Song sparrow reproductive success Arcese et al., (1992) provide data on a sample from a population of 52 female song sparrows studied over the course
More informationPrediction-based structured variable selection through the receiver operating characteristic curves. The Pennsylvania State University
Prediction-based structured variable selection through the receiver operating characteristic curves Yuanjia Wang, Huaihou Chen, Runze Li, Naihua Duan, & Roberto Lewis-Fernandez The Pennsylvania State University
More informationCASE STUDY: Bayesian Incidence Analyses from Cross-Sectional Data with Multiple Markers of Disease Severity. Outline:
CASE STUDY: Bayesian Incidence Analyses from Cross-Sectional Data with Multiple Markers of Disease Severity Outline: 1. NIEHS Uterine Fibroid Study Design of Study Scientific Questions Difficulties 2.
More informationMachine Learning Linear Classification. Prof. Matteo Matteucci
Machine Learning Linear Classification Prof. Matteo Matteucci Recall from the first lecture 2 X R p Regression Y R Continuous Output X R p Y {Ω 0, Ω 1,, Ω K } Classification Discrete Output X R p Y (X)
More informationLearning with Probabilities
Learning with Probabilities CS194-10 Fall 2011 Lecture 15 CS194-10 Fall 2011 Lecture 15 1 Outline Bayesian learning eliminates arbitrary loss functions and regularizers facilitates incorporation of prior
More informationApproaches to Modeling Menstrual Cycle Function
Approaches to Modeling Menstrual Cycle Function Paul S. Albert (albertp@mail.nih.gov) Biostatistics & Bioinformatics Branch Division of Epidemiology, Statistics, and Prevention Research NICHD SPER Student
More informationA Fully Nonparametric Modeling Approach to. BNP Binary Regression
A Fully Nonparametric Modeling Approach to Binary Regression Maria Department of Applied Mathematics and Statistics University of California, Santa Cruz SBIES, April 27-28, 2012 Outline 1 2 3 Simulation
More informationOutline. Selection Bias in Multilevel Models. The selection problem. The selection problem. Scope of analysis. The selection problem
International Workshop on tatistical Latent Variable Models in Health ciences erugia, 6-8 eptember 6 Outline election Bias in Multilevel Models Leonardo Grilli Carla Rampichini grilli@ds.unifi.it carla@ds.unifi.it
More informationA simulation study of model fitting to high dimensional data using penalized logistic regression
A simulation study of model fitting to high dimensional data using penalized logistic regression Ellinor Krona Kandidatuppsats i matematisk statistik Bachelor Thesis in Mathematical Statistics Kandidatuppsats
More informationModel Checking and Improvement
Model Checking and Improvement Statistics 220 Spring 2005 Copyright c 2005 by Mark E. Irwin Model Checking All models are wrong but some models are useful George E. P. Box So far we have looked at a number
More informationGenetics: Early Online, published on May 5, 2017 as /genetics
Genetics: Early Online, published on May 5, 2017 as 10.1534/genetics.116.198606 GENETICS INVESTIGATION Accounting for Sampling Error in Genetic Eigenvalues using Random Matrix Theory Jacqueline L. Sztepanacz,1
More informationUsing Bayesian Priors for More Flexible Latent Class Analysis
Using Bayesian Priors for More Flexible Latent Class Analysis Tihomir Asparouhov Bengt Muthén Abstract Latent class analysis is based on the assumption that within each class the observed class indicator
More informationProbabilistic Index Models
Probabilistic Index Models Jan De Neve Department of Data Analysis Ghent University M3 Storrs, Conneticut, USA May 23, 2017 Jan.DeNeve@UGent.be 1 / 37 Introduction 2 / 37 Introduction to Probabilistic
More informationApproximate analysis of covariance in trials in rare diseases, in particular rare cancers
Approximate analysis of covariance in trials in rare diseases, in particular rare cancers Stephen Senn (c) Stephen Senn 1 Acknowledgements This work is partly supported by the European Union s 7th Framework
More informationChapter 10. Semi-Supervised Learning
Chapter 10. Semi-Supervised Learning Wei Pan Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 55455 Email: weip@biostat.umn.edu PubH 7475/8475 c Wei Pan Outline
More information