Flexible Discriminant Analysis Using Multivariate Mixed Models. David Hughes.
Transcription
1 Flexible Discriminant Analysis Using Multivariate Mixed Models. David Hughes, 2015
2 Outline 1. Motivation 2. Multivariate Generalized Linear Mixed Models (MGLMM) 3. Longitudinal Discriminant Analysis 4. 5.
3 Motivation Complex data. Longitudinal. Multivariate. Different types of data. Complicated correlation structure.
9 Available Methods Univariate models using a classical linear mixed model (e.g. Brant et al. (2003), Lix and Sajobi (2010), Tomasko et al. (1999) and Wernecke et al. (2004)). These fail to account properly for the dependence between markers in our case. Multivariate discriminant analysis for continuous markers using multivariate mixed models (e.g. Morrell et al. (2012) using linear mixed models and Marshall et al. (2009) using non-linear mixed models). Not applicable if some of the markers are not continuous. Pairwise models for continuous and binary markers (Fieuws et al. (2008)). This method is in principle suitable for our purposes, but in this talk we outline a more flexible approach.
10 A more flexible approach The typical assumption about the random effects distribution can be relaxed by using a mixture of normal distributions (Komárek et al. (2010)). That methodology considers only three continuous markers. Cluster analysis with continuous, binary and count variables, with mixture distributions for the random effects, is possible (Komárek and Komárková (2013)). In cluster analysis the groups are unknown, whereas in our case the groups are known beforehand. Software is available in the mixAK package in R created by Arnošt Komárek.
11 Progress Map Dataset for discriminant analysis. Fitting of the multivariate mixed-effects model (MGLMM). Discriminant model built using parameters of MGLMM. Allocate new patients to diagnostic groups.
12 Definitions $Y_{i,r,j}$ is the $j$th observation of the $r$th marker for patient $i$, measured at time $t_{i,r,j}$. We consider markers $r = 1, \dots, R$ on patients $i = 1, \dots, N$. $Y_{i,r}$ is a vector containing all observations of marker $r$ for patient $i$. $Y_i$ is a stacked vector containing all the observations of all markers for patient $i$. The distribution of each marker may depend on additional covariates such as time, age and gender. Each marker may be measured at different time points and a different number of times.
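The notation above can be made concrete with a small sketch. The dictionary layout, the marker names and every number below are made up for illustration and are not taken from the ISDR data; the point is only that each marker keeps its own time grid and that $Y_i$ is the concatenation of the per-marker vectors.

```python
# Hypothetical layout: one dict per patient, one list of (time, value) pairs per marker.
# All values are invented for illustration.
patient = {
    "hba1c": [(0, 58.0), (180, 61.0), (400, 60.0)],
    "cholesterol": [(0, 4.9), (400, 5.2)],
    "gp_visits": [(0, 2), (365, 4)],
    "grading": [(0, 0), (365, 0), (730, 1)],
}


def stack_observations(patient, marker_order):
    """Return the stacked vector Y_i plus parallel lists of marker indices r and times t."""
    y, r_index, times = [], [], []
    for r, marker in enumerate(marker_order, start=1):
        for t, value in patient[marker]:
            y.append(value)
            r_index.append(r)
            times.append(t)
    return y, r_index, times


markers = ["hba1c", "cholesterol", "gp_visits", "grading"]
y_i, r_i, t_i = stack_observations(patient, markers)
```

Here `y_i` plays the role of $Y_i$, while `r_i` and `t_i` record which marker and which time $t_{i,r,j}$ each entry belongs to.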
13 Multivariate Generalized Linear Mixed Models To allow for different types of marker, we model each marker using a generalised linear mixed model: $$h_r^{-1}\left[E(Y_{i,r} \mid \alpha_r, b_{i,r})\right] = X_{i,r}\alpha_r + Z_{i,r} b_{i,r} \quad (1)$$ $h_r^{-1}$ is a link function chosen according to the type of longitudinal marker. $\alpha_r$ is a vector of fixed parameters for marker $r$. $b_{i,r}$ is a vector of random effects for patient $i$ for marker $r$ (i.e. subject-specific parameters). $X_{i,r}$ and $Z_{i,r}$ are matrices containing covariate information for each patient.
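As a rough illustration of equation (1), the sketch below evaluates the conditional mean of a single observation for three marker types. The family names and the pairing of each family with its usual link (identity, log, logit) are assumptions made for this example; the talk does not state which links were used beyond the models listed later.

```python
import math

# Inverse link functions: map the linear predictor eta to the conditional mean.
# The family/link pairing is an assumption for illustration.
INVERSE_LINKS = {
    "gaussian": lambda eta: eta,                            # identity link
    "poisson": lambda eta: math.exp(eta),                   # log link
    "bernoulli": lambda eta: 1.0 / (1.0 + math.exp(-eta)),  # logit link
}


def conditional_mean(family, x_row, alpha, z_row, b):
    """E(Y | alpha, b) for one observation: inverse link applied to x'alpha + z'b."""
    eta = sum(x * a for x, a in zip(x_row, alpha))
    eta += sum(z * u for z, u in zip(z_row, b))
    return INVERSE_LINKS[family](eta)


# One row of X and Z for a continuous marker with a random intercept.
mean_continuous = conditional_mean("gaussian", [1.0, 2.0], [0.5, 0.25], [1.0], [1.0])
```

The same function covers all marker types; only the entry looked up in `INVERSE_LINKS` changes.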
15 Joint Distribution of the random effects The dependence between markers is captured by the joint distribution of the random effects $b_i = (b_{i,1}, \dots, b_{i,R})$, $i = 1, \dots, N$. The most common assumption is that the random effects follow a Normal distribution: $$b_i \sim N(\mu, D) \quad (2)$$ This assumption can be difficult to verify, and additional flexibility can be achieved by allowing a mixture of Normal distributions: $$b_i \sim \sum_{k=1}^{K} w_k N(\mu_k, D_k) \quad (3)$$
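To show what the mixture in equation (3) buys, here is a minimal sketch for a single scalar random effect. Reducing $b_i$ to one dimension, and the weights, means and standard deviations used, are simplifying assumptions for illustration; the model in the talk uses multivariate components $N(\mu_k, D_k)$.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_pdf(b, weights, means, sds):
    """Density of sum_k w_k N(mu_k, sd_k^2) at b (univariate analogue of equation (3))."""
    return sum(w * normal_pdf(b, m, s) for w, m, s in zip(weights, means, sds))


def sample_mixture(rng, weights, means, sds):
    """Draw one random effect: pick a component by weight, then draw from it."""
    k = rng.choices(range(len(weights)), weights=weights)[0]
    return rng.gauss(means[k], sds[k])


rng = random.Random(0)
draws = [sample_mixture(rng, [0.5, 0.5], [-2.0, 2.0], [0.5, 0.5]) for _ in range(20000)]
```

With $K = 1$ the mixture collapses to the single Normal distribution of equation (2); with two well-separated components, as in `draws`, the random effects are bimodal, which a single Normal cannot represent.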
16 Parameter Estimation We need to estimate the following parameters. Fixed effects $\alpha = (\alpha_1, \dots, \alpha_R)$. Possible dispersion parameters $\phi = (\phi_1, \dots, \phi_R)$. Mixture weights $w = (w_1, \dots, w_K)$. Mean vectors of the random effects $\mu = (\mu_1, \dots, \mu_K)$. Covariance matrices of the random effects $(\mathrm{vec}(D_1), \dots, \mathrm{vec}(D_K))$. In all, we need to estimate $$\theta = (\alpha, \phi, w, \mu, \mathrm{vec}(D_1), \dots, \mathrm{vec}(D_K)) \quad (4)$$
18 MCMC estimates Full maximum likelihood estimates are difficult to obtain due to the complexity of the likelihood. We instead use a Bayesian approach based on MCMC, with weakly informative priors and a block Gibbs sampler. A benefit of this method, not explored in this talk, is that credible intervals for the group membership probabilities are readily available. These could be incorporated into a classification procedure in some cases.
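One way such credible intervals might be used is sketched below. It assumes that the group membership probability has been recomputed for each retained MCMC draw, giving a list of values per patient. The three-way rule in `classify_with_interval` is a hypothetical illustration, not the procedure used in the talk.

```python
def quantile(sorted_values, q):
    """Empirical quantile with linear interpolation between order statistics."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac


def credible_interval(draws, level=0.95):
    """Equal-tailed credible interval from MCMC draws of a probability."""
    ordered = sorted(draws)
    tail = (1.0 - level) / 2.0
    return quantile(ordered, tail), quantile(ordered, 1.0 - tail)


def classify_with_interval(draws, cutoff):
    """Hypothetical rule: commit to a group only when the whole interval clears the cutoff."""
    lower, upper = credible_interval(draws)
    if lower > cutoff:
        return "disease"
    if upper < cutoff:
        return "no disease"
    return "undecided"


interval = credible_interval([i / 100 for i in range(101)])
```

A patient whose interval straddles the cutoff is left undecided, which could for instance trigger an earlier follow-up visit rather than a firm allocation.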
19 Progress Map Dataset for discriminant analysis. Fitting of the multivariate mixed-effects model (MGLMM). Discriminant model built using parameters of MGLMM. Allocate new patients to diagnostic groups.
20 Longitudinal Discriminant Analysis Fit the MGLMM to the data in each diagnostic group $g$, $g = 0, \dots, G-1$, to obtain MCMC parameter estimates $\hat{\theta}_g$. Use the fitted GLMMs to derive the discriminant rule that assigns patients to two (or more) diagnostic groups. Let $\hat{P}_{g,\mathrm{new}}$ be the probability that a new observation $Y_{\mathrm{new}}$ is from group $g$. The prior probability of being in group $g$ is denoted $\pi_g$. Using Bayes' rule, $$\hat{P}_{g,\mathrm{new}} = \frac{\pi_g \hat{f}_{g,\mathrm{new}}}{\sum_{h=0}^{G-1} \pi_h \hat{f}_{h,\mathrm{new}}} \quad (5)$$ Assign a new patient to the disease group if $\hat{P}_{\mathrm{disease},\mathrm{new}}$ is greater than a specified value. If not, assign the patient to the group for which $\hat{P}_{g,\mathrm{new}}$ is largest.
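The allocation rule built on equation (5) is short enough to write out directly. In this sketch the function names, the choice of index 1 for the disease group, and the numbers in the example call are illustrative assumptions; the predictive densities are taken as given inputs.

```python
def group_probabilities(priors, densities):
    """Equation (5): posterior group probabilities from priors pi_g and densities f_g."""
    joint = [p * f for p, f in zip(priors, densities)]
    total = sum(joint)
    return [j / total for j in joint]


def allocate(priors, densities, disease_group=1, cutoff=0.5):
    """Assign to the disease group if its probability exceeds the cutoff,
    otherwise to the group with the largest probability."""
    probs = group_probabilities(priors, densities)
    if probs[disease_group] > cutoff:
        return disease_group
    return max(range(len(probs)), key=probs.__getitem__)


# Rare disease (prior 0.05), but the new patient's data fit the disease model far better.
probs = group_probabilities([0.95, 0.05], [0.01, 0.4])
group = allocate([0.95, 0.05], [0.01, 0.4], disease_group=1, cutoff=0.5)
```

Lowering `cutoff` below 0.5 trades specificity for sensitivity, which is why the cutoff is treated as a tuning value rather than fixed. In practice the densities can be very small, so working with log densities may be numerically safer.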
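21 Specifying the predictive density $f_{g,\mathrm{new}}$
Marginal prediction: $$f^{\mathrm{marg}}_{g,\mathrm{new}} = p(y_{\mathrm{new}} \mid \theta_g) \quad (6)$$
Conditional prediction: $$f^{\mathrm{cond}}_{g,\mathrm{new}} = p(y_{\mathrm{new}} \mid b_{\mathrm{new}} = \hat{b}_{g,\mathrm{new}}, \theta_g) \quad (7)$$
Random effects prediction: $$f^{\mathrm{rand}}_{g,\mathrm{new}} = p(\hat{b}_{g,\mathrm{new}} \mid \theta_g) \quad (8)$$
These values are calculated using numerical integration methods such as Gaussian quadrature, since they involve complex integrals that cannot be solved analytically.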
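To give a feel for the numerical integration behind the marginal density in equation (6), the sketch below handles the simplest non-trivial case: one binary marker with a logit link and a single Normal random intercept. That reduction, the fixed 5-point Gauss-Hermite rule, and all function names are assumptions for illustration; the real model integrates over a multivariate mixture.

```python
import math

# 5-point Gauss-Hermite rule for integrals of the form: integral of exp(-x^2) g(x) dx.
GH_NODES = [-2.0201828704560856, -0.9585724646138185, 0.0,
            0.9585724646138185, 2.0201828704560856]
GH_WEIGHTS = [0.01995324205904591, 0.3936193231522412, 0.9453087204829419,
              0.3936193231522412, 0.01995324205904591]


def expit(eta):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-eta))


def marginal_density_binary(y, fixed_part, mu_b, sd_b):
    """Approximate p(y | theta) = integral of p(y | b, theta) N(b; mu_b, sd_b^2) db.

    y: list of 0/1 observations; fixed_part: the fixed-effects linear predictor
    for each observation; the random intercept b is integrated out.
    """
    total = 0.0
    for x, w in zip(GH_NODES, GH_WEIGHTS):
        b = mu_b + math.sqrt(2.0) * sd_b * x  # change of variables to the N(mu_b, sd_b^2) scale
        likelihood = 1.0
        for y_j, eta_j in zip(y, fixed_part):
            p = expit(eta_j + b)
            likelihood *= p if y_j == 1 else 1.0 - p
        total += w * likelihood
    return total / math.sqrt(math.pi)


p_one = marginal_density_binary([1], [0.0], 0.0, 2.0)
p_zero = marginal_density_binary([0], [0.0], 0.0, 2.0)
```

The conditional density in equation (7) needs no integration: it is the inner `likelihood` evaluated once at the estimated random effect. More quadrature nodes, or an adaptive rule, would be needed for accuracy with many observations per patient.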
22 Diabetic Retinopathy example Our motivation comes from the ISDR cohort study. We consider 12,628 patients with diabetes who were screened for diabetic retinopathy between 2009 and 2013. Various markers were measured over time: HbA1c and cholesterol (continuous markers), retinopathy grading (treated as a binary marker), and number of GP visits (count variable). 600 patients had a positive screening event within the observation period. Figure: Left: Image of a diabetic eye without retinopathy. Right: Image of a diabetic eye with late-stage diabetic retinopathy (kindly provided by Dr. Yalin Zheng).
23 Example: ISDR data We consider two groups: 600 patients with a positive screening event (indicating STDR) and patients without. 80% of the patients in each group are used to train MGLMMs (one for each group). The remaining 20% of patients are used to test the classification accuracy. The end goal is to identify patients who will have a positive screening event in one year's time (so we only consider data gathered up to one year before the final visit).
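24 Example: ISDR data We fit the following models:
$$E[\log(\mathrm{HbA1c})] = \alpha_1 \mathrm{Sex} + \alpha_2 \mathrm{Age} + b_{i,0} + b_{i,1}\,\mathrm{time} \quad (9)$$
$$E[\log(\mathrm{Cholesterol})] = \alpha_3 \mathrm{Sex} + \alpha_4 \mathrm{Age} + \alpha_5\,\mathrm{time} + b_{i,2} \quad (10)$$
$$\log E[\mathrm{visit}] = \alpha_6 \mathrm{Sex} + \alpha_7 \mathrm{Age} + \alpha_8\,\mathrm{time} + b_{i,3} \quad (11)$$
$$\mathrm{logit}\, E[\mathrm{grading}] = \alpha_9 \mathrm{Sex} + \alpha_{10} \mathrm{Age} + b_{i,4} + b_{i,5}\,\mathrm{time} \quad (12)$$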
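Equations (9) to (12) translate directly into code. In the sketch below, the function name, the zero-based storage of $\alpha_1, \dots, \alpha_{10}$ and $b_{i,0}, \dots, b_{i,5}$, and the returned dictionary keys are assumptions made for illustration.

```python
import math


def expected_markers(sex, age, time, alpha, b):
    """Subject-specific expectations under equations (9)-(12).

    alpha holds alpha_1..alpha_10 as alpha[0..9]; b holds b_{i,0}..b_{i,5} as b[0..5].
    """
    log_hba1c = alpha[0] * sex + alpha[1] * age + b[0] + b[1] * time         # (9)
    log_chol = alpha[2] * sex + alpha[3] * age + alpha[4] * time + b[2]      # (10)
    log_visits = alpha[5] * sex + alpha[6] * age + alpha[7] * time + b[3]    # (11)
    logit_grading = alpha[8] * sex + alpha[9] * age + b[4] + b[5] * time     # (12)
    return {
        "log_hba1c": log_hba1c,
        "log_cholesterol": log_chol,
        "gp_visits": math.exp(log_visits),                      # inverse of the log link
        "grading_prob": 1.0 / (1.0 + math.exp(-logit_grading)), # inverse of the logit link
    }


baseline = expected_markers(sex=0, age=0.0, time=0.0, alpha=[0.0] * 10, b=[0.0] * 6)
```

Note that HbA1c and retinopathy grading have a random slope in time, whereas cholesterol and GP visits have only a random intercept with a fixed time effect.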
25 Example: ISDR data
Table: Posterior summary statistics for the fixed effects $\alpha$ in our MGLMM.
Parameter | 95% Credible Interval, No STDR Group | 95% Credible Interval, STDR Group
$\alpha_1$ | (-3.03e-03, -2.46e-03) | (-8.4e-03, -5.05e-03)
$\alpha_2$ | (-1.25e-02, 2.07e-03) | (-8.03e-02, 1.49e-02)
$\alpha_3$ | (-3.65e-03, -2.93e-03) | (-4.54e-03, -1.59e-03)
$\alpha_4$ | (-9.61e-02, -7.9e-02) | (-1.27e-01, -3.79e-02)
$\alpha_5$ | (-2.45e-05, -1.57e-05) | (-5.7e-05, 1.05e-05)
$\alpha_6$ | (2.86e-03, 4.44e-03) | (4.77e-03, 1.33e-02)
$\alpha_7$ | (-3.64e-02, 3.03e-03) | (-1.46e-01, 9.26e-02)
$\alpha_8$ | (2.8e-04, 3.18e-04) | (3.61e-04, 5.95e-04)
$\alpha_9$ | (4.2e-03, 1.45e-02) | (-2.63e-02, 1.58e-02)
$\alpha_{10}$ | (-5.64e-03, 2.45e-01) | (-8.16e-01, 5.33e-01)
26 Example: ISDR data
Table: Posterior summary statistics for the means and standard deviations of the random effects $b_i$ in our MGLMM.
Parameter | 95% Credible Interval, No STDR Group | 95% Credible Interval, STDR Group
$E[b_0]$ | (4.13, 4.17) | (4.51, 4.73)
$E[b_1]$ | (-5.18e-07, 1.26e-05) | (-8.23e-07, 7.26e-05)
$E[b_2]$ | (1.65, 1.7) | (1.55, 1.74)
$E[b_3]$ | (4.55e-01, 5.65e-01) | (2.17e-02, 5.96e-01)
$E[b_4]$ | (-3.35, -2.56) | (2.38, 5.44)
$E[b_5]$ | (-4.65e-04, -2.03e-04) | (2.82e-04, 1.3e-03)
$SD[b_0]$ | (2.63e-01, 2.8e-01) | (2.64e-01, 3.47e-01)
$SD[b_1]$ | (2.09e-04, 2.32e-04) | (1.24e-04, 2.24e-04)
$SD[b_2]$ | (1.8e-01, 1.87e-01) | (1.85e-01, 2.26e-01)
$SD[b_3]$ | (2.13e-01, 2.41e-01) | (2.4e-01, 3.86e-01)
$SD[b_4]$ | (2.3, 2.79) | (2.16, 3.59)
$SD[b_5]$ | (6.91e-04, 1.2e-03) | (6.43e-04, 1.3e-03)
27 Example: ISDR data Figure: Observed longitudinal profiles (in light blue) of log(HbA1c) and log(cholesterol) against time (days) for patients without positive screening events (left column, No STDR Group) and patients with positive screening events (right column, STDR Group). The average profile over time of a male with median age is shown in each group by the red and green lines respectively.
28 Example: ISDR data Figure: Observed longitudinal profiles (in light blue) of number of GP visits and retinopathy grading against time (days) for patients without positive screening events (left column, No STDR Group) and patients with positive screening events (right column, STDR Group). The average profile over time of a male with median age is shown in each group by the red and green lines respectively.
29 Example: ISDR data Figure: ROC plot (sensitivity against specificity) for the methods of group prediction: Marginal, Conditional, Random Effects, LDA and QDA. The ROC curves compare the predictive abilities of the three longitudinal methods of group membership prediction and the simple LDA and QDA techniques.
30 Example: ISDR data Table: The precision of the prediction of diagnostic groups for the three longitudinal methods (Marginal, Conditional, Random effects) and the classical LDA and QDA methods, reported as Cutoff, Sensitivity, Specificity, PCC and AUC. PCC = Probability of Correct Classification. AUC = Area Under Curve. LDA = Linear Discriminant Analysis. QDA = Quadratic Discriminant Analysis.
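The measures named in this table can be computed from test-set labels and predicted disease probabilities as sketched below. The function names and the toy data are illustrative assumptions and do not reproduce the ISDR results; AUC is computed here as the proportion of correctly ordered disease/non-disease pairs, counting ties as one half.

```python
def classification_summary(labels, probs, cutoff):
    """Sensitivity, specificity and PCC when predicting disease if prob > cutoff."""
    tp = fn = tn = fp = 0
    for y, p in zip(labels, probs):
        predicted = 1 if p > cutoff else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    pcc = (tp + tn) / len(labels)
    return sensitivity, specificity, pcc


def auc(labels, probs):
    """Area under the ROC curve via pairwise comparison of cases and non-cases."""
    pos = [p for y, p in zip(labels, probs) if y == 1]
    neg = [p for y, p in zip(labels, probs) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


labels = [1, 1, 0, 0, 0]
probs = [0.9, 0.3, 0.2, 0.6, 0.1]
sens, spec, pcc = classification_summary(labels, probs, cutoff=0.5)
area = auc(labels, probs)
```

Sensitivity, specificity and PCC depend on the chosen cutoff, whereas AUC summarises performance across all cutoffs, which is why the table reports both.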
31 There is a definite advantage to using longitudinal information in comparison to simply applying LDA (or QDA) to the last observations for each patient. The marginal prediction method gives the best classification for the ISDR data (on all measures). Our methodology is able to obtain promising classification results by incorporating markers of different types.
34 Further work Can we make more use of the credible intervals that are readily available from the MCMC procedure? Can we identify the ideal timing of the next screening interval? Can we include categorical longitudinal outcomes within this framework?
37 Acknowledgements Joint work with Arnošt Komárek (Charles University in Prague), Gabriela Czanner, Christopher P. Cheyne, Simon Harding and Marta García-Fiñana. We are grateful for the support of the ISDR team. We acknowledge support from the Medical Research Council (Research project MR/L010909/1). García-Fiñana M, Czanner G, Cox T, Bonnett L, Harding S, Marson T. Function for Longitudinal Data: Applications in Medical Research ( ) funded by MRC MRP (£334,170)
38 References
Brant, L.J., Sheng, S.L., Morrell, C.H., Verbeke, G.N., Lesaffre, E. and Carter, H.B. (2003) Screening for prostate cancer by using random-effects models. Journal of the Royal Statistical Society: Series A, 166(1):51-62.
Fieuws, S., Verbeke, G., Maes, B. and Vanrenterghem, Y. (2008) Predicting renal graft failure using multivariate longitudinal profiles. Biostatistics, 9(3).
Komárek, A., Hansen, B.E., Kuiper, E.M.M., van Buuren, H.R. and Lesaffre, E. (2010) Discriminant analysis using a multivariate linear mixed model with a normal mixture in the random effects distribution. Statistics in Medicine, 29(30).
Komárek, A. and Komárková, L. (2013) Clustering for multivariate continuous and discrete longitudinal data. The Annals of Applied Statistics, 7(1).
Lix, L.M. and Sajobi, T.T. (2010) Discriminant analysis for repeated measures data: a review. Frontiers in Psychology, 1, Article 146.
Marshall, G., De la Cruz-Mesía, R., Quintana, F.A. and Baron, A.E. (2009) Discriminant Analysis for Longitudinal Data with Multiple Continuous Responses and Possibly Missing Data. Biometrics, 65.
Morrell, C.H., Brant, L.J., Sheng, S.L. and Metter, E.J. (2012) Screening for prostate cancer using multivariate mixed-effects models. Journal of Applied Statistics, 39(6).
Tomasko, L., Helms, R.W. and Snapinn, S.M. (1999) A discriminant analysis extension to mixed models. Statistics in Medicine, 18(10).
Wernecke, K.-D., Kalb, G., Schink, T. and Wegner, B. (2004) A mixed model approach to discriminant analysis with longitudinal data. Biometrical Journal, 46(2).
More informationA class of latent marginal models for capture-recapture data with continuous covariates
A class of latent marginal models for capture-recapture data with continuous covariates F Bartolucci A Forcina Università di Urbino Università di Perugia FrancescoBartolucci@uniurbit forcina@statunipgit
More informationarxiv: v1 [stat.ap] 6 Apr 2018
Individualized Dynamic Prediction of Survival under Time-Varying Treatment Strategies Grigorios Papageorgiou 1, 2, Mostafa M. Mokhles 2, Johanna J. M. Takkenberg 2, arxiv:1804.02334v1 [stat.ap] 6 Apr 2018
More informationMachine Learning. Regression-Based Classification & Gaussian Discriminant Analysis. Manfred Huber
Machine Learning Regression-Based Classification & Gaussian Discriminant Analysis Manfred Huber 2015 1 Logistic Regression Linear regression provides a nice representation and an efficient solution to
More informationFundamentals to Biostatistics. Prof. Chandan Chakraborty Associate Professor School of Medical Science & Technology IIT Kharagpur
Fundamentals to Biostatistics Prof. Chandan Chakraborty Associate Professor School of Medical Science & Technology IIT Kharagpur Statistics collection, analysis, interpretation of data development of new
More informationHierarchical Hurdle Models for Zero-In(De)flated Count Data of Complex Designs
for Zero-In(De)flated Count Data of Complex Designs Marek Molas 1, Emmanuel Lesaffre 1,2 1 Erasmus MC 2 L-Biostat Erasmus Universiteit - Rotterdam Katholieke Universiteit Leuven The Netherlands Belgium
More informationLatent Variable Models for Binary Data. Suppose that for a given vector of explanatory variables x, the latent
Latent Variable Models for Binary Data Suppose that for a given vector of explanatory variables x, the latent variable, U, has a continuous cumulative distribution function F (u; x) and that the binary
More informationCTDL-Positive Stable Frailty Model
CTDL-Positive Stable Frailty Model M. Blagojevic 1, G. MacKenzie 2 1 Department of Mathematics, Keele University, Staffordshire ST5 5BG,UK and 2 Centre of Biostatistics, University of Limerick, Ireland
More informationBayesian variable selection and classification with control of predictive values
Bayesian variable selection and classification with control of predictive values Eleni Vradi 1, Thomas Jaki 2, Richardus Vonk 1, Werner Brannath 3 1 Bayer AG, Germany, 2 Lancaster University, UK, 3 University
More informationBayesian nonparametric estimation of finite population quantities in absence of design information on nonsampled units
Bayesian nonparametric estimation of finite population quantities in absence of design information on nonsampled units Sahar Z Zangeneh Robert W. Keener Roderick J.A. Little Abstract In Probability proportional
More informationBIOS 312: Precision of Statistical Inference
and Power/Sample Size and Standard Errors BIOS 312: of Statistical Inference Chris Slaughter Department of Biostatistics, Vanderbilt University School of Medicine January 3, 2013 Outline Overview and Power/Sample
More informationPrinciples of Bayesian Inference
Principles of Bayesian Inference Sudipto Banerjee 1 and Andrew O. Finley 2 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. 2 Department of Forestry & Department
More informationBayesian methods for missing data: part 1. Key Concepts. Nicky Best and Alexina Mason. Imperial College London
Bayesian methods for missing data: part 1 Key Concepts Nicky Best and Alexina Mason Imperial College London BAYES 2013, May 21-23, Erasmus University Rotterdam Missing Data: Part 1 BAYES2013 1 / 68 Outline
More informationNaïve Bayes classification. p ij 11/15/16. Probability theory. Probability theory. Probability theory. X P (X = x i )=1 i. Marginal Probability
Probability theory Naïve Bayes classification Random variable: a variable whose possible values are numerical outcomes of a random phenomenon. s: A person s height, the outcome of a coin toss Distinguish
More informationLinear Discriminant Analysis Based in part on slides from textbook, slides of Susan Holmes. November 9, Statistics 202: Data Mining
Linear Discriminant Analysis Based in part on slides from textbook, slides of Susan Holmes November 9, 2012 1 / 1 Nearest centroid rule Suppose we break down our data matrix as by the labels yielding (X
More informationIntroduction to Machine Learning. Maximum Likelihood and Bayesian Inference. Lecturers: Eran Halperin, Yishay Mansour, Lior Wolf
1 Introduction to Machine Learning Maximum Likelihood and Bayesian Inference Lecturers: Eran Halperin, Yishay Mansour, Lior Wolf 2013-14 We know that X ~ B(n,p), but we do not know p. We get a random sample
More informationUniversity of California, Berkeley
University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 2008 Paper 241 A Note on Risk Prediction for Case-Control Studies Sherri Rose Mark J. van der Laan Division
More informationMissing Data Issues in the Studies of Neurodegenerative Disorders: the Methodology
Missing Data Issues in the Studies of Neurodegenerative Disorders: the Methodology Sheng Luo, PhD Associate Professor Department of Biostatistics & Bioinformatics Duke University Medical Center sheng.luo@duke.edu
More informationDisease mapping with Gaussian processes
EUROHEIS2 Kuopio, Finland 17-18 August 2010 Aki Vehtari (former Helsinki University of Technology) Department of Biomedical Engineering and Computational Science (BECS) Acknowledgments Researchers - Jarno
More informationUniversity of Bristol - Explore Bristol Research. Peer reviewed version. Link to published version (if available): /rssa.
Goldstein, H., Carpenter, J. R., & Browne, W. J. (2014). Fitting multilevel multivariate models with missing data in responses and covariates that may include interactions and non-linear terms. Journal
More informationLinear Methods for Prediction
Chapter 5 Linear Methods for Prediction 5.1 Introduction We now revisit the classification problem and focus on linear methods. Since our prediction Ĝ(x) will always take values in the discrete set G we
More informationRepeated ordinal measurements: a generalised estimating equation approach
Repeated ordinal measurements: a generalised estimating equation approach David Clayton MRC Biostatistics Unit 5, Shaftesbury Road Cambridge CB2 2BW April 7, 1992 Abstract Cumulative logit and related
More informationPart 6: Multivariate Normal and Linear Models
Part 6: Multivariate Normal and Linear Models 1 Multiple measurements Up until now all of our statistical models have been univariate models models for a single measurement on each member of a sample of
More informationMixtures of Negative Binomial distributions for modelling overdispersion in RNA-Seq data
Mixtures of Negative Binomial distributions for modelling overdispersion in RNA-Seq data Cinzia Viroli 1 joint with E. Bonafede 1, S. Robin 2 & F. Picard 3 1 Department of Statistical Sciences, University
More informationPerformance of INLA analysing bivariate meta-regression and age-period-cohort models
Performance of INLA analysing bivariate meta-regression and age-period-cohort models Andrea Riebler Biostatistics Unit, Institute of Social and Preventive Medicine University of Zurich INLA workshop, May
More informationBayesian Multivariate Logistic Regression
Bayesian Multivariate Logistic Regression Sean M. O Brien and David B. Dunson Biostatistics Branch National Institute of Environmental Health Sciences Research Triangle Park, NC 1 Goals Brief review of
More informationWeb-based Supplementary Material for A Two-Part Joint. Model for the Analysis of Survival and Longitudinal Binary. Data with excess Zeros
Web-based Supplementary Material for A Two-Part Joint Model for the Analysis of Survival and Longitudinal Binary Data with excess Zeros Dimitris Rizopoulos, 1 Geert Verbeke, 1 Emmanuel Lesaffre 1 and Yves
More informationPart IV Extensions: Competing Risks Endpoints and Non-Parametric AUC(t) Estimation
Part IV Extensions: Competing Risks Endpoints and Non-Parametric AUC(t) Estimation Patrick J. Heagerty PhD Department of Biostatistics University of Washington 166 ISCB 2010 Session Four Outline Examples
More informationSimultaneous inference for multiple testing and clustering via a Dirichlet process mixture model
Simultaneous inference for multiple testing and clustering via a Dirichlet process mixture model David B Dahl 1, Qianxing Mo 2 and Marina Vannucci 3 1 Texas A&M University, US 2 Memorial Sloan-Kettering
More informationLehmann Family of ROC Curves
Memorial Sloan-Kettering Cancer Center From the SelectedWorks of Mithat Gönen May, 2007 Lehmann Family of ROC Curves Mithat Gonen, Memorial Sloan-Kettering Cancer Center Glenn Heller, Memorial Sloan-Kettering
More informationSTAC51: Categorical data Analysis
STAC51: Categorical data Analysis Mahinda Samarakoon January 26, 2016 Mahinda Samarakoon STAC51: Categorical data Analysis 1 / 32 Table of contents Contingency Tables 1 Contingency Tables Mahinda Samarakoon
More informationA Robust Approach to Regularized Discriminant Analysis
A Robust Approach to Regularized Discriminant Analysis Moritz Gschwandtner Department of Statistics and Probability Theory Vienna University of Technology, Austria Österreichische Statistiktage, Graz,
More information