Robust Bayesian Variable Selection for Modeling Mean Medical Costs


1 Robust Bayesian Variable Selection for Modeling Mean Medical Costs
Grace Yoon (1), Wenxin Jiang (2), Lei Liu (3) and Ya-Chen T. Shih (4)
(1) Department of Statistics, Texas A&M University
(2) Department of Statistics, Northwestern University
(3) Division of Biostatistics, Washington University in St. Louis
(4) Department of Health Services Research, The University of Texas MD Anderson Cancer Center

2 Rising Medical Costs
Medical costs in the U.S. have risen faster than general inflation for decades.
Healthcare spending per capita exceeded $10,000 for the first time in 2016 (Keehan et al. 2016).
Rapid growth in medical costs in the U.S. has been a major policy concern: the Affordable Care Act (Obamacare) and the efforts to repeal it under President Trump.
Analyzing medical cost data has become increasingly important, because understanding the impact of any cost containment initiative requires accurate and reliable statistical inference on medical costs.

3 Healthcare spending in 2016, % of GDP (China: 6.2%)

4 Medical Cost Data
Medical cost data are collected routinely by hospitals, government agencies, and health insurance companies.
Modeling medical costs is of great interest in health economics studies.
Goal: to identify the risk factors of medical costs and ascertain the most cost-effective treatment, which in turn can assist policy makers in maximizing health benefits for individuals and society.

5 Medical Expenditure Panel Survey (MEPS)
The Medical Expenditure Panel Survey (MEPS), funded by AHRQ, is the most complete source of data on the cost and use of health care and health insurance coverage. It has three components: Household, Insurance/Employer, and Medical Provider.
Survey design: a probability sample designed to provide nationally representative estimates of health care use, medical expenditures, health attitudes, and health status.
Data structure: an overlapping panel in which each individual has at most 2 years of data.
More information is available on the MEPS website. The data are publicly available and free.

6 Challenges in Modeling Medical Costs
Medical costs are often right-skewed and heteroscedastic.
Chen et al. (2013) and Chen et al. (2016) developed models that make assumptions on the mean cost only and avoid restrictive assumptions on higher-order moments. (Goal: Robust)
However, they did not take variable selection into account in their predictive models. (Goal: Parsimonious)
A Bayesian approach can provide posterior probability ratios for the selected models. (Goal: Informative)

7 Model
Consider a log-linear mean model without any additional distributional assumption on the response variable:
$$E(Y_i \mid X_i) = \mu(X_i, \beta) = e^{X_i^T \beta}, \quad i = 1, \dots, n, \tag{1}$$
where $X_i = (1, x_{i1}, x_{i2}, \dots, x_{ip})^T$ and $\beta = (\beta_0, \beta_1, \beta_2, \dots, \beta_p)^T$.

8 Asymptotically normal estimate $\hat\beta$ based on the full model
Estimating equation (also the score equation) for the Poisson distribution:
$$S(\beta) = \sum_{i=1}^{n} X_i \left( Y_i - e^{X_i^T \beta} \right) = 0.$$
$\hat\beta$ is asymptotically normal around the true parameter $\beta$ of interest, even if the true model of $Y_i$ is NOT Poisson.
The naive Poisson log-likelihood is typically globally concave, so the maximizer $\hat\beta$ is unique and easy to compute.
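The estimating equation above can be solved with a few Newton steps. The following is a minimal sketch, not the authors' implementation: the function name `fit_log_linear_mean`, the starting value, and the stopping rule are illustrative assumptions, and the first column of `X` is assumed to be the intercept.

```python
import numpy as np


def fit_log_linear_mean(X, y, n_iter=50, tol=1e-10):
    """Solve S(beta) = sum_i X_i (y_i - exp(X_i^T beta)) = 0 by Newton's method.

    X is an (n, p+1) design matrix whose first column is assumed to be ones;
    y is a positive response that does not need to be Poisson distributed.
    """
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())  # assumed start: intercept-only fit
    for _ in range(n_iter):
        mu = np.exp(X @ beta)
        score = X.T @ (y - mu)                  # S(beta)
        information = X.T @ (X * mu[:, None])   # minus the derivative of S(beta)
        step = np.linalg.solve(information, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta
```

Because the Poisson log-likelihood is concave in $\beta$, plain Newton steps are usually sufficient here; a step-halving safeguard could be added for badly scaled covariates.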

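9 Sandwich formula
Since the Poisson model is a naive probability model, the asymptotic variance $V$ of $\hat\beta$ should be estimated by a robust sandwich formula:
$$\hat V = \left( \sum_{i=1}^{n} X_i X_i^T e^{X_i^T \hat\beta} \right)^{-1} \left( \sum_{i=1}^{n} X_i \left( Y_i - e^{X_i^T \hat\beta} \right) \left( Y_i - e^{X_i^T \hat\beta} \right)^T X_i^T \right) \left( \sum_{i=1}^{n} X_i X_i^T e^{X_i^T \hat\beta} \right)^{-1}$$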
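The sandwich formula translates almost term by term into matrix code. This is a minimal sketch with an assumed function name (`sandwich_variance`); it takes the fitted $\hat\beta$ as given.

```python
import numpy as np


def sandwich_variance(X, y, beta_hat):
    """Robust (sandwich) variance estimate for the log-linear mean model."""
    mu = np.exp(X @ beta_hat)
    # "Bread": inverse of sum_i X_i X_i^T exp(X_i^T beta_hat)
    bread = np.linalg.inv(X.T @ (X * mu[:, None]))
    # "Meat": sum_i X_i (y_i - mu_i)^2 X_i^T
    resid = y - mu
    meat = X.T @ (X * (resid ** 2)[:, None])
    return bread @ meat @ bread
```

If the squared residuals happened to equal $\mu_i$ exactly, the meat would equal the inverse of the bread and $\hat V$ would reduce to the usual Poisson model-based variance; the sandwich form matters precisely when the variance is not equal to the mean.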

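10 Posterior probability
Suppose we have an asymptotically normal estimate $\hat\beta$ of the parameter $\beta$, so that $\hat\beta \mid \beta, U \sim N(\beta, V)$.
Together with a normal prior $\beta \mid U \sim N(0, U)$, we can integrate exactly: $\hat\beta \mid U \sim N(0, U + V)$, that is,
$$p(\hat\beta \mid U) = \frac{1}{\sqrt{\det(2\pi(U + V))}} \exp\left\{ -\frac{1}{2} \hat\beta^T (U + V)^{-1} \hat\beta \right\}.$$
Then the posterior distribution for model $U$ can be obtained from $p(U \mid \hat\beta) \propto p(\hat\beta \mid U)\, p(U)$.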
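Since the marginal density is available in closed form, it can be evaluated directly on the log scale. The sketch below uses an assumed function name (`log_marginal_density`) and works with the log-determinant for numerical stability.

```python
import numpy as np


def log_marginal_density(beta_hat, U, V):
    """log p(beta_hat | U) for beta_hat | U ~ N(0, U + V)."""
    S = U + V
    _, logdet = np.linalg.slogdet(2.0 * np.pi * S)
    quad = beta_hat @ np.linalg.solve(S, beta_hat)
    return -0.5 * (logdet + quad)
```

Under equal prior model probabilities $p(U)$ (an assumption, since the slides do not state the model prior), the ratio of posterior probabilities of two models is the exponential of the difference of their log marginal densities.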

11 Spike-or-Slab (SorS) priors
Spike-or-Slab (SorS) priors: $\beta \sim N(0, U)$, where $U = \mathrm{diag}(u_0, u_1, \dots, u_p)$ and $u_j$ is the prior variance of $\beta_j$, taking either a prespecified small value (spike variance) or a prespecified large value (slab variance).
This strategy approximately selects the component $\beta_j$ if $u_j$ is large, and neglects $\beta_j$ if $u_j$ is small.
Let $u_0$ be large so that the intercept is always in the model.

12 How to choose spike and slab variance?
$u_j = \mathrm{var}(\beta_j)$, with $u_j / V_{jj} \in \{a \text{ (spike variance)},\ A \text{ (slab variance)}\}$, where $V_{jj}$ is the $j$th diagonal element of $V = \mathrm{Var}(\hat\beta)$.
$a$ and $A$ play the role of tuning parameters.
Fix $A$ to be relatively large, to avoid unnecessary penalization of the selected coefficients.
Set $a$ to a small but positive value, rather than zero, to absorb negligible nonzero coefficients into the spike distribution.
In practice, choose $a$ by cross-validation and set $A = n$.
In the simulations and the real data application, we performed 10-fold cross-validation to select an optimal tuning parameter $a$ based on the RMSE (Root Mean Square Error).

13 Model selection procedure
Step 1. Calculate the Z-statistic for each variable in the full model using the sandwich variance estimator ($p < n$): $Z_j = \hat\beta_j / \sqrt{\hat V_{jj}}$ for $j = 1, \dots, p$.
Step 2. Rank all $p$ variables by the absolute value of their Z-statistics.

14 Step 3. Compare the posterior probabilities of the $p$ candidate models in the Z-scope
$$\Phi_Z = \{ M_{(1)}, M_{(2)}, \dots, M_{(p)} \}, \quad M_{(j)} = \{ k : k \in \{1, \dots, p\},\ |Z_k| \ge |Z_{(j)}| \} \text{ for each } j = 1, \dots, p.$$
In other words, each candidate model contains the $d$ variables with the largest $|Z_j| = |\hat\beta_j| / \sqrt{\hat V_{jj}}$. These $d$ variables receive large prior variances ($u_j = A \hat V_{jj}$ for large $A$), and the other $p - d$ receive small ones ($u_j = a \hat V_{jj}$ for small $a$).
The posterior $p(U \mid \hat\beta)$ is then computed for each of the $p$ candidate models, and the best model is the one with the highest posterior probability.
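Steps 1 to 3 can be sketched compactly in code. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, index 0 of `beta_hat` is taken to be the intercept (always given the slab variance), and all candidate models are given equal prior probability, so comparing posteriors reduces to comparing marginal densities.

```python
import numpy as np


def log_marginal_density(beta_hat, u, V):
    """log N(beta_hat; 0, diag(u) + V)."""
    S = np.diag(u) + V
    _, logdet = np.linalg.slogdet(2.0 * np.pi * S)
    return -0.5 * (logdet + beta_hat @ np.linalg.solve(S, beta_hat))


def select_model_z_scope(beta_hat, V_hat, a, A):
    """Return the selected covariate indices and the log scores of all candidates."""
    v = np.diag(V_hat)
    # Step 1: Z-statistics for the p covariates (intercept excluded).
    z = np.abs(beta_hat[1:]) / np.sqrt(v[1:])
    # Step 2: rank covariates by decreasing |Z|; +1 restores original indices.
    order = np.argsort(-z) + 1
    # Step 3: score the nested candidates M_(1), ..., M_(p).
    log_post = np.empty(len(order))
    for d in range(1, len(order) + 1):
        u = a * v                         # spike variance everywhere ...
        u[0] = A * v[0]                   # ... except the intercept ...
        u[order[:d]] = A * v[order[:d]]   # ... and the top-d covariates (slab)
        log_post[d - 1] = log_marginal_density(beta_hat, u, V_hat)
    best_d = int(np.argmax(log_post)) + 1
    return sorted(order[:best_d].tolist()), log_post
```

Restricting the search to the $p$ nested models in the Z-scope keeps the cost linear in $p$, instead of scoring all $2^p$ subsets.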

15 Simulation setting
$R = 100$ data sets, each of size $n = 1000$.
4 true variables among $p = 50$ (including the intercept), with mean $\mu_i = e^{X_i^T \beta}$ and $\beta = (2, 2, 2, 2, 2, 0, \dots, 0)$.
2 by 2 design: 2 heteroscedasticity levels and 2 correlation structures for the covariates.
Heteroscedasticity level:
(Case 1) $Y_i \sim \mathrm{Gamma}(\mu_i, 1)$, that is, $E(Y_i) = \mu_i$, $V(Y_i) = \mu_i$.
(Case 2) $Y_i \sim \mathrm{Gamma}(1, 1/\mu_i)$, that is, $E(Y_i) = \mu_i$, $V(Y_i) = \mu_i^2$.
Correlation structure for the predictors:
(Independent) $x_1, \dots, x_p$ are iid Bernoulli(0.5).
(Correlated) $x_1, \dots, x_p \sim$ Bernoulli(0.5) with $\mathrm{corr}(x_j, x_k) = 0.5^{|j-k|}$ for $1 \le j, k \le p$.
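The two heteroscedasticity cases differ only in how the Gamma distribution is parameterized. The sketch below generates one data set with independent predictors; the function name `simulate_costs` is an assumption, the Gamma parameters are read as (shape, rate), and the correlated-predictor design is not covered.

```python
import numpy as np


def simulate_costs(n, beta, case, rng):
    """Simulate (X, y, mu) with iid Bernoulli(0.5) covariates and Gamma costs.

    beta includes the intercept, so the design has len(beta) - 1 covariates.
    """
    p = len(beta)
    X = np.column_stack([np.ones(n), rng.binomial(1, 0.5, size=(n, p - 1))])
    mu = np.exp(X @ beta)
    if case == 1:
        # shape mu_i, rate 1: E(Y) = mu_i, Var(Y) = mu_i
        y = rng.gamma(shape=mu, scale=1.0)
    elif case == 2:
        # shape 1, rate 1/mu_i (scale mu_i): E(Y) = mu_i, Var(Y) = mu_i^2
        y = rng.gamma(shape=1.0, scale=mu)
    else:
        raise ValueError("case must be 1 or 2")
    return X, y, mu
```

For example, `simulate_costs(1000, beta, case=2, rng=np.random.default_rng(0))` with a length-50 `beta` would produce one data set of the size described on the slide.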

16 Model Comparison
Full model (without variable selection)
LASSO
sslasso (spike-and-slab lasso generalized linear models; Tang et al. (2017), BhGLM R package)

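17 Model Comparison Criteria
RMSE of coefficient estimates: $\sqrt{\sum_{j=0}^{p} (\beta_j - \hat\beta_j)^2 / (p + 1)}$.
Selected model size.
Coverage probability: $\mathrm{Cov} = \sum_{r=1}^{R} I(M \subseteq \hat M^{(r)}) / R$.
Percentage of correct zeros: $\mathrm{Cor0} = \sum_{r=1}^{R} \sum_{j=1}^{p} I(\hat\beta_j^{(r)} = 0, \beta_j = 0) / [R(p - p_0)]$.
Percentage of incorrect zeros: $\mathrm{Inc0} = \sum_{r=1}^{R} \sum_{j=1}^{p} I(\hat\beta_j^{(r)} = 0, \beta_j \ne 0) / [R p_0]$.
Exact selection probability: $\mathrm{Ext} = \sum_{r=1}^{R} I(M = \hat M^{(r)}) / R$.
Here $M$ and $\hat M^{(r)}$ denote the true model and the model selected on the $r$th generated data set, respectively, and $p_0$ is the number of nonzero true coefficients.
Average accuracy rate of variable selection: $\mathrm{Acc} = \sum_{j=1}^{p} I(\hat\gamma_j = \gamma_j) / p$, where $\gamma_j = I(\beta_j \ne 0)$ and $\hat\gamma_j = I(\hat\beta_j \ne 0)$.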
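The selection criteria can be computed from boolean inclusion indicators. In this sketch the function name `selection_metrics` and the input layout are assumptions: `true_nonzero` has one entry per covariate, `selected` has one row per replication, and the true model is assumed to contain both zero and nonzero coefficients.

```python
import numpy as np


def selection_metrics(true_nonzero, selected):
    """Cov, Cor0, Inc0, Ext and Acc from inclusion indicators.

    true_nonzero: bool array of shape (p,), True where beta_j != 0.
    selected: bool array of shape (R, p), True where the estimate is nonzero.
    """
    true_nonzero = np.asarray(true_nonzero, dtype=bool)
    selected = np.asarray(selected, dtype=bool)
    R, p = selected.shape
    p0 = int(true_nonzero.sum())
    return {
        # true model contained in the selected model
        "Cov": np.mean(np.all(selected[:, true_nonzero], axis=1)),
        # true zeros estimated as zero
        "Cor0": np.sum(~selected[:, ~true_nonzero]) / (R * (p - p0)),
        # true nonzeros estimated as zero
        "Inc0": np.sum(~selected[:, true_nonzero]) / (R * p0),
        # selected model equals the true model
        "Ext": np.mean(np.all(selected == true_nonzero, axis=1)),
        # per-variable agreement, averaged over replications
        "Acc": np.mean(selected == true_nonzero),
    }
```

High Cov with low Ext indicates over-selection: the true variables are retained, but extra ones come along with them.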

18 RMSE of cost estimates (prediction of Y )

19 RMSE of coefficient estimates (estimation of β)

20 Model size (selection performance)

21 Table (numeric entries not preserved): Cov, Cor0, Inc0, Ext and Acc for the Oracle model and for Full, LASSO, sslasso and SorS, under Case 1 ($\mathrm{var}(Y_i) = \mu_i$) and Case 2 ($\mathrm{var}(Y_i) = \mu_i^2$), each with independent and correlated predictors.

22 Summary of Simulation Studies
In summary, SorS performs better than the competing methods in terms of selection, prediction, and estimation.

23 Our Subset of MEPS data
n = 3,376 and p = 33, from the 2014 MEPS full-year consolidated data.
The mean medical cost is $10,321 and the median is $4,…
The standard deviation is $17,966.
Correlations between covariates lie in (-0.55, 0.6).

24 Application to MEPS data
To assess the performance of the variable selection methods, randomly sample half of the data for training and use the remaining half for testing (repeated 100 times).
Table (numeric entries not preserved): RMSE and model size for SorS, sslasso, LASSO and Full.

25 Variable Selection: Informativeness The model with 10 variables has the highest posterior probability. The second best model with 11 variables is 21% as likely as the best model. The third best model with 12 variables is 5.7% as likely as the best model.

26 Application to MEPS data
The model with the largest posterior probability.
Table (numeric entries not preserved): Estimate, s.e. and p-value for (Intercept), HOSPEXP, INSCOV, DIABETES, PCS, ANYLMT, EMERG, CANCER, STRK, CORHRT and EDUCAT.

27 Interpretation of Results
Hospitalization (HOSPEXP) and emergency room visits (EMERG) are associated with much higher medical costs (3.4 times and 1.3 times, respectively).
Heart and blood vessel disease (CORHRT and STRK), body movement disorder (ANYLMT), cancer, and diabetes are all significantly associated with annual medical costs.
Gender and race variables are not selected in our model.
Expected medical costs are higher for the more highly educated: more educated individuals tend to care more about their health, and are probably more likely to have regular checkups and to spend more on better treatment.
The Physical Composite Score (PCS), a quality of life measure, has a negative association with medical costs.

28 Discussion
Simultaneously accounts for severe skewness and heteroscedasticity of the medical cost data.
Fits the data on the original scale, without any transformation of the response variable.
Goals: Robust, Parsimonious and Informative.
Limitation: applicable to relatively low-dimensional, large-sample data.
"Analysis of Medical Cost Data: Statistical and Econometric Tools" (with Tina Shih), under contract with Cambridge University Press.

29 Thank you!

30 References
Chen, J., Liu, L., Zhang, D. and Shih, Y.-C. T. (2013) A flexible model for the mean and variance functions, with application to medical cost data. Statistics in Medicine 32(24).
Chen, J., Liu, L., Zhang, D., Shih, Y.-C. T. and Severini, T. (2016) A flexible model for correlated medical costs, with application to medical expenditure panel survey data. Statistics in Medicine 35.
Jiang, W. and Li, X. (2004) Consistent model selection based on parameter estimates. Journal of Statistical Planning and Inference 121.
Tang, Z., Shen, Y., Zhang, X. and Yi, N. (2017) The spike-and-slab lasso generalized linear models for prediction and associated genes detection. Genetics 205(1).
Zheng, X. and Loh, W.-Y. (1995) Consistent variable selection in linear models. Journal of the American Statistical Association 90(429).

31 Performance of SorS for each a: Exact Selection Probability (figure; panels: Case 1 indep, Case 1 corr, Case 2 indep, Case 2 corr; horizontal axis: a)

32 Performance of SorS for each a: Model Size (figure; panels: Case 1 indep, Case 1 corr, Case 2 indep, Case 2 corr; horizontal axis: a)

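33 Prior variance and shrinkage
Simple example: $Y \mid \beta \sim N(X\beta, \sigma^2 I_n)$ and $\beta \sim N(0, c^2 \sigma^2 I_p)$.
Then the posterior distribution is
$$\beta \mid y \sim N\left( \left( X^T X + \frac{1}{c^2} I_p \right)^{-1} X^T Y,\ \sigma^2 \left( X^T X + \frac{1}{c^2} I_p \right)^{-1} \right),$$
$$E(\beta \mid \hat\beta) = \left( X^T X + \frac{1}{c^2} I_p \right)^{-1} X^T Y = \left( I_p + \frac{1}{c^2} \left( X^T X \right)^{-1} \right)^{-1} \hat\beta.$$
As $c^2 \to 0$, the shrinkage is larger.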
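The two expressions for the posterior mean, and the effect of the prior variance on shrinkage, can be checked numerically. This is a small sketch with assumed function names (`posterior_mean`, `shrunk_ols`); here $\hat\beta$ is the ordinary least squares estimate.

```python
import numpy as np


def posterior_mean(X, y, c2):
    """(X^T X + I / c^2)^{-1} X^T y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + np.eye(p) / c2, X.T @ y)


def shrunk_ols(X, y, c2):
    """(I + (X^T X)^{-1} / c^2)^{-1} beta_hat, with beta_hat the OLS estimate."""
    p = X.shape[1]
    gram = X.T @ X
    beta_hat = np.linalg.solve(gram, X.T @ y)
    return np.linalg.solve(np.eye(p) + np.linalg.inv(gram) / c2, beta_hat)
```

Both functions return the same vector for any $c^2 > 0$. Its norm shrinks toward zero as $c^2$ decreases and approaches the OLS estimate as $c^2$ grows, which is the behavior the spike (small variance) and slab (large variance) rely on.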

Consistent high-dimensional Bayesian variable selection via penalized credible regions

Consistent high-dimensional Bayesian variable selection via penalized credible regions Consistent high-dimensional Bayesian variable selection via penalized credible regions Howard Bondell bondell@stat.ncsu.edu Joint work with Brian Reich Howard Bondell p. 1 Outline High-Dimensional Variable

More information

Tail negative dependence and its applications for aggregate loss modeling

Tail negative dependence and its applications for aggregate loss modeling Tail negative dependence and its applications for aggregate loss modeling Lei Hua Division of Statistics Oct 20, 2014, ISU L. Hua (NIU) 1/35 1 Motivation 2 Tail order Elliptical copula Extreme value copula

More information

Building a Prognostic Biomarker

Building a Prognostic Biomarker Building a Prognostic Biomarker Noah Simon and Richard Simon July 2016 1 / 44 Prognostic Biomarker for a Continuous Measure On each of n patients measure y i - single continuous outcome (eg. blood pressure,

More information

Shrinkage Methods: Ridge and Lasso

Shrinkage Methods: Ridge and Lasso Shrinkage Methods: Ridge and Lasso Jonathan Hersh 1 Chapman University, Argyros School of Business hersh@chapman.edu February 27, 2019 J.Hersh (Chapman) Ridge & Lasso February 27, 2019 1 / 43 1 Intro and

More information

Propensity Score Weighting with Multilevel Data

Propensity Score Weighting with Multilevel Data Propensity Score Weighting with Multilevel Data Fan Li Department of Statistical Science Duke University October 25, 2012 Joint work with Alan Zaslavsky and Mary Beth Landrum Introduction In comparative

More information

Biostatistics-Lecture 16 Model Selection. Ruibin Xi Peking University School of Mathematical Sciences

Biostatistics-Lecture 16 Model Selection. Ruibin Xi Peking University School of Mathematical Sciences Biostatistics-Lecture 16 Model Selection Ruibin Xi Peking University School of Mathematical Sciences Motivating example1 Interested in factors related to the life expectancy (50 US states,1969-71 ) Per

More information

Bayesian regression tree models for causal inference: regularization, confounding and heterogeneity

Bayesian regression tree models for causal inference: regularization, confounding and heterogeneity Bayesian regression tree models for causal inference: regularization, confounding and heterogeneity P. Richard Hahn, Jared Murray, and Carlos Carvalho June 22, 2017 The problem setting We want to estimate

More information

Selection of Smoothing Parameter for One-Step Sparse Estimates with L q Penalty

Selection of Smoothing Parameter for One-Step Sparse Estimates with L q Penalty Journal of Data Science 9(2011), 549-564 Selection of Smoothing Parameter for One-Step Sparse Estimates with L q Penalty Masaru Kanba and Kanta Naito Shimane University Abstract: This paper discusses the

More information

Sparse Linear Models (10/7/13)

Sparse Linear Models (10/7/13) STA56: Probabilistic machine learning Sparse Linear Models (0/7/) Lecturer: Barbara Engelhardt Scribes: Jiaji Huang, Xin Jiang, Albert Oh Sparsity Sparsity has been a hot topic in statistics and machine

More information

For more information about how to cite these materials visit

For more information about how to cite these materials visit Author(s): Kerby Shedden, Ph.D., 2010 License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution Share Alike 3.0 License: http://creativecommons.org/licenses/by-sa/3.0/

More information

WEIGHTED QUANTILE REGRESSION THEORY AND ITS APPLICATION. Abstract

WEIGHTED QUANTILE REGRESSION THEORY AND ITS APPLICATION. Abstract Journal of Data Science,17(1). P. 145-160,2019 DOI:10.6339/JDS.201901_17(1).0007 WEIGHTED QUANTILE REGRESSION THEORY AND ITS APPLICATION Wei Xiong *, Maozai Tian 2 1 School of Statistics, University of

More information

Machine Learning Linear Classification. Prof. Matteo Matteucci

Machine Learning Linear Classification. Prof. Matteo Matteucci Machine Learning Linear Classification Prof. Matteo Matteucci Recall from the first lecture 2 X R p Regression Y R Continuous Output X R p Y {Ω 0, Ω 1,, Ω K } Classification Discrete Output X R p Y (X)

More information

A New Bayesian Variable Selection Method: The Bayesian Lasso with Pseudo Variables

A New Bayesian Variable Selection Method: The Bayesian Lasso with Pseudo Variables A New Bayesian Variable Selection Method: The Bayesian Lasso with Pseudo Variables Qi Tang (Joint work with Kam-Wah Tsui and Sijian Wang) Department of Statistics University of Wisconsin-Madison Feb. 8,

More information

Lecture 14: Shrinkage

Lecture 14: Shrinkage Lecture 14: Shrinkage Reading: Section 6.2 STATS 202: Data mining and analysis October 27, 2017 1 / 19 Shrinkage methods The idea is to perform a linear regression, while regularizing or shrinking the

More information

Selective Inference for Effect Modification

Selective Inference for Effect Modification Inference for Modification (Joint work with Dylan Small and Ashkan Ertefaie) Department of Statistics, University of Pennsylvania May 24, ACIC 2017 Manuscript and slides are available at http://www-stat.wharton.upenn.edu/~qyzhao/.

More information

Hierarchical modelling of performance indicators, with application to MRSA & teenage conception rates

Hierarchical modelling of performance indicators, with application to MRSA & teenage conception rates Hierarchical modelling of performance indicators, with application to MRSA & teenage conception rates Hayley E Jones School of Social and Community Medicine, University of Bristol, UK Thanks to David Spiegelhalter,

More information

Nonparametric Heteroscedastic Transformation Regression Models for Skewed Data, with an Application to Health Care Costs

Nonparametric Heteroscedastic Transformation Regression Models for Skewed Data, with an Application to Health Care Costs Nonparametric Heteroscedastic Transformation Regression Models for Skewed Data, with an Application to Health Care Costs Xiao-Hua Zhou, Huazhen Lin, Eric Johnson Journal of Royal Statistical Society Series

More information

Cross-Validation with Confidence

Cross-Validation with Confidence Cross-Validation with Confidence Jing Lei Department of Statistics, Carnegie Mellon University WHOA-PSI Workshop, St Louis, 2017 Quotes from Day 1 and Day 2 Good model or pure model? Occam s razor We really

More information

BIOSTATISTICAL METHODS

BIOSTATISTICAL METHODS BIOSTATISTICAL METHODS FOR TRANSLATIONAL & CLINICAL RESEARCH Cross-over Designs #: DESIGNING CLINICAL RESEARCH The subtraction of measurements from the same subject will mostly cancel or minimize effects

More information

Combining Non-probability and Probability Survey Samples Through Mass Imputation

Combining Non-probability and Probability Survey Samples Through Mass Imputation Combining Non-probability and Probability Survey Samples Through Mass Imputation Jae-Kwang Kim 1 Iowa State University & KAIST October 27, 2018 1 Joint work with Seho Park, Yilin Chen, and Changbao Wu

More information

General Regression Model

General Regression Model Scott S. Emerson, M.D., Ph.D. Department of Biostatistics, University of Washington, Seattle, WA 98195, USA January 5, 2015 Abstract Regression analysis can be viewed as an extension of two sample statistical

More information

Finite Sample Performance of A Minimum Distance Estimator Under Weak Instruments

Finite Sample Performance of A Minimum Distance Estimator Under Weak Instruments Finite Sample Performance of A Minimum Distance Estimator Under Weak Instruments Tak Wai Chau February 20, 2014 Abstract This paper investigates the nite sample performance of a minimum distance estimator

More information

Machine Learning for OR & FE

Machine Learning for OR & FE Machine Learning for OR & FE Regression II: Regularization and Shrinkage Methods Martin Haugh Department of Industrial Engineering and Operations Research Columbia University Email: martin.b.haugh@gmail.com

More information

Lasso Maximum Likelihood Estimation of Parametric Models with Singular Information Matrices

Lasso Maximum Likelihood Estimation of Parametric Models with Singular Information Matrices Article Lasso Maximum Likelihood Estimation of Parametric Models with Singular Information Matrices Fei Jin 1,2 and Lung-fei Lee 3, * 1 School of Economics, Shanghai University of Finance and Economics,

More information

Lecture 4 Multiple linear regression

Lecture 4 Multiple linear regression Lecture 4 Multiple linear regression BIOST 515 January 15, 2004 Outline 1 Motivation for the multiple regression model Multiple regression in matrix notation Least squares estimation of model parameters

More information

Univariate shrinkage in the Cox model for high dimensional data

Univariate shrinkage in the Cox model for high dimensional data Univariate shrinkage in the Cox model for high dimensional data Robert Tibshirani January 6, 2009 Abstract We propose a method for prediction in Cox s proportional model, when the number of features (regressors)

More information

Semi-Nonparametric Inferences for Massive Data

Semi-Nonparametric Inferences for Massive Data Semi-Nonparametric Inferences for Massive Data Guang Cheng 1 Department of Statistics Purdue University Statistics Seminar at NCSU October, 2015 1 Acknowledge NSF, Simons Foundation and ONR. A Joint Work

More information

Biost 518 Applied Biostatistics II. Purpose of Statistics. First Stage of Scientific Investigation. Further Stages of Scientific Investigation

Biost 518 Applied Biostatistics II. Purpose of Statistics. First Stage of Scientific Investigation. Further Stages of Scientific Investigation Biost 58 Applied Biostatistics II Scott S. Emerson, M.D., Ph.D. Professor of Biostatistics University of Washington Lecture 5: Review Purpose of Statistics Statistics is about science (Science in the broadest

More information

Analyzing the Geospatial Rates of the Primary Care Physician Labor Supply in the Contiguous United States

Analyzing the Geospatial Rates of the Primary Care Physician Labor Supply in the Contiguous United States Analyzing the Geospatial Rates of the Primary Care Physician Labor Supply in the Contiguous United States By Russ Frith Advisor: Dr. Raid Amin University of W. Florida Capstone Project in Statistics April,

More information

Chap 1. Overview of Statistical Learning (HTF, , 2.9) Yongdai Kim Seoul National University

Chap 1. Overview of Statistical Learning (HTF, , 2.9) Yongdai Kim Seoul National University Chap 1. Overview of Statistical Learning (HTF, 2.1-2.6, 2.9) Yongdai Kim Seoul National University 0. Learning vs Statistical learning Learning procedure Construct a claim by observing data or using logics

More information

Module 17: Bayesian Statistics for Genetics Lecture 4: Linear regression

Module 17: Bayesian Statistics for Genetics Lecture 4: Linear regression 1/37 The linear regression model Module 17: Bayesian Statistics for Genetics Lecture 4: Linear regression Ken Rice Department of Biostatistics University of Washington 2/37 The linear regression model

More information

Bayesian variable selection via. Penalized credible regions. Brian Reich, NCSU. Joint work with. Howard Bondell and Ander Wilson

Bayesian variable selection via. Penalized credible regions. Brian Reich, NCSU. Joint work with. Howard Bondell and Ander Wilson Bayesian variable selection via penalized credible regions Brian Reich, NC State Joint work with Howard Bondell and Ander Wilson Brian Reich, NCSU Penalized credible regions 1 Motivation big p, small n

More information

Making Our Cities Safer: A Study In Neighbhorhood Crime Patterns

Making Our Cities Safer: A Study In Neighbhorhood Crime Patterns Making Our Cities Safer: A Study In Neighbhorhood Crime Patterns Aly Kane alykane@stanford.edu Ariel Sagalovsky asagalov@stanford.edu Abstract Equipped with an understanding of the factors that influence

More information

Probabilistic classification CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2016

Probabilistic classification CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2016 Probabilistic classification CE-717: Machine Learning Sharif University of Technology M. Soleymani Fall 2016 Topics Probabilistic approach Bayes decision theory Generative models Gaussian Bayes classifier

More information

Cross-Validation with Confidence

Cross-Validation with Confidence Cross-Validation with Confidence Jing Lei Department of Statistics, Carnegie Mellon University UMN Statistics Seminar, Mar 30, 2017 Overview Parameter est. Model selection Point est. MLE, M-est.,... Cross-validation

More information

What is to be done? Two attempts using Gaussian process priors

What is to be done? Two attempts using Gaussian process priors What is to be done? Two attempts using Gaussian process priors Maximilian Kasy Department of Economics, Harvard University Oct 14 2017 1 / 33 What questions should econometricians work on? Incentives of

More information

MS-C1620 Statistical inference

MS-C1620 Statistical inference MS-C1620 Statistical inference 10 Linear regression III Joni Virta Department of Mathematics and Systems Analysis School of Science Aalto University Academic year 2018 2019 Period III - IV 1 / 32 Contents

More information

INTRODUCTION TO BAYESIAN INFERENCE PART 2 CHRIS BISHOP

INTRODUCTION TO BAYESIAN INFERENCE PART 2 CHRIS BISHOP INTRODUCTION TO BAYESIAN INFERENCE PART 2 CHRIS BISHOP Personal Healthcare Revolution Electronic health records (CFH) Personal genomics (DeCode, Navigenics, 23andMe) X-prize: first $10k human genome technology

More information

Causal Inference with a Continuous Treatment and Outcome: Alternative Estimators for Parametric Dose-Response Functions

Causal Inference with a Continuous Treatment and Outcome: Alternative Estimators for Parametric Dose-Response Functions Causal Inference with a Continuous Treatment and Outcome: Alternative Estimators for Parametric Dose-Response Functions Joe Schafer Office of the Associate Director for Research and Methodology U.S. Census

More information

Propensity Score Analysis with Hierarchical Data

Propensity Score Analysis with Hierarchical Data Propensity Score Analysis with Hierarchical Data Fan Li Alan Zaslavsky Mary Beth Landrum Department of Health Care Policy Harvard Medical School May 19, 2008 Introduction Population-based observational

More information

Linear Regression. Junhui Qian. October 27, 2014

Linear Regression. Junhui Qian. October 27, 2014 Linear Regression Junhui Qian October 27, 2014 Outline The Model Estimation Ordinary Least Square Method of Moments Maximum Likelihood Estimation Properties of OLS Estimator Unbiasedness Consistency Efficiency

More information

Generalized Quasi-likelihood versus Hierarchical Likelihood Inferences in Generalized Linear Mixed Models for Count Data

Generalized Quasi-likelihood versus Hierarchical Likelihood Inferences in Generalized Linear Mixed Models for Count Data Sankhyā : The Indian Journal of Statistics 2009, Volume 71-B, Part 1, pp. 55-78 c 2009, Indian Statistical Institute Generalized Quasi-likelihood versus Hierarchical Likelihood Inferences in Generalized

More information

WU Weiterbildung. Linear Mixed Models

WU Weiterbildung. Linear Mixed Models Linear Mixed Effects Models WU Weiterbildung SLIDE 1 Outline 1 Estimation: ML vs. REML 2 Special Models On Two Levels Mixed ANOVA Or Random ANOVA Random Intercept Model Random Coefficients Model Intercept-and-Slopes-as-Outcomes

More information

Modification and Improvement of Empirical Likelihood for Missing Response Problem

Modification and Improvement of Empirical Likelihood for Missing Response Problem UW Biostatistics Working Paper Series 12-30-2010 Modification and Improvement of Empirical Likelihood for Missing Response Problem Kwun Chuen Gary Chan University of Washington - Seattle Campus, kcgchan@u.washington.edu

More information

Regression, Ridge Regression, Lasso

Regression, Ridge Regression, Lasso Regression, Ridge Regression, Lasso Fabio G. Cozman - fgcozman@usp.br October 2, 2018 A general definition Regression studies the relationship between a response variable Y and covariates X 1,..., X n.

More information

Or How to select variables Using Bayesian LASSO

Or How to select variables Using Bayesian LASSO Or How to select variables Using Bayesian LASSO x 1 x 2 x 3 x 4 Or How to select variables Using Bayesian LASSO x 1 x 2 x 3 x 4 Or How to select variables Using Bayesian LASSO On Bayesian Variable Selection

More information

The Impacts of the Affordable Care Act on Preventive Services among Racial Groups UNDERGRADUATE RESEARCH THESIS

The Impacts of the Affordable Care Act on Preventive Services among Racial Groups UNDERGRADUATE RESEARCH THESIS The Impacts of the Affordable Care Act on Preventive Services among Racial Groups UNDERGRADUATE RESEARCH THESIS Presented in partial fulfillment of the requirements for the Honors Research Distinction

More information

An Age-Stratified Poisson Model for Comparing Trends in Cancer Rates Across Overlapping Regions

An Age-Stratified Poisson Model for Comparing Trends in Cancer Rates Across Overlapping Regions bimj header will be provided by the publisher An Age-Stratified Poisson Model for Comparing Trends in Cancer Rates Across Overlapping Regions Yi Li, Ram C. Tiwari 2, and Zhaohui Zou 3 Harvard University,

More information

CMSC858P Supervised Learning Methods

CMSC858P Supervised Learning Methods CMSC858P Supervised Learning Methods Hector Corrada Bravo March, 2010 Introduction Today we discuss the classification setting in detail. Our setting is that we observe for each subject i a set of p predictors

More information

Nonconcave Penalized Likelihood with A Diverging Number of Parameters

Nonconcave Penalized Likelihood with A Diverging Number of Parameters Nonconcave Penalized Likelihood with A Diverging Number of Parameters Jianqing Fan and Heng Peng Presenter: Jiale Xu March 12, 2010 Jianqing Fan and Heng Peng Presenter: JialeNonconcave Xu () Penalized

More information

The Sum of Standardized Residuals: Goodness of Fit Test for Binary Response Model

The Sum of Standardized Residuals: Goodness of Fit Test for Binary Response Model The Sum of Standardized Residuals: Goodness of Fit Test for Binary Response Model Thesis Presented in Partial Fulfillment of the Requirements for the Degree Master of Science in the Graduate School of

More information

H-LIKELIHOOD ESTIMATION METHOOD FOR VARYING CLUSTERED BINARY MIXED EFFECTS MODEL

H-LIKELIHOOD ESTIMATION METHOOD FOR VARYING CLUSTERED BINARY MIXED EFFECTS MODEL H-LIKELIHOOD ESTIMATION METHOOD FOR VARYING CLUSTERED BINARY MIXED EFFECTS MODEL Intesar N. El-Saeiti Department of Statistics, Faculty of Science, University of Bengahzi-Libya. entesar.el-saeiti@uob.edu.ly

More information

ECE521 week 3: 23/26 January 2017

ECE521 week 3: 23/26 January 2017 ECE521 week 3: 23/26 January 2017 Outline Probabilistic interpretation of linear regression - Maximum likelihood estimation (MLE) - Maximum a posteriori (MAP) estimation Bias-variance trade-off Linear

More information

Lab 3: Two levels Poisson models (taken from Multilevel and Longitudinal Modeling Using Stata, p )

Lab 3: Two levels Poisson models (taken from Multilevel and Longitudinal Modeling Using Stata, p ) Lab 3: Two levels Poisson models (taken from Multilevel and Longitudinal Modeling Using Stata, p. 376-390) BIO656 2009 Goal: To see if a major health-care reform which took place in 1997 in Germany was

More information

Package zic. August 22, 2017

Package zic. August 22, 2017 Package zic August 22, 2017 Version 0.9.1 Date 2017-08-22 Title Bayesian Inference for Zero-Inflated Count Models Author Markus Jochmann Maintainer Markus Jochmann

More information

Inference Methods for the Conditional Logistic Regression Model with Longitudinal Data Arising from Animal Habitat Selection Studies

Inference Methods for the Conditional Logistic Regression Model with Longitudinal Data Arising from Animal Habitat Selection Studies Inference Methods for the Conditional Logistic Regression Model with Longitudinal Data Arising from Animal Habitat Selection Studies Thierry Duchesne 1 (Thierry.Duchesne@mat.ulaval.ca) with Radu Craiu,

More information

Chris Fraley and Daniel Percival. August 22, 2008, revised May 14, 2010

Chris Fraley and Daniel Percival. August 22, 2008, revised May 14, 2010 Model-Averaged l 1 Regularization using Markov Chain Monte Carlo Model Composition Technical Report No. 541 Department of Statistics, University of Washington Chris Fraley and Daniel Percival August 22,

Statistical Methods III Statistics 212. Problem Set 2 - Answer Key

Statistical Methods III, Statistics 212. Problem Set 2 - Answer Key. 1. (Analysis to be turned in and discussed on Tuesday, April 24th) The data for this problem are taken from long-term followup of 1423

Single Index Quantile Regression for Heteroscedastic Data

Single Index Quantile Regression for Heteroscedastic Data. E. Christou, M. G. Akritas, Department of Statistics, The Pennsylvania State University. SMAC, November 6, 2015

Classification. Chapter Introduction. 6.2 The Bayes classifier

Chapter 6: Classification. 6.1 Introduction. Often encountered in applications is the situation where the response variable Y takes values in a finite set of labels. For example, the response Y could encode

Power and sample size calculations for designing rare variant sequencing association studies.

Power and sample size calculations for designing rare variant sequencing association studies. Seunggeun Lee 1, Michael C. Wu 2, Tianxi Cai 1, Yun Li 2,3, Michael Boehnke 4 and Xihong Lin 1. 1 Department

Extended Bayesian Information Criteria for Model Selection with Large Model Spaces

Extended Bayesian Information Criteria for Model Selection with Large Model Spaces. Jiahua Chen, University of British Columbia; Zehua Chen, National University of Singapore (Biometrika, 2008). Variable

Instructions: Closed book, notes, and no electronic devices. Points (out of 200) in parentheses

ISQS 5349 Final, Spring 2011. Instructions: Closed book, notes, and no electronic devices. Points (out of 200) in parentheses. 1. (10) What is the definition of a regression model that we have used throughout

Statistics in medicine

Statistics in medicine Statistics in medicine Lecture 4: and multivariable regression Fatma Shebl, MD, MS, MPH, PhD Assistant Professor Chronic Disease Epidemiology Department Yale School of Public Health Fatma.shebl@yale.edu

Dynamic Panel of Count Data with Initial Conditions and Correlated Random Effects: Application for Health Data. Sungjoo Yoon

Dynamic Panel of Count Data with Initial Conditions and Correlated Random Effects: Application for Health Data. Sungjoo Yoon, Department of Economics, Indiana University. April, 2009. Key words: dynamic panel

Statistical Inference

Statistical Inference. Liu Yang, Libo Wang (Florida State University). October 27, 2016. Outline: The Bayesian Lasso Trevor Park

Bayesian spatial quantile regression

Bayesian spatial quantile regression. Brian J. Reich and Montserrat Fuentes, North Carolina State University, and David B. Dunson, Duke University. E-mail: reich@stat.ncsu.edu. Tropospheric ozone: Tropospheric ozone has been linked with several adverse

Parametric Inference Maximum Likelihood Inference Exponential Families Expectation Maximization (EM) Bayesian Inference Statistical Decision Theory

Statistical Inference: Parametric Inference, Maximum Likelihood Inference, Exponential Families, Expectation Maximization (EM), Bayesian Inference, Statistical Decision Theory. IP, José Bioucas Dias, IST, 2007

The Adaptive Lasso and Its Oracle Properties Hui Zou (2006), JASA

The Adaptive Lasso and Its Oracle Properties, Hui Zou (2006), JASA. Presented by Dongjun Chung, March 12, 2010. Introduction; Definition; Oracle Properties; Computations; Relationship: Nonnegative Garrote; Extensions:

Calibration Estimation of Semiparametric Copula Models with Data Missing at Random

Calibration Estimation of Semiparametric Copula Models with Data Missing at Random. Shigeyuki Hamori (1), Kaiji Motegi (1), Zheng Zhang (2). (1) Kobe University, (2) Renmin University of China. Institute of Statistics

Field Course Descriptions

Field Course Descriptions. Ph.D. Field Requirements: 12 credit hours with 6 credit hours in each of two fields selected from the following fields. Each class can count towards only one field. Course descriptions

Analysis of longitudinal neuroimaging data with OLS & Sandwich Estimator of variance

Analysis of longitudinal neuroimaging data with OLS & Sandwich Estimator of variance. Bryan Guillaume. Reading workshop lifespan neurobiology, 27 June 2014. Supervisors: Thomas Nichols (Warwick University)

) The cumulative probability distribution shows the probability

Problem Set, ECONOMETRICS, Prof Òscar Jordà. Due Date: Thursday, April 0. STUDENT'S NAME: Multiple Choice Questions [0 pts] Please provide your answers to this section below: 3 4 5 6 7 8 9 0 ) The cumulative

Variable Selection for Highly Correlated Predictors

Variable Selection for Highly Correlated Predictors. Fei Xue and Annie Qu, Department of Statistics, University of Illinois at Urbana-Champaign. WHOA-PSI, Aug, 2017, St. Louis, Missouri. Background: Variable

The consequences of misspecifying the random effects distribution when fitting generalized linear mixed models

The consequences of misspecifying the random effects distribution when fitting generalized linear mixed models. John M. Neuhaus, Charles E. McCulloch, Division of Biostatistics, University of California, San

Calibration Estimation of Semiparametric Copula Models with Data Missing at Random

Calibration Estimation of Semiparametric Copula Models with Data Missing at Random. Shigeyuki Hamori (1), Kaiji Motegi (1), Zheng Zhang (2). (1) Kobe University, (2) Renmin University of China. Econometrics Workshop UNC

Power and Sample Size Calculations with the Additive Hazards Model

Journal of Data Science 10(2012), 143-155. Power and Sample Size Calculations with the Additive Hazards Model. Ling Chen, Chengjie Xiong, J. Philip Miller and Feng Gao, Washington University School of Medicine

Classification. Classification is similar to regression in that the goal is to use covariates to predict an outcome.

Classification. Classification is similar to regression in that the goal is to use covariates to predict an outcome. We still have a vector of covariates X. However, the response is binary (or a few classes),

Semi-Penalized Inference with Direct FDR Control

Semi-Penalized Inference with Direct FDR Control. Jian Huang, University of Iowa. April 4, 2016. The problem: Consider the linear regression model $y = \sum_{j=1}^{p} x_j \beta_j + \varepsilon$ (1), where $y \in \mathbb{R}^n$, $x_j \in \mathbb{R}^n$, $\varepsilon \in \mathbb{R}^n$, and $\beta_j$ is the jth regression coefficient. Here p

Lecture 6: Methods for high-dimensional problems

Lecture 6: Methods for high-dimensional problems. Hector Corrada Bravo and Rafael A. Irizarry. March, 2010. In this Section we will discuss methods where data lies on high-dimensional spaces. In particular,

SUPPLEMENTARY APPENDICES FOR WAVELET-DOMAIN REGRESSION AND PREDICTIVE INFERENCE IN PSYCHIATRIC NEUROIMAGING

Submitted to the Annals of Applied Statistics. SUPPLEMENTARY APPENDICES FOR WAVELET-DOMAIN REGRESSION AND PREDICTIVE INFERENCE IN PSYCHIATRIC NEUROIMAGING. By Philip T. Reiss, Lan Huo, Yihong Zhao, Clare

Various types of likelihood

Various types of likelihood: 1. likelihood, marginal likelihood, conditional likelihood, profile likelihood, adjusted profile likelihood 2. semi-parametric likelihood, partial likelihood 3. empirical likelihood,

Inference After Variable Selection

Department of Mathematics, SIU Carbondale. Inference After Variable Selection. Lasanthi Pelawa Watagoda, lasanthi@siu.edu. June 12, 2017. Outline: 1 Introduction, 2 Inference For Ridge and Lasso, 3 Variable Selection

Confounder Adjustment in Multiple Hypothesis Testing

Confounder Adjustment in Multiple Hypothesis Testing. Department of Statistics, Stanford University. January 28, 2016. Slides are available at http://web.stanford.edu/~qyzhao/. Collaborators: Jingshu Wang, Trevor Hastie, Art Owen

Stat 5101 Lecture Notes

Stat 5101 Lecture Notes Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random

Chapter 10. Semi-Supervised Learning

Chapter 10. Semi-Supervised Learning. Wei Pan, Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 55455. Email: weip@biostat.umn.edu. PubH 7475/8475. © Wei Pan. Outline

Combining multiple observational data sources to estimate causal effects

Department of Statistics, North Carolina State University. Combining multiple observational data sources to estimate causal effects. Shu Yang* syang24@ncsuedu. Joint work with Peng Ding, UC Berkeley. May 23,

Spatial Discrete Choice Models

Spatial Discrete Choice Models Spatial Discrete Choice Models Professor William Greene Stern School of Business, New York University SPATIAL ECONOMETRICS ADVANCED INSTITUTE University of Rome May 23, 2011 Spatial Correlation Spatially

Data Mining Stat 588

Data Mining Stat 588 Data Mining Stat 588 Lecture 02: Linear Methods for Regression Department of Statistics & Biostatistics Rutgers University September 13 2011 Regression Problem Quantitative generic output variable Y. Generic

ISyE 691 Data mining and analytics

ISyE 691 Data mining and analytics: Regression. Instructor: Prof. Kaibo Liu, Department of Industrial and Systems Engineering, UW-Madison. Email: kliu8@wisc.edu. Office: Room 3017 (Mechanical Engineering Building)

BIOL 51A - Biostatistics 1. Lecture 1: Intro to Biostatistics. Smoking: hazardous?

BIOL 51A - Biostatistics 1. Lecture 1: Intro to Biostatistics. Smoking: hazardous? [Figure: FEV (l) versus Smoke (No/Yes)] Box Plot, a.k.a. box-and-whisker diagram or candlestick chart

Individualized Treatment Effects with Censored Data via Nonparametric Accelerated Failure Time Models

Individualized Treatment Effects with Censored Data via Nonparametric Accelerated Failure Time Models. Nicholas C. Henderson, Thomas A. Louis, Gary Rosner, Ravi Varadhan, Johns Hopkins University. July 31, 2018

Linear regression methods

Linear regression methods. Most of our intuition about statistical methods stems from linear regression. For observations $i = 1, \ldots, n$, the model is $Y_i = \sum_{j=1}^{p} X_{ij} \beta_j + \varepsilon_i$, where $Y_i$ is the response

Practice Exam 1. (A) (B) (C) (D) (E) You are given the following data on loss sizes:

Practice Exam 1. 1. Losses for an insurance coverage have the following cumulative distribution function: F(0) = 0, F(1,000) = 0.2, F(5,000) = 0.4, F(10,000) = 0.9, F(100,000) = 1, with linear interpolation

Robustness to Parametric Assumptions in Missing Data Models

Robustness to Parametric Assumptions in Missing Data Models. Bryan Graham, NYU; Keisuke Hirano, University of Arizona. April 2011. Motivation: We consider the classic missing data problem. In practice

Outlier detection and variable selection via difference based regression model and penalized regression

Journal of the Korean Data & Information Science Society 2018, 29(3), 815-825. http://dx.doi.org/10.7465/jkdi.2018.29.3.815 한국데이터정보과학회지. Outlier detection and variable selection via difference based regression

Ultra High Dimensional Variable Selection with Endogenous Variables

Ultra High Dimensional Variable Selection with Endogenous Variables. Yuan Liao, Princeton University. Joint work with Jianqing Fan. Job Market Talk, January, 2012. Outline: 1 Examples of Ultra High

Agricultural and Applied Economics 637 Applied Econometrics II. Assignment III Maximum Likelihood Estimation (Due: March 31, 2016)

Agricultural and Applied Economics 637, Applied Econometrics II. Assignment III: Maximum Likelihood Estimation (Due: March 31, 2016). In this assignment I would like you to apply the theoretical Maximum Likelihood

IEOR165 Discussion Week 5

IEOR165 Discussion Week 5 IEOR165 Discussion Week 5 Sheng Liu University of California, Berkeley Feb 19, 2016 Outline 1 1st Homework 2 Revisit Maximum A Posterior 3 Regularization IEOR165 Discussion Sheng Liu 2 About 1st Homework

On Mixture Regression Shrinkage and Selection via the MR-LASSO

On Mixture Regression Shrinkage and Selection via the MR-LASSO. Ronghua Luo, Hansheng Wang, and Chih-Ling Tsai. Guanghua School of Management, Peking University & Graduate School of Management, University
