Incorporating cost in Bayesian variable selection, with application to cost-effective measurement of quality of health care


Dimitris Fouskakis, Department of Mathematics, School of Applied Mathematical and Physical Sciences, National Technical University of Athens, Athens, Greece.

Joint work with: Ioannis Ntzoufras, Department of Statistics, Athens University of Economics and Business, Athens, Greece; and David Draper, Department of Applied Mathematics and Statistics, University of California, Santa Cruz, USA.

Presentation is available at: fouskakis/conferences/hwu/hwu.pdf.

23rd September 2009: School of Mathematical and Computer Sciences, Heriot-Watt University

Synopsis
1. Motivation.
2. Model Specification.
3. Decision Theoretic Cost-Benefit Analysis.
4. Bayesian Cost-Benefit Analysis.
5. Utility versus Cost-Adjusted BIC.
6. Discussion.

1 Motivation

Health care quality measurement, indirect method: the input-output approach. Construct a model for hospital outcomes (e.g., mortality within 30 days of admission) after adjusting for differences in inputs (sickness at admission), and compare observed and expected outcomes to draw inferences about the quality of health care. Data collection costs are available for each variable (measured in minutes or monetary units). We wish to incorporate cost in our analysis, so as to reduce data collection costs while still obtaining a well-fitted model.

Available data

Data come from a major U.S. study conducted by the RAND Corporation, with n = 2,532 pneumonia patients (Keeler et al., 1990).
Response variable: mortality within 30 days of admission.
Covariates: p = 83 sickness indicators.
Goal: construct a sickness scale using a logistic regression model.
Benefit-only analysis (no costs): classical variable selection techniques to find an optimal subset of indicators. The initial list of p = 83 sickness indicators was reduced to 14 significant predictors (Keeler et al., 1990).

The 14-Variable RAND Pneumonia Scale

The RAND admission sickness scale for pneumonia (p = 14 variables), with the marginal data collection cost per patient for each variable, in minutes of abstraction time (cost values omitted):

1. Systolic Blood Pressure Score (2-point scale)
2. Age
3. Blood Urea Nitrogen
4. APACHE II Coma Score (3-point scale)
5. Shortness of Breath Day 1 (yes, no)
6. Serum Albumin Score (3-point scale)
7. Respiratory Distress (yes, no)
8. Septic Complications (yes, no)
9. Prior Respiratory Failure (yes, no)
10. Recently Hospitalized (yes, no)
11. Initial Temperature
12. Chest X-ray Congestive Heart Failure Score (3-point scale)
13. Ambulatory Score (3-point scale)
14. Total APACHE II Score (36-point scale)

Two different approaches

The RAND benefit-only approach is sub-optimal: it does not consider differences in data collection cost among the available predictors. We propose a cost-benefit analysis, in which variables are chosen only when they predict well enough given how much they cost to collect.

In problems such as this, in which two desirable but competing criteria must be jointly optimised, there are two main ways to proceed:
(a) place both criteria on a common scale, and optimise on that scale; or
(b) optimise one criterion, subject to a bound on the other.

Three methods for solving this problem

(1) (strategy (a)) Draper and Fouskakis (2000) and Fouskakis and Draper (2002, 2008) proposed an approach to this problem based on Bayesian decision theory. They used stochastic optimisation methods to find (near-)optimal subsets of predictor variables that maximise an expected utility function trading off data collection cost against predictive accuracy.

(2) (strategy (a)) In this work, as an alternative to (1), we propose a prior distribution that accounts for the cost of each variable and yields a set of posterior model probabilities corresponding to a generalised cost-adjusted version of the Bayesian Information Criterion (Fouskakis, Ntzoufras and Draper, 2009a).

(3) (strategy (b)) We also implement a cost-restriction-benefit analysis, in which the search is conducted only among models whose cost does not exceed a budgetary restriction (Fouskakis, Ntzoufras and Draper, 2009b), using a population-based trans-dimensional RJMCMC method.

Here we present results from methods (1) (decision-theoretic cost-benefit analysis) and (2) (Bayesian cost-benefit analysis).

2 Model Specification

Logistic regression model with Y_i = 1 if patient i dies within 30 days of admission.
X_ij: j-th sickness predictor variable for the i-th patient.
γ = (γ_1, ..., γ_p)^T, with γ_j a binary indicator of the inclusion of variable X_j in model m_γ.
Model space M = {0, 1}^p, where p is the total number of variables considered.
Hence the model formulation can be summarized as

(Y_i | γ) ~ independently Bernoulli(p_i(γ)),
η_i(γ) = log[ p_i(γ) / (1 − p_i(γ)) ] = Σ_{j=0}^p β_j γ_j X_ij,
η(γ) = X diag(γ) β = X_γ β_γ.
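The variable-inclusion parameterisation above can be sketched numerically; a minimal illustration with simulated data (all values hypothetical), in which the inclusion vector γ zeroes out the coefficients of excluded covariates:

```python
import numpy as np

def linear_predictor(X, beta, gamma):
    """eta(gamma) = X diag(gamma) beta: gamma zeroes out excluded covariates."""
    return X @ (gamma * beta)

def death_prob(X, beta, gamma):
    """Inverse-logit of the linear predictor: p_i(gamma)."""
    eta = linear_predictor(X, beta, gamma)
    return 1.0 / (1.0 + np.exp(-eta))

# Tiny simulated example: 4 patients, 3 candidate sickness indicators.
X = np.array([[1.0, 0.5, 2.0],
              [0.0, 1.5, 1.0],
              [2.0, 0.0, 0.5],
              [1.0, 1.0, 1.0]])
beta = np.array([0.8, -0.4, 0.3])
gamma = np.array([1, 0, 1])      # include X1 and X3, drop X2

p = death_prob(X, beta, gamma)   # one fitted death probability per patient
```

Because γ_2 = 0, the value of β_2 has no effect on the fitted probabilities, which is exactly what makes the model space {0, 1}^p searchable.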

3 Decision Theoretic Cost-Benefit Analysis

Utility Elicitation (1)

We take a Bayesian decision-theoretic approach (based on maximisation of expected utility). The utility function has two components, quantifying data collection costs and predictive successes and failures.

Data-collection utility: of the p available sickness indicators X_j, γ_j = 1 if X_j is included in the subset (0 otherwise). Dividing the n patients at random into modeling and validation subsamples of sizes n_M and n_V, respectively, the data-collection utility associated with subset γ = (γ_1, ..., γ_p) for the patients in the validation subsample is

U_D(γ) = −n_V Σ_{j=1}^p c_j γ_j,   (1)

where c_j is the marginal cost per patient of data abstraction for variable j.
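As a sketch, the data-collection component (1) is just the negative total abstraction cost over the validation subsample (the cost values below are hypothetical):

```python
def data_collection_utility(gamma, costs, n_V):
    """U_D(gamma) = -n_V * sum_j c_j * gamma_j: the (negative) marginal
    cost of abstracting the selected variables for all n_V validation
    patients."""
    return -n_V * sum(c * g for c, g in zip(costs, gamma))

# e.g. two selected variables costing 2.0 and 1.0 minutes, 10 patients
u_d = data_collection_utility([1, 0, 1], [2.0, 5.0, 1.0], 10)  # -30.0
```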

Utility Elicitation (2)

Predictive utility:
(1) Apply the logistic regression model, fitted on the modeling subsample, to the validation subsample to create predicted death probabilities p̂_i^γ for the patients, using the given predictor subset γ.
(2) Classify patient i in the validation subsample as predicted dead or alive according to whether p̂_i^γ exceeds or falls short of a cutoff p*, which is chosen, by searching on a discrete grid from 0.01 to 0.99, to maximize the predictive accuracy of model γ.

We then cross-tabulate actual versus predicted death status in a 2 × 2 contingency table, rewarding and penalizing model γ according to the numbers of patients in the validation sample which fall into the cells of the right-hand part of the table on the next slide.

Utility Elicitation (3)

Rewards and Penalties, and Counts (actual versus predicted death status):

               Rewards and Penalties        Counts
               Pred. Died  Pred. Lived      Pred. Died  Pred. Lived
Actual Died       C_11        C_12             n_11        n_12
Actual Lived      C_21        C_22             n_21        n_22

The predictive utility of model γ is then

U_P(γ) = Σ_{l=1}^2 Σ_{m=1}^2 C_lm n_lm.   (2)

See Fouskakis and Draper (2008) for details on eliciting the utility values C_lm.
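A sketch of steps (1)-(2) and the score (2) on a toy validation sample. Two simplifying assumptions are made here: a 0.01 grid step, and choosing the cutoff by maximising the utility directly rather than predictive accuracy as in the talk:

```python
import numpy as np

def predictive_utility(p_hat, died, C):
    """Score a model: search a cutoff grid (0.01 to 0.99) and score the
    2x2 table of actual vs predicted death status with rewards/penalties
    C, where C[0][0] is died & predicted-died, ..., C[1][1] is
    lived & predicted-lived."""
    best = -np.inf
    for cutoff in np.arange(0.01, 1.0, 0.01):
        pred_died = p_hat >= cutoff
        n11 = np.sum(died & pred_died)      # correctly predicted deaths
        n12 = np.sum(died & ~pred_died)     # missed deaths
        n21 = np.sum(~died & pred_died)     # false alarms
        n22 = np.sum(~died & ~pred_died)    # correctly predicted survivors
        u = C[0][0]*n11 + C[0][1]*n12 + C[1][0]*n21 + C[1][1]*n22
        best = max(best, u)
    return best

# toy sample: +1 reward for each correct cell, -1 penalty for each error
p_hat = np.array([0.9, 0.1])
died = np.array([True, False])
u_p = predictive_utility(p_hat, died, [[1, -1], [-1, 1]])  # 2
```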

Utility Elicitation (4)

The overall expected utility function to be maximised over γ is then simply

E[U(γ)] = E[U_D(γ) + U_P(γ)],   (3)

where this expectation is over all possible cross-validation splits of the data. The number of possible cross-validation splits is far too large to evaluate the expectation in (3) directly; in practice we therefore use Monte Carlo methods to evaluate it, averaging over N random modeling and validation splits.
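The Monte Carlo evaluation of (3) can be sketched as follows; `utility_of_split` is a hypothetical placeholder for fitting the model on the modeling rows and computing U_D + U_P on the validation rows:

```python
import numpy as np

rng = np.random.default_rng(0)

def estimate_expected_utility(n, n_M, utility_of_split, N=100):
    """Monte Carlo estimate of E[U(gamma)]: average the utility of one
    fixed model over N random modeling/validation splits of n patients."""
    total = 0.0
    for _ in range(N):
        perm = rng.permutation(n)                       # random split
        total += utility_of_split(perm[:n_M], perm[n_M:])
    return total / N
```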

Results

We explored this approach in two settings, starting with a Small World created by focusing only on the p = 14 variables in the original RAND scale (2^14 = 16,384 is a small enough number of possible models to permit brute-force enumeration of the estimated expected utility of all models).

The RAND scale is nowhere near optimal when data collection costs are considered along with predictive accuracy:

[Figure: estimated expected utility plotted against number of variables.]

The best model in this case contains the following 4 variables: 1. Systolic Blood Pressure (X_1), 2. Blood Urea Nitrogen (X_3), 3. APACHE II Coma Score (X_4), and 4. Shortness of Breath Day 1 Score (X_5).

The 20 best models include the same 3 variables 18 or more times out of 20, and never include 6 other variables; the 5 best models are minor variations on each other, and include 4-6 variables. The best models save almost $8 per patient over the full 14-variable model (a significant saving if the input-output approach were applied widely).

The second setting was the Big World defined by all p = 83 available predictors (2^83 is far too large for brute-force enumeration; we compared a variety of stochastic optimisation methods, including simulated annealing, genetic algorithms, and tabu search, on their ability to find good variable subsets).

Drawback of the decision-theoretic approach

Maximising expected utility, as above, is a natural Bayesian way forward in this problem, but (a) the elicitation process was complicated and difficult, and (b) the utility structure we examine is only one of a number of plausible alternatives, with utility framed from only one point of view; the broader question for a decision-theoretic approach is whose utility should drive the problem formulation.

It is well known (e.g., Arrow, 1963; Weerahandi and Zidek, 1983) that Bayesian decision theory can be problematic when used normatively for group decision-making, because of conflicts in preferences among members of the group; in the context of the problem addressed here, it can be difficult to identify a utility structure acceptable to all stakeholders (including patients, doctors, hospitals, citizen watchdog groups, and state and federal regulatory agencies) in the quality-of-care-assessment process.

4 Bayesian Cost-Benefit Analysis

The aim is to identify well-fitted models after taking into account the cost of each variable. We therefore need to estimate the posterior model probabilities

f(γ | y) = f(γ) ∫ f(y | β_γ, γ) f(β_γ | γ) dβ_γ / Σ_{γ' ∈ {0,1}^p} f(γ') ∫ f(y | β_γ', γ') f(β_γ' | γ') dβ_γ',

after introducing a prior f(γ) on the model space that depends on the cost.

Preliminaries: Posterior model odds and penalty functions

Information criteria (1)

Information criterion for model γ:

IC(γ) = −2 log f(y | β̂_γ, γ) + d_γ F,

where f(y | β̂_γ, γ) is the maximised likelihood, d_γ is the dimension of the model (number of parameters), and F is the penalty for each model parameter used/estimated; d_γ F is thus the total penalty imposed on the maximised log-likelihood for using a model with d_γ parameters. The model with minimum IC is indicated as the best. The above criterion is a penalised likelihood measure.

Information criteria (2)

When comparing two models γ^(k) and γ^(l),

IC_kl = IC(γ^(k)) − IC(γ^(l))
      = −2 log [ f(y | β̂_γ^(k), γ^(k)) / f(y | β̂_γ^(l), γ^(l)) ] + (d_γ^(k) − d_γ^(l)) F
      = Deviance_kl + (d_γ^(k) − d_γ^(l)) F.

We select model γ^(k) if IC_kl < 0, and model γ^(l) if IC_kl > 0.
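Numerically, the IC comparison rule reads as follows (the log-likelihood values are made up for illustration; F = log n gives BIC):

```python
import math

def ic(log_lik, d, F):
    """IC(gamma) = -2 log f(y | beta_hat, gamma) + d * F."""
    return -2.0 * log_lik + d * F

def choose(log_lik_k, d_k, log_lik_l, d_l, F):
    """Select gamma^(k) if IC_kl < 0, otherwise gamma^(l)."""
    return 'k' if ic(log_lik_k, d_k, F) - ic(log_lik_l, d_l, F) < 0 else 'l'

# model k fits better (-50 vs -52) but spends one extra parameter;
# under the BIC penalty F = log(100) the smaller model l wins
winner = choose(-50.0, 3, -52.0, 2, math.log(100))
```

The same fit comparison with a lighter penalty (e.g. F = 2, the AIC penalty) would favour the larger model, which is the whole point of the penalty term F.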

Posterior model probabilities and information criteria

The posterior model probability of a model γ is given by

f(γ | y) ∝ f(y | γ) f(γ),

where f(y | γ) = ∫ f(y | β_γ, γ) f(β_γ | γ) dβ_γ is the marginal likelihood of model γ and f(γ) is the prior probability of model γ. It can be rewritten as

−2 log f(γ | y) = −2 log f(y | γ) + [−2 log f(γ)] + constant,

which parallels IC(γ) = −2 log f(y | β̂_γ, γ) + d_γ F.

Posterior model odds and information criteria

Similarly, consider the posterior odds of model γ^(k) versus model γ^(l):

PO_kl = [ f(y | γ^(k)) / f(y | γ^(l)) ] × [ f(γ^(k)) / f(γ^(l)) ] = B_kl × PrO_kl,

where B_kl is the Bayes factor of model γ^(k) versus model γ^(l) (the ratio of marginal likelihoods) and PrO_kl is the prior odds of model γ^(k) versus model γ^(l). It can be rewritten as

−2 log PO_kl = −2 log B_kl + [−2 log PrO_kl],

which parallels IC_kl = −2 log [ f(y | β̂_γ^(k), γ^(k)) / f(y | β̂_γ^(l), γ^(l)) ] + (d_γ^(k) − d_γ^(l)) F.
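The decomposition PO_kl = B_kl × PrO_kl is easy to compute on the log scale; the marginal-likelihood values below are hypothetical:

```python
import math

def posterior_odds(log_ml_k, log_ml_l, prior_k, prior_l):
    """PO_kl = Bayes factor (ratio of marginal likelihoods) times the
    prior odds; assembled on the log scale for numerical stability."""
    log_B_kl = log_ml_k - log_ml_l
    log_PrO_kl = math.log(prior_k) - math.log(prior_l)
    return math.exp(log_B_kl + log_PrO_kl)

# equal prior probabilities: posterior odds reduce to the Bayes factor
po = posterior_odds(-10.0, -11.0, 0.5, 0.5)
```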

Uniform prior on model space

If the prior model probabilities are defined via a decreasing function of the model dimension, then the prior model odds term

ξ_kl = −2 log PrO_kl = −2 log [ f(γ^(k)) / f(γ^(l)) ]

can also be interpreted as the extra penalty imposed on the Bayes factor. If the (usual) uniform prior distribution is used, then ξ_kl = 0 and PO_kl = B_kl for all models γ^(k), γ^(l) ∈ M, where M is the set of all models under consideration (the model space). A Bayesian benefit-only analysis can thus be obtained by using the uniform prior on the model space, basing the variable selection procedure on Bayes factors alone.

Prior model odds interpretation

A well-known rough approximation of −2 log B_kl (Schwarz, 1978):

−2 log B_kl = BIC_kl + O(1),
−2 log PO_kl = BIC_kl + ξ_kl + O(1) = Deviance_kl + (d_γ^(k) − d_γ^(l)) log n + ξ_kl + O(1),   (4)

where BIC_kl is the Bayesian Information Criterion difference (e.g., Kass and Wasserman, 1996; Raftery, 1995, 1996) for choosing between models γ^(k) and γ^(l). The BIC penalty equals F = log n for each parameter used. The overall (posterior) penalty imposed on the deviance measure is therefore (d_γ^(k) − d_γ^(l)) log n + ξ_kl.

Prior distributions

Prior on model parameters:

β_γ | γ ~ Normal( 0, 4n (X_γ^T X_γ)^{-1} ).

This is a low-information prior defined by Ntzoufras, Dellaportas and Forster (2003). It can be derived using the power prior of Chen et al. (2000) with imaginary data supporting the simplest model included in our model space, and it gives the prior weight equal to one data point. It is equivalent to Zellner's g-prior (with g = 4n) used for normal regression models.

A cost-penalised prior on model space (1): Preliminaries

We propose to specify our prior model probabilities via cost-dependent penalties for each variable. Denote by c_j the cost of covariate X_j and by c = (c_1, c_2, ..., c_p) the vector of costs of all variables under consideration. To specify this prior we define a baseline cost c_0, assumed to be a low acceptable cost for collecting the data of a covariate; the cost of each variable can then be written as c_j = k_j c_0. For the Bayesian benefit-only analysis we use the uniform prior on the model space.

A cost-penalised prior on model space (2): The five criteria

We specify our prior distribution on γ to satisfy the following five criteria:
(a) the prior must be unaffected by transformations c → α c with α > 0, so that conversion between time and money or between different monetary units (e.g., dollars and euros) leaves the prior unchanged;
(b) the extra penalty ξ_1 for adding a variable X_j with baseline cost c_0 is zero;
(c) the extra penalty ξ_2 for adding a variable X_j with cost c_j = κ c_0 for some κ > 1 equals the BIC penalty of (κ − 1) variables with cost c_0;
(d) the extra penalty ξ_3 for adding any variable X_j is greater than or equal to zero; and
(e) if all the variables have the same cost, then the prior must reduce to the uniform prior on γ.

A cost-penalised prior on model space (3): The five criteria, interpreted

(a) ensures that the prior is invariant with respect to the manner in which cost is measured.
(b) ensures that the penalty for adding a variable X_j with baseline cost c_0 is the same as in the benefit-only analysis.
(c) ensures that the posterior model odds still have a BIC-like behavior: the induced extra penalty equals the relative difference between the cost of X_j and that of a variable with cost c_0.
(d) ensures that the cost-benefit analysis supports more parsimonious models than the corresponding models supported by the benefit-only analysis.
(e) requires that our prior reproduce the benefit-only analysis when all costs are equal.

A cost-penalised prior on model space (4): The prior

The following theorem provides the only prior that meets the above five requirements, and fixes the choice of c_0.

Theorem 1. If a prior distribution f(γ) satisfies requirements (a)-(e) above, then it must be of the form

f(γ_j) ∝ exp[ −(γ_j / 2) (c_j / c_0 − 1) log n ]   for j = 1, ..., p,   (5)

where c_j is the marginal cost per observation for variable X_j and c_0 = min{c_j, j = 1, ..., p}. For a proof see Fouskakis, Ntzoufras and Draper (2009a).
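Equation (5) translates directly into a log-prior. A minimal sketch (the cost values are hypothetical), showing that a baseline-cost variable incurs no extra penalty and that rescaling all costs leaves the prior unchanged, as criteria (b) and (a) require:

```python
import math

def log_cost_prior(gamma, costs, n):
    """Log of the prior (5), up to an additive constant:
    -(log n / 2) * sum_j gamma_j * (c_j / c0 - 1), with the baseline
    cost c0 = min_j c_j as in Theorem 1."""
    c0 = min(costs)
    return -0.5 * math.log(n) * sum(
        g * (c / c0 - 1.0) for g, c in zip(gamma, costs))

# hypothetical costs: including the baseline-cost variable is free;
# including the variable costing 2*c0 incurs one extra BIC-like penalty
lp_cheap = log_cost_prior([1, 0], [1.0, 2.0], 100)   # 0.0
lp_dear = log_cost_prior([0, 1], [1.0, 2.0], 100)    # -0.5 * log(100)
```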

Posterior model odds: Cost-adjusted generalisation of BIC

Under the above prior, applying the BIC-based approximation (4) gives

−2 log PO_kl = −2 log [ f(y | β̂_γ^(k), γ^(k)) / f(y | β̂_γ^(l), γ^(l)) ] + [ (C_γ^(k) − C_γ^(l)) / c_0 ] log n + O(1),   (6)

where C_γ = Σ_{j=1}^p γ_j c_j is the cost of model γ.

The penalty term d_γ log n of model γ used in (4) has been replaced in the above expression by the cost-dependent penalty c_0^{-1} C_γ log n. Ignoring costs is equivalent to setting c_j = c_0 for all j, yielding c_0^{-1} C_γ = d_γ, the original BIC expression. We may interpret log n as the penalty imposed for each variable included in the model when no costs are considered; this baseline penalty is inflated proportionally to the cost ratio c_j / c_0 for each X_j. For example, if the cost of a variable X_j is twice the minimum cost (c_j = 2 c_0), then the imposed penalty is equivalent to adding two variables with the minimum cost. For all these reasons, (6) can be considered a cost-adjusted generalisation of BIC when prior model probabilities of type (5) are adopted.
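Putting (6) to work: a sketch of the cost-adjusted BIC score for a single model (log-likelihood and costs hypothetical), which reduces to the ordinary BIC when all costs are equal:

```python
import math

def cost_adjusted_bic(log_lik, gamma, costs, n):
    """-2 log-likelihood plus the cost-dependent penalty of (6):
    (C_gamma / c0) * log n, with C_gamma = sum_j gamma_j * c_j and
    c0 = min_j c_j. Equal costs recover the usual d_gamma * log n."""
    c0 = min(costs)
    C_gamma = sum(g * c for g, c in zip(gamma, costs))
    return -2.0 * log_lik + (C_gamma / c0) * math.log(n)

# a variable costing twice the minimum is penalised like two variables:
score = cost_adjusted_bic(-10.0, [1, 1], [1.0, 2.0], 100)  # 20 + 3*log(100)
```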

Implementation and results

Implementation details; the procedure:
1. Run RJMCMC (Green, 1995) for 100K iterations in the full model space.
2. Eliminate non-important variables (those with marginal probabilities < 0.30), forming a new reduced model space.
3. Run RJMCMC for 100K iterations in the reduced model space to estimate posterior model odds and identify the best models.

Two setups:
1. Benefit-only analysis (uniform prior on model space).
2. Cost-benefit analysis (cost-penalised prior on model space).
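The spirit of the model-space search can be sketched with a single-flip Metropolis walk over γ. This is an MC3-style stand-in, not the talk's reversible jump sampler, and the scoring function below is a hypothetical placeholder for a log posterior (e.g. minus half the cost-adjusted BIC):

```python
import math
import numpy as np

rng = np.random.default_rng(1)

def mc3_search(log_score, p, iters=1000):
    """Single-flip Metropolis walk over the model space {0,1}^p: propose
    toggling one inclusion indicator gamma_j and accept with probability
    min(1, exp(delta)), where delta is the change in log_score.
    Returns the most-visited model."""
    gamma = np.zeros(p, dtype=int)
    current = log_score(gamma)
    visits = {}
    for _ in range(iters):
        j = rng.integers(p)
        proposal = gamma.copy()
        proposal[j] = 1 - proposal[j]              # add/drop one variable
        new = log_score(proposal)
        if rng.random() < math.exp(min(0.0, new - current)):
            gamma, current = proposal, new
        key = tuple(int(g) for g in gamma)
        visits[key] = visits.get(key, 0) + 1
    return max(visits, key=visits.get)

# toy score strongly favouring the model {X1}: the walk settles there
best = mc3_search(lambda g: 5.0 * g[0] - 5.0 * g[1], p=2, iters=2000)
```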

Preliminary Results: Marginal Probabilities f(γ_j = 1 | y)

Variables with appreciable marginal inclusion probability under the benefit-only and/or cost-benefit analysis (variable costs and probability values omitted): 1 Systolic Blood Pressure (SBP) Score, 2 Age, 3 Blood Urea Nitrogen, 4 APACHE II Coma Score, 5 Shortness of Breath Day 1, Septic Complications, Initial Temperature, Heart Rate Day 1, Chest Pain Day 1, Cardiomegaly Score, Hematologic History Score, APACHE Respiratory Rate Score, Admission SBP, Respiratory Rate Day 1, Confusion Day 1, APACHE pH Score, Morbid + Comorbid Score, Musculoskeletal Score. Each analysis retains 13 variables.

Reduced Model Space: Posterior Model Probabilities/Odds

Common variables in both analyses: X_1 + X_2 + X_3 + X_5 + X_12 + X_70.

Benefit-only analysis (common variables within the analysis: X_4 + X_15 + X_37 + X_73): the four best models add, respectively, X_8 + X_27 + X_44; X_8 + X_44; X_44; and X_27 + X_44.

Cost-benefit analysis (common variables within the analysis: X_46 + X_51): the best models add combinations of X_13, X_14, X_37, X_49 and X_74.

Only models with posterior probability above 3% are shown; PO_1k denotes the posterior odds of the best model within each analysis versus the current model k (model costs, posterior probabilities and odds values omitted).

Reduced Model Space: Comparisons

Comparison of measures of fit, cost and dimensionality between the best models in the reduced model space of the benefit-only and cost-benefit analyses; the percentage difference is in relation to benefit-only. The comparison covered minimum deviance, median deviance, cost and dimension for the two analyses (values omitted).

5 Utility versus Cost-Adjusted BIC

For each of the p = 14 RAND-scale variables, the table compared the data collection cost (in minutes) with whether the variable was judged good by the Utility method and by the RJMCMC (cost-adjusted BIC) method, together with its posterior probability (values omitted). Variables: 1 Systolic Blood Pressure Score, 2 Age, 3 Blood Urea Nitrogen, 4 APACHE II Coma Score, 5 Shortness of Breath Day 1 (yes, no), 6 Serum Albumin Score (3-point scale), 7 Respiratory Distress (yes, no), Septic Complications (yes, no), Prior Respiratory Failure (yes, no), Recently Hospitalized (yes, no), Initial Temperature, Chest X-ray Congestive Heart Failure Score, Ambulatory Score, Total APACHE II Score.

It is clear that the Utility and Cost-Adjusted BIC approaches have reached nearly identical conclusions in the Small World of p = 14 predictors.

With p = 83 the agreement between the two methods is also strong (although not as strong as with p = 14): using the star system for variable importance given in Fouskakis, Ntzoufras and Draper (2009a), 60 variables were ignored by both methods, 8 variables had identical star patterns, 3 variables were chosen as important by both methods but with different star patterns, 10 variables were marked as important by the utility approach and not by RJMCMC, and 2 variables were singled out by RJMCMC and not by utility: thus the two methods substantially agreed on the importance of 71 (86%) of the 83 variables.

Best models by method (cost, median deviance and LS_CV values omitted): for p = 14, RJMCMC selected X_1 + X_2 + X_3 + X_4 + X_5 + X_6 + X_7 + ... and Utility X_1 + X_2 + X_3 + X_4 + X_5 + X_7 + ...; for p = 83, the selected models included X_1 + X_2 + X_3 + X_5 + X_12 + X_46 + X_49 + X_51 + X_70 + ... and X_1 + X_3 + X_4 + X_12 + X_46 + X_49 + ...

To the extent that the two methods differ, the utility method favors models that cost somewhat less but also predict somewhat less well.

6 Discussion

The fact that the two methods may yield somewhat different results in high-dimensional problems does not mean that either is wrong; they are both valid solutions to similar but not identical problems. Both methods lead to noticeably better models (in a cost-benefit sense) than frequentist or Bayesian benefit-only approaches when, as is often the case, cost is an issue that must be included in the problem formulation to arrive at a policy-relevant solution.

In comparing two or more models, to say whether one is better than another I have to face the question: better for what purpose? This makes model specification a decision problem: I need either (a) to elicit a utility structure specific to the goals of the current study and maximise expected utility to find the best models, or (b), if (a) is too hard (e.g., because the problem has a group-decision character), to look for a principled alternative, like the cost-adjusted BIC method described here, that approximates the utility approach while avoiding ambiguities in utility specification.

Authors' related work

Draper D, Fouskakis D (2000). A case study of stochastic optimization in health policy: problem formulation and preliminary results. Journal of Global Optimization, 18.
Fouskakis D, Draper D (2002). Stochastic optimization: a review. International Statistical Review, 70.
Fouskakis D, Draper D (2008). Comparing stochastic optimization methods for variable selection in binary outcome prediction, with application to health policy. Journal of the American Statistical Association, 103.
Fouskakis D, Ntzoufras I, Draper D (2009a). Bayesian variable selection using cost-adjusted BIC, with application to cost-effective measurement of quality of health care. Annals of Applied Statistics, 3.
Fouskakis D, Ntzoufras I, Draper D (2009b). Population-based reversible jump MCMC for Bayesian variable selection and evaluation under cost limit restrictions. Journal of the Royal Statistical Society C (Applied Statistics), 58.

Additional References

Arrow KJ (1963). Social Choice and Individual Values. Wiley, New York.
Chen MH, Ibrahim JG, Shao QM (2000). Power prior distributions for generalized linear models. Journal of Statistical Planning and Inference, 84.
Green P (1995). Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika, 82.
Keeler E, Kahn K, Draper D, Sherwood M, Rubenstein L, Reinisch E, Kosecoff J, Brook R (1990). Changes in sickness at admission following the introduction of the Prospective Payment System. Journal of the American Medical Association, 264.
Ntzoufras I, Dellaportas P, Forster JJ (2003). Bayesian variable and link determination for generalized linear models. Journal of Statistical Planning and Inference, 111.
Schwarz G (1978). Estimating the dimension of a model. Annals of Statistics, 6.
Weerahandi S, Zidek JV (1983). Elements of multi-Bayesian decision theory. Annals of Statistics, 11.


More information

Illustrating the Implicit BIC Prior. Richard Startz * revised June Abstract

Illustrating the Implicit BIC Prior. Richard Startz * revised June Abstract Illustrating the Implicit BIC Prior Richard Startz * revised June 2013 Abstract I show how to find the uniform prior implicit in using the Bayesian Information Criterion to consider a hypothesis about

More information

Computer Vision Group Prof. Daniel Cremers. 10a. Markov Chain Monte Carlo

Computer Vision Group Prof. Daniel Cremers. 10a. Markov Chain Monte Carlo Group Prof. Daniel Cremers 10a. Markov Chain Monte Carlo Markov Chain Monte Carlo In high-dimensional spaces, rejection sampling and importance sampling are very inefficient An alternative is Markov Chain

More information

ECLT 5810 Linear Regression and Logistic Regression for Classification. Prof. Wai Lam

ECLT 5810 Linear Regression and Logistic Regression for Classification. Prof. Wai Lam ECLT 5810 Linear Regression and Logistic Regression for Classification Prof. Wai Lam Linear Regression Models Least Squares Input vectors is an attribute / feature / predictor (independent variable) The

More information

Proteomics and Variable Selection

Proteomics and Variable Selection Proteomics and Variable Selection p. 1/55 Proteomics and Variable Selection Alex Lewin With thanks to Paul Kirk for some graphs Department of Epidemiology and Biostatistics, School of Public Health, Imperial

More information

Lecture Notes 1: Decisions and Data. In these notes, I describe some basic ideas in decision theory. theory is constructed from

Lecture Notes 1: Decisions and Data. In these notes, I describe some basic ideas in decision theory. theory is constructed from Topics in Data Analysis Steven N. Durlauf University of Wisconsin Lecture Notes : Decisions and Data In these notes, I describe some basic ideas in decision theory. theory is constructed from The Data:

More information

Statistical Methods in Particle Physics Lecture 1: Bayesian methods

Statistical Methods in Particle Physics Lecture 1: Bayesian methods Statistical Methods in Particle Physics Lecture 1: Bayesian methods SUSSP65 St Andrews 16 29 August 2009 Glen Cowan Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk www.pp.rhul.ac.uk/~cowan

More information

Bayesian hypothesis testing for the distribution of insurance claim counts using the Gibbs sampler

Bayesian hypothesis testing for the distribution of insurance claim counts using the Gibbs sampler Journal of Computational Methods in Sciences and Engineering 5 (2005) 201 214 201 IOS Press Bayesian hypothesis testing for the distribution of insurance claim counts using the Gibbs sampler Athanassios

More information

BAYESIAN ANALYSIS OF CORRELATED PROPORTIONS

BAYESIAN ANALYSIS OF CORRELATED PROPORTIONS Sankhyā : The Indian Journal of Statistics 2001, Volume 63, Series B, Pt. 3, pp 270-285 BAYESIAN ANALYSIS OF CORRELATED PROPORTIONS By MARIA KATERI, University of Ioannina TAKIS PAPAIOANNOU University

More information

Problems with Penalised Maximum Likelihood and Jeffrey s Priors to Account For Separation in Large Datasets with Rare Events

Problems with Penalised Maximum Likelihood and Jeffrey s Priors to Account For Separation in Large Datasets with Rare Events Problems with Penalised Maximum Likelihood and Jeffrey s Priors to Account For Separation in Large Datasets with Rare Events Liam F. McGrath September 15, 215 Abstract When separation is a problem in binary

More information

Machine Learning Linear Classification. Prof. Matteo Matteucci

Machine Learning Linear Classification. Prof. Matteo Matteucci Machine Learning Linear Classification Prof. Matteo Matteucci Recall from the first lecture 2 X R p Regression Y R Continuous Output X R p Y {Ω 0, Ω 1,, Ω K } Classification Discrete Output X R p Y (X)

More information

Causal Inference with Big Data Sets

Causal Inference with Big Data Sets Causal Inference with Big Data Sets Marcelo Coca Perraillon University of Colorado AMC November 2016 1 / 1 Outlone Outline Big data Causal inference in economics and statistics Regression discontinuity

More information

MCMC: Markov Chain Monte Carlo

MCMC: Markov Chain Monte Carlo I529: Machine Learning in Bioinformatics (Spring 2013) MCMC: Markov Chain Monte Carlo Yuzhen Ye School of Informatics and Computing Indiana University, Bloomington Spring 2013 Contents Review of Markov

More information

σ(a) = a N (x; 0, 1 2 ) dx. σ(a) = Φ(a) =

σ(a) = a N (x; 0, 1 2 ) dx. σ(a) = Φ(a) = Until now we have always worked with likelihoods and prior distributions that were conjugate to each other, allowing the computation of the posterior distribution to be done in closed form. Unfortunately,

More information

Model Comparison. Course on Bayesian Inference, WTCN, UCL, February Model Comparison. Bayes rule for models. Linear Models. AIC and BIC.

Model Comparison. Course on Bayesian Inference, WTCN, UCL, February Model Comparison. Bayes rule for models. Linear Models. AIC and BIC. Course on Bayesian Inference, WTCN, UCL, February 2013 A prior distribution over model space p(m) (or hypothesis space ) can be updated to a posterior distribution after observing data y. This is implemented

More information

Chris Fraley and Daniel Percival. August 22, 2008, revised May 14, 2010

Chris Fraley and Daniel Percival. August 22, 2008, revised May 14, 2010 Model-Averaged l 1 Regularization using Markov Chain Monte Carlo Model Composition Technical Report No. 541 Department of Statistics, University of Washington Chris Fraley and Daniel Percival August 22,

More information

L applicazione dei metodi Bayesiani nella Farmacoeconomia

L applicazione dei metodi Bayesiani nella Farmacoeconomia L applicazione dei metodi Bayesiani nella Farmacoeconomia Gianluca Baio Department of Statistical Science, University College London (UK) Department of Statistics, University of Milano Bicocca (Italy)

More information

Investigation into the use of confidence indicators with calibration

Investigation into the use of confidence indicators with calibration WORKSHOP ON FRONTIERS IN BENCHMARKING TECHNIQUES AND THEIR APPLICATION TO OFFICIAL STATISTICS 7 8 APRIL 2005 Investigation into the use of confidence indicators with calibration Gerard Keogh and Dave Jennings

More information

Model Selection Tutorial 2: Problems With Using AIC to Select a Subset of Exposures in a Regression Model

Model Selection Tutorial 2: Problems With Using AIC to Select a Subset of Exposures in a Regression Model Model Selection Tutorial 2: Problems With Using AIC to Select a Subset of Exposures in a Regression Model Centre for Molecular, Environmental, Genetic & Analytic (MEGA) Epidemiology School of Population

More information

Logistic Regression. Advanced Methods for Data Analysis (36-402/36-608) Spring 2014

Logistic Regression. Advanced Methods for Data Analysis (36-402/36-608) Spring 2014 Logistic Regression Advanced Methods for Data Analysis (36-402/36-608 Spring 204 Classification. Introduction to classification Classification, like regression, is a predictive task, but one in which the

More information

Bayesian Inference in GLMs. Frequentists typically base inferences on MLEs, asymptotic confidence

Bayesian Inference in GLMs. Frequentists typically base inferences on MLEs, asymptotic confidence Bayesian Inference in GLMs Frequentists typically base inferences on MLEs, asymptotic confidence limits, and log-likelihood ratio tests Bayesians base inferences on the posterior distribution of the unknowns

More information

Specification of prior distributions under model uncertainty

Specification of prior distributions under model uncertainty Specification of prior distributions under model uncertainty Petros Dellaportas 1, Jonathan J. Forster and Ioannis Ntzoufras 3 September 15, 008 SUMMARY We consider the specification of prior distributions

More information

Bayesian non-parametric model to longitudinally predict churn

Bayesian non-parametric model to longitudinally predict churn Bayesian non-parametric model to longitudinally predict churn Bruno Scarpa Università di Padova Conference of European Statistics Stakeholders Methodologists, Producers and Users of European Statistics

More information

Robust Bayesian Variable Selection for Modeling Mean Medical Costs

Robust Bayesian Variable Selection for Modeling Mean Medical Costs Robust Bayesian Variable Selection for Modeling Mean Medical Costs Grace Yoon 1,, Wenxin Jiang 2, Lei Liu 3 and Ya-Chen T. Shih 4 1 Department of Statistics, Texas A&M University 2 Department of Statistics,

More information

Markov Chain Monte Carlo methods

Markov Chain Monte Carlo methods Markov Chain Monte Carlo methods By Oleg Makhnin 1 Introduction a b c M = d e f g h i 0 f(x)dx 1.1 Motivation 1.1.1 Just here Supresses numbering 1.1.2 After this 1.2 Literature 2 Method 2.1 New math As

More information

Bayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2016

Bayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2016 Bayesian Networks: Construction, Inference, Learning and Causal Interpretation Volker Tresp Summer 2016 1 Introduction So far we were mostly concerned with supervised learning: we predicted one or several

More information

Analysing geoadditive regression data: a mixed model approach

Analysing geoadditive regression data: a mixed model approach Analysing geoadditive regression data: a mixed model approach Institut für Statistik, Ludwig-Maximilians-Universität München Joint work with Ludwig Fahrmeir & Stefan Lang 25.11.2005 Spatio-temporal regression

More information

Quantile POD for Hit-Miss Data

Quantile POD for Hit-Miss Data Quantile POD for Hit-Miss Data Yew-Meng Koh a and William Q. Meeker a a Center for Nondestructive Evaluation, Department of Statistics, Iowa State niversity, Ames, Iowa 50010 Abstract. Probability of detection

More information

Part 8: GLMs and Hierarchical LMs and GLMs

Part 8: GLMs and Hierarchical LMs and GLMs Part 8: GLMs and Hierarchical LMs and GLMs 1 Example: Song sparrow reproductive success Arcese et al., (1992) provide data on a sample from a population of 52 female song sparrows studied over the course

More information

Model comparison. Patrick Breheny. March 28. Introduction Measures of predictive power Model selection

Model comparison. Patrick Breheny. March 28. Introduction Measures of predictive power Model selection Model comparison Patrick Breheny March 28 Patrick Breheny BST 760: Advanced Regression 1/25 Wells in Bangladesh In this lecture and the next, we will consider a data set involving modeling the decisions

More information

Section IX. Introduction to Logistic Regression for binary outcomes. Poisson regression

Section IX. Introduction to Logistic Regression for binary outcomes. Poisson regression Section IX Introduction to Logistic Regression for binary outcomes Poisson regression 0 Sec 9 - Logistic regression In linear regression, we studied models where Y is a continuous variable. What about

More information

An Extended BIC for Model Selection

An Extended BIC for Model Selection An Extended BIC for Model Selection at the JSM meeting 2007 - Salt Lake City Surajit Ray Boston University (Dept of Mathematics and Statistics) Joint work with James Berger, Duke University; Susie Bayarri,

More information

Parameter Estimation. William H. Jefferys University of Texas at Austin Parameter Estimation 7/26/05 1

Parameter Estimation. William H. Jefferys University of Texas at Austin Parameter Estimation 7/26/05 1 Parameter Estimation William H. Jefferys University of Texas at Austin bill@bayesrules.net Parameter Estimation 7/26/05 1 Elements of Inference Inference problems contain two indispensable elements: Data

More information

Three-group ROC predictive analysis for ordinal outcomes

Three-group ROC predictive analysis for ordinal outcomes Three-group ROC predictive analysis for ordinal outcomes Tahani Coolen-Maturi Durham University Business School Durham University, UK tahani.maturi@durham.ac.uk June 26, 2016 Abstract Measuring the accuracy

More information

Machine Learning for OR & FE

Machine Learning for OR & FE Machine Learning for OR & FE Regression II: Regularization and Shrinkage Methods Martin Haugh Department of Industrial Engineering and Operations Research Columbia University Email: martin.b.haugh@gmail.com

More information

Undirected Graphical Models

Undirected Graphical Models Outline Hong Chang Institute of Computing Technology, Chinese Academy of Sciences Machine Learning Methods (Fall 2012) Outline Outline I 1 Introduction 2 Properties Properties 3 Generative vs. Conditional

More information

You know I m not goin diss you on the internet Cause my mama taught me better than that I m a survivor (What?) I m not goin give up (What?

You know I m not goin diss you on the internet Cause my mama taught me better than that I m a survivor (What?) I m not goin give up (What? You know I m not goin diss you on the internet Cause my mama taught me better than that I m a survivor (What?) I m not goin give up (What?) I m not goin stop (What?) I m goin work harder (What?) Sir David

More information

Hmms with variable dimension structures and extensions

Hmms with variable dimension structures and extensions Hmm days/enst/january 21, 2002 1 Hmms with variable dimension structures and extensions Christian P. Robert Université Paris Dauphine www.ceremade.dauphine.fr/ xian Hmm days/enst/january 21, 2002 2 1 Estimating

More information

Lecture 6: Model Checking and Selection

Lecture 6: Model Checking and Selection Lecture 6: Model Checking and Selection Melih Kandemir melih.kandemir@iwr.uni-heidelberg.de May 27, 2014 Model selection We often have multiple modeling choices that are equally sensible: M 1,, M T. Which

More information

Introduction to Bayesian Data Analysis

Introduction to Bayesian Data Analysis Introduction to Bayesian Data Analysis Phil Gregory University of British Columbia March 2010 Hardback (ISBN-10: 052184150X ISBN-13: 9780521841504) Resources and solutions This title has free Mathematica

More information

Cheng Soon Ong & Christian Walder. Canberra February June 2018

Cheng Soon Ong & Christian Walder. Canberra February June 2018 Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2018 (Many figures from C. M. Bishop, "Pattern Recognition and ") 1of 143 Part IV

More information

Bayesian Nonparametric Regression for Diabetes Deaths

Bayesian Nonparametric Regression for Diabetes Deaths Bayesian Nonparametric Regression for Diabetes Deaths Brian M. Hartman PhD Student, 2010 Texas A&M University College Station, TX, USA David B. Dahl Assistant Professor Texas A&M University College Station,

More information

Bayesian modelling of football outcomes. 1 Introduction. Synopsis. (using Skellam s Distribution)

Bayesian modelling of football outcomes. 1 Introduction. Synopsis. (using Skellam s Distribution) Karlis & Ntzoufras: Bayesian modelling of football outcomes 3 Bayesian modelling of football outcomes (using Skellam s Distribution) Dimitris Karlis & Ioannis Ntzoufras e-mails: {karlis, ntzoufras}@aueb.gr

More information

Frailty Modeling for Spatially Correlated Survival Data, with Application to Infant Mortality in Minnesota By: Sudipto Banerjee, Mela. P.

Frailty Modeling for Spatially Correlated Survival Data, with Application to Infant Mortality in Minnesota By: Sudipto Banerjee, Mela. P. Frailty Modeling for Spatially Correlated Survival Data, with Application to Infant Mortality in Minnesota By: Sudipto Banerjee, Melanie M. Wall, Bradley P. Carlin November 24, 2014 Outlines of the talk

More information

A generalization of the Multiple-try Metropolis algorithm for Bayesian estimation and model selection

A generalization of the Multiple-try Metropolis algorithm for Bayesian estimation and model selection A generalization of the Multiple-try Metropolis algorithm for Bayesian estimation and model selection Silvia Pandolfi Francesco Bartolucci Nial Friel University of Perugia, IT University of Perugia, IT

More information

The Bayesian Approach to Multi-equation Econometric Model Estimation

The Bayesian Approach to Multi-equation Econometric Model Estimation Journal of Statistical and Econometric Methods, vol.3, no.1, 2014, 85-96 ISSN: 2241-0384 (print), 2241-0376 (online) Scienpress Ltd, 2014 The Bayesian Approach to Multi-equation Econometric Model Estimation

More information

Posterior Model Probabilities via Path-based Pairwise Priors

Posterior Model Probabilities via Path-based Pairwise Priors Posterior Model Probabilities via Path-based Pairwise Priors James O. Berger 1 Duke University and Statistical and Applied Mathematical Sciences Institute, P.O. Box 14006, RTP, Durham, NC 27709, U.S.A.

More information

Bayesian Model Specification: Toward a Theory of Applied Statistics

Bayesian Model Specification: Toward a Theory of Applied Statistics Bayesian Model Specification: Toward a Theory of Applied Statistics David Draper Department of Applied Mathematics and Statistics University of California, Santa Cruz draper@ams.ucsc.edu www.ams.ucsc.edu/

More information

Efficient adaptive covariate modelling for extremes

Efficient adaptive covariate modelling for extremes Efficient adaptive covariate modelling for extremes Slides at www.lancs.ac.uk/ jonathan Matthew Jones, David Randell, Emma Ross, Elena Zanini, Philip Jonathan Copyright of Shell December 218 1 / 23 Structural

More information

Building a Prognostic Biomarker

Building a Prognostic Biomarker Building a Prognostic Biomarker Noah Simon and Richard Simon July 2016 1 / 44 Prognostic Biomarker for a Continuous Measure On each of n patients measure y i - single continuous outcome (eg. blood pressure,

More information

An Introduction to Path Analysis

An Introduction to Path Analysis An Introduction to Path Analysis PRE 905: Multivariate Analysis Lecture 10: April 15, 2014 PRE 905: Lecture 10 Path Analysis Today s Lecture Path analysis starting with multivariate regression then arriving

More information

Bayesian Networks in Educational Assessment

Bayesian Networks in Educational Assessment Bayesian Networks in Educational Assessment Estimating Parameters with MCMC Bayesian Inference: Expanding Our Context Roy Levy Arizona State University Roy.Levy@asu.edu 2017 Roy Levy MCMC 1 MCMC 2 Posterior

More information

Bayesian Statistical Methods. Jeff Gill. Department of Political Science, University of Florida

Bayesian Statistical Methods. Jeff Gill. Department of Political Science, University of Florida Bayesian Statistical Methods Jeff Gill Department of Political Science, University of Florida 234 Anderson Hall, PO Box 117325, Gainesville, FL 32611-7325 Voice: 352-392-0262x272, Fax: 352-392-8127, Email:

More information

Probabilistic machine learning group, Aalto University Bayesian theory and methods, approximative integration, model

Probabilistic machine learning group, Aalto University  Bayesian theory and methods, approximative integration, model Aki Vehtari, Aalto University, Finland Probabilistic machine learning group, Aalto University http://research.cs.aalto.fi/pml/ Bayesian theory and methods, approximative integration, model assessment and

More information

Biostatistics-Lecture 16 Model Selection. Ruibin Xi Peking University School of Mathematical Sciences

Biostatistics-Lecture 16 Model Selection. Ruibin Xi Peking University School of Mathematical Sciences Biostatistics-Lecture 16 Model Selection Ruibin Xi Peking University School of Mathematical Sciences Motivating example1 Interested in factors related to the life expectancy (50 US states,1969-71 ) Per

More information

Post-Selection Inference

Post-Selection Inference Classical Inference start end start Post-Selection Inference selected end model data inference data selection model data inference Post-Selection Inference Todd Kuffner Washington University in St. Louis

More information

Estimating the marginal likelihood with Integrated nested Laplace approximation (INLA)

Estimating the marginal likelihood with Integrated nested Laplace approximation (INLA) Estimating the marginal likelihood with Integrated nested Laplace approximation (INLA) arxiv:1611.01450v1 [stat.co] 4 Nov 2016 Aliaksandr Hubin Department of Mathematics, University of Oslo and Geir Storvik

More information

MODEL AVERAGING by Merlise Clyde 1

MODEL AVERAGING by Merlise Clyde 1 Chapter 13 MODEL AVERAGING by Merlise Clyde 1 13.1 INTRODUCTION In Chapter 12, we considered inference in a normal linear regression model with q predictors. In many instances, the set of predictor variables

More information

Classification 1: Linear regression of indicators, linear discriminant analysis

Classification 1: Linear regression of indicators, linear discriminant analysis Classification 1: Linear regression of indicators, linear discriminant analysis Ryan Tibshirani Data Mining: 36-462/36-662 April 2 2013 Optional reading: ISL 4.1, 4.2, 4.4, ESL 4.1 4.3 1 Classification

More information

Niche Modeling. STAMPS - MBL Course Woods Hole, MA - August 9, 2016

Niche Modeling. STAMPS - MBL Course Woods Hole, MA - August 9, 2016 Niche Modeling Katie Pollard & Josh Ladau Gladstone Institutes UCSF Division of Biostatistics, Institute for Human Genetics and Institute for Computational Health Science STAMPS - MBL Course Woods Hole,

More information

Introduction to Bayesian Statistics and Markov Chain Monte Carlo Estimation. EPSY 905: Multivariate Analysis Spring 2016 Lecture #10: April 6, 2016

Introduction to Bayesian Statistics and Markov Chain Monte Carlo Estimation. EPSY 905: Multivariate Analysis Spring 2016 Lecture #10: April 6, 2016 Introduction to Bayesian Statistics and Markov Chain Monte Carlo Estimation EPSY 905: Multivariate Analysis Spring 2016 Lecture #10: April 6, 2016 EPSY 905: Intro to Bayesian and MCMC Today s Class An

More information

Tuning Parameter Selection in L1 Regularized Logistic Regression

Tuning Parameter Selection in L1 Regularized Logistic Regression Virginia Commonwealth University VCU Scholars Compass Theses and Dissertations Graduate School 2012 Tuning Parameter Selection in L1 Regularized Logistic Regression Shujing Shi Virginia Commonwealth University

More information

Modelling geoadditive survival data

Modelling geoadditive survival data Modelling geoadditive survival data Thomas Kneib & Ludwig Fahrmeir Department of Statistics, Ludwig-Maximilians-University Munich 1. Leukemia survival data 2. Structured hazard regression 3. Mixed model

More information

Strong Lens Modeling (II): Statistical Methods

Strong Lens Modeling (II): Statistical Methods Strong Lens Modeling (II): Statistical Methods Chuck Keeton Rutgers, the State University of New Jersey Probability theory multiple random variables, a and b joint distribution p(a, b) conditional distribution

More information

Health utilities' affect you are reported alongside underestimates of uncertainty

Health utilities' affect you are reported alongside underestimates of uncertainty Dr. Kelvin Chan, Medical Oncologist, Associate Scientist, Odette Cancer Centre, Sunnybrook Health Sciences Centre and Dr. Eleanor Pullenayegum, Senior Scientist, Hospital for Sick Children Title: Underestimation

More information

LINEAR MODELS FOR CLASSIFICATION. J. Elder CSE 6390/PSYC 6225 Computational Modeling of Visual Perception

LINEAR MODELS FOR CLASSIFICATION. J. Elder CSE 6390/PSYC 6225 Computational Modeling of Visual Perception LINEAR MODELS FOR CLASSIFICATION Classification: Problem Statement 2 In regression, we are modeling the relationship between a continuous input variable x and a continuous target variable t. In classification,

More information

Ronald Christensen. University of New Mexico. Albuquerque, New Mexico. Wesley Johnson. University of California, Irvine. Irvine, California

Ronald Christensen. University of New Mexico. Albuquerque, New Mexico. Wesley Johnson. University of California, Irvine. Irvine, California Texts in Statistical Science Bayesian Ideas and Data Analysis An Introduction for Scientists and Statisticians Ronald Christensen University of New Mexico Albuquerque, New Mexico Wesley Johnson University

More information

Bayes Factors, posterior predictives, short intro to RJMCMC. Thermodynamic Integration

Bayes Factors, posterior predictives, short intro to RJMCMC. Thermodynamic Integration Bayes Factors, posterior predictives, short intro to RJMCMC Thermodynamic Integration Dave Campbell 2016 Bayesian Statistical Inference P(θ Y ) P(Y θ)π(θ) Once you have posterior samples you can compute

More information

BIOL 51A - Biostatistics 1 1. Lecture 1: Intro to Biostatistics. Smoking: hazardous? FEV (l) Smoke

BIOL 51A - Biostatistics 1 1. Lecture 1: Intro to Biostatistics. Smoking: hazardous? FEV (l) Smoke BIOL 51A - Biostatistics 1 1 Lecture 1: Intro to Biostatistics Smoking: hazardous? FEV (l) 1 2 3 4 5 No Yes Smoke BIOL 51A - Biostatistics 1 2 Box Plot a.k.a box-and-whisker diagram or candlestick chart

More information

Reconstruction of individual patient data for meta analysis via Bayesian approach

Reconstruction of individual patient data for meta analysis via Bayesian approach Reconstruction of individual patient data for meta analysis via Bayesian approach Yusuke Yamaguchi, Wataru Sakamoto and Shingo Shirahata Graduate School of Engineering Science, Osaka University Masashi

More information

Stat 5101 Lecture Notes

Stat 5101 Lecture Notes Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random

More information

ECE521 week 3: 23/26 January 2017

ECE521 week 3: 23/26 January 2017 ECE521 week 3: 23/26 January 2017 Outline Probabilistic interpretation of linear regression - Maximum likelihood estimation (MLE) - Maximum a posteriori (MAP) estimation Bias-variance trade-off Linear

More information

Bayesian Modeling Using WinBUGS

Bayesian Modeling Using WinBUGS Bayesian Modeling Using WinBUGS WILEY SERIES IN COMPUTATIONAL STATISTICS Consulting Editors: Paolo Giudici University of Pavia, Italy Geof H. Givens Colorado State University, USA Bani K. Mallick Texas

More information

Using modern statistical methodology for validating and reporti. Outcomes

Using modern statistical methodology for validating and reporti. Outcomes Using modern statistical methodology for validating and reporting Patient Reported Outcomes Dept. of Biostatistics, Univ. of Copenhagen joint DSBS/FMS Meeting October 2, 2014, Copenhagen Agenda 1 Indirect

More information

Development of Stochastic Artificial Neural Networks for Hydrological Prediction

Development of Stochastic Artificial Neural Networks for Hydrological Prediction Development of Stochastic Artificial Neural Networks for Hydrological Prediction G. B. Kingston, M. F. Lambert and H. R. Maier Centre for Applied Modelling in Water Engineering, School of Civil and Environmental

More information

Assessing Regime Uncertainty Through Reversible Jump McMC

Assessing Regime Uncertainty Through Reversible Jump McMC Assessing Regime Uncertainty Through Reversible Jump McMC August 14, 2008 1 Introduction Background Research Question 2 The RJMcMC Method McMC RJMcMC Algorithm Dependent Proposals Independent Proposals

More information

Odds ratio estimation in Bernoulli smoothing spline analysis-ofvariance

Odds ratio estimation in Bernoulli smoothing spline analysis-ofvariance The Statistician (1997) 46, No. 1, pp. 49 56 Odds ratio estimation in Bernoulli smoothing spline analysis-ofvariance models By YUEDONG WANG{ University of Michigan, Ann Arbor, USA [Received June 1995.

More information

Obnoxious lateness humor

Obnoxious lateness humor Obnoxious lateness humor 1 Using Bayesian Model Averaging For Addressing Model Uncertainty in Environmental Risk Assessment Louise Ryan and Melissa Whitney Department of Biostatistics Harvard School of

More information

Variable selection for model-based clustering

Variable selection for model-based clustering Variable selection for model-based clustering Matthieu Marbac (Ensai - Crest) Joint works with: M. Sedki (Univ. Paris-sud) and V. Vandewalle (Univ. Lille 2) The problem Objective: Estimation of a partition

More information

Probabilistic classification CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2016

Probabilistic classification CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2016 Probabilistic classification CE-717: Machine Learning Sharif University of Technology M. Soleymani Fall 2016 Topics Probabilistic approach Bayes decision theory Generative models Gaussian Bayes classifier

More information