Bayesian non-parametric model to longitudinally predict churn
1 Bayesian non-parametric model to longitudinally predict churn. Bruno Scarpa, Università di Padova. Conference of European Statistics Stakeholders: Methodologists, Producers and Users of European Statistics. Rome, 25 November 2014.
2 Churn analysis. A typical problem for many companies with quite a large customer base is the evaluation of customer loyalty: which customers are most likely to abandon the company? These customers are often described as churners. The problem is prominent in sectors where customers have ongoing relationships with companies, i.e. service companies: banks, insurance companies, telecommunications providers, etc. Good models are needed for predicting deactivation (churn) by customers, so that appropriate retention actions can be carried out. A model is needed not only to fit the data and predict future churn, but also, possibly, to suggest marketing actions, e.g. customer retention strategies.
10 Churn analysis. Goal: find for each customer a score of propensity to churn; understand which variables affect the customer's decision to churn, and measure this effect. Understanding effects matters more than sheer predictive accuracy. Typically a data mining model is fitted to a random sample of customer-base data. In this work we consider the prediction of churn for the customer base of a telecommunications company.
15 Sources: socio-demographic data; subscription data; usage & network data; call center data (calls, complaints, billing problems).
16 Longitudinal data. In service companies data are often collected at different time instants; for example, monthly telephone traffic is considered an important predictor of churn. These are longitudinal data that are rarely used in this form to predict churn (typically only some sort of index numbers are used). One possibility for handling this type of data is to treat traffic as functional data: we can then use tools for analysing the relationship between a functional predictor and a binary response.
20 Goal of analysis. Determine whether patterns of phone traffic (number, duration and value of monthly calls) are related to churn. Here our outcome, churn, is univariate, while our predictor, phone traffic, is longitudinal. How do we characterize the pattern of traffic? The number, duration and value measures of traffic are examples of functional predictors: random curves that vary over time, space, or some other domain, defined at every point of the domain but measured only at a finite set of points.
23 General problem and data structure. Interest: the relationship between the functional predictor f_i and the response z_i (inference and prediction). The predictor f_i takes value f_i(t) at location t ∈ {1, ..., T}. Data consist of {y_i, z_i}_{i=1}^n, with y_i = (y_{i1}, ..., y_{iT})^T, where: y_{ij} = error-prone measure of f_i(t_{ij}) (telephone traffic at month t_{ij}); t_{ij} = location (or time) of observation j; n_i = number of observations on subject i; z_i = response variable (churn). In addition we may have x_i = static predictor variables (age, sex, ...).
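The data layout above can be sketched as a minimal container; the field names and example values are illustrative assumptions, not the authors' actual data.

```python
import numpy as np

# Hypothetical container mirroring the notation on the slide: y_i are the
# T noisy monthly traffic measurements, x_i the static predictors, and
# z_i the binary churn indicator.
class Customer:
    def __init__(self, y, x, z):
        self.y = np.asarray(y, dtype=float)   # y_i = (y_i1, ..., y_iT)^T
        self.x = np.asarray(x, dtype=float)   # static predictors (age, sex, ...)
        self.z = int(z)                       # churn indicator in {0, 1}

# Example: 9 monthly measurements, two static predictors, a churner.
cust = Customer(y=[12, 10, 11, 9, 7, 5, 4, 2, 1], x=[1.0, 34.0], z=1)
print(cust.y.shape)
```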
27 Latent class trajectory model. Group-based trajectory models are used to identify clusters of subjects following similar trajectories over time. While we may not believe that each subject's phone traffic measures exactly follow one of K curves, this may be a very useful summary of the data.
29 Issues. A good choice of parametric form for the latent trajectory curves is often unclear (prefer a nonparametric form?). iid N(0, σ²) residuals are restrictive and may imply many latent classes. The number of latent classes is unknown, and BIC-type criteria may perform poorly.
32 Our interest. Use a semiparametric Bayes joint modelling framework. For ease of interpretation, group individuals into functional-predictor clusters (patterns of traffic), with the number of clusters not specified in advance. Allow the response (churn) distribution to vary nonparametrically across clusters. Conduct inference on changes in churn.
36 The data. 3000 post-paid SIM cards; number of outgoing calls for 9 consecutive months; socio-demographic characteristics (sex, age, ...) and contract-related characteristics (services, payment method, ...); churn status (active/deactivated) after three months.
37 From data to model. Flexibility to capture irregularities, if present: nonparametric modelling. Estimate variability both between functional curves and between output distributions.
38 Bayesian nonparametric approach. We follow Bigelow and Dunson (2009) with some modifications: we use Gaussian processes as baseline measures (B&D 2009 use spline functions), and in the estimation algorithm we use a nested Metropolis-Hastings step. By-product: functional clustering.
39 Joint model. Joint modelling of {y_i, x_i, z_i}_{i=1}^n: (1) specification of a model for each component, y_i and z_i | x_i; (2) specification of a joint prior for the parameters of the two models.
40 Components of the model. Model for the output (churn), a GLM: z_i ~ Bin(1, π_i), with π_i = e^{ξ_i} / (1 + e^{ξ_i}) and ξ_i = a_i + x_i^T γ. Model for the trajectory: y_i(t) = f_i(t) + ε_{it}, ε_{it} ~ N(0, τ^{-1}), f_i ~ G. Joint model: θ_i = {f_i, a_i} ~ P. The dependence between the functional predictor f_i ∈ Ω and the response z_i ∈ R is characterised through P, a random probability measure on (R^{T+1}, B).
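The two components above can be sketched numerically; the parameter values below (a_i, γ, τ) are arbitrary placeholders for illustration, not fitted quantities.

```python
import numpy as np

# Response component: pi_i = exp(xi_i) / (1 + exp(xi_i)), xi_i = a_i + x_i^T gamma.
def churn_probability(a_i, x_i, gamma):
    xi = a_i + x_i @ gamma
    return 1.0 / (1.0 + np.exp(-xi))

# Trajectory component: y_i(t) = f_i(t) + eps_it, eps_it ~ N(0, 1/tau).
def trajectory_loglik(y_i, f_i, tau):
    resid = y_i - f_i
    T = len(y_i)
    return 0.5 * T * np.log(tau / (2 * np.pi)) - 0.5 * tau * np.sum(resid**2)

x_i = np.array([1.0, 0.0])
gamma = np.array([-1.2, 0.4])
print(round(churn_probability(0.5, x_i, gamma), 3))
```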
44 Gaussian process. To simplify modelling we give each function f_i a Gaussian process prior, f_i(t) ~ GP(µ, C), where µ is the mean function and C the covariance function. Given the discrete sequence of times (9 observations in our data), the GP induces a multivariate normal distribution on the observed points of the process, f(1), ..., f(T).
46 Gaussian process. Samples from a GP can take a very wide variety of shapes, with limited sensitivity to the mean function. We allow an unknown, fixed mean to avoid sensitivity to the scale of the phone traffic (this still allows a very wide variety of trajectory shapes). The covariance function C controls the types of shapes observed. We use the exponential covariance function, as it allows a wide variety of functional shapes (the squared exponential may overly favour smooth functions): C(t, t') = (1/κ₁) exp(−|t − t'| / κ₂), where κ₁ and κ₂ are unknown parameters.
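The exponential covariance and the induced multivariate normal can be sketched as follows; the κ values and mean level are illustrative, not the posterior estimates from the talk.

```python
import numpy as np

# Exponential covariance C(t, t') = (1/kappa1) * exp(-|t - t'| / kappa2),
# evaluated on the 9 monthly time points.
def exp_cov(times, kappa1, kappa2):
    d = np.abs(times[:, None] - times[None, :])
    return np.exp(-d / kappa2) / kappa1

times = np.arange(1, 10, dtype=float)        # 9 months
C = exp_cov(times, kappa1=2.0, kappa2=3.0)

# On a discrete grid the GP induces a multivariate normal distribution:
rng = np.random.default_rng(0)
mu = np.full(9, 5.0)                         # fixed mean level, for illustration
f = rng.multivariate_normal(mu, C)           # one sampled trajectory
print(C.shape)                               # diagonal of C equals 1/kappa1
```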
50 Dirichlet process joint models. θ_i = {f_i, a_i} ~ P. A natural approach is to let P be unknown, with P ~ DP(αP₀), where DP(αP₀) denotes the Dirichlet process (Ferguson, 1973), α is the precision parameter and P₀ the base probability measure.
53 Dirichlet process joint models. Stick-breaking representation (Sethuraman, 1994): P = Σ_{h=1}^∞ π_h δ_{θ*_h}, with θ*_h ~ P₀ iid, where δ_θ is the Dirac probability measure at the atom θ, and π_h = V_h Π_{l=1}^{h−1} (1 − V_l), with V_h ~ Beta(1, α) iid.
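The stick-breaking construction can be sketched with a standard finite truncation (the truncation level H is an implementation convenience, not part of the model):

```python
import numpy as np

# Truncated stick-breaking draw of DP weights:
# V_h ~ Beta(1, alpha), pi_h = V_h * prod_{l<h} (1 - V_l).
def stick_breaking(alpha, H, rng):
    V = rng.beta(1.0, alpha, size=H)
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - V)[:-1]))
    return V * remaining

rng = np.random.default_rng(1)
pi = stick_breaking(alpha=1.0, H=50, rng=rng)
print(pi.sum())  # first H weights sum to just under 1
```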
54 Dirichlet process joint models. Base measure: P₀ = GP(µ, C) × N(0, ν^{−1}); on the observed grid this is the (T+1)-variate normal P₀ = N_{T+1}((µ, 0)^T, blockdiag(C, ν^{−1})).
56 Prior distributions. The Bayesian specification is completed with priors: ν ~ Ga(a_ν, b_ν), precision of the response component; τ ~ Ga(a_τ, b_τ), error precision of the predictor component; κ₁ ~ Ga(a_{κ₁}, b_{κ₁}) and κ₂ ~ Ga(a_{κ₂}, b_{κ₂}), covariance function of the GP; γ_l ~ N(γ₀, η_l^{−1}), static variable effects; η_l ~ Ga(a_η, b_η), l = 1, ..., p, variances for the static variable effects.
57 Dirichlet process joint models. This DP prior induces the Blackwell and MacQueen (1973) Pólya urn scheme: (θ_i | θ_1, ..., θ_{i−1}) ~ (α/(α + i − 1)) P₀ + Σ_{j=1}^{i−1} (1/(α + i − 1)) δ_{θ_j}, where δ_θ is the measure concentrated at θ.
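The urn scheme above can be simulated directly: each new customer joins an existing cluster with probability proportional to its size, or opens a new one with probability proportional to α. This is a generic illustration of the prior's clustering behaviour, not the authors' sampler.

```python
import numpy as np

# Simulate cluster allocations under the Blackwell-MacQueen Polya urn.
def polya_urn_allocations(n, alpha, rng):
    labels = [0]
    counts = [1]                                  # cluster sizes
    for i in range(1, n):
        probs = np.array(counts + [alpha], dtype=float)
        probs /= probs.sum()
        k = rng.choice(len(probs), p=probs)       # last index = new cluster
        if k == len(counts):
            counts.append(1)
        else:
            counts[k] += 1
        labels.append(k)
    return labels

rng = np.random.default_rng(2)
labels = polya_urn_allocations(3000, alpha=1.0, rng=rng)
print(len(set(labels)))  # number of clusters grows roughly like alpha * log(n)
```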
58 Comments on DP joint models. Subjects are automatically grouped into an unknown number of functional trajectory clusters. Cluster h has functional trajectory f*_h(t) ~ GP(µ_h, C) and response density Bin(1, π(a*_h + x^T γ)). The marginal density of z is a mixture of Bernoullis; within a predictor cluster, the density of z is a single Bernoulli. The DP assumes identical clusters in the predictor and the response.
63 Posterior distribution. Gibbs sampling is straightforward to implement, involving simple steps for sampling from standard distributions, but it is highly computationally intensive. P is almost surely discrete, which yields a clustering of the sample units (customers) without specifying the number of groups in advance. MCMC algorithm (Pólya urn + Gibbs sampler + Metropolis-Hastings); at each iteration: (1) allocate the units to groups; (2) update the group parameters (nested Metropolis-Hastings); (3) update the hyperparameters of the prior distributions.
67 Results interpretation. Label switching is a pain: the number and composition of groups change across iterations of the algorithm, so the output is not directly usable for clustering. This can be addressed by post-processing (Medvedovic and Sivaganesan, 2002), at a considerable extra computational cost: (1) obtain a distance matrix between sample units from the posterior output; (2) apply hierarchical clustering with complete linkage.
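The two post-processing steps can be sketched as follows; the saved allocation matrix here is simulated, standing in for the real MCMC output.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

# Rows = MCMC iterations, columns = customers; entry = cluster label.
rng = np.random.default_rng(3)
n_iter, n_units = 200, 30
alloc = rng.integers(0, 3, size=(n_iter, n_units))   # placeholder MCMC output

# Step 1: distance = 1 - P(units i and j share a cluster across iterations).
co = np.zeros((n_units, n_units))
for it in range(n_iter):
    co += (alloc[it][:, None] == alloc[it][None, :])
dist = 1.0 - co / n_iter
np.fill_diagonal(dist, 0.0)

# Step 2: hierarchical clustering with complete linkage.
Z = linkage(squareform(dist, checks=False), method="complete")
groups = fcluster(Z, t=3, criterion="maxclust")
print(groups.shape)  # one group label per customer
```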
72 Estimated trajectories and probabilities (figure).
73 Some clusters (figure).
74 The static variables. p1, p2, p3, p4, p5: tariff plan; m1, m2, m3: payment method; e1, e2, e3: age.
75 Lift improvement factor (figure): lift as a function of the fraction of predicted subjects, comparing balanced and unbalanced logistic models, balanced and unbalanced linear models, discriminant analysis, classification trees, MARS, GAM, SVM, random forests, bagging and boosting.
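The lift factor used for the comparison above can be sketched as follows; the scores and churn outcomes below are simulated, not the study's data.

```python
import numpy as np

# Lift at fraction q: churn rate among the top-q scored customers
# divided by the overall churn rate.
def lift(scores, churned, q):
    n_top = max(1, int(np.ceil(q * len(scores))))
    top = np.argsort(scores)[::-1][:n_top]
    return churned[top].mean() / churned.mean()

rng = np.random.default_rng(4)
scores = rng.random(1000)
churned = (rng.random(1000) < 0.05 + 0.6 * scores).astype(int)  # score-correlated churn
print(lift(scores, churned, q=0.1))  # a useful model has lift above 1
```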
More informationOutline. Binomial, Multinomial, Normal, Beta, Dirichlet. Posterior mean, MAP, credible interval, posterior distribution
Outline A short review on Bayesian analysis. Binomial, Multinomial, Normal, Beta, Dirichlet Posterior mean, MAP, credible interval, posterior distribution Gibbs sampling Revisit the Gaussian mixture model
More informationA Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness
A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness A. Linero and M. Daniels UF, UT-Austin SRC 2014, Galveston, TX 1 Background 2 Working model
More informationNonparametric Bayes Modeling
Nonparametric Bayes Modeling Lecture 6: Advanced Applications of DPMs David Dunson Department of Statistical Science, Duke University Tuesday February 2, 2010 Motivation Functional data analysis Variable
More informationPattern Recognition and Machine Learning
Christopher M. Bishop Pattern Recognition and Machine Learning ÖSpri inger Contents Preface Mathematical notation Contents vii xi xiii 1 Introduction 1 1.1 Example: Polynomial Curve Fitting 4 1.2 Probability
More informationHierarchical Modeling for Univariate Spatial Data
Hierarchical Modeling for Univariate Spatial Data Geography 890, Hierarchical Bayesian Models for Environmental Spatial Data Analysis February 15, 2011 1 Spatial Domain 2 Geography 890 Spatial Domain This
More informationA Fully Nonparametric Modeling Approach to. BNP Binary Regression
A Fully Nonparametric Modeling Approach to Binary Regression Maria Department of Applied Mathematics and Statistics University of California, Santa Cruz SBIES, April 27-28, 2012 Outline 1 2 3 Simulation
More informationBayesian nonparametrics
Bayesian nonparametrics 1 Some preliminaries 1.1 de Finetti s theorem We will start our discussion with this foundational theorem. We will assume throughout all variables are defined on the probability
More informationSTAT 518 Intro Student Presentation
STAT 518 Intro Student Presentation Wen Wei Loh April 11, 2013 Title of paper Radford M. Neal [1999] Bayesian Statistics, 6: 475-501, 1999 What the paper is about Regression and Classification Flexible
More informationAnalysing geoadditive regression data: a mixed model approach
Analysing geoadditive regression data: a mixed model approach Institut für Statistik, Ludwig-Maximilians-Universität München Joint work with Ludwig Fahrmeir & Stefan Lang 25.11.2005 Spatio-temporal regression
More informationRonald Christensen. University of New Mexico. Albuquerque, New Mexico. Wesley Johnson. University of California, Irvine. Irvine, California
Texts in Statistical Science Bayesian Ideas and Data Analysis An Introduction for Scientists and Statisticians Ronald Christensen University of New Mexico Albuquerque, New Mexico Wesley Johnson University
More informationA Nonparametric Approach Using Dirichlet Process for Hierarchical Generalized Linear Mixed Models
Journal of Data Science 8(2010), 43-59 A Nonparametric Approach Using Dirichlet Process for Hierarchical Generalized Linear Mixed Models Jing Wang Louisiana State University Abstract: In this paper, we
More informationBayesian Linear Regression
Bayesian Linear Regression Sudipto Banerjee 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. September 15, 2010 1 Linear regression models: a Bayesian perspective
More informationImage segmentation combining Markov Random Fields and Dirichlet Processes
Image segmentation combining Markov Random Fields and Dirichlet Processes Jessica SODJO IMS, Groupe Signal Image, Talence Encadrants : A. Giremus, J.-F. Giovannelli, F. Caron, N. Dobigeon Jessica SODJO
More informationResearch Article Spiked Dirichlet Process Priors for Gaussian Process Models
Hindawi Publishing Corporation Journal of Probability and Statistics Volume 200, Article ID 20489, 4 pages doi:0.55/200/20489 Research Article Spiked Dirichlet Process Priors for Gaussian Process Models
More informationDirichlet Processes: Tutorial and Practical Course
Dirichlet Processes: Tutorial and Practical Course (updated) Yee Whye Teh Gatsby Computational Neuroscience Unit University College London August 2007 / MLSS Yee Whye Teh (Gatsby) DP August 2007 / MLSS
More informationMotivation Scale Mixutres of Normals Finite Gaussian Mixtures Skew-Normal Models. Mixture Models. Econ 690. Purdue University
Econ 690 Purdue University In virtually all of the previous lectures, our models have made use of normality assumptions. From a computational point of view, the reason for this assumption is clear: combined
More informationHierarchical Modelling for Univariate Spatial Data
Hierarchical Modelling for Univariate Spatial Data Sudipto Banerjee 1 and Andrew O. Finley 2 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. 2 Department
More informationRelated Concepts: Lecture 9 SEM, Statistical Modeling, AI, and Data Mining. I. Terminology of SEM
Lecture 9 SEM, Statistical Modeling, AI, and Data Mining I. Terminology of SEM Related Concepts: Causal Modeling Path Analysis Structural Equation Modeling Latent variables (Factors measurable, but thru
More informationBayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features. Yangxin Huang
Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features Yangxin Huang Department of Epidemiology and Biostatistics, COPH, USF, Tampa, FL yhuang@health.usf.edu January
More informationNonparametric Bayes regression and classification through mixtures of product kernels
Nonparametric Bayes regression and classification through mixtures of product kernels David B. Dunson & Abhishek Bhattacharya Department of Statistical Science Box 90251, Duke University Durham, NC 27708-0251,
More informationFlexible Regression Modeling using Bayesian Nonparametric Mixtures
Flexible Regression Modeling using Bayesian Nonparametric Mixtures Athanasios Kottas Department of Applied Mathematics and Statistics University of California, Santa Cruz Department of Statistics Brigham
More informationFoundations of Nonparametric Bayesian Methods
1 / 27 Foundations of Nonparametric Bayesian Methods Part II: Models on the Simplex Peter Orbanz http://mlg.eng.cam.ac.uk/porbanz/npb-tutorial.html 2 / 27 Tutorial Overview Part I: Basics Part II: Models
More informationBayesian Nonparametrics: Dirichlet Process
Bayesian Nonparametrics: Dirichlet Process Yee Whye Teh Gatsby Computational Neuroscience Unit, UCL http://www.gatsby.ucl.ac.uk/~ywteh/teaching/npbayes2012 Dirichlet Process Cornerstone of modern Bayesian
More informationEfficient Bayesian Multivariate Surface Regression
Efficient Bayesian Multivariate Surface Regression Feng Li (joint with Mattias Villani) Department of Statistics, Stockholm University October, 211 Outline of the talk 1 Flexible regression models 2 The
More informationA Bayesian Nonparametric Model for Predicting Disease Status Using Longitudinal Profiles
A Bayesian Nonparametric Model for Predicting Disease Status Using Longitudinal Profiles Jeremy Gaskins Department of Bioinformatics & Biostatistics University of Louisville Joint work with Claudio Fuentes
More informationA general mixed model approach for spatio-temporal regression data
A general mixed model approach for spatio-temporal regression data Thomas Kneib, Ludwig Fahrmeir & Stefan Lang Department of Statistics, Ludwig-Maximilians-University Munich 1. Spatio-temporal regression
More informationPart 8: GLMs and Hierarchical LMs and GLMs
Part 8: GLMs and Hierarchical LMs and GLMs 1 Example: Song sparrow reproductive success Arcese et al., (1992) provide data on a sample from a population of 52 female song sparrows studied over the course
More informationNovember 2002 STA Random Effects Selection in Linear Mixed Models
November 2002 STA216 1 Random Effects Selection in Linear Mixed Models November 2002 STA216 2 Introduction It is common practice in many applications to collect multiple measurements on a subject. Linear
More informationStat 542: Item Response Theory Modeling Using The Extended Rank Likelihood
Stat 542: Item Response Theory Modeling Using The Extended Rank Likelihood Jonathan Gruhl March 18, 2010 1 Introduction Researchers commonly apply item response theory (IRT) models to binary and ordinal
More informationPrinciples of Bayesian Inference
Principles of Bayesian Inference Sudipto Banerjee University of Minnesota July 20th, 2008 1 Bayesian Principles Classical statistics: model parameters are fixed and unknown. A Bayesian thinks of parameters
More informationDavid B. Dahl. Department of Statistics, and Department of Biostatistics & Medical Informatics University of Wisconsin Madison
AN IMPROVED MERGE-SPLIT SAMPLER FOR CONJUGATE DIRICHLET PROCESS MIXTURE MODELS David B. Dahl dbdahl@stat.wisc.edu Department of Statistics, and Department of Biostatistics & Medical Informatics University
More informationNormalized kernel-weighted random measures
Normalized kernel-weighted random measures Jim Griffin University of Kent 1 August 27 Outline 1 Introduction 2 Ornstein-Uhlenbeck DP 3 Generalisations Bayesian Density Regression We observe data (x 1,
More informationColouring and breaking sticks, pairwise coincidence losses, and clustering expression profiles
Colouring and breaking sticks, pairwise coincidence losses, and clustering expression profiles Peter Green and John Lau University of Bristol P.J.Green@bristol.ac.uk Isaac Newton Institute, 11 December
More informationNonparametric Bayes tensor factorizations for big data
Nonparametric Bayes tensor factorizations for big data David Dunson Department of Statistical Science, Duke University Funded from NIH R01-ES017240, R01-ES017436 & DARPA N66001-09-C-2082 Motivation Conditional
More informationPATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS
PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS Parametric Distributions Basic building blocks: Need to determine given Representation: or? Recall Curve Fitting Binary Variables
More informationNonparametric Bayesian modeling for dynamic ordinal regression relationships
Nonparametric Bayesian modeling for dynamic ordinal regression relationships Athanasios Kottas Department of Applied Mathematics and Statistics, University of California, Santa Cruz Joint work with Maria
More informationPMR Learning as Inference
Outline PMR Learning as Inference Probabilistic Modelling and Reasoning Amos Storkey Modelling 2 The Exponential Family 3 Bayesian Sets School of Informatics, University of Edinburgh Amos Storkey PMR Learning
More informationOutline. Clustering. Capturing Unobserved Heterogeneity in the Austrian Labor Market Using Finite Mixtures of Markov Chain Models
Capturing Unobserved Heterogeneity in the Austrian Labor Market Using Finite Mixtures of Markov Chain Models Collaboration with Rudolf Winter-Ebmer, Department of Economics, Johannes Kepler University
More informationWeb Appendix for Hierarchical Adaptive Regression Kernels for Regression with Functional Predictors by D. B. Woodard, C. Crainiceanu, and D.
Web Appendix for Hierarchical Adaptive Regression Kernels for Regression with Functional Predictors by D. B. Woodard, C. Crainiceanu, and D. Ruppert A. EMPIRICAL ESTIMATE OF THE KERNEL MIXTURE Here we
More informationMultilevel Statistical Models: 3 rd edition, 2003 Contents
Multilevel Statistical Models: 3 rd edition, 2003 Contents Preface Acknowledgements Notation Two and three level models. A general classification notation and diagram Glossary Chapter 1 An introduction
More informationNon-parametric Clustering with Dirichlet Processes
Non-parametric Clustering with Dirichlet Processes Timothy Burns SUNY at Buffalo Mar. 31 2009 T. Burns (SUNY at Buffalo) Non-parametric Clustering with Dirichlet Processes Mar. 31 2009 1 / 24 Introduction
More informationInfinite-State Markov-switching for Dynamic. Volatility Models : Web Appendix
Infinite-State Markov-switching for Dynamic Volatility Models : Web Appendix Arnaud Dufays 1 Centre de Recherche en Economie et Statistique March 19, 2014 1 Comparison of the two MS-GARCH approximations
More informationRiemann Manifold Methods in Bayesian Statistics
Ricardo Ehlers ehlers@icmc.usp.br Applied Maths and Stats University of São Paulo, Brazil Working Group in Statistical Learning University College Dublin September 2015 Bayesian inference is based on Bayes
More informationBayesian Methods for Machine Learning
Bayesian Methods for Machine Learning CS 584: Big Data Analytics Material adapted from Radford Neal s tutorial (http://ftp.cs.utoronto.ca/pub/radford/bayes-tut.pdf), Zoubin Ghahramni (http://hunch.net/~coms-4771/zoubin_ghahramani_bayesian_learning.pdf),
More informationRecent Advances in Bayesian Inference Techniques
Recent Advances in Bayesian Inference Techniques Christopher M. Bishop Microsoft Research, Cambridge, U.K. research.microsoft.com/~cmbishop SIAM Conference on Data Mining, April 2004 Abstract Bayesian
More informationBayesian Nonparametric Regression for Diabetes Deaths
Bayesian Nonparametric Regression for Diabetes Deaths Brian M. Hartman PhD Student, 2010 Texas A&M University College Station, TX, USA David B. Dahl Assistant Professor Texas A&M University College Station,
More informationMULTILEVEL IMPUTATION 1
MULTILEVEL IMPUTATION 1 Supplement B: MCMC Sampling Steps and Distributions for Two-Level Imputation This document gives technical details of the full conditional distributions used to draw regression
More informationAn Alternative Infinite Mixture Of Gaussian Process Experts
An Alternative Infinite Mixture Of Gaussian Process Experts Edward Meeds and Simon Osindero Department of Computer Science University of Toronto Toronto, M5S 3G4 {ewm,osindero}@cs.toronto.edu Abstract
More informationBayesian Learning. HT2015: SC4 Statistical Data Mining and Machine Learning. Maximum Likelihood Principle. The Bayesian Learning Framework
HT5: SC4 Statistical Data Mining and Machine Learning Dino Sejdinovic Department of Statistics Oxford http://www.stats.ox.ac.uk/~sejdinov/sdmml.html Maximum Likelihood Principle A generative model for
More informationSpatial Bayesian Nonparametrics for Natural Image Segmentation
Spatial Bayesian Nonparametrics for Natural Image Segmentation Erik Sudderth Brown University Joint work with Michael Jordan University of California Soumya Ghosh Brown University Parsing Visual Scenes
More informationDynamic Generalized Linear Models
Dynamic Generalized Linear Models Jesse Windle Oct. 24, 2012 Contents 1 Introduction 1 2 Binary Data (Static Case) 2 3 Data Augmentation (de-marginalization) by 4 examples 3 3.1 Example 1: CDF method.............................
More informationScaling up Bayesian Inference
Scaling up Bayesian Inference David Dunson Departments of Statistical Science, Mathematics & ECE, Duke University May 1, 2017 Outline Motivation & background EP-MCMC amcmc Discussion Motivation & background
More informationThe Bayes classifier
The Bayes classifier Consider where is a random vector in is a random variable (depending on ) Let be a classifier with probability of error/risk given by The Bayes classifier (denoted ) is the optimal
More informationLogistic Regression. Seungjin Choi
Logistic Regression Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr http://mlg.postech.ac.kr/
More informationWavelet-Based Nonparametric Modeling of Hierarchical Functions in Colon Carcinogenesis
Wavelet-Based Nonparametric Modeling of Hierarchical Functions in Colon Carcinogenesis Jeffrey S. Morris University of Texas, MD Anderson Cancer Center Joint wor with Marina Vannucci, Philip J. Brown,
More informationNonparametric Bayesian Methods - Lecture I
Nonparametric Bayesian Methods - Lecture I Harry van Zanten Korteweg-de Vries Institute for Mathematics CRiSM Masterclass, April 4-6, 2016 Overview of the lectures I Intro to nonparametric Bayesian statistics
More informationA Brief Overview of Nonparametric Bayesian Models
A Brief Overview of Nonparametric Bayesian Models Eurandom Zoubin Ghahramani Department of Engineering University of Cambridge, UK zoubin@eng.cam.ac.uk http://learning.eng.cam.ac.uk/zoubin Also at Machine
More informationNon-parametric Bayesian Modeling and Fusion of Spatio-temporal Information Sources
th International Conference on Information Fusion Chicago, Illinois, USA, July -8, Non-parametric Bayesian Modeling and Fusion of Spatio-temporal Information Sources Priyadip Ray Department of Electrical
More informationA Nonparametric Bayesian Model for Multivariate Ordinal Data
A Nonparametric Bayesian Model for Multivariate Ordinal Data Athanasios Kottas, University of California at Santa Cruz Peter Müller, The University of Texas M. D. Anderson Cancer Center Fernando A. Quintana,
More informationA comparative review of variable selection techniques for covariate dependent Dirichlet process mixture models
A comparative review of variable selection techniques for covariate dependent Dirichlet process mixture models William Barcella 1, Maria De Iorio 1 and Gianluca Baio 1 1 Department of Statistical Science,
More informationPartial factor modeling: predictor-dependent shrinkage for linear regression
modeling: predictor-dependent shrinkage for linear Richard Hahn, Carlos Carvalho and Sayan Mukherjee JASA 2013 Review by Esther Salazar Duke University December, 2013 Factor framework The factor framework
More informationCS Lecture 19. Exponential Families & Expectation Propagation
CS 6347 Lecture 19 Exponential Families & Expectation Propagation Discrete State Spaces We have been focusing on the case of MRFs over discrete state spaces Probability distributions over discrete spaces
More informationNaïve Bayes classification
Naïve Bayes classification 1 Probability theory Random variable: a variable whose possible values are numerical outcomes of a random phenomenon. Examples: A person s height, the outcome of a coin toss
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 3 Linear
More informationIndex. Pagenumbersfollowedbyf indicate figures; pagenumbersfollowedbyt indicate tables.
Index Pagenumbersfollowedbyf indicate figures; pagenumbersfollowedbyt indicate tables. Adaptive rejection metropolis sampling (ARMS), 98 Adaptive shrinkage, 132 Advanced Photo System (APS), 255 Aggregation
More informationBayesian Point Process Modeling for Extreme Value Analysis, with an Application to Systemic Risk Assessment in Correlated Financial Markets
Bayesian Point Process Modeling for Extreme Value Analysis, with an Application to Systemic Risk Assessment in Correlated Financial Markets Athanasios Kottas Department of Applied Mathematics and Statistics,
More informationDefault Priors and Effcient Posterior Computation in Bayesian
Default Priors and Effcient Posterior Computation in Bayesian Factor Analysis January 16, 2010 Presented by Eric Wang, Duke University Background and Motivation A Brief Review of Parameter Expansion Literature
More informationGaussian Processes. Le Song. Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012
Gaussian Processes Le Song Machine Learning II: Advanced Topics CSE 8803ML, Spring 01 Pictorial view of embedding distribution Transform the entire distribution to expected features Feature space Feature
More informationA Process over all Stationary Covariance Kernels
A Process over all Stationary Covariance Kernels Andrew Gordon Wilson June 9, 0 Abstract I define a process over all stationary covariance kernels. I show how one might be able to perform inference that
More informationLocal Likelihood Bayesian Cluster Modeling for small area health data. Andrew Lawson Arnold School of Public Health University of South Carolina
Local Likelihood Bayesian Cluster Modeling for small area health data Andrew Lawson Arnold School of Public Health University of South Carolina Local Likelihood Bayesian Cluster Modelling for Small Area
More informationStat 5101 Lecture Notes
Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random
More informationPattern Recognition and Machine Learning. Bishop Chapter 2: Probability Distributions
Pattern Recognition and Machine Learning Chapter 2: Probability Distributions Cécile Amblard Alex Kläser Jakob Verbeek October 11, 27 Probability Distributions: General Density Estimation: given a finite
More informationDirichlet Process Mixtures of Generalized Linear Models
Lauren A. Hannah David M. Blei Warren B. Powell Department of Computer Science, Princeton University Department of Operations Research and Financial Engineering, Princeton University Department of Operations
More informationChapter 2. Data Analysis
Chapter 2 Data Analysis 2.1. Density Estimation and Survival Analysis The most straightforward application of BNP priors for statistical inference is in density estimation problems. Consider the generic
More informationChart types and when to use them
APPENDIX A Chart types and when to use them Pie chart Figure illustration of pie chart 2.3 % 4.5 % Browser Usage for April 2012 18.3 % 38.3 % Internet Explorer Firefox Chrome Safari Opera 35.8 % Pie chart
More informationHeriot-Watt University
Heriot-Watt University Heriot-Watt University Research Gateway Prediction of settlement delay in critical illness insurance claims by using the generalized beta of the second kind distribution Dodd, Erengul;
More informationVariational Bayesian Dirichlet-Multinomial Allocation for Exponential Family Mixtures
17th Europ. Conf. on Machine Learning, Berlin, Germany, 2006. Variational Bayesian Dirichlet-Multinomial Allocation for Exponential Family Mixtures Shipeng Yu 1,2, Kai Yu 2, Volker Tresp 2, and Hans-Peter
More informationGaussian processes for spatial modelling in environmental health: parameterizing for flexibility vs. computational efficiency
Gaussian processes for spatial modelling in environmental health: parameterizing for flexibility vs. computational efficiency Chris Paciorek March 11, 2005 Department of Biostatistics Harvard School of
More informationA Nonparametric Model for Stationary Time Series
A Nonparametric Model for Stationary Time Series Isadora Antoniano-Villalobos Bocconi University, Milan, Italy. isadora.antoniano@unibocconi.it Stephen G. Walker University of Texas at Austin, USA. s.g.walker@math.utexas.edu
More informationSupplement to A Hierarchical Approach for Fitting Curves to Response Time Measurements
Supplement to A Hierarchical Approach for Fitting Curves to Response Time Measurements Jeffrey N. Rouder Francis Tuerlinckx Paul L. Speckman Jun Lu & Pablo Gomez May 4 008 1 The Weibull regression model
More informationFrailty Modeling for Spatially Correlated Survival Data, with Application to Infant Mortality in Minnesota By: Sudipto Banerjee, Mela. P.
Frailty Modeling for Spatially Correlated Survival Data, with Application to Infant Mortality in Minnesota By: Sudipto Banerjee, Melanie M. Wall, Bradley P. Carlin November 24, 2014 Outlines of the talk
More informationBayesian Additive Regression Tree (BART) with application to controlled trail data analysis
Bayesian Additive Regression Tree (BART) with application to controlled trail data analysis Weilan Yang wyang@stat.wisc.edu May. 2015 1 / 20 Background CATE i = E(Y i (Z 1 ) Y i (Z 0 ) X i ) 2 / 20 Background
More informationNPFL108 Bayesian inference. Introduction. Filip Jurčíček. Institute of Formal and Applied Linguistics Charles University in Prague Czech Republic
NPFL108 Bayesian inference Introduction Filip Jurčíček Institute of Formal and Applied Linguistics Charles University in Prague Czech Republic Home page: http://ufal.mff.cuni.cz/~jurcicek Version: 21/02/2014
More information9/12/17. Types of learning. Modeling data. Supervised learning: Classification. Supervised learning: Regression. Unsupervised learning: Clustering
Types of learning Modeling data Supervised: we know input and targets Goal is to learn a model that, given input data, accurately predicts target data Unsupervised: we know the input only and want to make
More informationMultivariate Normal & Wishart
Multivariate Normal & Wishart Hoff Chapter 7 October 21, 2010 Reading Comprehesion Example Twenty-two children are given a reading comprehsion test before and after receiving a particular instruction method.
More information