Introduction to Bayesian nonparametrics
Andriy Norets
Department of Economics, Princeton University
September 2010

Outline
- Dirichlet process (DP) - a prior on the space of probability measures.
- Some applications of the DP.
- DP mixtures.
- Role of consistency in Bayesian nonparametrics.
- Schwartz posterior consistency theorem and extensions.
- Applications to conditional density estimation (consistency, MCMC estimation methods, comparison with classical methods on real data).

Books on Bayesian nonparametrics
- Ghosh and Ramamoorthi (2003) - foundations and theory up to 2003.
- Dey et al. (1998) - applications and MCMC estimation methods.
- Hjort (2010) - collection of up-to-date reviews of theoretical and applied work in different subfields.

Notation and discrete example
(X, B, P) - probability space for observations. M(X) - the space of probability measures on X (P ∈ M(X)). For Bayesian nonparametric inference we need to put a prior π on M(X).

Dirichlet process prior for X = {1,…,n} and M(X) = {p = (p_1,…,p_n) : p_i ≥ 0, Σ_i p_i = 1}:

  π(p_1,…,p_n) = [Γ(Σ_i α_i) / ∏_i Γ(α_i)] p_1^{α_1 − 1} ⋯ p_n^{α_n − 1},

known as the Dirichlet distribution with parameter α, D(α).
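The Dirichlet distribution above can be explored numerically. A minimal sketch (the parameter α is an arbitrary illustrative choice): draws from D(α) are points on the probability simplex, with mean α_i / Σ_j α_j.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = np.array([2.0, 3.0, 5.0])  # illustrative Dirichlet parameter

# Each draw from D(alpha) is a probability vector on {1, ..., n}.
draws = rng.dirichlet(alpha, size=100_000)

assert np.allclose(draws.sum(axis=1), 1.0)  # every draw is a p.m.f.
# Monte Carlo estimate of E[p_i] = alpha_i / sum(alpha) = (0.2, 0.3, 0.5)
print(draws.mean(axis=0))
```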
Discrete example continued
The likelihood has the same functional form as the prior:

  l(y_1,…,y_K | p) = p_1^{Σ_k 1{y_k = 1}} ⋯ p_n^{Σ_k 1{y_k = n}}.

Thus, the posterior is also a Dirichlet distribution:

  π(p | y) = D(α_1 + Σ_k 1{y_k = 1}, …, α_n + Σ_k 1{y_k = n}).

If we treat the vector α as a measure on X (α(i) = α_i),

  π(p | y) = D(α + Σ_{i=1}^K δ_{y_i}),

where δ_{y_i} is a point mass at y_i.

Ferguson (1973) construction
Kolmogorov's extension theorem for the general case. Define finite dimensional distributions first: for a finite partition X = ∪_{i=1}^n A_i, let

  (P(A_1),…,P(A_n)) ~ Dirichlet(α(A_1),…,α(A_n)),

where α is a measure on X (it is the parameter of the DP). Next, using properties of the Dirichlet distribution, show that an extension of these finite dimensional distributions to M(X), DP(α), is unique and that, as in the discrete case, the posterior is also DP(α + Σ_{i=1}^K δ_{y_i}).

Sethuraman (1994) construction
Sethuraman showed that the DP can be represented as

  DP(α) = Σ_{i=1}^∞ p_i δ_{θ_i},

where
  θ_i ~ α(·)/α(X), iid,
  p_1 = q_1, p_i = q_i ∏_{j<i} (1 − q_j)  (stick breaking),
  q_i ~ Beta(1, α(X)), iid.

Thus, the DP puts probability 1 on discrete probability measures.

Applications of DP prior
Ferguson (1973) showed that the posterior point estimates of the mean, variance, covariance, and CDF all converge to their sample analogs.
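The stick-breaking representation is easy to simulate after truncating the infinite sum. A hedged sketch (the base measure N(0,1) and the truncation level are illustrative choices, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(1)

def stick_breaking(concentration, base_sampler, n_atoms, rng):
    """Truncated Sethuraman representation of a draw from DP(alpha).

    concentration: alpha(X); base_sampler: draws from alpha(.)/alpha(X).
    Weights: p_1 = q_1, p_i = q_i * prod_{j<i}(1 - q_j), q_i ~ Beta(1, alpha(X)).
    """
    q = rng.beta(1.0, concentration, size=n_atoms)
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - q)[:-1]))
    weights = q * remaining
    atoms = base_sampler(n_atoms, rng)
    return atoms, weights

atoms, weights = stick_breaking(
    concentration=5.0,
    base_sampler=lambda n, rng: rng.standard_normal(n),  # base measure N(0,1)
    n_atoms=1000,
    rng=rng,
)
# The truncated weights nearly exhaust the unit stick: the realized measure
# is discrete, as the construction implies.
print(weights.sum())
```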
Application of DP prior to estimation of the CDF
Under squared loss, the prior point estimate is

  F̂(t) = ∫ P(−∞, t] DP_α(dP) = α(−∞, t] / α(ℝ),

because (−∞, t] and (t, ∞) form a partition of ℝ and under DP_α,

  P(−∞, t] ~ Beta(α(−∞, t], α(t, ∞)).

To get the posterior point estimate, use α + Σ_{i=1}^K δ_{y_i} instead of α:

  F̂(t | y_1,…,y_K) = [α(ℝ)/(α(ℝ)+K)] · α(−∞, t]/α(ℝ)   (prior CDF)
                    + [K/(α(ℝ)+K)] · (empirical CDF).

Multinomial-Dirichlet model
Several authors use a multinomial likelihood with support given by the observations in the sample (or by all possible values for the observations) and an improper Dirichlet prior on the multinomial probabilities. This can be seen as a limit of the DP as α(X) → 0: the prior becomes uninformative and DP_{α + Σ_{i=1}^K δ_{y_i}} → D(Σ_{i=1}^K δ_{y_i}). A parameter of interest is defined as a function of (p_1,…,p_K) (parameters of the multinomial likelihood) and (y_1,…,y_K) (observations); e.g., the mean is Σ_i p_i y_i.

Applications of the multinomial-Dirichlet model
- Rubin (1981) (Bayesian bootstrap): the multinomial-Dirichlet model delivers a posterior distribution for certain parameters (mean, variance) which is almost identical to the classical bootstrap distribution.
- Lo (1987): asymptotic results for the Bayesian bootstrap.
- Lancaster (2003): OLS with the White heteroskedasticity-robust variance-covariance matrix can be justified by the Bayesian bootstrap if the parameter of interest is the minimizer of E(y − βx)².
- Poirier (2008) obtains different versions of White's estimator used in the classical literature under different prior specifications.
- Chamberlain and Imbens (2003) suggest the multinomial-Dirichlet model with added prior moment equality restrictions as a Bayesian way to handle moment equality models.

DP mixtures (most common use of DP)

  y_i | m_i, s_i ~ N(m_i, s_i²),
  (m_i, s_i) ~ F,
  F ~ DP(α).

By the Sethuraman construction, this is equivalent to a countable mixture of normals:

  y_i ~ Σ_{j=1}^∞ p_j N(m_j, s_j²).

Practical MCMC algorithms are available (Escobar and West (1995), Neal (2000)).
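The posterior mean CDF above is a convex combination of the prior CDF and the empirical CDF. A minimal sketch, assuming (for illustration only) a base measure α proportional to N(0,1) with total mass a0:

```python
import numpy as np
from math import erf

def prior_cdf(t):
    # For alpha = a0 * N(0,1), alpha(-inf, t] / alpha(R) is the N(0,1) CDF.
    return 0.5 * (1.0 + erf(t / np.sqrt(2.0)))

def posterior_mean_cdf(t, y, a0):
    """Posterior point estimate of the CDF under DP(alpha) and squared loss:
    weight a0/(a0+K) on the prior CDF, K/(a0+K) on the empirical CDF."""
    K = len(y)
    empirical = np.mean(np.asarray(y) <= t)
    w = a0 / (a0 + K)
    return w * prior_cdf(t) + (1.0 - w) * empirical

y = [-0.3, 0.1, 0.5, 1.2]
# K = 4, w = 1/3; prior CDF at 0 is 0.5, empirical CDF at 0 is 0.25,
# so the estimate is 1/3 * 0.5 + 2/3 * 0.25 = 1/3.
print(posterior_mean_cdf(0.0, y, a0=2.0))
```

As a0 → 0 the estimate reduces to the empirical CDF, which is the multinomial-Dirichlet limit discussed next.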
The model is similar to a finite mixture of normals, which we'll consider below.
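Combining the stick-breaking weights with a base measure over component parameters gives a direct way to simulate from a DP mixture of normals. A sketch under illustrative assumptions (the base measure over (m, s) below is an arbitrary choice, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(2)

def sample_dp_mixture(n, concentration, n_atoms, rng):
    """Draw n observations from a DP mixture of normals using a truncated
    stick-breaking draw of F ~ DP(alpha). Illustrative base measure over
    (m, s): m ~ N(0, 4), s ~ |N(0, 1)| + 0.1."""
    q = rng.beta(1.0, concentration, size=n_atoms)
    w = q * np.concatenate(([1.0], np.cumprod(1.0 - q)[:-1]))
    w = w / w.sum()                          # renormalize after truncation
    means = 2.0 * rng.standard_normal(n_atoms)
    sds = np.abs(rng.standard_normal(n_atoms)) + 0.1
    comp = rng.choice(n_atoms, size=n, p=w)  # mixture component labels
    return means[comp] + sds[comp] * rng.standard_normal(n)

y = sample_dp_mixture(n=500, concentration=1.0, n_atoms=200, rng=rng)
print(y.shape)
```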
Applications of DP mixtures
- Dey et al. (1998) - a collection of applications.
- Hirano (2002) (and Hirano's dissertation) - flexible error term distributions in panel data models.
- Conley et al. (2008) - IV with flexible error term distributions.
- Burda et al. (2008) - discrete choice model with heterogeneous coefficients (the heterogeneity distribution is modeled by DP mixtures).

Motivation for studying consistency in the Bayesian framework
Priors in infinite (high) dimensional problems are hard to construct/elicit. There are known examples of seemingly reasonable priors that lead to unreasonable posteriors (Diaconis and Freedman (1986)). Posterior consistency (the posterior concentrates around the true parameter values asymptotically) provides an additional check on prior distributions.

(A version of) Schwartz posterior consistency theorem
For a measure μ on X, let L_μ be the space of probability densities w.r.t. μ. Let π be a prior on L_μ. If any Kullback-Leibler neighborhood of the true density f_0 ∈ L_μ has positive prior π probability, then the posterior is consistent in the weak topology:

  π(U | y_1,…,y_n) → 1, a.s. F_0^∞,

for any weak topology neighborhood U of f_0.

Generalizations of Schwartz theorem
- Barron (1988), Barron et al. (1999), Ghosal et al. (1999) - consistency for the total variation distance rather than the weak topology.
- Ghosal et al. (2000) - posterior convergence rates.
- Ghosal and van der Vaart (2007) - some Bayesian nonparametric procedures achieve minimax rates in estimation of a univariate density.
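Posterior concentration can be illustrated numerically in the simplest conjugate case (a Bernoulli model with a Beta posterior; this is only an illustration of the concentration phenomenon, not of the nonparametric theorem itself). The posterior mass of a fixed neighborhood of the true parameter grows toward 1 with the sample size:

```python
import numpy as np

rng = np.random.default_rng(3)
p0 = 0.3  # true Bernoulli parameter

def posterior_prob_neighborhood(n, eps=0.05, n_draws=200_000):
    """Posterior mass of {|p - p0| < eps} under a uniform Beta(1,1) prior
    after n Bernoulli(p0) observations, estimated by Monte Carlo."""
    y = rng.binomial(1, p0, size=n)
    draws = rng.beta(1 + y.sum(), 1 + n - y.sum(), size=n_draws)
    return np.mean(np.abs(draws - p0) < eps)

small, large = posterior_prob_neighborhood(20), posterior_prob_neighborhood(2000)
print(small, large)
```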
Two approaches based on mixtures
- Model the joint distribution by finite mixtures of multivariate normals (FMMN) and extract the conditional distributions of interest from the model (West, Mueller, and Escobar (1994), Wu and Ghosal (2009), Norets and Pelenis (2009)).
- Model the conditional distribution directly by smooth mixtures of regressions (SMR), in which the mixing probabilities are functions of covariates (Jacobs et al. (1991), Jiang and Tanner (1999), Geweke and Keane (2007), Villani et al. (2009), Norets (2010)).

FMMN and specifications of SMR
(Conditional) observables density:

  p(y | x, θ, M) = Σ_{m=1}^M α_m(x; θ) φ(y; μ_m(x; θ), σ_m(x; θ)² I),

where φ is the normal pdf, μ_m(x; θ) the mean, and σ_m(x; θ)² I the variance (a non-diagonal variance will be denoted by H). How to specify the mixing probabilities α_m(x; θ), means μ_m(x; θ), and variances σ_m(x; θ)²? They can be independent of x, depend on linear functions of x, or be flexible, e.g., (transformations of) polynomials in x.

Consistency results
If f(y|x) is in the KL closure of {p(y | x, θ, m), m = 1, 2, …}, then we can put a prior on the number of mixture components m and apply the Schwartz posterior consistency theorem to obtain consistency of the posterior in the weak topology. The results can be generalized from i.i.d. to Markov process settings as in Ghosal and Tang (2006).

Summary of results from Norets (2010)
- Simplest flexible specification: the α_m's are multinomial logit with linear indices in x, and (μ_m, σ_m) are independent of x.
- Polynomial terms in the logit reduce the number of mixture components m.
- Alternatively, (α_m, σ_m) independent of x and μ_m polynomial in x also works when y is univariate.
- However, making α_m rather than μ_m flexible in x improves the convergence rate.
- Making σ_m a function of x weakens some restrictions on F.
- Modeling conditional distributions directly is better.
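The "simplest flexible specification" above can be evaluated directly. A sketch of the SMR conditional density with multinomial-logit mixing weights linear in x and (μ_m, σ_m) free of x (all parameter values below are illustrative):

```python
import numpy as np

def smr_density(y, x, gammas, mus, sigmas):
    """Smooth mixture of regressions density p(y|x): mixing weights
    alpha_m(x) from a multinomial logit with linear index gammas[m] @ x;
    component means and standard deviations independent of x."""
    logits = gammas @ x
    logits -= logits.max()                   # numerical stability
    alpha = np.exp(logits) / np.exp(logits).sum()
    dens = np.exp(-0.5 * ((y - mus) / sigmas) ** 2) / (np.sqrt(2 * np.pi) * sigmas)
    return float(alpha @ dens)

gammas = np.array([[0.5, -1.0], [-0.5, 1.0]])  # M = 2 components, x in R^2
mus = np.array([-1.0, 2.0])
sigmas = np.array([0.8, 1.5])
x = np.array([1.0, 0.3])

# As a density in y, p(y|x) should integrate to ~1 on a fine grid.
grid = np.linspace(-10, 10, 4001)
vals = np.array([smr_density(t, x, gammas, mus, sigmas) for t in grid])
total = vals.sum() * (grid[1] - grid[0])
print(total)
```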
Assumptions on the target distribution F
1. f(y|x) is continuous in y on Y for all x ∈ X.
2. The second moments of y are finite.
3. Unbounded support (f(y|x) > 0): for a cube C_r(y) with center y and side length r > 0,

  ∫ log [ f(y|x) / inf_{z ∈ C_r(y)} f(z|x) ] F(dy, dx) < ∞,

or

  ∫ sup_{z ∈ C_r(y)} ‖ d log f(z|x)/dz ‖ F(dy, dx) < ∞.

4. Bounded support can be accommodated by using some other shapes instead of the cubes C_r(y).

Examples satisfying the assumptions
- Exponential: f(y|x) = γ(x) exp{−γ(x) y}, γ(x) > 0, and ∫ γ² dF < ∞.
- Student t: f(y|x) ∝ [ν + ((y − b(x))/c(x))²]^{−(ν+1)/2}, ν > 2, with b(x)² and c(x)² integrable w.r.t. f(x).
- Uniformly bounded support: f(y|x) is continuous in y, ∞ > f̄ ≥ f(y|x) ≥ f̲ > 0. The bounded-away-from-zero condition can be substituted with monotonicity at the boundary.

Estimation for FMMN
Latent states (Diebolt and Robert (1994)): s_i ∈ {1,…,m},

  y_i | s_i, θ, M ~ φ(·; μ_{s_i}, H_{s_i}⁻¹),   P(s_i = j | θ, M) = α_j.

The joint distribution of observables and unobservables

  p({y_i, s_i}_{i=1}^T; {α_j, μ_j, H_j}_{j=1}^m | M) ∝
    ∏_{i=1}^T α_{s_i} |H_{s_i}|^{0.5} exp[−0.5 (y_i − μ_{s_i})′ H_{s_i} (y_i − μ_{s_i})]
    × α_1^{a−1} ⋯ α_m^{a−1}   (Dirichlet prior)
    × ∏_{j=1}^m |H_j|^{0.5} exp{−0.5 (μ_j − μ̄)′ λ H_j (μ_j − μ̄)} |H_j|^{(ν−d−1)/2} exp{−0.5 tr(S H_j)}   (Normal-Wishart prior),

where (ν, S, μ̄, λ, a) are hyperparameters.
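The latent-state representation is easy to simulate in the univariate case, which also produces test data for the sampler discussed next. A sketch with illustrative parameter values:

```python
import numpy as np

rng = np.random.default_rng(4)

# Latent-state representation of a finite mixture of univariate normals:
# s_i ~ Categorical(alpha), y_i | s_i ~ N(mu_{s_i}, 1 / h_{s_i}).
# Parameter values are illustrative.
alpha = np.array([0.3, 0.7])
mu = np.array([-2.0, 1.0])
h = np.array([1.0, 4.0])          # precisions (univariate analog of H_j)

T = 10_000
s = rng.choice(len(alpha), size=T, p=alpha)      # latent states
y = mu[s] + rng.standard_normal(T) / np.sqrt(h[s])
print(y.mean())   # ~ alpha @ mu = 0.1
```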
Gibbs sampler
The joint density above yields conditionally conjugate Gibbs sampler blocks:

  p(s_i = j | …) ∝ α_j |H_j|^{0.5} exp[−0.5 (y_i − μ_j)′ H_j (y_i − μ_j)],

  p(α | …) ∝ α_1^{Σ_i 1{s_i = 1} + a − 1} ⋯ α_m^{Σ_i 1{s_i = m} + a − 1}   (Dirichlet),

  p(μ_j, H_j | …) - Normal-Wishart distribution.

Gibbs sampler, discrete variables
Map discrete variables into intervals of ℝ and model continuous latent variables y*_{i,k}, with y*_{i,k} ∈ [a_{i,k}, b_{i,k}], by the mixture. Add a block for the latent variables:

  p(y*_{i,k} | …) ∝ exp[−0.5 (y*_i − μ_{s_i})′ H_{s_i} (y*_i − μ_{s_i})] · 1_{[a_{i,k}, b_{i,k}]}(y*_{i,k}),

which is a truncated normal.
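The blocks can be sketched in code for the univariate case, where a normal-gamma prior stands in for the normal-Wishart; the hyperparameters (a, ν, S, μ̄, λ) and the simulated data below are illustrative, not from the slides.

```python
import numpy as np

rng = np.random.default_rng(5)

def gibbs_mixture(y, m, n_iter, a=1.0, nu=2.0, S=1.0, mu_bar=0.0, lam=0.1):
    """Gibbs sampler for a univariate m-component normal mixture:
    blocks for states s, weights alpha (Dirichlet), and (mu_j, h_j)
    (normal-gamma, the univariate analog of normal-Wishart)."""
    T = len(y)
    alpha = np.full(m, 1.0 / m)
    mu = rng.choice(y, size=m)
    h = np.ones(m)
    for _ in range(n_iter):
        # Block 1: p(s_i = j | ...) prop. to alpha_j h_j^{1/2} exp(-h_j (y_i - mu_j)^2 / 2)
        logp = np.log(alpha) + 0.5 * np.log(h) - 0.5 * h * (y[:, None] - mu) ** 2
        p = np.exp(logp - logp.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        s = (p.cumsum(axis=1) > rng.random((T, 1))).argmax(axis=1)
        counts = np.bincount(s, minlength=m)
        # Block 2: p(alpha | ...) = Dirichlet(counts + a)
        alpha = rng.dirichlet(counts + a)
        # Block 3: p(mu_j, h_j | ...) normal-gamma conjugate updates
        for j in range(m):
            yj = y[s == j]
            n_j = len(yj)
            ybar = yj.mean() if n_j else 0.0
            lam_n = lam + n_j
            mu_n = (lam * mu_bar + n_j * ybar) / lam_n
            S_n = S + ((yj - ybar) ** 2).sum() + lam * n_j * (ybar - mu_bar) ** 2 / lam_n
            h[j] = rng.gamma((nu + n_j) / 2.0, 2.0 / S_n)
            mu[j] = mu_n + rng.standard_normal() / np.sqrt(lam_n * h[j])
    return alpha, mu, h, s

y = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 0.5, 700)])
alpha, mu, h, s = gibbs_mixture(y, m=2, n_iter=200)
print(np.sort(mu))   # roughly the two component means
```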
Gibbs sampler, SMR
- If μ_m is linear or polynomial in x, the Gibbs sampler is almost the same.
- If σ_m depends on x, then the Gibbs sampler blocks for the parameters of σ_m have a non-standard form. One needs to use more sophisticated MCMC algorithms, e.g., Metropolis-Hastings.
- If α_m is defined by a multinomial logit, then the Gibbs sampler block for the logit parameters is similar to maximum likelihood estimation of a logit model.

Evaluating model quality
The marginal likelihood, p(Y_T | M), is difficult to compute. Cross-validated log scoring rule:

  Σ_{t=1}^T log p(y_t | Y_{T/t}, M) ≈ Σ_{t=1}^T log [ (1/K) Σ_{k=1}^K p(y_t | Y_{T/t}, θ_k, M) ],

where θ_k, k = 1,…,K, are posterior draws. We use a modified cross-validated log scoring rule: randomly order the sample observations and use the first T_1 observations for inference and the rest for evaluation,

  Σ_{t=T_1+1}^T log p(y_t | Y_{T_1}, M).

Repeat this process several times and compare means or medians.

Labor market participation
- Gerfin (1996) cross-section dataset. Compare probit, kernel (Hall et al. (2004)), and FMMN.
- Binary dependent variable: labor force participation dummy.
- Independent variables: log of non-labor income, age, education, number of young children, number of old children, foreign dummy.
- Number of observations: T = 872. Split into two random samples of T_1 = 650 and T_2 = 222 observations. Use T_1 as an estimation sample and T_2 as a prediction sample, for 50 different random splits.

Comparison of different models

Table: Cross-validated classification rates for labor force participation

  Model        %Correct
  Probit       66.37%
  Kernel       66.89%
  FMMN(m=1)    66.22%
  FMMN(m=2)    68.47%
  FMMN(m=3)    67.79%

Classification rates for probit, kernel, and FMMN models evaluated at 50 random evaluation samples of T_2 = 222 observations.
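The modified cross-validated log scoring rule can be sketched as follows; here a simple normal model fitted by maximum likelihood stands in for the predictive density p(y_t | Y_{T_1}, M), and the data are simulated for illustration:

```python
import numpy as np

rng = np.random.default_rng(6)

def modified_cv_log_score(y, n_train, n_splits, rng):
    """Modified cross-validated log scoring rule: randomly reorder the sample,
    use the first n_train observations for inference, sum the log predictive
    densities over the rest, repeat over several random splits, and report
    the median."""
    scores = []
    for _ in range(n_splits):
        perm = rng.permutation(len(y))
        train, holdout = y[perm[:n_train]], y[perm[n_train:]]
        mu, sd = train.mean(), train.std() + 1e-9  # stand-in predictive model
        logpdf = -0.5 * np.log(2 * np.pi * sd**2) - 0.5 * ((holdout - mu) / sd) ** 2
        scores.append(logpdf.sum())
    return float(np.median(scores))

y = rng.standard_normal(300)
score = modified_cv_log_score(y, n_train=200, n_splits=50, rng=rng)
print(score)
```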
Boston housing data
- The data were analysed by kernel methods in Li and Racine (2008). Compare kernel methods from Hall et al. (2004) (cross-validation) with FMMN and SMR.
- Dependent variable: price of the house in a given area.
- Independent variables: average number of rooms in the area, percentage of the population having lower economic status in the area, weighted distance to five Boston employment centers.
- Number of observations: T = 506. Split into two random samples of T_1 = 400 and T_2 = 106 observations. Use T_1 as an estimation sample and T_2 as a prediction sample, for 100 different random splits.
- Models compared by cross-validated log scores (mean and median): nonparametric kernel, FMMN(m=3), FMMN(m=4), FMMN(m=5), FMMN(m=6), FMMN(m=ML), SMR(m=4), SMR(m=7).

Daily returns on S&P 500
Results from Geweke and Keane (2007): out-of-sample comparisons of predictive likelihoods for SMR, t-GARCH(1,1), threshold EGARCH(1,1), GARCH(1,1), and Normal iid models.

Summary
- The Bayesian framework provides a rich collection of non- and semi-parametric methods.
- Some of these methods deliver familiar classical estimators. This can help understand the assumptions under which the estimators are appropriate in finite samples.
- In small samples, mixture-based methods compare favorably with classical parametric and nonparametric alternatives.
- We still have a lot to learn: posterior convergence rates, flexible priors in structural models, moment (in)equality models, ...
References

Barron, A. (1988): "The Exponential Convergence of Posterior Probabilities with Implications for Bayes Estimators of Density Functions."
Barron, A., M. J. Schervish, and L. Wasserman (1999): "The Consistency of Posterior Distributions in Nonparametric Problems," The Annals of Statistics, 27.
Burda, M., M. Harding, and J. Hausman (2008): "A Bayesian Mixed Logit-Probit Model for Multinomial Choice," Journal of Econometrics, 147.
Chamberlain, G. and G. W. Imbens (2003): "Nonparametric Applications of Bayesian Inference," Journal of Business and Economic Statistics, 21.
Conley, T. G., C. B. Hansen, R. E. McCulloch, and P. E. Rossi (2008): "A Semi-parametric Bayesian Approach to the Instrumental Variable Problem," Journal of Econometrics, 144.
Dey, D., P. Müller, and D. Sinha, eds. (1998): Practical Nonparametric and Semiparametric Bayesian Statistics, Lecture Notes in Statistics, Vol. 133, Springer.
Diaconis, P. and D. Freedman (1986): "On Inconsistent Bayes Estimates of Location," The Annals of Statistics, 14.
Diebolt, J. and C. P. Robert (1994): "Estimation of Finite Mixture Distributions through Bayesian Sampling," Journal of the Royal Statistical Society, Series B (Methodological), 56.
Escobar, M. and M. West (1995): "Bayesian Density Estimation and Inference Using Mixtures," Journal of the American Statistical Association, 90.
Ferguson, T. S. (1973): "A Bayesian Analysis of Some Nonparametric Problems," The Annals of Statistics, 1.
Gerfin, M. (1996): "Parametric and Semi-Parametric Estimation of the Binary Response Model of Labour Market Participation," Journal of Applied Econometrics, 11.
Geweke, J. and M. Keane (2007): "Smoothly Mixing Regressions," Journal of Econometrics, 138.
Ghosal, S., J. K. Ghosh, and R. V. Ramamoorthi (1999): "Posterior Consistency of Dirichlet Mixtures in Density Estimation," The Annals of Statistics, 27.
Ghosal, S., J. K. Ghosh, and A. W. van der Vaart (2000): "Convergence Rates of Posterior Distributions," The Annals of Statistics, 28.
Ghosal, S. and Y. Tang (2006): "Bayesian Consistency for Markov Processes," Sankhya, 68.
Ghosal, S. and A. van der Vaart (2007): "Posterior Convergence Rates of Dirichlet Mixtures at Smooth Densities," The Annals of Statistics, 35.
Ghosh, J. and R. Ramamoorthi (2003): Bayesian Nonparametrics, Springer.
Hall, P., J. Racine, and Q. Li (2004): "Cross-Validation and the Estimation of Conditional Probability Densities," Journal of the American Statistical Association, 99.
Hirano, K. (2002): "Semiparametric Bayesian Inference in Autoregressive Panel Data Models," Econometrica, 70.
Hjort, N. L., ed. (2010): Bayesian Nonparametrics, Cambridge Series in Statistical and Probabilistic Mathematics, Cambridge University Press.
Jacobs, R. A., M. I. Jordan, S. J. Nowlan, and G. E. Hinton (1991): "Adaptive Mixtures of Local Experts," Neural Computation, 3.
Jiang, W. and M. Tanner (1999): "Hierarchical Mixtures-of-Experts for Exponential Family Regression Models: Approximation and Maximum Likelihood Estimation," The Annals of Statistics, 27.
Lancaster, T. (2003): "A Note on Bootstraps and Robustness."
Li, Q. and J. S. Racine (2008): "Nonparametric Estimation of Conditional CDF and Quantile Functions With Mixed Categorical and Continuous Data," Journal of Business and Economic Statistics, 26.
Lo, A. Y. (1987): "A Large Sample Study of the Bayesian Bootstrap," The Annals of Statistics, 15.
Neal, R. (2000): "Markov Chain Sampling Methods for Dirichlet Process Mixture Models," Journal of Computational and Graphical Statistics, 9.
Norets, A. (2010): "Approximation of Conditional Densities by Smooth Mixtures of Regressions," The Annals of Statistics, 38.
Norets, A. and J. Pelenis (2009): "Bayesian Modeling of Joint and Conditional Distributions."
Poirier, D. (2008): "Bayesian Interpretations of Heteroskedastic Consistent Covariance Estimators Using the Informed Bayesian Bootstrap," Working Paper, University of California-Irvine, Department of Economics.
Rubin, D. (1981): "The Bayesian Bootstrap," The Annals of Statistics, 9.
Sethuraman, J. (1994): "A Constructive Definition of Dirichlet Priors," Statistica Sinica, 4.
Villani, M., R. Kohn, and P. Giordani (2009): "Regression Density Estimation Using Smooth Adaptive Gaussian Mixtures," Journal of Econometrics, 153.
Wu, Y. and S. Ghosal (2009): "L1-Consistency of Dirichlet Mixtures in Multivariate Bayesian Density Estimation," working paper.
More informationGaussian kernel GARCH models
Gaussian kernel GARCH models Xibin (Bill) Zhang and Maxwell L. King Department of Econometrics and Business Statistics Faculty of Business and Economics 7 June 2013 Motivation A regression model is often
More informationBayesian Heteroskedasticity-Robust Regression. Richard Startz * revised February Abstract
Bayesian Heteroskedasticity-Robust Regression Richard Startz * revised February 2015 Abstract I offer here a method for Bayesian heteroskedasticity-robust regression. The Bayesian version is derived by
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 3 Linear
More informationMachine Learning using Bayesian Approaches
Machine Learning using Bayesian Approaches Sargur N. Srihari University at Buffalo, State University of New York 1 Outline 1. Progress in ML and PR 2. Fully Bayesian Approach 1. Probability theory Bayes
More informationBayesian Interpretations of Heteroskedastic Consistent Covariance. Estimators Using the Informed Bayesian Bootstrap
Bayesian Interpretations of Heteroskedastic Consistent Covariance Estimators Using the Informed Bayesian Bootstrap Dale J. Poirier University of California, Irvine May 22, 2009 Abstract This paper provides
More informationLearning Bayesian network : Given structure and completely observed data
Learning Bayesian network : Given structure and completely observed data Probabilistic Graphical Models Sharif University of Technology Spring 2017 Soleymani Learning problem Target: true distribution
More informationECO 2403 TOPICS IN ECONOMETRICS
ECO 2403 TOPICS IN ECONOMETRICS Department of Economics. University of Toronto Winter 2019 Instructors: Victor Aguirregabiria Phone: 416-978-4358 Office: 150 St. George Street, Room 309 E-mail: victor.aguirregabiria@utoronto.ca
More informationBayesian nonparametrics
Bayesian nonparametrics 1 Some preliminaries 1.1 de Finetti s theorem We will start our discussion with this foundational theorem. We will assume throughout all variables are defined on the probability
More informationPattern Recognition and Machine Learning. Bishop Chapter 2: Probability Distributions
Pattern Recognition and Machine Learning Chapter 2: Probability Distributions Cécile Amblard Alex Kläser Jakob Verbeek October 11, 27 Probability Distributions: General Density Estimation: given a finite
More informationStock index returns density prediction using GARCH models: Frequentist or Bayesian estimation?
MPRA Munich Personal RePEc Archive Stock index returns density prediction using GARCH models: Frequentist or Bayesian estimation? Ardia, David; Lennart, Hoogerheide and Nienke, Corré aeris CAPITAL AG,
More informationLabor-Supply Shifts and Economic Fluctuations. Technical Appendix
Labor-Supply Shifts and Economic Fluctuations Technical Appendix Yongsung Chang Department of Economics University of Pennsylvania Frank Schorfheide Department of Economics University of Pennsylvania January
More informationComputer Vision Group Prof. Daniel Cremers. 10a. Markov Chain Monte Carlo
Group Prof. Daniel Cremers 10a. Markov Chain Monte Carlo Markov Chain Monte Carlo In high-dimensional spaces, rejection sampling and importance sampling are very inefficient An alternative is Markov Chain
More informationMarginal Specifications and a Gaussian Copula Estimation
Marginal Specifications and a Gaussian Copula Estimation Kazim Azam Abstract Multivariate analysis involving random variables of different type like count, continuous or mixture of both is frequently required
More informationNon-Parametric Bayes
Non-Parametric Bayes Mark Schmidt UBC Machine Learning Reading Group January 2016 Current Hot Topics in Machine Learning Bayesian learning includes: Gaussian processes. Approximate inference. Bayesian
More informationDirichlet Process Mixtures of Generalized Linear Models
Dirichlet Process Mixtures of Generalized Linear Models Lauren A. Hannah Department of Operations Research and Financial Engineering Princeton University Princeton, NJ 08544, USA David M. Blei Department
More informationOn Consistency of Nonparametric Normal Mixtures for Bayesian Density Estimation
On Consistency of Nonparametric Normal Mixtures for Bayesian Density Estimation Antonio LIJOI, Igor PRÜNSTER, andstepheng.walker The past decade has seen a remarkable development in the area of Bayesian
More informationEconometric Analysis of Cross Section and Panel Data
Econometric Analysis of Cross Section and Panel Data Jeffrey M. Wooldridge / The MIT Press Cambridge, Massachusetts London, England Contents Preface Acknowledgments xvii xxiii I INTRODUCTION AND BACKGROUND
More informationGibbs Sampling in Endogenous Variables Models
Gibbs Sampling in Endogenous Variables Models Econ 690 Purdue University Outline 1 Motivation 2 Identification Issues 3 Posterior Simulation #1 4 Posterior Simulation #2 Motivation In this lecture we take
More informationBayesian Statistics. Debdeep Pati Florida State University. April 3, 2017
Bayesian Statistics Debdeep Pati Florida State University April 3, 2017 Finite mixture model The finite mixture of normals can be equivalently expressed as y i N(µ Si ; τ 1 S i ), S i k π h δ h h=1 δ h
More informationBayesian Semiparametric GARCH Models
Bayesian Semiparametric GARCH Models Xibin (Bill) Zhang and Maxwell L. King Department of Econometrics and Business Statistics Faculty of Business and Economics xibin.zhang@monash.edu Quantitative Methods
More informationBayesian Semiparametric GARCH Models
Bayesian Semiparametric GARCH Models Xibin (Bill) Zhang and Maxwell L. King Department of Econometrics and Business Statistics Faculty of Business and Economics xibin.zhang@monash.edu Quantitative Methods
More informationOnline Appendix. Online Appendix A: MCMC Algorithm. The model can be written in the hierarchical form: , Ω. V b {b k }, z, b, ν, S
Online Appendix Online Appendix A: MCMC Algorithm The model can be written in the hierarchical form: U CONV β k β, β, X k, X β, Ω U CTR θ k θ, θ, X k, X θ, Ω b k {U CONV }, {U CTR b }, X k, X b, b, z,
More informationBayesian Regularization
Bayesian Regularization Aad van der Vaart Vrije Universiteit Amsterdam International Congress of Mathematicians Hyderabad, August 2010 Contents Introduction Abstract result Gaussian process priors Co-authors
More informationHierarchical Dirichlet Processes with Random Effects
Hierarchical Dirichlet Processes with Random Effects Seyoung Kim Department of Computer Science University of California, Irvine Irvine, CA 92697-34 sykim@ics.uci.edu Padhraic Smyth Department of Computer
More informationINFINITE MIXTURES OF MULTIVARIATE GAUSSIAN PROCESSES
INFINITE MIXTURES OF MULTIVARIATE GAUSSIAN PROCESSES SHILIANG SUN Department of Computer Science and Technology, East China Normal University 500 Dongchuan Road, Shanghai 20024, China E-MAIL: slsun@cs.ecnu.edu.cn,
More informationPriors for the frequentist, consistency beyond Schwartz
Victoria University, Wellington, New Zealand, 11 January 2016 Priors for the frequentist, consistency beyond Schwartz Bas Kleijn, KdV Institute for Mathematics Part I Introduction Bayesian and Frequentist
More informationStat 542: Item Response Theory Modeling Using The Extended Rank Likelihood
Stat 542: Item Response Theory Modeling Using The Extended Rank Likelihood Jonathan Gruhl March 18, 2010 1 Introduction Researchers commonly apply item response theory (IRT) models to binary and ordinal
More informationDensity estimation Nonparametric conditional mean estimation Semiparametric conditional mean estimation. Nonparametrics. Gabriel Montes-Rojas
0 0 5 Motivation: Regression discontinuity (Angrist&Pischke) Outcome.5 1 1.5 A. Linear E[Y 0i X i] 0.2.4.6.8 1 X Outcome.5 1 1.5 B. Nonlinear E[Y 0i X i] i 0.2.4.6.8 1 X utcome.5 1 1.5 C. Nonlinearity
More informationInfinite-State Markov-switching for Dynamic. Volatility Models : Web Appendix
Infinite-State Markov-switching for Dynamic Volatility Models : Web Appendix Arnaud Dufays 1 Centre de Recherche en Economie et Statistique March 19, 2014 1 Comparison of the two MS-GARCH approximations
More informationAn Alternative Infinite Mixture Of Gaussian Process Experts
An Alternative Infinite Mixture Of Gaussian Process Experts Edward Meeds and Simon Osindero Department of Computer Science University of Toronto Toronto, M5S 3G4 {ewm,osindero}@cs.toronto.edu Abstract
More informationPart 8: GLMs and Hierarchical LMs and GLMs
Part 8: GLMs and Hierarchical LMs and GLMs 1 Example: Song sparrow reproductive success Arcese et al., (1992) provide data on a sample from a population of 52 female song sparrows studied over the course
More informationPractical Bayesian Quantile Regression. Keming Yu University of Plymouth, UK
Practical Bayesian Quantile Regression Keming Yu University of Plymouth, UK (kyu@plymouth.ac.uk) A brief summary of some recent work of us (Keming Yu, Rana Moyeed and Julian Stander). Summary We develops
More informationAnalysing geoadditive regression data: a mixed model approach
Analysing geoadditive regression data: a mixed model approach Institut für Statistik, Ludwig-Maximilians-Universität München Joint work with Ludwig Fahrmeir & Stefan Lang 25.11.2005 Spatio-temporal regression
More informationOn Bayesian Computation
On Bayesian Computation Michael I. Jordan with Elaine Angelino, Maxim Rabinovich, Martin Wainwright and Yun Yang Previous Work: Information Constraints on Inference Minimize the minimax risk under constraints
More informationLecture 5: Spatial probit models. James P. LeSage University of Toledo Department of Economics Toledo, OH
Lecture 5: Spatial probit models James P. LeSage University of Toledo Department of Economics Toledo, OH 43606 jlesage@spatial-econometrics.com March 2004 1 A Bayesian spatial probit model with individual
More informationNonparametric Bayes tensor factorizations for big data
Nonparametric Bayes tensor factorizations for big data David Dunson Department of Statistical Science, Duke University Funded from NIH R01-ES017240, R01-ES017436 & DARPA N66001-09-C-2082 Motivation Conditional
More informationImplementation of Bayesian Inference in Dynamic Discrete Choice Models
Implementation of Bayesian Inference in Dynamic Discrete Choice Models Andriy Norets anorets@princeton.edu Department of Economics, Princeton University August 30, 2009 Abstract This paper experimentally
More informationSparse Nonparametric Density Estimation in High Dimensions Using the Rodeo
Sparse Nonparametric Density Estimation in High Dimensions Using the Rodeo Han Liu John Lafferty Larry Wasserman Statistics Department Computer Science Department Machine Learning Department Carnegie Mellon
More informationCMPS 242: Project Report
CMPS 242: Project Report RadhaKrishna Vuppala Univ. of California, Santa Cruz vrk@soe.ucsc.edu Abstract The classification procedures impose certain models on the data and when the assumption match the
More informationAsymptotic properties of posterior distributions in nonparametric regression with non-gaussian errors
Ann Inst Stat Math (29) 61:835 859 DOI 1.17/s1463-8-168-2 Asymptotic properties of posterior distributions in nonparametric regression with non-gaussian errors Taeryon Choi Received: 1 January 26 / Revised:
More informationWhy experimenters should not randomize, and what they should do instead
Why experimenters should not randomize, and what they should do instead Maximilian Kasy Department of Economics, Harvard University Maximilian Kasy (Harvard) Experimental design 1 / 42 project STAR Introduction
More informationA Bayesian Nonparametric Hierarchical Framework for Uncertainty Quantification in Simulation
Submitted to Operations Research manuscript Please, provide the manuscript number! Authors are encouraged to submit new papers to INFORMS journals by means of a style file template, which includes the
More informationQuantile methods. Class Notes Manuel Arellano December 1, Let F (r) =Pr(Y r). Forτ (0, 1), theτth population quantile of Y is defined to be
Quantile methods Class Notes Manuel Arellano December 1, 2009 1 Unconditional quantiles Let F (r) =Pr(Y r). Forτ (0, 1), theτth population quantile of Y is defined to be Q τ (Y ) q τ F 1 (τ) =inf{r : F
More information