Bayesian Network Regularized Regression for Crime Modeling

Size: px
Start display at page:

Download "Bayesian Network Regularized Regression for Crime Modeling"

Transcription

1 Bayesian Network Regularized Regression for Crime Modeling Luis Carvalho Joint work with Liz Upton Dept. of Mathematics and Statistics Boston University BU-Keio Workshop, August 2016

2 Introduction A motivating example: residential burglary in Boston

3 Introduction A motivating example: residential burglary in Boston Potential goals: Understanding crime rates: covariates? predictions?

4 Introduction A motivating example: residential burglary in Boston Potential goals: Understanding crime rates: covariates? predictions? Identifying hot zones for intervention

5 Residential Burglary Data description: 7K crimes occurring from July 2012 to October 2015 in Boston, provided bydata.cityofboston.gov Reported occurrences are pooled in time and by intersection Covariates: averaged tax income, district type, distance to nearest police station Network with 13K nodes, provided by boston.opendata.arcgis.com

6 Residential Burglary Data description: 7K crimes occurring from July 2012 to October 2015 in Boston, provided bydata.cityofboston.gov Reported occurrences are pooled in time and by intersection Covariates: averaged tax income, district type, distance to nearest police station Network with 13K nodes, provided by boston.opendata.arcgis.com First take: for v in a network (undirected simple graph) G, [ ( )] iid Y v Po exp x v β

7 Residential Burglary Data description: 7K crimes occurring from July 2012 to October 2015 in Boston, provided bydata.cityofboston.gov Reported occurrences are pooled in time and by intersection Covariates: averaged tax income, district type, distance to nearest police station Network with 13K nodes, provided by boston.opendata.arcgis.com First take: for v in a network (undirected simple graph) G, [ ( )] iid Y v Po exp x v β but: Crime rates are not spatially homogeneous

8 Residential Burglary Data description: 7K crimes occurring from July 2012 to October 2015 in Boston, provided bydata.cityofboston.gov Reported occurrences are pooled in time and by intersection Covariates: averaged tax income, district type, distance to nearest police station Network with 13K nodes, provided by boston.opendata.arcgis.com First take: for v in a network (undirected simple graph) G, [ ( )] iid Y v Po exp x v β but: Crime rates are not spatially homogeneous Crime rates can vary sharply

9 Network Regularized Regression Addressing the first issue, [ ind Y v Po where β is now network indexed exp ( )] x v β(v)

10 Network Regularized Regression Addressing the first issue, [ ind Y v Po where β is now network indexed exp ( )] x v β(v) To avoid overfitting, we impose smoothness on β, e.g., under a single intercept model, β := argmin β = argmin β Y β 2 2 +λ Mβ 2 2 D(Y;β)+λβ M Mβ where M is a differential operator and λ is a roughness penalty

11 Network Regularized Regression Addressing the first issue, [ ind Y v Po where β is now network indexed exp ( )] x v β(v) To avoid overfitting, we impose smoothness on β, e.g., under a single intercept model, β := argmin β = argmin β Y β 2 2 +λ Mβ 2 2 D(Y;β)+λβ M Mβ where M is a differential operator and λ is a roughness penalty Similar works: network kernel-based regression (Smola and Kondor, 2003; Kolaczyk, 2009), and, more generally, functional data analysis (Ramsay and Silverman, 1996)

12 Network Regularized Regression For network indexed coefficients, with M the oriented weighted incidence matrix: β M Mβ := β L w β = w uv (β(u) β(v)) 2 i.e., L w is weighted Laplacian (u,v) E(G)

13 Network Regularized Regression For network indexed coefficients, with M the oriented weighted incidence matrix: β M Mβ := β L w β = w uv (β(u) β(v)) 2 i.e., L w is weighted Laplacian (u,v) E(G) With L w := ΦΞΦ, Ξ := Diag i=1,..., V(G) (ξ i ), we adopt a basis expansion for β, β = Φ 1:k θ, k V(G), so the penalty becomes: β L w β = θ Φ 1:k ΦΞΦ Φ 1:k θ = θ Diag i=1,...,k (ξ i )θ

14 Network Regularized Regression For network indexed coefficients, with M the oriented weighted incidence matrix: β M Mβ := β L w β = w uv (β(u) β(v)) 2 i.e., L w is weighted Laplacian (u,v) E(G) With L w := ΦΞΦ, Ξ := Diag i=1,..., V(G) (ξ i ), we adopt a basis expansion for β, β = Φ 1:k θ, k V(G), so the penalty becomes: β L w β = θ Φ 1:k ΦΞΦ Φ 1:k θ = θ Diag i=1,...,k (ξ i )θ Under a Bayesian formulation, β is the posterior mode when Y v θ ind Po [exp ( φ kv θ )] { θ N (0, Diag i=1,...,k (λξi ) 1})

15 Network Regularized Regression Toy example: Y = (10,2,3,4), vertex 1 connected to triangle with vertices 2, 3, and 4, w(u, v) exp{ d(u, v)/2}i[d(u, v) > 0], and D = , L = CV fitted log 10 (λ) log 10 (λ)

16 Bayesian Network Regularized Regression Addressing the issue of abrupt rate changes, [ ( )] ind Y v ζ,β,z v Po exp Z v ζ +(1 Z v )x v β(v) [ Z v γ ind Bern logit 1( )] u v γ(v) where: ζ is the background crime rate Z v codes for v being in a hot zone, also varying smoothly Both β and γ are network indexed and assume a basis expansion as before

17 Bayesian Network Regularized Regression Addressing the issue of abrupt rate changes, [ ( )] ind Y v ζ,β,z v Po exp Z v ζ +(1 Z v )x v β(v) [ Z v γ ind Bern logit 1( )] u v γ(v) where: ζ is the background crime rate Z v codes for v being in a hot zone, also varying smoothly Both β and γ are network indexed and assume a basis expansion as before Using basis coefficients, x v β(v) D X (v) θ and u v γ(v) D U (v) ω, [ ( )] ind Y v ζ,θ,z v Po exp Z v ζ +(1 Z v )D X (v) θ [ Z v ω ind Bern logit 1( )] D U (v) ω

18 Bayesian Network Regularized Regression Quick methodological recap: Network regularized regression as a building block, [ Y v θ ind F g 1( D X (v) θ )], θ N ( 0,λ 1 θ Ω(X,L w(g)) ) Change regions using latent network-indexed indicators Z and conditional responses [ Z v ω ind Bern logit 1( D U (v) ω )], ω N ( 0,λ 1 ω Ω(U,L w (G)) )

19 Bayesian Network Regularized Regression Quick methodological recap: Network regularized regression as a building block, [ Y v θ ind F g 1( D X (v) θ )], θ N ( 0,λ 1 θ Ω(X,L w(g)) ) Change regions using latent network-indexed indicators Z and conditional responses [ Z v ω ind Bern logit 1( D U (v) ω )], ω N ( 0,λ 1 ω Ω(U,L w (G)) ) There are now two main practical problems: How to define the hyper-parameters controlling the smoothness of β and γ?

20 Bayesian Network Regularized Regression Quick methodological recap: Network regularized regression as a building block, [ Y v θ ind F g 1( D X (v) θ )], θ N ( 0,λ 1 θ Ω(X,L w(g)) ) Change regions using latent network-indexed indicators Z and conditional responses [ Z v ω ind Bern logit 1( D U (v) ω )], ω N ( 0,λ 1 ω Ω(U,L w (G)) ) There are now two main practical problems: How to define the hyper-parameters controlling the smoothness of β and γ? How to fit this model efficiently for large scale datasets?

21 Prior Elicitation: Guidelines Three main sets of hyper-parameters: θ N ( 0,λ 1 Ω(X,L w (G)) ), where Ω(X,L w (G)) := D X L wd X and D X depends on Φ 1:k

22 Prior Elicitation: Guidelines Three main sets of hyper-parameters: θ N ( 0,λ 1 Ω(X,L w (G)) ), where Ω(X,L w (G)) := D X L wd X and D X depends on Φ 1:k To define L w we need a measure of similarity as weights in G: in our application, we use w(u,v) exp{ d(u,v)/ψ} and set the network range ψ such that median similarity is 0.8

23 Prior Elicitation: Guidelines Three main sets of hyper-parameters: θ N ( 0,λ 1 Ω(X,L w (G)) ), where Ω(X,L w (G)) := D X L wd X and D X depends on Φ 1:k To define L w we need a measure of similarity as weights in G: in our application, we use w(u,v) exp{ d(u,v)/ψ} and set the network range ψ such that median similarity is 0.8 Penalty λ and basis rank k can be defined jointly using leave-one-out cross-validation via PRESS working residuals ( LOOP ) Adding Eigenvectors One by One Optimal Lambda Includes all covariates LOOP LOOP

24 Model Fitting Crime Counts: Predicted and Actual Pearson Residuals Predicted Residuals Latent Variable = 0 Latent Variable = Actual Log of Predicted Counts

25 Model Fitting

26 Model Fitting

27 Model Fitting

28 Conclusions and future work Summary

29 Conclusions and future work Summary Network regularization is useful in a number of applications and can be more suitable than other types of regularization

30 Conclusions and future work Summary Network regularization is useful in a number of applications and can be more suitable than other types of regularization Methodology can be used as a building block for more elaborated models

31 Conclusions and future work Summary Network regularization is useful in a number of applications and can be more suitable than other types of regularization Methodology can be used as a building block for more elaborated models Main concern: representative models and computational efficiency

32 Conclusions and future work Summary Network regularization is useful in a number of applications and can be more suitable than other types of regularization Methodology can be used as a building block for more elaborated models Main concern: representative models and computational efficiency New challenges

33 Conclusions and future work Summary Network regularization is useful in a number of applications and can be more suitable than other types of regularization Methodology can be used as a building block for more elaborated models Main concern: representative models and computational efficiency New challenges Refinements and extensions: dynamic model, basis selection, covariance structure

34 Conclusions and future work Summary Network regularization is useful in a number of applications and can be more suitable than other types of regularization Methodology can be used as a building block for more elaborated models Main concern: representative models and computational efficiency New challenges Refinements and extensions: dynamic model, basis selection, covariance structure Other applications in Biology, Epidemiology, and Engineering

35 Conclusions and future work Summary Network regularization is useful in a number of applications and can be more suitable than other types of regularization Methodology can be used as a building block for more elaborated models Main concern: representative models and computational efficiency New challenges Refinements and extensions: dynamic model, basis selection, covariance structure Other applications in Biology, Epidemiology, and Engineering Bayesian network regression (for topology inference)

36 Conclusions and future work Summary Network regularization is useful in a number of applications and can be more suitable than other types of regularization Methodology can be used as a building block for more elaborated models Main concern: representative models and computational efficiency New challenges Refinements and extensions: dynamic model, basis selection, covariance structure Other applications in Biology, Epidemiology, and Engineering Bayesian network regression (for topology inference) Thank you!

Bayesian Network Regularized Regression for Modeling Urban Crime Occurrences

Bayesian Network Regularized Regression for Modeling Urban Crime Occurrences Bayesian Network Regularized Regression for Modeling Urban Crime Occurrences arxiv:1708.05047v1 [stat.ap] 16 Aug 2017 Elizabeth Upton and Luis Carvalho Department of Mathematics and Statistics, Boston

More information

Part 8: GLMs and Hierarchical LMs and GLMs

Part 8: GLMs and Hierarchical LMs and GLMs Part 8: GLMs and Hierarchical LMs and GLMs 1 Example: Song sparrow reproductive success Arcese et al., (1992) provide data on a sample from a population of 52 female song sparrows studied over the course

More information

Gaussian Processes. Le Song. Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012

Gaussian Processes. Le Song. Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012 Gaussian Processes Le Song Machine Learning II: Advanced Topics CSE 8803ML, Spring 01 Pictorial view of embedding distribution Transform the entire distribution to expected features Feature space Feature

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 3 Linear

More information

11 : Gaussian Graphic Models and Ising Models

11 : Gaussian Graphic Models and Ising Models 10-708: Probabilistic Graphical Models 10-708, Spring 2017 11 : Gaussian Graphic Models and Ising Models Lecturer: Bryon Aragam Scribes: Chao-Ming Yen 1 Introduction Different from previous maximum likelihood

More information

Hypothesis Testing For Multilayer Network Data

Hypothesis Testing For Multilayer Network Data Hypothesis Testing For Multilayer Network Data Jun Li Dept of Mathematics and Statistics, Boston University Joint work with Eric Kolaczyk Outline Background and Motivation Geometric structure of multilayer

More information

Bayesian Hierarchical Models

Bayesian Hierarchical Models Bayesian Hierarchical Models Gavin Shaddick, Millie Green, Matthew Thomas University of Bath 6 th - 9 th December 2016 1/ 34 APPLICATIONS OF BAYESIAN HIERARCHICAL MODELS 2/ 34 OUTLINE Spatial epidemiology

More information

Kernel methods, kernel SVM and ridge regression

Kernel methods, kernel SVM and ridge regression Kernel methods, kernel SVM and ridge regression Le Song Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012 Collaborative Filtering 2 Collaborative Filtering R: rating matrix; U: user factor;

More information

Econ 2148, fall 2017 Gaussian process priors, reproducing kernel Hilbert spaces, and Splines

Econ 2148, fall 2017 Gaussian process priors, reproducing kernel Hilbert spaces, and Splines Econ 2148, fall 2017 Gaussian process priors, reproducing kernel Hilbert spaces, and Splines Maximilian Kasy Department of Economics, Harvard University 1 / 37 Agenda 6 equivalent representations of the

More information

Approaches for Multiple Disease Mapping: MCAR and SANOVA

Approaches for Multiple Disease Mapping: MCAR and SANOVA Approaches for Multiple Disease Mapping: MCAR and SANOVA Dipankar Bandyopadhyay Division of Biostatistics, University of Minnesota SPH April 22, 2015 1 Adapted from Sudipto Banerjee s notes SANOVA vs MCAR

More information

ABC random forest for parameter estimation. Jean-Michel Marin

ABC random forest for parameter estimation. Jean-Michel Marin ABC random forest for parameter estimation Jean-Michel Marin Université de Montpellier Institut Montpelliérain Alexander Grothendieck (IMAG) Institut de Biologie Computationnelle (IBC) Labex Numev! joint

More information

A Fully Nonparametric Modeling Approach to. BNP Binary Regression

A Fully Nonparametric Modeling Approach to. BNP Binary Regression A Fully Nonparametric Modeling Approach to Binary Regression Maria Department of Applied Mathematics and Statistics University of California, Santa Cruz SBIES, April 27-28, 2012 Outline 1 2 3 Simulation

More information

STAT 518 Intro Student Presentation

STAT 518 Intro Student Presentation STAT 518 Intro Student Presentation Wen Wei Loh April 11, 2013 Title of paper Radford M. Neal [1999] Bayesian Statistics, 6: 475-501, 1999 What the paper is about Regression and Classification Flexible

More information

Web Appendix for Hierarchical Adaptive Regression Kernels for Regression with Functional Predictors by D. B. Woodard, C. Crainiceanu, and D.

Web Appendix for Hierarchical Adaptive Regression Kernels for Regression with Functional Predictors by D. B. Woodard, C. Crainiceanu, and D. Web Appendix for Hierarchical Adaptive Regression Kernels for Regression with Functional Predictors by D. B. Woodard, C. Crainiceanu, and D. Ruppert A. EMPIRICAL ESTIMATE OF THE KERNEL MIXTURE Here we

More information

STA 216, GLM, Lecture 16. October 29, 2007

STA 216, GLM, Lecture 16. October 29, 2007 STA 216, GLM, Lecture 16 October 29, 2007 Efficient Posterior Computation in Factor Models Underlying Normal Models Generalized Latent Trait Models Formulation Genetic Epidemiology Illustration Structural

More information

Bayesian Aggregation for Extraordinarily Large Dataset

Bayesian Aggregation for Extraordinarily Large Dataset Bayesian Aggregation for Extraordinarily Large Dataset Guang Cheng 1 Department of Statistics Purdue University www.science.purdue.edu/bigdata Department Seminar Statistics@LSE May 19, 2017 1 A Joint Work

More information

STA 414/2104, Spring 2014, Practice Problem Set #1

STA 414/2104, Spring 2014, Practice Problem Set #1 STA 44/4, Spring 4, Practice Problem Set # Note: these problems are not for credit, and not to be handed in Question : Consider a classification problem in which there are two real-valued inputs, and,

More information

3 Joint Distributions 71

3 Joint Distributions 71 2.2.3 The Normal Distribution 54 2.2.4 The Beta Density 58 2.3 Functions of a Random Variable 58 2.4 Concluding Remarks 64 2.5 Problems 64 3 Joint Distributions 71 3.1 Introduction 71 3.2 Discrete Random

More information

Bayes methods for categorical data. April 25, 2017

Bayes methods for categorical data. April 25, 2017 Bayes methods for categorical data April 25, 2017 Motivation for joint probability models Increasing interest in high-dimensional data in broad applications Focus may be on prediction, variable selection,

More information

Gaussian Process Regression

Gaussian Process Regression Gaussian Process Regression 4F1 Pattern Recognition, 21 Carl Edward Rasmussen Department of Engineering, University of Cambridge November 11th - 16th, 21 Rasmussen (Engineering, Cambridge) Gaussian Process

More information

Statistics 352: Spatial statistics. Jonathan Taylor. Department of Statistics. Models for discrete data. Stanford University.

Statistics 352: Spatial statistics. Jonathan Taylor. Department of Statistics. Models for discrete data. Stanford University. 352: 352: Models for discrete data April 28, 2009 1 / 33 Models for discrete data 352: Outline Dependent discrete data. Image data (binary). Ising models. Simulation: Gibbs sampling. Denoising. 2 / 33

More information

How to learn from very few examples?

How to learn from very few examples? How to learn from very few examples? Dengyong Zhou Department of Empirical Inference Max Planck Institute for Biological Cybernetics Spemannstr. 38, 72076 Tuebingen, Germany Outline Introduction Part A

More information

STATS306B STATS306B. Discriminant Analysis. Jonathan Taylor Department of Statistics Stanford University. June 3, 2010

STATS306B STATS306B. Discriminant Analysis. Jonathan Taylor Department of Statistics Stanford University. June 3, 2010 STATS306B Discriminant Analysis Jonathan Taylor Department of Statistics Stanford University June 3, 2010 Spring 2010 Classification Given K classes in R p, represented as densities f i (x), 1 i K classify

More information

Bayesian dynamic modeling for large space-time weather datasets using Gaussian predictive processes

Bayesian dynamic modeling for large space-time weather datasets using Gaussian predictive processes Bayesian dynamic modeling for large space-time weather datasets using Gaussian predictive processes Andrew O. Finley Department of Forestry & Department of Geography, Michigan State University, Lansing

More information

Bayesian dynamic modeling for large space-time weather datasets using Gaussian predictive processes

Bayesian dynamic modeling for large space-time weather datasets using Gaussian predictive processes Bayesian dynamic modeling for large space-time weather datasets using Gaussian predictive processes Andrew O. Finley 1 and Sudipto Banerjee 2 1 Department of Forestry & Department of Geography, Michigan

More information

Models for Count and Binary Data. Poisson and Logistic GWR Models. 24/07/2008 GWR Workshop 1

Models for Count and Binary Data. Poisson and Logistic GWR Models. 24/07/2008 GWR Workshop 1 Models for Count and Binary Data Poisson and Logistic GWR Models 24/07/2008 GWR Workshop 1 Outline I: Modelling counts Poisson regression II: Modelling binary events Logistic Regression III: Poisson Regression

More information

Active and Semi-supervised Kernel Classification

Active and Semi-supervised Kernel Classification Active and Semi-supervised Kernel Classification Zoubin Ghahramani Gatsby Computational Neuroscience Unit University College London Work done in collaboration with Xiaojin Zhu (CMU), John Lafferty (CMU),

More information

Review. DS GA 1002 Statistical and Mathematical Models. Carlos Fernandez-Granda

Review. DS GA 1002 Statistical and Mathematical Models.   Carlos Fernandez-Granda Review DS GA 1002 Statistical and Mathematical Models http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall16 Carlos Fernandez-Granda Probability and statistics Probability: Framework for dealing with

More information

Bayesian dynamic modeling for large space-time weather datasets using Gaussian predictive processes

Bayesian dynamic modeling for large space-time weather datasets using Gaussian predictive processes Bayesian dynamic modeling for large space-time weather datasets using Gaussian predictive processes Sudipto Banerjee 1 and Andrew O. Finley 2 1 Biostatistics, School of Public Health, University of Minnesota,

More information

PREDICTING SOLAR GENERATION FROM WEATHER FORECASTS. Chenlin Wu Yuhan Lou

PREDICTING SOLAR GENERATION FROM WEATHER FORECASTS. Chenlin Wu Yuhan Lou PREDICTING SOLAR GENERATION FROM WEATHER FORECASTS Chenlin Wu Yuhan Lou Background Smart grid: increasing the contribution of renewable in grid energy Solar generation: intermittent and nondispatchable

More information

Can we do statistical inference in a non-asymptotic way? 1

Can we do statistical inference in a non-asymptotic way? 1 Can we do statistical inference in a non-asymptotic way? 1 Guang Cheng 2 Statistics@Purdue www.science.purdue.edu/bigdata/ ONR Review Meeting@Duke Oct 11, 2017 1 Acknowledge NSF, ONR and Simons Foundation.

More information

Graphical Models for Collaborative Filtering

Graphical Models for Collaborative Filtering Graphical Models for Collaborative Filtering Le Song Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012 Sequence modeling HMM, Kalman Filter, etc.: Similarity: the same graphical model topology,

More information

Appendix: Modeling Approach

Appendix: Modeling Approach AFFECTIVE PRIMACY IN INTRAORGANIZATIONAL TASK NETWORKS Appendix: Modeling Approach There is now a significant and developing literature on Bayesian methods in social network analysis. See, for instance,

More information

Learning gradients: prescriptive models

Learning gradients: prescriptive models Department of Statistical Science Institute for Genome Sciences & Policy Department of Computer Science Duke University May 11, 2007 Relevant papers Learning Coordinate Covariances via Gradients. Sayan

More information

Probabilistic Graphical Models

Probabilistic Graphical Models Probabilistic Graphical Models Brown University CSCI 295-P, Spring 213 Prof. Erik Sudderth Lecture 11: Inference & Learning Overview, Gaussian Graphical Models Some figures courtesy Michael Jordan s draft

More information

Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features. Yangxin Huang

Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features. Yangxin Huang Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features Yangxin Huang Department of Epidemiology and Biostatistics, COPH, USF, Tampa, FL yhuang@health.usf.edu January

More information

Xia Wang and Dipak K. Dey

Xia Wang and Dipak K. Dey A Flexible Skewed Link Function for Binary Response Data Xia Wang and Dipak K. Dey Technical Report #2008-5 June 18, 2008 This material was based upon work supported by the National Science Foundation

More information

Stochastic Spectral Approaches to Bayesian Inference

Stochastic Spectral Approaches to Bayesian Inference Stochastic Spectral Approaches to Bayesian Inference Prof. Nathan L. Gibson Department of Mathematics Applied Mathematics and Computation Seminar March 4, 2011 Prof. Gibson (OSU) Spectral Approaches to

More information

Monte Carlo in Bayesian Statistics

Monte Carlo in Bayesian Statistics Monte Carlo in Bayesian Statistics Matthew Thomas SAMBa - University of Bath m.l.thomas@bath.ac.uk December 4, 2014 Matthew Thomas (SAMBa) Monte Carlo in Bayesian Statistics December 4, 2014 1 / 16 Overview

More information

Bayesian SAE using Complex Survey Data Lecture 4A: Hierarchical Spatial Bayes Modeling

Bayesian SAE using Complex Survey Data Lecture 4A: Hierarchical Spatial Bayes Modeling Bayesian SAE using Complex Survey Data Lecture 4A: Hierarchical Spatial Bayes Modeling Jon Wakefield Departments of Statistics and Biostatistics University of Washington 1 / 37 Lecture Content Motivation

More information

CASE STUDY: Bayesian Incidence Analyses from Cross-Sectional Data with Multiple Markers of Disease Severity. Outline:

CASE STUDY: Bayesian Incidence Analyses from Cross-Sectional Data with Multiple Markers of Disease Severity. Outline: CASE STUDY: Bayesian Incidence Analyses from Cross-Sectional Data with Multiple Markers of Disease Severity Outline: 1. NIEHS Uterine Fibroid Study Design of Study Scientific Questions Difficulties 2.

More information

Graphical Models for Climate Data Analysis: Drought Detection and Land Variable Regression

Graphical Models for Climate Data Analysis: Drought Detection and Land Variable Regression for Climate Data Analysis: Drought Detection and Land Variable Regression Arindam Banerjee banerjee@cs.umn.edu Dept of Computer Science & Engineering University of Minnesota, Twin Cities Joint Work with:

More information

Bayesian Modeling of Conditional Distributions

Bayesian Modeling of Conditional Distributions Bayesian Modeling of Conditional Distributions John Geweke University of Iowa Indiana University Department of Economics February 27, 2007 Outline Motivation Model description Methods of inference Earnings

More information

Sparse Linear Models (10/7/13)

Sparse Linear Models (10/7/13) STA56: Probabilistic machine learning Sparse Linear Models (0/7/) Lecturer: Barbara Engelhardt Scribes: Jiaji Huang, Xin Jiang, Albert Oh Sparsity Sparsity has been a hot topic in statistics and machine

More information

Nonparametric Bayesian Methods (Gaussian Processes)

Nonparametric Bayesian Methods (Gaussian Processes) [70240413 Statistical Machine Learning, Spring, 2015] Nonparametric Bayesian Methods (Gaussian Processes) Jun Zhu dcszj@mail.tsinghua.edu.cn http://bigml.cs.tsinghua.edu.cn/~jun State Key Lab of Intelligent

More information

20: Gaussian Processes

20: Gaussian Processes 10-708: Probabilistic Graphical Models 10-708, Spring 2016 20: Gaussian Processes Lecturer: Andrew Gordon Wilson Scribes: Sai Ganesh Bandiatmakuri 1 Discussion about ML Here we discuss an introduction

More information

Spatial Regression. 6. Specification Spatial Heterogeneity. Luc Anselin.

Spatial Regression. 6. Specification Spatial Heterogeneity. Luc Anselin. Spatial Regression 6. Specification Spatial Heterogeneity Luc Anselin http://spatial.uchicago.edu 1 homogeneity and heterogeneity spatial regimes spatially varying coefficients spatial random effects 2

More information

ATHS FC Math Department Al Ain Revision worksheet

ATHS FC Math Department Al Ain Revision worksheet ATHS FC Math Department Al Ain Revision worksheet Section Name ID Date Lesson Marks 3.3, 13.1 to 13.6, 5.1, 5.2, 5.3 Lesson 3.3 (Solving Systems of Inequalities by Graphing) Question: 1 Solve each system

More information

Data Analysis and Manifold Learning Lecture 9: Diffusion on Manifolds and on Graphs

Data Analysis and Manifold Learning Lecture 9: Diffusion on Manifolds and on Graphs Data Analysis and Manifold Learning Lecture 9: Diffusion on Manifolds and on Graphs Radu Horaud INRIA Grenoble Rhone-Alpes, France Radu.Horaud@inrialpes.fr http://perception.inrialpes.fr/ Outline of Lecture

More information

Learning Graphs from Data: A Signal Representation Perspective

Learning Graphs from Data: A Signal Representation Perspective 1 Learning Graphs from Data: A Signal Representation Perspective Xiaowen Dong*, Dorina Thanou*, Michael Rabbat, and Pascal Frossard arxiv:1806.00848v1 [cs.lg] 3 Jun 2018 The construction of a meaningful

More information

Nonlinear Dimensionality Reduction

Nonlinear Dimensionality Reduction Nonlinear Dimensionality Reduction Piyush Rai CS5350/6350: Machine Learning October 25, 2011 Recap: Linear Dimensionality Reduction Linear Dimensionality Reduction: Based on a linear projection of the

More information

Intensity Analysis of Spatial Point Patterns Geog 210C Introduction to Spatial Data Analysis

Intensity Analysis of Spatial Point Patterns Geog 210C Introduction to Spatial Data Analysis Intensity Analysis of Spatial Point Patterns Geog 210C Introduction to Spatial Data Analysis Chris Funk Lecture 4 Spatial Point Patterns Definition Set of point locations with recorded events" within study

More information

Approximate Bayesian Computation

Approximate Bayesian Computation Approximate Bayesian Computation Michael Gutmann https://sites.google.com/site/michaelgutmann University of Helsinki and Aalto University 1st December 2015 Content Two parts: 1. The basics of approximate

More information

Bayesian dynamic modeling for large space-time weather datasets using Gaussian predictive processes

Bayesian dynamic modeling for large space-time weather datasets using Gaussian predictive processes Bayesian dynamic modeling for large space-time weather datasets using Gaussian predictive processes Alan Gelfand 1 and Andrew O. Finley 2 1 Department of Statistical Science, Duke University, Durham, North

More information

Multivariate spatial modeling

Multivariate spatial modeling Multivariate spatial modeling Point-referenced spatial data often come as multivariate measurements at each location Chapter 7: Multivariate Spatial Modeling p. 1/21 Multivariate spatial modeling Point-referenced

More information

Niche Modeling. STAMPS - MBL Course Woods Hole, MA - August 9, 2016

Niche Modeling. STAMPS - MBL Course Woods Hole, MA - August 9, 2016 Niche Modeling Katie Pollard & Josh Ladau Gladstone Institutes UCSF Division of Biostatistics, Institute for Human Genetics and Institute for Computational Health Science STAMPS - MBL Course Woods Hole,

More information

Bayesian Models for Regularization in Optimization

Bayesian Models for Regularization in Optimization Bayesian Models for Regularization in Optimization Aleksandr Aravkin, UBC Bradley Bell, UW Alessandro Chiuso, Padova Michael Friedlander, UBC Gianluigi Pilloneto, Padova Jim Burke, UW MOPTA, Lehigh University,

More information

Adaptive Multi-Modal Sensing of General Concealed Targets

Adaptive Multi-Modal Sensing of General Concealed Targets Adaptive Multi-Modal Sensing of General Concealed argets Lawrence Carin Balaji Krishnapuram, David Williams, Xuejun Liao and Ya Xue Department of Electrical & Computer Engineering Duke University Durham,

More information

Distributed Fault Detection for Interconnected Second-Order Systems with Applications to Power Networks

Distributed Fault Detection for Interconnected Second-Order Systems with Applications to Power Networks Distributed Fault Detection for Interconnected Second-Order Systems with Applications to Power Networks Iman Shames 1 André H. Teixeira 2 Henrik Sandberg 2 Karl H. Johansson 2 1 The Australian National

More information

MULTIDIMENSIONAL COVARIATE EFFECTS IN SPATIAL AND JOINT EXTREMES

MULTIDIMENSIONAL COVARIATE EFFECTS IN SPATIAL AND JOINT EXTREMES MULTIDIMENSIONAL COVARIATE EFFECTS IN SPATIAL AND JOINT EXTREMES Philip Jonathan, Kevin Ewans, David Randell, Yanyun Wu philip.jonathan@shell.com www.lancs.ac.uk/ jonathan Wave Hindcasting & Forecasting

More information

An Algorithm for Bayesian Variable Selection in High-dimensional Generalized Linear Models

An Algorithm for Bayesian Variable Selection in High-dimensional Generalized Linear Models Proceedings 59th ISI World Statistics Congress, 25-30 August 2013, Hong Kong (Session CPS023) p.3938 An Algorithm for Bayesian Variable Selection in High-dimensional Generalized Linear Models Vitara Pungpapong

More information

Data Mining Techniques

Data Mining Techniques Data Mining Techniques CS 622 - Section 2 - Spring 27 Pre-final Review Jan-Willem van de Meent Feedback Feedback https://goo.gl/er7eo8 (also posted on Piazza) Also, please fill out your TRACE evaluations!

More information

Lecture 10. Lecturer: Aleksander Mądry Scribes: Mani Bastani Parizi and Christos Kalaitzis

Lecture 10. Lecturer: Aleksander Mądry Scribes: Mani Bastani Parizi and Christos Kalaitzis CS-621 Theory Gems October 18, 2012 Lecture 10 Lecturer: Aleksander Mądry Scribes: Mani Bastani Parizi and Christos Kalaitzis 1 Introduction In this lecture, we will see how one can use random walks to

More information

Nonparametric Bayesian modeling for dynamic ordinal regression relationships

Nonparametric Bayesian modeling for dynamic ordinal regression relationships Nonparametric Bayesian modeling for dynamic ordinal regression relationships Athanasios Kottas Department of Applied Mathematics and Statistics, University of California, Santa Cruz Joint work with Maria

More information

Two step estimation for Neyman-Scott point process with inhomogeneous cluster centers. May 2012

Two step estimation for Neyman-Scott point process with inhomogeneous cluster centers. May 2012 Two step estimation for Neyman-Scott point process with inhomogeneous cluster centers Tomáš Mrkvička, Milan Muška, Jan Kubečka May 2012 Motivation Study of the influence of covariates on the occurrence

More information

Factor Analysis and Kalman Filtering (11/2/04)

Factor Analysis and Kalman Filtering (11/2/04) CS281A/Stat241A: Statistical Learning Theory Factor Analysis and Kalman Filtering (11/2/04) Lecturer: Michael I. Jordan Scribes: Byung-Gon Chun and Sunghoon Kim 1 Factor Analysis Factor analysis is used

More information

Latent Factor Regression Models for Grouped Outcomes

Latent Factor Regression Models for Grouped Outcomes Latent Factor Regression Models for Grouped Outcomes Dawn Woodard Operations Research and Information Engineering Cornell University with T. M. T. Love, S. W. Thurston, D. Ruppert S. Sathyanarayana and

More information

GIST 4302/5302: Spatial Analysis and Modeling Point Pattern Analysis

GIST 4302/5302: Spatial Analysis and Modeling Point Pattern Analysis GIST 4302/5302: Spatial Analysis and Modeling Point Pattern Analysis Guofeng Cao www.spatial.ttu.edu Department of Geosciences Texas Tech University guofeng.cao@ttu.edu Fall 2018 Spatial Point Patterns

More information

Lecture 9: PGM Learning

Lecture 9: PGM Learning 13 Oct 2014 Intro. to Stats. Machine Learning COMP SCI 4401/7401 Table of Contents I Learning parameters in MRFs 1 Learning parameters in MRFs Inference and Learning Given parameters (of potentials) and

More information

Nonparametric Drift Estimation for Stochastic Differential Equations

Nonparametric Drift Estimation for Stochastic Differential Equations Nonparametric Drift Estimation for Stochastic Differential Equations Gareth Roberts 1 Department of Statistics University of Warwick Brazilian Bayesian meeting, March 2010 Joint work with O. Papaspiliopoulos,

More information

Part 6: Multivariate Normal and Linear Models

Part 6: Multivariate Normal and Linear Models Part 6: Multivariate Normal and Linear Models 1 Multiple measurements Up until now all of our statistical models have been univariate models models for a single measurement on each member of a sample of

More information

Machine Learning Linear Classification. Prof. Matteo Matteucci

Machine Learning Linear Classification. Prof. Matteo Matteucci Machine Learning Linear Classification Prof. Matteo Matteucci Recall from the first lecture 2 X R p Regression Y R Continuous Output X R p Y {Ω 0, Ω 1,, Ω K } Classification Discrete Output X R p Y (X)

More information

Bayesian covariate models in extreme value analysis

Bayesian covariate models in extreme value analysis Bayesian covariate models in extreme value analysis David Randell, Philip Jonathan, Kathryn Turnbull, Mathew Jones EVA 2015 Ann Arbor Copyright 2015 Shell Global Solutions (UK) EVA 2015 Ann Arbor June

More information

Spatial Computing in MGS Lecture II MGS & Applications

Spatial Computing in MGS Lecture II MGS & Applications Spatial Computing in MGS Lecture II MGS & Applications Antoine Spicher 1 & Olivier Michel 1 & Jean-Louis Giavitto 2 mgs.spatial-computing.org/ www.spatial-computing.org/mgs:tutorial 1 LACL, University

More information

GAUSSIAN PROCESS REGRESSION

GAUSSIAN PROCESS REGRESSION GAUSSIAN PROCESS REGRESSION CSE 515T Spring 2015 1. BACKGROUND The kernel trick again... The Kernel Trick Consider again the linear regression model: y(x) = φ(x) w + ε, with prior p(w) = N (w; 0, Σ). The

More information

Learning Multiple Tasks with a Sparse Matrix-Normal Penalty

Learning Multiple Tasks with a Sparse Matrix-Normal Penalty Learning Multiple Tasks with a Sparse Matrix-Normal Penalty Yi Zhang and Jeff Schneider NIPS 2010 Presented by Esther Salazar Duke University March 25, 2011 E. Salazar (Reading group) March 25, 2011 1

More information

Kernel Learning via Random Fourier Representations

Kernel Learning via Random Fourier Representations Kernel Learning via Random Fourier Representations L. Law, M. Mider, X. Miscouridou, S. Ip, A. Wang Module 5: Machine Learning L. Law, M. Mider, X. Miscouridou, S. Ip, A. Wang Kernel Learning via Random

More information

Eliciting Truthful Answers to Sensitive Survey Questions: New Statistical Methods for List and Endorsement Experiments

Eliciting Truthful Answers to Sensitive Survey Questions: New Statistical Methods for List and Endorsement Experiments Eliciting Truthful Answers to Sensitive Survey Questions: New Statistical Methods for List and Endorsement Experiments Kosuke Imai Princeton University July 5, 2010 Kosuke Imai (Princeton) Sensitive Survey

More information

Estimating network degree distributions from sampled networks: An inverse problem

Estimating network degree distributions from sampled networks: An inverse problem Estimating network degree distributions from sampled networks: An inverse problem Eric D. Kolaczyk Dept of Mathematics and Statistics, Boston University kolaczyk@bu.edu Introduction: Networks and Degree

More information

Small Area Modeling of County Estimates for Corn and Soybean Yields in the US

Small Area Modeling of County Estimates for Corn and Soybean Yields in the US Small Area Modeling of County Estimates for Corn and Soybean Yields in the US Matt Williams National Agricultural Statistics Service United States Department of Agriculture Matt.Williams@nass.usda.gov

More information

CS839: Probabilistic Graphical Models. Lecture 7: Learning Fully Observed BNs. Theo Rekatsinas

CS839: Probabilistic Graphical Models. Lecture 7: Learning Fully Observed BNs. Theo Rekatsinas CS839: Probabilistic Graphical Models Lecture 7: Learning Fully Observed BNs Theo Rekatsinas 1 Exponential family: a basic building block For a numeric random variable X p(x ) =h(x)exp T T (x) A( ) = 1

More information

LECTURE NOTE #11 PROF. ALAN YUILLE

LECTURE NOTE #11 PROF. ALAN YUILLE LECTURE NOTE #11 PROF. ALAN YUILLE 1. NonLinear Dimension Reduction Spectral Methods. The basic idea is to assume that the data lies on a manifold/surface in D-dimensional space, see figure (1) Perform

More information

Probabilistic Graphical Models

Probabilistic Graphical Models Probabilistic Graphical Models Lecture Notes Fall 2009 November, 2009 Byoung-Ta Zhang School of Computer Science and Engineering & Cognitive Science, Brain Science, and Bioinformatics Seoul National University

More information

Summary STK 4150/9150

Summary STK 4150/9150 STK4150 - Intro 1 Summary STK 4150/9150 Odd Kolbjørnsen May 22 2017 Scope You are expected to know and be able to use basic concepts introduced in the book. You knowledge is expected to be larger than

More information

Logistics. Naïve Bayes & Expectation Maximization. 573 Schedule. Coming Soon. Estimation Models. Topics

Logistics. Naïve Bayes & Expectation Maximization. 573 Schedule. Coming Soon. Estimation Models. Topics Logistics Naïve Bayes & Expectation Maximization CSE 7 eam Meetings Midterm Open book, notes Studying See AIMA exercises Daniel S. Weld Daniel S. Weld 7 Schedule Selected opics Coming Soon Selected opics

More information

Gibbs Sampling in Endogenous Variables Models

Gibbs Sampling in Endogenous Variables Models Gibbs Sampling in Endogenous Variables Models Econ 690 Purdue University Outline 1 Motivation 2 Identification Issues 3 Posterior Simulation #1 4 Posterior Simulation #2 Motivation In this lecture we take

More information

Probabilistic Graphical Models

Probabilistic Graphical Models 2016 Robert Nowak Probabilistic Graphical Models 1 Introduction We have focused mainly on linear models for signals, in particular the subspace model x = Uθ, where U is a n k matrix and θ R k is a vector

More information

Smooth Common Principal Component Analysis

Smooth Common Principal Component Analysis 1 Smooth Common Principal Component Analysis Michal Benko Wolfgang Härdle Center for Applied Statistics and Economics benko@wiwi.hu-berlin.de Humboldt-Universität zu Berlin Motivation 1-1 Volatility Surface

More information

Modeling conditional distributions with mixture models: Theory and Inference

Modeling conditional distributions with mixture models: Theory and Inference Modeling conditional distributions with mixture models: Theory and Inference John Geweke University of Iowa, USA Journal of Applied Econometrics Invited Lecture Università di Venezia Italia June 2, 2005

More information

Illustration of the Varying Coefficient Model for Analyses the Tree Growth from the Age and Space Perspectives

Illustration of the Varying Coefficient Model for Analyses the Tree Growth from the Age and Space Perspectives TR-No. 14-06, Hiroshima Statistical Research Group, 1 11 Illustration of the Varying Coefficient Model for Analyses the Tree Growth from the Age and Space Perspectives Mariko Yamamura 1, Keisuke Fukui

More information

Efficient Bayesian Multivariate Surface Regression

Efficient Bayesian Multivariate Surface Regression Efficient Bayesian Multivariate Surface Regression Feng Li (joint with Mattias Villani) Department of Statistics, Stockholm University October, 211 Outline of the talk 1 Flexible regression models 2 The

More information

New Developments in Econometrics Lecture 16: Quantile Estimation

New Developments in Econometrics Lecture 16: Quantile Estimation New Developments in Econometrics Lecture 16: Quantile Estimation Jeff Wooldridge Cemmap Lectures, UCL, June 2009 1. Review of Means, Medians, and Quantiles 2. Some Useful Asymptotic Results 3. Quantile

More information

Gibbs Sampling in Linear Models #2

Gibbs Sampling in Linear Models #2 Gibbs Sampling in Linear Models #2 Econ 690 Purdue University Outline 1 Linear Regression Model with a Changepoint Example with Temperature Data 2 The Seemingly Unrelated Regressions Model 3 Gibbs sampling

More information

An inverse scattering problem in random media

An inverse scattering problem in random media An inverse scattering problem in random media Pedro Caro Joint work with: Tapio Helin & Matti Lassas Computational and Analytic Problems in Spectral Theory June 8, 2016 Outline Introduction and motivation

More information

10708 Graphical Models: Homework 2

10708 Graphical Models: Homework 2 10708 Graphical Models: Homework 2 Due Monday, March 18, beginning of class Feburary 27, 2013 Instructions: There are five questions (one for extra credit) on this assignment. There is a problem involves

More information

Gaussian processes for spatial modelling in environmental health: parameterizing for flexibility vs. computational efficiency

Gaussian processes for spatial modelling in environmental health: parameterizing for flexibility vs. computational efficiency Gaussian processes for spatial modelling in environmental health: parameterizing for flexibility vs. computational efficiency Chris Paciorek March 11, 2005 Department of Biostatistics Harvard School of

More information

Reduction of Model Complexity and the Treatment of Discrete Inputs in Computer Model Emulation

Reduction of Model Complexity and the Treatment of Discrete Inputs in Computer Model Emulation Reduction of Model Complexity and the Treatment of Discrete Inputs in Computer Model Emulation Curtis B. Storlie a a Los Alamos National Laboratory E-mail:storlie@lanl.gov Outline Reduction of Emulator

More information

Parametrizations of Discrete Graphical Models

Parametrizations of Discrete Graphical Models Parametrizations of Discrete Graphical Models Robin J. Evans www.stat.washington.edu/ rje42 10th August 2011 1/34 Outline 1 Introduction Graphical Models Acyclic Directed Mixed Graphs Two Problems 2 Ingenuous

More information

Subject CS1 Actuarial Statistics 1 Core Principles

Subject CS1 Actuarial Statistics 1 Core Principles Institute of Actuaries of India Subject CS1 Actuarial Statistics 1 Core Principles For 2019 Examinations Aim The aim of the Actuarial Statistics 1 subject is to provide a grounding in mathematical and

More information

Single Index Quantile Regression for Heteroscedastic Data

Single Index Quantile Regression for Heteroscedastic Data Single Index Quantile Regression for Heteroscedastic Data E. Christou M. G. Akritas Department of Statistics The Pennsylvania State University JSM, 2015 E. Christou, M. G. Akritas (PSU) SIQR JSM, 2015

More information