Ecological inference with distribution regression

1 Ecological inference with distribution regression Seth Flaxman 10 May 2017 Department of Politics and International Relations

2 Ecological inference
How to draw conclusions about individuals from aggregate-level data?

             Black    White    Total
Democrat       ??       ??     2,200
Republican     ??       ??     1,300
Total        1,500    2,000

3 Ecological inference
Ecological fallacy [Robinson 1950], method of bounds [Duncan and Davis 1953], Goodman's method [1959], neighborhood method [Freedman 1991], Bayesian modeling approaches to ecological inference [King 1997, Gelman 2001, Wakefield 2004]
(Figure source: King)

4 Ecological inference: a modern approach
We (often) have unlabeled individual-level data!
- Learning from label proportions [e.g. Kueck and de Freitas 2005, Quadrianto et al 2009, Sheldon and Dietterich 2011, Patrini et al 2014]
- Distribution regression [e.g. Szabó et al 2015]
- Data privacy literature / de-anonymization [e.g. Narayanan and Shmatikov 2009]
- Bayesian approaches [Jackson, Best, and Richardson 2006; Wakefield 2004 and 2008]

5 Electoral data
- Voting results per district, e.g. Obama 52%, Romney 48%
- Individual-level demographic data from the American Community Survey: race, education, income, occupation, time to work, etc., for 5% of US residents
- Goal: infer how demographic subgroups voted (by district!)

6 Spatial join
[Figure] Electoral data for 67 counties (left), census data for 127 public use microdata areas (not shown). Merging results in 37 electoral regions (right).

7 "Ground truth": exit polls
[Figure: exit poll estimates for women and men]
- Costly; only carried out in 18 states in 2012
- Limited number of demographic variables
- Not representative at the substate level
- Unreliable
Other sources of ground truth: public polls, voter file, post-election surveys

8 My new approach
Insight: ecological inference with individual-level data can be framed as distribution regression
Goal: a scalable, spatially varying Bayesian model
Approach: a new Bayesian distribution regression method with GPs and fast kernel methods
[Flaxman, Wang, Smola. Who Supported Obama in 2012? Ecological Inference through Distribution Regression. KDD 2015; Flaxman, Sutherland, Wang, and Teh. Understanding the 2016 US Presidential Election using ecological inference and distribution regression with census microdata.]

9-11 Distribution regression
Individual-level data with group-level labels:
({x_1^j}_{j=1}^{N_1}, y_1), ({x_2^j}_{j=1}^{N_2}, y_2), ..., ({x_n^j}_{j=1}^{N_n}, y_n)
Learn a function:
f : {x^j}_{j=1}^{N} -> y
Adapt it for ecological inference: make predictions for subgroups, e.g. f̂({x_women^j}_{j=1}^{N_1}) (toy sketch below).
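
A minimal toy sketch of this setup (my own illustration, not code from the talk): each group of individuals is reduced to the mean of a feature map, and an off-the-shelf regressor is fit on those group-level summaries.

```python
# Toy end-to-end distribution regression: groups -> mean features -> regression.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

def random_group(label_effect, size):
    """Simulate one group: individual covariates whose mean shifts with the group label."""
    return rng.normal(loc=label_effect, size=(size, 5))

labels = rng.uniform(0.3, 0.7, size=200)                        # group-level labels y_i
groups = [random_group(y, rng.integers(50, 500)) for y in labels]

def embed(group):
    """Simple feature map averaged over individuals: mu_i = mean_j phi(x_i^j)."""
    phi = np.concatenate([group, group ** 2], axis=1)           # stand-in for a kernel feature map
    return phi.mean(axis=0)

mu = np.stack([embed(g) for g in groups])
model = Ridge(alpha=1.0).fit(mu, labels)                        # learn f : mu_i -> y_i
print(model.predict(mu[:3]))
```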

12 Learning from distributions
Previous work: Jebara et al 2004; Hein and Bousquet 2005; Muandet et al 2012; Póczos et al 2013; Szabó et al 2014; Lopez-Paz et al 2015; Lopez-Paz 2016.
Distribution regression / distribution classification relies on the kernel mean embedding [see the survey by Muandet et al 2017].
Given a kernel k(x, ·), RKHS H_k, and corresponding embedding φ(x) ∈ H_k, consider a measure P with X ~ P. Then define:
µ_P := E[φ(X)] = ∫ φ(x) dP(x)    (1)
The obvious empirical estimator for samples x_1, ..., x_n:
µ̂_P := (1/n) Σ_i φ(x_i)    (2)
Learning: use any supervised learning method to learn a function f(µ_P).
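
To make equations (1)-(2) concrete, here is a small sketch (assumptions mine: a Gaussian RBF kernel and synthetic data) of the empirical mean embedding inner product between two groups, computed from the Gram matrix alone.

```python
# Empirical kernel mean embeddings for two groups, compared via their RKHS inner product.
import numpy as np

def rbf_kernel(X, Y, lengthscale=1.0):
    """Gaussian RBF kernel matrix k(x, y) = exp(-||x - y||^2 / (2 * lengthscale^2))."""
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq_dists / (2.0 * lengthscale ** 2))

def mean_embedding_inner_product(X_p, X_q, lengthscale=1.0):
    """<mu_hat_P, mu_hat_Q> = (1/(n*m)) sum_ij k(x_i, y_j), following eq. (2)."""
    return rbf_kernel(X_p, X_q, lengthscale).mean()

rng = np.random.default_rng(0)
X_p = rng.normal(loc=0.0, size=(500, 3))    # samples from P
X_q = rng.normal(loc=0.5, size=(400, 3))    # samples from Q
print(mean_embedding_inner_product(X_p, X_q))
# The squared MMD between P and Q follows as <mu_P,mu_P> + <mu_Q,mu_Q> - 2<mu_P,mu_Q>.
```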

13 Distribution embedding illustration
[Figure: densities p(x) of two distributions P and Q, and their embeddings in the reproducing kernel Hilbert space]
Figure caption: Each distribution is mapped into the reproducing kernel Hilbert space via an expectation operation. (Source: Muandet et al 2017)

14 My new approach: distribution regression
[Figure: individuals x_i^j in regions 1-3 (men, women, both) are mapped to mean embeddings µ_1, µ_2, µ_3 and subgroup embeddings µ_3^w, µ_3^m in feature space; each region's label y_i is its % vote for Obama, with y_3 unknown]

15 Bayesian distribution regression
Estimate the mean embeddings µ_1, ..., µ_n using kernel embeddings:
µ_i = (1/N_i) Σ_j k(x_i^j, ·) = (1/N_i) Σ_j φ(x_i^j)
Use GP logistic regression with additive kernels including a spatial component:
K_ij = σ_x^2 ⟨µ_i, µ_j⟩ + k_s(s_i, s_j)
f ~ GP(0, K)
k_i | f_i ~ Binomial(n_i, logit^{-1}(f_i))
Obama received k_i out of n_i votes in region i.
Make predictions for demographic subgroups: f(µ_i^women, s_i)
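
A minimal sketch of sampling from this generative model, assuming the mean embeddings µ_i are already finite-dimensional vectors (e.g. from random features); region counts, coordinates, and hyperparameter values are made up for illustration, not the authors' implementation.

```python
# Prior sample from the Bayesian distribution regression model on this slide.
import numpy as np

def matern32(S, rho=1.0):
    """Matern-3/2 covariance in the slide's parameterization: (1 + rho*r) exp(-rho*r)."""
    r = np.linalg.norm(S[:, None, :] - S[None, :, :], axis=-1)
    return (1.0 + rho * r) * np.exp(-rho * r)

rng = np.random.default_rng(0)
n_regions, embed_dim = 40, 10
mu = rng.normal(size=(n_regions, embed_dim))     # stand-in mean embeddings mu_i
S = rng.uniform(size=(n_regions, 2))             # region centroids (spatial coordinates s_i)
n_votes = rng.integers(1_000, 50_000, size=n_regions)

sigma_x = 1.0
K = sigma_x ** 2 * (mu @ mu.T) + matern32(S, rho=2.0)   # K_ij = sigma_x^2 <mu_i, mu_j> + k_s(s_i, s_j)
f = rng.multivariate_normal(np.zeros(n_regions), K + 1e-8 * np.eye(n_regions))
p = 1.0 / (1.0 + np.exp(-f))                             # logit^{-1}(f_i)
k = rng.binomial(n_votes, p)                             # Obama votes k_i out of n_i in region i
```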

16 Kernel details
Demographic attributes (Gaussian RBF):
- Standardize coordinates
- Expand discrete attributes: (low, medium, high income) → ([1 0 0], [0 1 0], [0 0 1])
- Use random Fourier features for speed (sketch below): k(x, x') = ⟨φ(x), φ(x')⟩ ≈ ⟨φ̂(x), φ̂(x')⟩ with φ̂(x) ∈ R^D
Spatial attributes with a Matérn-3/2 kernel:
k(s, s') = (1 + ρ ||s - s'||) exp(-ρ ||s - s'||)
Millions of observations, but the covariance matrix is only over the 843 electoral regions.
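
A sketch of the random Fourier feature approximation for the Gaussian RBF kernel (Rahimi and Recht's construction); the input dimension, number of features, and lengthscale below are placeholders.

```python
# Random Fourier features: a finite-dimensional map whose inner product approximates the RBF kernel.
import numpy as np

def make_rff(dim_in, num_features, lengthscale, rng):
    """Return phi_hat with <phi_hat(x), phi_hat(x')> ~= exp(-||x - x'||^2 / (2 * lengthscale^2))."""
    W = rng.normal(scale=1.0 / lengthscale, size=(dim_in, num_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=num_features)
    return lambda X: np.sqrt(2.0 / num_features) * np.cos(X @ W + b)

rng = np.random.default_rng(0)
phi_hat = make_rff(dim_in=20, num_features=2048, lengthscale=1.0, rng=rng)
X = rng.normal(size=(5, 20))
approx = phi_hat(X) @ phi_hat(X).T                            # approximate Gram matrix
exact = np.exp(-((X[:, None] - X[None]) ** 2).sum(-1) / 2.0)  # exact RBF Gram matrix
print(np.abs(approx - exact).max())                           # small for large num_features
```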

17 Algorithm details
One pass through the census data to create (survey-weighted) mean embeddings:
µ_1 = (Σ_j w_1^j φ(x_1^j)) / (Σ_j w_1^j), ..., µ_n = (Σ_j w_n^j φ(x_n^j)) / (Σ_j w_n^j)
Set up the GP regression:
f ~ GP(0, σ_x^2 K_x + σ_s^2 K_s)
k_i | f_i ~ Binomial(n_i, logit^{-1}(f_i))
Laplace approximation for hyperparameter learning θ = [σ_x, σ_s, ρ] with the marginal likelihood:
argmax_θ p(y | θ)
Bayesian posterior inference to make predictions for the latent f at new "locations":
p(f_men | y, θ̂)
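
The Laplace approximation step for the binomial-likelihood GP could look roughly like the following Newton iteration (adapted from Rasmussen and Williams' Algorithm 3.1; this is my sketch, not the authors' implementation). Here K is the n × n covariance over regions, k the Obama vote counts, and n the total votes per region.

```python
# Laplace approximation: find the posterior mode of f given k_i ~ Binomial(n_i, sigmoid(f_i)).
import numpy as np
from scipy.linalg import cholesky, solve_triangular

def laplace_mode(K, k, n, num_iters=50):
    """Newton iterations (GPML Alg. 3.1) with a binomial count likelihood."""
    f = np.zeros(len(k))
    for _ in range(num_iters):
        p = 1.0 / (1.0 + np.exp(-f))
        grad = k - n * p                       # d log p(k|f) / df
        W = n * p * (1.0 - p)                  # -d^2 log p(k|f) / df^2 (diagonal)
        sqrt_W = np.sqrt(W)
        B = np.eye(len(k)) + sqrt_W[:, None] * K * sqrt_W[None, :]
        L = cholesky(B, lower=True)
        b = W * f + grad
        v = solve_triangular(L, sqrt_W * (K @ b), lower=True)
        a = b - sqrt_W * solve_triangular(L.T, v, lower=False)
        f = K @ a
    return f

# With f_hat = laplace_mode(K, obama_votes, total_votes), the approximate log marginal
# likelihood can be evaluated as a function of theta = [sigma_x, sigma_s, rho] and maximized.
```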

18 Experiments
[Figure: panels comparing exit poll estimates (women, men) with ecological regression estimates (women, men)]

19 Experiments
[Figure: ecological regression estimates (y-axis, 40%-70%) plotted against exit poll estimates (x-axis, 40%-70%), for women (left) and men (right)]

20 Experiments
[Figure: histogram of the estimated Obama support gender gap (percentage points, 0%-15%); y-axis: fraction of geographic regions]

21 [Figure: ecological regression estimates (y-axis) vs. exit poll estimates (x-axis), 30%-70%, for low-, medium-, and high-income voters]

22 Refinements for 2016 election
- Explicitly model non-voters: y_i = [Clinton votes, Trump votes, non-votes and third-party votes]
- Multinomial likelihood with softmax link, fit by penalized MLE with group lasso and L_2 penalties (see the sketch after this list)
- More interpretable / richer feature representation to allow for exploratory analysis and calculation of marginal effects:
  φ(x_i^j) := [φ_1(x_{r_1}^j), ..., φ_d(x_{r_d}^j)]    (4)
- Incorporation of some exit polling data as an extra set of labeled distributions
(arXiv preprint)
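
A hedged sketch of the penalized multinomial fit described above: a softmax likelihood on aggregated counts, optimized by proximal gradient with a group-lasso prox plus a ridge term. Function names, the grouping of features, and all tuning constants are illustrative, not taken from the paper's code.

```python
# Penalized multinomial regression on aggregated counts [Clinton, Trump, other/non-vote].
import numpy as np

def log_softmax(Z):
    Z = Z - Z.max(axis=1, keepdims=True)
    return Z - np.log(np.exp(Z).sum(axis=1, keepdims=True))

def fit_grouped_multinomial(Phi, Y, groups, lam_group=1.0, lam_ridge=1e-3,
                            step=1e-3, num_iters=2000):
    """Phi: (regions, features); Y: (regions, 3) counts; groups: list of feature-index arrays."""
    B = np.zeros((Phi.shape[1], Y.shape[1]))
    for _ in range(num_iters):
        P = np.exp(log_softmax(Phi @ B))
        # Gradient of the multinomial negative log-likelihood plus the ridge term.
        grad = Phi.T @ (Y.sum(axis=1, keepdims=True) * P - Y) + lam_ridge * B
        B = B - step * grad
        for g in groups:                        # group-lasso prox: block soft-thresholding
            norm = np.linalg.norm(B[g])
            B[g] = 0.0 if norm == 0 else max(0.0, 1.0 - step * lam_group / norm) * B[g]
    return B
```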

23 Results for 2016 Presidential Election
[Table: estimated Clinton and Trump vote shares, fraction of the electorate, and participation rate, by gender and by age group; numeric values not recovered in this transcription]

24 Results for 2016 Presidential Election
[Table: estimated Clinton and Trump vote shares and participation rate for: language other than English spoken at home; mobility = lived here one year ago; mobility = moved here from outside the US and Puerto Rico; mobility = moved here from inside the US or Puerto Rico; active duty military; not enrolled in school; enrolled in a public school or public college; enrolled in private school, private college, or home school; numeric values not recovered]

25 Results for 2016 Presidential Election
[Table: estimated Clinton and Trump vote shares, fraction of the electorate, and participation rate by personal income bracket interacted with gender; income thresholds and numeric values not recovered]

26 Exploratory results
[Table: features ranked by deviance explained; deviance and fraction-of-deviance values not recovered]
RAC3P - race coding; ethnicity interacted with has degree; schooling attainment; ANC2P - detailed ancestry; OCCP - occupation; COW - class of worker; ANC1P - detailed ancestry; NAICSP - industry code; RAC2P - race code; age interacted with usual hours worked per week (WKHP); sex interacted with ethnicity; MSP - marital status; FOD1P - field of degree; ethnicity; RAC1P - recoded race; sex interacted with age; has degree interacted with age; age interacted with personal income; sex interacted with hours worked per week; personal income interacted with hours worked per week; personal income; RACSOR - single or multiple race; has degree interacted with hours worked per week; hispanic; sex interacted with personal income

27 Marginal results
[Figure: Clinton/Trump vote share (x-axis, 100% Clinton to 100% Trump) vs. participation rate (y-axis, 0%-100%) by ancestry: American, Irish, African American, Polish, German, Italian, English, Not reported, Mexican, White]

28 Marginal results
[Figure: Clinton/Trump vote share (x-axis, 100% Clinton to 100% Trump) vs. participation rate (y-axis, 0%-100%) by race/ethnicity interacted with college degree: White, Black, Hispanic, Asian, Amerindian, and Other/Multiracial, each with and without a college degree]

29 Conclusion: ecological inference
- New ecological inference method through Bayesian distribution regression
- Scalable to millions of observations through random features
- Good empirical results
- Realistic uncertainty intervals
- Simple method [off-the-shelf tools]
- Python package by Dougal Sutherland and replication code
Next steps: fully Bayesian version of the multinomial model, learning richer feature representations, validation on ground truth

30 Future work
- Fully Bayesian method (pre-print will be on arXiv later this week)
- Mean embedding as an ecological covariate in hierarchical modeling
- Poll design: how to define likely voters?
- Imputing demographic characteristics of voters from the voter file
- Many other application areas (public health, etc.)
Thanks! Papers and code: [link not recovered] and on arXiv
