Cluster Analysis Techniques for Neighborhood Change


1 Cluster Analysis Techniques for Neighborhood Change. Michael Reibel, Cal Poly Pomona; Moira Regelson, Yahoo! Search Marketing

2 The Problem. Define distinct types of neighborhood change in a multiethnic context. Seminal papers in neighborhood racial and ethnic change (Denton and Massey, 1991; Alba et al., 1995) classify neighborhoods according to a priori thresholds (e.g., > 100 persons indicates the presence of a group).

3 Clustering Approaches. K-means clustering: for J variables, minimizes the within-cluster (J-dimensional) distance from the means of K clusters; the number of clusters is specified a priori. PAM (Partitioning Around Medoids) clustering: similar to K-means clustering, except that each cluster is centered on a medoid (an actual observation, the multivariate analogue of a median) instead of a mean. PAM clustering is more robust (less distortion due to extreme values).
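The robustness claim above can be illustrated with a minimal sketch. It is not the authors' code: the function names and the five toy "tract" points are hypothetical, and it compares only a single cluster's mean with its medoid rather than running full K-means or PAM.

```python
import math

def mean_center(points):
    """Coordinate-wise mean of equal-length tuples (a K-means style center)."""
    n = len(points)
    return tuple(sum(p[j] for p in points) / n for j in range(len(points[0])))

def medoid(points):
    """Observed point with the smallest total distance to all points (a PAM style center)."""
    return min(points, key=lambda c: sum(math.dist(c, p) for p in points))

# Hypothetical two-variable change scores for five tracts; the last is an extreme value.
tract_changes = [(1.0, 2.0), (2.0, 1.0), (2.0, 2.0), (1.0, 1.0), (50.0, 50.0)]

center_mean = mean_center(tract_changes)   # pulled toward the extreme tract
center_medoid = medoid(tract_changes)      # stays on a typical observed tract
print(center_mean, center_medoid)
```

The mean lands near (11.2, 11.2), a location where no tract actually lies, while the medoid is the observed point (2.0, 2.0).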

4 Problem: K-means and PAM clustering will happily divide any dataset into k clusters per the analyst's instructions, regardless of whether that's appropriate.
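A small sketch of this problem, assuming a plain one-dimensional Lloyd's algorithm (the helper `kmeans_1d` and the evenly spaced data are hypothetical, chosen so that no real group structure exists):

```python
def kmeans_1d(values, k, max_iter=50):
    """Plain Lloyd's algorithm on a list of numbers; returns (labels, centers)."""
    ordered = sorted(values)
    # Deterministic start: k evenly spaced order statistics.
    centers = [ordered[i * len(ordered) // k] for i in range(k)]
    labels = []
    for _ in range(max_iter):
        # Assign each value to its nearest center (ties go to the lower index).
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

# Twelve evenly spaced values: there are no natural clusters here.
uniform_values = [float(v) for v in range(12)]
labels, centers = kmeans_1d(uniform_values, k=3)
cluster_sizes = [labels.count(c) for c in range(3)]
print(cluster_sizes, centers)
```

Asked for three clusters, the procedure returns three groups (sizes 3, 4, and 5) even though the data are uniform, which is why a separate check on the number of clusters is needed.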

5 Tibshirani Prediction Strength. A supervised classification technique to predict the number of clusters (Tibshirani et al., 2005). Uses cluster reproducibility measures for different k to estimate the true number of clusters in the data set. Choosing the correct number of clusters leads to less random assignment of samples to clusters and to greater cluster reproducibility.

6 Tibshirani Prediction Strength, cont.
Specify kmax and the maximum number of iterations, B. For k in {2, ..., kmax}, repeat B times:
1. Randomly split the data set into a training set and a test set.
2. Apply the clustering procedure to partition the training set into k clusters; record the cluster labels as outcomes.
3. Apply the best-fit cluster labels from the training-set run to the observations in the test set (the "predicted" labels).
4. Apply the clustering procedure to the test set to arrive at the "observed" labels.
5. Compute a measure of agreement comparing predicted to observed labels.
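The steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' implementation. The clustering step is a simple alternating k-medoids with farthest-point initialization rather than full PAM. The agreement measure is assumed to be the pairwise co-membership form of prediction strength: for each observed test cluster, the share of point pairs that the training-set medoids also place together, taking the minimum over clusters. All function names, the toy data, and the reuse of the 0.7 threshold as a selection rule are choices made for this sketch.

```python
import math
import random

def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            for p in points]

def k_medoids(points, k, max_iter=20):
    """Alternating k-medoids with farthest-point initialization (a stand-in for PAM)."""
    medoids = [points[0]]
    while len(medoids) < k:
        medoids.append(max(points, key=lambda p: min(math.dist(p, m) for m in medoids)))
    for _ in range(max_iter):
        labels = assign(points, medoids)
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if not members:
                updated.append(medoids[c])  # keep the old medoid for an empty cluster
                continue
            updated.append(min(members, key=lambda m: sum(math.dist(m, q) for q in members)))
        if updated == medoids:
            break
        medoids = updated
    return medoids

def prediction_strength(points, k, n_splits=5, seed=0):
    """Average over n_splits random half-splits of the minimum per-cluster agreement."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_splits):
        shuffled = points[:]
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        train, test = shuffled[:half], shuffled[half:]
        predicted = assign(test, k_medoids(train, k))  # labels from training medoids
        observed = assign(test, k_medoids(test, k))    # labels from clustering the test set
        per_cluster = []
        for c in range(k):
            idx = [i for i, lab in enumerate(observed) if lab == c]
            pairs = [(i, j) for i in idx for j in idx if i < j]
            if pairs:  # clusters with fewer than two points contribute no pairs
                same = sum(predicted[i] == predicted[j] for i, j in pairs)
                per_cluster.append(same / len(pairs))
        scores.append(min(per_cluster) if per_cluster else 0.0)
    return sum(scores) / len(scores)

# Toy data: two tight, well-separated groups of 16 points each.
group_a = [(0.1 * i, 0.1 * j) for i in range(4) for j in range(4)]
group_b = [(10 + 0.1 * i, 10 + 0.1 * j) for i in range(4) for j in range(4)]
data = group_a + group_b

strength_by_k = {k: prediction_strength(data, k) for k in (2, 3, 4)}
# Assumed rule: keep the largest k whose prediction strength clears the threshold.
chosen_k = max((k for k, s in strength_by_k.items() if s >= 0.7), default=1)
print(strength_by_k, chosen_k)
```

For these two well-separated groups, k = 2 reproduces perfectly between training and test halves, so its score is 1.0. Scores for larger k depend on how the surplus clusters happen to split a group.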

7 Data and Methods. Census tract race and ethnic trend data for Los Angeles County (1990 counts; 2000 counts interpolated to 1990 census tracts; 2000 mixed-race responses assigned to single races). Tracts clustered on trends for four groups: all Hispanics; non-Hispanic (NH) White, Black, and API. Three transition moments: raw change, proportional change, relative change. Tibshirani prediction strength used to determine the number of clusters; PAM used as the clustering method.

8 Selection of Cluster Numbers (Tibshirani threshold = 0.7). [Table; columns: Clusters, Raw Change, Relative Change, Proportional Change]

9 Cluster Sizes (N_J). [Table; columns: Cluster, Raw, Relative, Proportional]

10 Scatterplots Rotated in J-Dimensional Space. [Figure; panels: Raw Change, 3 clusters; Raw Change, 4 clusters; axes: allhispanics, nhwhite, nhblack, nhapi]

11 Cluster Centers: Raw Change. [Table; columns: County Median, Cluster 1, Cluster 2, Cluster 3; rows: allhispanics, nhwhite, nhblack, nhapi]

13 Cluster Centers: Relative Change (note overall growth: Cluster 1 is fast; Cluster 3 is slow). [Table; columns: County Median, Cluster 1, Cluster 2, Cluster 3; rows: allhispanics, nhwhite, nhblack, nhapi]

15 Cluster Centers: Proportional Change. [Table; County Median; rows: Hispanic, nhwhite, nhblack, nhapi]

17 Discussion. Initial test of this clustering approach in neighborhood research (PAM clustering using Tibshirani prediction strength). Both the number of clusters and the cluster centers (with respect to the variables) differ across the various moments of change. Differences between moments of change are most evident in the case of Blacks (dispersion out of previously segregated areas does not appear in raw change; it appears differently in relative and proportional change).

18 Future Research. Apply this technique, with possible modifications, to census tracts across a broader study area (e.g., regional, national). Compare metropolitan areas in terms of the relative presence or absence of clusters identified for the nation's cities. Use principal clusters identified for the nation's cities as neighborhood change outcomes to categorically model covariates.

19 Sources Cited
Alba, R.D., Denton, N.A., Leung, S.J., and Logan, J.R. 1995. Neighborhood change under conditions of mass immigration: The New York City region. International Migration Review 29.
Denton, N.A. and Massey, D.S. 1991. Patterns of neighborhood transition in a multiethnic world: U.S. metropolitan areas. Demography 28.
Kaufman, L. and Rousseeuw, P.J. Finding Groups in Data: An Introduction to Cluster Analysis (Wiley Series in Probability and Statistics). Wiley-Interscience; 2nd rev. edition.
Tibshirani, R. 2005. Cluster validation by prediction strength. Journal of Computational and Graphical Statistics 14.

INTRODUCTION SEGREGATION AND NEIGHBORHOOD CHANGE: WHERE ARE WE AFTER MORE THAN A HALF-CENTURY OF FORMAL ANALYSIS 1

INTRODUCTION SEGREGATION AND NEIGHBORHOOD CHANGE: WHERE ARE WE AFTER MORE THAN A HALF-CENTURY OF FORMAL ANALYSIS 1 INTRODUCTION SEGREGATION AND NEIGHBORHOOD CHANGE: WHERE ARE WE AFTER MORE THAN A HALF-CENTURY OF FORMAL ANALYSIS 1 David W. Wong 2 Department of Earth Systems and GeoInformation Sciences George Mason University

More information

Guilty of committing ecological fallacy?

Guilty of committing ecological fallacy? GIS: Guilty of committing ecological fallacy? David W. Wong Professor Geography and GeoInformation Science George Mason University dwong2@gmu.edu Ecological Fallacy (EF) Many slightly different definitions

More information

Mobility Patterns and User Dynamics in Racially Segregated Geographies of US Cities

Mobility Patterns and User Dynamics in Racially Segregated Geographies of US Cities Mobility Patterns and User Dynamics in Racially Segregated Geographies of US Cities Nibir Bora, Yu-Han Chang, and Rajiv Maheswaran Information Sciences Institute, University of Southern California, Marina

More information

MEASURING RACIAL RESIDENTIAL SEGREGATION

MEASURING RACIAL RESIDENTIAL SEGREGATION MEASURING RACIAL RESIDENTIAL SEGREGATION Race Relations Institute Fisk University 1000 Seventeenth Ave. North Nashville, Tennessee 37208 615/329-8575 WHERE WE LIVE: THE COLOR LINE The color line is carved

More information

SPATIAL ANALYSIS. Transformation. Cartogram Central. 14 & 15. Query, Measurement, Transformation, Descriptive Summary, Design, and Inference

SPATIAL ANALYSIS. Transformation. Cartogram Central. 14 & 15. Query, Measurement, Transformation, Descriptive Summary, Design, and Inference 14 & 15. Query, Measurement, Transformation, Descriptive Summary, Design, and Inference Geographic Information Systems and Science SECOND EDITION Paul A. Longley, Michael F. Goodchild, David J. Maguire,

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the authors institution

More information

Basics of Geographic Analysis in R

Basics of Geographic Analysis in R Basics of Geographic Analysis in R Spatial Autocorrelation and Spatial Weights Yuri M. Zhukov GOV 2525: Political Geography February 25, 2013 Outline 1. Introduction 2. Spatial Data and Basic Visualization

More information

Social Science Research

Social Science Research Social Science Research 38 (2009) 55 70 Contents lists available at ScienceDirect Social Science Research journal homepage: www.elsevier.com/locate/ssresearch Race and space in the 1990s: Changes in the

More information

Outline. 15. Descriptive Summary, Design, and Inference. Descriptive summaries. Data mining. The centroid

Outline. 15. Descriptive Summary, Design, and Inference. Descriptive summaries. Data mining. The centroid Outline 15. Descriptive Summary, Design, and Inference Geographic Information Systems and Science SECOND EDITION Paul A. Longley, Michael F. Goodchild, David J. Maguire, David W. Rhind 2005 John Wiley

More information

Environmental Analysis, Chapter 4 Consequences, and Mitigation

Environmental Analysis, Chapter 4 Consequences, and Mitigation Environmental Analysis, Chapter 4 4.17 Environmental Justice This section summarizes the potential impacts described in Chapter 3, Transportation Impacts and Mitigation, and other sections of Chapter 4,

More information

Making Our Cities Safer: A Study In Neighbhorhood Crime Patterns

Making Our Cities Safer: A Study In Neighbhorhood Crime Patterns Making Our Cities Safer: A Study In Neighbhorhood Crime Patterns Aly Kane alykane@stanford.edu Ariel Sagalovsky asagalov@stanford.edu Abstract Equipped with an understanding of the factors that influence

More information

1Department of Demography and Organization Studies, University of Texas at San Antonio, One UTSA Circle, San Antonio, TX

1Department of Demography and Organization Studies, University of Texas at San Antonio, One UTSA Circle, San Antonio, TX Well, it depends on where you're born: A practical application of geographically weighted regression to the study of infant mortality in the U.S. P. Johnelle Sparks and Corey S. Sparks 1 Introduction Infant

More information

Evaluating racial clusters as a unit of aggregation for neighborhood effects. Extended Abstract. Draft. Do not cite.

Evaluating racial clusters as a unit of aggregation for neighborhood effects. Extended Abstract. Draft. Do not cite. Evaluating racial clusters as a unit of aggregation for neighborhood effects. Extended Abstract Draft. Do not cite. Jonathan Tannen Ph.D. Candidate, Princeton University March 24, 2014 Introduction The

More information

A Comparison of Categorical Attribute Data Clustering Methods

A Comparison of Categorical Attribute Data Clustering Methods A Comparison of Categorical Attribute Data Clustering Methods Ville Hautamäki 1, Antti Pöllänen 1, Tomi Kinnunen 1, Kong Aik Lee 2, Haizhou Li 2, and Pasi Fränti 1 1 School of Computing, University of

More information

CRP 608 Winter 10 Class presentation February 04, Senior Research Associate Kirwan Institute for the Study of Race and Ethnicity

CRP 608 Winter 10 Class presentation February 04, Senior Research Associate Kirwan Institute for the Study of Race and Ethnicity CRP 608 Winter 10 Class presentation February 04, 2010 SAMIR GAMBHIR SAMIR GAMBHIR Senior Research Associate Kirwan Institute for the Study of Race and Ethnicity Background Kirwan Institute Our work Using

More information

Descriptive Statistics

Descriptive Statistics Applied Econometrics Descriptive Statistics Michael Ash Econ 753 Descriptive Statistics p.1/22 Review of Summers Good econometrics Bad econometrics Interesting Exploratory Robust Critical test of deductive

More information

Can Public Transport Infrastructure Relieve Spatial Mismatch?

Can Public Transport Infrastructure Relieve Spatial Mismatch? Can Public Transport Infrastructure Relieve Spatial Mismatch? Evidence from Recent Light Rail Extensions Kilian Heilmann University of California San Diego April 20, 2015 Motivation Paradox: Even though

More information

Geospatial Analysis of Job-Housing Mismatch Using ArcGIS and Python

Geospatial Analysis of Job-Housing Mismatch Using ArcGIS and Python Geospatial Analysis of Job-Housing Mismatch Using ArcGIS and Python 2016 ESRI User Conference June 29, 2016 San Diego, CA Jung Seo, Frank Wen, Simon Choi and Tom Vo, Research & Analysis Southern California

More information

Hierarchical Additive Modeling of Nonlinear Association with Spatial Correlations

Hierarchical Additive Modeling of Nonlinear Association with Spatial Correlations 1 ENAR09 LSUHSC p. 1/18 Hierarchical Additive Modeling of Nonlinear Association with Spatial Correlations Qingzhao Yu, Bin Li, Richard Scribner Louisiana State University, School of Public Health Supported

More information

Spatial Analyses of Bowhead Whale Calls by Type of Call. Heidi Batchelor and Gerald L. D Spain. Marine Physical Laboratory

Spatial Analyses of Bowhead Whale Calls by Type of Call. Heidi Batchelor and Gerald L. D Spain. Marine Physical Laboratory 1 Spatial Analyses of Bowhead Whale Calls by Type of Call 2 3 Heidi Batchelor and Gerald L. D Spain 4 5 Marine Physical Laboratory 6 Scripps Institution of Oceanography 7 291 Rosecrans St., San Diego,

More information

Influence measures for CART

Influence measures for CART Jean-Michel Poggi Orsay, Paris Sud & Paris Descartes Joint work with Avner Bar-Hen Servane Gey (MAP5, Paris Descartes ) CART CART Classification And Regression Trees, Breiman et al. (1984) Learning set

More information

Application of Indirect Race/ Ethnicity Data in Quality Metric Analyses

Application of Indirect Race/ Ethnicity Data in Quality Metric Analyses Background The fifteen wholly-owned health plans under WellPoint, Inc. (WellPoint) historically did not collect data in regard to the race/ethnicity of it members. In order to overcome this lack of data

More information

Explaining Regional Differences in Environmental Inequality

Explaining Regional Differences in Environmental Inequality Explaining Regional Differences in Environmental Inequality A Multi-Level Assessment of German Cities Tobias Rüttenauer Department of Social Sciences TU Kaiserslautern November 22, 2017 Analytische Soziologie

More information

When Dictionary Learning Meets Classification

When Dictionary Learning Meets Classification When Dictionary Learning Meets Classification Bufford, Teresa 1 Chen, Yuxin 2 Horning, Mitchell 3 Shee, Liberty 1 Mentor: Professor Yohann Tendero 1 UCLA 2 Dalhousie University 3 Harvey Mudd College August

More information

Inclusion of Non-Street Addresses in Cancer Cluster Analysis

Inclusion of Non-Street Addresses in Cancer Cluster Analysis Inclusion of Non-Street Addresses in Cancer Cluster Analysis Sue-Min Lai, Zhimin Shen, Darin Banks Kansas Cancer Registry University of Kansas Medical Center KCR (Kansas Cancer Registry) KCR: population-based

More information

Analysis of Bank Branches in the Greater Los Angeles Region

Analysis of Bank Branches in the Greater Los Angeles Region Analysis of Bank Branches in the Greater Los Angeles Region Brian Moore Introduction The Community Reinvestment Act, passed by Congress in 1977, was written to address redlining by financial institutions.

More information

Space Informatics Lab - University of Cincinnati

Space Informatics Lab - University of Cincinnati Space Informatics Lab - University of Cincinnati USER GUIDE SocScape V 1.0 September 2014 1. Introduction SocScape (Social Landscape) is a GeoWeb-based tool for exploration of patterns in high resolution

More information

Spatial approach to analyzing dynamics of racial diversity in large U.S. cities:

Spatial approach to analyzing dynamics of racial diversity in large U.S. cities: Spatial approach to analyzing dynamics of racial diversity in large U.S. cities: 1990 2000 2010 Anna Dmowska a, Tomasz F. Stepinski b a Institute of Geoecology and Geoinformation, Adam Mickiewicz University,

More information

An Introduction to Nonlinear Principal Component Analysis

An Introduction to Nonlinear Principal Component Analysis An Introduction tononlinearprincipal Component Analysis p. 1/33 An Introduction to Nonlinear Principal Component Analysis Adam Monahan monahana@uvic.ca School of Earth and Ocean Sciences University of

More information

Abstract Teenage Employment and the Spatial Isolation of Minority and Poverty Households Using micro data from the US Census, this paper tests the imp

Abstract Teenage Employment and the Spatial Isolation of Minority and Poverty Households Using micro data from the US Census, this paper tests the imp Teenage Employment and the Spatial Isolation of Minority and Poverty Households by Katherine M. O'Regan Yale School of Management and John M. Quigley University of California Berkeley I II III IV V Introduction

More information

Do the Causes of Poverty Vary by Neighborhood Type?

Do the Causes of Poverty Vary by Neighborhood Type? Do the Causes of Poverty Vary by Neighborhood Type? Suburbs and the 2010 Census Conference Uday Kandula 1 and Brian Mikelbank 2 1 Ph.D. Candidate, Maxine Levin College of Urban Affairs Cleveland State

More information

An Alternative Algorithm for Classification Based on Robust Mahalanobis Distance

An Alternative Algorithm for Classification Based on Robust Mahalanobis Distance Dhaka Univ. J. Sci. 61(1): 81-85, 2013 (January) An Alternative Algorithm for Classification Based on Robust Mahalanobis Distance A. H. Sajib, A. Z. M. Shafiullah 1 and A. H. Sumon Department of Statistics,

More information

More on Unsupervised Learning

More on Unsupervised Learning More on Unsupervised Learning Two types of problems are to find association rules for occurrences in common in observations (market basket analysis), and finding the groups of values of observational data

More information

Neighborhood social characteristics and chronic disease outcomes: does the geographic scale of neighborhood matter? Malia Jones

Neighborhood social characteristics and chronic disease outcomes: does the geographic scale of neighborhood matter? Malia Jones Neighborhood social characteristics and chronic disease outcomes: does the geographic scale of neighborhood matter? Malia Jones Prepared for consideration for PAA 2013 Short Abstract Empirical research

More information

Detection of outliers in multivariate data:

Detection of outliers in multivariate data: 1 Detection of outliers in multivariate data: a method based on clustering and robust estimators Carla M. Santos-Pereira 1 and Ana M. Pires 2 1 Universidade Portucalense Infante D. Henrique, Oporto, Portugal

More information

Predicting Long-term Exposures for Health Effect Studies

Predicting Long-term Exposures for Health Effect Studies Predicting Long-term Exposures for Health Effect Studies Lianne Sheppard Adam A. Szpiro, Johan Lindström, Paul D. Sampson and the MESA Air team University of Washington CMAS Special Session, October 13,

More information

Final Exam, Machine Learning, Spring 2009

Final Exam, Machine Learning, Spring 2009 Name: Andrew ID: Final Exam, 10701 Machine Learning, Spring 2009 - The exam is open-book, open-notes, no electronics other than calculators. - The maximum possible score on this exam is 100. You have 3

More information

Visualization of Origin- Destination Commuter Flow Using CTPP Data and ArcGIS

Visualization of Origin- Destination Commuter Flow Using CTPP Data and ArcGIS Visualization of Origin- Destination Commuter Flow Using CTPP Data and ArcGIS Research & Analysis Department Southern California Association of Governments 2015 ESRI User Conference l July 23, 2015 l San

More information

Does city structure cause unemployment?

Does city structure cause unemployment? The World Bank Urban Research Symposium, December 15-17, 2003 Does city structure cause unemployment? The case study of Cape Town Presented by Harris Selod (INRA and CREST, France) Co-authored with Sandrine

More information

Deriving Spatially Refined Consistent Small Area Estimates over Time Using Cadastral Data

Deriving Spatially Refined Consistent Small Area Estimates over Time Using Cadastral Data Deriving Spatially Refined Consistent Small Area Estimates over Time Using Cadastral Data H. Zoraghein 1,*, S. Leyk 1, M. Ruther 2, B. P. Buttenfield 1 1 Department of Geography, University of Colorado,

More information

Statistical Process Control SCM Pearson Education, Inc. publishing as Prentice Hall

Statistical Process Control SCM Pearson Education, Inc. publishing as Prentice Hall S6 Statistical Process Control SCM 352 Outline Statistical Quality Control Common causes vs. assignable causes Different types of data attributes and variables Central limit theorem SPC charts Control

More information

Defining Statistically Significant Spatial Clusters of a Target Population using a Patient-Centered Approach within a GIS

Defining Statistically Significant Spatial Clusters of a Target Population using a Patient-Centered Approach within a GIS Defining Statistically Significant Spatial Clusters of a Target Population using a Patient-Centered Approach within a GIS Efforts to Improve Quality of Care Stephen Jones, PhD Bio-statistical Research

More information

Unsupervised machine learning

Unsupervised machine learning Chapter 9 Unsupervised machine learning Unsupervised machine learning (a.k.a. cluster analysis) is a set of methods to assign objects into clusters under a predefined distance measure when class labels

More information

Data Preprocessing. Cluster Similarity

Data Preprocessing. Cluster Similarity 1 Cluster Similarity Similarity is most often measured with the help of a distance function. The smaller the distance, the more similar the data objects (points). A function d: M M R is a distance on M

More information

New Prediction Methods for Tree Ensembles with Applications in Record Linkage

New Prediction Methods for Tree Ensembles with Applications in Record Linkage New Prediction Methods for Tree Ensembles with Applications in Record Linkage Samuel L. Ventura Rebecca Nugent Department of Statistics Carnegie Mellon University June 11, 2015 45th Symposium on the Interface

More information

Map your way to deeper insights

Map your way to deeper insights Map your way to deeper insights Target, forecast and plan by geographic region Highlights Apply your data to pre-installed map templates and customize to meet your needs. Select from included map files

More information

Methodological Issues in the Analysis of Residential Preferences and Residential Mobility

Methodological Issues in the Analysis of Residential Preferences and Residential Mobility Methodological Issues in the Analysis of Residential Preferences and Residential Mobility Elizabeth E. Bruch Departments of Sociology and Complex Systems, and the Population Studies Center University of

More information

.. Cal Poly CSC 466: Knowledge Discovery from Data Alexander Dekhtyar.. for each element of the dataset we are given its class label.

.. Cal Poly CSC 466: Knowledge Discovery from Data Alexander Dekhtyar.. for each element of the dataset we are given its class label. .. Cal Poly CSC 466: Knowledge Discovery from Data Alexander Dekhtyar.. Data Mining: Classification/Supervised Learning Definitions Data. Consider a set A = {A 1,...,A n } of attributes, and an additional

More information

Spatial Analysis I. Spatial data analysis Spatial analysis and inference

Spatial Analysis I. Spatial data analysis Spatial analysis and inference Spatial Analysis I Spatial data analysis Spatial analysis and inference Roadmap Outline: What is spatial analysis? Spatial Joins Step 1: Analysis of attributes Step 2: Preparing for analyses: working with

More information

DISCUSSION OF INFLUENTIAL FEATURE PCA FOR HIGH DIMENSIONAL CLUSTERING. By T. Tony Cai and Linjun Zhang University of Pennsylvania

DISCUSSION OF INFLUENTIAL FEATURE PCA FOR HIGH DIMENSIONAL CLUSTERING. By T. Tony Cai and Linjun Zhang University of Pennsylvania Submitted to the Annals of Statistics DISCUSSION OF INFLUENTIAL FEATURE PCA FOR HIGH DIMENSIONAL CLUSTERING By T. Tony Cai and Linjun Zhang University of Pennsylvania We would like to congratulate the

More information

Regression Clustering

Regression Clustering Regression Clustering In regression clustering, we assume a model of the form y = f g (x, θ g ) + ɛ g for observations y and x in the g th group. Usually, of course, we assume linear models of the form

More information

The Data Cleaning Problem: Some Key Issues & Practical Approaches. Ronald K. Pearson

The Data Cleaning Problem: Some Key Issues & Practical Approaches. Ronald K. Pearson The Data Cleaning Problem: Some Key Issues & Practical Approaches Ronald K. Pearson Daniel Baugh Institute for Functional Genomics and Computational Biology Department of Pathology, Anatomy, and Cell Biology

More information

ITEM 11 Information June 20, Visualize 2045: Update to the Equity Emphasis Areas. None

ITEM 11 Information June 20, Visualize 2045: Update to the Equity Emphasis Areas. None ITEM 11 Information June 20, 2018 Visualize 2045: Update to the Equity Emphasis Areas Staff Recommendation: Issues: Background: Briefing on the TPB-approved methodology to update the Equity Emphasis Areas

More information

An Open Source Geodemographic Classification of Small Areas In the Republic of Ireland Chris Brunsdon, Martin Charlton, Jan Rigby

An Open Source Geodemographic Classification of Small Areas In the Republic of Ireland Chris Brunsdon, Martin Charlton, Jan Rigby An Open Source Geodemographic Classification of Small Areas In the Republic of Ireland Chris Brunsdon, Martin Charlton, Jan Rigby National Centre for Geocomputation National University of Ireland, Maynooth

More information

Keywords: Air Quality, Environmental Justice, Vehicle Emissions, Public Health, Monitoring Network

Keywords: Air Quality, Environmental Justice, Vehicle Emissions, Public Health, Monitoring Network NOTICE: this is the author s version of a work that was accepted for publication in Transportation Research Part D: Transport and Environment. Changes resulting from the publishing process, such as peer

More information

2/2/2015 GEOGRAPHY 204: STATISTICAL PROBLEM SOLVING IN GEOGRAPHY MEASURES OF CENTRAL TENDENCY CHAPTER 3: DESCRIPTIVE STATISTICS AND GRAPHICS

2/2/2015 GEOGRAPHY 204: STATISTICAL PROBLEM SOLVING IN GEOGRAPHY MEASURES OF CENTRAL TENDENCY CHAPTER 3: DESCRIPTIVE STATISTICS AND GRAPHICS Spring 2015: Lembo GEOGRAPHY 204: STATISTICAL PROBLEM SOLVING IN GEOGRAPHY CHAPTER 3: DESCRIPTIVE STATISTICS AND GRAPHICS Descriptive statistics concise and easily understood summary of data set characteristics

More information

SUPERVISED LEARNING: INTRODUCTION TO CLASSIFICATION

SUPERVISED LEARNING: INTRODUCTION TO CLASSIFICATION SUPERVISED LEARNING: INTRODUCTION TO CLASSIFICATION 1 Outline Basic terminology Features Training and validation Model selection Error and loss measures Statistical comparison Evaluation measures 2 Terminology

More information

emerge Network: CERC Survey Survey Sampling Data Preparation

emerge Network: CERC Survey Survey Sampling Data Preparation emerge Network: CERC Survey Survey Sampling Data Preparation Overview The entire patient population does not use inpatient and outpatient clinic services at the same rate, nor are racial and ethnic subpopulations

More information

C) Discuss two factors that are contributing to the rapid geographical shifts in urbanization on a global scale.

C) Discuss two factors that are contributing to the rapid geographical shifts in urbanization on a global scale. AP Human Geography Unit VII. Cities and Urban Land Use Free Response Questions FRQ 1 Rapid urbanization in Least Developed Countries (LDCs) has many profound impacts for the world. Answer the following

More information

Confined to the Inner Ring? Effects of Ecological Distance on Patterns of Minority Suburbanization in American Metropolitan Areas *

Confined to the Inner Ring? Effects of Ecological Distance on Patterns of Minority Suburbanization in American Metropolitan Areas * Confined to the Inner Ring? Effects of Ecological Distance on Patterns of Minority Suburbanization in American Metropolitan Areas * Jeffrey M. Timberlake and Aaron J. Howell University of Cincinnati *

More information

OBESITY AND LOCATION IN MARION COUNTY, INDIANA MIDWEST STUDENT SUMMIT, APRIL Samantha Snyder, Purdue University

OBESITY AND LOCATION IN MARION COUNTY, INDIANA MIDWEST STUDENT SUMMIT, APRIL Samantha Snyder, Purdue University OBESITY AND LOCATION IN MARION COUNTY, INDIANA MIDWEST STUDENT SUMMIT, APRIL 2008 Samantha Snyder, Purdue University Organization Introduction Literature and Motivation Data Geographic Distributions ib

More information

PSY 250. Sampling. Representative Sample. Representativeness 7/23/2015. Sampling: Selecting Research Participants

PSY 250. Sampling. Representative Sample. Representativeness 7/23/2015. Sampling: Selecting Research Participants PSY 250 Sampling Selecting a sample of participants from the population Sample = subgroup of general population Sampling: Selecting Research Participants Generalize to: Population Large group of interest

More information

Multivariate Statistics: Hierarchical and k-means cluster analysis

Multivariate Statistics: Hierarchical and k-means cluster analysis Multivariate Statistics: Hierarchical and k-means cluster analysis Steffen Unkel Department of Medical Statistics University Medical Center Goettingen, Germany Summer term 217 1/43 What is a cluster? Proximity

More information

Agent Models and Demographic Research. Robert D. Mare December 7, 2007

Agent Models and Demographic Research. Robert D. Mare December 7, 2007 Agent Models and Demographic Research Robert D. Mare December 7, 2007 Agent Modeling vs. Business as Usual in Demographic Research Agent models are inherently more complex than standard multivariate models

More information

STAR COMMUNITY RATING SYSTEM OBJECTIVE EE-4: EQUITABLE SERVICES & ACCESS COMMUNITY LEVEL OUTCOMES FOR KING COUNTY, WA

STAR COMMUNITY RATING SYSTEM OBJECTIVE EE-4: EQUITABLE SERVICES & ACCESS COMMUNITY LEVEL OUTCOMES FOR KING COUNTY, WA STAR COMMUNITY RATING SYSTEM OBJECTIVE EE-4: EQUITABLE SERVICES & ACCESS COMMUNITY LEVEL OUTCOMES FOR KING COUNTY, WA OUTCOME I: EQUITABLE ACCESS AND PROXIMITY Background: This analysis has been developed

More information

GROWING APART: THE CHANGING FIRM-SIZE WAGE PREMIUM AND ITS INEQUALITY CONSEQUENCES ONLINE APPENDIX

GROWING APART: THE CHANGING FIRM-SIZE WAGE PREMIUM AND ITS INEQUALITY CONSEQUENCES ONLINE APPENDIX GROWING APART: THE CHANGING FIRM-SIZE WAGE PREMIUM AND ITS INEQUALITY CONSEQUENCES ONLINE APPENDIX The following document is the online appendix for the paper, Growing Apart: The Changing Firm-Size Wage

More information

(Department of Urban and Regional planning, Sun Yat-sen University, Guangzhou , China)

(Department of Urban and Regional planning, Sun Yat-sen University, Guangzhou , China) DOI:10.13959/j.issn.1003-2398.2008.05.001 :1003-2398(2008)05-0061-06, (, 510275) RESIDENTIAL SEGREGATION OF FLOATING POPULATION AND DRIVING FORCES IN GUANGZHOU CITY YUAN Yuan, XU Xue-qiang (Department

More information

Vehicle Freq Rel. Freq Frequency distribution. Statistics

Vehicle Freq Rel. Freq Frequency distribution. Statistics 1.1 STATISTICS Statistics is the science of data. This involves collecting, summarizing, organizing, and analyzing data in order to draw meaningful conclusions about the universe from which the data is

More information

The DOC counts only White Hispanics, not other race Hispanics, so the Hispanic calculation uses the White Hispanic population numbers.

The DOC counts only White Hispanics, not other race Hispanics, so the Hispanic calculation uses the White Hispanic population numbers. Dane County Rates of Incarceration and Community Supervision 2006 Pamela E. Oliver, PhD Department of Sociology, University of Wisconsin Madison February 23, 2008 At the end of 2006, an estimated 47% of

More information
