¹Department of Demography and Organization Studies, University of Texas at San Antonio, One UTSA Circle, San Antonio, TX 78249.

Well, it depends on where you're born: A practical application of geographically weighted regression to the study of infant mortality in the U.S.

P. Johnelle Sparks and Corey S. Sparks¹

Introduction

Infant mortality remains one of the most sensitive measures of the general health status of a population (see Clarke et al. 1994; Cramer 1987; Nersesian 1988; Newland 1981; Singh and Yu 1995). Major medical advances in prenatal and early infant care have led to dramatic decreases in infant mortality rates in the United States. This decline is largely attributable to fewer infants being born at low birth weight and to fewer complications during pregnancy that could lead to an infant death. Recent reports from the National Center for Health Statistics indicate that the overall infant mortality rate for the U.S. in 2003 was 6.85 deaths per 1,000 live births (Hoyert et al. 2006). Yet the U.S. still lags behind many industrialized nations in overall infant mortality, and rates continue to vary across places in the United States and between racial/ethnic groups (Nersesian 1988).

The goal of our research is to identify possible explanations for the variation in infant mortality rates across space. Social structure theory suggests that the social, racial, economic, and educational distributions of places may affect health outcomes (Bird and Bauman 1995). This theory argues that minority concentration and residential segregation are just two components of the social structure that could negatively affect health outcomes at the aggregate level, such as for counties or states. Within the social structure, resources are distributed unequally. The social structure can therefore perpetuate inequality in quality employment, affordable and safe housing, incomes, and health care resources, to name a few.
It is from this theoretical perspective that we test for spatial variation in county infant mortality rates using geographically weighted regression. The purpose of this paper is to examine the associations between various socioeconomic indicators of inequality and county infant mortality rates across the United States. By using analytical methods that allow the relationships between our socioeconomic indicators and infant mortality to vary across space, we hope to provide a better understanding of the often continuous and spatially varying nature of inequality and health. The ultimate goal of this work is to inform policy makers about the distinct regional variations in the relationship between socioeconomic inequality and population health.

Data and methods

Data for this analysis are taken from two sources: the Compressed Mortality Files from the National Center for Health Statistics and the 2000 U.S. Census of Population and Housing, Summary File 3. Five-year sex-race standardized infant mortality rates for the years 1998-2002 serve as the dependent variable. Mortality rates are standardized to facilitate the comparison of rates across groups. It is important in this analysis to standardize by sex and race, because mortality risks during the first year of life vary greatly with these demographic characteristics. We use the sex-race distribution of the 2000 U.S. population as our standard population. Independent variables are taken from the Census and serve as county-level economic and social indicators.
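As a rough illustration of the standardization step described above, the following sketch assumes direct standardization, in which stratum-specific rates are averaged using the standard population's shares as weights. The function name, the strata, and every number are hypothetical and are not taken from the paper's data.

```python
def directly_standardized_rate(stratum_rates, standard_shares):
    """Average stratum-specific rates using standard-population shares as weights."""
    if set(stratum_rates) != set(standard_shares):
        raise ValueError("rates and standard shares must cover the same strata")
    total_share = sum(standard_shares.values())
    return sum(
        stratum_rates[stratum] * standard_shares[stratum] / total_share
        for stratum in stratum_rates
    )

# Hypothetical sex-race specific infant mortality rates for one county
# (deaths per 1,000 live births); illustrative values only.
county_rates = {
    ("female", "black"): 12.0,
    ("female", "white"): 5.0,
    ("male", "black"): 14.0,
    ("male", "white"): 6.0,
}

# Hypothetical shares of each stratum in the standard population.
standard_shares = {
    ("female", "black"): 0.10,
    ("female", "white"): 0.40,
    ("male", "black"): 0.10,
    ("male", "white"): 0.40,
}

standardized_rate = directly_standardized_rate(county_rates, standard_shares)
print(round(standardized_rate, 2))
```

Because every county is weighted by the same standard shares, differences between standardized rates reflect differences in stratum-specific mortality rather than differences in county sex-race composition.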
We include five independent variables in our analyses: the percentage of the county population that is rural, the percentage of the county population that is black, the percentage of the county population that is Hispanic, the percentage of the county population below age 5 that lives in a family below the federally designated poverty level, and the percentage of female-headed households in the county.

The percentage of the county population that is rural, defined as the population not classified as urban by the Census Bureau, is constructed by dividing the county's rural population by the county's total population and multiplying by 100. Similarly, the percentages of the county population that are black or Hispanic are constructed by dividing the total number of black or Hispanic residents in the county by the total county population in 2000 and multiplying by 100. The number of county children below the age of five living below the federally designated poverty threshold is divided by the total county population under the age of five and multiplied by 100 to construct the percentage of the county population under age 5 living in poverty. To construct the percentage of female-headed households, we divide the total number of female-headed households in the county by the county population and multiply by 100.

The primary method used in this paper is geographically weighted regression (GWR) (Brunsdon et al. 1998; Fotheringham et al. 2002). The primary benefit of the GWR approach is that it allows us to visualize how the effect of each covariate in our model varies over geographic space. This approach contrasts with the autoregressive models frequently used in spatial demography, which, while controlling for autocorrelation in either the dependent variable or the error structure, still estimate global regression parameters. Since we are primarily interested in how the processes that influence infant mortality rates vary across space, global parameter estimates are not sufficient to increase our understanding of the process.
The GWR model extends the traditional OLS framework by allowing local, rather than global, estimates of the parameters $\beta_k$. The model is written as:

$$ y_i = \beta_0(u_i, v_i) + \sum_k \beta_k(u_i, v_i)\, x_{ik} + \varepsilon_i, $$

where each $\beta_k$ is now estimated at the location $(u_i, v_i)$, the geographic coordinates of observation $i$, and $\beta_k(u_i, v_i)$ is the local realization of the continuous function $\beta_k(u, v)$ at point $i$. This constructs a trend surface of parameter values for each independent variable and for the model intercept. Note that the basic OLS regression model is just a special case of the GWR model in which the coefficients are constant over space.

The parameters in the GWR are estimated by weighted least squares. The weighting matrix is a diagonal matrix, with each diagonal element being a function of the location of the observation. If $W_i$ is the weighting matrix at location $i$, then the parameter estimate at that location is:

$$ \hat{\beta}_i = [X' W_i X]^{-1} X' W_i Y $$

The role of the weight matrix is to give more weight to observations that are close to $i$, as it is assumed that nearby observations influence each other more than distant ones. The weight matrix can be calculated in several ways. Typically a distance-based weight is used, and an observation $j$ receives zero weight if its distance $d_{ij}$ from $i$ exceeds a threshold distance $d$. This distance weighting defines the neighborhoods on which the local estimates are based: an observation within the distance $d$ is included in the analysis for location $i$. We employ the kernel method of weighting proposed by Brunsdon et al. and Fotheringham et al., in which a global kernel bandwidth is estimated via cross-validation of the observed and predicted mortality rates. This bandwidth parameter defines the neighborhood, or the number of observations in the data, used to estimate the regression parameters at location $i$.
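The local estimator and the cross-validated bandwidth can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a single covariate so that the weighted least squares solution can be written out by hand, it assumes a Gaussian kernel (the text does not name the kernel), and the function names, toy data, and candidate bandwidths are all hypothetical.

```python
import math

def gaussian_weight(distance, bandwidth):
    """Gaussian kernel weight (an assumed kernel; the text does not name one)."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)

def local_fit(coords, x, y, target, bandwidth, skip=None):
    """Weighted least squares fit of y = b0 + b1 * x at the `target` location.

    This is the two-parameter case of [X'WX]^-1 X'WY, solved by hand.
    """
    sw = swx = swy = swxx = swxy = 0.0
    for j, (u, v) in enumerate(coords):
        if j == skip:  # leave this observation out (used for cross-validation)
            continue
        w = gaussian_weight(math.hypot(u - target[0], v - target[1]), bandwidth)
        sw += w
        swx += w * x[j]
        swy += w * y[j]
        swxx += w * x[j] * x[j]
        swxy += w * x[j] * y[j]
    det = sw * swxx - swx * swx
    b1 = (sw * swxy - swx * swy) / det
    b0 = (swy - b1 * swx) / sw
    return b0, b1

def cv_score(coords, x, y, bandwidth):
    """Leave-one-out sum of squared prediction errors for one bandwidth."""
    total = 0.0
    for i, target in enumerate(coords):
        b0, b1 = local_fit(coords, x, y, target, bandwidth, skip=i)
        total += (y[i] - (b0 + b1 * x[i])) ** 2
    return total

# Hypothetical toy data: ten locations on a line, with a slope that drifts eastward.
coords = [(float(i), 0.0) for i in range(10)]
x = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 8.0, 7.0, 9.0, 10.0]
y = [(1.0 + 0.2 * u) * xi for (u, _), xi in zip(coords, x)]

# Choose the candidate bandwidth with the smallest cross-validation score,
# then estimate one (intercept, slope) pair per location.
candidates = [1.0, 2.0, 4.0, 8.0]
best_bandwidth = min(candidates, key=lambda h: cv_score(coords, x, y, h))
local_coefs = [local_fit(coords, x, y, c, best_bandwidth) for c in coords]
```

A very large bandwidth gives nearly equal weights everywhere, so the local estimates approach the global OLS estimates, which matches the remark that OLS is a special case of GWR.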
Model estimation is done in R version 2.7.2 using the spgwr library, and we test the improvement in model performance using the global F-tests and the tests for geographic variation in the regression parameters derived in Leung et al. (2000).

Preliminary results

The results of our initial OLS regression model are provided in Table 1. They suggest that increases in the percentage of the county population that is rural, black, or Hispanic are associated with a decrease in the county infant mortality rate, whereas increases in the percentage of young children in poverty and in the percentage of female-headed households are associated with an increase in the county infant mortality rate.

Table 1. Results from Ordinary Least Squares (OLS) Regression Model for US County Infant Mortality Rates, 1998-2002.

Variable                        Estimate        t   Prob(>|t|)
Intercept                         383.00     5.69       <.0001
% Rural                          -153.04    33.97       <.0001
% Black                          -205.74    80.02        .0102
% Hispanic                       -325.39    80.22       <.0001
% Children in Poverty             982.25   213.51       <.0001
% of Female Household Heads      1129.82   266.99       <.0001

While these results provide a baseline estimate of the effects of our independent variables, we are more concerned with testing the hypothesis that these parameters vary across space. Table 2 presents the results of the GWR, giving the minimum, first quartile, median, third quartile, and maximum values of the estimated regression coefficients. The OLS estimates are also given for comparison.

Table 2. Distribution of GWR Coefficients for US County Infant Mortality Rates, 1998-2002.

Variable                        Minimum   1st Quartile   Median   3rd Quartile   Maximum   OLS Estimate
Intercept                        -453.2          335.9    446.2          503.0     782.3         383.00
% Rural                          -460.6         -208.7   -162.5         -113.9     119.4        -153.04
% Black                         -1762.0         -322.9   -191.2          -77.1    1373.0        -205.74
% Hispanic                      -1085.0         -408.5   -262.7         -137.6     426.0        -325.39
% Children in Poverty             167.5          604.8    867.6         1300.0    3425.0         982.25
% of Female Household Heads      -926.8          771.2   1010.0         1364.0    4100.0        1129.82
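A distribution table of this kind can be produced by summarizing the local coefficient estimates for each variable. The sketch below is illustrative only: the function name and the coefficient values are hypothetical, and the quartile method ("inclusive" linear interpolation) is an assumption, since the text does not say how the quartiles were computed.

```python
import statistics

def coefficient_summary(local_estimates):
    """Five-number summary of one variable's local regression coefficients."""
    q1, median, q3 = statistics.quantiles(local_estimates, n=4, method="inclusive")
    return {
        "min": min(local_estimates),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(local_estimates),
    }

# Hypothetical local coefficients for one covariate, one value per county.
example_estimates = [-310.5, -150.2, -40.0, 25.7, -220.9, -95.3, 60.1]
summary = coefficient_summary(example_estimates)
print(summary)
```

A summary whose minimum and maximum have opposite signs, as for several variables in Table 2, indicates that the direction of the association itself changes across locations.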

Inspection of the table of coefficients reveals considerable variability in the regression coefficients across space, and a visual inspection of the regression parameters better illustrates how the relationships between the independent variables and county infant mortality rates vary across space. The following figures provide an illustration of the geographic variation in regression parameters. The ultimate goal of this work is to inform policy makers about the distinct regional variations in the relationship between socioeconomic inequality and population health, using county infant mortality rates as an example.

References

Bird, S.T. and K.E. Bauman. 1995. The Relationship between Structural and Health Services Variables and State-Level Infant Mortality in the United States. American Journal of Public Health 85(1): 26-29.

Brunsdon, C., S. Fotheringham, and M. Charlton. 1998. Geographically Weighted Regression: Modeling Spatial Non-Stationarity. Journal of the Royal Statistical Society Series D-The Statistician 47: 431-443.

Clarke, L.L., F.L. Farmer, and M.K. Miller. 1994. Structural Determinants of Infant Mortality in Metropolitan and Nonmetropolitan America. Rural Sociology 59(1): 84-99.

Cramer, J.C. 1987. Social Factors and Infant Mortality: Identifying High-Risk Groups and Proximate Causes. Demography 24(3): 299-322.

Fotheringham, A., C. Brunsdon, and M. Charlton (eds.). 2002. Geographically Weighted Regression: The Analysis of Spatially Varying Relationships. New York: Wiley.

Hoyert, D.L., M.P. Heron, S.L. Murphy, and H.C. Kung. 2006. Deaths: Final Data for 2003. National Vital Statistics Report 54(13): 1-120.

Leung, Y., C. Mei, and W. Zhang. 2000. Statistical Tests for Spatial Nonstationarity Based on the Geographically Weighted Regression Model. Environment and Planning A 32: 9-32.

Nersesian, W.S. 1988. Infant Mortality in Socially Vulnerable Populations. Annual Review of Public Health 9: 361-377.

Newland, K. 1981. Infant Mortality and the Health of Societies. Washington, DC: Worldwatch Institute.

R Development Core Team. 2008. R: A Language and Environment for Statistical Computing. 2.7.2 ed. Vienna, Austria: R Foundation for Statistical Computing.

Singh, G.K. and S.M. Yu. 1995. Infant Mortality in the United States: Trends, Differentials, and Projections, 1950 through 2010. American Journal of Public Health 85(7): 957-964.