Semiparametric geographically weighted generalised linear modelling in GWR 4.0

T. Nakaya 1, A. S. Fotheringham 2, M. Charlton 2, C. Brunsdon 3

1 Department of Geography, Ritsumeikan University, 56-1 Tojiin-kita-machi, Kita-ku, Kyoto 603-8577, Japan
Telephone: (+81)-(0)75-467-8801 Fax: (+81)-(0)75-465-8296
Email: nakaya@lt.ritsumei.ac.jp

2 National Centre for Geocomputation, John Hume Building, National University of Ireland, Maynooth, Co. Kildare, Ireland
Telephone: (+353)-(0)-1-708-6455 Fax: (+353)-(0)-1-708-6456
Email: Stewart.Fotheringham@nuim.ie, Martin.Charlton@nuim.ie

3 Department of Geography, Leicester University, Leicester, LE1 7RH, UK
Telephone: (+44)-(0)-116-252-3843 Fax: (+44)-(0)-116-252-3854
Email: cb179@le.ac.uk

1. Introduction

The aim of this paper is to propose a generalised framework for semiparametric geographically weighted regression (S-GWR) by combining several theoretical aspects of geographically weighted regression (GWR). In this framework, we can implement model selection in order to judge which explanatory effects on the response variable are globally fixed or geographically varying in generalised linear modelling (GLM). This framework is implemented in a new version of the GWR software (GWR 4.0) which is soon to be released and which will be described.

To date, numerous theoretical and applied studies of GWR have been reported after the first seminal papers of GWR appeared (Fotheringham et al., 1996; Brunsdon et al., 1996). Here, we focus on two important extensions of GWR: geographically weighted generalised linear modelling (GWGLM) and the semiparametric extension of GWR. While the original GWR assumes that the response is a continuous variable and the error term follows a Gaussian (normal) distribution, GWGLM enables us to fit generalised linear models with geographically local coefficients to accommodate commonly encountered types of response including count and binomial variables with likelihood functions of non-normal errors. Although Fotheringham et al. (2002) described a geographically local scoring algorithm to estimate geographically local coefficients of GWGLM, issues of inference about estimated coefficients were generally ignored. Nakaya et al. (2005) derived standard errors of coefficients and degrees-of-freedom for model selection indicators by focusing on geographically weighted Poisson regression (GWPR) and its semiparametric extension. Here, we generalize semiparametric GWPR to S-GWR and associate it with model diagnostics to assess the geographical variability of coefficients. The new software, GWR 4.0, provides a user-friendly platform to calibrate S-GWR models allowing the user to experiment with which coefficients are spatially varying and which are fixed.

2. S-GWR models

Geographically varying coefficient models are defined in the generalized linear model (GLM) framework. Suppose we define a linear predictor with geographically varying coefficients as

$$\eta_i = \sum_k \beta_k(u_i, v_i)\, x_{k,i} = \mathbf{x}_i^t \boldsymbol{\beta}(\mathbf{u}_i),$$

where $x_{k,i}$ and $\beta_k$ are the kth explanatory (independent) variable and its coefficient, respectively. In this model, the coefficients vary depending on the geographical coordinates of the location, $\mathbf{u}_i = (u_i, v_i)$. The expected value of the response of the ith observation, $E[y_i]$, is related to the linear predictor via a link function, $g$:

$$g(E[y_i]) = \eta_i.$$

The log-likelihood of the observation is defined by a distributional function of the exponential family with the canonical parameter, $\eta_i$, and the dispersion parameter, $\phi$, as well as three functional components of the exponential family, $a$, $b$ and $c$:

$$\log f(y_i \mid \eta_i, \phi) = \frac{y_i \eta_i - b(\eta_i)}{a(\phi)} + c(y_i, \phi).$$

This framework covers commonly used regression models including Gaussian, Poisson and logistic variants of GWR.

GWR (Gaussian): $y_i \sim N[\eta_i, \sigma^2]$
GWPR (Poisson): $y_i \sim \mathrm{Poisson}[\exp(\eta_i)]$
GWLR (Logistic): $y_i \sim \mathrm{Bernoulli}[\mathrm{logistic}(\eta_i)]$

GWGLM is a method to estimate a vector of local coefficients focusing on the ith regression point by solving the following maximisation problem of the geographically weighted log-likelihood of the model,

$$\hat{\boldsymbol{\beta}}(\mathbf{u}_i) = \arg\max \sum_{j}^{n} \left\{ \log f\!\left(y_j \mid \hat{\eta}_j^{(i)}, \hat{\phi}_j^{(i)}\right) w_{ij}\!\left(\lVert \mathbf{u}_i - \mathbf{u}_j \rVert\right) \right\},$$

where the hat symbol means prediction; $\hat{\eta}_j^{(i)} = \hat{\eta}_j(\boldsymbol{\beta}(\mathbf{u}_i))$ and $\hat{\phi}_j^{(i)} = \hat{\phi}_j(\boldsymbol{\beta}(\mathbf{u}_i))$. These two working variables are the fitted canonical and dispersion parameters for the prediction of the response at the jth location with the coefficients at the ith regression point.

The geographical weight of the jth observation at the ith regression point, $w_{ij}$, is introduced here as a non-negative and monotonically decreasing function of the distance between the regression point and the jth observation location, such as a Gaussian kernel function:

$$w_{ij} = \exp\left(-\frac{1}{2}\left(\frac{\lVert \mathbf{u}_i - \mathbf{u}_j \rVert}{G}\right)^2\right),$$

where the parameter $G$ (called the bandwidth) regulates the kernel size.
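As an illustration of the kernel weighting and the local maximisation above, the following is a minimal Python sketch for the Poisson case (GWPR), assuming NumPy is available. The function names `gaussian_kernel_weights` and `local_poisson_fit` are hypothetical and are not part of GWR 4.0. The fit uses plain iteratively reweighted least squares with a log link, which is one common way to carry out a local scoring step; the abstract does not spell out the exact algorithm used in the software.

```python
import numpy as np


def gaussian_kernel_weights(coords, i, bandwidth):
    """Weights w_ij = exp(-0.5 * (d_ij / G)^2) of all observations j at regression point i."""
    d = np.linalg.norm(coords - coords[i], axis=1)
    return np.exp(-0.5 * (d / bandwidth) ** 2)


def local_poisson_fit(X, y, w, n_iter=50, tol=1e-10):
    """Local coefficients of a Poisson model (log link) maximising sum_j w_j * log f(y_j).

    Assumption: ordinary IRLS in which each IRLS weight is multiplied by the
    geographical weight w_j. X is (n, p), y is (n,), w is (n,).
    """
    mu = y + 0.5                      # starting fitted means, kept strictly positive
    eta = np.log(mu)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        z = eta + (y - mu) / mu       # working (adjusted) response
        a = w * mu                    # geographical weight times IRLS weight
        XtA = X.T * a
        beta_new = np.linalg.solve(XtA @ X, XtA @ z)
        eta = X @ beta_new
        mu = np.exp(eta)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta
```

Calling `local_poisson_fit(X, y, gaussian_kernel_weights(coords, i, G))` for every regression point i would give one coefficient vector per location. A small bandwidth G makes the estimates more local, while a very large G approaches the global Poisson regression.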

S-GWR, as semiparametric GWGLM, includes partially linear terms of explanatory effects on the response in the canonical parameter:

$$\eta_i = \sum_k \beta_k(u_i, v_i)\, x_{k,i} + \sum_l \gamma_l z_{l,i},$$

where $z_{l,i}$ is the lth explanatory variable and $\gamma_l$ is its coefficient, which is constant over space. Combining geographically local scoring and back-fitting algorithms, we can compute the estimates of coefficients and indicators for model diagnostics, including standard errors of coefficients, degrees of freedom and an information criterion such as AICc (corrected AIC), to decide an optimal bandwidth size and to make model comparisons (cf. Nakaya et al., 2005).

3. Model selection of S-GWR

3.1 Assessment of geographical variability of coefficients

An advantage of S-GWR is that we can incorporate a fixed effect of a subset of explanatory variables on the response variable due to prior knowledge. However, it is not always obvious which coefficients should be assumed to be fixed or varying. A natural way to overcome this difficulty is to conduct empirical model comparisons of different semiparametric models having different combinations of fixed and varying coefficients. Mei et al. (2004; 2006) proposed F test schemes for this kind of model selection. However, given that the optimal bandwidth size of a geographical kernel for fitting GWR is normally chosen by a model selection indicator, it would be more appropriate to assess the geographical variability of GWR coefficients with a model selection indicator as well.

To assess the variability of the kth coefficient, we can compare two models: a fitted GWGLM model (the pivot model) and a model in which only the kth coefficient is switched to be constant while the other coefficients vary spatially. If the pivot model is better than the model with the kth coefficient fixed, as judged by a model comparison criterion such as AICc, we can claim that the kth coefficient varies spatially. The test routine in GWR 4.0 repeats this comparison for each relationship in the model.
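The comparison in Section 3.1 can be sketched for the Gaussian case in Python, assuming NumPy. All function names here are hypothetical. The sketch rests on three stated assumptions: the fixed coefficient is estimated with a closed-form two-step (profile) estimator in the spirit of mixed GWR rather than with the back-fitting used in GWR 4.0; AICc takes the form commonly used for Gaussian GWR, with the trace of the hat matrix as the effective number of parameters; and the bandwidth is held at the same value for both models, whereas in practice it would normally be re-optimised for each.

```python
import numpy as np


def gwr_hat_matrix(X, coords, bandwidth):
    """Hat matrix S of a Gaussian GWR in which every column of X varies over space."""
    n = X.shape[0]
    S = np.empty((n, n))
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)
        w = np.exp(-0.5 * (d / bandwidth) ** 2)   # Gaussian kernel weights
        XtW = X.T * w
        # Row i maps the response vector y to the fitted value at regression point i.
        S[i] = X[i] @ np.linalg.solve(XtW @ X, XtW)
    return S


def sgwr_hat_matrix(X_var, Z_fix, coords, bandwidth):
    """Hat matrix of a Gaussian semiparametric model (assumed two-step estimator)."""
    S = gwr_hat_matrix(X_var, coords, bandwidth)
    R = np.eye(S.shape[0]) - S
    RZ = R @ Z_fix
    # gamma_hat = (RZ' RZ)^{-1} RZ' R y, so G maps y to the fixed coefficients.
    G = np.linalg.solve(RZ.T @ RZ, RZ.T @ R)
    # Fitted values: S (y - Z gamma_hat) + Z gamma_hat = (S + R Z G) y.
    return S + RZ @ G


def aicc(y, L):
    """Gaussian AICc with tr(L) as the effective number of parameters."""
    n = len(y)
    resid = y - L @ y
    sigma2 = resid @ resid / n
    k = np.trace(L)
    return n * np.log(sigma2) + n * np.log(2 * np.pi) + n * (n + k) / (n - 2 - k)


def variability_test(y, X, coords, bandwidth, k):
    """AICc(kth coefficient fixed) minus AICc(pivot model); positive favours the pivot."""
    pivot = aicc(y, gwr_hat_matrix(X, coords, bandwidth))
    keep = [c for c in range(X.shape[1]) if c != k]
    fixed = aicc(y, sgwr_hat_matrix(X[:, keep], X[:, [k]], coords, bandwidth))
    return fixed - pivot
```

A positive return value means the pivot model has the lower AICc, which under the rule above suggests that the kth coefficient varies spatially. Repeating the call for each column index k mirrors the per-coefficient loop described in the text.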

3.2 GtoF / FtoG automated variable selection

GWR 4.0 contains two separate fitting techniques for automated variable selection of S-GWR models. One is the GtoF (from geographically varying to fixed) variable selection routine, which executes a series of model comparisons to search for an optimal combination of varying and fixed terms given the explanatory and response variables. The concept is similar to that of step-wise variable selection. Firstly, a model comparison is repeated between the originally fitted GWR model and a model in which only one coefficient is switched to be constant while the others remain spatially varying. The optimal model is now selected from the original GWR model and a set of models in which one parameter is fixed. If this optimal model is the full GWR model, the process stops. If the optimal model contains a fixed effect, a new set of model comparisons is then made by making each of the remaining spatially varying coefficients constant in turn, and this procedure is repeated until no further improvement in model fit can be obtained by making a coefficient constant instead of spatially varying.
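The GtoF search described above can be sketched as a greedy loop in Python. This is only an outline under assumptions: `g_to_f` is a hypothetical name, and `aicc_of` is a hypothetical callable supplied by the user that fits the S-GWR model whose spatially varying terms are the given set (all other terms fixed) and returns its AICc. How GWR 4.0 breaks ties or re-optimises the bandwidth at each step is not stated in the text, so ties here are resolved by term name.

```python
def g_to_f(terms, aicc_of):
    """Greedy GtoF search: start with all terms varying, fix one term per round.

    terms   : iterable of term names (hypothetical labels for the coefficients)
    aicc_of : callable taking a frozenset of varying terms and returning AICc
    Returns (varying_terms, fixed_terms, best_aicc).
    """
    all_terms = frozenset(terms)
    varying = all_terms
    best = aicc_of(varying)                 # the originally fitted full GWR model
    while varying:
        # Try switching each remaining varying coefficient to a constant in turn.
        candidates = [(aicc_of(varying - {t}), t) for t in sorted(varying)]
        cand_score, cand_term = min(candidates)
        if cand_score >= best:              # no candidate improves the fit: stop
            break
        varying = varying - {cand_term}
        best = cand_score
    return varying, all_terms - varying, best
```

With a toy criterion such as `lambda v: 100 + 2 * len(v) + (10 if "x1" not in v else 0)` and terms `["intercept", "x1", "x2"]`, the loop fixes `intercept` and `x2` and keeps `x1` varying, because fixing `x1` always raises the criterion.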

An alternative model selection procedure also implemented in GWR 4.0 is FtoG (from fixed to geographically varying), which is the reverse procedure to that described above. In this case, the default model is the global one and the first round of model comparisons is made by allowing each parameter in turn to be spatially varying. Model selection is made in the same manner by selecting an optimal model based on AICc and then allowing a second round of parameters to be spatially varying, and so on, until no model improvement is possible.

4. GWR 4.0

GWR 4.0 is a new release of the current GWR 3.0 software. Compared to the previous version, the user interface is largely rebuilt to use tabbed sub-windows so that a modelling session intuitively proceeds in a step-by-step manner (Figure 1). Also, a wider range of options related to GWGLM, including the geographical variability assessment and automated variable selection routines explained above, are available. It is executable under a MS-Windows environment with .NET Framework 3.5. The software will be demonstrated in this presentation.

Figure 1. Screenshots of GWR4.0 interface.

5. Conclusion

In this paper, we describe GLM-based semiparametric geographically weighted regression (S-GWR), which allows mixing geographically varying and fixed coefficients in a generalised linear model. It is also possible to explore which explanatory terms should be varying or fixed, through model comparisons between possible different S-GWR models. GWR 4.0 has been developed as a platform for the practical implementation of S-GWR modelling with new methods of geographical variability assessments for estimated coefficients and automated model selection to search for an optimal combination of fixed and varying explanatory terms in a model.

References

Brunsdon C, Fotheringham AS and Charlton M, 1996, Geographically weighted regression: a method for exploring spatial nonstationarity. Geographical Analysis, 28, 281-289.
Fotheringham AS, Charlton M and Brunsdon C, 1996, The geography of parameter space: an investigation into spatial nonstationarity. International Journal of GIS, 10, 605-627.
Fotheringham AS, Brunsdon C and Charlton M, 2002, Geographically weighted regression. Wiley, Sussex, UK.
Mei C-L, He S-Y and Fang K-T, 2004, A note on the mixed geographically weighted regression model. Journal of Regional Science, 44, 143-157.
Mei C-L, Wang N and Zhang W-X, 2006, Testing the importance of the explanatory variables in a mixed geographically weighted regression model. Environment and Planning A, 38, 587-598.
Nakaya T, Fotheringham AS, Charlton M and Brunsdon C, 2005, Geographically weighted Poisson regression for disease associative mapping. Statistics in Medicine, 24, 2695-2717.