Geographically weighted regression (GWR)

This is the author's final version of the manuscript of Nakaya, T. (2007): Geographically weighted regression. In Kemp, K. ed., Encyclopaedia of Geographical Information Science, Sage Publications: Los Angeles, 179-184.

GWR is a technique of spatial statistical modelling used to analyse spatially varying relationships between variables. The processes that generate geographical observations may operate differently in different geographical contexts. Identifying where and how such spatial heterogeneity in the processes appears on maps is key to understanding complex geographical phenomena. To investigate this issue, several local spatial analysis techniques that highlight geographically local differences have been developed. Other techniques of local spatial analysis, such as S. Openshaw's GAM (Geographical Analysis Machine) and L. Anselin's LISA (Local Indicators of Spatial Association), focus on the distribution of only one variable on a map. They are typically used to determine the geographical concentrations of high-risk diseases or crimes by moving over space a circular scanning window, or a spatial weight, that defines the neighbourhood to be analysed. GWR is also a representative tool of local spatial analysis; however, it is unique in that it investigates and maps the distribution of local relationships between variables by using spatial weights.

When the geographical distribution of an indicator examined by an analyst is provided, regression modelling can prove to be an effective method to investigate or confirm plausible explanations of the distribution. GWR extends conventional regression models to allow geographical drifts in the coefficients by introducing local model fitting with geographical weighting. Although it is based on a simple calibration procedure, the approach is computer intensive; however, it effectively enables the modelling of complex geographical variations in these relationships with the fewest restrictions on the functional form of the geographical variations. Thus, GWR is considered to be a useful geocomputation tool for ESDA (exploratory spatial data analysis) with regard to the association of the target indicator with the explanatory ones under study. Since GWR models can be regarded as a special form of non-parametric regression, statistical inference for GWR models is generally well established on the basis of theories of non-parametric regression.

GWR was originally proposed by A.S. Fotheringham, C. Brunsdon and M. Charlton when they worked together at the University of Newcastle upon Tyne in the UK. Since the first publication of GWR in 1996, they have contributed most to the fundamental developments of GWR, including the release of Windows-based application software specialised in this method. Further, a great variety of empirical applications of GWR, and theoretical tuning of the GWR approach to specific issues, have been conducted in various fields such as biology, climatology, epidemiology, marketing analysis and political science.

Geographically global and local regression models

Let us consider an example from health geography in which a researcher seeks the determinants of geographical inequality in health by associating regional mortality rates with regional socio-economic indicators such as median income or the social-class composition of residents. The analyst may apply the following simple regression model:

$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \tag{1}$$

where the dependent variable $y_i$ denotes the mortality rate in location $i$, the independent variable $x_i$ is the typical median income in the same location and $\varepsilon_i$ is the error term, which follows a normal distribution with zero mean and variance $\sigma^2$:

$$\varepsilon_i \sim N(0, \sigma^2). \tag{2}$$

In equation (1), $\beta_0$ and $\beta_1$ denote the coefficients to be estimated by using the least squares method.
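As a minimal sketch of the global model in equation (1), the following Python snippet fits the two coefficients by ordinary least squares with NumPy. The data are synthetic, and every number in it (sample size, income range, true coefficients, noise level) is an assumption made purely for illustration, not a value from the article.

```python
import numpy as np

rng = np.random.default_rng(0)            # fixed seed so the sketch is repeatable
n = 200
x = rng.uniform(20.0, 60.0, size=n)       # hypothetical median income per location
eps = rng.normal(0.0, 1.0, size=n)        # error term, eps_i ~ N(0, sigma^2)
y = 30.0 - 0.3 * x + eps                  # hypothetical mortality rate, true slope -0.3

# Design matrix with a column of ones for the constant term.
X = np.column_stack([np.ones(n), x])

# Least squares estimates of (beta_0, beta_1) for the whole study area.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
b0, b1 = beta
print(f"constant = {b0:.3f}, slope = {b1:.3f}")
```

A single pair of coefficients is returned for the entire dataset, which is exactly what makes the model "global".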

Typically, we expect that an affluent area with a higher median income will probably be healthier, that is, have a lower mortality rate. This implies that the slope $\beta_1$ is expected to be negative. The regression model implicitly postulates that the association rules indicated by the coefficients estimated from the entire dataset are valid everywhere within the study area. Such a model is referred to as a global model.

However, such association rules may vary geographically. For example, the relationships between health and affluence/deprivation indicators are more evident in urbanised/industrialised areas than in their rural counterparts. Since the poor can more easily suffer from ill health due to higher living costs and social isolation in urban areas than in rural areas, the health gap between rich and poor would be wider in urban areas than in rural areas. Therefore, it is reasonable to allow for the possibility that the relationships vary within a study area that encompasses both urban and rural areas. In order to investigate such local variations, GWR introduces varying coefficients in the regression model, which depend on the geographical locations:

$$y_i = \beta_0(u_i, v_i) + \beta_1(u_i, v_i)\, x_i + \varepsilon_i, \tag{3}$$

where $(u_i, v_i)$ are the two-dimensional geographical coordinates of the point location $i$. If areal units are used in the study, these coordinates are usually those of the centroids of the areal units. As a result, the coefficients are functions of the geographical coordinates. Such a model with geographically local drifts in the coefficients is referred to as a local model. Figure 1 shows an illustrative map of the distribution of the local regression coefficient $\beta_1(u_i, v_i)$ based on the above example. The darker the colour of an area, the larger the negative value of the regression coefficient. GWR can thus be used as a visualisation tool showing local relationships.

Figure 1: diagram showing the idea of geographically weighted regression

Geographically local model fitting

GWR estimates the local coefficients by repeatedly fitting the regression model to a geographical subset of the data with a geographical kernel weighting. The simplest form is the moving-window regression. Consider a circle with a given radius centred on the regression point $(u_i, v_i)$ at which the coefficients are to be estimated. We can fit the conventional regression model to the subset of the data within the circle in order to obtain the local coefficients $\beta_0(u_i, v_i)$ and $\beta_1(u_i, v_i)$. More intuitively, we can sketch the scatter diagram in order to observe the relationship between $x$ and $y$ within the circle (see Figure 1 for this idea). By spatially moving the circular window and repeating the local fitting of the conventional regression model, we obtain a set of local coefficients for all the regression points that are the centres of the moving circular window.

Instead of a conventional circular window, a geographical kernel weighting can normally generate a smoother surface of the local coefficients. It is more suitable for the estimation of local coefficients, based on the premise that the actual relationships between the variables vary continuously over space in most cases. For example, consider the fuzzy nature of an urban-rural continuum. If the relationship between health and income depends on the position on the urban-rural continuum, it is reasonable to assume that the relationship varies gradually over space. The geographically local fitting with a kernel weighting is achieved by solving the following geographically weighted least squares problem for each location $i$ as the regression point:

$$\min_{\hat\beta_0(u_i, v_i),\, \hat\beta_1(u_i, v_i)} \sum_j \left[ y_j - \hat y_j\big(\hat\beta_0(u_i, v_i), \hat\beta_1(u_i, v_i)\big) \right]^2 w_{ij}, \tag{4}$$

where $\hat y_j$ denotes the prediction of the dependent variable at observation $j$ with the estimated local coefficients at $i$:

$$\hat y_j = \hat\beta_0(u_i, v_i) + \hat\beta_1(u_i, v_i)\, x_j. \tag{5}$$

In order to obtain a smooth surface of local coefficients, the geographical weight $w_{ij}$ should be defined by a smooth distance-decay function depending on the proximity of the data observation $j$ to the regression point $i$. The closer $j$ is to $i$, the heavier the weight. An illustration of a typical weighting kernel is shown in Figure 1. Evidently, kernel weighting yields a fuzzy geographical subset for estimating the local coefficients.
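The local fitting step can be sketched in a few lines of Python. The snippet below solves the weighted least squares problem of equation (4) in closed form for a simple regression and applies it with moving-window weights (1 inside the circle, 0 outside). The function names, the synthetic data and the chosen window radius are illustrative assumptions; replacing the 0/1 weights with smooth kernel weights would give the kernel-weighted version.

```python
import numpy as np

def weighted_simple_fit(x, y, w):
    """Minimise sum_j w_j * (y_j - b0 - b1 * x_j)**2 in closed form."""
    xm = np.sum(w * x) / np.sum(w)            # weighted mean of x
    ym = np.sum(w * y) / np.sum(w)            # weighted mean of y
    b1 = np.sum(w * (x - xm) * (y - ym)) / np.sum(w * (x - xm) ** 2)
    b0 = ym - b1 * xm
    return b0, b1

def moving_window_fit(coords, x, y, centre, radius):
    """Moving-window regression: weight 1 inside the circle, 0 outside."""
    d = np.linalg.norm(coords - np.asarray(centre), axis=1)
    w = (d <= radius).astype(float)
    return weighted_simple_fit(x, y, w)

# Synthetic data (all values hypothetical): the slope drifts from about
# -0.5 in the west (u = 0) to about -1.5 in the east (u = 1).
rng = np.random.default_rng(1)
n = 400
coords = rng.uniform(0.0, 1.0, size=(n, 2))   # columns are (u, v)
x = rng.uniform(0.0, 10.0, size=n)
y = 20.0 + (-0.5 - coords[:, 0]) * x + rng.normal(0.0, 0.5, size=n)

b0_west, b1_west = moving_window_fit(coords, x, y, centre=(0.1, 0.5), radius=0.2)
b0_east, b1_east = moving_window_fit(coords, x, y, centre=(0.9, 0.5), radius=0.2)
print(f"local slope, west window: {b1_west:.2f}; east window: {b1_east:.2f}")
```

With this synthetic drift, the two windows should recover clearly different negative slopes, which a single global fit would average away.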

Geographical kernel functions

Various functional forms can be used for the weighting kernel. The following is the well-known Gaussian kernel function:

$$w_{ij} = \exp\!\left[ -\frac{1}{2} \left( \frac{d_{ij}}{\theta} \right)^{2} \right], \tag{6}$$

where $d_{ij}$ is the distance from $i$ to $j$ and $\theta$ is the bandwidth parameter that regulates the kernel size. In conformity with this bell-shaped function, observations within a distance of roughly $\theta$ from the regression point $i$ contribute substantially to the estimation of the local coefficients. The bandwidth size can be fixed over space in order to maintain the same geographical extent for analysing the local relationships.

An alternative weighting scheme is adaptive weighting, which maintains the same number of observations $M$ within each kernel. The following bi-square function is a popular adaptive kernel:

$$w_{ij} = \begin{cases} \left[ 1 - \left( d_{ij} / \theta_i \right)^{2} \right]^{2} & \text{if } d_{ij} < \theta_i, \\ 0 & \text{otherwise,} \end{cases} \tag{7}$$

where $\theta_i$ denotes the bandwidth size, defined in this function as the distance between the $M$th nearest observation point and $i$. Adaptive kernels help to prevent unreliable coefficient estimates caused by a lack of degrees of freedom in local subsets, particularly when the geographical density of the observed data varies greatly.
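The two kernels of equations (6) and (7) can be written directly as NumPy functions. This is a small sketch: the function names and the toy distance vector are assumptions for illustration, and the adaptive version simply takes the distances it is given, so whether the regression point itself counts among the $M$ nearest observations depends on what the caller passes in.

```python
import numpy as np

def gaussian_weights(d, bandwidth):
    """Fixed Gaussian kernel: exp(-0.5 * (d / bandwidth)**2)."""
    return np.exp(-0.5 * (d / bandwidth) ** 2)

def bisquare_adaptive_weights(d, M):
    """Adaptive bi-square kernel with the bandwidth set to the
    distance of the M-th nearest observation in d."""
    local_bandwidth = np.sort(d)[M - 1]
    w = (1.0 - (d / local_bandwidth) ** 2) ** 2
    return np.where(d < local_bandwidth, w, 0.0)   # zero at and beyond the bandwidth

# Toy distances from one regression point to five observations.
d = np.array([0.0, 1.0, 2.0, 3.0, 4.0])

g = gaussian_weights(d, bandwidth=2.0)
b = bisquare_adaptive_weights(d, M=4)
print(np.round(g, 4))
print(np.round(b, 4))
```

In this toy case the Gaussian weights stay positive at every distance, whereas the bi-square weights drop to exactly zero from the fourth-nearest observation outwards, so only a local subset of the data enters the fit.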

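Mapping the GWR result

The GWR result is mappable, as shown in Figure 1. Mapping the local variations in the estimated local regression coefficient (slope) $\hat\beta_1(u_i, v_i)$ is particularly informative for interpreting the geographical contextual effects on the association of $y$ with $x$. However, it should be noted that the variations in the local constant $\hat\beta_0(u_i, v_i)$ may be spurious. If the regression coefficient $\beta_1(u_i, v_i)$ is zero at $i$, the local constant is equivalent to the local weighted average of the observed dependent variable around $i$:

$$\hat\beta_0(u_i, v_i) = \frac{\sum_j y_j w_{ij}}{\sum_j w_{ij}}. \tag{8}$$

On the contrary, if $\beta_1(u_i, v_i)$ is negative (positive), the constant should be greater (lower) than the local weighted average, because the least squares method requires the local weighted averages of the observations and of the predictions of the dependent variable to be equal.

In summary, while local regression coefficients contain extensive information on the non-stationary processes under study, the local constant is highly dependent on the local correlations between the variables in the regression model. Therefore, interpreting the map of the local constant term is difficult, particularly in the case of multiple regression models.

Bandwidth and model selection

A multivariate GWR model is shown as follows:

$$y_i = \sum_k \beta_k(u_i, v_i)\, x_{k,i} + \varepsilon_i, \tag{9}$$

where $x_{k,i}$ is the $k$th independent variable at location $i$, including $x_{0,i} = 1$ for all $i$ so that $\beta_0(u_i, v_i)$ becomes the local constant term. The estimation of the local coefficients of the model at $i$ is described by the following matrix notation of weighted least squares:

$$\hat{\boldsymbol\beta}(u_i, v_i) = \left( \mathbf{X}^{\top} \mathbf{W}_i \mathbf{X} \right)^{-1} \mathbf{X}^{\top} \mathbf{W}_i \mathbf{y}, \tag{10}$$

where $\hat{\boldsymbol\beta}(u_i, v_i)$ is the vector of the local coefficients at regression point $i$:

$$\hat{\boldsymbol\beta}(u_i, v_i) = \left( \hat\beta_0(u_i, v_i), \hat\beta_1(u_i, v_i), \ldots \right)^{\top}. \tag{11}$$

$\mathbf{X}$ is the design matrix and $\mathbf{X}^{\top}$ denotes the transpose of $\mathbf{X}$:

$$\mathbf{X} = \begin{pmatrix} 1 & x_{1,1} & \cdots & x_{K,1} \\ 1 & x_{1,2} & \cdots & x_{K,2} \\ \vdots & \vdots & & \vdots \\ 1 & x_{1,N} & \cdots & x_{K,N} \end{pmatrix}. \tag{12}$$

$\mathbf{W}_i$ is the diagonal matrix of the geographical kernel weights based on the distance from $i$:

$$\mathbf{W}_i = \begin{pmatrix} w_{i1} & 0 & \cdots & 0 \\ 0 & w_{i2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & w_{iN} \end{pmatrix}. \tag{13}$$

$\mathbf{y}$ is the vector of the dependent variable:

$$\mathbf{y} = \left( y_1, y_2, \ldots, y_N \right)^{\top}. \tag{14}$$

As shown in equation (9), the GWR model predicts the dependent variable at $i$ with the local coefficients $\hat{\boldsymbol\beta}(u_i, v_i)$ that are specific to the same point. Thus, we can express the prediction using the local coefficients as follows:

$$\hat y_i = \sum_k \hat\beta_k(u_i, v_i)\, x_{k,i} = \left( 1, x_{1,i}, x_{2,i}, \ldots \right) \left( \hat\beta_0(u_i, v_i), \hat\beta_1(u_i, v_i), \hat\beta_2(u_i, v_i), \ldots \right)^{\top} = \mathbf{x}_i \left( \mathbf{X}^{\top} \mathbf{W}_i \mathbf{X} \right)^{-1} \mathbf{X}^{\top} \mathbf{W}_i \mathbf{y}, \tag{15}$$

where $\mathbf{x}_i$ is the $i$th row vector of $\mathbf{X}$.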
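The matrix form of the local estimator in equation (10) and the prediction in equation (15) translate almost line by line into NumPy. The sketch below assumes a fixed Gaussian kernel, uses the observation points themselves as regression points, and runs on synthetic data in which one slope drifts from west to east; the function name and all numerical settings are hypothetical choices for illustration.

```python
import numpy as np

def gwr_coefficients(coords, X, y, bandwidth):
    """Local coefficients (X' W_i X)^{-1} X' W_i y at every observation i."""
    n, k = X.shape
    betas = np.empty((n, k))
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)   # distances d_ij
        w = np.exp(-0.5 * (d / bandwidth) ** 2)          # Gaussian kernel weights
        XtW = X.T * w                                    # X' W_i without building W_i
        betas[i] = np.linalg.solve(XtW @ X, XtW @ y)
    return betas

# Synthetic data (hypothetical): the slope of x1 drifts from -1 to +1 along u,
# while the constant and the slope of x2 are the same everywhere.
rng = np.random.default_rng(2)
n = 300
coords = rng.uniform(0.0, 1.0, size=(n, 2))
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + (2.0 * coords[:, 0] - 1.0) * x1 + 0.5 * x2 + rng.normal(0.0, 0.1, size=n)

X = np.column_stack([np.ones(n), x1, x2])                # design matrix with constant
betas = gwr_coefficients(coords, X, y, bandwidth=0.15)
y_hat = np.sum(X * betas, axis=1)                        # prediction at i uses coefficients of i

# Reference points: the global fit, and GWR with a practically infinite bandwidth.
beta_global = np.linalg.lstsq(X, y, rcond=None)[0]
betas_wide = gwr_coefficients(coords, X, y, bandwidth=1e6)

west, east = np.argmin(coords[:, 0]), np.argmax(coords[:, 0])
print(betas[west, 1], betas[east, 1])
```

Solving the small linear system at each point with `np.linalg.solve` avoids forming an explicit matrix inverse; with a very large bandwidth all weights approach one and the local coefficients collapse to the global least squares estimates.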

In vector-matrix notation, the GWR prediction is rewritten as

$$\hat{\mathbf{y}} = \mathbf{H} \mathbf{y}, \tag{16}$$

where the $i$th row of the matrix $\mathbf{H}$, denoted $\mathbf{h}_i$, is expressed as

$$\mathbf{h}_i = \mathbf{x}_i \left( \mathbf{X}^{\top} \mathbf{W}_i \mathbf{X} \right)^{-1} \mathbf{X}^{\top} \mathbf{W}_i. \tag{17}$$

This matrix, which transforms the observations into predictions, is referred to as the hat matrix in the literature on regression modelling. Since the trace of the hat matrix corresponds to the number of regression coefficients in the global regression model, it is natural to define the effective number of parameters in the GWR model, $p$, by the trace of $\mathbf{H}$:

$$p = \operatorname{trace}(\mathbf{H}) = \sum_i h_{ii}. \tag{18}$$

In general, GWR models with a smaller bandwidth kernel have a greater effective number of parameters than those with a larger bandwidth kernel. When the bandwidth size tends to infinity, the effective number of parameters of the GWR model converges to the number of parameters of the corresponding global model.

A GWR model with a small bandwidth kernel fits the data closely. However, the estimates of the coefficients are likely to be unreliable, since they exhibit large variances due to the lack of degrees of freedom in the local model fitting. On the other hand, meaningful spatial variations in the coefficients may be neglected in a GWR model with a large bandwidth when the true distribution of the coefficient is spatially varying. In such cases, the GWR model using an excessively large bandwidth yields strongly biased estimates of the distributions of the local coefficients. Therefore, bandwidth selection can be regarded as a trade-off between degrees of freedom and degree of fit, or between the bias and variance of the local estimates.
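To make the link between bandwidth and the effective number of parameters concrete, the following sketch builds the hat matrix row by row as in equation (17) and compares its trace for three bandwidths. A fixed Gaussian kernel is assumed, and the function name, random point pattern and bandwidth values are illustrative assumptions only.

```python
import numpy as np

def gwr_hat_matrix(coords, X, bandwidth):
    """Hat matrix whose i-th row is x_i (X' W_i X)^{-1} X' W_i."""
    n = X.shape[0]
    H = np.empty((n, n))
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)
        w = np.exp(-0.5 * (d / bandwidth) ** 2)          # Gaussian kernel weights
        XtW = X.T * w                                    # X' W_i
        H[i] = X[i] @ np.linalg.solve(XtW @ X, XtW)
    return H

rng = np.random.default_rng(3)
n = 200
coords = rng.uniform(0.0, 1.0, size=(n, 2))              # points in a unit square
X = np.column_stack([np.ones(n), rng.normal(size=n)])    # constant plus one variable

# Effective number of parameters p = trace(H) for three bandwidths.
p_small = np.trace(gwr_hat_matrix(coords, X, bandwidth=0.1))
p_mid = np.trace(gwr_hat_matrix(coords, X, bandwidth=0.3))
p_large = np.trace(gwr_hat_matrix(coords, X, bandwidth=1e6))
print(p_small, p_mid, p_large)
```

The trace shrinks as the bandwidth grows and, for a practically infinite bandwidth, approaches 2, the number of coefficients in the corresponding global model with a constant and one explanatory variable.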

In order to resolve these trade-offs, we can use statistical indicators of model comparison, such as CV (cross-validation), GCV (generalised cross-validation) and AIC (Akaike's information criterion), to determine the best bandwidth. In particular, Akaike's information criterion corrected for small sample size (AICc) is useful in bandwidth selection for the weighting kernel, since classic indicators such as CV and AIC may result in undersmoothing when the degrees of freedom are relatively small, which is often the case in non-parametric regression. AICc is defined as follows:

$$\mathrm{AICc} = -2 \sup L + 2q + \frac{2q(q+1)}{N - q - 1}, \tag{19}$$

where $\sup L$ denotes the maximised log-likelihood of the model, representing its degree of fit. If the model fits better, $-2 \sup L$ becomes smaller. Further, $q$ denotes the total number of parameters in the model. It should be noted that $q = p + 1$ when we assume a normal error term, because the error term adds the error-variance parameter $\sigma^2$. A smaller value of $q$ means that the model is simpler.

We can use AICc and other related indicators not only to determine the best bandwidth size but also to compare the model with other competing models, including global models and GWR models with a different set of explanatory variables or a different formulation.

Extensions of GWR models

A major extension of the GWR model is its semiparametric formulation:

$$y_i = \sum_k \beta_k(u_i, v_i)\, x_{k,i} + \sum_l \gamma_l\, x_{l,i} + \varepsilon_i, \tag{20}$$

where $\gamma_l$ is a partial regression coefficient assumed to be global. In the literature on GWR, this model is often called a mixed model, since it uses both fixed coefficients that are held constant over space and varying coefficients that are allowed to vary spatially within the same model. Introducing fixed coefficients simplifies the model, which can reduce the variances of the estimates.
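Returning to bandwidth selection, the AICc of equation (19) can be evaluated over a set of candidate bandwidths as sketched below. Several things here are assumptions rather than part of the article: the Gaussian log-likelihood is evaluated at the maximum likelihood variance estimate (residual sum of squares divided by $N$), the kernel is a fixed Gaussian, the candidate list is arbitrary, and the data are synthetic with a slope that drifts across the study area. Dedicated GWR software may use a differently arranged AICc formula.

```python
import numpy as np

def gwr_hat_matrix(coords, X, bandwidth):
    """Hat matrix of a GWR model with a fixed Gaussian kernel."""
    n = X.shape[0]
    H = np.empty((n, n))
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)
        w = np.exp(-0.5 * (d / bandwidth) ** 2)
        XtW = X.T * w
        H[i] = X[i] @ np.linalg.solve(XtW @ X, XtW)
    return H

def aicc(y, H):
    """AICc = -2 sup L + 2q + 2q(q + 1) / (N - q - 1), with q = trace(H) + 1."""
    n = y.size
    resid = y - H @ y
    sigma2 = resid @ resid / n                           # ML estimate of error variance (assumption)
    neg2_loglik = n * (np.log(2.0 * np.pi * sigma2) + 1.0)
    q = np.trace(H) + 1.0                                # +1 for the error variance
    return neg2_loglik + 2.0 * q + 2.0 * q * (q + 1.0) / (n - q - 1.0)

rng = np.random.default_rng(4)
n = 300
coords = rng.uniform(0.0, 1.0, size=(n, 2))
x = rng.normal(size=n)
y = 1.0 + (2.0 * coords[:, 0] - 1.0) * x + rng.normal(0.0, 0.1, size=n)
X = np.column_stack([np.ones(n), x])

candidates = [0.1, 0.2, 0.4, 0.8]                        # hypothetical bandwidths to compare
scores = {b: aicc(y, gwr_hat_matrix(coords, X, b)) for b in candidates}
best_bandwidth = min(scores, key=scores.get)

# The global model is compared with the same criterion via its own hat matrix.
H_global = X @ np.linalg.solve(X.T @ X, X.T)
aicc_global = aicc(y, H_global)
print(best_bandwidth, scores[best_bandwidth], aicc_global)
```

Because the same criterion is applied to the global hat matrix, the sketch also illustrates the model comparison use of AICc: on data with a genuinely drifting slope, the best GWR model should score lower than the global model.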

Another important area of GWR extension is the formulation of GWR within the framework of generalised linear modelling. Although GWR has been developed by assuming a linear modelling framework with a Gaussian (normal) error term, spatial analysts often encounter problems in which the dependent variable is discrete and non-negative rather than continuous. A conventional Gaussian modelling framework is inadequate for such cases. In particular, logistic and Poisson regressions are popular for binary and count dependent variables, respectively. On the basis of the maximum likelihood framework, applying the GWR approach to these variables yields geographically weighted generalised linear models, including geographically weighted logistic regression and geographically weighted Poisson regression. Various other theoretical and practical extensions continue to be developed in order to formulate GWR models that are applicable to wider fields.

Tomoki Nakaya

See also

Discrete vs. Continuous Variables/Phenomena, Exploratory Spatial Data Analysis, Geographical Analysis Machine (GAM), Kernel, Spatial Heterogeneity, Spatial Nonstationarity, Spatial Statistics, Spatial Weights

Further Reading

Fotheringham AS, Brunsdon C, Charlton M. Geographically weighted regression: the analysis of spatially varying relationships. Wiley: Chichester, 2002.

Fotheringham AS, Brunsdon C, Charlton M. Quantitative Geography: Perspectives on Spatial Data Analysis. Sage: London, 2000.

Nakaya T, Fotheringham AS, Brunsdon C, Charlton M. Geographically weighted Poisson regression for disease associative mapping. Statistics in Medicine 24, 2695-2717, 2005.