Multiple Dependent Hypothesis Tests in Geographically Weighted Regression

Similar documents
Outline. Inference in regression. Critical values GEOGRAPHICALLY WEIGHTED REGRESSION

High-Throughput Sequencing Course. Introduction. Introduction. Multiple Testing. Biostatistics and Bioinformatics. Summer 2018

Familywise Error Rate Controlling Procedures for Discrete Data

This paper has been submitted for consideration for publication in Biometrics

Exploratory Spatial Data Analysis (ESDA)

CHL 5225H Advanced Statistical Methods for Clinical Trials: Multiplicity

Modified Simes Critical Values Under Positive Dependence

Political Science 236 Hypothesis Testing: Review and Bootstrapping

Specific Differences. Lukas Meier, Seminar für Statistik

False Discovery Rate

Prospect. February 8, Geographically Weighted Analysis - Review and. Prospect. Chris Brunsdon. The Basics GWPCA. Conclusion

A Space-Time Model for Computer Assisted Mass Appraisal

Geographically Weighted Regression as a Statistical Model

High-throughput Testing

Applying the Benjamini Hochberg procedure to a set of generalized p-values

False discovery rate and related concepts in multiple comparisons problems, with applications to microarray data

1Department of Demography and Organization Studies, University of Texas at San Antonio, One UTSA Circle, San Antonio, TX

Geographically Weighted Regression and Kriging: Alternative Approaches to Interpolation A Stewart Fotheringham

y ˆ i = ˆ " T u i ( i th fitted value or i th fit)

Statistical testing. Samantha Kleinberg. October 20, 2009

Control of Directional Errors in Fixed Sequence Multiple Testing

The GWmodel R package: Further Topics for Exploring Spatial Heterogeneity using Geographically Weighted Models

FDR-CONTROLLING STEPWISE PROCEDURES AND THEIR FALSE NEGATIVES RATES

Statistical Applications in Genetics and Molecular Biology

Week 5 Video 1 Relationship Mining Correlation Mining

Basic Statistics. 1. Gross error analyst makes a gross mistake (misread balance or entered wrong value into calculation).

MULTISTAGE AND MIXTURE PARALLEL GATEKEEPING PROCEDURES IN CLINICAL TRIALS

Summary and discussion of: Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing

Introductory Econometrics. Review of statistics (Part II: Inference)

Step-down FDR Procedures for Large Numbers of Hypotheses

Alpha-Investing. Sequential Control of Expected False Discoveries

Post-Selection Inference

A Simple, Graphical Procedure for Comparing Multiple Treatment Effects

Spatial Trends of unpaid caregiving in Ireland

Review of Statistics 101

STAT 263/363: Experimental Design Winter 2016/17. Lecture 1 January 9. Why perform Design of Experiments (DOE)? There are at least two reasons:

A multiple testing procedure for input variable selection in neural networks

Running head: GEOGRAPHICALLY WEIGHTED REGRESSION 1. Geographically Weighted Regression. Chelsey-Ann Cu GEOB 479 L2A. University of British Columbia

ON STEPWISE CONTROL OF THE GENERALIZED FAMILYWISE ERROR RATE. By Wenge Guo and M. Bhaskara Rao

Testing a secondary endpoint after a group sequential test. Chris Jennison. 9th Annual Adaptive Designs in Clinical Trials

Statistical Inference. Why Use Statistical Inference. Point Estimates. Point Estimates. Greg C Elvers

Table of Outcomes. Table of Outcomes. Table of Outcomes. Table of Outcomes. Table of Outcomes. Table of Outcomes. T=number of type 2 errors

Lectures 5 & 6: Hypothesis Testing

The Design of a Survival Study

Controlling the proportion of falsely-rejected hypotheses. when conducting multiple tests with climatological data

Rigorous Science - Based on a probability value? The linkage between Popperian science and statistical analysis

On Generalized Fixed Sequence Procedures for Controlling the FWER

Interim Monitoring of Clinical Trials: Decision Theory, Dynamic Programming. and Optimal Stopping

Statistica Sinica Preprint No: SS R1

An Open Source Geodemographic Classification of Small Areas In the Republic of Ireland Chris Brunsdon, Martin Charlton, Jan Rigby

Controlling Bayes Directional False Discovery Rate in Random Effects Model 1

Comparing Adaptive Designs and the. Classical Group Sequential Approach. to Clinical Trial Design

CH.9 Tests of Hypotheses for a Single Sample

Fundamental Probability and Statistics

Multiple Testing. Tim Hanson. January, Modified from originals by Gary W. Oehlert. Department of Statistics University of South Carolina

Sample Size and Power I: Binary Outcomes. James Ware, PhD Harvard School of Public Health Boston, MA

Looking at the Other Side of Bonferroni

Journal Club: Higher Criticism

Multivariate Process Control Chart for Controlling the False Discovery Rate

Stat 206: Estimation and testing for a mean vector,

Dose-response modeling with bivariate binary data under model uncertainty

GeoDa-GWR Results: GeoDa-GWR Output (portion only): Program began at 4/8/2016 4:40:38 PM

Department of Statistics University of Central Florida. Technical Report TR APR2007 Revised 25NOV2007

Chapter 1 Statistical Inference

Model Based Statistics in Biology. Part V. The Generalized Linear Model. Chapter 16 Introduction

Non-specific filtering and control of false positives

How do we compare the relative performance among competing models?

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages:

An Equivalency Test for Model Fit. Craig S. Wells. University of Massachusetts Amherst. James. A. Wollack. Ronald C. Serlin

Using Spatial Statistics Social Service Applications Public Safety and Public Health

STAT 4385 Topic 01: Introduction & Review

Time: the late arrival at the Geocomputation party and the need for considered approaches to spatio- temporal analyses

Chapter 13 Section D. F versus Q: Different Approaches to Controlling Type I Errors with Multiple Comparisons

Parameter Estimation, Sampling Distributions & Hypothesis Testing

STATISTICAL INFERENCE FOR SURVEY DATA ANALYSIS

Mixtures of multiple testing procedures for gatekeeping applications in clinical trials

Inferential Statistics Hypothesis tests Confidence intervals

The Pennsylvania State University The Graduate School Eberly College of Science GENERALIZED STEPWISE PROCEDURES FOR

Econometrics. 4) Statistical inference

Rigorous Science - Based on a probability value? The linkage between Popperian science and statistical analysis

MTMS Mathematical Statistics

Contrasts and Multiple Comparisons Supplement for Pages

The Purpose of Hypothesis Testing

Analysis of variance, multivariate (MANOVA)

Multiple Testing of One-Sided Hypotheses: Combining Bonferroni and the Bootstrap

Lecture 28. Ingo Ruczinski. December 3, Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University

Superchain Procedures in Clinical Trials. George Kordzakhia FDA, CDER, Office of Biostatistics Alex Dmitrienko Quintiles Innovation

Type I error rate control in adaptive designs for confirmatory clinical trials with treatment selection at interim

Rigorous Science - Based on a probability value? The linkage between Popperian science and statistical analysis

Clinical Trials. Olli Saarela. September 18, Dalla Lana School of Public Health University of Toronto.

CSISS Tools and Spatial Analysis Software

Journal of Statistical Software

Sociology 6Z03 Review II

Geographically Weighted Regression

Group Sequential Designs: Theory, Computation and Optimisation

A Simple, Graphical Procedure for Comparing. Multiple Treatment Effects

1 Overview. Coefficients of. Correlation, Alienation and Determination. Hervé Abdi Lynne J. Williams

The General Linear Model. Guillaume Flandin Wellcome Trust Centre for Neuroimaging University College London

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

STAT 461/561- Assignments, Year 2015

Transcription:

Multiple Dependent Hypothesis Tests in Geographically Weighted Regression Graeme Byrne 1, Martin Charlton 2, and Stewart Fotheringham 3 1 La Trobe University, Bendigo, Victoria Austrlaia Telephone: +61 3 5444 7263 Fax: +61 3 54447998 Email: g.byrne@latrobe.edu.au 2 National Centre for Geocomputation,National University of Ireland, Maynooth, Co. Kildare, Ireland Telephone: +353 (0) 1 708 6455 Fax: +353 (0) 1 708 6456 Email: martin.charlton@nuim.ie 3 National Centre for Geocomputation,National University of Ireland, Maynooth, Co. Kildare, Ireland Telephone: +353 (0) 1 708 6455 Fax: +353 (0) 1 708 6456 Email: stewart.fotheringham@nuim.ie 1. Introduction Geographically weighted regression (Fotheringham et al., 2002) is a method of modelling spatial variability in regression coefficients. The procedure yields a separate model for each spatial location in the study area with all models generated from the same data set using a differential weighting scheme. The weighting scheme, which allows for spatial variation in the model parameters, involves a bandwidth parameter which is usually determined from the data using a cross-validation procedure. Part of the main output is a set of location-specific parameter estimates and associated t statistics which can be used to test hypotheses about individual model parameters. If there are n spatial locations and p parameters in each model, there will be up to np hypotheses to be tested which in most applications defines a very high order multiple inference problem. Solutions to problems of this type usually involve an adjustment to the decision rule for individual tests designed to contain the overall risk of mistaking chance variation for a genuine effect. An undesirable by-product of achieving this control is a reduction in statistical power for individual tests, which may result in genuine effects going undetected. These two competing aspects of multiple inference have become known as the multiplicity problem. In this paper we develop a simple Bonferroni style adjustment for testing multiple hypotheses about GWR model coefficients. The adjustment takes advantage of the intrinsic dependency between local GWR models to contain the overall risk mentioned above, without the large sacrifice in power associated with the traditional Bonferroni correction. We illustrate this adjustment and a range of other corrective procedures on two data sets. The first models the determinants of educational attainment in the counties of Georgia USA. Using area based census data we examine the links between levels of educational attainment and four potential predictors: the proportion of elderly, the proportion who are foreign born, the proportion living below the poverty line and the proportion of ethnic

blacks. The second model is a geographically weighted hedonic house price model based on individual mortgage records in Greater London in 1990. In both models we show how the various corrections can be used to guide the interpretation of the spatial variations in the parameter estimates. Finally we compare the statistical power of the proposed method with Bonferroni/Sidak corrections and those based on false discovery rate control. 2. Controlling the family wise error rate The traditional approach to the multiplicity problem is to control the probability of a type I error over a set (family) of m related hypothesis tests. The probability of rejecting one or more true null hypotheses is called the family wise error rate which we denote ξ m and controlling this quantity is a standard goal of most traditional multiple inference procedures. If, in a set of m hypothesis tests, E i is the event a type I errors occurs in the ith test, the family wise error rate is given by ξ m = P ( m ) E i m P(E i ), (1) where the last term follows from Boole s inequality. If P(E i ) has the same value (say α) for all tests, the Bonforroni (1935) correction follows from equation 1 since then ξ m mα and so setting α = ξ m /m controls the family-wise error rate to be ξ m or less. This result does not require the tests to be independent however if they are independent, we can write ( ) ( ) m m m m ξ m = P E i = 1 P E i = 1 P(E i ) = 1 (1 P(E i )). Thus, for m independent tests with P(E i ) = α, the family wise error rate is ξ m = 1 (1 α) m and the Šidák (1967) correction follows by choosing α = 1 (1 ξ m ) 1/m which controls the family wise error rate at exactly ξ m. Both of the above approaches become very conservative when the number of hypotheses to be tested is large and when the tests are not independent (Abdi, 2007). For dependent tests Moyè (2003, p. 417 428) uses conditional probability and induction arguments to derive the following generalisation of Šidák s result, ξ m = 1 (1 α) ( 1 α ( 1 D 2)) m 1, (2) where the family wise error rate now depends on D [0,1] which is called the degree of dependency (assumed constant) across all tests. For a full discussion of the dependency parameter see Moyè (2003, Ch. 5). If D = 1 we need only test any one of the hypotheses in the set using the desired family wise error rate, since then the decision for this hypothesis completely determines the outcome of all remaining hypotheses in the set. If D = 0 the hypotheses are independent and equation 2 reduces to the Šidák result. Of course it still remains to choose a value for D and unfortunately there are few general guidelines. Moyè (2003, pp. 188 190) discusses the issue in the context of multiple endpoints in clinical trials and later in this paper we propose a method in the GWR context. It appears that the choice of D will always depend on the context of the study and in some cases the analyst will have no clear direction and more powerful alternatives such as the

0.05 Per testαvalue to achieveξ 100 0.05 0.04 0.03 0.02 0.01 0.00 0.0 0.2 0.4 0.6 0.8 1.0 Dependency parameter D Figure 1: Plot of the per test α value against the dependency parameter D to achieve a family-wise error rate of ξ 100 = 0.05. Holm Bonferonni procedure (Holm, 1979) can be employed. If one is willing to redefine the problem in terms of the false discovery rate (the expected fraction of incorrectly rejected hypotheses), the methods of Benjamini and Hochberg (1995) or, for dependent tests, Benjamini and Yekutieli (2001) may be appropriate. Applying the binomial theorem to equation 2 it is straight forward to show that ξ m α(1 + (m 1)(1 D 2 )) a result that can also be established using Boole s inequality as demonstrated by Moyè (2003). Therefore, assuming we have a value for D, the family wise error rate can be controlled at ξ m or less by choosing α = ξ m 1 + (m 1)(1 D 2 ) (3) which is a Bonferroni style correction for dependent tests. Moyè (2003, p. 188) warns against overestimation of D since then the family-wise error rate will not be preserved. The importance of this advice is illustrated by the steepness of the curve in fig. 1 for D > 0.95. In this region even a small overestimate of D can lead to large overestimate of the per test α value resulting in the family-wise error rate being considerably larger than required. With this in mind it would be prudent to adopt a value of D at the lower end of its expected range. 3. Dependency in Geographically Weighted Regression Since the calibration procedure in geographically weighted regression uses the same data set to calibrate models at each spatial location, the method necessarily results in a certain

degree of dependency between the models, which must flow over to the t statistics and associated hypothesis tests. A measure of this dependency can be calculated from the GWR hat matrix S defined in Brunsdon et al. (1999). The effective number of parameters is defined by p e = 2tr(S) tr(s S) where tr( ) is the matrix trace operator. Now suppose that there are p parameters in each model to be calibrated and there are n spatial locations. A sensible definition for the dependency between models is D = 1 p e np, (4) which is justified as follows. First consider the (admittedly impossible) case of complete independence among the models for which p e = np giving D = 0. In the case of complete dependence among the models (i.e. the global model is appropriate) we would have p e = p giving D = 1 1/n which will be close to unity in most GWR applications since n is usually large. Therefore equation 4 satisfies the requirement 0 D 1 and on substituting into equation 3, the family wise error rate for testing hypotheses about GWR model coefficients is controlled at ξ m or less by selecting α = ξ m 1 + p e p. (5) e np In most applications p e will be much less than the maximum number of parameters (np) and so a significant gain in statistical power is expected when using this approach. 4. Summary The main result in this paper equation 5, provides a very simple method of preserving the family wise error rate when testing multiple hypotheses about GWR model coefficients. It avoids the large sacrifice of statistical power associated with the ordinary Bonferroni correction by incorporating the intrinsic dependency between the local models into the procedure. We note that the dependency parameter is a function of the effective number of parameters which is calculated from the trace of the hat matrix which itself is a function of the estimated bandwidth. Since the bandwidth is calculated from the data it is really a random variable with an unknown distribution. Therefore D and consequently α are also random variables with unknown distributions. If this distribution can be estimated (via bootstrap or Monte Carlo methods for example), confidence bounds could be developed for D and α which would enhance the procedure and provide some protection against an inflated family-wise error resulting from overestimating D. We leave this for future work. 5. Acknowledgements The first author thanks the National Centre for Geocomputation for providing resources during his time there working on this research and La Trobe University for funding study leave and travel costs. References Abdi, H. (2007). The Bonferonni and Šidàk corrections for multiple comparisons. In: N. Salkind, ed., Encyclopedia of Measurement and Statistics, Sage.

Benjamini, Y. and Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, Series B (Methodological), 57(1):289 300. Benjamini, Y. and Yekutieli, D. (2001). The control of the false discovery rate in multiple testing under dependency. The Annals of Statistics, 29(4):1165 1188. Bonforroni, C.E. (1935). Il calcolo delle assicurazioni su gruppi di teste. In: Studi in Onore del Professore Salvatore Ortu Carboni, Rome, Italy. pp. 13 60. Brunsdon, C., Fotheringham, A.S. and Charlton, M. (1999). Some notes on parametric soignificance tests for geographically weighted regression. Journal of Regional Science, 39(3):497 524. Fotheringham, A.S., Brunsdon, C. and Charlton, M. (2002). Geographically Weighted Regression: The Analysis of Spatially Varying Relationships. John Wiley & Sons, Chichester. Holm, S. (1979). A simple sequentially rejective multiple test procedure. Scandinavian Journal of Statistics, 6:65 70. Moyè, L.A. (2003). Multiple Analyses in Clinical Trials: Fundamentals for Investigators. Springer. Šidák, Z. (1967). Rectangular confidence regions for the means of multivariate normal distributions. Journal of the American Statistical Association, 62(318):626 633.