Residuals for spatial point processes based on Voronoi tessellations

Ka Wong¹, Frederic Paik Schoenberg², Chris Barr³.

¹ Google, Mountain View, CA.
² Corresponding author, 8142 Math-Science Building, Department of Statistics, University of California, Los Angeles, CA 90095-1554, USA. Email: frederic@stat.ucla.edu. Phone: 310-794-5193. Fax: 310-206-5658.
³ Department of Biostatistics, Harvard University.
Abstract

A residual analysis method for spatial point processes is proposed, where differences between the modeled conditional intensity and the observed number of points are assessed over the Voronoi cells generated by the observations. The resulting residuals appear to be substantially less skewed and hence more stable, particularly for point processes with conditional intensities close to zero, compared to ordinary Pearson residuals and other pixel-based methods. An application to models for Southern California earthquakes is provided.

Key words: Papangelou intensity, Pearson residuals, point patterns, residual analysis, Voronoi residuals.

1 Introduction.

The aim of this paper is to propose a new form of residual analysis for assessing the goodness of fit of spatial or spatial-temporal point process models. The proposed method relies on comparing the normalized observed and expected numbers of points over Voronoi cells generated by the observed point pattern. The excellent treatment of Pearson residuals and other pixel-based residuals by Baddeley et al. (2005), the thorough discussion of their properties in Baddeley et al. (2008), and the fact that such residuals extend so readily to the case of spatial-temporal point processes may suggest that the problem of residual analysis for such point processes is generally solved. Hence, we feel it is necessary to devote a substantial portion of this paper to a major shortcoming of such pixel-based residuals, in order to motivate our proposed alternative. In
brief, Pearson residuals and other pixel-based residuals tend to be highly skewed when the integrated conditional intensity over some of the cells is close to zero, which is common in many applications. By contrast, the proposed Voronoi residuals are approximately Gamma distributed and tend to be far less skewed than Pearson residuals, and are thus far more amenable to assessment of goodness of fit.

zzq: We might need to define spatial and spatial-temporal point processes and their intensities, and say that we are assuming throughout that the point processes are simple. We are assuming that the observation region is equipped with Lebesgue measure, $\mu$. Note that we are not emphasizing the distinction between conditional and Papangelou intensities here. The methods and results here are essentially equivalent for spatial and spatial-temporal point processes.

This paper is organized as follows. In Section 2, we briefly review the goals of residual analysis as well as Pearson residuals and other pixel-based residuals described in Baddeley et al. (2005), and discuss their limitations when the integrated conditional intensity is small. Section 3 describes Voronoi residuals and discusses their properties. The simulations shown in Section 4 demonstrate the potential advantages of the Voronoi residuals over conventional pixel-based residuals in cases where the conditional intensity is occasionally close to zero. Section 5 includes an application to models for earthquake occurrences in Southern California.
2 Pearson residuals and other pixel-based methods.

Residual plots for spatial point processes have two related purposes: (i) to suggest locations or aspects of the model where the fit is poor, so that an incorrectly specified model may be improved; (ii) to form the basis of formal testing, i.e. to assess the overall appropriateness of a model, or the extent to which the model fits well and hence results based on the model may be trusted.

Baddeley et al. (2005) discuss a variety of pixel-based residuals for spatial point processes. The residual diagnostics are plots showing the standardized differences between the number of points occurring in each pixel and the number expected according to the fitted model, where the standardization may be performed in various ways. For instance, for Pearson residuals, one divides the difference by the estimated standard deviation of the number of points in the pixel, in analogy with Pearson residuals in the context of linear models. Baddeley et al. (2005) also propose scaling the residuals based on the contribution of each pixel to the total pseudo-loglikelihood of the model, in analogy with score statistics in generalized linear modeling. Standardization is important for both purposes (i) and (ii): without it, plots of the residuals tend to overemphasize deviations in pixels where the rate is high, and formal testing based on individual pixels requires the standard deviation of the number of points in the pixel to be taken into account.

The term Pearson residuals carries the suggestion, sometimes made explicit (see e.g. the error bounds in Fig. 7 of Baddeley et al. 2005), that these standardized residuals should be approximately standard normally distributed, so that the squared residuals, or their sum, are distributed approximately according to Pearson's $\chi^2$-distribution. Pearson residuals appear to be effective model evaluation tools in examples where the estimate of the conditional intensity, $\lambda$, is moderately sized throughout the space of observation, as is the case throughout Baddeley et al. (2005).

ZZQ: briefly outline other residuals in Baddeley et al. (2005) and Baddeley et al. (2008)?

If $\lambda$ is small, however, then the Pearson residuals will be heavily skewed and their distribution will not be well approximated by the normal or $\chi^2$ distributions. Indeed, when $\lambda$ is close to zero, the raw residuals tend to have a distribution that is very highly skewed, and the standardization to form Pearson residuals actually exacerbates this skew. Unfortunately, these situations arise in many applications. For example, in modeling earthquake occurrences, the modeled conditional intensity is typically close to zero far away from known faults or previous seismicity, and in modeling wildfires, one may have a modeled conditional intensity close to zero in areas far from human use or frequent lightning, or with vegetation types that do not readily support much wildfire activity (zzq: cite). Furthermore, even if $\lambda$ is not extremely close to zero, the same skew occurs if the pixels used for Pearson residuals are small enough that the integral of $\lambda$ over a pixel is occasionally very small.

Since the Pearson residuals are standardized to have mean zero and unit (or approximately unit) variance under the null hypothesis that the modeled conditional intensity is correct (see Baddeley et al. 2008), one may inquire whether the skew of these residuals is indeed problematic. Consider a planar Poisson process where the estimate of $\lambda$ is exactly correct, i.e. $\hat\lambda(x, y) = \lambda(x, y)$ at all locations, and where one elects to use Pearson residuals on pixels. Suppose that there are several pixels where the integral of $\lambda$ over the pixel is roughly 0.01. Given many of these pixels, it is not unlikely that at least one of
them will contain a point of the process. In such pixels, the raw residual will be 0.99, and the standard deviation of the number of points in the pixel is $\sqrt{0.01} = 0.1$, so the Pearson residual is 9.90. This may yield the following effects:

a) Such Pearson residuals may overwhelm the others in a visual inspection, rendering a plot of the Pearson residuals largely useless in terms of evaluating the quality of the fit of the model;

b) Conventional tests based on the normal approximation will have grossly incorrect p-values, and will commonly tend to reject the model, although it is correct, based on one such residual alone. Even if one adjusts for the non-normality of the residual and instead uses exact p-values based on the Poisson distribution for one such pixel individually, the test will still reject the model at the significance level of 0.01.

c) If one adjusts for the non-normality of the residual and computes exact simultaneous p-values, then the resulting tests will have extremely low power. Indeed, if 10,000 pixels (a 100 × 100 grid) are used, then gross mis-specification would be required in order to reject the null hypothesis with probability greater than a mere 10% under such circumstances.

ZZQ: We need to check this. We need to simulate Poisson processes to make this last statement more concrete. We will need to make an assumption about the intensity.

3 Voronoi residuals.

A Voronoi tessellation is a division of the metric space on which a point process is defined into convex polygons, or Voronoi cells. Specifically, given a spatial or spatial-temporal point pattern $N$, one may define its corresponding Voronoi tessellation as follows: for each point $\tau_i$
of the point process, its corresponding cell $D_i$ is the region consisting of all locations which are closer to $\tau_i$ than to any other point of $N$. The Voronoi tessellation is the collection of such cells. See e.g. Okabe et al. (2000) for a thorough treatment of Voronoi tessellations and their properties.

Given a model for the conditional intensity of a spatial or space-time point process, one may construct residuals simply by evaluating the Pearson residuals over cells rather than rectangular pixels, where the cells comprise the Voronoi tessellation of the observed spatial or spatial-temporal point pattern. We will refer to such residuals as Voronoi residuals.

Voronoi residuals offer one obvious advantage over conventional pixel-based methods: the cell sizes are entirely automatic and data-driven. With pixel-based methods, the cell boundaries are often determined rather arbitrarily, yet these boundaries can have immense impacts on the results, particularly when $\lambda$ is volatile.

More importantly, the distributions of the Voronoi residuals tend to be far less skewed than those of pixel-based residuals such as Pearson residuals, particularly when $\hat\lambda$ is small in some areas. Indeed, since each Voronoi cell has exactly one point inside it by construction, the Voronoi residual for cell $i$ is given by

$$\hat r_i := \frac{1 - \int_{D_i} \hat\lambda \, d\mu}{\sqrt{\int_{D_i} \hat\lambda \, d\mu}} = \frac{1 - |D_i| \bar\lambda}{\sqrt{|D_i| \bar\lambda}}, \qquad (1)$$

where $\bar\lambda$ denotes the mean of $\hat\lambda$ over $D_i$ and $|D_i|$ denotes the size of the cell $D_i$. Note that when $N$ is a homogeneous Poisson process, the cell size $|D_i|$ is approximately Gamma distributed. Indeed, for a homogeneous Poisson process, the expected area of a Voronoi cell is equal to the reciprocal of the intensity of
the process (Meijering 1953), and simulation studies have shown that the area of a typical Voronoi cell is approximately Gamma distributed (Hinde and Miles, 1980; Tanemura, 2003). These properties continue to hold approximately in the inhomogeneous case provided that the conditional intensity is approximately constant near the location in question (Barr and Schoenberg 2010). Hence the numerator in equation (1) will often tend to be distributed approximately like a rescaled Gamma random variable. By contrast, for pixels over which the integrated conditional intensity is close to zero, the conventional raw residuals are approximately Bernoulli distributed.

4 Simulated examples.

The exact distributions of the Voronoi residuals are generally quite intractable due to the fact that the cells themselves are random. Simulations may be useful to investigate the approximate distributions of these residuals.

ZZQ1: Assume a certain specific intensity for a spatial inhomogeneous Poisson process with small pixels. For instance, we could take $\lambda(x, y) = 100 x^2 |y|$ over the space $[-1, 1] \times [-1, 1]$. There will be around 67 points. Use a 100 × 100 grid of pixels. Look at a typical plot of the raw residuals, Pearson residuals, and tessellation residuals. The raw and Pearson residuals will probably just show the points basically, near the origin.

ZZQ2: Simulate the inhomogeneous Poisson process many times and look at histograms of the residuals at the origin for Pearson and Voronoi residuals. For Pearson residuals, use the pixel $[0, .01] \times [0, .01]$.
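The simulation outlined in these notes could be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes the intensity in ZZQ1 is $\lambda(x, y) = 100 x^2 |y|$ on $[-1, 1] \times [-1, 1]$, which integrates to about 67 and so matches the expected count stated there. It also approximates every integral of $\lambda$ by a midpoint rule, and approximates each Voronoi cell (clipped to the observation window) by assigning the nodes of a fine grid to their nearest point. All function names are hypothetical.

```python
import math
import random

def lam(x, y):
    """Assumed intensity: 100 * x^2 * |y| on [-1, 1]^2 (integrates to ~66.7)."""
    return 100.0 * x * x * abs(y)

def poisson_count(mean, rng):
    """Poisson(mean) draw: count unit-rate exponential arrivals in [0, mean]."""
    n, t = 0, rng.expovariate(1.0)
    while t <= mean:
        n += 1
        t += rng.expovariate(1.0)
    return n

def simulate(rng, lam_max=100.0):
    """Simulate the inhomogeneous Poisson process on [-1, 1]^2 by thinning."""
    n = poisson_count(lam_max * 4.0, rng)  # the square has area 4
    pts = []
    for _ in range(n):
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        if rng.random() < lam(x, y) / lam_max:  # keep with prob. lam / lam_max
            pts.append((x, y))
    return pts

def pearson(n, mu):
    """Pearson residual for a cell with count n and integrated intensity mu."""
    return (n - mu) / math.sqrt(mu)

def pixel_residuals(pts, k=100):
    """Pearson residuals on a k x k pixel grid (midpoint rule per pixel)."""
    w = 2.0 / k
    counts = [[0] * k for _ in range(k)]
    for x, y in pts:
        i = min(int((x + 1) / w), k - 1)
        j = min(int((y + 1) / w), k - 1)
        counts[i][j] += 1
    out = []
    for i in range(k):
        for j in range(k):
            mu = lam(-1 + (i + 0.5) * w, -1 + (j + 0.5) * w) * w * w
            if mu > 0:  # skip pixels whose approximate integral is zero
                out.append(pearson(counts[i][j], mu))
    return out

def voronoi_integrals(pts, m=100):
    """Approximate integral of lam over each Voronoi cell: every node of an
    m x m midpoint grid is credited to its nearest observed point."""
    if not pts:
        return []
    h = 2.0 / m
    total = [0.0] * len(pts)
    for a in range(m):
        gx = -1 + (a + 0.5) * h
        for b in range(m):
            gy = -1 + (b + 0.5) * h
            near = min(range(len(pts)),
                       key=lambda i: (pts[i][0] - gx) ** 2 + (pts[i][1] - gy) ** 2)
            total[near] += lam(gx, gy) * h * h
    return total

def voronoi_residuals(pts, m=100):
    """Voronoi residuals (1 - mu_i) / sqrt(mu_i), one observed point per cell.
    Cells too small to catch a grid node have mu_i = 0 and are skipped."""
    return [pearson(1, mu) for mu in voronoi_integrals(pts, m) if mu > 0]

def skewness(values):
    """Sample skewness (third central moment over variance^1.5)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

if __name__ == "__main__":
    rng = random.Random(1)
    pts = simulate(rng)
    print(len(pts), "points")
    print("Pearson skewness:", skewness(pixel_residuals(pts)))
    print("Voronoi skewness:", skewness(voronoi_residuals(pts)))
```

Repeating the run over many seeds and collecting the residuals for a fixed pixel or for the cell nearest a fixed location would give the histograms described in ZZQ2. The nearest-node assignment is a convenience chosen to keep the sketch dependency-free; an exact tessellation from a computational geometry library could be substituted.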
5 A seismological application.

ZZQ3: There are many options here. We can use two models in the Collaboratory for the Study of Earthquake Predictability (CSEP) project. I will get this data.

Acknowledgements

This material is based upon work supported by the National Science Foundation under Grant No. zzq. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
References

Baddeley, A., Turner, R., Møller, J., and Hazelton, M. (2005). Residual analysis for spatial point processes (with discussion). Journal of the Royal Statistical Society, Series B, 67(5):617-666.

Baddeley, A., Møller, J., and Pakes, A.G. (2008). Properties of residuals for spatial point processes. Annals of the Institute of Statistical Mathematics, 60:627-649.

Daley, D., and Vere-Jones, D. (1988). An Introduction to the Theory of Point Processes. Springer, New York.