Envelope tests for spatial point patterns with and without simulation

Thorsten Wiegand,1,2 Pavel Grabarnik,3 and Dietrich Stoyan4

1 Department of Ecological Modelling, Helmholtz Centre for Environmental Research-UFZ, Permoserstr. 15, Leipzig Germany
2 German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Deutscher Platz 5e, Leipzig Germany
3 Institute of Physico-Chemical and Biological Problems in Soil Science, Laboratory of Ecosystem Modelling, The Russian Academy of Sciences, Pushchino Russia
4 Institut für Stochastik, TU Bergakademie Freiberg, D Freiberg Germany

Citation: Wiegand, T., P. Grabarnik, and D. Stoyan Envelope tests for spatial point patterns with and without simulation. Ecosphere 7(6):e /ecs

Abstract. Model testing is a central step of spatial point pattern analysis, which allows ecologists to judge if their data agree with ecological hypotheses. We present a simple and elegant solution to a challenging problem: the construction of a goodness-of-fit envelope test with prescribed significance level α. Our new Analytical Global Envelope (AGE) test is not restricted to the narrow frame of complete spatial randomness testing, and its envelopes can be determined by mathematical calculations. This allows us to investigate the influence of key settings of the AGE test on the width of the envelope strip. To circumvent some assumptions of the simulation-free AGE test we present a corresponding Simulation-Based Global Envelope (SBGE) test. The envelope strip of the AGE and the SBGE test encircles the range of a summary function, such as the pair correlation function, under the null model, and it has the desired property that the null hypothesis can be rejected with significance level α if the empirical summary function wanders outside the envelopes.

The AGE test can be applied under the mild conditions that the values of the summary functions under the null model are (approximately) normally distributed and are (approximately) independent for different distance bins r_j. The SBGE test requires only the independence assumption. For a broad range of point processes, the width of the strip of the AGE envelopes scales with 1/n, where n is the number of points. This casts doubt on attempts at goodness-of-fit testing with low n (say < 100). The AGE and SBGE tests operate with wider envelope strips than the classical pointwise test. Therefore, the pointwise test has to be considered too liberal. Furthermore, we show that the width of the AGE/SBGE strip increases approximately with ln(b), where b is the number of distance bins. For example, for b = 20 the AGE/SBGE envelopes are more than 50% wider than the corresponding pointwise envelopes. Our study opens up new avenues to the test problem in point pattern statistics, and the new AGE and SBGE tests can be widely applied in ecology to improve the practice of null model testing.

Key words: deviation test; global envelopes; goodness-of-fit; Monte Carlo test; null model; pair correlation function; simulation envelope; spatial point pattern; type I error.

Received 2 September 2015; revised 27 January 2016; accepted 10 February. Corresponding Editor: D. P. C. Peters.

Copyright: 2016 Wiegand et al. This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.

E-mail: thorsten.wiegand@ufz.de

Introduction

Spatial point patterns are often studied in science and play an important role in modern ecological research (Illian et al. 2008, Wiegand and Moloney 2014). They comprise the locations of ecological objects (e.g., trees) within a given observational window (Fig. 1a) and may include additional information, called marks, which characterize the ecological objects (e.g., the size of the trees, or surviving vs. dead). Spatial point pattern analysis (Ripley 1981, Stoyan and Stoyan 1994, Diggle 2003, Illian et al. 2008) provides powerful techniques to analyze such data sets. During the last two decades ecologists have increasingly adopted them to derive hypotheses on the underlying processes or to test ecological theory (Stoyan and Penttinen 2000, Wiegand and Moloney 2004, Perry et al. 2006, Law et al. 2009, Szmyt 2014, Wiegand and Moloney 2014, Velázquez et al., in press).

Fig. 1. (a) The spatial pattern of 625 juvenile individuals of the species Stylogyne turbacensis at the 50 ha BCI tropical forest plot. (b) Empirical pair correlation function ĝ(r) (closed circles), the theoretical g(r) for the complete spatial randomness (CSR) null model (bold gray line), the pointwise envelopes given by the 5th lowest and highest values of the pair correlation function estimated from 199 simulations of the CSR null model (black lines), and the approximation based on Eqs. 4 and 5 (red lines). (c) Same as (b), but for the L-function. For estimation of the pair correlation function we used the bandwidth h = 2.5 m.

Table 1. Terms and concepts of envelope tests.

Term | Explanation
Summary function S(r) | Characteristic that quantifies statistical properties of spatial point patterns, usually as a function of distance r. Popular examples are Ripley's K-function K(r) and the pair correlation function g(r). The corresponding estimator is indicated by Ŝ(r) and the theoretical function by S(r).
Empirical summary function | Summary function S(r) estimated from the data.
Distance bin r_j | Particular distance where S(r) is estimated. Usually, the r_j are equally spaced over the distance interval B of interest. The number b of distance bins is an important setting of envelope tests.
Pointwise envelope test | The traditional envelope test used in ecological applications. The pointwise envelopes are given by the kth lowest and highest values of Ŝ(r) taken from s simulations of the null model (Eq. 1). However, pointwise envelopes applied to a distance interval B (i.e., b > 1) do not allow rejecting the null model with the intended significance level given in Eq. 1 if the empirical summary function wanders outside the envelope strip.
Global envelope test | Allows, in contrast to pointwise envelope tests, rejecting the null model with prescribed significance level α.
Significance level β | Local significance level of the pointwise test required to yield the prescribed significance level α of the global envelope test (Eq. 8).
Significance level α | Prescribed significance level of the global envelope test.
Asymptotic normality assumption | Assumes that the values of an estimator Ŝ(r) of a summary function are approximately normally distributed for fixed r. This mild assumption is required for the AGE test but not for the SBGE test.
Independence assumption | Assumes that the values of the estimator Ŝ(r) of the summary function evaluated at different distance bins r_j are (approximately) independent. This assumption excludes the use of cumulative summary functions but allows for analytical determination of the critical value z_β of Eq. 8.
AGE test | Analytical Global Envelope test; allows for analytical determination of global envelopes under three assumptions: (1) asymptotic normality, (2) unbiasedness and knowledge of the variance, and (3) independence.
DCLF test | Diggle-Cressie-Loosmore-Ford test; translates the summary function S(r_j) into the single test statistic given in Eq. 3. This test guarantees the prescribed significance level α.
MAD test | Maximum Absolute Deviation test; translates the multiple tests at different distance bins r_j into a single test statistic, the maximum absolute value of S(r_j) − Ŝ_i(r_j) taken over all distance bins r_j (Eq. 2). This test guarantees the prescribed significance level α.
SBGE test | Simulation-Based Global Envelope test; works in exactly the same way as the pointwise envelope test but determines the value of k or the number of simulations with Eq. 10 such that the prescribed significance level α is guaranteed. Requires a higher number of simulations of the null model than the pointwise envelope test.

Reference numbers (third column): 1 2, 3, 4 4, 5 6 6, , 4 2, 4 6
Note: References are: 1, Illian et al. (2008); 2, Ripley (1977); 3, Loosmore and Ford (2006); 4, Baddeley et al. (2014); 5, Myllymäki et al. (in press); 6, this study; 7, Myllymäki et al. (2015).

A typical point pattern analysis in ecology consists of three major technical steps (Wiegand and Moloney 2014). (1) First, the researcher estimates summary functions S(r) (such as the pair correlation function; for explanations of terms and concepts see Table 1) from the data to summarize key statistical properties of the observed pattern, usually as a function of interpoint distance r (this case is assumed throughout the present paper). The basic interest here is to find out if there is positive or negative spatial correlation between the ecological objects, and at which spatial scales. This information can allow for inference on underlying processes such as competition or dispersal limitation (Law et al. 2009, Wiegand et al. 2009). For example, in the absence of environmental heterogeneity, values of the pair correlation function larger than one indicate aggregation (Fig. 1b), whereas values smaller than one indicate hyperdispersion (or regularity). However, because the underlying point processes are stochastic, a value of the empirical pair correlation function above or below one cannot be used to safely infer aggregation or hyperdispersion, respectively. (2) The second step of point pattern analysis is therefore the implementation of an ecological (null) hypothesis as a stochastic null model. (3) The third step, model testing, then allows the researcher to find out if the empirical summary function is compatible with the null model or at which spatial scales significant departures occur.
While the technical tools for the first two steps are unproblematic, the test problem is more complicated, and until recently there was even confusion about the use of test methods and their interpretation (Loosmore and Ford 2006). Ecologists have traditionally used the simulation envelope approach for statistical inference, which goes back to Ripley (1977) and yields valuable information on the scales of deviation. In such tests the empirical summary function is plotted as a function of distance r together with pointwise simulation envelopes, which are typically the 2.5th and 97.5th percentiles of the summary functions generated by Monte Carlo simulations of the null model (Fig. 1b, c). If the empirical summary function wanders for some distance outside the pointwise simulation envelopes (e.g., for distances < 8 m in Fig. 1b), this is taken as evidence for a departure from the null hypothesis. This graphical representation is especially attractive for ecological applications because it encircles the fluctuations of the summary function under the null model and points to distances where departures may occur.

However, for pointwise simulation envelopes with a given number of simulations it is not straightforward to determine the significance level α for rejection of the null hypothesis (Loosmore and Ford 2006). The reason is that the pointwise envelopes yield a prescribed significance level α only for a single distance value r (chosen before the test is conducted). Otherwise, when considering the envelopes over a distance interval, essentially multiple tests are conducted, with the well-known problems of simultaneous inference (e.g., Ripley 1977, Diggle 2003, Loosmore and Ford 2006). This fact has caused confusion and some discussion on the validity of the tests (e.g., Loosmore and Ford 2006, Grabarnik et al. 2011, Baddeley et al. 2014, Myllymäki et al. 2015).

Further fundamental issues of practical importance related to envelope tests also require clarification and careful consideration to avoid inappropriate use of these tools. For example, how does the number n of points of the pattern influence the ability of the test to detect departures from the null model? In point pattern studies in ecology one can often find analyses of patterns with very few points (say < 50) (see Velázquez et al., in press:fig. A1), but can one really distinguish patterns with such a low number of points from a pattern belonging to some null model?

Recent research has addressed the problem of constructing envelope tests with prescribed significance level α (Grabarnik et al. 2011), and based on the idea of studentization of summary functions (see Myllymäki et al. 2015), Myllymäki et al. (in press) present solutions that lead to satisfactory Monte Carlo tests. Here, we go one step further by breaking with the long tradition of simulation-based goodness-of-fit testing in point process statistics. We note that Ripley (1988:46) already constructed tests of the CSR (complete spatial randomness) hypothesis which do not use simulations during the model testing phase but only some analysis of the empirical L-function (see also Illian et al. 2008:95-96). This, however, leads to envelopes of constant width.

The present paper introduces a goodness-of-fit envelope test with a prescribed global significance level α that does not use simulations and, like simulation-based tests, is not restricted to the narrow frame of CSR testing. The new test is called the Analytical Global Envelope (AGE) test. The term global indicates that the significance level α of the test is valid for a whole given distance interval and not only for one distance r as for the pointwise test. We also present a corresponding Simulation-Based Global Envelope (SBGE) test that relaxes the normality assumptions of the AGE test and runs in the same way as the traditional pointwise simulation envelopes.

The simulation-free approach has two main advantages for practical applications. First, for large samples time-consuming simulations can now be avoided. Second, it allows us to derive mathematical formulas for the width of the envelope strip that give ecologists precise information on the role of test characteristics such as point number n, size and shape of the window W, and other settings of the estimators of the summary functions. The price to be paid for application of the AGE or the SBGE test is the change from the popular K- or L-function, which is still mostly used for model testing in ecology (e.g., Velázquez et al., in press:figs. 5d and A2d), to the pair correlation function g(r).
However, g(r) is increasingly used in ecology because it presents the second-order variability information in an easier and more intuitive way than the conventional cumulative functions K and L (e.g., Wiegand and Moloney 2004, Perry et al. 2006, Law et al. 2009), and there is now standard software for the estimation of g(r).

The article is organized as follows. We first remind the reader of the technicalities of the conventional pointwise simulation envelopes. Second, we discuss the fundamentals of the construction of the Analytical Global Envelope (AGE) test: analytical estimation of the variance of the estimator of the pair correlation function g(r), its asymptotic normality for fixed r, and the independence of estimators of g(r) for different r at suitable spacing. Third, we present the AGE test and the formulas that link the width of the envelope strip with the settings of the test. Fourth, we present a Simulation-Based Global Envelope (SBGE) test that makes fewer assumptions than the AGE test and can be applied with standard software. Finally, to be on the safe side, we also check the quality of the AGE test by means of simulations. While the present paper only uses the pair correlation function for testing univariate patterns, its approach may be extended to bivariate and marked patterns and to other summary functions that have the nature of densities.

Materials and Methods

Example data

As an example pattern we use the spatial pattern of 627 juvenile individuals (diameter at breast height between 1 and 3 cm) of the species Stylogyne turbacensis, a small understorey tree (Fig. 1a). The data are taken from the first (1982) census of the fully mapped m plot of tropical forest at Barro Colorado Island, Panamá (Hubbell et al. 2005).

Summary functions

We apply two commonly used second-order summary functions in our analyses: the L-function L(r) = √(K(r)/π) − r, a transformation of the K-function that is traditionally applied in ecology, and the pair correlation function g(r); see Illian et al. (2008) and Wiegand and Moloney (2014) for details on the interpretation and estimation of L(r) and g(r). Although the idea leading to the AGE or SBGE test can in principle also be applied to other summary functions, such as nearest neighbor summary functions, we focus here on second-order summary functions. Nevertheless, at some points we use general language to show that the main methods are not limited to pair correlation and L-functions. All simulation analyses presented here were conducted with the software Programita (Wiegand and Moloney 2014), which can be accessed at , and an R script with example data is provided in the Supplement.

On the conventional pointwise simulation envelopes

The method of the conventional pointwise simulation envelope was introduced by Ripley (1977) and first applied to ecological questions by Sterner et al. (1986), Getis and Franklin (1987), and Kenkel (1988).
It quickly became the standard method of statistical testing for point pattern analysis in ecology. For example, Velázquez et al. (in press) found in their review that an overwhelming majority (93%) of the point pattern studies reviewed used Monte Carlo simulations and pointwise envelopes.

The basic aim of pointwise simulation envelopes is to estimate intervals that encircle the typical (e.g., 95%) range of values of the summary function Ŝ(r) under the null model. [The hat symbol denotes the estimator of the summary function, as opposed to its theoretical counterpart S(r).] For example, for 999 simulated values of Ŝ(r), the 2.5 and 97.5 percentiles of the underlying distribution are approximated by the 25th lowest and highest values of Ŝ(r). More generally, simulation envelopes can be constructed from the kth highest and lowest values of the summary function Ŝ(r) taken from s simulations of the null model, which leads to a significance level of

$$\beta = \begin{cases} \dfrac{2k}{s+1} & \text{if the test is two-sided} \\[1ex] \dfrac{k}{s+1} & \text{if the test is one-sided,} \end{cases} \qquad (1)$$

given that the test is conducted for only one distance r. We use the symbol β here instead of the perhaps expected α to avoid confusion. The β is the local type I error of the pointwise simulation envelope, which is only valid for a single distance r [and could be more precisely called β(r)], whereas α is the prescribed significance level of the AGE test, which is conducted over a distance interval B. We therefore call α the global significance level to distinguish it from the local β.

To get an idea of the variability of Ŝ(r) for the whole range of distances r of interest, the simulation envelopes are then usually plotted together with the empirical summary function over distance r (e.g., Fig. 1b, c) for all r in an interval B of length b. (In the following we assume that b is an integer and consider distances r_j for j = 1, 2, …, b.) This provides a simple graphical assessment of the distance range and strength of potential departures from the null model. For example, the pair correlation function of the juvenile S. turbacensis trees is outside, but not too far outside, the simulation envelopes for all distances below 8.5 m and inside for all larger distances (Fig. 1b). A first diagnosis arising from this observation is that the juvenile S. turbacensis trees may show weak aggregation up to distances of 8.5 m.
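The construction of pointwise envelopes in Eq. 1 can be illustrated with a minimal sketch. It assumes that the s simulated summary functions are already available as lists of b values each; the function name `pointwise_envelopes` and the synthetic normally distributed input are illustrative assumptions, not part of the original analysis or of Programita.

```python
import random


def pointwise_envelopes(sim_curves, k):
    """Return (lower, upper, beta) for two-sided pointwise envelopes.

    sim_curves: list of s simulated summary functions, each a list of b values.
    k: rank used for the envelopes (kth lowest and kth highest value per bin).
    """
    s = len(sim_curves)
    b = len(sim_curves[0])
    lower, upper = [], []
    for j in range(b):
        # Sort the s simulated values at distance bin j.
        column = sorted(curve[j] for curve in sim_curves)
        lower.append(column[k - 1])  # kth lowest value
        upper.append(column[s - k])  # kth highest value
    beta = 2 * k / (s + 1)           # local two-sided level, Eq. 1
    return lower, upper, beta


# Synthetic stand-in for simulated pair correlation functions under CSR
# (hypothetical values: mean 1, standard deviation 0.1, b = 20 bins).
rng = random.Random(1)
sims = [[rng.gauss(1.0, 0.1) for _ in range(20)] for _ in range(199)]
low, high, beta = pointwise_envelopes(sims, k=5)
print(f"beta = {beta:.3f}")
```

With s = 199 and k = 5 this corresponds to the setting used for Fig. 1b; the level β holds for each distance bin separately, not for the strip as a whole.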

However, if we use these pointwise simulation envelopes in goodness-of-fit testing we run into trouble: the rejection rate of this test is in general larger than the β given in Eq. 1, except if the envelope test is conducted for only one distance r (Loosmore and Ford 2006). The reason is that the pointwise envelope test conducts many tests simultaneously, one for each of the distances r_j (j = 1, …, b) where S(r) is evaluated. Therefore, we run the risk of type I error inflation due to the phenomenon known as simultaneous inference. This is well known in the statistical literature (e.g., Conover 1999, Diggle 2003, Illian et al. 2008) and is discussed in detail for point process statistics in Loosmore and Ford (2006), Grabarnik et al. (2011), Baddeley et al. (2014), and Loop and McClure (2015).

An important class of goodness-of-fit tests with a prescribed significance level α are the so-called deviation tests, which convert the multiple tests (i.e., the information on various distances r_j) into a single test, as introduced by Diggle (1979) and Ripley (1979). The test statistic is then a single number, typically the maximum

$$T_i = \max_{j=1,\dots,b} \left| S(r_j) - \hat{S}_i(r_j) \right| \qquad (2)$$

of the deviations between the expected summary function S(r_j) and the summary functions Ŝ_i(r_j) estimated from the observed data (i = 0) or the ith simulation of the null model, taken over all distances r_j of interest, or the corresponding sum of the squared differences

$$u_i = \sum_{j=1}^{b} \left( S(r_j) - \hat{S}_i(r_j) \right)^2. \qquad (3)$$

The hypothesis is rejected if the empirical test statistic T_0 (or u_0) has an extreme position within the ordered series of all T_i (or u_i). Following Eq. 1 for the one-sided test, this is the case if T_0 (or u_0) is larger than the corresponding kth largest simulated value. Inverting Eq. 1 shows that the value of k is given by k = α (1 + s), but the global significance level α and the number s of simulations must be selected in a way that k approximates an integer value.
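The two deviation statistics of Eqs. 2 and 3 and the rank-based rejection rule can be sketched as follows. This is a minimal illustration under the assumption that the expected summary function and all simulated curves are given as equal-length lists; the function names are hypothetical.

```python
def mad_statistic(expected, curve):
    """Eq. 2: maximum absolute deviation over all distance bins."""
    return max(abs(e - c) for e, c in zip(expected, curve))


def dclf_statistic(expected, curve):
    """Eq. 3: sum of squared deviations over all distance bins."""
    return sum((e - c) ** 2 for e, c in zip(expected, curve))


def deviation_test(expected, observed, simulated, alpha, statistic):
    """Return True if the null model is rejected at global level alpha.

    The observed statistic must exceed the kth largest simulated statistic,
    with k = alpha * (s + 1) required to be (close to) an integer.
    """
    s = len(simulated)
    k = alpha * (s + 1)
    if abs(k - round(k)) > 1e-9:
        raise ValueError("choose alpha and s so that alpha * (s + 1) is an integer")
    k = round(k)
    t_observed = statistic(expected, observed)
    t_simulated = sorted(
        (statistic(expected, curve) for curve in simulated), reverse=True
    )
    return t_observed > t_simulated[k - 1]
```

Passing `mad_statistic` gives the MAD test and passing `dclf_statistic` gives the DCLF test; both reduce the b bins to one number per curve, which is why they control α but do not by themselves yield distance-dependent envelopes.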
For example, for a value of α = 0.05 and s = 199 simulations of the null model we have k = 10. Tests of this type are described in detail in Baddeley et al. (2014) and in textbooks such as Diggle (2003) and Illian et al. (2008). The test using Eq. 2 is called the MAD (Maximum Absolute Deviation) test, and it corresponds to global envelopes of constant width (e.g., the kth largest value of the T_i; i = 1, ..., s), centered on the expected summary function S(r). However, these envelopes are not sufficiently flexible to represent the behavior of Ŝ_i(r_j) for different distances r if the distribution of Ŝ_i(r_j) is not the same for all r. The test using Eq. 3 is called the integral deviation test or Diggle-Cressie-Loosmore-Ford (DCLF) test (Baddeley et al. 2014) and does not lead to envelopes. Thus, by construction these tests do not offer envelopes that could help the ecologist to find out which scales r are relevant.

Outline of the AGE test

The AGE test is based on three main ingredients. The first ingredient is the observation that, for fixed distances r, estimators Ŝ(r) of summary functions such as the pair correlation function approximately follow a normal distribution with standard deviation σ_S(r) and mean S(r). If Ŝ(r) is normally distributed we can estimate the pointwise lower and upper envelopes S_p^−(r) and S_p^+(r), respectively, as

$$S_p^{-}(r) = S(r) - z_\beta\,\sigma_S(r), \qquad S_p^{+}(r) = S(r) + z_\beta\,\sigma_S(r), \qquad (4)$$

where z_β is the critical value for the (local) pointwise significance level β (e.g., z_β = 1.96 for β = 0.05). The width of the pointwise envelope strip is therefore entirely determined by σ_S(r) and z_β. The second ingredient is a formula that approximates the variance of Ŝ(r) as a function of the settings of the AGE test (e.g., area A and perimeter length U of the observation window, and number n of points; Eq. 5) and requires the assumption that Ŝ(r) is unbiased.
The third ingredient is independence of the values of Ŝ(r) for the different distance bins r = r_j (j = 1, 2, …, b) that cover the distance interval B of interest. The independence property allows us to establish an analytical relationship between the desired significance level α of the AGE test and the significance level β of the pointwise test that is required to obtain it for a given value of b. Thus, for a given value of b, we can determine the value of β (and therefore the z_β of Eq. 4) that corresponds to the prescribed global significance level α (Eq. 8). In summary, Eqs. 4, 5, and 8 allow us to estimate the width of the envelope strip directly (i.e., without any simulations) as a function of the settings of the AGE test such as α, r, n, b, A, and U.

Facts on the variability of estimators of summary functions

Asymptotic normality of the summary functions. The AGE test is based on studies of the variability of estimators Ŝ(r) of the summary functions used in goodness-of-fit tests. A first result of such studies is that many such estimators follow an approximately normal distribution for fixed r. For example, Fig. 2 shows that the distributions of ĝ(r) and L̂(r) of a Thomas cluster process (Appendix S1) can be well approximated by normal distributions. Furthermore, Figs. S1-S3 in Appendix S1 show several examples for typical point processes relevant for ecological questions. The authors have often observed such behavior in their statistical work and believe that it is somewhat intuitive knowledge in the statistical community. Fortunately, there is theoretical work on central limit theorems which confirms these empirical findings and gives them a theoretical explanation for spatially homogeneous (and isotropic) point processes (Heinrich and Klein 2014, Heinrich 2015).

Fig. 2. Variability of summary functions under a Thomas cluster process, estimated from 1000 simulations of the process. The patterns were generated in a m observation window with parameters λ = 0.025, ρ = , and σ = 6 m (for explanations see Appendix S1). (a) Distribution of the values of g(4.5). The red bars indicate the 2.5 and 97.5 percentiles of the distribution [the 25th lowest and highest values of g(4.5)] and the blue bar the mean value. The red line is a fit by a normal distribution. (b) Same as (a), but for distance r = 10.5 m. (c) Comparison of the pointwise simulation envelopes based on simulations (using k = 25 and β = 0.05 in Eq. 1) (black lines) with those predicted by Eq. 4 (red lines). We used a bandwidth h = 1.5 m. (d) Same as (a), but for the L-function at r = 5 m. (e) Same as (a), but for the L-function at r = 11 m. (f) Same as (c), but for the L-function. (g) Same as (a), but for the distribution function D(r) of the distances to the nearest neighbor at r = 5 m. (h) Same as (a), but for D(r) at r = 11 m. (i) Same as (c), but for D(r).

Analytical formula for the variance of the estimator of the summary functions. A second result is (approximate) knowledge of the variance of Ŝ(r). Simple formulas for the standard deviations of ĝ(r) and L̂(r) exist for many null models; they are given and discussed below. Note that if an estimator Ŝ(r) of the summary function S(r) is normally distributed with standard deviation σ_S(r) and mean S(r), we can construct (without simulation!) the pointwise lower and upper envelopes S_p^−(r) and S_p^+(r), respectively, using Eq. 4. The corresponding pointwise simulation envelopes will be close to those of Eq. 4 if β is chosen following Eq. 1; see Figs. 1b, c, and 2c, f, i.

We now consider the standard deviations σ_g(r) and σ_L(r) of ĝ(r) and L̂(r), respectively. Stoyan et al. (1993) and Illian et al. (2008:234 and Eq ) present a formula that approximates the standard deviation σ_g(r) of the estimator of the pair correlation function g(r) for (not too strongly clustered) point processes as a function of the number n of points of the pattern, distance r, the bandwidth h of the box kernel used for estimation of g(r), and the geometry of the observation window. For the standard deviation needed in Eq. 4 this formula yields

$$\sigma_g(r) = \sqrt{g(r)\,\frac{A^2}{n^2}\,\frac{1}{(2\pi r h)\,\gamma_W(r)}} = \sqrt{g(r)}\;\frac{\sqrt{A}}{\sqrt{h}\,n\,\sqrt{r}}\;\sqrt{\frac{A}{\gamma_W(r)}}\;\frac{1}{\sqrt{2\pi}}, \qquad (5)$$

where A is the area of the observation window W and γ_W(r) is the so-called isotropized set covariance (Illian et al. 2008:485). The term A/γ_W(r) is the edge-correction weight of the Ohser estimator of g(r) (Illian et al. 2008:230), which accounts for the influence of the shape of W. For a rectangular W with perimeter length U and area A, γ_W(r) ≈ A − rU/π + r²/π for r not too large (Illian et al. 2008:486). Figs. 1b, 3a, c and Fig. S2 in Appendix S2 show the high quality of the approximation by Eq. 5.

Eq. 5 in combination with Eq. 4 tells us directly (i.e., without any simulations) that the width of the pointwise envelope strip for the pair correlation function (1) is proportional to 1/n (Fig. 3c), (2) decreases with the square root of the bandwidth h, (3) decreases with the square root of distance r, and (4) is proportional to the square root of the area A of the observation window W. The influence of the geometry of W is captured by √(A/γ_W(r)). In particular, the width of the envelope strip increases for long and narrow observation windows, where U becomes large. For example, the term √(A/γ_W(r)) yields for a m observation window and distance r = 50 m a value of 1.069; for a narrow transect of 20 × 12,500 m it takes a 20% larger value. For distance r = 250 m and W = m it yields 1.5, and for r = 359 m it increases to 2.0. This result confirms the common rule of thumb that it is not advisable to explore g(r) for distances larger than half of the smallest side of W (e.g., Haase 1995).

Appendix S2 presents a derivation of Eq. 5 following Stoyan et al. (1993) that is based on the heuristic assumption that the number N_r of pairs of points in the window which are distance r ± h apart follows a distribution with the property that mean = variance. (This is true for the Poisson distribution, but this special distribution is not necessary for our calculations.) Simulations show that this is a realistic assumption for many homogeneous point process models, ranging from strongly hyperdispersed to moderately clustered patterns, and that it can be applied in many practical situations. However, Eq. 5 becomes inaccurate for larger distances r and/or a large number n of points, as well as for small r and/or small n, where the property mean = variance no longer holds or the distribution becomes skewed (see Appendix S2).

A similar approximation can be used in the case of the L-function, as we show in Appendix S2 (see also Ward and Ferrandino 1999):

$$\sigma_L(r) = \sqrt{\frac{0.5}{\pi}}\;\frac{\sqrt{A}}{n}\;\sqrt{\frac{A}{\pi r^2}\int_0^r \frac{2\pi t}{\gamma_W(t)}\,dt}. \qquad (6)$$

Eq. 6 yields good estimates for σ_L(r) under CSR for the range of distances of practical interest (Fig. 3b). However, even small departures from CSR lead to strong departures from Eq. 6 (see Fig. S4g, o in Appendix S1), whereas the corresponding Eq. 5 for the pair correlation function also holds for larger departures from CSR (see Fig. S4b, f, j, n, r in Appendix S1). This is one of our reasons to recommend use of the pair correlation function in the AGE test instead of the L- or K-function. Ripley (1988) also provided an approximation of the variance of K(r) under CSR.

Independence property of the summary function. The third important distributional property is the independence property. For many estimators of non-cumulative summary functions (such as the box kernel estimators of the pair correlation function used here and other kernel estimators) we observe that their values at not too small spatial lags Δr [i.e., ĝ(r_j) and ĝ(r_j + Δr)] are to a good approximation uncorrelated if the lag Δr is larger than two times the bandwidth h of the estimator (Fig. 4a). This is plausible because in the case of the box kernel different point pairs are used to estimate ĝ(r_j) and ĝ(r_j + Δr). Fig. 4a shows the correlation coefficients for the pair correlation function for different values of r_j and Δr, taken from 1000 simulations of the CSR null model. In all cases the correlation coefficients are small, and for Δr > 2h = 3 m (Fig. 4d) they range in most cases between 0.09 and 0.19 (Fig. 4a). The same analysis of the cumulative L-function reveals, as expected, strong correlations (Loop and McClure 2015), especially for small lag distances Δr and large distances r (Fig. 4b, e). To show that the independence property also holds for a wider range of null models other than CSR, we analyzed the correlations of ĝ(r_j) and ĝ(r_j + Δr) for the Thomas process shown in Fig. S1u in Appendix S1.
The spatial clustering introduces some correlation for distance lags Δr within the range of clustering (here < 20 m) (Fig. 4f), but for larger values of Δr the correlation coefficients are below 0.2 (Fig. 4c, f). This means that one has to work with a larger spacing of distance bins if one wants to ensure approximate independence, which is needed for the AGE test. Similar arguments apply for regular point processes. In the following we always assume a suitable spacing. As we will see below, simulation experiments show that weak correlations between values of the pair correlation function are uncritical for the construction of the AGE test. Note that this does not apply to the cumulative L-function (cf. Fig. 4b, e)!

Construction of the analytical global envelope (AGE) test

We now construct global envelopes S_g^+(r) and S_g^−(r) that correspond to a prescribed global significance level α for a given distance interval B represented by b distances r_j. To this end we reinterpret Eq. 4 such that the β in Eqs. 1 and 4 is the probability that ensures that the type I error probability of the AGE test is α, and z_β is the corresponding critical value. To determine β we take advantage of the (approximate) independence of the values of Ŝ(r_j) for the selected values of r_j. We therefore basically test a hypothesis that is composed of b independent sub-hypotheses (one for each distance bin r_j). We reject the hypothesis if at least one sub-hypothesis is rejected. If the type I error probability β for each sub-hypothesis is the same for all distances r_j, then the overall type I error probability α for the hypothesis over the entire distance interval B is simply

$$\alpha = 1 - (1 - \beta)^{b}. \tag{7}$$

Therefore, the value of β that yields the prescribed significance level α of the AGE test is

$$\beta = 1 - (1 - \alpha)^{1/b}. \tag{8}$$

Thus, the global envelopes S_g^+(r) and S_g^−(r) constructed by Eq. 4 with the β in Eq. 8 are the envelopes we want.
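The relationship between Eqs. 7 and 8 can be checked numerically. The following is a minimal sketch that assumes exact independence of the b sub-tests; the function names `local_level` and `global_level` are illustrative and are not taken from the paper or its R supplement.

```python
def local_level(alpha: float, b: int) -> float:
    """Eq. 8: local level beta that yields global level alpha over b independent bins."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    if b < 1:
        raise ValueError("b must be a positive integer")
    return 1.0 - (1.0 - alpha) ** (1.0 / b)


def global_level(beta: float, b: int) -> float:
    """Eq. 7: global type I error of b independent sub-tests, each at level beta."""
    return 1.0 - (1.0 - beta) ** b


if __name__ == "__main__":
    # Round trip: applying Eq. 7 to the beta of Eq. 8 should return alpha.
    for b in (1, 5, 20, 50):
        beta = local_level(0.05, b)
        print(f"b = {b:3d}  beta = {beta:.6f}  alpha = {global_level(beta, b):.4f}")
```

For b = 1 the local and global levels coincide, which is the pointwise case.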
They indicate departure from the null model with prescribed significance level α if the empirical summary function wanders outside the envelope strip at one or more distances r_j, and so provide the desired intuitive assessment of scale effects. In addition to the construction of simulation-free envelopes, our approach also allows analytical calculation of the p-value. For its determination we first estimate the maximum value z of all

$$z_j = \frac{\hat{S}(r_j) - S(r_j)}{\sigma_S(r_j)}$$

taken over the b distances r_j used in the AGE test. Based on z we then compute the p-value of the pointwise envelope test, i.e., a local p-value p_loc, by means of p_loc = 2(1 − Φ(z)), where Φ(z) is the cumulative distribution function of the standard normal distribution. Finally, the p-value of the AGE test is calculated analogously to Eq. 7 as

$$p_{\mathrm{global}} = 1 - (1 - p_{\mathrm{loc}})^{b}. \tag{9}$$

Fig. 3. Standard deviation of ĝ(r) and L(r) under complete spatial randomness (CSR). (a) Analytical approximation of the standard deviation σ_g(r) of ĝ(r) (Eq. 5; bold gray line) and σ_g(r) estimated from 1000 simulations of the CSR null model (circles). We used h = 2.5 m, A = m, and n = 626. (b) Same as (a) but for the L-function. (c) Standard deviation of ĝ(r) taken over the 1000 simulations of CSR in dependence on n. (d) Same as (c), but for the L-function.

The SBGE test

In some circumstances not all assumptions of our AGE test may hold, and simulations can help. For example, if the moments S(r) and σ_S(r) cannot be determined analytically, as assumed until now, they can be determined by simulation of the underlying point process model. This is a simple variant of the AGE test that expands the range of point process models that can be handled. Note that this is not an application of Monte Carlo testing!

More importantly, we present an SBGE test that can be conducted in exactly the same way as the classical pointwise approach. It only requires the independence assumption. The SBGE test uses, like the AGE test, Eq. 8 to estimate the value of β required for the prescribed significance level α, but then inverts Eq. 1 to estimate a suitable value of k that corresponds to this β, i.e.,

$$k = \begin{cases} \beta(1+s)/2 & \text{if the test is two-sided,} \\ \beta(1+s) & \text{if the test is one-sided.} \end{cases} \tag{10}$$

However, s and β must be selected in a way that k is close to an integer value.
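The p-value of Eq. 9 and the rank k of Eq. 10 can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' R script: the function names are hypothetical, the null mean and standard deviation are passed in as plain lists, and absolute deviations are used when taking the maximum, which is an assumption made here to match the two-sided p_loc.

```python
from statistics import NormalDist


def age_p_value(s_hat, s_null, sigma):
    """Eq. 9: global p-value from the largest standardized deviation.

    s_hat, s_null, sigma: observed values, null expectation and null standard
    deviation at the b distances r_j. Using |z_j| is an assumption (two-sided).
    """
    b = len(s_hat)
    z = max(abs(obs - mean) / sd for obs, mean, sd in zip(s_hat, s_null, sigma))
    p_loc = 2.0 * (1.0 - NormalDist().cdf(z))   # local (pointwise) p-value
    return 1.0 - (1.0 - p_loc) ** b             # Eq. 9


def sbge_rank(alpha, b, s, two_sided=True):
    """Eq. 10: (generally non-integer) rank k for s simulations and b bins."""
    beta = 1.0 - (1.0 - alpha) ** (1.0 / b)     # Eq. 8
    k = beta * (1 + s)
    return k / 2.0 if two_sided else k


if __name__ == "__main__":
    # b = 25 bins, alpha = 0.05, s = 999 simulations: k is close to 1.
    print(round(sbge_rank(0.05, 25, 999), 3))
```

In practice one would scan candidate values of s and keep those for which `sbge_rank` is close to an integer.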
Because this simulation-based SBGE test does not use Eq. 4, it does not require the normality assumption. Table 2 shows appropriate values of k for different numbers s of simulations and for different numbers b of distance bins. It also shows that the SBGE test in general needs a large number of simulations to obtain at least a value of k = 1. For example, an SBGE test with b = 25 distance bins and α = 0.05 requires at least 999 simulations of the null model (Table 2). The SBGE test offers the user flexibility in the selection of the point process models that can be handled, but it does not lose the advantage of the analytical estimation of β of Eq. 8. Of course, S(r) has to be a non-cumulative summary function.

Fig. 4. (a) Correlation coefficients between ĝ(r_j) and ĝ(r_j + Δr) for different distances r_j and different spatial lags Δr, taken from 1000 pair correlation functions estimated for the CSR null model for n = 626 points within a m observation window. The pair correlation functions were estimated at distances r_j = 1.5, 2.5, …, 50.5 m with a bandwidth h = 1.5 m. (b) Same as (a), but for the L-function. The L-function was estimated at distances r_j = 5, 10, 15, …, 50 m. (c) Same as (a), but for the Thomas process shown in Fig. S1u of Appendix S1. (d) Average correlation coefficient for CSR at lag Δr over all distances r_j = 1.5, 2.5, …, 50.5 m. Note that the correlations are large if the lag Δr is smaller than the doubled bandwidth h. (e) Same as (d), but for the L-function. (f) Same as (d), but for the Thomas process.

Evaluation of the AGE test

We used simulations to check the quality of the AGE test and the predictions of Eqs. 4 and 8. In a first simulation experiment we tested whether the empirical type I error of the AGE test is really close to the nominal level α. To do this we generated 10,000 point patterns using CSR (627 points within a m observation window) and applied the AGE test based on Eqs. 4, 5, and 8 for the pair correlation function (with bandwidth h = 2 m). The supplement provides the R script that was used for the AGE test. In a second simulation experiment we applied the Monte Carlo test proposed by Myllymäki et al. (in press) to check the robustness of the analytical estimate of the value of β in Eq. 8 (and that of the resulting value of z_β used in Eq. 4) with respect to the independence assumption and the influence of the number b of distances r_j.
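The logic of the first experiment (checking that the empirical type I error is close to the nominal α) can be mimicked in an idealized setting. The sketch below does not simulate point patterns or estimate pair correlation functions; as a simplifying assumption it replaces the standardized deviations at the b distances by independent standard normal draws, which is exactly the situation in which Eqs. 4 and 8 are expected to hold. The function name and default values are illustrative.

```python
import random
from statistics import NormalDist


def empirical_type1_error(alpha=0.05, b=25, replicates=20000, seed=1):
    """Fraction of null replicates in which at least one of b independent
    standard normal z-scores leaves the global envelope +/- z_beta."""
    rng = random.Random(seed)
    beta = 1.0 - (1.0 - alpha) ** (1.0 / b)            # Eq. 8
    z_beta = NormalDist().inv_cdf(1.0 - beta / 2.0)    # two-sided critical value
    rejections = 0
    for _ in range(replicates):
        # Reject the replicate if any bin falls outside the envelope strip.
        if any(abs(rng.gauss(0.0, 1.0)) > z_beta for _ in range(b)):
            rejections += 1
    return rejections / replicates


if __name__ == "__main__":
    print(empirical_type1_error())
```

With correlated z-scores, as for a cumulative summary function, the same loop would return a rejection rate below the nominal α, which is the effect examined in the second experiment.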
We used 1000 simulations of the CSR null model and conducted the test using (1) the pair correlation function, different values of b, and two different values of the spacing of the distance bins (Δr = 1 m and 5 m), and (2) the L-function and different values of b.

Results

Fig. 5 shows for the forest data the AGE envelopes S_g^+(r) and S_g^−(r) (red circles; α = 0.05) together with the analytical pointwise envelopes (blue circles) for the pair correlation function. As expected, the AGE envelope strip is clearly wider than the pointwise envelope strip (cf. red and blue circles), since the local significance level β required in Eq. 8 to obtain the prescribed global α for b = 50 is smaller than the β assumed for the pointwise envelope test. Eq. 8 predicts for b = 50 and α = 0.05 a value β ≈ 0.00103, which corresponds to a critical value of z_α ≈ 3.28 compared to the pointwise z_β = 1.96. However, we also find that the AGE envelopes are in excellent agreement with the corresponding simulation envelopes (gray lines in Fig. 5). Additionally, because Eq. 4 contains the standard deviation σ_g(r) of the summary function under the null model, we can also assess the influence of n, h, A, U, and r on the width of the envelope strip (see discussion after Eq. 5). Finally, we note that for g(r) the envelopes that belong to the null model can be determined for many point processes without simulation (Fig. S4 in Appendix S1) if the theoretical g(r) is known. Otherwise the SBGE test or the variant of the AGE test that uses simulations to determine the mean S(r) and the standard deviation σ_S(r) of the summary function S(r) is recommended.

Table 2. Values of k required to obtain a prescribed significance level of α = 0.05 of the two-sided SBGE test (Eqs. 8 and 10) in dependence on the number b of distance bins over which the test is conducted and the number s of simulations of the null model.
Columns: b; β; k required to obtain the prescribed α = 0.05 for s = 39, s = 199, s = 999, s = 1999, s =
Note: Suitable values of k do not exist for all values of b and s because k has to be an integer. The values of k presented in the table yield values of α between and . Value of b assumed in the pointwise test.

The influence of b on the width of the envelope strip

Eq. 8 predicts the values of β that are required in the AGE test for a given number b of distance bins r_j to obtain the prescribed significance level α (gray line of Fig. 6a). These values of β apply for all summary functions and null models as long as the Ŝ_i(r) are independent for the b different distance bins r_j used to cover the distance interval B. Eq. 8 also allows us to explore the dependence of the critical value z_β of the AGE test (Eq. 4) on b. This is of interest because the width of the envelope strip is proportional to z_β (see Eq. 4).
For each value of β resulting from Eq. 8 we determined the corresponding critical value z_β of the standard normal distribution. The gray line in Fig. 6c shows how z_β depends on the number b of distance bins r_j. We find that for α = 0.05, z_β increases with b in good approximation as ln(b) (Fig. 6c), thus showing a fast increase for smaller values of b and a slower increase for larger values of b. These values of z_β apply if the summary functions Ŝ_i(r) are independent for different r and normally distributed for fixed r.
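The dependence of z_β on b described above can be tabulated directly from Eq. 8 and the standard normal quantile function. The sketch below assumes a two-sided test, so the quantile is taken at 1 − β/2; the function name `critical_value` is illustrative.

```python
import math
from statistics import NormalDist


def critical_value(alpha: float, b: int) -> float:
    """Two-sided critical value z_beta for global level alpha over b independent bins."""
    beta = 1.0 - (1.0 - alpha) ** (1.0 / b)      # Eq. 8
    return NormalDist().inv_cdf(1.0 - beta / 2.0)


if __name__ == "__main__":
    z_pointwise = critical_value(0.05, 1)        # b = 1 is the pointwise test
    for b in (1, 2, 5, 10, 20, 50, 100):
        z = critical_value(0.05, b)
        print(f"b = {b:3d}  ln(b) = {math.log(b):4.2f}  "
              f"z_beta = {z:.2f}  width ratio = {z / z_pointwise:.2f}")
```

Because the envelope width is proportional to z_β, the last column gives the widening of the global strip relative to the pointwise strip for each b.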

The analytical relationship between z_β and b is the heart of the AGE and the SBGE test because it defines the envelopes in Eq. 4 and has practical consequences. First, because the abscissa in Fig. 6c has a logarithmic scale, we find the rule of thumb that z_β (and thereby the width of the envelope strip) increases in good approximation with the logarithm of b if we increase the number b of distance bins r_j. Therefore, a priori information on the scales where departures from the null hypothesis are expected is useful because it can reduce b and therefore increase the power of the AGE and the SBGE test. Second, because the critical value z_β of the AGE and the SBGE test must coincide for b = 1 with the z_β of the pointwise envelope test, we can assess the error made by the pointwise test. For example, when b = 20 distance bins are used the envelope strip increases 50% in width from the pointwise envelopes to the global envelopes (because z_β increases from 1.96 to 3.02 for α = 0.05). Strong departures of the empirical summary function from the pointwise envelopes will therefore generally be confirmed by the AGE test, but weak departures where the empirical summary function wanders just slightly outside the pointwise envelopes will not be confirmed.

Fig. 5. Different simulation envelopes for S. turbacensis. Empirical pair correlation function ĝ(r) (black circles), pointwise analytical envelopes S_p^+(r) and S_p^−(r) for α = 0.05 (blue circles), expectation under complete spatial randomness (CSR) (horizontal black line), and analytical global envelopes S_g^+(r) and S_g^−(r) (red circles) taken over the m distance interval with a distance bin of 1 m (i.e., b = 50). The corresponding simulation envelopes based on 999 simulations of the CSR null model are shown as gray lines behind the corresponding analytical envelopes. The bandwidth was h = 2.5 m.

Evaluation of the AGE test

The global significance level α = 0.052 estimated from 10,000 replicates of the AGE test under CSR is in very good agreement with the theoretical value α = 0.05. In Fig. 6a we used a simulation procedure for different numbers b of distance bins r_j to test the predictions for β made with Eq. 8. Because we used a bandwidth h = 2.5 m in the simulations, we first evaluated the pair correlation function at b distance bins r_j = 2.5 m, 7.5 m, … (i.e., r_j − r_{j−1} = 2h) to ensure independence between g(r_j) and all g(r_j + Δr). Simulations show that the resulting β and z_β values for the pair correlation function (red circles) indeed agree well with the predictions (gray line) (Fig. 6a, c). Interestingly, the predictions are still good even for r_j − r_{j−1} = 1 m (black circles). Thus, the AGE test is robust against some correlations at short lag distances (Fig. 6a, c). In contrast, the cumulative nature of the L-function leads to strong correlations over many distances (Fig. 4b, e), which leads to substantially higher values of β (Fig. 6b) and substantially lower values of z_β (Fig. 6d; see also Loop and McClure 2015). For this reason we recommend using the non-cumulative pair correlation function (instead of the cumulative L-function) as summary function because then it is possible to determine β and z_β analytically and to construct a test with theoretically well-understood properties.

Discussion

This article presents a simple and elegant solution of a long-standing problem in point process statistics: the construction of envelopes with prescribed significance level α for goodness-of-fit tests. These envelopes have the desired property that the null model is rejected with the prescribed global significance level α if the empirical summary function wanders at some distance r outside the envelopes.
Additionally, we show that these envelopes can be determined without simulation. We obtained this result not based on simulations (as in Ripley's CSR test, Ripley 1988), but by means of mathematical reasoning, combining central limit theorems, variance approximations and empirical knowledge on independence of pair correlation function estimators for different distances r_j. Simulations showed a posteriori that our calculations are correct. However, we also present an SBGE test that is based only on the independence assumption and can be applied in exactly the same way as the traditional pointwise simulation envelopes; only s and k have to be chosen in a new way. The SBGE test requires a smaller ratio k/s, which can be obtained with integer k only with numbers s of simulations of the null model larger than for the pointwise test (Table 2, Eichhorn 2010).

Fig. 6. Factors influencing the local significance level β of the pointwise test (Eq. 8) and the corresponding critical value z_β required to obtain the prescribed significance level α = 0.05 of the Analytical Global Envelope test. (a) Analytical estimates of β (Eq. 8) in dependence on the number b of distance bins r_j used for the tested interval (gray lines) and simulation results for g(r) based on 1000 simulations of complete spatial randomness (black and red circles). We evaluated the pair correlation function at 1 m steps r_j = 1.5, 2.5, … (black circles) or at 5 m steps r_j = 2.5, 7.5, … (red circles). The bandwidth was h = 2.5 m. The horizontal line gives the significance level β = 0.05 of the pointwise envelope test. (b) Same as (a), but for the L-function. (c) Same as (a) but for the critical value z_β. The black line shows the fit z_β(b) = ln(b) and the horizontal line gives the critical value z_β = 1.96 of the pointwise envelope test. (d) Same as (c), but for the L-function.

The significance-level problem arises due to multiple testing. Fig. 5 is a good demonstration of its effect. The width of the envelope strip of the AGE test with the correct significance level α is clearly larger than that of the pointwise test (it grows approximately with ln(b), where b is the number of distance bins over which the test is conducted).
Thus, marginal departures of the empirical summary function from the pointwise envelopes will lead in most cases to spurious rejection of the null hypothesis. This result is especially relevant in the light of a recent review by Velázquez et al. (in press) that showed that only 12% of the ecological studies reviewed used some correction for type I error. For example, it can be expected that the CSR hypothesis has been rejected too often.

The analytical approach allowed us to derive mathematical formulas for the width of the envelope strip that provide precise information on the role of test characteristics such as the number n of points, the size A and the perimeter length U of the observation window, the number b of distance bins used in the AGE test, as well as other settings of the estimator of the summary function. As we will see below, some of our results call into question practices that are typically encountered in ecological applications. Eqs. 4, 5, 6, 8 and 10 give valuable information on envelope tests to be considered in practical applications (note that points 1–3 also apply to the pointwise envelopes):

1. The standard deviation of the pair correlation function g(r) (and the L-function), and thereby the width of envelope strips, is for a wide range of point processes most strongly determined by the number n of points of the pattern and scales approximately with 1/n (Eq. 5). This casts substantial doubt on the practice in ecological statistics of working with small samples of, say, < 100 points (Velázquez et al., in press).
2. Because the standard deviation of g(r) scales with √g(r)/n (Eq. 5), we can work for regular patterns with a lower number of points n than for aggregated patterns. If the point number n is low, the variance of g(r) will become large. This effect can be counterbalanced to a certain extent by using a larger bandwidth h.
3. The geometry of the observation window influences the variance of the pair correlation function and therefore the width of the envelope strip. In particular, long and narrow observation windows produce wide envelope strips. The isotropized set covariance γ_W(r) that represents the geometry of the observation window (Eq. 5) allows us to directly assess the strength of this effect.
4. The width of the global envelope strips increases in good approximation logarithmically with the number b of distance bins r_j that define the distance interval B over which the AGE or SBGE test is conducted (Fig. 6c), and therefore the probability of detecting significant departures from the null model for a specific distance interval declines with increasing b. This means that short intervals B are useful, which can be chosen if a priori information is available on the scales where departures from the null hypothesis are expected. Thus, the distances between bins should be adjusted to obtain approximate independence between the corresponding values of the pair correlation function.

We notice that Eq. 5 predicts the variance of g(r) very well for highly regular to moderately aggregated point processes (Fig. S4 in Appendix S1).
One can therefore apply the AGE test for a wide range of point processes relevant in ecology. We also notice that the problem of choosing a suitable number s of simulations of the null model vanishes since our AGE test is simulation-free. This problem has been discussed for example in Loosmore and Ford (2006) and Grabarnik et al. (2011), and a recent literature review by Velázquez et al. (in press: Fig. 5f) revealed that the values of s ranged in methodological studies from as low as 19 or 39 (e.g., Baddeley et al. 2014) to as large as 10,000 (Goreaud and Pélissier 2003).

Assumptions and caveats of our approach

The AGE test can be applied in all cases where the distribution of the unbiased estimator of the summary function S(r) satisfies the three fundamental properties: asymptotic normality, approximate independence for r_j with sufficient spacing, and the possibility to approximate the estimation variance. We believe that this may also be shown for cases where S(r) is a probability density function of, say, the nearest neighbor distance distribution function D(r) [or G(r)] and the spherical contact distribution function H_s(r) [or F(r)]. Otherwise, the simulation-based SBGE test can be applied.

While the asymptotic normality of the summary function under the null model will hold in a wide range of cases, there are nevertheless situations where the distribution of the summary function under the null model is not symmetric. This happens for example for the pair correlation function g(r) if the number n of points is low and the distance r is small (see Appendix S2). Other cases where the distribution of Ŝ(r) departs from normality may occur for the nearest neighbor summary functions D(r) and H_s(r). In such situations the simulation-based SBGE test can help because it does not make assumptions on the underlying distribution of Ŝ(r).

The AGE test and its associated p-values are somewhat sensitive to user-defined settings such as the number b of distance bins.
This may create a temptation to use researcher degrees of freedom, i.e., post hoc tinkering with test parameters to obtain a more pleasing estimate of the p-value. While this is true and may bear the danger of misuse, the p-value, if appropriately used, nevertheless provides additional important information for the evaluation of the test. The graphical display of the AGE test (as Fig. 5) shows clearly if the empirical summary function wanders only marginally outside the simulation envelope, a case where the ecological significance


More information

Points. Luc Anselin. Copyright 2017 by Luc Anselin, All Rights Reserved

Points. Luc Anselin.   Copyright 2017 by Luc Anselin, All Rights Reserved Points Luc Anselin http://spatial.uchicago.edu 1 classic point pattern analysis spatial randomness intensity distance-based statistics points on networks 2 Classic Point Pattern Analysis 3 Classic Examples

More information

Extending the Robust Means Modeling Framework. Alyssa Counsell, Phil Chalmers, Matt Sigal, Rob Cribbie

Extending the Robust Means Modeling Framework. Alyssa Counsell, Phil Chalmers, Matt Sigal, Rob Cribbie Extending the Robust Means Modeling Framework Alyssa Counsell, Phil Chalmers, Matt Sigal, Rob Cribbie One-way Independent Subjects Design Model: Y ij = µ + τ j + ε ij, j = 1,, J Y ij = score of the ith

More information

Chapter 1 Statistical Inference

Chapter 1 Statistical Inference Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations

More information

PAIRS OF SUCCESSES IN BERNOULLI TRIALS AND A NEW n-estimator FOR THE BINOMIAL DISTRIBUTION

PAIRS OF SUCCESSES IN BERNOULLI TRIALS AND A NEW n-estimator FOR THE BINOMIAL DISTRIBUTION APPLICATIONES MATHEMATICAE 22,3 (1994), pp. 331 337 W. KÜHNE (Dresden), P. NEUMANN (Dresden), D. STOYAN (Freiberg) and H. STOYAN (Freiberg) PAIRS OF SUCCESSES IN BERNOULLI TRIALS AND A NEW n-estimator

More information

An estimate of the long-run covariance matrix, Ω, is necessary to calculate asymptotic

An estimate of the long-run covariance matrix, Ω, is necessary to calculate asymptotic Chapter 6 ESTIMATION OF THE LONG-RUN COVARIANCE MATRIX An estimate of the long-run covariance matrix, Ω, is necessary to calculate asymptotic standard errors for the OLS and linear IV estimators presented

More information

CRISP: Capture-Recapture Interactive Simulation Package

CRISP: Capture-Recapture Interactive Simulation Package CRISP: Capture-Recapture Interactive Simulation Package George Volichenko Carnegie Mellon University Pittsburgh, PA gvoliche@andrew.cmu.edu December 17, 2012 Contents 1 Executive Summary 1 2 Introduction

More information

1 Motivation for Instrumental Variable (IV) Regression

1 Motivation for Instrumental Variable (IV) Regression ECON 370: IV & 2SLS 1 Instrumental Variables Estimation and Two Stage Least Squares Econometric Methods, ECON 370 Let s get back to the thiking in terms of cross sectional (or pooled cross sectional) data

More information

Generalized Linear Models (GLZ)

Generalized Linear Models (GLZ) Generalized Linear Models (GLZ) Generalized Linear Models (GLZ) are an extension of the linear modeling process that allows models to be fit to data that follow probability distributions other than the

More information

Lectures 5 & 6: Hypothesis Testing

Lectures 5 & 6: Hypothesis Testing Lectures 5 & 6: Hypothesis Testing in which you learn to apply the concept of statistical significance to OLS estimates, learn the concept of t values, how to use them in regression work and come across

More information

Precision of maximum likelihood estimation in adaptive designs

Precision of maximum likelihood estimation in adaptive designs Research Article Received 12 January 2015, Accepted 24 September 2015 Published online 12 October 2015 in Wiley Online Library (wileyonlinelibrary.com) DOI: 10.1002/sim.6761 Precision of maximum likelihood

More information

On Selecting Tests for Equality of Two Normal Mean Vectors

On Selecting Tests for Equality of Two Normal Mean Vectors MULTIVARIATE BEHAVIORAL RESEARCH, 41(4), 533 548 Copyright 006, Lawrence Erlbaum Associates, Inc. On Selecting Tests for Equality of Two Normal Mean Vectors K. Krishnamoorthy and Yanping Xia Department

More information

Tutorial 2: Power and Sample Size for the Paired Sample t-test

Tutorial 2: Power and Sample Size for the Paired Sample t-test Tutorial 2: Power and Sample Size for the Paired Sample t-test Preface Power is the probability that a study will reject the null hypothesis. The estimated probability is a function of sample size, variability,

More information

Wooldridge, Introductory Econometrics, 4th ed. Appendix C: Fundamentals of mathematical statistics

Wooldridge, Introductory Econometrics, 4th ed. Appendix C: Fundamentals of mathematical statistics Wooldridge, Introductory Econometrics, 4th ed. Appendix C: Fundamentals of mathematical statistics A short review of the principles of mathematical statistics (or, what you should have learned in EC 151).

More information

A new multivariate CUSUM chart using principal components with a revision of Crosier's chart

A new multivariate CUSUM chart using principal components with a revision of Crosier's chart Title A new multivariate CUSUM chart using principal components with a revision of Crosier's chart Author(s) Chen, J; YANG, H; Yao, JJ Citation Communications in Statistics: Simulation and Computation,

More information

ANOVA approach. Investigates interaction terms. Disadvantages: Requires careful sampling design with replication

ANOVA approach. Investigates interaction terms. Disadvantages: Requires careful sampling design with replication ANOVA approach Advantages: Ideal for evaluating hypotheses Ideal to quantify effect size (e.g., differences between groups) Address multiple factors at once Investigates interaction terms Disadvantages:

More information

Confidence intervals for kernel density estimation

Confidence intervals for kernel density estimation Stata User Group - 9th UK meeting - 19/20 May 2003 Confidence intervals for kernel density estimation Carlo Fiorio c.fiorio@lse.ac.uk London School of Economics and STICERD Stata User Group - 9th UK meeting

More information

Descriptive Statistics-I. Dr Mahmoud Alhussami

Descriptive Statistics-I. Dr Mahmoud Alhussami Descriptive Statistics-I Dr Mahmoud Alhussami Biostatistics What is the biostatistics? A branch of applied math. that deals with collecting, organizing and interpreting data using well-defined procedures.

More information

Spatial point processes in the modern world an

Spatial point processes in the modern world an Spatial point processes in the modern world an interdisciplinary dialogue Janine Illian University of St Andrews, UK and NTNU Trondheim, Norway Bristol, October 2015 context statistical software past to

More information

The exact bootstrap method shown on the example of the mean and variance estimation

The exact bootstrap method shown on the example of the mean and variance estimation Comput Stat (2013) 28:1061 1077 DOI 10.1007/s00180-012-0350-0 ORIGINAL PAPER The exact bootstrap method shown on the example of the mean and variance estimation Joanna Kisielinska Received: 21 May 2011

More information

The Mixture Approach for Simulating New Families of Bivariate Distributions with Specified Correlations

The Mixture Approach for Simulating New Families of Bivariate Distributions with Specified Correlations The Mixture Approach for Simulating New Families of Bivariate Distributions with Specified Correlations John R. Michael, Significance, Inc. and William R. Schucany, Southern Methodist University The mixture

More information

Chapter 23: Inferences About Means

Chapter 23: Inferences About Means Chapter 3: Inferences About Means Sample of Means: number of observations in one sample the population mean (theoretical mean) sample mean (observed mean) is the theoretical standard deviation of the population

More information

Tests for spatial randomness based on spacings

Tests for spatial randomness based on spacings Tests for spatial randomness based on spacings Lionel Cucala and Christine Thomas-Agnan LSP, Université Paul Sabatier and GREMAQ, Université Sciences-Sociales, Toulouse, France E-mail addresses : cucala@cict.fr,

More information

Chapter 9. Non-Parametric Density Function Estimation

Chapter 9. Non-Parametric Density Function Estimation 9-1 Density Estimation Version 1.2 Chapter 9 Non-Parametric Density Function Estimation 9.1. Introduction We have discussed several estimation techniques: method of moments, maximum likelihood, and least

More information

PERCENTILE ESTIMATES RELATED TO EXPONENTIAL AND PARETO DISTRIBUTIONS

PERCENTILE ESTIMATES RELATED TO EXPONENTIAL AND PARETO DISTRIBUTIONS PERCENTILE ESTIMATES RELATED TO EXPONENTIAL AND PARETO DISTRIBUTIONS INTRODUCTION The paper as posted to my website examined percentile statistics from a parent-offspring or Neyman- Scott spatial pattern.

More information

IENG581 Design and Analysis of Experiments INTRODUCTION

IENG581 Design and Analysis of Experiments INTRODUCTION Experimental Design IENG581 Design and Analysis of Experiments INTRODUCTION Experiments are performed by investigators in virtually all fields of inquiry, usually to discover something about a particular

More information

A simulation study for comparing testing statistics in response-adaptive randomization

A simulation study for comparing testing statistics in response-adaptive randomization RESEARCH ARTICLE Open Access A simulation study for comparing testing statistics in response-adaptive randomization Xuemin Gu 1, J Jack Lee 2* Abstract Background: Response-adaptive randomizations are

More information

BAYESIAN ANALYSIS OF DOSE-RESPONSE CALIBRATION CURVES

BAYESIAN ANALYSIS OF DOSE-RESPONSE CALIBRATION CURVES Libraries Annual Conference on Applied Statistics in Agriculture 2005-17th Annual Conference Proceedings BAYESIAN ANALYSIS OF DOSE-RESPONSE CALIBRATION CURVES William J. Price Bahman Shafii Follow this

More information

RESEARCH REPORT. A Studentized Permutation Test for the Comparison of Spatial Point Patterns. Ute Hahn. No.

RESEARCH REPORT. A Studentized Permutation Test for the Comparison of Spatial Point Patterns.   Ute Hahn. No. CENTRE FOR STOCHASTIC GEOMETRY AND ADVANCED BIOIMAGING RESEARCH REPORT www.csgb.dk 2010 Ute Hahn A Studentized Permutation Test for the Comparison of Spatial Point Patterns No. 12, December 2010 A Studentized

More information

Subject CS1 Actuarial Statistics 1 Core Principles

Subject CS1 Actuarial Statistics 1 Core Principles Institute of Actuaries of India Subject CS1 Actuarial Statistics 1 Core Principles For 2019 Examinations Aim The aim of the Actuarial Statistics 1 subject is to provide a grounding in mathematical and

More information

Advanced analysis and modelling tools for spatial environmental data. Case study: indoor radon data in Switzerland

Advanced analysis and modelling tools for spatial environmental data. Case study: indoor radon data in Switzerland EnviroInfo 2004 (Geneva) Sh@ring EnviroInfo 2004 Advanced analysis and modelling tools for spatial environmental data. Case study: indoor radon data in Switzerland Mikhail Kanevski 1, Michel Maignan 1

More information

HANDBOOK OF APPLICABLE MATHEMATICS

HANDBOOK OF APPLICABLE MATHEMATICS HANDBOOK OF APPLICABLE MATHEMATICS Chief Editor: Walter Ledermann Volume VI: Statistics PART A Edited by Emlyn Lloyd University of Lancaster A Wiley-Interscience Publication JOHN WILEY & SONS Chichester

More information

On dealing with spatially correlated residuals in remote sensing and GIS

On dealing with spatially correlated residuals in remote sensing and GIS On dealing with spatially correlated residuals in remote sensing and GIS Nicholas A. S. Hamm 1, Peter M. Atkinson and Edward J. Milton 3 School of Geography University of Southampton Southampton SO17 3AT

More information

A Robust Approach to Estimating Production Functions: Replication of the ACF procedure

A Robust Approach to Estimating Production Functions: Replication of the ACF procedure A Robust Approach to Estimating Production Functions: Replication of the ACF procedure Kyoo il Kim Michigan State University Yao Luo University of Toronto Yingjun Su IESR, Jinan University August 2018

More information

The Bayesian Approach to Multi-equation Econometric Model Estimation

The Bayesian Approach to Multi-equation Econometric Model Estimation Journal of Statistical and Econometric Methods, vol.3, no.1, 2014, 85-96 ISSN: 2241-0384 (print), 2241-0376 (online) Scienpress Ltd, 2014 The Bayesian Approach to Multi-equation Econometric Model Estimation

More information

Does k-th Moment Exist?

Does k-th Moment Exist? Does k-th Moment Exist? Hitomi, K. 1 and Y. Nishiyama 2 1 Kyoto Institute of Technology, Japan 2 Institute of Economic Research, Kyoto University, Japan Email: hitomi@kit.ac.jp Keywords: Existence of moments,

More information

TWO-FACTOR AGRICULTURAL EXPERIMENT WITH REPEATED MEASURES ON ONE FACTOR IN A COMPLETE RANDOMIZED DESIGN

TWO-FACTOR AGRICULTURAL EXPERIMENT WITH REPEATED MEASURES ON ONE FACTOR IN A COMPLETE RANDOMIZED DESIGN Libraries Annual Conference on Applied Statistics in Agriculture 1995-7th Annual Conference Proceedings TWO-FACTOR AGRICULTURAL EXPERIMENT WITH REPEATED MEASURES ON ONE FACTOR IN A COMPLETE RANDOMIZED

More information

Parametric Empirical Bayes Methods for Microarrays

Parametric Empirical Bayes Methods for Microarrays Parametric Empirical Bayes Methods for Microarrays Ming Yuan, Deepayan Sarkar, Michael Newton and Christina Kendziorski April 30, 2018 Contents 1 Introduction 1 2 General Model Structure: Two Conditions

More information

STATISTICS ANCILLARY SYLLABUS. (W.E.F. the session ) Semester Paper Code Marks Credits Topic

STATISTICS ANCILLARY SYLLABUS. (W.E.F. the session ) Semester Paper Code Marks Credits Topic STATISTICS ANCILLARY SYLLABUS (W.E.F. the session 2014-15) Semester Paper Code Marks Credits Topic 1 ST21012T 70 4 Descriptive Statistics 1 & Probability Theory 1 ST21012P 30 1 Practical- Using Minitab

More information

Research Note: A more powerful test statistic for reasoning about interference between units

Research Note: A more powerful test statistic for reasoning about interference between units Research Note: A more powerful test statistic for reasoning about interference between units Jake Bowers Mark Fredrickson Peter M. Aronow August 26, 2015 Abstract Bowers, Fredrickson and Panagopoulos (2012)

More information

Non-uniform coverage estimators for distance sampling

Non-uniform coverage estimators for distance sampling Abstract Non-uniform coverage estimators for distance sampling CREEM Technical report 2007-01 Eric Rexstad Centre for Research into Ecological and Environmental Modelling Research Unit for Wildlife Population

More information

Tutorial 4: Power and Sample Size for the Two-sample t-test with Unequal Variances

Tutorial 4: Power and Sample Size for the Two-sample t-test with Unequal Variances Tutorial 4: Power and Sample Size for the Two-sample t-test with Unequal Variances Preface Power is the probability that a study will reject the null hypothesis. The estimated probability is a function

More information

X

X Correlation: Pitfalls and Alternatives Paul Embrechts, Alexander McNeil & Daniel Straumann Departement Mathematik, ETH Zentrum, CH-8092 Zürich Tel: +41 1 632 61 62, Fax: +41 1 632 15 23 embrechts/mcneil/strauman@math.ethz.ch

More information

Econ 423 Lecture Notes: Additional Topics in Time Series 1

Econ 423 Lecture Notes: Additional Topics in Time Series 1 Econ 423 Lecture Notes: Additional Topics in Time Series 1 John C. Chao April 25, 2017 1 These notes are based in large part on Chapter 16 of Stock and Watson (2011). They are for instructional purposes

More information

Parameter estimation! and! forecasting! Cristiano Porciani! AIfA, Uni-Bonn!

Parameter estimation! and! forecasting! Cristiano Porciani! AIfA, Uni-Bonn! Parameter estimation! and! forecasting! Cristiano Porciani! AIfA, Uni-Bonn! Questions?! C. Porciani! Estimation & forecasting! 2! Cosmological parameters! A branch of modern cosmological research focuses

More information

Data Science Unit. Global DTM Support Team, HQ Geneva

Data Science Unit. Global DTM Support Team, HQ Geneva NET FLUX VISUALISATION FOR FLOW MONITORING DATA Data Science Unit Global DTM Support Team, HQ Geneva March 2018 Summary This annex seeks to explain the way in which Flow Monitoring data collected by the

More information

Community surveys through space and time: testing the space-time interaction in the absence of replication

Community surveys through space and time: testing the space-time interaction in the absence of replication Community surveys through space and time: testing the space-time interaction in the absence of replication Pierre Legendre Département de sciences biologiques Université de Montréal http://www.numericalecology.com/

More information

Joint Estimation of Risk Preferences and Technology: Further Discussion

Joint Estimation of Risk Preferences and Technology: Further Discussion Joint Estimation of Risk Preferences and Technology: Further Discussion Feng Wu Research Associate Gulf Coast Research and Education Center University of Florida Zhengfei Guan Assistant Professor Gulf

More information

A Simulation Study on Confidence Interval Procedures of Some Mean Cumulative Function Estimators

A Simulation Study on Confidence Interval Procedures of Some Mean Cumulative Function Estimators Statistics Preprints Statistics -00 A Simulation Study on Confidence Interval Procedures of Some Mean Cumulative Function Estimators Jianying Zuo Iowa State University, jiyizu@iastate.edu William Q. Meeker

More information

Empirical Power of Four Statistical Tests in One Way Layout

Empirical Power of Four Statistical Tests in One Way Layout International Mathematical Forum, Vol. 9, 2014, no. 28, 1347-1356 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/imf.2014.47128 Empirical Power of Four Statistical Tests in One Way Layout Lorenzo

More information

Do Markov-Switching Models Capture Nonlinearities in the Data? Tests using Nonparametric Methods

Do Markov-Switching Models Capture Nonlinearities in the Data? Tests using Nonparametric Methods Do Markov-Switching Models Capture Nonlinearities in the Data? Tests using Nonparametric Methods Robert V. Breunig Centre for Economic Policy Research, Research School of Social Sciences and School of

More information

A homogeneity test for spatial point patterns

A homogeneity test for spatial point patterns A homogeneity test for spatial point patterns M.V. Alba-Fernández University of Jaén Paraje las lagunillas, s/n B3-053, 23071, Jaén, Spain mvalba@ujaen.es F. J. Ariza-López University of Jaén Paraje las

More information

Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone Missing Data

Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone Missing Data Journal of Multivariate Analysis 78, 6282 (2001) doi:10.1006jmva.2000.1939, available online at http:www.idealibrary.com on Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone

More information

Multiple Comparison Procedures, Trimmed Means and Transformed Statistics. Rhonda K. Kowalchuk Southern Illinois University Carbondale

Multiple Comparison Procedures, Trimmed Means and Transformed Statistics. Rhonda K. Kowalchuk Southern Illinois University Carbondale Multiple Comparison Procedures 1 Multiple Comparison Procedures, Trimmed Means and Transformed Statistics Rhonda K. Kowalchuk Southern Illinois University Carbondale H. J. Keselman University of Manitoba

More information

Distribution-Free Procedures (Devore Chapter Fifteen)

Distribution-Free Procedures (Devore Chapter Fifteen) Distribution-Free Procedures (Devore Chapter Fifteen) MATH-5-01: Probability and Statistics II Spring 018 Contents 1 Nonparametric Hypothesis Tests 1 1.1 The Wilcoxon Rank Sum Test........... 1 1. Normal

More information

1 A Non-technical Introduction to Regression

1 A Non-technical Introduction to Regression 1 A Non-technical Introduction to Regression Chapters 1 and Chapter 2 of the textbook are reviews of material you should know from your previous study (e.g. in your second year course). They cover, in

More information

Research Article Tests of Fit for the Logarithmic Distribution

Research Article Tests of Fit for the Logarithmic Distribution Hindawi Publishing Corporation Journal of Applied Mathematics and Decision Sciences Volume 2008 Article ID 463781 8 pages doi:10.1155/2008/463781 Research Article Tests of Fit for the Logarithmic Distribution

More information

Unit 27 One-Way Analysis of Variance

Unit 27 One-Way Analysis of Variance Unit 27 One-Way Analysis of Variance Objectives: To perform the hypothesis test in a one-way analysis of variance for comparing more than two population means Recall that a two sample t test is applied

More information

Chapter 12: An introduction to Time Series Analysis. Chapter 12: An introduction to Time Series Analysis

Chapter 12: An introduction to Time Series Analysis. Chapter 12: An introduction to Time Series Analysis Chapter 12: An introduction to Time Series Analysis Introduction In this chapter, we will discuss forecasting with single-series (univariate) Box-Jenkins models. The common name of the models is Auto-Regressive

More information

Estimation and Hypothesis Testing in LAV Regression with Autocorrelated Errors: Is Correction for Autocorrelation Helpful?

Estimation and Hypothesis Testing in LAV Regression with Autocorrelated Errors: Is Correction for Autocorrelation Helpful? Journal of Modern Applied Statistical Methods Volume 10 Issue Article 13 11-1-011 Estimation and Hypothesis Testing in LAV Regression with Autocorrelated Errors: Is Correction for Autocorrelation Helpful?

More information

Financial Econometrics and Quantitative Risk Managenent Return Properties

Financial Econometrics and Quantitative Risk Managenent Return Properties Financial Econometrics and Quantitative Risk Managenent Return Properties Eric Zivot Updated: April 1, 2013 Lecture Outline Course introduction Return definitions Empirical properties of returns Reading

More information

Asymptotic distribution of the sample average value-at-risk

Asymptotic distribution of the sample average value-at-risk Asymptotic distribution of the sample average value-at-risk Stoyan V. Stoyanov Svetlozar T. Rachev September 3, 7 Abstract In this paper, we prove a result for the asymptotic distribution of the sample

More information

Lecture 7: Dynamic panel models 2

Lecture 7: Dynamic panel models 2 Lecture 7: Dynamic panel models 2 Ragnar Nymoen Department of Economics, UiO 25 February 2010 Main issues and references The Arellano and Bond method for GMM estimation of dynamic panel data models A stepwise

More information

Supplemental Material for KERNEL-BASED INFERENCE IN TIME-VARYING COEFFICIENT COINTEGRATING REGRESSION. September 2017

Supplemental Material for KERNEL-BASED INFERENCE IN TIME-VARYING COEFFICIENT COINTEGRATING REGRESSION. September 2017 Supplemental Material for KERNEL-BASED INFERENCE IN TIME-VARYING COEFFICIENT COINTEGRATING REGRESSION By Degui Li, Peter C. B. Phillips, and Jiti Gao September 017 COWLES FOUNDATION DISCUSSION PAPER NO.

More information

Using regression to study economic relationships is called econometrics. econo = of or pertaining to the economy. metrics = measurement

Using regression to study economic relationships is called econometrics. econo = of or pertaining to the economy. metrics = measurement EconS 450 Forecasting part 3 Forecasting with Regression Using regression to study economic relationships is called econometrics econo = of or pertaining to the economy metrics = measurement Econometrics

More information

A PRACTICAL LOOK AT THE VARIABLE AREA TRANSECT

A PRACTICAL LOOK AT THE VARIABLE AREA TRANSECT Notes Ecology, 87(7), 2006, pp. 1856 1860 Ó 2006 by the Ecological Society of America A PRACTICAL LOOK AT THE VARIABLE AREA TRANSECT SOLOMON Z. DOBROWSKI 1,3 AND SHANNON K. MURPHY 2 1 University of California,

More information

Statistical Data Analysis Stat 3: p-values, parameter estimation

Statistical Data Analysis Stat 3: p-values, parameter estimation Statistical Data Analysis Stat 3: p-values, parameter estimation London Postgraduate Lectures on Particle Physics; University of London MSci course PH4515 Glen Cowan Physics Department Royal Holloway,

More information