Distance-Based Methods: Ripley's K function vs. K density function


Marta Roig Casanova
marta.roig-casanova@uv.es
Departamento de Economía Aplicada, Universidad de Valencia

SUBJECT AREA: Methods of regional analysis

ABSTRACT: This paper introduces an extension of Ripley's K function, which we call the M marginal function, to measure the concentration of economic activity more precisely. By means of the M marginal function we bring the tradition of spatial statistics closer to the spatial economics approach, represented by the K-density function, since we convert a technique from the spatial statistics path into a non-cumulative measure. Both measures of concentration are distance-based methods, can detect the spatial location pattern at every scale, reveal the distance at which significant concentration appears, test the significance of their results in the same way and satisfy the five requirements that any test for measuring concentration should fulfil. We apply the two distance-based methods to Spanish manufacturing sectors in order to observe how they behave, and we conclude that both detect the existence of concentration or dispersion perfectly; however, the M marginal function can quantify its magnitude, which is impossible for K-density, and only the M marginal function detects the size of the clusters and their position in the area of study. Furthermore, as the M marginal function is non-cumulative, it does not accumulate spatial information on the distribution of points up to a certain distance and, thus, it can identify local clusters with total precision, as K-density does. In this way, the M marginal function can provide even more information than the K-density function.

KEYWORDS: M marginal function, K-density function, distance-based method, non-cumulative function.

1. Introduction

The spatial concentration of economic activity and the analysis of the location of establishments have been followed closely by many economists over the years, dating back to Marshall (1890), and all of them have concluded that there are good reasons to expect economic activity to be unevenly distributed across space. More recently, theoretical research into the so-called new economic geography, initially developed by Krugman (1991),[1] emphasized the role of reinforcing advantages, increasing returns and the interaction between the benefits associated with operating in a large market (a large number of potential consumers and related industries) and the increased costs of competition (an increasing number of companies operating in that market) as the main forces explaining the emergence and sustainability of endogenous spatial disparities. Obviously, the intensity of these forces and the trade-off between them determine the location decisions of firms and consumers (workers), and the location patterns of different industries in the territory, beyond the conditions established by physical geography.

These theoretical developments have been accompanied, in recent years, by numerous empirical contributions[2] that try to establish the link between these forces and the location of economic activity in space, and the intensity and spatial scope of this agglomeration. Thus, the measurement of economic concentration has again become a field of interest in economic geography. Nevertheless, we must bear in mind that the intensity of these centripetal and centrifugal forces and the trade-off between them do not necessarily change monotonically with distance.
In this way, one would expect the measures currently used in this field to capture this peculiarity in the formation of clusters better, and to provide information about the specific distance at which firms have more or less incentive to locate in a particular cluster.

The literature on the empirical measurement of economic concentration has been influenced by two different traditions, economic geography and spatial statistics. The economic geography approach culminated in the paper by Duranton and Overman (2005), who developed their own probability function, the K-density function, using a Gaussian kernel function to estimate the density of point pairs, based on Silverman (1986). Spatial statistics became more popular in the field of economics with the publication of the paper by Marcon and Puech (2003) in the Journal of Economic Geography.[3] In that paper, Marcon and Puech followed the mathematical and statistical background of Ripley's K function, initiated years earlier by Ripley (1976, 1977), Diggle (1983) and Cressie (1993), to estimate the number of further events that can be found within a certain distance of an arbitrary event; they thereby used a cumulative function.

Since then, these two methodological traditions have improved their own techniques for measuring the extent and nature of the concentration of economic activity in space. In spite of this, the economic literature offers few discussions of the preference for one or the other methodology for assessing geographic concentration. The first comparison between Ripley's K function and the K-density function can be found in the fifth section of Marcon and Puech (2003), where the authors pointed out some of the advantages and disadvantages of both measures without expressly choosing one of them as the best measure for evaluating spatial location patterns. Afterwards, Duranton and Overman (2005) stated in the conclusions of their paper that their methodology was more informative than that used in spatial statistics: "We believe the k-densities are more informative than k-functions with respect to the scale of localization" (p.1103). Nevertheless, it was not until last year that Marcon and Puech (2010) discussed the convenience of using a probability density function of point pairs or a cumulative function for evaluating spatial concentration, reaching the conclusion that the two measures provide useful information about the location of activity in space and that their results are complementary. Thus, they should be implemented simultaneously and should not be considered substitutes.

[1] This work has been surveyed in Ottaviano and Puga (1998), Fujita et al (1999) and Fujita and Thisse (2009).
[2] See Alonso-Villar et al (2004), Devereux et al (2004), Paluzie et al (2004) or Arbia et al (2008).
Meanwhile, Albert, Casanova and Orts (2012) also brought the two approaches closer together by using a statistical function, the M function (an extension of Ripley's K function), and making this function meet the five requirements of Duranton and Overman (2005).[4] This approach, like the D function of Diggle and Chetwynd (1991), has two advantages. First, it is defined to analyse the nature and physical scale of spatial clustering for inhomogeneous populations, as long as the appropriate benchmark is chosen;[5] second, it has an easy interpretation in terms of expectations. That is, given the intensity of firms within an industry, our M function measures the expected number of excess firms within a predetermined distance of an arbitrary firm of this industry, in comparison with the expected number in the absence of spatial clustering.

In this paper we go a step further. Our approach remains in the tradition of spatial statistics but, at the same time, continues to move closer to the spatial economics approach. We do not settle for comparing a cumulative function, the M function, with a probability density function, K-density. We introduce the M marginal function, which measures the variation in the number of events that can be found at a given distance from an arbitrary event when the distance considered varies, that is, how the M function changes when we change the distance considered. Thus, the M marginal function provides the variation in the number of neighbours as the distance that defines the "neighbourhood" increases and, therefore, does not accumulate spatial information on the distribution of points up to each distance (in fact, its values give the slope of the M function).[6] In this way, the M marginal function directly reflects the effect of changes in the trade-off between the centripetal and centrifugal forces that form clusters as the distance varies. With this change, we adapt a technique from the spatial statistics path and convert it into a non-cumulative measure, more similar to the K-density function. The two measures, the M marginal function and the K-density function, satisfy the aforementioned five essential requirements that any test for measuring concentration should fulfil.

[3] Arbia and Espa (1996) had previously used measures from spatial statistics.
[4] (1) Be comparable across industries, (2) control for the overall agglomeration of manufacturing, (3) control for industrial concentration, (4) be unbiased with respect to scale and aggregation, and (5) give an indication of the significance of the results.
In addition, they have other characteristics in common: both (1) treat space as continuous, (2) can detect spatial location patterns at every scale, (3) reveal the distance at which significant concentration or dispersion occurs, (4) use the same benchmark and (5) test the significance of their results and define the null hypothesis in the same way.

In this paper, we first present a series of simulations, with the number of events and their spatial distribution controlled, which allow us to illustrate to what extent the M function and, especially, the M marginal function are able to detect the presence, spatial extent and intensity of spatial concentration. Next, we apply the two functions, the M marginal function and the K-density function, to a set of subsectors of Spanish manufacturing in order to obtain a more accurate description of their location patterns. Additionally, this allows us to verify whether the behaviour of both measures over the same pattern of points coincides, or what kind of additional information one measure of concentration can provide relative to the other. To do this, we use microgeographic data from the Analysis System of Iberian Balances database,[7] which contains exhaustive information about Spanish manufacturing subsectors at the four-digit level, classified using the National Classification of Economic Activities,[8] together with the latitude and longitude coordinates of every establishment.

The remainder of the paper is organised as follows. In Section 2, we describe the methodology employed in our analysis and the key methodological and interpretative modifications incorporated in our application. In Section 3, we examine in detail the behaviour of the M marginal function when it is applied to simulated hypothetical sectors. In Section 4, we outline and discuss the main results obtained by applying the M marginal function and the K-density function to Spanish manufacturing subsectors. Finally, Section 5 contains the most important conclusions.

[5] Diggle and Chetwynd tried to deal with inhomogeneity by selecting a random sample of controls with spatially varying density. Here, we do not need a previously selected group of controls to treat the inhomogeneity because the entire population (controls) is known to us, as well as its density.
[6] Even Duranton and Overman (2005) referred to the K-function as the cumulative of their K-density (p.1103).

2. Measures of Concentration

2.1. K-density function. Spatial economics path

Duranton and Overman (2005) developed a novel way to test for localization, as they said in their paper (p.1078). Their method, which takes into consideration the distribution of distances between pairs of establishments, maintains the importance of knowing at which spatial scale the clustering occurs. K-density takes into account the frequency of distances at each radius (r), i.e. it counts how many times each distance between pairs of establishments is repeated.
This distance, denoted by $d_{i,j}$, is the Euclidean distance between establishments i and j, and the estimator of the density of bilateral distances at any distance d is

$$\hat{K}(d) = \frac{1}{n(n-1)h} \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} f\left(\frac{d - d_{i,j}}{h}\right)$$

where n is the number of establishments, h is the bandwidth and f is the kernel function.[9]

[7] SABI.
[8] NACE 93 - Rev. 1.
[9] For further details see Duranton and Overman (2005).
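As an illustration only, the estimator above can be sketched in a few lines of Python. The sketch assumes a standard Gaussian kernel for f (the introduction mentions a Gaussian kernel) and a bandwidth h supplied by the user; the function names and the toy coordinates are hypothetical, and any boundary treatment near zero distance is left out.

```python
import math


def gaussian_kernel(u):
    """Standard Gaussian kernel f(u) (assumed form of the kernel)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def k_density(points, d, h):
    """Evaluate K-hat(d) = 1 / (n (n - 1) h) * sum_{i<j} f((d - d_ij) / h)."""
    n = len(points)
    total = 0.0
    for i in range(n - 1):
        xi, yi = points[i]
        for j in range(i + 1, n):
            xj, yj = points[j]
            d_ij = math.hypot(xi - xj, yi - yj)  # Euclidean distance d_ij
            total += gaussian_kernel((d - d_ij) / h)
    return total / (n * (n - 1) * h)


# Hypothetical toy data: three establishments on a plane.
establishments = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
value_at_5 = k_density(establishments, d=5.0, h=1.0)
```

The double loop visits each unordered pair once, which makes the cost quadratic in the number of establishments; this is consistent with the remark in the text that the method is time consuming for large sets of points.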

Thereby, this function seems a more intuitive tool and easily identifies local density in a precise way. However, K-density is difficult to interpret, since it detects the existence of concentration or dispersion perfectly but cannot quantify it. Moreover, its application to large sets of points is very complex and time consuming and, as Duranton and Overman (2005) recommend, its analysis should be restricted to distances up to the median distance between all pairs of establishments, owing to the way the function is constructed. The K-densities of both the sector analysed and the benchmark sum to unity; thus, if the K-density of the sector exceeds that of the benchmark at some distance, it must lie below the benchmark somewhere else. This is dealt with by limiting the distance considered.[10]

2.2. Ripley's K function. Spatial statistics path

The M_TM function was introduced in Albert et al (2012). This function takes the whole of manufacturing[11] as benchmark and is thus able to compare the spatial distribution of each sector with the overall tendency of manufacturing industry to agglomerate:

$$M_{TM}(r) = K(r) - K_{TM}(r)$$

Here, $M_{TM}(r)$ is the difference between the K-value (the value of Ripley's K function) of the sector under consideration and the K-value of total manufacturing at radius r. Ripley's K function, K(r), is a distance-based methodology that analyses completely mapped spatial point process data, i.e. data on the locations of establishments. Under certain conditions, the reduced second moment measure, or K function, is

$$K(r) = \lambda^{-1} \, E[\text{number of further events within distance } r \text{ of an arbitrary event}]$$

where λ is the density, or mean number of events per unit area. This term is estimated by N/A, where N is the observed number of points in the region studied and A is the area of the study region.
Additionally, the expectation in this function can be estimated by counting the average number of neighbours each establishment has within a circle of a given radius, neighbours being understood to mean all establishments situated at a distance equal to or lower than the radius (r). This function is considered a cumulative function because it accumulates spatial information on the distribution of points up to each distance (r). The K(r) function describes characteristics of the point pattern at many different scales simultaneously, depending on the value of r we take into account, that is,

$$K(r) = \frac{1}{\lambda N} \sum_{i=1}^{N} \sum_{j=1,\, j \neq i}^{N} w_{ij} I(d_{ij}), \qquad I(d_{ij}) = \begin{cases} 1, & d_{ij} \le r \\ 0, & d_{ij} > r \end{cases}$$

where $d_{ij}$ is the distance between the i-th and j-th establishments, I(x) is the indicator function and $w_{ij}$ is the weighting factor that corrects for border effects.[12] The indicator function, $I(d_{ij})$, takes the value 1 if the distance between the i-th and j-th establishments is lower than or equal to r, and 0 otherwise, and $w_{ij}$ is equal to the area of the circle divided by the intersection between the area of the circle and the area of study. Finally, using the definition of λ, the K(r) function can be rewritten as:

$$K(r) = \frac{A}{N^{2}} \sum_{i=1}^{N} \sum_{j=1,\, j \neq i}^{N} w_{ij} I(d_{ij})$$

Therefore, the K(r) function is a distance-based method that measures concentration as the ratio of the average number of neighbours within a circle of radius r to the density of the whole study region (λ). The K-value will increase across radii (r) when the average number of neighbours is higher than the density of the whole study region, and will decrease when the density of the whole study region is higher than the average number of neighbours within the different circles of radius r. This density will always depend on the null hypothesis considered, i.e. the benchmark.

For many years, the spatial statistics tradition used a theoretical benchmark consisting of a randomly distributed set of locations in the area of study, called Complete Spatial Randomness (CSR). This benchmark is useful from a theoretical point of view, since it allows any deviation of a specific spatial pattern of locations from a randomly distributed set to be detected.

[10] For further details see Duranton and Overman (2005), Marcon and Puech (2003) and Marcon and Puech (2010).
[11] Henceforth TM.
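The rewritten form of K(r) and the CSR benchmark can be illustrated with a minimal Python sketch. It assumes $w_{ij} = 1$ for every pair, so no border correction is applied, and it relies on the standard result that K(r) = πr² for a homogeneous Poisson process, which the text does not state explicitly; the function names, the sample size and the seed are hypothetical.

```python
import math
import random


def ripley_k(points, r, area=1.0):
    """K(r) = A / N^2 * sum_i sum_{j != i} w_ij * I(d_ij <= r), with w_ij = 1."""
    n = len(points)
    count = 0
    for i in range(n):
        xi, yi = points[i]
        for j in range(n):
            if i == j:
                continue
            xj, yj = points[j]
            if math.hypot(xi - xj, yi - yj) <= r:
                count += 1  # w_ij = 1: no border correction in this sketch
    return area * count / (n * n)


def csr_comparison(n=400, r=0.05, seed=0):
    """Return (estimated K(r), pi * r^2) for n uniform points in the unit square."""
    rng = random.Random(seed)
    pts = [(rng.random(), rng.random()) for _ in range(n)]
    return ripley_k(pts, r), math.pi * r * r


k_hat, k_csr = csr_comparison()
```

Because the border correction is omitted, the estimate for a random pattern is expected to fall somewhat below πr², and the gap grows with r; this is the artificial decrease that the weighting factor $w_{ij}$ is meant to remove.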
However, from an economic point of view, this benchmark is not realistic because economic activity is not located in a random and independent way. Moreover, an appropriate benchmark must consider the dissimilarities in natural features such as mountains, rivers or harbours, i.e. first nature, and the tendency of economic activity to agglomerate. Thus, with CSR as benchmark we cannot isolate the idiosyncratic tendency of each sector to locate from the general tendency of manufacturing establishments to agglomerate.

The introduction of the whole of manufacturing as benchmark allows us to consider the natural and economic factors that can condition the spatial distribution of activity. In other words, by means of the difference between the K-value of each sector under consideration and the K-value of total manufacturing, we can identify the existence and magnitude of spatial agglomeration of establishments in a given sector over and above the level of spatial concentration of the whole of manufacturing, which is attributable to the natural inhomogeneity of the territory and to the general tendency of economic activity to agglomerate. That is, using the entire population as controls, the M_TM function becomes inhomogeneous, as Diggle and Chetwynd (1991) pointed out.

Both K-values, that of the sector and that of the whole of manufacturing, are relative to the density of the whole study region; in addition, the M_TM value of each sector at each radius is relative to the benchmark considered, TM in this case. On the one hand, the M_TM value will increase as long as the average number of establishments of a given sector at different distances (r) increases with the distance and this increase is higher than that of TM. On the other hand, the M_TM value will decrease when the increase in the average number of establishments of a given sector at different distances (r) is lower than the increase in establishments of TM.

[12] These border-effect corrections should be incorporated to avoid artificial decreases in K(r) when r increases, because the increase in the area of the circle under consideration is not matched by an increase in establishments (outside the study area there are no establishments).
Finally, the M_TM value will remain constant for different values of r when the spatial location pattern of a given sector coincides with the distribution of the benchmark, TM. Thus, the K-value will never take negative values, but the M_TM value can take either positive or negative values, depending on whether a subsector is concentrated or dispersed relative to TM.

The M_TM value changes with the radius under consideration, i.e. the distance at which establishments are counted. However, these increases or decreases of the M_TM value are not monotonic with distance, because the agglomerative forces that pull economic activity together and determine the differences in the shape, size or outline of the cluster do not act monotonically. The slope of the M_TM function gives information about this increase or decrease in neighbouring establishments as r increases and reflects the trade-off between the centripetal and centrifugal forces that drive the specific location of the establishments. The M_TM marginal function provides the values of the slope of the M_TM function, i.e. it measures the variation in the number of neighbours when we increase the radius, that is,

$$M_{TM}\ \text{marginal function} = \frac{\partial M_{TM}(r)}{\partial r}$$

As we see, TM remains the benchmark, so its values tell us, at each distance, the variation in the number of neighbours in each sector as r increases, compared with the increase in neighbours of the overall manufacturing industry. In this way, our M_TM marginal function stops being a cumulative function and becomes a non-cumulative function, like the K-density function, thereby moving closer to the spatial economics tradition.

We should note that the M_TM marginal function is simple to apply to large sets of points, in contrast to the K-density function, whose application to large sets of points is more complex and time consuming. We must also bear in mind that a bias appears in the M_TM marginal function when we analyse radii larger than 25% of the smaller side length of the area analysed; thus, it is advisable to restrict the radius to a maximum of 1/4 of the smaller side length of the area analysed, in order to increase the reliability of our results. Thereby, the M_TM marginal function becomes an appropriate instrument for detecting clusters and their interaction, as long as these clusters are relatively close to each other. In fact, from an economic point of view it makes little sense to think that two very distant clusters interact, so what economic sense is there in detecting clusters separated, for instance in the case of Spain, by more than 250 km? In this regard, Duranton and Overman (2005, p.1090) said: "localization tends to take place mostly at fairly small [distances]".
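The definitions of the M_TM function and its marginal given earlier in this section can be sketched as follows. This is a hedged illustration, not the authors' implementation: the derivative is approximated by a central finite difference with a hypothetical step `dr`, the border weights $w_{ij}$ are set to 1, and all function names and coordinates are made up for the example.

```python
import math


def k_value(points, r, area=1.0):
    """Ripley's K at radius r with w_ij = 1 (no border correction)."""
    n = len(points)
    count = 0
    for i in range(n):
        for j in range(n):
            if i != j:
                dx = points[i][0] - points[j][0]
                dy = points[i][1] - points[j][1]
                if math.hypot(dx, dy) <= r:
                    count += 1
    return area * count / (n * n)


def m_tm(sector, benchmark, r, area=1.0):
    """M_TM(r) = K(r) of the sector minus K_TM(r) of the benchmark."""
    return k_value(sector, r, area) - k_value(benchmark, r, area)


def m_tm_marginal(sector, benchmark, r, dr, area=1.0):
    """Central finite-difference approximation of dM_TM/dr (assumed scheme)."""
    upper = m_tm(sector, benchmark, r + dr, area)
    lower = m_tm(sector, benchmark, r - dr, area)
    return (upper - lower) / (2.0 * dr)


# Hypothetical toy patterns: a tight sector and a more spread-out benchmark.
sector_pts = [(0.0, 0.0), (0.0, 1.0)]
benchmark_pts = [(0.0, 0.0), (0.0, 3.0)]
slope_at_1 = m_tm_marginal(sector_pts, benchmark_pts, r=1.0, dr=0.5)
```

In the toy data the sector gains its only neighbour at distance 1 while the benchmark gains none before distance 3, so the marginal is positive around r = 1; where both patterns gain neighbours at the same rate, the M_TM function is flat and the marginal is zero.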
Finally, the two methodologies used in this paper evaluate the statistical significance of departures from the null hypothesis in the same way, by constructing a confidence interval. We generate 1000 simulations and reject the non-significant values. For each simulation we randomly reallocate as many establishments as the sector considered has across the sites where establishments from the whole of manufacturing are currently located. The sampling is done without replacement. With these Monte Carlo simulations, both measures of concentration applied in this paper share some similarity with the way of constructing the case-control counterfactual explained in Diggle and Chetwynd (1991), although there is a difference. As we know the population density in advance, it is not necessary to use a previously selected group of controls, or a representative sample of the entire population, in order to construct the counterfactuals.

3. Theoretical and controlled simulations

With the aim of illustrating to what extent the M_TM function and, especially, the M_TM marginal function are able to detect the presence, spatial extent and intensity of spatial concentration, we perform controlled simulations of 20 hypothetical sectors. Every point of our hypothetical sectors represents an establishment and is located in the area of study by its coordinates. Each sector differs from the others in the intensity, number and location of the clusters considered.

Several features are common to all the scenarios. First, we use a unit square (1 x 1) as the reference area; second, each hypothetical sector contains 500 points; and, third, the benchmark consists of 4,000 points distributed in a random and independent way, according to a Poisson distribution, over the whole area. In these theoretical simulations, we use CSR as benchmark in order to capture the precise information that the M marginal function gives for each point pattern. Moreover, each sector is subdivided into two subsets of establishments, those located within the clusters considered and those located outside these areas, and these hypothetical establishments are distributed in a random way, both the points that are part of the cluster and those that are outside it. By contrast, every scenario has specific features, such as the number of clusters that constitute each sector, the location of each cluster in the area of study, the size of the cluster, i.e. how large its radius is, and the number of establishments inside the cluster.
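The significance test described at the end of Section 2 (random reallocation of the sector's establishments across the benchmark sites, without replacement, followed by percentile bounds) can be sketched as below. This is an assumption-laden illustration: `statistic` stands for any function of a point pattern, such as a K-density or M marginal value at one distance, the nearest-rank percentile rule is one of several possible conventions, and the names and toy data are hypothetical.

```python
import random


def monte_carlo_band(sites, n_sector, statistic, n_sims=1000,
                     lower=0.025, upper=0.975, seed=None):
    """Percentile band of `statistic` under random reallocation across `sites`."""
    rng = random.Random(seed)
    sims = []
    for _ in range(n_sims):
        # Draw as many sites as the sector has establishments, without replacement.
        draw = rng.sample(sites, n_sector)
        sims.append(statistic(draw))
    sims.sort()
    # Nearest-rank percentiles of the simulated values (assumed convention).
    lo = sims[int(round(lower * (n_sims - 1)))]
    hi = sims[int(round(upper * (n_sims - 1)))]
    return lo, hi


def mean_x(points):
    """Toy statistic used only to exercise the function."""
    return sum(x for x, _ in points) / len(points)


# Hypothetical benchmark sites spread along a line.
toy_sites = [(i / 99.0, 0.0) for i in range(100)]
band = monte_carlo_band(toy_sites, 10, mean_x, n_sims=200, seed=1)
```

An observed value of the statistic that falls outside the returned band would be read as a significant departure from the null hypothesis at that distance; repeating the calculation over a grid of distances gives the two envelope lines plotted for each function.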
All the features that define each hypothetical sector are shown in the table in Appendix 1. The last column of this table gives the name of each hypothetical sector, and this nomenclature summarises all the features of the sector. Thus 1111, our baseline scenario, means that the sector has one cluster (first column), that the centroid of the cluster is situated at the coordinate (0.5, 0.5) (second column), that the radius of the cluster is 0.025 (third column) and that the cluster contains 75% of the points of the sector, while the remaining 25% are randomly distributed over the area of study (fourth column).
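As a rough illustration of the baseline scenario 1111, the following sketch generates one hypothetical sector and one benchmark pattern. It assumes that points inside the cluster are uniform over a disc and that the remaining points are uniform over the whole unit square; the benchmark is drawn with a fixed count of 4,000 uniform points rather than a random Poisson count. The function names and seeds are hypothetical.

```python
import math
import random


def simulate_sector(n_points=500, centre=(0.5, 0.5), radius=0.025,
                    share_inside=0.75, seed=None):
    """One cluster holding `share_inside` of the points, the rest spread uniformly."""
    rng = random.Random(seed)
    n_inside = round(n_points * share_inside)
    points = []
    for _ in range(n_inside):
        # The square root makes the draw uniform over the area of the disc.
        rho = radius * math.sqrt(rng.random())
        theta = 2.0 * math.pi * rng.random()
        points.append((centre[0] + rho * math.cos(theta),
                       centre[1] + rho * math.sin(theta)))
    for _ in range(n_points - n_inside):
        # Assumption: background points are uniform over the whole unit square.
        points.append((rng.random(), rng.random()))
    return points


def simulate_benchmark(n_points=4000, seed=None):
    """Benchmark of independent uniform points on the unit square (CSR-like)."""
    rng = random.Random(seed)
    return [(rng.random(), rng.random()) for _ in range(n_points)]


sector_1111 = simulate_sector(seed=0)
benchmark = simulate_benchmark(seed=1)
```

Changing `centre`, `radius` and `share_inside` reproduces the other one-cluster scenarios described in the text, for example a centroid at (0.9, 0.5) or a radius of 0.05.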

Finally, since the side of the area of study is 1, a radius of 0.025 corresponds to 2.5% of the side. When we describe the outcomes, we will consider that each side measures 1,000 km, so that the area resembles that of Spain. In this case, the radius of the cluster would be 25 km.

The information reported in the following figures has been generated by running 200 simulations, and the two lines shown for each function, M_TM and M marginal, are their 0.025 and 0.975 quantiles. For each hypothetical sector and for each radius in this interval, we rank our 200 simulations and select the 2.5th and 97.5th percentiles to obtain the lower and upper bounds of a 95% confidence interval. Therefore, Figures 1, 2, 3 and 4 show the values of the 95% confidence interval of the M function (continuous lines) and the M marginal function (dashed lines) for sixteen possible scenarios.

Figure 1 presents four hypothetical sectors whose point pattern consists, mainly, of one cluster located in the middle of the area of study, i.e. its centroid is situated at the coordinate (0.5, 0.5). The word "mainly" is used because not all the points of the simulation are located inside the cluster. The scenarios in this figure differ in the number of establishments included in the cluster and in the spatial extent (size) of the cluster. The hypothetical sector of Figure 1a, denominated 1111, consists of one cluster of radius 0.025 that contains 375 points, and 125 points distributed over the rest of the space. This means that 75% of the establishments of this specific industry are located inside a circular cluster 50 km in diameter and the rest of the establishments (125) are randomly distributed over the area of study. The second scenario, Figure 1b, also considers one cluster of radius 0.025, but this time the cluster contains only 250 points, that is, 50% of the establishments of the industry.
In this way, we have fewer neighbours in the same area or, equivalently, fewer establishments inside the cluster and more establishments randomly distributed over the rest of the space. Therefore, the average number of neighbours will be lower at each distance and the intensity of both the M function and the M marginal function will be proportionately lower. As the M function provides an average number of neighbouring points, each point located outside the cluster reduces the M value; in Figure 1b there are 125 more points outside the cluster than in Figure 1a, so the M value is lower.

The size of the cluster, i.e. the distance between the points of the cluster, is modified in the third and fourth simulations. Now we generate a cluster of radius 0.05 instead of 0.025. Nevertheless, the distribution of points is the same as in the two previous cases. Here we verify that, if the same distribution of points is spread over a cluster whose radius is twice the previous one, the intensity of the M marginal is much lower, as it captures the slower growth of the M function. This is because the neighbouring establishments are located at longer distances and the M function does not grow as fast as before.

Figure 1. One cluster centred (0.5, 0.5).
(a) 1111: 1 cluster, r=0.025, 75% establishments inside cluster
(b) 1112: 1 cluster, r=0.025, 50% establishments inside cluster
(c) 1121: 1 cluster, r=0.05, 75% establishments inside cluster
(d) 1122: 1 cluster, r=0.05, 50% establishments inside cluster

Finally, the M marginal value clearly indicates, in the four graphs in Figure 1, the area occupied by each cluster, i.e. a diameter of 50 km in the first case (r=0.025) and 100 km in the second (r=0.05).

From the previous simulations, we can conclude that the M marginal function is highly informative about the size and intensity of clusters. First, the speed at which the M function rises and, as a consequence, the rapid growth and high values reached by the M marginal are related to the size of the cluster and to the proportion of establishments in each sector that are within the cluster. Second, given the number of establishments inside the cluster, the smaller the cluster, the greater the intensity of concentration, i.e. the more rapidly the M value increases. This is because at very short distances (r) we find an average number of establishments much higher than the density of the whole study region. Therefore, the M marginal function gives an accurate approximation of the closeness of the establishments within the cluster.

Figure 2 shows the values of the M function and the M marginal function for four possible scenarios in which the point pattern consists, mainly, of one cluster located at one extreme of the area of study, i.e. its centroid is situated at the coordinate (0.9, 0.5). The distribution of points and the radii considered in Figures 2a to 2d coincide with those of Figure 1. The only difference between the simulations of Figure 1 and those of Figure 2 is the location of the cluster in the area of study. This difference causes a fast drop in the values of the M function approximately when the radius reaches 75 km, that is, when the border effect appears more intensely. Consequently, the values of the M marginal function also drop, although its initial information does not vary, and it is this initial information that provides the essential features for describing the spatial location pattern.

The pattern of points in the following two figures, Figure 3 and Figure 4, is completely different from the previous ones. Now, in each simulation we generate two clusters instead of one. Figure 3 shows the graphical representations of four hypothetical sectors whose point patterns are, mainly, two clusters located very close to each other.
Their centroids are situated at the coordinates (0.5, 0.5) and (0.65, 0.5), which means that both clusters are located too far from the border for the edge effect to be observable in the graphs.[13]

[13] Remember that we cannot observe this effect because the distance analysed is restricted to 200 km, since a bias appears in Ripley's K function when we analyse radii larger than 25% of the smaller side length of the area analysed.

Figure 2. One cluster next to the border (0.9, 0.5).
(a) 1211: 1 cluster, r=0.025, 75% establishments inside cluster
(b) 1212: 1 cluster, r=0.025, 50% establishments inside cluster
(c) 1221: 1 cluster, r=0.05, 75% establishments inside cluster
(d) 1222: 1 cluster, r=0.05, 50% establishments inside cluster

In this case, when the clusters are very close, both the M function and the M marginal function indicate the existence of two clusters. The M function is a cumulative function, so it accumulates spatial information on the distribution of points. Therefore, the M values will increase as long as the function finds new neighbours and will remain flat when the growth in neighbours is zero as the distance increases. This behaviour can be seen clearly in Figures 3a and 3b.

[Figure 3. Two close clusters (0.5, 0.5) and (0.65, 0.5). Panels: (a) 2111: 2 close clusters, r=0.025, 60% cluster 1, 25% cluster 2; (b) 2 close clusters, r=0.025, 40% cluster 1, 20% cluster 2; (c) 2121: 2 close clusters, r=0.05, 60% cluster 1, 25% cluster 2; (d) 2123: 2 close clusters, r=0.05, 40% cluster 1, 20% cluster 2.]

The M marginal function provides the same information, but more clearly and accurately. Its values precisely indicate the diameter of the cluster: 50 km in figures 3a and 3b and 100 km in figures 3c and 3d. Moreover, its values show at which distance we find the nearest other cluster. For instance, the radius used in figures 3a and 3b is 0.025, so the distance between the two clusters is 100 km. Thus, as we see, from that distance onwards M marginal values increase again, capturing the presence of the second cluster. Finally, M marginal values, like M values, also capture the proportion of points within a cluster, so the intensity of both functions is higher when the union of the two clusters contains 85% of the points (figures 3a and 3c) than when it contains 60% of them (figures 3b and 3d).

In figures 3c and 3d we double the size of the cluster, and thereby the distance between the points inside the cluster, although the distribution and proportion of points inside and outside the cluster are the same as in figures 3a and 3b. As a result, the M function increases more slowly and, consequently, the intensity of the M marginal function is lower. According to the definition of the M marginal function, this is what should happen: slower growth of the M function as the radius becomes larger lowers the M marginal values. Moreover, the radius is now 0.05, so the distance between the two clusters is only 50 km, which produces an interaction in the M marginal values.

[Figure 4. Two distant clusters (0.5, 0.5) and (0.9, 0.5). Panels: (a) 2211: 2 distant clusters, r=0.025, 60% cluster 1, 25% cluster 2; (b) 2 distant clusters, r=0.025, 40% cluster 1, 20% cluster 2; (c) 2221: 2 distant clusters, r=0.05, 60% cluster 1, 25% cluster 2; (d) 2223: 2 distant clusters, r=0.05, 40% cluster 1, 20% cluster 2.]
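The cluster diameters and separations quoted for Figures 3 and 4 follow from simple geometry, sketched below. The scale of 1000 km per unit is an assumption inferred from the text (r = 0.025 corresponds to a 50 km diameter), and `cluster_gap_km` is a hypothetical helper that measures the gap between the edges of two equal-sized circular clusters.

```python
import math

KM_PER_UNIT = 1000.0  # assumption: inferred from r = 0.025 <-> 50 km diameter


def cluster_gap_km(c1, c2, cluster_radius, km_per_unit=KM_PER_UNIT):
    """Edge-to-edge distance (km) between two circular clusters of equal radius."""
    centre_dist = math.dist(c1, c2)
    return (centre_dist - 2.0 * cluster_radius) * km_per_unit


for name, c2 in (("Figure 3", (0.65, 0.5)), ("Figure 4", (0.9, 0.5))):
    for r in (0.025, 0.05):
        diameter = 2.0 * r * KM_PER_UNIT
        gap = cluster_gap_km((0.5, 0.5), c2, r)
        print(f"{name}, r={r}: diameter ~ {diameter:.0f} km, gap ~ {gap:.0f} km")
```

With these assumptions the gaps for Figure 3 come out at about 100 km (r = 0.025) and 50 km (r = 0.05), matching the distances given in the text, while for Figure 4 they are about 350 km and 300 km, beyond the 200 km analysed.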

The four simulations in Figure 4 have the same distribution of points and the same radii as their counterparts in Figure 3, i.e. figure 3a presents the same point pattern and the same radius as figure 4a. However, there is one major difference between Figures 3 and 4: the distance between the two clusters. This distance is now much larger, with the centroids situated at the coordinates (0.5, 0.5) and (0.9, 0.5). In this case, when the clusters are very distant, neither the M function nor the M marginal function clearly indicates the existence of the two clusters.

As in Figure 2, the M function reflects the edge effect, through the reduction of its values when part of the circle analysed lies outside the study area. However, the drop of the M function in Figure 4 is not as pronounced as in Figure 2 because the percentage of points located next to the border is not as high (75% and 50% in Figure 2 versus 25% and 20% in Figure 4).

As we have seen, in all cases we are able to detect the existence, average spatial extent and average intensity of clusters. Furthermore, if there is more than one cluster and they are close enough to each other, at distances below 25% of the side length of the study area (200 km in our example), our approach also detects the interaction between them and how far they are from each other. Only in the case of very distant clusters is this methodology not sufficiently informative. We can see this by comparing the results of Figure 1 and Figure 4. This problem is due to the limit on the maximum radius tested by this function. So, in this particular case, the simulations of Figure 1, with one cluster far from the border, give results almost identical to those of Figure 4, with two clusters far from each other.
This suggests that we should limit the interpretation of the results to determining the existence of concentration, the average extent and intensity of possible clusters, and whether this pattern corresponds to the simultaneous existence of several clusters located relatively close to each other. [14]

Spatial location patterns of Spanish manufacturing subsectors

In this section we apply the K-density function and the M_TM function in levels and, especially, its rate of variation (the M_TM marginal function) to a set of Spanish manufacturing sectors, in order to detect the presence, spatial extent and intensity of spatial concentration and to check whether the K-density and M_TM marginal functions behave similarly in a real situation.

[14] This conclusion is not a serious problem in practice, to the extent that localization usually takes place at short distances, as pointed out by Duranton and Overman (2005); therefore, we should be more concerned about the location patterns at smaller radii.

In the previous controlled simulations, the values of the M and M marginal functions were conditioned by a random benchmark. In this section we get closer to economic reality and use a benchmark that takes into account the whole of manufacturing. [15] Thus, the values of the M_TM functions capture the differences between the spatial location pattern of each sector and the overall tendency of manufacturing industry to agglomerate. That is, the idiosyncrasy of each sector is captured and its specificities are isolated.

The sectors on which we focus have been chosen because they present clearly different location patterns or some specific characteristics, and we wanted to see how these are reflected by both functions. For instance, sector 1511, Production and preserving of meat, has a concentration pattern very similar to that of manufacturing as a whole; sectors 1725, Other textile weaving, 2954, Manufacture of machinery for textile, apparel and leather production, and 3650, Manufacture of games and toys, have most of their establishments in two clusters very close to the border; sectors 2442, Manufacture of pharmaceutical preparations, and 3220, Manufacture of television and radio transmitters, also have most of their establishments in two clusters, but in this case one of them is very close to the border (around Barcelona) and the other is completely centred in the study area (around Madrid); sectors 2213, Publishing of journals and periodicals, and 2630, Manufacture of ceramic tiles and flags, present the highest level of concentration; finally, the establishments of sector 3511, Building and repairing of ships, are located along the coastline and those of sector 3622, Manufacture of jewellery and related articles, are located in several clusters. The graphs of the M curves and K-density curves of all the sectors analysed appear in Appendix 2; however, only some of them are discussed in this section.
In Appendix 2, the graph on the left shows the spatial location pattern of each sector measured by the M_TM function (continuous line) and the M_TM marginal function (dashed line). The graph on the right shows the spatial location pattern of the same sector measured by the K-density function (continuous line) and its corresponding local and global confidence intervals (dashed lines). The confidence interval does not appear in the graphs of the M values because there would be too many lines and the interpretation could become confusing.

As mentioned, sector 1511 is a good example of a sector whose spatial location pattern apparently does not differ significantly from that of manufacturing as a whole. In fact, if we observe the two graphs of sector 1511 in Appendix 2, we realise that its M_TM value is around zero at all distances. Thus, both the M_TM marginal function and the K-density function indicate that the spatial location pattern of sector 1511 does not differ significantly from that of total manufacturing (only the K-density values show, at the very shortest distances, some relative dispersion of the subsector).

Sectors 1725 and 2954 both have most of their establishments in two clusters close to the border. We can see this in Figure 5, which shows the location of the establishments of both sectors as dots. This may cause the drop of the M_TM function, as we checked theoretically in the previous section and as we see in the graphs on the left in Figure 6. Moreover, we also concluded that the higher the percentage of establishments of the sector inside the clusters close to the border, the faster the drop of the M_TM function. Here we can verify this: in sector 2954 only four establishments are located outside the clusters close to the border, so the drop of the M_TM function in this sector is very pronounced, its value falling almost suddenly.

[Figure 5. Spatial distribution of establishments of sectors 1725 and 2954. Both axes run from 0 to 1.]

As we have deduced, the rise and later drop of the M_TM function at small distances reflects the specific distribution pattern of the establishments in sectors 1725 and 2954, most of which are located in two clusters close to the border. The location of the clusters in the study area cannot easily be read from the K-density function; however, it detects the existence of the two clusters perfectly. In sector 1725, the M_TM function even detects another significant cluster from a distance of about 250 km. This does not happen in sector 2954 because the distance between the clusters is larger and the M_TM function does not detect it.

[15] For this reason, we introduce the subscript TM.

[Figure 6. Comparison of M_TM functions and K-density of sectors 1725 and 2954. Panels: 1725, Other textile weaving (M_TM functions, left; K-density, right); 2954, Machinery for textile, apparel and leather production (M_TM functions, left; K-density, right).]

Finally, when we turn to two of the sectors that have traditionally been considered highly concentrated, 2630 and 2213, we observe that our results confirm this impression, as we can see in Figures 7 and 8. Both sectors present the highest level of intensity according to the M_TM value, and both present high levels of concentration at short distances (Figures 7 and 8, left). [16] The same result is obtained using the K-density function (Figures 7 and 8, right). In each sector, however, the subsequent behaviour of the M_TM functions is very different as we expand the range of interaction considered, that is, as the radius becomes larger. In sector 2630, the M_TM value decreases with increasing distance as quickly as it had risen at short radii. In sector 2213, this fast decrease does not appear.

[16] Note that the scale of these two graphs is different because they represent the sectors with the highest level of concentration.

[Figure 7. Comparison of M_TM functions and K-density of sector 2630, Manufacture of ceramic tiles and flags: M_TM functions (left), K-density (right).]

The rapid growth of the M_TM function and the consequent high values of the M_TM marginal function in these last two sectors indicate that the density of establishments at small distances is very high. If we compare Figures 6, 7 and 8 (left), we can conclude that the growth of the M_TM function is much slower in sectors 1725 and 2954. This means that the distance between establishments within the clusters of sectors 1725 and 2954 is larger than the distance between those of sectors 2630 and 2213. Furthermore, the high intensity of the M_TM function may give us an indication of the approximate number of clusters that constitute each sector: high intensity indicates that the sectors considered have few clusters, or only one.

If we interpret the values obtained with the M functions and the K-density function, we realise that the three functions clearly capture the existence of concentration. Specifically, if we look at Figure 7, we can say, thanks to the three functions, that this is a highly concentrated sector at short distances. In addition, the M marginal function detects local clusters and gives us more information about them, specifically about their size or range. Looking at the graph on the left, we can also say that this is a sector with few clusters, one or two at most, all of them situated close to the border. We can deduce this because, as we concluded in the preceding paragraph, the high intensity of the M_TM function indicates the existence of few clusters, and a very pronounced decrease of the M function indicates a border effect. If we consider the graph on the right, the values of the K-density function assure us that there is only one cluster, although with this function we cannot know where it is located.

If we now look at Figure 8, we can also say that this is a highly concentrated sector at short distances.
The high intensity of the M_TM function also indicates the existence of few clusters, and the behaviour of this function at larger distances indicates, this time, that at least one of the clusters that constitute the sector is far from the border. Finally, the behaviour of the M functions lets us clarify the approximate location of establishments. Thus, we can conclude that most of the establishments of this sector are located within the cluster that is far from the border, since the drop of this function is not very pronounced.

[Figure 8. Comparison of M_TM functions and K-density of sector 2213, Publishing of journals and periodicals: M_TM functions (left), K-density (right).]

In fact, sector 2630 has most of its establishments (78%) concentrated in a single location: there is a well-defined cluster in the province of Castellón, i.e. very close to the border. The remaining 22% is located in the rest of the Spanish territory. Sector 2213 concentrates 82% of its establishments in two locations: a cluster in Madrid that contains 57% of the establishments of the industry (far from the border) and another in Barcelona that contains 25% (close to the border). Therefore, given that both sectors have almost the same share of establishments within clusters, 78% versus 82%, the only reason why the M_TM value decreases so quickly in sector 2630, and not in sector 2213, is the edge effect. In sector 2630, a high percentage of establishments (78%) is located close to the border. Although sector 2213 also has a cluster close to the border, it represents only 25% of the total establishments in the industry, and the edge effect it can cause is minimized by the location of the rest of the establishments.

One last point remains to be addressed in relation to the total distance considered in our analysis. As pointed out by Marcon and Puech (2003) and by Duranton and Overman (2005) themselves, the K-density function has a peculiarity in its method of construction, discussed in the methodology section.
As a consequence, Duranton and Overman (2005) limited the relevant distance, in empirical studies, to the median distance between all pairs of establishments. Thus, in the specific case of their empirical application to the whole of Great Britain, the distance considered was up to 180 km. In our case, the restriction of Ripley's K function to radii smaller than 25% of the smaller side length of the area analysed limits the relevant radius to approximately 200 km. In both cases, the distance ranges are economically pertinent. So, if we restrict our analysis to 200 km, the distance up to which the results of both functions are reliable, the information that both functions provide is very similar. In fact, the M_TM marginal function can provide even more information than the K-density function.

Summarising the specific information provided by the M_TM marginal function and the K-density function, we find, on the one hand and as common points, that both functions clearly detect the existence of concentration and also identify local clusters, because the M_TM marginal function is non-cumulative. On the other hand, and as the key aspect of complementarity, the M_TM marginal function detects the size of the clusters and also their position in the study area. This information is not provided by the K-density function. Last, both the M_TM marginal function and the K-density function may confirm the number of clusters that constitute each sector and the distance between them, as long as the clusters are close to each other. This depends on the distance analysed, which in turn depends on the restrictions of the two measures of concentration discussed previously.

It should be noted that the number of sectors that present concentration or dispersion patterns is the same with both methodologies, M_TM marginal and K-density. This confirms the robustness of the two measures.
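The distance limit attributed above to Duranton and Overman (2005) can be made concrete with a short sketch. It is an assumption-laden illustration, not the estimator used in the paper: it builds a plain Gaussian kernel density of pairwise distances on simulated uniform points, uses a normal-reference rule of thumb for the bandwidth, omits any boundary treatment near zero distance, and evaluates the density only up to the median pairwise distance. The function names are hypothetical.

```python
import numpy as np


def pairwise_distances(points):
    """Distances between all unordered pairs of points."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return dist[np.triu_indices(len(points), k=1)]


def distance_density(pair_d, grid, bandwidth=None):
    """Gaussian-kernel density of pairwise distances evaluated on grid."""
    if bandwidth is None:
        # Assumption: normal-reference rule of thumb for the bandwidth.
        bandwidth = 1.06 * pair_d.std(ddof=1) * len(pair_d) ** (-1.0 / 5.0)
    z = (grid[:, None] - pair_d[None, :]) / bandwidth
    kernel_sum = np.exp(-0.5 * z ** 2).sum(axis=1)
    return kernel_sum / (len(pair_d) * bandwidth * np.sqrt(2.0 * np.pi))


rng = np.random.default_rng(1)
points = rng.random((200, 2))        # simulated establishments, unit square
d = pairwise_distances(points)
d_max = np.median(d)                 # upper limit: median pairwise distance
grid = np.linspace(0.0, d_max, 50)
density = distance_density(d, grid)
print(f"median pairwise distance: {d_max:.3f}")
```

The design point is that this density counts how often each distance occurs among pairs of establishments, whereas a Ripley-type function counts neighbours within a radius, so the two limits on distance (the median pairwise distance and 25% of the smaller side length) arise from different constructions.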
Moreover, it cannot be otherwise, since a relationship exists between the number of repetitions of a specific distance, a concept from the K-density function, and the number of establishments at that specific distance, a concept from Ripley's K function.

5. Conclusions

This paper incorporates an extension of Ripley's K function, which we have named the M and M marginal functions, to measure the concentration of economic activity more precisely and to better capture the peculiarities in the formation of clusters that result from the intensity of centripetal and centrifugal forces and the trade-off between them. The M functions (M in levels and its rate of variation, M marginal) belong to the tradition of spatial statistics but move closer to the spatial economics approach. The two are distance-based methods that can detect the spatial location pattern at every scale, let us know the distance at which significant concentration appears and test the significance of


More information

Data Science Unit. Global DTM Support Team, HQ Geneva

Data Science Unit. Global DTM Support Team, HQ Geneva NET FLUX VISUALISATION FOR FLOW MONITORING DATA Data Science Unit Global DTM Support Team, HQ Geneva March 2018 Summary This annex seeks to explain the way in which Flow Monitoring data collected by the

More information

CROSS-COUNTRY DIFFERENCES IN PRODUCTIVITY: THE ROLE OF ALLOCATION AND SELECTION

CROSS-COUNTRY DIFFERENCES IN PRODUCTIVITY: THE ROLE OF ALLOCATION AND SELECTION ONLINE APPENDIX CROSS-COUNTRY DIFFERENCES IN PRODUCTIVITY: THE ROLE OF ALLOCATION AND SELECTION By ERIC BARTELSMAN, JOHN HALTIWANGER AND STEFANO SCARPETTA This appendix presents a detailed sensitivity

More information

Chapter 1 Statistical Inference

Chapter 1 Statistical Inference Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations

More information

CHOOSING THE RIGHT SAMPLING TECHNIQUE FOR YOUR RESEARCH. Awanis Ku Ishak, PhD SBM

CHOOSING THE RIGHT SAMPLING TECHNIQUE FOR YOUR RESEARCH. Awanis Ku Ishak, PhD SBM CHOOSING THE RIGHT SAMPLING TECHNIQUE FOR YOUR RESEARCH Awanis Ku Ishak, PhD SBM Sampling The process of selecting a number of individuals for a study in such a way that the individuals represent the larger

More information

Trends in the Relative Distribution of Wages by Gender and Cohorts in Brazil ( )

Trends in the Relative Distribution of Wages by Gender and Cohorts in Brazil ( ) Trends in the Relative Distribution of Wages by Gender and Cohorts in Brazil (1981-2005) Ana Maria Hermeto Camilo de Oliveira Affiliation: CEDEPLAR/UFMG Address: Av. Antônio Carlos, 6627 FACE/UFMG Belo

More information

Spatial Competition and the Structure of Retail Markets

Spatial Competition and the Structure of Retail Markets 18 th World IMACS / MODSIM Congress, Cairns, Australia 13-17 July 2009 http://mssanz.org.au/modsim09 Spatial Competition and the Structure of Retail Markets Dr S. Beare, S. Szakiel Concept Economics, Australian

More information

MULTIPLE CHOICE QUESTIONS DECISION SCIENCE

MULTIPLE CHOICE QUESTIONS DECISION SCIENCE MULTIPLE CHOICE QUESTIONS DECISION SCIENCE 1. Decision Science approach is a. Multi-disciplinary b. Scientific c. Intuitive 2. For analyzing a problem, decision-makers should study a. Its qualitative aspects

More information

Chapter 9: The Regression Model with Qualitative Information: Binary Variables (Dummies)

Chapter 9: The Regression Model with Qualitative Information: Binary Variables (Dummies) Chapter 9: The Regression Model with Qualitative Information: Binary Variables (Dummies) Statistics and Introduction to Econometrics M. Angeles Carnero Departamento de Fundamentos del Análisis Económico

More information

Construction Modelling of Input-Output Coefficients Matrices in an Axiomatic Context: Some Further Considerations.

Construction Modelling of Input-Output Coefficients Matrices in an Axiomatic Context: Some Further Considerations. 1 Construction Modelling of Input-Output Coefficients Matrices in an Axiomatic Context: Some Further Considerations. José Manuel Rueda-Cantuche Universidad Pablo de Olavide Seville - Spain XIVth International

More information

GRAPHIC REALIZATIONS OF SEQUENCES. Under the direction of Dr. John S. Caughman

GRAPHIC REALIZATIONS OF SEQUENCES. Under the direction of Dr. John S. Caughman GRAPHIC REALIZATIONS OF SEQUENCES JOSEPH RICHARDS Under the direction of Dr. John S. Caughman A Math 501 Project Submitted in partial fulfillment of the requirements for the degree of Master of Science

More information

Performance Comparison of K-Means and Expectation Maximization with Gaussian Mixture Models for Clustering EE6540 Final Project

Performance Comparison of K-Means and Expectation Maximization with Gaussian Mixture Models for Clustering EE6540 Final Project Performance Comparison of K-Means and Expectation Maximization with Gaussian Mixture Models for Clustering EE6540 Final Project Devin Cornell & Sushruth Sastry May 2015 1 Abstract In this article, we explore

More information

Intersecting Two Lines, Part One

Intersecting Two Lines, Part One Module 1.4 Page 118 of 1124. Module 1.4: Intersecting Two Lines, Part One This module will explain to you several common methods used for intersecting two lines. By this, we mean finding the point x, y)

More information

CHAPTER 8 INTRODUCTION TO STATISTICAL ANALYSIS

CHAPTER 8 INTRODUCTION TO STATISTICAL ANALYSIS CHAPTER 8 INTRODUCTION TO STATISTICAL ANALYSIS LEARNING OBJECTIVES: After studying this chapter, a student should understand: notation used in statistics; how to represent variables in a mathematical form

More information

Team Production and the Allocation of Creativity across Global and Local Sectors

Team Production and the Allocation of Creativity across Global and Local Sectors RIETI Discussion Paper Series 15-E-111 Team Production and the Allocation of Creativity across Global and Local Sectors NAGAMACHI Kohei Kagawa University The Research Institute of Economy, Trade and Industry

More information

Diversity partitioning without statistical independence of alpha and beta

Diversity partitioning without statistical independence of alpha and beta 1964 Ecology, Vol. 91, No. 7 Ecology, 91(7), 2010, pp. 1964 1969 Ó 2010 by the Ecological Society of America Diversity partitioning without statistical independence of alpha and beta JOSEPH A. VEECH 1,3

More information

An adapted intensity estimator for linear networks with an application to modelling anti-social behaviour in an urban environment

An adapted intensity estimator for linear networks with an application to modelling anti-social behaviour in an urban environment An adapted intensity estimator for linear networks with an application to modelling anti-social behaviour in an urban environment M. M. Moradi 1,2,, F. J. Rodríguez-Cortés 2 and J. Mateu 2 1 Institute

More information

Sequences and Series

Sequences and Series Sequences and Series What do you think of when you read the title of our next unit? In case your answers are leading us off track, let's review the following IB problems. 1 November 2013 HL 2 3 November

More information

22: Applications of Differential Calculus

22: Applications of Differential Calculus 22: Applications of Differential Calculus A: Time Rate of Change The most common use of calculus (the one that motivated our discussions of the previous chapter) are those that involve change in some quantity

More information

IENG581 Design and Analysis of Experiments INTRODUCTION

IENG581 Design and Analysis of Experiments INTRODUCTION Experimental Design IENG581 Design and Analysis of Experiments INTRODUCTION Experiments are performed by investigators in virtually all fields of inquiry, usually to discover something about a particular

More information

Mapping Welsh Neighbourhood Types. Dr Scott Orford Wales Institute for Social and Economic Research, Data and Methods WISERD

Mapping Welsh Neighbourhood Types. Dr Scott Orford Wales Institute for Social and Economic Research, Data and Methods WISERD Mapping Welsh Neighbourhood Types Dr Scott Orford Wales Institute for Social and Economic Research, Data and Methods WISERD orfords@cardiff.ac.uk WISERD Established in 2008 and funded by the ESRC and HEFCW

More information

Commuting, Migration, and Rural Development

Commuting, Migration, and Rural Development MPRA Munich Personal RePEc Archive Commuting, Migration, and Rural Development Ayele Gelan Socio-economic Research Program, The Macaulay Institute, Aberdeen, UK 003 Online at http://mpra.ub.uni-muenchen.de/903/

More information

EXPLORING THE SPATIAL AND TEMPORAL PATTERNS OF BUSINESS CONCENTRATION AND DISPERSION: A CASE-STUDY ANALYSIS FOR COUNTY OF SANTA BARBARA

EXPLORING THE SPATIAL AND TEMPORAL PATTERNS OF BUSINESS CONCENTRATION AND DISPERSION: A CASE-STUDY ANALYSIS FOR COUNTY OF SANTA BARBARA EXPLORING THE SPATIAL AND TEMPORAL PATTERNS OF BUSINESS CONCENTRATION AND DISPERSION: A CASE-STUDY ANALYSIS FOR COUNTY OF SANTA BARBARA Srinath Ravulaparthy Graduate Student, Department of Geography and

More information

(Refer Slide Time: 1:42)

(Refer Slide Time: 1:42) Control Engineering Prof. Madan Gopal Department of Electrical Engineering Indian Institute of Technology, Delhi Lecture - 21 Basic Principles of Feedback Control (Contd..) Friends, let me get started

More information

TECHNICAL REPORT # 59 MAY Interim sample size recalculation for linear and logistic regression models: a comprehensive Monte-Carlo study

TECHNICAL REPORT # 59 MAY Interim sample size recalculation for linear and logistic regression models: a comprehensive Monte-Carlo study TECHNICAL REPORT # 59 MAY 2013 Interim sample size recalculation for linear and logistic regression models: a comprehensive Monte-Carlo study Sergey Tarima, Peng He, Tao Wang, Aniko Szabo Division of Biostatistics,

More information

Physics 509: Non-Parametric Statistics and Correlation Testing

Physics 509: Non-Parametric Statistics and Correlation Testing Physics 509: Non-Parametric Statistics and Correlation Testing Scott Oser Lecture #19 Physics 509 1 What is non-parametric statistics? Non-parametric statistics is the application of statistical tests

More information

Probabilities & Statistics Revision

Probabilities & Statistics Revision Probabilities & Statistics Revision Christopher Ting Christopher Ting http://www.mysmu.edu/faculty/christophert/ : christopherting@smu.edu.sg : 6828 0364 : LKCSB 5036 January 6, 2017 Christopher Ting QF

More information

Answers to questions for The New Introduction to Geographical Economics, 2 nd edition

Answers to questions for The New Introduction to Geographical Economics, 2 nd edition Answers to questions for The New Introduction to Geographical Economics, 2 nd edition Chapter 2 Geography and economic theory Question 2.1* Assume a trade model with transportation costs but without increasing

More information

Examine characteristics of a sample and make inferences about the population

Examine characteristics of a sample and make inferences about the population Chapter 11 Introduction to Inferential Analysis Learning Objectives Understand inferential statistics Explain the difference between a population and a sample Explain the difference between parameter and

More information

Some Notes on Linear Algebra

Some Notes on Linear Algebra Some Notes on Linear Algebra prepared for a first course in differential equations Thomas L Scofield Department of Mathematics and Statistics Calvin College 1998 1 The purpose of these notes is to present

More information

Stat 502X Exam 2 Spring 2014

Stat 502X Exam 2 Spring 2014 Stat 502X Exam 2 Spring 2014 I have neither given nor received unauthorized assistance on this exam. Name Signed Date Name Printed This exam consists of 12 parts. I'll score it at 10 points per problem/part

More information

Glossary. Appendix G AAG-SAM APP G

Glossary. Appendix G AAG-SAM APP G Appendix G Glossary Glossary 159 G.1 This glossary summarizes definitions of the terms related to audit sampling used in this guide. It does not contain definitions of common audit terms. Related terms

More information

BTRY 4090: Spring 2009 Theory of Statistics

BTRY 4090: Spring 2009 Theory of Statistics BTRY 4090: Spring 2009 Theory of Statistics Guozhang Wang September 25, 2010 1 Review of Probability We begin with a real example of using probability to solve computationally intensive (or infeasible)

More information

Filling in the Blanks: Network Structure and Systemic Risk

Filling in the Blanks: Network Structure and Systemic Risk Filling in the Blanks: Network Structure and Systemic Risk Kartik Anand, Bank of Canada Ben Craig, Federal Reserve Bank of Cleveland & Deutsche Bundesbank Goetz von Peter, Bank for International Settlements

More information

CoDa-dendrogram: A new exploratory tool. 2 Dept. Informàtica i Matemàtica Aplicada, Universitat de Girona, Spain;

CoDa-dendrogram: A new exploratory tool. 2 Dept. Informàtica i Matemàtica Aplicada, Universitat de Girona, Spain; CoDa-dendrogram: A new exploratory tool J.J. Egozcue 1, and V. Pawlowsky-Glahn 2 1 Dept. Matemàtica Aplicada III, Universitat Politècnica de Catalunya, Barcelona, Spain; juan.jose.egozcue@upc.edu 2 Dept.

More information

Solution: chapter 2, problem 5, part a:

Solution: chapter 2, problem 5, part a: Learning Chap. 4/8/ 5:38 page Solution: chapter, problem 5, part a: Let y be the observed value of a sampling from a normal distribution with mean µ and standard deviation. We ll reserve µ for the estimator

More information

The Perceptron algorithm

The Perceptron algorithm The Perceptron algorithm Tirgul 3 November 2016 Agnostic PAC Learnability A hypothesis class H is agnostic PAC learnable if there exists a function m H : 0,1 2 N and a learning algorithm with the following

More information

Stochastic dominance with imprecise information

Stochastic dominance with imprecise information Stochastic dominance with imprecise information Ignacio Montes, Enrique Miranda, Susana Montes University of Oviedo, Dep. of Statistics and Operations Research. Abstract Stochastic dominance, which is

More information

Modeling firms locational choice

Modeling firms locational choice Modeling firms locational choice Giulio Bottazzi DIMETIC School Pécs, 05 July 2010 Agglomeration derive from some form of externality. Drivers of agglomeration can be of two types: pecuniary and non-pecuniary.

More information

Settlements are the visible imprint made by the man upon the physical

Settlements are the visible imprint made by the man upon the physical Settlements are the visible imprint made by the man upon the physical landscape through the process of cultural occupancy. It is manmade colony of human being in which they live, work, and move to pursue

More information

Industrial Engineering Prof. Inderdeep Singh Department of Mechanical & Industrial Engineering Indian Institute of Technology, Roorkee

Industrial Engineering Prof. Inderdeep Singh Department of Mechanical & Industrial Engineering Indian Institute of Technology, Roorkee Industrial Engineering Prof. Inderdeep Singh Department of Mechanical & Industrial Engineering Indian Institute of Technology, Roorkee Module - 04 Lecture - 05 Sales Forecasting - II A very warm welcome

More information

Summary statistics. G.S. Questa, L. Trapani. MSc Induction - Summary statistics 1

Summary statistics. G.S. Questa, L. Trapani. MSc Induction - Summary statistics 1 Summary statistics 1. Visualize data 2. Mean, median, mode and percentiles, variance, standard deviation 3. Frequency distribution. Skewness 4. Covariance and correlation 5. Autocorrelation MSc Induction

More information

Mathematics (Project Maths)

Mathematics (Project Maths) Pre-Leaving Certificate Examination Mathematics (Project Maths) Paper Ordinary Level (with solutions) February 00 ½ hours 00 marks Running total Examination number Centre stamp For examiner Question Mark

More information

A Rothschild-Stiglitz approach to Bayesian persuasion

A Rothschild-Stiglitz approach to Bayesian persuasion A Rothschild-Stiglitz approach to Bayesian persuasion Matthew Gentzkow and Emir Kamenica Stanford University and University of Chicago December 2015 Abstract Rothschild and Stiglitz (1970) represent random

More information

The ambiguous impact of contracts on competition in the electricity market Yves Smeers

The ambiguous impact of contracts on competition in the electricity market Yves Smeers The ambiguous impact of contracts on competition in the electricity market Yves Smeers joint work with Frederic Murphy Climate Policy and Long Term Decisions-Investment and R&D, Bocconi University, Milan,

More information

LECTURE 14 WALRASIAN EQUILIBRIUM IN THE CASE OF A SINGLE LABOR SECTOR

LECTURE 14 WALRASIAN EQUILIBRIUM IN THE CASE OF A SINGLE LABOR SECTOR LECTURE 14 WALRASIAN EQUILIBRIUM IN THE CASE OF A SINGLE LABOR SECTOR JACOB T. SCHWARTZ EDITED BY KENNETH R. DRIESSEL Abstract. We continue to develop the consequences of the definition of a price equilibrium

More information

Fundamentals of Operations Research. Prof. G. Srinivasan. Indian Institute of Technology Madras. Lecture No. # 15

Fundamentals of Operations Research. Prof. G. Srinivasan. Indian Institute of Technology Madras. Lecture No. # 15 Fundamentals of Operations Research Prof. G. Srinivasan Indian Institute of Technology Madras Lecture No. # 15 Transportation Problem - Other Issues Assignment Problem - Introduction In the last lecture

More information

University of Groningen. Statistical Auditing and the AOQL-method Talens, Erik

University of Groningen. Statistical Auditing and the AOQL-method Talens, Erik University of Groningen Statistical Auditing and the AOQL-method Talens, Erik IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from it. Please check

More information

Pattern Recognition Prof. P. S. Sastry Department of Electronics and Communication Engineering Indian Institute of Science, Bangalore

Pattern Recognition Prof. P. S. Sastry Department of Electronics and Communication Engineering Indian Institute of Science, Bangalore Pattern Recognition Prof. P. S. Sastry Department of Electronics and Communication Engineering Indian Institute of Science, Bangalore Lecture - 27 Multilayer Feedforward Neural networks with Sigmoidal

More information

Tutorial on Mathematical Induction

Tutorial on Mathematical Induction Tutorial on Mathematical Induction Roy Overbeek VU University Amsterdam Department of Computer Science r.overbeek@student.vu.nl April 22, 2014 1 Dominoes: from case-by-case to induction Suppose that you

More information

Spatial Point Pattern Analysis

Spatial Point Pattern Analysis Spatial Point Pattern Analysis Jiquan Chen Prof of Ecology, University of Toledo EEES698/MATH5798, UT Point variables in nature A point process is a discrete stochastic process of which the underlying

More information

Coexistence of Dynamics for Two- Dimensional Cellular Automata

Coexistence of Dynamics for Two- Dimensional Cellular Automata Coexistence of Dynamics for Two- Dimensional Cellular Automata Ricardo Severino Department of Mathematics and Applications University of Minho Campus de Gualtar - 4710-057 Braga, Portugal Maria Joana Soares

More information

3 Non-linearities and Dummy Variables

3 Non-linearities and Dummy Variables 3 Non-linearities and Dummy Variables Reading: Kennedy (1998) A Guide to Econometrics, Chapters 3, 5 and 6 Aim: The aim of this section is to introduce students to ways of dealing with non-linearities

More information

Points. Luc Anselin. Copyright 2017 by Luc Anselin, All Rights Reserved

Points. Luc Anselin.   Copyright 2017 by Luc Anselin, All Rights Reserved Points Luc Anselin http://spatial.uchicago.edu 1 classic point pattern analysis spatial randomness intensity distance-based statistics points on networks 2 Classic Point Pattern Analysis 3 Classic Examples

More information

Modeling Uncertainty in the Earth Sciences Jef Caers Stanford University

Modeling Uncertainty in the Earth Sciences Jef Caers Stanford University Probability theory and statistical analysis: a review Modeling Uncertainty in the Earth Sciences Jef Caers Stanford University Concepts assumed known Histograms, mean, median, spread, quantiles Probability,

More information

Exploring the Detailed Location Patterns of UK Manufacturing Industries Using Microgeographic Data

Exploring the Detailed Location Patterns of UK Manufacturing Industries Using Microgeographic Data University of Pennsylvania ScholarlyCommons Real Estate Papers Wharton Faculty Research 2-2008 Exploring the etailed Location Patterns of UK Manufacturing Industries Using Microgeographic ata Gilles uranton

More information