Test of Complete Spatial Randomness on Networks

A PROJECT SUBMITTED TO THE FACULTY OF THE GRADUATE SCHOOL OF THE UNIVERSITY OF MINNESOTA BY

Xinyue Chang

IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF APPLIED AND COMPUTATIONAL MATHEMATICS

YANG LI

May, 2016

© Xinyue Chang 2016 ALL RIGHTS RESERVED

Acknowledgements

Firstly, I would like to thank my advisor Professor Yang Li for his incredible support, guidance, and encouragement on my project and graduate study. I would also like to thank Professor Barry James and Professor Haiyang Wang for serving as my committee members and for their time reading my project report. Last but not least, I am very grateful to Professor Kang James for her valuable suggestions and comments in the statistical seminar class.

Abstract

A test of complete spatial randomness (CSR) is an essential part of spatial analysis and is regarded as a minimal prerequisite to any serious attempt to model an observed point pattern. It has been investigated, discussed, and verified for planar regions by researchers for more than 40 years. Recently, more and more data from spatial point processes on networks have been collected. This project aimed to apply the CSR test method to any spatial point pattern on a network. The study started with the derivation of the cumulative distribution function (CDF) of inter-event distances between two locations randomly distributed on a grid network. We then carried out a test procedure based on Monte Carlo simulation, considering both inter-event distances and nearest-neighbor distances. It was found that this method worked well when the process was constrained to a network. Finally, the car accident pattern on the Minnesota major roads network was tested by both the inter-event distances method and the nearest-neighbor distances method.

Contents

Acknowledgements
Abstract
List of Tables
List of Figures

1 Introduction
  1.1 Background and Organization
  1.2 Definitions
    1.2.1 Complete Spatial Randomness
    1.2.2 Spatial Point Processes on Networks
    1.2.3 Inter-event Distances
    1.2.4 CSR Test Based on Inter-event Distances
    1.2.5 Nearest-neighbor Distances
    1.2.6 CSR Test Based on Nearest-neighbor Distances
2 CDF of Inter-event Distances on a Grid Network under CSR
  2.1 Preliminary
  2.2 CDF of Inter-event Distances if t < 1
    2.2.1 Cumulative Distribution Function
    2.2.2 Simulation
  2.3 CDF of Inter-event Distances if t > 1
    2.3.1 Cumulative Distribution Function
    2.3.2 Simulation
3 CSR Test Based on Inter-event Distances
  3.1 CSR Test Implementation
  3.2 Simulations
    3.2.1 Random Process
    3.2.2 Cluster Process
    3.2.3 Regular Process
  3.3 Conclusion
4 CSR Test Based on Nearest-neighbor Distances
  4.1 CSR Test Implementation
  4.2 Simulations
    4.2.1 Random Process
    4.2.2 Cluster Process
    4.2.3 Regular Process
  4.3 Conclusion
5 Car Crash Point Pattern on the Minnesota Major Roads
  5.1 Dataset
  5.2 Implementation
    5.2.1 CSR Test Based on Inter-event Distances
    5.2.2 CSR Test Based on Nearest-neighbor Distances
  5.3 Result and Analysis
6 Conclusion
References
Appendix A. Glossary and Acronyms
  A.1 Glossary
  A.2 Acronyms
Appendix B. Code
  B.1 R Code
    B.1.1 Random Process
    B.1.2 Cluster Process
    B.1.3 Regular Process
    B.1.4 Random Network
    B.1.5 Car Crash Pattern on the MN Roads
  B.2 Python Code

List of Tables

A.1 Acronyms

List of Figures

2.1 An example of a regular grid network with m = n = 11
2.2 Four possible locations of two arbitrary points on the 5 × 5 grid
2.3 The grid network
2.4 Simulation result and plot for CDF when t < 1 (blue is the theoretical function)
2.5 The grid network
2.6 Simulation result and plot for CDF when t > 1 (blue is the theoretical function)
3.1 Random point pattern on the grid network
3.2 Envelope plot for random process on the grid network
3.3 Random point pattern on a random network
3.4 Envelope plot for random process on a random network
3.5 Cluster point pattern on the grid network
3.6 Envelope plot for cluster process on the grid network
3.7 Cluster point pattern on a random network
3.8 Envelope plot for cluster process on a random network
3.9 Regular point pattern on the grid network
3.10 Envelope plot for regular process on the grid network
3.11 Regular point pattern on a random network
3.12 Envelope plot for regular process on a random network
4.1 Grid network and random point pattern
4.2 Envelope plot for random process on grid network
4.3 Random network and random point pattern
4.4 Envelope plot for random process on random network
4.5 Grid network and cluster point pattern
4.6 Envelope plot for cluster process on grid network
4.7 Random network and cluster point pattern
4.8 Envelope plot for cluster process on random network
4.9 Grid network and regular point pattern
4.10 Envelope plot for regular process on grid network
4.11 Random network and regular point pattern
4.12 Envelope plot for regular process on random network
5.1 Location of fatal crashes in Minnesota
5.2 R plot of Minnesota Major Roads Network
5.3 R plot of the car crash pattern on the Minnesota major roads
5.4 Display of the car crash pattern on the Minnesota major roads in ArcGIS
5.5 R plot of a CSR point pattern on the Minnesota major roads
5.6 Display of a CSR point pattern on the Minnesota major roads in ArcGIS
5.7 Envelope plot for CSR test for car crash pattern by inter-event method
5.8 Envelope plot for CSR test of car crash pattern by nearest-neighbor method

Chapter 1

Introduction

1.1 Background and Organization

Practical investigations in ecology, epidemiology, and transportation often involve observing and studying the spatial distribution of events. Researchers who want to classify a spatial point pattern first need to know whether it exhibits complete spatial randomness (CSR). Methods for testing CSR in spatial point processes have therefore drawn increasing attention. The techniques proposed for detecting non-randomness may be divided broadly into two groups, quadrat methods and distance methods [1]. The power of randomness tests, particularly tests based on nearest-neighbor distances, inter-point distances, and estimators of moment measures, was investigated in [2]. Some papers have also developed methods beyond distances and quadrats. In [3], the author introduced a test of spatial randomness based on the angles between the vectors joining each sample point to its nearest neighbors. A method of qualifying spatial pattern in which sample points move to a regular arrangement resembling a hexagonal lattice was discussed in [4]. To explore the performance of the CSR test further, [5] presents results confirmed by ecological data and illustrates that the tests without edge-effect correction proposed by Diggle have higher power for small sample sizes.

All of these works assume that point events can be located anywhere in a planar region. In some practical scenarios, however, points can only be located on the edges of a specific network. For example, car crash locations lie on roads, which form a road network. The CSR test then becomes more complicated, because inter-event distances are no longer Euclidean and must follow the geometry of the network.

Motivated by this concern, this thesis discusses CSR tests based on inter-event distances and nearest-neighbor distances [6] in the network setting and verifies that they are applicable to network point patterns. The result is confirmed with three point processes simulated on a grid network and on a random network.

The test based on inter-event distances would be precise and simple to implement if the theoretical cumulative distribution function (CDF) of inter-event distances under CSR were known. Closed-form results are already available for the most common planar cases, square and circular regions. For a square of unit side, the distribution function of inter-event distances is

\[
H(t) =
\begin{cases}
\pi t^2 - 8t^3/3 + t^4/2, & 0 \le t \le 1, \\
1/3 - 2t^2 - t^4/2 + 4(t^2-1)^{1/2}(2t^2+1)/3 + 2t^2 \arcsin(2t^{-2}-1), & 1 < t \le \sqrt{2}.
\end{cases}
\]

For a circle of unit radius the corresponding expression is

\[
H(t) = 1 + \pi^{-1}\left\{ 2(t^2-1)\arccos(t/2) - t(1+t^2/2)\sqrt{1-t^2/4} \right\}
\]

for all $0 \le t \le 2$ [6].

For a CSR point pattern on a network, distances depend on the geometry of the network, so the planar results above no longer apply. The CDF of inter-event distances for a CSR point pattern on a network is therefore a central object of this network-oriented research. The thesis is organized as follows.

Chapter 2 describes how to derive the cumulative distribution function of inter-event distances for a CSR point pattern on a grid network.

In Chapter 3, the complete spatial randomness test method based on inter-event distances is discussed and summarized for implementation.

In Chapter 4, the complete spatial randomness test method based on nearest-neighbor distances is discussed and summarized for implementation.

Chapter 5 applies the proposed complete spatial randomness test to the 2013 car crash pattern on the Minnesota major roads network.

Chapter 6 presents a final discussion and conclusion of the work presented in the project.

1.2 Definitions

1.2.1 Complete Spatial Randomness

We consider a network denoted as a graph $G = \{V, E\}$, where the set of vertices is $V = \{v_1, v_2, \ldots, v_l\}$ and the set of edges is $E = \{e_1, e_2, \ldots, e_m\}$. For the sake of simplicity, $G$ is assumed to be connected, which means there is a path between any pair of vertices. Furthermore, $|e_i|$ is the length of edge $e_i$. Let $S = \{s_1, s_2, \ldots, s_n\}$ denote the events of a spatial point pattern on $G$, constrained to lie on the edges.

Note: In graph theory, if the vertices $v_0, v_1, \ldots, v_k$ of the walk $W = v_0 e_1 v_1 e_2 v_2 \cdots e_k v_k$ are distinct, then $W$ is called a path [7]. The edges of the graph considered here are weighted, with each weight representing the distance between the two end vertices.

The definition of complete spatial randomness (CSR) can be extended to the network case as follows [6, 8]:

- For a density $\lambda > 0$ and a finite network $G$, the number of events of $S$, say $|S|$, must follow a Poisson distribution with mean $\lambda |E|$, where $|E| = \sum_{i=1}^{m} |e_i|$.
- Given the number of events, i.e., $|S| = n$, the events are distributed uniformly on the network $G$. That is to say, the $n$ events of $S$ form an independent random sample from the continuous uniform distribution on $E$.

1.2.2 Spatial Point Processes on Networks

A spatial network point pattern, which can be classified as random, regular, or clustered, is a set of locations distributed within an observed network. Each point process model represents a particular kind of deviation from complete spatial randomness.

- A random process is a realization of complete spatial randomness;
- A cluster process has smaller average inter-event distances than CSR and a varying intensity within the network;
- A regular process has greater average inter-event distances than CSR, and points are distributed regularly (every inter-event distance is no less than a specified value $\delta$) within the network.

1.2.3 Inter-event Distances

The following notation is used in later chapters.

- $n$ is the number of events in a spatial point pattern on the network $G$.
- $t_{ij}$ is the shortest inter-event distance between points $i$ and $j$, $i < j$, measured along the edges, in the spatial point pattern $S$.
- $T = \{t_{ij} : i, j = 1, \ldots, n, \text{ and } i < j\}$ is the collection of all inter-event distances from $S$. Clearly, the number of elements in $T$ is $|T| = n(n-1)/2$.

1.2.4 CSR Test Based on Inter-event Distances

Based on the definition, a test of complete spatial randomness addresses whether or not the observed point pattern could plausibly be a realization of a homogeneous Poisson process.

$\hat{H}(t)$ is the empirical distribution function (EDF) of all inter-event distances in $T$ from an observed spatial point pattern $S$ lying on the network $G$,

\[
\hat{H}(t) = \left\{ \tfrac{1}{2} n(n-1) \right\}^{-1} \#(t_{ij} \le t).
\]

$H_i(t)$, $i = 1, 2, \ldots, s$, is the EDF of all inter-event distances in the $i$th independently simulated CSR point pattern on the same network $G$. The average function $\bar{H}(t)$, upper envelope $U_1(t)$, and lower envelope $L_1(t)$ are calculated as follows:

\[
\bar{H}(t) = \frac{1}{s} \sum_{i=1}^{s} H_i(t), \qquad
U_1(t) = \max_{i=1,\ldots,s} \{H_i(t)\}, \qquad
L_1(t) = \min_{i=1,\ldots,s} \{H_i(t)\}.
\]
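The definitions above can be sketched in a few lines of Python. This is only an illustrative sketch, not the project's own code (which is in R): the function names `edf` and `envelopes` are hypothetical, and the inter-event distances are assumed to be already available as plain lists of numbers.

```python
from bisect import bisect_right

def edf(distances):
    """Return the empirical distribution function t -> #(d <= t) / #d."""
    ordered = sorted(distances)
    count = len(ordered)
    return lambda t: bisect_right(ordered, t) / count

def envelopes(simulated, grid):
    """Pointwise average, upper envelope and lower envelope of the
    simulated EDFs H_i(t), evaluated at each t in `grid`."""
    functions = [edf(d) for d in simulated]
    mean, upper, lower = [], [], []
    for t in grid:
        values = [f(t) for f in functions]
        mean.append(sum(values) / len(values))
        upper.append(max(values))
        lower.append(min(values))
    return mean, upper, lower

# Toy usage with made-up distances (s = 2 simulated patterns).
observed = edf([1.0, 2.0, 3.0])
h_bar, u1, l1 = envelopes([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]], [0.5, 1.5, 2.5, 3.5])
```

Evaluating everything on a common grid of t values keeps the observed EDF and the envelopes directly comparable when they are plotted against the average function.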

1.2.5 Nearest-neighbor Distances

- $n$ is the number of events in a spatial point pattern $S$ on the network $G$.
- $t_{ij}$ is the inter-event network distance between points $i$ and $j$, $i < j$, in the pattern $S$ on the network $G$.
- The nearest-neighbor distance of point $i$ is $r_i = \min\{t_{ij} : 1 \le j \le n,\ j \ne i\}$, $i = 1, 2, \ldots, n$, and $R = \{r_i,\ i = 1, 2, \ldots, n\}$. Clearly, the number of elements in $R$ is $|R| = n$.

1.2.6 CSR Test Based on Nearest-neighbor Distances

$\hat{K}(t)$ is the empirical distribution function (EDF) of the nearest-neighbor distances in $R$ from an observed spatial point pattern $S$ lying on $G$,

\[
\hat{K}(t) = n^{-1} \#(r_i \le t).
\]

$K_i(t)$, $i = 1, 2, \ldots, s$, is the EDF of the nearest-neighbor distances in the $i$th independently simulated CSR point pattern on the same network $G$. The average function $\bar{K}(t)$, upper envelope $U_2(t)$, and lower envelope $L_2(t)$ are then calculated as follows:

\[
\bar{K}(t) = \frac{1}{s} \sum_{i=1}^{s} K_i(t), \qquad
U_2(t) = \max_{i=1,\ldots,s} \{K_i(t)\}, \qquad
L_2(t) = \min_{i=1,\ldots,s} \{K_i(t)\}.
\]
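As a small illustration of these definitions, the sketch below computes the nearest-neighbor distances and the EDF from a matrix of network distances. It assumes the symmetric matrix of pairwise distances has already been computed; the function names are hypothetical and chosen only for this example.

```python
def nearest_neighbor_distances(dist):
    """dist is a symmetric n x n list of lists of network distances t_ij.
    Returns r_i = min over j != i of t_ij for every point i."""
    n = len(dist)
    return [min(dist[i][j] for j in range(n) if j != i) for i in range(n)]

def k_hat(dist, t):
    """Empirical distribution function of the nearest-neighbor distances."""
    r = nearest_neighbor_distances(dist)
    return sum(1 for r_i in r if r_i <= t) / len(r)

# Toy usage with three points and made-up distances.
toy = [[0.0, 1.0, 4.0],
       [1.0, 0.0, 2.0],
       [4.0, 2.0, 0.0]]
r = nearest_neighbor_distances(toy)
```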

Chapter 2

CDF of Inter-event Distances on a Grid Network under CSR

Testing complete spatial randomness (CSR) on a network is the primary task in this project. For a spatial point process on a planar network, we are interested in the distribution of the locations of events when the underlying mechanism is completely random. If the theoretical cumulative distribution function (CDF) of inter-event distances for a point pattern under CSR, say $H(t)$, is available, then the empirical distribution function of the observed pattern should be close to $H(t)$ if the pattern is completely random. If there is a significant difference, the observed pattern does not have the CSR property and further investigation should be carried out. In this chapter, I discuss how to derive the CDF of inter-event distances for a CSR point pattern on a regular grid network.

2.1 Preliminary

As stated in Chapter 1, theoretical distributions of inter-event distances are available for some simple cases in planar spatial point processes. For regions with complex boundaries, it is in general impossible to derive the distribution function in a straightforward way. The same situation occurs for a spatial point process on a network. The inter-event distance depends on the geometric structure of the network, and there may be multiple paths connecting two locations. If the network is convoluted, it is extremely challenging to work out an exact theoretical distribution of inter-event distances. In this chapter, we work on a grid network, which has a regular geometric structure.

Figure 2.1: An example of a regular grid network with m = n = 11.

A regular grid network consists of $m$ horizontal lines and $n$ vertical lines with the same spacing in both directions. Figure 2.1 shows an example with $m = n = 11$. Without loss of generality, the spacing is assumed to be 1 in both the horizontal and vertical directions.

The inter-event distance between two locations is defined to be their shortest-path distance allowed by the geometry of the space in which the spatial point process is embedded. For a spatial point pattern on a two-dimensional plane, it is simply the Euclidean distance $\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$ between two locations $(x_1, y_1)$ and $(x_2, y_2)$. For spatial point processes on a network, the shortest-path distance can be challenging to obtain. As shown in Figure 2.2, where two points are located on a $5 \times 5$ regular grid, there are four different cases: both points on horizontal lines; both on vertical lines; the first on a horizontal line and the second on a vertical line; or the first on a vertical line and the second on a horizontal line. We denote by $s_1, s_2$ the locations of two arbitrary CSR points on an $m \times n$ regular grid network.

Furthermore, we define a location function for an arbitrary point,

\[
D(s_i) =
\begin{cases}
h & \text{if } s_i \text{ is on a horizontal line}, \\
v & \text{if } s_i \text{ is on a vertical line}.
\end{cases}
\]

For simpler notation, we write $s_i = h$ for a horizontal point and $s_i = v$ for a vertical point in the probability expressions. This shorthand is notation only and should not be read as a function value.

Figure 2.2: Four possible locations of two arbitrary points on the 5 × 5 grid

To simplify the notation, we also define $r_i = (m_i, n_i)$ to be (i) the nearest vertex to the left of $s_i$ if $s_i = h$; (ii) the nearest vertex below $s_i$ if $s_i = v$. In addition, we define $x$ to be the distance between $s_1$ and $r_1$, and $y$ to be the distance between $s_2$ and $r_2$. Both $x$ and $y$ are between 0 and 1. Based on the definition of CSR (discussed in Section 1.2.1, Complete Spatial Randomness), the two points $s_1$ and $s_2$ are distributed independently, and therefore their related vertices $r_1$ and $r_2$ are also independent. For an $m \times n$ grid, the distributions of the related variables and the distance functions for the four cases are easy to obtain.

In this grid network, the inter-event distance is essentially the taxicab distance, except when both points lie on parallel edges directly opposite each other (cases (1)(b) and (2)(b) below), where the path must detour through a vertex. According to Wikipedia, the taxicab distance $d_1$ between two vectors $p, q$ in an $n$-dimensional real vector space with a fixed Cartesian coordinate system is the sum of the lengths of the projections of the line segment between the points onto the coordinate axes. Therefore, in the plane,

\[
d_1(p, q) = |p_1 - q_1| + |p_2 - q_2|,
\]

where $p = (p_1, p_2)$ and $q = (q_1, q_2)$.

(1) $D(s_1) = h$, $D(s_2) = h$. $m_1, m_2 \sim DU(1, m-1)$; $n_1, n_2 \sim DU(1, n)$; $x, y \sim UNIF(0, 1)$.
  (a) If $m_1 \ne m_2$, $d(s_1, s_2) = |n_1 - n_2| + |m_1 - m_2| + x - y$
  (b) If $m_1 = m_2$, $n_1 \ne n_2$, $d(s_1, s_2) = |n_1 - n_2| + \min(x + y,\ 2 - x - y)$
  (c) If $m_1 = m_2$, $n_1 = n_2$, $d(s_1, s_2) = |x - y|$

(2) $D(s_1) = v$, $D(s_2) = v$. $m_1, m_2 \sim DU(1, m)$; $n_1, n_2 \sim DU(1, n-1)$; $x, y \sim UNIF(0, 1)$.
  (a) If $n_1 \ne n_2$, $d(s_1, s_2) = |m_1 - m_2| + |n_1 - n_2| + x - y$
  (b) If $n_1 = n_2$, $m_1 \ne m_2$, $d(s_1, s_2) = |m_1 - m_2| + \min(x + y,\ 2 - x - y)$
  (c) If $n_1 = n_2$, $m_1 = m_2$, $d(s_1, s_2) = |x - y|$

(3) $D(s_1) = h$, $D(s_2) = v$. $m_1 \sim DU(1, m-1)$; $m_2 \sim DU(1, m)$; $n_1 \sim DU(1, n)$; $n_2 \sim DU(1, n-1)$; $x, y \sim UNIF(0, 1)$.
  (a) If $m_2 > m_1$, $n_2 \ge n_1$, $d(s_1, s_2) = m_2 - m_1 - x + n_2 + y - n_1$
  (b) If $m_2 > m_1$, $n_2 < n_1$, $d(s_1, s_2) = m_2 - m_1 - x + n_1 - n_2 - y$
  (c) If $m_2 \le m_1$, $n_2 \ge n_1$, $d(s_1, s_2) = m_1 + x - m_2 + n_2 + y - n_1$
  (d) If $m_2 \le m_1$, $n_2 < n_1$, $d(s_1, s_2) = m_1 + x - m_2 + n_1 - n_2 - y$

(4) $D(s_1) = v$, $D(s_2) = h$. $m_1 \sim DU(1, m)$; $m_2 \sim DU(1, m-1)$; $n_1 \sim DU(1, n-1)$; $n_2 \sim DU(1, n)$; $x, y \sim UNIF(0, 1)$.
  (a) If $n_2 > n_1$, $m_2 \ge m_1$, $d(s_1, s_2) = m_2 - m_1 + y + n_2 - n_1 - x$
  (b) If $n_2 > n_1$, $m_2 < m_1$, $d(s_1, s_2) = m_1 - m_2 - y + n_2 - n_1 - x$
  (c) If $n_2 \le n_1$, $m_2 \ge m_1$, $d(s_1, s_2) = n_1 + x - n_2 + m_2 + y - m_1$
  (d) If $n_2 \le n_1$, $m_2 < m_1$, $d(s_1, s_2) = m_1 - m_2 - y + n_1 + x - n_2$

Here $DU(a, b)$ denotes the discrete uniform distribution on $\{a, a+1, \ldots, b-1, b\}$; $UNIF(a, b)$ denotes the continuous uniform distribution on the interval $(a, b)$; and $d(s_1, s_2)$ is the shortest-path distance between locations $s_1$ and $s_2$.

2.2 CDF of Inter-event Distances if t < 1

2.2.1 Cumulative Distribution Function

We start by calculating the CDF for inter-event distances less than one. We would like to obtain the CDF, $P(d \le t)$ where $t < 1$, by conditioning on the four cases stated above.

(1) $s_1 = h$, $s_2 = h$. $m_1, m_2 \sim DU(1, m-1)$; $n_1, n_2 \sim DU(1, n)$; $x, y \sim UNIF(0, 1)$.

(a) $P(d \le t \mid s_1, s_2 = h,\ m_1 \ne m_2) = P(|n_1 - n_2| + |m_1 - m_2| + x - y < t \mid s_1, s_2 = h,\ m_1 \ne m_2)$. Since $t < 1$ and $x - y \in (-1, 1)$, both $n_1 = n_2$ and $|m_1 - m_2| = 1$ must be satisfied. We have

\[
\begin{aligned}
P(d \le t \mid s_1, s_2 = h,\ m_1 \ne m_2)
&= P(n_1 = n_2 \mid s_1, s_2 = h)\, P(|m_1 - m_2| = 1 \mid s_1, s_2 = h,\ m_1 \ne m_2)\, P(x - y \le t - 1) \\
&= \left[\frac{1}{n}\right] \left[\frac{2(m-2)}{(m-1)^2 - (m-1)}\right] \left[\frac{1}{2}(1 + t - 1)^2\right]
= \frac{t^2}{n(m-1)}.
\end{aligned}
\]

To obtain $P(d \le t \mid s_1, s_2 = h)$ by the Total Probability Theorem, $P(m_1 \ne m_2 \mid s_1, s_2 = h) = (m-2)/(m-1)$ is needed in the later calculation.

(b) $P(d \le t \mid s_1, s_2 = h,\ m_1 = m_2,\ n_1 \ne n_2) = P(|n_1 - n_2| + \min(x + y,\ 2 - x - y) \le t \mid s_1, s_2 = h,\ m_1 = m_2,\ n_1 \ne n_2)$. Since $\min(x + y,\ 2 - x - y) \in (0, 1)$ and $t < 1$, the inequality can hold only if $n_1 = n_2$, which contradicts the assumption $n_1 \ne n_2$. Thus,

\[
P(d \le t \mid s_1, s_2 = h,\ m_1 = m_2,\ n_1 \ne n_2) = 0.
\]

(c) When $n_1 = n_2$, $m_1 = m_2$,

\[
\begin{aligned}
P(d \le t \mid s_1, s_2 = h,\ n_1 = n_2,\ m_1 = m_2)
&= P(|x - y| \le t \mid s_1, s_2 = h,\ n_1 = n_2,\ m_1 = m_2) \\
&= P(x - y \le t) - P(x - y \le -t) \\
&= -\tfrac{1}{2} t^2 + t + \tfrac{1}{2} - \tfrac{1}{2}(1 - t)^2
= -t^2 + 2t.
\end{aligned}
\]

Also, $P(m_1 = m_2,\ n_1 = n_2 \mid s_1, s_2 = h) = 1/((m-1)n)$ will be used in obtaining $P(d \le t \mid s_1, s_2 = h)$.

By the Total Probability Theorem,

\[
P(d < t \mid s_1, s_2 = h)
= \frac{t^2}{n(m-1)} \cdot \frac{m-2}{m-1} + (-t^2 + 2t) \cdot \frac{1}{(m-1)n}
= \frac{-t^2 + (2m-2)t}{n(m-1)^2}.
\]

(2) $s_1 = v$, $s_2 = v$. $m_1, m_2 \sim DU(1, m)$; $n_1, n_2 \sim DU(1, n-1)$; $x, y \sim UNIF(0, 1)$.

We can get $P(d \le t \mid s_1, s_2 = v)$ by exchanging $m$ and $n$ in $P(d \le t \mid s_1, s_2 = h)$:

\[
P(d \le t \mid s_1, s_2 = v) = \frac{-t^2 + (2n-2)t}{m(n-1)^2}.
\]

(3) $s_1 = h$, $s_2 = v$. $m_1 \sim DU(1, m-1)$; $m_2 \sim DU(1, m)$; $n_1 \sim DU(1, n)$; $n_2 \sim DU(1, n-1)$; $x, y \sim UNIF(0, 1)$.

(a) $P(d \le t \mid s_1 = h, s_2 = v,\ m_2 > m_1,\ n_2 \ge n_1) = P(m_2 - m_1 + n_2 - n_1 - x + y < t \mid s_1 = h, s_2 = v,\ m_2 > m_1,\ n_2 \ge n_1)$. Here $t < 1$, $m_2 - m_1 \ge 1$, and $-x + y \in (-1, 1)$. Therefore $m_2 - m_1 = 1$, $n_2 = n_1$, and $1 - x + y < t$ must be satisfied. We have

\[
\begin{aligned}
P(d < t \mid s_1 = h, s_2 = v,\ m_2 > m_1,\ n_2 \ge n_1)
&= P(m_2 - m_1 = 1 \mid s_1 = h, s_2 = v,\ m_2 > m_1)\, P(n_2 = n_1 \mid s_1 = h, s_2 = v,\ n_2 \ge n_1)\, P(x - y > 1 - t) \\
&= \left[\frac{m-1}{m(m-1)/2}\right] \left[\frac{n-1}{n(n-1)/2}\right] \left[1 - \left(-\tfrac{1}{2}(1-t)^2 + 1 - t + \tfrac{1}{2}\right)\right]
= \frac{2t^2}{mn}.
\end{aligned}
\]

(b) $P(d < t \mid s_1 = h, s_2 = v,\ m_2 > m_1,\ n_2 < n_1) = \dfrac{2t^2}{mn}$, and $P(m_2 > m_1,\ n_2 < n_1 \mid s_1 = h, s_2 = v) = 1/4$.

(c) $P(d < t \mid s_1 = h, s_2 = v,\ m_2 \le m_1,\ n_2 \ge n_1) = \dfrac{2t^2}{mn}$, and $P(m_2 \le m_1,\ n_2 \ge n_1 \mid s_1 = h, s_2 = v) = 1/4$.

(d) $P(d < t \mid s_1 = h, s_2 = v,\ m_2 \le m_1,\ n_2 < n_1) = \dfrac{2t^2}{mn}$, and $P(m_2 \le m_1,\ n_2 < n_1 \mid s_1 = h, s_2 = v) = 1/4$.

The conclusions of (b), (c), and (d) follow from the symmetry among the four cases. Each of the four cases accounts for 1/4 of the condition that $s_1$ is horizontal and $s_2$ is vertical. Thus, by the Total Probability Theorem,

\[
P(d < t \mid s_1 = h, s_2 = v)
= \frac{1}{4}\cdot\frac{2t^2}{mn} + \frac{1}{4}\cdot\frac{2t^2}{mn} + \frac{1}{4}\cdot\frac{2t^2}{mn} + \frac{1}{4}\cdot\frac{2t^2}{mn}
= \frac{2t^2}{mn}.
\]

(4) $s_1$ is vertical, $s_2$ is horizontal. We can get $P(d < t \mid s_1 = v, s_2 = h)$ by exchanging $m$ and $n$ in $P(d < t \mid s_1 = h, s_2 = v)$:

\[
P(d < t \mid s_1 = v, s_2 = h) = \frac{2t^2}{mn}.
\]

For a point generated by CSR, the probability that it is located on a horizontal edge is

\[
P(h) = \frac{\text{length of horizontal edges}}{\text{length of horizontal edges} + \text{length of vertical edges}}
= \frac{(m-1)n}{(m-1)n + (n-1)m}.
\]

Similarly, the probability that the point lies on a vertical edge is

\[
P(v) = \frac{(n-1)m}{(m-1)n + (n-1)m}.
\]

Combining the four cases for two arbitrary points, we obtain the cumulative distribution function of inter-event distances on an $m \times n$ grid network for $t < 1$:

\[
\begin{aligned}
P(d < t,\ t < 1)
&= P(h)^2\, P(d < t \mid s_1, s_2 = h) + P(v)^2\, P(d < t \mid s_1, s_2 = v) \\
&\quad + P(h) P(v) \left[ P(d < t \mid s_1 = h, s_2 = v) + P(d < t \mid s_1 = v, s_2 = h) \right] \\
&= \left[\frac{-t^2 + (2m-2)t}{n(m-1)^2}\right] \left[\frac{(m-1)n}{(m-1)n + (n-1)m}\right]^2
 + \left[\frac{-t^2 + (2n-2)t}{m(n-1)^2}\right] \left[\frac{(n-1)m}{(m-1)n + (n-1)m}\right]^2 \\
&\quad + 2 \left[\frac{2t^2}{mn}\right] \left[\frac{(m-1)n}{(m-1)n + (n-1)m}\right] \left[\frac{(n-1)m}{(m-1)n + (n-1)m}\right] \\
&= \frac{(4mn - 5m - 5n + 4)t^2 + (4mn - 2m - 2n)t}{(2mn - m - n)^2}.
\end{aligned}
\]

2.2.2 Simulation

As an illustration, I simulated 100 CSR point patterns consisting of 500 points each on a grid network. The simulation procedure for a completely spatially random point pattern is discussed in detail in Section 3.2 (Simulations). After obtaining the EDF of inter-event distances for each simulation, I calculated the mean function and the upper and lower envelopes defined in Section 1.2.4 (CSR Test Based on Inter-event Distances). These three functions are plotted along with the theoretical result just derived. From Figure 2.4, we can tell that the simulated mean function and the theoretical function are almost identical.
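The agreement between the closed form and simulation can also be checked with a short, self-contained Monte Carlo sketch. It is not the code used in the project; it assumes the coordinate convention of the case analysis above (vertex coordinates $(i, j)$ with $i = 1, \ldots, m$ and $j = 1, \ldots, n$, unit spacing), and all function names are hypothetical.

```python
import math
import random

def cdf_lt1(t, m, n):
    """Closed-form P(d < t) for 0 <= t <= 1 on the m x n grid."""
    total_length = 2 * m * n - m - n
    return ((4*m*n - 5*m - 5*n + 4) * t**2 + (4*m*n - 2*m - 2*n) * t) / total_length**2

def random_point(m, n, rng):
    """One CSR point as (orientation, X, Y); 'h' lies on a horizontal edge."""
    p_h = (m - 1) * n / (2 * m * n - m - n)
    if rng.random() < p_h:
        return ('h', rng.randint(1, m - 1) + rng.random(), rng.randint(1, n))
    return ('v', rng.randint(1, m), rng.randint(1, n - 1) + rng.random())

def grid_distance(p, q):
    """Shortest-path distance along the grid between two points."""
    (o1, x1, y1), (o2, x2, y2) = p, q
    # Parallel edges directly opposite each other: detour through a vertex.
    if o1 == o2 == 'h' and y1 != y2 and math.floor(x1) == math.floor(x2):
        a, b = x1 - math.floor(x1), x2 - math.floor(x2)
        return abs(y1 - y2) + min(a + b, 2 - a - b)
    if o1 == o2 == 'v' and x1 != x2 and math.floor(y1) == math.floor(y2):
        a, b = y1 - math.floor(y1), y2 - math.floor(y2)
        return abs(x1 - x2) + min(a + b, 2 - a - b)
    # Every other configuration reduces to the taxicab distance.
    return abs(x1 - x2) + abs(y1 - y2)

def mc_cdf(t, m, n, pairs=100_000, seed=1):
    """Monte Carlo estimate of P(d < t) from independent pairs of CSR points."""
    rng = random.Random(seed)
    hits = sum(
        grid_distance(random_point(m, n, rng), random_point(m, n, rng)) < t
        for _ in range(pairs)
    )
    return hits / pairs

# Example: compare the two on a 5 x 5 grid at t = 0.5.
exact = cdf_lt1(0.5, 5, 5)
estimate = mc_cdf(0.5, 5, 5)
```

Sampling independent pairs, rather than all pairs within one pattern, keeps the standard error of the estimate easy to reason about; the two values should agree up to Monte Carlo error.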

Figure 2.3: The grid network

Figure 2.4: Simulation result and plot for CDF when t < 1 (blue is the theoretical function)

2.3 CDF of Inter-event Distances if t > 1

2.3.1 Cumulative Distribution Function

The analysis for $t > 1$ is very similar to that for $t < 1$ in the previous section, but much more complicated. Content that repeats the $t < 1$ case is skipped here. Throughout, $\lfloor t \rfloor$ and $\lceil t \rceil$ denote the floor and ceiling of $t$.

(1) Both horizontal. $m_1, m_2 \sim DU(1, m-1)$, $n_1, n_2 \sim DU(1, n)$, $x, y \sim UNIF(0, 1)$.

(a) When $m_1 \ne m_2$,

\[
\begin{aligned}
P(d < t \mid s_1, s_2 = h,\ m_1 \ne m_2)
&= P(|n_1 - n_2| + |m_1 - m_2| + x - y < t \mid s_1, s_2 = h,\ m_1 \ne m_2) \\
&= P(|n_1 - n_2| + |m_1 - m_2| = \lceil t \rceil \mid s_1, s_2 = h,\ m_1 \ne m_2)\, P(x - y < t - \lceil t \rceil) \\
&\quad + P(|n_1 - n_2| + |m_1 - m_2| = \lfloor t \rfloor \mid s_1, s_2 = h,\ m_1 \ne m_2)\, P(x - y < t - \lfloor t \rfloor) \\
&\quad + P(|n_1 - n_2| + |m_1 - m_2| \le \lfloor t \rfloor - 1 \mid s_1, s_2 = h,\ m_1 \ne m_2).
\end{aligned}
\]

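Let us define a function $C_h^1(x)$ as follows:

\[
\begin{aligned}
C_h^1(x) &= P(|n_1 - n_2| + |m_1 - m_2| = x \mid s_1, s_2 = h,\ m_1 \ne m_2) \\
&= \sum_{i=s(x)}^{t(x)} P(|n_1 - n_2| = x - i)\, P(|m_1 - m_2| = i) \\
&= \sum_{i=s(x)}^{t(x)} \frac{2(m-1-i)}{(m-1)(m-2)} \cdot \frac{2(n-x+i)}{n^2}
 - I(t(x) = x)\, \frac{2(m-1-x)}{(m-1)(m-2)} \cdot \frac{1}{n} \\
&= \frac{4 \sum_{i=s(x)}^{t(x)} (m-1-i)(n-x+i)}{(m-1)(m-2)n^2}
 - I(t(x) = x)\, \frac{2(m-1-x)}{n(m-1)(m-2)},
\end{aligned}
\]

where $s(x) = \max\{1,\ x - (n-1)\}$, $t(x) = \min\{x,\ m-2\}$, and $I(t(x) = x)$ is an indicator function. Also, we have

\[
P(x - y < t - \lfloor t \rfloor) = -\tfrac{1}{2}(t - \lfloor t \rfloor)^2 + t - \lfloor t \rfloor + \tfrac{1}{2},
\qquad
P(x - y < t - \lceil t \rceil) = \tfrac{1}{2}(1 + t - \lceil t \rceil)^2.
\]

Thus, we are able to write the conditional probability in terms of $C_h^1(x)$,

\[
P(d < t \mid s_1, s_2 = h,\ m_1 \ne m_2)
= \tfrac{1}{2} C_h^1(\lceil t \rceil)(1 + t - \lceil t \rceil)^2
+ C_h^1(\lfloor t \rfloor)\left(-\tfrac{1}{2}(t - \lfloor t \rfloor)^2 + t - \lfloor t \rfloor + \tfrac{1}{2}\right)
+ \sum_{j=1}^{\lfloor t \rfloor - 1} C_h^1(j).
\tag{2.1}
\]

Note: If $t$ is an integer, $P(d < t \mid s_1, s_2 = h,\ m_1 \ne m_2) = P(|n_1 - n_2| + |m_1 - m_2| = t \mid s_1, s_2 = h,\ m_1 \ne m_2)\, P(x - y < 0) + P(|n_1 - n_2| + |m_1 - m_2| \le t - 1) = \tfrac{1}{2} C_h^1(t) + \sum_{j=1}^{t-1} C_h^1(j)$. Written in the form of (2.1), this equals (2.1) with the second term removed, that is, $\tfrac{1}{2} C_h^1(\lceil t \rceil)(1 + t - \lceil t \rceil)^2 + \sum_{j=1}^{\lfloor t \rfloor - 1} C_h^1(j)$.

Also, $P(m_1 \ne m_2 \mid s_1, s_2 = h) = (m-2)/(m-1)$ is required to derive the probability that the distance is less than $t$ given that both points are horizontal.

(b) When $m_1 = m_2$ and $n_1 \ne n_2$,

\[
P(d < t \mid s_1, s_2 = h,\ m_1 = m_2,\ n_1 \ne n_2)
= P(|n_1 - n_2| = \lfloor t \rfloor)\, P(\min(x + y,\ 2 - x - y) < t - \lfloor t \rfloor) + P(|n_1 - n_2| \le \lfloor t \rfloor - 1).
\]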

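Let us define a function $C_h^2(x)$ as follows:

\[
C_h^2(x) = P(|n_1 - n_2| = x \mid s_1, s_2 = h,\ m_1 = m_2,\ n_1 \ne n_2) =
\begin{cases}
0 & \text{if } x > n - 1, \\[4pt]
\dfrac{2(n-x)}{n(n-1)} & \text{if } 1 \le x \le n - 1.
\end{cases}
\]

Also, we have

\[
\begin{aligned}
P(\min(x + y,\ 2 - x - y) < t - \lfloor t \rfloor)
&= P(x + y < t - \lfloor t \rfloor) + P(x + y > 2 - t + \lfloor t \rfloor) \\
&= \tfrac{1}{2}(t - \lfloor t \rfloor)^2 + 1 - \left[-\tfrac{1}{2}(2 - t + \lfloor t \rfloor)^2 + 2(2 - t + \lfloor t \rfloor) - 1\right] \\
&= (t - \lfloor t \rfloor)^2.
\end{aligned}
\]

Thus, we are able to write the conditional probability in terms of $C_h^2(x)$:

\[
P(d < t \mid s_1, s_2 = h,\ m_1 = m_2,\ n_1 \ne n_2)
= C_h^2(\lfloor t \rfloor)(t - \lfloor t \rfloor)^2 + \sum_{j=1}^{\lfloor t \rfloor - 1} C_h^2(j).
\tag{2.2}
\]

Note: If $t$ is an integer, $P(d < t \mid s_1, s_2 = h,\ m_1 = m_2,\ n_1 \ne n_2) = P(|n_1 - n_2| \le t - 1) = \sum_{j=1}^{t-1} C_h^2(j)$, which is also (2.2) with the first term removed.

Also, $P(m_1 = m_2,\ n_1 \ne n_2 \mid s_1, s_2 = h) = (n-1)/(n(m-1))$ is needed for calculating $P(d < t \mid s_1, s_2 = h)$.

(c) Since $|x - y| \in [0, 1]$ and $t$ is assumed to be greater than 1 in this section, $P(d < t \mid s_1, s_2 = h,\ m_1 = m_2,\ n_1 = n_2) = P(|x - y| < t \mid s_1, s_2 = h,\ m_1 = m_2,\ n_1 = n_2) = 1$. In addition, $P(m_1 = m_2,\ n_1 = n_2 \mid s_1, s_2 = h) = 1/(n(m-1))$.

Therefore, by the Total Probability Theorem,

\[
\begin{aligned}
P(d < t \mid s_1, s_2 = h)
&= \frac{m-2}{m-1} \left[ \tfrac{1}{2} C_h^1(\lceil t \rceil)(1 + t - \lceil t \rceil)^2
+ C_h^1(\lfloor t \rfloor)\left(-\tfrac{1}{2}(t - \lfloor t \rfloor)^2 + t - \lfloor t \rfloor + \tfrac{1}{2}\right)
+ \sum_{j=1}^{\lfloor t \rfloor - 1} C_h^1(j) \right] \\
&\quad + \frac{n-1}{n(m-1)} \left[ C_h^2(\lfloor t \rfloor)(t - \lfloor t \rfloor)^2 + \sum_{j=1}^{\lfloor t \rfloor - 1} C_h^2(j) \right]
+ \frac{1}{n(m-1)}.
\end{aligned}
\tag{2.3}
\]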

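(2) Both vertical. $P(d < t \mid s_1, s_2 = v)$ can be obtained by exchanging $m$ and $n$ in $P(d < t \mid s_1, s_2 = h)$. So let us define the functions $C_v^1(x)$ and $C_v^2(x)$ as follows:

\[
C_v^1(x) = \frac{4 \sum_{i=s(x)}^{t(x)} (n-1-i)(m-x+i)}{(n-1)(n-2)m^2}
 - I(t(x) = x)\, \frac{2(n-1-x)}{m(n-1)(n-2)},
\]

\[
C_v^2(x) =
\begin{cases}
0 & \text{if } x > m - 1, \\[4pt]
\dfrac{2(m-x)}{m(m-1)} & \text{if } 1 \le x \le m - 1,
\end{cases}
\]

where $s(x) = \max\{1,\ x - (m-1)\}$ and $t(x) = \min\{x,\ n-2\}$. Then we have

\[
\begin{aligned}
P(d < t \mid s_1, s_2 = v)
&= \frac{n-2}{n-1} \left[ \tfrac{1}{2} C_v^1(\lceil t \rceil)(1 + t - \lceil t \rceil)^2
+ C_v^1(\lfloor t \rfloor)\left(-\tfrac{1}{2}(t - \lfloor t \rfloor)^2 + t - \lfloor t \rfloor + \tfrac{1}{2}\right)
+ \sum_{j=1}^{\lfloor t \rfloor - 1} C_v^1(j) \right] \\
&\quad + \frac{m-1}{m(n-1)} \left[ C_v^2(\lfloor t \rfloor)(t - \lfloor t \rfloor)^2 + \sum_{j=1}^{\lfloor t \rfloor - 1} C_v^2(j) \right]
+ \frac{1}{m(n-1)}.
\end{aligned}
\tag{2.4}
\]

(3) $s_1$ is horizontal, $s_2$ is vertical. $m_1 \sim DU(1, m-1)$, $m_2 \sim DU(1, m)$, $n_1 \sim DU(1, n)$, $n_2 \sim DU(1, n-1)$, $x, y \sim UNIF(0, 1)$.

(a) When $m_2 > m_1$ and $n_2 \ge n_1$,

\[
\begin{aligned}
P(d < t \mid s_1 = h, s_2 = v,\ m_2 > m_1,\ n_2 \ge n_1)
&= P(m_2 - m_1 + n_2 - n_1 - (x - y) < t \mid s_1 = h, s_2 = v,\ m_2 > m_1,\ n_2 \ge n_1) \\
&= P(m_2 - m_1 + n_2 - n_1 = \lceil t \rceil)\, P(x - y > \lceil t \rceil - t) \\
&\quad + P(m_2 - m_1 + n_2 - n_1 = \lfloor t \rfloor)\, P(x - y > \lfloor t \rfloor - t) \\
&\quad + P(m_2 - m_1 + n_2 - n_1 \le \lfloor t \rfloor - 1).
\end{aligned}
\]

Let us define a function $C_{hv}^1(x)$ as follows:

\[
\begin{aligned}
C_{hv}^1(x) &= P(m_2 - m_1 + n_2 - n_1 = x \mid s_1 = h, s_2 = v,\ m_2 > m_1,\ n_2 \ge n_1) \\
&= \sum_{i=s(x)}^{t(x)} P(m_2 - m_1 = i)\, P(n_2 - n_1 = x - i) \\
&= \sum_{i=s(x)}^{t(x)} \frac{2(m-i)}{m(m-1)} \cdot \frac{2(n-1-x+i)}{n(n-1)}
= \frac{4 \sum_{i=s(x)}^{t(x)} (m-i)(n-1-x+i)}{mn(m-1)(n-1)},
\end{aligned}
\]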

where $s(x) = \max\{1,\ x - (n-2)\}$ and $t(x) = \min\{x,\ m-1\}$. Also, we have

\[
P(x - y > \lceil t \rceil - t)
= 1 - \left[-\tfrac{1}{2}(\lceil t \rceil - t)^2 + (\lceil t \rceil - t) + \tfrac{1}{2}\right]
= \tfrac{1}{2}(\lceil t \rceil - t)^2 - (\lceil t \rceil - t) + \tfrac{1}{2},
\]

\[
P(x - y > \lfloor t \rfloor - t) = 1 - \tfrac{1}{2}(1 + \lfloor t \rfloor - t)^2.
\]

Thus, we are able to write the conditional probability in terms of $C_{hv}^1(x)$,

\[
\begin{aligned}
P(d < t \mid s_1 = h, s_2 = v,\ m_2 > m_1,\ n_2 \ge n_1)
&= C_{hv}^1(\lceil t \rceil)\left[\tfrac{1}{2}(\lceil t \rceil - t)^2 - (\lceil t \rceil - t) + \tfrac{1}{2}\right]
+ C_{hv}^1(\lfloor t \rfloor)\left[1 - \tfrac{1}{2}(1 + \lfloor t \rfloor - t)^2\right] \\
&\quad + \sum_{j=1}^{\lfloor t \rfloor - 1} C_{hv}^1(j).
\end{aligned}
\tag{2.5}
\]

Note: If $t$ is an integer, $P(d < t \mid s_1 = h, s_2 = v,\ m_2 > m_1,\ n_2 \ge n_1) = P(m_2 - m_1 + n_2 - n_1 = t)\, P(x - y \ge 0) + P(m_2 - m_1 + n_2 - n_1 \le t - 1) = \tfrac{1}{2} C_{hv}^1(t) + \sum_{j=1}^{t-1} C_{hv}^1(j)$, which is also equal to (2.5) with the second term removed.

Also, we have $P(m_2 > m_1,\ n_2 \ge n_1 \mid s_1 = h, s_2 = v) = 1/4$.

(b) When $m_2 > m_1$ and $n_2 < n_1$, the probability is the same as in (a). Also, $P(m_2 > m_1,\ n_2 < n_1 \mid s_1 = h, s_2 = v) = 1/4$.

(c) When $m_2 \le m_1$ and $n_2 \ge n_1$, the probability is the same as in (a). Also, $P(m_2 \le m_1,\ n_2 \ge n_1 \mid s_1 = h, s_2 = v) = 1/4$.

(d) When $m_2 \le m_1$ and $n_2 < n_1$, the probability is the same as in (a). Also, $P(m_2 \le m_1,\ n_2 < n_1 \mid s_1 = h, s_2 = v) = 1/4$.

Again, the conclusions of (b), (c), and (d) follow from the symmetry among the four cases. Each of the four cases contributes 1/4 to the whole condition, and they are equivalent to each other. Therefore, by the Total Probability Theorem,

\[
\begin{aligned}
P(d < t \mid s_1 = h, s_2 = v)
&= \tfrac{1}{4} \cdot 4\, P(d < t \mid s_1 = h, s_2 = v,\ m_2 > m_1,\ n_2 \ge n_1) \\
&= P(d < t \mid s_1 = h, s_2 = v,\ m_2 > m_1,\ n_2 \ge n_1) \\
&= C_{hv}^1(\lceil t \rceil)\left[\tfrac{1}{2}(\lceil t \rceil - t)^2 - (\lceil t \rceil - t) + \tfrac{1}{2}\right]
+ C_{hv}^1(\lfloor t \rfloor)\left[1 - \tfrac{1}{2}(1 + \lfloor t \rfloor - t)^2\right]
+ \sum_{j=1}^{\lfloor t \rfloor - 1} C_{hv}^1(j).
\end{aligned}
\tag{2.6}
\]

(4) $s_1$ is vertical, $s_2$ is horizontal. $P(d < t \mid s_1 = v, s_2 = h)$ can be obtained by exchanging $m$ and $n$ in $P(d < t \mid s_1 = h, s_2 = v)$. In fact, $P(d < t \mid s_1 = v, s_2 = h)$ must be equal to $P(d < t \mid s_1 = h, s_2 = v)$, because a 90° rotation of the grid does not change the distribution of distances.

Finally, we are able to get $P(d < t)$ for $t > 1$ by combining the four cases above. Again, we use

\[
P(h) = \frac{(m-1)n}{(m-1)n + (n-1)m}
\quad \text{and} \quad
P(v) = \frac{(n-1)m}{(m-1)n + (n-1)m}.
\]

\[
\begin{aligned}
P(d < t,\ t > 1)
&= P(d < t \mid s_1, s_2 = h) \left[\frac{(m-1)n}{(m-1)n + (n-1)m}\right]^2
+ P(d < t \mid s_1, s_2 = v) \left[\frac{(n-1)m}{(m-1)n + (n-1)m}\right]^2 \\
&\quad + 2\, P(d < t \mid s_1 = h, s_2 = v) \left[\frac{(m-1)n}{(m-1)n + (n-1)m}\right] \left[\frac{(n-1)m}{(m-1)n + (n-1)m}\right],
\end{aligned}
\tag{2.7}
\]

where $P(d < t \mid s_1, s_2 = h)$, $P(d < t \mid s_1, s_2 = v)$, and $P(d < t \mid s_1 = h, s_2 = v)$ refer to (2.3), (2.4), and (2.6).

2.3.2 Simulation

Exactly as for $t < 1$, I ran 100 CSR simulations on the grid network in order to check that the result obtained is reasonable. I plot three functions from the simulations (the mean function and the upper and lower envelopes defined in Section 1.2.4, CSR Test Based on Inter-event Distances) together with the theoretical function in one figure. From Figure 2.6, we can tell that the mean function and the theoretical function (blue) coincide, and that the theoretical result lies between the upper and lower envelopes.

Figure 2.5: The grid network

Figure 2.6: Simulation result and plot for CDF when t > 1 (blue is the theoretical function)
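Formulas (2.1) to (2.7) are easier to check when written as code. The following is a minimal sketch of one way to evaluate them, assuming $m, n \ge 3$ so that no denominator vanishes. The function names are hypothetical, and the integer-$t$ case is handled as in the notes above by dropping the duplicated floor term.

```python
import math

def c_h1(x, m, n):
    """C_h^1(x): both points horizontal, m1 != m2."""
    lo, hi = max(1, x - (n - 1)), min(x, m - 2)
    total = 4 * sum((m - 1 - i) * (n - x + i) for i in range(lo, hi + 1))
    total /= (m - 1) * (m - 2) * n**2
    if hi == x:  # the i = x term has |n1 - n2| = 0, which has probability 1/n
        total -= 2 * (m - 1 - x) / (n * (m - 1) * (m - 2))
    return total

def c_h2(x, n):
    """C_h^2(x): both points horizontal, m1 = m2, n1 != n2."""
    return 2 * (n - x) / (n * (n - 1)) if 1 <= x <= n - 1 else 0.0

def c_hv(x, m, n):
    """C_hv^1(x): s1 horizontal, s2 vertical, m2 > m1, n2 >= n1."""
    lo, hi = max(1, x - (n - 2)), min(x, m - 1)
    total = 4 * sum((m - i) * (n - 1 - x + i) for i in range(lo, hi + 1))
    return total / (m * n * (m - 1) * (n - 1))

def cond_hh(t, m, n):
    """Equation (2.3); calling cond_hh(t, n, m) gives equation (2.4)."""
    fl, ce = math.floor(t), math.ceil(t)
    u = t - fl
    part_a = 0.5 * c_h1(ce, m, n) * (1 + t - ce)**2
    part_a += sum(c_h1(j, m, n) for j in range(1, fl))
    if ce != fl:  # for integer t the floor term would be counted twice
        part_a += c_h1(fl, m, n) * (-0.5 * u**2 + u + 0.5)
    part_b = c_h2(fl, n) * u**2 + sum(c_h2(j, n) for j in range(1, fl))
    return ((m - 2) / (m - 1) * part_a
            + (n - 1) / (n * (m - 1)) * part_b
            + 1 / (n * (m - 1)))

def cond_hv(t, m, n):
    """Equation (2.6)."""
    fl, ce = math.floor(t), math.ceil(t)
    w = ce - t
    p = c_hv(ce, m, n) * (0.5 * w**2 - w + 0.5)
    p += sum(c_hv(j, m, n) for j in range(1, fl))
    if ce != fl:
        p += c_hv(fl, m, n) * (1 - 0.5 * (1 + fl - t)**2)
    return p

def cdf_gt1(t, m, n):
    """Equation (2.7): P(d < t) for t > 1 on the m x n grid."""
    total_length = 2 * m * n - m - n
    p_h = (m - 1) * n / total_length
    p_v = (n - 1) * m / total_length
    return (p_h**2 * cond_hh(t, m, n)
            + p_v**2 * cond_hh(t, n, m)
            + 2 * p_h * p_v * cond_hv(t, m, n))

# Example: values on a 5 x 5 grid.
values = [cdf_gt1(t, 5, 5) for t in (1.5, 2.5, 3.5, 9.0)]
```

Two quick sanity checks on such an implementation are continuity with the $t < 1$ formula at $t = 1$ and the limit $P(d < t) = 1$ once $t$ exceeds the largest possible distance on the grid.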

Chapter 3

CSR Test Based on Inter-event Distances

As shown in the previous chapter, the theoretical cumulative distribution function (CDF) is complicated and difficult to obtain even for the simplest grid network. In practice, however, we meet networks that are far more complicated than the regular grid network, so it is not feasible to use the theoretical CDF of the complete spatial randomness (CSR) point pattern as the criterion when testing for CSR. Instead, an approximation to the true function derived from Monte Carlo simulations can replace the analytic solution. In this chapter, I present how to implement the CSR test based on inter-event distances using Monte Carlo simulations in real applications.

3.1 CSR Test Implementation

Since the empirical distribution function (EDF) converges to the cumulative distribution function as the sample size grows, $H(t)$ can be used as an approximation to the theoretical CDF of inter-event distances for a CSR point pattern on the given network. Under the hypothesis of CSR, $\hat{H}(t)$, the observed distribution function, should be close to $H(t)$, which is regarded as the true function. In other words, a plot of $\hat{H}(t)$ against $H(t)$ should be roughly linear. To assess the significance of departures from linearity, the upper bound $U_1(t)$ and lower bound $L_1(t)$ of the simulated functions $H_i(t)$ are also evaluated and plotted against $H(t)$. Hence, if $\hat{H}(t)$ always lies between the envelopes $L_1(t)$ and $U_1(t)$ computed across all simulations, the hypothesis of CSR cannot be rejected. On the other hand, if $\hat{H}(t)$ is more extreme than $U_1(t)$ or $L_1(t)$, it is very likely that the underlying data-generating mechanism is not CSR. In summary, we can implement the CSR test based on inter-event distances as follows.

(1) Calculate all unique inter-event distances $t_{ij}$, $1 \le i < j \le n$, in the observed spatial point pattern on the network $G$ and get $\hat{H}(t)$;
(2) Simulate $s$ CSR point patterns on the network $G$ independently. To keep the p-value for the event that the EDF of a CSR pattern is the upper or lower envelope at most 0.1, $s$ is generally taken to be greater than 100;
(3) For the $i$th simulated CSR pattern, calculate all inter-event distances $t_{ij}$ and get $H_i(t)$;
(4) Obtain $H(t)$, $U_1(t)$, and $L_1(t)$;
(5) Plot $\hat{H}(t)$, $U_1(t)$, and $L_1(t)$ against $H(t)$.

For a spatial point process on a network, the distance between a pair of points depends on the geometry of the observed network; it is not the usual Euclidean distance used for planar regions. Therefore, the main difficulty is obtaining the network distance, which is an essential part of our CSR test. Only when all distances have been calculated can the observed data be compared with the data simulated under CSR in terms of the CDF. The igraph package in the R computing environment turns out to be a very useful tool for finding the shortest distance on a network. It is convenient and efficient for visualizing a stream network and a point pattern and for calculating inter-event distances on a network. It can also visualize a simulated network point process.

3.2 Simulations

Theoretically, since the CSR test method is discussed for a general setup, the proposed test method should work for both network and planar regions. I did simulations for three typical point processes for demonstration before I applied this method to a real application.

3.2.1 Random Process

Based on the definition of CSR, a completely spatially random process is a binomial process if the number of events $n$ is fixed. To generate $n$ completely random points on a network, a binomial process was simulated such that the $n$ points are added onto the edges one by one. Let $m$ be the number of edges in the network and $|e_i|$ the length of edge $e_i$. The simulation procedure for the random process is summarized as follows.

(a) Choose one edge $e_i$ with probability $|e_i| / \sum_{j=1}^{m} |e_j|$;
(b) Distribute the point uniformly on $e_i$, i.e., $d \sim \mathrm{UNIF}(0, |e_i|)$, where $d$ is the distance to one vertex of $e_i$;
(c) Repeat (a) and (b) $n$ times to get $n$ random points on the network.

I generated CSR point patterns on two different types of networks. The first is a regular grid network like the one shown in the previous chapter, where all edges have the same length. The second is a random network in which edges have different lengths and vertices are distributed irregularly. A random network is constructed from randomly distributed vertices, with edges added with connection probability 0.5. To obtain a connected and planar network, some restrictions were added to the generation procedure of the random network. For example, the distance between every two vertices is no less than 0.5, and a vertex is only considered for connection to its 3 nearest vertices. Figures 3.1 and 3.3 show these two types of networks.
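Steps (a) to (c) of the random process can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the network is represented by a plain list of edge lengths, the function name `simulate_csr` is hypothetical, and the sketch is not the R code used for the project.

```python
import random

def simulate_csr(edge_lengths, n, rng=None):
    """Place n points independently and uniformly on a network.

    edge_lengths: list of positive edge lengths (assumed representation).
    Returns a list of (edge_index, d), where d is the distance from one
    end vertex of the chosen edge.
    """
    if rng is None:
        rng = random.Random()
    edge_ids = list(range(len(edge_lengths)))
    points = []
    for _ in range(n):
        # Step (a): choose an edge with probability proportional to its length.
        i = rng.choices(edge_ids, weights=edge_lengths, k=1)[0]
        # Step (b): uniform position along the chosen edge.
        d = rng.uniform(0.0, edge_lengths[i])
        points.append((i, d))
    # Step (c): the loop above repeats (a) and (b) n times.
    return points
```

Weighting the edge choice by length is what makes the resulting pattern uniform over the whole network rather than uniform over edges.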

Figure 3.1: Random point pattern on the grid network

Figure 3.2: Envelope plot for random process on the grid network (horizontal axis: H average)

Figure 3.3: Random point pattern on a random network

Figure 3.4: Envelope plot for random process on a random network (horizontal axis: H average)

For the spatial point pattern on the grid network, the envelope plot in Figure 3.2 shows that $\hat{H}(t)$ is roughly equal to $H(t)$ and lies between $U_1(t)$ and $L_1(t)$ throughout its range, which suggests acceptance of CSR. For the random network, the envelope plot in Figure 3.4 also shows acceptance of CSR, similar to that of the grid network. We conclude that these data are compatible with a completely random spatial distribution. This is not surprising, since the data were indeed generated under CSR.
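The envelope comparison behind these plots, i.e. steps (1) to (5) of Section 3.1, can be sketched as follows. The sketch assumes that the distances have already been computed, that the EDF is the proportion of distances not exceeding $t$, and that the mean function is the pointwise average of the simulated EDFs; the function names `edf` and `envelope_test` are hypothetical.

```python
from bisect import bisect_right

def edf(distances, grid):
    """Empirical distribution function of `distances` evaluated on `grid`."""
    ordered = sorted(distances)
    return [bisect_right(ordered, t) / len(ordered) for t in grid]

def envelope_test(observed, simulated, grid):
    """Compare the observed EDF with Monte Carlo envelopes.

    observed:  distances from the observed pattern.
    simulated: list of distance lists, one per simulated CSR pattern.
    Returns (h_hat, h_mean, upper, lower, inside), where `inside` is True
    when the observed EDF stays between the envelopes on the whole grid.
    """
    h_hat = edf(observed, grid)
    sims = [edf(d, grid) for d in simulated]
    # Pointwise mean, maximum and minimum over the simulated EDFs.
    h_mean = [sum(col) / len(sims) for col in zip(*sims)]
    upper = [max(col) for col in zip(*sims)]
    lower = [min(col) for col in zip(*sims)]
    inside = all(lo <= h <= up for lo, h, up in zip(lower, h_hat, upper))
    return h_hat, h_mean, upper, lower, inside
```

Plotting `h_hat`, `upper` and `lower` against `h_mean` would give an envelope plot of the kind described in the text.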

3.2.2 Cluster Process

For cluster processes, we generate the data using a two-step procedure. First, parent points are generated independently from the uniform distribution on the network (binomial process) [9]. We then generate child points from each of the parents. The number of children follows a Poisson distribution, and the positions of children relative to their parents are normally distributed. After the child points are generated, we remove all parent points, so the final data set consists only of the children. The procedure for generating the cluster process is summarized as follows.

(a) Generate 20 parents following the steps of the CSR simulation;
(b) For each parent, the number of its children follows a Poisson distribution with mean 10. The position of each child relative to its parent is normally distributed, i.e., N(0, );
(c) Remove the parents and keep only the children.

Figure 3.5: Cluster point pattern on the grid network

Figure 3.6: Envelope plot for cluster process on the grid network (horizontal axis: H average)
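A simplified sketch of steps (a) to (c) is given below. It makes several assumptions that are not in the text: the network is reduced to a single edge of a given length, the standard deviation `sigma` of the displacement is a free parameter, and children falling outside the edge are discarded. The Poisson draw uses Knuth's multiplication method because the Python standard library has no Poisson sampler. The function names are hypothetical.

```python
import math
import random

def poisson(rng, mean):
    """Draw a Poisson variate with Knuth's multiplication method."""
    threshold = math.exp(-mean)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k

def simulate_cluster_on_edge(length, sigma, parents=20, mean_children=10, rng=None):
    """Parent-child cluster pattern on one edge (simplifying assumption)."""
    if rng is None:
        rng = random.Random()
    children = []
    for _ in range(parents):
        parent = rng.uniform(0.0, length)               # step (a)
        for _ in range(poisson(rng, mean_children)):    # step (b)
            child = parent + rng.gauss(0.0, sigma)
            if 0.0 <= child <= length:                  # assumption: drop children off the edge
                children.append(child)
    return children                                     # step (c): parents are not returned
```

On a real network the displacement would have to be carried along the edges past the end vertices, which this one-edge sketch does not attempt.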

Figure 3.7: Cluster point pattern on a random network

Figure 3.8: Envelope plot for cluster process on a random network (horizontal axis: H average)

Figures 3.5 and 3.7 show realizations of cluster processes on a regular grid network and a random network. Figure 3.6 shows that $\hat{H}(t)$ is greater than $H(t)$ within the range and lies above $U_1(t)$ for very small values of $H(t)$. This is typical for cluster processes because there is an excess of short inter-event distances. The plot suggests rejection of CSR in favor of clustering. Similarly, Figure 3.8 suggests rejection of CSR because of the obvious departure from complete spatial randomness. In summary, if the probability of getting small distances is higher than that of a CSR point pattern, the observed point pattern is more likely to be classified as a cluster process.

3.2.3 Regular Process

To generate events with regularity, I started with a homogeneous Poisson process with $n$ events and density $\lambda = n / \sum_{i=1}^{m} |e_i|$. Any two events separated by a distance of less than a specified value $\delta$ are thinned. The probability of retention for a point $x$ on the network is

\[
P(x \text{ is retained})
= \left(1 - \frac{\pi\delta^2}{\sum_{i=1}^{m} |e_i|}\right)^{n}
= \exp\left\{ n \ln\left(1 - \frac{\pi\delta^2}{\sum_{i=1}^{m} |e_i|}\right) \right\}
\approx \exp\left\{ -\frac{\pi\delta^2 n}{\sum_{i=1}^{m} |e_i|} \right\}
= e^{-\lambda\pi\delta^2}.
\]

Here is the procedure for simulating the regular process.

(a) Generate a CSR point pattern with density $\lambda = n / \sum_{i=1}^{m} |e_i|$.
(b) For each point $x$, $x$ is retained with probability $e^{-\lambda\pi\delta^2}$ as long as the distance between any two events is less than a specified value $\delta$.
(c) The retained points form the regular point pattern we wanted.

Figure 3.9: Regular point pattern on the grid network

Figure 3.10: Envelope plot for regular process on the grid network (horizontal axis: H average)

As shown in Figures 3.10 and 3.12, $\hat{H}(t)$ is less than $H(t)$ and even lower than $L_1(t)$ for small inter-event distances. The reason is that a regular process does not have inter-event distances smaller than a certain lower threshold. Thus the envelope plots indicate rejection of complete spatial randomness in favor of regularity.
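The two ingredients of this section can be illustrated with a short sketch: a numerical check that the product form of the retention probability is close to $e^{-\lambda\pi\delta^2}$, and a thinning step that removes every event lying closer than $\delta$ to another event. The thinning is written for points on a single edge, which is a simplifying assumption; on a network the differences of positions would be replaced by shortest-path distances. Both function names are hypothetical.

```python
import math

def retention_probability(n, total_length, delta):
    """Return the product form and its exponential approximation."""
    x = math.pi * delta ** 2 / total_length
    exact = (1.0 - x) ** n
    lam = n / total_length                      # density lambda
    approx = math.exp(-lam * math.pi * delta ** 2)
    return exact, approx

def thin_close_pairs(positions, delta):
    """Keep only points whose nearest other point is at least delta away.

    positions: locations along one edge (simplifying assumption).
    """
    ordered = sorted(positions)
    kept = []
    for i, p in enumerate(ordered):
        left_ok = i == 0 or p - ordered[i - 1] >= delta
        right_ok = i == len(ordered) - 1 or ordered[i + 1] - p >= delta
        if left_ok and right_ok:
            kept.append(p)
    return kept
```

After thinning, no two retained points are closer than $\delta$, which is the lower threshold on inter-event distances mentioned in the text.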

Figure 3.11: Regular point pattern on a random network

Figure 3.12: Envelope plot for regular process on a random network (horizontal axis: H average)

3.2.4 Conclusion

In conclusion, we considered Monte Carlo tests for CSR on both a regular grid network and a random network. The results are similar for the two types of networks, and the conclusions based on the envelope plots are consistent with the processes used to generate the data. In other words, the proposed CSR test method based on inter-event distances works well in the network case and can be used to test any spatial point pattern on a network.

Chapter 4

CSR Test Based on Nearest-neighbor Distances

Besides the inter-event distance, the nearest-neighbor distance is another sensible quantity that can be used to test for complete spatial randomness (CSR). As with the method based on inter-event distances, the essential part of the CSR test using nearest-neighbor distances is to calculate the nearest-neighbor distance of each event. In this chapter, I discuss the test method based on nearest-neighbor distances in the same way as in the previous chapter, so repeated definitions and analysis are omitted.

4.1 CSR Test Implementation

As in the last chapter, a plot of $\hat{K}(t)$ against $K(t)$ should be roughly linear under the hypothesis of CSR. In addition, if $\hat{K}(t)$ always lies between $U_2(t)$ and $L_2(t)$ obtained from a sufficient number of simulations, the hypothesis of CSR cannot be rejected. In summary, we can implement the CSR test based on nearest-neighbor distances as follows.

(1) Calculate all nearest-neighbor distances $r_i$ in the observed spatial point pattern on the network $G$ and get $\hat{K}(t)$;
(2) Simulate $s$ ($s \ge 100$) CSR point patterns on the network $G$ independently. For the same reason as in the inter-event distances method, $s$ should be at least 100;
(3) For the $i$th simulated CSR pattern, calculate all the nearest-neighbor distances $r_i$ and get $K_i(t)$;
(4) Obtain $K(t)$, $U_2(t)$, and $L_2(t)$. Plot $\hat{K}(t)$, $U_2(t)$, and $L_2(t)$ against $K(t)$.

Again, the main difficulty is obtaining the nearest-neighbor network distance, which is an essential part of the CSR test. I also used the igraph package of R to deal with this problem. First, we calculate the inter-event distances from a point $x$ to all other points. Then the minimum of these inter-event distances is the nearest-neighbor distance of event $x$.

4.2 Simulations

Theoretically, since the CSR test method is discussed in a general case, the proposed test method should work for both network and planar regions. To check its correctness and feasibility, I did simulations for three typical point processes to verify the CSR test method based on nearest-neighbor distances before I applied it to a real application. In this section, I follow the same models and steps described in Section 3.2.

The main difference between the inter-event method and the nearest-neighbor method is the amount of distance data obtained. In contrast with the $n(n-1)/2$ inter-event distances used for calculating the EDF, we have only $n$ nearest-neighbor distances. To ensure the accuracy of the approximated cumulative distribution function (CDF), an observed spatial point pattern should have a large number of events. After several trials, I found that at least 200 points should be simulated for the CSR point pattern so that the plot of the EDF is smooth enough.
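The nearest-neighbor step described in Section 4.1, taking the minimum of the inter-event distances from each event, can be sketched as follows. The sketch assumes that the inter-event network distances are already available as a symmetric matrix stored as a list of lists; the function name `nearest_neighbor_distances` is hypothetical.

```python
def nearest_neighbor_distances(dist):
    """Nearest-neighbor distance of each event.

    dist: symmetric n x n matrix of inter-event distances with n >= 2
          (assumed input format); the diagonal is ignored.
    """
    n = len(dist)
    return [min(dist[i][j] for j in range(n) if j != i) for i in range(n)]
```

For $n$ events this returns $n$ values, compared with the $n(n-1)/2$ distinct entries of the matrix used by the inter-event method.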

4.2.1 Random Process

Figure 4.1: Grid network and random point pattern

Figure 4.2: Envelope plot for random process on grid network (horizontal axis: H average)

Figure 4.3: Random network and random point pattern

Figure 4.4: Envelope plot for random process on random network (horizontal axis: H average)

Figure 4.2 shows that $\hat{K}(t)$ is roughly equal to $K(t)$ and lies between $U_2(t)$ and $L_2(t)$ throughout its range, which suggests acceptance of CSR. For the spatial point pattern on the randomly generated network, the envelope plot in Figure 4.4 shows acceptance of CSR too. So we conclude that these data are compatible with a completely random spatial distribution.

4.2.2 Cluster Process

Figure 4.5: Grid network and cluster point pattern

Figure 4.6: Envelope plot for cluster process on grid network (horizontal axis: H average)

Figure 4.7: Random network and cluster point pattern

Figure 4.8: Envelope plot for cluster process on random network (horizontal axis: H average)

Figure 4.6 shows that $\hat{K}(t)$ is greater than $K(t)$ within the range and lies above $U_2(t)$ for very small values of $K(t)$, so the plot indicates rejection of CSR. Similarly, the envelope plot in Figure 4.8 leads to rejection of CSR because of the obvious departure from complete spatial randomness. Since the probability of getting small nearest-neighbor distances is higher than that of a CSR point pattern, the simulated point pattern is more likely to be classified as a cluster process.

4.2.3 Regular Process

Figure 4.9: Grid network and regular point pattern

Figure 4.10: Envelope plot for regular process on grid network (horizontal axis: H average)

Figure 4.11: Random network and regular point pattern

Figure 4.12: Envelope plot for regular process on random network (horizontal axis: H average)

As shown in Figure 4.10, $\hat{K}(t)$ is less than $K(t)$ and even lower than $L_2(t)$ for small nearest-neighbor distances. Thus the envelope plot indicates rejection of complete spatial randomness. For the randomly generated network, the test result in Figure 4.12 also leads to rejection of CSR and to the conclusion that the point pattern tends to have larger nearest-neighbor distances than a CSR point pattern. So the simulated point pattern is more likely to be classified as a regular point pattern.

4.2.4 Conclusion

In conclusion, each of the three tests on the typical point processes leads to the conclusion that is consistent with the process used to generate the data. In other words, the proposed CSR test method based on nearest-neighbor distances works well in the network case and can be used to test any spatial point pattern on a network.

Chapter 5

Car Crash Point Pattern on the Minnesota Major Roads

In this chapter, I test the car crash point pattern on the Minnesota major roads with the methods based on inter-event distances and on nearest-neighbor distances, respectively. Both results suggest that car crashes on the Minnesota roads tend to follow a cluster point process.

5.1 Dataset

My datasets are the locations of car accidents in Minnesota in 2013 from the National Highway Traffic Safety Administration (NHTSA) and the major road network of Minnesota (U.S. highways, interstate highways, and Minnesota highways) from the Minnesota Geospatial Commons. Figure 5.1 shows how the car crash dataset originally appears on the website, and Figure 5.2 shows the Minnesota major roads network in the R environment.

Figure 5.1: Location of fatal crashes in Minnesota in 2013

Figure 5.2: R plot of the Minnesota major roads network

Some of the car accidents did not occur on the Minnesota highways. To consider car crashes on major roads only, I detected and eliminated the crash points that are not on the major road network. We end up with 234 observed points on this specific network for testing, which are shown in Figure 5.3 and Figure 5.4.

Figure 5.3: R plot of the car crash pattern on the Minnesota major roads

Figure 5.4: Display of the car crash pattern on the Minnesota major roads in ArcGIS

5.2 Implementation

To implement the CSR test methods proposed in the last two chapters, computing all the inter-event distances is the first issue to handle. Unlike the networks generated with the igraph package in R, the Minnesota major roads data is stored in a shapefile and is made up of hundreds of road segments, so in this case the inter-event distances along the Minnesota highways cannot be calculated with the igraph package in R. Fortunately, ArcGIS is very good at processing shapefiles and calculating distances on map data. Thus, I used ArcGIS to compute the inter-event distances of the car crash pattern and of all my simulated CSR patterns on the Minnesota major roads.

Besides computing inter-event distances, simulating CSR point patterns on the Minnesota major roads is also an essential part of testing, since the simulations are the key to obtaining $H(t)$. For the simulations, I followed the same steps described in the last two chapters and obtained 100 independent sets, each consisting of 200 CSR point locations. One of the 100 independent CSR point patterns is shown in Figure 5.5 and Figure 5.6.

Figure 5.5: R plot of a CSR point pattern on the Minnesota major roads

Figure 5.6: Display of a CSR point pattern on the Minnesota major roads in ArcGIS
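To make the notion of an inter-event distance along a road network concrete, the sketch below computes the shortest-path distance between two points that lie on edges of a small graph. It is a plain Python illustration using Dijkstra's algorithm, not the ArcGIS or igraph workflow used in this project. The adjacency format, the point format `(u, v, d)`, and the function names are all assumptions made for the example, and the graph is assumed to have comparable vertex labels and at most one edge between any two vertices.

```python
import heapq

def vertex_distances(adjacency, source):
    """Dijkstra shortest-path lengths from `source`.

    adjacency[u] is a list of (v, length) pairs (assumed format).
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                      # stale heap entry
        for v, w in adjacency[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def network_distance(adjacency, p, q):
    """Shortest-path distance between two points on edges.

    A point (u, v, d) lies on edge (u, v) at distance d from vertex u.
    """
    (u1, v1, d1), (u2, v2, d2) = p, q
    length1 = dict(adjacency[u1])[v1]
    length2 = dict(adjacency[u2])[v2]
    best = float("inf")
    if {u1, v1} == {u2, v2}:
        # Same edge: the direct route along the edge is a candidate.
        d2_from_u1 = d2 if u2 == u1 else length2 - d2
        best = abs(d1 - d2_from_u1)
    # Otherwise leave the first edge through one of its end vertices and
    # enter the second edge through one of its end vertices.
    for a, da in [(u1, d1), (v1, length1 - d1)]:
        dist = vertex_distances(adjacency, a)
        for b, db in [(u2, d2), (v2, length2 - d2)]:
            if b in dist:
                best = min(best, da + dist[b] + db)
    return best
```

Applying `network_distance` to every pair of events would give the inter-event distances needed for the EDF, at the cost of repeated shortest-path searches that a dedicated GIS tool handles more efficiently.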


More information

Dielectrics. Lecture 20: Electromagnetic Theory. Professor D. K. Ghosh, Physics Department, I.I.T., Bombay

Dielectrics. Lecture 20: Electromagnetic Theory. Professor D. K. Ghosh, Physics Department, I.I.T., Bombay What are dielectrics? Dielectrics Lecture 20: Electromagnetic Theory Professor D. K. Ghosh, Physics Department, I.I.T., Bombay So far we have been discussing electrostatics in either vacuum or in a conductor.

More information

Figure 10.1: Recording when the event E occurs

Figure 10.1: Recording when the event E occurs 10 Poisson Processes Let T R be an interval. A family of random variables {X(t) ; t T} is called a continuous time stochastic process. We often consider T = [0, 1] and T = [0, ). As X(t) is a random variable

More information

STRONG FORMS OF ORTHOGONALITY FOR SETS OF HYPERCUBES

STRONG FORMS OF ORTHOGONALITY FOR SETS OF HYPERCUBES The Pennsylvania State University The Graduate School Department of Mathematics STRONG FORMS OF ORTHOGONALITY FOR SETS OF HYPERCUBES A Dissertation in Mathematics by John T. Ethier c 008 John T. Ethier

More information

MONTE CARLO METHODS IN SEQUENTIAL AND PARALLEL COMPUTING OF 2D AND 3D ISING MODEL

MONTE CARLO METHODS IN SEQUENTIAL AND PARALLEL COMPUTING OF 2D AND 3D ISING MODEL Journal of Optoelectronics and Advanced Materials Vol. 5, No. 4, December 003, p. 971-976 MONTE CARLO METHODS IN SEQUENTIAL AND PARALLEL COMPUTING OF D AND 3D ISING MODEL M. Diaconu *, R. Puscasu, A. Stancu

More information

Shortest paths with negative lengths

Shortest paths with negative lengths Chapter 8 Shortest paths with negative lengths In this chapter we give a linear-space, nearly linear-time algorithm that, given a directed planar graph G with real positive and negative lengths, but no

More information

A first model of learning

A first model of learning A first model of learning Let s restrict our attention to binary classification our labels belong to (or ) We observe the data where each Suppose we are given an ensemble of possible hypotheses / classifiers

More information

Rigid Geometric Transformations

Rigid Geometric Transformations Rigid Geometric Transformations Carlo Tomasi This note is a quick refresher of the geometry of rigid transformations in three-dimensional space, expressed in Cartesian coordinates. 1 Cartesian Coordinates

More information

On the Distortion of Embedding Perfect Binary Trees into Low-dimensional Euclidean Spaces

On the Distortion of Embedding Perfect Binary Trees into Low-dimensional Euclidean Spaces On the Distortion of Embedding Perfect Binary Trees into Low-dimensional Euclidean Spaces Dona-Maria Ivanova under the direction of Mr. Zhenkun Li Department of Mathematics Massachusetts Institute of Technology

More information

* * MATHEMATICS (MEI) 4767 Statistics 2 ADVANCED GCE. Monday 25 January 2010 Morning. Duration: 1 hour 30 minutes. Turn over

* * MATHEMATICS (MEI) 4767 Statistics 2 ADVANCED GCE. Monday 25 January 2010 Morning. Duration: 1 hour 30 minutes. Turn over ADVANCED GCE MATHEMATICS (MEI) 4767 Statistics 2 Candidates answer on the Answer Booklet OCR Supplied Materials: 8 page Answer Booklet Graph paper MEI Examination Formulae and Tables (MF2) Other Materials

More information

x y = 1, 2x y + z = 2, and 3w + x + y + 2z = 0

x y = 1, 2x y + z = 2, and 3w + x + y + 2z = 0 Section. Systems of Linear Equations The equations x + 3 y =, x y + z =, and 3w + x + y + z = 0 have a common feature: each describes a geometric shape that is linear. Upon rewriting the first equation

More information

Research Collection. Grid exploration. Master Thesis. ETH Library. Author(s): Wernli, Dino. Publication Date: 2012

Research Collection. Grid exploration. Master Thesis. ETH Library. Author(s): Wernli, Dino. Publication Date: 2012 Research Collection Master Thesis Grid exploration Author(s): Wernli, Dino Publication Date: 2012 Permanent Link: https://doi.org/10.3929/ethz-a-007343281 Rights / License: In Copyright - Non-Commercial

More information

Probability Distributions Columns (a) through (d)

Probability Distributions Columns (a) through (d) Discrete Probability Distributions Columns (a) through (d) Probability Mass Distribution Description Notes Notation or Density Function --------------------(PMF or PDF)-------------------- (a) (b) (c)

More information

Sharp threshold functions for random intersection graphs via a coupling method.

Sharp threshold functions for random intersection graphs via a coupling method. Sharp threshold functions for random intersection graphs via a coupling method. Katarzyna Rybarczyk Faculty of Mathematics and Computer Science, Adam Mickiewicz University, 60 769 Poznań, Poland kryba@amu.edu.pl

More information

Analysis of climate-crop yield relationships in Canada with Distance Correlation

Analysis of climate-crop yield relationships in Canada with Distance Correlation Analysis of climate-crop yield relationships in Canada with Distance Correlation by Yifan Dai (Under the Direction of Lynne Seymour) Abstract Distance correlation is a new measure of relationships between

More information

Master thesis. Multi-class Fork-Join queues & The stochastic knapsack problem

Master thesis. Multi-class Fork-Join queues & The stochastic knapsack problem Master thesis Multi-class Fork-Join queues & The stochastic knapsack problem Sihan Ding August 26th, 2011 Supervisor UL: Dr. Floske Spieksma Supervisors CWI: Drs. Chrétien Verhoef Prof.dr. Rob van der

More information

Introduction: The Perceptron

Introduction: The Perceptron Introduction: The Perceptron Haim Sompolinsy, MIT October 4, 203 Perceptron Architecture The simplest type of perceptron has a single layer of weights connecting the inputs and output. Formally, the perceptron

More information

2015 Canadian Team Mathematics Contest

2015 Canadian Team Mathematics Contest The CENTRE for EDUCATION in MATHEMATICS and COMPUTING cemc.uwaterloo.ca 205 Canadian Team Mathematics Contest April 205 Solutions 205 University of Waterloo 205 CTMC Solutions Page 2 Individual Problems.

More information

A PLANAR SOBOLEV EXTENSION THEOREM FOR PIECEWISE LINEAR HOMEOMORPHISMS

A PLANAR SOBOLEV EXTENSION THEOREM FOR PIECEWISE LINEAR HOMEOMORPHISMS A PLANAR SOBOLEV EXTENSION THEOREM FOR PIECEWISE LINEAR HOMEOMORPHISMS EMANUELA RADICI Abstract. We prove that a planar piecewise linear homeomorphism ϕ defined on the boundary of the square can be extended

More information

Phase transitions and finite-size scaling

Phase transitions and finite-size scaling Phase transitions and finite-size scaling Critical slowing down and cluster methods. Theory of phase transitions/ RNG Finite-size scaling Detailed treatment: Lectures on Phase Transitions and the Renormalization

More information

Inferences about Parameters of Trivariate Normal Distribution with Missing Data

Inferences about Parameters of Trivariate Normal Distribution with Missing Data Florida International University FIU Digital Commons FIU Electronic Theses and Dissertations University Graduate School 7-5-3 Inferences about Parameters of Trivariate Normal Distribution with Missing

More information

A Monte Carlo Implementation of the Ising Model in Python

A Monte Carlo Implementation of the Ising Model in Python A Monte Carlo Implementation of the Ising Model in Python Alexey Khorev alexey.s.khorev@gmail.com 2017.08.29 Contents 1 Theory 1 1.1 Introduction...................................... 1 1.2 Model.........................................

More information

arxiv: v1 [math.co] 13 May 2016

arxiv: v1 [math.co] 13 May 2016 GENERALISED RAMSEY NUMBERS FOR TWO SETS OF CYCLES MIKAEL HANSSON arxiv:1605.04301v1 [math.co] 13 May 2016 Abstract. We determine several generalised Ramsey numbers for two sets Γ 1 and Γ 2 of cycles, in

More information

On the errors introduced by the naive Bayes independence assumption

On the errors introduced by the naive Bayes independence assumption On the errors introduced by the naive Bayes independence assumption Author Matthijs de Wachter 3671100 Utrecht University Master Thesis Artificial Intelligence Supervisor Dr. Silja Renooij Department of

More information

Bulletin of the Iranian Mathematical Society

Bulletin of the Iranian Mathematical Society ISSN: 117-6X (Print) ISSN: 1735-8515 (Online) Bulletin of the Iranian Mathematical Society Vol. 4 (14), No. 6, pp. 1491 154. Title: The locating chromatic number of the join of graphs Author(s): A. Behtoei

More information

Discriminant Analysis with High Dimensional. von Mises-Fisher distribution and

Discriminant Analysis with High Dimensional. von Mises-Fisher distribution and Athens Journal of Sciences December 2014 Discriminant Analysis with High Dimensional von Mises - Fisher Distributions By Mario Romanazzi This paper extends previous work in discriminant analysis with von

More information

Section 14.1 Vector Functions and Space Curves

Section 14.1 Vector Functions and Space Curves Section 14.1 Vector Functions and Space Curves Functions whose range does not consists of numbers A bulk of elementary mathematics involves the study of functions - rules that assign to a given input a

More information

Vote. Vote on timing for night section: Option 1 (what we have now) Option 2. Lecture, 6:10-7:50 25 minute dinner break Tutorial, 8:15-9

Vote. Vote on timing for night section: Option 1 (what we have now) Option 2. Lecture, 6:10-7:50 25 minute dinner break Tutorial, 8:15-9 Vote Vote on timing for night section: Option 1 (what we have now) Lecture, 6:10-7:50 25 minute dinner break Tutorial, 8:15-9 Option 2 Lecture, 6:10-7 10 minute break Lecture, 7:10-8 10 minute break Tutorial,

More information

ORBIT-HOMOGENEITY. Abstract

ORBIT-HOMOGENEITY. Abstract Submitted exclusively to the London Mathematical Society DOI: 10.1112/S0000000000000000 ORBIT-HOMOGENEITY PETER J. CAMERON and ALEXANDER W. DENT Abstract We introduce the concept of orbit-homogeneity of

More information

2016 EF Exam Texas A&M High School Students Contest Solutions October 22, 2016

2016 EF Exam Texas A&M High School Students Contest Solutions October 22, 2016 6 EF Exam Texas A&M High School Students Contest Solutions October, 6. Assume that p and q are real numbers such that the polynomial x + is divisible by x + px + q. Find q. p Answer Solution (without knowledge

More information

through any three given points if and only if these points are not collinear.

through any three given points if and only if these points are not collinear. Discover Parabola Time required 45 minutes Teaching Goals: 1. Students verify that a unique parabola with the equation y = ax + bx+ c, a 0, exists through any three given points if and only if these points

More information

CISC - Curriculum & Instruction Steering Committee. California County Superintendents Educational Services Association

CISC - Curriculum & Instruction Steering Committee. California County Superintendents Educational Services Association CISC - Curriculum & Instruction Steering Committee California County Superintendents Educational Services Association Primary Content Module The Winning EQUATION Algebra I - Linear Equations and Inequalities

More information

besides your solutions of these problems. 1 1 We note, however, that there will be many factors in the admission decision

besides your solutions of these problems. 1 1 We note, however, that there will be many factors in the admission decision The PRIMES 2015 Math Problem Set Dear PRIMES applicant! This is the PRIMES 2015 Math Problem Set. Please send us your solutions as part of your PRIMES application by December 1, 2015. For complete rules,

More information

Hierarchical Modeling and Analysis for Spatial Data

Hierarchical Modeling and Analysis for Spatial Data Hierarchical Modeling and Analysis for Spatial Data Bradley P. Carlin, Sudipto Banerjee, and Alan E. Gelfand brad@biostat.umn.edu, sudiptob@biostat.umn.edu, and alan@stat.duke.edu University of Minnesota

More information

PLC Papers Created For:

PLC Papers Created For: PLC Papers Created For: Daniel Inequalities Inequalities on number lines 1 Grade 4 Objective: Represent the solution of a linear inequality on a number line. Question 1 Draw diagrams to represent these

More information

SPECIAL CASES OF THE CLASS NUMBER FORMULA

SPECIAL CASES OF THE CLASS NUMBER FORMULA SPECIAL CASES OF THE CLASS NUMBER FORMULA What we know from last time regarding general theory: Each quadratic extension K of Q has an associated discriminant D K (which uniquely determines K), and an

More information

Applications. More Counting Problems. Complexity of Algorithms

Applications. More Counting Problems. Complexity of Algorithms Recurrences Applications More Counting Problems Complexity of Algorithms Part I Recurrences and Binomial Coefficients Paths in a Triangle P(0, 0) P(1, 0) P(1,1) P(2, 0) P(2,1) P(2, 2) P(3, 0) P(3,1) P(3,

More information

SCHOOL OF MATHEMATICS MATHEMATICS FOR PART I ENGINEERING. Self-paced Course

SCHOOL OF MATHEMATICS MATHEMATICS FOR PART I ENGINEERING. Self-paced Course SCHOOL OF MATHEMATICS MATHEMATICS FOR PART I ENGINEERING Self-paced Course MODULE ALGEBRA Module Topics Simplifying expressions and algebraic functions Rearranging formulae Indices 4 Rationalising a denominator

More information

Beyond the Point Cloud: From Transductive to Semi-Supervised Learning

Beyond the Point Cloud: From Transductive to Semi-Supervised Learning Beyond the Point Cloud: From Transductive to Semi-Supervised Learning Vikas Sindhwani, Partha Niyogi, Mikhail Belkin Andrew B. Goldberg goldberg@cs.wisc.edu Department of Computer Sciences University of

More information

Boundary Problems for One and Two Dimensional Random Walks

Boundary Problems for One and Two Dimensional Random Walks Western Kentucky University TopSCHOLAR Masters Theses & Specialist Projects Graduate School 5-2015 Boundary Problems for One and Two Dimensional Random Walks Miky Wright Western Kentucky University, miky.wright768@topper.wku.edu

More information

Classification of root systems

Classification of root systems Classification of root systems September 8, 2017 1 Introduction These notes are an approximate outline of some of the material to be covered on Thursday, April 9; Tuesday, April 14; and Thursday, April

More information

Basic Sampling Methods

Basic Sampling Methods Basic Sampling Methods Sargur Srihari srihari@cedar.buffalo.edu 1 1. Motivation Topics Intractability in ML How sampling can help 2. Ancestral Sampling Using BNs 3. Transforming a Uniform Distribution

More information

H 2 : otherwise. that is simply the proportion of the sample points below level x. For any fixed point x the law of large numbers gives that

H 2 : otherwise. that is simply the proportion of the sample points below level x. For any fixed point x the law of large numbers gives that Lecture 28 28.1 Kolmogorov-Smirnov test. Suppose that we have an i.i.d. sample X 1,..., X n with some unknown distribution and we would like to test the hypothesis that is equal to a particular distribution

More information

ENGRG Introduction to GIS

ENGRG Introduction to GIS ENGRG 59910 Introduction to GIS Michael Piasecki October 13, 2017 Lecture 06: Spatial Analysis Outline Today Concepts What is spatial interpolation Why is necessary Sample of interpolation (size and pattern)

More information

Kernel Methods and Support Vector Machines

Kernel Methods and Support Vector Machines Kernel Methods and Support Vector Machines Oliver Schulte - CMPT 726 Bishop PRML Ch. 6 Support Vector Machines Defining Characteristics Like logistic regression, good for continuous input features, discrete

More information

More Powerful Tests for Homogeneity of Multivariate Normal Mean Vectors under an Order Restriction

More Powerful Tests for Homogeneity of Multivariate Normal Mean Vectors under an Order Restriction Sankhyā : The Indian Journal of Statistics 2007, Volume 69, Part 4, pp. 700-716 c 2007, Indian Statistical Institute More Powerful Tests for Homogeneity of Multivariate Normal Mean Vectors under an Order

More information

1.3 Distance and Midpoint Formulas

1.3 Distance and Midpoint Formulas Graduate Teacher Department of Mathematics San Diego State University Dynamical Systems Program August 29, 2011 In mathematics, a theorem is a statement that has been proven on the basis of previously

More information

This pre-publication material is for review purposes only. Any typographical or technical errors will be corrected prior to publication.

This pre-publication material is for review purposes only. Any typographical or technical errors will be corrected prior to publication. This pre-publication material is for review purposes only. Any typographical or technical errors will be corrected prior to publication. Copyright Pearson Canada Inc. All rights reserved. Copyright Pearson

More information

A Thesis Proposal. Agrawal, Ravi. Submitted to the Office of Graduate Studies of Texas A&M University

A Thesis Proposal. Agrawal, Ravi. Submitted to the Office of Graduate Studies of Texas A&M University Using Finite Element Structural Analysis of Retroreflective Raised Pavement Markers (RRPMs) to Recommend Testing Procedures for Simulating Field Performance of RRPMs A Thesis Proposal By Agrawal, Ravi

More information