A Hypothesis-Free Multiple Scan Statistic with Variable Window


Biometrical Journal 50 (2008) 2, DOI: /bimj

L. Cucala *

Institut de Mathématiques de Toulouse, Université Paul Sabatier, Toulouse Cedex 9, France

Received 7 August 2007, revised 14 January 2008, accepted 28 January 2008

Summary

In this article we propose a new technique for identifying clusters in temporal point processes. It relies on the comparison between all the m-order spacings and is totally independent of any alternative hypothesis. A recursive procedure is introduced which makes it possible to identify multiple clusters independently. This new scan statistic seems to be more efficient than the classical scan statistic for detecting and recovering cluster alternatives. These results have applications in epidemiological studies of rare diseases.

Key words: Disease alarm; Epidemiology; Scan statistics; Spacings.

1 Introduction

Let $X_1, \ldots, X_n$ be independent and identically distributed random variables that denote the times of occurrence of $n$ events observed in an interval $(0, T)$. Without loss of generality, let $T = 1$ throughout this paper. The objective of this work is to identify, if they exist, the subintervals in which the number of events is abnormally high, usually named clusters. The absence of clusters corresponds to the null hypothesis $H_0$ that the events are uniformly distributed on $(0, 1)$. This problem arises typically in epidemiological applications when one wants to assess the outbreak of an unknown disease and to analyse, for example, whether it is infectious or dependent on a seasonal environmental factor. In these cases, it is crucial to recover the clusters (or clustering zone) as precisely as possible.

Many procedures are applicable to aggregated data, that is when the observation period is divided into subintervals and only the number of events within each interval is known.
See, for example, the test introduced by Tango (1984), which relies on the division of the time interval into equal subintervals. These procedures may also be used when individual data are available, but the loss of information may be substantial and the test result depends on the arbitrarily chosen division. Moreover, as monitoring tools improve and computational capacities increase, methods for individual data become more and more relevant.

The most popular method for individual data is the scan statistic, first introduced by Naus (1965). Originally it was simply the maximum number of events observed within an interval of fixed length $d$. Later on, Nagarwalla (1996) extended the method so as to compare intervals with different lengths: he introduced the scan statistic with variable window, also called the variable scan statistic. The test derived from this statistic is the generalized likelihood-ratio test for a uniform distribution against a piecewise-constant alternative. However, we may wonder whether this test is powerful against other, sometimes more realistic, clustering alternatives such as bell-shaped densities. A non-parametric method, due to Kelsall and Diggle (1995), relies on a kernel intensity estimation of the point process and identifies the clusters as the intervals in which the intensity estimate is higher than expected under the null hypothesis. Many other tests, so-called global tests, are powerful for rejecting the uniform hypothesis but do not identify the most significant clusters. Many of these tests rely on spacings (Pyke, 1965). Recently, Molinari, Bonaldi and Daurès (2001) introduced a cluster detection method based on applying a piecewise-constant regression model to the spacings between the times of occurrence of events. This method allows for multiple cluster detection but does not take into account the dependence between spacings. Moreover, like the variable scan statistic, it seems better suited to piecewise-constant cluster alternatives.

In this article, we introduce a multiple cluster detection method based on spacings and not relying on any particular alternative hypothesis. Section 2 introduces the test statistic, which is based on a new concentration index. We then study the distribution of this statistic under the null hypothesis in Section 3. In Section 4 we present a multiple procedure for detecting several clusters. The method is assessed by applying it to simulated and real data in Section 5. The paper concludes with a short discussion in Section 6.

* cucala@cict.fr
© 2008 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim

2 A Hypothesis-Free Scan Statistic

Let $X_1, \ldots, X_n$ be defined as in the Introduction. We assume that the hypothesis $H_0$ is satisfied throughout this section. Let

$$0 = X_{(0)} \le X_{(1)} \le \cdots \le X_{(n)} \le X_{(n+1)} = 1$$

denote the order statistics associated to $(X_1, \ldots, X_n)$ and

$$D_i = X_{(i)} - X_{(i-1)}, \quad i = 1, \ldots, n+1,$$

denote the associated (one-order) spacings. Since a single event cannot be considered as a cluster, the potential clusters are all the intervals $[X_{(i)}, X_{(j)}]$, $1 \le i < j \le n$. We need a concentration index in order to compare them. The one introduced by Nagarwalla (1996) relies on the generalized likelihood-ratio test between $H_0$ and a piecewise-constant density hypothesis.
We introduce another concentration index which is only based on $H_0$. Let

$$D_{i,j} = X_{(j)} - X_{(i)} = \sum_{k=i+1}^{j} D_k, \quad 1 \le i < j \le n,$$

denote the $(j-i)$-order spacings associated to $(X_1, \ldots, X_n)$: they are beta distributed with parameters $a = j - i$ and $b = n + 1 - j + i$ (David, 1981). Let also

$$U_{i,j} = B_{inc}(D_{i,j};\, j-i,\, n+1-j+i), \quad 1 \le i < j \le n,$$

where $B_{inc}(\cdot\,;\, j-i,\, n+1-j+i)$ stands for the so-called incomplete Beta function, that is the distribution function of $D_{i,j}$. All the random variables $U_{i,j}$ are uniformly distributed on $(0, 1)$. Moreover, the smaller $U_{i,j}$ is, the greater the concentration of events on the interval $[X_{(i)}, X_{(j)}]$ is. Thus, a concentration index on the interval $[X_{(i)}, X_{(j)}]$ is the quantity

$$I_{HF}(i, j) = 1 / U_{i,j}$$

and the hypothesis-free (HF in short) scan statistic,

$$L_{HF} = \sup_{1 \le i < j \le n} I_{HF}(i, j),$$

is the maximum concentration observed on the whole domain.
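The definitions of this section can be sketched in a few lines of Python. This is only an illustrative sketch, not the author's R programs: the function names `beta_cdf_int` and `hf_scan` are hypothetical, and the incomplete Beta function is evaluated through the standard identity $B_{inc}(x; a, b) = P(\mathrm{Bin}(a+b-1, x) \ge a)$, which holds for integer parameters and avoids any external library.

```python
from math import comb


def beta_cdf_int(x, a, b):
    """Distribution function of a Beta(a, b) law with integer a, b >= 1.

    Uses the identity B_inc(x; a, b) = P(Bin(a + b - 1, x) >= a).
    """
    n = a + b - 1
    return sum(comb(n, k) * x**k * (1.0 - x) ** (n - k) for k in range(a, n + 1))


def hf_scan(times):
    """Return (L_HF, i*, j*) for event times in (0, 1).

    i* and j* are the 1-based ranks of the order statistics bounding the
    interval with the largest concentration index I_HF(i, j) = 1 / U_{i,j}.
    """
    x = sorted(times)
    n = len(x)
    best = (0.0, None, None)
    for i in range(1, n):
        for j in range(i + 1, n + 1):
            d_ij = x[j - 1] - x[i - 1]                  # (j - i)-order spacing
            u_ij = beta_cdf_int(d_ij, j - i, n + 1 - j + i)
            index = float("inf") if u_ij == 0.0 else 1.0 / u_ij
            if index > best[0]:
                best = (index, i, j)
    return best


if __name__ == "__main__":
    # Two close events (ranks 2 and 3) among four events.
    print(hf_scan([0.1, 0.5, 0.52, 0.9]))
```

For the four events above, the interval between the second and third order statistics gives the largest index, approximately 12.88. The double loop visits all $n(n-1)/2$ candidate intervals, so the cost of this naive sketch grows quickly with $n$.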

3 The Distribution of $L_{HF}$ under $H_0$

The second step of a cluster detection method consists in testing whether the maximal concentration observed on the domain is significant. For this, we need to know, or at least to estimate, the distribution of this maximal concentration under the null hypothesis. Since the spacings are all correlated, even under $H_0$, finding the distribution of spacings-based statistics has always been tedious. For example, many authors have studied the distribution of the scan statistic with fixed window, and it was finally tabulated by Huntington and Naus (1975). Letting the window vary makes the problem even more difficult. Nagarwalla (1996) did not study the distribution of his variable scan statistic $L_{scan}$, and his test is based on empirical quantiles obtained by a Monte Carlo procedure. However, in this section, we show how a technique introduced by Huffer and Lin (2001) can be used for deriving the distribution of $L_{HF}$ under $H_0$. Note that this technique can also be used for deriving the distribution of the variable scan statistic $L_{scan}$ (see Appendix A for more details).

We wish to express

$$P(t, n) = P_0(L_{HF} < t),$$

where $P_0$ stands for the probability under the null hypothesis. Let $b_{\alpha,n,m}$ denote the quantile of the beta distribution such that $B_{inc}(b_{\alpha,n,m};\, m,\, n+1-m) = \alpha$. We get that

$$\begin{aligned}
P(t, n) &= P_0\Big(\sup_{1 \le i < j \le n} I_{HF}(i, j) < t\Big) \\
&= P_0\big(U_{i,j} > 1/t,\ \forall\, 1 \le i < j \le n\big) \\
&= P_0\big(D_{i,j} > b_{1/t,n,j-i},\ \forall\, 1 \le i < j \le n\big) \\
&= P_0\Big(\sum_{k=i+1}^{j} D_k > b_{1/t,n,j-i},\ \forall\, 1 \le i < j \le n\Big).
\end{aligned}$$

Now, Huffer and Lin (2001) have introduced an algorithm for computing the joint distribution of general linear combinations of uniform spacings (i.e. spacings issued from the uniform distribution). This algorithm consists in a repeated and systematic use of two basic recursions and finally leads to a sum of simpler components which can be easily expressed in closed form.
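As a complement to the exact approach, $P(t, n)$ can be approximated by simulating uniform samples under $H_0$. The sketch below is only an illustration under stated assumptions: the helper names (`hf_statistic`, `null_sample`, `mc_quantile`, `mc_p_value`) are hypothetical, the incomplete Beta function is computed through the binomial identity valid for integer parameters, and the "+1" correction in the p-value is a common Monte Carlo convention that the paper does not specify.

```python
import math
import random
from math import comb


def hf_statistic(times):
    """L_HF = 1 / min U_{i,j} over all pairs of order statistics."""
    x = sorted(times)
    n = len(x)
    u_min = 1.0
    for i in range(n - 1):
        for j in range(i + 1, n):
            m = j - i                       # order of the spacing
            d = x[j] - x[i]
            # Beta(m, n + 1 - m) distribution function at d
            u = sum(comb(n, k) * d**k * (1.0 - d) ** (n - k) for k in range(m, n + 1))
            u_min = min(u_min, u)
    return float("inf") if u_min == 0.0 else 1.0 / u_min


def null_sample(n, n_sim, rng):
    """Sorted values of L_HF for n_sim uniform samples of size n."""
    return sorted(hf_statistic([rng.random() for _ in range(n)]) for _ in range(n_sim))


def mc_quantile(null_stats, alpha):
    """Empirical quantile q such that about 1 - alpha of the null values are <= q."""
    idx = min(len(null_stats) - 1, max(0, math.ceil((1.0 - alpha) * len(null_stats)) - 1))
    return null_stats[idx]


def mc_p_value(observed, null_stats):
    """Monte Carlo p-value with the usual +1 correction (an assumption here)."""
    exceed = sum(s >= observed for s in null_stats)
    return (exceed + 1) / (len(null_stats) + 1)


if __name__ == "__main__":
    rng = random.Random(2008)
    null = null_sample(n=10, n_sim=499, rng=rng)
    print(mc_quantile(null, 0.05))
```

The number of simulations and the seed are arbitrary choices made for the example. This simulation-based estimate only approximates $P(t, n)$; an exact evaluation requires the algorithm of Huffer and Lin (2001).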
Using this algorithm, we get, for example, that

$$P(1/t, 3) = \begin{cases}
0 & \text{if } 1/t < 0, \\
1 - (1 - b_{t,3,2})^2\,[1 - 4 b_{t,3,2} + 6 b_{t,3,1}] & \text{if } 1/t \in [0, 2/3], \\
1 - (1 - 2 b_{t,3,1})^3 & \text{if } 1/t \in (2/3, 7/8], \\
1 & \text{if } 1/t > 7/8.
\end{cases}$$

For larger values of $n$, the algorithm can only be applied by a computer as it deals with high-dimensional matrices. However, these matrices must be arranged in a specific form (sometimes called lower block triangular form) through the permutation and the deletion of certain rows and columns. Translating this algorithm into a program is not straightforward and, unfortunately, the C program provided by the authors only works with simpler examples. Therefore, in the following, we use empirical quantiles obtained by a Monte Carlo procedure to carry out all the tests related to $L_{HF}$ and $L_{scan}$.

4 A Multiple Procedure

In this section we introduce a method to identify multiple clustering. This procedure can be used either with the HF scan statistic (as presented here) or with the variable scan statistic.

The nominal level of the global test for $H_0$ is set to $\alpha$. Let $q_{n,\alpha}$ denote the quantile of $L_{HF}$ such that $P(q_{n,\alpha}, n) = 1 - \alpha$. The density function of the random variables $X_1, \ldots, X_n$ is denoted $f(\cdot)$. Let $m = \min_{t \in (0,1)} f(t)$. The clustering zone is $C = \{t \in (0, 1) : f(t) > m\}$ and the null hypothesis corresponds to $C = \emptyset$. The estimator of the cluster zone, denoted $\hat{C}$, is initially set to $\emptyset$.

The first step consists in computing $L_{HF}$ from $X_1, \ldots, X_n$. Let $i^*$ and $j^*$ denote the indexes such that $L_{HF} = I_{HF}(i^*, j^*)$. We assume that the interval $[X_{(i^*)}, X_{(j^*)}]$ is a significant cluster, that is $L_{HF} > q_{n,\alpha}$. In order to look for another significant cluster, we need to transform the data. Indeed, we do not want to analyse all the distances between events located in the cluster $[X_{(i^*)}, X_{(j^*)}]$. This is why we introduce a new set of random variables $\{X^{(2)}_k,\ k = 1, \ldots, n - j^* + i^*\}$ defined by

$$X^{(2)}_k = \begin{cases}
\dfrac{X_{(k)}}{T^*} & \text{if } 1 \le k \le i^*, \\[2ex]
\dfrac{X_{(k+j^*-i^*)} - X_{(j^*)} + X_{(i^*)}}{T^*} & \text{if } i^*+1 \le k \le n - j^* + i^*,
\end{cases}$$

where $T^* = 1 - X_{(j^*)} + X_{(i^*)}$. This data transformation is illustrated by Figure 1 for $n = 6$, $i^* = 2$ and $j^* = 4$.

Figure 1 The multiple procedure. [Step 1: original spacings $D_1, D_2, D_5, D_6, D_7$ located outside the cluster, of total length $T^*$. Step 2: rescaled spacings $D_1/T^*, D_2/T^*, D_5/T^*, D_6/T^*, D_7/T^*$.]

The spacings between these new data are all the original spacings located outside the cluster $[X_{(i^*)}, X_{(j^*)}]$, scaled so that their sum equals 1. Actually, if $C \subset [X_{(i^*)}, X_{(j^*)}]$, the random variables $\{X^{(2)}_k,\ k = 1, \ldots, n - j^* + i^*\}$ are distributed as order statistics issued from a $(0, 1)$-uniform sample. The proof is given in Appendix B. Therefore, we look for any cluster in this new data set using again the HF scan statistic $L_{HF}$. This procedure is repeated as long as $L_{HF}$ remains significant.

Note that this procedure could identify a first cluster $C_1$ and then a second one $C_2$ containing the first one. This just means that the concentration is more significant in $C_1$ but still high in $C_2 \setminus C_1$. This suggests that the density function $f(\cdot)$ may be bell-shaped.

Recently, Zhang, Kulldorff and Assunção (2007) introduced a multiple procedure adapted to spatial scan statistics (Kulldorff, 1997).
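The data transformation and the recursion of this section can be sketched as follows. This is a hedged illustration rather than the author's implementation: `collapse_cluster` and `multiple_procedure` are hypothetical names, the scan function and the critical value are passed in as arguments so that any statistic and any quantile estimate can be plugged in, and the bookkeeping that maps each detected cluster back to the original time scale is an assumption of this sketch.

```python
def collapse_cluster(x, i_star, j_star):
    """Transformed data X^(2) after removing the cluster [X_(i*), X_(j*)].

    x must be sorted; i_star < j_star are 1-based ranks.
    """
    shift = x[j_star - 1] - x[i_star - 1]      # length of the cluster
    t_star = 1.0 - shift                        # T* = 1 - X_(j*) + X_(i*)
    left = [v / t_star for v in x[:i_star]]     # k = 1, ..., i*
    right = [(v - shift) / t_star for v in x[j_star:]]  # k = i*+1, ..., n-j*+i*
    return left + right


def multiple_procedure(times, scan, critical_value):
    """Repeat the scan while it is significant; return clusters on the original scale.

    scan(x) must return (statistic, i*, j*) with 1-based ranks;
    critical_value(n) must return the rejection threshold for n events.
    """
    x = sorted(times)
    spans = [(v, v) for v in x]   # original extent represented by each current point
    clusters = []
    while len(x) >= 2:
        stat, i, j = scan(x)
        if stat <= critical_value(len(x)):
            break
        merged = (spans[i - 1][0], spans[j - 1][1])
        clusters.append(merged)
        # the whole cluster is now represented by a single point
        spans = spans[: i - 1] + [merged] + spans[j:]
        x = collapse_cluster(x, i, j)
    return clusters


if __name__ == "__main__":
    # Setting of Figure 1: n = 6, i* = 2, j* = 4.
    print(collapse_cluster([0.1, 0.2, 0.3, 0.4, 0.7, 0.9], 2, 4))
```

In the example, $T^* = 0.8$ and the four transformed points have spacings $D_1/T^*, D_2/T^*, D_5/T^*, D_6/T^*, D_7/T^*$, which sum to 1. Because a collapsed cluster survives as a single point, a later cluster may contain an earlier one, as remarked above. This rescaling step is what distinguishes the approach from the spatial multiple procedure of Zhang, Kulldorff and Assunção (2007) mentioned above.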
Their procedure simply consists in removing the most significant cluster and continuing the search. Note that doing the same in the temporal setting is not appropriate, since the data to analyse in step two would be $\{X^{(2)}_1, \ldots, X^{(2)}_{i^*-1}, X^{(2)}_{i^*+1}, \ldots, X^{(2)}_{n-j^*+i^*}\}$, which are not distributed as order statistics issued from a uniform sample, even if $[X_{(i^*)}, X_{(j^*)}]$ coincides exactly with $C$.

5 Data Analysis

In this section we compare the results obtained by the test based on the HF scan statistic with those obtained by the test based on the variable scan statistic. The multiple procedure introduced in the previous section is used both with the HF scan statistic $L_{HF}$ (denoted $ML_{HF}$) and with the variable scan statistic $L_{scan}$ (denoted $ML_{scan}$).

5.1 Simulated data

We apply the different cluster detection methods to data sets containing $n = 100$ events. We compare the results obtained by the HF and variable scan statistics, $L_{HF}$ and $L_{scan}$, and the multiple procedures derived from these statistics, $ML_{HF}$ and $ML_{scan}$. Note that the variable scan statistic depends on the parameter $n_0$, the minimal number of events in a cluster, which must be set before applying the procedure. Here, we set this parameter to $n_0 = 5$, which is the arbitrary value chosen by Nagarwalla (1996).

5.1.1 One-cluster detection

The events are simulated according to different one-cluster alternatives. A flat cluster is obtained using the density function

$$f_1(x) = \frac{1}{0.8 + 0.2r}\, \mathbf{1}_{(0,0.4] \cup [0.6,1)}(x) + \frac{r}{0.8 + 0.2r}\, \mathbf{1}_{(0.4,0.6)}(x).$$

A bell-shaped cluster is obtained using the density function

$$f_2(x) = \frac{15}{2r + 13}\, \mathbf{1}_{(0,0.4] \cup [0.6,1)}(x) + \frac{15}{2r + 13}\, \big\{1 + (r - 1)\,[1 - 100 (x - 0.5)^2]\big\}\, \mathbf{1}_{(0.4,0.6)}(x).$$

In both cases, the parameter $r$ is the ratio between the maximum density and the minimum density on $(0, 1)$. These two functions are plotted in Figure 2 for different values of $r$.

We compare the power of the global tests associated to each method. Moreover, we check whether the significant clusters exhibited by the different methods match the true cluster, that is the interval in which the density is higher, $C = (0.4, 0.6)$ in our examples. Let $\nu(\cdot)$ denote the Lebesgue measure and let $\bar{A} = (0, 1) \setminus A$, $\forall A \subset (0, 1)$. A true positive index is given by $TP = \nu(C \cap \hat{C})$. A true negative index is given by $TN = \nu(\bar{\hat{C}} \cap \bar{C})$. Finally, a correspondence index between the true and the estimated clustering zone is given by $I = TP + TN$.

For each alternative, 1000 data sets are simulated. The nominal level of the global test is set to 5%. All the tests conducted in this section rely on a Monte Carlo procedure. Tables 1 and 2 give the results obtained with two different cluster shapes. The bold values are the best results obtained among all the methods.

By definition, the multiple procedures have the same power as the scan statistics they are derived from. Their true positive index is larger and their true negative index smaller.
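The three indices defined above only involve Lebesgue measures of unions of intervals, so they are easy to compute. The following sketch assumes that the true and estimated clustering zones are given as lists of `(lower, upper)` pairs inside $(0, 1)$; the function names are hypothetical. It uses the fact that $\bar{\hat{C}} \cap \bar{C}$ is the complement of $C \cup \hat{C}$, so that $TN = 1 - \nu(C \cup \hat{C})$.

```python
def merge(intervals):
    """Union of intervals as a sorted list of disjoint (lower, upper) pairs."""
    merged = []
    for lo, hi in sorted(intervals):
        if merged and lo <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged


def measure(intervals):
    """Lebesgue measure of a union of intervals."""
    return sum(hi - lo for lo, hi in merge(intervals))


def intersection(zone_a, zone_b):
    """Intersection of two unions of intervals."""
    out = []
    for lo1, hi1 in merge(zone_a):
        for lo2, hi2 in merge(zone_b):
            lo, hi = max(lo1, lo2), min(hi1, hi2)
            if lo < hi:
                out.append((lo, hi))
    return out


def correspondence_indices(true_zone, estimated_zone):
    """Return (TP, TN, I) for zones included in (0, 1)."""
    tp = measure(intersection(true_zone, estimated_zone))
    tn = 1.0 - measure(list(true_zone) + list(estimated_zone))
    return tp, tn, tp + tn


if __name__ == "__main__":
    # True cluster (0.4, 0.6) against an estimate that overshoots to the right.
    print(correspondence_indices([(0.4, 0.6)], [(0.45, 0.7)]))
```

With the true cluster $C = (0.4, 0.6)$ and the estimate $(0.45, 0.7)$, the sketch gives $TP = 0.15$, $TN = 0.7$ and $I = 0.85$, up to floating-point rounding. An empty estimate gives $I = 0.8$ and a perfect estimate gives $I = 1$.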
We first remark that the HF scan statistic is always more powerful than the variable scan statistic, even when the cluster alternative is piecewise-constant. The multiple procedure $ML_{HF}$ always gets the best true positive index and the variable scan statistic $L_{scan}$ the best true negative index. It seems that the HF concentration index gives more weight to larger intervals containing a great number of events than the concentration index derived by Nagarwalla (1996). Thus, the HF scan statistic usually selects larger clusters than the variable scan statistic. Since the correspondence index is always larger for the HF methods, these techniques seem better suited to cluster detection than the classical scan statistic.

Figure 2 One-cluster alternative functions. [Two panels showing $f$ against $x$ for $r = 2$, $r = 3$ and $r = 5$.]

Table 1 Tests applied to a simulated bell cluster. [Empirical Power, TP, TN and I of $L_{HF}$, $ML_{HF}$, $L_{scan}$ and $ML_{scan}$ for each value of $r$.]

5.1.2 Multiple cluster detection

Two bell-shaped clusters are obtained using the density function

$$\begin{aligned}
f_3(x) ={}& \frac{15}{4r + 11}\, \mathbf{1}_{(0,0.2] \cup [0.4,0.6] \cup [0.8,1)}(x) \\
&+ \frac{15}{4r + 11}\, \big\{1 + (r - 1)\,[1 - 100 (x - 0.3)^2]\big\}\, \mathbf{1}_{(0.2,0.4)}(x) \\
&+ \frac{15}{4r + 11}\, \big\{1 + (r - 1)\,[1 - 100 (x - 0.7)^2]\big\}\, \mathbf{1}_{(0.6,0.8)}(x),
\end{aligned}$$

which is plotted in Figure 3 for different values of $r$.

Table 2 Tests applied to a simulated flat cluster. [Empirical Power, TP, TN and I of $L_{HF}$, $ML_{HF}$, $L_{scan}$ and $ML_{scan}$ for each value of $r$.]

Figure 3 Two-cluster alternative functions. [$f$ against $x$ for $r = 2$, $r = 3$ and $r = 5$.]

Table 3 Tests applied to two simulated bell clusters. [Empirical Power, TP, TN and I of $L_{HF}$, $ML_{HF}$, $L_{scan}$ and $ML_{scan}$ for each value of $r$.]

Then, the simulation study is exactly the same. Table 3 gives the results. These results confirm the efficiency of the HF methods compared to the classical ones. They also point out the benefit of the multiple procedure when the clustering zone is the union of disjoint sets.

5.2 Real data

5.2.1 Knox data set

The same methods are now applied to a classical data set published by Knox (1959), which describes the birth defects oesophageal atresia and tracheo-oesophageal fistula observed in a hospital in Birmingham, UK, over a period of 2191 days beginning in 1950. The observation days corresponding to the 35 events are plotted in Figure 4. There is a great concentration of events around the middle of the observation period and another one at the end. But comparing these two possible clusters and, moreover, assessing their significance is not possible by visual inspection.

Before applying the cluster detection methods, the observation days are scaled so that they lie in $[0, 1]$. The variable scan statistic $L_{scan}$ is computed for two different values of the parameter $n_0$, the minimal number of events in a cluster. Table 4 gives the most significant clusters $\hat{C}$ and the associated p-value for each method.

Figure 4 Knox data set.

The tests are carried out at a fixed nominal level $\alpha$. The HF scan method identifies a significant cluster from event 8 to event 22: it corresponds to the high concentration in the middle of the observation period. This interval also maximises the concentration index used by Nagarwalla (1996). However, the p-values for $L_{scan}$ associated to this cluster depend strongly on the parameter $n_0$. Choosing $n_0 = 5$, the arbitrary value recommended by Nagarwalla (1996), leads to significant clustering. Nevertheless, a logical choice for $n_0$ is 2, since two events may constitute a cluster if they are close enough. This is the implicit choice made when we use $ML_{HF}$. But making this choice for $ML_{scan}$ leads to rejecting the significance of the cluster. Therefore, the results obtained with the variable scan method are quite difficult to interpret, whereas the HF scan method clearly indicates significant clustering.

Table 4 Tests applied to Knox data set.

Method                   $\hat{C}$                              p-value
$ML_{HF}$                [1233, 1491]
                         [1233, 1491] ∪ [2049, 2174]
$ML_{scan}$ ($n_0 = 2$)  [1233, 1491]
$ML_{scan}$ ($n_0 = 5$)  [1233, 1491]
                         [1233, 1491] ∪ [2049, 2174]

A second cluster is detected at the end of the period, between event 29 and event 35, but its p-value is too high, whatever the method. However, it is important to note that this could be the beginning of a more significant cluster which is not detected because the study stops at the end of the observation period.

5.2.2 Lancashire data set

We now apply the methods to a data set describing the location of cancers of the larynx and of the lung recorded between 1973 and 1984 in the Chorley-Ribble area, Lancashire, UK (Diggle et al., 1990). In Figure 5, the lung cancer cases are represented by smaller dots and the larynx cancer cases by larger dots. The downward-pointing triangle represents a disused incinerator around which the larynx cancer cases are suspected to aggregate. Since the lung cancer cases seem to be less dependent on any environmental factor, they are used as control data.

Figure 5 Cancer cases in Lancashire.

Let $X_1, \ldots, X_n$ denote the Euclidean distances from the $n$ larynx cancer cases to the incinerator and $Y_1, \ldots, Y_m$ denote the Euclidean distances from the $m$ lung cancer cases to the incinerator. We would like to test whether the $X_i$'s are distributed as the $Y_i$'s or whether they cluster around 0. First, we need to transform the data into

$$T_i = \int_{-\infty}^{X_i} \hat{f}_h(t)\, dt, \quad 1 \le i \le n,$$

where $\hat{f}_h(\cdot)$ is a kernel density estimate issued from the $Y_i$'s, and $h$ is a bandwidth given by a cross-validation method (Scott, 1992). Then the different scan methods are applied to $T_1, \ldots, T_n$ and Table 5 gives the most significant cluster and the associated p-value for each one.

Table 5 Tests applied to Lancashire data set.

Method                   $\hat{C}$    p-value
$ML_{HF}$                [1, 4]
$ML_{scan}$ ($n_0 = 2$)  [1, 3]
$ML_{scan}$ ($n_0 = 5$)  [1, 5]

We set the nominal level to $\alpha = 0.01$. The HF scan method identifies a very significant cluster containing the 4 larynx cancer cases closest to the incinerator. On the other hand, the variable scan method gives contradictory results, depending on the value of the parameter $n_0$. When $n_0 = 2$, a significant cluster containing the closest 3 larynx cancer cases is exhibited. But when $n_0 = 5$, it does not detect significant clustering. Indeed, the fifth closest larynx cancer case is further from the incinerator than many lung cancer cases. These results confirm that the variable scan statistic is very efficient when one knows a priori the number of events in the clusters to detect. When this is not the case, which happens most of the time, the HF scan method should be preferred.

Note that this data set has previously been studied by Kelsall and Diggle (1995): they compared the kernel density estimates based on the $X_i$'s and the $Y_i$'s but they obtained a non-significant result. We believe that this comes from using a statistic reflecting the global discrepancy between $H_0$ and the data set in the whole observation domain. By contrast, all the scan statistics rely on a local discrepancy measure in the possible clusters and are thus more likely to detect small clusters.

6 Discussion

The presented method allows one to detect several clusters in temporal data without assuming anything about the clustering structure, and without setting up any parameter.
The widely used variable scan statistic (Nagarwalla, 1996) also allows intervals of different lengths to be compared, but it appears that the parameter $n_0$, the minimal number of events in a cluster, plays a major role. The HF scan statistic is much less influenced by this minimal number of events, so that it can be considered as parameter-free. Moreover, its empirical results are better.

When people apply the variable scan statistic $L_{scan}$, they often rank all intervals $[X_{(i)}, X_{(j)}]$ according to the concentration index introduced by Nagarwalla (1996). Then they consider that the significant clusters are all the intervals where the concentration is higher than the $\alpha$-quantile of $L_{scan}$. This leads to a conservative procedure. The multiple procedure we introduce here makes it possible to detect multiple clusters independently and gives a significance value to each cluster. Moreover, this procedure does not restrict the performance of the single test when only one cluster is present.

It is sometimes necessary to take into account the inhomogeneity of the observed population, as is usually done in the spatial setting (Kulldorff, 1997). The adaptation of the HF scan statistic to inhomogeneity is obtained through the transformation introduced in the Lancashire data set analysis, which is similar to the spacings transformation introduced by Molinari, Bonaldi and Daurès (2001).

In this article, we only discussed scan statistics used for retrospective disease surveillance, that is statistics regarding all intervals as potential clusters. Recently, Kulldorff (2001) introduced a prospective scan statistic which only considers the time intervals that are still alive, that is those including the end of the observation period. This statistic is based on the variable scan statistic, but the adaptation to the HF scan statistic is straightforward and could provide an efficient tool for regular time periodic disease surveillance.
Even if the theory underlying the HF scan statistic relies on individual data, the concentration index $I_{HF}$ may also quantify the concentration on a subinterval when only the number of events, and not their locations, is known. Thus, it could also be used for analysing aggregated data.

Finally, a possible extension of this method to spatial or spatio-temporal clustering problems is to transform the data as proposed by Dematteï, Molinari and Daurès (2007) to construct a distance between events. This is the subject of future work.

Appendix

Appendix A

Let $c_{\alpha,n,m}$ denote the quantity such that $f_{scan}(c_{\alpha,n,m};\, n, m) = \alpha$, where

$$f_{scan}(t;\, n, m) = \left(\frac{m+1}{nt}\right)^{m+1} \left(\frac{n-m-1}{n(1-t)}\right)^{n-m-1} \mathbf{1}\!\left\{\frac{m+1}{n} > t\right\}.$$

The concentration index on the interval $[X_{(i)}, X_{(j)}]$ related to the variable scan statistic is

$$I_{scan}(i, j) = f_{scan}(D_{i,j};\, n,\, j-i).$$

Thus, we get that

$$\begin{aligned}
P_0(L_{scan} < t) &= P_0\Big(\sup_{1 \le i < j \le n} I_{scan}(i, j) < t\Big) \\
&= P_0\big(f_{scan}(D_{i,j};\, n,\, j-i) < t,\ \forall\, 1 \le i < j \le n\big) \\
&= P_0\big(D_{i,j} > c_{t,n,j-i},\ \forall\, 1 \le i < j \le n\big) \\
&= P_0\Big(\sum_{k=i+1}^{j} D_k > c_{t,n,j-i},\ \forall\, 1 \le i < j \le n\Big).
\end{aligned}$$

Appendix B

Let $N_C$ denote the random number of events located in the clustering zone $C$. Given $N_C = n_C$, there are $(n - n_C)$ events, denoted by $Y_1 \le \cdots \le Y_{n-n_C}$, uniformly distributed on $\bar{C} = (0, 1) \setminus C$. Let $Z_k = \nu((0, Y_k) \cap \bar{C})$, $k = 1, \ldots, n - n_C$. The random variables $Z_1, \ldots, Z_{n-n_C}$ are order statistics associated to an $(n - n_C)$ uniform sample on $(0, \nu(\bar{C}))$. Since $C \subset [X_{(i^*)}, X_{(j^*)}]$, we get that $X_{(i^*)} = Z_{i^*}$ and $X_{(j^*)} = Z_{j^*-n_C}$. Let

$$\tilde{D}_k = Z_k - Z_{k-1}, \quad k = 1, \ldots, n - n_C + 1,$$

denote the spacings associated to $Z_1, \ldots, Z_{n-n_C}$, where $Z_0 = 0$ and $Z_{n-n_C+1} = \nu(\bar{C})$. Since these uniform spacings are interchangeable random variables, the random variables

$$\tilde{X}_k = \begin{cases}
\displaystyle\sum_{l=1}^{k} \tilde{D}_l & \text{if } 1 \le k \le i^*, \\[2ex]
\displaystyle\sum_{l=1}^{i^*} \tilde{D}_l + \sum_{l=j^*-n_C+1}^{k+j^*-i^*} \tilde{D}_l & \text{if } i^*+1 \le k \le n - n_C - j^* + i^*,
\end{cases}$$

are order statistics associated to an $(n - n_C - j^* + i^*)$ uniform sample on $(0, T^*)$. Scaling these random variables by $T^*$ leads to the expected result.

Acknowledgements This research project is within the scope of my Ph.D. thesis work under the direction of Christine Thomas-Agnan. I gratefully thank her for her constant help and support. All programs are written in the R language and are available on request.

Conflict of Interests Statement

The author has declared no conflict of interest.

References

David, H. A. (1981). Order statistics. Second edition. New York: Wiley.
Dematteï, C., Molinari, N., and Daurès, J. P. (2007). Arbitrarily shaped multiple spatial cluster detection for case event data. Computational Statistics and Data Analysis 51.
Diggle, P. J., Gatrell, A. C., and Lovett, A. A. (1990). Modelling the prevalence of cancer of the larynx in part of Lancashire: a new methodology for spatial epidemiology. In: R. W. Thomas (ed.), Spatial epidemiology, 21. London: Pion.
Huffer, F. W. and Lin, C. T. (2001). Computing the joint distribution of general linear combinations of spacings or exponential variates. Statistica Sinica 11.
Huntington, R. J. and Naus, J. I. (1975). A simpler expression for k-th nearest neighbor coincidence probabilities. Annals of Probability 5.
Kelsall, J. E. and Diggle, P. J. (1995). Kernel estimation of relative risk. Bernoulli 1.
Knox, G. (1959). Secular pattern of congenital oesophageal atresia. British Journal of Preventive and Social Medicine 13.
Kulldorff, M. (1997). A spatial scan statistic. Communications in Statistics. Theory and Methods 6.
Kulldorff, M. (2001). Prospective time-periodic geographical disease surveillance using a scan statistic. Journal of the Royal Statistical Society, Series A 164.
Molinari, N., Bonaldi, C., and Daurès, J. P. (2001). Multiple temporal cluster detection. Biometrics 57.
Nagarwalla, N. (1996). A scan statistic with a variable window. Statistics in Medicine 15.
Naus, J. I. (1965). The distribution of the size of the maximum cluster of points on a line. Journal of the American Statistical Association 61.
Pyke, R. (1965). Spacings (with discussion). Journal of the Royal Statistical Society, Series B 27.
Scott, D. W. (1992). Multivariate density estimation. Theory, practice, and visualization. New York: Wiley.
Tango, T. (1984). The detection of disease clustering in time. Biometrics 40.
Zhang, Z., Kulldorff, M., and Assunção, R. (2007). Spatial scan statistics adjusted for multiple clusters. Preprint.


Spatial and Temporal Geovisualisation and Data Mining of Road Traffic Accidents in Christchurch, New Zealand 166 Spatial and Temporal Geovisualisation and Data Mining of Road Traffic Accidents in Christchurch, New Zealand Clive E. SABEL and Phil BARTIE Abstract This paper outlines the development of a method

More information

GROUPED SURVIVAL DATA. Florida State University and Medical College of Wisconsin

GROUPED SURVIVAL DATA. Florida State University and Medical College of Wisconsin FITTING COX'S PROPORTIONAL HAZARDS MODEL USING GROUPED SURVIVAL DATA Ian W. McKeague and Mei-Jie Zhang Florida State University and Medical College of Wisconsin Cox's proportional hazard model is often

More information

Intensity Analysis of Spatial Point Patterns Geog 210C Introduction to Spatial Data Analysis

Intensity Analysis of Spatial Point Patterns Geog 210C Introduction to Spatial Data Analysis Intensity Analysis of Spatial Point Patterns Geog 210C Introduction to Spatial Data Analysis Chris Funk Lecture 4 Spatial Point Patterns Definition Set of point locations with recorded events" within study

More information

Local Likelihood Bayesian Cluster Modeling for small area health data. Andrew Lawson Arnold School of Public Health University of South Carolina

Local Likelihood Bayesian Cluster Modeling for small area health data. Andrew Lawson Arnold School of Public Health University of South Carolina Local Likelihood Bayesian Cluster Modeling for small area health data Andrew Lawson Arnold School of Public Health University of South Carolina Local Likelihood Bayesian Cluster Modelling for Small Area

More information

INTRODUCTION TO INTERSECTION-UNION TESTS

INTRODUCTION TO INTERSECTION-UNION TESTS INTRODUCTION TO INTERSECTION-UNION TESTS Jimmy A. Doi, Cal Poly State University San Luis Obispo Department of Statistics (jdoi@calpoly.edu Key Words: Intersection-Union Tests; Multiple Comparisons; Acceptance

More information

A nonparametric spatial scan statistic for continuous data

A nonparametric spatial scan statistic for continuous data DOI 10.1186/s12942-015-0024-6 METHODOLOGY Open Access A nonparametric spatial scan statistic for continuous data Inkyung Jung * and Ho Jin Cho Abstract Background: Spatial scan statistics are widely used

More information

Possible numbers of ones in 0 1 matrices with a given rank

Possible numbers of ones in 0 1 matrices with a given rank Linear and Multilinear Algebra, Vol, No, 00, Possible numbers of ones in 0 1 matrices with a given rank QI HU, YAQIN LI and XINGZHI ZHAN* Department of Mathematics, East China Normal University, Shanghai

More information

Differentiation matrices in polynomial bases

Differentiation matrices in polynomial bases Math Sci () 5 DOI /s9---x ORIGINAL RESEARCH Differentiation matrices in polynomial bases A Amiraslani Received January 5 / Accepted April / Published online April The Author(s) This article is published

More information

Modeling Disease Incidence Data with Spatial and Spatio-Temporal Dirichlet Process Mixtures

Modeling Disease Incidence Data with Spatial and Spatio-Temporal Dirichlet Process Mixtures Biometrical Journal 49 (2007) 5, 1 14 DOI: 10.1002/bimj.200610375 1 Modeling Disease Incidence Data with Spatial and Spatio-Temporal Dirichlet Process Mixtures Athanasios Kottas *,1, Jason A. Duan 2, and

More information

Spatio-temporal statistical models for river monitoring networks

Spatio-temporal statistical models for river monitoring networks Spatio-temporal statistical models for river monitoring networks L. Clement, O. Thas, P.A. Vanrolleghem and J.P. Ottoy Department of Applied Mathematics, Biometrics and Process Control, Ghent University,

More information

Tests of transformation in nonlinear regression

Tests of transformation in nonlinear regression Economics Letters 84 (2004) 391 398 www.elsevier.com/locate/econbase Tests of transformation in nonlinear regression Zhenlin Yang a, *, Gamai Chen b a School of Economics and Social Sciences, Singapore

More information

EEL 851: Biometrics. An Overview of Statistical Pattern Recognition EEL 851 1

EEL 851: Biometrics. An Overview of Statistical Pattern Recognition EEL 851 1 EEL 851: Biometrics An Overview of Statistical Pattern Recognition EEL 851 1 Outline Introduction Pattern Feature Noise Example Problem Analysis Segmentation Feature Extraction Classification Design Cycle

More information

Scalable Bayesian Event Detection and Visualization

Scalable Bayesian Event Detection and Visualization Scalable Bayesian Event Detection and Visualization Daniel B. Neill Carnegie Mellon University H.J. Heinz III College E-mail: neill@cs.cmu.edu This work was partially supported by NSF grants IIS-0916345,

More information

Gaussian processes. Basic Properties VAG002-

Gaussian processes. Basic Properties VAG002- Gaussian processes The class of Gaussian processes is one of the most widely used families of stochastic processes for modeling dependent data observed over time, or space, or time and space. The popularity

More information

exclusive prepublication prepublication discount 25 FREE reprints Order now!

exclusive prepublication prepublication discount 25 FREE reprints Order now! Dear Contributor Please take advantage of the exclusive prepublication offer to all Dekker authors: Order your article reprints now to receive a special prepublication discount and FREE reprints when you

More information

Sequences, Series, and the Binomial Formula

Sequences, Series, and the Binomial Formula CHAPTER Sequences, Series, nd the Binomil Formul. SEQUENCES. ; ; ; ; 6 ; 6 6. ðþ ; ðþ ; ð Þ 6; ðþ ; ðþ 6; 6 ð6þ. ðþ ; ðþ : ðþ ; ðþ ; ðþ ; 6 ðþ 6 6 6. ; ; ; ; ; 6 6 6. 0 ; ; ; 8 ; 6 8 ; 6. 0; ; 6 ; ; 6

More information

Review. DS GA 1002 Statistical and Mathematical Models. Carlos Fernandez-Granda

Review. DS GA 1002 Statistical and Mathematical Models.   Carlos Fernandez-Granda Review DS GA 1002 Statistical and Mathematical Models http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall16 Carlos Fernandez-Granda Probability and statistics Probability: Framework for dealing with

More information

On a connection between the Bradley Terry model and the Cox proportional hazards model

On a connection between the Bradley Terry model and the Cox proportional hazards model Statistics & Probability Letters 76 (2006) 698 702 www.elsevier.com/locate/stapro On a connection between the Bradley Terry model and the Cox proportional hazards model Yuhua Su, Mai Zhou Department of

More information

Macromolecular Reaction Engineering

Macromolecular Reaction Engineering Macromolecular Reaction Engineering Reprints Full Paper A Hybrid Galerkin Monte-Carlo Approach to Higher-Dimensional Population Balances in Polymerization Kinetics Christof Schütte,* Michael Wulkow* Population

More information

Bayesian Hierarchical Models

Bayesian Hierarchical Models Bayesian Hierarchical Models Gavin Shaddick, Millie Green, Matthew Thomas University of Bath 6 th - 9 th December 2016 1/ 34 APPLICATIONS OF BAYESIAN HIERARCHICAL MODELS 2/ 34 OUTLINE Spatial epidemiology

More information

ACCELERATING THE DETECTION VECTOR BORNE DISEASES

ACCELERATING THE DETECTION VECTOR BORNE DISEASES NC State s Geospatial Analytics Forum October 22 2015 ACCELERATING THE DETECTION of SPACE-TIME CLUSTERS for VECTOR BORNE DISEASES Dr. Eric Delmelle Geography & Earth Sciences, University of North Carolina

More information

Fitting Semiparametric Additive Hazards Models using Standard Statistical Software

Fitting Semiparametric Additive Hazards Models using Standard Statistical Software Biometrical Journal 49 (2007) 5, 719 730 DOI: 101002/bimj200610349 719 Fitting Semiparametric Additive Hazards Models using Standard Statistical Software Douglas E Schaubel * and Guanghui Wei Department

More information

A first model of learning

A first model of learning A first model of learning Let s restrict our attention to binary classification our labels belong to (or ) We observe the data where each Suppose we are given an ensemble of possible hypotheses / classifiers

More information

Small sample corrections for LTS and MCD

Small sample corrections for LTS and MCD Metrika (2002) 55: 111 123 > Springer-Verlag 2002 Small sample corrections for LTS and MCD G. Pison, S. Van Aelst*, and G. Willems Department of Mathematics and Computer Science, Universitaire Instelling

More information

Hotspot detection using space-time scan statistics on children under five years of age in Depok

Hotspot detection using space-time scan statistics on children under five years of age in Depok Hotspot detection using space-time scan statistics on children under five years of age in Depok Miranti Verdiana, and Yekti Widyaningsih Citation: AIP Conference Proceedings 1827, 020018 (2017); View online:

More information

Spatial Analysis I. Spatial data analysis Spatial analysis and inference

Spatial Analysis I. Spatial data analysis Spatial analysis and inference Spatial Analysis I Spatial data analysis Spatial analysis and inference Roadmap Outline: What is spatial analysis? Spatial Joins Step 1: Analysis of attributes Step 2: Preparing for analyses: working with

More information

An Unconditional-like Structure for the Conditional Estimator of Odds Ratio from 2 2 Tables

An Unconditional-like Structure for the Conditional Estimator of Odds Ratio from 2 2 Tables Biometrical Journal 48 (006), 3 34 DOI: 0.00/bimj.005067 An Unconditional-like Structure for the Conditional Estimator of Odds Ratio from Tables James A. Hanley *; ; ; 3; 4 and Olli S. Miettinen Department

More information

Appendix A Conventions

Appendix A Conventions Appendix A Conventions We use natural units h ¼ 1; c ¼ 1; 0 ¼ 1; ða:1þ where h denotes Planck s constant, c the vacuum speed of light and 0 the permittivity of vacuum. The electromagnetic fine-structure

More information

Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing

Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing 1 In most statistics problems, we assume that the data have been generated from some unknown probability distribution. We desire

More information

SaTScan TM. User Guide. for version 7.0. By Martin Kulldorff. August

SaTScan TM. User Guide. for version 7.0. By Martin Kulldorff. August SaTScan TM User Guide for version 7.0 By Martin Kulldorff August 2006 http://www.satscan.org/ Contents Introduction... 4 The SaTScan Software... 4 Download and Installation... 5 Test Run... 5 Sample Data

More information

Chapter 6. Logistic Regression. 6.1 A linear model for the log odds

Chapter 6. Logistic Regression. 6.1 A linear model for the log odds Chapter 6 Logistic Regression In logistic regression, there is a categorical response variables, often coded 1=Yes and 0=No. Many important phenomena fit this framework. The patient survives the operation,

More information

Distance-based test for uncertainty hypothesis testing

Distance-based test for uncertainty hypothesis testing Sampath and Ramya Journal of Uncertainty Analysis and Applications 03, :4 RESEARCH Open Access Distance-based test for uncertainty hypothesis testing Sundaram Sampath * and Balu Ramya * Correspondence:

More information

NAG Library Chapter Introduction. G08 Nonparametric Statistics

NAG Library Chapter Introduction. G08 Nonparametric Statistics NAG Library Chapter Introduction G08 Nonparametric Statistics Contents 1 Scope of the Chapter.... 2 2 Background to the Problems... 2 2.1 Parametric and Nonparametric Hypothesis Testing... 2 2.2 Types

More information

Efficient simulation of a space-time Neyman-Scott rainfall model

Efficient simulation of a space-time Neyman-Scott rainfall model WATER RESOURCES RESEARCH, VOL. 42,, doi:10.1029/2006wr004986, 2006 Efficient simulation of a space-time Neyman-Scott rainfall model M. Leonard, 1 A. V. Metcalfe, 2 and M. F. Lambert 1 Received 21 February

More information

THE ALGORITHM TO CALCULATE THE PERIOD MATRIX OF THE CURVE x m þ y n ¼ 1

THE ALGORITHM TO CALCULATE THE PERIOD MATRIX OF THE CURVE x m þ y n ¼ 1 TSUKUA J MATH Vol 26 No (22), 5 37 THE ALGORITHM TO ALULATE THE PERIOD MATRIX OF THE URVE x m þ y n ¼ y Abstract We show how to take a canonical homology basis and a basis of the space of holomorphic -forms

More information

Spatial Clusters of Rates

Spatial Clusters of Rates Spatial Clusters of Rates Luc Anselin http://spatial.uchicago.edu concepts EBI local Moran scan statistics Concepts Rates as Risk from counts (spatially extensive) to rates (spatially intensive) rate =

More information

Stepwise Gatekeeping Procedures in Clinical Trial Applications

Stepwise Gatekeeping Procedures in Clinical Trial Applications 984 Biometrical Journal 48 (2006) 6, 984 991 DOI: 10.1002/bimj.200610274 Stepwise Gatekeeping Procedures in Clinical Trial Applications Alex Dmitrienko *,1, Ajit C. Tamhane 2, Xin Wang 2, and Xun Chen

More information

Haar wavelet method for nonlinear integro-differential equations

Haar wavelet method for nonlinear integro-differential equations Applied Mathematics and Computation 176 (6) 34 333 www.elsevier.com/locate/amc Haar wavelet method for nonlinear integro-differential equations Ülo Lepik Institute of Applied Mathematics, University of

More information

T E C H N I C A L R E P O R T KERNEL WEIGHTED INFLUENCE MEASURES. HENS, N., AERTS, M., MOLENBERGHS, G., THIJS, H. and G. VERBEKE

T E C H N I C A L R E P O R T KERNEL WEIGHTED INFLUENCE MEASURES. HENS, N., AERTS, M., MOLENBERGHS, G., THIJS, H. and G. VERBEKE T E C H N I C A L R E P O R T 0465 KERNEL WEIGHTED INFLUENCE MEASURES HENS, N., AERTS, M., MOLENBERGHS, G., THIJS, H. and G. VERBEKE * I A P S T A T I S T I C S N E T W O R K INTERUNIVERSITY ATTRACTION

More information

Uniform Random Number Generators

Uniform Random Number Generators JHU 553.633/433: Monte Carlo Methods J. C. Spall 25 September 2017 CHAPTER 2 RANDOM NUMBER GENERATION Motivation and criteria for generators Linear generators (e.g., linear congruential generators) Multiple

More information

Bayesian Nonparametric Regression for Diabetes Deaths

Bayesian Nonparametric Regression for Diabetes Deaths Bayesian Nonparametric Regression for Diabetes Deaths Brian M. Hartman PhD Student, 2010 Texas A&M University College Station, TX, USA David B. Dahl Assistant Professor Texas A&M University College Station,

More information

A comparison study of the nonparametric tests based on the empirical distributions

A comparison study of the nonparametric tests based on the empirical distributions 통계연구 (2015), 제 20 권제 3 호, 1-12 A comparison study of the nonparametric tests based on the empirical distributions Hyo-Il Park 1) Abstract In this study, we propose a nonparametric test based on the empirical

More information

Chapter 2 Descriptive Statistics

Chapter 2 Descriptive Statistics Chapter 2 Descriptive Statistics The Mean "When she told me I was average, she was just being mean". The mean is probably the most often used parameter or statistic used to describe the central tendency

More information

Outline. Introduction to SpaceStat and ESTDA. ESTDA & SpaceStat. Learning Objectives. Space-Time Intelligence System. Space-Time Intelligence System

Outline. Introduction to SpaceStat and ESTDA. ESTDA & SpaceStat. Learning Objectives. Space-Time Intelligence System. Space-Time Intelligence System Outline I Data Preparation Introduction to SpaceStat and ESTDA II Introduction to ESTDA and SpaceStat III Introduction to time-dynamic regression ESTDA ESTDA & SpaceStat Learning Objectives Activities

More information

On Moore Bipartite Digraphs

On Moore Bipartite Digraphs On Moore Bipartite Digraphs M. A. Fiol, 1 * J. Gimbert, 2 J. Gómez, 1 and Y. Wu 3 1 DEPARTMENT DE MATEMÀTICA APLICADA IV TELEMÀTICA UNIVERSITAT POLITÈCNICA DE CATALUNYA JORDI GIRONA 1-3, MÒDUL C3 CAMPUS

More information

Computational Statistics and Data Analysis

Computational Statistics and Data Analysis Computational Statistics and Data Analysis 53 (2009) 2851 2858 Contents lists available at ScienceDirect Computational Statistics and Data Analysis journal homepage: www.elsevier.com/locate/csda Spatial

More information

Spatiotemporal Outbreak Detection

Spatiotemporal Outbreak Detection Spatiotemporal Outbreak Detection A Scan Statistic Based on the Zero-Inflated Poisson Distribution Benjamin Kjellson Masteruppsats i matematisk statistik Master Thesis in Mathematical Statistics Masteruppsats

More information

Misclassification in Logistic Regression with Discrete Covariates

Misclassification in Logistic Regression with Discrete Covariates Biometrical Journal 45 (2003) 5, 541 553 Misclassification in Logistic Regression with Discrete Covariates Ori Davidov*, David Faraggi and Benjamin Reiser Department of Statistics, University of Haifa,

More information

A numerical algorithm for investigating the role of the motor cargo linkage in molecular motor-driven transport

A numerical algorithm for investigating the role of the motor cargo linkage in molecular motor-driven transport Journal of Theoretical Biology 239 (26) 33 48 www.elsevier.com/locate/yjtbi A numerical algorithm for investigating the role of the motor cargo linkage in molecular motor-driven transport John Fricks a,1,

More information

The Bayesian Approach to Multi-equation Econometric Model Estimation

The Bayesian Approach to Multi-equation Econometric Model Estimation Journal of Statistical and Econometric Methods, vol.3, no.1, 2014, 85-96 ISSN: 2241-0384 (print), 2241-0376 (online) Scienpress Ltd, 2014 The Bayesian Approach to Multi-equation Econometric Model Estimation

More information

Statistical Analysis of Spatio-temporal Point Process Data. Peter J Diggle

Statistical Analysis of Spatio-temporal Point Process Data. Peter J Diggle Statistical Analysis of Spatio-temporal Point Process Data Peter J Diggle Department of Medicine, Lancaster University and Department of Biostatistics, Johns Hopkins University School of Public Health

More information

Chapte The McGraw-Hill Companies, Inc. All rights reserved.

Chapte The McGraw-Hill Companies, Inc. All rights reserved. er15 Chapte Chi-Square Tests d Chi-Square Tests for -Fit Uniform Goodness- Poisson Goodness- Goodness- ECDF Tests (Optional) Contingency Tables A contingency table is a cross-tabulation of n paired observations

More information

THE DOMINATION NUMBER OF GRIDS *

THE DOMINATION NUMBER OF GRIDS * SIAM J. DISCRETE MATH. Vol. 2, No. 3, pp. 1443 143 2011 Society for Industrial and Applied Mathematics THE DOMINATION NUMBER OF GRIDS * DANIEL GONÇALVES, ALEXANDRE PINLOU, MICHAËL RAO, AND STÉPHAN THOMASSÉ

More information

Cellular Automaton Growth on # : Theorems, Examples, and Problems

Cellular Automaton Growth on # : Theorems, Examples, and Problems Cellular Automaton Growth on : Theorems, Examples, and Problems (Excerpt from Advances in Applied Mathematics) Exactly 1 Solidification We will study the evolution starting from a single occupied cell

More information

Confidence Intervals of the Simple Difference between the Proportions of a Primary Infection and a Secondary Infection, Given the Primary Infection

Confidence Intervals of the Simple Difference between the Proportions of a Primary Infection and a Secondary Infection, Given the Primary Infection Biometrical Journal 42 (2000) 1, 59±69 Confidence Intervals of the Simple Difference between the Proportions of a Primary Infection and a Secondary Infection, Given the Primary Infection Kung-Jong Lui

More information

Fundamentals to Biostatistics. Prof. Chandan Chakraborty Associate Professor School of Medical Science & Technology IIT Kharagpur

Fundamentals to Biostatistics. Prof. Chandan Chakraborty Associate Professor School of Medical Science & Technology IIT Kharagpur Fundamentals to Biostatistics Prof. Chandan Chakraborty Associate Professor School of Medical Science & Technology IIT Kharagpur Statistics collection, analysis, interpretation of data development of new

More information

Chapter 2 Linear Systems

Chapter 2 Linear Systems Chapter 2 Linear Systems This chapter deals with linear systems of ordinary differential equations (ODEs, both homogeneous and nonhomogeneous equations Linear systems are etremely useful for analyzing

More information

Early Detection of a Change in Poisson Rate After Accounting For Population Size Effects

Early Detection of a Change in Poisson Rate After Accounting For Population Size Effects Early Detection of a Change in Poisson Rate After Accounting For Population Size Effects School of Industrial and Systems Engineering, Georgia Institute of Technology, 765 Ferst Drive NW, Atlanta, GA 30332-0205,

More information

TESTING FOR EQUAL DISTRIBUTIONS IN HIGH DIMENSION

TESTING FOR EQUAL DISTRIBUTIONS IN HIGH DIMENSION TESTING FOR EQUAL DISTRIBUTIONS IN HIGH DIMENSION Gábor J. Székely Bowling Green State University Maria L. Rizzo Ohio University October 30, 2004 Abstract We propose a new nonparametric test for equality

More information

Physics 509: Bootstrap and Robust Parameter Estimation

Physics 509: Bootstrap and Robust Parameter Estimation Physics 509: Bootstrap and Robust Parameter Estimation Scott Oser Lecture #20 Physics 509 1 Nonparametric parameter estimation Question: what error estimate should you assign to the slope and intercept

More information

Lattice Data. Tonglin Zhang. Spatial Statistics for Point and Lattice Data (Part III)

Lattice Data. Tonglin Zhang. Spatial Statistics for Point and Lattice Data (Part III) Title: Spatial Statistics for Point Processes and Lattice Data (Part III) Lattice Data Tonglin Zhang Outline Description Research Problems Global Clustering and Local Clusters Permutation Test Spatial

More information

Spatio-temporal epidemiology of Campylobacter jejuni enteritis, in an area of Northwest England,

Spatio-temporal epidemiology of Campylobacter jejuni enteritis, in an area of Northwest England, Epidemiol. Infect. (2), 38, 384 39. f Cambridge University Press 2 doi:.7/s952688488 Spatio-temporal epidemiology of Campylobacter jejuni enteritis, in an area of Northwest England, 2 22 E. GABRIEL,2 *,

More information

Convergence of a linear recursive sequence

Convergence of a linear recursive sequence int. j. math. educ. sci. technol., 2004 vol. 35, no. 1, 51 63 Convergence of a linear recursive sequence E. G. TAY*, T. L. TOH, F. M. DONG and T. Y. LEE Mathematics and Mathematics Education, National

More information

A SPATIAL SCAN STATISTIC FOR COMPOUND POISSON DATA, USING THE NEGATIVE BINOMIAL DISTRIBUTION AND ACCOUNTING FOR POPULATION STRATIFICATION

A SPATIAL SCAN STATISTIC FOR COMPOUND POISSON DATA, USING THE NEGATIVE BINOMIAL DISTRIBUTION AND ACCOUNTING FOR POPULATION STRATIFICATION Statistica Sinica 25 (2015), 313-327 doi:http://dx.doi.org/10.5705/ss.2013.215w A SPATIAL SCAN STATISTIC FOR COMPOUND POISSON DATA, USING THE NEGATIVE BINOMIAL DISTRIBUTION AND ACCOUNTING FOR POPULATION

More information

Implementation of the Four-Bit Deutsch Jozsa Algorithm with Josephson Charge Qubits

Implementation of the Four-Bit Deutsch Jozsa Algorithm with Josephson Charge Qubits phys. stat. sol. (b) 233, No. 3, 482 489 (2002) Implementation of the Four-Bit Deutsch Jozsa Algorithm with Josephson Charge Qubits N. Schuch ) and J. Siewert*) Institut für Theoretische Physik, Universität

More information

MONTE CARLO ANALYSIS OF CHANGE POINT ESTIMATORS

MONTE CARLO ANALYSIS OF CHANGE POINT ESTIMATORS MONTE CARLO ANALYSIS OF CHANGE POINT ESTIMATORS Gregory GUREVICH PhD, Industrial Engineering and Management Department, SCE - Shamoon College Engineering, Beer-Sheva, Israel E-mail: gregoryg@sce.ac.il

More information

A review of some semiparametric regression models with application to scoring

A review of some semiparametric regression models with application to scoring A review of some semiparametric regression models with application to scoring Jean-Loïc Berthet 1 and Valentin Patilea 2 1 ENSAI Campus de Ker-Lann Rue Blaise Pascal - BP 37203 35172 Bruz cedex, France

More information

A QBD approach to evolutionary game theory

A QBD approach to evolutionary game theory Applied Mathematical Modelling (00) 91 9 wwwelseviercom/locate/apm A QBD approach to evolutionary game theory Lotfi Tadj a, *, Abderezak Touzene b a Department of Statistics and Operations Research, College

More information

Applied Mathematics and Computation

Applied Mathematics and Computation Applied Mathematics and Computation 245 (2014) 86 107 Contents lists available at ScienceDirect Applied Mathematics and Computation journal homepage: www.elsevier.com/locate/amc An analysis of a new family

More information

Unconditional Confidence Interval for the Difference between Two Proportions

Unconditional Confidence Interval for the Difference between Two Proportions Biometrical Journal 45 (2003) 4, 426 436 Unconditional Confidence Interval for the Difference between Two Proportions A. Martín Andrés 1 and I. Herranz Tejedor 2 1 Bioestadística. Facultad de Medicina.

More information

Design sequences for sensory studies: Achieving balance for carry-over and position effects

Design sequences for sensory studies: Achieving balance for carry-over and position effects 339 British Journal of Mathematical and Statistical Psychology (2007), 60, 339 349 q 2007 The British Psychological Society The British Psychological Society www.bpsjournals.co.uk Design sequences for

More information

Interactive GIS in Veterinary Epidemiology Technology & Application in a Veterinary Diagnostic Lab

Interactive GIS in Veterinary Epidemiology Technology & Application in a Veterinary Diagnostic Lab Interactive GIS in Veterinary Epidemiology Technology & Application in a Veterinary Diagnostic Lab Basics GIS = Geographic Information System A GIS integrates hardware, software and data for capturing,

More information

Preservation of local dynamics when applying central difference methods: application to SIR model

Preservation of local dynamics when applying central difference methods: application to SIR model Journal of Difference Equations and Applications, Vol., No. 4, April 2007, 40 Preservation of local dynamics when applying central difference methods application to SIR model LIH-ING W. ROEGER* and ROGER

More information

The Mixture Approach for Simulating New Families of Bivariate Distributions with Specified Correlations

The Mixture Approach for Simulating New Families of Bivariate Distributions with Specified Correlations The Mixture Approach for Simulating New Families of Bivariate Distributions with Specified Correlations John R. Michael, Significance, Inc. and William R. Schucany, Southern Methodist University The mixture

More information

Smooth nonparametric estimation of a quantile function under right censoring using beta kernels

Smooth nonparametric estimation of a quantile function under right censoring using beta kernels Smooth nonparametric estimation of a quantile function under right censoring using beta kernels Chanseok Park 1 Department of Mathematical Sciences, Clemson University, Clemson, SC 29634 Short Title: Smooth

More information

Introduction to Bayesian Statistics

Introduction to Bayesian Statistics Bayesian Parameter Estimation Introduction to Bayesian Statistics Harvey Thornburg Center for Computer Research in Music and Acoustics (CCRMA) Department of Music, Stanford University Stanford, California

More information

Series solutions of non-linear Riccati differential equations with fractional order

Series solutions of non-linear Riccati differential equations with fractional order Available online at www.sciencedirect.com Chaos, Solitons and Fractals 40 (2009) 1 9 www.elsevier.com/locate/chaos Series solutions of non-linear Riccati differential equations with fractional order Jie

More information

Model Estimation Example

Model Estimation Example Ronald H. Heck 1 EDEP 606: Multivariate Methods (S2013) April 7, 2013 Model Estimation Example As we have moved through the course this semester, we have encountered the concept of model estimation. Discussions

More information

Performance modeling of wireless networks with generally distributed handoff interarrival times

Performance modeling of wireless networks with generally distributed handoff interarrival times Computer Communications 26 (23) 1747 1755 www.elsevier.com/locate/comcom Performance modeling of wireless networks with generally distributed handoff interarrival times S. Dharmaraja a, K.S. Trivedi b,

More information

Evaluation of the effective speed of sound in phononic crystals by the monodromy matrix method (L)

Evaluation of the effective speed of sound in phononic crystals by the monodromy matrix method (L) Evaluation of the effective speed of sound in phononic crystals by the monodromy matrix method (L) A. A. Kutsenko and A. L. Shuvalov Université de Bordeaux, Institut de Mécanique et d Ingénierie de Bordeaux,

More information

UNIVERSITÄT POTSDAM Institut für Mathematik

UNIVERSITÄT POTSDAM Institut für Mathematik UNIVERSITÄT POTSDAM Institut für Mathematik Testing the Acceleration Function in Life Time Models Hannelore Liero Matthias Liero Mathematische Statistik und Wahrscheinlichkeitstheorie Universität Potsdam

More information

Contingency Tables Part One 1

Contingency Tables Part One 1 Contingency Tables Part One 1 STA 312: Fall 2012 1 See last slide for copyright information. 1 / 32 Suggested Reading: Chapter 2 Read Sections 2.1-2.4 You are not responsible for Section 2.5 2 / 32 Overview

More information

Goodness of Fit Test and Test of Independence by Entropy

Goodness of Fit Test and Test of Independence by Entropy Journal of Mathematical Extension Vol. 3, No. 2 (2009), 43-59 Goodness of Fit Test and Test of Independence by Entropy M. Sharifdoost Islamic Azad University Science & Research Branch, Tehran N. Nematollahi

More information

A SPATIAL SCAN STATISTIC FOR COMPOUND POISSON DATA, USING THE NEGATIVE BINOMIAL DISTRIBUTION AND ACCOUNTING FOR POPULATION STRATIFICATION

A SPATIAL SCAN STATISTIC FOR COMPOUND POISSON DATA, USING THE NEGATIVE BINOMIAL DISTRIBUTION AND ACCOUNTING FOR POPULATION STRATIFICATION Statistica Sinica 25 (2015), 313-327 doi:http://dx.doi.org/10.5705/ss.2013.215w A SPATIAL SCAN STATISTIC FOR COMPOUND POISSON DATA, USING THE NEGATIVE BINOMIAL DISTRIBUTION AND ACCOUNTING FOR POPULATION

More information