Success scans in a sequence of trials

1 Success scans in a sequence of trials Example 4.6: application in acceptance sampling. The most classical acceptance sampling plan works as follows. A supplier passes along a lot of products, each of which is defective with probability p, and the retailer (on the receiving side) tests a sample of n items, where n is usually much smaller than the total number of products in the lot. If d defective items are found in the sample of n, the retailer may reject the whole lot when d/n exceeds a pre-determined threshold. Most acceptance sampling uses small samples, for example n = 5, 8, or 10, while the whole lot could contain hundreds or thousands of items. The problem with small samples for attributes (item defective or not defective) is that they do not discriminate well between good and bad quality. Consider an acceptance sampling plan that samples five items from a large lot and accepts the lot only if none of the five items is defective. The retailer may consider a lot with 10% or more defectives to be of unacceptable quality. Let us calculate the chance that a sample of five has no defectives at all.

2 Success scans in a sequence of trials (Example 4.6) Let p = 0.1 be the probability that a single item is defective. Then Pr(no defective in a sample of five) = $(1-p)^5 = 0.9^5 \approx 59\%$. Even with a 10% defective rate in the original lot, there is a good chance that a sample of five misses it entirely, so that the lot is accepted. On the other hand, a lot might have only 1% defectives (which the buyer could have agreed is acceptable quality), and yet the lot still has a sizeable chance of producing at least one defective in the sample and being rejected: with p = 0.01, Pr(at least one defective in a sample of five) = 1 - Pr(no defective in a sample of five) = $1 - (1-p)^5 \approx 5\%$, which is well above 1%. So a single acceptance sampling plan does not seem to please either party.
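The two percentages in Example 4.6 follow directly from the binomial model. A minimal sketch of the arithmetic, assuming independent items and the accept-only-if-zero-defectives rule; the function name `accept_prob` is illustrative and not from the slides:

```python
def accept_prob(p, n=5):
    """Probability that a sample of n items contains no defectives,
    assuming each item is defective independently with probability p."""
    return (1 - p) ** n


for p in (0.10, 0.01):
    acc = accept_prob(p)
    print(f"p = {p:.2f}: Pr(accept) = {acc:.3f}, Pr(reject) = {1 - acc:.3f}")
```

With p = 0.10 the acceptance probability is about 0.59, and with p = 0.01 the rejection probability is about 0.049, matching the 59% and 5% quoted in the example.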

3 Success scans in a sequence of trials (Example 4.6) Combining the information from consecutive lots can improve the effectiveness of the inspection system. Suppose the new rule suspends the transaction (between a supplier and a buyer) whenever there are k rejected lots within m consecutive lots. That is, when a single lot is rejected, it is marked or recorded, but the retailer does not necessarily stop accepting new lots from the supplier. But if a k-out-of-m rejection happens, the supplier is made to stop any new shipments until the necessary corrective actions are taken to eliminate a potential problem. How well can such an acceptance sampling plan work?
- Suppose the sample size remains 5;
- p = 0.01 or 0.1 for an individual item;
- k = 3, m = 5, i.e., it is agreed to suspend transactions whenever three out of five consecutive lots are rejected.

4 Success scans in a sequence of trials (Example 4.6) When p = 0.01, each lot has a $1-(1-p)^5 \approx 5\%$ probability of being rejected, but the transactions are unlikely to be stopped in a sequence of 100 lots. When p = 0.1, each lot has a $1-(1-p)^5 \approx 41\%$ probability of being rejected, and it is highly likely that the transaction is stopped after 15 lots.
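The k-out-of-m suspension rule can be evaluated exactly with a small dynamic program over the outcomes of the most recent m - 1 lots. The sketch below is one way to do this, not the method used on the slides; it assumes lots are rejected independently with the same probability q, and the function name `prob_k_in_m` is hypothetical.

```python
def prob_k_in_m(q, n_lots, k=3, m=5):
    """Probability that, within n_lots lots, some window of m consecutive
    lots contains at least k rejections. Each lot is assumed to be rejected
    independently with probability q."""
    # State: outcomes of the last m-1 lots (1 = rejected), padded with 0s
    # before the first lot. dist holds probability mass of sequences that
    # have not yet triggered the rule.
    dist = {(0,) * (m - 1): 1.0}
    triggered = 0.0
    for _ in range(n_lots):
        new_dist = {}
        for state, pr in dist.items():
            for outcome, weight in ((1, q), (0, 1.0 - q)):
                window = state + (outcome,)
                if sum(window) >= k:
                    triggered += pr * weight      # rule fires at this lot
                else:
                    nxt = window[1:]
                    new_dist[nxt] = new_dist.get(nxt, 0.0) + pr * weight
        dist = new_dist
    return triggered


if __name__ == "__main__":
    q_good = 1 - 0.99 ** 5   # per-lot rejection probability when p = 0.01
    q_bad = 1 - 0.90 ** 5    # per-lot rejection probability when p = 0.10
    print("p = 0.01, 100 lots:", prob_k_in_m(q_good, 100))
    print("p = 0.10,  15 lots:", prob_k_in_m(q_bad, 15))
```

Tracking only the last m - 1 outcomes keeps the state space at 2^(m-1) entries, so the calculation stays cheap even for long sequences of lots.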

5 Success scans in a sequence of trials Retrospective scan in a sequence. In some applications, researchers condition on a known or given value for the total number of successes in the N trials. This is a retrospective analysis. The scan statistics $S_m'$ and $W_k'$ are defined similarly to those in the prospective analysis. In a retrospective analysis, since the events have already occurred, suppose there are a successes and N - a failures, so that

6 Success scans in a sequence of trials Approximation: 6

7 Success scans in a sequence of trials Example 4.7: birthday problem. In a soccer game, there are 23 people on the field: 11 players from each side plus the referee. How likely is it that (at least) two of the 23 people share a birthday? Assume that the N days of a year are equally likely to be a person's birthday. Given that there are a people, the probability of no match is $\frac{N(N-1)\cdots(N-a+1)}{N^a} = \frac{N!}{(N-a)!\,N^a}$.
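For the soccer example, the classical birthday calculation can be checked numerically. A minimal sketch, assuming N = 365 equally likely days; the function name is illustrative:

```python
def prob_shared_birthday(a, N=365):
    """Probability that at least two of a people share a birthday,
    assuming N equally likely days and independent birthdays."""
    p_no_match = 1.0
    for i in range(a):
        p_no_match *= (N - i) / N   # person i+1 avoids the i days already taken
    return 1.0 - p_no_match


print(round(prob_shared_birthday(23), 3))  # 0.507
```

So with 23 people on the field, a shared birthday is slightly more likely than not.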

8 Success scans in a sequence of trials Example 4.8: the generalized birthday problem. Find the probability that at least two birthdays out of a people fall within any m consecutive days. Because Dec 31 and Jan 1 are adjacent days, the m-day period is treated as a circular case. Consider a family consisting of a mother, a father and two kids, so that there are four birthdays and one wedding anniversary. Assuming the dates are independent, is it unusual for at least two of the five special dates to fall within the same 7-day period? Here N = 365, a = 5, m = 7, and Pr(generalized birthday) ≈ 0.31.

9 Success scans in a sequence of trials Example 4.8: We can also use the scan statistic to solve the generalized birthday problem. Try it with a = 5, N = 365, k = 2, m = 7. Here L = N/m = 52.1, which is not an integer, but if we let N = 364, then L = N/m = 52. So use N = 364 and L = 52 as an approximation. The resulting value of Pr(2 | 7, 364, 5) is very close to the probability of 0.31 obtained from the generalized birthday formula.
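The value 0.31 quoted for the family example can be reproduced with a closed form for the circular case that is commonly attributed to Naus: the probability that no two of the a dates fall within m consecutive days is (N - am + a - 1)! / ((N - am)! N^(a-1)). It is an assumption that this is the formula the slides used, although it does reproduce the quoted figure. A minimal sketch, with a hypothetical function name:

```python
def prob_two_within_m_days(a, m, N=365):
    """Probability that at least two of a independent, uniformly distributed
    dates fall within some m consecutive days of a circular N-day year.
    Uses 1 - (N - a*m + a - 1)! / ((N - a*m)! * N**(a - 1))."""
    if N - a * m < 0:
        return 1.0                       # dates cannot be spaced m apart
    p_none = 1.0
    for j in range(1, a):
        p_none *= (N - a * m + j) / N    # ratio of factorials, term by term
    return 1.0 - p_none


print(round(prob_two_within_m_days(5, 7), 2))  # 0.31
```

Setting m = 1 reduces the expression to the classical birthday problem, which gives a quick consistency check.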

10 Success scans in a sequence of trials Example 4.9: A chess grandmaster played 20 tournaments over 1 year and won nine. Seven of the wins occurred within 10 consecutive tournaments. Assuming that the nine wins occurred independently of each other and completely at random over the 20 tournaments, how likely is it for there to have been 10 consecutive tournaments containing at least seven wins? Here a = 9, N = 20, k = 7, m = 10, and L = N/m = 2.
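Because N = 20 is small, the retrospective probability in Example 4.9 can be obtained by brute force rather than by an approximation: enumerate every placement of the a wins among the N tournaments and count those in which some window of m consecutive tournaments holds at least k wins. This is a sketch under the stated assumption that all placements are equally likely; the function name is hypothetical.

```python
from itertools import combinations


def retrospective_scan_prob(N, a, k, m):
    """Exact Pr(some m consecutive trials contain >= k successes), given
    a successes placed uniformly at random among N trials."""
    hits = 0
    total = 0
    for wins in combinations(range(N), a):
        total += 1
        won = [0] * N
        for i in wins:
            won[i] = 1
        window = sum(won[:m])            # successes in trials 0..m-1
        best = window
        for start in range(1, N - m + 1):
            # slide the window one trial to the right
            window += won[start + m - 1] - won[start - 1]
            best = max(best, window)
        if best >= k:
            hits += 1
    return hits / total


if __name__ == "__main__":
    # Chess example: a = 9 wins, N = 20 tournaments, k = 7, m = 10.
    print(retrospective_scan_prob(20, 9, 7, 10))
```

The enumeration visits C(20, 9) = 167,960 arrangements, which is feasible here but would not scale to long sequences.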

11 Higher dimensional scan Scans in two- or three-dimensional space are often very useful. Typically, a 2-D scan is a spatial scan and a 3-D scan is a time-space or spatial-temporal scan. Examples:
- A public health official looks for spatial clusters of disease occurrence;
- A geologist scans a region to find clusters of mineral deposits;
- An astrophysicist scans the heavens for concentrated sources of gamma ray bursts;
- A traffic controller looks at clusters of accidents on a highway that are close in time and space.

12 Higher dimensional scan Sometimes, people convert a 2-D scan problem into a 1-D problem and use the previous approach. For example, Conover, Bement, and Iman (1979) apply a 1-D scan statistic to a 2-D region. They were motivated by an aerial search for uranium deposits under the National Uranium Resource Evaluation program. A plane flies over the region in a zig-zag fashion.

13 Higher dimensional scan Consider the problem of scanning a unit square with a sub-rectangle of length u and height v whose sides are parallel to the sides of the square. An S × T rectangular region with an actual a × b scanning window can be mapped into the unit square case by setting u = a/S and v = b/T, which makes S = T = 1 unit. Given N points distributed at random over the unit square, let $S_{u,v}$ denote the maximum number of points in any sub-rectangle of length u and height v with sides parallel to the sides of the unit square. $S_{u,v}$ is a 2-D scan statistic, and we write $P(k, N, u, v) = P(S_{u,v} \ge k)$.

14 Higher dimensional scan Approximation: 14

15 Higher dimensional scan Example 4.10: Researchers looked at clusters of childhood (up to 15 years old) acute leukemia cases in Sweden diagnosed over a 20-year period. During the period, there were 1,534 cases of acute childhood leukemia among a population of 1,703,235 children. There was a cluster of three cases of acute leukemia among a population of 133 children in a town in southwest Sweden. This was 25 times the average incidence rate for Sweden. Was this cluster significant? View the map of Sweden as stretched (because the population is not evenly spread) so that there is a grid of 1,703,235 squares with one person in each square. For simplicity, view the map as a 1,305 × 1,305 square (1,305 ≈ sqrt(1,703,235)). View the population of the southwest town as an 11.5 × 11.5 square (11.5 ≈ sqrt(133)). Given 1,534 leukemia cases in the whole square, how likely is it to get a cluster of three cases within a subsquare with u = v = 11.5/1,305 ≈ 0.0088?
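The set-up quantities in Example 4.10 (side lengths, window size, and the factor of 25) are simple arithmetic on the figures given above. A short sketch for checking them; the variable names are illustrative:

```python
import math

population = 1_703_235   # children in Sweden over the study period
cases = 1_534            # acute childhood leukemia cases
town_population = 133    # children in the southwest town
town_cases = 3           # cases in the town

side = math.sqrt(population)            # side of the stretched map, in persons
town_side = math.sqrt(town_population)  # side of the town's square
u = town_side / side                    # window side on the unit square (u = v)
rate_ratio = (town_cases / town_population) / (cases / population)

print(f"map side  = {side:.0f}")
print(f"town side = {town_side:.1f}")
print(f"u = v     = {u:.4f}")
print(f"town rate / national rate = {rate_ratio:.1f}")
```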

16 Higher dimensional scan (Example 4.10) Calculate Pr(3, 1,534, 0.0088, 0.0088). Unfortunately, the previous approximation does not work for k = 3 or 4 (you will get Pr > 1), but it can be evaluated for k = 5 and k = 6. Since the probability decreases as k increases, the probability for k = 3 should be considerably larger than 0.052, suggesting that the cluster is not as unusual as people initially thought.

17 Higher dimensional scan Example 4.11: star cluster. In 1767, Reverend Michell scanned the sky and noted the visual closeness of six of the stars in the Pleiades. The six stars are called Atlas, Maia, Alcyone, Merope, Electra, and Taygeta. Michell supposed that the whole number of stars equal in splendor to the faintest of these is about 1,500. We want to find the probability that six stars, out of that number, scattered at random over the whole heavens, would be within so small a distance of each other as the Pleiades are.

18 Higher dimensional scan Example 4.11: star cluster. Distance in astrophysics: the six stars in the Pleiades can be contained in a circle with a radius of 31 arc minutes. On the surface of a celestial sphere of radius r, this circle has an approximate surface area of $\pi (31')^2 = \pi \left( \frac{31 \pi r}{180 \times 60} \right)^2$.

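19 Higher dimensional scan Example 4.11: star cluster.
- The surface area of the whole celestial sphere with radius r is $4 \pi r^2$.
- So the fraction of the celestial surface within a scanning circle of 31 arc minutes is $\pi \left( \frac{31 \pi r}{180 \times 60} \right)^2 / (4 \pi r^2) = \frac{1}{4} \left( \frac{31 \pi}{10800} \right)^2 \approx 2.03 \times 10^{-5}$. This area can be approximated by a square scanning window, so that the 2-D scan statistic applies with k = 6 and N = 1,500.
- Implication: the result simply indicates that the stars are not distributed at random. Rather, they group into clusters of stars. This is now a well-accepted fact, but it was not so obvious 300 years ago.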
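The fraction of the sky covered by the scanning circle can be checked numerically. The sketch below uses the flat-disc approximation for the circle's area, as the example does, and then takes the square root to get the side of an equal-area square window on a unit-area sky; treating the sky as a unit square is an assumption made for illustration.

```python
import math

radius_rad = math.radians(31 / 60)   # 31 arc minutes expressed in radians

# Area of the small circle (pi * (radius * r)^2) divided by the area of the
# sphere (4 * pi * r^2); the sphere radius r cancels.
fraction = math.pi * radius_rad ** 2 / (4 * math.pi)

# Side of a square window with the same area, if the whole sky is rescaled
# to a unit square (u = v).
side = math.sqrt(fraction)

print(f"fraction of sky = {fraction:.3e}")
print(f"u = v           = {side:.4f}")
```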

20 Higher dimensional scan Clustering on the lattice: this is a discrete version of the 2-D scan statistic. In certain applications the events can naturally occur only at a discrete set of points in space. For example, in an agricultural application, people look for clusters of diseased plants in a field where plants are (roughly) evenly spaced. In other applications, the underlying events can occur anywhere in space, but the method of observation limits the observed events to a grid or lattice of points. Here we focus on a rectangular R × T lattice of points and consider scanning the lattice with a rectangular $m_1 \times m_2$ sublattice whose sides are parallel to those of the lattice. We deal with the model where events are independent and equally likely to occur at any point of the lattice.

21 Higher dimensional scan Let $x_{ij}$, i = 1, ..., R, j = 1, ..., T, denote a rectangular lattice of independent and identically distributed Bernoulli random variables with $\Pr(x_{ij} = 1) = 1 - \Pr(x_{ij} = 0) = p$. View the lattice with (1,1) at its lower left corner. Counting the events in each $m_1 \times m_2$ sublattice, define the two-dimensional discrete scan statistic to be the largest such count over all positions of the sublattice.

22 Higher dimensional scan Approximation But here q(2m-1) and q(2m) need to be calculated from a recursive algorithm in Karwe and Naus (1997). For convenience, there is a look-up table one can use. Approximation 22

23 Higher dimensional scan Example 4.12: minefield detection. A two-dimensional rectangular region on land (or sea) is being checked for minefields. The region is viewed as being divided into a grid of small squares (quadrats). An aircraft flies over the region and processes information from a sensor. When the sensor gives a reading above a specified threshold for a quadrat, the score for the quadrat is recorded as "high." An unusual number of contiguous quadrats that score high suggests a possible minefield. Suppose a large region is divided into 10,000 quadrats and, on average, about 2% of quadrats score high. In screening the map with a 5 × 5 square of quadrats, we observe a square 25-quadrat area that contains six high quadrats. Is this evidence of unusual, nonrandom clustering? Here T = 100 = sqrt(10,000), m = 5, p = 0.02, k = 6.
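One way to get a feel for the minefield question without the look-up table is to simulate the lattice model directly. The sketch below is a Monte Carlo estimate and is not the approximation used on the slides; it assumes independent Bernoulli(p) quadrats on a T × T lattice scanned by an m × m square, and the function names are hypothetical.

```python
import random


def max_window_count(grid, m):
    """Largest number of 1s in any m x m sub-square of a square 0/1 grid,
    computed with 2-D prefix sums."""
    T = len(grid)
    ps = [[0] * (T + 1) for _ in range(T + 1)]
    for i in range(T):
        for j in range(T):
            ps[i + 1][j + 1] = grid[i][j] + ps[i][j + 1] + ps[i + 1][j] - ps[i][j]
    best = 0
    for i in range(T - m + 1):
        for j in range(T - m + 1):
            count = ps[i + m][j + m] - ps[i][j + m] - ps[i + m][j] + ps[i][j]
            best = max(best, count)
    return best


def estimate_tail(T, m, p, k, n_sims, seed=0):
    """Monte Carlo estimate of Pr(2-D discrete scan statistic >= k)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        grid = [[1 if rng.random() < p else 0 for _ in range(T)] for _ in range(T)]
        if max_window_count(grid, m) >= k:
            hits += 1
    return hits / n_sims


if __name__ == "__main__":
    # Minefield example: T = 100, m = 5, p = 0.02, k = 6.
    print(estimate_tail(100, 5, 0.02, 6, n_sims=200))
```

With only a few hundred replications the estimate is coarse; more replications narrow the Monte Carlo error at the cost of run time.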

24 Higher dimensional scan Example 4.13: A photograph shows 500 galaxies of a certain brightness. Divide the photographic plate by an evenly spaced T × T grid consisting of $T^2$ cells. Let T = 100. Assume that the galaxies are distributed completely at random over the photograph. If one observes 17 galaxies within a scanning square, how unusual is it? There are a total of T × T = 10,000 cells. If 500 galaxies are distributed randomly, the probability that a given cell contains a galaxy is p = 500/10,000 = 0.05. Then T = 100, m = 10, k = 17, so Pr(17 | 10, 0.05, 100) ≈ 0.04 (from the look-up table). The distribution shows strong evidence of clustering.

25 Higher dimensional scan Discussion on the shape of the scanning window. Does the shape of a scanning window (of a given area) make a difference to the probability of a large cluster? Consider the following examples, using the formula for Pr(k, N, u, v). First, compare Pr(5, 5, 0.1, 0.9) with Pr(5, 5, 0.3, 0.3), where both windows have area uv = 0.09. As another example, compare Pr(4, 5, 0.15, 0.4), Pr(4, 5, 0.2, 0.3), and Pr(4, 5, 0.245, 0.245), where all three windows have area uv ≈ 0.06 and 0.245 ≈ sqrt(0.06). One can observe that in scanning a unit square, for a given window area uv, P(k, N, u, v) is maximized for u = v.
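For the special case k = N, the comparison in the first example can be done exactly without the general approximation. All N points fit in some u × v window exactly when the range of their x-coordinates is at most u and the range of their y-coordinates is at most v, and the range of N independent uniform points on [0, 1] is at most w with probability N w^(N-1) - (N-1) w^N. The sketch below uses this fact; it is not claimed to be the formula shown on the slide, and the function names are hypothetical.

```python
def prob_all_in_window(N, w):
    """Pr(range of N independent Uniform(0,1) points <= w), for 0 <= w <= 1."""
    return N * w ** (N - 1) - (N - 1) * w ** N


def prob_all_in_rectangle(N, u, v):
    """Pr(all N random points in the unit square fit in some u x v window
    with sides parallel to the square). Coordinates are independent, so the
    two one-dimensional probabilities multiply."""
    return prob_all_in_window(N, u) * prob_all_in_window(N, v)


for u, v in ((0.1, 0.9), (0.3, 0.3)):
    print(f"P(5, 5, {u}, {v}) = {prob_all_in_rectangle(5, u, v):.6f}")
```

Both windows have area 0.09, yet the square window gives roughly twice the probability of the elongated one, in line with the observation that u = v maximizes P(k, N, u, v) for a fixed area.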

26 Higher dimensional scan The choice of scanning window type is still an ongoing research subject. Here are a few common understandings:
- Square windows and circular windows are generally popular because they involve fewer parameters.
- Some shapes, such as ellipses or rectangles, can be oriented to yield greater power, for example, by taking into account the direction of wind in spreading pollutants.
- No matter which window shape is used, there is a need for efficient algorithms to reduce the complexity of obtaining the probability distribution of the scan statistic. Typically, that task is simpler for a scanning window whose sides are parallel to those of the original region.

27 Higher dimensional scan: varying the size of scan window In the treatment so far, the scan statistics are defined over a sliding scan window (an improvement over control chart), but still with a fixed window size. Next we will allow the window size to change, which empowers the scan statistic to detect underlying clusters/patterns whose size is not known a priori. The major downside is the computational complexity in doing so. A major portion of the following materials is based on the work done by Kulldorff (a statistician working for CDC) and Knox. 27

28 Higher dimensional scan: varying the size of scan window A general model:
- Let A be the area in which events may occur, a subset of Euclidean space where different dimensions may represent either physical space or time. So this model is applicable to both the spatial scan statistic and the spatial-temporal scan statistic.
- On A, define a measure μ, representing a known underlying intensity that generates events under the null hypothesis. For a homogeneous Poisson process on a rectangle A, we have μ(x) = λ for all x ∈ A.
- Under the null hypothesis, the number of events in any given area is modeled as either binomially distributed or Poisson distributed, with parameters determined by μ.

29 Higher dimensional scan: varying the size of scan window For instance, the set A contains a group of locations $s_i$. Each location $s_i$ may represent a zip code, with its location (latitude and longitude) assumed to be at the centroid of the zip code. Then we collect a count $c_i$ for $s_i$, say the number of respiratory disease cases in that zip code, and $b_i$ as the baseline information, for example the at-risk population. Here, $c_i$ is $x(s_i)$ and $b_i$ is $\mu(s_i)$, for $s_i \in A$. If $b_i$ is the underlying population of location $s_i$, we would expect each count to be proportional to its population under the null hypothesis. If $b_i$ is the expected count of location $s_i$, then we expect each count to equal the expected count of location $s_i$ under the null hypothesis. In either case, $b_i$, the underlying measure, is either given or can be inferred from past counts. For example, over-the-counter (OTC) drug sales may be estimated from a time series model using historical sales records.

30 Higher dimensional scan: varying the size of scan window With a varying-size window, the scan window is treated as an interval, area, or volume of fixed shape, which then moves across the study area. As it moves, its size varies and it defines a collection W of zones w ⊂ A. Conditioning on the observed total number of events, x(A), the scan statistic is defined as the maximum likelihood ratio over all possible zones. When the underlying process is assumed to follow a binomial model or a Poisson model, the likelihood function can be written down explicitly. For example, under a Poisson model

31 Higher dimensional scan: varying the size of scan window Under a more specific setting, define More specifically 31

32 Higher dimensional scan: varying the size of scan window Under the previous setting: So this likelihood ratio can be easily calculated for a given region w. 32
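To illustrate how cheaply the likelihood ratio can be evaluated for a single zone, the sketch below uses the widely cited Poisson form from Kulldorff's spatial scan statistic: with c observed cases and e expected cases in the zone and C cases in total, LR = (c/e)^c ((C-c)/(C-e))^(C-c) when c > e, and 1 otherwise. It is an assumption that this matches the expression on the slide; the function name is hypothetical, and the expected counts are taken to be scaled so that they sum to C.

```python
import math


def poisson_log_lr(c, e, C):
    """Log likelihood ratio for one zone under a Poisson model, scanning for
    high rates only.
    c: observed cases in the zone
    e: expected cases in the zone under the null (0 < e < C)
    C: total observed cases in the study area"""
    if c <= e:
        return 0.0                       # no excess inside the zone
    inside = c * math.log(c / e)
    outside = (C - c) * math.log((C - c) / (C - e)) if C > c else 0.0
    return inside + outside


# A zone with twice as many cases as expected, out of 100 cases in total.
print(round(poisson_log_lr(10, 5.0, 100), 4))
```

The scan statistic is then the maximum of this quantity over all zones in W, which is where the computational burden lies.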

33 Higher dimensional scan: varying the size of scan window Even though calculating the likelihood function for a given region is straightforward, calculating the scan statistic $S_W$ and performing the subsequent statistical test can be computationally difficult, because calculating $S_W$ requires exhausting all possible positions and sizes of the scan window. Unlike the fixed window-size cases, it is also not easy to get a good approximation when the window size varies. People therefore rely on a Monte Carlo (MC)-based hypothesis testing procedure.

34 Higher dimensional scan: varying the size of scan window General procedure for MC-based hypothesis testing 34
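One common form of the Monte Carlo test, offered here as a sketch rather than as the slide's own procedure: compute the scan statistic on the observed data, generate many replicate datasets under the null hypothesis, compute the statistic on each, and report the rank-based p-value (1 + number of replicates at least as large as the observed value) / (1 + number of replicates). The helper names and the toy statistic below are hypothetical.

```python
import random


def monte_carlo_p_value(observed_stat, simulate_stat, n_rep=999, seed=0):
    """Rank-based Monte Carlo p-value.
    observed_stat: statistic computed on the real data
    simulate_stat: function rng -> statistic of one dataset simulated
                   under the null hypothesis
    n_rep:         number of null replicates"""
    rng = random.Random(seed)
    n_at_least = sum(1 for _ in range(n_rep) if simulate_stat(rng) >= observed_stat)
    return (1 + n_at_least) / (n_rep + 1)


def toy_scan_stat(rng, n_trials=50, p=0.1, m=5):
    """Toy null model: maximum number of successes in any m consecutive
    trials out of n_trials independent Bernoulli(p) trials."""
    x = [1 if rng.random() < p else 0 for _ in range(n_trials)]
    return max(sum(x[i:i + m]) for i in range(n_trials - m + 1))


if __name__ == "__main__":
    # How unusual is a window holding 4 successes under the toy null model?
    print(monte_carlo_p_value(4, toy_scan_stat))
```

With 999 replicates the smallest attainable p-value is 0.001, which is why replicate counts such as 999 or 9,999 are convenient choices.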

35 Higher dimensional scan: varying the size of scan window Detailed procedures for implementing this MC-based hypothesis test have been developed by many different researchers. Here we introduce a free software package that can be used to search for clusters in a dataset. Software: SaTScan. It can be downloaded together with a user manual and some sample datasets.

36 Higher dimensional scan: varying the size of scan window Example 4.14: brain cancer in New Mexico. This example, together with a few others, is included in the SaTScan software. There are three files:
- nm.cas: cancer records
- nm.pop: population
- nm.geo: coordinates of each county
Overall, the cancer data are broken down by age and sex. Brain cancer and population data are available from 1973 to 1992 at the aggregated level of 32 counties. A circular variable-size window was used. The circle centroids are limited to the county centroids, while the radius varies continuously from zero up to the point where the circle includes 50% of the total population at risk.

37 Higher dimensional scan: varying the size of scan window (Example 4.14) brain cancer in New Mexico. Here only a spatial analysis is performed, but a spatial-temporal analysis can be performed likewise. When scanning for areas with an (unusually) high-rate cluster:
- Check "High Rates" in the "Analysis" tab of the software;
- A cluster is found in and around Albuquerque, containing Bernalillo, Cibola-Valencia, Los Alamos, Sandoval, San Miguel, Santa Fe, Socorro and Torrance counties.
- With 642 observed cases, this area has a rate about 10% higher than the New Mexico average, and the cluster is reported as significant.

38 Higher dimensional scan: varying the size of scan window (Example 4.14) When scanning for areas with an (unusually) low-rate cluster:
- Check "Low Rates" in the "Analysis" tab;
- A cluster is found for Lea and Eddy counties combined.
- With 98.3 cases expected but only 72 observed, these counties have an incidence rate about 26.7% lower than the state average. But the p-value is 0.167, so the cluster is not significant.
When the two-sided scan is used, the clusters found are the same but the p-values differ.

39 Higher dimensional scan: varying the size of scan window (Example 4.14) Spatial clusters for brain cancer incidence in New Mexico The most likely cluster is the grey area and the area with low rate is the shaded area. 39

40 Higher dimensional scan: varying the size of scan window Revisit Example 1.4: the paper that performed this analysis can be accessed online.

41 Example 1.4 Early warning system for West Nile virus Example 1.4: Since the 1999 West Nile virus (WNV) outbreak in New York City, which caused thousands of human infections and 59 severe cases including 7 deaths, health officials have been searching for an early warning system that could have prevented human illness and death. In the summer of 2001, the New York City Department of Health and Mental Hygiene established a citywide network of adult mosquito traps, sentinel bird flocks, and a system for reporting, collecting, and testing dead birds. Health officials try to use the collected data, and the patterns embedded therein, to set off public health alerts early enough before the onset of human cases.

42 Example 1.4 Early warning system for West Nile virus 42

43 Example 1.4 Early warning system for West Nile virus Data Collection
- Dead birds were reported by the public through an interactive voice-response telephone system or the internet.
- The information included the date found and the location and species of the dead bird. Since pigeon deaths are common but rarely associated with WNV, they were excluded from all clustering analysis.
- A sample of dead birds that met selection criteria was submitted for necropsy and testing.

44 Example 1.4 Early warning system for West Nile virus Data Collection (continued) - Mosquitoes were collected weekly from > 100 traps dispersed throughout the city. - Multiple mosquitoes from the same trap and of the same species were pooled, and each pool was tested for evidence of WNV; a result was considered positive if at least one mosquito was infected. - NYC Department of Health and Mental Hygiene conducted citywide active hospital-based physician and lab surveillance for human WNV cases. 44

45 Example 1.4 Early warning system for West Nile virus Data Collection (continued) - All dead bird reports, mosquito traps, and human case-patients with address information were geocoded to a point location (when possible, using the ArcView GIS software). - Multiple dead bird reports from the same location on the same date were counted as one. Dead bird reports were attributed to the census tracts (a total of 2,215 tracts, with an average size of 0.13 square miles). - The latitude and longitude in decimal degrees for each census tract centroid were used in scan statistics analysis. 45

46 Example 1.4 Early warning system for West Nile virus Spatial Scan Statistics
- Use a circular window, and perform a prospective analysis.
- To adjust the analysis for geographic variability, historical dead bird counts from a given census tract were used as baseline controls; recent bird counts were used as cases.
- Define cases as the dead bird reports that occurred in the 7 days before the date of analysis.
- A minimum 2-week buffer zone between case and control birds was established, thereby limiting the influence of emerging clusters on the analysis.
- Some pre-analysis needs to be performed using SAS code.

47 Example 1.4 Early warning system for West Nile virus Spatial Scan Statistics on 2001 data.
- Data for dead bird clustering analysis were first available on June 22. Analysis showed two clusters: central Staten Island and eastern Queens. This prompted a program of intensified larval surveillance and control, as well as abatement of standing water, starting in June.

48 Example 1.4 Early warning system for West Nile virus Spatial Statistics Scan on 2001 data. - Reports from these areas were prioritized for dead bird pickup and testing, and additional mosquito traps were set. - On July 19, the clustering in eastern Queens was shown to be likely due to WNV through lab testing. 48

49 Example 1.4 Early warning system for West Nile virus Spatial Statistics Scan on 2001 data. - During the next 2-3 months, six areas with major dead bird clustering were identified. Five of seven diagnosed human cases in 2001 were identified in residents of four cluster areas. 49

50 Example 1.4 Early warning system for West Nile virus Spatial Scan Statistics on 2001 data.
- In summary: dead bird clusters occurred 0 to 40 days (median 12 days) before the onset of human illness, and a median of 17 days before human diagnosis. In most cases, dead bird clusters also preceded the time of collection of WNV-positive mosquitoes and birds.

51 Time-space analysis: Knox statistic The Knox test is a test of space-time interaction based on a count of the number of event pairs that occur within a pre-specified critical interval of time t and a pre-specified critical distance s. With n points located in space and time, there are n(n-1)/2 distinct pairs of points. Let $n_s$ be the observed number of pairs that are close together in space (i.e., pairs that are separated by a distance less than s), and let $n_t$ be the observed number of pairs that are close together in time (i.e., pairs that are separated in time by less than t). The test statistic is $n_{st}$, the observed number of pairs that are close in both space and time.

52 Time-space analysis: Knox statistic Knox suggested that the random variable N st, which takes the observed values n st, was approximately Poisson distributed: 52

53 Time-space analysis: Knox statistic When the test statistic $n_{st}$ exceeds its expectation $E(N_{st})$, points that are close together in space are closer than expected in time or, stated alternatively, points that are close together in time are closer than expected in space. This indicates that a cluster is emerging, because an over-density in time or space is observed. To carry out a Knox test, we have the following alternatives:
(a) Assume that $N_{st}$ follows a Poisson distribution with $\lambda = E(N_{st})$, then test for a potential change in λ using the observed $n_{st}$;
(b) Use a normal approximation to the Poisson distribution, with both its mean and variance equal to $E(N_{st})$;
(c) Use a normal approximation with mean $E(N_{st})$ and variance $\mathrm{Var}(N_{st})$.
We will use (c) in this class.
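The counting step of the Knox test and alternative (a) can be sketched in a few lines. The sketch assumes the usual expectation E(N_st) = n_s n_t / (n(n-1)/2), which is the standard form for the Knox statistic but is not preserved in the text above; the variance needed for alternative (c) is not reproduced here. Points are assumed to be (x, y, time) tuples with planar coordinates, and all function names are hypothetical.

```python
import math
from itertools import combinations


def knox_counts(points, s, t):
    """Return (n_s, n_t, n_st, n_pairs) for points given as (x, y, time).
    A pair is close in space if its distance is < s and close in time if
    its time difference is < t."""
    n_s = n_t = n_st = 0
    for (x1, y1, t1), (x2, y2, t2) in combinations(points, 2):
        close_s = math.hypot(x1 - x2, y1 - y2) < s
        close_t = abs(t1 - t2) < t
        n_s += int(close_s)
        n_t += int(close_t)
        n_st += int(close_s and close_t)
    n_pairs = len(points) * (len(points) - 1) // 2
    return n_s, n_t, n_st, n_pairs


def poisson_upper_tail(k, lam):
    """Pr(X >= k) for X ~ Poisson(lam)."""
    term = math.exp(-lam)    # Pr(X = 0)
    cdf = 0.0
    for i in range(k):
        cdf += term
        term *= lam / (i + 1)
    return 1.0 - cdf


def knox_test(points, s, t):
    """Observed n_st, its assumed expectation, and the Poisson upper-tail
    p-value (alternative (a))."""
    n_s, n_t, n_st, n_pairs = knox_counts(points, s, t)
    expected = n_s * n_t / n_pairs
    return n_st, expected, poisson_upper_tail(n_st, expected)


# Two tight space-time pairs far apart from each other (illustrative data).
pts = [(0.0, 0.0, 0), (0.1, 0.0, 1), (10.0, 10.0, 100), (10.1, 10.0, 101)]
print(knox_test(pts, s=1.0, t=5))
```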

54 Time-space analysis: Knox statistic A local Knox statistic. The original Knox statistic provides valuable information on whether a significant space-time interaction exists in the data. But the statistical importance of particular observations cannot be learned from the Knox statistic alone. Oftentimes, it is also of interest to find those pairs of observations that are close in both space and time. This is what the "local" Knox statistic is proposed for; in this sense, the original Knox statistic is a "global" statistic. Let $n_s(i)$ be the number of observations that are close to observation i in space, $n_t(i)$ be the number that are close in time, and $n_{st}(i)$ the number that are close in both space and time. Define the observed value of the local Knox statistic $N_{st}(i)$ to be $n_{st}(i)$. One can show that $\sum_{i=1}^{n} n_{st}(i) = 2 n_{st}$, i.e., the local statistics sum to a constant multiple of the global statistic.
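The relationship between the local and global statistics is easy to verify numerically: each close pair (i, j) is counted once for i and once for j, so the local counts add up to twice the global count. A minimal sketch, assuming points are (x, y, time) tuples with planar coordinates; the function name and the sample data are illustrative.

```python
import math


def local_knox(points, s, t):
    """n_st(i) for each observation i: the number of other observations
    within distance s and within time t of observation i."""
    n = len(points)
    local = [0] * n
    for i in range(n):
        xi, yi, ti = points[i]
        for j in range(n):
            if i == j:
                continue
            xj, yj, tj = points[j]
            if math.hypot(xi - xj, yi - yj) < s and abs(ti - tj) < t:
                local[i] += 1
    return local


pts = [(0.0, 0.0, 0), (0.1, 0.0, 1), (0.2, 0.0, 2), (10.0, 10.0, 100)]
local = local_knox(pts, s=1.0, t=5)
print(local, "sum =", sum(local), "= 2 x global n_st =", sum(local) // 2)
```

Observations with large local values are the natural candidates to inspect once the global statistic signals.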

55 Time-space analysis: Knox statistic Let j = 1,, n be the ordered index of the time value assigned to location i, where for example, j = 1 implies that i has been assigned the time associated with the first observations, and j = n implies that i has been assigned the time associated with the n th observation. 55

56 Time-space analysis: Knox statistic An approximate significance test for $n_{st}(i)$ may be constructed by using a z-score. Sequence of use:
(1) The global Knox statistic $N_{st}$ can be monitored by a CUSUM or a Shewhart chart;
(2) When there is a significant global Knox statistic, the local Knox statistic can be useful in finding the observations that contribute to the global significance.

57 Time-space analysis: Knox statistic Interpretation of a significant Knox statistic: it implies a strong space-time interaction. This could be an indication that a disease is infectious, or that it is caused by some other type of agent that appears locally at specific times, as with food poisoning. In criminology, a strong space-time pattern of homicides can be an indication of a serial killer. In the past, many of the medical studies utilizing the Knox statistic were concerned with leukemia. Because there were strong space-time interactions, the studies were used as evidence that certain types of leukemia may be caused by viral infection. To this day, there is no settled answer in the medical community, but a viral etiology of the disease is still one valid hypothesis.

58 Time-space analysis: Knox statistic Example 4.15: Application of the Knox statistic to Burkitt's lymphoma in Uganda. Data: locations and dates of onset and diagnosis for cases of Burkitt's lymphoma. The study region was the West Nile district of Uganda. There are 188 cases for which data exist on both the location and the date of onset.

59 Time-space analysis: Knox statistic (Example 4.15) Monitoring using the Knox statistic was carried out for the following combinations of critical time windows and spatial distances (i.e., t and s): t = 30, 60, 90, 120, 180, 360 days; s = 2.5, 5, 10, 20, 40 km. There are a total of 30 = 6 × 5 combinations. A CUSUM chart is used to detect a change in the Knox statistic:

60 Time-space analysis: Knox statistic (Example 4.15) Results: Table 3 summarizes the monitoring result. Significant space-time interactions are found at critical space and time units. This also prompts people to think lymphoma may be caused by certain viral infection. 60

61 Time-space analysis: Knox statistic (Example 4.15) Results: Figure 2 provides an example of how the CUSUM may be tracked over time, for the case of s = 20 km and t = 180 days.

62 Time-space analysis: Knox statistic (Example 4.15) Results: In the figure, three space-time clustering signals are indicated:
- A brief signal is sent at observation 25, in Dec 1964;
- A second signal is noted at observation 82, in Feb 1969;
- The most persistent signal begins at observation 146, in January, and the clustering remains until observation 172, also in January. An examination identifies the observations that were the primary contributors to this signal; those are highlighted in Figure 1.


More information

GIS for Integrated Pest Management. Christina Hailey. Abstract:

GIS for Integrated Pest Management. Christina Hailey. Abstract: GIS for Integrated Pest Management Christina Hailey Abstract: At its formation in 1965, Harris County Mosquito Control (Houston, Texas) (HCMC) was primarily involved in the prevention and control of mosquito-borne

More information

Discrete distribution. Fitting probability models to frequency data. Hypotheses for! 2 test. ! 2 Goodness-of-fit test

Discrete distribution. Fitting probability models to frequency data. Hypotheses for! 2 test. ! 2 Goodness-of-fit test Discrete distribution Fitting probability models to frequency data A probability distribution describing a discrete numerical random variable For example,! Number of heads from 10 flips of a coin! Number

More information

Expectation-based scan statistics for monitoring spatial time series data

Expectation-based scan statistics for monitoring spatial time series data International Journal of Forecasting 25 (2009) 498 517 www.elsevier.com/locate/ijforecast Expectation-based scan statistics for monitoring spatial time series data Daniel B. Neill H.J. Heinz III College,

More information

MATH 250 / SPRING 2011 SAMPLE QUESTIONS / SET 3

MATH 250 / SPRING 2011 SAMPLE QUESTIONS / SET 3 MATH 250 / SPRING 2011 SAMPLE QUESTIONS / SET 3 1. A four engine plane can fly if at least two engines work. a) If the engines operate independently and each malfunctions with probability q, what is the

More information

Rapid detection of spatiotemporal clusters

Rapid detection of spatiotemporal clusters Rapid detection of spatiotemporal clusters Markus Loecher, Berlin School of Economics and Law July 2nd, 2015 Table of contents Motivation Spatial Plots in R RgoogleMaps Spatial cluster detection Spatiotemporal

More information

Bayesian Hierarchical Models

Bayesian Hierarchical Models Bayesian Hierarchical Models Gavin Shaddick, Millie Green, Matthew Thomas University of Bath 6 th - 9 th December 2016 1/ 34 APPLICATIONS OF BAYESIAN HIERARCHICAL MODELS 2/ 34 OUTLINE Spatial epidemiology

More information

Interactive GIS in Veterinary Epidemiology Technology & Application in a Veterinary Diagnostic Lab

Interactive GIS in Veterinary Epidemiology Technology & Application in a Veterinary Diagnostic Lab Interactive GIS in Veterinary Epidemiology Technology & Application in a Veterinary Diagnostic Lab Basics GIS = Geographic Information System A GIS integrates hardware, software and data for capturing,

More information

Spatial Analysis I. Spatial data analysis Spatial analysis and inference

Spatial Analysis I. Spatial data analysis Spatial analysis and inference Spatial Analysis I Spatial data analysis Spatial analysis and inference Roadmap Outline: What is spatial analysis? Spatial Joins Step 1: Analysis of attributes Step 2: Preparing for analyses: working with

More information

ARIC Manuscript Proposal # PC Reviewed: _9/_25_/06 Status: A Priority: _2 SC Reviewed: _9/_25_/06 Status: A Priority: _2

ARIC Manuscript Proposal # PC Reviewed: _9/_25_/06 Status: A Priority: _2 SC Reviewed: _9/_25_/06 Status: A Priority: _2 ARIC Manuscript Proposal # 1186 PC Reviewed: _9/_25_/06 Status: A Priority: _2 SC Reviewed: _9/_25_/06 Status: A Priority: _2 1.a. Full Title: Comparing Methods of Incorporating Spatial Correlation in

More information

Scalable Bayesian Event Detection and Visualization

Scalable Bayesian Event Detection and Visualization Scalable Bayesian Event Detection and Visualization Daniel B. Neill Carnegie Mellon University H.J. Heinz III College E-mail: neill@cs.cmu.edu This work was partially supported by NSF grants IIS-0916345,

More information

Effective Use of Geographic Maps

Effective Use of Geographic Maps Effective Use of Geographic Maps Purpose This tool provides guidelines and tips on how to effectively use geographic maps to communicate research findings. Format This tool provides guidance on geographic

More information

Section II: Assessing Chart Performance. (Jim Benneyan)

Section II: Assessing Chart Performance. (Jim Benneyan) Section II: Assessing Chart Performance (Jim Benneyan) 1 Learning Objectives Understand concepts of chart performance Two types of errors o Type 1: Call an in-control process out-of-control o Type 2: Call

More information

Identifying West Nile Virus Risk Areas: The Dynamic Continuous-Area Space-Time System

Identifying West Nile Virus Risk Areas: The Dynamic Continuous-Area Space-Time System American Journal of Epidemiology Copyright 2003 by the Johns Hopkins Bloomberg School of Public Health All rights reserved Vol. 157, No. 9 Printed in U.S.A. DOI: 10.1093/aje/kwg046 PRACTICE OF EPIDEMIOLOGY

More information

Probability: Why do we care? Lecture 2: Probability and Distributions. Classical Definition. What is Probability?

Probability: Why do we care? Lecture 2: Probability and Distributions. Classical Definition. What is Probability? Probability: Why do we care? Lecture 2: Probability and Distributions Sandy Eckel seckel@jhsph.edu 22 April 2008 Probability helps us by: Allowing us to translate scientific questions into mathematical

More information

Urbanization, Land Cover, Weather, and Incidence Rates of Neuroinvasive West Nile Virus Infections In Illinois

Urbanization, Land Cover, Weather, and Incidence Rates of Neuroinvasive West Nile Virus Infections In Illinois Urbanization, Land Cover, Weather, and Incidence Rates of Neuroinvasive West Nile Virus Infections In Illinois JUNE 23, 2016 H ANNAH MATZ KE Background Uganda 1937 United States -1999 New York Quickly

More information

Many natural processes can be fit to a Poisson distribution

Many natural processes can be fit to a Poisson distribution BE.104 Spring Biostatistics: Poisson Analyses and Power J. L. Sherley Outline 1) Poisson analyses 2) Power What is a Poisson process? Rare events Values are observational (yes or no) Random distributed

More information

Chapter 4 Part 3. Sections Poisson Distribution October 2, 2008

Chapter 4 Part 3. Sections Poisson Distribution October 2, 2008 Chapter 4 Part 3 Sections 4.10-4.12 Poisson Distribution October 2, 2008 Goal: To develop an understanding of discrete distributions by considering the binomial (last lecture) and the Poisson distributions.

More information

Lecture 2: Probability and Distributions

Lecture 2: Probability and Distributions Lecture 2: Probability and Distributions Ani Manichaikul amanicha@jhsph.edu 17 April 2007 1 / 65 Probability: Why do we care? Probability helps us by: Allowing us to translate scientific questions info

More information

DISCRETE VARIABLE PROBLEMS ONLY

DISCRETE VARIABLE PROBLEMS ONLY DISCRETE VARIABLE PROBLEMS ONLY. A biased die with four faces is used in a game. A player pays 0 counters to roll the die. The table below shows the possible scores on the die, the probability of each

More information

Temporal and spatial mapping of hand, foot and mouth disease in Sarawak, Malaysia

Temporal and spatial mapping of hand, foot and mouth disease in Sarawak, Malaysia Geospatial Health 8(2), 2014, pp. 503-507 Temporal and spatial mapping of hand, foot and mouth disease in Sarawak, Malaysia Noraishah M. Sham 1, Isthrinayagy Krishnarajah 1,2, Noor Akma Ibrahim 1,2, Munn-Sann

More information

Detecting Emerging Space-Time Crime Patterns by Prospective STSS

Detecting Emerging Space-Time Crime Patterns by Prospective STSS Detecting Emerging Space-Time Crime Patterns by Prospective STSS Tao Cheng Monsuru Adepeju {tao.cheng@ucl.ac.uk; monsuru.adepeju.11@ucl.ac.uk} SpaceTimeLab, Department of Civil, Environmental and Geomatic

More information

Role of GIS in Tracking and Controlling Spread of Disease

Role of GIS in Tracking and Controlling Spread of Disease Role of GIS in Tracking and Controlling Spread of Disease For Dr. Baqer Al-Ramadan By Syed Imran Quadri CRP 514: Introduction to GIS Introduction Problem Statement Objectives Methodology of Study Literature

More information

37.3. The Poisson Distribution. Introduction. Prerequisites. Learning Outcomes

37.3. The Poisson Distribution. Introduction. Prerequisites. Learning Outcomes The Poisson Distribution 37.3 Introduction In this Section we introduce a probability model which can be used when the outcome of an experiment is a random variable taking on positive integer values and

More information

Exercise 1.0 THE CELESTIAL EQUATORIAL COORDINATE SYSTEM

Exercise 1.0 THE CELESTIAL EQUATORIAL COORDINATE SYSTEM Exercise 1.0 THE CELESTIAL EQUATORIAL COORDINATE SYSTEM Equipment needed: A celestial globe showing positions of bright stars and Messier Objects. I. Introduction There are several different ways of representing

More information

Hierarchical Modeling and Analysis for Spatial Data

Hierarchical Modeling and Analysis for Spatial Data Hierarchical Modeling and Analysis for Spatial Data Bradley P. Carlin, Sudipto Banerjee, and Alan E. Gelfand brad@biostat.umn.edu, sudiptob@biostat.umn.edu, and alan@stat.duke.edu University of Minnesota

More information

2/7/2018. Module 4. Spatial Statistics. Point Patterns: Nearest Neighbor. Spatial Statistics. Point Patterns: Nearest Neighbor

2/7/2018. Module 4. Spatial Statistics. Point Patterns: Nearest Neighbor. Spatial Statistics. Point Patterns: Nearest Neighbor Spatial Statistics Module 4 Geographers are very interested in studying, understanding, and quantifying the patterns we can see on maps Q: What kinds of map patterns can you think of? There are so many

More information

Swarm and Evolutionary Computation. A new PSO-optimized geometry of spatial and spatio-temporal scan statistics for disease outbreak detection

Swarm and Evolutionary Computation. A new PSO-optimized geometry of spatial and spatio-temporal scan statistics for disease outbreak detection Swarm and Evolutionary Computation 4 (2012) 1 11 Contents lists available at SciVerse ScienceDirect Swarm and Evolutionary Computation journal homepage: www.elsevier.com/locate/swevo Regular paper A new

More information

Peninsular Florida p Modeled Water Table Depth Arboviral Epidemic Risk Assessment. Current Assessment: 06/08/2008 Week 23 Initial Wetting Phase

Peninsular Florida p Modeled Water Table Depth Arboviral Epidemic Risk Assessment. Current Assessment: 06/08/2008 Week 23 Initial Wetting Phase Peninsular Florida p Modeled Water Table Depth Arboviral Epidemic Risk Assessment Current Assessment: 06/08/2008 Week 23 Initial Wetting Phase Modeled Water Table Depth: MWTD has remained low across much

More information

Introduction to Statistical Data Analysis Lecture 3: Probability Distributions

Introduction to Statistical Data Analysis Lecture 3: Probability Distributions Introduction to Statistical Data Analysis Lecture 3: Probability Distributions James V. Lambers Department of Mathematics The University of Southern Mississippi James V. Lambers Statistical Data Analysis

More information

Geometric Distribution The characteristics of a geometric experiment are: 1. There are one or more Bernoulli trials with all failures except the last

Geometric Distribution The characteristics of a geometric experiment are: 1. There are one or more Bernoulli trials with all failures except the last Geometric Distribution The characteristics of a geometric experiment are: 1. There are one or more Bernoulli trials with all failures except the last one, which is a success. In other words, you keep repeating

More information

Integrating GIS into West Nile Virus Planning and Surveillance

Integrating GIS into West Nile Virus Planning and Surveillance Integrating GIS into West Nile Virus Planning and Surveillance Fairfax County Health Department Adrian Joye, Environmental Health Specialist Agenda Background/Benefits Routes/Trap Locations Dead Bird Complaint

More information

Railway suicide clusters: how common are they and what predicts them? Lay San Too Jane Pirkis Allison Milner Lyndal Bugeja Matthew J.

Railway suicide clusters: how common are they and what predicts them? Lay San Too Jane Pirkis Allison Milner Lyndal Bugeja Matthew J. Railway suicide clusters: how common are they and what predicts them? Lay San Too Jane Pirkis Allison Milner Lyndal Bugeja Matthew J. Spittal Overview Background Aims Significance Methods Results Conclusions

More information

Hotspot detection using space-time scan statistics on children under five years of age in Depok

Hotspot detection using space-time scan statistics on children under five years of age in Depok Hotspot detection using space-time scan statistics on children under five years of age in Depok Miranti Verdiana, and Yekti Widyaningsih Citation: AIP Conference Proceedings 1827, 020018 (2017); View online:

More information

In matrix algebra notation, a linear model is written as

In matrix algebra notation, a linear model is written as DM3 Calculation of health disparity Indices Using Data Mining and the SAS Bridge to ESRI Mussie Tesfamicael, University of Louisville, Louisville, KY Abstract Socioeconomic indices are strongly believed

More information

The CrimeStat Program: Characteristics, Use, and Audience

The CrimeStat Program: Characteristics, Use, and Audience The CrimeStat Program: Characteristics, Use, and Audience Ned Levine, PhD Ned Levine & Associates and Houston-Galveston Area Council Houston, TX In the paper and presentation, I will discuss the CrimeStat

More information

GIS & Spatial Analysis in MCH

GIS & Spatial Analysis in MCH GIS & Spatial Analysis in MCH Russell S. Kirby, University of Alabama at Birmingham rkirby@uab.edu, office 205-934-2985 Dianne Enright, North Carolina State Center for Health Statistics dianne.enright@ncmail.net

More information

Statistical Experiment A statistical experiment is any process by which measurements are obtained.

Statistical Experiment A statistical experiment is any process by which measurements are obtained. (التوزيعات الا حتمالية ( Distributions Probability Statistical Experiment A statistical experiment is any process by which measurements are obtained. Examples of Statistical Experiments Counting the number

More information

Causal Modeling in Environmental Epidemiology. Joel Schwartz Harvard University

Causal Modeling in Environmental Epidemiology. Joel Schwartz Harvard University Causal Modeling in Environmental Epidemiology Joel Schwartz Harvard University When I was Young What do I mean by Causal Modeling? What would have happened if the population had been exposed to a instead

More information

BROOKINGS May

BROOKINGS May Appendix 1. Technical Methodology This study combines detailed data on transit systems, demographics, and employment to determine the accessibility of jobs via transit within and across the country s 100

More information

( ) P A B : Probability of A given B. Probability that A happens

( ) P A B : Probability of A given B. Probability that A happens A B A or B One or the other or both occurs At least one of A or B occurs Probability Review A B A and B Both A and B occur ( ) P A B : Probability of A given B. Probability that A happens given that B

More information

Module 8 Probability

Module 8 Probability Module 8 Probability Probability is an important part of modern mathematics and modern life, since so many things involve randomness. The ClassWiz is helpful for calculating probabilities, especially those

More information

PubH 7470: STATISTICS FOR TRANSLATIONAL & CLINICAL RESEARCH

PubH 7470: STATISTICS FOR TRANSLATIONAL & CLINICAL RESEARCH PubH 7470: STATISTICS FOR TRANSLATIONAL & CLINICAL RESEARCH The First Step: SAMPLE SIZE DETERMINATION THE ULTIMATE GOAL The most important, ultimate step of any of clinical research is to do draw inferences;

More information

Nature of Spatial Data. Outline. Spatial Is Special

Nature of Spatial Data. Outline. Spatial Is Special Nature of Spatial Data Outline Spatial is special Bad news: the pitfalls of spatial data Good news: the potentials of spatial data Spatial Is Special Are spatial data special? Why spatial data require

More information

Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur

Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur Module No. #01 Lecture No. #11 Special Distributions-II In the Bernoullian trials that

More information

14.2 THREE IMPORTANT DISCRETE PROBABILITY MODELS

14.2 THREE IMPORTANT DISCRETE PROBABILITY MODELS 14.2 THREE IMPORTANT DISCRETE PROBABILITY MODELS In Section 14.1 the idea of a discrete probability model was introduced. In the examples of that section the probability of each basic outcome of the experiment

More information

Chapter 6 Spatial Analysis

Chapter 6 Spatial Analysis 6.1 Introduction Chapter 6 Spatial Analysis Spatial analysis, in a narrow sense, is a set of mathematical (and usually statistical) tools used to find order and patterns in spatial phenomena. Spatial patterns

More information

AFTERMATTHew, Came Irma!

AFTERMATTHew, Came Irma! AFTERMATTHew, Came Irma! MOSQUITOES & STORMS LAURA PEATY CHATHAM COUNTY MOSQUITO CONTROL SAVANNAH GA Major factors contributing to high mosquito numbers in our area include: Rain Tides Dredging operations

More information

Mapping the most and the least

Mapping the most and the least Mapping the most and the least Why do you make a map To communicate information at a glance To explore the data to see what patterns and retionships you can find To develop hypothesis (will be topic of

More information

Discrete probability distributions

Discrete probability distributions Discrete probability s BSAD 30 Dave Novak Fall 08 Source: Anderson et al., 05 Quantitative Methods for Business th edition some slides are directly from J. Loucks 03 Cengage Learning Covered so far Chapter

More information

CS 361: Probability & Statistics

CS 361: Probability & Statistics February 26, 2018 CS 361: Probability & Statistics Random variables The discrete uniform distribution If every value of a discrete random variable has the same probability, then its distribution is called

More information

Are You Maximizing The Value Of All Your Data?

Are You Maximizing The Value Of All Your Data? Are You Maximizing The Value Of All Your Data? Using The SAS Bridge for ESRI With ArcGIS Business Analyst In A Retail Market Analysis SAS and ESRI: Bringing GIS Mapping and SAS Data Together Presented

More information

Long Island Breast Cancer Study and the GIS-H (Health)

Long Island Breast Cancer Study and the GIS-H (Health) Long Island Breast Cancer Study and the GIS-H (Health) Edward J. Trapido, Sc.D. Associate Director Epidemiology and Genetics Research Program, DCCPS/NCI COMPREHENSIVE APPROACHES TO CANCER CONTROL September,

More information

Class 9. Query, Measurement & Transformation; Spatial Buffers; Descriptive Summary, Design & Inference

Class 9. Query, Measurement & Transformation; Spatial Buffers; Descriptive Summary, Design & Inference Class 9 Query, Measurement & Transformation; Spatial Buffers; Descriptive Summary, Design & Inference Spatial Analysis Turns raw data into useful information by adding greater informative content and value

More information

Discrete Probability Distributions

Discrete Probability Distributions Chapter 4 Discrete Probability Distributions 4.1 Random variable A random variable is a function that assigns values to different events in a sample space. Example 4.1.1. Consider the experiment of rolling

More information

Probability and Discrete Distributions

Probability and Discrete Distributions AMS 7L LAB #3 Fall, 2007 Objectives: Probability and Discrete Distributions 1. To explore relative frequency and the Law of Large Numbers 2. To practice the basic rules of probability 3. To work with the

More information

Census Geography, Geographic Standards, and Geographic Information

Census Geography, Geographic Standards, and Geographic Information Census Geography, Geographic Standards, and Geographic Information Michael Ratcliffe Geography Division US Census Bureau New Mexico State Data Center Data Users Conference November 19, 2015 Today s Presentation

More information

4. Discrete Probability Distributions. Introduction & Binomial Distribution

4. Discrete Probability Distributions. Introduction & Binomial Distribution 4. Discrete Probability Distributions Introduction & Binomial Distribution Aim & Objectives 1 Aims u Introduce discrete probability distributions v Binomial distribution v Poisson distribution 2 Objectives

More information

CS 1538: Introduction to Simulation Homework 1

CS 1538: Introduction to Simulation Homework 1 CS 1538: Introduction to Simulation Homework 1 1. A fair six-sided die is rolled three times. Let X be a random variable that represents the number of unique outcomes in the three tosses. For example,

More information

Superiority by a Margin Tests for One Proportion

Superiority by a Margin Tests for One Proportion Chapter 103 Superiority by a Margin Tests for One Proportion Introduction This module provides power analysis and sample size calculation for one-sample proportion tests in which the researcher is testing

More information

MATH 227 CP 7 SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.

MATH 227 CP 7 SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. MATH 227 CP 7 SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. Find the mean, µ, for the binomial distribution which has the stated values of n and p.

More information

Outline. 15. Descriptive Summary, Design, and Inference. Descriptive summaries. Data mining. The centroid

Outline. 15. Descriptive Summary, Design, and Inference. Descriptive summaries. Data mining. The centroid Outline 15. Descriptive Summary, Design, and Inference Geographic Information Systems and Science SECOND EDITION Paul A. Longley, Michael F. Goodchild, David J. Maguire, David W. Rhind 2005 John Wiley

More information

Practice problems from chapters 2 and 3

Practice problems from chapters 2 and 3 Practice problems from chapters and 3 Question-1. For each of the following variables, indicate whether it is quantitative or qualitative and specify which of the four levels of measurement (nominal, ordinal,

More information

Scottish Atlas of Variation

Scottish Atlas of Variation Scottish Atlas of Variation User Guide March 2019 Contents Introduction... 3 Accessing the Atlas of Variation... 3 Dashboard... 4 Introduction... 4 Drop-down menus... 5 Explore icon... 5 Information icons...

More information

Defining Statistically Significant Spatial Clusters of a Target Population using a Patient-Centered Approach within a GIS

Defining Statistically Significant Spatial Clusters of a Target Population using a Patient-Centered Approach within a GIS Defining Statistically Significant Spatial Clusters of a Target Population using a Patient-Centered Approach within a GIS Efforts to Improve Quality of Care Stephen Jones, PhD Bio-statistical Research

More information

Everything is related to everything else, but near things are more related than distant things.

Everything is related to everything else, but near things are more related than distant things. SPATIAL ANALYSIS DR. TRIS ERYANDO, MA Everything is related to everything else, but near things are more related than distant things. (attributed to Tobler) WHAT IS SPATIAL DATA? 4 main types event data,

More information

Cluster investigations using Disease mapping methods International workshop on Risk Factors for Childhood Leukemia Berlin May

Cluster investigations using Disease mapping methods International workshop on Risk Factors for Childhood Leukemia Berlin May Cluster investigations using Disease mapping methods International workshop on Risk Factors for Childhood Leukemia Berlin May 5-7 2008 Peter Schlattmann Institut für Biometrie und Klinische Epidemiologie

More information

Texas A&M University

Texas A&M University Texas A&M University CVEN 658 Civil Engineering Applications of GIS Hotspot Analysis of Highway Accident Spatial Pattern Based on Network Spatial Weights Instructor: Dr. Francisco Olivera Author: Zachry

More information

Statistical Process Control

Statistical Process Control Statistical Process Control Outline Statistical Process Control (SPC) Process Capability Acceptance Sampling 2 Learning Objectives When you complete this supplement you should be able to : S6.1 Explain

More information

Hennepin GIS. Tree Planting Priority Areas - Analysis Methodology. GIS Services April 2018 GOAL:

Hennepin GIS. Tree Planting Priority Areas - Analysis Methodology. GIS Services April 2018 GOAL: Hennepin GIS GIS Services April 2018 Tree Planting Priority Areas - Analysis Methodology GOAL: To create a GIS data layer that will aid Hennepin County Environment & Energy staff in determining where to

More information

Geographical Visualization Approach to Perceive Spatial Scan Statistics: An Analysis of Dengue Fever Outbreaks in Delhi

Geographical Visualization Approach to Perceive Spatial Scan Statistics: An Analysis of Dengue Fever Outbreaks in Delhi Geographical Visualization Approach to Perceive Spatial Scan Statistics: An Analysis of Dengue Fever Outbreaks in Delhi Student Name: Shuchi Mala IIIT-D-MTech-CS-DE-11-032 June 18, 2013 Indraprastha Institute

More information

Spatio-Temporal Cluster Detection of Point Events by Hierarchical Search of Adjacent Area Unit Combinations

Spatio-Temporal Cluster Detection of Point Events by Hierarchical Search of Adjacent Area Unit Combinations Spatio-Temporal Cluster Detection of Point Events by Hierarchical Search of Adjacent Area Unit Combinations Ryo Inoue 1, Shiho Kasuya and Takuya Watanabe 1 Tohoku University, Sendai, Japan email corresponding

More information

Acknowledgments xiii Preface xv. GIS Tutorial 1 Introducing GIS and health applications 1. What is GIS? 2

Acknowledgments xiii Preface xv. GIS Tutorial 1 Introducing GIS and health applications 1. What is GIS? 2 Acknowledgments xiii Preface xv GIS Tutorial 1 Introducing GIS and health applications 1 What is GIS? 2 Spatial data 2 Digital map infrastructure 4 Unique capabilities of GIS 5 Installing ArcView and the

More information

9/2/2010. Wildlife Management is a very quantitative field of study. throughout this course and throughout your career.

9/2/2010. Wildlife Management is a very quantitative field of study. throughout this course and throughout your career. Introduction to Data and Analysis Wildlife Management is a very quantitative field of study Results from studies will be used throughout this course and throughout your career. Sampling design influences

More information

Testing Independence

Testing Independence Testing Independence Dipankar Bandyopadhyay Department of Biostatistics, Virginia Commonwealth University BIOS 625: Categorical Data & GLM 1/50 Testing Independence Previously, we looked at RR = OR = 1

More information

If two different people are randomly selected from the 991 subjects, find the probability that they are both women. Round to four decimal places.

If two different people are randomly selected from the 991 subjects, find the probability that they are both women. Round to four decimal places. Math 227 Name 5 pts*20=100pts 1) A bin contains 67 light bulbs of which 8 are defective. If 3 light bulbs are randomly selected from the bin with replacement, find the probability that all the bulbs selected

More information

PASS Sample Size Software. Poisson Regression

PASS Sample Size Software. Poisson Regression Chapter 870 Introduction Poisson regression is used when the dependent variable is a count. Following the results of Signorini (99), this procedure calculates power and sample size for testing the hypothesis

More information
