K1D: Multivariate Ripley s K-function for one-dimensional data. Daniel G. Gavin University of Oregon Department of Geography Version 1.

Similar documents
Gravity Modelling Forward Modelling Of Synthetic Data

Module 2A Turning Multivariable Models into Interactive Animated Simulations

Tutorial 4: Power and Sample Size for the Two-sample t-test with Unequal Variances

Using Tables and Graphing Calculators in Math 11

Tutorial 1: Power and Sample Size for the One-sample t-test. Acknowledgements:

Tutorial 2: Power and Sample Size for the Paired Sample t-test

Tutorial 3: Power and Sample Size for the Two-sample t-test with Equal Variances. Acknowledgements:

1 Introduction to Minitab

Perinatal Mental Health Profile User Guide. 1. Using Fingertips Software

Work with Gravity Data in GM-SYS Profile

Introduction to Computer Tools and Uncertainties

Boyle s Law: A Multivariable Model and Interactive Animated Simulation

Outline. Introduction to SpaceStat and ESTDA. ESTDA & SpaceStat. Learning Objectives. Space-Time Intelligence System. Space-Time Intelligence System

A Plot of the Tracking Signals Calculated in Exhibit 3.9

CatchAll Version 2.0 User Operations Manual. by Linda Woodard, Sean Connolly, and John Bunge Cornell University. June 7, 2011

From Practical Data Analysis with JMP, Second Edition. Full book available for purchase here. About This Book... xiii About The Author...

Trendlines Simple Linear Regression Multiple Linear Regression Systematic Model Building Practical Issues

(Directions for Excel Mac: 2011) Most of the global average warming over the past 50 years is very likely due to anthropogenic GHG increases

Electric Fields and Equipotentials

EXPERIMENT 2 Reaction Time Objectives Theory

Tests for Two Coefficient Alphas

Titrator 3.0 Tutorial: Calcite precipitation

85. Geo Processing Mineral Liberation Data

Passing-Bablok Regression for Method Comparison

NEW HOLLAND IH AUSTRALIA. Machinery Market Information and Forecasting Portal *** Dealer User Guide Released August 2013 ***

LAB 5 INSTRUCTIONS LINEAR REGRESSION AND CORRELATION

Photometry of Supernovae with Makali i

An area chart emphasizes the trend of each value over time. An area chart also shows the relationship of parts to a whole.

85. Geo Processing Mineral Liberation Data

Your work from these three exercises will be due Thursday, March 2 at class time.

-ASTR 204 Application of Astroimaging Techniques

Rutherford s Scattering Explanation

Financial Econometrics Review Session Notes 3

Quality Measures (QM) Report. Self Guided Tutorial

Assignment 1: Molecular Mechanics (PART 1 25 points)

ANGLO-CHINESE JUNIOR COLLEGE MATHEMATICS DEPARTMENT. Paper 2 22 August 2016 JC 2 PRELIMINARY EXAMINATION Time allowed: 3 hours

Didacticiel - Études de cas

375 PU M Sc Statistics

Probabilities, Uncertainties & Units Used to Quantify Climate Change

module, with the exception that the vials are larger and you only use one initial population size.

THE CRYSTAL BALL FORECAST CHART

UNST 232 Mentor Section Assignment 5 Historical Climate Data

LAB 3 INSTRUCTIONS SIMPLE LINEAR REGRESSION

Bivariate Relationships Between Variables

Spatial Point Pattern Analysis

THE ECAT SOFTWARE PACKAGE TO ANALYZE EARTHQUAKE CATALOGUES

Creating Empirical Calibrations

Lab Activity: The Central Limit Theorem

Hypothesis Tests and Estimation for Population Variances. Copyright 2014 Pearson Education, Inc.

Conformational Analysis of n-butane

Using SPSS for One Way Analysis of Variance

How To: Deal with Heteroscedasticity Using STATGRAPHICS Centurion

Tutorial 5: Power and Sample Size for One-way Analysis of Variance (ANOVA) with Equal Variances Across Groups. Acknowledgements:

Probable Factors. Problem Statement: Equipment. Introduction Setting up the simulation. Student Activity

This lab exercise will try to answer these questions using spatial statistics in a geographic information system (GIS) context.

Chapter 9 Ingredients of Multivariable Change: Models, Graphs, Rates

Uta Bilow, Carsten Bittrich, Constanze Hasterok, Konrad Jende, Michael Kobel, Christian Rudolph, Felix Socher, Julia Woithe

Regression analysis is a tool for building mathematical and statistical models that characterize relationships between variables Finds a linear

ICM-Chemist How-To Guide. Version 3.6-1g Last Updated 12/01/2009

Regression used to predict or estimate the value of one variable corresponding to a given value of another variable.

Tests for the Odds Ratio of Two Proportions in a 2x2 Cross-Over Design

August 7, 2007 NUMERICAL SOLUTION OF LAPLACE'S EQUATION

CHAPTER 10. Regression and Correlation

Ratio of Polynomials Fit One Variable

Overview of Spatial analysis in ecology

Two Correlated Proportions Non- Inferiority, Superiority, and Equivalence Tests

Physics 191 Free Fall

Photometry with Iris Photometry with Iris

Lecture 2. Estimating Single Population Parameters 8-1

Didacticiel - Études de cas

Location Latitude Longitude Durham, NH

THE CRYSTAL BALL SCATTER CHART

Ratio of Polynomials Fit Many Variables

Three Factor Completely Randomized Design with One Continuous Factor: Using SPSS GLM UNIVARIATE R. C. Gardner Department of Psychology

Companion. Jeffrey E. Jones

Confidence Intervals for One-Way Repeated Measures Contrasts

Analysis of Variance. Contents. 1 Analysis of Variance. 1.1 Review. Anthony Tanbakuchi Department of Mathematics Pima Community College

2.4. Model Outputs Result Chart Growth Weather Water Yield trend Results Single year Results Individual run Across-run summary

Analyst Software. Peptide and Protein Quantitation Tutorial

α m ! m or v T v T v T α m mass

Supplementary File 3: Tutorial for ASReml-R. Tutorial 1 (ASReml-R) - Estimating the heritability of birth weight

Computer simulation of radioactive decay

Lab 1 Uniform Motion - Graphing and Analyzing Motion

HOW TO USE MIKANA. 1. Decompress the zip file MATLAB.zip. This will create the directory MIKANA.

let s examine pupation rates. With the conclusion of that data collection, we will go on to explore the rate at which new adults appear, a process

Multiple group models for ordinal variables

A Scientific Model for Free Fall.

Spreadsheet Solution of Systems of Nonlinear Differential Equations

Item Reliability Analysis

Basic Data Analysis. Stephen Turnbull Business Administration and Public Policy Lecture 9: June 6, Abstract. Regression and R.

Moving into the information age: From records to Google Earth

MORE ON MULTIPLE REGRESSION

DISCRETE RANDOM VARIABLES EXCEL LAB #3

Hotelling s One- Sample T2

Solving Differential Equations on 2-D Geometries with Matlab

HSC Chemistry 7.0 User's Guide

Power Analysis Introduction to Power Analysis with G*Power 3 Dale Berger 1401

BASIC TECHNOLOGY Pre K starts and shuts down computer, monitor, and printer E E D D P P P P P P P P P P

Radioactivity: Experimental Uncertainty

BIOL Biometry LAB 6 - SINGLE FACTOR ANOVA and MULTIPLE COMPARISON PROCEDURES

Transcription:

K1D: Multivariate Ripley s K-function for one-dimensional data Daniel G. Gavin University of Oregon Department of Geography Version 1.2 (July 2010) 1

Contents 1. Background 1a. Bivariate and multivariate K-Function for one dimension 1b. Use of integer data 1c. Confidence envelope for the multivariate K-function 1d. Smoothing event frequencies 2. Example of the bivariate K-function for one dimensional data 3. Running K1D Revision History: Revisioeta 2 (August 2005). Fixed file input error. Now does not require the first column to have the greatest number of events. Revisioeta 3 (November 2005). Added several options: different model forms (forward and backward selection), use of integer data, and different randomization schemes for constructing confidence envelopes. Version 1.0 (February 2007). Changed format of input file in include specifications of start-year and end-year. Otherwise, no other changes. Version 1.1 (May 2007). Fixed error in calculation of the forward-selection and backward-selection functions. Version 1.2 (July 2010). Fixed issue in setting whether a record has integer values. Clarified user guide with respect to choice of randomization method. 2

1. Background K1D computes the multivariate Ripley K-function simplified for one dimension (e.g., time or line transects). K1D computes the dependence between two or more types of events that are ordered in one dimension. The motivation for creating this program was to test whether dates of forest fires at two or more sites co-occur more than would be expected by chance. A general review of the K-function and the bivariate K-function will not be covered here, but an excellent recent review is in Wiegand and Moloney (2004). K1D has three main features: a) Calculation of the bivariate K-function and its transform (L) using an edge correction. K function may be calculated one of three ways. b) Calculation of confidence envelopes one of three ways: 1) randomization of events (optionally weighted by an intensity function), 2 ) a circular ( toroidal in two dimensions) shift of the records relative to each other, or 3) a shuffle of events (i.e., randomization without replacement). c) Calculation of smoothed frequencies of events using a set window width. 1a. Bivariate and multivariate K-Function for one dimension The bivariate K-function gives the number of events in record B occurring within ± t yr (the temporal window ) of each event in record A, and scaled by T/( ) where T is the length of the record (transect) and and are the number of events i and B respectively. The bivariate K-function for a single dimension is expressed as: K ( t) = T i=1 w( A i,b j ) Ι( A i B j < t) j =1 where A i and B j are locations (times) of events, I is the identity function, and w(a i,b j ) is an edge correction, set to 2 if A i B j is greater than the distance of A i to the nearest edge of the record, otherwise it is set to 1 (Doss 1989). Values of K greater than 2t suggest attraction, or synchrony, betwee and B, within a window of t. Values of K close to 2t suggest no relationship or independence betwee and B, and values less than 2t suggest repulsion, or asynchrony. The edge correction causes K (t) to differ slightly from K BA (t)(i.e., whether distances are measured from A to B or vice versa). K (t)and K BA (t) are averaged by weighting each to produce the unbiased estimate K ˆ (t) (Lotwick and Silverman 1982): K K ˆ (t) = n B ( t) + + The multivariate expression of the K-function (comparing 2 types of events) is implemented by comparing events one record to the aggregated events in all other records: K BA ( t) 3

K CG ( t) = G T R =1( R C) n C n R n C G n R i=1 R( R C) j =1 w( C i,r j ) Ι( C i R j t) where G is the number of records and C is a single record compared to all other records. The above function is calculated for all G records, and then the G functions are averaged as for the bivariate case: K GG ( t) = G C =1 G R( R C) G R =1 n R K CG ( t) n R (G 1) For graphing purposes, the K-function is easier to interpret if its mean and variance are stabilized over t as expressed by the L-function: L ˆ (t) = K ˆ (t) /2 t. The above methods are suitable for examining whether events occur closer to events in a second record irrespective of direction. In some cases, one may wish to focus on whether events in one record follow events in another record. This is termed the forward selection K-function: K ( t) = T i=1 j =1 Ι( ( B j A i ) t, B j A i ) where the K-function is the same as the previous bi-directional model, but there is an added criteria that B events co-occur with or follow A events (see after the comma in the identify function). Similarly, one can define the opposite case, i.e., whether events in one record precede events in another record. This is the backward selecton K-function for B events preceding A events: K ( t) = T i=1 j =1 Ι( ( A i B j ) t, A i B j ) Because events are tallied in only one direction, the L-function for both forward and backward selection models is L ˆ (t) = K ˆ (t) t. Notes on implementation in K1D: In this software, edge corrections are not applied with the forward selection or backward selection methods. The program reads the first column of data as the A record, and the second column as the B record. Thus, for forward selection, the K-function searches for B events following A events. For the backward selection method, the K-function 4

searches for B events preceding A events. Only two records are permitted with the forward and backward selection methods. 1b. Use of integer data In many cases, one wishes to compare series using values at a resolution of whole numbers. For example, time series often have only one value per year, and events cannot be defined at resolutions less than a year. In this case, it is possible to specify integer data for calculation of the K-function and for the simulation envelopes. For integer data, K1D will compute a K-function using t=0 in addition to the increments specified. It is necessary that the increment of the K-function is an integer when using integer data. 1c. Confidence envelope for the multivariate K-function A confidence envelope is constructed using simulations. There are three simulation methods: 1. Circular shift: Each randomization consists of shifting each record (=column in your data file) a random number of years and wrapping events from the end to the start of the record. This test only addresses dependence between the two records and preserves the first-order properties (frequency) of each record in each randomization (Wiegand and Moloney 2004). Confidence envelopes are set by percentiles in the distribution of the simulated K-function. Where L ˆ (t)is greater than, within, or less than the confidence envelope indicates statistically significant synchrony (attraction), independence, or statistically significant asynchrony (repulsion), respectively, in a window of t. 2. Random numbers: Events are chosen randomly (a Poisson process). Events may also be weighted if there is a non-stationary trend that must be preserved in the simulations. Such weightings may be determined using Poisson regression. 3. Shuffling of events. This is only available for integer data. Events are chosen randomly without replacement. NOTE: You may choose to randomize the first record and/or the second and all subsequent records. Note that if edge corrections are being calculated, it does matter which records you choose to randomize. If record A is fixed and record B is randomized, the result will be different compared to the reverse: fixing record B and randomizing record A. This is because the edge correction will be different, on average, for the two cases. Designing the randomizations is based on the hypothesis you with to test. If you with to test independence of two records with each other, then randomize both records (check both boxes). If you wish to test whether record B is synchronized with the events in record A, then check only the lower box. 1d. Smoothing event frequencies To help visualize changes in event frequencies using different levels of smoothing, K1D also computes smoothed event frequencies. This smoothing is done in two steps. First, along regular intervals l y,2y,3y,4 y...t, number of events occurring within a window of ±w is tallied. This number is scaled upward for cases where the window extends beyond the edge of the record: 5

F( l) = n ( ) min X i,t X i i=1 l X i <w 1, min( X i,t X i ) w ( w ) + 0.5w, min ( X,T X i i) < w This results in a step-like series of event frequencies. Second, F(l) is smoothed using the tricube function using the same window width following Huntley et al. (1989): S( l) = T y ( ) ( ) F( ym) V l,ym m=1; ym l <w T y ( ) m=1; ym l <w ( ) V l,ym where S(l) is the smoothed value at position l with a window width of ±w, m the step number of size y such that ym is the position the position along the record, and V (l,ym) is the tricube weighting function. Smaller y results in a less jagged curve and y<<w. The tricube function for the weighting of the distance between l and ym is: ( ) = 1 ym l V l, ym w 3 3 6

2. Example of the bivariate K-function for one dimensional data Demonstration of the bivariate K-function for testing synchrony over a range of temporal windows. (a) Two records where the first is a series of events places on random years and the second has events placed within 50 yr of events in the first record. (b) The L-function (transform of the K-function) for the events in (a) with 95% confidence envelope (thin lines) based on 1000 randomizations. The function exceeds the upper confidence envelop from 25 to150 yr, indicating strong correlation of event times within windows of that scale, but lack of long-term patterns in both records results in no synchrony in larger windows. (c) Two records where the number of events decreases by one each millennium. (d) The L-function for the events in (c). The function exceeds the confidence envelope at several window sizes between 500 and 1700 yr, indicating the millennial-scale pattern in common between the two records. 3. Running K1D This program is compiled for Mac OS X and Windows. The Windows version has received little testing (versioeta 3). Data input: Tab-delimited text files 5 0 2000 CY MIR RS Yahoo Barr 349 136 403 24 1360 445 282 815 907 1530 569 410 915 1267 1600 1004 1073 1034 1928 1640 1099 1288 1240 1313 1426 1427 1485 1605 1595 1801 1746 1695 1777 1857 The first row contains the number of records, the start year, and the end year. Here, the records span from 1-2000. The length of the record (T) is the difference of the start year and end year. All events must be bounded by the start-year and end-year dates. If using integer data, the data are inclusive of the start year and end year, and thus the length of the record is endyear-startyear+1. The second row contains the names of each record. All other rows contain the locations of events. Integer or non-integer values. 7

1) Load files using File Open or Command-O. If the files are read properly, a text box on the program will print the number of events in each record. 2) Choose the model form with the pop-up menu. 3) Choose the number of intervals in the K-function. The K-function will be computed to T/2, where T is the length of the record. For example, 10 intervals on a 2000-year record will yield K computed from 100 to 1000 at steps of 100 units. Run the K- function. 4) Choose the method of randomizing, and the records to randomize. For the random method, you may optionally choose an intensity-weighting file to bias event positions in simulated records. The intensity weighting file is a tab-delimited text file with two columns: column 1=location in record; column 2=probability occurring at that point. Column 1 should have rows for 1 to the length of the record. Column 2 should sum to 1 over all rows. There is a header row; see example file. The scale must be such that the intensity function can be described for the entire length of the record using integers (i.e., a record of length 500 is better than a record of length 3). This limitation may be addressed in future versions. 5) Choose the confidence envelope interval and the number of simulations to run. Larger numbers of simulations produces more reliable confidence envelopes; usually 1000 is sufficient. Run the simulations. A progress bar will show how much time remains for these calculations. 6) Output, with a header row, is copied using Command-C, and can be pasted into spreadsheets. Do not click on the spreadsheet before copying, as only the selected cells will be copied. 7) Run the smoothing by selecting a window width and the step size for the output. The window width is the total width of the window: a window of 100 results in a binning of events occurring ±50 from a regularly stepped position. The step size should be smaller than the window width, and results in a more appealing curve if 100 or more steps are used. 8) New data files may be opened without quitting and restarting. References Doss, H. 1989. On estimating the dependence between two point processes. The Annals of Statistics 17:749-763. Huntley, B., Bartlein, P.J., & Prentice, I.C. (1989) Climatic control of the distribution and abundance of beech (Fagus L ) in Europe and North America. Journal of Biogeography, 16, 551-560. Lotwick, H. W., and B. W. Silverman. 1982. Methods for analysing spatial processes of several types of points. Journal of the Royal Statistical Society B44:406-413. Wiegand, T., and K. A. Moloney. 2004. Rings, circles, and null-models for point pattern analysis in ecology. Oikos 104:209-229. 8