Residuals for spatial point processes based on Voronoi tessellations
Ka Wong¹, Frederic Paik Schoenberg², Chris Barr³.

¹ Google, Mountain View, CA.
² Corresponding author, 8142 Math-Science Building, Department of Statistics, University of California, Los Angeles, CA, USA. frederic@stat.ucla.edu.
³ Department of Biostatistics, Harvard University.
Abstract

A residual analysis method for spatial point processes is proposed, where differences between the modeled conditional intensity and the observed number of points are assessed over the Voronoi cells generated by the observations. The resulting residuals appear to be substantially less skewed and hence more stable, particularly for point processes with conditional intensities close to zero, compared to ordinary Pearson residuals and other pixel-based methods. An application to models for Southern California earthquakes is provided.

Key words: Papangelou intensity, Pearson residuals, point patterns, residual analysis, Voronoi residuals.

1 Introduction.

The aim of this paper is to propose a new form of residual analysis for assessing the goodness of fit of spatial or spatial-temporal point process models. The proposed method relies on comparing the normalized observed and expected numbers of points over Voronoi cells generated by the observed point pattern.

The excellent treatment of Pearson residuals and other pixel-based residuals by Baddeley et al. (2005), the thorough discussion of their properties in Baddeley et al. (2008), and the fact that such residuals extend so readily to the case of spatial-temporal point processes may suggest that the problem of residual analysis for such point processes is generally solved. Hence, we feel it is necessary to devote a substantial portion of this paper to a major shortcoming of such pixel-based residuals, in order to motivate our proposed alternative. In brief, Pearson residuals and other pixel-based residuals tend to be highly skewed when the integrated conditional intensity over some of the cells is close to zero, which is common in many applications. By contrast, the proposed Voronoi residuals are approximately Gamma distributed and tend to be far less skewed than Pearson residuals, and are thus far more amenable to assessment of goodness of fit.

ZZQ: We might need to define spatial and spatial-temporal point processes and their intensities, and say that we are assuming throughout that the point processes are simple. We are assuming that the observation region is equipped with Lebesgue measure, $\mu$. Note that we are not emphasizing the distinction between conditional and Papangelou intensities here. The methods and results here are essentially equivalent for spatial and spatial-temporal point processes.

This paper is organized as follows. In Section 2, we briefly review the goals of residual analysis as well as Pearson residuals and other pixel-based residuals described in Baddeley et al. (2005), and discuss their limitations when the integrated conditional intensity is small. Section 3 describes Voronoi residuals and discusses their properties. The simulations shown in Section 4 demonstrate the potential advantages of the Voronoi residuals over conventional pixel-based residuals in cases where the conditional intensity is occasionally close to zero. Section 5 includes an application to models for earthquake occurrences in Southern California.
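The skewness issue raised in the introduction can be quantified under one simple assumption, namely that the count in a pixel is Poisson with mean m equal to the integrated intensity. In that case the Pearson residual (N - m) / sqrt(m) has mean 0, variance 1 and skewness 1 / sqrt(m), which grows without bound as m approaches zero. The sketch below checks this numerically; the function name `pearson_residual_moments` and the truncation point `max_count` are illustrative choices, not part of the paper.

```python
import math

def pearson_residual_moments(mean, max_count=200):
    """Mean, variance and skewness of (N - mean) / sqrt(mean) for N ~ Poisson(mean).

    The Poisson distribution is truncated at max_count, which is harmless
    for the small and moderate means considered here.
    """
    prob = math.exp(-mean)  # P(N = 0)
    m1 = m2 = m3 = 0.0      # raw moments of the Pearson residual
    for n in range(max_count + 1):
        r = (n - mean) / math.sqrt(mean)
        m1 += r * prob
        m2 += r * r * prob
        m3 += r ** 3 * prob
        prob *= mean / (n + 1)  # P(N = n + 1) obtained from P(N = n)
    var = m2 - m1 * m1
    skew = (m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3) / var ** 1.5
    return m1, var, skew

for m in (0.01, 1.0, 25.0):
    print(m, pearson_residual_moments(m))
```

Standardization fixes the first two moments but leaves the skewness untouched, which is why the normal approximation is reasonable for moderately sized pixel means and poor for means near zero.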
2 Pearson residuals and other pixel-based methods.

Residual plots for spatial point processes have two related purposes: (i) to suggest locations or aspects of the model where the fit is poor, so that an incorrectly specified model may be improved; (ii) to form the basis of formal testing, i.e. to assess the overall appropriateness of a model, or to what extent the model fits well and hence results based on the model may be trusted.

Baddeley et al. (2005) discuss a variety of pixel-based residuals for spatial point processes. The residual diagnostics are plots showing the standardized differences between the number of points occurring in each pixel and the number expected according to the fitted model, where the standardization may be performed in various ways. For instance, for Pearson residuals, one divides the difference by the estimated standard deviation of the number of points in the pixel, in analogy with Pearson residuals in the context of linear models. Baddeley et al. (2005) also propose scaling the residuals based on the contribution of each pixel to the total pseudo-loglikelihood of the model, in analogy with score statistics in generalized linear modeling. Standardization is important for both purposes (i) and (ii): without it, plots of the residuals tend to overemphasize deviations in pixels where the rate is high, and formal testing based on individual pixels requires the standard deviation of the number of points in the pixel to be taken into account.

Behind the term Pearson residuals lies the implication, both implicit and explicit (see e.g. the error bounds in Fig. 7 of Baddeley et al. 2005), that these standardized residuals should be approximately standard normally distributed, so that the squared residuals, or their sum, are distributed approximately according to Pearson's $\chi^2$ distribution. Pearson residuals appear to be effective model evaluation tools in examples where the estimate of the conditional intensity, $\lambda$, is moderately sized throughout the space of observation, as is the case throughout Baddeley et al. (2005).

ZZQ: briefly outline other residuals in Baddeley et al. (2005) and Baddeley et al. (2008)?

If $\lambda$ is small, however, then the Pearson residuals will be heavily skewed and their distribution will not be well approximated by the normal or $\chi^2$ distributions. Indeed, when $\lambda$ is close to zero, the raw residuals tend to have a distribution that is very highly skewed, and the standardization to form Pearson residuals actually exacerbates this skew. Unfortunately, these situations arise in many applications. For example, in modeling earthquake occurrences, the modeled conditional intensity is typically close to zero far away from known faults or previous seismicity, and in the case of modeling wildfires, one may have a modeled conditional intensity close to zero in areas far from human use or frequent lightning, or with vegetation types that do not readily support much wildfire activity (ZZQ: cite). Furthermore, even if $\lambda$ is not extremely close to zero, if the pixels used for Pearson residuals are sufficiently small that the integral of $\lambda$ over pixels is occasionally very small, then the same skew occurs.

Since the Pearson residuals are standardized to have mean zero and unit (or approximately unit) variance under the null hypothesis that the modeled conditional intensity is correct (see Baddeley et al. 2008), one may inquire whether the skew of these residuals is indeed problematic. Consider a case of a planar Poisson process where the estimate of $\lambda$ is exactly correct, i.e. $\hat\lambda(x, y) = \lambda(x, y)$ at all locations, and where one elects to use Pearson residuals on pixels. Suppose that there are several pixels where the integral of $\lambda$ over the pixel is roughly 0.01. Given many of these pixels, it is not unlikely that at least one of
them will contain a point of the process. In such pixels, the raw residual will be 0.99, and the standard deviation of the number of points in the pixel is $\sqrt{0.01} = 0.1$, so the Pearson residual is $0.99 / 0.1 = 9.9$. This may yield the following effects:

a) Such Pearson residuals may overwhelm the others in a visual inspection, rendering a plot of the Pearson residuals largely useless in terms of evaluating the quality of the fit of the model.

b) Conventional tests based on the normal approximation will have grossly incorrect p-values, and will commonly tend to reject the model, although it is correct, based on one such residual alone. Even if one adjusts for the non-normality of the residual and instead uses exact p-values based on the Poisson distribution for one such pixel individually, the test will still reject the model at the significance level of 0.01, since the probability of at least one point in such a pixel is $1 - e^{-0.01} \approx 0.00995$.

c) If one adjusts for the non-normality of the residual and computes exact simultaneous p-values, then the resulting tests will have extremely low power. Indeed, if a grid of 10,000 pixels is used, then gross mis-specification would be required in order to reject the null hypothesis with probability greater than a mere 10% under such circumstances.

ZZQ: We need to check this. We need to simulate Poisson processes to make this last statement more concrete. We will need to make an assumption about the intensity.

3 Voronoi residuals.

A Voronoi tessellation is a division of the metric space on which a point process is defined into convex polygons, or Voronoi cells. Specifically, given a spatial or spatial-temporal point pattern $N$, one may define its corresponding Voronoi tessellation as follows: for each point $\tau_i$ of the point process, its corresponding cell $D_i$ is the region consisting of all locations which are closer to $\tau_i$ than to any other point of $N$. The Voronoi tessellation is the collection of such cells. See e.g. Okabe et al. (2000) for a thorough treatment of Voronoi tessellations and their properties.

Given a model for the conditional intensity of a spatial or space-time point process, one may construct residuals simply by evaluating the Pearson residuals over cells rather than rectangular pixels, where the cells comprise the Voronoi tessellation of the observed spatial or spatial-temporal point pattern. We will refer to such residuals as Voronoi residuals.

Voronoi residuals offer one obvious advantage over conventional pixel-based methods: the cell sizes are entirely automatic and data-driven. With pixel-based methods, the cell boundaries are often determined rather arbitrarily, yet these boundaries can have immense impacts on the results, particularly when $\lambda$ is volatile.

More importantly, the distributions of the Voronoi residuals tend to be far less skewed than those of pixel-based residuals such as Pearson residuals, particularly when $\hat\lambda$ is small in some areas. Indeed, since each Voronoi cell has exactly one point inside it by construction, the Voronoi residual for cell $i$ is given by

$$\hat{r}_i := \frac{1 - \int_{D_i} \hat\lambda \, d\mu}{\sqrt{\int_{D_i} \hat\lambda \, d\mu}} = \frac{1 - |D_i| \bar\lambda}{\sqrt{|D_i| \bar\lambda}}, \qquad (1)$$

where $\bar\lambda$ denotes the mean of $\hat\lambda$ over $D_i$ and $|D_i|$ denotes the size of the cell. Note that when $N$ is a homogeneous Poisson process, the cell size $|D_i|$ is approximately Gamma distributed. Indeed, for a homogeneous Poisson process, the expected area of a Voronoi cell is equal to the reciprocal of the intensity of the process (Meijering 1953), and simulation studies have shown that the area of a typical Voronoi cell is approximately Gamma distributed (Hinde and Miles, 1980; Tanemura, 2003). These properties continue to hold approximately in the inhomogeneous case provided that the conditional intensity is approximately constant near the location in question (Barr and Schoenberg 2010). Hence the numerator in equation (1) will often tend to be distributed approximately like a rescaled Gamma random variable. By contrast, for pixels over which the integrated conditional intensity is close to zero, the conventional raw residuals are approximately Bernoulli distributed.

4 Simulated examples.

The exact distributions of the Voronoi residuals are generally quite intractable due to the fact that the cells themselves are random. Simulations may be useful to investigate the approximate distributions of these residuals.

ZZQ1: Assume a certain specific intensity for a spatial inhomogeneous Poisson process with small pixels. For instance, we could take $\lambda(x, y) = 100 x^2 |y|$ over the space $[-1, 1] \times [-1, 1]$. There will be around 67 points. Use a grid of pixels. Look at a typical plot of the raw residuals, Pearson residuals, and tessellation residuals. The raw and Pearson residuals will probably just show the points basically, near the origin.

ZZQ2: Simulate the inhomogeneous Poisson process many times and look at histograms of the residuals at the origin for Pearson and Voronoi residuals. For Pearson residuals, use the pixel $[0, 0.01] \times [0, 0.01]$.
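The simulation plan above can be sketched in a few lines of standard-library Python. Several things here are assumptions rather than the authors' method: the suggested intensity is read as 100 x^2 |y| on [-1, 1] x [-1, 1] (chosen because it integrates to 200/3, matching "around 67 points"), the process is simulated by thinning, the 20 x 20 pixel grid is arbitrary, and the integral of the intensity over each Voronoi cell is approximated by assigning the midpoints of a fine grid to their nearest observed point instead of constructing the cells exactly. All function names are hypothetical.

```python
import math
import random

LAM_MAX = 100.0  # upper bound of lam on the window [-1, 1] x [-1, 1]

def lam(x, y):
    """Assumed intensity 100 * x^2 * |y| (integrates to 200/3 over the window)."""
    return 100.0 * x * x * abs(y)

def poisson(rng, mean):
    """Draw a Poisson(mean) count by counting unit-rate exponential arrivals in [0, mean]."""
    count, total = 0, rng.expovariate(1.0)
    while total < mean:
        count += 1
        total += rng.expovariate(1.0)
    return count

def simulate(rng):
    """Simulate the process by thinning a homogeneous process of rate LAM_MAX."""
    points = []
    for _ in range(poisson(rng, LAM_MAX * 4.0)):  # the window has area 4
        x, y = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        if rng.random() * LAM_MAX < lam(x, y):
            points.append((x, y))
    return points

def pixel_integral(a, b, c, d):
    """Exact integral of lam over the pixel [a, b] x [c, d]."""
    return 100.0 * (b ** 3 - a ** 3) / 3.0 * (d * abs(d) - c * abs(c)) / 2.0

def pearson_residuals(points, k=20):
    """Pearson residuals (count - mean) / sqrt(mean) on a k x k pixel grid."""
    w = 2.0 / k
    counts = [[0] * k for _ in range(k)]
    for x, y in points:
        i = min(int((x + 1.0) / w), k - 1)
        j = min(int((y + 1.0) / w), k - 1)
        counts[i][j] += 1
    residuals = []
    for i in range(k):
        for j in range(k):
            a, c = -1.0 + i * w, -1.0 + j * w
            mean = pixel_integral(a, a + w, c, c + w)
            residuals.append((counts[i][j] - mean) / math.sqrt(mean))
    return residuals

def voronoi_integrals(points, m=100):
    """Approximate the integral of lam over each Voronoi cell, clipped to the window.

    Each midpoint of an m x m grid is assigned to its nearest observed point.
    Assumes at least one observed point.
    """
    h = 2.0 / m
    totals = [0.0] * len(points)
    for i in range(m):
        gx = -1.0 + (i + 0.5) * h
        for j in range(m):
            gy = -1.0 + (j + 0.5) * h
            nearest = min(range(len(points)),
                          key=lambda n: (points[n][0] - gx) ** 2 + (points[n][1] - gy) ** 2)
            totals[nearest] += lam(gx, gy) * h * h
    return totals

def voronoi_residuals(points, m=100):
    """Voronoi residuals (1 - I_i) / sqrt(I_i), where I_i is the integral over cell i."""
    return [(1.0 - total) / math.sqrt(total) if total > 0 else float("nan")
            for total in voronoi_integrals(points, m)]
```

A typical use would be `pts = simulate(random.Random(0))` followed by histograms of `pearson_residuals(pts)` and `voronoi_residuals(pts)` over many repetitions. The grid approximation is crude: a very small cell may contain no grid midpoint, in which case its residual is reported as NaN, and a finer grid (larger `m`) or an exact clipped tessellation would be needed for serious work.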
5 A seismological application.

ZZQ3: There are many options here. We can use two models in the Collaborative Study of Earthquake Predictability (CSEP) project. I will get this data.

Acknowledgements

This material is based upon work supported by the National Science Foundation under Grant No. zzq. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
References

Baddeley, A., Turner, R., Moller, J., and Hazelton, M. (2005). Residual analysis for spatial point processes (with discussion). Journal of the Royal Statistical Society, Series B, 67(5).
Baddeley, A., Moller, J., and Pakes, A.G. (2008). Properties of residuals for spatial point processes. Annals of the Institute of Statistical Mathematics, 60.
Daley, D., and Vere-Jones, D. (1988). An Introduction to the Theory of Point Processes. Springer, New York.
More informationApplication of branching point process models to the study of invasive red banana plants in Costa Rica
Application of branching point process models to the study of invasive red banana plants in Costa Rica Earvin Balderama Department of Statistics University of California Los Angeles, CA 90095 Frederic
More informationEmpirical Bayes Moderation of Asymptotically Linear Parameters
Empirical Bayes Moderation of Asymptotically Linear Parameters Nima Hejazi Division of Biostatistics University of California, Berkeley stat.berkeley.edu/~nhejazi nimahejazi.org twitter/@nshejazi github/nhejazi
More informationReview of Multiple Regression
Ronald H. Heck 1 Let s begin with a little review of multiple regression this week. Linear models [e.g., correlation, t-tests, analysis of variance (ANOVA), multiple regression, path analysis, multivariate
More informationStatistics Primer. ORC Staff: Jayme Palka Peter Boedeker Marcus Fagan Trey Dejong
Statistics Primer ORC Staff: Jayme Palka Peter Boedeker Marcus Fagan Trey Dejong 1 Quick Overview of Statistics 2 Descriptive vs. Inferential Statistics Descriptive Statistics: summarize and describe data
More informationGeneralized linear models
Generalized linear models Outline for today What is a generalized linear model Linear predictors and link functions Example: estimate a proportion Analysis of deviance Example: fit dose- response data
More informationThe University of Hong Kong Department of Statistics and Actuarial Science STAT2802 Statistical Models Tutorial Solutions Solutions to Problems 71-80
The University of Hong Kong Department of Statistics and Actuarial Science STAT2802 Statistical Models Tutorial Solutions Solutions to Problems 71-80 71. Decide in each case whether the hypothesis is simple
More informationPaper: ST-161. Techniques for Evidence-Based Decision Making Using SAS Ian Stockwell, The Hilltop UMBC, Baltimore, MD
Paper: ST-161 Techniques for Evidence-Based Decision Making Using SAS Ian Stockwell, The Hilltop Institute @ UMBC, Baltimore, MD ABSTRACT SAS has many tools that can be used for data analysis. From Freqs
More informationConfidence Intervals and Hypothesis Tests
Confidence Intervals and Hypothesis Tests STA 281 Fall 2011 1 Background The central limit theorem provides a very powerful tool for determining the distribution of sample means for large sample sizes.
More informationHarvard University. Rigorous Research in Engineering Education
Statistical Inference Kari Lock Harvard University Department of Statistics Rigorous Research in Engineering Education 12/3/09 Statistical Inference You have a sample and want to use the data collected
More informationEE290H F05. Spanos. Lecture 5: Comparison of Treatments and ANOVA
1 Design of Experiments in Semiconductor Manufacturing Comparison of Treatments which recipe works the best? Simple Factorial Experiments to explore impact of few variables Fractional Factorial Experiments
More informationChapter 16. Simple Linear Regression and Correlation
Chapter 16 Simple Linear Regression and Correlation 16.1 Regression Analysis Our problem objective is to analyze the relationship between interval variables; regression analysis is the first tool we will
More informationTwo-sample Categorical data: Testing
Two-sample Categorical data: Testing Patrick Breheny October 29 Patrick Breheny Biostatistical Methods I (BIOS 5710) 1/22 Lister s experiment Introduction In the 1860s, Joseph Lister conducted a landmark
More informationA Spatio-Temporal Point Process Model for Firemen Demand in Twente
University of Twente A Spatio-Temporal Point Process Model for Firemen Demand in Twente Bachelor Thesis Author: Mike Wendels Supervisor: prof. dr. M.N.M. van Lieshout Stochastic Operations Research Applied
More informationDescriptive Statistics-I. Dr Mahmoud Alhussami
Descriptive Statistics-I Dr Mahmoud Alhussami Biostatistics What is the biostatistics? A branch of applied math. that deals with collecting, organizing and interpreting data using well-defined procedures.
More informationSequential Procedure for Testing Hypothesis about Mean of Latent Gaussian Process
Applied Mathematical Sciences, Vol. 4, 2010, no. 62, 3083-3093 Sequential Procedure for Testing Hypothesis about Mean of Latent Gaussian Process Julia Bondarenko Helmut-Schmidt University Hamburg University
More informationRecall the Basics of Hypothesis Testing
Recall the Basics of Hypothesis Testing The level of significance α, (size of test) is defined as the probability of X falling in w (rejecting H 0 ) when H 0 is true: P(X w H 0 ) = α. H 0 TRUE H 1 TRUE
More informationAdvanced Regression Topics: Violation of Assumptions
Advanced Regression Topics: Violation of Assumptions Lecture 7 February 15, 2005 Applied Regression Analysis Lecture #7-2/15/2005 Slide 1 of 36 Today s Lecture Today s Lecture rapping Up Revisiting residuals.
More informationArea1 Scaled Score (NAPLEX) .535 ** **.000 N. Sig. (2-tailed)
Institutional Assessment Report Texas Southern University College of Pharmacy and Health Sciences "An Analysis of 2013 NAPLEX, P4-Comp. Exams and P3 courses The following analysis illustrates relationships
More informationRegression Analysis: Exploring relationships between variables. Stat 251
Regression Analysis: Exploring relationships between variables Stat 251 Introduction Objective of regression analysis is to explore the relationship between two (or more) variables so that information
More informationEarthquake predictability measurement: information score and error diagram
Earthquake predictability measurement: information score and error diagram Yan Y. Kagan Department of Earth and Space Sciences University of California, Los Angeles, California, USA August, 00 Abstract
More informationFRANKLIN UNIVERSITY PROFICIENCY EXAM (FUPE) STUDY GUIDE
FRANKLIN UNIVERSITY PROFICIENCY EXAM (FUPE) STUDY GUIDE Course Title: Probability and Statistics (MATH 80) Recommended Textbook(s): Number & Type of Questions: Probability and Statistics for Engineers
More informationProbabilistic approach to earthquake prediction
ANNALS OF GEOPHYSICS, VOL. 45, N. 6, December 2002 Probabilistic approach to earthquake prediction Rodolfo Console, Daniela Pantosti and Giuliana D Addezio Istituto Nazionale di Geofisica e Vulcanologia,
More informationTwo-sample inference: Continuous data
Two-sample inference: Continuous data Patrick Breheny April 6 Patrick Breheny University of Iowa to Biostatistics (BIOS 4120) 1 / 36 Our next several lectures will deal with two-sample inference for continuous
More informationA Space-Time Conditional Intensity Model for. Evaluating a Wildfire Hazard Index
A Space-Time Conditional Intensity Model for Evaluating a Wildfire Hazard Index Roger D. Peng Frederic Paik Schoenberg James Woods Author s footnote: Roger D. Peng (rpeng@stat.ucla.edu) is a Graduate Student,
More informationGibbs point processes : modelling and inference
Gibbs point processes : modelling and inference J.-F. Coeurjolly (Grenoble University) et J.-M Billiot, D. Dereudre, R. Drouilhet, F. Lavancier 03/09/2010 J.-F. Coeurjolly () Gibbs models 03/09/2010 1
More informationUC Berkeley Math 10B, Spring 2015: Midterm 2 Prof. Sturmfels, April 9, SOLUTIONS
UC Berkeley Math 10B, Spring 2015: Midterm 2 Prof. Sturmfels, April 9, SOLUTIONS 1. (5 points) You are a pollster for the 2016 presidential elections. You ask 0 random people whether they would vote for
More informationStatistics - Lecture One. Outline. Charlotte Wickham 1. Basic ideas about estimation
Statistics - Lecture One Charlotte Wickham wickham@stat.berkeley.edu http://www.stat.berkeley.edu/~wickham/ Outline 1. Basic ideas about estimation 2. Method of Moments 3. Maximum Likelihood 4. Confidence
More informationReview of Statistics 101
Review of Statistics 101 We review some important themes from the course 1. Introduction Statistics- Set of methods for collecting/analyzing data (the art and science of learning from data). Provides methods
More information