X. Modeling Stratophenetic Series

GEOS 33001/EVOL October 2007

1 Differences between evolutionary and stratophenetic series

1.1 Sequence in thickness vs. time

1.2 Completeness of sampling

Problem with statistical power: As directional pattern sampled with decreasing completeness, ability to reject symmetric random walk diminishes.

2 Principal approaches

2.1 Symmetric random walk as null hypothesis

See topic 4.

[Figure: random walk with p = q. Position $S_n$ is plotted against time steps (n), together with the expectation $E(S_n)$ and the envelopes $E(S_n) \pm \sqrt{4pqn}$ and $E(S_n) \pm 2\sqrt{4pqn}$.]
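The envelope used for the symmetric random walk null can be sketched numerically. The example below is a minimal illustration, assuming steps of +1 with probability p and -1 with probability q = 1 - p, so that $E(S_n) = n(p - q)$ and $Var(S_n) = 4pqn$. The function names, the number of simulated walks, and the seed are illustrative choices, not part of the notes.

```python
import numpy as np

def random_walk_envelope(n_steps, p=0.5, k=2.0):
    """Return E(S_n) and the E(S_n) -/+ k*sqrt(4pqn) envelope for n = 0..n_steps.

    Assumes unit steps: +1 with probability p, -1 with probability q = 1 - p.
    """
    q = 1.0 - p
    n = np.arange(n_steps + 1)
    mean = n * (p - q)                # E(S_n)
    sd = np.sqrt(4.0 * p * q * n)     # sqrt(Var(S_n)) = sqrt(4pqn)
    return mean, mean - k * sd, mean + k * sd

def simulate_walks(n_walks, n_steps, p=0.5, seed=0):
    """Simulate n_walks independent walks, each starting at S_0 = 0."""
    rng = np.random.default_rng(seed)
    steps = np.where(rng.random((n_walks, n_steps)) < p, 1, -1)
    start = np.zeros((n_walks, 1), dtype=int)
    return np.concatenate([start, np.cumsum(steps, axis=1)], axis=1)

if __name__ == "__main__":
    mean, lower, upper = random_walk_envelope(100)
    paths = simulate_walks(2000, 100)
    final = paths[:, -1]
    inside = np.mean((final >= lower[-1]) & (final <= upper[-1]))
    print(f"fraction of walks inside the 2-sd envelope at n=100: {inside:.3f}")
```

A series whose net change stays inside the two-standard-deviation envelope cannot be distinguished from a symmetric random walk by this criterion, which is why sparse sampling of a directional pattern erodes statistical power.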

2.2 Generalized random walk

Maximum likelihood estimates of mean step size and variance in step size (Hunt 2006).

Excerpt from Hunt (2006, p. 580):

FIGURE 1. Three example step distributions (top) used to generate corresponding evolutionary sequences of 100 steps (bottom). When the mean of the step distribution is zero, increases and decreases are equally likely and the overall dynamics are nondirectional (A, C). Step distribution B has a positive mean, and therefore will tend to produce positively trended evolutionary sequences. With increasing step variance, evolutionary sequences are more volatile, with larger positive and negative excursions (compare C with A).

Generalized Random Walk (Hunt 2006)

[...] depends only on $\mu_{step}$ (see below), it is a natural measure of directionality.

Before proceeding, it is necessary to clarify some terminology. What I refer to here as a general random walk includes the whole class of models that are characterized by evolutionary transitions that (1) are independent from each other and (2) are homogenous over time; i.e., they are drawn from the same step distribution through the interval of interest. When unqualified in the paleontological literature, the term random walk generally implies an unbiased random walk, and some would restrict the term only to this subset of models. Throughout this paper, I use general random walk to refer to all models that meet the above two criteria and unbiased random walk to denote the special case of nondirectional random walks ($\mu_{step} = 0$).

It is important to note that modeling phyletic evolution as a random walk does not imply any particular evolutionary process; many different microevolutionary scenarios produce evolutionary sequences that can be described as random walks (Hansen and Martins 1996; Roopnarine et al. 1999).
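The maximum likelihood estimates named under "Generalized random walk" can be illustrated with a simplified sketch. It assumes that each observed change between successive samples is normally distributed with mean $\mu_{step} \Delta t$ and variance $\sigma^2_{step} \Delta t$, and it ignores sampling error in the sample means, which a full treatment would need to include. The function name `fit_general_random_walk` and the example data are hypothetical.

```python
import numpy as np

def fit_general_random_walk(x, t):
    """Closed-form ML estimates of mean step and step variance.

    Simplifying assumption: increments dx_i ~ Normal(mu * dt_i, sigma2 * dt_i),
    with no sampling error in the trait means x.
    """
    x = np.asarray(x, dtype=float)
    t = np.asarray(t, dtype=float)
    dx = np.diff(x)
    dt = np.diff(t)
    if np.any(dt <= 0):
        raise ValueError("sample ages must be strictly increasing")
    mu = dx.sum() / dt.sum()                      # ML estimate of mean step
    sigma2 = np.mean((dx - mu * dt) ** 2 / dt)    # ML estimate of step variance
    return mu, sigma2

def log_likelihood(x, t, mu, sigma2):
    """Log-likelihood of the increments under the same simplified model."""
    dx = np.diff(np.asarray(x, dtype=float))
    dt = np.diff(np.asarray(t, dtype=float))
    var = sigma2 * dt
    return float(np.sum(-0.5 * np.log(2 * np.pi * var) - (dx - mu * dt) ** 2 / (2 * var)))

if __name__ == "__main__":
    x = [0.0, 1.0, 3.0, 4.0, 6.0]   # hypothetical trait means
    t = [0.0, 1.0, 2.0, 3.0, 4.0]   # hypothetical sample ages
    mu, sigma2 = fit_general_random_walk(x, t)
    print(mu, sigma2, log_likelihood(x, t, mu, sigma2))
```

Comparing the log-likelihood at the fitted mean step with the log-likelihood when the mean step is fixed at zero gives one way to ask whether a directional model is supported over an unbiased walk.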
Although phenotypic transitions are modeled as random draws from a step distribution, this should not be understood to imply that the evolutionary changes themselves are unrelat- [excerpt truncated]

2.3 Log-rate vs. log-interval comparisons (Gingerich 2001; Gingerich and Clyde 1994)

[Figure: rate (slope) versus interval length for a simple sequence; morphology plotted against time.]

Ideal log(rate)-log(interval) comparisons. For each mode, one panel plots morphology against time and the other plots log evolutionary rate against log interval length:

Mode          Panels   Slope of log rate on log interval
Directional   A, B     0
Static        C, D     -1
Random        E, F     -1/2
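The ideal slopes can be checked on simulated series. The sketch below assumes that a rate is the absolute change in morphology divided by the elapsed time, computed for every pair of samples, and that the slope comes from an ordinary least-squares fit of log rate on log interval. The published analyses may use different rate units and fitting choices, and the function name is hypothetical.

```python
import numpy as np

def log_rate_interval_slope(x, t):
    """Slope of log10(rate) on log10(interval) over all pairs of samples."""
    x = np.asarray(x, dtype=float)
    t = np.asarray(t, dtype=float)
    i, j = np.triu_indices(len(x), k=1)      # all pairs with i < j
    interval = np.abs(t[j] - t[i])
    change = np.abs(x[j] - x[i])
    keep = (interval > 0) & (change > 0)     # logs need positive values
    log_interval = np.log10(interval[keep])
    log_rate = np.log10(change[keep] / interval[keep])
    slope, _intercept = np.polyfit(log_interval, log_rate, 1)
    return slope

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    t = np.arange(200.0)
    directional = 0.1 * t                        # constant rate of change
    static = rng.normal(0.0, 1.0, size=t.size)   # fluctuation about a fixed mean
    random_walk = np.cumsum(rng.normal(0.0, 1.0, size=t.size))
    for name, series in [("directional", directional),
                         ("static", static),
                         ("random", random_walk)]:
        print(name, round(log_rate_interval_slope(series, t), 2))
```

A single random-walk realization can give a slope some distance from -1/2, so the ideal values are best read as expectations rather than sharp thresholds.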

2.4 Comparing within- and between-lineage changes (Charlesworth 1984; Cheetham 1986)

High ratio of between- to within-lineage variance as operational test for punctuation.

2.5 Inversion of stratophenetic series via model of stratigraphic variation (Hannisdal 2006, 2007)

Assume, or empirically constrain, model of basin filling, depth, grain size

Assume or empirically constrain habitat preferences (preferred depth and grain size, tolerance about preferred value)

Forward model: Assume pattern of phenotypic evolution, predict stratophenetic series

Inverse model: Observe stratophenetic series, including morphology and abundance, and infer evolutionary pattern

Excerpt from Hannisdal (2006):

Figure 1. Outline of model components and their input/output relations. The basin fill model (SedFlux) takes a series of input files (defining basin dimensions, sediment properties, and process parameters) and outputs various seafloor properties, including water depth d and sediment grain size g, as well as depositional characteristics (e.g., sedimentation rate q). These output variables are used to drive models of (1) abundance, predicting population size N (the sum of population sizes per spatial bin n) based on a species' habitat preferences; (2) evolution, predicting population changes in phenotype $\phi$ in response to selection and drift; and (3) preservation, predicting the number of preserved fossils K.

[...] substrate properties, and sedimentation rate along an onshore-offshore profile, according to user-defined sediment input, process parameters, and sea level change; a model of abundance predicting the distribution of individuals according to the species' habitat preferences and peak abundance; a [...]

$$f(d, g) = \frac{1}{2\pi\sigma_d\sigma_g} \exp\left(-\left[\frac{(d - \bar{d})^2}{2\sigma_d^2} + \frac{(g - \bar{g})^2}{2\sigma_g^2}\right]\right). \quad (1)$$

The parameters $\bar{d}$, $\bar{g}$, $\sigma_d^2$, and $\sigma_g^2$ are thus used to control the environmental sensitivity and abundance distribution of a simulated benthic species. Figure 3A shows an example of a density defined by equation (1) for a particular set of parameter values. The number of individuals n in each spatial bin x is found by scaling the peak of the density function (fig. 3A) to the maximal per-bin abundance $N_{max}$:

$$n_x = N_{max}\left[\frac{f(d_x, g_x)}{f(\bar{d}, \bar{g})}\right]. \quad (2)$$

When applied to the basin fill history from SedFlux, the above model allows a simulated benthic species to track its preferred habitat in space and time. Although the underlying probability density is a symmetric Gaussian, the realized abundance distributions along the seafloor can look drastically different at different times as a result of variability in substrate properties and in the bathymetric profile in response to sea level changes (fig. 3B-3F). This matches the real- [excerpt truncated]

Figure 3. Model of habitat preference and abundance. A, Gaussian density representing the probability of occurrence of a benthic species, here controlled by two environmental variables: water depth ($\bar{d}$ = 100 m, $\sigma_d$ = 80 m) and grain size ($\bar{g}$ = 300 μm, $\sigma_g$ = 200 μm). The peak of the density function is scaled to the maximum per-bin abundance. B-F, Realized abundance distributions across the basin profile at various times throughout a model run, using the basin fill history from figure 2 and the Gaussian model in A. Each bar represents the mean abundance of 200 spatial bins (10 km) along the seafloor. Because of variations in the bathymetric gradient and sediment grain size distribution in response to sea level changes, the abundance distributions look different at different times, ranging from relatively symmetric (B) to highly skewed (E), with a mode that is either variable in both magnitude and position or polymodal (C). Even for a species with broad depth tolerance, the actual area occupied at any given time can be a limited portion of the basin (D).
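Equations (1) and (2) can be sketched in a few lines. The example assumes that equation (1) is a bivariate Gaussian with independent depth and grain-size terms and that equation (2) rescales it so that its peak equals the maximum per-bin abundance. The function names and the value of `n_max` are hypothetical; the preferred depth and grain size and their tolerances are the ones quoted in the Figure 3 caption.

```python
import numpy as np

def habitat_density(d, g, d_pref, g_pref, s_d, s_g):
    """Gaussian habitat-preference density over depth d and grain size g."""
    z = (d - d_pref) ** 2 / (2.0 * s_d ** 2) + (g - g_pref) ** 2 / (2.0 * s_g ** 2)
    return np.exp(-z) / (2.0 * np.pi * s_d * s_g)

def abundance_per_bin(d_x, g_x, d_pref, g_pref, s_d, s_g, n_max):
    """Scale the density so that its peak equals the maximum per-bin abundance."""
    peak = habitat_density(d_pref, g_pref, d_pref, g_pref, s_d, s_g)
    return n_max * habitat_density(d_x, g_x, d_pref, g_pref, s_d, s_g) / peak

if __name__ == "__main__":
    # Hypothetical seafloor profile: depth in m and grain size in micrometers per bin.
    depth = np.linspace(0.0, 400.0, 9)
    grain = np.linspace(600.0, 50.0, 9)
    n_x = abundance_per_bin(depth, grain, d_pref=100.0, g_pref=300.0,
                            s_d=80.0, s_g=200.0, n_max=1000.0)
    print(np.round(n_x, 1))
```

Because each bin's abundance depends on the depth and grain size that bin happens to have at a given time, the same symmetric preference function can yield skewed or polymodal abundance profiles along the seafloor.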

Figure 4. Model of phenotypic evolution based on the Price equation. Each panel shows the change over time (1000 steps) in the mean phenotypic value (solid line, starting at 0) and the phenotypic variance (dotted line, starting at 1) of the population. The top row of panels depicts different scenarios based on the values of the two regression parameters, from left to right: no selection ($\beta_{w,\phi} = 0$, $\beta_{w,(\phi-\bar{\phi})^2} = 0$), directional selection ($\beta_{w,\phi} > 0$, $\beta_{w,(\phi-\bar{\phi})^2} = 0$), directional selection with stabilizing selection ($\beta_{w,\phi} > 0$, $\beta_{w,(\phi-\bar{\phi})^2} < 0$), and directional selection with disruptive selection ($\beta_{w,\phi} > 0$, $\beta_{w,(\phi-\bar{\phi})^2} > 0$). The two middle rows are for the same scenarios but with decreasing population size, resulting in stronger drift. The bottom row shows the effect of introducing a population bottleneck (reducing the population 10-fold over a period of time indicated by dashed lines), which can generate a rapid shift in the mean value and a spike in and/or depletion of variance.

[...] test operates on the sequence of positive and negative increments in a series, comparing the number of runs (consecutive steps with equal sign) in a series with that expected from a random walk. The scaled maximum test and the Hurst exponent com- [...] expected from a random walk (see The Runs Test, The Scaled Maximum Test, and The Hurst Exponent in the appendix for details). The tests are applied to the stratophenetic series from each location as well as to the microevolutionary [...]
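The idea of comparing the observed number of runs with the number expected under a random walk can be illustrated with the standard Wald-Wolfowitz runs test and its normal approximation. The appendix cited in the excerpt is not reproduced here, so the exact formulation used there may differ; the function name and the example series are hypothetical.

```python
import math
import numpy as np

def runs_test(series):
    """Wald-Wolfowitz runs test on the signs of successive increments.

    Returns (observed runs, expected runs, z score, two-sided p-value).
    Zero increments are dropped before counting.
    """
    increments = np.diff(np.asarray(series, dtype=float))
    signs = np.sign(increments)
    signs = signs[signs != 0]
    n_pos = int(np.sum(signs > 0))
    n_neg = int(np.sum(signs < 0))
    n = n_pos + n_neg
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need both positive and negative increments")
    runs = 1 + int(np.sum(signs[1:] != signs[:-1]))
    expected = 2.0 * n_pos * n_neg / n + 1.0
    variance = 2.0 * n_pos * n_neg * (2.0 * n_pos * n_neg - n) / (n ** 2 * (n - 1.0))
    if variance <= 0:
        raise ValueError("series too short for the normal approximation")
    z = (runs - expected) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return runs, expected, z, p_value

if __name__ == "__main__":
    series = [0, 1, 2, 1, 0, 1, 2, 1, 0]   # hypothetical sample means
    print(runs_test(series))
```

Fewer runs than expected (a negative z score) indicates persistence in the sign of change, while more runs than expected indicates frequent reversals.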

Figure 6. Numerical experiment showing the effect of depositional architecture. A, Phenotypic evolution as random drift, based on parameters $\beta_{w,\phi} = 0$, $\beta_{w,(\phi-\bar{\phi})^2} = 0$, and N from C. Solid line is for phenotypic mean; dotted line is for variance. B, Gaussian density based on preferred depth = 120 m, depth tolerance = 100 m, preferred grain size = 400 μm, and grain size tolerance = 100 μm. C, Population size through time based on the model in B. Solid line is for total population size; stippled line is for effective population size, which in this case is held constant and small. D, Poisson density function used to determine preservation based on $\lambda$ as a function of the abundance in each model bin. E, Stratophenetic series plotted against synthetic basin fill from figure 2. Sample means (red) and error bars (cyan) are scaled to make the stratophenetic patterns easily visible and have to be projected back onto the blue [caption truncated]

Excerpt from Hannisdal (2007, p. 109):

FIGURE 6. 2-D marginal posterior distributions. Axes correspond to parameter ranges, except for pres (F, ordinate), where the parameter range is truncated as in Figure 5. Solid lines represent 0.1 (inner), 0.5, and 0.95 (outer) confidence contours. Shading is scaled to the peak of each plot to enhance shapes, so shading levels are not comparable between plots. True model indicated by [symbol missing].

[...] preserved individuals in a particular sample, an increase in the peak abundance parameter will require a corresponding decrease in the preservation probability parameter, and we would therefore expect the parameters Nmax and prob to be negatively correlated in the posterior. This relationship is also suggested by the slope of their ensemble distribution (Fig. 4E). Note that these are not empirical correlations and do not suggest that more abundant organisms have lower per-capita pres- [...]

[...] model correlation matrix, calculated by standardizing each element by the posterior standard errors (Fig. 7). This matrix shows the expected negative posterior correlation between maximum abundance and preservation probability. As mentioned above, the product of these two parameters is treated as a single preservation parameter (pres), and this transformed parameter is consequently positively correlated with the original parameters. Another distinct feature is the positive correla- [excerpt truncated]

FIGURE 9. Ensemble-based inferred pattern of phenotypic evolution. Plots A-D correspond to the temporal pattern of population mean shape change along the first four PC axes (cf. Fig. 3). Solid line represents the true pattern (Fig. 1H), dotted line represents the ensemble mean trajectory, and dashed lines are an approximate error envelope representing uncertainty in both age (abscissa) and phenotype (ordinate) calculated as two times the standard error in the ensemble of predicted trajectories.


More information

Chapter 4. Displaying and Summarizing. Quantitative Data

Chapter 4. Displaying and Summarizing. Quantitative Data STAT 141 Introduction to Statistics Chapter 4 Displaying and Summarizing Quantitative Data Bin Zou (bzou@ualberta.ca) STAT 141 University of Alberta Winter 2015 1 / 31 4.1 Histograms 1 We divide the range

More information

Bayesian inference of impurity transport coefficient profiles

Bayesian inference of impurity transport coefficient profiles Bayesian inference of impurity transport coefficient profiles M.A. Chilenski,, M. Greenwald, Y. Marzouk, J.E. Rice, A.E. White Systems and Technology Research MIT Plasma Science and Fusion Center/Alcator

More information

Practice Problems Section Problems

Practice Problems Section Problems Practice Problems Section 4-4-3 4-4 4-5 4-6 4-7 4-8 4-10 Supplemental Problems 4-1 to 4-9 4-13, 14, 15, 17, 19, 0 4-3, 34, 36, 38 4-47, 49, 5, 54, 55 4-59, 60, 63 4-66, 68, 69, 70, 74 4-79, 81, 84 4-85,
