
V. Introduction to homogeneous sampling models

A. Basic framework

1. $r$ is the per-capita rate of sampling per lineage-million-years.
2. Sampling means the joint incidence of preservation, exposure, collection, identification, etc.
3. Generally consider the relevance only of the extinction rate $q$ (tacitly assume $p = q$, where $p$ is the origination rate, and that truncation effects are negligible).
4. Assume all rates are temporally and taxonomically homogeneous (for now).
5. It will often be useful to deal with dimensionless rates and dimensionless time. Let $q = 1$, express $r$ in multiples of $q$, and express time in multiples of $1/q$ (the expected mean duration).

B. Bulk properties (see Foote, 1997, Paleobiology 23:278-300)

1. For a lineage with duration $T$, model sampling as a Poisson process with parameter $rT$. Thus:

$$\Pr(\text{lineage never sampled}) = e^{-rT}$$
$$\Pr(\text{lineage sampled at least once}) = 1 - e^{-rT}$$
$$\Pr(\text{lineage sampled exactly once}) = rT\,e^{-rT}$$
$$\Pr(\text{observed range} = t\ [t > 0]) = r^2 (T - t)\,e^{-r(T - t)}$$

NB: If the sampling rate varies, substitute $e^{-\int_0^T r(x)\,dx}$ for $e^{-rT}$.

2. These probabilities are then integrated over the entire exponential distribution of durations. For example, the probability that a randomly chosen lineage will never be sampled is equal to:

$$\int_0^\infty q e^{-qT} e^{-rT}\,dT = q/(r + q).$$

$$\Pr(\text{sampled at least once}) = \int_0^\infty q e^{-qT}\left(1 - e^{-rT}\right)dT = r/(r + q).$$

This is the expected proportion of lineages sampled, which has also been referred to as paleontological completeness.

$$\Pr(\text{sampled exactly once}) = \int_0^\infty q e^{-qT}\, rT\, e^{-rT}\,dT = qr/(r + q)^2.$$

These are the singletons.

$$\Pr(\text{observed range} = t\ [t > 0]) = \int_t^\infty q e^{-qT}\, r^2 (T - t)\, e^{-r(T - t)}\,dT = q r^2 e^{-qt}/(r + q)^2.$$

See Appendix 3 of Foote (1997, cited above).

NB: By the number of times sampled, we mean the number of distinguishable stratigraphic horizons at which a lineage is sampled, not the actual number of fossils or localities. Thus, a singleton may be sampled numerous times, but if these are all, within the limits of resolution, at a single time horizon, this counts as being sampled once.

3. If we want to focus on the distribution of stratigraphic ranges of lineages that are actually sampled, then we normalize all the probabilities by $r/(r + q)$. Thus:

$$\Pr(\text{singleton}) = q/(r + q)$$
$$\Pr(\text{range} = t\ [t > 0]) = q r e^{-qt}/(r + q)$$

NB: The non-singleton part of this distribution is exponential with parameter $q$. Thus, even though sampling is incomplete and ranges are therefore truncated, if we ignore singletons we expect the shape of the distribution to reflect the true extinction rate accurately.
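The bulk properties above can be checked numerically. The following Python sketch evaluates the closed-form expressions and compares the expected proportion sampled, $r/(r+q)$, with a small Monte Carlo simulation. The function names, the default number of simulated lineages, and the random seed are illustrative assumptions, not part of the original notes.

```python
import math
import random


def bulk_probabilities(q, r):
    """Closed-form bulk properties for extinction rate q and sampling rate r."""
    total = r + q
    return {
        "never": q / total,                    # never sampled
        "at_least_once": r / total,            # paleontological completeness
        "exactly_once": q * r / total ** 2,    # singletons, among all lineages
        "singleton_given_sampled": q / total,  # singletons, among sampled lineages
    }


def range_density_given_sampled(t, q, r):
    """Density of observed range t > 0 among lineages sampled at least once."""
    return q * r * math.exp(-q * t) / (r + q)


def simulate_fraction_sampled(q, r, n_lineages=100_000, seed=1):
    """Monte Carlo estimate of the proportion of lineages sampled at least once.

    Durations are exponential with rate q; given a duration T, a lineage is
    sampled at least once with probability 1 - exp(-r * T).
    """
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    sampled = 0
    for _ in range(n_lineages):
        duration = rng.expovariate(q)
        if rng.random() < 1.0 - math.exp(-r * duration):
            sampled += 1
    return sampled / n_lineages
```

With dimensionless rates ($q = 1$), calling `bulk_probabilities(1.0, r)` gives completeness as a function of $r$ expressed in multiples of $q$.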

Alroy's data on mammal species seem to bear out this expectation that non-singleton ranges are exponentially distributed with parameter $q$.

C. Bulk estimates of overall sampling rate per unit time

CAVEAT: Although often overlooked, it is important to bear in mind that most approaches estimate the rate of sampling for taxa of which we even have a record. If entire clades, environments, or geographic regions are unsampled, or nearly so, we can still end up with high numerical estimates of sampling rate even though overall sampling on average may be much poorer than suggested by these estimates.

1. Gap analysis (discrete time)

$R$ is the probability of sampling a lineage at least once in a time interval, assuming it ranges all the way through (thus $R = 1 - e^{-r\Delta t}$). Let $t_i$ be the observed range of taxon $i$, and let $n_i$ be the number of distinct intervals in which it occurs. Estimate $R$ as

$$\frac{\sum_{t_i > 2} (n_i - 2)}{\sum_{t_i > 2} (t_i - 2)},$$

where only taxa with $t_i > 2$ are considered. Intervals of first and last appearance are disregarded because they must be intervals in which the taxon is sampled; including them leads to an overestimate of $R$.

Aside: To estimate $R$ for a single interval of time, estimate $R$ as $N_{bt,\mathrm{sampled}}/N_{bt}$, where $N_{bt}$ is the number of lineages that range all the way through the interval and $N_{bt,\mathrm{sampled}}$ is the number of those that are also sampled at least once within it. Taxa making a first and/or last appearance within the interval are disregarded. (This is consistent with the treatment of bulk data.)

Three-timers and part-timers (Alroy): Let $N_{i-1,i,i+1}$ be the number of taxa sampled in intervals $i-1$, $i$, and $i+1$ (three-timers). Let $N_{i-1,\text{not } i,i+1}$ be the number of taxa sampled in intervals $i-1$ and $i+1$ but not in interval $i$ (part-timers). Then estimate $R$ as

$$\frac{N_{i-1,i,i+1}}{N_{i-1,i,i+1} + N_{i-1,\text{not } i,i+1}}.$$

This is similar to standard gap analysis, but is meant to reduce the effects of spuriously long-ranging taxa (taxonomic errors, garbage-can taxa, etc.).

2. Estimate from distribution of stratigraphic ranges in continuous time (with no information on internal occurrences between range endpoints)

Let $f(t)$ be the density function (relative probability of a stratigraphic range of $t$):

$$f(0) = q/(r + q)$$
$$f(t) = q r e^{-qt}/(r + q), \quad t > 0$$

Let $n_s$ be the number of singletons (those with range $= 0$). Let $m$ be the number of non-singletons, with observed ranges $t_1, \ldots, t_m$. Then the likelihood function is:

$$L(q, r) = \left[\prod_{i=1}^{m} f(t_i)\right] f(0)^{n_s} = \left[\prod_{i=1}^{m} \frac{q r e^{-q t_i}}{r + q}\right] \left(\frac{q}{r + q}\right)^{n_s}$$

and the corresponding support (log-likelihood) function is:

$$S(q, r) = \sum_{i=1}^{m} \left[\ln(q) + \ln(r) - q t_i - \ln(r + q)\right] + n_s \left[\ln(q) - \ln(r + q)\right]$$
$$= (m + n_s)\ln(q) + m\ln(r) - (m + n_s)\ln(r + q) - q \sum_i t_i$$

Since we are trying to solve for two parameters, we need to take the partial derivative with respect to each and set it equal to zero:

$$\frac{\partial S}{\partial q} = \frac{m + n_s}{q} - \frac{m + n_s}{r + q} - \sum_i t_i = 0 \;\Longrightarrow\; \frac{r}{q(r + q)} = \frac{\sum_i t_i}{m + n_s} \qquad \text{(A)}$$

$$\frac{\partial S}{\partial r} = \frac{m}{r} - \frac{m + n_s}{r + q} = 0 \;\Longrightarrow\; r = qm/n_s \qquad \text{(B)}$$

If we substitute the expression for $r$ in (B) into (A), we ultimately end up with:

$$\hat{q} = m \Big/ \sum_i t_i$$
$$\hat{r} = \hat{q}(1 - f)/f,$$

where the hat denotes the maximum-likelihood estimate and $f$ is the proportion of observed taxa that are singletons, i.e. $f = n_s/(n_s + m)$. Note that $\hat{q}$ is simply the inverse of the mean observed range of non-singleton taxa. Note also that $r$ and $q$ must be jointly estimated; the estimate of $r$ depends on a prior estimate of $q$. This is a rather common situation in parameter estimation.

We already saw that the expected probability of being sampled at least once is equal to $r/(r + q)$. Therefore, once we have the estimates $\hat{r}$ and $\hat{q}$, we would estimate the proportion of taxa sampled at least once as $\hat{r}/(\hat{r} + \hat{q})$.

3. Estimate from distribution of stratigraphic ranges in continuous time (including information on internal occurrences between range endpoints)

Intuitively, it seems reasonable that we should obtain estimates of $q$ and $r$ with less uncertainty if we can incorporate additional data on the occurrences within ranges. This is what is done by Solow and Smith (1997, Paleobiology 23).
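The closed-form estimates from item 2 (ranges only, no internal occurrences) are simple enough to compute directly. The sketch below is a minimal illustration; the function name and the convention of encoding singletons as a range of exactly 0 are assumptions made here for the example.

```python
def estimate_q_r_from_ranges(ranges):
    """Maximum-likelihood estimates of q and r from observed ranges.

    `ranges` holds one observed stratigraphic range per sampled taxon, in
    continuous time; a value of 0 marks a singleton (assumed encoding).
    Returns (q_hat, r_hat, completeness_hat).
    """
    if any(t < 0 for t in ranges):
        raise ValueError("ranges must be non-negative")
    nonsingleton = [t for t in ranges if t > 0]
    m = len(nonsingleton)        # number of non-singletons
    n_s = len(ranges) - m        # number of singletons
    if m == 0 or n_s == 0:
        raise ValueError("need at least one singleton and one non-singleton")
    q_hat = m / sum(nonsingleton)         # inverse of mean non-singleton range
    f = n_s / (n_s + m)                   # proportion of singletons
    r_hat = q_hat * (1.0 - f) / f
    completeness_hat = r_hat / (r_hat + q_hat)
    return q_hat, r_hat, completeness_hat
```

For example, two singletons plus ranges of 2 and 4 time units give $\hat{q} = 2/6$, $f = 1/2$, and therefore $\hat{r} = \hat{q}$. Note that the estimated completeness $\hat{r}/(\hat{r} + \hat{q})$ reduces algebraically to $1 - f$.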

Let there be $N$ taxa, each known from $n_i$ distinct stratigraphic horizons (for example, $n_i = 1$ would denote a singleton), and let $d_i$ be the observed stratigraphic range for each. Then:

$$\hat{r} = \left[\sum_i (n_i - 1)\right]^2 \Big/ \left[\left(\sum_i d_i\right)\left(\sum_i n_i\right)\right]$$
$$\hat{q} = N\hat{r} \Big/ \sum_i (n_i - 1)$$

(See Solow and Smith for the derivation of this.)

4. Estimate from distribution of stratigraphic ranges in discrete time (with no information on internal occurrences between range endpoints) (see Foote and Raup, 1996, Paleobiology 22:121-140)

Let $q$ be the per-capita rate of extinction per discrete time interval and let $R$ be the sampling probability per interval, as defined above. Because we are using a discrete-time model, we tacitly assume that a taxon present in an interval is present throughout the interval (see Figure 1 of Foote and Raup, reproduced below). Let $T$ denote the true duration and $t$ the observed stratigraphic range. Note that, in contrast to the continuous-time case, a singleton is a taxon with $t = 1$ (regardless of how many times it is sampled within the interval), and a taxon with $t = 0$ is one that is not sampled at all. Then, for a single taxon of duration $T$:

$$\Pr(t = 0) = (1 - R)^T$$
$$\Pr(t \geq 1) = 1 - (1 - R)^T$$
$$\Pr(t = 1) = RT(1 - R)^{T - 1}$$
$$\Pr(t\ [t > 1]) = (T - t + 1)R^2 (1 - R)^{T - t}$$

Let $P_D(T)$ be the probability that a duration is equal to $T$ (strictly speaking, that it falls between $T - 1$ and $T$), where time is tabulated in number of intervals. Thus, $P_D(T) = e^{-q(T - 1)} - e^{-qT}$.

We now sum over the entire probability distribution of durations, each time considering the probability distribution of possible observed ranges (see Appendix 1 of Foote and Raup, reproduced below). Finally, we normalize by the probability of being sampled at least once (Foote and Raup Eq. 2). We end up with two principal results:

1. The distribution of ranges, ignoring singletons, is exponential with parameter $q$. Thus we can estimate $q$ through regression of $\ln[f_t]$ against $t$ (where $f_t$ is the proportion of observed taxa with range equal to $t$), for all $t > 1$.

2. $R = f_2^2/(f_1 f_3)$. This ratio, the so-called FreqRat, provides an estimate of the sampling probability per taxon per time bin.

5. Problem with the foregoing estimates of $q$ and $R$: It turns out they are biased at finite sample sizes. This can be shown either by simulation or analytically, as follows:

Let $f_1$, $f_2$, and $f_3$ be the true probabilities that a taxon will have a range of 1, 2, or 3 intervals, and let $f_4 = 1 - f_1 - f_2 - f_3$ be the probability that the range is greater than 3 intervals. These probabilities come from section (4) above. Let $k_1$, $k_2$, and $k_3$ be the number of taxa with ranges of 1, 2, and 3 intervals. Let $k_4$ be the number of all the rest. Let $N$ be the total number of taxa ($N = k_1 + k_2 + k_3 + k_4$). For a given dataset, the FreqRat would be calculated as $k_2^2/(k_1 k_3)$.

The numbers $k_1, \ldots, k_4$ follow a multinomial distribution:

$$\Pr(k_1, k_2, k_3, k_4) = \frac{N!}{k_1!\,k_2!\,k_3!\,k_4!}\, f_1^{k_1} f_2^{k_2} f_3^{k_3} f_4^{k_4}$$

The expected value of the FreqRat is given by:

$$E\!\left(\frac{k_2^2}{k_1 k_3}\right) = \frac{\displaystyle\sum_{k_1=1}^{N}\ \sum_{k_2=1}^{N-k_1}\ \sum_{k_3=1}^{N-k_1-k_2} \frac{k_2^2}{k_1 k_3}\,\Pr(k_1, k_2, k_3, k_4)}{\displaystyle\sum_{k_1=1}^{N}\ \sum_{k_2=1}^{N-k_1}\ \sum_{k_3=1}^{N-k_1-k_2} \Pr(k_1, k_2, k_3, k_4)}$$

where the sum is taken only over those cases where $k_1$, $k_2$, and $k_3$ are positive, and the denominator normalizes by the sum of probabilities over these cases. Carrying out these tedious calculations, we find that the expected FreqRat is an overestimate of the true value of $R$ at finite sample sizes, and the bias is greater as $N$ is smaller.

A solution to the problem is to use maximum likelihood to jointly estimate (numerically) the parameters $q$ and $R$.

a. Let $f_t(q, R)$ be the probabilities predicted for a given pair $(q, R)$ and let $k_t$ be the observed frequencies. Then the likelihood is given as

$$L(q, R \mid k_t) = \prod_{t \geq 1} f_t(q, R)^{k_t}$$

and the support (log-likelihood) is

$$S(q, R \mid k_t) = \sum_{t \geq 1} k_t \ln[f_t(q, R)]$$

Please note that $k_t$ in the foregoing is the same as $d_x$ from our dynamic survivorship analysis!

b. There seems to be no analytical solution, so we explore values of $q$ and $R$ until we find the pair that maximizes the likelihood (or support). These are then our maximum-likelihood estimators $\hat{q}$ and $\hat{R}$.

D. Estimate of proportion of taxa sampled at least once

1. Given a prior estimate of $q$ and $r$, use the continuous-time equation for completeness given above. Given a prior estimate of $q$ and $R$, use the discrete-time equation (Eq. 2) from Appendix 1 of Foote and Raup 1996 (reproduced below).
2. Comparison between the number of known fossil taxa and estimated total progeny (e.g. Foote, 1996, Paleobiology 22, Table 1).
3. Proportion of living taxa with a fossil record (e.g. Raup 1979 [Bull. Carnegie Museum of Natural History 13:85-91]; Valentine 1989 [Paleobiology 15:83-94]; Foote and Sepkoski 1999 [Nature 398]).
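The numerical search described in section C.5 for the discrete-time model can be sketched in Python as follows. Rather than using the closed-form expressions from Foote and Raup's Appendix 1, this illustration builds $f_t(q, R)$ by summing the per-duration probabilities over $P_D(T)$ and normalizing by the probability of being sampled at least once. The function names, the truncation of the sum at `T_max` intervals, and the use of a plain grid search are all assumptions made for the example; any numerical optimizer could replace the grid.

```python
import math


def range_probabilities(q, R, t_max, T_max=400):
    """Return [f_1, ..., f_t_max] for the discrete-time model.

    f_t is the probability of an observed range of t intervals, given that
    the taxon is sampled at least once. The sum over true durations T is
    truncated at T_max (an assumption; adequate when exp(-q * T_max) is tiny).
    """
    unnorm = [0.0] * (t_max + 1)  # index t; index 0 is unused
    p_sampled = 0.0
    for T in range(1, T_max + 1):
        p_dur = math.exp(-q * (T - 1)) - math.exp(-q * T)  # P_D(T)
        p_sampled += p_dur * (1.0 - (1.0 - R) ** T)
        if t_max >= 1:
            unnorm[1] += p_dur * R * T * (1.0 - R) ** (T - 1)
        for t in range(2, min(T, t_max) + 1):
            unnorm[t] += p_dur * (T - t + 1) * R ** 2 * (1.0 - R) ** (T - t)
    return [u / p_sampled for u in unnorm[1:]]


def support(q, R, counts, T_max=400):
    """Log-likelihood S(q, R | k_t); counts[t - 1] is k_t for t = 1, 2, ..."""
    f = range_probabilities(q, R, len(counts), T_max)
    return sum(k * math.log(p) for k, p in zip(counts, f) if k > 0)


def grid_search_mle(counts, q_grid, R_grid):
    """Return the (q, R) pair on the grid that maximizes the support."""
    best = None
    for q in q_grid:
        for R in R_grid:
            s = support(q, R, counts)
            if best is None or s > best[0]:
                best = (s, q, R)
    return best[1], best[2]
```

As a consistency check, the probabilities returned by `range_probabilities` reproduce the two principal results of section C.4: $f_2^2/(f_1 f_3)$ equals $R$, and $\ln f_t$ declines by $q$ per interval for $t > 1$. In practice the grid would be refined around the best pair, or the grid search replaced by a continuous optimizer.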


More information

CHAPTER FIVE. Solutions for Section 5.1. Skill Refresher. Exercises

CHAPTER FIVE. Solutions for Section 5.1. Skill Refresher. Exercises CHAPTER FIVE 5.1 SOLUTIONS 265 Solutions for Section 5.1 Skill Refresher S1. Since 1,000,000 = 10 6, we have x = 6. S2. Since 0.01 = 10 2, we have t = 2. S3. Since e 3 = ( e 3) 1/2 = e 3/2, we have z =

More information

TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1

TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1 TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1 1.1 The Probability Model...1 1.2 Finite Discrete Models with Equally Likely Outcomes...5 1.2.1 Tree Diagrams...6 1.2.2 The Multiplication Principle...8

More information

Eruptions of the Old Faithful geyser

Eruptions of the Old Faithful geyser Eruptions of the Old Faithful geyser geyser2.ps Topics covered: Bimodal distribution. Data summary. Examination of univariate data. Identifying subgroups. Prediction. Key words: Boxplot. Histogram. Mean.

More information

Body Fossils Trace Fossils

Body Fossils Trace Fossils Evolution, Phanerozoic Life and Mass Extinctions Hilde Schwartz hschwartz@pmc.ucsc.edu Body Fossils Trace Fossils FOSSILIZED Living bone Calcium hydroxyapatite Ca 10 (PO4) 6 (OH OH,Cl,F,CO CO 3 ) 2 Fossil

More information

YOU CAN BACK SUBSTITUTE TO ANY OF THE PREVIOUS EQUATIONS

YOU CAN BACK SUBSTITUTE TO ANY OF THE PREVIOUS EQUATIONS The two methods we will use to solve systems are substitution and elimination. Substitution was covered in the last lesson and elimination is covered in this lesson. Method of Elimination: 1. multiply

More information

Bayesian Models in Machine Learning

Bayesian Models in Machine Learning Bayesian Models in Machine Learning Lukáš Burget Escuela de Ciencias Informáticas 2017 Buenos Aires, July 24-29 2017 Frequentist vs. Bayesian Frequentist point of view: Probability is the frequency of

More information

A General Overview of Parametric Estimation and Inference Techniques.

A General Overview of Parametric Estimation and Inference Techniques. A General Overview of Parametric Estimation and Inference Techniques. Moulinath Banerjee University of Michigan September 11, 2012 The object of statistical inference is to glean information about an underlying

More information

CS 361: Probability & Statistics

CS 361: Probability & Statistics October 17, 2017 CS 361: Probability & Statistics Inference Maximum likelihood: drawbacks A couple of things might trip up max likelihood estimation: 1) Finding the maximum of some functions can be quite

More information

Speciation and extinction in the fossil record of North American mammals

Speciation and extinction in the fossil record of North American mammals //FS2/CUP/3-PAGINATION/SPDY/2-PROOFS/3B2/9780521883184C16.3D 301 [301 323] 31.7.2008 12:09PM CHAPTER SIXTEEN Speciation and extinction in the fossil record of North American mammals john alroy Introduction

More information

Appendix from L. J. Revell, On the Analysis of Evolutionary Change along Single Branches in a Phylogeny

Appendix from L. J. Revell, On the Analysis of Evolutionary Change along Single Branches in a Phylogeny 008 by The University of Chicago. All rights reserved.doi: 10.1086/588078 Appendix from L. J. Revell, On the Analysis of Evolutionary Change along Single Branches in a Phylogeny (Am. Nat., vol. 17, no.

More information

Concepts and Methods in Molecular Divergence Time Estimation

Concepts and Methods in Molecular Divergence Time Estimation Concepts and Methods in Molecular Divergence Time Estimation 26 November 2012 Prashant P. Sharma American Museum of Natural History Overview 1. Why do we date trees? 2. The molecular clock 3. Local clocks

More information

Gaussians Linear Regression Bias-Variance Tradeoff

Gaussians Linear Regression Bias-Variance Tradeoff Readings listed in class website Gaussians Linear Regression Bias-Variance Tradeoff Machine Learning 10701/15781 Carlos Guestrin Carnegie Mellon University January 22 nd, 2007 Maximum Likelihood Estimation

More information

First and Second Order Differential Equations Lecture 4

First and Second Order Differential Equations Lecture 4 First and Second Order Differential Equations Lecture 4 Dibyajyoti Deb 4.1. Outline of Lecture The Existence and the Uniqueness Theorem Homogeneous Equations with Constant Coefficients 4.2. The Existence

More information

Normal distribution We have a random sample from N(m, υ). The sample mean is Ȳ and the corrected sum of squares is S yy. After some simplification,

Normal distribution We have a random sample from N(m, υ). The sample mean is Ȳ and the corrected sum of squares is S yy. After some simplification, Likelihood Let P (D H) be the probability an experiment produces data D, given hypothesis H. Usually H is regarded as fixed and D variable. Before the experiment, the data D are unknown, and the probability

More information

DIVERSITY ESTIMATES, BIASES, AND HISTORIOGRAPHIC EFFECTS: RESOLVING CETACEAN DIVERSITY IN THE TERTIARY. Mark D. Uhen and Nicholas D.

DIVERSITY ESTIMATES, BIASES, AND HISTORIOGRAPHIC EFFECTS: RESOLVING CETACEAN DIVERSITY IN THE TERTIARY. Mark D. Uhen and Nicholas D. Palaeontologia Electronica http://palaeo-electronica.org DIVERSITY ESTIMATES, BIASES, AND HISTORIOGRAPHIC EFFECTS: RESOLVING CETACEAN DIVERSITY IN THE TERTIARY Mark D. Uhen and Nicholas D. Pyenson ABSTRACT

More information

Manuscript to be reviewed

Manuscript to be reviewed How has our knowledge of dinosaur diversity through geologic time changed through research history? Jonathan P Tennant Corresp., 1, Alfio A. Chiarenza 1 1 Department of Earth Science and Engineering, Imperial

More information

The geological completeness of paleontological sampling in North America

The geological completeness of paleontological sampling in North America Paleobiology, 36(1), 2010, pp. 61 79 The geological completeness of paleontological sampling in North America Shanan E. Peters and Noel A. Heim Abstract. A growing body of work has quantitatively linked

More information

Lecture 32. Lidar Error and Sensitivity Analysis

Lecture 32. Lidar Error and Sensitivity Analysis Lecture 3. Lidar Error and Sensitivity Analysis Introduction Accuracy in lidar measurements Precision in lidar measurements Error analysis for Na Doppler lidar Sensitivity analysis Summary 1 Errors vs.

More information

Bayesian tsunami fragility modeling considering input data uncertainty

Bayesian tsunami fragility modeling considering input data uncertainty Bayesian tsunami fragility modeling considering input data uncertainty Raffaele De Risi Research Associate University of Bristol United Kingdom De Risi, R., Goda, K., Mori, N., & Yasuda, T. (2016). Bayesian

More information

Two hours. To be supplied by the Examinations Office: Mathematical Formula Tables THE UNIVERSITY OF MANCHESTER. 21 June :45 11:45

Two hours. To be supplied by the Examinations Office: Mathematical Formula Tables THE UNIVERSITY OF MANCHESTER. 21 June :45 11:45 Two hours MATH20802 To be supplied by the Examinations Office: Mathematical Formula Tables THE UNIVERSITY OF MANCHESTER STATISTICAL METHODS 21 June 2010 9:45 11:45 Answer any FOUR of the questions. University-approved

More information

Study of the arrival directions of ultra-high-energy cosmic rays detected by the Pierre Auger Observatory

Study of the arrival directions of ultra-high-energy cosmic rays detected by the Pierre Auger Observatory Study of the arrival directions of ultra-high-energy cosmic rays detected by the Pierre Auger Observatory Piera L. Ghia*, for the Pierre Auger Collaboration * IFSI/INAF Torino, Italy, & IPN/CNRS Orsay,

More information

Inferring from data. Theory of estimators

Inferring from data. Theory of estimators Inferring from data Theory of estimators 1 Estimators Estimator is any function of the data e(x) used to provide an estimate ( a measurement ) of an unknown parameter. Because estimators are functions

More information

Probabilistic modeling. The slides are closely adapted from Subhransu Maji s slides

Probabilistic modeling. The slides are closely adapted from Subhransu Maji s slides Probabilistic modeling The slides are closely adapted from Subhransu Maji s slides Overview So far the models and algorithms you have learned about are relatively disconnected Probabilistic modeling framework

More information

How to make projections of daily-scale temperature variability in the future: a cross-validation study using the ENSEMBLES regional climate models

How to make projections of daily-scale temperature variability in the future: a cross-validation study using the ENSEMBLES regional climate models How to make projections of daily-scale temperature variability in the future: a cross-validation study using the ENSEMBLES regional climate models Jouni Räisänen Department of Physics, University of Helsinki

More information

Manual for SOA Exam MLC.

Manual for SOA Exam MLC. Chapter 10. Poisson processes. Section 10.5. Nonhomogenous Poisson processes Extract from: Arcones Fall 2009 Edition, available at http://www.actexmadriver.com/ 1/14 Nonhomogenous Poisson processes Definition

More information

Package paleotree. February 23, 2013

Package paleotree. February 23, 2013 Package paleotree February 23, 2013 Type Package Version 1.8 Title Paleontological and Phylogenetic Analyses of Evolution Date 02-22-13 Author David Bapst Depends R (>= 2.15.0), ape Imports phangorn, geiger

More information

Cox s proportional hazards model and Cox s partial likelihood

Cox s proportional hazards model and Cox s partial likelihood Cox s proportional hazards model and Cox s partial likelihood Rasmus Waagepetersen October 12, 2018 1 / 27 Non-parametric vs. parametric Suppose we want to estimate unknown function, e.g. survival function.

More information

Linear Regression. Machine Learning CSE546 Kevin Jamieson University of Washington. Oct 5, Kevin Jamieson 1

Linear Regression. Machine Learning CSE546 Kevin Jamieson University of Washington. Oct 5, Kevin Jamieson 1 Linear Regression Machine Learning CSE546 Kevin Jamieson University of Washington Oct 5, 2017 1 The regression problem Given past sales data on zillow.com, predict: y = House sale price from x = {# sq.

More information

Econometric Analysis of Cross Section and Panel Data

Econometric Analysis of Cross Section and Panel Data Econometric Analysis of Cross Section and Panel Data Jeffrey M. Wooldridge / The MIT Press Cambridge, Massachusetts London, England Contents Preface Acknowledgments xvii xxiii I INTRODUCTION AND BACKGROUND

More information

THE QUALITY OF THE FOSSIL RECORD: Implications for Evolutionary Analyses

THE QUALITY OF THE FOSSIL RECORD: Implications for Evolutionary Analyses Annu. Rev. Ecol. Syst. 2002. 33:561 88 doi: 10.1146/annurev.ecolsys.33.030602.152151 Copyright c 2002 by Annual Reviews. All rights reserved THE QUALITY OF THE FOSSIL RECORD: Implications for Evolutionary

More information

Statistical Data Analysis Stat 3: p-values, parameter estimation

Statistical Data Analysis Stat 3: p-values, parameter estimation Statistical Data Analysis Stat 3: p-values, parameter estimation London Postgraduate Lectures on Particle Physics; University of London MSci course PH4515 Glen Cowan Physics Department Royal Holloway,

More information

Package FossilSim. R topics documented: November 16, Type Package Title Simulation of Fossil and Taxonomy Data Version 2.1.0

Package FossilSim. R topics documented: November 16, Type Package Title Simulation of Fossil and Taxonomy Data Version 2.1.0 Type Package Title Simulation of Fossil and Taxonomy Data Version 2.1.0 Package FossilSim November 16, 2018 Simulating taxonomy and fossil data on phylogenetic s under mechanistic models of speciation,

More information

Parametric Techniques Lecture 3

Parametric Techniques Lecture 3 Parametric Techniques Lecture 3 Jason Corso SUNY at Buffalo 22 January 2009 J. Corso (SUNY at Buffalo) Parametric Techniques Lecture 3 22 January 2009 1 / 39 Introduction In Lecture 2, we learned how to

More information

MACHINE LEARNING INTRODUCTION: STRING CLASSIFICATION

MACHINE LEARNING INTRODUCTION: STRING CLASSIFICATION MACHINE LEARNING INTRODUCTION: STRING CLASSIFICATION THOMAS MAILUND Machine learning means different things to different people, and there is no general agreed upon core set of algorithms that must be

More information

GSA DATA REPOSITORY

GSA DATA REPOSITORY GSA DATA REPOSITORY 2012046 Lloyd et al. Additional Details of Methodology The Database As it was not tractable to enter data from the world s oceans as a whole we limited ourselves to the North Atlantic,

More information

IFRS9 PD modeling Presentation MATFYZ Konstantin Belyaev

IFRS9 PD modeling Presentation MATFYZ Konstantin Belyaev IFRS9 PD modeling Presentation MATFYZ 03.01.2018 Konstantin Belyaev IFRS9 International Financial Reporting Standard In July 2014, the International Accounting Standard Board (IASB) issued the final version

More information

Effects of sampling standardization on estimates of Phanerozoic marine diversification

Effects of sampling standardization on estimates of Phanerozoic marine diversification Effects of sampling standardization on estimates of Phanerozoic marine diversification J. Alroy a,b, C. R. Marshall c, R. K. Bambach d, K. Bezusko e, M. Foote f,f.t.fürsich g, T. A. Hansen h, S. M. Holland

More information

Math53: Ordinary Differential Equations Autumn 2004

Math53: Ordinary Differential Equations Autumn 2004 Math53: Ordinary Differential Equations Autumn 2004 Unit 2 Summary Second- and Higher-Order Ordinary Differential Equations Extremely Important: Euler s formula Very Important: finding solutions to linear

More information