Error analysis in biology

Similar documents
1 Measurement Uncertainties

Error analysis in biology

1 Measurement Uncertainties

Uncertainty, Error, and Precision in Quantitative Measurements an Introduction 4.4 cm Experimental error

The Half-Life of a Bouncing Ball

Error Analysis in Experimental Physical Science Mini-Version

Introduction to Uncertainty and Treatment of Data

Measurements and Data Analysis

Experimental Uncertainty (Error) and Data Analysis

Physics 1140 Fall 2013 Introduction to Experimental Physics

Table 2.1 presents examples and explains how the proper results should be written. Table 2.1: Writing Your Results When Adding or Subtracting

Collecting and Reporting Data

The SuperBall Lab. Objective. Instructions

A Scientific Model for Free Fall.

Error analysis for the physical sciences A course reader for phys 1140 Scott Pinegar and Markus Raschke Department of Physics, University of Colorado

Please bring the task to your first physics lesson and hand it to the teacher.

Introduction to Statistics and Data Analysis

ISP 207L Supplementary Information

Fundamentals of data, graphical, and error analysis

Uncertainty and Graphical Analysis

Physics 2020 Laboratory Manual

Computer simulation of radioactive decay

Error analysis in biology

ACCESS TO SCIENCE, ENGINEERING AND AGRICULTURE: MATHEMATICS 1 MATH00030 SEMESTER /2018

BRIDGE CIRCUITS EXPERIMENT 5: DC AND AC BRIDGE CIRCUITS 10/2/13

#29: Logarithm review May 16, 2009

Experiment 0 ~ Introduction to Statistics and Excel Tutorial. Introduction to Statistics, Error and Measurement

Lab 1: Measurement. PART 1: Exponential Notation: Powers of 10

This term refers to the physical quantity that is the result of the measurement activity.

Experimental Uncertainty (Error) and Data Analysis

1 Using standard errors when comparing estimated values

PHY 123 Lab 1 - Error and Uncertainty and the Simple Pendulum

Background to Statistics

Conservation of Momentum

Radioactivity: Experimental Uncertainty

Introduction to Statistics, Error and Measurement

PROBLEMS FOR EXPERIMENT ES: ESTIMATING A SECOND Solutions

MATH 10 INTRODUCTORY STATISTICS

Data and Error Analysis

Radiological Control Technician Training Fundamental Academic Training Study Guide Phase I

Introduction to Measurement Physics 114 Eyres

!t + U " #Q. Solving Physical Problems: Pitfalls, Guidelines, and Advice

Methods and Tools of Physics

Chem 115 POGIL Worksheet - Week 1 Units, Measurement Uncertainty, and Significant Figures - Solutions

CS 5630/6630 Scientific Visualization. Elementary Plotting Techniques II

EXPERIMENT 2 Reaction Time Objectives Theory

MATH EVALUATION. What will you learn in this Lab?

Biostatistics and Design of Experiments Prof. Mukesh Doble Department of Biotechnology Indian Institute of Technology, Madras

Some hints for the Radioactive Decay lab

Statistics for Data Analysis. Niklaus Berger. PSI Practical Course Physics Institute, University of Heidelberg

Boyle s Law and Charles Law Activity

A (Mostly) Correctly Formatted Sample Lab Report. Brett A. McGuire Lab Partner: Microsoft Windows Section AB2

Inference with Simple Regression

Algebra Exam. Solutions and Grading Guide

Error Analysis. To become familiar with some principles of error analysis for use in later laboratories.

5. Is it OK to change the spacing of the tick marks on your axes as you go across the page? a. Yes b. No - that screws up the analysis of the data.

Assume that you have made n different measurements of a quantity x. Usually the results of these measurements will vary; call them x 1

2018 Arizona State University Page 1 of 16

Introduction to Computer Tools and Uncertainties

3. What is the decimal place of the least significant figure (LSF) in the number 0.152? a. tenths place b. hundredths place c.

PHYS 275 Experiment 2 Of Dice and Distributions

CHAPTER 1: Preliminary Description of Errors Experiment Methodology and Errors To introduce the concept of error analysis, let s take a real world

A Study Guide for. Students PREPARING FOR GRADE. Nova Scotia Examinations in Mathematics

experiment3 Introduction to Data Analysis

Introduction to Physics Physics 114 Eyres

Math Refresher Answer Sheet (NOTE: Only this answer sheet and the following graph will be evaluated)

BRIEF SURVEY OF UNCERTAINITY IN PHYSICS LABS

Experiment 2. Reaction Time. Make a series of measurements of your reaction time. Use statistics to analyze your reaction time.

PYP Mathematics Continuum

USING THE EXCEL CHART WIZARD TO CREATE CURVE FITS (DATA ANALYSIS).

JUST THE MATHS UNIT NUMBER 1.6. ALGEBRA 6 (Formulae and algebraic equations) A.J.Hobson

BRIEF SURVEY OF UNCERTAINITY IN PHYSICS LABS

Algebra 2 Honors. Logs Test Review

Take the measurement of a person's height as an example. Assuming that her height has been determined to be 5' 8", how accurate is our result?

Geology Geomath Computer Lab Quadratics and Settling Velocities

Newton s Cooling Model in Matlab and the Cooling Project!

DIRECTED NUMBERS ADDING AND SUBTRACTING DIRECTED NUMBERS

Intuitive Biostatistics: Choosing a statistical test

Representations of Motion in One Dimension: Speeding up and slowing down with constant acceleration

Intermediate Algebra Chapter 12 Review

Final Marking Guidelines

Course Project. Physics I with Lab

Objectives. Assessment. Assessment 5/14/14. Convert quantities from one unit to another using appropriate conversion factors.

Practical Statistics

Experiment 1 - Mass, Volume and Graphing

Experiment 1 Simple Measurements and Error Estimation

Lab 6. RC Circuits. Switch R 5 V. ower upply. Voltmete. Capacitor. Goals. Introduction

Copyright, Nick E. Nolfi MPM1D9 Unit 6 Statistics (Data Analysis) STA-1

PHY131H1F Class 3. From Knight Chapter 1:

Page Points Score Total: 100

Chapter 2. Mean and Standard Deviation

Poisson distribution and χ 2 (Chap 11-12)

Introduction to Chemistry

Decimal Scientific Decimal Scientific

Using Microsoft Excel

Grade 8 Please show all work. Do not use a calculator! Please refer to reference section and examples.

2 One-dimensional motion with constant acceleration

Statistics lecture 3. Bell-Shaped Curves and Other Shapes

Sequences and the Binomial Theorem

Measurement Error PHYS Introduction

Transcription:

Error analysis in biology Marek Gierliński Division of Computational Biology Hand-outs available at http://is.gd/statlec Errors, like straws, upon the surface flow; He who would search for pearls must dive below John Dryden (1631-1700)

Previously on Errors Confidence interval on a correlation Fisher s transformation (Gaussian) Z = 1 1 + r ln 2 1 r 1 σ = n 3 Confidence interval on a proportion From binomial distribution we have standard error of a proportion SE Φ = p (1 p ) n Confidence interval on counts accurate approximation C L = C Z C + Z2 1 3 C U = C + Z C + 1 + Z2 + 2 3 Bootstrapping when everything else fails 2

5. Error bars Errors using inadequate data are much less than those using no data at all Charles Babbage

A good plot Figure 6-1. Exponential decay of a protein in a simulated experiment. Error bars represent propagated standard errors from individual peptides. The curve shows the best-fitting exponential decay model, y t = Ae t/τ, with A = 1.04 ± 0.05 and τ = 16 ± 3 h (95% confidence intervals). 5

3 rules for making good plots 1. Clarity of presentation 2. Clarity of presentation 3. Clarity of presentation 6

Lines and symbols Clarity! Symbols shall be easy to distinguish It is OK to join data points with lines for guidance 7

Labels! 8

Logarithmic plots Clarity! Use logarithmic axes to show data spanning many orders of magnitude 9

How to plot error bars Value axis cap M ± SD M ± SE [M L, M U ] +upper error M lower error error bar 1 SD 1 SE 1 reading error upper error point Confidence interval lower error The point represents statistical estimator (e.g. sample mean) best-fitting value direct measurement 10

How to plot error bars Clarity! Make sure error bars are visible 11

Types of errors Error bar What it represents When to use Standard deviation Scatter in the sample Comparing two or more samples, though box plots make a good alternative Standard error Confidence interval Error of the mean Confidence in the result Most commonly used error bar, though confidence intervals have better statistical intuition The best representation of uncertainty; can be used in almost any case Always state what type of uncertainty is represented by your error bars 12

Box plots Value axis outliers 95 th percentile 75 th percentile 50 th percentile 25 th percentile Central 50% of data Central 90% of data 5 th percentile 13

Box plots Box plots are a good alternative to standard deviation error bars They are non-parametric and show pure data 14

Bar plots Area of a bar is proportional to the value presented Summed area of several bars represents the total value Bar plots should only be used to present additive quantities: counts, fractions, proportions, probabilities, etc. Against a continuous variable data are integrated over the bar width Each bar is two-dimensional Bar width matters! 15

Bar plots start at zero Bar area represents its value Hence, baseline must be at zero If not, the plot is very misleading Don t do it! 16

Bar plots in logarithmic scale There is no zero in a logarithmic scale! Bar size depends on an arbitrary lower limit of the vertical axis Don t do it! 17

Bar plot problems Non-additive quantity Little variability 18

Bar plot problems Non-additive quantity Lower error bar invisible Crowded bars Confusing visual pattern Continuous variable 19

Multiple bar plots and a continuous variable bin size 20

Bar plots with error bars Useless information Lower error bar invisible Always show upper and lower error bars Other types of plots might be better Do you really need a plot? 21

Measured value Exercise: overlapping error bars Data set Each panel shows results from a pair of samples of the same size Mean and standard deviation are shown Which of these pairs of means are significantly different? 22

Rules of making good graphs 1. Always keep clarity of presentation in mind 2. You shall use axes with scales and labels 3. Use logarithmic scale to show data spanning over many orders of magnitude 4. All labels and numbers should be easy to read 5. Symbols shall be easy to distinguish 6. Add error bars were possible 7. Always state what type of uncertainty is represented by your error bars 8. Use model lines, where appropriate 9. It is OK to join data points with lines for guidance 10. You shall not use bar plots unless necessary 24

Bar plots: recommendations 1. Bar plots should only be used to present additive quantities: counts, proportions and probabilities 2. Each bar has to start at zero 3. Don t even think of making a bar plot in the logarithmic scale 4. Bar plots are not useful for presenting data with small variability 5. Multiple data bar plots are not suited for plots where the horizontal axis represents a continuous variable 6. Multiple data bar plots can be cluttered and unreadable 7. Make sure both upper and lower errors in a bar plot are clearly visible 25

William Playfair Born in Liff near Dundee Man of many careers (millwright, engineer, draftsman, accountant, inventor, silversmith, merchant, investment broker, economist, statistician, pamphleteer, translator, publicist, land speculator, blackmailer, swindler, convict, banker, editor and journalist) He invented line graph (1786) bar plot (1786) pie chart (1801) William Playfair (1759-1823) 26

William Playfair price (shillings) 80 Price of a quarter of wheat 60 40 weekly wage of a good mechanic 1600 1650 1700 1750 1800 year 20 0 Chart showing at one view the price of the quarter of wheat & wages of labour by the week (1821) 27

6. Quoting numbers and errors 46.345% of all statistics are made up Anonymous

What is used to quantify errors In a publication you typically quote: x = x best ± Δx best estimate error Error can be: Standard deviation Standard error of the mean Confidence interval Derived error Make sure you tell the reader what type of errors you use 29

Significant figures (digits) Significant figures (or digits) are those that carry meaningful information More s.f. more information The rest is meaningless junk! Example A microtubule has grown 4.1 µm in 2.6 minutes; what is the speed of growth of this microtubule? Quote only significant digits 4.1 μm 2.6 min = 1.576923077 μm min 1 There are only two significant figures (s.f.) in length and speed Therefore, only about two figures of the result are meaningful: 1.6 µm min -1 30

Which figures are significant? Non-zero figures are significant Number Significant figures Leading zeroes are not significant 34, 0.34 and 0.00034 carry the same amount of information Watch out for trailing zeroes before the decimal dot: not significant after the decimal dot: significant 365 3 1.893 4 4000 1 or 4 4 10 3 1 4.000 10 3 4 4000.00 6 0.00034 2 0.0003400 4 31

Rounding Remove non-significant figures by rounding Round the last s.f. according to the value of the next digit 0-4: round down (1.342 1.3) 5-9: round up (1.356 1.4) So, how many figures are significant? Suppose we have 2 s.f. in each number Raw number Quote 1234 1200 1287 1300 1.491123 1.5 1.449999 1.4 32

Error in the error To find how many s.f. are in a number, you need to look at its error Use sampling distribution of the standard error Error in the error is ΔSE = SE 2(n 1) This formula can be applied to SD and CI n = 12 SE = 23.77345 ΔSE = 5.1 Example True SE is somewhere between 18.7 and 28.8 (with 68% confidence) We can trust only one figure in the error Round SE to one s.f.: SE = 20 33

Error in the error n ΔSE SE s.f. to quote 10 0.24 1 100 0.07 2 1,000 0.02 2 10,000 0.007 3 100,000 0.002 3 An error quoted with 3 s.f. (2.567 0.165) implicitly states you have 10,000 replicates 34

Quote number and error Get a number and its error Find how many significant figures you have in the error (typically 1 or 2) Quote the number with the same decimal precision as error Number 1.23457456 Error 0.02377345 Reject insignificant figures and round the last s.f. Align at the decimal point Correct 1.23 ± 0.02 1.2 ± 0.5 6.0 ± 3.0 75000 ± 12000 3.5 ± 0.3 10 5 Incorrect 1.2 ± 0.02 1.23423 ± 0.5 6 ± 3.0 75156 ± 12223 3.5 ± 0.3 10 5 35

Error with no error Suppose you have a number without error (Go back to your lab and do more experiments) For example Centromeres are transported by microtubules at an average speed of 1.5 µm/min The new calibration method reduces error rates by 5% Transcription increases during the first 30 min Cells were incubated at 22 C There is an implicit error in the last significant figure All quoted figures are presumed significant 36

Intensity Avoid computer notation Example from a random paper off my shelf p-value = 5.51E-14 I d rather put it down as p-value = 6 10-14 30 20 10 0 Intensity ( 10 6 ) 37

Fixed decimal places Another example, sometimes seen in papers Numbers with fixed decimal places, copied from Excel Typically fractional errors are similar and we have the same number of s.f. raw data 1 decimal place Wrong Right 14524.2 1.4 10 4 2234.2 2200 122.2 120 12.6 13 2.2 2.2 0.1 0.12 0.0 0.022 Assume there are only 2 s.f. in these measurements 38

How to quote numbers (and errors) WHEN YOU KNOW ERROR First, calculate the error and estimate its uncertainty This will tell you how many significant figures of the error to quote Typically you quote 1-2 s.f. of the error Quote the number with the same precision as the error 1.23 ± 0.02 1.23423 ± 0.00005 (rather unlikely in biological experiments) 6 ± 3 75 ± 12 (3.2 ± 0.3) 10 5 Rounding numbers 0-4: down (6.64 6.6) 5-9: up (6.65 6.7) WHEN YOU DON T KNOW ERROR You still need to guesstimate your error! Quote only figures that are significant, e.g. p = 0.03, not p = 0.0327365 Use common sense! Try estimating order of magnitude of your uncertainty Example: measure distance between two spots in a microscope Get 416.23 nm from computer software Resolution of the microscope is 100 nm Quote 400 nm 39

Hand-outs available at http://is.gd/statlec Please leave your feedback forms on the table by the door