Introduction to Statistics and Data Analysis

Similar documents
Notes Errors and Noise PHYS 3600, Northeastern University, Don Heiman, 6/9/ Accuracy versus Precision. 2. Errors

Physics 509: Error Propagation, and the Meaning of Error Bars. Scott Oser Lecture #10

Introduction to Statistics and Error Analysis II

Error propagation. Alexander Khanov. October 4, PHYS6260: Experimental Methods is HEP Oklahoma State University

1 Measurement Uncertainties

Fourier and Stats / Astro Stats and Measurement : Stats Notes

Uncertainty and Graphical Analysis

Introduction to Error Analysis

Physics 403. Segev BenZvi. Propagation of Uncertainties. Department of Physics and Astronomy University of Rochester

The Half-Life of a Bouncing Ball

A quadratic expression is a mathematical expression that can be written in the form 2

Uncertainty and Bias UIUC, 403 Advanced Physics Laboratory, Fall 2014

Measurements of a Table

Statistics and Data Analysis

IB Physics STUDENT GUIDE 13 and Processing (DCP)

Pure Math 30: Explained! 81

Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series. Slide 1

Statistics for Data Analysis. Niklaus Berger. PSI Practical Course Physics Institute, University of Heidelberg

1 Measurement Uncertainties

EXPERIMENTAL DESIGN. u Science answers questions with experiments.

Section 6.5. Least Squares Problems

Measurement Uncertainties

An Introduction to Error Analysis

Chi Square Analysis M&M Statistics. Name Period Date

Approximate Linear Relationships

Error Analysis. To become familiar with some principles of error analysis for use in later laboratories.

How to Write a Laboratory Report

05 the development of a kinematics problem. February 07, Area under the curve

Statistics. Lent Term 2015 Prof. Mark Thomson. 2: The Gaussian Limit

Data collection and processing (DCP)

Prelab: Complete the prelab section BEFORE class Purpose:

Understanding Errors and Uncertainties in the Physics Laboratory

Statistics and data analyses

BRIDGE CIRCUITS EXPERIMENT 5: DC AND AC BRIDGE CIRCUITS 10/2/13

Physics 1050 Experiment 1. Introduction to Measurement and Uncertainty

Errors: What they are, and how to deal with them

Experiment 0 ~ Introduction to Statistics and Excel Tutorial. Introduction to Statistics, Error and Measurement

Experimental Uncertainty (Error) and Data Analysis

Experimental Uncertainty (Error) and Data Analysis

UNIT 1: NATURE OF SCIENCE

Part I. Experimental Error

Some hints for the Radioactive Decay lab

Error analysis in biology

Find the slope of the curve at the given point P and an equation of the tangent line at P. 1) y = x2 + 11x - 15, P(1, -3)

Introduction to Uncertainty and Treatment of Data

Physics 509: Propagating Systematic Uncertainties. Scott Oser Lecture #12

Data and Error Analysis

Statistical Methods in Particle Physics

Physics 403. Segev BenZvi. Parameter Estimation, Correlations, and Error Bars. Department of Physics and Astronomy University of Rochester

D. Correct! This is the correct answer. It is found by dy/dx = (dy/dt)/(dx/dt).

Objectives: Review open, closed, and mixed intervals, and begin discussion of graphing points in the xyplane. Interval notation

Fitting a Straight Line to Data

Measurement and Uncertainty

Statistics notes. A clear statistical framework formulates the logic of what we are doing and why. It allows us to make precise statements.

Lab 8: Magnetic Fields

Some Statistics. V. Lindberg. May 16, 2007

Confidence intervals

Experiment 4 Free Fall

Statistical Analysis of Chemical Data Chapter 4

Understanding Errors and Uncertainties in the Physics Laboratory

Physics 1140 Fall 2013 Introduction to Experimental Physics

Student s Printed Name: KEY_&_Grading Guidelines_CUID:

The SuperBall Lab. Objective. Instructions

A company recorded the commuting distance in miles and number of absences in days for a group of its employees over the course of a year.

General Least Squares Fitting

appstats27.notebook April 06, 2017

Speed of waves. Apparatus: Long spring, meter stick, spring scale, stopwatch (or cell phone stopwatch)

MATH 10 INTRODUCTORY STATISTICS

Purpose: This lab is an experiment to verify Malus Law for polarized light in both a two and three polarizer system.

Physics 509: Non-Parametric Statistics and Correlation Testing

1.4 Techniques of Integration

Uncertainty in Measurements

General Least Squares Fitting

Uncertainty, Error, and Precision in Quantitative Measurements an Introduction 4.4 cm Experimental error

Originality in the Arts and Sciences: Lecture 2: Probability and Statistics

Measurements and Data Analysis

Algebra & Trig Review

Data and Error analysis

Introduction to Statistics and Error Analysis

Error Analysis. V. Lorenz L. Yang, M. Grosse Perdekamp, D. Hertzog, R. Clegg PHYS403 Spring 2016

Chapter 26: Comparing Counts (Chi Square)

An introduction to plotting data

Experiment 2: Projectile motion and conservation of energy

CONTINUOUS RANDOM VARIABLES

Poisson distribution and χ 2 (Chap 11-12)

Systematic uncertainties in statistical data analysis for particle physics. DESY Seminar Hamburg, 31 March, 2009

ISP 207L Supplementary Information

Lesson 3: Understanding Addition of Integers

Hypothesis testing. Chapter Formulating a hypothesis. 7.2 Testing if the hypothesis agrees with data

Chapter 3 Math Toolkit

Correlation and Regression

Introduction to Statistics, Error and Measurement

Final Review Sheet. B = (1, 1 + 3x, 1 + x 2 ) then 2 + 3x + 6x 2

Numeracy across the curriculum

Lecture 10: Generalized likelihood ratio test

This does not cover everything on the final. Look at the posted practice problems for other topics.

Data Analysis, Standard Error, and Confidence Limits E80 Spring 2015 Notes

PHY 123 Lab 1 - Error and Uncertainty and the Simple Pendulum

IB Physics Internal Assessment Report Sample. How does the height from which a ball is dropped affect the time it takes to reach the ground?

Name: Lab Partner: Section: In this experiment error analysis and propagation will be explored.

Transcription:

Introduction to Statistics and Data Analysis RSI 2005 Staff July 15, 2005

Variation and Statistics Good experimental technique often requires repeated measurements of the same quantity These repeatedly measured values will usually not be identical, and they may not match a theory or what you expected. You must explain why: Statistical Error (σ) Systematic Error ( ) 2

Statistical Error (σ) Statistical error is due solely to random fluctuations in measuring process You can [and should!] carefully calculate σ You can [and should!] reduce σ by averaging over many independent, uncorrelated measurements Central Limit Theorem: As the number N of independent, uncorrelated measurements of the same random variable approaches, the average value of the sum is proportional to N while the standard deviation (σ) only grows as N 3

Calculating Statistical Error Discrete case [counting]: σ = N, so the total count, with uncertainty, is N ± N. If you re counting things in different bins, use this to find the error in each bin. Continuous case [e.g. measuring somebody s height]: make many repeated measurements, and use the following formula: σ = 1 N 1 N i=1 (x i m x ) 2 where x i is the measured value on the i th trial and. m x = x = 1 N N x i i=1 4

Systematic Error ( ) Systematic error describes deviations between data and theory that result from an incorrect theory, a faulty experimental apparatus, or incorrect data reduction. Should be eliminated if possible. Otherwise, it should be explained and estimated in your discussion of results. Thus, any reported value x should have the form x ± σ x ± x 5

Reporting Data: Numbers RULE #1: Any reported value must be reported along with its uncertainty both statistical and systematic. The statement We found π = 3.149 is worse than incorrect, as we cannot evaluate its correctness. Stating We found π = 3.149 ± 0.002 is at least incorrect, in that we know the deviation from truth is due to something systematically wrong with the experiment or analysis. RULE #2: Any reported value must be accompanied by units. With the exception of tables in which units are specified in the top row, every number you put down in your paper should be dimensioned. Saying We measured a lifetime of (4.23 ± 0.07) 10 5 is completely meaningless. 6

Precision, Accuracy, and Significance RULE #3: Only use the word precise when talking about the statistical error of a measured quantity. A precise measurement has low statistical error. RULE #4: Only use the word accurate when discussing the systematic error of a measurement or process. An accurate measurement returns a value that is very close to the true value. RULE #5: Only use the word significant when discussing results that can agree or disagree with a hypothesis given some error probability threshold. Always specify this threshold when you use the word significant. 7

Error Propagation Suppose you measure some quantity X and obtain a value of x ± σ x, but you want to know the value of Y = X 2. Clearly the average value of Y you would report is y = x 2, but what is σ y? σ 2 y = σ2 x [ dy dx ] 2 X=x For more variables, errors add in quadrature: If Y = f(u, V ), and if errors in U and V are uncorrelated, then [ Y ] 2 [ Y ] 2 σ 2 y = σ 2 u U U=u + σ 2 v V V =v 8

Error Propagation Examples Suppose Z = X ± Y. Then σ z = σ 2 x + σ 2 y. Suppose Z = XY or Z = X/Y. Then σ z = z (σx x ) 2 + ( σy y ) 2. Suppose Z = a bx. Then σ z = zbσ x ln a. 9

Raw Data is Often Uninteresting May depend on factors specific to your experiment Often involves uninteresting quantities [e.g. counts per bin, intermembrane glucose concentration] Want to convert the data into more interesting physical quantities that can be compared with theoretical models and other experiments. [known as data reduction] Necessary for both quantitative and qualitative measurements 10

Plotting Data Always label your axes with quantity being measured (e.g. height) and units (nm) You MUST include [usually vertical] error bars to show uncertainty in the dependent quantity being measured. This applies to bar graphs, scatter plots, and almost any other graph you will use. Remember that for counting measurements, this error is just ± N. If there is uncertainty in any independent quantity, you must indicate this with error bars as well [usually horizontal]. If the uncertainty in the dependent quantity is correlated to the uncertainty in the independent quantity, you should plot an error ellipse. 11

Fitting Curves to Data I Goal of is to find a function that can be used to describe your data Why bother fitting? It allows us to: extract interesting parameters from our data compare our data to a model or theoretical prediction interpolate or extrapolate to make predictions Don t ever just connect the dots with plotted data. Doing so would suggest the existence of data or theories that aren t there. 12

Fitting Curves to Data II To fit, first choose a functional form for the fit with designated parameters that can be varied. Vary parameters to globally minimize a given error function. [Do this with a computer and fitting software, not by hand!] The most common function used for error is the chi-squared (χ 2 ) metric. Other metrics are used to weigh errors in different ways. 13

Minimizing χ 2 The χ 2 metric is given by χ 2 = N i=1 (x i ˆx i ) 2 Here, x i are your measured data, ˆx i are the predictions of your function [these values change as you vary parameters], and σ i is the uncertainty in x i. σ 2 i Finding a global minimum in the χ 2 metric amounts to minimizing the sum of squared differences between your data and the model. You want χ 2 /DoF 1, where DoF is the degrees of freedom in your fit [number of data points - number of free parameters in model] 14

Example 15