Lecture 6: Chi Square Distribution (χ²) and Least Squares Fitting


Chi Square Distribution (χ²)

Suppose we have a set of n measurements {x_1, x_2, ..., x_n} and we know the true value of each x_i (x_t1, x_t2, ..., x_tn). We would like some way to measure how good these measurements really are. Obviously the closer the (x_1, x_2, ..., x_n)'s are to the (x_t1, x_t2, ..., x_tn)'s, the better (or more accurate) the measurements. But can we get more specific?

Assume:
- The measurements are independent of each other.
- The measurements come from a Gaussian distribution.
- Let (σ_1, σ_2, ..., σ_n) be the standard deviations associated with each measurement.

Consider the following two possible measures of the quality of the data:

    R = Σ_i (x_i − x_ti) = Σ_i d_i

    χ² = Σ_i (x_i − x_ti)²/σ_i² = Σ_i d_i²/σ_i²

Which of the above gives more information on the quality of the data? Both R and χ² are zero if the measurements agree exactly with the true values. R looks good because, via the Central Limit Theorem, the sum becomes Gaussian for large n. However, χ² is better! Because R sums the signed deviations, they can cancel: if, for example, d_2 = −d_1 and d_4 = −d_3, then R = 0 even though χ² ≠ 0 and none of the measurements match the true values.

[Figure: four deviations d_1, d_2 = −d_1, d_3, d_4 = −d_3 that cancel pairwise in R but all contribute to χ².]
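To make the contrast concrete, here is a minimal Python sketch (not part of the original notes; the data values are made up) computing R and χ² for a toy data set whose signed deviations cancel pairwise:

```python
import numpy as np

# Toy example: d2 = -d1 and d4 = -d3, so the signed deviations cancel.
x_true = np.array([10.0, 10.0, 10.0, 10.0])  # known true values
x_meas = np.array([10.5,  9.5, 11.0,  9.0])  # measurements
sigma  = np.array([ 0.5,  0.5,  0.5,  0.5])  # standard deviation of each point

d = x_meas - x_true
R = d.sum()                        # signed sum: the deviations cancel
chi2 = ((d / sigma) ** 2).sum()    # every deviation contributes

print(f"R = {R:.2f}")       # 0.00 -- looks perfect, but isn't
print(f"chi2 = {chi2:.2f}") # 10.00 -- flags the poor measurements
```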

One can show that the probability distribution for χ² is exactly:

    p(χ², n) = [1 / (2^(n/2) Γ(n/2))] (χ²)^(n/2 − 1) e^(−χ²/2),    0 ≤ χ² ≤ ∞

This is called the "Chi Square" (χ²) distribution. Γ is the Gamma Function:

    Γ(z) = ∫₀^∞ e^(−t) t^(z−1) dt,  z > 0
    Γ(n+1) = n!,  n = 1, 2, 3, ...
    Γ(1/2) = √π

This is a continuous probability distribution that is a function of two variables: χ² and the number of degrees of freedom (dof):

    n = # of data points − # of parameters calculated from the data points

Example: We collected N events in an experiment and histogram the data in n bins before performing a fit to the data points. We have n data points!

Example: We count cosmic ray events in 15 second intervals and sort the data into 5 bins:

    Number of counts in a 15 second interval:  0  1  2  3  4
    Number of intervals:                       2  7  6  3  2

We have a total of 36 cosmic rays in 20 intervals, but we have only 5 data points. Suppose we want to compare our data with the expectations of a Poisson distribution:

    N(m) = N₀ e^(−µ) µ^m / m!
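The pdf above is standard enough that a library implementation can stand in for the formula. As a quick sanity check (a sketch, assuming numpy and scipy are available), the explicit expression and scipy's chi2 pdf agree:

```python
import numpy as np
from scipy.special import gamma
from scipy.stats import chi2

def chi2_pdf(x, n):
    """The chi square pdf written out from the formula above."""
    return x ** (n / 2 - 1) * np.exp(-x / 2) / (2 ** (n / 2) * gamma(n / 2))

x = np.linspace(0.5, 20.0, 5)
n = 4
print(chi2_pdf(x, n))     # explicit formula
print(chi2.pdf(x, df=n))  # library version gives the same numbers
```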

Since we set N₀ = 20 in order to make the comparison, we lost one degree of freedom: n = 5 − 1 = 4. If we also calculate the mean of the Poisson from the data, we lose another degree of freedom: n = 5 − 2 = 3.

Example: We have 10 data points. Let µ and σ be the mean and standard deviation of the data.
- If we calculate µ and σ from the 10 data points, then n = 8.
- If we know µ and calculate σ, then n = 9.
- If we know σ and calculate µ, then n = 9.
- If we know µ and σ, then n = 10.

Like the Gaussian probability distribution, the probability integral cannot be done in closed form:

    P(χ² > a) = ∫ₐ^∞ p(χ², n) dχ² = ∫ₐ^∞ [1 / (2^(n/2) Γ(n/2))] (χ²)^(n/2 − 1) e^(−χ²/2) dχ²

We must use a table to find the probability of exceeding a certain χ² for a given number of dof.

[Figure: P(χ², n) plotted as a function of χ² for several values of n.]

For n ≥ 20, P(χ² > a) can be approximated using a Gaussian pdf, with y = (2χ²)^(1/2) − (2n − 1)^(1/2) treated as a Gaussian variable with mean 0 and standard deviation 1.
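In place of a table, the tail probability can be computed numerically. A small sketch (scipy assumed) comparing the exact tail with the Gaussian approximation quoted above, for a case with n ≥ 20:

```python
import numpy as np
from scipy.stats import chi2, norm

n, a = 30, 40.0                          # dof and chi square cutoff
exact = chi2.sf(a, df=n)                 # P(chi2 > a), the survival function
y = np.sqrt(2 * a) - np.sqrt(2 * n - 1)  # Gaussian approximation for n >= 20
approx = norm.sf(y)

print(f"exact  = {exact:.4f}")   # ~0.105
print(f"approx = {approx:.4f}")  # ~0.103, close to the exact value
```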

Example: What is the probability to have χ² > 10 if the number of degrees of freedom is n = 4?

Using Table D of Taylor we find P(χ² > 10, n = 4) = 0.04. We say that the probability of getting a χ² > 10 with 4 degrees of freedom by chance is 4%.

Some not so nice things about the χ² distribution:
- Given a set of data points, two different functions can have the same value of χ², so it does not produce a unique form of solution or function.
- It does not look at the order of the data points, so it ignores trends in the data points.
- It ignores the sign of the differences between the data points and the true values, using only the squares of the differences.

There are other distributions/statistical tests that do use the order of the points: run tests and the Kolmogorov test.
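The Table D value can be reproduced with the same survival function used above (assuming scipy):

```python
from scipy.stats import chi2

# Probability of chi2 > 10 with 4 degrees of freedom
print(f"{chi2.sf(10.0, df=4):.4f}")  # 0.0404, i.e. the ~4% from Table D
```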

Least Squares Fitting

Suppose we have n data points (x_i, y_i, σ_i). Assume that we know a functional relationship between the points,

    y = f(x, a, b, ...)

and that for each y_i we know x_i exactly. The parameters a, b, ... are constants that we wish to determine from our data points. A procedure to obtain a and b is to minimize the following χ² with respect to a and b:

    χ² = Σ_i [y_i − f(x_i, a, b)]² / σ_i²

This is very similar to the Maximum Likelihood Method; for the Gaussian case, MLM and LS are identical. Technically this is a χ² distribution only if the y's are from a Gaussian distribution. Since most of the time the y's are not Gaussian, we call the method "least squares" rather than χ².

Example: We have a function with one unknown parameter,

    f(x, b) = 1 + bx

Find b using the least squares technique. We need to minimize:

    χ² = Σ_i [y_i − f(x_i, b)]² / σ_i² = Σ_i [y_i − 1 − bx_i]² / σ_i²

To find the b that minimizes the above function, we set the derivative with respect to b to zero:

    ∂χ²/∂b = (∂/∂b) Σ_i [y_i − 1 − bx_i]² / σ_i² = −2 Σ_i [y_i − 1 − bx_i] x_i / σ_i² = 0

    Σ_i (y_i x_i − x_i) / σ_i² − b Σ_i x_i² / σ_i² = 0

Solving for b:

    b = [Σ_i (y_i x_i − x_i) / σ_i²] / [Σ_i x_i² / σ_i²]

Each measured data point (y_i) is allowed to have a different standard deviation (σ_i). The LS technique can be generalized to two or more parameters and to simple as well as complicated (e.g. non-linear) functions. One especially nice case is a polynomial function that is linear in the unknowns (a_i):

    f(x, a_1, ..., a_n) = a_1 + a_2 x + a_3 x² + ... + a_n x^(n−1)

We can always recast this problem in terms of solving n simultaneous linear equations: we use the techniques of linear algebra and invert an n × n matrix to find the a_i's!

Example: Given the following data, perform a least squares fit to find the value of b in f(x, b) = 1 + bx.

    x:  1.0  2.0  3.0  4.0
    y:  2.2  2.9  4.3  5.2
    σ:  0.2  0.4  0.3  0.1

Using the above expression for b we calculate: b = 1.05.
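As a numerical check (a sketch, not part of the original notes), the closed-form expression for b evaluated on this data set:

```python
import numpy as np

x     = np.array([1.0, 2.0, 3.0, 4.0])
y     = np.array([2.2, 2.9, 4.3, 5.2])
sigma = np.array([0.2, 0.4, 0.3, 0.1])

w = 1.0 / sigma**2  # per-point weights from the standard deviations
b = np.sum((y * x - x) * w) / np.sum(x**2 * w)
print(f"b = {b:.4f}")  # 1.0536, i.e. b = 1.05 to two decimal places
```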

A plot of the data points and the line from the least squares fit:

[Figure: the four data points with error bars and the fitted line y = 1 + 1.05x.]

If we assume that the data points are from a Gaussian distribution, we can calculate a χ² and the probability associated with the fit:

    χ² = Σ_i [y_i − 1 − 1.05x_i]² / σ_i²
       = (2.2 − 2.05)²/0.2² + (2.9 − 3.1)²/0.4² + (4.3 − 4.16)²/0.3² + (5.2 − 5.2)²/0.1²
       = 1.04

From Table D of Taylor: the probability to get χ² > 1.04 for 3 degrees of freedom is ≈ 80%. We call this a "good" fit since the probability is close to 100%. If instead the χ² had been large (e.g. 15), the probability would be small (≈ 0.2% for 3 dof), and we would call it a "bad" fit.

RULE OF THUMB: A "good" fit has χ²/dof ≈ 1.
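Putting the pieces together, a short sketch (scipy assumed; b taken from the fit above) that evaluates the fit χ² and its chance probability in place of the Table D lookup:

```python
import numpy as np
from scipy.stats import chi2

x     = np.array([1.0, 2.0, 3.0, 4.0])
y     = np.array([2.2, 2.9, 4.3, 5.2])
sigma = np.array([0.2, 0.4, 0.3, 0.1])
b     = 1.0536  # least squares result from the previous example

resid = y - (1.0 + b * x)             # residuals from the fitted line
chisq = np.sum((resid / sigma) ** 2)  # ~1.04
dof   = len(x) - 1                    # 4 data points, 1 fitted parameter
prob  = chi2.sf(chisq, df=dof)        # ~0.79: a "good" fit

print(f"chi2 = {chisq:.2f}, chi2/dof = {chisq/dof:.2f}, P = {prob:.2f}")
```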