Lecture 6: Chi Square Distribution ($\chi^2$) and Least Squares Fitting

Chi Square Distribution ($\chi^2$)

Suppose: We have a set of $n$ measurements $\{x_1, x_2, \ldots, x_n\}$, and we know the true value of each $x_i$ ($x_{t1}, x_{t2}, \ldots, x_{tn}$). We would like some way to measure how good these measurements really are. Obviously the closer the $(x_1, x_2, \ldots, x_n)$'s are to the $(x_{t1}, x_{t2}, \ldots, x_{tn})$'s, the better (or more accurate) the measurements. Can we get more specific?

Assume: The measurements are independent of each other, the measurements come from a Gaussian distribution, and $(\sigma_1, \sigma_2, \ldots, \sigma_n)$ are the standard deviations associated with each measurement.

Consider the following two possible measures of the quality of the data:

$$R = \sum_{i=1}^{n} (x_i - x_{ti}) = \sum_{i=1}^{n} d_i$$

$$\chi^2 = \sum_{i=1}^{n} \frac{(x_i - x_{ti})^2}{\sigma_i^2} = \sum_{i=1}^{n} \frac{d_i^2}{\sigma_i^2}$$

Which of the above gives more information on the quality of the data? Both $R$ and $\chi^2$ are zero if the measurements agree with the true values. $R$ looks good because, via the Central Limit Theorem, the sum becomes Gaussian as $n \to \infty$. However, $\chi^2$ is better! $R = 0$ does not imply $\chi^2 = 0$: deviations of opposite sign (e.g. $d_2 = -d_1$ and $d_4 = -d_3$) cancel in $R$ but not in $\chi^2$.

[Sketch: four deviations $d_1, \ldots, d_4$ about the true values, with $d_2 = -d_1$ and $d_4 = -d_3$.]
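The cancellation in the sketch can be checked numerically. This is a minimal illustration with hypothetical numbers (all $\sigma_i = 1$ for simplicity), showing $R = 0$ while $\chi^2 \ne 0$:

```python
# Hypothetical measurements whose deviations cancel pairwise:
# d = +2, -2, -1, +1, i.e. d2 = -d1 and d4 = -d3 as in the sketch.
x_true = [10.0, 10.0, 10.0, 10.0]   # assumed true values
x_meas = [12.0,  8.0,  9.0, 11.0]   # measurements
sigma  = [ 1.0,  1.0,  1.0,  1.0]   # one standard deviation per point

R = sum(x - xt for x, xt in zip(x_meas, x_true))
chi2 = sum(((x - xt) / s) ** 2 for x, xt, s in zip(x_meas, x_true, sigma))

print(R)     # 0.0 -- the signed deviations cancel
print(chi2)  # 10.0 -- the squared deviations do not
```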
One can show that the probability distribution for $\chi^2$ is exactly:

$$p(\chi^2, n) = \frac{1}{2^{n/2}\,\Gamma(n/2)}\,[\chi^2]^{n/2 - 1}\, e^{-\chi^2/2}, \qquad 0 \le \chi^2 \le \infty$$

This is called the "Chi Square" ($\chi^2$) distribution. $\Gamma$ is the Gamma Function:

$$\Gamma(x) = \int_0^\infty e^{-t}\, t^{x-1}\, dt, \quad x > 0 \qquad \Gamma(n+1) = n!, \ n = 1, 2, 3, \ldots \qquad \Gamma(\tfrac{1}{2}) = \sqrt{\pi}$$

This is a continuous probability distribution that is a function of two variables: $\chi^2$ and $n$, the number of degrees of freedom (dof):

n = (# of data points) − (# of parameters calculated from the data points)

Example: We collected $N$ events in an experiment. We histogram the data in $n$ bins before performing a fit to the data points. We have $n$ data points!

Example: We count cosmic ray events in 15 second intervals and sort the data into 5 bins:

Number of counts in a 15 second interval:  0   1   2   3   4
Number of intervals:                       2   7   6   3   2

We have a total of 36 cosmic rays in 20 intervals, but we have only 5 data points.

Suppose we want to compare our data with the expectations of a Poisson distribution (the predicted number of intervals with $m$ counts):

$$N = N_0\,\frac{e^{-\mu}\,\mu^m}{m!}$$
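The formula for $p(\chi^2, n)$ can be written directly using the standard-library gamma function. A minimal sketch (the normalization check is a crude Riemann sum, not part of the notes):

```python
import math

def chi2_pdf(x, n):
    """Chi-square pdf: x^(n/2 - 1) * exp(-x/2) / (2^(n/2) * Gamma(n/2)), x >= 0."""
    return x ** (n / 2 - 1) * math.exp(-x / 2) / (2 ** (n / 2) * math.gamma(n / 2))

# Sanity check: like any pdf it must integrate to 1 (crude Riemann sum, n = 4).
step = 0.001
total = sum(chi2_pdf(k * step, 4) * step for k in range(1, 50000))
print(round(total, 3))  # ~1.0
```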
Since we set $N_0 = 20$ in order to make the comparison, we lost one degree of freedom: $n = 5 - 1 = 4$. If we also calculate the mean of the Poisson from the data, we lose another degree of freedom: $n = 5 - 2 = 3$.

Example: We have 10 data points. Let $\mu$ and $\sigma$ be the mean and standard deviation of the data.
If we calculate $\mu$ and $\sigma$ from the 10 data points, then $n = 8$.
If we know $\mu$ and calculate $\sigma$, then $n = 9$.
If we know $\sigma$ and calculate $\mu$, then $n = 9$.
If we know $\mu$ and $\sigma$, then $n = 10$.

Like the Gaussian probability distribution, the probability integral cannot be done in closed form:

$$P(\chi^2 > a) = \int_a^\infty p(\chi^2, n)\, d\chi^2 = \int_a^\infty \frac{1}{2^{n/2}\,\Gamma(n/2)}\,[\chi^2]^{n/2-1}\, e^{-\chi^2/2}\, d\chi^2$$

We must use a table to find the probability of exceeding a certain $\chi^2$ for a given dof.

[Plot: $p(\chi^2, n)$ versus $\chi^2$ for several values of $n$.]

For $n \gtrsim 20$, $P(\chi^2 > a)$ can be approximated using a Gaussian pdf with

$$y = (2\chi^2)^{1/2} - (2n - 1)^{1/2}$$

i.e. $\sqrt{2\chi^2}$ is approximately Gaussian with mean $\sqrt{2n-1}$ and unit variance.
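Both routes to the tail probability can be sketched in a few lines. This is an illustration, not part of the notes: the integral is done by a simple Riemann sum (the `upper` and `step` values are arbitrary choices), and the large-$n$ Gaussian approximation uses the $y = \sqrt{2\chi^2} - \sqrt{2n-1}$ variable above:

```python
import math

def chi2_pdf(x, n):
    # chi-square pdf with n degrees of freedom, x >= 0
    return x ** (n / 2 - 1) * math.exp(-x / 2) / (2 ** (n / 2) * math.gamma(n / 2))

def chi2_tail(a, n, upper=200.0, step=0.001):
    """P(chi^2 > a): no closed form, so integrate numerically."""
    return sum(chi2_pdf(k * step, n) * step
               for k in range(int(a / step) + 1, int(upper / step)))

print(round(chi2_tail(10.0, 4), 2))  # 0.04, as in Table D of Taylor

def gauss_tail(y):
    # P(Z > y) for a standard Gaussian
    return 0.5 * math.erfc(y / math.sqrt(2))

# Large-n check: exact tail vs. Gaussian approximation (arbitrary n = 30, a = 40)
n, a = 30, 40.0
print(round(chi2_tail(a, n), 3),
      round(gauss_tail(math.sqrt(2 * a) - math.sqrt(2 * n - 1)), 3))
```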
Example: What's the probability to have $\chi^2 > 10$ with the number of degrees of freedom $n = 4$?

Using Table D of Taylor we find $P(\chi^2 > 10, n = 4) = 0.04$. We say that the probability of getting a $\chi^2 > 10$ with 4 degrees of freedom by chance is 4%.

Some not-so-nice things about the $\chi^2$ distribution:
Given a set of data points, two different functions can have the same value of $\chi^2$; it does not produce a unique solution or function.
It does not look at the order of the data points, so it ignores trends in the data.
It ignores the sign of the differences between the data points and the true values, using only the squares of the differences.

There are other distributions/statistical tests that do use the order of the points: run tests and the Kolmogorov test.
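The first and third points above are easy to demonstrate. A tiny sketch with hypothetical data: two different "fits" whose residuals differ only in sign give identical $\chi^2$:

```python
# Hypothetical data and two candidate fits: f1 sits 0.5 above every point,
# f2 sits 0.5 below.  Chi-square cannot tell them apart.
y     = [1.0, 2.0, 3.0]
sigma = [1.0, 1.0, 1.0]
f1    = [1.5, 2.5, 3.5]   # residuals -0.5, -0.5, -0.5
f2    = [0.5, 1.5, 2.5]   # residuals +0.5, +0.5, +0.5

def chi2(y, f, sigma):
    return sum(((yi - fi) / si) ** 2 for yi, fi, si in zip(y, f, sigma))

print(chi2(y, f1, sigma), chi2(y, f2, sigma))  # 0.75 0.75
```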
Least Squares Fitting

Suppose we have $n$ data points $(x_i, y_i, \sigma_i)$. Assume that we know a functional relationship between the points,

$$y = f(x, a, b, \ldots)$$

and that for each $y_i$ we know $x_i$ exactly. The parameters $a, b, \ldots$ are constants that we wish to determine from our data points. A procedure to obtain $a$ and $b$ is to minimize the following $\chi^2$ with respect to $a$ and $b$:

$$\chi^2 = \sum_{i=1}^{n} \frac{[y_i - f(x_i, a, b)]^2}{\sigma_i^2}$$

This is very similar to the Maximum Likelihood Method; for the Gaussian case MLM and LS are identical. Technically this is a $\chi^2$ distribution only if the $y$'s are from a Gaussian distribution. Since most of the time the $y$'s are not from a Gaussian, we call it "least squares" rather than $\chi^2$.

Example: We have a function with one unknown parameter:

$$f(x, b) = 1 + bx$$

Find $b$ using the least squares technique. We need to minimize the following:

$$\chi^2 = \sum_{i=1}^{n} \frac{[y_i - f(x_i, b)]^2}{\sigma_i^2} = \sum_{i=1}^{n} \frac{[y_i - 1 - bx_i]^2}{\sigma_i^2}$$

To find the $b$ that minimizes the above function, we set the derivative to zero:

$$\frac{\partial \chi^2}{\partial b} = \frac{\partial}{\partial b}\sum_{i=1}^{n} \frac{[y_i - 1 - bx_i]^2}{\sigma_i^2} = -2\sum_{i=1}^{n} \frac{[y_i - 1 - bx_i]\,x_i}{\sigma_i^2} = 0$$

$$\sum_{i=1}^{n} \frac{y_i x_i}{\sigma_i^2} - \sum_{i=1}^{n} \frac{x_i}{\sigma_i^2} - b\sum_{i=1}^{n} \frac{x_i^2}{\sigma_i^2} = 0$$
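That the root of $\partial\chi^2/\partial b = 0$ really is the minimum can be verified numerically. A small sketch with hypothetical data (three points roughly on $y = 1 + x$): solve the normal equation for $b$, then confirm $\chi^2$ is no smaller at nearby values of $b$:

```python
# Hypothetical data for f(x, b) = 1 + b*x
x     = [1.0, 2.0, 3.0]
y     = [2.1, 3.0, 3.9]
sigma = [0.1, 0.2, 0.1]

def chi2(b):
    return sum(((yi - 1 - b * xi) / si) ** 2
               for xi, yi, si in zip(x, y, sigma))

# b from the normal equation derived above
num = (sum(yi * xi / si**2 for xi, yi, si in zip(x, y, sigma))
       - sum(xi / si**2 for xi, si in zip(x, sigma)))
den = sum(xi**2 / si**2 for xi, si in zip(x, sigma))
b_fit = num / den

# chi^2 at the solution is no larger than at neighboring b values
assert chi2(b_fit) <= chi2(b_fit + 1e-3)
assert chi2(b_fit) <= chi2(b_fit - 1e-3)
print(round(b_fit, 3))
```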
$$b = \frac{\displaystyle\sum_{i=1}^{n} \frac{y_i x_i}{\sigma_i^2} - \sum_{i=1}^{n} \frac{x_i}{\sigma_i^2}}{\displaystyle\sum_{i=1}^{n} \frac{x_i^2}{\sigma_i^2}}$$

Each measured data point ($y_i$) is allowed to have a different standard deviation ($\sigma_i$).

The LS technique can be generalized to two or more parameters for simple and complicated (e.g. non-linear) functions. One especially nice case is a polynomial function that is linear in the unknowns ($a_i$):

$$f(x, a_1, \ldots, a_n) = a_1 + a_2 x + a_3 x^2 + \cdots + a_n x^{n-1}$$

We can always recast the problem in terms of solving $n$ simultaneous linear equations: we use the techniques of linear algebra and invert an $n \times n$ matrix to find the $a_i$'s!

Example: Given the following data, perform a least squares fit to find the value of $b$ for $f(x, b) = 1 + bx$:

x:  1.0  2.0  3.0  4.0
y:  2.2  2.9  4.3  5.2
σ:  0.2  0.4  0.3  0.1

Using the above expression for $b$ we calculate $b = 1.05$.
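Plugging the tabulated data into the closed-form expression for $b$ gives a quick check of the notes' arithmetic:

```python
# The worked example from the notes: f(x, b) = 1 + b*x
x     = [1.0, 2.0, 3.0, 4.0]
y     = [2.2, 2.9, 4.3, 5.2]
sigma = [0.2, 0.4, 0.3, 0.1]

# b = (sum y*x/s^2 - sum x/s^2) / (sum x^2/s^2)
num = (sum(yi * xi / si**2 for xi, yi, si in zip(x, y, sigma))
       - sum(xi / si**2 for xi, si in zip(x, sigma)))
den = sum(xi**2 / si**2 for xi, si in zip(x, sigma))
b = num / den

print(round(b, 2))  # 1.05
```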
A plot of the data points and the line from the least squares fit:

[Plot: the four data points with error bars and the fitted line $y = 1 + 1.05x$.]

If we assume that the data points are from a Gaussian distribution, we can calculate a $\chi^2$ and the probability associated with the fit:

$$\chi^2 = \sum_{i=1}^{4} \frac{[y_i - 1 - 1.05 x_i]^2}{\sigma_i^2} = \left(\frac{2.2 - 2.05}{0.2}\right)^2 + \left(\frac{2.9 - 3.1}{0.4}\right)^2 + \left(\frac{4.3 - 4.16}{0.3}\right)^2 + \left(\frac{5.2 - 5.2}{0.1}\right)^2 = 1.04$$

(the 1.04 uses the unrounded $b = 1.0536$).

From Table D of Taylor, the probability to get $\chi^2 > 1.04$ for 3 degrees of freedom is $\approx 80\%$. We call this a "good" fit since the probability is close to 100%. If however the $\chi^2$ had been large (e.g. 15), the probability would be small ($\approx 0.2\%$ for 3 dof), and we would say this was a "bad" fit.

RULE OF THUMB: A "good" fit has $\chi^2/\mathrm{dof} \approx 1$.
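As a cross-check of the fit quality, the $\chi^2$ above can be recomputed directly with the unrounded $b$ (4 data points minus 1 fitted parameter gives 3 degrees of freedom):

```python
# Data from the worked example; b recomputed from the closed-form formula.
x     = [1.0, 2.0, 3.0, 4.0]
y     = [2.2, 2.9, 4.3, 5.2]
sigma = [0.2, 0.4, 0.3, 0.1]

b = ((sum(yi * xi / si**2 for xi, yi, si in zip(x, y, sigma))
      - sum(xi / si**2 for xi, si in zip(x, sigma)))
     / sum(xi**2 / si**2 for xi, si in zip(x, sigma)))

chisq = sum(((yi - 1 - b * xi) / si) ** 2 for xi, yi, si in zip(x, y, sigma))
dof = len(x) - 1              # 4 data points, 1 fitted parameter

print(round(chisq, 2), dof)   # 1.04 3
print(round(chisq / dof, 2))  # ~0.35 -- chi^2/dof of order 1: a "good" fit
```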