Mean

Same as the ordinary average: sum all the data values and divide by the sample size.

$\bar{x} = (x_1 + x_2 + \dots + x_n)/n$

Using summation notation, we write this as

$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$

The mean is only appropriate for interval or ratio scales, not ordinal or nominal.

Median

The value that divides the sample so that an equal number of cases are above and below. Sort the cases by magnitude of x:

$x_{[1]} \le x_{[2]} \le \dots \le x_{[n]}$

If n is odd, the median is the middle value: e.g., for five sorted values beginning 1, 2, 5, 10, the median is the third value, 5.

If n is even, the median is the average of the two middle values: e.g., for x = 1, 2, 5, 10, the median is (2 + 5)/2 = 3.5.

For the test score example, the sorted values are 3 8 9 0 4 4 7 7 8 8 8 8 8 9 9 3 3 3 33 33 35. With 21 values, the median is the 11th value: 8.

Median vs. Mean

Unlike the median, the mean is sensitive to extreme scores (outliers):

1, 2, 3, 4, 5 => mean = 3, median = 3
1, 2, 3, 4, 100 => mean = 22, median = 3

In symmetrical distributions, the mean and median will be the same. In skewed distributions, they will be different (more later). So the median is often preferred for variables like income, which have a relatively small number of extremely high scores.
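The outlier comparison can be checked with a short Python sketch. It uses the standard library's `statistics` module; the helper name `describe` is only an illustrative choice, not something from these notes.

```python
from statistics import mean, median

def describe(values):
    """Return (mean, median) for a list of numbers."""
    return mean(values), median(values)

symmetric = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 100]   # same data, one extreme score

print(describe(symmetric))      # (3, 3)
print(describe(with_outlier))   # (22, 3): the mean moves, the median does not

# Even number of cases: the median averages the two middle values.
print(median([1, 2, 5, 10]))    # 3.5
```

Replacing 5 with 100 changes the mean from 3 to 22 while the median stays at 3, which is the sense in which the median is resistant to outliers.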
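Variance

How different is each score from the mean?

$x_i - \bar{x}$

What is the average of these differences?

$\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x}) = 0$

Positive deviations cancel out the negative. The mean is the only score for which this is true.

How to fix?

1. Take absolute values before averaging: mean absolute deviation.

$\frac{1}{n}\sum_{i=1}^{n} |x_i - \bar{x}|$

2. Square the deviations before averaging: variance (the sample variance conventionally divides by n - 1 rather than n).

$s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$

The square root of this is called the standard deviation:

$s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2}$

For any normal distribution, the following rule holds:

68% of the cases fall within 1 s.d. of the mean.
95% fall within 2 s.d.s of the mean.
99.7% fall within 3 s.d.s of the mean.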
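As a rough sketch of these calculations, the Python below works through the deviations, the mean absolute deviation, the variance, and the standard deviation for a small made-up data set (the `scores` values are hypothetical, not from the notes). It assumes the sample variance divides by n - 1, the usual convention; swap in n if a population formula is intended. The last lines check the 68/95/99.7 rule against the standard normal distribution using `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist

scores = [2, 4, 4, 4, 5, 5, 7, 9]      # hypothetical data
n = len(scores)
xbar = sum(scores) / n                  # mean = 5.0

deviations = [x - xbar for x in scores]
print(sum(deviations))                  # 0.0: positives cancel negatives

# Fix 1: average the absolute deviations.
mean_abs_dev = sum(abs(d) for d in deviations) / n

# Fix 2: average the squared deviations (assumed divisor: n - 1).
variance = sum(d ** 2 for d in deviations) / (n - 1)
std_dev = sqrt(variance)
print(mean_abs_dev, variance, std_dev)

def share_within(k):
    """Share of a normal distribution within k s.d.s of its mean."""
    z = NormalDist()                    # standard normal: mean 0, s.d. 1
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(k, round(share_within(k), 4))  # about 0.68, 0.95, 0.997
```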
[Figure: two plots against X (horizontal axis from -4 to 4); vertical axes labeled Y and DENSITY.]
Standard Error

Every statistic has a standard error associated with it. It is not always reported and not always easy to calculate. Examples: waiting times, companies.

A measure of the (in)accuracy of the statistic. A standard error of 0 means that the statistic has no random error. The bigger the standard error, the less accurate the statistic.

Implicit in this is the idea that anything we calculate in a sample of data is subject to random errors. The mean we calculated for the waiting times is not the true mean, but only an estimate of the true mean. Even if we could perfectly replicate our study, we would get a different value for the mean.

What are the sources of error?

Classic approach in statistics: our data set may be only a random sample from some larger population.
We may make errors of measurement.
There are lots of other random factors affecting our outcome that we can't control.

The standard error of a statistic is the standard deviation of that statistic across hypothetical repeated samples. Example: repeated replications of the waiting time study. In theory, we would need to replicate an infinite number of times.
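The "repeated samples" definition can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population is a hypothetical exponential distribution of waiting times with mean 10 (not the data from the notes), each replication draws 25 cases, and a finite 2000 replications stand in for the infinite number the definition calls for. The function name `simulate_standard_error` is illustrative.

```python
import random
from math import sqrt
from statistics import mean, stdev

def simulate_standard_error(sample_size, replications, seed=0):
    """S.d. of the sample mean across repeated samples from a
    hypothetical exponential population with mean 10."""
    rng = random.Random(seed)           # seeded so the run is repeatable
    sample_means = []
    for _ in range(replications):
        sample = [rng.expovariate(1 / 10) for _ in range(sample_size)]
        sample_means.append(mean(sample))
    return stdev(sample_means)

se_simulated = simulate_standard_error(sample_size=25, replications=2000)

# An exponential population with mean 10 also has s.d. 10,
# so the theoretical standard error of the mean is 10 / sqrt(25).
se_theory = 10 / sqrt(25)
print(round(se_simulated, 2), se_theory)
```

The simulated value should land close to the theoretical one; the gap shrinks as the number of replications grows.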
The standard errors that are reported in computer output are only estimates of the true standard errors. Remarkably, we can estimate the variability across repeated samples by using the variability within samples. The more variability within the sample, the more variability between samples.

The formula for the standard error of the mean is $s/\sqrt{n}$, i.e., the standard deviation divided by the square root of the sample size.

In general, the bigger the sample, the smaller the standard error. Why? Big samples give us more information to estimate the quantity we're interested in. The standard error generally goes down with the square root of the sample size. Thus, if you quadruple the sample size, you cut the standard error in half.

Confidence Intervals

The standard error is often used to construct confidence intervals. To construct a 95 percent confidence interval around the mean, add two standard errors and subtract two standard errors.

E.g., for the waiting time example, the standard error of the mean was 1. Then the upper confidence limit is the mean plus 2 and the lower confidence limit is the mean minus 2.

Interpretation: we can be 95 percent confident that the true mean is somewhere between the lower and upper limits.

Further interpretation: Suppose we could replicate our study many times. For each replication we could construct a 95 percent confidence interval by adding and subtracting 2 standard errors from the mean. Then 95 percent of those confidence intervals would contain the true mean.

Why two standard errors? Remember our rule for normal distributions: 95% of the cases fall within two standard deviations of the mean.
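A minimal Python sketch of the standard error of the mean and the "plus or minus two standard errors" interval follows. The `waits` values are made up for illustration, and the function names are hypothetical; `statistics.stdev` computes the sample standard deviation with an n - 1 divisor.

```python
from math import sqrt
from statistics import mean, stdev

def standard_error(values):
    """Estimated standard error of the mean: s / sqrt(n)."""
    return stdev(values) / sqrt(len(values))

def confidence_interval_95(values):
    """Approximate 95% interval: mean minus and plus 2 standard errors."""
    m = mean(values)
    se = standard_error(values)
    return m - 2 * se, m + 2 * se

def se_given_sd(sd, n):
    """Standard error of the mean for a given s.d. and sample size."""
    return sd / sqrt(n)

waits = [6, 8, 9, 10, 10, 11, 12, 14]   # hypothetical waiting times
low, high = confidence_interval_95(waits)
print(mean(waits), standard_error(waits), (low, high))

# Quadrupling the sample size halves the standard error (same s.d.).
print(se_given_sd(10, 100), se_given_sd(10, 400))   # 1.0 0.5
```

The multiplier 2 is the rounded rule of thumb used in these notes; software typically uses a slightly different value (about 1.96 for large samples).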
Even though the original distribution of waiting times was not well approximated by a normal distribution, the distribution of means across repeated samples is approximately normal. Why?

Central limit theorem: Whenever you average a bunch of things together, the resulting average tends to be approximately normally distributed. The more things you add together, the closer the approximation.

In large samples, most statistics have approximately a normal distribution across repeated samples.

Correlation

A measure of the strength of the relationship between two variables. There are many measures of correlation. The most common is Pearson's product-moment correlation coefficient, usually denoted by r.

Facts about the correlation:

When r is 0, there is no correlation between the two variables.
When r is 1 or -1, there is a perfect linear relationship.
r cannot be greater than 1 or less than -1.
The correlation measures the degree of scatter around a straight line.
Correlation only measures the linear relationship between two variables.
Correlation is symmetric: the correlation between x and y is the same as the correlation between y and x.
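The facts listed for r can be checked with a short Python sketch. The notes do not give a formula, so this assumes the standard definition of Pearson's r (the sum of cross-products of deviations divided by the square root of the product of the two sums of squared deviations). The function name `pearson_r` and the data are illustrative only.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation (standard textbook formula)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

x = [-2, -1, 0, 1, 2]
line_up = [2 * v + 1 for v in x]      # perfect increasing line
line_down = [-3 * v for v in x]       # perfect decreasing line
curve = [v ** 2 for v in x]           # exact but nonlinear relationship
noisy = [1, 0, 2, 1, 4]               # hypothetical scattered data

print(pearson_r(x, line_up))          # 1.0
print(pearson_r(x, line_down))        # -1.0
print(pearson_r(x, curve))            # 0.0
print(pearson_r(x, noisy), pearson_r(noisy, x))   # same value both ways
```

The `curve` case shows why r only measures linear relationships: y is completely determined by x, yet r is 0.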