CURRICULUM INSPIRATIONS: INNOVATIVE CURRICULUM ONLINE EXPERIENCES: TANTON TIDBITS:


CURRICULUM INSPIRATIONS: www.maa.org/ci
MATH FOR AMERICA_DC: www.mathforamerica.org/dc
INNOVATIVE CURRICULUM ONLINE EXPERIENCES: www.gdaymath.com
TANTON TIDBITS: www.jamestanton.com

TANTON'S TAKE ON MEAN and VARIATION

JULY 201[…]

In our early grades we learn that the average of a collection of data measurements represents, in some way, a typical or middle value for the data. For example, the average of the numbers 1, 2, 5, 2 is:

$$\frac{1 + 2 + 5 + 2}{4} = 2.5$$

Geometrically, the average is the level of a sand-box after we smooth out columns of sand of heights given by the data.

In a statistics class the average value of a collection of data values is called the mean of the data. (The word still means "average.") One denotes the mean of the data by putting a bar over whichever letter is being used to denote the data. For example, the mean of $a_1$, $a_2$, and $a_3$ is:

$$\bar{a} = \frac{a_1 + a_2 + a_3}{3}$$

and the mean of $x_1, x_2, \ldots, x_n$ is:

$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}$$

If the data set is extraordinarily large and one doesn't have any hope of determining the mean of the full data set, then that true, but unknown, mean is usually denoted with the Greek letter $\mu$. For example, we have no hope of knowing the average height of all humans on this planet at this very moment. But we can measure the height of 100 humans and collect 100 data values $h_1, h_2, \ldots, h_{100}$. We then hope that the sample mean $\bar{h}$ approximates the true mean $\mu$ to some reasonable degree.

Exercise: Data values $x_1, x_2, \ldots, x_n$ have mean $\bar{x}$. Prove that the sum of the differences of each data value from the mean is sure to be zero:

$$(x_1 - \bar{x}) + (x_2 - \bar{x}) + \cdots + (x_n - \bar{x}) = 0$$

Exercise: Some texts might give the following formula for mean:

$$\bar{x} = \frac{f_1 x_1 + f_2 x_2 + \cdots + f_n x_n}{f_1 + f_2 + \cdots + f_n}$$

Can you interpret what the symbols in this formula mean and why the formula is correct?

SIMPSON'S PARADOX

Two students, Albert and Bilbert, each took a sample of math questions over a series of two days. There were 100 questions in total, and Albert scored 65% overall while Bilbert scored a little over 60%. Thus Albert proved himself a better test taker. But here are the scores day-by-day:

FIRST DAY: Albert = 71% Bilbert = 80%
SECOND DAY: Albert = 50% Bilbert = 57%

So each day Bilbert did a better job than Albert, but did not prove to be the better test-taker after the two days combined! How is this possible? The raw data of their test results resolve the puzzle.

This paradox arises because Albert and Bilbert did not complete the same number of questions each day, so the averages computed are not equally weighted. This curious phenomenon is known as Simpson's paradox, named after the statistician Edward Simpson, who described it in 1951. A famous instance arose in a 1970s study of graduate school admission rates for men and women at UC Berkeley.

ASIDE: There are several other measures of a typical or central value of a data set.

The mode of a set of data values is the value in the set that occurs most often (if there is one). A data set in which every value occurs equally often, say ten data values made up of five different values each appearing exactly twice, has no mode. The data set 1, 1, 1, 1, 5, 5, 7, 7, 7, 7, 8, 8, 9, 9, 9 is bimodal. (Is the ten-value example quinti-modal?)
For non-numerical data, such as colours, or letters of the alphabet, the mode is the only measure of central tendency available.

If we arrange the data set in increasing order of values, then the median of the data is the middle value of the ordered sequence, or the average of the two middle values if there is an even number of terms.
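The definitions of mode and median above can be sketched in a few lines of Python. This is only an illustration, not part of the essay: the helper name `modes` and the convention of returning an empty list when every value is tied are assumptions made here, and the eight-value list passed to `median` is hypothetical data chosen so that its two middle entries are 5 and 8. The bimodal list is the one quoted in the essay.

```python
from statistics import median, multimode

def modes(values):
    """Return the value(s) that occur most often, or [] when no value stands out."""
    distinct = set(values)
    most_frequent = multimode(values)  # every value tied for the highest count
    # Assumed convention: if all distinct values are tied (and there is more
    # than one of them), report that the data set has no mode.
    if len(distinct) > 1 and len(most_frequent) == len(distinct):
        return []
    return most_frequent

bimodal = [1, 1, 1, 1, 5, 5, 7, 7, 7, 7, 8, 8, 9, 9, 9]  # data set from the essay
print(modes(bimodal))                      # [1, 7]
print(modes([5, 5, 6, 6, 9, 9]))           # [] (every value occurs twice)
print(modes(["red", "blue", "red"]))       # ['red'] (works for non-numerical data)
print(median([1, 2, 3, 5, 8, 8, 10, 12]))  # 6.5 (average of the middle pair 5 and 8)
```

Note that `multimode` requires Python 3.8 or later.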

For a data set of nine values listed in increasing order, the median is the fifth value: in a list whose third through eighth entries are 5, 6, 7, 16, 16, 19, the median is 7. For a data set of eight ordered values whose two middle entries are 5 and 8, the median is $\frac{5 + 8}{2} = 6.5$. The median is a value that divides the data set into two equally sized groups.

The midrange of a data set is the average of the smallest and largest values. The midrange of the data set 5, 6, 9, 9 is $\frac{5 + 9}{2} = 7$. The midrange provides a quick estimate of a central value. It is easy to compute, but is highly affected by extremely low or high values in the data set.

Exercise:
a) Find FIVE data values with: Median = 10, Mode = 10, Mean = 1000.
b) Now find five data values with median = 10, mode = 1000 and mean = 10.
c) Can you find five data values with median = 1000, mode = 10, mean = 10?

COOL Exercise: Repeat the previous exercise but this time for SIX data values.

DEVIATION FROM THE MEAN

The data set 1, 2, 5, 2 has mean 2.5. So too does the data set −101, 0, 1, 110. These are two very different data sets, with the second being much more spread out than the first. We can measure the degree of spread by calculating the average deviation from the mean for each.

DATA SET 1, 2, 5, 2:
Deviations: $|1 - 2.5| = 1.5$, $|2 - 2.5| = 0.5$, $|5 - 2.5| = 2.5$, $|2 - 2.5| = 0.5$
Average deviation: $\frac{1.5 + 0.5 + 2.5 + 0.5}{4} = 1.25$

DATA SET −101, 0, 1, 110:
Deviations: $|-101 - 2.5| = 103.5$, $|0 - 2.5| = 2.5$, $|1 - 2.5| = 1.5$, $|110 - 2.5| = 107.5$
Average deviation: $\frac{103.5 + 2.5 + 1.5 + 107.5}{4} = 53.75$

The numbers 1.25 and 53.75, the average deviations from the mean, do give a quantitative measure of the amount of spread of each data set.

THE POINT OF THIS ESSAY

Using the absolute value, the distance of a particular data value from the mean value of the data, is the natural and appropriate way to measure data variation. But statisticians DON'T use absolute values in their work! This is very strange and confusing for students. (There is also a second piece of confusion, which we shall leave to later in this essay.)

Here are two rationales for the switch away from absolute values.
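Before turning to those rationales, the average-deviation computation described above can be checked with a short Python sketch. The function names `mean` and `average_deviation` are chosen here for illustration, and the two lists are the data sets 1, 2, 5, 2 and −101, 0, 1, 110 used in the discussion above.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    return sum(values) / len(values)

def average_deviation(values):
    """Average of the absolute deviations |x - mean| over all data values."""
    m = mean(values)
    return sum(abs(v - m) for v in values) / len(values)

first = [1, 2, 5, 2]
second = [-101, 0, 1, 110]

print(mean(first), average_deviation(first))    # 2.5 1.25
print(mean(second), average_deviation(second))  # 2.5 53.75
```

Both data sets share the same mean, yet the second average deviation is more than forty times the first, which is the quantitative sense in which the second set is more spread out.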

RATIONALE ONE: Working with absolute values is hard. Can we avoid them?

Indeed, working with absolute values in mathematical equations is really tough!

Optional Exercises:
a) Sketch the curve $|x| + |y| = […]$.
b) Find all values of $w$ which satisfy: $[…] = 7$, where the left-hand side is a combination of absolute values of expressions in $w$.
c) (From last month's essay) Three data points $A = ([…], […])$, $B = (5, 8)$ and $C = (7, 5)$ are plotted on a graph. A horizontal line $y = k$ will be drawn, but a value $k$ needs to be chosen so that the sum of the three vertical deviations from the horizontal line is at a minimum. (NOTE: We've drawn the horizontal line so that $A$ lies below it and $B$ and $C$ above it. This need not be the case.) On a calculator, type in a function that represents the sum of these three deviations and graph it. Which value of $k$ seems to give a minimum value for this sum of three deviations?

But we still need a measure, a positive number that represents the deviation of each data value from the mean. If we want to avoid absolute value, how else can we obtain positive values?

Answer: Square the values!

Let's square all the deviations and take the average of those squared deviations:

DATA SET 1, 2, 5, 2:
Deviations squared: $(1 - 2.5)^2 = 2.25$, $(2 - 2.5)^2 = 0.25$, $(5 - 2.5)^2 = 6.25$, $(2 - 2.5)^2 = 0.25$
Average squared deviation: $\frac{2.25 + 0.25 + 6.25 + 0.25}{4} = 2.25$

DATA SET −101, 0, 1, 110:
Deviations squared: $(-101 - 2.5)^2 = 10712.25$, $(0 - 2.5)^2 = 6.25$, $(1 - 2.5)^2 = 2.25$, $(110 - 2.5)^2 = 11556.25$
Average squared deviation: $\frac{10712.25 + 6.25 + 2.25 + 11556.25}{4} = 5569.25$

These average squared deviations still give a good sense of the different spreads the two data sets possess.

One subtle point: Data often comes from physical measurements (the height of a person, the speed of a car on a highway, and so on) and so has units associated with it. If $x_1, x_2, \ldots, x_n$ are in units of inches, say, then the mean $\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}$ also has units of inches, but the average squared deviation:

$$\frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n}$$

has units of inches squared. To bring all quantities and comparisons between quantities back to the same units, statisticians will take the square root of the average squared deviation:

$$\sqrt{\frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n}}$$

This quantity now has units of inches and is called the standard deviation of the data.

WARNING: Statisticians might raise an eyebrow or two at what I just said. They might prefer to call the quantity:

$$\sqrt{\frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1}}$$

the standard deviation of the data set. (Note $n - 1$ in the denominator, rather than $n$.) This change, the second confusion for students studying statistics, is discussed at the end of this essay.

RATIONALE TWO: Abstract mathematics tells us it is natural to work with quantities squared.

Suppose we run an experiment or poll some people and gain from the exercise $n$ data values: $x_1, x_2, \ldots, x_n$. We, not being omniscient, know nothing about the data values we shall obtain: we don't know what to expect for the mean of the values (what is the true average height of all humans on this planet?), what variation from the mean to expect, what the frequencies of particular values should be, and so on.

But if the experiment was ideal, or the population we were polling from is truly uniform, then the experiment or polling would be absolutely and utterly repeatable and we'd expect no variation in data values at all. That is, in the perfect ideal, all measurements would adopt exactly the same value $q$, say, over and over again.

Let's ask: How close is our data $(x_1, x_2, \ldots, x_n)$ to some ideal set of repeatable data $(q, q, \ldots, q)$?
Now we learned last month that, in two-dimensional geometry, the distance between two points $A = (a_1, a_2)$ and $B = (b_1, b_2)$ is given by:

$$d(A, B) = \sqrt{(a_1 - b_1)^2 + (a_2 - b_2)^2}$$

And the distance between two points $A = (a_1, a_2, a_3)$ and $B = (b_1, b_2, b_3)$ in three-dimensional space is:

$$d(A, B) = \sqrt{(a_1 - b_1)^2 + (a_2 - b_2)^2 + (a_3 - b_3)^2}$$

And so on, for any dimension of space.

So to answer this question we seek a value $q$ so that the point $M = (q, q, \ldots, q)$ is as close as possible to our point $P = (x_1, x_2, \ldots, x_n)$ in $n$-dimensional geometry. We want to choose a value $q$ that minimizes the distance:

$$d(P, M) = \sqrt{(x_1 - q)^2 + (x_2 - q)^2 + \cdots + (x_n - q)^2}$$

It is easier just to minimize the quantity under the square root sign. Notice that we are now led to study a sum of quantities squared. Expand the sum under the root and collect terms:

$$(x_1 - q)^2 + (x_2 - q)^2 + \cdots + (x_n - q)^2 = nq^2 - 2(x_1 + \cdots + x_n)\,q + (x_1^2 + \cdots + x_n^2)$$

We see that the sum we wish to minimize is just a quadratic in $q$. It has minimum value for:

$$q = \frac{2(x_1 + \cdots + x_n)}{2n} = \frac{x_1 + \cdots + x_n}{n} = \bar{x}$$

the data's mean! We have:

The mean of a data set is the value of the closest ideal, repeatable experiment to the given data.

From this perspective we see that it is natural to think about sums of deviations squared. Dividing through by $n$, we call:

$$\frac{(x_1 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n}$$

the variance of the data. And to match units, we take the square root and call this the standard deviation of the data:

$$\sqrt{\frac{(x_1 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n}}$$

Comment: We have now seen that the mean $\bar{x}$ of a set of data values $x_1, x_2, \ldots, x_n$ has two properties:

i) The sum $(x_1 - \bar{x}) + (x_2 - \bar{x}) + \cdots + (x_n - \bar{x})$ is zero.
ii) Of all the sums of the form $(x_1 - q)^2 + (x_2 - q)^2 + \cdots + (x_n - q)^2$, the sum $(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2$ has the smallest value.

ON $n$ VERSUS $n - 1$

Some text authors will argue that it is better to divide by $n - 1$ in the formulas for variance and standard deviation, rather than by $n$, for the following philosophical reason:

We have that $(x_1 - \bar{x}) + (x_2 - \bar{x}) + \cdots + (x_n - \bar{x})$ is sure to equal zero. This means that if one knows the first $n - 1$ values $x_1 - \bar{x}, x_2 - \bar{x}, \ldots, x_{n-1} - \bar{x}$, then the value of the $n$th one, $x_n - \bar{x}$, is forced. So among the $n$ values $(x_1 - \bar{x})^2, (x_2 - \bar{x})^2, \ldots, (x_n - \bar{x})^2$ there are only $n - 1$ real pieces of information. To reflect this, let's divide by $n - 1$ rather than $n$ and set the variance as:

$$\frac{(x_1 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1}$$

and the standard deviation as:

$$\sqrt{\frac{(x_1 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1}}$$

But this seems an unsatisfactory explanation. Text authors will often add:

If the data sets are large, that is, if $n$ is a large number, then there will be little difference in dividing through by $n - 1$ over dividing through by $n$.

The correct student response to this add-on is: So, really, why bother making this change?

To understand why statisticians prefer to divide by $n - 1$, not $n$, let's go back to a previous example.
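Before revisiting that example, the two conventions can be compared numerically. The following Python sketch is an illustration only: the function names and the `ddof` parameter (0 to divide by n, 1 to divide by n − 1) are naming choices made here, not notation from the essay. It also spot-checks the claim that the mean gives the smallest sum of squared deviations, using the data set 1, 2, 5, 2.

```python
import math
import statistics

def variance(values, ddof=0):
    """Sum of squared deviations from the mean, divided by n - ddof."""
    n = len(values)
    m = sum(values) / n
    return sum((v - m) ** 2 for v in values) / (n - ddof)

def sum_of_squares(values, q):
    """Sum of squared deviations of the data from a candidate value q."""
    return sum((v - q) ** 2 for v in values)

data = [1, 2, 5, 2]
v_n = variance(data)                  # divide by n
v_n_minus_1 = variance(data, ddof=1)  # divide by n - 1

print(v_n, math.sqrt(v_n))                  # 2.25 1.5
print(v_n_minus_1, math.sqrt(v_n_minus_1))  # 3.0 and roughly 1.732

# The standard library offers both conventions under different names.
assert math.isclose(v_n, statistics.pvariance(data))
assert math.isclose(v_n_minus_1, statistics.variance(data))

# The sum of squares at the mean (2.5) is smaller than at nearby values of q.
assert sum_of_squares(data, 2.5) < sum_of_squares(data, 2.4)
assert sum_of_squares(data, 2.5) < sum_of_squares(data, 2.6)
```

For a data set this small the two conventions differ noticeably (2.25 versus 3); the gap shrinks as n grows, which is the point of the "little difference" remark above.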

Because the data set is so large, we have no hope of knowing the true average height $\mu$ of all humans on this planet right at this moment. All we can do is measure the heights of a sample of humans, compute the data mean $\bar{h}$ of that sample, and hope that $\bar{h}$ offers a good approximation for $\mu$.

We would expect there to be some uniformity among all the possible samples we could work with. Certainly, if we select a sample of 100 humans and measure their heights we would obtain a sample mean $\bar{h}$. If we chose a different collection of 100 people we would probably obtain a slightly different mean $\bar{h}$. In fact, if we looked at every possible collection of 100, we'd have a whole spread of values for $\bar{h}$, all approximating the true mean value $\mu$. Since the set of all samples of 100 humans well and truly covers the entire human population, it would be a shock if, on average, the set of all possible values of $\bar{h}$ turned out to be different from $\mu$.

The same should be true for variance. We can't possibly know the true variance of the entire set of human population heights, but we can take a sample of 100 heights and find the value of the variance for that sample. And it would be a shock if again, on average, the variances over all possible samples of 100 people turned out to be a value different from the true variance of the entire population.

Let's see what can happen with some actual numbers.

EXAMPLE: Consider the data set 1, 2, 2, 3. This is a set of $n = 4$ data values with true mean $\mu = 2$ and true variance, when dividing by $n = 4$:

$$V_n = \frac{(1 - 2)^2 + (2 - 2)^2 + (2 - 2)^2 + (3 - 2)^2}{4} = \frac{1}{2}$$

and true variance, when dividing by $n - 1 = 3$:

$$V_{n-1} = \frac{(1 - 2)^2 + (2 - 2)^2 + (2 - 2)^2 + (3 - 2)^2}{3} = \frac{2}{3}$$

But suppose we don't know these values (a data set of four values is too large for us to manage), so we decide to look at samples of size three instead and work out their sample means and sample variances. Here is a table of all possible subsets of size three (handling the repeated 2s) and the sample means and variances we would see:

Subset       Squared deviations     Mean    V_n     V_{n-1}
{1, 2, 2}    4/9 + 1/9 + 1/9        5/3     2/9     1/3
{1, 2, 3}    1 + 0 + 1              2       2/3     1
{1, 2, 3}    1 + 0 + 1              2       2/3     1
{2, 2, 3}    1/9 + 1/9 + 4/9        7/3     2/9     1/3
Average                             2       4/9     2/3

We see that the means and the variances do depend on which sample of three you happen to choose. We also see, in this example, that our first dream is true: the average of all the sample means matches $\mu = 2$ on the nose.

And our second dream is true too if we divide by $n - 1$ instead of $n$ when computing variances: the average of the values of $V_{n-1}$ over all samples matches the value of $V_{n-1}$ for the overall data.

These two claims are not a coincidence for our particular example: they are true in general. It is for this reason that statisticians prefer to work with the formula:

$$\frac{(x_1 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1}$$

for variance and the square root of this for standard deviation.

Exercise: There are six two-element subsets of the data set 1, 2, 2, 3 (if you handle the repeated 2s appropriately). List all six subsets, compute the mean and variance $V_{n-1}$ of each, and take the average value of these six means and variances. Show these average values match $\mu = 2$ and $V_{n-1} = 2/3$ of the original data set.

MATHEMATICAL PROOFS:

The mathematics here is tedious algebra and is hard to read. One can phrase the algebra in terms of expected values and variances of random variables ($E(X)$ and $\mathrm{Var}(X)$) and make matters less complicated visually, but one does this at the price of obscuring the conceptual straightforwardness. If you are game, here's how these proofs proceed.

Suppose a population possesses a total of $N$ data points and has mean:

$$\mu = \frac{y_1 + y_2 + \cdots + y_N}{N}$$

Our job is to look at a subset of $n$ data points, $x_1, x_2, \ldots, x_n$, compute their data mean $\bar{x}$, and take the average of all possible values for $\bar{x}$ over all possible subsets and show this average equals $\mu$. We must also compute the variances

$$\frac{(x_1 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1}$$

over all subsets and show that their average equals:

$$\frac{(y_1 - \mu)^2 + \cdots + (y_N - \mu)^2}{N - 1}$$

Now there are

$${}_N C_n = \frac{N!}{n!\,(N - n)!}$$

subsets of size $n$ among $N$ data points, so in each case, our average is a sum divided by this number. For the sample means we need to show:

$$\frac{\dfrac{x_1 + x_2 + \cdots + x_n}{n} + \dfrac{x'_1 + x'_2 + \cdots + x'_n}{n} + \cdots}{\dfrac{N!}{n!\,(N - n)!}}$$

equals $\mu$, where the numerator is the sum of sample means over all possible subsets. (There is a similar, but more complicated formula, for the average of the variances.) This expression is equivalent to:

$$\frac{(n - 1)!\,(N - n)!}{N!}\Big((x_1 + x_2 + \cdots + x_n) + (x'_1 + x'_2 + \cdots + x'_n) + \cdots\Big)$$

Now a particular data point appears in

$${}_{N-1} C_{n-1} = \frac{(N - 1)!}{(n - 1)!\,(N - n)!}$$

subsets of size $n$. So in the sum we have each data point mentioned this many times. Our expression is thus equivalent to:

$$\frac{(n - 1)!\,(N - n)!}{N!}\left(\frac{(N - 1)!}{(n - 1)!\,(N - n)!}\,y_1 + \frac{(N - 1)!}{(n - 1)!\,(N - n)!}\,y_2 + \cdots\right)$$

where the sum is over each and every data point in the set. This simplifies to:

$$\frac{1}{N}\,(y_1 + y_2 + \cdots + y_N)$$

which is indeed $\mu$!

For the average value of the variances, we need to work with:

$$\frac{\dfrac{(x_1 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1} + \dfrac{(x'_1 - \bar{x}')^2 + \cdots + (x'_n - \bar{x}')^2}{n - 1} + \cdots}{\dfrac{N!}{n!\,(N - n)!}}$$

This is equivalent to:

$$\frac{n!\,(N - n)!}{(n - 1)\,N!}\Big[(x_1 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2 + (x'_1 - \bar{x}')^2 + \cdots + (x'_n - \bar{x}')^2 + \cdots\Big]$$

$$= \frac{n!\,(N - n)!}{(n - 1)\,N!}\Big[\big(x_1 - \tfrac{1}{n}(x_1 + \cdots + x_n)\big)^2 + \cdots + \big(x_n - \tfrac{1}{n}(x_1 + \cdots + x_n)\big)^2 + \big(x'_1 - \tfrac{1}{n}(x'_1 + \cdots + x'_n)\big)^2 + \cdots + \big(x'_n - \tfrac{1}{n}(x'_1 + \cdots + x'_n)\big)^2 + \cdots\Big]$$

$$= \frac{n!\,(N - n)!}{(n - 1)\,n^2\,N!}\Big[\big((n - 1)x_1 - x_2 - \cdots - x_n\big)^2 + \big(-x_1 + (n - 1)x_2 - \cdots - x_n\big)^2 + \cdots + \big((n - 1)x'_1 - x'_2 - \cdots - x'_n\big)^2 + \big(-x'_1 + (n - 1)x'_2 - \cdots - x'_n\big)^2 + \cdots\Big]$$

By expanding terms and counting how many times a particular data point squared, $x_1^2$, appears and how many times the pair $x_1 x_2$ appears (and these counts are the same for all data points), one can show that this expression does indeed equal:

$$\frac{(y_1 - \mu)^2 + \cdots + (y_N - \mu)^2}{N - 1}$$

the variance over all the data points. We'll leave the details to the truly gung-ho reader!

Exercise: To get a (manageable) feel for the algebra, do work through the details for the case of $N = 4$ data points: $x_1, x_2, x_3, x_4$. Write down and simplify the formulas for the variances of each of the subsets $\{x_1, x_2, x_3\}$, $\{x_1, x_2, x_4\}$, $\{x_1, x_3, x_4\}$, $\{x_2, x_3, x_4\}$ and an expression for the average of these four values. Show this average equals:

$$\frac{\left(x_1 - \frac{x_1 + x_2 + x_3 + x_4}{4}\right)^2 + \left(x_2 - \frac{x_1 + x_2 + x_3 + x_4}{4}\right)^2 + \left(x_3 - \frac{x_1 + x_2 + x_3 + x_4}{4}\right)^2 + \left(x_4 - \frac{x_1 + x_2 + x_3 + x_4}{4}\right)^2}{3}$$

201[…] James Tanton tanton.math@gmail.com www.jamestanton.com and www.gdaymath.com
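The two claims proved above can also be checked by brute force for the small example 1, 2, 2, 3. The Python sketch below is an illustration rather than part of the essay: the function names are chosen here, exact fractions are used to avoid rounding, and `itertools.combinations` selects by position, so the two 2s are treated as distinct data points, which is the assumed meaning of handling the repeats appropriately.

```python
from fractions import Fraction
from itertools import combinations

def mean(values):
    """Exact mean of a collection of integers."""
    return Fraction(sum(values), len(values))

def variance(values, divisor):
    """Sum of squared deviations from the mean, divided by the given divisor."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / divisor

def average_over_samples(population, size):
    """Average the sample mean, V_n and V_{n-1} over every subset of the given size."""
    samples = list(combinations(population, size))  # repeated values count separately
    count = len(samples)
    avg_mean = sum(mean(s) for s in samples) / count
    avg_v_n = sum(variance(s, size) for s in samples) / count
    avg_v_n_minus_1 = sum(variance(s, size - 1) for s in samples) / count
    return avg_mean, avg_v_n, avg_v_n_minus_1

population = [1, 2, 2, 3]
print(mean(population))                           # 2
print(variance(population, len(population) - 1))  # 2/3
print(average_over_samples(population, 3))        # averages 2, 4/9 and 2/3
print(average_over_samples(population, 2))        # averages 2, 1/3 and 2/3
```

For both sample sizes the average sample mean is 2 and the average of the n − 1 variances is 2/3, matching the full data set, while the average of the divide-by-n variances (4/9 or 1/3) falls short of the population value 1/2.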