Chapter 1 Introduction & 1.1: Analyzing Categorical Data

Introduction
Data Analysis: Making Sense of Data

After this section, you should be able to:
DEFINE individuals and variables
DISTINGUISH between categorical and quantitative variables
DEFINE distribution
DESCRIBE the idea behind inference

What is the Study of Statistics?
Statistics is the science of data. In this course we study four different aspects of statistics:
Data Analysis (Chapters 1 to 3): The process of organizing, displaying, summarizing, and asking questions about data.
Data Collection (Chapter 4): The process of conducting and interpreting surveys and experiments.
Anticipating Patterns/Probability (Chapters 5 to 7): The process of using probability and chance to explain natural phenomena.
Inference (Chapters 8 to 12): The process of making predictions and evaluations about a population from a sample.

Population and Sample
Collect data from a representative sample. Perform data analysis, keeping probability in mind. Make an inference about the population.

Variable
Variable: any characteristic of an individual or object.
Categorical variable: usually an adjective, rarely a number. Examples: gender, race, grade in school (Sophomore, Jr., Sr.), zip code.
Quantitative variable: always a number, and it must make sense to find the mean of the numbers. Examples: weight, height, GPA, number of AP classes taken, square footage.

Distribution
Distribution: describes what values a variable takes and how often it takes those values.
Essentially, "distribution" replaces the words "data" or "graph." For example: "The median of the distribution is 28." "The distribution is skewed left."
[Figure: Dotplot of MPG Distribution]

Organizing a Statistical Problem
The Four Step Process
State: What's the question that you're trying to answer?
Plan: How will you go about answering the question? What statistical techniques does this problem call for?
Do: Make graphs and carry out needed calculations.
Conclude: Give your practical conclusion in the setting of the real world problem.
***Using this method is NOT required; however, all complete answers MUST include the Do and Conclude steps***

Section 1.1 Analyzing Categorical Data
After this section, you should be able to:
CONSTRUCT and INTERPRET bar graphs and pie charts
RECOGNIZE good and bad graphs
CONSTRUCT and INTERPRET two way tables
DESCRIBE relationships between two categorical variables
ORGANIZE statistical problems

Distribution & Categorical Variables
The distribution of a categorical variable lists the count or percent of individuals who fall into each category.

Favorite Course     Count   Percentage
English               8        16%
Foreign Language      4         8%
History              11        22%
Math                 15        30%
Science              12        24%

Displaying Categorical Data
Frequency tables can be difficult to read. Sometimes it is easier to analyze a distribution by displaying it with a bar graph or pie chart.
[Figure: 2014 AP Exam Scores]
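The step from counts to percents in the Favorite Course table can be checked with a few lines of Python. This is only a sketch: the function name to_percents is made up for illustration, and the counts are copied from the table above.

```python
# Counts from the Favorite Course table.
counts = {
    "English": 8,
    "Foreign Language": 4,
    "History": 11,
    "Math": 15,
    "Science": 12,
}

def to_percents(counts):
    """Convert category counts to percents of the total (hypothetical helper)."""
    total = sum(counts.values())
    return {category: 100 * n / total for category, n in counts.items()}

percents = to_percents(counts)
for category, pct in percents.items():
    print(f"{category:<18}{pct:5.1f}%")
```

Because every count is divided by the same total (50 students), the percents must add to 100%, which is a quick way to catch a data-entry mistake.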

Graphs: Good and Bad
Bar graphs compare several quantities by comparing the heights of bars that represent those quantities. Our eyes react to the area of the bars as well as height. Be sure to make your bars equally wide. Avoid the temptation to replace the bars with pictures for greater appeal; this can be misleading!
This ad for DIRECTV has multiple problems. How many can you point out?
[Figure: DIRECTV ad]

Two Way Tables
Two way tables describe two categorical variables, organizing counts according to a row variable and a column variable. When a dataset involves two categorical variables, we begin by examining the counts or percents in various categories for one of the variables.

What proportion of males have a good chance at being rich?
What proportion of females have a 50-50 chance at being rich?
What proportion of young adults that have an almost certain chance of being rich are male?

                        Member of   Member of   Member of 2 or
                        No Clubs    One Club    More Clubs       Total
Rides the School Bus       55          33            20           108
Does not Ride Bus          16          44            82           142
Total                      71          77           102           250

What proportion of students that ride the school bus are members of two or more clubs?
What proportion of students that are members of no clubs do not ride the school bus?
What proportion of students that do not ride the school bus are members of at least one club?

Comparing Categorical Distributions

         Sophomore   Junior   Senior   Total
One          0          0        4        4
Two          1          3       12       16
Three        4          7        6       17
Four         7          4        8       19
Five         2          0        3        5
Total       14         14       33       61

[Figure: bar graph comparing Sophomore, Junior, and Senior across categories One through Five, on a 0% to 100% scale]

Comparing Categorical Distributions
Is there an association between after school club participation and whether or not the student rides the school bus? Support your answer with a discussion of the provided graphs.
[Figure: bar graphs for "Does not Ride Bus" and "Rides the School Bus" showing Member of No Clubs, Member of One Club, and Member of 2 or More Clubs, on a 0% to 100% scale]

Sample Answer: Yes, there is a clear association between after school club participation and transportation. Only 11% of students who don't ride the bus do not participate in after school clubs, whereas 51% of students who do ride the bus do not participate. Similarly, 58% of students who do not ride the bus are involved in 2 or more clubs, while only 19% of students riding the bus are involved in 2 or more clubs. However, the proportion of students who participate in one club is about the same (31%) for students who ride and students who don't ride the bus.

Writing to Compare Categorical Distributions
Cite specific numerical values/proportions.
Use comparison words: greater, smaller, less, while only, more, wider, narrower, etc.
Use transition words: however, whereas, similarly, additionally, etc.
Discuss at least two points of comparison.

1.2: Displaying Quantitative Data with Graphs

Section 1.2 Displaying Quantitative Data with Graphs
After this section, you should be able to:
CONSTRUCT and INTERPRET dotplots, stemplots, and histograms
DESCRIBE the shape of a distribution
COMPARE distributions
USE histograms wisely
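The percents quoted in the sample answer are conditional distributions: each row of the bus/clubs two way table is divided by its own row total. A minimal Python sketch of that calculation follows; the dictionary layout and the name conditional_percents are illustrative choices, not part of the course materials.

```python
# Counts from the bus / clubs two way table (row totals are 108 and 142).
table = {
    "Rides the School Bus": {"No Clubs": 55, "One Club": 33, "2 or More Clubs": 20},
    "Does not Ride Bus": {"No Clubs": 16, "One Club": 44, "2 or More Clubs": 82},
}

def conditional_percents(row):
    """Percent of the row's students in each club category, rounded to whole percents."""
    total = sum(row.values())
    return {category: round(100 * n / total) for category, n in row.items()}

dist = {group: conditional_percents(row) for group, row in table.items()}
for group, row in dist.items():
    print(group, row)
```

Comparing percents rather than raw counts matters here because the two groups have different sizes (108 bus riders versus 142 non-riders).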

Dotplots
Each data value is shown as a dot above its location on a number line.

Number of Goals Scored Per Game by the 2004 US Women's Soccer Team
3 0 2 7 8 2 4 3 5 1 1 4 5 3 1 1 3 3 3 2 1 2 2 2 4 3 5 6 1 5 5 1 1 5

How to Make a Dotplot
1. Draw a horizontal axis (a number line) and label it with the variable name.
2. Scale the axis from the minimum to the maximum value.
3. Mark a dot above the location on the horizontal axis corresponding to each data value.

Describing Shape
When you describe a distribution's shape, concentrate on the main features. Look for rough symmetry or clear skewness.

Shape Definitions
Symmetric: the right and left sides of the graph are approximately mirror images of each other.
Skewed to the right (right skewed): the right side of the graph is much longer than the left side.
Skewed to the left (left skewed): the left side of the graph is much longer than the right side.
[Figure: example graphs labeled Symmetric, Skewed left, and Skewed right]

How to Describe Quantitative Data
In any graph, look for the overall pattern and for striking departures from that pattern. Describe the overall pattern of a distribution by its:
Shape
Outliers
Center
Spread
Don't forget your SOCS!
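The three dotplot steps can be imitated in plain text. The sketch below uses the soccer goals data from this page; it turns the plot on its side (one row per axis value instead of one column) because that is easier to print, and the function name text_dotplot is invented for illustration.

```python
from collections import Counter

# Goals scored per game, copied from the data above (34 games).
goals = [3, 0, 2, 7, 8, 2, 4, 3, 5, 1, 1, 4, 5, 3, 1, 1, 3,
         3, 3, 2, 1, 2, 2, 2, 4, 3, 5, 6, 1, 5, 5, 1, 1, 5]

def text_dotplot(values):
    """Return one text line per axis position, from the minimum to the maximum,
    with one '.' for each observation at that position."""
    counts = Counter(values)
    lines = []
    for v in range(min(values), max(values) + 1):
        lines.append(f"{v:>2} | " + "." * counts[v])
    return lines

for line in text_dotplot(goals):
    print(line)
```

Reading the printed rows from top to bottom, the dots pile up at the low values and thin out toward 7 and 8, which is the long right tail the shape definitions describe.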

Other Ways to Describe Shape
Unimodal
Bimodal
Multimodal

Center
We can describe the center by finding a value that divides the observations so that about half take larger values and about half take smaller values.
Ways to describe center:
Calculate the median (best when the distribution is skewed)
Calculate the mean (best when the distribution is symmetric)

Spread
The spread of a distribution tells us how much variability there is in the data.
Ways to describe spread:
Calculate the range
IQR (coming later)
Standard deviation (coming later)

Outliers
Definition: Values that differ from the overall pattern are outliers. We will learn specific ways to find outliers in a later chapter. For now, we can only identify potential outliers.

Describe the shape, center, and spread of the distribution. Are there any potential outliers? Remember to include CONTEXT!!!

Sample Answer:
Shape: The shape of the distribution is roughly unimodal and skewed left.
Center: The mean is 25.9 mpg and the median is 28 mpg. (Only one measure is needed.)
Spread: The range is 19 mpg.
Outliers: There are two potential outliers/influential values: 14 mpg and 18 mpg.

Stemplots (Stem and Leaf Plots)
Stemplots give us a quick picture of the distribution while including the actual numerical values.
These data represent the responses of 20 female AP Statistics students to the question, "How many pairs of shoes do you have?"
50 26 26 31 57 19 24 22 23 38 13 50 13 34 23 30 49 13 15 51

How to Make a Stemplot
1) Separate each observation into a stem (all but the final digit) and a leaf (the final digit).
2) Write all possible stems from the smallest to the largest in a vertical column and draw a vertical line to the right of the column.
3) Write each leaf in the row to the right of its stem.
4) Arrange the leaves in increasing order out from the stem.
5) Provide a key that explains in context what the stems and leaves represent.

Two Special Types of Stemplots
Split stemplots: best when data values are bunched up. Each stem is split in two, one row for leaves 0-4 and one for leaves 5-9.
Back to back stemplot: compares two distributions of the same quantitative variable.

Back to Back (split stems)
Females        Males
       | 0 | 4
       | 0 | 555677778
   333 | 1 | 0000124
    95 | 1 |
  4332 | 2 | 2
    66 | 2 |
   410 | 3 |
     8 | 3 | 58
       | 4 |
     9 | 4 |
   100 | 5 |
     7 | 5 |
Key: 4|9 represents a student who reported having 49 pairs of shoes.

Histograms
Quantitative variables often take many values. A graph of the distribution may be clearer if nearby values are grouped together. The most common graph of the distribution of one quantitative variable is a histogram.
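The five stemplot steps translate almost directly into code. The following Python sketch builds an ordinary (unsplit) stemplot for the 20 female shoe counts; the function name stemplot is an illustrative choice, and the key still has to be written by hand in context.

```python
from collections import defaultdict

# Pairs of shoes reported by the 20 female students above.
shoes = [50, 26, 26, 31, 57, 19, 24, 22, 23, 38,
         13, 50, 13, 34, 23, 30, 49, 13, 15, 51]

def stemplot(values):
    """Build a stemplot as text lines: stem = all but the final digit, leaf = final digit."""
    leaves = defaultdict(list)
    for v in sorted(values):          # sorting puts the leaves in increasing order
        stem, leaf = divmod(v, 10)
        leaves[stem].append(leaf)
    lines = []
    # List every possible stem from smallest to largest, even if it has no leaves.
    for stem in range(min(leaves), max(leaves) + 1):
        lines.append(f"{stem} | " + "".join(str(leaf) for leaf in leaves[stem]))
    return lines

for line in stemplot(shoes):
    print(line)
print("Key: 4 | 9 represents a student who reported having 49 pairs of shoes.")
```

Splitting each stem into a 0-4 row and a 5-9 row, as in the back to back plot, would only require grouping by (stem, leaf >= 5) instead of by stem alone.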

How to Make a Histogram
1) Divide the range of data into classes of equal width.
2) Find the count (frequency) or percent (relative frequency) of individuals in each class.
3) Label and scale your axes and draw the histogram. The height of each bar equals its frequency (or relative frequency). Adjacent bars should touch, unless a class contains no individuals.

Making a Histogram
Frequency Table
Class         Count
0 to <5        20
5 to <10       13
10 to <15       9
15 to <20       5
20 to <25       2
25 to <30       1
Total          50
[Figure: histogram; horizontal axis: Percent of foreign-born residents; vertical axis: Number of States]

Caution: Using Histograms Wisely
1) Don't confuse histograms and bar graphs.
2) Don't use counts (in a frequency table) or percents (in a relative frequency table) as data.
3) Use percents instead of counts on the vertical axis when comparing distributions with different numbers of observations.
4) Just because a graph looks nice, it's not necessarily a meaningful display of data.

1.3: Describing Quantitative Data with Numbers

Section 1.3 Describing Quantitative Data with Numbers
After this section, you should be able to:
MEASURE center with the mean and median
MEASURE spread with standard deviation and interquartile range
IDENTIFY outliers
CONSTRUCT a boxplot using the five number summary
CALCULATE numerical summaries with technology

Measuring Center: The Mean
To find the mean \bar{x} (pronounced "x bar") of a set of observations, add their values and divide by the number of observations. If the n observations are x_1, x_2, x_3, ..., x_n, their mean is:
\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}
Compact notation:
\bar{x} = \frac{\sum x_i}{n}

Measuring Center: The Median
The median M is the midpoint of a distribution, the number such that half of the observations are smaller and the other half are larger.
To find the median of a distribution:
1) Arrange all observations from smallest to largest.
2) If the number of observations n is odd, the median M is the center observation in the ordered list.
3) If the number of observations n is even, the median M is the average of the two center observations in the ordered list.

Why is the mean more affected by the presence of outliers than the median?

Comparing the Mean and the Median
The mean and median measure center in different ways, and both are useful.
Mean: average value
Median: typical value
Relationship between Mean & Median: The mean and median of a roughly symmetric distribution are close together. If the distribution is exactly symmetric, the mean and median are exactly the same. In a skewed distribution, the mean is usually farther out in the long tail than is the median.

Standard Deviation
Standard deviation is a number used to tell how measurements for a group are spread out from the mean.
A relatively low standard deviation value indicates that the data points tend to be very close to the mean. A relatively high standard deviation value indicates that the data points are spread out over a large range of values.
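The median rule (odd n versus even n) and the question about outliers can both be explored with a short Python sketch. The two small data sets below are made up purely for illustration; they are not from the course.

```python
def median(values):
    """Median by the three-step rule: sort, then take the center observation
    (odd n) or the average of the two center observations (even n)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def mean(values):
    """Add the values and divide by the number of observations."""
    return sum(values) / len(values)

# Hypothetical data: the second list replaces the largest value with an outlier.
typical = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 50]

print(mean(typical), median(typical))
print(mean(with_outlier), median(with_outlier))
```

Every observation enters the sum used for the mean, so one extreme value drags the mean toward it. The median only depends on which value sits in the middle of the ordered list, so changing 5 to 50 leaves it untouched.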

Standard Deviation Formula
The standard deviation s_x measures the average distance of the observations from their mean. It is calculated by finding an average of the squared distances and then taking the square root. This average squared distance is called the variance.
s_x = \sqrt{\frac{1}{n-1} \sum (x_i - \bar{x})^2}

FYI: Why n - 1?!
Applet: http://www.uvm.edu/~dhowell/seeingstatisticsapplets/n-1.html
Proof

How to Calculate Standard Deviation by Hand
1. Calculate the mean.
2. Calculate each deviation: subtract the mean from every actual (observed) value.
3. Square each deviation.
4. Find the average squared deviation by calculating the sum of the squared deviations divided by (n - 1).
5. Finally, calculate the square root of the number calculated in step 4.

Calculate the Standard Deviation
Data: 1 3 4 4 4 5 7 8 9
1) Calculate the mean. Step 1: mean = 5
2) Calculate each deviation. deviation = observation - mean

x_i    (x_i - mean)
1      1 - 5 = -4
3      3 - 5 = -2
4      4 - 5 = -1
4      4 - 5 = -1
4      4 - 5 = -1
5      5 - 5 = 0
7      7 - 5 = 2
8      8 - 5 = 3
9      9 - 5 = 4
       Sum = 0

3) Square each deviation. Step 3: the squared deviations sum to 52.
4) Find the average squared deviation by calculating the sum of the squared deviations divided by (n - 1). Step 4: Average squared deviation = 52/(9 - 1) = 6.5. Variance = 6.5

Calculate the Standard Deviation
5) Calculate the square root of the variance; this is the standard deviation. Step 5: Square root of variance = square root of 6.5. Standard Deviation = 2.55

x_i    (x_i - mean)    (x_i - mean)^2
1      1 - 5 = -4      (-4)^2 = 16
3      3 - 5 = -2      (-2)^2 = 4
4      4 - 5 = -1      (-1)^2 = 1
4      4 - 5 = -1      (-1)^2 = 1
4      4 - 5 = -1      (-1)^2 = 1
5      5 - 5 = 0       (0)^2 = 0
7      7 - 5 = 2       (2)^2 = 4
8      8 - 5 = 3       (3)^2 = 9
9      9 - 5 = 4       (4)^2 = 16
       Sum = 0         Sum = 52

TI Nspire: Calculate standard deviation and mean.
1. Select Lists & Spreadsheet (blue/green button at bottom of home screen).
2. Type the values into list1.
3. With your cursor on the values, press menu.
4. Select 4: Statistics, then 1: Stat Calculations, press enter.
5. Select 1: One Variable Stats.
6. Set screen to: [screenshot not shown] and then press enter.

Two Extreme Examples
In dataset #1, we have five people that report eating 4 pieces of cake and five people that report eating 6 pieces of cake, for a mean of 5 pieces of cake ([4+4+4+4+4+6+6+6+6+6]/10 = 5). Every value is 1 piece away from the mean, so the standard deviation is about 1 (variance about 1).
In dataset #2, we have five people that report eating 0 pieces of cake and five people that report eating 10 pieces of cake, for a mean of 5 pieces of cake ([0+0+0+0+0+10+10+10+10+10]/10 = 5). Every value is 5 pieces away from the mean, so the standard deviation is about 5 (variance about 25).

Below are dotplots of three different distributions, A, B, and C. Which one has the largest standard deviation? Justify your answer.
[Figure: dotplots of distributions A, B, and C]
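The five hand-calculation steps can be followed one line at a time in Python. This sketch uses the nine data values from the worked example; the function name sample_std is made up here, and it divides by n - 1 as the steps above do.

```python
import math

# The nine observations from the worked example.
data = [1, 3, 4, 4, 4, 5, 7, 8, 9]

def sample_std(values):
    """Follow the five by-hand steps and return (mean, variance, standard deviation)."""
    n = len(values)
    mean = sum(values) / n                      # step 1: mean
    deviations = [x - mean for x in values]     # step 2: observation - mean
    squared = [d ** 2 for d in deviations]      # step 3: square each deviation
    variance = sum(squared) / (n - 1)           # step 4: divide the sum by n - 1
    return mean, variance, math.sqrt(variance)  # step 5: square root

mean, variance, sd = sample_std(data)
print(mean, variance, round(sd, 2))
```

The deviations always sum to 0, which is why they are squared before averaging; otherwise the positive and negative distances would cancel.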

Interquartile Range (IQR)
To calculate:
1) Arrange the observations in increasing order and locate the median M.
2) The first quartile Q1 is the median of the observations located to the left of the median in the ordered list.
3) The third quartile Q3 is the median of the observations located to the right of the median in the ordered list.
The interquartile range (IQR) is defined as:
IQR = Q3 - Q1

Find and Interpret the IQR
Travel times to work for 20 randomly selected New Yorkers:
10 30 5 25 40 20 10 15 30 20 15 20 85 15 65 15 60 60 40 45
In increasing order:
5 10 10 15 15 15 15 20 20 20 25 30 30 40 40 45 60 60 65 85
Q1 = 15, M = 22.5, Q3 = 42.5
IQR = Q3 - Q1 = 42.5 - 15 = 27.5 minutes
Interpretation: The range of the middle half of travel times for the New Yorkers in the sample is 27.5 minutes.

Identifying Outliers
In addition to serving as a measure of spread, the interquartile range (IQR) is used as part of a rule of thumb for identifying outliers.
1.5 x IQR Rule for Outliers: Call an observation an outlier if it falls more than 1.5 x IQR above the third quartile or below the first quartile.

In the New York travel time data, we found Q1 = 15 minutes, Q3 = 42.5 minutes, and IQR = 27.5 minutes. Calculate the outlier cutoffs using the IQR rule.
For these data, 1.5 x IQR = 1.5(27.5) = 41.25
Q1 - 1.5 x IQR = 15 - 41.25 = -26.25
Q3 + 1.5 x IQR = 42.5 + 41.25 = 83.75
Any travel time shorter than -26.25 minutes or longer than 83.75 minutes is considered an outlier. No travel time can be negative, so only the 85-minute commute is flagged.
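A short Python sketch can reproduce the quartiles and the 1.5 x IQR cutoffs for the New York travel times. It uses the quartile rule described on this page (the median of each half of the ordered list, leaving out the overall median when n is odd); other software may use slightly different quartile rules. The function names are illustrative.

```python
# Travel times (minutes) for the 20 New Yorkers above.
times = [10, 30, 5, 25, 40, 20, 10, 15, 30, 20,
         15, 20, 85, 15, 65, 15, 60, 60, 40, 45]

def median(ordered):
    """Median of a list that is already sorted."""
    n = len(ordered)
    mid = n // 2
    return ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2

def quartiles(values):
    """Return (Q1, M, Q3) using the median of each half of the ordered list."""
    ordered = sorted(values)
    n = len(ordered)
    lower = ordered[: n // 2]          # observations to the left of the median
    upper = ordered[(n + 1) // 2 :]    # observations to the right of the median
    return median(lower), median(ordered), median(upper)

q1, m, q3 = quartiles(times)
iqr = q3 - q1
low_cut = q1 - 1.5 * iqr
high_cut = q3 + 1.5 * iqr
outliers = [t for t in times if t < low_cut or t > high_cut]

print(q1, m, q3, iqr)
print(low_cut, high_cut, outliers)
```

The lower cutoff comes out negative, so no short commute can be an outlier in this sample; only the upper cutoff does any work.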

The Five Number Summary
The five number summary of a distribution consists of the smallest observation, the first quartile, the median, the third quartile, and the largest observation, written in order from smallest to largest.
Minimum   Q1   M   Q3   Maximum

TI Nspire: 5 Number Summary
1. Select Lists & Spreadsheet (bottom of home screen).
2. Type the values into list1.
3. With your cursor on the values, press menu.
4. Select 4: Statistics, then 1: Stat Calculations, press enter.
5. Select 1: One Variable Stats.
6. Set screen to: [screenshot not shown] and then press enter.
7. Scroll down to see the 5 number summary.

Boxplots (Box and Whisker Plots)
Draw and label a number line that includes the range of the distribution.
Draw a central box from Q1 to Q3.
Note the median M inside the box.
Extend lines (whiskers) from the box out to the minimum and maximum values that are not outliers.

Construct a Boxplot
Using our NY travel times data, construct a boxplot.
10 30 5 25 40 20 10 15 30 20 15 20 85 15 65 15 60 60 40 45
In increasing order:
5 10 10 15 15 15 15 20 20 20 25 30 30 40 40 45 60 60 65 85
Min = 5, Q1 = 15, M = 22.5, Q3 = 42.5, Max = 85
Recall, the maximum (85) is an outlier by the 1.5 x IQR rule.
[Figure: boxplot of NY travel times]

Choosing Best Measures of Center & Spread
                           Symmetric Distribution    Skewed Distribution
Best Measure of Center
Best Measure of Spread

Chapter 2
2.1: Describing Location in a Distribution

Section 2.1 Describing Location in a Distribution
After this section, you should be able to:
MEASURE position using percentiles
INTERPRET cumulative relative frequency graphs
TRANSFORM data
DEFINE and DESCRIBE density curves

Measuring Position: Percentiles
One way to describe the location of a value in a distribution is to tell what percent of observations are less than it. The p-th percentile of a distribution is the value with p percent of the observations less than it.

Jenny earned a score of 86 on her test. How did she perform relative to the rest of the class? What percentile is she ranked in?
6 | 7
7 | 2334
7 | 5777899
8 | 00123334
8 | 569
9 | 03
Her score was greater than 21 of the 25 observations. Since 21 of the 25, or 84%, of the scores are below hers, Jenny is at the 84th percentile in the class's test score distribution.

Cumulative Relative Frequency Graphs
A cumulative relative frequency graph displays the cumulative relative frequency of each class of a frequency distribution.
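Jenny's percentile can be checked by counting how many scores fall below hers. The Python sketch below reads the 25 test scores off the stemplot; the function name percentile_of is an illustrative choice, and it uses the "percent of observations less than the value" definition given on this page.

```python
# The 25 test scores read from the stemplot (6|7 means 67, and so on).
scores = [67, 72, 73, 73, 74, 75, 77, 77, 77, 78, 79, 79,
          80, 80, 81, 82, 83, 83, 83, 84, 85, 86, 89, 90, 93]

def percentile_of(value, values):
    """Percent of observations strictly less than the given value."""
    below = sum(1 for v in values if v < value)
    return 100 * below / len(values)

print(percentile_of(86, scores))
```

Some textbooks count "less than or equal to" instead, which would give a slightly different answer; this course's definition uses strictly less than.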

Age of First 44 Presidents When They Were Inaugurated

Age      Frequency   Relative frequency   Cumulative frequency   Cumulative relative frequency
40-44        2         2/44 = 4.5%               2                  2/44 = 4.5%
45-49        7         7/44 = 15.9%              9                  9/44 = 20.5%
50-54       13        13/44 = 29.5%             22                 22/44 = 50.0%
55-59       12        12/44 = 27.3%             34                 34/44 = 77.3%
60-64        7         7/44 = 15.9%             41                 41/44 = 93.2%
65-69        3         3/44 = 6.8%              44                 44/44 = 100%

[Figure: cumulative relative frequency graph, Age of Presidents When Inaugurated; marked values: age 47 at about 11%, and 65% at about age 58]
1. Was Barack Obama, who was inaugurated at age 47, unusually young?
2. Estimate and interpret the 65th percentile of the distribution.

Transforming Data
Transforming converts the original observations from the original units of measurement to another scale. Some transformations can affect the shape, center, and spread of a distribution.

Transforming Data: Add/Sub a Constant
Adding the same number a (either positive, zero, or negative) to each observation:
adds a to measures of center and location (mean, median, quartiles, percentiles), but
does not change the shape of the distribution or measures of spread (range, IQR, standard deviation).

                  n     Mean    s_x    Min   Q1   M    Q3   Max   IQR   Range
Guess (m)        44    16.02   7.14     8    11   15   17    40    6     32
Error (m - 13)   44     3.02   7.14    -5    -2    2    4    27    6     32
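The cumulative columns of the presidents table are running totals of the frequency column. A minimal Python sketch, using the class frequencies from the table (the variable names are illustrative):

```python
from itertools import accumulate

# Age classes and frequencies for the first 44 presidents, from the table.
classes = ["40-44", "45-49", "50-54", "55-59", "60-64", "65-69"]
freq = [2, 7, 13, 12, 7, 3]

total = sum(freq)
cum_freq = list(accumulate(freq))                          # running totals
cum_rel = [round(100 * c / total, 1) for c in cum_freq]    # as percents of 44

for age, f, c, r in zip(classes, freq, cum_freq, cum_rel):
    print(f"{age}  {f:>3}  {c:>3}  {r:>5}%")
```

The last cumulative relative frequency must be 100%, and each value gives the percent of presidents whose inauguration age falls in or below that class, which is what the cumulative relative frequency graph plots.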

Transforming Data: Multiplying/Dividing
Multiplying (or dividing) each observation by the same number b (positive, negative, or zero):
multiplies (divides) measures of center and location by b
multiplies (divides) measures of spread by |b|
does not change the shape of the distribution

Transforming Data
Change data between meters and feet (1 meter is about 3.28 feet).

                  n     Mean    s_x     Min      Q1      M      Q3      Max     IQR     Range
Error (feet)     44     9.91   23.43   -16.4   -6.56   6.56   13.12   88.56   19.68   104.96
Error (meters)   44     3.02    7.14    -5      -2      2      4      27       6       32

Density Curve
A density curve:
is always on or above the horizontal axis, and
has area exactly 1 underneath it.
A density curve describes the overall pattern of a distribution. The area under the curve and above any interval of values on the horizontal axis is the proportion of all observations that fall in that interval.
The overall pattern of this histogram of the scores of all 947 seventh grade students in Gary, Indiana, on the vocabulary part of the Iowa Test of Basic Skills (ITBS) can be described by a smooth curve drawn through the tops of the bars.
[Figure: histogram of ITBS vocabulary scores with a smooth density curve]

Describing Density Curves
The median and the mean are the same for a symmetric density curve. They both lie at the center of the curve.
The mean of a skewed curve is pulled away from the median in the direction of the long tail.

2.2: Normal Distributions

Section 2.2 Normal Distributions
After this section, you should be able to:
DESCRIBE and APPLY the 68-95-99.7 Rule
DESCRIBE the standard Normal distribution
PERFORM Normal distribution calculations
ASSESS Normality
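The two transformation rules (adding a constant, multiplying by a constant) can be tried out on a small list. In the Python sketch below, the five guesses are illustrative values only (they happen to match the five number summary of the guesses, not the full data set of 44), and 3.28 feet per meter is the conversion factor implied by the summary table.

```python
import statistics

# Illustrative values only, not the full class data set.
guesses_m = [8, 11, 15, 17, 40]

errors_m = [g - 13 for g in guesses_m]      # subtract the constant 13
errors_ft = [3.28 * e for e in errors_m]    # multiply by the constant 3.28

def center_and_spread(values):
    """Return (median, range, standard deviation) of the values."""
    return (statistics.median(values),
            max(values) - min(values),
            statistics.stdev(values))

print(center_and_spread(guesses_m))
print(center_and_spread(errors_m))    # median shifts by -13; range and s unchanged
print(center_and_spread(errors_ft))   # median, range, and s all multiplied by 3.28
```

Subtracting 13 slides every point the same distance, so the distances between points (the spread) cannot change. Multiplying by 3.28 stretches those distances, so both center and spread scale.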

Normal Distributions
All Normal curves are symmetric, single peaked, and bell-shaped.
A specific Normal curve is described by giving its mean µ and standard deviation σ.
[Figure: two Normal curves, showing the mean µ and standard deviation σ]

Normal Distributions
We abbreviate the Normal distribution with mean µ and standard deviation σ as N(µ,σ).
Any particular Normal distribution is completely specified by two numbers: its mean µ and standard deviation σ.
The mean of a Normal distribution is the center of the symmetric Normal curve.
The standard deviation is the distance from the center to the change of curvature points on either side.

Normal Distributions are Useful
Normal distributions are good descriptions for some distributions of real data.
Normal distributions are good approximations of the results of many kinds of chance outcomes.
Many statistical inference procedures are based on Normal distributions.

The 68-95-99.7 Rule
Although there are many different sizes and shapes of Normal curves, they all have properties in common.
The 68-95-99.7 Rule ("The Empirical Rule"): In the Normal distribution with mean µ and standard deviation σ:
Approximately 68% of the observations fall within σ of µ.
Approximately 95% of the observations fall within 2σ of µ.
Approximately 99.7% of the observations fall within 3σ of µ.

The distribution of Iowa Test of Basic Skills (ITBS) vocabulary scores for 7th grade students in Gary, Indiana, is close to Normal. Suppose the distribution is N(6.84, 1.55) and the range is between 0 and 12.
a) Sketch the Normal density curve for this distribution.
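The three percentages in the 68-95-99.7 Rule are rounded areas under the Normal curve, and they can be checked numerically. The sketch below uses NormalDist from Python's standard statistics module (available in Python 3.8 and later); the function name within is made up for illustration.

```python
from statistics import NormalDist

def within(k):
    """Proportion of a Normal distribution within k standard deviations of the mean."""
    z = NormalDist()  # standard Normal: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {100 * within(k):.1f}%")
```

Because every Normal distribution has the same shape once measured in standard deviations from the mean, checking the rule on the standard Normal curve is enough to cover all of them.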

The distribution of Iowa Test of Basic Skills (ITBS) vocabulary scores for 7th grade students in Gary, Indiana, is close to Normal. Suppose the distribution is N(6.84, 1.55) and the range is between 0 and 12.
a) Sketch the Normal density curve for this distribution.
b) Using the Empirical Rule, what percent of ITBS vocabulary scores are less than 3.74?
c) Using the Empirical Rule, what percent of the scores are between 5.29 and 9.94?

Importance of Standardizing
There are infinitely many different Normal distributions, each with its own mean and standard deviation.
In order to more effectively compare different Normal distributions, we standardize.
Standardizing allows us to compare apples to apples.
We can compare SAT and ACT scores by standardizing.
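Parts b) and c) hinge on noticing how many standard deviations each score lies from the mean. One way to organize that reasoning is sketched below in Python for N(6.84, 1.55); the function name z_score is an illustrative choice, and the percents come straight from the 68-95-99.7 Rule rather than from a table.

```python
mu, sigma = 6.84, 1.55

def z_score(x, mu, sigma):
    """Number of standard deviations x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

# b) 3.74 is 2 standard deviations below the mean.
z_b = z_score(3.74, mu, sigma)
pct_below_b = (100 - 95) / 2        # half of the 5% outside 2 sigma

# c) 5.29 is 1 standard deviation below the mean; 9.94 is 2 above it.
z_low = z_score(5.29, mu, sigma)
z_high = z_score(9.94, mu, sigma)
pct_between_c = 68 / 2 + 95 / 2     # half of 68% on the left plus half of 95% on the right

print(round(z_b, 2), pct_below_b)
print(round(z_low, 2), round(z_high, 2), pct_between_c)
```

Splitting the 68% and 95% regions in half works only because the Normal curve is symmetric about its mean.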

The Standardized Normal Distribution
All Normal distributions are the same if we measure in units of size σ from the mean µ as center.
The standardized Normal distribution is the Normal distribution with mean 0 and standard deviation 1.

How to Standardize a Variable:
1. Draw and label a Normal curve with the mean and standard deviation.
2. Calculate the z score:
z = \frac{x - \mu}{\sigma}
where x = variable, µ = mean, σ = standard deviation.
3. Determine the p value by looking up the z score in the Standard Normal table.
4. Conclude in context.

The Standard Normal Table
Because all Normal distributions are the same when we standardize, we can find areas under any Normal curve from a single table.
The Standard Normal table is a table of the areas under the standard Normal curve. The table entry for each value z is the area under the curve to the LEFT of z.
In these notes, the area to the left is called the p value. It can be read as a probability or as a percent.

Using the Standard Normal Table
Row: ones and tenths digits
Column: hundredths digit
Practice: What is the p value for a z score of 2.33?

Using the Standard Normal Table
Using the Standard Normal Table, find the following:
Z Score    P value
2.23
1.65
.52
           .79
           .23

Let's Practice
In the 2008 Wimbledon tennis tournament, Rafael Nadal averaged 115 miles per hour (mph) on his first serves. Assume that the distribution of his first serve speeds is Normal with a mean of 115 mph and a standard deviation of 6.2 mph. About what proportion of his first serves would you expect to be less than 120 mph? Greater than 120 mph?
1. Draw and label a Normal curve with the mean and standard deviation.
2. Calculate the z score (x = variable, µ = mean, σ = standard deviation):
z = \frac{x - \mu}{\sigma} = \frac{120 - 115}{6.2} \approx 0.81
3. Determine the p value by looking up the z score in the Standard Normal table. P(z < 0.81) = .7910

Z      .00     .01     .02
0.7   .7580   .7611   .7642
0.8   .7881   .7910   .7939
0.9   .8159   .8186   .8212

4. Conclude in context. We expect that 79.1% of Nadal's first serves will be less than 120 mph. We expect that 20.9% of Nadal's first serves will be greater than 120 mph.

Let's Practice
When Tiger Woods hits his driver, the distance the ball travels can be described by N(304, 8). What percent of Tiger's drives travel between 305 and 325 yards?
Step 1: Draw Distribution
Step 2: Z Scores
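The Nadal calculation (standardize, then look up the area to the left) can be mirrored in Python. In this sketch, NormalDist().cdf from the standard statistics module stands in for the Standard Normal table; rounding z to two decimals is done only to match what a table lookup would use.

```python
from statistics import NormalDist

mu, sigma = 115, 6.2   # Nadal's first-serve speeds, in mph

# Step 2: standardize 120 mph, rounded to two decimals as for a table lookup.
z = round((120 - mu) / sigma, 2)

# Step 3: area to the LEFT of z under the standard Normal curve.
p_less = NormalDist().cdf(z)
p_greater = 1 - p_less      # total area under the curve is 1

print(z, round(p_less, 4), round(p_greater, 4))
```

The "greater than" answer needs no second lookup: the two areas are complements, so they must add to 1.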

Step 3: P values
Using Table A, we can find the area to the left of z = 2.63 and the area to the left of z = 0.13.
0.9957 - 0.5517 = 0.4440

Step 4: Conclude In Context
About 44% of Tiger's drives travel between 305 and 325 yards.

Normal Calculations on Calculator
NormalCDF
Calculates: probability of obtaining a value BETWEEN two values.
Examples: What percent of students scored between 70 and 95 on the test? Tommy scored a 92 on the test; what proportion of students did he score better than?
NormalPDF
Calculates: the height of the Normal density curve at one specific x value. This height is not a probability; for a continuous variable, the probability of obtaining EXACTLY one specific value is 0.
Example: the height of the curve at a test score of 75.
InvNorm
Calculates: the x value for a given probability or percentile.
Example: Suzy scored at the 25th percentile on the test; what score did she earn?

TI Nspire: NormalCDF
Normalcdf: area under the curve between two points
1. Select Calculator (on home screen), press center button.
2. Press menu, press enter.
3. Select 6: Statistics, press enter.
4. Select 5: Distributions, press enter.
5. Select 2: Normal Cdf, press enter.
6. Enter the following information:
   1. Lower: (the lower bound of the region OR a very large negative number such as -1E99)
   2. Upper: (the upper bound of the region OR a very large number such as 1,000,000)
   3. µ: (mean)
   4. σ: (standard deviation)
7. Press enter; the number that appears is the p value.

TI Nspire: NormalPDF
Normalpdf: height of the density curve at a specific x value
1. Select Calculator (on home screen), press center button.
2. Press menu, press enter.
3. Select 6: Statistics, press enter.
4. Select 5: Distributions, press enter.
5. Select 1: Normal Pdf, press enter.
6. Enter the following information:
   1. X value (not a percent)
   2. µ: (mean)
   3. σ: (standard deviation)
7. Press enter; the number that appears is the height of the curve at that x value.

TI Nspire: InvNorm
invnorm: exact x value at which a given percentile occurs
1. Select Calculator (on home screen), press center button.
2. Press menu, press enter.
3. Select 6: Statistics, press enter.
4. Select 5: Distributions, press enter.
5. Select 3: Inverse Norm, press enter.
6. Enter the following information:
   1. Area (enter as a decimal)
   2. µ: (mean)
   3. σ: (standard deviation)
7. Press enter; the number that appears is the x value.

When Tiger Woods hits his driver, the distance the ball travels can be described by N(304, 8). What percent of Tiger's drives travel between 305 and 325 yards?
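The Tiger Woods question is a "between two values" problem, which is what Normalcdf computes. A rough Python equivalent, using NormalDist from the standard statistics module in place of the calculator, is sketched below.

```python
from statistics import NormalDist

# Driving distance in yards, modeled as N(304, 8).
drive = NormalDist(mu=304, sigma=8)

# Area between 305 and 325 = (area left of 325) - (area left of 305).
p_between = drive.cdf(325) - drive.cdf(305)

print(round(p_between, 4))
```

The result is close to, but not exactly, the 0.4440 found with Table A. The table method rounds the z scores to 0.13 and 2.63 before looking them up, while software works with the unrounded values 0.125 and 2.625. Either way, about 44% to 45% of drives land between 305 and 325 yards.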

When Can I Use Normal Calculations?!
Whenever the distribution is Normal.
Ways to Assess Normality:
Plot the data. Make a dotplot, stemplot, or histogram and see if the graph is approximately symmetric and bell-shaped.
Check whether the data follow the 68-95-99.7 rule.
Construct a Normal probability plot.

Suzy bombed her recent AP Stats exam; she scored at the 25th percentile. The class average was a 170 with a standard deviation of 30. Assuming the scores are Normally distributed, what score did Suzy earn on the exam?

Normal Probability Plot
These plots are constructed by plotting each observation in a data set against its corresponding percentile's z score.

Interpreting Normal Probability Plot
If the points on a Normal probability plot lie close to a straight line, the plot indicates that the data are Normal.
Systematic deviations from a straight line indicate a non-Normal distribution.
Outliers appear as points that are far away from the overall pattern of the plot.
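Suzy's question runs the Normal calculation backwards: the percentile is known and the score is wanted, which is the job of InvNorm on the calculator. A sketch of the same idea in Python uses the inv_cdf method of NormalDist from the standard statistics module.

```python
from statistics import NormalDist

# Exam scores modeled as Normal with mean 170 and standard deviation 30.
exam = NormalDist(mu=170, sigma=30)

# The 25th percentile is the score with area 0.25 to its left.
suzy_score = exam.inv_cdf(0.25)

print(round(suzy_score, 1))
```

A quick sanity check on any answer: the 25th percentile lies below the mean, so Suzy's score has to come out below 170.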

Summary: Normal Distributions
The Normal distributions are described by a special family of bell-shaped, symmetric density curves called Normal curves. The mean µ and standard deviation σ completely specify a Normal distribution N(µ, σ). The mean is the center of the curve, and σ is the distance from µ to the change-of-curvature points on either side.
All Normal distributions obey the 68-95-99.7 Rule, which describes what percent of observations lie within one, two, and three standard deviations of the mean.
All Normal distributions are the same when measurements are standardized. The standard Normal distribution has mean µ = 0 and standard deviation σ = 1.
Table A gives percentiles for the standard Normal curve. By standardizing, we can use Table A to determine the percentile for a given z score or the z score corresponding to a given percentile in any Normal distribution.
To assess Normality for a given set of data, we first observe its shape. We then check how well the data fit the 68-95-99.7 rule.

Additional Help: Finding Areas Under the Standard Normal Curve
Find the proportion of observations from the standard Normal distribution that are between -1.25 and 0.81.
Step 1: Look up the area to the left of 0.81 using Table A.
Step 2: Find the area to the left of -1.25.
Step 3: Subtract.
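Both the 68-95-99.7 rule and the three-step area calculation can be checked numerically. The sketch below assumes Python's statistics.NormalDist in place of Table A and takes the bounds of the area example as z = -1.25 and z = 0.81.

```python
from statistics import NormalDist

z = NormalDist()  # standard Normal: mean 0, standard deviation 1

# Share of observations within 1, 2, and 3 standard deviations of the mean.
within = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

# Step 1: area to the left of 0.81. Step 2: area to the left of -1.25.
# Step 3: subtract.
area = z.cdf(0.81) - z.cdf(-1.25)

for k, share in within.items():
    print(k, round(share, 4))
print(round(area, 4))
```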

Chapter 3
3.1: Scatterplots & Correlation

Section 3.1 Scatterplots and Correlation
After this section, you should be able to
IDENTIFY explanatory and response variables
CONSTRUCT scatterplots to display relationships
INTERPRET scatterplots
MEASURE linear association using correlation
INTERPRET correlation

Scatterplots
A scatterplot shows the relationship between two quantitative variables measured on the same individuals. The values of one variable appear on the horizontal axis, and the values of the other variable appear on the vertical axis. Each individual in the data appears as a point on the graph.

Explanatory & Response Variables
Explanatory Variables (Independent Variables): Car weight; Number of cigarettes smoked; Number of hours studied
Response Variables (Dependent Variables): Accident death rate; Life expectancy; SAT scores

Scatterplots
1. Decide which variable should go on each axis. Remember, the explanatory variable goes on the X axis!
2. Label and scale your axes.
3. Plot individual data values.

Scatterplots
Make a scatterplot of the relationship between body weight and pack weight. Body weight is our explanatory variable.
Body weight (lb):      120  187  109  103  131  165  158  116
Backpack weight (lb):   26   30   26   24   29   35   31   28

Constructing a Scatterplot: TI Nspire
1. Enter x values into list 1 and enter y values into list 2.
2. Label each column. Label column x: weight and column y: bpack.
3. Press HOME/On, click Add Data & Statistics.
4. Move the cursor to the bottom of the screen and click to add variable. Select weight.
5. Move the cursor to the left of the screen and click to add variable. Select bpack.

Describing Scatterplots
As in any graph of data, look for the overall pattern and for striking departures from that pattern. You can describe the overall pattern of a scatterplot by the direction, form, and strength of the relationship. An important kind of departure is an outlier, an individual value that falls outside the overall pattern of the relationship. Also look for clustering.

Words That Describe
Direction (slope): Positive or Negative
Form: Linear, quadratic, cubic, exponential, curved, nonlinear, etc.
Strength: Strong, weak, somewhat strong, very weak, moderately strong, etc.

More on Strength
Strength refers to how tightly grouped the points are in a particular pattern. Later on we describe strength using correlation.

Interpreting a Scatterplot
Interpret: tell what the data suggest in real-world terms.
Example: The data suggest that the more hours a student studied for Mrs. Daniel's AP Stats test, the higher the grade the student earned. There is a positive relationship between hours studied and grade earned.

Describe this Scatterplot
Describe and interpret the scatterplot below. The y axis refers to backpack weight in pounds and the x axis refers to body weight in pounds.
Sample Answer: There is a moderately strong, positive, linear relationship between body weight and pack weight. There is one possible outlier: the hiker with the body weight of 187 pounds seems to be carrying relatively less weight than the other group members. It appears that lighter students are carrying lighter backpacks.

Describe and interpret the scatterplot below. The y axis refers to a state's mean SAT math score. The x axis refers to the percentage of students in a state taking the SAT.
Sample Answer: There is a moderately strong, negative, curved relationship between the percent of students in a state who take the SAT and the mean SAT math score. Further, there are two distinct clusters of states and at least one possible outlier that falls outside the overall pattern.

Scatterplots and Correlation

What is Correlation?
A mathematical value that describes the strength of a linear relationship between two quantitative variables.
Correlation values are between -1 and 1.
Correlation is abbreviated: r
The strength of the linear relationship increases as r moves away from 0 towards -1 or 1.

What does r tell us?!
Correlation describes the direction and strength of the linear relationship between x and y. Notice that the formula sums the products of the z scores of x and the z scores of y.

What does r mean?
r value   Strength
-1        Perfectly linear; negative
-0.75     Strong negative relationship
-0.50     Moderately strong negative relationship
-0.25     Weak negative relationship
0         Nonexistent
0.25      Weak positive relationship
0.50      Moderately strong positive relationship
0.75      Strong positive relationship
1         Perfectly linear; positive

How strong is the correlation? Is it positive or negative?
[Figure: four scatterplots labeled 0.235, 0.456, 0.975, 0.784]

Estimate the Correlation Coefficient
Describe and interpret the scatterplot below. Be sure to estimate the correlation.
Sample Answer: As the number of predicted storms increases, so does the number of observed storms, but the relationship is weak. The relationship evidenced in the scatterplot is a fairly weak positive linear relationship. The estimated correlation is approximately r = 0.25.
**Answers between 0.15 and 0.45 would be acceptable.

Describe and interpret the scatterplot below. Be sure to estimate the correlation.
Sample Answer: As the number of boats registered in Florida increases, so does the number of manatees killed by boats. This relationship is evidenced in the scatterplot by a strong, positive linear relationship. The estimated correlation is approximately r = 0.85.
**Answers between 0.75 and 0.95 would be acceptable.

Facts about Correlation
1. Correlation requires that both variables be quantitative.
2. Correlation does not describe curved relationships between variables, no matter how strong the relationship is.
3. Correlation is not resistant. r is strongly affected by a few outlying observations.
4. Correlation makes no distinction between explanatory and response variables.
5. r does not change when we change the units of measurement of x, y, or both.
6. r does not change when we add or subtract a constant to either x, y or both.
7. The correlation r itself has no unit of measurement.

Calculate Correlation: TI Nspire
1. Enter x values in list 1 and y values in list 2.
2. Press MENU, then 4: Statistics
3. Option 1: Stat Calculations
4. Option 3: Linear Regression (mx + b)
5. X: a[], Y: b[], ENTER
6. Correlation = r
Correlation should be 0.79

Find the Correlation
Height in feet:     5.5   6.0   5.25  6.25  5.75  6.0   5.75  5.5   5.75
Weight in pounds:   150   180   138   191   172   181   168   148   172

r: Ignores Distinctions Between X & Y

r: Highly Affected By Outliers
r = 0.97

Why?!
Since r is calculated using standardized values (z scores), the correlation value will not change if the units of measure are changed (feet to inches, etc.). Adding a constant to either x or y or both will not change the correlation because neither the standard deviation nor the distance from the mean will be impacted.

Correlation Formula
Suppose that we have data on variables x and y for n individuals. The values for the first individual are x_1 and y_1, the values for the second individual are x_2 and y_2, and so on. The means and standard deviations of the two variables are x-bar and s_x for the x values and y-bar and s_y for the y values. The correlation r between x and y is:
$$r = \frac{1}{n-1} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s_x} \right) \left( \frac{y_i - \bar{y}}{s_y} \right)$$

3.2: Least Squares Regression

Section 3.2 Least Squares Regression
After this section, you should be able to
INTERPRET a regression line
CALCULATE the equation of the least squares regression line
CALCULATE residuals
CONSTRUCT and INTERPRET residual plots
DETERMINE how well a line fits observed data
INTERPRET computer regression output

Regression Lines
A regression line summarizes the relationship between two variables, but only in settings where one of the variables helps explain or predict the other.
A regression line is a line that describes how a response variable y changes as an explanatory variable x changes. We often use a regression line to predict the value of y for a given value of x.

Regression Lines
Regression lines are used to conduct analysis.
Colleges use students' SAT scores and GPAs to predict college success.
Professional sports teams use players' vital stats (40 yard dash, height, weight) to predict success.
The Federal Reserve uses economic data (GDP, unemployment, etc.) to predict future economic trends.
Macy's uses shipping, sales and inventory data to predict future sales.
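The correlation formula from Section 3.1 (the average product of z scores, dividing by n - 1) can be sketched directly in Python. The data are the body weight and backpack weight values from the scatterplot example; the function name and the unit conversion check are illustrative additions, not part of the slides.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """r = (1 / (n - 1)) * sum of (z score of x) * (z score of y)."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    products = (((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys))
    return sum(products) / (n - 1)

body_lb = [120, 187, 109, 103, 131, 165, 158, 116]
pack_lb = [26, 30, 26, 24, 29, 35, 31, 28]

r = correlation(body_lb, pack_lb)

# Changing units (lb to kg) or adding a constant should leave r unchanged.
body_kg = [w * 0.4536 for w in body_lb]
pack_shifted = [w + 5 for w in pack_lb]
r_rescaled = correlation(body_kg, pack_shifted)

print(round(r, 3), round(r_rescaled, 3))
```

Because every value is converted to a z score first, rescaling or shifting either variable cancels out, which is the reasoning given under "Why?!".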

Regression Line Equation
Suppose that y is a response variable (plotted on the vertical axis) and x is an explanatory variable (plotted on the horizontal axis). A regression line relating y to x has an equation of the form:
ŷ = ax + b
In this equation,
ŷ (read "y hat") is the predicted value of the response variable y for a given value of the explanatory variable x.
a is the slope, the amount by which y is predicted to change when x increases by one unit.
b is the y intercept, the predicted value of y when x = 0.

Format of Regression Lines
Format 1: ŷ = 0.0908x + 16.3
ŷ = predicted backpack weight
x = student's weight
Format 2: Predicted backpack weight = 16.3 + 0.0908(student's weight)

Interpreting Linear Regression
Y intercept: A student weighing zero pounds is predicted to have a backpack weight of 16.3 pounds (no practical interpretation).
Slope: For each additional pound that the student weighs, it is predicted that their backpack will weigh an additional 0.0908 pounds, on average.

Interpreting Linear Regression
Interpret the y intercept and slope values in context. Is there any practical interpretation?
ŷ = 37x + 270
x = hours studied for the SAT
ŷ = predicted SAT Math score
Y intercept: If a student studies for zero hours, then the student's predicted SAT score is 270 points. This makes sense.
Slope: For each additional hour the student studies, his/her score is predicted to increase 37 points, on average. This makes sense.

Predicted Value
What is the predicted SAT Math score for a student who studies 12 hours?
ŷ = 37x + 270
x = hours studied for the SAT
ŷ = predicted SAT Math score
ŷ = 37(12) + 270
Predicted score: 714 points

Self Check Quiz: Calculate the Regression Equation
A crazy professor believes that a child with IQ 100 should have a reading test score of 50, and that reading score should increase by 1 point for every additional point of IQ. What is the equation of the professor's regression line for predicting reading score from IQ? Be sure to identify all variables used.
Answer: ŷ = 50 + x
ŷ = predicted reading score
x = number of IQ points above 100

Self Check Quiz: Interpreting Regression Lines & Predicted Value
Data on the IQ test scores and reading test scores for a group of fifth grade children resulted in the following regression line:
predicted reading score = -33.4 + 0.882(IQ score)
(a) What's the slope of this line? Interpret this value in context.
(b) What's the y intercept? Explain why the value of the intercept is not statistically meaningful.
(c) Find the predicted reading scores for two children with IQ scores of 90 and 130, respectively.

predicted reading score = -33.4 + 0.882(IQ score)
(a) Slope = 0.882. For each 1 point increase of IQ score, the reading score is predicted to increase 0.882 points, on average.
(b) Y intercept = -33.4. If the student has an IQ of zero, which is essentially impossible (would not be able to hold a pencil to take the exam), the predicted score would be -33.4. This has no practical interpretation.
(c) Predicted values:
90: -33.4 + 0.882(90) = 45.98 points
130: -33.4 + 0.882(130) = 81.26 points

Least Squares Regression Line
Different regression lines produce different residuals. The regression line we use in AP Stats is the least squares regression line. The least squares regression line of y on x is the line that makes the sum of the squared residuals as small as possible.

TI Nspire: LSRL
1. Enter x data into list 1 and y data into list 2.
2. Press MENU, 4: Statistics, 1: Stat Calculations
3. Select Option 4: Linear Regression.
4. Insert either the name of the list or a[] for x and the name of the list or b[] for y. Press ENTER.

TI Nspire: LSRL to View Graph
1. Enter x data into list 1 and y data into list 2. Be sure to name the lists.
2. Press HOME/ON, Add Data & Statistics
3. Enter variables on the x and y axes.
4. Click MENU, 4: Analyze
5. Option 6: Regression
6. Option 2: Show Linear (a + bx), ENTER
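What the calculator's Linear Regression option computes can be sketched by hand in Python. This is a minimal illustration, not the calculator's actual routine: it uses the standard closed-form least squares solution and the backpack data from Section 3.1, and the function name is hypothetical.

```python
from statistics import mean

def least_squares_line(xs, ys):
    """Return (intercept, slope) of the line minimizing the sum of squared residuals."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar  # the line passes through (x_bar, y_bar)
    return intercept, slope

body_lb = [120, 187, 109, 103, 131, 165, 158, 116]
pack_lb = [26, 30, 26, 24, 29, 35, 31, 28]

intercept, slope = least_squares_line(body_lb, pack_lb)
print(round(intercept, 1), round(slope, 4))
```

The rounded values agree with the backpack regression line used in this chapter, predicted backpack weight = 16.3 + 0.0908(student's weight).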

Residuals
A residual is the difference between an observed value of the response variable and the value predicted by the regression line. That is,
residual = observed y - predicted y
residual = y - ŷ
Positive residuals (above line)
Negative residuals (below line)

How to Calculate the Residual
1. Calculate the predicted value by plugging x into the least squares regression equation.
2. Determine the observed/actual value.
3. Subtract.

Calculate the Residual
1. If a student weighs 170 pounds and their backpack weighs 35 pounds, what is the value of the residual?
Predicted: ŷ = 16.3 + 0.0908(170) = 31.736
Observed: 35
Residual: 35 - 31.736 = 3.264 pounds
The student's backpack weighs 3.264 pounds more than predicted.
2. If a student weighs 105 pounds and their backpack weighs 24 pounds, what is the value of the residual?
Predicted: ŷ = 16.3 + 0.0908(105) = 25.834
Observed: 24
Residual: 24 - 25.834 = -1.834 pounds
The student's backpack weighs 1.834 pounds less than predicted.

Residual Plots
A residual plot is a scatterplot of the residuals against the explanatory variable. Residual plots help us assess how well a regression line fits the data.
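The three-step residual calculation can be sketched as two small functions. The regression line is the backpack line from this chapter; the function names are made up for illustration.

```python
def predicted_pack_weight(body_weight_lb):
    """Predicted backpack weight (lb) from the line y-hat = 16.3 + 0.0908x."""
    return 16.3 + 0.0908 * body_weight_lb

def residual(observed, predicted):
    """Residual = observed y - predicted y."""
    return observed - predicted

# Student 1: weighs 170 lb, backpack weighs 35 lb.
r1 = residual(35, predicted_pack_weight(170))
# Student 2: weighs 105 lb, backpack weighs 24 lb.
r2 = residual(24, predicted_pack_weight(105))

print(round(r1, 3), round(r2, 3))
```

A positive result means the point lies above the line (the model predicted too low); a negative result means the point lies below the line.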

TI Nspire: Residual Plots
1. Press MENU, 4: Analyze
2. Option 6: Residual, Option 2: Show Residual Plot

Interpreting Residual Plots
A residual plot magnifies the deviations of the points from the line, making it easier to see unusual observations and patterns.
1) The residual plot should show no obvious patterns.
2) The residuals should be relatively small in size.
A valid residual plot should look like the night sky, with approximately equal amounts of positive and negative residuals.
Pattern in residuals: linear model not appropriate.

Should You Use LSRL?

Interpreting Computer Regression Output
Be sure you can locate the slope and the y intercept, and determine the equation of the LSRL.
ŷ = -0.0034415x + 3.5051
ŷ = predicted...
x = explanatory variable

r²: Coefficient of Determination
r² tells us how much better the LSRL does at predicting values of y than simply guessing the mean y for each value in the dataset.
In this example, r² equals 60.6%. 60.6% of the variation in pack weight is explained by the linear relationship with body weight.
Template: (insert r²)% of the variation in y is explained by the linear relationship with x.

Interpret r²
Interpret in a sentence (how much variation is accounted for?)
1. r² = 0.875, x = hours studied, y = SAT score
2. r² = 0.523, x = hours slept, y = alertness score

Interpret r²
Answers:
1. 87.5% of the variation in SAT score is explained by the linear relationship with the number of hours studied.
2. 52.3% of the variation in alertness score is explained by the linear relationship with the number of hours slept.

S: Standard Deviation of the Residuals
If we use a least squares regression line to predict the values of a response variable y from an explanatory variable x, the standard deviation of the residuals (s) is given by
$$s = \sqrt{\frac{\sum \text{residuals}^2}{n-2}} = \sqrt{\frac{\sum (y_i - \hat{y}_i)^2}{n-2}}$$
S represents the typical or average error (residual).
For an individual residual: Positive = the model UNDER predicts; Negative = the model OVER predicts.

S: Standard Deviation of the Residuals
1. Identify and interpret the standard deviation of the residuals.
Answer: S = 0.740
Interpretation: When using the least squares regression line, predictions of fat gain are typically off by about 0.740 kilograms.

Self Check Quiz!
The data are a random sample of 10 trains comparing number of cars on the train and fuel consumption in pounds of coal.
What is the regression equation? Be sure to define all variables.
What is r² telling you?
Define and interpret the slope in context. Does it have a practical interpretation?
Define and interpret the y intercept in context.
What is s telling you?

Answers:
1. ŷ = 2.1495x + 10.667
ŷ = predicted fuel consumption in pounds of coal
x = number of rail cars
2. 96.7% of the variation in fuel consumption is explained by the linear relationship with the number of rail cars.
3. Slope = 2.1495. With each additional car, the fuel consumption is predicted to increase by 2.1495 pounds of coal, on average. This makes practical sense.
4. Y intercept = 10.667. When there are no cars attached to the train, the predicted fuel consumption is 10.667 pounds of coal. This has no practical interpretation because there is always at least one car, the engine.
5. S = 4.361. When using the least squares regression line, predictions of fuel consumption are typically off by about 4.361 pounds of coal.
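To make the definition of s concrete, the sketch below computes the standard deviation of the residuals for the backpack data from Section 3.1. The slides do not report s for that data set, so the printed value is only an illustration of the formula (square root of the sum of squared residuals over n - 2).

```python
from math import sqrt
from statistics import mean

body_lb = [120, 187, 109, 103, 131, 165, 158, 116]
pack_lb = [26, 30, 26, 24, 29, 35, 31, 28]

# Fit the least squares line (closed-form slope and intercept).
x_bar, y_bar = mean(body_lb), mean(pack_lb)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(body_lb, pack_lb))
sxx = sum((x - x_bar) ** 2 for x in body_lb)
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Residual = observed y - predicted y for every individual.
residuals = [y - (intercept + slope * x) for x, y in zip(body_lb, pack_lb)]

sse = sum(e ** 2 for e in residuals)   # sum of squared residuals
s = sqrt(sse / (len(residuals) - 2))   # divide by n - 2, then take the root

print(round(sse, 2), round(s, 2))
```

Note that s is never negative: it measures the typical size of a prediction error, not its direction.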

Extrapolation
We can use a regression line to predict the response ŷ for a specific value of the explanatory variable x. The accuracy of the prediction depends on how much the data scatter about the line. Exercise caution in making predictions outside the observed values of x.
Extrapolation is the use of a regression line for prediction far outside the interval of values of the explanatory variable x used to obtain the line. Such predictions are often not accurate.

Correlation and Regression Limitations
The distinction between explanatory and response variables is important in regression.

Correlation and Regression Limitations
Correlation and regression lines describe only linear relationships.

Correlation and Regression Limitations
Correlation and least squares regression lines are not resistant.

Outliers and Influential Points
An outlier is an observation that lies outside the overall pattern of the other observations.
An observation is influential for a statistical calculation if removing it would markedly change the result of the calculation.
Points that are outliers in the x direction of a scatterplot are often influential for the least squares regression line.
Note: Not all influential points are outliers, nor are all outliers influential points.

Outliers and Influential Points
The left graph is perfectly linear. In the right graph, the last value was changed from (5, 5) to (5, 8). This point is clearly influential, because it changed the graph significantly. However, the residual is very small.

Correlation and Regression Wisdom
Association Does Not Imply Causation
An association between an explanatory variable x and a response variable y, even if it is very strong, is not by itself good evidence that changes in x actually cause changes in y.
A serious study once found that people with two cars live longer than people who only own one car. Owning three cars is even better, and so on. There is a substantial positive correlation between number of cars x and length of life y. Why?

Additional Calculations & Proofs

Least Squares Regression Line
We can use technology to find the equation of the least squares regression line. We can also write it in terms of the means and standard deviations of the two variables and their correlation.
Equation of the least squares regression line: We have data on an explanatory variable x and a response variable y for n individuals. From the data, calculate the means and standard deviations of the two variables and their correlation. The least squares regression line is the line ŷ = a + bx with slope
$$b = r \frac{s_y}{s_x}$$
and y intercept
$$a = \bar{y} - b\bar{x}$$

Calculate the Least Squares Regression Line
Some people think that the behavior of the stock market in January predicts its behavior for the rest of the year. Take the explanatory variable x to be the percent change in a stock market index in January and the response variable y to be the change in the index for the entire year. We expect a positive correlation between x and y because the change during January contributes to the full year's change. Calculation from data for an 18 year period gives
Mean x = 1.75%, s_x = 5.36%
Mean y = 9.07%, s_y = 15.35%
r = 0.596
Find the equation of the least squares line for predicting full year change from January change. Show your work.

The Role of r² in Regression
The standard deviation of the residuals gives us a numerical estimate of the average size of our prediction errors.
The coefficient of determination r² is the fraction of the variation in the values of y that is accounted for by the least squares regression line of y on x. We can calculate r² using the following formula:
$$r^2 = 1 - \frac{SSE}{SST}, \quad SSE = \sum (y_i - \hat{y}_i)^2, \quad SST = \sum (y_i - \bar{y})^2$$
In practicality, just square the correlation r.

Accounted for Error
If we use the LSRL to make our predictions, the sum of the squared residuals is 30.90.
SSE = 30.90
1 - SSE/SST = 1 - 30.90/83.87
r² = 0.632
63.2% of the variation in backpack weight is accounted for by the linear model relating pack weight to body weight.
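The January stock market exercise only needs the slope and intercept relations for the least squares line (slope = r times s_y over s_x, intercept = mean of y minus slope times mean of x). A short sketch of that arithmetic, with variable names chosen for illustration:

```python
# Summary statistics given in the exercise (percent change, 18 year period).
mean_x, s_x = 1.75, 5.36    # January change
mean_y, s_y = 9.07, 15.35   # full year change
r = 0.596

# Slope and intercept of y-hat = a + b x.
b = r * s_y / s_x
a = mean_y - b * mean_x

print(f"predicted full year change = {a:.3f} + {b:.3f}(January change)")
```

Reading the result the same way as the other lines in this chapter: each additional percentage point of January change is associated with a predicted increase of about b percentage points in the full year change.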

Unaccounted for Error
If we use the mean backpack weight as our prediction, the sum of the squared residuals is 83.87.
SST = 83.87
SSE/SST = 30.90/83.87
SSE/SST = 0.368
Therefore, 36.8% of the variation in pack weight is unaccounted for by the least squares regression line.

Interpreting a Regression Line
Consider the regression line from the example (pg. 164) "Does Fidgeting Keep You Slim?" Identify the slope and y intercept and interpret each value in context.
The slope b = -0.00344 tells us that the amount of fat gained is predicted to go down by 0.00344 kg for each added calorie of NEA.
The y intercept a = 3.505 kg is the fat gain estimated by this model if NEA does not change when a person overeats.
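The accounted-for and unaccounted-for fractions can be recomputed from the backpack data given in Section 3.1. This sketch assumes the relation r² = 1 - SSE/SST, with SST measured around the mean backpack weight and SSE measured around the least squares line.

```python
from statistics import mean

body_lb = [120, 187, 109, 103, 131, 165, 158, 116]
pack_lb = [26, 30, 26, 24, 29, 35, 31, 28]

x_bar, y_bar = mean(body_lb), mean(pack_lb)
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(body_lb, pack_lb))
         / sum((x - x_bar) ** 2 for x in body_lb))
intercept = y_bar - slope * x_bar

# SST: squared errors when every prediction is the mean backpack weight.
sst = sum((y - y_bar) ** 2 for y in pack_lb)
# SSE: squared errors when predictions come from the least squares line.
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(body_lb, pack_lb))

unaccounted = sse / sst
r_squared = 1 - unaccounted

print(round(sst, 2), round(sse, 2), round(r_squared, 3), round(unaccounted, 3))
```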

Chapter 4
4.1: Samples & Surveys

Section 4.1 Samples and Surveys
After this section, you should be able to
IDENTIFY the population and sample in a sample survey
IDENTIFY voluntary response samples and convenience samples
DESCRIBE how to use a table of random digits to select a simple random sample (SRS)
DESCRIBE simple random samples, stratified random samples, and cluster samples
EXPLAIN how undercoverage, nonresponse, and question wording can lead to bias in a sample survey

How do we gather data?
Surveys: Opinion polls; Interviews
Studies: Observational; Retrospective (past)
Experiments

Populations and Samples
The population in a statistical study is the entire group of individuals about which we want information.
A sample is the part of the population from which we actually collect information. We use information from a sample to draw conclusions about the entire population.
Population -> Sample: Collect data from a representative sample...
Sample -> Population: Make an inference about the population.

The Idea of a Sample Survey
A sample survey is a study that uses an organized plan to choose a sample that represents some specific population.
Step 1: Define the population we want to describe.
Step 2: Say exactly what we want to measure.
Step 3: Decide how to choose a sample from the population.

Sampling Design
Sampling Design: method used to choose the sample from the population
Types of Samples:
Simple Random Sample
Stratified Random Sample
Systematic Random Sample
Cluster Sample
Multistage Sample

Simple Random Sample (SRS)
Consists of n individuals from the population chosen in such a way that
every individual has an equal chance of being selected
every set of n individuals has an equal chance of being selected

SRS
Advantages: Unbiased; Easy
Disadvantages: Large variance/high variability; May not be representative; Must be able to identify entire population

Methods of Selecting an SRS
Draw names from a hat.
Assign each person in the group a number and randomly generate the chosen numbers.
Ways to randomly generate numbers: Computer; Table of Random Digits; Calculator

Table of Random Digits
A table of random digits is a long string of the digits 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 with these properties:
Each entry in the table is equally likely to be any of the 10 digits 0-9.
The entries are independent of each other. That is, knowledge of one part of the table gives no information about any other part.

How to Choose an SRS Using Table D
Step 1: Label. Give each member of the population a numerical label of the same length.
Step 2: Table. Read consecutive groups of digits of the appropriate length from Table D. Your sample contains the individuals whose labels you find.

Use Table D at line 130 to choose an SRS of 4 hotels.
01 Aloha Kai      08 Captiva        15 Palm Tree    22 Sea Shell
02 Anchor Down    09 Casa del Mar   16 Radisson     23 Silver Beach
03 Banana Bay     10 Coconuts       17 Ramada       24 Sunset Beach
04 Banyan Tree    11 Diplomat       18 Sandpiper    25 Tradewinds
05 Beach Castle   12 Holiday Inn    19 Sea Castle   26 Tropical Breeze
06 Best Western   13 Lime Tree      20 Sea Club     27 Tropical Shores
07 Cabana         14 Outrigger      21 Sea Grape    28 Veranda
Line 130: 69051 64817 87174 09517 84534 06489 87201 97245
Read as two-digit labels: 69 05 16 48 17 87 17 40 95 17 84 53 40 64 89 87 20
Our SRS of 4 hotels for the editors to contact is: 05 Beach Castle, 16 Radisson, 17 Ramada, and 20 Sea Club.

A university's financial aid office wants to know how much it can expect students to earn from summer employment. This information will be used to set the level of financial aid. The population contains 478 students who have completed at least one year of study but have not yet graduated. A questionnaire will be sent to an SRS of 100 of these students, drawn from an alphabetized list. Starting at line 135, select the first three students in the sample.
135  66925 55658 39100 78458 11206 19876 87151 31260
136  08421 44753 77377 28744 75592 08563 79140 92454
137  53645 66812 61421 47836 12609 15373 98481 14592
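The Label and Table steps can be sketched as a short function that reads fixed-width groups from a line of random digits, skipping labels that are out of range or already chosen. The function name is hypothetical, and for the financial aid exercise the sketch assumes the students are labeled 001 to 478 in alphabetical order.

```python
def srs_from_digits(digits, label_width, population_size, sample_size):
    """Read consecutive label_width-digit groups; keep new labels in 1..population_size."""
    stream = "".join(ch for ch in digits if ch.isdigit())  # drop the spaces
    chosen = []
    for i in range(0, len(stream) - label_width + 1, label_width):
        label = int(stream[i:i + label_width])
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
            if len(chosen) == sample_size:
                break
    return chosen

line_130 = "69051 64817 87174 09517 84534 06489 87201 97245"
line_135 = "66925 55658 39100 78458 11206 19876 87151 31260"

hotels = srs_from_digits(line_130, label_width=2, population_size=28, sample_size=4)
students = srs_from_digits(line_135, label_width=3, population_size=478, sample_size=3)

print(hotels)    # labels of the 4 hotels
print(students)  # labels of the first 3 students (assumed labels 001-478)
```

The hotel result matches the sample listed above: labels 05, 16, 17, and 20, with the repeated 17 skipped.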

Stratified Random Sample
Population is divided into homogeneous (alike) groups called strata
Stratum 1: Seniors
Stratum 2: Juniors
SRSs are pulled from each stratum
Helps control for lurking variables

Common Strata
What are some common strata in the following areas?
Politics
School

Stratified Random Sample
Advantages: More precise unbiased estimator than SRS; Less variability; Cost reduced if strata already exist
Disadvantages: Difficult to do if you must divide the population into strata; Formulas for SD & confidence intervals are more complicated

Systematic Random Sample
Pick a method of identifying subjects randomly before starting
Requires strict adherence
Example: Suppose a supermarket wants to study the buying habits of its customers. Using systematic sampling, it can choose every 10th or 15th customer entering the supermarket and conduct the study on this sample.

Cluster Sample
Based upon location
Randomly pick a location & sample all there
Examples:
All houses on a certain block
All houses in a specific zip code
All students at specific schools in MDCPS
All students in specific homeroom classes
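A stratified random sample, as described above, takes a separate SRS within each stratum. A minimal sketch using Python's random module; the roster, stratum sizes, and sample size per stratum are invented purely for illustration.

```python
import random

def stratified_sample(strata, per_stratum, seed=None):
    """Draw an SRS of size per_stratum from each stratum (without replacement)."""
    rng = random.Random(seed)
    return {name: rng.sample(members, per_stratum) for name, members in strata.items()}

# Hypothetical roster: student ID numbers grouped by grade level.
strata = {
    "Seniors": list(range(1, 201)),    # IDs 1-200
    "Juniors": list(range(201, 451)),  # IDs 201-450
}

sample = stratified_sample(strata, per_stratum=5, seed=42)
for name, chosen in sample.items():
    print(name, sorted(chosen))
```

Fixing the seed only makes the illustration repeatable; a real survey would let the random number generator choose freely.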

Cluster Samples
Advantages: Unbiased; Cost is reduced
Disadvantages: Clusters may not be representative of population; Formulas are complicated

Multistage Sample
At least two separate levels/stages of SRS.
Example:
Stage 1: Juniors vs. Seniors
Stage 2: Divide the above groups (Juniors and Seniors) by AP, Regular and Honors. Select 10 for each of the groups for a total of 60.

Sampling at a School Assembly
Describe how you would use the following sampling methods to select 80 students to complete a survey.
(a) Simple Random Sample
(b) Stratified Random Sample
(c) Cluster Sample

Identify the Sampling Design
1) The Educational Testing Service (ETS) needed a sample of colleges. ETS first divided all colleges into groups of similar types (small public, small private, etc.). Then they randomly selected 3 colleges from each group.
2) A county commissioner wants to survey people in her district to determine their opinions on a particular law up for adoption. She decides to randomly select blocks in her district and then survey all who live on those blocks.
3) A local restaurant manager wants to survey customers about the service they receive. Each night the manager randomly chooses a number between 1 & 10. He then gives a survey to that customer, and to every 10th customer after them, to fill out before they leave.
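The restaurant scenario in item 3 (a random start between 1 and 10, then every 10th customer) can be sketched as follows. The function name and the nightly customer count of 95 are assumptions made for the illustration.

```python
import random

def systematic_sample(n_customers, k=10, seed=None):
    """Pick a random start in 1..k, then take every k-th customer after it."""
    rng = random.Random(seed)
    start = rng.randint(1, k)  # inclusive on both ends
    return list(range(start, n_customers + 1, k))

# Hypothetical night with 95 customers, numbered in order of arrival.
surveyed = systematic_sample(95, k=10, seed=1)
print(surveyed)
```

Only the starting point is random; once it is chosen, every other selection is determined, which is why the method requires strict adherence.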

How would you do it?
Ms. Garcia is determining what classes to offer next school year at ATM. She wants to conduct a survey of students to help determine course offerings (electives, Dual Enrollment, AP, regular, honors, etc.). Design a sampling method to help Ms. Garcia accurately and fairly survey a representative sample of the entire school population.

Inference for Sampling
The purpose of a sample is to give us information about a larger population. The process of drawing conclusions about a population on the basis of sample data is called inference.
Why should we rely on random sampling?
1) To eliminate bias in selecting samples from the list of available individuals.
2) The laws of probability allow trustworthy inference about the population.
Results from random samples come with a margin of error that sets bounds on the size of the likely error.
Larger random samples give better information about the population than smaller samples.

Sources of Error in Sample Surveys
Undercoverage occurs when some groups in the population are left out of the process of choosing the sample.
Nonresponse occurs when an individual chosen for the sample can't be contacted or refuses to participate.
A systematic pattern of incorrect responses in a sample survey leads to response bias (wanting to look cool, not wanting to be a prude, etc.).
The wording of questions is the most important influence on the answers given to a sample survey.
Voluntary response bias occurs when participation is optional. Usually only people with strong opinions respond.

Errors in Surveys
Errors?!
How much do you weigh?
Will you not vote for President Obama's reelection?
Why should guns be outlawed?
How often do you exercise?
How many cigarettes do you smoke each week?
How often should Mrs. Daniel give quizzes?

4.2: Experiments

Section 4.2 Experiments
After this section, you should be able to
DISTINGUISH observational studies from experiments
DESCRIBE the language of experiments
APPLY the three principles of experimental design
DESIGN comparative experiments utilizing completely randomized designs and randomized block designs, including matched pairs design

Observational Study vs. Experiment
An observational study observes individuals and measures variables of interest but does not attempt to influence the responses.
An experiment deliberately imposes some treatment on individuals to measure their responses.
***When our goal is to understand cause and effect, experiments are the only source of fully convincing data.***

SAT Survey vs. SAT Experiment
Describe a survey and an experiment that can be used to determine the relationship between SAT scores and hours studied.
Survey: Ask students how many hours they studied for the SAT and what scores they earned.
Experiment: Select a group of students with the same IQ and randomly assign each student a number of hours to study for the SAT. Each student is allowed to study ONLY the mandated number of hours. Then compare the resulting scores.

Lurking & Confounding Variables
A lurking variable is a variable that is not among the explanatory or response variables in a study but that may influence the response variable. Lurking = not included.
A confounding variable is one whose effects on the response variable cannot be distinguished from one or more of the explanatory variables in the study. Confounding = included.

Confounding Variables
Confounding refers to a problem that can arise in an experiment when there is another variable that may affect the response and is in some way tied together with the factor under investigation, leaving us unable to tell which of the two variables (or perhaps some interaction) caused the observed response.

Confounding Variables
For example, we plant tomatoes in a garden that's half-shaded. We test a fertilizer by putting it on the plants in the sun and applying none to the shaded plants. Months later the fertilized plants bear more and better tomatoes. Why? Well, maybe it's the fertilizer, maybe it's the sun, maybe we need both. We're unable to conclude that the fertilizer works because any effect of fertilizer is confounded with any effect of the extra sunshine.

Examples: What's Lurking?!
1. As shoe size increases, so does reading ability.
2. An increase in ice cream consumption equals an increase in the number of drowning deaths for a given period.

The Randomized Comparative Experiment
The remedy for confounding is to perform a comparative experiment in which some units receive one treatment and similar units receive another. Most well-designed experiments compare two or more treatments.
Comparison alone isn't enough: if the treatments are given to groups that differ greatly, bias will result. The solution to the problem of bias is random assignment.
In an experiment, random assignment means that experimental units are assigned to treatments at random, that is, using some sort of chance process.

The Randomized Comparative Experiment
In a completely randomized design, the treatments are assigned to all the experimental units completely by chance. Some experiments may include a control group that receives an inactive treatment or an existing baseline treatment.
[Diagram: Experimental Units -> Random Assignment -> Group 1 (Treatment 1) and Group 2 (Treatment 2) -> Compare Results]

A high school regularly offers a review course to prepare students for the SAT. This year, budget cuts will allow the school to offer only an online version of the course. Over the past 10 years, the average SAT score of students in the classroom course was 1620. The online group gets an average score of 1780. That's roughly 10% higher than the long-time average for those who took the classroom review course.
Is the online course more effective? Is there a lurking variable? Is there a confounding variable?

The Language of Experiments
Experimental Units: the smallest collection of individuals to which treatments are applied. When the units are human beings, they are often called subjects.
Factors: the general name for explanatory variables in an experiment (e.g., a multivitamin regimen).
Treatment: a specific condition (given vitamin A vs. vitamin B; time frame in which the vitamin is taken) applied to the individuals in an experiment.
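Random assignment in a completely randomized design can be sketched in a few lines of Python. This is only an illustration: the function name, the student labels, and the group size of 20 are assumptions, not part of the handout.

```python
import random


def completely_randomized(units, treatments, seed=None):
    """Assign units to treatments by chance, keeping group sizes as equal as possible.

    Hypothetical helper for illustration; not part of the handout.
    """
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)  # impersonal chance decides the order
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        # Deal the shuffled units out to the treatments in turn.
        groups[treatments[i % len(treatments)]].append(unit)
    return groups


# Assumed example: 20 students split between two versions of a review course.
students = [f"student_{i:02d}" for i in range(1, 21)]
groups = completely_randomized(students, ["classroom", "online"], seed=1)
for treatment, members in groups.items():
    print(treatment, len(members), sorted(members)[:3], "...")
```

Because chance alone decides who lands in which group, lurking variables such as prior preparation should be roughly balanced across the groups.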

Factor v. Treatment
A factor is a type or category of treatments, whereas the specific treatments constitute the levels of a factor. For example, three different groups of runners are subjected to different training methods.
Experimental units: runners
Factor: training methods
Treatments: specific type of workout (speed, strength training, and distance workouts)
Factor = General Group
Treatment = Specific Implementation

Three Principles of Experimental Design
1. Control for lurking variables that might affect the response: Use a comparative design and ensure that the only systematic difference between the groups is the treatment administered.
2. Random assignment: Use impersonal chance to assign experimental units to treatments. This helps create roughly equivalent groups of experimental units by balancing the effects of lurking variables that aren't controlled on the treatment groups.
3. Replication: Use enough experimental units in each group so that any differences in the effects of the treatments can be distinguished from chance differences between the groups.

A cookie manufacturer is trying to determine how long cookies stay fresh on store shelves, and the extent to which the type of packaging and the store's temperature influence how long the cookies stay fresh. He designs a completely randomized experiment involving low (64 °F) and high (75 °F) temperatures and two types of packaging, plastic and waxed cardboard. List the experimental units, factors, and treatments in this experiment.
Experimental units: packages of cookies.
Factors: temperature and packaging.
Treatments: low temp and plastic, high temp and plastic, low temp and waxed cardboard, high temp and waxed cardboard.

Specific Types of Experimental Design
Double Blind
Single Blind
Matched Pairs
Block Design

Double Blind
In a double blind experiment, neither the subjects nor the experimenters know which treatment a subject received.

Matched Pair Design
In a matched pair design, subjects are paired by matching common important attributes. Sometimes the results come from a pre-test and post-test, with each unit matched to itself.
Example: Tire wear and tear. Put one set of tires on the left side of the car and a different set on the right side of the car. This would help control the lurking variables of different driving styles (teenage boys vs. teachers) and mileage driven.

Block Design
A block is a group of experimental units or subjects that are known before the experiment to be similar in some way that is expected to affect the response to the treatments.
In a block design, the random assignment of units to treatments is carried out separately within each block. This helps control for lurking variables.
Experiments are often blocked by:
Age
Gender
Race
Achievement Level (Regular, Honors, AP, IQ level, etc.)

Inference for Experiments
An observed effect so large that it would rarely occur by chance is called statistically significant. A statistically significant association in data from a well-designed experiment does imply causation.

Chapter 5

5.1: Randomness, Probability and Simulation

Section 5.1 Randomness, Probability and Simulation
After this section, you should be able to
DESCRIBE the idea of probability
DESCRIBE myths about randomness
DESIGN and PERFORM simulations

The Idea of Probability
Chance behavior is unpredictable in the short run, but has a regular and predictable pattern in the long run.
The probability of any outcome of a chance process is a number between 0 (never occurs) and 1 (always occurs) that describes the proportion of times the outcome would occur in a very long series of repetitions.

The Law of Large Numbers
The law of large numbers says that if we observe more and more repetitions of any chance process, the proportion of times that a specific outcome occurs approaches a single value.

Myths about Randomness
The myth of short-run regularity: The idea of probability is that randomness is predictable in the long run (1 million plus occurrences). Probability does not allow us to make short-run predictions.
The myth of the "law of averages": Probability tells us random behavior evens out in the long run. Future outcomes are not affected by past behavior. Women have a 50% chance of having a boy with each pregnancy; the gender of any previous children does not matter!

Performing a Simulation
The imitation of chance behavior, based on a model that accurately reflects the situation, is called a simulation. Simulations are usually done with a table of random digits, a calculator random number generator (RandInt), or computer software.
State: Identify the probability calculation of interest.
Plan: Describe how to use a chance device/tool to implement one repetition of the process. Explain clearly how to identify the outcomes of the chance process.
Do: Perform many (at least 20) repetitions of the simulation.
Conclude: Use the results of your simulation to answer the question of interest, in context.
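The law of large numbers can be seen in a quick computer simulation. The sketch below flips a simulated fair coin and reports the proportion of heads at a few checkpoints; the function name, checkpoints, and seed are assumptions chosen for illustration.

```python
import random


def running_proportions(n_flips, checkpoints, seed=0):
    """Flip a fair coin n_flips times; return the proportion of heads at each checkpoint."""
    rng = random.Random(seed)
    heads = 0
    proportions = {}
    for flip in range(1, n_flips + 1):
        heads += rng.random() < 0.5  # True counts as 1
        if flip in checkpoints:
            proportions[flip] = heads / flip
    return proportions


checkpoints = {10, 100, 1_000, 10_000, 100_000}
proportions = running_proportions(100_000, checkpoints)
for n in sorted(proportions):
    print(f"{n:>7} flips: proportion of heads = {proportions[n]:.4f}")
```

The early proportions can wander well away from 0.5, while the later ones tend to settle close to it, which is the short-run versus long-run contrast described above.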

Performing a Simulation
For example: What is the probability that a student earns an 80% on a true/false quiz written in Chinese? (Assume the exam taker does not know any Chinese.) Should the instructor be concerned about cheating? How can we simulate the probability of guessing 80% correct on a true/false quiz?

The Golden Ticket
At a local high school, 95 students have permission to park on campus. Each month, the student council holds a "golden ticket parking lottery" at a school assembly. The two lucky winners are given reserved parking spots next to the school's main entrance. Last month, the winning tickets were drawn by a student council member from the AP Statistics class. When both golden tickets went to members of that same class, some people thought the lottery had been rigged. There are 28 students in the AP Statistics class, all of whom are eligible to park on campus. Design and carry out a simulation to decide whether it's plausible that the lottery was carried out fairly. **See 5.1 WS

Required Elements
State must include: Identify the variable. Statement of probability in symbols or words.
Plan must include: What tool? What values are you assigning? How many values are you picking each time? How many times are you conducting the simulation? What about repeat digits or ignored digits? What are you recording?
Do must include: Simulation data, if the number of trials is 20 or less. Summary of data for larger numbers of trials.
Conclude must include: Statement of probability. Answer to the question, usually about being surprised/reasonable/expected, etc.

STATE: What is the probability that the lottery would result in two winners from the AP Stats class? P(X = 2), where X is the number of winners from AP Stats.

PLAN: Using the table of random digits, we will assign each student a two-digit number from 01 to 95. We'll label the students in the AP Statistics class from 01 to 28, and the remaining students from 29 to 95. (Numbers from 96 to 00 will be skipped.) Starting at the randomly selected row 139 and moving left to right across the row, we'll look at pairs of digits until we come across two different values from 01 to 95. These two values represent the two students who win the prime parking spaces. We will record whether both winners are members of the AP Statistics class (yes or no). We will conduct the simulation 18 times.
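The same plan can be carried out with computer software instead of a table of random digits. The sketch below is one possible version, not the handout's method: the function names, the seed, and the 100,000 repetitions are assumptions. It also computes the exact probability with the multiplication rule (covered in 5.3) as a check on the simulation.

```python
import random
from fractions import Fraction

N_STUDENTS = 95  # students with permission to park (from the problem)
N_AP_STATS = 28  # labels 1..28 stand for the AP Statistics class


def one_lottery(rng):
    """Draw two different labels; return True if both belong to AP Statistics."""
    winners = rng.sample(range(1, N_STUDENTS + 1), 2)  # no repeats within a draw
    return all(label <= N_AP_STATS for label in winners)


def estimate(n_reps, seed=0):
    """Estimate P(both winners are AP Statistics students) from n_reps lotteries."""
    rng = random.Random(seed)
    hits = sum(one_lottery(rng) for _ in range(n_reps))
    return hits / n_reps


# Exact value for comparison: P(first is AP) * P(second is AP | first is AP).
exact = Fraction(N_AP_STATS, N_STUDENTS) * Fraction(N_AP_STATS - 1, N_STUDENTS - 1)

print("simulated:", estimate(100_000))
print("exact:    ", float(exact))
```

With only 18 repetitions, as in the hand simulation, the estimate can land noticeably above or below the exact value; many more repetitions give a steadier estimate.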

Required Elements
Plan must include:
What tool? Table of random digits, calculator random number generator (RandInt), etc.
What values are you assigning? 01 to 95
How many values are you picking each time? 2 values
How many times are you conducting the simulation? 18 times
What about repeat digits or ignored digits? Ignore repeat digits within a single draw
What are you recording? "Yes" for both AP Stats.

DO:
Students: Labels
AP Statistics Class: 01-28
Other: 29-95
Skip numbers from 96-00
Reading across row 139 in Table D, look at pairs of digits until you see two different labels from 01 to 95. Record whether or not both winners are members of the AP Statistics class.
Trial 1: 55 58 (No)
Trial 2: 89 94 (No)
Trial 3: 04 70 (No)
Trial 4: 70 84 (No)
Trial 5: 10 43, skipping 98 (No)
Trial 6: 56 35 (No)
Trial 7: 69 34 (No)
Trial 8: 48 39 (No)
Trial 9: 45 17 (No)
Trial 10: 19 12 (Yes)
Trial 11: 51 32, skipping 97 (No)
Trial 12: 58 13 (No)
Trial 13: 04 84 (No)
Trial 14: 51 44 (No)
Trial 15: 72 32 (No)
Trial 16: 18 19 (Yes)
Trial 17: 40 36, skipping 00 (No)
Trial 18: 24 28, skipping 00 (Yes)

CONCLUDE: Based on 18 repetitions of our simulation, both winners came from the AP Statistics class 3 times, so the probability is estimated as 16.67%. Therefore, it is definitely possible for two AP Stats students to be selected in a fair drawing.

NASCAR
In an attempt to increase sales, a breakfast cereal company decides to offer a NASCAR promotion. Each box of cereal will contain a collectible card featuring one of these NASCAR drivers: Jeff Gordon, Dale Earnhardt, Jr., Tony Stewart, Danica Patrick, or Jimmie Johnson. The company says that each of the 5 cards is equally likely to appear in any box of cereal. A NASCAR fan decides to keep buying boxes of the cereal until she has all 5 drivers' cards. She is surprised when it takes her 23 boxes to get the full set of cards. Should she be surprised? Design and carry out a simulation to help answer this question.

STATE: What is the probability of needing to buy 23 or more cereal boxes to obtain one card from each driver?

PLAN: Using the calculator's random number generator (RandInt), we are going to simulate 50 trials. We will assign each driver a unique number 1 through 5. In each trial, we will generate digits until all five values (drivers) have appeared, and we will record the total number of digits required.
Driver: Label
Jeff Gordon: 1
Dale Earnhardt, Jr.: 2
Tony Stewart: 3
Danica Patrick: 4
Jimmie Johnson: 5
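The NASCAR plan translates directly into code: generate random integers from 1 to 5 until all five have appeared, and record how many were needed. This is a sketch under assumptions (function names and seed are illustrative, and Python's `random.randint` stands in for the calculator's RandInt).

```python
import random


def boxes_needed(rng, n_cards=5):
    """Buy boxes (draw labels 1..n_cards) until every card has been seen at least once."""
    seen = set()
    boxes = 0
    while len(seen) < n_cards:
        seen.add(rng.randint(1, n_cards))  # each card equally likely
        boxes += 1
    return boxes


def simulate(n_trials=50, seed=0):
    """Return the number of boxes needed in each of n_trials repetitions."""
    rng = random.Random(seed)
    return [boxes_needed(rng) for _ in range(n_trials)]


results = simulate()
share_23_plus = sum(b >= 23 for b in results) / len(results)
print("largest number of boxes in 50 trials:", max(results))
print("estimated P(23 or more boxes):", share_23_plus)
```

Fifty trials can only resolve probabilities in steps of 2%, so a rare outcome may not show up at all; calling `simulate` with a much larger `n_trials` gives a finer estimate.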

DO: [Dotplot of 50 trials]

CONCLUDE: We never had to buy more than 22 boxes to get the full set of cards in 50 repetitions of our simulation. Our estimate of the probability that it takes 23 or more boxes to get a full set of driver cards is roughly 0. Therefore, she should be surprised that it took 23 cereal box purchases to find all 5 driver cards.

5.2: Probability Rules

Section 5.2 Probability Rules
After this section, you should be able to
DESCRIBE chance behavior with a probability model
DEFINE and APPLY basic rules of probability
DETERMINE probabilities from two-way tables
CONSTRUCT Venn diagrams and DETERMINE probabilities

Probability Models
The sample space S of a chance process is the set of all possible outcomes.
A probability model is a description of some chance process that consists of two parts: a sample space S and a probability for each outcome.
Example of Coin Toss:
Sample Space: Either heads or tails.
Probability: Heads (0.5) and Tails (0.5)

Basic Rules of Probability
The probability of any event is a number between 0 and 1.
The probabilities of all possible outcomes sum to 1.
If all outcomes in a sample space (ex: rolling a single die) are equally likely, the probability that event A occurs can be found using the formula: P(A) = (number of outcomes in event A) / (total number of outcomes in the sample space)
The probability that an event does not occur is 1 minus the probability that the event does occur.

Probability Models
Venn Diagram, Tree Diagram, List, Chart, etc.

Probability Models
Probability models allow us to find the probability of any collection of outcomes. An event is any collection of outcomes from some chance process. That is, an event is a subset of the sample space. Events are usually designated by capital letters, like A, B, C, and so on.
Specific event examples:
Flipping 3 heads in a row
Rolling two dice that sum to 5

Sample Space: Rolling Two Dice
The probability model for the chance process of rolling two fair, six-sided dice, one that's red and one that's green.
Sample space: 36 outcomes. Since the dice are fair, each outcome is equally likely. Each outcome has probability 1/36.

Probability Models
Event A: rolling a sum of 5 with 2 dice.
Event Space: There are 4 different combinations of dice rolls that sum to 5.
Solution: Since each outcome has probability 1/36, P(A) = 4/36 or 1/9.

What type of pizza do you like?
Meat
Veggies
Veggies and meat
Neither (cheese)
***There are NO other choices at Mrs. Daniel's pizzeria***
[Venn diagram: Meat, Meat & Veggies, Veggies, Neither (Cheese)]

What is the probability that a randomly selected student
Likes meat
Likes veggies
Likes veggies and meat
Likes neither (cheese)
Likes either veggies or meat

Mutually Exclusive
Two events that cannot occur at the same time. There are no common outcomes. Example: a student is EITHER a Junior or a Senior.

Intersection
The event that both events occur. For example: A = likes salad, B = likes meat; therefore P(A and B) = P(likes both salad and meat).

Complement
The event that A does not occur ("not A").
A = airplane takes off on time
A^c = airplane does not take off on time

Calculating Probabilities
Complement of A: P(A^c) = 1 - P(A)
Mutually Inclusive (A or B): P(A or B) = P(A) + P(B) - P(A and B)
Intersection (A and B): P(A and B)

2014 AP Statistics Exam Scores
Probabilities:
Score: 1 2 3 4 5
Probability: 0.223 0.193 0.235 0.224 0.125
(a) Is this a legitimate probability model? Justify.
Each probability is between 0 and 1, and the sum of the probabilities is 0.223 + 0.193 + 0.235 + 0.224 + 0.125 = 1.
(b) Find the probability that the chosen student scored 3 or better.
The probability of scoring a 3 or better: 0.235 + 0.224 + 0.125 = 0.584

Distance learning courses are rapidly gaining popularity among college students. Randomly select an undergraduate student who is taking distance learning courses for credit and record the student's age. Here is the probability model:
Age group (yr): 18 to 23, 24 to 29, 30 to 39, 40 or over
Probability: 0.57, 0.17, 0.14, 0.12
(a) Is this a legitimate probability model? Justify.
Each probability is between 0 and 1, and 0.57 + 0.17 + 0.14 + 0.12 = 1.
(b) Find the probability that the chosen student is not in the traditional college age group (18 to 23 years).
P(not 18 to 23 years) = 1 - P(18 to 23 years) = 1 - 0.57 = 0.43

What is the relationship between educational achievement and home ownership? A random sample of 500 people was selected, and each member of the sample was identified as a high school graduate (or not) and as a home owner (or not). The two-way table displays the data.
                   High School Graduate   Not a High School Graduate   Total
Homeowner          221                    119                          340
Not a Homeowner    89                     71                           160
Total              310                    190                          500
What is the probability that a randomly selected person
(a) is a high school graduate? 310/500
(b) is a high school graduate and owns a home? 221/500
(c) is a high school graduate or owns a home? (310 + 119)/500 = 429/500, counting the 310 graduates plus the 119 homeowners who are not graduates. Equivalently, (310 + 340 - 221)/500 = 429/500.
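The two-way table calculations can be checked with a short script. This is an illustrative sketch: the dictionary layout, the cell labels, and the `prob` helper are assumptions, while the counts come from the table above.

```python
from fractions import Fraction

# Cell counts from the two-way table: (education, ownership) -> count.
counts = {
    ("grad", "owner"): 221,
    ("not_grad", "owner"): 119,
    ("grad", "not_owner"): 89,
    ("not_grad", "not_owner"): 71,
}
total = sum(counts.values())


def prob(event):
    """Probability that a randomly selected person falls in a cell satisfying `event`."""
    return Fraction(sum(n for cell, n in counts.items() if event(cell)), total)


p_grad = prob(lambda c: c[0] == "grad")
p_owner = prob(lambda c: c[1] == "owner")
p_both = prob(lambda c: c == ("grad", "owner"))
# General addition rule: P(A or B) = P(A) + P(B) - P(A and B)
p_either = p_grad + p_owner - p_both

print(p_grad, p_both, p_either)
```

Subtracting `p_both` matters because the 221 graduates who own homes would otherwise be counted twice.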

5.3: Conditional Probability and Independence

After this section, you should be able to
DEFINE conditional probability
COMPUTE conditional probabilities
DESCRIBE chance behavior with a tree diagram
DEFINE independent events
DETERMINE whether two events are independent
APPLY the general multiplication rule to solve probability questions

Basic Probability
Assume a spinner has 8 equal-sized sections; each section is numbered with a unique number from 1 to 8.
A. What is the probability of getting an even number? 4/8 or 1/2
B. What is the probability of getting a prime number? 4/8 or 1/2 (the primes are 2, 3, 5, and 7; 1 is not prime)
C. What is the probability of getting a multiple of 3? 2/8 or 1/4

Mixed Probability
Assume a spinner has 8 equal-sized sections; each section is numbered with a unique number from 1 to 8.
A. What is the probability of getting 2 even spins in a row? 1/4
B. What is the probability of getting a prime number or an odd number? 5/8
C. What is the probability of getting a multiple of 3 or an even spin? 5/8
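Because the eight sections are equally likely, each spinner answer is just a count of favorable sections divided by 8. The sketch below enumerates them; the helper names are assumptions for illustration. Note that 1 is not prime, so the primes from 1 to 8 are 2, 3, 5, and 7.

```python
from fractions import Fraction

sections = range(1, 9)  # spinner sections 1 through 8, equally likely


def is_prime(n):
    """True for primes; 1 is not prime."""
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))


def prob(event):
    """P(event) = favorable sections / total sections."""
    return Fraction(sum(1 for s in sections if event(s)), len(sections))


p_even = prob(lambda s: s % 2 == 0)
p_prime = prob(is_prime)
p_mult3 = prob(lambda s: s % 3 == 0)
p_prime_or_odd = prob(lambda s: is_prime(s) or s % 2 == 1)
p_mult3_or_even = prob(lambda s: s % 3 == 0 or s % 2 == 0)
p_two_evens = p_even * p_even  # separate spins are independent

print(p_even, p_prime, p_mult3, p_prime_or_odd, p_mult3_or_even, p_two_evens)
```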

What is Conditional Probability?
The probability that one event happens given that another event is already known to have happened is called a conditional probability. Suppose we know that event A has happened. Then the probability that event B happens given that event A has happened is denoted by P(B | A). The bar "|" is read as "given that" or "under the condition that."

Calculate the following conditional probabilities:
1. P = 19/90
2. P = 4/88
3. P = 84/103

Example: Grade Distributions
E: the grade comes from an EPS course, and L: the grade is lower than a B.
[Two-way table of grades; row totals 6300, 1600, 2100; column totals 3392, 2952, 3656; grand total 10000]
Find P(L), P(E | L), and P(L | E).
P(L) = 3656 / 10000 = 0.3656
P(E | L) = 800 / 3656 = 0.2188
P(L | E) = 800 / 1600 = 0.5000

Who Reads the Newspaper?
Residents of a large apartment complex can be classified based on the events A: reads USA Today and B: reads the New York Times. What is the probability that a randomly selected resident who reads USA Today also reads the New York Times?
There is a 12.5% chance that a randomly selected resident who reads USA Today also reads the New York Times.

Conditional Probability and Independence
When knowledge that one event has happened does not change the likelihood that another event will happen, we say the two events are independent.
Two events A and B are independent if the occurrence of one event has no effect on the chance that the other event will happen. In other words, events A and B are independent if:
P(A | B) = P(A) OR P(B | A) = P(B).

Are these events independent?
          Earns A in AP Stats   Earns A in AP Calc   Total
Junior    5                     7                    12
Senior    12                    9                    21
Total     17                    16                   33
1. Junior and AP Calc? P(Junior | AP Calc) = 7/16; P(Junior) = 12/33. Since the values are not equal, the events are not independent.
2. Senior and AP Stats? P(Senior | AP Stats) = 12/17; P(Senior) = 21/33. Since the values are not equal, the events are not independent.

Conditional Probability and Independence
P(A | B) = P(A) OR P(B | A) = P(B).
Are the events "male" and "left-handed" independent?
A: left-handed
B: male
P(left-handed | male) = 3/23 = 0.13
P(left-handed) = 7/50 = 0.14
Since the two values are not equal, the events are not independent.

General Multiplication Rule
The probability that events A and B both occur can be found using the general multiplication rule
P(A and B) = P(A) × P(B | A)
where P(B | A) is the conditional probability that event B occurs given that event A has already occurred.
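The independence check for the Junior/Senior table compares a conditional probability with the corresponding unconditional one. The sketch below does that comparison with exact fractions; the table layout and function names are assumptions, and the counts are taken from the table above.

```python
from fractions import Fraction

# Counts from the table: grade level -> course in which the student earned an A.
table = {
    "Junior": {"stats": 5, "calc": 7},
    "Senior": {"stats": 12, "calc": 9},
}
total = sum(sum(row.values()) for row in table.values())


def p_grade(grade):
    """Unconditional probability of a grade level, e.g. P(Junior)."""
    return Fraction(sum(table[grade].values()), total)


def p_grade_given_course(grade, course):
    """Conditional probability of a grade level given the course column."""
    column_total = sum(row[course] for row in table.values())
    return Fraction(table[grade][course], column_total)


def independent(grade, course):
    """Events are independent exactly when P(grade | course) equals P(grade)."""
    return p_grade_given_course(grade, course) == p_grade(grade)


print(p_grade_given_course("Junior", "calc"), p_grade("Junior"), independent("Junior", "calc"))
print(p_grade_given_course("Senior", "stats"), p_grade("Senior"), independent("Senior", "stats"))
```

Note that `Fraction` reduces automatically, so 12/33 prints as 4/11 and 21/33 as 7/11.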

Tree Diagrams
Tree diagrams are best for events that follow each other, events that happen multiple times, or events that are logically related (example: graduate high school first, then attend college OR having cancer, then testing positive).

Tree Diagrams
Consider flipping a coin twice. What is the probability of getting two heads?
Sample Space: HH HT TH TT
So, P(two heads) = P(HH) = 1/4

Example: Teens with Online Profiles
The Pew Internet and American Life Project finds that 93% of teenagers (ages 12 to 17) use the Internet, and that 55% of online teens have a Facebook profile. What percent of teens are online and have a Facebook profile?
P(online and profile) = 0.93 × 0.55 = 0.5115, so 51.15% of teens are online and have posted a profile.

Consecutive Probability
Assume a spinner has 8 equal-sized sections; each section is numbered with a unique number from 1 to 8. You spin the spinner three times.
A. What is the probability of getting at least two even spins?
B. What is the probability of getting a prime number exactly twice?
C. What is the probability of getting a multiple of 3 or an even spin only once?

Internet & YouTube Usage
About 27% of adult Internet users are 18 to 29 years old, another 45% are 30 to 49 years old, and the remaining 28% are 50 and over. The Pew Internet and American Life Project finds that 70% of Internet users aged 18 to 29 have visited a video-sharing site, along with 51% of those aged 30 to 49 and 26% of those 50 or older.
Make a tree diagram of the probabilities.
B. What proportion of adults are 18-to-29-year-old Internet users that visit video-sharing sites? 0.27 × 0.70 = 0.1890
C. What proportion of adults are 30-to-49-year-old Internet users that visit video-sharing sites? 0.45 × 0.51 = 0.2295
D. What proportion of adults are Internet users aged 50 and over that visit video-sharing sites? 0.28 × 0.26 = 0.0728
E. What proportion of all adult Internet users visit video-sharing sites? Do most Internet users visit YouTube and/or similar sites? Justify your answer.
P(video yes and 18 to 29) = 0.27 × 0.70 = 0.1890
P(video yes and 30 to 49) = 0.45 × 0.51 = 0.2295
P(video yes and 50+) = 0.28 × 0.26 = 0.0728
P(video yes) = 0.1890 + 0.2295 + 0.0728 = 0.4913
49.13% of all adult Americans that use the Internet watch videos online. While 49.13% represents a large proportion of the population, it is not a majority, so it is not fair to say most adult American Internet users visit video-sharing sites.
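Each branch of the tree diagram multiplies an age-group probability by a conditional probability, and the overall answer adds the branches. A minimal sketch of that arithmetic follows; the dictionary names and age labels are assumptions, and the percentages come from the problem.

```python
# First-level branches: share of adult Internet users in each age group.
age_share = {"18-29": 0.27, "30-49": 0.45, "50+": 0.28}
# Second-level branches: P(visits a video-sharing site | age group).
video_given_age = {"18-29": 0.70, "30-49": 0.51, "50+": 0.26}

# Multiply along each branch: P(age and video) = P(age) * P(video | age).
joint = {age: age_share[age] * video_given_age[age] for age in age_share}

# Add the branches that end in "video yes".
p_video = sum(joint.values())

for age, p in joint.items():
    print(f"P(video yes and {age}) = {p:.4f}")
print(f"P(video yes) = {p_video:.4f}")
```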

Special Probability Rules

Independence: A Special Multiplication Rule
When events A and B are independent, we can simplify the general multiplication rule since P(B | A) = P(B).
Multiplication rule for independent events: If A and B are independent events, then the probability that A and B both occur is
P(A and B) = P(A) × P(B)

Example: Following the Space Shuttle Challenger disaster, it was determined that the failure of O-ring joints in the shuttle's booster rockets was to blame. Under cold conditions, it was estimated that the probability that an individual O-ring joint would function properly was 0.977. Assuming O-ring joints succeed or fail independently, what is the probability all six would function properly?
P(joint 1 OK and joint 2 OK and joint 3 OK and joint 4 OK and joint 5 OK and joint 6 OK)
= P(joint 1 OK) × P(joint 2 OK) × ... × P(joint 6 OK)
= (0.977)(0.977)(0.977)(0.977)(0.977)(0.977) = 0.87
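The O-ring calculation is a single power under the independence assumption. The short sketch below reproduces it and also applies the complement rule to get the chance that at least one joint fails; the variable names are assumptions.

```python
P_JOINT_OK = 0.977  # probability one O-ring joint functions (from the example)
N_JOINTS = 6

# Independent events: multiply the individual probabilities.
p_all_ok = P_JOINT_OK ** N_JOINTS

# Complement rule: at least one failure is "not all OK".
p_at_least_one_fails = 1 - p_all_ok

print(f"P(all six OK) = {p_all_ok:.2f}")
print(f"P(at least one fails) = {p_at_least_one_fails:.2f}")
```

Even though each joint is very reliable on its own, the chance that all six work is noticeably lower than 0.977.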

Chapter 6

6.1: Discrete and Continuous Random Variables

Section 6.1 Discrete & Continuous Random Variables
After this section, you should be able to
APPLY the concept of discrete random variables to a variety of statistical settings
CALCULATE and INTERPRET the mean (expected value) of a discrete random variable
CALCULATE and INTERPRET the standard deviation (and variance) of a discrete random variable
DESCRIBE continuous random variables

Random Variables
A random variable, usually written as X, is a variable whose possible values are numerical outcomes of a random phenomenon. There are two types of random variables: discrete random variables and continuous random variables.

Discrete Random Variables
A discrete random variable is one which may take on only a countable number of distinct values such as 0, 1, 2, 3, 4, ... Discrete random variables are usually (but not necessarily) counts.
Examples:
number of children in a family
the Friday night attendance at a cinema
the number of patients a doctor sees in one day
the number of defective light bulbs in a box of ten
the number of heads flipped in 3 trials

Probability Distribution
The probability distribution of a discrete random variable is a list of probabilities associated with each of its possible values.
Consider tossing a fair coin 3 times. Define X = the number of heads obtained.
X = 0: TTT
X = 1: HTT THT TTH
X = 2: HHT HTH THH
X = 3: HHH
Value: 0 1 2 3
Probability: 1/8 3/8 3/8 1/8

Rolling Dice: Probability Distribution
Roll your pair of dice 20 times; record the sum for each trial.
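The coin-toss distribution above can be rebuilt by listing all 8 equally likely sequences and counting heads in each. This is a small illustrative sketch; the variable names are assumptions.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All equally likely sequences of three tosses: HHH, HHT, ..., TTT.
outcomes = list(product("HT", repeat=3))

# X = number of heads in each sequence.
counts = Counter(seq.count("H") for seq in outcomes)

# Probability distribution of X as exact fractions.
distribution = {x: Fraction(n, len(outcomes)) for x, n in sorted(counts.items())}

for x, p in distribution.items():
    print(f"P(X = {x}) = {p}")
```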

Discrete Random Variables
A discrete random variable X takes a fixed set of possible values with gaps between them. The probability distribution of a discrete random variable X lists the values x_i and their probabilities p_i:
Value: x_1 x_2 x_3 ...
Probability: p_1 p_2 p_3 ...
The probabilities p_i must satisfy two requirements:
1. Every probability p_i is a number between 0 and 1.
2. The sum of the probabilities is 1.
To find the probability of any event, add the probabilities p_i of the particular values x_i that make up the event.

Describing the (Probability) Distribution
When analyzing discrete random variables, we'll follow the same strategy we used with quantitative data: describe the shape, center (mean), and spread (standard deviation), and identify any outliers.

Example: Babies' Health at Birth
Background details are on page 343. X = the Apgar score of a randomly selected newborn.
(a) Show that the probability distribution for X is legitimate.
(b) Make a histogram of the probability distribution. Describe the distribution.
(c) Apgar scores of 7 or higher indicate a healthy baby. What is P(X ≥ 7)?
Value: 0 1 2 3 4 5 6 7 8 9 10
Probability: 0.001 0.006 0.007 0.008 0.012 0.020 0.038 0.099 0.319 0.437 0.053
(a) All probabilities are between 0 and 1 and the probabilities sum to 1. This is a legitimate probability distribution.
(b) The left-skewed shape of the distribution suggests a randomly selected newborn will have an Apgar score at the high end of the scale. While the range is from 0 to 10, there is a VERY small chance of getting a baby with a score of 5 or lower. There are no obvious outliers. The center of the distribution is approximately 8.
(c) P(X ≥ 7) = 0.099 + 0.319 + 0.437 + 0.053 = 0.908. We'd have about a 91% chance of randomly choosing a healthy baby.
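Parts (a) and (c) of the Apgar example can be checked in a few lines. The sketch below uses the table from the example; the helper names `is_legitimate` and `prob_at_least` are hypothetical and chosen only for illustration.

```python
import math

# Apgar score distribution from the example (value -> probability).
apgar = {0: 0.001, 1: 0.006, 2: 0.007, 3: 0.008, 4: 0.012, 5: 0.020,
         6: 0.038, 7: 0.099, 8: 0.319, 9: 0.437, 10: 0.053}

def is_legitimate(dist):
    """Check both requirements: each p_i in [0, 1] and the p_i sum to 1."""
    in_range = all(0 <= p <= 1 for p in dist.values())
    sums_to_one = math.isclose(sum(dist.values()), 1.0, abs_tol=1e-9)
    return in_range and sums_to_one

def prob_at_least(dist, k):
    """P(X >= k): add the probabilities of the values that make up the event."""
    return sum(p for x, p in dist.items() if x >= k)

print(is_legitimate(apgar))
print(round(prob_at_least(apgar, 7), 3))
```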

Mean of a Discrete Random Variable
The mean of any discrete random variable is an average of the possible outcomes, with each outcome weighted by its probability.
Suppose that X is a discrete random variable whose probability distribution is
Value: x_1 x_2 x_3 ...
Probability: p_1 p_2 p_3 ...
To find the mean (expected value) of X, multiply each possible value by its probability, then add all the products:
$\mu_X = E(X) = x_1 p_1 + x_2 p_2 + x_3 p_3 + \dots = \sum x_i p_i$

Example: Apgar Scores: What's Typical?
Consider the random variable X = Apgar score. Compute the mean of the random variable X and interpret it in context.
Value: 0 1 2 3 4 5 6 7 8 9 10
Probability: 0.001 0.006 0.007 0.008 0.012 0.020 0.038 0.099 0.319 0.437 0.053
The mean Apgar score of a randomly selected newborn is 8.128. This is the long-term average Apgar score of many, many randomly chosen babies.
Note: The expected value does not need to be a possible value of X or an integer! It is a long-term average over many repetitions.

Analyzing Discrete Random Variables on the Calculator
1. Use one-variable statistics to calculate the mean.
2. Enter ascre for X1 and freqas for the frequency list.

Calculate the Mean (Expected Value)
Value: 5 6 7 8 9 10
Probability: 0.08 0.12 0.30 0.22 0.20 0.08
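The weighted-average definition of the mean translates directly into code. A minimal sketch using the Apgar table; the function name `expected_value` is an illustrative choice.

```python
# Apgar score distribution from the example (value -> probability).
apgar = {0: 0.001, 1: 0.006, 2: 0.007, 3: 0.008, 4: 0.012, 5: 0.020,
         6: 0.038, 7: 0.099, 8: 0.319, 9: 0.437, 10: 0.053}

def expected_value(dist):
    """Mean of a discrete random variable: sum of value * probability."""
    return sum(x * p for x, p in dist.items())

mean_apgar = expected_value(apgar)
print(round(mean_apgar, 3))
```

The same function works for any value-to-probability table, including the practice distribution on the calculator slide.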

Standard Deviation of a Discrete Random Variable
The definition of the variance of a random variable is similar to the definition of the variance for a set of quantitative data. To get the standard deviation of a random variable, take the square root of the variance.
Suppose that X is a discrete random variable whose probability distribution is
Value: x_1 x_2 x_3 ...
Probability: p_1 p_2 p_3 ...
and that µ_X is the mean of X. The variance of X is
$\sigma_X^2 = (x_1 - \mu_X)^2 p_1 + (x_2 - \mu_X)^2 p_2 + (x_3 - \mu_X)^2 p_3 + \dots = \sum (x_i - \mu_X)^2 p_i$

Example: Apgar Scores: How Variable Are They?
Consider the random variable X = Apgar score. Compute the standard deviation of the random variable X and interpret it in context.
Value: 0 1 2 3 4 5 6 7 8 9 10
Probability: 0.001 0.006 0.007 0.008 0.012 0.020 0.038 0.099 0.319 0.437 0.053
The variance of X is 2.066, so the standard deviation of X is 1.437. On average, a randomly selected baby's Apgar score will differ from the mean of 8.128 by about 1.4 units.

Continuous Random Variables
A continuous random variable X takes on all values in an interval of numbers, so it has infinitely many possible values. Continuous random variables are usually measurements.
Examples:
height
weight
the amount of sugar in an orange
the time required to run a mile
The probability distribution of X is described by a density curve. The probability of any event is the area under the density curve and above the values of X that make up the event.
A continuous random variable does not assign probability to individual values. Instead, probability is assigned to intervals, so you calculate the probability of a range of values. The calculations are very similar to the z-score and Normal distribution calculations.
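The variance and standard deviation of the Apgar distribution can be computed the same way as the mean. This is a sketch under the definitions above; the function names are illustrative.

```python
from math import sqrt

# Apgar score distribution from the example (value -> probability).
apgar = {0: 0.001, 1: 0.006, 2: 0.007, 3: 0.008, 4: 0.012, 5: 0.020,
         6: 0.038, 7: 0.099, 8: 0.319, 9: 0.437, 10: 0.053}

def expected_value(dist):
    """Mean: sum of value * probability."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """Variance: probability-weighted average of squared deviations from the mean."""
    mu = expected_value(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

var_apgar = variance(apgar)
sd_apgar = sqrt(var_apgar)
print(round(var_apgar, 3), round(sd_apgar, 3))
```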

Example: Young Women's Heights
The height of young women can be defined as a continuous random variable Y whose probability distribution is N(64, 2.7).
A. What is the probability that a randomly chosen young woman has a height between 68 and 70 inches?
P(68 ≤ Y ≤ 70) = P(1.48 ≤ Z ≤ 2.22) = P(Z ≤ 2.22) - P(Z ≤ 1.48) = 0.9868 - 0.9306 = 0.0562
There is about a 5.6% chance that a randomly chosen young woman has a height between 68 and 70 inches.
B. At 70 inches tall, is Mrs. Daniel unusually tall?
P(Y ≤ 70) = P(Z ≤ 2.22) = 0.9868
Yes, Mrs. Daniel is unusually tall because 98.68% of the population is shorter than her.

6.2: Transforming and Combining Random Variables
After this section, you should be able to
DESCRIBE the effect of performing a linear transformation on a random variable
COMBINE random variables and CALCULATE the resulting mean and standard deviation
CALCULATE and INTERPRET probabilities involving combinations of Normal random variables
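The two height probabilities in the Young Women's Heights example can be reproduced without a table. The sketch below uses `statistics.NormalDist` from the Python standard library in place of the calculator; results differ slightly from the table values because z is not rounded to two decimals.

```python
from statistics import NormalDist

# Y ~ N(64, 2.7): heights of young women, in inches.
heights = NormalDist(mu=64, sigma=2.7)

# Part A: P(68 <= Y <= 70) as a difference of cumulative probabilities.
p_between = heights.cdf(70) - heights.cdf(68)

# Part B: P(Y <= 70), the proportion shorter than 70 inches.
p_shorter = heights.cdf(70)

print(round(p_between, 4), round(p_shorter, 4))
```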

Review: Linear Transformations
In Chapter 2, we studied the effects of linear transformations on the shape, center, and spread of a distribution of data. Remember:
1. Adding (or subtracting) a constant, a, to each observation:
Adds a to measures of center and location.
Does not change the shape or measures of spread.
2. Multiplying (or dividing) each observation by a constant, b:
Multiplies (divides) measures of center and location by b.
Multiplies (divides) measures of spread by |b|.
Does not change the shape of the distribution.

Linear Transformations
Pete's Jeep Tours offers a popular half-day trip in a tourist area. There must be at least 2 passengers for the trip to run, and the vehicle will hold up to 6 passengers. Define X as the number of passengers on a randomly selected day.
Passengers x_i: 2 3 4 5 6
Probability p_i: 0.15 0.25 0.35 0.20 0.05
The mean of X is 3.75 and the standard deviation is 1.090.
Pete charges $150 per passenger. The random variable C = 150X describes the amount Pete collects on a randomly selected day.
Collected c_i: 300 450 600 750 900
Probability p_i: 0.15 0.25 0.35 0.20 0.05
The mean of C is $562.50 and the standard deviation is $163.50.
Compare the shape, center, and spread of each distribution.

Linear Transformations on Random Variables
Multiplying (or dividing) each value of a random variable by a number b:
Multiplies (divides) measures of center and location (mean, median, quartiles, percentiles) by b.
Multiplies (divides) measures of spread (range, IQR, standard deviation) by |b|.
Does not change the shape of the distribution.
Note: Multiplying a random variable by a constant b multiplies the variance by b^2.
Adding the same number a (which could be negative) to each value of a random variable:
Adds a to measures of center and location (mean, median, quartiles, percentiles).
Does not change measures of spread (range, IQR, standard deviation).
Does not change the shape of the distribution.
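The effect of multiplying and adding constants can be seen numerically with Pete's Jeep Tours. This sketch transforms X into C = 150X and then subtracts a hypothetical constant of 100 to show a shift; `mean_sd` is an illustrative helper.

```python
from math import sqrt

def mean_sd(dist):
    """Return (mean, standard deviation) of a value -> probability table."""
    mu = sum(x * p for x, p in dist.items())
    var = sum((x - mu) ** 2 * p for x, p in dist.items())
    return mu, sqrt(var)

# X = number of passengers on a randomly selected day.
passengers = {2: 0.15, 3: 0.25, 4: 0.35, 5: 0.20, 6: 0.05}

# C = 150 * X: multiply every value by b = 150, keep the probabilities.
collected = {150 * x: p for x, p in passengers.items()}

# Shifted version: add a = -100 to every value of C (illustrative constant).
shifted = {c - 100: p for c, p in collected.items()}

mean_x, sd_x = mean_sd(passengers)
mean_c, sd_c = mean_sd(collected)
mean_s, sd_s = mean_sd(shifted)

print(round(mean_x, 2), round(sd_x, 4))
print(round(mean_c, 2), round(sd_c, 2))
print(round(mean_s, 2), round(sd_s, 2))
```

Carrying full precision gives a standard deviation of about $163.46 for C; the $163.50 on the slide comes from multiplying the rounded value 1.090 by 150.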

Linear Transformations
Consider Pete's Jeep Tours again. We defined C as the amount of money Pete collects on a randomly selected day.
Collected c_i: 300 450 600 750 900
Probability p_i: 0.15 0.25 0.35 0.20 0.05
The mean of C is $562.50 and the standard deviation is $163.50.
It costs Pete $100 per trip to buy permits, gas, and a ferry pass. The random variable V = C - 100 describes the profit Pete makes on a randomly selected day.
Profit v_i: 200 350 500 650 800
Probability p_i: 0.15 0.25 0.35 0.20 0.05
The mean of V is $462.50 and the standard deviation is $163.50.
Compare the shape, center, and spread of the two probability distributions.
Bottom Line: Whether we are dealing with data or random variables, the effects of a linear transformation are the same!

Combining Random Variables
Before we can combine random variables, we must decide whether the variables are independent of each other. Probability models often assume independence when the random variables describe outcomes that appear unrelated to each other. You should always ask yourself whether the assumption of independence seems reasonable.

Combining Random Variables
Let D = the number of passengers on a randomly selected Delta flight to Atlanta.
Let A = the number of passengers on a randomly selected American Airlines flight to Atlanta.
Define T = D + A. Calculate the mean and standard deviation of T.
Passengers d_i: 75 76 77 78 79
Probability p_i: 0.15 0.25 0.35 0.20 0.05
Mean µ_D = 76.75, Standard Deviation σ_D = 1.0897
Passengers a_i: 75 76 77 78
Probability p_i: 0.3 0.4 0.2 0.1
Mean µ_A = 76.1, Standard Deviation σ_A = 0.943

Combining Random Variables: Mean
How many total passengers fly to Atlanta on a randomly selected day?
Delta: µ_D = 76.75
American: µ_A = 76.10
Total: 76.75 + 76.10 = 152.85 passengers to Atlanta daily
For any two random variables X and Y, if T = X + Y, then the expected value of T is
E(T) = µ_T = µ_X + µ_Y
In general, the mean of the sum of several random variables is the sum of their means.

Combining Random Variables: Variance
How much variability is there in the total number of passengers who fly to Atlanta on a randomly selected day? (Hint: find the combined variance.)
Delta: Mean µ_D = 76.75, Standard Deviation σ_D = 1.0897
American: Mean µ_A = 76.1, Standard Deviation σ_A = 0.943
REMEMBER: Standard deviations do not add!
Delta variance = (1.090)^2
American variance = (0.943)^2
Total variance = (1.090)^2 + (0.943)^2 = 2.077, so the standard deviation of the total is $\sqrt{2.077} = 1.441$.
For any two independent random variables X and Y, if T = X + Y, then the variance of T is
$\sigma_T^2 = \sigma_X^2 + \sigma_Y^2$
In general, the variance of the sum of several independent random variables is the sum of their variances.

Subtracting Random Variables: Mean
Mean of the Difference of Random Variables
For any two random variables X and Y, if D = X - Y, then the expected value of D is
E(D) = µ_D = µ_X - µ_Y
In general, the mean of the difference of several random variables is the difference of their means. The order of subtraction is important!

Subtracting Random Variables: Variance
Variance of the Difference of Random Variables
For any two independent random variables X and Y, if D = X - Y, then the variance of D is
$\sigma_D^2 = \sigma_X^2 + \sigma_Y^2$
In general, the variance of the difference of two independent random variables is the sum of their variances.
**This was an FRQ on the 2013 exam**

Combining Normal Random Variables: Calculating Probabilities
If a random variable is Normally distributed, we can use its mean and standard deviation to compute probabilities.
Important Fact: Any sum or difference of independent Normal random variables is also Normally distributed.
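The rules for means and variances of sums can be verified on the airline example by building the full distribution of the total. This is a sketch that ASSUMES the two passenger counts are independent, so each joint probability is a product; the dictionary and function names are illustrative.

```python
from itertools import product
from math import sqrt

delta = {75: 0.15, 76: 0.25, 77: 0.35, 78: 0.20, 79: 0.05}
american = {75: 0.3, 76: 0.4, 77: 0.2, 78: 0.1}

def mean(dist):
    return sum(x * p for x, p in dist.items())

def variance(dist):
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

# Distribution of the total, assuming independence:
# P(total = d + a) accumulates P(D = d) * P(A = a) over all pairs.
total = {}
for (d, p_d), (a, p_a) in product(delta.items(), american.items()):
    total[d + a] = total.get(d + a, 0.0) + p_d * p_a

print(round(mean(total), 2))        # matches mean(delta) + mean(american)
print(round(variance(total), 4))    # matches variance(delta) + variance(american)
print(round(sqrt(variance(total)), 3))
```

Note that the standard deviation of the total (about 1.441) is smaller than 1.0897 + 0.943, which is why standard deviations must not be added directly.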

Combining Normal Random Variables: Calculating Probabilities
Mrs. Daniel likes between 8.5 and 9 grams of sugar in her iced coffee. Suppose the amount of sugar in a randomly selected packet follows a Normal distribution with mean 2.17 g and standard deviation 0.08 g. If Mrs. Daniel selects 4 packets at random, what is the probability her iced coffee will taste right?
STATE & PLAN: Let X = the amount of sugar in a randomly selected packet. Then T = X_1 + X_2 + X_3 + X_4. We want to find P(8.5 ≤ T ≤ 9).
DO:
1. Calculate the combined mean: µ_T = µ_X1 + µ_X2 + µ_X3 + µ_X4 = 2.17 + 2.17 + 2.17 + 2.17 = 8.68
2. Calculate the combined variance: $\sigma_T^2 = 4(0.08)^2 = 0.0256$
3. Calculate the combined standard deviation: $\sigma_T = \sqrt{0.0256} = 0.16$
4. Calculate z-scores: z = (8.5 - 8.68)/0.16 = -1.13 and z = (9 - 8.68)/0.16 = 2.00
5. Find the cumulative probabilities: P(Z ≤ -1.13) = 0.1292 and P(Z ≤ 2.00) = 0.9772
6. Final calculation: 0.9772 - 0.1292 = 0.8480
CONCLUDE: There is an 84.8% chance that Mrs. Daniel's iced coffee will taste right.
YES, you may use your calculator! Just remember to recalculate the combined mean and standard deviation before using the calculator!

Combining Normal Random Variables: Calculating Probabilities
The diameter C of a randomly selected large drink cup at a fast-food restaurant follows a Normal distribution with a mean of 3.96 inches and a standard deviation of 0.01 inches. The diameter L of a randomly selected large lid at this restaurant follows a Normal distribution with mean 3.98 inches and standard deviation 0.02 inches. For a lid to fit on a cup, the value of L has to be bigger than the value of C, but not by more than 0.06 inches. What's the probability that a randomly selected large lid will fit on a randomly chosen large drink cup?

Combining Normal Random Variables: Calculating Probabilities
STATE & PLAN: We'll define the random variable D = L - C to represent the difference between the lid's diameter and the cup's diameter. Our goal is to find P(0.00 < D ≤ 0.06).
DO:
1. Calculate the combined mean: µ_D = µ_L - µ_C = 3.98 - 3.96 = 0.02
2. Calculate the combined variance: (0.02)^2 + (0.01)^2 = 0.0005
3. Calculate the combined standard deviation: $\sqrt{0.0005} = 0.0224$
4. Calculate z-scores: z = (0.00 - 0.02)/0.0224 = -0.89 and z = (0.06 - 0.02)/0.0224 = 1.79
5. Find the cumulative probabilities: P(Z ≤ -0.89) = 0.1867 and P(Z ≤ 1.79) = 0.9633
6. Final calculation: 0.9633 - 0.1867 = 0.7766
CONCLUDE: We predict that the lids will fit properly 77.66% of the time. This means the lids will not fit properly more than 22% of the time. That is annoying!

Combining Normal Random Variables: Calculating Probabilities
Mrs. Daniel and Mrs. Cooper bowl every Tuesday night. Over the past few years, Mrs. Daniel's scores have been approximately Normally distributed with a mean of 212 and a standard deviation of 31. During the same period, Mrs. Cooper's scores have also been approximately Normally distributed with a mean of 230 and a standard deviation of 40. Assuming their scores are independent, what is the probability that Mrs. Daniel scores higher than Mrs. Cooper on a randomly selected Tuesday night?
DO: Let D = (Mrs. Daniel's score) - (Mrs. Cooper's score). Then µ_D = 212 - 230 = -18 and $\sigma_D = \sqrt{31^2 + 40^2} = 50.61$. We want P(D > 0) = P(Z > 0.36) = 1 - 0.6406 = 0.3594.
CONCLUDE: There is a 35.94% chance that Mrs. Daniel will score higher than Mrs. Cooper on any given night.
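Both difference problems above (lid minus cup, Daniel minus Cooper) follow the same recipe: subtract the means, add the variances, then use a Normal distribution. A minimal sketch with `statistics.NormalDist`; the helper `difference` is hypothetical, and answers differ slightly from the table-based ones because z is not rounded.

```python
from math import hypot
from statistics import NormalDist

def difference(mu_x, sd_x, mu_y, sd_y):
    """Distribution of X - Y for INDEPENDENT Normal X and Y.

    Means subtract; variances add, so the SD is sqrt(sd_x^2 + sd_y^2).
    """
    return NormalDist(mu_x - mu_y, hypot(sd_x, sd_y))

# Lid minus cup: the lid fits when 0 < D <= 0.06.
lid_minus_cup = difference(3.98, 0.02, 3.96, 0.01)
p_fit = lid_minus_cup.cdf(0.06) - lid_minus_cup.cdf(0.0)

# Daniel minus Cooper: Daniel wins when D > 0.
daniel_minus_cooper = difference(212, 31, 230, 40)
p_daniel_higher = 1 - daniel_minus_cooper.cdf(0)

print(round(p_fit, 4), round(p_daniel_higher, 4))
```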

Mixed Practice: ACT Scores
Leona and Fred are friendly competitors in high school. Both are about to take the ACT college entrance examination. They agree that if one of them scores 5 or more points better than the other, the loser will buy the winner a pizza. Suppose that in fact Fred and Leona have equal ability, so that each score varies Normally with mean 24 and standard deviation 2. (The variation is due to luck in guessing and the accident of the specific questions being familiar to the student.) The two scores are independent. What is the probability that the scores differ by 5 or more points in either direction?
ACT Scores
New mean (of the difference): 24 - 24 = 0
New standard deviation: $\sqrt{2^2 + 2^2} = 2.8284$
normalcdf(-∞, -5, 0, 2.8284) = 0.0385
normalcdf(5, ∞, 0, 2.8284) = 0.0385
0.0385 + 0.0385 = 0.0771 (the two tail probabilities are rounded; their unrounded sum is 0.0771)

Mixed Practice: Toothpaste
Mr. Daniel is traveling for his business. He has a new 0.85-ounce tube of toothpaste that's supposed to last him the whole trip. The amount of toothpaste Mr. Daniel squeezes out of the tube each time he brushes varies according to a Normal distribution with mean 0.13 ounces and standard deviation 0.02 ounces. If Mr. Daniel brushes his teeth six times during the trip, what's the probability that he'll be cranky because he ran out of toothpaste?
Toothpaste
New mean (of the total for six brushings): 6(0.13) = 0.78
New standard deviation: $\sqrt{6(0.02)^2} = 0.049$
normalcdf(0.85, ∞, 0.78, 0.049) = 0.0766

6.3: Binomial and Geometric Random Variables
After this section, you should be able to
DETERMINE whether the conditions for a binomial setting are met
COMPUTE and INTERPRET probabilities involving binomial random variables
CALCULATE the mean and standard deviation of a binomial random variable and INTERPRET these values in context
CALCULATE probabilities involving geometric random variables
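The two mixed-practice answers can be reproduced with `statistics.NormalDist` standing in for the calculator's Normal cdf command. A sketch, assuming the brushings are independent of each other (the problem does not say so explicitly):

```python
from math import sqrt
from statistics import NormalDist

# ACT: difference of two independent N(24, 2) scores.
act_diff = NormalDist(24 - 24, sqrt(2**2 + 2**2))
# P(|difference| >= 5) = lower tail + upper tail.
p_pizza = act_diff.cdf(-5) + (1 - act_diff.cdf(5))

# Toothpaste: total of six N(0.13, 0.02) squeezes, ASSUMED independent.
toothpaste_total = NormalDist(6 * 0.13, sqrt(6 * 0.02**2))
# He runs out when the total used exceeds the 0.85 oz in the tube.
p_cranky = 1 - toothpaste_total.cdf(0.85)

print(round(p_pizza, 4), round(p_cranky, 4))
```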

Binomial Settings
A binomial setting arises when we perform several independent trials of the same chance process and record the number of times that a particular outcome occurs. The four conditions for a binomial setting are B I N S:
Binary? The possible outcomes of each trial can be classified as success or failure.
Independent? Trials must be independent; that is, knowing the result of one trial must not have any effect on the result of any other trial.
Number? The number of trials n of the chance process must be fixed in advance.
Success? On each trial, the probability p of success must be the same.

Binomial Random Variable
Consider tossing a coin n times. Each toss gives either heads or tails. Knowing the outcome of one toss does not change the probability of an outcome on any other toss. If we define heads as a success, then p is the probability of a head and is 0.5 on any toss.
The number of heads in n tosses is a binomial random variable X. The probability distribution of X is called a binomial distribution.
Count the number of successes in a predetermined number of trials!

Binomial Distribution: Mean and Standard Deviation
If a count X has the binomial distribution with number of trials n and probability of success p, the mean and standard deviation of X are
$\mu_X = np$ and $\sigma_X = \sqrt{np(1-p)}$
Note: These formulas work ONLY for binomial distributions. They can't be used for other distributions!
Find the mean and standard deviation of X. X is a binomial random variable with parameters n = 21 and p = 1/3.

Binomial Distribution: Describe
We describe the probability distribution of a binomial random variable just like any other distribution: shape, center, and spread. Consider the probability distribution of X = number of children with type O blood in a family with 5 children.
x_i: 0 1 2 3 4 5
p_i: 0.2373 0.3955 0.2637 0.0879 0.0147 0.00098
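The shortcut formulas for a binomial mean and standard deviation can be compared against the general definitions applied to the type O blood table (n = 5, p = 0.25). This is a sketch; the table entries are rounded, and the last entry is read here as 0.00098, so agreement is only to about two decimal places.

```python
from math import sqrt

def binomial_mean_sd(n, p):
    """Shortcut formulas, valid ONLY for binomial distributions."""
    return n * p, sqrt(n * p * (1 - p))

# Rounded probability table for X = number of type O children (n = 5, p = 0.25).
type_o = {0: 0.2373, 1: 0.3955, 2: 0.2637, 3: 0.0879, 4: 0.0147, 5: 0.00098}

# General definitions applied directly to the table.
mean_table = sum(x * p for x, p in type_o.items())
sd_table = sqrt(sum((x - mean_table) ** 2 * p for x, p in type_o.items()))

mean_formula, sd_formula = binomial_mean_sd(5, 0.25)

# The practice problem on the slide: n = 21, p = 1/3.
mean_21, sd_21 = binomial_mean_sd(21, 1 / 3)

print(round(mean_table, 3), round(sd_table, 3))
print(round(mean_formula, 3), round(sd_formula, 3))
print(round(mean_21, 3), round(sd_21, 3))
```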

Binomial Distribution: Describe
x_i: 0 1 2 3 4 5
p_i: 0.2373 0.3955 0.2637 0.0879 0.0147 0.00098
Shape: The probability distribution of X is skewed to the right. It is more likely to have 0, 1, or 2 children with type O blood than a larger value.
Center: The median number of children with type O blood is 1. The mean is 1.25.
Spread: The variance of X is 0.9375 and the standard deviation is 0.968.

Binomial Probabilities
Each child of a particular pair of parents has probability 0.25 of having type O blood. Genetics says that children receive genes from each of their parents independently. If these parents have 5 children, the count X of children with type O blood is a binomial random variable with n = 5 trials and probability p = 0.25 of a success on each trial. In this setting, a child with type O blood is a success (S) and a child with another blood type is a failure (F). What's P(X = 2)?
CHECK CONDITIONS:
Binary: Yes. Type O blood = yes and not type O blood = no. There are only two options.
Independent: Stated.
Number: Yes. The number of trials is stated as 5.
Success: Yes. The probability of success is the same on each attempt, p = 0.25.

Calculator: Binomial Probability
MENU, 6: Statistics, 5: Distributions
D: binompdf
E: binomcdf
binompdf calculates the probability that X equals one PRECISE value.
binomcdf calculates the probability that X falls in a range of values (lower bound to upper bound).

Binomial Probabilities
Using your calculator: for binompdf, enter the following information:
Trials: 5
P: .25
X value: 2
Answer: 0.263671875
We are using binompdf in this example because we want the precise probability of 2.
CONCLUDE: There is a 26.37% chance that the family will have two children with type O blood.

Inheriting Blood Type
Each child of a particular pair of parents has probability 0.25 of having blood type O. Suppose the parents have 5 children.
(a) Find the probability that exactly 3 of the children have type O blood.
(b) Should the parents be surprised if more than 3 of their children have type O blood?
We have already checked the conditions, so just do the calculations.

Inheriting Blood Type
Each child of a particular pair of parents has probability 0.25 of having blood type O. Suppose the parents have 5 children.
(a) Find the probability that exactly 3 of the children have type O blood.
binompdf(5, .25, 3) = 0.08789
There is an 8.79% chance that the family will have three children with type O blood.
(b) Should the parents be surprised if more than 3 of their children have type O blood?
binomcdf(5, .25, 4, 5) = 0.015625
There is about a 1.6% chance that more than 3 of the children (that is, at least 4 children) will have type O blood. This is surprising!

Binomial Distributions: Statistical Sampling
The binomial distributions are important in statistics when we want to make inferences about the proportion p of successes in a population.
Sampling Without Replacement Condition
When taking an SRS of size n from a population of size N, we can use a binomial distribution to model the count of successes in the sample as long as
$n \le \frac{1}{10} N$

Example: CDs
Suppose 10% of CDs have defective copy-protection schemes that can harm computers. A music distributor inspects an SRS of 10 CDs from a shipment of 10,000. Let X = number of defective CDs. What is P(X = 0)?
CHECK CONDITIONS:
Binary: Yes. Defective or not defective, only two options.
Independent: We can safely assume independence in this case because we are sampling less than 10% of the population.
Number: Yes. The number of trials is stated as 10.
Success: Yes. The probability of success is the same on each attempt, p = 0.10.
DO & CONCLUDE: binompdf(10, .1, 0) = 0.3486784401
There is a 34.87% chance that there will be no defective CDs in the sample.

Binomial Distributions: Normal Approximation
As n gets larger, something interesting happens to the shape of a binomial distribution.
Suppose that X has the binomial distribution with n trials and success probability p. When n is large, the distribution of X is approximately Normal with mean and standard deviation
$\mu_X = np$ and $\sigma_X = \sqrt{np(1-p)}$
As a rule of thumb, we will use the Normal approximation when n is so large that np ≥ 10 and n(1 - p) ≥ 10. That is, the expected numbers of successes and failures are both at least 10.
We use the Normal approximation more in Chapters 8 to 10.

Example: Attitudes Toward Shopping
Sample surveys show that fewer people enjoy shopping than in the past. A survey asked a nationwide random sample of 2500 adults if they agreed or disagreed that "I like buying new clothes, but shopping is often frustrating and time-consuming." Suppose that exactly 60% of all adult US residents would say "Agree" if asked the same question. Let X = the number in the sample who agree. Estimate the probability that 1520 or more of the sample agree. Consider the Normal approximation for this setting.
CHECK CONDITIONS:
Binomial:
Binary: There are only 2 options. Success = agree, Failure = don't agree.
Independent: Because the population of U.S. adults is greater than 25,000, it is reasonable to assume the sampling without replacement condition is met; we are sampling less than 10% of the population.
Number of Trials: n = 2500 trials of the chance process.
Success: The probability of selecting an adult who agrees is p = 0.60.
Normal: Since np = 2500(0.60) = 1500 and n(1 - p) = 2500(0.40) = 1000 are both at least 10, we may use the Normal approximation.
DO:
1. Calculate the mean: µ_X = np = 2500(0.60) = 1500
2. Calculate the standard deviation: $\sigma_X = \sqrt{2500(0.60)(0.40)} = 24.49$
3. Use the calculator: normalcdf(1520, 2500, 1500, 24.49) = 0.2071
CONCLUDE: There is a 20.71% chance that 1520 or more of the people in the sample agree.

Geometric Settings
A geometric setting arises when we perform independent trials of the same chance process and record the number of trials until a particular outcome occurs. The four conditions for a geometric setting are B I T S:
Binary? The possible outcomes of each trial can be classified as success or failure.
Independent? Trials must be independent; that is, knowing the result of one trial must not have any effect on the result of any other trial.
Trials? The goal is to count the number of trials until the first success occurs.
Success? On each trial, the probability p of success must be the same.

Geometric Random Variable
Geometric random variable: the number of trials needed to get the first success.
Examples:
How many M&Ms are drawn until a blue one is selected?
How many students will I draw from a hat until I pick a senior?
How many households can a surveyor call until someone answers?

Calculator: Geometric Probability
MENU, 6: Statistics, 5: Distributions
F: Geometpdf
G: Geometcdf
Geometpdf calculates the probability that the first success occurs on one PRECISE trial. Same idea as binompdf and binomcdf.
Geometcdf calculates the probability that the first success occurs within a specified range of trial numbers.
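The Normal approximation in the shopping example can be reproduced as follows. This is a sketch using `statistics.NormalDist` instead of the calculator; the function name `normal_approx_upper_tail` is hypothetical, and it refuses to run when the np ≥ 10 and n(1 - p) ≥ 10 rule of thumb fails.

```python
from math import sqrt
from statistics import NormalDist

def normal_approx_upper_tail(n, p, k):
    """Approximate P(X >= k) for binomial X using a Normal distribution.

    Raises ValueError when the large-counts rule of thumb is not met.
    """
    if n * p < 10 or n * (1 - p) < 10:
        raise ValueError("Normal approximation not appropriate: need np >= 10 and n(1-p) >= 10")
    approx = NormalDist(n * p, sqrt(n * p * (1 - p)))
    return 1 - approx.cdf(k)

p_agree = normal_approx_upper_tail(2500, 0.60, 1520)
print(round(p_agree, 4))
```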

Example: The Birthday Game
I am going to think of the day of the week of one of my friends' birthdays. If the first guesser gets it right, you all will receive 1 homework question. If the second guesser gets the day right, you will receive 2 homework questions, etc. Before playing the game, my plan was to give you all 10 homework questions.
The random variable of interest in this game is Y = the number of guesses it takes to correctly identify the birth day of one of your teacher's friends.
What is the probability the first student guesses correctly? The second? Third? What is the probability one of the first three students will be correct?
CHECK CONDITIONS:
Binary: There are only 2 options: Success = correct guess, Failure = incorrect guess.
Independent: The result of one student's guess has no effect on the result of any other guess.
Trials: We're counting the number of guesses up to and including the first correct guess.
Success: On each trial, the probability of a correct guess is 1/7, which is the same.
DO:
Probability First Student: 1/7 = 0.142857
Probability Second Student: geometpdf(1/7, 2) = 0.1224
Probability Third Student: geometpdf(1/7, 3) = 0.10496
What is the probability one of the first three students will be correct? geometcdf(1/7, 1, 3) = 0.37026
CONCLUDE: There is a 37.03% chance that one of the first three students will guess correctly.

Geometric Distribution: Mean
If Y is a geometric random variable with probability p of success on each trial, then its mean (expected value) is E(Y) = µ_Y = 1/p.
Meaning: the expected number of trials needed to achieve the first success (on average).
Example: Suppose you are an 80% free throw shooter. You are going to shoot until you make one. For p = .8, the mean is 1/.8 = 1.25. This means we expect the shooter to take 1.25 shots, on average, to make the first basket.

Binomial vs. Geometric
The Binomial Setting
1. Each observation falls into one of two categories.
2. The probability of success is the same for each observation.
3. The observations are all independent.
4. There is a fixed number n of observations.
The Geometric Setting
1. Each observation falls into one of two categories.
2. The probability of success is the same for each observation.
3. The observations are all independent.
4. The variable of interest is the number of trials required to obtain the 1st success.

Binomial or Geometric??
First defective tire
Baskets made until first miss
Questions guessed correctly on SAT Math
Light bulbs purchased until third failure
Jurors selected for trial until first disqualification
Number of students that interrupt class until Mrs. Daniel gets mad/mean
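The Birthday Game probabilities can be computed from the standard geometric formulas: the first success occurs on trial k with probability (1 - p)^(k-1) p, and within the first k trials with probability 1 - (1 - p)^k. The sketch below uses hypothetical function names that mirror, but are not, the calculator commands.

```python
def geometric_pmf(p, k):
    """P(Y = k): k - 1 failures followed by the first success on trial k."""
    return (1 - p) ** (k - 1) * p

def geometric_cdf(p, k):
    """P(Y <= k): the first success happens within the first k trials."""
    return 1 - (1 - p) ** k

p = 1 / 7  # chance of guessing the correct day of the week

for k in (1, 2, 3):
    print(k, round(geometric_pmf(p, k), 5))

print(round(geometric_cdf(p, 3), 5))  # one of the first three students is correct
print(round(1 / p, 2))                # expected number of guesses, 1/p
```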

FRQ Answers Must Include:
1. Name of distribution: Geometric or Binomial
2. Parameters
Binomial: X (define variable), n & p
Geometric: X (define variable), p
3. Probability statement. Ex: P(X = 7) or P(X ≥ 3)
4. Calculation and probability. Calculator notation is okay, but needs to be labeled.
5. Solution interpreted in context.

Calculating Binomial & Geometric Distributions by Hand

Binomial Probabilities (Alternative Solution)
Each child of a particular pair of parents has probability 0.25 of having type O blood. Genetics says that children receive genes from each of their parents independently. If these parents have 5 children, the count X of children with type O blood is a binomial random variable with n = 5 trials and probability p = 0.25 of a success on each trial. In this setting, a child with type O blood is a success (S) and a child with another blood type is a failure (F). What's P(X = 2)?
P(SSFFF) = (0.25)(0.25)(0.75)(0.75)(0.75) = (0.25)^2 (0.75)^3 = 0.02637
However, there are a number of different arrangements in which 2 out of the 5 children have type O blood:
SSFFF SFSFF SFFSF SFFFS FSSFF FSFSF FSFFS FFSSF FFSFS FFFSS
Verify that each arrangement has probability (0.25)^2 (0.75)^3 = 0.02637.
Therefore, P(X = 2) = 10(0.25)^2 (0.75)^3 = 0.2637

Binomial Coefficient
How to Calculate Number of Arrangements: The number of ways of arranging k successes among n observations is given by the binomial coefficient
$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$

Binomial Probability
The binomial coefficient counts the number of different ways in which k successes can be arranged among n trials. The binomial probability P(X = k) is this count multiplied by the probability of any one specific arrangement of the k successes.
If X has the binomial distribution with n trials and probability p of success on each trial, the possible values of X are 0, 1, 2, ..., n. If k is any one of these values,
$P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$
where $\binom{n}{k}$ is the number of arrangements of k successes, $p^k$ is the probability of k successes, and $(1-p)^{n-k}$ is the probability of n - k failures.

Inheriting Blood Type (Alternative Solution)
Each child of a particular pair of parents has probability 0.25 of having blood type O. Suppose the parents have 5 children.
(a) Find the probability that exactly 3 of the children have type O blood.
Let X = the number of children with type O blood. We know X has a binomial distribution with n = 5 and p = 0.25.
$P(X = 3) = \binom{5}{3} (0.25)^3 (0.75)^2 = 10(0.015625)(0.5625) = 0.08789$
(b) Should the parents be surprised if more than 3 of their children have type O blood?
To answer this, we need to find P(X > 3).
$P(X > 3) = P(X = 4) + P(X = 5) = \binom{5}{4} (0.25)^4 (0.75)^1 + \binom{5}{5} (0.25)^5 (0.75)^0 = 0.01465 + 0.00098 = 0.01563$
Since there is only about a 1.6% chance that more than 3 children out of 5 would have type O blood, the parents should be surprised!
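The by-hand binomial calculation above maps directly onto `math.comb` from the Python standard library. A minimal sketch; `binomial_pmf` is an illustrative name, not a calculator command.

```python
from itertools import product
from math import comb

def binomial_pmf(n, p, k):
    """P(X = k) = (number of arrangements) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 5, 0.25

# Count the arrangements of exactly 2 successes among 5 children by listing them.
arrangements = ["".join(seq) for seq in product("SF", repeat=n) if seq.count("S") == 2]

p_two = binomial_pmf(n, p, 2)
p_three = binomial_pmf(n, p, 3)
p_more_than_three = binomial_pmf(n, p, 4) + binomial_pmf(n, p, 5)

print(len(arrangements), comb(n, 2))
print(round(p_two, 4), round(p_three, 5), round(p_more_than_three, 6))
```

Listing the arrangements confirms that the binomial coefficient counts the same ten sequences written out on the slide.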

Chapter 7
7.1: What Is a Sampling Distribution?

Section 7.1 What Is a Sampling Distribution?
After this section, you should be able to
DISTINGUISH between a parameter and a statistic
DEFINE sampling distribution
DISTINGUISH between population distribution, sampling distribution, and the distribution of sample data
DETERMINE whether a statistic is an unbiased estimator of a population parameter
DESCRIBE the relationship between sample size and the variability of an estimator

Population and Sample
Collect data from a representative sample, then make an inference about the population.
The process of statistical inference involves using information from a sample to draw conclusions about a wider population.
Different random samples yield different statistics. We need to be able to describe the sampling distribution of possible statistic values in order to perform statistical inference.
We can think of a statistic as a random variable because it takes numerical values that describe the outcomes of the random sampling process.

Parameters and Statistics
A parameter is a number that describes some characteristic of the population. In statistical practice, the value of a parameter is usually not known because we cannot examine the entire population.
A statistic is a number that describes some characteristic of a sample. The value of a statistic can be computed directly from the sample data. We use a statistic to estimate an unknown parameter.

Symbols: Parameters and Statistics
Proportions: parameter p, statistic p̂
Means: parameter µ, statistic x̄
Standard Deviation: parameter σ, statistic s

Parameter v. Statistic

Identify the population, the parameter (of interest), the sample, and the statistic in each of the following settings.

A pediatrician wants to know the 75th percentile for the distribution of heights of 10-year-old boys, so she takes a sample of 50 patients and calculates Q3 = 56 inches.

Population: all 10-year-old boys
Parameter: the 75th percentile (Q3) of the heights of all 10-year-old boys
Sample: the 50 10-year-old boys included in the sample
Statistic: Q3 = 56 inches

A Pew Research Center poll asked 1100 12 to 17-year-olds in the United States if they have a cell phone. Of the respondents, 71% said yes.

Population: all 12 to 17-year-olds in the US
Parameter: the proportion of all 12 to 17-year-olds in the US who have a cell phone
Sample: the 1100 12 to 17-year-olds who were surveyed
Statistic: $\hat{p} = 0.71$

Sampling Distribution

The sampling distribution of a statistic is the distribution of values taken by the statistic in ALL possible samples of the same size from the same population.

In practice, it's difficult (usually impossible) to take all possible samples of size n to obtain the actual sampling distribution of a statistic. Instead, we can use simulation to imitate the process of taking many, many samples.

One of the uses of probability theory in statistics is to obtain sampling distributions without simulation. We'll get to the theory later.
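The idea of using simulation to imitate taking many samples can be sketched in Python. Everything here is an illustrative assumption rather than part of the notes: the function name `simulate_sample_proportions` is made up, and the population proportion is simply set to 0.71 (the Pew poll's sample result) so that there is a "true" value to sample from.

```python
import random
from statistics import mean, stdev


def simulate_sample_proportions(p: float, n: int, num_samples: int, seed: int = 0) -> list:
    """Draw num_samples random samples of size n and return the sample proportion of each."""
    rng = random.Random(seed)
    proportions = []
    for _ in range(num_samples):
        # Each individual is a "success" (has a cell phone) with probability p.
        successes = sum(rng.random() < p for _ in range(n))
        proportions.append(successes / n)
    return proportions


p_true = 0.71  # ASSUMPTION: treat the poll result as the population proportion
p_hats = simulate_sample_proportions(p_true, n=1100, num_samples=500)

print(f"mean of simulated sample proportions: {mean(p_hats):.4f}")
print(f"standard deviation of simulated sample proportions: {stdev(p_hats):.4f}")
```

The list `p_hats` is an approximate sampling distribution: each entry is the statistic from one simulated sample of 1100 teens, and a dotplot or histogram of it would show how much the statistic varies from sample to sample.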

Population Distributions vs. Sampling Distributions

There are actually three distinct distributions involved when we sample repeatedly and measure a variable of interest.
1) The population distribution gives the values of the variable for all the individuals in the population.
2) The distribution of sample data shows the values of the variable for all the individuals in the sample.
3) The sampling distribution shows the statistic values from all the possible samples of the same size from the population.

Hours of Sleep Activity

1. Write your name and the number of hours of sleep (e.g., 7 hours, 8.5 hours) on the paper provided.
2. Select an SRS of 5 cards. Each person will do this. (Ignore sampling independence concerns.)
3. Using your values, calculate the sample IQR of sleep hours and the sample maximum of sleep hours. Then, plot your values on the board.
4. Based on these values and the approximate sampling distributions, do either of these statistics appear to be unbiased estimators?

Bias & Variability

Bias means that our aim is off and we consistently miss the bull's-eye in the same direction. Our sample values do not center on the population value.

High variability means that repeated shots are widely scattered on the target. Repeated samples do not give very similar results.

Describing Sampling Distributions: Center

A statistic used to estimate a parameter is an unbiased estimator (most accurate) if the mean of its sampling distribution is equal to the true value of the parameter being estimated.

Describing Sampling Distributions: Spread

The variability of a statistic is described by the spread of its sampling distribution. This spread is determined primarily by the size of the random sample. Larger samples give smaller spread. The spread of the sampling distribution does not depend on the size of the population, as long as the population is at least 10 times larger than the sample.

[Figure: sampling distributions for n = 100 and n = 1000]
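A small simulation can make bias and variability concrete. The sketch below is only an illustration under stated assumptions: the population of sleep hours is made up (1000 values between 4 and 10 hours), and the helper `sampling_distribution` is a hypothetical name. It compares the sample mean and the sample maximum as estimators, and compares the spread of the sample mean for two sample sizes.

```python
import random
from statistics import mean, stdev

rng = random.Random(1)

# ASSUMPTION: a made-up population of hours of sleep for 1000 students.
population = [round(rng.uniform(4.0, 10.0), 1) for _ in range(1000)]


def sampling_distribution(statistic, n, num_samples=2000):
    """Values of `statistic` over many SRSs of size n (drawn without replacement)."""
    return [statistic(rng.sample(population, n)) for _ in range(num_samples)]


sample_means = sampling_distribution(mean, n=5)
sample_maxes = sampling_distribution(max, n=5)
sample_means_large = sampling_distribution(mean, n=50)

# Center: compare the mean of each sampling distribution with the parameter it estimates.
print(f"population mean {mean(population):.2f} vs mean of sample means {mean(sample_means):.2f}")
print(f"population max  {max(population):.2f} vs mean of sample maxes {mean(sample_maxes):.2f}")

# Spread: larger samples give less variable statistics.
print(f"spread of sample means, n = 5:  {stdev(sample_means):.3f}")
print(f"spread of sample means, n = 50: {stdev(sample_means_large):.3f}")
```

In a run like this, the sample means center on the population mean (consistent with an unbiased estimator), while the sample maximums center below the population maximum (a biased estimator). The spread for n = 50 is smaller than for n = 5.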

Describing Sampling Distributions: Shape

Sampling distributions can take on many shapes. The same statistic can have sampling distributions with different shapes depending on the population distribution and the sample size.

[Figure: sampling distributions for different statistics used to estimate the number of tanks in Germany during World War II. The blue line represents the true number of tanks.]
A. Which of these statistics appear to be biased estimators?
B. Of the unbiased estimators, which is the best? Explain.

Section 7.2 Sample Proportions

After this section, you should be able to
✓ find the mean and standard deviation of the sampling distribution of a sample proportion
✓ determine whether or not it is appropriate to use the Normal approximation to calculate probabilities involving the sample proportion
✓ calculate probabilities involving the sample proportion
✓ evaluate a claim about a population proportion using the sampling distribution of the sample proportion

http://www.rossmanchance.com/applets/reeses/reesespieces.html

The Sampling Distribution of $\hat{p}$

What do you notice about the shape, center, and spread of each?

[Figure: sampling distributions of $\hat{p}$ for n = 100 and n = 400]

Sample Proportion Formulas

$\mu_{\hat{p}} = p$
$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$

The sample size MUST be less than 10% of the total population for the standard deviation formula to apply.

Normal Approximation & Sample Proportions

As the sample size increases, the sampling distribution of the sample proportion approaches a Normal distribution; therefore, we can use Normal calculations. Before using Normal calculations, check the Normal conditions:
(sample size)(proportion) must be at least 10: $np \ge 10$.
(sample size)(1 - proportion) must be at least 10: $n(1-p) \ge 10$.
Both conditions must hold.

Normal Approximation & Sample Proportions

In the game of Scrabble, each player starts by drawing 7 tiles from a bag of 100 tiles. There are 42 vowels, 56 consonants, and 2 blank tiles. Cait chooses an SRS of 7 tiles. Let $\hat{p}$ be the proportion of vowels in her sample.
a) Is the 10% condition met? Justify your answer.
b) Is the Normal condition met? Justify your answer.

(a) Yes. Seven tiles is less than 10% of the population of 100 tiles.
(b) No. Since the sample size is only 7, both np = 7(0.42) = 2.94 and n(1 - p) = 7(0.58) = 4.06 are less than 10. The Normal condition is not satisfied.

Normal Approximation & Sample Proportions

A polling organization asks an SRS of 1500 first-year college students how far away their home is. Suppose that 35% of all first-year students actually attend college within 50 miles of home. What is the probability that the random sample of 1500 students will give a result within 2 percentage points of this true value?

We have an SRS of size n = 1500 drawn from a population in which the proportion p = 0.35 attend college within 50 miles of home.

$\mu_{\hat{p}} = 0.35$
$\sigma_{\hat{p}} = \sqrt{\frac{(0.35)(0.65)}{1500}} = 0.0123$

Conditions:
Independence: It is reasonable to assume that there are more than 15,000 college freshmen, and therefore the sample represents less than 10% of the population.
Normality: Additionally, np = 1500(0.35) = 525 and n(1 - p) = 1500(0.65) = 975 are both greater than 10, so it is reasonable to assume normality.
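The formulas and condition checks for a sample proportion can be collected into a short Python sketch. The function name `sample_proportion_model` is hypothetical, and the Normal condition is coded as "at least 10"; `statistics.NormalDist` from the standard library plays the role of the calculator's normalcdf.

```python
from math import sqrt
from statistics import NormalDist


def sample_proportion_model(p: float, n: int):
    """Return (mean, standard deviation, Normal condition met?) for the sample proportion."""
    mean_p_hat = p
    sd_p_hat = sqrt(p * (1 - p) / n)
    # Normal condition: expected successes and failures are both at least 10.
    normal_ok = n * p >= 10 and n * (1 - p) >= 10
    return mean_p_hat, sd_p_hat, normal_ok


# Polling example: n = 1500 first-year students, p = 0.35.
mean_p_hat, sd_p_hat, normal_ok = sample_proportion_model(0.35, 1500)
model = NormalDist(mean_p_hat, sd_p_hat)
# Probability that the sample proportion lands within 2 percentage points of p.
prob_within_2_points = model.cdf(0.37) - model.cdf(0.33)

# Scrabble example: n = 7 tiles, p = 0.42 vowels.
_, _, scrabble_normal_ok = sample_proportion_model(0.42, 7)

print(f"sd of p-hat = {sd_p_hat:.4f}, Normal condition met: {normal_ok}")
print(f"P(0.33 <= p-hat <= 0.37) = {prob_within_2_points:.4f}")
print(f"Scrabble Normal condition met: {scrabble_normal_ok}")
```

The 10% condition is not checked in code because it depends on the population size, which is only assumed in the polling example.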

Normal Approximation & Sample Proportions (polling example, continued)

DO: $P(0.33 \le \hat{p} \le 0.37)$ = normalcdf(0.33, 0.37, 0.35, 0.0123) = 0.8961

CONCLUDE: There is an 89.61% chance that the sample will yield results within 2 percentage points of the true value.

Normal Approximation & Sample Proportions

The Harvard College Alcohol Study finds that 67% of college students support efforts to crack down on underage drinking. The study took a random sample of almost 15,000 students, so the population proportion who support a crackdown is close to p = 0.67. The administration of a local college surveys an SRS of 100 students and finds that 62 support a crackdown on underage drinking. Suppose that the proportion of all students attending this college who support a crackdown is 67%, the same as the national proportion.

What is the probability that the proportion in an SRS of 100 students is as small as or smaller than the result of the administration's sample?

$\mu_{\hat{p}} = 0.67$
$\sigma_{\hat{p}} = \sqrt{\frac{(0.67)(0.33)}{100}} = 0.04702$

Conditions:
Independence: It is reasonable to assume that there are more than 1000 students at this college, and therefore the sample represents less than 10% of the population.
Normality: Additionally, np = 100(0.67) = 67 and n(1 - p) = 100(0.33) = 33 are both greater than 10, so it is reasonable to assume normality.

DO: $P(\hat{p} \le 0.62)$ = normalcdf(0, 0.62, 0.67, 0.04702) = 0.1438
Be sure to include labels!

CONCLUDE: There is a 14.38% chance that the sample will yield results at or below 62%, given that the true population proportion is 67%.

FYI: Derivation of Formulas

In Chapter 6, we learned that the mean and standard deviation of a binomial random variable X are

$\mu_X = np$
$\sigma_X = \sqrt{np(1-p)}$

Since $\hat{p} = X/n = (1/n)X$, we are just multiplying the random variable X by a constant (1/n) to get the random variable $\hat{p}$. Therefore,

$\mu_{\hat{p}} = \frac{1}{n}(np) = p$, so $\hat{p}$ is an unbiased estimator of p.

$\sigma_{\hat{p}} = \frac{1}{n}\sqrt{np(1-p)} = \sqrt{\frac{np(1-p)}{n^2}} = \sqrt{\frac{p(1-p)}{n}}$, so as the sample size increases, the spread decreases.
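The Harvard College Alcohol Study calculation can be reproduced in Python, and because the sample proportion is just X/n for a binomial count X, the Normal answer can also be compared with an exact binomial sum. This is a sketch for illustration only; the variable names are made up, and the exact binomial comparison is an addition that the notes do not carry out.

```python
from math import comb, sqrt
from statistics import NormalDist

p, n, observed = 0.67, 100, 0.62

# Normal approximation, using the formulas for the mean and sd of p-hat.
sd_p_hat = sqrt(p * (1 - p) / n)
normal_prob = NormalDist(p, sd_p_hat).cdf(observed)

# Exact binomial calculation: p-hat <= 0.62 is the same event as X <= 62.
max_count = round(observed * n)
exact_prob = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(max_count + 1))

print(f"sd of p-hat = {sd_p_hat:.5f}")
print(f"Normal approximation: P(p-hat <= 0.62) = {normal_prob:.4f}")
print(f"Exact binomial:       P(X <= 62)       = {exact_prob:.4f}")
```

The two probabilities are close but not identical, which is expected: the Normal model is an approximation to the discrete binomial distribution.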

Section 7.3 Sample Means

After this section, you should be able to
✓ find the mean and standard deviation of the sampling distribution of a sample mean
✓ calculate probabilities involving a sample mean when the population distribution is Normal
✓ explain how the shape of the sampling distribution of sample means is related to the shape of the population distribution
✓ apply the central limit theorem to help find probabilities involving a sample mean

Sample Means

Consider the mean household earnings for samples of size 100. Compare the population distribution with the sampling distribution of the sample mean. What do you notice about the shape, center, and spread of each?

[Figure: population distribution of household earnings (left) and sampling distribution of the sample mean for n = 100 (right)]

Theory: Sample Means

Sample Means Formulas

$\mu_{\bar{x}} = \mu$
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$

Notes: The sample size must be less than 10% of the population to satisfy the independence condition. The formulas for the mean and standard deviation of the sample mean hold no matter the shape of the population distribution.
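The standard results for the sample mean, $\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}} = \sigma/\sqrt{n}$, can be checked by simulation. The sketch below is an illustration under assumptions that are not from the notes: the population is taken to be exponential with mean 1 and standard deviation 1 (a strongly right-skewed shape), samples have size 25, and the function name `simulate_sample_means` is made up.

```python
import random
from math import sqrt
from statistics import mean, stdev


def simulate_sample_means(n: int, num_samples: int, seed: int = 7) -> list:
    """Sample means from repeated samples of size n from an exponential population."""
    rng = random.Random(seed)
    # ASSUMPTION: exponential population with rate 1, so mu = 1 and sigma = 1.
    return [mean(rng.expovariate(1.0) for _ in range(n)) for _ in range(num_samples)]


mu, sigma, n = 1.0, 1.0, 25
x_bars = simulate_sample_means(n, num_samples=2000)

print(f"mean of sample means: {mean(x_bars):.3f} (formula: {mu:.3f})")
print(f"sd of sample means:   {stdev(x_bars):.3f} (formula: {sigma / sqrt(n):.3f})")
```

Even though the population is far from Normal, the simulated center and spread agree closely with the formulas, which is the point of the note that they hold for any population shape.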

REVIEW: Young Women's Heights

The height of young women follows a Normal distribution with mean µ = 64.5 inches and standard deviation σ = 2.5 inches. Find the probability that a randomly selected young woman is taller than 66.5 inches.

STATE: Let X = the height of a randomly selected young woman. X is N(64.5, 2.5).

PLAN: Since the sample in this case is only one person, the sample size is clearly smaller than 10% of the population.

DO: $z = \frac{66.5 - 64.5}{2.5} = 0.80$
$P(X > 66.5) = P(Z > 0.80) = 1 - 0.7881 = 0.2119$
OR normalcdf(66.5, 10000, 64.5, 2.5) = 0.2118

CONCLUDE: The probability of choosing a young woman at random whose height exceeds 66.5 inches is about 0.21.

Example: Young Women's Heights

The height of young women follows a Normal distribution with mean µ = 64.5 inches and standard deviation σ = 2.5 inches. Find the probability that the mean height of an SRS of 10 young women exceeds 66.5 inches.

DO: The sample mean has $\mu_{\bar{x}} = 64.5$ and $\sigma_{\bar{x}} = \frac{2.5}{\sqrt{10}} = 0.79$.
$z = \frac{66.5 - 64.5}{0.79} = 2.53$
$P(\bar{x} > 66.5) = P(Z > 2.53) = 1 - 0.9943 = 0.0057$
OR normalcdf(66.5, 10000, 64.5, 0.7905) = 0.0057

CONCLUDE: There is a 0.57% chance of getting a sample of 10 women with a mean height above 66.5 inches. It is very unlikely (less than a 1% chance) that we would choose an SRS of 10 young women whose average height exceeds 66.5 inches.

Sample Distributions & Normality

If the population is Normal, then the sampling distribution of the sample mean is Normal. No further checks are needed!

Sample Distributions & Normality: If the population is NOT Normal, then:

If the sample is large enough, the distribution of sample means is approximately Normal, no matter what shape the population distribution has, as long as the population has a finite standard deviation.
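Both heights calculations can be reproduced with `statistics.NormalDist`, which stands in for the calculator's normalcdf here. This is a minimal sketch using the values from the example; the variable names are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 64.5, 2.5  # population mean and standard deviation (inches)
cutoff = 66.5

# One randomly selected young woman: use the population distribution.
p_individual = 1 - NormalDist(mu, sigma).cdf(cutoff)

# Mean of an SRS of 10 young women: use the sampling distribution of the sample mean.
n = 10
sd_mean = sigma / sqrt(n)
p_mean = 1 - NormalDist(mu, sd_mean).cdf(cutoff)

print(f"P(X > 66.5) = {p_individual:.4f}")
print(f"sd of sample mean = {sd_mean:.4f}")
print(f"P(x-bar > 66.5) = {p_mean:.4f}")
```

The only change between the two calculations is the standard deviation: 2.5 for an individual versus 2.5 divided by the square root of 10 for the sample mean, which is why the second probability is so much smaller.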

Sample Distributions & Normality: HOW LARGE IS LARGE ENOUGH?

If the population shape is...   Minimum sample size to assume Normal
Normal                          0 (any sample size)
Slightly skewed                 15
Heavily skewed                  30
Unknown                         30

Example: Servicing Air Conditioners

Based on many service records from the past year, the time (in hours) that a technician requires to complete preventative maintenance on an air conditioner follows a distribution that is strongly right-skewed and whose most likely outcomes are close to 0. The mean time is µ = 1 hour and the standard deviation is σ = 1 hour.

Your company will service an SRS of 70 air conditioners. You have budgeted 1.1 hours per unit. Will this be enough time? What is the chance that the technician will not finish within the allotted time, that is, that the mean time for the 70 units exceeds 1.1 hours?

Conditions:
Independence: It is reasonable to assume that the company has serviced more than 700 units; therefore, the 70 units in the sample represent less than 10% of the population.
Normal: Even though the population has a strong right skew, a sample size of 70 is large enough to assume normality.

$\mu_{\bar{x}} = \mu = 1$
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{1}{\sqrt{70}} = 0.12$

DO: $z = \frac{1.1 - 1}{0.12} = 0.83$
$P(\bar{x} > 1.1) = P(Z > 0.83) = 1 - 0.7967 = 0.2033$
OR normalcdf(1.1000001, 10000, 1, 0.1195) = 0.2013

CONCLUDE: If you budget 1.1 hours per unit, there is a 20.13% chance the technicians will not complete the work within the budgeted time.
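The air conditioner example can be worked in Python as well. The sketch below assumes the central limit theorem applies (n = 70 is above the rule-of-thumb minimum of 30 for a heavily skewed population) and uses the unrounded standard deviation, so its answer sits between the table-based and calculator-based values in the example. The variable names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 1.0, 1.0  # mean and standard deviation of maintenance time (hours)
n = 70                # number of air conditioners in the SRS
budget = 1.1          # budgeted hours per unit

# Rule of thumb from the table: a heavily skewed population needs n >= 30.
large_enough = n >= 30

# Sampling distribution of the sample mean (approximately Normal by the CLT).
sd_mean = sigma / sqrt(n)
p_over_budget = 1 - NormalDist(mu, sd_mean).cdf(budget)

print(f"sample large enough for Normal approximation: {large_enough}")
print(f"sd of sample mean = {sd_mean:.4f}")
print(f"P(x-bar > 1.1) = {p_over_budget:.4f}")
```

Small differences between this result, the z-table answer, and the calculator answer come only from rounding the standard deviation and the z-score.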