M&M Madness

In this investigation you will use the statistics skills that you have learned to display and analyze a cup of peanut M&Ms.

Part I: Categorical Analysis: M&M Color Distribution

1. Record the number of each color of peanut M&Ms:

          Your numbers    Class numbers
Red
Orange
Yellow
Green
Blue
Brown
Total

2. Now calculate the percentage of each color in your cup and for the whole class:

          Your percentages    Class percentages    Official percentages
Red                                                10%
Orange                                             20%
Yellow                                             20%
Green                                              20%
Blue                                               20%
Brown                                              10%
Total                                              100%

3. Make a three-way bar graph showing your percentages, the class percentages, and the official percentages.
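The percentage calculation in step 2 can be checked with a short script. This is only a sketch: the counts below are hypothetical placeholders, not real data, and the names `counts`, `official`, and `color_percentages` are chosen here for illustration.

```python
# Hypothetical counts for one cup; replace them with your own tallies.
counts = {"Red": 3, "Orange": 5, "Yellow": 4, "Green": 4, "Blue": 3, "Brown": 1}

# Official percentages as listed in the worksheet table.
official = {"Red": 10, "Orange": 20, "Yellow": 20, "Green": 20, "Blue": 20, "Brown": 10}


def color_percentages(counts):
    """Return each color's share of the total count, as a percentage."""
    total = sum(counts.values())
    return {color: 100 * n / total for color, n in counts.items()}


percentages = color_percentages(counts)

# Print your percentage next to the official one for each color.
for color, pct in percentages.items():
    print(f"{color:<7}{pct:6.1f}%   official {official[color]}%")
```

The same function works for the class numbers: pass the class tallies instead of your own.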
Part II: Quantitative Analysis: M&M Weight Distribution

Peanut M&Ms:

1. Record the weight of each peanut M&M in your cup:

2. Find the Five Number Summary for your data:
Min =        Q1 =        Median =        Q3 =        Max =

3. Identify any outliers using 1.5 × IQR:

4. Draw the boxplot for this data with the outliers identified by *:
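The Five Number Summary and the 1.5 × IQR rule from steps 2 and 3 can be sketched as follows. Two things here are assumptions rather than part of the worksheet: the weights are made-up values, and the quartiles are taken as the medians of the lower and upper halves of the sorted data (leaving out the middle value when the count is odd). Your instructor or calculator may use a slightly different quartile convention.

```python
from statistics import median


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]
    # Skip the middle value when n is odd so both halves have equal size.
    upper = xs[half + 1:] if n % 2 else xs[half:]
    return xs[0], median(lower), median(xs), median(upper), xs[-1]


def iqr_outliers(data):
    """Return the values lying more than 1.5 x IQR outside Q1 or Q3."""
    _, q1, _, q3, _ = five_number_summary(data)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in data if x < low_fence or x > high_fence]


# Hypothetical weights in grams, for illustration only.
weights = [2.1, 2.3, 2.4, 2.5, 2.5, 2.6, 2.7, 2.8, 2.9, 4.0]

print(five_number_summary(weights))
print(iqr_outliers(weights))
```

Any value returned by `iqr_outliers` is one you would mark with * on the boxplot.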
5. Choose 10 of your data points. Find the mean and the standard deviation for your data.

           x        x - mean        (x - mean)²
1
2
3
4
5
6
7
8
9
10
Total =                             Sum =
Mean =

Steps for finding Standard Deviation
1. Enter the weights in the x column.
2. Find the mean of the values.
3. Fill in the 3rd column by subtracting the mean from each x-value.
4. Fill in the last column by squaring each value in the 3rd column.
5. Add all the values in the last column.
6. Divide this number by N - 1.
7. Take the square root of this value.

Standard deviation =

6. Group data: Make a frequency table for the weights of the peanut M&Ms and make a histogram of the percentages.

Class        Tally        Frequency        Percentage



Total
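The seven standard deviation steps in item 5 can be followed line by line in code. This is a minimal sketch with ten made-up weights; the function name `mean_and_sample_sd` is chosen here for illustration.

```python
from math import sqrt


def mean_and_sample_sd(xs):
    """Follow the worksheet steps: deviations, squares, sum, / (N - 1), root."""
    n = len(xs)
    mean = sum(xs) / n                      # step 2: mean of the values
    deviations = [x - mean for x in xs]     # step 3: x - mean
    squares = [d ** 2 for d in deviations]  # step 4: (x - mean) squared
    total = sum(squares)                    # step 5: sum of the last column
    variance = total / (n - 1)              # step 6: divide by N - 1
    return mean, sqrt(variance)             # step 7: square root


# Hypothetical weights in grams, for illustration only.
weights = [2.1, 2.3, 2.4, 2.5, 2.5, 2.6, 2.7, 2.8, 2.9, 3.2]

mean, sd = mean_and_sample_sd(weights)
print(f"Mean = {mean:.2f}, Standard deviation = {sd:.3f}")
```

Comparing the intermediate lists with your table columns is a quick way to find an arithmetic slip.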
7. Class data: Make a frequency table for the weights of the peanut M&Ms and make a histogram of the percentages. Use the classes specified by your instructor. To find the midpoint of each class, add the start of that class to the start of the next class and divide by 2.

Class        Midpoint        Frequency        Percentage



Total

Histogram of Class Data for Peanut M&Ms

8. Mark the midpoint of each class at the top of each bar. Join the points with a smooth curve to create a continuous frequency graph.
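The frequency table in item 7 can be built as sketched below. The class boundaries here (starting at 1.5 g, width 0.5 g) and the weights are assumptions for illustration only; use the classes your instructor specifies. Each class is taken to include its start value and exclude the start of the next class.

```python
def frequency_table(data, start, width, n_classes):
    """Build rows of (class start, class end, midpoint, frequency, percentage)."""
    rows = []
    for i in range(n_classes):
        lo = start + i * width
        hi = lo + width  # start of the next class
        freq = sum(1 for x in data if lo <= x < hi)
        rows.append({
            "class": (lo, hi),
            "midpoint": (lo + hi) / 2,  # (start + next start) / 2
            "frequency": freq,
            # Values outside every class are not counted, so the
            # percentages add to 100 only if the classes cover all data.
            "percentage": 100 * freq / len(data),
        })
    return rows


# Hypothetical weights in grams, for illustration only.
weights = [2.1, 2.3, 2.4, 2.5, 2.5, 2.6, 2.7, 2.8, 2.9, 3.2]

table = frequency_table(weights, start=1.5, width=0.5, n_classes=5)
for row in table:
    lo, hi = row["class"]
    print(f"{lo:.1f}-{hi:.1f}  mid {row['midpoint']:.2f}  "
          f"f = {row['frequency']}  {row['percentage']:.0f}%")
```

The percentage column gives the bar heights for the histogram, and the midpoint column gives the points to mark at the top of each bar in step 8.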
9. Find the Five Number Summary for the class data:
Min =        Q1 =        Median =        Q3 =        Max =

10. Identify outliers using 1.5 × IQR:

11. Draw the boxplot for this data with the outliers identified by *:

12. Find the mean and standard deviation for the class data. Because there are many values for the class data, we will calculate these values using the frequency table rather than the raw data. The results are approximate, but they are easier to calculate.

Column 1    Column 2       Column 3        Column 4    Column 5       Column 6
Class       Midpoint, X    Frequency, f    X*f         (X - Mean)²    (X - Mean)² * f



                           Σ f =           Σ X*f =                    Σ (X - Mean)² * f =

Mean = Σ X*f / Σ f =
Here are the steps for this calculation:
1. Enter the classes in the first column.
2. Find the midpoint of each class: (start of the class + start of the next class)/2. Enter the midpoints in column 2.
3. Enter the frequencies in column 3. Find the sum of all frequencies and write the total at the bottom of the column, where it says: Σ f =
4. Multiply the values in the 2nd column by the values in the 3rd column to fill the 4th column. Find the sum of all the values in the 4th column and enter the answer where it says: Σ X*f =
5. Find the mean of the values using the formula: Mean = Σ X*f / Σ f
6. Find the differences between the values in the 2nd column and the mean, and square these differences. Enter the answers in column 5.
7. Multiply the values in the 3rd column by the values in the 5th column and enter the answers in column 6. Find the sum of all the values in the 6th column and write the answer where it says: Σ (X - Mean)² * f =
8. Use the formula for SD to find the standard deviation. Divide the value at the bottom of the 6th column by Σ f - 1, where Σ f is the total at the bottom of the 3rd column (this matches the N - 1 used for the raw data). Then take the square root of the answer.

13. Identify outliers using the standard deviation and the mean. A data point is considered an outlier if it is more than 3 standard deviations above the mean OR more than 3 standard deviations below the mean.
Upper boundary = mean + 3*SD =
Lower boundary = mean - 3*SD =
List your outliers:

14. On the histogram for the class data, mark the values of the mean, the median, and the mode, and mark any outliers using *s.

15. Is the frequency distribution symmetric, left-skewed, or right-skewed?
For symmetric, the mean is approximately equal to the median.
For left-skewed, the mean is less than the median.
For right-skewed, the mean is more than the median.

16. If the distribution is approximately bell-shaped and symmetric, the mean and standard deviation are better measures of center and spread than the median and IQR.
Which is more appropriate to use here: the mean and standard deviation OR the median and IQR?
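The frequency-table calculation in item 12 and the 3 standard deviation boundaries in item 13 can be sketched together. The midpoints and frequencies below are made-up values. The sketch also assumes that the sum of column 6 is divided by Σ f - 1, to match the N - 1 divisor in the raw-data steps; check with your instructor if a different divisor is expected.

```python
from math import sqrt


def grouped_mean_sd(midpoints, freqs):
    """Approximate the mean and SD from class midpoints and frequencies."""
    sum_f = sum(freqs)                                     # sum of column 3
    sum_xf = sum(x * f for x, f in zip(midpoints, freqs))  # sum of column 4
    mean = sum_xf / sum_f
    # Column 6: squared distance from the mean, weighted by frequency.
    sum_sq = sum((x - mean) ** 2 * f for x, f in zip(midpoints, freqs))
    # Assumption: divide by (sum of f) - 1, as for a sample.
    sd = sqrt(sum_sq / (sum_f - 1))
    return mean, sd


# Hypothetical class midpoints (grams) and frequencies, for illustration only.
midpoints = [2.25, 2.75, 3.25]
freqs = [3, 6, 1]

mean, sd = grouped_mean_sd(midpoints, freqs)
upper = mean + 3 * sd  # upper boundary for outliers
lower = mean - 3 * sd  # lower boundary for outliers

print(f"Mean = {mean:.2f}, SD = {sd:.3f}")
print(f"Outliers lie below {lower:.2f} or above {upper:.2f}")
```

Any individual weight outside the two printed boundaries would be listed as an outlier in item 13.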