Unit 1: Statistics Mrs. Valentine Math III
1.1 Analyzing Data
Statistics: the study, analysis, and interpretation of data.
Finding measures of central tendency:
  Mean: the average of the data.
  Median: with an odd number of data points, the middle value of the data; with an even number, the mean of the two middle values.
  Mode: the most frequent value(s).
    Bimodal: two modes.
    Three or more modes: the mode is not statistically relevant.
1.1 Analyzing Data
Example: The frequency table shows the number of job offers received by each student within two months of graduating with a mathematics degree from a small college. What are the mean, median, and mode for the job offers per student?

  Job Offers   0   1   2   3   4
  Students     2   2   4   5   2

Mean: (0×2 + 1×2 + 2×4 + 3×5 + 4×2) / 15 = 33 / 15 = 2.2
Median: 0, 0, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4. The middle (8th) of the 15 ordered values is 2, so the median is 2.
Mode: Five students received 3 job offers each, more than for any other number of offers. The mode is 3.
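The three measures for the job-offer table can be checked with a short Python sketch. It assumes Python 3.8 or later (for `statistics.multimode`); the variable names are illustrative and not part of the original slides.

```python
from statistics import mean, median, multimode

# Frequency table from the example: job offers -> number of students.
offers_to_students = {0: 2, 1: 2, 2: 4, 3: 5, 4: 2}

# Expand the table into one value per student (15 values in total).
offers = [k for k, count in offers_to_students.items() for _ in range(count)]

offers_mean = mean(offers)        # sum of the values divided by how many there are
offers_median = median(offers)    # middle value of the ordered data
offers_modes = multimode(offers)  # list of the most frequent value(s)

print(offers_mean, offers_median, offers_modes)
```

`multimode` returns a list, so a bimodal data set would show both modes.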
1.1 Analyzing Data
Outliers
  Values that differ markedly from the rest of the data and can be misleading.
  They affect measures of central tendency.
  To identify them, look at the differences between adjacent values in the ordered data.
Example: Find the outlier.
  Values:        56   59   59   65   65   73   98
  Differences:      3    0    6    0    8   25
  The gap of 25 is far larger than the other gaps, so 98 is the outlier.
Comparing Data Sets
  Range of a data set: the difference between the greatest and least values.
  Quartiles: if the data are divided into four parts by medians, the values separating the four parts are the quartiles.
  Interquartile range: the difference between the third and first quartiles.
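The adjacent-difference check can be sketched in Python as follows. The helper name `adjacent_gaps` is hypothetical, and picking out the widest gap is only a rough heuristic for spotting a candidate outlier, not a formal test.

```python
def adjacent_gaps(values):
    """Return the sorted values and the differences between neighbours."""
    ordered = sorted(values)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    return ordered, gaps

ordered, gaps = adjacent_gaps([56, 59, 59, 65, 65, 73, 98])

# Locate the widest gap and the two values on either side of it.
widest = gaps.index(max(gaps))
isolated_pair = (ordered[widest], ordered[widest + 1])

# Range: greatest value minus least value.
data_range = ordered[-1] - ordered[0]

print(gaps, isolated_pair, data_range)
```

The widest gap separates 73 from 98, which points to 98 as the value standing apart from the rest.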
1.1 Analyzing Data
Example: The table shows average monthly water temperatures for four locations on the Gulf of Mexico. How can you compare the 12 water temperatures from St. Petersburg with the 12 water temperatures from Key West? Use the mean, mode, range, and interquartile range to compare the two.
1.1 Analyzing Data
[Comparison table: St. Petersburg | Key West]
1.1 Analyzing Data
Box-and-Whisker Plots
  The quartiles bound the center box.
  The minimum and maximum values are used to form the whiskers.
  On the calculator: press STAT, enter the data in L1, enter the window values, and draw the box-and-whisker plot.
Example: 12, 11, 15, 12, 19, 14, 18, 15, 16
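The five values a box-and-whisker plot needs can be computed with a short Python sketch. It assumes the common classroom convention of taking quartiles as the medians of the lower and upper halves, leaving out the overall median when the count is odd. Other quartile conventions exist and can give slightly different values. The function name is hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using medians of the halves."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # Skip the middle value when the number of data points is odd.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return ordered[0], median(lower), median(ordered), median(upper), ordered[-1]

summary = five_number_summary([12, 11, 15, 12, 19, 14, 18, 15, 16])
iqr = summary[3] - summary[1]  # interquartile range: Q3 minus Q1

print(summary, iqr)
```

For the example data this gives a box from 12 to 17 with the median at 15, and whiskers reaching to 11 and 19.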
1.1 Analyzing Data
Percentile: a number from 0 to 100 that shows the percent of data less than or equal to a value x.
Example: Here is an ordered list of midterm test scores for a Spanish class. What value is at the 65th percentile?
  Of the 20 values, 65% fall at or below the value at the 65th percentile.
  20 × 65% = 20 × 0.65 = 13
  13 values are at or below 82, so 82 is the value at the 65th percentile.
For the same data, what is the value at the 55th percentile? At the 95th percentile?
  55th: 79
  95th: 98
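The counting step in the percentile example can be sketched in Python. The original score list is not reproduced in these notes, so the 20 scores below are hypothetical stand-ins. The function name, and the choice to round up when the count is not a whole number, are also assumptions.

```python
def value_at_percentile(ordered_values, percent):
    """Return the value with `percent`% of the data at or below it."""
    n = len(ordered_values)
    # Ceiling of n * percent / 100, done in integer arithmetic to avoid
    # floating-point surprises such as 20 * 0.55 not being exactly 11.
    count = -(-n * percent // 100)
    count = max(count, 1)  # the lowest value is the smallest possible answer
    return ordered_values[count - 1]

# Hypothetical ordered scores: 61, 62, ..., 80 (20 values).
hypothetical_scores = list(range(61, 81))

p65 = value_at_percentile(hypothetical_scores, 65)  # 13th value
p55 = value_at_percentile(hypothetical_scores, 55)  # 11th value
p95 = value_at_percentile(hypothetical_scores, 95)  # 19th value

print(p65, p55, p95)
```

With 20 values, the 65th, 55th, and 95th percentiles land on the 13th, 11th, and 19th values, matching the counts used in the example.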
1.2 Standard Deviation and Normal Distribution
Standard deviation and variance: measures showing how far the data deviate from the mean.
  σ (sigma) = standard deviation
  σ² (sigma squared) = variance
Finding the standard deviation:
  1. Find the mean.
  2. Find the difference between each value and the mean.
  3. Square each difference.
  4. Find the mean of the squares (= variance).
  5. Take the square root of the variance (= standard deviation).
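The procedure above can be written compactly as formulas. The notation is an assumption, not taken from the slides: n is the number of data values, x_i the individual values, and x̄ their mean.

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
\sigma^2 = \frac{1}{n}\sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2, \qquad
\sigma = \sqrt{\sigma^2}
```

Dividing by n matches "find the mean of the squares" in the procedure, which treats the data as the whole population.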
1.2 Standard Deviation and Normal Distribution
Example: What are the mean and variance of these values? 6.9, 8.7, 7.6, 4.8, and 9.0

  x      mean    x - mean    (x - mean)²
  6.9    7.4      -0.5         0.25
  8.7    7.4       1.3         1.69
  7.6    7.4       0.2         0.04
  4.8    7.4      -2.6         6.76
  9.0    7.4       1.6         2.56
                   Sum:       11.30

The mean is 37.0 / 5 = 7.4. The variance is 11.30 / 5 = 2.26. The standard deviation is √2.26, or about 1.5.
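The same calculation can be followed step by step in Python. This is a minimal sketch with illustrative variable names. The call to `statistics.pstdev` is only a cross-check, since it also divides by the number of values.

```python
from math import sqrt
from statistics import pstdev

values = [6.9, 8.7, 7.6, 4.8, 9.0]

mean = sum(values) / len(values)             # find the mean
differences = [x - mean for x in values]     # difference from the mean
squares = [d ** 2 for d in differences]      # square each difference
variance = sum(squares) / len(squares)       # mean of the squares
std_dev = sqrt(variance)                     # square root of the variance

print(round(mean, 2), round(variance, 2), round(std_dev, 1))

# Cross-check against the standard library's population standard deviation.
library_std_dev = pstdev(values)
```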
1.2 Standard Deviation and Normal Distribution
Using the Calculator
  STAT → EDIT → enter the data in L1
  STAT → CALC → 1-Var Stats
    x̄ = mean
    σx = standard deviation
Example: Find the mean and standard deviation of the Dow Jones Industrial data for the first 12 weeks of 1988, and then for the first 12 weeks of 2008.
1.2 Standard Deviation and Normal Distribution
Describing Data with Standard Deviation
  Written as a range around the mean.
Example: If the mean is 50 and the standard deviation is 10, then a value x with 40 ≤ x ≤ 60 is within one standard deviation of the mean.
Example: Within how many standard deviations of the mean do all of the values fall?
  All data fall within 2 standard deviations of the mean.
1.2 Standard Deviation and Normal Distribution
Normal Distributions
  Discrete probability distribution: a finite number of possible events.
  Continuous probability distribution: any value in an interval of real numbers (usually large data sets).
  Normal distribution: data that vary randomly from the mean.
    About 68% of data fall within 1 standard deviation of the mean.
    About 95% of data fall within 2 standard deviations.
    About 99.7% of data fall within 3 standard deviations.
  Skewed data: data that do not vary predictably from the mean.
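The percentages for a normal distribution can be checked numerically. The sketch below assumes Python 3.8 or later, which provides `statistics.NormalDist`, and uses a standard normal curve (mean 0, standard deviation 1) so that k standard deviations is simply the interval from -k to k.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Area under the curve between -k and +k standard deviations.
within = {
    k: standard_normal.cdf(k) - standard_normal.cdf(-k)
    for k in (1, 2, 3)
}

for k, share in within.items():
    print(f"within {k} standard deviation(s): {share:.1%}")
```

The exact areas are slightly different from the rounded classroom figures, which is why the percentages are usually stated with "about".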
1.2 Standard Deviation and Normal Distribution
Example: The bar graph gives the weights of a population of female brown bears. The red curve shows how the weights are normally distributed about the mean, 115 kg. Approximately what percent of female brown bears weigh between 100 and 129 kg?
  Estimate and add the percents for the intervals 100-109, 110-119, and 120-129.
  23 + 42 + 23 = 88
  About 88% of female brown bears weigh between 100 and 129 kg.
1.2 Standard Deviation and Normal Distribution
Sketching a Normal Curve
  Use the symmetry of a normal distribution to help draw the curve.
  1. Find the mean and standard deviation of the population.
  2. Multiply the standard deviation by 1, 2, and 3.
  3. Draw vertical lines at the mean plus and minus each of these values.
  4. Sketch the normal curve.
Example: For a population of male European eels, the mean body length and the lengths one standard deviation above and below the mean are shown. Sketch a normal curve showing the eel lengths at one, two, and three standard deviations from the mean.
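Finding where to draw the vertical lines is simple arithmetic, sketched here in Python. The eel measurements are not reproduced in these notes, so the sketch borrows the adult-height figures used elsewhere in this unit (mean 69.5 in, standard deviation 2.5 in). The function name is hypothetical.

```python
def sigma_marks(mean, std_dev):
    """Return the x-positions from mean - 3σ to mean + 3σ, in order."""
    return [mean + k * std_dev for k in range(-3, 4)]

# Adult-height figures from this unit: mean 69.5 in, standard deviation 2.5 in.
marks = sigma_marks(69.5, 2.5)

print(marks)
```

The middle entry is the mean, and the entries on either side mark one, two, and three standard deviations in each direction.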
1.2 Standard Deviation and Normal Distribution
Analyzing a Normal Distribution
  The area between the curve and an interval on the x-axis represents probability.
Example: The heights of adult American males are approximately normally distributed with mean 69.5 in and standard deviation 2.5 in. What percent of adult American males are between 67 in and 74.5 in tall?
  67 in is 1 standard deviation below the mean, and 74.5 in is 2 standard deviations above it.
  P(67 < h < 74.5) = 0.34 + 0.34 + 0.135 = 0.815
  About 82% of adult American males are between 67 in and 74.5 in tall.
About how many adult American males in a group of 2000 would you expect to be taller than 6 ft (72 in)?
  72 in is 1 standard deviation above the mean.
  P(h > 72) = 0.5 - 0.34 = 0.16
  2000 × 0.16 = 320 adult American males
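The same two questions can be answered from the exact normal curve instead of the rounded percentages. This sketch assumes Python 3.8 or later for `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

# Heights of adult American males: mean 69.5 in, standard deviation 2.5 in.
heights = NormalDist(mu=69.5, sigma=2.5)

# Area under the curve between 67 in and 74.5 in.
p_between = heights.cdf(74.5) - heights.cdf(67)

# Area to the right of 72 in, and the expected count in a group of 2000.
p_taller = 1 - heights.cdf(72)
expected_taller = 2000 * p_taller

print(f"{p_between:.3f} {p_taller:.3f} {expected_taller:.0f}")
```

The exact areas differ slightly from 0.815 and 0.16 because the worked example uses the rounded 68% and 95% figures. The expected count comes out a little under 320 for the same reason.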
1.3 Sample and Study Types
Analyzing Sampling Methods
  Population: all members of a set.
  Samples
    Convenience sample: members are readily available.
    Self-selected sample: volunteers only.
    Systematic sample: members selected at regular intervals from an ordered set.
    Random sample: all members equally likely to be chosen.
  Bias
    Part of the population is over- or underrepresented.
    A systematic error introduced by the sampling method.
  Data from non-random samples may be accurate, but will likely be suspect.
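Three of the sampling methods can be imitated in Python on a made-up population. Everything here is hypothetical: the population is just the ID numbers 1 to 100, the sample size is 10, and the seed is fixed only so the run is repeatable. A self-selected sample depends on who volunteers, so it is not simulated.

```python
import random

population = list(range(1, 101))  # hypothetical member IDs 1..100
rng = random.Random(0)            # fixed seed for a repeatable run

# Random sample: every member is equally likely to be chosen.
random_sample = rng.sample(population, 10)

# Systematic sample: pick a random starting point, then every 10th member.
start = rng.randrange(10)
systematic_sample = population[start::10]

# Convenience sample: whoever is easiest to reach, here the first 10 IDs.
convenience_sample = population[:10]

print(random_sample)
print(systematic_sample)
print(convenience_sample)
```

The convenience sample only ever contains the low ID numbers, which illustrates how part of a population can be overrepresented.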
1.3 Sample and Study Types
Examples: A newspaper wants to find out what percent of the city population favors a property tax increase to raise money for local parks. What is the sampling method used for each situation? Does the sample have a bias? Explain.
  A newspaper article on the tax increase invites readers to express their opinions on the newspaper's website.
    Self-selected sample. Bias depends on who visits the website, as some views may be over- or underrepresented.
  A reporter interviews people leaving the city's largest park.
    Convenience sample. Bias is likely because the sample may overrepresent park supporters.
  A survey service calls every 50th listing from the local phone book.
    Systematic sample. It may have a bias if there is some link between people who are listed (or not listed) in a phone book and people who pay property taxes.
1.3 Sample and Study Types
Analyzing Study Methods
  Methods
    Observational study: observe the subjects, but do not affect them.
    Controlled experiment: two groups, with a treatment imposed on one but not the other. The results of the two groups are compared.
    Survey: ask every member of the sample a set of questions.
Examples: Which type of study method is described? Can the sample statistics be used to make a general conclusion about the population?
  A list of students is randomly generated from the school database. Information for every student is entered into the database, and each student has an equally likely chance of being selected. The students selected are asked how much time they spend on household chores each week.
  A gardener tests a new plant food by planting seeds from the same package in the same soil and location. Each plant is given the same amount of water, but half of the plants are given food and the other half are given no food at all. He records the growth and flowering rates of each plant.
1.3 Sample and Study Types
Designing a Survey
Example: During the 2008 Olympic Games, a U.S. swimmer won more medals than any other swimmer in history. What sampling method could you use to find the percent of students in your school who recognize that swimmer from a photograph? What is an example of a survey question that is likely to yield information that has no bias?
  When thinking of a sampling method, consider:
    Representation of the population
    Randomness of selection
  What conditions would require you to revise the sampling method to avoid bias?
  What should the photograph not depict?
  Question every 10th student entering school in the morning. This is systematic sampling, which usually introduces little bias.
  A possible unbiased question is "Who is pictured in this photograph?"