Statistics

Measures of Central Tendency

1) Mean: The average of all data values.

Mean = $\bar{x} = \frac{x_1 + x_2 + x_3 + \dots + x_n}{n}$

i.e.: Add up all the values and divide by the total number of values.

2) Mode: The most frequent data value.

3) Median: The data value that is in the middle of the distribution when the values are placed in increasing order.

If the number of data values n is odd, the position of the median is $\frac{n+1}{2}$.
If n is even, $\frac{n+1}{2}$ falls halfway between positions $\frac{n}{2}$ and $\frac{n}{2}+1$, so the median is the mean of the two values at those positions.

Example 1: Find the mean, median, mode and range of the following set of numbers: 5, 8, 8, 11, 14, 15

Example 2: Find the mean, median, mode and range of the following set of numbers: 1, 1, 2, 3, 3, 3, 4, 5

Measures of Dispersion

1) Range: The range is a measure of dispersion, but it is often confused with a measure of central tendency. It measures how spread out the data are.

Range = $x_{max} - x_{min}$

i.e.: Subtract the lowest value from the highest.
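The definitions of mean, median, mode and range above can be checked with a short Python sketch. The function names are illustrative choices, not part of the notes, and the data is the list from Example 1.

```python
from collections import Counter


def mean(values):
    """Add up all the values and divide by how many there are."""
    return sum(values) / len(values)


def median(values):
    """Middle value of the sorted data (mean of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]  # position (n + 1) / 2 when counting from 1
    return (ordered[middle - 1] + ordered[middle]) / 2


def modes(values):
    """Every value that occurs most often (a data set can have several modes)."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


def data_range(values):
    """Highest value minus lowest value."""
    return max(values) - min(values)


data = [5, 8, 8, 11, 14, 15]  # Example 1 from the notes
print(round(mean(data), 2), median(data), modes(data), data_range(data))
# prints: 10.17 9.5 [8] 10
```

Returning the modes as a list is a design choice for this sketch, since a data set may have more than one most frequent value.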
2) Mean Deviation: Indicates the average of the deviations of the data from the mean, i.e.: how close to or how far from the mean our values are.

To calculate the mean deviation we must:
1) Calculate the mean $\bar{x}$
2) Place our data in the following table
3) Find the sum of the absolute deviations, $\sum |x_i - \bar{x}|$
4) Divide by the total number of values

Value $x_i$     Value − Mean $x_i - \bar{x}$     Absolute Value $|x_i - \bar{x}|$
3               3 − 9 = −6                       6

Mean Deviation = $\frac{\sum |x_i - \bar{x}|}{n}$

Example 1: Find the range and the mean deviation of the following data: 40, 50, 60, 70, 70, 85, 90, 95

The data will not always be presented to us in the same way. A question may provide the data in a table of values, a graph or a stem-and-leaf plot.

Stem-and-Leaf Plot

In a stem-and-leaf plot, the numbers to the left of the line are the stems and correspond to the tens digits of the data. The numbers to the right of the line are the leaves and correspond to the units digits of the data.
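Going back to the four mean deviation steps listed earlier in this section, they can be sketched in Python as follows. The function name is an illustrative choice, and the data is the list from the mean deviation example.

```python
def mean_deviation(values):
    """Average distance of the values from their mean."""
    mean = sum(values) / len(values)                        # step 1: the mean
    absolute_deviations = [abs(x - mean) for x in values]   # steps 2-3: |x_i - mean|
    return sum(absolute_deviations) / len(values)           # step 4: divide by n


data = [40, 50, 60, 70, 70, 85, 90, 95]
print(max(data) - min(data))  # range: 55
print(mean_deviation(data))   # mean deviation: 15.0
```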
Example 1: Organize the following data into a stem-and-leaf plot: 25, 28, 32, 32, 39, 41, 44, 47, 53, 55, 62, 64

Example 2: Given the following stem-and-leaf plot, determine the mean for this group of data values.

Example 3: Given the following double stem-and-leaf plot, determine which group has the larger range.
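As a rough illustration of how a stem-and-leaf plot is built, the sketch below groups the data from Example 1 by tens digit. It assumes whole numbers below 100 and uses a hypothetical helper name; stems that have no leaves are simply left out.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group values by tens digit (stem), listing the units digits (leaves) in order."""
    plot = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)  # tens digit, units digit
        plot[stem].append(leaf)
    return dict(plot)


data = [25, 28, 32, 32, 39, 41, 44, 47, 53, 55, 62, 64]
for stem, leaves in stem_and_leaf(data).items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
# 2 | 5 8
# 3 | 2 2 9
# 4 | 1 4 7
# 5 | 3 5
# 6 | 2 4
```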
Percentile

Percentiles divide an ordered group of data into 100 sections, each containing 1% of the data. You will be asked to find the percentile rank of a particular data value, or to find a data value given its percentile rank.

The following formula is used to calculate the percentile rank:

Percentile Rank = $\frac{N_{less} + 0.5\,N_{equal}}{N_{total}} \times 100$

Where
$N_{less}$ is the number of values less than the data value x
$N_{equal}$ is the number of values equal to the data value x
$N_{total}$ is the total number of values in the distribution

****Note: The percentile rank is always rounded up to the nearest integer, i.e.: 25.2 → 26 and 25.6 → 26.

Example 1: What is the percentile rank of an athlete from the group below who went to the gym 25 times? 22, 22, 23, 24, 24, 24, 25, 27, 27, 28, 30, 31

Example 2: The results of the 198 students who wrote a math test are listed below in increasing order. Jill's result was 80. What was her percentile rank?

Example 3:
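The percentile rank formula and its round-up rule can be sketched in Python as shown here, using the gym data from Example 1. The function name is an illustrative choice; the calculation is done with whole numbers so that the rounding up is exact.

```python
def percentile_rank(values, x):
    """Percentile rank of x, rounded up to the nearest integer as in the notes."""
    n_less = sum(1 for v in values if v < x)    # values below x
    n_equal = sum(1 for v in values if v == x)  # values equal to x
    # (n_less + 0.5 * n_equal) / n_total * 100, rewritten with integers only:
    numerator = (2 * n_less + n_equal) * 50
    return -(-numerator // len(values))         # integer division that rounds up


visits = [22, 22, 23, 24, 24, 24, 25, 27, 27, 28, 30, 31]
print(percentile_rank(visits, 25))  # (6 + 0.5) / 12 * 100 = 54.17, rounded up: 55
```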
Finding a data value given the percentile

Value Rank = $\frac{\text{Percentile Rank} \times N_{total}}{100}$

****Note: When finding the value rank we always round down, i.e.: 25.2 → 25 and 65.7 → 65.

Example: Given the list of data values below, which one has a percentile rank of at least 72? 5, 7, 7, 9, 12, 15, 15, 18, 19, 24, 27, 29, 30, 35

Linear Correlation

If there is a correlation between two events, it means that the two events are linked. To a certain extent, they have an effect on each other. We will estimate and measure the correlation of events using a scatter plot. There are 3 different types of correlation we can observe on a scatter plot.

It is also important to describe the strength of the correlation using strong or weak.
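Returning briefly to the value rank formula from the percentile notes earlier in this section, here is a minimal Python sketch of it. The function name is hypothetical, the percentile is assumed to be a whole number, and the guard that keeps the position at 1 or more is an added assumption for very small percentiles.

```python
def value_at_percentile(values, percentile):
    """Data value found at the value rank, rounding the rank down as in the notes."""
    ordered = sorted(values)
    position = (percentile * len(ordered)) // 100  # value rank, rounded down
    position = max(position, 1)                    # assumption: never go below the 1st value
    return ordered[position - 1]                   # positions are counted from 1


data = [5, 7, 7, 9, 12, 15, 15, 18, 19, 24, 27, 29, 30, 35]
print(value_at_percentile(data, 72))  # 72 * 14 / 100 = 10.08, so the 10th value: 24
```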
We will use the following formula to more accurately determine the strength of the correlation.

Correlation Coefficient (r) = $\pm\left(1 - \frac{\text{short}}{\text{long}}\right)$

Where short and long are the side lengths of the rectangle drawn around your data. The sign is + when the points rise from left to right and − when they fall.

We describe the correlation using the following words:

When describing the correlation we must always state:
1. its direction (positive or negative)
2. its strength (strong or weak)

Example: Find the correlation coefficient of the following scatter plot.

Example 2: Match the following scatter plots with a correlation coefficient.
A. 1
B. 0.30
C. -0.40
D. -0.80
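The rectangle estimate of the correlation coefficient can be sketched in Python as follows. The function name and the measurements (a rectangle 2 units by 8 units around a falling cloud of points) are made up for illustration and do not come from the notes.

```python
def estimate_correlation(short, long, direction):
    """Rectangle estimate r = +/-(1 - short / long).

    direction is +1 for points rising left to right, -1 for points falling.
    """
    if long <= 0 or short < 0 or short > long:
        raise ValueError("need 0 <= short <= long and long > 0")
    if direction not in (1, -1):
        raise ValueError("direction must be +1 or -1")
    return direction * (1 - short / long)


# Hypothetical measurements: short side 2, long side 8, points falling.
print(estimate_correlation(2, 8, -1))  # -0.75
```

A long, thin rectangle gives a value close to ±1, while a nearly square rectangle gives a value close to 0.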
Contingency Table

A contingency table illustrates a two-variable distribution, just like a scatter plot. For example, we could look at the number of hours of sleep (y) a student gets before an exam and their grade on the exam (x).

To see if a correlation exists:
1) Circle all the non-zero numbers
2) Turn your paper counterclockwise
3) If there is a pattern of a line forming, then there is a correlation

**You must determine if the correlation is strong or weak, and positive or negative.

Example: Describe the strength and the direction of the correlation.

                                            Golf score
Years of experience   [70, 80[   [80, 90[   [90, 100[   [100, 110[   [110, 120[   Total
[0, 4[                    0          0          0            0            5          5
[4, 8[                    0          0          1            2            2          5
[8, 12[                   0          1          2            4            5         12
[12, 16[                  1          3          0            3            0          7
[16, 20[                  1          0          0            0            0          1
Total                     2          4          3            9           12         30
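As a quick check of the golf table, the Python sketch below recomputes the row and column totals and marks the non-zero cells, which is the same idea as circling them. The variable names are illustrative; the rows are years of experience and the columns are golf scores.

```python
# Rows: years of experience [0,4[ ... [16,20[ ; columns: golf score [70,80[ ... [110,120[
table = [
    [0, 0, 0, 0, 5],
    [0, 0, 1, 2, 2],
    [0, 1, 2, 4, 5],
    [1, 3, 0, 3, 0],
    [1, 0, 0, 0, 0],
]

row_totals = [sum(row) for row in table]
column_totals = [sum(column) for column in zip(*table)]
print(row_totals)       # [5, 5, 12, 7, 1]
print(column_totals)    # [2, 4, 3, 9, 12]
print(sum(row_totals))  # 30

# Mark each non-zero cell with "o" to see whether a line pattern appears.
for row in table:
    print(" ".join("o" if count else "." for count in row))
# . . . . o
# . . o o o
# . o o o o
# o o . o .
# o . . . .
```

In this layout the marks drift from the top right to the bottom left, which suggests that more years of experience tend to go with lower golf scores.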
Regression Line (Line of Best Fit)

The regression line is used to approximate the results of a statistical survey. The regression line is the line that best fits a set of points on the scatter plot. For example, the regression line on a scatter plot will look like:

In order to find the regression line we will use the equation of a line, y = ax + b.

Steps:
1) Draw a rectangle around the distribution
2) Draw a straight line down the middle of the rectangle
3) Identify two points on your line and use them to find a (the slope)
4) Solve for b by substituting one of the two points into y = ax + b

Example: Find the equation of the regression line.
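Steps 3 and 4 can be sketched in Python as shown below. The function name and the two points (2, 5) and (6, 13) are made up for illustration; in practice they would be read off the line drawn through the middle of the rectangle.

```python
def line_through(point_1, point_2):
    """Slope a and intercept b of the line y = ax + b through two points."""
    (x1, y1), (x2, y2) = point_1, point_2
    if x1 == x2:
        raise ValueError("the two points must have different x values")
    a = (y2 - y1) / (x2 - x1)  # step 3: slope = rise / run
    b = y1 - a * x1            # step 4: substitute one point and solve for b
    return a, b


# Hypothetical points taken from the median line of the rectangle.
a, b = line_through((2, 5), (6, 13))
print(f"y = {a}x + {b}")  # y = 2.0x + 1.0
```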