Elementary Statistics


Q: What is data?
Q: What does the data look like?
Q: What conclusions can we draw from the data?
Q: Where is the middle of the data?
Q: Why is the spread of the data important?
Q: Can we model the data?
Q: How do we know if we have a good model?
Q: Is our data affected by other variables?

Definitions
Individuals: Objects described by a set of data. Individuals may be people, but they may also be animals or things.
Variable: Any characteristic of an individual. A variable can take on different values for different individuals.

Categorical and Quantitative Variables
Categorical variable: Places an individual into one of several categories.
Quantitative variable: Takes numerical values for which arithmetic operations make sense.
Distribution: Tells what values a variable takes and how often it takes these values.

Dataset, Individuals and Variables
Variables: Company, Stock Exchange, Annual Sales ($M), Earn/Sh. ($)

Company        Stock Exchange
Dataram        AMEX
EnergySouth    OTC
Keystone       NYSE
LandCare       NYSE
Psychemedics   AMEX

Each row (observation) describes one individual, each column is a variable, the whole table is the data set, and a single entry is a datum.

1.1: Describing Distributions with Graphs
Two Basic Strategies:
1) Begin by examining each variable by itself. Then move on to study the relationships among variables.
2) Begin with a graph or graphs. Then add numerical summaries of specific aspects of the data.
Different types of graphs: Bar graph, Pie chart, Stemplot, Back-to-back stemplot, Histogram, Time plot

Categorical Variables
Pie charts
Bar graphs

Bar Graphs
Bar graph: A graph that uses the heights of bars to represent the counts in each category of a variable.
Example: Consider a grade distribution that gives the count for each grade: A, B, C, D, Other. How could we display the data using a bar graph?
[Figure: bar graph of the grade counts, one bar per grade.]

Pie Charts
Pie chart:
1) A chart which represents the data using percentages.
2) Break up a circle (pie) into the respective percentages.

Pie Charts
[Figure: table of Grade, Count and Percent, with a pie chart whose slices are the grades A, B, C, D, F.]

Quantitative Variables
Stem-and-Leaf (Stemplot)
Histogram

Describing Distributions
When describing a distribution we describe four things:
(1) the shape of the distribution
(2) the center and the spread of the distribution
(3) any unusual features in the distribution

Stemplot
Example: Here are the grades Max achieved during his first two years in school.
Grades: 88, 72, 91, 83, 77, 90, 45, 83, 94, 91, 86, 77, 82, 100, 58, 76, 83, 88, 72, 66
[Figure: construction of the stemplot, Steps 1 and 2, then Step 3.]

Back-To-Back Stemplot
This is a stemplot which allows you to see and compare the distributions of two related data sets.
Example: Here are the grades Lulu received during her first two years at college:
Grades: 66, 77, 78, 84, 92, 90, 86, 78, 71, 93, 82, 55, 73, 95, 87, 76, 93, 82, 66, 75
To make a back-to-back stemplot, you write the stems in the middle, with the leaves of one data set going off to the right and the leaves of the other going off to the left. You want the smaller values closer to the stem.
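The construction of a stemplot can be sketched in a few lines of Python. This is only an illustration, assuming two- or three-digit whole-number scores with the ones digit as the leaf; the function name `stemplot` is a hypothetical helper, not part of the original notes.

```python
from collections import defaultdict

def stemplot(values):
    """Return stemplot lines, using the ones digit as the leaf (an assumption)."""
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)   # stem = all digits but the last
    lines = []
    # Include every stem between the smallest and largest so gaps stay visible.
    for stem in range(min(leaves), max(leaves) + 1):
        row = " ".join(str(leaf) for leaf in leaves.get(stem, []))
        lines.append(f"{stem:>2} | {row}".rstrip())
    return lines

max_grades = [88, 72, 91, 83, 77, 90, 45, 83, 94, 91,
              86, 77, 82, 100, 58, 76, 83, 88, 72, 66]
for line in stemplot(max_grades):
    print(line)
```

Sorting first means the leaves on each stem come out in increasing order, which matches the final step of the hand construction.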

Back-To-Back Stemplot
Lulu's Grades: 66, 77, 78, 84, 92, 90, 86, 78, 71, 93, 82, 55, 73, 95, 87, 76, 93, 82, 66, 75
Max's Grades: 88, 72, 91, 83, 77, 90, 45, 83, 94, 91, 86, 77, 82, 100, 58, 76, 83, 88, 72, 66
[Figure: back-to-back stemplot of Lulu's and Max's grades.]

Splitting Stems
If you have a large data set (many leaves), then sometimes a stemplot will not work very well. For instance, if you have a large number of leaves and only a few stems, you might want to split the stems.
Example: Consider the following test scores:
71, 71, 72, 74, 75, 75, 75, 76, 77, 79, 80, 81, 81, 82, 83, 83, 83, 83, 84, 85, 85, 88, 89, 90, 90, 90, 91, 93, 95, 96, 97
Normally we would set up the stems as follows:

7 | 1 1 2 4 5 5 5 6 7 9
8 | 0 1 1 2 3 3 3 3 4 5 5 8 9
9 | 0 0 0 1 3 5 6 7

However, we could split each stem in two, so that the first copy of a stem gets the scores ending in 0 to 4 and the second copy gets the scores ending in 5 to 9:

7 | 1 1 2 4
7 | 5 5 5 6 7 9
8 | 0 1 1 2 3 3 3 3 4
8 | 5 5 8 9
9 | 0 0 0 1 3
9 | 5 6 7

With a stem-and-leaf plot you CAN:
1. Determine the center of the distribution.
2. Determine the range or spread of the data.
3. Determine the shape of the distribution.
4. Determine any range of values not represented.
5. Determine if there is a concentration of data.
6. Determine if there are any outliers (extreme values).

Advantages
Graphically display the distribution of the data
Retain the actual data
Easy to construct
Make sorting of the data easy

Disadvantages
Not very effective for large data sets (would take a long time to construct).
Choice of the stems depends on the data type and data range.
Different choices for the stems can produce different-looking distributions.

Histograms
A histogram breaks the range of a variable up into (usually equal) intervals, and displays only the count or percent of the observations which fall into each interval.
Notes:
You can choose the intervals (usually equal).
Slower to construct than stemplots.
Histograms do not display the individual observations.
If a score falls on an interval boundary, you must decide in advance which interval it will go in.

Histograms
Steps to drawing a histogram:
1) Divide the range into classes of equal width.
2) Count the number of observations in each class. These are called frequencies.
3) Draw the histogram.

1a. Determine the number of class intervals to use. One rule is to calculate the square root of the sample size, and round up.
Example: If the sample consists of 210 subjects, then the square root of 210 is 14.5, which is rounded up to 15.

1b. Determine the range of the data by subtracting the smallest observation from the largest observation.
Example: If the smallest observation is 202 and the largest observation is 496, then the range is 496 - 202 = 294.

1c. Divide the range by the number of class intervals and round to a convenient number. This will be the equal class width.
Example: Range = 294, Number of intervals = 15, so 294/15 = 19.6, which rounds to 20. Use a class width of 20 for each interval.

1d. The lower limit of the first interval should be a multiple of the class width and should be chosen such that the smallest observation is contained in the first interval.
Example: Multiples of 20 are: 0, 20, 40, 60, 80, 100, 120, 140, 160, 180, 200, 220, 240, 260, ... The smallest observation is 202. Choose the lower limit of the first interval to be 200.

The rest of the intervals are obtained by adding the class width to this first lower limit value.
Example: 200-220, 220-240, 240-260, ..., 480-500
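Steps 1a to 1d can be followed mechanically. The sketch below reproduces the worked numbers (n = 210, smallest = 202, largest = 496); the function name `class_intervals` and the `round_to` parameter, which stands in for "a convenient number", are assumptions made for illustration.

```python
import math

def class_intervals(n, smallest, largest, round_to=10):
    """Follow steps 1a-1d; `round_to` is an assumed notion of a 'convenient' width."""
    k = math.ceil(math.sqrt(n))               # 1a: number of class intervals
    data_range = largest - smallest           # 1b: range of the data
    # 1c: class width, rounded to the nearest multiple of `round_to`
    width = max(round_to, round(data_range / k / round_to) * round_to)
    lower = (smallest // width) * width       # 1d: multiple of width at or below the minimum
    edges = [lower + i * width for i in range(k + 1)]
    while edges[-1] <= largest:               # extend if rounding left the maximum uncovered
        edges.append(edges[-1] + width)
    return k, data_range, width, edges

k, data_range, width, edges = class_intervals(210, 202, 496)
print(k, data_range, width)          # number of intervals, range, class width
print(edges[:4], "...", edges[-1])   # interval boundaries
```

Other rounding conventions for the class width are equally defensible; the point is only that the first boundary is a multiple of the width and the intervals cover every observation.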

2a. Count the number of observations falling in each interval. These counts are referred to as class frequencies (or just frequencies). As a rule, an observation that falls on the boundary of two intervals should be placed in the second interval, not the first.
Example: 220 goes in the interval 220-240, not 200-220.

2b. Determine the relative frequency for each class interval by dividing the class frequency by the total number of observations and multiplying by 100. The relative frequencies are the percentages of the observations in each interval.
Example: If the frequency in a class is 4 and the number of observations (n) is 210, then the relative frequency is (4/210)*100 = 1.9%.

Display the class intervals, class frequencies, and relative frequencies (%) in a frequency table.
[Table: Class Intervals, Frequencies, Relative Frequencies (%).]
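The counting rule in 2a and the percentage in 2b can be sketched as follows. The five observations used here are hypothetical, chosen only so that two of them fall exactly on a boundary; `frequency_table` is an illustrative name.

```python
def frequency_table(data, edges):
    """Count observations per class; a boundary value goes in the second interval."""
    counts = [0] * (len(edges) - 1)
    for x in data:
        for i in range(len(counts)):
            if edges[i] <= x < edges[i + 1]:   # lower limit included, upper limit excluded
                counts[i] += 1
                break
    n = len(data)
    # Each row: (lower limit, upper limit, frequency, relative frequency in %)
    return [(edges[i], edges[i + 1], c, round(100 * c / n, 1))
            for i, c in enumerate(counts)]

# Hypothetical observations; 220 and 240 sit on class boundaries.
sample = [202, 219, 220, 221, 240]
for row in frequency_table(sample, [200, 220, 240, 260]):
    print(row)
```

Using a half-open interval (lower limit included, upper limit excluded) is what makes 220 land in 220-240 rather than 200-220.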

3. Construct the histogram. On the horizontal axis, mark and label the class intervals. On the vertical axis, mark and label the class frequencies (to create a frequency histogram) or the relative frequencies (to create a relative frequency histogram). Over each class interval, draw a rectangle whose height equals the corresponding frequency or relative frequency.

Example: Suppose the final breakdown in grades is given by a frequency table with columns Grade, Amount and Percent.
Percent: ..., 20%, 25%, 25%, 20%
[Figure: histogram of the grade breakdown.]

Population Density Curves
A smoothed version of a relative frequency histogram, such that the area under the curve represents relative frequency and the total area is 1.

Symmetric (Normal) Curve
Bell-shaped curve.
Most commonly used type of distribution.
Basis for many statistical inference procedures.
[Figure: symmetric (normal) curve.]

Skewed Left Distribution
General bell shape, with a long tail to the left.
[Figure: skewed left distribution.]

Skewed Right Distribution
General bell shape, with a long tail to the right.

[Figure: skewed right distribution.]

Bimodal Distribution
A distribution with two significant peaks.
[Figure: bimodal distribution.]

Trimodal Distribution
A distribution with three significant peaks.
[Figure: trimodal distribution.]

Describing a Distribution
1. Shape
2. Center and Spread
3. Unusual features (outliers, gaps, high concentrations of data)

Example 1: Grades Distribution
[Figure: stemplot of grades, where 3 | 7 = 37.]
Skewed left, center around 80, range from 0 to 95, at least one outlier.

Example 2
[Figure: stemplot, where 1 | 27 = 127.]
Bimodal, center around 500, ranges from 95 to 936, no outliers.

Example 3
[Figure: stemplot with split stems, where 1 | 2 = 12.]
Symmetric, center around 30, range from 0 to 51, missing values at 0*.

Example 4: 1988 State Data
[Figure: stemplot of the 1988 state data.]
Skewed right (bimodal?), center between 15,000 and 16,000, range from 11,100 to 23,100, two outliers.

Example 5: 1991 State Data
[Figure: stemplot of the 1991 state data.]
Roughly symmetric, center around 18,000, range from 13,300 to 25,900, two possible outliers.

Example 6
[Figure: relative frequency histogram of test scores, vertical axis from 0% to 20%.]
Roughly symmetric (bimodal), center around 35, range from 12 to 60, no outliers.

Time Plot
A time plot is a graph with two axes. One axis represents time, and the other axis represents the variable being measured.
[Figure: axes of a time plot, with Time on the horizontal axis and Variable on the vertical axis.]

Example: A table lists the home run totals (HR) by year for a certain baseball player over the last 10 years. Construct a time plot for this data set.
[Figure: time plot of home runs against year.]

What to look for in a distribution
1) Look for the overall pattern and for deviations from the pattern.
See if the distribution has a shape we can describe in a few words.
Describe the center and spread of the distribution.
2) One common deviation from the overall pattern in any graph of data is an outlier, i.e., an observation that falls outside the overall pattern of the graph.

1.2: Describing Distributions with Numbers
A. Measures of Location (Center)
Mean
Median
B. Measures of Spread (Variability)
Quartiles (Quantiles)
Variance and Standard deviation

Measures of Location
1. Mean (Average)
How to find the mean (average):
1) Add the values together.
2) Divide the total by the number of observations.
Example: Test Scores: 56, 65, 54, 55, 57, 54, 61, 62, 60, 55, 57, 56, 57, 61, 62, 60, 49, 66, 59, 80
Step 1: 56 + 65 + 54 + ... + 59 + 80 = 1186
Step 2: 1186 / 20 = 59.3

Mean
To find the mean $\bar{x}$ of a set of observations, add their values and divide by the number of observations. If the n observations are $x_1, x_2, x_3, \ldots, x_n$, their mean is:
$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}$
Or, in more compact notation:
$\bar{x} = \frac{1}{n} \sum x_i$

2. Median
How to find the median M:
1) Arrange the observations in order from smallest to largest.
2) If the number of observations is odd, then the median is located at the center of the list. So, if there are n observations, then the median is located in spot (n + 1) / 2.
3) If the number of observations is even, then the median is the average of the two terms in the middle spots. These are located in spots (n / 2) and (n / 2) + 1.

Median
Example of finding a median (odd number of observations):
List: 2, 4, 6, 3, 5, 2, 6, 8, 10, 11, 1
Step 1: Order the list: 1, 2, 2, 3, 4, 5, 6, 6, 8, 10, 11
Step 2: Find the middle term: (n + 1) / 2 = (11 + 1) / 2 = 6, so the median is the sixth term.
Median = 5

Example of finding a median (even number of observations):
List: 2, 4, 6, 3, 5, 2, 6, 8, 10, 11, 1, 12
Step 1: Order the list: 1, 2, 2, 3, 4, 5, 6, 6, 8, 10, 11, 12
Step 2: Find the two middle terms: n / 2 = 12 / 2 = 6 and (n / 2) + 1 = (12 / 2) + 1 = 7
Step 3: Average the sixth and seventh terms:
Median = (5 + 6) / 2 = 5.5

In The Presence Of Outliers
Q: Do outliers affect the mean and median?
Consider the list of numbers from 1 through 9: 1, 2, 3, 4, 5, 6, 7, 8, 9
The Mean is: 5
The Median is: 5
What if we put the number 100 at the end of the list: 1, 2, 3, 4, 5, 6, 7, 8, 9, 100
The Mean is: 14.5
The Median is: 5.5
A: Outliers affect the mean much more than the median!
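The two recipes, and the effect of the outlier, can be checked with a short sketch. The function names `mean` and `median` are written out by hand here for illustration rather than taken from a library.

```python
def mean(values):
    """Add the values together and divide by the number of observations."""
    return sum(values) / len(values)

def median(values):
    """Middle term of the ordered list, or the average of the two middle terms."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # spot (n + 1) / 2, counting from 1
    return (ordered[mid - 1] + ordered[mid]) / 2   # spots n / 2 and n / 2 + 1

one_to_nine = list(range(1, 10))
with_outlier = one_to_nine + [100]
print(mean(one_to_nine), median(one_to_nine))      # mean and median without the outlier
print(mean(with_outlier), median(with_outlier))    # mean and median with 100 added
```

Adding the single value 100 moves the mean by 9.5 but the median by only 0.5, which is what it means for the median to be resistant.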

Distributions
The mean is the point at which a histogram balances. For symmetric distributions the mean and median will be nearly the same.
[Figure: symmetric distribution with the mean $\bar{x}$ and the median M at nearly the same point.]
However, since the mean is influenced by outliers, for skewed distributions the mean will be pulled in the direction of the long tail, while the median will be resistant to the outliers and remain in nearly the same place.
[Figure: skewed right distribution, with the median M to the left of the mean $\bar{x}$.]

[Figure: skewed left distribution, with the mean $\bar{x}$ to the left of the median M.]

Measures of Spread
[Figure: two pay distributions, each marked with its low, center and high values.]

Measuring Spread
The simplest useful numerical description of a distribution consists of both a measure of center and a measure of spread.

1. Percentiles and Quantiles
Definition: The pth percentile of a distribution is the value such that p percent of the observations fall at or below it.
Example: The median is the 50th percentile.
Q: Why isn't the mean the 50th percentile?
1, 2, 3, 4, 5, 6, 7, 8, 9, 100
The Mean is: 14.5
The Median is: 5.5

Describing Spread
The Five Number Summary:
1) The Median
2) First Quartile: 25% of the observations lie below the first quartile
3) Third Quartile: 75% of the observations lie below the third quartile
4) Lowest Individual Observation (Minimum)
5) Highest Individual Observation (Maximum)

Quartiles
Calculating the quartiles:
1) Arrange the observations in increasing order and locate the median M in the ordered list of observations.
2) The first quartile Q1 is the median of the observations whose position in the ordered list is to the left of the location of the overall median.
3) The third quartile Q3 is the median of the observations whose position in the ordered list is to the right of the location of the overall median.

Example of calculating the first quartile:
List of quiz scores: 10, 8, 9, 4, 6, 6, 8, 9, 2, 7
1) Order the list: 2, 4, 6, 6, 7, 8, 8, 9, 9, 10
Find the median: (7 + 8) / 2 = 7.5
2) Find all the observations whose position in the list is to the left of the median: 2, 4, 6, 6, 7
Find the median of these values: Q1 = 6

Example of calculating the third quartile:
List of quiz scores: 10, 8, 9, 4, 6, 6, 8, 9, 2, 7, 11
1) Order the list: 2, 4, 6, 6, 7, 8, 8, 9, 9, 10, 11
Find the median: 8
2) Find all the observations whose position in the list is to the right of the median: 8, 9, 9, 10, 11
Find the median of these values: Q3 = 9

Interquartile Range
The interquartile range, IQR, is the distance between the first quartile and the third quartile.

Determining Outliers
Call an observation a suspected outlier if it falls more than 1.5 * IQR above the third quartile or below the first quartile.
Example: Imagine we have a bunch of test scores with Q1 = 50 and Q3 = 80.
The IQR = 80 - 50 = 30
So, 1.5 * IQR = 1.5 * 30 = 45
This means that if there are any scores above Q3 + 45 = 125 or any scores below Q1 - 45 = 5, then these scores are suspected outliers.

Boxplot
A boxplot is a graph of the five number summary. A central box spans the quartiles, with a line marking the median. Whiskers extend out from the box to the extremes.
Example: Low = 47, High = 98, Median = 77, Q1 = 65, Q3 = 85
[Figure: boxplot with Lowest Observation (47), Q1 (65), Median (77), Q3 (85) and Highest Observation (98).]
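The quartile recipe and the 1.5 * IQR rule can be sketched as below. This follows the convention used in these notes (the overall median is left out of both halves when n is odd); statistical software often uses slightly different quartile conventions. The function names are illustrative and the sketch assumes at least two observations.

```python
def median(values):
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    return ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2

def quartiles(values):
    """Return (Q1, M, Q3) using the halves to the left and right of the median."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]     # observations left of the median's location
    upper = ordered[-half:]    # observations right of the median's location
    return median(lower), median(ordered), median(upper)

def outlier_fences(q1, q3):
    """Values below the first fence or above the second are suspected outliers."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

print(quartiles([10, 8, 9, 4, 6, 6, 8, 9, 2, 7]))        # ten quiz scores
print(quartiles([10, 8, 9, 4, 6, 6, 8, 9, 2, 7, 11]))    # eleven quiz scores
print(outlier_fences(50, 80))                            # fences for Q1 = 50, Q3 = 80
```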

Describing Spread
2. The Standard Deviation
Variance: The variance of a set of observations is an average of the squared deviations of the observations from the mean.
Note: You divide by (n - 1) instead of n.
Standard Deviation: The SD is the square root of the variance.

Example: Test Scores: 65, 77, 83, 80, 95
1) Find the average: 80
2) Find the deviations from the mean, and their squares:

Obs   Deviation from Mean   Deviation Squared
65    -15                   225
77    -3                    9
83    3                     9
80    0                     0
95    15                    225

3) Determine the variance: (225 + 9 + 9 + 0 + 225) / (5 - 1) = 468 / 4 = 117
4) Determine the standard deviation: $\sqrt{117} = 10.8$

More Fancy Notation
The variance $s^2$ of a set of observations is the average of the squares of the deviations of the observations from their mean. In symbols, the variance of n observations $x_1, x_2, \ldots, x_n$ is:
$s^2 = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1}$
or, in more compact notation:
$s^2 = \frac{1}{n-1} \sum (x_i - \bar{x})^2$
The standard deviation s is the square root of the variance $s^2$:
$s = \sqrt{\frac{1}{n-1} \sum (x_i - \bar{x})^2}$

Another Example of Standard Deviation
Consider the following years in our past: 1792, 1666, 1362, 1614, 1460, 1867, 1439
Find the standard deviation of these years.
The mean is $\bar{x} = 11200 / 7 = 1600$.

$x_i$    $x_i - \bar{x}$    $(x_i - \bar{x})^2$
1792     192                36864
1666     66                 4356
1362     -238               56644
1614     14                 196
1460     -140               19600
1867     267                71289
1439     -161               25921

$s^2 = \frac{1}{n-1} \sum (x_i - \bar{x})^2 = \frac{1}{6}(214870) = 35811.67$
$s = \sqrt{35811.67} = 189.24$

Why Do We Square The Deviations?
1) The sum of the squared deviations of any set of observations from their mean is the smallest that the sum of squared deviations from any number can possibly be.

Why use the Standard Deviation and not the Variance?
1) The standard deviation is the natural measure of spread for an important class of symmetric unimodal distributions called the normal distributions.
2) The standard deviation is used by the normal distribution.
3) The variance uses squared deviations, which gives a different unit from the original data.

Why use n - 1?
1) The sum of the deviations is *always* zero. So, if we know n - 1 of the deviations, then the last deviation can be calculated. So, only n - 1 of the deviations can vary freely. These are called degrees of freedom.
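The variance and standard deviation formulas translate directly into code. The sketch below applies them to the two examples (the test scores and the list of years); the function names are illustrative, and Python's `statistics.variance` and `statistics.stdev` use the same n - 1 divisor.

```python
import math

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

def sample_sd(values):
    """Standard deviation: the square root of the variance."""
    return math.sqrt(sample_variance(values))

test_scores = [65, 77, 83, 80, 95]
years = [1792, 1666, 1362, 1614, 1460, 1867, 1439]

print(sample_variance(test_scores), round(sample_sd(test_scores), 1))
print(round(sample_variance(years), 2), round(sample_sd(years), 2))

# The deviations from the mean always sum to zero, which is why only
# n - 1 of them can vary freely.
year_mean = sum(years) / len(years)
print(sum(x - year_mean for x in years))
```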

Properties of Standard Deviations
1) The standard deviation measures spread about the mean and should be used only when the mean is chosen as the measure of center.
2) s = 0 only when there is no spread. This happens only when all observations have the same value. Otherwise, s > 0. As the observations get more spread out from the mean, s gets larger.
3) s, like the mean, is not resistant. A few outliers can make s very large.

Which Measure To Use?
Q: When is the mean better than the median? When is the five number summary better than the standard deviation?
Rules Of Thumb
A1: If outliers appear, or if your distribution is skewed, then the mean could be affected, so use the median and the five number summary.
A2: If the distribution is reasonably symmetric and is free of outliers, then the mean and standard deviation should be used.

Changing Units
Consider the following values: 30, 40, 50, 60, 70
The mean is 50 and the standard deviation is 15.8.
What happens to these if we take every score, multiply it by 2 and add 10?
We get these values: 70, 90, 110, 130, 150
The mean is 110 and the standard deviation is 31.6.
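The changing-units example can be verified numerically. This is a small check using Python's standard `statistics` module; the variable names `a` and `b` for the added constant and the multiplier are chosen here for illustration.

```python
import statistics

old = [30, 40, 50, 60, 70]
b, a = 2, 10                         # multiply every score by 2, then add 10
new = [b * x + a for x in old]

old_mean, old_sd = statistics.mean(old), statistics.stdev(old)
new_mean, new_sd = statistics.mean(new), statistics.stdev(new)

print(new)                           # the transformed values
print(old_mean, round(old_sd, 1))    # mean and standard deviation before
print(new_mean, round(new_sd, 1))    # mean and standard deviation after
```

The mean is both scaled and shifted (2 * 50 + 10 = 110), while the standard deviation is only scaled (2 * 15.8 = 31.6), because adding a constant moves every observation by the same amount and leaves the deviations unchanged.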

Changing Units
Old values: 30, 40, 50, 60, 70, with mean = 50 and s = 15.8
What happens to these if we take every score, multiply it by 2 and add 10?
New values: 70, 90, 110, 130, 150, with mean = 110 and s = 31.6

Linear Transformations
A linear transformation changes the original variable x into the new variable $x_{new}$ given by an equation of the form:
$x_{new} = a + bx$
Note: The constant a shifts all values of x either up or down by the value a. The constant b changes the size of the unit of the distribution.

Effects of Linear Transformations
1) To get the new spread, multiply the old spread by b (by |b| if b is negative).
2) To get the new mean, multiply the old mean by b and add the constant a.

1.3: The Normal Distributions
Density Curves
A density curve is a curve that:
1) is always on or above the horizontal axis, and
2) has area exactly 1 underneath it.
A density curve describes the overall pattern of a distribution. The area under the curve and above any range of values is the relative frequency of all observations that fall in that range.

Density Curves
[Figure: normal and skewed density curves, with the median and mean marked.]

Mean and Median of a Density Curve
The median of a density curve is the equal-areas point, the point that divides the area under the curve in half.
The mean of a density curve is the balance point, at which the curve would balance if made of solid material.

Normal Curves
Normal curves are curves which are symmetric, unimodal, and bell shaped.
µ represents the mean
σ represents the standard deviation
Equation for the curve:
$y = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}$

Why are Normal Distributions important in stats?
1) Normal distributions are good descriptions for some distributions of real data.
2) Normal distributions are good approximations to the results of many kinds of chance outcomes.
3) Many statistical inference procedures based on normal distributions work well for other roughly symmetric distributions.

The 68-95-99.7 Rule
In the normal distribution with mean µ and standard deviation σ:
68% of the observations fall within σ of the mean µ
95% of the observations fall within 2σ of the mean µ
99.7% of the observations fall within 3σ of the mean µ

Normal Curve Example
John collected data on the heights of women ages 18 to 24. He found that the distribution was roughly normal, with a mean of 64.5 inches and a standard deviation of 2.5 inches.
Q1: What percentage of these women were between the heights of 62 and 67 inches?
Q2: What percentage of these women were between the heights of 59.5 and 69.5 inches?
Q3: What percentage of these women were less than 64.5 inches tall?
Q4: What percentage of these women were less than 67 inches tall?
Q5: What percentage of these women were between the heights of 57 and 69.5 inches?

Other Questions
Q: What percentage of these women were between the heights of 60 and 70 inches?
Q: Who is considered more extraordinary, a 72 inch tall female or a 72 inch tall male?
Q: Who is considered more extraordinary, a 67 inch tall female or a 72 inch tall male?
Q: If you get a 26 on your ACT, and your neighbor gets a 1000 on their SAT, who did better?
We can answer these questions by a normalizing technique.
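Questions Q1 to Q5 use cut points that are whole numbers of standard deviations from the mean, so the 68-95-99.7 rule answers them approximately. As a hedged cross-check, the sketch below computes the same areas with `statistics.NormalDist` from the Python standard library (Python 3.8 or later); `percent_between` is an illustrative helper name.

```python
from statistics import NormalDist

heights = NormalDist(mu=64.5, sigma=2.5)   # women's heights, in inches

def percent_between(dist, low, high):
    """Percentage of the distribution lying between low and high."""
    return 100 * (dist.cdf(high) - dist.cdf(low))

q1 = percent_between(heights, 62, 67)        # within 1 standard deviation
q2 = percent_between(heights, 59.5, 69.5)    # within 2 standard deviations
q3 = 100 * heights.cdf(64.5)                 # below the mean
q4 = 100 * heights.cdf(67)                   # below mean + 1 standard deviation
q5 = percent_between(heights, 57, 69.5)      # from 3 below to 2 above

for name, value in [("Q1", q1), ("Q2", q2), ("Q3", q3), ("Q4", q4), ("Q5", q5)]:
    print(name, round(value, 1))
```

The computed values differ slightly from the rule-of-thumb answers because 68, 95 and 99.7 are themselves rounded figures.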

Normalizing Data
If we have two unrelated data sets, and they are both roughly normal, then we can perform a linear transformation on both data sets. This transformation will allow us to compare the data sets by examining how many standard deviations above or below the mean each score is.
Example: Mike has an ACT score of 26 and Carol has an SAT score of ...
Q: Who really has the better score?
A: Mike's ACT score is 1.2 standard deviations above the mean, and Carol's SAT score is 1.4 standard deviations above the mean. This means that Carol actually did better on her test than Mike!

Standardizing Observations
If x is an observation from a roughly symmetric distribution that has mean µ and standard deviation σ, then the standard value of x is:
$z = \frac{x - \mu}{\sigma}$
Note: A standardized score is often called a z-score.
Example: Women's IQs have a symmetric distribution with a mean of 97 and a standard deviation of 6. What is the standard score for a woman with an IQ of 106?
$z = \frac{106 - 97}{6} = \frac{9}{6} = 1.5$
Example: Men's IQs have a roughly symmetric distribution with a mean of 72 and a standard deviation of 8. What is the standard score for a man with an IQ of 66?
$z = \frac{66 - 72}{8} = \frac{-6}{8} = -0.75$
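Standardizing is a one-line calculation. The sketch below reproduces the two IQ examples; `z_score` is an illustrative function name.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

print(z_score(106, 97, 6))   # woman with an IQ of 106
print(z_score(66, 72, 8))    # man with an IQ of 66
```

Because a z-score is measured in standard deviations rather than in the original units, scores from different scales can be compared directly by their z-scores, as in the ACT and SAT comparison.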

Deep Thoughts
1) When we are normalizing our data set, we are really performing a linear transformation. This transformation will result in the data set still being normal.
2) If we start off with a distribution which is normal, with mean µ and standard deviation σ (denoted by N(µ, σ)), then after we have standardized the data set, we will have a normal distribution with mean 0 and standard deviation 1 (denoted by N(0, 1)).

The Standard Normal Distribution
Example: Men's IQs have a roughly symmetric distribution with a mean of 72 and a standard deviation of 8. The standard score for a man with an IQ of 66 is
$z = \frac{66 - 72}{8} = \frac{-6}{8} = -0.75$
Q: What percentage of people have a score below 66?

The Standard Normal Table
Table A is a table of areas under the standard normal curve. The table entry for each z value is the area under the curve to the left of z.

The Standard Normal Table
Example: Imagine we have done an experiment, and we want to find what percentage of people fell under a particular score x. We then proceed to find that the z-score for the value x is ...

Example: The Graduate Record Examinations (GRE) are widely used to help predict the performance of applicants to graduate schools. The range of possible scores on a GRE is 200 to 900. The psychology department at a university finds the scores of its applicants on the quantitative GRE are approximately normal with mean µ = 544 and standard deviation σ = 103.
Answer the following:
1) Find the percentage of people who scored 700 or higher on the test.
2) Find the percentage of people who scored below 500 on the test.
3) Find the percentage of people who scored between 500 and 800 on the test.

1) Find the percentage of people who scored 700 or higher on the test.
Find the percentage to the right of the 700 marker.

Find the z-score:
$z = \frac{700 - 544}{103} = \frac{156}{103} = 1.51$
P(X > 700) = P(Z > 1.51) = 1 - P(Z < 1.51) = 1 - 0.9345 = 0.0655

2) Find the percentage of people who scored below 500 on the test.
Find the percentage to the left of 500.
Find the z-score:
$z = \frac{500 - 544}{103} = \frac{-44}{103} = -0.43$
Answer: P(X < 500) = P(Z < -0.43) = 0.3336
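The table lookups in parts 1) and 2) can be reproduced numerically. The sketch below builds the standard normal cumulative area from `math.erf` and rounds z to two decimals to mimic reading a printed table; both helper names are assumptions made for illustration.

```python
import math

def standard_normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def table_z(x, mu, sigma):
    """z-score rounded to two decimals, as when using a printed normal table."""
    return round((x - mu) / sigma, 2)

mu, sigma = 544, 103                      # quantitative GRE scores

z_700 = table_z(700, mu, sigma)
p_700_or_higher = 1 - standard_normal_cdf(z_700)

z_500 = table_z(500, mu, sigma)
p_below_500 = standard_normal_cdf(z_500)

print(z_700, round(p_700_or_higher, 4))   # part 1
print(z_500, round(p_below_500, 4))       # part 2
```

Skipping the rounding of z gives slightly different areas; the rounded version is used here only so the numbers line up with a table-based calculation.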

3) Find the percentage of people who scored between 500 and 800 on the test.
Find the percentage between 500 and 800.
Find the first z-score:
$z = \frac{500 - 544}{103} = \frac{-44}{103} = -0.43$
Find the second z-score:
$z = \frac{800 - 544}{103} = \frac{256}{103} = 2.49$
Area = 0.9936 - 0.3336 = 0.66

Example: The Soup Nazi charges, on the average, $4.50 for a cup of soup (and, if you're lucky, some bread), with a standard deviation of $0.45. What is the probability that our check will be more than $5.00?

What is the probability that our check will be more than $5.00?
$Z = \frac{5.00 - 4.50}{0.45} = 1.11$
P(X > 5) = P(Z > 1.11) = 1 - 0.8665 = 0.1335 = 13.35%

Backward Normal Calculations
We can find the observed value x that corresponds to a given proportion in N(µ, σ) by unstandardizing the z-score.
1) State the problem
2) Draw a picture
3) Use the normal table to find the proportion closest to the one you need
4) Read off the z-value
5) Unstandardize: x = µ + zσ

Example
Find the value of z such that the probability of being less than z is 0.10.
z: P(Z < z) = 0.10
[Figure: standard normal curve (mean 0, σ = 1) with area 0.10 shaded to the left of z.]

To solve P(Z < z) = 0.10: in the body of the normal table, find the closest value to .10. Once found, determine the z value.
Closest is .1003
So z = -1.28
P(Z < -1.28) = .1003

Example
Find the value of z such that the probability of being greater than z is 0.33.
z: P(Z > z) = 0.33
z: P(Z < z) = 1 - 0.33 = 0.67
[Figure: standard normal curve (mean 0, σ = 1) with area 0.33 shaded to the right of z.]
In the body of the normal table, find the closest value to .67. Once found, determine the z value.
The table contains .6700
So z = .44
P(Z > .44) = .33

Example
X = time (in seconds) Americans spend stirring sugar into their iced tea
X ~ N(12.3, 3.1)
(1) Find the percent of Americans who spend between 20 and 22 seconds stirring sugar into their iced tea, i.e. find P(20 < X < 22).
P(20 < X < 22) = P((20 - 12.3)/3.1 < Z < (22 - 12.3)/3.1)
= P(2.48 < Z < 3.13)
= P(Z < 3.13) - P(Z < 2.48)
= .9991 - .9934
= .0057

(2) About 18.4% of Americans spend more than how many seconds stirring sugar into their iced tea? That is, find the value of X such that the probability of being greater than this value is .184.
(1) z: P(Z > z) = .184
(2) z: P(Z < z) = 1 - .184 = .816
(3) From the normal table, z = 0.90
(4) So x = µ + zσ = 12.3 + 0.90(3.1) = 12.3 + 2.79 = 15.09
The person would have to stir for more than 15.09 seconds.
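A backward normal calculation replaces the reverse table lookup with an inverse cumulative distribution function. The sketch below uses `statistics.NormalDist.inv_cdf` from the Python standard library (Python 3.8 or later) and rounds z to two decimals to mimic the table; `value_with_upper_tail` is an illustrative helper name.

```python
from statistics import NormalDist

standard = NormalDist()   # the standard normal distribution N(0, 1)

def value_with_upper_tail(p_above, mu, sigma):
    """Find x such that a proportion p_above of N(mu, sigma) lies above x."""
    z = round(standard.inv_cdf(1 - p_above), 2)   # area to the left is 1 - p_above
    return z, mu + z * sigma                      # unstandardize: x = mu + z * sigma

# Iced tea example: 18.4% of people stir longer than x seconds.
z, x = value_with_upper_tail(0.184, 12.3, 3.1)
print(z, round(x, 2))

# The two pure z-value examples: area 0.10 to the left, area 0.33 to the right.
print(round(standard.inv_cdf(0.10), 2))
print(round(standard.inv_cdf(1 - 0.33), 2))
```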

Example
X = IQ scores
X ~ N(112, 9)
Find the IQ score that places you in the top 2% of all scores.
1. z: P(Z > z) = .02
2. z: P(Z < z) = 1 - .02 = .98
3. From the normal table, z = 2.05
4. x = µ + zσ = 112 + 2.05(9) = 112 + 18.45 = 130.45

Exercise
SAT Math scores are approximately normally distributed with mean 500 and standard deviation ...
1. In what range do the middle 95% of all SAT Math scores lie?
2. What proportion of SAT Math scores are between 450 and 650?
3. If high school students having SAT Math scores in the top 10% of all scores are eligible for a certain scholarship, what is the lowest score a person eligible for the scholarship can have?


More information

Describing distributions with numbers

Describing distributions with numbers Describing distributions with numbers A large number or numerical methods are available for describing quantitative data sets. Most of these methods measure one of two data characteristics: The central

More information

CHAPTER 5: EXPLORING DATA DISTRIBUTIONS. Individuals are the objects described by a set of data. These individuals may be people, animals or things.

CHAPTER 5: EXPLORING DATA DISTRIBUTIONS. Individuals are the objects described by a set of data. These individuals may be people, animals or things. (c) Epstein 2013 Chapter 5: Exploring Data Distributions Page 1 CHAPTER 5: EXPLORING DATA DISTRIBUTIONS 5.1 Creating Histograms Individuals are the objects described by a set of data. These individuals

More information

Introduction to Statistics

Introduction to Statistics Introduction to Statistics Data and Statistics Data consists of information coming from observations, counts, measurements, or responses. Statistics is the science of collecting, organizing, analyzing,

More information

AP Final Review II Exploring Data (20% 30%)

AP Final Review II Exploring Data (20% 30%) AP Final Review II Exploring Data (20% 30%) Quantitative vs Categorical Variables Quantitative variables are numerical values for which arithmetic operations such as means make sense. It is usually a measure

More information

MATH 2560 C F03 Elementary Statistics I Lecture 1: Displaying Distributions with Graphs. Outline.

MATH 2560 C F03 Elementary Statistics I Lecture 1: Displaying Distributions with Graphs. Outline. MATH 2560 C F03 Elementary Statistics I Lecture 1: Displaying Distributions with Graphs. Outline. data; variables: categorical & quantitative; distributions; bar graphs & pie charts: What Is Statistics?

More information

STT 315 This lecture is based on Chapter 2 of the textbook.

STT 315 This lecture is based on Chapter 2 of the textbook. STT 315 This lecture is based on Chapter 2 of the textbook. Acknowledgement: Author is thankful to Dr. Ashok Sinha, Dr. Jennifer Kaplan and Dr. Parthanil Roy for allowing him to use/edit some of their

More information

1-1. Chapter 1. Sampling and Descriptive Statistics by The McGraw-Hill Companies, Inc. All rights reserved.

1-1. Chapter 1. Sampling and Descriptive Statistics by The McGraw-Hill Companies, Inc. All rights reserved. 1-1 Chapter 1 Sampling and Descriptive Statistics 1-2 Why Statistics? Deal with uncertainty in repeated scientific measurements Draw conclusions from data Design valid experiments and draw reliable conclusions

More information

Lecture 3B: Chapter 4, Section 2 Quantitative Variables (Displays, Begin Summaries)

Lecture 3B: Chapter 4, Section 2 Quantitative Variables (Displays, Begin Summaries) Lecture 3B: Chapter 4, Section 2 Quantitative Variables (Displays, Begin Summaries) Summarize with Shape, Center, Spread Displays: Stemplots, Histograms Five Number Summary, Outliers, Boxplots Mean vs.

More information

are the objects described by a set of data. They may be people, animals or things.

are the objects described by a set of data. They may be people, animals or things. ( c ) E p s t e i n, C a r t e r a n d B o l l i n g e r 2016 C h a p t e r 5 : E x p l o r i n g D a t a : D i s t r i b u t i o n s P a g e 1 CHAPTER 5: EXPLORING DATA DISTRIBUTIONS 5.1 Creating Histograms

More information

The empirical ( ) rule

The empirical ( ) rule The empirical (68-95-99.7) rule With a bell shaped distribution, about 68% of the data fall within a distance of 1 standard deviation from the mean. 95% fall within 2 standard deviations of the mean. 99.7%

More information

Units. Exploratory Data Analysis. Variables. Student Data

Units. Exploratory Data Analysis. Variables. Student Data Units Exploratory Data Analysis Bret Larget Departments of Botany and of Statistics University of Wisconsin Madison Statistics 371 13th September 2005 A unit is an object that can be measured, such as

More information

Example 2. Given the data below, complete the chart:

Example 2. Given the data below, complete the chart: Statistics 2035 Quiz 1 Solutions Example 1. 2 64 150 150 2 128 150 2 256 150 8 8 Example 2. Given the data below, complete the chart: 52.4, 68.1, 66.5, 75.0, 60.5, 78.8, 63.5, 48.9, 81.3 n=9 The data is

More information

Chapter 3. Data Description

Chapter 3. Data Description Chapter 3. Data Description Graphical Methods Pie chart It is used to display the percentage of the total number of measurements falling into each of the categories of the variable by partition a circle.

More information

Chapter 5: Exploring Data: Distributions Lesson Plan

Chapter 5: Exploring Data: Distributions Lesson Plan Lesson Plan Exploring Data Displaying Distributions: Histograms Interpreting Histograms Displaying Distributions: Stemplots Describing Center: Mean and Median Describing Variability: The Quartiles The

More information

Chapter 3: The Normal Distributions

Chapter 3: The Normal Distributions Chapter 3: The Normal Distributions http://www.yorku.ca/nuri/econ2500/econ2500-online-course-materials.pdf graphs-normal.doc / histogram-density.txt / normal dist table / ch3-image Ch3 exercises: 3.2,

More information

Math 223 Lecture Notes 3/15/04 From The Basic Practice of Statistics, bymoore

Math 223 Lecture Notes 3/15/04 From The Basic Practice of Statistics, bymoore Math 223 Lecture Notes 3/15/04 From The Basic Practice of Statistics, bymoore Chapter 3 continued Describing distributions with numbers Measuring spread of data: Quartiles Definition 1: The interquartile

More information

MATH 1150 Chapter 2 Notation and Terminology

MATH 1150 Chapter 2 Notation and Terminology MATH 1150 Chapter 2 Notation and Terminology Categorical Data The following is a dataset for 30 randomly selected adults in the U.S., showing the values of two categorical variables: whether or not the

More information

TOPIC: Descriptive Statistics Single Variable

TOPIC: Descriptive Statistics Single Variable TOPIC: Descriptive Statistics Single Variable I. Numerical data summary measurements A. Measures of Location. Measures of central tendency Mean; Median; Mode. Quantiles - measures of noncentral tendency

More information

Lecture Slides. Elementary Statistics Twelfth Edition. by Mario F. Triola. and the Triola Statistics Series. Section 3.1- #

Lecture Slides. Elementary Statistics Twelfth Edition. by Mario F. Triola. and the Triola Statistics Series. Section 3.1- # Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series by Mario F. Triola Chapter 3 Statistics for Describing, Exploring, and Comparing Data 3-1 Review and Preview 3-2 Measures

More information

F78SC2 Notes 2 RJRC. If the interest rate is 5%, we substitute x = 0.05 in the formula. This gives

F78SC2 Notes 2 RJRC. If the interest rate is 5%, we substitute x = 0.05 in the formula. This gives F78SC2 Notes 2 RJRC Algebra It is useful to use letters to represent numbers. We can use the rules of arithmetic to manipulate the formula and just substitute in the numbers at the end. Example: 100 invested

More information

Topic 3: Introduction to Statistics. Algebra 1. Collecting Data. Table of Contents. Categorical or Quantitative? What is the Study of Statistics?!

Topic 3: Introduction to Statistics. Algebra 1. Collecting Data. Table of Contents. Categorical or Quantitative? What is the Study of Statistics?! Topic 3: Introduction to Statistics Collecting Data We collect data through observation, surveys and experiments. We can collect two different types of data: Categorical Quantitative Algebra 1 Table of

More information

Percentile: Formula: To find the percentile rank of a score, x, out of a set of n scores, where x is included:

Percentile: Formula: To find the percentile rank of a score, x, out of a set of n scores, where x is included: AP Statistics Chapter 2 Notes 2.1 Describing Location in a Distribution Percentile: The pth percentile of a distribution is the value with p percent of the observations (If your test score places you in

More information

Histograms allow a visual interpretation

Histograms allow a visual interpretation Chapter 4: Displaying and Summarizing i Quantitative Data s allow a visual interpretation of quantitative (numerical) data by indicating the number of data points that lie within a range of values, called

More information

Resistant Measure - A statistic that is not affected very much by extreme observations.

Resistant Measure - A statistic that is not affected very much by extreme observations. Chapter 1.3 Lecture Notes & Examples Section 1.3 Describing Quantitative Data with Numbers (pp. 50-74) 1.3.1 Measuring Center: The Mean Mean - The arithmetic average. To find the mean (pronounced x bar)

More information

Objective A: Mean, Median and Mode Three measures of central of tendency: the mean, the median, and the mode.

Objective A: Mean, Median and Mode Three measures of central of tendency: the mean, the median, and the mode. Chapter 3 Numerically Summarizing Data Chapter 3.1 Measures of Central Tendency Objective A: Mean, Median and Mode Three measures of central of tendency: the mean, the median, and the mode. A1. Mean The

More information

Math 140 Introductory Statistics

Math 140 Introductory Statistics Math 140 Introductory Statistics Professor Silvia Fernández Chapter 2 Based on the book Statistics in Action by A. Watkins, R. Scheaffer, and G. Cobb. Visualizing Distributions Recall the definition: The

More information

Math 140 Introductory Statistics

Math 140 Introductory Statistics Visualizing Distributions Math 140 Introductory Statistics Professor Silvia Fernández Chapter Based on the book Statistics in Action by A. Watkins, R. Scheaffer, and G. Cobb. Recall the definition: The

More information

Measures of center. The mean The mean of a distribution is the arithmetic average of the observations:

Measures of center. The mean The mean of a distribution is the arithmetic average of the observations: Measures of center The mean The mean of a distribution is the arithmetic average of the observations: x = x 1 + + x n n n = 1 x i n i=1 The median The median is the midpoint of a distribution: the number

More information

Unit 2. Describing Data: Numerical

Unit 2. Describing Data: Numerical Unit 2 Describing Data: Numerical Describing Data Numerically Describing Data Numerically Central Tendency Arithmetic Mean Median Mode Variation Range Interquartile Range Variance Standard Deviation Coefficient

More information

Stat 101 Exam 1 Important Formulas and Concepts 1

Stat 101 Exam 1 Important Formulas and Concepts 1 1 Chapter 1 1.1 Definitions Stat 101 Exam 1 Important Formulas and Concepts 1 1. Data Any collection of numbers, characters, images, or other items that provide information about something. 2. Categorical/Qualitative

More information

What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty.

What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty. What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty. Statistics is a field of study concerned with the data collection,

More information

CHAPTER 2: Describing Distributions with Numbers

CHAPTER 2: Describing Distributions with Numbers CHAPTER 2: Describing Distributions with Numbers The Basic Practice of Statistics 6 th Edition Moore / Notz / Fligner Lecture PowerPoint Slides Chapter 2 Concepts 2 Measuring Center: Mean and Median Measuring

More information

Chapter 4. Displaying and Summarizing. Quantitative Data

Chapter 4. Displaying and Summarizing. Quantitative Data STAT 141 Introduction to Statistics Chapter 4 Displaying and Summarizing Quantitative Data Bin Zou (bzou@ualberta.ca) STAT 141 University of Alberta Winter 2015 1 / 31 4.1 Histograms 1 We divide the range

More information

Section 3. Measures of Variation

Section 3. Measures of Variation Section 3 Measures of Variation Range Range = (maximum value) (minimum value) It is very sensitive to extreme values; therefore not as useful as other measures of variation. Sample Standard Deviation The

More information

1.3.1 Measuring Center: The Mean

1.3.1 Measuring Center: The Mean 1.3.1 Measuring Center: The Mean Mean - The arithmetic average. To find the mean (pronounced x bar) of a set of observations, add their values and divide by the number of observations. If the n observations

More information

Lecture 2 and Lecture 3

Lecture 2 and Lecture 3 Lecture 2 and Lecture 3 1 Lecture 2 and Lecture 3 We can describe distributions using 3 characteristics: shape, center and spread. These characteristics have been discussed since the foundation of statistics.

More information

Chapter 2 Solutions Page 15 of 28

Chapter 2 Solutions Page 15 of 28 Chapter Solutions Page 15 of 8.50 a. The median is 55. The mean is about 105. b. The median is a more representative average" than the median here. Notice in the stem-and-leaf plot on p.3 of the text that

More information

Determining the Spread of a Distribution

Determining the Spread of a Distribution Determining the Spread of a Distribution 1.3-1.5 Cathy Poliak, Ph.D. cathy@math.uh.edu Department of Mathematics University of Houston Lecture 3-2311 Lecture 3-2311 1 / 58 Outline 1 Describing Quantitative

More information

CIVL 7012/8012. Collection and Analysis of Information

CIVL 7012/8012. Collection and Analysis of Information CIVL 7012/8012 Collection and Analysis of Information Uncertainty in Engineering Statistics deals with the collection and analysis of data to solve real-world problems. Uncertainty is inherent in all real

More information

Chapters 1 & 2 Exam Review

Chapters 1 & 2 Exam Review Problems 1-3 refer to the following five boxplots. 1.) To which of the above boxplots does the following histogram correspond? (A) A (B) B (C) C (D) D (E) E 2.) To which of the above boxplots does the

More information

3.1 Measure of Center

3.1 Measure of Center 3.1 Measure of Center Calculate the mean for a given data set Find the median, and describe why the median is sometimes preferable to the mean Find the mode of a data set Describe how skewness affects

More information

Determining the Spread of a Distribution

Determining the Spread of a Distribution Determining the Spread of a Distribution 1.3-1.5 Cathy Poliak, Ph.D. cathy@math.uh.edu Department of Mathematics University of Houston Lecture 3-2311 Lecture 3-2311 1 / 58 Outline 1 Describing Quantitative

More information

Chapter 5. Understanding and Comparing. Distributions

Chapter 5. Understanding and Comparing. Distributions STAT 141 Introduction to Statistics Chapter 5 Understanding and Comparing Distributions Bin Zou (bzou@ualberta.ca) STAT 141 University of Alberta Winter 2015 1 / 27 Boxplots How to create a boxplot? Assume

More information

Chapter 3: Displaying and summarizing quantitative data p52 The pattern of variation of a variable is called its distribution.

Chapter 3: Displaying and summarizing quantitative data p52 The pattern of variation of a variable is called its distribution. Chapter 3: Displaying and summarizing quantitative data p52 The pattern of variation of a variable is called its distribution. 1 Histograms p53 The breakfast cereal data Study collected data on nutritional

More information

Exercises from Chapter 3, Section 1

Exercises from Chapter 3, Section 1 Exercises from Chapter 3, Section 1 1. Consider the following sample consisting of 20 numbers. (a) Find the mode of the data 21 23 24 24 25 26 29 30 32 34 39 41 41 41 42 43 48 51 53 53 (b) Find the median

More information

Continuous random variables

Continuous random variables Continuous random variables A continuous random variable X takes all values in an interval of numbers. The probability distribution of X is described by a density curve. The total area under a density

More information

A is one of the categories into which qualitative data can be classified.

A is one of the categories into which qualitative data can be classified. Chapter 2 Methods for Describing Sets of Data 2.1 Describing qualitative data Recall qualitative data: non-numerical or categorical data Basic definitions: A is one of the categories into which qualitative

More information

Describing Distributions

Describing Distributions Describing Distributions With Numbers April 18, 2012 Summary Statistics. Measures of Center. Percentiles. Measures of Spread. A Summary Statement. Choosing Numerical Summaries. 1.0 What Are Summary Statistics?

More information

Review for Exam #1. Chapter 1. The Nature of Data. Definitions. Population. Sample. Quantitative data. Qualitative (attribute) data

Review for Exam #1. Chapter 1. The Nature of Data. Definitions. Population. Sample. Quantitative data. Qualitative (attribute) data Review for Exam #1 1 Chapter 1 Population the complete collection of elements (scores, people, measurements, etc.) to be studied Sample a subcollection of elements drawn from a population 11 The Nature

More information

Lecture 1: Descriptive Statistics

Lecture 1: Descriptive Statistics Lecture 1: Descriptive Statistics MSU-STT-351-Sum 15 (P. Vellaisamy: MSU-STT-351-Sum 15) Probability & Statistics for Engineers 1 / 56 Contents 1 Introduction 2 Branches of Statistics Descriptive Statistics

More information

Chapter 6 The Standard Deviation as a Ruler and the Normal Model

Chapter 6 The Standard Deviation as a Ruler and the Normal Model Chapter 6 The Standard Deviation as a Ruler and the Normal Model Overview Key Concepts Understand how adding (subtracting) a constant or multiplying (dividing) by a constant changes the center and/or spread

More information

Section 5.4. Ken Ueda

Section 5.4. Ken Ueda Section 5.4 Ken Ueda Students seem to think that being graded on a curve is a positive thing. I took lasers 101 at Cornell and got a 92 on the exam. The average was a 93. I ended up with a C on the test.

More information

Performance of fourth-grade students on an agility test

Performance of fourth-grade students on an agility test Starter Ch. 5 2005 #1a CW Ch. 4: Regression L1 L2 87 88 84 86 83 73 81 67 78 83 65 80 50 78 78? 93? 86? Create a scatterplot Find the equation of the regression line Predict the scores Chapter 5: Understanding

More information

6 THE NORMAL DISTRIBUTION

6 THE NORMAL DISTRIBUTION CHAPTER 6 THE NORMAL DISTRIBUTION 341 6 THE NORMAL DISTRIBUTION Figure 6.1 If you ask enough people about their shoe size, you will find that your graphed data is shaped like a bell curve and can be described

More information

Sampling, Frequency Distributions, and Graphs (12.1)

Sampling, Frequency Distributions, and Graphs (12.1) 1 Sampling, Frequency Distributions, and Graphs (1.1) Design: Plan how to obtain the data. What are typical Statistical Methods? Collect the data, which is then subjected to statistical analysis, which

More information

Chapter 6. The Standard Deviation as a Ruler and the Normal Model 1 /67

Chapter 6. The Standard Deviation as a Ruler and the Normal Model 1 /67 Chapter 6 The Standard Deviation as a Ruler and the Normal Model 1 /67 Homework Read Chpt 6 Complete Reading Notes Do P129 1, 3, 5, 7, 15, 17, 23, 27, 29, 31, 37, 39, 43 2 /67 Objective Students calculate

More information

Remember your SOCS! S: O: C: S:

Remember your SOCS! S: O: C: S: Remember your SOCS! S: O: C: S: 1.1: Displaying Distributions with Graphs Dotplot: Age of your fathers Low scale: 45 High scale: 75 Doesn t have to start at zero, just cover the range of the data Label

More information

Chapter2 Description of samples and populations. 2.1 Introduction.

Chapter2 Description of samples and populations. 2.1 Introduction. Chapter2 Description of samples and populations. 2.1 Introduction. Statistics=science of analyzing data. Information collected (data) is gathered in terms of variables (characteristics of a subject that

More information

Describing Distributions With Numbers

Describing Distributions With Numbers Describing Distributions With Numbers October 24, 2012 What Do We Usually Summarize? Measures of Center. Percentiles. Measures of Spread. A Summary Statement. Choosing Numerical Summaries. 1.0 What Do

More information

Chapter 1: Exploring Data

Chapter 1: Exploring Data Chapter 1: Exploring Data Section 1.3 with Numbers The Practice of Statistics, 4 th edition - For AP* STARNES, YATES, MOORE Chapter 1 Exploring Data Introduction: Data Analysis: Making Sense of Data 1.1

More information

Finding Quartiles. . Q1 is the median of the lower half of the data. Q3 is the median of the upper half of the data

Finding Quartiles. . Q1 is the median of the lower half of the data. Q3 is the median of the upper half of the data Finding Quartiles. Use the median to divide the ordered data set into two halves.. If n is odd, do not include the median in either half. If n is even, split this data set exactly in half.. Q1 is the median

More information

GRACEY/STATISTICS CH. 3. CHAPTER PROBLEM Do women really talk more than men? Science, Vol. 317, No. 5834). The study

GRACEY/STATISTICS CH. 3. CHAPTER PROBLEM Do women really talk more than men? Science, Vol. 317, No. 5834). The study CHAPTER PROBLEM Do women really talk more than men? A common belief is that women talk more than men. Is that belief founded in fact, or is it a myth? Do men actually talk more than women? Or do men and

More information

Chapter 3. Measuring data

Chapter 3. Measuring data Chapter 3 Measuring data 1 Measuring data versus presenting data We present data to help us draw meaning from it But pictures of data are subjective They re also not susceptible to rigorous inference Measuring

More information

Section 3.2 Measures of Central Tendency

Section 3.2 Measures of Central Tendency Section 3.2 Measures of Central Tendency 1 of 149 Section 3.2 Objectives Determine the mean, median, and mode of a population and of a sample Determine the weighted mean of a data set and the mean of a

More information

Chapter 5: Exploring Data: Distributions Lesson Plan

Chapter 5: Exploring Data: Distributions Lesson Plan Lesson Plan Exploring Data Displaying Distributions: Histograms For All Practical Purposes Mathematical Literacy in Today s World, 7th ed. Interpreting Histograms Displaying Distributions: Stemplots Describing

More information

Shape, Outliers, Center, Spread Frequency and Relative Histograms Related to other types of graphical displays

Shape, Outliers, Center, Spread Frequency and Relative Histograms Related to other types of graphical displays Histograms: Shape, Outliers, Center, Spread Frequency and Relative Histograms Related to other types of graphical displays Sep 9 1:13 PM Shape: Skewed left Bell shaped Symmetric Bi modal Symmetric Skewed

More information

Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series. Slide 1

Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series. Slide 1 Lecture Slides Elementary Statistics Tenth Edition and the Triola Statistics Series by Mario F. Triola Slide 1 Chapter 3 Statistics for Describing, Exploring, and Comparing Data 3-1 Overview 3-2 Measures

More information

Describing Distributions with Numbers

Describing Distributions with Numbers Describing Distributions with Numbers Using graphs, we could determine the center, spread, and shape of the distribution of a quantitative variable. We can also use numbers (called summary statistics)

More information

CHAPTER 1. Introduction

CHAPTER 1. Introduction CHAPTER 1 Introduction Engineers and scientists are constantly exposed to collections of facts, or data. The discipline of statistics provides methods for organizing and summarizing data, and for drawing

More information

Descriptive Univariate Statistics and Bivariate Correlation

Descriptive Univariate Statistics and Bivariate Correlation ESC 100 Exploring Engineering Descriptive Univariate Statistics and Bivariate Correlation Instructor: Sudhir Khetan, Ph.D. Wednesday/Friday, October 17/19, 2012 The Central Dogma of Statistics used to

More information

1.3: Describing Quantitative Data with Numbers

1.3: Describing Quantitative Data with Numbers 1.3: Describing Quantitative Data with Numbers Section 1.3 Describing Quantitative Data with Numbers After this section, you should be able to MEASURE center with the mean and median MEASURE spread with

More information

SESSION 5 Descriptive Statistics

SESSION 5 Descriptive Statistics SESSION 5 Descriptive Statistics Descriptive statistics are used to describe the basic features of the data in a study. They provide simple summaries about the sample and the measures. Together with simple

More information

Statistics I Chapter 2: Univariate data analysis

Statistics I Chapter 2: Univariate data analysis Statistics I Chapter 2: Univariate data analysis Chapter 2: Univariate data analysis Contents Graphical displays for categorical data (barchart, piechart) Graphical displays for numerical data data (histogram,

More information

Types of Information. Topic 2 - Descriptive Statistics. Examples. Sample and Sample Size. Background Reading. Variables classified as STAT 511

Types of Information. Topic 2 - Descriptive Statistics. Examples. Sample and Sample Size. Background Reading. Variables classified as STAT 511 Topic 2 - Descriptive Statistics STAT 511 Professor Bruce Craig Types of Information Variables classified as Categorical (qualitative) - variable classifies individual into one of several groups or categories

More information

Last Lecture. Distinguish Populations from Samples. Knowing different Sampling Techniques. Distinguish Parameters from Statistics

Last Lecture. Distinguish Populations from Samples. Knowing different Sampling Techniques. Distinguish Parameters from Statistics Last Lecture Distinguish Populations from Samples Importance of identifying a population and well chosen sample Knowing different Sampling Techniques Distinguish Parameters from Statistics Knowing different

More information

1. Exploratory Data Analysis

1. Exploratory Data Analysis 1. Exploratory Data Analysis 1.1 Methods of Displaying Data A visual display aids understanding and can highlight features which may be worth exploring more formally. Displays should have impact and be

More information

CHAPTER 1 Exploring Data

CHAPTER 1 Exploring Data CHAPTER 1 Exploring Data 1.2 Displaying Quantitative Data with Graphs The Practice of Statistics, 5th Edition Starnes, Tabor, Yates, Moore Bedford Freeman Worth Publishers Displaying Quantitative Data

More information

Statistics I Chapter 2: Univariate data analysis

Statistics I Chapter 2: Univariate data analysis Statistics I Chapter 2: Univariate data analysis Chapter 2: Univariate data analysis Contents Graphical displays for categorical data (barchart, piechart) Graphical displays for numerical data data (histogram,

More information

Density Curves and the Normal Distributions. Histogram: 10 groups

Density Curves and the Normal Distributions. Histogram: 10 groups Density Curves and the Normal Distributions MATH 2300 Chapter 6 Histogram: 10 groups 1 Histogram: 20 groups Histogram: 40 groups 2 Histogram: 80 groups Histogram: 160 groups 3 Density Curve Density Curves

More information

ST Presenting & Summarising Data Descriptive Statistics. Frequency Distribution, Histogram & Bar Chart

ST Presenting & Summarising Data Descriptive Statistics. Frequency Distribution, Histogram & Bar Chart ST2001 2. Presenting & Summarising Data Descriptive Statistics Frequency Distribution, Histogram & Bar Chart Summary of Previous Lecture u A study often involves taking a sample from a population that

More information

Statistics for Managers using Microsoft Excel 6 th Edition

Statistics for Managers using Microsoft Excel 6 th Edition Statistics for Managers using Microsoft Excel 6 th Edition Chapter 3 Numerical Descriptive Measures 3-1 Learning Objectives In this chapter, you learn: To describe the properties of central tendency, variation,

More information

A C E. Answers Investigation 4. Applications

A C E. Answers Investigation 4. Applications Answers Applications 1. 1 student 2. You can use the histogram with 5-minute intervals to determine the number of students that spend at least 15 minutes traveling to school. To find the number of students,

More information

P8130: Biostatistical Methods I

P8130: Biostatistical Methods I P8130: Biostatistical Methods I Lecture 2: Descriptive Statistics Cody Chiuzan, PhD Department of Biostatistics Mailman School of Public Health (MSPH) Lecture 1: Recap Intro to Biostatistics Types of Data

More information

1 Measures of the Center of a Distribution

1 Measures of the Center of a Distribution 1 Measures of the Center of a Distribution Qualitative descriptions of the shape of a distribution are important and useful. But we will often desire the precision of numerical summaries as well. Two aspects

More information

Describing Distributions With Numbers Chapter 12

Describing Distributions With Numbers Chapter 12 Describing Distributions With Numbers Chapter 12 May 1, 2013 What Do We Usually Summarize? Measures of Center. Percentiles. Measures of Spread. A Summary. 1.0 What Do We Usually Summarize? source: Prof.

More information

Chapter 4: Displaying and Summarizing Quantitative Data

Chapter 4: Displaying and Summarizing Quantitative Data Chapter 4: Displaying and Summarizing Quantitative Data This chapter discusses methods of displaying quantitative data. The objective is describe the distribution of the data. The figure below shows three

More information

Descriptive Statistics-I. Dr Mahmoud Alhussami

Descriptive Statistics-I. Dr Mahmoud Alhussami Descriptive Statistics-I Dr Mahmoud Alhussami Biostatistics What is the biostatistics? A branch of applied math. that deals with collecting, organizing and interpreting data using well-defined procedures.

More information

2011 Pearson Education, Inc

2011 Pearson Education, Inc Statistics for Business and Economics Chapter 2 Methods for Describing Sets of Data Summary of Central Tendency Measures Measure Formula Description Mean x i / n Balance Point Median ( n +1) Middle Value

More information

Preliminary Statistics course. Lecture 1: Descriptive Statistics

Preliminary Statistics course. Lecture 1: Descriptive Statistics Preliminary Statistics course Lecture 1: Descriptive Statistics Rory Macqueen (rm43@soas.ac.uk), September 2015 Organisational Sessions: 16-21 Sep. 10.00-13.00, V111 22-23 Sep. 15.00-18.00, V111 24 Sep.

More information

Chapter 2: Descriptive Analysis and Presentation of Single- Variable Data

Chapter 2: Descriptive Analysis and Presentation of Single- Variable Data Chapter 2: Descriptive Analysis and Presentation of Single- Variable Data Mean 26.86667 Standard Error 2.816392 Median 25 Mode 20 Standard Deviation 10.90784 Sample Variance 118.981 Kurtosis -0.61717 Skewness

More information