Let's Do It! What Type of Variable?


Ch 2 Online homework list: Describing Data Sets; Graphical Representation of Data; Summary Statistics: Measures of Center; Box Plots, Outliers, and Standard Deviation

Ch 2 Online quizzes list: Quiz 1: Introduction; Quiz 2: Data Tables and Graphical Representation; Quiz 3: Measures of Center, Calculate and Interpret; Quiz 4: Skewness; Quiz 5: Box-plots; Quiz 6: Measures of Variability

2.1-2.3: Organizing Data

DEFINITIONS: Qualitative variables classify the units into categories. The categories may or may not have a natural ordering. Qualitative variables are also called categorical variables.

Quantitative variables have numerical values that are measurements (length, weight, and so on) or counts (of how many). Arithmetic operations on such numerical values have meaning. We further distinguish quantitative variables based on whether or not the values fall on a continuum: a discrete variable takes separated values (such as counts), while a continuous variable can take any value in an interval.

Let's Do It! What Type of Variable?

Hurricane Charley, in August 2004, has been blamed for at least 16 deaths. Listed below is information on other major storms and hurricanes that occurred from 1994 to 2003.

Storm Name               Date     Category   Estimated Damage/Cost*   Deaths
Tropical Storm Alberto   Jul-94   n/a        $1.2 billion             3
Hurricane Marilyn        Sep-95   2          $2.5 billion             13
Hurricane Opal           Oct-95   3          $3.6 billion             7
Hurricane Fran           Sep-96   3          $5.8 billion             37
Hurricane Bonnie         Aug-98   3          $1.1 billion             3
Hurricane Georges        Sep-98   2          $6.5 billion             16
Hurricane Floyd          Sep-99   2          $6.5 billion             77
Tropical Storm Allison   Jun-01   n/a        $5.1 billion             43
Hurricane Isabel         Sep-03   2          $4.0 billion             47

For each variable, determine whether it is qualitative or quantitative. If the variable is quantitative, state whether it is discrete or continuous.
(a) The name of the storm.
(b) The date the storm occurred.
(c) The category of the storm.
(d) The estimated amount of damage or cost of the storm.
(e) The number of deaths that occurred.

DEFINITIONS: The distribution of a variable provides the possible values that a variable can take on and how often these values occur. The distribution of a variable shows the pattern of variation of the variable.

Let's Do It! College Admissions

The following pie chart shows the breakdown of undergraduate enrollment by race at the University of Michigan for the fall term of ____. The total number of undergraduates enrolled for that term was __,604.

[Pie chart: undergraduate enrollment by race]

(a) What percentage of undergraduates enrolled were of nonwhite race?
(b) How many undergraduates enrolled had no racial category listed?

Let's Do It!

Allied Van Lines surveyed 1000 respondents in May 2004. The question asked was, "Would you move if your mate had to relocate overseas because of work?" The results are summarized in the following pie chart.

[Pie chart with slices Yes, No, and Not Sure; slice labels 30%, 68%, and 2%]

(a) What percentage of the respondents said that they would actually move if their mate relocated overseas?
(b) What questions would you ask about the sample selection before using this information to draw formal conclusions?

Let's Do It! Nothing Really Matters

The bar graph shown here displays the percentage of respondents who think a particular problem is the most important problem facing America, for two different years. (SOURCE: The Economist, March 30-April 5, 1996, pg 33.)

[Bar graph: percentage of respondents naming each problem as most important, January 1992 and January 1996]

(a) In January 1992, which problem category had the highest percentage of responses? Was this the same category that had the highest percentage of responses in 1996?
(b) In January 1992, what percentage of respondents reported crime as the most important problem facing America? In January 1996, what percentage of respondents reported crime as the most important problem facing America?
(c) What is the approximate sum of the percentages of respondents across all of the listed problem categories for January 1992? Is this sum approximately 100%? If not, give a possible reason why not.

Example: A Misleading Bar Graph

Problem: The bar graph that follows presents the total sales figures for three realtors. When the bars are replaced with pictures, often related to the topic of the graph, the graph is called a pictogram.

[Pictogram: Total Sales drawn as houses for Realtor #1, Realtor #2, and Realtor #3; labels $2.05 million, $1.41 million, and $0.9 million]

(a) How does the height of the home for Realtor 1 compare to that for Realtor 3?
(b) How does the area of the home for Realtor 1 compare to that for Realtor 3?

What We've Learned: When you see a pictogram, be careful to interpret the results appropriately, and do not allow the area of the pictures to mislead you.

A frequency distribution is the organization of raw data in table form, using classes and frequencies. Each raw data value is placed into a quantitative or qualitative category called a class. The frequency of a class is the number of data values contained in that class. The two types of frequency distributions used most often are the categorical frequency distribution and the grouped frequency distribution.

Categorical Frequency Distributions: The categorical frequency distribution is used for data that can be placed in specific categories, such as nominal- or ordinal-level data.

Grouped Frequency Distributions: When the range of the data is large, the data must be grouped into classes that are more than one unit in width.

Let's Do It! Categorical Frequency Distributions

A survey was taken on how much trust people place in the information they read on the Internet. A = trust in everything they read, M = trust in most of what they read, H = trust in about half of what they read, S = trust in a small portion of what they read. Construct a categorical frequency distribution for the data.

M M M A H M S M H M
S M M M M A M M A M
M M H M M M H M H M
A M M M H M M M M M
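As a way to check a hand tally, the categorical frequency distribution for the survey above can be sketched in Python. The function name `categorical_frequency` and the choice to report a percent column are assumptions made for illustration, not part of the handout.

```python
from collections import Counter

# Survey responses copied from the exercise:
# A = everything, M = most, H = about half, S = a small portion.
responses = ("M M M A H M S M H M S M M M M A M M A M "
             "M M H M M M H M H M A M M M H M M M M M").split()

def categorical_frequency(data):
    """Return {class: (frequency, percent)} for a list of category labels."""
    counts = Counter(data)
    n = len(data)
    return {cls: (f, 100 * f / n) for cls, f in counts.items()}

table = categorical_frequency(responses)
for cls in "AMHS":
    f, pct = table[cls]
    print(f"{cls}  frequency={f:2d}  percent={pct:5.1f}%")
```

The frequencies should add up to the number of responses and the percents to 100, which is a quick sanity check on any frequency table.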

Histograms and Pie Charts

The Histogram: a graph that displays quantitative data by using contiguous vertical bars (unless the frequency of a class is 0) of various heights to represent the frequencies of the classes.

The Pie Graph: a circle that is divided into sections or wedges according to the percentage of frequencies in each category of the distribution. The angle (in degrees) of each wedge is given by: Angle = relative frequency × 360.

Let's Do It! Distribution of Scores

For 108 randomly selected college applicants, the following frequency distribution for entrance exam scores was obtained.

[Frequency table: Class limits, Frequency]

(a) Construct a relative frequency histogram for the data.
(b) Applicants who score above 107 need not enroll in a summer developmental program. In this group, how many students do not have to enroll in the developmental program?

Let's Do It! Matching Shapes to Characteristics

[Four histograms labeled Distribution 1, Distribution 2, Distribution 3, and Distribution 4, each with a blank "Characteristic = ____"]

Characteristics:
1. Distribution of age for the population of the United States in the year ____. Describe and explain the shape of the distribution.
2. Distribution of miles of coastline for the 50 United States. Describe and explain the shape of the distribution. Which states do you think would be in the last class, furthest to the right?
3. Distribution of the number of miles traveled to work, that is, commuting distance for employed adults in a city. Describe and explain the shape of the distribution.
4. Distribution of age at death for the population of the United States (year 1980). Describe and explain the shape of the distribution.

Pie Graph

Let's Explore It! The assets of the richest 1% of Americans are distributed as follows. Make a pie chart for the percentages.

Asset                        Percent   Angle (degrees)
Principal residence           7.8%      28.08
Liquid assets                 5.0%      18.00
Pension accounts              6.9%      24.84
Stock, funds, and trusts     31.6%     113.76
Businesses & real estate     46.9%     168.84
Miscellaneous                 1.8%       6.48
Total                       100%       360

[Pie chart of the six asset categories with the percentages above]

Let's Do It! The population of federal prisons, according to the most serious offenses, consists of the following. Make a pie chart of the population.

Violent offenses          12.6%
Property offenses          8.5%
Drug offenses             60.2%
Public order offenses:
  Weapons                  8.2%
  Immigration              4.9%
  Other                    5.6%
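The angle rule for pie charts (angle = relative frequency × 360) can be sketched in a few lines of Python. The asset percentages are the ones listed in the exercise; the function name `wedge_angles` is an illustrative assumption.

```python
def wedge_angles(percentages):
    """Convert {category: percent} to {category: wedge angle in degrees}."""
    return {name: pct / 100 * 360 for name, pct in percentages.items()}

# Percentages from the "richest 1%" asset exercise.
assets = {
    "Principal residence": 7.8,
    "Liquid assets": 5.0,
    "Pension accounts": 6.9,
    "Stock, funds, and trusts": 31.6,
    "Businesses & real estate": 46.9,
    "Miscellaneous": 1.8,
}

angles = wedge_angles(assets)
for name, angle in angles.items():
    print(f"{name:26s} {angle:7.2f} degrees")
print(f"{'Total':26s} {sum(angles.values()):7.2f} degrees")
```

Because the percentages add to 100, the angles add to 360 (up to rounding), which is a useful check before drawing the chart.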

2.4 Measures of Central Tendency

[Data Set 1: Subject #, Gender, and Age for the 20 subjects in the medical study]

Suppose you had to give a single number that would represent the most typical age for the 20 subjects. What number would you choose?

Measures of center are numerical values that tend to report, in some sense, the middle of a set of data. We will focus on the mean and the median. If the data are a sample, the mean and median are called statistics. If the data form an entire population, these measures of center are called parameters.

Mean

DEFINITION: The mean of a set of n observations is simply the sum of the observations divided by the number of observations, n.

Mean age of the 20 subjects in the medical study: add the 20 ages up and divide by 20.

Special notation: If $x_1, x_2, \ldots, x_n$ denote a sample of n observations, then the mean of the sample is called "x-bar" and is denoted by

$$\bar{x} = \frac{\sum x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}$$

The mean of a population is denoted by the Greek letter μ.

Let's Do It! Mean Number of Children per Household

Suppose that the number of children in a simple random sample of 10 households is as follows: 2, 3, 0, 2, 1, 0, 3, 0, 1, 4

(a) Calculate the sample mean number of children per household.
(b) Suppose that the observation for the last household in the above list was incorrectly recorded as 40 instead of 4. What would happen to the mean?

Note that, with the error, 9 of the 10 observations are less than the mean. The mean is sensitive to extreme observations. Most graphical displays would have detected this outlying observation.

Let's Do It! A Mean Is Not Always Representative

Kim's test scores are 7, 98, 5, 19, and 6. Calculate Kim's mean test score. Explain why the mean does not do a very good job of summarizing Kim's test scores.

Let's Do It! Combining Means

We have seven students. The mean score for three of these students is 54 and the mean score for the four other students is 76. What is the mean score for all seven students?

The mean is the point of equilibrium, the point where the distribution would balance.

[Figure: three dot plots showing the mean as the balance point while the largest observation is moved to the right]

If the distribution is symmetric, as in the first picture, the mean is exactly at the center of the distribution. As the largest observation is moved further to the right, making this observation somewhat extreme, the mean shifts towards the extreme observation. If a distribution appears to be skewed, we may wish also to report a more resistant measure of center.
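The sensitivity of the mean, and the way group means combine, can be illustrated with a short Python sketch. It assumes the household data read 2, 3, 0, 2, 1, 0, 3, 0, 1, 4 (the 2s are restored from the damaged list), and the helper names `mean` and `combined_mean` are hypothetical.

```python
def mean(values):
    """Sum of the observations divided by the number of observations."""
    return sum(values) / len(values)

def combined_mean(groups):
    """Overall mean from (group size, group mean) pairs.

    Each group mean is weighted by its group size; simply averaging
    the two means would ignore that the groups differ in size.
    """
    total = sum(n * m for n, m in groups)
    count = sum(n for n, _ in groups)
    return total / count

# Household data, assuming the damaged entries are 2s.
children = [2, 3, 0, 2, 1, 0, 3, 0, 1, 4]
misrecorded = children[:-1] + [40]   # last value entered as 40 instead of 4

print(mean(children))       # mean of the correct data
print(mean(misrecorded))    # mean after the recording error
below = sum(v < mean(misrecorded) for v in misrecorded)
print(below, "of", len(misrecorded), "observations fall below the mean")

# Combining means: three students average 54, four students average 76.
print(combined_mean([(3, 54), (4, 76)]))
```

A single extreme value pulls the mean above nine of the ten observations, which is the behavior the exercise asks you to notice.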

The Mean of Grouped Data / Frequency Tables

The procedure for finding the mean for grouped data uses the midpoints of the classes. This procedure is shown next.

Example: The data represent the number of miles run during one week for a sample of 20 runners.

[Frequency table: class boundaries and frequencies]

Solution: The procedure for finding the mean for grouped data is given here.

Step 1 Make a table as shown.
Step 2 Find the midpoints of each class and enter them in column C.
Step 3 For each class, multiply the frequency by the midpoint, as shown, and place the product in column D: 1 · 8 = 8, 2 · 13 = 26, etc. The completed table is shown here.
Step 4 Find the sum of column D.
Step 5 Divide the sum by n to get the mean.

[Completed table with columns A (class), B (frequency), C (midpoint), and D (frequency × midpoint)]
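The five steps can be sketched in Python as follows. The frequency table itself is not reproduced in the text, so the midpoints and frequencies below are illustrative assumptions: only the rows giving 1 · 8 and 2 · 13 are visible in Step 3, and the remaining rows were chosen so that n = 20.

```python
# Illustrative table of (class midpoint, frequency) pairs.
# Assumption: only the first two rows are implied by the text;
# the remaining rows are made up so that the frequencies total 20.
table = [(8, 1), (13, 2), (18, 3), (23, 5), (28, 4), (33, 3), (38, 2)]

def grouped_mean(rows):
    """Mean of grouped data: sum of (frequency * midpoint) divided by n."""
    n = sum(f for _, f in rows)                    # total frequency (column B)
    column_d = [f * mid for mid, f in rows]        # Step 3: frequency * midpoint
    return sum(column_d) / n                       # Steps 4 and 5

n = sum(f for _, f in table)
print("n =", n)
print("grouped mean =", grouped_mean(table))
```

The result is an approximation to the mean of the raw data, since every value in a class is treated as if it sat at the class midpoint.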

Let's Do It! Eighty randomly selected light bulbs were tested to determine their lifetimes in hours. The frequency table of the results is shown below. Find the average lifetime of a light bulb.

[Frequency table: Life interval in hours, Frequency]

Let's Do It! The cost per load (in cents) of 35 laundry detergents tested by a consumer organization is given below.

[Frequency table: Class limit, Frequency]

A measure of center that is more resistant to extreme values is the median.

Median

DEFINITION: The median of a set of n observations, ordered from smallest to largest, is a value such that half of the observations are less than or equal to that value and half of the observations are greater than or equal to that value.

If the number of observations is odd, the median is the middle observation. If the number of observations is even, the median is any number between the two middle observations, including either of the two middle observations. To be consistent, we will define the median as the mean or average of the two middle observations.

Location of the median: (n+1)/2, where n is the number of observations.

The ages of the n = 20 subjects: calculating (n+1)/2 we get (20+1)/2 = 10.5. So the two middle observations are the 10th and 11th observations, namely 43 and 44. The median is the mean of these two middle observations, (43+44)/2 = 43.5 years.

[Diagram: ordered ages with the 10th and 11th observations marked and the median between them]

Let's Do It! Median Number of Children per Household

Find the median number of children in a household from this sample of 10 households, that is, find the median of Number of Children: ____

(a) Median = ____
(b) What happens to the median if the fifth observation in the first list was incorrectly recorded as 40 instead of 4?
(c) What happens to the median if the third observation in the first list was incorrectly recorded as -20 instead of 0?

Note: The median is resistant, that is, it does not change, or changes very little, in response to extreme observations.
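A minimal Python sketch of the median rule, using the location (n+1)/2, is given below. It assumes the household data from the earlier mean exercise read 2, 3, 0, 2, 1, 0, 3, 0, 1, 4 (2s restored from the damaged list); the function name `median` is simply a local helper.

```python
def median(values):
    """Median by the handout's rule: the middle observation when n is odd,
    the mean of the two middle observations when n is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        return ordered[n // 2]          # position (n+1)/2, counted from 1
    lower = ordered[n // 2 - 1]         # position n/2
    upper = ordered[n // 2]             # position n/2 + 1
    return (lower + upper) / 2

# Household data, assuming the damaged entries are 2s.
children = [2, 3, 0, 2, 1, 0, 3, 0, 1, 4]
misrecorded = children[:-1] + [40]      # a 4 entered as 40

print("location of median:", (len(children) + 1) / 2)
print("median:", median(children))
print("median with the recording error:", median(misrecorded))
```

The recording error that changed the mean leaves the median where it was, which is what "resistant" means here.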

Another Measure: The Mode

DEFINITION: The mode of a set of observations is the most frequently occurring value; it is the value having the highest frequency among the observations.

The mode of the values { 0, 0, 0, 0, 1, 1, 2, 2, 3, 4 } is 0.
For { 0, 0, 0, 1, 1, 2, 2, 2, 3, 4 } there are two modes, 0 and 2 (bimodal).
What would be the mode for { 0, 1, 2, 4, 5, 8 }?
For { 0, 0, 0, 0, 0, 1, 2, 3, 4, 4, 4, 4, 5 }?

The mode is not often used as a measure of center for quantitative data. The mode can be computed for qualitative data. The modal race category is white. If the categories were coded as 1=White, 2=Asian, 3=African-American, 4=Hispanic, 5=American Indian, 6=No category listed, then the mode would be the value 1.

[Bar chart: Percent by Race, with categories White, Asian, African-American, Hispanic, No Category, and American Indian]
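The mode examples can be checked with a small Python sketch. The function name `modes` is hypothetical, and returning an empty list when every value occurs exactly once is a convention assumed here (many textbooks say such a data set has no mode); the race labels in the last line are made-up sample labels, not the enrollment data.

```python
from collections import Counter

def modes(values):
    """Return all values tied for the highest frequency, in sorted order.

    Assumed convention: if every value occurs exactly once, the data set
    is treated as having no mode and an empty list is returned.
    """
    counts = Counter(values)
    top = max(counts.values())
    if top == 1 and len(counts) > 1:
        return []
    return sorted(v for v, c in counts.items() if c == top)

print(modes([0, 0, 0, 0, 1, 1, 2, 2, 3, 4]))            # one mode
print(modes([0, 0, 0, 1, 1, 2, 2, 2, 3, 4]))            # bimodal
print(modes([0, 1, 2, 4, 5, 8]))                        # no repeated value
print(modes([0, 0, 0, 0, 0, 1, 2, 3, 4, 4, 4, 4, 5]))

# The mode also works for qualitative data (hypothetical labels).
print(modes(["White", "Asian", "White", "Hispanic"]))
```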

Let's Do It! Different Measures Can Give Different Impressions

The famous trio (the mean, the median, and the mode) represents three different methods for finding a so-called center value. These three values may be the same but are more likely to be different. When they are different, they can lead to different interpretations of the data being summarized. Consider the annual incomes of five families in a neighborhood:

$1,000 $1,000 $30,000 $90,000 $100,000

(a) Calculate the average income.
(b) Calculate the median income.
(c) Calculate the modal income.
(d) If you were trying to promote that this is an affluent neighborhood, which measure might you prefer to present?
(e) If you were trying to argue against a tax increase, which measure might you prefer to present?
(f) If you want to represent these values with the income that is in the middle, which measure might you prefer to present?

Which Measure of Center to Use?

[Figure: four distribution shapes. Bell-shaped, Symmetric: mean = median = mode. Bimodal: mean = median, two modes. Skewed Right and Skewed Left: mode, median, and mean marked at different positions, with 50% of the area on each side of the median.]

Mean, Median, and Mode

The most common measure of center is the mean, which locates the balancing point of the distribution. The mean equals the sum of the observations divided by how many there are. The mean is affected by extreme observations (outliers and values that are far in the tail of a skewed distribution). So the mean tends to be a good choice for locating the center of a distribution that is unimodal and roughly symmetric, with no outliers.

The median is a more robust measure of center, that is, it is not influenced by extreme values. The median is the middle observation when the data are ordered from smallest to largest. If you have an odd number of values, the median is the one in the middle. If you have an even number of values, the median is the mean of the two middle values, and falls exactly halfway between them. If you have n observations, then (n+1)/2 tells you the location or position of the median. For skewed distributions or distributions with outliers, the median tends to be the better choice for locating the center.

The mode is the value (or values) that occurs most often. For a distribution, the mode is the value associated with the highest peak. The most frequent value can be far from the center of the distribution, so the mode is not really a measure of center. However, the mode is the only measure of the three that can be used for qualitative data.

Tips: When you see or hear an average reported, ask which average was really computed: the mean or the median. Think about or examine the distribution of values to assess whether the measure of center used is appropriate.

2.5: MEASURING VARIATION OR SPREAD

Both sets of data have the same mean, median, and mode, but the values obviously differ in another respect: the variation or spread of the values. The values in List 1 are much more tightly clustered around the center value of 60. The values in List 2 are much more dispersed or spread out.

List 1: 55, 56, 57, 58, 59, 60, 60, 60, 61, 62, 63, 64, 65 (mean = median = mode = 60)
List 2: 35, 40, 45, 50, 55, 60, 60, 60, 65, 70, 75, 80, 85 (mean = median = mode = 60)

[Dot plots of List 1 and List 2]

Range

The range is the simplest measure of variability or spread. The range is just the difference between the largest value and the smallest value. The range can give a distorted picture of the actual pattern of variation.

Two distributions can have the same range but different patterns of variation: the first distribution has most of its values far from the center, while the second distribution has most of its values closer to the center.

[Dot plots of two distributions with the same range]
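The comparison of the two lists can be reproduced with Python's standard `statistics` module. List 1 is assumed to contain 62 where the source shows a damaged "6"; `value_range` is a hypothetical helper name.

```python
from statistics import mean, median, mode

# List 1 assumes the damaged entry between 61 and 63 is 62.
list1 = [55, 56, 57, 58, 59, 60, 60, 60, 61, 62, 63, 64, 65]
list2 = [35, 40, 45, 50, 55, 60, 60, 60, 65, 70, 75, 80, 85]

def value_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)

for name, data in (("List 1", list1), ("List 2", list2)):
    print(name,
          "mean =", mean(data),
          "median =", median(data),
          "mode =", mode(data),
          "range =", value_range(data))
```

The three measures of center agree for both lists, so only a measure of spread such as the range separates them.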

Inter-quartile Range

The inter-quartile range measures the spread of the middle 50% of the data. You first find the median (represented by Q2, the value that divides the data into two halves), and then find the median for each half. The three values that divide the data into four parts are called the quartiles, represented by Q1, Q2, and Q3. The difference between the third quartile and the first quartile is called the inter-quartile range, denoted by IQR = Q3 - Q1.

Example: Quartiles for Age

The ages of the 20 subjects in the medical study are listed below in order.

32, 37, 39, 40, 41, 41, 41, 42, 42, 43, 44, 45, 45, 45, 46, 47, 47, 49, 50, 51

The histogram of the ages is also provided.

(a) Calculate the median age.
(b) Calculate the first quartile Q1 for this age data.
(c) Calculate the third quartile Q3 for this age data.
(d) Calculate the range for this age data.

[Histogram of age (vertical axis: Count), with the median, Q1 = 41, and Q3 marked]

We see that the distribution of age is approximately symmetric and that the quartiles are about the same distance from the median. The quartiles are actually the 25th, 50th, and 75th percentiles.

DEFINITION: The pth percentile is the value such that p% of the observations fall at or below that value and (100 - p)% of the observations fall at or above that value.

Finding the Quartiles

1. Find the median of all of the observations.
2. First Quartile = Q1 = median of the observations that fall below the median.
3. Third Quartile = Q3 = median of the observations that fall above the median.

Notes
When the number of observations is odd, the middle observation is the median. This observation is not included in either of the two halves when computing Q1 and Q3.
Although different books, calculators, and computers may use slightly different ways to compute the quartiles, they are all based on the same idea.
In a left-skewed distribution, the first quartile will be farther from the median than the third quartile is. If the distribution is symmetric, the quartiles should be the same distance from the median.

Five-Number Summary

Five-number summary: Minimum, Q1, Median, Q3, Maximum

Boxplot: To Build a Basic Boxplot

[Diagram: boxplot with Min, Q1, Q2 = Median, Q3, and Max labeled]

List the data values in order from smallest to largest.
Find the five-number summary: minimum, Q1, median, Q3, and maximum.
Locate the values for Q1, the median, and Q3 on the scale. These values determine the box part of the boxplot. The quartiles determine the ends of the box, and a line is drawn inside the box to mark the value of the median.
Draw lines (called whiskers) from the midpoints of the ends of the box out to the minimum and maximum.

Problem

Consider the (ordered) ages of the 20 subjects in a medical study:

32, 37, 39, 40, 41, 41, 41, 42, 42, 43, 44, 45, 45, 45, 46, 47, 47, 49, 50, 51

The five-number summary for the age data is given by: min = 32, Q1 = 41, median = 43.5, Q3 = 46.5, and max = 51. Draw the modified boxplot.

The distance between the median and each of the quartiles is roughly the same, supporting the rough symmetry of the distribution as seen previously from the histogram.

Using the 1.5 x IQR Rule to Identify Outliers and Build a Modified Boxplot

List the data values in order from smallest to largest.
Find the five-number summary: minimum, Q1, median, Q3, and maximum.
Locate the values for Q1, the median, and Q3 on the scale. These values determine the box part of the boxplot. The quartiles determine the ends of the box, and a line is drawn inside the box to mark the value of the median.
Find the IQR = Q3 - Q1.
Compute the quantity STEP = 1.5 x (IQR).
Find the location of the inner fences by taking 1 step out from each of the quartiles: lower inner fence = Q1 - STEP; upper inner fence = Q3 + STEP.
Draw the lines (whiskers) from the midpoints of the ends of the box out to the smallest and largest values WITHIN the inner fences.
Observations that fall OUTSIDE the inner fences are considered potential outliers. If there are any outliers, plot them individually along the scale using a solid dot.
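The quartile procedure and the 1.5 x IQR rule can be sketched in Python as below. The sketch assumes the age list reads 32, 37, 39, 40, 41, 41, 41, 42, 42, 43, ... (the first value and the two 42s are readings of damaged entries, so the outlier result in particular depends on that assumption). The function names are hypothetical, and the quartile method is the "median of each half" rule described in the text, which can differ slightly from what a calculator or library reports.

```python
def median(values):
    """Middle value, or the mean of the two middle values when n is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        return ordered[n // 2]
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2

def five_number_summary(values):
    """(min, Q1, median, Q3, max) using medians of the lower and upper halves.

    When n is odd, the middle observation is left out of both halves.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return ordered[0], median(lower), median(ordered), median(upper), ordered[-1]

def inner_fences(q1, q3):
    """Inner fences: one STEP = 1.5 * IQR beyond each quartile."""
    step = 1.5 * (q3 - q1)
    return q1 - step, q3 + step

# Ages of the 20 subjects; 32 and the two 42s are assumed readings.
ages = [32, 37, 39, 40, 41, 41, 41, 42, 42, 43,
        44, 45, 45, 45, 46, 47, 47, 49, 50, 51]

minimum, q1, med, q3, maximum = five_number_summary(ages)
low_fence, high_fence = inner_fences(q1, q3)
outliers = [x for x in ages if x < low_fence or x > high_fence]
# Whiskers extend to the most extreme observations inside the fences.
inside = [x for x in ages if low_fence <= x <= high_fence]

print("five-number summary:", (minimum, q1, med, q3, maximum))
print("inner fences:", (low_fence, high_fence))
print("potential outliers:", outliers)
print("whiskers end at:", (min(inside), max(inside)))
```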

[Figure: modified boxplot with its five-number summary, showing the inner fences, potential outliers (outside and far outside values), and the farthest observations that are not potential outliers]

Example: Any Age Outlier?

Let's apply the "rule of thumb" to our age data set to assess whether there are any outliers.

(a) Construct the fences for the modified boxplot based on the 1.5 * IQR rule.
(b) Are there any outliers using the 1.5 * IQR rule?
(c) Construct the modified boxplot.

Let's Do It! Five-Number Summary and Outliers

Let's Do It! Cost of Running Shoes

The prices for 1 comparable pairs of running shoes produced the following modified boxplot.

[Modified boxplot of PRICE]

(a) What was the approximate range of prices for such running shoes? Range = ____
(b) Twenty-five percent of the shoes cost more than approximately what amount? $ ____

Side-by-side boxplots are helpful for comparing two or more distributions with respect to the five-number summary.

[Side-by-side boxplots for two processes]

Although the median of the first process is closer to the target value of ____ cm, the second process produces a less variable distribution.

Let's Do It! Comparing Ages: Antibiotic Study

Variable = age for 23 children randomly assigned to one of two treatment groups.

[Data table: ages for the Amoxicillin and Cefadroxil groups]

(a) Give the five-number summary for each of the two treatment groups. Comment on your results.
Amoxicillin Group (n=11): Five-number summary: ____
Cefadroxil Group (n=12): Five-number summary: ____

(b) Make side-by-side modified boxplots for the antibiotic study data in part (a).
Amoxicillin: Lower fence = ____, Upper fence = ____, outliers: ____
Cefadroxil: Lower fence = ____, Upper fence = ____, outliers: ____

Standard Deviation

The standard deviation is a measure of the spread of the observations from the mean. Think of the standard deviation as an average (or standard) distance of the observations from the mean.

Example: Standard Deviation, What Is It?

Observation x    Deviation x - x̄    Squared Deviation (x - x̄)²
0                -4                  16
5                 1                   1
7                 3                   9
mean = 4         sum always = 0      sum = 26

sample variance = 26/(3 - 1) = 13
sample standard deviation = √13 ≈ 3.6

Interpretation of the Standard Deviation

Think of the standard deviation as roughly an average distance of the observations from their mean. If all of the observations are the same, then the standard deviation will be 0 (i.e., no spread). Otherwise the standard deviation is positive, and the more spread out the observations are about their mean, the larger the value of the standard deviation.

If $x_1, x_2, \ldots, x_n$ denote a sample of n observations, the sample variance is denoted by:

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n-1} = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n-1}$$

The sample standard deviation, denoted by s, is the square root of the variance: $s = \sqrt{s^2}$.

Shortcut formulas for computing the variance and standard deviation

$$s^2 = \frac{\sum x_i^2 - \left(\sum x_i\right)^2 / n}{n-1} = \frac{n \sum x_i^2 - \left(\sum x_i\right)^2}{n(n-1)}$$

These are mathematically equivalent to the preceding formulas and do not involve using the mean. They save time when repeated subtracting and squaring occur in the original formulas. They are also more accurate when the mean has been rounded.

Remarks:
The variance is measured in squared units. By taking the square root of the variance we bring this measure of spread back into the original units.
Just as the mean is not a resistant measure of center, the standard deviation, which uses the mean in its definition, is not a resistant measure of spread. It is heavily influenced by extreme values.
There are statistical arguments that support why we divide by n - 1 instead of n in the denominator of the sample standard deviation.
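To see that the definitional and shortcut formulas agree, here is a minimal Python sketch. The observations 0, 5, 7 are inferred from the standard deviation example (mean 4 with deviations -4, 1, 3), and the function names are hypothetical.

```python
from math import sqrt, isclose

def sample_variance(values):
    """Definitional formula: sum of squared deviations divided by n - 1."""
    n = len(values)
    xbar = sum(values) / n
    return sum((x - xbar) ** 2 for x in values) / (n - 1)

def sample_variance_shortcut(values):
    """Shortcut formula: [n * sum(x^2) - (sum x)^2] / [n * (n - 1)]."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return (n * sum_x2 - sum_x ** 2) / (n * (n - 1))

# Observations inferred from the worked example: mean 4, deviations -4, 1, 3.
observations = [0, 5, 7]

var_def = sample_variance(observations)
var_short = sample_variance_shortcut(observations)
print("variance (definition):", var_def)
print("variance (shortcut):  ", var_short)
print("standard deviation:   ", sqrt(var_def))
assert isclose(var_def, var_short)
```

The shortcut never computes the mean, so it avoids carrying a rounded mean through every subtraction.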

Example: Consistency of Weight Loss Program

In a recent study of the effect of a certain diet on weight reduction, 11 subjects were put on the diet for two weeks and their weight loss/gain in lbs was measured (positive values indicate weight loss).

1, 1, 2, 2, 3, 2, 1, 1, 3, 2.5, -3

What is the standard deviation of the weight loss?

Solution: [worked computation of Σx, Σx², and s]

Let's Do It! Emergency Room Patients

The following are the ages of a sample of 20 patients seen in the emergency room of a hospital on a Friday night.

[Data: ages of the 20 patients]

Find the standard deviation of the ages.

IQR and Standard Deviation

The interquartile range, IQR, is the distance between the first and third quartiles (Q3 - Q1), and measures the spread of the middle 50% of the data. When the median is used as a measure of center, the IQR is often used as a measure of spread. For skewed distributions, or distributions with outliers, the IQR tends to be a better measure of spread if your goal is to summarize that distribution. Adding the minimum and maximum values to the median and quartiles results in the five-number summary. A graphical display of the five-number summary is a boxplot, and the length of the box corresponds to the IQR.

The standard deviation is roughly the average distance of the observed values from their mean. The mean and the standard deviation are most useful for approximately symmetric distributions with no outliers. In the next chapter we will discuss an important family of symmetric distributions, called the normal distributions, for which the standard deviation is a very useful summary.

Tip: The numerical summaries presented in this chapter provide information about the center and spread of a distribution, but a graph, such as a histogram or stem-and-leaf plot, provides the best picture of the overall shape of the distribution. Graph your data first!

Variance and Standard Deviation for Grouped Data

The procedure for finding the variance and standard deviation for grouped data is similar to that for finding the mean for grouped data, and it uses the midpoints of each class.

Example: The data represent the number of miles that 20 runners ran during one week. Find the variance and the standard deviation for the frequency distribution of the data.

[Frequency table: class boundaries and frequencies]

Solution

Step 1 Make a table as shown, and find the midpoint of each class.
Step 2 Multiply the frequency by the midpoint for each class, and place the products in column D: 1 · 8 = 8, 2 · 13 = 26, ..., 2 · 38 = 76.
Step 3 Multiply the frequency by the square of the midpoint, and place the products in column E: 1 · 8² = 64, 2 · 13² = 338, ..., 2 · 38² = 2888.
Step 4 Find the sums of columns B, D, and E. The sum of column B is n, the sum of column D is $\sum f \cdot X_m$, and the sum of column E is $\sum f \cdot X_m^2$. The completed table is shown.
Step 5 Substitute in the formula and solve for $s^2$ to get the variance:

$$s^2 = \frac{n \sum f \cdot X_m^2 - \left(\sum f \cdot X_m\right)^2}{n(n-1)}$$

Step 6 Take the square root to get the standard deviation.

[Completed table with columns A (class), B (frequency), C (midpoint), D (f · X_m), and E (f · X_m²)]
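Steps 1 through 6 can be sketched in Python as follows. As with the grouped mean, the full frequency table is not reproduced in the text, so the rows below are illustrative assumptions: the first two rows and the last row match the products quoted in Steps 2 and 3, and the middle rows were chosen so that n = 20. The formula used is the shortcut form applied to class midpoints.

```python
from math import sqrt

# Illustrative table of (class midpoint, frequency) pairs.
# Assumption: rows 1, 2, and 7 match the products quoted in the steps;
# the middle rows are made up so that the frequencies total 20.
table = [(8, 1), (13, 2), (18, 3), (23, 5), (28, 4), (33, 3), (38, 2)]

def grouped_variance(rows):
    """Sample variance for grouped data using class midpoints."""
    n = sum(f for _, f in rows)                         # sum of column B
    sum_fx = sum(f * mid for mid, f in rows)            # sum of column D
    sum_fx2 = sum(f * mid ** 2 for mid, f in rows)      # sum of column E
    return (n * sum_fx2 - sum_fx ** 2) / (n * (n - 1))  # Step 5

variance = grouped_variance(table)
std_dev = sqrt(variance)                                # Step 6
print(f"variance = {variance:.2f}")
print(f"standard deviation = {std_dev:.2f}")
```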

Let's Do It! The data show the distribution of the birth weight (in oz.) of 100 consecutive deliveries. Find the variance and the standard deviation.

[Frequency table: Interval, Frequency]

Chapter Objectives:
Identify types of variables (quantitative / categorical).
Compute percentages and answer questions using charts and histograms.
Identify misleading pictograms.
Construct a frequency table for categorical data.
Construct histograms from frequency tables.
Construct a frequency table from a histogram.
Understand that histograms match the characteristics of the population.
Compute measures of central tendency (the three M's: mean/median/mode) of data sets.
Combining means: compute the overall mean of two groups using their averages (chapter handout, Kim's Scores).
Understand the effect of extreme values on the mean (sensitive) and the median (resistant).
Compute the mean of a frequency table.
Understand the effect of the shape of the distribution on the mean, median, and mode.
Compute the range of a data set.
Find the 5-number summary (Min, Q1, Median, Q3, Max).
Draw a basic boxplot.
Find the 5-number summary from a modified boxplot.
Identify potential outliers using the 1.5*IQR rule.
Draw a modified boxplot.
Remember that the variance and standard deviation of a sample are different (in formula) from the population's.
Compute the variance and standard deviation of data sets.
Compute the variance and standard deviation of a frequency table.


More information

TOPIC: Descriptive Statistics Single Variable

TOPIC: Descriptive Statistics Single Variable TOPIC: Descriptive Statistics Single Variable I. Numerical data summary measurements A. Measures of Location. Measures of central tendency Mean; Median; Mode. Quantiles - measures of noncentral tendency

More information

1.3.1 Measuring Center: The Mean

1.3.1 Measuring Center: The Mean 1.3.1 Measuring Center: The Mean Mean - The arithmetic average. To find the mean (pronounced x bar) of a set of observations, add their values and divide by the number of observations. If the n observations

More information

1-1. Chapter 1. Sampling and Descriptive Statistics by The McGraw-Hill Companies, Inc. All rights reserved.

1-1. Chapter 1. Sampling and Descriptive Statistics by The McGraw-Hill Companies, Inc. All rights reserved. 1-1 Chapter 1 Sampling and Descriptive Statistics 1-2 Why Statistics? Deal with uncertainty in repeated scientific measurements Draw conclusions from data Design valid experiments and draw reliable conclusions

More information

Elementary Statistics

Elementary Statistics Elementary Statistics Q: What is data? Q: What does the data look like? Q: What conclusions can we draw from the data? Q: Where is the middle of the data? Q: Why is the spread of the data important? Q:

More information

Range The range is the simplest of the three measures and is defined now.

Range The range is the simplest of the three measures and is defined now. Measures of Variation EXAMPLE A testing lab wishes to test two experimental brands of outdoor paint to see how long each will last before fading. The testing lab makes 6 gallons of each paint to test.

More information

Lecture 1: Description of Data. Readings: Sections 1.2,

Lecture 1: Description of Data. Readings: Sections 1.2, Lecture 1: Description of Data Readings: Sections 1.,.1-.3 1 Variable Example 1 a. Write two complete and grammatically correct sentences, explaining your primary reason for taking this course and then

More information

What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty.

What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty. What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty. Statistics is a field of study concerned with the data collection,

More information

STT 315 This lecture is based on Chapter 2 of the textbook.

STT 315 This lecture is based on Chapter 2 of the textbook. STT 315 This lecture is based on Chapter 2 of the textbook. Acknowledgement: Author is thankful to Dr. Ashok Sinha, Dr. Jennifer Kaplan and Dr. Parthanil Roy for allowing him to use/edit some of their

More information

ST Presenting & Summarising Data Descriptive Statistics. Frequency Distribution, Histogram & Bar Chart

ST Presenting & Summarising Data Descriptive Statistics. Frequency Distribution, Histogram & Bar Chart ST2001 2. Presenting & Summarising Data Descriptive Statistics Frequency Distribution, Histogram & Bar Chart Summary of Previous Lecture u A study often involves taking a sample from a population that

More information

Further Mathematics 2018 CORE: Data analysis Chapter 2 Summarising numerical data

Further Mathematics 2018 CORE: Data analysis Chapter 2 Summarising numerical data Chapter 2: Summarising numerical data Further Mathematics 2018 CORE: Data analysis Chapter 2 Summarising numerical data Extract from Study Design Key knowledge Types of data: categorical (nominal and ordinal)

More information

Describing Distributions with Numbers

Describing Distributions with Numbers Describing Distributions with Numbers Using graphs, we could determine the center, spread, and shape of the distribution of a quantitative variable. We can also use numbers (called summary statistics)

More information

STP 420 INTRODUCTION TO APPLIED STATISTICS NOTES

STP 420 INTRODUCTION TO APPLIED STATISTICS NOTES INTRODUCTION TO APPLIED STATISTICS NOTES PART - DATA CHAPTER LOOKING AT DATA - DISTRIBUTIONS Individuals objects described by a set of data (people, animals, things) - all the data for one individual make

More information

Statistics for Managers using Microsoft Excel 6 th Edition

Statistics for Managers using Microsoft Excel 6 th Edition Statistics for Managers using Microsoft Excel 6 th Edition Chapter 3 Numerical Descriptive Measures 3-1 Learning Objectives In this chapter, you learn: To describe the properties of central tendency, variation,

More information

Chapter. Numerically Summarizing Data Pearson Prentice Hall. All rights reserved

Chapter. Numerically Summarizing Data Pearson Prentice Hall. All rights reserved Chapter 3 Numerically Summarizing Data Section 3.1 Measures of Central Tendency Objectives 1. Determine the arithmetic mean of a variable from raw data 2. Determine the median of a variable from raw data

More information

Resistant Measure - A statistic that is not affected very much by extreme observations.

Resistant Measure - A statistic that is not affected very much by extreme observations. Chapter 1.3 Lecture Notes & Examples Section 1.3 Describing Quantitative Data with Numbers (pp. 50-74) 1.3.1 Measuring Center: The Mean Mean - The arithmetic average. To find the mean (pronounced x bar)

More information

2011 Pearson Education, Inc

2011 Pearson Education, Inc Statistics for Business and Economics Chapter 2 Methods for Describing Sets of Data Summary of Central Tendency Measures Measure Formula Description Mean x i / n Balance Point Median ( n +1) Middle Value

More information

Review for Exam #1. Chapter 1. The Nature of Data. Definitions. Population. Sample. Quantitative data. Qualitative (attribute) data

Review for Exam #1. Chapter 1. The Nature of Data. Definitions. Population. Sample. Quantitative data. Qualitative (attribute) data Review for Exam #1 1 Chapter 1 Population the complete collection of elements (scores, people, measurements, etc.) to be studied Sample a subcollection of elements drawn from a population 11 The Nature

More information

CIVL 7012/8012. Collection and Analysis of Information

CIVL 7012/8012. Collection and Analysis of Information CIVL 7012/8012 Collection and Analysis of Information Uncertainty in Engineering Statistics deals with the collection and analysis of data to solve real-world problems. Uncertainty is inherent in all real

More information

AP Final Review II Exploring Data (20% 30%)

AP Final Review II Exploring Data (20% 30%) AP Final Review II Exploring Data (20% 30%) Quantitative vs Categorical Variables Quantitative variables are numerical values for which arithmetic operations such as means make sense. It is usually a measure

More information

Histograms allow a visual interpretation

Histograms allow a visual interpretation Chapter 4: Displaying and Summarizing i Quantitative Data s allow a visual interpretation of quantitative (numerical) data by indicating the number of data points that lie within a range of values, called

More information

A is one of the categories into which qualitative data can be classified.

A is one of the categories into which qualitative data can be classified. Chapter 2 Methods for Describing Sets of Data 2.1 Describing qualitative data Recall qualitative data: non-numerical or categorical data Basic definitions: A is one of the categories into which qualitative

More information

Chapter 4. Displaying and Summarizing. Quantitative Data

Chapter 4. Displaying and Summarizing. Quantitative Data STAT 141 Introduction to Statistics Chapter 4 Displaying and Summarizing Quantitative Data Bin Zou (bzou@ualberta.ca) STAT 141 University of Alberta Winter 2015 1 / 31 4.1 Histograms 1 We divide the range

More information

CHAPTER 1. Introduction

CHAPTER 1. Introduction CHAPTER 1 Introduction Engineers and scientists are constantly exposed to collections of facts, or data. The discipline of statistics provides methods for organizing and summarizing data, and for drawing

More information

Shape, Outliers, Center, Spread Frequency and Relative Histograms Related to other types of graphical displays

Shape, Outliers, Center, Spread Frequency and Relative Histograms Related to other types of graphical displays Histograms: Shape, Outliers, Center, Spread Frequency and Relative Histograms Related to other types of graphical displays Sep 9 1:13 PM Shape: Skewed left Bell shaped Symmetric Bi modal Symmetric Skewed

More information

Exercises from Chapter 3, Section 1

Exercises from Chapter 3, Section 1 Exercises from Chapter 3, Section 1 1. Consider the following sample consisting of 20 numbers. (a) Find the mode of the data 21 23 24 24 25 26 29 30 32 34 39 41 41 41 42 43 48 51 53 53 (b) Find the median

More information

Topic 3: Introduction to Statistics. Algebra 1. Collecting Data. Table of Contents. Categorical or Quantitative? What is the Study of Statistics?!

Topic 3: Introduction to Statistics. Algebra 1. Collecting Data. Table of Contents. Categorical or Quantitative? What is the Study of Statistics?! Topic 3: Introduction to Statistics Collecting Data We collect data through observation, surveys and experiments. We can collect two different types of data: Categorical Quantitative Algebra 1 Table of

More information

Performance of fourth-grade students on an agility test

Performance of fourth-grade students on an agility test Starter Ch. 5 2005 #1a CW Ch. 4: Regression L1 L2 87 88 84 86 83 73 81 67 78 83 65 80 50 78 78? 93? 86? Create a scatterplot Find the equation of the regression line Predict the scores Chapter 5: Understanding

More information

2/2/2015 GEOGRAPHY 204: STATISTICAL PROBLEM SOLVING IN GEOGRAPHY MEASURES OF CENTRAL TENDENCY CHAPTER 3: DESCRIPTIVE STATISTICS AND GRAPHICS

2/2/2015 GEOGRAPHY 204: STATISTICAL PROBLEM SOLVING IN GEOGRAPHY MEASURES OF CENTRAL TENDENCY CHAPTER 3: DESCRIPTIVE STATISTICS AND GRAPHICS Spring 2015: Lembo GEOGRAPHY 204: STATISTICAL PROBLEM SOLVING IN GEOGRAPHY CHAPTER 3: DESCRIPTIVE STATISTICS AND GRAPHICS Descriptive statistics concise and easily understood summary of data set characteristics

More information

SESSION 5 Descriptive Statistics

SESSION 5 Descriptive Statistics SESSION 5 Descriptive Statistics Descriptive statistics are used to describe the basic features of the data in a study. They provide simple summaries about the sample and the measures. Together with simple

More information

Summarizing and Displaying Measurement Data/Understanding and Comparing Distributions

Summarizing and Displaying Measurement Data/Understanding and Comparing Distributions Summarizing and Displaying Measurement Data/Understanding and Comparing Distributions Histograms, Mean, Median, Five-Number Summary and Boxplots, Standard Deviation Thought Questions 1. If you were to

More information

3.1 Measures of Central Tendency: Mode, Median and Mean. Average a single number that is used to describe the entire sample or population

3.1 Measures of Central Tendency: Mode, Median and Mean. Average a single number that is used to describe the entire sample or population . Measures of Central Tendency: Mode, Median and Mean Average a single number that is used to describe the entire sample or population. Mode a. Easiest to compute, but not too stable i. Changing just one

More information

The science of learning from data.

The science of learning from data. STATISTICS (PART 1) The science of learning from data. Numerical facts Collection of methods for planning experiments, obtaining data and organizing, analyzing, interpreting and drawing the conclusions

More information

Lecture 1: Descriptive Statistics

Lecture 1: Descriptive Statistics Lecture 1: Descriptive Statistics MSU-STT-351-Sum 15 (P. Vellaisamy: MSU-STT-351-Sum 15) Probability & Statistics for Engineers 1 / 56 Contents 1 Introduction 2 Branches of Statistics Descriptive Statistics

More information

ADMS2320.com. We Make Stats Easy. Chapter 4. ADMS2320.com Tutorials Past Tests. Tutorial Length 1 Hour 45 Minutes

ADMS2320.com. We Make Stats Easy. Chapter 4. ADMS2320.com Tutorials Past Tests. Tutorial Length 1 Hour 45 Minutes We Make Stats Easy. Chapter 4 Tutorial Length 1 Hour 45 Minutes Tutorials Past Tests Chapter 4 Page 1 Chapter 4 Note The following topics will be covered in this chapter: Measures of central location Measures

More information

Types of Information. Topic 2 - Descriptive Statistics. Examples. Sample and Sample Size. Background Reading. Variables classified as STAT 511

Types of Information. Topic 2 - Descriptive Statistics. Examples. Sample and Sample Size. Background Reading. Variables classified as STAT 511 Topic 2 - Descriptive Statistics STAT 511 Professor Bruce Craig Types of Information Variables classified as Categorical (qualitative) - variable classifies individual into one of several groups or categories

More information

Practice problems from chapters 2 and 3

Practice problems from chapters 2 and 3 Practice problems from chapters and 3 Question-1. For each of the following variables, indicate whether it is quantitative or qualitative and specify which of the four levels of measurement (nominal, ordinal,

More information

Units. Exploratory Data Analysis. Variables. Student Data

Units. Exploratory Data Analysis. Variables. Student Data Units Exploratory Data Analysis Bret Larget Departments of Botany and of Statistics University of Wisconsin Madison Statistics 371 13th September 2005 A unit is an object that can be measured, such as

More information

1 Measures of the Center of a Distribution

1 Measures of the Center of a Distribution 1 Measures of the Center of a Distribution Qualitative descriptions of the shape of a distribution are important and useful. But we will often desire the precision of numerical summaries as well. Two aspects

More information

QUANTITATIVE DATA. UNIVARIATE DATA data for one variable

QUANTITATIVE DATA. UNIVARIATE DATA data for one variable QUANTITATIVE DATA Recall that quantitative (numeric) data values are numbers where data take numerical values for which it is sensible to find averages, such as height, hourly pay, and pulse rates. UNIVARIATE

More information

STA 218: Statistics for Management

STA 218: Statistics for Management Al Nosedal. University of Toronto. Fall 2017 My momma always said: Life was like a box of chocolates. You never know what you re gonna get. Forrest Gump. Problem How much do people with a bachelor s degree

More information

Measures of center. The mean The mean of a distribution is the arithmetic average of the observations:

Measures of center. The mean The mean of a distribution is the arithmetic average of the observations: Measures of center The mean The mean of a distribution is the arithmetic average of the observations: x = x 1 + + x n n n = 1 x i n i=1 The median The median is the midpoint of a distribution: the number

More information

Stat 101 Exam 1 Important Formulas and Concepts 1

Stat 101 Exam 1 Important Formulas and Concepts 1 1 Chapter 1 1.1 Definitions Stat 101 Exam 1 Important Formulas and Concepts 1 1. Data Any collection of numbers, characters, images, or other items that provide information about something. 2. Categorical/Qualitative

More information

Example 2. Given the data below, complete the chart:

Example 2. Given the data below, complete the chart: Statistics 2035 Quiz 1 Solutions Example 1. 2 64 150 150 2 128 150 2 256 150 8 8 Example 2. Given the data below, complete the chart: 52.4, 68.1, 66.5, 75.0, 60.5, 78.8, 63.5, 48.9, 81.3 n=9 The data is

More information

Lecture 2. Descriptive Statistics: Measures of Center

Lecture 2. Descriptive Statistics: Measures of Center Lecture 2. Descriptive Statistics: Measures of Center Descriptive Statistics summarize or describe the important characteristics of a known set of data Inferential Statistics use sample data to make inferences

More information

Chapter 6 Group Activity - SOLUTIONS

Chapter 6 Group Activity - SOLUTIONS Chapter 6 Group Activity - SOLUTIONS Group Activity Summarizing a Distribution 1. The following data are the number of credit hours taken by Math 105 students during a summer term. You will be analyzing

More information

Unit 2: Numerical Descriptive Measures

Unit 2: Numerical Descriptive Measures Unit 2: Numerical Descriptive Measures Summation Notation Measures of Central Tendency Measures of Dispersion Chebyshev's Rule Empirical Rule Measures of Relative Standing Box Plots z scores Jan 28 10:48

More information

Chapter 1:Descriptive statistics

Chapter 1:Descriptive statistics Slide 1.1 Chapter 1:Descriptive statistics Descriptive statistics summarises a mass of information. We may use graphical and/or numerical methods Examples of the former are the bar chart and XY chart,

More information

After completing this chapter, you should be able to:

After completing this chapter, you should be able to: Chapter 2 Descriptive Statistics Chapter Goals After completing this chapter, you should be able to: Compute and interpret the mean, median, and mode for a set of data Find the range, variance, standard

More information

Chapter 5. Understanding and Comparing. Distributions

Chapter 5. Understanding and Comparing. Distributions STAT 141 Introduction to Statistics Chapter 5 Understanding and Comparing Distributions Bin Zou (bzou@ualberta.ca) STAT 141 University of Alberta Winter 2015 1 / 27 Boxplots How to create a boxplot? Assume

More information

Chapter 5: Exploring Data: Distributions Lesson Plan

Chapter 5: Exploring Data: Distributions Lesson Plan Lesson Plan Exploring Data Displaying Distributions: Histograms Interpreting Histograms Displaying Distributions: Stemplots Describing Center: Mean and Median Describing Variability: The Quartiles The

More information

Descriptive Statistics Solutions COR1-GB.1305 Statistics and Data Analysis

Descriptive Statistics Solutions COR1-GB.1305 Statistics and Data Analysis Descriptive Statistics Solutions COR-GB.0 Statistics and Data Analysis Types of Data. The class survey asked each respondent to report the following information: gender; birth date; GMAT score; undergraduate

More information

Last Lecture. Distinguish Populations from Samples. Knowing different Sampling Techniques. Distinguish Parameters from Statistics

Last Lecture. Distinguish Populations from Samples. Knowing different Sampling Techniques. Distinguish Parameters from Statistics Last Lecture Distinguish Populations from Samples Importance of identifying a population and well chosen sample Knowing different Sampling Techniques Distinguish Parameters from Statistics Knowing different

More information

Honors Algebra 1 - Fall Final Review

Honors Algebra 1 - Fall Final Review Name: Period Date: Honors Algebra 1 - Fall Final Review This review packet is due at the beginning of your final exam. In addition to this packet, you should study each of your unit reviews and your notes.

More information

Math 120 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency

Math 120 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency Math 1 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency The word average: is very ambiguous and can actually refer to the mean, median, mode or midrange. Notation:

More information

Descriptive Data Summarization

Descriptive Data Summarization Descriptive Data Summarization Descriptive data summarization gives the general characteristics of the data and identify the presence of noise or outliers, which is useful for successful data cleaning

More information

Vocabulary: Samples and Populations

Vocabulary: Samples and Populations Vocabulary: Samples and Populations Concept Different types of data Categorical data results when the question asked in a survey or sample can be answered with a nonnumerical answer. For example if we

More information

Lecture 2 and Lecture 3

Lecture 2 and Lecture 3 Lecture 2 and Lecture 3 1 Lecture 2 and Lecture 3 We can describe distributions using 3 characteristics: shape, center and spread. These characteristics have been discussed since the foundation of statistics.

More information

Perhaps the most important measure of location is the mean (average). Sample mean: where n = sample size. Arrange the values from smallest to largest:

Perhaps the most important measure of location is the mean (average). Sample mean: where n = sample size. Arrange the values from smallest to largest: 1 Chapter 3 - Descriptive stats: Numerical measures 3.1 Measures of Location Mean Perhaps the most important measure of location is the mean (average). Sample mean: where n = sample size Example: The number

More information

Chapter Four. Numerical Descriptive Techniques. Range, Standard Deviation, Variance, Coefficient of Variation

Chapter Four. Numerical Descriptive Techniques. Range, Standard Deviation, Variance, Coefficient of Variation Chapter Four Numerical Descriptive Techniques 4.1 Numerical Descriptive Techniques Measures of Central Location Mean, Median, Mode Measures of Variability Range, Standard Deviation, Variance, Coefficient

More information

CHAPTER 5: EXPLORING DATA DISTRIBUTIONS. Individuals are the objects described by a set of data. These individuals may be people, animals or things.

CHAPTER 5: EXPLORING DATA DISTRIBUTIONS. Individuals are the objects described by a set of data. These individuals may be people, animals or things. (c) Epstein 2013 Chapter 5: Exploring Data Distributions Page 1 CHAPTER 5: EXPLORING DATA DISTRIBUTIONS 5.1 Creating Histograms Individuals are the objects described by a set of data. These individuals

More information

Descriptive Statistics-I. Dr Mahmoud Alhussami

Descriptive Statistics-I. Dr Mahmoud Alhussami Descriptive Statistics-I Dr Mahmoud Alhussami Biostatistics What is the biostatistics? A branch of applied math. that deals with collecting, organizing and interpreting data using well-defined procedures.

More information

Statistics and parameters

Statistics and parameters Statistics and parameters Tables, histograms and other charts are used to summarize large amounts of data. Often, an even more extreme summary is desirable. Statistics and parameters are numbers that characterize

More information

MgtOp 215 Chapter 3 Dr. Ahn

MgtOp 215 Chapter 3 Dr. Ahn MgtOp 215 Chapter 3 Dr. Ahn Measures of central tendency (center, location): measures the middle point of a distribution or data; these include mean and median. Measures of dispersion (variability, spread):

More information

P8130: Biostatistical Methods I

P8130: Biostatistical Methods I P8130: Biostatistical Methods I Lecture 2: Descriptive Statistics Cody Chiuzan, PhD Department of Biostatistics Mailman School of Public Health (MSPH) Lecture 1: Recap Intro to Biostatistics Types of Data

More information

Announcements. Lecture 1 - Data and Data Summaries. Data. Numerical Data. all variables. continuous discrete. Homework 1 - Out 1/15, due 1/22

Announcements. Lecture 1 - Data and Data Summaries. Data. Numerical Data. all variables. continuous discrete. Homework 1 - Out 1/15, due 1/22 Announcements Announcements Lecture 1 - Data and Data Summaries Statistics 102 Colin Rundel January 13, 2013 Homework 1 - Out 1/15, due 1/22 Lab 1 - Tomorrow RStudio accounts created this evening Try logging

More information

University of Jordan Fall 2009/2010 Department of Mathematics

University of Jordan Fall 2009/2010 Department of Mathematics handouts Part 1 (Chapter 1 - Chapter 5) University of Jordan Fall 009/010 Department of Mathematics Chapter 1 Introduction to Introduction; Some Basic Concepts Statistics is a science related to making

More information

Chapter2 Description of samples and populations. 2.1 Introduction.

Chapter2 Description of samples and populations. 2.1 Introduction. Chapter2 Description of samples and populations. 2.1 Introduction. Statistics=science of analyzing data. Information collected (data) is gathered in terms of variables (characteristics of a subject that

More information

Instructor: Doug Ensley Course: MAT Applied Statistics - Ensley

Instructor: Doug Ensley Course: MAT Applied Statistics - Ensley Student: Date: Instructor: Doug Ensley Course: MAT117 01 Applied Statistics - Ensley Assignment: Online 04 - Sections 2.5 and 2.6 1. A travel magazine recently presented data on the annual number of vacation

More information

MEASURING THE SPREAD OF DATA: 6F

MEASURING THE SPREAD OF DATA: 6F CONTINUING WITH DESCRIPTIVE STATS 6E,6F,6G,6H,6I MEASURING THE SPREAD OF DATA: 6F othink about this example: Suppose you are at a high school football game and you sample 40 people from the student section

More information

Objective A: Mean, Median and Mode Three measures of central of tendency: the mean, the median, and the mode.

Objective A: Mean, Median and Mode Three measures of central of tendency: the mean, the median, and the mode. Chapter 3 Numerically Summarizing Data Chapter 3.1 Measures of Central Tendency Objective A: Mean, Median and Mode Three measures of central of tendency: the mean, the median, and the mode. A1. Mean The

More information

Chapter 6. The Standard Deviation as a Ruler and the Normal Model 1 /67

Chapter 6. The Standard Deviation as a Ruler and the Normal Model 1 /67 Chapter 6 The Standard Deviation as a Ruler and the Normal Model 1 /67 Homework Read Chpt 6 Complete Reading Notes Do P129 1, 3, 5, 7, 15, 17, 23, 27, 29, 31, 37, 39, 43 2 /67 Objective Students calculate

More information

CHAPTER 4 VARIABILITY ANALYSES. Chapter 3 introduced the mode, median, and mean as tools for summarizing the

CHAPTER 4 VARIABILITY ANALYSES. Chapter 3 introduced the mode, median, and mean as tools for summarizing the CHAPTER 4 VARIABILITY ANALYSES Chapter 3 introduced the mode, median, and mean as tools for summarizing the information provided in an distribution of data. Measures of central tendency are often useful

More information

Chapter 4.notebook. August 30, 2017

Chapter 4.notebook. August 30, 2017 Sep 1 7:53 AM Sep 1 8:21 AM Sep 1 8:21 AM 1 Sep 1 8:23 AM Sep 1 8:23 AM Sep 1 8:23 AM SOCS When describing a distribution, make sure to always tell about three things: shape, outliers, center, and spread

More information

1. Exploratory Data Analysis

1. Exploratory Data Analysis 1. Exploratory Data Analysis 1.1 Methods of Displaying Data A visual display aids understanding and can highlight features which may be worth exploring more formally. Displays should have impact and be

More information

Lecture 6: Chapter 4, Section 2 Quantitative Variables (Displays, Begin Summaries)

Lecture 6: Chapter 4, Section 2 Quantitative Variables (Displays, Begin Summaries) Lecture 6: Chapter 4, Section 2 Quantitative Variables (Displays, Begin Summaries) Summarize with Shape, Center, Spread Displays: Stemplots, Histograms Five Number Summary, Outliers, Boxplots Cengage Learning

More information

are the objects described by a set of data. They may be people, animals or things.

are the objects described by a set of data. They may be people, animals or things. ( c ) E p s t e i n, C a r t e r a n d B o l l i n g e r 2016 C h a p t e r 5 : E x p l o r i n g D a t a : D i s t r i b u t i o n s P a g e 1 CHAPTER 5: EXPLORING DATA DISTRIBUTIONS 5.1 Creating Histograms

More information

Preliminary Statistics course. Lecture 1: Descriptive Statistics

Preliminary Statistics course. Lecture 1: Descriptive Statistics Preliminary Statistics course Lecture 1: Descriptive Statistics Rory Macqueen (rm43@soas.ac.uk), September 2015 Organisational Sessions: 16-21 Sep. 10.00-13.00, V111 22-23 Sep. 15.00-18.00, V111 24 Sep.

More information

Chapter 2: Descriptive Analysis and Presentation of Single- Variable Data

Chapter 2: Descriptive Analysis and Presentation of Single- Variable Data Chapter 2: Descriptive Analysis and Presentation of Single- Variable Data Mean 26.86667 Standard Error 2.816392 Median 25 Mode 20 Standard Deviation 10.90784 Sample Variance 118.981 Kurtosis -0.61717 Skewness

More information

Lecture Slides. Elementary Statistics Twelfth Edition. by Mario F. Triola. and the Triola Statistics Series. Section 3.1- #

Lecture Slides. Elementary Statistics Twelfth Edition. by Mario F. Triola. and the Triola Statistics Series. Section 3.1- # Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series by Mario F. Triola Chapter 3 Statistics for Describing, Exploring, and Comparing Data 3-1 Review and Preview 3-2 Measures

More information

DEPARTMENT OF QUANTITATIVE METHODS & INFORMATION SYSTEMS QM 120. Spring 2008

DEPARTMENT OF QUANTITATIVE METHODS & INFORMATION SYSTEMS QM 120. Spring 2008 DEPARTMENT OF QUANTITATIVE METHODS & INFORMATION SYSTEMS Introduction to Business Statistics QM 120 Chapter 3 Spring 2008 Measures of central tendency for ungrouped data 2 Graphs are very helpful to describe

More information

Chapter 2 Class Notes Sample & Population Descriptions Classifying variables

Chapter 2 Class Notes Sample & Population Descriptions Classifying variables Chapter 2 Class Notes Sample & Population Descriptions Classifying variables Random Variables (RVs) are discrete quantitative continuous nominal qualitative ordinal Notation and Definitions: a Sample is

More information

3.3. Section. Measures of Central Tendency and Dispersion from Grouped Data. Copyright 2013, 2010 and 2007 Pearson Education, Inc.

3.3. Section. Measures of Central Tendency and Dispersion from Grouped Data. Copyright 2013, 2010 and 2007 Pearson Education, Inc. Section 3.3 Measures of Central Tendency and Dispersion from Grouped Data Objectives 1. Approximate the mean of a variable from grouped data 2. Compute the weighted mean 3. Approximate the standard deviation

More information

Math 140 Introductory Statistics

Math 140 Introductory Statistics Math 140 Introductory Statistics Professor Silvia Fernández Chapter 2 Based on the book Statistics in Action by A. Watkins, R. Scheaffer, and G. Cobb. Visualizing Distributions Recall the definition: The

More information

Math 140 Introductory Statistics

Math 140 Introductory Statistics Visualizing Distributions Math 140 Introductory Statistics Professor Silvia Fernández Chapter Based on the book Statistics in Action by A. Watkins, R. Scheaffer, and G. Cobb. Recall the definition: The

More information

Determining the Spread of a Distribution

Determining the Spread of a Distribution Determining the Spread of a Distribution 1.3-1.5 Cathy Poliak, Ph.D. cathy@math.uh.edu Department of Mathematics University of Houston Lecture 3-2311 Lecture 3-2311 1 / 58 Outline 1 Describing Quantitative

More information