Lecture 1: Descriptive Statistics

MSU-STT-351-Sum 15: Probability & Statistics for Engineers (P. Vellaisamy)

Contents
1 Introduction
2 Branches of Statistics
   Descriptive Statistics
3 Histogram
4 Numerical Summary of Measures
5 Measure of Variability
6 Homework

Introduction

Why Statistics?
(i) It is the science that helps us understand many phenomena that occur in engineering, science, economics, finance, biology, etc.
(ii) It is the scientific way to make intelligent judgments and decisions from observed data that contain uncertainty and variation.

We start with two examples.
Example 1. The emission levels of HC (hydrocarbon) and CO (carbon monoxide) of a vehicle:
HC (gm/mile):
CO (gm/mile):
Question: What is the emission level of HC/CO? It is difficult to make a precise statement, because the observed levels vary widely.

Example 2. Marks of two students in 4 tests:
S1:
S2:
Question: Who is doing better? Is there any difficulty in answering? No statistical analysis is needed.

What is statistics?
(i) One-word definition:
    (a) Economics: Money
    (b) Philosophy: Why
    (c) Statistics: Variation
(ii) Layman's definition: Information/summary of data.
(iii) Formal definition: Statistics deals with techniques for how to
    (a) obtain information/data (a sample),
    (b) analyze the data scientifically,
    (c) draw valid conclusions/inferences.
(iv) As a branch of mathematics, it deals with analytical techniques for analyzing data in order to infer the characteristics of the population.

Population and Samples
Population: The set of all well-defined objects/elements (of interest) that are under investigation.
Example 1. The students studying engineering at MSU.
Example 2. The population of East Lansing.
If we collect information on all the elements in the population, we call it a census. Most often this is impossible, as it takes a lot of time, effort, and money.
Sample: A subset of the population that is selected for obtaining information.

Example 3. We may select 10 students from each discipline at MSU.
Often, we are interested in certain characteristics of the population (the number of flaws in a piece of cloth, the thickness of a capsule wall, the monthly income of an individual, etc.). A characteristic may be
(i) Categorical (belongs to one of several categories)
    (a) Gender of a student (male/female)
    (b) Quality of a product (excellent/good/bad)
(ii) Numerical (measured as a real value)
    (a) Heights of students
    (b) Values of a stock

Types of Variables
A variable is any characteristic that changes from object to object in the population. It is denoted by x, y, z (or by X, Y, Z). A variable X may be categorical (a categorical variable) or numerical/quantitative (a numerical variable).

Types of Data
(i) The data $X_1, X_2, \ldots, X_n$ (or $x_1, x_2, \ldots, x_n$) on a categorical variable X are called categorical data.
(ii) The data $X_1, X_2, \ldots, X_n$ (or $x_1, x_2, \ldots, x_n$) on a numerical variable X are called quantitative data.
Suppose we measure height = x and weight = y on n individuals, giving $(x_1, y_1), \ldots, (x_n, y_n)$. Then we have bivariate data. Multivariate data are defined similarly.

Branches of Statistics
(i) Descriptive Statistics: Deals with summarizing and describing important features of data (such as the mean, median, and standard deviation) using tabular or graphical methods.
(ii) Inferential Statistics: Deals with techniques for drawing inferences (generalizing to the population) and making predictions about the population, based on the information obtained from the sample.

Descriptive Statistics

1.2 Graphical (Visual) Display of Univariate Data
Pictures often reveal useful information about data.
Graphs for Quantitative Data
(i) Stem-and-Leaf Display (Stem Plot)
This is a useful plot for displaying quantitative data.
Example 4. Consider the data on the pulse rates (per minute) of 10 patients: 45, 61, 60, 62, 65, 73, 75, 75, 78, 82

(i) Stem-and-Leaf Display (Stem Plot), continued
The stem plot of the pulse-rate data in Example 4:
    4 | 5       (outlier)
    5 |         (gap)
    6 | 0125
    7 | 3558
    8 | 2
Stem: tens digit. Leaf: ones digit.
A stem plot shows
    the actual values,
    the extent of spread,
    the number and location of peaks,
    the presence of any outlier.
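The construction of a stem plot can be sketched in a few lines of Python. This is only an illustration for non-negative integer data such as the pulse rates of Example 4; the function name `stem_and_leaf` and the `stem | leaves` layout are choices made here, not part of the lecture.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return stem plot rows for non-negative integers (stem = tens, leaf = ones digit)."""
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)
    rows = []
    # Walk over every stem between the smallest and largest one,
    # so that empty stems (gaps in the data) stay visible.
    for stem in range(min(leaves), max(leaves) + 1):
        digits = "".join(str(d) for d in leaves.get(stem, []))
        rows.append(f"{stem} | {digits}")
    return rows

pulse = [45, 61, 60, 62, 65, 73, 75, 75, 78, 82]
for row in stem_and_leaf(pulse):
    print(row)
```

The empty row for stem 5 is what makes the value 45 stand out as a possible outlier.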

(ii) Dot Plot
The dot plot is used when the data set is small or has few distinct values. Each observation is represented by a dot on a horizontal scale. It is similar to a stem plot, except that dots are used instead of digits.

Definition 1
A (quantitative) variable X is discrete if it takes finitely or countably many values. It is continuous if it takes all values in an interval or on the whole real line.
Example 5. Let X = number of trials needed to get the first success. Then $X \in \{1, 2, \ldots\}$ and hence X is discrete. Suppose X = height of a student (in cm). Then $X \in [150, 190]$ and X is a continuous variable.
Let X be a discrete variable taking values in $S = \{1, 2, \ldots, I\}$, and let $X_1, \ldots, X_n$ be n data values on X. Then
    frequency of $i \in S$ = number of values in the data $\{X_1, X_2, \ldots, X_n\}$ equal to i.
For $1 \le i \le I$,
    relative frequency of i = (frequency of i)/n.

Example 6. Let X = number of children in a family, so that $X \in \{0, 1, 2, 3\}$. Suppose the data on 20 families in East Lansing are: 2, 0, 1, 2, 2, 3, 1, 2, 3, 2, 3, 1, 2, 1, 2, 1, 2, 3, 1, 2. Then the frequency table is
    X       Frequency    Relative Frequency
    0       1            1/20 = 0.05
    1       6            6/20 = 0.30
    2       9            9/20 = 0.45
    3       4            4/20 = 0.20
    Total   20           1.00
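As a quick check of the frequency and relative-frequency definitions, the table for Example 6 can be recomputed from the 20 data values. This is a minimal sketch; the helper name `frequency_table` is introduced here for illustration.

```python
from collections import Counter

children = [2, 0, 1, 2, 2, 3, 1, 2, 3, 2, 3, 1, 2, 1, 2, 1, 2, 3, 1, 2]

def frequency_table(data):
    """Map each distinct value to (frequency, relative frequency)."""
    n = len(data)
    counts = Counter(data)
    return {value: (counts[value], counts[value] / n) for value in sorted(counts)}

table = frequency_table(children)
for value, (freq, rel) in table.items():
    print(f"{value}  {freq:2d}  {rel:.2f}")
```

The frequencies add up to n = 20 and the relative frequencies add up to 1, which is a useful sanity check on any frequency table.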

(iii) Histogram for Discrete Data
Take the x-values on the horizontal scale and the frequency (or relative frequency) on the vertical scale. Above each value, draw a rectangle whose height equals the frequency (or relative frequency) of that value.
The histogram for Example 6 is:
[Figure: Minitab histogram of C1 (the Example 6 data), with frequency on the vertical axis]
The relative frequency histogram may be drawn similarly.

Histogram for Continuous Data (measurements)
Case 1. (Equal Width Case)
(i) The data take real values, not necessarily integers.
(ii) Subdivide the range of the data into k subintervals (classes) of equal length such that each observation lies in exactly one class.
(iii) Construct rectangles whose height equals the frequency (for a frequency histogram) or the relative frequency (for a relative frequency histogram).
Note:
(i) There are no hard-and-fast rules concerning k; usually, an integer between 5 and 20 will do.
(ii) For a large data set of size n, more classes should be used. A rule of thumb is $k = \sqrt{n}$.

Note: If all the data fall in one or two classes, or if most subintervals (of equal length) have low frequencies, it is better to use fewer classes of different lengths.

Histogram for classes of different lengths:
(i) Decide the class intervals.
(ii) Construct each rectangle using the formula
    rectangle height = relative frequency / class width
    (so that the area of the rectangle = relative frequency).
(iii) The resulting rectangle heights are called densities.
(iv) The formula works for equal widths also.
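The density rule can be illustrated with a short Python sketch. The data values and class boundaries below are hypothetical, chosen only to show that, with height = relative frequency / class width, the rectangle areas add up to 1; the function name `density_histogram` is also an invention for this example.

```python
def density_histogram(data, edges):
    """Return (relative frequency, density) for each class [edges[i], edges[i+1])."""
    n = len(data)
    result = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        count = sum(1 for x in data if lo <= x < hi)
        rel = count / n
        result.append((rel, rel / (hi - lo)))  # density = relative frequency / width
    return result

# Hypothetical data and unequal class boundaries, for illustration only.
data = [1, 2, 2, 3, 4, 6, 7, 9, 12, 18]
edges = [0, 2, 4, 8, 20]

hist = density_histogram(data, edges)
widths = [hi - lo for lo, hi in zip(edges[:-1], edges[1:])]
total_area = sum(density * width for (_, density), width in zip(hist, widths))
print(hist)
print(total_area)
```

Because each area equals a relative frequency, wide classes no longer look artificially tall, which is the reason for plotting densities when class widths differ.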

Example 7. The following data represent the frequency distribution of fracture strength (MPa) observations for ceramic bars fired in a particular kiln. (A class written as "81 to < 83" contains values from 81 up to, but not including, 83.)
Class:
Freq:
(a) Construct a histogram based on relative frequencies, and comment on any interesting features.
(b) What proportion of the strength observations are at least 85? Less than 95?
(c) What proportion of the observations are less than 90?

Solution: (a) The histogram appears below. A representative value for this data set would be x = 90. The histogram is reasonably symmetric, unimodal, and somewhat bell-shaped. The variation in the data is not small, since the spread of the data, 99 − 81 = 18, is about 20% of the typical value of 90.
[Figure: relative frequency histogram of fracture strength (MPa)]

(b) The proportion of the observations that are at least 85 is 1 − (6 + 7)/169 ≈ 0.923. The proportion less than 95 is found in the same way, by subtracting from 1 the relative frequency of the classes at or above 95.
(c) Note that x = 90 is the midpoint of the class 89 to < 91, which contains 43 observations (a relative frequency of 43/169 = 0.2544). Therefore about half of this relative frequency, 0.1272, should be added to the relative frequencies of the classes to the left of x = 90 to approximate the proportion of the observations that are less than 90.

Histogram Shapes
A histogram is called
(a) Unimodal if it has a single peak.
Note: The histogram seen earlier is unimodal.
[Figure: frequency histogram of flow rate]

(b) Bimodal if it has 2 different peaks.

(c) Multimodal if it has more than 2 peaks.
Symmetric if it is unimodal and the right half is the mirror image of the left half.
[Figure: frequency histogram of IDT value]

(d) Positively skewed if the right tail is stretched out compared with the left tail.
(e) Negatively skewed if the left tail is stretched out compared with the right tail.

Histogram for Qualitative/Categorical Data
(i) A histogram for categorical data is called a bar chart. There will be natural ordering of classes. (Titanic Data)
(ii) A Pareto diagram is a bar chart resulting from a quality control study, where the different categories correspond to different defects (non-conformities).
Example 8. Histogram for Titanic Data: The following table classifies 2201 people according to the class in which they traveled:
Class: First (F), Second (S), Third (T), Crew (C)
Count:

[Figure: bar chart of the Titanic data, with the classes F, S, T, C on the horizontal axis]

Some Additional Examples
Example 1. Construct the stem-and-leaf display for the data on the flexural strength of a certain concrete (in MPa): 5.9, 7.2, 7.3, 6.3, 8.1, 6.8, 7.0, 7.6, 6.8, 6.5, 7.0, 6.3, 7.9, 9.0, 8.2, 8.7, 7.8, 9.7, 7.4, 7.7, 9.7, 7.8, 7.7, 11.6, 11.3, 11.8, 10.7
(a) Is it spread about a representative value?
(b) Is it symmetric?
(c) Are there any outliers?
(d) What proportion of the observations exceed 10 MPa?

Solution: (a) Minitab generated the following stem-and-leaf display of this data:
    Stem-and-leaf of C1   N = 27
    Leaf Unit = 0.10
       1    5  9
       6    6  33588
     (11)   7  00234677889
      10    8  127
       7    9  077
       4   10  7
       3   11  368
The leftmost column shows the cumulative number of observations from each stem to the nearest tail of the data. For example, the 6 in the second row indicates that there are a total of 6 data points in stems 6 and 5. Minitab puts parentheses around the 11 in row three to indicate that the median of the data lies in this stem. A value close to 8 is representative of this data.

(b) The display is not perfectly symmetric around a middle/representative value. There is some positive skewness in this data.
(c) Outliers are data points that appear to be very different from the pack. Looking at the stem-and-leaf display in Part (a), there appear to be no outliers in this data (a more precise definition of an outlier will be given later).
(d) From the stem-and-leaf display in Part (a), there are 3 leaves associated with the stem of 11, which represent the 3 data values that are greater than or equal to 11. The value 10.7, which is represented by the stem of 10 and the leaf of 7, also exceeds 10. Therefore, the proportion of data values that exceed 10 is 4/27 = 0.148, or about 15%.

Example 2. The following data represent the IDTs (inter-division times) of a number of cells in both exposed (treatment) and unexposed (control) conditions: 28.1, 31.2, 13.7, 46.0, , 34.8, 62.3, 28.0, 17.9, 19.5, 21.1, 31.9, 28.9, 60.1, 23.7, 18.6, 21.4, 26.6, 26.2, 32.0, 43.5, 17.4, 38.8, 30.6, 55.6, 25.5, 52.1, 21.0, 22.3, 15.5, 36.3, 19.1, 38.4, 72.8, 48.9, 21.4, 20.7, 57.3, 40.9
Construct a histogram of this data based on classes with boundaries 10, 20, 30, ... Then calculate $\log_{10}(x)$ for each x and construct the histogram of the transformed data using the class boundaries 1.1, 1.2, 1.3, etc. What is the effect of the transformation?

Solution. A histogram of the raw data appears below:
[Figure: histogram of the raw IDT data]

After transforming the data by taking logarithms (base 10), a histogram of the $\log_{10}$ data is shown above. The shape of this histogram is much less skewed than the histogram of the original data.

Numerical Summary of Measures
We now discuss some of the important characteristics of the data and of the population.
Measures of Location
We discuss these first for data and then for the population distribution.
The Mean
1. The Sample Mean: $\bar{x}$
The sample mean of n observations $x_1, \ldots, x_n$ is
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i = \frac{x_1 + \cdots + x_n}{n},$$
where n denotes the number of observations.
Example 1a. Suppose the scores of 8 students in a test are: 35, 20, 45, 50, 42, 38, 39, 11. Then the sample mean is $\bar{x} = 280/8 = 35$.

Example 1b. Suppose the last score is recorded, by mistake, as 71. Then $\bar{x} = (35 + 20 + \cdots + 39 + 71)/8 = 340/8 = 42.5$. This is about a 21% increase in the sample mean, which is a significant change.
Rule: Report the mean with one more decimal place than is present in the data. In the above example, the data are integers (no decimal places), and so we write $\bar{x} = 42.5$ (one decimal place).
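The effect of the single mis-recorded score on the sample mean can be checked directly. The sketch below uses the scores of Example 1a; the names `sample_mean`, `scores`, and `scores_with_error` are introduced here for illustration.

```python
def sample_mean(xs):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(xs) / len(xs)

scores = [35, 20, 45, 50, 42, 38, 39, 11]   # Example 1a
scores_with_error = scores[:-1] + [71]      # Example 1b: last score recorded as 71

mean_ok = sample_mean(scores)
mean_bad = sample_mean(scores_with_error)
relative_increase = (mean_bad - mean_ok) / mean_ok

print(mean_ok, mean_bad, round(100 * relative_increase, 1))
```

One changed observation out of eight moves the mean by 7.5 points, which shows how sensitive the mean is to an extreme value.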

2. The Median: $\tilde{x}$
This measure is less affected by outliers or extreme values. It divides the sample distribution into two equal parts.
Definition 2 (Sample median)
First order the observations from the smallest to the largest as $x_{(1)} \le x_{(2)} \le \cdots \le x_{(n)}$. Then the median is defined as
$$\tilde{x} = \begin{cases} x_{\left(\frac{n+1}{2}\right)}, & \text{if } n \text{ is odd}, \\ \left( x_{\left(\frac{n}{2}\right)} + x_{\left(\frac{n}{2}+1\right)} \right)/2, & \text{if } n \text{ is even}, \end{cases}$$
that is, the middle value if n is odd, and the average of the middle two values if n is even.

Example 2. The ordered values in Example 1a are: 11, 20, 35, 38, 39, 42, 45, 50. Here n = 8 is even and n/2 = 4, so take the middle values, the 4th and 5th. Hence the median is $\tilde{x}$ = average of the middle two values = (38 + 39)/2 = 38.5.
Example 3. Find the median for Example 1b (the one-outlier case). Here the ordered values are 20, 35, 38, 39, 42, 45, 50, 71. Again, $\tilde{x}$ = (39 + 42)/2 = 81/2 = 40.5.
Remark 1.
(i) The median is less affected than the mean.
(ii) This is an extreme case, as we replaced the smallest observation by one that is greater than the largest.
(iii) Decreasing the three smallest values or increasing the three largest values in Example 3 does not affect the median.
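Definition 2 translates almost line for line into code. The following is a minimal sketch (the function name `sample_median` is chosen here), applied to the data of Examples 2 and 3.

```python
def sample_median(xs):
    """Middle value if n is odd, average of the two middle values if n is even."""
    ordered = sorted(xs)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # x_((n+1)/2) in 1-based indexing
    return (ordered[mid - 1] + ordered[mid]) / 2  # average of x_(n/2) and x_(n/2+1)

scores = [35, 20, 45, 50, 42, 38, 39, 11]   # Example 1a
scores_with_error = scores[:-1] + [71]      # Example 1b

print(sample_median(scores))             # Example 2
print(sample_median(scores_with_error))  # Example 3
```

The median moves by only 2 points between the two data sets, whereas the mean moves by 7.5.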

3. The Trimmed Mean
(i) First order the observations from the smallest to the largest.
(ii) Let $r \in (0, 0.5)$. The 100r% trimmed data are obtained by discarding the largest 100r% and the smallest 100r% of the data.
Definition: The 100r% trimmed mean is the sample mean of the 100r% trimmed data.

Example 4. Obtain the 12% trimmed mean of the data in Example 1a. The ordered data are 11, 20, 35, 38, 39, 42, 45, 50.
Here 100r = 12, so r = 12/100 = 0.12. Also n = 8, and 12% of 8 = (12/100)(8) = 24/25 ≈ 1. Discarding the smallest and the largest observation, we get the 12.5% trimmed mean (since 1/8 = 12.5%):
(20 + 35 + 38 + 39 + 42 + 45)/6 = 219/6 = 36.5.
Remark 2. The trimmed mean is less sensitive to extreme values than the mean, but more sensitive than the median.
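The trimming procedure can be sketched as follows. When 100r% of n is not a whole number, the lecture rounds it (0.96 becomes 1 above); the sketch assumes plain rounding to the nearest integer, which is one possible convention, and the function name `trimmed_mean` is a choice made here.

```python
def trimmed_mean(xs, r):
    """Mean of the data after dropping about 100r% of the observations from each end."""
    if not 0 < r < 0.5:
        raise ValueError("r must lie in (0, 0.5)")
    ordered = sorted(xs)
    n = len(ordered)
    k = round(r * n)  # assumption: round r*n to the nearest whole observation
    kept = ordered[k:n - k]
    if not kept:
        raise ValueError("trimming removed every observation")
    return sum(kept) / len(kept)

scores = [35, 20, 45, 50, 42, 38, 39, 11]  # Example 1a
print(trimmed_mean(scores, 0.12))
```

With r = 0.12 and n = 8, one observation is dropped from each end, so the call reproduces the 12.5% trimmed mean of Example 4.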

Measure of Variability
Let $x_1, \ldots, x_n$ be a sample of size n on a variable x.
Definition 3
(i) The Range: Arrange $x_1, \ldots, x_n$ as $x_{(1)} \le x_{(2)} \le \cdots \le x_{(n)}$, where $x_{(1)}$ is the smallest value and $x_{(n)}$ is the largest value. Then the range is $R = x_{(n)} - x_{(1)}$. This is the simplest measure of variability. Drawback: it depends only on $x_{(1)}$ and $x_{(n)}$.
(ii) The Sample Variance: The sample variance of $x_1, \ldots, x_n$ is defined by
$$s_x^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{S_{xx}}{n-1},$$
and the sample standard deviation is $s = +\sqrt{s^2}$, the positive square root.

Facts:
(i) The unit of s is the same as that of the $x_i$'s.
(ii) $\sum_{i=1}^{n} (x_i - \bar{x}) = 0$ for any $x_1, \ldots, x_n$.
That is, if the deviations $(x_1 - \bar{x}), \ldots, (x_{n-1} - \bar{x})$ are known, then $(x_n - \bar{x})$ can be found. Thus, the n deviations actually contain only (n − 1) independent pieces of information (called degrees of freedom), and this suffices to find $s^2$ or s. Thus, $s^2$ and s are based on (n − 1) degrees of freedom.

A Useful Formula:
$$S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i} x_i^2 - \Big(\sum_{i} x_i\Big)^2 / n = \sum_{i} x_i^2 - n\bar{x}^2.$$
Hence,
$$s_x^2 = \frac{1}{n-1} \left[ \sum_{i} x_i^2 - \frac{1}{n} \Big(\sum_{i} x_i\Big)^2 \right].$$
A Proposition: Let $s_x^2$ be the variance of the data $x_1, \ldots, x_n$ and let $c \neq 0$.
(i) If $y_1 = x_1 + c, \ldots, y_n = x_n + c$, then $s_y^2 = s_x^2$.
(ii) If $y_1 = c x_1, \ldots, y_n = c x_n$, then $s_y^2 = c^2 s_x^2$ and $s_y = |c| \, s_x$.
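The shortcut formula and the proposition can be verified numerically. The sketch below reuses the test scores of Example 1a and an arbitrary constant c = −2 chosen only for illustration; the function names are introduced here.

```python
import math

def s_xx_definition(xs):
    """S_xx as the sum of squared deviations from the mean."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs)

def s_xx_shortcut(xs):
    """S_xx as sum(x_i^2) - (sum(x_i))^2 / n."""
    return sum(x * x for x in xs) - sum(xs) ** 2 / len(xs)

def sample_variance(xs):
    """Sample variance with divisor n - 1."""
    return s_xx_definition(xs) / (len(xs) - 1)

scores = [35, 20, 45, 50, 42, 38, 39, 11]  # Example 1a
c = -2                                     # arbitrary non-zero constant (illustration)

shifted = [x + c for x in scores]
scaled = [c * x for x in scores]

print(s_xx_definition(scores), s_xx_shortcut(scores))
print(sample_variance(scores), sample_variance(shifted), sample_variance(scaled))
print(math.sqrt(sample_variance(scaled)), abs(c) * math.sqrt(sample_variance(scores)))
```

Both forms of $S_{xx}$ give 1200 for these scores. In floating-point arithmetic the shortcut form can lose accuracy when the mean is large relative to the spread, so the definition is often the safer choice in code.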

Example 5. The following data represent the values of Young's modulus for certain cast plates: 116.4, 115.9, 114.6, 115.2, 115.8
(a) Find $\bar{x}$ and the deviations $(x_i - \bar{x})$.
(b) Using the deviations $(x_i - \bar{x})$, compute $s^2$.
(c) Calculate $s^2$ using the computational formula for $S_{xx}$.

Solution:
(a) $\bar{x} = \frac{1}{n} \sum_i x_i = 577.9/5 = 115.58$. Deviations from the mean: 116.4 − 115.58 = 0.82, 115.9 − 115.58 = 0.32, 114.6 − 115.58 = −0.98, 115.2 − 115.58 = −0.38, and 115.8 − 115.58 = 0.22.
(b) $s^2 = [(0.82)^2 + (0.32)^2 + (-0.98)^2 + (-0.38)^2 + (0.22)^2]/(5 - 1) = 1.928/4 = 0.482$. Hence, $s = \sqrt{0.482} = 0.694$.
(c) $\sum_i x_i^2 = 66{,}795.61$, so
$$s^2 = \frac{1}{n-1} \left[ \sum_i x_i^2 - \frac{1}{n} \Big(\sum_i x_i\Big)^2 \right] = [66{,}795.61 - (577.9)^2/5]/4 = 1.928/4 = 0.482.$$

Box Plot
The quartiles and percentiles yield more information about the location of a data set. Similarly, the median and the IQR (interquartile range) are used to construct the box plot, a visual summary of the data.
Quartiles and IQR
Let $x_1, \ldots, x_n$ denote a data set of size n. First order the observations.
(i) Compute the median $\tilde{x}$.
(ii) If n is even, the first n/2 observations form the lower half and the remaining n/2 observations form the upper half (the median separates the data into two parts).
(iii) If n is odd, the median $\tilde{x}$ is the ((n+1)/2)th value of the ordered data; include it in both halves.

The Quartiles:
(i) The lower quartile $Q_1$ = median of the lower half of the data.
(ii) The upper quartile $Q_3$ = median of the upper half of the data.
(iii) The interquartile range IQR = $Q_3 - Q_1$.
Note: The IQR is also called the fourth spread, $f_s = Q_3 - Q_1$ = upper fourth − lower fourth, and it is resistant to outliers.

Example 1. Consider the following data: 5.2, 3.9, 4.8, 5.1, 3.7, 4.5, 4.2
Here n = 7. Ordered data: 3.7, 3.9, 4.2, 4.5, 4.8, 5.1, 5.2. The median is 4.5. Since n is odd, include the median in both the lower half and the upper half of the data.
Lower half: 3.7, 3.9, 4.2, 4.5
Upper half: 4.5, 4.8, 5.1, 5.2
$Q_1$ = (3.9 + 4.2)/2 = 4.05
$Q_3$ = (4.8 + 5.1)/2 = 4.95
Hence, IQR = 4.95 − 4.05 = 0.9.
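The quartile procedure used in this lecture (split at the median, and include the median in both halves when n is odd) can be sketched as below. Note that statistical software often uses other quartile conventions, so library routines may return slightly different values; the function names here are chosen for illustration.

```python
def median_of_sorted(ordered):
    """Median of an already sorted, non-empty list."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def quartiles(xs):
    """Return (Q1, median, Q3) using the lower-half / upper-half rule."""
    ordered = sorted(xs)
    n = len(ordered)
    half = (n + 1) // 2          # half size; includes the median when n is odd
    lower = ordered[:half]
    upper = ordered[n - half:]
    return median_of_sorted(lower), median_of_sorted(ordered), median_of_sorted(upper)

data = [5.2, 3.9, 4.8, 5.1, 3.7, 4.5, 4.2]
q1, med, q3 = quartiles(data)
iqr = q3 - q1
print(q1, med, q3, iqr)
```

For the seven observations above this gives the same quartiles as the hand computation, up to floating-point rounding.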

IQR Criteria for an Outlier: An observation that lies above $Q_3 + 1.5\,\text{IQR}$ or below $Q_1 - 1.5\,\text{IQR}$ may be suspected to be an outlier. An outlier is called extreme if it lies outside $(Q_1 - 3\,\text{IQR},\; Q_3 + 3\,\text{IQR})$; otherwise, it is called a mild outlier.
Boxplot: A box plot is a visual display of the five-number summary $(x_{(1)}, Q_1, \tilde{x}, Q_3, x_{(n)})$.
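The outlier criteria amount to two pairs of fences around the quartiles. The sketch below classifies a single observation given $Q_1$ and $Q_3$; the quartile values used in the demonstration are hypothetical, and treating a point exactly on a fence as not beyond it is an assumption made here.

```python
def classify(x, q1, q3):
    """Label x as 'extreme outlier', 'mild outlier', or 'not an outlier'."""
    iqr = q3 - q1
    # Check the wider (3 * IQR) fences first, since an extreme outlier
    # also lies beyond the 1.5 * IQR fences.
    if x < q1 - 3 * iqr or x > q3 + 3 * iqr:
        return "extreme outlier"
    if x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr:
        return "mild outlier"
    return "not an outlier"

# Hypothetical quartiles for illustration: Q1 = 10, Q3 = 20, so IQR = 10.
# Mild fences are at -5 and 35; extreme fences are at -20 and 50.
for value in (30, 40, 60, -25):
    print(value, classify(value, 10, 20))
```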

Procedure:
(i) The middle box marks $Q_1$, the median, and $Q_3$.
(ii) The whiskers extend above $Q_3$ or below $Q_1$ up to $Q_3 + 3\,\text{IQR}$ or $Q_1 - 3\,\text{IQR}$, respectively.
(iii) The outliers are denoted by special symbols.

Remark 3. The box plot has the following properties:
(i) It is more compact than a stem plot or histogram.
(ii) The central box contains roughly 50% of the data.
(iii) It does not reveal the presence of clusters.
(iv) It is very useful for comparing (similarities and differences between) data sets on the same scale.
(v) Height of the box = IQR.
(vi) If the median is roughly in the middle of the box, then the distribution is symmetric; otherwise it is skewed.
(vii) Whiskers show skewness if they are not of the same length.
(viii) It is useful for detecting outliers.
The main use of box plots is to compare groups.

Example 3. The following data give the shear strength (MPa) of a joint bonded in a particular manner: 22.2, 40.4, 16.4, 73.7, 36.6, 109.9, 30.0, 4.4, 33.1, 66.7, 81.5
(a) What are the values of the quartiles, and the value of the IQR?
(b) Construct a box plot based on the five-number summary, and comment on its features.
(c) How large or small does an observation have to be to qualify as an outlier? As an extreme outlier?
(d) By how much could the largest observation be decreased without affecting the IQR?

Solution:
(a) The lower half of the data set is 4.4, 16.4, 22.2, 30.0, 33.1, 36.6, and therefore the lower quartile is (22.2 + 30.0)/2 = 26.1. The upper half of the data set is 36.6, 40.4, 66.7, 73.7, 81.5, 109.9, and therefore the upper quartile is (66.7 + 73.7)/2 = 70.2. So, the IQR = 70.2 − 26.1 = 44.1.
(b) A boxplot (created in Minitab) of this data appears below. There is a slight positive skew to the data. The variation seems quite large. There are no outliers.
[Figure: boxplot of shear strength]

(c) An observation would need to be further than 1.5(44.1) = 66.15 units below the lower quartile or above the upper quartile to be classified as a mild outlier. Notice that, in this case, an outlier on the lower side would not be possible, since the shear strength cannot be negative. An extreme outlier would fall 3(44.1) = 132.3 or more units below the lower quartile or above the upper quartile. The minimum and maximum observations in the data are 4.4 and 109.9, respectively, so there are no outliers of either type in this data set.
(d) Not until the value x = 109.9 is lowered below 73.7 would there be any change in the value of the upper quartile. That is, the value x = 109.9 could not be decreased by more than 109.9 − 73.7 = 36.2 units.

Homework
Sect 1.2: 11, 16, 19, 26, 27, 29
Sect 1.3: 35, 36, 41, 43
Sect 1.4: 45, 51, 54, 57, 79.
END OF LECTURE 1

STAT 200 Chapter 1 Looking at Data - Distributions

STAT 200 Chapter 1 Looking at Data - Distributions STAT 200 Chapter 1 Looking at Data - Distributions What is Statistics? Statistics is a science that involves the design of studies, data collection, summarizing and analyzing the data, interpreting the

More information

Chapter 1 Descriptive Statistics

Chapter 1 Descriptive Statistics MICHIGAN STATE UNIVERSITY STT 351 SECTION 2 FALL 2008 LECTURE NOTES Chapter 1 Descriptive Statistics Nao Mimoto Contents 1 Overview 2 2 Pictorial Methods in Descriptive Statistics 3 2.1 Different Kinds

More information

1-1. Chapter 1. Sampling and Descriptive Statistics by The McGraw-Hill Companies, Inc. All rights reserved.

1-1. Chapter 1. Sampling and Descriptive Statistics by The McGraw-Hill Companies, Inc. All rights reserved. 1-1 Chapter 1 Sampling and Descriptive Statistics 1-2 Why Statistics? Deal with uncertainty in repeated scientific measurements Draw conclusions from data Design valid experiments and draw reliable conclusions

More information

What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty.

What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty. What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty. Statistics is a field of study concerned with the data collection,

More information

Chapter 3. Data Description

Chapter 3. Data Description Chapter 3. Data Description Graphical Methods Pie chart It is used to display the percentage of the total number of measurements falling into each of the categories of the variable by partition a circle.

More information

Chapter 2: Tools for Exploring Univariate Data

Chapter 2: Tools for Exploring Univariate Data Stats 11 (Fall 2004) Lecture Note Introduction to Statistical Methods for Business and Economics Instructor: Hongquan Xu Chapter 2: Tools for Exploring Univariate Data Section 2.1: Introduction What is

More information

CHAPTER 1. Introduction

CHAPTER 1. Introduction CHAPTER 1 Introduction Engineers and scientists are constantly exposed to collections of facts, or data. The discipline of statistics provides methods for organizing and summarizing data, and for drawing

More information

CIVL 7012/8012. Collection and Analysis of Information

CIVL 7012/8012. Collection and Analysis of Information CIVL 7012/8012 Collection and Analysis of Information Uncertainty in Engineering Statistics deals with the collection and analysis of data to solve real-world problems. Uncertainty is inherent in all real

More information

Introduction to Probability and Statistics Slides 1 Chapter 1

Introduction to Probability and Statistics Slides 1 Chapter 1 1 Introduction to Probability and Statistics Slides 1 Chapter 1 Prof. Ammar M. Sarhan, asarhan@mathstat.dal.ca Department of Mathematics and Statistics, Dalhousie University Fall Semester 2010 Course outline

More information

STP 420 INTRODUCTION TO APPLIED STATISTICS NOTES

STP 420 INTRODUCTION TO APPLIED STATISTICS NOTES INTRODUCTION TO APPLIED STATISTICS NOTES PART - DATA CHAPTER LOOKING AT DATA - DISTRIBUTIONS Individuals objects described by a set of data (people, animals, things) - all the data for one individual make

More information

Describing distributions with numbers

Describing distributions with numbers Describing distributions with numbers A large number or numerical methods are available for describing quantitative data sets. Most of these methods measure one of two data characteristics: The central

More information

Further Mathematics 2018 CORE: Data analysis Chapter 2 Summarising numerical data
