Meelis Kull Autumn Meelis Kull - Autumn MTAT Data Mining - Lecture 03

Size: px
Start display at page:

Download "Meelis Kull Autumn Meelis Kull - Autumn MTAT Data Mining - Lecture 03"

Transcription

1 Meelis Kull Autumn

2 Demo: Data science mini-project CRISP-DM: cross-industrial standard process for data mining Data understanding: Types of data Data understanding: First look at attributes Types of attributes First look at a nominal attribute First look at a ordinal attribute First look at a numeric attribute 2

3 Demo: Data science mini-project CRISP-DM: cross-industrial standard process for data mining Data understanding: Types of data Data understanding: First look at attributes Types of attributes First look at a nominal attribute First look at a ordinal attribute First look at a numeric attribute 3

4 4

5 All the same applies as for nominal attributes Need to make sure that the order is retained in histograms Additional ways to look at the attribute: Calculate the min, max, median of the attribute 5

6 All the same applies as for nominal attributes Need to make sure that the order is retained in histograms Additional ways to look at the attribute: Calculate the min, max, median of the attribute More generally, calculate quantiles 6

7 There are 3 quartiles: lower quartile, median, upper quartile They divide a sorted data set into 4 equal parts Percentiles divide into 100 equal parts Lower quartile is the 25 th percentile Median is the 50 th percentile Deciles divide into 10 equal parts Quantiles generalise to any location over sorted data: Lower quartile = 25 th percentile = quantile at th decile = quantile at 0.4 7

8 A. 1 st and 2 nd decile B. 2 nd and 3 rd decile C. 3 rd and 4 th decile D. 4 th and 5 th decile E. 5 th and 6 th decile F. 6 th and 7 th decile G. 7 th and 8 th decile H. 8 th and 9 th decile 8

9 A. Nominal attributes B. Ordinal attributes C. Both D. Neither E. Not sure 9

10 Same aspects as in nominal attributes Additionally: Is the order specified in the documentation? (meta-data) If all possible values specified in the meta-data: Check if all values present in the data If not, make sure that the empty bars in histograms are in the right location corresponding to the order 10

11 Demo: Data science mini-project CRISP-DM: cross-industrial standard process for data mining Data understanding: Types of data Data understanding: First look at attributes Types of attributes First look at a nominal attribute First look at a ordinal attribute First look at a numeric attribute 11

12 12

13 Is it really numeric? E.g. if contains only values 0.0 and 1.0 then perhaps nominal is more appropriate? Is it discrete (only integers)? E.g. if contains integers 0 to 100 then can have a first look as if it were an ordinal attribute Calculate quantiles as for ordinal attributes Calculate the mean (arithmetic average): 13

14 Is it really numeric? Is it discrete (only integers)? Calculate quantiles as for ordinal attributes Calculate the mean (arithmetic average) Plot a histogram (with default binning) 14

15 Check if some value occurs many times With real numbers often no number occurs more than once If a value is frequent then: Possibly can denote missingness (e.g. value 0), should be treated as N/A (not available, missing) Perhaps can denote an extremal value (more extremal values represented as this value) Check for rounding E.g., if all numbers rounded to 2 nd digit and one has more digits then it might be a typing error 15

16 A. Nominal attributes B. Ordinal attributes C. Numeric attributes D. All of the above E. None of the above F. Not sure 16

17 Demo: Data science mini-project CRISP-DM: cross-industrial standard process for data mining Data understanding: Types of data Data understanding: First look at attributes Types of attributes First look at a nominal attribute First look at a ordinal attribute First look at a numeric attribute 17

18 Data understanding: distribution of attributes Types of histograms How to describe probability distributions? Some standard probability distributions More ways to visualise distributions Visualising relations of attributes Are the attributes related? 18

19 Data understanding: distribution of attributes Types of histograms How to describe probability distributions? Some standard probability distributions More ways to visualise distributions Visualising relations of attributes Are the attributes related? 19

20 20

21 We have now looked at each attribute A key question we can now answer: How do the items in this dataset look like? Now we have at least 3 answers to this: 21

22 How do the items in this dataset look like? Let us just pick up one as an example: Age = 22 Workclass = Private Education = 11 th Occupation = Other-service Capital.gain =

23 How do the items in this dataset look like? Let us just provide the ranges of attributes Age: {17,18,19,,90} Workclass: {Federal-gov, Local-gov, Private, } Education: {1 st -4 th,5 th -6 th,, Doctorate) Occupation: {Adm-clerical, Exec-managerial, } Capital.gain: [0,99999]... 23

24 How do the items in this dataset look like? Let us just provide the histograms Age: <histogram> Workclass: <histogram> Education: <histogram> Occupation: <histogram> Capital.gain: <histogram>... 24

25 Data understanding: distribution of attributes Types of histograms How to describe probability distributions? Some standard probability distributions More ways to visualise distributions Visualising relations of attributes Are the attributes related? 25

26 26

27 Histograms on discrete data Nominal Ordinal Numeric with few different values E.g. small number of different integers Histograms on continuous data Numeric with many different values 27

28 Frequency histogram Frequency = count of items with each value ggplot(data) + geom_histogram(aes(x=age),stat="count") 28

29 Frequency histogram Frequency = count of items with each value ggplot(data) + geom_histogram(aes(x=workclass),stat="count") 29

30 Relative frequency histogram Relative frequency = proportion (0..1) or percentage (0..100%) of items with each value Heights of bars sum up to 1 ggplot(data) + geom_bar(aes(x=workclass,..count../sum(..count..)),stat="count") 30

31 Depends on the goal Frequency histogram Gives actual counts in the data Relative frequency histogram Gives proportions in the data Interpretable as probability distribution of a randomly chosen item 31

32 Continuous attribute: Usually value is different in each item Need to introduce bins (a.k.a. intervals, ranges) Histogram not informative without bins: ggplot(data) + geom_histogram(aes(x=factor(salaries)),stat="count") 32

33 Frequency histogram of binned data Frequency = count of items in each bin ggplot(data) + geom_histogram(aes(x=salaries),stat="bin",boundary=0,binwidth=10000) 33

34 Relative frequency histogram of binned data Relative frequency = proportion of items in bins Heights of bars add up to 1 ggplot(data) + geom_histogram(aes(x=salaries,..count../sum(..count..)),stat="bin",boundary=0,binwidth=10000) 34

35 Density histogram of binned data Density = Y-axis such that areas of bars in the histogram add up to 1 Density scale is invariant to the sizes of bins ggplot(data) + geom_histogram(aes(x=salaries,..density..),stat="bin",boundary=0,binwidth=10000) 35

36 Density histogram of binned data Density = Y-axis such that areas of bars in the histogram add up to 1 Density scale is invariant to the sizes of bins + geom_histogram(aes(x=salaries,..density..),stat="bin",boundary=0,binwidth=1000,alpha=0,colour="red") 36

37 Density histogram of binned data Density = Y-axis such that areas of bars in the histogram add up to 1 Density scale is invariant to the sizes of bins + geom_histogram(aes(x=salaries),stat="density",colour="blue") 37

38 Represented by the probability density function (pdf) Area under the curve is equal to 1 Areas represent probabilities Area = P(a<X<b) = probability that X is between a and b 38

39 Data understanding: distribution of attributes Types of histograms How to describe probability distributions? Some standard probability distributions More ways to visualise distributions Visualising relations of attributes Are the attributes related? 39

40 40

41 Statistic measure calculated from all values Mode of the distribution: The most probable value (or values) The most frequent value if discrete Value with highest density if continuous Median and other quantiles We have defined earlier Mean of the distribution: Average value of the attribute Arithmetic average if discrete Expected value if continuous Centre of mass of the distribution 41

42 Consider attribute with values: 1,2,2,2,3,3,4,7 Mode: 2 because occurs three times Median 2.5 because 2 & 3 are in the middle, (2+3)/2=2.5 Mean: 3.0 because ( )/8 = 24/8 =

43 A. 0 B. 0.5 C. 1 D. 1.5 E. None of the above 43

44 A. 0 B. 0.5 C. 1 D. 1.5 E. None of the above 44

45 A. 0 B. 0.5 C. 1 D. 1.5 E. None of the above 45

46 Variance of a distribution is: average squared deviation from the mean Standard deviation is square root of variance Quadratic average deviation from the mean Example: Values: 1,2,2,2,3,3,4,7 Mean: 3.0 Variance: ((1-3)^2+(2-3)^2+ +(7-3)^2) / 8 = 3.0 Standard deviation: sqrt(3.0)=

47 Symmetric Skewed / right-skewed / left-skewed Heavy-tailed Bimodal Multi-modal Capped / right-capped / left-capped 47

48 Simple definitions, not fully correct: Unimodal 1 mode Bimodal 2 modes Multimodal multiple modes Actually: Bimodal usually means that the density (pdf) has two local maxima: Multimodal means pdf has multiple maxima (2 or more) 48

49 Symmetric distribution: Symmetric around a vertical axis of symmetry Left and right side are mirror-images of each other Left- (or right-)skewed distribution: Non-symmetric and mean below (above) mode Usually used only for unimodal distributions sym negati positiv m vely ely et ske ske ri wed we c d 49

50 Data understanding: distribution of attributes Types of histograms How to describe probability distributions? Some standard probability distributions More ways to visualise distributions Visualising relations of attributes Are the attributes related? 50

51 51

52 Statisticians have names for many different families of probability distributions Why need to know some of them? In practice these are used to communicate the distribution without having to visualise it We will talk about: Uniform distributions Normal distributions Power law distributions 52

53 All options are equally probable 53

54 All values in [a,b] are equally probable: The pdf is constant between a and b, and 0 elsewhere 54

55 Represent data dispersion, spread Represent central tendency 55

56 N(mean,variance) The most common non-uniform continuous distribution Why? Sum of many independent and identically distributed (i.i.d.) random variables is approximately normally distributed Standard normal distribution: N(0,1) 56

57 Heavy-tailed (or long-tailed) distribution Very high or low values are likely More likely than in case of normal distribution Technical definition: Tails are not exponentially bounded 57

58 Alternative names: scale-free / scale-independent Distributions on positive real values The probability of M times bigger value is K times smaller (power law; M, K - parameters) Linearly descending pdf when drawn in loglog-scale (both x and y logarithmic) Examples: Number of connections in many real-world graphs Frequencies of words in languages Sizes of craters on the moon 58

59 A. Uniform B. Unimodal symmetric C. Unimodal right-skewed D. Unimodal left-skewed E. Bimodal symmetric F. Bimodal asymmetric G. Multi-modal symmetric H. Multi-modal asymmetric I. Normally distributed J. Power-law distributed 59

60 A. Uniform B. Unimodal symmetric C. Unimodal right-skewed D. Unimodal left-skewed E. Bimodal symmetric F. Bimodal asymmetric G. Multi-modal symmetric H. Multi-modal asymmetric I. Normally distributed J. Power-law distributed 60

61 A. Uniform B. Unimodal symmetric C. Unimodal right-skewed D. Unimodal left-skewed E. Bimodal symmetric F. Bimodal asymmetric G. Multi-modal symmetric H. Multi-modal asymmetric I. Normally distributed J. Power-law distributed 61

62 A. Uniform B. Unimodal symmetric C. Unimodal right-skewed D. Unimodal left-skewed E. Bimodal symmetric F. Bimodal asymmetric G. Multi-modal symmetric H. Multi-modal asymmetric I. Normally distributed J. Power-law distributed 62

63 Data understanding: distribution of attributes Types of histograms How to describe probability distributions? Some standard probability distributions More ways to visualise distributions Visualising relations of attributes Are the attributes related? 63

64 64

65 Compactly visualise many distributions Density plots rotated by 90º and mirrored Distribution of salaries for each education level ggplot(data) + geom_violin(aes(x=education,y=salaries)) 65

66 Compact visualise many distributions marked median, upper and lower quartile, and outliers (R ggplot outlier = more than 1.5x interquartile range from quartile) ggplot(data) + geom_boxplot(aes(x=education,y=salaries)) 66

67 Violin plots and box plots can also be shown together: ggplot(data) + geom_violin(aes(x=education_factor,y=salaries)) + geom_boxplot(aes(x=education_factor,y=salaries),alpha=0.3) 67

68 Often too crowded, but sometimes provide extra insights compared to box plots and violin plots Over-crowded with too many points ggplot(data) + geom_point(aes(x=education_factor,y=salaries)) 68

69 Often too crowded, but sometimes provide extra insights compared to box plots and violin plots More useful, here only 100 points ggplot(data) + geom_point(aes(x=education_factor,y=salaries)) 69

70 Data understanding: distribution of attributes Types of histograms How to describe probability distributions? Some standard probability distributions More ways to visualise distributions Visualising relations of attributes Are the attributes related? 70

71 71

72 2 categorical attributes: Cross-table (workclass & education) table(data$education,data$workclass) 72

73 2 categorical attributes: Cross-table (workclass & education) Heatmap (workclass & education) ggplot(data %>% group_by(education,workclass) %>% summarise(count=length(education))) + geom_tile(aes(y=education,x=workclass,fill=count)) + scale_fill_gradient(low='white',high='black',trans="log",breaks=c(1,10,100,1000)) + theme_bw() 73

74 2 categorical attributes: Cross-table (workclass & education) Heatmap (workclass & education) Cross-table and heatmap combined + geom_text(aes(x=workclass,y=education,label=count),color="red") 74

75 Scatter plot, box plot, violin plot 75

76 Scatter plot ggplot(data) + geom_point(aes(x=age,y=salaries)) 76

77 Scatter plot ggplot(data) + geom_point(aes(x=capital.gain,y=salaries)) 77

78 Scatter plot Discretise one (or both) of the attributes Discretise = make into categorical For instance, introduce bins library(arules) data$capital.gain.discretised = discretize(data$capital.gain,method="fixed",categories=c(0,5,10,20,50,100)*1000) ggplot(data) + geom_boxplot(aes(x=capital.gain.discretised,y=salaries)) 78

79 Make all pairwise visualisations and organise in a cross-table Perform dimensionality reduction Introduced later in the course Projects the data into a 2-dimensional space Then visualise Use colours, shapes, etc. 79

80 Four attributes visualised together ggplot(data) + geom_point(aes(x=capital.gain,y=salaries,color=workclass,shape=gender)) + scale_colour_brewer(type="qual") 80

81 Four attributes visualised together Often hard to read when many attributes visualised this way ggplot(data) + geom_point(aes(x=capital.gain,y=salaries,color=workclass,shape=gender)) + scale_colour_brewer(type="qual") 81

82 Data understanding: distribution of attributes Types of histograms How to describe probability distributions? Some standard probability distributions More ways to visualise distributions Visualising relations of attributes Are the attributes related? 82

83 83

84 There are many statistical tests about this, we will cover some in the next lecture Visual inspection can reveal some relations For example, university graduates have higher median salaries than others 84

85 There are many statistical tests about this, we will cover some in the next lecture Visual inspection can reveal some relations For example, university graduates have higher median salaries than others Correlation is a statistic on 2 numeric attributes, quantifying linear relationships 85

86 Mean (actually sample mean as explained in next lecture) x 1 n n i 1 x i Correlation (Pearson correlation coefficient) 86

87 Ranges from -1 to +1: R=-1: perfectly anti-correlated R=+1: perfectly correlated R=0: absolutely uncorrelated Positively correlated Negatively correlated 87

88 Uncorrelated Uncorrelated Uncorrelated 88

89 89

90 Several sets of (x, y) points, with the Pearson correlation coefficient of x and y for each set. Note that the correlation reflects the noisiness and direction of a linear relationship (top row), but not the slope of that relationship (middle), nor many aspects of nonlinear relationships (bottom). N.B.: the figure in the center has a slope of 0 but in that case the correlation coefficient is undefined because the variance of Y is zero. 90

91 91

92 A. All equal B. All different C. Some equal, some different 92

93 93

94 94

95 95

96 96

97 97

98 405 - Feb CO2 Noise Pressure 98

99 Feb 2017 Changed the scale of noise CO2 Pressure Noise 99

100 80 70 CO2 vs Noise in 405 y = x R² = Noise CO

101 80 70 Not the same as correlation! CO2 vs Noise in 405 y = x R² = Noise CO

102 r Pearson correlation coefficient R 2 coefficient of determination Proportion of variance in the target attribute that is predictable from the source attribute Basically, measures how well the data follow the regression line In simple linear regression R 2 is equal to the square of r 102

103 Source: 103

104 Source: 104

105 ????????? Source: 105

106 Source: 106

107 ????????? 107

108 108

109 Relation Correlation Causation 109

110 Relation Correlation Causation Spurious correlations! Multiple testing problem (next lecture) 110

111 Data understanding: distribution of attributes Types of histograms How to describe probability distributions? Some standard probability distributions More ways to visualise distributions Visualising relations of attributes Are the attributes related? 111

112 112

113 The most merciful thing in the world... is the inability of the human mind to correlate all its contents. H. P. Lovecraft The correlation of quality of life and cost of energy is huge Sam Altman Visualization is daydreaming with a purpose. Bo Bennett Source: 113

P8130: Biostatistical Methods I

P8130: Biostatistical Methods I P8130: Biostatistical Methods I Lecture 2: Descriptive Statistics Cody Chiuzan, PhD Department of Biostatistics Mailman School of Public Health (MSPH) Lecture 1: Recap Intro to Biostatistics Types of Data

More information

Chapter 4. Displaying and Summarizing. Quantitative Data

Chapter 4. Displaying and Summarizing. Quantitative Data STAT 141 Introduction to Statistics Chapter 4 Displaying and Summarizing Quantitative Data Bin Zou (bzou@ualberta.ca) STAT 141 University of Alberta Winter 2015 1 / 31 4.1 Histograms 1 We divide the range

More information

Midrange: mean of highest and lowest scores. easy to compute, rough estimate, rarely used

Midrange: mean of highest and lowest scores. easy to compute, rough estimate, rarely used Measures of Central Tendency Mode: most frequent score. best average for nominal data sometimes none or multiple modes in a sample bimodal or multimodal distributions indicate several groups included in

More information

Last Lecture. Distinguish Populations from Samples. Knowing different Sampling Techniques. Distinguish Parameters from Statistics

Last Lecture. Distinguish Populations from Samples. Knowing different Sampling Techniques. Distinguish Parameters from Statistics Last Lecture Distinguish Populations from Samples Importance of identifying a population and well chosen sample Knowing different Sampling Techniques Distinguish Parameters from Statistics Knowing different

More information

Σ x i. Sigma Notation

Σ x i. Sigma Notation Sigma Notation The mathematical notation that is used most often in the formulation of statistics is the summation notation The uppercase Greek letter Σ (sigma) is used as shorthand, as a way to indicate

More information

ECLT 5810 Data Preprocessing. Prof. Wai Lam

ECLT 5810 Data Preprocessing. Prof. Wai Lam ECLT 5810 Data Preprocessing Prof. Wai Lam Why Data Preprocessing? Data in the real world is imperfect incomplete: lacking attribute values, lacking certain attributes of interest, or containing only aggregate

More information

CS 147: Computer Systems Performance Analysis

CS 147: Computer Systems Performance Analysis CS 147: Computer Systems Performance Analysis Summarizing Variability and Determining Distributions CS 147: Computer Systems Performance Analysis Summarizing Variability and Determining Distributions 1

More information

Descriptive Data Summarization

Descriptive Data Summarization Descriptive Data Summarization Descriptive data summarization gives the general characteristics of the data and identify the presence of noise or outliers, which is useful for successful data cleaning

More information

200 participants [EUR] ( =60) 200 = 30% i.e. nearly a third of the phone bills are greater than 75 EUR

200 participants [EUR] ( =60) 200 = 30% i.e. nearly a third of the phone bills are greater than 75 EUR Ana Jerončić 200 participants [EUR] about half (71+37=108) 200 = 54% of the bills are small, i.e. less than 30 EUR (18+28+14=60) 200 = 30% i.e. nearly a third of the phone bills are greater than 75 EUR

More information

Tastitsticsss? What s that? Principles of Biostatistics and Informatics. Variables, outcomes. Tastitsticsss? What s that?

Tastitsticsss? What s that? Principles of Biostatistics and Informatics. Variables, outcomes. Tastitsticsss? What s that? Tastitsticsss? What s that? Statistics describes random mass phanomenons. Principles of Biostatistics and Informatics nd Lecture: Descriptive Statistics 3 th September Dániel VERES Data Collecting (Sampling)

More information

1-1. Chapter 1. Sampling and Descriptive Statistics by The McGraw-Hill Companies, Inc. All rights reserved.

1-1. Chapter 1. Sampling and Descriptive Statistics by The McGraw-Hill Companies, Inc. All rights reserved. 1-1 Chapter 1 Sampling and Descriptive Statistics 1-2 Why Statistics? Deal with uncertainty in repeated scientific measurements Draw conclusions from data Design valid experiments and draw reliable conclusions

More information

1. Exploratory Data Analysis

1. Exploratory Data Analysis 1. Exploratory Data Analysis 1.1 Methods of Displaying Data A visual display aids understanding and can highlight features which may be worth exploring more formally. Displays should have impact and be

More information

Ø Set of mutually exclusive categories. Ø Classify or categorize subject. Ø No meaningful order to categorization.

Ø Set of mutually exclusive categories. Ø Classify or categorize subject. Ø No meaningful order to categorization. Statistical Tools in Evaluation HPS 41 Dr. Joe G. Schmalfeldt Types of Scores Continuous Scores scores with a potentially infinite number of values. Discrete Scores scores limited to a specific number

More information

STAT 200 Chapter 1 Looking at Data - Distributions

STAT 200 Chapter 1 Looking at Data - Distributions STAT 200 Chapter 1 Looking at Data - Distributions What is Statistics? Statistics is a science that involves the design of studies, data collection, summarizing and analyzing the data, interpreting the

More information

Histograms allow a visual interpretation

Histograms allow a visual interpretation Chapter 4: Displaying and Summarizing i Quantitative Data s allow a visual interpretation of quantitative (numerical) data by indicating the number of data points that lie within a range of values, called

More information

ADMS2320.com. We Make Stats Easy. Chapter 4. ADMS2320.com Tutorials Past Tests. Tutorial Length 1 Hour 45 Minutes

ADMS2320.com. We Make Stats Easy. Chapter 4. ADMS2320.com Tutorials Past Tests. Tutorial Length 1 Hour 45 Minutes We Make Stats Easy. Chapter 4 Tutorial Length 1 Hour 45 Minutes Tutorials Past Tests Chapter 4 Page 1 Chapter 4 Note The following topics will be covered in this chapter: Measures of central location Measures

More information

Histograms, Central Tendency, and Variability

Histograms, Central Tendency, and Variability The Economist, September 6, 214 1 Histograms, Central Tendency, and Variability Lecture 2 Reading: Sections 5 5.6 Includes ALL margin notes and boxes: For Example, Guided Example, Notation Alert, Just

More information

Week 1: Intro to R and EDA

Week 1: Intro to R and EDA Statistical Methods APPM 4570/5570, STAT 4000/5000 Populations and Samples 1 Week 1: Intro to R and EDA Introduction to EDA Objective: study of a characteristic (measurable quantity, random variable) for

More information

Stat 101 Exam 1 Important Formulas and Concepts 1

Stat 101 Exam 1 Important Formulas and Concepts 1 1 Chapter 1 1.1 Definitions Stat 101 Exam 1 Important Formulas and Concepts 1 1. Data Any collection of numbers, characters, images, or other items that provide information about something. 2. Categorical/Qualitative

More information

Preliminary Statistics course. Lecture 1: Descriptive Statistics

Preliminary Statistics course. Lecture 1: Descriptive Statistics Preliminary Statistics course Lecture 1: Descriptive Statistics Rory Macqueen (rm43@soas.ac.uk), September 2015 Organisational Sessions: 16-21 Sep. 10.00-13.00, V111 22-23 Sep. 15.00-18.00, V111 24 Sep.

More information

STP 420 INTRODUCTION TO APPLIED STATISTICS NOTES

STP 420 INTRODUCTION TO APPLIED STATISTICS NOTES INTRODUCTION TO APPLIED STATISTICS NOTES PART - DATA CHAPTER LOOKING AT DATA - DISTRIBUTIONS Individuals objects described by a set of data (people, animals, things) - all the data for one individual make

More information

MATH 10 INTRODUCTORY STATISTICS

MATH 10 INTRODUCTORY STATISTICS MATH 10 INTRODUCTORY STATISTICS Tommy Khoo Your friendly neighbourhood graduate student. Week 1 Chapter 1 Introduction What is Statistics? Why do you need to know Statistics? Technical lingo and concepts:

More information

IAM 530 ELEMENTS OF PROBABILITY AND STATISTICS LECTURE 3-RANDOM VARIABLES

IAM 530 ELEMENTS OF PROBABILITY AND STATISTICS LECTURE 3-RANDOM VARIABLES IAM 530 ELEMENTS OF PROBABILITY AND STATISTICS LECTURE 3-RANDOM VARIABLES VARIABLE Studying the behavior of random variables, and more importantly functions of random variables is essential for both the

More information

AP Final Review II Exploring Data (20% 30%)

AP Final Review II Exploring Data (20% 30%) AP Final Review II Exploring Data (20% 30%) Quantitative vs Categorical Variables Quantitative variables are numerical values for which arithmetic operations such as means make sense. It is usually a measure

More information

Chapter 4.notebook. August 30, 2017

Chapter 4.notebook. August 30, 2017 Sep 1 7:53 AM Sep 1 8:21 AM Sep 1 8:21 AM 1 Sep 1 8:23 AM Sep 1 8:23 AM Sep 1 8:23 AM SOCS When describing a distribution, make sure to always tell about three things: shape, outliers, center, and spread

More information

Summary statistics. G.S. Questa, L. Trapani. MSc Induction - Summary statistics 1

Summary statistics. G.S. Questa, L. Trapani. MSc Induction - Summary statistics 1 Summary statistics 1. Visualize data 2. Mean, median, mode and percentiles, variance, standard deviation 3. Frequency distribution. Skewness 4. Covariance and correlation 5. Autocorrelation MSc Induction

More information

QUANTITATIVE DATA. UNIVARIATE DATA data for one variable

QUANTITATIVE DATA. UNIVARIATE DATA data for one variable QUANTITATIVE DATA Recall that quantitative (numeric) data values are numbers where data take numerical values for which it is sensible to find averages, such as height, hourly pay, and pulse rates. UNIVARIATE

More information

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages:

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages: Glossary The ISI glossary of statistical terms provides definitions in a number of different languages: http://isi.cbs.nl/glossary/index.htm Adjusted r 2 Adjusted R squared measures the proportion of the

More information

Descriptive Statistics-I. Dr Mahmoud Alhussami

Descriptive Statistics-I. Dr Mahmoud Alhussami Descriptive Statistics-I Dr Mahmoud Alhussami Biostatistics What is the biostatistics? A branch of applied math. that deals with collecting, organizing and interpreting data using well-defined procedures.

More information

3.1 Measure of Center

3.1 Measure of Center 3.1 Measure of Center Calculate the mean for a given data set Find the median, and describe why the median is sometimes preferable to the mean Find the mode of a data set Describe how skewness affects

More information

Module 1. Identify parts of an expression using vocabulary such as term, equation, inequality

Module 1. Identify parts of an expression using vocabulary such as term, equation, inequality Common Core Standards Major Topic Key Skills Chapters Key Vocabulary Essential Questions Module 1 Pre- Requisites Skills: Students need to know how to add, subtract, multiply and divide. Students need

More information

Ø Set of mutually exclusive categories. Ø Classify or categorize subject. Ø No meaningful order to categorization.

Ø Set of mutually exclusive categories. Ø Classify or categorize subject. Ø No meaningful order to categorization. Statistical Tools in Evaluation HPS 41 Fall 213 Dr. Joe G. Schmalfeldt Types of Scores Continuous Scores scores with a potentially infinite number of values. Discrete Scores scores limited to a specific

More information

Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series. Slide 1

Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series. Slide 1 Lecture Slides Elementary Statistics Tenth Edition and the Triola Statistics Series by Mario F. Triola Slide 1 Chapter 3 Statistics for Describing, Exploring, and Comparing Data 3-1 Overview 3-2 Measures

More information

Unit 2. Describing Data: Numerical

Unit 2. Describing Data: Numerical Unit 2 Describing Data: Numerical Describing Data Numerically Describing Data Numerically Central Tendency Arithmetic Mean Median Mode Variation Range Interquartile Range Variance Standard Deviation Coefficient

More information

Introduction to Basic Statistics Version 2

Introduction to Basic Statistics Version 2 Introduction to Basic Statistics Version 2 Pat Hammett, Ph.D. University of Michigan 2014 Instructor Comments: This document contains a brief overview of basic statistics and core terminology/concepts

More information

Chapter 2 Descriptive Statistics

Chapter 2 Descriptive Statistics Chapter 2 Descriptive Statistics Lecture 1: Measures of Central Tendency and Dispersion Donald E. Mercante, PhD Biostatistics May 2010 Biostatistics (LSUHSC) Chapter 2 05/10 1 / 34 Lecture 1: Descriptive

More information

Class 11 Maths Chapter 15. Statistics

Class 11 Maths Chapter 15. Statistics 1 P a g e Class 11 Maths Chapter 15. Statistics Statistics is the Science of collection, organization, presentation, analysis and interpretation of the numerical data. Useful Terms 1. Limit of the Class

More information

SUMMARIZING MEASURED DATA. Gaia Maselli

SUMMARIZING MEASURED DATA. Gaia Maselli SUMMARIZING MEASURED DATA Gaia Maselli maselli@di.uniroma1.it Computer Network Performance 2 Overview Basic concepts Summarizing measured data Summarizing data by a single number Summarizing variability

More information

Chapter 2: Tools for Exploring Univariate Data

Chapter 2: Tools for Exploring Univariate Data Stats 11 (Fall 2004) Lecture Note Introduction to Statistical Methods for Business and Economics Instructor: Hongquan Xu Chapter 2: Tools for Exploring Univariate Data Section 2.1: Introduction What is

More information

Index I-1. in one variable, solution set of, 474 solving by factoring, 473 cubic function definition, 394 graphs of, 394 x-intercepts on, 474

Index I-1. in one variable, solution set of, 474 solving by factoring, 473 cubic function definition, 394 graphs of, 394 x-intercepts on, 474 Index A Absolute value explanation of, 40, 81 82 of slope of lines, 453 addition applications involving, 43 associative law for, 506 508, 570 commutative law for, 238, 505 509, 570 English phrases for,

More information

Sociology 6Z03 Review I

Sociology 6Z03 Review I Sociology 6Z03 Review I John Fox McMaster University Fall 2016 John Fox (McMaster University) Sociology 6Z03 Review I Fall 2016 1 / 19 Outline: Review I Introduction Displaying Distributions Describing

More information

Stat 20: Intro to Probability and Statistics

Stat 20: Intro to Probability and Statistics Stat 20: Intro to Probability and Statistics Lecture 5: Summary Statistics Tessa L. Childers-Day UC Berkeley 30 June 2014 By the end of this lecture... You will be able to: Describe a data set by its:

More information

MIDTERM EXAMINATION (Spring 2011) STA301- Statistics and Probability

MIDTERM EXAMINATION (Spring 2011) STA301- Statistics and Probability STA301- Statistics and Probability Solved MCQS From Midterm Papers March 19,2012 MC100401285 Moaaz.pk@gmail.com Mc100401285@gmail.com PSMD01 MIDTERM EXAMINATION (Spring 2011) STA301- Statistics and Probability

More information

Statistical Methods. by Robert W. Lindeman WPI, Dept. of Computer Science

Statistical Methods. by Robert W. Lindeman WPI, Dept. of Computer Science Statistical Methods by Robert W. Lindeman WPI, Dept. of Computer Science gogo@wpi.edu Descriptive Methods Frequency distributions How many people were similar in the sense that according to the dependent

More information

Unit Two Descriptive Biostatistics. Dr Mahmoud Alhussami

Unit Two Descriptive Biostatistics. Dr Mahmoud Alhussami Unit Two Descriptive Biostatistics Dr Mahmoud Alhussami Descriptive Biostatistics The best way to work with data is to summarize and organize them. Numbers that have not been summarized and organized are

More information

2/2/2015 GEOGRAPHY 204: STATISTICAL PROBLEM SOLVING IN GEOGRAPHY MEASURES OF CENTRAL TENDENCY CHAPTER 3: DESCRIPTIVE STATISTICS AND GRAPHICS

2/2/2015 GEOGRAPHY 204: STATISTICAL PROBLEM SOLVING IN GEOGRAPHY MEASURES OF CENTRAL TENDENCY CHAPTER 3: DESCRIPTIVE STATISTICS AND GRAPHICS Spring 2015: Lembo GEOGRAPHY 204: STATISTICAL PROBLEM SOLVING IN GEOGRAPHY CHAPTER 3: DESCRIPTIVE STATISTICS AND GRAPHICS Descriptive statistics concise and easily understood summary of data set characteristics

More information

Chapter Four. Numerical Descriptive Techniques. Range, Standard Deviation, Variance, Coefficient of Variation

Chapter Four. Numerical Descriptive Techniques. Range, Standard Deviation, Variance, Coefficient of Variation Chapter Four Numerical Descriptive Techniques 4.1 Numerical Descriptive Techniques Measures of Central Location Mean, Median, Mode Measures of Variability Range, Standard Deviation, Variance, Coefficient

More information

2.0 Lesson Plan. Answer Questions. Summary Statistics. Histograms. The Normal Distribution. Using the Standard Normal Table

2.0 Lesson Plan. Answer Questions. Summary Statistics. Histograms. The Normal Distribution. Using the Standard Normal Table 2.0 Lesson Plan Answer Questions 1 Summary Statistics Histograms The Normal Distribution Using the Standard Normal Table 2. Summary Statistics Given a collection of data, one needs to find representations

More information

TOPIC: Descriptive Statistics Single Variable

TOPIC: Descriptive Statistics Single Variable TOPIC: Descriptive Statistics Single Variable I. Numerical data summary measurements A. Measures of Location. Measures of central tendency Mean; Median; Mode. Quantiles - measures of noncentral tendency

More information

Descriptive Statistics

Descriptive Statistics Descriptive Statistics CHAPTER OUTLINE 6-1 Numerical Summaries of Data 6- Stem-and-Leaf Diagrams 6-3 Frequency Distributions and Histograms 6-4 Box Plots 6-5 Time Sequence Plots 6-6 Probability Plots Chapter

More information

ANÁLISE DOS DADOS. Daniela Barreiro Claro

ANÁLISE DOS DADOS. Daniela Barreiro Claro ANÁLISE DOS DADOS Daniela Barreiro Claro Outline Data types Graphical Analysis Proimity measures Prof. Daniela Barreiro Claro Types of Data Sets Record Ordered Relational records Video data: sequence of

More information

Mathematics Standards for High School Algebra I

Mathematics Standards for High School Algebra I Mathematics Standards for High School Algebra I Algebra I is a course required for graduation and course is aligned with the College and Career Ready Standards for Mathematics in High School. Throughout

More information

Descriptive Univariate Statistics and Bivariate Correlation

Descriptive Univariate Statistics and Bivariate Correlation ESC 100 Exploring Engineering Descriptive Univariate Statistics and Bivariate Correlation Instructor: Sudhir Khetan, Ph.D. Wednesday/Friday, October 17/19, 2012 The Central Dogma of Statistics used to

More information

Lecture Notes 2: Variables and graphics

Lecture Notes 2: Variables and graphics Highlights: Lecture Notes 2: Variables and graphics Quantitative vs. qualitative variables Continuous vs. discrete and ordinal vs. nominal variables Frequency distributions Pie charts Bar charts Histograms

More information

Lecture 2 and Lecture 3

Lecture 2 and Lecture 3 Lecture 2 and Lecture 3 1 Lecture 2 and Lecture 3 We can describe distributions using 3 characteristics: shape, center and spread. These characteristics have been discussed since the foundation of statistics.

More information

Elementary Statistics

Elementary Statistics Elementary Statistics Q: What is data? Q: What does the data look like? Q: What conclusions can we draw from the data? Q: Where is the middle of the data? Q: Why is the spread of the data important? Q:

More information

Keystone Exams: Algebra

Keystone Exams: Algebra KeystoneExams:Algebra TheKeystoneGlossaryincludestermsanddefinitionsassociatedwiththeKeystoneAssessmentAnchorsand Eligible Content. The terms and definitions included in the glossary are intended to assist

More information

3rd Quartile. 1st Quartile) Minimum

3rd Quartile. 1st Quartile) Minimum EXST7034 - Regression Techniques Page 1 Regression diagnostics dependent variable Y3 There are a number of graphic representations which will help with problem detection and which can be used to obtain

More information

Chapter I: Introduction & Foundations

Chapter I: Introduction & Foundations Chapter I: Introduction & Foundations } 1.1 Introduction 1.1.1 Definitions & Motivations 1.1.2 Data to be Mined 1.1.3 Knowledge to be discovered 1.1.4 Techniques Utilized 1.1.5 Applications Adapted 1.1.6

More information

What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty.

What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty. What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty. Statistics is a field of study concerned with the data collection,

More information

1 Measures of the Center of a Distribution

1 Measures of the Center of a Distribution 1 Measures of the Center of a Distribution Qualitative descriptions of the shape of a distribution are important and useful. But we will often desire the precision of numerical summaries as well. Two aspects

More information

Exploring, summarizing and presenting data. Berghold, IMI, MUG

Exploring, summarizing and presenting data. Berghold, IMI, MUG Exploring, summarizing and presenting data Example Patient Nr Gender Age Weight Height PAVK-Grade W alking Distance Physical Functioning Scale Total Cholesterol Triglycerides 01 m 65 90 185 II b 200 70

More information

Summarizing and Displaying Measurement Data/Understanding and Comparing Distributions

Summarizing and Displaying Measurement Data/Understanding and Comparing Distributions Summarizing and Displaying Measurement Data/Understanding and Comparing Distributions Histograms, Mean, Median, Five-Number Summary and Boxplots, Standard Deviation Thought Questions 1. If you were to

More information

Summarizing Measured Data

Summarizing Measured Data Summarizing Measured Data 12-1 Overview Basic Probability and Statistics Concepts: CDF, PDF, PMF, Mean, Variance, CoV, Normal Distribution Summarizing Data by a Single Number: Mean, Median, and Mode, Arithmetic,

More information

Descriptive Statistics

Descriptive Statistics Descriptive Statistics DS GA 1002 Probability and Statistics for Data Science http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall17 Carlos Fernandez-Granda Descriptive statistics Techniques to visualize

More information

Units. Exploratory Data Analysis. Variables. Student Data

Units. Exploratory Data Analysis. Variables. Student Data Units Exploratory Data Analysis Bret Larget Departments of Botany and of Statistics University of Wisconsin Madison Statistics 371 13th September 2005 A unit is an object that can be measured, such as

More information

3.1 Measures of Central Tendency: Mode, Median and Mean. Average a single number that is used to describe the entire sample or population

3.1 Measures of Central Tendency: Mode, Median and Mean. Average a single number that is used to describe the entire sample or population . Measures of Central Tendency: Mode, Median and Mean Average a single number that is used to describe the entire sample or population. Mode a. Easiest to compute, but not too stable i. Changing just one

More information

Introduction to Statistics

Introduction to Statistics Introduction to Statistics By A.V. Vedpuriswar October 2, 2016 Introduction The word Statistics is derived from the Italian word stato, which means state. Statista refers to a person involved with the

More information

Chapter2 Description of samples and populations. 2.1 Introduction.

Chapter2 Description of samples and populations. 2.1 Introduction. Chapter2 Description of samples and populations. 2.1 Introduction. Statistics=science of analyzing data. Information collected (data) is gathered in terms of variables (characteristics of a subject that

More information

Chapter 3. Data Description

Chapter 3. Data Description Chapter 3. Data Description Graphical Methods Pie chart It is used to display the percentage of the total number of measurements falling into each of the categories of the variable by partition a circle.

More information

FLORIDA STANDARDS TO BOOK CORRELATION

FLORIDA STANDARDS TO BOOK CORRELATION FLORIDA STANDARDS TO BOOK CORRELATION Florida Standards (MAFS.912) Conceptual Category: Number and Quantity Domain: The Real Number System After a standard is introduced, it is revisited many times in

More information

Observations Homework Checkpoint quizzes Chapter assessments (Possibly Projects) Blocks of Algebra

Observations Homework Checkpoint quizzes Chapter assessments (Possibly Projects) Blocks of Algebra September The Building Blocks of Algebra Rates, Patterns and Problem Solving Variables and Expressions The Commutative and Associative Properties The Distributive Property Equivalent Expressions Seeing

More information

Quantitative Tools for Research

Quantitative Tools for Research Quantitative Tools for Research KASHIF QADRI Descriptive Analysis Lecture Week 4 1 Overview Measurement of Central Tendency / Location Mean, Median & Mode Quantiles (Quartiles, Deciles, Percentiles) Measurement

More information

Example 2. Given the data below, complete the chart:

Example 2. Given the data below, complete the chart: Statistics 2035 Quiz 1 Solutions Example 1. 2 64 150 150 2 128 150 2 256 150 8 8 Example 2. Given the data below, complete the chart: 52.4, 68.1, 66.5, 75.0, 60.5, 78.8, 63.5, 48.9, 81.3 n=9 The data is

More information

Chapter 1. Looking at Data

Chapter 1. Looking at Data Chapter 1 Looking at Data Types of variables Looking at Data Be sure that each variable really does measure what you want it to. A poor choice of variables can lead to misleading conclusions!! For example,

More information

Stat Lecture Slides Exploring Numerical Data. Yibi Huang Department of Statistics University of Chicago

Stat Lecture Slides Exploring Numerical Data. Yibi Huang Department of Statistics University of Chicago Stat 22000 Lecture Slides Exploring Numerical Data Yibi Huang Department of Statistics University of Chicago Outline In this slide, we cover mostly Section 1.2 & 1.6 in the text. Data and Types of Variables

More information

MATH 117 Statistical Methods for Management I Chapter Three

MATH 117 Statistical Methods for Management I Chapter Three Jubail University College MATH 117 Statistical Methods for Management I Chapter Three This chapter covers the following topics: I. Measures of Center Tendency. 1. Mean for Ungrouped Data (Raw Data) 2.

More information

Types of Information. Topic 2 - Descriptive Statistics. Examples. Sample and Sample Size. Background Reading. Variables classified as STAT 511

Types of Information. Topic 2 - Descriptive Statistics. Examples. Sample and Sample Size. Background Reading. Variables classified as STAT 511 Topic 2 - Descriptive Statistics STAT 511 Professor Bruce Craig Types of Information Variables classified as Categorical (qualitative) - variable classifies individual into one of several groups or categories

More information

Statistics and parameters

Statistics and parameters Statistics and parameters Tables, histograms and other charts are used to summarize large amounts of data. Often, an even more extreme summary is desirable. Statistics and parameters are numbers that characterize

More information

Lecture 6: Chapter 4, Section 2 Quantitative Variables (Displays, Begin Summaries)

Lecture 6: Chapter 4, Section 2 Quantitative Variables (Displays, Begin Summaries) Lecture 6: Chapter 4, Section 2 Quantitative Variables (Displays, Begin Summaries) Summarize with Shape, Center, Spread Displays: Stemplots, Histograms Five Number Summary, Outliers, Boxplots Cengage Learning

More information

are the objects described by a set of data. They may be people, animals or things.

are the objects described by a set of data. They may be people, animals or things. ( c ) E p s t e i n, C a r t e r a n d B o l l i n g e r 2016 C h a p t e r 5 : E x p l o r i n g D a t a : D i s t r i b u t i o n s P a g e 1 CHAPTER 5: EXPLORING DATA DISTRIBUTIONS 5.1 Creating Histograms

More information

Lesson Plan. Answer Questions. Summary Statistics. Histograms. The Normal Distribution. Using the Standard Normal Table

Lesson Plan. Answer Questions. Summary Statistics. Histograms. The Normal Distribution. Using the Standard Normal Table Lesson Plan Answer Questions Summary Statistics Histograms The Normal Distribution Using the Standard Normal Table 1 2. Summary Statistics Given a collection of data, one needs to find representations

More information

1.0 Continuous Distributions. 5.0 Shapes of Distributions. 6.0 The Normal Curve. 7.0 Discrete Distributions. 8.0 Tolerances. 11.

1.0 Continuous Distributions. 5.0 Shapes of Distributions. 6.0 The Normal Curve. 7.0 Discrete Distributions. 8.0 Tolerances. 11. Chapter 4 Statistics 45 CHAPTER 4 BASIC QUALITY CONCEPTS 1.0 Continuous Distributions.0 Measures of Central Tendency 3.0 Measures of Spread or Dispersion 4.0 Histograms and Frequency Distributions 5.0

More information

Algebra I Number and Quantity The Real Number System (N-RN)

Algebra I Number and Quantity The Real Number System (N-RN) Number and Quantity The Real Number System (N-RN) Use properties of rational and irrational numbers N-RN.3 Explain why the sum or product of two rational numbers is rational; that the sum of a rational

More information

BNG 495 Capstone Design. Descriptive Statistics

BNG 495 Capstone Design. Descriptive Statistics BNG 495 Capstone Design Descriptive Statistics Overview The overall goal of this short course in statistics is to provide an introduction to descriptive and inferential statistical methods, with a focus

More information

Chapter 2 Class Notes Sample & Population Descriptions Classifying variables

Chapter 2 Class Notes Sample & Population Descriptions Classifying variables Chapter 2 Class Notes Sample & Population Descriptions Classifying variables Random Variables (RVs) are discrete quantitative continuous nominal qualitative ordinal Notation and Definitions: a Sample is

More information

Curriculum Summary 8 th Grade Algebra I

Curriculum Summary 8 th Grade Algebra I Curriculum Summary 8 th Grade Algebra I Students should know and be able to demonstrate mastery in the following skills by the end of Eighth Grade: The Number System Extend the properties of exponents

More information

3 GRAPHICAL DISPLAYS OF DATA

3 GRAPHICAL DISPLAYS OF DATA some without indicating nonnormality. If a sample of 30 observations contains 4 outliers, two of which are extreme, would it be reasonable to assume the population from which the data were collected has

More information

Algebra 1 Mathematics: to Hoover City Schools

Algebra 1 Mathematics: to Hoover City Schools Jump to Scope and Sequence Map Units of Study Correlation of Standards Special Notes Scope and Sequence Map Conceptual Categories, Domains, Content Clusters, & Standard Numbers NUMBER AND QUANTITY (N)

More information

Foundations of Algebra/Algebra/Math I Curriculum Map

Foundations of Algebra/Algebra/Math I Curriculum Map *Standards N-Q.1, N-Q.2, N-Q.3 are not listed. These standards represent number sense and should be integrated throughout the units. *For each specific unit, learning targets are coded as F for Foundations

More information

Lecture 3B: Chapter 4, Section 2 Quantitative Variables (Displays, Begin Summaries)

Lecture 3B: Chapter 4, Section 2 Quantitative Variables (Displays, Begin Summaries) Lecture 3B: Chapter 4, Section 2 Quantitative Variables (Displays, Begin Summaries) Summarize with Shape, Center, Spread Displays: Stemplots, Histograms Five Number Summary, Outliers, Boxplots Mean vs.

More information

Dublin City Schools Mathematics Graded Course of Study Algebra I Philosophy

Dublin City Schools Mathematics Graded Course of Study Algebra I Philosophy Philosophy The Dublin City Schools Mathematics Program is designed to set clear and consistent expectations in order to help support children with the development of mathematical understanding. We believe

More information

Statistics in medicine

Statistics in medicine Statistics in medicine Lecture 1- part 1: Describing variation, and graphical presentation Outline Sources of variation Types of variables Fatma Shebl, MD, MS, MPH, PhD Assistant Professor Chronic Disease

More information

COMMON CORE STATE STANDARDS TO BOOK CORRELATION

COMMON CORE STATE STANDARDS TO BOOK CORRELATION COMMON CORE STATE STANDARDS TO BOOK CORRELATION Conceptual Category: Number and Quantity Domain: The Real Number System After a standard is introduced, it is revisited many times in subsequent activities,

More information

Biostatistics for biomedical profession. BIMM34 Karin Källen & Linda Hartman November-December 2015

Biostatistics for biomedical profession. BIMM34 Karin Källen & Linda Hartman November-December 2015 Biostatistics for biomedical profession BIMM34 Karin Källen & Linda Hartman November-December 2015 12015-11-02 Who needs a course in biostatistics? - Anyone who uses quntitative methods to interpret biological

More information

Probabilities and Statistics Probabilities and Statistics Probabilities and Statistics

Probabilities and Statistics Probabilities and Statistics Probabilities and Statistics - Lecture 8 Olariu E. Florentin April, 2018 Table of contents 1 Introduction Vocabulary 2 Descriptive Variables Graphical representations Measures of the Central Tendency The Mean The Median The Mode Comparing

More information

ST Presenting & Summarising Data Descriptive Statistics. Frequency Distribution, Histogram & Bar Chart

ST Presenting & Summarising Data Descriptive Statistics. Frequency Distribution, Histogram & Bar Chart ST2001 2. Presenting & Summarising Data Descriptive Statistics Frequency Distribution, Histogram & Bar Chart Summary of Previous Lecture u A study often involves taking a sample from a population that

More information

Describing distributions with numbers

Describing distributions with numbers Describing distributions with numbers A large number or numerical methods are available for describing quantitative data sets. Most of these methods measure one of two data characteristics: The central

More information

Math 120 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency

Math 120 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency Math 1 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency The word average: is very ambiguous and can actually refer to the mean, median, mode or midrange. Notation:

More information

AP Statistics Cumulative AP Exam Study Guide

AP Statistics Cumulative AP Exam Study Guide AP Statistics Cumulative AP Eam Study Guide Chapters & 3 - Graphs Statistics the science of collecting, analyzing, and drawing conclusions from data. Descriptive methods of organizing and summarizing statistics

More information