Analytical Graphing: let's start with the best graph ever made

Probably the best statistical graphic ever drawn, this map by Charles Joseph Minard portrays the losses suffered by Napoleon's army in the Russian campaign of 1812. Beginning at the Polish-Russian border, the thick band shows the size of the army at each position. The path of Napoleon's retreat from Moscow in the bitterly cold winter is depicted by the dark lower band, which is tied to temperature and time scales. The graph illustrates an amazing point: how an army of roughly 400,000 can dwindle to a small remnant without losing a single major battle.

When is a graph appropriate?
- Always for data exploration
- Often for data analysis, and to develop predictions (models) and experimental designs
- Sometimes for presentations
- Less often for publications

Data Exploration
Exploration is not snooping in the pejorative sense. It is a necessary and desired operation for:
- Checking data for unusual values
- Making sure the data meet the assumptions of the chosen form of analysis, e.g. normality, homogeneity of variances, linearity (in regression approaches)
- Deciding (sometimes) what sort of analysis to do. Ideally this will have been done before initiating a study.
- Looking for patterns that may not be expected or apparent. This is indeed snooping, but it is an essential part of hypothesis formation.

Data Exploration
- Checking data for unusual values
- Making sure the data meet the assumptions of the chosen form of analysis
See ourworld (pop_86): determining distributions and outliers. Will a transformation help?
[Figure: two histograms titled "Population of countries (1986)"; axes: Count and Proportion per Bar]

Data Exploration
Exploration is not snooping in the pejorative sense. It is a necessary and desired operation for checking data for unusual values and for making sure the data meet the assumptions of the chosen form of analysis, e.g. normality, homogeneity of variances, linearity (in regression approaches).
Example: the relationship between birth and death rates (ourworld). Is it linear, or is there perhaps a more appropriate model?
[Figure: scatterplot of DEATH_ and BIRTH_82]

Clearly not linear. The curve is fitted with the LOESS procedure (locally weighted scatterplot smoothing): a non-parametric regression method that combines multiple regression models in a k-nearest-neighbor-based meta-model.
[Figure: scatterplot of DEATH_ and BIRTH_82 with a LOESS smoother]

When is a graph appropriate? Often for data analysis, e.g.:
- To understand the nature of interaction terms (more later)
- To understand the power of a test
Say we wanted to determine the sample size for an experiment where we thought the response would be around (alternate hypothesis = ), the standard deviation about 8, and we were willing to relax alpha (from .5 to .).
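The LOESS idea described above can be sketched in a few lines. This is a minimal illustration, not the procedure used to draw the slide: it assumes tricube weights and a local straight-line fit, and the function name `loess`, the `frac` parameter, and the sample birth and death values are all hypothetical.

```python
def loess(xs, ys, frac=0.5):
    """Return a LOESS-style fitted value at each x.

    For every point, fit a weighted straight line to its nearest
    neighbours (a fraction `frac` of the data) and evaluate it there.
    """
    n = len(xs)
    k = max(2, int(round(frac * n)))  # neighbours used in each local fit
    fitted = []
    for x0 in xs:
        # Distance to the k-th nearest neighbour sets the window half-width.
        dists = sorted(abs(x - x0) for x in xs)
        h = dists[k - 1] or 1e-12
        # Tricube weights: 1 at x0, falling to 0 at the window edge.
        w = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
        # Weighted least-squares line through the window.
        sw = sum(w)
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


# Hypothetical birth and death rates with a curved relationship.
birth = [10, 13, 16, 20, 25, 30, 35, 40, 45, 50]
death = [11, 10, 8, 7, 7, 8, 10, 13, 17, 22]
for b, d, s in zip(birth, death, loess(birth, death, frac=0.5)):
    print(f"birth={b:2d}  death={d:2d}  smoothed={s:5.1f}")
```

Because each fitted value comes from its own local line, the smoothed curve can bend where a single global regression line cannot, which is what makes a smoother useful for checking linearity.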

For example, the effect of relaxing alpha on power.
[Figure: two power curves, Power against Sample Size (per cell), titled "Power Curve (Alpha = .5)" and "Power Curve (Alpha = .)"; settings as transcribed: Pop. Mean = , Alternative = , SD = 8]

When is a graph appropriate? Sometimes for presentations
- The idea is to communicate information quickly.
- Be sure you know why you are presenting the graph: is it to convey stats or some other information? (We will talk about this more later.)
- Graphs should be simple and not contain too much information.
- Never show a graph that is not interpretable: one with so many factors involved that no one could figure it out, or worse.
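The effect of relaxing alpha on power can be sketched numerically. This is a rough illustration under stated assumptions: a two-group comparison with equal cell sizes, a normal approximation to the test statistic, and a two-sided test. Only SD = 8 comes from the slide; the difference of 5 and the alpha levels 0.05 and 0.10 are hypothetical stand-ins, and the function name is invented for this example.

```python
from statistics import NormalDist


def power_two_sample(diff, sd, n_per_cell, alpha):
    """Approximate power of a two-sided, two-sample z-test."""
    z = NormalDist()
    se = sd * (2 / n_per_cell) ** 0.5   # standard error of the difference
    z_crit = z.inv_cdf(1 - alpha / 2)   # two-sided critical value
    shift = abs(diff) / se              # true difference in SE units
    # Probability that the statistic lands in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


SD = 8          # from the slide
DIFF = 5        # hypothetical difference under the alternative
print("n/cell  power(a=0.05)  power(a=0.10)")
for n in (5, 10, 20, 40, 80):
    p05 = power_two_sample(DIFF, SD, n, alpha=0.05)
    p10 = power_two_sample(DIFF, SD, n, alpha=0.10)
    print(f"{n:6d}  {p05:13.3f}  {p10:13.3f}")
```

Plotting the two columns against sample size gives the pair of power curves: at every sample size the relaxed alpha buys extra power, at the cost of a higher false-positive rate.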

I know you can't really see this, but...
[Figure: scatterplot matrix of the ourworld variables, including GNP_82, GNP_86, GDP_CAP, LOG_GDP, EDUC_84, EDUC, HEALTH84, HEALTH, POP_1983, POP_1986, BIRTH_82, BIRTH_RT, DEATH_82, DEATH_RT, BABYMT82, BABYMORT and LIFE_EXP]
Graphs like this are usually presented to demonstrate how much work the researcher has done. What they really convey is that he or she has not adequately prepared the presentation.

When is a graph appropriate? Less often for publications
- The idea is to communicate information that is too complex to leave in tables or text.
- Graphs typically depict rather than present information (you have to read across to the axes to get numbers). Hence, if precise bits of information are important to the argument being made, use tables.
- If a graph is presented, it must be important to the argument being made in the text (no fluff graphs).
- Information cannot be presented twice (e.g. table and figure, or text and figure).
- If a graph is presented, it must be interpretable. You should be able to understand the purpose and content of the figure directly from the legend.

Basics of analytical graph theory
- Graph types imply a basis of logic and are not always interchangeable.
- Even interchangeable graph types are not always equivalent (some are just non-informative).
- Be very clear about what you are trying to convey: models, stats or data structure.
- Graph construction (axes, scales, etc.) may obscure or make clear the points you are trying to make.
- Graph trickery is usually just that, and typically detracts from the depiction.

Graph types imply a basis of logic and are not always interchangeable
- Summary charts
- Density charts
- Scatterplots, quantile plots and probability plots

Summary Charts
There is a series of general graphical displays useful for characterizing the relationship between independent variables (usually categorical) and summary statistics of dependent variables (usually continuous). An example would be a bar graph of the relationship between education and income (see survey2 data). Some types of summary charts:

Examples of continuous and categorical variables

Categorical                           Continuous
Gender (male, female)                 Hormone level
Nationality (French, Italian)         Location (latitude, longitude)
Species (Human, Chimp)                Phylogenetic distance
Color (red, green, blue)              Color (wavelength)
Age Group (Young, Old)                Age (years, days)
Height Group (Short, Medium, Tall)    Height (cm, inches)
Weight Group (Thin, Obese)            Weight (grams, pounds)
Speed (Fast, Slow)                    Speed (cm/sec)
Temperature (Cold, Warm)              Temperature (degrees C)

Bar, Dot, Line, Profile, Pyramid, Pie
Which conveys the information most clearly? How about the comparisons of interest?
[Figure: example charts with legend SEX (Female, Male)]

Density Charts
The density of a sample is the relative concentration of data points in intervals across the range of the distribution. A histogram is one way to display the density of a quantitative variable; box plots, dot or symmetric dot density plots, frequency polygons, fuzzygrams, jitter plots, density stripes, and histograms with data-driven bar widths are others.
[Figure: histogram; x-axis: Length (mm)]

Features of a BOXPLOT
Rather than comparing sample values to the normal distribution (mean, standard deviation, and so on), box plots show robust (what does this mean?) statistics (median, quartiles, and so on).
[Figure: annotated box plot of Y; labels: confidence interval, hinge, median, hinge, outliers, mean, 25% (four times), "Smallest 5%", Statistical Range]

Raw Data Plots, e.g. Scatterplots
Scatterplots are probably the most common form of graphical display. The key feature of scatterplots is that raw data are plotted (in contrast to summary data, as in summary charts). Regression lines with confidence bands, or smoothers (e.g. linear, non-linear), can be added to help explain relationships among variables. An example is the relationship between mussel height and length, and between mussel height and mussel mass. How can we estimate the length and mass of mussels?
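The robust statistics behind a box plot can be computed directly. The sketch below assumes Tukey-style hinges (medians of the lower and upper halves of the sorted data) and the common 1.5 x hinge-spread rule for flagging outliers; the slide does not say which convention its software uses, and the function name and sample values are hypothetical.

```python
from statistics import mean, median


def box_stats(values):
    """Return the numbers a box plot draws: median, hinges, whiskers, outliers."""
    data = sorted(values)
    n = len(data)
    half = (n + 1) // 2                      # halves share the median when n is odd
    lower_hinge = median(data[:half])
    upper_hinge = median(data[n - half:])
    spread = upper_hinge - lower_hinge
    # Points beyond the fences are drawn individually as outliers.
    low_fence = lower_hinge - 1.5 * spread
    high_fence = upper_hinge + 1.5 * spread
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "median": median(data),
        "hinges": (lower_hinge, upper_hinge),
        "whiskers": (inside[0], inside[-1]),
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]     # hypothetical, one extreme value
extreme = sample[:-1] + [1000]                # make the extreme value more extreme
print(box_stats(sample))
print("means:  ", mean(sample), mean(extreme))
print("medians:", box_stats(sample)["median"], box_stats(extreme)["median"])
```

The last two lines show what "robust" means here: pushing the extreme value further out changes the mean but leaves the median and hinges where they were.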

[Figure: scatterplot of mussel Height and Length with non-linear and linear smoothing; each point is a mussel]

Scatterplots, quantile plots and probability plots
Quantile plots and probability plots are useful for studying the distribution of a variable. Quantile plots are also called Q plots. Unlike probability plots, which compare a sample to a theoretical probability distribution, a quantile plot compares a sample to its own quantiles (a one-sample plot) or to another sample (a two-sample, or Q-Q, plot). The quantile of a sample is the data point corresponding to a given fraction of the data. See ourworld (pop_1986).

Features of a Quantile Plot
[Figure: quantile plot of Fraction of Data against POP_1986. Annotations: distribution of data; % of countries had populations less than or equal to 5 million people; distribution of quantiles (should be uniform but is subject to sample size)]
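The coordinates of a one-sample quantile plot are simple to compute: sort the data and pair each value with the fraction of the data at or below it. The sketch below assumes the plotting position (i - 0.5) / n, which is one common convention among several; the function names and the population values (in millions) are hypothetical.

```python
def quantile_plot_points(values):
    """Pair each sorted value with its fraction of the data."""
    data = sorted(values)
    n = len(data)
    # (i + 0.5) / n with 0-based i is the same as (rank - 0.5) / n.
    return [(x, (i + 0.5) / n) for i, x in enumerate(data)]


def fraction_at_or_below(values, threshold):
    """Read a single point off the plot: share of data <= threshold."""
    return sum(1 for v in values if v <= threshold) / len(values)


# Hypothetical country populations in millions, strongly right-skewed.
populations = [0.3, 1.2, 2.5, 4.0, 6.1, 9.8, 15.0, 33.0, 57.0, 120.0, 240.0, 1050.0]
for value, fraction in quantile_plot_points(populations):
    print(f"{value:8.1f}  {fraction:.3f}")
print("share at or below 5 million:", fraction_at_or_below(populations, 5.0))
```

With skewed data like this, most of the points crowd against the left edge of the plot, which is the visual cue that a transformation may help.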

Scatterplots, quantile plots and probability plots
A probability plot plots the values of a variable against the corresponding percentage points of a theoretical distribution: normal, chi-square, t, F, uniform, binomial, logistic, exponential, gamma, Weibull, or Studentized range. Graphs like this are called probability plots, or P plots. You can also plot the expected values of one variable against those of another (P-P plot). These graphs are very important for determining whether data are in need of transformation. See ourworld (pop_1986).

Features of a Probability Plot
[Figure: two probability plots of pop_1986, titled "No transformation" and "Log (base ) transformation"]
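A normal probability plot can be summarized numerically to see whether a transformation helps. The sketch below pairs the sorted data with expected normal scores and reports the correlation between the two; a value near 1 means the points fall close to a straight line. The plotting position (i - 0.5) / n, the use of base 10 for the log, the function names, and the population values are all assumptions made for illustration.

```python
from math import log10, sqrt
from statistics import NormalDist, mean


def normal_scores(n):
    """Expected standard-normal values for n sorted observations."""
    z = NormalDist()
    return [z.inv_cdf((i + 0.5) / n) for i in range(n)]


def pearson_r(xs, ys):
    """Plain Pearson correlation, written out to stay dependency-free."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def probability_plot_r(values):
    """Straightness of the normal probability plot of `values`."""
    data = sorted(values)
    return pearson_r(normal_scores(len(data)), data)


# Hypothetical country populations in millions, strongly right-skewed.
populations = [0.3, 1.2, 2.5, 4.0, 6.1, 9.8, 15.0, 33.0, 57.0, 120.0, 240.0, 1050.0]
print("no transformation:", round(probability_plot_r(populations), 3))
print("log10 transformed:", round(probability_plot_r([log10(p) for p in populations]), 3))
```

Plotting `normal_scores(n)` against the sorted values gives the probability plot itself; the correlation is only a compact stand-in for judging its straightness by eye.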

Extra slides

The underlying basis of the graph
There are two general bases for any data graph that will be presented or published:
- To display data (hopefully in the most efficient way)
- To convey information about statistics associated with the data
- Both of the above
Although these may not appear to conflict, oftentimes they do. Here is an example.

Error Bars
Be very careful: error bars convey meaning, of at least two sorts.
- An estimate of variability for subjects in that category, irrespective of strata or statistical assumptions
  - Of use for showing spread in sampled data
  - Of no use for conveying inferential statistics
- An estimate of variability for subjects in that category, with respect to strata and statistical assumptions
  - Of no use for showing spread in sampled data
  - Of use for conveying inferential statistics
See typing.
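The two sorts of error bar can be computed side by side. The sketch below is one reading of the distinction above, not a reconstruction of the slide's analysis: the "irrespective" bar uses each group's own standard deviation, while the "with respect to assumptions" bar uses a standard error built from the pooled error variance of a one-way ANOVA, which assumes equal variances across groups. The function name and the typing speeds are hypothetical; only the equipment labels echo the typing example.

```python
from math import sqrt
from statistics import mean, stdev


def error_bars(groups):
    """Per-group SD and SE, plus an SE based on the pooled error variance."""
    # Pooled error variance (ANOVA mean square error) across all groups.
    df_error = sum(len(v) - 1 for v in groups.values())
    ss_error = sum((len(v) - 1) * stdev(v) ** 2 for v in groups.values())
    mse = ss_error / df_error
    result = {}
    for name, v in groups.items():
        result[name] = {
            "mean": mean(v),
            "sd": stdev(v),                        # spread of the sampled data
            "se_separate": stdev(v) / sqrt(len(v)),
            "se_pooled": sqrt(mse / len(v)),       # inferential, model-based
        }
    return result


# Hypothetical typing speeds by equipment type.
speeds = {
    "electric": [48, 52, 50, 55, 45],
    "plain old": [38, 44, 30, 50, 43],
    "word process": [69, 71, 70, 72, 68],
}
for name, bars in error_bars(speeds).items():
    print(name, {key: round(value, 2) for key, value in bars.items()})
```

With equal group sizes the pooled standard error is identical for every group, so the model-based bars say nothing about how spread out each group's raw data are, while the per-group standard deviations say nothing about the test being reported.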

How and why are these two graphs different?
[Figure: two plots of SPEED by EQUIPMNT (electric, plain old, word process). One is drawn without respect to strata and statistical assumptions; the other shows Least Squares Means, with respect to strata and statistical assumptions]

Which is best?
[Figure: chart panels with legend SEX (Female, Male)]

Analytical Graphing. lets start with the best graph ever made

Analytical Graphing. lets start with the best graph ever made Analytical Graphing lets start with the best graph ever made Probably the best statistical graphic ever drawn, this map by Charles Joseph Minard portrays the losses suffered by Napoleon's army in the Russian

More information

Introduction to hypothesis testing

Introduction to hypothesis testing Introduction to hypothesis testing Review: Logic of Hypothesis Tests Usually, we test (attempt to falsify) a null hypothesis (H 0 ): includes all possibilities except prediction in hypothesis (H A ) If

More information

What is statistics? Statistics is the science of: Collecting information. Organizing and summarizing the information collected

What is statistics? Statistics is the science of: Collecting information. Organizing and summarizing the information collected What is statistics? Statistics is the science of: Collecting information Organizing and summarizing the information collected Analyzing the information collected in order to draw conclusions Two types

More information

AP Final Review II Exploring Data (20% 30%)

AP Final Review II Exploring Data (20% 30%) AP Final Review II Exploring Data (20% 30%) Quantitative vs Categorical Variables Quantitative variables are numerical values for which arithmetic operations such as means make sense. It is usually a measure

More information

Chapter2 Description of samples and populations. 2.1 Introduction.

Chapter2 Description of samples and populations. 2.1 Introduction. Chapter2 Description of samples and populations. 2.1 Introduction. Statistics=science of analyzing data. Information collected (data) is gathered in terms of variables (characteristics of a subject that

More information

Elementary Statistics

Elementary Statistics Elementary Statistics Q: What is data? Q: What does the data look like? Q: What conclusions can we draw from the data? Q: Where is the middle of the data? Q: Why is the spread of the data important? Q:

More information

Stat 101 Exam 1 Important Formulas and Concepts 1

Stat 101 Exam 1 Important Formulas and Concepts 1 1 Chapter 1 1.1 Definitions Stat 101 Exam 1 Important Formulas and Concepts 1 1. Data Any collection of numbers, characters, images, or other items that provide information about something. 2. Categorical/Qualitative

More information

Talking feet: Scatterplots and lines of best fit

Talking feet: Scatterplots and lines of best fit Talking feet: Scatterplots and lines of best fit Student worksheet What does your foot say about your height? Can you predict people s height by how long their feet are? If a Grade 10 student s foot is

More information

Descriptive Statistics-I. Dr Mahmoud Alhussami

Descriptive Statistics-I. Dr Mahmoud Alhussami Descriptive Statistics-I Dr Mahmoud Alhussami Biostatistics What is the biostatistics? A branch of applied math. that deals with collecting, organizing and interpreting data using well-defined procedures.

More information

Chapter 2: Tools for Exploring Univariate Data

Chapter 2: Tools for Exploring Univariate Data Stats 11 (Fall 2004) Lecture Note Introduction to Statistical Methods for Business and Economics Instructor: Hongquan Xu Chapter 2: Tools for Exploring Univariate Data Section 2.1: Introduction What is

More information

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages:

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages: Glossary The ISI glossary of statistical terms provides definitions in a number of different languages: http://isi.cbs.nl/glossary/index.htm Adjusted r 2 Adjusted R squared measures the proportion of the

More information

1-1. Chapter 1. Sampling and Descriptive Statistics by The McGraw-Hill Companies, Inc. All rights reserved.

1-1. Chapter 1. Sampling and Descriptive Statistics by The McGraw-Hill Companies, Inc. All rights reserved. 1-1 Chapter 1 Sampling and Descriptive Statistics 1-2 Why Statistics? Deal with uncertainty in repeated scientific measurements Draw conclusions from data Design valid experiments and draw reliable conclusions

More information

Units. Exploratory Data Analysis. Variables. Student Data

Units. Exploratory Data Analysis. Variables. Student Data Units Exploratory Data Analysis Bret Larget Departments of Botany and of Statistics University of Wisconsin Madison Statistics 371 13th September 2005 A unit is an object that can be measured, such as

More information

MATH 1150 Chapter 2 Notation and Terminology

MATH 1150 Chapter 2 Notation and Terminology MATH 1150 Chapter 2 Notation and Terminology Categorical Data The following is a dataset for 30 randomly selected adults in the U.S., showing the values of two categorical variables: whether or not the

More information

Lecture Notes 2: Variables and graphics

Lecture Notes 2: Variables and graphics Highlights: Lecture Notes 2: Variables and graphics Quantitative vs. qualitative variables Continuous vs. discrete and ordinal vs. nominal variables Frequency distributions Pie charts Bar charts Histograms

More information

Chapter 1. Looking at Data

Chapter 1. Looking at Data Chapter 1 Looking at Data Types of variables Looking at Data Be sure that each variable really does measure what you want it to. A poor choice of variables can lead to misleading conclusions!! For example,

More information

STT 315 This lecture is based on Chapter 2 of the textbook.

STT 315 This lecture is based on Chapter 2 of the textbook. STT 315 This lecture is based on Chapter 2 of the textbook. Acknowledgement: Author is thankful to Dr. Ashok Sinha, Dr. Jennifer Kaplan and Dr. Parthanil Roy for allowing him to use/edit some of their

More information

Comparing Measures of Central Tendency *

Comparing Measures of Central Tendency * OpenStax-CNX module: m11011 1 Comparing Measures of Central Tendency * David Lane This work is produced by OpenStax-CNX and licensed under the Creative Commons Attribution License 1.0 1 Comparing Measures

More information

Vocabulary: Data About Us

Vocabulary: Data About Us Vocabulary: Data About Us Two Types of Data Concept Numerical data: is data about some attribute that must be organized by numerical order to show how the data varies. For example: Number of pets Measure

More information

Descriptive statistics

Descriptive statistics Patrick Breheny February 6 Patrick Breheny to Biostatistics (171:161) 1/25 Tables and figures Human beings are not good at sifting through large streams of data; we understand data much better when it

More information

Descriptive Data Summarization

Descriptive Data Summarization Descriptive Data Summarization Descriptive data summarization gives the general characteristics of the data and identify the presence of noise or outliers, which is useful for successful data cleaning

More information

Statistics in medicine

Statistics in medicine Statistics in medicine Lecture 1- part 1: Describing variation, and graphical presentation Outline Sources of variation Types of variables Fatma Shebl, MD, MS, MPH, PhD Assistant Professor Chronic Disease

More information

Worksheet 2 - Basic statistics

Worksheet 2 - Basic statistics Worksheet 2 - Basic statistics Basic statistics references Fowler et al. (1998) -Chpts 1, 2, 3, 4, 5, 6, 9, 10, 11, 12, & 16 (16.1, 16.2, 16.3, 16.9,16.11-16.14) Holmes et al. (2006) - Chpt 4 & Sections

More information

What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty.

What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty. What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty. Statistics is a field of study concerned with the data collection,

More information

Performance of fourth-grade students on an agility test

Performance of fourth-grade students on an agility test Starter Ch. 5 2005 #1a CW Ch. 4: Regression L1 L2 87 88 84 86 83 73 81 67 78 83 65 80 50 78 78? 93? 86? Create a scatterplot Find the equation of the regression line Predict the scores Chapter 5: Understanding

More information

Last Lecture. Distinguish Populations from Samples. Knowing different Sampling Techniques. Distinguish Parameters from Statistics

Last Lecture. Distinguish Populations from Samples. Knowing different Sampling Techniques. Distinguish Parameters from Statistics Last Lecture Distinguish Populations from Samples Importance of identifying a population and well chosen sample Knowing different Sampling Techniques Distinguish Parameters from Statistics Knowing different

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Suhasini Subba Rao Review Our objective: to make confident statements about a parameter (aspect) in

More information

Resistant Measure - A statistic that is not affected very much by extreme observations.

Resistant Measure - A statistic that is not affected very much by extreme observations. Chapter 1.3 Lecture Notes & Examples Section 1.3 Describing Quantitative Data with Numbers (pp. 50-74) 1.3.1 Measuring Center: The Mean Mean - The arithmetic average. To find the mean (pronounced x bar)

More information

Descriptive Statistics C H A P T E R 5 P P

Descriptive Statistics C H A P T E R 5 P P Descriptive Statistics C H A P T E R 5 P P 1 1 0-130 Graphing data Frequency distributions Bar graphs Qualitative variable (categories) Bars don t touch Histograms Frequency polygons Quantitative variable

More information

MODULE 9 NORMAL DISTRIBUTION

MODULE 9 NORMAL DISTRIBUTION MODULE 9 NORMAL DISTRIBUTION Contents 9.1 Characteristics of a Normal Distribution........................... 62 9.2 Simple Areas Under the Curve................................. 63 9.3 Forward Calculations......................................

More information

Chapter 1 Statistical Inference

Chapter 1 Statistical Inference Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations

More information

Glossary for the Triola Statistics Series

Glossary for the Triola Statistics Series Glossary for the Triola Statistics Series Absolute deviation The measure of variation equal to the sum of the deviations of each value from the mean, divided by the number of values Acceptance sampling

More information

Math Review Sheet, Fall 2008

Math Review Sheet, Fall 2008 1 Descriptive Statistics Math 3070-5 Review Sheet, Fall 2008 First we need to know about the relationship among Population Samples Objects The distribution of the population can be given in one of the

More information

3.1 Measure of Center

3.1 Measure of Center 3.1 Measure of Center Calculate the mean for a given data set Find the median, and describe why the median is sometimes preferable to the mean Find the mode of a data set Describe how skewness affects

More information

Turning a research question into a statistical question.

Turning a research question into a statistical question. Turning a research question into a statistical question. IGINAL QUESTION: Concept Concept Concept ABOUT ONE CONCEPT ABOUT RELATIONSHIPS BETWEEN CONCEPTS TYPE OF QUESTION: DESCRIBE what s going on? DECIDE

More information

1 Measures of the Center of a Distribution

1 Measures of the Center of a Distribution 1 Measures of the Center of a Distribution Qualitative descriptions of the shape of a distribution are important and useful. But we will often desire the precision of numerical summaries as well. Two aspects

More information

In this investigation you will use the statistics skills that you learned the to display and analyze a cup of peanut M&Ms.

In this investigation you will use the statistics skills that you learned the to display and analyze a cup of peanut M&Ms. M&M Madness In this investigation you will use the statistics skills that you learned the to display and analyze a cup of peanut M&Ms. Part I: Categorical Analysis: M&M Color Distribution 1. Record the

More information

FRANKLIN UNIVERSITY PROFICIENCY EXAM (FUPE) STUDY GUIDE

FRANKLIN UNIVERSITY PROFICIENCY EXAM (FUPE) STUDY GUIDE FRANKLIN UNIVERSITY PROFICIENCY EXAM (FUPE) STUDY GUIDE Course Title: Probability and Statistics (MATH 80) Recommended Textbook(s): Number & Type of Questions: Probability and Statistics for Engineers

More information

Chapter 2 Class Notes Sample & Population Descriptions Classifying variables

Chapter 2 Class Notes Sample & Population Descriptions Classifying variables Chapter 2 Class Notes Sample & Population Descriptions Classifying variables Random Variables (RVs) are discrete quantitative continuous nominal qualitative ordinal Notation and Definitions: a Sample is

More information

- measures the center of our distribution. In the case of a sample, it s given by: y i. y = where n = sample size.

- measures the center of our distribution. In the case of a sample, it s given by: y i. y = where n = sample size. Descriptive Statistics: One of the most important things we can do is to describe our data. Some of this can be done graphically (you should be familiar with histograms, boxplots, scatter plots and so

More information

Chapter 1:Descriptive statistics

Chapter 1:Descriptive statistics Slide 1.1 Chapter 1:Descriptive statistics Descriptive statistics summarises a mass of information. We may use graphical and/or numerical methods Examples of the former are the bar chart and XY chart,

More information

Chapter 7: Statistics Describing Data. Chapter 7: Statistics Describing Data 1 / 27

Chapter 7: Statistics Describing Data. Chapter 7: Statistics Describing Data 1 / 27 Chapter 7: Statistics Describing Data Chapter 7: Statistics Describing Data 1 / 27 Categorical Data Four ways to display categorical data: 1 Frequency and Relative Frequency Table 2 Bar graph (Pareto chart)

More information

Visualizing Data: Basic Plot Types

Visualizing Data: Basic Plot Types Visualizing Data: Basic Plot Types Data Science 101 Stanford University, Department of Statistics Agenda Today s lecture focuses on these basic plot types: bar charts histograms boxplots scatter plots

More information

SESSION 5 Descriptive Statistics

SESSION 5 Descriptive Statistics SESSION 5 Descriptive Statistics Descriptive statistics are used to describe the basic features of the data in a study. They provide simple summaries about the sample and the measures. Together with simple

More information

BIOL 51A - Biostatistics 1 1. Lecture 1: Intro to Biostatistics. Smoking: hazardous? FEV (l) Smoke

BIOL 51A - Biostatistics 1 1. Lecture 1: Intro to Biostatistics. Smoking: hazardous? FEV (l) Smoke BIOL 51A - Biostatistics 1 1 Lecture 1: Intro to Biostatistics Smoking: hazardous? FEV (l) 1 2 3 4 5 No Yes Smoke BIOL 51A - Biostatistics 1 2 Box Plot a.k.a box-and-whisker diagram or candlestick chart

More information

Vocabulary: Samples and Populations

Vocabulary: Samples and Populations Vocabulary: Samples and Populations Concept Different types of data Categorical data results when the question asked in a survey or sample can be answered with a nonnumerical answer. For example if we

More information

OPEN GEODA WORKSHOP / CRASH COURSE FACILITATED BY M. KOLAK

OPEN GEODA WORKSHOP / CRASH COURSE FACILITATED BY M. KOLAK OPEN GEODA WORKSHOP / CRASH COURSE FACILITATED BY M. KOLAK WHAT IS GEODA? Software program that serves as an introduction to spatial data analysis Free Open Source Source code is available under GNU license

More information

Essentials of Statistics and Probability

Essentials of Statistics and Probability May 22, 2007 Department of Statistics, NC State University dbsharma@ncsu.edu SAMSI Undergrad Workshop Overview Practical Statistical Thinking Introduction Data and Distributions Variables and Distributions

More information

Business Statistics. Lecture 10: Course Review

Business Statistics. Lecture 10: Course Review Business Statistics Lecture 10: Course Review 1 Descriptive Statistics for Continuous Data Numerical Summaries Location: mean, median Spread or variability: variance, standard deviation, range, percentiles,

More information

MATH 10 INTRODUCTORY STATISTICS

MATH 10 INTRODUCTORY STATISTICS MATH 10 INTRODUCTORY STATISTICS Tommy Khoo Your friendly neighbourhood graduate student. Week 1 Chapter 1 Introduction What is Statistics? Why do you need to know Statistics? Technical lingo and concepts:

More information

1. Exploratory Data Analysis

1. Exploratory Data Analysis 1. Exploratory Data Analysis 1.1 Methods of Displaying Data A visual display aids understanding and can highlight features which may be worth exploring more formally. Displays should have impact and be

More information

The science of learning from data.

The science of learning from data. STATISTICS (PART 1) The science of learning from data. Numerical facts Collection of methods for planning experiments, obtaining data and organizing, analyzing, interpreting and drawing the conclusions

More information

Prentice Hall Stats: Modeling the World 2004 (Bock) Correlated to: National Advanced Placement (AP) Statistics Course Outline (Grades 9-12)

Prentice Hall Stats: Modeling the World 2004 (Bock) Correlated to: National Advanced Placement (AP) Statistics Course Outline (Grades 9-12) National Advanced Placement (AP) Statistics Course Outline (Grades 9-12) Following is an outline of the major topics covered by the AP Statistics Examination. The ordering here is intended to define the

More information

AIM HIGH SCHOOL. Curriculum Map W. 12 Mile Road Farmington Hills, MI (248)

AIM HIGH SCHOOL. Curriculum Map W. 12 Mile Road Farmington Hills, MI (248) AIM HIGH SCHOOL Curriculum Map 2923 W. 12 Mile Road Farmington Hills, MI 48334 (248) 702-6922 www.aimhighschool.com COURSE TITLE: Statistics DESCRIPTION OF COURSE: PREREQUISITES: Algebra 2 Students will

More information

Chapter 4. Displaying and Summarizing. Quantitative Data

Chapter 4. Displaying and Summarizing. Quantitative Data STAT 141 Introduction to Statistics Chapter 4 Displaying and Summarizing Quantitative Data Bin Zou (bzou@ualberta.ca) STAT 141 University of Alberta Winter 2015 1 / 31 4.1 Histograms 1 We divide the range

More information
