Analytical Graphing: Let's start with the best graph ever made


1 Analytical Graphing

Let's start with the best graph ever made.

Probably the best statistical graphic ever drawn, this map by Charles Joseph Minard portrays the losses suffered by Napoleon's army in the Russian campaign. Beginning at the Polish-Russian border, the thick band shows the size of the army at each position. The path of Napoleon's retreat from Moscow in the bitterly cold winter is depicted by the dark lower band, which is tied to temperature and time scales. The graph illustrates an amazing point: how much an army can dwindle without losing a single major battle.

2 When is a graph appropriate?

- Always for data exploration
- Often for data analysis and to develop predictions (models) and experimental designs
- Sometimes for presentations
- Less often for publications

Data Exploration

Data exploration is not snooping in the pejorative sense. Exploration is a necessary and desired operation for:
- Checking data for unusual values
- Making sure the data meet the assumptions of the chosen form of analysis, e.g. normality, homogeneity of variances, linearity (in regression approaches)
- Deciding (sometimes) what sort of analysis to do. Ideally this will have been done before initiating a study.
- Looking for patterns that may not be expected or apparent. This is indeed snooping, but it is an essential part of hypothesis formation.

3 Data Exploration

- Checking data for unusual values
- Making sure the data meet the assumptions of the chosen form of analysis

See ourworld (pop_86): determining distributions and outliers. Will a transformation help?

[Figure: two histograms of the population of countries (1986); axes: Count and Proportion per Bar]

4 Data Exploration

Data exploration is not snooping in the pejorative sense. Exploration is a necessary and desired operation for:
- Checking data for unusual values
- Making sure the data meet the assumptions of the chosen form of analysis, e.g. normality, homogeneity of variances, linearity (in regression approaches)

The relationship between birth and death rates (ourworld): is it linear, or is there perhaps a more appropriate model?

[Figure: scatterplot of DEATH_82 and BIRTH_82]

5 Clearly not linear. The smooth uses the LOESS procedure (locally weighted scatterplot smoothing), a non-parametric regression method that combines multiple regression models in a k-nearest-neighbor-based meta-model.

[Figure: scatterplot of DEATH_82 and BIRTH_82 with a LOESS smoother]

When is a graph appropriate?

Often for data analysis, e.g.:
- To understand the nature of interaction terms (more later)
- To understand the power of a test. Say we wanted to determine the sample size for an experiment where we thought the response would be around a particular value (the alternative hypothesis), the standard deviation was about 8, and we were willing to relax alpha.
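The sample-size question above can be explored numerically before drawing any power curve. The following is a minimal sketch, assuming a two-sided one-sample z-test as an approximation; only SD = 8 comes from the slide. The effect size `DELTA` and the two alpha levels (0.05 and 0.10) are hypothetical values chosen for illustration, and `power_one_sample` is a made-up helper name.

```python
from math import sqrt
from statistics import NormalDist


def power_one_sample(delta, sd, n, alpha):
    """Approximate power of a two-sided one-sample z-test.

    delta is the true mean minus the null (population) mean.
    """
    z = NormalDist()
    shift = delta / (sd / sqrt(n))         # true effect in standard-error units
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)  # two-sided critical value
    # Probability of landing in either rejection region
    return (1.0 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


SD = 8.0     # standard deviation, from the slide
DELTA = 5.0  # hypothetical difference between alternative and population mean

for n in (5, 10, 20, 40):
    p_strict = power_one_sample(DELTA, SD, n, 0.05)   # assumed stricter alpha
    p_relaxed = power_one_sample(DELTA, SD, n, 0.10)  # assumed relaxed alpha
    print(f"n={n:3d}  power(alpha=0.05)={p_strict:.3f}  power(alpha=0.10)={p_relaxed:.3f}")
```

Plotting power against n for each alpha gives curves of the kind described here; at any fixed n the relaxed alpha yields the higher power.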

6 For example, the effect of relaxing alpha on power

[Figure: two power curves, one per alpha level, with SD = 8; axes: Power and Sample Size (per cell)]

When is a graph appropriate?

Sometimes for presentations
- The idea is to communicate information quickly.
- Be sure you know why you are presenting the graph: is it to convey stats or some other information? (We will talk about this more later.)
- Graphs should be simple and not contain too much information.
- Never have a graph that is not interpretable: so many factors involved that no one could figure it out, or worse...

7 I know you can't really see this, but...

[Figure: scatterplot matrix of the ourworld variables, including POP_1983, POP_1986, BIRTH_82, BIRTH_RT, DEATH_82, DEATH_RT, BABYMT82, BABYMORT, LIFE_EXP, GNP_82, GNP_86, GDP_CAP, LOG_GDP, EDUC_84, EDUC, HEALTH84 and HEALTH]

Graphs like this are usually presented to demonstrate how much work the researcher has done. What they really convey is that he or she has not adequately prepared the presentation.

When is a graph appropriate?

Less often for publications
- The idea is to communicate information that is too complex to leave in tables or text.
- Graphs typically depict rather than present information (you have to read across to the axes to get numbers). Hence, if precise bits of information are important to the argument being made, use tables.
- If a graph is presented, it must be important to the argument being made in the text (no fluff graphs).
- Information cannot be presented twice (e.g. table and figure, text and figure).
- If a graph is presented, it must be interpretable. You should be able to understand the purpose and content of the figure directly from the legend.

8 Basics of analytical graph theory

- Graph types imply a basis of logic and are not always interchangeable.
- Even interchangeable graph types are not always equivalent (some are just non-informative).
- Be very clear about what you are trying to convey: models, stats or data structure.
- Graph construction (axes, scales etc.) may obscure or make clear the points you are trying to make.
- Graph trickery is usually just that and typically subtracts from the depiction.

Graph types imply a basis of logic and are not always interchangeable
- Summary charts
- Density charts
- Scatterplots, quantile plots and probability plots

9 Summary Charts

There are a series of general graphical displays useful for characterizing the relationship between independent variables (usually categorical) and summary statistics of dependent variables (usually continuous). An example would be a bar graph of the relationship between education and income (see survey2 data).

Examples of continuous and categorical variables

Categorical                          Continuous
Gender (male, female)                Hormone level
Nationality (French, Italian)        Location (Latitude, Longitude)
Species (Human, Chimp)               Phylogenetic distance
Color (red, green, blue)             Color (wavelength)
Age Group (Young, Old)               Age (years, days)
Height Group (Short, Medium, Tall)   Height (cm, inches)
Weight Group (Thin, Obese)           Weight (grams, pounds)
Speed (Fast, Slow)                   Speed (cm/sec)
Temperature (Cold, Warm)             Temperature (degrees C)

Some types of summary charts:

10 [Figure: six summary charts of INCOME by EDUC (no grad hs, hs grad, some college, college grad): Bar, Dot, Line, Profile, Pyramid and Pie]

Which conveys the information most clearly? How about the comparisons of interest?

[Figure: charts of INCOME by EDUC, grouped by SEX (Female, Male)]

11 Density Charts

The density of a sample is the relative concentration of data points in intervals across the range of the distribution. A histogram is one way to display the density of a quantitative variable; box plots, dot or symmetric dot density plots, frequency polygons, fuzzygrams, jitter plots, density stripes, and histograms with data-driven bar widths are others.

[Figure: histogram; x-axis: Length (mm)]
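The idea of density as "relative concentration of data points in intervals" can be made concrete by binning by hand. This is a minimal sketch, assuming equal-width bins; the `lengths_mm` values are hypothetical and `histogram` is an illustrative helper, not a function from any package used in the slides.

```python
def histogram(values, n_bins):
    """Return bin edges, counts and proportions for equal-width bins.

    Assumes the values are not all identical (bin width must be positive).
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum value falls on the last edge, so clamp it into the last bin
        i = min(int((v - lo) / width), n_bins - 1)
        counts[i] += 1
    proportions = [c / len(values) for c in counts]  # "proportion per bar"
    edges = [lo + k * width for k in range(n_bins + 1)]
    return edges, counts, proportions


# Hypothetical lengths in mm, for illustration only
lengths_mm = [12, 15, 15, 18, 21, 22, 22, 23, 25, 27, 30, 31, 35, 41, 52]
edges, counts, proportions = histogram(lengths_mm, 4)

for left, right, c, p in zip(edges, edges[1:], counts, proportions):
    print(f"{left:5.1f} to {right:5.1f} mm: count={c:2d}  proportion={p:.2f}")
```

Counts and proportions describe the same shape; the proportion scale is simply the count scale divided by the sample size, which is why a histogram can carry both axes.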

12 Features of a BOXPLOT

Rather than comparing sample values to the normal distribution (mean, standard deviation, and so on), box plots show robust (what does this mean?) statistics (median, quartiles, and so on).

[Figure: annotated box plot marking the confidence interval, hinges, median, mean and outliers; each of the four sections holds 25% of the data; axis: Statistical Range]

Raw Data Plots, e.g. Scatterplots

Scatterplots are probably the most common form of graphical display. The key feature of scatterplots is that raw data are plotted (in contrast to summary data, as in summary charts). Regression lines with confidence bands or smoothers (e.g. linear, non-linear) can be added to help explain relationships among variables. An example is the relationship between mussel height and length, and between mussel height and mussel mass. How do we estimate the length and mass of mussels?
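One way to see what "robust" means for the box plot statistics is to compute them next to the mean on data containing a single extreme value. This is a minimal sketch, assuming Tukey-style hinges (medians of the lower and upper halves) and the common 1.5 x spread rule for flagging outliers; the data and the helper name `box_plot_stats` are hypothetical.

```python
from statistics import mean, median


def box_plot_stats(values):
    """Median, hinges, whisker ends and outliers for a box plot."""
    xs = sorted(values)
    n = len(xs)
    half = (n + 1) // 2  # the halves share the middle value when n is odd
    lower_hinge = median(xs[:half])
    upper_hinge = median(xs[n - half:])
    spread = upper_hinge - lower_hinge
    lo_fence = lower_hinge - 1.5 * spread
    hi_fence = upper_hinge + 1.5 * spread
    inside = [x for x in xs if lo_fence <= x <= hi_fence]
    return {
        "median": median(xs),
        "lower_hinge": lower_hinge,
        "upper_hinge": upper_hinge,
        "whiskers": (inside[0], inside[-1]),  # most extreme non-outlying values
        "outliers": [x for x in xs if x < lo_fence or x > hi_fence],
    }


data = [2, 4, 4, 5, 6, 7, 8, 9, 30]  # hypothetical sample with one extreme value
stats = box_plot_stats(data)
print(stats)
print("mean:", round(mean(data), 2))
```

In this sample the extreme value pulls the mean well above the median, while the median and hinges are the same as they would be if that value were only slightly larger than the rest.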

13 [Figure: scatterplot of mussel Height and Length with non-linear and linear smoothing; each point is a mussel]
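Non-linear smoothing of the LOESS kind fits a small weighted regression around each point, using only that point's nearest neighbours. The following is a minimal sketch of that idea, assuming local straight-line fits with tricube weights and no robustness iterations, so it is simpler than what a statistics package does. The `x` and `y` values are hypothetical, and `loess` and `frac` are illustrative names.

```python
def loess(xs, ys, frac=0.5):
    """Locally weighted linear smoothing; returns one fitted value per x.

    Assumes at least three points and distinct x values.
    """
    n = len(xs)
    k = max(3, round(frac * n))  # number of nearest neighbours per local fit
    fitted = []
    for x0 in xs:
        # Bandwidth: distance from x0 to its k-th nearest neighbour (itself included)
        h = sorted(abs(x - x0) for x in xs)[k - 1]
        # Tricube weights: nearby points count most, points at or beyond h count zero
        w = [(1 - (abs(x - x0) / h) ** 3) ** 3 if abs(x - x0) < h else 0.0 for x in xs]
        sw = sum(w)
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))  # weighted least-squares line at x0
    return fitted


# Hypothetical curved relationship, for illustration only
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2.1, 3.9, 6.2, 7.8, 9.1, 9.9, 10.4, 10.8, 11.0, 11.1]
smooth = loess(x, y, frac=0.5)

for xi, yi, si in zip(x, y, smooth):
    print(f"x={xi:2d}  y={yi:5.1f}  smooth={si:6.2f}")
```

A smaller `frac` follows the data more closely and a larger one gives a stiffer curve; on perfectly linear data the local fits reproduce the straight line.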

14 Scatterplots, quantile plots and probability plots

Quantile plots and probability plots are useful for studying the distribution of a variable. The Quantile Plots procedure produces quantile plots, or Q plots. Unlike probability plots, which compare a sample to a theoretical probability distribution, a quantile plot compares a sample to its own quantiles (a one-sample plot) or to another sample (a two-sample, or Q-Q, plot). The quantile of a sample is the data point corresponding to a given fraction of the data. See ourworld (pop_1986).

Features of a Quantile Plot

[Figure: quantile plot of POP_1986; y-axis: Fraction of Data. Annotations: the distribution of the data; the distribution of quantiles (should be uniform but is subject to sample size); an example reading of the form "a given percentage of countries had populations less than or equal to a given number of millions of people"]
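The points of a one-sample quantile plot are just the sorted data paired with the fraction of the data each value represents. This is a minimal sketch, assuming the (i - 0.5)/n plotting convention, which is one of several in use; the population figures are hypothetical and both helper names are made up for illustration.

```python
def quantile_plot_points(values):
    """Pair each sorted value with its fraction of the data, (i - 0.5) / n."""
    xs = sorted(values)
    n = len(xs)
    return [(x, (i + 0.5) / n) for i, x in enumerate(xs)]


def fraction_at_or_below(values, threshold):
    """Fraction of the sample that is less than or equal to threshold."""
    return sum(1 for v in values if v <= threshold) / len(values)


# Hypothetical country populations in millions, for illustration only
pop_millions = [0.3, 0.8, 1.5, 2.4, 4.0, 6.5, 9.8, 16.0, 25.0, 55.0, 120.0, 780.0]

for value, fraction in quantile_plot_points(pop_millions):
    print(f"{value:7.1f}  {fraction:.3f}")

print("fraction at or below 10 million:", fraction_at_or_below(pop_millions, 10.0))
```

Reading across from a fraction on the vertical axis to the curve, and then down to the data axis, gives the kind of statement annotated on the slide's figure.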

15 Scatterplots, quantile plots and probability plots

A Probability Plot plots the values of a variable against the corresponding percentage points of a theoretical distribution: normal, chi-square, t, F, uniform, binomial, logistic, exponential, gamma, Weibull, or Studentized range. Graphs like this are called probability plots, or P plots. You can also plot the expected values of one variable against those of another (P-P plot). These graphs are very important for determining whether data are in need of transformation. See ourworld (pop_1986).

Features of a Probability Plot

[Figure: two probability plots of pop_1986, one with no transformation and one after a log transformation]
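Whether a transformation helps can be judged by how straight the normal probability plot is before and after. The sketch below assumes a normal reference distribution, the (i - 0.5)/n plotting positions, and the correlation between sorted data and normal scores as a rough straightness measure. The population values are hypothetical, and base-10 logs are an assumption made for the example.

```python
from math import log10, sqrt
from statistics import NormalDist, mean


def normal_scores(n):
    """Expected standard-normal values at the plotting positions (i - 0.5) / n."""
    z = NormalDist()
    return [z.inv_cdf((i + 0.5) / n) for i in range(n)]


def pearson_r(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / sqrt(saa * sbb)


def probability_plot_r(values):
    """Straightness of the normal probability plot (1.0 means a perfect line)."""
    xs = sorted(values)
    return pearson_r(xs, normal_scores(len(xs)))


# Hypothetical country populations in millions, for illustration only
pop_millions = [0.3, 0.8, 1.5, 2.4, 4.0, 6.5, 9.8, 16.0, 25.0, 55.0, 120.0, 780.0]

r_raw = probability_plot_r(pop_millions)
r_log = probability_plot_r([log10(p) for p in pop_millions])
print(f"no transformation: r = {r_raw:.3f}")
print(f"log transformation: r = {r_log:.3f}")
```

For strongly right-skewed data like this sample, the untransformed plot bends sharply and the log-transformed plot is much closer to a straight line, which is the visual cue that the transformation is worth using.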

16 Let's Play

Activity 1: Graph construction

Draw the most appropriate graph, given the data set and type provided.
- Think about the nature of the information and how best to depict it.
- Label both the x and y axes.
- Use appropriate scales for both axes. Think about the number of ticks on each axis and the labeling of tick marks.
- Make sure the elements (points, bars, lines etc.) are crafted in a way that simplifies interpretation (think about color, pattern, shape of elements, and whether or not to depict a trend).
- Provide a figure legend that is descriptive: the reader should be able to interpret the figure based on the graph and legend.

17 Average size of Seastars (Pisaster) over time
[Table: Age (years), Diameter (mm)]

Total commercial abalone landings (pounds) over time in California
[Table: Year, Abalone Landings]

18 The relationship between time to run a mile and maximum oxygen consumption
[Table: VO2 max (oxygen consumption, ml/(kg min)), Runtime (minutes per mile)]

Size distribution of Limpets
[Table: Limpet size (mm)]

19 Two variables: Number of Blue whales as a function of period and location

              Southern Hemisphere   North Pacific   North Atlantic
Prewhaling    ~175,                 4,9             1,
Current       ~2,                   ~2,             ~

Extra slides

20 Basics of analytical graph theory

- Graph types imply a basis of logic and are not always interchangeable.
- Even interchangeable graph types are not always equivalent (some are just non-informative).
- Be very clear about what you are trying to convey: models, stats or data structure.
- Graph construction (axes, scales etc.) may obscure or make clear the points you are trying to make.
- Graph trickery is usually just that and typically subtracts from the depiction.

The underlying basis of the graph

There are two general bases for any data graph that will be presented or published:
- To display data (hopefully in the most efficient way)
- To convey information about statistics associated with the data
- Both of the above

Although these may not appear to present a conflict, oftentimes there is one. Here is an example.

21 Error Bars

Be very careful: error bars convey meaning, of at least two sorts.

- Estimate of variability for subjects in that category, irrespective of strata or statistical assumptions
  - Of use for showing spread in sampled data
  - Of no use for conveying inferential statistics
- Estimate of variability for subjects in that category, with respect to strata and statistical assumptions
  - Of no use for showing spread in sampled data
  - Of use for conveying inferential statistics

See typing. How and why are these two graphs different?

[Figure: two plots of SPEED by EQUIPMNT (electric, plain old, word process): one without respect to strata and statistical assumptions, the other showing Least Squares Means with respect to strata and statistical assumptions]
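The two sorts of error bar can be computed side by side to see why the graphs differ. This is a sketch under stated assumptions, not a reconstruction of the slide's analysis: it takes the first sort to be each group's own standard deviation and the second to be a model-based standard error from a one-way ANOVA with a pooled error variance. The typing speeds are hypothetical, and `group_sd` and `pooled_se` are illustrative names.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical typing speeds by equipment type, for illustration only
typing_speed = {
    "electric": [52, 47, 51, 49],
    "plain old": [42, 30, 36, 48],
    "word process": [67, 73, 70, 75],
}


def group_sd(groups):
    """Spread of the raw data within each group (no model assumptions)."""
    return {name: stdev(vals) for name, vals in groups.items()}


def pooled_se(groups):
    """Standard error of each group mean from a one-way ANOVA error term.

    Assumes equal variances across groups, so every group shares one
    pooled error mean square.
    """
    sse = sum(sum((v - mean(vals)) ** 2 for v in vals) for vals in groups.values())
    df_error = sum(len(vals) for vals in groups.values()) - len(groups)
    mse = sse / df_error
    return {name: sqrt(mse / len(vals)) for name, vals in groups.items()}


sd = group_sd(typing_speed)
se = pooled_se(typing_speed)
for name, vals in typing_speed.items():
    print(f"{name:13s} mean={mean(vals):6.2f}  SD={sd[name]:5.2f}  model SE={se[name]:5.2f}")
```

With equal group sizes the model-based bars come out the same length for every group, while the standard-deviation bars differ from group to group. The first kind describes the sampled data; the second supports comparisons among means under the model's assumptions.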

22 Basics of analytical graph theory

- Graph types imply a basis of logic and are not always interchangeable.
- Even interchangeable graph types are not always equivalent (some are just non-informative).
- Be very clear about what you are trying to convey: models, stats or data structure.
- Graph construction (axes, scales etc.) may obscure or make clear the points you are trying to make.
- Graph trickery is usually just that and typically subtracts from the depiction.

Which is best?

[Figure: INCOME by EDUC (no grad hs, hs grad, some college, college grad), grouped by SEX (Female, Male)]

Analytical Graphing. lets start with the best graph ever made

Analytical Graphing. lets start with the best graph ever made Analytical Graphing lets start with the best graph ever made Probably the best statistical graphic ever drawn, this map by Charles Joseph Minard portrays the losses suffered by Napoleon's army in the Russian

More information

Introduction to hypothesis testing

Introduction to hypothesis testing Introduction to hypothesis testing Review: Logic of Hypothesis Tests Usually, we test (attempt to falsify) a null hypothesis (H 0 ): includes all possibilities except prediction in hypothesis (H A ) If

More information

What is statistics? Statistics is the science of: Collecting information. Organizing and summarizing the information collected

What is statistics? Statistics is the science of: Collecting information. Organizing and summarizing the information collected What is statistics? Statistics is the science of: Collecting information Organizing and summarizing the information collected Analyzing the information collected in order to draw conclusions Two types

More information

AP Final Review II Exploring Data (20% 30%)

AP Final Review II Exploring Data (20% 30%) AP Final Review II Exploring Data (20% 30%) Quantitative vs Categorical Variables Quantitative variables are numerical values for which arithmetic operations such as means make sense. It is usually a measure

More information

Elementary Statistics

Elementary Statistics Elementary Statistics Q: What is data? Q: What does the data look like? Q: What conclusions can we draw from the data? Q: Where is the middle of the data? Q: Why is the spread of the data important? Q:

More information

Talking feet: Scatterplots and lines of best fit

Talking feet: Scatterplots and lines of best fit Talking feet: Scatterplots and lines of best fit Student worksheet What does your foot say about your height? Can you predict people s height by how long their feet are? If a Grade 10 student s foot is

More information

Chapter2 Description of samples and populations. 2.1 Introduction.

Chapter2 Description of samples and populations. 2.1 Introduction. Chapter2 Description of samples and populations. 2.1 Introduction. Statistics=science of analyzing data. Information collected (data) is gathered in terms of variables (characteristics of a subject that

More information

Vocabulary: Data About Us

Vocabulary: Data About Us Vocabulary: Data About Us Two Types of Data Concept Numerical data: is data about some attribute that must be organized by numerical order to show how the data varies. For example: Number of pets Measure

More information

Stat 101 Exam 1 Important Formulas and Concepts 1

Stat 101 Exam 1 Important Formulas and Concepts 1 1 Chapter 1 1.1 Definitions Stat 101 Exam 1 Important Formulas and Concepts 1 1. Data Any collection of numbers, characters, images, or other items that provide information about something. 2. Categorical/Qualitative

More information

Units. Exploratory Data Analysis. Variables. Student Data

Units. Exploratory Data Analysis. Variables. Student Data Units Exploratory Data Analysis Bret Larget Departments of Botany and of Statistics University of Wisconsin Madison Statistics 371 13th September 2005 A unit is an object that can be measured, such as

More information

1-1. Chapter 1. Sampling and Descriptive Statistics by The McGraw-Hill Companies, Inc. All rights reserved.

1-1. Chapter 1. Sampling and Descriptive Statistics by The McGraw-Hill Companies, Inc. All rights reserved. 1-1 Chapter 1 Sampling and Descriptive Statistics 1-2 Why Statistics? Deal with uncertainty in repeated scientific measurements Draw conclusions from data Design valid experiments and draw reliable conclusions

More information

MATH 1150 Chapter 2 Notation and Terminology

MATH 1150 Chapter 2 Notation and Terminology MATH 1150 Chapter 2 Notation and Terminology Categorical Data The following is a dataset for 30 randomly selected adults in the U.S., showing the values of two categorical variables: whether or not the

More information

Chapter 2: Tools for Exploring Univariate Data

Chapter 2: Tools for Exploring Univariate Data Stats 11 (Fall 2004) Lecture Note Introduction to Statistical Methods for Business and Economics Instructor: Hongquan Xu Chapter 2: Tools for Exploring Univariate Data Section 2.1: Introduction What is

More information

Resistant Measure - A statistic that is not affected very much by extreme observations.

Resistant Measure - A statistic that is not affected very much by extreme observations. Chapter 1.3 Lecture Notes & Examples Section 1.3 Describing Quantitative Data with Numbers (pp. 50-74) 1.3.1 Measuring Center: The Mean Mean - The arithmetic average. To find the mean (pronounced x bar)

More information

Performance of fourth-grade students on an agility test

Performance of fourth-grade students on an agility test Starter Ch. 5 2005 #1a CW Ch. 4: Regression L1 L2 87 88 84 86 83 73 81 67 78 83 65 80 50 78 78? 93? 86? Create a scatterplot Find the equation of the regression line Predict the scores Chapter 5: Understanding

More information

Descriptive statistics

Descriptive statistics Patrick Breheny February 6 Patrick Breheny to Biostatistics (171:161) 1/25 Tables and figures Human beings are not good at sifting through large streams of data; we understand data much better when it

More information

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages:

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages: Glossary The ISI glossary of statistical terms provides definitions in a number of different languages: http://isi.cbs.nl/glossary/index.htm Adjusted r 2 Adjusted R squared measures the proportion of the

More information

Graphing Data. Example:

Graphing Data. Example: Graphing Data Bar graphs and line graphs are great for looking at data over time intervals, or showing the rise and fall of a quantity over the passage of time. Example: Auto Sales by Year Year Number

More information

What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty.

What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty. What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty. Statistics is a field of study concerned with the data collection,

More information

STT 315 This lecture is based on Chapter 2 of the textbook.

STT 315 This lecture is based on Chapter 2 of the textbook. STT 315 This lecture is based on Chapter 2 of the textbook. Acknowledgement: Author is thankful to Dr. Ashok Sinha, Dr. Jennifer Kaplan and Dr. Parthanil Roy for allowing him to use/edit some of their

More information

Descriptive Statistics-I. Dr Mahmoud Alhussami

Descriptive Statistics-I. Dr Mahmoud Alhussami Descriptive Statistics-I Dr Mahmoud Alhussami Biostatistics What is the biostatistics? A branch of applied math. that deals with collecting, organizing and interpreting data using well-defined procedures.

More information

Chapter 1. Looking at Data

Chapter 1. Looking at Data Chapter 1 Looking at Data Types of variables Looking at Data Be sure that each variable really does measure what you want it to. A poor choice of variables can lead to misleading conclusions!! For example,

More information

Lecture Notes 2: Variables and graphics

Lecture Notes 2: Variables and graphics Highlights: Lecture Notes 2: Variables and graphics Quantitative vs. qualitative variables Continuous vs. discrete and ordinal vs. nominal variables Frequency distributions Pie charts Bar charts Histograms

More information

Vocabulary: Samples and Populations

Vocabulary: Samples and Populations Vocabulary: Samples and Populations Concept Different types of data Categorical data results when the question asked in a survey or sample can be answered with a nonnumerical answer. For example if we

More information

Worksheet 2 - Basic statistics

Worksheet 2 - Basic statistics Worksheet 2 - Basic statistics Basic statistics references Fowler et al. (1998) -Chpts 1, 2, 3, 4, 5, 6, 9, 10, 11, 12, & 16 (16.1, 16.2, 16.3, 16.9,16.11-16.14) Holmes et al. (2006) - Chpt 4 & Sections

More information

Comparing Measures of Central Tendency *

Comparing Measures of Central Tendency * OpenStax-CNX module: m11011 1 Comparing Measures of Central Tendency * David Lane This work is produced by OpenStax-CNX and licensed under the Creative Commons Attribution License 1.0 1 Comparing Measures

More information

3.1 Measure of Center

3.1 Measure of Center 3.1 Measure of Center Calculate the mean for a given data set Find the median, and describe why the median is sometimes preferable to the mean Find the mode of a data set Describe how skewness affects

More information

AIM HIGH SCHOOL. Curriculum Map W. 12 Mile Road Farmington Hills, MI (248)

AIM HIGH SCHOOL. Curriculum Map W. 12 Mile Road Farmington Hills, MI (248) AIM HIGH SCHOOL Curriculum Map 2923 W. 12 Mile Road Farmington Hills, MI 48334 (248) 702-6922 www.aimhighschool.com COURSE TITLE: Statistics DESCRIPTION OF COURSE: PREREQUISITES: Algebra 2 Students will

More information

STAT 200 Chapter 1 Looking at Data - Distributions

STAT 200 Chapter 1 Looking at Data - Distributions STAT 200 Chapter 1 Looking at Data - Distributions What is Statistics? Statistics is a science that involves the design of studies, data collection, summarizing and analyzing the data, interpreting the

More information

M 140 Test 1 B Name (1 point) SHOW YOUR WORK FOR FULL CREDIT! Problem Max. Points Your Points Total 75

M 140 Test 1 B Name (1 point) SHOW YOUR WORK FOR FULL CREDIT! Problem Max. Points Your Points Total 75 M 140 est 1 B Name (1 point) SHOW YOUR WORK FOR FULL CREDI! Problem Max. Points Your Points 1-10 10 11 10 12 3 13 4 14 18 15 8 16 7 17 14 otal 75 Multiple choice questions (1 point each) For questions

More information

MATH 10 INTRODUCTORY STATISTICS

MATH 10 INTRODUCTORY STATISTICS MATH 10 INTRODUCTORY STATISTICS Tommy Khoo Your friendly neighbourhood graduate student. Week 1 Chapter 1 Introduction What is Statistics? Why do you need to know Statistics? Technical lingo and concepts:

More information

A SHORT INTRODUCTION TO PROBABILITY

A SHORT INTRODUCTION TO PROBABILITY A Lecture for B.Sc. 2 nd Semester, Statistics (General) A SHORT INTRODUCTION TO PROBABILITY By Dr. Ajit Goswami Dept. of Statistics MDKG College, Dibrugarh 19-Apr-18 1 Terminology The possible outcomes

More information

Descriptive Statistics C H A P T E R 5 P P

Descriptive Statistics C H A P T E R 5 P P Descriptive Statistics C H A P T E R 5 P P 1 1 0-130 Graphing data Frequency distributions Bar graphs Qualitative variable (categories) Bars don t touch Histograms Frequency polygons Quantitative variable

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Suhasini Subba Rao Review Our objective: to make confident statements about a parameter (aspect) in

More information

Descriptive Data Summarization

Descriptive Data Summarization Descriptive Data Summarization Descriptive data summarization gives the general characteristics of the data and identify the presence of noise or outliers, which is useful for successful data cleaning

More information

Essentials of Statistics and Probability

Essentials of Statistics and Probability May 22, 2007 Department of Statistics, NC State University dbsharma@ncsu.edu SAMSI Undergrad Workshop Overview Practical Statistical Thinking Introduction Data and Distributions Variables and Distributions

More information

SESSION 5 Descriptive Statistics

SESSION 5 Descriptive Statistics SESSION 5 Descriptive Statistics Descriptive statistics are used to describe the basic features of the data in a study. They provide simple summaries about the sample and the measures. Together with simple

More information

Statistical Concepts. Constructing a Trend Plot

Statistical Concepts. Constructing a Trend Plot Module 1: Review of Basic Statistical Concepts 1.2 Plotting Data, Measures of Central Tendency and Dispersion, and Correlation Constructing a Trend Plot A trend plot graphs the data against a variable

More information

1.3.1 Measuring Center: The Mean

1.3.1 Measuring Center: The Mean 1.3.1 Measuring Center: The Mean Mean - The arithmetic average. To find the mean (pronounced x bar) of a set of observations, add their values and divide by the number of observations. If the n observations

More information

Chapter 1 Statistical Inference

Chapter 1 Statistical Inference Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations

More information

Statistics in medicine

Statistics in medicine Statistics in medicine Lecture 1- part 1: Describing variation, and graphical presentation Outline Sources of variation Types of variables Fatma Shebl, MD, MS, MPH, PhD Assistant Professor Chronic Disease

More information

In this investigation you will use the statistics skills that you learned the to display and analyze a cup of peanut M&Ms.

In this investigation you will use the statistics skills that you learned the to display and analyze a cup of peanut M&Ms. M&M Madness In this investigation you will use the statistics skills that you learned the to display and analyze a cup of peanut M&Ms. Part I: Categorical Analysis: M&M Color Distribution 1. Record the

More information

A C E. Answers Investigation 4. Applications

A C E. Answers Investigation 4. Applications Answers Applications 1. 1 student 2. You can use the histogram with 5-minute intervals to determine the number of students that spend at least 15 minutes traveling to school. To find the number of students,

More information

Glossary for the Triola Statistics Series

Glossary for the Triola Statistics Series Glossary for the Triola Statistics Series Absolute deviation The measure of variation equal to the sum of the deviations of each value from the mean, divided by the number of values Acceptance sampling

More information

Collecting and Reporting Data

Collecting and Reporting Data Types of Data Data can be classified as qualitative or quantitative: Qualitative data Are observed rather than measured Include written descriptions, videos, photographs, or live observations Examples

More information

MA30S APPLIED UNIT F: DATA MANAGEMENT CLASS NOTES

MA30S APPLIED UNIT F: DATA MANAGEMENT CLASS NOTES 1 MA30S APPLIED UNIT F: DATA MANAGEMENT CLASS NOTES 1. We represent mathematical information in more ways than just using equations! Often a simple graph or chart or picture can represent a lot of information.

More information

Background to Statistics

Background to Statistics FACT SHEET Background to Statistics Introduction Statistics include a broad range of methods for manipulating, presenting and interpreting data. Professional scientists of all kinds need to be proficient

More information

Sem. 1 Review Ch. 1-3

Sem. 1 Review Ch. 1-3 AP Stats Sem. 1 Review Ch. 1-3 Name 1. You measure the age, marital status and earned income of an SRS of 1463 women. The number and type of variables you have measured is a. 1463; all quantitative. b.

More information

Last Lecture. Distinguish Populations from Samples. Knowing different Sampling Techniques. Distinguish Parameters from Statistics

Last Lecture. Distinguish Populations from Samples. Knowing different Sampling Techniques. Distinguish Parameters from Statistics Last Lecture Distinguish Populations from Samples Importance of identifying a population and well chosen sample Knowing different Sampling Techniques Distinguish Parameters from Statistics Knowing different

More information

PS5: Two Variable Statistics LT3: Linear regression LT4: The test of independence.

PS5: Two Variable Statistics LT3: Linear regression LT4: The test of independence. PS5: Two Variable Statistics LT3: Linear regression LT4: The test of independence. Example by eye. On a hot day, nine cars were left in the sun in a car parking lot. The length of time each car was left

More information

Review for Exam #1. Chapter 1. The Nature of Data. Definitions. Population. Sample. Quantitative data. Qualitative (attribute) data

Review for Exam #1. Chapter 1. The Nature of Data. Definitions. Population. Sample. Quantitative data. Qualitative (attribute) data Review for Exam #1 1 Chapter 1 Population the complete collection of elements (scores, people, measurements, etc.) to be studied Sample a subcollection of elements drawn from a population 11 The Nature

More information
