Math 141. Quantile-Quantile Plots. Albyn Jones, Library 304.
1 Math 141: Quantile-Quantile Plots. Albyn Jones, Library 304, jones@reed.edu, jones/courses/141
2 Outline: Quantile-Quantile Plots. Topics: Quantiles and Order Statistics; Quantile-Quantile Plots; Normal Quantile Plots.
3 Quantiles and Order Statistics. Definition: the p-th quantile $q_p$. Earlier we defined the p-th quantile of the distribution of a random variable $X$ as any number $q_p$ satisfying $P(X \le q_p) \ge p$ and $P(X \ge q_p) \ge 1 - p$.
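As a quick numerical check of the definition, the sketch below verifies both inequalities for one discrete distribution. The Binomial(10, 0.5) distribution and the value p = 0.3 are assumptions chosen for illustration; they do not come from the slides.

```r
# Check the two defining inequalities of a p-th quantile.
# Assumed example: X ~ Binomial(size = 10, prob = 0.5), p = 0.3.
p <- 0.3
size <- 10
prob <- 0.5

# qbinom() returns the smallest q with P(X <= q) >= p
q_p <- qbinom(p, size, prob)

# P(X <= q_p) >= p
lower_ok <- pbinom(q_p, size, prob) >= p
# P(X >= q_p) = 1 - P(X <= q_p - 1) >= 1 - p
upper_ok <- (1 - pbinom(q_p - 1, size, prob)) >= 1 - p

print(c(q_p = q_p, lower_ok = lower_ok, upper_ok = upper_ok))
```

For a discrete distribution the two probabilities usually overshoot p and 1 - p, which is why the definition uses inequalities rather than equalities.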
4 Order Statistics. Sample quantiles are based on order statistics. Let $X_1, X_2, \ldots, X_n$ be a sample of size $n$. The order statistics $X_{(1)}, X_{(2)}, \ldots, X_{(n)}$ are just the observations sorted into ascending order.
# The data
> x
[1]
# The order statistics
> sort(x)
[1]
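A minimal sketch of the same idea with made-up numbers. The five values below are hypothetical and are not the data shown on the original slide.

```r
# Hypothetical data (not the values from the original slide)
x <- c(4.2, 1.5, 3.3, 2.8, 5.1)

# The order statistics X_(1), ..., X_(n): the data in ascending order
x_sorted <- sort(x)

# The smallest and largest order statistics are the sample minimum and maximum
x_min <- x_sorted[1]
x_max <- x_sorted[length(x_sorted)]

print(x_sorted)
```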
7 Order Statistics and Sample Quantiles. There are numerous definitions of sample quantiles, chosen to perform well under various conditions. All involve interpolation between neighboring order statistics. Suppose that we want the p-th quantile, where $p$ lies between $k/n$ and $(k+1)/n$. There are variations, but all take the following form for some choice of $a \in [0, 1]$: $\hat{q}_p = a X_{(k)} + (1 - a) X_{(k+1)}$. For example, when $n$ is even, the sample median $q_{.5}$ is usually taken to be the average of the two middle observations.
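The interpolation formula can be tried out on a small sample. The sketch below uses hypothetical data with even n, computes the median by hand with a = 1/2, and then asks R's quantile() for the lower quartile under three of its `type` settings to show that the definitions really do differ. The choice of types 4, 6, and 7 is arbitrary.

```r
# Hypothetical sample with an even number of observations
x <- c(2.1, 5.3, 1.7, 4.4, 3.9, 6.0)
xs <- sort(x)
n <- length(xs)

# Median as an interpolation between the two middle order statistics:
# q_hat = a * X_(k) + (1 - a) * X_(k+1) with k = n/2 and a = 1/2
k <- n / 2
a <- 0.5
median_manual <- a * xs[k] + (1 - a) * xs[k + 1]
median_builtin <- median(x)

# Different interpolation rules give different lower quartiles
types <- c(4, 6, 7)
q25_by_type <- sapply(types, function(type) {
  quantile(x, probs = 0.25, type = type, names = FALSE)
})
names(q25_by_type) <- paste0("type", types)

print(c(manual = median_manual, builtin = median_builtin))
print(q25_by_type)
```

With large samples the choice of rule rarely matters; with a handful of observations, as here, the estimates can differ noticeably.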
8 Order Statistics as Sample Quantiles. Let's turn it around and ask: which sample quantiles correspond to the order statistics? Consider 4 observations from a population. On average, we expect them to divide the population, at least approximately, into 5 equal chunks, with boundaries at equally spaced percentiles: $1/5, 2/5, 3/5, 4/5$. In other words, they correspond to the sample quantiles $q_{.2}, q_{.4}, q_{.6}, q_{.8}$.
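The equally spaced probabilities for n observations are k/(n + 1), which can be computed directly. The sketch below also calls ppoints(), whose `a` argument controls the plotting-position rule; setting a = 0 reproduces k/(n + 1), while the default for small n uses a slightly different rule and so gives slightly different values.

```r
n <- 4

# Equally spaced probabilities k / (n + 1) for k = 1, ..., n
p_equal <- (1:n) / (n + 1)

# ppoints() computes (k - a) / (n + 1 - 2a); a = 0 gives k / (n + 1)
p_ppoints <- ppoints(n, a = 0)

# The default choice of a (3/8 when n <= 10) shifts the values slightly
p_default <- ppoints(n)

print(rbind(p_equal, p_ppoints, p_default))
```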
11 Comparing Two Samples. Suppose we have two samples of size $n$, $X_1, X_2, \ldots, X_n$ and $Y_1, Y_2, \ldots, Y_n$. If they were samples from the same distribution, then the order statistics $X_{(1)}, X_{(2)}, \ldots, X_{(n)}$ and $Y_{(1)}, Y_{(2)}, \ldots, Y_{(n)}$ would be estimates of the same quantiles. Thus we expect that $X_{(1)} \approx Y_{(1)}$, $X_{(2)} \approx Y_{(2)}$, etc.
12 The QQ plot. The quantile-quantile plot, or QQplot, is a simple graphical method for comparing two sets of sample quantiles. Plot the pairs of order statistics $(X_{(k)}, Y_{(k)})$. If the two datasets come from the same distribution, the points should lie roughly on a line through the origin with slope 1.
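A minimal sketch of this construction for two simulated samples of equal size. The seed, sample size, and plot labels are arbitrary choices. The plot is drawn by hand from the sorted samples, and qqplot() is then called with plot.it = FALSE to confirm that it returns the same pairs.

```r
set.seed(141)  # arbitrary seed, for reproducibility
n <- 200
x <- rnorm(n)
y <- rnorm(n)

# QQ plot by hand: plot the pairs of order statistics (X_(k), Y_(k))
plot(sort(x), sort(y), pch = 19, xlab = "X", ylab = "Y",
     main = "QQplot of two N(0,1) samples")
abline(0, 1, col = "green")  # reference line y = x

# For equal sample sizes, qqplot() returns the sorted samples
qq <- qqplot(x, y, plot.it = FALSE)
```

Points that hug the reference line suggest the two samples share a distribution. A parallel line off the diagonal suggests a location shift, and a different slope suggests a difference in spread.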
13 QQ plot example. [Figure: QQplot of two N(0,1) samples of size 200; axes X and Y.]
14 QQ plot example, small sample! Alert: with small samples, expect variation! [Figure: QQplot of two N(0,1) samples of size 20; axes X and Y.]
15 QQ plot example: location shift. Two samples from similar distributions which differ only in location; the green reference line is y = x. [Figure: QQplot of two samples of size 200; axes X and Y.]
16 QQ plot example: different spread. Two samples from similar distributions which differ only in spread, with reference line. [Figure: QQplot of two samples of size 200; axes X and Y.]
17 QQ plot example: different shape. Two samples from distributions which differ in shape, as well as location and spread, with reference line. [Figure: QQplot of two samples of size 200; axes X and Y.]
18 QQ plot example: Anorexia data. The Family Therapy group had 17 subjects, the Control Therapy group 26. qqplot() uses estimated quantiles for the larger dataset. [Figure: QQplot of Family Therapy vs Control; axes Family and Control.]
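The handling of unequal sample sizes can be seen with simulated stand-ins for the two groups. The data below are random normal draws, not the anorexia measurements; only the group sizes (17 and 26) are taken from the slide, and the variable names are hypothetical.

```r
set.seed(141)  # arbitrary seed
family <- rnorm(17)   # stand-in for the smaller group
control <- rnorm(26)  # stand-in for the larger group

# With unequal sizes, qqplot() interpolates the larger sample's order
# statistics down to the size of the smaller sample
qq <- qqplot(family, control, plot.it = FALSE)

print(c(n_x = length(qq$x), n_y = length(qq$y)))
```

The smaller sample's order statistics are used as they are, so the plot has one point per observation in the smaller group.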
19 Normal Quantile Plots. Often we wish to compare a dataset to the Normal distribution, a theoretical population, rather than to a second dataset. R has a function that plots the order statistics of a sample against the corresponding quantiles of the standard normal distribution: qqnorm(x). If the plot is roughly linear, then our data are approximately normally distributed.
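A short sketch of qqnorm() on simulated data. The sample size, mean, and SD are arbitrary assumptions. The last lines check that the theoretical quantiles qqnorm() uses are the standard normal quantiles at the ppoints() probabilities.

```r
set.seed(141)  # arbitrary seed
x <- rnorm(50, mean = 10, sd = 2)

# Normal quantile plot with a reference line through the quartiles
qqnorm(x)
qqline(x)

# qqnorm() returns the plotted coordinates: x = theoretical, y = sample
qq <- qqnorm(x, plot.it = FALSE)
theoretical <- qnorm(ppoints(length(x)))
```

Note that qqnorm() returns the points in the original data order, so both coordinates need sorting before comparing them with sort(x) and the sorted theoretical quantiles.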
20 Normal Quantile Plot: Normal data. Don't expect a perfectly straight plot even with normal data! [Figure: Normal Q-Q Plot; x-axis Theoretical Quantiles, y-axis Sample Quantiles.]
21 Normal Quantile Plot: Mean and SD. We can estimate the mean and SD from a Normal quantile plot: the mean is roughly equal to the median (the value plotted above 0 on the horizontal axis), and the slope is roughly the SD. [Figure: Normal Q-Q Plot with the rise of the line marked; x-axis Theoretical Quantiles, y-axis Sample Quantiles.]
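One way to read the intercept and slope off the plot numerically is to fit a straight line to the plotted points. Using a least-squares fit via lm() is an assumption made here for illustration; the slide only says that the height above 0 and the slope approximate the mean and SD. The simulated mean of 10 and SD of 2 are also arbitrary.

```r
set.seed(141)  # arbitrary seed
x <- rnorm(200, mean = 10, sd = 2)

# Coordinates of the normal quantile plot
theoretical <- qnorm(ppoints(length(x)))
sample_q <- sort(x)

# Least-squares line through the points: intercept ~ mean, slope ~ SD
fit <- lm(sample_q ~ theoretical)
mean_est <- unname(coef(fit)[1])
sd_est <- unname(coef(fit)[2])

print(c(mean_est = mean_est, sample_mean = mean(x),
        sd_est = sd_est, sample_sd = sd(x)))
```

Because the theoretical quantiles are symmetric about 0, the fitted intercept matches the sample mean, and for roughly normal data the slope lands close to the sample SD.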
22 Normal Quantile Plot: Short Tails. [Figure: Normal Q-Q Plot; x-axis Theoretical Quantiles, y-axis Sample Quantiles.]
23 Normal Quantile Plot: Short Tails. Short tails are hard to diagnose from a density plot! [Figure: density plot titled "Short Tails"; N = 50, Bandwidth = 2.436.]
24 Normal Quantile Plot: Long Tails. [Figure: Normal Q-Q Plot; x-axis Theoretical Quantiles, y-axis Sample Quantiles.]
25 Density Plot: Long Tails. [Figure: density plot titled "Long Tails"; N = 50.]
26 Normal Quantile Plot: Positive Skewness. [Figure: Normal Q-Q Plot; x-axis Theoretical Quantiles, y-axis Sample Quantiles.]
27 Density Plot: Positive Skewness. [Figure: density plot titled "Positive Skewness"; N = 50.]
28 Normal Quantile Plot: Negative Skewness. [Figure: Normal Q-Q Plot; x-axis Theoretical Quantiles, y-axis Sample Quantiles.]
29 Normal Quantile Plot: Bimodal Data. [Figure: Normal Q-Q Plot; x-axis Theoretical Quantiles, y-axis Sample Quantiles.]
30 Density Plot: Bimodal Data. [Figure: density plot titled "Bimodal data"; N = 50, Bandwidth = 1.372.]
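Plots like the ones in this gallery can be regenerated with simulated data. The slides do not say which distributions were used, so the choices below (uniform for short tails, t with 3 degrees of freedom for long tails, exponential for skewness, a two-component normal mixture for bimodality) are assumptions; only the sample size N = 50 comes from the slides.

```r
set.seed(141)  # arbitrary seed
n <- 50

# Assumed stand-ins for each shape; not the data behind the original figures
samples <- list(
  "Short tails (uniform)"        = runif(n, -1, 1),
  "Long tails (t, 3 df)"         = rt(n, df = 3),
  "Positive skew (exponential)"  = rexp(n),
  "Negative skew (-exponential)" = -rexp(n),
  "Bimodal (normal mixture)"     = c(rnorm(n / 2, mean = -3), rnorm(n / 2, mean = 3))
)

# One page per sample: normal quantile plot beside the density plot
old_par <- par(mfrow = c(1, 2))
for (name in names(samples)) {
  s <- samples[[name]]
  qqnorm(s, main = name)
  qqline(s)
  plot(density(s), main = name)
}
par(old_par)
```

In an interactive session each iteration replaces the previous page, so it can help to step through the loop one sample at a time or write the pages to a PDF device.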
34 Other distributions. Suppose we would like to make a theoretical quantile plot for a dataset x to compare to some other distribution, say a Chi-squared distribution with 5 degrees of freedom. Easy:
Sort your dataset: SampleQuantiles <- sort(x)
Compute the theoretical quantiles: ChiSqQuantiles <- qchisq(ppoints(x), 5)
Plot: plot(ChiSqQuantiles, SampleQuantiles, pch=19)
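The three steps can be run end to end on simulated data. The dataset below is an assumption (100 draws from a Chi-squared distribution with 5 degrees of freedom), as are the axis labels and the added reference line. R is case-sensitive, so the names passed to plot() must match the assigned names exactly.

```r
set.seed(141)  # arbitrary seed
x <- rchisq(100, df = 5)  # simulated stand-in for the dataset

# Step 1: sort the dataset to get the sample quantiles
SampleQuantiles <- sort(x)

# Step 2: theoretical Chi-squared(5) quantiles at the ppoints() probabilities
ChiSqQuantiles <- qchisq(ppoints(x), 5)

# Step 3: plot, with the line y = x for reference
plot(ChiSqQuantiles, SampleQuantiles, pch = 19,
     xlab = "Chi-squared(5) quantiles", ylab = "Sample quantiles")
abline(0, 1)
```

The same recipe works for any distribution with a quantile function in R (for example qgamma() or qt()); only the second step changes.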
35 Gamma Quantile Plot. [Figure: Gamma Quantile Plot; axes Gq and sort(x).]
36 Summary. QQplots are an excellent graphical tool for comparing two samples to each other, or one sample to a theoretical distribution like the Normal. They reveal differences in location, spread, and shape more clearly than do density plots or histograms.