PharmaSUG 2016 - Paper SP10

I Want the Mean, But not That One!

David Franklin, Quintiles Real Late Phase Research, Cambridge, MA

ABSTRACT

The Mean, as most SAS programmers know it, is the Arithmetic Mean. However, there are situations where it may be necessary to calculate different means. This paper first looks at different methods that are widely used from a programmer's perspective, starting with the humble Arithmetic Mean, then proceeding to the other Pythagorean Means, known as the Geometric Mean and Harmonic Mean, before ending with a quick look at the Interquartile Mean and its related Truncated Mean. During the journey there will be examples of data and code given to demonstrate how each method is done and output.

INTRODUCTION

"I want the mean, but not that one!" Some of us have heard that phrase, others have not. As programmers we use procedures such as PROC MEANS or PROC UNIVARIATE, and occasionally write datastep code, to calculate the mean. But these ways of calculating the mean all calculate the Arithmetic Mean. Other methods exist. This paper looks first at the Arithmetic Mean, then proceeds to the Geometric Mean and Harmonic Mean, before ending with a quick look at the Interquartile Mean and its related Truncated Mean.

THE METHODS

We have all heard of the humble mean or average, but what exactly is it? When we hear of mean or average we are usually thinking of the Arithmetic Mean (AM), which is defined as the sum of the non-missing values in a set divided by the number of non-missing values in the same set. The formula is often seen in textbooks as

$$AM = \frac{1}{n}\sum_{i=1}^{n} x_i$$

The Arithmetic Mean is the most commonly used and readily understood measure of central tendency, and is best when the data are not skewed (having no extreme outlier values) and the individual data points are not dependent on each other. But did you know there are other means out there?
The first is the Geometric Mean (GM), which uses the product of the values, as opposed to the Arithmetic Mean, which uses their sum. The formula is often seen as

$$GM = \left(\prod_{i=1}^{n} x_i\right)^{1/n}$$

The Geometric Mean should be used whenever the data are interrelated, usually ratios or percentages, and usually involve a time component. As an example, let's look at the growth rate of bacteria over a four hour period (as a ratio):

Time 0 - 60 minutes = 1.2 (100 start, 120 end)
Time >60 - 120 minutes = 1.4 (120 start, 168 end)
Time >120 - 180 minutes = 1.1 (168 start, 185 end)
Time >180 - 240 minutes = 0.4 (185 start, 74 end)

$$GM = (1.2 \times 1.4 \times 1.1 \times 0.4)^{1/4} = 0.7392^{1/4} = 0.927$$

This is different to the Arithmetic Mean, which would be

$$AM = \frac{1.2 + 1.4 + 1.1 + 0.4}{4} = \frac{4.1}{4} = 1.025$$

One shows an average decrease of 7.3% uniformly across each period while the other shows an increase of 2.5%. Which is more correct? Since $100 \times 0.927^4 \approx 74$ reproduces the final count, while $100 \times 1.025^4 \approx 110$ does not, the answer is the Geometric Mean. The main reason for using the Geometric Mean over the Arithmetic Mean is that in the example the starting value of each period depends on the result of the period before it.

The second mean we will look at is the Harmonic Mean (HM), sometimes called the subcontrary mean, which uses reciprocals, with the formula often seen as

$$HM = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}$$

This statistic is useful in situations involving data where the majority of the values are distributed uniformly but there are a few outliers with significantly higher or lower values. The Harmonic Mean gives less significance to high value outliers, providing a truer picture of the average. Let's look at some data: ten values, ranging from 1 to 21, that sum to 63. The Harmonic Mean is

$$HM = \frac{10}{\sum_{i=1}^{10} \frac{1}{x_i}} = 3.23$$

which is different to the Arithmetic Mean, which would be

$$AM = \frac{63}{10} = 6.3$$

In general the harmonic average is less biased by a small number of outliers.

The Arithmetic Mean, Geometric Mean and Harmonic Mean are together often called the Pythagorean means.

Another method that is useful when dealing with outlier data, as shown in the data above, is to calculate the Interquartile Mean (IM), calculated using the formula

$$IM = \frac{2}{n}\sum_{i=(n/4)+1}^{3n/4} x_i$$

In this method the first and fourth quartiles are removed and the arithmetic mean is calculated on the records in the second and third quartiles. One useful reason for removing the first and fourth quartiles is that any outliers are removed. Putting the data above into order and using the formula above, the sum runs over the fourth to seventh ordered values, which total 21, so the Interquartile Mean is

$$IM = \frac{2}{10}\sum_{i=(10/4)+1}^{3 \times 10/4} x_i = \frac{2}{10}(21) = 4.2$$

A variation of the Interquartile Mean which incorporates the simplicity of the Arithmetic Mean is the Truncated Mean (TM), where values outside specified percentiles are excluded. Using this method we set bounds by removing particular observations.
If we were to remove the top and bottom observations using a 10% trimmed mean, we would remove 10% of the observations on both sides and then calculate the mean from the resulting data. Using our example above and a 10% truncated mean, taking 10% of the observations off each side results in the values 1 and 21 being excluded. The remaining eight values sum to 41, so the Truncated Mean can be computed as

$$TM = \frac{41}{8} = 5.125$$
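The 10% calculation above can also be written in a general form. The following is a sketch rather than a formula taken from the paper: it assumes the values are sorted as $x_{(1)} \le \dots \le x_{(n)}$, that a proportion $p$ is trimmed from each side, and that $k = \lceil pn \rceil$ observations are dropped per side. The symbols $p$ and $k$, and the rounding rule for cases where $pn$ is not a whole number, are assumptions introduced here for illustration.

```latex
\[
TM_p = \frac{1}{n - 2k} \sum_{i=k+1}^{n-k} x_{(i)}, \qquad k = \lceil pn \rceil
\]
% Worked check with the example data: n = 10 and p = 0.10, so k = 1.
% The ten values total 63, and the excluded values are 1 and 21.
\[
TM_{0.10} = \frac{1}{10 - 2} \sum_{i=2}^{9} x_{(i)}
          = \frac{63 - 1 - 21}{8} = \frac{41}{8} = 5.125
\]
```

With $n = 10$ and $p = 0.10$ the sum runs over the second to ninth ordered values, which is the same set of observations kept in the hand calculation.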
As said previously, the Harmonic Mean, Interquartile Mean and Truncated Mean are useful if there are outliers which strongly influence the result. Which is correct? That depends on the data and the statistician. Programmers, follow any guidance from the statistician.

What does SAS do? This is an interesting question. Using SAS procedures, the only means that are available are the Arithmetic Mean (a number of procedures) and the Geometric Mean via the SURVEYMEANS procedure. Among the SAS functions in SAS 9.4 there are the MEAN, GEOMEAN and HARMEAN functions, but the problem with these is that the data have to be transposed into a single row in order to use them. While our examples have this form, in the real world we do not always have this option. So we have to write some code using a procedure and a datastep. To help understand the code with the examples above, I am going to use the same data in each example.

First we will look at the Arithmetic Mean and the data we used for the Harmonic Mean, to get both of these means:

    data _dat;
       input x;
    datalines;
    (the ten values from the Harmonic Mean example)
    ;
    run;

    data _null_;
       length numx sumx rcplx 8;
       retain numx sumx rcplx;
       set _dat end=eof;
       if _n_=1 then do;
          numx=0;
          sumx=0;
          rcplx=0;
       end;
       if ^missing(x) then do;
          numx+1;
          sumx+x;
          rcplx=rcplx+(1/x);
       end;
       if eof then do;
          AM=sumx/numx;
          HM=numx/rcplx;
          put NUMX= / SUMX= / AM= / HM=;
       end;
    run;

The SAS LOG shows numx=10, sumx=63 and AM=6.3, with HM agreeing with the 3.23 calculated by hand.

Because the data used for the Geometric Mean are different from those used for the other two means, we cannot use the same data here, so we go back to our original example data:
    data _dat;
       input x;
    datalines;
    1.2
    1.4
    1.1
    0.4
    ;
    run;

    data _null_;
       length numx prodx 8;
       retain numx prodx;
       set _dat end=eof;
       if _n_=1 then do;
          numx=0;
          prodx=1;
       end;
       if ^missing(x) then do;
          numx+1;
          prodx=prodx*x;
       end;
       if eof then do;
          GM=prodx**(1/numx);
          put NUMX= / GM=;
       end;
    run;

The SAS LOG shows numx=4 and a GM that agrees with the 0.927 calculated by hand.

For the Interquartile Mean, we go back to the data that we used for the Harmonic Mean:

    data _dat;
       input x;
    datalines;
    (the ten values from the Harmonic Mean example)
    ;
    run;

    proc sort data=_dat;
       by x;
    run;

    data _null_;
       set _dat nobs=nobs end=eof;
       if ((nobs/4)+1)<=_n_<=(3*nobs/4) then sumx+x;
       if eof then do;
          IM=(2/nobs)*sumx;
          put nobs= / sumx= / IM=;
       end;
    run;

and get the following output:

    nobs=10
    sumx=21
    IM=4.2

This is the same result we got by hand.

For the Truncated Mean, we go back to the previous data and do a Truncated Mean of 10%, which means we drop 10% of the observations on each side:
    data _dat;
       input x;
    datalines;
    (the ten values from the Harmonic Mean example)
    ;
    run;

    proc sort data=_dat;
       by x;
    run;

    data _null_;
       set _dat nobs=nobs end=eof;
       if ((nobs*0.1)+1)<=_n_<=(nobs*0.9) then do;
          numx+1;
          sumx+x;
          put x= numx= sumx= _n_=;   /* trace each observation that is kept */
       end;
       if eof then do;
          TM=sumx/numx;
          put numx= / sumx= / TM=;
       end;
    run;

to get, at the end of the SAS LOG,

    numx=8
    sumx=41
    TM=5.125

which is what we got by hand.

CONCLUSION

Key to using statistical software is knowing what it can and cannot do. It is very interesting that SAS itself does not calculate the Geometric Mean or Harmonic Mean using the MEANS or UNIVARIATE procedures, nor does it have an option for either the Interquartile Mean or Truncated Mean. Maybe in the next SAS Software Ballot we can ask SAS to add this option?

CONTACT INFORMATION

Your comments and questions are valued and encouraged. Contact the author at:

Name: David Franklin
Enterprise: Quintiles Real Late Phase Research
E-mail: David.Franklin@Quintiles.com

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.

Other brand and product names are trademarks of their respective companies.
More informationAfter completing this chapter, you should be able to:
Chapter 2 Descriptive Statistics Chapter Goals After completing this chapter, you should be able to: Compute and interpret the mean, median, and mode for a set of data Find the range, variance, standard
More informationDescribing Distributions with Numbers
Topic 2 We next look at quantitative data. Recall that in this case, these data can be subject to the operations of arithmetic. In particular, we can add or subtract observation values, we can sort them
More informationCalculating Confidence Intervals on Proportions of Variability Using the. VARCOMP and IML Procedures
Calculating Confidence Intervals on Proportions of Variability Using the VARCOMP and IML Procedures Annette M. Green, Westat, Inc., Research Triangle Park, NC David M. Umbach, National Institute of Environmental
More informationPerinatal Mental Health Profile User Guide. 1. Using Fingertips Software
Perinatal Mental Health Profile User Guide 1. Using Fingertips Software July 2017 Contents 1. Introduction... 3 2. Quick Guide to Fingertips Software Features... 3 2.1 Additional information... 3 2.2 Search
More informationCS 147: Computer Systems Performance Analysis
CS 147: Computer Systems Performance Analysis Summarizing Variability and Determining Distributions CS 147: Computer Systems Performance Analysis Summarizing Variability and Determining Distributions 1
More information1.3.1 Measuring Center: The Mean
1.3.1 Measuring Center: The Mean Mean - The arithmetic average. To find the mean (pronounced x bar) of a set of observations, add their values and divide by the number of observations. If the n observations
More informationDescriptive Data Summarization
Descriptive Data Summarization Descriptive data summarization gives the general characteristics of the data and identify the presence of noise or outliers, which is useful for successful data cleaning
More informationSlide 1. Slide 2. Slide 3. Pick a Brick. Daphne. 400 pts 200 pts 300 pts 500 pts 100 pts. 300 pts. 300 pts 400 pts 100 pts 400 pts.
Slide 1 Slide 2 Daphne Phillip Kathy Slide 3 Pick a Brick 100 pts 200 pts 500 pts 300 pts 400 pts 200 pts 300 pts 500 pts 100 pts 300 pts 400 pts 100 pts 400 pts 100 pts 200 pts 500 pts 100 pts 400 pts
More informationMITOCW watch?v=ztnnigvy5iq
MITOCW watch?v=ztnnigvy5iq GILBERT STRANG: OK. So this is a "prepare the way" video about symmetric matrices and complex matrices. We'll see symmetric matrices in second order systems of differential equations.
More informationSummarizing Measured Data
Performance Evaluation: Summarizing Measured Data Hongwei Zhang http://www.cs.wayne.edu/~hzhang The object of statistics is to discover methods of condensing information concerning large groups of allied
More information