ASSESSING VARIATION: A UNIFYING APPROACH FOR ALL SCALES OF MEASUREMENT JSM Tamar Gadrich Emil Bashkansky
1 ASSESSING VARIATION: A UNIFYING APPROACH FOR ALL SCALES OF MEASUREMENT Tamar Gadrich, Emil Bashkansky (ORT Braude College of Engineering, Israel), Ričardas Zitikis (University of Western Ontario, Canada) JSM 2014
2 MOTIVATION For various reasons, we often wish or need to measure the variability of populations or samples, which can be quantitative, qualitative, or mixed. Examples include social inequality and mobility, political consensus, homogeneity of some material, uncertainty of prediction, diversity or similarity of species, and the degree of synchronization of biological rhythms. These are complex tasks, not least because of the inherent heterogeneity of populations, which are usually made up of various groups and categories that often require different scales of measurement.
3 STATUS QUO A number of variability measures have been developed to accommodate the four scales of measurement: nominal, ordinal, interval, and ratio. The variety of scales, and the resulting restrictions on permissible arithmetical operations and order relationships, poses serious challenges for researchers and decision makers. The most popular measures are designed for numerical (interval and ratio) data: the range, IQR, variance, and SD, as well as measures based on the mean absolute deviation/difference and entropy-based measures. What is needed are variability measures that rely only on legitimate arithmetical operations between categorical variables and that can be decomposed into intra- and inter-components in a single unifying way.
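4 THE CLASSICAL VARIANCE AND THE GINI MEAN DIFFERENCE
Classical variance:
$$\mathrm{VAR} = \sum_{k} (x_k - \mu)^2 p_k \quad \text{with} \quad \mu = \sum_{k} x_k p_k \qquad (1)$$
Gini formula for variance:
$$\mathrm{VAR} = \frac{1}{2} \sum_{j} \sum_{i} (x_i - x_j)^2 \, p_i p_j \qquad (2)$$
Gini mean difference (GMD):
$$\mathrm{GMD} = \sum_{j} \sum_{i} |x_i - x_j| \, p_i p_j \qquad (3)$$
Both definitions have the form
$$\sum_{j} \sum_{i} L(x_i, x_j) \, p_i p_j \qquad (4)$$
with $L(x_i, x_j) = (x_i - x_j)^2 / 2$ for the variance and $L(x_i, x_j) = |x_i - x_j|$ for the GMD, the sums running over all pairs $(i, j)$.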
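The pairwise form (4) can be checked numerically. The sketch below is illustrative only: the three-point distribution and the helper names (`pairwise_sum`, `classical_variance`) are assumptions made for this example and do not come from the slides.

```python
from itertools import product


def pairwise_sum(values, probs, loss):
    """Double sum of loss(x_i, x_j) * p_i * p_j over all ordered pairs (i, j)."""
    points = list(zip(values, probs))
    return sum(loss(xi, xj) * pi * pj
               for (xi, pi), (xj, pj) in product(points, repeat=2))


def classical_variance(values, probs):
    """Variance computed the classical way, around the mean."""
    mean = sum(x * p for x, p in zip(values, probs))
    return sum((x - mean) ** 2 * p for x, p in zip(values, probs))


# Hypothetical discrete distribution, chosen only for illustration.
values = [1.0, 2.0, 4.0]
probs = [0.5, 0.3, 0.2]

# Loss (x_i - x_j)^2 / 2 reproduces the variance; |x_i - x_j| gives the GMD.
var_pairwise = pairwise_sum(values, probs, lambda a, b: 0.5 * (a - b) ** 2)
gmd = pairwise_sum(values, probs, lambda a, b: abs(a - b))
```

The same `pairwise_sum` routine serves both measures; only the loss function changes, which is the point of form (4).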
5 FOUR SCALES FOR QUALITY DATA
Numerical
  Ratio: quality cost, delivery time, number of defects, MTTF
  Interval: temperature, calendar time, lake level
Categorical
  Ordinal: customer satisfaction, status, FMECA ranking, belt rank (Y, G, B, MB), quality level...
  Nominal: vendor, failure mode, record type, quality requirement
[The slide also lists the comparison and arithmetic operations permitted on each scale; the operator symbols did not survive extraction.]
6 THE LAYOUT OF CATEGORICAL DATA
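7 DESCRIPTION OF CATEGORICAL DATA
Set of n ordinal/nominal data based on a scale with K ordered/non-ordered categories coded by integers k = 1, 2, ..., K, with category counts $n_1, n_2, \ldots, n_K$.
Proportion of data belonging to the k-th category:
$$\hat{p}_k = \frac{n_k}{n}$$
For ordinal data, cumulative frequency of data belonging up to the k-th category:
$$\hat{F}_k = \sum_{j=1}^{k} \hat{p}_j$$
Example: n = 500 apples graded Low = 1, Medium = 2, High = 3. [The per-category counts and the resulting $\hat{p}$ and $\hat{F}$ values are incomplete in this transcription.]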
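As a minimal sketch of these two definitions, the snippet below computes the proportions and cumulative frequencies from category counts. The counts are hypothetical (the slide's own apple counts are not legible here), and the function names are assumptions for this example.

```python
from itertools import accumulate


def proportions(counts):
    """p_hat_k = n_k / n for each category k."""
    n = sum(counts)
    return [c / n for c in counts]


def cumulative_frequencies(counts):
    """F_hat_k = (n_1 + ... + n_k) / n; meaningful for ordered categories only."""
    n = sum(counts)
    return [running / n for running in accumulate(counts)]


# Hypothetical counts for three ordered grades (Low, Medium, High), n = 500.
counts = [100, 250, 150]
p_hat = proportions(counts)
F_hat = cumulative_frequencies(counts)
```

Accumulating the integer counts before dividing avoids floating-point drift in the last cumulative value, which should equal 1.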
8 MEASURING VARIABILITY - CATEGORICAL
Let $c_k$, k = 1, ..., K, be category codes and $p_k$ their corresponding probabilities, which we also call frequencies. Let $L(c_i, c_j)$ be a two-argument function, defined on the codes, which is non-negative, symmetric, and such that $L(c_k, c_k) = 0$ for all k. We call it the loss-of-similarity function.
The population total variation is defined by:
$$V_T = \sum_{j} \sum_{i} L(c_i, c_j) \, p_i p_j \qquad (5)$$
The sample total variation is defined by:
$$\hat{V}_T = \sum_{j} \sum_{i} L(c_i, c_j) \, \hat{p}_i \hat{p}_j \qquad (6)$$
The m-th group variation:
$$\hat{V}_m = \sum_{j} \sum_{i} L(c_i, c_j) \, \hat{p}_{i|m} \hat{p}_{j|m}, \qquad \hat{p}_{k|m} = \frac{n_{km}}{n_m} \qquad (7)$$
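9 WEIGHTED AVERAGE OF THE WITHIN GROUP VARIATIONS
Let $\hat{\omega}_m = n_m / N$ represent the proportion of data in the m-th group. Obviously, $\sum_{m=1}^{M} \hat{\omega}_m = 1$. The within-group variation is the weighted average
$$\hat{V}_W = \sum_{m=1}^{M} \hat{\omega}_m \hat{V}_m \qquad (8)$$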
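10 THE BETWEEN GROUPS COVARIATION & ITS CHARACTERISTIC KERNEL
$$\hat{C}_B = \sum_{j} \sum_{i} L(c_i, c_j) \, \hat{\kappa}_{ij} \qquad (9)$$
$$\hat{\kappa}_{ij} = -\sum_{m=1}^{M} \hat{\omega}_m \, (\hat{p}_{i|m} - \hat{p}_i)(\hat{p}_{j|m} - \hat{p}_j) \qquad (10)$$
Here $\hat{\omega}_m = n_m / N$ is the proportion of data in the m-th group and $\hat{p}_{k|m}$ is the frequency of the k-th category within that group.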
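11 MEASURING VARIABILITY - CONTINUOUS
$$V_T = \iint L(x, x') \, dF(x) \, dF(x')$$
$$V_m = \iint L(x, x') \, dF(x|m) \, dF(x'|m)$$
$$C_B = \iint L(x, x') \, d\kappa(x, x')$$
$$\kappa(x, x') = -\sum_{m=1}^{M} \omega_m \, \big(F(x|m) - F(x)\big)\big(F(x'|m) - F(x')\big)$$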
12 TOTAL-VARIATION DECOMPOSITION THEOREM The total variation can be split into the sum of the within-variation and the between-covariation:
$$V_T = V_W + C_B \qquad (11)$$
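The decomposition can be verified numerically for any loss-of-similarity function. The following is a sketch under stated assumptions: categories are indexed 0..K-1, all double sums run over every ordered pair (i, j), the kernel carries the minus sign needed for the identity to hold, and the contingency table and function names are hypothetical.

```python
def pairwise_variation(p, loss):
    """Sum of loss(i, j) * p_i * p_j over all ordered pairs of categories."""
    K = len(p)
    return sum(loss(i, j) * p[i] * p[j] for i in range(K) for j in range(K))


def decompose(counts, loss):
    """counts[m][k] = number of items of group m in category k.

    Returns (total variation, within variation, between covariation).
    """
    N = sum(sum(row) for row in counts)
    K = len(counts[0])
    weights = [sum(row) / N for row in counts]                # group proportions
    cond = [[c / sum(row) for c in row] for row in counts]    # within-group frequencies
    pooled = [sum(row[k] for row in counts) / N for k in range(K)]

    v_total = pairwise_variation(pooled, loss)
    v_within = sum(w * pairwise_variation(p_m, loss)
                   for w, p_m in zip(weights, cond))

    def kernel(i, j):
        # Characteristic kernel: minus the weighted cross-product of deviations.
        return -sum(w * (p_m[i] - pooled[i]) * (p_m[j] - pooled[j])
                    for w, p_m in zip(weights, cond))

    c_between = sum(loss(i, j) * kernel(i, j)
                    for i in range(K) for j in range(K))
    return v_total, v_within, c_between


# Hypothetical 2-group, 3-category table, used only to exercise the identity.
table = [[30, 10, 10], [5, 20, 25]]
nominal_loss = lambda i, j: 0.0 if i == j else 1.0
ordinal_loss = lambda i, j: float(abs(i - j))
```

Calling `decompose(table, nominal_loss)` or `decompose(table, ordinal_loss)` returns three numbers whose first entry equals the sum of the other two up to rounding.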
13 INDEX PVE
$$\mathrm{PVE} = \frac{\hat{C}_B}{\hat{V}_T} \qquad (12)$$
Note the following properties:
1. PVE = 0 when there is no association, that is, when there is no group effect on the category distribution. In mathematical terms, $\hat{C}_B = 0$, so the total variation is a pure (i.e., without any interaction) aggregate of the individual group variations.
2. PVE = 1 when the data within every group fall into one category (though perhaps not the same category for all groups), that is, when there is perfect predictability.
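Both boundary properties can be illustrated with a short sketch. It assumes the nominal 0/1 loss and obtains the between-covariation as total minus within variation (by the decomposition theorem); the tables and names are hypothetical.

```python
def pairwise_variation(p, loss):
    """Sum of loss(i, j) * p_i * p_j over all ordered pairs of categories."""
    K = len(p)
    return sum(loss(i, j) * p[i] * p[j] for i in range(K) for j in range(K))


def pve(counts, loss):
    """Proportion of variation explained, computed as (V_T - V_W) / V_T."""
    N = sum(sum(row) for row in counts)
    K = len(counts[0])
    pooled = [sum(row[k] for row in counts) / N for k in range(K)]
    v_total = pairwise_variation(pooled, loss)
    v_within = sum((sum(row) / N)
                   * pairwise_variation([c / sum(row) for c in row], loss)
                   for row in counts)
    return (v_total - v_within) / v_total


nominal_loss = lambda i, j: 0.0 if i == j else 1.0

# Hypothetical tables: identical group profiles, then fully separated groups.
no_effect = [[10, 20, 30], [20, 40, 60]]
perfect = [[50, 0, 0], [0, 0, 30]]
pve_no_effect = pve(no_effect, nominal_loss)
pve_perfect = pve(perfect, nominal_loss)
```

The first table gives a PVE of zero up to rounding because both groups share the same category profile; the second gives one because each group occupies a single category.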
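14 INDEX OF SEGREGATION POWER (SP) AMONG GROUPS Rule of thumb: if SP > 3, the homogeneity hypothesis $H_0$ must be rejected; if SP lies below the lower threshold, it is not rejected; the region between the lower threshold and 3 is the region of doubt, i.e. more data are required. [The value of the lower threshold is missing from this transcription.]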
15 EXAMPLE - MONTE CARLO SIMULATION
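16 SPECIAL CASE 1: NOMINAL VARIABLES, CATANOVA OF LIGHT AND MARGOLIN (1971)
$$L(c_i, c_j) = \begin{cases} 0 & \text{when } i = j, \\ 1 & \text{when } i \neq j. \end{cases}$$
The total variation:
$$\hat{V}_T = \sum_{k} \hat{p}_k (1 - \hat{p}_k) = 1 - \sum_{k} \hat{p}_k^2$$
Normalizing the total variation by its maximal value we obtain:
$$\mathrm{IQV} = \frac{K}{K-1} \Big(1 - \sum_{k} \hat{p}_k^2\Big)$$
The between-covariation:
$$\hat{C}_B = \sum_{m=1}^{M} \hat{\omega}_m \sum_{k} (\hat{p}_{km} - \hat{p}_k)^2$$
where $\hat{\omega}_m = n_m / N$ is the proportion of data in the m-th group.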
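A minimal sketch of the normalized nominal index follows. The function name `iqv` and the example counts are assumptions for illustration; the normalizing constant K/(K-1) reflects the fact that the unnormalized variation peaks at (K-1)/K for a uniform distribution.

```python
def iqv(counts):
    """Index of qualitative variation: K/(K-1) * (1 - sum of squared proportions)."""
    n = sum(counts)
    K = len(counts)
    unnormalized = 1.0 - sum((c / n) ** 2 for c in counts)
    return K / (K - 1) * unnormalized


# Hypothetical counts: uniform spread, total concentration, something in between.
iqv_uniform = iqv([10, 10, 10, 10])
iqv_single = iqv([40, 0, 0, 0])
iqv_mixed = iqv([25, 10, 5, 0])
```

The index ranges from 0 (all items in one category) to 1 (items spread evenly over all K categories).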
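17 SPECIAL CASE 2: ORDINAL VARIABLES, ORDANOVA OF GADRICH AND BASHKANSKY (2013)
$$L(c_i, c_j) = |i - j| \quad (\text{not } |c_i - c_j|!)$$
With the double sum taken over all pairs $(i, j)$, the total variation is
$$\hat{V}_T = 2 \sum_{k=1}^{K-1} \hat{F}_k (1 - \hat{F}_k)$$
Normalizing the total variation by its maximal value we obtain (Berry & Mielke, 1992; Blair & Lacy, 1996):
$$\hat{h} = \frac{4}{K-1} \sum_{k=1}^{K-1} \hat{F}_k (1 - \hat{F}_k)$$
The between-covariation:
$$\hat{C}_B = 2 \sum_{m=1}^{M} \hat{\omega}_m \sum_{k=1}^{K-1} (\hat{F}_{km} - \hat{F}_k)^2$$
where $\hat{\omega}_m = n_m / N$ is the proportion of data in the m-th group.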
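The link between the rank-difference loss and the cumulative-frequency formula can be checked directly. This sketch assumes the double sum runs over all ordered pairs, in which case it equals twice the sum of F(1 - F); the counts and function names are hypothetical.

```python
from itertools import accumulate


def ordinal_h(counts):
    """Normalized ordinal variation: 4/(K-1) * sum over k < K of F_k * (1 - F_k)."""
    n = sum(counts)
    K = len(counts)
    cumulative = [running / n for running in accumulate(counts)][:-1]
    return 4.0 / (K - 1) * sum(F * (1.0 - F) for F in cumulative)


def rank_difference_variation(counts):
    """Double sum of |i - j| * p_i * p_j over all ordered pairs (i, j)."""
    n = sum(counts)
    p = [c / n for c in counts]
    K = len(p)
    return sum(abs(i - j) * p[i] * p[j] for i in range(K) for j in range(K))


# Hypothetical counts over four ordered categories.
polarized = [50, 0, 0, 50]      # maximal dispersion: half at each extreme
consensus = [0, 100, 0, 0]      # no dispersion
skewed = [10, 20, 30, 40]
```

Normalizing `rank_difference_variation` by its maximum (K-1)/2 yields the same value as `ordinal_h`, so the index is 1 for the polarized sample and 0 for the consensus sample.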
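18 SPECIAL CASE 3: INTERVAL VARIABLES, GMD
$$L(x_i, x_j) = |x_i - x_j|$$
With the double sum taken over all pairs, the total variation is
$$\hat{V}_T = 2 \int \hat{F}(x) \big(1 - \hat{F}(x)\big) \, dx$$
and the right-hand side (RHS) is the Gini mean difference (GMD).
The between-covariation:
$$\hat{C}_B = 2 \sum_{m=1}^{M} \hat{\omega}_m \int \big(\hat{F}_m(x) - \hat{F}(x)\big)^2 \, dx$$
where $\hat{\omega}_m = n_m / N$ is the proportion of data in the m-th group.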
19 SPECIAL CASE 4: RATIO SCALE The loss-of-similarity function $L(x, x') = |\log(x) - \log(x')|$ is well suited for the ratio scale. By adopting this function, we effectively replace our considerations on the ratio scale by those on the interval scale, and thus work with the loss-of-similarity function $L(y, y') = |y - y'|$, where instead of the original x's we now deal with their logarithms y = log x. Hence, all our earlier results pertaining to the interval scale can be utilized in a straightforward manner to establish analogous results on the ratio scale. Of course, there is an element of arbitrariness in our choice of the logarithmic transformation; there are indeed many alternatives. Nevertheless, our experience suggests that the underlying problems, and the philosophies for tackling them, usually restrict the class of loss-of-similarity functions and transformations to just a few reasonable ones, and certain axiomatic approaches may even produce unique choices.
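A short sketch of the log-transform idea: apply the interval-scale GMD to the logarithms of the data. The empirical-distribution weighting (each observation with probability 1/n), the sample values, and the function names are assumptions for this example.

```python
import math


def gmd(sample):
    """Gini mean difference of the empirical distribution (all ordered pairs)."""
    n = len(sample)
    return sum(abs(a - b) for a in sample for b in sample) / n ** 2


def ratio_scale_variation(sample):
    """Variation under the loss |log x - log x'|; requires strictly positive data."""
    return gmd([math.log(x) for x in sample])


# Hypothetical positive measurements and the same data in different units.
base = [1.0, 2.0, 4.0]
rescaled = [10.0 * x for x in base]
v_base = ratio_scale_variation(base)
v_rescaled = ratio_scale_variation(rescaled)
```

Because only ratios enter the loss, multiplying every observation by a constant (a change of units) leaves the variation unchanged, which is the behaviour one expects on a ratio scale.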
20 SUMMARY We have presented a unifying approach for assessing variation in populations and data sets that accommodates every scale of measurement: nominal, ordinal, interval, and ratio. In particular, we have put forward a general decomposition of the total variation into within (intra) and between (inter) components. This has enabled us to introduce two indices: PVE, the proportion of variation explained, and SP, the segregation power. Our results extend and generalize the ORDANOVA method developed by Gadrich and Bashkansky (2012) for categorical ordinal variables.
21 THANK YOU FOR YOUR ATTENTION!
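22 VARIATION DEFINITION
Nominal (the slide also carries the label GMD next to this formula):
$$\mathrm{IQV} = \frac{K}{K-1} \Big(1 - \sum_{k} \hat{p}_k^2\Big)$$
Ordinal:
$$\hat{h} = \frac{4}{K-1} \sum_{k=1}^{K-1} \hat{F}_k (1 - \hat{F}_k)$$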
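23 CATANOVA (CATEGORICAL DATA ANALYSIS OF VARIATION) DECOMPOSITION
M samples.
Within:
$$\mathrm{IQV}_W = \frac{1}{M} \sum_{m=1}^{M} \mathrm{IQV}^{(m)}_{\mathrm{WITHIN}}, \qquad \mathrm{IQV}^{(m)}_{\mathrm{WITHIN}} = \frac{K}{K-1} \Big[1 - \sum_{k} \hat{p}_{km}^2\Big]$$
Between:
$$\mathrm{IQV}_B = \frac{1}{M} \sum_{m=1}^{M} \frac{K}{K-1} \sum_{k} \big[(\hat{p}_{km})^2 - \hat{p}_k^2\big]$$
$\hat{p}_{km}$ is the frequency of data belonging to the k-th category in the m-th sample.
Total variation:
$$\mathrm{IQV}_T = \frac{K}{K-1} \Big(1 - \sum_{k} \hat{p}_k^2\Big)$$
$\hat{p}_k$ is the total frequency of items belonging to the k-th category.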
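24 ORDANOVA (ORDINAL DATA ANALYSIS OF VARIATION) DECOMPOSITION
M ordinal samples of the same size n.
Within (dispersion within the m-th sample, averaged over samples):
$$\hat{h}_W = \frac{1}{M} \sum_{m=1}^{M} \hat{h}_m, \qquad \hat{h}_m = \frac{4}{K-1} \sum_{k=1}^{K-1} \hat{F}_{km} (1 - \hat{F}_{km})$$
Between (variation between samples for every k-th category):
$$S_B = \sum_{k=1}^{K-1} \frac{\frac{1}{M} \sum_{m=1}^{M} (\hat{F}_{km} - \hat{F}_{k\cdot})^2}{(K-1)/4}$$
$\hat{F}_{km}$ is the cumulative frequency of data belonging up to the k-th category in the m-th sample.
Total variation:
$$\hat{h}_T = \frac{4}{K-1} \sum_{k=1}^{K-1} \hat{F}_{k\cdot} (1 - \hat{F}_{k\cdot}), \qquad \hat{F}_{k\cdot} = \frac{1}{M} \sum_{m=1}^{M} \hat{F}_{km}$$
$\hat{F}_{k\cdot}$ is the total cumulative frequency of items belonging up to the k-th category.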
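The three ORDANOVA components can be computed from a table of counts as sketched below. The function name and the 3 x 4 table are hypothetical (the worked example's own numbers are not legible in this transcription); equal sample sizes are assumed, as on the slide.

```python
from itertools import accumulate


def ordanova(samples):
    """samples[m][k] = count of items of sample m in ordered category k.

    Returns (h_total, h_within, s_between) for samples of equal size.
    """
    sizes = {sum(row) for row in samples}
    if len(sizes) != 1:
        raise ValueError("this sketch assumes samples of equal size")
    n = sizes.pop()
    M = len(samples)
    K = len(samples[0])
    scale = 4.0 / (K - 1)

    # Cumulative frequencies F[m][k] for k = 1..K-1 (the last one is always 1).
    F = [[running / n for running in accumulate(row)][:-1] for row in samples]
    F_bar = [sum(F[m][k] for m in range(M)) / M for k in range(K - 1)]

    h_within = sum(scale * sum(f * (1 - f) for f in F_m) for F_m in F) / M
    s_between = scale * sum(
        sum((F[m][k] - F_bar[k]) ** 2 for m in range(M)) / M
        for k in range(K - 1)
    )
    h_total = scale * sum(f * (1 - f) for f in F_bar)
    return h_total, h_within, s_between


# Hypothetical data: M = 3 samples of size n = 200 over K = 4 ordered categories.
example = [[78, 60, 42, 20], [40, 70, 60, 30], [20, 50, 70, 60]]
h_total, h_within, s_between = ordanova(example)
```

For any such table the returned total equals the within component plus the between component, up to floating-point rounding.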
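25 ORDANOVA DECOMPOSITION EXAMPLE (1) Given M = 3 samples, each of size n = 200, for a total of N = 600 items, classified according to K = 4 categories. [Table of sample data: rows Category 1 to 4 and Total; columns Sample 1, 2, 3 and Total. The counts are not recoverable from this transcription.]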
26 ORDANOVA DECOMPOSITION EXAMPLE (2) Cumulative frequency $\hat{F}_{km}$ up to the k-th category within the m-th sample (k = 1, 2, 3, 4; m = 1, 2, 3). The last column, $\hat{F}_{k\cdot}$, is the total cumulative frequency of items belonging up to the k-th category. [Table of cumulative frequencies, with per-sample denominators of 200 and a total denominator of 600; the numerators are incomplete in this transcription. The same table is repeated on slides 27 to 30.]
27 ORDANOVA DECOMPOSITION EXAMPLE (3) Total variation $\hat{h}_T$, computed from the last column $\hat{F}_{k\cdot}$. [Numerical value not recoverable.]
28 ORDANOVA DECOMPOSITION EXAMPLE (4) Dispersion within the m-th sample, $\hat{h}_{mW}$ (m = 1, 2, 3), and the resulting within component $\hat{h}_W$. [Numerical values not recoverable.]
29 ORDANOVA DECOMPOSITION EXAMPLE (5) Classic variation between the samples for the k-th category, $S_{kB}$, and the resulting between component $S_B$. [Numerical values not recoverable.]
30 ORDANOVA DECOMPOSITION EXAMPLE (6) $\hat{h}_T = \hat{h}_W + S_B$. [Numerical values not recoverable.]
31 DISTINGUISHING STATISTIC FOR ORDINAL DATA
Items are measured according to a scale with K categories. M samples of equal size n are drawn (N = M n). Were all samples drawn from the same population, characterized by $p_1, p_2, \ldots, p_K$, or not?
Under $H_0$ (multinomial distribution):
$$\frac{E[\text{between variation}]}{df_B} = \frac{E[\text{within variation}]}{df_W} = \frac{E[\text{total variation}]}{df_T}$$
where $df_{\mathrm{TOTAL}} = N - 1$, $df_{\mathrm{WITHIN}} = M(n - 1)$, $df_{\mathrm{BETWEEN}} = M - 1$; in other words:
$$E(MS_B) = E(MS_W) = E(MS_T)$$
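32 DISTINGUISHING STATISTIC
$$SP = \frac{MS_B}{MS_T}$$
SP can be asymptotically approximated by a distribution with $(M-1)(K-1)$ degrees of freedom. [The distribution symbol did not survive extraction; the expression $(M-1)(K-1)$ appears twice in the formula.] The quantiles (95%, for example) of the latter may be used for hypothesis checking, giving a critical value $I_{cr}$ at the 0.95 level for the given degrees of freedom.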
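One way to read the ratio above is to combine it with the degrees of freedom from the previous slide, taking the mean squares as each component divided by its degrees of freedom. That reading is an assumption of this sketch, as are the function name and the input values; the critical value is not computed because the reference distribution is not legible here.

```python
def segregation_power(h_total, s_between, M, n):
    """SP = MS_B / MS_T with MS_B = S_B / (M - 1) and MS_T = h_T / (N - 1).

    Assumes M samples of equal size n, so N = M * n.
    """
    N = M * n
    ms_between = s_between / (M - 1)
    ms_total = h_total / (N - 1)
    return ms_between / ms_total


# Hypothetical component values for M = 3 samples of size n = 200.
sp = segregation_power(h_total=0.8, s_between=0.02, M=3, n=200)
```

Under this reading, a between component that is small relative to the total can still yield a large SP when N is large, since the total mean square shrinks with N - 1.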
33 DISTINGUISHING FACTOR IDENTIFICATION Data can be divided (segregated) according to various types of factors. For each segregation, calculate the indicator I. The best segregating factor is the one for which the indicator is the largest.
34 DISTRIBUTION OF ACADEMIC DEGREE HOLDERS BY ORDINAL DEGREE LEVEL (1ST DEGREE, 2ND DEGREE, 3RD DEGREE) First case: according to age. [Table: rows Undergraduate degree (1st degree), Graduate degree (2nd degree), Doctoral degree (3rd degree), Total; columns are age groups and Total. The counts and the reported SP value ("SP=,84" as extracted) are incomplete in this transcription.]
35 DISTRIBUTION OF ACADEMIC DEGREE HOLDERS BY ORDINAL DEGREE LEVEL (1ST DEGREE, 2ND DEGREE, 3RD DEGREE) Second case: according to religion & origin/ethnic group. [Table: rows Jews born in Israel, Jews born abroad, Moslems, Christians, Druze, Others, Total; columns Undergraduate degree (1st degree), Graduate degree (2nd degree), Doctoral degree (3rd degree). The counts and the reported SP value ("SP=33" as extracted) are incomplete in this transcription.] Age is a much more significant distinguishing/segregating factor than religion & origin/ethnic group.
36 EXAMPLE: IDENTIFY THE DISTINGUISHING FACTOR Distribution of faculty by ordinal academic ranks (lecturer, senior lecturer, associate professor, full professor) in five different types of higher educational institutions. [Table: number of positions by rank (Lecturer, Senior lecturer, Associate professor, Full professor) for Type 1 to Type 5. The counts and the reported indicator ratio ("08!" as extracted) are incomplete in this transcription.]
37 EXAMPLE: IDENTIFY THE DISTINGUISHING FACTOR In order to find the outlier, use the jackknife procedure. [Table: option number versus SP; values not recoverable from this transcription.]
More informationChapter 10. Chapter 10. Multinomial Experiments and. Multinomial Experiments and Contingency Tables. Contingency Tables.
Chapter 10 Multinomial Experiments and Contingency Tables 1 Chapter 10 Multinomial Experiments and Contingency Tables 10-1 1 Overview 10-2 2 Multinomial Experiments: of-fitfit 10-3 3 Contingency Tables:
More informationTHE ROYAL STATISTICAL SOCIETY HIGHER CERTIFICATE
THE ROYAL STATISTICAL SOCIETY 004 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE PAPER II STATISTICAL METHODS The Society provides these solutions to assist candidates preparing for the examinations in future
More informationInferences About the Difference Between Two Means
7 Inferences About the Difference Between Two Means Chapter Outline 7.1 New Concepts 7.1.1 Independent Versus Dependent Samples 7.1. Hypotheses 7. Inferences About Two Independent Means 7..1 Independent
More informationHypothesis Testing One Sample Tests
STATISTICS Lecture no. 13 Department of Econometrics FEM UO Brno office 69a, tel. 973 442029 email:jiri.neubauer@unob.cz 12. 1. 2010 Tests on Mean of a Normal distribution Tests on Variance of a Normal
More informationOne-Way ANOVA. Some examples of when ANOVA would be appropriate include:
One-Way ANOVA 1. Purpose Analysis of variance (ANOVA) is used when one wishes to determine whether two or more groups (e.g., classes A, B, and C) differ on some outcome of interest (e.g., an achievement
More informationClustering Lecture 1: Basics. Jing Gao SUNY Buffalo
Clustering Lecture 1: Basics Jing Gao SUNY Buffalo 1 Outline Basics Motivation, definition, evaluation Methods Partitional Hierarchical Density-based Mixture model Spectral methods Advanced topics Clustering
More informationPOLI 443 Applied Political Research
POLI 443 Applied Political Research Session 6: Tests of Hypotheses Contingency Analysis Lecturer: Prof. A. Essuman-Johnson, Dept. of Political Science Contact Information: aessuman-johnson@ug.edu.gh College
More informationCorrelation. A statistics method to measure the relationship between two variables. Three characteristics
Correlation Correlation A statistics method to measure the relationship between two variables Three characteristics Direction of the relationship Form of the relationship Strength/Consistency Direction
More informationLast Lecture. Distinguish Populations from Samples. Knowing different Sampling Techniques. Distinguish Parameters from Statistics
Last Lecture Distinguish Populations from Samples Importance of identifying a population and well chosen sample Knowing different Sampling Techniques Distinguish Parameters from Statistics Knowing different
More informationLecture Slides. Section 13-1 Overview. Elementary Statistics Tenth Edition. Chapter 13 Nonparametric Statistics. by Mario F.
Lecture Slides Elementary Statistics Tenth Edition and the Triola Statistics Series by Mario F. Triola Slide 1 Chapter 13 Nonparametric Statistics 13-1 Overview 13-2 Sign Test 13-3 Wilcoxon Signed-Ranks
More information11-2 Multinomial Experiment
Chapter 11 Multinomial Experiments and Contingency Tables 1 Chapter 11 Multinomial Experiments and Contingency Tables 11-11 Overview 11-2 Multinomial Experiments: Goodness-of-fitfit 11-3 Contingency Tables:
More informationChapter 5 Confidence Intervals
Chapter 5 Confidence Intervals Confidence Intervals about a Population Mean, σ, Known Abbas Motamedi Tennessee Tech University A point estimate: a single number, calculated from a set of data, that is
More information10: Crosstabs & Independent Proportions
10: Crosstabs & Independent Proportions p. 10.1 P Background < Two independent groups < Binary outcome < Compare binomial proportions P Illustrative example ( oswege.sav ) < Food poisoning following church
More informationChapter 2 Class Notes Sample & Population Descriptions Classifying variables
Chapter 2 Class Notes Sample & Population Descriptions Classifying variables Random Variables (RVs) are discrete quantitative continuous nominal qualitative ordinal Notation and Definitions: a Sample is
More informationFundamentals to Biostatistics. Prof. Chandan Chakraborty Associate Professor School of Medical Science & Technology IIT Kharagpur
Fundamentals to Biostatistics Prof. Chandan Chakraborty Associate Professor School of Medical Science & Technology IIT Kharagpur Statistics collection, analysis, interpretation of data development of new
More informationLecture 8: Summary Measures
Lecture 8: Summary Measures Dipankar Bandyopadhyay, Ph.D. BMTRY 711: Analysis of Categorical Data Spring 2011 Division of Biostatistics and Epidemiology Medical University of South Carolina Lecture 8:
More informationMONTE CARLO ANALYSIS OF CHANGE POINT ESTIMATORS
MONTE CARLO ANALYSIS OF CHANGE POINT ESTIMATORS Gregory GUREVICH PhD, Industrial Engineering and Management Department, SCE - Shamoon College Engineering, Beer-Sheva, Israel E-mail: gregoryg@sce.ac.il
More informationIf we want to analyze experimental or simulated data we might encounter the following tasks:
Chapter 1 Introduction If we want to analyze experimental or simulated data we might encounter the following tasks: Characterization of the source of the signal and diagnosis Studying dependencies Prediction
More informationClassification: Linear Discriminant Analysis
Classification: Linear Discriminant Analysis Discriminant analysis uses sample information about individuals that are known to belong to one of several populations for the purposes of classification. Based
More informationA SHORT INTRODUCTION TO PROBABILITY
A Lecture for B.Sc. 2 nd Semester, Statistics (General) A SHORT INTRODUCTION TO PROBABILITY By Dr. Ajit Goswami Dept. of Statistics MDKG College, Dibrugarh 19-Apr-18 1 Terminology The possible outcomes
More informationNotes. AS Examinations are in blue A Level Examinations are in red Other examinations are in green
Notes AS Examinations are in blue A Level Examinations are in red Other examinations are in green This is the first version of your External Examination Timetable for next summer. Once entries have been
More informationDecomposition of Parsimonious Independence Model Using Pearson, Kendall and Spearman s Correlations for Two-Way Contingency Tables
International Journal of Statistics and Probability; Vol. 7 No. 3; May 208 ISSN 927-7032 E-ISSN 927-7040 Published by Canadian Center of Science and Education Decomposition of Parsimonious Independence
More informationWELCOME! Lecture 14: Factor Analysis, part I Måns Thulin
Quantitative methods II WELCOME! Lecture 14: Factor Analysis, part I Måns Thulin The first factor analysis C. Spearman (1904). General intelligence, objectively determined and measured. The American Journal
More information