Data Transformation
- Jason Williamson
- 5 years ago
Transcription
Objectives: Data Transformation
- Understand why we often need to transform our data
- The three commonly used data transformation techniques
- Additive effects and multiplicative effects
- Application of data transformation in ANOVA and regression

Why Data Transformation?
The assumptions of most parametric methods:
- Homoscedasticity
- Normality
- Additivity
- Linearity
Data transformation is used to make your data conform to the assumptions of the statistical methods.
Illustrative examples

Homoscedasticity and Normality
The data deviate from both homoscedasticity and normality. Wouldn't it be nice if we could make the data look this way?
Types of Data Transformation
- The logarithmic transformation
- The square-root transformation
- The arcsine transformation
Data transformation can be done conveniently in EXCEL.
Alternatives: ranks and nonparametric methods.

Homoscedasticity
[Table: ID, Group 1, Group 2, with Mean, Var, t, P, Equal Var.? (P), Kurtosis, Skewness, P(Zg1), P(Zg2)]
The two groups of data seem to differ greatly in means, but a t-test shows that the means do not differ significantly from each other, which is a surprising result.
The two groups of data differ greatly in variance, and both deviate significantly from normality. These results invalidate the t-test.
We calculate two ratios: the var/mean ratio and the Std/mean ratio (i.e., the coefficient of variation).
[Table: Mean, Var/mean, C.V. for Group 1 and Group 2]

Log-transformation
[Table: ID, Group 1, Group 2 after transformation, with Mean, Var, t -3.4, P, Equal Var.? P= 0.67, Kurtosis, Skewness, P(Zg1), P(Zg2)]
Log-Transformed Data: NewX = ln(x+1)
The transformation is successful because:
- The variance is now similar
- Deviation from normality is now nonsignificant
- The t-test revealed a highly significant difference in means between the two groups

Log-Transformed Data: NewX = ln(x+1)
Transform back: X = e^NewX - 1
Compare this mean with the original mean. Which one is preferable?
Calculate the standard error, the degrees of freedom, and the 95% CL (t 0.05,16 =.47).
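The steps on this page (compute the two ratios, apply NewX = ln(x+1), then transform the mean back) can be sketched in a few lines. This is a minimal illustration in Python rather than the EXCEL or SAS used in the slides; the data values and the function names `describe`, `log_transform`, and `back_transform` are hypothetical and are not taken from the slides.

```python
import math
import statistics


def describe(values):
    """Return mean, sample variance, var/mean ratio and C.V. for one group."""
    mean = statistics.mean(values)
    var = statistics.variance(values)  # sample variance (n - 1 denominator)
    return {
        "mean": mean,
        "var": var,
        "var_over_mean": var / mean,
        "cv": math.sqrt(var) / mean,  # Std/mean ratio
    }


def log_transform(values):
    """NewX = ln(x + 1), as on the slide."""
    return [math.log(x + 1) for x in values]


def back_transform(new_x):
    """X = e^NewX - 1, the inverse of ln(x + 1)."""
    return math.exp(new_x) - 1


# Hypothetical right-skewed data; NOT the values from the slides.
group1 = [2, 3, 5, 4, 9, 3, 6, 14, 4]
group2 = [20, 35, 48, 30, 95, 41, 60, 150, 38]

raw1, raw2 = describe(group1), describe(group2)
log1, log2 = describe(log_transform(group1)), describe(log_transform(group2))

print("raw variances:", raw1["var"], raw2["var"])
print("C.V.:", raw1["cv"], raw2["cv"])
print("var/mean:", raw1["var_over_mean"], raw2["var_over_mean"])
print("log-scale variances:", log1["var"], log2["var"])

# Mean on the log scale, transformed back to the original scale.
back_mean1 = back_transform(log1["mean"])
back_mean2 = back_transform(log2["mean"])
print("back-transformed means:", back_mean1, back_mean2)
```

In this made-up example the two groups have very different variances but similar C.V. values, and the variances become comparable after the log transformation. The back-transformed mean is smaller than the arithmetic mean of the raw data, which is the comparison the slide asks the reader to make.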
Normal but Heteroscedastic
Any transformation that you use is likely to change normality. Fortunately, the t-test and ANOVA are quite robust for this kind of data. Of course, you can also use nonparametric tests.

Normal but Heteroscedastic
[Table: ID, Group 1, Group 2, with Mean, Var, t, P, Equal Var.? (P), Kurtosis, Skewness]
The two variances are significantly different. The t-test, however, detects a significant difference in means. You can use nonparametric methods to analyse the data for comparison, and you are likely to find the t-test to be more powerful.

Additivity
[Table: group means for the levels of Factor A by the levels of Factor B]
What experimental design is this? Compare the group means. Is there an interaction effect?
Additivity means that the difference between levels of one factor is consistent for different levels of another factor.

Multiplicative Effects
[Table: group means for the levels of Factor A by the levels of Factor B]
Compare the group means. Is there an interaction effect? Does this data set meet the assumption of additivity? When the assumption of additivity is not met, we have difficulty in interpreting main effects. Now calculate the ratio of group means. What did you find?
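The comparison of group means asked for above can be sketched as follows. This is a minimal Python illustration under stated assumptions: the two tables of cell means are invented for the example (rows are levels of Factor A, columns are levels of Factor B), and the helper names `differences`, `ratios`, and `is_constant` are hypothetical.

```python
def differences(cell_means):
    """Level 2 minus Level 1 of Factor A, within each level of Factor B."""
    level1, level2 = cell_means
    return [m2 - m1 for m1, m2 in zip(level1, level2)]


def ratios(cell_means):
    """Level 2 divided by Level 1 of Factor A, within each level of Factor B."""
    level1, level2 = cell_means
    return [m2 / m1 for m1, m2 in zip(level1, level2)]


def is_constant(values, tol=1e-9):
    """True when all values agree within a small tolerance."""
    return max(values) - min(values) <= tol


# Hypothetical cell means; NOT the values from the slides.
additive = [[10.0, 15.0],   # Factor A, Level 1
            [14.0, 19.0]]   # Factor A, Level 2
multiplicative = [[10.0, 20.0],
                  [30.0, 60.0]]

print(differences(additive))        # same difference at both levels of B
print(differences(multiplicative))  # difference depends on the level of B
print(ratios(multiplicative))       # but the ratio is the same
```

With real data the sample means never agree exactly, so constant differences or ratios are only a descriptive hint; the interaction term in the ANOVA is the formal check.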
Multiplicative Effects
[Table: group means for the levels of Factor A by the levels of Factor B]
For Factor A, we see that Level 2 has a mean about … times as large as that for Level 1. For Factor B, Level 2 has a mean about ….1 times as large as that for Level 1. If you know the value for Level 1 of Factor A, you can obtain the value for Level 2 of Factor A by multiplying the known value by …. Similarly, you can do the same for Factor B. We say that the effects of Factors A and B are multiplicative, not additive.

Log-transformation
[Table: original data and transformed data for the levels of Factor A by the levels of Factor B, with variances]
Now log-transform the data. Compare the means. Is the assumption of additivity met now?
Why can log-transformation change multiplicative effects to additive effects?
Z = XY
ln(Z) = ln(X) + ln(Y)

Square-Root Transformation
[Table: ID, Group 1, Group 2, with Mean, Var, Var/Mean, Std/Mean]
The two groups of data differ much in variance. Calculate two ratios: the var/mean ratio and the Std/mean ratio (i.e., the coefficient of variation). Does your calculation suggest log-transformation? When is log-transformation appropriate?
Use square-root transformation when different groups have similar variance/mean ratios.
Notice the means, which do not coincide with the most frequent observations.
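The identity ln(Z) = ln(X) + ln(Y) from the log-transformation slide above can be checked numerically. The sketch below, in Python, builds a 2 x 2 table of cell means from purely multiplicative effects and compares the interaction contrast before and after taking logs. The baseline and the effect multipliers are invented for the illustration, and `interaction_contrast` is a hypothetical helper name.

```python
import math


def interaction_contrast(m):
    """(A2,B2) - (A2,B1) - (A1,B2) + (A1,B1); zero when effects are additive."""
    return m[1][1] - m[1][0] - m[0][1] + m[0][0]


# Hypothetical multiplicative model: cell mean = baseline * effect_A * effect_B.
baseline = 10.0
effect_a = [1.0, 3.0]  # assumed multipliers for Factor A, Levels 1 and 2
effect_b = [1.0, 2.0]  # assumed multipliers for Factor B, Levels 1 and 2

cell_means = [[baseline * a * b for b in effect_b] for a in effect_a]
log_means = [[math.log(m) for m in row] for row in cell_means]

raw_contrast = interaction_contrast(cell_means)
log_contrast = interaction_contrast(log_means)

print("raw scale:", raw_contrast)  # nonzero: the effects are not additive
print("log scale:", log_contrast)  # zero up to rounding: additive after logs
```

On the log scale each cell mean is ln(baseline) + ln(effect_A) + ln(effect_B), so the contrast cancels term by term, which is why the transformed data meet the additivity assumption.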
Square-Root Transformation
[Table: ID, Group 1, Group 2 after transformation, with Mean and Var]
Square-root transformation: X' = sqrt(X + 3/…)
The variance is now almost identical between the two groups.
Transform the means back to the original scale and compare these means with the original means: X = (X')^2 - 3/…

Quiz on Data Transformation
[Table: Group, n, Mean, Var, SE, t, LowerL, UpperL]
The data set is right-skewed for each group. Calculate the variance/mean ratio and C.V. for each group, and decide what transformation you should use. Do the transformation and convert the means back to the original scale.

With Multiple Groups
[Figure: Variance plotted against Mean]
When you have multiple groups, a Variance vs Mean or a Std vs Mean plot can help you to decide which data transformation to use. The graph on the left shows that the Var/Mean ratio is almost constant. What transformation should you use?

Confidence Limits
[Figure: Mean, Lower, and Upper confidence limits before transformation (left) and after transformation (right)]
With the skewness in our data, do the confidence limits on the right make more sense? Why?
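The quiz steps above (transform, compute confidence limits on the transformed scale, then convert back) can be sketched as follows in Python. Several things here are assumptions rather than content from the slides: the counts are invented, the additive constant is truncated in the transcription so the commonly used value 3/8 is assumed, and the critical value 2.306 is the two-tailed 5% t value for 8 degrees of freedom, matching the nine made-up observations.

```python
import math
import statistics

C = 3 / 8        # ASSUMED constant; the slide's value is truncated ("3/")
T_CRIT = 2.306   # two-tailed 5% critical t for df = 8 (n = 9 below)


def sqrt_transform(values, c=C):
    """X' = sqrt(X + c)."""
    return [math.sqrt(x + c) for x in values]


def back_transform(x_prime, c=C):
    """X = (X')^2 - c; valid for non-negative X'."""
    return x_prime ** 2 - c


def mean_and_limits(values, t_crit):
    """Mean with lower and upper confidence limits, mean -/+ t * SE."""
    mean = statistics.mean(values)
    se = math.sqrt(statistics.variance(values) / len(values))
    return mean, mean - t_crit * se, mean + t_crit * se


# Hypothetical right-skewed counts; NOT the data from the slides.
counts = [0, 1, 1, 2, 2, 3, 4, 6, 9]

t_mean, t_lower, t_upper = mean_and_limits(sqrt_transform(counts), T_CRIT)

# Convert the mean and both limits back to the original scale.
mean_back = back_transform(t_mean)
lower_back = back_transform(t_lower)
upper_back = back_transform(t_upper)

print("raw mean:", statistics.mean(counts))
print("back-transformed mean:", mean_back)
print("back-transformed limits:", lower_back, upper_back)
```

The back-transformed limits are asymmetric around the back-transformed mean, with the longer arm on the upper side, which fits right-skewed data better than the symmetric limits computed on the raw scale.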
Arcsine Transformation
[Table: p, Var, SE, LowerL, UpperL for Group 1 and Group 2, with LowerL and UpperL transformed back]
X' = arcsin(sqrt(X))
Used for proportions.
Compare the variances before and after transformation.
Do you know how to transform the means, SE and C.L. back to the original scale?
X = (sin X')^2

Data Transformation Using SAS
Data Mydata;
  input x;
  newx=log(x);            /* natural logarithm transformation */
  newx=sqrt(x+3/);        /* square-root transformation */
  newx=arsin(sqrt(x));    /* arcsine transformation */
cards;
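The arcsine transformation and its inverse can be written out directly. The sketch below is in Python rather than SAS and mirrors the `arsin(sqrt(x))` expression in the SAS step; the function names and the example proportions are hypothetical. Angles are in radians.

```python
import math


def arcsine_transform(p):
    """X' = arcsin(sqrt(X)) for a proportion X in [0, 1]; result in radians."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("arcsine transformation expects a proportion in [0, 1]")
    return math.asin(math.sqrt(p))


def arcsine_back(x_prime):
    """X = (sin X')^2, the inverse for X' between 0 and pi/2."""
    return math.sin(x_prime) ** 2


# Hypothetical proportions; NOT the data from the slides.
proportions = [0.02, 0.10, 0.50, 0.90, 0.98]
transformed = [arcsine_transform(p) for p in proportions]
recovered = [arcsine_back(x) for x in transformed]

for p, x, r in zip(proportions, transformed, recovered):
    print(p, x, r)
```

A mean or a confidence limit computed on the transformed scale is converted back with the same `arcsine_back` function, provided the value lies between 0 and pi/2.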