Multiple comparisons of slopes of regression lines. Jolanta Wojnar, Wojciech Zieliński


Institute of Statistics and Econometrics, University of Rzeszów, ul. Ćwiklińskiej 2, Rzeszów
Department of Econometrics and Computer Sciences, Agricultural University, ul. Nowoursynowska 166, Warszawa

Abstract

The problem of comparing $K$ simple regression lines is considered. A statistical procedure for finding groups of parallel lines is proposed.

Key words: multiple comparisons, simultaneous inference, regression analysis

AMS subject classification: 62J15, 62J10

1 Introduction

Consider $K$ regression lines

$$Y = \alpha_k + \beta_k x + \varepsilon, \qquad k = 1, \dots, K,$$

and assume that the $\varepsilon$'s are i.i.d. random variables distributed as $N(0, \sigma^2)$. The problem is to divide the set of regression coefficients $\{\beta_1, \dots, \beta_K\}$ into homogeneous groups. A subset $\{\beta_{i_1}, \dots, \beta_{i_m}\}$ is called a homogeneous group if $\beta_{i_1} = \dots = \beta_{i_m}$ and any other $\beta \in \{\beta_1, \dots, \beta_K\}$ is not equal to $\beta_{i_1}$.

This problem is similar to the problem of extracting homogeneous groups of means in ANOVA. Some classical multiple comparison procedures (cf. Miller, 1982; Hochberg and Tamhane, 1988), such as those of Tukey, Scheffé and Bonferroni, can be adapted to the above problem. In what follows, the W procedure of multiple comparisons proposed by Zieliński (1992) is used. The probability of the correct decision is taken as the criterion of the quality of the procedure.

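2 Statistical model

Assume that for each regression function we have $n_k$ observations $(x_{ki}, Y_{ki})$. Then the overall number of observations is $N = \sum_k n_k$. Hence we have the following joint model:

$$Y_{ki} = \alpha_k + \beta_k x_{ki} + \varepsilon_{ki}, \qquad i = 1, \dots, n_k,\ k = 1, \dots, K, \tag{1}$$

where the $\alpha$'s and $\beta$'s are unknown regression coefficients and the $\varepsilon$'s are independent normally distributed random variables with mean zero and variance $\sigma^2$. In matrix notation the model may be written as $y = X\beta + \varepsilon$, where the vector $\varepsilon$ is distributed as $N_N(0, \sigma^2 I)$, $y = (Y_{11}, \dots, Y_{1n_1}, \dots, Y_{K1}, \dots, Y_{Kn_K})'$ and

$$X\beta = \begin{bmatrix}
1_{n_1} & & & \mathbf{x}_1 & & \\
 & \ddots & & & \ddots & \\
 & & 1_{n_K} & & & \mathbf{x}_K
\end{bmatrix}
\begin{bmatrix} \alpha_1 \\ \vdots \\ \alpha_K \\ \beta_1 \\ \vdots \\ \beta_K \end{bmatrix}.$$

Here $1_{n_k}$ is the $n_k \times 1$ vector of ones, $\mathbf{x}_k = (x_{k1}, \dots, x_{kn_k})'$, and the empty blocks are zero.

Let $A$ be a given $q \times 2K$ matrix of rank $r$ and let $c$ be a given $q \times 1$ vector. On the basis of the general theory of linear models we obtain the following test statistic for the hypothesis $H: A\beta = c$:

$$F = \frac{(A\hat\beta - c)'\,[A(X'X)^{-}A']^{-}\,(A\hat\beta - c)}{y'(I - X(X'X)^{-}X')y} \cdot \frac{N - \operatorname{rank}(X)}{r}, \tag{2}$$

where $\hat\beta$ is the LSE of $\beta$. Its null distribution is $F$ with $(r, N - \operatorname{rank}(X))$ degrees of freedom, and the hypothesis is rejected at significance level $\alpha$ if $F > F^{\alpha}_{r;\,N - \operatorname{rank}(X)}$, where $F^{\alpha}_{r;\,N - \operatorname{rank}(X)}$ is the appropriate critical value.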
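As an illustration of the matrix form of model (1), the following Python sketch builds $X$ with the columns ordered as $(\alpha_1, \dots, \alpha_K, \beta_1, \dots, \beta_K)$ and forms $X'X$. The function names `design_matrix` and `gram` and the small data set are hypothetical and serve only to show the block structure.

```python
def design_matrix(xs):
    """Build X for model (1); xs[k] holds the regressor values of line k.

    Columns 0..K-1 are the intercept indicators (alpha_1..alpha_K),
    columns K..2K-1 carry the x values (beta_1..beta_K).
    """
    K = len(xs)
    rows = []
    for k, x_k in enumerate(xs):
        for x in x_k:
            row = [0.0] * (2 * K)
            row[k] = 1.0       # intercept of line k
            row[K + k] = x     # slope regressor of line k
            rows.append(row)
    return rows


def gram(X):
    """Return X'X as a list of lists."""
    p = len(X[0])
    return [[sum(r[a] * r[b] for r in X) for b in range(p)] for a in range(p)]


# Hypothetical data: K = 2 lines with n_1 = 3 and n_2 = 2 observations.
xs = [[-1.0, 0.0, 1.0], [0.0, 2.0]]
X = design_matrix(xs)   # N x 2K = 5 x 4
G = gram(X)             # 4 x 4; lines k != l never share a non-zero entry
```

In `G`, the only non-zero entries are the counts $n_k$, the sums $\sum_i x_{ki}$ and the sums $\sum_i x_{ki}^2$ of each line, so the different lines do not interact in $X'X$.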

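If for each $k = 1, \dots, K$ there exist at least two different $x_{ki}$'s, then $\operatorname{rank}(X) = 2K$ and there exists $(X'X)^{-1}$ of the form

$$(X'X)^{-1} = \begin{bmatrix} W_1 & W_2 \\ W_2 & W_3 \end{bmatrix},$$

where

$$W_1 = \operatorname{diag}\left\{ \frac{\sum_i x_{1i}^2}{n_1 SS_1}, \dots, \frac{\sum_i x_{Ki}^2}{n_K SS_K} \right\},$$

$$W_2 = \operatorname{diag}\left\{ -\frac{\sum_i x_{1i}}{n_1 SS_1}, \dots, -\frac{\sum_i x_{Ki}}{n_K SS_K} \right\},$$

$$W_3 = \operatorname{diag}\left\{ \frac{1}{SS_1}, \dots, \frac{1}{SS_K} \right\}.$$

Here $\operatorname{diag}\{a_1, \dots, a_K\}$ denotes the diagonal matrix with diagonal elements $a_1, \dots, a_K$ and $SS_k = \sum_{i=1}^{n_k} (x_{ki} - \bar x_k)^2$. Note that $\frac{1}{N - 2K}\, y'(I - X(X'X)^{-1}X')y$ is the least squares unbiased estimator of the variance $\sigma^2$.

3 Procedure

The procedure of comparison of regression coefficients is based on the statistic (2) with $c = 0$ and is stepwise. In the first step, it is verified whether $\beta_1 = \dots = \beta_K$. The matrix $A$ is then of the form

$$A = \left[\; 0_{K \times K} \quad I_K - \tfrac{1}{K} 1_K 1_K' \;\right],$$

where $0_{K \times K}$ is the $K \times K$ zero matrix, $I_K$ denotes the identity matrix of order $K$ and $1_K$ denotes the $K \times 1$ vector of ones. The explicit form of the numerator of (2) is

$$\sum_{k=1}^{K} (\hat\beta_k - \hat\beta)^2,$$

where

$$\hat\beta_k = \frac{\sum_{i=1}^{n_k} (Y_{ki} - \bar Y_k)(x_{ki} - \bar x_k)}{\sum_{i=1}^{n_k} (x_{ki} - \bar x_k)^2}, \qquad
\hat\beta = \frac{\sum_{k=1}^{K} \sum_{i=1}^{n_k} (Y_{ki} - \bar Y_k)(x_{ki} - \bar x_k)}{\sum_{k=1}^{K} \sum_{i=1}^{n_k} (x_{ki} - \bar x_k)^2}.$$

Note that $\hat\beta_k$ is the LSE of $\beta_k$ for the $k$-th regression line and $\hat\beta$ is the LSE of the regression coefficient under the assumption $\beta_1 = \dots = \beta_K = \beta$. If the value of statistic (2) is less than $F^{\alpha}_{K-1;\,N-2K}$, the procedure stops and the regression coefficients are considered equal. Otherwise we go to the second step.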
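For the first step, statistic (2) can also be evaluated without matrix algebra. For a linear hypothesis, the F statistic equals the increase in the residual sum of squares caused by the restriction, divided by its degrees of freedom and by the residual mean square of the unrestricted model. The Python sketch below applies this standard equivalence to $H: \beta_1 = \dots = \beta_K$. The function names and the data are illustrative assumptions, not taken from the paper.

```python
def line_sums(x, y):
    """Return (n, SS, Sxy, Syy) for one line, centred at the line's own means."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    ss = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    syy = sum((yi - ybar) ** 2 for yi in y)
    return n, ss, sxy, syy


def f_equal_slopes(xs, ys):
    """F statistic for H: beta_1 = ... = beta_K on (K-1, N-2K) degrees of freedom.

    Assumes the full model does not fit the data perfectly (rss_full > 0).
    """
    sums = [line_sums(x, y) for x, y in zip(xs, ys)]
    K = len(sums)
    N = sum(n for n, _, _, _ in sums)
    # Full model: a separate slope Sxy_k / SS_k for every line.
    rss_full = sum(syy - sxy ** 2 / ss for _, ss, sxy, syy in sums)
    # Restricted model: one common slope, separate intercepts.
    ss_all = sum(ss for _, ss, _, _ in sums)
    sxy_all = sum(sxy for _, _, sxy, _ in sums)
    syy_all = sum(syy for _, _, _, syy in sums)
    rss_reduced = syy_all - sxy_all ** 2 / ss_all
    return ((rss_reduced - rss_full) / (K - 1)) / (rss_full / (N - 2 * K))


# Hypothetical data: two lines observed at x = -1, 0, 1.
xs = [[-1.0, 0.0, 1.0], [-1.0, 0.0, 1.0]]
ys = [[0.0, 1.0, 0.0], [-2.0, 1.0, 2.0]]   # fitted slopes 0 and 2
F = f_equal_slopes(xs, ys)                  # 6.0 up to rounding, on (1, 2) d.f.
```

The value of `F` would then be compared with the critical value $F^{\alpha}_{K-1;\,N-2K}$ to decide whether the procedure stops at the first step.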

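In the $p$-th step we consider a division of the set of regression coefficients into $p$ disjoint homogeneous groups. Let $I_1, \dots, I_p$ be a division of $\{1, \dots, K\}$ into $p$ disjoint subsets, and let $J^{(p)}$ denote this division. The corresponding matrix $A$ (after an appropriate permutation of the regression coefficients) takes the form

$$A_{J^{(p)}} = \begin{bmatrix}
0_{m_1 \times K} & I_{m_1} - \frac{1}{m_1} 1_{m_1} 1_{m_1}' & 0_{m_1 \times m_2} & \cdots & 0_{m_1 \times m_p} \\
0_{m_2 \times K} & 0_{m_2 \times m_1} & I_{m_2} - \frac{1}{m_2} 1_{m_2} 1_{m_2}' & \cdots & 0_{m_2 \times m_p} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0_{m_p \times K} & 0_{m_p \times m_1} & 0_{m_p \times m_2} & \cdots & I_{m_p} - \frac{1}{m_p} 1_{m_p} 1_{m_p}'
\end{bmatrix},$$

where $0_{a \times b}$ denotes the $a \times b$ zero matrix and $m_i$ is the cardinality of the subset $I_i$. Let $F_{J^{(p)}}$ denote the statistic (2) with the matrix $A_{J^{(p)}}$. The numerator of (2) equals

$$\sum_{j=1}^{p} \sum_{k \in I_j} (\hat\beta_k - \hat\beta_{I_j})^2,$$

where

$$\hat\beta_k = \frac{\sum_{i=1}^{n_k} (Y_{ki} - \bar Y_k)(x_{ki} - \bar x_k)}{\sum_{i=1}^{n_k} (x_{ki} - \bar x_k)^2}, \qquad
\hat\beta_{I_j} = \frac{\sum_{k \in I_j} \sum_{i=1}^{n_k} (Y_{ki} - \bar Y_{I_j})(x_{ki} - \bar x_{I_j})}{\sum_{k \in I_j} \sum_{i=1}^{n_k} (x_{ki} - \bar x_{I_j})^2},$$

$$\bar Y_{I_j} = \frac{\sum_{k \in I_j} \sum_{i=1}^{n_k} Y_{ki}}{\sum_{k \in I_j} n_k}, \qquad
\bar x_{I_j} = \frac{\sum_{k \in I_j} \sum_{i=1}^{n_k} x_{ki}}{\sum_{k \in I_j} n_k}.$$

The estimator $\hat\beta_{I_j}$ is the LSE of the regression coefficient under the assumption that all $\beta_k$ for $k \in I_j$ are equal.

Let $J_*^{(p)}$ be a division into $p$ groups such that $F_{J_*^{(p)}} = \min F_{J^{(p)}}$, the minimum being taken over all divisions into $p$ groups. If $F_{J_*^{(p)}} < F^{\alpha}_{K-p;\,N-2K}$, then we stop the procedure and accept the division $J_*^{(p)}$. Otherwise we consider divisions into $p + 1$ groups. If $p = K - 1$ and $F_{J_*^{(p)}} > F^{\alpha}_{K-p;\,N-2K}$ holds, we decide that we have $K$ groups, i.e. all coefficients are distinct.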
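The stepwise search described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: $F_{J^{(p)}}$ is computed as statistic (2) through the equivalent comparison of residual sums of squares (a common slope and separate intercepts within each group), the critical value is supplied by the caller through the hypothetical argument `critical_value(df1, df2)`, and lines are indexed from 0. All function names are invented for the example.

```python
def line_sums(x, y):
    """Return (n, SS, Sxy, Syy) for one line, centred at the line's own means."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    ss = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    syy = sum((yi - ybar) ** 2 for yi in y)
    return n, ss, sxy, syy


def partitions_into(items, p):
    """Yield every division of `items` into exactly p non-empty groups."""
    if p < 0:
        return
    if not items:
        if p == 0:
            yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions_into(rest, p - 1):    # `first` forms its own group
        yield [[first]] + part
    for part in partitions_into(rest, p):        # `first` joins an existing group
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]


def f_division(sums, division):
    """Statistic (2) for the hypothesis of equal slopes within each group."""
    K = len(sums)
    N = sum(s[0] for s in sums)
    rss_full = sum(syy - sxy ** 2 / ss for _, ss, sxy, syy in sums)
    rss_reduced = 0.0
    for group in division:
        ss_g = sum(sums[k][1] for k in group)
        sxy_g = sum(sums[k][2] for k in group)
        syy_g = sum(sums[k][3] for k in group)
        rss_reduced += syy_g - sxy_g ** 2 / ss_g
    df1 = K - len(division)
    return ((rss_reduced - rss_full) / df1) / (rss_full / (N - 2 * K))


def stepwise_grouping(xs, ys, critical_value):
    """Return the first division whose minimal F falls below the critical value."""
    sums = [line_sums(x, y) for x, y in zip(xs, ys)]
    K = len(sums)
    N = sum(s[0] for s in sums)
    for p in range(1, K):
        f_best, best = min(
            ((f_division(sums, d), d) for d in partitions_into(list(range(K)), p)),
            key=lambda pair: pair[0],
        )
        if f_best < critical_value(K - p, N - 2 * K):
            return best
    return [[k] for k in range(K)]               # all slopes declared distinct


# Hypothetical data: slopes 0, 0 and 10 at x = -1, 0, 1; the constant 9.0
# stands in for a real critical value and is an assumption of this example.
xs = [[-1.0, 0.0, 1.0]] * 3
ys = [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0], [-10.0, 1.0, 10.0]]
result = stepwise_grouping(xs, ys, lambda df1, df2: 9.0)   # [[0, 1], [2]]
```

In practice the critical value could be taken from an F quantile function, for example `scipy.stats.f.ppf(1 - alpha, df1, df2)` if SciPy is available. The number of divisions grows quickly with $K$, so this exhaustive search is only practical for a small number of lines.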

4 Criterion

Let $\Theta = \{\theta_1, \theta_2, \dots\}$ denote the set of all possible divisions of the set of regression coefficients into homogeneous groups. Elements of the set $\Theta$ are disjoint subsets of $\mathbb{R}^K$, and for every $(\beta_1, \dots, \beta_K) \in \mathbb{R}^K$ there exists only one $\theta \in \Theta$ such that $(\beta_1, \dots, \beta_K) \in \theta$. Note that $\Theta$ is a finite set. The elements of the set $\Theta$ are commonly called states of nature. For example, consider $K = 3$. The set $\Theta$ consists of the following elements:

$\theta_1 = \{(\beta_1, \beta_2, \beta_3) \in \mathbb{R}^3 : \beta_1 = \beta_2 = \beta_3\}$
$\theta_2 = \{(\beta_1, \beta_2, \beta_3) \in \mathbb{R}^3 : \beta_1 = \beta_2,\ \beta_3 \neq \beta_1\}$
$\theta_3 = \{(\beta_1, \beta_2, \beta_3) \in \mathbb{R}^3 : \beta_1 = \beta_3,\ \beta_2 \neq \beta_1\}$
$\theta_4 = \{(\beta_1, \beta_2, \beta_3) \in \mathbb{R}^3 : \beta_2 = \beta_3,\ \beta_1 \neq \beta_2\}$
$\theta_5 = \{(\beta_1, \beta_2, \beta_3) \in \mathbb{R}^3 : \beta_1 \neq \beta_2,\ \beta_1 \neq \beta_3,\ \beta_2 \neq \beta_3\}$

The aim of any multiple comparison procedure is to detect the true state of nature. Let $D$ be the set of all decisions which can be made on the basis of observations. The elements of the set $D$ are called decisions. We assume that $D \subseteq \Theta$. We define the loss function in the following manner:

$$L(d, \theta) = \begin{cases} 0, & \text{if } d = \theta, \\ 1, & \text{if } d \neq \theta, \end{cases}$$

for $d \in D$ and $\theta \in \Theta$. This loss function gives a penalty of one when our decision is not correct. If we denote by $\mathcal{X}$ the space of all observations, then a function $\delta : \mathcal{X} \ni x \mapsto d \in D$ is called a decision rule. The considered procedure of multiple comparisons may be described as a decision rule.

A decision rule $\delta$ is characterized by its risk function, i.e. the average loss. Let $(\beta_1, \dots, \beta_K) \in \theta$. Then the risk function of the rule $\delta$ equals

$$R_\delta(\beta_1, \dots, \beta_K) = E_{(\beta_1, \dots, \beta_K)} L(\delta(x), \theta) = P_{(\beta_1, \dots, \beta_K)}\{\delta(x) \neq \theta\}.$$

Note that in general the risk depends on the differences between the values of the parameters $(\beta_1, \dots, \beta_K)$. For example, if we assume $K = 3$ and $\sigma^2 = 1$, then a misclassification is more likely for $\beta_1 = \beta_2 = 1$, $\beta_3 = 1.1$ than for $\beta_1 = \beta_2 = 1$, $\beta_3 = 5$, though both belong to the same state of nature. Only in the case $\beta_1 = \dots = \beta_K = \beta$ does the risk not depend on the value of $\beta$.
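To make the notions of state of nature and loss concrete, here is a small Python sketch. It represents a state of nature as the division of the indices $1, \dots, K$ induced by equal slope values and implements the 0-1 loss defined above. The function names are hypothetical, and exact equality of the true parameter values is assumed (the function is not meant to be applied to estimates).

```python
def state_of_nature(betas):
    """Return the division of indices 1..K induced by equal slope values."""
    groups = {}
    for k, b in enumerate(betas, start=1):
        groups.setdefault(b, []).append(k)
    return frozenset(frozenset(g) for g in groups.values())


def loss(decision, state):
    """0-1 loss: 0 for a correct decision, 1 otherwise."""
    return 0 if decision == state else 1


# Two parameter vectors that differ in value but share the state theta_2.
theta_a = state_of_nature((1.0, 1.0, 5.0))
theta_b = state_of_nature((2.0, 2.0, 7.0))

# Deciding "all slopes equal" (theta_1) when theta_2 is true costs 1.
theta_1 = state_of_nature((3.0, 3.0, 3.0))
penalty = loss(theta_1, theta_a)
```

Both `theta_a` and `theta_b` equal the division $\{\{1, 2\}, \{3\}\}$, which illustrates that a state of nature records only which slopes coincide, not how far apart the groups are.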

The risk of the rule $\delta$ is the probability of the false decision. This probability should be as small as possible. In our investigation we are interested in the probability of the correct decision, which is equal to $1 - R_\delta$.

The most common approach to the problem under consideration is via the theory of multiple hypothesis testing. In that framework, different criteria of goodness are considered, such as the Familywise Error Rate or the Per Comparison Error Rate, connected with controlling the risk of committing an error of type I (see Gather et al., 1996). Those criteria may be considered generalizations of the notion of the significance level in the Neyman-Pearson theory of testing hypotheses. In that terminology, we may say that the probability of the correct decision is a criterion which simultaneously takes into account the possibility of committing errors of type I and type II, as Wald's decision theory does.

Note that the imposed criterion, as opposed to the theory of multiple hypothesis testing, does not keep the Familywise Error Rate at a fixed level. Thus it is advisable to consider rules $\delta$ with $R_\delta$ for $\beta_1 = \dots = \beta_K$ equal to the significance level of the hypothesis $H: \beta_1 = \dots = \beta_K$. As a consequence, there is no possibility that the results obtained by the rule contradict those obtained for the above hypothesis.

The weak point of the presented approach is that there is no possibility of obtaining a uniformly best procedure. On the other hand, we avoid the obvious disadvantage that the constructed procedures will be too conservative (i.e. giving too large homogeneous groups).

5 Experiment

The probability of the correct decision was estimated on the basis of a simulation experiment. In the experiment we chose $K = 5$ regression functions, each of which was observed 20 times. Random errors were normal with $\sigma = 1$. Parameters $\alpha_1, \dots, \alpha_5$ were zero. The regression functions were considered on the interval $[-1, 1]$.

For five regression functions there are 67 states of nature, but it is enough to consider seven of them. The considered states are given below. The notation $\{\beta_1 = \beta_2 = \beta_3, \beta_4, \beta_5\}$ means the following subset:

$$\{(\beta_1, \beta_2, \beta_3, \beta_4, \beta_5) : \beta_1 = \beta_2 = \beta_3,\ \beta_4 \neq \beta_1,\ \beta_5 \neq \beta_1,\ \beta_4 \neq \beta_5\} \subset \mathbb{R}^5.$$
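One way to see where the number seven comes from is to count the possible patterns of group sizes: if only the sizes of the groups matter and not which coefficients fall into them, each pattern is a way of writing 5 as a sum of positive integers. The Python sketch below lists these patterns. The function name `group_size_patterns` is hypothetical, and reading the seven states as group-size patterns is an interpretation rather than a statement from the paper.

```python
def group_size_patterns(K, largest=None):
    """Yield the non-increasing lists of positive group sizes that sum to K."""
    if largest is None:
        largest = K
    if K == 0:
        yield []
        return
    for size in range(min(K, largest), 0, -1):
        for rest in group_size_patterns(K - size, size):
            yield [size] + rest


# Sort each pattern in increasing order, e.g. [4, 1] becomes [1, 4].
patterns = [sorted(p) for p in group_size_patterns(5)]
for p in patterns:
    print(p)
```

For $K = 5$ the loop prints seven patterns, from `[5]` (all slopes equal) to `[1, 1, 1, 1, 1]` (all slopes distinct).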

Number of groups   State of nature                                            Notation
1                  $\{\beta_1 = \beta_2 = \beta_3 = \beta_4 = \beta_5\}$      {5}
2                  $\{\beta_1 = \beta_2 = \beta_3,\ \beta_4 = \beta_5\}$      {2, 3}
2                  $\{\beta_1 = \beta_2 = \beta_3 = \beta_4,\ \beta_5\}$      {1, 4}
3                  $\{\beta_1 = \beta_2 = \beta_3,\ \beta_4,\ \beta_5\}$      {1, 1, 3}
3                  $\{\beta_1 = \beta_2,\ \beta_3 = \beta_4,\ \beta_5\}$      {1, 2, 2}
4                  $\{\beta_1 = \beta_2,\ \beta_3,\ \beta_4,\ \beta_5\}$      {1, 1, 1, 2}
5                  $\{\beta_1,\ \beta_2,\ \beta_3,\ \beta_4,\ \beta_5\}$      {1, 1, 1, 1, 1}

For each state of nature, regression coefficients were generated from the interval (5, 15) according to the uniform distribution. For example, for the state {2, 3} two numbers $x$, $y$ were generated from the distribution $U(5, 15)$ and it was set $\beta_1 = \beta_2 = \beta_3 = x$ and $\beta_4 = \beta_5 = y$. Such generation was repeated one hundred times. For each generated set of regression coefficients $(\beta_1, \dots, \beta_5)$, 1 draw of samples $(x_{ki}, Y_{ki})$, $k = 1, \dots, 5$, $i = 1, \dots, 20$, was made such that $Y_{ki} = \beta_k x_{ki} + \varepsilon_{ki}$. The described procedure was applied to each sample, and it was noted whether the obtained division of regression coefficients was consistent with the state of nature. The probability of the correct decision was estimated by the fraction of divisions consistent with the state of nature.

It is obvious that the probability of the correct decision depends on the plan of the experiment, i.e. on the choice of the values of the regressors. Three plans were considered. In the first case (random plan), twenty values of $x$ were chosen randomly from the interval $[-1, 1]$ (according to the uniform distribution) for every regression function separately. The second plan was a naive one, i.e. the values of the regressor were $-1 + 2i/19$ for $i = 0, 1, \dots, 19$. The third plan was the G-optimal plan, in which $x = -1$ or $x = 1$ and ten observations were taken at each $x$. The second and third plans were common for all regression functions.

6 Results

The results are presented graphically. On the y axis there is the estimated probability (multiplied by 100) of the correct decision, while on the x axis there is the minimal distance between groups. The solid line represents the probability of the correct decision for the G-optimal plan, the dashed line for the naive plan, and the dotted line for the random plan. On the basis of the simulations we may formulate the following conclusions.

1. The proposed procedure for detecting the division of regression coefficients that corresponds to the true state of nature is more precise when there is a small number of groups of coefficients. When the number of groups of coefficients increases, the probability of detecting differences between them decreases.

2. For each division of regression coefficients, the best plan of experiment was the G-optimal plan. The probability of taking the correct decision is very high even for small differences between the sets of regression coefficients.

The above conclusions hold for five regression functions, but it may be expected that similar conclusions can be formulated for a higher number of functions.

7 Literature

Gather U., Pawlitschko J., Pigeot I. (1996) Unbiasedness of multiple tests, Scandinavian Journal of Statistics 23.
Hochberg Y., Tamhane A. C. (1988) Multiple Comparison Procedures, John Wiley & Sons.
Miller R. G. Jr. (1982) Simultaneous Statistical Inference, 2nd ed., Springer Verlag.
Zieliński W. (1992) Monte Carlo comparison of multiple comparison procedures, Biometrical Journal 34.

Multiple comparisons of slopes of regression lines (Porównania wielokrotne współczynników kierunkowych prostych regresji)

Summary (translated from the Polish "Streszczenie")

The paper considers the problem of comparing $K$ regression lines. A procedure for finding groups of parallel lines is proposed.

Key words: multiple comparisons, simultaneous inference, regression analysis

AMS subject classification: 62J15, 62J10

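8 Figures

[Figures not reproduced in this transcription. Six panels show the estimated probability of the correct decision against the minimal distance between groups, one panel for each of the states {1, 4}, {2, 3}, {1, 1, 3}, {1, 2, 2}, {1, 1, 1, 2} and {1, 1, 1, 1, 1}.]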


LECTURE 10: NEYMAN-PEARSON LEMMA AND ASYMPTOTIC TESTING. The last equality is provided so this can look like a more familiar parametric test. Economics 52 Econometrics Professor N.M. Kiefer LECTURE 1: NEYMAN-PEARSON LEMMA AND ASYMPTOTIC TESTING NEYMAN-PEARSON LEMMA: Lesson: Good tests are based on the likelihood ratio. The proof is easy in the

More information
