Multiple comparisons of slopes of regression lines. Jolanta Wojnar, Wojciech Zieliński
Multiple comparisons of slopes of regression lines

Jolanta Wojnar
Institute of Statistics and Econometrics, University of Rzeszów, ul. Ćwiklińskiej 2, Rzeszów

Wojciech Zieliński
Department of Econometrics and Computer Sciences, Agricultural University, ul. Nowoursynowska 166, Warszawa

Abstract. The problem of comparing K simple regression lines is considered. A statistical procedure for finding groups of parallel lines is proposed.

Key words: multiple comparisons, simultaneous inference, regression analysis
AMS subject classification: 62J15, 62J10

1 Introduction

Consider K regression lines

Y = α_k + β_k x + ε, k = 1, ..., K,

and assume that the ε's are iid random variables distributed as N(0, σ²). The problem is to divide the set of regression coefficients {β_1, ..., β_K} into homogeneous groups. A subset {β_{i_1}, ..., β_{i_m}} is called a homogeneous group if β_{i_1} = ... = β_{i_m} and every other β ∈ {β_1, ..., β_K} is not equal to β_{i_1}. This problem is similar to that of extracting homogeneous groups of means in ANOVA. Some of the classical multiple comparison procedures (cf. Miller, 1982; Hochberg and Tamhane, 1988), such as Tukey, Scheffé, or Bonferroni, can be adapted to the above problem. In what follows, the W procedure of multiple comparisons proposed by Zieliński (1992) is used. As a criterion of the quality of the procedure, the probability of the correct decision is taken.
2 Statistical model

Assume that for each of the regression functions we have n_k observations (x_{ki}, Y_{ki}). Then the overall number of observations is N = Σ_k n_k. Hence we have the following joint model:

(1) Y_{ki} = α_k + β_k x_{ki} + ε_{ki}, i = 1, ..., n_k, k = 1, ..., K,

where the α's and β's are unknown regression coefficients and the ε's are independent normally distributed random variables with mean zero and variance σ². In matrix notation the model may be written as y = Xβ + ε, where the vector ε is distributed as N_N(0, σ²I), y = (Y_{11}, ..., Y_{1n_1}, ..., Y_{K1}, ..., Y_{Kn_K})', β = (α_1, ..., α_K, β_1, ..., β_K)', and X is the N × 2K block design matrix

$$
X = \begin{pmatrix}
\mathbf{1}_{n_1} & & & x_{(1)} & & \\
 & \ddots & & & \ddots & \\
 & & \mathbf{1}_{n_K} & & & x_{(K)}
\end{pmatrix},
$$

where 1_{n_k} denotes the n_k × 1 vector of ones, x_{(k)} = (x_{k1}, ..., x_{kn_k})', and the blank entries are zeros.

Let A be a given q × 2K matrix of rank r and let c be a given q × 1 vector. On the basis of the general theory of linear models we obtain the following test statistic for the hypothesis H: Aβ = c:

$$
(2)\qquad F = \frac{(A\hat\beta - c)'\,\bigl[A(X'X)^{-}A'\bigr]^{-}\,(A\hat\beta - c)\,/\,r}
{\,y'\bigl(I - X(X'X)^{-}X'\bigr)y\,/\,\bigl(N - \operatorname{rank}(X)\bigr)\,},
$$

where β̂ is the LSE of β. Its null distribution is F with (r, N − rank(X)) degrees of freedom, and the hypothesis is rejected at significance level α if F > F^α_{r; N−rank(X)}, where F^α_{r; N−rank(X)} is the appropriate critical value.
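As an illustration, statistic (2) can be evaluated directly with numpy. The sketch below is not from the paper; it assumes, for simplicity, that X has full column rank, so that the generalized inverse (X'X)⁻ reduces to the ordinary inverse and rank(X) equals the number of columns.

```python
import numpy as np

def f_statistic(y, X, A, c):
    """F statistic of (2) for H: A beta = c in the model y = X beta + eps.

    Illustrative sketch only; assumes X has full column rank.
    """
    n, p = X.shape
    r = np.linalg.matrix_rank(A)
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ (X.T @ y)                 # least-squares estimate of beta
    resid = y - X @ beta_hat
    sse = resid @ resid                            # y'(I - X(X'X)^- X')y
    d = A @ beta_hat - c
    num = d @ np.linalg.pinv(A @ XtX_inv @ A.T) @ d / r
    den = sse / (n - p)                            # unbiased estimate of sigma^2
    return num / den, r, n - p                     # statistic and its degrees of freedom
```

Under H the returned value is compared with the F critical value with (r, N − rank(X)) degrees of freedom, obtainable e.g. from scipy.stats.f.ppf.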
If for each k = 1, ..., K there exist at least two different x_{ki}'s, then rank(X) = 2K and (X'X)^{-1} exists and has the form

$$
(X'X)^{-1} = \begin{pmatrix} W_1 & W_2 \\ W_2 & W_3 \end{pmatrix},
$$

where

$$
W_1 = \operatorname{diag}\Bigl\{ \frac{\sum_i x_{1i}^2}{n_1 SS_1}, \dots, \frac{\sum_i x_{Ki}^2}{n_K SS_K} \Bigr\},\quad
W_2 = \operatorname{diag}\Bigl\{ -\frac{\sum_i x_{1i}}{n_1 SS_1}, \dots, -\frac{\sum_i x_{Ki}}{n_K SS_K} \Bigr\},\quad
W_3 = \operatorname{diag}\Bigl\{ \frac{1}{SS_1}, \dots, \frac{1}{SS_K} \Bigr\}.
$$

Here diag{a_1, ..., a_K} denotes the diagonal matrix with diagonal elements a_1, ..., a_K, and SS_k = Σ_{i=1}^{n_k} (x_{ki} − x̄_k)². Note that

y'(I − X(X'X)^{-1}X')y / (N − 2K)

is the least squares unbiased estimator of the variance σ².

3 Procedure

The procedure for comparing the regression coefficients is based on statistic (2) with c = 0 and is stepwise. At the first step it is verified whether β_1 = ... = β_K. The matrix A is then of the form

A = [ 0_{K×K} | I_K − (1/K) 1_K 1_K' ],

where 0_{K×K} is the K × K zero matrix, I_K denotes the identity matrix of order K, and 1_K denotes the K × 1 vector of ones. The explicit form of the numerator of (2) is

Σ_{k=1}^{K} SS_k (β̂_k − β̂)²,

where

β̂_k = Σ_i (Y_{ik} − Ȳ_k)(x_{ik} − x̄_k) / Σ_i (x_{ik} − x̄_k)²,
β̂ = Σ_{k=1}^{K} Σ_i (Y_{ik} − Ȳ_k)(x_{ik} − x̄_k) / Σ_{k=1}^{K} Σ_i (x_{ik} − x̄_k)².

Note that β̂_k is the LSE of β_k for the k-th regression line and β̂ is the LSE of the common regression coefficient under the assumption β_1 = ... = β_K = β. If the value of statistic (2) is less than F^α_{K−1; N−2K}, the procedure stops and the regression coefficients are considered equal. Otherwise we go to the second step.
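The estimators of the first step are straightforward to compute. The following sketch (not the authors' code) returns the per-line slopes and the pooled slope, and illustrates that the pooled estimator is the SS_k-weighted average of the individual slopes.

```python
import numpy as np

def slopes_and_pooled(xs, ys):
    """Per-line slope LSEs beta_k and the pooled LSE beta computed under
    the hypothesis beta_1 = ... = beta_K (first step of the procedure).

    xs, ys: sequences of 1-D arrays, one pair (x_k, y_k) per regression line.
    Illustrative sketch, not code from the paper.
    """
    betas, ss = [], []
    num_pool = den_pool = 0.0
    for x, y in zip(xs, ys):
        xc, yc = x - x.mean(), y - y.mean()
        ss_k = xc @ xc                      # SS_k = sum_i (x_ki - xbar_k)^2
        betas.append((yc @ xc) / ss_k)      # beta_k for the k-th line
        ss.append(ss_k)
        num_pool += yc @ xc                 # numerator of the pooled slope
        den_pool += ss_k                    # denominator of the pooled slope
    return np.array(betas), np.array(ss), num_pool / den_pool
```

The pooled estimate equals Σ SS_k β̂_k / Σ SS_k, so lines observed at more spread-out design points receive more weight.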
At the p-th step we consider divisions of the set of regression coefficients into p disjoint homogeneous groups. Let I_1, ..., I_p be a division of {1, ..., K} into p disjoint subsets, and let J(p) denote this division. The corresponding matrix A (after an appropriate permutation of the regression coefficients) takes the form

$$
A_{J(p)} = \begin{pmatrix}
0_{m_1 \times K} & I_{m_1} - \tfrac{1}{m_1}\mathbf 1_{m_1}\mathbf 1_{m_1}' & 0_{m_1 \times m_2} & \cdots & 0_{m_1 \times m_p} \\
0_{m_2 \times K} & 0_{m_2 \times m_1} & I_{m_2} - \tfrac{1}{m_2}\mathbf 1_{m_2}\mathbf 1_{m_2}' & \cdots & 0_{m_2 \times m_p} \\
\vdots & \vdots & & \ddots & \vdots \\
0_{m_p \times K} & 0_{m_p \times m_1} & 0_{m_p \times m_2} & \cdots & I_{m_p} - \tfrac{1}{m_p}\mathbf 1_{m_p}\mathbf 1_{m_p}'
\end{pmatrix},
$$

where m_i is the cardinality of the subset I_i. Let F_{J(p)} denote statistic (2) with the matrix A_{J(p)}. The numerator of (2) equals

Σ_{j=1}^{p} Σ_{k∈I_j} SS_k (β̂_k − β̂_{I_j})²,

where

β̂_k = Σ_i (Y_{ik} − Ȳ_k)(x_{ik} − x̄_k) / Σ_i (x_{ik} − x̄_k)²,
β̂_{I_j} = Σ_{k∈I_j} Σ_i (Y_{ik} − Ȳ_{I_j})(x_{ik} − x̄_{I_j}) / Σ_{k∈I_j} Σ_i (x_{ik} − x̄_{I_j})²,
Ȳ_{I_j} = Σ_{k∈I_j} Σ_i Y_{ki} / Σ_{k∈I_j} n_k,  x̄_{I_j} = Σ_{k∈I_j} Σ_i x_{ki} / Σ_{k∈I_j} n_k.

The estimator β̂_{I_j} is the LSE of the regression coefficient under the assumption that all β_k for k ∈ I_j are equal. Let J*(p) be a division into p groups such that F_{J*(p)} = min_{J(p)} F_{J(p)}. If F_{J*(p)} < F^α_{K−p; N−2K}, then we stop the procedure and accept the division J*(p). Otherwise we consider divisions into p + 1 groups. If p = K − 1 and F_{J*(p)} > F^α_{K−p; N−2K} holds, we decide that we have K groups, i.e., all coefficients are distinct.
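The stepwise search can be sketched as follows. This is a hypothetical implementation, not the authors' program: it enumerates all divisions of {1, ..., K} into p groups, computes the F statistic of (2) with c = 0 for each division (using an SS_k-weighted numerator, an assumption of this sketch), and stops at the first p whose best division falls below the supplied critical value. The caller must provide the critical values F^α_{K−p; N−2K}, e.g. from scipy.stats.f.ppf.

```python
import numpy as np

def partitions_into(items, p):
    """Yield all divisions of `items` into p non-empty disjoint groups."""
    if p == 0:
        if not items:
            yield []
        return
    if not items:
        return
    head, rest = items[0], items[1:]
    for part in partitions_into(rest, p - 1):
        yield [[head]] + part                         # head forms a new group
    for part in partitions_into(rest, p):
        for i in range(len(part)):                    # head joins an existing group
            yield part[:i] + [part[i] + [head]] + part[i + 1:]

def f_for_division(xs, ys, division):
    """F statistic for equality of slopes within each group of `division`."""
    K, N = len(xs), sum(len(x) for x in xs)
    beta, ss, sse = [], [], 0.0
    for x, y in zip(xs, ys):
        xc, yc = x - x.mean(), y - y.mean()
        ss_k = xc @ xc
        b_k = (yc @ xc) / ss_k
        res = yc - b_k * xc
        beta.append(b_k); ss.append(ss_k); sse += res @ res
    beta, ss = np.array(beta), np.array(ss)
    sigma2 = sse / (N - 2 * K)                        # pooled error variance
    num = 0.0
    for group in division:
        w, b = ss[group], beta[group]
        b_pool = (w @ b) / w.sum()                    # pooled slope within the group
        num += w @ (b - b_pool) ** 2
    r = max(K - len(division), 1)                     # numerator degrees of freedom
    return num / r / sigma2

def stepwise_grouping(xs, ys, crit):
    """Stepwise procedure: crit[p] is the critical value F^alpha_{K-p; N-2K}."""
    K = len(xs)
    for p in range(1, K):
        best = min(partitions_into(list(range(K)), p),
                   key=lambda d: f_for_division(xs, ys, d))
        if f_for_division(xs, ys, best) < crit[p]:
            return best
    return [[k] for k in range(K)]                    # all slopes are distinct
```

For small K the exhaustive enumeration over divisions is cheap; for larger K the number of divisions grows quickly and a heuristic search would be needed.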
4 Criterion

Let Θ = {θ_1, θ_2, ...} denote the set of all possible divisions of the set of regression coefficients into homogeneous groups. The elements of Θ are disjoint subsets of R^K, and for every (β_1, ..., β_K) ∈ R^K there exists exactly one θ ∈ Θ such that (β_1, ..., β_K) ∈ θ. Note that Θ is a finite set. The elements of Θ are commonly called states of nature. For example, consider K = 3. The set Θ consists of the following elements:

θ_1 = {(β_1, β_2, β_3) ∈ R³ : β_1 = β_2 = β_3}
θ_2 = {(β_1, β_2, β_3) ∈ R³ : β_1 = β_2, β_3 ≠ β_1}
θ_3 = {(β_1, β_2, β_3) ∈ R³ : β_1 = β_3, β_2 ≠ β_1}
θ_4 = {(β_1, β_2, β_3) ∈ R³ : β_2 = β_3, β_1 ≠ β_2}
θ_5 = {(β_1, β_2, β_3) ∈ R³ : β_1 ≠ β_2, β_1 ≠ β_3, β_2 ≠ β_3}

The aim of any multiple comparison procedure is to detect the true state of nature. Let D be the set of all decisions which can be made on the basis of the observations. We assume that D = Θ. We define the loss function in the following manner:

L(d, θ) = 0 if d = θ, and 1 if d ≠ θ,

for d ∈ D and θ ∈ Θ. This loss function gives a penalty of one when our decision is not correct. If we denote by X the space of all observations, then a function δ : X → D is called a decision rule. The considered procedure of multiple comparisons may be described as a decision rule. A decision rule δ is characterized by its risk function, i.e., the average loss. Let (β_1, ..., β_K) ∈ θ. Then the risk function of the rule δ equals

R_δ(β_1, ..., β_K) = E_{(β_1,...,β_K)} L(δ(X), θ) = P_{(β_1,...,β_K)}{δ(X) ≠ θ}.

Note that in general the risk depends on the differences between the values of the parameters (β_1, ..., β_K). For example, if K = 3 and σ² = 1, then it is easier to make a misclassification for β_1 = β_2 = 1, β_3 = 1.1 than for β_1 = β_2 = 1, β_3 = 0.5, though both belong to the same state of nature. Only in the case β_1 = ... = β_K = β does the risk not depend on the value of β.
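The states of nature are exactly the set partitions of the index set {1, ..., K}, so their number is the Bell number B_K. The small sketch below (illustrative, not from the paper) enumerates them and reproduces the five states θ_1, ..., θ_5 listed above for K = 3.

```python
def set_partitions(items):
    """Yield all divisions of `items` into non-empty disjoint groups;
    each division corresponds to one state of nature theta."""
    if not items:
        yield []
        return
    head, rest = items[0], items[1:]
    for part in set_partitions(rest):
        yield [[head]] + part                     # head in a group of its own
        for i in range(len(part)):                # head joined to an existing group
            yield part[:i] + [part[i] + [head]] + part[i + 1:]
```

For K = 3 this yields 5 states, matching θ_1, ..., θ_5; for K = 5 it yields B_5 = 52 states.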
The risk of the rule δ is the probability of a false decision. This probability should be as small as possible. In our investigation we are interested in the probability of the correct decision, which equals 1 − R_δ.

The most common approach to the problem under consideration is via the theory of multiple hypothesis testing. In that framework different criteria of goodness are considered, such as the Familywise Error Rate or the Per Comparison Error Rate, connected with controlling the risk of committing an error of type I (see Gather et al., 1996). Those criteria may be considered as generalizations of the notion of the significance level in the Neyman–Pearson theory of testing hypotheses. In that terminology, we may say that the probability of the correct decision is a criterion which simultaneously takes into account the possibility of committing errors of type I and type II, as Wald's decision theory does. Note that the imposed criterion, as opposed to the theory of multiple hypothesis testing, does not keep the Familywise Error Rate at a fixed level. Thus it is advisable to consider rules δ with R_δ for β_1 = ... = β_K equal to the value of the significance level of the hypothesis H: β_1 = ... = β_K. As a consequence, the results obtained by the rule cannot contradict those obtained for the above hypothesis. The weak point of the presented approach is that there is no possibility of obtaining a uniformly best procedure. On the other hand, we avoid the obvious disadvantage that the constructed procedures would be too conservative (i.e., would give too large homogeneous groups).

5 Experiment

The probability of the correct decision was estimated on the basis of a simulation experiment. In the experiment we chose K = 5 regression functions, and each of them was observed 20 times. Random errors were normal with σ = 1. The parameters α_1, ..., α_5 were zero. The regression functions were considered on the interval [−1, 1]. For five regression functions there are 52 states of nature, but it is enough to consider seven of them. The considered states are given below. The notation {β_1 = β_2 = β_3, β_4, β_5} means the following subset:

{(β_1, β_2, β_3, β_4, β_5) : β_1 = β_2 = β_3, β_4 ≠ β_1, β_5 ≠ β_1, β_4 ≠ β_5} ⊂ R⁵.
Number of groups   State of nature                    Notation
1                  {β_1 = β_2 = β_3 = β_4 = β_5}      {5}
2                  {β_1 = β_2 = β_3, β_4 = β_5}       {2, 3}
                   {β_1 = β_2 = β_3 = β_4, β_5}       {1, 4}
3                  {β_1 = β_2 = β_3, β_4, β_5}        {1, 1, 3}
                   {β_1 = β_2, β_3 = β_4, β_5}        {1, 2, 2}
4                  {β_1 = β_2, β_3, β_4, β_5}         {1, 1, 1, 2}
5                  {β_1, β_2, β_3, β_4, β_5}          {1, 1, 1, 1, 1}

For each state of nature, regression coefficients were generated from the interval (0.5, 1.5) according to the uniform distribution. For example, for the state {2, 3} two numbers x, y were generated from the U(0.5, 1.5) distribution and it was set β_1 = β_2 = β_3 = x and β_4 = β_5 = y. Such a generation was repeated one hundred times. For each generated set of regression coefficients (β_1, ..., β_5), samples (x_{ki}, Y_{ki}), k = 1, ..., 5, i = 1, ..., 20, were repeatedly drawn such that Y_{ki} = β_k x_{ki} + ε_{ki}. To each sample the described procedure was applied, and it was noted whether the obtained division of the regression coefficients is consistent with the state of nature. The probability of the correct decision was estimated by the fraction of divisions consistent with the state of nature.

Obviously, the probability of the correct decision depends on the plan of the experiment, i.e., on the choice of the values of the regressors. Three plans were considered. In the first case (the random plan), twenty values of x were chosen randomly from the interval [−1, 1] (according to the uniform distribution) for every regression function separately. The second plan was a naive one: the values of the regressor were −1 + 2i/19 for i = 0, 1, ..., 19. The third plan was the G-optimal plan, in which x = −1 or x = 1 and ten observations were taken at each x. The second plan, as well as the third one, was common to all regression functions.

6 Results

The results are presented graphically. On the y-axis there is the estimated probability (multiplied by 100) of the correct decision, while on the x-axis there is the minimal distance between the groups. The solid line represents the probability of the correct decision for the G-optimal
plan, the dashed line for the naive plan, and the dotted line for the random plan. On the basis of the simulations we may formulate the following conclusions.
1. The proposed procedure detects the division of the regression coefficients corresponding to the true state of nature more precisely when the number of groups of coefficients is small. When the number of groups of coefficients increases, the probability of detecting the differences between them decreases.

2. For every division of the regression coefficients the best plan of the experiment was the G-optimal plan. The probability of making the correct decision is very high even for small differences between the groups of regression coefficients.

The above conclusions were obtained for five regression functions, but it may be expected that similar conclusions can be formulated for a larger number of functions.

7 Literature

Gather U., Pawlitschko J., Pigeot I. (1996): Unbiasedness of multiple tests, Scandinavian Journal of Statistics, 23.
Hochberg Y., Tamhane A.C. (1988): Multiple Comparison Procedures, John Wiley & Sons.
Miller R.G. Jr. (1982): Simultaneous Statistical Inference, 2nd ed., Springer-Verlag.
Zieliński W. (1992): Monte Carlo comparison of multiple comparison procedures, Biometrical Journal, 34.

Porównania wielokrotne współczynników kierunkowych prostych regresji

Summary. The paper considers the problem of comparing K regression lines. A procedure for finding groups of parallel lines is proposed.

Key words: multiple comparisons, simultaneous inference, regression analysis
AMS subject classification: 62J15, 62J10
8 Figures

[Figures: estimated probability of the correct decision (multiplied by 100) versus the minimal distance between groups, for the states {1, 4} and {2, 3}; {1, 1, 3} and {1, 2, 2}; {1, 1, 1, 2} and {1, 1, 1, 1, 1}.]
UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Basic Exam - Applied Statistics Tuesday, January 17, 2017 Work all problems 60 points are needed to pass at the Masters Level and 75
More informationLeast Absolute Value vs. Least Squares Estimation and Inference Procedures in Regression Models with Asymmetric Error Distributions
Journal of Modern Applied Statistical Methods Volume 8 Issue 1 Article 13 5-1-2009 Least Absolute Value vs. Least Squares Estimation and Inference Procedures in Regression Models with Asymmetric Error
More informationMultiple Linear Regression
Multiple Linear Regression ST 430/514 Recall: a regression model describes how a dependent variable (or response) Y is affected, on average, by one or more independent variables (or factors, or covariates).
More information1 One-way Analysis of Variance
1 One-way Analysis of Variance Suppose that a random sample of q individuals receives treatment T i, i = 1,,... p. Let Y ij be the response from the jth individual to be treated with the ith treatment
More informationIntroductory Econometrics. Review of statistics (Part II: Inference)
Introductory Econometrics Review of statistics (Part II: Inference) Jun Ma School of Economics Renmin University of China October 1, 2018 1/16 Null and alternative hypotheses Usually, we have two competing
More informationInferences for Regression
Inferences for Regression An Example: Body Fat and Waist Size Looking at the relationship between % body fat and waist size (in inches). Here is a scatterplot of our data set: Remembering Regression In
More informationEC2001 Econometrics 1 Dr. Jose Olmo Room D309
EC2001 Econometrics 1 Dr. Jose Olmo Room D309 J.Olmo@City.ac.uk 1 Revision of Statistical Inference 1.1 Sample, observations, population A sample is a number of observations drawn from a population. Population:
More informationDirection: This test is worth 250 points and each problem worth points. DO ANY SIX
Term Test 3 December 5, 2003 Name Math 52 Student Number Direction: This test is worth 250 points and each problem worth 4 points DO ANY SIX PROBLEMS You are required to complete this test within 50 minutes
More informationAnalysis of the AIC Statistic for Optimal Detection of Small Changes in Dynamic Systems
Analysis of the AIC Statistic for Optimal Detection of Small Changes in Dynamic Systems Jeremy S. Conner and Dale E. Seborg Department of Chemical Engineering University of California, Santa Barbara, CA
More informationPsychology 282 Lecture #4 Outline Inferences in SLR
Psychology 282 Lecture #4 Outline Inferences in SLR Assumptions To this point we have not had to make any distributional assumptions. Principle of least squares requires no assumptions. Can use correlations
More informationParameter Estimation, Sampling Distributions & Hypothesis Testing
Parameter Estimation, Sampling Distributions & Hypothesis Testing Parameter Estimation & Hypothesis Testing In doing research, we are usually interested in some feature of a population distribution (which
More informationEvaluation requires to define performance measures to be optimized
Evaluation Basic concepts Evaluation requires to define performance measures to be optimized Performance of learning algorithms cannot be evaluated on entire domain (generalization error) approximation
More informationWeighted Least Squares
Weighted Least Squares The standard linear model assumes that Var(ε i ) = σ 2 for i = 1,..., n. As we have seen, however, there are instances where Var(Y X = x i ) = Var(ε i ) = σ2 w i. Here w 1,..., w
More informationRegression Analysis for Data Containing Outliers and High Leverage Points
Alabama Journal of Mathematics 39 (2015) ISSN 2373-0404 Regression Analysis for Data Containing Outliers and High Leverage Points Asim Kumer Dey Department of Mathematics Lamar University Md. Amir Hossain
More informationLec 1: An Introduction to ANOVA
Ying Li Stockholm University October 31, 2011 Three end-aisle displays Which is the best? Design of the Experiment Identify the stores of the similar size and type. The displays are randomly assigned to
More informationGeneral Linear Model: Statistical Inference
Chapter 6 General Linear Model: Statistical Inference 6.1 Introduction So far we have discussed formulation of linear models (Chapter 1), estimability of parameters in a linear model (Chapter 4), least
More informationProblems. Suppose both models are fitted to the same data. Show that SS Res, A SS Res, B
Simple Linear Regression 35 Problems 1 Consider a set of data (x i, y i ), i =1, 2,,n, and the following two regression models: y i = β 0 + β 1 x i + ε, (i =1, 2,,n), Model A y i = γ 0 + γ 1 x i + γ 2
More informationModified Simes Critical Values Under Positive Dependence
Modified Simes Critical Values Under Positive Dependence Gengqian Cai, Sanat K. Sarkar Clinical Pharmacology Statistics & Programming, BDS, GlaxoSmithKline Statistics Department, Temple University, Philadelphia
More informationEconometrics of Panel Data
Econometrics of Panel Data Jakub Mućk Meeting # 2 Jakub Mućk Econometrics of Panel Data Meeting # 2 1 / 26 Outline 1 Fixed effects model The Least Squares Dummy Variable Estimator The Fixed Effect (Within
More informationMaster s Written Examination
Master s Written Examination Option: Statistics and Probability Spring 05 Full points may be obtained for correct answers to eight questions Each numbered question (which may have several parts) is worth
More informationCorrelation Analysis
Simple Regression Correlation Analysis Correlation analysis is used to measure strength of the association (linear relationship) between two variables Correlation is only concerned with strength of the
More informationMATH 644: Regression Analysis Methods
MATH 644: Regression Analysis Methods FINAL EXAM Fall, 2012 INSTRUCTIONS TO STUDENTS: 1. This test contains SIX questions. It comprises ELEVEN printed pages. 2. Answer ALL questions for a total of 100
More informationSimple linear regression
Simple linear regression Biometry 755 Spring 2008 Simple linear regression p. 1/40 Overview of regression analysis Evaluate relationship between one or more independent variables (X 1,...,X k ) and a single
More information