ECONOMETRIC INSTITUTE

THE EXACT MSE EFFICIENCY OF THE GENERAL RIDGE ESTIMATOR RELATIVE TO OLS

R. TEEKENS and P.M.C. DE BOER

REPORT 770/ES

ERASMUS UNIVERSITY ROTTERDAM, P.O. BOX 1738, ROTTERDAM, THE NETHERLANDS
Report 770/ES Corrigendum

page 3: the formula in the sixth line from the bottom should read as

    c = X*'y = Λα + X*'ε

page 5: equation (1.4) should read as

    X*'X* = P'X'XP = Λ

and in the line preceding eq. (1.6) "education" should be "equation".

page 8: the second part of eq. (2.5) should read as

    α̃*_i = ½ α̂_i (1 + √(1 − 4σ̂_i²/α̂_i²))    for α̂_i² ≥ 4σ̂_i²

page 9: the 9th line from the bottom should read as: "is normally distributed and m σ̂²/σ² has a χ²-distribution with m = n − p degrees of"

page B.1: eq. (B.1) should read as

    f(x,z) = C z^(m/2 − 1) exp{−½[z + ((x − θ₁)/θ₂)²]}    for z > 0, −∞ < x < ∞
    f(x,z) = 0                                             elsewhere

with C = [θ₂ √(2π) 2^(m/2) Γ(m/2)]⁻¹

page B.2: the sixth line from the top should read as: C' = [√(2π) 2^(m/2) Γ(m/2)]⁻¹

page B.3: in the 4th line "h = (w,y;θ)" should be "h(w,y;θ)"

page C.2: the remark should be: "φ₁(θ) = 1 for θ = 1.59668"
The Exact MSE-Efficiency of the General Ridge Estimator Relative to OLS

by Rudolf Teekens and Paul de Boer*

ABSTRACT

In this paper it is pointed out that the merits of the ridge procedure as proposed by Hoerl and Kennard tend to be overvalued due to an incorrect analysis of the associated mean square error. For the case of the so-called general ridge estimator it is then shown how the exact MSE can be derived, and finally it is seen that the general ridge estimator dominates the OLS estimator only in a limited interval of the parameter space.

Contents

1. Introduction
2. The explicit general ridge estimator
3. The exact mean square error efficiency
4. Concluding remarks
References
Appendices

* The authors are indebted to Mr A.S. Louter for performing the calculations for appendix C.
I. Introduction

Since Hoerl and Kennard first published their so-called Ridge Regression method in 1970, a considerable amount of research has been devoted to this subject. Some econometricians as well seem to be taken with the ridge method, witness the publications of Vinod (1976a, 1976b) and Moulaert (1976). In our opinion, however, the Ridge Regression Method a) is based on a dubious method, which consists of optimizing an unknown loss function, and b) dominates the OLS estimator in MSE only in a limited range of parameter values. The subsequent analysis is more or less analogous to the one carried out by Feldstein (1973), who studied the mean square error efficiency of COV (conditional omitted variable) and WTD (weighted average) estimators relative to the OLS estimator.

As Hoerl and Kennard (1970a) we consider the standard linear model

(1.1)    y = Xβ + ε

where
y is an observable random vector of n elements,
X is an observable fixed matrix of order n×p with rank p,
β is a vector of p unknown parameters,
ε is a non-observable random vector of n elements which has a multivariate normal distribution: ε ~ N(0, σ²I).

We also follow Hoerl and Kennard in reducing the above model to a canonical form in which the X'X matrix is diagonal. This may be achieved by applying the following orthogonal transformation to X and β. Let

(1.2)    α = P'β

and

(1.3)    X* = XP
where the columns of P are the eigenvectors of X'X and P'P = I, so that

(1.4)    X*'X* = P'X'XP = Λ

with Λ the diagonal matrix of (positive) eigenvalues of X'X. Then model (1.1) may be rewritten as

(1.5)    y = X*α + ε

where it should be noted that α'α = β'β. The OLS estimator of α is α̂ = (X*'X*)⁻¹X*'y. If we define the vector c as

(1.6)    c = X*'y = P'X'y

α̂ may be written as

(1.7)    α̂ = Λ⁻¹c

Moreover, it is easily verified that

(1.8)    β̂ = Pα̂

The general ridge procedure is defined from

(1.9)    α* = [X*'X* + K]⁻¹X*'y = [Λ + K]⁻¹c

with K a diagonal matrix with non-negative elements.
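In the canonical form both estimators are componentwise, since X*'X* = Λ is diagonal. A minimal numerical sketch of (1.7) and (1.9); the variable names (lam, c, k) and all numbers below are ours, invented for illustration:

```python
# Hedged sketch: componentwise OLS and general ridge estimates in the
# canonical form, where X*'X* = Lambda is diagonal. Illustrative values.

lam = [4.0, 1.0, 0.25]   # eigenvalues lambda_i of X'X
c   = [8.0, 3.0, 0.5]    # c = X*'y
k   = [0.5, 0.5, 0.5]    # ridge constants k_i (diagonal of K)

# (1.7): OLS estimator alpha_hat = Lambda^{-1} c
alpha_hat = [ci / li for ci, li in zip(c, lam)]

# (1.9): general ridge estimator alpha* = (Lambda + K)^{-1} c
alpha_star = [ci / (li + ki) for ci, li, ki in zip(c, lam, k)]
```

With positive k_i every ridge coefficient is shrunk toward zero relative to its OLS counterpart, which is the source of the bias term appearing in (1.12) below.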
The corresponding ridge estimator in model (1.1) is then defined as

(1.10)    β* = Pα*

It should be noted here that β* equals [X'X + K]⁻¹X'y only if K = kI. If K is not a scalar matrix it follows from (1.10) that β* = [X'X + PKP']⁻¹X'y.

In order to determine the matrix K, we minimize the mean square error (MSE) of α* relative to α; this function will be denoted as

(1.11)    π(α*, α) = E[(α* − α)'(α* − α)] = Σ_{i=1}^{p} E(α*_i − α_i)²

Writing (1.9) in scalar terms, we obtain

    α*_i = c_i / (λ_i + k_i),    i = 1, ..., p

Realizing furthermore that c = X*'y = Λα + X*'ε has a multivariate normal distribution, c ~ N(Λα, σ²Λ), or c_i ~ N(λ_i α_i, λ_i σ²), we can write (1.11) as
(1.12)    π(α*, α) = Σ_{i=1}^{p} E[c_i/(λ_i + k_i) − α_i]²
                   = Σ_{i=1}^{p} (σ²λ_i + α_i²k_i²) / (λ_i + k_i)²

Minimizing π(α*, α) with respect to the k_i's yields the following optimal values of k_i:

(1.13)    k_i = σ²/α_i²,    i = 1, ..., p

Inspection of the second order conditions shows that (1.13) indeed constitutes a minimum for (1.12). Obviously this solution for k_i, i = 1, ..., p is useless for estimation purposes, since k_i depends on the unknown parameters α_i, i = 1, ..., p and σ². Therefore Hoerl and Kennard propose the following method for approximating the theoretical optimal values of k_i:

(i) determine the OLS estimates α̂_i and σ̂²;
(ii) determine k_i(0) from k_i(0) = σ̂²/α̂_i²;
(iii) continue the process as follows

    α̃*_i(t) = c_i / (λ_i + k_i(t−1)),    k_i(t) = σ̂²/α̃*_i(t)²,    t = 1, 2, ...

until stability is achieved in k_i.
In the remainder of this paper we will limit ourselves to this "general form of the ridge regression", i.e. K a non-scalar diagonal matrix. But we will conclude this introduction with a number of remarks about the case of a scalar K-matrix, to which Hoerl and Kennard devote the main part of their paper.

First, it is observed that in case of a unique k the MSE-function of α* becomes

(1.14)    π(α*, α) = Σ_{i=1}^{p} (σ²λ_i + α_i²k²) / (λ_i + k)²

so that in this case the first order condition for a minimum of π(α*, α) does not yield an explicit value of k. This first order condition reads as:

(1.15)    Σ_{i=1}^{p} λ_i(α_i²k − σ²) / (λ_i + k)³ = 0

Unlike the case of different k_i, i = 1, ..., p, Hoerl and Kennard do not consider the possibility of applying an iterative method for approximating the theoretical optimal value of k. Such a method could consist of substituting for the unknown parameter α_i the value of its ridge estimate into (1.15),

    α̃*_i = c_i / (λ_i + k)

and then solving by a numerical method the resulting equation

(1.16)    Σ_{i=1}^{p} λ_i [k² + (2λ_i − c_i²/σ̂²)k + λ_i²] / (k + λ_i)⁵ = 0

The alternative method as presented by Hoerl and Kennard, viz. the use of the so-called ridge trace, has already been criticized by Conniffe
and Stone (1973) and Farebrother (1975). Their stability criterion leads them to choose a too high value of k. As indicated by Conniffe and Stone (1973), Conniffe, Stone and O'Neill (1976) and Newhouse and Oman (1971), the proof given by Hoerl and Kennard, that for some fixed k α* has lower MSE than the OLS estimator of α, is inapplicable for the practical case where k is a random variable.

However, for a unique k it seems impossible to determine analytically the true MSE of α*, since k cannot be determined explicitly from (1.16). Fortunately, the case of general ridge regression as presented earlier in the introduction lends itself much better to an analysis of the MSE of α*, and this will be the subject of the rest of this paper.

II. The explicit general ridge estimator

The iterative method as proposed by Hoerl and Kennard to solve for α̃*_i, which has been described in the previous section, makes things unnecessarily complicated and unclear. This method provides a solution of the following two equations

(2.1)    α̃*_i = c_i / (λ_i + k̃_i)

(2.2)    k̃_i = σ̂² / α̃*_i²

But these equations may easily be solved analytically; substitution of (2.2) into (2.1) yields the following quadratic equation in α̃*_i:

(2.3)    λ_i α̃*_i² − c_i α̃*_i + σ̂² = 0

and the roots of α̃*_i are 1)

(2.4)    α̃*_i = (c_i ± √(c_i² − 4λ_i σ̂²)) / (2λ_i)

provided that c_i² ≥ 4λ_i σ̂². The corresponding roots of k̃_i follow immediately from substitution of (2.4) into (2.2).

1) Hemmerle (1975) found independently the same solution.
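The equivalence between the iteration (i)-(iii) and the closed-form roots (2.4) can be checked numerically. A hedged sketch for a single coordinate (the values of λ_i, c_i and σ̂² are ours, chosen so that c_i² > 4λ_i σ̂²); it runs the Hoerl-Kennard iteration and compares the limit with the "+" root of (2.3):

```python
import math

# Illustrative values for one coordinate; c_i^2 > 4*lam_i*sigma2, so the
# general ridge estimator exists.
lam_i, c_i, sigma2 = 2.0, 6.0, 0.5

# Hoerl-Kennard iteration: start from OLS (k_i = 0) and alternate
# (2.2) k = sigma^2/alpha^2 with (2.1) alpha = c/(lam + k).
alpha = c_i / lam_i
for _ in range(200):
    k = sigma2 / alpha ** 2
    alpha_new = c_i / (lam_i + k)
    if abs(alpha_new - alpha) < 1e-13:
        break
    alpha = alpha_new

# Closed form (2.4): the '+' root corresponds to the smallest k_i.
disc = c_i ** 2 - 4.0 * lam_i * sigma2
root_plus = (c_i + math.sqrt(disc)) / (2.0 * lam_i)
```

The iteration converges to the "+" root, which is the root with the smallest k̃_i singled out in remark (iii) below.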
Given this analytical solution a number of remarks can be made:

(i) The general ridge estimator does not always exist; it is defined only if c_i² ≥ 4λ_i σ̂².

(ii) The general ridge estimator is not unique, since for each i there exist two pairs (α̃*_i, k̃_i) which satisfy (2.1) and (2.2) simultaneously, provided that c_i² > 4λ_i σ̂².

(iii) The iteration procedure as proposed by Hoerl and Kennard selects by definition the set of roots (α̃*_i, k̃_i) with the smallest k̃_i (or the highest |α̃*_i|). This may easily be seen from figure 1, in which the two functions α = c_i/(λ_i + k_i) and α = σ̂/√k_i are plotted for the case where c_i² > 4λ_i σ̂². From the graph it is clear that, provided the two curves intersect (i.e. provided that c_i² > 4λ_i σ̂²), starting the iteration with k_i = 0 (i.e. choosing α_i(0) = α̂_i) one always moves to the point of intersection with the smallest k_i.

In the sequel we shall concentrate on the general ridge estimator as proposed by Hoerl and Kennard, i.e. the α̃*_i corresponding to the smallest k̃_i. From
the previous paragraphs it is clear that this estimator is not defined if c_i² < 4λ_i σ̂². Since it is our intention to compare the MSE-performance of α̃*_i with that of the OLS estimator of α_i, it seems fair to complete the definition of α̃*_i by defining it for c_i² < 4λ_i σ̂² to be identical to the OLS estimator of α_i. The general ridge estimator is then defined for the entire sample space. If, moreover, we realize that the OLS estimator α̂_i is defined as c_i/λ_i, we may write α̃*_i as

(2.5)    α̃*_i = α̂_i                                   for α̂_i² < 4σ̂_i²
         α̃*_i = ½ α̂_i (1 + √(1 − 4σ̂_i²/α̂_i²))        for α̂_i² ≥ 4σ̂_i²

where σ̂_i² = σ̂²/λ_i is the estimated variance of α̂_i.

III. The exact mean square error efficiency

We define the MSE-efficiency of α̃*_i with respect to α̂_i (the OLS estimator of α_i) as the ratio of the Mean Square Errors of the two estimators

(3.1)    π(α̃*_i; α_i) / π(α̂_i; α_i)

First we consider the simple case, where the variance of the system, σ², is known. In that case the general ridge estimator α̃*_i, as defined in (2.5), becomes

(3.2)    α̃*_i = α̂_i                                   for |α̂_i| < 2σ_i
         α̃*_i = ½ α̂_i (1 + √(1 − 4σ_i²/α̂_i²))        for |α̂_i| > 2σ_i

Since α̂_i is normally distributed with mean α_i and variance σ_i² = σ²/λ_i
we may apply the result of appendix A and conclude that

    π(α̃*_i; α_i) / π(α̂_i; α_i) = φ₁(α_i/σ_i)

where φ₁(.) is defined in appendix A and tabulated in appendix C. In figure 2 we have given φ₁(.) for positive arguments only, since φ₁(.) is symmetric about the origin. From figure 2 it can be seen that the general ridge estimator of α_i dominates 1) the OLS estimator for |α_i| < 1.59668 σ_i, provided that the variance of the system is known.

Next, we consider the realistic case where the variance σ² is unknown. In that case the general ridge estimator as defined in (2.5) applies. Since α̂_i is normally distributed and m σ̂²/σ² has a χ²-distribution with m = n − p degrees of freedom, we may apply the result of appendix B and conclude that the MSE-efficiency of α̃*_i with respect to α̂_i equals

    π(α̃*_i; α_i) / π(α̂_i; α_i) = φ₂(α_i/σ_i, m)

where φ₂(.,.) is defined in appendix B and tabulated in appendix C. From this table it can be seen that φ₂(.,.) is very close to φ₁(.) for different values of m. Therefore we may consider φ₁(.) as a good approximation to the mean square error efficiency of the general ridge estimator for unknown variance.

1) in the sense of having a lower mean square error
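The estimator (2.5) is a direct transcription into code; the sketch below is ours (the function and argument names are invented), with s2 standing for σ̂_i². With s2 replaced by the known variance σ_i² the same function is exactly (3.2):

```python
import math

# Sketch of the completed general ridge estimator (2.5): fall back to
# OLS when a_hat^2 < 4*s2, where s2 = sigma_hat^2/lam_i is the estimated
# variance of the OLS coefficient; otherwise take the '+' root of (2.4).

def general_ridge(a_hat, s2):
    if a_hat ** 2 < 4.0 * s2:
        return a_hat                        # OLS branch of (2.5)
    return 0.5 * a_hat * (1.0 + math.sqrt(1.0 - 4.0 * s2 / a_hat ** 2))
```

At the boundary α̂_i² = 4σ̂_i² the square root vanishes and the estimator jumps from α̂_i to ½α̂_i, so the estimator is discontinuous there, which is what makes its exact MSE non-trivial to derive.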
[Figure 2: the function φ₁ plotted for positive arguments]
IV. Concluding remarks

(i) Contrary to the Stein-rule estimation procedures (see Baranchik (1973), James and Stein (1961) and Stein (1960)), the Hoerl and Kennard procedure is based on the minimization of an objective function with unknown parameters. Therefore, any resulting optimal value of k depends on the unknown parameters and should be redefined in terms of estimated parameters in order to obtain an estimator. But then the original MSE properties of α* are no longer valid and the redefined α* should be reconsidered with respect to its MSE.

(ii) The exact MSE has been calculated for the case of the so-called general ridge estimator, with a non-scalar diagonal matrix K. Here the function φ₁(.) as tabulated in appendix C and pictured in figure 2 turns out to be a good approximation to the mean square error of α̃*_i. This MSE is a function of α_i√λ_i/σ only.

(iii) The general ridge estimator α̃*_i dominates the ordinary least squares estimator α̂_i only if |α_i|√λ_i/σ < 1.59668, a condition which can be made the subject of a pre-test.
References

Baranchik, A.J. (1973). Inadmissibility of Maximum Likelihood Estimators in Some Multiple Regression Problems with Three or More Independent Variables. The Annals of Statistics, Vol. 1, No. 2.

Conniffe, D. and J. Stone (1973). A Critical View of Ridge Regression. The Statistician, 22.

Conniffe, D. and J. Stone (1975). A Reply to Smith and Goldstein. The Statistician, 24, No. 1, p. 67-68.

Conniffe, D., Stone, J. and F. O'Neill (1976). Is Ridge Regression a Useful Technique in Economic Analysis. Unpublished Research Memorandum.

Farebrother, R.W. (1975). The Minimum Mean Square Error Linear Estimator and Ridge Regression. Technometrics, Vol. 17, No. 1.

Feldstein, M.S. (1973). Multicollinearity and the Mean Square Error of Alternative Estimators. Econometrica, Vol. 41, No. 2.

Hemmerle, W.J. (1975). An Explicit Solution for Generalized Ridge Regression. Technometrics, 17.

Hoerl, A.E. and R.W. Kennard (1970a). Ridge Regression: Biased Estimation for Non-orthogonal Problems. Technometrics, Vol. 12, No. 1, p. 55-67.

Hoerl, A.E. and R.W. Kennard (1970b). Ridge Regression: Applications to Non-orthogonal Problems. Technometrics, Vol. 12, No. 1, p. 69-82.

James, W. and C. Stein (1961). Estimation with Quadratic Loss. Proc. 4th Berkeley Symposium, p. 361-379.

Moulaert, F. (1976). Ridge Regression: a Geometrical Revisitation. Regional Science Research Paper no. 5, Centrum voor Economische Studien van de Katholieke Universiteit te Leuven.
Newhouse, J.P. and S.D. Oman (1971). The Evaluation of Ridge Estimators. Report R-716-PR prepared for project Rand.

Smith, A.F.M. and M. Goldstein (1975). Ridge Regression: Some Comments on a Paper of Stone and Conniffe.

Stein, C.M. (1960). Multiple Regression. Chapter 37 in Essays in Honour of Harold Hotelling. Stanford Univ. Press.

Vinod, H.D. (1976a). Canonical Ridge and Econometrics of Joint Production. Journal of Econometrics, 4.

Vinod, H.D. (1976b). Application of New Ridge Regression Methods to a Study of Bell System Scale Economies. JASA, Vol. 71, No. 356.
Appendix A

Consider the random variable X, which is normally distributed with mean θ₁ and variance θ₂², and the random variable Y which is defined as

(A.1)    Y = X                                   for |X| < 2θ₂
         Y = ½X(1 + √(1 − 4θ₂²/X²))              for |X| > 2θ₂

The mean square error of Y with respect to θ₁ is defined as

(A.2)    π(Y; θ₁) = E[Y − θ₁]²

and the mean square error of X with respect to θ₁ is

(A.3)    π(X; θ₁) = E[X − θ₁]²

We shall now derive an integral expression for the ratio π(Y; θ₁)/π(X; θ₁) and show that this ratio is a function of θ₁/θ₂ only. From (A.2) and (A.3) it follows that

(A.4)    π(X; θ₁) = θ₂²

and since X is normally distributed we may write for the ratio

(A.5)    π(Y; θ₁)/π(X; θ₁) = (1/(θ₂³√(2π))) [ ∫_{|x|<2θ₂} (x − θ₁)² exp{−(x − θ₁)²/(2θ₂²)} dx
         + ∫_{|x|>2θ₂} (½x(1 + √(1 − 4θ₂²/x²)) − θ₁)² exp{−(x − θ₁)²/(2θ₂²)} dx ]
This expression can be rewritten if we apply the transformation z = x/θ₂; we then have

(A.6)    φ₁(θ) = (1/√(2π)) [ ∫_{|z|<2} (z − θ)² exp{−½(z − θ)²} dz
         + ∫_{|z|>2} (½z(1 + √(1 − 4/z²)) − θ)² exp{−½(z − θ)²} dz ]

with θ = θ₁/θ₂.

It can easily be seen that the function φ₁(θ) is symmetric about the origin; therefore we confine ourselves to θ ≥ 0 for its numerical evaluation. The function φ₁(θ) has been tabulated in appendix C.
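The integral (A.6) is straightforward to evaluate numerically. The following sketch uses a simple midpoint rule (our own illustration, not the authors' computation for appendix C) and reproduces the qualitative behaviour of φ₁: below one near the origin, above one for large |θ|:

```python
import math

# Midpoint-rule evaluation of phi_1(theta) from (A.6): the MSE of the
# shrunken variable Y relative to that of X ~ N(theta, 1), in the
# standardized units z = x/theta_2, theta = theta_1/theta_2.

def phi1(theta, n=20000, lo=-12.0, hi=12.0):
    h = (hi - lo) / n
    total = 0.0
    for j in range(n):
        z = lo + (j + 0.5) * h
        if abs(z) < 2.0:
            y = z                                   # first branch of (A.1)
        else:
            y = 0.5 * z * (1.0 + math.sqrt(1.0 - 4.0 / z ** 2))
        density = math.exp(-0.5 * (z - theta) ** 2) / math.sqrt(2.0 * math.pi)
        total += (y - theta) ** 2 * density * h
    return total    # pi(X; theta) = 1 in these units, so this is the ratio
```

The symmetry of φ₁ about the origin also follows directly from the construction: replacing z by −z maps Y to −Y.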
Appendix B

Consider the independently distributed random variables X and Z which have as marginal distributions a normal distribution with mean θ₁ and variance θ₂², and a chi-square distribution with m degrees of freedom, respectively. Then the joint density of X and Z is

(B.1)    f(x,z) = C z^(m/2 − 1) exp{−½[z + ((x − θ₁)/θ₂)²]}    for −∞ < x < ∞, z > 0
         f(x,z) = 0                                             elsewhere

with

    C = [θ₂ √(2π) 2^(m/2) Γ(m/2)]⁻¹

In this appendix we shall investigate the mean square error with respect to θ₁ of the following function of X and Z:

(B.2)    Y = X                                        for X² < 4θ₂² Z/m
         Y = ½X(1 + √(1 − 4θ₂² Z/(m X²)))             for X² > 4θ₂² Z/m

and we shall consider again the mean square error of Y as a proportion of the mean square error of X; we are thus looking for

(B.3)    π(Y; θ₁)/π(X; θ₁) = E[Y − θ₁]²/θ₂²
         = (C/θ₂²) [ ∫∫_{x²<4θ₂²z/m} (x − θ₁)² z^(m/2 − 1) exp{−½[z + ((x − θ₁)/θ₂)²]} dz dx
         + ∫∫_{x²>4θ₂²z/m} (½x(1 + √(1 − 4θ₂²z/(m x²))) − θ₁)² z^(m/2 − 1) exp{−½[z + ((x − θ₁)/θ₂)²]} dz dx ]
After the transformation w = x/θ₂ the above expression can be rewritten in terms of θ = θ₁/θ₂ and the constant

    C' = [√(2π) 2^(m/2) Γ(m/2)]⁻¹

It may readily be seen that the third integral in the resulting expression equals (C')⁻¹ and that the first two integrals may be taken together. If, moreover, we apply a further transformation of the inner variable, we obtain the double-integral representation (B.4) of π(Y; θ₁)/π(X; θ₁), with an integrand involving a function h(w, y; θ).

Hence π(Y; θ₁)/π(X; θ₁) can be written as a function of θ = θ₁/θ₂ and m alone. This function, labelled φ₂(θ, m), has been tabulated in appendix C for positive values of θ and even values of m. 1)

1) It should be noted that the integral expression (B.4) is relatively easy to calculate for even m and, since the function φ₂(θ, m) is not sensitive to m, it did not seem worthwhile to devote more time and effort to the calculation of φ₂(θ, m) for odd values of m.
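The closeness of φ₂(θ, m) to φ₁(θ) can also be checked by simulation. A rough Monte Carlo sketch (our own check, not the authors' tabulation): draw X ~ N(θ, 1) and Z ~ χ²_m independently, apply (B.2) with θ₂ = 1, and estimate the MSE ratio:

```python
import math
import random

# Monte Carlo estimate of phi_2(theta, m) = pi(Y)/pi(X) with theta_2 = 1:
# X ~ N(theta, 1) and Z ~ chi-square(m) independent, Y the shrunken
# variable of (B.2). Illustrative check only.

def phi2_mc(theta, m, n=20000, seed=7):
    rng = random.Random(seed)
    mse_y = mse_x = 0.0
    for _ in range(n):
        x = rng.gauss(theta, 1.0)
        z = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(m))  # chi-square(m)
        s2 = z / m            # plays the role of the estimated variance
        if x ** 2 < 4.0 * s2:
            y = x
        else:
            y = 0.5 * x * (1.0 + math.sqrt(1.0 - 4.0 * s2 / x ** 2))
        mse_y += (y - theta) ** 2
        mse_x += (x - theta) ** 2
    return mse_y / mse_x
```

For θ = 0 the shrinkage can only reduce each squared error, so the ratio comes out below one for any m, in line with the behaviour of φ₁ near the origin.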
Appendix C

[Table: the function φ₁(θ) and the function φ₂(θ, m) for m = 2, 6, 10, 14, 18, 22, 26, 30, 40 and 50, tabulated for θ ≥ 0; the numerical entries are illegible in this copy.]

Remark: φ₁(θ) = 1 for θ = 1.59668
REPORTS

List of Reprints, nos ; List of Reports

/M "Triangular - Square - Pentagonal Numbers", by R.J. Stroeker.
770/ES "The Exact MSE-Efficiency of the General Ridge Estimator Relative to OLS", by R. Teekens and P.M.C. de Boer.
More informationSTAT 100C: Linear models
STAT 100C: Linear models Arash A. Amini June 9, 2018 1 / 56 Table of Contents Multiple linear regression Linear model setup Estimation of β Geometric interpretation Estimation of σ 2 Hat matrix Gram matrix
More information1 Data Arrays and Decompositions
1 Data Arrays and Decompositions 1.1 Variance Matrices and Eigenstructure Consider a p p positive definite and symmetric matrix V - a model parameter or a sample variance matrix. The eigenstructure is
More informationTightening Durbin-Watson Bounds
The Economic and Social Review, Vol. 28, No. 4, October, 1997, pp. 351-356 Tightening Durbin-Watson Bounds DENIS CONNIFFE* The Economic and Social Research Institute Abstract: The null distribution of
More informationLecture Note 1: Probability Theory and Statistics
Univ. of Michigan - NAME 568/EECS 568/ROB 530 Winter 2018 Lecture Note 1: Probability Theory and Statistics Lecturer: Maani Ghaffari Jadidi Date: April 6, 2018 For this and all future notes, if you would
More informationImproved Liu Estimators for the Poisson Regression Model
www.ccsenet.org/isp International Journal of Statistics and Probability Vol., No. ; May 202 Improved Liu Estimators for the Poisson Regression Model Kristofer Mansson B. M. Golam Kibria Corresponding author
More information2 Eigenvectors and Eigenvalues in abstract spaces.
MA322 Sathaye Notes on Eigenvalues Spring 27 Introduction In these notes, we start with the definition of eigenvectors in abstract vector spaces and follow with the more common definition of eigenvectors
More informationAppendix A: Matrices
Appendix A: Matrices A matrix is a rectangular array of numbers Such arrays have rows and columns The numbers of rows and columns are referred to as the dimensions of a matrix A matrix with, say, 5 rows
More informationOur point of departure, as in Chapter 2, will once more be the outcome equation:
Chapter 4 Instrumental variables I 4.1 Selection on unobservables Our point of departure, as in Chapter 2, will once more be the outcome equation: Y Dβ + Xα + U, 4.1 where treatment intensity will once
More informationXβ is a linear combination of the columns of X: Copyright c 2010 Dan Nettleton (Iowa State University) Statistics / 25 X =
The Gauss-Markov Linear Model y Xβ + ɛ y is an n random vector of responses X is an n p matrix of constants with columns corresponding to explanatory variables X is sometimes referred to as the design
More informationEconomics 573 Problem Set 5 Fall 2002 Due: 4 October b. The sample mean converges in probability to the population mean.
Economics 573 Problem Set 5 Fall 00 Due: 4 October 00 1. In random sampling from any population with E(X) = and Var(X) =, show (using Chebyshev's inequality) that sample mean converges in probability to..
More informationEFFICIENCY of the PRINCIPAL COMPONENT LIU- TYPE ESTIMATOR in LOGISTIC REGRESSION
EFFICIENCY of the PRINCIPAL COMPONEN LIU- YPE ESIMAOR in LOGISIC REGRESSION Authors: Jibo Wu School of Mathematics and Finance, Chongqing University of Arts and Sciences, Chongqing, China, linfen52@126.com
More informationPrincipal component analysis
Principal component analysis Angela Montanari 1 Introduction Principal component analysis (PCA) is one of the most popular multivariate statistical methods. It was first introduced by Pearson (1901) and
More informationChapter 14 Stein-Rule Estimation
Chapter 14 Stein-Rule Estimation The ordinary least squares estimation of regression coefficients in linear regression model provides the estimators having minimum variance in the class of linear and unbiased
More informationThe Statistical Property of Ordinary Least Squares
The Statistical Property of Ordinary Least Squares The linear equation, on which we apply the OLS is y t = X t β + u t Then, as we have derived, the OLS estimator is ˆβ = [ X T X] 1 X T y Then, substituting
More informationCOMBINING THE LIU-TYPE ESTIMATOR AND THE PRINCIPAL COMPONENT REGRESSION ESTIMATOR
Noname manuscript No. (will be inserted by the editor) COMBINING THE LIU-TYPE ESTIMATOR AND THE PRINCIPAL COMPONENT REGRESSION ESTIMATOR Deniz Inan Received: date / Accepted: date Abstract In this study
More informationThe Multivariate Gaussian Distribution
The Multivariate Gaussian Distribution Chuong B. Do October, 8 A vector-valued random variable X = T X X n is said to have a multivariate normal or Gaussian) distribution with mean µ R n and covariance
More information01 Probability Theory and Statistics Review
NAVARCH/EECS 568, ROB 530 - Winter 2018 01 Probability Theory and Statistics Review Maani Ghaffari January 08, 2018 Last Time: Bayes Filters Given: Stream of observations z 1:t and action data u 1:t Sensor/measurement
More informationThe Linear Regression Model
The Linear Regression Model Carlo Favero Favero () The Linear Regression Model 1 / 67 OLS To illustrate how estimation can be performed to derive conditional expectations, consider the following general
More informationEconomics 620, Lecture 5: exp
1 Economics 620, Lecture 5: The K-Variable Linear Model II Third assumption (Normality): y; q(x; 2 I N ) 1 ) p(y) = (2 2 ) exp (N=2) 1 2 2(y X)0 (y X) where N is the sample size. The log likelihood function
More information1. Background: The SVD and the best basis (questions selected from Ch. 6- Can you fill in the exercises?)
Math 35 Exam Review SOLUTIONS Overview In this third of the course we focused on linear learning algorithms to model data. summarize: To. Background: The SVD and the best basis (questions selected from
More informationBiostatistics Advanced Methods in Biostatistics IV
Biostatistics 140.754 Advanced Methods in Biostatistics IV Jeffrey Leek Assistant Professor Department of Biostatistics jleek@jhsph.edu Lecture 12 1 / 36 Tip + Paper Tip: As a statistician the results
More informationJournal of Econometrics 3 (1975) North-Holland Publishing Company
Journal of Econometrics 3 (1975) 395-404. 0 North-Holland Publishing Company ESTJMATION OF VARIANCE AFTER A PRELIMINARY TEST OF HOMOGENEITY AND OPTIMAL LEVELS OF SIGNIFICANCE FOR THE PRE-TEST T. TOYODA
More informationTopic 4: Model Specifications
Topic 4: Model Specifications Advanced Econometrics (I) Dong Chen School of Economics, Peking University 1 Functional Forms 1.1 Redefining Variables Change the unit of measurement of the variables will
More informationAlternative Biased Estimator Based on Least. Trimmed Squares for Handling Collinear. Leverage Data Points
International Journal of Contemporary Mathematical Sciences Vol. 13, 018, no. 4, 177-189 HIKARI Ltd, www.m-hikari.com https://doi.org/10.1988/ijcms.018.8616 Alternative Biased Estimator Based on Least
More informationRegression. Oscar García
Regression Oscar García Regression methods are fundamental in Forest Mensuration For a more concise and general presentation, we shall first review some matrix concepts 1 Matrices An order n m matrix is
More informationLinear models. Linear models are computationally convenient and remain widely used in. applied econometric research
Linear models Linear models are computationally convenient and remain widely used in applied econometric research Our main focus in these lectures will be on single equation linear models of the form y
More informationLagrange Multipliers
Optimization with Constraints As long as algebra and geometry have been separated, their progress have been slow and their uses limited; but when these two sciences have been united, they have lent each
More informationMeasuring Local Influential Observations in Modified Ridge Regression
Journal of Data Science 9(2011), 359-372 Measuring Local Influential Observations in Modified Ridge Regression Aboobacker Jahufer 1 and Jianbao Chen 2 1 South Eastern University and 2 Xiamen University
More informationAn Introduction to Multivariate Statistical Analysis
An Introduction to Multivariate Statistical Analysis Third Edition T. W. ANDERSON Stanford University Department of Statistics Stanford, CA WILEY- INTERSCIENCE A JOHN WILEY & SONS, INC., PUBLICATION Contents
More informationPrincipal Component Analysis (PCA) Our starting point consists of T observations from N variables, which will be arranged in an T N matrix R,
Principal Component Analysis (PCA) PCA is a widely used statistical tool for dimension reduction. The objective of PCA is to find common factors, the so called principal components, in form of linear combinations
More informationPart IB Statistics. Theorems with proof. Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua. Lent 2015
Part IB Statistics Theorems with proof Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly)
More informationStrongly connected graphs and polynomials
Strongly connected graphs and polynomials Jehanne Dousse August 27, 2011 Abstract In this report, we give the exact solutions ofthe equation P(A) = 0 where P is a polynomial of degree2 with integer coefficients,
More informationMulticollinearity and A Ridge Parameter Estimation Approach
Journal of Modern Applied Statistical Methods Volume 15 Issue Article 5 11-1-016 Multicollinearity and A Ridge Parameter Estimation Approach Ghadban Khalaf King Khalid University, albadran50@yahoo.com
More informationMulticollinearity Problem and Some Hypothetical Tests in Regression Model
ISSN: 3-9653; IC Value: 45.98; SJ Impact Factor :6.887 Volume 5 Issue XII December 07- Available at www.iraset.com Multicollinearity Problem and Some Hypothetical Tests in Regression Model R.V.S.S. Nagabhushana
More informationMaximum Likelihood Estimation
Maximum Likelihood Estimation Merlise Clyde STA721 Linear Models Duke University August 31, 2017 Outline Topics Likelihood Function Projections Maximum Likelihood Estimates Readings: Christensen Chapter
More informationNonlinear Inequality Constrained Ridge Regression Estimator
The International Conference on Trends and Perspectives in Linear Statistical Inference (LinStat2014) 24 28 August 2014 Linköping, Sweden Nonlinear Inequality Constrained Ridge Regression Estimator Dr.