
AD-A045 406   DESMATICS INC, STATE COLLEGE PA   F/G 12/1
RIDGE REGRESSION FOR NONSTANDARDIZED MODELS. (U)
OCT 77   T. L. KING, D. SMITH   N00014-75-C-1054
UNCLASSIFIED   TR-106-6   NL

DESMATICS, INC.
P. O. Box 618, State College, Pa. 16801    Phone: (814) 238-9621
Applied Research in Statistics - Mathematics - Operations Research

RIDGE REGRESSION FOR NONSTANDARDIZED MODELS

by

Terry L. King
Dennis E. Smith

This study was supported by the Office of Naval Research under Contract N00014-75-C-1054, Task No. NR 042-334.

Reproduction in whole or in part is permitted for any purpose of the United States Government.

Approved for public release; distribution unlimited.

TABLE OF CONTENTS

I. INTRODUCTION
II. STANDARDIZATION IN RIDGE ESTIMATION
III. CONCLUSIONS
IV. REFERENCES

I. INTRODUCTION

Consider the usual regression model

    y = Xβ + ε    (1.1)

where y is an n × 1 vector of observations, β = (β₀, β₁, ..., βₚ)′ is a (p + 1) × 1 vector of unknown parameters, ε is an n × 1 vector of random errors, and X is a fixed n × (p + 1) matrix. It will be assumed that X is of full rank and that the first column of X is a column of ones, denoted by 1. It will also be assumed that ε has expectation 0, Var(ε) = σ²I, and ε follows a normal distribution.

The ordinary least squares (OLS) estimator of β is given by

    β̂ = (X′X)⁻¹X′y    (1.2)

The properties of β̂ are well known. Namely,

(1) β̂ minimizes (y − Xβ)′(y − Xβ)
(2) β̂ is the best linear unbiased estimator of β
(3) β̂ is the maximum likelihood estimator of β
(4) β̂ ~ N(β, σ²(X′X)⁻¹)
(5) MSE(β̂) = E[(β̂ − β)′(β̂ − β)] = σ² Σᵢ 1/λᵢ, where λ₁, λ₂, ..., λ_{p+1} are the eigenvalues of X′X

Upon examination of the last of these properties, it is clear that when the minimum eigenvalue is close to zero, the mean squared error (MSE) becomes unsatisfactorily large. This led Hoerl and

Kennard (1970a) to recommend the use of the ridge estimator

    β* = (X′X + kI)⁻¹X′y,  k > 0    (1.3)

when X′X, in correlation form, is an ill-conditioned matrix. If k is treated as a constant, the following properties are known:

(1) For k = 0, the ridge estimator is identical to the OLS estimator.
(2) For k > 0, the ridge estimator is shorter than the OLS estimator.
(3) β* ~ N(WX′Xβ, σ²WX′XW), where W = (X′X + kI)⁻¹.
(4) MSE(β*) = σ² Σⱼ λⱼ/(λⱼ + k)² + k² β′(X′X + kI)⁻²β.
(5) There always exists a k > 0 such that MSE(β*) < MSE(β̂).

Hoerl and Kennard (1970b) also suggested a graphical technique to determine the value of k. In this technique, one first constructs a plot of the components of β* versus k, called the ridge trace. Then, using this ridge trace, one chooses a value of k for which the estimates have stabilized. Analytic methods for choosing k have also been proposed; for example, see McDonald and Galarneau (1975), or Hoerl, Kennard, and Baldwin (1975). In any event, once the value of k has been determined in some manner, (1.3) gives the estimate of β to be reported.

Research papers appearing since the Hoerl and Kennard articles have not always used this estimator as their ridge estimator because the form of the regression model is varied. Marquardt and Snee (1975)

advocate using the model with only X standardized. However, they suggest choosing the value of k with y also standardized. Hemmerle (1975) standardizes the design matrix X, but not the observation vector y. McDonald and Galarneau (1975) standardize both X and y, but Guilkey and Murphy (1975) make no standardizations at all. Obenchain (1975) centers both X and y and warns against scaling or reparameterizing unless the results are ultimately to be reported in that form. Thisted (1976) gives a good discussion of the problems involved with the standardization of the variables.

Comparison of the results of one investigator with those of another may be hampered by the lack of uniformity in the form of the model. Most authors have recognized this problem, but have not investigated it further. In the case of the OLS solution, the same estimator is produced from any of the forms of the model. Thus, by solving a problem stated in any form, the same OLS estimator of β will be produced when measured in the same parameter space. This nice property is not true of ridge estimators, however. In this paper, formulae are given for different forms of the model, and comparisons are made among them.

In order to examine the effect of standardization on the ridge estimator, first consider the model. If no transformations are made, the model is given in (1.1). If the X matrix and β are partitioned as X = [1 G] and β = (β₀, θ′)′, and the columns of G are centered about their means, the model can be written as

    E[y] = 1α + CGθ    (1.4)

where C is the symmetric idempotent matrix (I − n⁻¹11′), and α = n⁻¹1′E[y]. If the vectors in CG are also scaled to have unit length, then the model can be written as

    E[y] = 1α + Hγ    (1.5)

where H = CGD^(−1/2), γ = D^(1/2)θ, and D is the diagonal matrix of scaling factors. In the case of the OLS estimator, the three forms of the model, (1.1), (1.4), and (1.5), are known to give equivalent estimators of β.
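As a quick numerical check of the claim that the three forms give equivalent OLS estimators, the following sketch builds the centered form (1.4) and the standardized form (1.5) from a small made-up data set (illustrative values, not the paper's design matrix) and verifies that the fitted OLS estimates of θ agree across all three forms:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data; not the design matrix used in the paper.
n, p = 8, 3
G = rng.normal(size=(n, p))
one = np.ones(n)
y = rng.normal(size=n)

# Centering matrix C = I - n^{-1} 1 1' is symmetric and idempotent.
C = np.eye(n) - np.outer(one, one) / n
Gc = C @ G                              # centered columns, as in (1.4)

# D: diagonal matrix of scaling factors (squared lengths of the centered
# columns), so H = C G D^{-1/2} has unit-length columns, as in (1.5).
d = np.sum(Gc**2, axis=0)
H = Gc / np.sqrt(d)

# OLS estimate of theta from each form of the model.
X = np.column_stack([one, G])
theta_full = np.linalg.lstsq(X, y, rcond=None)[0][1:]              # (1.1)
theta_cent = np.linalg.lstsq(Gc, C @ y, rcond=None)[0]             # (1.4)
theta_std = np.linalg.lstsq(H, C @ y, rcond=None)[0] / np.sqrt(d)  # (1.5)

print(np.allclose(theta_full, theta_cent),
      np.allclose(theta_full, theta_std))   # prints: True True
```

The agreement follows because γ̂ = D^(1/2)θ̂ under (1.5), so dividing by the square roots of the scaling factors recovers the same θ̂ that the untransformed and centered fits produce.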

II. STANDARDIZATION IN RIDGE ESTIMATION

For the model (1.1) the ridge estimator is given by

    β₁* = (X′X + k₁I)⁻¹X′y    (2.1)

In order to make the comparison easier, partition X and β as before. Then β₁* can be written as

    β₁* = (β₀₁*, θ₁*′)′    (2.2)

where

    θ₁* = [G′(I − (n/(n + k₁))J̄)G + k₁I]⁻¹G′(I − (n/(n + k₁))J̄)y,
    β₀₁* = (n/(n + k₁))(ȳ − ḡ′θ₁*),

and J̄ = n⁻¹11′. For the centered model (1.4), the ridge estimator is

    β₂* = (β₀₂*, θ₂*′)′    (2.3)

where

    θ₂* = (G′C′CG + k₂I)⁻¹G′C′y,
    β₀₂* = ȳ.

In general, this is a distinct estimator from that given in (2.2). If θ₁* ≠ θ₂*, there is nothing to prove. If θ₁* = θ₂*, then

the corresponding estimates of the original intercept satisfy β₀₁* = (n/(n + k₁))(ȳ − ḡ′θ₂*), which differs from the centered-model estimate ȳ − ḡ′θ₂* unless k₁ = 0; i.e., unless the estimator β₁* is the OLS estimator. Hence, these estimators are not the same.

For the third form of the model (1.5), which will be referred to as the standardized form of the model, the estimator of the original β is

    β₃* = (β₀₃*, θ₃*′)′    (2.4)

where

    γ₃* = (H′H + k₃I)⁻¹H′y,  θ₃* = D^(−1/2)γ₃*,  β₀₃* = ȳ.

The estimator (2.4) can be seen to be different from that given in (2.2) in the same way as (2.3) was. The estimators for the centered model are equivalent to the estimators for the standardized model only when D is a scalar multiple of the identity. In this case, there do exist values of k₂ and k₃ that would make the estimators equivalent.

To this point in the development, the error structure has remained the same. However, if transformations are made on the observation vector, the error structure is also changed. It is not too difficult to show that if the design matrix is centered, the same ridge estimator is produced by using y in its original form or in its centered form, even if the equations to be solved do not take into account the error structure. It can also be shown that the centering and scaling of y has no effect on the estimator if X has also been centered and scaled.
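A short sketch makes the section's point concrete (the data are made-up illustrative values, not the paper's design): the three ridge estimators of θ from forms (2.1)-(2.4) collapse to the common OLS estimate at k = 0 but are three distinct estimators for k > 0, here using a single common biasing constant for simplicity:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative data (not the paper's design matrix).
n, p = 8, 3
G = rng.normal(size=(n, p))
y = rng.normal(size=n)
one = np.ones(n)
C = np.eye(n) - np.outer(one, one) / n
Gc = C @ G
d = np.sum(Gc**2, axis=0)
H = Gc / np.sqrt(d)

def theta_estimates(k):
    # Form 1: ridge on the untransformed model, as in (2.1)-(2.2).
    X = np.column_stack([one, G])
    t1 = np.linalg.solve(X.T @ X + k * np.eye(p + 1), X.T @ y)[1:]
    # Form 2: ridge after centering, as in (2.3).
    t2 = np.linalg.solve(Gc.T @ Gc + k * np.eye(p), Gc.T @ y)
    # Form 3: ridge after centering and scaling, mapped back via D^{-1/2},
    # as in (2.4).
    g3 = np.linalg.solve(H.T @ H + k * np.eye(p), H.T @ y)
    t3 = g3 / np.sqrt(d)
    return t1, t2, t3

t1, t2, t3 = theta_estimates(0.0)   # k = 0: all three reduce to OLS
print(np.allclose(t1, t2), np.allclose(t2, t3))

t1, t2, t3 = theta_estimates(0.5)   # k > 0: three distinct estimators
print(np.allclose(t1, t2), np.allclose(t2, t3))
```

With generic (unequal) scaling factors the second print shows disagreement, matching the text: the centered and standardized forms coincide only when D is a scalar multiple of the identity.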

In order to evaluate the three classes of estimators, squared error (squared Euclidean distance) was used as a basis of comparison in this study. While it is possible to find explicit formulae for the MSE (i.e., the expected squared error) of each estimator, the distribution of the squared error also should be investigated. This led to a simulation study.

As the discussion above has emphasized, there are three parameter spaces in which such evaluation may be made. Comparisons can be made based on the parameterization of (1) the original model, (2) the centered model, or (3) the standardized model. In this study, comparisons were made in all three parameter spaces and also in the observation (y) space.

For this simulation study, a design discussed in Marquardt and Snee (1975) was used. In the design matrix there are three vectors orthogonal to 1. Two of these vectors are highly correlated (r = 0.989), and the other correlations are zero. This design matrix, which has eight observations and four parameters to be estimated, is given in Figure 1.

The eigenvalues and eigenvectors for the correlation matrix were calculated. The eigenvectors (1, 1, 0)′ and (1, −1, 0)′, corresponding to the maximum and minimum eigenvalues respectively, were multiplied by σ and 3σ, where σ = 0.8, the same standard deviation value used by Marquardt and Snee. The four resulting vectors were transformed from the standardized space back to the original space, and the constant term in the model was set equal to 0. Using each of these four vectors as the true β, 1000

[Figure 1: Design Matrix Used in the Simulation Study — an 8 × 4 matrix with a column of ones and remaining entries ±1 and ±0.8; the individual entries are not recoverable from the scan.]

replicates of y were generated from model (1.1) using σ = 0.8. Since σ and β are known, the value of k was chosen by a computer routine to minimize the MSE of each estimator, using formulae similar to that given in Hoerl and Kennard (1970a). Hence, in each case, the optimal¹ value of k was used, i.e., conditional on σ and β.

At this point, the MSE of each estimator in each space could be calculated theoretically, but doing so would not tell very much about the distribution of the squared error of the estimators. Therefore, the simulation study was conducted. In this simulation the true value of β is defined such that:

(1) the constant term is equal to 0;
(2) β, when transformed to the standardized space, is equal to aν, where a denotes σ or 3σ, and ν denotes (1, 1, 0)′, an eigenvector corresponding to the maximum eigenvalue, or (1, −1, 0)′, an eigenvector corresponding to the minimum eigenvalue.

The values of a and ν are used in Figures 2 through 6 to indicate the true value of β used in the simulation. Estimated MSE values for each of the estimators are presented in Figure 2. Figures 3 through 6 show how the estimators compare based on the simulations. Each entry in these figures is the number of times out of 1000 that the estimator at the top of that column

¹ Subject to computing error not exceeding 1.0 × 10⁻⁵.

[Figure 2: Estimated MSE Values for the Estimators — the tabulated values are not recoverable from the scan.]

                     β₁*          β₂*          β₃*
         Vector     σ    3σ      σ    3σ      σ    3σ
β̂       (1, 1, 0)  981  965    979  954    979  954
         (1,−1, 0)  787  573    771  573    771  573
β₁*      (1, 1, 0)              178  150    201  195
         (1,−1, 0)               49  296     55  305
β₂*      (1, 1, 0)                          748  786
         (1,−1, 0)                          606  597

Note 1: The true value of the parameter β, when transformed to the standardized space, is equal to aν. (See text for full discussion.)
Note 2: Each entry in this figure is the number of times out of 1000 replicates that the estimator at the top of the column had smaller squared error than the estimator at the left of the row.

Figure 3: Comparison of Estimates of β from the Simulation.

                     θ₁*          θ₂*          θ₃*
         Vector     σ    3σ      σ    3σ      σ    3σ
θ̂       (1, 1, 0)  969  953    979  954    979  954
         (1,−1, 0)  771  573    771  573    771  573
θ₁*      (1, 1, 0)              508  470    591  605
         (1,−1, 0)              459  575    477  619
θ₂*      (1, 1, 0)                          748  786
         (1,−1, 0)                          609  597

Note 1: The true value of the parameter β, when transformed to the standardized space, is equal to aν. (See text for full discussion.)
Note 2: Each entry in this figure is the number of times out of 1000 replicates that the estimator at the top of the column had smaller squared error than the estimator at the left of the row.

Figure 4: Comparison of Estimates of θ from the Simulation.

                     γ₁*          γ₂*          γ₃*
         Vector     σ    3σ      σ    3σ      σ    3σ
γ̂       (1, 1, 0)  971  953    979  954    980  954
         (1,−1, 0)  771  573    771  573    771  573
γ₁*      (1, 1, 0)              495  467    580  610
         (1,−1, 0)              452  575    472  622
γ₂*      (1, 1, 0)                          755  791
         (1,−1, 0)                          609  612

Note 1: The true value of the parameter β, when transformed to the standardized space, is equal to aν. (See text for full discussion.)
Note 2: Each entry in this figure is the number of times out of 1000 replicates that the estimator at the top of the column had smaller squared error than the estimator at the left of the row.

Figure 5: Comparison of Estimates of γ from the Simulation.

                     Xβ₁*          Xβ₂*          Xβ₃*
         Vector     σ     3σ      σ     3σ      σ     3σ
Xβ̂      (1, 1, 0)    0     0      0     0      0     0
         (1,−1, 0)    0     0      0     0      0     0
Xβ₁*     (1, 1, 0)              1000  1000   1000  1000
         (1,−1, 0)              1000  1000   1000   949
Xβ₂*     (1, 1, 0)                              0     0
         (1,−1, 0)                            545     0

Note 1: The true value of the parameter β, when transformed to the standardized space, is equal to aν. (See text for full discussion.)
Note 2: Each entry in this figure is the number of times out of 1000 replicates that the estimator at the top of the column had smaller squared error than the estimator at the left of the row.

Figure 6: Comparison of Estimates of Xβ from the Simulation.

had smaller squared error than the estimator given at the left of that row. Figure 3 compares the estimators in the original space, Figure 4 in the centered space, Figure 5 in the standardized space, and Figure 6 in the observation space.

In the simulation the OLS estimator performed as expected. When OLS was compared to the ridge estimators separately, each of the ridge estimators yielded estimates with squared error smaller than that of OLS in a minimum of 57% of the simulations in each of the three parameter spaces for all values of β considered. Since the OLS estimator is a special case of all three ridge estimators, and the ridge estimators were chosen with optimal properties, this result is not at all surprising. Also as expected in theory, the OLS estimates of Xβ produced the smallest squared error for every y generated.

For the original parameter β (Figure 3), the ridge estimates from model (1.1), when compared with transformed estimates from either of the other two models, (1.4) and (1.5), were closest to the true β in more than 69% of the simulations regardless of the magnitude or orientation of the parameter vector. Since these results were based on the use of optimal k values, it does not necessarily follow that similar results will hold for nonoptimal (stochastic) k. Nonetheless, when a ridge estimator is to be used to estimate β, it appears that the estimator β₁* derived from the original model may perform the best.

For the centered parameter θ and the standardized parameter γ, the results were not as striking. However, by examining Figures 4

and 5, it appears that in each of these cases it might be well to base the estimation process on γ, i.e., by using ridge estimation with the standardized model. An estimate of Xβ can be obtained by first estimating the parameter β and then premultiplying the estimate by X.

As can be seen from Figure 6, the estimator based on the centered model appears to be the most likely candidate of the three ridge estimators considered when comparisons are made in the observation space. Of course, since the OLS estimator minimizes (y − Xβ)′(y − Xβ), none of the ridge estimators ever resulted in a smaller squared error when measured on this basis.
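The simulation procedure described above can be sketched as follows. The design matrix here is a stand-in ill-conditioned design (the paper's Figure 1 matrix is not reproduced), and the optimal k is found by a grid search over the theoretical ridge MSE formula, conditional on the known σ and β, standing in for the paper's computer routine:

```python
import numpy as np

rng = np.random.default_rng(4)

# Stand-in ill-conditioned design: two regressors nearly collinear.
n = 8
g1 = rng.normal(size=n)
X = np.column_stack([np.ones(n), g1, g1 + 0.05 * rng.normal(size=n),
                     rng.normal(size=n)])
sigma = 0.8
beta = np.array([0.0, 1.0, 1.0, 0.5])

# Eigendecomposition of X'X; alpha is beta in the eigenvector basis.
lam, V = np.linalg.eigh(X.T @ X)
alpha = V.T @ beta

def mse(k):
    # Theoretical ridge MSE, in the spirit of Hoerl and Kennard (1970a):
    # variance term + squared-bias term.
    return (sigma**2 * np.sum(lam / (lam + k)**2)
            + k**2 * np.sum(alpha**2 / (lam + k)**2))

# Optimal k, conditional on the known sigma and beta (grid search).
ks = np.linspace(0.0, 5.0, 5001)
k_opt = ks[np.argmin([mse(k) for k in ks])]

# Monte Carlo: per-replicate squared-error comparison, ridge vs OLS.
wins, reps = 0, 1000
for _ in range(reps):
    y = X @ beta + sigma * rng.normal(size=n)
    b_ols = np.linalg.solve(X.T @ X, X.T @ y)
    b_ridge = np.linalg.solve(X.T @ X + k_opt * np.eye(4), X.T @ y)
    if np.sum((b_ridge - beta)**2) < np.sum((b_ols - beta)**2):
        wins += 1
print("ridge beat OLS in", wins, "of", reps, "replicates; k =", k_opt)
```

Counting per-replicate wins, rather than only comparing theoretical MSE values, is exactly what distinguishes the paper's Figures 3 through 6 from Figure 2: it examines the distribution of the squared error, not just its expectation.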

III. CONCLUSIONS

The research described in this technical report has been concerned with three ridge estimators. Since each was chosen to be an optimal estimator within a particular transformation of the parameter space with respect to the problem being considered, the results do not address the question of when ridge estimation rather than OLS estimation should be used. Instead, given that ridge regression is to be used, these results do address the question of which ridge estimator to use.

While the results of this small-scale simulation are not conclusive, they do indicate that care needs to be taken in deciding which estimator to use. The choice of estimator will depend on the criteria used to measure the goodness of the estimation process. For example, a ridge estimator with good squared error properties in the standardized parameter space may not do as well as another ridge estimator when transformed back to the original parameter space. In this case, the data analyst must decide whether measurement of squared error is more appropriate in the standardized space or in the original space.
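The conclusion's point, that squared-error rankings depend on the space in which they are measured, can be illustrated with a tiny sketch (all numbers made up for illustration): two hypothetical estimates of γ are tied in the standardized space, yet after transforming back to θ = D^(−1/2)γ one is clearly better than the other.

```python
import numpy as np

# Made-up diagonal scaling factors and a true standardized parameter.
d = np.array([1.0, 100.0])           # very unequal scales
gamma_true = np.array([1.0, 1.0])

# Two hypothetical estimates of gamma with identical standardized errors.
gamma_a = np.array([1.3, 1.0])       # errs in the small-scale coordinate
gamma_b = np.array([1.0, 1.3])       # errs in the large-scale coordinate

def sq_err(est, true):
    return float(np.sum((est - true) ** 2))

# Tied in the standardized space...
print(sq_err(gamma_a, gamma_true), sq_err(gamma_b, gamma_true))

# ...but not after mapping back to theta = D^{-1/2} gamma.
theta_true = gamma_true / np.sqrt(d)
theta_a = gamma_a / np.sqrt(d)
theta_b = gamma_b / np.sqrt(d)
print(sq_err(theta_a, theta_true), sq_err(theta_b, theta_true))
```

The error in the heavily scaled coordinate shrinks by the scaling factor on the way back, so the second estimate wins in the original space despite the tie in the standardized space.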

IV. REFERENCES

Guilkey, D. K., and Murphy, J. L. (1975). Directed ridge regression techniques in cases of multicollinearity. JASA, 70, 769-776.

Hemmerle, William J. (1975). An explicit solution for generalized ridge regression. Technometrics, 17, 309-314.

Hoerl, A. E. and Kennard, R. W. (1970a). Ridge regression: biased estimation for nonorthogonal problems. Technometrics, 12, 55-67.

Hoerl, A. E. and Kennard, R. W. (1970b). Ridge regression: applications to nonorthogonal problems. Technometrics, 12, 69-82.

Hoerl, A. E., Kennard, R. W., and Baldwin, K. F. (1975). Ridge regression: some simulations. Comm. Stat., 4, 105-123.

Marquardt, D. W. and Snee, R. D. (1975). Ridge regression in practice. American Statistician, 29, 3-20.

McDonald, G. C. and Galarneau, D. I. (1975). A Monte-Carlo evaluation of some ridge-type estimators. JASA, 70, 407-416.

Obenchain, R. L. (1975). Ridge analysis following a preliminary test of a shrunken hypothesis. Technometrics, 17, 431-441.

Thisted, R. A. (1976). Ridge regression, minimax estimation, and empirical Bayes methods. Division of Biostatistics, Stanford University, Technical Report No. 28.

UNCLASSIFIED
SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)

REPORT DOCUMENTATION PAGE

1. REPORT NUMBER: 106-6
4. TITLE: RIDGE REGRESSION FOR NONSTANDARDIZED MODELS
5. TYPE OF REPORT & PERIOD COVERED: Technical Report
7. AUTHORS: Terry L. King and Dennis E. Smith
8. CONTRACT OR GRANT NUMBER: N00014-75-C-1054
9. PERFORMING ORGANIZATION NAME AND ADDRESS: Desmatics, Inc., P. O. Box 618, State College, Pa. 16801
10. PROGRAM ELEMENT, PROJECT, TASK: NR 042-334
11. CONTROLLING OFFICE NAME AND ADDRESS: Statistics and Probability Program (Code 436), Office of Naval Research, Arlington, VA 22217
12. REPORT DATE: October 1977
13. NUMBER OF PAGES: 18
15. SECURITY CLASS (of this report): Unclassified
19. KEY WORDS: Ridge regression; Standardization; Linear model
20. ABSTRACT: In the usual regression model y = Xβ + ε, with X of full rank and ε ~ N(0, σ²I), the ordinary least squares estimator (X′X)⁻¹X′y has extremely large mean square error when X′X is an ill-conditioned matrix. Hoerl and Kennard (Technometrics, Vol. 12, No. 1, pp. 55-67 (1970)) have

proposed replacing it by the ridge estimator (X′X + kI)⁻¹X′y in these cases. Although Hoerl and Kennard considered the regression problem in correlation form, there is nothing in their procedure that makes this mandatory. This paper compares ridge estimators for β that arise when the biasing factor (k) is applied at different stages of standardization (i.e., centering and scaling), and shows which estimators are identical and which are different. In addition, results of a small-scale simulation are discussed.