Matrix Differential Calculus with Applications in Statistics and Econometrics

Similar documents
Hands-on Matrix Algebra Using R

MATHEMATICS FOR ECONOMISTS. An Introductory Textbook. Third Edition. Malcolm Pemberton and Nicholas Rau. UNIVERSITY OF TORONTO PRESS Toronto Buffalo

Matrix Mathematics. Theory, Facts, and Formulas with Application to Linear Systems Theory. Dennis S. Bernstein

Linear Models in Statistics

CONTENTS. Preface Preliminaries 1

Finite Population Sampling and Inference

Generalized, Linear, and Mixed Models

Preface to Second Edition... vii. Preface to First Edition...

Wiley. Methods and Applications of Linear Models. Regression and the Analysis. of Variance. Third Edition. Ishpeming, Michigan RONALD R.

MATHEMATICAL ANALYSIS

Numerical Analysis. A Comprehensive Introduction. H. R. Schwarz University of Zürich Switzerland. with a contribution by

Introduction to the Mathematical and Statistical Foundations of Econometrics Herman J. Bierens Pennsylvania State University

E. F. Beckenbach R. Bellman. Inequalities. Third Printing. '**" '»»..,,. t. Springer-Verlag Berlin Heidelberg New York 1971

LINEAR AND NONLINEAR PROGRAMMING

STUDY PLAN MASTER IN (MATHEMATICS) (Thesis Track)

New Introduction to Multiple Time Series Analysis

An Introduction to Multivariate Statistical Analysis

NUMERICAL METHODS FOR ENGINEERING APPLICATION

FUNDAMENTALS OF POLARIZED LIGHT

ELEMENTARY MATRIX ALGEBRA

ABSTRACT ALGEBRA WITH APPLICATIONS

Directional Statistics

Course Description - Master in of Mathematics Comprehensive exam& Thesis Tracks

MULTIVARIABLE CALCULUS, LINEAR ALGEBRA, AND DIFFERENTIAL EQUATIONS

The Essentials of Linear State-Space Systems

BASIC EXAM ADVANCED CALCULUS/LINEAR ALGEBRA

OPTIMAL CONTROL AND ESTIMATION

Economics 101A (Lecture 3) Stefano DellaVigna

Applied Linear Algebra

Classes of Linear Operators Vol. I

Statistical Methods in HYDROLOGY CHARLES T. HAAN. The Iowa State University Press / Ames

MATHEMATICS. Course Syllabus. Section A: Linear Algebra. Subject Code: MA. Course Structure. Ordinary Differential Equations

Differential Geometry, Lie Groups, and Symmetric Spaces

Math 302 Outcome Statements Winter 2013

Contents. Preface for the Instructor. Preface for the Student. xvii. Acknowledgments. 1 Vector Spaces 1 1.A R n and C n 2

Mathematics for Economics and Finance

Linear and Nonlinear Models

Linear Models 1. Isfahan University of Technology Fall Semester, 2014

Contents. Preface xi. vii

Statistical Signal Processing Detection, Estimation, and Time Series Analysis

Applied Probability and Stochastic Processes

Analytical Mechanics for Relativity and Quantum Mechanics

HANDBOOK OF APPLICABLE MATHEMATICS

CALCULUS SALAS AND HILLE'S REVISED BY GARRET J. ETGEI ONE VARIABLE SEVENTH EDITION ' ' ' ' i! I! I! 11 ' ;' 1 ::: T.

Albert W. Marshall. Ingram Olkin Barry. C. Arnold. Inequalities: Theory. of Majorization and Its Applications. Second Edition.

MATRIX AND LINEAR ALGEBR A Aided with MATLAB

Second-order approximation of dynamic models without the use of tensors

Regression Analysis By Example

MATHEMATICAL ECONOMICS: OPTIMIZATION. Contents

Numerical Methods in Matrix Computations

ADAPTIVE FILTER THEORY

INDEX. Bolzano-Weierstrass theorem, for sequences, boundary points, bounded functions, 142 bounded sets, 42 43

APPLIED NUMERICAL LINEAR ALGEBRA

Lessons in Estimation Theory for Signal Processing, Communications, and Control

APPLIED FUNCTIONAL ANALYSIS

PATTERN CLASSIFICATION

Mathematics for Engineers and Scientists

On Hadamard and Kronecker Products Over Matrix of Matrices

ADAPTIVE FILTER THEORY

CALCULUS GARRET J. ETGEN SALAS AND HILLE'S. ' MiIIIIIIH. I '////I! li II ii: ONE AND SEVERAL VARIABLES SEVENTH EDITION REVISED BY \

Pattern Recognition and Machine Learning

Numerical Methods for Engineers. and Scientists. Applications using MATLAB. An Introduction with. Vish- Subramaniam. Third Edition. Amos Gilat.

Response Surface Methodology:

A geometric proof of the spectral theorem for real symmetric matrices

Time Series Analysis. James D. Hamilton PRINCETON UNIVERSITY PRESS PRINCETON, NEW JERSEY

New Perspectives. Functional Inequalities: and New Applications. Nassif Ghoussoub Amir Moradifam. Monographs. Surveys and

New insights into best linear unbiased estimation and the optimality of least-squares

A User's Guide To Principal Components

CHAPTER 1. LINEAR ALGEBRA Systems of Linear Equations and Matrices

Preface. 2 Linear Equations and Eigenvalue Problem 22

OPTIMAL ESTIMATION of DYNAMIC SYSTEMS

Stat 5101 Lecture Notes

Numerical Analysis for Statisticians

Modern Geometric Structures and Fields

Optimisation theory. The International College of Economics and Finance. I. An explanatory note. The Lecturer and Class teacher: Kantorovich G.G.

Chapter 1. Matrix Algebra

Math 102, Winter Final Exam Review. Chapter 1. Matrices and Gaussian Elimination

2. Linear algebra. matrices and vectors. linear equations. range and nullspace of matrices. function of vectors, gradient and Hessian

Lec3p1, ORF363/COS323

The Fractional Fourier Transform with Applications in Optics and Signal Processing

Syllabuses for Honor Courses. Algebra I & II

Philippe. Functional Analysis with Applications. Linear and Nonlinear. G. Ciarlet. City University of Hong Kong. siajtl

Continuous Univariate Distributions

M E M O R A N D U M. Faculty Senate approved November 1, 2018

COPYRIGHTED MATERIAL CONTENTS. Preface Preface to the First Edition

MA3025 Course Prerequisites

Elements of Multivariate Time Series Analysis

Condensed Table of Contents for Introduction to Stochastic Search and Optimization: Estimation, Simulation, and Control by J. C.

Introduction to Numerical Analysis

Population Games and Evolutionary Dynamics

Econometric Analysis of Cross Section and Panel Data

A THEORETICAL INTRODUCTION TO NUMERICAL ANALYSIS

STATE COUNCIL OF EDUCATIONAL RESEARCH AND TRAINING TNCF DRAFT SYLLABUS.

Lecture 2: Linear Algebra Review

Scalar, Vector, and Matrix Mathematics

component risk analysis

CHAPTER 2: CONVEX SETS AND CONCAVE FUNCTIONS. W. Erwin Diewert January 31, 2008.

Journal of Inequalities in Pure and Applied Mathematics

Calculus from Graphical, Numerical, and Symbolic Points of View, 2e Arnold Ostebee & Paul Zorn

Mathematical Foundations of Applied Statistics: Matrix Algebra

Transcription:

Matrix Differential Calculus with Applications in Statistics and Econometrics Revised Edition JAN. R. MAGNUS CentERjor Economic Research, Tilburg University and HEINZ NEUDECKER Cesaro, Schagen JOHN WILEY & SONS Chichester New York Weinheim Brisbane Singapore Toronto

Contents Preface Preface to the fast revised printing Preface to the second revised printing Part One Matrices 1 Basic properties of vectors and matrices 1 Introduction, 3 2 Sets, 3 3 Matrices: addition and multiplication, 4 4 The transpose of a matrix, 5 5 Square matrices, 6 6 Linear forms and quadratic forms, 7 7 The rank ofa matrix, 8 8 The inverse, 9 9 The determinant, 9 10 The trace, 10 11 Partitioned matrices, 11 12 Complex matrices, 13 13 cigenvalues and eigenvectors, 13 14 Schur's decomposition theorem, 16 15 The Jordan decomposition, 17 16 The singular-value decomposition, 18 17 Furt her results concerning eigenvalues, 19 18 Positive (semi)definite matrices, 21 19 Threefurther resultsfor positive definite matrices, 23 20 A useful result, 24 Miscellaneous exercises, 25 Bibiliographical notes, 26 2 Kronecker products, the vec Operator and the Moore-Penrose inverse 1 Introduction, 27 2 The Kronecker product, 27

3 Eigenvalues of a Kronecker product, 28 4 The vec Operator, 30 5 The Moore-Penrose (MP) inverse, 32 6 Existence and uniqueness of the MP inverse, 32 7 Some properties ofthe MP inverse, 33 8 Further properties, 34 9 The Solution of linear equation Systems, 36 Miscellaneous exercises, 38 Bibliographical notes, 39 3 Miscellaneous matrix results 1 Introduction, 40 2 77ie adjoint matrix, 40 3 Proof of Theorem 7, 41 4 7\vo results concerning bordered determinants, 43 5 77ie matrix equation AX = 0, 44 6 77ie Hadamard product, 45 7 77»e commutation matrix K mn, 46 8 77ie duplication matrix D n, 48 9 Relationship between D +i and D, I, 50 10 Relationship between D +l and D, II, 52 11 Conditions for a quadratic form to be positive (negative) subject linear constraints, 53 12 Necessary and sufficient conditions for r(a:b) = r{a) + r(b), 56 13 The bordered Gramian matrix, 57 14 The equations X 1 A + X 2 B' = G i,x l B = G 2, 60 Miscellaneous exercises, 62 Bibliographical notes, 62 Part Two Differentials: the theory 4 Mathematical preliminaries 1 Introduction, 65 2 Interior points and accumulation points, 65 3 Open and ciosed sets, 66 4 77ie Bolzano-Weierstrass theorem, 69 5 Functions, 70 6 77ie /imit of afunction, 70 7 Continuous functions and compactness, 71 8 Conuex sets, 72 9 Conuex and concave functions, 75 Bibliographical notes, 77

Contents vu 5 Differentials and differentiability 78 1 Introduction, 78 2 Continuity, 78 3 Differentiability and linear approximation, 80 4 The differential of a vector function, 82 5 Uniqueness of the differential, 84 6 Continuity of differentiable functions, 84 7 Partial derivatives, 85 8 Thefirst identification theorem, 87 9 Existence of the differential, I, 88 10 Existence of the differential, II, 89 11 Continuous differentiability, 91 12 The chain rule, 91 13 Cauchy invariance, 93 14 The mean-value theorem for real-valued functions, 93 15 Matrix functions, 94 16 Some remarks on notation, 96 Miscellaneous exercises, 98 Bibliographical notes, 98 6 The second differential 99 1 Introduction, 99 2 Second-order partial derivatives, 99 3 77ie Hessian matrix, 100 4 Twice differentiability and second-order approximation, 1, 101 5 Definition of twice differentiability, 102 6 The second differential, 103 7 (Column) symmetry of the Hessian matrix, 105 8 The second identification theorem, 107 9 Twice differentiability and second-order approximation, II, 108 10 Chain rule for Hessian matrices, 110 11 The analogue for second differentials, 111 12 Taylor's theorem for real-valued functions, 112 13 Higher-order differentials, 113 14 Matrix functions, 114 Bibliographical notes, 115 7 Static optimization 116 1 Introduction, 116 2 Unconstrained optimization, 116 3 The existence of absolute extrema, 118 4 Necessary conditions for a local minimum, 119

5 Sufficient conditions for a local minimum: first-derivative test, 121 6 Sufficient conditions for a local minimum: second-derivative test, 122 7 Characterization of differentiable convex functions, 124 8 Characterization of twice differentiable convex functions, 127 9 Sufficient conditions for an absolute minimum, 128 10 Monotonie transformations, 129 11 Optimization subjeet to constraints, 130 12 Necessary conditions for a local minimum under constraints, 131 13 Sufficient conditions for a local minimum under constraints, 135 14 Sufficient conditions for an absolute minimum under constraints, 139 15 v4 note on constraints in matrix form, 140 16 Economic interpretation of Lagrange multipliers, 141 Appendix: the implicit funetion theorem, 142 Bibliographical notes, 144 Part Three Differentials: the practice 8 Some important differentials 1 Introduction, 147 2 Fundamental rules of differential calculus, 147 3 The differential of a determinant, 149 4 The differential of an inverse, 151 5 The differential of the Moore-Penrose inverse, 152 6 The differential of the adjoint matrix, 155 7 On differentiating eigenvalues and eigenvectors, 157 8 The differential of eigenvalues and eigenvectors: the real Symmetrie case, 158 9 The differential of eigenvalues and eigenvectors: the general complex case, 161 10 Two alternative expressions for <U, 163 11 The second differential of the eigenvalue funetion, 166 12 Multiple eigenvalues, 167 Miscellaneous exercises, 167 Bibliographical notes, 169 9 First-order differentials and Jacobian matrices 1 Introduction, 170 2 Classification, 170 3 Bad notation, 171 4 Good notation, 173 5 Identification of Jacobian matrices, 174 6 The first identification table, 175

7 Partitioning of the derivative, 175 8 Scalar functions of a vector, 176 9 Scalar functions of a matrix, I: trace, 177 10 Scalar functions of a matrix, II: determinant, 178 11 Scalar functions of a matrix, 111: eigenvalue, 180 12 Two examples of vector functions, 181 13 Matrix functions, 182 14 Kronecker products, 184 15 Some other problems, 185 Bibliographical notes, 187 10 Second-order differentials and Hessian matrices 1 Introduction, 188 2 The Hessian matrix of a matrix function, 188 3 Identification of Hessian matrices, 189 4 77ie second identification table, 190 5 An explicit formula for the Hessian matrix, 191 6 Scalar functions, 192 7 Vector functions, 194 8 Matrix functions, I, 194 9 Matrix functions, II, 195 Part Four Inequalities 11 Inequalities 1 Introduction, 199 2 77ie Caudry-Sc/iwarz inequality, 199 3 Matrix analogues of the Cauchy-Schwarz inequality, 201 4 77ie theorem of the arithmetic and geometric means, 202 5 77ie Rayleigh quotient, 203 6 Concavity of X t, convexity of X, 204 7 Variational description of eigenvalues, 205 8 Fischefs min-max theorem, 206 9 Monotonicity of the eigenvalues, 208 10 77ie Poincare Separation theorem, 209 11 Two corollaries of Poincare's theorem, 210 12 Further consequences of the Poincare theorem, 211 13 Multiplicative version, 212 14 77je maximum of a bilinear form, 213 15 Hadamard's inequality, 214 16 /In interlude: Karamata's inequality, 215 17 Karamata's inequality applied to eigenvalues, 217

18 An inequality concerning positive semidefinite matrices 19 A representation theoremfor (Za p ) llp, 218 20 A representation theorem for (tr A") ilp, 219 21 Holder's inequality, 220 22 Concavity of log \A\, 222 23 Minkowskis inequality, 223 24 Quasilinear representation of\a\ { ", 224 25 Minkowski's determinant theorem, 227 26 Weighted means of order p, 227 27 Schlömilch's inequality, 229 28 Curvature properties of M {x, a), 230 29 Least Squares, 232 30 Generalized least Squares, 233 31 Restricted least Squares, 233 32 Restricted least Squares: matrix Version, 235 Miscellaneous exercises, 236 Bibliographical notes, 240 Part Five The linear model 12 Statistical preliminaries 1 Introduction, 243 2 77ie cumulative distribution function, 243 3 77ie yoim density function, 244 4 Expectations, 244 5 Variance and covariance, 245 6 Independence oftwo random variables (vectors), 247 7 Independence of n random variables (vectors), 249 8 Sampling, 249 9 77ie one-dimensional normal distribution, 249 10 The multivariate normal distribution, 250 11 Estimation, 252 Miscellaneous exercises, 253 Bibliographical notes, 253 13 The linear regression model 1 Introduction, 254 2 Affine minimum-trace unbiased estimation, 255 3 77ie Gauss-Markov theorem, 256 4 77ie method of least Squares, 258 5 Aitken's theorem, 259 6 Multicollinearity, 261 7 Estimable functions, 263

Contents XI 8 Linear constraints: the case M{R')<^J1 {X'\ 264 9 Linear constraints: the general case, 267 10 Linear constraints: the case Jt(R')c^J((X') = {0}, 270 11 A singular variance matrix: the case J((X)<^M(V), 271 12 A singular variance matrix: the case r(x' V + X) = r(x), 273 13 A singular variance matrix: the general case, I, 214 14 Explicit and implicit linear constraints, 275 15 The general linear model, I, 277 16 A singular variance matrix: the general case, II, 278 17 The general linear model, II, 281 18 Generalized least Squares, 282 19 Restricted least Squares, 283 Misceilaneous exercises, 285 Bibliographical notes, 286 14 Further topics in the linear model 287 1 Introduction, 287 2 Best quadratic unbiased estimation of a 2, 287 3 77ie best quadratic and positive unbiased estimator of a 2, 288 4 The best quadratic unbiased estimator ofa 2, 290-5 Best quadratic invariant estimation ofa 2, 292 6 The best quadratic and positive invariant estimator of a 2, 293 7 The best quadratic invariant estimator ofa 2, 294 8 Best quadratic unbiased estimation in the multivariate normal case, 295 9 Boundsfor the bias of the least Squares estimator ofa 2,1, 297 10 Boundsfor the bias of the least Squares estimator ofa 2, II, 299 11 The prediction of disturbances, 300 12 Predictors that are best linear unbiased with scalar variance matrix (BLUS), 301 13 Predictors that are best linear unbiased with fixed variance matrix (BLUF), l, 303 14 Predictors that are best linear unbiased with fixed variance matrix (BLUF), II, 305 15 Local sensitivity of the posterior mean, 306 16 Local sensitivity of the posterior precision, 308 Bibliographical notes, 309 Part Six Applications to maximum likelihood estimation 15 Maximum likelihood estimation 313 1 Introduction, 313 2 The method of maximum likelihood (ML), 313 3 ML estimation of the multivariate normal distribution, 314

Xll 4 Implicit versus explicit treatment of symmetry, 316 5 The treatment of positive definiteness, 317 6 The information matrix, 317 7 ML estimation of the multivariate normal distribution with distinct means, 319 8 The multivariate linear regression model, 320 9 The errors-in-variables model, 322 10 The nonlinear regression model with normal errors, 324 11 A special case: functional independence of mean parameters and variance parameters, 326 12 Generalization of Theorem 6, 327 Miscellaneous exercises, 329 Bibliographical notes, 330 16 Simultaneous equations 1 Introduction, 331 2 The simultaneous equations model, 331 3 The Identification problem, 333 4 Identification with linear constraints on B and V only, 334 5 Identification with linear constraints on B, Y and L, 335 6 Nonlinear constraints, 337 7 Full-information maximum likelihood (FIML): the information matrix (general case), 337 8 Full-information maximum likelihood (FIML): the asymptotic variance matrix (special case), 339 9 Limited-information maximum likelihood (LIML): the first-order conditions, 342 10 Limited-information maximum likelihood (LIML): the information matrix, 344 11 Limited-information maximum likelihood (LIML): the asymptotic variance matrix, 346 Bibliographical notes, 351 17 Topics in psychometrics 1 Introduction, 352 2 Population principal components, 353 3 Optimality of principal components, 353 4 A related result, 355 5 Sample principal components, 356 6 Optimality of sample principal components, 358 7 Sample analogue of Theorem 3, 358 8 One-mode component analysis, 358

Contents 9 Relationship between one-mode component analysis and sample principal components, 361 10 Two-mode component analysis, 362 11 Multimode component analysis, 363 12 Factor analysis, 366 13 A zigzag routine, 369 14 A Newton-Raphson routine, 370 15 Kaiser's varimax method, 373 16 Canonical correlations and variates in the population, 376 Bibliographical notes, 378 Bibliography Index of Symbols Subject Index