INFORMATION THEORY AND STATISTICS
Solomon Kullback
DOVER PUBLICATIONS, INC.
Mineola, New York

Contents

1 DEFINITION OF INFORMATION
  1 Introduction 1
  2 Definition 3
  3 Divergence 6
  4 Examples 7
  5 Problems 10

2 PROPERTIES OF INFORMATION
  1 Introduction 12
  2 Additivity 12
  3 Convexity 14
  4 Invariance 18
  5 Divergence 22
  6 Fisher's information 26
  7 Information and sufficiency 28
  8 Problems 31

3 INEQUALITIES OF INFORMATION THEORY
  1 Introduction 36
  2 Minimum discrimination information 36
  3 Sufficient statistics 43
  4 Exponential family 45
  5 Neighboring parameters 55
  6 Efficiency 63
  7 Problems 66

4 LIMITING PROPERTIES
  1 Introduction 70
  2 Limiting properties 70
  3 Type I and type II errors 74
  4 Problems 78

5 INFORMATION STATISTICS
  1 Estimate of $I(*:2)$ 81
  2 Classification 83
  3 Testing hypotheses 85
  4 Discussion 94
  5 Asymptotic properties 97
  6 Estimate of $J(*, 2)$ 106
  7 Problems 107

6 MULTINOMIAL POPULATIONS
  1 Introduction 109
  2 Background 110
  3 Conjugate distributions 111
  4 Single sample 112
    4.1 Basic problem 112
    4.2 Analysis of $I(*:2; O_N)$ 114
    4.3 Parametric case 117
    4.4 "One-sided" binomial hypothesis 119
    4.5 "One-sided" multinomial hypotheses 121
      4.5.1 Summary 125
      4.5.2 Illustrative values 125
  5 Two samples 128
    5.1 Basic problem 128
    5.2 "One-sided" hypothesis for the binomial 131
  6 r samples 134
    6.1 Basic problem 134
    6.2 Partition 136
    6.3 Parametric case 139
  7 Problems 140

7 POISSON POPULATIONS
  1 Background 142
  2 Conjugate distributions 143
  3 r samples 144
    3.1 Basic problem 144
    3.2 Partition 146
  4 "One-sided" hypothesis, single sample 148
  5 "One-sided" hypothesis, two samples 151
  6 Problems 153

8 CONTINGENCY TABLES
  1 Introduction 155
  2 Two-way tables 155
  3 Three-way tables 159
    3.1 Independence of the three classifications 160
    3.2 Row classification independent of the other classifications 162
    3.3 Independence hypotheses 165
    3.4 Conditional independence 166
    3.5 Further analysis 167
  4 Homogeneity of two-way tables 168
  5 Conditional homogeneity 169
  6 Homogeneity 170
  7 Interaction 171
  8 Negative interaction 172
  9 Partitions 173
  10 Parametric case 176
  11 Symmetry 177
  12 Examples 179
  13 Problems 186

9 MULTIVARIATE NORMAL POPULATIONS
  1 Introduction 189
  2 Components of information 191
  3 Canonical form 194
  4 Linear discriminant functions 196
  5 Equal covariance matrices 196
  6 Principal components 197
  7 Canonical correlation 200
  8 Covariance variates 204
  9 General case 205
  10 Problems 207

10 THE LINEAR HYPOTHESIS
  1 Introduction 211
  2 Background 211
  3 The linear hypothesis 212
  4 The minimum discrimination information statistic 212
  5 Subhypotheses 214
    5.1 Two-partition subhypothesis 214
    5.2 Three-partition subhypothesis 218
  6 Analysis of regression: one-way classification, k categories 219
  7 Two-partition subhypothesis 225
    7.1 One-way classification, k categories 225
    7.2 Carter's regression case 229
  8 Example 231
  9 Reparametrization 236
    9.1 Hypotheses not of full rank 236
    9.2 Partition 238
  10 Analysis of regression, two-way classification 239
  11 Problems 251

11 MULTIVARIATE ANALYSIS; THE MULTIVARIATE LINEAR HYPOTHESIS
  1 Introduction 253
  2 Background 253
  3 The multivariate linear hypothesis 253
    3.1 Specification 253
    3.2 Linear discriminant function 254
  4 The minimum discrimination information statistic 255
  5 Subhypotheses 257
    5.1 Two-partition subhypothesis 257
    5.2 Three-partition subhypothesis 260
  6 Special cases 261
    6.1 Hotelling's generalized Student ratio (Hotelling's $T^2$) 261
    6.2 Centering 262
    6.3 Homogeneity of r samples 264
    6.4 r samples with covariance 268
      6.4.1 Test of regression 268
      6.4.2 Test of homogeneity of means and regression 272
      6.4.3 Test of homogeneity, assuming regression 273
  7 Canonical correlation 275
  8 Linear discriminant functions 276
    8.1 Homogeneity of r samples 276
    8.2 Canonical correlation 277
    8.3 Hotelling's generalized Student ratio (Hotelling's $T^2$) 279
  9 Examples 279
    9.1 Homogeneity of sample means 280
    9.2 Canonical correlation 281
    9.3 Subhypothesis 284
  10 Reparametrization 289
    10.1 Hypotheses not of full rank 289
    10.2 Partition 294
  11 Remark 294
  12 Problems 295

12 MULTIVARIATE ANALYSIS: OTHER HYPOTHESES
  1 Introduction 297
  2 Background 297
  3 Single sample 299
    3.1 Homogeneity of the sample 299
    3.2 The hypothesis that a k-variate normal population has a specified covariance matrix 302
    3.3 The hypothesis of independence 303
    3.4 Hypothesis on the correlation matrix 304
    3.5 Linear discriminant function 304
    3.6 Independence of sets of variates 306
    3.7 Independence and equality of variances 307
  4 Homogeneity of means 309
    4.1 Two samples 309
    4.2 Linear discriminant function 311
    4.3 r samples 311
  5 Homogeneity of covariance matrices 315
    5.1 Two samples 315
    5.2 Linear discriminant function 317
    5.3 r samples 318
    5.4 Correlation matrices 320
  6 Asymptotic distributions 324
    6.1 Homogeneity of covariance matrices 324
    6.2 Single sample 328
    6.3 The hypothesis of independence 329
    6.4 Roots of determinantal equations 330
  7 Stuart's test for homogeneity of the marginal distributions in a two-way classification 333
    7.1 A multivariate normal hypothesis 333
    7.2 The contingency table problem 334
  8 Problems 334

13 LINEAR DISCRIMINANT FUNCTIONS
  1 Introduction 342
  2 Iteration 342
  3 Example 344
  4 Remark 347
  5 Other linear discriminant functions 348
  6 Comparison of the various linear discriminant functions 350
  7 Problems 352

REFERENCES 353
TABLE I. $\log_e n$ and $n \log_e n$ for values of $n$ from 1 through 1000 367
TABLE II. $F(p_1, p_2) = p_1 \log \frac{p_1}{p_2} + q_1 \log \frac{q_1}{q_2}$, $p_1 + q_1 = 1 = p_2 + q_2$ 378
TABLE III. Noncentral $\chi^2$ 380
GLOSSARY 381
APPENDIX 389
INDEX 393