INFORMATION THEORY AND STATISTICS

INFORMATION THEORY AND STATISTICS
Solomon Kullback
DOVER PUBLICATIONS, INC.
Mineola, New York

Contents

1 DEFINITION OF INFORMATION
  1 Introduction
  2 Definition
  3 Divergence
  4 Examples
  5 Problems

2 PROPERTIES OF INFORMATION
  1 Introduction
  2 Additivity
  3 Convexity
  4 Invariance
  5 Divergence
  6 Fisher's information
  7 Information and sufficiency
  8 Problems

3 INEQUALITIES OF INFORMATION THEORY
  1 Introduction
  2 Minimum discrimination information
  3 Sufficient statistics
  4 Exponential family
  5 Neighboring parameters
  6 Efficiency
  7 Problems

4 LIMITING PROPERTIES
  1 Introduction
  2 Limiting properties
  3 Type I and type II errors
  4 Problems

5 INFORMATION STATISTICS
  1 Estimate of I(*:2)
  2 Classification
  3 Testing hypotheses
  4 Discussion
  5 Asymptotic properties
  6 Estimate of J(*, 2)
  7 Problems

6 MULTINOMIAL POPULATIONS
  1 Introduction
  2 Background
  3 Conjugate distributions
  4 Single sample
    4.1 Basic problem
    4.2 Analysis of I(*:2; O_N)
    4.3 Parametric case
    4.4 "One-sided" binomial hypothesis
    4.5 "One-sided" multinomial hypotheses
      4.5.1 Summary
      4.5.2 Illustrative values
  5 Two samples
    5.1 Basic problem
    5.2 "One-sided" hypothesis for the binomial
  6 r samples
    6.1 Basic problem
    6.2 Partition
    6.3 Parametric case
  7 Problems

7 POISSON POPULATIONS
  1 Background
  2 Conjugate distributions
  3 r samples
    3.1 Basic problem
    3.2 Partition
  4 "One-sided" hypothesis, single sample
  5 "One-sided" hypothesis, two samples
  6 Problems

8 CONTINGENCY TABLES
  1 Introduction
  2 Two-way tables
  3 Three-way tables
    3.1 Independence of the three classifications
    3.2 Row classification independent of the other classifications
    3.3 Independence hypotheses
    3.4 Conditional independence
    3.5 Further analysis
  4 Homogeneity of two-way tables
  5 Conditional homogeneity
  6 Homogeneity
  7 Interaction
  8 Negative interaction
  9 Partitions
  10 Parametric case
  11 Symmetry
  12 Examples
  13 Problems

9 MULTIVARIATE NORMAL POPULATIONS
  1 Introduction
  2 Components of information
  3 Canonical form
  4 Linear discriminant functions
  5 Equal covariance matrices
  6 Principal components
  7 Canonical correlation
  8 Covariance variates
  9 General case
  10 Problems

10 THE LINEAR HYPOTHESIS
  1 Introduction
  2 Background
  3 The linear hypothesis
  4 The minimum discrimination information statistic
  5 Subhypotheses
    5.1 Two-partition subhypothesis
    5.2 Three-partition subhypothesis
  6 Analysis of regression: one-way classification, k categories
  7 Two-partition subhypothesis
    7.1 One-way classification, k categories
    7.2 Carter's regression case
  8 Example
  9 Reparametrization
    9.1 Hypotheses not of full rank
    9.2 Partition
  10 Analysis of regression, two-way classification
  11 Problems

11 MULTIVARIATE ANALYSIS; THE MULTIVARIATE LINEAR HYPOTHESIS
  1 Introduction
  2 Background
  3 The multivariate linear hypothesis
    3.1 Specification
    3.2 Linear discriminant function
  4 The minimum discrimination information statistic
  5 Subhypotheses
    5.1 Two-partition subhypothesis
    5.2 Three-partition subhypothesis
  6 Special cases
    6.1 Hotelling's generalized Student ratio (Hotelling's T²)
    6.2 Centering
    6.3 Homogeneity of r samples
    6.4 r samples with covariance
      6.4.1 Test of regression
      6.4.2 Test of homogeneity of means and regression
      6.4.3 Test of homogeneity, assuming regression
  7 Canonical correlation
  8 Linear discriminant functions
    8.1 Homogeneity of r samples
    8.2 Canonical correlation
    8.3 Hotelling's generalized Student ratio (Hotelling's T²)
  9 Examples
    9.1 Homogeneity of sample means
    9.2 Canonical correlation
    9.3 Subhypothesis
  10 Reparametrization
    10.1 Hypotheses not of full rank
    10.2 Partition
  11 Remark
  12 Problems

12 MULTIVARIATE ANALYSIS: OTHER HYPOTHESES
  1 Introduction
  2 Background
  3 Single sample
    3.1 Homogeneity of the sample
    3.2 The hypothesis that a k-variate normal population has a specified covariance matrix
    3.3 The hypothesis of independence
    3.4 Hypothesis on the correlation matrix
    3.5 Linear discriminant function
    3.6 Independence of sets of variates
    3.7 Independence and equality of variances
  4 Homogeneity of means
    4.1 Two samples
    4.2 Linear discriminant function
    4.3 r samples
  5 Homogeneity of covariance matrices
    5.1 Two samples
    5.2 Linear discriminant function
    5.3 r samples
    5.4 Correlation matrices
  6 Asymptotic distributions
    6.1 Homogeneity of covariance matrices
    6.2 Single sample
    6.3 The hypothesis of independence
    6.4 Roots of determinantal equations
  7 Stuart's test for homogeneity of the marginal distributions in a two-way classification
    7.1 A multivariate normal hypothesis
    7.2 The contingency table problem
  8 Problems

13 LINEAR DISCRIMINANT FUNCTIONS
  1 Introduction
  2 Iteration
  3 Example
  4 Remark
  5 Other linear discriminant functions
  6 Comparison of the various linear discriminant functions
  7 Problems

REFERENCES
TABLE I. Log_e n and n log_e n for values of n from 1 through 1000
TABLE II. F(p_1, p_2) = p_1 log(p_1/p_2) + q_1 log(q_1/q_2), p_1 + q_1 = 1 = p_2 + q_2
TABLE III. Noncentral χ²
GLOSSARY
APPENDIX
INDEX