Karl-Rudolf Koch Introduction to Bayesian Statistics Second Edition

Karl-Rudolf Koch
Introduction to Bayesian Statistics
Second, updated and enlarged Edition
With 17 Figures

Professor Dr.-Ing., Dr.-Ing. E.h. mult. Karl-Rudolf Koch (em.)
University of Bonn
Institute of Theoretical Geodesy
Nussallee 17
53115 Bonn
E-mail: koch@geod.uni-bonn.de

Library of Congress Control Number: 2007929992

ISBN 978-3-540-72723-1 Springer Berlin Heidelberg New York
ISBN (1st edition) 978-3-540-66670-7 Einführung in Bayes-Statistik

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are liable to prosecution under the German Copyright Law.

Springer is a part of Springer Science+Business Media
springer.com

© Springer-Verlag Berlin Heidelberg 2007

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Cover design: deblik, Berlin
Production: Almas Schimmel
Typesetting: camera-ready by the author
Printed on acid-free paper
30/3180/as 5 4 3 2 1 0

Preface to the Second Edition

This is the second and translated edition of the German book Einführung in die Bayes-Statistik, Springer-Verlag, Berlin Heidelberg New York, 2000. It has been completely revised, and numerous new developments are pointed out together with the relevant literature. Chapter 5.2.4 is extended by the stochastic trace estimation for variance components. The new Chapter 5.2.6 presents the estimation of the regularization parameter of Tikhonov-type regularization for inverse problems as the ratio of two variance components. The reconstruction and smoothing of digital three-dimensional images is demonstrated in the new Chapter 5.3. Chapter 6.2.1 on importance sampling for Monte Carlo integration has been rewritten to solve a more general integral; this chapter also contains the derivation of the SIR (sampling-importance-resampling) algorithm as an alternative to the rejection method for generating random samples. Markov Chain Monte Carlo methods are now frequently applied in Bayesian statistics, and the first of these methods, the Metropolis algorithm, is therefore presented in the new Chapter 6.3.1. The kernel method is introduced in Chapter 6.3.3 to estimate density functions for unknown parameters and is used for the example of Chapter 6.3.6. Finally, as a special application of the Gibbs sampler, the computation and propagation of large covariance matrices is derived in the new Chapter 6.3.5.

I want to express my gratitude to Mrs. Brigitte Gundlich, Dr.-Ing., and to Mr. Boris Kargoll, Dipl.-Ing., for their suggestions to improve the book. I would also like to mention the good cooperation with Dr. Chris Bendall of Springer-Verlag.

Bonn, March 2007
Karl-Rudolf Koch
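The Metropolis algorithm mentioned above is treated in Chapter 6.3.1. As a brief orientation, here is a minimal sketch of a random-walk Metropolis sampler in Python. It is an added illustration, not code from the book: the target density (a standard normal), the step size and all names are assumptions chosen only to show the accept/reject mechanism.

    import math
    import random

    def log_target(theta):
        # log of an unnormalized target density; a standard normal is assumed here
        return -0.5 * theta * theta

    def metropolis(n_samples, start=0.0, step=1.0):
        # random-walk Metropolis: symmetric proposal, accept with probability min(1, density ratio)
        samples, theta, log_p = [], start, log_target(start)
        for _ in range(n_samples):
            proposal = theta + random.gauss(0.0, step)
            log_p_new = log_target(proposal)
            if random.random() < math.exp(min(0.0, log_p_new - log_p)):
                theta, log_p = proposal, log_p_new
            samples.append(theta)
        return samples

    draws = metropolis(10_000)
    print(sum(draws) / len(draws))  # posterior mean estimate, near 0 for the assumed target

The successive draws form a Markov chain whose stationary distribution is the target; estimates such as the mean above are then computed from the chain, as described in Chapter 6.3.3.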

Preface to the First German Edition

This book is intended to serve as an introduction to Bayesian statistics, which is founded on Bayes' theorem. By means of this theorem it is possible to estimate unknown parameters, to establish confidence regions for the unknown parameters and to test hypotheses for the parameters. This simple approach cannot be taken by traditional statistics, since it does not start from Bayes' theorem. In this respect Bayesian statistics has an essential advantage over traditional statistics.

The book addresses readers who face the task of statistical inference on unknown parameters of complex systems, i.e. who have to estimate unknown parameters, to establish confidence regions and to test hypotheses for these parameters. Effective use of the book merely requires a basic background in analysis and linear algebra. However, since a short introduction to one-dimensional random variables with their probability distributions is followed by the introduction of multidimensional random variables, some knowledge of one-dimensional statistics will be helpful. It will also be an advantage for the reader to be familiar with the issues of estimating parameters, although the methods are illustrated here with many examples.

Bayesian statistics extends the notion of probability by defining the probability for statements or propositions, whereas traditional statistics generally restricts itself to the probability of random events resulting from random experiments. By logical and consistent reasoning, three laws can be derived for the probability of statements, from which all further laws of probability may be deduced. This is explained in Chapter 2. That chapter also contains the derivation of Bayes' theorem and of the probability distributions for random variables. Thereafter, the univariate and multivariate distributions required further along in the book are collected, though without derivation. Prior density functions for Bayes' theorem are discussed at the end of the chapter.

Chapter 3 shows how Bayes' theorem leads to estimating unknown parameters, to establishing confidence regions and to testing hypotheses for the parameters. These methods are then applied in the linear model covered in Chapter 4. Cases are considered where the variance factor contained in the covariance matrix of the observations is either known or unknown, where informative or noninformative priors are available, and where the linear model is of full rank or not of full rank. Estimation of parameters robust with respect to outliers and the Kalman filter are also derived.

Special models and methods are given in Chapter 5, including the model of prediction and filtering, the linear model with unknown variance and covariance components, the problem of pattern recognition and the segmentation of digital images. In addition, Bayesian networks are developed for decisions in systems with uncertainties; they are, for instance, applied to the automatic interpretation of digital images.

If it is not possible to solve analytically the integrals needed for estimating parameters, for establishing confidence regions and for testing hypotheses, then numerical techniques have to be used. The two most important ones, Monte Carlo integration and Markov Chain Monte Carlo methods, are presented in Chapter 6.

Illustrative examples have been added throughout. The end of each example is indicated by a special symbol, and the examples are numbered within a chapter where necessary.

For estimating parameters in linear models, traditional statistics can rely on methods which are simpler than those of Bayesian statistics; they are used here to derive necessary results. Thus, the techniques of traditional statistics and of Bayesian statistics are not treated separately, as is often the case, for instance in two of the author's books, Parameter Estimation and Hypothesis Testing in Linear Models, 2nd Ed., Springer-Verlag, Berlin Heidelberg New York, 1999, and Bayesian Inference with Geodetic Applications, Springer-Verlag, Berlin Heidelberg New York, 1990. By applying Bayesian statistics with additions from traditional statistics, the aim here is to derive methods for statistical inference on parameters as simply and as clearly as possible.

Discussions with colleagues provided valuable suggestions that I am grateful for. My appreciation also goes to those students of our university who contributed ideas for improving this book. Equally, I would like to express my gratitude to my colleagues and staff of the Institute of Theoretical Geodesy who assisted in preparing it. My special thanks go to Mrs. Brigitte Gundlich, Dipl.-Ing., for various suggestions concerning this book and to Mrs. Ingrid Wahl for typesetting and formatting the text. Finally, I would like to thank the publisher for valuable input.

Bonn, August 1999
Karl-Rudolf Koch
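For orientation (this display is added here and is not part of the original preface): the theorem around which the book is organized can be written, in generic notation for unknown parameters x with prior density p(x), data y and likelihood p(y | x), as

    p(x \mid y) \;=\; \frac{p(y \mid x)\, p(x)}{p(y)} \;\propto\; p(y \mid x)\, p(x),
    \qquad
    p(y) = \int p(y \mid x)\, p(x)\, \mathrm{d}x .

The book derives this result in Chapter 2.1.8 and its generalized form for random variables in Chapter 2.2.8; the posterior density p(x | y) is the quantity from which Chapters 3 and 4 obtain point estimates, confidence regions and hypothesis tests.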

Contents

1 Introduction

2 Probability
2.1 Rules of Probability
2.1.1 Deductive and Plausible Reasoning
2.1.2 Statement Calculus
2.1.3 Conditional Probability
2.1.4 Product Rule and Sum Rule of Probability
2.1.5 Generalized Sum Rule
2.1.6 Axioms of Probability
2.1.7 Chain Rule and Independence
2.1.8 Bayes' Theorem
2.1.9 Recursive Application of Bayes' Theorem
2.2 Distributions
2.2.1 Discrete Distribution
2.2.2 Continuous Distribution
2.2.3 Binomial Distribution
2.2.4 Multidimensional Discrete and Continuous Distributions
2.2.5 Marginal Distribution
2.2.6 Conditional Distribution
2.2.7 Independent Random Variables and Chain Rule
2.2.8 Generalized Bayes' Theorem
2.3 Expected Value, Variance and Covariance
2.3.1 Expected Value
2.3.2 Variance and Covariance
2.3.3 Expected Value of a Quadratic Form
2.4 Univariate Distributions
2.4.1 Normal Distribution
2.4.2 Gamma Distribution
2.4.3 Inverted Gamma Distribution
2.4.4 Beta Distribution
2.4.5 χ²-Distribution
2.4.6 F-Distribution
2.4.7 t-Distribution
2.4.8 Exponential Distribution
2.4.9 Cauchy Distribution
2.5 Multivariate Distributions
2.5.1 Multivariate Normal Distribution
2.5.2 Multivariate t-Distribution
2.5.3 Normal-Gamma Distribution
2.6 Prior Density Functions
2.6.1 Noninformative Priors
2.6.2 Maximum Entropy Priors
2.6.3 Conjugate Priors

3 Parameter Estimation, Confidence Regions and Hypothesis Testing
3.1 Bayes Rule
3.2 Point Estimation
3.2.1 Quadratic Loss Function
3.2.2 Loss Function of the Absolute Errors
3.2.3 Zero-One Loss
3.3 Estimation of Confidence Regions
3.3.1 Confidence Regions
3.3.2 Boundary of a Confidence Region
3.4 Hypothesis Testing
3.4.1 Different Hypotheses
3.4.2 Test of Hypotheses
3.4.3 Special Priors for Hypotheses
3.4.4 Test of the Point Null Hypothesis by Confidence Regions

4 Linear Model
4.1 Definition and Likelihood Function
4.2 Linear Model with Known Variance Factor
4.2.1 Noninformative Priors
4.2.2 Method of Least Squares
4.2.3 Estimation of the Variance Factor in Traditional Statistics
4.2.4 Linear Model with Constraints in Traditional Statistics
4.2.5 Robust Parameter Estimation
4.2.6 Informative Priors
4.2.7 Kalman Filter
4.3 Linear Model with Unknown Variance Factor
4.3.1 Noninformative Priors
4.3.2 Informative Priors
4.4 Linear Model not of Full Rank
4.4.1 Noninformative Priors
4.4.2 Informative Priors

5 Special Models and Applications
5.1 Prediction and Filtering
5.1.1 Model of Prediction and Filtering as Special Linear Model
5.1.2 Special Model of Prediction and Filtering
5.2 Variance and Covariance Components
5.2.1 Model and Likelihood Function
5.2.2 Noninformative Priors
5.2.3 Informative Priors
5.2.4 Variance Components
5.2.5 Distributions for Variance Components
5.2.6 Regularization
5.3 Reconstructing and Smoothing of Three-dimensional Images
5.3.1 Positron Emission Tomography
5.3.2 Image Reconstruction
5.3.3 Iterated Conditional Modes Algorithm
5.4 Pattern Recognition
5.4.1 Classification by Bayes Rule
5.4.2 Normal Distribution with Known and Unknown Parameters
5.4.3 Parameters for Texture
5.5 Bayesian Networks
5.5.1 Systems with Uncertainties
5.5.2 Setup of a Bayesian Network
5.5.3 Computation of Probabilities
5.5.4 Bayesian Network in Form of a Chain
5.5.5 Bayesian Network in Form of a Tree
5.5.6 Bayesian Network in Form of a Polytree

6 Numerical Methods
6.1 Generating Random Values
6.1.1 Generating Random Numbers
6.1.2 Inversion Method
6.1.3 Rejection Method
6.1.4 Generating Values for Normally Distributed Random Variables
6.2 Monte Carlo Integration
6.2.1 Importance Sampling and SIR Algorithm
6.2.2 Crude Monte Carlo Integration
6.2.3 Computation of Estimates, Confidence Regions and Probabilities for Hypotheses
6.2.4 Computation of Marginal Distributions
6.2.5 Confidence Region for Robust Estimation of Parameters as Example
6.3 Markov Chain Monte Carlo Methods
6.3.1 Metropolis Algorithm
6.3.2 Gibbs Sampler
6.3.3 Computation of Estimates, Confidence Regions and Probabilities for Hypotheses
6.3.4 Computation of Marginal Distributions
6.3.5 Gibbs Sampler for Computing and Propagating Large Covariance Matrices
6.3.6 Continuation of the Example: Confidence Region for Robust Estimation of Parameters

References
Index
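The first preface names Monte Carlo integration (Chapter 6.2) as one of the two key numerical techniques when the posterior integrals cannot be solved analytically. The following minimal Python sketch, added here for illustration and not taken from the book, estimates a posterior expectation by averaging over random draws; the integrand and the sampling density are assumptions chosen only to show the principle of crude Monte Carlo integration (Chapter 6.2.2).

    import random

    def g(x):
        # function whose expectation E[g(X)] is wanted; x**2 is chosen for illustration
        return x * x

    n = 100_000
    total = 0.0
    for _ in range(n):
        x = random.gauss(0.0, 1.0)  # draw from the assumed density, a standard normal
        total += g(x)
    print(total / n)  # crude Monte Carlo estimate; approximately 1 for this choice

Chapter 6.2 refines this idea with importance sampling and the SIR algorithm, and Chapter 6.3 replaces independent draws by Markov chains.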