National Sun Yat-Sen University CSE Course: Information Theory. Maximum Entropy and Spectral Estimation


Maximum Entropy and Spectral Estimation

Introduction
What is the distribution of velocities in a gas at a given temperature? It is the Maxwell-Boltzmann distribution. The maximum entropy distribution corresponds to the macrostate with the largest number of microstates. Implicitly it is assumed that all microstates are equally probable, an assumption in the spirit of the AEP.

Maximum Entropy Problem and Solution
Problem: find the density $f^*(x)$ with maximal entropy $h(f)$ over the set of densities satisfying the following constraints:
$f(x) \ge 0$, $\int f(x)\,dx = 1$, and $E[r_i(X)] = \int f(x) r_i(x)\,dx = \alpha_i$ for $i = 1, \ldots, m$.
Solution: $f^*(x) = e^{\lambda_0 + \sum_{i=1}^m \lambda_i r_i(x)}$.

Proof: for any $g(x)$ satisfying the constraints,
\begin{align*}
h(g) &= -\int g(x) \log g(x)\,dx \\
     &= -\int g(x) \log\!\Big(\frac{g(x)}{f^*(x)}\, f^*(x)\Big)\,dx \\
     &= -D(g \,\|\, f^*) - \int g(x) \log f^*(x)\,dx \\
     &\le -\int g(x) \log f^*(x)\,dx \\
     &= -\int g(x)\Big(\lambda_0 + \sum_{i=1}^m \lambda_i r_i(x)\Big)\,dx \\
     &= -\int f^*(x)\Big(\lambda_0 + \sum_{i=1}^m \lambda_i r_i(x)\Big)\,dx \\
     &= -\int f^*(x) \log f^*(x)\,dx = h(f^*),
\end{align*}
where the second-to-last equality holds because $g$ and $f^*$ satisfy the same constraints, so the $r_i$ have the same expectations under both.

Examples
The constraint functions $r_i(x)$ appear in the exponent of $f^*$, and the $\lambda_i$ are determined by the constraints.
Dice, no constraints: $p(i) = \text{const}$.
Dice, $EX = \alpha$: $p(i) \propto e^{\lambda i}$.
$S = [0, \infty)$, $EX = \mu$: $f(x) \propto e^{\lambda x}$ (worked out below).
$S = (-\infty, \infty)$, $EX = \mu$, $EX^2 = \beta$: $f(x) \propto e^{\lambda_1 x + \lambda_2 x^2}$, the Gaussian $N(\mu, \beta - \mu^2)$.
Multivariate, $EX_i X_j = K_{ij}$ for $1 \le i, j \le n$: $f(x) \propto e^{\sum_{i,j} \lambda_{ij} x_i x_j}$ (Theorem 9.6.5).
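As an illustration of how the constraints pin down the $\lambda$'s (a worked example added here, not part of the original slides), take the case $S = [0, \infty)$ with $EX = \mu$ and $f^*(x) = e^{\lambda_0 + \lambda_1 x}$:

$\int_0^\infty e^{\lambda_0 + \lambda_1 x}\,dx = 1 \;\Rightarrow\; e^{\lambda_0} = -\lambda_1 \ (\lambda_1 < 0), \qquad \int_0^\infty x\, e^{\lambda_0 + \lambda_1 x}\,dx = \mu \;\Rightarrow\; -\frac{1}{\lambda_1} = \mu,$

so $\lambda_1 = -1/\mu$, $e^{\lambda_0} = 1/\mu$, and $f^*(x) = \frac{1}{\mu} e^{-x/\mu}$, the exponential density with mean $\mu$.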

Spectral Estimation
Let $\{X_i\}$ be a zero-mean stationary stochastic process. The autocorrelation function is defined by $R[k] = E X_i X_{i+k}$. The power spectral density is the Fourier transform of $R[k]$:
$S(\lambda) = \sum_k R[k] e^{-i\lambda k}, \quad -\pi \le \lambda \le \pi.$
Given a sample of the process we can estimate $R[k]$, and hence estimate the spectral density.
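A rough sketch of this estimation step (not from the slides; the function names and the choice to truncate at a fixed maximum lag are my own, and NumPy is assumed):

    import numpy as np

    def sample_autocorrelation(x, max_lag):
        # Biased sample estimate of R[k] = E[X_i X_{i+k}] for a zero-mean sample x.
        n = len(x)
        return np.array([np.dot(x[:n - k], x[k:]) / n for k in range(max_lag + 1)])

    def spectral_estimate(x, max_lag, grid=512):
        # Estimate S(lambda) = sum_k R[k] e^{-i lambda k} on lambda in [-pi, pi],
        # truncating the sum at |k| <= max_lag and using R[-k] = R[k].
        R = sample_autocorrelation(x, max_lag)
        lams = np.linspace(-np.pi, np.pi, grid)
        S = R[0] + 2 * sum(R[k] * np.cos(k * lams) for k in range(1, max_lag + 1))
        return lams, S

    # Example: white Gaussian noise should give a roughly flat spectrum near 1.
    x = np.random.default_rng(0).standard_normal(4096)
    lams, S = spectral_estimate(x, max_lag=50)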

Differential Entropy Rates
The differential entropy rate of a stochastic process is defined by
$h(\mathcal{X}) = \lim_{n \to \infty} \frac{h(X_1, \ldots, X_n)}{n},$
if the limit exists. For a stationary process, the limit exists. Furthermore,
$h(\mathcal{X}) = \lim_{n \to \infty} h(X_n \mid X_{n-1}, \ldots, X_1).$

Gaussian Processes
A Gaussian process is characterized by the property that any collection of random variables in the process is jointly Gaussian. For a stationary Gaussian process, we have
$h(X_1, \ldots, X_n) = \frac{1}{2} \log\!\big((2\pi e)^n |K^{(n)}|\big),$
where $K^{(n)}$ is the covariance matrix of $(X_1, \ldots, X_n)$, with entries
$K^{(n)}_{ij} = E\big[(X_i - EX_i)(X_j - EX_j)\big].$
As $n \to \infty$, the density of eigenvalues of $K^{(n)}$ tends to a limit, which is the spectrum of the stochastic process.
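A minimal sketch of this formula (added here; it assumes NumPy/SciPy and an autocorrelation sequence R[0..n-1] supplied by the user), computing the block entropy from the Toeplitz covariance:

    import numpy as np
    from scipy.linalg import toeplitz

    def gaussian_block_entropy(R):
        # Differential entropy (in nats) of (X_1, ..., X_n) for a zero-mean stationary
        # Gaussian process with autocorrelations R[0..n-1]:
        #   h = 0.5 * log((2*pi*e)^n * det(K^(n))),  where  K^(n)_{ij} = R[|i - j|].
        K = toeplitz(R)
        n = len(R)
        sign, logdet = np.linalg.slogdet(K)       # numerically stable log-determinant
        assert sign > 0, "covariance must be positive definite"
        return 0.5 * (n * np.log(2 * np.pi * np.e) + logdet)

    # Example: R[k] = 0.9^k (an AR(1)-like autocorrelation); h/n approximates the entropy rate.
    R = 0.9 ** np.arange(200)
    h_rate = gaussian_block_entropy(R) / len(R)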

Entropy Rate and Variance
Kolmogorov showed that the entropy rate of a stationary Gaussian process is determined by its spectral density, as given in (11.39), so the entropy rate can be computed from the spectrum. Furthermore, the best estimate of $X_n$ given the past samples (which is Gaussian) has a mean-squared error related to the entropy rate by (11.40).
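For reference, the standard Kolmogorov results behind (11.39) and (11.40), stated in nats (this is my reconstruction of the cited equations; the textbook's form may differ in base or notation): for a stationary Gaussian process with spectral density $S(\lambda)$,

$h(\mathcal{X}) = \frac{1}{2}\log(2\pi e) + \frac{1}{4\pi}\int_{-\pi}^{\pi} \log S(\lambda)\,d\lambda,$

$\sigma_\infty^2 = \exp\!\left(\frac{1}{2\pi}\int_{-\pi}^{\pi} \log S(\lambda)\,d\lambda\right) = \frac{1}{2\pi e}\, e^{2 h(\mathcal{X})},$

where $\sigma_\infty^2$ is the minimum mean-squared error of the best one-step prediction of $X_n$ from its infinite past.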

Burg's Maximum Entropy Theorem
The maximum entropy rate stochastic process $\{X_i\}$ satisfying the constraints
$E X_i X_{i+k} = \alpha_k, \quad k = 0, 1, \ldots, p,$
is the $p$-th order Gauss-Markov process
$X_i = \sum_{k=1}^p a_k X_{i-k} + Z_i,$
where the $Z_i$ are i.i.d. zero-mean Gaussians $N(0, \sigma^2)$, and the $a_k$ and $\sigma^2$ are chosen to satisfy the constraints. See the text for the proof.

Yule-Walker Equations
To choose the $a_k$ and $\sigma^2$, one solves the Yule-Walker equations, given in (11.51) and (11.52), which are obtained by multiplying (11.42) by $X_{i-l}$, $l = 0, 1, \ldots, p$, and taking expectations. The Yule-Walker equations can be solved efficiently by the Levinson-Durbin recursion. The spectrum of the maximum entropy process is
$S(\lambda) = \dfrac{\sigma^2}{\left|1 - \sum_{k=1}^p a_k e^{-ik\lambda}\right|^2},$
with the $a_k$ and $\sigma^2$ obtained from (11.51) and (11.52).
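A compact sketch of this procedure (added for illustration; it is not the course's code, and it uses the $X_i = \sum_k a_k X_{i-k} + Z_i$ sign convention from the previous slide) solves the Yule-Walker equations by the Levinson-Durbin recursion and evaluates the resulting maximum entropy spectrum:

    import numpy as np

    def levinson_durbin(r):
        # Solve the Yule-Walker equations for the AR(p) model X_i = sum_k a_k X_{i-k} + Z_i,
        # given the autocorrelation constraints r = [R[0], R[1], ..., R[p]].
        # Returns (a, sigma2) with a = [a_1, ..., a_p] and innovation variance sigma2.
        p = len(r) - 1
        a = np.zeros(p + 1)          # a[0] unused; a[1..m] are the order-m coefficients
        sigma2 = r[0]
        for m in range(1, p + 1):
            k = (r[m] - np.dot(a[1:m], r[m - 1:0:-1])) / sigma2   # reflection coefficient
            a_new = a.copy()
            a_new[m] = k
            a_new[1:m] = a[1:m] - k * a[m - 1:0:-1]
            a = a_new
            sigma2 *= (1.0 - k * k)
        return a[1:], sigma2

    def max_entropy_spectrum(a, sigma2, grid=512):
        # S(lambda) = sigma2 / |1 - sum_k a_k e^{-i k lambda}|^2 on lambda in [-pi, pi].
        lams = np.linspace(-np.pi, np.pi, grid)
        denom = 1.0 - sum(a[k - 1] * np.exp(-1j * k * lams) for k in range(1, len(a) + 1))
        return lams, sigma2 / np.abs(denom) ** 2

    # Example with hypothetical constraints alpha_0, alpha_1, alpha_2:
    a, sigma2 = levinson_durbin(np.array([1.0, 0.6, 0.3]))
    lams, S = max_entropy_spectrum(a, sigma2)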