( 1 k "information" I(X;Y) given by Y about X)

Size: px
Start display at page:

Download "( 1 k "information" I(X;Y) given by Y about X)"

Transcription

SUMMARY OF SHANNON DISTORTION-RATE THEORY

Consider a stationary source X with f_k(x) as its kth-order pdf. Recall the following OPTA function definitions:

δ(k,R) = least dist'n of k-dim'l fixed-rate VQ's w. rate R
δ(k,n,R) = least dist'n of k-dim'l VQ's w. nth-order block lossless coding and rate R
δ(R) = inf_k δ(k,R) = inf_{k,n} δ(k,n,R) = least dist'n of VQ's with rate R, any dimension and fixed- or variable-rate coding

These functions describe the best possible performance of VQ's. High-resolution theory enabled us to find concrete formulas for them (the Zador-Gersho formulas) for the case that R is large. Shannon's distortion-rate theory enables one to find δ(R) for ANY value of R. However, it does not allow us to find δ(k,R) or δ(k,n,R), not even for some R's. The key result is the following.

Shannon's Distortion-Rate Theorem

For a stationary, ergodic source with finite variance,

δ(R) = D(R)    (OPTA function = Shannon's DRF)

where

D(R) = \lim_{k \to \infty} D(k,R) = Shannon's "distortion-rate function"

D(k,R) = \inf_{q \in Q_k(R)} \frac{1}{k} E\|X - Y\|^2

X = (X_1, ..., X_k) random variables from source
Y = (Y_1, ..., Y_k) random variables from test channel q with X as input
Q_k(R) = set of conditional probability densities, called "test channels"

Q_k(R) = \Big\{ q(y|x) : \frac{1}{k} \int\!\!\int f_k(x)\, q(y|x) \log_2 \frac{q(y|x)}{f(y)}\, dx\, dy \le R \Big\}

(the constraint is (1/k) times the "information" I(X;Y) given by Y about X)

E\|X - Y\|^2 is computed w.r.t. the joint density f(x,y) = f_k(x) q(y|x).
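To make the definition concrete, here is a minimal sketch (my construction, not from the notes): for a linear-Gaussian test channel Y = aX + N with N ~ N(0,b), both the mutual information and the MSE have closed forms, and any channel whose information is at most R certifies an upper bound on D(1,R). The values of σ², a, and b below are arbitrary illustrative choices; the optimal (a,b) appears later in Property 7.

```python
# Sketch: any single test channel q in Q_1(R) certifies D(1,R) <= E(X-Y)^2.
# For Y = a*X + N, N ~ N(0,b), X ~ N(0,sigma2), everything is closed form.
import math

def linear_gaussian_test_channel(sigma2, a, b):
    """Return (I(X;Y) in bits, E(X-Y)^2) for the channel Y = a*X + N."""
    var_y = a * a * sigma2 + b                   # variance of the output
    rho2 = (a * sigma2) ** 2 / (sigma2 * var_y)  # squared correlation of (X,Y)
    info = -0.5 * math.log2(1.0 - rho2)          # jointly Gaussian I(X;Y)
    mse = (1.0 - a) ** 2 * sigma2 + b            # E(X-Y)^2 = E((1-a)X - N)^2
    return info, mse

I, D = linear_gaussian_test_channel(sigma2=1.0, a=0.5, b=0.2)
print(f"I(X;Y) = {I:.3f} bits  =>  D(1,{I:.3f}) <= {D:.3f}")
```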

δ(R) is defined by a minimum over actual quantizers. D(R) is defined by a minimum over hypothetical conditional probability distributions. There is no straightforward connection between δ(R) and D(R).

This theorem is one of the deep and central results of information theory. Its proof can be found in information theory textbooks. As with most of information theory, the proof uses the asymptotic equipartition property, which in turn derives from the law of large numbers. We'll sketch some ideas of the proof later.

The theorem says two things:

Positive statement: For any R, there exist VQ's with rate R or less having MSE arbitrarily close to D(R). (The proof shows there exist fixed-rate codes.)

Negative statement: For any R, every VQ with rate R or less (fixed- or variable-rate) has MSE greater than or equal to D(R).

Unfortunately, this theorem does not indicate how large the dimension k needs to be in order to attain distortion close to D(R). Fortunately, Zador's theorem does enable us to learn how large the dimension needs to be, at least for large R, which is why we have focused in this course on Zador's rather than Shannon's theorem.

The test channels introduced in the definition of D(R) are not to be considered codes or any other part of an actual physical system.

Although the definition of D(R) is quite complex, there are cases, such as Gaussian sources, where it can be reduced to a closed-form or parametric expression. In other cases, the "Blahut algorithm" can at least be used to compute D(k,R), and if k is large, D(k,R) ≈ D(R) (a small numerical sketch of the algorithm appears below). Unfortunately, the Blahut algorithm becomes very complex for large k. So in practice it is extremely difficult to compute D(R), except in special cases such as IID or Gaussian sources. Because D(R) can be so difficult to compute, upper and lower bounds to it have been developed, which can serve as approximations.

Shannon's theorem is often stated in the following equivalent form:

γ(D) = R(D)

where γ(D) is the rate vs. distortion OPTA function, defined as the least rate of any lossy source code with distortion D or less (it is the inverse of δ(R)), and R(D) is the "Shannon rate-distortion function", which is the inverse of D(R). In fact, Shannon originally stated the theorem in this form, and the subject is usually called "rate-distortion theory".
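The following is a minimal sketch of the Blahut algorithm for a discretized unit-variance Gaussian source with squared-error distortion. The grid ranges, grid sizes, iteration count, and the multipliers lam are my illustrative choices, not part of the notes; each value of lam traces one (R,D) point, and for the Gaussian the points should approximately satisfy D ≈ 2^{-2R}.

```python
# Sketch of the Blahut algorithm: one (R, D) point per Lagrange multiplier lam,
# for a discretized scalar source with squared-error distortion.
import numpy as np

def blahut(p_x, x, y, lam, n_iter=500):
    d = (x[:, None] - y[None, :]) ** 2            # per-letter distortion d(x,y)
    q_y = np.full(len(y), 1.0 / len(y))           # initial output distribution
    for _ in range(n_iter):
        A = q_y[None, :] * np.exp(-lam * d)       # unnormalized test channel
        Q = A / A.sum(axis=1, keepdims=True)      # Q(y|x); rows sum to 1
        q_y = p_x @ Q                             # induced output marginal
    D = np.sum(p_x[:, None] * Q * d)
    with np.errstate(divide="ignore", invalid="ignore"):
        term = p_x[:, None] * Q * np.log2(Q / q_y[None, :])
    R = np.nansum(term)                           # drop 0*log(0) terms
    return R, D

x = np.linspace(-4, 4, 200)                       # discretized Gaussian source
p_x = np.exp(-x ** 2 / 2)
p_x /= p_x.sum()
y = x.copy()                                      # reproduction grid

for lam in [1.0, 2.0, 8.0]:
    R, D = blahut(p_x, x, y, lam)
    print(f"lam={lam:4.1f}: R={R:.3f} bits, D={D:.4f}, 2^(-2R)={2**(-2*R):.4f}")
```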

The theorem generalizes to other measures of distortion between vectors of the form

d_k(x,y) = \sum_{i=1}^{k} d(x_i, y_i)

where d(x,y) is some distortion measure between individual samples; d(x,y) is called a per-letter distortion measure.

THE COMPLEMENTARY NATURE OF SHANNON DISTORTION-RATE THEORY AND ZADOR'S HIGH-RESOLUTION THEORY

Consider fixed-rate coding.

Shannon theory: For large k and any R: δ(k,R) ≈ δ(R) = D(R)

High-resolution theory: For large R and any k: δ(k,R) ≈ Z(k,R)

For large k and large R, they agree: δ(R) = D(R) ≈ δ(k,R) ≈ Z(k,R)

Important note: δ(k,R) ≉ D(k,R). All we can say is

δ(k,R) > D(k,R), for all k, R
δ(k,R) ≈ Z(k,R), when k, R are large
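The strict inequality δ(k,R) > D(k,R) can be seen numerically even at k = 1. The sketch below (my construction; sample size, seed, and iteration count are arbitrary) trains a fixed-rate scalar quantizer with Lloyd's algorithm on unit-variance Gaussian samples and compares its MSE to D(1,R) = 2^{-2R}.

```python
# Sketch: empirical delta(1,R) for a Lloyd-trained scalar quantizer on a
# unit-variance Gaussian, compared with D(1,R) = 2^(-2R).
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(200_000)                   # training samples

for rate in [1, 2, 3]:
    n_levels = 2 ** rate
    # initialize levels at sample quantiles, then iterate Lloyd's two steps
    c = np.quantile(x, (np.arange(n_levels) + 0.5) / n_levels)
    for _ in range(100):
        idx = np.argmin((x[:, None] - c[None, :]) ** 2, axis=1)  # nearest level
        for j in range(n_levels):                                # centroid step
            cell = x[idx == j]
            if cell.size:
                c[j] = cell.mean()
    idx = np.argmin((x[:, None] - c[None, :]) ** 2, axis=1)
    mse = np.mean((x - c[idx]) ** 2)
    print(f"R={rate}: Lloyd MSE = {mse:.4f}   D(1,R) = {2.0 ** (-2 * rate):.4f}")
```

At R = 1 the trained quantizer's MSE comes out near 0.36, well above D(1,1) = 0.25, consistent with δ(1,R) > D(1,R).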

RELATIONSHIPS BETWEEN THE DISTORTION-RATE FUNCTION AND THE ZADOR FUNCTION

The following can be shown directly from the definitions (they also follow from what we know about the operational significance of D(k,R) and Z(k,R)):

D(k,R) \ge \frac{1}{2\pi e\, m_k^*} Z(k,R)

The ratio of the left and right sides goes to one as R → ∞.

D(R) \ge Z(R)

The ratio of the left and right sides goes to one as R → ∞. Sometimes they are equal for sufficiently large values of R.

The above inequalities are called Shannon lower bounds. They are restated and proved later in Property 12; a one-line numerical evaluation of the k = 1 constant appears just after Property 6 below.

PROPERTIES OF THE DISTORTION-RATE FUNCTION

These properties are derived by directly using and manipulating the definitions of D(k,R) and D(R).

1. D(0) = D(k,0) = σ².

2. D(R) > 0 and D(k,R) > 0 for all R ≥ 0.

(Figure: sketch of D(1,R), D(2,R), D(3,R), ... and D(R) versus R, each curve lying below the last.)

3. D(R) and D(k,R) decrease monotonically to zero as R increases.

4. D(R) and D(k,R) are convex (and consequently continuous) functions of R.

5. The D(k,R)'s are subadditive. That is, for any k, m, R,

D(k+m,R) \le \frac{k}{k+m} D(k,R) + \frac{m}{k+m} D(m,R)

from which it follows that

D(R) \le D(nk,R) \le D(k,R) \le D(1,R) for all k and n, and D(R) = \inf_k D(k,R).

Thus, the D(k,R)'s tend to decrease with k, but not necessarily monotonically.

6. D(R) = D(1,R) when the source is IID.
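To put a number on the k = 1 Shannon lower bound stated above: using the standard value m_1^* = 1/12 for the minimum one-dimensional inertial profile (this value is not stated in these notes, so treat it as an outside fact), the constant evaluates as follows.

```python
# Evaluate the k=1 constant in D(1,R) >= (1/(2 pi e m_1*)) Z(1,R), assuming
# the standard 1-D value m_1* = 1/12.
import math

factor = 1.0 / (2 * math.pi * math.e * (1.0 / 12.0))   # = 6/(pi e) ~ 0.7026
print(f"D(1,R) >= {factor:.4f} * Z(1,R); equivalently Z(1,R) <= "
      f"{1/factor:.4f} * D(1,R)  (~{10 * math.log10(1/factor):.2f} dB gap)")
```

This is the familiar 1.53 dB high-rate space-filling loss of scalar quantization; for an IID Gaussian source the two sides of the bound are equal.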

7. For an IID Gaussian source,

D(R) = D(1,R) = σ² 2^{-2R}.

Derivation: First recall that for an IID Gaussian source the first-order differential entropy is h_1 = \frac{1}{2} \log_2 2\pi e \sigma^2. Therefore, the Shannon lower bound gives

D(1,R) \ge \frac{1}{2\pi e\, m_1^*} Z(1,R) = \frac{1}{2\pi e\, m_1^*}\, m_1^*\, 2^{2h_1 - 2R} = \sigma^2 2^{-2R}.

The derivation is completed by showing D(1,R) ≤ σ² 2^{-2R}. This is accomplished by verifying that the test channel

q(y|x) = \frac{1}{\sqrt{2\pi b}} \exp\Big\{ -\frac{(y - ax)^2}{2b} \Big\},

with a = 1 - 2^{-2R} and b = 2^{-2R}(1 - 2^{-2R})\sigma^2, has I(X;Y) ≤ R and E(X-Y)² = σ² 2^{-2R}. It then follows from the definition of D(1,R) that D(1,R) ≤ E(X-Y)² = σ² 2^{-2R}. (A Monte Carlo check of this channel is sketched after Property 9 below.)

8. For a first-order AR Gaussian source with correlation coefficient ρ,

D(R) = Z(R) = \sigma^2 (1 - \rho^2)\, 2^{-2R} for R \ge R_o = \frac{1}{2} \log_2 (1 + |\rho|)^2.

There is no closed-form expression for other R's. This property follows from the next.

9. For a stationary Gaussian source with power spectral density S(ω),

D(R) = Z(R) = Q\, 2^{-2R}, for R \ge R_o = \frac{1}{2} \log_2 \frac{Q}{S_{\min}}

where S_min is the minimum value of S(ω), and Q is the mean-squared error of the best linear prediction of X_i based on all past values of X:

Q = \exp\Big\{ \frac{1}{2\pi} \int_{-\pi}^{\pi} \ln S(\omega)\, d\omega \Big\}

For R ≤ R_o, there is no closed-form expression for D(R). However, the following parametric expression applies for all values of R. For any θ, 0 ≤ θ ≤ S_max, where S_max is the maximum value of S(ω),

D_\theta = D(R_\theta), where

R_\theta = \frac{1}{2\pi} \int_{-\pi}^{\pi} \max\Big\{ 0,\ \frac{1}{2} \log_2 \frac{S(\omega)}{\theta} \Big\}\, d\omega

D_\theta = \frac{1}{2\pi} \int_{-\pi}^{\pi} \min\{\theta, S(\omega)\}\, d\omega

Interpretation: For a given θ, all frequencies ω for which S(ω) ≥ θ contribute \frac{1}{2}\log_2 \frac{S(\omega)}{\theta} to the rate and θ to the distortion. All frequencies ω for which S(ω) < θ are discarded (e.g. filtered out); they contribute 0 to the rate and S(ω) to the distortion.
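Here is the promised Monte Carlo check of the Property 7 test channel (my sketch; sample size and seed are arbitrary). Since (X,Y) are jointly Gaussian, I(X;Y) can be recovered from the sample correlation.

```python
# Sketch: check that the Property-7 channel Y = a*X + N, with a = 1 - 2^(-2R)
# and b = 2^(-2R)(1 - 2^(-2R)) sigma^2, gives I(X;Y) = R bits and
# E(X-Y)^2 = sigma^2 2^(-2R) for an IID Gaussian source.
import numpy as np

rng = np.random.default_rng(1)
sigma2, R, n = 1.0, 1.5, 1_000_000
d = 2.0 ** (-2 * R)
a, b = 1.0 - d, d * (1.0 - d) * sigma2

x = rng.normal(0.0, np.sqrt(sigma2), n)
y = a * x + rng.normal(0.0, np.sqrt(b), n)

rho2 = np.corrcoef(x, y)[0, 1] ** 2
I_hat = -0.5 * np.log2(1.0 - rho2)       # jointly Gaussian mutual information
mse = np.mean((x - y) ** 2)
print(f"I(X;Y) ~ {I_hat:.4f} bits (target {R}), "
      f"E(X-Y)^2 ~ {mse:.4f} (target {sigma2 * d:.4f})")
```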

Special cases:

θ = 0: R_θ = ∞, D_θ = 0.

θ = S_max: R_θ = 0, D_θ = σ².

θ ≤ S_min: D_θ = θ, and

R(D) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{1}{2} \log_2 \frac{S(\omega)}{D}\, d\omega = \frac{1}{2} \log_2 \frac{Q}{D}, \qquad D \le S_{\min},

equivalently, D(R) = Q\, 2^{-2R} for R \ge \frac{1}{2} \log_2 \frac{Q}{S_{\min}} = R_o.

(Figure: sketch of D(R) versus R, starting at (0, σ²) where θ = S_max, passing through the point R = R_o where θ = S_min, and thereafter following Q 2^{-2R} with θ → 0 as R → ∞.)

For the AR source, Property 8 follows from Property 9:

S(\omega) = \frac{\sigma^2 (1 - \rho^2)}{1 - 2\rho \cos\omega + \rho^2}, \qquad S_{\min} = \sigma^2\, \frac{1 - |\rho|}{1 + |\rho|}, \qquad S_{\max} = \sigma^2\, \frac{1 + |\rho|}{1 - |\rho|},

Q = \sigma^2 (1 - \rho^2), \qquad R_o = \frac{1}{2} \log_2 \frac{Q}{S_{\min}} = \frac{1}{2} \log_2 (1 + |\rho|)^2.

(Figure: sketch of S(ω) on [-π, π] with the extremes S_max and S_min marked.)

The parametric (R_θ, D_θ) curve is evaluated numerically for this AR source in a sketch after Property 11 below.

10. There are a few other sources for which D(R) can be computed analytically. For other sources, D(R) must be computed numerically. The most well-known algorithm is that of Blahut for computing D(k,R). Because D(R) is hard to compute, various upper and lower bounds have been found for D(k,R) and D(R). Two are given below.

11. An upper bound: For any source, D(R) and D(k,R) are bounded from above by the corresponding functions for a Gaussian source with the same autocorrelation function (equivalently, the same power spectral density). It follows that Gaussian sources are the hardest to compress among all sources with a given autocorrelation function.
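The sketch below (my construction; ρ, the grid resolution, and the θ values are arbitrary choices) integrates the Property 9 parametric formulas numerically for the AR(1) spectrum above and checks that for θ ≤ S_min the resulting (R,D) points satisfy D = Q 2^{-2R}.

```python
# Sketch: reverse water-filling for the AR(1) spectrum of Property 8, via
# direct numerical integration of the Property-9 parametric formulas.
import numpy as np

sigma2, rho = 1.0, 0.9
w = np.linspace(-np.pi, np.pi, 20_001)
dw = w[1] - w[0]
S = sigma2 * (1 - rho ** 2) / (1 - 2 * rho * np.cos(w) + rho ** 2)

Q = sigma2 * (1 - rho ** 2)              # one-step prediction error
R_o = 0.5 * np.log2((1 + abs(rho)) ** 2)

for theta in [S.min() / 2, S.min(), 0.5 * sigma2]:
    R_t = (dw / (2 * np.pi)) * np.sum(np.maximum(0.0, 0.5 * np.log2(S / theta)))
    D_t = (dw / (2 * np.pi)) * np.sum(np.minimum(theta, S))
    print(f"theta={theta:.4f}: R={R_t:.4f}, D={D_t:.4f}, "
          f"Q*2^(-2R)={Q * 2 ** (-2 * R_t):.4f}")
print(f"R_o = {R_o:.4f}")
```

The first two θ values lie at or below S_min, so D matches Q 2^{-2R}; the third lies in the water-filling region R < R_o, where D(R) exceeds Q 2^{-2R}.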

12. Shannon lower bounds: Let X be a stationary source. For any k and R,

D(k,R) \ge \frac{1}{2\pi e}\, 2^{2h_k - 2R} = \frac{1}{2\pi e\, m_k^*} Z(k,R)

D(R) \ge \frac{1}{2\pi e}\, 2^{2\bar{h} - 2R} = Z(R)

where

h_k = \frac{1}{k} h(X_1, ..., X_k) is the kth-order differential entropy of X,
\bar{h} = \lim_{k \to \infty} h_k is the differential entropy rate of X,
m_k^* is the minimum value of any valid inertial profile, and
Z(k,R) and Z(R) = \lim_{k \to \infty} Z(k,R) are Zador functions.

Note: The ratio of the left and right sides of each bound can be shown to go to one as R → ∞. Sometimes equality holds for large R.

Derivation: To derive the lower bound to D(k,R), consider any test channel q that is allowed in the definition of D(k,R), i.e. any q such that

\frac{1}{k} I(X;Y) = \frac{1}{k} \int\!\!\int f_k(x)\, q(y|x) \log_2 \frac{q(y|x)}{f(y)}\, dx\, dy \le R.

We will show:

\frac{1}{k} E\|X - Y\|^2 \ge \frac{1}{2\pi e}\, 2^{2h_k - 2R}.   (*)

Since this holds for any valid q, it holds for the q that minimizes E\|X - Y\|^2. Therefore, for the minimizing q,

D(k,R) = \frac{1}{k} E\|X - Y\|^2 \ge \frac{1}{2\pi e}\, 2^{2h_k - 2R} = \frac{1}{2\pi e\, m_k^*} Z(k,R),

where the last equality comes from the definition of Z(k,R). This derives the Shannon lower bound to D(k,R). The Shannon lower bound to D(R) follows by taking the limit as k grows to infinity.

It remains only to derive (*), which we do using the following lemma from information theory.

Fano's Lemma for MSE: If X and Y are k-dimensional random vectors, then

\frac{1}{k} E\|X - Y\|^2 \ge \frac{1}{2\pi e}\, 2^{\frac{2}{k} h(X|Y)}.

By manipulating the defining formula for I(X;Y), one may straightforwardly show

h(X|Y) = h(X) - I(X;Y) = k h_k - I(X;Y)   (by the definition of h_k)
\ge k h_k - kR   (by the choice of q).

Substituting this into the lower bound given in Fano's Lemma yields (*) and finishes the derivation.
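As an illustration of how Properties 11 and 12 bracket D(R) for a non-Gaussian source (my example, not from the notes): for an IID unit-variance Laplacian source the differential entropy is h = log₂(√2 e σ), so the Shannon lower bound evaluates to (e/π) σ² 2^{-2R}, while the Gaussian source of the same variance supplies the upper bound σ² 2^{-2R}.

```python
# Sketch: bracket D(R) for an IID unit-variance Laplacian source using the
# Shannon lower bound (Property 12) and the Gaussian upper bound (Property 11).
import math

sigma2 = 1.0
h = math.log2(math.sqrt(2.0) * math.e * math.sqrt(sigma2))  # Laplacian h, bits

for R in [0.5, 1.0, 2.0]:
    slb = (1.0 / (2 * math.pi * math.e)) * 2 ** (2 * h - 2 * R)  # = (e/pi) 2^(-2R)
    gauss_ub = sigma2 * 2 ** (-2 * R)
    print(f"R={R}: {slb:.4f} <= D(R) <= {gauss_ub:.4f}")
```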
