PROOF OF ZADOR-GERSHO THEOREM
- Theresa Hampton
ZADOR-GERSHO THEOREM FOR VARIABLE-RATE VQ

For a stationary source and large R, the least distortion of k-dim'l VQ with nth-order entropy coding and rate R or less is

    δ(k,n,R) ≅ m_k* σ² η_kn 2^{-2R} = Z(k,n,R)

where

    Z(k,n,R) = Zador-Gersho function for k-dim'l VQ with nth-order entropy coding
    m_k* = best inertial profile = least NMI of any tessellating polytope (Gersho's conjecture)
    η_kn = 2^{2 h_kn} / σ²

Equivalently,

    S(k,n,R) ≅ 6.02 R − 10 log₁₀ (m_k* η_kn)    (Again, 6 dB per bit.)

Notes:

- k-dim'l VQ-EC with n = ∞ is at least as good as k-dim'l VQ-FR, because the latter is a special case of the former. Later we show directly that η_k ≤ β_k, which implies Z(k,∞,R) ≤ Z(k,R).

- The proof shows that an approximately optimal k-dimensional VQ with nth-order variable-rate coding can be constructed with a partition that is a tessellation of the best k-dimensional polytope, scaled to volume 2^{k(h_kn − R)}. This has

      constant inertial profile m(x) = m_k*,
      constant point density Λ(x) = Λ_k* = 2^{k(R − h_kn)},
      distortion D ≅ m_k* (Λ_k*)^{-2/k} = Z(k,n,R).

VQ-EC-13

PROOF OF ZADOR-GERSHO THEOREM

We begin with

    δ(k,n,R) ≅ min over m(x), Λ(x) of ∫ m(x) Λ^{-2/k}(x) f(x) dx

where x = (x₁,…,x_k) and the minimization is over all inertial profiles m(x) and all point densities Λ(x) such that

    h_kn + (1/k) ∫ f(x) log₂ Λ(x) dx ≤ R    (such a Λ yields rate ≅ R)

Best inertial profile: We assume Gersho's conjecture -- in the high-rate, small-distortion regime, most cells of an optimal quantizer are, approximately, congruent to the tessellating polytope with least NMI. We conclude: the best inertial profile is

    m_k(x)* = m_k* = least NMI of k-dimen'l tessellating polytopes

It follows that

    δ(k,n,R) ≅ m_k* min over Λ(x) of ∫ Λ^{-2/k}(x) f(x) dx

where the min is taken over functions Λ(x) such that Λ(x) ≥ 0 and

    h_kn + (1/k) ∫ f(x) log₂ Λ(x) dx ≤ R

VQ-EC-14
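The distortion and SNR forms above are easy to sanity-check numerically. The sketch below (plain Python; the specific values of m_k*, η_kn, and σ² are hypothetical placeholders, not derived from any particular source) confirms that the two forms agree and that each added bit of rate buys about 6.02 dB:

```python
import math

def zador_gersho(m_k_star, sigma2, eta_kn, R):
    """Z(k,n,R) = m_k* sigma^2 eta_kn 2^(-2R): approximate least distortion."""
    return m_k_star * sigma2 * eta_kn * 2 ** (-2 * R)

def snr_db(m_k_star, eta_kn, R):
    """S(k,n,R) = 6.02 R - 10 log10(m_k* eta_kn), with 6.02 = 20 log10(2)."""
    return 20 * math.log10(2) * R - 10 * math.log10(m_k_star * eta_kn)

# Hypothetical numbers: m_1* = 1/12 (NMI of an interval), unit variance.
m_k_star, sigma2, eta_kn, R = 1.0 / 12, 1.0, 1.4, 8.0
D = zador_gersho(m_k_star, sigma2, eta_kn, R)
S = snr_db(m_k_star, eta_kn, R)

# S must equal 10 log10(sigma^2 / D), and one more bit gives +6.02 dB.
consistency = abs(S - 10 * math.log10(sigma2 / D))
per_bit = snr_db(m_k_star, eta_kn, R + 1) - S
```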
Best point density: Suppose Λ(x) satisfies

    h_kn + (1/k) ∫ f(x) log₂ Λ(x) dx ≤ R    (*)

Then by concavity of the logarithm and Jensen's inequality,

    log₂ ∫ Λ^{-2/k}(x) f(x) dx ≥ ∫ f(x) log₂ Λ^{-2/k}(x) dx    (equality iff Λ(x) is constant with probability one)
                               = −(2/k) ∫ f(x) log₂ Λ(x) dx
                               ≥ 2 h_kn − 2R    (by (*))

Hence,

    ∫ Λ^{-2/k}(x) f(x) dx ≥ 2^{2 h_kn} 2^{-2R}    (**)

with equality iff Λ(x) is a constant with probability one.

(*) and (**) ⇒ δ(k,n,R) ⪆ m_k* 2^{2 h_kn} 2^{-2R} = m_k* σ² η_kn 2^{-2R}

Moreover, we have shown that the optimal point density is a constant. The constant must be such that (*) holds with equality. Therefore,

    Λ(x) = 2^{k(R − h_kn)} = Λ_k*

We see from the proof that an approximately optimal VQ can be constructed with a partition that is a tessellation of the best k-dim'l polytope, scaled to volume 2^{k(h_kn − R)}. The tessellation need only cover the region where f(x) is not small.

VQ-EC-15

SUMMARY OF FIXED- AND VARIABLE-RATE VQ

Let 0th-order entropy coding (n = 0) denote fixed-rate coding. Given a stationary source and large R, the least distortion of VQ with dimension k, nth-order entropy coding, and rate R or less is

    δ(k,n,R) ≅ m_k* σ² α_{k,n} 2^{-2R}    (= Z(k,n,R))

where

    α_{k,n} = β_k for n = 0,  and  α_{k,n} = η_kn = 2^{2 h_kn} / σ² for n ≥ 1

Notice the "," in α_{k,n} but not in η_kn or h_kn!

The best k-dimen'l VQ to use with fixed-rate coding (n = 0) has
- point density λ_k*(x) = c f^{k/(k+2)}(x)
- congruent cells with NMI = m_k* (constant inertial profile).

The best k-dimen'l VQ to use with variable-rate binary coding (n ≥ 1) has
- uniform point density
- congruent cells with NMI equal to m_k*.

That is, it is simply a tessellation.

VQ-EC-16
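The Jensen step above can be illustrated with a finite, discretized stand-in for the integrals (a sketch; the weights f and the candidate point densities below are arbitrary made-up values, and k = 2 is just an example): any non-constant density exceeds the lower bound (**), while a constant density meets it.

```python
import math
import random

random.seed(0)
k = 2
# Discrete stand-in for the density f(x): random probability weights.
w = [random.random() for _ in range(1000)]
tot = sum(w)
f = [wi / tot for wi in w]

def lhs(Lam):
    """E[Lambda(X)^(-2/k)], the quantity being minimized."""
    return sum(fi * L ** (-2.0 / k) for fi, L in zip(f, Lam))

def rhs(Lam):
    """2^(-(2/k) E[log2 Lambda(X)]), the Jensen lower bound (**)."""
    return 2 ** (-(2.0 / k) * sum(fi * math.log2(L) for fi, L in zip(f, Lam)))

Lam_rand = [random.uniform(0.5, 2.0) for _ in range(1000)]
Lam_const = [1.7] * 1000

slack = lhs(Lam_rand) - rhs(Lam_rand)             # > 0: non-constant is suboptimal
gap_const = abs(lhs(Lam_const) - rhs(Lam_const))  # ~ 0: constant meets the bound
```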
WHAT HAPPENS AS k AND n CHANGE?

As usual, consider a stationary source. Recall:

    δ(k,n,R) ≅ σ² m_k* α_{k,n} 2^{-2R}

- m_k* decreases subadditively to m_∞* = 1/(2πe)
- α_{k,n} = β_k for n = 0, and 2^{2 h_kn} / σ² for n ≥ 1
- β_k decreases subadditively to β_∞
- 2^{2 h_k} decreases monotonically to 2^{2 h_∞}

Therefore,

- α_{k,0} decreases monotonically with k to β_∞
- α_{k,n} decreases monotonically with k to 2^{2 h_∞} / σ² for n ≥ 1
- α_{k,n} decreases monotonically with n to 2^{2 h_∞} / σ²

Key Fact: 2^{2 h_∞} / σ² = β_∞ (proved later).

Therefore, α_{k,n} decreases monotonically with n or k to β_∞ = 2^{2 h_∞} / σ².

VQ-EC-17

CONCLUSIONS

(1) The least distortion of vector quantization with rate R or less, with any dimension, and with fixed-rate coding or with variable-rate encoding of any order is

    δ(R) ≅ σ² m_∞* β_∞ 2^{-2R}

Among other things, this says that the best possible performance with variable-rate coding is no better than the best possible performance with fixed-rate coding.

(2) Increasing n with k fixed: δ(k,n,R) decreases monotonically to the limit

    δ(k,∞,R) ≅ σ² m_k* β_∞ 2^{-2R} = (m_k* / m_∞*) δ(R)    (space-filling loss)

Therefore, for large n and arbitrary k,

    δ(k,n,R) ≅ (m_k* / m_∞*) δ(R)

Among other things, this shows that one needs large k in order to approach the best possible performance.

VQ-EC-18
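The space-filling loss m_k*/m_∞* in conclusion (2) can be evaluated for the dimensions where the best tessellating NMI is known in closed form (a sketch; the k = 2 value assumes the regular hexagon is optimal, per Gersho's conjecture):

```python
import math

m_inf = 1 / (2 * math.pi * math.e)        # m_infinity* = 1/(2 pi e) ~ 0.0586
known_nmi = {
    1: 1 / 12,                            # interval
    2: 5 / (36 * math.sqrt(3)),           # regular hexagon (Gersho's conjecture)
}
# Space-filling loss in dB for each dimension.
loss_db = {k: 10 * math.log10(m / m_inf) for k, m in known_nmi.items()}
# k = 1 gives ~1.53 dB; already at k = 2 the loss shrinks to ~1.37 dB.
```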
(3) Increasing k with n fixed: δ(k,n,R) "decreases", though not monotonically, to the limit

    δ(∞,n,R) ≅ σ² m_∞* β_∞ 2^{-2R} = δ(R)    (no loss)

Therefore, for large k and arbitrary n (even n = 0),

    δ(k,n,R) ≅ δ(R)

Among other things, this indicates that for large k, increasing n does not improve the best possible performance attainable for that k. That is, one can attain the best possible performance even with n = 0 or 1.

(4) To get the best possible performance we must have

(a) k large enough that m_k* / m_∞* ≅ 1, i.e. well-shaped cells,
(b) k and/or n large enough that α_{k,n} ≅ β_∞, i.e. good point density and good exploitation of memory.

Usually (b) is more important than (a).

VQ-EC-19

(5) What's the point of variable-rate coding if the best possible performance with variable-rate coding is no better than the best possible performance without it?

From the point of view of VQ, the reason to use variable-rate coding, instead of fixed-rate coding, is to permit a VQ with smaller dimension (less complexity) to work well.

The extreme case -- k = 1, i.e. scalar quantization:

with n = 0 (fixed-rate coding)

    δ(1,0,R) ≅ σ² m_1* β_1 2^{-2R} = (m_1* / m_∞*)(β_1 / β_∞) δ(R)

with large n (high-order variable-rate coding)

    δ(1,∞,R) ≅ σ² m_1* β_∞ 2^{-2R} = (m_1* / m_∞*) δ(R),  where m_1* / m_∞* = 1.42, or 1.53 dB

The variable-rate coding causes β_1 to be replaced by β_∞. Moreover, the best scalar quantizer for use with variable-rate coding is a uniform scalar quantizer. Thus this shows that with variable-rate coding, a uniform scalar quantizer can have performance within only 1.53 dB of the best VQ of any type!

VQ-EC-20
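For a concrete instance of case (5), take an i.i.d. Gaussian source (an assumption for illustration; the statement above is source-independent). Then β_1 = 2π·3^{3/2} and β_∞ = 2πe in closed form, and the two gaps to δ(R) come out to roughly 4.35 dB (fixed-rate scalar) and 1.53 dB (entropy-coded scalar):

```python
import math

pi, e = math.pi, math.e
m1, m_inf = 1 / 12, 1 / (2 * pi * e)   # NMI of interval; limiting NMI

# i.i.d. Gaussian, unit variance: Zador factors in closed form.
beta_1 = 2 * pi * 3 ** 1.5             # k = 1 (Panter-Dite constant)
beta_inf = 2 * pi * e                  # = 2^(2 h_infinity) / sigma^2

# Gaps of delta(1,0,R) and delta(1,inf,R) over delta(R), in dB.
gap_fixed_db = 10 * math.log10((m1 * beta_1) / (m_inf * beta_inf))
gap_entropy_db = 10 * math.log10(m1 / m_inf)
```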
(6) What's the point of vector quantization if uniform scalar quantization plus variable-rate coding can come within 1.53 dB of the best VQ of any type?

From the point of view of the binary code, the purpose of VQ is to permit a lower-order variable-rate coder to be used, i.e. it permits a simpler lossless coder. For example, if uniform scalar quantization were used, the variable-rate coder must exploit the memory in the source, i.e. it would have to be complex. VQ also reduces the space-filling loss, i.e. it improves cell shapes.

(7) It's worth re-emphasizing that one can attain

- the best possible performance (D vs. R) by choosing k large and n = 0, i.e. with a complex quantizer and a simple fixed-rate binary encoder;
- the best possible performance minus only 1.53 dB by choosing a uniform scalar quantizer and an entropy coder with large n, i.e. with a simple quantizer and a complex entropy coder.

Which is simpler? Hard to say. Both are very complex. Good systems are usually compromises, i.e. a nontrivial quantizer and a nontrivial entropy coder. In some applications variable-rate coding is not an option and so fixed-rate coding must be used.

VQ-EC-21

EXAMPLE: GAUSS-MARKOV SOURCE, ρ = 0.9

    S(k,n,R) ≅ 6.02 R − 10 log₁₀ (m_k* α_{k,n})

[Figure: plot of −10 log₁₀ α_{k,n} versus dimension k for several coding orders n.]

VQ-EC-22
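The plotted quantity −10 log₁₀ α_{k,n} has a closed form for this source (a sketch assuming unit variance; it uses the Gaussian formulas derived as properties (4)-(5) below, together with |K_k| = (1−ρ²)^{k−1} for an AR(1) source, and the fact that one step of conditioning already captures all the memory of a Markov source):

```python
import math

rho = 0.9
pi, e = math.pi, math.e

def alpha(k, n):
    """alpha_{k,n} for a unit-variance Gauss-Markov source."""
    if n == 0:
        # Fixed-rate: Gaussian beta_k with |K_k|^(1/k) = (1-rho^2)^((k-1)/k).
        return 2 * pi * ((k + 2) / k) ** ((k + 2) / 2) * (1 - rho ** 2) ** ((k - 1) / k)
    # n >= 1: the Markov property gives h_kn = (1/2) log2(2 pi e (1 - rho^2)).
    return 2 * pi * e * (1 - rho ** 2)

gain_db = {(k, n): -10 * math.log10(alpha(k, n)) for k in (1, 2, 4, 8) for n in (0, 1)}
```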
PROPERTIES OF DIFFERENTIAL ENTROPY AND THE ZADOR FACTOR η_k

Most are extensions of properties of the differential entropy of one random variable. Many proofs are similar to those of analogous properties for ordinary entropy.

Definitions:

    h(X₁,…,X_k) = −∫ f(x) log₂ f(x) dx
    h_k = (1/k) h(X₁,…,X_k) = kth-order differential entropy
    η_k = 2^{2 h_k} / σ² = Zador factor for variable-rate VQ

Note that h and h_k can be positive or negative!

VQ-EC-23

(1) h_k ≤ (1/2) log₂ σ² β_k and η_k ≤ β_k, where β_k is Zador's factor. Equality holds in each iff f(x) has the same value wherever it is not zero, e.g. if it is uniform.

Derivation:

    h_k = −(1/k) ∫ f(x) log₂ f(x) dx
        = ((k+2)/(2k)) ∫ f(x) log₂ f^{-2/(k+2)}(x) dx
        = ((k+2)/(2k)) E log₂ Y,  where Y = f^{-2/(k+2)}(X)
        ≤ ((k+2)/(2k)) log₂ EY    by Jensen's ineq. (log₂ is concave)
        = ((k+2)/(2k)) log₂ ∫ f(x) f^{-2/(k+2)}(x) dx
        = (1/2) log₂ ( ∫ f^{k/(k+2)}(x) dx )^{(k+2)/k}
        = (1/2) log₂ σ² β_k

    ⇒ η_k = 2^{2 h_k} / σ² ≤ β_k

Since log₂ is strictly concave, equality holds if and only if Y is constant w.p. 1; i.e. iff P(f^{-2/(k+2)}(X) = c) = 1 for some c, i.e. iff f(x) has the same value wherever it is not zero.

VQ-EC-24
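Property (1) is easy to confirm on densities whose moments have closed forms (hand-computed values for k = 1, used here as illustrative examples): for the uniform density the bound holds with equality, while for the exponential it is strict.

```python
import math

log2 = math.log2

# Uniform on [0,1]: h = 0, sigma^2 = 1/12, (integral of f^(1/3))^3 = 1, so beta_1 = 12.
h_unif = 0.0
bound_unif = 0.5 * log2((1 / 12) * 12)   # (1/2) log2(sigma^2 beta_1)

# Exponential(1): h = log2(e), sigma^2 = 1, (integral of f^(1/3))^3 = 27, so beta_1 = 27.
h_exp = log2(math.e)
bound_exp = 0.5 * log2(1 * 27)

# h_1 <= (1/2) log2(sigma^2 beta_1), equality only for the "flat" density.
```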
(2) If Y = aX + b, a ≠ 0, then h_{Y,k} = h_{X,k} + log₂ |a| and η_{Y,k} = η_{X,k}.

Derivation: Since f_Y(y) = (1/|a|^k) f_X((y−b)/a),

    h_{Y,k} = −(1/k) ∫ f_Y(y) log₂ f_Y(y) dy
            = −(1/k) ∫ (1/|a|^k) f_X((y−b)/a) log₂ [ (1/|a|^k) f_X((y−b)/a) ] dy
            = −(1/k) ∫ f_X(x) log₂ [ (1/|a|^k) f_X(x) ] dx,  letting x = (y−b)/a
            = −(1/k) ∫ f_X(x) log₂ f_X(x) dx − (1/k) ∫ f_X(x) log₂ (1/|a|^k) dx
            = h_{X,k} + log₂ |a|

Since σ_Y² = a² σ_X²,

    η_{Y,k} = 2^{2 h_{Y,k}} / σ_Y² = 2^{2 h_{X,k} + 2 log₂ |a|} / (a² σ_X²) = 2^{2 h_{X,k}} / σ_X² = η_{X,k}

VQ-EC-25

(3) (a) If Y = AX + b and A is a k × k nonsingular matrix, then

    h_{Y,k} = h_{X,k} + (1/k) log₂ |det A|  and  η_{Y,k} = η_{X,k} |det A|^{2/k} σ_X² / σ_Y²

(b) If A is orthogonal (i.e. A⁻¹ = Aᵗ), then h_{Y,k} = h_{X,k} and η_{Y,k} = η_{X,k}.

Derivation:

(a) Since f_Y(y) = |det A|⁻¹ f_X(A⁻¹(y−b)),

    h_{Y,k} = −(1/k) ∫ f_Y(y) log₂ f_Y(y) dy
            = −(1/k) ∫ |det A|⁻¹ f_X(A⁻¹(y−b)) log₂ [ |det A|⁻¹ f_X(A⁻¹(y−b)) ] dy
            = −(1/k) ∫ f_X(x) log₂ [ |det A|⁻¹ f_X(x) ] dx,  with x = A⁻¹(y−b)
            = −(1/k) ∫ f_X(x) log₂ f_X(x) dx + (1/k) log₂ |det A|
            = h_{X,k} + (1/k) log₂ |det A|

    η_{Y,k} = 2^{2 h_{Y,k}} / σ_Y² = 2^{2 h_{X,k} + (2/k) log₂ |det A|} / σ_Y² = η_{X,k} |det A|^{2/k} σ_X² / σ_Y²

(b) This follows from (a) and the facts that when A is orthogonal, |det A| = 1 and σ_Y² = σ_X².

VQ-EC-26
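Property (2) can be checked against the scalar Gaussian closed form h = (1/2) log₂ 2πeσ² (a sketch; any a ≠ 0 and any variance would do): the entropy shifts by log₂|a| while η is unchanged.

```python
import math

def h_gauss(var):
    """Differential entropy (bits) of a scalar Gaussian N(0, var)."""
    return 0.5 * math.log2(2 * math.pi * math.e * var)

def eta(h, var):
    """Zador factor 2^(2h) / sigma^2."""
    return 2 ** (2 * h) / var

var_x, a = 2.0, -3.0
var_y = a ** 2 * var_x                     # variance of Y = aX + b

shift = h_gauss(var_y) - h_gauss(var_x)    # should equal log2 |a|
eta_x = eta(h_gauss(var_x), var_x)
eta_y = eta(h_gauss(var_y), var_y)         # scale-invariant (both are 2*pi*e here)
```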
(4) Gaussian: If X = (X₁,…,X_k) is Gaussian with covariance matrix K, then

    h_k = (1/2) log₂ 2πe |K|^{1/k}  and  η_k = 2πe |K|^{1/k} / σ²

This formula for η_k is the same as the formula for β_k except that ((k+2)/k)^{(k+2)/2} is replaced by e. (Note: ((k+2)/k)^{(k+2)/2} → e as k → ∞.)

Derivation: We may assume X has zero mean, since previous properties show the mean has no effect on h_k or η_k. Let Y = AX, where A is the Karhunen-Loève transform. Then Y is Gaussian with covariance matrix Λ = AKAᵗ, which is diagonal, with diagonal elements equal to the eigenvalues λ₁,…,λ_k of K. Thus Y has independent components with variances λ₁,…,λ_k. Note that |K| = ∏_{i=1}^k λ_i. Since A is orthogonal,

    h_{X,k} = h_{Y,k} = −(1/k) ∫ f(y) log₂ f(y) dy
            = −(1/k) ∫ f(y) log₂ [ ∏_{i=1}^k (2πλ_i)^{-1/2} exp{−y_i²/(2λ_i)} ] dy
            = −(1/k) log₂ ∏_{i=1}^k (2πλ_i)^{-1/2} + (1/k) [ ∫ f(y) ∑_{i=1}^k y_i²/(2λ_i) dy ] log₂ e
            = −(1/k) log₂ (2π)^{-k/2} |K|^{-1/2} + (1/k) ∑_{i=1}^k [ E Y_i² / (2λ_i) ] log₂ e
            = (1/2) log₂ 2π |K|^{1/k} + (1/2) log₂ e
            = (1/2) log₂ 2πe |K|^{1/k}

VQ-EC-27

(5) If X has covariance matrix K, then

    h_k ≤ (1/2) log₂ 2πe |K|^{1/k}  and  η_k ≤ 2πe |K|^{1/k} / σ²

with equality iff X is Gaussian.

Derivation: We may assume X has zero mean, since the mean has no effect on h_k or η_k. Let f(x) be the density of X with covariance matrix K, and let g(x) be the Gaussian density with mean zero and covariance matrix K. We make the proof in two steps.

(a) h_k(f) ≤ −(1/k) ∫ f(x) log₂ g(x) dx:

    −(1/k) ∫ f(x) log₂ g(x) dx − h_k(f) = −(1/k) ∫ f(x) log₂ [ g(x)/f(x) ] dx
                                         ≥ −(1/(k ln 2)) ∫ f(x) [ g(x)/f(x) − 1 ] dx    since log z ≤ (z−1), i.e. log₂ z ≤ (z−1)/ln 2
                                         = −(1/(k ln 2)) [ ∫ g(x) dx − ∫ f(x) dx ]
                                         = −(1/(k ln 2)) (1 − 1) = 0

VQ-EC-28
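The KLT argument in (4) can be verified numerically on a 2×2 covariance, whose eigenvalues are available in closed form (a sketch; K = [[1, c], [c, 1]] with c = 0.9 is an arbitrary example): the determinant formula and the average of the independent KLT coefficients' entropies agree.

```python
import math

log2 = math.log2
two_pi_e = 2 * math.pi * math.e

c = 0.9                       # correlation in K = [[1, c], [c, 1]]
lam = (1 + c, 1 - c)          # eigenvalues of K
det_K = 1 - c ** 2            # |K| = product of eigenvalues

# h_k from the determinant formula of (4):
h_k_det = 0.5 * log2(two_pi_e * det_K ** 0.5)
# h_k by averaging the entropies of the independent KLT coefficients:
h_k_klt = sum(0.5 * log2(two_pi_e * l) for l in lam) / 2
```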
(b) −(1/k) ∫ f(x) log₂ g(x) dx = −(1/k) ∫ g(x) log₂ g(x) dx = (1/2) log₂ 2πe |K|^{1/k}:

    −(1/k) ∫ f(x) log₂ g(x) dx = −(1/k) E_f log₂ [ (2π)^{-k/2} |K|^{-1/2} exp{−(1/2) Xᵗ K⁻¹ X} ]
                               = (1/2) log₂ 2π |K|^{1/k} + (1/(2k)) E_f[Xᵗ K⁻¹ X] log₂ e

where E_f denotes expectation with respect to f. Since the expectation is a sum of terms of the form a_{ij} E[X_i X_j], it depends only on the covariance matrix K. Therefore, it will be the same if the expectation is taken with respect to g, because g has the same covariance matrix. Therefore,

    −(1/k) ∫ f(x) log₂ g(x) dx = (1/2) log₂ 2π |K|^{1/k} + (1/(2k)) E_g[Xᵗ K⁻¹ X] log₂ e
                               = −(1/k) ∫ g(x) log₂ g(x) dx
                               = (1/2) log₂ 2πe |K|^{1/k}    by (4)

VQ-EC-29

Definition: The conditional differential entropy of random variables X₁,…,X_k given random variables Y₁,…,Y_m is

    h(X₁,…,X_k | Y₁,…,Y_m) = −∫ f(x,y) log₂ f(x|y) dx dy

Most of the following properties are derived in the same way as the corresponding property for entropy.

(6) h(X|Y) ≤ h(X), with equality iff X and Y are independent.

Derivation: We'll show h(X) − h(X|Y) ≥ 0 with equality iff X is independent of Y.

    h(X) − h(X|Y) = −∫∫ f(x,y) log₂ f(x) dx dy + ∫∫ f(x,y) log₂ f(x|y) dx dy
                  = −(1/ln 2) ∫∫ f(x,y) ln [ f(x) f(y) / f(x,y) ] dx dy
                  ≥ −(1/ln 2) ∫∫ f(x,y) [ f(x) f(y) / f(x,y) − 1 ] dx dy    since ln z ≤ z − 1
                  = −(1/ln 2) [ ∫∫ f(x) f(y) dx dy − ∫∫ f(x,y) dx dy ] = 0

Equality holds if and only if f(x) f(y) = f(x,y) for all x, y; i.e. if and only if X and Y are independent.

VQ-EC-30
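Property (6) has a closed form for jointly Gaussian X and Y with correlation ρ: h(X|Y) = (1/2) log₂ 2πeσ²(1−ρ²). A quick check (an illustrative sketch, unit variance assumed) shows conditioning never increases differential entropy, with equality only at ρ = 0:

```python
import math

def h1(var):
    """Differential entropy (bits) of a scalar Gaussian with variance var."""
    return 0.5 * math.log2(2 * math.pi * math.e * var)

sigma2 = 1.0
h_x = h1(sigma2)
# h(X|Y) for jointly Gaussian (X, Y) with correlation rho.
h_x_given_y = {rho: h1(sigma2 * (1 - rho ** 2)) for rho in (0.0, 0.5, 0.9)}
# Stronger dependence => smaller conditional entropy; rho = 0 gives equality.
```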
10 (7) h(y,...,y n X,,X m ) h(y,,y n X,,X m' ), 0 m' < m, with equality iff Y,,Y n is conditionally independent of X m'+,,x m given X,,X m'. Derivation: Similar to that of (6). (8) Chain rule: h(x,...,x k ) = h(x )+h(x 2 X )+H(X 3 X X 2 )+... +h(x k X X k- ) Derivation: Essentially the same proof as for the chain rule for ordinary entropy, but with H's replace by h's. (9) h(x,...,x k ) h(x ) h(x k ) with equality if and only if X i 's are independent Derivation: Essentially the same proof as for the analogous property for ordinary entropy. VQ-EC-3 Definitions: h k = k h(x,...,x k ) STATIONARY SOURCES h m = h(x n X n-m,x 2,,X n- ) ( h = h(x ) = h ) h k m = k h(xn+k- n Xn-m) n- (h k = h ) h = lim h k = differential entropy-rate of X k Properties: (0) h k+ h k Derivation: Follows from (7) and stationarity. () h k = k (h +h +h 2 + +h k- ) h k- h k- Derivation: Essentially the same as the analogous property for entropy. h k = k h(x,,x k ) = k (h(x )+h(x 2 X )+h(x 3 X,X 2 ) h(x k X,X 2,,X k- )) chain rule (8) = k (h +h +h 2 + +h k- ) by stationarity k (h k- + h k- + h k- + + h k- ) = h k- h k- by (0) VQ-EC-32
Chapter Concentration Inequalities I. Moment generating functions, the Chernoff method, and sub-gaussian and sub-exponential random variables a. Goal for this section: given a random variable X, how does
More informationTowards control over fading channels
Towards control over fading channels Paolo Minero, Massimo Franceschetti Advanced Network Science University of California San Diego, CA, USA mail: {minero,massimo}@ucsd.edu Invited Paper) Subhrakanti
More informationProbability Space. J. McNames Portland State University ECE 538/638 Stochastic Signals Ver
Stochastic Signals Overview Definitions Second order statistics Stationarity and ergodicity Random signal variability Power spectral density Linear systems with stationary inputs Random signal memory Correlation
More informationRandom Variables. Random variables. A numerically valued map X of an outcome ω from a sample space Ω to the real line R
In probabilistic models, a random variable is a variable whose possible values are numerical outcomes of a random phenomenon. As a function or a map, it maps from an element (or an outcome) of a sample
More informationLecture 3: Functions of Symmetric Matrices
Lecture 3: Functions of Symmetric Matrices Yilin Mo July 2, 2015 1 Recap 1 Bayes Estimator: (a Initialization: (b Correction: f(x 0 Y 1 = f(x 0 f(x k Y k = αf(y k x k f(x k Y k 1, where ( 1 α = f(y k x
More informationLECTURE 3. Last time:
LECTURE 3 Last time: Mutual Information. Convexity and concavity Jensen s inequality Information Inequality Data processing theorem Fano s Inequality Lecture outline Stochastic processes, Entropy rate
More information0, otherwise. Find each of the following limits, or explain that the limit does not exist.
Midterm Solutions 1, y x 4 1. Let f(x, y) = 1, y 0 0, otherwise. Find each of the following limits, or explain that the limit does not exist. (a) (b) (c) lim f(x, y) (x,y) (0,1) lim f(x, y) (x,y) (2,3)
More informationChapter 8: Differential entropy. University of Illinois at Chicago ECE 534, Natasha Devroye
Chapter 8: Differential entropy Chapter 8 outline Motivation Definitions Relation to discrete entropy Joint and conditional differential entropy Relative entropy and mutual information Properties AEP for
More informationPhysics 403. Segev BenZvi. Parameter Estimation, Correlations, and Error Bars. Department of Physics and Astronomy University of Rochester
Physics 403 Parameter Estimation, Correlations, and Error Bars Segev BenZvi Department of Physics and Astronomy University of Rochester Table of Contents 1 Review of Last Class Best Estimates and Reliability
More informationChapter 3 sections. SKIP: 3.10 Markov Chains. SKIP: pages Chapter 3 - continued
Chapter 3 sections 3.1 Random Variables and Discrete Distributions 3.2 Continuous Distributions 3.3 The Cumulative Distribution Function 3.4 Bivariate Distributions 3.5 Marginal Distributions 3.6 Conditional
More informationDerivatives and Integrals
Derivatives and Integrals Definition 1: Derivative Formulas d dx (c) = 0 d dx (f ± g) = f ± g d dx (kx) = k d dx (xn ) = nx n 1 (f g) = f g + fg ( ) f = f g fg g g 2 (f(g(x))) = f (g(x)) g (x) d dx (ax
More informationCopyright c 2007 Jason Underdown Some rights reserved. quadratic formula. absolute value. properties of absolute values
Copyright & License Formula Copyright c 2007 Jason Underdown Some rights reserved. quadratic formula absolute value properties of absolute values equation of a line in various forms equation of a circle
More informationTEST CODE: MIII (Objective type) 2010 SYLLABUS
TEST CODE: MIII (Objective type) 200 SYLLABUS Algebra Permutations and combinations. Binomial theorem. Theory of equations. Inequalities. Complex numbers and De Moivre s theorem. Elementary set theory.
More informationStatistical signal processing
Statistical signal processing Short overview of the fundamentals Outline Random variables Random processes Stationarity Ergodicity Spectral analysis Random variable and processes Intuition: A random variable
More informationMath 209B Homework 2
Math 29B Homework 2 Edward Burkard Note: All vector spaces are over the field F = R or C 4.6. Two Compactness Theorems. 4. Point Set Topology Exercise 6 The product of countably many sequentally compact
More informationat Some sort of quantization is necessary to represent continuous signals in digital form
Quantization at Some sort of quantization is necessary to represent continuous signals in digital form x(n 1,n ) x(t 1,tt ) D Sampler Quantizer x q (n 1,nn ) Digitizer (A/D) Quantization is also used for
More informationHamiltonian Mechanics
Chapter 3 Hamiltonian Mechanics 3.1 Convex functions As background to discuss Hamiltonian mechanics we discuss convexity and convex functions. We will also give some applications to thermodynamics. We
More informationMULTIVARIATE PROBABILITY DISTRIBUTIONS
MULTIVARIATE PROBABILITY DISTRIBUTIONS. PRELIMINARIES.. Example. Consider an experiment that consists of tossing a die and a coin at the same time. We can consider a number of random variables defined
More informationHomework 2: Solution
0-704: Information Processing and Learning Sring 0 Lecturer: Aarti Singh Homework : Solution Acknowledgement: The TA graciously thanks Rafael Stern for roviding most of these solutions.. Problem Hence,
More information3. Identify and find the general solution of each of the following first order differential equations.
Final Exam MATH 33, Sample Questions. Fall 7. y = Cx 3 3 is the general solution of a differential equation. Find the equation. Answer: y = 3y + 9 xy. y = C x + C x is the general solution of a differential
More informationChapter 13. Convex and Concave. Josef Leydold Mathematical Methods WS 2018/19 13 Convex and Concave 1 / 44
Chapter 13 Convex and Concave Josef Leydold Mathematical Methods WS 2018/19 13 Convex and Concave 1 / 44 Monotone Function Function f is called monotonically increasing, if x 1 x 2 f (x 1 ) f (x 2 ) It
More informationMonte-Carlo MMD-MA, Université Paris-Dauphine. Xiaolu Tan
Monte-Carlo MMD-MA, Université Paris-Dauphine Xiaolu Tan tan@ceremade.dauphine.fr Septembre 2015 Contents 1 Introduction 1 1.1 The principle.................................. 1 1.2 The error analysis
More informationMultiple Random Variables
Multiple Random Variables This Version: July 30, 2015 Multiple Random Variables 2 Now we consider models with more than one r.v. These are called multivariate models For instance: height and weight An
More informationChapter 5,6 Multiple RandomVariables
Chapter 5,6 Multiple RandomVariables ENCS66 - Probabilityand Stochastic Processes Concordia University Vector RandomVariables A vector r.v. is a function where is the sample space of a random experiment.
More informationMath 131 Exam 2 Spring 2016
Math 3 Exam Spring 06 Name: ID: 7 multiple choice questions worth 4.7 points each. hand graded questions worth 0 points each. 0. free points (so the total will be 00). Exam covers sections.7 through 3.0
More information2.12: Derivatives of Exp/Log (cont d) and 2.15: Antiderivatives and Initial Value Problems
2.12: Derivatives of Exp/Log (cont d) and 2.15: Antiderivatives and Initial Value Problems Mathematics 3 Lecture 14 Dartmouth College February 03, 2010 Derivatives of the Exponential and Logarithmic Functions
More informationLecture Notes 1 Probability and Random Variables. Conditional Probability and Independence. Functions of a Random Variable
Lecture Notes 1 Probability and Random Variables Probability Spaces Conditional Probability and Independence Random Variables Functions of a Random Variable Generation of a Random Variable Jointly Distributed
More information1 Review of di erential calculus
Review of di erential calculus This chapter presents the main elements of di erential calculus needed in probability theory. Often, students taking a course on probability theory have problems with concepts
More information