Rooted Trees with Probabilities Revisited

Georg Böcherer, Institute for Communications Engineering, Technische Universität München
arXiv preprint, v1 [cs.IT], 4 Feb 2013
February 5, 2013

Abstract

Rooted trees with probabilities are convenient to represent a class of random processes with memory. They allow one to describe and analyze variable-length codes for data compression and distribution matching. In this work, the Leaf-Average Node-Sum Interchange Theorem (LANSIT) and its well-known applications to path length and leaf entropy are re-stated. The LANSIT is then applied to informational divergence. Next, the differential LANSIT is derived, which allows normalized functionals of leaf distributions to be written as averages of functionals of branching distributions. Joint distributions of random variables and the corresponding conditional distributions are special cases of leaf distributions and branching distributions. Using the differential LANSIT, Pinsker's inequality is formulated for rooted trees with probabilities, with an application to the approximation of product distributions. In particular, it is shown that if the normalized informational divergence of a distribution and a product distribution approaches zero, then the entropy rate approaches the entropy rate of the product distribution.

Probability notation

Random variable $X$, taking values in $\mathcal{X}$.
Distribution $P_X$: for each $a \in \mathcal{X}$: $P_X(a) := \Pr(X = a)$.
Support $\operatorname{supp} P_X := \{a \in \mathcal{X} : P_X(a) > 0\}$.

Rooted Trees with Probabilities [1, 2, 3]

$\mathcal{L}$: set of leaves. $L$: random variable over $\mathcal{L}$. We identify $\operatorname{supp} P_L = \mathcal{L}$, i.e., a node is a leaf of the tree if it has no successors and is generated with non-zero probability.
$\mathcal{N}$: set of all nodes on paths through the tree; root $0 \in \mathcal{N}$.
$\mathcal{B} := \mathcal{N} \setminus \mathcal{L}$: set of branching nodes.
$\mathcal{L}_j$: leaves below node $j \in \mathcal{N}$; for $j \in \mathcal{L}$, $\mathcal{L}_j = \{j\}$.
$\mathcal{S}_j$: successors of node $j$. $S_j$: random variable over the successors of node $j$. We identify $\mathcal{S}_j = \operatorname{supp} P_{S_j}$.

Node Probabilities

We associate with each node $j \in \mathcal{N}$ a probability

  $Q_j = \sum_{i \in \mathcal{L}_j} P_L(i)$.   (1)

The probabilities of the successors of node $j$ are given by

  $Q_i = Q_j\, P_{S_j}(i), \quad i \in \mathcal{S}_j$.   (2)

Example

(Figure: the example tree. The root 0 has successors 1 and 2, node 1 has the single successor 3, and node 3 has successors 5 and 6. The branch labels are $P_{S_0}(1) = 3/4$, $P_{S_0}(2) = 1/4$, $P_{S_1}(3) = 1$, $P_{S_3}(5) = 2/3$, $P_{S_3}(6) = 1/3$, and the node probabilities are $Q_0 = 1$, $Q_1 = Q_3 = 3/4$, $Q_2 = P_L(2) = 1/4$, $Q_5 = P_L(5) = 1/2$, $Q_6 = P_L(6) = 1/4$.)

$\operatorname{supp} P_L = \mathcal{L} = \{2, 5, 6\}$, $\mathcal{N} = \{0, 1, 2, 3, 5, 6\}$, $\mathcal{B} = \mathcal{N} \setminus \mathcal{L} = \{0, 1, 3\}$, $\operatorname{supp} P_{S_1} = \mathcal{S}_1 = \{3\}$, $\mathcal{L}_1 = \{5, 6\}$.

$Q_1 = \sum_{i \in \mathcal{L}_1} P_L(i) = 1/2 + 1/4 = 3/4$ and $Q_3 = Q_1\, P_{S_1}(3) = 3/4 \cdot 1 = 3/4$.
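
The node-probability recursion is easy to check numerically. Below is a minimal Python sketch (not part of the original slides) that encodes the example tree as a dictionary of branching distributions and propagates $Q_j$ down from the root via (2); the dictionary layout and the function name node_probabilities are my own choices.

    from fractions import Fraction as F

    # Hypothetical encoding of the example tree: each branching node maps to its
    # branching distribution P_{S_j} over its successors (labels as on the slide).
    branching = {
        0: {1: F(3, 4), 2: F(1, 4)},
        1: {3: F(1, 1)},
        3: {5: F(2, 3), 6: F(1, 3)},
    }

    def node_probabilities(branching, root=0):
        """Propagate Q_root = 1 down the tree via Q_i = Q_j * P_{S_j}(i), cf. (2)."""
        Q = {root: F(1)}
        stack = [root]
        while stack:
            j = stack.pop()
            for i, p in branching.get(j, {}).items():
                Q[i] = Q[j] * p
                if i in branching:
                    stack.append(i)
        return Q

    Q = node_probabilities(branching)
    print({j: float(q) for j, q in Q.items()})
    # {0: 1.0, 1: 0.75, 2: 0.25, 3: 0.75, 5: 0.5, 6: 0.25}

    # (1) is recovered from the leaf values: Q_1 = P_L(5) + P_L(6) = 1/2 + 1/4 = 3/4.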

Leaf-Average Node-Sum Interchange Theorem (LANSIT)

LANSIT [1, Theorem 1]. Let $f$ be a function that assigns to each node $j \in \mathcal{N}$ a real value $f(j)$. For each $j \in \mathcal{N} \setminus \{0\}$, define

  $\Delta f(j) := f(j) - f(\text{predecessor of } j)$.

Proposition 1 (LANSIT).

  $\mathrm{E}[f(L)] - f(0) = \sum_{j \in \mathcal{B}} Q_j\, \mathrm{E}[\Delta f(S_j)]$.   (3)

Proof of LANSIT

Consider a tree with leaves $\mathcal{L}$. Let $\mathcal{S}_j \subseteq \mathcal{L}$ be a set of leaves with a common predecessor $j$. Then

  $\sum_{i \in \mathcal{S}_j} P_L(i) f(i) \overset{(a)}{=} \sum_{i \in \mathcal{S}_j} Q_j P_{S_j}(i)\,[f(i) - f(j) + f(j)]$   (4)
  $= Q_j f(j) \underbrace{\textstyle\sum_{i \in \mathcal{S}_j} P_{S_j}(i)}_{=1} + Q_j \sum_{i \in \mathcal{S}_j} P_{S_j}(i)\, \Delta f(i)$   (5)
  $= Q_j f(j) + Q_j\, \mathrm{E}[\Delta f(S_j)]$   (6)

where (a) follows from (2). $\{j\} \cup \mathcal{L} \setminus \mathcal{S}_j$ is the leaf set of a new tree with a reduced number of leaves and with $P_L(j) = Q_j$. Repeat the procedure until $j$ is the root node 0. Then $Q_j = 1$ and $Q_j f(j) = f(0)$.

Path Length Lemma [3, Lemma 2.1]

Function $w(j) :=$ length of the path to node $j$. For each $j \in \mathcal{N} \setminus \{0\}$: $\Delta w(j) = 1$; and $w(0) = 0$.

Proposition 2 (Path Length Lemma).

  $\mathrm{E}[w(L)] = \sum_{j \in \mathcal{B}} Q_j$.   (7)
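
Continuing the sketch above (reusing F, branching and Q), the Path Length Lemma can be verified on the example tree. The leaf probabilities and depths below are read off the example; the rest is assumed scaffolding.

    # Numeric check of the Path Length Lemma (Prop. 2) on the example tree.
    leaves = {2: F(1, 4), 5: F(1, 2), 6: F(1, 4)}       # P_L
    depth  = {0: 0, 1: 1, 2: 1, 3: 2, 5: 3, 6: 3}       # w(j): path length to node j

    E_w = sum(p * depth[i] for i, p in leaves.items())   # E[w(L)]
    sum_Q = sum(Q[j] for j in branching)                 # sum of Q_j over branching nodes
    assert E_w == sum_Q == F(5, 2)                       # both equal 5/2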

Leaf Entropy Lemma [3, Lemma 2.2]

Function $f(i) = -\log_2 Q_i$.

Proposition 3 (Leaf Entropy Lemma).

  $H(P_L) = \sum_{j \in \mathcal{B}} Q_j\, H(P_{S_j})$.   (8)

Proof.

  $H(P_L) = \mathrm{E}[-\log_2 P_L(L)] = \mathrm{E}[f(L)] \overset{(a)}{=} \sum_{j \in \mathcal{B}} Q_j\, \mathrm{E}[\Delta f(S_j)]$   (9)
  $= \sum_{j \in \mathcal{B}} Q_j\, \mathrm{E}\!\left[-\log_2 \tfrac{Q_{S_j}}{Q_j}\right]$   (10)
  $\overset{(b)}{=} \sum_{j \in \mathcal{B}} Q_j\, \mathrm{E}[-\log_2 P_{S_j}(S_j)]$   (11)
  $= \sum_{j \in \mathcal{B}} Q_j\, H(P_{S_j})$   (12)

where (a) follows by the LANSIT and (b) by (2).

Informational Divergence

Function $f(i) = \log_2 \tfrac{Q_i}{Q'_i}$, where $Q'$ denotes the node probabilities induced by a second leaf distribution $P'_L$ on the same tree.

Proposition 4.

  $D(P_L \| P'_L) = \sum_{j \in \mathcal{B}} Q_j\, D(P_{S_j} \| P'_{S_j})$.   (13)

Proof.

  $D(P_L \| P'_L) = \mathrm{E}\!\left[\log_2 \tfrac{P_L(L)}{P'_L(L)}\right] \overset{(a)}{=} \sum_{j \in \mathcal{B}} Q_j\, \mathrm{E}[\Delta f(S_j)]$   (14)
  $= \sum_{j \in \mathcal{B}} Q_j\, \mathrm{E}\!\left[\log_2 \tfrac{Q_{S_j}}{Q_j} \tfrac{Q'_j}{Q'_{S_j}}\right]$   (15)
  $\overset{(b)}{=} \sum_{j \in \mathcal{B}} Q_j\, \mathrm{E}\!\left[\log_2 \tfrac{P_{S_j}(S_j)}{P'_{S_j}(S_j)}\right]$   (16)
  $= \sum_{j \in \mathcal{B}} Q_j\, D(P_{S_j} \| P'_{S_j})$   (17)

where (a) follows from the LANSIT and (b) by (2).
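
A quick numeric check of the Leaf Entropy Lemma on the same example, again reusing the names from the sketches above; the entropy helper H is my own, not from the slides.

    import math

    def H(dist):
        """Entropy in bits of a distribution given as a dict of probabilities."""
        return -sum(float(p) * math.log2(float(p)) for p in dist.values() if p > 0)

    lhs = H(leaves)                                               # H(P_L) = 1.5 bits
    rhs = sum(float(Q[j]) * H(P_Sj) for j, P_Sj in branching.items())
    assert abs(lhs - rhs) < 1e-12                                 # Prop. 3: both equal 1.5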

Random Vectors

Remark. If all paths in a tree have the same length $n$, then $P_L$ can be thought of as a joint distribution $P_{X^n}$ of a random vector $X^n = (X_1, \ldots, X_n)$. In this case, Prop. 3 and Prop. 4 are the chain rules for entropy and informational divergence, respectively.

Differential LANSIT

Differential LANSIT

$B$: random variable over the branching nodes $\mathcal{B}$. Define

  $P_B(j) = \dfrac{Q_j}{\mathrm{E}[w(L)]}, \quad j \in \mathcal{B}$.   (18)

By the path length lemma,

  $\sum_{j \in \mathcal{B}} P_B(j) = \dfrac{\sum_{j \in \mathcal{B}} Q_j}{\mathrm{E}[w(L)]} = 1$,   (19)

so $P_B$ defines a distribution over $\mathcal{B}$.

Proposition 5 (Differential LANSIT).

  $\dfrac{\mathrm{E}[f(L)] - f(0)}{\mathrm{E}[w(L)]} = \mathrm{E}[\Delta f(S_B)]$.   (20)

Note that the expectation on the right-hand side is taken over $P_{S_B|B}\, P_B$.

Example

Consider the path length function $w$. By the Differential LANSIT,

  $\mathrm{E}[\Delta w(S_B)] = \dfrac{\mathrm{E}[w(L)] - w(0)}{\mathrm{E}[w(L)]} = 1$.   (21)

Entropy Rate

Function $f(i) = -\log_2 Q_i$.

Proposition 6.

  $\dfrac{H(P_L)}{\mathrm{E}[w(L)]} = \mathrm{E}[H(P_{S_B})]$.   (22)

Proof.

  $\dfrac{H(P_L)}{\mathrm{E}[w(L)]} = \dfrac{\mathrm{E}[-\log_2 P_L(L)]}{\mathrm{E}[w(L)]} \overset{(a)}{=} \mathrm{E}[\Delta f(S_B)]$   (23)
  $= \mathrm{E}\!\left[-\log_2 \tfrac{Q_{S_B}}{Q_B}\right]$   (24)
  $\overset{(b)}{=} \mathrm{E}[-\log_2 P_{S_B}(S_B)]$   (25)
  $= \mathrm{E}[H(P_{S_B})]$   (26)

where (a) follows by the differential LANSIT and (b) by (2).

Normalized Informational Divergence

Function $f(i) = \log_2 \tfrac{Q_i}{Q'_i}$.

Proposition 7.

  $\dfrac{D(P_L \| P'_L)}{\mathrm{E}[w(L)]} = \mathrm{E}[D(P_{S_B} \| P'_{S_B})]$.   (27)

Proof.

  $\dfrac{D(P_L \| P'_L)}{\mathrm{E}[w(L)]} \overset{(a)}{=} \mathrm{E}\!\left[\log_2 \tfrac{Q_{S_B}}{Q_B} \tfrac{Q'_B}{Q'_{S_B}}\right]$   (28)
  $\overset{(b)}{=} \mathrm{E}\!\left[\log_2 \tfrac{P_{S_B}(S_B)}{P'_{S_B}(S_B)}\right]$   (29)
  $= \mathrm{E}[D(P_{S_B} \| P'_{S_B})]$   (30)

where (a) follows by the differential LANSIT and (b) by (2).
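
Prop. 6 can also be checked on the running example: the sketch below (names reused from the sketches above) forms $P_B$ and compares the entropy rate $H(P_L)/\mathrm{E}[w(L)]$ with the $P_B$-average of the branching entropies.

    # Numeric check of Prop. 6 on the example tree.
    # P_B(j) = Q_j / E[w(L)] puts weight on the branching nodes; the entropy per
    # tree step equals the P_B-average of the branching entropies.
    P_B = {j: Q[j] / E_w for j in branching}
    entropy_rate = H(leaves) / float(E_w)                        # H(P_L)/E[w(L)] = 0.6
    avg_branch_entropy = sum(float(P_B[j]) * H(branching[j]) for j in branching)
    assert abs(entropy_rate - avg_branch_entropy) < 1e-12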

Pinsker's Inequality for Trees

Variational distance

Let $P_X$, $P_Y$ be two distributions on $\mathcal{X}$. The variational distance $d(P_X, P_Y)$ is

  $d(P_X, P_Y) := \sum_{a \in \mathcal{X}} |P_X(a) - P_Y(a)|$.   (31)

Bounds:

  $d(P_X, P_Y) \geq 0$, with equality iff $\forall a \in \mathcal{X}: P_X(a) = P_Y(a)$;   (32)
  $d(P_X, P_Y) \leq 2$, with equality iff $\operatorname{supp} P_X \cap \operatorname{supp} P_Y = \emptyset$.   (33)
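
For reference, a small stand-alone helper for the variational distance and its two extreme cases; the function name is my own choice.

    def variational_distance(P, Q_dist):
        """d(P, Q) = sum over a of |P(a) - Q(a)|, taken over the union of the supports."""
        support = set(P) | set(Q_dist)
        return sum(abs(float(P.get(a, 0)) - float(Q_dist.get(a, 0))) for a in support)

    # 0 <= d <= 2; the extremes are attained for identical and disjointly supported distributions.
    print(variational_distance({'a': 0.5, 'b': 0.5}, {'a': 0.5, 'b': 0.5}))  # 0.0
    print(variational_distance({'a': 1.0}, {'b': 1.0}))                      # 2.0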

Approximating Distributions

Set of distributions over $\mathcal{X}$: $\mathcal{P}_{\mathcal{X}}$.

Proposition 8.

(i) Pinsker's Inequality:

  $D(P_X \| P_Y) \geq \dfrac{1}{2 \ln 2}\, d^2(P_X, P_Y)$.   (34)

(ii) Let $\{P_{X_k}\}_{k=1}^{\infty}$ be a sequence of distributions in $\mathcal{P}_{\mathcal{X}}$. Then

  $D(P_{X_k} \| P_Y) \xrightarrow{k \to \infty} 0 \;\Rightarrow\; d(P_{X_k}, P_Y) \xrightarrow{k \to \infty} 0$.   (35)

(iii) Let $g$ be a function on $\mathcal{P}_{\mathcal{X}}$ that is continuous in $P_Y$. Then

  $D(P_{X_k} \| P_Y) \xrightarrow{k \to \infty} 0 \;\Rightarrow\; |g(P_{X_k}) - g(P_Y)| \xrightarrow{k \to \infty} 0$.   (36)

Example: Entropy

By [4, Lemma 2.7], entropy is continuous in any distribution $P_Y \in \mathcal{P}_{\mathcal{X}}$. Thus

  $D(P_{X_k} \| P_Y) \xrightarrow{k \to \infty} 0 \;\Rightarrow\; |H(P_{X_k}) - H(P_Y)| \xrightarrow{k \to \infty} 0$.   (37)
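
A minimal numeric illustration of Pinsker's inequality (34), reusing variational_distance from the previous sketch; the two example distributions and the divergence helper D are arbitrary choices of mine.

    def D(P, Q_dist):
        """Informational divergence D(P || Q) in bits (assumes supp P contained in supp Q)."""
        return sum(float(p) * math.log2(float(p) / float(Q_dist[a]))
                   for a, p in P.items() if p > 0)

    P1 = {'a': 0.6, 'b': 0.4}
    P2 = {'a': 0.5, 'b': 0.5}
    d = variational_distance(P1, P2)
    assert D(P1, P2) >= d**2 / (2 * math.log(2))   # Pinsker: D >= d^2 / (2 ln 2)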

Product Distributions

Consider a tree and let $P_S$ be a branching distribution. Assign (footnote 1) $P_{S_j} = P_S$ for all branching nodes $j \in \mathcal{B}$. We call the resulting node probabilities the product distribution $P^+_S$.

For any complete tree with leaves $\mathcal{L}$, $P^+_S$ defines a leaf distribution, i.e., $\sum_{i \in \mathcal{L}} P^+_S(i) = 1$.

For any (possibly non-complete) tree with leaves $\mathcal{L}$, we define the informational divergence between the leaf distribution $P_L$ and $P^+_S$ as

  $D(P_L \| P^+_S) := \sum_{i \in \mathcal{L}} P_L(i) \log_2 \dfrac{P_L(i)}{P^+_S(i)}$.   (38)

Footnote 1: This is a slight abuse of notation, since for $j \neq i$, $\mathcal{S}_j \neq \mathcal{S}_i$. However, we can think of $P_{S_j}$ as a distribution over branch labels. For example, for a binary tree, $P_{S_j}$ is then a distribution over the labels $\{0, 1\}$ for all $j$, and the assignment $P_{S_j} = P_S$ is meaningful.

Approximating Distributions on Trees

Proposition 9.

(i) Pinsker's Inequality for Trees:

  $\dfrac{D(P_L \| P'_L)}{\mathrm{E}[w(L)]} \geq \dfrac{1}{2 \ln 2}\, \mathrm{E}[d^2(P_{S_B}, P'_{S_B})]$.   (39)

(ii) For any $\epsilon > 0$,

  $\dfrac{D(P_L \| P'_L)}{\mathrm{E}[w(L)]} \to 0 \;\Rightarrow\; \Pr[d(P_{S_B}, P'_{S_B}) \geq \epsilon] \to 0$.   (40)

(iii) Let $P_S$ be a branching distribution and let $g$ be a function on the set of distributions over branch labels that is bounded and continuous in $P_S$. Then

  $\dfrac{D(P_L \| P^+_S)}{\mathrm{E}[w(L)]} \to 0 \;\Rightarrow\; |\mathrm{E}[g(P_{S_B})] - g(P_S)| \to 0$.   (41)

Proof: see the following slides.
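
The divergence (38) can be computed directly from the branch labels along the path to each leaf. The sketch below assumes binary branch labels '0'/'1', a uniform $P_S$, and a hypothetical label path for each leaf of the example tree (the labels are not given on the slides); it reuses leaves and math from the sketches above. Note that the resulting tree is non-complete, which (38) explicitly allows.

    P_S = {'0': 0.5, '1': 0.5}               # the same branching distribution at every node
    paths = {2: '1', 5: '000', 6: '001'}     # hypothetical label sequence to each leaf

    def product_prob(path):
        """P_S^+(i): product of P_S over the branch labels on the path to leaf i."""
        out = 1.0
        for label in path:
            out *= P_S[label]
        return out

    D_plus = sum(float(p) * math.log2(float(p) / product_prob(paths[i]))
                 for i, p in leaves.items())
    print(D_plus)   # D(P_L || P_S^+) in bits; 1.0 for this choice of labels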

Entropy Rate

By Prop. 6,

  $\dfrac{H(P_L)}{\mathrm{E}[w(L)]} = \mathrm{E}[H(P_{S_B})]$.   (42)

$H$ is continuous and bounded. Thus by Prop. 9(iii) we have the following proposition.

Proposition 10.

  $\dfrac{D(P_L \| P^+_S)}{\mathrm{E}[w(L)]} \to 0 \;\Rightarrow\; \left|\dfrac{H(P_L)}{\mathrm{E}[w(L)]} - H(P_S)\right| \to 0$.   (43)

Random Vectors

Remark (see also the earlier Random Vectors remark). If all paths in a tree have the same length $n$, then $P_L$ can be thought of as a joint distribution $P_{X^n}$ of a random vector $X^n = (X_1, \ldots, X_n)$. The (tree) product distribution $P^+_S$ is then the conventional product distribution $P_S^n$. Prop. 9 applies, and in particular Prop. 10 becomes

  $\dfrac{D(P_{X^n} \| P_S^n)}{n} \xrightarrow{n \to \infty} 0 \;\Rightarrow\; \left|\dfrac{H(P_{X^n})}{n} - H(P_S)\right| \xrightarrow{n \to \infty} 0$.   (44)
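
For the random-vector case with a uniform reference $P_S$, the identity $D(P_{X^n} \| P_S^n)/n = H(P_S) - H(P_{X^n})/n$ makes (44) transparent. The tiny check below (an assumption of mine, not from the slides) uses an arbitrary joint distribution for $n = 2$ and a uniform binary $P_S$, so $H(P_S) = 1$ bit.

    # n = 2, binary alphabet, uniform reference distribution P_S.
    P_X2 = {'00': 0.4, '01': 0.1, '10': 0.1, '11': 0.4}   # hypothetical joint distribution
    P_single = {'0': 0.5, '1': 0.5}
    n = 2

    D_n = sum(p * math.log2(p / (P_single[x[0]] * P_single[x[1]]))
              for x, p in P_X2.items() if p > 0)          # D(P_{X^n} || P_S^n)
    H_n = -sum(p * math.log2(p) for p in P_X2.values() if p > 0)   # H(P_{X^n})

    # With a uniform P_S, normalized divergence + entropy rate = H(P_S) = 1 bit exactly.
    assert abs(D_n / n + H_n / n - 1.0) < 1e-12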

Proof of Prop. 9(i).

  $\dfrac{D(P_L \| P'_L)}{\mathrm{E}[w(L)]} \overset{(a)}{=} \mathrm{E}[D(P_{S_B} \| P'_{S_B})]$   (45)
  $\overset{(b)}{\geq} \dfrac{1}{2 \ln 2}\, \mathrm{E}[d^2(P_{S_B}, P'_{S_B})]$   (46)

where (a) follows by Prop. 7 and (b) follows by Pinsker's inequality.

Proof of Prop. 9(ii).

Suppose $\mathrm{E}[d(P_{S_B}, P'_{S_B})] < \epsilon^2$ for some $\epsilon > 0$. Then

  $\Pr[d(P_{S_B}, P'_{S_B}) \geq \epsilon] \overset{(a)}{\leq} \dfrac{\mathrm{E}[d(P_{S_B}, P'_{S_B})]}{\epsilon}$   (47)
  $< \dfrac{\epsilon^2}{\epsilon}$   (48)
  $= \epsilon$   (49)

where (a) follows by Markov's inequality [3, Theo. A.2]. Since $\mathrm{E}[d(P_{S_B}, P'_{S_B})]^2 \leq \mathrm{E}[d^2(P_{S_B}, P'_{S_B})]$, statement (i) drives the expected distance to zero, and together with the bound above, statement (ii) follows.

Proof of Prop. 9(iii), part 1.

By assumption, $g$ is bounded and continuous in $P_S$. By boundedness, there exists a value $g_{\max} < \infty$ such that

  $\forall j: |g(P_{S_j}) - g(P_S)| \leq g_{\max}$.   (50)

By continuity, we know that

  $\forall \delta > 0\ \exists \epsilon_\delta: \forall \epsilon < \epsilon_\delta: \; d(P_{S_j}, P_S) < \epsilon \Rightarrow |g(P_{S_j}) - g(P_S)| < \delta$.   (51)

Define

  $\epsilon = \min\{\epsilon_\delta, \delta\}$.   (52)

Proof of Prop. 9(iii), part 2.

Suppose $\mathrm{E}[d(P_{S_B}, P_S)] < \epsilon^2$. We write

  $|\mathrm{E}[g(P_{S_B})] - g(P_S)| = \Bigl|\sum_{j \in \mathcal{B}} P_B(j)\,[g(P_{S_j}) - g(P_S)]\Bigr| \leq \sum_{j \in \mathcal{B}} P_B(j)\, |g(P_{S_j}) - g(P_S)|$   (53)
  $= \sum_{j:\, d(P_{S_j}, P_S) < \epsilon} P_B(j)\, |g(P_{S_j}) - g(P_S)| \;+\; \sum_{j:\, d(P_{S_j}, P_S) \geq \epsilon} P_B(j)\, |g(P_{S_j}) - g(P_S)|$.   (54)

We next bound the two sums in (54).

Proof of Prop. 9(iii), part 3.

The first sum in (54) is bounded as

  $\sum_{j:\, d(P_{S_j}, P_S) < \epsilon} P_B(j)\, |g(P_{S_j}) - g(P_S)| \overset{(a)}{\leq} \sum_{j:\, d(P_{S_j}, P_S) < \epsilon} P_B(j)\, \delta \leq \delta$   (55)

where (a) follows by (51) and (52).

Proof of Prop. 9(iii), part 4.

The second sum in (54) is bounded as

  $\sum_{j:\, d(P_{S_j}, P_S) \geq \epsilon} P_B(j)\, |g(P_{S_j}) - g(P_S)| \overset{(a)}{\leq} \sum_{j:\, d(P_{S_j}, P_S) \geq \epsilon} P_B(j)\, g_{\max} \overset{(b)}{\leq} \epsilon\, g_{\max} \overset{(c)}{\leq} \delta\, g_{\max}$   (56)

where (a) follows by (50), (b) follows by our assumption $\mathrm{E}[d(P_{S_B}, P_S)] < \epsilon^2$ and the Markov-inequality step in the proof of Prop. 9(ii), and (c) follows by (52).

Proof of Prop. 9(iii), part 5.

Using (55) and (56) in (54), we get

  $|\mathrm{E}[g(P_{S_B})] - g(P_S)| \leq \delta + \delta\, g_{\max} = \delta(1 + g_{\max})$.   (57)

For $\delta \to 0$, the error bound on the right-hand side goes to zero, which proves part (iii) of Prop. 9.

References

[1] R. A. Rueppel and J. L. Massey, "Leaf-average node-sum interchanges in rooted trees with applications," in Communications and Cryptography: Two Sides of One Tapestry, R. E. Blahut, D. J. Costello Jr., U. Maurer, and T. Mittelholzer, Eds. Kluwer Academic Publishers.
[2] J. L. Massey, "Applied digital information theory I," lecture notes, ETH Zurich. [Online]. Available: scr/adit1.pdf
[3] G. Kramer, "Information theory," lecture notes, TU Munich, edition WS 2012/2013.
[4] I. Csiszár and J. Körner, Information Theory: Coding Theorems for Discrete Memoryless Systems. Cambridge University Press.
