Rooted Trees with Probabilities Revisited
Rooted Trees with Probabilities Revisited

Georg Böcherer
Institute for Communications Engineering
Technische Universität München

arXiv: v1 [cs.IT] 4 Feb 2013
February 5, 2013

Abstract

Rooted trees with probabilities are convenient for representing a class of random processes with memory. They allow one to describe and analyze variable-length codes for data compression and distribution matching. In this work, the Leaf-Average Node-Sum Interchange Theorem (LANSIT) and its well-known applications to path length and leaf entropy are re-stated. The LANSIT is then applied to informational divergence. Next, the differential LANSIT is derived, which allows normalized functionals of leaf distributions to be written as averages of functionals of branching distributions. Joint distributions of random variables and the corresponding conditional distributions are special cases of leaf distributions and branching distributions. Using the differential LANSIT, Pinsker's inequality is formulated for rooted trees with probabilities, with an application to the approximation of product distributions. In particular, it is shown that if the normalized informational divergence between a distribution and a product distribution approaches zero, then the entropy rate approaches the entropy rate of the product distribution.

1 / 34 - 2 / 34
Probability notation

- Random variable X, taking values in the alphabet X.
- Distribution P_X: for each a ∈ X: P_X(a) := Pr(X = a).
- Support: supp P_X := {a ∈ X : P_X(a) > 0}.

3 / 34

Rooted Trees with Probabilities [1, 2, 3]

- L: set of leaves. L: random variable over L.
- We identify supp P_L = L, i.e., a node is a leaf of a tree if it has no successors and is generated with non-zero probability.
- N: set of all nodes on paths through the tree; root 0 ∈ N.
- B := N \ L: set of branching nodes.
- L_j: leaves below node j ∈ N. For j ∈ L, L_j = {j}.
- S_j: successors of node j. S_j: random variable over the successors of node j. We identify S_j = supp P_{S_j}.

4 / 34
Node Probabilities

We associate with each node j ∈ N a probability

Q_j = Σ_{i ∈ L_j} P_L(i).   (1)

The probabilities of the successors of node j are given by

Q_i = Q_j P_{S_j}(i),  i ∈ S_j.   (2)

5 / 34

Example

[Tree figure: root 0 with successors 1 and 2; node 1 with successor 3; node 3 with successors 5 and 6. Leaf probabilities Q_5 = P_L(5) = 1/2, Q_6 = P_L(6) = 1/4, Q_2 = P_L(2) = 1/4; node probabilities Q_0 = 1, Q_1 = Q_3 = 3/4.]

supp P_L = L = {2, 5, 6}
N = {0, 1, 2, 3, 5, 6}
B = N \ L = {0, 1, 3}
supp P_{S_1} = S_1 = {3},  L_1 = {5, 6}
Q_1 = Σ_{i ∈ L_1} P_L(i) = 1/2 + 1/4 = 3/4
Q_3 = Q_1 P_{S_1}(3) = 3/4 · 1 = 3/4

6 / 34
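The relations (1) and (2) are easy to check numerically. The following Python sketch rebuilds the example tree above; the dictionary representation and the derived successor probabilities are an illustrative reconstruction of the figure, not code from the paper.

```python
from fractions import Fraction as F

# Reconstructed example tree from Slide 6: branching node -> list of successors.
children = {0: [1, 2], 1: [3], 3: [5, 6]}
P_L = {2: F(1, 4), 5: F(1, 2), 6: F(1, 4)}   # leaf distribution P_L

def leaves_below(j):
    """Return L_j, the set of leaves below node j."""
    if j not in children:                     # j has no successors, so it is a leaf
        return {j}
    return set().union(*(leaves_below(i) for i in children[j]))

# Node probabilities, eq. (1): Q_j is the sum of P_L(i) over the leaves i below j.
nodes = {0, 1, 2, 3, 5, 6}
Q = {j: sum(P_L[i] for i in leaves_below(j)) for j in nodes}
print(Q)        # Q_0 = 1, Q_1 = Q_3 = 3/4, Q_5 = 1/2, Q_2 = Q_6 = 1/4

# Successor distributions, eq. (2) rearranged: P_{S_j}(i) = Q_i / Q_j.
P_S = {j: {i: Q[i] / Q[j] for i in children[j]} for j in children}
print(P_S[3])   # {5: 2/3, 6: 1/3}
```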
Leaf-Average Node-Sum Interchange Theorem (LANSIT)

LANSIT [1, Theorem 1]

Let f be a function that assigns to each node j ∈ N a real value f(j). For each j ∈ N \ {0}, define

Δf(j) := f(j) - f(predecessor of j).

Proposition 1 (LANSIT)

E[f(L)] - f(0) = Σ_{j ∈ B} Q_j E[Δf(S_j)].   (3)

8 / 34
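As a sanity check, Proposition 1 can be verified numerically on the example tree; the sketch below (using the same assumed tree representation as before) compares both sides of (3) for arbitrary node functions f.

```python
from fractions import Fraction as F

children = {0: [1, 2], 1: [3], 3: [5, 6]}    # reconstructed example tree (Slide 6)
Q = {0: F(1), 1: F(3, 4), 2: F(1, 4), 3: F(3, 4), 5: F(1, 2), 6: F(1, 4)}
leaves = [2, 5, 6]

def lansit_both_sides(f):
    """Return (E[f(L)] - f(0), sum over branching nodes j of Q_j E[Delta f(S_j)])."""
    lhs = sum(Q[i] * f(i) for i in leaves) - f(0)
    rhs = sum(Q[j] * sum((Q[i] / Q[j]) * (f(i) - f(j)) for i in children[j])
              for j in children)
    return lhs, rhs

# Both sides of (3) agree for arbitrary node functions f.
print(lansit_both_sides(lambda j: F(j * j)))
print(lansit_both_sides(lambda j: F(1) if j in leaves else F(0)))
```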
Proof of LANSIT

Consider a tree with leaves L. Let S_j ⊆ L be a set of leaves with a common predecessor j.

Σ_{i ∈ S_j} P_L(i) f(i)
  (a)= Σ_{i ∈ S_j} Q_j P_{S_j}(i) [f(i) - f(j) + f(j)]   (4)
     = Q_j f(j) Σ_{i ∈ S_j} P_{S_j}(i) + Q_j Σ_{i ∈ S_j} P_{S_j}(i) Δf(i)   (5)
     = Q_j f(j) + Q_j E[Δf(S_j)]   (6)

where (a) follows from (2) and where the first sum in (5) equals 1. Replacing the leaves S_j by their predecessor j yields a new tree with leaf set {j} ∪ L \ S_j, i.e., with a reduced number of leaves, and with P_L(j) = Q_j. Repeat the procedure until j is the root node 0. Then Q_j = 1 and Q_j f(j) = f(0).

9 / 34

Path Length Lemma [3, Lemma 2.1]

Function w(j) := length of the path from the root to node j. For each j ∈ N \ {0}: Δw(j) = 1, and w(0) = 0.

Proposition 2 (Path Length Lemma)

E[w(L)] = Σ_{j ∈ B} Q_j.   (7)

10 / 34
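The path length lemma (7) is a direct instance of the LANSIT with f = w; a quick numerical check on the assumed example tree:

```python
from fractions import Fraction as F

children = {0: [1, 2], 1: [3], 3: [5, 6]}    # reconstructed example tree (Slide 6)
Q = {0: F(1), 1: F(3, 4), 2: F(1, 4), 3: F(3, 4), 5: F(1, 2), 6: F(1, 4)}
leaves = [2, 5, 6]

def depth(j):
    """w(j): length of the path from the root to node j."""
    for parent, successors in children.items():
        if j in successors:
            return depth(parent) + 1
    return 0                                  # j is the root

expected_path_length = sum(Q[i] * depth(i) for i in leaves)   # E[w(L)]
branching_sum = sum(Q[j] for j in children)                   # sum of Q_j, j in B
print(expected_path_length, branching_sum)                    # both equal 5/2
```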
Leaf Entropy Lemma [3, Lemma 2.2]

Function f(i) = -log2 Q_i.

Proposition 3 (Leaf Entropy Lemma)

H(P_L) = Σ_{j ∈ B} Q_j H(P_{S_j}).   (8)

Proof. Note that f(0) = -log2 Q_0 = 0, so

H(P_L) = E[-log2 P_L(L)] = E[f(L)]
  (a)= Σ_{j ∈ B} Q_j E[Δf(S_j)]   (9)
     = Σ_{j ∈ B} Q_j E[-log2 (Q_{S_j} / Q_j)]   (10)
  (b)= Σ_{j ∈ B} Q_j E[-log2 P_{S_j}(S_j)]   (11)
     = Σ_{j ∈ B} Q_j H(P_{S_j})   (12)

where (a) follows by the LANSIT and (b) by (2).

11 / 34

Informational Divergence

Let P'_L be a second leaf distribution on the same tree, with node probabilities Q'_j. Function f(i) = log2 (Q_i / Q'_i).

Proposition 4

D(P_L ‖ P'_L) = Σ_{j ∈ B} Q_j D(P_{S_j} ‖ P'_{S_j}).   (13)

Proof.

D(P_L ‖ P'_L) = E[log2 (P_L(L) / P'_L(L))]
  (a)= Σ_{j ∈ B} Q_j E[Δf(S_j)]   (14)
     = Σ_{j ∈ B} Q_j E[log2 ((Q_{S_j} Q'_j) / (Q_j Q'_{S_j}))]   (15)
  (b)= Σ_{j ∈ B} Q_j E[log2 (P_{S_j}(S_j) / P'_{S_j}(S_j))]   (16)
     = Σ_{j ∈ B} Q_j D(P_{S_j} ‖ P'_{S_j})   (17)

where (a) follows from the LANSIT and (b) by (2).

12 / 34
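Both decompositions (8) and (13) can be confirmed numerically. In the sketch below, P'_L is an arbitrary second leaf distribution chosen for illustration, on the same assumed example tree.

```python
import math
from fractions import Fraction as F

children = {0: [1, 2], 1: [3], 3: [5, 6]}    # reconstructed example tree (Slide 6)
leaves = [2, 5, 6]
P_L  = {2: F(1, 4), 5: F(1, 2), 6: F(1, 4)}  # leaf distribution P_L
P_L2 = {2: F(1, 2), 5: F(1, 4), 6: F(1, 4)}  # an arbitrary second leaf distribution P'_L

def node_probs(P):
    """Compute all node probabilities Q_j from a leaf distribution, eq. (1)."""
    Q = dict(P)
    def fill(j):
        if j in children:
            Q[j] = sum(fill(i) for i in children[j])
        return Q[j]
    fill(0)
    return Q

Q, Q2 = node_probs(P_L), node_probs(P_L2)

# Leaf Entropy Lemma, eq. (8): H(P_L) = sum_j Q_j H(P_{S_j}).
H_leaf = -sum(p * math.log2(p) for p in P_L.values())
H_branch = sum(Q[j] * -sum((Q[i] / Q[j]) * math.log2(Q[i] / Q[j]) for i in children[j])
               for j in children)
print(H_leaf, H_branch)          # both 1.5 bits

# Informational divergence, eq. (13): D(P_L || P'_L) = sum_j Q_j D(P_{S_j} || P'_{S_j}).
D_leaf = sum(P_L[i] * math.log2(P_L[i] / P_L2[i]) for i in leaves)
D_branch = sum(Q[j] * sum((Q[i] / Q[j]) * math.log2((Q[i] / Q[j]) / (Q2[i] / Q2[j]))
                          for i in children[j])
               for j in children)
print(D_leaf, D_branch)          # both 0.25 bits
```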
Random Vectors

Remark. If all paths in a tree have the same length n, then P_L can be thought of as a joint distribution P_{X^n} of a random vector X^n = (X_1, ..., X_n). In this case, Prop. 3 and Prop. 4 are the chain rules for entropy and informational divergence, respectively.

13 / 34

Differential LANSIT
Differential LANSIT

B: random variable over the branching nodes B. Define

P_B(j) = Q_j / E[w(L)],  j ∈ B.   (18)

By the path length lemma,

Σ_{j ∈ B} P_B(j) = Σ_{j ∈ B} Q_j / E[w(L)] = 1,   (19)

so P_B defines a distribution over B.

Proposition 5 (Differential LANSIT)

(E[f(L)] - f(0)) / E[w(L)] = E[Δf(S_B)].   (20)

Note that the expectation on the right-hand side is over both B ~ P_B and the successor variable S_B.

15 / 34

Example

Consider the path length function w. By the Differential LANSIT,

E[Δw(S_B)] = (E[w(L)] - w(0)) / E[w(L)] = 1.   (21)

16 / 34
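The branching-node distribution P_B and the differential LANSIT (20) can again be checked on the assumed example tree; the last line reproduces the path length identity (21).

```python
from fractions import Fraction as F

children = {0: [1, 2], 1: [3], 3: [5, 6]}    # reconstructed example tree (Slide 6)
Q = {0: F(1), 1: F(3, 4), 2: F(1, 4), 3: F(3, 4), 5: F(1, 2), 6: F(1, 4)}
leaves = [2, 5, 6]

E_w = sum(Q[j] for j in children)            # E[w(L)] by the path length lemma
P_B = {j: Q[j] / E_w for j in children}      # branching-node distribution, eq. (18)
assert sum(P_B.values()) == 1                # eq. (19)

def differential_lansit(f):
    """Return both sides of eq. (20) for a node function f."""
    lhs = (sum(Q[i] * f(i) for i in leaves) - f(0)) / E_w
    rhs = sum(P_B[j] * sum((Q[i] / Q[j]) * (f(i) - f(j)) for i in children[j])
              for j in children)
    return lhs, rhs

print(differential_lansit(lambda j: F(j)))   # both sides agree for arbitrary f

# Path length example, eq. (21): with f = w, both sides equal 1.
depth = {0: 0, 1: 1, 2: 1, 3: 2, 5: 3, 6: 3}
print(differential_lansit(lambda j: F(depth[j])))
```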
Entropy Rate

Function f(i) = -log2 Q_i.

Proposition 6

H(P_L) / E[w(L)] = E[H(P_{S_B})].   (22)

Proof.

H(P_L) / E[w(L)] = E[-log2 P_L(L)] / E[w(L)]
  (a)= E[-log2 Q_{S_B} + log2 Q_B]   (23)
     = E[-log2 (Q_{S_B} / Q_B)]   (24)
  (b)= E[-log2 P_{S_B}(S_B)]   (25)
     = E[H(P_{S_B})]   (26)

where (a) follows by the differential LANSIT and (b) by (2).

17 / 34

Normalized Informational Divergence

Function f(i) = log2 (Q_i / Q'_i).

Proposition 7

D(P_L ‖ P'_L) / E[w(L)] = E[D(P_{S_B} ‖ P'_{S_B})].   (27)

Proof.

D(P_L ‖ P'_L) / E[w(L)]
  (a)= E[log2 ((Q_{S_B} Q'_B) / (Q_B Q'_{S_B}))]   (28)
  (b)= E[log2 (P_{S_B}(S_B) / P'_{S_B}(S_B))]   (29)
     = E[D(P_{S_B} ‖ P'_{S_B})]   (30)

where (a) follows by the differential LANSIT and (b) by (2).

18 / 34
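Proposition 6 says that the leaf entropy, normalized by the expected path length, equals the P_B-average of the branching entropies. A short numerical check on the assumed example tree:

```python
import math
from fractions import Fraction as F

children = {0: [1, 2], 1: [3], 3: [5, 6]}    # reconstructed example tree (Slide 6)
Q = {0: F(1), 1: F(3, 4), 2: F(1, 4), 3: F(3, 4), 5: F(1, 2), 6: F(1, 4)}
leaves, E_w = [2, 5, 6], F(5, 2)             # E[w(L)] = 5/2 for this tree

def H(dist):
    """Entropy in bits of a distribution given as {outcome: probability}."""
    return -sum(float(p) * math.log2(p) for p in dist.values() if p > 0)

P_B = {j: Q[j] / E_w for j in children}      # branching-node distribution

# Prop. 6: the normalized leaf entropy equals the P_B-average branching entropy.
lhs = H({i: Q[i] for i in leaves}) / E_w
rhs = sum(P_B[j] * H({i: Q[i] / Q[j] for i in children[j]}) for j in children)
print(lhs, rhs)                              # both 0.6 bits per tree step
```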
Pinsker's Inequality for Trees

Variational distance

P_X, P_Y: two distributions on X. The variational distance d(P_X, P_Y) is

d(P_X, P_Y) := Σ_{a ∈ X} |P_X(a) - P_Y(a)|.   (31)

Bounds:

d(P_X, P_Y) ≥ 0, with equality iff ∀ a ∈ X: P_X(a) = P_Y(a).   (32)
d(P_X, P_Y) ≤ 2, with equality iff supp P_X ∩ supp P_Y = ∅.   (33)

20 / 34
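A minimal helper for the variational distance (31), together with checks of the bounds (32) and (33); the example distributions are arbitrary.

```python
def variational_distance(P, Q):
    """d(P, Q) = sum over the alphabet of |P(a) - Q(a)|, eq. (31)."""
    alphabet = set(P) | set(Q)
    return sum(abs(P.get(a, 0.0) - Q.get(a, 0.0)) for a in alphabet)

P = {'a': 0.5, 'b': 0.5}
print(variational_distance(P, P))                        # 0.0, lower bound (32)
print(variational_distance(P, {'c': 1.0}))               # 2.0, disjoint supports (33)
print(variational_distance(P, {'a': 0.25, 'b': 0.75}))   # 0.5
```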
Approximating Distributions

Set of distributions over X: P_X.

Proposition 8

i. Pinsker's Inequality:

   D(P_X ‖ P_Y) ≥ (1 / (2 ln 2)) d²(P_X, P_Y).   (34)

ii. Let {P_{X_k}}_{k=1}^∞ be a sequence of distributions in P_X. Then

   D(P_{X_k} ‖ P_Y) → 0 (k → ∞)  ⇒  d(P_{X_k}, P_Y) → 0 (k → ∞).   (35)

iii. Let g be a function on P_X that is continuous in P_Y. Then

   D(P_{X_k} ‖ P_Y) → 0 (k → ∞)  ⇒  |g(P_{X_k}) - g(P_Y)| → 0 (k → ∞).   (36)

21 / 34

Example: Entropy

By [4, Lemma 2.7], entropy is continuous in any distribution P_Y ∈ P_X. Thus

D(P_{X_k} ‖ P_Y) → 0 (k → ∞)  ⇒  |H(P_{X_k}) - H(P_Y)| → 0 (k → ∞).   (37)

22 / 34
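The sketch below illustrates Proposition 8 numerically for an assumed sequence of binary distributions P_{X_k} approaching P_Y: Pinsker's bound (34) holds in every row, and both the variational distance (35) and the entropy gap (37) vanish along with the divergence.

```python
import math

def D(P, Q):      # informational divergence in bits
    return sum(p * math.log2(p / Q[a]) for a, p in P.items() if p > 0)

def d(P, Q):      # variational distance, eq. (31)
    return sum(abs(P.get(a, 0) - Q.get(a, 0)) for a in set(P) | set(Q))

def H(P):         # entropy in bits
    return -sum(p * math.log2(p) for p in P.values() if p > 0)

P_Y = {'0': 0.5, '1': 0.5}
for k in range(1, 6):
    eps = 0.5 ** k
    P_Xk = {'0': 0.5 + eps / 2, '1': 0.5 - eps / 2}
    # Pinsker (34): first column >= second column; (35) and (37): all columns shrink.
    print(D(P_Xk, P_Y), d(P_Xk, P_Y) ** 2 / (2 * math.log(2)), abs(H(P_Xk) - H(P_Y)))
```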
Product Distributions

Consider a tree and let P_S be a branching distribution. Assign¹ P_{S_j} = P_S for all branching nodes j ∈ B. We call the resulting node probabilities the product distribution P_S⁺.

For any complete tree with leaves L, P_S⁺ defines a leaf distribution, i.e., Σ_{i ∈ L} P_S⁺(i) = 1.

For any (possibly non-complete) tree with leaves L, we define the informational divergence between the leaf distribution P_L and P_S⁺ as

D(P_L ‖ P_S⁺) := Σ_{i ∈ L} P_L(i) log2 (P_L(i) / P_S⁺(i)).   (38)

¹ This is a slight abuse of notation, since for j ≠ i, S_j ≠ S_i. However, we can think of P_{S_j} as a distribution over branch labels. For example, for a binary tree, P_{S_j} is then a distribution over the labels {0, 1} for all j, and the assignment P_{S_j} = P_S is meaningful.

23 / 34

Approximating Distributions on Trees

Proposition 9

i. Pinsker's Inequality for Trees:

   D(P_L ‖ P'_L) / E[w(L)] ≥ (1 / (2 ln 2)) E[d²(P_{S_B}, P'_{S_B})].   (39)

ii. For any ε > 0,

   D(P_L ‖ P'_L) / E[w(L)] → 0  ⇒  Pr[d(P_{S_B}, P'_{S_B}) ≥ ε] → 0.   (40)

iii. Let P_S be a branching distribution and let g be a function on the branching distributions that is bounded and continuous in P_S. Then

   D(P_L ‖ P_S⁺) / E[w(L)] → 0  ⇒  |E[g(P_{S_B})] - g(P_S)| → 0.   (41)

Proof. See Slides 27-33.

24 / 34
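The construction of P_S⁺ multiplies the same branching distribution along each path. Below is a small sketch for an assumed complete binary tree of depth 2, where leaves are identified with their branch-label strings; the particular P_S and P_L are arbitrary choices for illustration.

```python
import math

# Assumed example: a complete binary tree of depth 2, leaves identified with
# their branch-label strings. P_S is a branching distribution over the labels
# {0, 1} (Slide 23); P_L is an arbitrary leaf distribution for illustration.
P_S = {'0': 0.75, '1': 0.25}
P_L = {'00': 0.5, '01': 0.2, '10': 0.2, '11': 0.1}

def P_S_plus(leaf):
    """Product distribution P_S^+ of a leaf: product of P_S along its path."""
    p = 1.0
    for label in leaf:
        p *= P_S[label]
    return p

# Informational divergence between P_L and P_S^+, eq. (38).
D = sum(p * math.log2(p / P_S_plus(leaf)) for leaf, p in P_L.items() if p > 0)
print(D)   # all paths have length 2 here, so this equals D(P_{X^2} || P_S^2)
```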
Entropy Rate

By Prop. 6,

H(P_L) / E[w(L)] = E[H(P_{S_B})].   (42)

H is continuous and bounded. Thus, by Prop. 9iii, we have the following proposition.

Proposition 10

D(P_L ‖ P_S⁺) / E[w(L)] → 0  ⇒  |H(P_L) / E[w(L)] - H(P_S)| → 0.   (43)

25 / 34

Random Vectors

Remark (see also Slide 13). If all paths in a tree have the same length n, then P_L can be thought of as a joint distribution P_{X^n} of a random vector X^n = (X_1, ..., X_n). The (tree) product distribution P_S⁺ is then the conventional product distribution P_S^n. Prop. 9 applies and, in particular, Prop. 10 becomes

D(P_{X^n} ‖ P_S^n) / n → 0 (n → ∞)  ⇒  |H(P_{X^n}) / n - H(P_S)| → 0 (n → ∞).   (44)

26 / 34
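For the random-vector case (44), a simple illustration: take an assumed family P_{X^n} that is itself a product of Bernoulli(p_n) marginals with p_n → 1/2, and P_S = Bernoulli(1/2). Then the normalized divergence reduces to the single-letter divergence and vanishes, and the entropy rate approaches H(P_S) = 1 bit.

```python
import math

def D_binary(p, q):   # divergence between Bernoulli(p) and Bernoulli(q), in bits
    return p * math.log2(p / q) + (1 - p) * math.log2((1 - p) / (1 - q))

def H_binary(p):      # binary entropy function, in bits
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# Assumed family: P_{X^n} = Bernoulli(p_n)^n with p_n = 1/2 + 1/(2(n+1)).
# Then D(P_{X^n} || P_S^n)/n = D_binary(p_n, 1/2) -> 0 and
# H(P_{X^n})/n = H_binary(p_n) -> H(P_S) = 1 bit, illustrating (44).
for n in (1, 10, 100, 1000):
    p_n = 0.5 + 1 / (2 * (n + 1))
    print(n, D_binary(p_n, 0.5), abs(H_binary(p_n) - 1.0))
```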
Proof of Prop. 9i

D(P_L ‖ P'_L) / E[w(L)]
  (a)= E[D(P_{S_B} ‖ P'_{S_B})]   (45)
  (b)≥ (1 / (2 ln 2)) E[d²(P_{S_B}, P'_{S_B})]   (46)

where (a) follows by Prop. 7 and where (b) follows by Pinsker's inequality.

27 / 34

Proof of Prop. 9ii

Suppose E[d(P_{S_B}, P'_{S_B})] < ε² for some ε > 0. Then

Pr[d(P_{S_B}, P'_{S_B}) ≥ ε]
  (a)≤ E[d(P_{S_B}, P'_{S_B})] / ε   (47)
     < ε² / ε   (48)
     = ε   (49)

where (a) follows by Markov's inequality [3, Theo. A.2]. Together with statement i., statement ii. follows.

28 / 34
Proof of Prop. 9iii (1)

By assumption, g is bounded and continuous in P_S. By boundedness, there exists a value g_max < ∞ such that

∀ j: |g(P_{S_j}) - g(P_S)| ≤ g_max.   (50)

By continuity, we know that

∀ δ > 0 ∃ ε_δ ∀ ε < ε_δ:  d(P_{S_j}, P_S) < ε  ⇒  |g(P_{S_j}) - g(P_S)| < δ.   (51)

Define

ε = min{ε_δ, δ}.   (52)

29 / 34

Proof of Prop. 9iii (2)

Suppose E[d(P_{S_B}, P_S)] < ε². We write

|E[g(P_{S_B})] - g(P_S)| = |Σ_{j ∈ B} P_B(j) [g(P_{S_j}) - g(P_S)]|
  ≤ Σ_{j ∈ B} P_B(j) |g(P_{S_j}) - g(P_S)|   (53)
  = Σ_{j: d(P_{S_j}, P_S) < ε} P_B(j) |g(P_{S_j}) - g(P_S)| + Σ_{j: d(P_{S_j}, P_S) ≥ ε} P_B(j) |g(P_{S_j}) - g(P_S)|.   (54)

We next bound the two sums in (54).

30 / 34
Proof of Prop. 9iii (3)

The first sum in (54) is bounded as

Σ_{j: d(P_{S_j}, P_S) < ε} P_B(j) |g(P_{S_j}) - g(P_S)|  (a)≤  Σ_{j: d(P_{S_j}, P_S) < ε} P_B(j) δ  ≤  δ   (55)

where (a) follows by (51) and (52).

31 / 34

Proof of Prop. 9iii (4)

The second sum in (54) is bounded as

Σ_{j: d(P_{S_j}, P_S) ≥ ε} P_B(j) |g(P_{S_j}) - g(P_S)|
  (a)≤ Σ_{j: d(P_{S_j}, P_S) ≥ ε} P_B(j) g_max
  (b)≤ ε g_max
  (c)≤ δ g_max   (56)

where (a) follows by (50), where (b) follows by our assumption E[d(P_{S_B}, P_S)] < ε² and Slide 28, and where (c) follows by (52).

32 / 34
Proof of Prop. 9iii (5)

Using (55) and (56) in (54), we get

|E[g(P_{S_B})] - g(P_S)| ≤ δ + δ g_max = δ(1 + g_max).   (57)

For δ → 0, the error bound on the right-hand side goes to zero, which proves part iii. of Prop. 9.

33 / 34

References

[1] R. A. Rueppel and J. L. Massey, "Leaf-average node-sum interchanges in rooted trees with applications," in Communications and Cryptography: Two Sides of One Tapestry, R. E. Blahut, D. J. Costello Jr., U. Maurer, and T. Mittelholzer, Eds. Kluwer Academic Publishers.
[2] J. L. Massey, "Applied digital information theory I," lecture notes, ETH Zurich. [Online]. Available: scr/adit1.pdf
[3] G. Kramer, "Information theory," lecture notes, TU Munich, edition WS 2012/2013.
[4] I. Csiszár and J. Körner, Information Theory: Coding Theorems for Discrete Memoryless Systems. Cambridge University Press.

34 / 34