On Upper Bounding Discrete Entropy


On Upper Bounding Discrete Entropy

by

Razan Al-nakhli

A report submitted to the Department of Mathematics and Statistics in conformity with the requirements for the degree of Master of Science.

Queen's University
Kingston, Ontario, Canada
December 2011

Copyright © Razan Al-nakhli, 2011

Abstract

Two upper bounds on the entropy of a discrete random variable are studied. The standard upper bound is derived from the differential entropy bound for a Gaussian random variable. A tighter bound is proved using the transformation formula of the Jacobi theta function and Shannon's inequality. Numerical examples are provided to illustrate the tightness of both bounds.

Acknowledgments

First and foremost, I would like to express my sincerest gratitude to my supervisor Dr. Fady Alajaji for his patience and guidance throughout my journey here at Queen's University. His support has encouraged me to learn more than I had expected to be possible. I would also like to thank Dr. David Wehlau.

Many thanks go to my beloved parents for their unconditional love, patience and support all through my life. To my adorable husband Mohammad: I thank you for your endless love and for standing by my side every step of the way. I could never have imagined how this experience would have been without you.

Last, but not least, I am thankful to all members and staff of the Department of Mathematics and Statistics at Queen's University. I am proud to have considered some of them as my friends here in Kingston.

Contents

Abstract
Acknowledgments
Contents
List of Figures

Chapter 1: Introduction
 1.1 Overview
 1.2 Motivation
 1.3 Organization of Thesis

Chapter 2: Preliminaries
 2.1 Entropy of a Discrete Random Variable
 2.2 Differential Entropy of a Continuous Random Variable

Chapter 3: Two Bounds on Discrete Entropy
 3.1 The Standard Upper Bound on Discrete Entropy
 3.2 A Tight Upper Bound on Discrete Entropy

Chapter 4: Numerical Examples

Chapter 5: Conclusion

Bibliography

List of Figures

4.1 Entropy and the different bounds for p = .1
4.2 Entropy and the different bounds for p = .3
4.3 Entropy and the different bounds for p = .5

Chapter 1

Introduction

Entropy is a key measure in information theory. For discrete alphabet sources, it represents the ultimate rate below which error-free compression cannot be achieved [2].

1.1 Overview

The standard upper bound on discrete entropy (established independently by Cover and Thomas, Willems, and Massey) is derived from the differential entropy of the Gaussian random variable [3]. In 1975, Djackov published a similar bound in his work on coin-weighing [4]. In his 1998 work [5], Mow proved a tightened bound on discrete entropy under certain conditions on the probability mass function of the discrete random variable.

1.2 Motivation

This report addresses the problem of finding bounds for the entropy of discrete random variables. Entropies are ubiquitous in information theory, and the task of deriving bounds for the entropy of

discrete random variables is sometimes crucial because the exact entropy value may not be readily available (particularly when the full distribution of the random variables is not known). Thus, finding entropy bounds that can be easily evaluated from partial knowledge about the random variables (such as their mean and variance) is an interesting and worthwhile task.

1.3 Organization of Thesis

We proceed in Chapter 2 by introducing background on discrete entropy and differential entropy, along with some important definitions and properties. In Chapter 3, we first prove the standard upper bound on discrete entropy and then present in detail a tighter bound due to Mow [5]. In Chapter 4 we apply both bounds to several examples to illustrate their tightness. Chapter 5 concludes the report and outlines future work. Throughout the report, all logarithms are taken to base 2 unless otherwise specified.

Chapter 2

Preliminaries

2.1 Entropy of a Discrete Random Variable

In information theory, entropy is a measure of the uncertainty associated with a random variable. In this context, the term usually refers to the Shannon entropy, which quantifies the expected value of the information contained in a message, usually measured in units such as bits. Here, a message means a specific realization of a random variable X with discrete (finite or countable) alphabet 𝒳 and probability mass function (pmf) p(·).

Definition 1. The entropy H(X) of a discrete random variable X is defined by

H(X) = −Σ_{x∈𝒳} p(x) log_b p(x),

where b is the base of the logarithm and specifies the units. Common values of b are 2, Euler's number e, and 10; the unit of entropy is the bit for b = 2, the nat for b = e, and the digit for b = 10. If p(i) = 0 for some i ∈ 𝒳, the value of the corresponding summand 0 log_b 0 is taken to be 0, which is consistent with the limit lim_{p→0⁺} p log p = 0.

Note that the entropy is a function of the distribution p(·) of X and is sometimes written as H(p); it does not depend on the actual values that the random variable X takes, but only on the probabilities of its outcomes.

Definition 2. The joint entropy of two random variables X and Y with joint pmf p(x, y), where x ∈ 𝒳 and y ∈ 𝒴, is defined as

H(X, Y) = −Σ_{x∈𝒳} Σ_{y∈𝒴} p(x, y) log p(x, y),

where p(x, y) log p(x, y) is defined to be 0 whenever p(x, y) = 0. For more than two variables X₁, ..., Xₙ, the joint entropy is similarly defined as

H(X₁, ..., Xₙ) = −Σ_{x₁} ··· Σ_{xₙ} p(x₁, ..., xₙ) log p(x₁, ..., xₙ),

where p(x₁, ..., xₙ) is the joint pmf of X₁, ..., Xₙ.
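As a quick illustration of Definitions 1 and 2, the following minimal Python sketch computes the entropy of a pmf and the joint entropy of a joint pmf given as a matrix; the function name and the example pmfs are illustrative only.

```python
import numpy as np

def entropy(p, base=2.0):
    """Entropy -sum p_i log_b p_i of a pmf, with the convention 0 log 0 = 0."""
    p = np.asarray(p, dtype=float).ravel()
    nz = p[p > 0]                      # drop zero masses (0 log 0 = 0)
    return -np.sum(nz * np.log(nz)) / np.log(base)

# Entropy of a discrete random variable with pmf (1/2, 1/4, 1/8, 1/8)
print(entropy([0.5, 0.25, 0.125, 0.125]))   # 1.75 bits

# Joint entropy H(X, Y) of a joint pmf given as a matrix p(x, y)
p_xy = np.array([[0.25, 0.25],
                 [0.40, 0.10]])
print(entropy(p_xy))                        # matrix is flattened internally; H(X, Y) in bits
```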

Definition 3. Given a discrete random variable X with alphabet 𝒳 and a random variable Y with alphabet 𝒴, the conditional entropy of Y given X is defined as

H(Y|X) = Σ_{x∈𝒳} p(x) H(Y|X = x)                                  (2.1)
       = −Σ_{x∈𝒳} p(x) Σ_{y∈𝒴} p(y|x) log p(y|x)                  (2.2)
       = −Σ_{x∈𝒳} Σ_{y∈𝒴} p(x, y) log p(y|x),                      (2.3)

where the conditional pmf p(y|x) = p(x, y)/p(x) for x ∈ 𝒳 with p(x) > 0 and y ∈ 𝒴.

Theorem 1 (Chain rule for entropy). H(X, Y) = H(X) + H(Y|X) and, symmetrically, H(X, Y) = H(Y) + H(X|Y).

Proof.

H(X, Y) = −Σ_{x∈𝒳} Σ_{y∈𝒴} p(x, y) log p(x, y)
        = −Σ_{x∈𝒳} Σ_{y∈𝒴} p(x, y) log[p(x) p(y|x)]
        = −Σ_{x∈𝒳} Σ_{y∈𝒴} p(x, y) log p(x) − Σ_{x∈𝒳} Σ_{y∈𝒴} p(x, y) log p(y|x)
        = −Σ_{x∈𝒳} p(x) log p(x) − Σ_{x∈𝒳} Σ_{y∈𝒴} p(x, y) log p(y|x)
        = H(X) + H(Y|X). ∎
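The chain rule is easy to verify numerically. A small sketch that computes H(X, Y) directly and via H(X) + H(Y|X); the joint pmf below is just an example.

```python
import numpy as np

def H(p):
    """Entropy in bits of a pmf given as an array (zero masses allowed)."""
    p = np.asarray(p, dtype=float).ravel()
    nz = p[p > 0]
    return -np.sum(nz * np.log2(nz))

# An example joint pmf p(x, y); rows index x, columns index y
p_xy = np.array([[1/8,  1/16, 1/32, 1/32],
                 [1/16, 1/8,  1/32, 1/32],
                 [1/16, 1/16, 1/16, 1/16],
                 [1/4,  0,    0,    0   ]])

p_x = p_xy.sum(axis=1)                               # marginal pmf of X
# H(Y|X) = sum_x p(x) H(Y | X = x), as in (2.1)
H_Y_given_X = sum(px * H(row / px) for px, row in zip(p_x, p_xy) if px > 0)

print(H(p_xy), H(p_x) + H_Y_given_X)                 # identical: the chain rule
```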

Corollary 1. H(X, Y|Z) = H(X|Z) + H(Y|X, Z).

Properties of Entropy

- H(X) ≥ 0.
- Conditioning reduces entropy: for any two random variables X and Y, we have H(X|Y) ≤ H(X), with equality if and only if X and Y are independent.
- H(X) ≤ log|𝒳| for any random variable X with finite alphabet 𝒳, with equality if and only if X is distributed uniformly over 𝒳.
- H(p) is concave in p.
- The joint entropy of a set of random variables is greater than or equal to all of the individual entropies of the random variables in the set: H(X, Y) ≥ max[H(X), H(Y)] and H(X₁, ..., Xₙ) ≥ max[H(X₁), ..., H(Xₙ)].
- The joint entropy of a set of random variables is less than or equal to the sum of the individual entropies of the random variables in the set, with equality if and only if the random variables are independent of each other: H(X, Y) ≤ H(X) + H(Y) and H(X₁, ..., Xₙ) ≤ H(X₁) + ··· + H(Xₙ).

2.2 Differential Entropy of a Continuous Random Variable

We now deal with continuous (real-valued) random variables that admit a probability density function (pdf).

Definition 4. Let X be a random variable with pdf f(·) and support S = {x ∈ ℝ : f(x) > 0}. The differential entropy of X is denoted by h(X) and defined as

h(X) = −∫_S f(x) log f(x) dx,

assuming that the integral exists.

As with its discrete analog, the units of differential entropy depend on the base of the logarithm, which is usually 2 (i.e., the units are bits). Related concepts such as joint and conditional differential entropy are defined in a similar fashion. However, differential entropy does not share all the properties of discrete entropy: for example, it can be negative, and it is not invariant under invertible maps (such as scaling).

Definition 5. The joint differential entropy of n random variables X₁, X₂, ..., Xₙ with joint pdf f(x₁, x₂, ..., xₙ) is defined as

h(X₁, X₂, ..., Xₙ) = −∫ f(xⁿ) log f(xⁿ) dxⁿ,

assuming that the integral exists, where xⁿ = (x₁, ..., xₙ) and dxⁿ denotes dx₁ ··· dxₙ.

Definition 6. If X and Y have joint pdf f(x, y), we can define the conditional differential entropy h(X|Y) as

h(X|Y) = −∫∫ f(x, y) log f(x|y) dx dy,

assuming that the integral exists.

Properties of differential entropy

- The chain rule for differential entropy holds as in the discrete case:
  h(X₁, ..., Xₙ) = Σ_{i=1}^{n} h(Xᵢ|X₁, ..., X_{i−1}) ≤ Σ_{i=1}^{n} h(Xᵢ).
- Differential entropy is translation invariant: h(X + c) = h(X) for any constant c.
- Differential entropy is not invariant under invertible maps. In particular, for a constant a ≠ 0, h(aX) = h(X) + log|a|. Also, for a random vector X = (X₁, X₂, ..., Xₙ)ᵀ and an n × n invertible matrix A, h(AX) = h(X) + log|det A|, where ᵀ denotes transposition.
- If a random vector Xⁿ ∈ ℝⁿ has zero mean and covariance matrix K, then

  h(Xⁿ) ≤ ½ log[(2πe)ⁿ det K]

with equality if and only if Xⁿ is Gaussian, i.e., the random variables X₁, ..., Xₙ are jointly Gaussian with covariance matrix K. In particular, for a scalar random variable X with variance σ², h(X) ≤ ½ log(2πeσ²), with equality if and only if X is Gaussian.
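To make the scalar case concrete, the following sketch compares a brute-force numerical evaluation of h(X) = −∫ f log₂ f with the closed form ½ log₂(2πeσ²) for a Gaussian density; the variance value is arbitrary.

```python
import numpy as np

sigma2 = 2.5                                       # an arbitrary Gaussian variance
x = np.linspace(-30.0, 30.0, 200_001)
dx = x[1] - x[0]
f = np.exp(-x**2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)

# Numerical differential entropy h(X) = -int f log2 f dx (integrand vanishes in the tails)
integrand = np.where(f > 0, -f * np.log2(f), 0.0)
h_numeric = np.sum(integrand) * dx

h_closed = 0.5 * np.log2(2 * np.pi * np.e * sigma2)
print(h_numeric, h_closed)                         # both ~2.71 bits
```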

Chapter 3

Two Bounds on Discrete Entropy

3.1 The Standard Upper Bound on Discrete Entropy

As noted in the previous chapter, the Gaussian distribution yields the largest differential entropy among all densities with the same variance. We next use this fact to derive an upper bound on the entropy of a discrete random variable.

Theorem 2. Let X be a discrete random variable with countable alphabet 𝒳 = {a₁, a₂, ...} and pmf P[X = aᵢ] = pᵢ, i = 1, 2, 3, .... Then

H(X) ≤ ½ log[ 2πe ( Σᵢ i²pᵢ − (Σᵢ ipᵢ)² + 1/12 ) ],          (3.1)

where H(X) = −Σᵢ pᵢ log pᵢ is the entropy of X.

Proof. Let X₀ be an integer-valued discrete random variable with pmf P[X₀ = i] = pᵢ for i = 1, 2, .... Let U be a continuous random variable uniformly distributed on the interval [0, 1] and independent of X₀. Now define the continuous random variable X̃ = X₀ + U. Since X₀ and U are independent, Var(X̃) = Var(X₀) + Var(U), where Var(·) denotes the variance.

We then have

H(X) = H(X₀) = −Σᵢ pᵢ log pᵢ
     = −Σᵢ [ ∫ᵢ^{i+1} f_X̃(x) dx ] log [ ∫ᵢ^{i+1} f_X̃(x) dx ]
     = −Σᵢ ∫ᵢ^{i+1} f_X̃(x) log f_X̃(x) dx
     = −∫ f_X̃(x) log f_X̃(x) dx
     = h(X̃)
     ≤ ½ log[ 2πe Var(X̃) ]
     = ½ log[ 2πe ( Var(X₀) + Var(U) ) ]
     = ½ log[ 2πe ( Σᵢ i²pᵢ − (Σᵢ ipᵢ)² + 1/12 ) ].

The first equality holds because the discrete entropy depends only on the probabilities and not on the values of the outcomes, and the third equality holds since f_X̃(x) = pᵢ for i ≤ x < i + 1. The inequality follows from the fact that the Gaussian density maximizes the differential entropy among all densities with the same variance, and the last equality uses Var(U) = 1/12. ∎

Note: A tighter bound is possible by appropriately reordering the given set of probability masses. It is conjectured that a good bound is achieved by the assignment ..., p₅, p₃, p₁, p₂, p₄, ... for p₁ ≥ p₂ ≥ p₃ ≥ ....
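A short sketch of the bound (3.1) in code, applied to a truncated geometric pmf so it can be compared with Example 1 below; the truncation length is arbitrary.

```python
import numpy as np

def entropy_bits(p):
    p = np.asarray(p, dtype=float)
    nz = p[p > 0]
    return -np.sum(nz * np.log2(nz))

def standard_bound_bits(p):
    """Standard upper bound (3.1) for a pmf p on the integers 1, 2, ..., N."""
    p = np.asarray(p, dtype=float)
    i = np.arange(1, len(p) + 1)
    var = np.sum(i**2 * p) - np.sum(i * p)**2
    return 0.5 * np.log2(2 * np.pi * np.e * (var + 1.0 / 12.0))

# Geometric(p = 1/2) truncated to {1, ..., 60} and renormalized
q = 0.5 ** np.arange(1, 61)
q /= q.sum()
print(entropy_bits(q), standard_bound_bits(q))   # ~2.0 bits vs ~2.58 bits
```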

Example 1. Let X be a discrete geometric random variable on the set 𝒳 = {1, 2, ...} with parameter p, i.e., P[X = k] = (1 − p)^{k−1} p, and entropy

H(X) = [ −(1 − p) log(1 − p) − p log p ] / p.

The standard upper bound of Theorem 2 gives

H(X) ≤ ½ log[ 2πe ( (1 − p)/p² + 1/12 ) ].

Letting p = 1/2, we get H(X) ≤ ½ log[ 2πe (2 + 1/12) ] ≈ 2.58 bits, while H(X) = 2 bits.

Example 2. Let X be a negative binomial random variable on the set 𝒳 = {0, 1, ...} with pmf

P[X = k] = C(k + r − 1, k) (1 − p)^r p^k,  k ∈ 𝒳,

where the probability p = 1/4 and the integer parameter r > 0. The standard upper bound yields

H(X) ≤ ½ log[ 2πe ( r(1/4)/(1 − 1/4)² + 1/12 ) ] = ½ log[ 2πe ( 16r/36 + 1/12 ) ].

3.2 A Tight Upper Bound on Discrete Entropy

The bound in the previous theorem can be tightened by eliminating the term 1/12. However, it is clear that this is not always possible, because the resulting upper bound would become negative if Var(X) < 1/(2πe).

Lemma 1 (Shannon's inequality). If {aᵢ} and {bᵢ} are sequences of N positive numbers such that a₁ + a₂ + ··· + a_N = 1 and b₁ + b₂ + ··· + b_N ≤ 1, then

−Σ_{i=1}^{N} aᵢ log aᵢ ≤ −Σ_{i=1}^{N} aᵢ log bᵢ,          (3.2)

with equality if and only if aᵢ = bᵢ for i = 1, 2, ..., N.

Proof. To prove (3.2), it is sufficient to prove that

Σ_{i=1}^{N} aᵢ log(aᵢ/bᵢ) ≥ 0.          (3.3)

Inequality (3.3) follows directly from the log-sum inequality, which states that for nonnegative numbers a₁, a₂, ..., a_N and b₁, b₂, ..., b_N,

Σ_{i=1}^{N} aᵢ log(aᵢ/bᵢ) ≥ ( Σ_{i=1}^{N} aᵢ ) log[ ( Σ_{i=1}^{N} aᵢ ) / ( Σ_{i=1}^{N} bᵢ ) ],          (3.4)

with equality if and only if aᵢ/bᵢ = ( Σ_{i=1}^{N} aᵢ ) / ( Σ_{i=1}^{N} bᵢ ) for all i = 1, ..., N. Indeed, using (3.4) and the fact that Σ_{i=1}^{N} aᵢ = 1, we get

Σ_{i=1}^{N} aᵢ log(aᵢ/bᵢ) ≥ (1) log[ 1 / ( Σ_{i=1}^{N} bᵢ ) ] ≥ (1) log(1) = 0,

since Σ_{i=1}^{N} bᵢ ≤ 1. Thus (3.3) is proved, and (3.2) follows directly using the properties of the logarithm. The equality condition for (3.2) is obtained by noting that equality holds in the above inequalities if and only if

aᵢ/bᵢ = ( Σ_{i=1}^{N} aᵢ ) / ( Σ_{i=1}^{N} bᵢ ) = 1 / ( Σ_{i=1}^{N} bᵢ ) for i = 1, 2, ..., N,  and  Σ_{i=1}^{N} bᵢ = 1,

i.e., if and only if aᵢ = bᵢ for i = 1, 2, ..., N. ∎
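Lemma 1 can be checked numerically in a few lines; in the sketch below the sequences are generated at random, with {bᵢ} scaled so that it sums to less than one.

```python
import numpy as np

rng = np.random.default_rng(0)

a = rng.random(10); a /= a.sum()            # a pmf: positive, sums to 1
b = rng.random(10); b /= (b.sum() + 0.3)    # positive, sums to less than 1

lhs = -np.sum(a * np.log2(a))               # -sum a_i log a_i
rhs = -np.sum(a * np.log2(b))               # -sum a_i log b_i
print(lhs <= rhs, lhs, rhs)                 # True: Shannon's inequality (3.2)
```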

Lemma 2. The Jacobi theta function [5], defined by

θ₃(ν|t) = Σ_{i=−∞}^{∞} exp( jπt i² + j2πν i ),

where j = √−1 and ν, t are two complex numbers with Im(t) > 0, satisfies the identity

√(t/j) exp( jπν²/t ) θ₃(ν|t) = θ₃( ν/t | −1/t ).          (3.5)

Theorem 3. Let {p₁, p₂, ..., p_N} be a set of N probability masses. The discrete entropy H(p₁, p₂, ..., p_N) satisfies

H(p₁, p₂, ..., p_N) ≤ ½ log(2πeσ²)          (3.6)

if the following three conditions are satisfied:

2πσ² > 1,
(2πσ²)² i₁² ≥ (i₁ − 1 + µ)² + σ² ln(2πσ²),  and
(2πσ²)² i₂² ≥ (i₂ + N − µ)² + σ² ln(2πσ²),

where

µ = µ(p₁, p₂, ..., p_N) = Σ_{i=1}^{N} i pᵢ,
σ² = σ²(p₁, p₂, ..., p_N) = Σ_{i=1}^{N} pᵢ (i − µ)²,

and i₁, i₂ are the two positive integers nearest to (µ − 1)/((2πσ²)² − 1) and (N − µ)/((2πσ²)² − 1), respectively.
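The conditions of Theorem 3 are mechanical to check numerically. The sketch below evaluates µ, σ², i₁ and i₂ for a pmf on {1, ..., N} and reports whether the tightened bound ½ log₂(2πeσ²) is guaranteed to apply; the natural logarithm is used inside the conditions, the bound is reported in bits, and the example pmf is the one of Example 8 below.

```python
import numpy as np

def tight_bound_applies(p):
    """Check the three sufficient conditions of Theorem 3 for a pmf p on {1, ..., N}."""
    p = np.asarray(p, dtype=float)
    N = len(p)
    i = np.arange(1, N + 1)
    mu = np.sum(i * p)
    var = np.sum(p * (i - mu) ** 2)
    t = 2 * np.pi * var                            # 2*pi*sigma^2
    if t <= 1:
        return False, mu, var
    denom = t**2 - 1
    i1 = max(1, int(round((mu - 1) / denom)))      # nearest positive integer
    i2 = max(1, int(round((N - mu) / denom)))
    c1 = t**2 * i1**2 >= (i1 - 1 + mu) ** 2 + var * np.log(t)
    c2 = t**2 * i2**2 >= (i2 + N - mu) ** 2 + var * np.log(t)
    return (c1 and c2), mu, var

ok, mu, var = tight_bound_applies([0.5, 0.25, 0.125, 0.125])   # pmf of Example 8
print(ok, 0.5 * np.log2(2 * np.pi * np.e * var))               # True, ~2.12 bits
```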

Proof. Let X be an integer-valued random variable with alphabet 𝒳 = {1, 2, ..., N} such that µ and σ² are the mean and the variance of X, respectively. Choosing aᵢ = pᵢ and

bᵢ = (1/√(2πσ²)) exp( −(i − µ)²/(2σ²) )

in Lemma 1 yields

H(X) = H(p₁, p₂, ..., p_N) = −Σ_{i=1}^{N} pᵢ log pᵢ
     ≤ −Σ_{i=1}^{N} pᵢ log[ (1/√(2πσ²)) exp( −(i − µ)²/(2σ²) ) ]
     = Σ_{i=1}^{N} pᵢ [ ((i − µ)²/(2σ²)) log e + ½ log(2πσ²) ]
     = (log e)/(2σ²) Σ_{i=1}^{N} pᵢ (i − µ)² + ½ log(2πσ²) Σ_{i=1}^{N} pᵢ
     = (σ²/(2σ²)) log e + ½ log(2πσ²)
     = ½ log e + ½ log(2πσ²)
     = ½ log(2πeσ²),

provided that

(1/√(2πσ²)) Σ_{i=1}^{N} exp( −(i − µ)²/(2σ²) ) ≤ 1.          (3.7)

Now, using Lemma 2 (note that Im(t) > 0), we let t = j/(2πσ²) and ν = −tµ = −(j/(2πσ²))µ, and substitute these values in (3.5). This yields

√(t/j) exp( jπν²/t ) θ₃(ν|t)
  = (1/√(2πσ²)) exp( −µ²/(2σ²) ) Σ_{i=−∞}^{∞} exp( −(i² − 2µi)/(2σ²) )
  = (1/√(2πσ²)) Σ_{i=−∞}^{∞} exp( −(i² − 2µi + µ²)/(2σ²) )
  = (1/√(2πσ²)) Σ_{i=−∞}^{∞} exp( −(i − µ)²/(2σ²) )
  = θ₃( −µ | j2πσ² ).          (3.8)

Using (3.8), we can rewrite (3.7) as

θ₃( −µ | j2πσ² ) − (1/√(2πσ²)) Σ_{i≤0 or i≥N+1} exp( −(i − µ)²/(2σ²) ) ≤ 1.          (3.9)
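Identity (3.8) is essentially a Poisson-summation (theta-transformation) statement and can be sanity-checked numerically; the sketch below compares the Gaussian sum over the integers with the rapidly converging cosine series that appears in (3.9) and (3.10). The mean and variance values are arbitrary.

```python
import numpy as np

mu, var = 1.75, 0.6875                        # an arbitrary mean and variance

# Left-hand side of (3.8): the Gaussian density sampled at all integers
i = np.arange(-2000, 2001)
lhs = np.sum(np.exp(-(i - mu) ** 2 / (2 * var))) / np.sqrt(2 * np.pi * var)

# Right-hand side: theta_3(-mu | j*2*pi*var) = 1 + 2 sum_{k>=1} exp(-2 pi^2 var k^2) cos(2 pi mu k)
k = np.arange(1, 50)
rhs = 1 + 2 * np.sum(np.exp(-2 * np.pi**2 * var * k**2) * np.cos(2 * np.pi * mu * k))

print(lhs, rhs)                               # the two sums agree to many decimal places
```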

The Jacobi theta function on the left-hand side of (3.9) can be written as

θ₃( −µ | j2πσ² ) = Σ_{i=−∞}^{∞} exp( jπ(j2πσ²)i² − j2πµi )
  = Σ_{i=−∞}^{∞} exp(−2π²σ²i²) exp(−j2πµi)
  = Σ_{i=−∞}^{∞} exp(−2π²σ²i²) [ cos(2πµi) − j sin(2πµi) ]
  = Σ_{i=−∞}^{∞} exp(−2π²σ²i²) cos(2πµi) − j Σ_{i=−∞}^{∞} exp(−2π²σ²i²) sin(2πµi)
  = Σ_{i=−∞}^{∞} exp(−2π²σ²i²) cos(2πµi)
  = 2 Σ_{i=1}^{∞} exp(−2π²σ²i²) cos(2πµi) + 1,

where the imaginary part vanishes since sin(2πµi) + sin(2πµ(−i)) = 0 for every i. Now substituting the above expression in (3.9) yields

2 Σ_{i=1}^{∞} exp(−2π²σ²i²) cos(2πµi) ≤ (1/√(2πσ²)) Σ_{i≤0 or i≥N+1} exp( −(i − µ)²/(2σ²) )
  = (1/√(2πσ²)) Σ_{i=1}^{∞} [ exp( −(−i + 1 − µ)²/(2σ²) ) + exp( −(i + N − µ)²/(2σ²) ) ],          (3.10)

where the last equality is obtained by shifting the support of the index i as follows:

Σ_{i≥N+1} exp( −(i − µ)²/(2σ²) ) = Σ_{i=1}^{∞} exp( −(i + N − µ)²/(2σ²) )  and
Σ_{i≤0} exp( −(i − µ)²/(2σ²) ) = Σ_{i=1}^{∞} exp( −(−i + 1 − µ)²/(2σ²) ).

Sufficient conditions for (3.10) are as follows: for i = 1, 2, ..., we require

exp(−2π²σ²i²) ≤ (1/√(2πσ²)) exp( −max{ (−i + 1 − µ)², (i + N − µ)² } / (2σ²) ),

which, after taking natural logarithms and multiplying by 2σ², becomes

(2πσ²)² i² ≥ max{ (i − 1 + µ)², (i + N − µ)² } + σ² ln(2πσ²),

i.e.,

((2πσ²)² − 1) i² − 2(µ − 1) i − (µ − 1)² − σ² ln(2πσ²) ≥ 0  and
((2πσ²)² − 1) i² − 2(N − µ) i − (N − µ)² − σ² ln(2πσ²) ≥ 0.          (3.11)

Completing the square in (3.11) gives

((2πσ²)² − 1) ( i − (µ − 1)/((2πσ²)² − 1) )² − (µ − 1)² ( 1 + 1/((2πσ²)² − 1) ) − σ² ln(2πσ²) ≥ 0,
((2πσ²)² − 1) ( i − (N − µ)/((2πσ²)² − 1) )² − (N − µ)² ( 1 + 1/((2πσ²)² − 1) ) − σ² ln(2πσ²) ≥ 0.          (3.12)

The left-hand sides of the first and second inequalities in (3.12), viewed as functions of the positive integer i, achieve their minimum values precisely when i = i₁ and i = i₂, respectively, provided that 2πσ² > 1. Substituting these values in (3.11), we obtain the conditions stated in the theorem. ∎

Chapter 4

Numerical Examples

Before illustrating the bounds of the previous chapter with examples, we first prove the following result.

Corollary 2. The discrete entropy bound in (3.6) holds if

max{µ, N + 1 − µ}² < (2πσ²)² − σ² ln(2πσ²).          (4.1)

Proof. If (µ − 1)/((2πσ²)² − 1) and (N − µ)/((2πσ²)² − 1) are both less than 3/2, then i₁ = i₂ = 1. Equivalently, i₁ = i₂ = 1 if

max{µ, N + 1 − µ} < (3/2)(2πσ²)² − 1/2.          (4.2)

Note that max{µ, N + 1 − µ} ≥ (µ + N + 1 − µ)/2 implies that

max{µ, N + 1 − µ} ≥ (N + 1)/2.          (4.3)

From (4.2) and (4.3) we get that (3/2)(2πσ²)² − 1/2 > (N + 1)/2, so that (2πσ²)² > (N + 2)/3 ≥ 1 and hence 2πσ² > 1.

Now, for i₁ = i₂ = 1, the sufficient conditions of Theorem 3 become

(2πσ²)² ≥ max{µ, N + 1 − µ}² + σ² ln(2πσ²),

where ln(2πσ²) is positive; this is exactly condition (4.1). It remains to verify that (4.1) implies (4.2). Let θ = (2πσ²)² > 1; then

( (3θ − 1)/2 )² − θ = (1/4)(9θ − 1)(θ − 1) > 0,

so that ( (3(2πσ²)² − 1)/2 )² > (2πσ²)² ≥ (2πσ²)² − σ² ln(2πσ²) > max{µ, N + 1 − µ}², which gives (4.2). ∎

We now present several examples to evaluate the tightness of the upper bounds derived in the previous chapter.
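Condition (4.1) is simple to evaluate in code; the following sketch checks it for pmfs used in the examples below, with the Bernoulli alphabet relabelled to {1, 2}.

```python
import numpy as np

def corollary_condition(p):
    """Return (lhs, rhs, holds) for condition (4.1), with pmf p on {1, ..., N}."""
    p = np.asarray(p, dtype=float)
    N = len(p)
    i = np.arange(1, N + 1)
    mu = np.sum(i * p)
    var = np.sum(p * (i - mu) ** 2)
    lhs = max(mu, N + 1 - mu) ** 2
    rhs = (2 * np.pi * var) ** 2 - var * np.log(2 * np.pi * var)
    return lhs, rhs, lhs < rhs

for p in ([0.5, 0.5],                         # Bernoulli(1/2) relabelled to {1, 2}
          [0.25, 0.25, 0.25, 0.25],           # uniform on {1, 2, 3, 4}
          [0.5, 0.25, 0.125, 0.125]):         # the pmf of Example 8
    print(corollary_condition(p))
```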

Example 3. Let X be a Bernoulli random variable on the set 𝒳 = {0, 1} with parameter p = 1/2. Then H(X) = 1 bit, µ = 1/2 and σ² = (1/2)(1 − 1/2) = 1/4. The standard upper bound of Theorem 2 yields

H(X) ≤ ½ log[ 2πe ( Var(X) + 1/12 ) ] = ½ log[ 2πe ( 1/4 + 1/12 ) ] ≈ 1.255 bits.

To apply Theorem 3, relabel the alphabet as {1, 2} (which leaves the entropy and the variance unchanged), so that µ = 3/2 and N = 2. Corollary 2 is satisfied since max{µ², (N + 1 − µ)²} = max{2.25, 2.25} = 2.25 and (2πσ²)² − σ² ln(2πσ²) ≈ 2.35. Thus, the upper bound of Theorem 3 yields

H(X) ≤ ½ log( 2πe · 1/4 ) ≈ 1.047 bits,

which is closer to the true entropy H(X) = 1 bit than the standard bound.
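The three numbers in Example 3 can be reproduced directly; a minimal sketch, where the printed values are the exact entropy, the standard bound (3.1) and the tightened bound (3.6), all in bits.

```python
import numpy as np

var = 0.25                                          # variance of a Bernoulli(1/2)
H_exact = 1.0                                       # H(X) = 1 bit
std_bound = 0.5 * np.log2(2 * np.pi * np.e * (var + 1 / 12))
tight_bound = 0.5 * np.log2(2 * np.pi * np.e * var)
print(H_exact, std_bound, tight_bound)              # 1.0, ~1.255, ~1.047
```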

Example 4. Let X be a uniformly distributed discrete random variable on the set 𝒳 = {1, 2, 3, 4}. Then

H(X) = −4 · (1/4) log(1/4) = 2 bits.

Also, µ = E(X) = 5/2 and σ² = E(X²) − (E(X))² = 15/12 = 1.25. Corollary 2 is satisfied since

max{µ², (N + 1 − µ)²} = max{6.25, 6.25} = 6.25 < (2πσ²)² − σ² ln(2πσ²) ≈ 59.1.

The standard upper bound of Theorem 2 yields

H(X) ≤ ½ log( 2πe ( Var(X) + 1/12 ) ) = ½ log( 2πe ( 1.25 + 1/12 ) ) ≈ 2.255 bits.

On the other hand, the bound of Theorem 3 is tighter, as

H(X) ≤ ½ log( 2πe (1.25) ) ≈ 2.208 bits.

Example 5. Let X be a binomial random variable on the set 𝒳 = {1, 2, 3, 4, ..., n + 1} with parameter p. The mean and the variance are µ = np + 1 and σ² = npq, respectively, where q = 1 − p. Examining (4.1), we have

max{µ, N + 1 − µ} = max{np + 1, (n + 2) − (np + 1)} = max{np + 1, n + 1 − np} = max{np + 1, n(1 − p) + 1} = max{np + 1, nq + 1} = n·max(p, q) + 1.

Thus, Corollary 2 holds if

( n·max(p, q) + 1 )² < (2πnpq)² − npq ln(2πnpq).          (4.4)

This condition is satisfied for n sufficiently large if max{p, q} < 2πpq, or equivalently if min{p, 1 − p} > 1/(2π) ≈ 0.159. Calculating the standard bound and the new bound, we get

H(X) ≤ ½ log[ 2πe ( np(1 − p) + 1/12 ) ]  and  H(X) ≤ ½ log[ 2πe np(1 − p) ],

respectively. When p equals 1/2, condition (4.4) reduces to

( n/2 + 1 )² < ( πn/2 )² − (n/4) ln(πn/2).

Note that this is satisfied for any n ≥ 1. In this case, the tightened bound becomes

H(X) ≤ ½ log( πen/2 ),  n = 1, 2, 3, ....

Example 6. Let X be a discrete random variable defined on the set 𝒳 = {1, 2, 3, 4}

with the pmf (1/2, 1/4, 1/4, 0). Then H(X) = 1.5 bits. The mean and the variance are µ = E(X) = Σₓ x p(x) = 1.75 and σ² = E(X²) − (E(X))² = 0.6875, respectively. The standard upper bound yields

H(X) ≤ ½ log( 2πe ( 0.6875 + 1/12 ) ) ≈ 1.86 bits.

Corollary 2 is satisfied since max{µ², (N + 1 − µ)²} = max{3.06, 10.56} < 17.65. Thus, the tightened upper bound yields

H(X) ≤ ½ log( 2πe (0.6875) ) ≈ 1.77 bits,

which is slightly closer to the real value of H(X) than the standard upper bound.

Example 7. Let X be a discrete random variable on the set 𝒳 = {1, 2, 3, 4, 5} with pmf (1/2, 1/8, 1/8, 1/8, 1/8). Then H(X) = 2 bits. The mean is µ = 2.25 and the variance is σ² = 2.1875. The standard upper bound gives H(X) ≤ 2.64 bits. Here again Corollary 2 holds, since max{5.0625, 14.0625} < 183.2. This yields the tighter upper bound H(X) ≤ 2.61 bits.

Example 8. Let X be a discrete random variable on the set 𝒳 = {1, 2, 3, 4} with pmf (1/2, 1/4, 1/8, 1/8). Then H(X) = 1.75 bits. The mean µ = 1.875 and the variance

σ² = 1.1094. The standard upper bound gives H(X) ≤ 2.17 bits. Corollary 2 holds since max{3.52, 9.77} < 46.43, yielding the tightened bound H(X) ≤ 2.12 bits.

Example 9. Consider X to be a binomial random variable on the set {1, 2, ..., n + 1} with n ≤ 30. Figures 4.1–4.3 show the exact value and the various bounds of the discrete entropy of X with p = .1, p = .3, and p = .5. The bounds obtained after reordering the set of probability masses, as described in Section 3.1, are also included. Note that all values in the figures are normalized by subtracting the term (1/2) log(n).

Observations

- For n larger than a certain threshold, which is a function of p, the entropy of the discrete random variable increases monotonically towards the tightened bound. The convergence rate is smaller for smaller values of p.
- Reordering the set of probability masses gives improved bounds except for p = .5, in which case the bounds for the reordered set of probabilities are identical to the original ones. The improvement seems to be greater for smaller values of p.
- For p = .3, the tighter bound is guaranteed to be applicable for n ≥ 2, and Fig. 4.2 shows that it is also valid for n = 1. The conditions guaranteeing the applicability of the new bound are violated for the case of p = .1; however, Fig. 4.1 shows that the bound is valid for n ≥ 4. This suggests the possibility of generalizing the sufficient conditions under which the bound is applicable.
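The computation behind Figures 4.1–4.3 can be reproduced along the following lines; a sketch in which the binomial pmf is built from math.comb, the parameter values are illustrative, and each quantity is normalized by ½ log₂ n as in the text.

```python
import math
import numpy as np

def entropy_bits(p):
    nz = p[p > 0]
    return -np.sum(nz * np.log2(nz))

p_succ = 0.3
for n in (5, 10, 20, 30):
    pmf = np.array([math.comb(n, k) * p_succ**k * (1 - p_succ)**(n - k)
                    for k in range(n + 1)])          # support {0,...,n}, a shift of {1,...,n+1}
    var = n * p_succ * (1 - p_succ)
    norm = 0.5 * np.log2(n)
    print(n,
          entropy_bits(pmf) - norm,                                  # exact entropy
          0.5 * np.log2(2 * np.pi * np.e * (var + 1 / 12)) - norm,   # standard bound (3.1)
          0.5 * np.log2(2 * np.pi * np.e * var) - norm)              # tightened bound (3.6)
```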

Figure 4.1: Entropy and the different bounds for p = .1

Figure 4.2: Entropy and the different bounds for p = .3

Figure 4.3: Entropy and the different bounds for p = .5

Chapter 5

Conclusion

In this report, two upper bounds on discrete entropy were derived. The standard upper bound on discrete entropy was shown using the differential entropy bound for a Gaussian random variable [3]. The new upper bound [5] was developed based on Shannon's inequality [1] and the transformation formula of the Jacobi theta function [6]. The tightened bound, which is an improvement over the standard bound, is only applicable if the probability masses of the discrete random variable satisfy certain conditions. Some examples were provided in the last chapter to show the applicability and the practical value of the two bounds.

A possible direction for future work is to calculate the tightened bound for a set of discrete random variables using their joint probability mass function. Another interesting direction is to relax the sufficient conditions under which the tightened bound holds.

Bibliography

[1] J. Aczél. On Shannon's inequality, optimal coding, and characterizations of Shannon's and Rényi's entropies. Technical Report CS-73-05, Dept. of Applied Analysis and Computer Science, University of Waterloo, Waterloo, Ont., Canada, Jan. 1973.

[2] T. M. Cover and J. A. Thomas. Elements of Information Theory.

[3] T. M. Cover and J. A. Thomas. Elements of Information Theory. New York: Wiley.

[4] A. G. Djackov. On a search model of false coins. In Topics in Information Theory (Colloquia Mathematica Societatis János Bolyai 16, Keszthely, Hungary), 1975.

[5] W. H. Mow. A tight upper bound on discrete entropy. IEEE Transactions on Information Theory, vol. 44, no. 2, Mar. 1998.

[6] H. Rademacher. Topics in Analytic Number Theory. Berlin, Germany: Springer-Verlag, 1973.
