Computing and Communications 2. Information Theory -Entropy

1 Computing and Communications 2. Information Theory - Entropy. Ying Cui, Department of Electronic Engineering, Shanghai Jiao Tong University, China. 2017, Autumn 1

2 Outline Entropy; joint entropy and conditional entropy; relative entropy and mutual information; relationship between entropy and mutual information; chain rules for entropy, relative entropy and mutual information; Jensen's inequality and its consequences 2

3 Reference T. M. Cover and J. A. Thomas, Elements of Information Theory, Wiley 3

4 OVERVIEW 4

5 Information Theory Information theory answers two fundamental questions in communication theory: what is the ultimate data compression? (answer: the entropy H); what is the ultimate transmission rate of communication? (answer: the channel capacity C). Information theory is often considered a subset of communication theory 5

6 Information Theory Information theory has made fundamental contributions to other fields 6

7 A Mathematical Theory of Commun. In 1948, Shannon published "A Mathematical Theory of Communication", founding information theory. Shannon made two major modifications that had a huge impact on communication design: the source and channel are modeled probabilistically; bits became the common currency of communication 7

8 A Mathematical Theory of Commun. Shannon proved the following three theorems. Theorem 1: the minimum compression rate of the source is its entropy rate H. Theorem 2: the maximum reliable rate over the channel is its capacity, the mutual information I maximized over input distributions. Theorem 3: end-to-end reliable communication is possible if and only if H < I, i.e., there is no loss in performance from using a digital interface between source and channel coding. Impacts of Shannon's results: after almost 70 years, all communication systems are designed based on the principles of information theory; the limits not only serve as benchmarks for evaluating communication schemes, but also provide insights on designing good ones; the basic information-theoretic limits in Shannon's theorems have now been successfully achieved using efficient algorithms and codes 8

9 ENTROPY 9

10 Definition Entropy is a measure of the uncertainty of a r.v. Consider a discrete r.v. $X$ with alphabet $\mathcal{X}$ and p.m.f. $p(x) = \Pr[X = x]$, $x \in \mathcal{X}$. The entropy of $X$ is $H(X) = -\sum_{x \in \mathcal{X}} p(x) \log p(x)$. The log is to the base 2, and entropy is expressed in bits; e.g., the entropy of a fair coin toss is 1 bit. Define $0 \log 0 = 0$, since $x \log x \to 0$ as $x \to 0$; adding terms of zero probability does not change the entropy 10
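
The definition $H(X) = -\sum_x p(x) \log p(x)$ can be tried out numerically. The following is a minimal sketch, assuming the p.m.f. is supplied as a plain list of probabilities; the function name `entropy` and its `base` parameter are illustrative choices, not part of the slides.

```python
import math

def entropy(pmf, base=2.0):
    """Entropy of a discrete p.m.f. given as a sequence of probabilities."""
    if abs(sum(pmf) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Terms with p = 0 are skipped, which matches the convention 0 log 0 = 0.
    return -sum(p * math.log(p, base) for p in pmf if p > 0)

# A fair coin toss, and the same coin with an extra zero-probability outcome.
print(entropy([0.5, 0.5]))
print(entropy([0.5, 0.5, 0.0]))
```

Both calls should agree, since adding outcomes of zero probability leaves the sum unchanged.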

11 Properties Entropy is nonnegative: $H(X) \ge 0$. The base of the log can be changed: $H_b(X) = (\log_b a) H_a(X)$ 11

12 Example For a binary r.v. with $\Pr[X = 1] = p$: H(X) = 1 bit when p = 0.5 (maximum uncertainty); H(X) = 0 bits when p = 0 or 1 (minimum uncertainty); H(X) is a concave function of p 12
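
The behaviour described in this example matches the binary entropy function $H(p) = -p \log p - (1-p) \log(1-p)$. A small sketch, assuming the example concerns a binary r.v. that takes one value with probability p (the function name `binary_entropy` is hypothetical):

```python
import math

def binary_entropy(p):
    """Entropy in bits of a binary r.v. that takes one value with probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if p in (0.0, 1.0):
        return 0.0  # convention 0 log 0 = 0: a certain outcome has no uncertainty
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

# Tabulate H(p) on a coarse grid; the values rise to a peak at p = 0.5 and fall again.
for p in (0.0, 0.1, 0.3, 0.5, 0.7, 0.9, 1.0):
    print(p, round(binary_entropy(p), 4))
```

The table is symmetric around p = 0.5, and each midpoint value lies above the average of its neighbours, which is what concavity in p means.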

13 Example 13

14 JOINT ENTROPY AND CONDITIONAL ENTROPY 14

15 Joint Entropy Joint entropy is a measure of the uncertainty of a pair of r.v.s. Consider a pair of discrete r.v.s $(X, Y)$ with alphabets $\mathcal{X}$, $\mathcal{Y}$ and p.m.f.s $p(x) = \Pr[X = x]$, $x \in \mathcal{X}$, and $p(y) = \Pr[Y = y]$, $y \in \mathcal{Y}$ 15

16 Conditional Entropy The conditional entropy of a r.v. $Y$ given another r.v. $X$ is the expected value of the entropies of the conditional distributions, averaged over the conditioning r.v.: $H(Y|X) = \sum_{x \in \mathcal{X}} p(x) H(Y|X = x)$ 16
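
In the standard definitions from the reference text, the joint entropy is $H(X,Y) = -\sum_{x}\sum_{y} p(x,y) \log p(x,y)$ and the conditional entropy is $H(Y|X) = \sum_{x} p(x) H(Y|X = x)$. The sketch below computes both, assuming the joint p.m.f. is stored as a dictionary keyed by `(x, y)`; the numbers in `joint` are hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict

def entropy(pmf):
    """Entropy in bits of a sequence of probabilities (0 log 0 treated as 0)."""
    return sum(-p * math.log2(p) for p in pmf if p > 0)

def joint_entropy(joint):
    """H(X,Y) for a joint p.m.f. given as {(x, y): probability}."""
    return entropy(joint.values())

def conditional_entropy(joint):
    """H(Y|X): entropy of each conditional p.m.f. p(y|x), weighted by p(x)."""
    p_x = defaultdict(float)
    for (x, _y), p in joint.items():
        p_x[x] += p
    total = 0.0
    for x0, mass in p_x.items():
        if mass > 0:
            conditional = [p / mass for (x, _y), p in joint.items() if x == x0]
            total += mass * entropy(conditional)
    return total

# Hypothetical joint p.m.f. on {0, 1} x {0, 1}, for illustration only.
joint = {(0, 0): 0.5, (0, 1): 0.25, (1, 0): 0.0, (1, 1): 0.25}
print(joint_entropy(joint))        # uncertainty of the pair (X, Y)
print(conditional_entropy(joint))  # average uncertainty left in Y once X is known
```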

17 Chain Rule 17

18 Chain Rule 18
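
In the reference text, the chain rule for a pair of r.v.s reads $H(X,Y) = H(X) + H(Y|X)$. The derivation below is a sketch in that standard form; $p(x,y)$ denotes the joint p.m.f. and $p(y|x)$ the conditional p.m.f., notation that is assumed here rather than recovered from the slide.

```latex
\begin{align*}
H(X,Y) &= -\sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} p(x,y) \log p(x,y) \\
       &= -\sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} p(x,y) \log \bigl( p(x)\, p(y|x) \bigr) \\
       &= -\sum_{x \in \mathcal{X}} p(x) \log p(x)
          - \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} p(x,y) \log p(y|x) \\
       &= H(X) + H(Y|X).
\end{align*}
```

The third line uses $\sum_{y} p(x,y) = p(x)$ to collapse the first double sum.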

19 Example 19

20 Example 20

21 RELATIVE ENTROPY AND MUTUAL INFORMATION 21

22 Relative Entropy Relative entropy is a measure of the distance between two distributions. Conventions: $0 \log \frac{0}{0} = 0$, $0 \log \frac{0}{q} = 0$ and $p \log \frac{p}{0} = \infty$; if there is any $x$ such that $p(x) > 0$ and $q(x) = 0$, then $D(p \| q) = \infty$ 22
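
The standard definition is $D(p \| q) = \sum_{x} p(x) \log \frac{p(x)}{q(x)}$. The sketch below implements it together with the conventions stated above, assuming both p.m.f.s are given as lists aligned on the same alphabet; the function name `relative_entropy` is illustrative.

```python
import math

def relative_entropy(p, q):
    """D(p || q) in bits for two p.m.f.s listed over the same alphabet."""
    if len(p) != len(q):
        raise ValueError("p and q must be defined on the same alphabet")
    total = 0.0
    for p_x, q_x in zip(p, q):
        if p_x == 0:
            continue          # conventions 0 log(0/0) = 0 and 0 log(0/q) = 0
        if q_x == 0:
            return math.inf   # convention p log(p/0) = infinity
        total += p_x * math.log2(p_x / q_x)
    return total

p = [0.5, 0.5]
q = [0.25, 0.75]
print(relative_entropy(p, q))
print(relative_entropy(q, p))
```

The two printed values differ, so relative entropy is not symmetric in its arguments; it behaves like a distance only in the loose sense of being nonnegative and zero when p = q.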

23 Example 23

24 Mutual Information Mutual information is a measure of the amount of information that one r.v. contains about another r.v. 24
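
In the reference text, mutual information is defined as $I(X;Y) = \sum_{x}\sum_{y} p(x,y) \log \frac{p(x,y)}{p(x)p(y)}$, i.e. the relative entropy between the joint p.m.f. and the product of the marginals. A minimal sketch, assuming the joint p.m.f. is stored as a dictionary keyed by `(x, y)`; the example distributions are hypothetical.

```python
import math
from collections import defaultdict

def mutual_information(joint):
    """I(X;Y) in bits for a joint p.m.f. given as {(x, y): probability}."""
    p_x = defaultdict(float)
    p_y = defaultdict(float)
    for (x, y), p in joint.items():
        p_x[x] += p
        p_y[y] += p
    # Zero-probability pairs contribute nothing (0 log 0 = 0).
    return sum(p * math.log2(p / (p_x[x] * p_y[y]))
               for (x, y), p in joint.items() if p > 0)

# Hypothetical examples: independent bits, identical bits, and a dependent pair.
independent = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}
identical = {(0, 0): 0.5, (1, 1): 0.5}
dependent = {(0, 0): 0.5, (0, 1): 0.25, (1, 0): 0.0, (1, 1): 0.25}
for name, joint in (("independent", independent),
                    ("identical", identical),
                    ("dependent", dependent)):
    print(name, mutual_information(joint))
```

Independent variables carry no information about each other, while two copies of the same fair bit share the full 1 bit.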

25 RELATIONSHIP BETWEEN ENTROPY AND MUTUAL INFORMATION 25

26 Relation 26
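
The slide's formulas did not survive extraction. As a hedged reminder, the relations between entropy and mutual information given in the reference text are the following standard identities:

```latex
\begin{align*}
I(X;Y) &= H(X) - H(X|Y) \\
I(X;Y) &= H(Y) - H(Y|X) \\
I(X;Y) &= H(X) + H(Y) - H(X,Y) \\
I(X;Y) &= I(Y;X) \\
I(X;X) &= H(X)
\end{align*}
```

The last identity is why entropy is sometimes called self-information.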

27 Proof 27

28 Illustration 28

29 CHAIN RULES FOR ENTROPY, RELATIVE ENTROPY, AND MUTUAL INFORMATION 29

30 Chain Rule for Entropy 30
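
For reference, the chain rule for entropy in the cited text extends the two-variable case to n r.v.s. The statement below is that standard form, with the case n = 3 written out; it is a sketch and not a transcription of the slide.

```latex
\begin{align*}
H(X_1, X_2, \ldots, X_n) &= \sum_{i=1}^{n} H(X_i \mid X_{i-1}, \ldots, X_1), \\
H(X_1, X_2, X_3) &= H(X_1) + H(X_2 \mid X_1) + H(X_3 \mid X_2, X_1).
\end{align*}
```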

31 Proof 31

32 Alternative Proof 32

33 Chain Rule for Information 33
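
For reference, the chain rule for mutual information in the cited text has the standard form below, where $I(X;Y \mid Z)$ denotes conditional mutual information; this is a sketch and not a transcription of the slide.

```latex
\begin{align*}
I(X_1, X_2, \ldots, X_n; Y) &= \sum_{i=1}^{n} I(X_i; Y \mid X_{i-1}, \ldots, X_1), \\
I(X; Y \mid Z) &= H(X \mid Z) - H(X \mid Y, Z).
\end{align*}
```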

34 Proof 34

35 Chain Rule for Relative Entropy 35
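
For reference, the chain rule for relative entropy in the cited text has the standard form below, where the conditional relative entropy averages over $p(x)$; this is a sketch and not a transcription of the slide.

```latex
\begin{align*}
D\bigl(p(x,y) \,\|\, q(x,y)\bigr) &= D\bigl(p(x) \,\|\, q(x)\bigr) + D\bigl(p(y|x) \,\|\, q(y|x)\bigr), \\
D\bigl(p(y|x) \,\|\, q(y|x)\bigr) &= \sum_{x} p(x) \sum_{y} p(y|x) \log \frac{p(y|x)}{q(y|x)}.
\end{align*}
```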

36 Proof 36

37 JENSEN'S INEQUALITY AND ITS CONSEQUENCES 37

38 Convex & Concave Functions Examples: convex functions: $x^2$, $|x|$, $e^x$, $x \log x$ (for $x \ge 0$); concave functions: $\log x$ and $\sqrt{x}$ (for $x \ge 0$); linear functions $ax + b$ are both convex and concave 38

39 Convex & Concave Functions 39

40 Jensen's Inequality 40
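
Jensen's inequality states that for a convex function f and a r.v. X, $E[f(X)] \ge f(E[X])$, with the inequality reversed for concave f. The sketch below checks this numerically on a small, hypothetical discrete distribution; the helper names `expectation` and `jensen_gap` are illustrative.

```python
import math

def expectation(values, probs, f=None):
    """E[f(X)] for a discrete r.v. taking values[i] with probability probs[i]."""
    if f is None:
        f = lambda v: v
    return sum(p * f(v) for v, p in zip(values, probs))

def jensen_gap(values, probs, f):
    """E[f(X)] - f(E[X]): nonnegative for convex f, nonpositive for concave f."""
    return expectation(values, probs, f) - f(expectation(values, probs))

# Hypothetical distribution, for illustration only.
values = [1.0, 2.0, 4.0]
probs = [0.5, 0.25, 0.25]
print(jensen_gap(values, probs, lambda x: x * x))  # convex f(x) = x^2
print(jensen_gap(values, probs, math.log2))        # concave f(x) = log x
```

The sign of the gap flips between the convex and the concave case, and it vanishes for linear functions.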

41 Information Inequality 41

42 Proof 42
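
The information inequality states that $D(p \| q) \ge 0$, with equality if and only if $p(x) = q(x)$ for all x. A sketch of the standard argument via Jensen's inequality, where $A = \{x : p(x) > 0\}$ is the support of p (notation assumed here):

```latex
\begin{align*}
-D(p \,\|\, q) &= \sum_{x \in A} p(x) \log \frac{q(x)}{p(x)} \\
               &\le \log \sum_{x \in A} p(x) \frac{q(x)}{p(x)} \\
               &= \log \sum_{x \in A} q(x) \\
               &\le \log 1 = 0.
\end{align*}
```

The first inequality is Jensen's inequality applied to the concave function $\log$.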

43 Nonnegativity of Mutual Information 43

44 Max. Entropy Dist. Uniform Dist. 44
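
The result behind this slide title is that $H(X) \le \log |\mathcal{X}|$, with equality when X is uniform; the gap $\log |\mathcal{X}| - H(X)$ equals the relative entropy between p and the uniform p.m.f. The sketch below checks the bound on randomly drawn p.m.f.s; the helper names and the choice of 5 outcomes are arbitrary assumptions.

```python
import math
import random

def entropy(pmf):
    """Entropy in bits of a sequence of probabilities (0 log 0 treated as 0)."""
    return sum(-p * math.log2(p) for p in pmf if p > 0)

def gap_to_uniform(pmf):
    """log2(n) - H(p) for a p.m.f. on n outcomes; equals D(p || uniform)."""
    return math.log2(len(pmf)) - entropy(pmf)

def random_pmf(n, rng):
    """Draw a p.m.f. on n outcomes by normalizing uniform random weights."""
    weights = [rng.random() for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]

rng = random.Random(0)  # fixed seed so the run is repeatable
gaps = [gap_to_uniform(random_pmf(5, rng)) for _ in range(1000)]
print(min(gaps))                  # smallest gap seen among the random p.m.f.s
print(gap_to_uniform([0.2] * 5))  # the uniform p.m.f. itself
```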

45 Conditioning Reduces Entropy 45
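
The statement is $H(X|Y) \le H(X)$, with equality if and only if X and Y are independent. It holds on average: for a particular outcome y, $H(X|Y = y)$ can exceed $H(X)$. The sketch below illustrates both points on a hypothetical joint p.m.f. keyed by `(x, y)`.

```python
import math

def entropy(pmf):
    """Entropy in bits of a sequence of probabilities (0 log 0 treated as 0)."""
    return sum(-p * math.log2(p) for p in pmf if p > 0)

def prob_y(joint, y0):
    """Marginal probability Pr[Y = y0]."""
    return sum(p for (_x, y), p in joint.items() if y == y0)

def entropy_x_given_y_equals(joint, y0):
    """H(X | Y = y0): entropy of the conditional p.m.f. p(x | y0)."""
    mass = prob_y(joint, y0)
    return entropy([p / mass for (_x, y), p in joint.items() if y == y0])

def entropy_x_given_y(joint):
    """H(X|Y) = sum over y of p(y) * H(X | Y = y)."""
    ys = {y for (_x, y) in joint}
    return sum(prob_y(joint, y0) * entropy_x_given_y_equals(joint, y0)
               for y0 in ys if prob_y(joint, y0) > 0)

# Hypothetical joint p.m.f. keyed by (x, y), for illustration only.
joint = {(1, 1): 0.0, (2, 1): 0.75, (1, 2): 0.125, (2, 2): 0.125}
h_x = entropy([0.125, 0.875])  # marginal p.m.f. of X
print(h_x)
print(entropy_x_given_y_equals(joint, 1), entropy_x_given_y_equals(joint, 2))
print(entropy_x_given_y(joint))
```

Observing Y = 2 leaves more uncertainty about X than the marginal has, yet the average over Y is still below H(X).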

46 Independence Bound on Entropy 46
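
The independence bound states that the joint entropy of n r.v.s is at most the sum of their individual entropies, with equality if and only if the $X_i$ are independent. A sketch of the standard argument, combining the chain rule for entropy with the fact that conditioning reduces entropy:

```latex
\begin{align*}
H(X_1, X_2, \ldots, X_n) &= \sum_{i=1}^{n} H(X_i \mid X_{i-1}, \ldots, X_1) \\
                         &\le \sum_{i=1}^{n} H(X_i).
\end{align*}
```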

47 Summary 47

48 Summary 48

49 Summary 49

50 iwct.sjtu.edu.cn/personal/yingcui 50
