Entropy and Large Deviations


Entropy and Large Deviations
S.R.S. Varadhan, Courant Institute, NYU
Michigan State University, East Lansing, March 31, 2015

Entropy comes up in many different contexts.
In physics it was introduced by Rudolf Clausius in 1865, in connection with heat transfer and the relation between heat and work: classical thermodynamics. It is a bulk quantity.
Boltzmann, around 1877, defined entropy as $c\log|\Omega|$, where $\Omega$ is the set of microstates that correspond to a given macrostate and $|\Omega|$ is its size.

Shannon's entropy, 1948. Mathematical theory of communication; modern information theory.
Given $p = (p_1,\dots,p_k)$,
$$H(p) = -\sum_i p_i \log p_i$$
Conditional probability: $P[X = i \mid \Sigma] = p_i(\omega)$, with $E[p_i(\omega)] = p_i$.
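
As a quick numerical illustration (a sketch added here, not part of the slides), the following Python snippet computes $H(p)$ and checks that the uniform distribution on $k$ symbols maximizes it at $\log k$:

```python
import numpy as np

def shannon_entropy(p):
    """H(p) = -sum_i p_i log p_i, with the convention 0 log 0 = 0."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log(p))

k = 4
uniform = np.full(k, 1.0 / k)
skewed = np.array([0.7, 0.2, 0.05, 0.05])
print(shannon_entropy(uniform), np.log(k))   # equal: log k is the maximum
print(shannon_entropy(skewed))               # strictly smaller
```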

Conditional entropy:
$$H(\omega) = -\sum_i p_i(\omega)\log p_i(\omega), \qquad E[H(\omega)] = E\Big[-\sum_i p_i(\omega)\log p_i(\omega)\Big]$$

Concavity:
$$H = -\sum_i p_i\log p_i \;\ge\; E\Big[-\sum_i p_i(\omega)\log p_i(\omega)\Big]$$
$\{X_i\}$ is a stationary stochastic process, and $p_i(\omega) = P[X_1 = i \mid X_0, X_{-1},\dots,X_{-n},\dots]$.
$$H[P] = E^P\Big[-\sum_i p_i(\omega)\log p_i(\omega)\Big]$$
Conditioning is with respect to the past history.

Look at $p(x_1,\dots,x_n)$.
$$H_n(P) = -\sum_{x_1,\dots,x_n} p(x_1,\dots,x_n)\log p(x_1,\dots,x_n)$$
$H_{n+m} \le H_n + H_m$, so $H_n$ is subadditive, and $H_{n+1} - H_n$ is nonincreasing.
$$H(P) = \lim_{n\to\infty}\frac{H_n}{n} = \lim_{n\to\infty}\,[H_{n+1}-H_n]$$
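
A small sketch (not from the slides) that makes these limits concrete for a two-state Markov chain: it computes the block entropies $H_n$ exactly and shows $H_n/n$ and $H_{n+1}-H_n$ approaching the entropy rate $-\sum_i \pi_i\sum_j p_{ij}\log p_{ij}$. The transition matrix is an arbitrary choice for illustration.

```python
import itertools
import numpy as np

# Two-state stationary Markov chain: transition matrix P and its stationary distribution pi.
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])
evals, evecs = np.linalg.eig(P.T)
pi = np.real(evecs[:, np.argmax(np.real(evals))])
pi = pi / pi.sum()

def block_entropy(n):
    """H_n = -sum over all n-strings of p(x) log p(x)."""
    H = 0.0
    for x in itertools.product(range(2), repeat=n):
        p = pi[x[0]]
        for a, b in zip(x, x[1:]):
            p *= P[a, b]
        H -= p * np.log(p)
    return H

H = [block_entropy(n) for n in range(1, 13)]
rate = -sum(pi[i] * P[i, j] * np.log(P[i, j]) for i in range(2) for j in range(2))
for n in range(1, 12):
    print(n, H[n - 1] / n, H[n] - H[n - 1], rate)   # both columns approach the rate
```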

Dynamical system $(X,\Sigma,T,P)$: $T : X \to X$ preserves $P$.
Example: the shift $T$ in a space of sequences, with $P$ any stationary process.
Isomorphism between $(X,\Sigma,T,P)$ and $(X',\Sigma',T',P')$: a map $S : X \to X'$ with $ST = T'S$ and $PS^{-1} = P'$.
Spectral invariants: $(Uf)(x) = f(Tx)$ defines a unitary map in $L^2(X,\Sigma,P)$, and an isomorphism $S$ induces a unitary equivalence $VU = U'V$ of the associated operators.

Entropy of a dynamical system $(\Omega,\Sigma,P,T)$.
Each function $f$ taking a finite set of values generates a stochastic process $P_f$, the distribution of $\{f(T^n x)\}$.
$$h(\Omega,\Sigma,P,T) = \sup_f h(P_f)$$
It is an invariant, but not a spectral one: $H(T^2,P) = 2H(T,P)$.
It is computable: for a product measure $P = \Pi p$, $H(P) = h(p)$.

Isomorphism theorem of Ornstein: if $h(p) = h(q)$, then the dynamical systems with product measures $P = \Pi p$ and $Q = \Pi q$, i.e. the two Bernoulli shifts $(F,P)$ and $(G,Q)$, are isomorphic.

Shannon's entropy and coding.
Code in such a way that the best compression is achieved.
The incoming data stream has stationary statistics $P$; coding is to be done into words in an alphabet of size $r$.
The best compression factor is
$$c = \frac{H(P)}{\log r}$$

The number of $n$-tuples is $k^n$.
Shannon–McMillan–Breiman theorem: if $P$ is stationary and ergodic, then $P(E_n) \to 1$, where
$$E_n = \Big\{\Big|-\tfrac1n \log p(x_1,\dots,x_n) - H(P)\Big| \le \epsilon\Big\}$$
Almost the entire probability under $P$ is carried by nearly $e^{nH(P)}$ $n$-tuples of more or less equal probability.
Coding them into words of length $m$ requires $r^m = e^{nH(P)}$, i.e. a compression factor $c = \frac{m}{n} = \frac{H(P)}{\log r}$.
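
A hedged illustration of the Shannon–McMillan–Breiman statement in the simplest (i.i.d.) case, where $\log p(x_1,\dots,x_n)$ factorizes; the alphabet and probabilities below are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
p = np.array([0.5, 0.3, 0.2])          # i.i.d. source on a 3-letter alphabet
H = -np.sum(p * np.log(p))

n = 10_000
x = rng.choice(len(p), size=n, p=p)
log_prob = np.sum(np.log(p[x]))        # log p(x_1, ..., x_n) for an i.i.d. source
print(-log_prob / n, H)                # the two numbers are close (SMB / AEP)
```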

$k^n$ sequences of length $n$ from an alphabet of size $k$. How many look like $P$?
$$\exp[\,nH(P) + o(n)\,]$$
$H(P)$ is maximized when $P = P_0$, the uniform product distribution, and
$$H(P_0) = \log k$$
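
The counting claim can be checked directly: the number of sequences with a fixed empirical distribution $q$ is a multinomial coefficient, and $\frac1n\log$ of it is close to $H(q)$ by Stirling's formula. A short sketch (the counts are an arbitrary choice):

```python
import math
import numpy as np

def log_multinomial(counts):
    """log of the number of sequences with the given letter counts."""
    n = sum(counts)
    return math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)

n = 600
counts = [300, 200, 100]                  # empirical frequencies (1/2, 1/3, 1/6)
q = np.array(counts) / n
H_q = -np.sum(q * np.log(q))
print(log_multinomial(counts) / n, H_q)   # (1/n) log(#sequences) is close to H(q)
```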

Kullback–Leibler information, relative entropy (1951):
$$H(q;p) = \sum_i q_i \log\frac{q_i}{p_i}$$
If $p$ is the uniform distribution, with mass $\frac1k$ at every point, then
$$h(q;p) = \log k - h(q)$$
For two probability densities,
$$H(g;f) = \int g(x)\log\frac{g(x)}{f(x)}\,dx$$
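
A minimal sketch of the discrete definition and of the identity $h(q;\text{uniform}) = \log k - h(q)$ (the distributions below are arbitrary choices):

```python
import numpy as np

def kl(q, p):
    """Relative entropy H(q; p) = sum_i q_i log(q_i / p_i); assumes p_i > 0 where q_i > 0."""
    q, p = np.asarray(q, float), np.asarray(p, float)
    mask = q > 0
    return np.sum(q[mask] * np.log(q[mask] / p[mask]))

k = 5
q = np.array([0.4, 0.3, 0.1, 0.1, 0.1])
p_uniform = np.full(k, 1.0 / k)
h_q = -np.sum(q * np.log(q))
print(kl(q, p_uniform), np.log(k) - h_q)   # identical: h(q; uniform) = log k - h(q)
```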

For two probability measures,
$$H(\mu,\lambda) = \int \log\frac{d\mu}{d\lambda}\,d\mu = \int \frac{d\mu}{d\lambda}\log\frac{d\mu}{d\lambda}\,d\lambda$$
For two stationary processes,
$$H(Q,P) = E^Q\big[H(Q_\Sigma; P_\Sigma)\big]$$
where $\Sigma$ is the past. This has issues! It is OK if the conditional distribution $P_\Sigma$ is globally defined.

$\alpha, \beta$ are probability measures on $X$, and $P = \Pi\alpha$.
The empirical distribution is
$$r_n(dx) = \frac1n\sum_{i=1}^n \delta_{x_i}$$
Sanov's theorem:
$$P[\,r_n(dx) \simeq \beta\,] = \exp[-nH(\beta;\alpha) + o(n)]$$
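
Sanov's theorem can be checked by hand in the Bernoulli case, where $P[r_n \simeq \beta]$ is a binomial probability. The sketch below (with $\alpha$ a fair coin and $\beta$ the coin with success probability $3/4$, my choice) compares $-\frac1n\log$ of that probability with $H(\beta;\alpha)$:

```python
import math

def log_binom_prob(n, k, a):
    """log P[Binomial(n, a) = k]."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(a) + (n - k) * math.log(1 - a))

alpha, beta = 0.5, 0.75
H = beta * math.log(beta / alpha) + (1 - beta) * math.log((1 - beta) / (1 - alpha))

for n in [40, 400, 4000]:
    k = int(beta * n)
    print(n, -log_binom_prob(n, k, alpha) / n, H)   # approaches H(beta; alpha)
```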

Large deviations. For closed sets $C$,
$$\limsup_{n\to\infty} \frac1n \log P_n[C] \le -\inf_{x\in C} I(x)$$
For open sets $G$,
$$\liminf_{n\to\infty} \frac1n \log P_n[G] \ge -\inf_{x\in G} I(x)$$
$I$ is lower semicontinuous and has compact level sets.

Sanov's theorem is an example: $P_n$, on $M(X)$ with the weak topology, is the distribution of $r_n(dx)$ under $\Pi\alpha$, and $I(\beta) = h(\beta;\alpha)$.

$P$ is a stationary process and we have a string of $n$ observations.
The ergodic theorem says most sequences resemble $P$. What is the probability that they resemble $Q$?
$$\exp[-nH(Q;P)]$$
If $P = P_0$ then $H(Q;P) = \log k - H(Q)$. This is OK for nice $P$.

Functional analysis. For $f \ge 0$ with $\int f\,d\lambda = 1$,
$$\frac{d}{dp}\Big[\int f^p\,d\lambda\Big]\Big|_{p=1} = \int f\log f\,d\lambda$$

There is an analog of Hölder's inequality in the limit as $p \to 1$.
The duality: for $x, y \in \mathbb{R}$,
$$xy \le \frac{x^p}{p} + \frac{y^q}{q}, \qquad \frac{x^p}{p} = \sup_y\Big[xy - \frac{y^q}{q}\Big], \qquad \frac{y^q}{q} = \sup_x\Big[xy - \frac{x^p}{p}\Big]$$
becomes, for $x > 0$ and $y \in \mathbb{R}$,
$$x\log x - x = \sup_y\,[xy - e^y], \qquad e^y = \sup_{x>0}\,[xy - (x\log x - x)]$$
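
For completeness, the limiting duality follows from one-variable calculus (standard, not spelled out on the slides):

```latex
\[
\frac{\partial}{\partial y}\,[xy - e^{y}] = x - e^{y} = 0 \;\Rightarrow\; y = \log x,
\qquad \sup_{y}\,[xy - e^{y}] = x\log x - x,
\]
\[
\frac{\partial}{\partial x}\,[xy - (x\log x - x)] = y - \log x = 0 \;\Rightarrow\; x = e^{y},
\qquad \sup_{x>0}\,[xy - (x\log x - x)] = ye^{y} - (ye^{y} - e^{y}) = e^{y}.
\]
```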

The Hölder inequality becomes a generalized Jensen's inequality:
$$\int g\,d\mu = \int f g\,d\lambda \le \log\int e^g\,d\lambda + H(\mu,\lambda)$$
Take $g = c\,\chi_A(x)$ and optimize over $c > 0$: $c = \log\frac{1}{\lambda(A)}$.
Then $e^g = \chi_{A^c} + \frac{1}{\lambda(A)}\chi_A$, so $\int e^g\,d\lambda \le 2$, and
$$\mu(A) \le \frac{H(\mu,\lambda) + \log 2}{\log\frac{1}{\lambda(A)}}$$

Entropy grows linearly, but the $L^p$ norms grow exponentially fast. This is a useful inequality for interacting particle systems.

Statistical mechanics: probability distributions are often defined through an energy functional,
$$p_n(x_1,\dots,x_n) = [c(n)]^{-1}\exp\Big[-\sum_{i=1}^{n-k+1} F(x_i, x_{i+1},\dots,x_{i+k-1})\Big]$$

$$c_n = \sum_{x_1,\dots,x_n} \exp\Big[-\sum_{i=1}^{n-k+1} F(x_i, x_{i+1},\dots,x_{i+k-1})\Big]$$
There are roughly $\exp[nH(P)]$ $P$-typical sequences, each contributing roughly $\exp\big[-n\int F\,dP\big]$.
Their contribution to the sum is therefore $e^{-n(\int F\,dP - H(P))}$.

Then, by Laplace asymptotics,
$$\lim_{n\to\infty}\frac{\log c_n}{n} = \sup_P\Big[-\int F\,dP + H(P)\Big]$$
If the sup is attained at a unique $P$, then
$$P \simeq c^{-1}\exp\Big[-\sum_{i=-\infty}^{\infty} F(x_i, x_{i+1},\dots,x_{i+k-1})\Big]$$
and the averages $\frac{1}{n-k+1}\sum_{i=1}^{n-k+1} G(x_i, x_{i+1},\dots,x_{i+k-1})$ converge in probability under $p_n$ to $E^P[G(x_1,\dots,x_k)]$.
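
For a nearest-neighbour energy ($k=2$) on a finite alphabet this limit can be computed two ways: directly from $c_n = \mathbf{1}^T M^{n-1}\mathbf{1}$ with transfer matrix $M = e^{-F}$, and, assuming the classical identification of the variational formula with $\log$ of the Perron eigenvalue of $M$ (a standard fact not stated on the slides), from that eigenvalue. A sketch with an arbitrarily chosen $F$:

```python
import numpy as np

# Nearest-neighbour energy F(a, b) on a 2-letter alphabet (k = 2); values chosen arbitrarily.
F = np.array([[0.0, 1.0],
              [1.0, 0.5]])
M = np.exp(-F)                            # transfer matrix

def log_cn(n):
    """log c_n, where c_n = sum over x_1..x_n of exp[-sum_{i=1}^{n-1} F(x_i, x_{i+1})]."""
    w = np.ones(2)
    for _ in range(n - 1):
        w = M @ w
    return np.log(np.ones(2) @ w)

lam = np.max(np.linalg.eigvals(M).real)   # Perron eigenvalue of the transfer matrix
for n in [10, 100, 1000]:
    print(n, log_cn(n) / n, np.log(lam))  # (log c_n)/n approaches log(lambda_max)
```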

Log-Sobolev inequality. $T_t$ is a Markov semigroup on $(X,\Sigma)$: $T_t f \ge 0$ if $f \ge 0$, and $T_t 1 = 1$. It is symmetric with respect to $\mu$.
For $f \ge 0$ with $\int f\,d\mu = 1$,
$$\int f\log f\,d\mu \le c\,D(\sqrt f)$$
Along the semigroup $f_t = T_t f$,
$$\frac{d}{dt}\int f_t\log f_t\,d\mu \le -\tfrac12\,D(\sqrt{f_t}) \le -\frac{1}{2c}\int f_t\log f_t\,d\mu$$
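
Granting the differential inequality above (whose constant is a reconstruction of a garbled line), Gronwall's lemma turns it into exponential decay of the entropy toward equilibrium:

```latex
\[
\frac{d}{dt}\int f_t\log f_t\,d\mu \;\le\; -\frac{1}{2c}\int f_t\log f_t\,d\mu
\quad\Longrightarrow\quad
\int f_t\log f_t\,d\mu \;\le\; e^{-t/(2c)}\int f_0\log f_0\,d\mu .
\]
```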

This gives exponential decay, spectral gap, etc.
Infinitesimal version: a family of densities $\{f(x,\theta)\}$,
$$H(\theta,\theta_0) = \int f(x,\theta)\log\frac{f(x,\theta)}{f(x,\theta_0)}\,d\mu$$
$H(\theta,\theta_0)$ has a minimum at $\theta_0$, and
$$I(\theta_0) = \frac{d^2 H}{d\theta^2}\Big|_{\theta=\theta_0}$$
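
A concrete check of this quadratic behaviour of relative entropy near $\theta_0$, using a Bernoulli family (my choice; its information is $I(\theta_0) = 1/(\theta_0(1-\theta_0))$):

```python
import numpy as np

def kl_bernoulli(t, t0):
    """H(theta, theta_0) for Bernoulli(theta) versus Bernoulli(theta_0)."""
    return t * np.log(t / t0) + (1 - t) * np.log((1 - t) / (1 - t0))

theta0 = 0.3
fisher = 1.0 / (theta0 * (1 - theta0))      # I(theta_0) for the Bernoulli family

for eps in [0.1, 0.01, 0.001]:
    exact = kl_bernoulli(theta0 + eps, theta0)
    quad = 0.5 * fisher * eps**2            # second-order expansion around theta_0
    print(eps, exact, quad)                 # the ratio tends to 1 as eps -> 0
```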

Optimizing entropy. Toss a coin $n$ times and suppose there were $\frac{3n}{4}$ heads. Was it a steady lead or a burst of heads?
Look at
$$P\Big[\frac1n S(nt) \simeq f(t)\Big]$$

Write $f'(t) = p(t)$. Then
$$P\Big[\frac1n S(nt) \simeq f(t)\Big] \simeq e^{-n\int_0^1 h(p(t))\,dt}$$
Minimize $\int_0^1 h(p(t))\,dt$ subject to $\int_0^1 p(s)\,ds = \frac34$, where
$$h(p) = \log 2 + p\log p + (1-p)\log(1-p)$$
By convexity, the minimizer is constant: $p(s) \equiv \frac34$, i.e. $f(s) = \frac{3s}{4}$. A steady lead.
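
A tiny numerical comparison (the profiles are my choices) of the rate $\int_0^1 h(p(t))\,dt$ for the steady profile against a "burst" profile with the same mean $3/4$:

```python
import math

def h(p):
    """Local rate: relative entropy of Bernoulli(p) with respect to a fair coin."""
    return math.log(2) + p * math.log(p) + (1 - p) * math.log(1 - p)

steady_rate = h(0.75)                        # constant profile p(s) = 3/4
burst_rate = 0.5 * h(0.99) + 0.5 * h(0.51)   # heads mostly in the first half; mean still 3/4
print(steady_rate, burst_rate)               # the steady profile has the strictly smaller rate
```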

General philosophy: $P(A)$ is to be estimated, and $P(A)$ is small.
Find $Q$ such that $Q(A) \simeq 1$. Then
$$P(A) = Q(A)\cdot\frac{1}{Q(A)}\int_A e^{\log\frac{dP}{dQ}}\,dQ$$

By Jensen's inequality,
$$P(A) \ge Q(A)\exp\Big[-\frac{1}{Q(A)}\int_A \log\frac{dQ}{dP}\,dQ\Big]$$
Asymptotically, more or less,
$$P(A) \simeq \exp[-h(Q;P)]$$
Optimize over $Q$.
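
An illustration of this philosophy in the coin-tossing setting (my choice of example): take $A = \{S_n \ge 3n/4\}$, $P$ the fair coin, and $Q$ the tilted coin with success probability $3/4$, which makes $A$ typical. The exact value of $-\frac1n\log P(A)$ approaches the relative entropy per toss:

```python
import math

def log_binom_tail(n, k, a):
    """log P[Binomial(n, a) >= k], via a log-sum-exp over the pmf."""
    terms = [math.lgamma(n + 1) - math.lgamma(j + 1) - math.lgamma(n - j + 1)
             + j * math.log(a) + (n - j) * math.log(1 - a)
             for j in range(k, n + 1)]
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))

p, q = 0.5, 0.75                       # P = fair coin, Q = tilted coin that makes A typical
h = q * math.log(q / p) + (1 - q) * math.log((1 - q) / (1 - p))   # relative entropy per toss

for n in [40, 400, 4000]:
    k = int(0.75 * n)
    print(n, -log_binom_tail(n, k, p) / n, h)    # -(1/n) log P(A) approaches h(Q;P)
```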

Example: a Markov chain $\{p_{i,j}\}$ with invariant distribution $p$.
What is the probability that the empirical distribution $r_n$ is close to $q$?
Find $\{q_{i,j}\}$ such that its invariant distribution is $q$. Then
$$h(Q;P) = \sum_{i,j} q_i\, q_{i,j}\log\frac{q_{i,j}}{p_{i,j}}$$
Optimize over $\{q_{i,j}\}$ with $q$ as invariant distribution. The best possible lower bound is also the upper bound.
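
A sketch that evaluates this cost for one candidate chain $\{q_{i,j}\}$ (the matrices are my choices; the actual rate requires optimizing over all $\{q_{i,j}\}$ with invariant distribution $q$):

```python
import numpy as np

def invariant(Q):
    """Invariant (stationary) distribution of a stochastic matrix Q."""
    evals, evecs = np.linalg.eig(Q.T)
    v = np.real(evecs[:, np.argmin(np.abs(evals - 1.0))])
    return v / v.sum()

def rate(Qmat, Pmat):
    """h(Q;P) = sum_{i,j} q_i q_{ij} log(q_{ij}/p_{ij}); assumes strictly positive entries."""
    q = invariant(Qmat)
    return float(np.sum(q[:, None] * Qmat * np.log(Qmat / Pmat)))

P = np.array([[0.9, 0.1],
              [0.5, 0.5]])
Q = np.array([[0.6, 0.4],          # one candidate chain; its invariant law is the target q
              [0.6, 0.4]])
print(invariant(Q), rate(Q, P))    # target q and the entropy cost of this candidate
```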

Random graphs: $p$ is the probability of an edge.
We want more triangles than typical: $\binom{N}{3}\tau$ of them, with $\tau > p^3$.
Fix
$$\int f(x,y)f(y,z)f(z,x)\,dx\,dy\,dz = \tau$$
and minimize
$$H(f) = \int\Big[f(x,y)\log\frac{f(x,y)}{p} + (1-f(x,y))\log\frac{1-f(x,y)}{1-p}\Big]\,dx\,dy$$
Is the minimizer the constant $f \equiv \tau^{1/3}$?
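
One can at least evaluate $H(f)$ at the constant candidate $f \equiv \tau^{1/3}$ (the parameters below are arbitrary); whether this constant function is actually the minimizer is exactly the question the slide leaves open:

```python
import math

def h_p(x, p):
    """Pointwise relative entropy x log(x/p) + (1-x) log((1-x)/(1-p))."""
    return x * math.log(x / p) + (1 - x) * math.log((1 - x) / (1 - p))

p, tau = 0.3, 0.1                  # edge probability and target triangle density, tau > p**3
x = tau ** (1.0 / 3.0)             # constant candidate f = tau^(1/3)
print(h_p(x, p))                   # value of H(f) for the constant candidate
```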

THANK YOU
