Minimax Redundancy for Large Alphabets by Analytic Methods


1 Minimax Redundancy for Large Alphabets by Analytic Methods

Wojciech Szpankowski
Purdue University, W. Lafayette, IN

March 15, 2012
NSF STC Center for Science of Information
CISS, Princeton 2012

Joint work with M. Weinberger

2 Outline
1. Source Coding: The Redundancy Rate Problem
2. Universal Memoryless Sources
   (a) Finite Alphabet
   (b) Unbounded Alphabet
3. Universal Renewal Sources


4 Source Coding and Redundancy

Source coding aims at finding codes $C : \mathcal{A}^* \to \{0,1\}^*$ of the shortest length $L(C, x)$, either on average or for individual sequences.

Known source $P$: the pointwise and maximal redundancy are
$$R_n(C_n, P; x_1^n) = L(C_n, x_1^n) + \log P(x_1^n),$$
$$R^*(C_n, P) = \max_{x_1^n}\bigl[L(C_n, x_1^n) + \log P(x_1^n)\bigr],$$
where $P(x_1^n)$ is the probability of $x_1^n = x_1 \cdots x_n$.

Unknown source $P$: following Davisson, the maximal minimax redundancy $R_n^*(\mathcal{S})$ for a family of sources $\mathcal{S}$ is
$$R_n^*(\mathcal{S}) = \min_{C_n} \sup_{P \in \mathcal{S}} \max_{x_1^n}\bigl[L(C_n, x_1^n) + \log P(x_1^n)\bigr].$$

Shtarkov's bound:
$$d_n(\mathcal{S}) := \log \sum_{x_1^n \in \mathcal{A}^n} \sup_{P \in \mathcal{S}} P(x_1^n) \;\le\; R_n^*(\mathcal{S}) \;\le\; \log \underbrace{\sum_{x_1^n \in \mathcal{A}^n} \sup_{P \in \mathcal{S}} P(x_1^n)}_{D_n(\mathcal{S})} + 1.$$

5 Maximal Minimax Redundancy $R_n^*$

For the maximal minimax redundancy define
$$Q^*(x_1^n) := \frac{\sup_{P \in \mathcal{S}} P(x_1^n)}{\sum_{y_1^n \in \mathcal{A}^n} \sup_{P \in \mathcal{S}} P(y_1^n)},$$
the maximum-likelihood distribution. Observe that (Shtarkov, 1976):
$$R_n^*(\mathcal{S}) = \min_{C_n \in \mathcal{C}} \sup_{P \in \mathcal{S}} \max_{x_1^n}\bigl(L(C_n, x_1^n) + \log P(x_1^n)\bigr) = \min_{C_n \in \mathcal{C}} \max_{x_1^n}\Bigl(L(C_n, x_1^n) + \sup_{P \in \mathcal{S}} \log P(x_1^n)\Bigr)$$
$$= \min_{C_n \in \mathcal{C}} \max_{x_1^n}\bigl(L(C_n, x_1^n) + \log Q^*(x_1^n)\bigr) + \log \sum_{y_1^n \in \mathcal{A}^n} \sup_{P \in \mathcal{S}} P(y_1^n)$$
$$= R_n^{GS}(Q^*) + \log \sum_{y_1^n \in \mathcal{A}^n} \sup_{P \in \mathcal{S}} P(y_1^n) = \log \sum_{y_1^n \in \mathcal{A}^n} \sup_{P \in \mathcal{S}} P(y_1^n) + O(1),$$
where $R_n^{GS}(Q^*)$ is the redundancy of the optimal generalized Shannon code. Also,
$$d_n(\mathcal{S}) = \log \sum_{x_1^n \in \mathcal{A}^n} \sup_{P \in \mathcal{S}} P(x_1^n) := \log D_n(\mathcal{S}).$$
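To make Shtarkov's sum concrete, here is a minimal brute-force sketch (illustrative only, not from the slides) for the Bernoulli class $\mathcal{S} = \{\mathrm{Bernoulli}(p) : p \in [0,1]\}$: for a binary $x_1^n$ with $k$ ones the maximized likelihood is $(k/n)^k((n-k)/n)^{n-k}$, and $d_n(\mathcal{S})$ is the logarithm of its sum over all $2^n$ sequences.

```python
from itertools import product
from math import log2

def max_likelihood(x):
    """sup over p of p^k (1-p)^(n-k), attained at the empirical frequency k/n."""
    n, k = len(x), sum(x)
    ml = 1.0
    if k > 0:
        ml *= (k / n) ** k
    if k < n:
        ml *= ((n - k) / n) ** (n - k)
    return ml

def d_n(n):
    """d_n(S) = log2 of the sum of maximized likelihoods over all x in {0,1}^n."""
    return log2(sum(max_likelihood(x) for x in product((0, 1), repeat=n)))

for n in (2, 4, 8, 12):
    print(n, round(d_n(n), 4))   # grows like (1/2) log2(n) + O(1) for a binary alphabet
```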


8 Learnable Information and Redundancy

1. $\mathcal{S} := \mathcal{M}_k = \{P_\theta : \theta \in \Theta\}$, a set of $k$-dimensional parameterized distributions. Let $\hat\theta(x^n) = \arg\max_{\theta \in \Theta} P_\theta(x^n)$ be the ML estimator.

[Figure: $\Theta$ covered by $C_n(\Theta)$ balls of indistinguishable models; $\log C_n(\Theta)$ useful bits.]

2. Two models, say $P_\theta(x^n)$ and $P_{\theta'}(x^n)$, are indistinguishable if the ML estimator $\hat\theta$ with high probability declares both models to be the same.

3. The number of distinguishable distributions (i.e., of $\hat\theta$'s), $C_n(\Theta)$, summarizes the learnable information: $I(\Theta) = \log_2 C_n(\Theta)$.

4. Consider the following expansion of the Kullback-Leibler (KL) divergence:
$$D(P_{\hat\theta}\,\|\,P_\theta) := E[\log P_{\hat\theta}(X^n)] - E[\log P_\theta(X^n)] \approx \tfrac{1}{2}(\theta - \hat\theta)^T I(\hat\theta)(\theta - \hat\theta) \approx d_I^2(\theta, \hat\theta),$$
where $I(\theta) = \{I_{ij}(\theta)\}_{ij}$ is the Fisher information matrix and $d_I(\theta, \hat\theta)$ is a rescaled Euclidean distance known as the Mahalanobis distance.

5. Balasubramanian proved that the number of distinguishable balls $C_n(\Theta)$ of radius $O(1/\sqrt{n})$ is asymptotically equal to the minimax redundancy:
$$\text{Learnable Information} = \log C_n(\Theta) = \inf_{\theta \in \Theta}\max_{x^n}\log\frac{P_{\hat\theta}(x^n)}{P_\theta(x^n)} = R_n^*(\mathcal{M}_k).$$
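For a single-parameter example, item 4 can be checked numerically for a Bernoulli($\theta$) model, where the Fisher information is $I(\theta) = 1/(\theta(1-\theta))$ (a small illustrative sketch, not part of the slides):

```python
from math import log

def kl_bernoulli(q, p):
    """D(Ber(q) || Ber(p)) in nats."""
    return q * log(q / p) + (1 - q) * log((1 - q) / (1 - p))

def quadratic_approx(q, p):
    """(1/2)(p - q)^2 I(q), with Fisher information I(q) = 1/(q(1-q)) for Bernoulli."""
    return 0.5 * (p - q) ** 2 / (q * (1 - q))

theta_hat = 0.3
for theta in (0.31, 0.35, 0.40):
    # the KL divergence and its quadratic (Mahalanobis) approximation nearly coincide
    print(theta, round(kl_bernoulli(theta_hat, theta), 6), round(quadratic_approx(theta_hat, theta), 6))
```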

9 Outline Update
1. Source Coding: The Redundancy Rate Problem
2. Universal Memoryless Sources
   (a) Finite Alphabet
   (b) Unbounded Alphabet
3. Universal Renewal Sources

10 Maximal Minimax for Memoryless Sources

For a memoryless source over the alphabet $\mathcal{A} = \{1, 2, \ldots, m\}$ we have
$$P(x_1^n) = p_1^{k_1}\cdots p_m^{k_m}, \qquad k_1 + \cdots + k_m = n.$$
Then
$$D_n(\mathcal{M}_0) := \sum_{x_1^n}\sup_P P(x_1^n) = \sum_{k_1+\cdots+k_m=n}\binom{n}{k_1,\ldots,k_m}\sup_{p_1,\ldots,p_m} p_1^{k_1}\cdots p_m^{k_m} = \sum_{k_1+\cdots+k_m=n}\binom{n}{k_1,\ldots,k_m}\left(\frac{k_1}{n}\right)^{k_1}\cdots\left(\frac{k_m}{n}\right)^{k_m},$$
since the (unnormalized) maximized likelihood is
$$\sup_P P(x_1^n) = \sup_{p_1,\ldots,p_m} p_1^{k_1}\cdots p_m^{k_m} = \left(\frac{k_1}{n}\right)^{k_1}\cdots\left(\frac{k_m}{n}\right)^{k_m}.$$
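For small $n$ and $m$ the sum can be evaluated directly; the following minimal sketch (illustrative, not from the talk) enumerates all compositions $k_1+\cdots+k_m=n$:

```python
from math import comb

def compositions(n, m):
    """Yield all m-tuples of nonnegative integers summing to n."""
    if m == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, m - 1):
            yield (first,) + rest

def multinomial(ks):
    """Multinomial coefficient n!/(k_1!...k_m!) with n = sum(ks)."""
    total, result = sum(ks), 1
    for k in ks:
        result *= comb(total, k)
        total -= k
    return result

def shtarkov_sum(n, m):
    """D_n(M_0) = sum over compositions of multinomial * prod (k_i/n)^{k_i}."""
    d = 0.0
    for ks in compositions(n, m):
        term = float(multinomial(ks))
        for k in ks:
            if k > 0:
                term *= (k / n) ** k
        d += term
    return d

print(shtarkov_sum(10, 3))   # D_10 for a ternary alphabet (m = 3)
```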

11 Generating Function for $D_n(\mathcal{M}_0)$

We write
$$D_n(\mathcal{M}_0) = \sum_{k_1+\cdots+k_m=n}\binom{n}{k_1,\ldots,k_m}\left(\frac{k_1}{n}\right)^{k_1}\cdots\left(\frac{k_m}{n}\right)^{k_m} = \frac{n!}{n^n}\sum_{k_1+\cdots+k_m=n}\frac{k_1^{k_1}}{k_1!}\cdots\frac{k_m^{k_m}}{k_m!}.$$
Let us introduce the tree-generating functions
$$B(z) = \sum_{k=0}^{\infty}\frac{k^k}{k!}z^k = \frac{1}{1 - T(z)}, \qquad T(z) = \sum_{k=1}^{\infty}\frac{k^{k-1}}{k!}z^k,$$
where $T(z) = z e^{T(z)}$ ($= -W(-z)$, Lambert's $W$-function) enumerates all rooted labeled trees. Let now
$$D_m(z) = \sum_{n=0}^{\infty} z^n\,\frac{n^n}{n!}\,D_n(\mathcal{M}_0).$$
Then by the convolution formula
$$D_m(z) = [B(z)]^m.$$
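The convolution identity can be confirmed numerically with truncated power series (a minimal sketch; for $m = 2$ the Shtarkov sum reduces to a single binomial sum that serves as the reference value):

```python
from math import comb, factorial

N = 20  # truncation order for the power series

# B(z) = sum_{k>=0} k^k / k! z^k  (with 0^0 = 1)
B = [(k ** k if k else 1) / factorial(k) for k in range(N + 1)]

def poly_mul(a, b):
    """Multiply two truncated power series (coefficient lists)."""
    c = [0.0] * (N + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= N:
                c[i + j] += ai * bj
    return c

def B_power(m):
    """Coefficients of B(z)^m, truncated at degree N."""
    p = [1.0] + [0.0] * N
    for _ in range(m):
        p = poly_mul(p, B)
    return p

m, n = 2, 10
via_gf = factorial(n) / n ** n * B_power(m)[n]
direct = sum(comb(n, k) * (k / n) ** k * ((n - k) / n) ** (n - k) if 0 < k < n else 1.0
             for k in range(n + 1))
print(via_gf, direct)   # the two values agree
```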


13 Asymptotics for FINITE m

The function $B(z)$ has an algebraic singularity at $z = e^{-1}$, and
$$\beta(z) := B(z/e) = \frac{1}{\sqrt{2(1-z)}} + \frac{1}{3} + O(\sqrt{1-z}).$$
By Cauchy's coefficient formula
$$D_n(\mathcal{M}_0) = \frac{n!}{n^n}[z^n][B(z)]^m = \sqrt{2\pi n}\,(1 + O(1/n))\,\frac{1}{2\pi i}\oint \frac{\beta(z)^m}{z^{n+1}}\,dz.$$
For finite $m$, the singularity analysis of Flajolet and Odlyzko,
$$[z^n](1-z)^{-\alpha} \sim \frac{n^{\alpha-1}}{\Gamma(\alpha)}, \qquad \alpha \notin \{0, -1, -2, \ldots\},$$
finally yields (cf. Clarke & Barron, 1990, W.S., 1998)
$$R_n^*(\mathcal{M}_0) = \frac{m-1}{2}\log\left(\frac{n}{2}\right) + \log\left(\frac{\sqrt{\pi}}{\Gamma(\frac{m}{2})}\right) + \frac{\Gamma(\frac{m}{2})\,m}{3\,\Gamma(\frac{m}{2}-\frac{1}{2})}\cdot\frac{\sqrt{2}}{\sqrt{n}} + \left(\frac{3 + m(m-2)(2m+1)}{36} - \frac{\Gamma^2(\frac{m}{2})\,m^2}{9\,\Gamma^2(\frac{m}{2}-\frac{1}{2})}\right)\frac{1}{n} + \cdots$$
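A quick numerical comparison of the exact value with the first two terms of this expansion, for $m = 3$ (a minimal sketch, illustrative only):

```python
from functools import lru_cache
from math import factorial, lgamma, log, log2, pi

N_MAX = 200
A = [(k ** k if k else 1) / factorial(k) for k in range(N_MAX + 1)]  # k^k/k!, with 0^0 = 1

@lru_cache(maxsize=None)
def f(n, m):
    """Sum over compositions k_1+...+k_m = n of prod k_i^{k_i}/k_i!."""
    if m == 0:
        return 1.0 if n == 0 else 0.0
    return sum(A[k] * f(n - k, m - 1) for k in range(n + 1))

def R_exact(n, m):
    """R*_n(M_0) = log2 D_n(M_0), with D_n(M_0) = n!/n^n * f(n, m)."""
    return log2(factorial(n) / n ** n * f(n, m))

def R_two_terms(n, m):
    """First two terms: (m-1)/2 log2(n/2) + log2(sqrt(pi)/Gamma(m/2))."""
    return (m - 1) / 2 * log2(n / 2) + 0.5 * log2(pi) - lgamma(m / 2) / log(2)

for n in (20, 50, 100, 200):
    print(n, round(R_exact(n, 3), 4), round(R_two_terms(n, 3), 4))  # difference shrinks like 1/sqrt(n)
```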

14 Outline Update
1. Source Coding: The Redundancy Rate Problem
2. Universal Memoryless Sources
   (a) Finite Alphabet
   (b) Unbounded Alphabet
3. Universal Renewal Sources


16 Redundancy for LARGE m

Now assume that $m$ is unbounded and may vary with $n$. Then
$$D_{n,m}(\mathcal{M}_0) = \sqrt{2\pi n}\,\frac{1}{2\pi i}\oint \frac{\beta(z)^m}{z^{n+1}}\,dz = \sqrt{2\pi n}\,\frac{1}{2\pi i}\oint e^{g(z)}\,dz,$$
where $g(z) = m\ln\beta(z) - (n+1)\ln z$.

The saddle point $z_0$ is a solution of $g'(z_0) = 0$, that is,
$$g(z) = g(z_0) + \frac{(z - z_0)^2}{2}\,g''(z_0) + O\bigl(g'''(z_0)(z - z_0)^3\bigr).$$
Under mild conditions satisfied by our $g(z)$ (e.g., $z_0$ is real and unique), the saddle point method leads to
$$D_{n,m}(\mathcal{M}_0) = \frac{e^{g(z_0)}}{\sqrt{2\pi\,g''(z_0)}}\left(1 + O\!\left(\frac{g'''(z_0)}{(g''(z_0))^{\rho}}\right)\right)$$
for some $\rho < 3/2$.
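A numerical sketch of the saddle-point computation (illustrative only, not from the slides; $\beta(z)$ is evaluated from its truncated Taylor series, $g''$ is approximated by a central difference, and the $\sqrt{2\pi n}$ prefactor from the Cauchy-integral line above is kept explicitly):

```python
from functools import lru_cache
from math import exp, factorial, log, pi, sqrt

E = exp(1.0)
K = 2000  # truncation order for beta(z) = B(z/e); adequate for z bounded away from 1

# Taylor coefficients a_k = k^k e^{-k}/k!, built from the ratio a_k/a_{k-1}.
coeff = [1.0, 1.0 / E]
for k in range(2, K + 1):
    coeff.append(coeff[-1] * (1 + 1.0 / (k - 1)) ** (k - 1) / E)

def beta(z):
    return sum(c * z ** k for k, c in enumerate(coeff))

def g(z, n, m):
    """g(z) = m ln beta(z) - (n+1) ln z."""
    return m * log(beta(z)) - (n + 1) * log(z)

def saddle_point(n, m, lo=0.01, hi=0.95):
    """Ternary search for the real minimum of g on (0,1); g is unimodal there."""
    for _ in range(100):
        a, b = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if g(a, n, m) < g(b, n, m):
            hi = b
        else:
            lo = a
    return 0.5 * (lo + hi)

@lru_cache(maxsize=None)
def f(n, m):
    """Exact composition sum, so that D_{n,m} = n!/n^n * f(n, m)."""
    if m == 0:
        return 1.0 if n == 0 else 0.0
    return sum((k ** k if k else 1) / factorial(k) * f(n - k, m - 1) for k in range(n + 1))

n = m = 30
z0 = saddle_point(n, m)
h = 1e-4
g2 = (g(z0 + h, n, m) - 2 * g(z0, n, m) + g(z0 - h, n, m)) / h ** 2   # g''(z0)
estimate = sqrt(2 * pi * n) * exp(g(z0, n, m)) / sqrt(2 * pi * g2)
exact = factorial(n) / n ** n * f(n, m)
print(z0, estimate, exact)   # the saddle-point estimate tracks the exact value
```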

17 Saddle Points

[Figure: behavior of the saddle point $z_0$ in the three regimes $m = o(n)$, $m = n$, and $n = o(m)$.]

18 Main Results: Large Alphabet

Theorem 1 (Orlitsky and Santhanam, 2004, and W.S. and Weinberger, 2010). For memoryless sources $\mathcal{M}_0$ over an $m$-ary alphabet, with $m \to \infty$ as $n$ grows, we have:

(i) For $m = o(n)$,
$$R_{n,m}^*(\mathcal{M}_0) = \frac{m-1}{2}\log\frac{n}{m} + \frac{m}{2}\log e + \frac{m\log e}{3}\sqrt{\frac{m}{n}} - \frac{1}{2} + O\!\left(\sqrt{\frac{m}{n}}\right).$$

(ii) For $m = \alpha n + \ell(n)$, where $\alpha$ is a positive constant and $\ell(n) = o(n)$,
$$R_{n,m}^*(\mathcal{M}_0) = n\log B_\alpha + \ell(n)\log C_\alpha - \frac{1}{2}\log A_\alpha + O(\ell(n)^2/n),$$
where $C_\alpha := \frac{1}{2} + \frac{1}{2}\sqrt{1 + \frac{4}{\alpha}}$, $A_\alpha := C_\alpha + \frac{2}{\alpha}$, and $B_\alpha := \alpha\,C_\alpha^{\alpha+2}\,e^{-\frac{1}{C_\alpha}}$.

(iii) For $n = o(m)$,
$$R_{n,m}^*(\mathcal{M}_0) = n\log\frac{m}{n} + \frac{3n^2}{2m}\log e - \frac{3n}{2m}\log e + O\!\left(\frac{1}{n} + \frac{n^3}{m^2}\right).$$
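As a sanity check on case (iii), one can compare the exact Shtarkov sum (computed from the generating function of slide 11 by binary exponentiation of a truncated series) with the stated expansion for $n$ much smaller than $m$ (an illustrative sketch):

```python
from math import e, factorial, log2

def poly_mul_trunc(a, b, deg):
    """Multiply two polynomials (coefficient lists), truncated at the given degree."""
    c = [0.0] * (deg + 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                if i + j > deg:
                    break
                c[i + j] += ai * bj
    return c

def coeff_B_power(n, m):
    """[z^n] B(z)^m via binary exponentiation of the truncated series of B(z)."""
    base = [(k ** k if k else 1) / factorial(k) for k in range(n + 1)]
    result = [1.0] + [0.0] * n
    while m:
        if m & 1:
            result = poly_mul_trunc(result, base, n)
        base = poly_mul_trunc(base, base, n)
        m >>= 1
    return result[n]

def R_exact(n, m):
    return log2(factorial(n) / n ** n * coeff_B_power(n, m))

def R_case_iii(n, m):
    """n log(m/n) + (3n^2/2m) log e - (3n/2m) log e, the n = o(m) expansion."""
    return n * log2(m / n) + (3 * n * n / (2 * m)) * log2(e) - (3 * n / (2 * m)) * log2(e)

n, m = 20, 10 ** 5
print(round(R_exact(n, m), 3), round(R_case_iii(n, m), 3))   # close when n is much smaller than m
```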

19 Constrained Memoryless Sources

Consider the following generalized source $\widetilde{\mathcal{M}}_0$: the alphabet is $\mathcal{A} \cup \mathcal{B}$, where $|\mathcal{A}| = m$ and $|\mathcal{B}| = M$; the probabilities $p_1,\ldots,p_m$ of $\mathcal{A}$ are unknown, while the probabilities $q_1,\ldots,q_M$ of $\mathcal{B}$ are fixed; define $q = q_1 + \cdots + q_M$ and $p = 1 - q$.

Our goal is to compute the minimax redundancy $R^*_{n,m,M}(\widetilde{\mathcal{M}}_0)$.

Lemma 1. $R^*_{n,m,M}(\widetilde{\mathcal{M}}_0) = \log D_{n,m,M} + O(1)$, where $D_{n,m,M} = 2^{d_{n,m,M}}$ and
$$D_{n,m,M} = \sum_{k=0}^{n}\binom{n}{k}p^k(1-p)^{n-k}D_{k,m},$$
where $\log D_{n,m} = R^*_{n,m}(\mathcal{M}_0) + O(1)$ is the minimax redundancy over $\mathcal{A}$ as presented in Theorem 1.

Proof. It basically follows from $D_{n,m,M} = \sum_{x \in (\mathcal{A}\cup\mathcal{B})^n}\sup_P P(x)$, after splitting each sequence into its $\mathcal{A}$-part (with unknown probabilities) and its $\mathcal{B}$-part (with fixed probabilities).
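Lemma 1 turns the constrained redundancy into a binomial average of the unconstrained Shtarkov sums; a minimal sketch (illustrative only, reusing the composition sum from the earlier snippets):

```python
from functools import lru_cache
from math import comb, factorial, log2

@lru_cache(maxsize=None)
def f(n, m):
    """Composition sum [z^n] B(z)^m, so that D_{n,m} = n!/n^n * f(n, m)."""
    if m == 0:
        return 1.0 if n == 0 else 0.0
    return sum((k ** k if k else 1) / factorial(k) * f(n - k, m - 1) for k in range(n + 1))

def D(n, m):
    """Shtarkov sum D_{n,m} for an unconstrained memoryless source over m symbols."""
    return factorial(n) / n ** n * f(n, m) if n else 1.0

def D_constrained(n, m, p):
    """Lemma 1: D_{n,m,M} = sum_k C(n,k) p^k (1-p)^{n-k} D_{k,m}; the fixed-probability
    part B enters only through its total mass q = 1 - p."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) * D(k, m) for k in range(n + 1))

n, m, p = 50, 3, 0.6
print(round(log2(D_constrained(n, m, p)), 4))   # approximates R*_{n,m,M} up to O(1), by Lemma 1
```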

20 Binomial Sums

Consider the following binomial sum:
$$S_f(n) = \sum_{k=0}^{n}\binom{n}{k}p^k(1-p)^{n-k}f(k).$$
In our case, $f(k) = D_{k,m}$.

Case $f(k) = O(n^a\log^b n) = o(e^n)$: then $S_f(n) = f(np)(1 + O(1/n))$ (cf. Jacquet & W.S. (1999), and Flajolet (1999)).

Case $f(k) = \alpha^k$: then $S_f(n) \sim (p\alpha + 1 - p)^n$.

Case $f(k) = O(k^{k\beta})$: then $S_f(n) \sim p^n f(n)$.
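The first case (polynomial growth, so the sum concentrates at $k = np$) is easy to see numerically (a minimal sketch with an arbitrary polynomially growing $f$):

```python
from math import comb, log

def S(n, p, f):
    """S_f(n) = sum_k C(n,k) p^k (1-p)^{n-k} f(k)."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) * f(k) for k in range(n + 1))

def growth(k):
    """A polynomially growing f, as in the first case."""
    return (k + 1) ** 1.5 * log(k + 2)

n, p = 400, 0.3
print(round(S(n, p, growth), 2), round(growth(n * p), 2))   # close, as the first case predicts
```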

21 Main Results for the Constrained Model $\widetilde{\mathcal{M}}_0$

Theorem 2 (W.S. and Weinberger, 2010). Write $m_n$ for $m$ depending on $n$.

(i) $m_n = o(n)$. Assume: (a) $m(x) := m_x$ and its derivatives are continuous functions; (b) $\Delta_n := m_{n+1} - m_n = O(m'(n))$, $m'(n) = O(m/n)$, $m''(n) = O(m/n^2)$.
If $m_n = o(\sqrt{n/\log n})$, then
$$R^*_{n,m,M} = \frac{m_{np}-1}{2}\log\left(\frac{np}{m_{np}}\right) + \frac{m_{np}}{2}\log e + O\!\left(\frac{1}{m_n} + \frac{m_n^2}{n}\log^2 n\right).$$
Otherwise,
$$R^*_{n,m,M} = \frac{m_{np}}{2}\log\left(\frac{np}{m_{np}}\right) + \frac{m_{np}}{2}\log e + O\!\left(\log n + \frac{m_n^2}{n}\log^2\frac{n}{m_n}\right).$$

(ii) Let $m_n = \alpha n + \ell(n)$, where $\alpha$ is a positive constant and $\ell(n) = o(n)$. Then
$$R^*_{n,m,M} = n\log(B_\alpha p + 1 - p) - \frac{1}{2}\log A_\alpha + O\!\left(\frac{\ell(n)+1}{n}\right),$$
where $A_\alpha$ and $B_\alpha$ are defined in Theorem 1.

(iii) Let $n = o(m_n)$ and assume $m_k/k$ is a nondecreasing sequence. Then
$$R^*_{n,m,M} = n\log\left(\frac{p\,m_n}{n}\right) + O\!\left(\frac{n^2}{m_n} + \frac{1}{n}\right).$$

22 Outline Update
1. Source Coding: The Redundancy Rate Problem
2. Universal Memoryless Sources
3. Universal Renewal Sources


25 Renewal Sources (Virtual Large Alphabet)

The renewal process $\mathcal{R}_0$ (introduced in 1996 by Csiszár and Shields) is defined as follows:
Let $T_1, T_2, \ldots$ be a sequence of i.i.d. positive-valued random variables with distribution $Q(j) = \Pr\{T_i = j\}$. In a binary renewal sequence the positions of the 1's are at the renewal epochs $T_0, T_0 + T_1, \ldots$, with runs of zeros of lengths $T_1 - 1, T_2 - 1, \ldots$.

For a sequence
$$x_1^n = 1\,0^{\alpha_1}\,1\,0^{\alpha_2}\cdots 1\,0^{\alpha_k}\,\underbrace{0\cdots0}_{k^*}$$
define $k_m$ as the number of $i$ such that $\alpha_i = m$. Then
$$P(x_1^n) = [Q(0)]^{k_0}[Q(1)]^{k_1}\cdots[Q(n-1)]^{k_{n-1}}\,\Pr\{T_1 > k^*\}.$$

Theorem 3 (Flajolet and W.S., 1998). Consider the class of renewal processes. Then
$$R_n^*(\mathcal{R}_0) = \frac{2}{\log 2}\sqrt{cn} + O(\log n), \qquad \text{where } c = \frac{\pi^2}{6} - 1.$$
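The formula for $P(x_1^n)$ translates directly into code; the sketch below (illustrative only; the geometric choice of $Q$ is a hypothetical example) parses the runs of zeros and multiplies the corresponding $Q$ values and the tail term:

```python
def parse_runs(x):
    """Split a binary string 1 0^{a_1} 1 0^{a_2} ... 1 0^{k*} into the zero-run
    lengths a_1,...,a_k between consecutive ones and the trailing run length k*."""
    ones = [i for i, b in enumerate(x) if b == "1"]
    runs = [ones[j + 1] - ones[j] - 1 for j in range(len(ones) - 1)]
    trailing = len(x) - ones[-1] - 1
    return runs, trailing

def renewal_probability(x, Q, survival):
    """P(x) = prod_i Q(a_i) * Pr{T > k*}, following the slide's convention that a
    zero-run of length m contributes a factor Q(m)."""
    runs, trailing = parse_runs(x)
    prob = survival(trailing)
    for a in runs:
        prob *= Q(a)
    return prob

# Hypothetical example: geometric run-length distribution Q(m) = (1-q) q^m.
q = 0.5
Q = lambda m: (1 - q) * q ** m
survival = lambda k: q ** (k + 1)   # Pr{T > k} for this geometric choice
print(renewal_probability("1001010001", Q, survival))
```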


27 Maximal Minimax Redundancy

It can be proved that
$$r_{n+1} - 1 \le D_n(\mathcal{R}_0) \le \sum_{m=0}^{n} r_m,$$
where
$$r_n = \sum_{k=0}^{n} r_{n,k}, \qquad r_{n,k} = \sum_{\mathcal{I}(n,k)}\binom{k}{k_0,\ldots,k_{n-1}}\left(\frac{k_0}{k}\right)^{k_0}\left(\frac{k_1}{k}\right)^{k_1}\cdots\left(\frac{k_{n-1}}{k}\right)^{k_{n-1}},$$
and $\mathcal{I}(n,k)$ is the set of integer partitions of $n$ into $k$ terms, i.e., $n = k_0 + 2k_1 + \cdots + nk_{n-1}$, $k = k_0 + \cdots + k_{n-1}$.

But we shall study $s_n = \sum_{k=0}^{n} s_{n,k}$, where
$$s_{n,k} = e^{-k}\sum_{\mathcal{I}(n,k)}\frac{k_0^{k_0}}{k_0!}\cdots\frac{k_{n-1}^{k_{n-1}}}{k_{n-1}!},$$
since
$$S(z,u) = \sum_{k,n} s_{n,k}\,u^k z^n = \prod_{i=1}^{\infty}\beta(z^i u).$$
Moreover,
$$s_n = [z^n]S(z,1) \approx [z^n]\exp\left(\frac{c}{1-z} + a\log\frac{1}{1-z}\right).$$

Theorem 4 (Flajolet and W.S., 1998). We have the following asymptotics:
$$s_n \sim \exp\left(2\sqrt{cn} - \frac{7}{8}\log n + O(1)\right), \qquad \log r_n = \frac{2}{\log 2}\sqrt{cn} - \frac{5}{8}\log n + \frac{1}{2}\log\log n + O(1).$$
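For small $n$ both partition sums can be enumerated directly (a minimal illustrative sketch; the last column shows the exponential factor $\exp(2\sqrt{cn})$ from Theorem 4 for comparison):

```python
from math import exp, factorial, pi, sqrt

def partitions(n, max_part=None):
    """Yield the integer partitions of n as {part: multiplicity} dictionaries."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield {}
        return
    for part in range(min(n, max_part), 0, -1):
        for rest in partitions(n - part, part):
            out = dict(rest)
            out[part] = out.get(part, 0) + 1
            yield out

def r_and_s(n):
    """r_n and s_n as defined on this slide, summed over all partitions of n."""
    r = s = 0.0
    for mult in partitions(n):
        ks = list(mult.values())            # the nonzero multiplicities k_i
        k = sum(ks)
        multinom = factorial(k)
        term_r, term_s = 1.0, exp(-k)
        for ki in ks:
            multinom //= factorial(ki)
            term_r *= (ki / k) ** ki
            term_s *= ki ** ki / factorial(ki)
        r += multinom * term_r
        s += term_s
    return r, s

c = pi * pi / 6 - 1
for n in (10, 20, 30, 40):
    r, s = r_and_s(n)
    # s_n grows like exp(2 sqrt(cn)) up to polynomial factors
    print(n, round(r, 3), round(s, 3), round(exp(2 * sqrt(c * n)), 3))
```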

28 That s IT
