Lecture 3 : Kesten-Stigum bound MATH85K - Spring 00 Lecturer: Sebastien Roch References: [EKPS00, Mos0, MP03, BCMR06]. Previous class DEF 3. (Ancestral reconstruction solvability) Let µ + h be the distribution µ h conditioned on the root state σ 0 being +, and similarly for µ h. We say that the ancestral reconstruction problem (under the CFN model) for 0 < p < / is solvable if lim inf h µ+ h µ h > 0, otherwise the problem is unsolvable. THM 3. (Solvability) Let θ = p = /. Then when p < p the ancestral reconstruction problem is solvable. Kesten-Stigum bound The previous theorem was proved by showing that majority is a good root estimator up to p = p. Here we show that this result is best possible. Of course, majority is not the best root estimator: in general its error probability can be higher than maximum likelihood. (See Figure 3 in [EKPS00] for an insightful example where majority and maximum likelihood differ.) However, it turns out that the critical threshold for majority, called the Kesten-Stigum bound, coincides with the critical threshold of maximum likelihood in the CFN model. Note that the latter is not true for more general models [Mos0]. THM 3.3 (Tightness of Kesten-Stigum Bound) Let θ = p = /. Then when p p the ancestral reconstrution problem is not solvable. Along each path from the root, information is lost through mutation at exponential rate maeasured by θ = p. Meanwhile, the tree is growing exponentially
Lecture 3: Kesten-Stigum bound and information is duplicated measured by the branching ratio b =. These two forces balance each other out when bθ =, the critical threshold in the theorem. To prove Theorem 3.3 we analyze the maximum likelihood estimator. Let µ h (s 0 s h ) be the conditional probability of the root state s 0 given the states s h at level h. It will be more convenient to work with the following related quantity Z h = µ h (+ σ h ) µ h ( σ h ) = [µ+ h (σ h) µ h (σ h)] = µ h (+ σ h ), which, as a function of σ h, is a random variable. Note that E[Z h ] = 0 by symmetry. It is enough to prove a bound on the variance of Z h. LEM 3.4 It holds that µ + h µ h E[Zh ]. Proof: By Bayes rule and Cauchy-Schwarz µ + h (s h) µ h (s h) = µ h (s h ) µ h (+ s h ) µ h ( s h ) s h s h = E Z h E[Zh ]. Let z h = E[Zh ]. The proof of Theorem 3.3 will follow from lim h z h = 0. We apply the same type of recursive argument we used for the analysis of majority: we condition on the root to exploit conditional independence; we apply the Markov channel on the top edge. Distributional recursion We first derive a recursion for Z h. Let σ h be the states at level h below the first child of the root and let µ h be the distribution of σ h. Define Ż h = µ h (+ σ h ) µ h ( σ h ), where µ h (s 0 ṡ h ) is the conditional probability that the root is s 0 given that σ h = ṡ h. Similarly, denote with a double dot the same quantities with respect to the subtree below the second child of the root.
Lecture 3: Kesten-Stigum bound 3 LEM 3.5 It holds pointwise that Z h = Żh + Z h + Żh Z h. Proof: Using µ + h (s h) = µ + h (ṡ h) µ + h ( s h), note that Z h = = = = 4 γ µγ h (σ h) µ γ h ( σ h) µ γ h ( σ h) ( ) ( + γżh + γ γ Z ) h (Żh + Z h ), where µ γ h ( σ h) µ γ h ( σ h) = = = 4 ( + Żh Z h ). µ γ h (σ h) µ γ h ( σ h) µ γ h ( σ h) ( ) ( + γżh + γ Z h ) Define Ż h = µ h (+ σ h ) µ h ( σ h ), where µ h (s 0 σ h ) is the condition probability that the first child of the root is s 0 given that the states at level h below the first child are σ h. Similarly, LEM 3.6 It holds pointwise that Ż h = θżh. Proof: The proof is similar to that of the previous lemma and is left as an exercise.
Lecture 3: Kesten-Stigum bound 4 3 Moment recursion We now take expectations in the previous recursion for Z h. Note that we need to compute the second moment. However, an important simplification arises from the following observation: E + h [Z h] = s h µ + h (s h)z h (s h ) = s h µ h (s h ) µ+ h (s h) µ h (s h ) Z h(s h ) = s h µ h (s h )( + Z h (s h ))Z h (s h ) = E[( + Z h )Z h ] = E[Z h ], so it suffices to compute the (conditioned) first moment. Proof:(of Theorem 3.3) Using the expansion we have that + r = r + r + r, Z h = θ(żh + Z h ) θ 3 (Żh + Z h )Żh Z h + θ 4 Ż h Z h Z h θ(żh + Z h ) θ 3 (Żh + Z h )Żh Z h + θ 4 Ż h Z h, () where we used Z h. To take expectations, we need the following lemma. LEM 3.7 We have and E + h [Żh ] = θe + h [Żh ], E + h [Ż h ] = ( θ)e[ż h ] + θe+ h [Ż h ] = E[Ż h ] = E+ h [Żh ]. Proof: For the first equality, note that by symmetry E + h [Żh ] = ( p)e + h [Żh ] + pe h [Żh ] = ( p)e + h [Żh ]. The second equality is proved similarly and is left as an exercise. Taking expectations in (), using conditional independence and symmetry z h θ z h θ 4 z h + θ4 z h = θ z h θ 4 z h.
Lecture 3: Kesten-Stigum bound 5 References [BCMR06] Christian Borgs, Jennifer T. Chayes, Elchanan Mossel, and Sébastien Roch. The Kesten-Stigum reconstruction bound is tight for roughly symmetric binary channels. In FOCS, pages 58 530, 006. [EKPS00] [Mos0] [MP03] W. S. Evans, C. Kenyon, Y. Peres, and L. J. Schulman. Broadcasting on trees and the Ising model. Ann. Appl. Probab., 0():40 433, 000. E. Mossel. Reconstruction on trees: beating the second eigenvalue. Ann. Appl. Probab., ():85 300, 00. E. Mossel and Y. Peres. Information flow on trees. Ann. Appl. Probab., 3(3):87 844, 003.