Applications of Analytic Combinatorics in Mathematical Biology (joint with H. Chang, M. Drmota, E. Y. Jin, and Y.-W. Lee)

Similar documents
On the number of subtrees on the fringe of random trees (partly joint with Huilan Chang)

The Subtree Size Profile of Plane-oriented Recursive Trees

MAXIMAL CLADES IN RANDOM BINARY SEARCH TREES

The Subtree Size Profile of Bucket Recursive Trees

Almost giant clusters for percolation on large trees

arxiv: v1 [q-bio.pe] 13 Aug 2015

On joint subtree distributions under two evolutionary models

Random trees and branching processes

The Subtree Size Profile of Plane-oriented Recursive Trees

Dynamics of the evolving Bolthausen-Sznitman coalescent. by Jason Schweinsberg University of California at San Diego.

PROTECTED NODES AND FRINGE SUBTREES IN SOME RANDOM TREES

Dependence between Path Lengths and Size in Random Trees (joint with H.-H. Chern, H.-K. Hwang and R. Neininger)

Cutting edges at random in large recursive trees

k-protected VERTICES IN BINARY SEARCH TREES

DISTRIBUTIONS OF CHERRIES FOR TWO MODELS OF TREES

Protected nodes and fringe subtrees in some random trees

ON COMPOUND POISSON POPULATION MODELS

The expected value of the squared euclidean cophenetic metric under the Yule and the uniform models

The range of tree-indexed random walk

RECOVERING NORMAL NETWORKS FROM SHORTEST INTER-TAXA DISTANCE INFORMATION

Chapter 11. Min Cut Min Cut Problem Definition Some Definitions. By Sariel Har-Peled, December 10, Version: 1.

Asymptotic distribution of two-protected nodes in ternary search trees

arxiv: v1 [math.pr] 21 Mar 2014

On Symmetries of Non-Plane Trees in a Non-Uniform Model

CENTRAL LIMIT THEOREMS FOR ADDITIVE TREE PARAMETERS

Asymptotic and Exact Poissonized Variance in the Analysis of Random Digital Trees (joint with Hsien-Kuei Hwang and Vytas Zacharovas)

ON THE DISTRIBUTION OF SUBTREE ORDERS OF A TREE. 1. Introduction

ANALYSIS OF SOME TREE STATISTICS

Workshop III: Evolutionary Genomics

Modern Discrete Probability Branching processes

arxiv: v1 [math.co] 12 Apr 2018

arxiv: v1 [math.co] 1 Aug 2018

Mod-φ convergence I: examples and probabilistic estimates

Almost sure asymptotics for the random binary search tree

14 Branching processes

Approximate Counting via the Poisson-Laplace-Mellin Method (joint with Chung-Kuei Lee and Helmut Prodinger)

Paul-Eugène Parent. March 12th, Department of Mathematics and Statistics University of Ottawa. MAT 3121: Complex Analysis I

Forty years of tree enumeration

Lecture 2: Discrete Probability Distributions

ROSENA R.X. DU AND HELMUT PRODINGER. Dedicated to Philippe Flajolet ( )

Hard-Core Model on Random Graphs

FRINGE TREES, CRUMP MODE JAGERS BRANCHING PROCESSES AND m-ary SEARCH TREES

The Moran Process as a Markov Chain on Leaf-labeled Trees

arxiv: v1 [q-bio.pe] 1 Jun 2014

Notes by Zvi Rosen. Thanks to Alyssa Palfreyman for supplements.

Midterm Exam 1 Solution

n! (k 1)!(n k)! = F (X) U(0, 1). (x, y) = n(n 1) ( F (y) F (x) ) n 2

The expansion of random regular graphs

Susceptible-Infective-Removed Epidemics and Erdős-Rényi random

On the number of matchings of a tree.

Networks: Lectures 9 & 10 Random graphs

5. Analytic Combinatorics

Probabilistic analysis of the asymmetric digital search trees

Randomized Algorithms III Min Cut

On the distribution of subtree orders of a tree

Gibbs distributions for random partitions generated by a fragmentation process

Math Bootcamp 2012 Miscellaneous

University of Regina. Lecture Notes. Michael Kozdron

arxiv: v1 [cs.cc] 9 Oct 2014

RANDOM CUTTING AND RECORDS IN DETERMINISTIC AND RANDOM TREES

arxiv:math.pr/ v1 17 May 2004

From Individual-based Population Models to Lineage-based Models of Phylogenies

18.175: Lecture 17 Poisson random variables

Maximum Agreement Subtrees

On finding a minimum spanning tree in a network with random weights

FINAL EXAM: Monday 8-10am

GENERATING FUNCTIONS AND CENTRAL LIMIT THEOREMS

The Wright-Fisher Model and Genetic Drift

Discrete Distributions Chapter 6

Combinatorial/probabilistic analysis of a class of search-tree functionals

arxiv: v1 [math.co] 29 Nov 2018

How should we organize the diversity of animal life?

Additive tree functionals with small toll functions and subtrees of random trees

Growing and Destroying Catalan Stanley Trees

This exam is closed book and closed notes. (You will have access to a copy of the Table of Common Distributions given in the back of the text.

Discrete Random Variable

Permutations with Inversions

Trees, Tanglegrams, and Tangled Chains

A note on a scaling limit of successive approximation for a differential equation with moving singularity.

The space requirement of m-ary search trees: distributional asymptotics for m 27

Theory of Stochastic Processes 3. Generating functions and their applications

Random Boolean expressions

Average Case Analysis of QuickSort and Insertion Tree Height using Incompressibility

n! (k 1)!(n k)! = F (X) U(0, 1). (x, y) = n(n 1) ( F (y) F (x) ) n 2

Randomized algorithm

Analysis of Approximate Quickselect and Related Problems

Part IA Probability. Definitions. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015

Extremal Statistics on Non-Crossing Configurations

4 Sums of Independent Random Variables

FE 5204 Stochastic Differential Equations

Expectation MATH Expectation. Benjamin V.C. Collins, James A. Swenson MATH 2730

Phylogeny and systematics. Why are these disciplines important in evolutionary biology and how are they related to each other?

Probabilities of Evolutionary Trees under a Rate-Varying Model of Speciation

Math 510 midterm 3 answers

Probability and Measure

Lecture 17: Trees and Merge Sort 10:00 AM, Oct 15, 2018

Bandits, Experts, and Games

Randomized Sorting Algorithms Quick sort can be converted to a randomized algorithm by picking the pivot element randomly. In this case we can show th

Reconstruction of certain phylogenetic networks from their tree-average distances

6 Introduction to Population Genetics

Transcription:

Applications of Analytic Combinatorics in Mathematical Biology (joint with H. Chang, M. Drmota, E. Y. Jin, and Y.-W. Lee) Michael Fuchs Department of Applied Mathematics National Chiao Tung University June 30th, 2015 Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 1 / 45

Analytic Combinatorics (i) Combinatorialists use recurrence, generating functions, and such transformations as the Vandermonde convolution; others, to my horror, use contour integrals, differential equations, and other resources of mathematical analysis. - John Riordan (1968). Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 2 / 45

Analytic Combinatorics (ii) Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 3 / 45

Outline of the Talk 1 Introduction Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 4 / 45

Outline of the Talk 1 Introduction 2 Patterns in Phylogenetic Trees H. Chang and M. Fuchs (2010). Limit theorems for patterns in phylogenetic trees, J. Math. Biol., 60:4, 481 512. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 4 / 45

Outline of the Talk 1 Introduction 2 Patterns in Phylogenetic Trees H. Chang and M. Fuchs (2010). Limit theorems for patterns in phylogenetic trees, J. Math. Biol., 60:4, 481 512. 3 Shapley Value and Fair Proportion Index M. Fuchs and E. Y. Jin (2015). Equality of Shapley value and fair proportion index in phylogenetic trees, J. Math. Biol., in press. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 4 / 45

Outline of the Talk 1 Introduction 2 Patterns in Phylogenetic Trees H. Chang and M. Fuchs (2010). Limit theorems for patterns in phylogenetic trees, J. Math. Biol., 60:4, 481 512. 3 Shapley Value and Fair Proportion Index M. Fuchs and E. Y. Jin (2015). Equality of Shapley value and fair proportion index in phylogenetic trees, J. Math. Biol., in press. 4 Number of Groups formed by Social Animals M. Drmota, M. Fuchs, Y.-W. Lee (2016+). Stochastic analysis of the extra clustering model for animal grouping, in revision. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 4 / 45

Evolutionary Biology Charles Darwin (1809-1882) Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 5 / 45

Evolutionary Biology First notebook on Transmutation of Species (1837) Charles Darwin (1809-1882) Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 5 / 45

Phylogenetic trees (=PTs) Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 6 / 45

Phylogenetic trees (=PTs) Phylogenetic tree of size n: rooted, plane, unlabelled binary tree with n external nodes (and consequently n 1 internal nodes). Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 6 / 45

Applications of PTs Understanding genetic relatedness of species Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 7 / 45

Applications of PTs Understanding genetic relatedness of species Understanding the underlying evolutionary process Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 7 / 45

Applications of PTs Understanding genetic relatedness of species Understanding the underlying evolutionary process Predicting possible future outcomes Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 7 / 45

Applications of PTs Understanding genetic relatedness of species Understanding the underlying evolutionary process Predicting possible future outcomes Testing appropriateness of random models Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 7 / 45

Applications of PTs Understanding genetic relatedness of species Understanding the underlying evolutionary process Predicting possible future outcomes Testing appropriateness of random models Making conservation decisions in genetics Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 7 / 45

Applications of PTs Understanding genetic relatedness of species Understanding the underlying evolutionary process Predicting possible future outcomes Testing appropriateness of random models Making conservation decisions in genetics Modeling the group formation process of social animals Etc. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 7 / 45

Yule-Harding model or Kingman Coalescent Example: time species Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 8 / 45

Yule-Harding model or Kingman Coalescent Example: Random Model 1: At every time point, two yellow nodes uniformly coalescent. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 8 / 45

Yule-Harding model or Kingman Coalescent Example: Random Model 1: At every time point, two yellow nodes uniformly coalescent. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 8 / 45

Yule-Harding model or Kingman Coalescent Example: Random Model 1: At every time point, two yellow nodes uniformly coalescent. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 8 / 45

Yule-Harding model or Kingman Coalescent Example: Random Model 1: At every time point, two yellow nodes uniformly coalescent. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 8 / 45

Yule-Harding model or Kingman Coalescent Example: Random Model 1: At every time point, two yellow nodes uniformly coalescent. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 8 / 45

Yule-Harding model or Kingman Coalescent Example: Random Model 1: At every time point, two yellow nodes uniformly coalescent. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 8 / 45

Yule-Harding model or Increasing Binary Trees Example: Random Model 2: At every time point, a yellow node is replaced by a cherry. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 9 / 45

Yule-Harding model or Increasing Binary Trees Example: Random Model 2: At every time point, a yellow node is replaced by a cherry. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 9 / 45

Yule-Harding model or Increasing Binary Trees Example: Random Model 2: At every time point, a yellow node is replaced by a cherry. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 9 / 45

Yule-Harding model or Increasing Binary Trees Example: Random Model 2: At every time point, a yellow node is replaced by a cherry. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 9 / 45

Yule-Harding model or Increasing Binary Trees Example: Random Model 2: At every time point, a yellow node is replaced by a cherry. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 9 / 45

Yule-Harding model or Increasing Binary Trees Example: Random Model 2: At every time point, a yellow node is replaced by a cherry. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 9 / 45

Yule-Harding model or Increasing Binary Trees Example: Random Model 2: At every time point, a yellow node is replaced by a cherry. Random model 1 and random model 2 are the same. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 9 / 45

Binary Search Trees Example: Input 1, 4, 2, 5, 3 Random Model 3: All permutations of the input are equally equally. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 10 / 45

Binary Search Trees Example: Input 1, 4, 2, 5, 3 1 Random Model 3: All permutations of the input are equally equally. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 10 / 45

Binary Search Trees Example: Input 1, 4, 2, 5, 3 1 4 Random Model 3: All permutations of the input are equally equally. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 10 / 45

Binary Search Trees Example: Input 1, 4, 2, 5, 3 1 2 4 Random Model 3: All permutations of the input are equally equally. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 10 / 45

Binary Search Trees Example: Input 1, 4, 2, 5, 3 1 Random Model 3: 2 4 5 All permutations of the input are equally equally. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 10 / 45

Binary Search Trees Example: Input 1, 4, 2, 5, 3 1 Random Model 3: 2 4 5 All permutations of the input are equally equally. 3 Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 10 / 45

Binary Search Trees Example: Input 1, 4, 2, 5, 3 1 Random Model 3: 2 4 5 All permutations of the input are equally equally. 3 Again the same as random model 1 and random model 2. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 10 / 45

Patterns in PTs k-pronged node (McKenzie and Steel 2000; Rosenberg 2006): Node with an induced subtree of size k. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 11 / 45

Patterns in PTs k-pronged node (McKenzie and Steel 2000; Rosenberg 2006): Node with an induced subtree of size k. k-caterpillar (Rosenberg 2006): Induced subtree of size k with an internal node which is descendent of all other internal nodes. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 11 / 45

Patterns in PTs k-pronged node (McKenzie and Steel 2000; Rosenberg 2006): Node with an induced subtree of size k. k-caterpillar (Rosenberg 2006): Induced subtree of size k with an internal node which is descendent of all other internal nodes. Node with minimal clade size k 3 (Blum and François (2005)): Node with induced subtree of size k and either right or left subtree is an external node. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 11 / 45

Mean and variance of k-pronged nodes X n,k = # of k-pronged nodes in random PT of size n. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 12 / 45

Mean and variance of k-pronged nodes X n,k = # of k-pronged nodes in random PT of size n. Rosenberg 2006: we have µ n,k := E(X n,k ) = 2n, (n > k) k(k + 1) and 2(4k 2 3k 4)(k 1)n k(k + 1) 2, if n > 2k; (2k 1)(2k + 1) σn,k 2 := Var(X 2(5k 7)(k 1) n,k) = (k + 1) 2, if n = 2k; (2k 1) 2(k 2 + k 2n)n k 2 (k + 1) 2, if 2k > n > k. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 12 / 45

Mean and variance of k-pronged nodes X n,k = # of k-pronged nodes in random PT of size n. Rosenberg 2006: we have µ n,k := E(X n,k ) = 2n, (n > k) k(k + 1) and 2(4k 2 3k 4)(k 1)n k(k + 1) 2, if n > 2k; (2k 1)(2k + 1) σn,k 2 := Var(X 2(5k 7)(k 1) n,k) = (k + 1) 2, if n = 2k; (2k 1) 2(k 2 + k 2n)n k 2 (k + 1) 2, if 2k > n > k. This result + central limit theorem also obtained by Devroye in 1991! Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 12 / 45

The phase change Question: What happens for k as n? Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 13 / 45

The phase change Question: What happens for k as n? Theorem (Feng, Mahmoud, Panholzer; 2008) (i) (Normal range) Let k = o ( n). Then, X n,k µ n,k σ n,k (ii) (Poisson range) Let k c n. Then, X n,k d N (0, 1). d Po(2c 2 ). (iii) (Degenerate range) Let k < n and n = o(k). Then, L X 1 n,k 0. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 13 / 45

A General Framework A pattern of size k is a set of induced subtrees of size k. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 14 / 45

A General Framework A pattern of size k is a set of induced subtrees of size k. X n,k = # of occurrence of a pattern of size k in random PT of size n. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 14 / 45

A General Framework A pattern of size k is a set of induced subtrees of size k. X n,k = # of occurrence of a pattern of size k in random PT of size n. We have, d X n,k = XIn,k + Xn I, n,k where X k,k = Bernoulli(p k ), X In,k and Xn I n,k are conditionally independent given I n, and I n = Unif{1,..., n 1} Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 14 / 45

A General Framework A pattern of size k is a set of induced subtrees of size k. X n,k = # of occurrence of a pattern of size k in random PT of size n. We have, d X n,k = XIn,k + Xn I, n,k where X k,k = Bernoulli(p k ), X In,k and Xn I n,k are conditionally independent given I n, and I n = Unif{1,..., n 1} Here, p k shape parameter 1 # of k-pronged nodes 2/(k 1) # of nodes with minimal clade size k 2 k 2 /(k 1)! # of k caterpillars Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 14 / 45

Mean Value and Variance We have and for n > 2k. µ n,k = 2p kn k(k + 1), (n > k), σ 2 n,k = 2(4k3 + 4k 2 k 1 (11k 2 5)p k )p k n k(k + 1) 2 (2k 1)(2k + 1) Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 15 / 45

Mean Value and Variance We have and for n > 2k. µ n,k = 2p kn k(k + 1), (n > k), σ 2 n,k = 2(4k3 + 4k 2 k 1 (11k 2 5)p k )p k n k(k + 1) 2 (2k 1)(2k + 1) Note that for n > 2k and k. µ n,k σ 2 n,k 2p kn k 2 Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 15 / 45

Poisson Approximation Theorem (Chang and F.; 2010) Let k < n and k. Then, d T V (X n,k, Po(µ n,k )) = { O (p k /k), if µ n,k 1; O (p k /k µ n,k ), if µ n,k < 1. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 16 / 45

Poisson Approximation Theorem (Chang and F.; 2010) Let k < n and k. Then, d T V (X n,k, Po(µ n,k )) = { O (p k /k), if µ n,k 1; O (p k /k µ n,k ), if µ n,k < 1. Recently re-proved by Holmgren and Janson with Stein s method: C. Holmgren and S. Janson (2015). Limit laws for functions of fringe trees for binary search trees and recursive trees, Electronic J. Probability, 20:4, 1 51. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 16 / 45

Berry-Esseen bound and LLT Theorem (Chang and F.; 2010) For µ n,k, ( sup P X n,k µ n,k σ n,k x R ) < x Φ(x) = O ( ) k. pk n Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 17 / 45

Berry-Esseen bound and LLT Theorem (Chang and F.; 2010) For µ n,k, ( sup P X n,k µ n,k σ n,k x R ) < x Φ(x) = O ( ) k. pk n Theorem (Chang and F.; 2010) For µ n,k, ( P (X n,k = µ n,k + xσn,k 2 ) = e x2 /2 (1 + O (1 + x 3 ) 2πσn,k k pk n )), uniformly in x = o((p k n) 1/6 /k 1/3 ). Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 17 / 45

Short Summary We have proved: Explicit expressions for mean and variance. Local limit theorem + Berry-Esseen bound for the largest possible range of k. Poisson approximation + rate whenever k. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 18 / 45

Short Summary We have proved: Explicit expressions for mean and variance. Local limit theorem + Berry-Esseen bound for the largest possible range of k. Poisson approximation + rate whenever k. So, for k n the number of occurrences of the pattern can be approximated by the normal distribution, whereas for the remaining range a Poisson random variable should be used. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 18 / 45

Short Summary We have proved: Explicit expressions for mean and variance. Local limit theorem + Berry-Esseen bound for the largest possible range of k. Poisson approximation + rate whenever k. So, for k n the number of occurrences of the pattern can be approximated by the normal distribution, whereas for the remaining range a Poisson random variable should be used. In particular, the phase change occurs much earlier than predicted by Feng, Mahmoud, and Panholzer. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 18 / 45

Lloyd Shapley Lloyd Shapley (1923-) Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 19 / 45

Lloyd Shapley Shapley value: Measure of importance of each player in a cooperative game Lloyd Shapley (1923-) Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 19 / 45

Lloyd Shapley Shapley value: Measure of importance of each player in a cooperative game recently used as prioritization tool of taxa in phylogenetics Lloyd Shapley (1923-) Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 19 / 45

Shapley Value and Modified Shapley Value T... PT; a... taxon (=leaf) of T. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 20 / 45

Shapley Value and Modified Shapley Value T... PT; a... taxon (=leaf) of T. Shapley value SV T (a): SV T (a) = 1 ( S 1)!(n S )!(PD T (S) PD T (S \ {a})). n! S,a S Modified Shapley value SV T (a): SV T (a) = 1 ( S 1)!(n S )!(PD T (S) PD T (S \ {a})). n! S 2,a S PD(S) is the size of the ancestor of S. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 20 / 45

Fair Proportion Index Fair proportion index FP T (a): FP T (a) = e 1 D e with D e the number of taxa below e. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 21 / 45

Fair Proportion Index Fair proportion index FP T (a): FP T (a) = e 1 D e with D e the number of taxa below e. Selected (somehow arbitrarily) by Zoological Society of London for EDGE of Existence conservation program! Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 21 / 45

Fair Proportion Index Fair proportion index FP T (a): FP T (a) = e 1 D e with D e the number of taxa below e. Selected (somehow arbitrarily) by Zoological Society of London for EDGE of Existence conservation program! FP n = fair proportion index of random taxon in random PT of size n: { 1 j FP n (I n = j) = + FP j, with probability j/n; 1 n j + FP n j, with probability (n j)/n, where I n = Unif{1,..., n 1}. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 21 / 45

Strong Correlation between SV and FP Hartmann (2013): Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 22 / 45

SV = FP Assume a is in left subtree T l and T l = j. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 23 / 45

SV = FP Assume a is in left subtree T l and T l = j. Lemma We have, and SV T (a) = 1 j + SV T l (a) FP T (a) = 1 j + FP T l (a). Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 23 / 45

SV = FP Assume a is in left subtree T l and T l = j. Lemma We have, and SV T (a) = 1 j + SV T l (a) FP T (a) = 1 j + FP T l (a). Theorem (F. and Jin; 2015) We have, SV T (a) = FP T (a). Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 23 / 45

SV and FP (i) D T (a) = depth of a in T : FP T (a) = SV T (a) = SV T (a) + D T (a) n. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 24 / 45

SV and FP (i) D T (a) = depth of a in T : FP T (a) = SV T (a) = SV T (a) + D T (a) n. D n = depth of random taxon in random PT of size n. Lemma We have, Var(FP n ) = 10 6H (2) n 1 6 n 4 10 π2 n2 Var(D n ) = 2H n 4H n (2) + 2 2 log n, where H n = 1 j n 1/j and H(2) n = 1 j n 1/j2. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 24 / 45

SV and FP (ii) Lemma We have, Cov(FP n, D n ) = 4H (2) n 1 6 + 2 n + 4 n 2 2π2 6 6. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 25 / 45

SV and FP (ii) Lemma We have, Cov(FP n, D n ) = 4H (2) n 1 6 + 2 n + 4 n 2 2π2 6 6. Theorem (F. and Jin; 2015) The correlation coefficient ρ( SV n, FP n ) of modified Shapley value and fair proportion index tends to 1, i.e., lim ρ( SV n, FP n ) = 1. n Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 25 / 45

Short Summary Shapley value was introduced for the purpose of making conservation decisions in genetics. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 26 / 45

Short Summary Shapley value was introduced for the purpose of making conservation decisions in genetics. Fair proportion index was used by Zoological Society of London but its biodiversity value was unclear. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 26 / 45

Short Summary Shapley value was introduced for the purpose of making conservation decisions in genetics. Fair proportion index was used by Zoological Society of London but its biodiversity value was unclear. Strong correlation between modified Shapley value and fair proportion index was observed by Hartmann and others. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 26 / 45

Short Summary Shapley value was introduced for the purpose of making conservation decisions in genetics. Fair proportion index was used by Zoological Society of London but its biodiversity value was unclear. Strong correlation between modified Shapley value and fair proportion index was observed by Hartmann and others. We proved that Shapley value equals fair proportion index. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 26 / 45

Short Summary Shapley value was introduced for the purpose of making conservation decisions in genetics. Fair proportion index was used by Zoological Society of London but its biodiversity value was unclear. Strong correlation between modified Shapley value and fair proportion index was observed by Hartmann and others. We proved that Shapley value equals fair proportion index. We showed that correlation coefficient of modified Shapley value and fair proportion index tends to 1 in Yule-Harding model and other random models. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 26 / 45

Social Animals Consider n animals of a class of social animals. Construct a random PT with leaves representing the animals. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 27 / 45

Social Animals Consider n animals of a class of social animals. Construct a random PT with leaves representing the animals. Describes the genetic relatedness of the animals. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 27 / 45

Animal Groups Durand, Blum and François (2007): Groups are formed more likely by animals which are genetically related. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 28 / 45

Animal Groups Durand, Blum and François (2007): Groups are formed more likely by animals which are genetically related. neutral model. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 28 / 45

Animal Groups Durand, Blum and François (2007): Groups are formed more likely by animals which are genetically related. neutral model. Clade of a leaf: All leafs of the tree rooted at the parent. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 28 / 45

Animal Groups Durand, Blum and François (2007): Groups are formed more likely by animals which are genetically related. neutral model. Clade of a leaf: All leafs of the tree rooted at the parent. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 28 / 45

Animal Groups Durand, Blum and François (2007): Groups are formed more likely by animals which are genetically related. neutral model. # of groups # of maximal clades Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 29 / 45

Animal Groups Durand, Blum and François (2007): Groups are formed more likely by animals which are genetically related. neutral model. # of groups # of maximal clades 2 Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 29 / 45

# of Groups X n = # of groups under the Yule Harding model Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 30 / 45

# of Groups X n = # of groups under the Yule Harding model We have, { d 1, if I n = 1 or I n = n 1, X n = X In + Xn I n, otherwise, where I n = Uniform{1,..., n 1} is the # of animals in the left subtree and X n is an independent copy of X n. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 30 / 45

# of Groups X n = # of groups under the Yule Harding model We have, { d 1, if I n = 1 or I n = n 1, X n = X In + Xn I n, otherwise, where I n = Uniform{1,..., n 1} is the # of animals in the left subtree and X n is an independent copy of X n. Theorem (Durand and François; 2010) We have, E(X n ) an (a := 1 e 2 4 ). Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 30 / 45

Comparison with Real-life Data Durand, Blum and François (2007) presented the following data: Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 31 / 45

Yi-Wen s Thesis (2012) Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 32 / 45

Variance and SLLN Theorem (Lee; 2012) We have, Var(X n ) (1 e 2 ) 2 n log n = 4a 2 n log n. 4 Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 33 / 45

Variance and SLLN Theorem (Lee; 2012) We have, Var(X n ) (1 e 2 ) 2 n log n = 4a 2 n log n. 4 Theorem (Lee; 2012) We have, ( ) P lim X n n E(X n ) 1 = 0 = 1. For SLLN, X n is constructed on the same probability space via the tree evolution process underlying the Yule-Harding model. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 33 / 45

Method of Moments Theorem Assume that E(X k n) E(X k ) for all k 1 and that X is uniquely characterised by its sequence of moments. Then, X n d X. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 34 / 45

Method of Moments Theorem Assume that E(X k n) E(X k ) for all k 1 and that X is uniquely characterised by its sequence of moments. Then, X n d X. Many sufficient conditions for X being uniquely characterized by its moments are known, e.g., E(X k ) zk k! k 1 has a positive radius of convergence. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 34 / 45

Higher Moments Theorem (Lee; 2012) For all k 3, E(X n E(X n )) k ( 1) k 2k k 2 ak n k 1. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 35 / 45

Higher Moments Theorem (Lee; 2012) For all k 3, E(X n E(X n )) k ( 1) k 2k k 2 ak n k 1. This implies that all moments larger than two of X n E(X n ) Var(Xn ) tend to infinity! Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 35 / 45

Higher Moments Theorem (Lee; 2012) For all k 3, E(X n E(X n )) k ( 1) k 2k k 2 ak n k 1. This implies that all moments larger than two of X n E(X n ) Var(Xn ) tend to infinity! Question: Is there a limit distribution? Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 35 / 45

Random Recursive Trees Unordered, rooted trees. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 36 / 45

Random Recursive Trees Unordered, rooted trees. Uniformly choose one of the nodes and attach a child. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 36 / 45

Random Recursive Trees Unordered, rooted trees. Uniformly choose one of the nodes and attach a child. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 36 / 45

Random Recursive Trees Unordered, rooted trees. Uniformly choose one of the nodes and attach a child. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 36 / 45

Random Recursive Trees Unordered, rooted trees. Uniformly choose one of the nodes and attach a child. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 36 / 45

Random Recursive Trees Unordered, rooted trees. Uniformly choose one of the nodes and attach a child. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 36 / 45

Random Recursive Trees Unordered, rooted trees. Uniformly choose one of the nodes and attach a child. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 36 / 45

Random Recursive Trees Unordered, rooted trees. Uniformly choose one of the nodes and attach a child. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 36 / 45

Cutting Down Random Recursive Trees Meir and Moon (1974): Randomly pick an edge and remove it; retain the tree containing the root. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 37 / 45

Cutting Down Random Recursive Trees Meir and Moon (1974): Randomly pick an edge and remove it; retain the tree containing the root. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 37 / 45

Cutting Down Random Recursive Trees Meir and Moon (1974): Randomly pick an edge and remove it; retain the tree containing the root. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 37 / 45

Cutting Down Random Recursive Trees Meir and Moon (1974): Randomly pick an edge and remove it; retain the tree containing the root. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 37 / 45

Cutting Down Random Recursive Trees Meir and Moon (1974): Randomly pick an edge and remove it; retain the tree containing the root. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 37 / 45

Cutting Down Random Recursive Trees Meir and Moon (1974): Randomly pick an edge and remove it; retain the tree containing the root. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 37 / 45

Cutting Down Random Recursive Trees Meir and Moon (1974): Randomly pick an edge and remove it; retain the tree containing the root. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 37 / 45

Cutting Down Random Recursive Trees Meir and Moon (1974): Randomly pick an edge and remove it; retain the tree containing the root. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 37 / 45

Cutting Down Random Recursive Trees Meir and Moon (1974): Randomly pick an edge and remove it; retain the tree containing the root. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 37 / 45

Cutting Down Random Recursive Trees Meir and Moon (1974): Randomly pick an edge and remove it; retain the tree containing the root. Y n = number of steps until tree is destroyed = number of edges cut = 4. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 37 / 45

Mean, Variance and Higher Moments Theorem (Panholzer; 2004) We have, and for k 2 E(Y n ) n log n E(Y n E(Y n )) k ( 1)k k(k 1) n k log k+1 n. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 38 / 45

Mean, Variance and Higher Moments Theorem (Panholzer; 2004) We have, and for k 2 E(Y n ) n log n E(Y n E(Y n )) k Thus, again the limit law of ( 1)k k(k 1) n k log k+1 n. Y n E(Y n ) Var(Yn ) cannot obtained from the method of moments! Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 38 / 45

Limit Law Theorem (Drmota, Iksanov, Moehle, Roessler; 2009) We have, with log 2 n n Y n log n log log n d Y E(e iλy ) = e iλ log λ π λ /2. The law of Y is spectrally negative stable with index of stability 1. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 39 / 45

Limit Law Theorem (Drmota, Iksanov, Moehle, Roessler; 2009) We have, with log 2 n n Y n log n log log n d Y E(e iλy ) = e iλ log λ π λ /2. The law of Y is spectrally negative stable with index of stability 1. Different proofs of this result exist. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 39 / 45

Limit Law of X n Theorem (Drmota, F., Lee; 2014) We have, X n E(X n ) Var(Xn )/2 d N(0, 1). Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 40 / 45

Limit Law of X n Theorem (Drmota, F., Lee; 2014) We have, X n E(X n ) Var(Xn )/2 d N(0, 1). For the proof, we use singularity perturbation analysis. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 40 / 45

Limit Law of X n Theorem (Drmota, F., Lee; 2014) We have, X n E(X n ) Var(Xn )/2 d N(0, 1). For the proof, we use singularity perturbation analysis. A probabilistic proof explaining the curious normalization was given recently. S. Janson (2015). Maximal clades in random binary search trees, Electronic J. Combinatorics, 22:1, paper 31. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 40 / 45

Some Ideas of the Proof (i) Set X(y, z) = n 2 E ( e yxn) z n. Then, This is a Riccati DE. z z X(y, z) = X(y, z) + X2 (y, z) + e y z 2 2ey z 3 1 z. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 41 / 45

Some Ideas of the Proof (i) Set X(y, z) = n 2 E ( e yxn) z n. Then, This is a Riccati DE. z z X(y, z) = X(y, z) + X2 (y, z) + e y z 2 2ey z 3 1 z. Set Then, X(y, z) = X(y, z). z z X(y, z) = X 2 (y, z) + e y 1 + z 1 z. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 41 / 45

Some Ideas of the Proof (ii) Set Then, This is Whittaker s DE. X(y, z) = V (y, z) V (y, z). V (y, z) + e y 1 + z V (y, z) = 0. 1 z Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 42 / 45

Some Ideas of the Proof (ii) Set Then, This is Whittaker s DE. Solution is given by V (y, z) = M e y/2,1/2 X(y, z) = V (y, z) V (y, z). V (y, z) + e y 1 + z V (y, z) = 0. 1 z ( ) ( ) 2e y/2 (z 1) + c(y)w e y/2,1/2 2e y/2 (z 1), where c(y) = ( e y/2 1 ) M e y/2 +1,1/2 ( 2e y/2 ) ( W ) e y/2 +1,1/2 2e y/2. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 42 / 45

Some Ideas of the Proof (iii) Lemma V (y, z) is analytic in = {z C : z < 1 + δ}\ {branch cut from 1 to } for all y < η. Moreover, V (y, z) has only one (simple) zero with z 0 (y) = 1 ay + 2a 2 y 2 log y + O(y 2 ). Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 43 / 45

Some Ideas of the Proof (iii) Lemma V (y, z) is analytic in = {z C : z < 1 + δ}\ {branch cut from 1 to } for all y < η. Moreover, V (y, z) has only one (simple) zero with z 0 (y) = 1 ay + 2a 2 y 2 log y + O(y 2 ). δ z 0 (y) 1 Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 43 / 45

Some Ideas of the Proof (iv) Let y = it/(2a n log n). Then, E ( e yxn) = 1 2πi C X(y, z) z n+1 dz. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 44 / 45

Some Ideas of the Proof (iv) Let y = it/(2a n log n). Then, E ( e yxn) = 1 2πi C X(y, z) z n+1 dz. Lemma We have, E ( e yxn) ( log = z 0 (y) n 3 ) n + O. n Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 44 / 45

Some Ideas of the Proof (iv) Let y = it/(2a n log n). Then, E ( e yxn) = 1 2πi C X(y, z) z n+1 dz. Lemma We have, E ( e yxn) ( log = z 0 (y) n 3 ) n + O. n This together with the expansion of z 0 (y) yields E ( e yxn) ( ) ( it n = exp 2 log n t2 1 + O 4 ( log log n log n )). Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 44 / 45

Summary Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 45 / 45

Summary We gave a comprehensive study of pattern occurrences in random phylogenetic trees. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 45 / 45

Summary We gave a comprehensive study of pattern occurrences in random phylogenetic trees. We explained the strong correlation between Shapley value and fair proportion index in phylogenetic trees. This is of great interest for people working in biodiversity. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 45 / 45

Summary We gave a comprehensive study of pattern occurrences in random phylogenetic trees. We explained the strong correlation between Shapley value and fair proportion index in phylogenetic trees. This is of great interest for people working in biodiversity. We found a curious central limit theorem for the number of groups formed by social animals under the neutral model. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 45 / 45

Summary We gave a comprehensive study of pattern occurrences in random phylogenetic trees. We explained the strong correlation between Shapley value and fair proportion index in phylogenetic trees. This is of great interest for people working in biodiversity. We found a curious central limit theorem for the number of groups formed by social animals under the neutral model. Analytic combinatorics is useful in studying mathematical problems for random phylogenetic trees. We expect many more applications. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 45 / 45

Summary We gave a comprehensive study of pattern occurrences in random phylogenetic trees. We explained the strong correlation between Shapley value and fair proportion index in phylogenetic trees. This is of great interest for people working in biodiversity. We found a curious central limit theorem for the number of groups formed by social animals under the neutral model. Analytic combinatorics is useful in studying mathematical problems for random phylogenetic trees. We expect many more applications. More input from biologists is needed to make our results more relevant from the point of view of applications. Michael Fuchs (NCTU) Phylogenetic Trees (PTs) June 30th, 2015 45 / 45