CSci 8980: Advanced Topics in Graphical Models Analysis of Genetic Variation

1 CSci 8980: Advanced Topics in Graphical Models
Analysis of Genetic Variation
Instructor: Arindam Banerjee
November 26, 2007

10 Genetic Polymorphism
- Single nucleotide polymorphism (SNP)
  - Two possible kinds of nucleotides at a single locus
  - Nucleotide can be one of {A, C, T, G}
  - Most human genetic variation is related to SNPs
  - Each variant is called an allele
- Haplotype
  - List of alleles in a local region of a chromosome
  - Inherited as a unit, if there is no recombination
  - Repeated recombinations between ancestral haplotypes

16 Genetic Polymorphism (Contd.)
- Linkage disequilibrium (LD)
  - Non-random association of alleles at different loci
  - Recombination decouples alleles, increases randomness, decreases LD
- Infer chromosomal recombination hotspots
- Help understand origin and characteristics of genetic variation
- Analyze genetic variation to reconstruct evolutionary history
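To make "non-random association of alleles at different loci" concrete, here is a minimal sketch that computes two standard LD summaries for a pair of loci: the coefficient D = p_AB - p_A p_B and the squared correlation r^2. These measures are not defined on the slides; the function name, its arguments, and the toy haplotypes are illustrative assumptions.

```python
def linkage_disequilibrium(haplotypes, locus_a, locus_b, allele_a, allele_b):
    """Return (D, r_squared) for two loci, given a list of haplotype strings.

    D = p_AB - p_A * p_B and r^2 = D^2 / (p_A (1 - p_A) p_B (1 - p_B)).
    These are textbook definitions, assumed here for illustration.
    """
    n = len(haplotypes)
    p_a = sum(h[locus_a] == allele_a for h in haplotypes) / n
    p_b = sum(h[locus_b] == allele_b for h in haplotypes) / n
    p_ab = sum(
        h[locus_a] == allele_a and h[locus_b] == allele_b for h in haplotypes
    ) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, r_squared


# Toy data: alleles travel together in `linked`, independently in `shuffled`.
linked = ["AC", "AC", "GT", "GT"]
shuffled = ["AC", "AT", "GC", "GT"]
d_linked, r2_linked = linkage_disequilibrium(linked, 0, 1, "A", "C")
d_shuffled, r2_shuffled = linkage_disequilibrium(shuffled, 0, 1, "A", "C")
```

In the linked sample the two loci are perfectly associated, while in the shuffled sample every allele combination is equally frequent, which is the pattern repeated recombination tends to produce.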

17 Haplotype Recombination and Inheritance

23 Hidden Markov Process
- Generative model for choosing recombination sites
- Hidden Markov process
  - Hidden states correspond to index over chromosomes
  - Transition probabilities correspond to recombination rates
  - Emission model corresponds to mutation process that gives descendants
- Implemented using a Hidden Markov Dirichlet Process (HMDP)

30 Dirichlet Process Mixtures
- We know the basics of DPMs
- Haplotype modeling using an infinite mixture model
  - A pool of ancestor haplotypes or founders
  - The size of the pool is unknown
- Standard coalescence-based models
  - The number of hidden variables is prohibitively large
  - Hard to perform inference of ancestral features

36 Dirichlet Process Mixtures (Contd.)
- $H_i = [H_{i,1}, \ldots, H_{i,T}]$: haplotype over $T$ SNPs, chromosome $i$
- $A_k = [A_{k,1}, \ldots, A_{k,T}]$: ancestral haplotype, mutation rate $\theta_k$
- $C_i$: inheritance variable, latent ancestor of $H_i$
- Generative model:
  - Draw a first haplotype
    $$a_1 \mid \mathrm{DP}(\tau, Q_0) \sim Q_0(\cdot), \qquad h_1 \sim P_h(\cdot \mid a_1, \theta_1)$$
  - For subsequent haplotypes
    $$c_i \mid \mathrm{DP}(\tau, Q_0) \sim \begin{cases} p(c_i = c_j \text{ for some } j < i \mid c_1, \ldots, c_{i-1}) = \dfrac{n_{c_j}}{i - 1 + \alpha_0} \\[6pt] p(c_i \neq c_j \text{ for all } j < i \mid c_1, \ldots, c_{i-1}) = \dfrac{\alpha_0}{i - 1 + \alpha_0} \end{cases}$$

42 Dirichlet Process Mixtures (Contd.)
- Generative model (contd.)
  - Sample the founder of haplotype $i$
    $$\phi_{c_i} \mid \mathrm{DP}(\tau, Q_0) \begin{cases} = \{a_{c_j}, \theta_{c_j}\} & \text{if } c_i = c_j \text{ for some } j < i \\ \sim Q_0(a, \theta) & \text{if } c_i \neq c_j \text{ for all } j < i \end{cases}$$
  - Sample the haplotype according to its founder
    $$h_i \mid c_i \sim P_h(\cdot \mid a_{c_i}, \theta_{c_i})$$
- Assumes each haplotype originates from one ancestor
  - Valid only for short regions in chromosome
  - Long regions will have recombination
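The generative model above can be simulated with a Pólya urn. The following is a minimal sketch, not the authors' implementation: the function name, the uniform draw of founder alleles, the Beta parameters for $\theta$, and the reading of $\theta$ as the probability that a descendant allele matches its founder (as in the single-locus emission model later in the slides) are all assumptions made for illustration.

```python
import random


def sample_dp_haplotypes(n_haps, n_snps, alpha0, alleles="ACGT",
                         beta_params=(9.0, 1.0), seed=0):
    """Simulate haplotypes from a DP mixture over founders (Polya urn view)."""
    rng = random.Random(seed)
    founders, thetas, counts = [], [], []   # a_k, theta_k, occupancy n_k
    assignments, haplotypes = [], []
    for i in range(n_haps):                 # i previous haplotypes exist
        # Existing founder k with weight n_k, new founder with weight alpha0;
        # after normalising these are n_k/(i + alpha0) and alpha0/(i + alpha0).
        weights = counts + [alpha0]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(founders):
            # New founder drawn from an assumed base measure:
            # uniform alleles, Beta-distributed theta.
            founders.append([rng.choice(alleles) for _ in range(n_snps)])
            thetas.append(rng.betavariate(*beta_params))
            counts.append(0)
        counts[k] += 1
        assignments.append(k)
        # Inherit each allele from the founder, mutating with prob 1 - theta.
        hap = []
        for a in founders[k]:
            if rng.random() < thetas[k]:
                hap.append(a)
            else:
                hap.append(rng.choice([b for b in alleles if b != a]))
        haplotypes.append("".join(hap))
    return founders, assignments, haplotypes


founders, assignments, haplotypes = sample_dp_haplotypes(20, 8, alpha0=1.0)
```

The number of founders is not fixed in advance: it grows with the data at a rate controlled by `alpha0`, which is the point of using an infinite mixture when the size of the ancestor pool is unknown.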

51 Hidden Markov Dirichlet Process
- Nonparametric Bayesian HMM
  - Sample a DP to form the support of the infinite state space
  - Conditioned on each state, sample a DP with the same support
- Hierarchical urns
  - Stock urn $Q_0$ with balls of $K$ colors, $n_k$ of color $k$
  - HMM-urns $Q_1, \ldots, Q_K$ for prior and transition probabilities
  - Let $m_{j,k}$ be the number of balls of color $k$ in urn $Q_j$
  - HDPM can be simulated by sampling from the urn hierarchy
- Hierarchical DPM
  $$Q_0 \mid \alpha, F \sim \mathrm{DP}(\alpha, F), \qquad Q_j \mid \tau, Q_0 \sim \mathrm{DP}(\tau, Q_0)$$

55 Hidden Markov Dirichlet Process (Contd.)
- Each color corresponds to an ancestor configuration $\phi_k = \{a_k, \theta_k\}$
- For $n$ random draws from $Q_0$
  $$\phi_n \mid \phi_{-n} \sim \sum_{k=1}^{K} \frac{n_k}{n - 1 + \alpha}\, \delta_{\phi_k}(\phi_n) + \frac{\alpha}{n - 1 + \alpha}\, F(\phi_n)$$
- Conditioned on $Q_0$, the marginal configurations from $Q_j$
  $$\phi_{m_j} \mid \phi_{-m_j} \sim \sum_{k=1}^{K} \frac{m_{j,k} + \tau \frac{n_k}{n - 1 + \alpha}}{m_j - 1 + \tau}\, \delta_{\phi_k}(\phi_{m_j}) + \frac{\tau}{m_j - 1 + \tau} \cdot \frac{\alpha}{n - 1 + \alpha}\, F(\phi_{m_j})$$
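The urn hierarchy can be simulated directly. The sketch below is an illustrative reading of the scheme, with a hypothetical class name and interface: a draw from urn $Q_j$ reuses one of its own colors with weight $m_{j,k}$, or with weight $\tau$ consults the stock urn $Q_0$, which in turn returns an existing color with weight $n_k$ or a brand-new color with weight $\alpha$. Marginalising over the two stages gives the predictive probabilities on the slide. Colors are represented only by their index; drawing $\phi_k$ from $F$ is left out.

```python
import random


class HierarchicalUrn:
    """Two-level Polya urn: a stock urn Q_0 and one urn Q_j per HMM state."""

    def __init__(self, alpha, tau, seed=0):
        self.alpha, self.tau = alpha, tau
        self.rng = random.Random(seed)
        self.stock = []   # stock[k] = n_k, balls of color k in Q_0
        self.urns = {}    # urns[j][k] = m_{j,k}, balls of color k in Q_j

    def _draw_stock(self):
        # Existing color k with weight n_k, new color with weight alpha.
        weights = self.stock + [self.alpha]
        k = self.rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(self.stock):
            self.stock.append(0)
        self.stock[k] += 1
        return k

    def draw(self, j):
        m = self.urns.setdefault(j, [])
        m.extend([0] * (len(self.stock) - len(m)))   # pad to known colors
        # Own color k with weight m_{j,k}, consult the stock with weight tau.
        weights = m + [self.tau]
        k = self.rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(m):
            k = self._draw_stock()
            m.extend([0] * (len(self.stock) - len(m)))
        m[k] += 1
        return k


# Simulate a hidden state path: the urn of the current state picks the next.
urn = HierarchicalUrn(alpha=1.0, tau=1.0)
state = urn.draw("init")          # "init" is a hypothetical prior urn label
path = [state]
for _ in range(30):
    state = urn.draw(state)
    path.append(state)
```

Because every urn falls back on the same stock urn, all states share one pool of colors, which is what lets the transition distributions of an infinite HMM have common support.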

63 HMDP for Recombination and Inheritance
- Priors for the conditional model parameters: $F(A, \theta) = p(A)\,p(\theta)$
  - $p(A)$ is assumed uniform, $p(\theta)$ is assumed beta
- $C_i = [C_{i,1}, \ldots, C_{i,T}]$: ancestral index for chromosome $i$
  - With no recombination, $C_{i,t} = k$ for all $t$, for some $k$
- Non-recombination is modeled by a Poisson point process
  $$P(C_{i,t+1} = k \mid C_{i,t} = k) = \exp(-dr) + (1 - \exp(-dr))\,\pi_{kk}$$
  - $d$ is the distance between the two loci
  - $r$ is the rate of recombination per unit distance
- The transition probability to a different state $k' \neq k$ is
  $$P(C_{i,t+1} = k' \mid C_{i,t} = k) = (1 - \exp(-dr))\,\pi_{kk'}$$
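For a finite truncation with $K$ ancestors, the two transition formulas combine into a single matrix: with probability $\exp(-dr)$ no recombination occurs and the state is kept, otherwise the next state is drawn from row $k$ of $\pi$. The sketch below assumes a row-stochastic list-of-lists `pi`; the function name and the example numbers are illustrative.

```python
import math


def transition_matrix(pi, d, r):
    """T[k][k'] = exp(-d r) * [k == k'] + (1 - exp(-d r)) * pi[k][k']."""
    stay = math.exp(-d * r)          # probability of no recombination event
    n_states = len(pi)
    return [
        [(1.0 - stay) * pi[k][kp] + (stay if k == kp else 0.0)
         for kp in range(n_states)]
        for k in range(n_states)
    ]


pi = [[0.7, 0.3],
      [0.4, 0.6]]
close_loci = transition_matrix(pi, d=0.0, r=0.2)     # no room to recombine
typical = transition_matrix(pi, d=0.5, r=0.2)
far_loci = transition_matrix(pi, d=1.0, r=1000.0)    # recombination certain
```

At zero distance the matrix is the identity, and as $dr$ grows it approaches $\pi$ itself, so the distance-dependent chain interpolates between "always inherit from the same ancestor" and an ordinary HMM with transition matrix $\pi$.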

67 HMDP for Recombination and Inheritance (Contd.)
- $H_i$ is a mosaic of multiple ancestral chromosomes
- Model is a time-inhomogeneous infinite HMM
  - With $r \to \infty$, we get a stationary HMM
- Single-locus mutation model for emission
  $$p(h_t \mid a_t, \theta) = \theta^{I(h_t = a_t)} \left( \frac{1 - \theta}{|B| - 1} \right)^{I(h_t \neq a_t)}$$
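The single-locus emission model says a descendant allele equals the ancestral allele with probability $\theta$, and otherwise is one of the remaining $|B| - 1$ alleles, each equally likely. A minimal sketch, with hypothetical function names and an explicit allele set standing in for $B$:

```python
import random


def emission_prob(h, a, theta, n_alleles):
    """p(h | a, theta): theta if h matches a, else (1 - theta) / (|B| - 1)."""
    return theta if h == a else (1.0 - theta) / (n_alleles - 1)


def emit(a, theta, alleles, rng):
    """Sample a descendant allele given ancestral allele a."""
    if rng.random() < theta:
        return a
    return rng.choice([b for b in alleles if b != a])


alleles = "ACGT"
total = sum(emission_prob(h, "A", 0.9, len(alleles)) for h in alleles)
rng = random.Random(0)
copies = [emit("A", 1.0, alleles, rng) for _ in range(50)]  # theta = 1: no mutation
```

The probabilities over all alleles sum to one for any $\theta$ in $[0, 1]$, and setting $\theta = 1$ reproduces exact inheritance.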

68 Haplotype Recombination and Inheritance

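69 HMDP for Recombination and Inheritance (Contd.)
- Conditional probability of haplotype list $h$
  $$p(h \mid c, a) = \prod_k \int_{\theta_k} \prod_{i,t \,:\, c_{i,t} = k} p(h_{i,t} \mid a_{k,t}, \theta_k)\, \mathrm{Beta}(\theta_k \mid \alpha_h, \beta_h)\, d\theta_k$$
  $$= \prod_k \frac{\Gamma(\alpha_h + \beta_h)}{\Gamma(\alpha_h)\Gamma(\beta_h)} \cdot \frac{\Gamma(\alpha_h + l_k)\,\Gamma(\beta_h + l'_k)}{\Gamma(\alpha_h + \beta_h + l_k + l'_k)} \left( \frac{1}{|B| - 1} \right)^{l'_k}$$
  where
  $$l_k = \sum_{i,t} I(h_{i,t} = a_{k,t})\, I(c_{i,t} = k), \qquad l'_k = \sum_{i,t} I(h_{i,t} \neq a_{k,t})\, I(c_{i,t} = k)$$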
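Because the Beta prior is conjugate to the emission model, the integral over $\theta_k$ reduces to a ratio of Gamma functions that depends only on the match count $l_k$ and mismatch count $l'_k$. The sketch below evaluates the closed form in log space with `math.lgamma`; the function name and the data layout (lists of strings for haplotypes and ancestors, lists of integer indices for $c$) are assumptions for illustration.

```python
from math import lgamma, log


def log_marginal_likelihood(h, c, a, alpha_h, beta_h, n_alleles):
    """log p(h | c, a) with each theta_k integrated out under Beta(alpha_h, beta_h).

    h[i][t] is the allele of haplotype i at locus t, c[i][t] the index of the
    ancestor it was inherited from, and a[k][t] the allele of ancestor k.
    """
    n_anc = len(a)
    match = [0] * n_anc      # l_k
    mismatch = [0] * n_anc   # l'_k
    for hap, anc_idx in zip(h, c):
        for t, (allele, k) in enumerate(zip(hap, anc_idx)):
            if allele == a[k][t]:
                match[k] += 1
            else:
                mismatch[k] += 1
    total = 0.0
    for k in range(n_anc):
        total += (
            lgamma(alpha_h + beta_h) - lgamma(alpha_h) - lgamma(beta_h)
            + lgamma(alpha_h + match[k]) + lgamma(beta_h + mismatch[k])
            - lgamma(alpha_h + beta_h + match[k] + mismatch[k])
            - mismatch[k] * log(n_alleles - 1)
        )
    return total


# One locus, one ancestor, uniform prior on theta (alpha_h = beta_h = 1).
ll_match = log_marginal_likelihood(["A"], [[0]], ["A"], 1.0, 1.0, 4)
ll_mismatch = log_marginal_likelihood(["C"], [[0]], ["A"], 1.0, 1.0, 4)
```

With a uniform prior, a single matching allele has marginal probability $\int_0^1 \theta\, d\theta = 1/2$, and a single mismatch among four alleles has probability $\tfrac{1}{2} \cdot \tfrac{1}{3} = 1/6$, which gives a quick sanity check on the closed form.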

75 Inference
- Gibbs sampler proceeds in two steps
  - Sample inheritance $\{C_{i,t}\}$ given $h$ and $a$
  - Sample ancestors $a = \{a_1, \ldots, a_K\}$ given $h$, $C$
- Improve mixing for sampling inheritance
  - By Bayes rule
    $$p(c_{t+1:t+\delta} \mid c, h, a) \propto \prod_{j=t}^{t+\delta} p(c_{j+1} \mid c_j, m, n) \prod_{j=t+1}^{t+\delta} p(h_j \mid a_{c_j, j}, l_{c_j})$$
  - Assume the probability of having two recombinations is small
    $$p(c_{t+1:t+\delta} \mid c, h, a) \propto p(c_t \mid c_{t-1}, m, n)\, p(c_{t+\delta+1} \mid c_{t+\delta} = c_t, m, n) \prod_{j = \ldots} \cdots$$
    [the remainder of this product is truncated in the source]

78 Inference (Contd.)
- Assuming $d$, $r$ to be small, $\lambda = 1 - \exp(-dr) \approx dr$
  $$p(c_t = k' \mid c_{t-1} = k, m, n, r, d) = \begin{cases} \lambda \pi_{k,k'} + (1 - \lambda)\,\delta(k, k') & \text{for } k' \in \{1, \ldots, K\} \\ \lambda \pi_{k,K+1} & \text{for } k' = K + 1 \end{cases}$$
- Terms can be replaced in the original equation to get the sampler
- Posterior distribution for ancestors
  $$p(a_{k,t} \mid c, h) \propto \frac{\Gamma(\alpha_h + \beta_h)}{\Gamma(\alpha_h)\Gamma(\beta_h)} \cdot \frac{\Gamma(\alpha_h + l_{k,t})\,\Gamma(\beta_h + l'_{k,t})}{\Gamma(\alpha_h + \beta_h + l_{k,t} + l'_{k,t})} \left( \frac{1}{|B| - 1} \right)^{l'_{k,t}}$$
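The ancestor update can be sketched as follows: for each candidate allele at locus $t$ of ancestor $k$, count how many of the descendant alleles assigned to that ancestor at that locus match ($l_{k,t}$) or differ ($l'_{k,t}$), evaluate the Gamma-function weight, and normalise over candidates. This is an illustrative reading of the slide's formula, not the authors' code; the function name, the interpretation of the counts as per-locus, and the example hyperparameters are assumptions. The factor $\Gamma(\alpha_h + \beta_h) / (\Gamma(\alpha_h)\Gamma(\beta_h))$ is the same for every candidate and is dropped before normalising.

```python
from math import exp, lgamma, log


def ancestor_allele_posterior(observed, alleles, alpha_h, beta_h):
    """Posterior over the ancestral allele a_{k,t}.

    `observed` holds the alleles h_{i,t} of all haplotypes with c_{i,t} = k.
    Requires at least two candidate alleles.
    """
    log_w = {}
    for b in alleles:
        n_match = sum(x == b for x in observed)        # l_{k,t}
        n_mismatch = len(observed) - n_match           # l'_{k,t}
        log_w[b] = (
            lgamma(alpha_h + n_match) + lgamma(beta_h + n_mismatch)
            - lgamma(alpha_h + beta_h + n_match + n_mismatch)
            - n_mismatch * log(len(alleles) - 1)
        )
    shift = max(log_w.values())                        # for numerical stability
    weights = {b: exp(v - shift) for b, v in log_w.items()}
    norm = sum(weights.values())
    return {b: w / norm for b, w in weights.items()}


# Three descendants carry A, one carries G; the prior favours faithful copying.
posterior = ancestor_allele_posterior("AAAG", "AG", alpha_h=10.0, beta_h=1.0)
```

A Gibbs step would then sample the new ancestral allele from this dictionary of probabilities. Note that with a symmetric prior ($\alpha_h = \beta_h$) and two alleles the weights for the majority and minority allele coincide, so a prior that favours matches is what makes the majority allele the likelier ancestor in this sketch.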

79 Single Population Data
- Haplotype block boundaries: HMDP (black solid), HMM (red dotted), MDL (blue dashed)

80 Two Population Data

81 Hierarchical DPM for Haplotype Inference

82 Hierarchical DPM for Haplotype Inference (Contd.)

88 Experiments: HapMap Data
- SNP genotypes from four populations
  - CEPH: Utah residents with northern/western European ancestry, 60
  - YRI: Yoruba in Ibadan, Nigeria, 60
  - CHB: Han Chinese in Beijing, 45
  - JPT: Japanese in Tokyo, 44
- Experiments on short (on the order of 10 SNPs) and long SNP sequences

89 Short SNP Sequences

90 Long SNP Sequences

91 Mutation Rates and Diversity

92 Mutation Rates and Diversity (Contd.)


More information

Bayesian Nonparametric Models

Bayesian Nonparametric Models Bayesian Nonparametric Models David M. Blei Columbia University December 15, 2015 Introduction We have been looking at models that posit latent structure in high dimensional data. We use the posterior

More information

An Integrated Approach for the Assessment of Chromosomal Abnormalities

An Integrated Approach for the Assessment of Chromosomal Abnormalities An Integrated Approach for the Assessment of Chromosomal Abnormalities Department of Biostatistics Johns Hopkins Bloomberg School of Public Health June 26, 2007 Karyotypes Karyotypes General Cytogenetics

More information

CS1820 Notes. hgupta1, kjline, smechery. April 3-April 5. output: plausible Ancestral Recombination Graph (ARG)

CS1820 Notes. hgupta1, kjline, smechery. April 3-April 5. output: plausible Ancestral Recombination Graph (ARG) CS1820 Notes hgupta1, kjline, smechery April 3-April 5 April 3 Notes 1 Minichiello-Durbin Algorithm input: set of sequences output: plausible Ancestral Recombination Graph (ARG) note: the optimal ARG is

More information

Statistical Methods for studying Genetic Variation in Populations

Statistical Methods for studying Genetic Variation in Populations Statistical Methods for studying Genetic Variation in Populations Suyash Shringarpure August 2012 CMU-ML-12-105 Statistical Methods for studying Genetic Variation in Populations Suyash Shringarpure CMU-ML-12-105

More information
