QTL model selection: key players

Size: px
Start display at page:

Download "QTL model selection: key players"

Transcription

1 Bayesian Interval Mapping. Bayesian strategy -9. Markov chain sampling 0-7. sampling genetic architectures criteria for model selection 6-44 QTL : Bayes Seattle SISG: Yandell 008 QTL model selection: key players observed measurements y phenotypic trait m markers & linkage map i individual index,,n missing data missing marker data missing QT genotypes alleles QQ, Q, or at locus unknown uantities λ QT locus or loci µ phenotype model parameters γ QTL model/genetic architecture pr m,λ,γ genotype model grounded by linkage map, experimental cross recombination yields multinomial for given m pry,µ,γ phenotype model distribution shape assumed normal here unknown parameters µ could be non-parametric observed mx Yy Q unknown λ θµ γ after Sen Churchill 00 QTL : Bayes Seattle SISG: Yandell 008

2 . Bayesian strategy for QTL study augment data y,m with missing genotypes study unknowns µ,λ,γ given augmented data y,m, find better genetic architectures γ find most likely genomic regions QTL λ estimate phenotype parameters genotype means µ sample from posterior in some clever way multiple imputation Sen Churchill 00 Markov chain Monte Carlo MCMC Satagopan et al. 996; Yi et al. 005, 007 posterior likelihood* prior constant phenotype likelihood*[prior for, µ, λ, γ ] posterior for, µ, λ, γ constant pr y, µ, γ *[pr m, λ, γ pr µ γ pr λ m, γ pr γ ] pr, µ, λ, γ y, m pr y m QTL : Bayes Seattle SISG: Yandell 008 Bayes posterior for normal data n large n small prior prior mean actual mean y phenotype values n large n small prior prior mean actual mean y phenotype values small prior variance large prior variance QTL : Bayes Seattle SISG: Yandell 008 4

3 Bayes posterior for normal data model y i µ e i environment e ~ N 0,, known likelihood y~n µ, prior µ ~N µ 0, κ, κ known posterior: mean tends to sample mean single individual µ ~N µ 0 b y µ 0, b sample of n individuals µ ~ N with y bn y bn µ 0, bn / n sum y / n { i,..., n} i shrinkage factor shrinks to κn bn κn QTL : Bayes Seattle SISG: Yandell what values are the genotypic means? phenotype model pry,µ data means prior mean data mean n large n small prior posterior means Q QQ y phenotype values QTL : Bayes Seattle SISG: Yandell 008 6

4 QTL : Bayes Seattle SISG: Yandell posterior centered on sample genotypic mean but shrunken slightly toward overall mean phenotype mean: genotypic prior: posterior: shrinkage: Bayes posterior QTL means / sum } count{ / } { i i n n b n y y n n b y V y b y b y E V y E y V y E i κ κ µ µ κ µ µ µ QTL : Bayes Seattle SISG: Yandell 008 8, ; 0 Y E µ partition genotypic effects on phenotype phenotype depends on genotype genotypic value partitioned into main effects of single QTL epistasis interaction between pairs of QTL

5 QTL : Bayes Seattle SISG: Yandell consider same QTL epistasis centering variance genotypic variance heritability 0 0 h h h h V s V κ κ partitition genotypic variance QTL : Bayes Seattle SISG: Yandell / shrinkage sum ˆ variance sum ] ˆ sum [sum ˆ LS estimate, ˆ, ˆ ~ i i i i i ij j i C b C w V y w C N C b b N y κ κ posterior mean LS estimate

6 pr m,λ recombination model pr m,λ prgeno map, locus prgeno flanking markers, locus? m m m m m 4 5 m6 markers λ distance along chromosome QTL : Bayes Seattle SISG: Yandell 008 QTL : Bayes Seattle SISG: Yandell 008

7 what are likely QTL genotypes? how does phenotype y improve guess? D4Mit4 D4Mit4 bp what are probabilities for genotype between markers? recombinants AA:AB 90 all : if ignore y and if we use y? AA AA AB AA AA AB AB AB Genotype QTL : Bayes Seattle SISG: Yandell 008 posterior on QTL genotypes full conditional of given data, parameters proportional to prior pr m, λ weight toward that agrees with flanking markers proportional to likelihood pry, µ weight toward with similar phenotype values posterior recombination model balances these two this is the E-step of EM computations pr y, µ * pr m, λ pr y, m, µ, λ pr y m, µ, λ QTL : Bayes Seattle SISG: Yandell 008 4

8 Where are the loci λ on the genome? prior over genome for QTL positions flat prior no prior idea of loci or use prior studies to give more weight to some regions posterior depends on QTL genotypes prλ m, prλ pr m,λ / constant constant determined by averaging over all possible genotypes over all possible loci λ on entire map no easy way to write down posterior QTL : Bayes Seattle SISG: Yandell what is the genetic architecture γ? which positions correspond to QTLs? priors on loci previous slide which QTL have main effects? priors for presence/absence of main effects same prior for all QTL can put prior on each d.f. for BC, for F which pairs of QTL have epistatic interactions? prior for presence/absence of epistatic pairs depends on whether 0,, QTL have main effects epistatic effects less probable than main effects QTL : Bayes Seattle SISG: Yandell 008 6

9 γ genetic architecture: loci: main QTL epistatic pairs effects: add, dom aa, ad, dd QTL : Bayes Seattle SISG: Yandell Bayesian priors & posteriors augmenting with missing genotypes prior is recombination model posterior is formally E step of EM algorithm sampling phenotype model parameters µ prior is flat normal at grand mean no information posterior shrinks genotypic means toward grand mean details for unexplained variance omitted here sampling QTL loci λ prior is flat across genome all loci eually likely sampling QTL genetic architecture model γ number of QTL prior is Poisson with mean from previous IM study genetic architecture of main effects and epistatic interactions priors on epistasis depend on presence/absence of main effects QTL : Bayes Seattle SISG: Yandell 008 8

10 . Markov chain sampling construct Markov chain around posterior want posterior as stable distribution of Markov chain in practice, the chain tends toward stable distribution initial values may have low posterior probability burn-in period to get chain mixing well sample QTL model components from full conditionals sample locus λ given,γ using Metropolis-Hastings step sample genotypes given λ,µ,y,γ using Gibbs sampler sample effects µ given,y,γ using Gibbs sampler sample QTL model γ given λ,µ,y, using Gibbs or M-H λ,, µ, γ ~ pr λ,, µ, γ y, m λ,, µ, γ λ,, µ, γ L λ,, µ, γ N QTL : Bayes Seattle SISG: Yandell MCMC sampling of unknowns,µ,λ for given genetic architecture γ Gibbs sampler genotypes effects µ not loci λ ~ pr y, m, µ, λ i pr y, µ pr µ µ ~ pr y pr m, λpr λ m λ ~ pr m i Metropolis-Hastings sampler extension of Gibbs sampler does not reuire normalization pr m sum λ pr m, λ prλ QTL : Bayes Seattle SISG: Yandell 008 0

11 Gibbs sampler for two genotypic means want to study two correlated effects could sample directly from their bivariate distribution assume correlation ρ is known instead use Gibbs sampler: sample each effect from its full conditional given the other pick order of sampling at random repeat many times µ 0 ~ N, µ 0 ρ µ ~ N µ ~ N ρµ, ρ ρµ, ρ QTL : Bayes Seattle SISG: Yandell 008 ρ Gibbs sampler samples: ρ 0.6 N 50 samples N 00 samples Gibbs: mean Markov chain index Gibbs: mean Gibbs: mean Gibbs: mean Gibbs: mean Gibbs: mean Markov chain index Gibbs: mean Gibbs: mean Gibbs: mean Gibbs: mean Markov chain index Gibbs: mean Markov chain index Gibbs: mean QTL : Bayes Seattle SISG: Yandell 008

12 full conditional for locus cannot easily sample from locus full conditional prλ y,m,µ, pr λ m, pr m, λ prλ / constant constant is very difficult to compute explicitly must average over all possible loci λ over genome must do this for every possible genotype Gibbs sampler will not work in general but can use method based on ratios of probabilities Metropolis-Hastings is extension of Gibbs sampler QTL : Bayes Seattle SISG: Yandell 008 Metropolis-Hastings idea want to study distribution fλ take Monte Carlo samples unless too complicated take samples using ratios of f Metropolis-Hastings samples: propose new value λ * near? current value λ from some distribution g accept new value with prob a Gibbs sampler: a always a * * f λ g λ λ min, * f λ g λ λ QTL : Bayes Seattle SISG: Yandell fλ gλ λ *

13 mcmc seuence Metropolis-Hastings for locus λ prθ Y θ θ λ λ added twist: occasionally propose from entire genome QTL : Bayes Seattle SISG: Yandell Metropolis-Hastings samples mcmc seuence N 00 samples N 000 samples narrow g wide g narrow g wide g mcmc seuence mcmc seuence mcmc seuence histogram prθ Y θ λ θλ θ λ θ λ histogram prθ Y histogram prθ Y histogram prθ Y θ θ θ θ QTL : Bayes Seattle SISG: Yandell 008 6

14 . sampling genetic architectures search across genetic architectures A of various sizes allow change in number of QTL allow change in types of epistatic interactions methods for search reversible jump MCMC Gibbs sampler with loci indicators complexity of epistasis Fisher-Cockerham effects model general multi-qtl interaction & limits of inference QTL : Bayes Seattle SISG: Yandell reversible jump MCMC consider known genotypes at known loci λ models with or QTL M-H step between -QTL and -QTL models model changes dimension via careful bookkeeping consider mixture over QTL models H γ QTL : Y 0 e γ QTL : Y 0 e QTL : Bayes Seattle SISG: Yandell 008 8

15 geometry of reversible jump Move Between Models Reversible Jump Seuence b m m c 0.7 b b b QTL : Bayes Seattle SISG: Yandell geometry allowing and λ to change a short seuence first 000 with m< b b b b QTL : Bayes Seattle SISG: Yandell 008 0

16 collinear QTL correlated effects 4-week 8-week additive effect cor -0.8 additive effect cor additive effect additive effect linked QTL collinear genotypes correlated estimates of effects negative if in coupling phase sum of linked effects usually fairly constant QTL : Bayes Seattle SISG: Yandell 008 sampling across QTL models γ 0 λ λ m λ λ m L action steps: draw one of three choices update QTL model γ with probability -bγ-dγ update current model using full conditionals sample QTL loci, effects, and genotypes add a locus with probability bγ propose a new locus along genome innovate new genotypes at locus and phenotype effect decide whether to accept the birth of new locus drop a locus with probability dγ propose dropping one of existing loci decide whether to accept the death of locus QTL : Bayes Seattle SISG: Yandell 008

17 Gibbs sampler with loci indicators consider only QTL at pseudomarkers every - cm modest approximation with little bias use loci indicators in each pseudomarker γ if QTL present γ 0 if no QTL present Gibbs sampler on loci indicators γ relatively easy to incorporate epistasis Yi, Yandell, Churchill, Allison, Eisen, Pomp 005 Genetics see earlier work of Nengjun Yi and Ina Hoeschele µ µ γ γ, γ k 0, QTL : Bayes Seattle SISG: Yandell 008 Bayesian shrinkage estimation soft loci indicators strength of evidence for λ j depends on γ 0 γ grey scale shrink most γs to zero Wang et al. 005 Genetics Shizhong Xu group at U CA Riverside µ 0 γ γ, 0 γ k QTL : Bayes Seattle SISG: Yandell 008 4

18 4. criteria for model selection balance fit against complexity classical information criteria penalize likelihood L by model size γ IC log Lγ y penaltyγ maximize over unknowns Bayes factors marginal posteriors pry γ average over unknowns QTL : Bayes Seattle SISG: Yandell classical information criteria start with likelihood Lγ y, m measures fit of architecture γ to phenotype y given marker data m genetic architecture γ depends on parameters have to estimate loci µ and effects λ complexity related to number of parameters γ size of genetic architecture BC: γ n.tl n.tln.tl F: γ n.tl 4n.tln.tl QTL : Bayes Seattle SISG: Yandell 008 6

19 classical information criteria construct information criteria balance fit to complexity Akaike AIC logl γ Bayes/Schwartz BIC logl γ logn Broman BIC δ logl δ γ logn general form: IC logl γ Dn compare models hypothesis testing: designed for one comparison log[lrγ, γ ] Ly m, γ Ly m, γ model selection: penalize complexity ICγ, γ log[lrγ, γ ] γ γ Dn QTL : Bayes Seattle SISG: Yandell information criteria vs. model size WinQTL.0 SCD data on F AAIC BIC BIC dbicδ models,,,4 QTL 59 epistasis : AD information criteria d d d A A A A A A A A model parameters p epistasis QTL : Bayes Seattle SISG: Yandell d d d d d

20 Bayes factors ratio of model likelihoods ratio of posterior to prior odds for architectures averaged over unknowns roughly euivalent to BIC BIC maximizes over unknowns BF averages over unknowns pr γ y, m / pr γ y, m pr y m, γ B pr γ / pr γ pr y m, γ log B log LR γ γ log n QTL : Bayes Seattle SISG: Yandell scan of marginal Bayes factor & effect QTL : Bayes Seattle SISG: Yandell

21 issues in computing Bayes factors BF insensitive to shape of prior on γ geometric, Poisson, uniform precision improves when prior mimics posterior BF sensitivity to prior variance on effects θ prior variance should reflect data variability resolved by using hyper-priors automatic algorithm; no need for user tuning easy to compute Bayes factors from samples sample posterior using MCMC posterior prγ y, m is marginal histogram QTL : Bayes Seattle SISG: Yandell Bayes factors & genetic architecture γ γ number of QTL prior prγ chosen by user posterior prγ y,m sampled marginal histogram shape affected by prior pra pr γ y, m / pr γ BF γ, γ pr γ y, m / pr γ pattern of QTL across genome gene action and epistasis prior probability e e exponential p Poisson e u uniform p p u u p eu u pu u u e e p p e e p e p e p e u u pu p e u m number of QTL QTL : Bayes Seattle SISG: Yandell 008 4

22 BF sensitivity to fixed prior for effects Bayes factors B45 B4 B B 4 / m, ~ N 0, h j hyper-prior heritability h G G, h fixed QTL : Bayes Seattle SISG: Yandell total BF insensitivity to random effects prior hyper-prior density *Betaa,b insensitivity to hyper-prior density , ,9.5,9,0,, Bayes factors B4 B B hyper-parameter heritability h Eh / m, ~ N 0, h j G G total, h ~ Beta a, b QTL : Bayes Seattle SISG: Yandell

QTL model selection: key players

QTL model selection: key players QTL Model Selection. Bayesian strategy. Markov chain sampling 3. sampling genetic architectures 4. criteria for model selection Model Selection Seattle SISG: Yandell 0 QTL model selection: key players

More information

1. what is the goal of QTL study?

1. what is the goal of QTL study? NSF UAB Course 008 Bayesian Interval Mapping Brian S. Yandell, UW-Madison www.stat.wisc.edu/~yandell/statgen overview: multiple QTL approaches Bayesian QTL mapping & model selection data examples in detail

More information

Model Selection for Multiple QTL

Model Selection for Multiple QTL Model Selection for Multiple TL 1. reality of multiple TL 3-8. selecting a class of TL models 9-15 3. comparing TL models 16-4 TL model selection criteria issues of detecting epistasis 4. simulations and

More information

Multiple QTL mapping

Multiple QTL mapping Multiple QTL mapping Karl W Broman Department of Biostatistics Johns Hopkins University www.biostat.jhsph.edu/~kbroman [ Teaching Miscellaneous lectures] 1 Why? Reduce residual variation = increased power

More information

Bayesian Model Selection for Multiple QTL. Brian S. Yandell

Bayesian Model Selection for Multiple QTL. Brian S. Yandell Bayesian Model Selection for Multiple QTL Brian S. Yandell University of Wisconsin-Madison www.stat.wisc.edu/~yandell/statgen Jackson Laboratory, October 005 October 005 Jax Workshop Brian S. Yandell 1

More information

what is Bayes theorem? posterior = likelihood * prior / C

what is Bayes theorem? posterior = likelihood * prior / C who was Bayes? Reverend Thomas Bayes (70-76) part-time mathematician buried in Bunhill Cemetary, Moongate, London famous paper in 763 Phil Trans Roy Soc London was Bayes the first with this idea? (Laplace?)

More information

Seattle Summer Institute : Systems Genetics for Experimental Crosses

Seattle Summer Institute : Systems Genetics for Experimental Crosses Seattle Summer Institute 2012 15: Systems Genetics for Experimental Crosses Brian S. Yandell, UW-Madison Elias Chaibub Neto, Sage Bionetworks www.stat.wisc.edu/~yandell/statgen/sisg Real knowledge is to

More information

contact information & resources

contact information & resources Seattle Summer Institute 006 Advanced QTL Brian S. Yandell University of Wisconsin-Madison Bayesian QTL mapping & model selection data examples in detail multiple phenotypes & microarrays software demo

More information

Mapping multiple QTL in experimental crosses

Mapping multiple QTL in experimental crosses Human vs mouse Mapping multiple QTL in experimental crosses Karl W Broman Department of Biostatistics & Medical Informatics University of Wisconsin Madison www.biostat.wisc.edu/~kbroman www.daviddeen.com

More information

QTL Model Search. Brian S. Yandell, UW-Madison January 2017

QTL Model Search. Brian S. Yandell, UW-Madison January 2017 QTL Model Search Brian S. Yandell, UW-Madison January 2017 evolution of QTL models original ideas focused on rare & costly markers models & methods refined as technology advanced single marker regression

More information

Mapping multiple QTL in experimental crosses

Mapping multiple QTL in experimental crosses Mapping multiple QTL in experimental crosses Karl W Broman Department of Biostatistics and Medical Informatics University of Wisconsin Madison www.biostat.wisc.edu/~kbroman [ Teaching Miscellaneous lectures]

More information

Overview. Background

Overview. Background Overview Implementation of robust methods for locating quantitative trait loci in R Introduction to QTL mapping Andreas Baierl and Andreas Futschik Institute of Statistics and Decision Support Systems

More information

QTL Mapping I: Overview and using Inbred Lines

QTL Mapping I: Overview and using Inbred Lines QTL Mapping I: Overview and using Inbred Lines Key idea: Looking for marker-trait associations in collections of relatives If (say) the mean trait value for marker genotype MM is statisically different

More information

Tutorial Session 2. MCMC for the analysis of genetic data on pedigrees:

Tutorial Session 2. MCMC for the analysis of genetic data on pedigrees: MCMC for the analysis of genetic data on pedigrees: Tutorial Session 2 Elizabeth Thompson University of Washington Genetic mapping and linkage lod scores Monte Carlo likelihood and likelihood ratio estimation

More information

Inferring Genetic Architecture of Complex Biological Processes

Inferring Genetic Architecture of Complex Biological Processes Inferring Genetic Architecture of Complex Biological Processes BioPharmaceutical Technology Center Institute (BTCI) Brian S. Yandell University of Wisconsin-Madison http://www.stat.wisc.edu/~yandell/statgen

More information

Contents. Part I: Fundamentals of Bayesian Inference 1

Contents. Part I: Fundamentals of Bayesian Inference 1 Contents Preface xiii Part I: Fundamentals of Bayesian Inference 1 1 Probability and inference 3 1.1 The three steps of Bayesian data analysis 3 1.2 General notation for statistical inference 4 1.3 Bayesian

More information

Causal Graphical Models in Systems Genetics

Causal Graphical Models in Systems Genetics 1 Causal Graphical Models in Systems Genetics 2013 Network Analysis Short Course - UCLA Human Genetics Elias Chaibub Neto and Brian S Yandell July 17, 2013 Motivation and basic concepts 2 3 Motivation

More information

Gene mapping in model organisms

Gene mapping in model organisms Gene mapping in model organisms Karl W Broman Department of Biostatistics Johns Hopkins University http://www.biostat.jhsph.edu/~kbroman Goal Identify genes that contribute to common human diseases. 2

More information

Bayesian Methods for Machine Learning

Bayesian Methods for Machine Learning Bayesian Methods for Machine Learning CS 584: Big Data Analytics Material adapted from Radford Neal s tutorial (http://ftp.cs.utoronto.ca/pub/radford/bayes-tut.pdf), Zoubin Ghahramni (http://hunch.net/~coms-4771/zoubin_ghahramani_bayesian_learning.pdf),

More information

R/qtl workshop. (part 2) Karl Broman. Biostatistics and Medical Informatics University of Wisconsin Madison. kbroman.org

R/qtl workshop. (part 2) Karl Broman. Biostatistics and Medical Informatics University of Wisconsin Madison. kbroman.org R/qtl workshop (part 2) Karl Broman Biostatistics and Medical Informatics University of Wisconsin Madison kbroman.org github.com/kbroman @kwbroman Example Sugiyama et al. Genomics 71:70-77, 2001 250 male

More information

Module 4: Bayesian Methods Lecture 9 A: Default prior selection. Outline

Module 4: Bayesian Methods Lecture 9 A: Default prior selection. Outline Module 4: Bayesian Methods Lecture 9 A: Default prior selection Peter Ho Departments of Statistics and Biostatistics University of Washington Outline Je reys prior Unit information priors Empirical Bayes

More information

BAYESIAN MAPPING OF MULTIPLE QUANTITATIVE TRAIT LOCI

BAYESIAN MAPPING OF MULTIPLE QUANTITATIVE TRAIT LOCI BAYESIAN MAPPING OF MULTIPLE QUANTITATIVE TRAIT LOCI By DÁMARIS SANTANA MORANT A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 7 Approximate

More information

BTRY 4830/6830: Quantitative Genomics and Genetics

BTRY 4830/6830: Quantitative Genomics and Genetics BTRY 4830/6830: Quantitative Genomics and Genetics Lecture 23: Alternative tests in GWAS / (Brief) Introduction to Bayesian Inference Jason Mezey jgm45@cornell.edu Nov. 13, 2014 (Th) 8:40-9:55 Announcements

More information

Principles of Bayesian Inference

Principles of Bayesian Inference Principles of Bayesian Inference Sudipto Banerjee University of Minnesota July 20th, 2008 1 Bayesian Principles Classical statistics: model parameters are fixed and unknown. A Bayesian thinks of parameters

More information

A Bayesian Approach to Phylogenetics

A Bayesian Approach to Phylogenetics A Bayesian Approach to Phylogenetics Niklas Wahlberg Based largely on slides by Paul Lewis (www.eeb.uconn.edu) An Introduction to Bayesian Phylogenetics Bayesian inference in general Markov chain Monte

More information

Detection of multiple QTL with epistatic effects under a mixed inheritance model in an outbred population

Detection of multiple QTL with epistatic effects under a mixed inheritance model in an outbred population Genet. Sel. Evol. 36 (2004) 415 433 415 c INRA, EDP Sciences, 2004 DOI: 10.1051/gse:2004009 Original article Detection of multiple QTL with epistatic effects under a mixed inheritance model in an outbred

More information

Lecture 3. G. Cowan. Lecture 3 page 1. Lectures on Statistical Data Analysis

Lecture 3. G. Cowan. Lecture 3 page 1. Lectures on Statistical Data Analysis Lecture 3 1 Probability (90 min.) Definition, Bayes theorem, probability densities and their properties, catalogue of pdfs, Monte Carlo 2 Statistical tests (90 min.) general concepts, test statistics,

More information

Statistical issues in QTL mapping in mice

Statistical issues in QTL mapping in mice Statistical issues in QTL mapping in mice Karl W Broman Department of Biostatistics Johns Hopkins University http://www.biostat.jhsph.edu/~kbroman Outline Overview of QTL mapping The X chromosome Mapping

More information

Bayesian linear regression

Bayesian linear regression Bayesian linear regression Linear regression is the basis of most statistical modeling. The model is Y i = X T i β + ε i, where Y i is the continuous response X i = (X i1,..., X ip ) T is the corresponding

More information

Introduction to QTL mapping in model organisms

Introduction to QTL mapping in model organisms Human vs mouse Introduction to QTL mapping in model organisms Karl W Broman Department of Biostatistics Johns Hopkins University www.biostat.jhsph.edu/~kbroman [ Teaching Miscellaneous lectures] www.daviddeen.com

More information

Bayesian construction of perceptrons to predict phenotypes from 584K SNP data.

Bayesian construction of perceptrons to predict phenotypes from 584K SNP data. Bayesian construction of perceptrons to predict phenotypes from 584K SNP data. Luc Janss, Bert Kappen Radboud University Nijmegen Medical Centre Donders Institute for Neuroscience Introduction Genetic

More information

Bayesian Inference and MCMC

Bayesian Inference and MCMC Bayesian Inference and MCMC Aryan Arbabi Partly based on MCMC slides from CSC412 Fall 2018 1 / 18 Bayesian Inference - Motivation Consider we have a data set D = {x 1,..., x n }. E.g each x i can be the

More information

Multilevel Statistical Models: 3 rd edition, 2003 Contents

Multilevel Statistical Models: 3 rd edition, 2003 Contents Multilevel Statistical Models: 3 rd edition, 2003 Contents Preface Acknowledgements Notation Two and three level models. A general classification notation and diagram Glossary Chapter 1 An introduction

More information

Stat 516, Homework 1

Stat 516, Homework 1 Stat 516, Homework 1 Due date: October 7 1. Consider an urn with n distinct balls numbered 1,..., n. We sample balls from the urn with replacement. Let N be the number of draws until we encounter a ball

More information

Introduction to QTL mapping in model organisms

Introduction to QTL mapping in model organisms Introduction to QTL mapping in model organisms Karl W Broman Department of Biostatistics and Medical Informatics University of Wisconsin Madison www.biostat.wisc.edu/~kbroman [ Teaching Miscellaneous lectures]

More information

Introduction to QTL mapping in model organisms

Introduction to QTL mapping in model organisms Introduction to QTL mapping in model organisms Karl Broman Biostatistics and Medical Informatics University of Wisconsin Madison kbroman.org github.com/kbroman @kwbroman Backcross P 1 P 2 P 1 F 1 BC 4

More information

Lecture 8. QTL Mapping 1: Overview and Using Inbred Lines

Lecture 8. QTL Mapping 1: Overview and Using Inbred Lines Lecture 8 QTL Mapping 1: Overview and Using Inbred Lines Bruce Walsh. jbwalsh@u.arizona.edu. University of Arizona. Notes from a short course taught Jan-Feb 2012 at University of Uppsala While the machinery

More information

Lecture 2: Linear Models. Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011

Lecture 2: Linear Models. Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011 Lecture 2: Linear Models Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011 1 Quick Review of the Major Points The general linear model can be written as y = X! + e y = vector

More information

Doing Bayesian Integrals

Doing Bayesian Integrals ASTR509-13 Doing Bayesian Integrals The Reverend Thomas Bayes (c.1702 1761) Philosopher, theologian, mathematician Presbyterian (non-conformist) minister Tunbridge Wells, UK Elected FRS, perhaps due to

More information

Evolution of phenotypic traits

Evolution of phenotypic traits Quantitative genetics Evolution of phenotypic traits Very few phenotypic traits are controlled by one locus, as in our previous discussion of genetics and evolution Quantitative genetics considers characters

More information

CPSC 540: Machine Learning

CPSC 540: Machine Learning CPSC 540: Machine Learning MCMC and Non-Parametric Bayes Mark Schmidt University of British Columbia Winter 2016 Admin I went through project proposals: Some of you got a message on Piazza. No news is

More information

Monte Carlo Methods. Leon Gu CSD, CMU

Monte Carlo Methods. Leon Gu CSD, CMU Monte Carlo Methods Leon Gu CSD, CMU Approximate Inference EM: y-observed variables; x-hidden variables; θ-parameters; E-step: q(x) = p(x y, θ t 1 ) M-step: θ t = arg max E q(x) [log p(y, x θ)] θ Monte

More information

Bayesian Inference of Interactions and Associations

Bayesian Inference of Interactions and Associations Bayesian Inference of Interactions and Associations Jun Liu Department of Statistics Harvard University http://www.fas.harvard.edu/~junliu Based on collaborations with Yu Zhang, Jing Zhang, Yuan Yuan,

More information

Bayesian Phylogenetics

Bayesian Phylogenetics Bayesian Phylogenetics Paul O. Lewis Department of Ecology & Evolutionary Biology University of Connecticut Woods Hole Molecular Evolution Workshop, July 27, 2006 2006 Paul O. Lewis Bayesian Phylogenetics

More information

Affected Sibling Pairs. Biostatistics 666

Affected Sibling Pairs. Biostatistics 666 Affected Sibling airs Biostatistics 666 Today Discussion of linkage analysis using affected sibling pairs Our exploration will include several components we have seen before: A simple disease model IBD

More information

1.5.1 ESTIMATION OF HAPLOTYPE FREQUENCIES:

1.5.1 ESTIMATION OF HAPLOTYPE FREQUENCIES: .5. ESTIMATION OF HAPLOTYPE FREQUENCIES: Chapter - 8 For SNPs, alleles A j,b j at locus j there are 4 haplotypes: A A, A B, B A and B B frequencies q,q,q 3,q 4. Assume HWE at haplotype level. Only the

More information

Quantitative Genomics and Genetics BTRY 4830/6830; PBSB

Quantitative Genomics and Genetics BTRY 4830/6830; PBSB Quantitative Genomics and Genetics BTRY 4830/6830; PBSB.5201.01 Lecture 20: Epistasis and Alternative Tests in GWAS Jason Mezey jgm45@cornell.edu April 16, 2016 (Th) 8:40-9:55 None Announcements Summary

More information

Advanced Machine Learning

Advanced Machine Learning Advanced Machine Learning Nonparametric Bayesian Models --Learning/Reasoning in Open Possible Worlds Eric Xing Lecture 7, August 4, 2009 Reading: Eric Xing Eric Xing @ CMU, 2006-2009 Clustering Eric Xing

More information

Supplementary Materials for Molecular QTL Discovery Incorporating Genomic Annotations using Bayesian False Discovery Rate Control

Supplementary Materials for Molecular QTL Discovery Incorporating Genomic Annotations using Bayesian False Discovery Rate Control Supplementary Materials for Molecular QTL Discovery Incorporating Genomic Annotations using Bayesian False Discovery Rate Control Xiaoquan Wen Department of Biostatistics, University of Michigan A Model

More information

Markov Chain Monte Carlo methods

Markov Chain Monte Carlo methods Markov Chain Monte Carlo methods By Oleg Makhnin 1 Introduction a b c M = d e f g h i 0 f(x)dx 1.1 Motivation 1.1.1 Just here Supresses numbering 1.1.2 After this 1.2 Literature 2 Method 2.1 New math As

More information

Example: Ground Motion Attenuation

Example: Ground Motion Attenuation Example: Ground Motion Attenuation Problem: Predict the probability distribution for Peak Ground Acceleration (PGA), the level of ground shaking caused by an earthquake Earthquake records are used to update

More information

Use of hidden Markov models for QTL mapping

Use of hidden Markov models for QTL mapping Use of hidden Markov models for QTL mapping Karl W Broman Department of Biostatistics, Johns Hopkins University December 5, 2006 An important aspect of the QTL mapping problem is the treatment of missing

More information

Expression QTLs and Mapping of Complex Trait Loci. Paul Schliekelman Statistics Department University of Georgia

Expression QTLs and Mapping of Complex Trait Loci. Paul Schliekelman Statistics Department University of Georgia Expression QTLs and Mapping of Complex Trait Loci Paul Schliekelman Statistics Department University of Georgia Definitions: Genes, Loci and Alleles A gene codes for a protein. Proteins due everything.

More information

Bayesian model selection in graphs by using BDgraph package

Bayesian model selection in graphs by using BDgraph package Bayesian model selection in graphs by using BDgraph package A. Mohammadi and E. Wit March 26, 2013 MOTIVATION Flow cytometry data with 11 proteins from Sachs et al. (2005) RESULT FOR CELL SIGNALING DATA

More information

(5) Multi-parameter models - Gibbs sampling. ST440/540: Applied Bayesian Analysis

(5) Multi-parameter models - Gibbs sampling. ST440/540: Applied Bayesian Analysis Summarizing a posterior Given the data and prior the posterior is determined Summarizing the posterior gives parameter estimates, intervals, and hypothesis tests Most of these computations are integrals

More information

Statistical Genetics I: STAT/BIOST 550 Spring Quarter, 2014

Statistical Genetics I: STAT/BIOST 550 Spring Quarter, 2014 Overview - 1 Statistical Genetics I: STAT/BIOST 550 Spring Quarter, 2014 Elizabeth Thompson University of Washington Seattle, WA, USA MWF 8:30-9:20; THO 211 Web page: www.stat.washington.edu/ thompson/stat550/

More information

I Have the Power in QTL linkage: single and multilocus analysis

I Have the Power in QTL linkage: single and multilocus analysis I Have the Power in QTL linkage: single and multilocus analysis Benjamin Neale 1, Sir Shaun Purcell 2 & Pak Sham 13 1 SGDP, IoP, London, UK 2 Harvard School of Public Health, Cambridge, MA, USA 3 Department

More information

MCMC IN THE ANALYSIS OF GENETIC DATA ON PEDIGREES

MCMC IN THE ANALYSIS OF GENETIC DATA ON PEDIGREES MCMC IN THE ANALYSIS OF GENETIC DATA ON PEDIGREES Elizabeth A. Thompson Department of Statistics, University of Washington Box 354322, Seattle, WA 98195-4322, USA Email: thompson@stat.washington.edu This

More information

Bayesian inference & Markov chain Monte Carlo. Note 1: Many slides for this lecture were kindly provided by Paul Lewis and Mark Holder

Bayesian inference & Markov chain Monte Carlo. Note 1: Many slides for this lecture were kindly provided by Paul Lewis and Mark Holder Bayesian inference & Markov chain Monte Carlo Note 1: Many slides for this lecture were kindly provided by Paul Lewis and Mark Holder Note 2: Paul Lewis has written nice software for demonstrating Markov

More information

Markov Chain Monte Carlo

Markov Chain Monte Carlo Markov Chain Monte Carlo Recall: To compute the expectation E ( h(y ) ) we use the approximation E(h(Y )) 1 n n h(y ) t=1 with Y (1),..., Y (n) h(y). Thus our aim is to sample Y (1),..., Y (n) from f(y).

More information

Reminder of some Markov Chain properties:

Reminder of some Markov Chain properties: Reminder of some Markov Chain properties: 1. a transition from one state to another occurs probabilistically 2. only state that matters is where you currently are (i.e. given present, future is independent

More information

Bustamante et al., Supplementary Nature Manuscript # 1 out of 9 Information #

Bustamante et al., Supplementary Nature Manuscript # 1 out of 9 Information # Bustamante et al., Supplementary Nature Manuscript # 1 out of 9 Details of PRF Methodology In the Poisson Random Field PRF) model, it is assumed that non-synonymous mutations at a given gene are either

More information

. Also, in this case, p i = N1 ) T, (2) where. I γ C N(N 2 2 F + N1 2 Q)

. Also, in this case, p i = N1 ) T, (2) where. I γ C N(N 2 2 F + N1 2 Q) Supplementary information S7 Testing for association at imputed SPs puted SPs Score tests A Score Test needs calculations of the observed data score and information matrix only under the null hypothesis,

More information

MCMC: Markov Chain Monte Carlo

MCMC: Markov Chain Monte Carlo I529: Machine Learning in Bioinformatics (Spring 2013) MCMC: Markov Chain Monte Carlo Yuzhen Ye School of Informatics and Computing Indiana University, Bloomington Spring 2013 Contents Review of Markov

More information

Forward Problems and their Inverse Solutions

Forward Problems and their Inverse Solutions Forward Problems and their Inverse Solutions Sarah Zedler 1,2 1 King Abdullah University of Science and Technology 2 University of Texas at Austin February, 2013 Outline 1 Forward Problem Example Weather

More information

MH I. Metropolis-Hastings (MH) algorithm is the most popular method of getting dependent samples from a probability distribution

MH I. Metropolis-Hastings (MH) algorithm is the most popular method of getting dependent samples from a probability distribution MH I Metropolis-Hastings (MH) algorithm is the most popular method of getting dependent samples from a probability distribution a lot of Bayesian mehods rely on the use of MH algorithm and it s famous

More information

Bayesian QTL mapping using skewed Student-t distributions

Bayesian QTL mapping using skewed Student-t distributions Genet. Sel. Evol. 34 00) 1 1 1 INRA, EDP Sciences, 00 DOI: 10.1051/gse:001001 Original article Bayesian QTL mapping using skewed Student-t distributions Peter VON ROHR a,b, Ina HOESCHELE a, a Departments

More information

CSC 2541: Bayesian Methods for Machine Learning

CSC 2541: Bayesian Methods for Machine Learning CSC 2541: Bayesian Methods for Machine Learning Radford M. Neal, University of Toronto, 2011 Lecture 3 More Markov Chain Monte Carlo Methods The Metropolis algorithm isn t the only way to do MCMC. We ll

More information

THE data in the QTL mapping study are usually composed. A New Simple Method for Improving QTL Mapping Under Selective Genotyping INVESTIGATION

THE data in the QTL mapping study are usually composed. A New Simple Method for Improving QTL Mapping Under Selective Genotyping INVESTIGATION INVESTIGATION A New Simple Method for Improving QTL Mapping Under Selective Genotyping Hsin-I Lee,* Hsiang-An Ho,* and Chen-Hung Kao*,,1 *Institute of Statistical Science, Academia Sinica, Taipei 11529,

More information

Answers and expectations

Answers and expectations Answers and expectations For a function f(x) and distribution P(x), the expectation of f with respect to P is The expectation is the average of f, when x is drawn from the probability distribution P E

More information

GENOMIC SELECTION WORKSHOP: Hands on Practical Sessions (BL)

GENOMIC SELECTION WORKSHOP: Hands on Practical Sessions (BL) GENOMIC SELECTION WORKSHOP: Hands on Practical Sessions (BL) Paulino Pérez 1 José Crossa 2 1 ColPos-México 2 CIMMyT-México September, 2014. SLU,Sweden GENOMIC SELECTION WORKSHOP:Hands on Practical Sessions

More information

Fast Bayesian Methods for Genetic Mapping Applicable for High-Throughput Datasets

Fast Bayesian Methods for Genetic Mapping Applicable for High-Throughput Datasets Fast Bayesian Methods for Genetic Mapping Applicable for High-Throughput Datasets Yu-Ling Chang A dissertation submitted to the faculty of the University of North Carolina at Chapel Hill in partial fulfillment

More information

Lecture 8 Genomic Selection

Lecture 8 Genomic Selection Lecture 8 Genomic Selection Guilherme J. M. Rosa University of Wisconsin-Madison Mixed Models in Quantitative Genetics SISG, Seattle 18 0 Setember 018 OUTLINE Marker Assisted Selection Genomic Selection

More information

Bayesian Networks in Educational Assessment

Bayesian Networks in Educational Assessment Bayesian Networks in Educational Assessment Estimating Parameters with MCMC Bayesian Inference: Expanding Our Context Roy Levy Arizona State University Roy.Levy@asu.edu 2017 Roy Levy MCMC 1 MCMC 2 Posterior

More information

Bayesian Methods with Monte Carlo Markov Chains II

Bayesian Methods with Monte Carlo Markov Chains II Bayesian Methods with Monte Carlo Markov Chains II Henry Horng-Shing Lu Institute of Statistics National Chiao Tung University hslu@stat.nctu.edu.tw http://tigpbp.iis.sinica.edu.tw/courses.htm 1 Part 3

More information

Bayesian Inference for DSGE Models. Lawrence J. Christiano

Bayesian Inference for DSGE Models. Lawrence J. Christiano Bayesian Inference for DSGE Models Lawrence J. Christiano Outline State space-observer form. convenient for model estimation and many other things. Bayesian inference Bayes rule. Monte Carlo integation.

More information

Bayesian reversible-jump for epistasis analysis in genomic studies

Bayesian reversible-jump for epistasis analysis in genomic studies Balestre and de Souza BMC Genomics (2016) 17:1012 DOI 10.1186/s12864-016-3342-6 RESEARCH ARTICLE Bayesian reversible-jump for epistasis analysis in genomic studies Marcio Balestre 1* and Claudio Lopes

More information

Introduction to QTL mapping in model organisms

Introduction to QTL mapping in model organisms Introduction to QTL mapping in model organisms Karl W Broman Department of Biostatistics Johns Hopkins University kbroman@jhsph.edu www.biostat.jhsph.edu/ kbroman Outline Experiments and data Models ANOVA

More information

Normal distribution We have a random sample from N(m, υ). The sample mean is Ȳ and the corrected sum of squares is S yy. After some simplification,

Normal distribution We have a random sample from N(m, υ). The sample mean is Ȳ and the corrected sum of squares is S yy. After some simplification, Likelihood Let P (D H) be the probability an experiment produces data D, given hypothesis H. Usually H is regarded as fixed and D variable. Before the experiment, the data D are unknown, and the probability

More information

Theoretical and computational aspects of association tests: application in case-control genome-wide association studies.

Theoretical and computational aspects of association tests: application in case-control genome-wide association studies. Theoretical and computational aspects of association tests: application in case-control genome-wide association studies Mathieu Emily November 18, 2014 Caen mathieu.emily@agrocampus-ouest.fr - Agrocampus

More information

Lecture 6: Markov Chain Monte Carlo

Lecture 6: Markov Chain Monte Carlo Lecture 6: Markov Chain Monte Carlo D. Jason Koskinen koskinen@nbi.ku.dk Photo by Howard Jackman University of Copenhagen Advanced Methods in Applied Statistics Feb - Apr 2016 Niels Bohr Institute 2 Outline

More information

Bayesian Partition Models for Identifying Expression Quantitative Trait Loci

Bayesian Partition Models for Identifying Expression Quantitative Trait Loci Journal of the American Statistical Association ISSN: 0162-1459 (Print) 1537-274X (Online) Journal homepage: http://www.tandfonline.com/loi/uasa20 Bayesian Partition Models for Identifying Expression Quantitative

More information

The Bayesian Choice. Christian P. Robert. From Decision-Theoretic Foundations to Computational Implementation. Second Edition.

The Bayesian Choice. Christian P. Robert. From Decision-Theoretic Foundations to Computational Implementation. Second Edition. Christian P. Robert The Bayesian Choice From Decision-Theoretic Foundations to Computational Implementation Second Edition With 23 Illustrations ^Springer" Contents Preface to the Second Edition Preface

More information

Causal Model Selection Hypothesis Tests in Systems Genetics

Causal Model Selection Hypothesis Tests in Systems Genetics 1 Causal Model Selection Hypothesis Tests in Systems Genetics Elias Chaibub Neto and Brian S Yandell SISG 2012 July 13, 2012 2 Correlation and Causation The old view of cause and effect... could only fail;

More information

ST 740: Markov Chain Monte Carlo

ST 740: Markov Chain Monte Carlo ST 740: Markov Chain Monte Carlo Alyson Wilson Department of Statistics North Carolina State University October 14, 2012 A. Wilson (NCSU Stsatistics) MCMC October 14, 2012 1 / 20 Convergence Diagnostics:

More information

The E-M Algorithm in Genetics. Biostatistics 666 Lecture 8

The E-M Algorithm in Genetics. Biostatistics 666 Lecture 8 The E-M Algorithm in Genetics Biostatistics 666 Lecture 8 Maximum Likelihood Estimation of Allele Frequencies Find parameter estimates which make observed data most likely General approach, as long as

More information

Locating multiple interacting quantitative trait. loci using rank-based model selection

Locating multiple interacting quantitative trait. loci using rank-based model selection Genetics: Published Articles Ahead of Print, published on May 16, 2007 as 10.1534/genetics.106.068031 Locating multiple interacting quantitative trait loci using rank-based model selection, 1 Ma lgorzata

More information

Introduction to QTL mapping in model organisms

Introduction to QTL mapping in model organisms Introduction to QTL mapping in model organisms Karl W Broman Department of Biostatistics Johns Hopkins University kbroman@jhsph.edu www.biostat.jhsph.edu/ kbroman Outline Experiments and data Models ANOVA

More information

Petr Volf. Model for Difference of Two Series of Poisson-like Count Data

Petr Volf. Model for Difference of Two Series of Poisson-like Count Data Petr Volf Institute of Information Theory and Automation Academy of Sciences of the Czech Republic Pod vodárenskou věží 4, 182 8 Praha 8 e-mail: volf@utia.cas.cz Model for Difference of Two Series of Poisson-like

More information

Introduction to Bayesian Statistics and Markov Chain Monte Carlo Estimation. EPSY 905: Multivariate Analysis Spring 2016 Lecture #10: April 6, 2016

Introduction to Bayesian Statistics and Markov Chain Monte Carlo Estimation. EPSY 905: Multivariate Analysis Spring 2016 Lecture #10: April 6, 2016 Introduction to Bayesian Statistics and Markov Chain Monte Carlo Estimation EPSY 905: Multivariate Analysis Spring 2016 Lecture #10: April 6, 2016 EPSY 905: Intro to Bayesian and MCMC Today s Class An

More information

Hmms with variable dimension structures and extensions

Hmms with variable dimension structures and extensions Hmm days/enst/january 21, 2002 1 Hmms with variable dimension structures and extensions Christian P. Robert Université Paris Dauphine www.ceremade.dauphine.fr/ xian Hmm days/enst/january 21, 2002 2 1 Estimating

More information

Review Article Statistical Methods for Mapping Multiple QTL

Review Article Statistical Methods for Mapping Multiple QTL Hindawi Publishing Corporation International Journal of Plant Genomics Volume 2008, Article ID 286561, 8 pages doi:10.1155/2008/286561 Review Article Statistical Methods for Mapping Multiple QTL Wei Zou

More information

Introduction to Machine Learning CMU-10701

Introduction to Machine Learning CMU-10701 Introduction to Machine Learning CMU-10701 Markov Chain Monte Carlo Methods Barnabás Póczos & Aarti Singh Contents Markov Chain Monte Carlo Methods Goal & Motivation Sampling Rejection Importance Markov

More information

Lecture 5. G. Cowan Lectures on Statistical Data Analysis Lecture 5 page 1

Lecture 5. G. Cowan Lectures on Statistical Data Analysis Lecture 5 page 1 Lecture 5 1 Probability (90 min.) Definition, Bayes theorem, probability densities and their properties, catalogue of pdfs, Monte Carlo 2 Statistical tests (90 min.) general concepts, test statistics,

More information

Estimating Evolutionary Trees. Phylogenetic Methods

Estimating Evolutionary Trees. Phylogenetic Methods Estimating Evolutionary Trees v if the data are consistent with infinite sites then all methods should yield the same tree v it gets more complicated when there is homoplasy, i.e., parallel or convergent

More information

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A. 1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n

More information

New Bayesian methods for model comparison

New Bayesian methods for model comparison Back to the future New Bayesian methods for model comparison Murray Aitkin murray.aitkin@unimelb.edu.au Department of Mathematics and Statistics The University of Melbourne Australia Bayesian Model Comparison

More information

Bayesian Inference in Astronomy & Astrophysics A Short Course

Bayesian Inference in Astronomy & Astrophysics A Short Course Bayesian Inference in Astronomy & Astrophysics A Short Course Tom Loredo Dept. of Astronomy, Cornell University p.1/37 Five Lectures Overview of Bayesian Inference From Gaussians to Periodograms Learning

More information

Markov Chain Monte Carlo, Numerical Integration

Markov Chain Monte Carlo, Numerical Integration Markov Chain Monte Carlo, Numerical Integration (See Statistics) Trevor Gallen Fall 2015 1 / 1 Agenda Numerical Integration: MCMC methods Estimating Markov Chains Estimating latent variables 2 / 1 Numerical

More information