Causal Graphical Models in Systems Genetics
2013 Network Analysis Short Course - UCLA Human Genetics
Elias Chaibub Neto and Brian S Yandell
July 17, 2013
Motivation and basic concepts
Motivation
Suppose the expression of gene G is associated with a clinical phenotype C. We want to know whether G causes C ($G \to C$) or C causes G ($C \to G$).
We cannot distinguish between these models using the data alone, since $f(G)\, f(C \mid G) = f(G, C) = f(C)\, f(G \mid C)$, so their likelihood scores are identical.
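The likelihood equivalence stated on this slide can be checked numerically. The sketch below is an illustration only: it assumes Gaussian linear models fitted by maximum likelihood, and the simulated data, effect size, and helper names (`residuals`, `gauss_loglik`) are hypothetical and not taken from the slides.

```python
import math
import random

def residuals(y, x=None):
    """Least-squares residuals of y on an intercept, or on intercept + x."""
    n = len(y)
    my = sum(y) / n
    if x is None:
        return [v - my for v in y]
    mx = sum(x) / n
    sxx = sum((u - mx) ** 2 for u in x)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    slope = sxy / sxx
    return [(v - my) - slope * (u - mx) for u, v in zip(x, y)]

def gauss_loglik(resid):
    """Maximized Gaussian log-likelihood given residuals (ML variance estimate)."""
    n = len(resid)
    var = sum(r * r for r in resid) / n
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)

# Hypothetical data: G drives C, but the two factorizations still tie.
random.seed(1)
g = [random.gauss(0, 1) for _ in range(200)]
c = [0.7 * gi + random.gauss(0, 1) for gi in g]

ll_g_to_c = gauss_loglik(residuals(g)) + gauss_loglik(residuals(c, g))  # f(G) f(C|G)
ll_c_to_g = gauss_loglik(residuals(c)) + gauss_loglik(residuals(g, c))  # f(C) f(G|C)
print(ll_g_to_c, ll_c_to_g)
```

Both maximized log-likelihoods agree up to floating-point error, which is why a likelihood-based score cannot orient the edge between G and C without extra information such as a QTL.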
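Schadt et al. (2005)
However, if G and C map to the same QTL, we can use genetics to infer the causal ordering among the phenotypes.
[Figure: LOD profile along the chromosome with G and C mapping to the same QTL Q, and the three candidate models: causal ($Q \to G \to C$), reactive ($Q \to C \to G$), and independent ($G \leftarrow Q \to C$).]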
Schadt et al. (2005)
For a drug company, it is important to determine which genes are causal and which genes are reactive, since:
- Causal genes have the potential to become drug targets.
- Reactive genes are of lesser interest.
Genetics and causal inference
The integration of genetics and phenotype data allows us to infer causal relations between phenotypes for two reasons:
1. In experimental crosses, the association of a QTL and a phenotype is causal.
2. A causal QTL can be used to determine the causal order between phenotypes using the concept of conditional independence.
Causal relations between QTLs and phenotypes
In experimental crosses, the association of a QTL and a phenotype is causal. Why is it so?
QTL mapping is analogous to a randomized experiment (Li et al. 2006). Randomization is considered the gold standard for causal inference. Causality can be inferred from a randomized experiment since:
1. Application of a treatment to an experimental unit precedes the observation of the outcome (genotype precedes phenotype).
2. Because the treatment levels are randomized across the experimental units, the effects of confounding variables get averaged out (the Mendelian randomization of alleles during meiosis averages out the effects of other unlinked loci on the phenotype).
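Causal relations between QTLs and phenotypes
[Figure: a phenotype affected by QTL A on chr1 and QTL B on chr2, with the LOD profile along the chromosomes. One panel shows the phenotype by genotype (Aa vs AA), i.e., the effect of QTL A after the effect of B is averaged out. The other shows the phenotype by genotype (Bb vs BB), i.e., the effect of QTL B after the effect of A is averaged out.]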
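Conditional independence as the key to causal ordering
[Figure: under the model $Q \to G \to C$, panels illustrate marginal dependence (G and C plotted by genotype, Aa vs AA) and conditional independence (C plotted against G, and the residuals res(C | G) plotted by genotype).]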
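The residual-based reasoning on this slide can be sketched with a small simulation. This is a minimal illustration under assumed settings: a backcross-like binary genotype, unit effect sizes, Gaussian noise, and helper names (`pearson`, `residuals`) that are hypothetical and not part of the original material.

```python
import random

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def residuals(y, x):
    """Residuals of the simple least-squares regression of y on x."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return [(b - my) - slope * (a - mx) for a, b in zip(x, y)]

random.seed(2013)
n = 2000
q = [random.randint(0, 1) for _ in range(n)]      # genotype: 0 = Aa, 1 = AA
g = [1.0 * qi + random.gauss(0, 1) for qi in q]   # Q -> G (assumed effect size)
c = [1.0 * gi + random.gauss(0, 1) for gi in g]   # G -> C (assumed effect size)

r_marginal = pearson(q, c)                    # C depends on Q marginally
r_conditional = pearson(q, residuals(c, g))   # res(C | G) is unrelated to Q
r_reverse = pearson(q, residuals(g, c))       # res(G | C) still depends on Q
print(r_marginal, r_conditional, r_reverse)
```

Under the causal model, conditioning on G removes the association between Q and C, while conditioning on C does not remove the association between Q and G. That asymmetry is what allows the QTL to orient the edge.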
Causal ordering between phenotypes
[Figure: for each of the causal, reactive, and independent models on Q, G, and C, panels show the residuals res(C | G) and res(G | C) plotted by genotype (Aa vs AA).]
Causal ordering between phenotypes
In general (although it is not always true):
- Models that share the same set of conditional independence relations (Markov equivalent models) cannot be distinguished using the data, because they have equivalent likelihood functions.
- Models with distinct sets of conditional independence relations can be distinguished.
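Causality tests for pairs of phenotypes
[Figure: the three pairwise models on Q, G, and C: causal ($Q \to G \to C$), reactive ($Q \to C \to G$), and independent ($G \leftarrow Q \to C$).]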
Pairwise models as collapsed versions of more complex networks
[Figure: panels (a) to (e) show larger networks over Q and several phenotypes, each paired with its collapsed pairwise model on Q, $Y_1$, and $Y_2$.]
A causal relation might be direct or mediated by other phenotypes. Pairwise models are misspecified.
Schadt et al. (2005)
Using this approach, Schadt et al. (2005) identified and experimentally validated genes related to obesity in a mouse cross.
So, what is the issue then?
- Model selection via AIC or BIC scores does not provide a measure of uncertainty associated with the model selection call.
- With noisy data, model selection can lead to a large number of false positives.
The issue: an illustration
For each one of the 1,000 simulations we:
1. Generate noisy data from the model $Q \to Y_1 \to Y_2$.
2. Fit models $M_1: Q \to Y_1 \to Y_2$ and $M_2: Q \to Y_2 \to Y_1$.
3. Compute the log-likelihood ratio $LR_{12}$.
4. If $LR_{12} > 0$, select $M_1$. If $LR_{12} < 0$, select $M_2$.
[Figure: simulation results plotted on the axes $R^2(Y_2 = Q + \epsilon)$ and $R^2(Y_1 = Q + \epsilon)$: 682 true positives and 318 false positives.]
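The simulation described on this slide can be sketched as follows. The sample size, effect-size ranges, noise level, and seed are assumptions chosen for illustration, so the counts will not reproduce the 682/318 split reported on the slide. `cond_loglik` and `simulate_once` are hypothetical helper names.

```python
import math
import random

def cond_loglik(y, x):
    """Maximized Gaussian log-likelihood of the simple regression of y on x."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    rss = sum(((b - my) - slope * (a - mx)) ** 2 for a, b in zip(x, y))
    return -0.5 * n * (math.log(2 * math.pi * rss / n) + 1)

def simulate_once(rng, n=100):
    """Simulate Q -> Y1 -> Y2 and return LR12 = loglik(M1) - loglik(M2)."""
    b1, b2 = rng.uniform(0.2, 1.0), rng.uniform(0.2, 1.0)  # assumed effect sizes
    q = [rng.randint(0, 1) for _ in range(n)]
    y1 = [b1 * qi + rng.gauss(0, 1) for qi in q]
    y2 = [b2 * yi + rng.gauss(0, 1) for yi in y1]
    ll_m1 = cond_loglik(y1, q) + cond_loglik(y2, y1)  # M1: Q -> Y1 -> Y2
    ll_m2 = cond_loglik(y2, q) + cond_loglik(y1, y2)  # M2: Q -> Y2 -> Y1
    return ll_m1 - ll_m2

rng = random.Random(15)
lr = [simulate_once(rng) for _ in range(1000)]
n_m1 = sum(v > 0 for v in lr)  # calls for the true model M1
n_m2 = sum(v < 0 for v in lr)  # calls for the wrong model M2
print(n_m1, n_m2)
```

For these Gaussian models the sign of $LR_{12}$ reduces to comparing the $R^2$ of $Y_1$ on Q with the $R^2$ of $Y_2$ on Q, which matches the axes of the figure. Every simulation produces a call, with no statement about how reliable that call is.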
Issue: no measure of uncertainty for a model selection call
We want a statistical procedure that attaches a measure of uncertainty to a model selection call. Given the characteristics of our application, the procedure:
1. Needs to handle misspecified models.
2. Needs to handle non-nested models, such as two competing models $M_1$ and $M_2$ on Q, $Y_1$, and $Y_2$.
3. Should, ideally, be fully analytical for the sake of computational efficiency.
Assessing the significance of a model selection call
Vuong's model selection test (Vuong 1989) satisfies these three criteria.
[Figure: simulation results plotted on the axes $R^2(Y_2 = Q + \epsilon)$ and $R^2(Y_1 = Q + \epsilon)$: 65 true positives, 1 false positive, and 934 no calls.]
Vuong's model selection test (Vuong 1989)
Consider 2 competing models, $M_1$ and $M_2$. Vuong's test evaluates the hypotheses:
$H_0$: $M_1$ is not closer to the true model than $M_2$,
$H_1$: $M_1$ is closer to the true model than $M_2$,
where, under $H_0$, the scaled log-likelihood-ratio test statistic satisfies
$Z_{12} = \dfrac{\widehat{LR}_{12}}{\sqrt{n}\, \hat{\sigma}_{12.12}} \to_d N(0, 1)$,
with $\widehat{LR}_{12} = \sum_{i=1}^{n} (\log \hat{f}_{1,i} - \log \hat{f}_{2,i})$, and $\hat{\sigma}^2_{12.12}$ is the sample variance of the log-likelihood ratio scores.
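The test statistic above can be computed directly from per-observation log-likelihood contributions. The sketch below is a minimal illustration, not the R/qtlhot implementation. It assumes the divisor-n variance (`statistics.pvariance`) and a one-sided normal p-value, and the input numbers are made up.

```python
import math
import statistics

def vuong_z(logf1, logf2):
    """Z12 = LR12 / (sqrt(n) * sigma), from per-observation log-likelihoods."""
    d = [a - b for a, b in zip(logf1, logf2)]    # log f1_i - log f2_i
    n = len(d)
    lr_hat = sum(d)
    sigma_hat = math.sqrt(statistics.pvariance(d))  # assumed: divisor n
    return lr_hat / (math.sqrt(n) * sigma_hat)

def upper_tail_p(z):
    """One-sided p-value P(N(0, 1) >= z) for H1: M1 is closer than M2."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Hypothetical per-observation log-likelihoods under the two fitted models.
logf_m1 = [-1.0, -0.5, -0.2, -0.9]
logf_m2 = [-1.4, -0.6, -0.9, -1.0]
z = vuong_z(logf_m1, logf_m2)
p = upper_tail_p(z)
print(z, p)
```

Because the statistic depends only on the observation-wise log-likelihood differences, it applies to non-nested and misspecified models alike, and it needs no resampling.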
Causal Model Selection Tests (CMST)
Vuong's test handles model selection for 2 models only. However, we want to use data from experimental crosses to distinguish among 4 models.
[Figure: the four models $M_1$, $M_2$, $M_3$, and $M_4$ on Q, $Y_1$, and $Y_2$, together with the three likelihood equivalent versions of $M_4$: $M_4^a$, $M_4^b$, and $M_4^c$.]
Causal Model Selection Tests (CMST)
Combine several separate Vuong's tests into a single one. 3 versions:
1. Parametric CMST: intersection-union test of 3 Vuong's tests ($M_1$ vs $M_2$, $M_1$ vs $M_3$, $M_1$ vs $M_4$), testing:
   $H_0$: $M_1$ is not closer to the true model than $M_2$, $M_3$, or $M_4$.
   $H_1$: $M_1$ is closer to the true model than $M_2$, $M_3$, and $M_4$.
2. Non-parametric CMST: intersection-union test of 3 paired sign tests (Clarke's test).
3. Joint-parametric CMST: extension of the parametric CMST test which accounts for the correlation among the test statistics of the Vuong's tests.
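For an intersection-union test, the null hypothesis is rejected only when every component test rejects, so the combined p-value is the largest of the component p-values. The sketch below illustrates that rule for the parametric CMST. The function name, the default level, and the p-values are hypothetical.

```python
def cmst_intersection_union(pairwise_pvalues, alpha=0.05):
    """Combine the pairwise p-values (M1 vs M2, M1 vs M3, M1 vs M4).

    Returns the intersection-union p-value (the maximum) and whether
    M1 is called at level alpha.
    """
    p_value = max(pairwise_pvalues)
    return p_value, p_value <= alpha

# Hypothetical p-values from three pairwise Vuong tests.
print(cmst_intersection_union([0.001, 0.020, 0.200]))
print(cmst_intersection_union([0.001, 0.020, 0.030]))
```

In the first call a single weak comparison blocks the call for $M_1$, which is the source of the "no calls" outcome. The joint-parametric version additionally models the correlation among the three statistics and is not covered by this sketch.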
Yeast data analysis
Budding yeast genetical genomics data set (Brem and Kruglyak 2005). Data on 112 strains with:
- Expression measurements on 5,740 transcripts.
- Dense genotype data on 2,956 markers.
Most importantly: we evaluated the precision of the causal predictions using validated causal relationships extracted from a database of 247 knock-out experiments in yeast (Hughes et al. 2000, Zhu et al. 2008).
Knockout signatures
In each experiment, one gene was knocked out, and the expression levels of the remaining genes in control and knocked-out strains were interrogated for differential expression.
The set of differentially expressed genes forms the knock-out signature (ko-signature) of the knocked-out gene (ko-gene).
The ko-signature represents a validated set of causal relations.
Validation using yeast knockout signatures
To leverage the ko information, we:
- Determined which of the 247 ko-genes also showed a significant QTL in our data set.
- For each ko-gene showing significant linkages, determined which other genes co-mapped to the ko-gene's QTL, generating, in this way, a list of putative targets of the ko-gene.
- For each ko-gene/putative targets list, applied all methods using the ko-gene as the $Y_1$ phenotype, the putative target genes as the $Y_2$ phenotypes, and the ko-gene's QTL as the causal anchor.
Validation using yeast knockout signatures
In total, 135 ko-genes showed significant linkages (both cis- and trans-).
The number of genes in the target lists varied from ko-gene to ko-gene, but, in total, there were 31,936 targets.
Validation using yeast knockout signatures
Performance in terms of biologically validated TP, FP and precision:
- TP: a statistically significant causal relation between a ko-gene and a putative target gene when the putative target gene belongs to the ko-signature of the ko-gene.
- FP: a statistically significant causal relation between a ko-gene and a putative target gene when the target gene doesn't belong to the ko-signature.
- The validated precision is computed as the ratio of true positives to the sum of true and false positives.
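The definitions above translate into a few set operations. The sketch below is an illustration only. The function name and the gene labels are hypothetical placeholders, not data from the yeast analysis.

```python
def validated_precision(significant_targets, ko_signature):
    """Return (TP, FP, precision) for one ko-gene.

    significant_targets: putative targets with a significant causal call.
    ko_signature: genes differentially expressed in the knock-out experiment.
    """
    called = set(significant_targets)
    signature = set(ko_signature)
    tp = len(called & signature)   # called and in the ko-signature
    fp = len(called - signature)   # called but not in the ko-signature
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    return tp, fp, precision

# Hypothetical gene labels for illustration.
print(validated_precision({"g1", "g2", "g3", "g4"}, {"g1", "g2", "g9"}))
```

Precision is undefined when a method makes no calls at a given significance level, so the sketch returns NaN in that case rather than zero.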
Results: cis and trans ko-genes
[Figure: three panels (number of true positives, number of false positives, and precision), each plotted against the nominal significance level. Black: BIC; blue: joint CMST BIC; green: parametric CMST BIC; red: non-parametric CMST BIC.]
Results: cis ko-genes only
27 out of the 135 candidate regulator ko-genes mapped in cis.
[Figure: three panels (number of true positives, number of false positives, and precision), each plotted against the nominal significance level. Black: BIC; blue: joint CMST BIC; green: parametric CMST BIC; red: non-parametric CMST BIC.]
Precision side by side
[Figure: precision plotted against the nominal significance level for the cis and trans case and for the cis only case. Black: BIC; blue: joint CMST BIC; green: parametric CMST BIC; red: non-parametric CMST BIC.]
Cis-vs-trans case
Why is the cis-vs-trans case easier than the trans-vs-trans case?
In general, cis-linkages tend to be stronger than trans-linkages.
[Figure: plot on the axes $R^2(Y_2 = Q + \epsilon)$ and $R^2(Y_1 = Q + \epsilon)$.]
Conclusions
CMST tests trade a decrease in statistical power for a reduction in the rate of false positives.
Whether a more powerful and less precise method, or a less powerful and more precise one, is more adequate depends on the biologist's research goals and resources.
- If the biologist can easily validate several genes, a larger list generated by more powerful and less precise methods might be more appealing.
- If follow-up studies are time-consuming and expensive, and only a few candidates can be studied in detail, a more precise method that conservatively identifies candidates with high confidence can be more appealing.
Causal Bayesian networks and the QTLnet algorithm
[Figure: network over the yeast genes YOL084W, YOR028C, YNL195C, YKL091C, YNR014W, YEL011W, YPL154C, YFR043C, YHR104W, YKL085W, YDR032C, YJL210W, YJL111W, YLR178C, YIL113W, YPR160W, YOL097C, YNL160W, YIR016W, YJL161W, YHR016C, YAL061W, YMR170C, YJR096W.]
Standard Bayesian networks
A graphical model is a multivariate probabilistic model whose conditional independence relations are represented by a graph.
Bayesian networks are directed acyclic graph (DAG) models. Assuming the Markov property, the joint distribution factors according to the conditional independence relations:
$P(1, 2, 3, 4, 5, 6) = P(6 \mid 5)\, P(5 \mid 3, 4)\, P(4)\, P(3 \mid 1, 2)\, P(2)\, P(1)$
$6 \perp \{1, 2, 3, 4\} \mid 5$, $5 \perp \{1, 2\} \mid 3, 4$, and so on,
i.e., each node is independent of its non-descendants given its parents.
[Figure: the DAG on nodes 1 to 6.]
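The factorization and the local Markov statements on this slide follow mechanically from the parent sets. The sketch below encodes the six-node DAG implied by the factorization. The dictionary layout and function names are hypothetical choices for illustration.

```python
# Parent sets read off the factorization P(6|5) P(5|3,4) P(4) P(3|1,2) P(2) P(1).
parents = {1: [], 2: [], 3: [1, 2], 4: [], 5: [3, 4], 6: [5]}

def descendants(node):
    """All nodes reachable from `node` by following directed edges."""
    children = [c for c, ps in parents.items() if node in ps]
    found = set(children)
    for child in children:
        found |= descendants(child)
    return found

def local_markov(node):
    """Return (S, Pa) such that node is independent of S given its parents Pa."""
    pa = set(parents[node])
    others = set(parents) - {node} - descendants(node) - pa
    return sorted(others), sorted(pa)

def factorization():
    """Joint distribution as a product of P(node | parents) terms."""
    terms = []
    for node in sorted(parents, reverse=True):
        ps = parents[node]
        terms.append(f"P({node} | {', '.join(map(str, ps))})" if ps else f"P({node})")
    return " ".join(terms)

print(factorization())
print(local_markov(6))
print(local_markov(5))
```

The two `local_markov` calls recover the statements $6 \perp \{1, 2, 3, 4\} \mid 5$ and $5 \perp \{1, 2\} \mid 3, 4$ quoted on the slide.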
Standard Bayesian networks and causality
Even though the directed edges in a Bayes net are often interpreted as causal relations, in reality they only represent conditional dependencies.
Different phenotype networks, for instance,
$Y_1 \to Y_2 \to Y_3$, $Y_1 \leftarrow Y_2 \leftarrow Y_3$, $Y_1 \leftarrow Y_2 \to Y_3$,
can represent the same set of conditional independence relations ($Y_1 \perp Y_3 \mid Y_2$, in this example).
When this is the case, we say the networks are Markov equivalent.
Standard Bayesian networks and causality
In general: Markov equivalence $\Rightarrow$ distribution equivalence (equivalence of likelihood functions).
Hence, model selection criteria cannot distinguish between Markov equivalent networks.
The best we can do is to learn equivalence classes of likelihood equivalent phenotype networks from the data.
Genetics as a way to reduce the size of equivalence classes
The incorporation of genetic information can help distinguish between likelihood equivalent networks in two distinct ways:
1. By creating priors for the network structures, using the results of causality tests (Zhu et al. 2007).
2. By augmenting the phenotype network with QTL nodes, creating new sets of conditional independence relations (Chaibub Neto et al. 2010).
Genetic priors
Consider the networks
$M_1: Y_1 \to Y_2 \to Y_3$, $M_2: Y_1 \leftarrow Y_2 \leftarrow Y_3$.
These Markov equivalent networks have the same likelihood, i.e., $P(D \mid M_1) = P(D \mid M_2)$.
If the phenotypes are associated with QTLs, we can use the results of the causality tests to compute prior probabilities for the network structures.
If $\dfrac{P(M_1)}{P(M_2)} \neq 1$, then
$\dfrac{P(M_1 \mid D)}{P(M_2 \mid D)} = \dfrac{P(D \mid M_1)\, P(M_1)}{P(D \mid M_2)\, P(M_2)} \neq 1$,
and we can use the posterior probability ratio to distinguish between the networks.
Augmenting the phenotype network with QTLs
Consider the Markov equivalent networks:
$M_1: Y_1 \to Y_2 \to Y_3$, $M_2: Y_1 \leftarrow Y_2 \leftarrow Y_3$.
By augmenting the phenotype network with a QTL node,
$M_1: Q \to Y_1 \to Y_2 \to Y_3$, $M_2: Q \to Y_1 \leftarrow Y_2 \leftarrow Y_3$,
we have that $M_1$ and $M_2$ have distinct sets of conditional independence relations:
$Y_2 \perp Q \mid Y_1$, $Y_1 \perp Y_3 \mid Y_2$, on $M_1$
$Y_2 \not\perp Q \mid Y_1$, $Y_1 \perp Y_3 \mid Y_2$, on $M_2$
Hence, $M_1$ and $M_2$ are no longer likelihood equivalent.
Learning Bayesian Networks from Data
Posterior probability of network $M_k$ given the observed data, D:
$P(M_k \mid D) = \dfrac{P(D \mid M_k)\, P(M_k)}{\sum_{M_k} P(D \mid M_k)\, P(M_k)}$.
Prior predictive distribution of D given $M_k$:
$P(D \mid M_k) = \int P(D \mid \theta, M_k)\, P(\theta \mid M_k)\, d\theta$.
Prior distribution of network $M_k$: $P(M_k)$.
The marginal distribution of the data,
$P(D) = \sum_{M_k} P(D \mid M_k)\, P(M_k)$,
cannot, generally, be computed analytically because the number of networks is too large.
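When only a handful of candidate networks is considered, the posterior formula above can be evaluated directly. The sketch below assumes the log marginal likelihoods $\log P(D \mid M_k)$ are already available and normalizes them on the log scale for numerical stability. The function name and the input values are hypothetical.

```python
import math

def model_posteriors(log_marginal_liks, priors):
    """P(M_k | D) over an enumerated set of models (log-sum-exp normalization)."""
    log_joint = [ll + math.log(p) for ll, p in zip(log_marginal_liks, priors)]
    top = max(log_joint)
    weights = [math.exp(v - top) for v in log_joint]  # rescaled to avoid underflow
    total = sum(weights)
    return [w / total for w in weights]

# Two likelihood equivalent networks with unequal (hypothetical) genetic priors.
print(model_posteriors([-120.0, -120.0], [0.8, 0.2]))
```

With equal marginal likelihoods the posterior simply reproduces the prior, which is why informative priors derived from causality tests can separate networks that the data alone cannot. Enumeration stops being feasible as the number of nodes grows, hence the need for search or sampling.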
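Learning Bayesian Networks from Data
Complexity of the learning task:
[Table: number of nodes versus number of networks. The entries are only partly recoverable from the extraction (fragments "…,781,…" and "…e+158"), showing that the number of networks grows super-exponentially with the number of nodes.]
Hence, heuristic search algorithms are essential to traverse the network space efficiently.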
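The growth in the number of networks can be reproduced by counting labeled DAGs. The sketch below uses Robinson's recurrence, which is a standard result and not something stated on the slide, so its use here is an assumption about what the table tabulated.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def count_dags(n):
    """Number of labeled DAGs on n nodes (Robinson's recurrence)."""
    if n == 0:
        return 1
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )

for n in (1, 2, 3, 4, 5, 6, 10, 20, 30):
    print(n, f"{count_dags(n):.3e}")
```

Python integers are exact, so the recurrence stays accurate for large n. Exhaustive scoring is already out of reach for a few dozen phenotypes.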
QTLnet algorithm
Perform joint inference of the causal phenotype network and the associated genetic architecture.
The genetic architecture is inferred conditional on the phenotype network.
Because the phenotype network structure is itself unknown, the algorithm iterates between updating the network structure and the genetic architecture using a Markov chain Monte Carlo (MCMC) approach.
QTLnet corresponds to a mixed Bayesian network with continuous and discrete nodes representing phenotypes and QTLs, respectively.
QTLnet algorithm - standard structure sampler
Bayesian model averaging
[Figure: bar plot of the posterior probabilities of models $M_1$ to $M_{10}$, with a table indicating the relation between $Y_1$ and $Y_2$ in each model.]
$\Pr(Y_1 \to Y_2) = \Pr(M_1) + \Pr(M_3) + \Pr(M_4) = 0.54$
$\Pr(Y_1 \ldots Y_2) = \Pr(M_2) + \Pr(M_5) + \Pr(M_7) = 0.34$
$\Pr(Y_1 \leftarrow Y_2) = \Pr(M_6) + \Pr(M_8) + \Pr(M_9) + \Pr(M_{10}) = 0.12$
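Model averaging for a single pair of phenotypes amounts to summing posterior probabilities over the models that share the same relation between $Y_1$ and $Y_2$. In the sketch below, the grouping of models and the three totals come from the slide. The individual model probabilities are hypothetical values chosen only to add up to those totals, and the group labels (including the arrow directions) are assumptions.

```python
# Hypothetical per-model posterior probabilities (not recoverable from the slide).
posterior = {
    "M1": 0.30, "M3": 0.14, "M4": 0.10,
    "M2": 0.20, "M5": 0.09, "M7": 0.05,
    "M6": 0.05, "M8": 0.03, "M9": 0.02, "M10": 0.02,
}

# Grouping of models by the Y1/Y2 relation, as listed on the slide.
groups = {
    "Y1 -> Y2": ["M1", "M3", "M4"],
    "Y1 ... Y2": ["M2", "M5", "M7"],
    "Y1 <- Y2": ["M6", "M8", "M9", "M10"],
}

def averaged_probabilities(posterior, groups):
    """Sum model posteriors within each group of structurally similar models."""
    return {label: sum(posterior[m] for m in models)
            for label, models in groups.items()}

averaged = averaged_probabilities(posterior, groups)
for label, prob in averaged.items():
    print(label, round(prob, 2))
```

Averaging over models reports support for a feature of the network, here the direction of one edge, instead of committing to the single highest-scoring model.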
Yeast data analysis
We build a causal phenotype network around PHM7.
PHM7 is physically located close to the hotspot QTL on chr 15.
[Figure: hotspot counts along chr 15 by map position (cM).]
PHM7 is the cis-gene with the largest number of significant causal calls across all hotspots (23 significant calls for joint CMST at the nominal level α).
Yeast data analysis
PHM7 (yellow) shows up at the top of the transcriptional network.
[Figure: transcriptional network over the genes YOL084W, YOR028C, YNL195C, YKL091C, YNR014W, YEL011W, YPL154C, YFR043C, YHR104W, YKL085W, YJL210W, YJL111W, YLR178C, YIL113W, YDR032C, YOL097C, YIR016W, YPR160W, YNL160W, YHR016C, YJL161W, YAL061W, YMR170C, YJR096W.]
References
1. Chaibub Neto et al. (2013). Modeling causality for pairs of phenotypes in systems genetics. Genetics 193.
2. Chaibub Neto et al. (2010). Causal graphical models in systems genetics: a unified framework for joint inference of causal network and genetic architecture for correlated phenotypes. Annals of Applied Statistics 4.
3. Software: R/qtlhot and R/qtlnet packages.
Further references:
1. Brem and Kruglyak (2005). PNAS 102.
2. Clarke (2007). Political Analysis 15.
3. Hughes et al. (2000). Cell 102.
4. Kullback (1959). Information theory and statistics. John Wiley, New York.
5. Li et al. (2006). PLoS Genetics 2.
6. Schadt et al. (2005). Nature Genetics 37.
7. Vuong (1989). Econometrica 57.
8. Zhu et al. (2008). Nature Genetics 40.
Acknowledgments
Co-authors:
- Brian S Yandell (Statistics - UW-Madison)
- Mark P Keller (Biochemistry - UW-Madison)
- Alan D Attie (Biochemistry - UW-Madison)
- Bin Zhang (Genetics and Genomic Sciences - MSSM)
- Jun Zhu (Genetics and Genomic Sciences - MSSM)
- Aimee T Broman (Biochemistry - UW-Madison)
Thank you!
Bayesian Networks Computer Sciences 760 Spring 2014 http://pages.cs.wisc.edu/~dpage/cs760/ Motivation Assume we have five Boolean variables,,,, The joint probability is,,,, How many state configurations
More information6.047 / Computational Biology: Genomes, Networks, Evolution Fall 2008
MIT OpenCourseWare http://ocw.mit.edu 6.047 / 6.878 Computational Biology: Genomes, Networks, Evolution Fall 2008 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
More informationTutorial Session 2. MCMC for the analysis of genetic data on pedigrees:
MCMC for the analysis of genetic data on pedigrees: Tutorial Session 2 Elizabeth Thompson University of Washington Genetic mapping and linkage lod scores Monte Carlo likelihood and likelihood ratio estimation
More informationPrinciples of Bayesian Inference
Principles of Bayesian Inference Sudipto Banerjee University of Minnesota July 20th, 2008 1 Bayesian Principles Classical statistics: model parameters are fixed and unknown. A Bayesian thinks of parameters
More informationSPARSE MODEL BUILDING FROM GENOME-WIDE VARIATION WITH GRAPHICAL MODELS
SPARSE MODEL BUILDING FROM GENOME-WIDE VARIATION WITH GRAPHICAL MODELS A Dissertation Presented to the Faculty of the Graduate School of Cornell University in Partial Fulfillment of the Requirements for
More informationCausal Graphical Models in Quantitative Genetics and Genomics
Causal Graphical Models in Quantitative Genetics and Genomics Guilherme J. M. Rosa Department of Animal Sciences Department of Biostatistics & Medical Informatics OUTLINE Introduction: Correlation and
More informationThe genomes of recombinant inbred lines
The genomes of recombinant inbred lines Karl W Broman Department of Biostatistics Johns Hopkins University http://www.biostat.jhsph.edu/~kbroman C57BL/6 2 1 Recombinant inbred lines (by sibling mating)
More informationLearning gene regulatory networks Statistical methods for haplotype inference Part I
Learning gene regulatory networks Statistical methods for haplotype inference Part I Input: Measurement of mrn levels of all genes from microarray or rna sequencing Samples (e.g. 200 patients with lung
More informationBayesian model selection: methodology, computation and applications
Bayesian model selection: methodology, computation and applications David Nott Department of Statistics and Applied Probability National University of Singapore Statistical Genomics Summer School Program
More informationGibbs Sampling Methods for Multiple Sequence Alignment
Gibbs Sampling Methods for Multiple Sequence Alignment Scott C. Schmidler 1 Jun S. Liu 2 1 Section on Medical Informatics and 2 Department of Statistics Stanford University 11/17/99 1 Outline Statistical
More informationNetwork Biology-part II
Network Biology-part II Jun Zhu, Ph. D. Professor of Genomics and Genetic Sciences Icahn Institute of Genomics and Multi-scale Biology The Tisch Cancer Institute Icahn Medical School at Mount Sinai New
More informationBayesian Networks to design optimal experiments. Davide De March
Bayesian Networks to design optimal experiments Davide De March davidedemarch@gmail.com 1 Outline evolutionary experimental design in high-dimensional space and costly experimentation the microwell mixture
More informationComputational Approaches to Statistical Genetics
Computational Approaches to Statistical Genetics GWAS I: Concepts and Probability Theory Christoph Lippert Dr. Oliver Stegle Prof. Dr. Karsten Borgwardt Max-Planck-Institutes Tübingen, Germany Tübingen
More informationPart I. C. M. Bishop PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 8: GRAPHICAL MODELS
Part I C. M. Bishop PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 8: GRAPHICAL MODELS Probabilistic Graphical Models Graphical representation of a probabilistic model Each variable corresponds to a
More informationDown by the Bayes, where the Watermelons Grow
Down by the Bayes, where the Watermelons Grow A Bayesian example using SAS SUAVe: Victoria SAS User Group Meeting November 21, 2017 Peter K. Ott, M.Sc., P.Stat. Strategic Analysis 1 Outline 1. Motivating
More informationComputer Vision Group Prof. Daniel Cremers. 10a. Markov Chain Monte Carlo
Group Prof. Daniel Cremers 10a. Markov Chain Monte Carlo Markov Chain Monte Carlo In high-dimensional spaces, rejection sampling and importance sampling are very inefficient An alternative is Markov Chain
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 7 Approximate
More informationLearning Bayesian networks
1 Lecture topics: Learning Bayesian networks from data maximum likelihood, BIC Bayesian, marginal likelihood Learning Bayesian networks There are two problems we have to solve in order to estimate Bayesian
More informationA Statistical Framework for Expression Trait Loci (ETL) Mapping. Meng Chen
A Statistical Framework for Expression Trait Loci (ETL) Mapping Meng Chen Prelim Paper in partial fulfillment of the requirements for the Ph.D. program in the Department of Statistics University of Wisconsin-Madison
More informationDensity Estimation. Seungjin Choi
Density Estimation Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr http://mlg.postech.ac.kr/
More informationMachine Learning for Data Science (CS4786) Lecture 24
Machine Learning for Data Science (CS4786) Lecture 24 Graphical Models: Approximate Inference Course Webpage : http://www.cs.cornell.edu/courses/cs4786/2016sp/ BELIEF PROPAGATION OR MESSAGE PASSING Each
More informationMODEL-FREE LINKAGE AND ASSOCIATION MAPPING OF COMPLEX TRAITS USING QUANTITATIVE ENDOPHENOTYPES
MODEL-FREE LINKAGE AND ASSOCIATION MAPPING OF COMPLEX TRAITS USING QUANTITATIVE ENDOPHENOTYPES Saurabh Ghosh Human Genetics Unit Indian Statistical Institute, Kolkata Most common diseases are caused by
More informationBayesian Networks Inference with Probabilistic Graphical Models
4190.408 2016-Spring Bayesian Networks Inference with Probabilistic Graphical Models Byoung-Tak Zhang intelligence Lab Seoul National University 4190.408 Artificial (2016-Spring) 1 Machine Learning? Learning
More informationSummary of the Bayes Net Formalism. David Danks Institute for Human & Machine Cognition
Summary of the Bayes Net Formalism David Danks Institute for Human & Machine Cognition Bayesian Networks Two components: 1. Directed Acyclic Graph (DAG) G: There is a node for every variable D: Some nodes
More informationThe E-M Algorithm in Genetics. Biostatistics 666 Lecture 8
The E-M Algorithm in Genetics Biostatistics 666 Lecture 8 Maximum Likelihood Estimation of Allele Frequencies Find parameter estimates which make observed data most likely General approach, as long as
More informationBayesian room-acoustic modal analysis
Bayesian room-acoustic modal analysis Wesley Henderson a) Jonathan Botts b) Ning Xiang c) Graduate Program in Architectural Acoustics, School of Architecture, Rensselaer Polytechnic Institute, Troy, New
More informationBayesian Partition Models for Identifying Expression Quantitative Trait Loci
Journal of the American Statistical Association ISSN: 0162-1459 (Print) 1537-274X (Online) Journal homepage: http://www.tandfonline.com/loi/uasa20 Bayesian Partition Models for Identifying Expression Quantitative
More informationCSC321 Lecture 18: Learning Probabilistic Models
CSC321 Lecture 18: Learning Probabilistic Models Roger Grosse Roger Grosse CSC321 Lecture 18: Learning Probabilistic Models 1 / 25 Overview So far in this course: mainly supervised learning Language modeling
More informationThe Lander-Green Algorithm. Biostatistics 666 Lecture 22
The Lander-Green Algorithm Biostatistics 666 Lecture Last Lecture Relationship Inferrence Likelihood of genotype data Adapt calculation to different relationships Siblings Half-Siblings Unrelated individuals
More informationMultilevel Statistical Models: 3 rd edition, 2003 Contents
Multilevel Statistical Models: 3 rd edition, 2003 Contents Preface Acknowledgements Notation Two and three level models. A general classification notation and diagram Glossary Chapter 1 An introduction
More informationDiscovering molecular pathways from protein interaction and ge
Discovering molecular pathways from protein interaction and gene expression data 9-4-2008 Aim To have a mechanism for inferring pathways from gene expression and protein interaction data. Motivation Why
More informationThe Origin of Deep Learning. Lili Mou Jan, 2015
The Origin of Deep Learning Lili Mou Jan, 2015 Acknowledgment Most of the materials come from G. E. Hinton s online course. Outline Introduction Preliminary Boltzmann Machines and RBMs Deep Belief Nets
More informationA CAUSAL GENE NETWORK WITH GENETIC VARIATIONS INCORPORATING BIOLOGICAL KNOWLEDGE AND LATENT VARIABLES. By Jee Young Moon
A CAUSAL GENE NETWORK WITH GENETIC VARIATIONS INCORPORATING BIOLOGICAL KNOWLEDGE AND LATENT VARIABLES By Jee Young Moon A dissertation submitted in partial fulfillment of the requirements for the degree
More informationLearning Energy-Based Models of High-Dimensional Data
Learning Energy-Based Models of High-Dimensional Data Geoffrey Hinton Max Welling Yee-Whye Teh Simon Osindero www.cs.toronto.edu/~hinton/energybasedmodelsweb.htm Discovering causal structure as a goal
More informationBayes Networks. CS540 Bryan R Gibson University of Wisconsin-Madison. Slides adapted from those used by Prof. Jerry Zhu, CS540-1
Bayes Networks CS540 Bryan R Gibson University of Wisconsin-Madison Slides adapted from those used by Prof. Jerry Zhu, CS540-1 1 / 59 Outline Joint Probability: great for inference, terrible to obtain
More information