Nested Effects Models at Work
1 21/06/2010 Nested Effects Models at Work. Tutorial Session: Network Modelling in Systems Biology with R. Prof. Dr. Holger Fröhlich, Algorithmic Bioinformatics, Bonn-Aachen International Center for Information Technology (B-IT)
2 Learning From Perturbation Effects [Figure: perturbations P1 to P4, each with its measured effects] Readouts: microarrays, RNAi screening, reverse phase protein arrays, mass spec, RNAseq, ... Page 2 Holger Fröhlich Algorithmic Bioinformatics
3 Break a system to learn how it works
4 Pathways from RNAi Data: an Example
5 Pathways from RNAi Data: an Example
Computational approach: Bayesian Networks - Nested Effects Models
References: Markowetz et al., 2005, 2007; Fröhlich et al., 2007, 2008a, b, 2009; Tresch and Markowetz, 2008; Zeller et al., 2008; Vaske et al., 2009; Anchang et al., 2009
6 Principal Idea of Nested Effects Models
Distinguish between perturbed genes (S1 to S4, connected by the signaling graph Φ) and observed effects (attached to the perturbed genes via Θ).
- Measure downstream effects of each knockdown
- Network reconstruction is based on observed effects under different perturbations
[Figure: observed effects against perturbed genes S1 to S4]
Markowetz et al., 2005
7 Nested Effects Models (NEMs) are transitively closed graphs, which explain the nested structure of downstream effects.
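To make the "transitively closed graph" and "nested structure" ideas concrete, here is a minimal base-R sketch. The four-gene chain and the helper name `transitive_closure` are hypothetical and chosen only for illustration; nothing here uses the nem package.

```r
# Hypothetical 4-gene chain S1 -> S2 -> S3 -> S4 (illustration only).
genes <- paste0("S", 1:4)
phi <- matrix(0, 4, 4, dimnames = list(genes, genes))
phi["S1", "S2"] <- 1
phi["S2", "S3"] <- 1
phi["S3", "S4"] <- 1

# Transitive closure by repeated Boolean matrix multiplication.
transitive_closure <- function(adj) {
  closed <- adj
  diag(closed) <- 1   # a knockdown always affects the gene itself
  repeat {
    nxt <- (closed %*% closed > 0) * 1
    if (all(nxt == closed)) break
    closed <- nxt
  }
  closed
}

closed <- transitive_closure(phi)

# S-genes whose attached effects respond to each knockdown (rows of the closure).
downstream <- lapply(genes, function(g) genes[closed[g, ] == 1])
names(downstream) <- genes
print(downstream)
```

In this toy chain the effect set of each knockdown contains the effect sets of all genes further downstream, which is the nesting the model exploits.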
8 Likelihood of the Signaling Graph (Φ)
Two different approaches:
1. Bayesian: integrate (sum) over effects linkage graphs Θ, assuming $P(\Theta \mid \Phi) = P(\Theta)$ (Markowetz et al., 2005; Fröhlich et al., 2007, 2008):
$$P(D \mid \Phi) = \sum_{\Theta} P(D \mid \Phi, \Theta)\, P(\Theta)$$
2. Take the MAP/ML estimator for Θ (Tresch and Markowetz, 2008):
$$\hat{\Theta} = \arg\max_{\Theta} P(D \mid \Phi, \Theta)\, P(\Theta), \qquad P(\Phi \mid D, \hat{\Theta}) = \frac{P(D \mid \Phi, \hat{\Theta})\, P(\Phi)}{P(D)}$$
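10 Calculation of Effect Likelihoods
Factorization of the likelihood under i.i.d. assumption, with E the set of observed effect genes and S the set of perturbed genes:
$$P(D \mid \Phi) = \sum_{\Theta} P(D \mid \Phi, \Theta)\, P(\Theta) = \prod_{k \in E} \sum_{s \in S} \prod_{t \in S} P(D_{tk} \mid \Phi, \Theta_{sk} = 1)\, P(\Theta_{sk} = 1) \propto \prod_{k \in E} \sum_{s \in S} \prod_{t \in S} P(D_{tk} \mid m_{tk} = \Phi_{ts})$$
1. Model for binary data D with fixed error probabilities α (false positive rate) and β (false negative rate):
$$P(D_{tk} = 1 \mid m_{tk}) = \begin{cases} 1 - \beta & \text{if } m_{tk} = 1 \\ \alpha & \text{if } m_{tk} = 0 \end{cases} \qquad P(D_{tk} = 0 \mid m_{tk}) = \begin{cases} \beta & \text{if } m_{tk} = 1 \\ 1 - \alpha & \text{if } m_{tk} = 0 \end{cases}$$
Markowetz et al., 2005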
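The factorized likelihood for binary data can be sketched in a few lines of base R. This is an illustrative sketch, not the nem package's implementation. It assumes a uniform prior over the attachment of each effect gene, treats α as the false positive and β as the false negative rate, and stores D with knockdowns in rows and effect genes in columns. The function name and the toy data are hypothetical.

```r
# Marginal log-likelihood of binary data D under a signaling graph phi.
# D:   knockdowns (rows, index t) x effect genes (columns, index k), entries 0/1
# phi: transitively closed adjacency matrix with unit diagonal;
#      phi[t, s] = 1 means knocking down t also affects s
nem_loglik_binary <- function(D, phi, alpha = 0.1, beta = 0.2) {
  n_s <- nrow(phi)
  total <- 0
  for (k in seq_len(ncol(D))) {
    lik_k <- 0
    for (s in seq_len(n_s)) {          # effect gene k attached to gene s
      m <- phi[, s]                    # predicted effects m_tk = phi[t, s]
      p <- ifelse(m == 1,
                  ifelse(D[, k] == 1, 1 - beta, beta),
                  ifelse(D[, k] == 1, alpha, 1 - alpha))
      lik_k <- lik_k + prod(p) / n_s   # assumed uniform P(Theta_sk = 1)
    }
    total <- total + log(lik_k)
  }
  total
}

# Toy check: noise-free data generated from the chain 1 -> 2 -> 3.
phi_true <- matrix(c(1, 1, 1,
                     0, 1, 1,
                     0, 0, 1), nrow = 3, byrow = TRUE)
D <- phi_true[, c(1, 1, 2, 2, 3, 3)]   # two effect genes per perturbed gene
ll_true <- nem_loglik_binary(D, phi_true)
ll_reversed <- nem_loglik_binary(D, t(phi_true))
print(c(true = ll_true, reversed = ll_reversed))
```

On this toy data the generating graph scores higher than the reversed chain, which is the behaviour the score is meant to have.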
11 Modeling Continuous Data
2. Data D are computed as p-values for significant change, when comparing interventions to non-interventions.
- Under the null hypothesis (i.e. expecting no effect) p-values are distributed uniformly.
- Under the alternative hypothesis (i.e. expecting an effect) there is a high density for small p-values and a strong decrease for increasing p-values [Pounds et al., 2003].
$$f(D_{tk}) = \pi_{1k} + \pi_{2k}\, \mathrm{Beta}(D_{tk}, \alpha_t, 1) + \pi_{3k}\, \mathrm{Beta}(D_{tk}, 1, \beta_t)$$
-> fit via EM algorithm
$$P(D_{tk} \mid m_{tk}) = \begin{cases} \dfrac{f(D_{tk}) - f(1)}{1 - f(1)} & \text{if } m_{tk} = 1 \\ 1 & \text{if } m_{tk} = 0 \end{cases}$$
Fröhlich et al., 2008
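The mixture of a uniform and two Beta components, and the effect density derived from it, can be evaluated directly with `dbeta`. This is a hedged sketch: the weights and shape parameters are hypothetical values set by hand, whereas the slide fits them with the EM algorithm. The function names are illustrative only.

```r
# f(p) = pi1 + pi2 * Beta(p; a, 1) + pi3 * Beta(p; 1, b)
# w, a, b are hypothetical values set by hand (the slide fits them via EM).
mix_density <- function(p, w = c(0.6, 0.3, 0.1), a = 0.3, b = 10) {
  w[1] + w[2] * dbeta(p, a, 1) + w[3] * dbeta(p, 1, b)
}

# Density under "effect expected": remove the uniform level f(1), renormalize.
effect_density <- function(p, ...) {
  f1 <- mix_density(1, ...)
  (mix_density(p, ...) - f1) / (1 - f1)
}

p <- c(0.001, 0.05, 0.5, 0.9)
print(rbind(p = p, effect = effect_density(p), no_effect = 1))
```

Small p-values get a density well above 1 (support for an effect), large p-values a density below 1, matching the verbal description on the slide.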
12 Using NEMs in R
    library(nem)
    load("raw_pvaluesboutros2002.rda")
    D = getDensityMatrix(pvalues)
13 How to Infer the Network Structure?
Complete enumeration of all topologies (Markowetz et al., 2005):
1. Choose a candidate graph over S1 to S4
2. Calculate its score with the likelihood model, e.g. using Bayesian statistics (average over E-gene positions)
3. Propose a different topology and repeat
Combinatorial explosion:
- n = 4: 355 possible networks
- n = 10: ~10^27 possible networks
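The figure of 355 networks for n = 4 can be reproduced by brute force, under the assumption that it counts transitively closed graphs with every node reaching itself. The sketch below enumerates all off-diagonal edge patterns and keeps the transitively closed ones; the function name is hypothetical. The ~10^27 figure for n = 10 is close to 2^90, the number of directed graphs on 10 nodes without self-loops, so it may count unconstrained graphs instead; that reading is an assumption.

```r
# Count transitively closed graphs on n labeled nodes (unit diagonal assumed).
count_closed_graphs <- function(n) {
  off <- which(!diag(n))                    # linear indices of off-diagonal cells
  count <- 0
  for (code in 0:(2^length(off) - 1)) {     # every on/off pattern of the edges
    adj <- diag(n)
    adj[off] <- as.integer(intToBits(code))[seq_along(off)]
    reach <- adj %*% adj > 0                # paths of length <= 2
    if (all(reach == (adj > 0))) count <- count + 1   # closed: nothing new
  }
  count
}

print(sapply(1:4, count_closed_graphs))
print(2^90)   # directed graphs on 10 nodes without self-loops, about 1.2e27
```

The loop runs over 2^(n(n-1)) patterns, so it is only practical for n up to 4 or 5, which is exactly why the deck moves on to heuristics.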
15 Heuristics for Large Networks (> 4 S-Genes)
- MCMC sampling: time consuming; neighborhood relation in transitively closed graphs difficult
- Greedy hill climbing (Fröhlich et al., Bioinformatics, 2008)
- Module networks (Fröhlich et al., BMC Bioinformatics, 2007; Fröhlich et al., Bioinformatics, 2008)
- Triplets inference (Markowetz et al., Bioinformatics, 2007)
- Alternating MAP optimization (Tresch and Markowetz, Stat. Appl. Mol. Biol., 2008)
16 Large Scale Networks: Module Networks
Problem: complete enumeration of all network hypotheses is only possible for small networks (< 5 S-genes).
Solution: divide and conquer
1. Highest scoring subnetworks for modules of S-genes
2. Estimate connections between modules
17 Large Scale Networks: (a) Module Networks [Figure: module network over S-genes S1 to S9; axes: log-likelihood, network] Fröhlich et al., 2007, 2008
18 Network Inference with the nem-package
    control = set.default.parameters(unique(colnames(D)), type="CONTmLLBayes")
    mynem = nem(D, inference="ModuleNetwork", control=control, verbose=FALSE)
    plot.nem(mynem, SCC=FALSE, D=D, draw.lines=TRUE)
19 Automated Selection of Relevant E-Genes (Feature Selection)
Motivation: irrelevant E-genes can degrade network estimation accuracy.
1. Select only E-genes having a positive contribution to the model's log-likelihood.
2. Re-estimate the network with the new set of E-genes.
3. Iterate the process until convergence.
Fröhlich et al., 2008
20 Network Inference with the nem-package
    data("BoutrosRNAi2002")   # assumed: the data set that provides BoutrosRNAiDiscrete
    D2 = BoutrosRNAiDiscrete[,9:16]
    control = set.default.parameters(unique(colnames(D2)), selEGenes=TRUE)
    mynem2 = nem(D2, inference="triples", control=control, verbose=FALSE)
    plot.nem(mynem2, D=D2, draw.lines=TRUE)
21 Incorporation of Prior Knowledge
Bias scoring such that known interactions are considered: Bayesian prior on network structure.
$$P(\Phi) = \prod_{i,j} P(\Phi_{ij}), \qquad P(\Phi_{ij} \mid \nu) = \frac{1}{2\nu} \exp\left(-\frac{|\Phi_{ij} - \hat{\Phi}_{ij}|}{\nu}\right)$$
Φ = signaling graph, $\hat{\Phi}$ = prior belief, ν = hyperparameter (scale) of the Laplace distribution.
The scale of the prior ν ranges from complete trust in the prior (small ν) to complete trust in the data (large ν).
Marginalizing over ν with $\nu \sim \mathrm{InvGamma}(1, 0.5)$:
$$P(\Phi_{ij}) = \int_0^{\infty} P(\Phi_{ij} \mid \nu)\, P(\nu)\, d\nu = \frac{1}{\left(1 + 2\,|\Phi_{ij} - \hat{\Phi}_{ij}|\right)^2}$$
Fröhlich et al., 2008
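The marginal prior obtained by integrating out ν can be checked numerically. The sketch below assumes that InvGamma(1, 0.5) means shape 1 and scale 0.5, and compares numerical integration of the Laplace density times the inverse gamma density against the closed form 1 / (1 + 2|Φij − Φ̂ij|)^2. Function names are hypothetical. The integrand is evaluated in log space to avoid overflow for very small ν.

```r
# Numerical marginal of the Laplace prior over nu ~ InvGamma(shape, scale).
# d = |Phi_ij - PhiHat_ij|; shape/scale reading of InvGamma(1, 0.5) is assumed.
prior_marginal_numeric <- function(d, shape = 1, scale = 0.5) {
  integrand <- function(nu) {
    log_laplace  <- -log(2 * nu) - d / nu
    log_invgamma <- shape * log(scale) - lgamma(shape) -
                    (shape + 1) * log(nu) - scale / nu
    exp(log_laplace + log_invgamma)
  }
  integrate(integrand, lower = 0, upper = Inf)$value
}

# Closed form for shape = 1, scale = 0.5.
prior_marginal_closed <- function(d) 1 / (1 + 2 * d)^2

d <- c(0, 0.25, 0.5, 1)
print(cbind(d,
            numeric = sapply(d, prior_marginal_numeric),
            closed  = prior_marginal_closed(d)))
```

The prior is largest when the candidate edge agrees with the prior belief (d = 0) and decays polynomially, so strong evidence in the data can still override it.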
22 Using Prior Knowledge with the nem-package
    control = set.default.parameters(unique(colnames(D)), selEGenes=TRUE, type="CONTmLLMAP", Pm=diag(4))
    mynem3 = nem(D, control=control, verbose=FALSE)
    plot.nem(mynem3, SCC=FALSE, D=D, draw.lines=TRUE)
23 Statistical Stability and Significance
How stable is the inferred network? Do small changes of E-genes lead to different network hypotheses?
- Use non-parametric bootstrap: repeatedly sample n E-genes with replacement.
Is the inferred network better than random?
- Randomly permute node labels and check whether the random network has a higher likelihood.
[Figure: example network over nodes P, Q, R, S with one edge labeled 0.7]
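The bootstrap idea can be sketched without the nem package: resample the effect genes with replacement, re-infer a graph each time, and report how often each edge appears. In this sketch `infer_subset_graph` is a deliberately simple toy stand-in for network inference (it is not the NEM score), the data matrix is hypothetical, and rows are effect genes while columns are perturbed genes.

```r
# Toy stand-in for network inference (NOT the nem scoring): draw t -> s when
# every effect gene that responds to knockdown s also responds to knockdown t.
infer_subset_graph <- function(D) {
  n <- ncol(D)
  adj <- matrix(0, n, n, dimnames = list(colnames(D), colnames(D)))
  for (t in seq_len(n)) for (s in seq_len(n)) {
    hit <- D[, s] == 1
    if (t != s && any(hit) && all(D[hit, t] == 1)) adj[t, s] <- 1
  }
  adj
}

# Non-parametric bootstrap over effect genes (rows of D).
bootstrap_edges <- function(D, infer, nboot = 100) {
  freq <- 0
  for (b in seq_len(nboot)) {
    rows <- sample(nrow(D), replace = TRUE)   # resample with replacement
    freq <- freq + infer(D[rows, , drop = FALSE])
  }
  freq / nboot                                # relative edge frequencies
}

# Hypothetical noise-free data for a chain A -> B -> C.
D <- rbind(c(1, 0, 0), c(1, 0, 0),
           c(1, 1, 0), c(1, 1, 0),
           c(1, 1, 1), c(1, 1, 1))
colnames(D) <- c("A", "B", "C")

set.seed(1)
edge_freq <- bootstrap_edges(D, infer_subset_graph, nboot = 200)
print(round(edge_freq, 2))
```

Edges that survive most resamples (frequencies near 1) are stable; an edge labeled with a value such as 0.7 in a bootstrap plot would be read the same way.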
26 Bootstrapping and Significance Calculation with the nem-package
    control = set.default.parameters(unique(colnames(D)), type="CONTmLLBayes", Pm=diag(4))
    mynem.boot = nem.bootstrap(D, nboot=100, control=control)
    plot.nem(mynem.boot, SCC=FALSE, plot.probs=TRUE)
    nem.calcSignificance(D, N=1000, mynem.boot)
p = (label permutation test)
27 Summary
- Inference of features of signaling pathways from high dimensional, targeted perturbation effects
- Different likelihood models: discretized data, p-value log-densities
- Algorithms for inference of large networks: module networks, triplets, greedy hill climbing, ...
- Possibility to integrate prior knowledge
- Automatic selection of relevant E-genes
- Various plotting and analysis methods: non-parametric bootstrap, label permutation p-values
28 Acknowledgements
Div. Molecular Genome Analysis, DKFZ
Bioinformatics:
- Tim Beißbarth
- Christian Bender
- Marc Johannes
Expression Profiling:
- Holger Sültmann
- Marc Fellmann
- Ruprecht Kuner
- Sabrina Belauger
Proteomics:
- Christian Löbke
- Özgür Sahin
- Dorit Arlt
More informationHow To Use CORREP to Estimate Multivariate Correlation and Statistical Inference Procedures
How To Use CORREP to Estimate Multivariate Correlation and Statistical Inference Procedures Dongxiao Zhu June 13, 2018 1 Introduction OMICS data are increasingly available to biomedical researchers, and
More informationBayesian Inference using Markov Chain Monte Carlo in Phylogenetic Studies
Bayesian Inference using Markov Chain Monte Carlo in Phylogenetic Studies 1 What is phylogeny? Essay written for the course in Markov Chains 2004 Torbjörn Karfunkel Phylogeny is the evolutionary development
More informationPermutation Test for Bayesian Variable Selection Method for Modelling Dose-Response Data Under Simple Order Restrictions
Permutation Test for Bayesian Variable Selection Method for Modelling -Response Data Under Simple Order Restrictions Martin Otava International Hexa-Symposium on Biostatistics, Bioinformatics, and Epidemiology
More informationBayesian Methods: Naïve Bayes
Bayesian Methods: aïve Bayes icholas Ruozzi University of Texas at Dallas based on the slides of Vibhav Gogate Last Time Parameter learning Learning the parameter of a simple coin flipping model Prior
More informationMCMC: Markov Chain Monte Carlo
I529: Machine Learning in Bioinformatics (Spring 2013) MCMC: Markov Chain Monte Carlo Yuzhen Ye School of Informatics and Computing Indiana University, Bloomington Spring 2013 Contents Review of Markov
More informationIntroduction to Machine Learning. Maximum Likelihood and Bayesian Inference. Lecturers: Eran Halperin, Yishay Mansour, Lior Wolf
1 Introduction to Machine Learning Maximum Likelihood and Bayesian Inference Lecturers: Eran Halperin, Yishay Mansour, Lior Wolf 2013-14 We know that X ~ B(n,p), but we do not know p. We get a random sample
More informationNetwork diffusion-based analysis of high-throughput data for the detection of differentially enriched modules
Network diffusion-based analysis of high-throughput data for the detection of differentially enriched modules Matteo Bersanelli 1+, Ettore Mosca 2+, Daniel Remondini 1, Gastone Castellani 1 and Luciano
More informationBayesian non-parametric model to longitudinally predict churn
Bayesian non-parametric model to longitudinally predict churn Bruno Scarpa Università di Padova Conference of European Statistics Stakeholders Methodologists, Producers and Users of European Statistics
More informationIntroduction to Bioinformatics
Systems biology Introduction to Bioinformatics Systems biology: modeling biological p Study of whole biological systems p Wholeness : Organization of dynamic interactions Different behaviour of the individual
More informationCOMP90051 Statistical Machine Learning
COMP90051 Statistical Machine Learning Semester 2, 2017 Lecturer: Trevor Cohn 2. Statistical Schools Adapted from slides by Ben Rubinstein Statistical Schools of Thought Remainder of lecture is to provide
More informationSpectral Alignment of Networks Soheil Feizi, Gerald Quon, Muriel Medard, Manolis Kellis, and Ali Jadbabaie
Computer Science and Artificial Intelligence Laboratory Technical Report MIT-CSAIL-TR-205-005 February 8, 205 Spectral Alignment of Networks Soheil Feizi, Gerald Quon, Muriel Medard, Manolis Kellis, and
More informationLecture 6: Graphical Models: Learning
Lecture 6: Graphical Models: Learning 4F13: Machine Learning Zoubin Ghahramani and Carl Edward Rasmussen Department of Engineering, University of Cambridge February 3rd, 2010 Ghahramani & Rasmussen (CUED)
More informationConcepts and Methods in Molecular Divergence Time Estimation
Concepts and Methods in Molecular Divergence Time Estimation 26 November 2012 Prashant P. Sharma American Museum of Natural History Overview 1. Why do we date trees? 2. The molecular clock 3. Local clocks
More informationTesting Statistical Hypotheses
E.L. Lehmann Joseph P. Romano Testing Statistical Hypotheses Third Edition 4y Springer Preface vii I Small-Sample Theory 1 1 The General Decision Problem 3 1.1 Statistical Inference and Statistical Decisions
More information