Health Informatics and Biostatistics: Biomarkers
1 Health Informatics and Biostatistics: Biomarkers. Antal Péter, Computational Biomedicine (Combine) workgroup, Department of Measurement and Information Systems, Budapest University of Technology and Economics
2 Overview
- Problems of discovering causal and diagnostic markers
- Genetic association studies
- Tumor marker research
- Aspects, types, and dimensions of biomarkers
- Probabilistic graphical models, causal Bayesian networks
- The value of further information
- Measures of diagnostic performance
- The statistical difficulty of biomarker learning
3 Genetic association studies (GAS). Participants provide clinical/demographic information and a quantitative/binary disease variable; genotyping/sequencing yields genetic variants, single nucleotide polymorphisms (SNPs). SNP_i vs. disease: a (pairwise) statistical test of association. Goal: SNPs relevant for a disease with a complex genetic background. Risch, N. and Merikangas, K. (1996). The future of genetic studies of complex human diseases. Science, 273(5281), pp. 1516-1517.
4 Genetic association data
- Validation SNPs
- Genome-wide polymorphism data (~1M SNPs, <$100)
- Imputed polymorphism data (~5M)
- Exome sequencing (1% of the whole genome, ×10k/×100k, $500-$1000)
- Whole-genome sequencing ($600-$2000)
5 Explained variance (R²): the missing heritability
Disease | Number of loci | Explained heritability | Heritability measure
Type 2 diabetes | 18 | 6% | Sibling recurrence
HDL cholesterol | — | —% | Phenotypic variance
Height | 40 | 5% | Phenotypic variance
Schizophrenia | 5 | 3% | Twin recurrence
6 Tumor markers. New omic markers: genomics, proteomics, metabolomics, ... "Missing the mark", Nature, 2007; "Missing the mark: Why is it so hard to find a test to predict cancer?", Nature, 2011.
7 Aspects of biomarkers: maximum predictivity with minimum redundancy; predictive power; directness; causality; multiple targets; uncertainty.
8 Questions about biomarkers
- Identification of a weakly significant biomarker: among a huge number of irrelevant factors, correction for multiple hypothesis testing
- Identification of weakly significant biomarkers: identification of interactions
- Identification of multitarget biomarkers: identification of a biomarker relevant for multiple aspects
- Identification of context-specific biomarkers: pleiotropic and epistatic interactions
- Identification of pure effect modifiers: biomarkers without main effects, parametrically and structurally (structural)
- Discrimination of direct/indirect biomarkers: strong/weak relevance (structural)
- Discrimination of diagnostic and target biomarkers: causes versus effects
- Estimation of effect size: adjusting for confounding
- Optimal selection of a diagnostic biomarker
- Optimal selection of a sequence of diagnostic biomarkers
9 Question types in medical decision support
- Diagnostic inference: P(Diagnosis | passive observations); the diagnosis with the smallest expected loss given passive observations
- Optimal information gathering: the effect of further information on the inference, P(Diagnosis | observations, new observation); the value of further information
- Therapeutic inference: P(Outcome | observation, intervention)
- Counterfactual inference: P(ImaginedOutcome | observation, intervention, outcome, imagined intervention)
10 Bayesian networks: interpretations. M_P = {I_P,1(X_1; Y_1 | Z_1), ...}; P(M,O,D,S,T) = P(M) P(O|M) P(D|O,M) P(S|D) P(T|S,M). 1. Causal model; 2. graphical representation of (in)dependencies; 3. concise representation of joint distributions; 4. decision network.
11 Motivation: from observational inference. In a Bayesian network, any query corresponding to passive observations can be answered: p(Q=q | E=e), the (conditional) probability of Q=q given that E=e. Note that Q can precede E temporally. X → Y. Specification: p(X), p(Y|X). Joint distribution: p(X,Y). Inferences: p(X), p(Y), p(Y|X), p(X|Y).
12 Motivation: to interventional inference. Perfect intervention: do(X=x), i.e., set X to x. What is the relation of p(Q=q | E=e) and p(Q=q | do(E=e))? X → Y. Specification: p(X), p(Y|X). Joint distribution: p(X,Y). Inferences: p(Y | X=x) = p(Y | do(X=x)), but p(X | Y=y) ≠ p(X | do(Y=y)). What is a formal knowledge representation of a causal model? What is the formal inference method?
13 Principles of causality:
- strong association;
- X precedes Y temporally;
- a plausible explanation, without alternative explanations based on confounding;
- necessity (generally: if the cause is removed, the effect is decreased; counterfactually: y would not have occurred with that much probability if x had not been present);
- sufficiency (generally: if exposure to the cause is increased, the effect is increased; counterfactually: y would have occurred with larger probability if x had been present);
- an autonomous, transportable mechanism.
The probabilistic definition of causation formalizes many of these aspects, but not, for example, the counterfactual ones.
14 Conditional independence. I_P(X;Y|Z), or (X ⊥ Y | Z)_P, denotes that X is independent of Y given Z: P(X,Y | z) = P(X | z) P(Y | z) for all z with P(z) > 0. (Almost) equivalently, I_P(X;Y|Z) iff P(X | z, y) = P(X | z) for all z, y with P(z, y) > 0. Other notation: dependence D_P(X;Y|Z) is defined as the negation of I_P(X;Y|Z). Contextual independence: independence that holds for some, but not all, values z.
15 The independence model of a distribution. The independence map (model) M_P of a distribution P is the set of the valid independence triplets: M_P = {I_P,1(X_1;Y_1|Z_1), ..., I_P,K(X_K;Y_K|Z_K)}. If P(X,Y,Z) is a Markov chain X → Y → Z, then M_P = {D(X;Y), D(Y;Z), I(X;Z|Y)}. Normally/almost always: D(X;Z); exceptionally: I(X;Z). (A numeric check follows below.)
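To make the chain example concrete, here is a minimal numeric sketch (parameters chosen arbitrarily for illustration, not from the lecture) verifying that X → Y → Z satisfies I(X;Z|Y) while X and Z stay marginally dependent:

```python
import numpy as np

p_x = np.array([0.3, 0.7])                    # p(X)
p_y_x = np.array([[0.9, 0.1],                 # p(Y | X=0)
                  [0.2, 0.8]])                # p(Y | X=1)
p_z_y = np.array([[0.6, 0.4],                 # p(Z | Y=0)
                  [0.1, 0.9]])                # p(Z | Y=1)

# Joint distribution of the chain: p(x,y,z) = p(x) p(y|x) p(z|y).
joint = p_x[:, None, None] * p_y_x[:, :, None] * p_z_y[None, :, :]

p_y = joint.sum(axis=(0, 2))                  # marginal p(y)
p_xz_y = joint / p_y[None, :, None]           # p(x,z | y)
p_x_y = joint.sum(axis=2) / p_y               # p(x | y)
p_z_giv_y = joint.sum(axis=0) / p_y[:, None]  # p(z | y)

# I(X;Z|Y): p(x,z|y) factors as p(x|y) p(z|y) for every y.
factored = p_x_y[:, :, None] * p_z_giv_y[None, :, :]
print("I(X;Z|Y):", np.allclose(p_xz_y, factored))   # True

# D(X;Z): marginally X and Z remain dependent for these parameters.
p_xz = joint.sum(axis=1)
p_z = joint.sum(axis=(0, 1))
print("I(X;Z):", np.allclose(p_xz, np.outer(p_x, p_z)))  # False
```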
16 The independence map of a naive Bayesian network. If P(X,Y,Z) is a naive Bayesian network with root Y (X ← Y → Z), then M_P = {D(X;Y), D(Y;Z), I(X;Z|Y)}. Normally/almost always: D(X;Z); exceptionally: I(X;Z).
17 Bayesian networks: the three facets. M_P = {I_P,1(X_1;Y_1|Z_1), ...}; P(M,O,D,S,T) = P(M) P(O|M) P(D|O,M) P(S|D) P(T|S,M). 1. Causal model; 2. graphical representation of (in)dependencies; 3. concise representation of joint distributions.
18 Inferring independencies from structure: d-separation. I_G(X;Y|Z) denotes that X is d-separated (directed-separated) from Y by Z in the directed graph G.
19 d-separation and the global Markov condition: if X and Y are d-separated by Z in G, then X is independent of Y given Z in every distribution P that factorizes according to G. (A testable sketch follows below.)
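The slide's figure is not preserved in this transcription; as a substitute, here is a self-contained sketch that tests d-separation via the classical moralization criterion (X and Y are d-separated by Z iff they are separated in the moralized ancestral graph of X ∪ Y ∪ Z). The graph encoding and function names are ours, not from the lecture:

```python
from collections import deque

def ancestors(dag, nodes):
    """All ancestors of `nodes` in `dag` (dict: node -> list of parents),
    including the nodes themselves."""
    result, stack = set(nodes), list(nodes)
    while stack:
        for p in dag[stack.pop()]:
            if p not in result:
                result.add(p)
                stack.append(p)
    return result

def d_separated(dag, xs, ys, zs):
    relevant = ancestors(dag, set(xs) | set(ys) | set(zs))
    # Moralize: undirected child-parent edges, plus "marry" co-parents.
    adj = {v: set() for v in relevant}
    for v in relevant:
        ps = [p for p in dag[v] if p in relevant]
        for p in ps:
            adj[v].add(p); adj[p].add(v)
        for i in range(len(ps)):
            for j in range(i + 1, len(ps)):
                adj[ps[i]].add(ps[j]); adj[ps[j]].add(ps[i])
    # Remove the conditioning set and test reachability with BFS.
    blocked = set(zs)
    seen = set(xs) - blocked
    queue = deque(seen)
    while queue:
        v = queue.popleft()
        if v in ys:
            return False
        for w in adj[v] - blocked - seen:
            seen.add(w); queue.append(w)
    return True

# v-structure X -> Z <- Y: X and Y are marginally d-separated,
# but conditioning on the collider Z connects them.
dag = {"X": [], "Y": [], "Z": ["X", "Y"]}
print(d_separated(dag, {"X"}, {"Y"}, set()))   # True
print(d_separated(dag, {"X"}, {"Y"}, {"Z"}))   # False
```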
20 Representation of independencies. For certain distributions an exact representation by a Bayesian network is not possible, e.g.:
1. intransitive Markov chain: X → Y → Z;
2. pure multivariate cause: {X, Z} → Y;
3. diamond structure: P(X,Y,Z,V) with M_P = {D(X;Z), D(X;Y), D(V;X), D(V;Z), I(V;Y | {X,Z}), I(X;Z | {V,Y}), ...}.
21 Markov blanket (and boundary). A Markov blanket of Y is a set of variables that renders Y conditionally independent of all remaining variables; a minimal such set is a Markov boundary.
22 The feature subset selection (FSS) problem. A feature X_i is strongly relevant if there exist x_i, y, and s_i = x_1, ..., x_{i-1}, x_{i+1}, ..., x_n such that p(x_i, s_i) > 0 and p(y | x_i, s_i) ≠ p(y | s_i). A feature X_i is weakly relevant if it is not strongly relevant, but there exist x_i, y, and some subset s_i' of s_i such that p(x_i, s_i') > 0 and p(y | x_i, s_i') ≠ p(y | s_i').
23 Biomarkers and the feature subset selection (FSS) problem
24 A Bayesian network definition. A directed acyclic graph (DAG) G is a Bayesian network of the distribution P(U) iff P(U) obeys the global Markov condition with respect to G, and G is minimal (i.e., no edge can be omitted without violating this property).
25 A practical definition
26 Association vs. causation. Causal models compatible with an observed dependency:
- X → Y: X causes Y
- X ← Y: Y causes X
- X ← * → Y: there is a common cause (pure confounding)
- X ← *...* → Y: the causal effect of Y on X is confounded by many factors
From passive observations only M_P = {D(X;Y)} and P(X,Y) are available: X and Y are associated. Reichenbach's Common Cause Principle: a correlation between events X and Y indicates either that X causes Y, or that Y causes X, or that X and Y have a common cause.
27 The building block of causality: the v-structure (arrow of time).
- p(X) p(Z|X) p(Y|Z): X → Z → Y
- p(X|Z) p(Z|Y) p(Y): X ← Z ← Y
- p(X|Z) p(Z) p(Y|Z): X ← Z → Y
These three share the transitive independence model M_P = {D(X;Z), D(Z;Y), D(X;Y), I(X;Y|Z)}.
- p(X) p(Z|X,Y) p(Y): X → Z ← Y, the v-structure, with the intransitive model M_P = {D(X;Z), D(Y;Z), I(X;Y), D(X;Y|Z)}.
Often (confounding): present knowledge renders (otherwise dependent) future states conditionally independent. Ever(?): present knowledge renders (otherwise independent) future states conditionally dependent. (A numeric illustration follows below.)
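A numeric illustration of the v-structure ("explaining away"), with arbitrary illustrative parameters: X and Y are marginally independent, yet become dependent once the common effect Z is observed:

```python
import numpy as np

p_x = np.array([0.5, 0.5])
p_y = np.array([0.5, 0.5])
# p(Z=1 | X, Y): rows indexed by x, columns by y.
p_z1 = np.array([[0.05, 0.6],
                 [0.6, 0.9]])

# Joint p(x, y, z) = p(x) p(y) p(z | x, y).
joint = np.empty((2, 2, 2))
joint[:, :, 1] = p_x[:, None] * p_y[None, :] * p_z1
joint[:, :, 0] = p_x[:, None] * p_y[None, :] * (1 - p_z1)

# Marginal independence: p(x, y) = p(x) p(y).
p_xy = joint.sum(axis=2)
print("I(X;Y):", np.allclose(p_xy, np.outer(p_x, p_y)))             # True

# Conditional dependence given Z=1: p(x,y|z=1) != p(x|z=1) p(y|z=1).
p_xy_z1 = joint[:, :, 1] / joint[:, :, 1].sum()
p_x_z1 = p_xy_z1.sum(axis=1)
p_y_z1 = p_xy_z1.sum(axis=0)
print("I(X;Y|Z=1):", np.allclose(p_xy_z1, np.outer(p_x_z1, p_y_z1)))  # False
```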
28 Observational equivalence of causal models
29 The limits of learnability: compelled edges (can we interpret edges as causal relations? Only the compelled edges, i.e., those oriented the same way in every member of the observational equivalence class.)
30 Interventional inference in causal Bayesian networks.
- (Passive, observational) inference: P(Query | observations)
- Interventional inference: P(Query | observations, interventions)
- Counterfactual inference: P(Query | observations, counterfactual conditionals)
31 Interventions and graph surgery. If G is a causal model, then compute p(y | do(X=x)) by: 1. deleting the incoming edges of X; 2. setting X = x; 3. performing standard Bayesian network inference. (Figure: a mutation/subpopulation/location/disease example, E → X? → Y with a hidden confounder; a numeric sketch follows below.)
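A minimal sketch of steps 1-3 on a hypothetical confounded model C → X, C → Y, X → Y (all names and numbers are illustrative, not from the lecture). Cutting the edge into X leaves p(C) untouched and yields the truncated factorization p(y | do(X=x)) = Σ_c p(c) p(y | x, c), which differs from the observational p(y | X=x) = Σ_c p(c | x) p(y | x, c):

```python
import numpy as np

p_c = np.array([0.4, 0.6])                    # p(C)
p_x_c = np.array([[0.8, 0.2],                 # p(X | C=0)
                  [0.3, 0.7]])                # p(X | C=1)
p_y_xc = np.array([[[0.9, 0.1], [0.5, 0.5]],  # p(Y | X=0, C=c)
                   [[0.6, 0.4], [0.2, 0.8]]]) # p(Y | X=1, C=c)

x = 1
# Observational: conditioning on X=x updates C via the Bayes rule.
p_c_given_x = p_c * p_x_c[:, x]
p_c_given_x /= p_c_given_x.sum()
p_y_obs = (p_c_given_x[:, None] * p_y_xc[x]).sum(axis=0)

# Interventional: delete C -> X (graph surgery), keep the prior p(C).
p_y_do = (p_c[:, None] * p_y_xc[x]).sum(axis=0)

print("p(Y | X=1)     =", p_y_obs)
print("p(Y | do(X=1)) =", p_y_do)             # differs due to confounding
```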
32 Local causal discovery: can we interpret edges as causal relations in the presence of hidden variables? Can we learn causal relations from observational data in the presence of confounders? (Figure: two competing explanations of the smoking-lung cancer association: smoking causes lung cancer, versus a genetic polymorphism that increases both the propensity to smoke and the susceptibility to lung cancer.) Automated, tabula rasa causal inference from (passive) observation is possible in some cases, i.e., hidden, confounding variables can be excluded.
33 Questions about biomarkers (recap of slide 8).
34 Sensitivity of the inference. (Plot: P(Pathology = malignant | E = e) as a function of the evidence e.)
35 Decision theory = probability theory + utility theory. A decision situation consists of: actions; outcomes; probabilities of outcomes; utilities/losses of outcomes (QALY, micromort). Maximum Expected Utility principle (MEU): the best action is the one with maximum expected utility, EU(a_i) = Σ_j p(o_j | a_i) U(o_j | a_i), a* = argmax_i EU(a_i). For experiment design: actions a_i (which experiment), outcomes o_j (e.g., a dataset), probabilities P(o_j | a_i), utilities and costs U(o_j), C(a_i), expected utilities EU(a_i) = Σ_j P(o_j | a_i) U(o_j).
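A toy instance of the MEU principle with made-up actions, probabilities, and utilities (none of these numbers come from the lecture):

```python
import numpy as np

actions = ["treat", "wait"]
utilities = np.array([100.0, -50.0])          # U(o_j) for outcomes [cured, worse]
p_outcome_given_action = np.array([
    [0.8, 0.2],                               # p(o_j | treat)
    [0.4, 0.6],                               # p(o_j | wait)
])

# EU(a_i) = sum_j p(o_j | a_i) U(o_j); pick a* = argmax_i EU(a_i).
expected_utility = p_outcome_given_action @ utilities
best = int(np.argmax(expected_utility))
print(dict(zip(actions, expected_utility)))   # {'treat': 70.0, 'wait': 10.0}
print("a* =", actions[best])                  # a* = treat
```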
36 Maximizing expected utility
39 Value of (perfect) information
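A minimal sketch of computing the value of perfect information for a hypothetical variable W, as VPI(W) = Σ_w p(w) max_a EU(a | w) - max_a EU(a); all numbers are illustrative, not from the lecture:

```python
import numpy as np

p_w = np.array([0.3, 0.7])                    # prior over W (disease yes/no)
# EU(a | w): rows = actions [treat, wait], columns = states of W.
eu_given_w = np.array([[80.0, -10.0],
                       [-40.0, 50.0]])

eu_prior = eu_given_w @ p_w                   # EU(a) without observing W
meu_without = eu_prior.max()                  # best fixed action
meu_with = (p_w * eu_given_w.max(axis=0)).sum()  # best action per value of W

print("MEU without W:", meu_without)          # 23.0
print("MEU with    W:", meu_with)             # 59.0
print("VPI(W)       :", meu_with - meu_without)  # 36.0
```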
41 Extensions: Bayesian learning (predictive inference, parametric inference); the value of further information; sequential decisions; optimal stopping (the secretary problem); the multi-armed bandit problem; Markov decision processes. (Figure: a decision-tree fragment with utilities U_i(e_i) and decisions D_ij.)
42 Characterizing a biomarker/test.
- Sensitivity: p(Prediction=TRUE | Ref=TRUE)
- Specificity: p(Prediction=FALSE | Ref=FALSE)
- PPV: p(Ref=TRUE | Prediction=TRUE)
- NPV: p(Ref=FALSE | Prediction=FALSE)
(Figure: overlapping score distributions of the healthy and diseased populations with a decision threshold t; a computational sketch follows below.)
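The four quantities can be computed directly from a thresholded score; the following sketch uses synthetic labels and scores (the distributions and the threshold t are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
ref = rng.random(1000) < 0.3                  # reference standard (TRUE = diseased)
scores = np.where(ref,
                  rng.normal(2.0, 1.0, 1000), # diseased score distribution
                  rng.normal(0.0, 1.0, 1000)) # healthy score distribution

t = 1.0
pred = scores > t                             # the test calls "disease" above t

tp = np.sum(pred & ref);   fn = np.sum(~pred & ref)
fp = np.sum(pred & ~ref);  tn = np.sum(~pred & ~ref)

print("sensitivity p(pred=T | ref=T):", tp / (tp + fn))
print("specificity p(pred=F | ref=F):", tn / (tn + fp))
print("PPV         p(ref=T | pred=T):", tp / (tp + fp))
print("NPV         p(ref=F | pred=F):", tn / (tn + fn))
```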
43 Questions about biomarkers (recap of slide 8).
44 Why can we learn? "The most incomprehensible thing about the world is that it is at all comprehensible." (Albert Einstein) "No theory of knowledge should attempt to explain why we are successful in our attempt to explain things." (K. R. Popper: Objective Knowledge, 1972) The possibility of learning is an empirical observation.
45 Principles for induction. Epicurus' (342?-270 B.C.) principle of multiple explanations states that one should keep all hypotheses that are consistent with the data. The principle of Occam's razor (sometimes spelt Ockham) states that when inferring causes, entities should not be multiplied beyond necessity. This is widely understood to mean: among all hypotheses consistent with the observations, choose the simplest. In terms of a prior distribution over hypotheses, this is the same as giving simpler hypotheses higher a priori probability and more complex ones lower probability.
46 Bayesian model averaging (Russell & Norvig: Artificial Intelligence, ch. 20)
47 Bayesian model averaging example (Russell & Norvig: Artificial Intelligence)
48 Learning rate for models (Russell & Norvig: Artificial Intelligence)
49 Learning rate for model predictions (Russell & Norvig: Artificial Intelligence)
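The referenced example is not reproduced in this transcription; as a substitute, here is a minimal sketch in the spirit of the Russell & Norvig candy-bag example (ch. 20): five hypotheses h_i about the fraction of lime candies, a prior over them, and a prediction obtained by averaging over the posterior rather than picking one model:

```python
import numpy as np

p_lime_given_h = np.array([0.0, 0.25, 0.5, 0.75, 1.0])  # theta_i of each h_i
prior = np.array([0.1, 0.2, 0.4, 0.2, 0.1])             # p(h_i)

posterior = prior.copy()
for _ in range(5):                      # observe 5 lime candies in a row
    posterior *= p_lime_given_h         # likelihood of one more lime under h_i
    posterior /= posterior.sum()        # renormalize p(h_i | d)

# BMA prediction: p(next = lime | d) = sum_i p(lime | h_i) p(h_i | d).
print("posterior:", posterior.round(3))
print("p(next = lime):", float(posterior @ p_lime_given_h))
```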
52 Probably Approximately Correct (PAC) learning. To have at least 1 − δ probability of approximate correctness, require |H| (1 − ε)^n ≤ δ. Expressing the sample size as a function of the accuracy ε and the confidence δ gives a bound on the sample complexity: n ≥ (1/ε)(ln |H| + ln(1/δ)).
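Evaluating the bound directly (the helper function is ours; the example hypothesis space is the decision-tree count from the slide below):

```python
import math

def pac_sample_size(h_size: int, epsilon: float, delta: float) -> int:
    """Samples sufficient so that, with probability >= 1 - delta, any
    hypothesis consistent with the data has error <= epsilon."""
    return math.ceil((math.log(h_size) + math.log(1.0 / delta)) / epsilon)

# E.g., decision trees over 6 Boolean attributes: |H| = 2**(2**6) = 2**64.
print(pac_sample_size(2 ** 64, epsilon=0.1, delta=0.05))  # 474
```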
53 Decision trees: one possible representation for hypotheses. E.g., the true tree for deciding whether to wait for a table. (Figure: the WillWait decision tree.)
54 Expressiveness. Decision trees can express any function of the input attributes. E.g., for Boolean functions: truth table row → path to leaf. Trivially, there is a consistent decision tree for any training set, with one path to a leaf for each example (unless f is nondeterministic in x), but it probably won't generalize to new examples. Prefer to find more compact decision trees.
55 Hypothesis spaces. How many distinct decision trees are there with n Boolean attributes? = the number of Boolean functions = the number of distinct truth tables with 2^n rows = 2^(2^n). E.g., with 6 Boolean attributes there are 2^64 = 18,446,744,073,709,551,616 trees. How many purely conjunctive hypotheses (e.g., Hungry ∧ ¬Rain)? Each attribute can be in (positive), in (negative), or out, giving 3^n distinct conjunctive hypotheses. A more expressive hypothesis space increases the chance that the target function can be expressed, but also increases the number of hypotheses consistent with the training set, so it may yield worse predictions.
56 Multiple testing problem (MTP). If we perform N tests and our goal is p(FalseRejection_1 or ... or FalseRejection_N) < α, then we have to ensure, e.g., that p(FalseRejection_i) < α/N for all i, at the cost of statistical power (the probability of discovering a true effect)! (A small simulation follows below.)
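A quick simulation of the problem and the α/N (Bonferroni) fix on pure-noise data, where every null hypothesis is true (requires numpy and scipy; all parameters are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
N, alpha, runs = 1000, 0.05, 200
fwer_raw = fwer_bonf = 0
for _ in range(runs):
    # N independent one-sample t-tests on pure noise (the null is true).
    data = rng.normal(size=(N, 30))
    p = stats.ttest_1samp(data, 0.0, axis=1).pvalue
    fwer_raw += (p < alpha).any()           # any false rejection, uncorrected
    fwer_bonf += (p < alpha / N).any()      # any false rejection, Bonferroni

print("FWER uncorrected:", fwer_raw / runs)   # close to 1
print("FWER Bonferroni :", fwer_bonf / runs)  # below alpha
```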
57 Solutions for the MTP: study design (incorporation of priors); corrections; permutation tests (generate perturbed data sets under the null hypothesis by permuting predictors and outcome); false discovery rate, q-value; Bayesian approaches.
58 Corrections for multiple testing
59 Corrections for multiple testing. I have 1,000,000 hypotheses that are not mutually exclusive.
1. I test them all. Correction?
2. I plan to test them all, but I run out of resources after testing only one of them. Correction?
3. I test one of them, and a year later test the others. Correction? If so, when?
4. I only test the first one, because that is the one I suspect. Correction?
5. I run an algorithm that prunes unlikely hypotheses, keeping only 100,000. Correction for 100,000 or for 1,000,000 hypotheses?
(R. Neapolitan, 2010)
60 Permutation testing: permute the outcome/target. (Data layout: outcome Y and predictor variables X_1, ..., X_n over the samples.) A random permutation guarantees the independence of the outcome Y from the predictors, so a random permutation corresponds to an artificial data set drawn from the null model. This yields a direct estimate of the p-value, the probability of observing a data set at least as extreme under the null model with the same sample size: p-value ≈ p(IncompatibilityWithNull(D_N^real) ≤ IncompatibilityWithNull(D_N^perm)).
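A direct implementation of this scheme on synthetic data, using absolute correlation as the incompatibility statistic (both the statistic choice and the data-generating process are ours, for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100
x = rng.normal(size=n)                        # predictor
y = 0.3 * x + rng.normal(size=n)              # outcome, weakly linked to x

def statistic(x, y):
    """Incompatibility with the null: absolute Pearson correlation."""
    return abs(np.corrcoef(x, y)[0, 1])

t_real = statistic(x, y)
n_perm = 10_000
t_perm = np.array([statistic(x, rng.permutation(y)) for _ in range(n_perm)])

# Add-one estimate of the p-value: fraction of permuted data sets at least
# as extreme as the real one.
p_value = (1 + np.sum(t_perm >= t_real)) / (1 + n_perm)
print("permutation p-value:", p_value)
```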
61 False discovery rate (FDR) I. Another aspect of multiple hypothesis testing: instead of the probability of a Type I error in any of the tests, control the expected proportion of Type I errors among the rejections at a given significance level (the false discovery rate, FDR). q-value: the minimum FDR at which the test may be called significant.
62 False discovery rate (FDR) II.
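The content of this slide is not preserved in the transcription; as a minimal sketch of FDR control, here is the Benjamini-Hochberg step-up procedure (our implementation, not necessarily the one shown in the lecture): reject the k smallest p-values, where k is the largest index with p_(k) ≤ k·q/N:

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.05):
    """Boolean mask of hypotheses rejected at FDR level q."""
    p = np.asarray(pvals)
    order = np.argsort(p)
    ranked = p[order]
    thresholds = q * np.arange(1, len(p) + 1) / len(p)   # k*q/N for each rank k
    below = np.nonzero(ranked <= thresholds)[0]
    rejected = np.zeros(len(p), dtype=bool)
    if below.size:
        rejected[order[: below[-1] + 1]] = True          # step-up: reject 1..k
    return rejected

pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.3, 0.7]
print(benjamini_hochberg(pvals, q=0.05))   # rejects only the two smallest
```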
63 Bayesian-network based Bayesian multilevel analysis (BN-BMLA). Hierarchical statistical questions about typed relevance can be translated to questions about Bayesian network structural features:
- pairwise association: Markov blanket memberships (MBM)
- multivariable analysis: Markov blanket sets (MB)
- multivariable analysis with interactions: Markov blanket subgraphs (MBG)
- complete dependency models: partially directed acyclic graphs (PDAG)
- complete causal models: Bayesian networks (BN)
Hierarchy of levels: BN → PDAG → MBG → MB → MBM.
64 Linked hypothesis classes at different levels of abstraction. The questions of association studies can be formalized via structural features of Bayesian networks:
- pairwise strong relevance: Markov blanket memberships (MBM)
- multivariate strong relevance: Markov blanket sets (k-sub/sup-MBS)
- multifactorial interaction subgraph: Markov blanket subgraphs (MBG, C-RPDAG)
- complete interaction model: partially directed Bayesian network (PDAG)
- complete causal model: Bayesian network (BN)
Linked abstraction levels: DAG ⇒ PDAG ⇒ MBG ⇒ C-RPDAG ⇒ MB ⇒ MBM.