Ockham Efficiency Theorem for Randomized Scientific Methods


Conor Mayo-Wilson and Kevin T. Kelly
Department of Philosophy, Carnegie Mellon University
Formal Epistemology Workshop (FEW), June 19th

Point of the Talk

Ockham efficiency theorem: Ockham's razor has been explained in terms of minimizing retractions en route to the truth, relative to all deterministic scientific strategies (Kevin T. Kelly and Oliver Schulte).

Extension: Ockham's deterministic razor minimizes retractions en route to the truth, relative to a broad class of random scientific strategies.

Further significance: extending the argument to expected retractions is a necessary step toward lifting the idea to a theory of statistical theory choice.

Outline

1. Puzzle of Simplicity
2. Standard Explanations
3. Simplicity
4. Examples
5. Mixed Strategies and Ockham's Razor

The Puzzle of Simplicity

Ockham's razor is indispensable in scientific inference, and inference should be truth-conducive. But how could a fixed bias toward simplicity be said to help one find possibly complex truths?

Standard Explanations

Methodological virtues: simpler theories are more testable, more explanatory, or otherwise more virtuous.

Response: wishful thinking. Desiring that the truth be virtuous doesn't make it so.

Standard Explanations

Confirmation: simple theories are better confirmed by simple data:

p(T_S | E) / p(T_C | E) > p(T_S) / p(T_C).

Responses:
- Simple data are compatible with complex hypotheses.
- There is no clear connection with finding the truth in the short run.

Standard Explanations

Over-fitting: even if the truth is complex, simple theories improve overall predictive accuracy by trading variance for bias.

Responses:
- The underlying decision theory is unclear: the worst-case solution is the most complex theory.
- The over-fitting account ties Ockham's razor to choices among stochastic theories.
- Prediction must be understood so as to rule out counterfactual or causal predictions.

Standard Explanations

Convergence: even if the truth is complex, complexities in the data will eventually overturn over-simplified theories.

Response: any theory choice in the short run is compatible with finding the true theory in the long run.

A New Approach

Deduction: sound; monotone.

Induction:
- Approximate soundness: converge to the truth with minimal errors.
- Approximate monotonicity: minimize retractions.

Monotonicity for Bayesians

[Figure in the original slides.]

Monotonicity in Belief Revision

Retractions = total number of times B_{n+1} ⊉ B_n = total number of non-expansive belief revisions.
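
Concretely, with belief states modeled as plain sets of propositions, the retraction count is (a minimal sketch, assuming that model):

```python
def count_retractions(belief_states):
    """Count revisions B_n -> B_{n+1} where B_{n+1} is not a superset of B_n."""
    return sum(
        1
        for b, b_next in zip(belief_states, belief_states[1:])
        if not b_next >= b  # non-expansive: some prior belief was given up
    )

# Example: the agent adds a belief, then retracts one, then adds again.
history = [{"p"}, {"p", "q"}, {"q"}, {"q", "r"}]
assert count_retractions(history) == 1  # only {"p","q"} -> {"q"} retracts
```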

Main Idea

Ockham's Razor = Closest Inductive Approximation to Deduction

- Ockham's razor converges to the truth with minimal retractions and errors, and minimal elapsed time to retractions and to errors.
- No other convergent method does.
- No circular appeal to prior simplicity biases is made.
- No awkward trade-offs between costs are required.

Empirical Problems and Simplicity

Empirical effects:
- Recognizable eventually.
- Arbitrarily subtle, so they may take arbitrarily long to be noticed.
- Each theory predicts finitely many.

First-order effect = nonconstant; second-order effect = nonlinear.

Empirical Problems and Simplicity

Empirical problems:
- Background knowledge K picks out a set of possible, finite effect sets.
- Theory T_S says that exactly the effects in S will be seen in the unbounded future.
- A world of experience w is an infinite sequence that presents some finite set of effects at each stage and that presents some set S ∈ K in the limit.
- The empirical problem corresponding to K is to determine which theory in {T_S : S ∈ K} is true of the actual world of experience w.
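
To fix ideas, here is a toy rendering of this setup in Python (the particular effect names and sets are my illustration, not from the talk):

```python
# Background knowledge K: a collection of finite effect sets; theory T_S says
# exactly the effects in S appear in the limit; a world presents a growing
# finite set of effects at each stage.

K = [frozenset(), frozenset({"e1"}), frozenset({"e1", "e2"})]

def theory(S):
    """T_S is true in world w iff w presents exactly the effects in S in the limit."""
    return lambda limit_effects: frozenset(limit_effects) == S

def world(stages):
    """Yield the cumulative finite effect set presented at each stage."""
    seen = set()
    for new_effects in stages:
        seen |= set(new_effects)
        yield frozenset(seen)

# A world prefix in which effect e1 appears at stage 2 and e2 at stage 5.
prefix = list(world([[], [], ["e1"], [], [], ["e2"], []]))
assert prefix[-1] == frozenset({"e1", "e2"})
assert theory(K[2])(prefix[-1])
```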

Empirical Problems and Simplicity

What simplicity is: the empirical complexity of a world of experience w is the length of the longest effect path in K to the effect set S_w presented by w.

[Figure: a ladder of complexity levels: non-constant, non-linear, non-quadratic, ...]
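
Under that definition, empirical complexity is a longest-chain computation over the effect sets in K; a minimal sketch, assuming K is finite and given explicitly:

```python
from functools import lru_cache

def empirical_complexity(K, S_w):
    """Length of the longest strictly increasing chain of effect sets in K ending at S_w."""
    sets = [frozenset(s) for s in K]

    @lru_cache(maxsize=None)
    def longest_path_to(target):
        preds = [s for s in sets if s < target]  # effect sets strictly below target
        return 1 + max(map(longest_path_to, preds)) if preds else 0

    return longest_path_to(frozenset(S_w))

K = [set(), {"nonconstant"}, {"nonconstant", "nonlinear"}]
assert empirical_complexity(K, set()) == 0                          # simplest world
assert empirical_complexity(K, {"nonconstant", "nonlinear"}) == 2   # two effects deep
```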

Empirical Problems and Simplicity

What simplicity isn't:
- Notational or computational brevity (MDL).
- A question-begging rescaling of prior probability using ln(x) (MML).
- Free parameters or dimensionality (AIC).

Three Paradigmatic Examples

- Linearly ordered complexity with refutation: curve fitting (Kelly [2004]).
- Partially ordered complexity with refutation: causal inference (Schulte, Luo, and Greiner [2007]; Kelly and Mayo-Wilson [2008]).
- Partially ordered complexity without refutation: orientation of a causal edge (Kelly and Mayo-Wilson [2008]).

Linearly Ordered Simplicity Structure

Curve fitting is an instance of a more general type of problem.

Problem: choosing amongst theories that are linearly ordered in terms of complexity.

Evidence: suppose that any false, simple theory is refuted in some finite amount of time.


Linearly Ordered Simplicity Structure

[Figure: a consistent, stalwart, Ockham method climbs the ladder T_0, T_1, ..., T_4 one step at a time; an Ockham violator that leaps ahead commits an extra retraction when the data pull it back.]

Branching Simplicity Structure

[Figure: the same comparison over a branching order of theories T_0, ..., T_4; the Ockham violator again pays extra retractions.]

Branching Simplicity Structure

[Figure: a branching simplicity structure whose nodes are causal graphs over the variables X, Y, Z.]

Discovering Causal Networks

Problem: choose a causal network describing the causal relationships amongst a set of variables.

Evidence (effects): probabilistic dependencies amongst the variables, discovered over time.

Complexity = greater number of edges.
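
As a crude, purely illustrative rendering of "complexity = number of edges" (not the discovery algorithm discussed in the talk), one can enumerate undirected skeletons and keep the sparsest ones covering the dependencies observed so far:

```python
from itertools import combinations

def ockham_skeletons(variables, observed_dependencies):
    """Minimum-edge undirected skeletons covering every observed dependency."""
    all_edges = list(combinations(sorted(variables), 2))
    required = {tuple(sorted(d)) for d in observed_dependencies}
    candidates = [
        set(edges)
        for r in range(len(all_edges) + 1)
        for edges in combinations(all_edges, r)
        if required <= set(edges)
    ]
    fewest = min(len(c) for c in candidates)
    return [c for c in candidates if len(c) == fewest]

# With only the dependency X--Y observed, the simplest skeleton posits X--Y alone.
print(ockham_skeletons({"X", "Y", "Z"}, [("X", "Y")]))  # [{('X', 'Y')}]
```

In this toy model the sparsest cover is simply the set of observed dependencies themselves, which mirrors the Ockham policy of positing no more edges than the data force.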

Discovering Causal Networks

[Figures: candidate causal networks over X, Y, Z at successive stages of inquiry, winnowed as dependencies are discovered; the last panel marks the remaining choice with '?'.]

Deterministic Theory Choice Methods

Given an arbitrary, finite initial segment e of a world of experience, a deterministic theory choice method produces a unique theory T_S, or ? indicating refusal to choose.
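
As a minimal sketch (my illustration, not the authors' formal construction), such a method is just a function from the effects seen so far to a theory or ?; here cardinality of the effect set stands in for simplicity, which suffices in these toy examples:

```python
def ockham_method(K, effects_seen):
    """Conjecture T_S for the simplest S in K compatible with the data, else '?'."""
    compatible = [S for S in K if set(effects_seen) <= set(S)]
    fewest = min(len(S) for S in compatible)
    simplest = [S for S in compatible if len(S) == fewest]
    return simplest[0] if len(simplest) == 1 else "?"

K = [frozenset(), frozenset({"e1"}), frozenset({"e1", "e2"})]
assert ockham_method(K, []) == frozenset()            # posit no effects yet
assert ockham_method(K, ["e1"]) == frozenset({"e1"})  # forced up one level
```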

Methodological Properties

Say a method M is:
- convergent if, for any world w, M eventually produces the theory true in w;
- Ockham if it never says any non-simple theory (relative to the evidence);
- stalwart if, whenever it endorses the simplest theory, it continues to do so until that theory is no longer simplest;
- eventually informative if, in any world w, there is some point of inquiry n after which M never says ?.

Ockham's Razor in Probability

Say a method M is normally Ockham if it is Ockham, stalwart, and eventually informative.

Ockham's Razor

Modulo some minor assumptions on the simplicity structure, which all of the above examples satisfy:

Theorem (Efficiency Theorem). Let M be a normal Ockham method, and let M′ be any convergent method. Suppose M and M′ agree along some finite initial segment of experience e, and that M′ violates Ockham's razor at e. Then M′ commits strictly more retractions (in the worst case) in every complexity class with respect to e.
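
The flavor of this comparison can be checked in a self-contained toy with two theories: a violator that leaps to the complex theory before the data force it, but still converges, pays one extra retraction in each complexity class. Everything below is illustrative:

```python
K = [frozenset(), frozenset({"e1"})]

def ockham_method(effects_seen):
    compatible = [S for S in K if set(effects_seen) <= S]
    return min(compatible, key=len)  # simplest compatible theory (unique here)

def violator(effects_seen, stage):
    # Ockham violation: endorse the complex theory before the data force it,
    # then fall back to the Ockham method so the violator still converges.
    return frozenset({"e1"}) if stage < 3 else ockham_method(effects_seen)

def retractions(outputs):
    return sum(1 for a, b in zip(outputs, outputs[1:]) if a != b)

def run(method, world_prefix):
    seen, outputs = set(), []
    for stage, new_effects in enumerate(world_prefix):
        seen |= set(new_effects)
        outputs.append(method(sorted(seen), stage))
    return outputs

ockham = lambda e, stage: ockham_method(e)
w0 = [[]] * 6                          # empirical complexity 0: no effects ever
w1 = [[], [], [], [], ["e1"], []]      # empirical complexity 1: e1 appears late
for w in (w0, w1):
    assert retractions(run(violator, w)) == retractions(run(ockham, w)) + 1
```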

Mixed Strategies in Decision and Game Theory

Randomization in decision theory and games: e.g., rock-paper-scissors, matching pennies.

Can randomization in scientific inquiry improve expected errors and retractions?

Randomized Strategies

Randomized methods: machines (formally, discrete-state stochastic processes) for selecting theories from data.

- Outputs of the machine are a function of (i) its current state and (ii) its total input.
- States of the machine evolve according to a random process; i.e., future and past states may be correlated to any degree. Independence is not assumed! No assumptions about the process being Markov, etc.
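
A minimal sketch of that interface, with a deliberately non-Markov state update (the particular update rule is arbitrary and mine):

```python
import random

class RandomizedMethod:
    """Sketch: output depends on the current random state and the total input."""

    def __init__(self, seed=None):
        self.rng = random.Random(seed)
        self.state_history = []          # the whole past may shape the future

    def step(self, effects_seen):
        # Non-Markov state evolution: the update depends on the entire history
        # of states (here: hedge at most twice, ever), not just the last state.
        hedging = self.state_history.count("hedge") < 2 and self.rng.random() < 0.5
        state = "hedge" if hedging else "commit"
        self.state_history.append(state)
        # Output is a function of (i) current state and (ii) total input:
        return "?" if state == "hedge" else frozenset(effects_seen)

m = RandomizedMethod(seed=1)
print([m.step(["e1"]) for _ in range(5)])  # at most two "?" answers, then commits
```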

Costs and Randomized Strategies

Randomized methods ought to be as deductive as possible:
- Approximate soundness: convergence in probability; minimization of expected errors.
- Approximate monotonicity: minimization of expected retractions.
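
Expected retractions for such a machine can be estimated by plain Monte Carlo simulation; the following self-contained toy counts only theory-to-theory changes (the full framework also counts retreats to ?):

```python
import random

def retractions(outputs):
    answers = [o for o in outputs if o != "?"]
    return sum(1 for a, b in zip(answers, answers[1:]) if a != b)

def simulate(world_prefix, rng):
    seen, outputs = set(), []
    for new_effects in world_prefix:
        seen |= set(new_effects)
        if not seen and rng.random() < 0.2:
            outputs.append("?")              # occasionally hedge early on
        else:
            outputs.append(frozenset(seen))  # conjecture exactly what was seen
    return outputs

def expected_retractions(world_prefix, trials=10_000, seed=0):
    rng = random.Random(seed)
    total = sum(retractions(simulate(world_prefix, rng)) for _ in range(trials))
    return total / trials

world = [[], [], ["e1"], [], ["e2"], []]
# Slightly below 2.0: one retraction per newly revealed effect, except on the
# rare runs where the method said "?" at both effect-free opening stages.
print(expected_retractions(world))
```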

Retractions in Chance and Expected Retractions

[Figures illustrating retractions in chance and expected retractions.]

Methodological Properties

Say a randomized method M is:
- Ockham if it never says any non-simple theory with probability greater than zero;
- stalwart if, whenever it endorses the simplest theory (with any positive probability), it continues to do so with unit probability until that theory is no longer the Ockham answer;
- convergent in probability if, for any world w, the probability that M produces the theory true in w approaches 1 as time elapses.

Ockham's Razor in Probability

Say a method M is normally Ockham if it is Ockham, stalwart, and convergent in probability.

Generalized Efficiency Theorem

Suppose the simplicity structure has no short paths, and let M be a randomized or deterministic method such that:
1. M first violates Ockham's razor after some initial segment of evidence e;
2. M is convergent in probability.
Then, in comparison to any normal Ockham method (deterministic or not!), M accrues a strictly greater number of retractions in every complexity class with respect to e.

References

- E. Mark Gold. Language identification in the limit. Information and Control, 1967.
- Kevin Kelly. The Logic of Reliable Inquiry. Oxford University Press, 1996.
- Oliver Schulte. Inferring conservation laws in particle physics: a case study in the problem of induction. The British Journal for the Philosophy of Science, 2000.
- Oliver Schulte, W. Luo, and R. Greiner. Mind change optimal learning of Bayes net structure. In 20th Annual Conference on Learning Theory (COLT), San Diego, CA, June 12-15, 2007.

Branching Simplicity Structure

[Figure: when the simplicity structure contains short paths, the Ockham violator is merely as bad as the consistent, stalwart, Ockham method in the complexity classes reached via the short paths, and strictly worse in only one complexity class; hence the "no short paths" assumption in the theorem.]

Stochastic Processes

Definition. Let T, Ω, Σ be arbitrary sets. A stochastic process is a quadruple Q = (T, (Ω, D, p), (Σ, S), X), where:
1. T is a set called the index set of the process;
2. (Ω, D, p) is a (countably additive) probability space;
3. (Σ, S) is a measurable space of possible values of the process;
4. X : T × Ω → Σ is such that for each fixed t ∈ T, the function X_t is D/S-measurable.

Stochastic Processes

Definition. Let F be the set of finite initial segments of data streams, and let Ans contain the set of all theories together with ?, representing "I don't know." A stochastic empirical method is a triple M = (Q, α, σ_0), where:
1. Q = (F, (Ω, D, p), (Σ, S), X) is a stochastic process indexed by the set F of all finite initial segments of data streams;
2. σ_0 ∈ Σ is the initial state of the method (i.e., it satisfies X_{()}^{-1}(σ_0) = Ω, where () is the empty data sequence);
3. α : F × Σ → Ans is such that for each e ∈ F, α_e is S/2^Ans-measurable.
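
Read as code, the definition amounts to a state process indexed by finite evidence sequences plus an output map α; this dataclass sketch (names mine) records that shape:

```python
from dataclasses import dataclass
from typing import Callable, Hashable

Evidence = tuple    # a finite initial segment of a data stream
State = Hashable
Answer = object     # a theory, or "?" for "I don't know"

@dataclass
class StochasticEmpiricalMethod:
    sample_state: Callable[[Evidence], State]    # simulates drawing X_e
    initial_state: State                         # sigma_0, the state at ()
    output: Callable[[Evidence, State], Answer]  # the map alpha

    def answer(self, evidence: Evidence) -> Answer:
        state = self.initial_state if evidence == () else self.sample_state(evidence)
        return self.output(evidence, state)
```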
