Ockham Efficiency Theorem for Randomized Scientific Methods
1 Ockham Efficiency Theorem for Randomized Scientific Methods. Conor Mayo-Wilson and Kevin T. Kelly, Department of Philosophy, Carnegie Mellon University. Formal Epistemology Workshop (FEW), June 19th. 1 / 39
4 Point of the Talk Ockham efficiency theorem: Ockham's razor has been explained in terms of minimizing retractions en route to the truth, relative to all deterministic scientific strategies (Kevin T. Kelly and Oliver Schulte). Extension: Ockham's deterministic razor minimizes retractions en route to the truth, relative to a broad class of random scientific strategies. Further significance: Extending the argument to expected retractions is a necessary step for lifting the idea to a theory of statistical theory choice. 2 / 39
5 Outline 1 Puzzle of Simplicity 2 Standard Explanations 3 Simplicity 4 Examples 5 Mixed Strategies and Ockham's Razor 3 / 39
6 The Puzzle of Simplicity Ockham's Razor is indispensable in scientific inference. Inference should be truth-conducive. But how could a fixed bias toward simplicity be said to help one find possibly complex truths? 4 / 39
8 Methodological virtues: Simpler theories are more testable or explanatory or are otherwise more virtuous. Response: wishful thinking; desiring that the truth be virtuous doesn't make it so. 5 / 39
11 Standard Explanations Confirmation: Simple theories are better confirmed by simple data: p(T_S | E) / p(T_C | E) > p(T_S) / p(T_C). Responses: Simple data are compatible with complex hypotheses. No clear connection with finding the truth in the short run. 6 / 39
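The confirmation claim can be checked numerically. Below is a minimal Python sketch (not from the talk; the likelihood values are illustrative assumptions) showing that when the simple theory assigns the simple data a higher likelihood, the posterior ratio exceeds the prior ratio:

```python
# Priors over the simple theory T_S and the complex theory T_C
# (the specific numbers are illustrative assumptions).
p_TS, p_TC = 0.5, 0.5

# The simple theory predicts the simple data E outright; the complex
# theory spreads its probability over many possible data sequences.
p_E_given_TS, p_E_given_TC = 1.0, 0.5

# Bayes' theorem: posterior = prior * likelihood / evidence.
p_E = p_TS * p_E_given_TS + p_TC * p_E_given_TC
post_TS = p_TS * p_E_given_TS / p_E
post_TC = p_TC * p_E_given_TC / p_E

print(post_TS / post_TC > p_TS / p_TC)  # True: simple data raise the ratio
```

The response on the slide still applies: the complex theory is not refuted by the simple data, so the boost says nothing about finding the truth in the short run.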
15 Standard Explanations Over-fitting: Even if the truth is complex, simple theories improve overall predictive accuracy by trading variance for bias. Responses: The underlying decision theory is unclear: the worst-case solution is the most complex theory. The over-fitting account ties Ockham's razor to choices among stochastic theories. Prediction must be understood so as to rule out counterfactual or causal predictions. 7 / 39
17 Standard Explanations Convergence: Even if the truth is complex, complexities in the data will eventually overturn over-simplified theories. Responses: Any theory choice in the short run is compatible with finding the true theory in the long run. 8 / 39
19 A New Approach Deduction: Sound; Monotone. Induction: Approximate Soundness: converge to truth with minimal errors; Approximate Monotonicity: minimize retractions. 9 / 39
20 Monotonicity for Bayesians 10 / 39
21 Monotonicity in Belief Revision Total number of times B_{n+1} ⊉ B_n = total number of non-expansive belief revisions. 11 / 39
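The retraction count can be made concrete. A minimal Python sketch (illustrative, not from the talk), treating belief states as sets of propositions and counting the stages at which the agent gives up something previously believed:

```python
def count_retractions(beliefs):
    """Count revisions in which the agent gives up some prior belief,
    i.e. stages n where B_{n+1} is not a superset of B_n."""
    return sum(
        1
        for prev, curr in zip(beliefs, beliefs[1:])
        if not curr >= prev  # curr >= prev means B_{n+1} contains all of B_n
    )

# A run that expands, retracts once (drops "p"), then expands again.
history = [set(), {"p"}, {"p", "q"}, {"q"}, {"q", "r"}]
print(count_retractions(history))  # 1
```

A perfectly monotone (deductive) inquirer would score 0; inductive methods can at best minimize this count.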
26 Main Idea Ockham's Razor = Closest Inductive Approximation to Deduction. Ockham's razor converges to the truth with minimal retractions and errors and minimal elapsed time to retractions and errors. No other convergent method does. No circular appeal to prior simplicity biases. No awkward trade-offs between costs are required. 12 / 39
27 Empirical Problems and Simplicity Empirical Effects: Recognizable eventually. Arbitrarily subtle, so may take arbitrarily long to be noticed. Each theory predicts finitely many. First-order effect = nonconstant; second-order effect = nonlinear. 13 / 39
28 Empirical Problems and Simplicity Empirical Problems: Background knowledge K picks out a set of possible, finite effect sets. Theory T_S says that exactly the effects in S will be seen in the unbounded future. A world of experience w is an infinite sequence that presents some finite set of effects at each stage and that presents some set S ∈ K in the limit. The empirical problem corresponding to K is to determine which theory in {T_S : S ∈ K} is true of the actual world of experience w. 14 / 39
29 Empirical Problems and Simplicity What simplicity is: The empirical complexity of a world of experience w is the length of the longest effect path in K to the effect set S_w presented by w. (Diagram: effect path non-constant → non-linear → non-quadratic → ...) 15 / 39
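This definition of empirical complexity is directly computable. A minimal Python sketch (illustrative representation, not the authors' formalism): background knowledge K is a finite collection of effect sets, and the complexity of S_w is the length of the longest chain of effect sets in K, ordered by strict inclusion, leading up to S_w:

```python
from functools import lru_cache

def empirical_complexity(K, S_w):
    """Length of the longest strictly increasing chain of effect sets in K
    (ordered by set inclusion) ending at S_w."""
    K = frozenset(frozenset(S) for S in K)

    @lru_cache(maxsize=None)
    def longest_to(S):
        below = [T for T in K if T < S]  # strict subsets of S within K
        return 0 if not below else 1 + max(longest_to(T) for T in below)

    return longest_to(frozenset(S_w))

# Linearly ordered example: constant < linear < quadratic curves.
K = [set(), {"nonconstant"}, {"nonconstant", "nonlinear"}]
print(empirical_complexity(K, {"nonconstant", "nonlinear"}))  # 2
```

A world presenting no effects has complexity 0, matching the intuition that the constant hypothesis is simplest.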
30 Empirical Problems and Simplicity What simplicity isn't: notational or computational brevity (MDL); a question-begging rescaling of prior probability using ln(x) (MML); free parameters or dimensionality (AIC). 16 / 39
33 Three Paradigmatic Examples Linearly ordered complexity with refutation: Curve Fitting Kelly [2004] Partially ordered complexity with refutation: Causal Inference Schulte, Luo and Greiner [2007], Kelly and Mayo-Wilson [2008] Partially ordered complexity without refutation: Orientation of Causal Edge Kelly and Mayo-Wilson [2008] 17 / 39
34 Linearly Ordered Simplicity Structure Curve fitting is an instance of a more general type of problem. Problem: Choosing amongst theories that are linearly ordered in terms of complexity. Evidence: Suppose that any false, simple theory is refuted in some finite amount of time. 18 / 39
35 Linearly Ordered Simplicity Structure 19 / 39
38 Linearly Ordered Simplicity Structure (Figure: two retraction paths through theories T_0, ..., T_4. Left: the consistent, stalwart, Ockham method retracts only when the data force it. Right: the Ockham violator's premature jump costs an extra retraction.) 20 / 39
39 Branching Simplicity Structure (Figure: retraction paths through theories T_0, ..., T_4 in a branching simplicity order, comparing a consistent, stalwart, Ockham method with an Ockham violator.) 21 / 39
40 Branching Simplicity Structure (Figure: candidate causal graphs over the variables X, Y, Z.) 22 / 39
43 Discovering Causal Networks Problem: Choose a causal network describing the causal relationships amongst a set of variables. Evidence (Effects): probabilistic dependencies amongst the variables discovered over time. Complexity = greater number of edges 23 / 39
44 Discovering Causal Networks (Figures: as probabilistic dependencies among X, Y, Z accumulate, candidate causal networks are successively eliminated; some edge orientations remain undetermined, marked '?'.) 24 / 39
47 Deterministic Theory Choice Methods Given an arbitrary, finite initial segment e of a world of experience, a deterministic theory choice method produces a unique theory T_S or '?', indicating refusal to choose. 25 / 39
51 Methodological Properties Say a method M is convergent if, for any world w, M eventually produces the true theory in w; Ockham if it never says any non-simple theory (relative to evidence); stalwart if, whenever it endorses the simplest theory, it continues to do so until it is no longer simplest; eventually informative if, in any world w, there is some point of inquiry n after which M never says '?' in w. 26 / 39
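For the linearly ordered case, a consistent, stalwart, Ockham method is easy to sketch. A toy Python illustration (hypothetical effect names; not the authors' formal construction): conjecture T_n, where n is the number of distinct effects seen so far, i.e. the simplest theory consistent with the data:

```python
def ockham_method(evidence):
    """Given a finite initial segment of experience (a list of stages,
    each a set of newly observed effects), conjecture T_n for the number
    n of distinct effects seen so far. The method is Ockham (it never
    overshoots the data) and stalwart (it changes its answer only when a
    new effect refutes the current conjecture)."""
    seen = set()
    for stage in evidence:
        seen |= stage
    return f"T_{len(seen)}"

# Successive conjectures along one world of experience.
stream = [set(), {"e1"}, set(), {"e2"}, set()]
answers = [ockham_method(stream[: i + 1]) for i in range(len(stream))]
print(answers)  # ['T_0', 'T_1', 'T_1', 'T_2', 'T_2']
```

Both retractions here are forced by the data; no convergent method can avoid them in the worst case.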
52 Ockham's Razor in Probability Say a method M is normally Ockham if it is Ockham, stalwart, and eventually informative. 27 / 39
53 Ockham's Razor Modulo some minor assumptions on the simplicity structure, which all of the above examples satisfy: Theorem (Efficiency Theorem). Let M be a normal Ockham method, and let M′ be any convergent method. Suppose M and M′ agree along some finite initial segment of experience e, and that M′ violates Ockham's razor at e. Then M′ commits strictly more retractions (in the worst case) in every complexity class with respect to e. 28 / 39
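The worst-case intuition behind the theorem can be simulated. In this toy Python sketch (illustrative only, not the proof), nature presents effects so that a convergent method that prematurely overshoots to T_2 is forced into an extra retraction relative to the Ockham method:

```python
def count_retractions(outputs):
    # Replacing one conjecture with another retracts the old one.
    return sum(1 for a, b in zip(outputs, outputs[1:]) if a != b)

def run(method, world, horizon):
    """Feed the method a growing record of effects; world maps a stage
    to the set of effects first presented at that stage."""
    effects, outputs = set(), []
    for t in range(horizon):
        effects |= world.get(t, set())
        outputs.append(method(frozenset(effects), t))
    return outputs

# Nature's strategy: one effect early, a second effect only after the
# violator has been starved into retracting its premature conjecture.
world = {1: {"e1"}, 8: {"e2"}}

ockham_outs = run(lambda eff, t: f"T_{len(eff)}", world, 12)

def violator(eff, t):
    if len(eff) == 1:
        return "T_2" if t < 6 else "T_1"  # overshoots, then must back off
    return f"T_{len(eff)}"

violator_outs = run(violator, world, 12)
print(count_retractions(ockham_outs))    # 2: T_0 -> T_1 -> T_2
print(count_retractions(violator_outs))  # 3: T_0 -> T_2 -> T_1 -> T_2
```

The violator must eventually return to T_1 to remain convergent in a world where no second effect ever appears, so nature can always extract the extra retraction.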
55 Mixed Strategies in Decision and Game Theory Randomization in decision theory and games: E.g. Rock, paper, scissors, matching pennies Can randomization in scientific inquiry improve expected errors and retractions? 29 / 39
58 Randomized Strategies Randomized Methods: Machines (formally, discrete-state stochastic processes) for selecting theories from data. Outputs of the machine are a function of (i) its current state and (ii) total input. States of the machine evolve according to a random process; future and past states may be correlated to any degree. Independence is not assumed, and there are no assumptions about the process being Markov, etc. 30 / 39
59 Costs and Randomized Strategies Randomized methods ought to be as deductive as possible: Approximate Soundness: convergence in probability, minimization of expected errors Approximate Monotonicity: minimization of expected retractions 31 / 39
60 Retractions in Chance and Expected Retractions 32 / 39
64 Methodological Properties Say a randomized method M is Ockham if it never says any non-simple theory with probability greater than zero; stalwart if, whenever it endorses the simplest theory (with any positive probability), it continues to do so with unit probability until it is no longer Ockham; convergent in probability if, for any world w, the probability that M produces the theory true in w approaches 1 as time elapses. 33 / 39
65 Ockham's Razor in Probability Say a randomized method M is normally Ockham if it is Ockham, stalwart, and convergent in probability. 34 / 39
66 Generalized Efficiency Theorem: Suppose the simplicity structure has no short paths, and let M be a randomized or deterministic method such that: (1) M first violates Ockham's razor after some initial segment of evidence e; (2) M is convergent in probability. Then, in comparison to any normal Ockham method (deterministic or not!), M accrues a strictly greater number of retractions in every complexity class with respect to e. 35 / 39
67 References E. Mark Gold. Language identification in the limit. Information and Control, 1967. Kevin Kelly. The Logic of Reliable Inquiry. Oxford University Press, 1996. Oliver Schulte. Inferring Conservation Laws in Particle Physics: A Case Study in the Problem of Induction. The British Journal for the Philosophy of Science. Oliver Schulte, W. Luo, and R. Greiner. Mind Change Optimal Learning of Bayes Net Structure. In 20th Annual Conference on Learning Theory (COLT), San Diego, CA, June 12-15, 2007. 36 / 39
68 Branching Simplicity Structure (Figure: a branching order with short paths through theories T_0, ..., T_5, comparing a consistent, stalwart, Ockham method with an Ockham violator. Labels: "As bad in these complexity classes..." along the short paths; "Only strictly worse in this complexity class!" at the remaining class.) 37 / 39
69 Stochastic Processes Definition. Let T, Ω, Σ be arbitrary sets. A stochastic process is a quadruple Q = (T, (Ω, D, p), (Σ, S), X), where: 1. T is a set called the index set of the process; 2. (Ω, D, p) is a (countably additive) probability space; 3. (Σ, S) is a measurable space of possible values of the process; 4. X : T × Ω → Σ is such that for each fixed t ∈ T, the function X_t is D/S-measurable. 38 / 39
70 Stochastic Processes Definition. Let F represent the finite, initial segments of data streams, and let Ans contain the set of all theories and '?', representing "I don't know." A stochastic empirical method is a triple M = (Q, α, σ_0), where: 1. Q = (F, (Ω, D, p), (Σ, S), X) is a stochastic process indexed by the set F of all finite, initial segments of data streams; 2. σ_0 ∈ Σ is the initial state of our method (i.e., it satisfies X_⟨⟩^{-1}(σ_0) = Ω); 3. α : F × Σ → Ans is such that for each e ∈ F, α_e is G/2^Ans-measurable. 39 / 39
Ockham Efficiency Theorem for Stochastic Empirical Methods. Conor Mayo-Wilson and Kevin Kelly. May 19, 2010. Technical Report No. CMU-PHIL-186. Philosophy, Methodology, Logic. Carnegie Mellon, Pittsburgh, Pennsylvania.
Learning from Observations Chapter 18, Sections 1 3 Chapter 18, Sections 1 3 1 Outline Learning agents Inductive learning Decision tree learning Measuring learning performance Chapter 18, Sections 1 3
More informationWhat Causality Is (stats for mathematicians)
What Causality Is (stats for mathematicians) Andrew Critch UC Berkeley August 31, 2011 Introduction Foreword: The value of examples With any hard question, it helps to start with simple, concrete versions
More informationCS261: A Second Course in Algorithms Lecture #11: Online Learning and the Multiplicative Weights Algorithm
CS61: A Second Course in Algorithms Lecture #11: Online Learning and the Multiplicative Weights Algorithm Tim Roughgarden February 9, 016 1 Online Algorithms This lecture begins the third module of the
More informationLecture 11: Measuring the Complexity of Proofs
IAS/PCMI Summer Session 2000 Clay Mathematics Undergraduate Program Advanced Course on Computational Complexity Lecture 11: Measuring the Complexity of Proofs David Mix Barrington and Alexis Maciel July
More informationReasoning with Uncertainty
Reasoning with Uncertainty Representing Uncertainty Manfred Huber 2005 1 Reasoning with Uncertainty The goal of reasoning is usually to: Determine the state of the world Determine what actions to take
More informationConfirmation and Chaos *
Confirmation and Chaos * Maralee Harrell Department of Philosophy, Colorado College Clark Glymour Institute for Human and Machine Cognition, University of West Florida and Department of Philosophy, Carnegie
More informationProof Techniques (Review of Math 271)
Chapter 2 Proof Techniques (Review of Math 271) 2.1 Overview This chapter reviews proof techniques that were probably introduced in Math 271 and that may also have been used in a different way in Phil
More informationDiscrete Mathematics
Slides for Part IA CST 2015/16 Discrete Mathematics Prof Marcelo Fiore Marcelo.Fiore@cl.cam.ac.uk What are we up to? Learn to read and write, and also work with,
More informationStatistics: Learning models from data
DS-GA 1002 Lecture notes 5 October 19, 2015 Statistics: Learning models from data Learning models from data that are assumed to be generated probabilistically from a certain unknown distribution is a crucial
More informationMachine Learning
Machine Learning 10-601 Tom M. Mitchell Machine Learning Department Carnegie Mellon University January 14, 2015 Today: The Big Picture Overfitting Review: probability Readings: Decision trees, overfiting
More informationOn the Complexity of Causal Models
On the Complexity of Causal Models Brian R. Gaines Man-Machine Systems Laboratory Department of Electrical Engineering Science University of Essex, Colchester, England It is argued that principle of causality
More informationPhilosophy 148 Announcements & Such
Branden Fitelson Philosophy 148 Lecture 1 Philosophy 148 Announcements & Such Overall, people did very well on the mid-term (µ = 90, σ = 16). HW #2 graded will be posted very soon. Raul won t be able to
More information16.4 Multiattribute Utility Functions
285 Normalized utilities The scale of utilities reaches from the best possible prize u to the worst possible catastrophe u Normalized utilities use a scale with u = 0 and u = 1 Utilities of intermediate
More informationChaos, Complexity, and Inference (36-462)
Chaos, Complexity, and Inference (36-462) Lecture 17: Inference in General Cosma Shalizi 16 March 2009 Some Basics About Inference in General Errors and Reliable Inference Further reading: Mayo (1996),
More informationRigorous Science - Based on a probability value? The linkage between Popperian science and statistical analysis
Rigorous Science - Based on a probability value? The linkage between Popperian science and statistical analysis The Philosophy of science: the scientific Method - from a Popperian perspective Philosophy
More information15-424/ Recitation 1 First-Order Logic, Syntax and Semantics, and Differential Equations Notes by: Brandon Bohrer
15-424/15-624 Recitation 1 First-Order Logic, Syntax and Semantics, and Differential Equations Notes by: Brandon Bohrer (bbohrer@cs.cmu.edu) 1 Agenda Admin Everyone should have access to both Piazza and
More informationPHIL12A Section answers, 14 February 2011
PHIL12A Section answers, 14 February 2011 Julian Jonker 1 How much do you know? 1. You should understand why a truth table is constructed the way it is: why are the truth values listed in the order they
More informationSTA 4273H: Sta-s-cal Machine Learning
STA 4273H: Sta-s-cal Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 2 In our
More informationStatistical inference
Statistical inference Contents 1. Main definitions 2. Estimation 3. Testing L. Trapani MSc Induction - Statistical inference 1 1 Introduction: definition and preliminary theory In this chapter, we shall
More informationCAUSATION CAUSATION. Chapter 10. Non-Humean Reductionism
CAUSATION CAUSATION Chapter 10 Non-Humean Reductionism Humean states of affairs were characterized recursively in chapter 2, the basic idea being that distinct Humean states of affairs cannot stand in
More informationCS 188: Artificial Intelligence Spring Announcements
CS 188: Artificial Intelligence Spring 2010 Lecture 24: Perceptrons and More! 4/22/2010 Pieter Abbeel UC Berkeley Slides adapted from Dan Klein Announcements W7 due tonight [this is your last written for
More informationFoundations of Artificial Intelligence
Foundations of Artificial Intelligence 7. Propositional Logic Rational Thinking, Logic, Resolution Wolfram Burgard, Maren Bennewitz, and Marco Ragni Albert-Ludwigs-Universität Freiburg Contents 1 Agents
More informationBayesian network modeling. 1
Bayesian network modeling http://springuniversity.bc3research.org/ 1 Probabilistic vs. deterministic modeling approaches Probabilistic Explanatory power (e.g., r 2 ) Explanation why Based on inductive
More informationPengju XJTU 2016
Introduction to AI Chapter13 Uncertainty Pengju Ren@IAIR Outline Uncertainty Probability Syntax and Semantics Inference Independence and Bayes Rule Wumpus World Environment Squares adjacent to wumpus are
More informationComputational Learning Theory
CS 446 Machine Learning Fall 2016 OCT 11, 2016 Computational Learning Theory Professor: Dan Roth Scribe: Ben Zhou, C. Cervantes 1 PAC Learning We want to develop a theory to relate the probability of successful
More informationFoundations of Artificial Intelligence
Foundations of Artificial Intelligence 7. Propositional Logic Rational Thinking, Logic, Resolution Joschka Boedecker and Wolfram Burgard and Bernhard Nebel Albert-Ludwigs-Universität Freiburg May 17, 2016
More informationIntroduction to Metalogic
Philosophy 135 Spring 2008 Tony Martin Introduction to Metalogic 1 The semantics of sentential logic. The language L of sentential logic. Symbols of L: Remarks: (i) sentence letters p 0, p 1, p 2,... (ii)
More informationUncertainty. Michael Peters December 27, 2013
Uncertainty Michael Peters December 27, 20 Lotteries In many problems in economics, people are forced to make decisions without knowing exactly what the consequences will be. For example, when you buy
More informationFormal Epistemology: Lecture Notes. Horacio Arló-Costa Carnegie Mellon University
Formal Epistemology: Lecture Notes Horacio Arló-Costa Carnegie Mellon University hcosta@andrew.cmu.edu Bayesian Epistemology Radical probabilism doesn t insists that probabilities be based on certainties;
More informationBayesian Social Learning with Random Decision Making in Sequential Systems
Bayesian Social Learning with Random Decision Making in Sequential Systems Yunlong Wang supervised by Petar M. Djurić Department of Electrical and Computer Engineering Stony Brook University Stony Brook,
More informationChapter 2. Mathematical Reasoning. 2.1 Mathematical Models
Contents Mathematical Reasoning 3.1 Mathematical Models........................... 3. Mathematical Proof............................ 4..1 Structure of Proofs........................ 4.. Direct Method..........................
More informationBaker. 1. Classical physics. 2. Determinism
Baker Ted Sider Structuralism seminar 1. Classical physics Dynamics Any body will accelerate in the direction of the net force on it, with a magnitude that is the ratio of the force s magnitude and the
More informationComparative Bayesian Confirmation and the Quine Duhem Problem: A Rejoinder to Strevens Branden Fitelson and Andrew Waterman
Brit. J. Phil. Sci. 58 (2007), 333 338 Comparative Bayesian Confirmation and the Quine Duhem Problem: A Rejoinder to Strevens Branden Fitelson and Andrew Waterman ABSTRACT By and large, we think (Strevens
More informationSolomonoff Induction Violates Nicod s Criterion
Solomonoff Induction Violates Nicod s Criterion Jan Leike and Marcus Hutter http://jan.leike.name/ ALT 15 6 October 2015 Outline The Paradox of Confirmation Solomonoff Induction Results Resolving the Paradox
More informationThe exam is closed book, closed calculator, and closed notes except your one-page crib sheet.
CS 188 Fall 2015 Introduction to Artificial Intelligence Final You have approximately 2 hours and 50 minutes. The exam is closed book, closed calculator, and closed notes except your one-page crib sheet.
More informationNatural deduction for truth-functional logic
Natural deduction for truth-functional logic Phil 160 - Boston University Why natural deduction? After all, we just found this nice method of truth-tables, which can be used to determine the validity or
More information