Logic and machine learning review CS 540 Yingyu Liang

Propositional logic

Logic
If the rules of the world are presented formally, then a decision maker can use logical reasoning to make rational decisions.
Several types of logic:
- propositional logic (Boolean logic)
- first order logic (first order predicate calculus)
A logic includes:
- syntax: what is a correctly formed sentence
- semantics: what is the meaning of a sentence
- inference procedure (reasoning, entailment): what sentence logically follows given knowledge
slide 3

Propositional logic syntax
BNF (Backus-Naur Form) grammar of propositional logic:
Sentence → AtomicSentence | ComplexSentence
AtomicSentence → True | False | Symbol
Symbol → P | Q | R | ...
ComplexSentence → ¬Sentence | (Sentence ∧ Sentence) | (Sentence ∨ Sentence) | (Sentence ⇒ Sentence) | (Sentence ⇔ Sentence)
(¬P ∨ ((True ∧ R) ⇔ Q)) ⇒ ¬S: well formed
P ∧ Q) ⇒ S): not well formed (unbalanced parentheses)
slide 4
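To make the grammar concrete, here is a minimal well-formedness checker in Python (an illustrative sketch, not course code). It uses ASCII stand-ins ~, &, |, =>, <=> for the connectives and follows the BNF above, so complex binary sentences must be parenthesized; the helper names are made up.

    import re

    def parse(toks, i):
        """Consume one Sentence starting at toks[i]; return the next index."""
        if i >= len(toks):
            raise ValueError("ran out of tokens")
        t = toks[i]
        if t == "~":                       # ComplexSentence -> ~Sentence
            return parse(toks, i + 1)
        if t == "(":                       # ComplexSentence -> (Sentence op Sentence)
            i = parse(toks, i + 1)
            if i >= len(toks) or toks[i] not in ("&", "|", "=>", "<=>"):
                raise ValueError("expected a binary connective")
            i = parse(toks, i + 1)
            if i >= len(toks) or toks[i] != ")":
                raise ValueError("expected ')'")
            return i + 1
        if t.isalnum():                    # AtomicSentence -> True | False | Symbol
            return i + 1
        raise ValueError("unexpected token " + repr(t))

    def well_formed(s):
        toks = re.findall(r"<=>|=>|[~&|()]|\w+", s)
        try:
            return parse(toks, 0) == len(toks)
        except ValueError:
            return False

    print(well_formed("(~P | ((True & R) <=> Q))"))  # True
    print(well_formed("P & Q) => S)"))               # False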

Summary
- Interpretation, semantics, knowledge base
- Entailment, model checking (see the sketch below)
- Inference, soundness, completeness
- Inference methods: sound inference and proof; resolution over CNF; chaining with Horn clauses (forward/backward chaining)
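Entailment by model checking can be written out in a few lines (an illustrative sketch, not course code): KB ⊨ query iff the query is true in every model of the KB, so enumerate all truth assignments to the symbols. Sentences are represented here as Python functions from a model (a dict of truth values) to a truth value.

    from itertools import product

    def entails(kb, query, symbols):
        """KB entails query iff query holds in every model of the KB."""
        for values in product([True, False], repeat=len(symbols)):
            model = dict(zip(symbols, values))
            if all(s(model) for s in kb) and not query(model):
                return False              # a model of KB in which query fails
        return True

    # KB = {P => Q, P}, query = Q (an instance of modus ponens)
    kb = [lambda m: (not m["P"]) or m["Q"],   # P => Q
          lambda m: m["P"]]                   # P
    print(entails(kb, lambda m: m["Q"], ["P", "Q"]))  # True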

Example

First order logic

FOL syntax summary
Short summary so far:
- Constants: Bob, 2, Madison, ...
- Variables: x, y, a, b, c, ...
- Functions: Income, Address, Sqrt, ...
- Predicates: Teacher, Sisters, Even, Prime
- Connectives: ¬ ∧ ∨ ⇒ ⇔
- Equality: =
- Quantifiers: ∀ ∃
slide 9

More summary
- Term: a constant, a variable, or a function applied to terms. Denotes an object. (A ground term has no variables.)
- Atom: the smallest expression assigned a truth value: a predicate applied to terms, or =.
- Sentence: an atom, a sentence with connectives, or a sentence with quantifiers. Assigned a truth value.
- Well-formed formula (wff): a sentence in which all variables are quantified.
slide 10
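For instance (illustrative examples, not from the slides): Sqrt(2) is a ground term; Even(x) is an atom; ∀x (Even(x) ∧ Prime(x) ⇒ x = 2) is a wff because its only variable x is quantified; Even(x) on its own leaves x free and so is not a wff.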

Example

Example

Machine learning basics

What is machine learning? "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E." -- Machine Learning, Tom Mitchell, 1997

Example 1: image classification
Task: determine if the image is indoor or outdoor
Performance measure: probability of misclassification

Example 1: image classification
Experience/Data: images with labels (indoor, outdoor)

Example 1: image classification
A few terminologies:
- Training data: the images given for learning
- Test data: the images to be classified
- Binary classification: classify into two classes

Example 2: clustering images
Task: partition the images into 2 groups
Performance: similarities within groups
Data: a set of images

Example 2: clustering images
A few terminologies:
- Unlabeled data vs labeled data
- Supervised learning vs unsupervised learning

Unsupervised learning

Unsupervised learning
Training sample: x_1, x_2, ..., x_n
No teacher providing supervision as to how individual instances should be handled
Common tasks:
- clustering: separate the n instances into groups
- novelty detection: find instances that are very different from the rest
- dimensionality reduction: represent each instance with a lower-dimensional feature vector while maintaining key characteristics of the training samples

Clustering
Group the training sample into k clusters. How many clusters do you see?
Many clustering algorithms:
- HAC (Hierarchical Agglomerative Clustering)
- k-means

Hierarchical Agglomerative Clustering
Repeatedly merge the two closest clusters, using Euclidean (L2) distance: d(x, x') = ||x − x'||_2
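A minimal run of HAC with Euclidean distance (an illustrative sketch using SciPy's hierarchical-clustering routines; the data points are made up):

    import numpy as np
    from scipy.cluster.hierarchy import linkage, fcluster

    X = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 4.9]])
    Z = linkage(X, method="single", metric="euclidean")  # merge closest clusters first
    labels = fcluster(Z, t=2, criterion="maxclust")      # cut the dendrogram into 2 clusters
    print(labels)                                        # e.g. [1 1 2 2]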

K-means algorithm
Input: x_1, ..., x_n and k
Step 1: select k cluster centers c_1, ..., c_k
Step 2: for each point x, determine its cluster: find the closest center in Euclidean space
Step 3: update all cluster centers as the centroids: c_i = (Σ_{x in cluster i} x) / SizeOf(cluster i)
Repeat steps 2 and 3 until the cluster centers no longer change
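The loop above translates directly into NumPy (an illustrative sketch; it initializes with k random training points, which is one common choice, and assumes no cluster goes empty):

    import numpy as np

    def kmeans(X, k, rng=np.random.default_rng(0)):
        # Step 1: select k cluster centers (here: k distinct training points)
        centers = X[rng.choice(len(X), size=k, replace=False)]
        while True:
            # Step 2: assign each point to the closest center in Euclidean space
            dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
            clusters = dists.argmin(axis=1)
            # Step 3: move each center to the centroid of its cluster
            new_centers = np.array([X[clusters == i].mean(axis=0) for i in range(k)])
            if np.allclose(new_centers, centers):  # centers no longer change
                return centers, clusters
            centers = new_centers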

Example

Example

Supervised learning

Math formulation
Given training data (x_i, y_i), 1 ≤ i ≤ n, i.i.d. from some unknown distribution D
Find y = f(x) ∈ H using the training data, s.t. f is correct on test data drawn i.i.d. from D
- If the label y is discrete: classification
- If the label y is continuous: regression

k-nearest-neighbor (kNN)
1NN for the little green men example: decision boundary (figure)
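kNN itself is a few lines (an illustrative sketch): store the training set, and predict by majority vote among the k closest points in Euclidean distance.

    import numpy as np

    def predict_knn(X_train, y_train, x, k=1):
        # indices of the k nearest training points to x
        idx = np.argsort(np.linalg.norm(X_train - x, axis=1))[:k]
        values, counts = np.unique(y_train[idx], return_counts=True)
        return values[counts.argmax()]    # majority vote (k=1 gives 1NN)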

Math formulation
Given training data (x_i, y_i), 1 ≤ i ≤ n, i.i.d. from distribution D
Find y = f(x) ∈ H that minimizes the empirical loss
L̂(f) = (1/n) Σ_{i=1}^n ℓ(f, x_i, y_i)
s.t. the expected loss is small
L(f) = E_{(x,y)~D}[ℓ(f, x, y)]
Examples of loss functions:
- 0-1 loss for classification: ℓ(f, x, y) = I[f(x) ≠ y] and L(f) = Pr[f(x) ≠ y]
- l2 loss for regression: ℓ(f, x, y) = [f(x) − y]^2 and L(f) = E[f(x) − y]^2
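For concreteness, both empirical losses are one-liners on (made-up) prediction arrays:

    import numpy as np

    y_true = np.array([0, 1, 1, 0])
    y_pred = np.array([0, 1, 0, 0])
    print(np.mean(y_pred != y_true))         # empirical 0-1 loss: (1/n) Σ I[f(x_i) ≠ y_i]
    print(np.mean((y_pred - y_true) ** 2))   # empirical l2 loss: (1/n) Σ [f(x_i) − y_i]^2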

Maximum Likelihood Estimation (MLE)
Given training data (x_i, y_i), 1 ≤ i ≤ n, i.i.d. from distribution D
Let {P_θ(x, y) : θ ∈ Θ} be a family of distributions indexed by θ
MLE: θ_ML = argmax_{θ ∈ Θ} Σ_i log P_θ(x_i, y_i)
Equivalently, minimize the negative log-likelihood loss:
ℓ(P_θ, x_i, y_i) = −log P_θ(x_i, y_i)
L̂(P_θ) = −Σ_i log P_θ(x_i, y_i)

MLE: conditional log-likelihood
Given training data (x_i, y_i), 1 ≤ i ≤ n, i.i.d. from distribution D
Let {P_θ(y | x) : θ ∈ Θ} be a family of conditional distributions indexed by θ
MLE: θ_ML = argmax_{θ ∈ Θ} Σ_i log P_θ(y_i | x_i)
Only care about predicting y from x; do not care about p(x)
Negative conditional log-likelihood loss:
ℓ(P_θ, x_i, y_i) = −log P_θ(y_i | x_i)
L̂(P_θ) = −Σ_i log P_θ(y_i | x_i)
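A tiny numerical check of the MLE idea (an illustrative example, not from the slides): for coin flips y_i ~ Bernoulli(θ), minimizing the negative log-likelihood recovers the sample mean.

    import numpy as np
    from scipy.optimize import minimize_scalar

    y = np.array([1, 0, 1, 1, 0, 1])

    def neg_log_lik(theta):
        # -Σ_i log P_θ(y_i) for y_i ~ Bernoulli(θ)
        return -np.sum(y * np.log(theta) + (1 - y) * np.log(1 - theta))

    res = minimize_scalar(neg_log_lik, bounds=(1e-6, 1 - 1e-6), method="bounded")
    print(res.x, y.mean())   # both ≈ 0.6667: the MLE is the sample mean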

Linear regression with regularization: ridge regression
Given training data (x_i, y_i), 1 ≤ i ≤ n, i.i.d. from distribution D
Find f_w(x) = wᵀx that minimizes L(f_w) = (1/n) ||Xw − y||_2^2
By setting the gradient to zero, we have w = (XᵀX)^{-1} Xᵀy
Ridge regression adds the regularizer λ||w||_2^2 to this loss, giving w = (XᵀX + λnI)^{-1} Xᵀy
l2 loss corresponds to MLE under a Normal (Gaussian) noise model
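Both closed forms are easy to verify numerically (an illustrative sketch; lam = 0 recovers the unregularized least-squares solution):

    import numpy as np

    def ridge_fit(X, y, lam=0.1):
        n, d = X.shape
        # w = (XᵀX + λnI)^{-1} Xᵀy; lam = 0 gives plain least squares
        return np.linalg.solve(X.T @ X + lam * n * np.eye(d), X.T @ y)

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 3))
    w_true = np.array([1.0, -2.0, 0.5])
    y = X @ w_true + 0.1 * rng.normal(size=100)
    print(ridge_fit(X, y, lam=0.0))   # close to w_true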

Linear classification: logistic regression
Given training data (x_i, y_i), 1 ≤ i ≤ n, i.i.d. from distribution D
Assume P_w(y = 1 | x) = σ(wᵀx) = 1 / (1 + exp(−wᵀx))
P_w(y = 0 | x) = 1 − P_w(y = 1 | x) = 1 − σ(wᵀx)
Find w that minimizes the negative conditional log-likelihood
L̂(w) = −(1/n) Σ_{i=1}^n log P_w(y_i | x_i)
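This loss has no closed-form minimizer, but its gradient is (1/n) Xᵀ(σ(Xw) − y), so gradient descent fits it in a few lines (an illustrative sketch, not course code):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def logistic_fit(X, y, lr=0.1, steps=1000):
        n, d = X.shape
        w = np.zeros(d)
        for _ in range(steps):
            p = sigmoid(X @ w)               # P_w(y = 1 | x_i) for every i
            w -= lr * X.T @ (p - y) / n      # gradient of the negative log-likelihood
        return w

    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 2))
    y = (X @ np.array([2.0, -1.0]) > 0).astype(float)
    print(logistic_fit(X, y))                # points roughly in the direction (2, -1)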

Example

Example