ORIE 4741: Learning with Big Messy Data. Feature Engineering


1 ORIE 4741: Learning with Big Messy Data
Feature Engineering
Professor Udell
Operations Research and Information Engineering
Cornell
October 2, 2017

2 Agenda
announcements:
- Vote today! + voter registration deadline in October (for the Nov 2017 election, or to change parties for the 2018 election)
- Homework 1 due this morning; Homework 2 out tonight, due in two weeks
- always use Julia v0.6 (when you use Julia)
- section this week: Github + projects; no section next Tuesday
- lecture videos: http://cornell.videonote.com/channels/1036/videos
- midterm exam in class on October 5, covers through decision trees
then:
- the role of confusion in learning
- feature engineering
- project discussion

3 Linear models
To fit a linear model (= linear in the parameters w):
- pick a transformation φ : X → R^d
- predict y using a linear function of φ(x):
  h(x) = w^T φ(x) = ∑_{i=1}^d w_i (φ(x))_i
- we want h(x_i) ≈ y_i for every i = 1, ..., n
Q: why do we want a model linear in the parameters w?
A: because the optimization problems are easy to solve! e.g., just use least squares.
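As a concrete illustration of this recipe (not part of the original slides), here is a minimal Julia sketch that builds a design matrix from a feature map φ and solves the least squares problem; the data and the choice of φ are made up, and the code targets current Julia rather than the course's v0.6.

```julia
using LinearAlgebra

# hypothetical data: n scalar inputs with a noisy linear response
n = 50
x = randn(n)
y = 2 .* x .+ 1 .+ 0.1 .* randn(n)

φ(x) = [x, 1.0]                     # feature map: the input plus an offset
Φ = vcat([φ(xi)' for xi in x]...)   # n × d design matrix, one row per example
w = Φ \ y                           # least squares: minimize ‖Φw - y‖²
h(x) = dot(w, φ(x))                 # learned predictor h(x) = wᵀφ(x)
```

Every feature map on the slides that follow can be dropped into this same skeleton; only the definition of φ changes.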

6 Feature engineering
How to pick φ : X → R^d?
- so the response y will depend linearly on φ(x)
- so d is not too big
if you think this looks like a hack: you're right

8 Feature engineering examples:
- adding offset
- standardizing features
- polynomial fits
- products of features
- autoregressive models
- local linear regression
- transforming images
- transforming Booleans
- transforming ordinals
- transforming nominals
- transforming sentences
- concatenating data
- all of the above

9 Adding offset
X = R^{d-1}
let φ(x) = (x, 1)
now h(x) = w^T φ(x) = w_{1:d-1}^T x + w_d

10 Fitting a polynomial
X = R
let φ(x) = (1, x, x^2, x^3, ..., x^{d-1}) be the vector of all monomials in x of degree < d
now h(x) = w^T φ(x) = w_1 + w_2 x + w_3 x^2 + ... + w_d x^{d-1}
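In the least-squares sketch above, this feature map is a one-liner; the degree and the data below are hypothetical:

```julia
# monomials of degree < d for scalar input x
d = 4
φ(x) = [x^j for j in 0:d-1]                      # (1, x, x², x³)

xs = randn(30)                                   # hypothetical data
ys = 1 .- xs .+ 0.5 .* xs.^3 .+ 0.1 .* randn(30)
Φ = vcat([φ(x)' for x in xs]...)                 # Vandermonde-style matrix
w = Φ \ ys                                       # fits w₁ + w₂x + w₃x² + w₄x³
```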

11 Demo: Linear models

12 Fitting a multivariate polynomial
X = R^2
pick a maximum degree k
let φ(x) = (1, x_1, x_2, x_1^2, x_1 x_2, x_2^2, x_1^3, x_1^2 x_2, x_1 x_2^2, x_2^3, ..., x_2^k) be the vector of all monomials in x_1 and x_2 of degree ≤ k
now h(x) = w^T φ(x) can fit any polynomial of degree k in X
and similarly for X = R^d
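One way to enumerate these monomials in code; a sketch assuming x ∈ R^2 and total degree ≤ k:

```julia
# all monomials in (x₁, x₂) of total degree ≤ k
k = 2
φ(x) = [x[1]^i * x[2]^j for i in 0:k for j in 0:k-i]
# for k = 2 this gives (1, x₂, x₂², x₁, x₁x₂, x₁²): the same features
# as the slide's list, in a different order (the span is identical)
```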

14 Example: fitting a multivariate polynomial
X = R^2, Y = {-1, 1}
let φ(x) = (1, x_1, x_2, x_1^2, x_1 x_2, x_2^2) be the vector of all monomials of degree ≤ 2
now let h(x) = sign(w^T φ(x))
Q: if h(x) = sign(5 - 3x_1 + 2x_2 + x_1^2 - x_1 x_2 + 2x_2^2), what is {x : h(x) = 1}?
A: an ellipse!

16 Example: fitting a multivariate polynomial
X = R^2, Y = {-1, 1}
let φ(x) = (1, x_1, x_2, x_1^2, x_1 x_2, x_2^2) be the vector of all monomials of degree ≤ 2
now let h(x) = sign(w^T φ(x))
Q: if h(x) = sign(5 - 3x_1 + 2x_2 + x_1^2 - x_1 x_2 - 2x_2^2), what is {x : h(x) = 1}?
A: a hyperbola!

18 Fitting a time series
given a time series x_t ∈ R, t = 1, ..., T
want to predict the value at the next time T + 1
input: time index t and time series x_{1:t} up to time t
let φ(t, x) = (x_{t-1}, x_{t-2}, ..., x_{t-d}) (called the lagged outcomes)
now h(t, x) = w^T φ(t, x) = w_1 x_{t-1} + w_2 x_{t-2} + ... + w_d x_{t-d}
also called an auto-regressive (AR) model
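A sketch of the AR feature construction on a made-up series; the series, its length, and the lag count d are all hypothetical:

```julia
# lagged features for an AR(d) model on a series x₁, ..., x_T
T, d = 200, 3
x = cumsum(randn(T))                       # hypothetical random-walk series
ts = d+1:T                                 # times with d past values available
Φ = vcat([x[t-1:-1:t-d]' for t in ts]...)  # row for time t: (x_{t-1}, ..., x_{t-d})
y = x[ts]
w = Φ \ y                                  # least-squares AR coefficients
```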

19 New notation: boolean indicator function
define 1(statement) = 1 if the statement is true, 0 if the statement is false
examples: 1(1 < 0) = 0, 1(17 = 17) = 1

20 Fitting a local model: local neighbors
X = R^m
pick a set of points µ_i ∈ R^m, i = 1, ..., d, and a radius δ ∈ R
let φ(x) = [1(‖x - µ_1‖_2 ≤ δ), ..., 1(‖x - µ_d‖_2 ≤ δ)]
often, d = n and µ_i = x_i

21 Fitting a local model: 1 nearest neighbor
X = R^m
pick a set of points µ_i ∈ R^m, i = 1, ..., d
let δ = min(‖x - µ_1‖_2, ..., ‖x - µ_d‖_2)
let φ(x) = [1(‖x - µ_1‖_2 ≤ δ), ..., 1(‖x - µ_d‖_2 ≤ δ)]
often, d = n and µ_i = x_i

22 Fitting a local model: smoothing
X = R^m
pick a set of points µ_i ∈ R^m, i = 1, ..., d, and a parameter α ∈ R
let φ(x) = (exp(-α‖x - µ_1‖_2), ..., exp(-α‖x - µ_d‖_2))
often, d = n and µ_i = x_i
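All three local models share the same skeleton: compute the distance from x to each center µ_i, then transform it. A sketch of the smoothing version, with hypothetical data and a made-up α:

```julia
using LinearAlgebra

# Gaussian smoothing features centered at the training points
n, m = 40, 2
X = randn(n, m)                     # hypothetical training inputs, one per row
μ = [X[i, :] for i in 1:n]          # centers: μᵢ = xᵢ
α = 1.0                             # kernel width parameter
φ(x) = [exp(-α * norm(x - μi)) for μi in μ]
```

Replacing exp(-α r) with the indicator 1(r ≤ δ) of the distance r recovers the local-neighbors features, and setting δ to the minimum distance to any center gives the 1-nearest-neighbor map.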

23 Demo: Crime
preprocessing and predicting: Predicting%20crime.ipynb

24 Boolean variables
X = {true, false}
let φ(x) = 1(x)

25 Boolean expressions
X = {true, false}^2 = {(true, true), (true, false), (false, true), (false, false)}
let φ(x) = [1(x_1), 1(x_2), 1(x_1 and x_2), 1(x_1 or x_2)]
equivalent: polynomials in [1(x_1), 1(x_2)] span the same space
encodes logical expressions!
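A sketch of this map as code, with the Booleans converted to 0/1 floats:

```julia
# feature map for a pair of Booleans, with and/or as extra features
φ(x1::Bool, x2::Bool) = Float64[x1, x2, x1 && x2, x1 || x2]

φ(true, false)    # returns [1.0, 0.0, 0.0, 1.0]
```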

26 Nominal values
X = {apple, orange, banana}
let φ(x) = [1(x = apple), 1(x = orange), 1(x = banana)]
called a one-hot encoding: only one element is non-zero
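The same encoding in code; a sketch with hypothetical category labels:

```julia
# one-hot encoding of a nominal variable
levels = ["apple", "orange", "banana"]
φ(x) = Float64[x == l for l in levels]

φ("orange")       # returns [0.0, 1.0, 0.0]
```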

27 Language
X = sentences, documents, tweets, ...
let {w_1, ..., w_d} be a set of words (or hashtags, or emoji, or ...)
let φ(x) = [1(x contains w_1), ..., 1(x contains w_d)]
called the bag-of-words model: it ignores the order of the words in the sentence
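A sketch of bag-of-words features; the vocabulary and the example document are made up, and words are split on whitespace:

```julia
# bag-of-words: which vocabulary words appear in the document?
vocab = ["big", "messy", "data"]
φ(doc) = Float64[w in split(doc) for w in vocab]

φ("learning with big messy data")   # returns [1.0, 1.0, 1.0]
```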

28 Review
- linear models are linear in the parameters w
- can fit many different models by picking the feature mapping φ : X → R^d

29 What makes a good project?
- a clear outcome to predict
- linear regression should do something interesting
- a new, interesting model; not a Kaggle competition
- avoid: images, time series, NLP
