ORIE 4741: Learning with Big Messy Data. Feature Engineering
|
|
- Jean Kelley Young
- 5 years ago
- Views:
Transcription
1 ORIE 4741: Learning with Big Messy Data Feature Engineering Professor Udell Operations Research and Information Engineering Cornell October 2, / 24
2 Agenda announcements Vote today! + registration deadline October (for Nov 2017 election, or to change parties for 2018 election) Homework 1 due this morning Homework 2 out tonight, due in two weeks Always use Julia v0.6 (when you use Julia) Section this week: Github + projects No section next Tuesday Lecture videos: http: //cornell.videonote.com/channels/1036/videos Midterm exam in class on October 5, covers through decision trees The role of confusion in learning feature engineering project discussion 2 / 24
3 Linear models To fit a linear model (= linear in parameters w) pick a transformation φ : X R d predict y using a linear function of φ(x) h(x) = w T φ(x) = d w i (φ(x)) i i=1 we want h(x i ) y i for every i = 1,..., n 3 / 24
4 Linear models To fit a linear model (= linear in parameters w) pick a transformation φ : X R d predict y using a linear function of φ(x) h(x) = w T φ(x) = d w i (φ(x)) i i=1 we want h(x i ) y i for every i = 1,..., n Q: why do we want a model linear in the parameters w? 3 / 24
5 Linear models To fit a linear model (= linear in parameters w) pick a transformation φ : X R d predict y using a linear function of φ(x) h(x) = w T φ(x) = d w i (φ(x)) i i=1 we want h(x i ) y i for every i = 1,..., n Q: why do we want a model linear in the parameters w? A: because the optimization problems are easy to solve! e.g., just use least squares. 3 / 24
6 Feature engineering How to pick φ : X R d? so response y will depend linearly on φ(x) so d is not too big 4 / 24
7 Feature engineering How to pick φ : X R d? so response y will depend linearly on φ(x) so d is not too big if you think this looks like a hack: you re right 5 / 24
8 Feature engineering examples: adding offset standardizing features polynomial fits products of features autoregressive models local linear regression transforming images transforming Booleans transforming ordinals transforming nominals transforming sentences concatenating data all of the above 6 / 24
9 Adding offset X = R d 1 let φ(x) = (x, 1) now h(x) = w T φ(x) = w T 1:d 1 x + w d 7 / 24
10 Fitting a polynomial X = R let φ(x) = (1, x, x 2, x 3,..., x d 1 ) be the vector of all monomials in x of degree < d now h(x) = w T φ(x) = w 1 + w 2 x + w 3 x w d x d 1 8 / 24
11 Demo: Linear models 9 / 24
12 Fitting a multivariate polynomial X = R 2 pick a maximum degree k let φ(x) = (1, x 1, x 2, x 2 1, x 1 x 2, x 2 2, x 3 1, x 2 1 x 2, x 1 x 2 2, x , x k 2 ) be the vector of all monomials in x 1 and x 2 of degree < d now h(x) = w T φ(x) can fit any polynomial of degree k in X 10 / 24
13 Fitting a multivariate polynomial X = R 2 pick a maximum degree k let φ(x) = (1, x 1, x 2, x 2 1, x 1 x 2, x 2 2, x 3 1, x 2 1 x 2, x 1 x 2 2, x , x k 2 ) be the vector of all monomials in x 1 and x 2 of degree < d now h(x) = w T φ(x) can fit any polynomial of degree k in X and similarly for X = R d / 24
14 Example: fitting a multivariate polynomial X = R 2, Y = {0, 1} let φ(x) = (1, x 1, x 2, x 2 1, x 1 x 2, x 2 2 ) be the vector of all monomials of degree 2 now let h(x) = sign(w T φ(x)) Q: if h(x) = sign(5 3x 1 + 2x 2 + x1 2 x 1x 2 + 2x2 2 ), what is {x : h(x) = 1}? 11 / 24
15 Example: fitting a multivariate polynomial X = R 2, Y = {0, 1} let φ(x) = (1, x 1, x 2, x 2 1, x 1 x 2, x 2 2 ) be the vector of all monomials of degree 2 now let h(x) = sign(w T φ(x)) Q: if h(x) = sign(5 3x 1 + 2x 2 + x1 2 x 1x 2 + 2x2 2 ), what is A: An ellipse! {x : h(x) = 1}? 11 / 24
16 Example: fitting a multivariate polynomial X = R 2, Y = { 1, 1} let φ(x) = (1, x 1, x 2, x 2 1, x 1 x 2, x 2 2 ) be the vector of all monomials of degree 2 now let h(x) = sign(w T φ(x)) Q: if h(x) = sign(5 3x 1 + 2x 2 + x1 2 x 1x 2 2x2 2 ), what is {x : h(x) = 1}? 12 / 24
17 Example: fitting a multivariate polynomial X = R 2, Y = { 1, 1} let φ(x) = (1, x 1, x 2, x 2 1, x 1 x 2, x 2 2 ) be the vector of all monomials of degree 2 now let h(x) = sign(w T φ(x)) Q: if h(x) = sign(5 3x 1 + 2x 2 + x1 2 x 1x 2 2x2 2 ), what is A: A hyperbola! {x : h(x) = 1}? 12 / 24
18 Fitting a time series given a time series x t R, t = 1,..., T want to predict the value at the next time T + 1 input: time index t and time series x 1:t up to time t let φ(t, x) = (x t 1, x t 2,..., x t d ) (called the lagged outcomes ) now h(x) = w T φ(x) = w 1 x t 1 + w 2 x t w d x t d also called an auto-regressive (AR) model 13 / 24
19 New notation: boolean indicator function define 1(statement) = { 1 statement is true 0 statement is false examples: 1(1 < 0) = 0 1(17 = 17) = 1 14 / 24
20 Fitting a local model: local neighbors X = R m pick a set of points µ i R m, i = 1,..., d, radius δ R let φ(x) = [1( x µ 1 2 δ),..., 1( x µ d 2 δ)] often, d = n and µ i = x i 15 / 24
21 Fitting a local model: 1 nearest neighbor X = R m pick a set of points µ i R m, i = 1,..., d let δ = min( x µ 1 2,..., x µ d 2 ) let φ(x) = [1( x µ 1 2 δ),..., 1( x µ d 2 δ)] often, d = n and µ i = x i 16 / 24
22 Fitting a local model: smoothing X = R m pick a set of points µ i R m, i = 1,..., d, parameter α R let φ(x) = (exp ( α x µ 1 2),..., exp ( α x µ d 2) ) often, d = n and µ i = x i 17 / 24
23 Demo: Crime preprocessing: predicting: Predicting%20crime.ipynb 18 / 24
24 Boolean variables X = {true, false} let φ(x) = 1(x) 19 / 24
25 Boolean expressions X = {true, false} 2 = {true, false} 2 = {(true, true), (true, false), (false, true), (false, false)} 2. let φ(x) = [1(x 1 ), 1(x 2 ), 1(x 1 and x 2 ), 1(x 1 or x 2 )] equivalent: polynomials in [1(x 1 ), 1(x 2 )] span the same space encodes logical expressions! 20 / 24
26 Nominal values X = {apple, orange, banana} let φ(x) = [1(x = apple), 1(x = orange), 1(x = banana)] called one-hot encoding: only one element is non-zero 21 / 24
27 Language X = sentences, documents, tweets,... let {w 1,..., w d } be a set of words (or hashtags, or emoji, or... ) let φ(x) = [1(x contains w 1 ),..., 1(x contains w d )] called bag-of-words model: ignores order of words in sentence 22 / 24
28 Review linear models are linear in the parameters w can fit many different models by picking feature mapping φ : X R d 23 / 24
29 What makes a good project? Clear outcome to predict Linear regression should do something interesting New, interesting model; not a Kaggle competition Avoid: images, time series, NLP 24 / 24
ORIE 4741: Learning with Big Messy Data. Generalization
ORIE 4741: Learning with Big Messy Data Generalization Professor Udell Operations Research and Information Engineering Cornell September 23, 2017 1 / 21 Announcements midterm 10/5 makeup exam 10/2, by
More informationData Mining Prof. Pabitra Mitra Department of Computer Science & Engineering Indian Institute of Technology, Kharagpur
Data Mining Prof. Pabitra Mitra Department of Computer Science & Engineering Indian Institute of Technology, Kharagpur Lecture 21 K - Nearest Neighbor V In this lecture we discuss; how do we evaluate the
More informationCPSC 340: Machine Learning and Data Mining
CPSC 340: Machine Learning and Data Mining Linear Classifiers: predictions Original version of these slides by Mark Schmidt, with modifications by Mike Gelbart. 1 Admin Assignment 4: Due Friday of next
More informationORIE 4741: Learning with Big Messy Data. Regularization
ORIE 4741: Learning with Big Messy Data Regularization Professor Udell Operations Research and Information Engineering Cornell October 26, 2017 1 / 24 Regularized empirical risk minimization choose model
More informationLinear Regression (continued)
Linear Regression (continued) Professor Ameet Talwalkar Professor Ameet Talwalkar CS260 Machine Learning Algorithms February 6, 2017 1 / 39 Outline 1 Administration 2 Review of last lecture 3 Linear regression
More informationMIDTERM: CS 6375 INSTRUCTOR: VIBHAV GOGATE October,
MIDTERM: CS 6375 INSTRUCTOR: VIBHAV GOGATE October, 23 2013 The exam is closed book. You are allowed a one-page cheat sheet. Answer the questions in the spaces provided on the question sheets. If you run
More informationSelf-reproducing programs. And Introduction to logic. COS 116, Spring 2012 Adam Finkelstein
Self-reproducing programs. And Introduction to logic. COS 6, Spring 22 Adam Finkelstein Midterm One week from today in class Mar 5 Covers lectures, labs, homework, readings to date Old midterms will be
More informationSTA 302f16 Assignment Five 1
STA 30f16 Assignment Five 1 Except for Problem??, these problems are preparation for the quiz in tutorial on Thursday October 0th, and are not to be handed in As usual, at times you may be asked to prove
More informationWelcome to Math 102 Section 102
Welcome to Math 102 Section 102 Mingfeng Qiu Sep. 5, 2018 Math 102: Announcements Instructor: Mingfeng Qiu Email: mqiu@math.ubc.ca Course webpage: https://canvas.ubc.ca Check the calendar!!! Sectional
More informationMachine Learning, Fall 2009: Midterm
10-601 Machine Learning, Fall 009: Midterm Monday, November nd hours 1. Personal info: Name: Andrew account: E-mail address:. You are permitted two pages of notes and a calculator. Please turn off all
More informationIntroduction to Physical Geology (GEOL 1) Physical Science
Introduction to Physical Geology (GEOL 1) Physical Science Dr. Ryan J. McCarty rmccarty@saddleback.edu Office hours 8:30-9:00 12:00-12:30, 5:00-6:00 1 What is Physical Geology Having to do with the material
More informationORIE 4741: Learning with Big Messy Data. Proximal Gradient Method
ORIE 4741: Learning with Big Messy Data Proximal Gradient Method Professor Udell Operations Research and Information Engineering Cornell November 13, 2017 1 / 31 Announcements Be a TA for CS/ORIE 1380:
More informationLecture 7: Kernels for Classification and Regression
Lecture 7: Kernels for Classification and Regression CS 194-10, Fall 2011 Laurent El Ghaoui EECS Department UC Berkeley September 15, 2011 Outline Outline A linear regression problem Linear auto-regressive
More informationCS 188: Artificial Intelligence Spring Announcements
CS 188: Artificial Intelligence Spring 2010 Lecture 24: Perceptrons and More! 4/22/2010 Pieter Abbeel UC Berkeley Slides adapted from Dan Klein Announcements W7 due tonight [this is your last written for
More informationIntelligent Systems (AI-2)
Intelligent Systems (AI-2) Computer Science cpsc422, Lecture 19 Oct, 23, 2015 Slide Sources Raymond J. Mooney University of Texas at Austin D. Koller, Stanford CS - Probabilistic Graphical Models D. Page,
More informationCSCI567 Machine Learning (Fall 2014)
CSCI567 Machine Learning (Fall 24) Drs. Sha & Liu {feisha,yanliu.cs}@usc.edu October 2, 24 Drs. Sha & Liu ({feisha,yanliu.cs}@usc.edu) CSCI567 Machine Learning (Fall 24) October 2, 24 / 24 Outline Review
More informationClass Note #20. In today s class, the following four concepts were introduced: decision
Class Note #20 Date: 03/29/2006 [Overall Information] In today s class, the following four concepts were introduced: decision version of a problem, formal language, P and NP. We also discussed the relationship
More informationSymbolic Logic Outline
Symbolic Logic Outline 1. Symbolic Logic Outline 2. What is Logic? 3. How Do We Use Logic? 4. Logical Inferences #1 5. Logical Inferences #2 6. Symbolic Logic #1 7. Symbolic Logic #2 8. What If a Premise
More informationORIE 4741: Learning with Big Messy Data. Train, Test, Validate
ORIE 4741: Learning with Big Messy Data Train, Test, Validate Professor Udell Operations Research and Information Engineering Cornell December 4, 2017 1 / 14 Exercise You run a hospital. A vendor wants
More informationCSC321 Lecture 4 The Perceptron Algorithm
CSC321 Lecture 4 The Perceptron Algorithm Roger Grosse and Nitish Srivastava January 17, 2017 Roger Grosse and Nitish Srivastava CSC321 Lecture 4 The Perceptron Algorithm January 17, 2017 1 / 1 Recap:
More informationIntelligent Systems (AI-2)
Intelligent Systems (AI-2) Computer Science cpsc422, Lecture 19 Oct, 24, 2016 Slide Sources Raymond J. Mooney University of Texas at Austin D. Koller, Stanford CS - Probabilistic Graphical Models D. Page,
More informationCapacitance. Welcome Back to Physics Micheal Faraday by Thomas Phillips oil on canvas
Welcome Back to Physics 1308 Capacitance Micheal Faraday 1868-1953 by Thomas Phillips oil on canvas Announcements Assignments for Thursday, September 27th: - Reading: Chapter 26.1-26.5 - Watch Video: https://youtu.be/tvwyfhgdvgq
More informationOverfitting, Bias / Variance Analysis
Overfitting, Bias / Variance Analysis Professor Ameet Talwalkar Professor Ameet Talwalkar CS260 Machine Learning Algorithms February 8, 207 / 40 Outline Administration 2 Review of last lecture 3 Basic
More informationVariance Reduction and Ensemble Methods
Variance Reduction and Ensemble Methods Nicholas Ruozzi University of Texas at Dallas Based on the slides of Vibhav Gogate and David Sontag Last Time PAC learning Bias/variance tradeoff small hypothesis
More informationMachine Learning (CS 567) Lecture 5
Machine Learning (CS 567) Lecture 5 Time: T-Th 5:00pm - 6:20pm Location: GFS 118 Instructor: Sofus A. Macskassy (macskass@usc.edu) Office: SAL 216 Office hours: by appointment Teaching assistant: Cheol
More informationON1 Big Mock Test 4 October 2013 Answers This test counts 4% of assessment for the course. Time allowed: 10 minutes
ON1 Big Mock Test 4 October 2013 Answers This test counts 4% of assessment for the course. Time allowed: 10 minutes The actual test will contain only 2 questions. Marking scheme: In each multiple choice
More informationStochastic Gradient Descent
Stochastic Gradient Descent Machine Learning CSE546 Carlos Guestrin University of Washington October 9, 2013 1 Logistic Regression Logistic function (or Sigmoid): Learn P(Y X) directly Assume a particular
More informationHW8. Due: November 1, 2018
CSCI 1010 Theory of Computation HW8 Due: November 1, 2018 Attach a fully filled-in cover sheet to the front of your printed homework Your name should not appear anywhere; the cover sheet and each individual
More informationForecasting. BUS 735: Business Decision Making and Research. exercises. Assess what we have learned
Forecasting BUS 735: Business Decision Making and Research 1 1.1 Goals and Agenda Goals and Agenda Learning Objective Learn how to identify regularities in time series data Learn popular univariate time
More informationCSC 411: Lecture 03: Linear Classification
CSC 411: Lecture 03: Linear Classification Richard Zemel, Raquel Urtasun and Sanja Fidler University of Toronto Zemel, Urtasun, Fidler (UofT) CSC 411: 03-Classification 1 / 24 Examples of Problems What
More informationMIDTERM SOLUTIONS: FALL 2012 CS 6375 INSTRUCTOR: VIBHAV GOGATE
MIDTERM SOLUTIONS: FALL 2012 CS 6375 INSTRUCTOR: VIBHAV GOGATE March 28, 2012 The exam is closed book. You are allowed a double sided one page cheat sheet. Answer the questions in the spaces provided on
More informationElectric Fields of Charge Distributions
Welcome to Physics 308 Electric Fields of Charge Distributions Charles-Augustin de Coulomb 736-806 Physics 308: General Physics II - Professor Jodi Cooley Announcements Assignments for Tuesday, September
More informationMachine Learning. Kernels. Fall (Kernels, Kernelized Perceptron and SVM) Professor Liang Huang. (Chap. 12 of CIML)
Machine Learning Fall 2017 Kernels (Kernels, Kernelized Perceptron and SVM) Professor Liang Huang (Chap. 12 of CIML) Nonlinear Features x4: -1 x1: +1 x3: +1 x2: -1 Concatenated (combined) features XOR:
More informationComp487/587 - Boolean Formulas
Comp487/587 - Boolean Formulas 1 Logic and SAT 1.1 What is a Boolean Formula Logic is a way through which we can analyze and reason about simple or complicated events. In particular, we are interested
More informationLinear Classifiers: Expressiveness
Linear Classifiers: Expressiveness Machine Learning Spring 2018 The slides are mainly from Vivek Srikumar 1 Lecture outline Linear classifiers: Introduction What functions do linear classifiers express?
More informationMA 242 LINEAR ALGEBRA C1, Solutions to First Midterm Exam
MA 242 LINEAR ALGEBRA C Solutions to First Midterm Exam Prof Nikola Popovic October 2 9:am - :am Problem ( points) Determine h and k such that the solution set of x + = k 4x + h = 8 (a) is empty (b) contains
More informationLECTURE 5. Introduction to Econometrics. Hypothesis testing
LECTURE 5 Introduction to Econometrics Hypothesis testing October 18, 2016 1 / 26 ON TODAY S LECTURE We are going to discuss how hypotheses about coefficients can be tested in regression models We will
More informationDecision Trees. Tirgul 5
Decision Trees Tirgul 5 Using Decision Trees It could be difficult to decide which pet is right for you. We ll find a nice algorithm to help us decide what to choose without having to think about it. 2
More informationAnnouncements Monday, September 18
Announcements Monday, September 18 WeBWorK 1.4, 1.5 are due on Wednesday at 11:59pm. The first midterm is on this Friday, September 22. Midterms happen during recitation. The exam covers through 1.5. About
More informationSVMs, Duality and the Kernel Trick
SVMs, Duality and the Kernel Trick Machine Learning 10701/15781 Carlos Guestrin Carnegie Mellon University February 26 th, 2007 2005-2007 Carlos Guestrin 1 SVMs reminder 2005-2007 Carlos Guestrin 2 Today
More informationYou have 3 hours to complete the exam. Some questions are harder than others, so don t spend too long on any one question.
Data 8 Fall 2017 Foundations of Data Science Final INSTRUCTIONS You have 3 hours to complete the exam. Some questions are harder than others, so don t spend too long on any one question. The exam is closed
More informationAnnouncements Monday, November 13
Announcements Monday, November 13 The third midterm is on this Friday, November 17 The exam covers 31, 32, 51, 52, 53, and 55 About half the problems will be conceptual, and the other half computational
More informationMarks. bonus points. } Assignment 1: Should be out this weekend. } Mid-term: Before the last lecture. } Mid-term deferred exam:
Marks } Assignment 1: Should be out this weekend } All are marked, I m trying to tally them and perhaps add bonus points } Mid-term: Before the last lecture } Mid-term deferred exam: } This Saturday, 9am-10.30am,
More informationRandomized Decision Trees
Randomized Decision Trees compiled by Alvin Wan from Professor Jitendra Malik s lecture Discrete Variables First, let us consider some terminology. We have primarily been dealing with real-valued data,
More informationPre-calculus Lesson Plan. Week of:
Pre-calculus Lesson Plan Week of: May 15 Review for semester exam Semester exams Semester exams May 8 12.4 Derivatives Find instantaneous rates of change by Teacher led calculating Student practice derivatives
More informationLecture 02 Linear classification methods I
Lecture 02 Linear classification methods I 22 January 2016 Taylor B. Arnold Yale Statistics STAT 365/665 1/32 Coursewebsite: A copy of the whole course syllabus, including a more detailed description of
More informationPart I. Linear regression & LASSO. Linear Regression. Linear Regression. Week 10 Based in part on slides from textbook, slides of Susan Holmes
Week 10 Based in part on slides from textbook, slides of Susan Holmes Part I Linear regression & December 5, 2012 1 / 1 2 / 1 We ve talked mostly about classification, where the outcome categorical. If
More informationStatistical Machine Learning Theory. From Multi-class Classification to Structured Output Prediction. Hisashi Kashima.
http://goo.gl/xilnmn Course website KYOTO UNIVERSITY Statistical Machine Learning Theory From Multi-class Classification to Structured Output Prediction Hisashi Kashima kashima@i.kyoto-u.ac.jp DEPARTMENT
More informationA Decision Stump. Decision Trees, cont. Boosting. Machine Learning 10701/15781 Carlos Guestrin Carnegie Mellon University. October 1 st, 2007
Decision Trees, cont. Boosting Machine Learning 10701/15781 Carlos Guestrin Carnegie Mellon University October 1 st, 2007 1 A Decision Stump 2 1 The final tree 3 Basic Decision Tree Building Summarized
More informationArtificial Neural Networks" and Nonparametric Methods" CMPSCI 383 Nov 17, 2011!
Artificial Neural Networks" and Nonparametric Methods" CMPSCI 383 Nov 17, 2011! 1 Todayʼs lecture" How the brain works (!)! Artificial neural networks! Perceptrons! Multilayer feed-forward networks! Error
More information9 Classification. 9.1 Linear Classifiers
9 Classification This topic returns to prediction. Unlike linear regression where we were predicting a numeric value, in this case we are predicting a class: winner or loser, yes or no, rich or poor, positive
More informationGaussian and Linear Discriminant Analysis; Multiclass Classification
Gaussian and Linear Discriminant Analysis; Multiclass Classification Professor Ameet Talwalkar Slide Credit: Professor Fei Sha Professor Ameet Talwalkar CS260 Machine Learning Algorithms October 13, 2015
More informationCISC-102 Fall 2017 Week 1 David Rappaport Goodwin G-532 Office Hours: Tuesday 1:30-3:30
Week 1 Fall 2017 1 of 42 CISC-102 Fall 2017 Week 1 David Rappaport daver@cs.queensu.ca Goodwin G-532 Office Hours: Tuesday 1:30-3:30 Homework Homework every week. Keep up to date or you risk falling behind.
More informationData Mining Prof. Pabitra Mitra Department of Computer Science & Engineering Indian Institute of Technology, Kharagpur
Data Mining Prof. Pabitra Mitra Department of Computer Science & Engineering Indian Institute of Technology, Kharagpur Lecture - 17 K - Nearest Neighbor I Welcome to our discussion on the classification
More informationMath 3191 Applied Linear Algebra
Math 191 Applied Linear Algebra Lecture 9: Characterizations of Invertible Matrices Stephen Billups University of Colorado at Denver Math 191Applied Linear Algebra p.1/ Announcements Review for Exam 1
More informationDigital Logic Design ENEE x. Lecture 14
Digital Logic Design ENEE 244-010x Lecture 14 Announcements Homework 6 due today Agenda Last time: Binary Adders and Subtracters (5.1, 5.1.1) Carry Lookahead Adders (5.1.2, 5.1.3) This time: Decimal Adders
More informationCircuits Gustav Robert Kirchhoff 12 March October 1887
Welcome Back to Physics 1308 Circuits Gustav Robert Kirchhoff 12 March 1824 17 October 1887 Announcements Assignments for Thursday, October 4th: - Reading: Chapter 27.3-4 - Watch Video: https://youtu.be/ytoe-rwi8us
More informationSupport Vector Machines: Kernels
Support Vector Machines: Kernels CS6780 Advanced Machine Learning Spring 2015 Thorsten Joachims Cornell University Reading: Murphy 14.1, 14.2, 14.4 Schoelkopf/Smola Chapter 7.4, 7.6, 7.8 Non-Linear Problems
More informationDay 3: Classification, logistic regression
Day 3: Classification, logistic regression Introduction to Machine Learning Summer School June 18, 2018 - June 29, 2018, Chicago Instructor: Suriya Gunasekar, TTI Chicago 20 June 2018 Topics so far Supervised
More informationStatic Equilibirum CHECK YOUR NEIGHBOR
PHY131H1F - Class 19 Today: Today, Chapter 12: Conditions for Static Equilibrium Center of Gravity Static Equilibrium Problems Stability Suppose the girl on the left suddenly is handed a bag of apples
More informationCS 446 Machine Learning Fall 2016 Nov 01, Bayesian Learning
CS 446 Machine Learning Fall 206 Nov 0, 206 Bayesian Learning Professor: Dan Roth Scribe: Ben Zhou, C. Cervantes Overview Bayesian Learning Naive Bayes Logistic Regression Bayesian Learning So far, we
More information6. The scalar multiple of u by c, denoted by c u is (also) in V. (closure under scalar multiplication)
Definition: A subspace of a vector space V is a subset H of V which is itself a vector space with respect to the addition and scalar multiplication in V. As soon as one verifies a), b), c) below for H,
More information15-388/688 - Practical Data Science: Nonlinear modeling, cross-validation, regularization, and evaluation
15-388/688 - Practical Data Science: Nonlinear modeling, cross-validation, regularization, and evaluation J. Zico Kolter Carnegie Mellon University Fall 2016 1 Outline Example: return to peak demand prediction
More informationLearning from Examples
Learning from Examples Data fitting Decision trees Cross validation Computational learning theory Linear classifiers Neural networks Nonparametric methods: nearest neighbor Support vector machines Ensemble
More informationNP-Completeness Part II
NP-Completeness Part II Please evaluate this course on Axess. Your comments really do make a difference. Announcements Problem Set 8 due tomorrow at 12:50PM sharp with one late day. Problem Set 9 out,
More informationData Mining Techniques. Lecture 3: Probability
Data Mining Techniques CS 6220 - Section 3 - Fall 2016 Lecture 3: Probability Jan-Willem van de Meent (credit: Zhao, CS 229, Bishop) Project Vote 1. Freeform: Develop your own project proposals 30% of
More informationCS 188: Artificial Intelligence Spring Announcements
CS 188: Artificial Intelligence Spring 2010 Lecture 22: Nearest Neighbors, Kernels 4/18/2011 Pieter Abbeel UC Berkeley Slides adapted from Dan Klein Announcements On-going: contest (optional and FUN!)
More informationKernels. Machine Learning CSE446 Carlos Guestrin University of Washington. October 28, Carlos Guestrin
Kernels Machine Learning CSE446 Carlos Guestrin University of Washington October 28, 2013 Carlos Guestrin 2005-2013 1 Linear Separability: More formally, Using Margin Data linearly separable, if there
More informationEnsemble Methods. Charles Sutton Data Mining and Exploration Spring Friday, 27 January 12
Ensemble Methods Charles Sutton Data Mining and Exploration Spring 2012 Bias and Variance Consider a regression problem Y = f(x)+ N(0, 2 ) With an estimate regression function ˆf, e.g., ˆf(x) =w > x Suppose
More informationSVMs: nonlinearity through kernels
Non-separable data e-8. Support Vector Machines 8.. The Optimal Hyperplane Consider the following two datasets: SVMs: nonlinearity through kernels ER Chapter 3.4, e-8 (a) Few noisy data. (b) Nonlinearly
More informationMath 308 Midterm Practice Winter 2015
Math 308 Midterm Practice Winter 205 The Midterm will take place Feb th, during lecture time. It will cover the material of Chapter, Chapter 2 and Chapter 3, Section 3.. You do not need to bring paper
More information1.1 P, NP, and NP-complete
CSC5160: Combinatorial Optimization and Approximation Algorithms Topic: Introduction to NP-complete Problems Date: 11/01/2008 Lecturer: Lap Chi Lau Scribe: Jerry Jilin Le This lecture gives a general introduction
More informationNaive Bayes and Gaussian Bayes Classifier
Naive Bayes and Gaussian Bayes Classifier Elias Tragas tragas@cs.toronto.edu October 3, 2016 Elias Tragas Naive Bayes and Gaussian Bayes Classifier October 3, 2016 1 / 23 Naive Bayes Bayes Rules: Naive
More informationProbability concepts. Math 10A. October 33, 2017
October 33, 207 Serge Lang lecture This year s Serge Lang Undergraduate Lecture will be given by Keith Devlin of Stanford University. The title is When the precision of mathematics meets the messiness
More informationNearest Neighbor. Machine Learning CSE546 Kevin Jamieson University of Washington. October 26, Kevin Jamieson 2
Nearest Neighbor Machine Learning CSE546 Kevin Jamieson University of Washington October 26, 2017 2017 Kevin Jamieson 2 Some data, Bayes Classifier Training data: True label: +1 True label: -1 Optimal
More informationNonlinear Classification
Nonlinear Classification INFO-4604, Applied Machine Learning University of Colorado Boulder October 5-10, 2017 Prof. Michael Paul Linear Classification Most classifiers we ve seen use linear functions
More informationSupport Vector Machines, Kernel SVM
Support Vector Machines, Kernel SVM Professor Ameet Talwalkar Professor Ameet Talwalkar CS260 Machine Learning Algorithms February 27, 2017 1 / 40 Outline 1 Administration 2 Review of last lecture 3 SVM
More informationWillmar Public Schools Curriculum Map
Subject Area Mathematics Senior High Course Name Advanced Algebra 2A (Prentice Hall Mathematics) Date April 2010 The Advanced Algebra 2A course parallels each other in content and time. The Advanced Algebra
More informationKernel methods CSE 250B
Kernel methods CSE 250B Deviations from linear separability Noise Find a separator that minimizes a convex loss function related to the number of mistakes. e.g. SVM, logistic regression. Deviations from
More informationAnnouncements Monday, November 13
Announcements Monday, November 13 The third midterm is on this Friday, November 17. The exam covers 3.1, 3.2, 5.1, 5.2, 5.3, and 5.5. About half the problems will be conceptual, and the other half computational.
More informationFrom Greek philosophers to circuits: An introduction to boolean logic. COS 116, Spring 2011 Sanjeev Arora
From Greek philosophers to circuits: An introduction to boolean logic. COS 116, Spring 2011 Sanjeev Arora Midterm One week from today in class Mar 10 Covers lectures, labs, homework, readings to date You
More information2018 Fall 2210Q Section 013 Midterm Exam I Solution
8 Fall Q Section 3 Midterm Exam I Solution True or False questions ( points = points) () An example of a linear combination of vectors v, v is the vector v. True. We can write v as v + v. () If two matrices
More informationProblem Score Problem Score 1 /12 6 /12 2 /12 7 /12 3 /12 8 /12 4 /12 9 /12 5 /12 10 /12 Total /120
EE103/CME103: Introduction to Matrix Methods October 27 2016 S. Boyd Midterm Exam This is an in-class 80 minute midterm. You may not use any books, notes, or computer programs (e.g., Julia). Throughout
More informationTHE SAMPLING DISTRIBUTION OF THE MEAN
THE SAMPLING DISTRIBUTION OF THE MEAN COGS 14B JANUARY 26, 2017 TODAY Sampling Distributions Sampling Distribution of the Mean Central Limit Theorem INFERENTIAL STATISTICS Inferential statistics: allows
More informationLinear and Logistic Regression. Dr. Xiaowei Huang
Linear and Logistic Regression Dr. Xiaowei Huang https://cgi.csc.liv.ac.uk/~xiaowei/ Up to now, Two Classical Machine Learning Algorithms Decision tree learning K-nearest neighbor Model Evaluation Metrics
More informationLinear Algebra. Introduction. Marek Petrik 3/23/2017. Many slides adapted from Linear Algebra Lectures by Martin Scharlemann
Linear Algebra Introduction Marek Petrik 3/23/2017 Many slides adapted from Linear Algebra Lectures by Martin Scharlemann Midterm Results Highest score on the non-r part: 67 / 77 Score scaling: Additive
More informationMachine learning comes from Bayesian decision theory in statistics. There we want to minimize the expected value of the loss function.
Bayesian learning: Machine learning comes from Bayesian decision theory in statistics. There we want to minimize the expected value of the loss function. Let y be the true label and y be the predicted
More informationCS 559: Machine Learning Fundamentals and Applications 1 st Set of Notes
1 CS 559: Machine Learning Fundamentals and Applications 1 st Set of Notes Instructor: Philippos Mordohai Webpage: www.cs.stevens.edu/~mordohai E-mail: Philippos.Mordohai@stevens.edu Office: Lieb 215 Objectives
More informationClass 19. Daniel B. Rowe, Ph.D. Department of Mathematics, Statistics, and Computer Science. Marquette University MATH 1700
Class 19 Daniel B. Rowe, Ph.D. Department of Mathematics, Statistics, and Computer Science Copyright 2017 by D.B. Rowe 1 Agenda: Recap Chapter 8.3-8.4 Lecture Chapter 8.5 Go over Exam. Problem Solving
More informationDecision Trees. Machine Learning 10701/15781 Carlos Guestrin Carnegie Mellon University. February 5 th, Carlos Guestrin 1
Decision Trees Machine Learning 10701/15781 Carlos Guestrin Carnegie Mellon University February 5 th, 2007 2005-2007 Carlos Guestrin 1 Linear separability A dataset is linearly separable iff 9 a separating
More informationUmans Complexity Theory Lectures
Complexity Theory Umans Complexity Theory Lectures Lecture 1a: Problems and Languages Classify problems according to the computational resources required running time storage space parallelism randomness
More informationDeviations from linear separability. Kernel methods. Basis expansion for quadratic boundaries. Adding new features Systematic deviation
Deviations from linear separability Kernel methods CSE 250B Noise Find a separator that minimizes a convex loss function related to the number of mistakes. e.g. SVM, logistic regression. Systematic deviation
More informationCSE 311: Foundations of Computing I
CSE 311: Foundations of Computing I Autumn 2015 Lecture 1: Propositional Logic Overload Request Link: http://tinyurl.com/p5vs5xb We will study the theory needed for CSE. about the course Logic: How can
More informationLecture 7: DecisionTrees
Lecture 7: DecisionTrees What are decision trees? Brief interlude on information theory Decision tree construction Overfitting avoidance Regression trees COMP-652, Lecture 7 - September 28, 2009 1 Recall:
More informationDecision Trees. CSC411/2515: Machine Learning and Data Mining, Winter 2018 Luke Zettlemoyer, Carlos Guestrin, and Andrew Moore
Decision Trees Claude Monet, The Mulberry Tree Slides from Pedro Domingos, CSC411/2515: Machine Learning and Data Mining, Winter 2018 Luke Zettlemoyer, Carlos Guestrin, and Andrew Moore Michael Guerzhoy
More informationMachine Learning
Machine Learning 10-601 Tom M. Mitchell Machine Learning Department Carnegie Mellon University October 11, 2012 Today: Computational Learning Theory Probably Approximately Coorrect (PAC) learning theorem
More informationLECTURE 15: SIMPLE LINEAR REGRESSION I
David Youngberg BSAD 20 Montgomery College LECTURE 5: SIMPLE LINEAR REGRESSION I I. From Correlation to Regression a. Recall last class when we discussed two basic types of correlation (positive and negative).
More informationDecision Trees. Nicholas Ruozzi University of Texas at Dallas. Based on the slides of Vibhav Gogate and David Sontag
Decision Trees Nicholas Ruozzi University of Texas at Dallas Based on the slides of Vibhav Gogate and David Sontag Supervised Learning Input: labelled training data i.e., data plus desired output Assumption:
More informationCSE 2001: Introduction to Theory of Computation Fall Suprakash Datta
CSE 2001: Introduction to Theory of Computation Fall 2012 Suprakash Datta datta@cse.yorku.ca Office: CSEB 3043 Phone: 416-736-2100 ext 77875 Course page: http://www.cs.yorku.ca/course/2001 9/6/2012 CSE
More information