Mustafa Jarrar: Lecture Notes on Linear Regression, Birzeit University, 2018. Machine Learning: Linear Regression.
1 Mustafa Jarrar: Lecture Notes on Linear Regression, Birzeit University, 2018, Version 1. Machine Learning: Linear Regression. Mustafa Jarrar, Birzeit University.
2 Watch this lecture and download the slides. (Course page and more online courses: links on the slide.) Acknowledgement: this lecture is based on (but not limited to) Andrew Ng's course about Machine Learning.
3 Machine Learning: Linear Regression. In this lecture: Part 1: Motivation (Regression Problems); Part 2: Linear Regression; Part 3: The Cost Function; Part 4: The Gradient Descent Algorithm; Part 5: The Normal Equation; Part 6: Linear Algebra Overview; Part 7: Using Octave; Part 8: Using R.
4 Motivation. Given the following housing prices (plot: Price in $1000s vs. Size in feet²): how much does a house of 1250 ft² cost? If we assume a linear curve through the data, we can conclude that a house of 1250 ft² costs about $230K.
5 Motivation. Given the following housing prices (plot: Price in $1000s vs. Size in feet²). Supervised Learning: we are given the "right answers" for each example in the data (the training data). Regression Problem: predict a real-valued output. Remember that classification (not regression) refers to predicting discrete-valued output.
6 Motivation. Given the following training set of housing prices (table: Size in feet² (x), Price in $1000s (y)), our job is to learn from this data how to predict prices. Notation: m = number of training examples; x = input variable / features; y = output variable / target variable; (x, y): a training example; (x^(i), y^(i)): the i-th training example. For example: x^(1) = 2104, y^(1) = 460.
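A small Octave sketch of this notation (not part of the original slides; the four example pairs follow the housing table used in Andrew Ng's course, of which only the first pair appears above):
  x = [2104; 1416; 1534; 852];   % input variable: size in feet^2
  y = [460; 232; 315; 178];      % output / target variable: price in $1000s
  m = length(x);                 % m = number of training examples (here 4)
  printf("m = %d, x^(1) = %d, y^(1) = %d\n", m, x(1), y(1));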
7 Machine Learning: Linear Regression. In this lecture: Part 1: Motivation (Regression Problems); Part 2: Linear Regression; Part 3: The Cost Function; Part 4: The Gradient Descent Algorithm; Part 5: The Normal Equation; Part 6: Linear Algebra Overview; Part 7: Using Octave; Part 8: Using R.
8 Linear Regression. (Diagram: Training Set -> Learning Algorithm -> hypothesis h; Size of house x -> h -> Estimated Price.) How do we represent h? hθ(x) = θ0 + θ1x. This is a linear function; h maps x's to y's. This model is also called linear regression with one variable.
9 Linear Regression with Multiple Features. Linear regression with multiple features is also called multiple linear regression. Suppose we have the features Size (ft²), bedrooms, floors, and Age, and the target Price (y). A hypothesis function hθ(x) might be: hθ(x) = θ0 + θ1x1 + θ2x2 + ... + θnxn, or equivalently hθ(x) = θ0x0 + θ1x1 + ... + θnxn with x0 = 1.
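A minimal Octave sketch of evaluating this hypothesis for one example (not from the original slides; the feature values and θ values below are made-up illustrations):
  x     = [1; 2104; 5; 1; 45];       % x0 = 1, then size, bedrooms, floors, age
  theta = [80; 0.1; 0.01; 3; -2];    % hypothetical parameters theta0..theta4
  h = theta' * x;                    % h_theta(x) = theta0*x0 + ... + thetan*xn
  printf("h_theta(x) = %.2f (price in $1000s)\n", h);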
10 Polynomial Regression. Suppose we have the features Size (ft²), bedrooms, floors, and Age, and the target Price (y). Another hypothesis function hθ(x) might be polynomial: hθ(x) = θ0 + θ1x + θ2x^2 + ... + θnx^n (with x0 = 1).
11 Polynomial Regression. (Plot: Price (y) vs. Size (x), with candidate fits such as a linear hθ(x) = θ0 + θ1x, a quadratic hθ(x) = θ0 + θ1x + θ2x^2, and a cubic hθ(x) = θ0 + θ1x + θ2x^2 + θ3x^3.) We may combine features (e.g., size = width * depth). We have the option of which features and which models (quadratic, cubic, ...) to use. Deciding which features and models best fit our data and application is beyond the scope of this course, but there are several algorithms for this.
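An illustrative Octave sketch of how polynomial regression reuses the linear machinery by building powers of a feature (not from the original slides; all numeric values are assumed):
  size_ft2 = [852; 1416; 2104];
  X = [ones(3,1) size_ft2 size_ft2.^2 size_ft2.^3];   % columns: 1, x, x^2, x^3
  theta = [1; 0.2; 1e-4; -1e-8];                      % made-up coefficients
  predictions = X * theta;                            % cubic hypothesis h = X*theta
  disp(predictions);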
12 Machine Learning: Linear Regression. In this lecture: Part 1: Motivation (Regression Problems); Part 2: Linear Regression; Part 3: The Cost Function; Part 4: The Gradient Descent Algorithm; Part 5: The Normal Equation; Part 6: Linear Algebra Overview; Part 7: Using Octave; Part 8: Using R.
13 Understanding θ Values. hθ(x) = θ0 + θ1x. The θi's are the parameters. How do we choose the θi's?
14 Understanding θ Values. (Three example plots of hθ(x) = θ0 + θ1x for different choices of θ0 and θ1: θ1 = 0 gives a horizontal line, θ0 = 0 gives a line through the origin, and non-zero values of both give a general line.)
15 The Cost Function. Idea: choose θ0, θ1 so that hθ(x) is close to y for our training examples (x, y). Here y is the actual value and hθ(x) is the estimated value. The cost function J measures the error we want to minimize over θ0, θ1: J(θ0, θ1) = (1/(2m)) Σ_{i=1}^{m} (hθ(x^(i)) - y^(i))^2. This cost function is also called a Squared Error function.
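A minimal Octave sketch of this cost function in vectorized form (not part of the original slides; it assumes X already contains the x0 = 1 column and could be saved as computeCost.m):
  function J = computeCost(X, y, theta)
    m = length(y);                    % number of training examples
    errors = X * theta - y;           % h_theta(x^(i)) - y^(i) for every i
    J = (1 / (2 * m)) * sum(errors .^ 2);
  end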
16 Understanding the Cost Function. Let's try different values of θ1 and see how the cost function J(θ0, θ1) behaves. Fix θ0 = 0, so hθ(x) = θ1·x, and use the training set (1,1), (2,2), (3,3):
J(1) = 1/(2·3) [(0)^2 + (0)^2 + (0)^2] = 0; given our training data, this is the best value of θ1.
J(0.5) = 1/(2·3) [(0.5-1)^2 + (1-2)^2 + (1.5-3)^2] ≈ 0.58
J(0) = 1/(2·3) [(1)^2 + (2)^2 + (3)^2] ≈ 2.3
But this figure plots J over θ1 only; the problem becomes more complicated if we also vary θ0.
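The same numbers can be reproduced with a few lines of Octave (an illustrative sketch, not from the original slides):
  x = [1; 2; 3];  y = [1; 2; 3];  m = length(y);   % the toy training set above
  for theta1 = [1 0.5 0]
    J = (1 / (2 * m)) * sum((theta1 * x - y) .^ 2);
    printf("J(%.1f) = %.2f\n", theta1, J);          % prints 0.00, 0.58, 2.33
  end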
17 Cost Function Intuition II. Remember that our goal is to find the values of θ0 and θ1 that minimize the cost. Hypothesis: hθ(x) = θ0 + θ1x. Parameters: θ0, θ1. Cost function: J(θ0, θ1) = (1/(2m)) Σ_{i=1}^{m} (hθ(x^(i)) - y^(i))^2. Our goal: minimize J(θ0, θ1).
18 Cost Function Intuition II. Given a dataset, when we plot the cost function J(θ0, θ1) we may get a 3D surface like the one on the slide. Our goal is to find the values of θ0, θ1 at which J(θ0, θ1) is minimal.
19 Cost Function Intuition II. We may also draw the cost function using contour plots.
20-21 Cost Function Intuition II. (Plots pairing a particular hypothesis line hθ(x) = θ0 + θ1x over the data with the corresponding point (θ0, θ1) on the contour plot of J.)
22 Cost Function Intuition II. (Contour plot with a point (θ0, θ1) near the minimum.) Is there any way/algorithm to find the θ's automatically? Yes, e.g., the Gradient Descent Algorithm.
23 Machine Learning: Linear Regression. In this lecture: Part 1: Motivation (Regression Problems); Part 2: Linear Regression; Part 3: The Cost Function; Part 4: The Gradient Descent Algorithm; Part 5: The Normal Equation; Part 6: Linear Algebra Overview; Part 7: Using Octave; Part 8: Using R.
24 The Gradient Descent Algorithm. Start with some initial values of θ0 and θ1; keep changing θ0 and θ1 to reduce J(θ0, θ1), until hopefully we end up at a minimum.
Repeat until convergence {
  temp0 := θ0 - α · (∂/∂θ0) J(θ0, θ1)
  temp1 := θ1 - α · (∂/∂θ1) J(θ0, θ1)
  θ0 := temp0
  θ1 := temp1
}
Here α is the learning rate, (∂/∂θj) J(θ0, θ1) is the partial derivative of the cost with respect to θj (j is 0 or 1), and θ0, θ1 are updated simultaneously.
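As a concrete illustration of this update rule (not from the slides), the Octave sketch below minimizes a toy cost J(θ0, θ1) = (θ0 - 3)^2 + (θ1 + 1)^2, whose minimum is at (3, -1); the learning rate and iteration count are arbitrary choices:
  theta0 = 0;  theta1 = 0;
  alpha  = 0.1;                                  % learning rate
  for iter = 1:100
    temp0 = theta0 - alpha * 2 * (theta0 - 3);   % theta0 - alpha * dJ/dtheta0
    temp1 = theta1 - alpha * 2 * (theta1 + 1);   % theta1 - alpha * dJ/dtheta1
    theta0 = temp0;  theta1 = temp1;             % simultaneous update
  end
  printf("theta0 = %.3f, theta1 = %.3f\n", theta0, theta1);   % approaches 3 and -1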
25 Problems with the Gradient Descent Algorithm. If α is too small, gradient descent can be slow. If α is too large, gradient descent can overshoot the minimum; it may fail to converge, or even diverge. Depending on the starting point, gradient descent may converge to the global minimum or only to a local minimum.
26 Gradient Descent for Linear Regression. Simplified version of the update rule for linear regression (j is 0 and 1):
Repeat until convergence {
  θ0 := θ0 - α · (1/m) Σ_{i=1}^{m} (hθ(x^(i)) - y^(i))
  θ1 := θ1 - α · (1/m) Σ_{i=1}^{m} (hθ(x^(i)) - y^(i)) · x^(i)
}
Next, we will use linear algebra to compute the minimizing θ directly (the Normal Equation), without needing an iterative algorithm like Gradient Descent. However, Gradient Descent scales better for bigger datasets.
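A minimal Octave sketch of these two update rules on the housing numbers used earlier (not from the original slides; the rescaling of x, the learning rate, and the iteration count are assumed for illustration):
  x = [2104; 1416; 1534; 852] / 1000;    % sizes, rescaled so alpha = 0.1 behaves well
  y = [460; 232; 315; 178];              % prices in $1000s
  m = length(y);
  theta0 = 0;  theta1 = 0;  alpha = 0.1;
  for iter = 1:5000
    h     = theta0 + theta1 * x;                         % current predictions
    temp0 = theta0 - alpha * (1/m) * sum(h - y);         % update for theta0
    temp1 = theta1 - alpha * (1/m) * sum((h - y) .* x);  % update for theta1
    theta0 = temp0;  theta1 = temp1;                     % simultaneous update
  end
  printf("h(x) = %.1f + %.1f * (size/1000)\n", theta0, theta1);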
27 Machine Learning: Linear Regression. In this lecture: Part 1: Motivation (Regression Problems); Part 2: Linear Regression; Part 3: The Cost Function; Part 4: The Gradient Descent Algorithm; Part 5: The Normal Equation; Part 6: Linear Algebra Overview; Part 7: Using Octave; Part 8: Using R.
28 Minimizing θ Numerically. Another way to estimate the optimal values of θ is numerical, using linear algebra. Given the features Size (ft²), bedrooms, floors, and Age, and the target Price (y), convert the features and the target values into a matrix X and a vector y.
29 Minimizing θ Numerically. Add a feature x0 = 1 to every example, so as to represent θ0: X then starts with a column of ones, followed by the feature columns, and y is the column vector of prices.
30 The Normal Equation. To obtain the optimal (cost-minimizing) values of the θ's, use the following Normal Equation: θ = (X^T X)^(-1) X^T y, where θ is the vector of θ's we want to find, X is the matrix of features in the training set, y is the vector of output values we try to predict, X^T is the transpose of X, and ( )^(-1) denotes the matrix inverse. In Octave: pinv(X'*X)*X'*y
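A small end-to-end Octave sketch of the Normal Equation (not from the original slides; the bedroom values are assumed stand-ins for the feature table above):
  sizes    = [2104; 1416; 1534; 852];
  bedrooms = [5; 3; 3; 2];                   % hypothetical values
  y        = [460; 232; 315; 178];
  X = [ones(length(y), 1) sizes bedrooms];   % x0 = 1 column, then the features
  theta = pinv(X' * X) * X' * y;             % the slide's pinv(X'*X)*X'*y
  disp(theta);                               % one theta per column of X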
31 Gradient Descent vs. Normal Equation (m training examples, n features). Gradient Descent: needs a choice of learning rate α; needs many iterations; works well even when n is large. Normal Equation: no need to choose α; no need to iterate; needs to compute the inverse of (X^T X), which takes about O(n^3) time, so it is slow if n is very large; some matrices (e.g., singular matrices) are non-invertible. Recommendation: use the Normal Equation if the number of features is less than about 1000; otherwise use Gradient Descent.
32 Machine Learning: Linear Regression. In this lecture: Part 1: Motivation (Regression Problems); Part 2: Linear Regression; Part 3: The Cost Function; Part 4: The Gradient Descent Algorithm; Part 5: The Normal Equation; Part 6: Linear Algebra Overview; Part 7: Using Octave; Part 8: Using R.
33 Matrix-Vector Multiplication. Multiplying an m×n matrix by an n×1 vector gives an m×1 vector: each entry of the result is the dot product of the corresponding matrix row with the vector. (The slide shows a worked numeric example.)
34 Matrix-Matrix Multiplication. Multiply the left matrix with the first column of the right matrix to get the first column of the result, then with the second column to get the second column, and so on. (The slide shows a worked numeric example.)
35 Matrix Inverse. If A is an m×m matrix and it has an inverse, then A·A^(-1) = A^(-1)·A = I: the multiplication of a matrix with its inverse produces an identity matrix. Some matrices do not have inverses, e.g., a matrix whose cells are all zeros. For more, please review Linear Algebra.
36 Matrix Transpose. Example: if A is a 2×3 matrix, then A^T is the 3×2 matrix obtained by turning rows into columns. In general, let A be an m×n matrix and let B = A^T; then B is an n×m matrix, and B_ij = A_ji.
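The same operations can be tried directly in Octave (an illustrative sketch, not from the original slides; the numeric values are arbitrary):
  A = [1 3; 4 0; 2 1];
  v = [1; 5];
  A * v          % matrix-vector product: a 3x1 vector
  B = [1 2; 3 4];
  C = [0 1; 1 0];
  B * C          % matrix-matrix product: one result column per column of C
  inv(B) * B     % the inverse times the matrix gives the 2x2 identity
  A'             % transpose: the 3x2 matrix A becomes 2x3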
37 Machine Learning: Linear Regression. In this lecture: Part 1: Motivation (Regression Problems); Part 2: Linear Regression; Part 3: The Cost Function; Part 4: The Gradient Descent Algorithm; Part 5: The Normal Equation; Part 6: Linear Algebra Overview; Part 7: Using Octave; Part 8: Using R.
38 Octave. A high-level scientific programming language and a free alternative to (and compatible with) Matlab; it helps in solving linear and nonlinear problems numerically. (Download and online-tutorial links are given on the slide.)
39 Computing the Normal Equation using Octave. Given the matrices X and y from the housing example, in Octave:
load X.txt
load y.txt
C = pinv(X'*X)*X'*y
save theta.txt C
The resulting values are the θ's we use in our hypothesis function hθ(x) = θ0 + θ1x1 + ... + θnxn. (The numeric θ values and the resulting hypothesis are shown on the slide.)
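A hedged follow-up sketch (not on the slide) of using the computed parameters to price a new house; the θ values and the new house's features below are made-up placeholders:
  theta = [89.6; 0.14; 0.5; -8.7; -0.8];    % hypothetical theta0..theta4
  x_new = [1; 1250; 3; 1; 20];              % x0 = 1, size, bedrooms, floors, age
  estimated_price = theta' * x_new;         % h_theta(x) = theta^T x
  printf("Estimated price: %.1f (in $1000s)\n", estimated_price);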
40 Machine Learning: Linear Regression. In this lecture: Part 1: Motivation (Regression Problems); Part 2: Linear Regression; Part 3: The Cost Function; Part 4: The Gradient Descent Algorithm; Part 5: The Normal Equation; Part 6: Linear Algebra Overview; Part 7: Using Octave; Part 8: Using R.
41 R (programming language). A free software environment for statistical computing and graphics, supported by the R Foundation for Statistical Computing. R comes with many functions for sophisticated analyses, and additional packages can be installed to do much more. Download R and the RStudio IDE (links on the slide). R basics tutorial (YouTube): UZiDpam7ygg
42 Computing the Normal Equation using R. Given the matrices X and y, in R:
X = as.matrix(read.table("~/X.txt", header=F, sep=","))
Y = as.matrix(read.table("~/y.txt", header=F, sep=","))
thetas = solve( t(X) %*% X ) %*% t(X) %*% Y
write.table(thetas, file="~/thetas.txt", row.names=F, col.names=F)
t(X) is the transpose of matrix X; solve(M) computes the inverse of matrix M.
43 References
[1] Andrew Ng's course about Machine Learning.
[2] Sami Ghawi, Mustafa Jarrar: Lecture Notes on Introduction to Machine Learning, Birzeit University, 2018.
[3] Mustafa Jarrar: Lecture Notes on Decision Tree Machine Learning, Birzeit University, 2018.
[4] Mustafa Jarrar: Lecture Notes on Linear Regression Machine Learning, Birzeit University, 2018.