CSCI567 Machine Learning (Fall 2014)
1 CSCI567 Machine Learning (Fall 2014), Drs. Sha & Liu, October 2, 2014
2 Outline
1. Review of last lecture: linear regression
2. Nonlinear basis functions
3. Basic ideas for overcoming overfitting
3 Review of last lecture (Linear regression): Estimating model parameters
Design matrix and target vector:
$$X = \begin{pmatrix} x_1^T \\ x_2^T \\ \vdots \\ x_N^T \end{pmatrix} \in \mathbb{R}^{N \times D}, \qquad \tilde{X} = (\mathbf{1}\ X) \in \mathbb{R}^{N \times (D+1)}, \qquad y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{pmatrix}$$
Residual sum of squares in matrix form:
$$\mathrm{RSS}(\tilde{w}) = \tilde{w}^T \tilde{X}^T \tilde{X}\, \tilde{w} - 2\,(\tilde{X}^T y)^T \tilde{w} + \text{const}$$
4 Review of last lecture (Linear regression): Optimal solution, the normal equation
Take the derivative with respect to $\tilde{w}$ and set it to zero:
$$\frac{\partial\, \mathrm{RSS}(\tilde{w})}{\partial \tilde{w}} \propto \tilde{X}^T \tilde{X}\, \tilde{w} - \tilde{X}^T y = 0$$
This leads to the least-mean-squares (LMS) solution:
$$\tilde{w}^{\mathrm{LMS}} = (\tilde{X}^T \tilde{X})^{-1} \tilde{X}^T y$$
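As an illustration of the normal equation above, here is a minimal NumPy sketch (the synthetic data and variable names are my own assumptions, not from the lecture) that builds the augmented design matrix and solves for the parameters in closed form:

```python
import numpy as np

# Synthetic 1-D regression data (illustrative only): y = 3*x + 2 + noise
rng = np.random.default_rng(0)
N, D = 100, 1
X = rng.uniform(-1, 1, size=(N, D))
y = 3.0 * X[:, 0] + 2.0 + 0.1 * rng.standard_normal(N)

# Augment with a column of ones so the bias w_0 is absorbed into w_tilde
X_tilde = np.hstack([np.ones((N, 1)), X])            # shape (N, D+1)

# Normal-equation / LMS solution: w = (X~^T X~)^{-1} X~^T y
# (np.linalg.solve is preferred over explicitly inverting X~^T X~)
w_lms = np.linalg.solve(X_tilde.T @ X_tilde, X_tilde.T @ y)
print("estimated [w_0, w_1]:", w_lms)                # should be close to [2, 3]
```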
5 Review of last lecture (Linear regression): Practical concerns
The bottleneck in computing the LMS solution $\tilde{w} = (\tilde{X}^T \tilde{X})^{-1} \tilde{X}^T y$ is inverting the matrix $\tilde{X}^T \tilde{X} \in \mathbb{R}^{(D+1) \times (D+1)}$.
Scalable methods: batch gradient descent and stochastic gradient descent.
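For reference, a minimal batch gradient descent sketch for the same objective (the step size, iteration limit, and convergence test are illustrative choices, not values from the lecture):

```python
import numpy as np

def batch_gradient_descent(X_tilde, y, eta=0.5, max_iters=10000, tol=1e-10):
    """Minimize RSS(w) = ||X_tilde @ w - y||^2 with full-batch gradient steps."""
    N, d = X_tilde.shape
    w = np.zeros(d)
    for _ in range(max_iters):
        # Gradient of the mean squared error; the constant factor 2 is folded into eta
        grad = X_tilde.T @ (X_tilde @ w - y) / N
        w_new = w - eta * grad
        if np.linalg.norm(w_new - w) < tol:   # simple convergence test
            return w_new
        w = w_new
    return w
```

Each iteration costs O(ND), so no (D+1) x (D+1) matrix ever has to be inverted; with a small enough step size this converges to the same solution as the normal equation.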
6 Review of last lecture (Linear regression): Stochastic gradient descent
Widrow-Hoff rule: update the parameters using one example at a time.
Initialize $\tilde{w}$ to $\tilde{w}^{(0)}$ (anything reasonable is fine); set $t = 0$; choose $\eta > 0$.
Loop until convergence:
1. Randomly choose a training sample $x_t$
2. Compute its contribution to the gradient (ignoring the constant factor): $g_t = (\tilde{x}_t^T \tilde{w}^{(t)} - y_t)\, \tilde{x}_t$
3. Update the parameters: $\tilde{w}^{(t+1)} = \tilde{w}^{(t)} - \eta\, g_t$
4. $t \leftarrow t + 1$
Scalable to large problems and often very effective.
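A minimal sketch of the Widrow-Hoff update above (the learning rate and number of passes over the data are illustrative assumptions, not from the slides):

```python
import numpy as np

def sgd_linear_regression(X_tilde, y, eta=0.01, n_epochs=50, seed=0):
    """Widrow-Hoff (LMS) rule: update w using one training example at a time."""
    rng = np.random.default_rng(seed)
    N, d = X_tilde.shape
    w = np.zeros(d)                          # w^(0)
    for _ in range(n_epochs):
        for t in rng.permutation(N):         # visit the samples in random order
            x_t, y_t = X_tilde[t], y[t]
            g_t = (x_t @ w - y_t) * x_t      # per-example gradient (constant factor dropped)
            w = w - eta * g_t                # w^(t+1) = w^(t) - eta * g_t
    return w
```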
7 Review of last lecture (Linear regression): Regularized least squares / ridge regression
For $\tilde{X}^T \tilde{X}$ that is not invertible:
$$\tilde{w} = (\tilde{X}^T \tilde{X} + \lambda I)^{-1} \tilde{X}^T y$$
This is equivalent to adding an extra term to $\mathrm{RSS}(\tilde{w})$:
$$\underbrace{\tilde{w}^T \tilde{X}^T \tilde{X}\, \tilde{w} - 2\,(\tilde{X}^T y)^T \tilde{w}}_{\mathrm{RSS}(\tilde{w})} \;+\; \underbrace{\lambda\, \|\tilde{w}\|_2^2}_{\text{regularization}}$$
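A one-line change to the earlier normal-equation sketch gives the ridge solution (the value of lam below is an arbitrary illustrative choice):

```python
import numpy as np

def ridge_solution(X_tilde, y, lam=0.1):
    """Closed-form ridge regression: w = (X~^T X~ + lam * I)^{-1} X~^T y."""
    d = X_tilde.shape[1]
    A = X_tilde.T @ X_tilde + lam * np.eye(d)   # lam * I keeps the system invertible / well conditioned
    return np.linalg.solve(A, X_tilde.T @ y)
```

Many treatments leave the bias $w_0$ out of the penalty; the slide's formula penalizes all of $\tilde{w}$, so the sketch does the same.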
8 Review of last lecture (Linear regression): Things you should know about linear regression
Linear regression is a linear combination of features: $f: x \to y$, with $f(x) = w_0 + \sum_d w_d x_d = w_0 + w^T x$.
If we minimize the residual sum of squares as our learning objective, we get a closed-form solution for the parameters. You can either invert a matrix to obtain the parameters, or use gradient-based methods (batch or stochastic gradient descent).
Probabilistic interpretation: maximum likelihood, if the residual is assumed to be Gaussian distributed.
Ridge regression: regularizing the residual sum of squares.
9 Outline
1. Review of last lecture
2. Nonlinear basis functions
3. Basic ideas for overcoming overfitting
10 Nonlinear basis functions: What if the data is not linearly separable, or does not fit a line?
Example of nonlinear classification.
11 Nonlinear basis functions: What if the data is not linearly separable, or does not fit a line?
Example of nonlinear classification. Example of nonlinear regression. [figure: nonlinear regression data, t vs. x]
12 Nonlinear basis functions: Nonlinear basis for classification
Transform the input/feature: $\phi(x): x \in \mathbb{R}^2 \to z = x_1 x_2$
13 Nonlinear basis functions: Nonlinear basis for classification
Transform the input/feature: $\phi(x): x \in \mathbb{R}^2 \to z = x_1 x_2$
Transformed training data: linearly separable!
14 Nonlinear basis functions: Another example
How to transform the input/feature?
15 Nonlinear basis functions: Another example
How to transform the input/feature?
$$\phi(x): x \in \mathbb{R}^2 \to z = \begin{pmatrix} x_1^2 \\ x_1 x_2 \\ x_2^2 \end{pmatrix} \in \mathbb{R}^3$$
16 Nonlinear basis functions: Another example
How to transform the input/feature?
$$\phi(x): x \in \mathbb{R}^2 \to z = \begin{pmatrix} x_1^2 \\ x_1 x_2 \\ x_2^2 \end{pmatrix} \in \mathbb{R}^3$$
Transformed training data: linearly separable.
Intuition: if $w = (1\ 0\ 1)^T$, then $w^T z = x_1^2 + x_2^2 = \|x\|_2^2$, i.e., the (squared) distance to the origin!
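A small sketch of this idea (the data-generating radius and the threshold are illustrative assumptions): points inside and outside a circle are not linearly separable in the original coordinates, but after the quadratic map a single linear threshold on $w^T z$ separates them.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two classes: inside vs. outside a circle of radius 1 (illustrative data)
X = rng.uniform(-2, 2, size=(200, 2))
labels = (X[:, 0]**2 + X[:, 1]**2 > 1.0).astype(int)

def phi(x):
    """Quadratic feature map from the slide: (x1^2, x1*x2, x2^2)."""
    return np.array([x[0]**2, x[0] * x[1], x[1]**2])

Z = np.apply_along_axis(phi, 1, X)          # transformed data, shape (200, 3)

# In the new space, w = (1, 0, 1) measures the squared distance to the origin,
# so thresholding w^T z at 1 reproduces the class labels exactly.
w = np.array([1.0, 0.0, 1.0])
pred = (Z @ w > 1.0).astype(int)
print("training accuracy after the feature map:", np.mean(pred == labels))
```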
17 Nonlinear basis functions: General nonlinear basis functions
We can use a nonlinear mapping $\phi(x): x \in \mathbb{R}^D \to z \in \mathbb{R}^M$, where $M$ is the dimensionality of the new feature/input $z$ (or $\phi(x)$). Note that $M$ can be greater than, less than, or equal to $D$.
18 Nonlinear basis functions: General nonlinear basis functions
We can use a nonlinear mapping $\phi(x): x \in \mathbb{R}^D \to z \in \mathbb{R}^M$, where $M$ is the dimensionality of the new feature/input $z$ (or $\phi(x)$). Note that $M$ can be greater than, less than, or equal to $D$.
With the new features, we can apply our learning techniques to minimize our errors on the transformed training data:
- linear methods: the prediction is based on $w^T \phi(x)$
- other methods: nearest neighbors, decision trees, etc.
19 Nonlinear basis functions: Regression with nonlinear basis
Residual sum of squares: $\sum_n \bigl[w^T \phi(x_n) - y_n\bigr]^2$, where $w \in \mathbb{R}^M$ has the same dimensionality as the transformed features $\phi(x)$.
20 Nonlinear basis functions: Regression with nonlinear basis
Residual sum of squares: $\sum_n \bigl[w^T \phi(x_n) - y_n\bigr]^2$, where $w \in \mathbb{R}^M$ has the same dimensionality as the transformed features $\phi(x)$.
The LMS solution can be formulated with the new design matrix:
$$\Phi = \begin{pmatrix} \phi(x_1)^T \\ \phi(x_2)^T \\ \vdots \\ \phi(x_N)^T \end{pmatrix} \in \mathbb{R}^{N \times M}, \qquad w^{\mathrm{lms}} = \bigl(\Phi^T \Phi\bigr)^{-1} \Phi^T y$$
21 Nonlinear basis functions: Example with regression
Polynomial basis functions:
$$\phi(x) = \begin{pmatrix} 1 \\ x \\ x^2 \\ \vdots \\ x^M \end{pmatrix}, \qquad f(x) = w_0 + \sum_{m=1}^{M} w_m x^m$$
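A minimal sketch of polynomial-basis regression using the design matrix $\Phi$ from the previous slide (the synthetic sine data and the choice M = 3 are illustrative assumptions in the spirit of the slides):

```python
import numpy as np

def poly_design_matrix(x, M):
    """Design matrix Phi with rows phi(x_n) = (1, x_n, x_n^2, ..., x_n^M)."""
    return np.vander(x, N=M + 1, increasing=True)    # shape (N, M+1)

def fit_poly(x, y, M):
    """LMS solution w = (Phi^T Phi)^{-1} Phi^T y for a degree-M polynomial."""
    Phi = poly_design_matrix(x, M)
    return np.linalg.solve(Phi.T @ Phi, Phi.T @ y)

# Noisy samples from a sine function, as in the slides' example
rng = np.random.default_rng(0)
x_train = rng.uniform(0, 1, size=10)
y_train = np.sin(2 * np.pi * x_train) + 0.2 * rng.standard_normal(10)

w = fit_poly(x_train, y_train, M=3)
print("degree-3 coefficients:", w)
```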
22 Nonlinear basis functions: Example with regression
Polynomial basis functions:
$$\phi(x) = \begin{pmatrix} 1 \\ x \\ x^2 \\ \vdots \\ x^M \end{pmatrix}, \qquad f(x) = w_0 + \sum_{m=1}^{M} w_m x^m$$
Fitting samples from a sine function: underfitting, as $f(x)$ is too simple. [figures: fits with M = 0 and M = 1, t vs. x]
23 Nonlinear basis functions: Adding more high-order basis functions
[figure: fit with M = 3, t vs. x]
24 Nonlinear basis functions: Adding more high-order basis functions
[figures: fits with M = 3 and M = 9 (overfitting), t vs. x]
Being too adaptive leads to better results on the training data, but not so great results on data that has not been seen!
25 Nonlinear basis functions: Overfitting
Parameters for higher-order polynomials are very large.
[table: learned coefficients $w_0$ through $w_9$ for M = 0, 1, 3, 9 (numeric values not recoverable); the magnitudes grow dramatically for M = 9]
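To see this effect numerically, the self-contained sketch below (the data generation is again an illustrative assumption) prints the largest coefficient magnitude for several polynomial degrees:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, size=10)
y = np.sin(2 * np.pi * x) + 0.2 * rng.standard_normal(10)

for M in [0, 1, 3, 9]:
    Phi = np.vander(x, N=M + 1, increasing=True)     # rows are (1, x, ..., x^M)
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)      # least-squares fit
    print(f"M = {M}: max |w_m| = {np.abs(w).max():.2e}")
# Typically the M = 9 fit interpolates the 10 points and its coefficients blow up.
```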
26 Nonlinear basis functions: Overfitting can be quite disastrous
Fitting the housing-price data with M = 3: note that the predicted price would go to zero (or negative) for bigger houses! This is called poor generalization/overfitting.
27 Nonlinear basis functions: Detecting overfitting
Plot model complexity versus the objective function: as the model becomes more complex, performance on the training data keeps improving, while performance on the test data improves at first and then deteriorates.
[figure: $E_{\mathrm{RMS}}$ versus M for the training and test sets]
Horizontal axis: a measure of model complexity. In this example, we use the maximum order of the polynomial basis functions.
28 Nonlinear basis functions: Detecting overfitting
Plot model complexity versus the objective function: as the model becomes more complex, performance on the training data keeps improving, while performance on the test data improves at first and then deteriorates.
[figure: $E_{\mathrm{RMS}}$ versus M for the training and test sets]
Horizontal axis: a measure of model complexity. In this example, we use the maximum order of the polynomial basis functions.
Vertical axis:
1. For regression, the vertical axis would be the RSS or the RMS error (the square root of the RSS).
2. For classification, the vertical axis would be the classification error rate or the cross-entropy error function.
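A small sketch of this diagnostic (the synthetic sine data, sample sizes, and noise level are illustrative assumptions): fit polynomials of increasing degree on a training set and track the RMS error on both the training data and held-out data.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    x = rng.uniform(0, 1, size=n)
    return x, np.sin(2 * np.pi * x) + 0.2 * rng.standard_normal(n)

x_train, y_train = make_data(10)
x_test, y_test = make_data(100)

def rms(x, y, w, M):
    """Root-mean-square error of a degree-M polynomial with coefficients w."""
    Phi = np.vander(x, N=M + 1, increasing=True)
    return np.sqrt(np.mean((Phi @ w - y) ** 2))

for M in range(10):
    Phi = np.vander(x_train, N=M + 1, increasing=True)
    w, *_ = np.linalg.lstsq(Phi, y_train, rcond=None)
    print(f"M = {M}: train RMS = {rms(x_train, y_train, w, M):.3f}, "
          f"test RMS = {rms(x_test, y_test, w, M):.3f}")
# Training RMS decreases monotonically with M; test RMS typically rises again
# once M is large enough to overfit.
```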
29 Outline
1. Review of last lecture
2. Nonlinear basis functions
3. Basic ideas for overcoming overfitting: use more training data; regularization methods
30 Basic ideas for overcoming overfitting: Use more training data
Using more training data helps prevent overfitting. The more, the merrier. [figure: M = 9 fit, t vs. x]
31 Basic ideas for overcoming overfitting: Use more training data
Using more training data helps prevent overfitting. The more, the merrier. [figures: M = 9 fit, and the same M = 9 fit with N = 15 data points, t vs. x]
32 Basic ideas for overcoming overfitting: Use more training data
Using more training data helps prevent overfitting. The more, the merrier. [figures: M = 9 fits with N = 15 and N = 100 data points, t vs. x]
What if we do not have a lot of data?
33 Basic ideas for overcoming overfitting: Regularization methods
Intuition: for a linear model for regression, $w^T x$, what do we mean by being simple?
34 Basic ideas for overcoming overfitting: Regularization methods
Intuition: for a linear model for regression, $w^T x$, what do we mean by being simple?
Assumption:
$$p(w_d) = \frac{1}{\sqrt{2\pi}\,\sigma} \, e^{-\frac{w_d^2}{2\sigma^2}}$$
Namely, a priori, we believe $w_d$ is around zero, resulting in a simple model for prediction. Note that this line of thinking regards $w$ as a random variable, and we will use the observed data $\mathcal{D}$ to update our prior belief about $w$.
35 Basic ideas for overcoming overfitting: Regularization methods
Example: fitting data with polynomials. Our regression model is
$$y = \sum_{m=0}^{M} w_m x^m$$
Thus, smaller $w_m$ will likely lead to a lower effective order of the polynomial, potentially preventing overfitting.
36 Basic ideas for overcoming overfitting: Regularization methods
Setup for regularized linear regression. Linear regression:
$$y = w^T x + \eta$$
where $\eta$ is Gaussian noise, distributed according to $\mathcal{N}(0, \sigma^2)$.
Prior distribution on the parameters:
$$p(w) = \prod_{d=1}^{M} \frac{1}{\sqrt{2\pi}\,\sigma} \, e^{-\frac{w_d^2}{2\sigma^2}}$$
Note that all the dimensions share the same variance $\sigma^2$.
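To connect this setup to the ridge-regression objective from earlier, here is a short derivation in my own notation ($\sigma_n^2$ for the noise variance and $\sigma_p^2$ for the prior variance, to keep the two roles distinct; the slides use a single $\sigma^2$). Maximizing the posterior over $w$ is the same as minimizing the negative log-posterior:
$$
\begin{aligned}
-\log p(w \mid \mathcal{D})
  &= -\log p(\mathcal{D} \mid w) - \log p(w) + \text{const} \\
  &= \frac{1}{2\sigma_n^2} \sum_{n=1}^{N} \bigl(y_n - w^T x_n\bigr)^2
     + \frac{1}{2\sigma_p^2} \, \|w\|_2^2 + \text{const}.
\end{aligned}
$$
Multiplying through by $2\sigma_n^2$ shows that this is exactly the regularized residual sum of squares with $\lambda = \sigma_n^2 / \sigma_p^2$: a tighter prior (smaller $\sigma_p^2$) corresponds to a larger $\lambda$, i.e., stronger regularization.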