IAML: Linear Regression


Nigel Goddard
School of Informatics

Overview

- The linear model
- Fitting the linear model to data
- Probabilistic interpretation of the error function
- Examples of regression problems
- Dealing with multiple outputs
- Generalized linear regression
- Radial basis function (RBF) models

The Regression Problem

Classification and regression problems:

- Classification: the target of prediction is discrete.
- Regression: the target of prediction is continuous.

Training data: a set D of pairs (x_i, y_i) for i = 1, ..., n, where x_i ∈ R^D and y_i ∈ R.

Today: linear regression, i.e., the relationship between x and y is linear. Although this is simple (and limited), it is:

- More powerful than you would expect
- The basis for more complex nonlinear methods
- A model that teaches a lot about regression and classification

Examples of regression problems

- Robot inverse dynamics: predicting what torques are needed to drive a robot arm along a given trajectory.
- Electricity load forecasting: generate hourly forecasts two days in advance (see W & F, .3).
- Predicting staffing requirements at help desks based on historical data and product and sales information.
- Predicting the time to failure of equipment based on utilization and environmental conditions.

The Linear Model

The linear model is

    f(x; w) = w_0 + w_1 x_1 + ... + w_D x_D = φ(x) w

where φ(x) = (1, x_1, ..., x_D) = (1, x^T) and

    w = (w_0, w_1, ..., w_D)^T.

The maths of fitting linear models to data is easy. We use the notation φ(x) to make generalisation easy later.

Toy example: data

With two features, instead of a line we seek a plane; with more features, a hyperplane.

[Figure 3.1 from Hastie, Tibshirani & Friedman, The Elements of Statistical Learning (2nd Ed.), Chapter 3: linear least squares fitting with X ∈ R². We seek the linear function of X that minimizes the sum of squared residuals from Y.]
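To make the notation concrete, here is a minimal sketch in Python/NumPy of evaluating f(x; w) = φ(x) w; the weights and input are hypothetical, chosen only for illustration.

    import numpy as np

    def phi(x):
        """Feature map: prepend the constant 1, giving (1, x_1, ..., x_D)."""
        return np.concatenate(([1.0], x))

    def f(x, w):
        """Linear model prediction f(x; w) = phi(x) . w."""
        return phi(x) @ w

    w = np.array([0.5, 2.0, -1.0])   # (w_0, w_1, w_2), hypothetical
    x = np.array([3.0, 4.0])         # one input with D = 2 features
    print(f(x, w))                   # 0.5 + 2.0*3.0 - 1.0*4.0 = 2.5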

With more features: the CPU Performance Data Set

Predict PRP, the published relative performance, from:

- MYCT: machine cycle time in nanoseconds (integer)
- MMIN: minimum main memory in kilobytes (integer)
- MMAX: maximum main memory in kilobytes (integer)
- CACH: cache memory in kilobytes (integer)
- CHMIN: minimum channels in units (integer)
- CHMAX: maximum channels in units (integer)

The fitted linear model:

    PRP = MYCT + .5 MMIN + .6 MMAX + .63 CACH - .27 CHMIN + .46 CHMAX

In matrix notation

The design matrix is n × (D + 1):

    Φ = [ 1  x_11  x_12  ...  x_1D
          1  x_21  x_22  ...  x_2D
          ...
          1  x_n1  x_n2  ...  x_nD ]

where x_ij is the jth component of the training input x_i. Let y = (y_1, ..., y_n)^T. Then ŷ = Φw is ... ?

Linear Algebra: The 1-Slide Version

What is matrix multiplication? Let

    A = [ a_11  a_12  a_13        b = [ b_1
          a_21  a_22  a_23              b_2
          a_31  a_32  a_33 ],           b_3 ].

First consider matrix times vector, i.e., Ab. Two answers:

1. Ab is a linear combination of the columns of A:

    Ab = b_1 (a_11, a_21, a_31)^T + b_2 (a_12, a_22, a_32)^T + b_3 (a_13, a_23, a_33)^T

2. Ab is a vector. Each element of the vector is the dot product between b and one row of A:

    Ab = ( (a_11, a_12, a_13) · b,  (a_21, a_22, a_23) · b,  (a_31, a_32, a_33) · b )^T
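Both views can be checked numerically; a small sketch with arbitrary numbers, for illustration only:

    import numpy as np

    A = np.array([[1., 2., 3.],
                  [4., 5., 6.],
                  [7., 8., 9.]])
    b = np.array([1., 0., 2.])

    # View 1: Ab as a linear combination of the columns of A.
    by_columns = sum(b[j] * A[:, j] for j in range(3))

    # View 2: each element of Ab is the dot product of one row of A with b.
    by_rows = np.array([A[i, :] @ b for i in range(3)])

    print(A @ b, by_columns, by_rows)   # all three agree: [ 7. 16. 25.]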

Linear model (part 2)

In matrix notation: the design matrix Φ is n × (D + 1), with row i equal to (1, x_i1, x_i2, ..., x_iD), where x_ij is the jth component of the training input x_i. Let y = (y_1, ..., y_n)^T. Then ŷ = Φw is the model's predicted values on the training inputs.

Solving for Model Parameters

This looks like what we've seen in linear algebra:

    y = Φw

We know y and Φ but not w. So why not take w = Φ^{-1} y? (You can't, but why?)

Three reasons:

- Φ is not square. It is n × (D + 1), so Φ^{-1} is undefined.
- The system is overconstrained (n equations for D + 1 parameters).
- In other words, the data has noise, so no exact solution exists.

Loss function

We want a loss function O(w) that we minimize with respect to w. At the minimum, ŷ should look like y. (Recall that ŷ = Φw depends on w.)
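A minimal sketch of the non-squareness problem on synthetic data (shapes assumed for illustration):

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 3))        # n = 100 training inputs, D = 3 features

    # Design matrix: a column of ones for w_0, then the raw inputs.
    Phi = np.hstack([np.ones((100, 1)), X])
    print(Phi.shape)                     # (100, 4): n x (D + 1), not square,
                                         # so Phi has no ordinary inverse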

Fitting a linear model to data

[Figure 3.1 from Hastie, Tibshirani & Friedman again: linear least squares fitting with X ∈ R²; the black sticks from each point to the plane are the residuals.]

The Solution

A common choice: squared error (it makes the maths easy):

    O(w) = Σ_{i=1}^n (y_i - w^T x_i)²

In the picture, this is the sum of the squared lengths of the black sticks. (Each one is called a residual, i.e., each y_i - w^T x_i.)

Given a dataset D of pairs (x_i, y_i) for i = 1, ..., n, squared error makes the maths easy:

    E(w) = Σ_{i=1}^n (y_i - f(x_i; w))² = (y - Φw)^T (y - Φw)

We want to minimize this with respect to w. The error surface is a parabolic bowl. [Figure: Tom Mitchell]

How do we do this?

Answer: to minimize O(w) = Σ_{i=1}^n (y_i - w^T x_i)², set the partial derivatives to 0. This has an analytical solution:

    ŵ = (Φ^T Φ)^{-1} Φ^T y

(Φ^T Φ)^{-1} Φ^T is the pseudo-inverse of Φ.

- First check: Does this make sense? Do the matrix dimensions line up?
- Then: Why is this called a pseudo-inverse?
- Finally: What happens if there are no features?

Probabilistic interpretation of O(w)

Assume that y = w^T x + ε, where ε ~ N(0, σ²). (This is an exact linear relationship plus Gaussian noise.) This implies that y_i ~ N(w^T x_i, σ²), i.e.

    log p(y_i | x_i) = -log √(2π) - log σ - (y_i - w^T x_i)² / (2σ²)

So minimising O(w) is equivalent to maximising the likelihood! We can view w^T x as E[y | x]. The squared residuals also allow estimation of σ²:

    σ̂² = (1/n) Σ_{i=1}^n (y_i - ŵ^T x_i)²
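A minimal sketch of the analytic solution on synthetic data (all numbers are assumptions for illustration). np.linalg.lstsq returns the same least-squares answer as the pseudo-inverse formula but avoids forming (Φ^T Φ)^{-1} explicitly, which is numerically safer:

    import numpy as np

    rng = np.random.default_rng(0)
    n, D = 100, 3
    X = rng.normal(size=(n, D))
    w_true = np.array([1.0, 2.0, -1.0, 0.5])           # (w_0, ..., w_3), hypothetical
    Phi = np.hstack([np.ones((n, 1)), X])
    y = Phi @ w_true + rng.normal(scale=0.3, size=n)   # exact linear + Gaussian noise

    # Least-squares fit; equivalent to w_hat = np.linalg.pinv(Phi) @ y.
    w_hat, *_ = np.linalg.lstsq(Phi, y, rcond=None)

    residuals = y - Phi @ w_hat
    sigma2_hat = np.mean(residuals ** 2)               # estimate of the noise variance
    print(w_hat, sigma2_hat)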

Sensitivity to Outliers

Linear regression is sensitive to outliers. Example: suppose y = .5x + ε, where ε ~ N(0, .25), and then add a single point (2.5, 3): the one added point noticeably pulls the fitted line.

Fitting this into the general structure for learning algorithms:

- Define the task: regression.
- Decide on the model structure: linear regression model.
- Decide on the score function: squared error (likelihood).
- Decide on the optimization/search method to optimize the score function: calculus (analytic solution).

Diagnostics

Graphical diagnostics can be useful for checking:

- Is the relationship obviously nonlinear? Look for structure in the residuals.
- Are there obvious outliers?

The goal isn't to find all problems. You can't. The goal is to find obvious, embarrassing problems. Example: plot residuals by fitted values. Stats packages will do this for you.

Dealing with multiple outputs

Suppose there are q different targets for each input. We introduce a different weight vector w_i for each target dimension and do the regression separately for each one. This is called multiple regression.
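Because the squared error decomposes over target dimensions, the q separate regressions share one design matrix and can be solved in a single call. A sketch with synthetic shapes (all values assumed for illustration):

    import numpy as np

    rng = np.random.default_rng(1)
    n, D, q = 100, 3, 2
    Phi = np.hstack([np.ones((n, 1)), rng.normal(size=(n, D))])
    Y = rng.normal(size=(n, q))              # one column of targets per output

    # One lstsq call fits all q regressions: column j of W is the
    # weight vector for target dimension j.
    W, *_ = np.linalg.lstsq(Phi, Y, rcond=None)
    print(W.shape)                           # (D + 1, q)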

Basis expansion

We can easily transform the original attributes non-linearly into φ(x) and do linear regression on the transformed values, e.g., with polynomial, Gaussian, or sigmoid basis functions. [Figure credit: Chris Bishop, PRML]

The design matrix is now n × m:

    Φ = [ φ_1(x_1)  φ_2(x_1)  ...  φ_m(x_1)
          φ_1(x_2)  φ_2(x_2)  ...  φ_m(x_2)
          ...
          φ_1(x_n)  φ_2(x_n)  ...  φ_m(x_n) ]

Let y = (y_1, ..., y_n)^T and minimize E(w) = ||y - Φw||². As before we have an analytical solution:

    ŵ = (Φ^T Φ)^{-1} Φ^T y

where (Φ^T Φ)^{-1} Φ^T is the pseudo-inverse of Φ.

Example: polynomial regression

    φ(x) = (1, x, x², ..., x^M)^T

[Figure: polynomial fits of t against x for several orders M, including M = 3 and M = 9. Figure credit: Chris Bishop, PRML]
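A sketch of polynomial basis expansion on synthetic 1-D data (M and the data are assumed for illustration). Note that the model is nonlinear in x but still linear in w, so the same least-squares solution applies:

    import numpy as np

    def poly_design_matrix(x, M):
        """phi(x) = (1, x, x^2, ..., x^M) for each scalar input; Phi is n x (M + 1)."""
        return np.vander(x, M + 1, increasing=True)

    rng = np.random.default_rng(2)
    x = rng.uniform(0, 1, size=30)
    t = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=30)

    Phi = poly_design_matrix(x, M=3)
    w_hat, *_ = np.linalg.lstsq(Phi, t, rcond=None)   # same pseudo-inverse solution
    print(w_hat)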

More about the features

Transforming the features can be important. Example: suppose I want to predict CPU performance, and maybe one of the features is the manufacturer:

    x = 1 if Intel, 2 if AMD, 3 if Apple, 4 if Motorola

Let's use this as a feature. Will this work?

Not the way you want. Do you really believe AMD is double Intel? Instead, convert this into one binary feature per manufacturer:

    x_1 = 1 if Intel, 0 otherwise
    x_2 = 1 if AMD, 0 otherwise
    ...

Note this is a consequence of linearity. We saw something similar with text in the first week.

Other good transformations: log, square, square root.
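A sketch of this one-of-K conversion (the category list is from the slide; the helper function is hypothetical):

    import numpy as np

    makers = ["Intel", "AMD", "Apple", "Motorola"]

    def one_hot(maker):
        """Replace the categorical code with one binary indicator per manufacturer."""
        v = np.zeros(len(makers))
        v[makers.index(maker)] = 1.0
        return v

    print(one_hot("AMD"))   # [0. 1. 0. 0.] -- no spurious ordering between makers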

Radial basis function (RBF) models

Set φ_i(x) = exp(-½ ||x - c_i||² / α²). We need to position these basis functions at some prior chosen centres c_i, with a given width α. There are many ways to do this, but choosing a subset of the datapoints as centres is one method that is quite effective. Finding the weights is the same as ever: the pseudo-inverse solution.

RBF example

[Figures: the original data; one RBF feature with centre c_1 = 3; another RBF feature with centre c_2 = 6.]

Notice how the feature functions specialize in input space. Run the RBF model with both basis functions above and plot the residuals:

[Figure: the fitted values φ(x)^T w against the original data, and the residuals of the RBF model.]
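A sketch putting the two RBF features together (the centres c_1 = 3 and c_2 = 6 follow the plots; the width α, the bias term, and the data are assumptions for illustration):

    import numpy as np

    def rbf(x, c, alpha):
        """phi_i(x) = exp(-(1/2) (x - c_i)^2 / alpha^2) for scalar inputs."""
        return np.exp(-0.5 * (x - c) ** 2 / alpha ** 2)

    rng = np.random.default_rng(3)
    x = rng.uniform(0, 9, size=50)
    y = np.sin(x) + rng.normal(scale=0.1, size=50)      # synthetic target

    # Design matrix: a bias term plus the two RBF features.
    Phi = np.column_stack([np.ones_like(x), rbf(x, 3.0, 1.0), rbf(x, 6.0, 1.0)])
    w_hat, *_ = np.linalg.lstsq(Phi, y, rcond=None)     # pseudo-inverse, as ever
    residuals = y - Phi @ w_hat
    print(w_hat, residuals.std())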

RBF: Ay, there's the rub

So why not use RBFs for everything? Short answer: you might need too many basis functions. This is especially true in high dimensions (we'll say more later). Too many means you probably overfit. Extreme example: centre one basis function on each training point.

Also: notice that we haven't yet seen in the course how to learn the RBF parameters, i.e., the mean and standard deviation of each kernel. The main point of presenting RBFs now is to set up later methods like support vector machines.

Summary

- Linear regression is often useful out of the box.
- It is more useful than it might seem, because linear means linear in the parameters. You can do a nonlinear transform of the data first, e.g., polynomial or RBF. This point will come up again.
- The maximum likelihood solution is computationally efficient (the pseudo-inverse).
- There is a danger of overfitting, especially with many features or basis functions.
