Overview. IAML: Linear Regression. Examples of regression problems. The Regression Problem
|
|
- Preston Todd
- 5 years ago
- Views:
Transcription
1 3 / 38 Overview 4 / 38 IAML: Linear Regression Nigel Goddard School of Informatics Semester The linear model Fitting the linear model to data Probabilistic interpretation of the error function Eamples of regression problems Dealing with multiple outputs Generalized linear regression Radial basis function (RBF) models The Regression Problem / 38 Eamples of regression problems 2 / 38 Classification and regression problems: Classification: target of prediction is discrete Regression: target of prediction is continuous Training data: Set D of pairs ( i, i ) for i =,..., n, where i R D and i R Toda: Linear regression, i.e., relationship between and is linear. Although this is simple (and limited) it is: More powerful than ou would epect The basis for more comple nonlinear methods Teaches a lot about regression and classification Robot inverse dnamics: predicting what torques are needed to drive a robot arm along a given trajector Electricit load forecasting, generate hourl forecasts two das in advance (see W & F,.3) Predicting staffing requirements at help desks based on historical data and product and sales information, Predicting the time to failure of equipment based on utilization and environmental conditions
2 The Linear Model To eample: Data Linear model f (; w) = w + w w D D = φ()w where φ() = (,,..., D ) = (, T ) and w w = w... () w D The maths of fitting linear models to data is eas. We use the notation φ() to make generalisation eas later. Elements of Statistical Learning (2nd Ed.) c Hastie, Tibshirani & Friedman 29 Chap 3 5 / 38 6 / 38 To eample: Data With two features Y / 38 X 2 FIGURE 3.. Linear least squares fitting with Instead X of airline, 2. We a plane. seek the With linear more function features, of Xa hperplane. that minimizes the sum of squared residuals from Y. Figure: Hastie, Tibshirani, and Friedman X 8 / 38
3 With more features With more features CPU Performance Data Set Predict: PRP: published relative performance MYCT: machine ccle time in nanoseconds (integer) MMIN: minimum main memor in kilobtes (integer) MMAX: maimum main memor in kilobtes (integer) CACH: cache memor in kilobtes (integer) CHMIN: minimum channels in units (integer) CHMAX: maimum channels in units (integer) PRP = MYCT +.5 MMIN +.6 MMAX +.63 CACH -.27 CHMIN +.46 CHMAX In matri notation Design matri is n (D + ) Φ = 2... D D..... n n2... nd ij is the jth component of the training input i Let = (,..., n ) T Then ŷ = Φw is...? 9 / 38 / 38 Linear Algebra: The -Slide Version What is matri multiplication? a a 2 a 3 b A = a 2 a 22 a 23, b = b 2 a 3 a 32 a 33 b 3 First consider matri times vector, i.e., Ab. Two answers:. Ab is a linear combination of the columns of A a a 2 a 3 Ab = b a 2 + b 2 a 22 + b 3 a 23 a 3 a 32 a Ab is a vector. Each element of the vector is the dot products between b and one row of A. (a, a 2, a 3 )b Ab = (a 2, a 22, a 23 )b (a 3, a 32, a 33 )b / 38 2 / 38
4 Linear model (part 2) 5 / 38 Solving for Model Parameters 6 / 38 In matri notation: Design matri is n (D + ) Φ = 2... D D..... n n2... nd This looks like what we ve seen in linear algebra = Φw We know and Φ but not w. So wh not take w = Φ? (You can t, but wh?) ij is the jth component of the training input i Let = (,..., n ) T Then ŷ = Φw is the model s predicted values on training inputs. Solving for Model Parameters 3 / 38 Loss function 4 / 38 This looks like what we ve seen in linear algebra We know and Φ but not w. = Φw So wh not take w = Φ? (You can t, but wh?) Three reasons: Φ is not square. It is n (D + ). The sstem is overconstrained (n equations for D + parameters), in other words The data has noise Want a loss function O(w) that We minimize wrt w. At minimum, ŷ looks like. (Recall: ŷ depends on w) ŷ = Φw
5 9 / 38 2 / 38 Fitting a linear model to data Y X 2 X IGURE 3.. Linear least squares fitting with IR 2. We seek the linear function of X that miniizes the sum of squared residuals from Y. The Solution A common choice: squared error (makes the maths eas) O(w) = n ( i w T i ) 2 i= In the picture: this is sum of squared length of black sticks. (Each one is called a residual, i.e., each i w T i ) Fitting a linear model to data Fitting a linear model to data 7 / 38 Given a dataset D of pairs ( n O(w) i, = i ) for i =,...,n ( i w T i ) 2 Squared error makes the maths eas i= n E(w) = (= ( Φw) T i f ( i ; w)) 2 ( Φw) i= We want to minimize this with respect to w. The error Thesurface error surface is a parabolic is a parabolic bowl bowl E[w] w Figure: Tom Mitchell How do we do this? 5 / 7 Probabilistic Figure: Tom Mitchell interpretation of O(w) w / 38 Answer: to minimize O(w) = n i= ( i w T i ) 2, set partial derivatives to. This has an analtical solution ŵ = (Φ T Φ) Φ T (Φ T Φ) Φ T is the pseudo-inverse of Φ First check: Does this make sense? Do the matri dimensions line up? Then: Wh is this called a pseudo-inverse? () Finall: What happens if there are no features? Assume that = w T + ɛ, where ɛ N(, σ 2 ) (This is an eact linear relationship plus Gaussian noise.) This implies that i N(w T i, σ 2 ), i.e. log p( i i ) = log 2π + log σ + ( i w T i ) 2 2σ 2 So minimising O(w) equivalent to maimising likelihood! Can view w T as E[ ]. Squared residuals allow estimation of σ 2 ˆσ 2 = n n ( i w T i ) 2 i=
6 23 / / 38 Sensitivit to Outliers Fitting this into the general structure for learning algorithms: Linear regression is sensitive to outliers Eample: Suppose =.5 + ɛ, where ɛ N(,.25), and then add a point (2.5,3): Define the task: regression Decide on the model structure: linear regression model Decide on the score function: squared error (likelihood) Decide on optimization/search method to optimize the score function: calculus (analtic solution) / / 38 Diagnositics Dealing with multiple outputs Graphical diagnostics can be useful for checking: Is the relationship obviousl nonlinear? Look for structure in residuals? Suppose there are q different targets for each input Are there obvious outliers? The goal isn t to find all problems. You can t. The goal is to find obvious, embarrassing problems. We introduce a different w i for each target dimension, and do regression separatel for each one This is called multiple regression Eamples: Plot residuals b fitted values. Stats packages will do this for ou.
7 Basis epansion We can easil transform the original attributes non-linearl into φ() and do linear regression on them Design matri is n m φ ( ) φ 2 ( )... φ m ( ) φ ( 2 ) φ 2 ( 2 )... φ m ( 2 ) Φ =.... φ ( n ) φ 2 ( n )... φ m ( n ) Let = (,..., n ) T Minimize E(w) = Φw 2. As before we have an analtical solution ŵ = (Φ T Φ) Φ T polnomial Gaussians sigmoids (Φ T Φ) Φ T is the pseudo-inverse of Φ Figure credit: Chris Bishop, PRML 25 / / 38 Eample: polnomial regression More about the features φ() = (,, 2,..., M ) T Transforming the features can be important. t t M = M = 3 t t M = M = 9 Eample: Suppose I want to predict the CPU performance. Mabe one of the features is manufacturer. if Intel 2 if AMD = 3 if Apple 4 if Motorola Let s use this as a feature. Will this work? Figure credit: Chris Bishop, PRML 27 / / 38
8 More about the features 3 / 38 More about the features 32 / 38 Transforming the features can be important. Eample: Suppose I want to predict the CPU performance. Instead convert this into / Mabe one of the features is manufacturer. if Intel 2 if AMD = 3 if Apple 4 if Motorola Let s use this as a feature. Will this work? Not the wa ou want. Do ou reall believe AMD is double Intel? = if Intel, otherwise 2 = if AMD, otherwise. Note this is a consequence of linearit. We saw something similar with tet in the first week. Other good transformations: log, square, square root 29 / 38 3 / 38 Radial basis function (RBF) models RBF eample Set φ i () = ep( 2 c i 2 /α 2 ). Need to position these basis functions at some prior chosen centres c i and with a given width α. There are man was to do this but choosing a subset of the datapoints as centres is one method that is quite effective Finding the weights is the same as ever: the pseudo-inverse solution
9 RBF eample 35 / 38 An RBF feature 36 / Original data RBF feature, c = 3, α = phi3. 33 / / 38 Another RBF feature RBF eample Notice how the feature functions specialize in input space. Run the RBF with both basis functions above, plot the residuals Original data RBF feature, c 2 = 6, α 2 = i φ( i ) T w RBF kernel (mean 6) Original data Residuals Residual (RBF model)
10 RBF: A, there s the rub Summar So wh not use RBFs for everthing? Short answer: You might need too man basis functions. This is especiall true in high dimensions (we ll sa more later) Too man means ou probabl overfit. Etreme eample: Centre one on each training point. Also: notice that we haven t seen et in the course how to learn the RBF parameters, i.e., the mean and standard deviation of each kernel Main point of presenting RBFs now: Set up later methods like support vector machines Linear regression often useful out of the bo. More useful than it would be seem because linear means linear in the parameters. You can do a nonlinear transform of the data first, e.g., polnomial, RBF. This point will come up again. Maimum likelihood solution is computationall efficient (pseudo-inverse) Danger of overfitting, especiall with man features or basis functions 37 / / 38
Regression. Machine Learning and Pattern Recognition. Chris Williams. School of Informatics, University of Edinburgh.
Regression Machine Learning and Pattern Recognition Chris Williams School of Informatics, University of Edinburgh September 24 (All of the slides in this course have been adapted from previous versions
More informationLearning from Data: Regression
November 3, 2005 http://www.anc.ed.ac.uk/ amos/lfd/ Classification or Regression? Classification: want to learn a discrete target variable. Regression: want to learn a continuous target variable. Linear
More informationFeedforward Neural Networks
Chapter 4 Feedforward Neural Networks 4. Motivation Let s start with our logistic regression model from before: P(k d) = softma k =k ( λ(k ) + w d λ(k, w) ). (4.) Recall that this model gives us a lot
More informationModeling Data with Linear Combinations of Basis Functions. Read Chapter 3 in the text by Bishop
Modeling Data with Linear Combinations of Basis Functions Read Chapter 3 in the text by Bishop A Type of Supervised Learning Problem We want to model data (x 1, t 1 ),..., (x N, t N ), where x i is a vector
More information15. Eigenvalues, Eigenvectors
5 Eigenvalues, Eigenvectors Matri of a Linear Transformation Consider a linear ( transformation ) L : a b R 2 R 2 Suppose we know that L and L Then c d because of linearit, we can determine what L does
More informationMath-3. Lesson 2-8 Factoring NICE 3rd Degree Polynomials
Math- Lesson -8 Factoring NICE rd Degree Polnomials Find the zeroes of the following rd degree Polnomial 5 Set = 0 0 5 Factor out the common factor. 0 ( 5 ) Factor the quadratic 0 ( 1)( ) 0, -1, - Identif
More informationLinear Models for Classification
Linear Models for Classification Oliver Schulte - CMPT 726 Bishop PRML Ch. 4 Classification: Hand-written Digit Recognition CHINE INTELLIGENCE, VOL. 24, NO. 24, APRIL 2002 x i = t i = (0, 0, 0, 1, 0, 0,
More informationCh 3 Alg 2 Note Sheet.doc 3.1 Graphing Systems of Equations
Ch 3 Alg Note Sheet.doc 3.1 Graphing Sstems of Equations Sstems of Linear Equations A sstem of equations is a set of two or more equations that use the same variables. If the graph of each equation =.4
More information1 GSW Gaussian Elimination
Gaussian elimination is probabl the simplest technique for solving a set of simultaneous linear equations, such as: = A x + A x + A x +... + A x,,,, n n = A x + A x + A x +... + A x,,,, n n... m = Am,x
More informationReview: Support vector machines. Machine learning techniques and image analysis
Review: Support vector machines Review: Support vector machines Margin optimization min (w,w 0 ) 1 2 w 2 subject to y i (w 0 + w T x i ) 1 0, i = 1,..., n. Review: Support vector machines Margin optimization
More informationCSC321 Lecture 2: Linear Regression
CSC32 Lecture 2: Linear Regression Roger Grosse Roger Grosse CSC32 Lecture 2: Linear Regression / 26 Overview First learning algorithm of the course: linear regression Task: predict scalar-valued targets,
More informationCSC2515 Winter 2015 Introduction to Machine Learning. Lecture 2: Linear regression
CSC2515 Winter 2015 Introduction to Machine Learning Lecture 2: Linear regression All lecture slides will be available as.pdf on the course website: http://www.cs.toronto.edu/~urtasun/courses/csc2515/csc2515_winter15.html
More information1.2 Functions and Their Properties PreCalculus
1. Functions and Their Properties PreCalculus 1. FUNCTIONS AND THEIR PROPERTIES Learning Targets for 1. 1. Determine whether a set of numbers or a graph is a function. Find the domain of a function given
More information8.1 Exponents and Roots
Section 8. Eponents and Roots 75 8. Eponents and Roots Before defining the net famil of functions, the eponential functions, we will need to discuss eponent notation in detail. As we shall see, eponents
More informationEigenvectors and Eigenvalues 1
Ma 2015 page 1 Eigenvectors and Eigenvalues 1 In this handout, we will eplore eigenvectors and eigenvalues. We will begin with an eploration, then provide some direct eplanation and worked eamples, and
More information1 History of statistical/machine learning. 2 Supervised learning. 3 Two approaches to supervised learning. 4 The general learning procedure
Overview Breiman L (2001). Statistical modeling: The two cultures Statistical learning reading group Aleander L 1 Histor of statistical/machine learning 2 Supervised learning 3 Two approaches to supervised
More informationLinear Models for Regression
Linear Models for Regression Machine Learning Torsten Möller Möller/Mori 1 Reading Chapter 3 of Pattern Recognition and Machine Learning by Bishop Chapter 3+5+6+7 of The Elements of Statistical Learning
More informationLinear regression Class 25, Jeremy Orloff and Jonathan Bloom
1 Learning Goals Linear regression Class 25, 18.05 Jerem Orloff and Jonathan Bloom 1. Be able to use the method of least squares to fit a line to bivariate data. 2. Be able to give a formula for the total
More informationAn Introduction to Statistical and Probabilistic Linear Models
An Introduction to Statistical and Probabilistic Linear Models Maximilian Mozes Proseminar Data Mining Fakultät für Informatik Technische Universität München June 07, 2017 Introduction In statistical learning
More informationLESSON 35: EIGENVALUES AND EIGENVECTORS APRIL 21, (1) We might also write v as v. Both notations refer to a vector.
LESSON 5: EIGENVALUES AND EIGENVECTORS APRIL 2, 27 In this contet, a vector is a column matri E Note 2 v 2, v 4 5 6 () We might also write v as v Both notations refer to a vector (2) A vector can be man
More informationCOMP 551 Applied Machine Learning Lecture 20: Gaussian processes
COMP 55 Applied Machine Learning Lecture 2: Gaussian processes Instructor: Ryan Lowe (ryan.lowe@cs.mcgill.ca) Slides mostly by: (herke.vanhoof@mcgill.ca) Class web page: www.cs.mcgill.ca/~hvanho2/comp55
More informationPolynomial approximation and Splines
Polnomial approimation and Splines 1. Weierstrass approimation theorem The basic question we ll look at toda is how to approimate a complicated function f() with a simpler function P () f() P () for eample,
More informationUnit 3 NOTES Honors Common Core Math 2 1. Day 1: Properties of Exponents
Unit NOTES Honors Common Core Math Da : Properties of Eponents Warm-Up: Before we begin toda s lesson, how much do ou remember about eponents? Use epanded form to write the rules for the eponents. OBJECTIVE
More informationf(x) = 2x 2 + 2x - 4
4-1 Graphing Quadratic Functions What You ll Learn Scan the tet under the Now heading. List two things ou will learn about in the lesson. 1. Active Vocabular 2. New Vocabular Label each bo with the terms
More informationLaurie s Notes. Overview of Section 3.5
Overview of Section.5 Introduction Sstems of linear equations were solved in Algebra using substitution, elimination, and graphing. These same techniques are applied to nonlinear sstems in this lesson.
More informationVectors and the Geometry of Space
Chapter 12 Vectors and the Geometr of Space Comments. What does multivariable mean in the name Multivariable Calculus? It means we stud functions that involve more than one variable in either the input
More informationMath Notes on sections 7.8,9.1, and 9.3. Derivation of a solution in the repeated roots case: 3 4 A = 1 1. x =e t : + e t w 2.
Math 7 Notes on sections 7.8,9., and 9.3. Derivation of a solution in the repeated roots case We consider the eample = A where 3 4 A = The onl eigenvalue is = ; and there is onl one linearl independent
More informationMachine Learning Foundations
Machine Learning Foundations ( 機器學習基石 ) Lecture 13: Hazard of Overfitting Hsuan-Tien Lin ( 林軒田 ) htlin@csie.ntu.edu.tw Department of Computer Science & Information Engineering National Taiwan Universit
More informationSupport Vector Machines
Support Vector Machines Optimall deined decision surace Tpicall nonlinear in the input space Linear in a higher dimensional space Implicitl deined b a kernel unction What are Support Vector Machines Used
More informationWeek 3: Linear Regression
Week 3: Linear Regression Instructor: Sergey Levine Recap In the previous lecture we saw how linear regression can solve the following problem: given a dataset D = {(x, y ),..., (x N, y N )}, learn to
More informationThe American School of Marrakesh. Algebra 2 Algebra 2 Summer Preparation Packet
The American School of Marrakesh Algebra Algebra Summer Preparation Packet Summer 016 Algebra Summer Preparation Packet This summer packet contains eciting math problems designed to ensure our readiness
More informationCOMP 551 Applied Machine Learning Lecture 3: Linear regression (cont d)
COMP 551 Applied Machine Learning Lecture 3: Linear regression (cont d) Instructor: Herke van Hoof (herke.vanhoof@mail.mcgill.ca) Slides mostly by: Class web page: www.cs.mcgill.ca/~hvanho2/comp551 Unless
More informationBiostatistics in Research Practice - Regression I
Biostatistics in Research Practice - Regression I Simon Crouch 30th Januar 2007 In scientific studies, we often wish to model the relationships between observed variables over a sample of different subjects.
More informationLecture Machine Learning (past summer semester) If at least one English-speaking student is present.
Advanced Machine Learning Lecture 1 Organization Lecturer Prof. Bastian Leibe (leibe@vision.rwth-aachen.de) Introduction 20.10.2015 Bastian Leibe Visual Computing Institute RWTH Aachen University http://www.vision.rwth-aachen.de/
More informationLinear Regression. Robot Image Credit: Viktoriya Sukhanova 123RF.com
Linear Regression These slides were assembled by Eric Eaton, with grateful acknowledgement of the many others who made their course materials freely available online. Feel free to reuse or adapt these
More informationPerturbation Theory for Variational Inference
Perturbation heor for Variational Inference Manfred Opper U Berlin Marco Fraccaro echnical Universit of Denmark Ulrich Paquet Apple Ale Susemihl U Berlin Ole Winther echnical Universit of Denmark Abstract
More informationCPSC 340: Machine Learning and Data Mining. MLE and MAP Fall 2017
CPSC 340: Machine Learning and Data Mining MLE and MAP Fall 2017 Assignment 3: Admin 1 late day to hand in tonight, 2 late days for Wednesday. Assignment 4: Due Friday of next week. Last Time: Multi-Class
More informationMachine learning - HT Basis Expansion, Regularization, Validation
Machine learning - HT 016 4. Basis Expansion, Regularization, Validation Varun Kanade University of Oxford Feburary 03, 016 Outline Introduce basis function to go beyond linear regression Understanding
More informationInf2b Learning and Data
Inf2b Learning and Data Lecture : Single layer Neural Networks () (Credit: Hiroshi Shimodaira Iain Murray and Steve Renals) Centre for Speech Technology Research (CSTR) School of Informatics University
More informationWhere now? Machine Learning and Bayesian Inference
Machine Learning and Bayesian Inference Dr Sean Holden Computer Laboratory, Room FC6 Telephone etension 67 Email: sbh@clcamacuk wwwclcamacuk/ sbh/ Where now? There are some simple take-home messages from
More informationDM534 - Introduction to Computer Science
Department of Mathematics and Computer Science University of Southern Denmark, Odense October 21, 2016 Marco Chiarandini DM534 - Introduction to Computer Science Training Session, Week 41-43, Autumn 2016
More informationArtificial Neural Networks 2
CSC2515 Machine Learning Sam Roweis Artificial Neural s 2 We saw neural nets for classification. Same idea for regression. ANNs are just adaptive basis regression machines of the form: y k = j w kj σ(b
More informationMachine Learning - MT & 5. Basis Expansion, Regularization, Validation
Machine Learning - MT 2016 4 & 5. Basis Expansion, Regularization, Validation Varun Kanade University of Oxford October 19 & 24, 2016 Outline Basis function expansion to capture non-linear relationships
More informationNeural Networks. Bishop PRML Ch. 5. Alireza Ghane. Feed-forward Networks Network Training Error Backpropagation Applications
Neural Networks Bishop PRML Ch. 5 Alireza Ghane Neural Networks Alireza Ghane / Greg Mori 1 Neural Networks Neural networks arise from attempts to model human/animal brains Many models, many claims of
More informationCross-validation for detecting and preventing overfitting
Cross-validation for detecting and preventing overfitting A Regression Problem = f() + noise Can we learn f from this data? Note to other teachers and users of these slides. Andrew would be delighted if
More informationIntroduction to Vector Spaces Linear Algebra, Spring 2011
Introduction to Vector Spaces Linear Algebra, Spring 2011 You probabl have heard the word vector before, perhaps in the contet of Calculus III or phsics. You probabl think of a vector like this: 5 3 or
More informationMethods for Advanced Mathematics (C3) Coursework Numerical Methods
Woodhouse College 0 Page Introduction... 3 Terminolog... 3 Activit... 4 Wh use numerical methods?... Change of sign... Activit... 6 Interval Bisection... 7 Decimal Search... 8 Coursework Requirements on
More informationFeed-forward Networks Network Training Error Backpropagation Applications. Neural Networks. Oliver Schulte - CMPT 726. Bishop PRML Ch.
Neural Networks Oliver Schulte - CMPT 726 Bishop PRML Ch. 5 Neural Networks Neural networks arise from attempts to model human/animal brains Many models, many claims of biological plausibility We will
More information4452 Mathematical Modeling Lecture 13: Chaos and Fractals
Math Modeling Lecture 13: Chaos and Fractals Page 1 442 Mathematical Modeling Lecture 13: Chaos and Fractals Introduction In our tetbook, the discussion on chaos and fractals covers less than 2 pages.
More informationMachine Learning. 7. Logistic and Linear Regression
Sapienza University of Rome, Italy - Machine Learning (27/28) University of Rome La Sapienza Master in Artificial Intelligence and Robotics Machine Learning 7. Logistic and Linear Regression Luca Iocchi,
More informationWeek 3 September 5-7.
MA322 Weekl topics and quiz preparations Week 3 September 5-7. Topics These are alread partl covered in lectures. We collect the details for convenience.. Solutions of homogeneous equations AX =. 2. Using
More informationIntroduction. Chapter 1
Chapter 1 Introduction In this book we will be concerned with supervised learning, which is the problem of learning input-output mappings from empirical data (the training dataset). Depending on the characteristics
More informationMachine Learning Lecture 5
Machine Learning Lecture 5 Linear Discriminant Functions 26.10.2017 Bastian Leibe RWTH Aachen http://www.vision.rwth-aachen.de leibe@vision.rwth-aachen.de Course Outline Fundamentals Bayes Decision Theory
More informationCSC2541 Lecture 2 Bayesian Occam s Razor and Gaussian Processes
CSC2541 Lecture 2 Bayesian Occam s Razor and Gaussian Processes Roger Grosse Roger Grosse CSC2541 Lecture 2 Bayesian Occam s Razor and Gaussian Processes 1 / 55 Adminis-Trivia Did everyone get my e-mail
More informationLinear, threshold units. Linear Discriminant Functions and Support Vector Machines. Biometrics CSE 190 Lecture 11. X i : inputs W i : weights
Linear Discriminant Functions and Support Vector Machines Linear, threshold units CSE19, Winter 11 Biometrics CSE 19 Lecture 11 1 X i : inputs W i : weights θ : threshold 3 4 5 1 6 7 Courtesy of University
More informationMachine Learning. Lecture 4: Regularization and Bayesian Statistics. Feng Li. https://funglee.github.io
Machine Learning Lecture 4: Regularization and Bayesian Statistics Feng Li fli@sdu.edu.cn https://funglee.github.io School of Computer Science and Technology Shandong University Fall 207 Overfitting Problem
More informationLinear Models for Regression
Linear Models for Regression Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr
More informationMachine Learning. 1. Linear Regression
Machine Learning Machine Learning 1. Linear Regression Lars Schmidt-Thieme Information Sstems and Machine Learning Lab (ISMLL) Institute for Business Economics and Information Sstems & Institute for Computer
More informationIntroduction to Differential Equations. National Chiao Tung University Chun-Jen Tsai 9/14/2011
Introduction to Differential Equations National Chiao Tung Universit Chun-Jen Tsai 9/14/011 Differential Equations Definition: An equation containing the derivatives of one or more dependent variables,
More information5. Perform the indicated operation and simplify each of the following expressions:
Precalculus Worksheet.5 1. What is - 1? Just because we refer to solutions as imaginar does not mean that the solutions are meaningless. Fields such as quantum mechanics and electromagnetism depend on
More informationH.Algebra 2 Summer Review Packet
H.Algebra Summer Review Packet 1 Correlation of Algebra Summer Packet with Algebra 1 Objectives A. Simplifing Polnomial Epressions Objectives: The student will be able to: Use the commutative, associative,
More informationCOMS 4721: Machine Learning for Data Science Lecture 10, 2/21/2017
COMS 4721: Machine Learning for Data Science Lecture 10, 2/21/2017 Prof. John Paisley Department of Electrical Engineering & Data Science Institute Columbia University FEATURE EXPANSIONS FEATURE EXPANSIONS
More information1.2 Functions and Their Properties PreCalculus
1. Functions and Their Properties PreCalculus 1. FUNCTIONS AND THEIR PROPERTIES Learning Targets for 1. 1. Determine whether a set of numbers or a graph is a function. Find the domain of a function given
More informationSection 3.1. ; X = (0, 1]. (i) f : R R R, f (x, y) = x y
Paul J. Bruillard MATH 0.970 Problem Set 6 An Introduction to Abstract Mathematics R. Bond and W. Keane Section 3.1: 3b,c,e,i, 4bd, 6, 9, 15, 16, 18c,e, 19a, 0, 1b Section 3.: 1f,i, e, 6, 1e,f,h, 13e,
More information2.3 Quadratic Functions
88 Linear and Quadratic Functions. Quadratic Functions You ma recall studing quadratic equations in Intermediate Algebra. In this section, we review those equations in the contet of our net famil of functions:
More informationLogistic Regression. COMP 527 Danushka Bollegala
Logistic Regression COMP 527 Danushka Bollegala Binary Classification Given an instance x we must classify it to either positive (1) or negative (0) class We can use {1,-1} instead of {1,0} but we will
More informationCPSC 340: Machine Learning and Data Mining
CPSC 340: Machine Learning and Data Mining MLE and MAP Original version of these slides by Mark Schmidt, with modifications by Mike Gelbart. 1 Admin Assignment 4: Due tonight. Assignment 5: Will be released
More informationTools of AI. Marcin Sydow. Summary. Machine Learning
Machine Learning Outline of this Lecture Motivation for Data Mining and Machine Learning Idea of Machine Learning Decision Table: Cases and Attributes Supervised and Unsupervised Learning Classication
More informationL20: MLPs, RBFs and SPR Bayes discriminants and MLPs The role of MLP hidden units Bayes discriminants and RBFs Comparison between MLPs and RBFs
L0: MLPs, RBFs and SPR Bayes discriminants and MLPs The role of MLP hidden units Bayes discriminants and RBFs Comparison between MLPs and RBFs CSCE 666 Pattern Analysis Ricardo Gutierrez-Osuna CSE@TAMU
More informationPhysics Gravitational force. 2. Strong or color force. 3. Electroweak force
Phsics 360 Notes on Griffths - pluses and minuses No tetbook is perfect, and Griffithsisnoeception. Themajorplusisthat it is prett readable. For minuses, see below. Much of what G sas about the del operator
More informationStat 406: Algorithms for classification and prediction. Lecture 1: Introduction. Kevin Murphy. Mon 7 January,
1 Stat 406: Algorithms for classification and prediction Lecture 1: Introduction Kevin Murphy Mon 7 January, 2008 1 1 Slides last updated on January 7, 2008 Outline 2 Administrivia Some basic definitions.
More informationPolynomial and Rational Functions
Name Date Chapter Polnomial and Rational Functions Section.1 Quadratic Functions Objective: In this lesson ou learned how to sketch and analze graphs of quadratic functions. Important Vocabular Define
More informationLINEAR MODELS FOR CLASSIFICATION. J. Elder CSE 6390/PSYC 6225 Computational Modeling of Visual Perception
LINEAR MODELS FOR CLASSIFICATION Classification: Problem Statement 2 In regression, we are modeling the relationship between a continuous input variable x and a continuous target variable t. In classification,
More informationRidge Regression 1. to which some random noise is added. So that the training labels can be represented as:
CS 1: Machine Learning Spring 15 College of Computer and Information Science Northeastern University Lecture 3 February, 3 Instructor: Bilal Ahmed Scribe: Bilal Ahmed & Virgil Pavlu 1 Introduction Ridge
More information5.4 dividing POlynOmIAlS
SECTION 5.4 dividing PolNomiAls 3 9 3 learning ObjeCTIveS In this section, ou will: Use long division to divide polnomials. Use snthetic division to divide polnomials. 5.4 dividing POlnOmIAlS Figure 1
More informationAlgebra II Notes Unit Six: Polynomials Syllabus Objectives: 6.2 The student will simplify polynomial expressions.
Algebra II Notes Unit Si: Polnomials Sllabus Objectives: 6. The student will simplif polnomial epressions. Review: Properties of Eponents (Allow students to come up with these on their own.) Let a and
More information6.867 Machine learning
6.867 Machine learning Mid-term eam October 8, 6 ( points) Your name and MIT ID: .5.5 y.5 y.5 a).5.5 b).5.5.5.5 y.5 y.5 c).5.5 d).5.5 Figure : Plots of linear regression results with different types of
More informationRELATIONS AND FUNCTIONS through
RELATIONS AND FUNCTIONS 11.1.2 through 11.1. Relations and Functions establish a correspondence between the input values (usuall ) and the output values (usuall ) according to the particular relation or
More informationKernel Methods. Foundations of Data Analysis. Torsten Möller. Möller/Mori 1
Kernel Methods Foundations of Data Analysis Torsten Möller Möller/Mori 1 Reading Chapter 6 of Pattern Recognition and Machine Learning by Bishop Chapter 12 of The Elements of Statistical Learning by Hastie,
More information6. Vector Random Variables
6. Vector Random Variables In the previous chapter we presented methods for dealing with two random variables. In this chapter we etend these methods to the case of n random variables in the following
More informationToday. Calculus. Linear Regression. Lagrange Multipliers
Today Calculus Lagrange Multipliers Linear Regression 1 Optimization with constraints What if I want to constrain the parameters of the model. The mean is less than 10 Find the best likelihood, subject
More informationECE521 week 3: 23/26 January 2017
ECE521 week 3: 23/26 January 2017 Outline Probabilistic interpretation of linear regression - Maximum likelihood estimation (MLE) - Maximum a posteriori (MAP) estimation Bias-variance trade-off Linear
More informationCS 540: Machine Learning Lecture 1: Introduction
CS 540: Machine Learning Lecture 1: Introduction AD January 2008 AD () January 2008 1 / 41 Acknowledgments Thanks to Nando de Freitas Kevin Murphy AD () January 2008 2 / 41 Administrivia & Announcement
More informationMachine Learning (CSE 446): Probabilistic Machine Learning
Machine Learning (CSE 446): Probabilistic Machine Learning oah Smith c 2017 University of Washington nasmith@cs.washington.edu ovember 1, 2017 1 / 24 Understanding MLE y 1 MLE π^ You can think of MLE as
More informationSUPPORT VECTOR MACHINE
SUPPORT VECTOR MACHINE Mainly based on https://nlp.stanford.edu/ir-book/pdf/15svm.pdf 1 Overview SVM is a huge topic Integration of MMDS, IIR, and Andrew Moore s slides here Our foci: Geometric intuition
More informationMachine Learning Basics III
Machine Learning Basics III Benjamin Roth CIS LMU München Benjamin Roth (CIS LMU München) Machine Learning Basics III 1 / 62 Outline 1 Classification Logistic Regression 2 Gradient Based Optimization Gradient
More informationAnnouncements. CS 188: Artificial Intelligence Spring Classification. Today. Classification overview. Case-Based Reasoning
CS 188: Artificial Intelligence Spring 21 Lecture 22: Nearest Neighbors, Kernels 4/18/211 Pieter Abbeel UC Berkeley Slides adapted from Dan Klein Announcements On-going: contest (optional and FUN!) Remaining
More informationWeek 5: Logistic Regression & Neural Networks
Week 5: Logistic Regression & Neural Networks Instructor: Sergey Levine 1 Summary: Logistic Regression In the previous lecture, we covered logistic regression. To recap, logistic regression models and
More information5. Zeros. We deduce that the graph crosses the x-axis at the points x = 0, 1, 2 and 4, and nowhere else. And that s exactly what we see in the graph.
. Zeros Eample 1. At the right we have drawn the graph of the polnomial = ( 1) ( 2) ( 4). Argue that the form of the algebraic formula allows ou to see right awa where the graph is above the -ais, where
More informationKeira Godwin. Time Allotment: 13 days. Unit Objectives: Upon completion of this unit, students will be able to:
Keira Godwin Time Allotment: 3 das Unit Objectives: Upon completion of this unit, students will be able to: o Simplif comple rational fractions. o Solve comple rational fractional equations. o Solve quadratic
More informationCh 5 Alg 2 L2 Note Sheet Key Do Activity 1 on your Ch 5 Activity Sheet.
Ch Alg L Note Sheet Ke Do Activit 1 on our Ch Activit Sheet. Chapter : Quadratic Equations and Functions.1 Modeling Data With Quadratic Functions You had three forms for linear equations, ou will have
More informationKernel-Based Principal Component Analysis (KPCA) and Its Applications. Nonlinear PCA
Kernel-Based Principal Component Analysis (KPCA) and Its Applications 4//009 Based on slides originaly from Dr. John Tan 1 Nonlinear PCA Natural phenomena are usually nonlinear and standard PCA is intrinsically
More informationDependence and scatter-plots. MVE-495: Lecture 4 Correlation and Regression
Dependence and scatter-plots MVE-495: Lecture 4 Correlation and Regression It is common for two or more quantitative variables to be measured on the same individuals. Then it is useful to consider what
More informationDIFFERENTIAL EQUATIONS First Order Differential Equations. Paul Dawkins
DIFFERENTIAL EQUATIONS First Order Paul Dawkins Table of Contents Preface... First Order... 3 Introduction... 3 Linear... 4 Separable... 7 Eact... 8 Bernoulli... 39 Substitutions... 46 Intervals of Validit...
More informationMachine Learning for OR & FE
Machine Learning for OR & FE Supervised Learning: Regression I Martin Haugh Department of Industrial Engineering and Operations Research Columbia University Email: martin.b.haugh@gmail.com Some of the
More informationLESSON #42 - INVERSES OF FUNCTIONS AND FUNCTION NOTATION PART 2 COMMON CORE ALGEBRA II
LESSON #4 - INVERSES OF FUNCTIONS AND FUNCTION NOTATION PART COMMON CORE ALGEBRA II You will recall from unit 1 that in order to find the inverse of a function, ou must switch and and solve for. Also,
More informationLinear Models for Regression CS534
Linear Models for Regression CS534 Example Regression Problems Predict housing price based on House size, lot size, Location, # of rooms Predict stock price based on Price history of the past month Predict
More informationOverview. Probabilistic Interpretation of Linear Regression Maximum Likelihood Estimation Bayesian Estimation MAP Estimation
Overview Probabilistic Interpretation of Linear Regression Maximum Likelihood Estimation Bayesian Estimation MAP Estimation Probabilistic Interpretation: Linear Regression Assume output y is generated
More informationSmoothness Selection. Simon Wood Mathematical Sciences, University of Bath, U.K.
Smoothness Selection Simon Wood Mathematical Sciences, Universit of Bath, U.K. Smoothness selection approaches The smoothing model i = f ( i ) + ɛ i, ɛ i N(0, σ 2 ), is represented via a basis epansion
More information