Linear regression (cont.). Logistic regression

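CS 7 Foundations of Machine Learning
Lecture 4
Linear regression (cont.). Logistic regression
Milos Hauskrecht
milos@cs.pitt.edu
5329 Sennott Square

Linear regression
Vector definition of the model: include the bias constant in the input vector, $\mathbf{x} = (1, x_1, x_2, \ldots, x_d)$.
$f(\mathbf{x}) = w_0 x_0 + w_1 x_1 + \ldots + w_d x_d = \mathbf{w}^T \mathbf{x}$
$w_0, w_1, \ldots, w_d$ - parameters (weights)
[Figure: linear unit mapping the input vector $\mathbf{x}$ to the output $f(\mathbf{x}, \mathbf{w})$]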

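Linear regression. Error.
Data: $D_i = \langle \mathbf{x}_i, y_i \rangle$
Function: $\mathbf{x}_i \rightarrow f(\mathbf{x}_i)$
We would like to have $y_i \approx f(\mathbf{x}_i)$ for all $i = 1, \ldots, n$.
The error function measures how much our predictions deviate from the desired answers.
Mean-squared error: $J_n = \frac{1}{n} \sum_{i=1}^{n} (y_i - f(\mathbf{x}_i))^2$
Learning: we want to find the weights minimizing the error!

Linear regression. Example.
1-dimensional input $\mathbf{x} = (x_1)$
[Figure: data points and fitted line for a one-dimensional input]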

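Linear regression. Example.
2-dimensional input $\mathbf{x} = (x_1, x_2)$
[Figure: data points and fitted plane for a two-dimensional input]

Linear regression. Optimization.
We want the weights minimizing the error
$J_n = \frac{1}{n} \sum_{i=1}^{n} (y_i - f(\mathbf{x}_i))^2 = \frac{1}{n} \sum_{i=1}^{n} (y_i - \mathbf{w}^T \mathbf{x}_i)^2$
For the optimal set of parameters, derivatives of the error with respect to each parameter must be 0:
$\frac{\partial}{\partial w_j} J_n(\mathbf{w}) = -\frac{2}{n} \sum_{i=1}^{n} (y_i - w_0 x_{i,0} - w_1 x_{i,1} - \ldots - w_d x_{i,d}) x_{i,j} = 0$
Vector of derivatives:
$\mathrm{grad}_{\mathbf{w}}(J_n(\mathbf{w})) = \nabla_{\mathbf{w}} J_n(\mathbf{w}) = -\frac{2}{n} \sum_{i=1}^{n} (y_i - \mathbf{w}^T \mathbf{x}_i) \mathbf{x}_i = \mathbf{0}$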

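Solving linear regression
By rearranging the terms we get a system of linear equations with $d+1$ unknowns, $A\mathbf{w} = \mathbf{b}$. For each $j = 0, 1, \ldots, d$:
$w_0 \sum_{i=1}^{n} x_{i,0} x_{i,j} + w_1 \sum_{i=1}^{n} x_{i,1} x_{i,j} + \ldots + w_d \sum_{i=1}^{n} x_{i,d} x_{i,j} = \sum_{i=1}^{n} y_i x_{i,j}$

Solving linear regression
The optimal set of weights satisfies:
$\nabla_{\mathbf{w}} J_n(\mathbf{w}) = -\frac{2}{n} \sum_{i=1}^{n} (y_i - \mathbf{w}^T \mathbf{x}_i) \mathbf{x}_i = \mathbf{0}$
Leads to a system of linear equations (SLE) with $d+1$ unknowns of the form $A\mathbf{w} = \mathbf{b}$.
Solution to SLE: ?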

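Solving linear regression
The optimal set of weights satisfies:
$\nabla_{\mathbf{w}} J_n(\mathbf{w}) = -\frac{2}{n} \sum_{i=1}^{n} (y_i - \mathbf{w}^T \mathbf{x}_i) \mathbf{x}_i = \mathbf{0}$
Leads to a system of linear equations (SLE) with $d+1$ unknowns of the form $A\mathbf{w} = \mathbf{b}$.
Solution to SLE: matrix inversion, $\mathbf{w} = A^{-1} \mathbf{b}$

Gradient descent solution
Goal: the weight optimization in the linear regression model
$J_n = Error(\mathbf{w}) = \frac{1}{n} \sum_{i=1}^{n} (y_i - f(\mathbf{x}_i, \mathbf{w}))^2$
An alternative to the SLE solution: gradient descent.
Idea: adjust weights in the direction that improves the Error. The gradient tells us what is the right direction:
$\mathbf{w} \leftarrow \mathbf{w} - \alpha \nabla_{\mathbf{w}} Error(\mathbf{w})$
$\alpha > 0$ - a learning rate (scales the gradient changes)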
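The SLE route described on the slides above can be sketched in a few lines. This is a minimal illustration, not code from the lecture: the function names `solve_linear_system` and `fit_linear_regression` and the toy data are assumptions, and plain Python lists are used instead of a linear algebra library so the block stays self-contained. The system is solved by elimination rather than by forming $A^{-1}$ explicitly, which gives the same weights.

```python
def solve_linear_system(A, b):
    """Solve A w = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    # Augmented matrix [A | b]; copy rows so the inputs are not modified.
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                for c in range(col, n + 1):
                    M[r][c] -= factor * M[col][c]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_linear_regression(xs, ys):
    """Build A = sum_i x_i x_i^T and b = sum_i y_i x_i, then solve A w = b.

    xs holds input vectors without the bias term; a constant 1 is prepended,
    so the returned list is [w_0, w_1, ..., w_d].
    """
    X = [[1.0] + list(x) for x in xs]
    d1 = len(X[0])
    A = [[sum(row[j] * row[k] for row in X) for k in range(d1)] for j in range(d1)]
    b = [sum(row[j] * y for row, y in zip(X, ys)) for j in range(d1)]
    return solve_linear_system(A, b)


# Hypothetical toy data generated from y = 1 + 2x (no noise).
xs = [[0.0], [1.0], [2.0], [3.0]]
ys = [1.0, 3.0, 5.0, 7.0]
weights = fit_linear_regression(xs, ys)
```

On this noise-free data the recovered weights match the generating line up to floating-point rounding.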

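Gradient descent method
Descend using the gradient information.
[Figure: $Error(\mathbf{w})$ plotted against $w$, with the gradient $\nabla_{\mathbf{w}} Error(\mathbf{w})|_{\mathbf{w}^*}$ at the point $\mathbf{w}^*$ and the direction of the descent]
Change the value of $\mathbf{w}$ according to the gradient:
$\mathbf{w} \leftarrow \mathbf{w} - \alpha \nabla_{\mathbf{w}} Error(\mathbf{w})$

Gradient descent method
[Figure: $Error(\mathbf{w})$ plotted against $w$, with the derivative $\frac{\partial}{\partial w} Error(\mathbf{w})|_{\mathbf{w}^*}$ at the point $\mathbf{w}^*$]
New value of the parameter:
$w_j \leftarrow w_j^* - \alpha \frac{\partial}{\partial w_j} Error(\mathbf{w})|_{\mathbf{w}^*}$ for all $j$
$\alpha > 0$ - a learning rate (scales the gradient changes)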

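Gradient descent method
Iteratively approaches the optimum of the Error function.
[Figure: $Error(\mathbf{w})$ plotted against $w$, with successive iterates $w^{(0)}, w^{(1)}, w^{(2)}, w^{(3)}$]

Batch vs online regression algorithm
The error function is defined on the complete dataset $D$:
$J_n = Error(\mathbf{w}) = \frac{1}{n} \sum_{i=1}^{n} (y_i - f(\mathbf{x}_i, \mathbf{w}))^2$
We say we are learning the model in the batch mode:
All examples are available at the time of learning.
Weights are optimized with respect to all training examples.
An alternative is to learn the model in the online mode:
Examples are arriving sequentially.
Model weights are updated after every example.
If needed, examples seen can be forgotten.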
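Batch gradient descent for the mean-squared error can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name `batch_gradient_descent`, the fixed step count used in place of a stopping criterion, the learning rate 0.05, and the toy data are all choices made here, not values from the lecture.

```python
def batch_gradient_descent(xs, ys, alpha=0.05, steps=2000):
    """Minimize J_n(w) = (1/n) * sum_i (y_i - w^T x_i)^2 by gradient descent.

    Every step uses all n examples (batch mode). A constant 1 is prepended
    to each input, so the result is [w_0, w_1, ..., w_d].
    """
    X = [[1.0] + list(x) for x in xs]
    n = len(X)
    w = [0.0] * len(X[0])
    for _ in range(steps):
        residuals = [y - sum(wj * xj for wj, xj in zip(w, row))
                     for row, y in zip(X, ys)]
        # Gradient of J_n: -(2/n) * sum_i (y_i - w^T x_i) * x_i
        grad = [-2.0 / n * sum(r * row[j] for r, row in zip(residuals, X))
                for j in range(len(w))]
        w = [wj - alpha * gj for wj, gj in zip(w, grad)]
    return w


# Hypothetical toy data generated from y = 1 + 2x (no noise).
xs = [[0.0], [1.0], [2.0], [3.0]]
ys = [1.0, 3.0, 5.0, 7.0]
gd_weights = batch_gradient_descent(xs, ys)
```

The learning rate has to be small enough for the iteration to converge; a value that works for this toy dataset may diverge on inputs with a larger scale.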

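Online gradient algorithm
The error function is defined for the complete dataset $D$:
$J_n = Error(\mathbf{w}) = \frac{1}{n} \sum_{i=1}^{n} (y_i - f(\mathbf{x}_i, \mathbf{w}))^2$
Error for one example $D_i = \langle \mathbf{x}_i, y_i \rangle$:
$J_{online} = Error_i(\mathbf{w}) = \frac{1}{2} (y_i - f(\mathbf{x}_i, \mathbf{w}))^2$
Online gradient method: changes weights after every example
$w_j \leftarrow w_j - \alpha \frac{\partial}{\partial w_j} Error_i(\mathbf{w})$
Vector form: $\mathbf{w} \leftarrow \mathbf{w} - \alpha \nabla_{\mathbf{w}} Error_i(\mathbf{w})$
$\alpha > 0$ - learning rate that depends on the number of updates

Online gradient method
Linear model: $f(\mathbf{x}) = \mathbf{w}^T \mathbf{x}$
On-line error: $J_{online} = Error_i(\mathbf{w}) = \frac{1}{2} (y_i - f(\mathbf{x}_i, \mathbf{w}))^2$
On-line algorithm: generates a sequence of online updates.
$i$-th update step with $D_i = \langle \mathbf{x}_i, y_i \rangle$, $j$-th weight:
$w_j^{(i)} \leftarrow w_j^{(i-1)} - \alpha(i) \frac{\partial Error_i(\mathbf{w})}{\partial w_j} |_{\mathbf{w}^{(i-1)}}$
$w_j^{(i)} \leftarrow w_j^{(i-1)} + \alpha(i) (y_i - f(\mathbf{x}_i, \mathbf{w}^{(i-1)})) x_{i,j}$
Fixed learning rate: $\alpha(i) = C$ - use a small constant
Annealed learning rate: $\alpha(i) \approx \frac{1}{i}$ - gradually rescales changes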

Online regression algorithm

Online-linear-regression (stopping_criterion)
    initialize weights w = (w_0, w_1, ..., w_d)
    initialize i = 1;
    while stopping_criterion = FALSE
        select the next data point D_i = (x_i, y_i)
        set learning rate alpha(i)
        update weight vector w <- w + alpha(i) (y_i - f(x_i, w)) x_i
    end
    return weights

Advantages: very easy to implement, works for continuous data streams.

On-line learning. Example
[Figure: four panels showing the fitted line after successive online updates]
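The online algorithm above translates almost line for line into code. The sketch below is an assumption-laden illustration: the name `online_linear_regression`, the use of stream exhaustion as the stopping criterion, the constant learning rate 0.05, and the cycled toy data are choices made here. The learning rate is passed as a function of the update index so that either a fixed or an annealed schedule can be plugged in.

```python
from itertools import cycle, islice


def online_linear_regression(stream, dim, learning_rate):
    """Update the weights after every example drawn from `stream`.

    stream yields (x, y) pairs, x being a list of `dim` inputs without bias.
    learning_rate(i) returns alpha(i) for the i-th update (i starts at 1).
    The loop ends when the stream is exhausted.
    """
    w = [0.0] * (dim + 1)
    for i, (x, y) in enumerate(stream, start=1):
        xb = [1.0] + list(x)  # prepend the bias input x_0 = 1
        prediction = sum(wj * xj for wj, xj in zip(w, xb))
        alpha = learning_rate(i)
        # w <- w + alpha(i) * (y_i - f(x_i, w)) * x_i
        w = [wj + alpha * (y - prediction) * xj for wj, xj in zip(w, xb)]
    return w


# Hypothetical toy data from y = 1 + 2x, replayed to imitate a data stream.
data = [([0.0], 1.0), ([1.0], 3.0), ([2.0], 5.0), ([3.0], 7.0)]
stream = islice(cycle(data), 8000)
online_weights = online_linear_regression(stream, dim=1,
                                          learning_rate=lambda i: 0.05)
```

Each example is used once and then discarded, so the function never holds the dataset in memory. An annealed schedule can be tried by passing, for example, `lambda i: 1.0 / i`, though its behavior depends on the scale of the inputs.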

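Adaptive models
Linear model: $f(\mathbf{x}) = \mathbf{w}^T \mathbf{x}$
On-line error: $J_{online} = Error_i(\mathbf{w}) = \frac{1}{2} (y_i - f(\mathbf{x}_i, \mathbf{w}))^2$
On-line algorithm: a sequence of online updates, one example at a time. Useful for continuous data streams.
Adaptive models: the underlying model is not stationary and can change over time. Example: seasonal changes.
The on-line algorithm can be made adaptive by keeping the learning rate at some constant value: $\alpha(i) = c$

Extensions of simple linear model
Replace inputs to linear units with feature (basis) functions to model nonlinearities:
$f(\mathbf{x}) = w_0 + \sum_{j=1}^{m} w_j \phi_j(\mathbf{x})$
$\phi_j(\mathbf{x})$ - an arbitrary function of $\mathbf{x}$
[Figure: linear unit whose inputs are the basis functions $\phi_1(\mathbf{x}), \phi_2(\mathbf{x}), \ldots, \phi_m(\mathbf{x})$]
The same techniques as before can be used to learn the weights!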

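Extensions of the linear model
Models linear in the parameters we want to fit:
$f(\mathbf{x}) = w_0 + \sum_{k=1}^{m} w_k \phi_k(\mathbf{x})$
$w_0, w_1, \ldots, w_m$ - parameters
$\phi_1(\mathbf{x}), \phi_2(\mathbf{x}), \ldots, \phi_m(\mathbf{x})$ - feature or basis functions
Basis functions examples:
A higher order polynomial, one-dimensional input $\mathbf{x} = (x_1)$: $\phi_1(\mathbf{x}) = x$, $\phi_2(\mathbf{x}) = x^2$, $\phi_3(\mathbf{x}) = x^3$
Multidimensional quadratic, $\mathbf{x} = (x_1, x_2)$: $\phi_1(\mathbf{x}) = x_1$, $\phi_2(\mathbf{x}) = x_1^2$, $\phi_3(\mathbf{x}) = x_2$, $\phi_4(\mathbf{x}) = x_2^2$, $\phi_5(\mathbf{x}) = x_1 x_2$
Other types of basis functions: $\phi_1(x) = \sin x$, $\phi_2(x) = \cos x$

Example. Regression with polynomials.
Regression with polynomials of degree $m$
Data points: pairs of $\langle x, y \rangle$
Feature functions: $m$ feature functions $\phi_i(x) = x^i$, $i = 1, 2, \ldots, m$
Function to learn: $f(x, \mathbf{w}) = w_0 + \sum_{i=1}^{m} w_i \phi_i(x) = w_0 + \sum_{i=1}^{m} w_i x^i$
[Figure: linear unit with inputs $1, x, x^2, \ldots, x^m$]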
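A short sketch may help show why the polynomial model is still linear in its weights: the input is first mapped to the features $1, x, x^2, \ldots, x^m$, and the prediction is an ordinary weighted sum of those features. The function names `polynomial_features` and `predict` and the example weights are hypothetical.

```python
def polynomial_features(x, degree):
    """Return [1, x, x^2, ..., x^degree]; the leading 1 pairs with w_0."""
    return [x ** k for k in range(degree + 1)]


def predict(weights, x):
    """Evaluate f(x, w) = w_0 + sum_k w_k * x^k.

    The model is nonlinear in x but linear in the weights, so the same
    least-squares or gradient procedures apply to the expanded features.
    """
    features = polynomial_features(x, len(weights) - 1)
    return sum(w * phi for w, phi in zip(weights, features))


features = polynomial_features(2.0, 3)
value = predict([1.0, 0.0, 2.0], 3.0)  # evaluates 1 + 0*x + 2*x^2 at x = 3
```

Fitting then amounts to running any linear regression procedure on the expanded feature vectors instead of the raw inputs.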

Multidimensional model example
[Figure: model fitted over a two-dimensional input]

Multidimensional model example
[Figure]

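Regularized linear regression
If the number of parameters is large relative to the number of data points used to train the model, we face the threat of overfit (the generalization error of the model goes up).
The prediction accuracy can often be improved by setting some coefficients to zero. This increases the bias and reduces the variance of estimates.
Solutions:
Subset selection
Ridge regression
Lasso regression
Principal component regression
Next: ridge regression

Ridge regression
Error function for the standard least squares estimates:
$J_n(\mathbf{w}) = \frac{1}{n} \sum_{i=1}^{n} (y_i - \mathbf{w}^T \mathbf{x}_i)^2$
We seek: $\mathbf{w}^* = \arg\min_{\mathbf{w}} \frac{1}{n} \sum_{i=1}^{n} (y_i - \mathbf{w}^T \mathbf{x}_i)^2$
Ridge regression:
$J_n(\mathbf{w}) = \frac{1}{n} \sum_{i=1}^{n} (y_i - \mathbf{w}^T \mathbf{x}_i)^2 + \lambda \|\mathbf{w}\|_{L2}^2$
where $\|\mathbf{w}\|_{L2}^2 = \sum_{i=0}^{d} w_i^2$ and $\lambda \geq 0$.
What does the new error function do?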

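Ridge regression
Standard regression: $J_n(\mathbf{w}) = \frac{1}{n} \sum_{i=1}^{n} (y_i - \mathbf{w}^T \mathbf{x}_i)^2$
Ridge regression: $J_n(\mathbf{w}) = \frac{1}{n} \sum_{i=1}^{n} (y_i - \mathbf{w}^T \mathbf{x}_i)^2 + \lambda \|\mathbf{w}\|_{L2}^2$
$\|\mathbf{w}\|_{L2}^2 = \sum_{i=0}^{d} w_i^2$ penalizes non-zero weights with the cost proportional to $\lambda$ (a shrinkage coefficient).
If an input attribute $x_j$ has a small effect on improving the error function, it is "shut down" by the penalty term.
Inclusion of a shrinkage penalty is often referred to as regularization. (Ridge regression is related to Tikhonov regularization.)

Regularized linear regression
How to solve the least squares problem if the error function is enriched by the regularization term $\lambda \|\mathbf{w}\|^2$?
Answer: the solution to the optimal set of weights is obtained again by solving a set of linear equations.
Standard linear regression: $\nabla_{\mathbf{w}} J_n(\mathbf{w}) = -\frac{2}{n} \sum_{i=1}^{n} (y_i - \mathbf{w}^T \mathbf{x}_i) \mathbf{x}_i = \mathbf{0}$
Solution: $\mathbf{w}^* = (X^T X)^{-1} X^T \mathbf{y}$
where $X$ is a matrix with rows corresponding to examples and columns to inputs.
Regularized linear regression: $\mathbf{w}^* = (\lambda I + X^T X)^{-1} X^T \mathbf{y}$
(with the constant factor $n$ from the $\frac{1}{n}$ term absorbed into $\lambda$)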
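The regularized solution $(\lambda I + X^T X)^{-1} X^T \mathbf{y}$ can be checked numerically with a small sketch. The names `solve` and `fit_ridge`, the toy data, and the value $\lambda = 10$ are assumptions made for illustration. As on the slides, the penalty here covers all weights including $w_0$; many practical implementations leave the bias unpenalized.

```python
def solve(A, b):
    """Solve A w = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                for c in range(col, n + 1):
                    M[r][c] -= factor * M[col][c]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_ridge(xs, ys, lam):
    """Solve (lam * I + X^T X) w = X^T y; lam = 0 gives ordinary least squares."""
    X = [[1.0] + list(x) for x in xs]
    d1 = len(X[0])
    A = [[sum(r[j] * r[k] for r in X) + (lam if j == k else 0.0)
          for k in range(d1)] for j in range(d1)]
    b = [sum(r[j] * y for r, y in zip(X, ys)) for j in range(d1)]
    return solve(A, b)


# Hypothetical toy data generated from y = 1 + 2x (no noise).
xs = [[0.0], [1.0], [2.0], [3.0]]
ys = [1.0, 3.0, 5.0, 7.0]
w_ols = fit_ridge(xs, ys, lam=0.0)
w_ridge = fit_ridge(xs, ys, lam=10.0)
```

Comparing `w_ols` and `w_ridge` shows the shrinkage effect: the penalized weights have a smaller squared norm than the unpenalized ones.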

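Lasso regression
Standard regression: $J_n(\mathbf{w}) = \frac{1}{n} \sum_{i=1}^{n} (y_i - \mathbf{w}^T \mathbf{x}_i)^2$
Lasso regression/regularization: $J_n(\mathbf{w}) = \frac{1}{n} \sum_{i=1}^{n} (y_i - \mathbf{w}^T \mathbf{x}_i)^2 + \lambda \|\mathbf{w}\|_{L1}$
$\|\mathbf{w}\|_{L1} = \sum_{i=0}^{d} |w_i|$ penalizes non-zero weights with the cost proportional to $\lambda$.
L1 is more aggressive in pushing the weights to 0 compared to L2.

Classification
Data: $D = \{d_1, d_2, \ldots, d_n\}$, $d_i = \langle \mathbf{x}_i, y_i \rangle$
$y_i$ represents a discrete class value.
Goal: learn $f : X \rightarrow Y$
Binary classification: a special case when $Y \in \{0, 1\}$
First step: we need to devise a model of the function $f$.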

Discriminant functions
A common way to represent a classifier is by using discriminant functions.
Works for both the binary and multi-way classification.
Idea: for every class $i = 0, 1, \ldots, k$ define a function $g_i(\mathbf{x})$ mapping $X \rightarrow \Re$.
When the decision on input $\mathbf{x}$ should be made, choose the class with the highest value of $g_i(\mathbf{x})$:
$y^* = \arg\max_i g_i(\mathbf{x})$
So what happens with the input space? Assume a binary case.
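The decision rule $y^* = \arg\max_i g_i(\mathbf{x})$ is short enough to write out directly. In this sketch the function name `classify` and the two example discriminants are hypothetical; any list of callables that score an input would work.

```python
def classify(x, discriminants):
    """Return the index i of the discriminant g_i with the highest g_i(x).

    Ties are resolved in favor of the lowest class index.
    """
    scores = [g(x) for g in discriminants]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical binary example: class 1 scores higher when x[0] is positive.
g0 = lambda x: -x[0]
g1 = lambda x: x[0]
label_positive = classify([2.0], [g0, g1])
label_negative = classify([-1.0], [g0, g1])
```

In this example the two discriminants are equal at `x[0] == 0`, which is where the decision boundary lies.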

Discriminant functions
[Figure: two-dimensional input space]

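Discriminant functions
[Figure: two-dimensional input space]

Discriminant functions
Decision boundary: discriminant functions are equal, $g_1(\mathbf{x}) = g_0(\mathbf{x})$
[Figure: two-dimensional input space with the decision boundary]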

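Quadratic decision boundary
[Figure: two-dimensional input space with a quadratic decision boundary]

Logistic regression model
Defines a linear decision boundary.
Discriminant functions: $g_1(\mathbf{x}) = g(\mathbf{w}^T \mathbf{x})$, $g_0(\mathbf{x}) = 1 - g(\mathbf{w}^T \mathbf{x})$
where $g(z) = 1 / (1 + e^{-z})$ is a logistic function
$f(\mathbf{x}, \mathbf{w}) = g_1(\mathbf{x}) = g(\mathbf{w}^T \mathbf{x})$
[Figure: linear unit over the input vector $\mathbf{x}$ followed by the logistic function, with output $f(\mathbf{x}, \mathbf{w})$]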

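Logistic function
Function: $g(z) = \frac{1}{1 + e^{-z}}$
Is also referred to as a sigmoid function.
Takes a real number and outputs a number in the interval [0, 1].
Models a smooth switching function; replaces the hard threshold function.
[Figure: two panels, "Logistic (smooth) switching" and "Threshold (hard) switching"]

Logistic regression model
Discriminant functions: $g_1(\mathbf{x}) = g(\mathbf{w}^T \mathbf{x})$, $g_0(\mathbf{x}) = 1 - g(\mathbf{w}^T \mathbf{x})$
Values of discriminant functions vary in the interval [0, 1].
Probabilistic interpretation:
$f(\mathbf{x}, \mathbf{w}) = p(y = 1 | \mathbf{w}, \mathbf{x}) = g_1(\mathbf{x}) = g(\mathbf{w}^T \mathbf{x})$
[Figure: linear unit over the input vector $\mathbf{x}$ followed by the logistic function, with output $p(y = 1 | \mathbf{x}, \mathbf{w})$]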
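A minimal sketch of the logistic function follows. The function name `logistic` is an assumption, and the two-branch form is a common implementation choice (not something stated on the slides) that avoids overflow in `math.exp` for large negative inputs.

```python
import math


def logistic(z):
    """Logistic (sigmoid) function g(z) = 1 / (1 + exp(-z)).

    For z < 0 the algebraically equal form exp(z) / (1 + exp(z)) is used,
    so math.exp is never called with a large positive argument.
    """
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


midpoint = logistic(0.0)
far_left = logistic(-1000.0)
far_right = logistic(1000.0)
```

The function is symmetric in the sense that `logistic(z) + logistic(-z)` equals 1 up to rounding, which matches $g_0(\mathbf{x}) = 1 - g_1(\mathbf{x})$ on the slide.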

Logistic regression
We learn a probabilistic function $f : X \rightarrow [0, 1]$
where $f$ describes the probability of class 1 given $\mathbf{x}$:
$f(\mathbf{x}, \mathbf{w}) = g_1(\mathbf{x}) = g(\mathbf{w}^T \mathbf{x}) = p(y = 1 | \mathbf{x}, \mathbf{w})$
Note that: $p(y = 0 | \mathbf{x}, \mathbf{w}) = 1 - p(y = 1 | \mathbf{x}, \mathbf{w})$
Making decisions with the logistic regression model:
If $p(y = 1 | \mathbf{x}) \geq 1/2$ then choose 1, else choose 0.
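The decision rule can be sketched as below. The names `predict_proba` and `predict` and the example weights are hypothetical; the weight list is assumed to hold the bias $w_0$ first, followed by one weight per input.

```python
import math


def logistic(z):
    """Overflow-safe logistic function g(z) = 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(w, x):
    """p(y = 1 | x, w) = g(w^T x), with w[0] acting as the bias weight."""
    return logistic(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def predict(w, x):
    """Choose class 1 if p(y = 1 | x, w) >= 1/2, otherwise class 0."""
    return 1 if predict_proba(w, x) >= 0.5 else 0


# Hypothetical weights: w^T x = -1 + 2 * x_1, so the boundary is at x_1 = 0.5.
w = [-1.0, 2.0]
decisions = [predict(w, [0.0]), predict(w, [0.5]), predict(w, [1.0])]
```

Because $g(z) \geq 1/2$ exactly when $z \geq 0$, the same decisions could be made from the sign of $\mathbf{w}^T \mathbf{x}$ without evaluating the logistic function at all.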

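Linear decision boundary
The logistic regression model defines a linear decision boundary. Why?
Answer: compare the two discriminant functions.
Decision boundary: $g_1(\mathbf{x}) = g_0(\mathbf{x})$
For the boundary it must hold:
$\log \frac{g_0(\mathbf{x})}{g_1(\mathbf{x})} = \log \frac{1 - g(\mathbf{w}^T \mathbf{x})}{g(\mathbf{w}^T \mathbf{x})} = 0$
$\log \frac{g_0(\mathbf{x})}{g_1(\mathbf{x})} = \log \frac{\exp(-\mathbf{w}^T \mathbf{x}) / (1 + \exp(-\mathbf{w}^T \mathbf{x}))}{1 / (1 + \exp(-\mathbf{w}^T \mathbf{x}))} = \log \exp(-\mathbf{w}^T \mathbf{x}) = -\mathbf{w}^T \mathbf{x} = 0$
The boundary is therefore the set of points with $\mathbf{w}^T \mathbf{x} = 0$, which is linear in $\mathbf{x}$.

Logistic regression model. Decision boundary
LR defines a linear decision boundary.
Example: 2 classes (blue and red points)
[Figure: two-dimensional input space with blue and red points separated by the linear decision boundary]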

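Logistic regression: parameter learning
Likelihood of outputs
Let $D_i = \langle \mathbf{x}_i, y_i \rangle$ and $\mu_i = p(y_i = 1 | \mathbf{x}_i, \mathbf{w}) = g(z_i) = g(\mathbf{w}^T \mathbf{x}_i)$
Then $L(D, \mathbf{w}) = \prod_{i=1}^{n} P(y = y_i | \mathbf{x}_i, \mathbf{w}) = \prod_{i=1}^{n} \mu_i^{y_i} (1 - \mu_i)^{1 - y_i}$
Find weights $\mathbf{w}$ that maximize the likelihood of outputs.
Apply the log-likelihood trick. The optimal weights are the same for both the likelihood and the log-likelihood:
$l(D, \mathbf{w}) = \log \prod_{i=1}^{n} \mu_i^{y_i} (1 - \mu_i)^{1 - y_i} = \sum_{i=1}^{n} \log \mu_i^{y_i} (1 - \mu_i)^{1 - y_i} = \sum_{i=1}^{n} y_i \log \mu_i + (1 - y_i) \log (1 - \mu_i)$

Logistic regression: parameter learning
Notation: $\mu_i = p(y_i = 1 | \mathbf{x}_i, \mathbf{w}) = g(z_i) = g(\mathbf{w}^T \mathbf{x}_i)$
Log likelihood: $l(D, \mathbf{w}) = \sum_{i=1}^{n} y_i \log \mu_i + (1 - y_i) \log (1 - \mu_i)$
Derivatives of the log-likelihood:
$-\frac{\partial}{\partial w_j} l(D, \mathbf{w}) = \sum_{i=1}^{n} -x_{i,j} (y_i - g(z_i))$
$\nabla_{\mathbf{w}} [-l(D, \mathbf{w})] = \sum_{i=1}^{n} -\mathbf{x}_i (y_i - g(\mathbf{w}^T \mathbf{x}_i)) = \sum_{i=1}^{n} -\mathbf{x}_i (y_i - f(\mathbf{w}, \mathbf{x}_i))$
Nonlinear in weights!
Gradient descent:
$\mathbf{w}^{(k)} \leftarrow \mathbf{w}^{(k-1)} - \alpha(k) \nabla_{\mathbf{w}} [-l(D, \mathbf{w})] |_{\mathbf{w}^{(k-1)}}$
$\mathbf{w}^{(k)} \leftarrow \mathbf{w}^{(k-1)} + \alpha(k) \sum_{i=1}^{n} [y_i - f(\mathbf{w}^{(k-1)}, \mathbf{x}_i)] \mathbf{x}_i$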
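The batch update for logistic regression can be sketched as follows. The function names, the learning rate 0.1, the fixed number of steps, and the toy dataset are assumptions chosen for illustration. The toy data is linearly separable, so the weights would keep growing if the loop ran indefinitely; the step count acts as the stopping criterion.

```python
import math


def logistic(z):
    """Overflow-safe logistic function g(z) = 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(w, x):
    """mu = p(y = 1 | x, w) = g(w^T x), with w[0] as the bias weight."""
    return logistic(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def log_likelihood(w, xs, ys):
    """l(D, w) = sum_i y_i log(mu_i) + (1 - y_i) log(1 - mu_i), y_i in {0, 1}."""
    total = 0.0
    for x, y in zip(xs, ys):
        mu = predict_proba(w, x)
        total += math.log(mu) if y == 1 else math.log(1.0 - mu)
    return total


def fit_logistic(xs, ys, alpha=0.1, steps=100):
    """Batch update: w <- w + alpha * sum_i (y_i - f(w, x_i)) * x_i."""
    w = [0.0] * (len(xs[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in zip(xs, ys):
            err = y - predict_proba(w, x)
            for j, xj in enumerate([1.0] + list(x)):
                grad[j] += err * xj
        w = [wj + alpha * gj for wj, gj in zip(w, grad)]
    return w


# Hypothetical toy data: class 0 for negative inputs, class 1 for positive.
xs = [[-2.0], [-1.0], [1.0], [2.0]]
ys = [0, 0, 1, 1]
w_fit = fit_logistic(xs, ys)
```

Evaluating `log_likelihood` before and after training is a simple way to confirm that the updates move in the right direction.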

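Derivation of the gradient
Log likelihood: $l(D, \mathbf{w}) = \sum_{i=1}^{n} y_i \log \mu_i + (1 - y_i) \log (1 - \mu_i)$
Derivatives of the log-likelihood:
$\frac{\partial}{\partial w_j} l(D, \mathbf{w}) = \sum_{i=1}^{n} \frac{\partial}{\partial z_i} [y_i \log \mu_i + (1 - y_i) \log (1 - \mu_i)] \frac{\partial z_i}{\partial w_j}$, with $\frac{\partial z_i}{\partial w_j} = x_{i,j}$
Derivative of a logistic function: $\frac{\partial g(z_i)}{\partial z_i} = g(z_i) (1 - g(z_i))$
$\frac{\partial}{\partial z_i} [y_i \log \mu_i + (1 - y_i) \log (1 - \mu_i)] = y_i \frac{1}{g(z_i)} \frac{\partial g(z_i)}{\partial z_i} + (1 - y_i) \frac{-1}{1 - g(z_i)} \frac{\partial g(z_i)}{\partial z_i}$
$= y_i (1 - g(z_i)) + (1 - y_i) (-g(z_i)) = y_i - g(z_i)$
$\nabla_{\mathbf{w}} l(D, \mathbf{w}) = \sum_{i=1}^{n} \mathbf{x}_i (y_i - g(\mathbf{w}^T \mathbf{x}_i)) = \sum_{i=1}^{n} \mathbf{x}_i (y_i - f(\mathbf{w}, \mathbf{x}_i))$

Logistic regression. Online gradient descent
On-line component of the log-likelihood:
$J_{online}(D_i, \mathbf{w}) = -[y_i \log \mu_i + (1 - y_i) \log (1 - \mu_i)]$
On-line learning update for weight $\mathbf{w}$:
$\mathbf{w}^{(k)} \leftarrow \mathbf{w}^{(k-1)} - \alpha(k) \nabla_{\mathbf{w}} [J_{online}(D_k, \mathbf{w})] |_{\mathbf{w}^{(k-1)}}$
$k$-th update for the logistic regression and $D_k = \langle \mathbf{x}_k, y_k \rangle$:
$\mathbf{w}^{(k)} \leftarrow \mathbf{w}^{(k-1)} + \alpha(k) [y_k - f(\mathbf{w}^{(k-1)}, \mathbf{x}_k)] \mathbf{x}_k$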

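Online logistic regression algorithm

Online-logistic-regression (stopping_criterion)
    initialize weights w = (w_0, w_1, ..., w_d)
    while stopping_criterion = FALSE do
        select next data point D_i = <x_i, y_i>
        set alpha(i)
        update weights (in parallel) w <- w + alpha(i) [y_i - f(w, x_i)] x_i
    end
    return weights

Online algorithm. Example.
[Figure]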
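The online logistic regression algorithm above can be sketched in the same style as its linear counterpart. The function name `online_logistic_regression`, the stream-exhaustion stopping rule, the constant learning rate 0.1, and the replayed toy data are assumptions made here for illustration.

```python
import math
from itertools import cycle, islice


def logistic(z):
    """Overflow-safe logistic function g(z) = 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def online_logistic_regression(stream, dim, learning_rate):
    """Apply w <- w + alpha(i) * (y_i - f(w, x_i)) * x_i once per example.

    stream yields (x, y) pairs with y in {0, 1}; learning_rate(i) gives
    alpha(i) for the i-th update. All weights are updated together from
    the same prediction. The loop ends when the stream is exhausted.
    """
    w = [0.0] * (dim + 1)
    for i, (x, y) in enumerate(stream, start=1):
        xb = [1.0] + list(x)  # prepend the bias input x_0 = 1
        f = logistic(sum(wj * xj for wj, xj in zip(w, xb)))
        alpha = learning_rate(i)
        w = [wj + alpha * (y - f) * xj for wj, xj in zip(w, xb)]
    return w


# Hypothetical toy data, replayed to imitate a stream of examples.
data = [([-2.0], 0), ([-1.0], 0), ([1.0], 1), ([2.0], 1)]
w_online = online_logistic_regression(islice(cycle(data), 400), dim=1,
                                      learning_rate=lambda i: 0.1)
```

Building the new weight list from a single prediction `f` mirrors the "update weights (in parallel)" step: no weight sees a partially updated vector.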

Online algorithm. Example.
[Figure]

Online algorithm. Example.
[Figure]