Linear regression (cont.) Linear methods for classification

Transcription:

CS 2750 Machine Learning. Lecture 7: Linear regression (cont.). Linear methods for classification. Milos Hauskrecht, milos@cs.pitt.edu, 5329 Sennott Square.

Coefficient shrinkage. The least squares estimates often have low bias but high variance. The prediction accuracy can often be improved by setting some coefficients to zero: this increases the bias but reduces the variance of the estimates. Solutions: subset selection, ridge regression, principal component regression. Next: ridge regression.

Ridge regression. Error function for the standard least squares estimates:

J(w) = (1/n) Σ_{i=1..n} (y_i − w^T x_i)^2, and we seek w* = argmin_w J(w).

Ridge regression adds a penalty term:

J(w) = (1/n) Σ_{i=1..n} (y_i − w^T x_i)^2 + λ ||w||^2, where ||w||^2 = Σ_{j=0..d} w_j^2 and λ ≥ 0.

What does the new error function do? Compared with standard regression, ridge regression penalizes non-zero weights with a cost proportional to λ (a shrinkage coefficient). If an input attribute x_j has a small effect on improving the error function, it is shut down by the penalty term. Inclusion of a shrinkage penalty is often referred to as regularization.
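As a small illustration, the ridge objective has a closed-form minimizer, w = (X^T X + λI)^{-1} X^T y. The data below are invented for illustration; note that this simple version also penalizes the bias weight.

```python
import numpy as np

# Invented 1-D data: y = 1 + 2x exactly; first column of X is the bias input.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])

def ridge_weights(X, y, lam):
    """Closed-form ridge solution: w = (X^T X + lam*I)^{-1} X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

w_ols = ridge_weights(X, y, 0.0)     # lam = 0 recovers ordinary least squares
w_ridge = ridge_weights(X, y, 10.0)  # large lam shrinks the weights toward zero
```

Increasing λ trades a little bias (the fit no longer passes through the data exactly) for smaller, lower-variance weight estimates.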

Supervised learning. Data: D = {d_1, d_2, ..., d_n}, a set of n examples, where d_i = <x_i, y_i>; x_i is an input vector and y_i is the desired output given by a teacher. Objective: learn the mapping f: X -> Y such that y_i ≈ f(x_i) for all i = 1..n. Two types of problems: regression, where Y is continuous (examples: yearly product orders, company stock price), and classification, where Y is discrete (example: temperature, heart rate -> disease). Today: binary classification problems.

Binary classification. Two classes: Y = {0, 1}. Our goal is to learn to classify correctly two types of examples: Class 0, labeled as 0, and Class 1, labeled as 1. We would like to learn f: X -> {0, 1}. Zero-one error (loss) function: Error(x_i, y_i) = 1 if f(x_i) ≠ y_i, and 0 otherwise. We would like to minimize the expected error E(Error). First step: we need to devise a model of the function f.

Discriminant functions. One convenient way to represent classifiers is through discriminant functions. This works for binary and multi-way classification. Idea: for every class i = 0, 1, ..., k, define a function g_i(x) mapping X -> R. When the decision on input x should be made, choose the class with the highest value of g_i(x). So what happens with the input space? Assume a binary case.

[Figures: several slides plotting two discriminant functions over a 2-D input space; the plots are not recoverable from the transcript.]

Discriminant functions define the decision boundary.

[Figure: a quadratic decision boundary separating two classes in 2-D.]
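As a minimal sketch of the choose-the-largest rule (the weights below are invented for illustration, not taken from the lecture):

```python
def g(w, x):
    """Linear discriminant g(x) = w0 + w1*x1 + ... (bias weight first)."""
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))

# Hypothetical discriminant weights for a two-class problem.
w_class0 = [0.5, -1.0, 0.0]   # g_0 scores high for small x1
w_class1 = [-0.5, 1.0, 0.0]   # g_1 scores high for large x1

def classify(x):
    # Decision rule: choose the class whose discriminant has the highest value.
    return 1 if g(w_class1, x) > g(w_class0, x) else 0
```

With these weights the two discriminants are equal along x1 = 0.5, so that line is the decision boundary.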

Logistic regression model. Defines a linear decision boundary. Discriminant functions: g_1(x) = g(w^T x) and g_0(x) = 1 − g(w^T x), where g(z) = 1/(1 + e^{−z}) is a logistic function. Input vector: x = (1, x_1, ..., x_d).

Logistic function. g(z) = 1/(1 + e^{−z}), also referred to as a sigmoid function. It replaces the threshold function with a smooth switch: it takes a real number and outputs a number in the interval [0, 1].

[Figure: plot of the logistic (sigmoid) function.]
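A direct transcription of the logistic function, with a few evaluations showing the smooth-switch behavior:

```python
import math

def sigmoid(z):
    """Logistic (sigmoid) function: maps any real z into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

mid = sigmoid(0.0)    # exactly 0.5 at z = 0
hi = sigmoid(10.0)    # close to 1 for large positive z
lo = sigmoid(-10.0)   # close to 0 for large negative z
```

Unlike a hard threshold, the output changes gradually around z = 0, and the symmetry g(−z) = 1 − g(z) holds.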

Logistic regression model. Discriminant functions: g_1(x) = g(z) and g_0(x) = 1 − g(z), where z = w^T x and g(z) = 1/(1 + e^{−z}) is a logistic function. Values of the discriminant functions vary in [0, 1]. Probabilistic interpretation: f(x, w) = p(y = 1 | x, w) = g_1(x).

Logistic regression. Instead of learning the mapping to discrete values, f: X -> {0, 1}, we learn a probabilistic function, f: X -> [0, 1], where f describes the probability of class 1 given x: f(x, w) = p(y = 1 | x, w). Note that p(y = 0 | x, w) = 1 − p(y = 1 | x, w). Transformation to discrete class values: if p(y = 1 | x) ≥ 1/2 then choose 1, else choose 0.

Linear decision boundary. The logistic regression model defines a linear decision boundary. Why? Answer: compare the two discriminant functions. Decision boundary: g_1(x) = g_0(x). For the boundary it must hold:

log [ g_1(x) / g_0(x) ] = log [ (1/(1 + exp(−w^T x))) / (exp(−w^T x)/(1 + exp(−w^T x))) ] = log exp(w^T x) = w^T x = 0.

So the boundary is the set of points where w^T x = 0, which is linear in x.

Logistic regression model. Decision boundary. LR defines a linear decision boundary. Example: two classes (blue and red points).

[Figure: a linear decision boundary separating blue and red points in 2-D.]
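A quick numerical check of this fact (the weight vector is made up for illustration): points with w^T x = 0 get probability exactly 1/2, and the predicted class flips as x crosses that line.

```python
import math

# Assumed weights (bias first): w^T [1, x] = -1.5 + x, so the boundary is x = 1.5.
w = [-1.5, 1.0]

def p1(x):
    """p(y = 1 | x, w) = g(w^T [1, x]) for a 1-D input x."""
    z = w[0] + w[1] * x
    return 1.0 / (1.0 + math.exp(-z))
```

On the boundary p1(1.5) = 0.5; to its right the model prefers class 1, to its left class 0.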

Likelihood of outputs. Let μ_i = p(y_i = 1 | x_i, w) = g(z_i), where z_i = w^T x_i. Then the likelihood of the outputs is

L(D, w) = Π_{i=1..n} p(y_i | x_i, w) = Π_{i=1..n} μ_i^{y_i} (1 − μ_i)^{1 − y_i}.

Find the weights w that maximize the likelihood of outputs. Apply the log-likelihood trick: the optimal weights are the same for both the likelihood and the log-likelihood.

l(D, w) = log Π_{i=1..n} μ_i^{y_i} (1 − μ_i)^{1 − y_i} = Σ_{i=1..n} [ y_i log μ_i + (1 − y_i) log(1 − μ_i) ].

Logistic regression: parameter learning. Derivatives of the log-likelihood:

∂l(D, w)/∂w_j = Σ_{i=1..n} x_{i,j} (y_i − f(x_i, w)).

Gradient descent (on the negative log-likelihood): w <- w + α Σ_{i=1..n} [ y_i − f(x_i, w) ] x_i. Note: the log-likelihood is nonlinear in the weights!
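The gradient expression can be turned directly into a small batch learner. The data set below is a made-up, linearly separable 1-D problem (bias column prepended), not from the lecture, and the step size and iteration count are arbitrary choices:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, alpha=0.1, iters=2000):
    """Batch gradient ascent on the log-likelihood:
    each step adds alpha * sum_i x_i (y_i - g(w^T x_i))."""
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        w += alpha * (X.T @ (y - sigmoid(X @ w)))
    return w

# Made-up separable data: class 1 iff x > 1.5; first column is the bias input.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])

w = fit_logistic(X, y)
preds = (sigmoid(X @ w) >= 0.5).astype(float)
```

Because the objective is nonlinear in w, there is no closed-form solution as in least squares; iterative updates like this are the standard approach.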

Logistic regression. Online gradient descent. On-line component of the log-likelihood:

J_online(D_i, w) = y_i log μ_i + (1 − y_i) log(1 − μ_i).

On-line learning update for the weights: w <- w + α ∇_w J_online(D_i, w). The i-th update for logistic regression, with D_i = <x_i, y_i>, is:

w <- w + α [ y_i − f(x_i, w) ] x_i.

Online logistic regression algorithm:

Online-logistic-regression (D, K)   ; K = number of iterations
  initialize weights w
  for i = 1 to K do
    select a data point D_i = <x_i, y_i> from D
    set α = 1/i
    update weights (in parallel): w <- w + α [ y_i − f(x_i, w) ] x_i
  end for
  return weights w
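A sketch of the algorithm above. The data are invented, and for reproducibility the points are visited cyclically rather than "selected from D" at random, which is a small deviation from the pseudocode:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def online_logistic_regression(data, K):
    """Online updates with annealed learning rate alpha = 1/i.
    data is a list of (x, y) pairs; x includes the bias input 1 as x[0]."""
    d = len(data[0][0])
    w = [0.0] * d
    for i in range(1, K + 1):
        x, y = data[(i - 1) % len(data)]   # visit data points cyclically
        alpha = 1.0 / i                    # annealed learning rate
        err = y - sigmoid(sum(wj * xj for wj, xj in zip(w, x)))
        w = [wj + alpha * err * xj for wj, xj in zip(w, x)]  # parallel update
    return w

# Made-up separable 1-D data (bias input first): class 1 iff x > 1.5.
data = [([1.0, 0.0], 0.0), ([1.0, 1.0], 0.0),
        ([1.0, 2.0], 1.0), ([1.0, 3.0], 1.0)]
w = online_logistic_regression(data, K=4000)
```

The decaying α = 1/i makes early examples move the weights a lot while later examples only fine-tune them.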

Online algorithm. Example.

[Figures: three slides showing the online algorithm on example data; the plots are not recoverable from the transcript.]