Lecture 2 Linear Models


Last lecture

You have learned about what machine learning is:
- Supervised learning
- Unsupervised learning
- Reinforcement learning

You have seen an example learning problem and the general process that one goes through to design a learning system, which involves determining:
- Types of training experience
- Target function
- Representation of the learned function
- Learning algorithm

Supervised learning

Let's look at the problem of spam filtering. Some example spam emails:

"Now anyone can learn how to earn $00 - $943 per day or More! If you can type (hunt and peck is ok to start) and fill in forms, you can score big! So don't delay waiting around for the next opportunity... it is knocking now! Start here: http://redbluecruse.com//c/38/polohoo/z37957.html"

"Do you Have Poetry that you think should be worth $0,000.00 USD, we do!.. Enter our International Open contest and see if you have what it takes. To see details or to enter your own poem, Click link below. http://e-suscrber.com/ms?e=0saoo4q9s4zyyuoyq&m=79534&l=0"

"View my photos! I invite you to view the following photo album(s): zak-month7 Hey have you seen my new pics yet???? Me and my girlfriend would love it if you would come chat with us for a bit.. Well join us if you're interested. Join live web cam chat here: http://e-commcenral.com/ms?e=0saoo4q9s4zyyuoyq&m=8534&l=0"

Let's look at the design choices

Learning experience? Past emails and whether they are considered spam or not (you can also choose to use non-spam or spam emails only, but that will require different choices later on)
Target function? Email -> spam or not
Representation of the function??
Learning algorithm?

We'll focus mostly on these two aspects in this class. In some cases, you'll also need to pay attention to the first two questions.

Continue with the design choices

Representation of the function (email -> spam or not)? First of all, how to represent an email? Use bag-of-words to represent an email. This will turn an email into a collection of features, e.g., where each feature describes whether a particular word is present in the email. This gives us the standard supervised classification problem typically seen in text books and papers:
- Training set: a set of examples (instances, objects) with class labels, e.g., positive (spam) and negative (non-spam)
- Input representation: an example is described by a set of attributes (e.g., whether $ is present, etc.)
- Given an unseen email, and its input representation, predict its label

Next question: what function forms to use? (A small sketch of the featurization is shown below.)
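To make the bag-of-words idea concrete, here is a minimal sketch of a binary featurizer; the vocabulary and the whitespace tokenization are made-up assumptions for illustration, not part of the lecture.

```python
# Hypothetical vocabulary; a real system would build it from training emails.
VOCAB = ["$", "free", "click", "winner", "meeting"]

def featurize(email_text):
    """Binary bag-of-words: feature j is 1 if vocabulary word j appears."""
    words = set(email_text.lower().split())
    return [1 if w in words else 0 for w in VOCAB]

print(featurize("click now for free $ prizes"))  # -> [1, 1, 1, 0, 0]
```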

Linear Threshold Units (McCulloch & Pitts 1943)

y = 1 if Σ_{j=1}^{n} w_j x_j > 0, and y = -1 otherwise

Assume each feature x_j and weight w_j is a real number. The LTU computes w · x and takes the threshold to produce the prediction.

Why a linear model? Simplest model, fewer parameters to learn. Visually intuitive: drawing a straight line to separate positive from negative.
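A minimal sketch of the LTU prediction, assuming the ±1 output convention used in the rest of the lecture:

```python
def ltu_predict(w, x):
    """Linear threshold unit: +1 if w . x > 0, else -1."""
    activation = sum(wj * xj for wj, xj in zip(w, x))
    return 1 if activation > 0 else -1
```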

Geometric view

[Figure: positive (+) and negative (-) points in the plane, separated by the line w · x = 0; the weight vector w = (w1, w2) points to the positive side. The line w · x = 0 is referred to as the decision boundary.]

A Canonical Representation

Given a training example (<x_1, ..., x_m>, y), transform it to (<1, x_1, ..., x_m>, y). The parameter vector will then be w = <w_0, w_1, ..., w_m>. Given a training set, we need to learn

g(x, w) = w_0 + w_1 x_1 + ... + w_m x_m = w · x

or, equivalently, h(x, w) = sign(g(x, w)).

Dot (or inner) product: takes two equal-length vectors, and returns the sum of their component-wise products.

To differentiate the learned function from the true underlying function, it is common to refer to the learned function as a hypothesis (each unique set of parameter values is one hypothesis).

A prediction is correct if y g(x, w) > 0 (or y h(x, w) > 0).
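A sketch of the canonical representation, folding the bias w_0 into the weight vector by prepending a constant feature:

```python
def augment(x):
    """Prepend the constant feature x_0 = 1 so the bias folds into w."""
    return [1.0] + list(x)

def g(w, x):
    """g(x, w) = w . x, with w = [w0, ..., wm] and x = [1, x1, ..., xm]."""
    return sum(wj * xj for wj, xj in zip(w, x))
```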

Geometrically, using the canonical representation translates to two things: 1. it will increase the input space dimension by 1, and 2. the decision boundary now always passes through the origin.

Geometric view

[Figure: the geometric view in the augmented space, where the decision boundary passes through the origin.]

How to learn: the perceptron algorithm

The equation w_0 + w_1 x_1 + ... + w_m x_m = 0 defines a linear decision boundary that separates the input space into different decision regions. The goal of learning is to find a weight vector w such that its decision boundary correctly separates positive examples from negative examples. How can we achieve this?

Perceptron is one approach. It starts with some weight vector and incrementally updates it when it makes a mistake. Let w be the current weight vector, and suppose it makes a mistake on example <x, y>, that is to say, y(w · x) < 0. The perceptron update rule is: w ← w + y x

Perceptron Algorithm

Let w ← (0, 0, 0, ..., 0)
Repeat:
  Accept training example i: (x_i, y_i)
  u ← w · x_i
  if y_i u <= 0: w ← w + y_i x_i
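A runnable sketch of this online perceptron; the fixed number of passes (epochs) is an assumption added here so the loop terminates even on data that is not linearly separable.

```python
def perceptron_online(examples, epochs=10):
    """examples: list of (x, y) pairs, y in {-1, +1}, with x already
    augmented with the constant feature x_0 = 1."""
    w = [0.0] * len(examples[0][0])               # w <- (0, 0, ..., 0)
    for _ in range(epochs):
        for x, y in examples:
            u = sum(wj * xj for wj, xj in zip(w, x))
            if y * u <= 0:                        # mistake
                w = [wj + y * xj for wj, xj in zip(w, x)]  # w <- w + y x
    return w
```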

Effect of the Perceptron Updating Rule

Mathematically speaking:

y(w_new · x) = y((w + y x) · x) = y(w · x) + y² (x · x) > y(w · x)

The updating rule makes y(w · x) more positive, and thus can potentially correct the mistake.

Geometrically: [Figure: the weight vector before and after an update step, rotated toward the misclassified example.]
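A quick numeric check of this identity on toy numbers (the vectors here are arbitrary illustrations):

```python
w = [0.5, -1.0]
x = [1.0, 2.0]
y = 1
before = y * sum(wj * xj for wj, xj in zip(w, x))   # -1.5
w = [wj + y * xj for wj, xj in zip(w, x)]           # update: w <- w + y x
after = y * sum(wj * xj for wj, xj in zip(w, x))    # 3.5
assert after == before + sum(xj * xj for xj in x)   # increase is ||x||^2 = 5
```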

Online vs Batch

We call the above perceptron algorithm an online algorithm. Online algorithms perform learning each time they receive a training example. In contrast, batch learning algorithms collect a batch of training examples and learn from them all at once.

Batch Perceptron Algorithm

Given: training examples (x_i, y_i), i = 1, ..., N
Let w ← (0, 0, 0, ..., 0)
do
  delta ← (0, 0, 0, ..., 0)
  for i = 1 to N do
    u ← w · x_i
    if y_i u <= 0 then delta ← delta + y_i x_i
  delta ← delta / N
  w ← w + η delta
until ||delta|| < ε
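A sketch of the batch version; note that, as the "bad news" slide below points out, this loop only terminates when the averaged update shrinks below ε, which is not guaranteed on non-separable data.

```python
def perceptron_batch(examples, eta=1.0, eps=1e-6):
    """Batch perceptron: average the updates over all N examples,
    then take one step of size eta; stop when the step is tiny."""
    n = len(examples[0][0])
    w = [0.0] * n
    while True:
        delta = [0.0] * n
        for x, y in examples:
            u = sum(wj * xj for wj, xj in zip(w, x))
            if y * u <= 0:
                delta = [dj + y * xj for dj, xj in zip(delta, x)]
        delta = [dj / len(examples) for dj in delta]
        w = [wj + eta * dj for wj, dj in zip(w, delta)]
        if sum(dj * dj for dj in delta) ** 0.5 < eps:
            return w
```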

Good news

If there is a linear decision boundary that correctly classifies all training examples, this algorithm will find it. Formally speaking, this is the convergence property: for linearly separable data (i.e., there exists a linear decision boundary that perfectly separates positive and negative training examples), the perceptron algorithm converges in a finite number of steps.

Why? If you are mathematically curious, read the following slide, you will find the answer. And how many steps? If you are practically curious, read the following slide, the answer is in there too. The further good news is that you are not required to master this material; it is just for the curious ones.

Proof

Let w* be a solution vector and w_t be our weight vector at step t. Assume that w* classifies all examples with a margin γ, i.e., y_i (w* · x_i) ≥ γ for all examples i, and assume that the inputs are bounded: ||x_i|| ≤ D for all i.

To show convergence, we just need to show that each update moves the weight vector closer to a solution vector by a lower-bounded amount. Consider cos(w*, w_t) = (w* · w_t) / (||w*|| ||w_t||).

For the numerator: each mistake on (x, y) gives w_{t+1} = w_t + y x, so w* · w_{t+1} = w* · w_t + y (w* · x) ≥ w* · w_t + γ; after t updates, w* · w_t ≥ t γ.

For the denominator: ||w_{t+1}||² = ||w_t||² + 2 y (w_t · x) + ||x||² ≤ ||w_t||² + D², since an update happens only when y (w_t · x) ≤ 0; after t updates, ||w_t||² ≤ t D².

Therefore cos(w*, w_t) ≥ t γ / (||w*|| √t D) = √t γ / (||w*|| D). Since a cosine is at most 1, the number of updates satisfies t ≤ D² ||w*||² / γ².

Margin

γ is referred to as the margin. The bigger the margin, the easier the classification problem is, and the perceptron algorithm will likely find the solution faster!

Side story: the bigger the margin, the more confident we are about our prediction, which makes it desirable to find the boundary that gives the maximum margin. Later in the course this concept will be core to one of the most exciting recent developments in the ML field: support vector machines.

Bad news

What about non-linearly separable cases? In such cases the algorithm will never stop! How to fix this? One possible solution: look for the decision boundary that makes as few mistakes as possible. But that is NP-hard (refresh your 35 memory!).

One pass of the perceptron over the N training examples:

Let w ← (0, 0, 0, ..., 0)
for i = 1, ..., N:
  Take training example i: (x_i, y_i)
  u ← w · x_i
  if y_i u < 0: w ← w + y_i x_i

Voted Perceptron

Let n ← 0, c_0 ← 0, w_0 ← (0, 0, 0, ..., 0)
repeat:
  Take example i: (x_i, y_i)
  u ← w_n · x_i
  if y_i u <= 0:
    w_{n+1} ← w_n + y_i x_i
    c_{n+1} ← 0
    n ← n + 1
  else:
    c_n ← c_n + 1

Store a collection of linear separators w_0, w_1, ..., along with their survival times c_0, c_1, ... The c's can be good measures of the reliability of the w's. For classification, take a weighted vote among all separators:

y = sign( Σ_n c_n sign(w_n · x) )
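To make the voted perceptron concrete, here is a runnable sketch; the fixed number of epochs is an assumption added so that training terminates.

```python
def voted_perceptron_train(examples, epochs=10):
    """Keep every weight vector w_n together with its survival count c_n."""
    n = len(examples[0][0])
    w, c = [0.0] * n, 0
    separators = []                                  # (w_n, c_n) pairs
    for _ in range(epochs):
        for x, y in examples:
            u = sum(wj * xj for wj, xj in zip(w, x))
            if y * u <= 0:                           # mistake: retire w_n
                separators.append((w, c))
                w = [wj + y * xj for wj, xj in zip(w, x)]
                c = 0
            else:
                c += 1                               # w_n survives one more
    separators.append((w, c))
    return separators

def voted_predict(separators, x):
    """Weighted vote: sign( sum_n c_n * sign(w_n . x) )."""
    vote = sum(c * (1 if sum(wj * xj for wj, xj in zip(w, x)) > 0 else -1)
               for w, c in separators)
    return 1 if vote > 0 else -1
```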