Machine Learning. 7. Logistic and Linear Regression
|
|
- Agatha Edwards
- 6 years ago
- Views:
Transcription
1 Sapienza University of Rome, Italy - Machine Learning (27/28) University of Rome La Sapienza Master in Artificial Intelligence and Robotics Machine Learning 7. Logistic and Linear Regression Luca Iocchi, Valsamis Ntouskos Luca Iocchi, Valsamis Ntouskos 7. Logistic and Linear Regression / 29
2 Sapienza University of Rome, Italy - Machine Learning (27/28) Probabilistic Generative Models Consider first the case of two classes. Find the conditional probability: with: p(x C )P(C ) P(C x) = p(x C )P(C ) + p(x C 2 )P(C 2 ) = + exp( α) = σ(α). α = ln p(x C )P(C ) p(x C 2 )P(C 2 ) and σ(α) = +exp( α) the sigmoid function. Luca Iocchi, Valsamis Ntouskos 7. Logistic and Linear Regression 2 / 29
3 Sapienza University of Rome, Italy - Machine Learning (27/28) Probabilistic Generative Models Assume p(x C i ) = N (x µ i, Σ) - same covariance matrix we get: with: P(C x) = σ(w T x + w ), w = Σ (µ µ 2 ), w = 2 µt Σ µ + 2 µt 2 Σ µ 2 + ln P(C ) P(C 2 ). note that: P(C 2 x) = P(C x) µ P(C {}}{{}}{ ) {}}{ Nr. of parameters: M(M + 5)/2 + = 2 M + M(M + )/2 +. Σ Luca Iocchi, Valsamis Ntouskos 7. Logistic and Linear Regression 3 / 29
4 Sapienza University of Rome, Italy - Machine Learning (27/28) Probabilistic Generative Models Luca Iocchi, Valsamis Ntouskos 7. Logistic and Linear Regression 4 / 29
5 Sapienza University of Rome, Italy - Machine Learning (27/28) Logistic regression - Model P(C x) = σ(w T x + w ), Decision rule: c = C P(c = C x) >.5 Nr. of parameters: M. Luca Iocchi, Valsamis Ntouskos 7. Logistic and Linear Regression 5 / 29
6 Sapienza University of Rome, Italy - Machine Learning (27/28) Varying the parameters w in 2D Logistic regression - Model W = ( 2, 3 ) x x 2 W = ( 2, ) x x 2 w 2.5 W = (, 4 ).5 x x 2 W = (, 2 ) W = ( 2, 2 ).5 x x 2 x x 2 W = (, ) W = ( 3, ).5.5 x x 2 x x 2 W = ( 2, 2 ) W = ( 5, 4 ) x x 2 W = ( 5, ) x x 2 w 2 x x Luca Iocchi, Valsamis Ntouskos 7. Logistic and Linear Regression 6 / 29
7 Sapienza University of Rome, Italy - Machine Learning (27/28) Logistic regression - Algorithms How parameters w are computed given a dataset D? We will see this in a while... Luca Iocchi, Valsamis Ntouskos 7. Logistic and Linear Regression 7 / 29
8 Sapienza University of Rome, Italy - Machine Learning (27/28) Linear Regression Goal: Estimate the value t of a continuous function at x based on a dataset D composed of N observations {x n }, where n =,..., N, together with the corresponding target values {t n }. Ideally: t = y(x, w) Luca Iocchi, Valsamis Ntouskos 7. Logistic and Linear Regression 8 / 29
9 Sapienza University of Rome, Italy - Machine Learning (27/28) Linear Regression - Model Linear Basis Function Models Simplest case: y(x, w) = w + w x w D x D = w T x w with x =. and w =. x D w D Linear both in model parameters w and variables x. Too limiting! Luca Iocchi, Valsamis Ntouskos 7. Logistic and Linear Regression 9 / 29
10 Sapienza University of Rome, Italy - Machine Learning (27/28) Example - Line fitting 5 y = w x + w 4 prediction truth Luca Iocchi, Valsamis Ntouskos 7. Logistic and Linear Regression / 29
11 Sapienza University of Rome, Italy - Machine Learning (27/28) Linear Regression - Model Linear Basis Function Models Using nonlinear functions of input variables: y(x, w) = M j= w j φ j (x) = w T φ, with φ (x) = and φ = φ. φ M Still linear in the parameters w! Luca Iocchi, Valsamis Ntouskos 7. Logistic and Linear Regression / 29
12 Sapienza University of Rome, Italy - Machine Learning (27/28) Example - Polynomial curve fitting y = w + w x + w 2 x w M x M = M w j x j j= M = M = t t x x M = 3 M = 9 t t x x Luca Iocchi, Valsamis Ntouskos 7. Logistic and Linear Regression 2 / 29
13 Sapienza University of Rome, Italy - Machine Learning (27/28) Linear Regression - Model Examples of basis functions Polynomial Radial Sigmoid / Tanh Luca Iocchi, Valsamis Ntouskos 7. Logistic and Linear Regression 3 / 29
14 Sapienza University of Rome, Italy - Machine Learning (27/28) Linear Regression - Algorithms Maximum likelihood and least squares Target value t is given by y(x, w) affected by additive noise t = y(x, w) + ɛ Assume Gaussian noise p(ɛ β) = N (ɛ, β ), with β the precision (inverse variance). We have: p(t x, w, β) = N (t y(x, w), β ) Luca Iocchi, Valsamis Ntouskos 7. Logistic and Linear Regression 4 / 29
15 Sapienza University of Rome, Italy - Machine Learning (27/28) Linear Regression - Algorithms Assume observations independent and identically distributed (i.i.d.) We seek the maximum of the likelihood function: p({t,..., t N } x, w, β) = or equivalently minimize: ln p({t,..., t N } w, β) = N N (t n w T φ(x n ), β ). n= N ln N (t n w T φ(x n ), β ) n= = β N [t n w T φ(x n )] 2 + N 2 2 ln(2πβ ). n= }{{} E D (w) Luca Iocchi, Valsamis Ntouskos 7. Logistic and Linear Regression 5 / 29
16 Sapienza University of Rome, Italy - Machine Learning (27/28) Note: with t = t. Linear Regression - Algorithms t N E D = 2 (t Φw)T (t Φw), and Φ = φ (x ) φ (x ) φ M (x ) φ (x 2 ) φ (x 2 ) φ M (x 2 ) φ (x N ) φ (x N ) φ M (x N ) Optimality condition: E D = Φ T Φw = Φ T t. Hence: w ML = (Φ T Φ) Φ T t. }{{} Φ : pseudo-inverse Luca Iocchi, Valsamis Ntouskos 7. Logistic and Linear Regression 6 / 29
17 Sapienza University of Rome, Italy - Machine Learning (27/28) Linear Regression - Algorithms Sequential Learning Stochastic gradient descent algorithm: w (τ+) = w (τ) η E n, with τ the iteration number and η the learning rate parameter. Therefore: w (τ+) = w (τ) + η(t n (w (τ) ) T φ(x n ))φ(x n ). Algorithm converges for suitable values of η. Luca Iocchi, Valsamis Ntouskos 7. Logistic and Linear Regression 7 / 29
18 Sapienza University of Rome, Italy - Machine Learning (27/28) Linear Regression - Algorithms Regularization Why regularize? To control over-fitting. Minimize: E D (w) + λe W (w). A common choice: Other choices: E W (w) = 2 wt w. M E W (w) = w j q. j= Luca Iocchi, Valsamis Ntouskos 7. Logistic and Linear Regression 8 / 29
19 Sapienza University of Rome, Italy - Machine Learning (27/28) Linear Regression - Algorithms Regularization w 2 w 2 w w w w q = 2 q = Luca Iocchi, Valsamis Ntouskos 7. Logistic and Linear Regression 9 / 29
20 Sapienza University of Rome, Italy - Machine Learning (27/28) Linear Regression - Algorithms Regularization E W (w) = 2 wt w. ln λ = 8 ln λ = t t x x Luca Iocchi, Valsamis Ntouskos 7. Logistic and Linear Regression 2 / 29
21 Sapienza University of Rome, Italy - Machine Learning (27/28) Linear Regression - Algorithms Multiple outputs y(x, W) = W T φ(x). Target variable is given by: T = y(x, W) + ɛ, with p(ɛ β) = N (ɛ, β I). Similarly with before we obtain: W ML = (Φ T Φ) Φ T T. Luca Iocchi, Valsamis Ntouskos 7. Logistic and Linear Regression 2 / 29
22 Sapienza University of Rome, Italy - Machine Learning (27/28) Logistic regression - Algorithms Time to see how parameters of the logistic regression model are estimated. Remember that logistic regression is used for classification! In the following we will consider the logistic regression problem: P(C φ) = y(φ) = σ(w T φ), with σ( ) the logistic sigmoid function. Luca Iocchi, Valsamis Ntouskos 7. Logistic and Linear Regression 22 / 29
23 Sapienza University of Rome, Italy - Machine Learning (27/28) Logistic regression - Algorithms Given data set {φ n, t n }, t n {, } and φ n = φ(x n ), n =,..., N Likelihood function: P(t w) = N n= with y n = P(C φ n ) = σ(w T φ n ). y tn n ( y n ) tn ( ) (*) Hint: P(c = φ n ) = y n and P(c = φ n ) = y n. Luca Iocchi, Valsamis Ntouskos 7. Logistic and Linear Regression 23 / 29
24 Sapienza University of Rome, Italy - Machine Learning (27/28) Logistic regression - Algorithms Cross-entropy error function N E(w) = ln P(t w) = [t n ln y n + ( t n ) ln( y n )]. n= Gradient of the error with respect to w using dσ dα = σ( σ) N E(w) = (y n t n )φ n, n= Luca Iocchi, Valsamis Ntouskos 7. Logistic and Linear Regression 24 / 29
25 Sapienza University of Rome, Italy - Machine Learning (27/28) Iterative reweighted least squares Apply Newton-Raphson iterative optimization technique for minimizing E(w). w w H E(w) H = E(w) is the Hessian matrix of E(w). Luca Iocchi, Valsamis Ntouskos 7. Logistic and Linear Regression 25 / 29
26 Sapienza University of Rome, Italy - Machine Learning (27/28) Iterative reweighted least squares E(w) = Φ T (y t) H = Φ T RΦ with t = (t,..., t n ) T, y = (y,..., y n ) T R diagonal matrix with R n,n = y n ( y n ) φ T Φ =... φ T N Luca Iocchi, Valsamis Ntouskos 7. Logistic and Linear Regression 26 / 29
27 Sapienza University of Rome, Italy - Machine Learning (27/28) Iterative reweighted least squares Iterative method w w (Φ T RΦ) Φ T (y t) Luca Iocchi, Valsamis Ntouskos 7. Logistic and Linear Regression 27 / 29
28 Sapienza University of Rome, Italy - Machine Learning (27/28) Multiclass logistic regression with a k = w T k φ. P(C k φ) = y k (φ) = exp(a k) j exp(a j) }{{} softmax Discriminative model P(T w,... w K ) = N n= k= with y nk = y k (φ n ) and T n,k = t nk. K P(C k φ n ) t nk = N K n= k= y t nk nk Luca Iocchi, Valsamis Ntouskos 7. Logistic and Linear Regression 28 / 29
29 Sapienza University of Rome, Italy - Machine Learning (27/28) Multiclass logistic regression Cross-entropy error function N K E(w,... w K ) = ln P(T w,... w K ) = t nk ln y nk Gradient n= k= Hessian matrix N wj E(w,... w K ) = (y nj t nj )φ n n= N wk wj E(w,... w K ) = y nk (I kj y nj )φ n φ T n n= Luca Iocchi, Valsamis Ntouskos 7. Logistic and Linear Regression 29 / 29
Multiclass Logistic Regression
Multiclass Logistic Regression Sargur. Srihari University at Buffalo, State University of ew York USA Machine Learning Srihari Topics in Linear Classification using Probabilistic Discriminative Models
More informationIterative Reweighted Least Squares
Iterative Reweighted Least Squares Sargur. University at Buffalo, State University of ew York USA Topics in Linear Classification using Probabilistic Discriminative Models Generative vs Discriminative
More informationApril 9, Depto. de Ing. de Sistemas e Industrial Universidad Nacional de Colombia, Bogotá. Linear Classification Models. Fabio A. González Ph.D.
Depto. de Ing. de Sistemas e Industrial Universidad Nacional de Colombia, Bogotá April 9, 2018 Content 1 2 3 4 Outline 1 2 3 4 problems { C 1, y(x) threshold predict(x) = C 2, y(x) < threshold, with threshold
More informationLinear Classification. CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington
Linear Classification CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington 1 Example of Linear Classification Red points: patterns belonging
More informationReading Group on Deep Learning Session 1
Reading Group on Deep Learning Session 1 Stephane Lathuiliere & Pablo Mesejo 2 June 2016 1/31 Contents Introduction to Artificial Neural Networks to understand, and to be able to efficiently use, the popular
More informationLinear Models for Classification
Linear Models for Classification Oliver Schulte - CMPT 726 Bishop PRML Ch. 4 Classification: Hand-written Digit Recognition CHINE INTELLIGENCE, VOL. 24, NO. 24, APRIL 2002 x i = t i = (0, 0, 0, 1, 0, 0,
More informationLINEAR MODELS FOR CLASSIFICATION. J. Elder CSE 6390/PSYC 6225 Computational Modeling of Visual Perception
LINEAR MODELS FOR CLASSIFICATION Classification: Problem Statement 2 In regression, we are modeling the relationship between a continuous input variable x and a continuous target variable t. In classification,
More informationLogistic Regression. Sargur N. Srihari. University at Buffalo, State University of New York USA
Logistic Regression Sargur N. University at Buffalo, State University of New York USA Topics in Linear Classification using Probabilistic Discriminative Models Generative vs Discriminative 1. Fixed basis
More informationNeural Network Training
Neural Network Training Sargur Srihari Topics in Network Training 0. Neural network parameters Probabilistic problem formulation Specifying the activation and error functions for Regression Binary classification
More informationCh 4. Linear Models for Classification
Ch 4. Linear Models for Classification Pattern Recognition and Machine Learning, C. M. Bishop, 2006. Department of Computer Science and Engineering Pohang University of Science and echnology 77 Cheongam-ro,
More informationCheng Soon Ong & Christian Walder. Canberra February June 2018
Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2018 (Many figures from C. M. Bishop, "Pattern Recognition and ") 1of 254 Part V
More informationCheng Soon Ong & Christian Walder. Canberra February June 2018
Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2018 (Many figures from C. M. Bishop, "Pattern Recognition and ") 1of 305 Part VII
More informationLast updated: Oct 22, 2012 LINEAR CLASSIFIERS. J. Elder CSE 4404/5327 Introduction to Machine Learning and Pattern Recognition
Last updated: Oct 22, 2012 LINEAR CLASSIFIERS Problems 2 Please do Problem 8.3 in the textbook. We will discuss this in class. Classification: Problem Statement 3 In regression, we are modeling the relationship
More informationBayesian Logistic Regression
Bayesian Logistic Regression Sargur N. University at Buffalo, State University of New York USA Topics in Linear Models for Classification Overview 1. Discriminant Functions 2. Probabilistic Generative
More informationLecture 5: Linear models for classification. Logistic regression. Gradient Descent. Second-order methods.
Lecture 5: Linear models for classification. Logistic regression. Gradient Descent. Second-order methods. Linear models for classification Logistic regression Gradient descent and second-order methods
More informationMachine Learning Lecture 5
Machine Learning Lecture 5 Linear Discriminant Functions 26.10.2017 Bastian Leibe RWTH Aachen http://www.vision.rwth-aachen.de leibe@vision.rwth-aachen.de Course Outline Fundamentals Bayes Decision Theory
More informationMachine Learning Lecture 7
Course Outline Machine Learning Lecture 7 Fundamentals (2 weeks) Bayes Decision Theory Probability Density Estimation Statistical Learning Theory 23.05.2016 Discriminative Approaches (5 weeks) Linear Discriminant
More informationThese slides follow closely the (English) course textbook Pattern Recognition and Machine Learning by Christopher Bishop
Music and Machine Learning (IFT68 Winter 8) Prof. Douglas Eck, Université de Montréal These slides follow closely the (English) course textbook Pattern Recognition and Machine Learning by Christopher Bishop
More informationLinear Models for Classification
Catherine Lee Anderson figures courtesy of Christopher M. Bishop Department of Computer Science University of Nebraska at Lincoln CSCE 970: Pattern Recognition and Machine Learning Congradulations!!!!
More informationMachine Learning Basics Lecture 7: Multiclass Classification. Princeton University COS 495 Instructor: Yingyu Liang
Machine Learning Basics Lecture 7: Multiclass Classification Princeton University COS 495 Instructor: Yingyu Liang Example: image classification indoor Indoor outdoor Example: image classification (multiclass)
More informationLinear Models for Regression
Linear Models for Regression CSE 4309 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington 1 The Regression Problem Training data: A set of input-output
More informationLinear Models for Regression. Sargur Srihari
Linear Models for Regression Sargur srihari@cedar.buffalo.edu 1 Topics in Linear Regression What is regression? Polynomial Curve Fitting with Scalar input Linear Basis Function Models Maximum Likelihood
More informationVasil Khalidov & Miles Hansard. C.M. Bishop s PRML: Chapter 5; Neural Networks
C.M. Bishop s PRML: Chapter 5; Neural Networks Introduction The aim is, as before, to find useful decompositions of the target variable; t(x) = y(x, w) + ɛ(x) (3.7) t(x n ) and x n are the observations,
More informationCSC 411: Lecture 04: Logistic Regression
CSC 411: Lecture 04: Logistic Regression Raquel Urtasun & Rich Zemel University of Toronto Sep 23, 2015 Urtasun & Zemel (UofT) CSC 411: 04-Prob Classif Sep 23, 2015 1 / 16 Today Key Concepts: Logistic
More informationNeural Networks. Bishop PRML Ch. 5. Alireza Ghane. Feed-forward Networks Network Training Error Backpropagation Applications
Neural Networks Bishop PRML Ch. 5 Alireza Ghane Neural Networks Alireza Ghane / Greg Mori 1 Neural Networks Neural networks arise from attempts to model human/animal brains Many models, many claims of
More informationLinear and logistic regression
Linear and logistic regression Guillaume Obozinski Ecole des Ponts - ParisTech Master MVA Linear and logistic regression 1/22 Outline 1 Linear regression 2 Logistic regression 3 Fisher discriminant analysis
More informationOutline Lecture 2 2(32)
Outline Lecture (3), Lecture Linear Regression and Classification it is our firm belief that an understanding of linear models is essential for understanding nonlinear ones Thomas Schön Division of Automatic
More informationFeed-forward Networks Network Training Error Backpropagation Applications. Neural Networks. Oliver Schulte - CMPT 726. Bishop PRML Ch.
Neural Networks Oliver Schulte - CMPT 726 Bishop PRML Ch. 5 Neural Networks Neural networks arise from attempts to model human/animal brains Many models, many claims of biological plausibility We will
More informationStochastic gradient descent; Classification
Stochastic gradient descent; Classification Steve Renals Machine Learning Practical MLP Lecture 2 28 September 2016 MLP Lecture 2 Stochastic gradient descent; Classification 1 Single Layer Networks MLP
More informationClassification. Sandro Cumani. Politecnico di Torino
Politecnico di Torino Outline Generative model: Gaussian classifier (Linear) discriminative model: logistic regression (Non linear) discriminative model: neural networks Gaussian Classifier We want to
More informationLinear & nonlinear classifiers
Linear & nonlinear classifiers Machine Learning Hamid Beigy Sharif University of Technology Fall 1394 Hamid Beigy (Sharif University of Technology) Linear & nonlinear classifiers Fall 1394 1 / 34 Table
More informationOverfitting, Bias / Variance Analysis
Overfitting, Bias / Variance Analysis Professor Ameet Talwalkar Professor Ameet Talwalkar CS260 Machine Learning Algorithms February 8, 207 / 40 Outline Administration 2 Review of last lecture 3 Basic
More informationProbabilistic classification CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2016
Probabilistic classification CE-717: Machine Learning Sharif University of Technology M. Soleymani Fall 2016 Topics Probabilistic approach Bayes decision theory Generative models Gaussian Bayes classifier
More informationLecture 4: Types of errors. Bayesian regression models. Logistic regression
Lecture 4: Types of errors. Bayesian regression models. Logistic regression A Bayesian interpretation of regularization Bayesian vs maximum likelihood fitting more generally COMP-652 and ECSE-68, Lecture
More informationGaussian and Linear Discriminant Analysis; Multiclass Classification
Gaussian and Linear Discriminant Analysis; Multiclass Classification Professor Ameet Talwalkar Slide Credit: Professor Fei Sha Professor Ameet Talwalkar CS260 Machine Learning Algorithms October 13, 2015
More informationPartially Directed Graphs and Conditional Random Fields. Sargur Srihari
Partially Directed Graphs and Conditional Random Fields Sargur srihari@cedar.buffalo.edu 1 Topics Conditional Random Fields Gibbs distribution and CRF Directed and Undirected Independencies View as combination
More informationPattern Recognition and Machine Learning. Bishop Chapter 6: Kernel Methods
Pattern Recognition and Machine Learning Chapter 6: Kernel Methods Vasil Khalidov Alex Kläser December 13, 2007 Training Data: Keep or Discard? Parametric methods (linear/nonlinear) so far: learn parameter
More informationLearning from Data: Regression
November 3, 2005 http://www.anc.ed.ac.uk/ amos/lfd/ Classification or Regression? Classification: want to learn a discrete target variable. Regression: want to learn a continuous target variable. Linear
More informationLinear Models in Machine Learning
CS540 Intro to AI Linear Models in Machine Learning Lecturer: Xiaojin Zhu jerryzhu@cs.wisc.edu We briefly go over two linear models frequently used in machine learning: linear regression for, well, regression,
More informationNeural Networks, Computation Graphs. CMSC 470 Marine Carpuat
Neural Networks, Computation Graphs CMSC 470 Marine Carpuat Binary Classification with a Multi-layer Perceptron φ A = 1 φ site = 1 φ located = 1 φ Maizuru = 1 φ, = 2 φ in = 1 φ Kyoto = 1 φ priest = 0 φ
More informationNEURAL NETWORKS
5 Neural Networks In Chapters 3 and 4 we considered models for regression and classification that comprised linear combinations of fixed basis functions. We saw that such models have useful analytical
More informationArtificial Neural Networks. MGS Lecture 2
Artificial Neural Networks MGS 2018 - Lecture 2 OVERVIEW Biological Neural Networks Cell Topology: Input, Output, and Hidden Layers Functional description Cost functions Training ANNs Back-Propagation
More informationRegression with Numerical Optimization. Logistic
CSG220 Machine Learning Fall 2008 Regression with Numerical Optimization. Logistic regression Regression with Numerical Optimization. Logistic regression based on a document by Andrew Ng October 3, 204
More informationOutline. Supervised Learning. Hong Chang. Institute of Computing Technology, Chinese Academy of Sciences. Machine Learning Methods (Fall 2012)
Outline Hong Chang Institute of Computing Technology, Chinese Academy of Sciences Machine Learning Methods (Fall 2012) Outline Outline I 1 Linear Models for Regression Linear Regression Probabilistic Interpretation
More informationIntroduction Neural Networks - Architecture Network Training Small Example - ZIP Codes Summary. Neural Networks - I. Henrik I Christensen
Neural Networks - I Henrik I Christensen Robotics & Intelligent Machines @ GT Georgia Institute of Technology, Atlanta, GA 30332-0280 hic@cc.gatech.edu Henrik I Christensen (RIM@GT) Neural Networks 1 /
More informationIntroduction to Neural Networks
Introduction to Neural Networks Steve Renals Automatic Speech Recognition ASR Lecture 10 24 February 2014 ASR Lecture 10 Introduction to Neural Networks 1 Neural networks for speech recognition Introduction
More informationStochastic Gradient Descent
Stochastic Gradient Descent Machine Learning CSE546 Carlos Guestrin University of Washington October 9, 2013 1 Logistic Regression Logistic function (or Sigmoid): Learn P(Y X) directly Assume a particular
More informationCheng Soon Ong & Christian Walder. Canberra February June 2018
Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2018 Outlines Overview Introduction Linear Algebra Probability Linear Regression
More informationAn Introduction to Statistical and Probabilistic Linear Models
An Introduction to Statistical and Probabilistic Linear Models Maximilian Mozes Proseminar Data Mining Fakultät für Informatik Technische Universität München June 07, 2017 Introduction In statistical learning
More informationLogistic Regression. Seungjin Choi
Logistic Regression Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr http://mlg.postech.ac.kr/
More informationClassification CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2012
Classification CE-717: Machine Learning Sharif University of Technology M. Soleymani Fall 2012 Topics Discriminant functions Logistic regression Perceptron Generative models Generative vs. discriminative
More informationMark Gales October y (x) x 1. x 2 y (x) Inputs. Outputs. x d. y (x) Second Output layer layer. layer.
University of Cambridge Engineering Part IIB & EIST Part II Paper I0: Advanced Pattern Processing Handouts 4 & 5: Multi-Layer Perceptron: Introduction and Training x y (x) Inputs x 2 y (x) 2 Outputs x
More informationRelevance Vector Machines
LUT February 21, 2011 Support Vector Machines Model / Regression Marginal Likelihood Regression Relevance vector machines Exercise Support Vector Machines The relevance vector machine (RVM) is a bayesian
More informationSlides modified from: PATTERN RECOGNITION CHRISTOPHER M. BISHOP. and: Computer vision: models, learning and inference Simon J.D.
Slides modified from: PATTERN RECOGNITION AND MACHINE LEARNING CHRISTOPHER M. BISHOP and: Computer vision: models, learning and inference. 2011 Simon J.D. Prince ClassificaLon Example: Gender ClassificaLon
More informationProbabilistic generative models
Linear models for classification Francesco Corona Probabilistic discriminative models Models with linear decision boundaries arise from assumptions about the data In a generative approach to classification,
More informationLinear Models for Regression
Linear Models for Regression Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr
More informationUniversität Potsdam Institut für Informatik Lehrstuhl Maschinelles Lernen. Bayesian Learning. Tobias Scheffer, Niels Landwehr
Universität Potsdam Institut für Informatik Lehrstuhl Maschinelles Lernen Bayesian Learning Tobias Scheffer, Niels Landwehr Remember: Normal Distribution Distribution over x. Density function with parameters
More informationECE521 week 3: 23/26 January 2017
ECE521 week 3: 23/26 January 2017 Outline Probabilistic interpretation of linear regression - Maximum likelihood estimation (MLE) - Maximum a posteriori (MAP) estimation Bias-variance trade-off Linear
More informationModeling Data with Linear Combinations of Basis Functions. Read Chapter 3 in the text by Bishop
Modeling Data with Linear Combinations of Basis Functions Read Chapter 3 in the text by Bishop A Type of Supervised Learning Problem We want to model data (x 1, t 1 ),..., (x N, t N ), where x i is a vector
More informationLogistic Regression. COMP 527 Danushka Bollegala
Logistic Regression COMP 527 Danushka Bollegala Binary Classification Given an instance x we must classify it to either positive (1) or negative (0) class We can use {1,-1} instead of {1,0} but we will
More informationLinear Models for Regression
Linear Models for Regression Machine Learning Torsten Möller Möller/Mori 1 Reading Chapter 3 of Pattern Recognition and Machine Learning by Bishop Chapter 3+5+6+7 of The Elements of Statistical Learning
More informationMachine Learning - Waseda University Logistic Regression
Machine Learning - Waseda University Logistic Regression AD June AD ) June / 9 Introduction Assume you are given some training data { x i, y i } i= where xi R d and y i can take C different values. Given
More informationInformatics 2B: Learning and Data Lecture 10 Discriminant functions 2. Minimal misclassifications. Decision Boundaries
Overview Gaussians estimated from training data Guido Sanguinetti Informatics B Learning and Data Lecture 1 9 March 1 Today s lecture Posterior probabilities, decision regions and minimising the probability
More informationComputational statistics
Computational statistics Lecture 3: Neural networks Thierry Denœux 5 March, 2016 Neural networks A class of learning methods that was developed separately in different fields statistics and artificial
More informationy(x n, w) t n 2. (1)
Network training: Training a neural network involves determining the weight parameter vector w that minimizes a cost function. Given a training set comprising a set of input vector {x n }, n = 1,...N,
More informationLinear Classification
Linear Classification Lili MOU moull12@sei.pku.edu.cn http://sei.pku.edu.cn/ moull12 23 April 2015 Outline Introduction Discriminant Functions Probabilistic Generative Models Probabilistic Discriminative
More informationLecture 7. Logistic Regression. Luigi Freda. ALCOR Lab DIAG University of Rome La Sapienza. December 11, 2016
Lecture 7 Logistic Regression Luigi Freda ALCOR Lab DIAG University of Rome La Sapienza December 11, 2016 Luigi Freda ( La Sapienza University) Lecture 7 December 11, 2016 1 / 39 Outline 1 Intro Logistic
More informationLinear Classification: Probabilistic Generative Models
Linear Classification: Probabilistic Generative Models Sargur N. University at Buffalo, State University of New York USA 1 Linear Classification using Probabilistic Generative Models Topics 1. Overview
More informationNeural Networks and Deep Learning
Neural Networks and Deep Learning Professor Ameet Talwalkar November 12, 2015 Professor Ameet Talwalkar Neural Networks and Deep Learning November 12, 2015 1 / 16 Outline 1 Review of last lecture AdaBoost
More informationMachine Learning. Lecture 4: Regularization and Bayesian Statistics. Feng Li. https://funglee.github.io
Machine Learning Lecture 4: Regularization and Bayesian Statistics Feng Li fli@sdu.edu.cn https://funglee.github.io School of Computer Science and Technology Shandong University Fall 207 Overfitting Problem
More informationCheng Soon Ong & Christian Walder. Canberra February June 2018
Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2018 Outlines Overview Introduction Linear Algebra Probability Linear Regression
More informationDEPARTMENT OF COMPUTER SCIENCE Autumn Semester MACHINE LEARNING AND ADAPTIVE INTELLIGENCE
Data Provided: None DEPARTMENT OF COMPUTER SCIENCE Autumn Semester 203 204 MACHINE LEARNING AND ADAPTIVE INTELLIGENCE 2 hours Answer THREE of the four questions. All questions carry equal weight. Figures
More informationBayesian Linear Regression. Sargur Srihari
Bayesian Linear Regression Sargur srihari@cedar.buffalo.edu Topics in Bayesian Regression Recall Max Likelihood Linear Regression Parameter Distribution Predictive Distribution Equivalent Kernel 2 Linear
More informationMachine Learning and Bayesian Inference
Machine Learning and Bayesian Inference Dr Sean Holden Computer Laboratory, Room FC6 Telephone extension 63725 Email: sbh11@cl.cam.ac.uk www.cl.cam.ac.uk/ sbh11/ Copyright c Sean Holden 22-17. 1 Artificial
More informationPattern Recognition and Machine Learning
Christopher M. Bishop Pattern Recognition and Machine Learning ÖSpri inger Contents Preface Mathematical notation Contents vii xi xiii 1 Introduction 1 1.1 Example: Polynomial Curve Fitting 4 1.2 Probability
More informationOverview c 1 What is? 2 Definition Outlines 3 Examples of 4 Related Fields Overview Linear Regression Linear Classification Neural Networks Kernel Met
c Outlines Statistical Group and College of Engineering and Computer Science Overview Linear Regression Linear Classification Neural Networks Kernel Methods and SVM Mixture Models and EM Resources More
More informationIntroduction to Machine Learning
Introduction to Machine Learning Logistic Regression Varun Chandola Computer Science & Engineering State University of New York at Buffalo Buffalo, NY, USA chandola@buffalo.edu Chandola@UB CSE 474/574
More informationStatistical Machine Learning Hilary Term 2018
Statistical Machine Learning Hilary Term 2018 Pier Francesco Palamara Department of Statistics University of Oxford Slide credits and other course material can be found at: http://www.stats.ox.ac.uk/~palamara/sml18.html
More information6.867 Machine Learning
6.867 Machine Learning Problem Set 2 Due date: Wednesday October 6 Please address all questions and comments about this problem set to 6867-staff@csail.mit.edu. You will need to use MATLAB for some of
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 3 Linear
More informationMachine Learning. Regression-Based Classification & Gaussian Discriminant Analysis. Manfred Huber
Machine Learning Regression-Based Classification & Gaussian Discriminant Analysis Manfred Huber 2015 1 Logistic Regression Linear regression provides a nice representation and an efficient solution to
More informationLecture 10. Neural networks and optimization. Machine Learning and Data Mining November Nando de Freitas UBC. Nonlinear Supervised Learning
Lecture 0 Neural networks and optimization Machine Learning and Data Mining November 2009 UBC Gradient Searching for a good solution can be interpreted as looking for a minimum of some error (loss) function
More informationDATA MINING AND MACHINE LEARNING. Lecture 4: Linear models for regression and classification Lecturer: Simone Scardapane
DATA MINING AND MACHINE LEARNING Lecture 4: Linear models for regression and classification Lecturer: Simone Scardapane Academic Year 2016/2017 Table of contents Linear models for regression Regularized
More informationMulti-layer Neural Networks
Multi-layer Neural Networks Steve Renals Informatics 2B Learning and Data Lecture 13 8 March 2011 Informatics 2B: Learning and Data Lecture 13 Multi-layer Neural Networks 1 Overview Multi-layer neural
More informationSTAT5044: Regression and Anova
STAT5044: Regression and Anova Inyoung Kim 1 / 15 Outline 1 Fitting GLMs 2 / 15 Fitting GLMS We study how to find the maxlimum likelihood estimator ˆβ of GLM parameters The likelihood equaions are usually
More informationPATTERN RECOGNITION AND MACHINE LEARNING
PATTERN RECOGNITION AND MACHINE LEARNING Chapter 1. Introduction Shuai Huang April 21, 2014 Outline 1 What is Machine Learning? 2 Curve Fitting 3 Probability Theory 4 Model Selection 5 The curse of dimensionality
More informationArtificial Neural Networks
Artificial Neural Networks Stephan Dreiseitl University of Applied Sciences Upper Austria at Hagenberg Harvard-MIT Division of Health Sciences and Technology HST.951J: Medical Decision Support Knowledge
More informationNeural Networks - II
Neural Networks - II Henrik I Christensen Robotics & Intelligent Machines @ GT Georgia Institute of Technology, Atlanta, GA 30332-0280 hic@cc.gatech.edu Henrik I Christensen (RIM@GT) Neural Networks 1
More informationMachine Learning: Logistic Regression. Lecture 04
Machine Learning: Logistic Regression Razvan C. Bunescu School of Electrical Engineering and Computer Science bunescu@ohio.edu Supervised Learning Task = learn an (unkon function t : X T that maps input
More informationLinear & nonlinear classifiers
Linear & nonlinear classifiers Machine Learning Hamid Beigy Sharif University of Technology Fall 1396 Hamid Beigy (Sharif University of Technology) Linear & nonlinear classifiers Fall 1396 1 / 44 Table
More informationIntroduction Dual Representations Kernel Design RBF Linear Reg. GP Regression GP Classification Summary. Kernel Methods. Henrik I Christensen
Kernel Methods Henrik I Christensen Robotics & Intelligent Machines @ GT Georgia Institute of Technology, Atlanta, GA 30332-0280 hic@cc.gatech.edu Henrik I Christensen (RIM@GT) Kernel Methods 1 / 37 Outline
More informationMultilayer Perceptron
Outline Hong Chang Institute of Computing Technology, Chinese Academy of Sciences Machine Learning Methods (Fall 2012) Outline Outline I 1 Introduction 2 Single Perceptron 3 Boolean Function Learning 4
More informationMidterm. Introduction to Machine Learning. CS 189 Spring You have 1 hour 20 minutes for the exam.
CS 189 Spring 2013 Introduction to Machine Learning Midterm You have 1 hour 20 minutes for the exam. The exam is closed book, closed notes except your one-page crib sheet. Please use non-programmable calculators
More informationLogistic Regression & Neural Networks
Logistic Regression & Neural Networks CMSC 723 / LING 723 / INST 725 Marine Carpuat Slides credit: Graham Neubig, Jacob Eisenstein Logistic Regression Perceptron & Probabilities What if we want a probability
More informationArtificial Intelligence: what have we seen so far? Machine Learning and Bayesian Inference
Machine Learning and Bayesian Inference Dr Sean Holden Computer Laboratory, Room FC6 Telephone extension 6375 Email: sbh@cl.cam.ac.uk www.cl.cam.ac.uk/ sbh/ Artificial Intelligence: what have we seen so
More informationCS 340 Lec. 16: Logistic Regression
CS 34 Lec. 6: Logistic Regression AD March AD ) March / 6 Introduction Assume you are given some training data { x i, y i } i= where xi R d and y i can take C different values. Given an input test data
More informationCSCI567 Machine Learning (Fall 2014)
CSCI567 Machine Learning (Fall 24) Drs. Sha & Liu {feisha,yanliu.cs}@usc.edu October 2, 24 Drs. Sha & Liu ({feisha,yanliu.cs}@usc.edu) CSCI567 Machine Learning (Fall 24) October 2, 24 / 24 Outline Review
More information> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE GRAVIS 2016 BASEL. Logistic Regression. Pattern Recognition 2016 Sandro Schönborn University of Basel
Logistic Regression Pattern Recognition 2016 Sandro Schönborn University of Basel Two Worlds: Probabilistic & Algorithmic We have seen two conceptual approaches to classification: data class density estimation
More informationMachine Learning Basics Lecture 2: Linear Classification. Princeton University COS 495 Instructor: Yingyu Liang
Machine Learning Basics Lecture 2: Linear Classification Princeton University COS 495 Instructor: Yingyu Liang Review: machine learning basics Math formulation Given training data x i, y i : 1 i n i.i.d.
More information