Learning Spectral Graph Segmentation

Size: px
Start display at page:

Download "Learning Spectral Graph Segmentation"

Transcription

1 Learning Spectral Graph Segmentation AISTATS 2005 Timothée Cour Jianbo Shi Nicolas Gogin Computer and Information Science Department University of Pennsylvania Computer Science Ecole Polytechnique

2 Graph-based Image Segmentation Weighted graph G =(V,W) W ij V = vertices (pixels i) Similarity between pixels i and j : W ij = W ji 0 Segmentation = graph partition of pixels 2

3 Spectral Graph Segmentation Image I Intensity Color Edges Texture Graph Affinities W=W(I,Θ) 3

4 Spectral Graph Segmentation Image I Intensity Color Edges Texture Graph Affinities W=W(I,Θ) NCut( A, B) = cut( A, B) Vol A Vol B 4

5 Spectral Graph Segmentation Image I Graph Affinities W=W(I,Θ) Eigenvector X(W) NCut( A, B) = cut( A, B) Vol A Vol B WX X A () i = λdx 1 if i = 0 if i A A 5

6 Spectral Graph Segmentation Image I Graph Affinities W=W(I,Θ) Eigenvector X(W) Discretisation NCut( A, B) = cut( A, B) Vol A Vol B WX = λdx X A ( i) 1 if i A = 0 if i A 6

7 Spectral Graph Segmentation Image I Intensity + Color + Edges + Texture + Graph Affinities W=W(I,Θ) Eigenvector X(W) Learn graph parameters Θ Target segmentation Reverse pipeline 7

8 intensity cues edges cues color cues [Shi & Malik, 97] Multiscale cues [Yu, 04; Fowlkes 04] Texture cues 8

9 Where is Waldo? Do you use Edges cues? Color cues? Texture cues? -That s not enough, you need Shape cues High-level object priors 9

10 Spectral Graph Segmentation Image I Intensity + Color + Edges + Texture + Graph Affinities W=W(I,Θ) Eigenvector X(W) Learn graph parameters Θ Target segmentation 10

11 Θ Image I Affinities W=W(I,Θ) Eigenvector X(W) - Target X * Target P * ~W * Error criterion on affinity matrix W [1] Meila and Shi (2001) [2] Fowlkes, Martin, Malik (2004) 11

12 Θ Image I Affinities W=W(I,Θ) Eigenvector X(W) - Target X * Target P * ~W * Error criterion on affinity matrix W Constraining Wij is overkill: Many W have segmentation W: O(n 2 ) parameters Segments : O(n) parameters [1] Meila and Shi (2001) [2] Fowlkes, Martin, Malik (2004) 12

13 Θ Image I Affinities W=W(I,Θ) Eigenvector X(W) - Error criterion on partition vector X only! Target X * 13

14 Energy function for segmenting 1 image I Eigenvector X(W) - * 2 E ( W) = X( W( I, Θ)) X ( I) I Target X * 14

15 Energy function for segmenting 1 image I Eigenvector X(W) - * 2 E ( W) = X( W( I, Θ)) X ( I) I Min. E( Θ) = XW ( ( I, Θ)) X( I) images I * 2 Target X * 15

16 Eigenvector X(W) - E ( W) = X( W( I, Θ)) X ( I) I Energy function for image I * 2 Min. E( Θ) = XW ( ( I, Θ)) X( I) images I * 2 Target X * Can use gradient descent...but XW ( ) is only implicitly defined, by W( I, Θ ) - λ D X = 0, ( ) X f( W, Θ) Can Not backtrack easily 16

17 Min. E( Θ) = X( W( I, Θ)) X ( I) images I * 2 Gradient descent: Θ = η E E X W = η Θ X W Θ 17

18 Min. E( Θ) = X( W( I, Θ)) X ( I) images I * 2 Gradient descent: X(W) is implicit Θ = η E E X W = η Θ X W Θ E(X) is quadratic depends on W(Θ) 18

19 E X W Θ = η X W Θ Theorem : Derivative of NCut eigenvectors The map W (X, λ) is C over Ω and we can 1 express the derivatives over any C path ( ) as : Wt dλ( W( t)) dt dx ( W ( t)) dt ( ' λ ') T X W D X = T X DX = ( W λd) W' λd' dλ dt D X 19

20 E X W Θ = η X W Θ Theorem : Derivative of NCut eigenvectors The map W (X, λ) is C over Ω and we can 1 express the derivatives over any C path ( ) as : Wt dλ( W( t)) dt dx ( W ( t)) dt ( ' λ ') T X W D X = T X DX = ( W λd) W' λd' dλ dt D X Feasible set in the space of graph weight matrices : W Ωiff 1) W is n n symmetric matrix 2) W1 > 0 3) λ is single with WX = λ DX

21 E X W Θ = η X W Θ Theorem : Derivative of NCut eigenvectors The map W (X, λ) is C over Ω and we can 1 express the derivatives over any C path ( ) as : Wt dλ( W( t)) dt dx ( W ( t)) dt ( ' λ ') T X W D X = T X DX = ( W λd) W' λd' dλ dt D X [Meila, Shortreed, Xu, 05] 21

22 Task 1: Learning 2D point segmentation 100 points at ( x, y ) i i W ij = 2 2 e σx xi xj σ y yi yj ( σ σ ) parameters : Θ =, x y 22

23 ( 2 2) Learn W = exp σ ( x x ) σ ( y y ) ij x i j y i j ( ) * 2 σ σ E( σ, σ ) = X W(, ) X x y x y Ground truth segmentation X * and X(W) learned 23

24 Learning 2D point segmentation ( 2 2) Learn W = exp σ ( x x ) σ ( y y ) ij x i j y i j ( ) * 2 σ σ E( σ, σ ) = X W(, ) X x y x y Ground truth segmentation X * and X(W) learned 24

25 Proposition : Exponential convergence E The PDE W = either W converges to a global minimum W : E W(t) ( ) 0, exponentially or escapes any compact K Ω this happens when 1) λ ( t) 1 2 W() t or λ () t λ () t ) D ( i, i) 0 for some i 25

26 Is there a ground truth segmentation? { e x y α } Graph nodes = edge at (, ) with angle i i i i 26

27 Task 2: Learning rectangular shape W( e, e ) = f( x, y, α; x, y, α ) i j i i i j j j Edge location x, y Edge orientation i i α i Convex Concave Impossible 27

28 Task 2: Learning rectangular shape W( e, e ) = f( x, y, α; x, y, α ) i j i i i j j j Edge location Edge orientaion x, y i α i i Convex Concave Impossible edge locations 4 edge angles 2 W = n n n = 160,000 parameters x y α 28

29 Task 2: Learning rectangular shape W( e, e ) = f( x, y, α; x, y, α ) i j i i i j j j Edge location x, y Edge orientaion i α i i Convex Concave Impossible Invariance through parameterization : W( e, e ) = f( x x, y y, α, α ) i j j i j i i j = 6400 parameters 29

30 Synthetic case training Input edges Ncut eigenvector after training target testing Input edges Ncut eigenvector 30

31 Testing on real images Learning with precise edges 31

32 Comparison of the spectral graph learning schemes Meila and Shi, Fowlkes, Martin and Malik Bach-Jordan Our method 1 1 * arg min D W D W( X ) W arg min W J ( W, X * ) * argmin XW ( ) X W Bach and Jordan (2003) Learning spectral clustering Use a differentiable approximation of eigenvectors Our work Exact analytical form computationally efficient solution 32

33 Conclusion Supervise the un-supervised spectral clustering Exact and efficient learning of graph weights using derivatives of eigenvectors 33

Spectral Clustering. Spectral Clustering? Two Moons Data. Spectral Clustering Algorithm: Bipartioning. Spectral methods

Spectral Clustering. Spectral Clustering? Two Moons Data. Spectral Clustering Algorithm: Bipartioning. Spectral methods Spectral Clustering Seungjin Choi Department of Computer Science POSTECH, Korea seungjin@postech.ac.kr 1 Spectral methods Spectral Clustering? Methods using eigenvectors of some matrices Involve eigen-decomposition

More information

CS 556: Computer Vision. Lecture 13

CS 556: Computer Vision. Lecture 13 CS 556: Computer Vision Lecture 1 Prof. Sinisa Todorovic sinisa@eecs.oregonstate.edu 1 Outline Perceptual grouping Low-level segmentation Ncuts Perceptual Grouping What do you see? 4 What do you see? Rorschach

More information

Grouping with Bias. Stella X. Yu 1,2 Jianbo Shi 1. Robotics Institute 1 Carnegie Mellon University Center for the Neural Basis of Cognition 2

Grouping with Bias. Stella X. Yu 1,2 Jianbo Shi 1. Robotics Institute 1 Carnegie Mellon University Center for the Neural Basis of Cognition 2 Grouping with Bias Stella X. Yu, Jianbo Shi Robotics Institute Carnegie Mellon University Center for the Neural Basis of Cognition What Is It About? Incorporating prior knowledge into grouping Unitary

More information

Summary: A Random Walks View of Spectral Segmentation, by Marina Meila (University of Washington) and Jianbo Shi (Carnegie Mellon University)

Summary: A Random Walks View of Spectral Segmentation, by Marina Meila (University of Washington) and Jianbo Shi (Carnegie Mellon University) Summary: A Random Walks View of Spectral Segmentation, by Marina Meila (University of Washington) and Jianbo Shi (Carnegie Mellon University) The authors explain how the NCut algorithm for graph bisection

More information

Introduction to Spectral Graph Theory and Graph Clustering

Introduction to Spectral Graph Theory and Graph Clustering Introduction to Spectral Graph Theory and Graph Clustering Chengming Jiang ECS 231 Spring 2016 University of California, Davis 1 / 40 Motivation Image partitioning in computer vision 2 / 40 Motivation

More information

Linear Regression. S. Sumitra

Linear Regression. S. Sumitra Linear Regression S Sumitra Notations: x i : ith data point; x T : transpose of x; x ij : ith data point s jth attribute Let {(x 1, y 1 ), (x, y )(x N, y N )} be the given data, x i D and y i Y Here D

More information

Diffuse interface methods on graphs: Data clustering and Gamma-limits

Diffuse interface methods on graphs: Data clustering and Gamma-limits Diffuse interface methods on graphs: Data clustering and Gamma-limits Yves van Gennip joint work with Andrea Bertozzi, Jeff Brantingham, Blake Hunter Department of Mathematics, UCLA Research made possible

More information

Grouping with Directed Relationships

Grouping with Directed Relationships Grouping with Directed Relationships Stella. Yu Jianbo Shi Robotics Institute Carnegie Mellon University Center for the Neural Basis of Cognition Grouping: finding global structures coherence salience

More information

CS 664 Segmentation (2) Daniel Huttenlocher

CS 664 Segmentation (2) Daniel Huttenlocher CS 664 Segmentation (2) Daniel Huttenlocher Recap Last time covered perceptual organization more broadly, focused in on pixel-wise segmentation Covered local graph-based methods such as MST and Felzenszwalb-Huttenlocher

More information

Unsupervised Image Segmentation Using Comparative Reasoning and Random Walks

Unsupervised Image Segmentation Using Comparative Reasoning and Random Walks Unsupervised Image Segmentation Using Comparative Reasoning and Random Walks Anuva Kulkarni Carnegie Mellon University Filipe Condessa Carnegie Mellon, IST-University of Lisbon Jelena Kovacevic Carnegie

More information

Linear Regression 1 / 25. Karl Stratos. June 18, 2018

Linear Regression 1 / 25. Karl Stratos. June 18, 2018 Linear Regression Karl Stratos June 18, 2018 1 / 25 The Regression Problem Problem. Find a desired input-output mapping f : X R where the output is a real value. x = = y = 0.1 How much should I turn my

More information

Spectral Clustering. Guokun Lai 2016/10

Spectral Clustering. Guokun Lai 2016/10 Spectral Clustering Guokun Lai 2016/10 1 / 37 Organization Graph Cut Fundamental Limitations of Spectral Clustering Ng 2002 paper (if we have time) 2 / 37 Notation We define a undirected weighted graph

More information

THE HIDDEN CONVEXITY OF SPECTRAL CLUSTERING

THE HIDDEN CONVEXITY OF SPECTRAL CLUSTERING THE HIDDEN CONVEXITY OF SPECTRAL CLUSTERING Luis Rademacher, Ohio State University, Computer Science and Engineering. Joint work with Mikhail Belkin and James Voss This talk A new approach to multi-way

More information

Multiclass Spectral Clustering

Multiclass Spectral Clustering Multiclass Spectral Clustering Stella X. Yu Jianbo Shi Robotics Institute and CNBC Dept. of Computer and Information Science Carnegie Mellon University University of Pennsylvania Pittsburgh, PA 15213-3890

More information

Statistical Machine Learning

Statistical Machine Learning Statistical Machine Learning Christoph Lampert Spring Semester 2015/2016 // Lecture 12 1 / 36 Unsupervised Learning Dimensionality Reduction 2 / 36 Dimensionality Reduction Given: data X = {x 1,..., x

More information

K-Means, Expectation Maximization and Segmentation. D.A. Forsyth, CS543

K-Means, Expectation Maximization and Segmentation. D.A. Forsyth, CS543 K-Means, Expectation Maximization and Segmentation D.A. Forsyth, CS543 K-Means Choose a fixed number of clusters Choose cluster centers and point-cluster allocations to minimize error can t do this by

More information

Object Detection and Segmentation from Joint Embedding of Parts and Pixels

Object Detection and Segmentation from Joint Embedding of Parts and Pixels Object Detection and Segmentation from Joint Embedding of Parts and Pixels Michael Maire 1, Stella X. Yu 2, Pietro Perona 1 1 California Institute of Technology - Pasadena, CA 91125 2 Boston College -

More information

Machine Learning for Data Science (CS4786) Lecture 11

Machine Learning for Data Science (CS4786) Lecture 11 Machine Learning for Data Science (CS4786) Lecture 11 Spectral clustering Course Webpage : http://www.cs.cornell.edu/courses/cs4786/2016sp/ ANNOUNCEMENT 1 Assignment P1 the Diagnostic assignment 1 will

More information

Segmentation with Pairwise Attraction and Repulsion

Segmentation with Pairwise Attraction and Repulsion Segmentation with Pairwise Attraction and Repulsion Stella. Yu Jianbo Shi Robotics Institute Carnegie Mellon Universit {stella.u, jshi}@cs.cmu.edu Funded b DARPA HumanID ONR N4---95 Grouping: decomposition

More information

A spectral clustering algorithm based on Gram operators

A spectral clustering algorithm based on Gram operators A spectral clustering algorithm based on Gram operators Ilaria Giulini De partement de Mathe matiques et Applications ENS, Paris Joint work with Olivier Catoni 1 july 2015 Clustering task of grouping

More information

Segmentation Using Superpixels: A Bipartite Graph Partitioning Approach

Segmentation Using Superpixels: A Bipartite Graph Partitioning Approach Segmentation Using Superpixels: A Bipartite Graph Partitioning Approach Zhenguo Li Xiao-Ming Wu Shih-Fu Chang Dept. of Electrical Engineering, Columbia University, New York {zgli,xmwu,sfchang}@ee.columbia.edu

More information

Spectral Clustering. Zitao Liu

Spectral Clustering. Zitao Liu Spectral Clustering Zitao Liu Agenda Brief Clustering Review Similarity Graph Graph Laplacian Spectral Clustering Algorithm Graph Cut Point of View Random Walk Point of View Perturbation Theory Point of

More information

CSE 291. Assignment Spectral clustering versus k-means. Out: Wed May 23 Due: Wed Jun 13

CSE 291. Assignment Spectral clustering versus k-means. Out: Wed May 23 Due: Wed Jun 13 CSE 291. Assignment 3 Out: Wed May 23 Due: Wed Jun 13 3.1 Spectral clustering versus k-means Download the rings data set for this problem from the course web site. The data is stored in MATLAB format as

More information

Graphs in Machine Learning

Graphs in Machine Learning Graphs in Machine Learning Michal Valko INRIA Lille - Nord Europe, France Partially based on material by: Ulrike von Luxburg, Gary Miller, Doyle & Schnell, Daniel Spielman January 27, 2015 MVA 2014/2015

More information

Lecture 5 : Projections

Lecture 5 : Projections Lecture 5 : Projections EE227C. Lecturer: Professor Martin Wainwright. Scribe: Alvin Wan Up until now, we have seen convergence rates of unconstrained gradient descent. Now, we consider a constrained minimization

More information

Approximation Algorithms

Approximation Algorithms Approximation Algorithms Chapter 26 Semidefinite Programming Zacharias Pitouras 1 Introduction LP place a good lower bound on OPT for NP-hard problems Are there other ways of doing this? Vector programs

More information

MAP Examples. Sargur Srihari

MAP Examples. Sargur Srihari MAP Examples Sargur srihari@cedar.buffalo.edu 1 Potts Model CRF for OCR Topics Image segmentation based on energy minimization 2 Examples of MAP Many interesting examples of MAP inference are instances

More information

Convex relaxation for Combinatorial Penalties

Convex relaxation for Combinatorial Penalties Convex relaxation for Combinatorial Penalties Guillaume Obozinski Equipe Imagine Laboratoire d Informatique Gaspard Monge Ecole des Ponts - ParisTech Joint work with Francis Bach Fête Parisienne in Computation,

More information

Solving Constrained Rayleigh Quotient Optimization Problem by Projected QEP Method

Solving Constrained Rayleigh Quotient Optimization Problem by Projected QEP Method Solving Constrained Rayleigh Quotient Optimization Problem by Projected QEP Method Yunshen Zhou Advisor: Prof. Zhaojun Bai University of California, Davis yszhou@math.ucdavis.edu June 15, 2017 Yunshen

More information

MATH 567: Mathematical Techniques in Data Science Clustering II

MATH 567: Mathematical Techniques in Data Science Clustering II This lecture is based on U. von Luxburg, A Tutorial on Spectral Clustering, Statistics and Computing, 17 (4), 2007. MATH 567: Mathematical Techniques in Data Science Clustering II Dominique Guillot Departments

More information

Shared Segmentation of Natural Scenes. Dependent Pitman-Yor Processes

Shared Segmentation of Natural Scenes. Dependent Pitman-Yor Processes Shared Segmentation of Natural Scenes using Dependent Pitman-Yor Processes Erik Sudderth & Michael Jordan University of California, Berkeley Parsing Visual Scenes sky skyscraper sky dome buildings trees

More information

2.1 Optimization formulation of k-means

2.1 Optimization formulation of k-means MGMT 69000: Topics in High-dimensional Data Analysis Falll 2016 Lecture 2: k-means Clustering Lecturer: Jiaming Xu Scribe: Jiaming Xu, September 2, 2016 Outline Optimization formulation of k-means Convergence

More information

Constrained Optimization and Lagrangian Duality

Constrained Optimization and Lagrangian Duality CIS 520: Machine Learning Oct 02, 2017 Constrained Optimization and Lagrangian Duality Lecturer: Shivani Agarwal Disclaimer: These notes are designed to be a supplement to the lecture. They may or may

More information

Convex Optimization Algorithms for Machine Learning in 10 Slides

Convex Optimization Algorithms for Machine Learning in 10 Slides Convex Optimization Algorithms for Machine Learning in 10 Slides Presenter: Jul. 15. 2015 Outline 1 Quadratic Problem Linear System 2 Smooth Problem Newton-CG 3 Composite Problem Proximal-Newton-CD 4 Non-smooth,

More information

Doubly Stochastic Normalization for Spectral Clustering

Doubly Stochastic Normalization for Spectral Clustering Doubly Stochastic Normalization for Spectral Clustering Ron Zass and Amnon Shashua Abstract In this paper we focus on the issue of normalization of the affinity matrix in spectral clustering. We show that

More information

Spectral Techniques for Clustering

Spectral Techniques for Clustering Nicola Rebagliati 1/54 Spectral Techniques for Clustering Nicola Rebagliati 29 April, 2010 Nicola Rebagliati 2/54 Thesis Outline 1 2 Data Representation for Clustering Setting Data Representation and Methods

More information

Learning Spectral Clustering

Learning Spectral Clustering Learning Spectral Clustering Francis R. Bach Computer Science University of California Berkeley, CA 94720 fbach@cs.berkeley.edu Michael I. Jordan Computer Science and Statistics University of California

More information

Data Analysis and Manifold Learning Lecture 7: Spectral Clustering

Data Analysis and Manifold Learning Lecture 7: Spectral Clustering Data Analysis and Manifold Learning Lecture 7: Spectral Clustering Radu Horaud INRIA Grenoble Rhone-Alpes, France Radu.Horaud@inrialpes.fr http://perception.inrialpes.fr/ Outline of Lecture 7 What is spectral

More information

A Quick Tour of Linear Algebra and Optimization for Machine Learning

A Quick Tour of Linear Algebra and Optimization for Machine Learning A Quick Tour of Linear Algebra and Optimization for Machine Learning Masoud Farivar January 8, 2015 1 / 28 Outline of Part I: Review of Basic Linear Algebra Matrices and Vectors Matrix Multiplication Operators

More information

Lecture 15 Newton Method and Self-Concordance. October 23, 2008

Lecture 15 Newton Method and Self-Concordance. October 23, 2008 Newton Method and Self-Concordance October 23, 2008 Outline Lecture 15 Self-concordance Notion Self-concordant Functions Operations Preserving Self-concordance Properties of Self-concordant Functions Implications

More information

Probabilistic Graphical Models & Applications

Probabilistic Graphical Models & Applications Probabilistic Graphical Models & Applications Learning of Graphical Models Bjoern Andres and Bernt Schiele Max Planck Institute for Informatics The slides of today s lecture are authored by and shown with

More information

NORMALIZED CUTS ARE APPROXIMATELY INVERSE EXIT TIMES

NORMALIZED CUTS ARE APPROXIMATELY INVERSE EXIT TIMES NORMALIZED CUTS ARE APPROXIMATELY INVERSE EXIT TIMES MATAN GAVISH AND BOAZ NADLER Abstract The Normalized Cut is a widely used measure of separation between clusters in a graph In this paper we provide

More information

Spectral Clustering of Polarimetric SAR Data With Wishart-Derived Distance Measures

Spectral Clustering of Polarimetric SAR Data With Wishart-Derived Distance Measures Spectral Clustering of Polarimetric SAR Data With Wishart-Derived Distance Measures STIAN NORMANN ANFINSEN ROBERT JENSSEN TORBJØRN ELTOFT COMPUTATIONAL EARTH OBSERVATION AND MACHINE LEARNING LABORATORY

More information

We choose parameter values that will minimize the difference between the model outputs & the true function values.

We choose parameter values that will minimize the difference between the model outputs & the true function values. CSE 4502/5717 Big Data Analytics Lecture #16, 4/2/2018 with Dr Sanguthevar Rajasekaran Notes from Yenhsiang Lai Machine learning is the task of inferring a function, eg, f : R " R This inference has to

More information

minimize x subject to (x 2)(x 4) u,

minimize x subject to (x 2)(x 4) u, Math 6366/6367: Optimization and Variational Methods Sample Preliminary Exam Questions 1. Suppose that f : [, L] R is a C 2 -function with f () on (, L) and that you have explicit formulae for

More information

Laplacian Eigenmaps for Dimensionality Reduction and Data Representation

Laplacian Eigenmaps for Dimensionality Reduction and Data Representation Introduction and Data Representation Mikhail Belkin & Partha Niyogi Department of Electrical Engieering University of Minnesota Mar 21, 2017 1/22 Outline Introduction 1 Introduction 2 3 4 Connections to

More information

Robust Motion Segmentation by Spectral Clustering

Robust Motion Segmentation by Spectral Clustering Robust Motion Segmentation by Spectral Clustering Hongbin Wang and Phil F. Culverhouse Centre for Robotics Intelligent Systems University of Plymouth Plymouth, PL4 8AA, UK {hongbin.wang, P.Culverhouse}@plymouth.ac.uk

More information

Graph Partitioning by Spectral Rounding: Applications in Image Segmentation and Clustering

Graph Partitioning by Spectral Rounding: Applications in Image Segmentation and Clustering Graph Partitioning by Spectral Rounding: Applications in Image Segmentation and Clustering David A. Tolliver Robotics Institute, Carnegie Mellon University, Pittsburgh, PA 15213. tolliver@ri.cmu.edu Gary

More information

Distributed Box-Constrained Quadratic Optimization for Dual Linear SVM

Distributed Box-Constrained Quadratic Optimization for Dual Linear SVM Distributed Box-Constrained Quadratic Optimization for Dual Linear SVM Lee, Ching-pei University of Illinois at Urbana-Champaign Joint work with Dan Roth ICML 2015 Outline Introduction Algorithm Experiments

More information

Improving the Convergence of Back-Propogation Learning with Second Order Methods

Improving the Convergence of Back-Propogation Learning with Second Order Methods the of Back-Propogation Learning with Second Order Methods Sue Becker and Yann le Cun, Sept 1988 Kasey Bray, October 2017 Table of Contents 1 with Back-Propagation 2 the of BP 3 A Computationally Feasible

More information

Jeff Howbert Introduction to Machine Learning Winter

Jeff Howbert Introduction to Machine Learning Winter Classification / Regression Support Vector Machines Jeff Howbert Introduction to Machine Learning Winter 2012 1 Topics SVM classifiers for linearly separable classes SVM classifiers for non-linearly separable

More information

Differentiation of functions of covariance

Differentiation of functions of covariance Differentiation of log X May 5, 2005 1 Differentiation of functions of covariance matrices or: Why you can forget you ever read this Richard Turner Covariance matrices are symmetric, but we often conveniently

More information

Classification CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2012

Classification CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2012 Classification CE-717: Machine Learning Sharif University of Technology M. Soleymani Fall 2012 Topics Discriminant functions Logistic regression Perceptron Generative models Generative vs. discriminative

More information

Self-Tuning Spectral Clustering

Self-Tuning Spectral Clustering Self-Tuning Spectral Clustering Lihi Zelnik-Manor Pietro Perona Department of Electrical Engineering Department of Electrical Engineering California Institute of Technology California Institute of Technology

More information

Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials

Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials by Phillip Krahenbuhl and Vladlen Koltun Presented by Adam Stambler Multi-class image segmentation Assign a class label to each

More information

Regularization in Neural Networks

Regularization in Neural Networks Regularization in Neural Networks Sargur Srihari 1 Topics in Neural Network Regularization What is regularization? Methods 1. Determining optimal number of hidden units 2. Use of regularizer in error function

More information

Engineering Part IIB: Module 4F10 Statistical Pattern Processing Lecture 5: Single Layer Perceptrons & Estimating Linear Classifiers

Engineering Part IIB: Module 4F10 Statistical Pattern Processing Lecture 5: Single Layer Perceptrons & Estimating Linear Classifiers Engineering Part IIB: Module 4F0 Statistical Pattern Processing Lecture 5: Single Layer Perceptrons & Estimating Linear Classifiers Phil Woodland: pcw@eng.cam.ac.uk Michaelmas 202 Engineering Part IIB:

More information

Need for Deep Networks Perceptron. Can only model linear functions. Kernel Machines. Non-linearity provided by kernels

Need for Deep Networks Perceptron. Can only model linear functions. Kernel Machines. Non-linearity provided by kernels Need for Deep Networks Perceptron Can only model linear functions Kernel Machines Non-linearity provided by kernels Need to design appropriate kernels (possibly selecting from a set, i.e. kernel learning)

More information

Radial-Basis Function Networks

Radial-Basis Function Networks Radial-Basis Function etworks A function is radial () if its output depends on (is a nonincreasing function of) the distance of the input from a given stored vector. s represent local receptors, as illustrated

More information

Advanced Machine Learning & Perception

Advanced Machine Learning & Perception Advanced Machine Learning & Perception Instructor: Tony Jebara Topic 2 Nonlinear Manifold Learning Multidimensional Scaling (MDS) Locally Linear Embedding (LLE) Beyond Principal Components Analysis (PCA)

More information

Stochastic Proximal Gradient Algorithm

Stochastic Proximal Gradient Algorithm Stochastic Institut Mines-Télécom / Telecom ParisTech / Laboratoire Traitement et Communication de l Information Joint work with: Y. Atchade, Ann Arbor, USA, G. Fort LTCI/Télécom Paristech and the kind

More information

Probabilistic classification CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2016

Probabilistic classification CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2016 Probabilistic classification CE-717: Machine Learning Sharif University of Technology M. Soleymani Fall 2016 Topics Probabilistic approach Bayes decision theory Generative models Gaussian Bayes classifier

More information

Part 4: Conditional Random Fields

Part 4: Conditional Random Fields Part 4: Conditional Random Fields Sebastian Nowozin and Christoph H. Lampert Colorado Springs, 25th June 2011 1 / 39 Problem (Probabilistic Learning) Let d(y x) be the (unknown) true conditional distribution.

More information

Intelligent Systems:

Intelligent Systems: Intelligent Systems: Undirected Graphical models (Factor Graphs) (2 lectures) Carsten Rother 15/01/2015 Intelligent Systems: Probabilistic Inference in DGM and UGM Roadmap for next two lectures Definition

More information

PU Learning for Matrix Completion

PU Learning for Matrix Completion Cho-Jui Hsieh Dept of Computer Science UT Austin ICML 2015 Joint work with N. Natarajan and I. S. Dhillon Matrix Completion Example: movie recommendation Given a set Ω and the values M Ω, how to predict

More information

CS675: Convex and Combinatorial Optimization Fall 2016 Convex Optimization Problems. Instructor: Shaddin Dughmi

CS675: Convex and Combinatorial Optimization Fall 2016 Convex Optimization Problems. Instructor: Shaddin Dughmi CS675: Convex and Combinatorial Optimization Fall 2016 Convex Optimization Problems Instructor: Shaddin Dughmi Outline 1 Convex Optimization Basics 2 Common Classes 3 Interlude: Positive Semi-Definite

More information

Stability of Feedback Solutions for Infinite Horizon Noncooperative Differential Games

Stability of Feedback Solutions for Infinite Horizon Noncooperative Differential Games Stability of Feedback Solutions for Infinite Horizon Noncooperative Differential Games Alberto Bressan ) and Khai T. Nguyen ) *) Department of Mathematics, Penn State University **) Department of Mathematics,

More information

Radial-Basis Function Networks

Radial-Basis Function Networks Radial-Basis Function etworks A function is radial basis () if its output depends on (is a non-increasing function of) the distance of the input from a given stored vector. s represent local receptors,

More information

Logistic Regression: Online, Lazy, Kernelized, Sequential, etc.

Logistic Regression: Online, Lazy, Kernelized, Sequential, etc. Logistic Regression: Online, Lazy, Kernelized, Sequential, etc. Harsha Veeramachaneni Thomson Reuter Research and Development April 1, 2010 Harsha Veeramachaneni (TR R&D) Logistic Regression April 1, 2010

More information

SPECTRAL CLUSTERING AND KERNEL PRINCIPAL COMPONENT ANALYSIS ARE PURSUING GOOD PROJECTIONS

SPECTRAL CLUSTERING AND KERNEL PRINCIPAL COMPONENT ANALYSIS ARE PURSUING GOOD PROJECTIONS SPECTRAL CLUSTERING AND KERNEL PRINCIPAL COMPONENT ANALYSIS ARE PURSUING GOOD PROJECTIONS VIKAS CHANDRAKANT RAYKAR DECEMBER 5, 24 Abstract. We interpret spectral clustering algorithms in the light of unsupervised

More information

Random walks for deformable image registration Dana Cobzas

Random walks for deformable image registration Dana Cobzas Random walks for deformable image registration Dana Cobzas with Abhishek Sen, Martin Jagersand, Neil Birkbeck, Karteek Popuri Computing Science, University of Alberta, Edmonton Deformable image registration?t

More information

MLCC Clustering. Lorenzo Rosasco UNIGE-MIT-IIT

MLCC Clustering. Lorenzo Rosasco UNIGE-MIT-IIT MLCC 2018 - Clustering Lorenzo Rosasco UNIGE-MIT-IIT About this class We will consider an unsupervised setting, and in particular the problem of clustering unlabeled data into coherent groups. MLCC 2018

More information

Support Vector Machines

Support Vector Machines Support Vector Machines Le Song Machine Learning I CSE 6740, Fall 2013 Naïve Bayes classifier Still use Bayes decision rule for classification P y x = P x y P y P x But assume p x y = 1 is fully factorized

More information

Deep Learning. Authors: I. Goodfellow, Y. Bengio, A. Courville. Chapter 4: Numerical Computation. Lecture slides edited by C. Yim. C.

Deep Learning. Authors: I. Goodfellow, Y. Bengio, A. Courville. Chapter 4: Numerical Computation. Lecture slides edited by C. Yim. C. Chapter 4: Numerical Computation Deep Learning Authors: I. Goodfellow, Y. Bengio, A. Courville Lecture slides edited by 1 Chapter 4: Numerical Computation 4.1 Overflow and Underflow 4.2 Poor Conditioning

More information

Mixture Models & EM. Nicholas Ruozzi University of Texas at Dallas. based on the slides of Vibhav Gogate

Mixture Models & EM. Nicholas Ruozzi University of Texas at Dallas. based on the slides of Vibhav Gogate Mixture Models & EM icholas Ruozzi University of Texas at Dallas based on the slides of Vibhav Gogate Previously We looed at -means and hierarchical clustering as mechanisms for unsupervised learning -means

More information

Convergence of the Graph Allen-Cahn Scheme

Convergence of the Graph Allen-Cahn Scheme Journal of Statistical Physics manuscript No. (will be inserted by the editor) Xiyang Luo Andrea L. Bertozzi Convergence of the Graph Allen-Cahn Scheme the date of receipt and acceptance should be inserted

More information

Probabilistic Graphical Models

Probabilistic Graphical Models Probabilistic Graphical Models Lecture 9: Variational Inference Relaxations Volkan Cevher, Matthias Seeger Ecole Polytechnique Fédérale de Lausanne 24/10/2011 (EPFL) Graphical Models 24/10/2011 1 / 15

More information

OWL to the rescue of LASSO

OWL to the rescue of LASSO OWL to the rescue of LASSO IISc IBM day 2018 Joint Work R. Sankaran and Francis Bach AISTATS 17 Chiranjib Bhattacharyya Professor, Department of Computer Science and Automation Indian Institute of Science,

More information

Statistical and Computational Analysis of Locality Preserving Projection

Statistical and Computational Analysis of Locality Preserving Projection Statistical and Computational Analysis of Locality Preserving Projection Xiaofei He xiaofei@cs.uchicago.edu Department of Computer Science, University of Chicago, 00 East 58th Street, Chicago, IL 60637

More information

Mixture Models & EM. Nicholas Ruozzi University of Texas at Dallas. based on the slides of Vibhav Gogate

Mixture Models & EM. Nicholas Ruozzi University of Texas at Dallas. based on the slides of Vibhav Gogate Mixture Models & EM icholas Ruozzi University of Texas at Dallas based on the slides of Vibhav Gogate Previously We looed at -means and hierarchical clustering as mechanisms for unsupervised learning -means

More information

Optimization geometry and implicit regularization

Optimization geometry and implicit regularization Optimization geometry and implicit regularization Suriya Gunasekar Joint work with N. Srebro (TTIC), J. Lee (USC), D. Soudry (Technion), M.S. Nacson (Technion), B. Woodworth (TTIC), S. Bhojanapalli (TTIC),

More information

Kernel-Based Contrast Functions for Sufficient Dimension Reduction

Kernel-Based Contrast Functions for Sufficient Dimension Reduction Kernel-Based Contrast Functions for Sufficient Dimension Reduction Michael I. Jordan Departments of Statistics and EECS University of California, Berkeley Joint work with Kenji Fukumizu and Francis Bach

More information

Towards stability and optimality in stochastic gradient descent

Towards stability and optimality in stochastic gradient descent Towards stability and optimality in stochastic gradient descent Panos Toulis, Dustin Tran and Edoardo M. Airoldi August 26, 2016 Discussion by Ikenna Odinaka Duke University Outline Introduction 1 Introduction

More information

Sequence labeling. Taking collective a set of interrelated instances x 1,, x T and jointly labeling them

Sequence labeling. Taking collective a set of interrelated instances x 1,, x T and jointly labeling them HMM, MEMM and CRF 40-957 Special opics in Artificial Intelligence: Probabilistic Graphical Models Sharif University of echnology Soleymani Spring 2014 Sequence labeling aking collective a set of interrelated

More information

Fusion of Similarity Data in Clustering

Fusion of Similarity Data in Clustering Fusion of Similarity Data in Clustering Tilman Lange and Joachim M. Buhmann (langet,jbuhmann)@inf.ethz.ch Institute of Computational Science, Dept. of Computer Sience, ETH Zurich, Switzerland Abstract

More information

Probabilistic Graphical Models

Probabilistic Graphical Models School of Computer Science Probabilistic Graphical Models Variational Inference II: Mean Field Method and Variational Principle Junming Yin Lecture 15, March 7, 2012 X 1 X 1 X 1 X 1 X 2 X 3 X 2 X 2 X 3

More information

HW1 solutions. 1. α Ef(x) β, where Ef(x) is the expected value of f(x), i.e., Ef(x) = n. i=1 p if(a i ). (The function f : R R is given.

HW1 solutions. 1. α Ef(x) β, where Ef(x) is the expected value of f(x), i.e., Ef(x) = n. i=1 p if(a i ). (The function f : R R is given. HW1 solutions Exercise 1 (Some sets of probability distributions.) Let x be a real-valued random variable with Prob(x = a i ) = p i, i = 1,..., n, where a 1 < a 2 < < a n. Of course p R n lies in the standard

More information

Principal Curvatures of Surfaces

Principal Curvatures of Surfaces Principal Curvatures of Surfaces David Eberly, Geometric Tools, Redmond WA 98052 https://www.geometrictools.com/ This work is licensed under the Creative Commons Attribution 4.0 International License.

More information

Weighted Minimal Surfaces and Discrete Weighted Minimal Surfaces

Weighted Minimal Surfaces and Discrete Weighted Minimal Surfaces Weighted Minimal Surfaces and Discrete Weighted Minimal Surfaces Qin Zhang a,b, Guoliang Xu a, a LSEC, Institute of Computational Mathematics, Academy of Mathematics and System Science, Chinese Academy

More information

CSC 412 (Lecture 4): Undirected Graphical Models

CSC 412 (Lecture 4): Undirected Graphical Models CSC 412 (Lecture 4): Undirected Graphical Models Raquel Urtasun University of Toronto Feb 2, 2016 R Urtasun (UofT) CSC 412 Feb 2, 2016 1 / 37 Today Undirected Graphical Models: Semantics of the graph:

More information

Chris Bishop s PRML Ch. 8: Graphical Models

Chris Bishop s PRML Ch. 8: Graphical Models Chris Bishop s PRML Ch. 8: Graphical Models January 24, 2008 Introduction Visualize the structure of a probabilistic model Design and motivate new models Insights into the model s properties, in particular

More information

Lecture 9: PGM Learning

Lecture 9: PGM Learning 13 Oct 2014 Intro. to Stats. Machine Learning COMP SCI 4401/7401 Table of Contents I Learning parameters in MRFs 1 Learning parameters in MRFs Inference and Learning Given parameters (of potentials) and

More information

Computational Intelligence Lecture 3: Simple Neural Networks for Pattern Classification

Computational Intelligence Lecture 3: Simple Neural Networks for Pattern Classification Computational Intelligence Lecture 3: Simple Neural Networks for Pattern Classification Farzaneh Abdollahi Department of Electrical Engineering Amirkabir University of Technology Fall 2011 arzaneh Abdollahi

More information

MATH 567: Mathematical Techniques in Data Science Clustering II

MATH 567: Mathematical Techniques in Data Science Clustering II Spectral clustering: overview MATH 567: Mathematical Techniques in Data Science Clustering II Dominique uillot Departments of Mathematical Sciences University of Delaware Overview of spectral clustering:

More information

Cheng Soon Ong & Christian Walder. Canberra February June 2018

Cheng Soon Ong & Christian Walder. Canberra February June 2018 Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2018 (Many figures from C. M. Bishop, "Pattern Recognition and ") 1of 254 Part V

More information

11. Learning graphical models

11. Learning graphical models Learning graphical models 11-1 11. Learning graphical models Maximum likelihood Parameter learning Structural learning Learning partially observed graphical models Learning graphical models 11-2 statistical

More information

Machine Learning Linear Models

Machine Learning Linear Models Machine Learning Linear Models Outline II - Linear Models 1. Linear Regression (a) Linear regression: History (b) Linear regression with Least Squares (c) Matrix representation and Normal Equation Method

More information

Feedforward Neural Nets and Backpropagation

Feedforward Neural Nets and Backpropagation Feedforward Neural Nets and Backpropagation Julie Nutini University of British Columbia MLRG September 28 th, 2016 1 / 23 Supervised Learning Roadmap Supervised Learning: Assume that we are given the features

More information

An optimal control problem for a parabolic PDE with control constraints

An optimal control problem for a parabolic PDE with control constraints An optimal control problem for a parabolic PDE with control constraints PhD Summer School on Reduced Basis Methods, Ulm Martin Gubisch University of Konstanz October 7 Martin Gubisch (University of Konstanz)

More information

EECS 275 Matrix Computation

EECS 275 Matrix Computation EECS 275 Matrix Computation Ming-Hsuan Yang Electrical Engineering and Computer Science University of California at Merced Merced, CA 95344 http://faculty.ucmerced.edu/mhyang Lecture 23 1 / 27 Overview

More information