# CS 231A Section 1: Linear Algebra & Probability Review

Kevin Tang, 9/30/2011



Topics:

- Support Vector Machines
- Boosting and the Viola-Jones face detector
- Linear algebra review: notation, operations & properties, matrix calculus
- Probability: axioms, basic properties, Bayes' theorem, the chain rule

## Support Vector Machines

Linear classifiers: find a linear function (hyperplane) that separates the positive and negative examples, i.e. $x_i \cdot w + b \ge 0$ for $x_i$ positive and $x_i \cdot w + b < 0$ for $x_i$ negative, for some $w, b$. Which hyperplane is best?
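As a minimal sketch of this decision rule (the weight vector `w` and bias `b` below are hypothetical values chosen for illustration, not learned):

```python
def classify(x, w, b):
    """Return +1 if x lies on the positive side of the hyperplane w.x + b = 0."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1

w, b = [1.0, 1.0], -1.5               # hypothetical hyperplane x1 + x2 = 1.5
print(classify([2.0, 1.0], w, b))     # 1  (above the line)
print(classify([0.0, 0.0], w, b))     # -1 (below the line)
```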

Support vector machines: find the hyperplane that maximizes the margin between the positive and negative examples. The training points lying on the margin are the support vectors.

Support Vector Machines (SVM): we wish to perform binary classification, i.e. find a linear classifier. Given data $x_i$ with labels $y_i \in \{-1, +1\}$, when the data is linearly separable we can solve the optimization problem

$$\min_{w,b} \ \tfrac{1}{2}\|w\|^2 \quad \text{s.t.} \quad y_i (w^\top x_i + b) \ge 1 \ \ \forall i$$

to find our linear classifier.

Nonlinear SVMs: datasets that are linearly separable work out great, but what if the dataset is just too hard? We can map it to a higher-dimensional space, e.g. $x \mapsto (x, x^2)$. (Slide credit: Andrew Moore)

General idea: the original input space can always be mapped to some higher-dimensional feature space where the training set is separable, via a lifting transformation $\Phi : x \mapsto \varphi(x)$. (Slide credit: Andrew Moore)
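The lifting idea can be illustrated with a toy sketch (the 1D data below is made up): points that no single threshold on $x$ separates become linearly separable after the mapping $x \mapsto (x, x^2)$.

```python
def lift(x):
    """Lifting transformation: map scalar x to the feature vector (x, x^2)."""
    return (x, x * x)

xs     = [-2.0, -0.5, 0.0, 0.5, 2.0]
labels = [+1,   -1,   -1,  -1,  +1]    # +1 outside [-1, 1], -1 inside

# In the lifted space the *linear* rule  x^2 - 1 >= 0  separates the classes:
predictions = [1 if x2 - 1.0 >= 0 else -1 for (_, x2) in map(lift, xs)]
print(predictions == labels)   # True
```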

SVM with $\ell_1$ regularization: what if the data is not linearly separable? We can use regularization to handle this, solving a new optimization problem with slack variables and tuning the regularization parameter $C$.

Solving the SVM: there are many packages for solving SVMs. In PS0 you use the liblinear package, an efficient implementation that supports only a linear kernel. If you want more flexibility in your choice of kernel, you can use the LibSVM package.

## Boosting

Boosting (Y. Freund and R. Schapire, "A short introduction to boosting," Journal of Japanese Society for Artificial Intelligence, 14(5), September 1999) is a sequential procedure over rounds $t = 1, 2, \ldots$. Each data point has a class label $y_t \in \{+1, -1\}$ and an initial weight $w_t = 1$.

Toy example, with weak learners drawn from the family of lines: each data point has a class label $y_t \in \{+1, -1\}$ and a weight $w_t = 1$. A line $h$ that splits the data arbitrarily has $p(\text{error}) = 0.5$; it is at chance.

Among the candidate lines, this one seems to be the best. It is a weak classifier: it performs slightly better than chance.

We then update the weights, $w_t \leftarrow w_t \exp\{-y_t H_t\}$, so that points the current classifier gets wrong receive more weight, and repeat: select the next best weak classifier on the reweighted data, then reweight again.




The strong (non-linear) classifier is built as the combination of all the weak (linear) classifiers $f_1, f_2, f_3, f_4$.

Boosting defines a classifier using an additive model

$$F(x) = \sum_t \alpha_t h_t(x),$$

where $F$ is the strong classifier, $x$ the feature vector, $\alpha_t$ the weights, and $h_t$ the weak classifiers.

To use this additive model, we need to define a family of weak classifiers for the algorithm to select from.

Why boosting? It is a simple algorithm for learning robust classifiers (Freund & Schapire, 1995; Friedman, Hastie & Tibshirani, 1998), it provides an efficient algorithm for sparse visual feature selection (Tieu & Viola, 2000; Viola & Jones, 2003), and it is easy to implement, requiring no external optimization tools.

Boosting mathematics: each weak learner thresholds the value of a rectangle feature,

$$h_j(x) = \begin{cases} 1 & \text{if } f_j(x) > \theta_j \\ 0 & \text{otherwise,} \end{cases}$$

where $f_j$ is the feature value and $\theta_j$ the threshold. The final strong classifier is

$$h(x) = \begin{cases} 1 & \text{if } \sum_{t=1}^{T} \alpha_t h_t(x) \ge \frac{1}{2} \sum_{t=1}^{T} \alpha_t \\ 0 & \text{otherwise.} \end{cases}$$

Weak classifiers are built from four kinds of rectangle filters. Value = (sum of pixels in white area) - (sum of pixels in black area). (Slide credit: S. Lazebnik)

Applying a rectangle filter across a source image yields a filter-response image. (Slide credit: S. Lazebnik)
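In the Viola-Jones paper these rectangle sums are computed in constant time with an integral image, a detail not spelled out on these slides. A toy sketch (4×4 made-up image, hypothetical rectangle coordinates):

```python
import numpy as np

img = np.arange(16, dtype=float).reshape(4, 4)   # toy 4x4 "image"

# Integral image, padded with a zero row/column on top/left.
ii = np.zeros((5, 5))
ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(r0, c0, r1, c1):
    """Sum of img[r0:r1, c0:c1] in O(1) via four integral-image lookups."""
    return ii[r1, c1] - ii[r0, c1] - ii[r1, c0] + ii[r0, c0]

# Two-rectangle feature: (sum of one area) - (sum of the other),
# here the left half of the image minus the right half.
feature = rect_sum(0, 0, 4, 2) - rect_sum(0, 2, 4, 4)
print(feature == img[:, :2].sum() - img[:, 2:].sum())   # True
```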

Viola & Jones algorithm, step 1: evaluate each rectangle filter on each labeled example $(x_1, 1), (x_2, 1), (x_3, 0), \ldots, (x_n, y_n)$, using the weak classifier

$$h_j(x) = \begin{cases} 1 & \text{if } f_j(x) > \theta_j \\ 0 & \text{otherwise.} \end{cases}$$

(P. Viola and M. Jones, "Rapid object detection using a boosted cascade of simple features," CVPR 2001.)

For a 24×24 detection region, the number of possible rectangle features is enormous, far larger than the number of pixels.

Step 2: select the best filter/threshold combination. (a) Normalize the weights: $w_{t,i} \leftarrow w_{t,i} / \sum_{j=1}^{n} w_{t,j}$. (b) For each feature $j$, compute the weighted error $\epsilon_j = \sum_i w_i \,\lvert h_j(x_i) - y_i \rvert$. (c) Choose the classifier $h_t$ with the lowest error $\epsilon_t$. Step 3: reweight the examples, $w_{t+1,i} = w_{t,i}\, \beta_t^{\,1 - e_i}$, where $e_i = 0$ if example $x_i$ is classified correctly and $e_i = 1$ otherwise, and $\beta_t = \epsilon_t / (1 - \epsilon_t)$.

Step 4: the final strong classifier is

$$h(x) = \begin{cases} 1 & \text{if } \sum_{t=1}^{T} \alpha_t h_t(x) \ge \frac{1}{2} \sum_{t=1}^{T} \alpha_t \\ 0 & \text{otherwise,} \end{cases} \qquad \alpha_t = \log \frac{1}{\beta_t}.$$

The final hypothesis is a weighted linear combination of the $T$ hypotheses, where the weights are inversely proportional to the training errors.

Boosting for face detection, each round: (1) evaluate each rectangle filter on each example, (2) select the best filter/threshold combination, (3) reweight the examples.

The implemented system. Training data: 5,000 faces (all frontal, rescaled to 24×24 pixels, normalized for scale and translation) and 300 million non-face subwindows drawn from 9,500 non-face images, with many variations across individuals, illumination, and pose.

System performance: training took weeks on a 466 MHz Sun workstation. The detector has 38 layers with 6,061 features in total, yet only about 10 features are evaluated per window on the test set on average. On a 700 MHz Pentium III processor, the face detector can process a 384×288 pixel image in about 0.067 seconds (roughly 15 Hz), 15 times faster than the previous detector of comparable accuracy (Rowley et al., 1998).

Output of the face detector on test images.

## Linear Algebra Review

Linear algebra in computer vision: representation (3D points in the scene, 2D points in the image; images themselves are matrices) and transformations (mapping 2D to 2D, mapping 3D to 2D).

Notation: we write $A \in \mathbb{R}^{m \times n}$ for a real-valued matrix with $m$ rows and $n$ columns, and $x \in \mathbb{R}^n$ for a column vector ($x^\top$ denotes the corresponding row vector).

To indicate the element in the $i$-th row and $j$-th column of a matrix we write $a_{ij}$; similarly, the $i$-th entry of a vector is $x_i$.

Norms: intuitively, the norm of a vector measures its length. The $\ell_2$ norm is defined as $\|x\|_2 = \sqrt{\sum_{i=1}^{n} x_i^2}$. In this class we will use the $\ell_2$ norm unless otherwise noted, so we drop the subscript 2 for convenience. Note that $\|x\|^2 = x^\top x$.
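Both norm facts are easy to check numerically, e.g. with NumPy:

```python
import numpy as np

x = np.array([3.0, 4.0])
norm = np.sqrt((x ** 2).sum())             # l2 norm by the definition
print(norm)                                # 5.0
print(np.isclose(norm ** 2, x @ x))        # True: ||x||^2 = x^T x
```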

A set of vectors is linearly independent if no vector in the set can be represented as a linear combination of the remaining vectors in the set. The rank of a matrix is the maximal number of linearly independent columns (equivalently, rows) of the matrix.

The range of a matrix $A$ is the span of its columns, $\mathcal{R}(A) = \{ Ax : x \in \mathbb{R}^n \}$. The nullspace of $A$ is the set of vectors that the matrix maps to zero, $\mathcal{N}(A) = \{ x : Ax = 0 \}$.

Given a square matrix $A$, $\lambda$ and $x \ne 0$ are said to be an eigenvalue and a corresponding eigenvector of $A$ if $Ax = \lambda x$. We can solve for the eigenvalues as the roots of the characteristic polynomial $\det(\lambda I - A) = 0$.
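A quick numerical check of the definition $Av = \lambda v$, on an arbitrary small matrix chosen for illustration:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
lams, V = np.linalg.eig(A)             # eigenvalues and eigenvectors of A
for lam, v in zip(lams, V.T):          # the columns of V are the eigenvectors
    print(np.allclose(A @ v, lam * v)) # True for every eigenpair
```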

Eigenvalue properties: for a diagonalizable matrix, the rank equals the number of its non-zero eigenvalues. The eigenvalues of a diagonal matrix are simply its diagonal entries. A matrix $A$ is said to be diagonalizable if we can write $A = V \Lambda V^{-1}$ with $\Lambda$ diagonal and $V$ invertible.

Eigenvalues of symmetric matrices are real, and their eigenvectors can be chosen orthonormal. Consider the optimization problem $\max_x \ x^\top A x$ subject to $\|x\|_2 = 1$ for a symmetric matrix $A$: the maximizing $x$ is the eigenvector corresponding to the largest eigenvalue.
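This variational fact can be sanity-checked numerically: no random unit vector beats the top eigenvector, and the maximum value of the quadratic form equals the largest eigenvalue (toy symmetric matrix chosen for illustration).

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
lams, V = np.linalg.eig(A)
top = V[:, np.argmax(lams)]            # unit eigenvector of the largest eigenvalue

rng = np.random.default_rng(0)
xs = rng.normal(size=(1000, 2))
xs /= np.linalg.norm(xs, axis=1, keepdims=True)      # random unit vectors
best_random = max(x @ A @ x for x in xs)

print(best_random <= top @ A @ top + 1e-9)    # True: nothing beats the eigenvector
print(np.isclose(top @ A @ top, lams.max()))  # True: the max equals lambda_max
```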

Generalized eigenvalue problem: $Ax = \lambda B x$. Generalized eigenvalues must satisfy $\det(A - \lambda B) = 0$. This reduces to the ordinary eigenvalue problem when $B^{-1}$ exists. Generalized eigenvalues are used in Fisherfaces.

The SVD of a matrix $A \in \mathbb{R}^{m \times n}$ is given by $A = U \Sigma V^\top$, where the columns of $U$ are called the left singular vectors, $\Sigma$ is a diagonal matrix whose entries $\sigma_i$ are called the singular values, and the columns of $V$ are called the right singular vectors.

If the matrix $A$ has rank $r$, then $A$ has $r$ nonzero singular values; the first $r$ left singular vectors are an orthonormal basis for the range of $A$, and the remaining right singular vectors are an orthonormal basis for its nullspace. The singular values of $A$ are the square roots of the non-zero eigenvalues of $A^\top A$ or $A A^\top$.
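Both claims can be verified on a small example (the matrix below is an arbitrary rank-2 illustration):

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [2.0, 0.0],
              [0.0, 3.0]])
U, s, Vt = np.linalg.svd(A)
eigs = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]   # eigenvalues of A^T A, descending

print(np.sum(s > 1e-10) == np.linalg.matrix_rank(A))   # True: r nonzero singular values
print(np.allclose(s, np.sqrt(eigs)))                   # True: sigma_i = sqrt(lambda_i)
```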

Matlab: `[V,D] = eig(A)` returns the eigenvectors of A as the columns of V and a diagonal matrix D whose entries are the eigenvalues of A. `[V,D] = eig(A,B)` returns the generalized eigenvectors as the columns of V and a diagonal matrix D whose entries are the generalized eigenvalues. `[U,S,V] = svd(X)` returns the left singular vectors of X as the columns of U, a diagonal matrix S whose entries are the singular values of X, and the right singular vectors of X as the columns of V. Recall `X = U*S*V'`.

Matrix calculus, the gradient: let $f : \mathbb{R}^{m \times n} \to \mathbb{R}$; the gradient $\nabla_A f(A)$ is the matrix of partial derivatives, $(\nabla_A f)_{ij} = \partial f / \partial A_{ij}$. The gradient is always the same size as its argument, so if we just have a vector $x$, the gradient is simply $(\nabla_x f)_i = \partial f / \partial x_i$.
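As a sketch, one standard gradient identity (not stated on the slide) is $\nabla_x\, x^\top A x = (A + A^\top) x$, which we can confirm against central finite differences:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [0.0, 3.0]])
x = np.array([1.0, -1.0])
f = lambda x: x @ A @ x

analytic = (A + A.T) @ x                 # gradient identity for x^T A x
eps = 1e-6
numeric = np.array([(f(x + eps * e) - f(x - eps * e)) / (2 * eps)
                    for e in np.eye(2)]) # central finite differences
print(np.allclose(analytic, numeric))    # True
```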

## Probability

Probability in computer vision: it is the foundation for algorithms that solve tracking problems, human activity recognition, object recognition, and segmentation.

Probability axioms. Sample space: the set $\Omega$ of all outcomes of a random experiment. Event space: a set $\mathcal{F}$ whose elements (events) are subsets of $\Omega$. Probability measure: a function $P : \mathcal{F} \to \mathbb{R}$ that satisfies $P(A) \ge 0$ for all events $A$, $P(\Omega) = 1$, and $P(\bigcup_i A_i) = \sum_i P(A_i)$ for disjoint events $A_i$.

Basic properties follow from the axioms, e.g. $P(\emptyset) = 0$, $P(A^c) = 1 - P(A)$, $P(A) \le P(B)$ whenever $A \subseteq B$, and $P(A \cup B) = P(A) + P(B) - P(A \cap B)$.

Conditional probability: $P(A \mid B) = P(A \cap B) / P(B)$. Two events are independent if $P(A \cap B) = P(A)\,P(B)$. Conditional independence: $P(A \cap B \mid C) = P(A \mid C)\,P(B \mid C)$.

Product rule: from the definition of conditional probability we can write $P(A \cap B) = P(A \mid B)\, P(B)$. From the product rule we can derive the chain rule of probability, $P(A_1 \cap \cdots \cap A_n) = P(A_1)\, P(A_2 \mid A_1) \cdots P(A_n \mid A_1, \ldots, A_{n-1})$.
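A tiny numerical illustration of the product rule, using a hypothetical joint distribution over two coins (the probabilities are made up):

```python
joint = {   # P(X=x, Y=y); made-up numbers that sum to 1
    ("H", "H"): 0.30, ("H", "T"): 0.20,
    ("T", "H"): 0.10, ("T", "T"): 0.40,
}
p_y_H = sum(p for (x, y), p in joint.items() if y == "H")   # marginal P(Y=H)
p_x_H_given_y_H = joint[("H", "H")] / p_y_H                 # P(X=H | Y=H)

# Product rule: P(X=H, Y=H) = P(X=H | Y=H) * P(Y=H)
print(abs(joint[("H", "H")] - p_x_H_given_y_H * p_y_H) < 1e-12)   # True
```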

Bayes' theorem:

$$\underbrace{P(A \mid B)}_{\text{posterior}} = \frac{\overbrace{P(B \mid A)}^{\text{likelihood}}\ \overbrace{P(A)}^{\text{prior}}}{\underbrace{P(B)}_{\text{normalizing constant}}}.$$
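A sketch of Bayes' theorem with hypothetical numbers (the prior and likelihood values below are made up for illustration):

```python
prior = {"face": 0.01, "non-face": 0.99}        # P(class), made up
likelihood = {"face": 0.90, "non-face": 0.05}   # P(detection | class), made up

# Normalizing constant P(detection), then the posterior P(class | detection):
evidence = sum(likelihood[c] * prior[c] for c in prior)
posterior = {c: likelihood[c] * prior[c] / evidence for c in prior}

print(round(posterior["face"], 4))   # 0.1538
```

Even with a 90% likelihood, the small prior keeps the posterior modest, which is why the normalizing constant matters.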


### Vote. Vote on timing for night section: Option 1 (what we have now) Option 2. Lecture, 6:10-7:50 25 minute dinner break Tutorial, 8:15-9

Vote Vote on timing for night section: Option 1 (what we have now) Lecture, 6:10-7:50 25 minute dinner break Tutorial, 8:15-9 Option 2 Lecture, 6:10-7 10 minute break Lecture, 7:10-8 10 minute break Tutorial,

### Perceptron Revisited: Linear Separators. Support Vector Machines

Support Vector Machines Perceptron Revisited: Linear Separators Binary classification can be viewed as the task of separating classes in feature space: w T x + b > 0 w T x + b = 0 w T x + b < 0 Department

### 0.1 Eigenvalues and Eigenvectors

0.. EIGENVALUES AND EIGENVECTORS MATH 22AL Computer LAB for Linear Algebra Eigenvalues and Eigenvectors Dr. Daddel Please save your MATLAB Session (diary)as LAB9.text and submit. 0. Eigenvalues and Eigenvectors

### Image Analysis. PCA and Eigenfaces

Image Analysis PCA and Eigenfaces Christophoros Nikou cnikou@cs.uoi.gr Images taken from: D. Forsyth and J. Ponce. Computer Vision: A Modern Approach, Prentice Hall, 2003. Computer Vision course by Svetlana

### 1. The Polar Decomposition

A PERSONAL INTERVIEW WITH THE SINGULAR VALUE DECOMPOSITION MATAN GAVISH Part. Theory. The Polar Decomposition In what follows, F denotes either R or C. The vector space F n is an inner product space with

### Singular Value Decomposition

Chapter 6 Singular Value Decomposition In Chapter 5, we derived a number of algorithms for computing the eigenvalues and eigenvectors of matrices A R n n. Having developed this machinery, we complete our

### Multiclass Classification-1

CS 446 Machine Learning Fall 2016 Oct 27, 2016 Multiclass Classification Professor: Dan Roth Scribe: C. Cheng Overview Binary to multiclass Multiclass SVM Constraint classification 1 Introduction Multiclass

### MATHEMATICS COMPREHENSIVE EXAM: IN-CLASS COMPONENT

MATHEMATICS COMPREHENSIVE EXAM: IN-CLASS COMPONENT The following is the list of questions for the oral exam. At the same time, these questions represent all topics for the written exam. The procedure for

### Lecture 7: Kernels for Classification and Regression

Lecture 7: Kernels for Classification and Regression CS 194-10, Fall 2011 Laurent El Ghaoui EECS Department UC Berkeley September 15, 2011 Outline Outline A linear regression problem Linear auto-regressive

### Logistic Regression: Online, Lazy, Kernelized, Sequential, etc.

Logistic Regression: Online, Lazy, Kernelized, Sequential, etc. Harsha Veeramachaneni Thomson Reuter Research and Development April 1, 2010 Harsha Veeramachaneni (TR R&D) Logistic Regression April 1, 2010

### Probabilistic Latent Semantic Analysis

Probabilistic Latent Semantic Analysis Dan Oneaţă 1 Introduction Probabilistic Latent Semantic Analysis (plsa) is a technique from the category of topic models. Its main goal is to model cooccurrence information

### MATH 310, REVIEW SHEET 2

MATH 310, REVIEW SHEET 2 These notes are a very short summary of the key topics in the book (and follow the book pretty closely). You should be familiar with everything on here, but it s not comprehensive,