ROBUST ESTIMATOR FOR MULTIPLE INLIER STRUCTURES
Xiang Yang (1) and Peter Meer (2)
(1) Dept. of Mechanical and Aerospace Engineering
(2) Dept. of Electrical and Computer Engineering
Rutgers University, NJ 08854, USA

Several inlier structures and outliers... each structure with a different scale... the scales also have to be found. How do you solve it?
S. Mittal, S. Anand, P. Meer: Generalized Projection Based M-Estimator. IEEE Trans. Pattern Anal. Machine Intell., 34.

Three distinct steps in Mittal et al.
1. Scale Estimation. A scale estimate was obtained with all undetected structures contributing as well.
2. Model Estimation Using Mean Shift. The mean shift was more complicated than it should be.
3. Inlier/Outlier Dichotomy. The stopping criterion was just a heuristic relation.
A different solution: simpler, and it avoids the above problems. We will also examine the criteria for robustness.

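Nonlinear objective functions
The $n_1 < n$ inlier measurements $y_i$ plus outliers, in $\mathbb{R}^l$. In general a nonlinear objective function
$\Psi(y_i, \beta) \simeq 0_k$, $i = 1, \ldots, n_1$, with $\Psi(\cdot) \in \mathbb{R}^k$
$\Psi(y_i, \beta)$: outliers, $i = (n_1 + 1), \ldots, n$
e.g. the $3 \times 3$ fundamental matrix $F$ in $y_i'^\top F y_i = 0$, where $k = 1$.
In a higher dimensional linear space the vector $\Psi$ can be separated into a matrix of measurements $\Phi$ and a corresponding new parameter vector $\theta$
$\Psi(y_i, \beta) = \Phi(y_i)\,\theta(\beta)$, $\Phi(y) \in \mathbb{R}^{k \times m}$, $\theta(\beta) \in \mathbb{R}^m$
This gives $k = \zeta$ relations between the rows of $\Phi$, called carriers $x_i^{[c]}$, and $\theta$
$x_i^{[c]\top} \theta \simeq 0$, $c = 1, \ldots, \zeta$, $i = 1, \ldots, n_1$.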

There is an ambiguity in these equations, $x_i^{[c]\top} \theta \simeq 0$, which can be eliminated by taking $\theta^\top \theta = 1$.
The estimate found by the generalized projection based M-estimator (gpbM) was refined by a few percent using the above relation.
S. Mittal, P. Meer: Conjugate gradient on Grassmann manifolds for robust subspace estimation. Image and Vision Computing, 30.
This procedure can also be used with the new algorithm, but it will not be described in this talk.

Ellipse estimation. $\zeta = 1$.
Input: $[x \; y]^\top \in \mathbb{R}^2$
Nonlin. obj. funct.: $(y_i - y_c)^\top Q (y_i - y_c) - 1 \simeq 0$
where the matrix $Q$ is $2 \times 2$, symmetric, positive definite and $y_c$ is the ellipse center. The $y_i$ are measurements!
Carrier: $x = [x \; y \; x^2 \; xy \; y^2]^\top \in \mathbb{R}^5$ gives the linear relation
$x_i^\top \theta - \alpha \simeq 0$, $i = 1, \ldots, n_1$
with the scalar $\alpha$ (intercept) pulled out.
The relation between the input parameters and $\theta$:
$\theta = [\,-2 y_c^\top Q \;\; Q_{11} \;\; 2Q_{12} \;\; Q_{22}\,]^\top$, $\alpha = 1 - y_c^\top Q y_c$
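The mapping from the ellipse parameters to the linear form can be checked numerically. The following is a minimal sketch, not the authors' code: the function names `ellipse_to_linear`, `carrier` and `residual` are hypothetical, and the signs of $\theta$ and $\alpha$ are the ones obtained by expanding $(y - y_c)^\top Q (y - y_c) - 1$, with both scaled so that $\theta^\top \theta = 1$.

```python
import math

def ellipse_to_linear(Q, center):
    """Map (Q, y_c) to (theta, alpha) with x^T theta - alpha = 0 on the ellipse."""
    q11, q12, q22 = Q[0][0], Q[0][1], Q[1][1]
    cx, cy = center
    # Linear terms come from -2 y_c^T Q, quadratic terms from y^T Q y.
    theta = [-2.0 * (cx * q11 + cy * q12),
             -2.0 * (cx * q12 + cy * q22),
             q11, 2.0 * q12, q22]
    alpha = 1.0 - (cx * (q11 * cx + q12 * cy) + cy * (q12 * cx + q22 * cy))
    # Remove the scale ambiguity: theta^T theta = 1 (alpha is scaled with it).
    norm = math.sqrt(sum(t * t for t in theta))
    return [t / norm for t in theta], alpha / norm

def carrier(x, y):
    """Carrier vector [x, y, x^2, xy, y^2] of an input point."""
    return [x, y, x * x, x * y, y * y]

def residual(theta, alpha, x, y):
    """Value of x^T theta - alpha for the point (x, y)."""
    return sum(c * t for c, t in zip(carrier(x, y), theta)) - alpha
```

For a noise-free point on the ellipse the residual is zero up to rounding, and the recovered $\theta$ satisfies $4\theta_3\theta_5 - \theta_4^2 > 0$.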

Besides $\theta^\top \theta = 1$, the ellipses also have to satisfy the positive definiteness condition $4\theta_3\theta_5 - \theta_4^2 > 0$ $(= 1)$.
The nonlinearity of the ellipses is a difficult problem. A carrier $x$ perturbed from its true value $x_o$ by noise with s.d. $\sigma$ does not have a zero-mean error,
$E(x - x_o) = [\,0 \;\; 0 \;\; \sigma^2 \;\; 0 \;\; \sigma^2\,]^\top$
since the carrier has $x^2$ and $y^2$ terms.
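The bias of the quadratic carrier terms follows from a one-line expansion. The sketch below assumes, as an illustration only, that the noise components $\delta_x$ and $\delta_y$ (symbols introduced here) are zero mean, independent and have variance $\sigma^2$, which matches the case $C_y = I$.

```latex
\begin{align*}
x &= x_o + \delta_x, \qquad y = y_o + \delta_y,
  \qquad E(\delta_x) = E(\delta_y) = 0, \quad E(\delta_x^2) = E(\delta_y^2) = \sigma^2 \\
E(x^2 - x_o^2) &= E(2 x_o \delta_x + \delta_x^2) = \sigma^2 \\
E(xy - x_o y_o) &= E(x_o \delta_y + y_o \delta_x + \delta_x \delta_y) = 0 \\
E(y^2 - y_o^2) &= E(2 y_o \delta_y + \delta_y^2) = \sigma^2
\end{align*}
```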

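Fundamental matrix. $\zeta = 1$.
Input: $[x \; y \; x' \; y']^\top \in \mathbb{R}^4$
Nonlin. obj. funct.: $y_i'^\top F y_i = [x_i' \; y_i' \; 1]\, F\, [x_i \; y_i \; 1]^\top \simeq 0$
Carrier: $x = [x \; y \; x' \; y' \; xx' \; xy' \; x'y \; yy']^\top \in \mathbb{R}^8$ gives the linear relation
$x_i^\top \theta - \alpha \simeq 0$, $i = 1, \ldots, n_1$.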
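To make the link between $F$ and $(\theta, \alpha)$ explicit, here is a small sketch. The ordering of $\theta$ is an assumption derived by expanding the bilinear form for the carrier order $[x, y, x', y', xx', xy', x'y, yy']$; the function names are hypothetical and the normalization $\theta^\top\theta = 1$ is omitted for brevity.

```python
def fundamental_to_linear(F):
    """Return (theta, alpha) for the carrier [x, y, x', y', xx', xy', x'y, yy']."""
    theta = [F[2][0], F[2][1], F[0][2], F[1][2],
             F[0][0], F[1][0], F[0][1], F[1][1]]
    alpha = -F[2][2]  # the constant term F33 is moved to the intercept
    return theta, alpha

def carrier(x, y, xp, yp):
    """Carrier of a point pair (x, y) <-> (xp, yp)."""
    return [x, y, xp, yp, x * xp, x * yp, xp * y, y * yp]

def linear_residual(F, x, y, xp, yp):
    """x^T theta - alpha, the linearized objective."""
    theta, alpha = fundamental_to_linear(F)
    return sum(c * t for c, t in zip(carrier(x, y, xp, yp), theta)) - alpha

def bilinear_residual(F, x, y, xp, yp):
    """[xp, yp, 1] F [x, y, 1]^T, the original objective."""
    left, right = [xp, yp, 1.0], [x, y, 1.0]
    return sum(left[a] * F[a][b] * right[b] for a in range(3) for b in range(3))
```

The two residuals agree for any matrix and any point pair, which is exactly the statement that the objective is linear in the carrier.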

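Homography. $\zeta = 2$.
Input: $[x \; y \; x' \; y']^\top \in \mathbb{R}^4$
Nonlinear obj. funct.: $y_i' \simeq H y_i$ or $[x_{hi}' \; y_{hi}' \; w_{hi}']^\top \simeq H [x_i \; y_i \; 1]^\top$
Direct linear transformation (DLT): with $\theta = h = \mathrm{vec}[H^\top]$ and $A_i$ a $2 \times 9$ matrix
$$A_i h = \begin{bmatrix} -y_{hi}^\top & 0_3^\top & x_i' y_{hi}^\top \\ 0_3^\top & -y_{hi}^\top & y_i' y_{hi}^\top \end{bmatrix} \begin{bmatrix} h_1 \\ h_2 \\ h_3 \end{bmatrix}$$
Carriers:
$x^{[1]} = [\,-x \;\; -y \;\; -1 \;\; 0 \;\; 0 \;\; 0 \;\; x'x \;\; x'y \;\; x'\,]^\top$
$x^{[2]} = [\,0 \;\; 0 \;\; 0 \;\; -x \;\; -y \;\; -1 \;\; y'x \;\; y'y \;\; y'\,]^\top$
give two linear relations $x_i^{[c]\top} h \simeq 0$, $c = 1, 2$, $i = 1, \ldots, n_1$.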
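The two DLT relations can be verified on a synthetic correspondence. This is a sketch under stated assumptions: the sign convention of the carriers is the common DLT one (the signs did not survive in the slide text), $h$ stacks the rows of $H$, and the function names are hypothetical.

```python
def dlt_carriers(x, y, xp, yp):
    """The two carriers of one correspondence (x, y) <-> (xp, yp)."""
    x1 = [-x, -y, -1.0, 0.0, 0.0, 0.0, xp * x, xp * y, xp]
    x2 = [0.0, 0.0, 0.0, -x, -y, -1.0, yp * x, yp * y, yp]
    return x1, x2

def project(H, x, y):
    """Apply the homography H to (x, y) and dehomogenize."""
    u, v, w = (H[r][0] * x + H[r][1] * y + H[r][2] for r in range(3))
    return u / w, v / w

def dlt_residuals(H, x, y, xp, yp):
    """Values of x^[c]T h for c = 1, 2, with h = vec[H^T] (rows of H stacked)."""
    h = [H[r][c] for r in range(3) for c in range(3)]
    return [sum(a * b for a, b in zip(xc, h)) for xc in dlt_carriers(x, y, xp, yp)]
```

For an exact correspondence generated with `project`, both residuals vanish up to rounding; with noisy inliers they are only approximately zero, which is what the relation $\simeq 0$ expresses.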

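Covariance of the carriers
The inliers at input have the same $l \times l$ covariance $\sigma^2 C_y$. The scale $\sigma$ is unknown and has to be estimated; $\det[C_y] = 1$. If there is no additional information, $C_y = I_{l \times l}$.
The $m \times m$ covariances of the carriers are
$$\sigma^2 C_i^{[c]} = \sigma^2 J_{x_i^{[c]}|y_i}\, C_y\, J_{x_i^{[c]}|y_i}^\top, \qquad c = 1, \ldots, \zeta$$
with the Jacobian matrix
$$J_{x|y} = \frac{\partial x(y)}{\partial y} = \begin{bmatrix} \frac{\partial x_1}{\partial y_1} & \cdots & \frac{\partial x_1}{\partial y_l} \\ \vdots & & \vdots \\ \frac{\partial x_m}{\partial y_1} & \cdots & \frac{\partial x_m}{\partial y_l} \end{bmatrix}$$
A carrier covariance depends on the input point.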

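Ellipse estimation. $\zeta = 1$. The $5 \times 2$ Jacobian gives the $5 \times 5$ covariance
$$J_{x_i|y_i}^\top = \begin{bmatrix} 1 & 0 & 2x_i & y_i & 0 \\ 0 & 1 & 0 & x_i & 2y_i \end{bmatrix}$$
Fundamental matrix. $\zeta = 1$. The $8 \times 4$ Jacobian matrix gives the $8 \times 8$ covariance
$$J_{x_i|y_i}^\top = \begin{bmatrix} 1 & 0 & 0 & 0 & x_i' & y_i' & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 & x_i' & y_i' \\ 0 & 0 & 1 & 0 & x_i & 0 & y_i & 0 \\ 0 & 0 & 0 & 1 & 0 & x_i & 0 & y_i \end{bmatrix}$$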
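As an illustration of how the Jacobian is used, the sketch below evaluates the ellipse Jacobian and the scalar $\theta^\top J J^\top \theta$, which is the carrier covariance projected on $\theta$ when $C_y = I$. The function names are hypothetical, and the identity $\theta^\top J J^\top \theta = \lVert J^\top \theta \rVert^2$ is used to avoid forming the $5 \times 5$ matrix.

```python
def ellipse_jacobian(x, y):
    """5 x 2 Jacobian of the carrier [x, y, x^2, xy, y^2] w.r.t. [x, y]."""
    return [[1.0, 0.0],
            [0.0, 1.0],
            [2.0 * x, 0.0],
            [y, x],
            [0.0, 2.0 * y]]

def projected_variance(theta, J):
    """theta^T J J^T theta for C_y = I, computed as ||J^T theta||^2."""
    cols = len(J[0])
    g = [sum(J[r][k] * theta[r] for r in range(len(theta))) for k in range(cols)]
    return sum(v * v for v in g)
```

For the unit circle, $\theta \propto [0, 0, 1, 0, 1]^\top$, the vector $J^\top\theta$ at a point is $[2x, 2y]^\top$, so the value at $(1, 0)$ with this unnormalized $\theta$ is 4.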

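Homography. $\zeta = 2$. The two $9 \times 4$ Jacobian matrices give the $9 \times 9$ covariances
$$J_{x_i^{[1]}|y_i}^\top = \begin{bmatrix} -1 & 0 & 0 & 0 & 0 & 0 & x_i' & 0 & 0 \\ 0 & -1 & 0 & 0 & 0 & 0 & 0 & x_i' & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & x_i & y_i & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$
$$J_{x_i^{[2]}|y_i}^\top = \begin{bmatrix} 0 & 0 & 0 & -1 & 0 & 0 & y_i' & 0 & 0 \\ 0 & 0 & 0 & 0 & -1 & 0 & 0 & y_i' & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & x_i & y_i & 1 \end{bmatrix}$$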

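1. Scale estimation for an inlier structure
$M$ elemental subsets are drawn, each containing the minimal number of points defining $\theta$. The $\alpha$ is computed from $\theta$.
If $\zeta > 1$, we still want only a one-dimensional null space, $\mathbb{R}$, so only one carrier $x^{[c]}$ should be taken into account for each point.
For each elemental subset $\theta$, $\alpha$, and each point $i$, consider only the largest Mahalanobis distance from $\alpha$
$$d_i^{[c]} = \frac{\left| x_i^{[c]\top} \theta - \alpha \right|}{\sqrt{\theta^\top C_i^{[c]} \theta}}, \qquad \tilde d_i = d_i^{[\tilde c_i]} = \max_{c = 1, \ldots, \zeta} d_i^{[c]}.$$
The corresponding carrier vector is $\tilde x_i$, the covariance matrix is $\tilde C_i$, and the variance in the null space is
$\tilde H_i = \theta^\top \tilde C_i \theta = \theta^\top J_{\tilde x_i|y_i} J_{\tilde x_i|y_i}^\top \theta$.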
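The per-point distance can be sketched as follows, assuming $C_y = I$ so that $\theta^\top C_i^{[c]} \theta = \lVert J^\top \theta \rVert^2$. The function names are hypothetical, and the sketch does not guard against a zero projected variance.

```python
import math

def mahalanobis_1d(carrier, jac, theta, alpha):
    """|x^T theta - alpha| / sqrt(theta^T J J^T theta) for one carrier."""
    proj = sum(c * t for c, t in zip(carrier, theta)) - alpha
    g = [sum(jac[r][k] * theta[r] for r in range(len(theta)))
         for k in range(len(jac[0]))]
    return abs(proj) / math.sqrt(sum(v * v for v in g))

def largest_distance(carriers, jacs, theta, alpha):
    """Largest distance over the zeta carriers of a point, and its index."""
    d = [mahalanobis_1d(x, J, theta, alpha) for x, J in zip(carriers, jacs)]
    c_best = max(range(len(d)), key=d.__getitem__)
    return d[c_best], c_best
```

For a line the carrier equals the input point and the Jacobian is the identity, so with $\theta = [0, 1]^\top$ and $\alpha = 0$ the distance of the point $(3, 2)$ is simply 2.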

For each elemental subset, sort the Mahalanobis distances in ascending order, $\tilde d_{[i]}$.
Take the $n_{r1} \ll n$ points from
$$\min_{M} \sum_{i=1}^{n_{r1}} \tilde d_{[i]}.$$
If $M$ is large, this minimum almost surely comes from an inlier structure. This is the initial set, having the intercept $\alpha_m$.
Taking $n_{r1} = 0.05n$, or $\epsilon = 5\%$, is enough, as we will see.
All structures use $n_{r1} = 0.05n$, where $n$ is the total number of input points.
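The selection of the initial set can be sketched in a few lines. The input `distance_lists` is an assumed data layout (one list of per-point distances for each of the $M$ elemental subsets), and the function name is hypothetical.

```python
def select_initial_set(distance_lists, n_r1):
    """Index of the elemental subset minimizing the sum of its n_r1
    smallest distances, together with that sum."""
    best_index, best_sum = None, float("inf")
    for m, distances in enumerate(distance_lists):
        partial = sum(sorted(distances)[:n_r1])
        if partial < best_sum:
            best_index, best_sum = m, partial
    return best_index, best_sum
```

Sorting every list costs $O(n \log n)$ per trial; a partial selection would be cheaper, but the plain sort keeps the sketch close to the description above.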

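The initial set has $n_{r1}$ points, with the largest distance from $\alpha_m$ being $d_{[r1]}$.
Increase the distance to $2 d_{[r1]} = d_{[r2]}$. Find $n_{r2}$. Continue...
Assume that $b\, d_{[r1]} = d_{[rb]}$ satisfies
$$2\left[ n_{rb} - n_{r(b-1)} \right] \leq \frac{n_{r(b-1)}}{b - 1}, \qquad \hat\sigma = d_{[rb]}$$
The criterion is loose because the border of an inlier structure is not precise.
[Figure panels: $y = x \in \mathbb{R}^2$ and $x \in \mathbb{R}^5$; the search is in $\mathbb{R}$, the null space.]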
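The expansion can be written as a short loop. This is only a sketch of the rule as it reads on the slide: the inequality sign ($\leq$) and the choice $\hat\sigma = d_{[rb]}$ at the first $b$ that satisfies it are assumptions, and `estimate_scale`, `eps` and `max_steps` are hypothetical names.

```python
import bisect

def estimate_scale(distances, eps=0.05, max_steps=1000):
    """Expand the window b * d_[r1] until the newly added points are fewer
    than half of the average number of points per previous expansion."""
    d = sorted(distances)
    n_r1 = max(1, int(round(eps * len(d))))
    d_r1 = d[n_r1 - 1]  # largest distance inside the initial set
    if d_r1 <= 0:
        raise ValueError("initial set has zero extent")
    n_prev = bisect.bisect_right(d, d_r1)  # points within d_[r1], ties included
    for b in range(2, max_steps + 1):
        n_b = bisect.bisect_right(d, b * d_r1)  # points within b * d_[r1]
        if 2 * (n_b - n_prev) <= n_prev / (b - 1):
            return b * d_r1, n_b  # scale estimate and points inside it
        n_prev = n_b
    raise RuntimeError("expansion did not stop within max_steps")
```

With 100 evenly spaced inlier distances and 100 far outliers, the window keeps growing while each step adds as many points as the previous ones and stops at the first step that adds none.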

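2. Inlier estimation with mean shift
$N = M/10$ new elemental subsets are drawn from the inlier points found in the previous step.
For each elemental subset, do mean shift in one dimension with the profile $\kappa(u) = K(u^2)$, $u \geq 0$,
$$[\hat\theta, \hat\alpha] = \arg\max_{\theta, \alpha} \frac{1}{n\hat\sigma} \sum_{i=1}^{n} \kappa\!\left( (z - z_i)^\top B_i^{-1} (z - z_i) \right) = \arg\max_{\theta} \left[ \arg\max_{z} \frac{1}{n\hat\sigma} \sum_{i=1}^{n} \kappa\!\left( (z - z_i)^\top B_i^{-1} (z - z_i) \right) \right]$$
having the variance $B_i = \hat\sigma^2 \tilde H_i = \hat\sigma^2 \theta^\top \tilde C_i \theta$ and with all the points $z_i = \tilde x_i^\top \theta$ still present.
For each $\theta$, move from $z = \alpha$ to the closest mode $\hat z$.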

The profile of the Epanechnikov kernel is
$$\kappa(u) = \begin{cases} 1 - u & (z - z_i)^\top B_i^{-1} (z - z_i) \leq 1 \\ 0 & (z - z_i)^\top B_i^{-1} (z - z_i) > 1 \end{cases}$$
and $g(u) = -\kappa'(u) = 1$ for $u \leq 1$ and zero for $u > 1$.
An iteration around $z = z_{old}$ gives the new value
$$z_{new} = \left[ \sum_{i=1}^{n} g(u) \right]^{-1} \left[ \sum_{i=1}^{n} g(u)\, z_i \right]$$
where only the points inside the Epanechnikov kernel around $z_{old}$ are taken into account.
The largest mode is $\hat\alpha$, and the corresponding elemental subset gives $\hat\theta$.
The number of inliers is $n_{in}$; the TLS estimates are $\hat\theta_{tls}$ and $\hat\alpha_{tls}$. This is the recovered inlier structure.
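Because $g$ is 1 inside the kernel and 0 outside, each iteration reduces to averaging the projections that fall inside the window. A minimal one-dimensional sketch, with hypothetical names and with `tol` and `max_iter` as assumed stopping parameters:

```python
def mean_shift_1d(z_start, z, B, tol=1e-9, max_iter=100):
    """Move z_start to the nearest mode of the projections z, where B[i]
    is the variance attached to z[i] and the window is (zc - z_i)^2 / B_i <= 1."""
    zc = z_start
    for _ in range(max_iter):
        inside = [zi for zi, bi in zip(z, B) if (zc - zi) ** 2 / bi <= 1.0]
        if not inside:
            return zc  # empty window: nothing to shift towards
        z_new = sum(inside) / len(inside)  # g = 1 for every point kept
        if abs(z_new - zc) < tol:
            return z_new
        zc = z_new
    return zc
```

Starting from two different intercepts on data with two well separated clusters leads to two different modes, which is why the procedure is repeated over the $N$ elemental subsets and the largest mode is kept.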

3. Strength based merge process
The strength $s$ of a structure is defined as
$$\bar d = \frac{\sum_{i=1}^{n_{in}} \tilde d_i}{n_{in}}, \qquad s = \frac{n_{in}}{\bar d} = \frac{n_{in}^2}{\sum_{i=1}^{n_{in}} \tilde d_i}.$$
The $j$th structure has $n_j$ inliers and strength $s_j$.
Before it, $l = 1, \ldots, (j-1)$ structures were detected, with $n_l$ inliers and strength $s_l$.
The TLS estimate for $j$ and $l$ together gives the strength $s_{j,l}$, and the two will be fused if
$$s_{j,l} > \frac{n_j s_j + n_l s_l}{n_j + n_l}$$
for an $l$.
The merge is done in the linear space.
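The strength and the fusion test translate directly into code. A minimal sketch with hypothetical function names; the joint strength `s_joint` is assumed to come from a TLS fit of the two structures together, which is not shown here.

```python
def strength(distances):
    """s = n_in / mean(d) = n_in^2 / sum(d) for the inlier distances."""
    n_in = len(distances)
    return n_in * n_in / sum(distances)

def should_merge(s_joint, n_j, s_j, n_l, s_l):
    """Fuse when the joint strength exceeds the inlier-weighted average
    of the two individual strengths."""
    return s_joint > (n_j * s_j + n_l * s_l) / (n_j + n_l)
```

A structure with many inliers packed close to the model has a high strength, so a fusion is accepted only when the joint fit is denser than the two parts are on average.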

300 inliers. 300 outliers. Gaussian inlier noise $\sigma_g = 20$. $M = 500$.
Three similar inlier structures. Final structure: $\hat\sigma = 46.1$ and 329 inliers.
To merge two linear objective functions, a small angle and a small average distance between the two are enough. The input vector and the carrier vector are the same.

The merge of two nonlinear objective functions should be done in the input space and not in the linear space. This was not done here!
For two ellipses the overlap area can be computed. They merge if the shared area is large and the major axes are close to each other.
For fundamental matrices (or homographies), first recover the Euclidean geometry, which also means processing in 3D. At least three images are required if there is no additional knowledge.
Depth is a strong separator, as we will also see in a homography example.

Final classification
Continue until there are not enough points left to start another initial set.
The structures are sorted by their strengths in descending order. The inlier structures will always be at the beginning of the list.
The user can specify where to stop and retain only the inliers, which form the denser structures.

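$\theta_1 x_i + \theta_2 y_i - \alpha \simeq 0$, $i = 1, \ldots, n_{in}$
$M = 300$ for all multiple line estimations.
[Figure: three lines plus unstructured outliers; one line has 100 points, another has 300 points with $\sigma_g = 20$. Table: scale, structure and strength of the recovered structures.]
The correct inliers are recovered 97 times out of 100. Three times the 100-point line was not recovered.
[Table: scale and number of inliers, each given as value ± deviation, for the three lines.]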

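$\theta_1 x_i + \theta_2 y_i - \alpha \simeq 0$, $i = 1, \ldots, n_{in}$
[Figure: three lines plus unstructured outliers; one line has 100 points, another has 300 points with $\sigma_g = 5$. Table: scale, structure and strength of the recovered structures.]
The correct inliers are recovered 96 times out of 100. Four times the top structure was not recovered.
[Table: scale and number of inliers, each given as value ± deviation, for the three lines; the first scale value is 8.63.]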

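$(y_i - y_c)^\top Q (y_i - y_c) - 1 \simeq 0$, $i = 1, \ldots, n_{in}$
$M = 2000$ for all multiple ellipse estimations.
[Figure: three ellipses plus unstructured outliers; one ellipse has 200 points, another has 400 points with $\sigma_g = 9$. Table: scale, structure and strength of the recovered structures, labeled red, green, blue, cyan and magenta.]
The correct inliers are recovered 98 times out of 100. Two times the 200-point ellipse was not recovered.
[Table: scale and number of inliers, each given as value ± deviation, for the three ellipses; the scale row starts with 7.83 and ends with 4.37.]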

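$(y_i - y_c)^\top Q (y_i - y_c) - 1 \simeq 0$, $i = 1, \ldots, n_{in}$
[Figure: three ellipses plus unstructured outliers; one ellipse has 200 points, another has 400 points with $\sigma_g = 3$. Table: scale, structure and strength of the recovered structures, labeled red, green, blue, cyan and magenta.]
The correct inliers are recovered 95 times out of 100. Five times the 200-point ellipse was not recovered.
[Table: scale and number of inliers, each given as value ± deviation, for the three ellipses; the scale row starts with 6.34 and ends with 5.72.]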

Ellipse estimation in a real image
The EDISON segmentation system, based on the mean shift, is applied (top/right) with the default spatial bandwidth $\sigma_s = 7$ and range bandwidth $\sigma_r = 6.5$.
Canny edge detection (bottom/left) is run with the thresholds 100 and 200. The strongest three ellipses are drawn superimposed over the edges.
The ellipses drawn superimposed over the original image (bottom/right) are correct.

Fundamental matrix estimation.
$M = 1000$ for all fundamental matrix estimations.
The 546 point pairs were extracted with SIFT, using the default distance ratio of 0.8.
The first three structures are
160 pairs with $\hat\sigma_1 = 0.42$ (red)
147 pairs with $\hat\sigma_2 = 0.59$ (green)
60 pairs with $\hat\sigma_3 = 0.34$ (blue).

Fundamental matrix estimation.
SIFT finds 173 point correspondences.
The first structure is 70 pairs with $\hat\sigma =$ ...

Homography estimation
$M = 1000$ for all homography estimations.
SIFT finds 168 point correspondences.
The first two structures are
66 pairs with $\hat\sigma_1 = 1.37$ (red)
34 pairs with $\hat\sigma_2 = 4.3$ (green).

Homography estimation
SIFT finds 495 correspondences.
The first three structures are
160 pairs with $\hat\sigma_1 = 1.25$ (red)
98 pairs with $\hat\sigma_2 = 1.59$ (green)
121 pairs with $\hat\sigma_3 = 1.77$ (blue).

Conditions for robustness
For the algorithm to be robust, four parameters interact in a complex manner:
$M$, the number of trials;
$n_{out}$, the amount of outliers;
$\hat\sigma$, the noise of the inlier structure;
$\epsilon$, the fraction of points in the initial set.

200 inliers, $\sigma_g =$ ... ; ... outliers.
Change the parameters, one at a time:
$n_{out} = 100, 400, 800$
$\sigma_g = 3, 9, 15$
$M =$ ...
$\epsilon = 1$ to $40\%$.
In each condition, do 100 trials and return the average ratio of the counted true inliers to the total number of points classified as inliers.

How important is the number of trials $M$?
[Plots: results against $M$ for varying $n_{out}$ and for varying $\sigma_g$.]
Not much, once $M$ exceeds some value that depends on the input data.
Increasing the outliers to $n_{out} = 900$ ($\sigma_g = 9$) has a stronger effect than increasing the noise ($n_{out} = 400$) to $\sigma_g =$ ...

The initial set should be quasi-correct.
$M =$ ...
$\epsilon = 5\%$ is equivalent to $n_{r1} = 30$ points.
[First panel] From the initial set, 24 points are between $\pm 2\sigma_g = \pm 18$.
[Second panel] From the initial set, 29 points are between $\pm 18$.

Changing the starting ratio.
$M =$ ...
[Plot: average real initial set with automatic increase, $\sigma_g = 9$.]
[Plots: $\sigma_g$ changing, initial sets and final results.]
Taking $\epsilon = 5\%$ is enough most of the time.

Robustness of the algorithm
If the input data is preprocessed and part of the outliers are eliminated, stronger inlier noise will be tolerated.
The number of trials $M$ does not matter beyond a value that depends on the input data.
$\epsilon = 5\%$ is generally enough as the starting point.
If the outliers are three or four times more numerous than the inliers, the algorithm will usually still deliver.
If not all the inlier structures are returned because the data was too challenging, the inlier structures that are recovered are still correct.

Thank You!

Open problems...
Estimating the $m \times k$ matrix $\Theta$ and a $k$-dimensional vector $\alpha$: is reducing the problem to $k$ independent runs of the algorithm enough?
An image contains both lines and conics together with outliers. How do you approach it if all inlier structures should be recovered?
You have to robustly process a very large image. Will hierarchical processing, from many small images to the large image, find all the relevant inlier structures?
The covariances of the input points are not equal. First estimate all the inlier structures with the same $\hat\sigma$. After that, do scale estimation for each structure separately. Will this procedure work all the time?

Nonrobust and Robust Objective Functions

Nonrobust and Robust Objective Functions Nonrobust and Robust Objective Functions The objective function of the estimators in the input space is built from the sum of squared Mahalanobis distances (residuals) d 2 i = 1 σ 2(y i y io ) C + y i

More information

Robust Estimation for Computer Vision using Grassmann Manifolds

Robust Estimation for Computer Vision using Grassmann Manifolds Robust Estimation for Computer Vision using Grassmann Manifolds Saket Anand 1, Sushil Mittal 2 and Peter Meer 3 Abstract Real-world visual data is often corrupted and requires the use of estimation techniques

More information

Linear Dimensionality Reduction

Linear Dimensionality Reduction Outline Hong Chang Institute of Computing Technology, Chinese Academy of Sciences Machine Learning Methods (Fall 2012) Outline Outline I 1 Introduction 2 Principal Component Analysis 3 Factor Analysis

More information

Minimum Hellinger Distance Estimation in a. Semiparametric Mixture Model

Minimum Hellinger Distance Estimation in a. Semiparametric Mixture Model Minimum Hellinger Distance Estimation in a Semiparametric Mixture Model Sijia Xiang 1, Weixin Yao 1, and Jingjing Wu 2 1 Department of Statistics, Kansas State University, Manhattan, Kansas, USA 66506-0802.

More information

nonrobust estimation The n measurement vectors taken together give the vector X R N. The unknown parameter vector is P R M.

nonrobust estimation The n measurement vectors taken together give the vector X R N. The unknown parameter vector is P R M. Introduction to nonlinear LS estimation R. I. Hartley and A. Zisserman: Multiple View Geometry in Computer Vision. Cambridge University Press, 2ed., 2004. After Chapter 5 and Appendix 6. We will use x

More information

Data Mining. Dimensionality reduction. Hamid Beigy. Sharif University of Technology. Fall 1395

Data Mining. Dimensionality reduction. Hamid Beigy. Sharif University of Technology. Fall 1395 Data Mining Dimensionality reduction Hamid Beigy Sharif University of Technology Fall 1395 Hamid Beigy (Sharif University of Technology) Data Mining Fall 1395 1 / 42 Outline 1 Introduction 2 Feature selection

More information

Deep Learning: Approximation of Functions by Composition

Deep Learning: Approximation of Functions by Composition Deep Learning: Approximation of Functions by Composition Zuowei Shen Department of Mathematics National University of Singapore Outline 1 A brief introduction of approximation theory 2 Deep learning: approximation

More information

Lecture 8: Interest Point Detection. Saad J Bedros

Lecture 8: Interest Point Detection. Saad J Bedros #1 Lecture 8: Interest Point Detection Saad J Bedros Last Lecture : Edge Detection Preprocessing of image is desired to eliminate or at least minimize noise effects There is always tradeoff

More information

Algorithms for Computing a Planar Homography from Conics in Correspondence

Algorithms for Computing a Planar Homography from Conics in Correspondence Algorithms for Computing a Planar Homography from Conics in Correspondence Juho Kannala, Mikko Salo and Janne Heikkilä Machine Vision Group University of Oulu, Finland {jkannala, msa,} Abstract

More information



More information

Kernel Methods. Charles Elkan October 17, 2007

Kernel Methods. Charles Elkan October 17, 2007 Kernel Methods Charles Elkan October 17, 2007 Remember the xor example of a classification problem that is not linearly separable. If we map every example into a new representation, then

More information

Bayesian Optimization

Bayesian Optimization Practitioner Course: Portfolio October 15, 2008 No introduction to portfolio optimization would be complete without acknowledging the significant contribution of the Markowitz mean-variance efficient frontier

More information

Conditions for Segmentation of Motion with Affine Fundamental Matrix

Conditions for Segmentation of Motion with Affine Fundamental Matrix Conditions for Segmentation of Motion with Affine Fundamental Matrix Shafriza Nisha Basah 1, Reza Hoseinnezhad 2, and Alireza Bab-Hadiashar 1 Faculty of Engineering and Industrial Sciences, Swinburne University

More information

Machine Learning (CS 567) Lecture 5

Machine Learning (CS 567) Lecture 5 Machine Learning (CS 567) Lecture 5 Time: T-Th 5:00pm - 6:20pm Location: GFS 118 Instructor: Sofus A. Macskassy ( Office: SAL 216 Office hours: by appointment Teaching assistant: Cheol

More information

PCA and LDA. Man-Wai MAK

PCA and LDA. Man-Wai MAK PCA and LDA Man-Wai MAK Dept. of Electronic and Information Engineering, The Hong Kong Polytechnic University mwmak References: S.J.D. Prince,Computer

More information

Lecture 8: Interest Point Detection. Saad J Bedros

Lecture 8: Interest Point Detection. Saad J Bedros #1 Lecture 8: Interest Point Detection Saad J Bedros Review of Edge Detectors #2 Today s Lecture Interest Points Detection What do we mean with Interest Point Detection in an Image Goal:

More information

1. Projective geometry

1. Projective geometry 1. Projective geometry Homogeneous representation of points and lines in D space D projective space Points at infinity and the line at infinity Conics and dual conics Projective transformation Hierarchy

More information

Motion Estimation (I) Ce Liu Microsoft Research New England

Motion Estimation (I) Ce Liu Microsoft Research New England Motion Estimation (I) Ce Liu Microsoft Research New England We live in a moving world Perceiving, understanding and predicting motion is an important part of our daily lives Motion

More information

Data Preprocessing. Cluster Similarity

Data Preprocessing. Cluster Similarity 1 Cluster Similarity Similarity is most often measured with the help of a distance function. The smaller the distance, the more similar the data objects (points). A function d: M M R is a distance on M

More information

Lecture Notes 1: Vector spaces

Lecture Notes 1: Vector spaces Optimization-based data analysis Fall 2017 Lecture Notes 1: Vector spaces In this chapter we review certain basic concepts of linear algebra, highlighting their application to signal processing. 1 Vector

More information

Lecture 22: Section 4.7

Lecture 22: Section 4.7 Lecture 22: Section 47 Shuanglin Shao December 2, 213 Row Space, Column Space, and Null Space Definition For an m n, a 11 a 12 a 1n a 21 a 22 a 2n A = a m1 a m2 a mn, the vectors r 1 = [ a 11 a 12 a 1n

More information

Machine Learning, Fall 2009: Midterm

Machine Learning, Fall 2009: Midterm 10-601 Machine Learning, Fall 009: Midterm Monday, November nd hours 1. Personal info: Name: Andrew account: E-mail address:. You are permitted two pages of notes and a calculator. Please turn off all

More information

Method 1: Geometric Error Optimization

Method 1: Geometric Error Optimization Method 1: Geometric Error Optimization we need to encode the constraints ŷ i F ˆx i = 0, rank F = 2 idea: reconstruct 3D point via equivalent projection matrices and use reprojection error equivalent projection

More information

A Statistical Analysis of Fukunaga Koontz Transform

A Statistical Analysis of Fukunaga Koontz Transform 1 A Statistical Analysis of Fukunaga Koontz Transform Xiaoming Huo Dr. Xiaoming Huo is an assistant professor at the School of Industrial and System Engineering of the Georgia Institute of Technology,

More information

ORIE 6326: Convex Optimization. Quasi-Newton Methods

ORIE 6326: Convex Optimization. Quasi-Newton Methods ORIE 6326: Convex Optimization Quasi-Newton Methods Professor Udell Operations Research and Information Engineering Cornell April 10, 2017 Slides on steepest descent and analysis of Newton s method adapted

More information

Lecture 3: Pattern Classification. Pattern classification

Lecture 3: Pattern Classification. Pattern classification EE E68: Speech & Audio Processing & Recognition Lecture 3: Pattern Classification 3 4 5 The problem of classification Linear and nonlinear classifiers Probabilistic classification Gaussians, mitures and

More information

Clustering by Mixture Models. General background on clustering Example method: k-means Mixture model based clustering Model estimation

Clustering by Mixture Models. General background on clustering Example method: k-means Mixture model based clustering Model estimation Clustering by Mixture Models General bacground on clustering Example method: -means Mixture model based clustering Model estimation 1 Clustering A basic tool in data mining/pattern recognition: Divide

More information

The Expectation-Maximization Algorithm

The Expectation-Maximization Algorithm 1/29 EM & Latent Variable Models Gaussian Mixture Models EM Theory The Expectation-Maximization Algorithm Mihaela van der Schaar Department of Engineering Science University of Oxford MLE for Latent Variable

More information

Lecture 4: Probabilistic Learning

Lecture 4: Probabilistic Learning DD2431 Autumn, 2015 1 Maximum Likelihood Methods Maximum A Posteriori Methods Bayesian methods 2 Classification vs Clustering Heuristic Example: K-means Expectation Maximization 3 Maximum Likelihood Methods

More information

PCA and LDA. Man-Wai MAK

PCA and LDA. Man-Wai MAK PCA and LDA Man-Wai MAK Dept. of Electronic and Information Engineering, The Hong Kong Polytechnic University mwmak References: S.J.D. Prince,Computer

More information

MFM Practitioner Module: Risk & Asset Allocation. John Dodson. February 18, 2015

MFM Practitioner Module: Risk & Asset Allocation. John Dodson. February 18, 2015 MFM Practitioner Module: Risk & Asset Allocation February 18, 2015 No introduction to portfolio optimization would be complete without acknowledging the significant contribution of the Markowitz mean-variance

More information

Kalman Filter. Predict: Update: x k k 1 = F k x k 1 k 1 + B k u k P k k 1 = F k P k 1 k 1 F T k + Q

Kalman Filter. Predict: Update: x k k 1 = F k x k 1 k 1 + B k u k P k k 1 = F k P k 1 k 1 F T k + Q Kalman Filter Kalman Filter Predict: x k k 1 = F k x k 1 k 1 + B k u k P k k 1 = F k P k 1 k 1 F T k + Q Update: K = P k k 1 Hk T (H k P k k 1 Hk T + R) 1 x k k = x k k 1 + K(z k H k x k k 1 ) P k k =(I

More information

Introduction to Machine Learning

Introduction to Machine Learning 10-701 Introduction to Machine Learning PCA Slides based on 18-661 Fall 2018 PCA Raw data can be Complex, High-dimensional To understand a phenomenon we measure various related quantities If we knew what

More information

Lecture 4.3 Estimating homographies from feature correspondences. Thomas Opsahl

Lecture 4.3 Estimating homographies from feature correspondences. Thomas Opsahl Lecture 4.3 Estimating homographies from feature correspondences Thomas Opsahl Homographies induced by central projection 1 H 2 1 H 2 u uu 2 3 1 Homography Hu = u H = h 1 h 2 h 3 h 4 h 5 h 6 h 7 h 8 h

More information

Solving Corrupted Quadratic Equations, Provably

Solving Corrupted Quadratic Equations, Provably Solving Corrupted Quadratic Equations, Provably Yuejie Chi London Workshop on Sparse Signal Processing September 206 Acknowledgement Joint work with Yuanxin Li (OSU), Huishuai Zhuang (Syracuse) and Yingbin

More information

Vision par ordinateur

Vision par ordinateur Vision par ordinateur Géométrie épipolaire Frédéric Devernay Avec des transparents de Marc Pollefeys Epipolar geometry π Underlying structure in set of matches for rigid scenes C1 m1 l1 M L2 L1 l T 1 l

More information

Augmented Reality VU numerical optimization algorithms. Prof. Vincent Lepetit

Augmented Reality VU numerical optimization algorithms. Prof. Vincent Lepetit Augmented Reality VU numerical optimization algorithms Prof. Vincent Lepetit P3P: can exploit only 3 correspondences; DLT: difficult to exploit the knowledge of the internal parameters correctly, and the

More information

Sparsity Models. Tong Zhang. Rutgers University. T. Zhang (Rutgers) Sparsity Models 1 / 28

Sparsity Models. Tong Zhang. Rutgers University. T. Zhang (Rutgers) Sparsity Models 1 / 28 Sparsity Models Tong Zhang Rutgers University T. Zhang (Rutgers) Sparsity Models 1 / 28 Topics Standard sparse regression model algorithms: convex relaxation and greedy algorithm sparse recovery analysis:

More information



More information

Pose estimation from point and line correspondences

Pose estimation from point and line correspondences Pose estimation from point and line correspondences Giorgio Panin October 17, 008 1 Problem formulation Estimate (in a LSE sense) the pose of an object from N correspondences between known object points

More information

Dominant Feature Vectors Based Audio Similarity Measure

Dominant Feature Vectors Based Audio Similarity Measure Dominant Feature Vectors Based Audio Similarity Measure Jing Gu 1, Lie Lu 2, Rui Cai 3, Hong-Jiang Zhang 2, and Jian Yang 1 1 Dept. of Electronic Engineering, Tsinghua Univ., Beijing, 100084, China 2 Microsoft

More information

Linear Regression and Its Applications

Linear Regression and Its Applications Linear Regression and Its Applications Predrag Radivojac October 13, 2014 Given a data set D = {(x i, y i )} n the objective is to learn the relationship between features and the target. We usually start

More information

Detection of signal transitions by order statistics filtering

Detection of signal transitions by order statistics filtering Detection of signal transitions by order statistics filtering A. Raji Images, Signals and Intelligent Systems Laboratory Paris-Est Creteil University, France Abstract In this article, we present a non

More information

Pattern Recognition and Machine Learning

Pattern Recognition and Machine Learning Christopher M. Bishop Pattern Recognition and Machine Learning ÖSpri inger Contents Preface Mathematical notation Contents vii xi xiii 1 Introduction 1 1.1 Example: Polynomial Curve Fitting 4 1.2 Probability

More information

Lecture 3: Pattern Classification

Lecture 3: Pattern Classification EE E6820: Speech & Audio Processing & Recognition Lecture 3: Pattern Classification 1 2 3 4 5 The problem of classification Linear and nonlinear classifiers Probabilistic classification Gaussians, mixtures

More information

IV. Matrix Approximation using Least-Squares

IV. Matrix Approximation using Least-Squares IV. Matrix Approximation using Least-Squares The SVD and Matrix Approximation We begin with the following fundamental question. Let A be an M N matrix with rank R. What is the closest matrix to A that

More information

Motion Estimation (I)

Motion Estimation (I) Motion Estimation (I) Ce Liu Microsoft Research New England We live in a moving world Perceiving, understanding and predicting motion is an important part of our daily lives Motion

More information

One Picture and a Thousand Words Using Matrix Approximtions October 2017 Oak Ridge National Lab Dianne P. O Leary c 2017

One Picture and a Thousand Words Using Matrix Approximtions October 2017 Oak Ridge National Lab Dianne P. O Leary c 2017 One Picture and a Thousand Words Using Matrix Approximtions October 2017 Oak Ridge National Lab Dianne P. O Leary c 2017 1 One Picture and a Thousand Words Using Matrix Approximations Dianne P. O Leary

More information

CS 195-5: Machine Learning Problem Set 1

CS 195-5: Machine Learning Problem Set 1 CS 95-5: Machine Learning Problem Set Douglas Lanman 7 September Regression Problem Show that the prediction errors y f(x; ŵ) are necessarily uncorrelated with any linear function of

More information

1 Probability Model. 1.1 Types of models to be discussed in the course

1 Probability Model. 1.1 Types of models to be discussed in the course Sufficiency January 18, 016 Debdeep Pati 1 Probability Model Model: A family of distributions P θ : θ Θ}. P θ (B) is the probability of the event B when the parameter takes the value θ. P θ is described

More information

SLAM Techniques and Algorithms. Jack Collier. Canada. Recherche et développement pour la défense Canada. Defence Research and Development Canada

SLAM Techniques and Algorithms. Jack Collier. Canada. Recherche et développement pour la défense Canada. Defence Research and Development Canada SLAM Techniques and Algorithms Jack Collier Defence Research and Development Canada Recherche et développement pour la défense Canada Canada Goals What will we learn Gain an appreciation for what SLAM

More information

Grassmann Averages for Scalable Robust PCA Supplementary Material

Grassmann Averages for Scalable Robust PCA Supplementary Material Grassmann Averages for Scalable Robust PCA Supplementary Material Søren Hauberg DTU Compute Lyngby, Denmark Aasa Feragen DIKU and MPIs Tübingen Denmark and Germany Michael J.

More information

Linear Algebra, Summer 2011, pt. 3

Linear Algebra, Summer 2011, pt. 3 Linear Algebra, Summer 011, pt. 3 September 0, 011 Contents 1 Orthogonality. 1 1.1 The length of a vector....................... 1. Orthogonal vectors......................... 3 1.3 Orthogonal Subspaces.......................

More information

Machine Learning. Dimensionality reduction. Hamid Beigy. Sharif University of Technology. Fall 1395
