To appear in Optimization Methods and Software

ROBUST LINEAR PROGRAMMING DISCRIMINATION OF TWO LINEARLY INSEPARABLE SETS

KRISTIN P. BENNETT and O. L. MANGASARIAN
Computer Sciences Department, University of Wisconsin, 1210 West Dayton Street, Madison, WI 53706

A single linear programming formulation is proposed which generates a plane that minimizes an average sum of misclassified points belonging to two disjoint point sets in n-dimensional real space. When the convex hulls of the two sets are also disjoint, the plane completely separates the two sets. When the convex hulls intersect, our linear program, unlike all previously proposed linear programs, is guaranteed to generate some error-minimizing plane, without the imposition of extraneous normalization constraints that inevitably fail to handle certain cases. The effectiveness of the proposed linear program has been demonstrated by successfully testing it on a number of databases. In addition, it has been used in conjunction with the multisurface method of piecewise-linear separation to train a feed-forward neural network with a single hidden layer.

KEY WORDS: Linear Programming, Pattern Recognition, Neural Networks

1 INTRODUCTION

We consider the two point sets A and B in the n-dimensional real space R^n, represented by the m × n matrix A and the k × n matrix B respectively. Our principal objective here is to formulate a single linear program with the following properties:

(i) If the convex hulls of A and B are disjoint, a strictly separating plane is obtained.
(ii) If the convex hulls of A and B intersect, a plane is obtained that minimizes some measure of misclassified points, for all possible cases.
(iii) No extraneous constraints are imposed on the linear program that rule out any specific case from consideration.

Most linear programming formulations [6,12,5,4] have property (i); however, none to our knowledge has properties (ii) and (iii). For example, the linear program of Mangasarian [6] fails to satisfy property (ii) for all linearly inseparable cases, while Smith's linear program [12] fails to satisfy (ii) when uniform weights are used in its objective function as originally proposed. The linear programs [5,4] fail to satisfy both (ii) and (iii). Our linear programming formulation, on the other hand, has all three properties (i), (ii) and (iii).

It is interesting to note that our proposed linear program (2.11) will always generate some error-minimizing plane even in the usually troublesome case when the means of the two sets are identical. For this case, among the possible solutions to our linear program is the null solution w = 0. However, this null solution is never unique for our linear program and thus a useful alternative solution is always available. For example, such an alternative, the 45° line, is obtained computationally by our linear program for the classical counterexample of linear inseparability: the Exclusive-Or example. (See Example 2.7 below.)

We outline our results now. In Section 2 we state our linear program (2.11) and establish that it possesses properties (i)-(iii) above in Theorems 2.5 and 2.6. On the other hand, in Example 2.8 we show that a point (w̄ = 0, γ̄) uniquely solves Smith's linear program ((2.10b) with λ = μ = 1/(m+k)) and hence property (ii) is violated. Similarly, in Remark 2.9 we give an example which violates property (ii) for Grinold's linear program [5] (2.20) and give conditions under which this is always true. In Section 3 we report on some computational results using our proposed linear program on the Wisconsin Breast Cancer Database and the Cleveland Heart Disease Database. See also Bennett-Mangasarian [1] for other computational results using linear programming on these databases.

A word about our notation now. For a vector x in the n-dimensional real space R^n, x_+ will denote the vector in R^n with components (x_+)_i := max{x_i, 0}, i = 1,...,n. The notation A ∈ R^{m×n} will signify a real m × n matrix. For such a matrix, A' will denote the transpose while A_i will denote the ith row. The 1-norm of x, Σ_{i=1}^n |x_i|, will be denoted by ||x||_1, while the ∞-norm of x, max_{1≤i≤n} |x_i|, will be denoted by ||x||_∞. A vector of ones in a real space of arbitrary dimension will be denoted by e.

2 A ROBUST LINEAR PROGRAMMING SEPARATION

Our linear program is based on the following error-minimizing optimization problem

(2.1)    min_{w,γ}  (1/m) ||(-Aw + eγ + e)_+||_1 + (1/k) ||(Bw - eγ + e)_+||_1

where A ∈ R^{m×n} represents the m points of the set A, B ∈ R^{k×n} represents the k points of the set B, w is the n-dimensional "weight" vector representing the normal to the optimal "separating" plane, and the real number γ is a threshold that gives the location of the separating plane wx = γ. The choice of the weights 1/m and 1/k in (2.1) is critical (as we shall demonstrate below in Theorems 2.5 and 2.6) in that it sets our formulation apart from Smith's linear program [12], where equal weights were proposed, and from other linear programming formulations [6,5,4]. Our choice, we believe, is a "natural" one in that the useless null solution w = 0 is not encountered computationally for linearly inseparable sets.
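To make the role of the weights 1/m and 1/k concrete, the objective of (2.1) can be evaluated directly for any candidate plane. The following is a minimal numpy sketch for illustration only; the function name average_violations and the use of numpy are our assumptions, not part of the paper.

```python
import numpy as np

def average_violations(w, gamma, A, B):
    """Objective (2.1): weighted average of the 1-norm violations of the plane wx = gamma."""
    m, k = A.shape[0], B.shape[0]
    viol_A = np.maximum(-A @ w + gamma + 1.0, 0.0)   # points of A falling below wx = gamma + 1
    viol_B = np.maximum(B @ w - gamma + 1.0, 0.0)    # points of B falling above wx = gamma - 1
    return viol_A.sum() / m + viol_B.sum() / k
```

A zero value of this quantity corresponds exactly to the linear separability characterization of Lemma 2.2 below.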

This is theoretically justified (Theorem 2.5 below) because w = 0 cannot be a solution unless the following equality between the arithmetic means of A and B holds:

(2.2)    e'A/m = e'B/k

However, in this case it is guaranteed that a nonzero optimal w exists in addition to w = 0 (Theorem 2.6 below).

We begin our analysis by justifying the use of the optimization problem (2.1), which minimizes the average of the misclassified points of A and B by the separating plane wx = γ. We define now linear separability for concreteness.

Definition 2.1. (Linear Separability) The point sets A and B, represented by the matrices A ∈ R^{m×n} and B ∈ R^{k×n} respectively, are linearly separable if and only if

(2.3)    min_{1≤i≤m} A_i v > max_{1≤i≤k} B_i v    for some v ∈ R^n

or equivalently

(2.4)    Aw ≥ eγ + e,   eγ - e ≥ Bw    for some w ∈ R^n, γ ∈ R

That (2.4) implies (2.3) is evident. To see the converse just note the relations

(2.5)    w := 2v/δ,   δ := min_{1≤i≤m} A_i v - max_{1≤i≤k} B_i v > 0,   γ := (min_{1≤i≤m} A_i v + max_{1≤i≤k} B_i v)/δ

Note also that when the sets A and B are linearly separable, as defined by (2.4), the plane

(2.6)    {x | wx = γ}

is a strict separating plane with

(2.7)    Aw > eγ   and   eγ > Bw

With the above definitions the following lemma becomes evident.

Lemma 2.2. The sets A and B represented by A ∈ R^{m×n} and B ∈ R^{k×n} respectively are linearly separable if and only if the minimum value of (2.1) is zero, in which case (w = 0, γ) cannot be optimal.

Proof. Note that the minimum of (2.1) is zero if and only if -Aw + eγ + e ≤ 0 and Bw - eγ + e ≤ 0, which is equivalent to the linear separability definition (2.4). To see that (w = 0, γ) cannot be optimal for (2.1), note that if we set w = 0 in (2.1) we get

(2.8)    min_γ  (γ + 1)_+ + (1 - γ)_+  =  2  >  0

which contradicts the requirement that the minimum of (2.1) be zero for linearly separable A and B.
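As a small illustration of Definition 2.1, the strict separation condition (2.7) for a given candidate plane wx = γ can be checked directly. This is an illustrative numpy sketch under our own naming, not code from the paper.

```python
import numpy as np

def strictly_separates(w, gamma, A, B):
    """Condition (2.7): Aw > e*gamma and e*gamma > Bw, i.e. the plane wx = gamma places
    all points of A strictly on one side and all points of B strictly on the other."""
    return bool(np.all(A @ w > gamma) and np.all(B @ w < gamma))
```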

Figure 1: An Optimal Separator wx = γ for Linearly Inseparable Sets: A (o) and B (+)

The import of Lemma 2.2 is that the optimization problem (2.1), which is equivalent to the linear program (2.11) below, will always generate a separating plane wx = γ for linearly separable sets A and B. For linearly inseparable sets A and B the optimization problem (2.1) will generate an optimal separating plane wx = γ which minimizes the average violations

(2.1a)    (1/m) Σ_{i=1}^m (-A_i w + γ + 1)_+  +  (1/k) Σ_{i=1}^k (B_i w - γ + 1)_+

of points of A which lie on the wrong side of the plane wx = γ + 1, that is in {x | wx < γ + 1}, and of points of B which lie on the wrong side of the plane wx = γ - 1, that is in {x | wx > γ - 1}. See Fig. 1.

Note also that the location of the plane wx = γ obtained by minimizing the average violations (2.1a) can be further optimized by holding w fixed at the optimal value and solving the one-dimensional optimization problem in γ

(2.1c)    min_{min_i A_i w ≤ γ ≤ max_j B_j w}  (1/m) Σ_{i=1}^m (-A_i w + γ + 1)_+  +  (1/k) Σ_{i=1}^k (B_i w - γ + 1)_+

This "secondary" optimization is not necessary in general, but for some problems it does improve the location of the optimal separator for a fixed orientation of the planes. The objective of the one-dimensional problem (2.1c) is a piecewise-linear convex function which can be easily minimized by evaluating the function at the breakpoints γ = A_1 w, ..., A_m w, B_1 w, ..., B_k w.
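The breakpoint evaluation just described is easy to carry out. The sketch below is ours (numpy assumed, with our own function name), not the paper's code.

```python
import numpy as np

def refine_gamma(w, A, B):
    """Secondary optimization (2.1c): for fixed w, relocate gamma by evaluating the
    piecewise-linear convex objective at the breakpoints gamma = A_i w, B_j w."""
    m, k = A.shape[0], B.shape[0]
    def obj(g):
        return (np.maximum(-A @ w + g + 1.0, 0.0).sum() / m
                + np.maximum(B @ w - g + 1.0, 0.0).sum() / k)
    breakpoints = np.concatenate([A @ w, B @ w])
    return min(breakpoints, key=obj)
```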

In order to set up the equivalent linear programming formulation to (2.1) we first state a simple lemma that relates a norm minimization problem such as (2.1) to a constrained optimization problem devoid of norms of plus-functions.

Lemma 2.3. Let g: R^n → R^m, h: R^n → R^k and let S be a subset of R^n. The problems

(2.9a)    min_{x∈S}  ||g(x)_+||_1 + ||h(x)_+||_1

(2.9b)    min_{x∈S}  { e'y + e'z  |  y ≥ g(x),  y ≥ 0,  z ≥ h(x),  z ≥ 0 }

have identical solution sets.

Proof. The equivalence follows by noting that for the minimization problem (2.9b), the optimal y, z and x must be related through the equalities y = g(x)_+, z = h(x)_+.

By using this lemma we can state an equivalent linear programming formulation to (2.1) as follows.

Proposition 2.4. For λ > 0, μ > 0, the error-minimizing problem

(2.10a)    min_{w,γ}  λ ||(-Aw + eγ + e)_+||_1 + μ ||(Bw - eγ + e)_+||_1

is equivalent to the linear program

(2.10b)    min_{w,γ,y,z}  { λ e'y + μ e'z  |  Aw - eγ + y ≥ e,  -Bw + eγ + z ≥ e,  y ≥ 0,  z ≥ 0 }

The linear program (2.10b), originally proposed by Smith [12] with equal weights λ = μ = 1/(m+k), does not possess all the properties found in our linear program with λ = 1/m and μ = 1/k:

(2.11)    min_{w,γ,y,z}  { e'y/m + e'z/k  |  Aw - eγ + y ≥ e,  -Bw + eγ + z ≥ e,  y ≥ 0,  z ≥ 0 }

The principal property that (2.11) has over other linear programs, including Smith's, is that for the linearly inseparable case it will always generate a nontrivial w without an extraneous constraint. To our knowledge no other linear programming formulation has this property for linearly inseparable sets.
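For concreteness, the linear program (2.11) can be assembled and solved with any LP solver. The following sketch stacks the variables as x = (w, γ, y, z) and uses scipy.optimize.linprog; the solver choice, the variable stacking and the function name robust_lp are our illustrative assumptions, not part of the paper.

```python
import numpy as np
from scipy.optimize import linprog

def robust_lp(A, B):
    """Solve the robust LP (2.11); returns (w, gamma) of the separating plane wx = gamma."""
    m, n = A.shape
    k = B.shape[0]
    # objective: e'y/m + e'z/k over x = (w, gamma, y, z)
    c = np.concatenate([np.zeros(n + 1), np.ones(m) / m, np.ones(k) / k])
    # Aw - e*gamma + y >= e   rewritten as  -Aw + e*gamma - y <= -e
    G1 = np.hstack([-A, np.ones((m, 1)), -np.eye(m), np.zeros((m, k))])
    # -Bw + e*gamma + z >= e  rewritten as   Bw - e*gamma - z <= -e
    G2 = np.hstack([B, -np.ones((k, 1)), np.zeros((k, m)), -np.eye(k)])
    bounds = [(None, None)] * (n + 1) + [(0, None)] * (m + k)   # w, gamma free; y, z >= 0
    res = linprog(c, A_ub=np.vstack([G1, G2]), b_ub=-np.ones(m + k),
                  bounds=bounds, method="highs")
    assert res.success
    return res.x[:n], res.x[n]
```

By Lemma 2.2, an optimal objective value of zero signals that the two sets are in fact linearly separable.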

We establish this property by first considering the linear program (2.10b) for arbitrary positive weights λ and μ and showing under what conditions w = 0 constitutes a solution of the problem.

Theorem 2.5. (Occurrence of the null solution w in the linear program (2.10b)) Let μk ≥ λm. The linear program (2.10b) has a solution (w̄ = 0, γ̄, ȳ, z̄) if and only if

(2.12)    e'A/m = v'B,   v ≥ 0,   e'v = 1,   v ≤ (μ/(λm)) e,

that is, if the arithmetic mean of the points in A equals a convex combination of points of B. When μk = λm, (2.12) degenerates to

(2.12a)    e'A/m = e'B/k,

that is, the arithmetic mean of the points in A equals the arithmetic mean of the points in B.

Proof. We note first that μk ≥ λm does not result in any loss of generality, because the roles of the sets A and B can be switched to obtain this inequality. Consider now the dual of the linear program (2.10b):

(2.13)    max_{u,v}  { e'u + e'v  |  A'u - B'v = 0,  -e'u + e'v = 0,  0 ≤ u ≤ λe,  0 ≤ v ≤ μe }

The point (w̄ = 0, γ̄, ȳ, z̄) is optimal for the primal problem (2.10b) if and only if

(2.14a)    2λm  =  min{2λm, 2μk}  =  min_{γ,y,z}  { λe'y + μe'z  |  -eγ + y ≥ e,  eγ + z ≥ e,  (y, z) ≥ 0 }

(2.14b)         =  max_{u,v}  { e'u + e'v  |  A'u - B'v = 0,  -e'u + e'v = 0,  0 ≤ u ≤ λe,  0 ≤ v ≤ μe }

Since e'u = e'v and e'u + e'v = 2λm, it follows that e'u = e'v = λm. Since 0 ≤ u ≤ λe, if any u_i < λ then e'u < λm, contradicting e'u = λm. Hence u = λe and e'v = e'u = λm. By normalizing u and v, that is dividing by λm, we obtain (2.12). When μk = λm, then from (2.12) we have that 0 ≤ v ≤ e/k. Since e'v = 1, it follows that v = e/k and (2.12a) follows from (2.12).

This theorem gives a theoretical explanation of some observed computational experience, namely that Smith's linear program (2.10b) with λ = μ = 1/(m+k) sometimes ended with the useless null w for real-world linearly inseparable problems, whereas our linear program (2.11) never did. The reason for that is the rarity of the satisfaction of (2.12a) by real problems, in contrast to the possibly frequent satisfaction of (2.12).
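The practical import of (2.12a) is easy to probe on a given data set: the arithmetic means of A and B must coincide exactly for w = 0 to solve (2.11), which rarely happens with real data. A trivial numpy check, ours and for illustration only:

```python
import numpy as np

def means_coincide(A, B, tol=1e-12):
    """Condition (2.12a): the arithmetic means of the two point sets are equal."""
    return bool(np.linalg.norm(A.mean(axis=0) - B.mean(axis=0), np.inf) <= tol)
```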

We now proceed to our next results, which show that when the null vector w = 0 constitutes a solution to the linear program (2.10b), except for our proposed choice of λ = 1/m and μ = 1/k, such a w = 0 can be unique and nothing can be done to alter it. (See Example 2.8 below.) However, for our linear program (2.11), even when the null w occurs in the rare case of (2.12a), e.g. in the contrived but classical Exclusive-Or example, there always exists an alternate non-null optimal w. (See Example 2.7 below.) These results are contained in the following theorem, examples and remarks.

Theorem 2.6. (Nonuniqueness of the null w solution to the linear program (2.11)) The solution (w̄ = 0, γ̄, ȳ, z̄) to (2.11) is not unique in w̄.

Proof. Note from the first equality of (2.14a) with λ = 1/m and μ = 1/k that when (w̄ = 0, γ̄, ȳ, z̄) is a solution to (2.11), then γ̄ can be any point in [-1, 1]. In particular, take γ̄ = 0. Then for this choice of w̄ = 0, γ̄ = 0, the corresponding optimal ȳ, z̄ for (2.11) are ȳ = e, z̄ = e and the active constraints are the first two constraints of (2.11). Hence (w̄, γ̄, ȳ, z̄) is unique in w if and only if the following system of linear inequalities has no solution (w, γ, y, z):

(2.15a)    e'y/m + e'z/k ≤ e'ȳ/m + e'z̄/k,   Aw - eγ + y ≥ Aw̄ - eγ̄ + ȳ = e,   -Bw + eγ + z ≥ -Bw̄ + eγ̄ + z̄ = e,   w ≠ w̄

This is equivalent to the following system of linear inequalities having no solution (w, γ, y, z) for each h in R^n:

(2.15b)    e'(ȳ - y)/m + e'(z̄ - z)/k ≥ 0,   A(w - w̄) - e(γ - γ̄) + (y - ȳ) ≥ 0,   -B(w - w̄) + e(γ - γ̄) + (z - z̄) ≥ 0,   h(w - w̄) > 0

By Motzkin's theorem of the alternative [8], (2.15b) has no solution for a given h in R^n if and only if the following system of linear relations does have a solution (σ, u, v) for that h in R^n:

(2.16)    A'u - B'v = h,   -e'u + e'v = 0,   -σe/m + u = 0,   -σe/k + v = 0,   σ, u, v ≥ 0

Obviously it is possible to choose h in R^n such that (2.16) has no solution, since there are h in R^n that cannot be written as

(2.17)    h = σ(A'e/m - B'e/k),   σ ≥ 0.

Hence (2.15b) has a solution for some h in R^n. Consequently (2.15a) has a solution and w̄ = 0 is not unique.

We now apply this theorem to the classical Exclusive-Or example, for which condition (2.12a) is satisfied and hence (w̄ = 0, γ̄, ȳ, z̄) is a solution to (2.11) which, however, is not unique in w̄ = 0.

Example 2.7. (Exclusive-Or)

A = [0 0; 1 1],    B = [1 0; 0 1]

that is, A consists of the points (0,0) and (1,1) while B consists of the points (1,0) and (0,1). For this example e'A = e'B and (w̄ = 0, γ̄ = 0, ȳ = e, z̄ = e) is a solution to the linear program (2.11), which can also be written in the equivalent (2.1) formulation as

(2.18)    min_{w,γ}  (1/2) ||( γ + 1,  -w_1 - w_2 + γ + 1 )_+||_1  +  (1/2) ||( w_1 - γ + 1,  w_2 - γ + 1 )_+||_1

However, the point w̄ = (1, 1)', γ̄ = 1 is also optimal, because it gives the same minimum value of 2. The 45° direction in the w-space associated with this solution is useful in the multisurface method of pattern separation [7], since it can be used to generate the first part of a piecewise-linear separator. In practice the linear programming package returned another optimal point generating a second 45° direction that can also be used for piecewise-linear separation.
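The two optimal points mentioned in Example 2.7 are easy to verify numerically by plugging them into the objective (2.18). A self-contained numpy check of ours, for illustration only:

```python
import numpy as np

A = np.array([[0., 0.], [1., 1.]])     # the set A (o)
B = np.array([[1., 0.], [0., 1.]])     # the set B (+)

def f(w, gamma):
    # objective (2.1) specialized to the Exclusive-Or data, i.e. problem (2.18)
    return (np.maximum(-A @ w + gamma + 1, 0).sum() / 2
            + np.maximum(B @ w - gamma + 1, 0).sum() / 2)

print(f(np.zeros(2), 0.0))             # 2.0 for the null solution w = 0, gamma = 0
print(f(np.array([1., 1.]), 1.0))      # 2.0 for the 45-degree alternative w = (1,1)', gamma = 1
```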

We turn now to the case when m ≠ k. In particular, consider Smith's case of λ = μ = 1/(m+k). A similar analysis to that of Theorem 2.6 does not give guaranteed nonuniqueness of the solution (w̄ = 0, γ̄, ȳ, z̄) in w̄ = 0 for the linear program (2.10b) with λ = μ = 1/(m+k). In fact, to the contrary, the analysis shows that w̄ = 0 is indeed unique under certain conditions, which are satisfied by the following counterexample from Mangasarian et al [9] to Smith's claim that his linear program (2.10b) with λ = μ = 1/(m+k) always generates a nonzero w. In reality w̄ = 0 is unique for this example.

Example 2.8. (Unique w̄ = 0 for Smith's LP (2.10b) with λ = μ = 1/(m+k)) Consider the one-dimensional counterexample of Mangasarian et al [9], with m = 2 points in A and k = 3 points in B, so that λ = μ = 1/5. For this problem the minimum of the norm-minimization problem equivalent to Smith's LP (2.10b),

(2.19)    min_{w,γ}  f(w,γ) := (1/5) ||(-Aw + eγ + e)_+||_1 + (1/5) ||(Bw - eγ + e)_+||_1,

is achieved at w̄ = 0 and an appropriate γ̄. The uniqueness of this solution in w can be established by considering the subdifferential ∂f(0, γ̄) (see Polyak [10] and Fletcher [3], p. 363) and verifying that 0 lies in its interior, whence (0, γ̄) is the unique solution of (2.19) by a lemma of Polyak [10]. (The uniqueness of (0, γ̄) can also be shown by considering the linear program (2.10b) directly.)

Remark 2.9. Grinold [5] proposed the following linear program

(2.20)    max_{w,γ,δ}  δ   subject to   Aw - eγ - eδ ≥ 0,   -Bw + eγ - eδ ≥ 0,   (e'A - e'B)w + (k - m)γ = k + m

In Mangasarian et al [9] a one-dimensional example with two points in A and three points in B was given to show that the solution (w̄ = 0, γ̄ = 5, δ̄ = -5) of (2.20) is unique in w̄. In fact, it can be shown that (w̄ = 0, γ̄, δ̄) is always a solution of (2.20) whenever there exists a u such that

(2.21)    u'A + (e'A - e'B)/(k - m) = 0,   e'u = 1,   u ≥ 0,   k > m

Furthermore, w̄ = 0 is unique if for each h in R^n

(2.22)    ( A' + (e'A - e'B)' e'/(k - m) ) u = h   has a solution u

3 COMPUTATIONAL COMPARISONS

In this section we give computational comparisons on two real-world databases, the Wisconsin Breast Cancer Database [14,13] and the Cleveland Heart Disease Database [2], using our proposed linear program (2.11), Smith's linear program (2.10b) with λ = μ = 1/(m+k), and the following linear programming formulation [9] for multisurface discrimination:

(3.1)    max_{1≤i≤n}  max_{w,α,β}  { α - β  |  Aw ≥ eα,  Bw ≤ eβ,  -e ≤ w ≤ e,  w_i = 1 }

Note that (3.1) can be solved by solving n linear programs. The corresponding linear separation obtained from (3.1) is

(3.2)    wx = (α + β)/2

where (w̄, ᾱ, β̄) is a solution of (3.1).
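Formulation (3.1) is itself straightforward to set up as n linear programs, one for each normalization w_i = 1. The sketch below uses scipy.optimize.linprog; the function name msm_plane and the solver are our assumptions for illustration, not the code used in the paper.

```python
import numpy as np
from scipy.optimize import linprog

def msm_plane(A, B):
    """Multisurface formulation (3.1): solve n LPs, one per fixed w_i = 1,
    and return the separator (3.2): wx = (alpha + beta)/2."""
    m, n = A.shape
    k = B.shape[0]
    c = np.concatenate([np.zeros(n), [-1.0, 1.0]])                        # minimize -(alpha - beta)
    G = np.vstack([np.hstack([-A, np.ones((m, 1)), np.zeros((m, 1))]),    # e*alpha <= Aw
                   np.hstack([B, np.zeros((k, 1)), -np.ones((k, 1))])])   # Bw <= e*beta
    h = np.zeros(m + k)
    best = None
    for i in range(n):
        bounds = [(-1.0, 1.0)] * n + [(None, None), (None, None)]
        bounds[i] = (1.0, 1.0)                                            # fix w_i = 1
        res = linprog(c, A_ub=G, b_ub=h, bounds=bounds, method="highs")
        if res.success and (best is None or res.fun < best.fun):
            best = res
    w, alpha, beta = best.x[:n], best.x[n], best.x[n + 1]
    return w, (alpha + beta) / 2.0
```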

Problem (3.1) is equivalent [9] to

(3.3)    max_{w,α,β}  { α - β  |  Aw ≥ eα,  Bw ≤ eβ,  ||w||_∞ = 1 }

which in turn is easily seen to be the following problem:

(3.4)    max_{||w||_∞ = 1}  ( min_{1≤i≤m} A_i w  -  max_{1≤i≤k} B_i w )

Fig. 2 summarizes the results obtained for the two linearly inseparable databases using the three linear programs mentioned above. The Wisconsin Breast Cancer Database consists of 566 points, of which 354 are benign and 212 are malignant, all in a 9-dimensional real space. The Cleveland Heart Disease Database consists of 297 points in a 13-dimensional real space, of which 137 are negative and 160 are positive. Our testing methodology consisted of dividing each set randomly into a training set consisting of 67% of the data and a testing set consisting of the remaining 33%. Each linear programming formulation was run on the training set and the resulting separator tested on the testing set. This was repeated ten times and the average results of the ten runs are depicted in Fig. 2. No "secondary" optimization using (2.1c) was performed for any method. For each database our linear program (2.11), referred to as MSM1 (multisurface method 1-norm), outperformed both Smith's linear program (2.10b) with λ = μ = 1/(m+k) and the linear programming method of (3.1), referred to as MSM. The average run times for MSM1 and Smith are very close, 3.84 and 3.89 seconds respectively on a DECstation 5000 for the Wisconsin Breast Cancer Database, while MSM, which requires the solution of n linear programs, was considerably slower on both databases. Note that the percent error of MSM1 on the testing set was better than that of Smith and considerably better than that of MSM on both databases.
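The testing methodology just described amounts to a simple repeated random split. The following is a sketch of ours, not the authors' code; train_plane stands for any discriminator, such as the robust_lp or msm_plane sketches above, that returns a plane (w, γ).

```python
import numpy as np

def average_test_error(A, B, train_plane, trials=10, train_frac=0.67, seed=0):
    """Ten random 67%/33% splits: train a plane on the first part of each set and
    report the average percent error on the held-out points."""
    rng = np.random.default_rng(seed)
    errors = []
    for _ in range(trials):
        pa, pb = rng.permutation(len(A)), rng.permutation(len(B))
        ca, cb = int(train_frac * len(A)), int(train_frac * len(B))
        w, gamma = train_plane(A[pa[:ca]], B[pb[:cb]])
        test_A, test_B = A[pa[ca:]], B[pb[cb:]]
        wrong = int((test_A @ w <= gamma).sum() + (test_B @ w >= gamma).sum())
        errors.append(100.0 * wrong / (len(test_A) + len(test_B)))
    return float(np.mean(errors))
```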

4 CONCLUSIONS

We have presented a robust linear program which always generates a linear surface as an "optimal" separator for two linearly inseparable sets. The "optimality" of the separator consists in minimizing a weighted average sum of the violations of the points lying on the wrong side of the separator. By using an appropriately weighted sum, we have overcome the problem of the null solution which has plagued previous linear programming approaches. These approaches either left this difficulty unresolved or imposed an extraneous linear constraint which never resolved the problem completely. The fact that computational results on real-world problems give an edge to our linear program over Smith's, and a substantial edge over another linear programming approach, makes it, in our opinion, a suitable linear program for the linearly inseparable case.

Figure 2: Comparison of Three Linear Programming Discriminators for Linearly Inseparable Sets (percent errors of MSM1, Smith and MSM on the training and testing sets of the Wisconsin Breast Cancer Database and the Cleveland Heart Disease Database)

ACKNOWLEDGEMENTS

This material is based on research supported by Air Force Office of Scientific Research Grant AFOSR-89-0410, National Science Foundation Grant CCR-9101801, and Air Force Laboratory Graduate Fellowship Program Grant SSAN.

REFERENCES

1. K. P. Bennett and O. L. Mangasarian, Neural Network Training Via Linear Programming, in: P. M. Pardalos (Ed.), Advances in Optimization and Parallel Computing, North-Holland, Amsterdam, 1992.
2. R. Detrano, A. Janosi, W. Steinbrunn, M. Pfisterer, J. Schmid, S. Sandhu, K. Guppy, S. Lee and V. Froelicher, International Application of a New Probability Algorithm for the Diagnosis of Coronary Artery Disease, American Journal of Cardiology 64, 1989.
3. R. Fletcher, Practical Methods of Optimization, Second Edition, Wiley, New York, 1987.
4. F. Glover, Improved Linear Programming Models for Discriminant Analysis, Decision Sciences 21(4), 1990.
5. R. C. Grinold, Mathematical Programming Methods of Pattern Classification, Management Science 19, 1972.
6. O. L. Mangasarian, Linear and Nonlinear Separation of Patterns by Linear Programming, Operations Research 13, 1965.
7. O. L. Mangasarian, Multisurface Method of Pattern Separation, IEEE Transactions on Information Theory IT-14(6), 1968.
8. O. L. Mangasarian, Nonlinear Programming, McGraw-Hill, New York, 1969.
9. O. L. Mangasarian, R. Setiono and W. H. Wolberg, Pattern Recognition via Linear Programming: Theory and Application to Medical Diagnosis, in: Thomas F. Coleman and Yuying Li (Eds.), Large-Scale Numerical Optimization, SIAM, Philadelphia, 1990.
10. B. T. Polyak, Introduction to Optimization, Optimization Software, New York, 1987.
11. D. E. Rumelhart, G. E. Hinton and J. L. McClelland, Learning Internal Representations, in: D. E. Rumelhart and J. L. McClelland (Eds.), Parallel Distributed Processing, Vol. I, M.I.T. Press, Cambridge, Massachusetts, 1986.
12. F. W. Smith, Pattern Classifier Design by Linear Programming, IEEE Transactions on Computers C-17(4), 1968.
13. W. H. Wolberg and O. L. Mangasarian, Multisurface Method of Pattern Separation Applied to Breast Cytology Diagnosis, Proceedings of the National Academy of Sciences U.S.A. 87, 1990.
14. W. H. Wolberg, M. S. Tanner and W. Y. Loh, Diagnostic Schemes for Fine Needle Aspirates of Breast Masses, Analytical and Quantitative Cytology and Histology 10, 1988.
