Multicategory Classification by Support Vector Machines


Erin J. Bredensteiner
Department of Mathematics
University of Evansville
1800 Lincoln Avenue
Evansville, Indiana
eb6@evansville.edu

Kristin P. Bennett
Department of Mathematical Sciences
Rensselaer Polytechnic Institute
Troy, NY 12180
bennek@rpi.edu

Abstract

We examine the problem of how to discriminate between objects of three or more classes. Specifically, we investigate how two-class discrimination methods can be extended to the multiclass case. We show how the linear programming (LP) approaches based on the work of Mangasarian and the quadratic programming (QP) approaches based on Vapnik's Support Vector Machines (SVM) can be combined to yield two new approaches to the multiclass problem. In LP multiclass discrimination, a single linear program is used to construct a piecewise-linear classification function. In our proposed multiclass SVM method, a single quadratic program is used to construct a piecewise-nonlinear classification function. Each piece of this function can take the form of a polynomial, a radial basis function, or even a neural network. For k > 2 class problems, the SVM method as originally proposed required the construction of a two-class SVM to separate each class from the remaining classes. Similarly, k two-class linear programs can be used for the multiclass problem. We performed an empirical study of the original LP method, the proposed k-LP method, the proposed single QP method, and the original k-QP method. We discuss the advantages and disadvantages of each approach.

1 Introduction

We investigate the problem of discriminating large real-world datasets with more than two classes. Given examples of points known to come from k > 2 classes, we construct a function to discriminate between the classes. The goal is to select a function that will efficiently and correctly classify future points. This classification technique can be used for data mining or pattern recognition. For example, the United States Postal Service is interested in an efficient yet accurate method of classifying zipcodes. Actual handwritten digits from zipcodes collected by the United States Postal Service are used in our study. Each digit is represented by a 16 by 16 pixel grayscale map, resulting in 256 attributes for each sample number. Given the enormous quantities of mail the Postal Service sorts each day, accuracy and efficiency in evaluation are extremely important.

In this paper, we combine two independent but related research directions developed for solving the two-class linear discrimination problem. The first is the linear programming (LP) methods stemming from the Multisurface Method of Mangasarian [12, 13]. This method and its later extension, the Robust Linear Programming (RLP) approach [6], have been used in a highly successful breast cancer diagnosis system [26]. The second direction is the quadratic programming (QP) methods based on Vapnik's Statistical Learning Theory [24, 25]. Statistical Learning Theory addresses mathematically the problem of how to best construct functions that generalize well on future points. The problem of constructing the best linear two-class discriminant can be posed as a convex quadratic program with linear constraints. The resulting linear discriminant is known as a Support Vector Machine (SVM) because it is a function of a subset of the training data known as support vectors. Specific implementations such as the Generalized Optimal Plane (GOP) method have proven to perform very well in practice [8]. Throughout this paper we will refer to the two different approaches as RLP and SVM.

The primary focus of this paper is how the two research directions have differed in their approach to solving problems with k > 2 classes. The original SVM method for multiclass problems was to find k separate two-class discriminants [23]. Each discriminant is constructed by separating a single class from all the others. This process requires the solution of k quadratic programs. When applying all k classifiers to the original multicategory dataset, multiply classified points or unclassified points may occur. This ambiguity has been avoided by choosing the class of a point corresponding to the classification function that is maximized at that point. The LP approach has been to directly construct k classification functions such that for each point the corresponding class function is maximized [5, 6]. The Multicategory Discrimination Method [5, 6] constructs a piecewise-linear discriminant for the k-class problem using a single linear program. We will call this method M-RLP since it is a direct extension of the RLP approach. We will show how these two different approaches can be combined to yield two new methods: k-RLP and M-SVM. In Section 2, we will provide background on the existing RLP and SVM methods.

While the k-class cases are quite different, the two-class linear discrimination methods for SVM and RLP are almost identical. They differ only in the regularization term used in the objective. We use the regularized form of RLP proposed in [3], which is equivalent to SVM except that a different norm is used for the regularization term. For two-class linear discrimination, RLP generalizes equally well and is more computationally efficient than SVM. RLP exploits the fact that state-of-the-art LP codes are far more efficient and reliable than QP codes.

The primary appeal of SVM is that it can be simply and elegantly applied to nonlinear discrimination. With only minor changes, SVM methods can construct a wide class of two-class nonlinear discriminants by solving a single QP [24]. The basic idea is that the points are mapped nonlinearly to a higher dimensional space. Then the dual SVM problem is used to construct a linear discriminant in the higher dimensional space that is nonlinear in the original attribute space. By using kernel functions in the dual SVM problem, SVM can efficiently and effectively construct many types of nonlinear discriminant functions including polynomials, radial basis function machines, and neural networks. The successful polynomial-time nonlinear methods based on LP use multi-step approaches. The methods of Roy et al. [20, 19, 18] use clustering in conjunction with LP to generate neural networks in polynomial time. Another approach is to recursively construct piecewise-linear discriminants using a series of LPs [13, 2, 15]. These approaches could also be used with SVM, but we limit discussion to nonlinear discriminants constructed using the SVM kernel-type approaches.

After the introduction to the existing multiclass methods, M-RLP and k-SVM, we will show how the same idea used in M-RLP can be adapted to construct a multiclass SVM using a single quadratic program. We adopt a problem formulation similar to the two-class case. In the two-class case, initially the problem is to construct a linear discriminant. The data points are then transformed to a higher dimensional feature space. A linear discriminant is constructed in the higher dimensional space. This results in a nonlinear classification function in the original feature space. In Section 3, for the k > 2 class case, we begin by constructing a piecewise-linear discriminant function. A regularization term is added to avoid overfitting. This method is then extended to piecewise-nonlinear classification functions in Section 4. The variables are mapped to a higher dimensional space. Then a piecewise-linear discriminant function is constructed in the new space. This results in a piecewise-nonlinear discriminant in the original space. In Section 5, we extend the method to piecewise inseparable datasets. We call the final approach the Multicategory Support Vector Machine (M-SVM). Depending on the choice of transformation, the pieces may be polynomials, radial basis functions, neural networks, etc. We concentrate our research on the polynomial classifier and leave the computational investigation of other classification functions as future work. Figure 1 shows a piecewise second-degree polynomial separating three classes in two dimensions.

M-SVM requires the solution of a very large quadratic program. When transforming the data points into a higher dimensional feature space, the number of variables will grow exponentially.

Figure 1: Piecewise-polynomial separation of three classes in two dimensions

For example, a second degree polynomial classifier in two dimensions requires the original variables x_1 and x_2 as well as the variables x_1^2, x_2^2, and x_1 x_2. In the primal problem, the problem size will explode as the degree of the polynomial increases. The dual problem, however, remains tractable. The number of dual variables is k times the number of points regardless of what transformation is selected. In the dual problem, the transformation appears as an inner product in the high dimensional space. Inexpensive techniques exist for computing these inner products. Each dual variable corresponds to a point in the original feature space. A point with a corresponding positive dual variable is referred to as a support vector. The goal is to maintain a high accuracy while using a small number of support vectors. Minimizing the number of support vectors is important for generalization and also for reducing the computational time required to evaluate new examples. Section 6 contains computational results comparing the two LP approaches, k-RLP and M-RLP, and the two QP approaches, k-SVM and M-SVM. The methods were compared in terms of generalization (testing set accuracy), number of support vectors, and computational time.

The following notation will be used throughout this paper. Mathematically we can abstract the problem as follows: given the elements of the sets A^i, i = 1, ..., k, in the n-dimensional real space R^n, construct a discriminant function that separates these points into distinct regions. Each region should contain points belonging to all or almost all of the same class. Let A^j be a set of points in the n-dimensional real space R^n with cardinality m_j. Let A^j also denote the m_j × n matrix whose rows are the points in A^j. The i-th point in A^j and the i-th row of A^j are both denoted A^j_i. Let e denote a vector of ones of the appropriate dimension. The scalar 0 and a vector of zeros are both represented by 0. Thus, for x ∈ R^n, x > 0 implies that x_i > 0 for i = 1, ..., n. Similarly, x ≥ y implies that x_i ≥ y_i for i = 1, ..., n. The set of minimizers of f(x) on the set S is denoted by arg min_{x ∈ S} f(x). For a vector x in R^n, x_+ will denote the vector in R^n with components (x_+)_i := max{x_i, 0}, i = 1, ..., n. The step function x_* will denote the vector in [0, 1]^n with components (x_*)_i := 0 if x_i ≤ 0 and (x_*)_i := 1 if x_i > 0, i = 1, ..., n. For the vector x in R^n and the matrix A in R^{n×m}, the transposes of x and A are denoted x^T and A^T respectively. The dot product of two vectors x and y will be denoted x^T y or (x · y).

2 Background

This section contains a brief overview of the RLP and SVM methods for classification. First we will discuss the two-class problem using a linear classifier. Then SVM for two classes will be defined. Then RLP will be reviewed. Finally, the piecewise-linear function used for multicategory classification in M-RLP will be reviewed.

2.1 Two-Class Linear Discrimination

Commonly, the method of discrimination for two classes of points involves determining a linear function that consists of a linear combination of the attributes of the given sets. In the simplest case, a linear function can be used to separate two sets as shown in Figure 2. This function is the separating plane

x^T w = γ

Figure 2: Two linearly separable sets and a separating plane

where w is the normal to the plane and γ is the distance from the origin. Let A^1 and A^2 be two sets of points in the n-dimensional real space R^n with cardinality m_1 and m_2 respectively. Let A^1 be an m_1 × n matrix whose rows are the points in A^1. Let A^2 be an m_2 × n matrix whose rows are the points in A^2. Let x ∈ R^n be a point to be classified as follows:

x^T w − γ > 0  ⟹  x ∈ A^1
x^T w − γ < 0  ⟹  x ∈ A^2    (1)

The two sets of points, A^1 and A^2, are linearly separable if

A^1 w > γe > A^2 w    (2)

where e is a vector of ones of the appropriate dimension. If the two classes are linearly separable, there are infinitely many planes that separate the two classes. The goal is to choose the plane that will generalize best on future points. Both Mangasarian [12] and Vapnik and Chervonenkis [25] concluded that the best plane in the separable case is the one that minimizes the distance of the closest vector in each class to the separating plane. For the separable case the formulations of Mangasarian's Multi-surface Method of Pattern Recognition [13] and those of Vapnik's Optimal Hyperplane [24, 25] are very similar [3]. We will concentrate on the Optimal Hyperplane problem since it is the basis of SVM, and it is validated theoretically by Statistical Learning Theory [24]. According to Statistical Learning Theory, the Optimal Hyperplane can construct linear discriminants in very high dimensional spaces without overfitting. The reader should consult [24] for full details of Statistical Learning Theory not covered in this paper.

Figure 3: Two supporting planes x^T w = γ + 1 and x^T w = γ − 1 and the resulting optimal separating plane x^T w = γ

The problem in the canonical form of Vapnik [24] becomes to determine two parallel planes x^T w = γ + 1 and x^T w = γ − 1 such that

A^1 w − γe − e ≥ 0
−A^2 w + γe − e ≥ 0    (3)

and the margin, or distance between the two planes, is maximized. The margin of separation between the two supporting planes is 2/‖w‖. An example of such planes is shown in Figure 3. The problem of finding the maximum margin becomes [24]:

min_{w,γ}  (1/2) w^T w
s.t.  A^1 w − γe − e ≥ 0
      −A^2 w + γe − e ≥ 0    (4)

In general it is not always possible for a single linear function to completely separate two given sets of points. Thus, it is important to find the linear function that discriminates best between the two sets according to some error minimization criterion. Bennett and Mangasarian [4] minimize the average magnitude of the misclassification errors in the construction of their robust linear programming problem (RLP):

min_{w,γ,y,z}  δ_1 e^T y + δ_2 e^T z
s.t.  y + A^1 w − γe − e ≥ 0
      z − A^2 w + γe − e ≥ 0
      y ≥ 0, z ≥ 0    (5)

where δ_1 > 0 and δ_2 > 0 are the misclassification costs. To avoid the null solution w = 0, use δ_1 = 1/m_1 and δ_2 = 1/m_2, where m_1 and m_2 are the cardinalities of A^1 and A^2 respectively. The RLP method is very effective in practice.
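As a concrete illustration, the following sketch sets up RLP (5) for the scipy.optimize.linprog solver. It is a minimal reading of the formulation above under stated assumptions, not the authors' implementation; the function name rlp and the choice of the HiGHS solver are my own.

```python
# A minimal sketch of the two-class RLP linear program (5).
import numpy as np
from scipy.optimize import linprog

def rlp(A1, A2):
    """min d1*e'y + d2*e'z  s.t.  y + A1 w - gamma e - e >= 0,
                                  z - A2 w + gamma e - e >= 0,  y, z >= 0."""
    m1, n = A1.shape
    m2 = A2.shape[0]
    d1, d2 = 1.0 / m1, 1.0 / m2          # misclassification costs from the text
    # decision vector x = [w (n), gamma (1), y (m1), z (m2)]
    c = np.concatenate([np.zeros(n + 1), d1 * np.ones(m1), d2 * np.ones(m2)])
    # linprog expects A_ub x <= b_ub, so negate the >= constraints
    top = np.hstack([-A1,  np.ones((m1, 1)), -np.eye(m1), np.zeros((m1, m2))])
    bot = np.hstack([ A2, -np.ones((m2, 1)), np.zeros((m2, m1)), -np.eye(m2)])
    res = linprog(c, A_ub=np.vstack([top, bot]), b_ub=-np.ones(m1 + m2),
                  bounds=[(None, None)] * (n + 1) + [(0, None)] * (m1 + m2),
                  method="highs")
    w, gamma = res.x[:n], res.x[n]
    return w, gamma   # classify x as A1 if x @ w - gamma > 0, else A2, as in (1)
```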

The functions generated by RLP generalize well on many real-world problems. Additionally, the computational time is reasonably small because its solution involves only a single linear program. Note however that the RLP method no longer includes any notion of maximizing the margin. Statistical Learning Theory indicates that maximizing the margin is essential for good generalization. The SVM approach [8, 23] is a multiobjective quadratic program which minimizes the absolute misclassification errors and maximizes the separation margin by minimizing ‖w‖²:

min_{w,γ,y,z}  (1 − λ)(e^T y + e^T z) + (λ/2) w^T w
s.t.  A^1 w − γe + y − e ≥ 0
      −A^2 w + γe + z − e ≥ 0
      y ≥ 0, z ≥ 0    (6)

where 0 < λ < 1 is a fixed constant. Note that Problem (6) is equivalent to RLP with the addition of a regularization term (λ/2) w^T w, and δ_1 = δ_2 = 1.

A linear programming version of (6) can be constructed by replacing the norm used to minimize the weights w [3]. Recall that the SVM objective minimizes the square of the 2-norm of w, ‖w‖²_2 = w^T w. The 1-norm of w, ‖w‖_1, can be used instead. The absolute value function can be removed by introducing the variable s and the constraints −s ≤ w ≤ s. The SVM objective is then modified by substituting e^T s for (1/2) w^T w. At optimality, s_i = |w_i|, i = 1, ..., n. The resulting LP is:

min_{w,γ,y,z,s}  (1 − λ)((1/m_1) e^T y + (1/m_2) e^T z) + λ e^T s
s.t.  A^1 w − γe + y − e ≥ 0
      −A^2 w + γe + z − e ≥ 0
      −s ≤ w ≤ s
      y ≥ 0, z ≥ 0, s ≥ 0    (7)

We will refer to this problem as RLP since λ = 0 yields the original RLP method. As in the SVM method, the RLP method minimizes both the average distance of the misclassified points from the relaxed supporting planes and the 1-norm of the weight vector w. The main advantage of the RLP method over the SVM problem is that RLP is a linear program solvable using very robust algorithms such as the Simplex Method [17]. SVM requires the solution of a quadratic program that is typically much more computationally costly for the same size problem. In [3], the RLP method was found to generalize as well as the linear SVM but with much less computational cost.

It is more efficient computationally to solve the dual RLP and SVM problems. The dual RLP problem is

max_{u,v}  e^T u + e^T v
s.t.  −λe ≤ A^{1T} u − A^{2T} v ≤ λe
      e^T u − e^T v = 0
      0 ≤ u ≤ (1 − λ)δ_1 e
      0 ≤ v ≤ (1 − λ)δ_2 e    (8)

In this paper we use δ_1 = 1/m_1 and δ_2 = 1/m_2, but δ_1 and δ_2 may be any positive weights for the misclassification costs. The dual SVM problem and its extension to nonlinear discriminants is given in the next section.

2.2 Nonlinear Classifiers Using Support Vector Machines

The primary advantage of the SVM (6) over RLP (7) is that in its dual form it can be used to construct nonlinear discriminants using polynomial separators, radial basis functions, neural networks, etc. The basic idea is to map the original problem to a higher dimensional space and then to construct a linear discriminant in the higher dimensional space that corresponds to a nonlinear discriminant in the original space. So, for example, to construct a quadratic discriminant for a two-dimensional problem, the input attributes [x_1, x_2] are mapped into [x_1², x_2², √2 x_1 x_2, x_1, x_2] and a linear discriminant function is constructed in the new five-dimensional space. Two examples of possible polynomial classifiers are given in Figure 4. The dual SVM is applied to the mapped points. The regularization term in the primal objective helps avoid overfitting the higher dimensional space. The dual SVM provides a practical computational approach through the use of generalized inner products or kernels.

Figure 4: Two examples of second degree polynomial separations of two sets

The dual SVM is as follows:

min_{u,v}  (1/(2λ)) ‖A^{1T} u − A^{2T} v‖² − e^T u − e^T v
s.t.  e^T u = e^T v
      (1 − λ)e ≥ u ≥ 0
      (1 − λ)e ≥ v ≥ 0    (9)

To formulate the nonlinear case it is convenient to rewrite the problem in summation notation. Let A be the set of all points in A^1 and A^2. Define M = m_1 + m_2 to be the total number of points.

Let

α^T = [α_1, α_2, ..., α_M] = [(1/λ) u^T, (1/λ) v^T]

Let t ∈ R^M be such that

t_i = 1 if x_i ∈ A^1,  t_i = −1 if x_i ∈ A^2

To construct the nonlinear classification function, the original data points x are transformed to the higher dimensional feature space by the function φ(x): R^n → R^{n'}, n' >> n. The dot product of the original vectors, x_i^T x_j, is replaced by the dot product of the transformed vectors, (φ(x_i) · φ(x_j)). The first term of the objective function can then be written as the sum

(λ/2) Σ_{i=1}^{M} Σ_{j=1}^{M} t_i t_j α_i α_j (φ(x_i) · φ(x_j))

Using this notation and simplifying, the problem becomes:

min_α  (1/2) Σ_{i=1}^{M} Σ_{j=1}^{M} t_i t_j α_i α_j (φ(x_i) · φ(x_j)) − Σ_{i=1}^{M} α_i
s.t.  Σ_{i=1}^{M} α_i t_i = 0
      ((1 − λ)/λ) e ≥ α ≥ 0    (10)

In the support vector machine (SVM), Vapnik replaces the inner product (φ(x) · φ(x_i)) with the inner product in the Hilbert space K(x, x_i). This symmetric function K(x, x_i) must satisfy Theorem 5.3 in [23]. This theorem ensures K(x, x_i) is an inner product in some feature space. The choice of K(x, x_i) determines the type of classifier that is constructed. Possible choices include polynomial classifiers as in Figure 4 (K(x, x_i) = (x^T x_i + 1)^d, where d is the degree of the polynomial), radial basis function machines (K_γ(|x − x_i|) = exp{−γ ‖x − x_i‖²}, where |x − x_i| is the distance between two vectors and γ is the width parameter), and two-layer neural networks (K(x, x_i) = S[v(x^T x_i) + c], where S(u) is a sigmoid function) [23]. Variants of SVM (10) have proven to be quite successful in practice [21, 22, 7].

Note that the number of variables in Program (10) remains constant as K(x, x_i) increases in dimensionality. Additionally, the objective function remains quadratic and thus the complexity of the problem does not increase. In fact, the size of the problem is dependent on the number of nonzero dual variables α_i. The points x_i corresponding to these variables are called the support vectors. According to Statistical Learning Theory, the best solution for a given misclassification error uses the minimum number of support vectors. The final classification function with the generalized kernel function K(x, x_i) is:

f(x) = sign( Σ_{support vectors} t_i α_i K(x, x_i) − γ )    (11)

where x ∈ A^1 if f(x) = 1, otherwise x ∈ A^2.
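The kernels just listed are easy to state in code. The sketch below is a hedged illustration rather than the paper's implementation: it shows the polynomial and radial basis kernels and the decision function (11), where the arrays support_x, support_t, support_alpha, and gamma are assumed to come from a solver for dual problem (10).

```python
# Kernels and the classification function (11) for the two-class SVM.
import numpy as np

def poly_kernel(x, xi, d=2):
    # polynomial classifier: K(x, xi) = (x'xi + 1)^d
    return (x @ xi + 1.0) ** d

def rbf_kernel(x, xi, width=1.0):
    # radial basis function machine: K(x, xi) = exp(-width * ||x - xi||^2)
    return np.exp(-width * np.sum((x - xi) ** 2))

def decide(x, support_x, support_t, support_alpha, gamma, kernel=poly_kernel):
    """Evaluate f(x) = sign(sum over support vectors of t_i alpha_i K(x, x_i) - gamma).
    Returns +1 for class A1 and -1 for class A2."""
    s = sum(t * a * kernel(x, xi)
            for xi, t, a in zip(support_x, support_t, support_alpha))
    return np.sign(s - gamma)
```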

Figure 5: Piecewise-linear separation of sets A^1, A^2, and A^3 by the convex piecewise-linear function f(x) = max_{i=1,2,3} (w^i x − γ_i)

2.3 Multicategory Discrimination

In multicategory classification a piecewise-linear separator is used to discriminate between k > 2 classes of m_i, i = 1, ..., k, points. We will examine two methods for accomplishing this. The first, used in SVM [24], is to construct a discriminant function to separate one class from the remaining classes. This process is repeated k times. In the separable case, the linear discriminant for each class must satisfy the following set of inequalities: find (w^1, γ_1), ..., (w^k, γ_k) such that

A^i w^i − γ_i e > A^j w^i − γ_i e,  i, j = 1, ..., k,  i ≠ j    (12)

To classify a new point x, compute f_i(x) = x^T w^i − γ_i. If f_i(x) > 0 for only one i, then clearly the point belongs to class A^i. If more than one f_i(x) > 0, or f_i(x) ≤ 0 for all i = 1, ..., k, then the class is ambiguous. Thus the general rule is that the class of a point x is determined from (w^i, γ_i), i = 1, ..., k, by finding i such that

f_i(x) = x^T w^i − γ_i    (13)

is maximized. Figure 5 shows a piecewise-linear function f(x) = max_{i=1,2,3} f_i(x) on R that separates three sets. Note either SVM (10) or RLP can be used to construct the k two-class discriminants. For clarity, we will call this method used with SVM (10) k-SVM. We will denote this method used with RLP (8) k-RLP. The advantage of k-SVM is that it can be used for piecewise-nonlinear discriminants, while k-RLP is limited to piecewise-linear discriminants.
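A minimal sketch of this one-per-class scheme follows. Here train_two_class stands for any two-class method from this section (for instance, the rlp sketch after Problem (5)); it is an assumption of the illustration, not a routine defined in the paper.

```python
# Train k two-class discriminants (class i versus the rest) and
# classify by the maximal f_i(x) = x'w^i - gamma_i, as in rule (13).
import numpy as np

def train_one_vs_rest(point_sets, train_two_class):
    """point_sets[i] is the m_i x n matrix A^i; returns a list of (w, gamma)."""
    models = []
    for i, Ai in enumerate(point_sets):
        rest = np.vstack([A for j, A in enumerate(point_sets) if j != i])
        w, gamma = train_two_class(Ai, rest)   # separate class i from the rest
        models.append((w, gamma))
    return models

def classify(x, models):
    """Pick the class whose f_i(x) is maximized; this also resolves
    multiply classified or unclassified points."""
    scores = [x @ w - gamma for (w, gamma) in models]
    return int(np.argmax(scores))
```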

For both k-SVM and k-RLP to attain perfect training set accuracy, the following inequalities must be satisfied:

A^i w^i − γ_i e > A^i w^j − γ_j e,  i, j = 1, ..., k,  i ≠ j

This inequality can be used as a definition of piecewise-linear separability.

Definition 2.1 (Piecewise-linear Separability) The sets of points A^i, i = 1, ..., k, represented by the matrices A^i ∈ R^{m_i × n}, i = 1, ..., k, are piecewise-linearly separable if there exist w^i ∈ R^n and γ_i ∈ R, i = 1, ..., k, such that

A^i w^i − γ_i e > A^i w^j − γ_j e,  i, j = 1, ..., k,  i ≠ j    (14)

Equivalent to Definition 2.1, finding the piecewise-linear separator involves solving the system

A^i w^i − γ_i e ≥ A^i w^j − γ_j e + e,  i, j = 1, ..., k,  i ≠ j

This can be rewritten as

0 ≥ −A^i (w^i − w^j) + (γ_i − γ_j) e + e,  i, j = 1, ..., k,  i ≠ j

Figure 6 shows an example of a piecewise-linear separator for three classes in two dimensions. The linear separating functions are represented by the quantities (w^i − w^j, γ_i − γ_j), i, j = 1, ..., k, j ≠ i, where w^i ∈ R^n and γ_i ∈ R, i = 1, ..., k.

Figure 6: Three classes separated by a piecewise-linear function; the separating planes are x^T (w^1 − w^2) = γ_1 − γ_2, x^T (w^2 − w^3) = γ_2 − γ_3, and x^T (w^3 − w^1) = γ_3 − γ_1

The M-RLP method (originally called the Multicategory Discrimination method) proposed and investigated in [5, 6] can be used to find (w^i, γ_i), i = 1, ..., k, satisfying Definition 2.1:

min_{w^i, γ_i, y^{ij}}  Σ_{i=1}^{k} Σ_{j=1, j≠i}^{k} (1/m_i) e^T y^{ij}
s.t.  y^{ij} ≥ −A^i (w^i − w^j) + (γ_i − γ_j) e + e
      y^{ij} ≥ 0,  i ≠ j,  i, j = 1, ..., k    (15)

where y^{ij} ∈ R^{m_i}. In M-RLP (15), if the optimal objective value is zero, then the dataset is piecewise-linearly separable. If the dataset is not piecewise-linearly separable, the positive values of the variables y^{ij} are proportional to the magnitude of the misclassification of the points from the plane x^T (w^i − w^j) = (γ_i − γ_j) + 1.

This program (15) is a generalization of the two-class RLP linear program (5) to the multicategory case. Like the original RLP (5), M-RLP does not include any terms for maximizing the margin, and it does not directly permit the use of generalized inner products or kernels to allow extension to the nonlinear case. So in the next section we will show how M-RLP and SVM can be combined by including margin maximization and generalized inner products into M-RLP.
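To make the single-program character of M-RLP concrete, here is a hedged sketch that assembles Problem (15) for scipy.optimize.linprog. The variable layout and function name are my own choices; the paper's experiments use the MINOS solver instead.

```python
# A minimal sketch of the single linear program M-RLP (15).
import numpy as np
from scipy.optimize import linprog

def m_rlp(point_sets):
    """point_sets[i] is the m_i x n matrix A^i; returns (W, gammas)."""
    k = len(point_sets)
    n = point_sets[0].shape[1]
    ms = [A.shape[0] for A in point_sets]
    pairs = [(i, j) for i in range(k) for j in range(k) if j != i]
    ny = sum(ms[i] for i, _ in pairs)          # total slack variables y^{ij}
    nv = k * n + k + ny                        # x = [w^1..w^k, gamma, y]
    c = np.zeros(nv)
    rows, rhs = [], []
    off = k * n + k                            # start of the y block
    for i, j in pairs:
        Ai, mi = point_sets[i], ms[i]
        c[off:off + mi] = 1.0 / mi             # objective term (1/m_i) e'y^{ij}
        # -A^i(w^i - w^j) + (gamma_i - gamma_j)e + e - y^{ij} <= 0
        R = np.zeros((mi, nv))
        R[:, i*n:(i+1)*n] = -Ai
        R[:, j*n:(j+1)*n] = Ai
        R[:, k*n + i] = 1.0
        R[:, k*n + j] = -1.0
        R[:, off:off + mi] = -np.eye(mi)
        rows.append(R)
        rhs.append(-np.ones(mi))
        off += mi
    bounds = [(None, None)] * (k * n + k) + [(0, None)] * ny
    res = linprog(c, A_ub=np.vstack(rows), b_ub=np.concatenate(rhs),
                  bounds=bounds, method="highs")
    W = res.x[:k*n].reshape(k, n)
    gammas = res.x[k*n:k*n + k]
    return W, gammas       # classify by the maximal x'w^i - gamma_i, as in (13)
```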

3 Formulation of M-SVM: Piecewise-linear Separable Case

We now propose to construct piecewise-linear and piecewise-nonlinear SVMs using a single quadratic program. Analogous to the two-class case, we start by formulating the optimal piecewise-linear separator for the separable case. Assume that the k sets of points are piecewise-linearly separable, i.e., there exist w^i ∈ R^n and γ_i ∈ R, i = 1, ..., k, such that

A^i w^i − γ_i e > A^i w^j − γ_j e,  i, j = 1, ..., k,  i ≠ j    (16)

The class of a point x is determined from (w^i, γ_i), i = 1, ..., k, by finding i such that

f_i(x) = x^T w^i − γ_i    (17)

is maximized. For this piecewise-linearly separable problem, infinitely many (w^i, γ_i) exist that satisfy (16). Intuitively, the optimal (w^i, γ_i) provides the largest margin of classification. So, in an approach analogous to the two-class support vector machine (SVM) approach, we add regularization terms. The dashed lines in Figure 7 represent the margins for each piece (w^i − w^j, γ_i − γ_j) of the piecewise-linear separating function. The margin of separation between the classes i and j, i.e., the distance between the planes x^T (w^i − w^j) = (γ_i − γ_j) + 1 and x^T (w^i − w^j) = (γ_i − γ_j) − 1, is 2/‖w^i − w^j‖. So, we would like to minimize ‖w^i − w^j‖ for all i, j = 1, ..., k, i ≠ j. Also, we will add the regularization term (1/2) Σ_{i=1}^{k} ‖w^i‖² to the objective. For the piecewise-linearly separable problem we get the following:

min_{w^i, γ_i}  (1/2) Σ_{i=1}^{k} Σ_{j=i+1}^{k} ‖w^i − w^j‖² + (1/2) Σ_{i=1}^{k} ‖w^i‖²
s.t.  A^i (w^i − w^j) − e(γ_i − γ_j) − e ≥ 0,  i, j = 1, ..., k,  i ≠ j    (18)

To simplify the notation for the formulation of the piecewise-linear SVM, we rewrite this in matrix notation. See Appendix A for complete matrix definitions for general k. For the three class problem (k = 3) the following matrices are obtained. Let

C = [ I  −I   0 ]
    [ I   0  −I ]
    [ 0   I  −I ]

Figure 7: Piecewise-linear separator with margins for three classes; the dashed margin planes for the piece (w^1 − w^2, γ_1 − γ_2) are (w^1 − w^2) x = (γ_1 − γ_2) + 1 and (w^1 − w^2) x = (γ_1 − γ_2) − 1

where I ∈ R^{n×n} is the identity matrix. Let

Ā = [ A^1  −A^1    0  ]        Ē = [ −e^1   e^1    0  ]
    [ A^1    0   −A^1 ]            [ −e^1    0    e^1 ]
    [−A^2   A^2    0  ]            [  e^2  −e^2    0  ]
    [  0    A^2  −A^2 ]            [  0   −e^2   e^2 ]
    [−A^3    0    A^3 ]            [  e^3    0   −e^3 ]
    [  0   −A^3   A^3 ]            [  0    e^3  −e^3 ]

where A^i ∈ R^{m_i × n}, i = 1, ..., 3, and e^i ∈ R^{m_i}, i = 1, ..., 3, is a vector of ones. Using this notation, for fixed k > 2 the program becomes:

min_{w,γ}  (1/2) ‖Cw‖² + (1/2) ‖w‖²
s.t.  Āw + Ēγ − e ≥ 0    (19)

where w = [w^{1T}, w^{2T}, ..., w^{kT}]^T and γ = [γ_1, γ_2, ..., γ_k]^T. The dual of this problem can be written as:

max_{u,w,γ}  (1/2) ‖Cw‖² + (1/2) ‖w‖² − u^T (Āw + Ēγ − e)
s.t.  (I + C^T C) w = Ā^T u
      Ē^T u = 0
      u ≥ 0    (20)
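For completeness, here is a small sketch that builds Ā and Ē for general k under the ordered-pair row layout shown above (one block row per class i and each j ≠ i); the function name is my own.

```python
# Construct the block matrices Abar and Ebar of Section 3 so that
# each row of Abar w + Ebar gamma - e >= 0 reads
# A^i_p (w^i - w^j) - (gamma_i - gamma_j) - 1 >= 0.
import numpy as np

def build_Abar_Ebar(point_sets):
    """point_sets[i] is the m_i x n matrix A^i; returns (Abar, Ebar)."""
    k = len(point_sets)
    n = point_sets[0].shape[1]
    A_rows, E_rows = [], []
    for i in range(k):
        mi = point_sets[i].shape[0]
        for j in range(k):
            if j == i:
                continue
            Arow = np.zeros((mi, k * n))
            Arow[:, i*n:(i+1)*n] = point_sets[i]
            Arow[:, j*n:(j+1)*n] = -point_sets[i]
            Erow = np.zeros((mi, k))
            Erow[:, i] = -1.0
            Erow[:, j] = 1.0
            A_rows.append(Arow)
            E_rows.append(Erow)
    return np.vstack(A_rows), np.vstack(E_rows)
```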

To eliminate the variables w and γ from this problem we will first show that the matrix (I + C^T C) is nonsingular.

Proposition 3.1 (Nonsingularity of (I + C^T C)) The inverse of the matrix (I + C^T C) for k > 2 is

(I_{kn} + C^T C)^{−1} = [ (2/(k+1)) I_n   (1/(k+1)) I_n   ...   (1/(k+1)) I_n ]
                        [ (1/(k+1)) I_n   (2/(k+1)) I_n   ...   (1/(k+1)) I_n ]
                        [      ...              ...       ...        ...     ]
                        [ (1/(k+1)) I_n   (1/(k+1)) I_n   ...   (2/(k+1)) I_n ]    (21)

where I_n indicates the n × n identity matrix.

Proof. To show that (I + C^T C) is nonsingular for k > 2, we will calculate its inverse. The matrix C as defined in Appendix A has size n Σ_{i=2}^{k} (i − 1) × kn. Recall that n indicates the dimension of the feature space. The matrix

C^T C = [ (k−1) I_n     −I_n      ...     −I_n    ]
        [    −I_n    (k−1) I_n    ...     −I_n    ]
        [     ...        ...      ...      ...    ]
        [    −I_n       −I_n      ...  (k−1) I_n  ]

has size kn × kn. Therefore

I_{kn} + C^T C = [ k I_n   −I_n   ...   −I_n  ]
                 [ −I_n    k I_n  ...   −I_n  ]
                 [  ...     ...   ...    ...  ]
                 [ −I_n    −I_n   ...   k I_n ]

Through simple calculations it can be shown that the inverse of this matrix is (21).

Using Proposition 3.1 the following relationship results:

(I + C^T C)^{−1} Ā^T = (1/(k+1)) Ā^T    (22)
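Proposition 3.1 is easy to check numerically. The following sketch builds C for a small k, forms I + C^T C, and compares its inverse against formula (21); it is a verification aid, not part of the method.

```python
# Numerical check of Proposition 3.1 for the pairwise block matrix C.
import numpy as np
from itertools import combinations

def build_C(k, n):
    # one block row [ ... I ... -I ... ] per unordered pair i < j
    rows = []
    for i, j in combinations(range(k), 2):
        block = np.zeros((n, k * n))
        block[:, i*n:(i+1)*n] = np.eye(n)
        block[:, j*n:(j+1)*n] = -np.eye(n)
        rows.append(block)
    return np.vstack(rows)

k, n = 4, 3
C = build_C(k, n)
M = np.eye(k * n) + C.T @ C
# formula (21): diagonal blocks 2/(k+1) I_n, off-diagonal blocks 1/(k+1) I_n
F = (np.kron(np.ones((k, k)), np.eye(n)) + np.eye(k * n)) / (k + 1)
assert np.allclose(np.linalg.inv(M), F)
```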

It follows from Problem (20) and equation (22) that

w = (I + C^T C)^{−1} Ā^T u = (1/(k+1)) Ā^T u    (23)

Using this relationship, we eliminate w from the dual problem. Additionally, γ is removed because Ē^T u = 0. After some simplification the new dual problem becomes:

max_u  e^T u − (1/(2(k+1))) u^T Ā Ā^T u
s.t.  Ē^T u = 0
      u ≥ 0    (24)

To construct the multicategory support vector machine, it is convenient to write this problem in summation notation. Let the dual vector u^T = [u^{12T}, u^{13T}, ..., u^{1kT}, u^{21T}, u^{23T}, ..., u^{k(k−1)T}], where u^{ij} ∈ R^{m_i}. The resulting dual problem for piecewise-linear datasets is:

max_u  Σ_{i=1}^{k} Σ_{j≠i} Σ_{l=1}^{m_i} u^{ij}_l
       − (1/(2(k+1))) Σ_{i=1}^{k} [ Σ_{j≠i} Σ_{l≠i} Σ_{p=1}^{m_i} Σ_{q=1}^{m_i} u^{ij}_p u^{il}_q A^i_p A^{iT}_q
         − 2 Σ_{j≠i} Σ_{l≠i} Σ_{p=1}^{m_j} Σ_{q=1}^{m_i} u^{ji}_p u^{il}_q A^j_p A^{iT}_q
         + Σ_{j≠i} Σ_{l≠i} Σ_{p=1}^{m_j} Σ_{q=1}^{m_l} u^{ji}_p u^{li}_q A^j_p A^{lT}_q ]
s.t.  Σ_{j≠i} Σ_{l=1}^{m_i} u^{ij}_l − Σ_{j≠i} Σ_{l=1}^{m_j} u^{ji}_l = 0  for i = 1, ..., k
      u^{ij}_l ≥ 0  for i, j = 1, ..., k, i ≠ j, and l = 1, ..., m_i    (25)

where m_i is the number of points in class i. Recall, for the piecewise-linear classification function, the class of a point x is determined by finding i = 1, ..., k such that

f_i(x) = x^T w^i − γ_i    (26)

is maximized. From equation (23),

w = [w^1; w^2; ...; w^k] = (1/(k+1)) Ā^T u

Solving for w^i in summation notation we get:

w^i = (1/(k+1)) [ Σ_{j≠i} Σ_{p=1}^{m_i} u^{ij}_p A^{iT}_p − Σ_{j≠i} Σ_{p=1}^{m_j} u^{ji}_p A^{jT}_p ]

Therefore,

f_i(x) = (1/(k+1)) [ Σ_{j≠i} Σ_{p=1}^{m_i} u^{ij}_p x^T A^{iT}_p − Σ_{j≠i} Σ_{p=1}^{m_j} u^{ji}_p x^T A^{jT}_p ] − γ_i

4 Formulation of M-SVM: Piecewise-nonlinearly Separable Case

Just like in the two-class case, M-SVM can be generalized to piecewise-nonlinear functions. To construct the separating functions f_i(x) in a higher dimensional feature space, the original data points x are transformed by some function φ(x): R^n → R^{n'} [23, 8]. The function f_i(x) is now related to the sum of dot products of vectors in this higher dimensional feature space:

f_i(x) = (1/(k+1)) [ Σ_{j≠i} Σ_{p=1}^{m_i} u^{ij}_p (φ(x) · φ(A^{iT}_p)) − Σ_{j≠i} Σ_{p=1}^{m_j} u^{ji}_p (φ(x) · φ(A^{jT}_p)) ] − γ_i

According to [23], any symmetric function K(x, x_i) ∈ L_2 that satisfies Mercer's Theorem [9] can replace the dot product (φ(x) · φ(x_i)). Mercer's Theorem guarantees that any eigenvalue λ_j in the expansion K(x, x_i) = Σ_j λ_j (φ_j(x) · φ_j(x_i)) is positive. This is a sufficient condition for a function K(x, x_i) to define a dot product in the higher dimensional feature space. Therefore we let K(x, x_i) = (φ(x) · φ(x_i)).

Returning to dual Problem (25), the objective function contains dot products A^j_p A^{iT}_q of pairs of points in the original feature space. To transform the points to a higher dimensional feature space we replace these dot products by K(A^{jT}_p, A^{iT}_q). The resulting M-SVM for piecewise-linearly separable datasets is:

max_u  Σ_{i=1}^{k} Σ_{j≠i} Σ_{l=1}^{m_i} u^{ij}_l
       − (1/(2(k+1))) Σ_{i=1}^{k} [ Σ_{j≠i} Σ_{l≠i} Σ_{p=1}^{m_i} Σ_{q=1}^{m_i} u^{ij}_p u^{il}_q K(A^{iT}_p, A^{iT}_q)
         − 2 Σ_{j≠i} Σ_{l≠i} Σ_{p=1}^{m_j} Σ_{q=1}^{m_i} u^{ji}_p u^{il}_q K(A^{jT}_p, A^{iT}_q)
         + Σ_{j≠i} Σ_{l≠i} Σ_{p=1}^{m_j} Σ_{q=1}^{m_l} u^{ji}_p u^{li}_q K(A^{jT}_p, A^{lT}_q) ]
s.t.  Σ_{j≠i} Σ_{l=1}^{m_i} u^{ij}_l − Σ_{j≠i} Σ_{l=1}^{m_j} u^{ji}_l = 0  for i = 1, ..., k
      u^{ij}_l ≥ 0  for i, j = 1, ..., k, i ≠ j, and l = 1, ..., m_i    (27)

The points A^i_l corresponding to nonzero dual variables u^{ij}_l, j = 1, ..., k, j ≠ i, are referred to as support vectors. It is possible for A^i_l to correspond with more than one nonzero variable u^{ij}_l, j = 1, ..., k, j ≠ i.
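For the linear kernel, dual problem (24) and its bounded variant (30) below can be handed to an off-the-shelf convex solver. The following sketch uses cvxpy and the build_Abar_Ebar helper from the earlier snippet; it is an illustration under those assumptions, not the reduced-gradient MINOS approach used in Section 6.

```python
# A hedged cvxpy sketch of the linear-kernel M-SVM dual; with the upper
# bound (1 - lam)/lam it is problem (30), and letting lam -> 0 removes
# the bound and recovers the separable dual (24).
import cvxpy as cp
import numpy as np

def msvm_dual(Abar, Ebar, k, lam=0.5):
    m = Abar.shape[0]
    u = cp.Variable(m)
    # u' Abar Abar' u is written as ||Abar' u||^2 to keep the problem convex
    obj = cp.Maximize(cp.sum(u) - cp.sum_squares(Abar.T @ u) / (2 * (k + 1)))
    cons = [u >= 0, u <= (1 - lam) / lam, Ebar.T @ u == 0]
    cp.Problem(obj, cons).solve()
    w = Abar.T @ u.value / (k + 1)        # recover w via relationship (23)
    return u.value, w.reshape(k, -1)      # rows are w^1, ..., w^k
```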

Figure 8: Piecewise-polynomial separation of three classes in two dimensions. Support vectors are indicated with circles.

In Figure 8, support vectors are represented by a circle around the point. Some points have double circles, which indicate that two dual variables u^{ij}_l > 0, j = 1, ..., 3, j ≠ i. By the complementarity within the KKT conditions [14], u^{ij}_l > 0 implies A^i_l (w^i − w^j) = (γ_i − γ_j) + 1. Consequently the support vectors are located closest to the separating function. In fact, the remainder of the points, those that are not support vectors, are not necessary in the construction of the separating function. The resulting nonlinear classification problem for a point x is to find i = 1, ..., k such that the classification function

f_i(x) = (1/(k+1)) Σ_{j≠i} [ Σ_{support vectors A^i_p} u^{ij}_p K(x, A^{iT}_p) − Σ_{support vectors A^j_p} u^{ji}_p K(x, A^{jT}_p) ] − γ_i    (28)

is maximized.

5 Formulation of M-SVM: Piecewise Inseparable Case

The preceding sections provided a formulation for the piecewise-linearly and piecewise-nonlinearly separable cases. To construct a classification function for a piecewise-linearly inseparable dataset, we must first choose an error minimization criterion. The technique used in the preceding sections of formulating the M-SVM for piecewise-linearly separable datasets can be combined with the 1-norm error criterion used in Problem (15) of Bennett and Mangasarian [6]. The result is the M-SVM for piecewise-linearly inseparable problems. Using the same matrix notation as in Section 3, we add the terms (1/2)‖Cw‖² and (1/2)‖w‖² to the objective of Problem (15). The resulting primal problem is as follows:

min_{w,γ,y}  λ ((1/2) ‖Cw‖² + (1/2) ‖w‖²) + (1 − λ) e^T y
s.t.  Āw + Ēγ − e + y ≥ 0
      y ≥ 0    (29)

where y = [y^{12T}, y^{13T}, ..., y^{1kT}, y^{21T}, ..., y^{k(k−1)T}]^T and 0 < λ < 1. Solving for the dual, substituting w = (1/(k+1)) Ā^T u, and simplifying produces the following problem:

max_u  u^T e − (1/(2(k+1))) u^T Ā Ā^T u
s.t.  0 ≤ u ≤ ((1 − λ)/λ) e
      Ē^T u = 0    (30)

As shown in Proposition 5.1, Problem (30) maximizes a concave quadratic objective over a bounded polyhedral set. Thus a locally optimal solution is globally optimal.

Proposition 5.1 (Concavity of objective) The function u^T e − (1/(2(k+1))) u^T Ā Ā^T u is concave.

Proof. The matrix Ā Ā^T is always positive semi-definite and symmetric. Thus the Hessian matrix (= −(1/(k+1)) Ā Ā^T) is negative semi-definite. Therefore, the objective is a concave function.

Problem (30) is identical to Problem (24) in the piecewise-linearly separable case except that the dual variables are now bounded above by (1 − λ)/λ. Therefore, transforming the data points A^i proceeds identically as in Section 4. Using the function K(x, x_i) to denote the dot product in some feature space, the final M-SVM results:

max_u  Σ_{i=1}^{k} Σ_{j≠i} Σ_{l=1}^{m_i} u^{ij}_l
       − (1/(2(k+1))) Σ_{i=1}^{k} [ Σ_{j≠i} Σ_{l≠i} Σ_{p=1}^{m_i} Σ_{q=1}^{m_i} u^{ij}_p u^{il}_q K(A^{iT}_p, A^{iT}_q)
         − 2 Σ_{j≠i} Σ_{l≠i} Σ_{p=1}^{m_j} Σ_{q=1}^{m_i} u^{ji}_p u^{il}_q K(A^{jT}_p, A^{iT}_q)
         + Σ_{j≠i} Σ_{l≠i} Σ_{p=1}^{m_j} Σ_{q=1}^{m_l} u^{ji}_p u^{li}_q K(A^{jT}_p, A^{lT}_q) ]
s.t.  Σ_{j≠i} Σ_{l=1}^{m_i} u^{ij}_l − Σ_{j≠i} Σ_{l=1}^{m_j} u^{ji}_l = 0  for i = 1, ..., k
      0 ≤ u^{ij}_l ≤ (1 − λ)/λ  for i, j = 1, ..., k, i ≠ j, and l = 1, ..., m_i    (31)

As in Sections 3 and 4, the class of a point x is determined by finding the maximum of the functions

f_i(x) = (1/(k+1)) Σ_{j≠i} [ Σ_{support vectors A^i_p} u^{ij}_p K(x, A^{iT}_p) − Σ_{support vectors A^j_p} u^{ji}_p K(x, A^{jT}_p) ] − γ_i    (32)

for i = 1, ..., k. To determine the threshold values γ_i, i = 1, ..., k, we solve the primal problem (29) with w fixed, where Āw is transformed to the higher dimensional feature space. This problem is as follows:

min_{γ,y}  Σ_{i=1}^{k} Σ_{j≠i} e^T y^{ij}
s.t.  y^{ij}_q ≥ (γ_i − γ_j) + 1
        − (1/(k+1)) [ Σ_{l≠i} Σ_{r=1}^{m_i} K(A^{iT}_q, A^{iT}_r) u^{il}_r − Σ_{l≠i} Σ_{r=1}^{m_l} K(A^{iT}_q, A^{lT}_r) u^{li}_r ]
        + (1/(k+1)) [ Σ_{l≠j} Σ_{r=1}^{m_j} K(A^{iT}_q, A^{jT}_r) u^{jl}_r − Σ_{l≠j} Σ_{r=1}^{m_l} K(A^{iT}_q, A^{lT}_r) u^{lj}_r ]
      y^{ij}_q ≥ 0,  i, j = 1, ..., k, i ≠ j, q = 1, ..., m_i    (33)

The right side of the constraints is constant. Thus Problem (33) is a linear program and is easily solved.

6 Computational Experiments

In this section, we present computational results comparing M-SVM (32), M-RLP (15), k-SVM using SVM (10), and k-RLP using RLP (8). Several experiments on real-world datasets are reported. A description of each of the datasets follows this paragraph.

Each of these methods was implemented using the MINOS 5.4 [17] solver. The quadratic programming problems for M-SVM and k-SVM were solved using the nonlinear solver implemented in MINOS 5.4. This solver uses a reduced-gradient algorithm in conjunction with a quasi-Newton method. In M-SVM, k-SVM and M-RLP, the selected values for λ are given. Better solutions may result with different choices of λ. Additionally, it is not necessary for the same value of λ to be used for both methods. The kernel function for the piecewise-nonlinear M-SVM and k-SVM methods is K(x, x_i) = ((x^T x_i)/n + 1)^d, where d is the degree of the desired polynomial.

Wine Recognition Data. The Wine dataset [1] uses the chemical analysis of wine to determine the cultivar. There are 178 points with 13 features. This is a three class dataset distributed as follows: 59 points in class 1, 71 points in class 2, and 48 points in class 3. This dataset is available via anonymous file transfer protocol (ftp) from the UCI Repository of Machine Learning Databases and Domain Theories [16] at ftp://ftp.ics.uci.edu/pub/machine-learning-databases.

Glass Identification Database. The Glass dataset [11] is used to identify the origin of a sample of glass through chemical analysis. This dataset is comprised of six classes of 214 points with 9 features. The distribution of points by class is as follows: 70 float processed building windows, 17 float processed vehicle windows, 76 non-float processed building windows, 13 containers, 9 tableware, and 29 headlamps. This dataset is also available from the UCI Repository [16] at ftp://ftp.ics.uci.edu/pub/machine-learning-databases.

US Postal Service Database. The USPS Database [10] contains zipcode samples from actual mail. This database is comprised of separate training and testing sets. There are 7291 samples in the training set and 2007 samples in the testing set. Each sample belongs to one of ten classes: the integers 0 through 9. The samples are represented by 256 features.

Two experiments were performed. In the first, the datasets were normalized between −1 and 1, and 10-fold cross validation was used to estimate generalization on future data. The second experiment was conducted on two subsets of the United States Postal Service (USPS) data. This data contains handwriting samples of the integers 0 through 9. The objective of this dataset is to quickly and effectively interpret zipcodes. This data has separate training and testing sets, each of which consist of the ten integer classes. We compiled two individual training subsets from the USPS training data. The first subset contains 1756 examples, each belonging to one of the classes 3, 5, and 8. We call this set the USPS-1 training data. The second subset contains 1961 examples, each belonging to one of the classes 4, 6, and 7. We call this set the USPS-2 training data. Similarly, two subsets are created from the testing data. In all of these datasets, the data values are scaled by 200. Testing set accuracies are reported for all four methods. The total numbers of unique support vectors in the resulting classification functions for the M-SVM and k-SVM methods are given.

Table 1 contains results for M-RLP, k-RLP, M-SVM, and k-SVM on the Wine and Glass datasets. As anticipated, adding the regularization term to the degree one problem in M-SVM produced better testing generalization than M-RLP on the Wine dataset.

Data    Method    Degree 1   Degree 2   Degree 3   Degree 4   Degree 5
Wine    M-RLP
        k-RLP
        M-SVM     (378)      (29)       (258)      (239)      (228)
        k-SVM     (537)      (424)      (405)      (394)      (43)
Glass   M-RLP
        k-RLP
        M-SVM     (759)      (660)      (595)      (533)      (476)
        k-SVM     (898)      (854)      (796)      (769)      (734)

Table 1: Percent testing set accuracies and (total number of support vectors) for M-SVM and k-SVM. λ = 0.5 for k-RLP, M-SVM, and k-SVM.

The Wine dataset is piecewise-linearly separable. Therefore, the M-RLP method has infinitely many optimal solutions. However, the testing accuracy for M-SVM with degree one on the Glass data was much lower than the M-RLP accuracy. This may indicate that the choice of λ is too large. However, as the degree increases, the accuracy of the M-SVM method improves and exceeds the M-RLP results. The k-SVM method generalized surprisingly well. The testing accuracies reported for k-SVM on the Wine dataset are higher than those of M-SVM. The linear k-RLP method performed just as well as the quadratic k-SVM program on the Wine dataset and better than the M-SVM and M-RLP methods. On the Glass data, as the degree increases, both methods, M-SVM and k-SVM, improve dramatically in testing accuracy. Using higher degree polynomials, the M-SVM and k-SVM methods surpass the accuracies of M-RLP and k-RLP. This demonstrates the potential of polynomial and piecewise-polynomial classification functions over linear and piecewise-linear functions.

Table 2 contains results for the four methods on the USPS data subsets. Similar observations as above can be made. Both of these datasets are piecewise-linearly separable. The solution that M-RLP found for each of these datasets tests significantly lower than the other methods. The k-SVM method generalizes slightly better than M-SVM. The k-RLP method reports similar accuracies as the k-SVM method. Additionally, it solves linear programs rather than quadratic programs, so the computational training time is significantly smaller than the other methods. Changing the parameter λ may further improve generalization. The M-SVM method consistently finds classification functions using fewer support vectors than those of k-SVM.

Data     Method    Degree 1   Degree 2   Degree 3   Degree 4   Degree 5
USPS-1   M-RLP
         k-RLP
         M-SVM     (45)       (327)      (32)       (305)      (37)
         k-SVM     (666)      (557)      (54)       (59)       (56)
USPS-2   M-RLP
         k-RLP
         M-SVM     (228)      (85)       (67)       (66)       (80)
         k-SVM     (383)      (33)       (303)      (294)      (289)

Table 2: Percent testing set accuracies and (total number of support vectors) for M-SVM and k-SVM. λ = 0.5 for k-SVM and λ = 0.3 for k-RLP and M-SVM.

Degree   M-RLP   k-RLP   M-SVM   k-SVM

Table 3: Total computational training time (in seconds) for M-RLP, k-RLP, M-SVM, and k-SVM on USPS-1.

With fewer support vectors, a sample can be classified more quickly since the dot-product of the sample with each support vector must be computed. Thus the M-SVM would be a good method to choose when classification time is critical. CPU times for training all four methods on the USPS-1 dataset are reported in Table 3. The times for all the datasets are not listed because the programs were run using a batch system on clusters of machines, so the timing was not reliable. However, the trends were clear. The k-RLP method is significantly faster than the other methods. In the M-SVM and k-SVM methods, as the degree increased the computational time would decrease, and then after a certain degree is reached it would increase. The degree of the polynomial for which it starts to increase varies by dataset. Surprisingly, for the USPS datasets the k-SVM method was faster than the M-RLP method. This was not the case for the Wine and Glass datasets. The M-RLP method had faster training times than k-SVM for these datasets. The times reported are for IBM RS/6000 model 590 workstations with 128 MB RAM.

7 Conclusions

We have examined four methods for the solution of multicategory discrimination problems based on the LP methods of Mangasarian and the QP methods for SVM of Vapnik. The two-class methods, RLP and SVM, differ only in the norm of the regularization term. In the past, two different approaches had been used for the k > 2 class case. The method we called k-SVM constructed k two-class discriminants using k quadratic programs. The resulting classifier was a piecewise-linear or piecewise-nonlinear discriminant function depending on what kernel function was used in the SVM. The original multicategory RLP for k classes constructed a piecewise-linear discriminant using a single linear program. We proposed two new hybrid approaches. Like the k-SVM method, k-RLP uses LP to construct k two-class discriminants. We also formulated a new approach, M-SVM. We began the formulation by adding regularization terms to M-RLP. Then, like k-SVM with piecewise-nonlinear discriminants, the nonlinear pieces are found by mapping the original data points into a higher dimensional feature space. This transformation appeared in the dual problem as an inner product of two points in the higher dimensional space. A generalized inner product was used to make the problem tractable. The new M-SVM method requires the solution of a single quadratic program.

We performed a computational study of the four methods on four datasets. In general, we found that k-SVM and k-RLP generalized best. However, M-SVM used fewer support vectors, a counter-intuitive result since for the two-class case Statistical Learning Theory predicts that fewer support vectors should result in better generalization. The theoretical justification of the better generalization of k-SVM and k-RLP over M-SVM and M-RLP is an open question. The k-RLP method provided accurate and efficient results on the piecewise-linearly separable datasets. The k-SVM method also tested surprisingly well but requires the solution of k quadratic programs. The M-SVM method used fewer support vectors, thus providing solutions with smaller classification time. On the piecewise-linearly inseparable dataset, the polynomial and piecewise-polynomial classifiers provided an improvement over the M-RLP and k-RLP methods. On the other datasets, the k-RLP method found solutions that generalized best or nearly best in less computational time.

A Matrix Representations for Multicategory Support Vector Machines

This appendix contains the definitions of the matrices used for the general k-class SVM formulation (19):

min_{w,γ}  (1/2) ‖Cw‖² + (1/2) ‖w‖²
s.t.  Āw + Ēγ − e ≥ 0

Let

C = [ I  −I   0  ...   0 ]
    [ I   0  −I  ...   0 ]
    [ ...               ]
    [ I   0   0  ...  −I ]
    [ 0   I  −I  ...   0 ]
    [ ...               ]
    [ 0  ...  0   I  −I ]    (34)

where I ∈ R^{n×n} is the identity matrix. The matrix C has n Σ_{i=2}^{k} (i − 1) rows and kn columns.

Let

Ā = [ A^1  −A^1    0   ...    0  ]
    [ A^1    0   −A^1  ...    0  ]
    [ ...                        ]
    [ A^1    0     0   ...  −A^1 ]
    [−A^2   A^2    0   ...    0  ]
    [  0    A^2  −A^2  ...    0  ]
    [ ...                        ]
    [  0    A^2    0   ...  −A^2 ]
    [ ...                        ]
    [−A^k    0     0   ...   A^k ]
    [  0   −A^k    0   ...   A^k ]
    [ ...                        ]
    [  0    ...    0  −A^k   A^k ]    (35)

where A^i ∈ R^{m_i × n}. The matrix Ā has (k − 1) Σ_{i=1}^{k} m_i rows and kn columns.

Let

Ē = [−e^1   e^1    0   ...    0  ]
    [−e^1    0    e^1  ...    0  ]
    [ ...                        ]
    [−e^1    0     0   ...   e^1 ]
    [ e^2  −e^2    0   ...    0  ]
    [  0   −e^2   e^2  ...    0  ]
    [ ...                        ]
    [  0   −e^2    0   ...   e^2 ]
    [ ...                        ]
    [ e^k    0     0   ...  −e^k ]
    [  0    e^k    0   ...  −e^k ]
    [ ...                        ]
    [  0    ...    0   e^k  −e^k ]

where e^i ∈ R^{m_i} is a vector of ones. The matrix Ē has (k − 1) Σ_{i=1}^{k} m_i rows and k columns.

References

[1] S. Aeberhard, D. Coomans, and O. de Vel. Comparison of classifiers in high dimensional settings. Technical Report 92-02, Departments of Computer Science and of Mathematics and Statistics, James Cook University of North Queensland, 1992.

[2] K. P. Bennett. Decision tree construction via linear programming. In M. Evans, editor, Proceedings of the 4th Midwest Artificial Intelligence and Cognitive Science Society Conference, pages 97-101, Utica, Illinois, 1992.

[3] K. P. Bennett and E. J. Bredensteiner. Geometry in learning. In C. Gorini, E. Hart, W. Meyer, and T. Philips, editors, Geometry at Work, Washington, DC, 1998. Mathematical Association of America. To appear.

[4] K. P. Bennett and O. L. Mangasarian. Neural network training via linear programming. In P. M. Pardalos, editor, Advances in Optimization and Parallel Computing, pages 56-67, Amsterdam, 1992. North Holland.

[5] K. P. Bennett and O. L. Mangasarian. Multicategory discrimination via linear programming. Optimization Methods and Software, 3:27-39, 1994.

[6] K. P. Bennett and O. L. Mangasarian. Serial and parallel multicategory discrimination. SIAM Journal on Optimization, 4(4), 1994.

[7] V. Blanz, B. Schölkopf, H. Bülthoff, C. Burges, V. Vapnik, and T. Vetter. Comparison of view-based object recognition algorithms using realistic 3D models. In C. von der Malsburg, W. von Seelen, J. C. Vorbrüggen, and B. Sendhoff, editors, Artificial Neural Networks - ICANN'96, Berlin, 1996. Springer Lecture Notes in Computer Science Vol. 1112.

[8] C. Cortes and V. N. Vapnik. Support vector networks. Machine Learning, 20:273-297, 1995.

[9] R. Courant and D. Hilbert. Methods of Mathematical Physics. J. Wiley, New York, 1953.

[10] Y. Le Cun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, and L. J. Jackel. Backpropagation applied to handwritten zip code recognition. Neural Computation, 1:541-551, 1989.

[11] I. W. Evett and E. J. Spiehler. Rule induction in forensic science. Technical report, Central Research Establishment, Home Office Forensic Science Service, Aldermaston, Reading, Berkshire RG7 4PN, 1987.

[12] O. L. Mangasarian. Linear and nonlinear separation of patterns by linear programming. Operations Research, 13:444-452, 1965.

[13] O. L. Mangasarian. Multi-surface method of pattern separation. IEEE Transactions on Information Theory, IT-14:801-807, 1968.

[14] O. L. Mangasarian. Nonlinear Programming. McGraw-Hill, New York, 1969.

[15] O. L. Mangasarian. Mathematical programming in machine learning. In G. DiPillo and F. Giannessi, editors, Proceedings of Nonlinear Optimization and Applications Workshop, New York, 1996. Plenum Press.

[16] P. M. Murphy and D. W. Aha. UCI repository of machine learning databases [http://www.ics.uci.edu/~mlearn/MLRepository.html]. Department of Information and Computer Science, University of California, Irvine, California, 1994.

[17] B. A. Murtagh and M. A. Saunders. MINOS 5.4 user's guide. Technical Report SOL 83-20, Stanford University, 1993.

[18] A. Roy, S. Govil, and R. Miranda. An algorithm to generate radial basis function (RBF)-like nets for classification problems. Neural Networks, 8(2):179-202, 1995.

[19] A. Roy, L. S. Kim, and S. Mukhopadhyay. A polynomial time algorithm for the construction and training of a class of multilayer perceptrons. Neural Networks, 6, 1993.

[20] A. Roy and S. Mukhopadhyay. Pattern classification using linear programming. ORSA Journal on Computing, 3:66-80, 1990.

[21] B. Schölkopf, C. Burges, and V. Vapnik. Incorporating invariances in support vector machines. In C. von der Malsburg, W. von Seelen, J. C. Vorbrüggen, and B. Sendhoff, editors, Artificial Neural Networks - ICANN'96, pages 47-52, Berlin, 1996. Springer Lecture Notes in Computer Science Vol. 1112.

[22] B. Schölkopf, K. Sung, C. Burges, F. Girosi, P. Niyogi, T. Poggio, and V. Vapnik. Comparing support vector machines with Gaussian kernels to radial basis function classifiers. AI Memo No. 1599; CBCL Paper No. 142, Massachusetts Institute of Technology, Cambridge, 1996.

[23] V. Vapnik. The Nature of Statistical Learning Theory. Springer-Verlag, 1995.

[24] V. N. Vapnik. Statistical Learning Theory. John Wiley & Sons, New York, 1996.

[25] V. N. Vapnik and A. Ja. Chervonenkis. Theory of Pattern Recognition. Nauka, Moscow, 1974. In Russian.

[26] W. H. Wolberg and O. L. Mangasarian. Multisurface method of pattern separation for medical diagnosis applied to breast cytology. Proceedings of the National Academy of Sciences, USA, 87:9193-9196, 1990.


More information

Stochastic Variational Inference with Gradient Linearization

Stochastic Variational Inference with Gradient Linearization Stochastic Variationa Inference with Gradient Linearization Suppementa Materia Tobias Pötz * Anne S Wannenwetsch Stefan Roth Department of Computer Science, TU Darmstadt Preface In this suppementa materia,

More information

Problem set 6 The Perron Frobenius theorem.

Problem set 6 The Perron Frobenius theorem. Probem set 6 The Perron Frobenius theorem. Math 22a4 Oct 2 204, Due Oct.28 In a future probem set I want to discuss some criteria which aow us to concude that that the ground state of a sef-adjoint operator

More information

LECTURE NOTES 9 TRACELESS SYMMETRIC TENSOR APPROACH TO LEGENDRE POLYNOMIALS AND SPHERICAL HARMONICS

LECTURE NOTES 9 TRACELESS SYMMETRIC TENSOR APPROACH TO LEGENDRE POLYNOMIALS AND SPHERICAL HARMONICS MASSACHUSETTS INSTITUTE OF TECHNOLOGY Physics Department Physics 8.07: Eectromagnetism II October 7, 202 Prof. Aan Guth LECTURE NOTES 9 TRACELESS SYMMETRIC TENSOR APPROACH TO LEGENDRE POLYNOMIALS AND SPHERICAL

More information

A Solution to the 4-bit Parity Problem with a Single Quaternary Neuron

A Solution to the 4-bit Parity Problem with a Single Quaternary Neuron Neura Information Processing - Letters and Reviews Vo. 5, No. 2, November 2004 LETTER A Soution to the 4-bit Parity Probem with a Singe Quaternary Neuron Tohru Nitta Nationa Institute of Advanced Industria

More information

VALIDATED CONTINUATION FOR EQUILIBRIA OF PDES

VALIDATED CONTINUATION FOR EQUILIBRIA OF PDES SIAM J. NUMER. ANAL. Vo. 0, No. 0, pp. 000 000 c 200X Society for Industria and Appied Mathematics VALIDATED CONTINUATION FOR EQUILIBRIA OF PDES SARAH DAY, JEAN-PHILIPPE LESSARD, AND KONSTANTIN MISCHAIKOW

More information

Convergence Property of the Iri-Imai Algorithm for Some Smooth Convex Programming Problems

Convergence Property of the Iri-Imai Algorithm for Some Smooth Convex Programming Problems Convergence Property of the Iri-Imai Agorithm for Some Smooth Convex Programming Probems S. Zhang Communicated by Z.Q. Luo Assistant Professor, Department of Econometrics, University of Groningen, Groningen,

More information

Kernel pea and De-Noising in Feature Spaces

Kernel pea and De-Noising in Feature Spaces Kerne pea and De-Noising in Feature Spaces Sebastian Mika, Bernhard Schokopf, Aex Smoa Kaus-Robert Muer, Matthias Schoz, Gunnar Riitsch GMD FIRST, Rudower Chaussee 5, 12489 Berin, Germany {mika, bs, smoa,

More information

MARKOV CHAINS AND MARKOV DECISION THEORY. Contents

MARKOV CHAINS AND MARKOV DECISION THEORY. Contents MARKOV CHAINS AND MARKOV DECISION THEORY ARINDRIMA DATTA Abstract. In this paper, we begin with a forma introduction to probabiity and expain the concept of random variabes and stochastic processes. After

More information

Determining The Degree of Generalization Using An Incremental Learning Algorithm

Determining The Degree of Generalization Using An Incremental Learning Algorithm Determining The Degree of Generaization Using An Incrementa Learning Agorithm Pabo Zegers Facutad de Ingeniería, Universidad de os Andes San Caros de Apoquindo 22, Las Condes, Santiago, Chie pzegers@uandes.c

More information

Power Control and Transmission Scheduling for Network Utility Maximization in Wireless Networks

Power Control and Transmission Scheduling for Network Utility Maximization in Wireless Networks ower Contro and Transmission Scheduing for Network Utiity Maximization in Wireess Networks Min Cao, Vivek Raghunathan, Stephen Hany, Vinod Sharma and. R. Kumar Abstract We consider a joint power contro

More information

(f) is called a nearly holomorphic modular form of weight k + 2r as in [5].

(f) is called a nearly holomorphic modular form of weight k + 2r as in [5]. PRODUCTS OF NEARLY HOLOMORPHIC EIGENFORMS JEFFREY BEYERL, KEVIN JAMES, CATHERINE TRENTACOSTE, AND HUI XUE Abstract. We prove that the product of two neary hoomorphic Hece eigenforms is again a Hece eigenform

More information

Moreau-Yosida Regularization for Grouped Tree Structure Learning

Moreau-Yosida Regularization for Grouped Tree Structure Learning Moreau-Yosida Reguarization for Grouped Tree Structure Learning Jun Liu Computer Science and Engineering Arizona State University J.Liu@asu.edu Jieping Ye Computer Science and Engineering Arizona State

More information

4 1-D Boundary Value Problems Heat Equation

4 1-D Boundary Value Problems Heat Equation 4 -D Boundary Vaue Probems Heat Equation The main purpose of this chapter is to study boundary vaue probems for the heat equation on a finite rod a x b. u t (x, t = ku xx (x, t, a < x < b, t > u(x, = ϕ(x

More information

ASummaryofGaussianProcesses Coryn A.L. Bailer-Jones

ASummaryofGaussianProcesses Coryn A.L. Bailer-Jones ASummaryofGaussianProcesses Coryn A.L. Baier-Jones Cavendish Laboratory University of Cambridge caj@mrao.cam.ac.uk Introduction A genera prediction probem can be posed as foows. We consider that the variabe

More information

CONJUGATE GRADIENT WITH SUBSPACE OPTIMIZATION

CONJUGATE GRADIENT WITH SUBSPACE OPTIMIZATION CONJUGATE GRADIENT WITH SUBSPACE OPTIMIZATION SAHAR KARIMI AND STEPHEN VAVASIS Abstract. In this paper we present a variant of the conjugate gradient (CG) agorithm in which we invoke a subspace minimization

More information

Universal Consistency of Multi-Class Support Vector Classification

Universal Consistency of Multi-Class Support Vector Classification Universa Consistency of Muti-Cass Support Vector Cassification Tobias Gasmachers Dae Moe Institute for rtificia Inteigence IDSI, 6928 Manno-Lugano, Switzerand tobias@idsia.ch bstract Steinwart was the

More information

4 Separation of Variables

4 Separation of Variables 4 Separation of Variabes In this chapter we describe a cassica technique for constructing forma soutions to inear boundary vaue probems. The soution of three cassica (paraboic, hyperboic and eiptic) PDE

More information

XSAT of linear CNF formulas

XSAT of linear CNF formulas XSAT of inear CN formuas Bernd R. Schuh Dr. Bernd Schuh, D-50968 Kön, Germany; bernd.schuh@netcoogne.de eywords: compexity, XSAT, exact inear formua, -reguarity, -uniformity, NPcompeteness Abstract. Open

More information

How the backpropagation algorithm works Srikumar Ramalingam School of Computing University of Utah

How the backpropagation algorithm works Srikumar Ramalingam School of Computing University of Utah How the backpropagation agorithm works Srikumar Ramaingam Schoo of Computing University of Utah Reference Most of the sides are taken from the second chapter of the onine book by Michae Nieson: neuranetworksanddeepearning.com

More information

Paragraph Topic Classification

Paragraph Topic Classification Paragraph Topic Cassification Eugene Nho Graduate Schoo of Business Stanford University Stanford, CA 94305 enho@stanford.edu Edward Ng Department of Eectrica Engineering Stanford University Stanford, CA

More information

Separation of Variables and a Spherical Shell with Surface Charge

Separation of Variables and a Spherical Shell with Surface Charge Separation of Variabes and a Spherica She with Surface Charge In cass we worked out the eectrostatic potentia due to a spherica she of radius R with a surface charge density σθ = σ cos θ. This cacuation

More information

PHYS 110B - HW #1 Fall 2005, Solutions by David Pace Equations referenced as Eq. # are from Griffiths Problem statements are paraphrased

PHYS 110B - HW #1 Fall 2005, Solutions by David Pace Equations referenced as Eq. # are from Griffiths Problem statements are paraphrased PHYS 110B - HW #1 Fa 2005, Soutions by David Pace Equations referenced as Eq. # are from Griffiths Probem statements are paraphrased [1.] Probem 6.8 from Griffiths A ong cyinder has radius R and a magnetization

More information

Formulas for Angular-Momentum Barrier Factors Version II

Formulas for Angular-Momentum Barrier Factors Version II BNL PREPRINT BNL-QGS-06-101 brfactor1.tex Formuas for Anguar-Momentum Barrier Factors Version II S. U. Chung Physics Department, Brookhaven Nationa Laboratory, Upton, NY 11973 March 19, 2015 abstract A

More information

DIGITAL FILTER DESIGN OF IIR FILTERS USING REAL VALUED GENETIC ALGORITHM

DIGITAL FILTER DESIGN OF IIR FILTERS USING REAL VALUED GENETIC ALGORITHM DIGITAL FILTER DESIGN OF IIR FILTERS USING REAL VALUED GENETIC ALGORITHM MIKAEL NILSSON, MATTIAS DAHL AND INGVAR CLAESSON Bekinge Institute of Technoogy Department of Teecommunications and Signa Processing

More information

A Statistical Framework for Real-time Event Detection in Power Systems

A Statistical Framework for Real-time Event Detection in Power Systems 1 A Statistica Framework for Rea-time Event Detection in Power Systems Noan Uhrich, Tim Christman, Phiip Swisher, and Xichen Jiang Abstract A quickest change detection (QCD) agorithm is appied to the probem

More information

An approximate method for solving the inverse scattering problem with fixed-energy data

An approximate method for solving the inverse scattering problem with fixed-energy data J. Inv. I-Posed Probems, Vo. 7, No. 6, pp. 561 571 (1999) c VSP 1999 An approximate method for soving the inverse scattering probem with fixed-energy data A. G. Ramm and W. Scheid Received May 12, 1999

More information

Gauss Law. 2. Gauss s Law: connects charge and field 3. Applications of Gauss s Law

Gauss Law. 2. Gauss s Law: connects charge and field 3. Applications of Gauss s Law Gauss Law 1. Review on 1) Couomb s Law (charge and force) 2) Eectric Fied (fied and force) 2. Gauss s Law: connects charge and fied 3. Appications of Gauss s Law Couomb s Law and Eectric Fied Couomb s

More information

BALANCING REGULAR MATRIX PENCILS

BALANCING REGULAR MATRIX PENCILS BALANCING REGULAR MATRIX PENCILS DAMIEN LEMONNIER AND PAUL VAN DOOREN Abstract. In this paper we present a new diagona baancing technique for reguar matrix pencis λb A, which aims at reducing the sensitivity

More information

Some Measures for Asymmetry of Distributions

Some Measures for Asymmetry of Distributions Some Measures for Asymmetry of Distributions Georgi N. Boshnakov First version: 31 January 2006 Research Report No. 5, 2006, Probabiity and Statistics Group Schoo of Mathematics, The University of Manchester

More information

Sequential Decoding of Polar Codes with Arbitrary Binary Kernel

Sequential Decoding of Polar Codes with Arbitrary Binary Kernel Sequentia Decoding of Poar Codes with Arbitrary Binary Kerne Vera Miosavskaya, Peter Trifonov Saint-Petersburg State Poytechnic University Emai: veram,petert}@dcn.icc.spbstu.ru Abstract The probem of efficient

More information

arxiv: v2 [cond-mat.stat-mech] 14 Nov 2008

arxiv: v2 [cond-mat.stat-mech] 14 Nov 2008 Random Booean Networks Barbara Drosse Institute of Condensed Matter Physics, Darmstadt University of Technoogy, Hochschustraße 6, 64289 Darmstadt, Germany (Dated: June 27) arxiv:76.335v2 [cond-mat.stat-mech]

More information

Chapter 1 Decomposition methods for Support Vector Machines

Chapter 1 Decomposition methods for Support Vector Machines Chapter 1 Decomposition methods for Support Vector Machines Support Vector Machines (SVM) are widey used as a simpe and efficient too for inear and noninear cassification as we as for regression probems.

More information

Symbolic models for nonlinear control systems using approximate bisimulation

Symbolic models for nonlinear control systems using approximate bisimulation Symboic modes for noninear contro systems using approximate bisimuation Giordano Poa, Antoine Girard and Pauo Tabuada Abstract Contro systems are usuay modeed by differentia equations describing how physica

More information

Discrete Techniques. Chapter Introduction

Discrete Techniques. Chapter Introduction Chapter 3 Discrete Techniques 3. Introduction In the previous two chapters we introduced Fourier transforms of continuous functions of the periodic and non-periodic (finite energy) type, we as various

More information

Algorithms to solve massively under-defined systems of multivariate quadratic equations

Algorithms to solve massively under-defined systems of multivariate quadratic equations Agorithms to sove massivey under-defined systems of mutivariate quadratic equations Yasufumi Hashimoto Abstract It is we known that the probem to sove a set of randomy chosen mutivariate quadratic equations

More information

A Ridgelet Kernel Regression Model using Genetic Algorithm

A Ridgelet Kernel Regression Model using Genetic Algorithm A Ridgeet Kerne Regression Mode using Genetic Agorithm Shuyuan Yang, Min Wang, Licheng Jiao * Institute of Inteigence Information Processing, Department of Eectrica Engineering Xidian University Xi an,

More information

An explicit Jordan Decomposition of Companion matrices

An explicit Jordan Decomposition of Companion matrices An expicit Jordan Decomposition of Companion matrices Fermín S V Bazán Departamento de Matemática CFM UFSC 88040-900 Forianópois SC E-mai: fermin@mtmufscbr S Gratton CERFACS 42 Av Gaspard Coriois 31057

More information

NEW DEVELOPMENT OF OPTIMAL COMPUTING BUDGET ALLOCATION FOR DISCRETE EVENT SIMULATION

NEW DEVELOPMENT OF OPTIMAL COMPUTING BUDGET ALLOCATION FOR DISCRETE EVENT SIMULATION NEW DEVELOPMENT OF OPTIMAL COMPUTING BUDGET ALLOCATION FOR DISCRETE EVENT SIMULATION Hsiao-Chang Chen Dept. of Systems Engineering University of Pennsyvania Phiadephia, PA 904-635, U.S.A. Chun-Hung Chen

More information

MATH 172: MOTIVATION FOR FOURIER SERIES: SEPARATION OF VARIABLES

MATH 172: MOTIVATION FOR FOURIER SERIES: SEPARATION OF VARIABLES MATH 172: MOTIVATION FOR FOURIER SERIES: SEPARATION OF VARIABLES Separation of variabes is a method to sove certain PDEs which have a warped product structure. First, on R n, a inear PDE of order m is

More information

Another Look at Linear Programming for Feature Selection via Methods of Regularization 1

Another Look at Linear Programming for Feature Selection via Methods of Regularization 1 Another Look at Linear Programming for Feature Seection via Methods of Reguarization Yonggang Yao, The Ohio State University Yoonkyung Lee, The Ohio State University Technica Report No. 800 November, 2007

More information

A proposed nonparametric mixture density estimation using B-spline functions

A proposed nonparametric mixture density estimation using B-spline functions A proposed nonparametric mixture density estimation using B-spine functions Atizez Hadrich a,b, Mourad Zribi a, Afif Masmoudi b a Laboratoire d Informatique Signa et Image de a Côte d Opae (LISIC-EA 4491),

More information

Mixed Volume Computation, A Revisit

Mixed Volume Computation, A Revisit Mixed Voume Computation, A Revisit Tsung-Lin Lee, Tien-Yien Li October 31, 2007 Abstract The superiority of the dynamic enumeration of a mixed ces suggested by T Mizutani et a for the mixed voume computation

More information

Efficient Generation of Random Bits from Finite State Markov Chains

Efficient Generation of Random Bits from Finite State Markov Chains Efficient Generation of Random Bits from Finite State Markov Chains Hongchao Zhou and Jehoshua Bruck, Feow, IEEE Abstract The probem of random number generation from an uncorreated random source (of unknown

More information

Scalable Spectrum Allocation for Large Networks Based on Sparse Optimization

Scalable Spectrum Allocation for Large Networks Based on Sparse Optimization Scaabe Spectrum ocation for Large Networks ased on Sparse Optimization innan Zhuang Modem R&D Lab Samsung Semiconductor, Inc. San Diego, C Dongning Guo, Ermin Wei, and Michae L. Honig Department of Eectrica

More information

Data Mining Technology for Failure Prognostic of Avionics

Data Mining Technology for Failure Prognostic of Avionics IEEE Transactions on Aerospace and Eectronic Systems. Voume 38, #, pp.388-403, 00. Data Mining Technoogy for Faiure Prognostic of Avionics V.A. Skormin, Binghamton University, Binghamton, NY, 1390, USA

More information

BP neural network-based sports performance prediction model applied research

BP neural network-based sports performance prediction model applied research Avaiabe onine www.jocpr.com Journa of Chemica and Pharmaceutica Research, 204, 6(7:93-936 Research Artice ISSN : 0975-7384 CODEN(USA : JCPRC5 BP neura networ-based sports performance prediction mode appied

More information

High Spectral Resolution Infrared Radiance Modeling Using Optimal Spectral Sampling (OSS) Method

High Spectral Resolution Infrared Radiance Modeling Using Optimal Spectral Sampling (OSS) Method High Spectra Resoution Infrared Radiance Modeing Using Optima Spectra Samping (OSS) Method J.-L. Moncet and G. Uymin Background Optima Spectra Samping (OSS) method is a fast and accurate monochromatic

More information

Appendix of the Paper The Role of No-Arbitrage on Forecasting: Lessons from a Parametric Term Structure Model

Appendix of the Paper The Role of No-Arbitrage on Forecasting: Lessons from a Parametric Term Structure Model Appendix of the Paper The Roe of No-Arbitrage on Forecasting: Lessons from a Parametric Term Structure Mode Caio Ameida cameida@fgv.br José Vicente jose.vaentim@bcb.gov.br June 008 1 Introduction In this

More information

Componentwise Determination of the Interval Hull Solution for Linear Interval Parameter Systems

Componentwise Determination of the Interval Hull Solution for Linear Interval Parameter Systems Componentwise Determination of the Interva Hu Soution for Linear Interva Parameter Systems L. V. Koev Dept. of Theoretica Eectrotechnics, Facuty of Automatics, Technica University of Sofia, 1000 Sofia,

More information

Fitting affine and orthogonal transformations between two sets of points

Fitting affine and orthogonal transformations between two sets of points Mathematica Communications 9(2004), 27-34 27 Fitting affine and orthogona transformations between two sets of points Hemuth Späth Abstract. Let two point sets P and Q be given in R n. We determine a transation

More information

(This is a sample cover image for this issue. The actual cover is not yet available at this time.)

(This is a sample cover image for this issue. The actual cover is not yet available at this time.) (This is a sampe cover image for this issue The actua cover is not yet avaiabe at this time) This artice appeared in a journa pubished by Esevier The attached copy is furnished to the author for interna

More information

Research of Data Fusion Method of Multi-Sensor Based on Correlation Coefficient of Confidence Distance

Research of Data Fusion Method of Multi-Sensor Based on Correlation Coefficient of Confidence Distance Send Orders for Reprints to reprints@benthamscience.ae 340 The Open Cybernetics & Systemics Journa, 015, 9, 340-344 Open Access Research of Data Fusion Method of Muti-Sensor Based on Correation Coefficient

More information

Lecture 17 - The Secrets we have Swept Under the Rug

Lecture 17 - The Secrets we have Swept Under the Rug Lecture 17 - The Secrets we have Swept Under the Rug Today s ectures examines some of the uirky features of eectrostatics that we have negected up unti this point A Puzze... Let s go back to the basics

More information

Kernel Trick Embedded Gaussian Mixture Model

Kernel Trick Embedded Gaussian Mixture Model Kerne Trick Embedded Gaussian Mixture Mode Jingdong Wang, Jianguo Lee, and Changshui Zhang State Key Laboratory of Inteigent Technoogy and Systems Department of Automation, Tsinghua University Beijing,

More information

Source and Relay Matrices Optimization for Multiuser Multi-Hop MIMO Relay Systems

Source and Relay Matrices Optimization for Multiuser Multi-Hop MIMO Relay Systems Source and Reay Matrices Optimization for Mutiuser Muti-Hop MIMO Reay Systems Yue Rong Department of Eectrica and Computer Engineering, Curtin University, Bentey, WA 6102, Austraia Abstract In this paper,

More information

Combining reaction kinetics to the multi-phase Gibbs energy calculation

Combining reaction kinetics to the multi-phase Gibbs energy calculation 7 th European Symposium on Computer Aided Process Engineering ESCAPE7 V. Pesu and P.S. Agachi (Editors) 2007 Esevier B.V. A rights reserved. Combining reaction inetics to the muti-phase Gibbs energy cacuation

More information

Theory and implementation behind: Universal surface creation - smallest unitcell

Theory and implementation behind: Universal surface creation - smallest unitcell Teory and impementation beind: Universa surface creation - smaest unitce Bjare Brin Buus, Jaob Howat & Tomas Bigaard September 15, 218 1 Construction of surface sabs Te aim for tis part of te project is

More information

14 Separation of Variables Method

14 Separation of Variables Method 14 Separation of Variabes Method Consider, for exampe, the Dirichet probem u t = Du xx < x u(x, ) = f(x) < x < u(, t) = = u(, t) t > Let u(x, t) = T (t)φ(x); now substitute into the equation: dt

More information

Akaike Information Criterion for ANOVA Model with a Simple Order Restriction

Akaike Information Criterion for ANOVA Model with a Simple Order Restriction Akaike Information Criterion for ANOVA Mode with a Simpe Order Restriction Yu Inatsu * Department of Mathematics, Graduate Schoo of Science, Hiroshima University ABSTRACT In this paper, we consider Akaike

More information

6.434J/16.391J Statistics for Engineers and Scientists May 4 MIT, Spring 2006 Handout #17. Solution 7

6.434J/16.391J Statistics for Engineers and Scientists May 4 MIT, Spring 2006 Handout #17. Solution 7 6.434J/16.391J Statistics for Engineers and Scientists May 4 MIT, Spring 2006 Handout #17 Soution 7 Probem 1: Generating Random Variabes Each part of this probem requires impementation in MATLAB. For the

More information

Stochastic Complement Analysis of Multi-Server Threshold Queues. with Hysteresis. Abstract

Stochastic Complement Analysis of Multi-Server Threshold Queues. with Hysteresis. Abstract Stochastic Compement Anaysis of Muti-Server Threshod Queues with Hysteresis John C.S. Lui The Dept. of Computer Science & Engineering The Chinese University of Hong Kong Leana Goubchik Dept. of Computer

More information