Probability Estimates for Multi-class Classification by Pairwise Coupling
Ting-Fan Wu, Chih-Jen Lin
Department of Computer Science, National Taiwan University, Taipei 106, Taiwan

Ruby C. Weng
Department of Statistics, National Chengchi University, Taipei 116, Taiwan

Abstract

Pairwise coupling is a popular multi-class classification method that combines all pairwise comparisons for each pair of classes. This paper presents two approaches for obtaining class probabilities. Both methods can be reduced to linear systems and are easy to implement. We show conceptually and experimentally that the proposed approaches are more stable than two existing popular methods: voting and [3].

1 Introduction

The multi-class classification problem refers to assigning each of the observations into one of $k$ classes. As two-class problems are much easier to solve, many authors propose to use two-class classifiers for multi-class classification. In this paper we focus on techniques that provide a multi-class classification solution by combining all pairwise comparisons.

A common way to combine pairwise comparisons is by voting [6, 2]. It constructs a rule for discriminating between every pair of classes and then selects the class with the most winning two-class decisions. Though the voting procedure requires just pairwise decisions, it only predicts a class label. In many scenarios, however, probability estimates are desired. As numerous (pairwise) classifiers do provide class probabilities, several authors [12, 11, 3] have proposed probability estimates by combining the pairwise class probabilities.

Given the observation $x$ and the class label $y$, we assume that the estimated pairwise class probabilities $r_{ij}$ of $\mu_{ij} = p(y = i \mid y = i \text{ or } j, x)$ are available. Here $r_{ij}$ are obtained by some binary classifiers. Then, the goal is to estimate $\{p_i\}_{i=1}^{k}$, where $p_i = p(y = i \mid x)$, $i = 1, \ldots, k$. We propose to obtain an approximate solution to an identity, and then select the label with the highest estimated class probability.
The existence of the solution is guaranteed by the theory of finite Markov chains. Motivated by the optimization formulation of this method, we propose a second approach. Interestingly, it can also be regarded as an improved version of the coupling approach given by [12]. Both of the proposed methods can be reduced to solving linear systems and are simple to implement in practice. Furthermore, from conceptual and experimental points of view, we show that the two proposed methods are more stable than voting and the method in [3].

We organize the paper as follows. In Section 2, we review two existing methods. Sections 3 and 4 detail the two proposed approaches. Section 5 presents the relationship among the four methods through their corresponding optimization formulas. In Section 6, we compare these methods using simulated and real data. The classifiers considered are support vector machines. Section 7 concludes the paper. Due to space limits, we omit all detailed proofs. A complete version of this work is available at http:// cjlin/papers/svmprob/svmprob.pdf.

2 Review of Two Methods

Let $r_{ij}$ be the estimates of $\mu_{ij} = p_i/(p_i + p_j)$. The voting rule [6, 2] is

$$\delta_V = \arg\max_i \Big[ \sum_{j: j \ne i} I_{\{r_{ij} > r_{ji}\}} \Big]. \qquad (1)$$

A simple estimate of probabilities can be derived as $p_i^v = 2 \sum_{j: j \ne i} I_{\{r_{ij} > r_{ji}\}} / (k(k-1))$.

The authors of [3] suggest another method to estimate class probabilities, and they claim that the resulting classification rule can outperform $\delta_V$ in some situations. Their approach is based on the minimization of the Kullback-Leibler (KL) distance between $r_{ij}$ and $\mu_{ij}$:

$$l(p) = \sum_{i \ne j} n_{ij} r_{ij} \log(r_{ij}/\mu_{ij}), \qquad (2)$$

where $\sum_{i=1}^{k} p_i = 1$, $p_i > 0$, $i = 1, \ldots, k$, and $n_{ij}$ is the number of instances in class $i$ or $j$. By letting $\nabla l(p) = 0$, a nonlinear system has to be solved. [3] proposes an iterative procedure to find the minimum of (2). If $r_{ij} > 0$, $\forall i \ne j$, the existence of a unique global minimal solution to (2) has been proved in [5] and references therein. Let $p^*$ denote this point. Then the resulting classification rule is

$$\delta_{HT}(x) = \arg\max_i [p_i^*].$$

It is shown in Theorem 1 of [3] that

$$p_i^* > p_j^* \ \text{if and only if} \ \tilde p_i > \tilde p_j, \quad \text{where } \tilde p_j = \frac{2 \sum_{s: s \ne j} r_{js}}{k(k-1)}; \qquad (3)$$

that is, the $\tilde p_i$ are in the same order as the $p_i^*$. Therefore, $\tilde p$ are sufficient if one only requires the classification rule. In fact, as pointed out by [3], $\tilde p$ can be derived as an approximation to the identity

$$p_i = \sum_{j: j \ne i} \Big( \frac{p_i + p_j}{k-1} \Big) \Big( \frac{p_i}{p_i + p_j} \Big) = \sum_{j: j \ne i} \Big( \frac{p_i + p_j}{k-1} \Big) \mu_{ij} \qquad (4)$$

by replacing $p_i + p_j$ with $2/k$, and $\mu_{ij}$ with $r_{ij}$.

3 Our First Approach

Note that $\delta_{HT}$ is essentially $\arg\max_i [\tilde p_i]$, and $\tilde p$ is an approximate solution to (4). Instead of replacing $p_i + p_j$ by $2/k$, in this section we propose to solve the system:

$$p_i = \sum_{j: j \ne i} \Big( \frac{p_i + p_j}{k-1} \Big) r_{ij}, \ \forall i, \quad \text{subject to } \sum_{i=1}^{k} p_i = 1, \ p_i \ge 0, \ \forall i. \qquad (5)$$

Let $\bar p$ denote the solution to (5). Then the resulting decision rule is

$$\delta_1 = \arg\max_i [\bar p_i].$$

As $\delta_{HT}$ relies on $p_i + p_j \approx 2/k$, in Section 6.1 we use two examples to illustrate possible problems with this rule.
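The two existing estimates reviewed in Section 2 can be made concrete with a short sketch. The code below is only an illustration, not the authors' implementation: the function names and the 3-class matrix `r_example` are hypothetical, and ties are resolved by taking the first maximum rather than at random.

```python
def vote_estimate(r):
    """Voting-based estimate p^v_i = 2 * (#wins of class i) / (k (k - 1))."""
    k = len(r)
    wins = [sum(1 for j in range(k) if j != i and r[i][j] > r[j][i])
            for i in range(k)]
    return [2.0 * w / (k * (k - 1)) for w in wins]


def ht_surrogate(r):
    """Estimate of (3): p~_i = 2 * sum_{s != i} r_is / (k (k - 1))."""
    k = len(r)
    return [2.0 * sum(r[i][s] for s in range(k) if s != i) / (k * (k - 1))
            for i in range(k)]


def argmax(values):
    """Index of the largest entry (first one on ties; an assumption here)."""
    return max(range(len(values)), key=lambda i: values[i])


# Hypothetical pairwise estimates for k = 3; r[i][j] + r[j][i] = 1 for i != j.
r_example = [
    [0.0, 0.6, 0.7],
    [0.4, 0.0, 0.55],
    [0.3, 0.45, 0.0],
]
p_vote = vote_estimate(r_example)
p_tilde = ht_surrogate(r_example)
print(argmax(p_vote), p_vote)
print(argmax(p_tilde), p_tilde)
```

Both estimates sum to one whenever $r_{ij} + r_{ji} = 1$ and no pair is tied, but the voting estimate only uses the signs of $r_{ij} - r_{ji}$, while $\tilde p$ uses their magnitudes.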
To solve (5), we rewrite it as

$$Qp = p, \quad \sum_{i=1}^{k} p_i = 1, \quad p_i \ge 0, \ \forall i, \quad \text{where } Q_{ij} = \begin{cases} r_{ij}/(k-1) & \text{if } i \ne j, \\ \sum_{s: s \ne i} r_{is}/(k-1) & \text{if } i = j. \end{cases} \qquad (6)$$

Observe that each column of $Q$ sums to one, $\sum_{i=1}^{k} Q_{ij} = 1$ for $j = 1, \ldots, k$ (because $r_{ij} + r_{ji} = 1$), and $0 \le Q_{ij} \le 1$ for $i, j = 1, \ldots, k$, so there exists a finite Markov chain whose transition matrix is $Q^T$. Moreover, if $r_{ij} > 0$ for all $i \ne j$, then $Q_{ij} > 0$, which implies that this Markov chain is irreducible and aperiodic. These conditions guarantee the existence of a unique stationary probability and all states being positive recurrent. Hence, we have the following theorem:

Theorem 1 If $r_{ij} > 0$, $\forall i \ne j$, then (6) has a unique solution $p$ with $0 < p_i < 1$, $\forall i$.

With Theorem 1 and some further analyses, if we remove the constraint $p_i \ge 0$, $\forall i$, the linear system with $k + 1$ equations still has the same unique solution. Furthermore, if any one of the $k$ equalities $Qp = p$ is removed, we have a system with $k$ variables and $k$ equalities, which, again, has the same single solution. Thus, (6) can be solved by Gaussian elimination. On the other hand, as the stationary solution of a Markov chain can be derived as the limit of the $n$-step transition probability matrix, we can also solve for $p$ by repeatedly multiplying $Q^T$ with any initial probability vector, that is, by iterating $p^T \leftarrow p^T Q^T$ (equivalently, $p \leftarrow Qp$).

Now we reexamine this method to gain more insight. The following arguments show that the solution to (5) is a global minimum of a meaningful optimization problem. To begin, we express (5) as

$$\sum_{j: j \ne i} r_{ji} p_i - \sum_{j: j \ne i} r_{ij} p_j = 0, \quad i = 1, \ldots, k,$$

using the property that $r_{ij} + r_{ji} = 1$, $\forall i \ne j$. Then the solution to (5) is in fact the global minimum of the following problem:

$$\min_{p} \ \sum_{i=1}^{k} \Big( \sum_{j: j \ne i} (r_{ji} p_i - r_{ij} p_j) \Big)^2 \quad \text{subject to } \sum_{i=1}^{k} p_i = 1, \ p_i \ge 0, \ \forall i. \qquad (7)$$

This holds because the objective function is always nonnegative and attains zero under (5) and (6).

4 Our Second Approach

Note that both approaches in Sections 2 and 3 involve solving optimization problems using relations like $p_i/(p_i + p_j) \approx r_{ij}$ or $\sum_{j: j \ne i} r_{ji} p_i \approx \sum_{j: j \ne i} r_{ij} p_j$. Motivated by (7), we suggest another optimization formulation as follows:

$$\min_{p} \ \frac{1}{2} \sum_{i=1}^{k} \sum_{j: j \ne i} (r_{ji} p_i - r_{ij} p_j)^2 \quad \text{subject to } \sum_{i=1}^{k} p_i = 1, \ p_i \ge 0, \ \forall i. \qquad (8)$$

In related work, [12] proposes to solve a linear system consisting of $\sum_{i=1}^{k} p_i = 1$ and any $k - 1$ equations of the form $r_{ji} p_i = r_{ij} p_j$. However, as pointed out in [11], the results of [12] strongly depend on the selection of the $k - 1$ equations. In fact, as (8) considers all $r_{ij} p_j - r_{ji} p_i$, not just $k - 1$ of them, it can be viewed as an improved version of [12]. Let $p^{\dagger}$ denote the corresponding solution. We then define the classification rule as

$$\delta_2 = \arg\max_i [p_i^{\dagger}].$$

Since (7) has a unique solution, which can be obtained by solving a simple linear system, it is desirable to see whether the minimization problem (8) has these nice properties. In the rest of the section, we show that this is true. Theorem 2 shows that the nonnegativity constraints in (8) are redundant.
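Before the analysis of (8), the first approach of Section 3 can be illustrated in code. The sketch below builds the matrix $Q$ of (6), replaces one of the redundant equalities of $Qp = p$ by $\sum_i p_i = 1$, and solves the resulting system by Gaussian elimination; it then cross-checks the answer with the iteration $p \leftarrow Qp$. All function names and the test vector `p_true` are assumptions made for illustration.

```python
def build_q(r):
    """Matrix Q of (6) from pairwise estimates r (r[i][j] + r[j][i] = 1)."""
    k = len(r)
    q = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            if i != j:
                q[i][j] = r[i][j] / (k - 1)
        q[i][i] = sum(r[i][s] for s in range(k) if s != i) / (k - 1)
    return q


def gaussian_solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(m[row][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for c in range(col, n + 1):
                m[row][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def first_approach(r):
    """Solve (Q - I) p = 0 with the last equality replaced by sum(p) = 1."""
    k = len(r)
    q = build_q(r)
    a = [[q[i][j] - (1.0 if i == j else 0.0) for j in range(k)]
         for i in range(k)]
    a[k - 1] = [1.0] * k
    b = [0.0] * (k - 1) + [1.0]
    return gaussian_solve(a, b)


def power_iteration(r, steps=200):
    """Iterate p <- Q p starting from the uniform probability vector."""
    k = len(r)
    q = build_q(r)
    p = [1.0 / k] * k
    for _ in range(steps):
        p = [sum(q[i][j] * p[j] for j in range(k)) for i in range(k)]
    return p


# Hypothetical check: exact pairwise probabilities built from a known p.
p_true = [0.5, 0.3, 0.2]
r_exact = [[0.0 if i == j else p_true[i] / (p_true[i] + p_true[j])
            for j in range(3)] for i in range(3)]
p_direct = first_approach(r_exact)
p_power = power_iteration(r_exact)
print(p_direct, p_power)
```

When $r_{ij}$ equals $\mu_{ij} = p_i/(p_i + p_j)$ exactly, the true $p$ satisfies (5), so both routes should return it up to rounding.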
Theorem 2 Problem (8) is equivalent to a simplification without the conditions $p_i \ge 0$, $\forall i$.

Note that, up to a positive constant factor that does not change the minimizer, we can rewrite the objective function of (8) as

$$\min_{p} \ \frac{1}{2} p^T Q p, \quad \text{where } Q_{ij} = \begin{cases} \sum_{s: s \ne i} r_{si}^2 & \text{if } i = j, \\ -r_{ji} r_{ij} & \text{if } i \ne j. \end{cases} \qquad (9)$$

Here $Q$ is a different matrix from the one in (6). From here we can show that $Q$ is positive semi-definite. Therefore, without the constraints $p_i \ge 0$, $\forall i$, (9) is a linear-constrained convex quadratic programming problem. Consequently, a point $p$ is a global minimum if and only if it satisfies the KKT optimality condition: there is a scalar $b$ such that

$$\begin{bmatrix} Q & e \\ e^T & 0 \end{bmatrix} \begin{bmatrix} p \\ b \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}. \qquad (10)$$

Here $e$ is the vector of all ones and $b$ is the Lagrangian multiplier of the equality constraint $\sum_{i=1}^{k} p_i = 1$. Thus, the solution of (8) can be obtained by solving the simple linear system (10). The existence of a unique solution is guaranteed by the invertibility of the matrix of (10). Moreover, if $Q$ is positive definite (PD), this matrix is invertible. The following theorem shows that $Q$ is PD under quite general conditions.

Theorem 3 If for any $i = 1, \ldots, k$, there are $s \ne i$ and $j \ne i$ such that

$$\frac{r_{si} r_{sj}}{r_{is}} \ne \frac{r_{ji} r_{js}}{r_{ij}},$$

then $Q$ is positive definite.

In addition to direct methods, next we propose a simple iterative method for solving (10):

Algorithm 1
1. Start with some initial $p_i \ge 0$, $\forall i$, and $\sum_{i=1}^{k} p_i = 1$.
2. Repeat ($t = 1, \ldots, k, 1, \ldots$) until (10) is satisfied:

$$p_t \leftarrow \frac{1}{Q_{tt}} \Big[ -\sum_{j: j \ne t} Q_{tj} p_j + p^T Q p \Big] \qquad (11)$$

normalize $p$ (12)

Theorem 4 If $r_{sj} > 0$, $\forall s \ne j$, and $\{p^i\}_{i=1}^{\infty}$ is the sequence generated by Algorithm 1, any convergent sub-sequence goes to a global minimum of (8).

As Theorem 3 indicates that in general $Q$ is positive definite, the sequence $\{p^i\}_{i=1}^{\infty}$ from Algorithm 1 usually globally converges to the unique minimum of (8).

5 Relations Among Four Methods

The four decision rules $\delta_{HT}$, $\delta_1$, $\delta_2$, and $\delta_V$ can be written as $\arg\max_i [p_i]$, where $p$ is derived by the following four optimization formulations under the constraints $\sum_{i=1}^{k} p_i = 1$ and $p_i \ge 0$, $\forall i$:

$$\delta_{HT}: \ \min_{p} \ \sum_{i=1}^{k} \Big[ \sum_{j: j \ne i} \Big( r_{ij} \frac{1}{k} - \frac{1}{2} p_i \Big) \Big]^2, \qquad (13)$$

$$\delta_1: \ \min_{p} \ \sum_{i=1}^{k} \Big[ \sum_{j: j \ne i} (r_{ij} p_j - r_{ji} p_i) \Big]^2, \qquad (14)$$

$$\delta_2: \ \min_{p} \ \sum_{i=1}^{k} \sum_{j: j \ne i} (r_{ij} p_j - r_{ji} p_i)^2, \qquad (15)$$

$$\delta_V: \ \min_{p} \ \sum_{i=1}^{k} \sum_{j: j \ne i} (I_{\{r_{ij} > r_{ji}\}} p_j - I_{\{r_{ji} > r_{ij}\}} p_i)^2. \qquad (16)$$

Note that (13) can be easily verified, and that (14) and (15) have been explained in Sections 3 and 4. For (16), its solution is

$$p_i = \frac{c}{\sum_{j: j \ne i} I_{\{r_{ji} > r_{ij}\}}},$$

where $c$ is the normalizing constant; therefore, $\arg\max_i [p_i]$ is the same as (1).

Clearly, (13) can be obtained from (14) by letting $p_j \approx 1/k$, $\forall j$, and $r_{ji} \approx 1/2$, $\forall i, j$. Such approximations ignore the differences between the $p_i$. Similarly, (16) is obtained from (15) by taking the extreme values of $r_{ij}$: 0 or 1. As a result, (16) may enlarge the differences between the $p_i$. Next, compared with (15), (14) may tend to underestimate the differences between the $p_i$'s. The reason is that (14) allows the differences between $r_{ij} p_j$ and $r_{ji} p_i$ to cancel first. Thus, conceptually, (13) and (16) are more extreme: the former tends to underestimate the differences between the $p_i$'s, while the latter overestimates them. These arguments will be supported by simulated and real data in the next section.

6 Experiments

6.1 Simple Simulated Examples

[3] designs a simple experiment in which all $p_i$'s are fairly close and their method $\delta_{HT}$ outperforms the voting strategy $\delta_V$. We conduct this experiment first to assess the performance of our proposed methods. As in [3], we define class probabilities $p_1 = 1.5/k$, $p_j = (1 - p_1)/(k - 1)$, $j = 2, \ldots, k$, and then set

$$r_{ij} = \frac{p_i}{p_i + p_j} + 0.1 z_{ij} \quad \text{if } i > j, \qquad (17)$$

$$r_{ji} = 1 - r_{ij} \quad \text{if } j > i, \qquad (18)$$

where $z_{ij}$ are standard normal variates. Since $r_{ij}$ are required to be within (0, 1), we truncate $r_{ij}$ at $\epsilon$ below and $1 - \epsilon$ above, for a small $\epsilon > 0$. In this example, class 1 has the highest probability and hence is the correct class. Figure 1 shows accuracy rates for each of the four methods when $k = 3, 5, 8, 10, 12, 15, 20$. The accuracy rates are averaged over 1,000 replicates.
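As a concrete illustration of this setup, the sketch below generates pairwise estimates following (17)-(18) and computes the second approach with the cyclic update (11) and normalization (12) of Algorithm 1. It is a minimal sketch under stated assumptions, not the authors' code: the truncation level `eps=1e-7`, the stopping tolerance, the sweep limit, and all function names are choices made here for illustration.

```python
import random


def simulate_r(p, noise=0.1, eps=1e-7, rng=None):
    """Pairwise estimates as in (17)-(18); eps is an assumed truncation level."""
    rng = rng or random.Random(0)
    k = len(p)
    r = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i):                      # pairs with i > j, as in (17)
            value = p[i] / (p[i] + p[j]) + noise * rng.gauss(0.0, 1.0)
            value = min(max(value, eps), 1.0 - eps)
            r[i][j] = value
            r[j][i] = 1.0 - value               # (18)
    return r


def build_q2(r):
    """Matrix Q of (9): sums of squares on the diagonal, -r_ji r_ij elsewhere."""
    k = len(r)
    q = [[-r[j][i] * r[i][j] for j in range(k)] for i in range(k)]
    for i in range(k):
        q[i][i] = sum(r[s][i] ** 2 for s in range(k) if s != i)
    return q


def quad_form(q, p):
    """Return p^T Q p."""
    k = len(p)
    return sum(p[i] * sum(q[i][j] * p[j] for j in range(k)) for i in range(k))


def second_approach(r, max_sweeps=1000, tol=1e-10):
    """Algorithm 1: update (11) for t = 1..k in turn, normalizing as in (12)."""
    k = len(r)
    q = build_q2(r)
    p = [1.0 / k] * k
    for _ in range(max_sweeps):
        qp = [sum(q[i][j] * p[j] for j in range(k)) for i in range(k)]
        pqp = sum(p[i] * qp[i] for i in range(k))
        # (10) holds exactly when every component of Qp equals p^T Q p.
        if max(abs(qp[i] - pqp) for i in range(k)) < tol:
            break
        for t in range(k):
            off_diag = sum(q[t][j] * p[j] for j in range(k) if j != t)
            p[t] = (-off_diag + quad_form(q, p)) / q[t][t]
            total = sum(p)
            p = [value / total for value in p]
    return p


# With exact pairwise probabilities the objective of (8) is zero at the true p.
p_true = [0.5, 0.3, 0.2]
r_exact = [[0.0 if i == j else p_true[i] / (p_true[i] + p_true[j])
            for j in range(3)] for i in range(3)]
p_exact = second_approach(r_exact)

# One noisy replicate of the balanced setting with k = 5.
k = 5
p_sim = [1.5 / k] + [(1.0 - 1.5 / k) / (k - 1)] * (k - 1)
r_noisy = simulate_r(p_sim, rng=random.Random(0))
p_noisy = second_approach(r_noisy)
predicted = max(range(k), key=lambda i: p_noisy[i])
print(p_exact, predicted)
```

Repeating the noisy replicate many times and counting how often `predicted` equals 0 (class 1 in the text's numbering) would give an accuracy rate of the kind discussed in this section; the system (10) could alternatively be solved directly by Gaussian elimination.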
Note that in this experiment all classes are quite competitive, so, when using $\delta_V$, sometimes the highest vote occurs at two or more different classes. We handle this problem by randomly selecting one class from the ties. This partly explains why $\delta_V$ performs poorly. Another explanation is that the $r_{ij}$ here are all close to 1/2, but (16) uses 1 or 0 instead; therefore, the solution may be severely biased. Besides $\delta_V$, the other three rules have done very well in this example.

Footnote 1 (on the solution of (16)): For $I$ to be well defined, we consider $r_{ij} \ne r_{ji}$, which is generally true. In addition, if there is an $i$ for which $\sum_{j: j \ne i} I_{\{r_{ji} > r_{ij}\}} = 0$, an optimal solution of (16) is $p_i = 1$, and $p_j = 0$, $\forall j \ne i$. The resulting decision is the same as that of (1).

Figure 1: Accuracy of predicting the true class by the methods $\delta_{HT}$ (solid line, cross marked), $\delta_V$ (dashed line, square marked), $\delta_1$ (dotted line, circle marked), and $\delta_2$ (dashed line, asterisk marked) from simulated class probability $p_i$, $i = 1, 2, \ldots, k$. Panels: (a) balanced $p_i$, (b) unbalanced $p_i$, (c) highly unbalanced $p_i$. Each panel plots accuracy rates against $\log_2 k$.

Since $\delta_{HT}$ relies on the approximation $p_i + p_j \approx 2/k$, this rule may suffer some losses if the class probabilities are not highly balanced. To examine this point, we consider the following two sets of class probabilities:

(1) We let $k_1 = k/2$ if $k$ is even, and $(k + 1)/2$ if $k$ is odd; then we define $p_1 = [\ldots]/k_1$, $p_i = (0.95 - p_1)/(k_1 - 1)$ for $i = 2, \ldots, k_1$, and $p_i = 0.05/(k - k_1)$ for $i = k_1 + 1, \ldots, k$.

(2) If $k = 3$, we define $p_1 = [\ldots]/2$, $p_2 = 0.95 - p_1$, and $p_3 = 0.05$. If $k > 3$, we define $p_1 = 0.475$, $p_2 = p_3 = 0.475/2$, and $p_i = 0.05/(k - 3)$ for $i = 4, \ldots, k$.

After setting $p_i$, we define the pairwise comparisons $r_{ij}$ as in (17)-(18). Both experiments are repeated 1,000 times. The accuracy rates are shown in Figures 1(b) and 1(c). In both scenarios, $p_i$ are not balanced. As expected, $\delta_{HT}$ is quite sensitive to the imbalance of $p_i$. The situation is much worse in Figure 1(c) because the approximation $p_i + p_j \approx 2/k$ is more seriously violated, especially when $k$ is large. In summary, $\delta_1$ and $\delta_2$ are less sensitive to $p_i$, and their overall performance is fairly stable. All features observed here agree with our analysis in Section 5.

6.2 Real Data

In this section we present experimental results on several multi-class problems: segment, satimage, and letter from the Statlog collection [9], USPS [4], and MNIST [7]. All data sets are available at http:// cjlin/libsvmtools/ t.
Their numbers of classes are 7, 6, 26, 10, and 10, respectively. From thousands of instances in each data set, we select 300 and 500 as our training and testing sets.

We consider support vector machines (SVM) with the RBF kernel $e^{-\gamma \|x_i - x_j\|^2}$ as the binary classifier. The regularization parameter $C$ and the kernel parameter $\gamma$ are selected by cross-validation. To begin, for each training set, a five-fold cross-validation is conducted on the following points of $(C, \gamma)$: $[2^{-5}, 2^{-3}, \ldots, 2^{15}] \times [2^{-5}, 2^{-3}, \ldots, 2^{15}]$. This is done by modifying LIBSVM [1], a library for SVM. At each $(C, \gamma)$, sequentially four folds are used as the training set while one fold is used as the validation set. The training of the four folds consists of $k(k-1)/2$ binary SVMs. For the binary SVM of the $i$th and the $j$th classes, using decision values $\hat f$ of training data, we employ an improved implementation [8] of Platt's posterior probabilities [10] to estimate $r_{ij}$:

$$r_{ij} = P(i \mid i \text{ or } j, x) = \frac{1}{1 + e^{A \hat f + B}}, \qquad (19)$$

where $A$ and $B$ are estimated by minimizing the negative log-likelihood function. Then, for each validation instance, we apply the four methods to obtain classification decisions. The error of the five validation sets is thus the cross-validation error at $(C, \gamma)$.

Table 1: Testing errors (in percentage) by four methods. Each row reports the testing errors based on a pair of the training and testing sets. The mean and std (standard deviation) are from five 5-fold cross-validation procedures to select the best $(C, \gamma)$. Columns: Dataset, $k$, and the mean and std for each of $\delta_{HT}$, $\delta_1$, $\delta_2$, and $\delta_V$. Datasets: satimage ($k = 6$), segment ($k = 7$), USPS ($k = 10$), MNIST ($k = 10$), letter ($k = 26$). [The numeric entries are not preserved in this copy.]

After the cross-validation is done, each rule obtains its best $(C, \gamma)$. Using these parameters, we train on the whole training set to obtain the final model. Next, the same as (19), the decision values from the training data are employed to find $r_{ij}$. Then, testing data are tested using each of the four rules.

Due to the randomness of separating training data into five folds for finding the best $(C, \gamma)$, we repeat the five-fold cross-validation five times and obtain the mean and standard deviation of the testing error. Moreover, as the selection of 300 and 500 training and testing instances from a larger data set is also random, we generate five such pairs. In Table 1, each row reports the testing error based on a pair of the training and testing sets. The results show that when the number of classes $k$ is small, the four methods perform similarly; however, for problems with larger $k$, $\delta_{HT}$ is less competitive. In particular, for the problem letter, which has 26 classes, $\delta_2$ or $\delta_V$ outperforms $\delta_{HT}$ by at least 5%.
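The sigmoid of (19) can be illustrated with a small sketch. The code below fits $A$ and $B$ by plain gradient descent on the average negative log-likelihood; this is a simplification chosen for brevity and is not the improved implementation of [8] nor Platt's original procedure [10]. The decision values, labels, step size, and function names are all hypothetical.

```python
import math


def sigmoid_prob(f, a, b):
    """r = 1 / (1 + exp(a * f + b)) as in (19), evaluated stably."""
    z = a * f + b
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


def neg_log_likelihood(fs, ys, a, b):
    """Negative log-likelihood; ys[n] is 1 for class i and 0 for class j."""
    tiny = 1e-12
    total = 0.0
    for f, y in zip(fs, ys):
        prob = min(max(sigmoid_prob(f, a, b), tiny), 1.0 - tiny)
        total -= y * math.log(prob) + (1 - y) * math.log(1.0 - prob)
    return total


def fit_sigmoid(fs, ys, step=0.5, iters=2000):
    """Gradient descent; with z = a f + b, d(NLL)/dz = y - prob per instance."""
    a, b = 0.0, 0.0
    n = len(fs)
    for _ in range(iters):
        residuals = [y - sigmoid_prob(f, a, b) for f, y in zip(fs, ys)]
        grad_a = sum(res * f for res, f in zip(residuals, fs)) / n
        grad_b = sum(residuals) / n
        a -= step * grad_a
        b -= step * grad_b
    return a, b


# Hypothetical decision values and labels (not separable, so the fit is finite).
decision_values = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
a_hat, b_hat = fit_sigmoid(decision_values, labels)
print(a_hat, b_hat, sigmoid_prob(2.0, a_hat, b_hat))
```

Because larger decision values correspond to class $i$ in this toy data, the fitted $A$ comes out negative, so the estimated $r_{ij}$ increases with $\hat f$.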
It seems that for the problems here, their characteristics are closer to the setting of Figure 1(c), rather than that of Figure 1(a). All these results agree with the previous findings in Sections 5 and 6.1. Note that in Table 1, some standard deviations are zero. That means the best $(C, \gamma)$ by different cross-validations are all the same. Overall, the variation on parameter selection due to the randomness of cross-validation is not large.

Footnote 2 (on the decision values used in (19)): [10] suggests using $\hat f$ from the validation instead of the training. However, this requires a further cross-validation on the four-fold data. For simplicity, we directly use $\hat f$ from the training.

Footnote 3 (on selecting the best $(C, \gamma)$): If more than one parameter set returns the smallest cross-validation error, we simply choose one with the smallest $C$.

7 Discussions and Conclusions

As the minimization of the KL distance is a well known criterion, some may wonder why the performance of $\delta_{HT}$ is not quite satisfactory in some of the examples. One possible explanation is that here the KL distance is derived under the assumptions that $n_{ij} r_{ij} \sim \text{Bin}(n_{ij}, \mu_{ij})$ and that the $r_{ij}$ are independent; however, as pointed out in [3], neither of the assumptions holds in the classification problem.

In conclusion, we have provided two methods which are shown to be more stable than both $\delta_{HT}$ and $\delta_V$. In addition, the two proposed approaches require only solutions of linear systems instead of a nonlinear one as in [3].

The authors thank S. Sathiya Keerthi for helpful comments.

References

[1] C.-C. Chang and C.-J. Lin. LIBSVM: a library for support vector machines, 2001. Software available at http:// cjlin/libsvm.
[2] J. Friedman. Another approach to polychotomous classification. Technical report, Department of Statistics, Stanford University, 1996. Available at http://www-stat.stanford.edu/reports/friedman/poly.ps.z.
[3] T. Hastie and R. Tibshirani. Classification by pairwise coupling. The Annals of Statistics, 26(1):451-471, 1998.
[4] J. J. Hull. A database for handwritten text recognition research. IEEE Transactions on Pattern Analysis and Machine Intelligence, 16(5), May 1994.
[5] D. R. Hunter. MM algorithms for generalized Bradley-Terry models. The Annals of Statistics. To appear.
[6] S. Knerr, L. Personnaz, and G. Dreyfus. Single-layer learning revisited: a stepwise procedure for building and training a neural network. In J. Fogelman, editor, Neurocomputing: Algorithms, Architectures and Applications. Springer-Verlag, 1990.
[7] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), November 1998. MNIST database available at http://yann.lecun.com/exdb/mnist/.
[8] H.-T. Lin, C.-J. Lin, and R. C. Weng. A note on Platt's probabilistic outputs for support vector machines. Technical report, Department of Computer Science and Information Engineering, National Taiwan University.
[9] D. Michie, D. J. Spiegelhalter, and C. C. Taylor. Machine Learning, Neural and Statistical Classification. Prentice Hall, Englewood Cliffs, N.J., 1994. Data available at http://
[10] J. Platt. Probabilistic outputs for support vector machines and comparison to regularized likelihood methods. In A. Smola, P. Bartlett, B. Schölkopf, and D. Schuurmans, editors, Advances in Large Margin Classifiers, Cambridge, MA. MIT Press.
[11] D. Price, S. Knerr, L. Personnaz, and G. Dreyfus. Pairwise neural network classifiers with probabilistic outputs. In G. Tesauro, D. Touretzky, and T. Leen, editors, Neural Information Processing Systems, volume 7. The MIT Press, 1995.
[12] P. Refregier and F. Vallet. Probabilistic approach for multiclass classification with neural networks. In Proceedings of International Conference on Artificial Networks, 1991.
More informationDiverse Routing in Networks with Probabilistic Failures
Diverse Routing in Networks with Probabilistic Failures Hyang-Won Lee, Member, IEEE, Eytan Modiano, Senior Member, IEEE, Kayi Lee, Member, IEEE Abstract We develo diverse routing schemes for dealing with
More informationMATHEMATICAL MODELLING OF THE WIRELESS COMMUNICATION NETWORK
Comuter Modelling and ew Technologies, 5, Vol.9, o., 3-39 Transort and Telecommunication Institute, Lomonosov, LV-9, Riga, Latvia MATHEMATICAL MODELLIG OF THE WIRELESS COMMUICATIO ETWORK M. KOPEETSK Deartment
More informationUsing a Computational Intelligence Hybrid Approach to Recognize the Faults of Variance Shifts for a Manufacturing Process
Journal of Industrial and Intelligent Information Vol. 4, No. 2, March 26 Using a Comutational Intelligence Hybrid Aroach to Recognize the Faults of Variance hifts for a Manufacturing Process Yuehjen E.
More informationLower Confidence Bound for Process-Yield Index S pk with Autocorrelated Process Data
Quality Technology & Quantitative Management Vol. 1, No.,. 51-65, 15 QTQM IAQM 15 Lower onfidence Bound for Process-Yield Index with Autocorrelated Process Data Fu-Kwun Wang * and Yeneneh Tamirat Deartment
More informationTowards understanding the Lorenz curve using the Uniform distribution. Chris J. Stephens. Newcastle City Council, Newcastle upon Tyne, UK
Towards understanding the Lorenz curve using the Uniform distribution Chris J. Stehens Newcastle City Council, Newcastle uon Tyne, UK (For the Gini-Lorenz Conference, University of Siena, Italy, May 2005)
More informationMATH 2710: NOTES FOR ANALYSIS
MATH 270: NOTES FOR ANALYSIS The main ideas we will learn from analysis center around the idea of a limit. Limits occurs in several settings. We will start with finite limits of sequences, then cover infinite
More informationA Simple Weight Decay Can Improve. Abstract. It has been observed in numerical simulations that a weight decay can improve
In Advances in Neural Information Processing Systems 4, J.E. Moody, S.J. Hanson and R.P. Limann, eds. Morgan Kaumann Publishers, San Mateo CA, 1995,. 950{957. A Simle Weight Decay Can Imrove Generalization
More informationNONLINEAR OPTIMIZATION WITH CONVEX CONSTRAINTS. The Goldstein-Levitin-Polyak algorithm
- (23) NLP - NONLINEAR OPTIMIZATION WITH CONVEX CONSTRAINTS The Goldstein-Levitin-Polya algorithm We consider an algorithm for solving the otimization roblem under convex constraints. Although the convexity
More informationCharacterizing the Behavior of a Probabilistic CMOS Switch Through Analytical Models and Its Verification Through Simulations
Characterizing the Behavior of a Probabilistic CMOS Switch Through Analytical Models and Its Verification Through Simulations PINAR KORKMAZ, BILGE E. S. AKGUL and KRISHNA V. PALEM Georgia Institute of
More informationRecursive Estimation of the Preisach Density function for a Smart Actuator
Recursive Estimation of the Preisach Density function for a Smart Actuator Ram V. Iyer Deartment of Mathematics and Statistics, Texas Tech University, Lubbock, TX 7949-142. ABSTRACT The Preisach oerator
More informationLinear diophantine equations for discrete tomography
Journal of X-Ray Science and Technology 10 001 59 66 59 IOS Press Linear diohantine euations for discrete tomograhy Yangbo Ye a,gewang b and Jiehua Zhu a a Deartment of Mathematics, The University of Iowa,
More informationShadow Computing: An Energy-Aware Fault Tolerant Computing Model
Shadow Comuting: An Energy-Aware Fault Tolerant Comuting Model Bryan Mills, Taieb Znati, Rami Melhem Deartment of Comuter Science University of Pittsburgh (bmills, znati, melhem)@cs.itt.edu Index Terms
More informationMODELING THE RELIABILITY OF C4ISR SYSTEMS HARDWARE/SOFTWARE COMPONENTS USING AN IMPROVED MARKOV MODEL
Technical Sciences and Alied Mathematics MODELING THE RELIABILITY OF CISR SYSTEMS HARDWARE/SOFTWARE COMPONENTS USING AN IMPROVED MARKOV MODEL Cezar VASILESCU Regional Deartment of Defense Resources Management
More informationDeriving Indicator Direct and Cross Variograms from a Normal Scores Variogram Model (bigaus-full) David F. Machuca Mory and Clayton V.
Deriving ndicator Direct and Cross Variograms from a Normal Scores Variogram Model (bigaus-full) David F. Machuca Mory and Clayton V. Deutsch Centre for Comutational Geostatistics Deartment of Civil &
More informationCovariance Matrix Estimation for Reinforcement Learning
Covariance Matrix Estimation for Reinforcement Learning Tomer Lancewicki Deartment of Electrical Engineering and Comuter Science University of Tennessee Knoxville, TN 37996 tlancewi@utk.edu Itamar Arel
More informationNotes on duality in second order and -order cone otimization E. D. Andersen Λ, C. Roos y, and T. Terlaky z Aril 6, 000 Abstract Recently, the so-calle
McMaster University Advanced Otimization Laboratory Title: Notes on duality in second order and -order cone otimization Author: Erling D. Andersen, Cornelis Roos and Tamás Terlaky AdvOl-Reort No. 000/8
More informationA SIMPLE PLASTICITY MODEL FOR PREDICTING TRANSVERSE COMPOSITE RESPONSE AND FAILURE
THE 19 TH INTERNATIONAL CONFERENCE ON COMPOSITE MATERIALS A SIMPLE PLASTICITY MODEL FOR PREDICTING TRANSVERSE COMPOSITE RESPONSE AND FAILURE K.W. Gan*, M.R. Wisnom, S.R. Hallett, G. Allegri Advanced Comosites
More informationRecent Developments in Multilayer Perceptron Neural Networks
Recent Develoments in Multilayer Percetron eural etworks Walter H. Delashmit Lockheed Martin Missiles and Fire Control Dallas, Texas 75265 walter.delashmit@lmco.com walter.delashmit@verizon.net Michael
More informationESTIMATION OF THE RECIPROCAL OF THE MEAN OF THE INVERSE GAUSSIAN DISTRIBUTION WITH PRIOR INFORMATION
STATISTICA, anno LXVIII, n., 008 ESTIMATION OF THE RECIPROCAL OF THE MEAN OF THE INVERSE GAUSSIAN DISTRIBUTION WITH PRIOR INFORMATION 1. INTRODUCTION The Inverse Gaussian distribution was first introduced
More informationFactor Analysis of Convective Heat Transfer for a Horizontal Tube in the Turbulent Flow Region Using Artificial Neural Network
COMPUTATIONAL METHODS IN ENGINEERING AND SCIENCE EPMESC X, Aug. -3, 6, Sanya, Hainan,China 6 Tsinghua University ess & Sringer-Verlag Factor Analysis of Convective Heat Transfer for a Horizontal Tube in
More informationAnalysis of some entrance probabilities for killed birth-death processes
Analysis of some entrance robabilities for killed birth-death rocesses Master s Thesis O.J.G. van der Velde Suervisor: Dr. F.M. Sieksma July 5, 207 Mathematical Institute, Leiden University Contents Introduction
More informationPredicting the Probability of Correct Classification
Predicting the Probability of Correct Classification Gregory Z. Grudic Department of Computer Science University of Colorado, Boulder grudic@cs.colorado.edu Abstract We propose a formulation for binary
More informationYixi Shi. Jose Blanchet. IEOR Department Columbia University New York, NY 10027, USA. IEOR Department Columbia University New York, NY 10027, USA
Proceedings of the 2011 Winter Simulation Conference S. Jain, R. R. Creasey, J. Himmelsach, K. P. White, and M. Fu, eds. EFFICIENT RARE EVENT SIMULATION FOR HEAVY-TAILED SYSTEMS VIA CROSS ENTROPY Jose
More informationLECTURE 7 NOTES. x n. d x if. E [g(x n )] E [g(x)]
LECTURE 7 NOTES 1. Convergence of random variables. Before delving into the large samle roerties of the MLE, we review some concets from large samle theory. 1. Convergence in robability: x n x if, for
More informationAsymptotically Optimal Simulation Allocation under Dependent Sampling
Asymtotically Otimal Simulation Allocation under Deendent Samling Xiaoing Xiong The Robert H. Smith School of Business, University of Maryland, College Park, MD 20742-1815, USA, xiaoingx@yahoo.com Sandee
More informationLocation of solutions for quasi-linear elliptic equations with general gradient dependence
Electronic Journal of Qualitative Theory of Differential Equations 217, No. 87, 1 1; htts://doi.org/1.14232/ejqtde.217.1.87 www.math.u-szeged.hu/ejqtde/ Location of solutions for quasi-linear ellitic equations
More informationLIBSVM: a Library for Support Vector Machines (Version 2.32)
LIBSVM: a Library for Support Vector Machines (Version 2.32) Chih-Chung Chang and Chih-Jen Lin Last updated: October 6, 200 Abstract LIBSVM is a library for support vector machines (SVM). Its goal is to
More informationOn the capacity of the general trapdoor channel with feedback
On the caacity of the general tradoor channel with feedback Jui Wu and Achilleas Anastasooulos Electrical Engineering and Comuter Science Deartment University of Michigan Ann Arbor, MI, 48109-1 email:
More informationRobust Predictive Control of Input Constraints and Interference Suppression for Semi-Trailer System
Vol.7, No.7 (4),.37-38 htt://dx.doi.org/.457/ica.4.7.7.3 Robust Predictive Control of Inut Constraints and Interference Suression for Semi-Trailer System Zhao, Yang Electronic and Information Technology
More informationBrownian Motion and Random Prime Factorization
Brownian Motion and Random Prime Factorization Kendrick Tang June 4, 202 Contents Introduction 2 2 Brownian Motion 2 2. Develoing Brownian Motion.................... 2 2.. Measure Saces and Borel Sigma-Algebras.........
More informationOne step ahead prediction using Fuzzy Boolean Neural Networks 1
One ste ahead rediction using Fuzzy Boolean eural etworks 1 José A. B. Tomé IESC-ID, IST Rua Alves Redol, 9 1000 Lisboa jose.tome@inesc-id.t João Paulo Carvalho IESC-ID, IST Rua Alves Redol, 9 1000 Lisboa
More informationRobust Solutions to Markov Decision Problems
Robust Solutions to Markov Decision Problems Arnab Nilim and Laurent El Ghaoui Deartment of Electrical Engineering and Comuter Sciences University of California, Berkeley, CA 94720 nilim@eecs.berkeley.edu,
More informationCERIAS Tech Report The period of the Bell numbers modulo a prime by Peter Montgomery, Sangil Nahm, Samuel Wagstaff Jr Center for Education
CERIAS Tech Reort 2010-01 The eriod of the Bell numbers modulo a rime by Peter Montgomery, Sangil Nahm, Samuel Wagstaff Jr Center for Education and Research Information Assurance and Security Purdue University,
More informationLOGISTIC REGRESSION. VINAYANAND KANDALA M.Sc. (Agricultural Statistics), Roll No I.A.S.R.I, Library Avenue, New Delhi
LOGISTIC REGRESSION VINAANAND KANDALA M.Sc. (Agricultural Statistics), Roll No. 444 I.A.S.R.I, Library Avenue, New Delhi- Chairerson: Dr. Ranjana Agarwal Abstract: Logistic regression is widely used when
More informationEstimating Posterior Ratio for Classification: Transfer Learning from Probabilistic Perspective
Estimating Posterior Ratio for Classification: Transfer Learning from Probabilistic Persective Song Liu, Kenji Fukumizu arxiv:506.02784v3 [stat.ml] 9 Oct 205 Abstract Transfer learning assumes classifiers
More informationThe non-stochastic multi-armed bandit problem
Submitted for journal ublication. The non-stochastic multi-armed bandit roblem Peter Auer Institute for Theoretical Comuter Science Graz University of Technology A-8010 Graz (Austria) auer@igi.tu-graz.ac.at
More informationEstimating function analysis for a class of Tweedie regression models
Title Estimating function analysis for a class of Tweedie regression models Author Wagner Hugo Bonat Deartamento de Estatística - DEST, Laboratório de Estatística e Geoinformação - LEG, Universidade Federal
More informationOptimal Design of Truss Structures Using a Neutrosophic Number Optimization Model under an Indeterminate Environment
Neutrosohic Sets and Systems Vol 14 016 93 University of New Mexico Otimal Design of Truss Structures Using a Neutrosohic Number Otimization Model under an Indeterminate Environment Wenzhong Jiang & Jun
More informationRANDOM WALKS AND PERCOLATION: AN ANALYSIS OF CURRENT RESEARCH ON MODELING NATURAL PROCESSES
RANDOM WALKS AND PERCOLATION: AN ANALYSIS OF CURRENT RESEARCH ON MODELING NATURAL PROCESSES AARON ZWIEBACH Abstract. In this aer we will analyze research that has been recently done in the field of discrete
More informationPaper C Exact Volume Balance Versus Exact Mass Balance in Compositional Reservoir Simulation
Paer C Exact Volume Balance Versus Exact Mass Balance in Comositional Reservoir Simulation Submitted to Comutational Geosciences, December 2005. Exact Volume Balance Versus Exact Mass Balance in Comositional
More informationGeneral Linear Model Introduction, Classes of Linear models and Estimation
Stat 740 General Linear Model Introduction, Classes of Linear models and Estimation An aim of scientific enquiry: To describe or to discover relationshis among events (variables) in the controlled (laboratory)
More informationPositive decomposition of transfer functions with multiple poles
Positive decomosition of transfer functions with multile oles Béla Nagy 1, Máté Matolcsi 2, and Márta Szilvási 1 Deartment of Analysis, Technical University of Budaest (BME), H-1111, Budaest, Egry J. u.
More informationLecture 6. 2 Recurrence/transience, harmonic functions and martingales
Lecture 6 Classification of states We have shown that all states of an irreducible countable state Markov chain must of the same tye. This gives rise to the following classification. Definition. [Classification
More informationVIBRATION ANALYSIS OF BEAMS WITH MULTIPLE CONSTRAINED LAYER DAMPING PATCHES
Journal of Sound and Vibration (998) 22(5), 78 85 VIBRATION ANALYSIS OF BEAMS WITH MULTIPLE CONSTRAINED LAYER DAMPING PATCHES Acoustics and Dynamics Laboratory, Deartment of Mechanical Engineering, The
More informationBayesian Model Averaging Kriging Jize Zhang and Alexandros Taflanidis
HIPAD LAB: HIGH PERFORMANCE SYSTEMS LABORATORY DEPARTMENT OF CIVIL AND ENVIRONMENTAL ENGINEERING AND EARTH SCIENCES Bayesian Model Averaging Kriging Jize Zhang and Alexandros Taflanidis Why use metamodeling
More information8.7 Associated and Non-associated Flow Rules
8.7 Associated and Non-associated Flow Rules Recall the Levy-Mises flow rule, Eqn. 8.4., d ds (8.7.) The lastic multilier can be determined from the hardening rule. Given the hardening rule one can more
More informationCHAPTER-II Control Charts for Fraction Nonconforming using m-of-m Runs Rules
CHAPTER-II Control Charts for Fraction Nonconforming using m-of-m Runs Rules. Introduction: The is widely used in industry to monitor the number of fraction nonconforming units. A nonconforming unit is
More informationScaling Multiple Point Statistics for Non-Stationary Geostatistical Modeling
Scaling Multile Point Statistics or Non-Stationary Geostatistical Modeling Julián M. Ortiz, Steven Lyster and Clayton V. Deutsch Centre or Comutational Geostatistics Deartment o Civil & Environmental Engineering
More informationDeveloping A Deterioration Probabilistic Model for Rail Wear
International Journal of Traffic and Transortation Engineering 2012, 1(2): 13-18 DOI: 10.5923/j.ijtte.20120102.02 Develoing A Deterioration Probabilistic Model for Rail Wear Jabbar-Ali Zakeri *, Shahrbanoo
More informationPublished: 14 October 2013
Electronic Journal of Alied Statistical Analysis EJASA, Electron. J. A. Stat. Anal. htt://siba-ese.unisalento.it/index.h/ejasa/index e-issn: 27-5948 DOI: 1.1285/i275948v6n213 Estimation of Parameters of
More informationElementary Analysis in Q p
Elementary Analysis in Q Hannah Hutter, May Szedlák, Phili Wirth November 17, 2011 This reort follows very closely the book of Svetlana Katok 1. 1 Sequences and Series In this section we will see some
More information