From Margins to Probabilities in Multiclass Learning Problems


Andrea Passerini¹, Massimiliano Pontil² and Paolo Frasconi³

Abstract. We study the problem of multiclass classification within the framework of error correcting output codes (ECOC) using margin-based binary classifiers. An important open problem in this context is how to measure the distance between class codewords and the outputs of the classifiers. In this paper we propose a new decoding function that combines the margins through an estimate of their class conditional probabilities. We report experiments using support vector machines as the base binary classifiers, showing the advantage of the proposed decoding function over other functions of the margin commonly used in practice. We also present new theoretical results bounding the leave-one-out error of ECOC of kernel machines, which can be used to tune kernel parameters. An empirical validation indicates that the bound leads to good estimates of kernel parameters and the corresponding classifiers attain high accuracy.

Keywords: Machine Learning, Error Correcting Output Codes, Support Vector Machines, Statistical Learning Theory.

1 Introduction and Notation

Many machine learning algorithms are intrinsically conceived for binary classification. However, many real world learning problems require that inputs be mapped into one of several possible categories. The extension of a binary algorithm to its multiclass counterpart is not always possible or easy to conceive. An alternative consists in reducing a multiclass problem into several binary sub-problems. Perhaps the most general reduction scheme is the information-theoretic method based on error correcting output codes (ECOC), introduced by Dietterich and Bakiri [5] and more recently extended in [1]. ECOC work in two steps: training and classification. During the first step, S binary classifiers are trained on S dichotomies of the instance space, formed by joining non-overlapping subsets of classes. Assuming Q classes, let us introduce a coding matrix M \in \{-1, 0, 1\}^{Q \times S} which specifies a relation between classes and dichotomies. m_{qs} = 1 (m_{qs} = -1) means that examples belonging to class q are used as positive (negative) examples to train the s-th classifier f_s. When m_{qs} = 0, points in class q are not used to train the s-th classifier. Thus each class q is encoded by the q-th row of matrix M, which we also denote by M_q. Typically the classifiers are trained independently; this approach is known as the multi-call approach. Recent works [8, 2] also considered the case where classifiers are trained simultaneously (single-call approach). In this paper we focus on the former approach, although some of the results may be extended to the latter. In the second step a new input x is classified by computing the vector formed by the outputs of the classifiers, f(x) = (f_1(x), ..., f_S(x)), and choosing the class whose corresponding row is closest to f(x). In so doing, classification can be seen as a decoding operation, and the class of input x is computed as

\arg\min_{q=1,...,Q} d(M_q, f(x)),

where d is the decoding function.

¹ Dept. of Systems and Computer Science, University of Florence, Firenze, Italy
² Dept. of Computer Science, University of Siena, Siena, Italy
³ Dept. of Systems and Computer Science, University of Florence, Firenze, Italy
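To make the two-step scheme concrete, the sketch below trains one binary SVM per column of a coding matrix M and classifies by the decoding rule above. It is a minimal illustration, not the authors' code: the helper names train_ecoc and decode are ours, scikit-learn is assumed for the base learners, and d is any decoding function taking a code row and the margin vector.

```python
import numpy as np
from sklearn.svm import SVC

def train_ecoc(X, y, M, gamma=1.0):
    """Train one binary SVM per column of the Q x S coding matrix M.

    Column s defines a dichotomy: examples of class q are positive if
    M[q, s] == +1, negative if M[q, s] == -1, and left out if M[q, s] == 0.
    """
    classifiers = []
    for s in range(M.shape[1]):
        bits = M[y, s]            # code bit of each example's class
        used = bits != 0          # classes with a zero bit are not used
        clf = SVC(kernel="rbf", gamma=gamma).fit(X[used], bits[used])
        classifiers.append(clf)
    return classifiers

def decode(classifiers, M, x, d):
    """Classify x as argmin_q d(M_q, f(x)) for a decoding function d."""
    f = np.array([c.decision_function(x.reshape(1, -1))[0] for c in classifiers])
    return int(np.argmin([d(M[q], f) for q in range(M.shape[0])]))
```

For instance, with Q classes the one-vs-all code used in the experiments below is simply M = 2 * np.eye(Q) - 1.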
In [5] the entries of matrix M were restricted to take only binary values and d was chosen to be the Hamming distance:

d_H(M_q, f) = \sum_{s=1}^{S} \frac{1 - m_{qs} \, \mathrm{sign}(f_s)}{2}.

In the case that the binary learners are margin-based classifiers, [1] showed the advantage of using a loss-based function of the margin:

d_L(M_q, f) = \sum_{s=1}^{S} L(m_{qs} f_s),   (1)

where L is a loss function. Such a function has the advantage of weighting the confidence of each classifier, thus allowing a more robust classification criterion. The simplest loss function one can use is the linear loss, for which L(m_{qs} f_s) = -m_{qs} f_s, but several other choices are possible and it is not clear which one should work best. In the case that all binary classifiers are computed by the same learning algorithm, Allwein, Schapire and Singer [1] proposed to set L to be the same loss function used by that algorithm. In this paper we suggest a different approach, based on decoding via conditional probabilities of the outputs of the classifiers. The advantages offered by our approach are twofold. First, the use of conditional probabilities allows us to combine the margins of each classifier in a principled way. Second, the decoding function is itself a class conditional probability, which can give an estimate of the confidence of the multiclass classification. This approach is discussed in Section 2. In Section 3 we present experiments, using SVMs as the underlying binary classifiers, which highlight the advantage over other proposed decoding schemes. In Section 4 we study the generalization error of ECOC. We present a general bound on the leave-one-out error in the case of ECOC of kernel machines. The bound can be used for estimating optimal kernel parameters. The novelty of this analysis is that it allows multiclass parameter optimization even though the binary classifiers are trained independently. We report experiments showing that the bound leads to good estimates of kernel parameters and that the corresponding classifiers attain high accuracy.
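Both decoding functions are one-liners. A small sketch (ours, not the paper's code), written so that each returns a distance to be minimized, with the linear loss taken as L(z) = -z:

```python
import numpy as np

def hamming_decoding(m, f):
    # d_H(M_q, f) = sum_s (1 - m_qs * sign(f_s)) / 2
    return np.sum((1.0 - m * np.sign(f)) / 2.0)

def linear_decoding(m, f):
    # loss-based decoding, Eq. (1), with the linear loss L(z) = -z,
    # so that d_L(M_q, f) = -M_q . f
    return -float(np.dot(m, f))
```

Either can be passed as the argument d of the decode sketch above.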

2 Decoding Functions Based on Conditional Probabilities

We have seen that a loss function of the margin presents some advantages over the standard Hamming distance, because it can encode the confidence of each classifier in the ECOC. This confidence is, however, a relative quantity: the range of the values of the margin may vary with the classifier used. Thus, just using a linear loss function may introduce some bias in the final classification, in the sense that classifiers with a larger output range will receive a higher weight. Not surprisingly, we will see in the experiments below that the Hamming distance can work better than the linear and soft-margin losses in the case of pairwise schemes. A straightforward normalization into some interval, e.g. [-1, 1], can also introduce bias, since it does not fully take into account the margin distribution. A more principled approach is to estimate the conditional probability of each output codebit O_s given the input vector x. Assuming that all the information about x that is relevant for determining O_s is contained in the margin f_s(x) (or f_s for short), the above probability for each classifier is P(O_s | f_s, s). We can now assume a simple model for the probability of Y given the codebits O_1, ..., O_S. The conditional probability that Y = q should be 1 if the configuration of O_1, ..., O_S is the code of q, should be zero if it is the code of some other class q' \ne q, and uniform (i.e. 1/Q) if the configuration is not a valid class code. Under this model,

P(Y = q | f) = P(O_1 = m_{q1}, ..., O_S = m_{qS} | f) + \alpha,

\alpha being a constant that collects the probability mass dispersed over the 2^S - Q invalid codes. Assuming that O_s and O_{s'} are conditionally independent given x for each s \ne s', we can write the likelihood that the resulting output codeword is q as

P(Y = q | f) = \prod_{s=1}^{S} P(O_s = m_{qs} | f_s) + \alpha.   (2)

In this case, if \alpha is small, the decoding function will be:

d(M_q, f) = -\log P(Y = q | f).   (3)

The problem boils down to estimating the individual conditional probabilities in Eq. (2). To do so, one can try to fit a parametric model on the output produced by an n-fold cross-validation procedure. In our experiments we chose this model to be a sigmoid, computed on a 3-fold cross validation as suggested in [10]:

P(O_s = m_{qs} | f_s, s) = \frac{1}{1 + \exp\{A_s f_s + B_s\}}.

It is interesting to notice that an additional advantage of the proposed decoding algorithm is that the multiclass classifier outputs a conditional probability rather than a mere class decision.

3 Experimental Comparison Between Different Decoding Functions

The proposed decoding method is validated on ten datasets from the UCI repository. Their characteristics are briefly summarized in Table 1. We trained multiclass classifiers using SVMs as the base binary classifier. In our experiments we compared our decoding strategy to Hamming and other common loss-based decoding schemes (linear, and the soft-margin loss used to train SVMs) for different types of ECOC schemes: one-vs-all, all-pairs, and dense matrices consisting of 3Q columns of \{-1, 1\} entries. SVMs were trained with a Gaussian kernel, with the same value of the variance for all the experiments. For datasets with fewer than 2,000 instances we used ten random splits of the available data (picking 2/3 of the examples for training and 1/3 for testing) and averaged results over the ten trials. For the remaining datasets we used the original split defined in the UCI repository. Results are summarized in Tables 2-4.

Table 1. Characteristics of the datasets used (columns: Name, Classes, Train, Test, Inputs; rows: Anneal, Ecoli, Glass, Letter, Optdigits, Pendigits, Satimage, Segment, Soybean, Yeast).
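Before turning to the result tables, here is a minimal sketch of the likelihood decoding just described. It is our illustration under stated assumptions: the sigmoid of [10] is fitted by maximum likelihood with SciPy on held-out margins (standing in for the paper's 3-fold cross-validation fit), zero code bits are skipped, and the small \alpha term of Eq. (2) is dropped as in Eq. (3).

```python
import numpy as np
from scipy.optimize import minimize

def fit_sigmoid(margins, bits):
    """Fit P(O_s = +1 | f_s) = 1 / (1 + exp(A f_s + B)) by maximum likelihood.

    `margins` are held-out outputs f_s(x_i); `bits` are the true code bits
    of the corresponding classes, in {-1, +1}. Returns (A, B).
    """
    t = (np.asarray(bits) + 1) / 2.0          # map {-1, +1} to {0, 1}
    def nll(ab):
        p = 1.0 / (1.0 + np.exp(ab[0] * margins + ab[1]))
        p = np.clip(p, 1e-12, 1.0 - 1e-12)    # guard the logarithms
        return -np.sum(t * np.log(p) + (1 - t) * np.log(1 - p))
    return minimize(nll, x0=np.array([-1.0, 0.0]), method="Nelder-Mead").x

def likelihood_decoding(m, f, sigmoids):
    """d(M_q, f) = -sum_s log P(O_s = m_qs | f_s), as in Eq. (3)."""
    d = 0.0
    for s, (A, B) in enumerate(sigmoids):
        if m[s] == 0:                          # class not used in dichotomy s
            continue
        p_pos = 1.0 / (1.0 + np.exp(A * f[s] + B))
        p = p_pos if m[s] == 1 else 1.0 - p_pos
        d -= np.log(max(p, 1e-12))
    return d
```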
Table 2. One-vs-all: results on the ten datasets of Table 1.

Table 3. All-pairs: results on the ten datasets of Table 1.

Table 4. Dense codes: results on the ten datasets of Table 1.

Our likelihood decoding works significantly better for all ECOC schemes. This result is particularly clear in the case of pairwise classifiers. In the case of dense ECOC some dichotomies can be particularly hard to learn, and in this situation the sigmoid may be too simple a model for the conditional probability. We suspect that by using a better estimate of the conditional probability one could obtain better results with the likelihood decoding. Notice that the Hamming distance works well in the case of pairwise classification, while it performs poorly with one-vs-all classifiers. Neither result is surprising: the Hamming distance corresponds to the majority vote, which is known to work well for pairwise classifiers [7] but does not make much sense for one-vs-all, because in this case ties may occur often.

4 Tuning Kernel Parameters

In this section we study the problem of model selection in the case of ECOC of kernel machines [12, 11, 6, 13]. The analysis uses a leave-one-out error estimate as the key quantity for selecting the best model. We first recall the main features of kernel machines for binary classification. For a more detailed account consistent with the notation in this section see [6].

4.1 Background on Kernel Machines

Kernel machines are the minimizers of functionals of the form:

H[f; D_\ell] = \frac{1}{\ell} \sum_{i=1}^{\ell} V(c_i f(x_i)) + \lambda \|f\|_K^2,   (4)

where we use the following notation:

- D_\ell = \{(x_i, c_i)\}_{i=1}^{\ell} \subset X \times \{-1, 1\} is the training set.
- f is a function from IR^n to IR belonging to a reproducing kernel Hilbert space H defined by a symmetric and positive definite kernel K, and \|f\|_K^2 is the norm of f in this space. See [12, 13] for a number of kernels. The classification is done by taking the sign of this function.
- V(c f(x)) is a loss function whose choice determines different learning techniques, each leading to a different learning algorithm (for computing the coefficients \alpha_i; see below). In this paper we assume that V is monotonically non-increasing.
- \lambda is called the regularization parameter and is a positive constant.

Machines of the form in Eq. (4) have been motivated in the framework of statistical learning theory. Under rather general conditions the solution of Equation (4) is of the form⁴

f(x) = \sum_{i=1}^{\ell} \alpha_i c_i K(x_i, x).   (5)

⁴ We assume that the bias term is incorporated in the kernel K.

Let us introduce the matrix G defined by G_{ij} = K(x_i, x_j). The coefficients \alpha_i in Equation (5) are learned by solving the following optimization problem:

\max_\alpha H(\alpha) = \sum_{i=1}^{\ell} S(\alpha_i) - \frac{1}{2} \sum_{i,j=1}^{\ell} \alpha_i \alpha_j c_i c_j G_{ij}
subject to: 0 \le \alpha_i \le C, i = 1, ..., \ell,   (6)

where S(\cdot) is a concave function and C = \frac{1}{2\lambda\ell} a constant. Support Vector Machines (SVMs) are a particular case of these machines, obtained for S(\alpha) = \alpha. This corresponds to a loss function V in (4) of the form |1 - c f(x)|_+, where |x|_+ = x if x > 0 and zero otherwise. The points for which \alpha_i > 0 are called support vectors.

An important property of kernel machines is that, since we assumed K is symmetric and positive definite, the kernel function can be rewritten as

K(x, t) = \sum_{n=1}^{\infty} \lambda_n \phi_n(x) \phi_n(t),   (7)

where the \phi_n(x) are the eigenfunctions of the integral operator associated to the kernel K and the \lambda_n \ge 0 their eigenvalues [4]. By setting \Phi(x) to be the sequence (\sqrt{\lambda_n} \phi_n(x))_{n=1}^{\infty}, we see that K is a dot product in a Hilbert space of sequences (called the feature space), that is, K(x, t) = \Phi(x) \cdot \Phi(t). Thus, the function in Eq. (5) is a linear combination of features, f(x) = \sum_{n=1}^{\infty} w_n \phi_n(x). The great advantage of working with the compact representation in Eq. (5) is that we avoid directly estimating the infinitely many parameters w_n. Finally, note that the kernel K can also be defined/built by choosing the \phi's and \lambda's but, in general, these do not need to be known and can be implicitly defined by K.
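As a concrete check of the representation in Eq. (5), this sketch (ours; scikit-learn assumed) fits a binary Gaussian-kernel SVM and rebuilds f(x) from the learned coefficients \alpha_i c_i and the kernel values. Note that scikit-learn keeps an explicit bias term, whereas the paper folds the bias into the kernel.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.RandomState(0)
X = rng.randn(40, 2)                 # toy binary problem
c = np.where(X[:, 0] > 0, 1, -1)

gamma = 0.5
clf = SVC(kernel="rbf", gamma=gamma, C=1.0).fit(X, c)

# f(x) = sum_i alpha_i c_i K(x_i, x), Eq. (5): scikit-learn stores the
# products alpha_i * c_i of the support vectors in dual_coef_.
alpha_c = clf.dual_coef_[0]
x_new = rng.randn(3, 2)
K = rbf_kernel(x_new, clf.support_vectors_, gamma=gamma)   # kernel values
f_manual = K @ alpha_c + clf.intercept_[0]                 # explicit bias term

assert np.allclose(f_manual, clf.decision_function(x_new))
```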
4.2 Bounds on the Leave-One-Out Error

We first introduce some more notation. We define the multiclass margin (see also [1]) of a point (x, y) to be

g(x, y) = -d(M_y, f(x)) + d(M_{r(x,y)}, f(x)),   with   r(x, y) = \arg\min_{r \ne y} d(M_r, f(x)).

When g(x, y) is negative, point x is misclassified. Notice that when Q = 2 and L is the linear loss, g(x, y) reduces to the definition of margin for a binary real-valued classification function. Denoting by \theta the Heaviside step function, the empirical misclassification error can be written as:

\frac{1}{\ell} \sum_{i=1}^{\ell} \theta(-g(x_i, y_i)).

The leave-one-out (LOO) error is defined by

\frac{1}{\ell} \sum_{i=1}^{\ell} \theta(-g^i(x_i, y_i)),

where we have denoted by g^i(x_i, y_i) the margin of example x_i when the ECOC is trained on the dataset D_\ell \setminus \{(x_i, y_i)\}. The LOO error is an interesting quantity to look at when we want to select the optimal (hyper)parameters of a classification function, as it is an almost unbiased estimator of the test error, and we may expect its minimum to be close to the minimum of the test error. Unfortunately, computing the LOO error is time demanding when \ell is large, and becomes practically impossible when we need to know it for several values of the parameters of the machine used.
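The brute-force computation that the text calls time demanding is itself only a few lines; this hypothetical sketch retrains the full ECOC \ell times, reusing the train_ecoc and decode helpers sketched earlier:

```python
import numpy as np

def loo_error(X, y, M, d, gamma=1.0):
    """Exact LOO error of the ECOC: one full retraining per example."""
    mistakes = 0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        classifiers = train_ecoc(X[keep], y[keep], M, gamma=gamma)
        mistakes += decode(classifiers, M, X[i], d) != y[i]
    return mistakes / len(y)
```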

In the following theorem we give a bound on the LOO error of ECOC of kernel machines. An interesting property of this bound is that it only depends on the solution of the machine trained on the full dataset (so training the machine once will suffice). Below we denote by f_s the s-th machine, f_s(x) = \sum_{i=1}^{\ell} \alpha_i^s m_{y_i s} K^s(x_i, x), and let G^s_{ij} = K^s(x_i, x_j).

Theorem 1. Suppose the decoding function uses the linear loss. Then, the LOO error of the ECOC of kernel machines is bounded by

\frac{1}{\ell} \sum_{i=1}^{\ell} \theta\left( -g(x_i, y_i) + \max_{t \ne y_i} U_t(x_i) \right),   (8)

where, with r = r(x_i, y_i), we have defined the function

U_t(x_i) = (M_t - M_r) \cdot f(x_i) + \sum_{s=1}^{S} m_{y_i s} (m_{y_i s} - m_{ts}) \alpha_i^s G^s_{ii}.

To prove this theorem we first need the following lemma for binary kernel machines.

Lemma 4.1. Let f(x) be the kernel machine defined in Equations (5) and (6). Let f^i(x) be the solution of (4) found when the data point (x_i, c_i) is removed from the training set. We have

c_i f(x_i) - \alpha_i G_{ii} \le c_i f^i(x_i) \le c_i f(x_i).   (9)

Proof: The l.h.s. part of Inequality (9) was proved in [9]. Note that, if x_i is not a support vector, f = f^i and we have a trivial inequality. Thus suppose that x_i is a support vector. To prove the r.h.s. inequality we observe that:

H[f; D_\ell] \le H[f^i; D_\ell],   and   H[f; D_\ell^i] \ge H[f^i; D_\ell^i].

By subtracting the second bound from the first one and using the definition of H we obtain that V(c_i f(x_i)) \le V(c_i f^i(x_i)). Then, the result follows from the monotonicity property of V.

We now turn to the proof of Theorem 1. Our goal is to bound the difference between the multiclass margins of f and f^i, I = g(x_i, y_i) - g^i(x_i, y_i). Let us apply Lemma 4.1 to each SVM trained in the ECOC procedure. Inequality (9) can be rewritten as

m_{ts} f^i_s(x_i) = m_{ts} f_s(x_i) - \lambda_s m_{y_i s} m_{ts},   (10)

where \lambda_s is a parameter in [0, \alpha^s_i G^s_{ii}]. Using this equation we can transform the quantity I into the desired result through the following steps, where we define r^i = \arg\min_{t \ne y_i} d(M_t, f^i(x_i)):

I = M_{y_i} \cdot f(x_i) - M_r \cdot f(x_i) - [ M_{y_i} \cdot f^i(x_i) - M_{r^i} \cdot f^i(x_i) ]
  = M_{y_i} \cdot f(x_i) - M_{y_i} \cdot f^i(x_i) + [ M_{r^i} \cdot f^i(x_i) - M_r \cdot f(x_i) ]
  = \sum_{s=1}^{S} (m_{y_i s})^2 \lambda_s + (M_{r^i} - M_r) \cdot f(x_i) - \sum_{s=1}^{S} m_{y_i s} m_{r^i s} \lambda_s
  = (M_{r^i} - M_r) \cdot f(x_i) + \sum_{s=1}^{S} m_{y_i s} (m_{y_i s} - m_{r^i s}) \lambda_s.

Then, the bound in Eq. (8) follows by using the definition of r^i and by setting \lambda_s = \alpha^s_i G^s_{ii}.

This theorem highlights some interesting properties of the ECOC of kernel machines. First, observe that the larger the margin of an example is, the less likely it is that this point will be counted as a LOO error. Second, if the number of support vectors of each kernel machine is small, say less than \nu, the LOO error will be at most S\nu/\ell. Therefore, if S\nu \ll \ell the LOO error will be small, and so will be the test error. We also notice that, although the bound in Eq. (8) only uses knowledge about the machine trained on the full dataset, it gives an estimate close to the LOO error when the parameter C used to train the kernel machine is small, where small means that C \sup_i K(x_i, x_i) \ll 1.
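Read as pseudocode, Eq. (8) can be evaluated from a single training run. The sketch below is our reading of the bound under linear-loss decoding (d(M_q, f) = -M_q . f); it assumes, for every binary machine s, the margins f_s(x_i) on the training set, the coefficients \alpha_i^s (zero for non-support vectors) and the kernel diagonal G^s_ii.

```python
import numpy as np

def loo_bound(F, M, y, alpha, Gdiag):
    """Bound of Eq. (8) on the LOO error, linear-loss decoding.

    F[i, s]     margin f_s(x_i) of the machine trained on all data
    M[q, s]     coding matrix
    y[i]        class label of example i
    alpha[i, s] coefficient alpha_i^s (0 if x_i is not a support vector of s)
    Gdiag[i, s] kernel diagonal G^s_ii = K^s(x_i, x_i)
    """
    l, Q = F.shape[0], M.shape[0]
    mistakes = 0
    for i in range(l):
        scores = M @ F[i]                 # M_q . f(x_i), so d = -scores
        masked = np.where(np.arange(Q) == y[i], -np.inf, scores)
        r = int(np.argmax(masked))        # r(x_i, y_i): the runner-up class
        g = scores[y[i]] - scores[r]      # multiclass margin g(x_i, y_i)
        lam = alpha[i] * Gdiag[i]         # lambda_s = alpha_i^s G^s_ii
        U = [(M[t] - M[r]) @ F[i] + np.sum(M[y[i]] * (M[y[i]] - M[t]) * lam)
             for t in range(Q) if t != y[i]]
        mistakes += (-g + max(U)) > 0     # theta(.)
    return mistakes / l
```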
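The recipe behind these figures is a one-dimensional sweep: train once per candidate \gamma, evaluate the bound, keep the minimizer. A hypothetical sketch reusing the earlier helpers; for simplicity it assumes a code without zero entries (one-vs-all or dense), so that the support-vector indices of each machine refer to the full training set, and uses K(x, x) = 1 for the Gaussian kernel.

```python
import numpy as np

def select_gamma(X, y, M, gammas):
    """Pick the Gaussian kernel width minimizing the bound of Theorem 1."""
    best = (np.inf, None)
    for gamma in gammas:
        classifiers = train_ecoc(X, y, M, gamma=gamma)
        F = np.column_stack([c.decision_function(X) for c in classifiers])
        alpha = np.zeros_like(F)
        for s, c in enumerate(classifiers):
            alpha[c.support_, s] = np.abs(c.dual_coef_[0])
        Gdiag = np.ones_like(F)            # Gaussian kernel: K(x, x) = 1
        best = min(best, (loo_bound(F, M, y, alpha, Gdiag), gamma))
    return best[1]
```

For example, gamma = select_gamma(X, y, M, np.logspace(-3, 1, 9)) scans the same logarithmic range shown in the figures.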

Results indicate that the minimum of our LOO estimate is very close to the minimum of the test error, although we often observed a slight bias towards smaller values of the variance. Experiments with all ECOC schemes were repeated with such optimal parameters. Tables 5-7 show that we can improve performance significantly. Notice that in the case of one-vs-all and dense codes the advantage of using likelihood decoding is less evident than in the experiments of Section 3. We conjecture that this may be due to the fact that likelihood estimates are computed on a 3-fold cross validation; thus the optimal value of the variance should be computed separately for each fold. Moreover, the bias issues discussed above may further affect such estimates. We will investigate these problems in future experiments.

Table 5. One-vs-all: optimizing the variance of the Gaussian kernel (Optdigits, Pendigits, Satimage, Segment).

Table 6. All-pairs: optimizing the variance of the Gaussian kernel (Optdigits, Pendigits, Satimage, Segment).

Table 7. Dense codes: optimizing the variance of the Gaussian kernel (Optdigits, Pendigits, Satimage, Segment).

5 Conclusions

We studied ECOC constructed on margin-based binary classifiers under two complementary perspectives: the use of conditional probabilities for building a decoding function, and the use of a theoretically estimated bound on the leave-one-out error for optimizing kernel parameters. Our experiments show that transforming margins into conditional probabilities is very effective and improves the overall accuracy of multiclass classification in comparison to standard loss-based decoding schemes. Moreover, fitting Gaussian kernel parameters by means of our theoretical bound further improves the classification accuracy of ECOC machines.

REFERENCES

[1] E. L. Allwein, R. E. Schapire, and Y. Singer, Reducing multiclass to binary: A unifying approach for margin classifiers, in Proc. 17th International Conf. on Machine Learning, Morgan Kaufmann, San Francisco, CA, (2000).
[2] K. Crammer and Y. Singer, On the learnability and design of output codes for multiclass problems, in Proceedings of the Thirteenth Annual Conference on Computational Learning Theory, (2000).
[3] N. Cristianini and J. Shawe-Taylor, An Introduction to Support Vector Machines, Cambridge University Press, (2000).
[4] F. Cucker and S. Smale, On the mathematical foundations of learning, Bulletin of the American Mathematical Society, 39(1), 1-49, (2001).
[5] T. G. Dietterich and G. Bakiri, Solving multiclass learning problems via error-correcting output codes, Journal of Artificial Intelligence Research, (1995).
[6] T. Evgeniou, M. Pontil, and T. Poggio, Regularization networks and support vector machines, Advances in Computational Mathematics, 13, 1-50, (2000).
[7] J. H. Friedman, Another approach to polychotomous classification, Technical report, Department of Statistics, Stanford University, (1997).
[8] Y. Guermeur, A. Elisseeff, and H. Paugam-Moisy, A new multi-class SVM based on a uniform convergence result, in Proceedings of IJCNN, International Joint Conference on Neural Networks, IEEE, (2000).
[9] T. Jaakkola and D. Haussler, Exploiting generative models in discriminative classifiers, in Proc. of Neural Information Processing Conference, (1998).
[10] J. Platt, Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods, in Advances in Large Margin Classifiers, A. Smola et al., Eds., MIT Press, (1999).
[11] B. Schölkopf, C. Burges, and A. Smola, Advances in Kernel Methods: Support Vector Learning, MIT Press, (1998).
[12] V. N. Vapnik, Statistical Learning Theory, Wiley, New York, (1998).
[13] G. Wahba, Spline Models for Observational Data, Series in Applied Mathematics, Vol. 59, SIAM, Philadelphia, (1990).
