From Margins to Probabilities in Multiclass Learning Problems
Andrea Passerini (1), Massimiliano Pontil (2) and Paolo Frasconi (3)

Abstract. We study the problem of multiclass classification within the framework of error correcting output codes (ECOC) using margin-based binary classifiers. An important open problem in this context is how to measure the distance between class codewords and the outputs of the classifiers. In this paper we propose a new decoding function that combines the margins through an estimate of their class conditional probabilities. We report experiments using support vector machines as the base binary classifiers, showing the advantage of the proposed decoding function over other functions of the margin commonly used in practice. We also present new theoretical results bounding the leave-one-out error of ECOC of kernel machines, which can be used to tune kernel parameters. An empirical validation indicates that the bound leads to good estimates of kernel parameters and that the corresponding classifiers attain high accuracy.

Keywords: Machine Learning, Error Correcting Output Codes, Support Vector Machines, Statistical Learning Theory.

1 Introduction and Notation

Many machine learning algorithms are intrinsically conceived for binary classification. However, many real world learning problems require that inputs be mapped into one of several possible categories. The extension of a binary algorithm to its multiclass counterpart is not always possible or easy to conceive. An alternative consists in reducing a multiclass problem into several binary sub-problems. Perhaps the most general reduction scheme is the information theoretic method based on error correcting output codes (ECOC), introduced by Dietterich and Bakiri [5] and more recently extended in [1]. ECOC work in two steps: training and classification. During the first step, S binary classifiers are trained on S dichotomies of the instance space, formed by joining non overlapping subsets of classes.
Assuming Q classes, let us introduce a coding matrix M \in \{-1, 0, 1\}^{Q \times S} which specifies a relation between classes and dichotomies. m_{qs} = 1 (m_{qs} = -1) means that examples belonging to class q are used as positive (negative) examples to train the s-th classifier f_s. When m_{qs} = 0, points in class q are not used to train the s-th classifier. Thus each class q is encoded by the q-th row of matrix M, which we also denote by M_q. Typically the classifiers are trained independently; this approach is known as the multi-call approach. Recent works [8, 2] also considered the case where the classifiers are trained simultaneously (single-call approach). In this paper we focus on the former approach, although some of the results may be extended to the latter one.

(1) Dept. of Systems and Computer Science, University of Florence, Firenze, Italy. (2) Dept. of Computer Science, University of Siena, Siena, Italy. (3) Dept. of Systems and Computer Science, University of Florence, Firenze, Italy.

In the second step, a new input x is classified by computing the vector formed by the outputs of the classifiers, f(x) = (f_1(x), ..., f_S(x)), and choosing the class whose corresponding row is closest to f(x). In so doing, classification can be seen as a decoding operation, and the class of input x is computed as \arg\min_{q=1,\dots,Q} d(M_q, f(x)), where d is the decoding function. In [5] the entries of matrix M were restricted to take only binary values, and d was chosen to be the Hamming distance:

d_H(M_q, f) = \sum_{s=1}^{S} \frac{1 - m_{qs}\,\mathrm{sign}(f_s)}{2}

In the case that the binary learners are margin-based classifiers, [1] showed the advantage of using a loss-based function of the margin:

d_L(M_q, f) = \sum_{s=1}^{S} L(m_{qs} f_s)    (1)

where L is a loss function. Such a function has the advantage of weighting the confidence of each classifier, thus allowing a more robust classification criterion. The simplest loss function one can use is the linear loss, for which L(m_{qs} f_s) = -m_{qs} f_s, but several other choices are possible and it is not clear which one should work best.
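To make the two decoding rules concrete, here is a small sketch (not from the paper; the coding matrix and margins are made-up toy values) of Hamming and linear-loss decoding:

```python
def hamming_decode(M, f):
    """Hamming decoding: d_H(M_q, f) = sum_s (1 - m_qs * sign(f_s)) / 2."""
    sign = lambda v: 1 if v >= 0 else -1
    d = [sum((1 - m * sign(fs)) / 2 for m, fs in zip(row, f)) for row in M]
    return min(range(len(M)), key=d.__getitem__)

def linear_loss_decode(M, f):
    """Loss-based decoding with the linear loss L(z) = -z, so that
    d_L(M_q, f) = -sum_s m_qs * f_s."""
    d = [-sum(m * fs for m, fs in zip(row, f)) for row in M]
    return min(range(len(M)), key=d.__getitem__)

# One-vs-all coding matrix for Q = 3 classes (S = 3 dichotomies).
M = [[ 1, -1, -1],
     [-1,  1, -1],
     [-1, -1,  1]]
f = [0.3, -1.2, 0.9]  # hypothetical classifier margins
```

On these margins, Hamming decoding ties between classes 0 and 2 (both codewords are at distance 1 from sign(f)), while the linear loss exploits the margin magnitudes and picks class 2, illustrating why weighting by confidence can matter.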
In the case that all binary classifiers are computed by the same learning algorithm, Allwein, Schapire and Singer [1] proposed to set L to be the same loss function used by that algorithm. In this paper we suggest a different approach, based on decoding via conditional probabilities of the outputs of the classifiers. The advantages offered by our approach are twofold. First, the use of conditional probabilities allows us to combine the margins of each classifier in a principled way. Second, the decoding function is itself a class conditional probability, which can give an estimate of the confidence of the multiclass classification. This approach is discussed in Section 2. In Section 3 we present experiments, using SVMs as the underlying binary classifiers, which highlight the advantage over other proposed decoding schemes. In Section 4 we study the generalization error of ECOC. We present a general bound on the leave-one-out error in the case of ECOC of kernel machines. The bound can be used for estimating optimal kernel parameters. The novelty of this analysis is that it allows multiclass parameter optimization even though the binary classifiers are trained independently. We report experiments showing that the bound leads to good estimates of kernel parameters and that the corresponding classifiers attain high accuracy.
2 Decoding Functions Based on Conditional Probabilities

We have seen that a loss function of the margin presents some advantage over the standard Hamming distance because it can encode the confidence of each classifier in the ECOC. This confidence is, however, a relative quantity: the range of the values of the margin may vary with the classifier used. Thus, just using a linear loss function may introduce some bias in the final classification, in the sense that classifiers with a larger output range will receive a higher weight. Not surprisingly, we will see in the experiments below that the Hamming distance can work better than the linear and soft-margin losses in the case of pairwise schemes. A straightforward normalization to some interval, e.g. [-1, 1], can also introduce bias, since it does not fully take into account the margin distribution.

A more principled approach is to estimate the conditional probability of each output codebit O_s given the input vector x. Assuming that all the information about x that is relevant for determining O_s is contained in the margin f_s(x) (or f_s for short), the above probability for each classifier is P(O_s | f_s, s). We can now assume a simple model for the probability of Y given the codebits O_1, ..., O_S. The conditional probability that Y = q should be 1 if the configuration of O_1, ..., O_S is the code of class q, should be zero if it is the code of some other class q' != q, and uniform (i.e. 1/Q) if the configuration is not a valid class code. Under this model,

P(Y = q | f) = P(O_1 = m_{q1}, ..., O_S = m_{qS} | f) + \alpha,

\alpha being a constant that collects the probability mass dispersed on the 2^S - Q invalid codes. Assuming that O_s and O_{s'} are conditionally independent given x for each s != s', we can write the likelihood that the resulting output codeword is that of class q as

P(Y = q | f) = \prod_{s=1}^{S} P(O_s = m_{qs} | f_s) + \alpha.    (2)

In this case, if \alpha is small, the decoding function will be:

d(M_q, f) = -\log P(Y = q | f).    (3)

The problem boils down to estimating the individual conditional probabilities in Eq. (2).
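A sketch of the likelihood decoding of Eqs. (2)-(3) (illustrative only; the sigmoid parameters A_s and B_s here are hypothetical stand-ins for fitted values):

```python
import math

def bit_probability(m_qs, f_s, A_s=-2.0, B_s=0.0):
    """Sigmoid model of P(O_s = m_qs | f_s); A_s < 0 so that large positive
    margins give high probability to the positive codebit."""
    p_pos = 1.0 / (1.0 + math.exp(A_s * f_s + B_s))  # P(O_s = +1 | f_s)
    return p_pos if m_qs == 1 else 1.0 - p_pos

def likelihood_decode(M, f, alpha=1e-6):
    """Eqs. (2)-(3): pick arg min_q of -log(prod_s P(O_s = m_qs | f_s) + alpha)."""
    d = []
    for row in M:
        p = 1.0
        for m_qs, f_s in zip(row, f):
            if m_qs != 0:  # zero entries: class q is not used by classifier s
                p *= bit_probability(m_qs, f_s)
        d.append(-math.log(p + alpha))
    return min(range(len(M)), key=d.__getitem__)

M = [[ 1, -1, -1],
     [-1,  1, -1],
     [-1, -1,  1]]      # one-vs-all codes for Q = 3
f = [0.3, -1.2, 0.9]    # hypothetical margins
```

With these toy values the decoding picks class 2: the weak positive margin of classifier 1 is outweighed by the confident positive margin of classifier 3.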
To do so, one can try to fit a parametric model on the outputs produced by an n-fold cross validation procedure. In our experiments we chose this model to be a sigmoid computed on a 3-fold cross validation, as suggested in [10]:

P(O_s = m_{qs} | f_s, s) = \frac{1}{1 + \exp\{A_s f_s + B_s\}}.

It is interesting to notice that an additional advantage of the proposed decoding algorithm is that the multiclass classifier outputs a conditional probability rather than a mere class decision.

3 Experimental Comparison Between Different Decoding Functions

The proposed decoding method is validated on ten datasets from the UCI repository. Their characteristics are briefly summarized in Table 1. We trained multiclass classifiers using SVMs as the base binary classifier. In our experiments we compared our decoding strategy to Hamming and other common loss-based decoding schemes (linear, and the soft margin loss used to train SVMs) for different types of ECOC schemes: one-vs-all, all-pairs, and dense matrices consisting of 3Q columns of {-1, 1} entries. SVMs were trained with a Gaussian kernel, using the same value of the variance for all the experiments. For datasets with less than 2,000 instances we used ten random splits of the available data (picking 2/3 of the examples for training and 1/3 for testing) and averaged the results over the ten trials. For the remaining datasets we used the original split defined in the UCI repository. Results are summarized in Tables 2-4.

Table 1. Characteristics of the datasets used (columns: Name, Classes, Train, Test, Inputs; rows: Anneal, Ecoli, Glass, Letter, Optdigits, Pendigits, Satimage, Segment, Soybean, Yeast; the numeric entries were not recovered in this transcription)

Table 2. One-vs-All (rows: Anneal, Ecoli, Glass, Letter, Optdigits, Pendigits, Satimage, Segment, Soybean, Yeast; the numeric entries were not recovered in this transcription)

Table 3. All-Pairs (rows: Anneal, Ecoli, Glass, Letter, Optdigits, Pendigits, Satimage, Segment, Soybean, Yeast; the numeric entries were not recovered in this transcription)

Our likelihood decoding works significantly better for all ECOC schemes. This result is particularly clear in the case of pairwise classifiers.
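The sigmoid parameters A_s and B_s of Section 2 are fitted to (margin, label) pairs. As an illustration only (Platt's actual procedure [10] is more refined, and the paper collects the pairs via 3-fold cross validation), a minimal maximum-likelihood fit by batch gradient descent could look like:

```python
import math

def fit_sigmoid(margins, labels, lr=0.1, epochs=2000):
    """Fit P(O = +1 | f) = 1 / (1 + exp(A*f + B)) by gradient descent on the
    negative log-likelihood of labels in {-1, +1}. Toy stand-in for Platt [10]."""
    A, B = 0.0, 0.0
    n = len(margins)
    for _ in range(epochs):
        gA = gB = 0.0
        for fv, y in zip(margins, labels):
            p = 1.0 / (1.0 + math.exp(A * fv + B))
            t = 1.0 if y == 1 else 0.0
            # gradient of the NLL: dNLL/dA = (t - p) * f, dNLL/dB = (t - p)
            gA += (t - p) * fv
            gB += (t - p)
        A -= lr * gA / n
        B -= lr * gB / n
    return A, B

# Hypothetical cross-validation outputs: margins with their true codebits.
margins = [2.0, 1.0, 0.5, -0.5, -1.0, -2.0]
labels  = [1, 1, 1, -1, -1, -1]
A, B = fit_sigmoid(margins, labels)
prob = lambda fv: 1.0 / (1.0 + math.exp(A * fv + B))
```

After the fit, A is negative, so large positive margins map to probabilities near 1 and large negative margins to probabilities near 0.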
In the case of dense ECOC some dichotomies can be particularly hard to learn, and in this situation the sigmoid may be too simple a model for the conditional probability. We suspect that by using a better estimate of the conditional probability one could obtain better results with the likelihood decoding.

Table 4. Dense Codes (rows: Anneal, Ecoli, Glass, Letter, Optdigits, Pendigits, Satimage, Segment, Soybean, Yeast; the numeric entries were not recovered in this transcription)

Notice that the Hamming distance works well in the case of pairwise classification, while it performs poorly with one-vs-all classifiers. Both results are not surprising: the Hamming distance corresponds to the majority vote, which is known to work well for pairwise classifiers [7] but does not make much sense for one-vs-all, because in this case ties may occur often.

4 Tuning Kernel Parameters

In this section we study the problem of model selection in the case of ECOC of kernel machines [12, 11, 6, 13]. The analysis uses a leave-one-out error estimate as the key quantity for selecting the best model. We first recall the main features of kernel machines for binary classification. For a more detailed account consistent with the notation in this section see [6].

4.1 Background on Kernel Machines

Kernel machines are the minimizers of functionals of the form

H[f; D_\ell] = \frac{1}{\ell} \sum_{i=1}^{\ell} V(c_i f(x_i)) + \lambda \|f\|_K^2,    (4)

where we use the following notation. D_\ell = \{(x_i, c_i) \in X \times \{-1, 1\},\ i = 1, \dots, \ell\} is the training set. f is a function IR^n -> IR belonging to a reproducing kernel Hilbert space H defined by a symmetric and positive definite kernel K, and \|f\|_K^2 is the norm of f in this space; see [12, 13] for a number of kernels. The classification is done by taking the sign of this function. V(c f(x)) is a loss function whose choice determines different learning techniques, each leading to a different learning algorithm (for computing the coefficients \alpha_i; see below). In this paper we assume that V is monotonic non-increasing. \lambda is called the regularization parameter and is a positive constant.

Machines of the form (4) have been motivated in the framework of statistical learning theory. Under rather general conditions the solution of Equation (4) is of the form

f(x) = \sum_{i=1}^{\ell} \alpha_i c_i K(x_i, x).    (5)

(We assume that the bias term is incorporated in the kernel K.)

Let us introduce the matrix G defined by G_{ij} = K(x_i, x_j). The coefficients \alpha_i in Equation (5) are learned by solving the following optimization problem:

\max_\alpha H(\alpha) = \sum_{i=1}^{\ell} S(\alpha_i) - \frac{1}{2} \sum_{i,j=1}^{\ell} \alpha_i \alpha_j c_i c_j G_{ij}
subject to: 0 <= \alpha_i <= C,  i = 1, \dots, \ell,    (6)

where S(.) is a concave function and C = 1/(2\lambda\ell) a constant. Support Vector Machines (SVMs) are a particular case of these machines, obtained for S(\alpha) = \alpha. This corresponds to a loss function V in (4) of the form V(c f(x)) = |1 - c f(x)|_+, where |x|_+ = x if x > 0 and zero otherwise. The points for which \alpha_i > 0 are called support vectors.

An important property of kernel machines is that, since we assumed K to be symmetric and positive definite, the kernel function can be rewritten as

K(x, t) = \sum_{n=1}^{\infty} \lambda_n \phi_n(x) \phi_n(t),    (7)

where the \phi_n(x) are the eigenfunctions of the integral operator associated to the kernel K and the \lambda_n >= 0 are its eigenvalues [4]. By setting \Phi(x) to be the sequence (\sqrt{\lambda_n}\,\phi_n(x))_n, we see that K is a dot product in a Hilbert space of sequences (called the feature space), that is, K(x, t) = \Phi(x) \cdot \Phi(t). Thus, the function in Eq. (5) is a linear combination of features, f(x) = \sum_n w_n \phi_n(x). The great advantage of working with the compact representation in Eq. (5) is that we avoid directly estimating the infinitely many parameters w_n. Finally, note that the kernel K can also be defined/built by choosing the \phi's and \lambda's but, in general, those do not need to be known and can be implicitly defined by K.

4.2 Bounds on the Leave-One-Out Error

We first introduce some more notation. We define the multiclass margin (see also [1]) of a point (x, y) to be

g(x, y) = -d(M_y, f(x)) + d(M_{r(x,y)}, f(x)),

with

r(x, y) = \arg\min_{r \ne y} d(M_r, f(x)).

When g(x, y) is negative, point x is misclassified. Notice that when Q = 2 and L is the linear loss, g(x, y) reduces to the definition of margin for a binary real valued classification function. The empirical misclassification error can be written as

\frac{1}{\ell} \sum_{i=1}^{\ell} \theta(-g(x_i, y_i)),

where \theta denotes the Heaviside step function.
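For concreteness, the expansion in Eq. (5) with a Gaussian kernel can be sketched as follows (the data and dual coefficients are made-up toy values; in practice the alpha_i come from solving problem (6)):

```python
import math

def gaussian_kernel(x, t, gamma=0.5):
    """K(x, t) = exp(-gamma * ||x - t||^2), with gamma = 1 / (2 sigma^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, t)))

def kernel_machine(X, c, alpha, gamma=0.5):
    """Return f with f(x) = sum_i alpha_i c_i K(x_i, x), as in Eq. (5);
    the binary prediction is sign(f(x))."""
    def f(x):
        return sum(a * ci * gaussian_kernel(xi, x, gamma)
                   for a, ci, xi in zip(alpha, c, X))
    return f

# Two 1-D training points with hypothetical dual coefficients.
X = [[-1.0], [1.0]]
c = [-1, 1]
alpha = [0.5, 0.5]
f = kernel_machine(X, c, alpha)
```

By symmetry of this toy setup, f(x) = -f(-x), and sign(f) separates the two training points.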
The leave-one-out (LOO) error is defined by

\frac{1}{\ell} \sum_{i=1}^{\ell} \theta(-g^i(x_i, y_i)),

where we have denoted by g^i(x_i, y_i) the margin of example x_i when the ECOC is trained on the dataset D_\ell \setminus \{(x_i, y_i)\}. The LOO error is an interesting quantity to look at when we want to select the optimal (hyper)parameters of a classification function, as it is an almost unbiased estimator of the test error, and we may expect its minimum to be close to the minimum of the test error. Unfortunately, computing the LOO error is time demanding when \ell is large. This becomes practically impossible in the case that we need to know the LOO error for several values of the parameters of the machine used. In the following theorem we give a bound on the LOO error of ECOC of kernel machines. An interesting property of this bound is that it only depends on the solution of the machine trained on the full dataset (so training the machine once will suffice). Below we denote by f_s the s-th machine, f_s(x) = \sum_{i=1}^{\ell} \alpha_i^s m_{y_i s} K^s(x_i, x), and let G^s_{ij} = K^s(x_i, x_j).

Theorem 1. Suppose the decoding function uses the linear loss function. Then, the LOO error of the ECOC of kernel machines is bounded by

\frac{1}{\ell} \sum_{i=1}^{\ell} \theta\Big(-g(x_i, y_i) + \max_{t \ne y_i} U_t(x_i)\Big),    (8)

where we have defined the function

U_t(x_i) = (M_t - M_r) \cdot f(x_i) + \sum_{s=1}^{S} m_{y_i s}(m_{y_i s} - m_{ts}) \alpha_i^s G^s_{ii},

with r standing for r(x_i, y_i).

To prove this theorem we first need the following lemma for binary kernel machines.

Lemma 4.1. Let f(x) be the kernel machine defined by Equations (5) and (6). Let f^i(x) be the solution of (4) found when the data point (x_i, c_i) is removed from the training set. We have

c_i f(x_i) - \alpha_i G_{ii} <= c_i f^i(x_i) <= c_i f(x_i).    (9)

Proof: The l.h.s. part of Inequality (9) was proved in [9]. Note that, if x_i is not a support vector, f = f^i and we have a trivial inequality. Thus suppose that x_i is a support vector. To prove the r.h.s. inequality we observe that

H[f; D_\ell] <= H[f^i; D_\ell]  and  H[f^i; D_\ell^i] <= H[f; D_\ell^i].

By subtracting the second inequality from the first one and using the definition of H, we obtain V(c_i f(x_i)) <= V(c_i f^i(x_i)). Then, the result follows from the monotonicity property of V.

We now turn to the proof of Theorem 1. Our goal is to bound the difference between the multiclass margins of f and f^i,

I = g(x_i, y_i) - g^i(x_i, y_i).

Let us apply Lemma 4.1 to each SVM trained in the ECOC procedure. Inequality (9) can be rewritten as

m_{ts} f^i_s(x_i) = m_{ts} f_s(x_i) - \lambda_s m_{y_i s} m_{ts},    (10)

where \lambda_s is a parameter in [0, \alpha^s_i G^s_{ii}]. Using this equation we can transform the quantity I into the desired result through the following steps, where we have defined r^i = \arg\min_{t \ne y_i} d(M_t, f^i(x_i)) and write r for r(x_i, y_i):

I = M_{y_i} \cdot f(x_i) - M_r \cdot f(x_i) - [M_{y_i} \cdot f^i(x_i) - M_{r^i} \cdot f^i(x_i)]
  = M_{y_i} \cdot f(x_i) - M_{y_i} \cdot f^i(x_i) + [M_{r^i} \cdot f^i(x_i) - M_r \cdot f(x_i)]
  = \sum_s (m_{y_i s})^2 \lambda_s + (M_{r^i} - M_r) \cdot f(x_i) - \sum_s m_{y_i s} m_{r^i s} \lambda_s
  = (M_{r^i} - M_r) \cdot f(x_i) + \sum_s m_{y_i s}(m_{y_i s} - m_{r^i s}) \lambda_s.

Then, the bound in Eq. (8) follows by using the definition of r^i and by setting \lambda_s = \alpha^s_i G^s_{ii}.

This theorem highlights some interesting properties of the ECOC of kernel machines. First, observe that the larger the margin of an example is, the less likely it is that this point will be counted as a LOO error. Second, if the number of support vectors of each kernel machine is small, say less than \nu, the LOO error will be at most S\nu/\ell. Therefore, if S\nu << \ell, the LOO error will be small and so will be the test error. We also notice that, although the bound in Eq. (8) only uses knowledge about the machine trained on the full dataset, it gives an estimate close to the LOO error when the parameter C used to train the kernel machine is small. Small in this case means that C \sup_i K(x_i, x_i) < 1.

4.3 Model Selection Experiments

We now show experiments where we use the bound provided by Theorem 1 to select optimal kernel parameters. We focused on four of the largest datasets and searched for the best value of the variance \sigma of the Gaussian kernel. To simplify the problem we searched for a common value for all the binary classifiers. Figures 1-3 show the test error and our LOO estimate for different values of \gamma = 1/(2\sigma^2) for the three ECOC schemes considered.

Figure 1. One-vs-all: Test error and LOO error bound as a function of the logarithm of the parameter \gamma for the satimage (left) and optdigits (right) datasets.

Figure 2. All-pairs: Test error and LOO error bound as a function of the logarithm of the parameter \gamma for the satimage (left) and optdigits (right) datasets.

Figure 3. Dense codes: Test error and LOO error bound as a function of the logarithm of the parameter \gamma for the satimage (left) and optdigits (right) datasets.
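A sketch of evaluating the bound of Eq. (8) under linear-loss decoding (illustrative only; all inputs below are made-up toy values, with alphaG[i][s] playing the role of the product alpha_i^s * G^s_ii):

```python
def loo_bound(M, F, alphaG, y):
    """Fraction of training points flagged by the bound of Eq. (8), assuming
    linear-loss decoding d(M_q, f) = -M_q . f.
    M: Q x S coding matrix; F[i]: margins (f_1(x_i), ..., f_S(x_i));
    alphaG[i][s]: the product alpha_i^s * G^s_ii; y[i]: true class index."""
    Q, S, n = len(M), len(M[0]), len(F)
    flagged = 0
    for i in range(n):
        yi = y[i]
        d = [-sum(M[q][s] * F[i][s] for s in range(S)) for q in range(Q)]
        r = min((q for q in range(Q) if q != yi), key=d.__getitem__)
        g = d[r] - d[yi]  # multiclass margin g(x_i, y_i)
        U = max(
            sum((M[t][s] - M[r][s]) * F[i][s] for s in range(S))
            + sum(M[yi][s] * (M[yi][s] - M[t][s]) * alphaG[i][s]
                  for s in range(S))
            for t in range(Q) if t != yi
        )
        if -g + U > 0:  # theta(-g(x_i, y_i) + max_t U_t(x_i))
            flagged += 1
    return flagged / n

M = [[ 1, -1, -1],
     [-1,  1, -1],
     [-1, -1,  1]]
F = [[2.0, -2.0, -2.0],   # confidently correct point of class 0
     [0.1,  0.2,  0.0]]   # barely correct point of class 1
alphaG = [[0.1, 0.1, 0.1], [5.0, 5.0, 5.0]]
y = [0, 1]
```

On these toy values only the second point, whose small margin is dominated by its large alpha_i^s G^s_ii terms, is flagged as a potential LOO error.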
Results indicate that the minimum of our LOO estimate is very close to the minimum of the test error, although we often observed a slight bias towards smaller values of the variance. Experiments with all ECOC schemes were repeated with these optimal parameters. Tables 5-7 show that we can improve performance significantly. Notice that in the case of one-vs-all and dense codes the advantage of using likelihood decoding is less evident than in the experiments of Section 3. We conjecture that this may be due to the fact that the likelihood estimates are computed on a 3-fold cross validation, so the optimal value of the variance should be computed separately for each fold. Moreover, the bias issues discussed above may further affect such estimates. We will investigate these problems in future experiments.

Table 5. One-vs-all: Optimizing the variance of the Gaussian kernel (rows: Optdigits, Pendigits, Satimage, Segment; the numeric entries were not recovered in this transcription)

Table 6. All-pairs: Optimizing the variance of the Gaussian kernel (rows: Optdigits, Pendigits, Satimage, Segment; the numeric entries were not recovered in this transcription)

Table 7. Dense codes: Optimizing the variance of the Gaussian kernel (rows: Optdigits, Pendigits, Satimage, Segment; the numeric entries were not recovered in this transcription)

5 Conclusions

We studied ECOC constructed on margin-based binary classifiers under two complementary perspectives: the use of conditional probabilities for building a decoding function, and the use of a theoretically estimated bound on the leave-one-out error for optimizing kernel parameters. Our experiments show that transforming margins into conditional probabilities is very effective and improves the overall accuracy of multiclass classification in comparison to standard loss-based decoding schemes. Moreover, fitting Gaussian kernel parameters by means of our theoretical bound further improves the classification accuracy of ECOC machines.

REFERENCES

[1] E. L. Allwein, R. E. Schapire, and Y. Singer, 'Reducing multiclass to binary: A unifying approach for margin classifiers', in Proc. 17th International Conf. on Machine Learning. Morgan Kaufmann, San Francisco, CA, (2000).
[2] K. Crammer and Y. Singer, 'On the learnability and design of output codes for multiclass problems', in Proceedings of the Thirteenth Annual Conference on Computational Learning Theory, (2000).
[3] N. Cristianini and J. Shawe-Taylor, An Introduction to Support Vector Machines, Cambridge University Press, 2000.
[4] F. Cucker and S. Smale, 'On the mathematical foundations of learning', Bulletin of the American Mathematical Society, 39(1), 1-49, (2001).
[5] T. G. Dietterich and G. Bakiri, 'Solving multiclass learning problems via error-correcting output codes', Journal of Artificial Intelligence Research, (1995).
[6] T. Evgeniou, M. Pontil, and T. Poggio, 'Regularization networks and support vector machines', Advances in Computational Mathematics, 13, 1-50, (2000).
[7] Jerome H. Friedman, 'Another approach to polychotomous classification', Technical report, Department of Statistics, Stanford University, (1997).
[8] Y. Guermeur, A. Elisseeff, and H. Paugam-Moisy, 'A new multi-class SVM based on a uniform convergence result', in Proceedings of IJCNN - International Joint Conference on Neural Networks. IEEE, (2000).
[9] T. Jaakkola and D. Haussler, 'Exploiting generative models in discriminative classifiers', in Proc. of Neural Information Processing Conference, (1998).
[10] J. Platt, 'Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods', in Advances in Large Margin Classifiers, A. Smola et al. (Eds.), MIT Press, (1999).
[11] B. Schoelkopf, C. Burges, and A. Smola, Advances in Kernel Methods: Support Vector Learning, MIT Press, 1998.
[12] V. N. Vapnik, Statistical Learning Theory, Wiley, New York, 1998.
[13] G. Wahba, Spline Models for Observational Data, Series in Applied Mathematics, Vol. 59, SIAM, Philadelphia, 1990.
Turbo Codes Coding and Communication Laboratory Dept. of Eectrica Engineering, Nationa Chung Hsing University Turbo codes 1 Chapter 12: Turbo Codes 1. Introduction 2. Turbo code encoder 3. Design of intereaver
More informationAppendix for Stochastic Gradient Monomial Gamma Sampler
3 4 5 6 7 8 9 3 4 5 6 7 8 9 3 4 5 6 7 8 9 3 3 3 33 34 35 36 37 38 39 4 4 4 43 44 45 46 47 48 49 5 5 5 53 54 Appendix for Stochastic Gradient Monomia Gamma Samper A The Main Theorem We provide the foowing
More informationCryptanalysis of PKP: A New Approach
Cryptanaysis of PKP: A New Approach Éiane Jaumes and Antoine Joux DCSSI 18, rue du Dr. Zamenhoff F-92131 Issy-es-Mx Cedex France eiane.jaumes@wanadoo.fr Antoine.Joux@ens.fr Abstract. Quite recenty, in
More informationEfficiently Generating Random Bits from Finite State Markov Chains
1 Efficienty Generating Random Bits from Finite State Markov Chains Hongchao Zhou and Jehoshua Bruck, Feow, IEEE Abstract The probem of random number generation from an uncorreated random source (of unknown
More informationA proposed nonparametric mixture density estimation using B-spline functions
A proposed nonparametric mixture density estimation using B-spine functions Atizez Hadrich a,b, Mourad Zribi a, Afif Masmoudi b a Laboratoire d Informatique Signa et Image de a Côte d Opae (LISIC-EA 4491),
More informationActive Learning & Experimental Design
Active Learning & Experimenta Design Danie Ting Heaviy modified, of course, by Lye Ungar Origina Sides by Barbara Engehardt and Aex Shyr Lye Ungar, University of Pennsyvania Motivation u Data coection
More informationMat 1501 lecture notes, penultimate installment
Mat 1501 ecture notes, penutimate instament 1. bounded variation: functions of a singe variabe optiona) I beieve that we wi not actuay use the materia in this section the point is mainy to motivate the
More informationEfficient Approximate Leave-One-Out Cross-Validation for Kernel Logistic Regression
Machine Learning manuscript No. (wi be inserted by the editor) Efficient Approximate Leave-One-Out Cross-Vaidation for Kerne Logistic Regression Gavin C. Cawey, Nicoa L. C. Tabot Schoo of Computing Sciences
More informationKernel pea and De-Noising in Feature Spaces
Kerne pea and De-Noising in Feature Spaces Sebastian Mika, Bernhard Schokopf, Aex Smoa Kaus-Robert Muer, Matthias Schoz, Gunnar Riitsch GMD FIRST, Rudower Chaussee 5, 12489 Berin, Germany {mika, bs, smoa,
More information6.434J/16.391J Statistics for Engineers and Scientists May 4 MIT, Spring 2006 Handout #17. Solution 7
6.434J/16.391J Statistics for Engineers and Scientists May 4 MIT, Spring 2006 Handout #17 Soution 7 Probem 1: Generating Random Variabes Each part of this probem requires impementation in MATLAB. For the
More informationAppendix for Stochastic Gradient Monomial Gamma Sampler
Appendix for Stochastic Gradient Monomia Gamma Samper A The Main Theorem We provide the foowing theorem to characterize the stationary distribution of the stochastic process with SDEs in (3) Theorem 3
More informationOn the Statistical Consistency of Algorithms for Binary Classification under Class Imbalance
On the Statistica Consistency of Agorithms for Binary Cassification under Cass Imbaance Aditya Krishna Menon University of Caifornia, San iego, La Joa CA 92093, USA Harikrishna Narasimhan Shivani Agarwa
More information4 Separation of Variables
4 Separation of Variabes In this chapter we describe a cassica technique for constructing forma soutions to inear boundary vaue probems. The soution of three cassica (paraboic, hyperboic and eiptic) PDE
More informationNOISE-INDUCED STABILIZATION OF STOCHASTIC DIFFERENTIAL EQUATIONS
NOISE-INDUCED STABILIZATION OF STOCHASTIC DIFFERENTIAL EQUATIONS TONY ALLEN, EMILY GEBHARDT, AND ADAM KLUBALL 3 ADVISOR: DR. TIFFANY KOLBA 4 Abstract. The phenomenon of noise-induced stabiization occurs
More informationMath 124B January 31, 2012
Math 124B January 31, 212 Viktor Grigoryan 7 Inhomogeneous boundary vaue probems Having studied the theory of Fourier series, with which we successfuy soved boundary vaue probems for the homogeneous heat
More informationData Mining Technology for Failure Prognostic of Avionics
IEEE Transactions on Aerospace and Eectronic Systems. Voume 38, #, pp.388-403, 00. Data Mining Technoogy for Faiure Prognostic of Avionics V.A. Skormin, Binghamton University, Binghamton, NY, 1390, USA
More informationTrainable fusion rules. I. Large sample size case
Neura Networks 19 (2006) 1506 1516 www.esevier.com/ocate/neunet Trainabe fusion rues. I. Large sampe size case Šarūnas Raudys Institute of Mathematics and Informatics, Akademijos 4, Vinius 08633, Lithuania
More information14 Separation of Variables Method
14 Separation of Variabes Method Consider, for exampe, the Dirichet probem u t = Du xx < x u(x, ) = f(x) < x < u(, t) = = u(, t) t > Let u(x, t) = T (t)φ(x); now substitute into the equation: dt
More informationAkaike Information Criterion for ANOVA Model with a Simple Order Restriction
Akaike Information Criterion for ANOVA Mode with a Simpe Order Restriction Yu Inatsu * Department of Mathematics, Graduate Schoo of Science, Hiroshima University ABSTRACT In this paper, we consider Akaike
More informationReichenbachian Common Cause Systems
Reichenbachian Common Cause Systems G. Hofer-Szabó Department of Phiosophy Technica University of Budapest e-mai: gszabo@hps.ete.hu Mikós Rédei Department of History and Phiosophy of Science Eötvös University,
More informationSteepest Descent Adaptation of Min-Max Fuzzy If-Then Rules 1
Steepest Descent Adaptation of Min-Max Fuzzy If-Then Rues 1 R.J. Marks II, S. Oh, P. Arabshahi Λ, T.P. Caude, J.J. Choi, B.G. Song Λ Λ Dept. of Eectrica Engineering Boeing Computer Services University
More information(This is a sample cover image for this issue. The actual cover is not yet available at this time.)
(This is a sampe cover image for this issue The actua cover is not yet avaiabe at this time) This artice appeared in a journa pubished by Esevier The attached copy is furnished to the author for interna
More informationAn approximate method for solving the inverse scattering problem with fixed-energy data
J. Inv. I-Posed Probems, Vo. 7, No. 6, pp. 561 571 (1999) c VSP 1999 An approximate method for soving the inverse scattering probem with fixed-energy data A. G. Ramm and W. Scheid Received May 12, 1999
More informationGlobal sensitivity analysis using low-rank tensor approximations
Goba sensitivity anaysis using ow-rank tensor approximations K. Konaki 1 and B. Sudret 1 1 Chair of Risk, Safety and Uncertainty Quantification, arxiv:1605.09009v1 [stat.co] 29 May 2016 ETH Zurich, Stefano-Franscini-Patz
More informationarxiv: v1 [math.ca] 6 Mar 2017
Indefinite Integras of Spherica Besse Functions MIT-CTP/487 arxiv:703.0648v [math.ca] 6 Mar 07 Joyon K. Boomfied,, Stephen H. P. Face,, and Zander Moss, Center for Theoretica Physics, Laboratory for Nucear
More informationConverting Z-number to Fuzzy Number using. Fuzzy Expected Value
ISSN 1746-7659, Engand, UK Journa of Information and Computing Science Vo. 1, No. 4, 017, pp.91-303 Converting Z-number to Fuzzy Number using Fuzzy Expected Vaue Mahdieh Akhbari * Department of Industria
More informationResearch of Data Fusion Method of Multi-Sensor Based on Correlation Coefficient of Confidence Distance
Send Orders for Reprints to reprints@benthamscience.ae 340 The Open Cybernetics & Systemics Journa, 015, 9, 340-344 Open Access Research of Data Fusion Method of Muti-Sensor Based on Correation Coefficient
More informationBALANCING REGULAR MATRIX PENCILS
BALANCING REGULAR MATRIX PENCILS DAMIEN LEMONNIER AND PAUL VAN DOOREN Abstract. In this paper we present a new diagona baancing technique for reguar matrix pencis λb A, which aims at reducing the sensitivity
More informationarxiv: v1 [cs.db] 1 Aug 2012
Functiona Mechanism: Regression Anaysis under Differentia Privacy arxiv:208.029v [cs.db] Aug 202 Jun Zhang Zhenjie Zhang 2 Xiaokui Xiao Yin Yang 2 Marianne Winsett 2,3 ABSTRACT Schoo of Computer Engineering
More informationResearch Article Analysis of Heart Transplant Survival Data Using Generalized Additive Models
Hindawi Pubishing Corporation Computationa and Mathematica Methods in Medicine Voume 2013, Artice ID 609857, 7 pages http://dx.doi.org/10.1155/2013/609857 Research Artice Anaysis of Heart Transpant Surviva
More information1D Heat Propagation Problems
Chapter 1 1D Heat Propagation Probems If the ambient space of the heat conduction has ony one dimension, the Fourier equation reduces to the foowing for an homogeneous body cρ T t = T λ 2 + Q, 1.1) x2
More informationApproximated MLC shape matrix decomposition with interleaf collision constraint
Approximated MLC shape matrix decomposition with intereaf coision constraint Thomas Kainowski Antje Kiese Abstract Shape matrix decomposition is a subprobem in radiation therapy panning. A given fuence
More informationMath 220B - Summer 2003 Homework 1 Solutions
Math 0B - Summer 003 Homework Soutions Consider the eigenvaue probem { X = λx 0 < x < X satisfies symmetric BCs x = 0, Suppose f(x)f (x) x=b x=a 0 for a rea-vaued functions f(x) which satisfy the boundary
More informationRestricted weak type on maximal linear and multilinear integral maps.
Restricted weak type on maxima inear and mutiinear integra maps. Oscar Basco Abstract It is shown that mutiinear operators of the form T (f 1,..., f k )(x) = R K(x, y n 1,..., y k )f 1 (y 1 )...f k (y
More informationu(x) s.t. px w x 0 Denote the solution to this problem by ˆx(p, x). In order to obtain ˆx we may simply solve the standard problem max x 0
Bocconi University PhD in Economics - Microeconomics I Prof M Messner Probem Set 4 - Soution Probem : If an individua has an endowment instead of a monetary income his weath depends on price eves In particuar,
More informationProblem set 6 The Perron Frobenius theorem.
Probem set 6 The Perron Frobenius theorem. Math 22a4 Oct 2 204, Due Oct.28 In a future probem set I want to discuss some criteria which aow us to concude that that the ground state of a sef-adjoint operator
More informationMore Scattering: the Partial Wave Expansion
More Scattering: the Partia Wave Expansion Michae Fower /7/8 Pane Waves and Partia Waves We are considering the soution to Schrödinger s equation for scattering of an incoming pane wave in the z-direction
More informationarxiv: v2 [stat.ml] 17 Mar 2015
Journa of Machine Learning Research 1x (201x) x-xx Submitted x/0x; Pubished x/0x A Unifying Framework in Vector-vaued Reproducing Kerne Hibert Spaces for Manifod Reguarization and Co-Reguarized Muti-view
More informationSoft Clustering on Graphs
Soft Custering on Graphs Kai Yu 1, Shipeng Yu 2, Voker Tresp 1 1 Siemens AG, Corporate Technoogy 2 Institute for Computer Science, University of Munich kai.yu@siemens.com, voker.tresp@siemens.com spyu@dbs.informatik.uni-muenchen.de
More informationSeparation of Variables and a Spherical Shell with Surface Charge
Separation of Variabes and a Spherica She with Surface Charge In cass we worked out the eectrostatic potentia due to a spherica she of radius R with a surface charge density σθ = σ cos θ. This cacuation
More informationProbabilistic Graphical Models
Schoo of Computer Science Probabiistic Graphica Modes Gaussian graphica modes and Ising modes: modeing networks Eric Xing Lecture 0, February 0, 07 Reading: See cass website Eric Xing @ CMU, 005-07 Network
More informationSequential Decoding of Polar Codes with Arbitrary Binary Kernel
Sequentia Decoding of Poar Codes with Arbitrary Binary Kerne Vera Miosavskaya, Peter Trifonov Saint-Petersburg State Poytechnic University Emai: veram,petert}@dcn.icc.spbstu.ru Abstract The probem of efficient
More informationLearning Fully Observed Undirected Graphical Models
Learning Fuy Observed Undirected Graphica Modes Sides Credit: Matt Gormey (2016) Kayhan Batmangheich 1 Machine Learning The data inspires the structures we want to predict Inference finds {best structure,
More informationLECTURE NOTES 9 TRACELESS SYMMETRIC TENSOR APPROACH TO LEGENDRE POLYNOMIALS AND SPHERICAL HARMONICS
MASSACHUSETTS INSTITUTE OF TECHNOLOGY Physics Department Physics 8.07: Eectromagnetism II October 7, 202 Prof. Aan Guth LECTURE NOTES 9 TRACELESS SYMMETRIC TENSOR APPROACH TO LEGENDRE POLYNOMIALS AND SPHERICAL
More informationDetermining The Degree of Generalization Using An Incremental Learning Algorithm
Determining The Degree of Generaization Using An Incrementa Learning Agorithm Pabo Zegers Facutad de Ingeniería, Universidad de os Andes San Caros de Apoquindo 22, Las Condes, Santiago, Chie pzegers@uandes.c
More informationUniprocessor Feasibility of Sporadic Tasks with Constrained Deadlines is Strongly conp-complete
Uniprocessor Feasibiity of Sporadic Tasks with Constrained Deadines is Strongy conp-compete Pontus Ekberg and Wang Yi Uppsaa University, Sweden Emai: {pontus.ekberg yi}@it.uu.se Abstract Deciding the feasibiity
More informationA Solution to the 4-bit Parity Problem with a Single Quaternary Neuron
Neura Information Processing - Letters and Reviews Vo. 5, No. 2, November 2004 LETTER A Soution to the 4-bit Parity Probem with a Singe Quaternary Neuron Tohru Nitta Nationa Institute of Advanced Industria
More informationA Ridgelet Kernel Regression Model using Genetic Algorithm
A Ridgeet Kerne Regression Mode using Genetic Agorithm Shuyuan Yang, Min Wang, Licheng Jiao * Institute of Inteigence Information Processing, Department of Eectrica Engineering Xidian University Xi an,
More informationLecture Note 3: Stationary Iterative Methods
MATH 5330: Computationa Methods of Linear Agebra Lecture Note 3: Stationary Iterative Methods Xianyi Zeng Department of Mathematica Sciences, UTEP Stationary Iterative Methods The Gaussian eimination (or
More informationEmmanuel Abbe Colin Sandon
Detection in the stochastic bock mode with mutipe custers: proof of the achievabiity conjectures, acycic BP, and the information-computation gap Emmanue Abbe Coin Sandon Abstract In a paper that initiated
More informationSUPPLEMENTARY MATERIAL TO INNOVATED SCALABLE EFFICIENT ESTIMATION IN ULTRA-LARGE GAUSSIAN GRAPHICAL MODELS
ISEE 1 SUPPLEMENTARY MATERIAL TO INNOVATED SCALABLE EFFICIENT ESTIMATION IN ULTRA-LARGE GAUSSIAN GRAPHICAL MODELS By Yingying Fan and Jinchi Lv University of Southern Caifornia This Suppementary Materia
More informationPartial permutation decoding for MacDonald codes
Partia permutation decoding for MacDonad codes J.D. Key Department of Mathematics and Appied Mathematics University of the Western Cape 7535 Bevie, South Africa P. Seneviratne Department of Mathematics
More information8 APPENDIX. E[m M] = (n S )(1 exp( exp(s min + c M))) (19) E[m M] n exp(s min + c M) (20) 8.1 EMPIRICAL EVALUATION OF SAMPLING
8 APPENDIX 8.1 EMPIRICAL EVALUATION OF SAMPLING We wish to evauate the empirica accuracy of our samping technique on concrete exampes. We do this in two ways. First, we can sort the eements by probabiity
More informationNEW DEVELOPMENT OF OPTIMAL COMPUTING BUDGET ALLOCATION FOR DISCRETE EVENT SIMULATION
NEW DEVELOPMENT OF OPTIMAL COMPUTING BUDGET ALLOCATION FOR DISCRETE EVENT SIMULATION Hsiao-Chang Chen Dept. of Systems Engineering University of Pennsyvania Phiadephia, PA 904-635, U.S.A. Chun-Hung Chen
More information