Kernel Reconstruction ICA for Sparse Representation


Yanhui Xiao, Zhenfeng Zhu, and Yao Zhao

arXiv preprint [cs.CV], 9 Apr 2013

Y. Xiao and Z. Zhu are with the Institute of Information Science, Beijing Jiaotong University, and with the Beijing Key Laboratory of Advanced Information Science and Network Technology, Beijing 100044, China (e-mail: xiaoyanhui@gmail.com, zhfzhu@bjtu.edu.cn). Y. Zhao is with the Institute of Information Science, Beijing Jiaotong University, and with the State Key Laboratory of Rail Traffic Control and Safety, Beijing 100044, China (e-mail: yzhao@bjtu.edu.cn).

Abstract: Independent Component Analysis (ICA) is an effective unsupervised tool for learning statistically independent representations. However, ICA is not only sensitive to whitening but also has difficulty learning an over-complete basis. Consequently, ICA with a soft Reconstruction cost (RICA) was presented to learn sparse representations with an over-complete basis even on unwhitened data. However, RICA cannot represent data with nonlinear structure due to its intrinsic linearity. In addition, RICA is essentially an unsupervised method and cannot utilize class information. In this paper, we propose a kernel ICA model with a reconstruction constraint (kRICA) to capture nonlinear features. To bring in class information, we further extend the unsupervised kRICA to a supervised one by introducing a discrimination constraint, namely d-kRICA. This constraint leads to learning a structured basis consisting of basis vectors from different basis subsets corresponding to different class labels. Each subset will then sparsely represent its own class well but not the others. Furthermore, data samples belonging to the same class will have similar representations, and thereby the learned sparse representations carry more discriminative power. Experimental results validate the effectiveness of kRICA and d-kRICA for image classification.

Index Terms: Independent component analysis, nonlinear mapping, supervised learning, image classification.

1 INTRODUCTION

Sparsity is an attribute characterizing a mass of natural and man-made signals [1], and has played a vital role in the success of many machine learning algorithms and techniques, such as compressed sensing [2], matrix factorization [3], sparse coding [4], dictionary learning [5], [6], sparse auto-encoders [7], Restricted Boltzmann Machines (RBMs) [8] and Independent Component Analysis (ICA) [9]. Among these, ICA transforms an observed multidimensional random vector into sparse components that are statistically as independent from each other as possible. Specifically, to estimate the independent components, a general principle is the maximization of non-gaussianity [9]. This is based on the central limit theorem, which implies that a sum of independent random variables is closer to gaussian than any of the original random variables, i.e., non-gaussian indicates independent. Meanwhile, sparsity is one form of non-gaussianity [10], and it is dominant in natural images, so maximization of sparseness in natural images is basically equivalent to maximization of non-gaussianity. Thus, ICA has been successfully applied to learn sparse representations for classification tasks by maximizing sparsity [11].

However, there are two main drawbacks to standard ICA. 1) ICA is sensitive to whitening, which is an important preprocessing step in ICA to extract efficient features. In addition, standard ICA has difficulty exactly whitening high-dimensional data. For example, an input image of 100 x 100 pixels could be exactly whitened by principal component analysis (PCA), while
it has to solve the eigen-decomposition of the 10,000 x 10,000 covariance matrix. 2) ICA is hard to use for learning an over-complete basis (that is, a basis in which the number of basis vectors is greater than the dimensionality of the input data). Yet Coates et al. [12] have shown that several approaches with over-complete bases, e.g., sparse auto-encoders [7], K-means [12] and RBMs [8], obtain an improvement in classification performance. This puts ICA at a disadvantage compared to these methods.

Both drawbacks are mainly due to the hard orthonormality constraint in standard ICA. Mathematically, that is WW^T = I, which is utilized to prevent degenerate solutions for the basis matrix W, where each basis vector is a row of W. However, this orthonormalization cannot be satisfied when W is over-complete. Specifically, the optimization problem of standard ICA is generally solved using gradient descent methods, where W is orthonormalized at each iteration by symmetric orthonormalization, i.e., W <- (WW^T)^{-1/2} W, which does not work for over-complete learning, as the sketch below illustrates. In addition, although alternative orthonormalization methods could be employed to learn an over-complete basis, they are not only expensive to compute but also may suffer from the accumulation of errors.

To address the above issues, Q. V. Le et al. [13] replaced the orthonormality constraint with a robust soft reconstruction cost for ICA (RICA). Thus, RICA can learn sparse representations with a highly over-complete basis even on unwhitened data.
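To see concretely why the hard constraint blocks over-complete learning, the following small NumPy sketch (our own illustration, not the authors' code) applies the symmetric orthonormalization step W <- (WW^T)^{-1/2} W: it succeeds for a complete basis, but for an over-complete W the matrix WW^T is rank-deficient, so its inverse square root does not exist.

```python
import numpy as np

def sym_orthonormalize(W):
    """Symmetric orthonormalization W <- (W W^T)^{-1/2} W used in standard ICA."""
    M = W @ W.T
    evals, evecs = np.linalg.eigh(M)
    return evecs @ np.diag(evals ** -0.5) @ evecs.T @ W

rng = np.random.default_rng(0)
n = 10
W_complete = rng.standard_normal((n, n))        # complete basis: orthonormalization works
W_ortho = sym_orthonormalize(W_complete)
print(np.allclose(W_ortho @ W_ortho.T, np.eye(n)))   # True

W_over = rng.standard_normal((2 * n, n))        # over-complete basis (K = 2n > n)
M = W_over @ W_over.T                           # rank-deficient, so (W W^T)^{-1/2} is undefined
print(np.linalg.matrix_rank(M), M.shape[0])     # rank n < 2n
```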

However, this model is so far also a linear technique, which cannot discover nonlinear relationships among the input data. Additionally, as an unsupervised method, RICA may not be sufficient for classification tasks, since it fails to consider the association between a training sample and its class.

Recall that, to explore nonlinear features, the kernel trick [14] can be used to nonlinearly project the input data into a high-dimensional feature space. Therefore, we develop a kernel extension of RICA (kRICA) to represent data with nonlinear structure. In addition, to bring in label information, we further extend the unsupervised kRICA to a supervised one by introducing a discrimination constraint, namely d-kRICA. In particular, this constraint maximizes the homogeneous representation cost and minimizes the inhomogeneous representation cost jointly, which leads to learning a structured basis consisting of basis vectors from different basis subsets corresponding to the class labels. Each subset will then sparsely represent its own class well but not the others. Furthermore, data samples belonging to the same class will have similar representations, and thereby the obtained sparse representations carry more discriminative power.

It is important to note that this work is fundamentally based on our previous work DRICA [15]. In comparison to DRICA, we improve our work as follows. 1) By taking advantage of the kernel trick, we replace the linear projection with a nonlinear one to capture nonlinear features. Experimental results show that this kernel extension usually further improves image classification accuracy. 2) The discriminative capability of the basis is further enhanced by maximizing the homogeneous representation cost in addition to minimizing the inhomogeneous representation cost. Thus, we obtain a set of more discriminative basis vectors that are forced to sparsely represent their own classes better but the other classes more poorly. Experiments show that this basis further boosts image classification performance. 3) In the experiments, we conduct a comprehensive analysis of the proposed method, e.g., the effects of different parameters and kernels on image classification, experiment settings, and a comparative similarity analysis.

The rest of the paper is organized as follows. In Section 2, we revisit related work on sparse coding and RICA, and describe the connection between them. We then give a brief review of reconstruction ICA in Section 3. Section 4 introduces the details of our proposed kRICA, including its optimization problem and implementation. By incorporating the discrimination constraint, kRICA is further extended to supervised learning in Section 5. Section 6 presents extensive experimental results on image classification. Finally, we conclude our work in Section 7.

2 RELATED WORK

In this section, we review related work in the following aspects: (1) sparse coding and its applications; (2) the connection between RICA and sparse coding; (3) other kernel sparse representation algorithms.

Sparse coding is an unsupervised method for reconstructing a given signal by selecting a relatively small subset of basis vectors from an over-complete basis set, while keeping the reconstruction error as small as possible. Because of its plausible statistical theory [16], sparse coding has attracted more and more attention in the computer vision field. Meanwhile, it has been successfully used in more and more computer vision applications, e.g., image classification [17], [18], [19], face
recognition [20], image restoration [21], etc. This success is largely due to two factors. 1) The sparsity characteristic exists ubiquitously in many computer vision applications. For example, in image classification, the components of an image can be sparsely reconstructed from similar components of other images of the same class [17]. Another example is face recognition: the face image to be tested can be accurately reconstructed from a few training images of the same category [20]. As a consequence, sparsity is the foundation of these applications based on sparse coding. 2) Images are often corrupted by noise, which may arise from sensor imperfection, poor illumination or communication errors. Sparse coding can effectively select the related basis vectors to reconstruct the clean image, and meanwhile can deal with noise by allowing reconstruction error and promoting sparsity. Therefore, sparse coding has been successfully applied to image denoising [22], image restoration [21], etc.

Similar to sparse coding, ICA with a reconstruction cost (RICA) [13] can also learn highly over-complete sparse representations. In addition, it has been shown in [13] that RICA is mathematically equivalent to sparse coding when using explicit encoding and ignoring the norm ball constraint.

The above-mentioned studies only seek sparse representations of the input data in the original data space, which is not adequate for representing data with nonlinear structure. To solve this problem, Yang et al. [23] developed a two-phase kernel ICA algorithm: whitened kernel principal component analysis (KPCA) plus ICA. Different from [23], another solution [24] uses a contrast function based on canonical correlations in a reproducing kernel Hilbert space. However, neither of these methods can learn over-complete sparse representations of nonlinear features, due to the orthonormality constraint. Therefore, to find such representations, Gao et al. [25], [26] presented a kernel sparse coding method (KSR) in a high-dimensional feature space. But this work, being an unsupervised approach, fails to utilize class information. Additionally, in Section 4.3, we show that our proposed kernel extension of RICA (kRICA) is equivalent to KSR under certain conditions.

3 RECONSTRUCTION ICA

Since sparsity is one form of non-gaussianity, maximization of sparsity for ICA is equivalent to maximization of independence [10]. Given the unlabeled data set X = {x_i}_{i=1}^m, where x_i ∈ R^n, the optimization problem of standard ICA [9] is generally defined as

\min_W \sum_{i=1}^{m} \sum_{j=1}^{K} g(w_j x_i)  s.t.  WW^T = I,   (1)

where g(·) is a nonlinear convex function, W = [w_1, w_2, ..., w_K]^T ∈ R^{K×n} is the basis matrix, K is the number of basis vectors, w_j is the j-th row basis vector of W, and I is the identity matrix. The orthonormality constraint WW^T = I is traditionally utilized to prevent the basis vectors in W from becoming degenerate. Meanwhile, a good general-purpose smooth L1 penalty is g(·) = log(cosh(·)) [10].

However, as pointed out above, the orthonormality constraint makes it difficult for standard ICA to learn an over-complete basis. In addition, ICA is sensitive to whitening. These drawbacks restrict ICA from scaling to high-dimensional data. Consequently, RICA [13] uses a soft reconstruction cost to replace the orthonormality constraint in ICA. Applying this replacement to Equation (1), RICA can be formulated as the following unconstrained problem:

\min_W \sum_{i=1}^{m} [ \|W^T W x_i - x_i\|_2^2 + λ g(W x_i) ],   (2)

where the parameter λ is a tradeoff between reconstruction and sparsity. By swapping the orthonormality constraint for a reconstruction penalty, RICA can learn sparse representations even on unwhitened data when W is over-complete.

Furthermore, since the L1 penalty is not sufficient to learn invariant features [10], RICA [13], [27] replaced it with an L2 pooling penalty, which encourages pooling features to group similar features together so as to achieve complex invariances such as scale and rotational invariance. Besides, L2 pooling also promotes sparsity for feature learning. In particular, L2 pooling [28], [29] is a two-layered network with a square nonlinearity in the first layer and a square-root nonlinearity in the second layer:

g(W x_i) = \sum_{j=1}^{K} \sqrt{ ε + H_j (W x_i)^2 },   (3)

where H_j is the j-th row of the spatial pooling matrix H ∈ R^{K×K}, fixed to uniform weights, and ε is a small constant that prevents division by zero.

Nevertheless, RICA cannot represent data with nonlinear structure due to its intrinsic linearity. In addition, this model simply learns an over-complete basis set with a reconstruction cost and fails to consider the association between a training sample and its class, which may be insufficient for classification tasks. To address these problems, on one hand, we focus on developing a kernel extension of RICA to find sparse representations of nonlinear features; on the other hand, we aim to learn a more discriminative basis than unsupervised RICA by bringing in class information, which will facilitate better performance of the sparse representations in classification tasks.
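As a concrete illustration of Equations (2) and (3), the NumPy sketch below evaluates the RICA objective (soft reconstruction cost plus L2 pooling penalty) for a given basis. The function and variable names are our own, and the uniform pooling matrix H is just one simple choice consistent with the description above; this is an illustrative sketch, not the authors' implementation.

```python
import numpy as np

def l2_pooling(WX, H, eps=1e-6):
    """L2 pooling penalty of Eq. (3): sum_j sqrt(eps + H_j (W x_i)^2), summed over samples."""
    # WX: K x m matrix of responses W x_i; H: K x K pooling matrix with uniform weights.
    pooled = H @ (WX ** 2)               # K x m, row j holds H_j (W x_i)^2 for every sample i
    return np.sqrt(eps + pooled).sum()

def rica_objective(W, X, lam, H):
    """RICA objective of Eq. (2): reconstruction cost + lambda * L2 pooling penalty."""
    # X: n x m data matrix (columns are samples), W: K x n basis matrix.
    WX = W @ X                           # K x m responses
    recon = W.T @ WX - X                 # n x m residuals W^T W x_i - x_i
    return np.sum(recon ** 2) + lam * l2_pooling(WX, H)

# Toy usage with an over-complete basis (K > n) on random data.
rng = np.random.default_rng(0)
n, K, m = 64, 128, 500
X = rng.standard_normal((n, m))
W = rng.standard_normal((K, n))
H = np.full((K, K), 1.0 / K)             # uniform pooling weights
print(rica_objective(W, X, lam=0.1, H=H))
```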
4 KERNEL EXTENSION FOR RICA

Motivated by the success of the kernel trick in capturing the nonlinear structure of data [14], we propose a kernel version of RICA, called kRICA, to learn sparse representations of nonlinear features.

4.1 Model Formulation

Suppose that there is a kernel function κ(·,·) induced by a high-dimensional feature mapping φ : R^n → R^N, where n ≪ N. Given two data points x_i and x_j, κ(x_i, x_j) = φ(x_i)^T φ(x_j) represents a nonlinear similarity between them. The mapping takes the data and the basis from the original data space to the feature space as follows:

x →_φ φ(x),   W = [w_1, ..., w_K]^T →_φ W = [φ(w_1), ..., φ(w_K)]^T.   (4)

Furthermore, by substituting the mapped data and basis into Equation (2), we obtain the following objective function of kRICA:

\min_W \sum_{i=1}^{m} [ \|W^T W φ(x_i) - φ(x_i)\|_2^2 + λ g(W φ(x_i)) ].   (5)

Owing to its excellent performance in many computer vision applications [14], [25], the Gaussian kernel, i.e., κ(x_i, x_j) = exp(-γ ||x_i - x_j||_2^2), is used in this study. Thus, the norm ball constraints on the basis in RICA can be removed, since φ(w_i)^T φ(w_i) = κ(w_i, w_i) = 1. In addition, we perform kernel principal component analysis (KPCA) in the feature space for data whitening, similar to [23], which makes the ICA estimation problem simpler and better conditioned [10].

When the data is whitened, there is a close relationship between kernel ICA [23] and kRICA. Regarding this relationship, we have the following Lemma:

Lemma 4.1. When the input data set X = {x_i}_{i=1}^m is whitened in the feature space, the reconstruction cost \sum_{i=1}^{m} \|W^T W φ(x_i) - φ(x_i)\|_2^2 is equivalent to the orthonormality cost \|W^T W - I\|_F^2, where ||·||_F is the Frobenius norm.

Lemma 4.1 shows that kernel ICA's hard orthonormality constraint and kRICA's reconstruction cost are equivalent when the data is whitened. However, kRICA can learn over-complete sparse representations of nonlinear features, whereas kernel ICA fails to do so because of the orthonormality constraint. Please see Appendix A for a detailed proof.
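The identity behind Lemma 4.1 can be checked numerically. The sketch below does so in a finite-dimensional linear space (i.e., with φ taken as the identity map), which is enough to see the argument of Appendix A at work; the variable names are ours, and the data is whitened so that Σ_i x_i x_i^T = I, as assumed in the proof.

```python
import numpy as np

rng = np.random.default_rng(1)
n, K, m = 20, 35, 1000
X = rng.standard_normal((n, m))

# Whiten so that sum_i x_i x_i^T = X X^T = I, the assumption used in Appendix A.
C = X @ X.T
evals, evecs = np.linalg.eigh(C)
X_w = evecs @ np.diag(evals ** -0.5) @ evecs.T @ X

W = rng.standard_normal((K, n))                      # over-complete basis (K > n)
recon_cost = np.sum((W.T @ (W @ X_w) - X_w) ** 2)    # sum_i ||W^T W x_i - x_i||^2
ortho_cost = np.linalg.norm(W.T @ W - np.eye(n), 'fro') ** 2
print(recon_cost, ortho_cost)                        # the two costs agree up to floating-point error
```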

4.2 Implementation

Equation (5) is an unconstrained optimization problem. To solve it, we rewrite the objective as follows:

f(W) = \sum_{i=1}^{m} [ \|W^T W φ(x_i) - φ(x_i)\|_2^2 + λ g(W φ(x_i)) ]
     = \sum_{i=1}^{m} [ 1 + \sum_{u=1}^{K} \sum_{v=1}^{K} κ(w_u, x_i) κ(w_u, w_v) κ(w_v, x_i) - 2 \sum_{u=1}^{K} (κ(w_u, x_i))^2
       + λ \sum_{j=1}^{K} \sqrt{ ε + \sum_{u=1}^{K} h_{ju} (κ(w_u, x_i))^2 } ],   (6)

where w_u and w_v are rows of the basis W, and h_{ju} is an element of the pooling matrix H. Since the row w_j of W appears inside the kernel κ(w_j, ·), it is very hard to directly apply the optimization methods used in RICA, e.g., L-BFGS and CG [30], to compute the optimal basis. Thus, we alternatively optimize each row of the basis W instead. With respect to each updated row w_p of W, the derivative of f(W) is

∂f/∂w_p = γ \sum_{i=1}^{m} [ -4 \sum_{v=1}^{K} κ(w_p, x_i) κ(w_p, w_v) κ(w_v, x_i) ((w_p - x_i) + (w_p - w_v))
          + 8 (κ(w_p, x_i))^2 (w_p - x_i)
          - 2λ \sum_{j=1}^{K} h_{jp} (κ(w_p, x_i))^2 (w_p - x_i) / \sqrt{ ε + \sum_{v=1}^{K} h_{jv} (κ(w_v, x_i))^2 } ].   (7)

Then, to compute the optimal w_p, we set ∂f/∂w_p = 0. Since w_p appears inside κ(w_p, ·), it is challenging to solve Equation (7) directly. Thus, we seek an approximate solution instead of the exact one. Inspired by the fixed point algorithm [25], to update w_p in the q-th iteration, we use the result of w_p from the (q-1)-th iteration to evaluate the terms inside the kernel function. In addition, we use k-means to initialize the basis, following [25]. Denoting the value of w_p in the q-th iteration by w_{p,(q)}, Equation (7) with respect to w_{p,(q)} becomes

γ \sum_{i=1}^{m} [ -4 \sum_{v=1}^{K} κ(w_{p,(q-1)}, x_i) κ(w_{p,(q-1)}, w_v) κ(w_v, x_i) ((w_{p,(q)} - x_i) + (w_{p,(q)} - w_v))
  + 8 (κ(w_{p,(q-1)}, x_i))^2 (w_{p,(q)} - x_i)
  - 2λ \sum_{j=1}^{K} h_{jp} (κ(w_{p,(q-1)}, x_i))^2 (w_{p,(q)} - x_i) / \sqrt{ ε + \sum_{v=1}^{K} h_{jv} (κ(w_v, x_i))^2 } ] = 0.

When all the remaining rows are fixed, this becomes a linear equation in w_{p,(q)}, which can be solved straightforwardly.

4.3 Connection between kRICA and KSR

There is clearly a close connection between the proposed kRICA and KSR [25]. Similar to kRICA, KSR attempts to find sparse representations of nonlinear features in a high-dimensional feature space, and its optimization problem is

\min_{W, s_i} \sum_{i=1}^{m} [ \|W^T s_i - φ(x_i)\|_2^2 + λ \|s_i\|_1 ],   (8)

where s_i ∈ R^K is the sparse representation of sample x_i. There are two major differences between the two models. (1) Unlike kRICA, which uses the explicit encoding s_i = W φ(x_i) for the sparse representation of an input sample, KSR treats the codes as free variables; since the objective of Equation (8) in KSR is not convex, the basis W and the sparse codes s_i have to be optimized alternately. (2) The simple L1 penalty, g(s_i) = ||s_i||_1, is employed by KSR to promote sparsity, while kRICA uses L2 pooling instead, which forces the pooling features to group similar features together to achieve invariance while still promoting sparsity.
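For clarity, the following NumPy sketch evaluates the expanded objective f(W) of Equation (6) using only Gaussian kernel evaluations, which is what makes the computation feasible without forming φ explicitly. The names and the toy initialization are ours, and the row-wise fixed-point update of Equation (7) is not reproduced here; this is an illustrative sketch under those assumptions.

```python
import numpy as np

def gaussian_kernel(A, B, gamma):
    """kappa(a, b) = exp(-gamma * ||a - b||^2) for all rows of A against all rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * sq)

def krica_objective(W, X, H, lam, gamma, eps=1e-6):
    """Evaluate Eq. (6) through kernel evaluations only.
    W: K x n basis (rows w_u), X: m x n data (rows x_i), H: K x K pooling matrix."""
    Kwx = gaussian_kernel(W, X, gamma)          # K x m, entries kappa(w_u, x_i)
    Kww = gaussian_kernel(W, W, gamma)          # K x K, entries kappa(w_u, w_v)
    m = X.shape[0]
    recon = m + np.sum(Kwx * (Kww @ Kwx)) - 2 * np.sum(Kwx**2)   # feature-space reconstruction cost
    pool = np.sqrt(eps + H @ (Kwx**2)).sum()                     # L2 pooling on kernel responses
    return recon + lam * pool

# Toy usage with a k-means-style initialization of the basis (here: rows sampled from the data).
rng = np.random.default_rng(0)
m, n, K = 200, 16, 32
X = rng.standard_normal((m, n))
W = X[rng.choice(m, K, replace=False)]          # initialize basis rows from data points
H = np.full((K, K), 1.0 / K)
print(krica_objective(W, X, H, lam=0.1, gamma=0.1))
```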
5 SUPERVISED KERNEL RICA

Given labeled training data, our goal is to utilize class information to learn a structured basis set, which consists of basis vectors from different basis subsets corresponding to different class labels. Each subset will then sparsely represent its own class well but not the others. Thus, to learn such a basis, we further extend the unsupervised kRICA to a supervised one by introducing a discrimination constraint, namely d-kRICA. Mathematically, when the sample x_i is labeled as y_i ∈ {1, ..., c}, where c is the total number of classes, we can further utilize class information to learn a structured basis set W = [W^(1), W^(2), ..., W^(c)]^T ∈ R^{K×n}, where W^(y_i) ∈ R^{k×n} is the basis subset that represents well a sample x_i belonging to the y_i-th class rather than the others, k is the number of basis vectors in each subset, and K = k·c. Let s_i = W x_i, where s_i can be regarded as the sparse representation of sample x_i [31].

5.1 Discrimination constraint

Since we aim to utilize class information to learn a structured basis, we hope that a sample x_i labeled as y_i will be reconstructed only by the basis subset W^(y_i) with coefficients s_i. To achieve this goal, an inhomogeneous representation cost constraint [15], [31] was previously utilized to minimize the inhomogeneous representation coefficients of s_i, i.e., the coefficients corresponding to basis vectors that do not belong to W^(y_i). However, this constraint only focuses on minimizing the inhomogeneous coefficients and fails to consider maximizing the homogeneous ones, which is not sufficient to learn an optimal structured basis.

Consequently, to learn such a basis, we introduce a discrimination constraint, which maximizes the homogeneous representation cost and minimizes the inhomogeneous representation cost jointly. Mathematically, we define the homogeneous cost as P+ and the inhomogeneous cost as P-. Specifically,

P+ = \|D_{+y_i} s_i\|_2^2,   P- = \|D_{-y_i} s_i\|_2^2,   (9)

where D_{+y_i} ∈ R^{1×K} and D_{-y_i} ∈ R^{1×K} select the homogeneous and inhomogeneous representation coefficients of s_i, respectively. For example, assuming W = [W^(1), W^(2), W^(3)]^T with W^(y_i) ∈ R^{2×n} (y_i ∈ {1, 2, 3}) and y_i = 3, D_{+y_i} and D_{-y_i} can be defined as

D_{+3} = [0 0 0 0 1 1],   D_{-3} = [1 1 1 1 0 0].

Intuitively, we could define the discrimination constraint function d(s_i) as P- - P+, which means that the sparse representation s_i in terms of the basis matrix W will concentrate only on the basis subset W^(y_i). However, this constraint is non-convex and unstable. To address this problem, we propose to incorporate an elastic term ||s_i||_2^2 into d(s_i). Thus, d(s_i) is defined as

d(s_i) = \|D_{-y_i} s_i\|_2^2 - \|D_{+y_i} s_i\|_2^2 + η \|s_i\|_2^2.   (10)

It can be proved that if η ≥ k+1, then d(s_i) is strictly convex in s_i; please see Appendix B for a detailed proof. The constraint (10) maximizes the homogeneous representation cost and minimizes the inhomogeneous representation cost simultaneously, which leads to learning a structured basis consisting of basis vectors from different basis subsets corresponding to the class labels. Each subset will then sparsely represent its own class well but not the others. Furthermore, data samples belonging to the same class will have similar representations, and thereby the obtained new representations carry more discriminative power.

By incorporating the discrimination constraint into the kRICA framework (d-kRICA), we obtain the following objective function:

\min_W \sum_{i=1}^{m} [ \|W^T W φ(x_i) - φ(x_i)\|_2^2 + λ g(W φ(x_i)) + α d(W φ(x_i)) ],   (11)

where λ and α are scalars controlling the relative contribution of the corresponding terms. Given a test sample, Equation (11) means that the learned basis set can sparsely represent it with nonlinear structure while demanding that its homogeneous representations be as large as possible and its inhomogeneous representations as small as possible. Following kRICA, the optimization problem (11) can be solved by the fixed point algorithm proposed above.
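To make the selector construction concrete, the sketch below builds the indicator vectors of Equations (9)-(10) for a toy configuration with c classes and k basis vectors per class, and evaluates d(s). It is our own illustrative code, following the row-vector form of the selectors used in Appendix B; the function names are assumptions, not the authors' API.

```python
import numpy as np

def selectors(y, c, k):
    """Indicator row vectors D_{+y} (homogeneous) and D_{-y} (inhomogeneous) for class y in {1..c}."""
    K = k * c
    d_plus = np.zeros(K)
    d_plus[(y - 1) * k : y * k] = 1.0       # ones over the k coefficients of class y
    d_minus = 1.0 - d_plus                  # ones over all remaining coefficients
    return d_plus, d_minus

def discrimination(s, y, c, k, eta):
    """d(s) of Eq. (10): ||D_{-y} s||^2 - ||D_{+y} s||^2 + eta * ||s||^2."""
    d_plus, d_minus = selectors(y, c, k)
    return (d_minus @ s) ** 2 - (d_plus @ s) ** 2 + eta * s @ s

# Toy usage: c = 3 classes, k = 2 basis vectors each, a sample of class 3, eta = k + 1 for convexity.
s = np.array([0.1, -0.2, 0.0, 0.3, 0.9, 0.8])
print(selectors(3, c=3, k=2))               # D_{+3} = [0 0 0 0 1 1], D_{-3} = [1 1 1 1 0 0]
print(discrimination(s, y=3, c=3, k=2, eta=3.0))
```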
6 EXPERIMENTS

In this section, we first introduce the feature extraction used for image classification. Then, we evaluate the performance of our kRICA and d-kRICA for image classification on three public datasets: Caltech 101 [32], CIFAR-10 [12] and STL-10 [12]. Furthermore, we study the selection of tuning parameters and kernel functions for our method. Finally, we give the similarity matrices to further illustrate the performance of kRICA and d-kRICA.

6.1 Feature Extraction for Classification

Given a p × p input image patch (with d channels) x ∈ R^n (n = p × p × d), kRICA transforms it into a new representation s = W φ(x) ∈ R^K in the feature space, where p is termed the receptive field size. For an image of N × M pixels (with d channels), we obtain a (N - p + 1) × (M - p + 1) feature map (with K channels), following the same setting as in [13], by estimating the representation for each p × p sub-patch of the input image. To reduce the dimensionality of the image representation, we use a pooling method similar to [13] to form a reduced 4K-dimensional pooled representation for image classification. Given the pooled feature for each image, we use a linear SVM for classification.

6.2 Classification on Caltech 101

The Caltech 101 dataset consists of 9144 images divided among 101 object classes and 1 background class, including animals, vehicles, etc. Following the common experimental setup [17], we run our algorithm with 15 and 30 training images per category, using a basis size of K = 1020 and 10 × 10 receptive fields. Comparison results are shown in Table 1. We compare our classification accuracy with ScSPM [17], D-KSVD [6], LC-KSVD [19], RICA [13], KICA [23], KSR [25] and DRICA [15]. In addition, in order to compare with DRICA, we incorporate the discrimination constraint (10) into the RICA framework (2), namely d-RICA. Table 1 shows that kRICA and d-kRICA outperform the other competing approaches.

TABLE 1
Image classification accuracy on the Caltech 101 dataset

Training size     15      30
ScSPM [17]        67.0%   73.2%
D-KSVD [6]        65.1%   73.0%
LC-KSVD [19]      67.7%   73.6%
RICA [13]         67%     73.7%
KICA [23]         65.2%   72.8%
KSR [25]          67.9%   75%
DRICA [15]        67.8%   74.4%
d-RICA            68.7%   75.6%
kRICA             68.2%   75.4%
d-kRICA           73%     77.1%
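The per-image pipeline of Section 6.1 can be summarized in a few lines: encode every p × p sub-patch and then pool the resulting feature map over four spatial regions to obtain a 4K-dimensional vector. The sketch below is our own simplified illustration with a generic encoder and 2 × 2 quadrant sum-pooling; the paper only states that the pooled representation is 4K-dimensional, so the pooling details here are assumptions, not the authors' released pipeline.

```python
import numpy as np

def extract_patches(img, p):
    """All p x p sub-patches of an N x M x d image, flattened to rows of length p*p*d."""
    N, M, d = img.shape
    patches = [img[i:i+p, j:j+p, :].reshape(-1)
               for i in range(N - p + 1) for j in range(M - p + 1)]
    return np.array(patches), (N - p + 1, M - p + 1)

def pooled_representation(img, p, encode):
    """(N-p+1) x (M-p+1) x K feature map, sum-pooled over the 2 x 2 image quadrants -> 4K dims."""
    patches, (rows, cols) = extract_patches(img, p)
    feat = encode(patches).reshape(rows, cols, -1)        # K channels per spatial location
    hr, hc = rows // 2, cols // 2
    quadrants = [feat[:hr, :hc], feat[:hr, hc:], feat[hr:, :hc], feat[hr:, hc:]]
    return np.concatenate([q.sum(axis=(0, 1)) for q in quadrants])   # 4K-dimensional vector

# Toy usage with a random linear encoder standing in for s = W phi(x).
rng = np.random.default_rng(0)
K, p, d = 32, 6, 3
W = rng.standard_normal((K, p * p * d))
img = rng.random((32, 32, d))
print(pooled_representation(img, p, encode=lambda P: P @ W.T).shape)   # (128,) = 4K
```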

TABLE 2
Test classification accuracy on the CIFAR-10 dataset

Model                                    Accuracy
Improved Local Coord. Coding [18]        74.5%
Conv. Deep Belief Net (2 layers) [33]    78.9%
Sparse auto-encoder [12]                 73.4%
Sparse RBM [12]                          72.4%
K-means (Hard) [12]                      68.6%
K-means (Triangle) [12]                  77.9%
K-means (Triangle, 4000 features) [12]   79.6%
RICA [13]                                81.4%
KICA [23]                                78.3%
KSR [25]                                 82.6%
DRICA [15]                               82%
d-RICA                                   82.9%
kRICA                                    83.4%
d-kRICA                                  84.5%

6.3 Classification on CIFAR-10

The CIFAR-10 dataset includes 10 categories and 60,000 color images in all, with 6,000 images per category, such as airplane, automobile, truck and horse. There are 50,000 training images and 10,000 testing images: 1,000 images from each class are randomly selected as test images and the other 5,000 images from each class serve as training images. In this experiment, we fix the size of the basis set to 4000 with 6 × 6 receptive fields, following [12]. We compare our approach with RICA, K-means (Triangle, 4000 features) [12], KSR, DRICA and d-RICA, etc. Table 2 shows the effectiveness of our proposed kRICA and d-kRICA.

Fig. 1. Classification performance on the STL-10 dataset with varying basis size and 8 × 8 receptive fields (accuracy (%) versus the size of the over-complete basis for d-kRICA, kRICA, KSR, d-RICA, DRICA, RICA, ICA, KICA and k-means).

TABLE 3
Test classification accuracy on the STL-10 dataset

Model                                    Accuracy
Raw pixels [12]                          31.8%
K-means (Triangle, 1600 features) [12]   51.5%
RICA (8x8 receptive fields) [13]         54%
RICA (10x10 receptive fields) [13]       52.9%
KICA [23]                                51.1%
KSR [25]                                 54.4%
DRICA [15]                               54.2%
d-RICA                                   54.8%
kRICA                                    55.2%
d-kRICA                                  56.9%

6.4 Classification on STL-10

STL-10 has 10 classes (e.g., airplane, dog, monkey and ship), where each image is a 96x96-pixel color image. The dataset provides 500 training images per class (10 pre-defined folds), 800 test images per class and 100,000 unlabeled images for unsupervised learning. In our experiments, we choose the basis size and 8 × 8 receptive fields in the same manner as described in [13]. Table 3 shows the classification results of raw pixels [12], K-means, RICA, KSR, DRICA, d-RICA, kRICA and d-kRICA.

As can be seen, d-RICA achieves better performance than DRICA on all of the above datasets. This is because DRICA only minimizes the inhomogeneous representation cost for structured basis learning, while d-RICA simultaneously maximizes the homogeneous representation cost and minimizes the inhomogeneous representation cost, which makes the learned sparse representations carry more discriminative power. Although both DRICA and d-RICA introduce class information, the unsupervised kRICA still performs better than both of these algorithms. This means that kRICA captures more discriminative power for classification by representing data with nonlinear structure. Additionally, since kRICA uses L2 pooling instead of the L1 penalty to achieve feature invariance, it demonstrates better performance than KSR. Furthermore, d-kRICA achieves better performance than kRICA in all cases by bringing in class information.

We also investigate the effect of basis size for our proposed kRICA and d-kRICA on the STL-10 dataset. In our experiments, we try seven sizes, including 100, 200, 400, 800 and 1200. As shown in Fig. 1, the classification accuracies of d-kRICA and kRICA continue to increase as the basis size grows, with only slight gains beyond a basis size of 800. In particular, d-kRICA outperforms all the other algorithms across the whole range.
6.5 Tuning Parameter and Kernel Selection

In the experiments, the tuning parameters of kRICA and d-kRICA, i.e., λ, α and γ in the objective function, are chosen by cross-validation to avoid over-fitting. More specifically, we experimentally set these parameters as follows.

The effect of λ: The parameter λ is the weight of the sparsity term, which is an important factor in kRICA. To facilitate parameter selection, we experimentally investigate how the performance of kRICA varies with λ on the STL-10 dataset in Fig. 2 (with γ = 10^-1).

Fig. 2. The relationship between the weight of the sparsity term (λ) and classification accuracy on the STL-10 dataset (accuracy (%) versus λ for RICA and kRICA).

Fig. 3. The relationship between the weight of the discrimination constraint term (α) and classification accuracy on the STL-10 dataset (accuracy (%) versus α for d-kRICA, d-RICA and DRICA).

Fig. 2 shows that kRICA achieves its best performance when λ is fixed to 10^-2. Thus, we set λ = 10^-2 for the STL-10 data. In addition, we test the accuracy of RICA under the same sparsity weight. It is easy to see that our proposed nonlinear RICA (kRICA) consistently outperforms linear RICA across the range of λ. Similarly, we experimentally set λ = 10^-1 for the Caltech data and λ = 10^-2 for the CIFAR-10 data.

The effect of α: The parameter α controls the weight of the discrimination constraint term. When α = 0, the supervised d-kRICA optimization problem reduces to the unsupervised kRICA problem. Fig. 3 shows the relationship between the weight of the discrimination constraint term α and classification accuracy on STL-10. We can see that d-kRICA achieves its best performance when α = 10^-1; hence, we set α = 10^-1 for the STL-10 data. In particular, d-RICA achieves better performance than DRICA over a wide range of α values. This is because DRICA only minimizes the inhomogeneous representation cost, while d-RICA jointly optimizes both the homogeneous and inhomogeneous representation costs for basis learning, which makes the learned sparse representations carry more discriminative power. Furthermore, by representing data with nonlinear structure, d-kRICA provides even more discriminative power for classification and outperforms both of these algorithms. Similarly, we set α = 1 for the Caltech data and α = 10^-1 for the CIFAR-10 data.

The effect of γ: When we use the Gaussian kernel in kRICA, it is vital to select the kernel parameter γ, which affects the image classification accuracy. Fig. 4 shows the relationship between γ and classification accuracy on the STL-10 dataset. Therefore, we set γ = 10^-1 for the STL-10 data. Similarly, we experimentally set γ = 10^-2 for the Caltech data and γ = 10^-1 for the CIFAR-10 data.

Fig. 4. Classification performance on the STL-10 dataset with varying kernel parameter (γ) in the Gaussian kernel (accuracy (%) versus γ for kRICA).

We also investigate the effect of different kernels for kRICA in image classification, i.e., the Polynomial kernel (1 + x^T y)^b, the Inverse Distance kernel 1/(1 + b||x - y||), the Inverse Square Distance kernel 1/(1 + b||x - y||^2), and the Exponential Histogram Intersection kernel \sum_i \min(e^{b x_i}, e^{b y_i}). Table 4 reports the classification performance of the different kernels on the STL-10 dataset; the Gaussian kernel outperforms the other kernels, so we employ the Gaussian kernel in our studies. Following [26], we set b = 3 for the Polynomial kernel and b = 1 for the others.

TABLE 4
Classification performance of different kernels on the STL-10 dataset

Kernel                                      Accuracy
Polynomial kernel                           54.2%
Inverse Distance kernel                     38.3%
Inverse Square Distance kernel              47.6%
Exponential Histogram Intersection kernel   36.5%
Gaussian kernel                             56.9%
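For reference, the kernels compared in Table 4 can be written compactly as below. This is a small illustrative sketch of the formulas as stated above, with b and γ as the hyper-parameters just discussed; the exact formulations in the original experiments may differ slightly, so treat the function names and defaults as our own assumptions.

```python
import numpy as np

def polynomial(x, y, b=3):
    """Polynomial kernel (1 + x^T y)^b."""
    return (1.0 + x @ y) ** b

def inverse_distance(x, y, b=1.0):
    """Inverse Distance kernel 1 / (1 + b * ||x - y||)."""
    return 1.0 / (1.0 + b * np.linalg.norm(x - y))

def inverse_square_distance(x, y, b=1.0):
    """Inverse Square Distance kernel 1 / (1 + b * ||x - y||^2)."""
    return 1.0 / (1.0 + b * np.sum((x - y) ** 2))

def exp_histogram_intersection(x, y, b=1.0):
    """Exponential Histogram Intersection kernel sum_i min(e^{b x_i}, e^{b y_i})."""
    return np.minimum(np.exp(b * x), np.exp(b * y)).sum()

def gaussian(x, y, gamma=0.1):
    """Gaussian kernel exp(-gamma * ||x - y||^2)."""
    return np.exp(-gamma * np.sum((x - y) ** 2))

x, y = np.array([0.2, 0.5, 0.1]), np.array([0.3, 0.4, 0.0])
for kern in (polynomial, inverse_distance, inverse_square_distance,
             exp_histogram_intersection, gaussian):
    print(kern.__name__, kern(x, y))
```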

6.6 Similarity Analysis

In the sections above, we have shown the effectiveness of kRICA and d-kRICA for image classification. To further illustrate their performance, we first choose 90 images from three classes of Caltech 101, with 30 images per class. We then compute the similarity between the sparse representations of these images for RICA, kRICA and d-kRICA, respectively. Fig. 5 shows the similarity matrices corresponding to the sparse representations of RICA, kRICA and d-kRICA. Each element (i, j) of a similarity matrix is the sparse representation similarity, measured by Euclidean distance, between images i and j. Since a good sparse representation method should make the new representations belonging to the same class more similar, its similarity matrix should be block-wise. Fig. 5 shows that the nonlinear kRICA carries more discriminative power than the linear RICA, and d-kRICA achieves the best result by bringing in class information.

Fig. 5. The similarity matrices for the sparse representations of (a) RICA, (b) kRICA and (c) d-kRICA.

7 CONCLUSIONS

In this paper, we propose a kernel ICA model with a reconstruction constraint (kRICA) to capture nonlinear features. To bring in class information, we further extend the unsupervised kRICA to a supervised one by introducing a discrimination constraint. This constraint leads to learning a structured basis consisting of basis vectors from different basis subsets corresponding to different class labels. Each subset will then sparsely represent its own class well but not the others. Furthermore, data samples belonging to the same class will have similar representations, and thereby the obtained sparse representations carry more discriminative power. The experiments conducted on standard datasets demonstrate the effectiveness of the proposed method.

APPENDIX A
PROOF OF LEMMA 4.1

Proof. Since the input data set X = {x_i}_{i=1}^m is whitened in the feature space by KPCA, we have E[φ(X)φ(X)^T] = \sum_{i=1}^{m} φ(x_i)φ(x_i)^T = I, where I is the identity matrix. Furthermore, we can obtain

\sum_{i=1}^{m} \|W^T W φ(x_i) - φ(x_i)\|_2^2
  = \sum_{i=1}^{m} Tr[ (W^T W φ(x_i) - φ(x_i))^T (W^T W φ(x_i) - φ(x_i)) ]
  = Tr[ (W^T W - I)^T (W^T W - I) \sum_{i=1}^{m} φ(x_i)φ(x_i)^T ]
  = \|W^T W - I\|_F^2,

where Tr[·] denotes the trace of a matrix and the derivation uses the matrix property Tr(AB) = Tr(BA). Thus, the reconstruction cost is equivalent to the orthonormality constraint when the data is whitened in the feature space.

APPENDIX B
PROOF OF THE CONVEXITY OF d(s_i)

We rewrite Equation (10) as

d(s_i) = \|D_{-y_i} s_i\|_2^2 - \|D_{+y_i} s_i\|_2^2 + η \|s_i\|_2^2
       = Tr[ s_i^T D_{-y_i}^T D_{-y_i} s_i - s_i^T D_{+y_i}^T D_{+y_i} s_i + η s_i^T s_i ].   (12)

Then we can obtain its Hessian matrix ∇²d with respect to s_i:

∇²d = 2 D_{-y_i}^T D_{-y_i} - 2 D_{+y_i}^T D_{+y_i} + 2ηI.   (13)

Without loss of generality, we assume

D_{+y_i} = [0 ... 0  \overbrace{1 ... 1}^{k}  0 ... 0] ∈ R^{1×K},   D_{-y_i} = [1 ... 1  \overbrace{0 ... 0}^{k}  1 ... 1] ∈ R^{1×K}.

After some derivation, we have ∇²d = 2A, where A_{ii} = η + 1 for inhomogeneous coordinates i, A_{ii} = η - 1 for homogeneous coordinates i, A_{ij} = 1 when i ≠ j are both inhomogeneous, A_{ij} = -1 when i ≠ j are both homogeneous, and A_{ij} = 0 otherwise. The convexity of d(s_i) depends on whether its Hessian matrix ∇²d, i.e., the matrix A, is positive definite [34]. The K × K matrix A is positive definite if and only if z^T A z > 0 for all nonzero vectors z ∈ R^K [35], where z^T denotes the transpose. Let the size of the upper-left (inhomogeneous) block of A be t × t, and let z = [z_1, ..., z_t, z_{t+1}, ..., z_{t+k}, z_{t+k+1}, ..., z_K]^T. Then we have

Az = [ (η+1)z_1 + z_2 + ... + z_t + z_{t+k+1} + ... + z_K
       ...
       z_1 + z_2 + ... + (η+1)z_t + z_{t+k+1} + ... + z_K
       (η-1)z_{t+1} - z_{t+2} - ... - z_{t+k}
       ...
       -z_{t+1} - z_{t+2} - ... + (η-1)z_{t+k}
       z_1 + z_2 + ... + z_t + (η+1)z_{t+k+1} + ... + z_K
       ...
       z_1 + z_2 + ... + z_t + z_{t+k+1} + ... + (η+1)z_K ].

Furthermore, we can get

z^T A z = (η+1) \sum_{i=1}^{t} z_i^2 + (η-1) \sum_{i=t+1}^{t+k} z_i^2 + (η+1) \sum_{i=t+k+1}^{K} z_i^2
          + 2 \sum_{1≤i<j≤t} z_i z_j + 2 \sum_{1≤i≤t, t+k+1≤j≤K} z_i z_j + 2 \sum_{t+k+1≤i<j≤K} z_i z_j - 2 \sum_{t+1≤i<j≤t+k} z_i z_j
        = η ( \sum_{i=1}^{t} z_i^2 + \sum_{i=t+k+1}^{K} z_i^2 ) + ( z_1 + z_2 + ... + z_t + z_{t+k+1} + ... + z_K )^2
          + (η-1) \sum_{i=t+1}^{t+k} z_i^2 - 2 \sum_{t+1≤i<j≤t+k} z_i z_j.

Define h(η) = z^T A z. When η ≥ k+1, it is easy to verify that

h(η) ≥ h(k+1) = (k+1) ( \sum_{i=1}^{t} z_i^2 + \sum_{i=t+k+1}^{K} z_i^2 ) + ( z_1 + ... + z_t + z_{t+k+1} + ... + z_K )^2
                + k \sum_{i=t+1}^{t+k} z_i^2 - 2 \sum_{t+1≤i<j≤t+k} z_i z_j
              = k ( \sum_{i=1}^{t} z_i^2 + \sum_{i=t+k+1}^{K} z_i^2 ) + \sum_{i=1}^{K} z_i^2
                + ( z_1 + ... + z_t + z_{t+k+1} + ... + z_K )^2 + \sum_{t+1≤i<j≤t+k} ( z_i - z_j )^2.

Since \sum_{i=1}^{K} z_i^2 > 0 for nonzero z, we have h(η) ≥ h(k+1) > 0. Thus, the Hessian matrix ∇²d is positive definite for η ≥ k+1, which guarantees that d(s_i) is convex in s_i.
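As a quick sanity check on this result, the sketch below forms the matrix A = D_-^T D_- - D_+^T D_+ + ηI for a small configuration, building the indicator vectors of Section 5.1 inline, and verifies that its smallest eigenvalue is positive at η = k + 1. This is our own illustrative code, not part of the paper.

```python
import numpy as np

def hessian_half(y, c, k, eta):
    """A = D_{-y}^T D_{-y} - D_{+y}^T D_{+y} + eta * I, so that the Hessian of d(s) is 2A."""
    K = k * c
    d_plus = np.zeros(K)
    d_plus[(y - 1) * k : y * k] = 1.0
    d_minus = 1.0 - d_plus
    return np.outer(d_minus, d_minus) - np.outer(d_plus, d_plus) + eta * np.eye(K)

c, k, y = 4, 3, 2
A = hessian_half(y, c, k, eta=k + 1)
print(np.linalg.eigvalsh(A).min())                                   # > 0: strictly convex at eta = k + 1
print(np.linalg.eigvalsh(hessian_half(y, c, k, eta=k - 0.5)).min())  # negative when eta is below k
```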
ACKNOWLEDGMENTS

We thank Wende Dong for helpful discussions, and acknowledge Quoc V. Le for providing the RICA code.

REFERENCES

[1] E. Candes and M. Wakin, "An introduction to compressive sampling," IEEE Signal Processing Magazine, vol. 25, no. 2, pp. 21-30, March 2008.
[2] D. Donoho, "Compressed sensing," IEEE Trans. Inform. Theory, vol. 52, no. 4, pp. 1289-1306, 2006.
[3] D. Lee, H. Seung, et al., "Learning the parts of objects by non-negative matrix factorization," Nature, vol. 401, no. 6755, pp. 788-791, 1999.
[4] B. Olshausen et al., "Emergence of simple-cell receptive field properties by learning a sparse code for natural images," Nature, vol. 381, no. 6583, pp. 607-609, 1996.
[5] M. Aharon, M. Elad, and A. Bruckstein, "K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation," IEEE Trans. Signal Process., vol. 54, no. 11, pp. 4311-4322, 2006.
[6] Q. Zhang and B. Li, "Discriminative K-SVD for dictionary learning in face recognition," in Proc. Comput. Vis. Pattern Recognit., 2010.
[7] Y. Bengio, P. Lamblin, D. Popovici, and H. Larochelle, "Greedy layer-wise training of deep networks," in Proc. Adv. Neural Inform. Process. Syst., vol. 19, 2007, p. 153.
[8] G. Hinton, S. Osindero, and Y. Teh, "A fast learning algorithm for deep belief nets," Neural Comput., vol. 18, no. 7, pp. 1527-1554, 2006.
[9] A. Hyvärinen, J. Karhunen, and E. Oja, Independent Component Analysis. Wiley-Interscience, 2001, vol. 26.
[10] A. Hyvärinen, J. Hurri, and P. Hoyer, Natural Image Statistics. Springer, 2009.
[11] A. Hyvärinen, P. Hoyer, and M. Inki, "Topographic independent component analysis," Neural Comput., vol. 13, no. 7, 2001.
[12] A. Coates, H. Lee, and A. Ng, "An analysis of single-layer networks in unsupervised feature learning," in Proc. AISTATS, 2011.
[13] Q. Le, A. Karpenko, J. Ngiam, and A. Ng, "ICA with reconstruction cost for efficient overcomplete feature learning," in Proc. Adv. Neural Inform. Process. Syst., 2011.
[14] J. Shawe-Taylor and N. Cristianini, Kernel Methods for Pattern Analysis. Cambridge University Press, 2004.
[15] Y. Xiao, Z. Zhu, S. Wei, and Y. Zhao, "Discriminative ICA model with reconstruction constraint for image classification," in Proc. ACM Multimedia, 2012.
[16] D. Donoho, "For most large underdetermined systems of linear equations the minimal l1-norm solution is also the sparsest solution," Commun. Pure Appl. Math., vol. 59, no. 6, pp. 797-829, 2006.
[17] J. Yang, K. Yu, Y. Gong, and T. Huang, "Linear spatial pyramid matching using sparse coding for image classification," in Proc. Comput. Vis. Pattern Recognit., 2009.
[18] K. Yu and T. Zhang, "Improved local coordinate coding using local tangents," in Proc. Int. Conf. Mach. Learn., 2010.
[19] Z. Jiang, Z. Lin, and L. Davis, "Learning a discriminative dictionary for sparse coding via label consistent K-SVD," in Proc. Comput. Vis. Pattern Recognit., 2011.
[20] J. Wright, A. Yang, A. Ganesh, S. Sastry, and Y. Ma, "Robust face recognition via sparse representation," IEEE Trans. Pattern Anal. Mach. Intell., vol. 31, no. 2, pp. 210-227, 2009.

[21] J. Mairal, F. Bach, J. Ponce, G. Sapiro, and A. Zisserman, "Non-local sparse models for image restoration," in Proc. Comput. Vis. Pattern Recognit., 2009.
[22] M. Elad, Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing. Springer, 2010.
[23] J. Yang, X. Gao, D. Zhang, and J. Yang, "Kernel ICA: An alternative formulation and its application to face recognition," Pattern Recognition, vol. 38, no. 10, 2005.
[24] F. Bach and M. Jordan, "Kernel independent component analysis," J. Mach. Learn. Res., vol. 3, pp. 1-48, 2003.
[25] S. Gao, I. Tsang, and L. Chia, "Kernel sparse representation for image classification and face recognition," in Proc. ECCV, 2010, pp. 1-14.
[26] S. Gao, I. W.-H. Tsang, and L.-T. Chia, "Sparse representation with kernels," IEEE Trans. Image Process., vol. 22, no. 2, Feb. 2013.
[27] Q. Le, R. Monga, M. Devin, K. Chen, G. Corrado, J. Dean, and A. Ng, "Building high-level features using large-scale unsupervised learning," in Proc. Int. Conf. Mach. Learn., 2012.
[28] Y. LeCun, "Learning invariant feature hierarchies," in Proc. ECCV Workshops and Demonstrations, 2012.
[29] Y.-L. Boureau, J. Ponce, and Y. LeCun, "A theoretical analysis of feature pooling in visual recognition," in Proc. Int. Conf. Mach. Learn., 2010, pp. 111-118.
[30] M. Schmidt, minFunc, 2005.
[31] J. Yang, J. Wang, and T. Huang, "Learning the sparse representation for classification," in Proc. ICME, 2011, pp. 1-6.
[32] L. Fei-Fei, R. Fergus, and P. Perona, "Learning generative visual models from few training examples: An incremental Bayesian approach tested on 101 object categories," Comput. Vis. Image Und., vol. 106, no. 1, pp. 59-70, 2007.
[33] A. Krizhevsky, "Convolutional deep belief networks on CIFAR-10," Unpublished manuscript, 2010.
[34] S. Boyd and L. Vandenberghe, Convex Optimization, 2004.
[35] G. H. Golub and C. F. Van Loan, Matrix Computations, 1996.


More information

On the theoretical analysis of cross validation in compressive sensing

On the theoretical analysis of cross validation in compressive sensing MITSUBISHI ELECTRIC RESEARCH LABORATORIES http://www.erl.co On the theoretical analysis of cross validation in copressive sensing Zhang, J.; Chen, L.; Boufounos, P.T.; Gu, Y. TR2014-025 May 2014 Abstract

More information

Estimating Parameters for a Gaussian pdf

Estimating Parameters for a Gaussian pdf Pattern Recognition and achine Learning Jaes L. Crowley ENSIAG 3 IS First Seester 00/0 Lesson 5 7 Noveber 00 Contents Estiating Paraeters for a Gaussian pdf Notation... The Pattern Recognition Proble...3

More information

Compressive Distilled Sensing: Sparse Recovery Using Adaptivity in Compressive Measurements

Compressive Distilled Sensing: Sparse Recovery Using Adaptivity in Compressive Measurements 1 Copressive Distilled Sensing: Sparse Recovery Using Adaptivity in Copressive Measureents Jarvis D. Haupt 1 Richard G. Baraniuk 1 Rui M. Castro 2 and Robert D. Nowak 3 1 Dept. of Electrical and Coputer

More information

arxiv: v1 [cs.cv] 28 Aug 2015

arxiv: v1 [cs.cv] 28 Aug 2015 Discrete Hashing with Deep Neural Network Thanh-Toan Do Anh-Zung Doan Ngai-Man Cheung Singapore University of Technology and Design {thanhtoan do, dung doan, ngaian cheung}@sutd.edu.sg arxiv:58.748v [cs.cv]

More information

A Nonlinear Sparsity Promoting Formulation and Algorithm for Full Waveform Inversion

A Nonlinear Sparsity Promoting Formulation and Algorithm for Full Waveform Inversion A Nonlinear Sparsity Prooting Forulation and Algorith for Full Wavefor Inversion Aleksandr Aravkin, Tristan van Leeuwen, Jaes V. Burke 2 and Felix Herrann Dept. of Earth and Ocean sciences University of

More information

Lower Bounds for Quantized Matrix Completion

Lower Bounds for Quantized Matrix Completion Lower Bounds for Quantized Matrix Copletion Mary Wootters and Yaniv Plan Departent of Matheatics University of Michigan Ann Arbor, MI Eail: wootters, yplan}@uich.edu Mark A. Davenport School of Elec. &

More information

HIGH RESOLUTION NEAR-FIELD MULTIPLE TARGET DETECTION AND LOCALIZATION USING SUPPORT VECTOR MACHINES

HIGH RESOLUTION NEAR-FIELD MULTIPLE TARGET DETECTION AND LOCALIZATION USING SUPPORT VECTOR MACHINES ICONIC 2007 St. Louis, O, USA June 27-29, 2007 HIGH RESOLUTION NEAR-FIELD ULTIPLE TARGET DETECTION AND LOCALIZATION USING SUPPORT VECTOR ACHINES A. Randazzo,. A. Abou-Khousa 2,.Pastorino, and R. Zoughi

More information

Upper bound on false alarm rate for landmine detection and classification using syntactic pattern recognition

Upper bound on false alarm rate for landmine detection and classification using syntactic pattern recognition Upper bound on false alar rate for landine detection and classification using syntactic pattern recognition Ahed O. Nasif, Brian L. Mark, Kenneth J. Hintz, and Nathalia Peixoto Dept. of Electrical and

More information

Stochastic Subgradient Methods

Stochastic Subgradient Methods Stochastic Subgradient Methods Lingjie Weng Yutian Chen Bren School of Inforation and Coputer Science University of California, Irvine {wengl, yutianc}@ics.uci.edu Abstract Stochastic subgradient ethods

More information

Kernel based Object Tracking using Color Histogram Technique

Kernel based Object Tracking using Color Histogram Technique Kernel based Object Tracking using Color Histogra Technique 1 Prajna Pariita Dash, 2 Dipti patra, 3 Sudhansu Kuar Mishra, 4 Jagannath Sethi, 1,2 Dept of Electrical engineering, NIT, Rourkela, India 3 Departent

More information

Probability Distributions

Probability Distributions Probability Distributions In Chapter, we ephasized the central role played by probability theory in the solution of pattern recognition probles. We turn now to an exploration of soe particular exaples

More information

On Conditions for Linearity of Optimal Estimation

On Conditions for Linearity of Optimal Estimation On Conditions for Linearity of Optial Estiation Erah Akyol, Kuar Viswanatha and Kenneth Rose {eakyol, kuar, rose}@ece.ucsb.edu Departent of Electrical and Coputer Engineering University of California at

More information

arxiv: v2 [cs.lg] 30 Mar 2017

arxiv: v2 [cs.lg] 30 Mar 2017 Batch Renoralization: Towards Reducing Minibatch Dependence in Batch-Noralized Models Sergey Ioffe Google Inc., sioffe@google.co arxiv:1702.03275v2 [cs.lg] 30 Mar 2017 Abstract Batch Noralization is quite

More information

Nyström Method vs Random Fourier Features: A Theoretical and Empirical Comparison

Nyström Method vs Random Fourier Features: A Theoretical and Empirical Comparison yströ Method vs : A Theoretical and Epirical Coparison Tianbao Yang, Yu-Feng Li, Mehrdad Mahdavi, Rong Jin, Zhi-Hua Zhou Machine Learning Lab, GE Global Research, San Raon, CA 94583 Michigan State University,

More information

Multi-Scale/Multi-Resolution: Wavelet Transform

Multi-Scale/Multi-Resolution: Wavelet Transform Multi-Scale/Multi-Resolution: Wavelet Transfor Proble with Fourier Fourier analysis -- breaks down a signal into constituent sinusoids of different frequencies. A serious drawback in transforing to the

More information

arxiv: v1 [cs.ds] 29 Jan 2012

arxiv: v1 [cs.ds] 29 Jan 2012 A parallel approxiation algorith for ixed packing covering seidefinite progras arxiv:1201.6090v1 [cs.ds] 29 Jan 2012 Rahul Jain National U. Singapore January 28, 2012 Abstract Penghui Yao National U. Singapore

More information

Qualitative Modelling of Time Series Using Self-Organizing Maps: Application to Animal Science

Qualitative Modelling of Time Series Using Self-Organizing Maps: Application to Animal Science Proceedings of the 6th WSEAS International Conference on Applied Coputer Science, Tenerife, Canary Islands, Spain, Deceber 16-18, 2006 183 Qualitative Modelling of Tie Series Using Self-Organizing Maps:

More information

e-companion ONLY AVAILABLE IN ELECTRONIC FORM

e-companion ONLY AVAILABLE IN ELECTRONIC FORM OPERATIONS RESEARCH doi 10.1287/opre.1070.0427ec pp. ec1 ec5 e-copanion ONLY AVAILABLE IN ELECTRONIC FORM infors 07 INFORMS Electronic Copanion A Learning Approach for Interactive Marketing to a Custoer

More information

Extension of CSRSM for the Parametric Study of the Face Stability of Pressurized Tunnels

Extension of CSRSM for the Parametric Study of the Face Stability of Pressurized Tunnels Extension of CSRSM for the Paraetric Study of the Face Stability of Pressurized Tunnels Guilhe Mollon 1, Daniel Dias 2, and Abdul-Haid Soubra 3, M.ASCE 1 LGCIE, INSA Lyon, Université de Lyon, Doaine scientifique

More information

Proc. of the IEEE/OES Seventh Working Conference on Current Measurement Technology UNCERTAINTIES IN SEASONDE CURRENT VELOCITIES

Proc. of the IEEE/OES Seventh Working Conference on Current Measurement Technology UNCERTAINTIES IN SEASONDE CURRENT VELOCITIES Proc. of the IEEE/OES Seventh Working Conference on Current Measureent Technology UNCERTAINTIES IN SEASONDE CURRENT VELOCITIES Belinda Lipa Codar Ocean Sensors 15 La Sandra Way, Portola Valley, CA 98 blipa@pogo.co

More information

Fixed-to-Variable Length Distribution Matching

Fixed-to-Variable Length Distribution Matching Fixed-to-Variable Length Distribution Matching Rana Ali Ajad and Georg Böcherer Institute for Counications Engineering Technische Universität München, Gerany Eail: raa2463@gail.co,georg.boecherer@tu.de

More information

Ensemble Based on Data Envelopment Analysis

Ensemble Based on Data Envelopment Analysis Enseble Based on Data Envelopent Analysis So Young Sohn & Hong Choi Departent of Coputer Science & Industrial Systes Engineering, Yonsei University, Seoul, Korea Tel) 82-2-223-404, Fax) 82-2- 364-7807

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and This article appeared in a ournal published by Elsevier. The attached copy is furnished to the author for internal non-coercial research and education use, including for instruction at the authors institution

More information

Combining Classifiers

Combining Classifiers Cobining Classifiers Generic ethods of generating and cobining ultiple classifiers Bagging Boosting References: Duda, Hart & Stork, pg 475-480. Hastie, Tibsharini, Friedan, pg 246-256 and Chapter 10. http://www.boosting.org/

More information

Warning System of Dangerous Chemical Gas in Factory Based on Wireless Sensor Network

Warning System of Dangerous Chemical Gas in Factory Based on Wireless Sensor Network 565 A publication of CHEMICAL ENGINEERING TRANSACTIONS VOL. 59, 07 Guest Editors: Zhuo Yang, Junie Ba, Jing Pan Copyright 07, AIDIC Servizi S.r.l. ISBN 978-88-95608-49-5; ISSN 83-96 The Italian Association

More information

UNIVERSITY OF TRENTO ON THE USE OF SVM FOR ELECTROMAGNETIC SUBSURFACE SENSING. A. Boni, M. Conci, A. Massa, and S. Piffer.

UNIVERSITY OF TRENTO ON THE USE OF SVM FOR ELECTROMAGNETIC SUBSURFACE SENSING. A. Boni, M. Conci, A. Massa, and S. Piffer. UIVRSITY OF TRTO DIPARTITO DI IGGRIA SCIZA DLL IFORAZIO 3823 Povo Trento (Italy) Via Soarive 4 http://www.disi.unitn.it O TH US OF SV FOR LCTROAGTIC SUBSURFAC SSIG A. Boni. Conci A. assa and S. Piffer

More information

ON THE TWO-LEVEL PRECONDITIONING IN LEAST SQUARES METHOD

ON THE TWO-LEVEL PRECONDITIONING IN LEAST SQUARES METHOD PROCEEDINGS OF THE YEREVAN STATE UNIVERSITY Physical and Matheatical Sciences 04,, p. 7 5 ON THE TWO-LEVEL PRECONDITIONING IN LEAST SQUARES METHOD M a t h e a t i c s Yu. A. HAKOPIAN, R. Z. HOVHANNISYAN

More information

Using a De-Convolution Window for Operating Modal Analysis

Using a De-Convolution Window for Operating Modal Analysis Using a De-Convolution Window for Operating Modal Analysis Brian Schwarz Vibrant Technology, Inc. Scotts Valley, CA Mark Richardson Vibrant Technology, Inc. Scotts Valley, CA Abstract Operating Modal Analysis

More information

SPECTRUM sensing is a core concept of cognitive radio

SPECTRUM sensing is a core concept of cognitive radio World Acadey of Science, Engineering and Technology International Journal of Electronics and Counication Engineering Vol:6, o:2, 202 Efficient Detection Using Sequential Probability Ratio Test in Mobile

More information

ZISC Neural Network Base Indicator for Classification Complexity Estimation

ZISC Neural Network Base Indicator for Classification Complexity Estimation ZISC Neural Network Base Indicator for Classification Coplexity Estiation Ivan Budnyk, Abdennasser Сhebira and Kurosh Madani Iages, Signals and Intelligent Systes Laboratory (LISSI / EA 3956) PARIS XII

More information

Quantum algorithms (CO 781, Winter 2008) Prof. Andrew Childs, University of Waterloo LECTURE 15: Unstructured search and spatial search

Quantum algorithms (CO 781, Winter 2008) Prof. Andrew Childs, University of Waterloo LECTURE 15: Unstructured search and spatial search Quantu algoriths (CO 781, Winter 2008) Prof Andrew Childs, University of Waterloo LECTURE 15: Unstructured search and spatial search ow we begin to discuss applications of quantu walks to search algoriths

More information

Fast Structural Similarity Search of Noncoding RNAs Based on Matched Filtering of Stem Patterns

Fast Structural Similarity Search of Noncoding RNAs Based on Matched Filtering of Stem Patterns Fast Structural Siilarity Search of Noncoding RNs Based on Matched Filtering of Ste Patterns Byung-Jun Yoon Dept. of Electrical Engineering alifornia Institute of Technology Pasadena, 91125, S Eail: bjyoon@caltech.edu

More information

RANDOM GRADIENT EXTRAPOLATION FOR DISTRIBUTED AND STOCHASTIC OPTIMIZATION

RANDOM GRADIENT EXTRAPOLATION FOR DISTRIBUTED AND STOCHASTIC OPTIMIZATION RANDOM GRADIENT EXTRAPOLATION FOR DISTRIBUTED AND STOCHASTIC OPTIMIZATION GUANGHUI LAN AND YI ZHOU Abstract. In this paper, we consider a class of finite-su convex optiization probles defined over a distributed

More information

Exact tensor completion with sum-of-squares

Exact tensor completion with sum-of-squares Proceedings of Machine Learning Research vol 65:1 54, 2017 30th Annual Conference on Learning Theory Exact tensor copletion with su-of-squares Aaron Potechin Institute for Advanced Study, Princeton David

More information

Hybrid System Identification: An SDP Approach

Hybrid System Identification: An SDP Approach 49th IEEE Conference on Decision and Control Deceber 15-17, 2010 Hilton Atlanta Hotel, Atlanta, GA, USA Hybrid Syste Identification: An SDP Approach C Feng, C M Lagoa, N Ozay and M Sznaier Abstract The

More information

13.2 Fully Polynomial Randomized Approximation Scheme for Permanent of Random 0-1 Matrices

13.2 Fully Polynomial Randomized Approximation Scheme for Permanent of Random 0-1 Matrices CS71 Randoness & Coputation Spring 018 Instructor: Alistair Sinclair Lecture 13: February 7 Disclaier: These notes have not been subjected to the usual scrutiny accorded to foral publications. They ay

More information

Data-Driven Imaging in Anisotropic Media

Data-Driven Imaging in Anisotropic Media 18 th World Conference on Non destructive Testing, 16- April 1, Durban, South Africa Data-Driven Iaging in Anisotropic Media Arno VOLKER 1 and Alan HUNTER 1 TNO Stieltjesweg 1, 6 AD, Delft, The Netherlands

More information

A Low-Complexity Congestion Control and Scheduling Algorithm for Multihop Wireless Networks with Order-Optimal Per-Flow Delay

A Low-Complexity Congestion Control and Scheduling Algorithm for Multihop Wireless Networks with Order-Optimal Per-Flow Delay A Low-Coplexity Congestion Control and Scheduling Algorith for Multihop Wireless Networks with Order-Optial Per-Flow Delay Po-Kai Huang, Xiaojun Lin, and Chih-Chun Wang School of Electrical and Coputer

More information

OPTIMIZATION in multi-agent networks has attracted

OPTIMIZATION in multi-agent networks has attracted Distributed constrained optiization and consensus in uncertain networks via proxial iniization Kostas Margellos, Alessandro Falsone, Sione Garatti and Maria Prandini arxiv:603.039v3 [ath.oc] 3 May 07 Abstract

More information

Sparse beamforming in peer-to-peer relay networks Yunshan Hou a, Zhijuan Qi b, Jianhua Chenc

Sparse beamforming in peer-to-peer relay networks Yunshan Hou a, Zhijuan Qi b, Jianhua Chenc 3rd International Conference on Machinery, Materials and Inforation echnology Applications (ICMMIA 015) Sparse beaforing in peer-to-peer relay networs Yunshan ou a, Zhijuan Qi b, Jianhua Chenc College

More information

Topic 5a Introduction to Curve Fitting & Linear Regression

Topic 5a Introduction to Curve Fitting & Linear Regression /7/08 Course Instructor Dr. Rayond C. Rup Oice: A 337 Phone: (95) 747 6958 E ail: rcrup@utep.edu opic 5a Introduction to Curve Fitting & Linear Regression EE 4386/530 Coputational ethods in EE Outline

More information

On the Impact of Kernel Approximation on Learning Accuracy

On the Impact of Kernel Approximation on Learning Accuracy On the Ipact of Kernel Approxiation on Learning Accuracy Corinna Cortes Mehryar Mohri Aeet Talwalkar Google Research New York, NY corinna@google.co Courant Institute and Google Research New York, NY ohri@cs.nyu.edu

More information