Least-squares Independent Component Analysis


Neural Computation, vol.23, no.1, 2011.

Least-squares Independent Component Analysis

Taiji Suzuki
Department of Mathematical Informatics, The University of Tokyo
7-3-1 Hongo, Bunkyo-ku, Tokyo, Japan

Masashi Sugiyama
Department of Computer Science, Tokyo Institute of Technology
and PRESTO, Japan Science and Technology Agency
2-12-1 O-okayama, Meguro-ku, Tokyo, Japan

Abstract

Accurately evaluating statistical independence among random variables is a key element of Independent Component Analysis (ICA). In this paper, we employ a squared-loss variant of mutual information as an independence measure and give its estimation method. Our basic idea is to estimate the ratio of probability densities directly without going through density estimation, by which a hard task of density estimation can be avoided. In this density-ratio approach, a natural cross-validation procedure is available for hyper-parameter selection. Thus, all tuning parameters such as the kernel width or the regularization parameter can be objectively optimized. This is an advantage over recently developed kernel-based independence measures and is a highly useful property in unsupervised learning problems such as ICA. Based on this novel independence measure, we develop an ICA algorithm named Least-squares Independent Component Analysis (LICA).

1 Introduction

The purpose of Independent Component Analysis (ICA) (Hyvärinen et al., 2001) is to obtain a transformation matrix that separates mixed signals into statistically-independent source signals. A direct approach to ICA is to find a transformation matrix such that independence among separated signals is maximized under some independence measure such as mutual information (MI). A MATLAB(R) implementation of the proposed algorithm, LICA, is available online.

Various approaches to evaluating the independence among random variables from samples have been explored so far. A naive approach is to estimate probability densities based on parametric or non-parametric density estimation methods. However, finding an appropriate parametric model is not easy without strong prior knowledge, and non-parametric estimation is not accurate in high-dimensional problems. Thus this naive approach is not reliable in practice. Another approach is to approximate the negentropy (or negative entropy) based on the Gram-Charlier expansion (Cardoso & Souloumiac, 1993; Comon, 1994; Amari et al., 1996) or the Edgeworth expansion (Van Hulle, 2008). An advantage of this negentropy-based approach is that a hard task of density estimation is not directly involved. However, these expansion techniques are based on the assumption that the target density is close to normal, and violation of this assumption can cause large approximation error.

The above approaches are based on the probability densities of signals. Another line of research that does not explicitly involve probability densities employs non-linear correlation: signals are statistically independent if and only if all non-linear correlations among the signals vanish. Following this line, computationally efficient algorithms have been developed based on a contrast function (Jutten & Hérault, 1991; Hyvärinen, 1999), which is an approximation of negentropy or mutual information. However, these methods require pre-specifying non-linearities in the contrast function, and thus could be inaccurate if the predetermined non-linearities do not match the target distribution.
To cope with this problem, the kernel trick has been applied in ICA, which allows one to evaluate all non-linear correlations in a computationally efficient manner (Bach & Jordan, 2002). However, its practical performance depends on the choice of kernels (more specifically, the Gaussian kernel width), and there seems to be no theoretically justified method to determine the kernel width (see also Fukumizu et al., 2008). This is a critical problem in unsupervised learning problems such as ICA.

In this paper, we develop a new ICA algorithm that resolves the problems mentioned above. We adopt a squared-loss variant of MI (which we call squared-loss MI; SMI) as an independence measure and approximate it by estimating the ratio of probability densities contained in SMI directly, without going through density estimation. This approach, which follows the line of Sugiyama et al. (2008), Kanamori et al. (2009), and Nguyen et al. (2010), allows us to avoid a hard task of density estimation. Another practical advantage of this density-ratio approach is that a natural cross-validation (CV) procedure is available for hyper-parameter selection. Thus all tuning parameters such as the kernel width or the regularization parameter can be objectively and systematically optimized through CV. From an algorithmic point of view, our density-ratio approach analytically provides a non-parametric estimator of SMI; furthermore, its derivative can also be computed analytically, and these properties are utilized in deriving a new ICA algorithm. The proposed method is named Least-squares Independent Component Analysis (LICA). Characteristics of existing and proposed ICA methods are summarized in Table 1, highlighting the advantage of the proposed LICA approach.

The structure of this paper is as follows. In Section 2, we formulate our estimator of

Table 1: Summary of existing and proposed ICA methods.

  Method                                              | Hyper-parameter selection | Distribution
  Fast ICA (FICA) (Hyvärinen, 1999)                   | Not necessary             | Not free
  Natural-gradient ICA (NICA) (Amari et al., 1996)    | Not necessary             | Not free
  Kernel ICA (KICA) (Bach & Jordan, 2002)             | Not available             | Free
  Edgeworth-expansion ICA (EICA) (Van Hulle, 2008)    | Not necessary             | Nearly normal
  Least-squares ICA (LICA) (proposed)                 | Available                 | Free

SMI. In Section 3, we derive the LICA algorithm based on the SMI estimator. Section 4 is devoted to numerical experiments, where we show that our method properly estimates the true demixing matrix using toy datasets, and compare the performances of the proposed and existing methods on artificial and real datasets.

2 SMI Estimation for ICA

In this section, we formulate the ICA problem and introduce our independence measure, SMI. Then we give an estimation method of SMI and derive an ICA algorithm.

2.1 Problem Formulation

Suppose there is a d-dimensional random signal x = (x^(1), ..., x^(d))^T drawn from a distribution with density p(x), where {x^(m)}_{m=1}^d are statistically independent of each other, and ^T denotes the transpose of a matrix or a vector. Thus, p(x) can be factorized as

  p(x) = Π_{m=1}^d p_m(x^(m)).

We cannot directly observe the source signal x, but only a linearly mixed signal y:

  y = (y^(1), ..., y^(d))^T := A x,

where A is a d x d invertible matrix called the mixing matrix. The goal of ICA is, given samples {y_i}_{i=1}^n of the mixed signals, to obtain a demixing matrix W that recovers the original source signal x. We denote the demixed signal by z:

  z = W y.
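As a quick numerical illustration of the mixing model above (our own sketch, not the authors' code; all variable names are ours), mixing independent non-Gaussian sources with an invertible A and demixing with the ideal W = A^{-1} recovers the sources exactly:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Two statistically independent, non-Gaussian sources (uniform and Laplace).
x = np.vstack([rng.uniform(-0.5, 0.5, n),
               rng.laplace(0.0, 1.0, n)])          # shape (d, n)

theta = np.pi / 4
A = np.array([[np.cos(theta), np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])    # invertible mixing matrix

y = A @ x                  # observed mixed signals y = A x
W = np.linalg.inv(A)       # ideal demixing matrix W = A^{-1}
z = W @ y                  # demixed signals z = W y

print(np.allclose(z, x))   # -> True
```

In practice A is unknown, so W must be estimated from {y_i} alone, and only up to permutation and scaling of the components; that estimation is what LICA does.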

The ideal solution is W = A^{-1}, but we can only recover the source signals up to permutation and scaling of the components of x due to the non-identifiability of the ICA setup (Hyvärinen et al., 2001).

A direct approach to ICA is to determine W so that the components of z are as independent as possible. Here, we adopt SMI as the independence measure:

  I_s(Z^(1), ..., Z^(d)) := (1/2) ∫ ( q(z)/r(z) − 1 )^2 r(z) dz,   (1)

where q(z) denotes the joint density of z and r(z) denotes the product of the marginal densities {q_m(z^(m))}_{m=1}^d:

  r(z) = Π_{m=1}^d q_m(z^(m)).

Note that SMI is the Pearson divergence (Pearson, 1900; Paninski, 2003; Liese & Vajda, 2006; Cichocki et al., 2009) between q(z) and r(z), while ordinary MI is the Kullback-Leibler divergence (Kullback & Leibler, 1951). Since I_s is non-negative and vanishes if and only if q(z) = r(z), the degree of independence among {z^(m)}_{m=1}^d may be measured by SMI. Note that Eq.(1) corresponds to the f-divergence (Ali & Silvey, 1966; Csiszár, 1967) between q(z) and r(z) with the squared loss, while ordinary MI corresponds to the f-divergence with the log loss. Thus SMI could be regarded as a natural generalization of ordinary MI.

Based on the independence detection property of SMI, we try to find the demixing matrix W that minimizes SMI. Let us denote the demixed samples by {z_i | z_i = (z_i^(1), ..., z_i^(d))^T := W y_i}_{i=1}^n. Our key constraint when estimating SMI is that we want to avoid density estimation since it is a hard task (Vapnik, 1998). Below, we show how this could be accomplished.

2.2 SMI Approximation via Density Ratio Estimation

We approximate SMI via density ratio estimation. Let us denote the ratio of the densities q(z) and r(z) by

  g*(z) := q(z)/r(z).   (2)

Then SMI can be written as

  I_s(Z^(1), ..., Z^(d)) = (1/2) ∫ ( g*(z) − 1 )^2 r(z) dz
                         = (1/2) ∫ ( g*(z)^2 r(z) − 2 g*(z) r(z) + r(z) ) dz
                         = (1/2) ∫ ( g*(z) q(z) − 2 q(z) + r(z) ) dz
                         = (1/2) ∫ g*(z) q(z) dz − 1/2.   (3)

Therefore, SMI can be approximated through the estimation of ∫ g*(z) q(z) dz, the expectation of g*(z) over q(z). This can be achieved by taking the sample average of an estimator ĝ(z) of the density ratio g*(z):

  Î_s = (1/(2n)) Σ_{i=1}^n ĝ(z_i) − 1/2.   (4)

We take the least-squares approach to estimating the density ratio g*(z):

  inf_g (1/2) ∫ ( g(z) − g*(z) )^2 r(z) dz
    = inf_g [ (1/2) ∫ g(z)^2 r(z) dz − ∫ g(z) q(z) dz ] + constant,

where inf_g is taken over all measurable functions. Obviously, the optimal solution is the density ratio g*. Thus computing I_s is now reduced to solving the following optimization problem:

  inf_g [ (1/2) ∫ g(z)^2 r(z) dz − ∫ g(z) q(z) dz ].   (5)

However, directly solving problem (5) is not possible due to the following two reasons. The first reason is that finding the minimizer over all measurable functions is not tractable in practice since the search space is too vast. To overcome this problem, we restrict the search space to some linear subspace G:

  G = { α^T φ(z) | α = (α_1, ..., α_b)^T ∈ R^b },   (6)

where α is a parameter to be learned from samples, and φ(z) is a basis function vector such that φ(z) = (φ_1(z), ..., φ_b(z))^T ≥ 0_b for all z. Here 0_b denotes the b-dimensional vector with all zeros. Note that φ(z) could be dependent on the samples {z_i}_{i=1}^n, i.e., kernel models are also allowed. We explain how the basis functions φ(z) are chosen in Section 2.3.

The second reason why directly solving problem (5) is not possible is that the expectations over the true probability densities q(z) and r(z) cannot be computed since q(z) and r(z) are unknown. To cope with this problem, we approximate the expectations by their empirical averages; then the optimization problem is reduced to

  α̂ := argmin_{α ∈ R^b} [ (1/2) α^T Ĥ α − ĥ^T α + λ α^T R α ],   (7)

where we included the term λ α^T R α (λ > 0) for avoiding overfitting. λ is called the regularization

parameter, and R is some positive definite matrix. Ĥ and ĥ are defined as

  Ĥ := (1/n^d) Σ_{i_1,...,i_d=1}^n φ(z_{i_1}^(1), ..., z_{i_d}^(d)) φ(z_{i_1}^(1), ..., z_{i_d}^(d))^T,   (8)

  ĥ := (1/n) Σ_{i=1}^n φ(z_i^(1), ..., z_i^(d)).   (9)

Differentiating the objective function in Eq.(7) with respect to α and equating it to zero, we can obtain an analytic-form solution:

  α̂ = (Ĥ + λR)^{-1} ĥ.

Thus, the solution can be computed very efficiently just by solving a system of linear equations. Once the density ratio (2) has been estimated, SMI can be approximated by plugging the estimated density ratio ĝ(z) = α̂^T φ(z) into Eq.(4):

  Î_s = (1/2) ĥ^T α̂ − 1/2.   (10)

Note that we may obtain various expressions of SMI using the following identities:

  ∫ g*(z)^2 r(z) dz = ∫ g*(z) q(z) dz,   ∫ g*(z) r(z) dz = ∫ q(z) dz = 1.

Ordinary MI based on the Kullback-Leibler divergence can also be estimated similarly using the density ratio (Suzuki et al., 2008). However, the use of SMI is more advantageous due to the analytic-form solution, as described in Section 3.

2.3 Design of Basis Functions and Hyper-parameter Selection

As basis functions, we propose to use Gaussian kernels:

  φ_l(z) = exp( −||z − v_l||^2 / (2σ^2) ) = Π_{m=1}^d exp( −(z^(m) − v_l^(m))^2 / (2σ^2) ),   (11)

where {v_l | v_l = (v_l^(1), ..., v_l^(d))^T}_{l=1}^b are Gaussian centers randomly chosen from {z_i}_{i=1}^n; more precisely, we set v_l = z_{c_l}, where {c_l}_{l=1}^b are randomly chosen from {1, ..., n} without replacement. An advantage

of the Gaussian kernel lies in its factorizability in Eq.(11), which contributes to reducing the computational cost of the matrix Ĥ significantly:

  Ĥ_{l,l'} = Π_{m=1}^d [ (1/n) Σ_{i=1}^n exp( −( (z_i^(m) − v_l^(m))^2 + (z_i^(m) − v_{l'}^(m))^2 ) / (2σ^2) ) ].

We use the RKHS (Reproducing Kernel Hilbert Space) norm of α^T φ(z) induced by the Gaussian kernel as the regularization term α^T R α, which is a popular choice in the kernel method community (Schölkopf & Smola, 2002):

  R_{l,l'} = exp( −||v_l − v_{l'}||^2 / (2σ^2) ).   (12)

In the experiments, we fix the number of basis functions at b = min(300, n), and choose the Gaussian width σ and the regularization parameter λ by CV with grid search as follows. First, the samples {z_i}_{i=1}^n are divided into K disjoint subsets {Z_k}_{k=1}^K of approximately the same size (we use K = 5 in the experiments). Then an estimator α̂_{Z\Z_k} is obtained using Z\Z_k (i.e., Z without Z_k), and the approximation error for the hold-out samples Z_k is computed:

  J^(K-CV)(Z_k) = (1/2) α̂_{Z\Z_k}^T Ĥ_{Z_k} α̂_{Z\Z_k} − ĥ_{Z_k}^T α̂_{Z\Z_k},

where the matrix Ĥ_{Z_k} and the vector ĥ_{Z_k} are defined in the same way as Ĥ and ĥ, but computed only using Z_k. This procedure is repeated for k = 1, ..., K and the average is computed:

  J^(K-CV) = (1/K) Σ_{k=1}^K J^(K-CV)(Z_k).

For parameter selection, we compute J^(K-CV) for all hyper-parameter candidates (the Gaussian width σ and the regularization parameter λ in the current setting) and choose the parameter that minimizes J^(K-CV). We can show that J^(K-CV) is an almost unbiased estimator of the objective function in Eq.(5), where the "almost"-ness comes from the fact that the number of samples is reduced in the CV procedure due to data splitting (Geisser, 1975; Kohavi, 1995).

3 The LICA Algorithms

In this section, we show how the above SMI estimation idea could be employed in the context of ICA.
Here, we derive two algorithms, which we call Least-squares Independent Component Analysis (LICA), for obtaining a minimizer of Î_s with respect to the demixing matrix W: one is based on a plain gradient method (which we refer to as PG-LICA) and the other is based on a natural gradient method for whitened samples (which we refer to as NG-LICA).
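Before turning to the two algorithms, the SMI estimator of Section 2 (Eqs. (7)-(10) with the Gaussian-kernel design of Section 2.3) can be sketched compactly. This is our own illustrative implementation, not the authors' code; the function and variable names are hypothetical:

```python
import numpy as np

def smi_estimate(z, sigma=0.5, lam=1e-3, b=50, seed=0):
    """Least-squares SMI estimate from demixed samples z of shape (n, d)."""
    rng = np.random.default_rng(seed)
    n, d = z.shape
    b = min(b, n)
    centers = z[rng.choice(n, b, replace=False)]   # Gaussian centers v_l = z_{c_l}

    # Per-dimension kernel values K[m][i, l] = exp(-(z_i^(m) - v_l^(m))^2 / (2 sigma^2)).
    K = [np.exp(-(z[:, m:m + 1] - centers[:, m]) ** 2 / (2 * sigma ** 2))
         for m in range(d)]

    # h_l = (1/n) sum_i phi_l(z_i), where phi_l factorizes over dimensions (Eq. (9)).
    Phi = np.prod(K, axis=0)                       # (n, b)
    h = Phi.mean(axis=0)

    # H_{l,l'} also factorizes over dimensions for the Gaussian kernel (Eq. (8)).
    H = np.ones((b, b))
    for m in range(d):
        H *= (K[m].T @ K[m]) / n

    # RKHS regularizer R_{l,l'} = exp(-||v_l - v_{l'}||^2 / (2 sigma^2)) (Eq. (12)).
    diff = centers[:, None, :] - centers[None, :, :]
    R = np.exp(-(diff ** 2).sum(axis=-1) / (2 * sigma ** 2))

    alpha = np.linalg.solve(H + lam * R, h)        # analytic solution of Eq. (7)
    return 0.5 * h @ alpha - 0.5                   # Eq. (10)
```

The estimate is near zero for independent components and grows as the components become dependent; in the full method, sigma and lam would be chosen by the CV procedure of Section 2.3 rather than fixed.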

3.1 Plain Gradient Algorithm: PG-LICA

Based on the plain gradient technique, an update rule of W is given by

  W ← W − ε ∂Î_s/∂W,   (13)

where ε > 0 is the step size. As shown in the Appendix, the gradient is given by

  ∂Î_s/∂W_{k,k'} = (∂ĥ/∂W_{k,k'})^T α̂ − (1/2) α̂^T ( ∂Ĥ/∂W_{k,k'} + λ ∂R/∂W_{k,k'} ) α̂,   (14)

where, for u_l = y_{c_l} and y_i = (y_i^(1), ..., y_i^(d))^T,

  ∂ĥ_l/∂W_{k,k'} = (1/(nσ^2)) Σ_{i=1}^n (z_i^(k) − v_l^(k)) (u_l^(k') − y_i^(k')) exp( −||z_i − v_l||^2 / (2σ^2) ),   (15)

  ∂Ĥ_{l,l'}/∂W_{k,k'} = Π_{m≠k} [ (1/n) Σ_{i=1}^n exp( −( (z_i^(m) − v_l^(m))^2 + (z_i^(m) − v_{l'}^(m))^2 ) / (2σ^2) ) ]
      × (1/(nσ^2)) Σ_{i=1}^n [ (z_i^(k) − v_l^(k))(u_l^(k') − y_i^(k')) + (z_i^(k) − v_{l'}^(k))(u_{l'}^(k') − y_i^(k')) ]
        exp( −( (z_i^(k) − v_l^(k))^2 + (z_i^(k) − v_{l'}^(k))^2 ) / (2σ^2) ).   (16)

For the regularization matrix R defined by Eq.(12), the partial derivative is given by

  ∂R_{l,l'}/∂W_{k,k'} = −(1/σ^2) (v_l^(k) − v_{l'}^(k)) (u_l^(k') − u_{l'}^(k')) exp( −||v_l − v_{l'}||^2 / (2σ^2) ).

In ICA, the scaling of the components of z can be arbitrary. This implies that the above gradient updating rule can lead to a solution with poor scaling, which is not preferable from a numerical point of view. To avoid possible numerical instability, we normalize W at each gradient iteration as

  W_{k,k'} ← W_{k,k'} / sqrt( Σ_{m=1}^d W_{k,m}^2 ).   (17)

In practice, we may iteratively perform line search along the gradient and optimize the Gaussian width σ and the regularization parameter λ by CV. A pseudo code of the PG-LICA algorithm is summarized in Figure 1.
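A single PG-LICA iteration, i.e., the gradient update followed by row normalization, can be sketched as follows (our own sketch; `grad` stands for the gradient of Eq. (14), here left as a caller-supplied array):

```python
import numpy as np

def pg_step(W, grad, eps):
    """One plain-gradient update (Eq. (13)) followed by the row
    normalization of Eq. (17), which fixes the scaling ambiguity."""
    W = W - eps * grad
    return W / np.sqrt((W ** 2).sum(axis=1, keepdims=True))
```

Every row of the returned matrix has unit Euclidean norm, so the scale of each demixed component stays fixed across iterations.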

1. Initialize the demixing matrix W and normalize it by Eq.(17).
2. Optimize the Gaussian width σ and the regularization parameter λ by CV.
3. Compute the gradient ∂Î_s/∂W by Eq.(14).
4. Choose the step size ε such that Î_s (see Eq.(10)) is minimized (line search).
5. Update W by Eq.(13).
6. Normalize W by Eq.(17).
7. Repeat until W converges.

Figure 1: The LICA algorithm with plain gradient descent (PG-LICA).

3.2 Natural Gradient Algorithm for Whitened Data: NG-LICA

The second algorithm is based on a natural gradient technique (Amari, 1998). Suppose the data samples are whitened, i.e., the samples {y_i}_{i=1}^n are transformed as

  y_i ← Ĉ^{-1/2} y_i,   (18)

where Ĉ is the sample covariance matrix:

  Ĉ := (1/n) Σ_{i=1}^n ( y_i − (1/n) Σ_{j=1}^n y_j ) ( y_i − (1/n) Σ_{j=1}^n y_j )^T.

Then it can be shown that a demixing matrix which eliminates the second-order correlation is an orthogonal matrix (Hyvärinen et al., 2001). Thus, for whitened data, the search space of W can be restricted to the orthogonal group O(d) without loss of generality.

The tangent space of O(d) at W is equal to the space of all matrices U such that W^T U is skew-symmetric, i.e., U^T W = −W^T U. The steepest direction on this tangent space, which is called the natural gradient, is given as follows (Amari, 1998):

  ∇̃Î_s(W) := ∂Î_s/∂W − W (∂Î_s/∂W)^T W,   (19)

where the canonical metric ⟨G_1, G_2⟩ = (1/2) tr(G_1^T G_2) is adopted in the tangent space. Then the geodesic from W in the direction of the natural gradient over O(d) can be expressed as W exp(−t W^T ∇̃Î_s(W)), where t ∈ R and exp denotes the matrix exponential, i.e., for a square matrix D,

  exp(D) = Σ_{k=0}^∞ (1/k!) D^k.
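The whitening step and one geodesic update can be sketched as follows (our own sketch under the formulas above; function names are ours, and the gradient of Î_s is again left abstract as a caller-supplied array):

```python
import numpy as np
from scipy.linalg import expm

def whiten(Y):
    """Center and whiten the samples (rows of Y are observations), Eq. (18)."""
    Yc = Y - Y.mean(axis=0)
    C = Yc.T @ Yc / len(Yc)                       # sample covariance
    evals, evecs = np.linalg.eigh(C)
    C_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    return Yc @ C_inv_sqrt                        # y_i <- C^{-1/2} y_i

def ng_step(W, grad, t):
    """One geodesic step on the orthogonal group O(d), Eqs. (19)-(21).

    grad is the plain gradient dI_s/dW; the natural gradient is
    grad - W grad^T W, and W^T times it is skew-symmetric, so the
    matrix exponential below keeps W exactly orthogonal."""
    nat = grad - W @ grad.T @ W
    return W @ expm(-t * W.T @ nat)
```

Because the update multiplies W by the exponential of a skew-symmetric matrix, orthogonality is preserved to machine precision, with no re-orthogonalization step needed during the line search over t.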

1. Whiten the data samples by Eq.(18).
2. Initialize the demixing matrix W and normalize it by Eq.(17).
3. Optimize the Gaussian width σ and the regularization parameter λ by CV.
4. Compute the natural gradient ∇̃Î_s by Eq.(19).
5. Choose the step size t such that Î_s (see Eq.(10)) is minimized over the set (20).
6. Update W by Eq.(21).
7. Repeat until W converges.

Figure 2: The LICA algorithm with natural gradient descent (NG-LICA).

Thus, when we perform line search along the geodesic in the natural gradient direction, the minimizer may be searched for within the set

  { W exp( −t W^T ∇̃Î_s(W) ) | t ≥ 0 },   (20)

i.e., t is chosen such that Î_s (see Eq.(10)) is minimized, and W is updated as

  W ← W exp( −t W^T ∇̃Î_s(W) ).   (21)

Geometry and optimization algorithms on a more general structure, the Stiefel manifold, are discussed in more detail in Nishimori and Akaho (2005). A pseudo code of the NG-LICA algorithm is summarized in Figure 2.

3.3 Remarks

The proposed LICA algorithms can be regarded as an application of the general unconstrained least-squares density-ratio estimator proposed by Kanamori et al. (2009) to SMI in the context of ICA. The optimization problem (5) can also be obtained following the line of Nguyen et al. (2010), which addresses a divergence estimation problem utilizing the Legendre-Fenchel duality. SMI defined by Eq.(1) can be expressed as

  I_s(Z^(1), ..., Z^(d)) = (1/2) ∫ ( q(z)/r(z) )^2 r(z) dz − 1/2.   (22)

If the Legendre-Fenchel duality of the convex function (1/2)x^2,

  (1/2) x^2 = sup_y [ yx − (1/2) y^2 ],

is applied to (1/2)(q(z)/r(z))^2 in Eq.(22) in a pointwise manner, we have

  I_s(Z^(1), ..., Z^(d)) = sup_g [ ∫ (q(z)/r(z)) g(z) r(z) dz − (1/2) ∫ g(z)^2 r(z) dz ] − 1/2
                         = −inf_g [ (1/2) ∫ g(z)^2 r(z) dz − ∫ g(z) q(z) dz ] − 1/2,

where sup_g and inf_g are taken over all measurable functions.

SMI is closely related to the kernel independence measures developed recently (Gretton et al., 2005a; Gretton et al., 2005b; Fukumizu et al., 2008). In particular, it has been shown that the NOrmalized Cross-Covariance Operator (NOCCO) proposed in Fukumizu et al. (2008) is also an estimator of SMI for d = 2. However, there is no reasonable hyper-parameter selection method for this and all other kernel-based independence measures (see also Bach & Jordan, 2002 and Fukumizu et al., 2008). This is a crucial limitation in unsupervised learning scenarios such as ICA. On the other hand, cross-validation can be applied to our method for hyper-parameter selection, as shown in Section 2.3.

4 Experiments

In this section, we investigate the experimental performance of the proposed method.

4.1 Illustrative Examples

First, we illustrate how the proposed method behaves using the following three 2-dimensional datasets:

(a) (Sub-Sub-Gaussians): p(x) = U(x^(1); −0.5, 0.5) U(x^(2); −0.5, 0.5),
(b) (Super-Super-Gaussians): p(x) = L(x^(1); 0, 1) L(x^(2); 0, 1),
(c) (Sub-Super-Gaussians): p(x) = U(x^(1); −0.5, 0.5) L(x^(2); 0, 1),

where U(x; a, b) (a, b ∈ R, a < b) denotes the uniform density on [a, b] and L(x; µ, v) (µ ∈ R, v > 0) denotes the Laplace density with mean µ and variance v. Let the number of samples be n = 300, and suppose we observe mixed samples {y_i}_{i=1}^n through the following mixing matrix:

  A = [ cos(π/4)  sin(π/4) ; −sin(π/4)  cos(π/4) ] = (1/√2) [ 1  1 ; −1  1 ].

The observed samples are plotted in Figure 3. We employed the NG-LICA algorithm described in Figure 2. The hyper-parameters σ and λ in LICA were chosen by 5-fold CV from 10 values in [0.1, 1] at regular intervals and 10 values in [0.001, 1] at regular intervals in log scale, respectively. The regularization term was set to the squared RKHS norm induced by the Gaussian kernel, i.e., we employed R defined by Eq.(12).

Figure 3: Observed samples (asterisks), true independent directions (dotted lines), and estimated independent directions (solid lines) for the datasets (a), (b), and (c).

Figure 4: The value of Î_s over iterations for the datasets (a), (b), and (c).

Figure 5: The elements of the demixing matrix W over iterations for the datasets (a), (b), and (c). Solid lines correspond to W_{1,1}, W_{1,2}, W_{2,1}, and W_{2,2}, respectively. The dotted lines denote the true values.

The true independent directions as well as the estimated independent directions are plotted in Figure 3. Figure 4 depicts the value of the estimated SMI (10) over iterations, and Figure 5 depicts the elements of the demixing matrix W over iterations. The results show that the estimated SMI decreases rapidly and good solutions are obtained for all the datasets. The reason why the estimated SMI in Figure 4 does not decrease monotonically is that during the natural gradient optimization procedure, the hyper-parameters λ and σ are adjusted by CV (see Figure 2), which possibly causes an increase in the objective values.

4.2 Performance Comparison

Here we compare our method with some existing methods (KICA, FICA, and JADE (Cardoso & Souloumiac, 1993)) on artificial and real datasets. We used the datasets (a), (b), and (c) in Section 4.1, the demosig dataset available in the FastICA package for MATLAB(R), and the 10halo, Sergio7, Speech4, and c5signals datasets available in the ICALAB signal processing benchmark datasets (Cichocki & Amari, 2002). The datasets (a), (b), (c), demosig, Sergio7, and c5signals are artificial datasets. The datasets 10halo and Speech4 are real datasets.

We employed the Amari index (Amari et al., 1996) as the performance measure (smaller is better):

  Amari index := (1/(2d(d−1))) Σ_{m,m'=1}^d ( |o_{m,m'}| / max_{m''} |o_{m,m''}| + |o_{m,m'}| / max_{m''} |o_{m'',m'}| ) − 1/(d−1),

where o_{m,m'} := [Ŵ A]_{m,m'} for an estimated demixing matrix Ŵ. We used the publicly available MATLAB(R) codes for KICA [3] and for FICA and JADE [4], where default parameter settings were used. The hyper-parameters σ and λ in LICA were chosen by 5-fold CV from 10 values in [0.1, 1] at regular intervals and 10 values in [0.001, 1] at regular intervals in log scale, respectively. R was set as in Eq.(12). We randomly generated the mixing matrix A and the source signals for the artificial datasets, and computed the Amari index between the true A and the Ŵ estimated by each method.
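The Amari index above can be sketched in a few lines (our own sketch, not the authors' MATLAB code); it is zero exactly when ŴA is a scaled permutation matrix, i.e., when separation is perfect up to the inherent ICA ambiguities:

```python
import numpy as np

def amari_index(W_hat, A):
    """Amari performance index; zero iff W_hat @ A is a scaled permutation."""
    O = np.abs(W_hat @ A)
    d = O.shape[0]
    row = (O / O.max(axis=1, keepdims=True)).sum()
    col = (O / O.max(axis=0, keepdims=True)).sum()
    return (row + col) / (2 * d * (d - 1)) - 1.0 / (d - 1)

# Perfect separation up to permutation and scaling gives index 0:
A = np.array([[1.0, 2.0], [3.0, 4.0]])
P = np.array([[0.0, 2.0], [-5.0, 0.0]])        # a scaled permutation matrix
print(amari_index(P @ np.linalg.inv(A), A))    # -> 0.0
```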
As training samples, we used the first n samples for Sergio7 and c5signals, and the n samples between the 1001st and the (1000+n)-th for 10halo and Speech4, where we tested n = 200 and 500. The performance of each method is summarized in Table 2, which depicts the mean and standard deviation of the Amari index over 50 trials. NG-LICA overall shows good performance. KICA tends to work reasonably well for the datasets (a), (b), (c), and demosig, but it performs poorly for the ICALAB datasets; this seems to be caused by an inappropriate choice of the Gaussian kernel width and local optima. On the other hand, FICA and JADE tend to work reasonably well for the ICALAB datasets, but perform poorly

[3] fbach/kernel-ica/index.htm
[4] cardoso/guidesepsou.htm

Table 2: Mean and standard deviation of the Amari index (smaller is better) for the benchmark datasets. The datasets (a), (b), and (c) are taken from Section 4.1. The demosig dataset is taken from the FastICA package. The 10halo, Sergio7, Speech4, and c5signals datasets are taken from the ICALAB benchmark datasets. The best method in terms of the mean Amari index and comparable ones based on the one-sided t-test at the significance level 1% are indicated by boldface. (Columns: dataset, n, NG-LICA, KICA, FICA, JADE; rows: each dataset with n = 200 and 500.)

for the datasets (a), (b), (c), and demosig; we conjecture that the contrast functions in FICA and the fourth-order statistics in JADE did not appropriately capture the non-Gaussianity of the datasets (a), (b), (c), and demosig. Overall, the proposed LICA algorithm is shown to be a promising ICA method.

5 Conclusions

In this paper, we proposed a new ICA method based on a squared-loss variant of mutual information. The proposed method, named least-squares ICA (LICA), has several preferable properties, e.g., it is distribution-free, and hyper-parameter selection by cross-validation is available.

Similarly to other ICA algorithms, the optimization problem involved in LICA is non-convex. Thus it is practically very important to develop good heuristics for initialization and for avoiding local optima in the gradient procedures, which is an open research topic to be investigated. Moreover, although our SMI estimator is analytic, the LICA algorithm is still computationally rather expensive due to the linear equations and cross-validation. Our

future work will address the computational issue, e.g., by vectorization and parallelization.

Appendix: Derivation of the Gradient of the SMI Estimator

Here we show the derivation of the gradient (14) of the SMI estimator (10). Since Î_s = (1/2) ĥ^T α̂ − 1/2 (see Eq.(10)), the derivative of Î_s with respect to W_{k,k'} is given as follows:

  ∂Î_s/∂W_{k,k'} = (1/2) (∂ĥ/∂W_{k,k'})^T α̂ + (1/2) ĥ^T (∂α̂/∂W_{k,k'}).   (23)

Recall that

  dB(x)^{-1}/dx = −B(x)^{-1} (dB(x)/dx) B(x)^{-1}

for an arbitrary invertible matrix function B(x). Then the partial derivative of α̂ = (Ĥ + λR)^{-1} ĥ with respect to W_{k,k'} is given by

  ∂α̂/∂W_{k,k'} = −(Ĥ + λR)^{-1} ( ∂Ĥ/∂W_{k,k'} + λ ∂R/∂W_{k,k'} ) (Ĥ + λR)^{-1} ĥ + (Ĥ + λR)^{-1} ∂ĥ/∂W_{k,k'}
               = −(Ĥ + λR)^{-1} ( ∂Ĥ/∂W_{k,k'} + λ ∂R/∂W_{k,k'} ) α̂ + (Ĥ + λR)^{-1} ∂ĥ/∂W_{k,k'}.

Substituting this in Eq.(23), we have

  ∂Î_s/∂W_{k,k'} = (1/2) (∂ĥ/∂W_{k,k'})^T α̂ − (1/2) α̂^T ( ∂Ĥ/∂W_{k,k'} + λ ∂R/∂W_{k,k'} ) α̂ + (1/2) α̂^T (∂ĥ/∂W_{k,k'})
                 = (∂ĥ/∂W_{k,k'})^T α̂ − (1/2) α̂^T (∂Ĥ/∂W_{k,k'}) α̂ − (λ/2) α̂^T (∂R/∂W_{k,k'}) α̂,

which gives Eq.(14).

Acknowledgments

The authors would like to thank Dr. Takafumi Kanamori for his valuable comments. T.S. was supported in part by the JSPS Research Fellowships for Young Scientists and the Global COE Program "The research and training center for new development in mathematics", MEXT, Japan. M.S. acknowledges support from SCAT, AOARD, and the JST PRESTO program.

References

Ali, S. M., & Silvey, S. D. (1966). A general class of coefficients of divergence of one distribution from another. Journal of the Royal Statistical Society, Series B, 28, 131–142.

Amari, S. (1998). Natural gradient works efficiently in learning. Neural Computation, 10.

Amari, S., Cichocki, A., & Yang, H. H. (1996). A new learning algorithm for blind signal separation. In Advances in Neural Information Processing Systems. MIT Press.

Bach, F. R., & Jordan, M. I. (2002). Kernel independent component analysis. Journal of Machine Learning Research, 3, 1–48.

Cardoso, J.-F., & Souloumiac, A. (1993). Blind beamforming for non-Gaussian signals. Radar and Signal Processing, IEE Proceedings-F, 140.

Cichocki, A., & Amari, S. (2002). Adaptive blind signal and image processing: Learning algorithms and applications. Wiley.

Cichocki, A., Zdunek, R., Phan, A. H., & Amari, S. (2009). Non-negative matrix and tensor factorizations: Applications to exploratory multi-way data analysis and blind source separation. New York: Wiley.

Comon, P. (1994). Independent component analysis, a new concept? Signal Processing, 36.

Csiszár, I. (1967). Information-type measures of difference of probability distributions and indirect observation. Studia Scientiarum Mathematicarum Hungarica, 2.

Fukumizu, K., Bach, F. R., & Jordan, M. I. (2009). Kernel dimension reduction in regression. The Annals of Statistics, 37.

Fukumizu, K., Gretton, A., Sun, X., & Schölkopf, B. (2008). Kernel measures of conditional dependence. In Advances in Neural Information Processing Systems 20. Cambridge, MA: MIT Press.

Geisser, S. (1975). The predictive sample reuse method with applications. Journal of the American Statistical Association, 70.

Gretton, A., Bousquet, O., Smola, A., & Schölkopf, B. (2005a). Measuring statistical dependence with Hilbert-Schmidt norms. In Algorithmic Learning Theory. Berlin: Springer-Verlag.

Gretton, A., Herbrich, R., Smola, A., Bousquet, O., & Schölkopf, B. (2005b). Kernel methods for measuring independence. Journal of Machine Learning Research, 6.

Hulle, M. M. Van (2008). Sequential fixed-point ICA based on mutual information minimization. Neural Computation, 20.

Hyvärinen, A. (1999). Fast and robust fixed-point algorithms for independent component analysis. IEEE Transactions on Neural Networks, 10.

Hyvärinen, A., Karhunen, J., & Oja, E. (2001). Independent component analysis. New York: Wiley.

Jutten, C., & Hérault, J. (1991). Blind separation of sources, part I: An adaptive algorithm based on neuromimetic architecture. Signal Processing, 24, 1–10.

Kanamori, T., Hido, S., & Sugiyama, M. (2009). A least-squares approach to direct importance estimation. Journal of Machine Learning Research, 10.

Kohavi, R. (1995). A study of cross-validation and bootstrap for accuracy estimation and model selection. In Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence. Morgan Kaufmann.

Kullback, S., & Leibler, R. A. (1951). On information and sufficiency. Annals of Mathematical Statistics, 22.

Liese, F., & Vajda, I. (2006). On divergences and informations in statistics and information theory. IEEE Transactions on Information Theory, 52.

Nguyen, X., Wainwright, M. J., & Jordan, M. I. (2010). Estimating divergence functionals and the likelihood ratio by convex risk minimization. IEEE Transactions on Information Theory, to appear.

Nishimori, Y., & Akaho, S. (2005). Learning algorithms utilizing quasi-geodesic flows on the Stiefel manifold. Neurocomputing, 67.

Paninski, L. (2003). Estimation of entropy and mutual information. Neural Computation, 15.

Pearson, K. (1900). On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling. Philosophical Magazine, 50.

Schölkopf, B., & Smola, A. J. (2002). Learning with kernels. Cambridge, MA: MIT Press.

Sugiyama, M., Suzuki, T., Nakajima, S., Kashima, H., von Bünau, P., & Kawanabe, M. (2008). Direct importance estimation for covariate shift adaptation. Annals of the Institute of Statistical Mathematics, 60.

Suzuki, T., Sugiyama, M., Sese, J., & Kanamori, T. (2008). Approximating mutual information by maximum likelihood density ratio estimation.
In New Challenges for Feature Selection in Data Mining and Knowledge Discovery.

Vapnik, V. N. (1998). Statistical learning theory. New York: Wiley.


More information

ASummaryofGaussianProcesses Coryn A.L. Bailer-Jones

ASummaryofGaussianProcesses Coryn A.L. Bailer-Jones ASummaryofGaussianProcesses Coryn A.L. Baier-Jones Cavendish Laboratory University of Cambridge caj@mrao.cam.ac.uk Introduction A genera prediction probem can be posed as foows. We consider that the variabe

More information

The EM Algorithm applied to determining new limit points of Mahler measures

The EM Algorithm applied to determining new limit points of Mahler measures Contro and Cybernetics vo. 39 (2010) No. 4 The EM Agorithm appied to determining new imit points of Maher measures by Souad E Otmani, Georges Rhin and Jean-Marc Sac-Épée Université Pau Veraine-Metz, LMAM,

More information

A proposed nonparametric mixture density estimation using B-spline functions

A proposed nonparametric mixture density estimation using B-spline functions A proposed nonparametric mixture density estimation using B-spine functions Atizez Hadrich a,b, Mourad Zribi a, Afif Masmoudi b a Laboratoire d Informatique Signa et Image de a Côte d Opae (LISIC-EA 4491),

More information

Appendix for Stochastic Gradient Monomial Gamma Sampler

Appendix for Stochastic Gradient Monomial Gamma Sampler 3 4 5 6 7 8 9 3 4 5 6 7 8 9 3 4 5 6 7 8 9 3 3 3 33 34 35 36 37 38 39 4 4 4 43 44 45 46 47 48 49 5 5 5 53 54 Appendix for Stochastic Gradient Monomia Gamma Samper A The Main Theorem We provide the foowing

More information

SVM-based Supervised and Unsupervised Classification Schemes

SVM-based Supervised and Unsupervised Classification Schemes SVM-based Supervised and Unsupervised Cassification Schemes LUMINITA STATE University of Pitesti Facuty of Mathematics and Computer Science 1 Targu din Vae St., Pitesti 110040 ROMANIA state@cicknet.ro

More information

A Brief Introduction to Markov Chains and Hidden Markov Models

A Brief Introduction to Markov Chains and Hidden Markov Models A Brief Introduction to Markov Chains and Hidden Markov Modes Aen B MacKenzie Notes for December 1, 3, &8, 2015 Discrete-Time Markov Chains You may reca that when we first introduced random processes,

More information

Nonlinear Gaussian Filtering via Radial Basis Function Approximation

Nonlinear Gaussian Filtering via Radial Basis Function Approximation 51st IEEE Conference on Decision and Contro December 10-13 01 Maui Hawaii USA Noninear Gaussian Fitering via Radia Basis Function Approximation Huazhen Fang Jia Wang and Raymond A de Caafon Abstract This

More information

Appendix for Stochastic Gradient Monomial Gamma Sampler

Appendix for Stochastic Gradient Monomial Gamma Sampler Appendix for Stochastic Gradient Monomia Gamma Samper A The Main Theorem We provide the foowing theorem to characterize the stationary distribution of the stochastic process with SDEs in (3) Theorem 3

More information

Statistics for Applications. Chapter 7: Regression 1/43

Statistics for Applications. Chapter 7: Regression 1/43 Statistics for Appications Chapter 7: Regression 1/43 Heuristics of the inear regression (1) Consider a coud of i.i.d. random points (X i,y i ),i =1,...,n : 2/43 Heuristics of the inear regression (2)

More information

Do Schools Matter for High Math Achievement? Evidence from the American Mathematics Competitions Glenn Ellison and Ashley Swanson Online Appendix

Do Schools Matter for High Math Achievement? Evidence from the American Mathematics Competitions Glenn Ellison and Ashley Swanson Online Appendix VOL. NO. DO SCHOOLS MATTER FOR HIGH MATH ACHIEVEMENT? 43 Do Schoos Matter for High Math Achievement? Evidence from the American Mathematics Competitions Genn Eison and Ashey Swanson Onine Appendix Appendix

More information

An approximate method for solving the inverse scattering problem with fixed-energy data

An approximate method for solving the inverse scattering problem with fixed-energy data J. Inv. I-Posed Probems, Vo. 7, No. 6, pp. 561 571 (1999) c VSP 1999 An approximate method for soving the inverse scattering probem with fixed-energy data A. G. Ramm and W. Scheid Received May 12, 1999

More information

Dependence Minimizing Regression with Model Selection for Non-Linear Causal Inference under Non-Gaussian Noise

Dependence Minimizing Regression with Model Selection for Non-Linear Causal Inference under Non-Gaussian Noise Dependence Minimizing Regression with Model Selection for Non-Linear Causal Inference under Non-Gaussian Noise Makoto Yamada and Masashi Sugiyama Department of Computer Science, Tokyo Institute of Technology

More information

Determining The Degree of Generalization Using An Incremental Learning Algorithm

Determining The Degree of Generalization Using An Incremental Learning Algorithm Determining The Degree of Generaization Using An Incrementa Learning Agorithm Pabo Zegers Facutad de Ingeniería, Universidad de os Andes San Caros de Apoquindo 22, Las Condes, Santiago, Chie pzegers@uandes.c

More information

An Algorithm for Pruning Redundant Modules in Min-Max Modular Network

An Algorithm for Pruning Redundant Modules in Min-Max Modular Network An Agorithm for Pruning Redundant Modues in Min-Max Moduar Network Hui-Cheng Lian and Bao-Liang Lu Department of Computer Science and Engineering, Shanghai Jiao Tong University 1954 Hua Shan Rd., Shanghai

More information

STA 216 Project: Spline Approach to Discrete Survival Analysis

STA 216 Project: Spline Approach to Discrete Survival Analysis : Spine Approach to Discrete Surviva Anaysis November 4, 005 1 Introduction Athough continuous surviva anaysis differs much from the discrete surviva anaysis, there is certain ink between the two modeing

More information

Two view learning: SVM-2K, Theory and Practice

Two view learning: SVM-2K, Theory and Practice Two view earning: SVM-2K, Theory and Practice Jason D.R. Farquhar jdrf99r@ecs.soton.ac.uk Hongying Meng hongying@cs.york.ac.uk David R. Hardoon drh@ecs.soton.ac.uk John Shawe-Tayor jst@ecs.soton.ac.uk

More information

Optimality of Inference in Hierarchical Coding for Distributed Object-Based Representations

Optimality of Inference in Hierarchical Coding for Distributed Object-Based Representations Optimaity of Inference in Hierarchica Coding for Distributed Object-Based Representations Simon Brodeur, Jean Rouat NECOTIS, Département génie éectrique et génie informatique, Université de Sherbrooke,

More information

Deep Gaussian Processes for Multi-fidelity Modeling

Deep Gaussian Processes for Multi-fidelity Modeling Deep Gaussian Processes for Muti-fideity Modeing Kurt Cutajar EURECOM Sophia Antipois, France Mark Puin Andreas Damianou Nei Lawrence Javier Gonzáez Abstract Muti-fideity modes are prominenty used in various

More information

Cryptanalysis of PKP: A New Approach

Cryptanalysis of PKP: A New Approach Cryptanaysis of PKP: A New Approach Éiane Jaumes and Antoine Joux DCSSI 18, rue du Dr. Zamenhoff F-92131 Issy-es-Mx Cedex France eiane.jaumes@wanadoo.fr Antoine.Joux@ens.fr Abstract. Quite recenty, in

More information

II. PROBLEM. A. Description. For the space of audio signals

II. PROBLEM. A. Description. For the space of audio signals CS229 - Fina Report Speech Recording based Language Recognition (Natura Language) Leopod Cambier - cambier; Matan Leibovich - matane; Cindy Orozco Bohorquez - orozcocc ABSTRACT We construct a rea time

More information

Some Measures for Asymmetry of Distributions

Some Measures for Asymmetry of Distributions Some Measures for Asymmetry of Distributions Georgi N. Boshnakov First version: 31 January 2006 Research Report No. 5, 2006, Probabiity and Statistics Group Schoo of Mathematics, The University of Manchester

More information

A Comparison Study of the Test for Right Censored and Grouped Data

A Comparison Study of the Test for Right Censored and Grouped Data Communications for Statistica Appications and Methods 2015, Vo. 22, No. 4, 313 320 DOI: http://dx.doi.org/10.5351/csam.2015.22.4.313 Print ISSN 2287-7843 / Onine ISSN 2383-4757 A Comparison Study of the

More information

Appendix A: MATLAB commands for neural networks

Appendix A: MATLAB commands for neural networks Appendix A: MATLAB commands for neura networks 132 Appendix A: MATLAB commands for neura networks p=importdata('pn.xs'); t=importdata('tn.xs'); [pn,meanp,stdp,tn,meant,stdt]=prestd(p,t); for m=1:10 net=newff(minmax(pn),[m,1],{'tansig','purein'},'trainm');

More information

Appendix of the Paper The Role of No-Arbitrage on Forecasting: Lessons from a Parametric Term Structure Model

Appendix of the Paper The Role of No-Arbitrage on Forecasting: Lessons from a Parametric Term Structure Model Appendix of the Paper The Roe of No-Arbitrage on Forecasting: Lessons from a Parametric Term Structure Mode Caio Ameida cameida@fgv.br José Vicente jose.vaentim@bcb.gov.br June 008 1 Introduction In this

More information

C. Fourier Sine Series Overview

C. Fourier Sine Series Overview 12 PHILIP D. LOEWEN C. Fourier Sine Series Overview Let some constant > be given. The symboic form of the FSS Eigenvaue probem combines an ordinary differentia equation (ODE) on the interva (, ) with a

More information

Approximation and Fast Calculation of Non-local Boundary Conditions for the Time-dependent Schrödinger Equation

Approximation and Fast Calculation of Non-local Boundary Conditions for the Time-dependent Schrödinger Equation Approximation and Fast Cacuation of Non-oca Boundary Conditions for the Time-dependent Schrödinger Equation Anton Arnod, Matthias Ehrhardt 2, and Ivan Sofronov 3 Universität Münster, Institut für Numerische

More information

A Sparse Covariance Function for Exact Gaussian Process Inference in Large Datasets

A Sparse Covariance Function for Exact Gaussian Process Inference in Large Datasets A Covariance Function for Exact Gaussian Process Inference in Large Datasets Arman ekumyan Austraian Centre for Fied Robotics The University of Sydney NSW 26, Austraia a.mekumyan@acfr.usyd.edu.au Fabio

More information

Proceedings of the 2012 Winter Simulation Conference C. Laroque, J. Himmelspach, R. Pasupathy, O. Rose, and A. M. Uhrmacher, eds.

Proceedings of the 2012 Winter Simulation Conference C. Laroque, J. Himmelspach, R. Pasupathy, O. Rose, and A. M. Uhrmacher, eds. Proceedings of the 2012 Winter Simuation Conference C. aroque, J. Himmespach, R. Pasupathy, O. Rose, and A. M. Uhrmacher, eds. TIGHT BOUNDS FOR AMERICAN OPTIONS VIA MUTIEVE MONTE CARO Denis Beomestny Duisburg-Essen

More information

Research of Data Fusion Method of Multi-Sensor Based on Correlation Coefficient of Confidence Distance

Research of Data Fusion Method of Multi-Sensor Based on Correlation Coefficient of Confidence Distance Send Orders for Reprints to reprints@benthamscience.ae 340 The Open Cybernetics & Systemics Journa, 015, 9, 340-344 Open Access Research of Data Fusion Method of Muti-Sensor Based on Correation Coefficient

More information

arxiv: v1 [cs.lg] 31 Oct 2017

arxiv: v1 [cs.lg] 31 Oct 2017 ACCELERATED SPARSE SUBSPACE CLUSTERING Abofaz Hashemi and Haris Vikao Department of Eectrica and Computer Engineering, University of Texas at Austin, Austin, TX, USA arxiv:7.26v [cs.lg] 3 Oct 27 ABSTRACT

More information

Learning Fully Observed Undirected Graphical Models

Learning Fully Observed Undirected Graphical Models Learning Fuy Observed Undirected Graphica Modes Sides Credit: Matt Gormey (2016) Kayhan Batmangheich 1 Machine Learning The data inspires the structures we want to predict Inference finds {best structure,

More information

Separation of Variables and a Spherical Shell with Surface Charge

Separation of Variables and a Spherical Shell with Surface Charge Separation of Variabes and a Spherica She with Surface Charge In cass we worked out the eectrostatic potentia due to a spherica she of radius R with a surface charge density σθ = σ cos θ. This cacuation

More information

Data Mining Technology for Failure Prognostic of Avionics

Data Mining Technology for Failure Prognostic of Avionics IEEE Transactions on Aerospace and Eectronic Systems. Voume 38, #, pp.388-403, 00. Data Mining Technoogy for Faiure Prognostic of Avionics V.A. Skormin, Binghamton University, Binghamton, NY, 1390, USA

More information

Uniformly Reweighted Belief Propagation: A Factor Graph Approach

Uniformly Reweighted Belief Propagation: A Factor Graph Approach Uniformy Reweighted Beief Propagation: A Factor Graph Approach Henk Wymeersch Chamers University of Technoogy Gothenburg, Sweden henkw@chamers.se Federico Penna Poitecnico di Torino Torino, Itay federico.penna@poito.it

More information

Bayesian Unscented Kalman Filter for State Estimation of Nonlinear and Non-Gaussian Systems

Bayesian Unscented Kalman Filter for State Estimation of Nonlinear and Non-Gaussian Systems Bayesian Unscented Kaman Fiter for State Estimation of Noninear and Non-aussian Systems Zhong Liu, Shing-Chow Chan, Ho-Chun Wu and iafei Wu Department of Eectrica and Eectronic Engineering, he University

More information

Ant Colony Algorithms for Constructing Bayesian Multi-net Classifiers

Ant Colony Algorithms for Constructing Bayesian Multi-net Classifiers Ant Coony Agorithms for Constructing Bayesian Muti-net Cassifiers Khaid M. Saama and Aex A. Freitas Schoo of Computing, University of Kent, Canterbury, UK. {kms39,a.a.freitas}@kent.ac.uk December 5, 2013

More information

Scalable Spectrum Allocation for Large Networks Based on Sparse Optimization

Scalable Spectrum Allocation for Large Networks Based on Sparse Optimization Scaabe Spectrum ocation for Large Networks ased on Sparse Optimization innan Zhuang Modem R&D Lab Samsung Semiconductor, Inc. San Diego, C Dongning Guo, Ermin Wei, and Michae L. Honig Department of Eectrica

More information

Stochastic Variational Inference with Gradient Linearization

Stochastic Variational Inference with Gradient Linearization Stochastic Variationa Inference with Gradient Linearization Suppementa Materia Tobias Pötz * Anne S Wannenwetsch Stefan Roth Department of Computer Science, TU Darmstadt Preface In this suppementa materia,

More information

Expectation-Maximization for Estimating Parameters for a Mixture of Poissons

Expectation-Maximization for Estimating Parameters for a Mixture of Poissons Expectation-Maximization for Estimating Parameters for a Mixture of Poissons Brandon Maone Department of Computer Science University of Hesini February 18, 2014 Abstract This document derives, in excrutiating

More information

Converting Z-number to Fuzzy Number using. Fuzzy Expected Value

Converting Z-number to Fuzzy Number using. Fuzzy Expected Value ISSN 1746-7659, Engand, UK Journa of Information and Computing Science Vo. 1, No. 4, 017, pp.91-303 Converting Z-number to Fuzzy Number using Fuzzy Expected Vaue Mahdieh Akhbari * Department of Industria

More information

Sequential Decoding of Polar Codes with Arbitrary Binary Kernel

Sequential Decoding of Polar Codes with Arbitrary Binary Kernel Sequentia Decoding of Poar Codes with Arbitrary Binary Kerne Vera Miosavskaya, Peter Trifonov Saint-Petersburg State Poytechnic University Emai: veram,petert}@dcn.icc.spbstu.ru Abstract The probem of efficient

More information

Kernel pea and De-Noising in Feature Spaces

Kernel pea and De-Noising in Feature Spaces Kerne pea and De-Noising in Feature Spaces Sebastian Mika, Bernhard Schokopf, Aex Smoa Kaus-Robert Muer, Matthias Schoz, Gunnar Riitsch GMD FIRST, Rudower Chaussee 5, 12489 Berin, Germany {mika, bs, smoa,

More information

A MODEL FOR ESTIMATING THE LATERAL OVERLAP PROBABILITY OF AIRCRAFT WITH RNP ALERTING CAPABILITY IN PARALLEL RNAV ROUTES

A MODEL FOR ESTIMATING THE LATERAL OVERLAP PROBABILITY OF AIRCRAFT WITH RNP ALERTING CAPABILITY IN PARALLEL RNAV ROUTES 6 TH INTERNATIONAL CONGRESS OF THE AERONAUTICAL SCIENCES A MODEL FOR ESTIMATING THE LATERAL OVERLAP PROBABILITY OF AIRCRAFT WITH RNP ALERTING CAPABILITY IN PARALLEL RNAV ROUTES Sakae NAGAOKA* *Eectronic

More information

arxiv: v1 [math.ca] 6 Mar 2017

arxiv: v1 [math.ca] 6 Mar 2017 Indefinite Integras of Spherica Besse Functions MIT-CTP/487 arxiv:703.0648v [math.ca] 6 Mar 07 Joyon K. Boomfied,, Stephen H. P. Face,, and Zander Moss, Center for Theoretica Physics, Laboratory for Nucear

More information

Discrete Techniques. Chapter Introduction

Discrete Techniques. Chapter Introduction Chapter 3 Discrete Techniques 3. Introduction In the previous two chapters we introduced Fourier transforms of continuous functions of the periodic and non-periodic (finite energy) type, we as various

More information

Melodic contour estimation with B-spline models using a MDL criterion

Melodic contour estimation with B-spline models using a MDL criterion Meodic contour estimation with B-spine modes using a MDL criterion Damien Loive, Ney Barbot, Oivier Boeffard IRISA / University of Rennes 1 - ENSSAT 6 rue de Kerampont, B.P. 80518, F-305 Lannion Cedex

More information

FORECASTING TELECOMMUNICATIONS DATA WITH AUTOREGRESSIVE INTEGRATED MOVING AVERAGE MODELS

FORECASTING TELECOMMUNICATIONS DATA WITH AUTOREGRESSIVE INTEGRATED MOVING AVERAGE MODELS FORECASTING TEECOMMUNICATIONS DATA WITH AUTOREGRESSIVE INTEGRATED MOVING AVERAGE MODES Niesh Subhash naawade a, Mrs. Meenakshi Pawar b a SVERI's Coege of Engineering, Pandharpur. nieshsubhash15@gmai.com

More information

Asynchronous Control for Coupled Markov Decision Systems

Asynchronous Control for Coupled Markov Decision Systems INFORMATION THEORY WORKSHOP (ITW) 22 Asynchronous Contro for Couped Marov Decision Systems Michae J. Neey University of Southern Caifornia Abstract This paper considers optima contro for a coection of

More information

Fitting Algorithms for MMPP ATM Traffic Models

Fitting Algorithms for MMPP ATM Traffic Models Fitting Agorithms for PP AT Traffic odes A. Nogueira, P. Savador, R. Vaadas University of Aveiro / Institute of Teecommunications, 38-93 Aveiro, Portuga; e-mai: (nogueira, savador, rv)@av.it.pt ABSTRACT

More information

A Better Way to Pretrain Deep Boltzmann Machines

A Better Way to Pretrain Deep Boltzmann Machines A Better Way to Pretrain Deep Botzmann Machines Rusan Saakhutdino Department of Statistics and Computer Science Uniersity of Toronto rsaakhu@cs.toronto.edu Geoffrey Hinton Department of Computer Science

More information

Discrete Techniques. Chapter Introduction

Discrete Techniques. Chapter Introduction Chapter 3 Discrete Techniques 3. Introduction In the previous two chapters we introduced Fourier transforms of continuous functions of the periodic and non-periodic (finite energy) type, as we as various

More information

4 1-D Boundary Value Problems Heat Equation

4 1-D Boundary Value Problems Heat Equation 4 -D Boundary Vaue Probems Heat Equation The main purpose of this chapter is to study boundary vaue probems for the heat equation on a finite rod a x b. u t (x, t = ku xx (x, t, a < x < b, t > u(x, = ϕ(x

More information

NEW DEVELOPMENT OF OPTIMAL COMPUTING BUDGET ALLOCATION FOR DISCRETE EVENT SIMULATION

NEW DEVELOPMENT OF OPTIMAL COMPUTING BUDGET ALLOCATION FOR DISCRETE EVENT SIMULATION NEW DEVELOPMENT OF OPTIMAL COMPUTING BUDGET ALLOCATION FOR DISCRETE EVENT SIMULATION Hsiao-Chang Chen Dept. of Systems Engineering University of Pennsyvania Phiadephia, PA 904-635, U.S.A. Chun-Hung Chen

More information

Incremental Reformulated Automatic Relevance Determination

Incremental Reformulated Automatic Relevance Determination IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 60, NO. 9, SEPTEMBER 22 4977 Incrementa Reformuated Automatic Reevance Determination Dmitriy Shutin, Sanjeev R. Kukarni, and H. Vincent Poor Abstract In this

More information

Steepest Descent Adaptation of Min-Max Fuzzy If-Then Rules 1

Steepest Descent Adaptation of Min-Max Fuzzy If-Then Rules 1 Steepest Descent Adaptation of Min-Max Fuzzy If-Then Rues 1 R.J. Marks II, S. Oh, P. Arabshahi Λ, T.P. Caude, J.J. Choi, B.G. Song Λ Λ Dept. of Eectrica Engineering Boeing Computer Services University

More information

CONJUGATE GRADIENT WITH SUBSPACE OPTIMIZATION

CONJUGATE GRADIENT WITH SUBSPACE OPTIMIZATION CONJUGATE GRADIENT WITH SUBSPACE OPTIMIZATION SAHAR KARIMI AND STEPHEN VAVASIS Abstract. In this paper we present a variant of the conjugate gradient (CG) agorithm in which we invoke a subspace minimization

More information

Fast Blind Recognition of Channel Codes

Fast Blind Recognition of Channel Codes Fast Bind Recognition of Channe Codes Reza Moosavi and Erik G. Larsson Linköping University Post Print N.B.: When citing this work, cite the origina artice. 213 IEEE. Persona use of this materia is permitted.

More information

https://doi.org/ /epjconf/

https://doi.org/ /epjconf/ HOW TO APPLY THE OPTIMAL ESTIMATION METHOD TO YOUR LIDAR MEASUREMENTS FOR IMPROVED RETRIEVALS OF TEMPERATURE AND COMPOSITION R. J. Sica 1,2,*, A. Haefee 2,1, A. Jaai 1, S. Gamage 1 and G. Farhani 1 1 Department

More information

Integrating Factor Methods as Exponential Integrators

Integrating Factor Methods as Exponential Integrators Integrating Factor Methods as Exponentia Integrators Borisav V. Minchev Department of Mathematica Science, NTNU, 7491 Trondheim, Norway Borko.Minchev@ii.uib.no Abstract. Recenty a ot of effort has been

More information

(This is a sample cover image for this issue. The actual cover is not yet available at this time.)

(This is a sample cover image for this issue. The actual cover is not yet available at this time.) (This is a sampe cover image for this issue The actua cover is not yet avaiabe at this time) This artice appeared in a journa pubished by Esevier The attached copy is furnished to the author for interna

More information

BALANCING REGULAR MATRIX PENCILS

BALANCING REGULAR MATRIX PENCILS BALANCING REGULAR MATRIX PENCILS DAMIEN LEMONNIER AND PAUL VAN DOOREN Abstract. In this paper we present a new diagona baancing technique for reguar matrix pencis λb A, which aims at reducing the sensitivity

More information

Wavelet shrinkage estimators of Hilbert transform

Wavelet shrinkage estimators of Hilbert transform Journa of Approximation Theory 163 (2011) 652 662 www.esevier.com/ocate/jat Fu ength artice Waveet shrinkage estimators of Hibert transform Di-Rong Chen, Yao Zhao Department of Mathematics, LMIB, Beijing

More information

First-Order Corrections to Gutzwiller s Trace Formula for Systems with Discrete Symmetries

First-Order Corrections to Gutzwiller s Trace Formula for Systems with Discrete Symmetries c 26 Noninear Phenomena in Compex Systems First-Order Corrections to Gutzwier s Trace Formua for Systems with Discrete Symmetries Hoger Cartarius, Jörg Main, and Günter Wunner Institut für Theoretische

More information

Soft Clustering on Graphs

Soft Clustering on Graphs Soft Custering on Graphs Kai Yu 1, Shipeng Yu 2, Voker Tresp 1 1 Siemens AG, Corporate Technoogy 2 Institute for Computer Science, University of Munich kai.yu@siemens.com, voker.tresp@siemens.com spyu@dbs.informatik.uni-muenchen.de

More information

SVM: Terminology 1(6) SVM: Terminology 2(6)

SVM: Terminology 1(6) SVM: Terminology 2(6) Andrew Kusiak Inteigent Systems Laboratory 39 Seamans Center he University of Iowa Iowa City, IA 54-57 SVM he maxima margin cassifier is simiar to the perceptron: It aso assumes that the data points are

More information

DIGITAL FILTER DESIGN OF IIR FILTERS USING REAL VALUED GENETIC ALGORITHM

DIGITAL FILTER DESIGN OF IIR FILTERS USING REAL VALUED GENETIC ALGORITHM DIGITAL FILTER DESIGN OF IIR FILTERS USING REAL VALUED GENETIC ALGORITHM MIKAEL NILSSON, MATTIAS DAHL AND INGVAR CLAESSON Bekinge Institute of Technoogy Department of Teecommunications and Signa Processing

More information

4 Separation of Variables

4 Separation of Variables 4 Separation of Variabes In this chapter we describe a cassica technique for constructing forma soutions to inear boundary vaue probems. The soution of three cassica (paraboic, hyperboic and eiptic) PDE

More information

<C 2 2. λ 2 l. λ 1 l 1 < C 1

<C 2 2. λ 2 l. λ 1 l 1 < C 1 Teecommunication Network Contro and Management (EE E694) Prof. A. A. Lazar Notes for the ecture of 7/Feb/95 by Huayan Wang (this document was ast LaT E X-ed on May 9,995) Queueing Primer for Muticass Optima

More information

Lecture Note 3: Stationary Iterative Methods

Lecture Note 3: Stationary Iterative Methods MATH 5330: Computationa Methods of Linear Agebra Lecture Note 3: Stationary Iterative Methods Xianyi Zeng Department of Mathematica Sciences, UTEP Stationary Iterative Methods The Gaussian eimination (or

More information

Uniprocessor Feasibility of Sporadic Tasks with Constrained Deadlines is Strongly conp-complete

Uniprocessor Feasibility of Sporadic Tasks with Constrained Deadlines is Strongly conp-complete Uniprocessor Feasibiity of Sporadic Tasks with Constrained Deadines is Strongy conp-compete Pontus Ekberg and Wang Yi Uppsaa University, Sweden Emai: {pontus.ekberg yi}@it.uu.se Abstract Deciding the feasibiity

More information

Global sensitivity analysis using low-rank tensor approximations

Global sensitivity analysis using low-rank tensor approximations Goba sensitivity anaysis using ow-rank tensor approximations K. Konaki 1 and B. Sudret 1 1 Chair of Risk, Safety and Uncertainty Quantification, arxiv:1605.09009v1 [stat.co] 29 May 2016 ETH Zurich, Stefano-Franscini-Patz

More information

BP neural network-based sports performance prediction model applied research

BP neural network-based sports performance prediction model applied research Avaiabe onine www.jocpr.com Journa of Chemica and Pharmaceutica Research, 204, 6(7:93-936 Research Artice ISSN : 0975-7384 CODEN(USA : JCPRC5 BP neura networ-based sports performance prediction mode appied

More information

Efficient Approximate Leave-One-Out Cross-Validation for Kernel Logistic Regression

Efficient Approximate Leave-One-Out Cross-Validation for Kernel Logistic Regression Machine Learning manuscript No. (wi be inserted by the editor) Efficient Approximate Leave-One-Out Cross-Vaidation for Kerne Logistic Regression Gavin C. Cawey, Nicoa L. C. Tabot Schoo of Computing Sciences

More information

Learning Structural Changes of Gaussian Graphical Models in Controlled Experiments

Learning Structural Changes of Gaussian Graphical Models in Controlled Experiments Learning Structura Changes of Gaussian Graphica Modes in Controed Experiments Bai Zhang and Yue Wang Bradey Department of Eectrica and Computer Engineering Virginia Poytechnic Institute and State University

More information

A Ridgelet Kernel Regression Model using Genetic Algorithm

A Ridgelet Kernel Regression Model using Genetic Algorithm A Ridgeet Kerne Regression Mode using Genetic Agorithm Shuyuan Yang, Min Wang, Licheng Jiao * Institute of Inteigence Information Processing, Department of Eectrica Engineering Xidian University Xi an,

More information

Efficiently Generating Random Bits from Finite State Markov Chains

Efficiently Generating Random Bits from Finite State Markov Chains 1 Efficienty Generating Random Bits from Finite State Markov Chains Hongchao Zhou and Jehoshua Bruck, Feow, IEEE Abstract The probem of random number generation from an uncorreated random source (of unknown

More information

A Statistical Framework for Real-time Event Detection in Power Systems

A Statistical Framework for Real-time Event Detection in Power Systems 1 A Statistica Framework for Rea-time Event Detection in Power Systems Noan Uhrich, Tim Christman, Phiip Swisher, and Xichen Jiang Abstract A quickest change detection (QCD) agorithm is appied to the probem

More information

MARKOV CHAINS AND MARKOV DECISION THEORY. Contents

MARKOV CHAINS AND MARKOV DECISION THEORY. Contents MARKOV CHAINS AND MARKOV DECISION THEORY ARINDRIMA DATTA Abstract. In this paper, we begin with a forma introduction to probabiity and expain the concept of random variabes and stochastic processes. After

More information

Inductive Bias: How to generalize on novel data. CS Inductive Bias 1

Inductive Bias: How to generalize on novel data. CS Inductive Bias 1 Inductive Bias: How to generaize on nove data CS 478 - Inductive Bias 1 Overfitting Noise vs. Exceptions CS 478 - Inductive Bias 2 Non-Linear Tasks Linear Regression wi not generaize we to the task beow

More information

On the evaluation of saving-consumption plans

On the evaluation of saving-consumption plans On the evauation of saving-consumption pans Steven Vanduffe Jan Dhaene Marc Goovaerts Juy 13, 2004 Abstract Knowedge of the distribution function of the stochasticay compounded vaue of a series of future

More information

Power Control and Transmission Scheduling for Network Utility Maximization in Wireless Networks

Power Control and Transmission Scheduling for Network Utility Maximization in Wireless Networks ower Contro and Transmission Scheduing for Network Utiity Maximization in Wireess Networks Min Cao, Vivek Raghunathan, Stephen Hany, Vinod Sharma and. R. Kumar Abstract We consider a joint power contro

More information

Kernel Trick Embedded Gaussian Mixture Model

Kernel Trick Embedded Gaussian Mixture Model Kerne Trick Embedded Gaussian Mixture Mode Jingdong Wang, Jianguo Lee, and Changshui Zhang State Key Laboratory of Inteigent Technoogy and Systems Department of Automation, Tsinghua University Beijing,

More information

Discriminant Analysis: A Unified Approach

Discriminant Analysis: A Unified Approach Discriminant Anaysis: A Unified Approach Peng Zhang & Jing Peng Tuane University Eectrica Engineering & Computer Science Department New Oreans, LA 708 {zhangp,jp}@eecs.tuane.edu Norbert Riede Tuane University

More information

Related Topics Maxwell s equations, electrical eddy field, magnetic field of coils, coil, magnetic flux, induced voltage

Related Topics Maxwell s equations, electrical eddy field, magnetic field of coils, coil, magnetic flux, induced voltage Magnetic induction TEP Reated Topics Maxwe s equations, eectrica eddy fied, magnetic fied of cois, coi, magnetic fux, induced votage Principe A magnetic fied of variabe frequency and varying strength is

More information

Supervised i-vector Modeling - Theory and Applications

Supervised i-vector Modeling - Theory and Applications Supervised i-vector Modeing - Theory and Appications Shreyas Ramoji, Sriram Ganapathy Learning and Extraction of Acoustic Patterns LEAP) Lab, Eectrica Engineering, Indian Institute of Science, Bengauru,

More information

8 APPENDIX. E[m M] = (n S )(1 exp( exp(s min + c M))) (19) E[m M] n exp(s min + c M) (20) 8.1 EMPIRICAL EVALUATION OF SAMPLING

8 APPENDIX. E[m M] = (n S )(1 exp( exp(s min + c M))) (19) E[m M] n exp(s min + c M) (20) 8.1 EMPIRICAL EVALUATION OF SAMPLING 8 APPENDIX 8.1 EMPIRICAL EVALUATION OF SAMPLING We wish to evauate the empirica accuracy of our samping technique on concrete exampes. We do this in two ways. First, we can sort the eements by probabiity

More information