Manifold learning via Multi-Penalty Regularization


Abhishake Rastogi
Department of Mathematics
Indian Institute of Technology Delhi
New Delhi 110016, India

Abstract

Manifold regularization is an approach which exploits the geometry of the marginal distribution. The main goal of this paper is to analyze the convergence issues of such regularization algorithms in learning theory. We propose a more general multi-penalty framework and establish the optimal convergence rates under a general smoothness assumption. We study a theoretical analysis of the performance of multi-penalty regularization over the reproducing kernel Hilbert space. We discuss the error estimates of the regularization schemes under some prior assumptions on the joint probability measure on the sample space. We analyze the convergence rates of the learning algorithms measured in the norm of the reproducing kernel Hilbert space and in the norm of the Hilbert space of square-integrable functions. The convergence issues for the learning algorithms are discussed in a probabilistic sense by exponential tail inequalities. In order to optimize the regularization functional, one of the crucial issues is to select the regularization parameters so as to ensure good performance of the solution. We propose a new parameter choice rule, the penalty balancing principle, based on augmented Tikhonov regularization for the choice of regularization parameters. The superiority of multi-penalty regularization over single-penalty regularization is shown using an academic example and the two-moon data set.

Keywords: Learning theory, Multi-penalty regularization, General source condition, Optimal rates, Penalty balancing principle.

Mathematics Subject Classification 2010: 68T05, 68Q32.

1 Introduction

Let $X$ be a compact metric space and $Y \subset \mathbb{R}$ with the joint probability measure $\rho$ on $Z = X \times Y$. Suppose $\mathbf{z} = \{(x_i, y_i)\}_{i=1}^m \in Z^m$ is an observation set drawn from the unknown probability measure $\rho$. The learning problem [1, 2, 3, 4] aims to approximate a function $f_{\mathbf{z}}$ based on $\mathbf{z}$ such that $f_{\mathbf{z}}(x) \approx y$. The goodness of the estimator can be measured by the generalization error of a function $f$, which can be defined as
$$\mathcal{E}(f) := \mathcal{E}_\rho(f) = \int_Z V(f(x), y)\, d\rho(x, y), \qquad (1)$$

where $V : Y \times Y \to \mathbb{R}$ is the loss function. The minimizer of $\mathcal{E}(f)$ for the square loss function $V(f(x), y) = (f(x) - y)^2$ is given by
$$f_\rho(x) := \int_Y y\, d\rho(y|x), \qquad (2)$$
where $f_\rho$ is called the regression function. This is clear from the following decomposition:
$$\mathcal{E}(f) = \int_X (f(x) - f_\rho(x))^2\, d\rho_X(x) + \sigma_\rho^2,$$
where $\sigma_\rho^2 = \int_X \int_Y (y - f_\rho(x))^2\, d\rho(y|x)\, d\rho_X(x)$. The regression function $f_\rho$ belongs to the space of square-integrable functions provided that
$$\int_Z y^2\, d\rho(x, y) < \infty. \qquad (3)$$
Therefore our objective becomes to estimate the regression function $f_\rho$.

Single-penalty regularization is widely considered to infer the estimator from a given set of random samples [5, 6, 7, 8, 9, 10]. Smale et al. [9, 11, 12] provided the foundations of the theoretical analysis of the square-loss regularization scheme under Hölder's source condition. Caponnetto et al. [6] improved the error estimates to optimal convergence rates for the regularized least-squares algorithm using the polynomial decay condition on the eigenvalues of the integral operator. But sometimes, one may require to add more penalties to incorporate more features in the regularized solution. Multi-penalty regularization is studied by various authors for both inverse problems and learning algorithms [13, 14, 15, 16, 17, 18, 19, 20]. Belkin et al. [13] discussed the problem of manifold regularization, which controls the complexity of the function in the ambient space as well as the geometry of the probability space:
$$f^* = \operatorname*{argmin}_{f \in \mathcal{H}_K} \left\{ \frac{1}{m}\sum_{i=1}^m (f(x_i) - y_i)^2 + \lambda_A \|f\|_{\mathcal{H}_K}^2 + \frac{\lambda_I}{n^2} \sum_{i,j=1}^n (f(x_i) - f(x_j))^2 \omega_{ij} \right\}, \qquad (4)$$
where $\{(x_i, y_i) \in X \times Y : 1 \le i \le m\} \cup \{x_i \in X : m < i \le n\}$ is the given set of labeled and unlabeled data, $\lambda_A$ and $\lambda_I$ are non-negative regularization parameters, the $\omega_{ij}$'s are non-negative weights, $\mathcal{H}_K$ is the reproducing kernel Hilbert space and $\|\cdot\|_{\mathcal{H}_K}$ is its norm.

Further, the manifold regularization algorithm has been developed and widely considered in the vector-valued framework to analyze the multi-task learning problem [21, 22, 23, 24] (also see references therein). This motivates us to analyze the problem theoretically. The convergence issues of the multi-penalty regularizer are discussed under a general source condition in [25], but the convergence rates obtained there are not optimal. Here we are able to achieve the optimal minimax convergence rates using the polynomial decay condition on the eigenvalues of the integral operator.

In order to optimize the regularization functional, one of the crucial problems is the parameter choice strategy. Various prior and posterior parameter choice rules have been proposed for single-penalty regularization [26, 27, 28, 29, 30] (also see references therein). Many regularization parameter selection approaches are discussed for multi-penalized ill-posed inverse problems, such as the discrepancy principle [15, 31], the quasi-optimality principle [18, 32], the balanced-discrepancy principle [33], the heuristic L-curve [34], noise-structure-based parameter choice rules [35, 36, 37], and some approaches which require a reduction to single-penalty regularization [38]. Due to the growing interest in multi-penalty

regularization in learning, multi-parameter choice rules are discussed in the learning theory framework, such as the discrepancy principle [15, 16], the balanced-discrepancy principle [25], and a parameter choice strategy based on the generalized cross-validation score [19]. Here we discuss the penalty balancing principle (PB-principle) to choose the regularization parameters in our learning theory framework, which was considered for multi-penalty regularization in ill-posed problems [33].

1.1 Mathematical Preliminaries and Notations

Definition 1.1. Let $X$ be a non-empty set and $H$ be a Hilbert space of real-valued functions on $X$. If the pointwise evaluation map $F_x : H \to \mathbb{R}$, defined by $F_x(f) = f(x)$ for $f \in H$, is continuous for every $x \in X$, then $H$ is called a reproducing kernel Hilbert space.

For each reproducing kernel Hilbert space $H$ there exists a Mercer kernel $K : X \times X \to \mathbb{R}$ such that for $K_x : X \to \mathbb{R}$, defined as $K_x(y) = K(x, y)$, the span of the set $\{K_x : x \in X\}$ is dense in $H$. Moreover, there is a one-to-one correspondence between Mercer kernels and reproducing kernel Hilbert spaces [39]. So we denote by $\mathcal{H}_K$ the reproducing kernel Hilbert space corresponding to a Mercer kernel $K$ and by $\|\cdot\|_K$ its norm.

Definition 1.2. The sampling operator $S_{\mathbf{x}} : \mathcal{H}_K \to \mathbb{R}^m$ associated with a discrete subset $\mathbf{x} = \{x_i\}_{i=1}^m$ is defined by $S_{\mathbf{x}}(f) = (f(x))_{x \in \mathbf{x}}$.

Denote by $S_{\mathbf{x}}^* : \mathbb{R}^m \to \mathcal{H}_K$ the adjoint of $S_{\mathbf{x}}$. Then for $\mathbf{c} \in \mathbb{R}^m$,
$$\langle f, S_{\mathbf{x}}^* \mathbf{c} \rangle_K = \langle S_{\mathbf{x}} f, \mathbf{c} \rangle_m = \frac{1}{m}\sum_{i=1}^m c_i f(x_i) = \left\langle f, \frac{1}{m}\sum_{i=1}^m c_i K_{x_i} \right\rangle_K, \qquad f \in \mathcal{H}_K.$$
Then its adjoint is given by
$$S_{\mathbf{x}}^* \mathbf{c} = \frac{1}{m}\sum_{i=1}^m c_i K_{x_i}, \qquad \mathbf{c} = (c_1, \ldots, c_m) \in \mathbb{R}^m.$$
From the following assertion we observe that $S_{\mathbf{x}}$ is a bounded operator:
$$\|S_{\mathbf{x}} f\|_m^2 = \frac{1}{m}\sum_{i=1}^m \langle f, K_{x_i} \rangle_K^2 \le \left\{ \frac{1}{m}\sum_{i=1}^m \|K_{x_i}\|_K^2 \right\} \|f\|_K^2 \le \kappa^2 \|f\|_K^2,$$
which implies $\|S_{\mathbf{x}}\| \le \kappa$, where $\kappa := \sup_{x \in X} \sqrt{K(x, x)}$.

For each $(x_i, y_i) \in Z$, $y_i = f_\rho(x_i) + \eta_{x_i}$, where the probability distribution of $\eta_{x_i}$ has mean $0$ and variance $\sigma_{x_i}^2$. Denote $\sigma^2 := \frac{1}{m}\sum_{i=1}^m \sigma_{x_i}^2 < \infty$.

Learning Scheme. The optimization functional (4) can be expressed as
$$f^* = \operatorname*{argmin}_{f \in \mathcal{H}_K} \left\{ \|S_{\mathbf{x}} f - \mathbf{y}\|_m^2 + \lambda_A \|f\|_K^2 + \lambda_I \|(S_{\mathbf{x}}^* L S_{\mathbf{x}})^{1/2} f\|_K^2 \right\}, \qquad (5)$$

where $\mathbf{x} = \{x_i \in X : 1 \le i \le n\}$, $\|\mathbf{y}\|_m^2 = \frac{1}{m}\sum_{i=1}^m y_i^2$, and $L = D - W$, with $W = (\omega_{ij})$ a weight matrix with non-negative entries and $D$ a diagonal matrix with $D_{ii} = \sum_{j=1}^n \omega_{ij}$.

Here we consider a more general regularized learning scheme based on two penalties:
$$f_{\mathbf{z},\lambda} := \operatorname*{argmin}_{f \in \mathcal{H}_K} \left\{ \|S_{\mathbf{x}} f - \mathbf{y}\|_m^2 + \lambda_1 \|f\|_K^2 + \lambda_2 \|Bf\|_K^2 \right\}, \qquad (6)$$
where $B : \mathcal{H}_K \to \mathcal{H}_K$ is a bounded operator and $\lambda_1, \lambda_2$ are non-negative parameters.

Theorem 1.1. If $S_{\mathbf{x}}^* S_{\mathbf{x}} + \lambda_1 I + \lambda_2 B^* B$ is invertible, then the optimization functional (6) has the unique minimizer
$$f_{\mathbf{z},\lambda} = S S_{\mathbf{x}}^* \mathbf{y}, \qquad \text{where } S := (S_{\mathbf{x}}^* S_{\mathbf{x}} + \lambda_1 I + \lambda_2 B^* B)^{-1}.$$

Proof. We know that for $f \in \mathcal{H}_K$,
$$\|S_{\mathbf{x}} f - \mathbf{y}\|_m^2 + \lambda_1 \|f\|_K^2 + \lambda_2 \|Bf\|_K^2 = \langle (S_{\mathbf{x}}^* S_{\mathbf{x}} + \lambda_1 I + \lambda_2 B^* B) f, f \rangle_K - 2 \langle S_{\mathbf{x}}^* \mathbf{y}, f \rangle_K + \|\mathbf{y}\|_m^2.$$
Taking the functional derivative with respect to $f \in \mathcal{H}_K$, we see that any minimizer $f_{\mathbf{z},\lambda}$ of (6) satisfies
$$(S_{\mathbf{x}}^* S_{\mathbf{x}} + \lambda_1 I + \lambda_2 B^* B) f_{\mathbf{z},\lambda} = S_{\mathbf{x}}^* \mathbf{y}.$$
This proves Theorem 1.1.

Define $f_{\mathbf{x},\lambda}$ as the minimizer of the optimization problem:
$$f_{\mathbf{x},\lambda} := \operatorname*{argmin}_{f \in \mathcal{H}_K} \left\{ \frac{1}{m}\sum_{i=1}^m (f(x_i) - f_\rho(x_i))^2 + \lambda_1 \|f\|_K^2 + \lambda_2 \|Bf\|_K^2 \right\}, \qquad (7)$$
which gives
$$f_{\mathbf{x},\lambda} = S S_{\mathbf{x}}^* S_{\mathbf{x}} f_\rho. \qquad (8)$$
The data-free version of the considered regularization scheme (6) is
$$f_\lambda := \operatorname*{argmin}_{f \in \mathcal{H}_K} \left\{ \|f - f_\rho\|_\rho^2 + \lambda_1 \|f\|_K^2 + \lambda_2 \|Bf\|_K^2 \right\}, \qquad (9)$$
where the norm $\|\cdot\|_\rho := \|\cdot\|_{L^2_{\rho_X}}$. Then we get the expression for $f_\lambda$,
$$f_\lambda = (L_K + \lambda_1 I + \lambda_2 B^* B)^{-1} L_K f_\rho, \qquad (10)$$
and
$$f_{\lambda_1} := \operatorname*{argmin}_{f \in \mathcal{H}_K} \left\{ \|f - f_\rho\|_\rho^2 + \lambda_1 \|f\|_K^2 \right\}, \qquad (11)$$
which implies
$$f_{\lambda_1} = (L_K + \lambda_1 I)^{-1} L_K f_\rho, \qquad (12)$$
where the integral operator $L_K : L^2_{\rho_X} \to L^2_{\rho_X}$ is a self-adjoint, non-negative, compact operator defined as
$$L_K(f)(x) := \int_X K(x, t) f(t)\, d\rho_X(t), \qquad x \in X.$$
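To make the learning scheme concrete, the following minimal sketch (Python/NumPy; it is not code from the paper) solves a discretized form of the manifold-regularized functional (5): the candidate function is expanded as $f = \sum_{j=1}^n a_j K(\cdot, x_j)$ over all labeled and unlabeled inputs, so that the optimality condition of Theorem 1.1 becomes a single $n \times n$ linear system for the coefficients. The helper names (`gaussian_kernel`, `multi_penalty_coefficients`), the Gaussian kernel, the exponential graph weights and the exact scaling of the Laplacian penalty are illustrative assumptions; the paper's operators may carry different normalization constants.

```python
import numpy as np

def gaussian_kernel(X1, X2, gamma=3.5):
    """Gram matrix of the Mercer kernel K(x, x') = exp(-gamma * ||x - x'||^2)."""
    d2 = np.sum(X1**2, 1)[:, None] + np.sum(X2**2, 1)[None, :] - 2.0 * X1 @ X2.T
    return np.exp(-gamma * d2)

def multi_penalty_coefficients(X, y_labeled, lam1, lam2, gamma=3.5, b=1.0):
    """Coefficients a of f = sum_j a_j K(., x_j) minimizing a discretized version of
    (1/m) sum_{i<=m} (f(x_i) - y_i)^2 + lam1 * a^T K a + lam2 * a^T K L K a,
    i.e. the two-penalty functional (5)/(30) up to the normalization of the
    Laplacian term.  The first m rows of X are labeled, the rest are unlabeled."""
    n, m = X.shape[0], y_labeled.shape[0]
    K = gaussian_kernel(X, X, gamma)
    # graph Laplacian L = D - W with exponential weights w_ij = exp(-||x_i - x_j||^2 / 4b)
    d2 = np.sum(X**2, 1)[:, None] + np.sum(X**2, 1)[None, :] - 2.0 * X @ X.T
    W = np.exp(-d2 / (4.0 * b))
    L = np.diag(W.sum(axis=1)) - W
    P = np.zeros((n, n)); P[:m, :m] = np.eye(m)      # selects the labeled points
    y_pad = np.zeros(n); y_pad[:m] = y_labeled
    # quadratic objective => one linear system (cf. Theorem 1.1):
    # (P K / m + lam1 I + lam2 L K) a = (1/m) y_pad
    A = P @ K / m + lam1 * np.eye(n) + lam2 * L @ K
    return np.linalg.solve(A, y_pad / m)
```

Since the objective is quadratic in the coefficient vector, the whole estimator reduces to one linear solve, exactly as the operator equation in Theorem 1.1 suggests.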

The integral operator $L_K$ can also be defined as a self-adjoint operator on $\mathcal{H}_K$. We use the same notation $L_K$ for both operators. Using the singular value decomposition $L_K = \sum_{i=1}^\infty t_i \langle \cdot, e_i \rangle_K e_i$ for an orthonormal system $\{e_i\}$ in $\mathcal{H}_K$ and a sequence of singular numbers $\kappa^2 \ge t_1 \ge t_2 \ge \cdots \ge 0$, we define
$$\phi(L_K) = \sum_{i=1}^\infty \phi(t_i) \langle \cdot, e_i \rangle_K e_i,$$
where $\phi$ is a continuous increasing index function defined on the interval $[0, \kappa^2]$ with the assumption $\phi(0) = 0$. We require some prior assumptions on the probability measure $\rho$ to achieve uniform convergence rates for learning algorithms.

Assumption 1. (Source condition) Suppose
$$\Omega_{\phi, R} := \left\{ f \in \mathcal{H}_K : f = \phi(L_K) g \text{ and } \|g\|_K \le R \right\}.$$
Then the condition $f_\rho \in \Omega_{\phi, R}$ is usually referred to as the general source condition [40].

Assumption 2. (Polynomial decay condition) We assume that the eigenvalues $t_n$ of the integral operator $L_K$ follow the polynomial decay: for fixed positive constants $\alpha$, $\beta$ and $b > 1$,
$$\alpha n^{-b} \le t_n \le \beta n^{-b} \qquad \forall n \in \mathbb{N}.$$

Following the notion of Bauer et al. [5] and Caponnetto et al. [6], we consider the class of probability measures $\mathcal{P}_\phi$ which satisfy the source condition, and the class $\mathcal{P}_{\phi, b}$ which satisfies both the source condition and the polynomial decay condition.

The effective dimension $\mathcal{N}(\lambda_1)$ can be estimated from Proposition 3 of [6] under the polynomial decay condition as follows:
$$\mathcal{N}(\lambda_1) := \mathrm{Tr}\left( (L_K + \lambda_1 I)^{-1} L_K \right) \le \frac{\beta b}{b - 1} \lambda_1^{-1/b}, \qquad \text{for } b > 1, \qquad (13)$$
where $\mathrm{Tr}(A) := \sum_{k=1}^\infty \langle A e_k, e_k \rangle$ for some orthonormal basis $\{e_k\}_{k=1}^\infty$.

Shuai Lu et al. [41] and Blanchard et al. [42] considered the logarithmic decay condition on the effective dimension $\mathcal{N}(\lambda_1)$:

Assumption 3. (Logarithmic decay) Assume that there exists some positive constant $c > 0$ such that
$$\mathcal{N}(\lambda_1) \le c \log\left( \frac{1}{\lambda_1} \right), \qquad \lambda_1 > 0. \qquad (14)$$

2 Convergence Analysis

In this section, we discuss the convergence issues of the multi-penalty regularization scheme on the reproducing kernel Hilbert space under the considered smoothness priors in the learning theory framework. We address the convergence rates of the multi-penalty regularizer by estimating the sample error $f_{\mathbf{z},\lambda} - f_\lambda$ and the approximation error $f_\lambda - f_\rho$ in the interpolation norm. In Theorem 2.1, the upper convergence rates of the multi-penalty regularized solution $f_{\mathbf{z},\lambda}$ are derived from the estimates of

Propositions 2.2 and 2.3 for the class of probability measures $\mathcal{P}_{\phi,b}$, respectively. We discuss the error estimates under the general source condition and a parameter choice rule based on the index function $\phi$ and the sample size $m$. Under the polynomial decay condition, in Theorem 2.2 we obtain the optimal minimax convergence rates in terms of the index function $\phi$, the parameter $b$ and the number of samples $m$. In particular, under Hölder's source condition, we present the convergence rates under the logarithmic decay condition on the effective dimension in Corollary 2.2.

Proposition 2.1. Let $\mathbf{z}$ be i.i.d. samples drawn according to the probability measure $\rho$ with the hypothesis $|y_i| \le M$ for each $(x_i, y_i) \in Z$. Then for $0 \le s \le \frac{1}{2}$ and for every $0 < \delta < 1$, with probability $1 - \delta$,
$$\|L_K^s(f_{\mathbf{z},\lambda} - f_{\mathbf{x},\lambda})\|_K \le 2\lambda_1^{s - \frac{1}{2}} \left\{ \Xi\left(1 + \sqrt{2\log\left(\tfrac{2}{\delta}\right)}\right) + \frac{4\kappa M}{3m\sqrt{\lambda_1}}\log\left(\tfrac{2}{\delta}\right) \right\},$$
where $\mathcal{N}_{x_i}(\lambda_1) = \mathrm{Tr}\left( (L_K + \lambda_1 I)^{-1} K_{x_i} K_{x_i}^* \right)$ and $\Xi = \sqrt{\frac{1}{m^2}\sum_{i=1}^m \sigma_{x_i}^2 \mathcal{N}_{x_i}(\lambda_1)}$ for the variance $\sigma_{x_i}^2$ of the probability distribution of $\eta_{x_i} = y_i - f_\rho(x_i)$.

Proof. The expression $f_{\mathbf{z},\lambda} - f_{\mathbf{x},\lambda}$ can be written as $S S_{\mathbf{x}}^*(\mathbf{y} - S_{\mathbf{x}} f_\rho)$. Then we find that
$$\|L_K^s(f_{\mathbf{z},\lambda} - f_{\mathbf{x},\lambda})\|_K \le I_1 \|L_K^s(L_K + \lambda_1 I)^{-1/2}\| \, \|(L_K + \lambda_1 I)^{1/2} S (L_K + \lambda_1 I)^{1/2}\| \le I_1 I_2 \|L_K^s(L_K + \lambda_1 I)^{-1/2}\|, \qquad (15)$$
where $I_1 = \|(L_K + \lambda_1 I)^{-1/2} S_{\mathbf{x}}^*(\mathbf{y} - S_{\mathbf{x}} f_\rho)\|_K$ and $I_2 = \|(L_K + \lambda_1 I)^{1/2} (S_{\mathbf{x}}^* S_{\mathbf{x}} + \lambda_1 I)^{-1} (L_K + \lambda_1 I)^{1/2}\|$.

For a sufficiently large sample size, the following inequality holds:
$$m \ge \frac{8\kappa^2}{\lambda_1} \log\left(\frac{2}{\delta}\right). \qquad (16)$$
Then from Theorem 2 of [43] we have, with confidence $1 - \delta/2$,
$$I_3 := \|(L_K + \lambda_1 I)^{-1/2} (L_K - S_{\mathbf{x}}^* S_{\mathbf{x}}) (L_K + \lambda_1 I)^{-1/2}\| \le \frac{\|S_{\mathbf{x}}^* S_{\mathbf{x}} - L_K\|}{\lambda_1} \le \frac{4\kappa^2}{\sqrt{m}\,\lambda_1} \log\left(\frac{2}{\delta}\right) \le \frac{1}{2}.$$
Then the Neumann series gives
$$I_2 = \left\| \left\{ I - (L_K + \lambda_1 I)^{-1/2} (L_K - S_{\mathbf{x}}^* S_{\mathbf{x}}) (L_K + \lambda_1 I)^{-1/2} \right\}^{-1} \right\| \le \sum_{i=0}^\infty I_3^i = \frac{1}{1 - I_3} \le 2. \qquad (17)$$
Now we have
$$\|L_K^s(L_K + \lambda_1 I)^{-1/2}\| \le \sup_{0 < t \le \kappa^2} \frac{t^s}{(t + \lambda_1)^{1/2}} \le \lambda_1^{s - \frac{1}{2}} \quad \text{for } 0 \le s \le \frac{1}{2}. \qquad (18)$$
To estimate the error bound for $\|(L_K + \lambda_1 I)^{-1/2} S_{\mathbf{x}}^*(\mathbf{y} - S_{\mathbf{x}} f_\rho)\|_K$ using the McDiarmid inequality (Lemma 2 of [2]), define the function $F : \mathbb{R}^m \to \mathbb{R}$ as
$$F(\mathbf{y}) = \|(L_K + \lambda_1 I)^{-1/2} S_{\mathbf{x}}^*(\mathbf{y} - S_{\mathbf{x}} f_\rho)\|_K.$$

So
$$F^2(\mathbf{y}) = \left\| (L_K + \lambda_1 I)^{-1/2} \frac{1}{m}\sum_{i=1}^m (y_i - f_\rho(x_i)) K_{x_i} \right\|_K^2 = \frac{1}{m^2}\sum_{i,j=1}^m (y_i - f_\rho(x_i))(y_j - f_\rho(x_j)) \langle (L_K + \lambda_1 I)^{-1} K_{x_i}, K_{x_j} \rangle_K.$$
The independence of the samples together with $\mathbb{E}_{\mathbf{y}}(y_i - f_\rho(x_i)) = 0$, $\mathbb{E}_{\mathbf{y}}(y_i - f_\rho(x_i))^2 = \sigma_{x_i}^2$ implies
$$\mathbb{E}_{\mathbf{y}}(F^2) = \frac{1}{m^2}\sum_{i=1}^m \sigma_{x_i}^2 \mathcal{N}_{x_i}(\lambda_1) = \Xi^2,$$
where $\mathcal{N}_{x_i}(\lambda_1) = \mathrm{Tr}\left( (L_K + \lambda_1 I)^{-1} K_{x_i} K_{x_i}^* \right)$ and $\Xi = \sqrt{\frac{1}{m^2}\sum_{i=1}^m \sigma_{x_i}^2 \mathcal{N}_{x_i}(\lambda_1)}$. Since $\mathbb{E}_{\mathbf{y}}(F) \le \sqrt{\mathbb{E}_{\mathbf{y}}(F^2)}$, it implies $\mathbb{E}_{\mathbf{y}}(F) \le \Xi$.

Let $\mathbf{y}^i = (y_1, \ldots, y_{i-1}, y_i', y_{i+1}, \ldots, y_m)$, where $y_i'$ is another sample at $x_i$. We have
$$|F(\mathbf{y}) - F(\mathbf{y}^i)| \le \|(L_K + \lambda_1 I)^{-1/2} S_{\mathbf{x}}^*(\mathbf{y} - \mathbf{y}^i)\|_K = \frac{1}{m}|y_i - y_i'| \, \|(L_K + \lambda_1 I)^{-1/2} K_{x_i}\|_K \le \frac{2\kappa M}{m\sqrt{\lambda_1}}.$$
This can be taken as $B$ in Lemma 2(2) of [2]. Now
$$\mathbb{E}_{y_i}\left( |F(\mathbf{y}) - \mathbb{E}_{y_i}(F(\mathbf{y}))|^2 \right) \le \frac{1}{m^2} \int_Y \left( \int_Y |y_i - y_i'| \, \|(L_K + \lambda_1 I)^{-1/2} K_{x_i}\|_K \, d\rho(y_i'|x_i) \right)^2 d\rho(y_i|x_i)$$
$$\le \frac{1}{m^2} \int_Y \int_Y (y_i - y_i')^2 \, \mathcal{N}_{x_i}(\lambda_1)\, d\rho(y_i'|x_i)\, d\rho(y_i|x_i) \le \frac{2}{m^2} \sigma_{x_i}^2 \mathcal{N}_{x_i}(\lambda_1),$$
which implies $\sum_{i=1}^m \sigma_i^2(F) \le 2\Xi^2$.

In view of Lemma 2(2) of [2], for every $\varepsilon > 0$,
$$\mathrm{Prob}_{\mathbf{y} \in Y^m}\left\{ F(\mathbf{y}) - \mathbb{E}_{\mathbf{y}}(F(\mathbf{y})) \ge \varepsilon \right\} \le \exp\left\{ \frac{-\varepsilon^2}{4\left(\Xi^2 + \varepsilon\kappa M/(3m\sqrt{\lambda_1})\right)} \right\} =: \delta.$$
In terms of $\delta$, the probability inequality becomes
$$\mathrm{Prob}_{\mathbf{y} \in Y^m}\left\{ F(\mathbf{y}) \le \Xi\left(1 + \sqrt{2\log\left(\tfrac{1}{\delta}\right)}\right) + \frac{4\kappa M}{3m\sqrt{\lambda_1}}\log\left(\tfrac{1}{\delta}\right) \right\} \ge 1 - \delta.$$
Incorporating this inequality together with (17), (18) in (15), we get the desired result.

Proposition 2.2. Let $\mathbf{z}$ be i.i.d. samples drawn according to the probability measure $\rho$ with the hypothesis $|y_i| \le M$ for each $(x_i, y_i) \in Z$. Suppose $f_\rho \in \Omega_{\phi, R}$. Then for $0 \le s \le \frac{1}{2}$ and for every $0 < \delta < 1$, with probability $1 - \delta$,
$$\|L_K^s(f_{\mathbf{z},\lambda} - f_\lambda)\|_K \le 2\lambda_1^{s - \frac{1}{2}} \left\{ \frac{3M\sqrt{\mathcal{N}(\lambda_1)}}{\sqrt{m}} + \frac{4\kappa}{\sqrt{m\lambda_1}}\|f_\lambda - f_\rho\|_\rho + \frac{\sqrt{\lambda_1}}{6}\|f_\lambda - f_\rho\|_K + \frac{7\kappa M}{m\sqrt{\lambda_1}} \right\} \log\left(\frac{4}{\delta}\right).$$

Proof. We can express $f_{\mathbf{x},\lambda} - f_\lambda = S(S_{\mathbf{x}}^* S_{\mathbf{x}} - L_K)(f_\rho - f_\lambda)$, which implies
$$\|L_K^s(f_{\mathbf{x},\lambda} - f_\lambda)\|_K \le I_4 \left\| \frac{1}{m}\sum_{i=1}^m (f_\rho(x_i) - f_\lambda(x_i)) K_{x_i} - L_K(f_\rho - f_\lambda) \right\|_K,$$
where $I_4 = \|L_K^s S\|$. Using Lemma 3 of [2] for the function $f_\rho - f_\lambda$, we get with confidence $1 - \delta/2$,
$$\|L_K^s(f_{\mathbf{x},\lambda} - f_\lambda)\|_K \le I_4 \left( \frac{4\kappa\|f_\lambda - f_\rho\|_K}{3m}\log\left(\frac{2}{\delta}\right) + \frac{\kappa\|f_\lambda - f_\rho\|_\rho}{\sqrt{m}}\left(1 + \sqrt{8\log\left(\frac{2}{\delta}\right)}\right) \right). \qquad (19)$$
For a sufficiently large sample satisfying (16), from Theorem 2 of [43] we get
$$\|(L_K - S_{\mathbf{x}}^* S_{\mathbf{x}})(L_K + \lambda_1 I)^{-1}\| \le \frac{\|S_{\mathbf{x}}^* S_{\mathbf{x}} - L_K\|}{\lambda_1} \le \frac{4\kappa^2}{\sqrt{m}\,\lambda_1}\log\left(\frac{2}{\delta}\right) \le \frac{1}{2}$$
with confidence $1 - \delta/2$, which implies
$$\|(L_K + \lambda_1 I)(S_{\mathbf{x}}^* S_{\mathbf{x}} + \lambda_1 I)^{-1}\| = \left\| \left\{ I - (L_K - S_{\mathbf{x}}^* S_{\mathbf{x}})(L_K + \lambda_1 I)^{-1} \right\}^{-1} \right\| \le 2. \qquad (20)$$
We have
$$\|L_K^s(L_K + \lambda_1 I)^{-1}\| \le \sup_{0 < t \le \kappa^2} \frac{t^s}{t + \lambda_1} \le \lambda_1^{s-1} \quad \text{for } 0 \le s \le 1. \qquad (21)$$
Now equations (20) and (21) imply the following inequality:
$$I_4 \le \|L_K^s(S_{\mathbf{x}}^* S_{\mathbf{x}} + \lambda_1 I)^{-1}\| \le \|L_K^s(L_K + \lambda_1 I)^{-1}\| \, \|(L_K + \lambda_1 I)(S_{\mathbf{x}}^* S_{\mathbf{x}} + \lambda_1 I)^{-1}\| \le 2\lambda_1^{s-1}. \qquad (22)$$
Let $\xi(x) = \sigma_x^2 \mathcal{N}_x(\lambda_1)$ be the random variable. Then it satisfies $\|\xi\|_\infty \le 4\kappa^2 M^2/\lambda_1$, $\mathbb{E}_x(\xi) \le M^2 \mathcal{N}(\lambda_1)$ and $\sigma^2(\xi) \le 4\kappa^2 M^4 \mathcal{N}(\lambda_1)/\lambda_1$. Using the Bernstein inequality we get
$$\mathrm{Prob}_{\mathbf{x} \in X^m}\left\{ \frac{1}{m}\sum_{i=1}^m \left( \sigma_{x_i}^2 \mathcal{N}_{x_i}(\lambda_1) - M^2 \mathcal{N}(\lambda_1) \right) > t \right\} \le \exp\left\{ \frac{-m t^2/2}{4\kappa^2 M^4 \mathcal{N}(\lambda_1)/\lambda_1 + 4\kappa^2 M^2 t/(3\lambda_1)} \right\},$$
which implies
$$\mathrm{Prob}_{\mathbf{x} \in X^m}\left\{ \Xi^2 \le \frac{M^2 \mathcal{N}(\lambda_1)}{m} + \frac{8\kappa^2 M^2}{m\lambda_1}\log\left(\frac{1}{\delta}\right) \right\} \ge 1 - \delta. \qquad (23)$$
We get the required error estimate by combining the estimates of Proposition 2.1 with the inequalities (19), (22), (23).

Proposition 2.3. Suppose $f_\rho \in \Omega_{\phi, R}$. Then under the assumption that $\phi(t)$ and $t^s/\phi(t)$ are nondecreasing functions, we have
$$\|L_K^s(f_\lambda - f_\rho)\|_K \le \lambda_1^s \left( R\phi(\lambda_1) + \lambda_2 \lambda_1^{-3/2} M \|B^* B\| \right). \qquad (24)$$

Proof. To realize the above error estimate, we decompose $f_\lambda - f_\rho$ into $f_\lambda - f_{\lambda_1} + f_{\lambda_1} - f_\rho$. The first term can be expressed as
$$f_\lambda - f_{\lambda_1} = -\lambda_2 (L_K + \lambda_1 I + \lambda_2 B^* B)^{-1} B^* B f_{\lambda_1}.$$

Then we get
$$\|L_K^s(f_\lambda - f_{\lambda_1})\|_K \le \lambda_2 \|L_K^s(L_K + \lambda_1 I)^{-1}\| \, \|B^* B\| \, \|f_{\lambda_1}\|_K \le \lambda_2 \lambda_1^{s-1} \|B^* B\| \, \|f_{\lambda_1}\|_K \le \lambda_2 \lambda_1^{s-3/2} M \|B^* B\|, \qquad (25)$$
$$\|L_K^s(f_{\lambda_1} - f_\rho)\|_K \le R \|r_{\lambda_1}(L_K) L_K^s \phi(L_K)\| \le R\lambda_1^s \phi(\lambda_1), \qquad (26)$$
where $r_{\lambda_1}(t) = (t + \lambda_1)^{-1} t - 1$. Combining these error bounds, we achieve the required estimate.

Theorem 2.1. Let $\mathbf{z}$ be i.i.d. samples drawn according to a probability measure $\rho \in \mathcal{P}_{\phi, b}$. Suppose $\phi(t)$ and $t^s/\phi(t)$ are nondecreasing functions. Then under the parameter choice $\lambda_1 \in (0, 1]$, $\lambda_1 = \Psi^{-1}(m^{-1/2})$, $\lambda_2 = (\Psi^{-1}(m^{-1/2}))^{3/2}\phi(\Psi^{-1}(m^{-1/2}))$, where $\Psi(t) = t^{\frac{1}{2} + \frac{1}{2b}}\phi(t)$, for $0 \le s \le \frac{1}{2}$ and for all $0 < \delta < 1$, the following error estimate holds with confidence $1 - \delta$:
$$\mathrm{Prob}_{\mathbf{z} \in Z^m}\left\{ \|L_K^s(f_{\mathbf{z},\lambda} - f_\rho)\|_K \le C(\Psi^{-1}(m^{-1/2}))^s \phi(\Psi^{-1}(m^{-1/2})) \log\left(\frac{4}{\delta}\right) \right\} \ge 1 - \delta,$$
where $C = 4\kappa M + (2 + 8\kappa)(R + M\|B^* B\|) + 6M\sqrt{\beta b/(b-1)}$, and
$$\lim_{\tau \to \infty} \limsup_{m \to \infty} \sup_{\rho \in \mathcal{P}_{\phi,b}} \mathrm{Prob}_{\mathbf{z} \in Z^m}\left\{ \|L_K^s(f_{\mathbf{z},\lambda} - f_\rho)\|_K > \tau(\Psi^{-1}(m^{-1/2}))^s \phi(\Psi^{-1}(m^{-1/2})) \right\} = 0.$$

Proof. Let $\Psi(t) = t^{\frac{1}{2} + \frac{1}{2b}}\phi(t)$. Then, with $\Psi(t) = y$, it follows that
$$\lim_{t \to 0} \frac{\Psi^2(t)}{t} = \lim_{y \to 0} \frac{y^2}{\Psi^{-1}(y)} = 0.$$
Under the parameter choice $\lambda_1 = \Psi^{-1}(m^{-1/2})$ this gives $\lim_{m \to \infty} m\lambda_1 = \infty$. Therefore, for sufficiently large $m$,
$$\frac{1}{m\sqrt{\lambda_1}} \le \frac{1}{\sqrt{m}} = \lambda_1^{\frac{1}{2} + \frac{1}{2b}}\phi(\lambda_1) \le \lambda_1^{\frac{1}{2b}}\phi(\lambda_1).$$
Then, using the estimate $\mathcal{N}(\lambda_1) \le \frac{\beta b}{b-1}\lambda_1^{-1/b}$ from eqn. (13), it follows from Propositions 2.2 and 2.3 that with confidence $1 - \delta$,
$$\|L_K^s(f_{\mathbf{z},\lambda} - f_\rho)\|_K \le C(\Psi^{-1}(m^{-1/2}))^s \phi(\Psi^{-1}(m^{-1/2})) \log\left(\frac{4}{\delta}\right), \qquad (27)$$
where $C = 4\kappa M + (2 + 8\kappa)(R + M\|B^* B\|) + 6M\sqrt{\beta b/(b-1)}$.

Now defining $\tau := C\log\left(\frac{4}{\delta}\right)$ gives $\delta = \delta_\tau = 4e^{-\tau/C}$. The estimate (27) can then be re-expressed as
$$\mathrm{Prob}_{\mathbf{z} \in Z^m}\left\{ \|L_K^s(f_{\mathbf{z},\lambda} - f_\rho)\|_K > \tau(\Psi^{-1}(m^{-1/2}))^s \phi(\Psi^{-1}(m^{-1/2})) \right\} \le \delta_\tau. \qquad (28)$$
Since $\delta_\tau \to 0$ as $\tau \to \infty$, the second assertion of the theorem follows.

Theorem 2.2. Let $\mathbf{z}$ be i.i.d. samples drawn according to a probability measure $\rho \in \mathcal{P}_{\phi, b}$. Then under the parameter choice $\lambda_0 \in (0, 1]$, $\lambda_0 = \Psi^{-1}(m^{-1/2})$, $\lambda_j = (\Psi^{-1}(m^{-1/2}))^{3/2}\phi(\Psi^{-1}(m^{-1/2}))$ for $1 \le j \le p$, where $\Psi(t) = t^{\frac{1}{2} + \frac{1}{2b}}\phi(t)$, for all $0 < \eta < 1$ the following error estimates hold with confidence $1 - \eta$:

(i) If $\phi(t)$ and $t/\phi(t)$ are nondecreasing functions, then we have
$$\mathrm{Prob}_{\mathbf{z} \in Z^m}\left\{ \|f_{\mathbf{z},\lambda} - f_{\mathcal{H}}\|_{\mathcal{H}} \le C\phi(\Psi^{-1}(m^{-1/2})) \log\left(\frac{4}{\eta}\right) \right\} \ge 1 - \eta$$
and
$$\lim_{\tau \to \infty} \limsup_{m \to \infty} \sup_{\rho \in \mathcal{P}_{\phi,b}} \mathrm{Prob}_{\mathbf{z} \in Z^m}\left\{ \|f_{\mathbf{z},\lambda} - f_{\mathcal{H}}\|_{\mathcal{H}} > \tau\phi(\Psi^{-1}(m^{-1/2})) \right\} = 0,$$
where $C = 2R + 4\kappa M + 2M\|B^* B\|_{\mathcal{L}(\mathcal{H})} + 4\kappa\Sigma$.

(ii) If $\phi(t)$ and $t/\phi(t)$ are nondecreasing functions, then we have
$$\mathrm{Prob}_{\mathbf{z} \in Z^m}\left\{ \|f_{\mathbf{z},\lambda} - f_{\mathcal{H}}\|_\rho \le C(\Psi^{-1}(m^{-1/2}))^{1/2}\phi(\Psi^{-1}(m^{-1/2})) \log\left(\frac{4}{\eta}\right) \right\} \ge 1 - \eta$$
and
$$\lim_{\tau \to \infty} \limsup_{m \to \infty} \sup_{\rho \in \mathcal{P}_{\phi,b}} \mathrm{Prob}_{\mathbf{z} \in Z^m}\left\{ \|f_{\mathbf{z},\lambda} - f_{\mathcal{H}}\|_\rho > \tau(\Psi^{-1}(m^{-1/2}))^{1/2}\phi(\Psi^{-1}(m^{-1/2})) \right\} = 0.$$

Corollary 2.1. Under the same assumptions as Theorem 2.1, for Hölder's source condition $f_\rho \in \Omega_{\phi, R}$, $\phi(t) = t^r$, for $0 \le s \le \frac{1}{2}$ and for all $0 < \delta < 1$, with confidence $1 - \delta$, for the parameter choice $\lambda_1 = m^{-\frac{b}{2br+b+1}}$ and $\lambda_2 = m^{-\frac{2br+3b}{4br+2b+2}}$ we have the following convergence rates:
$$\|L_K^s(f_{\mathbf{z},\lambda} - f_\rho)\|_K \le C m^{-\frac{b(r+s)}{2br+b+1}} \log\left(\frac{4}{\delta}\right) \quad \text{for } 0 \le r \le 1 - s.$$

Remark 2.1. For Hölder's source condition $f_{\mathcal{H}} \in \Omega_{\phi, R}$, $\phi(t) = t^r$, with the parameter choice $\lambda_0 \in (0, 1]$, $\lambda_0 = m^{-\frac{b}{2br+b+1}}$ and $\lambda_j = m^{-\frac{2br+3b}{4br+2b+2}}$ for $1 \le j \le p$, we obtain the optimal minimax convergence rates $O\!\left(m^{-\frac{br}{2br+b+1}}\right)$ for $0 \le r \le 1$ and $O\!\left(m^{-\frac{2br+b}{4br+2b+2}}\right)$ for $0 \le r \le \frac{1}{2}$ in the RKHS-norm and the $L^2_{\rho_X}$-norm, respectively.

Corollary 2.2. Under the logarithmic decay condition on the effective dimension $\mathcal{N}(\lambda_1)$, for Hölder's source condition $f_\rho \in \Omega_{\phi, R}$, $\phi(t) = t^r$, for $0 \le s \le \frac{1}{2}$ and for all $0 < \delta < 1$, with confidence $1 - \delta$, for the parameter choice $\lambda_1 = \left(\frac{\log m}{m}\right)^{\frac{1}{2r+1}}$ and $\lambda_2 = \left(\frac{\log m}{m}\right)^{\frac{2r+3}{4r+2}}$ we have the following convergence rates:
$$\|L_K^s(f_{\mathbf{z},\lambda} - f_\rho)\|_K \le C\left(\frac{\log m}{m}\right)^{\frac{s+r}{2r+1}} \log\left(\frac{4}{\delta}\right) \quad \text{for } 0 \le r \le 1 - s.$$

Remark 2.2. The upper convergence rates of the regularized solution are estimated in the interpolation norm for the parameter $s \in [0, \frac{1}{2}]$. In particular, we obtain the error estimates in the $\mathcal{H}_K$-norm for $s = 0$ and in the $L^2_{\rho_X}$-norm for $s = \frac{1}{2}$. We present the error estimates of the multi-penalty regularizer over the regularity class $\mathcal{P}_{\phi, b}$ in Theorem 2.1 and Corollary 2.1. We can also obtain the convergence rates of the estimator $f_{\mathbf{z},\lambda}$ under the source condition without the polynomial decay of the eigenvalues of the integral operator $L_K$ by substituting $\mathcal{N}(\lambda_1) \le \frac{\kappa^2}{\lambda_1}$. In addition, for $B = (S_{\mathbf{x}}^* L S_{\mathbf{x}})^{1/2}$ we obtain the error estimates of the manifold regularization scheme (30) considered in [13].

Remark 2.3. The parameter choice is said to be optimal if the minimax lower rates coincide with the upper convergence rates for some $\lambda = \lambda(m)$. For the parameter choice $\lambda_1 = \Psi^{-1}(m^{-1/2})$ and

$\lambda_2 = (\Psi^{-1}(m^{-1/2}))^{3/2}\phi(\Psi^{-1}(m^{-1/2}))$, Theorem 2.2 shares the upper convergence rates with the lower convergence rates of Theorems 3.1 and 3.2 of [44]. Therefore the choice of parameters is optimal.

Remark 2.4. The results can easily be generalized to n-penalty regularization in the vector-valued framework. For simplicity, we discuss the two-parameter regularization scheme in the scalar-valued function setting.

Remark 2.5. We can also address the convergence issues of the binary classification problem [45] using our error estimates, similarly to the discussions in Section 3.3 of [5] and Section 5 of [9].

The proposed choice of parameters in Theorem 2.1 is based on the regularity parameters, which are generally not known in practice. In the following section, we discuss parameter choice rules based on the samples.

3 Parameter Choice Rules

Most regularized learning algorithms depend on tuning parameters, whose appropriate choice is crucial to ensure good performance of the regularized solution. Many parameter choice strategies have been discussed for single-penalty regularization schemes for both ill-posed problems and learning algorithms [27, 28] (also see references therein). Various parameter choice rules are studied for multi-penalty regularization schemes [15, 18, 19, 25, 31, 32, 33, 36, 46]. Ito et al. [33] studied a balancing principle for choosing regularization parameters based on the augmented Tikhonov regularization approach for ill-posed inverse problems. In the learning theory framework, we discuss the fixed-point algorithm based on the penalty balancing principle considered in [33].

The Bayesian inference approach provides a mechanism for selecting the regularization parameters through hierarchical modeling. Various authors have successfully applied this approach to different problems. Thompson et al. [47] applied it to selecting parameters for image restoration. Jin et al. [48] considered the approach for the ill-posed Cauchy problem of steady-state heat conduction. The posterior probability density function (PPDF) for the functional (5) is given by
$$P(f, \sigma^2, \mu \mid \mathbf{z}) \propto \left(\frac{1}{\sigma^2}\right)^{n_1/2} \exp\left(-\frac{1}{2\sigma^2}\|S_{\mathbf{x}} f - \mathbf{y}\|_m^2\right) \mu_1^{n_1/2} \exp\left(-\frac{\mu_1}{2}\|f\|_K^2\right) \mu_2^{n_2/2} \exp\left(-\frac{\mu_2}{2}\|Bf\|_K^2\right) \mu_1^{\alpha_1} e^{-\beta_1\mu_1}\, \mu_2^{\alpha_1} e^{-\beta_1\mu_2} \left(\frac{1}{\sigma^2}\right)^{\alpha_o'} e^{-\beta_o'/\sigma^2},$$
where $(\alpha_1, \beta_1)$ is the parameter pair for $\mu = (\mu_1, \mu_2)$ and $(\alpha_o', \beta_o')$ is the parameter pair for the inverse variance $\sigma^{-2}$. In the Bayesian inference approach, we select the parameter set $(f, \sigma^2, \mu)$ which maximizes the PPDF. By taking the negative logarithm and simplifying, the problem can be reformulated as
$$\mathcal{J}(f, \tau, \mu) = \tau\|S_{\mathbf{x}} f - \mathbf{y}\|_m^2 + \mu_1\|f\|_K^2 + \mu_2\|Bf\|_K^2 + \beta(\mu_1 + \mu_2) - \alpha(\log\mu_1 + \log\mu_2) + \beta_o\tau - \alpha_o\log\tau,$$
where $\tau = 1/\sigma^2$, $\beta = 2\beta_1$, $\alpha = n_1 + 2\alpha_1 - 2$, $\beta_o = 2\beta_o'$, $\alpha_o = n_2 + 2\alpha_o' - 2$. We assume that the scalars $\tau$ and the $\mu_i$'s have Gamma distributions with known parameter pairs. The functional is known as the augmented Tikhonov regularization. For the non-informative prior $\beta_o = \beta = 0$, the optimality of the a-Tikhonov functional can be reduced to

$$\begin{cases} f_{\mathbf{z},\lambda} = \operatorname*{argmin}_{f \in \mathcal{H}_K}\left\{ \|S_{\mathbf{x}} f - \mathbf{y}\|_m^2 + \lambda_1\|f\|_K^2 + \lambda_2\|Bf\|_K^2 \right\}, \\[4pt] \mu_1 = \dfrac{\alpha}{\|f_{\mathbf{z},\lambda}\|_K^2}, \qquad \mu_2 = \dfrac{\alpha}{\|Bf_{\mathbf{z},\lambda}\|_K^2}, \qquad \tau = \dfrac{\alpha_o}{\|S_{\mathbf{x}} f_{\mathbf{z},\lambda} - \mathbf{y}\|_m^2}, \end{cases}$$
where $\lambda_1 = \frac{\mu_1}{\tau}$, $\lambda_2 = \frac{\mu_2}{\tau}$, $\gamma = \frac{\alpha_o}{\alpha}$. This can be reformulated as
$$\begin{cases} f_{\mathbf{z},\lambda} = \operatorname*{argmin}_{f \in \mathcal{H}_K}\left\{ \|S_{\mathbf{x}} f - \mathbf{y}\|_m^2 + \lambda_1\|f\|_K^2 + \lambda_2\|Bf\|_K^2 \right\}, \\[4pt] \lambda_1 = \dfrac{\|S_{\mathbf{x}} f_{\mathbf{z},\lambda} - \mathbf{y}\|_m^2}{\gamma\|f_{\mathbf{z},\lambda}\|_K^2}, \qquad \lambda_2 = \dfrac{\|S_{\mathbf{x}} f_{\mathbf{z},\lambda} - \mathbf{y}\|_m^2}{\gamma\|Bf_{\mathbf{z},\lambda}\|_K^2}, \end{cases}$$
which implies $\lambda_1\|f_{\mathbf{z},\lambda}\|_K^2 = \lambda_2\|Bf_{\mathbf{z},\lambda}\|_K^2$. It selects the regularization parameters $\lambda$ in the functional (6) by balancing the penalties with the fidelity; hence the term penalty balancing principle. Now we describe the fixed-point algorithm based on the PB-principle.

Algorithm 1 Parameter choice rule: Penalty-balancing principle
1. For an initial value $\lambda = (\lambda_1^0, \lambda_2^0)$, start with $k = 0$.
2. Calculate $f_{\mathbf{z},\lambda^k}$ and update $\lambda$ by
$$\lambda_1^{k+1} = \frac{\|S_{\mathbf{x}} f_{\mathbf{z},\lambda^k} - \mathbf{y}\|_m^2 + \lambda_2^k\|Bf_{\mathbf{z},\lambda^k}\|_K^2}{(1 + \gamma)\|f_{\mathbf{z},\lambda^k}\|_K^2}, \qquad \lambda_2^{k+1} = \frac{\|S_{\mathbf{x}} f_{\mathbf{z},\lambda^k} - \mathbf{y}\|_m^2 + \lambda_1^k\|f_{\mathbf{z},\lambda^k}\|_K^2}{(1 + \gamma)\|Bf_{\mathbf{z},\lambda^k}\|_K^2}.$$
3. If the stopping criterion $\|\lambda^{k+1} - \lambda^k\| < \varepsilon$ is satisfied, then stop; otherwise set $k = k + 1$ and go to step 2.

4 Numerical Realization

In this section, the performance of single-penalty regularization versus multi-penalty regularization is demonstrated using an academic example and the two-moon data set. For single-penalty regularization, the parameters are chosen according to the quasi-optimality principle, while for two-parameter regularization they are chosen according to the PB-principle.

We consider the well-known academic example [28, 16, 49] to test the multi-penalty regularization under the PB-principle parameter choice rule,
$$f_\rho(x) = \frac{1}{10}\left\{ x + 2\left( e^{-8(\frac{4\pi}{3} - x)^2} - e^{-8(\frac{\pi}{2} - x)^2} - e^{-8(\frac{3\pi}{2} - x)^2} \right) \right\}, \qquad x \in [0, 2\pi], \qquad (29)$$
which belongs to the reproducing kernel Hilbert space $\mathcal{H}_K$ corresponding to the kernel $K(x, y) = xy + \exp(-8(x - y)^2)$. We generate noisy data 100 times in the form $y = f_\rho(x) + \delta\xi$ corresponding to the inputs $\mathbf{x} = \{x_i\} = \{\frac{\pi}{10}(i - 1)\}$, where $\xi$ follows the uniform distribution over $[-1, 1]$ with $\delta = 0.02$.
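Algorithm 1 above is simply a fixed-point iteration on $(\lambda_1, \lambda_2)$. The following sketch (Python; the callback names, default tolerances and starting values are assumptions, not prescriptions from the paper) implements the update rule, leaving the inner solver for $f_{\mathbf{z},\lambda^k}$ and the three quantities $\|S_{\mathbf{x}} f - \mathbf{y}\|_m^2$, $\|f\|_K^2$ and $\|Bf\|_K^2$ as user-supplied callables, so that it applies to any discretization of the functional (6).

```python
def pb_principle(solve_f, residual, norm_f_sq, norm_Bf_sq,
                 lam0=(1e-4, 1e-4), gamma=1.0, eps=1e-6, max_iter=100):
    """Penalty balancing principle (Algorithm 1).

    solve_f(lam1, lam2) -> regularized solution f_{z,lam},
    residual(f)    -> ||S_x f - y||_m^2,
    norm_f_sq(f)   -> ||f||_K^2,
    norm_Bf_sq(f)  -> ||B f||_K^2,
    gamma is the ratio alpha_o / alpha from the a-Tikhonov formulation."""
    lam1, lam2 = lam0
    for _ in range(max_iter):
        f = solve_f(lam1, lam2)
        r, nf, nBf = residual(f), norm_f_sq(f), norm_Bf_sq(f)
        new1 = (r + lam2 * nBf) / ((1.0 + gamma) * nf)    # update for lambda_1
        new2 = (r + lam1 * nf) / ((1.0 + gamma) * nBf)    # update for lambda_2
        if abs(new1 - lam1) + abs(new2 - lam2) < eps:     # stopping criterion
            return new1, new2, f
        lam1, lam2 = new1, new2
    return lam1, lam2, f
```

At a fixed point of these updates the two penalties are balanced, $\lambda_1\|f_{\mathbf{z},\lambda}\|_K^2 = \lambda_2\|Bf_{\mathbf{z},\lambda}\|_K^2$, which is exactly the defining property of the PB-principle stated above.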

We consider the following multi-penalty functional proposed in manifold regularization [13, 15]:
$$\operatorname*{argmin}_{f \in \mathcal{H}_K}\left\{ \frac{1}{m}\sum_{i=1}^m (f(x_i) - y_i)^2 + \lambda_1\|f\|_K^2 + \lambda_2\|(S_{\mathbf{x}}^* L S_{\mathbf{x}})^{1/2} f\|_K^2 \right\}, \qquad (30)$$
where $\mathbf{x} = \{x_i \in X : 1 \le i \le n\}$ and $L = D - W$, with $W = (\omega_{ij})$ a weight matrix with non-negative entries and $D$ a diagonal matrix with $D_{ii} = \sum_{j=1}^n \omega_{ij}$.

In our experiment, we illustrate the error estimates of the single-penalty regularizers $f = f_{\mathbf{z},\lambda_1}$, $f = f_{\mathbf{z},\lambda_2}$ and the multi-penalty regularizer $f = f_{\mathbf{z},\lambda}$ using the relative error measure $\frac{\|f - f_\rho\|}{\|f\|}$ for the academic example in the sup-norm, the $\mathcal{H}_K$-norm and the $m$-empirical norm in Fig. 1 (a), (b) & (c), respectively.

Figure 1: The figures show the relative errors of the different estimators for the academic example in the $\|\cdot\|_K$-norm (a), the $m$-empirical norm (b) and the infinity norm (c), corresponding to 100 test problems with the noise $\delta = 0.02$ for all estimators.

Now we compare the performance of the multi-penalty regularization method against the single-penalty regularization method using the well-known two-moon data set (Fig. 2) in the context of manifold learning. The data set contains 200 examples with $k$ labeled examples for each class. We perform experiments 500 times by taking $l = 2k = 2, 6, 10, 20$ labeled points randomly. We solve the manifold regularization problem (30) for the Mercer kernel $K(x_i, x_j) = \exp(-\gamma\|x_i - x_j\|^2)$ with the exponential weights $\omega_{ij} = \exp(-\|x_i - x_j\|^2/4b)$, for some $b, \gamma > 0$. We choose the initial parameter $\lambda_1 = 10^{-4}$, an initial value of $\lambda_2$, the kernel parameter $\gamma = 3.5$ and the weight parameter $b$, all fixed in all experiments. The performance of the single-penalty ($\lambda_2 = 0$) and the proposed multi-penalty regularizer (30) is presented in Fig. 2 and Table 1. Based on the considered examples, we observe that the proposed multi-penalty regularization with the penalty balancing principle parameter choice outperforms the single-penalty regularizers.
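For the two-moon comparison, an end-to-end sketch might look as follows. It reuses the `multi_penalty_coefficients` and `gaussian_kernel` helpers sketched in Section 1.1, draws the data with scikit-learn's `make_moons` (an assumption; the paper does not say how its data set was produced), and uses fixed illustrative regularization parameters rather than the PB-selected ones.

```python
import numpy as np
from sklearn.datasets import make_moons

# two-moon data: 200 points, k labeled examples per class (here k = 1, so l = 2)
X, labels = make_moons(n_samples=200, noise=0.05, random_state=0)
y_pm = 2.0 * labels - 1.0                              # map {0, 1} -> {-1, +1}
k = 1
labeled = np.concatenate([np.where(labels == c)[0][:k] for c in (0, 1)])
order = np.concatenate([labeled, np.setdiff1d(np.arange(200), labeled)])
X, y_pm = X[order], y_pm[order]                        # labeled points come first

# manifold (multi-penalty) estimator; lam1, lam2 are illustrative values
a = multi_penalty_coefficients(X, y_pm[:2 * k], lam1=1e-4, lam2=1e-2,
                               gamma=3.5, b=1.0)
pred = np.sign(gaussian_kernel(X, X, gamma=3.5) @ a)   # decision by sign of f
print("correctly classified:", int(np.sum(pred == y_pm)), "of 200")
```

Setting `lam2=0` in the same call gives the single-penalty baseline, so the comparison reported in Table 1 can be reproduced in spirit by sweeping the number of labeled points and the parameter choice rule.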

Figure 2: The figures show the decision surfaces generated with two labeled samples (red stars) by the single-penalty regularizer (a) based on the quasi-optimality principle and the manifold regularizer (b) based on the PB-principle.

Table 1: Statistical performance of the single-penalty ($\lambda_2 = 0$) and multi-penalty regularizers of the functional (30) for $l = 2, 6, 10, 20$ labeled points; for each case the table reports the successfully predicted percentage (SP %), the maximum number of wrongly classified points (WC) and the chosen parameters $\lambda_1$ (single-penalty) and $\lambda_1, \lambda_2$ (multi-penalty). Symbols: labeled points ($l$); successfully predicted (SP); maximum of wrongly classified points (WC).

5 Conclusion

In summary, we achieved the optimal minimax rates of the multi-penalized regression problem under the general source condition with the decay conditions on the effective dimension. In particular, the convergence analysis of multi-penalty regularization provides the error estimates of the manifold regularization problem. We can also address the convergence issues of the binary classification problem using our error estimates. Here we discussed the penalty balancing principle based on augmented Tikhonov regularization for the choice of regularization parameters. Many other parameter choice rules have been proposed to obtain the regularized solution of multi-parameter regularization schemes. The next problem of interest can be the rigorous analysis of different parameter choice rules for multi-penalty regularization schemes. Finally, the superiority of multi-penalty regularization over single-penalty regularization is shown using the academic example and the two-moon data set.

Acknowledgements: The authors are grateful for the valuable suggestions and comments of the anonymous referees that led to improvements in the quality of the paper.

References

[1] O. Bousquet, S. Boucheron, and G. Lugosi, Introduction to statistical learning theory, in Advanced Lectures on Machine Learning, Berlin/Heidelberg: Springer, 2004.
[2] F. Cucker and S. Smale, On the mathematical foundations of learning, Bull. Amer. Math. Soc. (N.S.), vol. 39, no. 1, pp. 1-49, 2002.
[3] T. Evgeniou, M. Pontil, and T. Poggio, Regularization networks and support vector machines, Adv. Comput. Math., vol. 13, no. 1, pp. 1-50, 2000.
[4] V. N. Vapnik, Statistical Learning Theory, vol. 1. New York: Wiley, 1998.
[5] F. Bauer, S. Pereverzev, and L. Rosasco, On regularization algorithms in learning theory, J. Complexity, vol. 23, no. 1, pp. 52-72, 2007.
[6] A. Caponnetto and E. De Vito, Optimal rates for the regularized least-squares algorithm, Found. Comput. Math., vol. 7, no. 3, pp. 331-368, 2007.
[7] H. W. Engl, M. Hanke, and A. Neubauer, Regularization of Inverse Problems, Math. Appl. Dordrecht, The Netherlands: Kluwer Academic Publishers Group, 1996.
[8] L. Lo Gerfo, L. Rosasco, F. Odone, E. De Vito, and A. Verri, Spectral algorithms for supervised learning, Neural Computation, vol. 20, no. 7, pp. 1873-1897, 2008.
[9] S. Smale and D. X. Zhou, Learning theory estimates via integral operators and their approximations, Constr. Approx., vol. 26, no. 2, pp. 153-172, 2007.
[10] A. N. Tikhonov and V. Y. Arsenin, Solutions of Ill-posed Problems. Washington, DC: W. H. Winston, 1977.
[11] S. Smale and D. X. Zhou, Shannon sampling and function reconstruction from point values, Bull. Amer. Math. Soc., vol. 41, no. 3, pp. 279-305, 2004.
[12] S. Smale and D. X. Zhou, Shannon sampling II: Connections to learning theory, Appl. Comput. Harmon. Anal., vol. 19, no. 3, pp. 285-302, 2005.
[13] M. Belkin, P. Niyogi, and V. Sindhwani, Manifold regularization: A geometric framework for learning from labeled and unlabeled examples, J. Mach. Learn. Res., vol. 7, pp. 2399-2434, 2006.
[14] D. Düvelmeyer and B. Hofmann, A multi-parameter regularization approach for estimating parameters in jump diffusion processes, J. Inverse Ill-Posed Probl., vol. 14, no. 9, 2006.
[15] S. Lu and S. V. Pereverzev, Multi-parameter regularization and its numerical realization, Numer. Math., vol. 118, no. 1, pp. 1-31, 2011.
[16] S. Lu, S. Pereverzyev Jr., and S. Sivananthan, Multiparameter regularization for construction of extrapolating estimators in statistical learning theory, in Multiscale Signal Analysis and Modeling, New York: Springer, 2013.

[17] Y. Lu, L. Shen, and Y. Xu, Multi-parameter regularization methods for high-resolution image reconstruction with displacement errors, IEEE Trans. Circuits Syst. I: Regular Papers, vol. 54, no. 8, 2007.
[18] V. Naumova and S. V. Pereverzyev, Multi-penalty regularization with a component-wise penalization, Inverse Problems, vol. 29, no. 7, 2013.
[19] S. N. Wood, Modelling and smoothing parameter estimation with multiple quadratic penalties, J. R. Statist. Soc. B, vol. 62, pp. 413-428, 2000.
[20] P. Xu, Y. Fukuda, and Y. Liu, Multiple parameter regularization: Numerical solutions and applications to the determination of geopotential from precise satellite orbits, J. Geodesy, vol. 80, no. 1, pp. 17-27, 2006.
[21] Y. Luo, D. Tao, C. Xu, D. Li, and C. Xu, Vector-valued multi-view semi-supervised learning for multi-label image classification, in AAAI, 2013.
[22] Y. Luo, D. Tao, C. Xu, C. Xu, H. Liu, and Y. Wen, Multiview vector-valued manifold regularization for multilabel image classification, IEEE Trans. Neural Netw. Learn. Syst., vol. 24, no. 5, 2013.
[23] H. Q. Minh, L. Bazzani, and V. Murino, A unifying framework in vector-valued reproducing kernel Hilbert spaces for manifold regularization and co-regularized multi-view learning, J. Mach. Learn. Res., vol. 17, no. 25, pp. 1-72, 2016.
[24] H. Q. Minh and V. Sindhwani, Vector-valued manifold regularization, in International Conference on Machine Learning, 2011.
[25] Abhishake and S. Sivananthan, Multi-penalty regularization in learning theory, J. Complexity, vol. 36, pp. 141-165, 2016.
[26] F. Bauer and S. Kindermann, The quasi-optimality criterion for classical inverse problems, Inverse Problems, vol. 24, 2008.
[27] A. Caponnetto and Y. Yao, Cross-validation based adaptation for regularization operators in learning theory, Anal. Appl., vol. 8, no. 2, pp. 161-183, 2010.
[28] E. De Vito, S. Pereverzyev, and L. Rosasco, Adaptive kernel methods using the balancing principle, Found. Comput. Math., vol. 10, no. 4, 2010.
[29] V. A. Morozov, On the solution of functional equations by the method of regularization, Soviet Math. Dokl., vol. 7, 1966.
[30] J. Xie and J. Zou, An improved model function method for choosing regularization parameters in linear inverse problems, Inverse Problems, vol. 18, no. 3, 2002.
[31] S. Lu, S. V. Pereverzev, and U. Tautenhahn, A model function method in regularized total least squares, Appl. Anal., vol. 89, 2010.
[32] M. Fornasier, V. Naumova, and S. V. Pereverzyev, Parameter choice strategies for multipenalty regularization, SIAM J. Numer. Anal., vol. 52, no. 4, 2014.

[33] K. Ito, B. Jin, and T. Takeuchi, Multi-parameter Tikhonov regularization - an augmented approach, Chinese Ann. Math., vol. 35, no. 3, 2014.
[34] M. Belge, M. E. Kilmer, and E. L. Miller, Efficient determination of multiple regularization parameters in a generalized L-curve framework, Inverse Problems, vol. 18, pp. 1161-1183, 2002.
[35] F. Bauer and O. Ivanyshyn, Optimal regularization with two interdependent regularization parameters, Inverse Problems, vol. 23, 2007.
[36] F. Bauer and S. V. Pereverzev, An utilization of a rough approximation of a noise covariance within the framework of multi-parameter regularization, Int. J. Tomogr. Stat., vol. 4.
[37] Z. Chen, Y. Lu, Y. Xu, and H. Yang, Multi-parameter Tikhonov regularization for linear ill-posed operator equations, J. Comput. Math., vol. 26, 2008.
[38] C. Brezinski, M. Redivo-Zaglia, G. Rodriguez, and S. Seatzu, Multi-parameter regularization techniques for ill-conditioned linear systems, Numer. Math., vol. 94, no. 2, pp. 203-228, 2003.
[39] N. Aronszajn, Theory of reproducing kernels, Trans. Amer. Math. Soc., vol. 68, pp. 337-404, 1950.
[40] P. Mathé and S. V. Pereverzev, Geometry of linear ill-posed problems in variable Hilbert scales, Inverse Problems, vol. 19, no. 3, pp. 789-803, 2003.
[41] S. Lu, P. Mathé, and S. Pereverzyev, Balancing principle in supervised learning for a general regularization scheme, RICAM-Report, vol. 38, 2016.
[42] G. Blanchard and P. Mathé, Discrepancy principle for statistical inverse problems with application to conjugate gradient iteration, Inverse Problems, vol. 28, no. 11, p. 115011, 2012.
[43] E. De Vito, L. Rosasco, A. Caponnetto, U. De Giovannini, and F. Odone, Learning from examples as an inverse problem, J. Mach. Learn. Res., vol. 6, pp. 883-904, 2005.
[44] A. Rastogi and S. Sivananthan, Optimal rates for the regularized learning algorithms under general source condition, Front. Appl. Math. Stat., vol. 3, p. 3, 2017.
[45] S. Boucheron, O. Bousquet, and G. Lugosi, Theory of classification: A survey of some recent advances, ESAIM: Probability and Statistics, vol. 9, pp. 323-375, 2005.
[46] S. Lu and S. Pereverzev, Regularization Theory for Ill-posed Problems: Selected Topics, vol. 58. Berlin: Walter de Gruyter, 2013.
[47] A. M. Thompson and J. Kay, On some choices of regularization parameter in image restoration, Inverse Problems, vol. 9, pp. 749-761, 1993.
[48] B. Jin and J. Zou, Augmented Tikhonov regularization, Inverse Problems, vol. 25, no. 2, 2009.
[49] C. A. Micchelli and M. Pontil, Learning the kernel function via regularization, J. Mach. Learn. Res., vol. 6, pp. 1099-1125, 2005.


More information

A note on the multiplication of sparse matrices

A note on the multiplication of sparse matrices Cent. Eur. J. Cop. Sci. 41) 2014 1-11 DOI: 10.2478/s13537-014-0201-x Central European Journal of Coputer Science A note on the ultiplication of sparse atrices Research Article Keivan Borna 12, Sohrab Aboozarkhani

More information

Konrad-Zuse-Zentrum für Informationstechnik Berlin Heilbronner Str. 10, D Berlin - Wilmersdorf

Konrad-Zuse-Zentrum für Informationstechnik Berlin Heilbronner Str. 10, D Berlin - Wilmersdorf Konrad-Zuse-Zentru für Inforationstechnik Berlin Heilbronner Str. 10, D-10711 Berlin - Wilersdorf Folkar A. Borneann On the Convergence of Cascadic Iterations for Elliptic Probles SC 94-8 (Marz 1994) 1

More information

Consistent Multiclass Algorithms for Complex Performance Measures. Supplementary Material

Consistent Multiclass Algorithms for Complex Performance Measures. Supplementary Material Consistent Multiclass Algoriths for Coplex Perforance Measures Suppleentary Material Notations. Let λ be the base easure over n given by the unifor rando variable (say U over n. Hence, for all easurable

More information

Asynchronous Gossip Algorithms for Stochastic Optimization

Asynchronous Gossip Algorithms for Stochastic Optimization Asynchronous Gossip Algoriths for Stochastic Optiization S. Sundhar Ra ECE Dept. University of Illinois Urbana, IL 680 ssrini@illinois.edu A. Nedić IESE Dept. University of Illinois Urbana, IL 680 angelia@illinois.edu

More information

A Simplified Analytical Approach for Efficiency Evaluation of the Weaving Machines with Automatic Filling Repair

A Simplified Analytical Approach for Efficiency Evaluation of the Weaving Machines with Automatic Filling Repair Proceedings of the 6th SEAS International Conference on Siulation, Modelling and Optiization, Lisbon, Portugal, Septeber -4, 006 0 A Siplified Analytical Approach for Efficiency Evaluation of the eaving

More information

RANDOM GRADIENT EXTRAPOLATION FOR DISTRIBUTED AND STOCHASTIC OPTIMIZATION

RANDOM GRADIENT EXTRAPOLATION FOR DISTRIBUTED AND STOCHASTIC OPTIMIZATION RANDOM GRADIENT EXTRAPOLATION FOR DISTRIBUTED AND STOCHASTIC OPTIMIZATION GUANGHUI LAN AND YI ZHOU Abstract. In this paper, we consider a class of finite-su convex optiization probles defined over a distributed

More information

Neural Network Learning as an Inverse Problem

Neural Network Learning as an Inverse Problem Neural Network Learning as an Inverse Proble VĚRA KU RKOVÁ, Institute of Coputer Science, Acadey of Sciences of the Czech Republic, Pod Vodárenskou věží 2, 182 07 Prague 8, Czech Republic. Eail: vera@cs.cas.cz

More information

RECOVERY OF A DENSITY FROM THE EIGENVALUES OF A NONHOMOGENEOUS MEMBRANE

RECOVERY OF A DENSITY FROM THE EIGENVALUES OF A NONHOMOGENEOUS MEMBRANE Proceedings of ICIPE rd International Conference on Inverse Probles in Engineering: Theory and Practice June -8, 999, Port Ludlow, Washington, USA : RECOVERY OF A DENSITY FROM THE EIGENVALUES OF A NONHOMOGENEOUS

More information

Computational and Statistical Learning Theory

Computational and Statistical Learning Theory Coputational and Statistical Learning Theory TTIC 31120 Prof. Nati Srebro Lecture 2: PAC Learning and VC Theory I Fro Adversarial Online to Statistical Three reasons to ove fro worst-case deterinistic

More information

Support Vector Machines. Machine Learning Series Jerry Jeychandra Blohm Lab

Support Vector Machines. Machine Learning Series Jerry Jeychandra Blohm Lab Support Vector Machines Machine Learning Series Jerry Jeychandra Bloh Lab Outline Main goal: To understand how support vector achines (SVMs) perfor optial classification for labelled data sets, also a

More information

Least Squares Fitting of Data

Least Squares Fitting of Data Least Squares Fitting of Data David Eberly, Geoetric Tools, Redond WA 98052 https://www.geoetrictools.co/ This work is licensed under the Creative Coons Attribution 4.0 International License. To view a

More information

Learning, Regularization and Ill-Posed Inverse Problems

Learning, Regularization and Ill-Posed Inverse Problems Learning, Regularization and Ill-Posed Inverse Problems Lorenzo Rosasco DISI, Università di Genova rosasco@disi.unige.it Andrea Caponnetto DISI, Università di Genova caponnetto@disi.unige.it Ernesto De

More information

OPTIMIZATION in multi-agent networks has attracted

OPTIMIZATION in multi-agent networks has attracted Distributed constrained optiization and consensus in uncertain networks via proxial iniization Kostas Margellos, Alessandro Falsone, Sione Garatti and Maria Prandini arxiv:603.039v3 [ath.oc] 3 May 07 Abstract

More information

Optimal Jamming Over Additive Noise: Vector Source-Channel Case

Optimal Jamming Over Additive Noise: Vector Source-Channel Case Fifty-first Annual Allerton Conference Allerton House, UIUC, Illinois, USA October 2-3, 2013 Optial Jaing Over Additive Noise: Vector Source-Channel Case Erah Akyol and Kenneth Rose Abstract This paper

More information

On the Communication Complexity of Lipschitzian Optimization for the Coordinated Model of Computation

On the Communication Complexity of Lipschitzian Optimization for the Coordinated Model of Computation journal of coplexity 6, 459473 (2000) doi:0.006jco.2000.0544, available online at http:www.idealibrary.co on On the Counication Coplexity of Lipschitzian Optiization for the Coordinated Model of Coputation

More information

Algebraic Montgomery-Yang problem: the log del Pezzo surface case

Algebraic Montgomery-Yang problem: the log del Pezzo surface case c 2014 The Matheatical Society of Japan J. Math. Soc. Japan Vol. 66, No. 4 (2014) pp. 1073 1089 doi: 10.2969/jsj/06641073 Algebraic Montgoery-Yang proble: the log del Pezzo surface case By DongSeon Hwang

More information

This model assumes that the probability of a gap has size i is proportional to 1/i. i.e., i log m e. j=1. E[gap size] = i P r(i) = N f t.

This model assumes that the probability of a gap has size i is proportional to 1/i. i.e., i log m e. j=1. E[gap size] = i P r(i) = N f t. CS 493: Algoriths for Massive Data Sets Feb 2, 2002 Local Models, Bloo Filter Scribe: Qin Lv Local Models In global odels, every inverted file entry is copressed with the sae odel. This work wells when

More information

Compressive Distilled Sensing: Sparse Recovery Using Adaptivity in Compressive Measurements

Compressive Distilled Sensing: Sparse Recovery Using Adaptivity in Compressive Measurements 1 Copressive Distilled Sensing: Sparse Recovery Using Adaptivity in Copressive Measureents Jarvis D. Haupt 1 Richard G. Baraniuk 1 Rui M. Castro 2 and Robert D. Nowak 3 1 Dept. of Electrical and Coputer

More information

Machine Learning: Fisher s Linear Discriminant. Lecture 05

Machine Learning: Fisher s Linear Discriminant. Lecture 05 Machine Learning: Fisher s Linear Discriinant Lecture 05 Razvan C. Bunescu chool of Electrical Engineering and Coputer cience bunescu@ohio.edu Lecture 05 upervised Learning ask learn an (unkon) function

More information

On Conditions for Linearity of Optimal Estimation

On Conditions for Linearity of Optimal Estimation On Conditions for Linearity of Optial Estiation Erah Akyol, Kuar Viswanatha and Kenneth Rose {eakyol, kuar, rose}@ece.ucsb.edu Departent of Electrical and Coputer Engineering University of California at

More information

The Hilbert Schmidt version of the commutator theorem for zero trace matrices

The Hilbert Schmidt version of the commutator theorem for zero trace matrices The Hilbert Schidt version of the coutator theore for zero trace atrices Oer Angel Gideon Schechtan March 205 Abstract Let A be a coplex atrix with zero trace. Then there are atrices B and C such that

More information

MSEC MODELING OF DEGRADATION PROCESSES TO OBTAIN AN OPTIMAL SOLUTION FOR MAINTENANCE AND PERFORMANCE

MSEC MODELING OF DEGRADATION PROCESSES TO OBTAIN AN OPTIMAL SOLUTION FOR MAINTENANCE AND PERFORMANCE Proceeding of the ASME 9 International Manufacturing Science and Engineering Conference MSEC9 October 4-7, 9, West Lafayette, Indiana, USA MSEC9-8466 MODELING OF DEGRADATION PROCESSES TO OBTAIN AN OPTIMAL

More information

Lecture 21. Interior Point Methods Setup and Algorithm

Lecture 21. Interior Point Methods Setup and Algorithm Lecture 21 Interior Point Methods In 1984, Kararkar introduced a new weakly polynoial tie algorith for solving LPs [Kar84a], [Kar84b]. His algorith was theoretically faster than the ellipsoid ethod and

More information

Lecture 9: Multi Kernel SVM

Lecture 9: Multi Kernel SVM Lecture 9: Multi Kernel SVM Stéphane Canu stephane.canu@litislab.eu Sao Paulo 204 April 6, 204 Roadap Tuning the kernel: MKL The ultiple kernel proble Sparse kernel achines for regression: SVR SipleMKL:

More information

A Theoretical Framework for Deep Transfer Learning

A Theoretical Framework for Deep Transfer Learning A Theoretical Fraewor for Deep Transfer Learning Toer Galanti The School of Coputer Science Tel Aviv University toer22g@gail.co Lior Wolf The School of Coputer Science Tel Aviv University wolf@cs.tau.ac.il

More information

Inspection; structural health monitoring; reliability; Bayesian analysis; updating; decision analysis; value of information

Inspection; structural health monitoring; reliability; Bayesian analysis; updating; decision analysis; value of information Cite as: Straub D. (2014). Value of inforation analysis with structural reliability ethods. Structural Safety, 49: 75-86. Value of Inforation Analysis with Structural Reliability Methods Daniel Straub

More information