A Unified Approach to Universal Prediction: Generalized Upper and Lower Bounds


IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, VOL. 26, NO. 3, MARCH 2015

Nuri Denizcan Vanli and Suleyman S. Kozat, Senior Member, IEEE

Manuscript received July 2013; revised January 2014 and April 2014; accepted April 2014; date of current version February 2015. This work was supported in part by the IBM Faculty Award and in part by TUBITAK. The authors are with the Department of Electrical and Electronics Engineering, Bilkent University, Ankara 06800, Turkey (e-mail: vanli@ee.bilkent.edu.tr; kozat@ee.bilkent.edu.tr).

Abstract—We study sequential prediction of real-valued, arbitrary, and unknown sequences under the squared error loss, compared against the best parametric predictor out of a large, continuous class of predictors. Inspired by recent results from computational learning theory, we refrain from any statistical assumptions and define the performance with respect to the class of general parametric predictors. In particular, we present generic lower and upper bounds on this relative performance by transforming the prediction task into a parameter learning problem. We first introduce the lower bounds on this relative performance in the mixture of experts framework, where we show that for any sequential algorithm, there always exists a sequence for which the performance of the sequential algorithm is lower bounded by zero. We then introduce a sequential learning algorithm to predict such arbitrary and unknown sequences, and calculate upper bounds on its total squared prediction error for every bounded sequence. We further show that in some scenarios, we achieve matching lower and upper bounds, demonstrating that our algorithms are optimal in a strong minimax sense such that their performances cannot be improved further. As an interesting result, we also prove that for the worst-case scenario, the performance of randomized output algorithms can be achieved by sequential algorithms, so that randomized output algorithms do not improve the performance.

Index Terms—Online learning, sequential prediction, worst-case performance.

I. INTRODUCTION

In this brief, we investigate the generic sequential (online) prediction problem from an individual sequence perspective using tools of computational learning theory, where we refrain from any statistical assumptions either in modeling or on signals [1]–[4]. In this approach, we have an arbitrary, deterministic, bounded, and unknown signal $\{x[t]\}_{t \ge 1}$, where $|x[t]| < A < \infty$ and $x[t] \in \mathbb{R}$. Since we do not impose any statistical assumptions on the underlying data, we, motivated by recent results from sequential learning [1]–[4], define the performance of a sequential algorithm with respect to a comparison class, where the predictors of the comparison class are formed by observing the entire sequence in hindsight, under the squared error loss, that is,

$\sum_{t=1}^{n} (x[t] - \hat{x}_s[t])^2 - \inf_{c \in \mathcal{C}} \sum_{t=1}^{n} (x[t] - \hat{x}_c[t])^2$   (1)

for an arbitrary length of data $n$ and for any possible sequence $\{x[t]\}_{t \ge 1}$, where $\hat{x}_s[t]$ is the prediction at time $t$ of any sequential algorithm that has access to the data from $x[1]$ up to $x[t-1]$ for prediction, and $\hat{x}_c[t]$ is the prediction at time $t$ of the predictor $c$ such that $c \in \mathcal{C}$, where $\mathcal{C}$ represents the class of predictors we compete against. We emphasize that since the predictors $\hat{x}_c[t]$, $c \in \mathcal{C}$, have access to the entire sequence before the processing starts, the minimum squared prediction error that can be achieved with a sequential predictor $\hat{x}_s[t]$ is, at best, equal to the squared prediction error of the optimal batch predictor $\hat{x}_c[t]$, $c \in \mathcal{C}$. Here, we call the difference in the squared prediction errors of the sequential algorithm $\hat{x}_s[t]$ and the optimal batch predictor $\hat{x}_c[t]$, $c \in \mathcal{C}$, the regret of not using the optimal predictor (or, equivalently, of not knowing the future). Therefore, we seek sequential algorithms $\hat{x}_s[t]$ that minimize this regret, or loss, for any possible $\{x[t]\}_{t \ge 1}$. We emphasize that this regret definition is for the accumulated sequential cost, instead of the batch cost.
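To make the regret in (1) concrete, the following sketch (not part of the brief; the comparison class, the example sequence, and the naive sequential predictor are all illustrative choices) accumulates the loss of a sequential algorithm and subtracts the loss of the best predictor chosen in hindsight from a simple comparison class, here first-order linear predictors $\hat{x}_c[t] = c\,x[t-1]$.

```python
import numpy as np

def regret(x, seq_pred):
    """Regret (1): accumulated squared error of a sequential predictor minus
    that of the best batch predictor, here over the class of first-order
    linear predictors x_hat_c[t] = c * x[t-1] fitted in hindsight."""
    n = len(x)
    seq_loss = 0.0
    batch_num = batch_den = 0.0
    x_prev = 0.0                                     # before any data is seen
    for t in range(n):
        seq_loss += (x[t] - seq_pred(x[:t])) ** 2    # predictor sees x[1..t-1] only
        batch_num += x[t] * x_prev                   # accumulate for the hindsight fit
        batch_den += x_prev ** 2
        x_prev = x[t]
    c_star = batch_num / batch_den if batch_den > 0 else 0.0
    batch_loss = sum((x[t] - c_star * (x[t - 1] if t > 0 else 0.0)) ** 2 for t in range(n))
    return seq_loss - batch_loss

# Example: a naive "predict the previous sample" sequential algorithm.
rng = np.random.default_rng(0)
x = np.clip(np.cumsum(rng.normal(size=200)) * 0.05, -1.0, 1.0)   # bounded |x[t]| <= A = 1
print(regret(x, lambda past: past[-1] if len(past) else 0.0))
```

Any sequential algorithm can be plugged in through `seq_pred`, which only ever receives $x[1], \ldots, x[t-1]$.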
Instead of fixing a comparison class of predictors, we parameterize the comparison classes such that the parameter set and functional form of these classes can be chosen as desired. In this sense, in this brief, we consider the most general class of parametric predictors as our class of predictors $\mathcal{C}$, such that the regret for an arbitrary length of data $n$ is given by

$\sum_{t=1}^{n} (x[t] - \hat{x}_s[t])^2 - \inf_{\mathbf{w} \in \mathbb{R}^m} \sum_{t=1}^{n} \big(x[t] - f(\mathbf{w}, x_{t-a}^{t-1})\big)^2$   (2)

where $f(\mathbf{w}, \cdot)$ is a parametric function whose parameters $\mathbf{w} = [w_1, \ldots, w_m]^T$ can be set prior to prediction, and this function uses the data $x_{t-a}^{t-1}$ for prediction for some arbitrary integer $a$, which can be viewed as the tap size of the predictor. Although the parameters of the parametric prediction function $f(\mathbf{w}, \cdot)$ can be set arbitrarily, even by observing all the data $\{x[t]\}_{t \ge 1}$ a priori, the function is naturally restricted to use only the sequential data in prediction [5]–[7].

Since we have no statistical assumptions on the underlying data, the corresponding lower and upper bounds on the regret in (2) provide the ultimate measure of the learning performance for any sequential predictor. We emphasize that lower bounds not only provide the worst-case performance of an algorithm, but also quantify the prediction power of the parametric class. As such, a positive lower bound guarantees the existence of a data sequence, of arbitrary length, such that no matter how smart the learning algorithm is, its performance on this sequence will be worse than that of the class of parametric predictors by at least the order of the lower bound. Hence, if an algorithm is found such that the upper bound on its regret matches the lower bound, then that algorithm is optimal in a strong minimax sense: its worst-case convergence performance cannot be further improved [7]. To this end, the minimax-sense optimality of different parametric learning algorithms, such as the well-known least mean squares (LMS) [8] and recursive least squares (RLS) [8] prediction algorithms and the online sequential extreme learning machine of [1], can be determined using the lower bounds provided in this brief. In this sense, the rates of the corresponding upper and lower bounds are analogous to the VC dimension [9] of classifiers and can be used to quantify the learning performance [1]–[3], [10].

Throughout, all vectors are column vectors and are denoted by boldface lowercase letters. For a vector $\mathbf{u}$, $\mathbf{u}^T$ is the ordinary transpose. We denote $x_a^b \triangleq \{x[t]\}_{t=a}^{b}$.

Various sequential learning algorithms have been proposed in [1], [7], [8], [10]–[12], and [13] in order to efficiently learn the relationship between the observations and the desired data. One of the simplest methods is to linearly model this relationship, i.e., $f(\mathbf{w}[t], x_{t-a}^{t-1}) = \mathbf{w}[t]^T x_{t-a}^{t-1}$, and then update $\mathbf{w}[t]$ using well-known algorithms such as the LMS or RLS algorithms [1], [8]. In more recent studies [7], [12], universal algorithms have been proposed that achieve the performance of the optimal weighting vector without any statistical assumptions. Kivinen and Warmuth [10] proposed a multiplicative update of the weights and provided guaranteed upper bounds on the performance of their algorithm. On the other hand, in order to introduce nonlinear modeling, similar learning methods are usually extended either by mapping the observations to higher dimensions, as in polynomial and Volterra filters [11], or by partitioning the observation space and fitting linear models in each partition, i.e., piecewise linear modeling [13]. In order to derive upper and lower bounds on the performance of such learning algorithms, the mixture of experts framework is usually used. As examples, linear prediction [5], [7], [12], nonlinear models based on piecewise linear approximations [13], and the learning of an individual noise-corrupted deterministic sequence [14] have been studied. These results were then extended to filtering problems [15], [16]. In this brief, on the other hand, we consider a holistic approach and provide upper and lower bounds for the general framework, which was previously missing in the literature.
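The linear modeling with LMS-type updates mentioned above can be written in a few lines; the following is a generic sketch of the standard LMS recursion (the tap size `m`, the step size `mu`, and the test signal are free choices for illustration, not values prescribed in this brief).

```python
import numpy as np

def lms_predict(x, m=4, mu=0.05):
    """Sequential linear prediction x_hat[t] = w[t]^T [x[t-1], ..., x[t-m]]
    with the standard LMS (stochastic gradient) weight update."""
    n = len(x)
    w = np.zeros(m)
    x_hat = np.zeros(n)
    for t in range(n):
        u = np.array([x[t - i] if t - i >= 0 else 0.0 for i in range(1, m + 1)])  # regressor
        x_hat[t] = w @ u                       # predict before seeing x[t]
        e = x[t] - x_hat[t]                    # prediction error
        w += mu * e * u                        # LMS gradient step
    return x_hat

rng = np.random.default_rng(1)
x = np.sin(0.2 * np.arange(300)) + 0.1 * rng.normal(size=300)
print(np.mean((x - lms_predict(x)) ** 2))
```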
Our main contribution in this brief is to obtain generalized lower bounds for a variety of prediction frameworks by transforming the prediction problem into a well-known and well-studied statistical parameter learning problem [1], [4]–[7]. By doing so, we prove that for any sequential algorithm there always exists some data sequence, of any length, such that the regret of the sequential algorithm is lower bounded by zero. We further derive lower bounds for important classes of predictors heavily investigated in the machine learning literature, including univariate polynomial, multivariate polynomial, and linear predictors [4]–[7], [10]–[12], [14]. We also provide a universal sequential prediction algorithm, calculate upper bounds on the regret of this algorithm, and show that we obtain matching lower and upper bounds in some scenarios. As an interesting result, we also show that, given the regret in (2) as the performance measure, there is no additional gain achieved by using randomized algorithms in the worst-case scenario.

The rest of this brief is organized as follows. In Section II, we first present general lower bounds and then analyze a couple of specific scenarios. We then introduce a universal prediction algorithm and calculate upper bounds on its regret in Section III. In Section IV, we show that in the worst-case scenario, the performance of randomized algorithms can be achieved by sequential algorithms. Finally, conclusions are drawn in Section V.

II. LOWER BOUNDS

In this section, we investigate the worst-case performance of sequential algorithms to obtain guaranteed lower bounds on the regret. Hence, for any arbitrary length of data $n$ and $\{x[t]\}_{t \ge 1}$, we are trying to find a lower bound on the following:

$\sup_{x_1^n} \Big\{ \sum_{t=1}^{n} (x[t] - \hat{x}_s[t])^2 - \inf_{\mathbf{w} \in \mathbb{R}^m} \sum_{t=1}^{n} \big(x[t] - f(\mathbf{w}, x_{t-a}^{t-1})\big)^2 \Big\}.$

For this regret, we have the following theorem, which relates the performance of any sequential algorithm to the general class of parametric predictors. While proving this theorem, we also provide a generic procedure to find lower bounds on the regret in (2) and later use this method to derive lower bounds for parametric classes, including the classes of univariate polynomial, multivariate polynomial, and linear predictors [4]–[7], [10]–[12], [14].

Theorem 1: There is no best sequential algorithm for all sequences for any class in the parametric form $f(\mathbf{w}, \cdot)$, where $\mathbf{w} \in \mathbb{R}^m$. Given a parametric class, there always exists a sequence such that the regret in (2) is lower bounded by some nonnegative value.

This theorem implies that no matter how smart a sequential algorithm is, or how naive the competition class is, it is not possible to outperform the competition class for all sequences. As an example, this result demonstrates that even when competing against the class of constant predictors, i.e., the most naive competition class, where $\hat{x}_c[t]$ always predicts a constant value, any sequential algorithm, no matter how smart, cannot outperform this class of constant predictors for all sequences. We emphasize that, in this sense, the lower bounds quantify the prediction and modeling power of the parametric class.
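A small numerical illustration of this point (my own construction, not taken from the brief): let an adversary always place $x[t] \in \{-A, A\}$ on the side farther from the sequential prediction. Then every sequential prediction incurs squared error at least $A^2$, while the best constant predictor chosen in hindsight incurs total loss $nA^2 - n\bar{x}^2$, so the regret is nonnegative, and strictly positive whenever the generated sequence is unbalanced.

```python
import numpy as np

A = 1.0
n = 500

def seq_pred(past):
    # any sequential algorithm; here, the running mean of the past samples
    return float(np.mean(past)) if len(past) else 0.0

x = []
for t in range(n):
    p = seq_pred(x)
    x.append(-A if p >= 0 else A)        # adversary: pick the point farther from p
x = np.array(x)

seq_loss = sum((x[t] - seq_pred(x[:t])) ** 2 for t in range(n))
c_star = x.mean()                        # best constant predictor in hindsight
batch_loss = np.sum((x - c_star) ** 2)   # equals n*A^2 - n*mean(x)^2 here
print(seq_loss - batch_loss)             # regret; always >= 0 for this construction
```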

Proof of Theorem 1: We begin our proof by pointing out that finding the best sequential predictor for an arbitrary and unknown sequence $x_1^n$ is not straightforward. Yet, for a specific distribution on $x_1^n$, the best predictor under the squared error is the conditional mean [17]. Therefore, through this transformation, we are able to evaluate the regret in (2) in the expectation sense and prove this theorem. Since the supremum in (2) is taken over all $x_1^n$, for any distribution on $x_1^n$ the regret is lower bounded by

$\sup_{x_1^n} \Big\{ \sum_{t=1}^{n} (x[t] - \hat{x}_s[t])^2 - \inf_{\mathbf{w} \in \mathbb{R}^m} \sum_{t=1}^{n} \big(x[t] - f(\mathbf{w}, x_{t-a}^{t-1})\big)^2 \Big\} \ge E_{x_1^n}\Big[ \sum_{t=1}^{n} (x[t] - \hat{x}_s[t])^2 - \inf_{\mathbf{w} \in \mathbb{R}^m} \sum_{t=1}^{n} \big(x[t] - f(\mathbf{w}, x_{t-a}^{t-1})\big)^2 \Big] \triangleq L(n)$

where the expectation is taken with respect to this particular distribution. Hence, it is enough to lower bound $L(n)$ to get a final lower bound. By the linearity of expectation,

$L(n) = E_{x_1^n}\Big[ \sum_{t=1}^{n} (x[t] - \hat{x}_s[t])^2 \Big] - E_{x_1^n}\Big[ \inf_{\mathbf{w} \in \mathbb{R}^m} \sum_{t=1}^{n} \big(x[t] - f(\mathbf{w}, x_{t-a}^{t-1})\big)^2 \Big].$   (3)

The squared error loss $E[(x[t] - \hat{x}_s[t])^2]$ is minimized by the well-known minimum mean squared error (MMSE) predictor, given by [17]

$\hat{x}_s[t] = E\big[x[t] \mid x[t-1], \ldots, x[1]\big] \triangleq E\big[x[t] \mid x_1^{t-1}\big]$   (4)

where we drop the explicit $x_1^n$-dependence of the expectation to simplify the presentation. Suppose we select a parametric distribution for $x_1^n$ with parameter vector $\theta = [\theta_1, \ldots, \theta_m]^T$. Then, for the second term in (3), we use the following inequality:

$E_{\theta}\Big[ E_{x_1^n \mid \theta}\Big[ \inf_{\mathbf{w} \in \mathbb{R}^m} \sum_{t=1}^{n} \big(x[t] - f(\mathbf{w}, x_{t-a}^{t-1})\big)^2 \Big] \Big] \le E_{\theta}\Big[ \inf_{\mathbf{w} \in \mathbb{R}^m} E_{x_1^n \mid \theta}\Big[ \sum_{t=1}^{n} \big(x[t] - f(\mathbf{w}, x_{t-a}^{t-1})\big)^2 \Big] \Big].$   (5)

By using (4) and (5) and expanding the expectation, we can lower bound $L(n)$ as

$L(n) \ge E_{\theta}\Big[ E_{x_1^n \mid \theta}\Big[ \sum_{t=1}^{n} \big(x[t] - E[x[t] \mid x_1^{t-1}]\big)^2 \Big] \Big] - E_{\theta}\Big[ \inf_{\mathbf{w} \in \mathbb{R}^m} E_{x_1^n \mid \theta}\Big[ \sum_{t=1}^{n} \big(x[t] - f(\mathbf{w}, x_{t-a}^{t-1})\big)^2 \Big] \Big].$   (6)

The inequality in (6) is true for any distribution on $x_1^n$. Hence, for a distribution on $x_1^n$ such that

$E\big[x[t] \mid x_1^{t-1}, \theta\big] = h(\theta, x_{t-a}^{t-1})$   (7)

with some function $h$, if we can find a vector function $g(\theta)$ satisfying $f(g(\theta), x_{t-a}^{t-1}) = h(\theta, x_{t-a}^{t-1})$, then the last term in (6) yields

$E_{\theta}\Big[ \inf_{\mathbf{w} \in \mathbb{R}^m} E_{x_1^n \mid \theta}\Big[ \sum_{t=1}^{n} \big(x[t] - f(\mathbf{w}, x_{t-a}^{t-1})\big)^2 \Big] \Big] \le E_{\theta}\Big[ E_{x_1^n \mid \theta}\Big[ \sum_{t=1}^{n} \big(x[t] - h(\theta, x_{t-a}^{t-1})\big)^2 \Big] \Big].$

Thus, (6) can be written as

$L(n) \ge E_{\theta}\Big[ E_{x_1^n \mid \theta}\Big[ \sum_{t=1}^{n} \big(x[t] - E[x[t] \mid x_1^{t-1}]\big)^2 \Big] \Big] - E_{\theta}\Big[ E_{x_1^n \mid \theta}\Big[ \sum_{t=1}^{n} \big(x[t] - E[x[t] \mid x_1^{t-1}, \theta]\big)^2 \Big] \Big]$

which, by the definition of the MMSE estimator, is always lower bounded by zero, i.e., $L(n) \ge 0$, since conditioning on the additional parameter $\theta$ cannot increase the MMSE. By this inequality, we conclude that for predictors of the form $f(\mathbf{w}, \cdot)$ for which such a parametric distribution, i.e., a mapping $\mathbf{w} = g(\theta)$, exists, the best sequential predictor will always be outperformed by some predictor in this class on some sequence $x_1^n$. Hence, there is no best algorithm for all sequences for any class in this parametric form.

The remaining question is whether a suitable distribution on $x_1^n$ can be found for a given $f(\mathbf{w}, \cdot)$ such that $f(g(\theta), \cdot) = h(\theta, \cdot)$ with a suitable transformation $g(\theta)$. Suppose $f(\mathbf{w}, \cdot)$ is bounded by some $0 < M < \infty$ for all $|x[t]| \le A$, i.e., $|f(\mathbf{w}, x_{t-a}^{t-1})| \le M$. Then, given $\theta$ drawn from a beta distribution with parameters $(C, C)$, $C \in \mathbb{R}^+$, we generate a sequence $x_1^n$ such that $x[t] = (A/M) f(\mathbf{w}, x_{t-a}^{t-1})$ with probability $\theta$ and $x[t] = -(A/M) f(\mathbf{w}, x_{t-a}^{t-1})$ with probability $(1-\theta)$. Then

$E\big[x[t] \mid x_1^{t-1}, \theta\big] = \frac{A}{M}(2\theta - 1) f(\mathbf{w}, x_{t-a}^{t-1}).$

Hence, this concludes the proof of the Theorem.
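The construction used in the proof can be simulated directly. In the sketch below (the choices of $f$, $\mathbf{w}$, $M$, $C$, and $A$ are arbitrary placeholders), $\theta$ is drawn from a Beta$(C, C)$ prior, the sequence is generated as $x[t] = \pm (A/M) f(\mathbf{w}, x[t-1])$, and the empirical mean of $x[t] / \big((A/M) f(\mathbf{w}, x[t-1])\big)$ is compared with $2\theta - 1$, as claimed for the conditional mean.

```python
import numpy as np

rng = np.random.default_rng(2)
A, M, C, n = 1.0, 2.0, 1.0, 50_000
w = 0.7
f = lambda w, u: w * (1.0 + 0.5 * np.sin(u))     # any prediction function with |f| <= M

theta = rng.beta(C, C)                           # theta ~ Beta(C, C)
x = np.zeros(n)
x[0] = A
for t in range(1, n):
    mag = (A / M) * f(w, x[t - 1])
    x[t] = mag if rng.random() < theta else -mag  # +mag w.p. theta, -mag w.p. (1 - theta)

# E[x[t] | x[t-1], theta] = (A/M)(2*theta - 1) f(w, x[t-1]); equivalently the ratio
# x[t] / ((A/M) f(w, x[t-1])) is +1 w.p. theta and -1 otherwise, so it averages to 2*theta - 1.
ratios = x[1:] / ((A / M) * f(w, x[:-1]))
print(ratios.mean(), 2 * theta - 1)
```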
As an important special case, if we use a restricted functional form $f(\mathbf{w}, \cdot)$ so that $f(\mathbf{w}, \cdot)$ is separable, then the prediction problem is transformed into a parameter estimation problem. The separable form is given by $f(\mathbf{w}, x_{t-a}^{t-1}) = f_w(\mathbf{w})^T f_x(x_{t-a}^{t-1})$, where $f_w(\mathbf{w})$ and $f_x(x_{t-a}^{t-1})$ are vector functions of size $m$ for some integer $m$. Then, (7) can be written as

$E\big[x[t] \mid x_1^{t-1}, \theta\big] = f_w(g(\theta))^T f_x(x_{t-a}^{t-1})$

where $f_w(g(\theta)) = (A/M)(2\theta - 1) f_w(\mathbf{w})$. Denoting $f_n(\mathbf{w}) \triangleq (A/M) f_w(\mathbf{w})$ as the normalized prediction function, and after some algebra, (6) becomes

$L(n) \ge E_{\theta}\Big[ E_{x_1^n \mid \theta}\Big[ \sum_{t=1}^{n} \big(x[t] - E[(2\theta - 1) \mid x_1^{t-1}]\, f_n(\mathbf{w})^T f_x(x_{t-a}^{t-1})\big)^2 \Big] \Big] - E_{\theta}\Big[ E_{x_1^n \mid \theta}\Big[ \sum_{t=1}^{n} \big(x[t] - (2\theta - 1)\, f_n(\mathbf{w})^T f_x(x_{t-a}^{t-1})\big)^2 \Big] \Big]$

so that the regret of the sequential algorithm over the best prediction function is due to the regret attained by the sequential algorithm while learning the parameters of the prediction function, i.e., the parameters of the underlying distribution. To illustrate this procedure, we investigate the regret in (2) for three candidate function classes that are widely studied in computational learning theory.

A. mth-Order Univariate Polynomial Prediction

For an $m$th-order polynomial in $x[t-1]$, the regret is given by

$\sup_{x_1^n} \Big\{ \sum_{t=1}^{n} (x[t] - \hat{x}_s[t])^2 - \inf_{\mathbf{w} \in \mathbb{R}^m} \sum_{t=1}^{n} \Big(x[t] - \sum_{i=1}^{m} w_i\, x^i[t-1]\Big)^2 \Big\}$   (8)

where $\hat{x}_s[t]$ is the prediction at time $t$ of any sequential algorithm that has access to the data from $x[1]$ up to $x[t-1]$ for prediction, $\mathbf{w} = [w_1, \ldots, w_m]^T$ is the parameter vector, and $x^i[t-1]$ is the $i$th power of $x[t-1]$. Since $\sum_{i=1}^{m} w_i\, x^i[t-1] = w_1 x[t-1]$ with an appropriate selection of $\mathbf{w}$, we can lower bound the regret in (8) by considering the following distribution on $x_1^n$. Given $\theta$ drawn from a beta distribution with parameters $(C, C)$, $C \in \mathbb{R}^+$, we generate a sequence $x_1^n$ having only two values, $A$ and $-A$, such that $x[t] = x[t-1]$ with probability $\theta$ and $x[t] = -x[t-1]$ with probability $(1-\theta)$. Then, $E[x[t] \mid x_1^{t-1}, \theta] = (2\theta - 1)\, x[t-1]$, giving $h(\theta, x_{t-a}^{t-1}) = (2\theta - 1)\, x[t-1]$. Since the MMSE given $\theta$ is linear in $x[t-1]$, the optimum $\mathbf{w}$ that minimizes the accumulated error for this distribution is $\mathbf{w} = [(2\theta - 1), 0, \ldots, 0]^T$. After following the lines in [5], we obtain a lower bound of the form $O(\ln(n))$.

B. Multivariate Polynomial Prediction

Suppose the prediction function is given by $\mathbf{w}^T f_x(x_{t-a}^{t-1}) = \sum_{k=1}^{m} w_k f_k(x_{t-r}^{t-1})$, where each $f_k(x_{t-r}^{t-1})$ is a multivariate polynomial function (as an example, $f_k(x_{t-r}^{t-1}) = x[t-1]\,x[t-2]/x[t-3]$), and the regret is taken over all $\mathbf{w} = [w_1, \ldots, w_m]^T \in \mathbb{R}^m$, that is,

$\sup_{x_1^n} \Big\{ \sum_{t=1}^{n} (x[t] - \hat{x}_s[t])^2 - \inf_{\mathbf{w} \in \mathbb{R}^m} \sum_{t=1}^{n} \big(x[t] - \mathbf{w}^T f_x(x_{t-a}^{t-1})\big)^2 \Big\}$

where $\hat{x}_s[t]$ is the prediction at time $t$ of any sequential algorithm that has access to the data from $x[1]$ up to $x[t-1]$ for prediction, and $\mathbf{w}$ is the parameter vector for prediction. We emphasize that this class of predictors is not only a superset of the univariate polynomial predictors, but is also widely used in many signal processing applications to model nonlinearity, e.g., in Volterra filters [11]. This filtering technique is attractive when linear filtering techniques do not provide satisfactory results, and it includes cross products of the input signals.
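For intuition on this class, a second-order Volterra-type feature map over the last few samples can be formed as below (a generic sketch; the particular monomials are illustrative and not the ones analyzed in this brief).

```python
import numpy as np
from itertools import combinations_with_replacement

def volterra_features(past, r=3):
    """Second-order Volterra-type feature vector built from the last r samples:
    the samples themselves plus all pairwise products (cross terms included)."""
    u = np.array([past[-i] if i <= len(past) else 0.0 for i in range(1, r + 1)])
    feats = list(u)                                      # linear terms x[t-1..t-r]
    for i, j in combinations_with_replacement(range(r), 2):
        feats.append(u[i] * u[j])                        # quadratic and cross terms
    return np.array(feats)

print(volterra_features([0.2, -0.5, 0.1, 0.4]))          # uses the last 3 samples
```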

Since $\sum_{k=1}^{m} w_k f_k(x_{t-r}^{t-1}) = w_1 f_1(x_{t-r}^{t-1})$ with an appropriate selection of $\mathbf{w}$ and redefinition of $f_1(x_{t-r}^{t-1})$, we define the following parametric distribution on $x_1^n$ to obtain a lower bound. Given $\theta$ drawn from a beta distribution with parameters $(C, C)$, $C \in \mathbb{R}^+$, we generate a sequence $x_1^n$ having only two values, $A$ and $-A$, such that $x[t] = f_n(x_{t-r}^{t-1})$ with probability $\theta$ and $x[t] = -f_n(x_{t-r}^{t-1})$ with probability $(1-\theta)$, where $f_n(x_{t-r}^{t-1}) \triangleq A f_1(x_{t-r}^{t-1})/M$, i.e., the normalized version of $f_1(x_{t-r}^{t-1})$. Thus, given $\theta$, $x_1^n$ forms a two-state Markov chain with transition probability $(1-\theta)$. Hence, we have $E[x[t] \mid x_1^{t-1}, \theta] = (2\theta - 1) f_n(x_{t-r}^{t-1})$. The lower bound for the regret is then given by

$L(n) \ge E\Big[ \sum_{t=1}^{n} \big(x[t] - (2\hat{\theta}_t - 1) f_n(x_{t-r}^{t-1})\big)^2 \Big] - E\Big[ \sum_{t=1}^{n} \big(x[t] - (2\theta - 1) f_n(x_{t-r}^{t-1})\big)^2 \Big]$

where $\hat{\theta}_t \triangleq E[\theta \mid x_1^{t-1}]$. After some algebra, we achieve

$L(n) \ge \sum_{t=1}^{n} \Big\{ -4 E\big[\hat{\theta}_t\, x[t] f_n(x_{t-r}^{t-1})\big] + 4 E\big[\theta\, x[t] f_n(x_{t-r}^{t-1})\big] + A^2 E\big[(2\hat{\theta}_t - 1)^2\big] - A^2 E\big[(2\theta - 1)^2\big] \Big\}.$

It can be deduced that

$\hat{\theta}_t = E[\theta \mid x_1^{t-1}] = \frac{t - 1 - F_{t-1} + C}{t - 1 + 2C}$

where $F_{t-1}$ is the total number of transitions between the two states in a sequence of length $(t-1)$; i.e., $1 - \hat{\theta}_t$ is essentially the ratio of the number of transitions to the time period. Using this expression, the first expectation above can be evaluated explicitly by noting that $E[x[t] f_n(x_{t-r}^{t-1})] = E[(2\theta - 1)] A^2 = 0$ and that, given $\theta$, $F_{t-1}$ is a binomial random variable with parameters $(1-\theta)$ and size $(t-1)$. After this point, the derivation follows similar lines to [7], giving a lower bound of the form $O(\ln(n))$ for the regret.

C. k-Ahead mth-Order Linear Prediction

The regret in (2) for $k$-ahead $m$th-order linear prediction is given by

$\sup_{x_1^n} \Big\{ \sum_{t=1}^{n} (x[t] - \hat{x}_s[t])^2 - \inf_{\mathbf{w} \in \mathbb{R}^m} \sum_{t=1}^{n} \big(x[t] - \mathbf{w}^T \mathbf{x}[t-k]\big)^2 \Big\}$   (9)

where $\hat{x}_s[t]$ is the prediction at time $t$ of any sequential algorithm that has access to the data from $x[1]$ up to $x[t-k]$ for prediction for some integer $k$, $\mathbf{w} = [w_1, \ldots, w_m]^T$ is the parameter vector, and $\mathbf{x}[t-k] \triangleq [x[t-k], \ldots, x[t-k-m+1]]^T$. We first find a lower bound for $k$-ahead first-order prediction, where $\mathbf{w}^T \mathbf{x}[t-k] = w\, x[t-k]$. For this purpose, we define the following parametric distribution on $x_1^n$ as in [5]. Given $\theta$ drawn from a beta distribution with parameters $(C, C)$, $C \in \mathbb{R}^+$, we generate a sequence $x_1^n$ having only two values, $A$ and $-A$, such that $x[t] = x[t-k]$ with probability $\theta$ and $x[t] = -x[t-k]$ with probability $(1-\theta)$. Thus, given $\theta$, $x_1^n$ forms a two-state Markov chain with transition probability $(1-\theta)$. Then, $E[x[t] \mid x_1^{t-k}, \theta] = (2\theta - 1)\, x[t-k]$, giving $h(\theta, x_{t-a}^{t-k}) = (2\theta - 1)\, x[t-k]$ and $g(\theta) = (2\theta - 1)$. After this point, the derivation exactly follows the lines in [5], resulting in a lower bound of the form $O(\ln(n))$. For $k$-ahead $m$th-order prediction, we generalize the lower bound obtained for $k$-ahead first-order prediction and, following the lines in [5], obtain a lower bound of the form $O(m \ln(n))$.
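Both constructions above reduce the prediction problem to estimating $\theta$ from the observed transitions of a two-state chain. A minimal simulation (illustrative parameter values) of the chain and of the posterior-mean estimate $\hat{\theta}_t = (t - 1 - F_{t-1} + C)/(t - 1 + 2C)$ used in the derivation:

```python
import numpy as np

rng = np.random.default_rng(3)
A, C, n = 1.0, 1.0, 5000
theta = rng.beta(C, C)                       # "stay" probability, Beta(C, C) prior

x = np.zeros(n)
x[0] = A
transitions = 0
for t in range(1, n):
    # Posterior mean of theta given x[1..t-1]: (t - 1 - F_{t-1} + C) / (t - 1 + 2C).
    theta_hat = (t - 1 - transitions + C) / (t - 1 + 2 * C)
    stay = rng.random() < theta
    x[t] = x[t - 1] if stay else -x[t - 1]   # two-state chain on {A, -A}
    transitions += 0 if stay else 1

print(theta, theta_hat)                      # the final estimate is close to the drawn theta
```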
III. COMPREHENSIVE APPROACH TO REGRET MINIMIZATION

In this section, we introduce a method that can be used to predict a bounded, arbitrary, and unknown sequence. We derive upper bounds for this algorithm such that, for any sequence $x_1^n$, our algorithm will not perform worse than the presented upper bounds. In some cases, by achieving matching upper and lower bounds, we prove that this algorithm is optimal in a strong minimax sense such that the worst-case performance cannot be further improved.

We restrict the prediction functions to be separable, i.e., $f(\mathbf{w}, x_{t-a}^{t-1}) = f_w(\mathbf{w})^T f_x(x_{t-a}^{t-1})$, where $f_w(\mathbf{w})$ and $f_x(x_{t-a}^{t-1})$ are vector functions of size $m$ for some integer $m$. To avoid any confusion, we simply denote $\beta \triangleq f_w(\mathbf{w})$, where $\beta \in \mathbb{R}^m$. Hence, the same prediction function can be written as $f(\mathbf{w}, x_{t-a}^{t-1}) = \beta^T f_x(x_{t-a}^{t-1})$. If the parameter vector $\beta$ is selected such that the total squared prediction error is minimized over a batch of data of length $n$, then the coefficients are given by

$\beta[n] = \arg\min_{\beta \in \mathbb{R}^m} \sum_{t=1}^{n} \big(x[t] - \beta^T f_x(x_{t-a}^{t-1})\big)^2.$

The well-known least-squares solution to this problem is given by $\beta[n] = (R_{ff}^n)^{-1} r_f^n$, where $R_{ff}^n \triangleq \sum_{t=1}^{n} f_x(x_{t-a}^{t-1}) f_x(x_{t-a}^{t-1})^T$ is invertible and $r_f^n \triangleq \sum_{t=1}^{n} x[t]\, f_x(x_{t-a}^{t-1})$. When $R_{ff}^n$ is singular, the solution is no longer unique; however, a suitable choice can be made using, e.g., pseudoinverses. We also consider the more general least-squares (ridge regression) problem that arises in many signal processing problems, in which the regularized total squared prediction error is minimized over a batch of data of length $n$:

$\beta[n] = \arg\min_{\beta \in \mathbb{R}^m} \Big\{ \sum_{t=1}^{n} \big(x[t] - \beta^T f_x(x_{t-a}^{t-1})\big)^2 + \delta \|\beta\|^2 \Big\} = \big(R_{ff}^n + \delta I\big)^{-1} r_f^n.$

We define a universal predictor $\hat{x}_u[n]$ as

$\hat{x}_u[n] = \beta_u[n-1]^T f_x(x_{n-a}^{n-1})$

where $\beta_u[n] \triangleq \beta[n] = (R_{ff}^n + \delta I)^{-1} r_f^n$ and $\delta > 0$ is a positive constant.
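Read operationally, the universal predictor is ridge regression refit causally at every step and applied one step ahead. The following is a direct, non-recursive sketch of that recipe (the feature map `f_x`, its dimension, and the test sequence are placeholders chosen for illustration); in practice the matrix inverse would be updated recursively, RLS-style, rather than recomputed from scratch.

```python
import numpy as np

def universal_predict(x, f_x, m, delta=1.0):
    """Sequential ridge predictor x_hat_u[t] = beta_u[t-1]^T f_x(past),
    with beta_u[t] = (R_ff^t + delta*I)^{-1} r_f^t accumulated causally."""
    n = len(x)
    R = delta * np.eye(m)          # R_ff accumulated so far, plus the regularizer
    r = np.zeros(m)
    x_hat = np.zeros(n)
    for t in range(n):
        feat = f_x(x[:t])          # features built from x[1..t-1] only
        beta = np.linalg.solve(R, r)
        x_hat[t] = beta @ feat     # predict, then update the statistics with x[t]
        R += np.outer(feat, feat)
        r += x[t] * feat
    return x_hat

# Example: second-order linear prediction, f_x(past) = [x[t-1], x[t-2]].
f_x = lambda past: np.array([past[-1] if len(past) >= 1 else 0.0,
                             past[-2] if len(past) >= 2 else 0.0])
rng = np.random.default_rng(4)
x = np.clip(np.sin(0.3 * np.arange(400)) + 0.1 * rng.normal(size=400), -2, 2)
print(np.sum((x - universal_predict(x, f_x, m=2)) ** 2))
```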

Theorem 2: The total squared prediction error of the $m$th-order universal predictor for any bounded arbitrary sequence $\{x[t]\}_{t \ge 1}$, $|x[t]| \le A$, having an arbitrary length of $n$ satisfies

$\sum_{t=1}^{n} \big(x[t] - \hat{x}_u[t]\big)^2 \le \min_{\beta \in \mathbb{R}^m} \Big\{ \sum_{t=1}^{n} \big(x[t] - \beta^T f_x(x_{t-a}^{t-1})\big)^2 + \delta \|\beta\|^2 \Big\} + A^2 \ln\Big| I + \frac{R_{ff}^n}{\delta} \Big|.$

Theorem 2 indicates that the total squared prediction error of the $m$th-order universal predictor is within $O(m \ln(n))$ of that of the best batch $m$th-order parametric predictor for any individual sequence $\{x[t]\}_{t \ge 1}$. This result implies that in order to learn $m$ parameters, the universal algorithm pays a regret of $O(m \ln(n))$, which can be viewed as the parameter regret. After we prove Theorem 2, we apply it to the competition classes discussed in Section II.

Proof of Theorem 2: We prove this result for a scalar prediction function such that $f_x(x_{t-a}^{t-1}) = f(x_{t-a}^{t-1})$ to avoid any confusion. For a vector prediction function $f_x(\cdot)$, one can follow the exact same steps with vector extensions of the Gaussian mixture. The derivations follow similar lines to [5] and [10]; hence, only the main points are presented. We first define a function of the loss, namely a probability, for a predictor having parameter $\beta$, as follows:

$P_\beta(x_1^n) \triangleq \exp\Big( -\frac{1}{2h} \sum_{k=1}^{n} \big(x[k] - \beta f(x_{k-a}^{k-1})\big)^2 \Big)$

which can be viewed as a probability assignment of the predictor with parameter $\beta$ to the data $x[t]$, $1 \le t \le n$, induced by the performance of $\beta$ on the sequence $x_1^n$. We then construct a universal estimate of the probability of the sequence $x_1^n$ as an a priori weighted mixture among all of these probabilities, i.e., $P_u(x_1^n) = \int p(\beta) P_\beta(x_1^n)\, d\beta$, where $p(\beta)$ is an a priori weight assigned to the parameter $\beta$ and is selected as Gaussian in order to obtain closed-form bounds, i.e., $p(\beta) = (2\pi)^{-1/2} \sigma^{-1} \exp(-\beta^2/(2\sigma^2))$. Following similar lines to [7] with a predictor of the form $\beta f(x_{t-a}^{t-1})$, we obtain

$P_u\big(x[n] \mid x_1^{n-1}\big) \ge \gamma \exp\Big( -\frac{\gamma^2}{2h} \big(x[n] - \beta[n-1] f(x_{n-a}^{n-1})\big)^2 \Big)$

where $\gamma \triangleq \big( (R_{ff}^{n-1} + \delta)/(R_{ff}^{n} + \delta) \big)^{1/2}$. If we can find another Gaussian probability assignment $\tilde{P}_u$ satisfying $\tilde{P}_u(x_1^n) \le P_u(x_1^n)$, this completes the proof of the theorem. After some algebra, we find that the universal predictor is given by

$\hat{x}_u[n] = \gamma^2 \beta[n-1] f(x_{n-a}^{n-1}) = \frac{r_f^{n-1}}{R_{ff}^{n} + \delta} f(x_{n-a}^{n-1}).$

Now, we can select the smallest value of $h$ such that, over the region $[-A, A]$, $\tilde{P}_u(x[n] \mid x_1^{n-1})$ is not larger than $P_u(x[n] \mid x_1^{n-1})$; this condition must hold for all values of $\hat{x}_u[n] \in [-A, A]$, and it yields $h \ge A^2(1 - \gamma^2)/(-\ln(\gamma^2))$, where $\gamma < 1$. Note that for $0 < \gamma < 1$ we have $0 < (1 - \gamma^2)/(-\ln(\gamma^2)) < 1$, which implies that taking $h \ge A^2$ is enough to ensure that $\tilde{P}_u \le P_u$. In fact, since this bound on the value of $h$ depends upon the values of $\gamma$ and $\hat{x}_u[n]$, and is only tight for $\gamma \to 1$ and $\hat{x}_u[n] \to 0$, the restriction that $|x[n]| < A$ can occasionally be violated, as long as $\tilde{P}_u \le P_u$ still holds.

To illustrate this procedure, we investigate the upper bound on the regret in (2) for the same candidate function classes that we investigated in Section II.

A. mth-Order Univariate Polynomial Predictor

For an $m$th-order polynomial in $x[t-1]$, the prediction function is given by $f(\mathbf{w}, x_{t-a}^{t-1}) = \beta^T f_x(x_{t-a}^{t-1}) = \beta^T \mathbf{x}[t-1]$, where $\mathbf{x}[t-1] \triangleq [x[t-1], \ldots, x^m[t-1]]^T$, i.e., the vector of powers of $x[t-1]$. After replacing $R_{ff}^n = R^n \triangleq \sum_{t=1}^{n} \mathbf{x}[t-1]\mathbf{x}[t-1]^T$ and $r_f^n = r^n \triangleq \sum_{t=1}^{n} x[t]\,\mathbf{x}[t-1]$, we obtain an upper bound

$\sum_{t=1}^{n} \big(x[t] - \hat{x}_u[t]\big)^2 \le \min_{\beta \in \mathbb{R}^m} \Big\{ \sum_{t=1}^{n} \big(x[t] - \beta^T \mathbf{x}[t-1]\big)^2 + \delta \|\beta\|^2 \Big\} + A^2 \ln\Big| I + \frac{R^n}{\delta} \Big| \le \min_{\beta \in \mathbb{R}^m} \Big\{ \sum_{t=1}^{n} \big(x[t] - \beta^T \mathbf{x}[t-1]\big)^2 + \delta \|\beta\|^2 \Big\} + m A^2 \ln\big(1 + A^{2m} n/\delta\big).$

B. Multivariate Polynomial Prediction

The upper bound for a multivariate polynomial prediction function $f_x(x_{t-a}^{t-1})$ exactly follows the upper bound derivation for the $m$th-order univariate polynomial predictor, giving

$\sum_{t=1}^{n} \big(x[t] - \hat{x}_u[t]\big)^2 \le \min_{\beta \in \mathbb{R}^m} \Big\{ \sum_{t=1}^{n} \big(x[t] - \beta^T f_x(x_{t-a}^{t-1})\big)^2 + \delta \|\beta\|^2 \Big\} + m A^2 \ln\big(1 + A^2 n/\delta\big).$
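As a numerical sanity check of Theorem 2 (using the bound in the form reconstructed above, and the scalar form of the universal predictor that appears in the proof), the sketch below runs first-order linear prediction on an arbitrary bounded sequence and verifies that the accumulated loss stays below the regularized batch loss plus $A^2 \ln(1 + \sum_t f^2[t]/\delta)$. This is an illustration under those assumptions, not a proof.

```python
import numpy as np

rng = np.random.default_rng(5)
A, delta, n = 1.0, 1.0, 1000
x = np.clip(np.cumsum(rng.normal(size=n)) * 0.1, -A, A)     # any bounded sequence

# Scalar universal predictor as in the proof: x_hat_u[t] = r^{t-1} f[t] / (R^t + delta),
# with f[t] = x[t-1] (first-order linear prediction).
R, r, loss_u = 0.0, 0.0, 0.0
for t in range(n):
    f = x[t - 1] if t > 0 else 0.0
    R += f ** 2                                  # R includes the current regressor
    loss_u += (x[t] - (r / (R + delta)) * f) ** 2
    r += x[t] * f                                # r still excludes x[t] when predicting

# Best regularized batch predictor in hindsight, and the regret term of Theorem 2.
feats = np.concatenate(([0.0], x[:-1]))
beta_star = (feats @ x) / (feats @ feats + delta)
batch = np.sum((x - beta_star * feats) ** 2) + delta * beta_star ** 2
bound = batch + A ** 2 * np.log(1 + feats @ feats / delta)
print(loss_u <= bound, loss_u, bound)
```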
C. k-Ahead mth-Order Linear Prediction

For $k$-ahead $m$th-order prediction, the prediction class is given by $f(\mathbf{w}, \cdot) = \beta^T f_x(\cdot) = \beta^T \mathbf{x}[t-k]$, where $\mathbf{x}[t-k] \triangleq [x[t-k], \ldots, x[t-k-m+1]]^T$ as before. After replacing $R_{ff}^n = R^n \triangleq \sum_{t} \mathbf{x}[t-k]\mathbf{x}[t-k]^T$ and $r_f^n = r^n \triangleq \sum_{t} x[t]\,\mathbf{x}[t-k]$ with suitable limits, we obtain an upper bound

$\sum_{t=1}^{n} \big(x[t] - \hat{x}_u[t]\big)^2 \le \min_{\beta \in \mathbb{R}^m} \Big\{ \sum_{t=1}^{n} \big(x[t] - \beta^T \mathbf{x}[t-k]\big)^2 + \delta \|\beta\|^2 \Big\} + m A^2 \ln\big(1 + A^2 n/\delta\big).$

IV. RANDOMIZED OUTPUT PREDICTIONS

In this section, we investigate the performance of randomized output algorithms in the worst-case scenario with respect to linear predictors, using the same regret measure as in (2). We emphasize that randomized output algorithms are a superset of the deterministic sequential predictors, and the derivations here can be readily generalized to include any prediction class. In particular, we consider randomized output algorithms $f(\theta(x_1^{t-1}), x_1^{t-1})$ such that the randomization parameters $\theta \in \mathbb{R}^m$ can be a function of the whole past. Hence, a randomized sequential algorithm introduces randomization, or uncertainty, into its output, so that the output also depends on a random element. Note that such methods are widely used in applications involving security considerations. As an example, suppose there are $m$ prediction algorithms running in parallel to sequentially predict the observation sequence $\{x[t]\}_{t \ge 1}$. At each time $t$, the randomized output algorithm selects one of the constituent algorithms randomly, such that algorithm $k$ is selected with probability $p_k[t]$. By definition, $\sum_{k=1}^{m} p_k[t] = 1$, and $p_k[t]$ may be generated as a combination of the past observation samples and a seed independent of the observations.
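Before formalizing this, a small illustration (with arbitrary constituent predictors and selection probabilities) of why such randomization cannot help under the squared error: for every $t$, the expected squared error of the random selection equals the squared error of the probability-weighted mean output plus a nonnegative variance term, so the deterministic algorithm that outputs the weighted mean is never worse in expectation.

```python
import numpy as np

rng = np.random.default_rng(6)
m, n, trials = 3, 200, 500
x = np.sin(0.1 * np.arange(n))                             # the target sequence

# m constituent predictors and (arbitrary) selection probabilities p_k[t].
preds = np.stack([np.roll(x, k + 1) for k in range(m)])    # predictor k outputs x[t-k-1]
p = rng.dirichlet(np.ones(m), size=n).T                    # column t sums to 1

# Monte Carlo estimate of the randomized algorithm's expected accumulated loss.
rand_loss = 0.0
for _ in range(trials):
    picks = np.array([rng.choice(m, p=p[:, t]) for t in range(n)])
    rand_loss += np.sum((x - preds[picks, np.arange(n)]) ** 2)
rand_loss /= trials

# Deterministic algorithm that outputs the p-weighted mean of the same predictors.
mean_loss = np.sum((x - np.sum(p * preds, axis=0)) ** 2)
print(mean_loss <= rand_loss, mean_loss, rand_loss)        # weighted mean is never worse in expectation
```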

For such randomized output prediction algorithms, we consider the following time-accumulated prediction error over a deterministic sequence $\{x[t]\}_{t \ge 1}$:

$P_{\mathrm{rand}}(n) \triangleq E_{\theta}\Big[ \sum_{t=1}^{n} \big(x[t] - f(\theta(x_1^{t-1}), x_1^{t-1})\big)^2 \Big].$   (10)

This expectation is taken over all the randomization due to independent or dependent seeds. Hence, our general regret can be extended to include this performance measure:

$\sup_{x_1^n} \Big\{ P_{\mathrm{rand}}(n) - \min_{\mathbf{w} \in \mathbb{R}^m} \sum_{t=1}^{n} \big(x[t] - \mathbf{w}^T \mathbf{x}[t-1]\big)^2 \Big\}.$   (11)

Expanding (10), we obtain

$P_{\mathrm{rand}}(n) = \sum_{t=1}^{n} \Big\{ \big(x[t] - E_{\theta}\big[f(\theta(x_1^{t-1}), x_1^{t-1})\big]\big)^2 + \mathrm{Var}_{\theta}\big(f(\theta(x_1^{t-1}), x_1^{t-1})\big) \Big\}$

noting that $x[t]$ is independent of the randomization. Since $E_{\theta}[f(\theta(x_1^{t-1}), x_1^{t-1})]$ is a sequential function of $x_1^{t-1}$ and $\mathrm{Var}_{\theta}(f(\theta(x_1^{t-1}), x_1^{t-1}))$ is always nonnegative, the performance of a randomized output algorithm can be attained by a deterministic sequential algorithm. Since deterministic algorithms are a subclass of randomized output algorithms, the upper bounds we derived for $k$-ahead $m$th-order prediction in (9) also hold for (11). Since we also proved that the lower bounds for such $m$th-order linear predictors are of the form $O(m \ln(n))$, the lower and upper bounds are tight and of the form $O(m \ln(n))$.

V. CONCLUSION

In this brief, we considered the problem of sequential prediction from a mixture of experts perspective. We introduced comprehensive lower bounds for the sequential learning framework by proving that, for any sequential algorithm, there always exists a sequence for which the sequential predictor cannot outperform the class of parametric predictors whose parameters are set noncausally. The lower bounds for important parametric classes, such as the univariate polynomial, multivariate polynomial, and linear predictor classes, were derived in detail. We then introduced a universal sequential prediction algorithm and investigated the upper bound on the regret of this algorithm. We also derived the upper bounds in detail for the same important classes that we discussed for the lower bounds, and we further showed that this algorithm is optimal in a strong minimax sense in some scenarios. Finally, we proved that, in the worst-case scenario, randomized output algorithms cannot provide any improvement in performance compared with sequential algorithms.

REFERENCES

[1] N.-Y. Liang, G.-B. Huang, P. Saratchandran, and N. Sundararajan, "A fast and accurate online sequential learning algorithm for feedforward networks," IEEE Trans. Neural Netw., vol. 17, no. 6, pp. 1411–1423, Nov. 2006.
[2] L. Devroye, T. Linder, and G. Lugosi, "Nonparametric estimation and classification using radial basis function nets and empirical risk minimization," IEEE Trans. Neural Netw., vol. 7, no. 2, Mar. 1996.
[3] A. Krzyzak and T. Linder, "Radial basis function networks and complexity regularization in function learning," IEEE Trans. Neural Netw., vol. 9, no. 2, pp. 247–256, Mar. 1998.
[4] N. Cesa-Bianchi, P. M. Long, and M. K. Warmuth, "Worst-case quadratic loss bounds for prediction using linear functions and gradient descent," IEEE Trans. Neural Netw., vol. 7, no. 3, May 1996.
[5] A. C. Singer and M. Feder, "Universal linear prediction by model order weighting," IEEE Trans. Signal Process., vol. 47, no. 10, Oct. 1999.
[6] G. C. Zeitler and A. C. Singer, "Universal linear least-squares prediction in the presence of noise," in Proc. IEEE/SP 14th Workshop on Statistical Signal Processing (SSP), Aug. 2007.
[7] A. C. Singer, S. S. Kozat, and M. Feder, "Universal linear least squares prediction: Upper and lower bounds," IEEE Trans. Inf. Theory, vol. 48, no. 8, Aug. 2002.
[8] T. Kailath, A. H. Sayed, and B. Hassibi, Linear Estimation. Englewood Cliffs, NJ, USA: Prentice-Hall, 2000.
[9] V. Cherkassky, X. Shao, F. M. Mulier, and V. N. Vapnik, "Model complexity control for regression using VC generalization bounds," IEEE Trans. Neural Netw., vol. 10, no. 5, Sep. 1999.
[10] J. Kivinen and M. K. Warmuth, "Exponentiated gradient versus gradient descent for linear predictors," Inf. Comput., vol. 132, no. 1, pp. 1–63, 1997.
[11] V. J. Mathews, "Adaptive polynomial filters," IEEE Signal Process. Mag., vol. 8, no. 3, pp. 10–26, Jul. 1991.
[12] V. Vovk, "Competitive on-line statistics," Int. Statist. Rev., vol. 69, no. 2, pp. 213–248, 2001.
[13] S. S. Kozat, A. C. Singer, and G. C. Zeitler, "Universal piecewise linear prediction via context trees," IEEE Trans. Signal Process., vol. 55, no. 7, Jul. 2007.
[14] T. Weissman and N. Merhav, "Universal prediction of individual binary sequences in the presence of noise," IEEE Trans. Inf. Theory, vol. 47, no. 6, pp. 2151–2173, Sep. 2001.
[15] T. Moon and T. Weissman, "Universal FIR MMSE filtering," IEEE Trans. Signal Process., vol. 57, no. 3, Mar. 2009.
[16] T. Moon and T. Weissman, "Competitive on-line linear FIR MMSE filtering," in Proc. IEEE Int. Symp. Information Theory (ISIT), Jun. 2007.
[17] H. Stark and J. Woods, Probability, Random Processes, and Estimation Theory for Engineers. Upper Saddle River, NJ, USA: Prentice-Hall, 1994.


More information

Topic 5a Introduction to Curve Fitting & Linear Regression

Topic 5a Introduction to Curve Fitting & Linear Regression /7/08 Course Instructor Dr. Rayond C. Rup Oice: A 337 Phone: (95) 747 6958 E ail: rcrup@utep.edu opic 5a Introduction to Curve Fitting & Linear Regression EE 4386/530 Coputational ethods in EE Outline

More information

1 Identical Parallel Machines

1 Identical Parallel Machines FB3: Matheatik/Inforatik Dr. Syaantak Das Winter 2017/18 Optiizing under Uncertainty Lecture Notes 3: Scheduling to Miniize Makespan In any standard scheduling proble, we are given a set of jobs J = {j

More information

Inspection; structural health monitoring; reliability; Bayesian analysis; updating; decision analysis; value of information

Inspection; structural health monitoring; reliability; Bayesian analysis; updating; decision analysis; value of information Cite as: Straub D. (2014). Value of inforation analysis with structural reliability ethods. Structural Safety, 49: 75-86. Value of Inforation Analysis with Structural Reliability Methods Daniel Straub

More information

Experimental Design For Model Discrimination And Precise Parameter Estimation In WDS Analysis

Experimental Design For Model Discrimination And Precise Parameter Estimation In WDS Analysis City University of New York (CUNY) CUNY Acadeic Works International Conference on Hydroinforatics 8-1-2014 Experiental Design For Model Discriination And Precise Paraeter Estiation In WDS Analysis Giovanna

More information

On the Communication Complexity of Lipschitzian Optimization for the Coordinated Model of Computation

On the Communication Complexity of Lipschitzian Optimization for the Coordinated Model of Computation journal of coplexity 6, 459473 (2000) doi:0.006jco.2000.0544, available online at http:www.idealibrary.co on On the Counication Coplexity of Lipschitzian Optiization for the Coordinated Model of Coputation

More information

Best Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence

Best Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence Best Ar Identification: A Unified Approach to Fixed Budget and Fixed Confidence Victor Gabillon Mohaad Ghavazadeh Alessandro Lazaric INRIA Lille - Nord Europe, Tea SequeL {victor.gabillon,ohaad.ghavazadeh,alessandro.lazaric}@inria.fr

More information

Decentralized Adaptive Control of Nonlinear Systems Using Radial Basis Neural Networks

Decentralized Adaptive Control of Nonlinear Systems Using Radial Basis Neural Networks 050 IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 44, NO., NOVEMBER 999 Decentralized Adaptive Control of Nonlinear Systes Using Radial Basis Neural Networks Jeffrey T. Spooner and Kevin M. Passino Abstract

More information

On Conditions for Linearity of Optimal Estimation

On Conditions for Linearity of Optimal Estimation On Conditions for Linearity of Optial Estiation Erah Akyol, Kuar Viswanatha and Kenneth Rose {eakyol, kuar, rose}@ece.ucsb.edu Departent of Electrical and Coputer Engineering University of California at

More information

World's largest Science, Technology & Medicine Open Access book publisher

World's largest Science, Technology & Medicine Open Access book publisher PUBLISHED BY World's largest Science, Technology & Medicine Open Access book publisher 2750+ OPEN ACCESS BOOKS 95,000+ INTERNATIONAL AUTHORS AND EDITORS 88+ MILLION DOWNLOADS BOOKS DELIVERED TO 5 COUNTRIES

More information

arxiv: v1 [cs.ds] 3 Feb 2014

arxiv: v1 [cs.ds] 3 Feb 2014 arxiv:40.043v [cs.ds] 3 Feb 04 A Bound on the Expected Optiality of Rando Feasible Solutions to Cobinatorial Optiization Probles Evan A. Sultani The Johns Hopins University APL evan@sultani.co http://www.sultani.co/

More information

Algorithms for parallel processor scheduling with distinct due windows and unit-time jobs

Algorithms for parallel processor scheduling with distinct due windows and unit-time jobs BULLETIN OF THE POLISH ACADEMY OF SCIENCES TECHNICAL SCIENCES Vol. 57, No. 3, 2009 Algoriths for parallel processor scheduling with distinct due windows and unit-tie obs A. JANIAK 1, W.A. JANIAK 2, and

More information

Compression and Predictive Distributions for Large Alphabet i.i.d and Markov models

Compression and Predictive Distributions for Large Alphabet i.i.d and Markov models 2014 IEEE International Syposiu on Inforation Theory Copression and Predictive Distributions for Large Alphabet i.i.d and Markov odels Xiao Yang Departent of Statistics Yale University New Haven, CT, 06511

More information

Efficient Learning with Partially Observed Attributes

Efficient Learning with Partially Observed Attributes Nicolò Cesa-Bianchi DSI, Università degli Studi di Milano, Italy Shai Shalev-Shwartz The Hebrew University, Jerusale, Israel Ohad Shair The Hebrew University, Jerusale, Israel Abstract We describe and

More information

CHAPTER 8 CONSTRAINED OPTIMIZATION 2: SEQUENTIAL QUADRATIC PROGRAMMING, INTERIOR POINT AND GENERALIZED REDUCED GRADIENT METHODS

CHAPTER 8 CONSTRAINED OPTIMIZATION 2: SEQUENTIAL QUADRATIC PROGRAMMING, INTERIOR POINT AND GENERALIZED REDUCED GRADIENT METHODS CHAPER 8 CONSRAINED OPIMIZAION : SEQUENIAL QUADRAIC PROGRAMMING, INERIOR POIN AND GENERALIZED REDUCED GRADIEN MEHODS 8. Introduction In the previous chapter we eained the necessary and sufficient conditions

More information

Lower Bounds for Quantized Matrix Completion

Lower Bounds for Quantized Matrix Completion Lower Bounds for Quantized Matrix Copletion Mary Wootters and Yaniv Plan Departent of Matheatics University of Michigan Ann Arbor, MI Eail: wootters, yplan}@uich.edu Mark A. Davenport School of Elec. &

More information

Efficient Learning of Generalized Linear and Single Index Models with Isotonic Regression

Efficient Learning of Generalized Linear and Single Index Models with Isotonic Regression Efficient Learning of Generalized Linear and Single Index Models with Isotonic Regression Sha M Kakade Microsoft Research and Wharton, U Penn skakade@icrosoftco Varun Kanade SEAS, Harvard University vkanade@fasharvardedu

More information