Robust Bounds for Classification via Selective Sampling


Nicolò Cesa-Bianchi, DSI, Università degli Studi di Milano, Italy (cesa-bianchi@dsi.unimi.it)
Claudio Gentile, DICOM, Università dell'Insubria, Varese, Italy (claudio.gentile@uninsubria.it)
Francesco Orabona, Idiap, Martigny, Switzerland (forabona@idiap.ch)

Abstract

We introduce a new algorithm for binary classification in the selective sampling protocol. Our algorithm uses Regularized Least Squares (RLS) as its base classifier, and for this reason it can be efficiently run in any RKHS. Unlike previous margin-based semi-supervised algorithms, our sampling condition hinges on a simultaneous upper bound on bias and variance of the RLS estimate under a simple linear label noise model. This fact allows us to prove performance bounds that hold for an arbitrary sequence of instances. In particular, we show that our sampling strategy approximates the margin of the Bayes optimal classifier to any desired accuracy $\varepsilon$ by asking $\tilde{O}(d/\varepsilon^2)$ queries (in the RKHS case $d$ is replaced by a suitable spectral quantity). While these are the standard rates in the fully supervised i.i.d. case, the best previously known result in our harder setting was $\tilde{O}(d^3/\varepsilon^4)$. Preliminary experiments show that some of our algorithms also exhibit good practical performance.

1. Introduction

A practical variant of the standard fully supervised online learning protocol is a setting where, at each prediction step, the learner can abstain from observing the current label. If the learner observes the label, which he can do by issuing a query, then the label value can be used to improve future predictions. If the label is predicted but not queried, then the learner never knows whether his prediction was correct. Thus, only queried labels are observed, while all others remain unknown. This protocol is often called selective sampling, and we interchangeably use "queried labels" and "sampled labels" to denote the labels observed by the learner. (Appearing in Proceedings of the 26th International Conference on Machine Learning, Montreal, Canada, 2009. Copyright 2009 by the author(s)/owner(s).)

Given a general online prediction technique, like Regularized Least Squares (RLS), we are interested in controlling the predictive performance as the query rate goes from fully supervised (all labels are queried) to fully unsupervised (no label is queried). This is motivated by observing that, in a typical practical scenario, one might want to control the accuracy of predictions while imposing an upper bound on the query rate. In fact, the number of observed labels usually has a very direct influence on basic computational aspects of online learning algorithms, such as running time and storage requirements.

In this work we develop semi-supervised variants of RLS for binary classification. We analyze these variants under no assumptions on the mechanism generating the sequence of instances, while imposing a simple linear noise model for the conditional label distribution. Intuitively, our algorithms issue a query when a common upper bound on bias and variance of the current RLS estimate is larger than a given threshold. Conversely, when this upper bound gets small, we infer via a simple large deviation argument that the margin of the RLS estimate on the current instance is close enough to the margin of the Bayes optimal classifier. Hence the learner can safely avoid issuing a query on that step.

In order to summarize our results, assume for the sake of simplicity that the Bayes optimal classifier (which for us is a linear classifier $u$) has unknown margin $|u^\top x_t| \ge \varepsilon > 0$ on all instances $x_t \in \mathbb{R}^d$. Then, in our data model, the average per-step risk of the fully supervised RLS, asking $N_T = T$ labels in $T$ steps, is known to converge to the average risk of the Bayes optimal classifier at rate $d\,\varepsilon^{-2}\,T^{-1}$, excluding logarithmic factors. In this work we show that, using our semi-supervised RLS variant, we can replace $N_T = T$ with any desired query bound $N_T = T^\kappa$, for $0 \le \kappa \le 1$, while achieving a convergence rate of order $d\,\varepsilon^{-2}\,T^{-1} + \varepsilon^{-2/\kappa}\,T^{-1}$.

One might wonder whether these results could also be obtained just by running a standard RLS algorithm with a constant label sampling rate, say $T^{\kappa-1}$, independent of the sequence of instances. If we could prove for RLS an instantaneous regret bound like $d/T$, then the answer would be affirmative. However, the lack of assumptions on the way instances are generated makes it hard to prove any nontrivial instantaneous regret bound.

If the margin $\varepsilon$ is known or, equivalently, our goal is to approximate the Bayes margin to some accuracy $\varepsilon$, then we show that the above strategies achieve, with high probability, any desired accuracy $\varepsilon$ by querying only order of $d/\varepsilon^2$ labels (excluding logarithmic factors). Again, the reader should observe that this bound could not be obtained by, say, concentrating all queries in an initial phase of length $O(d/\varepsilon^2)$: in such a case, an obvious adversarial strategy would be to generate noninformative instances just in that phase. In short, if we require online semi-supervised learning algorithms to work in worst-case scenarios, we need to resort to nontrivial label sampling techniques.

We have run comparative experiments on both artificial and real-world medium-size datasets. These experiments, though preliminary in nature, reveal the effectiveness of our sampling strategies even from a practical standpoint.

1.1 Related work and comparison

Selective sampling is a well-known semi-supervised online learning setting. Pioneering works in this area are (Cohn et al., 1990) and (Freund et al., 1997). More recent related results focusing on linear classification problems include (Balcan et al., 2006; Balcan et al., 2007; Cavallanti et al., 2009; Cesa-Bianchi et al., 2006b; Dasgupta et al., 2008; Dasgupta et al., 2005; Strehl & Littman, 2008), although some of these works analyze batch rather than online protocols. Most previous studies consider the case when instances are drawn i.i.d. from a fixed distribution, exceptions being the worst-case analysis of (Cesa-Bianchi et al., 2006b) and the very recent analysis in the KWIK learning protocol (Strehl & Littman, 2008). Both of these papers use variants of RLS working on arbitrary instance sequences. The work (Cesa-Bianchi et al., 2006b) is completely worst case: the authors make no assumptions whatsoever on the mechanism generating labels and instances; however, they are unable to prove bounds on the label query rate as we do here. The KWIK model of (Strehl & Littman, 2008) (see also the more general setup in Li et al., 2008) is closest to the setting considered in this paper. There the goal is to approximate the Bayes margin to within a given accuracy $\varepsilon$. The authors assume arbitrary sequences of instances and the same linear stochastic model for labels as the one considered here. A modification of the linear regressor of (Auer, 2002), combined with covering arguments, allows them to compete against an adaptive adversarial strategy for generating instances. Their algorithm, however, yields the significantly worse bound $\tilde{O}(d^3/\varepsilon^4)$ on the number of queries, and seems to work in the finite dimensional ($d < \infty$) case only. In contrast, our algorithms achieve the better query bound $\tilde{O}(d/\varepsilon^2)$ against oblivious adversaries. Moreover, our algorithms can be easily run in any infinite dimensional RKHS.
2. Preliminaries

In the selective sampling protocol for online binary classification, at each step t = 1, 2, ... the learner receives an instance $x_t \in \mathbb{R}^d$ and outputs a binary prediction for the associated unknown label $y_t \in \{-1, +1\}$. After each prediction the learner may observe the label $y_t$ only by issuing a query. If no query is issued at time t, then $y_t$ remains unknown. Since one expects the learner's performance to improve if more labels are observed, our goal is to trade off predictive accuracy against the number of queries.

All results proven in this paper hold for any fixed individual sequence $x_1, x_2, \dots$ of instances, under the sole assumption that $\|x_t\| = 1$ for all $t \ge 1$. Given any such sequence, we assume the corresponding labels $y_t \in \{-1, +1\}$ are realizations of random variables $Y_t$ such that $\mathbb{E}Y_t = u^\top x_t$ for all $t \ge 1$, where $u \in \mathbb{R}^d$ is a fixed and unknown vector with $\|u\| = 1$. Note that $\mathrm{sgn}(\Delta_t)$, for $\Delta_t = u^\top x_t$, is the Bayes optimal classifier for this noise model. We study selective sampling algorithms that use $\mathrm{sgn}(\hat{\Delta}_t)$ to predict $Y_t$. The quantity $\hat{\Delta}_t = w_t^\top x_t$ is a margin computed via the RLS estimate

$$w_t = \big(I + S_{t-1} S_{t-1}^\top + x_t x_t^\top\big)^{-1} S_{t-1} Y_{t-1} \qquad (1)$$

defined over the matrix $S_{t-1} = [x_1, \dots, x_{N_{t-1}}]$ of the $N_{t-1}$ queried instances up to time $t-1$. The random vector $Y_{t-1} = (Y_1, \dots, Y_{N_{t-1}})$ contains the observed labels (so that $Y_k$ is the label of $x_k$), and $I$ is the identity matrix.
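To make the noise model and the estimate (1) concrete, here is a minimal numpy sketch (our own illustration; the paper does not prescribe an implementation, and the explicit linear solve below would be replaced by incremental or dual/kernel updates in practice):

```python
import numpy as np

def sample_label(u, x, rng):
    # Linear noise model of Section 2: E[Y | x] = u.x, i.e.
    # P(Y = +1 | x) = (1 + u.x) / 2, with ||u|| = ||x|| = 1.
    return 1.0 if rng.random() < (1.0 + u @ x) / 2.0 else -1.0

def rls_margin(S, Y, x):
    """Margin estimate of Eq. (1): hatDelta_t = x^T w_t with
    w_t = (I + S S^T + x x^T)^{-1} S Y.

    S is d x n with one queried instance per column, Y holds the
    n observed labels, and x is the current instance."""
    d = x.shape[0]
    A = np.eye(d) + S @ S.T + np.outer(x, x)
    w = np.linalg.solve(A, S @ Y)
    return w @ x
```

With S empty (no queries yet) the margin is 0, matching the initialization $w = 0$ of the algorithms below.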

We are interested in simultaneously controlling the cumulative regret

$$R_T = \sum_{t=1}^{T}\Big(\mathbb{P}(Y_t \hat{\Delta}_t < 0) - \mathbb{P}(Y_t \Delta_t < 0)\Big) \qquad (2)$$

and the number $N_T$ of queried labels, uniformly over $T$. Let $A_t = I + S_{t-1} S_{t-1}^\top + x_t x_t^\top$. We introduce the following relevant quantities:

$$B_t = u^\top \big(I + x_t x_t^\top\big) A_t^{-1} x_t, \quad r_t = x_t^\top A_t^{-1} x_t, \quad q_t = \big\|S_{t-1}^\top A_t^{-1} x_t\big\|, \quad s_t = \big\|A_t^{-1} x_t\big\|.$$

The following properties of the RLS estimate (1) have been proven in, e.g., (Cesa-Bianchi et al., 2006a).

Lemma 1. For each t = 1, 2, ... the following inequalities hold:
1. $\mathbb{E}\,\hat{\Delta}_t = \Delta_t - B_t$;
2. $|B_t| \le s_t + r_t$;
3. $s_t \le \sqrt{r_t}$;
4. $q_t^2 \le r_t$;
5. for all $\varepsilon > 0$, $\mathbb{P}\big(|\hat{\Delta}_t + B_t - \Delta_t| \ge \varepsilon\big) \le 2\exp\!\big(-\tfrac{\varepsilon^2}{2 q_t^2}\big)$;
6. if $N_T$ is the total number of queries issued in the first $T$ steps, then $\sum_{1 \le t \le T:\, Y_t \text{ queried}} r_t \le \sum_{i=1}^{d} \ln(1+\lambda_i) \le d\,\ln\big(1 + \tfrac{N_T}{d}\big)$, where $\lambda_i$ is the $i$-th eigenvalue of the Gram matrix $S_T^\top S_T$ defined on the queried instances.

3. A new selective sampling algorithm

Our main theoretical result provides bounds on the cumulative regret and the number of queried labels for the selective sampling algorithm introduced below, which we call the BBQ (Bound on Bias Query) algorithm.

Algorithm 1: The BBQ selective sampler
  Parameters: $0 \le \kappa \le 1$.
  Initialization: weight vector $w = 0$.
  for each time step t = 1, 2, ... do
    observe instance $x_t \in \mathbb{R}^d$;
    predict label $y_t \in \{-1, +1\}$ with $\mathrm{sgn}(w^\top x_t)$;
    if $r_t > t^{-\kappa}$ then
      query label $y_t$ and update $w_t$ using $(x_t, y_t)$ as in (1)
    end if
  end for

BBQ queries $x_t$ whenever $r_t$ is larger than a threshold vanishing as $t^{-\kappa}$, where $0 \le \kappa \le 1$ is an input parameter. This simple query condition builds on Property 5 of Lemma 1. This property shows that $\hat{\Delta}_t$ is likely to be close to the margin $\Delta_t$ of the Bayes optimal predictor when both the bias $B_t$ and the variance bound $q_t^2$ are small. Since these quantities are both bounded by functions of $r_t$ (see Properties 2, 3, and 4 of Lemma 1), this suggests that one can safely disregard $Y_t$ when $r_t$ is small.

According to our noise model, the label of $x_t$ is harder to predict if $|\Delta_t|$ is small. For this reason, our regret bound is split into a cumulative regret on big margin steps, where $|\Delta_t| \ge \varepsilon$, and small margin steps, where $|\Delta_t| < \varepsilon$. On the one hand, we bound the regret on small margin steps simply by $\varepsilon\,T_\varepsilon$, where $T_\varepsilon = |\{1 \le t \le T : |\Delta_t| < \varepsilon\}|$. On the other hand, we show that the overall regret can be bounded in terms of the best possible choice of $\varepsilon$, with no need for the algorithm to know this optimal value.

Theorem 1. If BBQ is run with input $\kappa \in [0, 1]$, then its cumulative regret $R_T$ after any number $T$ of steps satisfies

$$R_T \le \min_{0 < \varepsilon < 1}\left(\varepsilon\,T_\varepsilon + e\,\lceil 1/\kappa \rceil!\left(\frac{8}{\varepsilon^2}\right)^{1/\kappa} + \frac{8d}{\varepsilon^2}\,\ln\!\left(1 + \frac{N_T}{d}\right)\right).$$

The number of queried labels is $N_T = O(d\,T^\kappa \ln T)$.

It is worth observing that the bounds presented here hold in the finite dimensional ($d < \infty$) case only. One can easily turn them to work in any RKHS after switching to an eigenvalue representation of the cumulative regret (e.g., by using the middle bound in Property 6 of Lemma 1 rather than the rightmost one, as we did in the proof below). This essentially corresponds to analyzing Algorithm 1 in a dual variable representation. A similar comment holds for Remark 1 and Theorem 2 below.
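In primal form, a direct sketch of Algorithm 1 is the following (our own rendering; it recomputes $A_t^{-1}$ from scratch for clarity, whereas an efficient implementation would maintain it via rank-one Sherman-Morrison updates, and a kernel version would work with dual variables instead):

```python
import numpy as np

def bbq(stream, d, kappa):
    """BBQ (Algorithm 1): query y_t whenever r_t > t^(-kappa).

    stream yields pairs (x_t, y_t) with ||x_t|| = 1; in the selective
    sampling protocol y_t is looked at only when a query is issued.
    Yields the prediction made at each step."""
    S = np.zeros((d, 0))   # queried instances, one per column
    Y = np.zeros(0)        # labels observed so far
    for t, (x, y) in enumerate(stream, start=1):
        A_inv = np.linalg.inv(np.eye(d) + S @ S.T + np.outer(x, x))
        margin = x @ A_inv @ (S @ Y)        # hatDelta_t from Eq. (1)
        yield 1.0 if margin >= 0.0 else -1.0
        r_t = x @ A_inv @ x                 # bias/variance proxy of Lemma 1
        if r_t > t ** (-kappa):             # BBQ query condition
            S = np.hstack([S, x[:, None]])  # store the instance...
            Y = np.append(Y, y)             # ...and observe its label
```

Since $r_t < 1$ always holds, setting $\kappa = 0$ makes the threshold $t^0 = 1$ and no label is ever queried, while $\kappa = 1$ gives the fastest-vanishing threshold, consistently with the interpolation between the unsupervised and supervised regimes discussed above.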

In the rest of the paper we denote by $\{\phi\}$ the indicator function of a Boolean predicate $\phi$.

Proof (of Theorem 1). Fix any $\varepsilon \in (0, 1)$. As in (Cavallanti et al., 2009), we first observe that our label noise model allows us to upper bound the time-t regret $\mathbb{P}(Y_t \hat{\Delta}_t < 0) - \mathbb{P}(Y_t \Delta_t < 0)$ as

$$\mathbb{P}(Y_t \hat{\Delta}_t < 0) - \mathbb{P}(Y_t \Delta_t < 0) \le \varepsilon\,\{|\Delta_t| < \varepsilon\} + \mathbb{P}\big(\hat{\Delta}_t \Delta_t \le 0,\ |\Delta_t| \ge \varepsilon\big) \le \varepsilon\,\{|\Delta_t| < \varepsilon\} + \mathbb{P}\big(|\hat{\Delta}_t - \Delta_t| \ge \varepsilon\big).$$

Hence the cumulative regret (2) can be split as follows:

$$R_T \le \varepsilon\,T_\varepsilon + \sum_{t=1}^{T} \mathbb{P}\big(|\hat{\Delta}_t - \Delta_t| \ge \varepsilon\big). \qquad (3)$$

We proceed by expanding the indicator of $|\hat{\Delta}_t - \Delta_t| \ge \varepsilon$ with the introduction of the bias term $B_t$:

$$\big\{|\hat{\Delta}_t - \Delta_t| \ge \varepsilon\big\} \le \big\{|\hat{\Delta}_t + B_t - \Delta_t| \ge \tfrac{\varepsilon}{2}\big\} + \big\{|B_t| > \tfrac{\varepsilon}{2}\big\}.$$

Note that

$$\big\{|B_t| > \tfrac{\varepsilon}{2}\big\} \le \big\{r_t > \tfrac{\varepsilon^2}{8}\big\} \le e\,\exp\!\Big(-\frac{\varepsilon^2}{8 r_t}\Big),$$

the first inequality deriving from a combination of Properties 2 and 3 in Lemma 1 (and then overapproximating), whereas the second one uses $\{b < 1\} \le e^{1-b}$ for $b \ge 0$. Moreover, by Properties 4 and 5 in Lemma 1, we have that

$$\mathbb{P}\big(|\hat{\Delta}_t + B_t - \Delta_t| \ge \tfrac{\varepsilon}{2}\big) \le 2\exp\!\Big(-\frac{\varepsilon^2}{8 r_t}\Big).$$

We substitute this back into (3) and single out the steps where queries are issued. This gives

$$R_T \le \varepsilon\,T_\varepsilon + e\sum_{t\,:\,r_t \le t^{-\kappa}} \exp\!\Big(-\frac{\varepsilon^2}{8 r_t}\Big) + e\sum_{t\,:\,r_t > t^{-\kappa}} \exp\!\Big(-\frac{\varepsilon^2}{8 r_t}\Big).$$

The second term is bounded as follows:

$$\sum_{t\,:\,r_t \le t^{-\kappa}} \exp\!\Big(-\frac{\varepsilon^2}{8 r_t}\Big) \le \sum_{t=1}^{T} \exp\!\Big(-\frac{\varepsilon^2 t^\kappa}{8}\Big) \le \int_0^\infty \exp\!\Big(-\frac{\varepsilon^2 x^\kappa}{8}\Big)\,dx = \frac{1}{\kappa}\Big(\frac{8}{\varepsilon^2}\Big)^{1/\kappa}\Gamma(1/\kappa),$$

where $\Gamma$ is Euler's Gamma function, $\Gamma(x) = \int_0^\infty e^{-t}\,t^{x-1}\,dt$. We further bound $\frac{1}{\kappa}\Gamma(1/\kappa) \le \lceil 1/\kappa \rceil!$ using the monotonicity of $\Gamma$. For the third term we write

$$\sum_{t\,:\,r_t > t^{-\kappa}} \exp\!\Big(-\frac{\varepsilon^2}{8 r_t}\Big) \le \sum_{t\,:\,r_t > t^{-\kappa}} \frac{8}{e\,\varepsilon^2}\,r_t \le \frac{8d}{e\,\varepsilon^2}\,\ln\!\Big(1 + \frac{N_T}{d}\Big).$$

The first step uses the inequality $e^{-x} \le \frac{1}{ex}$ for $x > 0$, while the second step uses Property 6 in Lemma 1. Finally, in order to derive a bound on the number $N_T$ of queried labels, we have

$$N_T \le \sum_{t\,:\,r_t > t^{-\kappa}} r_t\,t^{\kappa} \le T^{\kappa} \sum_{t\,:\,r_t > t^{-\kappa}} r_t \le T^{\kappa}\,d\,\ln\!\Big(1 + \frac{N_T}{d}\Big),$$

where for the last inequality we used, once more, Property 6 in Lemma 1. Hence $N_T = O(d\,T^\kappa \ln T)$, and this concludes the proof.

It is important to observe that, if we disregard the margin term $\varepsilon\,T_\varepsilon$ (which is fully controlled by the adversary), the regret bound depends logarithmically on $T$ for any constant $\kappa > 0$:

$$R_T \le \varepsilon\,T_\varepsilon + O\Big(\frac{1}{\varepsilon^{2/\kappa}} + \frac{d}{\varepsilon^2}\,\ln T\Big).$$

If $\kappa$ is set to 1, then our bound on the number of queries $N_T$ becomes vacuous, and the selective sampling algorithm essentially becomes fully supervised. This recovers the known regret bound for RLS in the fully supervised case, $R_T \le \varepsilon\,T_\varepsilon + O(d \ln T / \varepsilon^2)$.

Remark 1. A randomized variant of BBQ exists that queries label $y_t$ with independent probability $r_t^{(1-\kappa)/\kappa} \in [0, 1]$. Through a bias-variance analysis similar to the one in Theorem 1 above, one can show that, in expectation over the internal randomization of this algorithm, the cumulative regret $R_T$ is bounded by $\min_{0<\varepsilon<1}\big(\varepsilon\,T_\varepsilon + O\big(\frac{dL}{\varepsilon^{2/\kappa}}\big)\big)$, while the number of queried labels $N_T$ is $O\big(T^\kappa (dL)^{1-\kappa}\big)$, where $L = \ln T$. This bound is similar (though generally incomparable) to the one of Theorem 1.
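The randomized variant of Remark 1 only changes the query rule of Algorithm 1; a sketch of the test (our own rendering, with rng a numpy Generator passed in by the caller):

```python
def randomized_bbq_query(r_t, kappa, rng):
    # Remark 1: query with independent probability r_t^((1-kappa)/kappa),
    # for kappa in (0, 1]; since 0 <= r_t <= 1, this probability is in [0, 1].
    return rng.random() < r_t ** ((1.0 - kappa) / kappa)
```

For $\kappa = 1$ the exponent is 0 and every label is queried, recovering the fully supervised regime, consistently with the discussion after the proof of Theorem 1.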

4. A parametric performance guarantee

In the proof of Theorem 1 the quantity $\varepsilon$ acts as a threshold for the cumulative regret, which is split into a sum over steps t such that $|\Delta_t| < \varepsilon$ (where the regret grows by less than $\varepsilon$) and a sum over the remaining steps. Most technicalities in the proof are due to the fact that the final bound depends on the optimal choice of this $\varepsilon$, which the algorithm need not know. On the other hand, if a specific value for $\varepsilon$ is provided as input to the algorithm, then the cumulative regret over steps t such that $|\Delta_t| \ge \varepsilon$ can be bounded by any constant $\delta > 0$ using only order of $(d/\varepsilon^2)\ln(T/\delta)$ queries. In particular, when $\min_t |\Delta_t| \ge \varepsilon$, the above logarithmic bound implies that the per-step regret vanishes exponentially fast as a function of the number of queries. As we stated in the introduction, this result cannot be obtained as an easy consequence of known results, due to the adversarial nature of the instance sequence.

We now develop the above argument for a practically motivated variant of our BBQ selective sampler. Let us disregard for a moment the bias term $B_t$. In order to guarantee that $|\hat{\Delta}_t - \Delta_t| \le \varepsilon$ holds when no query is issued, it is enough to observe that Property 5 of Lemma 1 implies $|\hat{\Delta}_t - \Delta_t| \le \sqrt{2 r_t \ln\frac{2}{\delta}}$ with probability at least $1-\delta$. This immediately delivers a rule prescribing that no query be issued at time t when $\sqrt{2 r_t \ln\frac{2}{\delta}} \le \varepsilon$. A slightly more involved condition, one that better exploits the inequalities of Lemma 1, allows us to obtain a significantly improved practical performance. This results in the algorithm described in Algorithm 2 below. The algorithm, called Parametric BBQ, takes as input two parameters $\varepsilon$ and $\delta$, and issues a query at time t whenever¹

$$\big[\varepsilon - r_t - s_t\big]_+ < q_t\,\sqrt{2\,\ln\frac{2t(t+1)}{\delta}}. \qquad (4)$$

Algorithm 2: The Parametric BBQ selective sampler
  Parameters: $0 < \varepsilon, \delta < 1$.
  Initialization: weight vector $w = 0$.
  for each time step t = 1, 2, ... do
    observe instance $x_t \in \mathbb{R}^d$;
    predict label $y_t \in \{-1, +1\}$ with $\mathrm{sgn}(w^\top x_t)$;
    if $[\varepsilon - r_t - s_t]_+ < q_t\sqrt{2\ln\frac{2t(t+1)}{\delta}}$ then
      query label $y_t$ and update $w_t$ using $(x_t, y_t)$ as in (1)
    end if
  end for

¹ Here and throughout, $[x]_+ = \max\{0, x\}$.

Theorem 2. If Parametric BBQ is run with input $\varepsilon, \delta \in (0, 1)$, then: (1) with probability at least $1-\delta$, $|\hat{\Delta}_t - \Delta_t| \le \varepsilon$ holds on all time steps t when no query is issued; (2) the number $N_T$ of queries issued after any number $T$ of steps is bounded as

$$N_T = O\left(\frac{d}{\varepsilon^2}\,\ln\frac{T}{\delta}\,\ln\frac{d\,\ln(T/\delta)}{\varepsilon}\right).$$

This theorem has been phrased so as to make it easier to compare with a corresponding result in (Strehl & Littman, 2008) for the KWIK ("Knows What It Knows") framework. In that paper, the authors use a modification of Auer's upper confidence linear regression algorithm for associative reinforcement learning (Auer, 2002). This modification allows them to compete against any adaptive adversarial strategy generating instance vectors $x_t$, but it yields the significantly worse bound $\tilde{O}(d^3/\varepsilon^4)$ on $N_T$ (in the KWIK setting, $N_T$ is the number of times the prediction algorithm answers "I don't know"). Besides, their strategy seems to work in the finite dimensional ($d < \infty$) case only. In contrast, Parametric BBQ works against an oblivious adversary only, but it has the better bound $N_T = \tilde{O}(d/\varepsilon^2)$, with the $\tilde{O}$ notation hiding a mild logarithmic dependence on $T$. Moreover, Parametric BBQ can be readily run in an infinite dimensional ($d = \infty$) RKHS (recall the comment before the proof of Theorem 1). In fact, this is a quite important feature: the real-world experiments of Section 5 needed kernels in order to either attain a good empirical performance (on Adult) or use a reasonable amount of computational resources (on RCV1).
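Since the test (4) uses only $r_t$, $s_t$ and $q_t$, it can be evaluated without touching any label; a sketch (our own naming) given $A_t^{-1}$ and the matrix of previously queried instances:

```python
import numpy as np

def parametric_bbq_should_query(x, S, A_inv, t, eps, delta):
    """Query condition (4) of Parametric BBQ:
    [eps - r_t - s_t]_+ < q_t * sqrt(2 ln(2 t (t+1) / delta))."""
    v = A_inv @ x                   # A_t^{-1} x_t
    r_t = x @ v                     # x_t^T A_t^{-1} x_t
    s_t = np.linalg.norm(v)         # ||A_t^{-1} x_t||
    q_t = np.linalg.norm(S.T @ v)   # ||S_{t-1}^T A_t^{-1} x_t||
    lhs = max(0.0, eps - r_t - s_t)
    return lhs < q_t * np.sqrt(2.0 * np.log(2.0 * t * (t + 1) / delta))
```

With the modified condition used in Section 5, eps is fixed to the default value 1 and the $\ln(2t(t+1)/\delta)$ factor is replaced by $\ln(2/\delta)$.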
Remark 2. The bound on the number of queried labels in Theorem 2 is optimal up to logarithmic factors. In fact, it is possible to prove that there exists a sequence $x_1, x_2, \dots$ of instances and a number $\varepsilon_0 > 0$ such that: for all $\varepsilon \le \varepsilon_0$ and for any learning algorithm that issues $N = O(d/\varepsilon^2)$ queries, there exists a target vector $u \in \mathbb{R}^d$ and a time step $t = \Omega(d/\varepsilon^2)$ for which the estimate $\hat{\Delta}_t$ computed by the algorithm for $\Delta_t = u^\top x_t$ has the property $\mathbb{P}\big(|\hat{\Delta}_t - \Delta_t| > \varepsilon\big) = \Omega(1)$. Hence, at least $\Omega(d/\varepsilon^2)$ queries are needed to learn any target hyperplane with arbitrarily small accuracy and arbitrarily high confidence.

Proof (of Theorem 2). Let $I \subseteq \{1, \dots, T\}$ be the set of time steps when a query is issued. Then, using Property 2 of Lemma 1, we can write

$$\sum_{t\notin I}\big\{|\hat{\Delta}_t - \Delta_t| > \varepsilon\big\} \le \sum_{t\notin I}\big\{|\hat{\Delta}_t + B_t - \Delta_t| > \varepsilon - |B_t|\big\} \le \sum_{t\notin I}\big\{|\hat{\Delta}_t + B_t - \Delta_t| > [\varepsilon - r_t - s_t]_+\big\}.$$

We first take expectations on both sides, and then apply Property 5 of Lemma 1 along with condition (4) rewritten as follows:

$$2\exp\left(-\frac{[\varepsilon - r_t - s_t]_+^2}{2 q_t^2}\right) \le \frac{\delta}{t(t+1)}.$$

This gives

$$\sum_{t\notin I}\mathbb{P}\big(|\hat{\Delta}_t - \Delta_t| > \varepsilon\big) \le \sum_{t\notin I}\mathbb{P}\big(|\hat{\Delta}_t + B_t - \Delta_t| > [\varepsilon - r_t - s_t]_+\big) \le \sum_{t\notin I} 2\exp\left(-\frac{[\varepsilon - r_t - s_t]_+^2}{2 q_t^2}\right) \le \sum_{t=1}^{\infty}\frac{\delta}{t(t+1)} = \delta.$$

In order to derive a bound on the number $N_T$ of queried labels, we proceed as follows. For every step $t \in I$ in which a query was issued we can write

$$\varepsilon - r_t - \sqrt{r_t} \le \varepsilon - r_t - s_t \le \big[\varepsilon - r_t - s_t\big]_+ < q_t\,\sqrt{2\ln\frac{2t(t+1)}{\delta}} \le \sqrt{2\,r_t\,\ln\frac{2t(t+1)}{\delta}},$$

where we used Properties 3 and 4 of Lemma 1. Solving for $r_t$ and overapproximating, we obtain

$$r_t \ge \frac{\varepsilon^2}{2\varepsilon + \varepsilon^2 + 2\ln\frac{2t(t+1)}{\delta}}. \qquad (5)$$

Similarly to the proof of Theorem 1, we then write

$$N_T\,\min_{t\in I} r_t \le \sum_{t\in I} r_t \le d\,\ln\left(1 + \frac{N_T}{d}\right).$$

Using (5) we get $N_T = O\big(\frac{d}{\varepsilon^2}\ln\frac{T}{\delta}\ln\frac{d\ln(T/\delta)}{\varepsilon}\big)$.

5. Experiments

In this section we report on preliminary experiments with the Parametric BBQ algorithm. The first test is a synthetic experiment to validate the model. We generated 10,000 random examples on the unit circle in $\mathbb{R}^2$. The labels of these examples were generated according to our noise model (see Section 2) using a randomly selected hyperplane $u$ with unit norm. We then set $\delta = 0.1$ and analyzed the behavior of the algorithm with various settings of $\varepsilon > 0$, using a linear kernel.

Figure 1: Maximum error (jagged blue line) and number of queried labels (decreasing green line) on a synthetic dataset, for Parametric BBQ with $\delta = 0.1$ and $0.15 \le \varepsilon \le 1$. The straight blue line is the theoretical upper bound on the maximum error.

In Figure 1 the jagged blue line represents the maximum error over the example sequence, i.e., $\max_t |\hat{\Delta}_t - \Delta_t|$. Although we stopped the plot at $\varepsilon = 1$, note that the maximum error is dominated by $|\hat{\Delta}_t|$, which can be of the order of $\sqrt{N_t}$. As predicted by Theorem 2, the maximum error remains below the straight line $y = \varepsilon$ (the maximum error predicted by the theory). In the same plot, the decreasing green line shows the number of queried labels, which closely follows the curve $\varepsilon^{-2}$ predicted by the theory. This initial test reveals that the algorithm is dramatically underconfident, i.e., it is a lot more precise than it "thinks". Moreover, the actual error is rather insensitive to the choice of $\varepsilon$.
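The synthetic setup is easy to reproduce along the following lines (a sketch under our own choice of random seed and helpers; pairing it with the query test above and the margin of Eq. (1) yields the two curves of Figure 1):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# 10,000 examples on the unit circle in R^2 and a random unit-norm target u.
theta = rng.uniform(0.0, 2.0 * np.pi, size=n)
X = np.stack([np.cos(theta), np.sin(theta)], axis=1)
u = rng.standard_normal(2)
u /= np.linalg.norm(u)

# Labels drawn from the linear noise model: P(Y = +1 | x) = (1 + u.x) / 2.
y = np.where(rng.random(n) < (1.0 + X @ u) / 2.0, 1.0, -1.0)

# For each eps in a grid (with delta = 0.1), run Parametric BBQ on (X, y)
# and record max_t |hatDelta_t - u.x_t| and the fraction of queried labels.
```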

In order to leverage this, we ran the remaining tests using Parametric BBQ with a more extreme setting of the parameters. Namely, we changed the query condition (the "if" condition in Algorithm 2) to

$$\big[1 - r_t - s_t\big]_+ < q_t\,\sqrt{2\ln\frac{2}{\delta}}$$

for $0 < \delta < 1$. This amounts to setting the desired error to a default value of $\varepsilon = 1$, while making the number of queried labels independent of $T$.

With the above setting, we compared Parametric BBQ to the second-order version of the label-efficient classifier (SOLE) of (Cesa-Bianchi et al., 2006b). This is a mistake-driven RLS algorithm that queries the label of the current instance with probability $1/(1 + b\,|\hat{\Delta}_t|)$, where $b > 0$ is a parameter and $\hat{\Delta}_t$ is the RLS margin. The other baseline algorithm is a vanilla sampler (called Random in the plots) that asks for labels at random with constant probability $0 < p < 1$. Recall that SOLE does not come with a guaranteed bound on the number of queried labels. Random, on the other hand, has the simple expectation bound $\mathbb{E}[N_T] = pT$. For each algorithm we plot the F-measure (the harmonic mean of precision and recall) against the fraction of queried labels. We control the fraction of queried labels by changing the parameters of the three algorithms: $\delta$ for Parametric BBQ, $b$ for SOLE, and $p$ for Random.
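For clarity, here is the metric made explicit in a small helper of our own, including the macroaveraging used below for the multi-class RCV1 experiment:

```python
def f_measure(tp, fp, fn):
    # Harmonic mean of precision tp/(tp+fp) and recall tp/(tp+fn),
    # which simplifies to 2 tp / (2 tp + fp + fn).
    return 2.0 * tp / (2.0 * tp + fp + fn) if tp > 0 else 0.0

def macro_f_measure(counts):
    # counts: one (tp, fp, fn) triple per one-vs-all classifier.
    return sum(f_measure(*c) for c in counts) / len(counts)
```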
For the first real-world experiment we chose a9a (available at www.csie.ntu.edu.tw/~cjlin/libsvmtools/), a subset of the census-income (Adult) database with 32,561 binary-labeled examples and 123 features. In order to bring all algorithms to a reasonable performance level, we used a Gaussian kernel with $\sigma^2 = 1.25$. The plots (Figure 2) show that fewer than 6% of the labels are enough for the three algorithms to saturate their performance. In the whole query range, Parametric BBQ is consistently slightly better than SOLE, while Random has the worst performance.

Figure 2: F-measure against fraction of queried labels on the a9a dataset (32,561 examples in random order). The plotted curves are averages over 10 random shuffles.

For our second real-world experiment we used the first 40,000 newswire stories, in chronological order, from the Reuters Corpus Volume 1 dataset (RCV1). Each newsstory of this corpus is tagged with one or more labels from a set of 102 classes. A standard TF-IDF bag-of-words encoding was used to obtain 138,860 features. We considered the 50 most populated classes and trained 50 classifiers one-vs-all, using a linear kernel. Earlier experiments, such as those reported in (Cesa-Bianchi et al., 2006b), show that RLS-based algorithms perform best on RCV1 when run in a mistake-driven fashion. For this reason, on this dataset we used a mistake-driven variant of Parametric BBQ, storing a queried label only when it is wrongly predicted.

Figure 3: F-measure against fraction of queried labels (average over the 50 most frequent categories of RCV1, first 40,000 examples in chronological order).

Figure 3 shows the macroaveraged F-measure plotted against the average fraction of queried labels, where averages are computed over the 50 classifiers. Here the algorithms need over 35% of the labels to saturate. Moreover, Parametric BBQ performs worse than SOLE, although still better than Random. Since SOLE and Parametric BBQ are both based on the mistake-driven RLS classifier, any difference in performance is due to their different query conditions: SOLE is margin-based, while Parametric BBQ uses $q_t$ and related quantities. Note that, unlike the margin, $q_t$ does not depend on the queried labels, but only on the correlation between their corresponding instances. This fact, which helped us a lot in the analysis of BBQ, could make a crucial difference between domains like RCV1 (where instances are extremely sparse) and Adult (where instances are relatively dense). More experimental work is needed in order to settle this conjecture.

6. Conclusions and ongoing research

We have introduced a new family of online algorithms, the BBQ family, for selective sampling under oblivious adversarial environments. These algorithms naturally interpolate between fully supervised and fully unsupervised learning scenarios. A parametric variant (Parametric BBQ) of our basic algorithm is designed to work in a weakened KWIK framework (Li et al., 2008; Strehl & Littman, 2008) with improved bounds on the number of queried labels. We have made preliminary experiments: first, we validated the theory on an artificially generated dataset; second, we compared a variant of Parametric BBQ to algorithms with similar guarantees, with encouraging results.

A few issues we are currently working on are the following. First, we are trying to see whether a sharper analysis of BBQ exists which allows one to prove a regret bound of the form $\varepsilon\,T_\varepsilon + \frac{d\ln T}{\varepsilon^2}$ when $N_T = O\big(\frac{d\ln T}{\varepsilon^2}\big)$. This bound would be a worst-case analog of the bound Cavallanti et al. (2009) have obtained in an i.i.d. setting. This improvement is likely to require refined bounds on the bias and variance of our estimators. Moreover, we would like to see whether it is possible either to remove the $\ln T$ dependence in the bound on $N_T$ in Theorem 2, or to make Parametric BBQ work in adaptive adversarial environments (presumably at the cost of looser bounds on $N_T$). In fact, it is currently unclear to us how a direct covering argument could be applied in Theorem 2 which avoids the need for a conditionally independent structure of the involved random variables.

On the experimental side, we are planning to perform a more thorough empirical investigation using additional datasets. In particular, since our algorithms can also be viewed as memory-bounded procedures, we would like to see how they perform when compared to budget-based algorithms, such as those in (Weston et al., 2005; Dekel et al., 2007; Cavallanti et al., 2007; Orabona et al., 2008). Finally, since our algorithms can be easily adapted to solve regression tasks, we are planning to test the BBQ family on standard regression benchmarks.

Acknowledgments

We would like to thank the anonymous reviewers for their helpful comments. The third author is supported by the DIRAC project under EC grant FP6. All authors acknowledge partial support by the PASCAL2 NoE under EC grant FP7. This publication only reflects the authors' views.

References

Auer, P. (2002). Using confidence bounds for exploitation-exploration trade-offs. Journal of Machine Learning Research, 3.
Balcan, M., Beygelzimer, A., & Langford, J. (2006). Agnostic active learning. Proceedings of the 23rd International Conference on Machine Learning.
Balcan, M., Broder, A., & Zhang, T. (2007). Margin-based active learning. Proceedings of the 20th Annual Conference on Learning Theory.
Cavallanti, G., Cesa-Bianchi, N., & Gentile, C. (2007). Tracking the best hyperplane with a simple budget Perceptron. Machine Learning, 69.
Cavallanti, G., Cesa-Bianchi, N., & Gentile, C. (2009). Linear classification and selective sampling under low noise conditions. In Advances in Neural Information Processing Systems 21. MIT Press.
Cesa-Bianchi, N., Gentile, C., & Zaniboni, L. (2006a). Incremental algorithms for hierarchical classification. Journal of Machine Learning Research, 7.
Cesa-Bianchi, N., Gentile, C., & Zaniboni, L. (2006b). Worst-case analysis of selective sampling for linear classification. Journal of Machine Learning Research, 7.
Cohn, D., Atlas, L., & Ladner, R. (1990). Training connectionist networks with queries and selective sampling. In Advances in Neural Information Processing Systems 2. MIT Press.
Dasgupta, S., Hsu, D., & Monteleoni, C. (2008). A general agnostic active learning algorithm. In Advances in Neural Information Processing Systems 21. MIT Press.
Dasgupta, S., Kalai, A. T., & Monteleoni, C. (2005). Analysis of perceptron-based active learning. Proceedings of the 18th Annual Conference on Learning Theory.
Dekel, O., Shalev-Shwartz, S., & Singer, Y. (2007). The Forgetron: a kernel-based Perceptron on a budget. SIAM Journal on Computing, 37.
Freund, Y., Seung, S., Shamir, E., & Tishby, N. (1997). Selective sampling using the query by committee algorithm. Machine Learning, 28.
Li, L., Littman, M., & Walsh, T. (2008). Knows what it knows: a framework for self-aware learning. Proceedings of the 25th International Conference on Machine Learning.
Orabona, F., Keshet, J., & Caputo, B. (2008). The Projectron: a bounded kernel-based Perceptron. Proceedings of the 25th International Conference on Machine Learning.
Strehl, A., & Littman, M. (2008). Online linear regression and its application to model-based reinforcement learning. In Advances in Neural Information Processing Systems 20. MIT Press.
Weston, J., Bordes, A., & Bottou, L. (2005). Online and offline on an even tighter budget. Proceedings of the 10th International Workshop on Artificial Intelligence and Statistics.


More information

An Iterative Incremental Learning Algorithm for Complex-Valued Hopfield Associative Memory

An Iterative Incremental Learning Algorithm for Complex-Valued Hopfield Associative Memory An Iterative Incremental Learning Algorithm for Complex-Value Hopfiel Associative Memory aoki Masuyama (B) an Chu Kiong Loo Faculty of Computer Science an Information Technology, University of Malaya,

More information

Implicit Differentiation

Implicit Differentiation Implicit Differentiation Thus far, the functions we have been concerne with have been efine explicitly. A function is efine explicitly if the output is given irectly in terms of the input. For instance,

More information

12.11 Laplace s Equation in Cylindrical and

12.11 Laplace s Equation in Cylindrical and SEC. 2. Laplace s Equation in Cylinrical an Spherical Coorinates. Potential 593 2. Laplace s Equation in Cylinrical an Spherical Coorinates. Potential One of the most important PDEs in physics an engineering

More information

THE VAN KAMPEN EXPANSION FOR LINKED DUFFING LINEAR OSCILLATORS EXCITED BY COLORED NOISE

THE VAN KAMPEN EXPANSION FOR LINKED DUFFING LINEAR OSCILLATORS EXCITED BY COLORED NOISE Journal of Soun an Vibration (1996) 191(3), 397 414 THE VAN KAMPEN EXPANSION FOR LINKED DUFFING LINEAR OSCILLATORS EXCITED BY COLORED NOISE E. M. WEINSTEIN Galaxy Scientific Corporation, 2500 English Creek

More information

. Using a multinomial model gives us the following equation for P d. , with respect to same length term sequences.

. Using a multinomial model gives us the following equation for P d. , with respect to same length term sequences. S 63 Lecture 8 2/2/26 Lecturer Lillian Lee Scribes Peter Babinski, Davi Lin Basic Language Moeling Approach I. Special ase of LM-base Approach a. Recap of Formulas an Terms b. Fixing θ? c. About that Multinomial

More information

An Analytical Expression of the Probability of Error for Relaying with Decode-and-forward

An Analytical Expression of the Probability of Error for Relaying with Decode-and-forward An Analytical Expression of the Probability of Error for Relaying with Decoe-an-forwar Alexanre Graell i Amat an Ingmar Lan Department of Electronics, Institut TELECOM-TELECOM Bretagne, Brest, France Email:

More information

Estimation of the Maximum Domination Value in Multi-Dimensional Data Sets

Estimation of the Maximum Domination Value in Multi-Dimensional Data Sets Proceeings of the 4th East-European Conference on Avances in Databases an Information Systems ADBIS) 200 Estimation of the Maximum Domination Value in Multi-Dimensional Data Sets Eleftherios Tiakas, Apostolos.

More information

Optimized Schwarz Methods with the Yin-Yang Grid for Shallow Water Equations

Optimized Schwarz Methods with the Yin-Yang Grid for Shallow Water Equations Optimize Schwarz Methos with the Yin-Yang Gri for Shallow Water Equations Abessama Qaouri Recherche en prévision numérique, Atmospheric Science an Technology Directorate, Environment Canaa, Dorval, Québec,

More information

Q( t) = T C T =! " t 3,t 2,t,1# Q( t) T = C T T T. Announcements. Bezier Curves and Splines. Review: Third Order Curves. Review: Cubic Examples

Q( t) = T C T =!  t 3,t 2,t,1# Q( t) T = C T T T. Announcements. Bezier Curves and Splines. Review: Third Order Curves. Review: Cubic Examples Bezier Curves an Splines December 1, 2015 Announcements PA4 ue one week from toay Submit your most fun test cases, too! Infinitely thin planes with parallel sies μ oesn t matter Term Paper ue one week

More information

The total derivative. Chapter Lagrangian and Eulerian approaches

The total derivative. Chapter Lagrangian and Eulerian approaches Chapter 5 The total erivative 51 Lagrangian an Eulerian approaches The representation of a flui through scalar or vector fiels means that each physical quantity uner consieration is escribe as a function

More information