Quantile Search: A Distance-Penalized Active Learning Algorithm for Spatial Sampling
|
|
- Matthew Cain
- 5 years ago
- Views:
Transcription
1 Quantile Search: A Distance-Penalized Active Learning Algorith for Spatial Sapling John Lipor, Laura Balzano, Branko Kerkez, and Don Scavia 3 Departent of Electrical and Coputer Engineering Departent of Civil and Environental Engineering 3 School of Natural Resources and Environent University of Michigan, Ann Arbor {lipor,girasole,bkerkez,scavia}@uichedu Abstract Adaptive sapling theory has shown that, with proper assuptions on the signal class, algoriths exist to reconstruct a signal in R d with an optial nuber of saples We generalize this proble to when the cost of sapling is not only the nuber of saples but also the distance traveled between saples This is otivated by our work studying regions of low oxygen concentration in the Great Lakes We show that for one-diensional threshold classifiers, a tradeoff between nuber of saples and distance traveled can be achieved using a generalization of binary search, which we refer to as quantile search We derive the expected total sapling tie for noiseless easureents and the expected nuber of saples for an extension to the noisy case We illustrate our results in siulations relevant to our sapling application I INTRODUCTION In any signal reconstruction probles, the cost associated with taking a saple is in soe way dependent on the distance between saples For exaple, consider sapling the Great Lakes with a sall autonoous watercraft in order to reconstruct a spatial signal Lakes such as Lake Erie span an area on the order of thousands of kiloeters; hence the tie it takes for the boat to reach the next saple location is a ajor part of the sapling cost In the traditional or passive sapling scenario, the saple locations or saple ties are chosen a priori, often on a unifor grid Contrasting this is the field of active sapling or active learning, wherein saple locations ay be chosen as a function of all previous locations and their corresponding easureents Active learning theory has shown that binary search guarantees optial sapling rates [], but this theory ignores any sapling cost other than the nuber of saples taken Our proble of interest is one in spatial sapling where the spatial region of interest is relatively large, and the total sapling tie is a function of both the nuber of saples required and the distance traveled throughout the sapling procedure Our contribution is an algorith, tered quantile search, that provides a tradeoff between the nuber of saples required and the distance traveled in finding the change point of a one-diensional threshold classifier At its two extrees, quantile search iniizes either the nuber of saples or the distance traveled, with a tradeoff achieved by varying a search paraeter We derive and analyze this algorith in the noise-free and noisy cases and study it in siulations with an application to environental onitoring, the study of oxygen levels in the Great Lakes The reconstruction of this signal is of interest to water quality experts in order to ensure the balance and quality of water in this region []; we show an estiate interpolated fro historical easureents in Figure Fig : Dissolved oxygen concentrations in Lake Erie Points represent saple locations and solid black lines delineate the central basin Source:
2 fx 04 0 θ x Fig : Threshold classifier in one diension with change point θ 3 Where as binary search takes saples at the bisection of a probability distribution, or the -quantile, quantile search instead saples at the first -quantile For a preview of our results, suppose that we wish to estiate a threshold θ on a one diensional interval For noiseless saples, our results show that the expected estiation error after n easureents is E[ ˆθ n θ ] n +, 4 which generalizes to binary search E[ ˆθ n θ ] n+ The expected distance traveled is E[D], which also generalizes to binary search where E[D] Therefore, larger values of allow for a tradeoff between nuber of saples and distance traveled The reainder of this paper is organized as follows In Section II, we otivate and forulate the proble of interest In Section III we place this work into the context of the current literature on active learning Section IV describes the quantile search algorith for both the noise-free and noisy cases In Section V we verify the theory presented via siulation and apply the algoriths described to our proble of interest, sapling hypoxia in Lake Erie Conclusions and directions of future research are given in Section VI II MOTIVATION AND PROBLEM FORMULATION The proble of interest for this work is the initial otivating proble described in the Introduction, where we wish to deterine hypoxic regions in the Great Lakes The spatial extent of low oxygen concentration or hypoxia is a strong indicator of the health of the lakes [3] Studies perfored by the National Oceanic and Atospheric Adinistration on Lake Erie utilize a nuber of easureent stations on the order of tens, as well as hydrodynaic odels, in order to provide a warning syste for severe hypoxia in the central basin We wish to deterine the spatial extent of the hypoxic region in the central basin of Lake Erie, a phenoenon that occurs yearly in the late suer onths A region is deeed hypoxic if the dissolved oxygen concentration is less than 0 pp [3] Further, we assue the hypoxic zone is one connected region with a sooth boundary The proble can then be considered a binary classification proble, with spatial points receiving a label 0 if they are hypoxic and otherwise, where the desired spatial extent corresponds to the Bayes decision boundary To perfor sapling, an autonoous watercraft with a speed ranging fro 05-4 /s will be used Further, since the axiu spatial extent ay occur at one of any depths, a depth profile ust be taken at each point, increasing the tie per saple significantly Finally, the hypoxic zone can only be considered approxiately stationary for a period of tie less than one week We therefore seek to estiate the decision boundary while siultaneously iniizing the total tie spent sapling Following [4], we split our proble into several one diensional intervals, a process that is described further in Section V-B In each interval we ust find a threshold beyond which the lake is hypoxic Define the step function class F {f : [0, ] R fx [0,θ x} where θ [0, ] is the change point and S x denotes the indicator function, which is on the set S and 0 elsewhere In contrast to the active learning scenario, our goal is to estiate θ while iniizing the total tie required for sapling, a function of both the nuber of saples taken and the distance traveled during sapling Denote the observations {Y i } n i {0, }n as saples of an unknown function f θ F taken at saple locations on the unit interval {X i } n i With probability p, 0 p < /, we observe an erroneous easureent Thus { f θ X i with probability p Y i fx i U i, f θ X i with probability p
3 3 where denotes suation odulo, and U i {0, } are Bernoulli rando variables with paraeter p While other noise scenarios are coon, here we assue the U i are independent and identically distributed and independent of {X i } This noise scenario is of interest as the otivating data hypoxia is a thresholded value in {0, }, where Gaussian noise results in iproper thresholding of the easureents The extension to nonunifor noise eg, a Tsybakov-like noise condition as studied in [5] reains as a topic for future work See Figure for an illustration of f θ X As stated previously, our goal is to estiate θ using as little tie as possible To this end, the saple location at step n is chosen as a function of all previous saple locations and easureents as well as the noise statistics III RELATED WORK A nuber of active learning algoriths exist to estiate θ while iniizing the nuber of saples taken Binary search or binary bisection has been studied on this proble and on any other active learning probles Consider binary bisection applied to the signal in Figure, where θ /3 Initially, the hypothesis space is the unit interval The binary bisection algorith takes its first easureent at X /, easuring Y 0, indicating that θ lies to the left of /, and thus reducing the hypothesis space to the interval [0, /] Continuing in this anner, the algorith reduces the hypothesis space by half after each easureent, achieving what has been shown to be an optial rate of convergence [] In fact, copared to its passive counterpart in this case, saple locations distributed uniforly throughout the interval, binary bisection has been shown to achieve an exponential speedup [], [6] A large body of work has answered questions such as what type of iproveent is attainable in higher diensions [6], [7], for ore coplex decision boundaries and under easureent noise [4], [8] [0], and whether algoriths exist that achieve such iproveents [] [4] Extensions of the binary bisection algorith continue to be an iportant topic of research in active learning and related fields In [5], the author presents a odified version of binary search that can handle noisy easureents, referred to as probabilistic binary search PBS A discretized version of this algorith was presented in [8] and analyzed, yielding an upper bound on the expected rate of convergence of the algorith This discretized algorith, referred to as the B-Z algorith after its authors, has been analyzed fro an inforation-theoretic perspective [6], [7], as well as under Tsybakov s noise condition [5] Further, in [5], the authors show that for the boundary fragent class in diension d, the B-Z algorith can be used to perfor a series of one-diensional searches, achieving an optial rate of convergence up to a logarithic factor The algorith has also been used ore recently in [8] to deonstrate the relationship between active learning and stochastic convex optiization As we noted before, however, binary search travels on average the entire length of the interval under consideration in order to estiate the threshold In the case of Lake Erie, one interval corresponds to approxiately 58 k, aking the use of binary search undesirable A nuber of active learning algoriths exist aside fro binary search, any of which have been shown to be optial under various boundary and noise conditions [4], [7], [9], [] [4] The work of [7], [] [4] presents active linear classifiers that are increasingly adaptive to noise conditions and coputationally feasible In [4], the authors present an algorith that akes use of epirical risk iniization of convex losses Another approach relies on the use of recursive dyadic partitions RDPs [9], in which the d-diensional unit hypercube is recursively divided into d equal sized hypercubes, foring a tree that is then pruned to give rise to an optial estiator An active learning algorith based on this idea is shown to be nearly optial in [9], and this algorith is used in a lake sapling context in [0] While these algoriths provide optial saple rates, they both begin by sapling uniforly at rando throughout the entire feature space This would require traversing the entire lake before proceeding to locations where the boundary is likely to lie, rendering these algoriths infeasible A sall aount of literature exists in which the cost of sapling is generalized beyond the nuber of saples [0] [] The probles considered in [0], [] are siilar to the setting here, in that the distance traveled to obtain saples is taken into account In [0], the authors eploy a traveling salesan with profits odel to iniize the distance traveled while obtaining labels for points of axiu uncertainty The authors of [] iniize an objective function that takes into account both label uncertainty and label cost In both cases, the saples are taken batch-wise before being labeled by a huan expert As every individual saple is inforative, we wish to take advantage of all available data by choosing our saple locations on a saple-by-saple basis, rather than updating after each batch Further, neither algorith is accopanied by theoretical analysis In [], the authors investigate perforance bounds for active learning in the case where label costs are nonunifor The quantile search algoriths presented here extend this idea to the case where label costs are both nonunifor and dynaic IV QUANTILE SEARCH In this section, we extend the ideas of [4], [8] to account for a distance penalization in addition to saple coplexity, resulting in what we refer to as quantile search The basic idea behind this algorith is as follows We wish to find a tradeoff between the nuber of saples required and the total distance traveled to achieve a given estiation error for the change point of a step function on the unit interval As we know, binary bisection iniizes the nuber of required saples On the other hand, continuous spatial sapling iniizes the required distance to estiate the threshold Binary search bisects
4 4 Algorith Deterinistic Quantile Search DQS : input search paraeter, saple budget N : initialize X 0 0, Y 0, n, a 0, b 3: while n N do 4: if Y n then 5: X n X n + b a 6: else 7: X n X n b a 8: end if 9: Y n fx n 0: a ax {X i : Y i, i n} : b in {X i : Y i 0, i n} : ˆθn a+b 3: end while the feasible interval hypothesis space at each step In contrast, one can think of continuous sapling as dividing the feasible interval into infinitesial subintervals at each step With this in ind, a tradeoff becoes clear: one can divide the feasible interval into subintervals of size /, where ranges between and Intuition would tell us that increasing would increase the nuber of saples required but decrease the distance traveled in sapling In what follows, we show that this intuition is correct in both the noise-free and noisy cases, resulting in two novel search algoriths A Deterinistic Quantile Search We first describe and analyze quantile search in the noise-free case p 0, here referred to as deterinistic quantile search DQS and given as Algorith In the following subsections, we analyze the expected saple coplexity and distance traveled for the algorith and show how these results can be used to iniize the total sapling tie Further, the results show that the required nuber of saples increases onotonically with and the distance traveled decreases onotonically with, indicating that the desired tradeoff is achieved Rate of Convergence: Recalling fro Section II, our goal is to estiate the threshold θ of a one-diensional threshold classifier We analyze the expected error of our estiate ˆθ n after a fixed nuber of saples for the DQS algorith The ain result and a sketch of the proof are provided here An expanded proof can be found in Appendix A Theore Consider a deterinistic quantile search with paraeter and let ρ Begin with a unifor prior on θ The expected estiation error after n easureents is then E[ ˆθ n θ ] [ ρ + ρ ] n 3 4 Proof Sketch The proof proceeds fro the law of total expectation Let Z n ˆθ n θ The first easureent is taken at /, and hence the expected error can be calculated when θ / and θ > / [ E[Z ] E Z θ ] Pr θ [ + E Z θ > ] Pr θ > [ ρ + ρ ] 4 Siilarly, after the second easureent is taken, there are four intervals, two that partition the interval [0, /], and two that partition /, ] These intervals contribute four onoials of degree 4 to the error, one of which is ρ 4, one which is ρ 4, and two which are ρ ρ The basic idea is that each parent interval integrates to ρ i ρ j and in the next step gives birth to two child intervals, one evaluating to ρ i+ ρ j and the other ρ i ρ j+ The proof of the theore then follows by induction Consider the result of Theore when In this case, the error becoes E[ ˆθ n θ ] n+ Coparing to the worst case, we see that the average case saple coplexity is exactly one saple better than the worst case, atching the well-known theory of binary search In Section V we confir this result through siulation Distance Traveled: Next, we analyze the expected distance traveled by the DQS algorith in order to converge to the true θ The proof is siilar to that of the previous section, in that it follows by the law of total expectation After each saple, we analyze the expected distance given that the true θ lies in a given interval The result and a proof sketch are given below, with the full proof included in Appendix B
5 5 Theore Let D denote a rando variable representing the distance traveled during a deterinistic quantile search with paraeter Begin with a unifor prior on θ Then E[D] 4 Proof Sketch The DQS algorith for our class of functions chooses saples oving further into the interval until it akes a zero easureent It then turns back and takes saples in the opposite direction until a one is easured This behavior allows us to analyze the total distance by splitting it up into stages first the expected distance traveled before the algorith reaches a point x > θ, and then x < θ, etc Let D n be a rando variable denoting the distance required to ove to the right of θ for the n th tie when n is odd, and to the left of θ for the n th tie when n is even In this case, we have that E[D] E[D n ] n [ i First, we would like to find E[D ] Let A i denote the interval p p0, i p p0, where A 0 [ 0,, so that the A i s for a partition of the unit interval whose endpoints are possible values of the saple locations X j Now note that E[D ] E[D θ A i ] Prθ A i Then since we assue θ is distributed uniforly over the unit interval, Prθ A i i p i p i Next, note that Thus we have p0 E[D ] i0 E[D θ A i ] i0 p0 i p0 p E[D θ A i ] Prθ A i The proof proceeds by rewriting the above in ters of ρ / and then calculating E[D n ] This is done by dividing each A i into subintervals which for partitions of A i The result then follows by induction on E[D n ] 3 Sapling Tie: Given the results in Theores and, we can now choose the optial value of to iniize the expected sapling tie First, let N be a rando variable denoting the nuber of saples required to achieve an error ε and denote its expectation E[N] log4ε log + n, which is found by solving 3 for n Now let T be a rando variable denoting the total tie required to achieve an error ε Then the expected value is E[T ] γe[n] + ηe[d] γn + η n γn + η, 5 where γ denotes the tie required to take one saple, and η denotes the tie required to travel one unit of distance Finding the optial is difficult However, 5 can be used to solve for nuerically using standard techniques We show exaples of such values in Section V As a final note, one ay wonder about the relation to what is known as -ary search [3] In contrast to quantile search, -ary search is tree-based To ake the difference clear, consider the an exaple with θ 3/8 and let 4 In this case, both algoriths take their first saple at X /4 However, after easuring Y, quantile search takes its second easureent at X 7/6, while -ary search proceeds to X / One ay then expect that both algoriths would achieve the desired tradeoff, with -ary search requiring fewer saples than quantile search for a fixed value of We choose the latter for two reasons First, quantile search does not require to be an integer and therefore gives ore flexibility in the resulting tradeoff Second, quantile search as described is the natural generalization of PBS and lends itself to the analysis of [4], [8] in the case where the easureents are noisy A coparison to noisy -ary search is a topic for future work
6 6 Algorith Probabilistic Quantile Search PQS : input search paraeter, saple budget N, probability of error p : initialize π 0 x for all x [0, ], n 0 3: while n N do 4: choose X n+ such that X n+ π 0 n xdx 5: observe Y n+ fx n+ U n+, where U n+ Bernp 6: if Y n+ 0 then 7: 8: else 9: 0: end if : end while : estiate ˆθ n such that ˆθ n 0 π n x / p π n+ x p + p p π n+ x p + p + p π n x, x X n+ π n x, x > X n+ π n x, x X n+ π n x, x > X n+ + p B Probabilistic Quantile Search We now extend the idea behind Section IV-A to the case where easureents ay be noisy ie, p 0 In [8], the authors present the probabilistic binary search PBS algorith The basic idea behind this algorith is to perfor Bayesian updating in order to aintain a posterior distribution on θ given the easureents and locations Rather than bisecting the interval, at each step the algorith bisects the posterior distribution This process is then iterated until convergence and has been shown to achieve optial saple coplexity [5], [7] We now extend this idea to achieve a tradeoff between saple coplexity and distance traveled We refer to the noise-tolerant algorith as probabilistic quantile search PQS, which proceeds as follows Starting with a unifor prior, the first saple is taken at X / The posterior density π n x is then updated as described below, and ˆθ n is chosen as the edian of this distribution The algorith proceeds by taking saples X n such that Xn+ 0 π n xdx, ie, X n is the first -quantile of π n For, the above denotes the edian of the posterior distribution and reduces to PBS A foral description is given in Algorith The Bayesian update on π n can be derived as follows Begin with the first saple We have π 0 x for all x and wish to find π x Let f x X, Y be the conditional density of θ given X, Y Applying Bayes rule, the posterior becoes: f x X, Y PrX, Y θ xπ 0 x PrX, Y For illustration, consider the case where θ 0 We now take the first easureent at X / Then and Pr X /, Y 0 θ 0 p Pr X /, Y θ 0 p In fact, this holds for any θ < / Now exaine the denoinator: + p PrX /, Y 0 Notice that for this siplifies to /, which is the case in [8] We then update the posterior distribution to be p x / π x p + p + p x > /
7 7 The equivalent posterior density can be found for when Y The process of aking an observation and updating the prior is then repeated, yielding general forula for the posterior update When Y n+ 0, we have p + p π n x x / π n+ x p π n x x > / Siilarly, for Y n+, we have + p p π n+ x p + p π n x x / π n x x > / + p Rate of Convergence: Analysis of the above algorith has proven difficult since its inception in 974, with a first proof of a geoetric rate of convergence appearing only recently in [4] Instead, the authors of [4], [5], [8], [7] use a discretized version involving inor odifications In this case, the unit interval is divided into bins of size, such that N The posterior distribution is paraeterized, and a paraeter α is used instead of p in the Bayesian update, where 0 < p < α The analysis of rate of convergence then centers around the increasing probability that at least half of the ass of π n x lies in the correct bin A foral description of the algorith can be found in [5] Given a version of PQS discretized in the sae way, we arrive at the following result Theore 3 Under the assuptions given in Section II, the discretized PQS algorith satisfies sup Pr ˆθ n θ > [ p θ [0,] α + p p + α p ] n [ Taking p α + p + p α p ] n/ yields a bound on the expected error [ p sup E[ ˆθ n θ ] θ [0,] α + p p + α p ] n/ Finally, taking α p/ p + q iniizes the right hand side of these bounds, yielding sup E[ ˆθ n θ ] θ [0,] + n/ p p 6 Proof Sketch In the discretized algorith, a i j denotes the posterior ass at the ith bin after j easureents and aj denotes the vectorized collection of all bins after j easureents Let kθ be the index of the bin containing θ Note that a i 0 for all i We define M θ j a kθj, a kθ j and N θ j + M θj + M θ j a kθj a kθ j + a kθ j + a kθ j As a brief bit of intuition, note that M θ j is a easure of how uch ass is in the bin actually containing θ, while N θ j + is a easure of iproveent fro one iteration to the next and is strictly less than one when an iproveent is ade [4] Fro [4] and following a straightforward application of Markov s inequality and the law of total expectation, we have that Pr θ n θ > E[M θ n], 7 and { } n E[M θ n] M θ 0 ax ax E[N θj + aj] j 0,,n aj The reainder of the proof consists of bounding E[N θ j + aj] by considering the three cases: i kj kθ; ii kj > kθ; and iii kj < kθ In all cases, we show that E[N θ j + aj] q + p q + p β α, where q p and β α Using 7, we have Pr ˆθ n θ > [ q + p + q p β α ] n
8 8 Next, we bound the expected error E[ ˆθ n θ ] Pr ˆθ n θ > t dt + 0 and the result follows by iniizing with respect to and α [ q + p q + p β α The full proof can be found in Appendix C In the case where, the above result atches that of [4], [8] as desired One iportant fact to note is that in contrast to the deterinistic case, the result here is an upper bound on the nuber of saples required for convergence as opposed to an expected value Therefore this result cannot be used directly select the value of that iniizes the expected sapling tie Other siilar analyses [4], [8], [4] also find upper bounds, with the ain barrier to finding expected values being the use of bounds to handle the dependence aong saple locations We instead rely on Monte Carlo siulations to choose the optial value of Finally, the bound here is loose For clarity, consider the case where p 0 and Then the above becoes As noted in [5] and entioned in Section IV-A, sup E[ ˆθ n θ ] θ [0,] sup E[ ˆθ n θ ] θ [0,] n/ n+, indicating that the bound is quite loose, even for the previously considered PBS algorith However, in [5], the authors use this result when to show rate optiality of the PBS algorith This fact suggests that despite the discrepancy, the result of Theore 3 ay still be useful in proving optiality for the PQS algorith While the rate of convergence for PQS can be derived using standard techniques, the expected distance or a useful bound on the distance is ore difficult The technique used in Section IV-A becoes intractable as the values of X i are no longer deterinistic given θ We have considered the approach of exaining the posterior distribution after each step and calculating the possible locations, but at the nth easureent, there are n possible distributions Deterining the expected distance traveled by PQS reains a subject of future work ] n,
9 9 Algorith 3 Discretized Probabilistic Quantile Search : input search paraeter, saple budget N, > 0 such that N, α, β α : define the posterior after j easureent as π j : [0, ] R π j x a i j Ii x, where I [0, ] and I i i, i], for i {,, } for a partition of the unit interval 3: initialize a i 0 such that i a ij 4: while n N do 5: define kj {,, } such that kj i i i a i j, kj a i j > 6: copute τ j, τ j, P j, and P j as τ j a i j kj a i j ikj i τ j kj a i j a i j P j τ j τ j + τ j, i ikj+ P j P j 7: choose X j+ { kj, kj, with probability P j with probability P j P j and define k X j+ 8: observe Y j+ fx j+ U j+ where U j+ Bernα 9: if Y j+ 0 then 0: { : else : 3: end if 4: end while 5: estiate ˆθ n such that ˆθ n 0 π nx / a i j + a i j + { +τ /β α +τ /β α + / τβ α + / τβ α i k i > k i k i > k A Verification of Algoriths V SIMULATIONS In this section, we verify through siulation the theoretical rate of convergence and distance traveled derived in Section IV-A Further, we present siulated results for the PQS algorith and copare with the bound derived in Section IV-B We first siulate the the DQS algorith over a range of fro to 0, where θ ranges over a 000 point grid on the unit interval The resulting average error after 0 saples is shown in Fig 3a, while the average distance before convergence to an error of ε 0 4 is shown in Fig 3b, and the average nuber of saples to achieve the sae error is shown in Fig 3c The figures show the theoretical values for expected error and distance atch the siulated values Fig 3c shows the nuber of saples required to achieve a given error is onotonically increasing in, while Fig 3b shows the distance traveled is onotonically decreasing This indicates that DQS achieves a tradeoff in the noise-free case Fig 3d shows a plot of the expected sapling tie as a function of, with the optial value of indicated in the case where γ 60 s and η 4 /s
10 theoretical siulated 0 theoretical siulated ˆθ0 θ average distance a b average saples expected sapling tie s c Fig 3: Siulated and theoretical vales for a expected error after 0 saples and b distance traveled for DQS algorith c Optial value of for γ 60 s and η 4 /s d Next, we siulate the PQS algorith over a range of fro to 0, where θ ranges over a 00 point grid on the unit interval with 500 Monte Carlo trials run for each value of θ Fig 4a shows the average nuber of saples required to converge to an error of 0 4 As in the deterinistic case, the required nuber of saples increases onotonically with Fig 4b shows the average distance traveled before converging to the sae error value Again, the distance decreases onotonically with, indicating that the algorith achieves the desired tradeoff in the noisy case B Application to Spatial Sapling In this section, we apply the DQS and PQS algoriths to the proble of sapling hypoxic regions in Lake Erie Fig 5a shows the lake with an exaple hypoxic zone pictured in gray In [5], the authors show that for the set of distributions such that the Bayes decision set is a boundary fragent, a variation on PBS can be used to estiate the boundary while achieving optial rates up to a logarithic factor The authors of [5] inforally describe the boundary fragent class on [0, ] d as the collection of sets in which the Bayes decision boundary is a Hölder sooth function of the first d coordinates In [0, ], this iplies that the boundary is crossed at ost one tie when traveling on a path along the second coordinate For a foral definition, see [5], Defn and the surrounding text Given one location x a, b inside the hypoxic zone, the proble can be transfored into two probles, each with a function belonging to the boundary fragent class Using odels and easureents fro previous years, this is a reasonable assuption Splitting the lake along the line y b yields the two sets shown in Fig 5b Now we can further divide the proble into strips along the first diension, as shown in Fig 5c Along each of these strips, the proble reduces to change
11 80 70 average saples average distance a Fig 4: Siulated average a saples required for convergence and b distance traveled for PQS algorith with probability of error p 0 Dashed lines indicate 95% of points above/below line Sapling Tie s Speed /s Total Tie hrs TABLE I: Coparison of deterinistic quantile search with binary search for a variety of sapling ties and speeds b point estiation of a one-diensional threshold classifier as we have studied thus far After estiating the change point at each strip, the boundary is estiated as a piecewise linear function of the estiates, as shown in Fig 5d We apply this procedure to the hypoxic region shown in Fig 5a using 6 strips for a variety of values for tie per saple and speed of watercraft These values are used with Theores and in order to select the optial value of The total tie required for both DQS and binary search can be seen in Table I The results are as expected A higher sapling tie biases the algorith toward lower values of in order to iniize the nuber of saples required, while a lower speed biases the algorith toward higher values of in order to iniize the distance traveled We see that in the case of a low sapling tie and low speed, the total tie saved copared to binary search is on the order of 60 hours This brings the sapling tie fro roughly 5 days down to -3, decreasing the possibility that the spatial extent of the hypoxic region will change significantly before the easureent is coplete Siilar results hold for PQS, as shown in Table II, where we show the sapling ties with a probability of easureent error p 0 averaged over 00 Monte Carlo siulations Note that in this case, the chosen value of ay not be optial VI CONCLUSIONS & FUTURE WORK The algoriths described here deonstrate the desired tradeoff between saples required and distance traveled for onediensional threshold classifier estiation We have shown how this idea can be used to estiate a two-diensional region of hypoxia under certain soothness assuptions on the boundary, and epirical results indicate the benefits of quantile search over traditional binary search Several open questions reain Deriving or bounding the expected distance for the PQS algorith is an iportant next step Further, while the choice of shown here is optial for the DQS algorith, the question reains whether this algorith is optial in soe general sense Beyond this, several interesting generalizations exist The boundary fragent class entioned here is restrictive [4], and the extension to ore general cases would be of interest The recent work of [6] describes a graphbased algorith that eploys PBS to higher-diensional nonparaetric estiation Extending this idea to penalize distance traveled is a proising avenue for practical applications of quantile search Finally, the PQS algorith requires knowledge of the noise paraeter p in order to update the posterior The algoriths presented in [4], [8] enjoy the property that they
12 a b c d Fig 5: Exaple splitting of hypoxic zone in Lake Erie to achieve two boundary fragent sets a Lake erie with hypoxic region illustrated in gray b Exaple splitting of hypoxic zone to achieve two boundary fragent sets c Division of boundary fragent into strips d Piecewise linear estiation of boundary
13 3 Sapling Tie s Speed /s Total Tie hrs TABLE II: Coparison of probabilistic quantile search with binary search for a variety of sapling ties and speeds with probability of error p 0 are adaptive to unknown noise levels The developent of a noise-adaptive probabilistic search would certainly be of great interest, with potential applications in areas such as stochastic optiization [8] beyond direct applicability to this proble APPENDIX A PROOF OF THEOREM RATE OF CONVERGENCE FOR DQS The proof follows by induction After n saples, the unit interval has been split into n subintervals, one for each possible sequence of n easureents In order to find the expected error, we break the interval down by subinterval, find the conditional expected error given that θ is in each subinterval, and cobine all subintervals using the law of total expectation We note that in binary search, these subintervals are all the sae length, but in quantile search each has a different length Let Z n denote the error after n saples, eg, Z n ˆθ n θ We establish the base case by direct coputation The first saple is ade at / ρ into the interval The expected error after one saple is then E[Z ] E [ Z θ / ] Pr θ + E [ Z θ > 0 + / x 0 dx + ρ ρ x 0 dx + ρ x ρ dx [ ρ + ρ ] 4 ] Pr / θ > / + x dx After the second easureent, each interval [0, ρ] and [ ρ, ] is split into two subintervals, and the error can be calculated again using the law of total expectation We generalize this idea using the following lea Lea Suppose at saple n we know θ lies in a subinterval [a, b] [0, ] with noralized conditional error E[Z n θ [a, b]] Prθ [a, b] 4 ρi ρ j At saple n +, we split [a, b] into two subintervals according to quantile search Then the noralized conditional error of one subinterval will be 4 ρi ρ j+ and of the other will be 4 ρi+ ρ j Proof Note that quantile search either splits the interval at the point a + b a or b b a, depending on the value of Y n We show the first case the second is syetric At saple n +, we have E[Z n+ θ [a, b]] E [Z n+ θ [a, a + ] b a] Pr θ [a, a + b a] + E [Z n+ θ [a + ] b a, b] Pr θ [a + b a, b] a+ b a a a + a + b a b a 4 + b a 4 4 b a ρ + 4 b a ρ b x dx + a+ b a a + b a + b x dx We now show that b a ρ i ρ j using induction on n Consider the base case of n and note that / ρ Then the possible intervals are [0, /] and [/, ], and we have { b a ρ, [a, b] [0, ] ρ, [a, b] [, ]
14 4 Noting that E[Z ] 4 [ ρ + ρ ] proves the base case Now suppose that b a ρ i ρ j for soe n such that E[Z n θ [a, b]] Prθ [a, b] 4 ρi ρ j Splitting the interval at a + b a yields two subintervals, [c, d] and [e, f] with d c a + b a a b a ρ ρ i ρ j and Therefore we see that f e b a b a b a ρ ρ i ρ j E[Z n+ θ [a, b]] 4 ρi ρ j+ + 4 ρi+ ρ j and the proof is coplete We now use this lea to prove the induction step Suppose at saple n we have E[Z n ] n n ρ n k ρ k 4 k k0 With the next saple, the intervals will split into two We then apply the lea to see that the resulting conditional expectation will be E[Z n+ ] n n ρ n k+ ρ k + ρ n k ρ k+ 4 k k0 n + ρ n+ + n n n + ρ n+ k ρ k + n ρ n k k 4 n k Using the fact that n n n+ and we have n k Recalling the binoial forula copletes the proof + n k n! n k + k! n k +! + n! k k! n k +! E[Z n+ ] n+ n + ρ n+ k ρ k 4 k k0 x + y n n k0 n x n k y k k n +, k APPENDIX B PROOF OF THEOREM DISTANCE TRAVELED FOR DQS Following the proof sketch given in Section IV-A, we first rewrite the interval A i by siplifying the geoetric sus Let ρ / Then [ ρ i A i ρ, ρ i+ ρ [ ρ i, ρ i+, where we have used the fact that / ρ It is tedious but straightforward to define the subintervals that partition A i, and the following result is obtained A ii ρ i+ ρi ρ i+, ρ i+ ] ρi ρ i After further inspection, one can obtain a general equation for odd values of n and a siilar equation for even values [ A ii n ρ i+ ρi ρ i+ + + ρi+ +in n ρ in, ρ i + ρi ρ i+ + + ρi+ +in n ρ in+
15 5 The DQS algorith for our class of functions chooses saples oving further into the interval until it akes a zero easureent It then turns back and takes saples in the opposite direction until a one is easured This behavior allows us to analyze the total distance by splitting it up into stages first the expected distance traveled before the algorith reaches a point x > θ, and then x < θ, etc Let D n be a rando variable denoting the distance required to ove to the right of θ for the n th tie when n is odd, and to the left of θ for the n th tie when n is even Then by linearity of expectation, the total expected distance is E[D] E[D n ] 8 n We now calculate E[D n ] Using the above definition, we can easily calculate E[D n θ A ii n ] by taking the absolute value of the difference between the lower edge of A ii n and the upper edge of A ii n for odd values of n and the converse for even values This yields E[D n θ A ii n ] ρi+ +in n ρ in+ The probability Pr θ A ii n is easily calculated as the length of the interval Note that the ajority of ters cancel, so the result becoes Pr θ A ii n ρi+ +in n ρ in ρ in+ ρi+ +in n We now prove the expected distance using induction on E[D n ] The base case has been shown to be E[D ] Now assue for arbitrary n that Using the above definitions, we see that E[D n ] i 0 i 0 n E[D n ] n E[D n θ A ii n ] Pr θ A ii n i n0 i n0 i 0 n ρi+ +in ρ in+ n ρi+ +in i n 0 n 3 i 0 n 3 ρ n 3 i 0 ρ i+ +in i n 0 i 0 i n 0 E[D n ] n, i n0 ρ i+ +in i n 0 ρ i+ +in ρ in ρ in+ ρ i+ +in i n 0 where the third and sixth lines follow fro the fact that i0 ρi ρ i+ ρ definition of E[D n ] This shows the inductive hypothesis and ay obtain E[D] by 8 ρ in ρ in + APPENDIX C PROOF OF THEOREM 3 RATE OF CONVERGENCE FOR DISCRETIZED PQS and the final line follows fro the The proof follows that of [4], [8] Recall fro Algorith 3 that I i denotes the ith discretized bin of the interval and a i j denotes the posterior ass at the ith bin after j easureents First define kθ to be the index of the bin I i containing θ, so that θ I kθ Next define the following ters M θ j a kθj, a kθ j
16 6 and N θ j + M θj + M θ j a kθj a kθ j + a kθ j + a kθ j As a brief bit of intuition, note that M θ j is a easure of how uch ass is in the bin actually containing θ after the jth easureent, while N θ j + is a easure of iproveent fro one iteration to the next and is strictly less than one when an iproveent correct easureent is ade [4] Fro [4] and following a straightforward application of Markov s inequality and the law of total expectation, we have that Pr θ n θ > E[M θ n], 9 and { } n E[M θ n] M θ 0 ax ax E[N θj + aj] 0 j 0,,n aj The reainder of the proof is to bound E[N θ j + aj] To do this, we consider three cases: i kj kθ; ii kj > kθ; and iii kj < kθ First, consider the case i, where kj kθ, and let X j+ kj so that the correct easureent is Y j+ In this case, k in the algorith is kθ, and hence i k + for updating Also, assue the case where we easure correctly, so that with probability q p where β α Then we have where N θ j + a kθ j + a kθ j + / τβ α, + / τβ α a kθj + / τβ α a kθj a kθ j + / τβ α a kθj + a kθ j a kθ j a kθ j a kθj + β α / τ a kθ j a kθ j a kθj β α + a kθ j β αx +, / τ akθ j a kθ j x τ j + / a kθ j a kθ j and τ j is defined in Algorith 3 Siilarly, when X j+ kj, a kθ j +τ /β α a kθj N θ j + +τ /β α a kθj a kθ j where + τ /β α a kθj + a kθ j a kθ j a kθ j a kθj + β α τ / a kθ j a kθ j a kθj β α + a kθ j β αx +, τ / akθ j a kθ j x τ j / a kθ j a kθ j and τ j is defined in Algorith 3 Equivalent steps can be followed for cases ii and iii, yielding respectively τ j / a kθ j a x kθ j if X j+ kj, if X j+ kj τ j / a kθ j a kθ j
17 7 and x τ j+ / a kθ j a kθ j if X j+ kj τ j+ / a kθ j a kθ j if X j+ kj The algebra is analogous for the case where we easure incorrectly, so that { +β αx, with probability q N θ j + β αx, with probability p, where x is as defined above Before bounding E[N θ j + aj], we prove three useful leas Lea For 0 a and 0 τ, τ + / a a τ a a Proof Rewrite the left hand side in two ways as follows τ + / a a τ a a τ a + a τ a a τ a τ a + a Now consider two cases, a τ and a > τ a Case a τ:: In equation, the left ter of the product is positive but the right ter is nonpositive, aking our whole ter less than or equal to zero b Case a > τ:: Note that both ters in the product of equation are less than or equal to one because 0 τ First consider the situation where τ a Then τ a + a τ Next consider the situation when τ a < τ This iplies that τ a < τ τ This iplies the left hand ter of the product in equation is a τ a < a and therefore that τ τ τ a +, aking the product again less than /, which copletes the proof Lea 3 For all 0 < a < and τ, τ / a For all 0 < a < and τ, a τ + / a a τ + τ + Proof We prove the first stateent given in the lea The proof of the second stateent is nearly identical Note that the stateent fro the lea is equivalent to τ a τ + a τ + τa Equivalently, we rearrange further to see that the original stateent holds if a τa a, a
18 8 and hence if Note since a < copleting the proof of the first stateent a + τ a + τ + τ +, We are now ready to bound E[N θ j + aj] Fro the definitions given in Algorith 3, it is straightforward to see that 0 τ j, 0 < τ j, and τ j + τ j a kj j a kθ j Also define gx q p + β αx + β αx q p β αx, q + p + so that E[N θ j + aj] P jgx + P jgx, where x is defined above for the various cases and P j and P j are defined in Algorith 3 Consider case i, kj kθ Then a rearrangeent of ters shows τ j + / a kθ j τ j / a kθ j E[N θ j + aj] P jg + P jg a kθ j a kθ j q + p q + p β α [ ] τ j + / a kθ j τ j / a kθ j P j + P j a kθ j a kθ j q + p q + p β α [ ] τ j + / a kθ j τ j / + a kθ j P j + P j a kθ j a kθ j q + p q + p τ j + / a kθ j β α P j P j a kθ j Note that Then by Lea, we have that P j P j τ j τ j τ j + τ j a kjj τ j a kj j E[N θ j + aj] q + p + q p β α Using Lea 3 for τ /, we see that for case ii kj < kθ E[N θ j + aj] P jg τ j + + P jg τ j + q + p q + p [ ] β α P τ j + P τ j + P + P q + p q + p [ β α P τ j + P τ j + ] q + p q + p β α
19 9 Siilarly applying Lea 3 for τ / in case iii kj > kθ we have E[N θ j + aj] P jg τ j + + P jg τ j + q + p q + p [ ] β α P τ j P τ j + P + P q + p q + p [ β α P τ j P τ j + ] q + p q + p β α For, these reduce to the bound given in [4] We now coplete the proof by using the above result into 7 and 0 Plugging in to 7, we see that Pr ˆθ n θ > [ q + p q + p ] n β α Next, we bound the expected error E ˆθ n θ 0 + Pr ˆθ n θ > t dt Pr ˆθ n θ > t dt Pr ˆθ n θ > [ q + p + Pr q p and the result follows fro the values for, α, and β indicated in the theore ˆθ n θ > t dt β α ] n, REFERENCES [] S Dasgupta, Analysis of a greedy active learning strategy, in Proc Advances in Neural Inforation Processing Systes, 005 [] D Beletsky, D Schwab, and M McCorick, Modeling the suer circulation and theral structure in Lake Michigan, Journal of Geophys Res, vol, 006 [3] G L E R Laboratory, Lake Erie hypoxia warning syste, [4] R Castro and R Nowak, Active learning and sapling, in Foundations and Applications of Sensor Manageent, st ed New York, NY: Springer, 008, ch 8 [5], Miniax bounds for active learning, IEEE Trans Inf Theory, vol 54, pp , May 008 [6] S Dasgupta, Coarse saple coplexity bounds for active learning, in Proc Advances in Neural Inforation Processing Systes, 005 [7] S Hanneke, A bound on the label coplexity of agnostic active learning, in Proc ICML, 007 [8] M V Burnashev and K S Zigangirov, An interval estiation proble for controlled observations, Probles in Inforation Transission, vol 0:3-3, 974, translated fro Probley Peredachi Inforatsii, 03:5-6, July-Septeber, 974 [9] R Castro, R Willett, and R Nowak, Faster rates in regression via active learning, in Proc Advances in Neural Inforation Processing Systes, 005 [0] A Singh, R Nowak, and P Raanathan, Active learning for adaptive obile sensing networks, in Proc Inforation Processing in Sensor Networks, 006 [] M-F Balcan, A Beygelzier, and J Langford, Agnostic active learning, in Proc International Conference on Machine Learning, 006 [] S Dasgupta, D Hsu, and C Monteleoni, A general agnostic active learning algorith, in Proc Advances in Neural Inforation Processing Systes, 007 [3] M-F Balcan, A Broder, and T Zhang, Margin based active learning, Learning Theory, p 350, 007 [4] Y Wang and A Singh, Noise-adaptive argin-based active learning for ulti-diensional data, arxiv Preprint, vol v, 04 [5] M Horstein, Sequential decoding using noiseless feedback, IEEE Trans Inf Theory, vol 9, 963 [6] R M Karp and R Kleinberg, Noisy binary search and its applications, in Proc ACM-SIAM Syposiu on Discrete Algoriths, 007 [7] M B Or and A Hassidi, The bayesian learner is optial for noisy binary search and pretty good for quantu as well, in Proc IEEE Syposiu of Foundations of Coputer Science, 008 [8] Y R Wang and A Singh, Algorithic connections between active learning and stochastic convex optiization, in Proc Algorithic Learning Theory, 03 [9] C Scott and R Nowak, Miniax-optial classification with dyadic decision trees, IEEE Trans Inf Theory, vol 5, pp , Apr 006 [0] A Liu, G Jun, and J Ghosh, Spatially cost-sensitive active learning, in Proc SIAM Conf on Data Mining, 009 [] B Deir, L Minello, and L Bruzzone, A cost-sensitive active learning technique for the definition of effective training sets for supervised classifiers, in Proc SIAM Conf on Data Mining, 009 [] A Guillory and J Blies, Average-case active learning with costs, in Proc Algorithic Learning Theory, 009 [3] B Schlegel, R Geulla, and W Lehner, k-ary search on odern processors, in Proc Fifth Int Workshop on Data Manageent on New Hardware, 009 [4] R Waeber, P I Frazier, and S G Henderson, Bisection search with noisy responses, SIAM Journal on Control and Optiization, vol 5, pp 6 79, 03 [5] J Lipor and L Balzano, Quantile search: A distance-penalized active learning algorith for spatial sapling, Technical report, 05 [6] G Dasarathy and R Nowak, S : An efficient graph-based active learning algorith with application to nonparaetric learning, in Proc Conf on Learning Theory, 05
Quantile Search: A Distance-Penalized Active Learning Algorithm for Spatial Sampling
Quantile Search: A Distance-Penalized Active Learning Algorith for Spatial Sapling John Lipor 1, Laura Balzano 1, Branko Kerkez 2, and Don Scavia 3 1 Departent of Electrical and Coputer Engineering, 2
More informationSupport Vector Machine Classification of Uncertain and Imbalanced data using Robust Optimization
Recent Researches in Coputer Science Support Vector Machine Classification of Uncertain and Ibalanced data using Robust Optiization RAGHAV PAT, THEODORE B. TRAFALIS, KASH BARKER School of Industrial Engineering
More informationComputational and Statistical Learning Theory
Coputational and Statistical Learning Theory Proble sets 5 and 6 Due: Noveber th Please send your solutions to learning-subissions@ttic.edu Notations/Definitions Recall the definition of saple based Radeacher
More informationNon-Parametric Non-Line-of-Sight Identification 1
Non-Paraetric Non-Line-of-Sight Identification Sinan Gezici, Hisashi Kobayashi and H. Vincent Poor Departent of Electrical Engineering School of Engineering and Applied Science Princeton University, Princeton,
More informatione-companion ONLY AVAILABLE IN ELECTRONIC FORM
OPERATIONS RESEARCH doi 10.1287/opre.1070.0427ec pp. ec1 ec5 e-copanion ONLY AVAILABLE IN ELECTRONIC FORM infors 07 INFORMS Electronic Copanion A Learning Approach for Interactive Marketing to a Custoer
More informationA Simple Regression Problem
A Siple Regression Proble R. M. Castro March 23, 2 In this brief note a siple regression proble will be introduced, illustrating clearly the bias-variance tradeoff. Let Y i f(x i ) + W i, i,..., n, where
More informationarxiv: v1 [cs.ds] 3 Feb 2014
arxiv:40.043v [cs.ds] 3 Feb 04 A Bound on the Expected Optiality of Rando Feasible Solutions to Cobinatorial Optiization Probles Evan A. Sultani The Johns Hopins University APL evan@sultani.co http://www.sultani.co/
More informationE0 370 Statistical Learning Theory Lecture 6 (Aug 30, 2011) Margin Analysis
E0 370 tatistical Learning Theory Lecture 6 (Aug 30, 20) Margin Analysis Lecturer: hivani Agarwal cribe: Narasihan R Introduction In the last few lectures we have seen how to obtain high confidence bounds
More informationSharp Time Data Tradeoffs for Linear Inverse Problems
Sharp Tie Data Tradeoffs for Linear Inverse Probles Saet Oyak Benjain Recht Mahdi Soltanolkotabi January 016 Abstract In this paper we characterize sharp tie-data tradeoffs for optiization probles used
More informationFixed-to-Variable Length Distribution Matching
Fixed-to-Variable Length Distribution Matching Rana Ali Ajad and Georg Böcherer Institute for Counications Engineering Technische Universität München, Gerany Eail: raa2463@gail.co,georg.boecherer@tu.de
More informationComputational and Statistical Learning Theory
Coputational and Statistical Learning Theory TTIC 31120 Prof. Nati Srebro Lecture 2: PAC Learning and VC Theory I Fro Adversarial Online to Statistical Three reasons to ove fro worst-case deterinistic
More informationProc. of the IEEE/OES Seventh Working Conference on Current Measurement Technology UNCERTAINTIES IN SEASONDE CURRENT VELOCITIES
Proc. of the IEEE/OES Seventh Working Conference on Current Measureent Technology UNCERTAINTIES IN SEASONDE CURRENT VELOCITIES Belinda Lipa Codar Ocean Sensors 15 La Sandra Way, Portola Valley, CA 98 blipa@pogo.co
More informationKernel Methods and Support Vector Machines
Intelligent Systes: Reasoning and Recognition Jaes L. Crowley ENSIAG 2 / osig 1 Second Seester 2012/2013 Lesson 20 2 ay 2013 Kernel ethods and Support Vector achines Contents Kernel Functions...2 Quadratic
More informationModel Fitting. CURM Background Material, Fall 2014 Dr. Doreen De Leon
Model Fitting CURM Background Material, Fall 014 Dr. Doreen De Leon 1 Introduction Given a set of data points, we often want to fit a selected odel or type to the data (e.g., we suspect an exponential
More informationChapter 6 1-D Continuous Groups
Chapter 6 1-D Continuous Groups Continuous groups consist of group eleents labelled by one or ore continuous variables, say a 1, a 2,, a r, where each variable has a well- defined range. This chapter explores:
More informationMachine Learning Basics: Estimators, Bias and Variance
Machine Learning Basics: Estiators, Bias and Variance Sargur N. srihari@cedar.buffalo.edu This is part of lecture slides on Deep Learning: http://www.cedar.buffalo.edu/~srihari/cse676 1 Topics in Basics
More informationCourse Notes for EE227C (Spring 2018): Convex Optimization and Approximation
Course Notes for EE7C (Spring 018: Convex Optiization and Approxiation Instructor: Moritz Hardt Eail: hardt+ee7c@berkeley.edu Graduate Instructor: Max Sichowitz Eail: sichow+ee7c@berkeley.edu October 15,
More informationCombining Classifiers
Cobining Classifiers Generic ethods of generating and cobining ultiple classifiers Bagging Boosting References: Duda, Hart & Stork, pg 475-480. Hastie, Tibsharini, Friedan, pg 246-256 and Chapter 10. http://www.boosting.org/
More information13.2 Fully Polynomial Randomized Approximation Scheme for Permanent of Random 0-1 Matrices
CS71 Randoness & Coputation Spring 018 Instructor: Alistair Sinclair Lecture 13: February 7 Disclaier: These notes have not been subjected to the usual scrutiny accorded to foral publications. They ay
More informationCS Lecture 13. More Maximum Likelihood
CS 6347 Lecture 13 More Maxiu Likelihood Recap Last tie: Introduction to axiu likelihood estiation MLE for Bayesian networks Optial CPTs correspond to epirical counts Today: MLE for CRFs 2 Maxiu Likelihood
More informationInspection; structural health monitoring; reliability; Bayesian analysis; updating; decision analysis; value of information
Cite as: Straub D. (2014). Value of inforation analysis with structural reliability ethods. Structural Safety, 49: 75-86. Value of Inforation Analysis with Structural Reliability Methods Daniel Straub
More informationCourse Notes for EE227C (Spring 2018): Convex Optimization and Approximation
Course Notes for EE227C (Spring 2018): Convex Optiization and Approxiation Instructor: Moritz Hardt Eail: hardt+ee227c@berkeley.edu Graduate Instructor: Max Sichowitz Eail: sichow+ee227c@berkeley.edu October
More informationBayes Decision Rule and Naïve Bayes Classifier
Bayes Decision Rule and Naïve Bayes Classifier Le Song Machine Learning I CSE 6740, Fall 2013 Gaussian Mixture odel A density odel p(x) ay be ulti-odal: odel it as a ixture of uni-odal distributions (e.g.
More informationExperimental Design For Model Discrimination And Precise Parameter Estimation In WDS Analysis
City University of New York (CUNY) CUNY Acadeic Works International Conference on Hydroinforatics 8-1-2014 Experiental Design For Model Discriination And Precise Paraeter Estiation In WDS Analysis Giovanna
More informationThe Weierstrass Approximation Theorem
36 The Weierstrass Approxiation Theore Recall that the fundaental idea underlying the construction of the real nubers is approxiation by the sipler rational nubers. Firstly, nubers are often deterined
More informationA Note on Scheduling Tall/Small Multiprocessor Tasks with Unit Processing Time to Minimize Maximum Tardiness
A Note on Scheduling Tall/Sall Multiprocessor Tasks with Unit Processing Tie to Miniize Maxiu Tardiness Philippe Baptiste and Baruch Schieber IBM T.J. Watson Research Center P.O. Box 218, Yorktown Heights,
More information1 Rademacher Complexity Bounds
COS 511: Theoretical Machine Learning Lecturer: Rob Schapire Lecture #10 Scribe: Max Goer March 07, 2013 1 Radeacher Coplexity Bounds Recall the following theore fro last lecture: Theore 1. With probability
More informationSPECTRUM sensing is a core concept of cognitive radio
World Acadey of Science, Engineering and Technology International Journal of Electronics and Counication Engineering Vol:6, o:2, 202 Efficient Detection Using Sequential Probability Ratio Test in Mobile
More informationQuantum algorithms (CO 781, Winter 2008) Prof. Andrew Childs, University of Waterloo LECTURE 15: Unstructured search and spatial search
Quantu algoriths (CO 781, Winter 2008) Prof Andrew Childs, University of Waterloo LECTURE 15: Unstructured search and spatial search ow we begin to discuss applications of quantu walks to search algoriths
More informationOn Constant Power Water-filling
On Constant Power Water-filling Wei Yu and John M. Cioffi Electrical Engineering Departent Stanford University, Stanford, CA94305, U.S.A. eails: {weiyu,cioffi}@stanford.edu Abstract This paper derives
More informationNonmonotonic Networks. a. IRST, I Povo (Trento) Italy, b. Univ. of Trento, Physics Dept., I Povo (Trento) Italy
Storage Capacity and Dynaics of Nononotonic Networks Bruno Crespi a and Ignazio Lazzizzera b a. IRST, I-38050 Povo (Trento) Italy, b. Univ. of Trento, Physics Dept., I-38050 Povo (Trento) Italy INFN Gruppo
More informationUsing EM To Estimate A Probablity Density With A Mixture Of Gaussians
Using EM To Estiate A Probablity Density With A Mixture Of Gaussians Aaron A. D Souza adsouza@usc.edu Introduction The proble we are trying to address in this note is siple. Given a set of data points
More informationIntelligent Systems: Reasoning and Recognition. Perceptrons and Support Vector Machines
Intelligent Systes: Reasoning and Recognition Jaes L. Crowley osig 1 Winter Seester 2018 Lesson 6 27 February 2018 Outline Perceptrons and Support Vector achines Notation...2 Linear odels...3 Lines, Planes
More informationPolygonal Designs: Existence and Construction
Polygonal Designs: Existence and Construction John Hegean Departent of Matheatics, Stanford University, Stanford, CA 9405 Jeff Langford Departent of Matheatics, Drake University, Des Moines, IA 5011 G
More informationUnderstanding Machine Learning Solution Manual
Understanding Machine Learning Solution Manual Written by Alon Gonen Edited by Dana Rubinstein Noveber 17, 2014 2 Gentle Start 1. Given S = ((x i, y i )), define the ultivariate polynoial p S (x) = i []:y
More informationComputable Shell Decomposition Bounds
Coputable Shell Decoposition Bounds John Langford TTI-Chicago jcl@cs.cu.edu David McAllester TTI-Chicago dac@autoreason.co Editor: Leslie Pack Kaelbling and David Cohn Abstract Haussler, Kearns, Seung
More informationBoosting with log-loss
Boosting with log-loss Marco Cusuano-Towner Septeber 2, 202 The proble Suppose we have data exaples {x i, y i ) i =... } for a two-class proble with y i {, }. Let F x) be the predictor function with the
More informationStochastic Subgradient Methods
Stochastic Subgradient Methods Lingjie Weng Yutian Chen Bren School of Inforation and Coputer Science University of California, Irvine {wengl, yutianc}@ics.uci.edu Abstract Stochastic subgradient ethods
More informationBayesian Approach for Fatigue Life Prediction from Field Inspection
Bayesian Approach for Fatigue Life Prediction fro Field Inspection Dawn An and Jooho Choi School of Aerospace & Mechanical Engineering, Korea Aerospace University, Goyang, Seoul, Korea Srira Pattabhiraan
More informationNew Bounds for Learning Intervals with Implications for Semi-Supervised Learning
JMLR: Workshop and Conference Proceedings vol (1) 1 15 New Bounds for Learning Intervals with Iplications for Sei-Supervised Learning David P. Helbold dph@soe.ucsc.edu Departent of Coputer Science, University
More informationNew Slack-Monotonic Schedulability Analysis of Real-Time Tasks on Multiprocessors
New Slack-Monotonic Schedulability Analysis of Real-Tie Tasks on Multiprocessors Risat Mahud Pathan and Jan Jonsson Chalers University of Technology SE-41 96, Göteborg, Sweden {risat, janjo}@chalers.se
More informationList Scheduling and LPT Oliver Braun (09/05/2017)
List Scheduling and LPT Oliver Braun (09/05/207) We investigate the classical scheduling proble P ax where a set of n independent jobs has to be processed on 2 parallel and identical processors (achines)
More informationPattern Recognition and Machine Learning. Learning and Evaluation for Pattern Recognition
Pattern Recognition and Machine Learning Jaes L. Crowley ENSIMAG 3 - MMIS Fall Seester 2017 Lesson 1 4 October 2017 Outline Learning and Evaluation for Pattern Recognition Notation...2 1. The Pattern Recognition
More information1 Bounding the Margin
COS 511: Theoretical Machine Learning Lecturer: Rob Schapire Lecture #12 Scribe: Jian Min Si March 14, 2013 1 Bounding the Margin We are continuing the proof of a bound on the generalization error of AdaBoost
More informationRandomized Recovery for Boolean Compressed Sensing
Randoized Recovery for Boolean Copressed Sensing Mitra Fatei and Martin Vetterli Laboratory of Audiovisual Counication École Polytechnique Fédéral de Lausanne (EPFL) Eail: {itra.fatei, artin.vetterli}@epfl.ch
More informationComputable Shell Decomposition Bounds
Journal of Machine Learning Research 5 (2004) 529-547 Subitted 1/03; Revised 8/03; Published 5/04 Coputable Shell Decoposition Bounds John Langford David McAllester Toyota Technology Institute at Chicago
More informationASSUME a source over an alphabet size m, from which a sequence of n independent samples are drawn. The classical
IEEE TRANSACTIONS ON INFORMATION THEORY Large Alphabet Source Coding using Independent Coponent Analysis Aichai Painsky, Meber, IEEE, Saharon Rosset and Meir Feder, Fellow, IEEE arxiv:67.7v [cs.it] Jul
More informationAnalyzing Simulation Results
Analyzing Siulation Results Dr. John Mellor-Cruey Departent of Coputer Science Rice University johnc@cs.rice.edu COMP 528 Lecture 20 31 March 2005 Topics for Today Model verification Model validation Transient
More informationarxiv: v3 [cs.lg] 7 Jan 2016
Efficient and Parsionious Agnostic Active Learning Tzu-Kuo Huang Alekh Agarwal Daniel J. Hsu tkhuang@icrosoft.co alekha@icrosoft.co djhsu@cs.colubia.edu John Langford Robert E. Schapire jcl@icrosoft.co
More informationSupport Vector Machines. Machine Learning Series Jerry Jeychandra Blohm Lab
Support Vector Machines Machine Learning Series Jerry Jeychandra Bloh Lab Outline Main goal: To understand how support vector achines (SVMs) perfor optial classification for labelled data sets, also a
More informationProbability Distributions
Probability Distributions In Chapter, we ephasized the central role played by probability theory in the solution of pattern recognition probles. We turn now to an exploration of soe particular exaples
More informationHomework 3 Solutions CSE 101 Summer 2017
Hoework 3 Solutions CSE 0 Suer 207. Scheduling algoriths The following n = 2 jobs with given processing ties have to be scheduled on = 3 parallel and identical processors with the objective of iniizing
More informationa a a a a a a m a b a b
Algebra / Trig Final Exa Study Guide (Fall Seester) Moncada/Dunphy Inforation About the Final Exa The final exa is cuulative, covering Appendix A (A.1-A.5) and Chapter 1. All probles will be ultiple choice
More informationSupport Vector Machines MIT Course Notes Cynthia Rudin
Support Vector Machines MIT 5.097 Course Notes Cynthia Rudin Credit: Ng, Hastie, Tibshirani, Friedan Thanks: Şeyda Ertekin Let s start with soe intuition about argins. The argin of an exaple x i = distance
More informationA Simplified Analytical Approach for Efficiency Evaluation of the Weaving Machines with Automatic Filling Repair
Proceedings of the 6th SEAS International Conference on Siulation, Modelling and Optiization, Lisbon, Portugal, Septeber -4, 006 0 A Siplified Analytical Approach for Efficiency Evaluation of the eaving
More informationCh 12: Variations on Backpropagation
Ch 2: Variations on Backpropagation The basic backpropagation algorith is too slow for ost practical applications. It ay take days or weeks of coputer tie. We deonstrate why the backpropagation algorith
More informationBounds on the Minimax Rate for Estimating a Prior over a VC Class from Independent Learning Tasks
Bounds on the Miniax Rate for Estiating a Prior over a VC Class fro Independent Learning Tasks Liu Yang Steve Hanneke Jaie Carbonell Deceber 01 CMU-ML-1-11 School of Coputer Science Carnegie Mellon University
More informationLecture 21. Interior Point Methods Setup and Algorithm
Lecture 21 Interior Point Methods In 1984, Kararkar introduced a new weakly polynoial tie algorith for solving LPs [Kar84a], [Kar84b]. His algorith was theoretically faster than the ellipsoid ethod and
More informationMSEC MODELING OF DEGRADATION PROCESSES TO OBTAIN AN OPTIMAL SOLUTION FOR MAINTENANCE AND PERFORMANCE
Proceeding of the ASME 9 International Manufacturing Science and Engineering Conference MSEC9 October 4-7, 9, West Lafayette, Indiana, USA MSEC9-8466 MODELING OF DEGRADATION PROCESSES TO OBTAIN AN OPTIMAL
More information1 Identical Parallel Machines
FB3: Matheatik/Inforatik Dr. Syaantak Das Winter 2017/18 Optiizing under Uncertainty Lecture Notes 3: Scheduling to Miniize Makespan In any standard scheduling proble, we are given a set of jobs J = {j
More informationFairness via priority scheduling
Fairness via priority scheduling Veeraruna Kavitha, N Heachandra and Debayan Das IEOR, IIT Bobay, Mubai, 400076, India vavitha,nh,debayan}@iitbacin Abstract In the context of ulti-agent resource allocation
More informationFigure 1: Equivalent electric (RC) circuit of a neurons membrane
Exercise: Leaky integrate and fire odel of neural spike generation This exercise investigates a siplified odel of how neurons spike in response to current inputs, one of the ost fundaental properties of
More informationLost-Sales Problems with Stochastic Lead Times: Convexity Results for Base-Stock Policies
OPERATIONS RESEARCH Vol. 52, No. 5, Septeber October 2004, pp. 795 803 issn 0030-364X eissn 1526-5463 04 5205 0795 infors doi 10.1287/opre.1040.0130 2004 INFORMS TECHNICAL NOTE Lost-Sales Probles with
More informationAlgorithms for parallel processor scheduling with distinct due windows and unit-time jobs
BULLETIN OF THE POLISH ACADEMY OF SCIENCES TECHNICAL SCIENCES Vol. 57, No. 3, 2009 Algoriths for parallel processor scheduling with distinct due windows and unit-tie obs A. JANIAK 1, W.A. JANIAK 2, and
More informationExtension of CSRSM for the Parametric Study of the Face Stability of Pressurized Tunnels
Extension of CSRSM for the Paraetric Study of the Face Stability of Pressurized Tunnels Guilhe Mollon 1, Daniel Dias 2, and Abdul-Haid Soubra 3, M.ASCE 1 LGCIE, INSA Lyon, Université de Lyon, Doaine scientifique
More informationConvex Programming for Scheduling Unrelated Parallel Machines
Convex Prograing for Scheduling Unrelated Parallel Machines Yossi Azar Air Epstein Abstract We consider the classical proble of scheduling parallel unrelated achines. Each job is to be processed by exactly
More informationA Smoothed Boosting Algorithm Using Probabilistic Output Codes
A Soothed Boosting Algorith Using Probabilistic Output Codes Rong Jin rongjin@cse.su.edu Dept. of Coputer Science and Engineering, Michigan State University, MI 48824, USA Jian Zhang jian.zhang@cs.cu.edu
More informationESTIMATING AND FORMING CONFIDENCE INTERVALS FOR EXTREMA OF RANDOM POLYNOMIALS. A Thesis. Presented to. The Faculty of the Department of Mathematics
ESTIMATING AND FORMING CONFIDENCE INTERVALS FOR EXTREMA OF RANDOM POLYNOMIALS A Thesis Presented to The Faculty of the Departent of Matheatics San Jose State University In Partial Fulfillent of the Requireents
More informationUniform Approximation and Bernstein Polynomials with Coefficients in the Unit Interval
Unifor Approxiation and Bernstein Polynoials with Coefficients in the Unit Interval Weiang Qian and Marc D. Riedel Electrical and Coputer Engineering, University of Minnesota 200 Union St. S.E. Minneapolis,
More informationResearch Article Rapidly-Converging Series Representations of a Mutual-Information Integral
International Scholarly Research Network ISRN Counications and Networking Volue 11, Article ID 5465, 6 pages doi:1.54/11/5465 Research Article Rapidly-Converging Series Representations of a Mutual-Inforation
More informationIntelligent Systems: Reasoning and Recognition. Artificial Neural Networks
Intelligent Systes: Reasoning and Recognition Jaes L. Crowley MOSIG M1 Winter Seester 2018 Lesson 7 1 March 2018 Outline Artificial Neural Networks Notation...2 Introduction...3 Key Equations... 3 Artificial
More informationInteractive Markov Models of Evolutionary Algorithms
Cleveland State University EngagedScholarship@CSU Electrical Engineering & Coputer Science Faculty Publications Electrical Engineering & Coputer Science Departent 2015 Interactive Markov Models of Evolutionary
More informationIn this chapter, we consider several graph-theoretic and probabilistic models
THREE ONE GRAPH-THEORETIC AND STATISTICAL MODELS 3.1 INTRODUCTION In this chapter, we consider several graph-theoretic and probabilistic odels for a social network, which we do under different assuptions
More informationSoft Computing Techniques Help Assign Weights to Different Factors in Vulnerability Analysis
Soft Coputing Techniques Help Assign Weights to Different Factors in Vulnerability Analysis Beverly Rivera 1,2, Irbis Gallegos 1, and Vladik Kreinovich 2 1 Regional Cyber and Energy Security Center RCES
More informationA Note on the Applied Use of MDL Approximations
A Note on the Applied Use of MDL Approxiations Daniel J. Navarro Departent of Psychology Ohio State University Abstract An applied proble is discussed in which two nested psychological odels of retention
More informationEffective joint probabilistic data association using maximum a posteriori estimates of target states
Effective joint probabilistic data association using axiu a posteriori estiates of target states 1 Viji Paul Panakkal, 2 Rajbabu Velurugan 1 Central Research Laboratory, Bharat Electronics Ltd., Bangalore,
More informationA remark on a success rate model for DPA and CPA
A reark on a success rate odel for DPA and CPA A. Wieers, BSI Version 0.5 andreas.wieers@bsi.bund.de Septeber 5, 2018 Abstract The success rate is the ost coon evaluation etric for easuring the perforance
More informationA Better Algorithm For an Ancient Scheduling Problem. David R. Karger Steven J. Phillips Eric Torng. Department of Computer Science
A Better Algorith For an Ancient Scheduling Proble David R. Karger Steven J. Phillips Eric Torng Departent of Coputer Science Stanford University Stanford, CA 9435-4 Abstract One of the oldest and siplest
More informationCSE525: Randomized Algorithms and Probabilistic Analysis May 16, Lecture 13
CSE55: Randoied Algoriths and obabilistic Analysis May 6, Lecture Lecturer: Anna Karlin Scribe: Noah Siegel, Jonathan Shi Rando walks and Markov chains This lecture discusses Markov chains, which capture
More informationBootstrapping Dependent Data
Bootstrapping Dependent Data One of the key issues confronting bootstrap resapling approxiations is how to deal with dependent data. Consider a sequence fx t g n t= of dependent rando variables. Clearly
More informationImproved Guarantees for Agnostic Learning of Disjunctions
Iproved Guarantees for Agnostic Learning of Disjunctions Pranjal Awasthi Carnegie Mellon University pawasthi@cs.cu.edu Avri Blu Carnegie Mellon University avri@cs.cu.edu Or Sheffet Carnegie Mellon University
More informationFeature Extraction Techniques
Feature Extraction Techniques Unsupervised Learning II Feature Extraction Unsupervised ethods can also be used to find features which can be useful for categorization. There are unsupervised ethods that
More informationGraphical Models in Local, Asymmetric Multi-Agent Markov Decision Processes
Graphical Models in Local, Asyetric Multi-Agent Markov Decision Processes Ditri Dolgov and Edund Durfee Departent of Electrical Engineering and Coputer Science University of Michigan Ann Arbor, MI 48109
More informationPattern Recognition and Machine Learning. Artificial Neural networks
Pattern Recognition and Machine Learning Jaes L. Crowley ENSIMAG 3 - MMIS Fall Seester 2017 Lessons 7 20 Dec 2017 Outline Artificial Neural networks Notation...2 Introduction...3 Key Equations... 3 Artificial
More informationSupplementary to Learning Discriminative Bayesian Networks from High-dimensional Continuous Neuroimaging Data
Suppleentary to Learning Discriinative Bayesian Networks fro High-diensional Continuous Neuroiaging Data Luping Zhou, Lei Wang, Lingqiao Liu, Philip Ogunbona, and Dinggang Shen Proposition. Given a sparse
More informationResearch in Area of Longevity of Sylphon Scraies
IOP Conference Series: Earth and Environental Science PAPER OPEN ACCESS Research in Area of Longevity of Sylphon Scraies To cite this article: Natalia Y Golovina and Svetlana Y Krivosheeva 2018 IOP Conf.
More informationEstimating Parameters for a Gaussian pdf
Pattern Recognition and achine Learning Jaes L. Crowley ENSIAG 3 IS First Seester 00/0 Lesson 5 7 Noveber 00 Contents Estiating Paraeters for a Gaussian pdf Notation... The Pattern Recognition Proble...3
More informationHamming Compressed Sensing
Haing Copressed Sensing Tianyi Zhou, and Dacheng Tao, Meber, IEEE Abstract arxiv:.73v2 [cs.it] Oct 2 Copressed sensing CS and -bit CS cannot directly recover quantized signals and require tie consuing
More informationHybrid System Identification: An SDP Approach
49th IEEE Conference on Decision and Control Deceber 15-17, 2010 Hilton Atlanta Hotel, Atlanta, GA, USA Hybrid Syste Identification: An SDP Approach C Feng, C M Lagoa, N Ozay and M Sznaier Abstract The
More informationSolutions of some selected problems of Homework 4
Solutions of soe selected probles of Hoework 4 Sangchul Lee May 7, 2018 Proble 1 Let there be light A professor has two light bulbs in his garage. When both are burned out, they are replaced, and the next
More informationComparison of Stability of Selected Numerical Methods for Solving Stiff Semi- Linear Differential Equations
International Journal of Applied Science and Technology Vol. 7, No. 3, Septeber 217 Coparison of Stability of Selected Nuerical Methods for Solving Stiff Sei- Linear Differential Equations Kwaku Darkwah
More informationLower Bounds for Quantized Matrix Completion
Lower Bounds for Quantized Matrix Copletion Mary Wootters and Yaniv Plan Departent of Matheatics University of Michigan Ann Arbor, MI Eail: wootters, yplan}@uich.edu Mark A. Davenport School of Elec. &
More information3.8 Three Types of Convergence
3.8 Three Types of Convergence 3.8 Three Types of Convergence 93 Suppose that we are given a sequence functions {f k } k N on a set X and another function f on X. What does it ean for f k to converge to
More informationSampling How Big a Sample?
C. G. G. Aitken, 1 Ph.D. Sapling How Big a Saple? REFERENCE: Aitken CGG. Sapling how big a saple? J Forensic Sci 1999;44(4):750 760. ABSTRACT: It is thought that, in a consignent of discrete units, a certain
More informationAsynchronous Gossip Algorithms for Stochastic Optimization
Asynchronous Gossip Algoriths for Stochastic Optiization S. Sundhar Ra ECE Dept. University of Illinois Urbana, IL 680 ssrini@illinois.edu A. Nedić IESE Dept. University of Illinois Urbana, IL 680 angelia@illinois.edu
More informationAn Improved Particle Filter with Applications in Ballistic Target Tracking
Sensors & ransducers Vol. 72 Issue 6 June 204 pp. 96-20 Sensors & ransducers 204 by IFSA Publishing S. L. http://www.sensorsportal.co An Iproved Particle Filter with Applications in Ballistic arget racing
More informationUpper bound on false alarm rate for landmine detection and classification using syntactic pattern recognition
Upper bound on false alar rate for landine detection and classification using syntactic pattern recognition Ahed O. Nasif, Brian L. Mark, Kenneth J. Hintz, and Nathalia Peixoto Dept. of Electrical and
More informationCOS 424: Interacting with Data. Written Exercises
COS 424: Interacting with Data Hoework #4 Spring 2007 Regression Due: Wednesday, April 18 Written Exercises See the course website for iportant inforation about collaboration and late policies, as well
More informationBlock designs and statistics
Bloc designs and statistics Notes for Math 447 May 3, 2011 The ain paraeters of a bloc design are nuber of varieties v, bloc size, nuber of blocs b. A design is built on a set of v eleents. Each eleent
More informationThis model assumes that the probability of a gap has size i is proportional to 1/i. i.e., i log m e. j=1. E[gap size] = i P r(i) = N f t.
CS 493: Algoriths for Massive Data Sets Feb 2, 2002 Local Models, Bloo Filter Scribe: Qin Lv Local Models In global odels, every inverted file entry is copressed with the sae odel. This work wells when
More information